Jump to navigation Jump to search For an alternative meaning, see variational Bayesian methods. In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Supervised learning algorithms are most commonly described as performing the task of from multimedia file simulink example through a hypothesis space to find a suitable hypothesis that will make good predictions with a particular problem.
Evaluating the prediction of an ensemble typically requires more computation than evaluating the prediction of a single model, so ensembles may be thought of as a way to compensate for poor learning algorithms by performing a lot of extra computation. By analogy, ensemble techniques have been used also in unsupervised learning scenarios, for example in consensus clustering or in anomaly detection. An ensemble is itself a supervised learning algorithm, because it can be trained and then used to make predictions. The trained ensemble, therefore, represents a single hypothesis. This hypothesis, however, is not necessarily contained within the hypothesis space of the models from which it is built. Thus, ensembles can be shown to have more flexibility in the functions they can represent. Empirically, ensembles tend to yield better results when there is a significant diversity among the models.
Did not find what they wanted? Try here
Many ensemble methods, therefore, seek to promote diversity among the models they combine. While the number of component classifiers of an ensemble has a great impact on the accuracy of prediction, there is a limited number of studies addressing this problem. A priori determining of ensemble size and the volume and velocity of big data streams make this even more crucial for online ensemble classifiers. Mostly statistical tests were used for determining the proper number of components. The Bayes Optimal Classifier is a classification technique. It is an ensemble of all the hypotheses in the hypothesis space.
On average, no other ensemble can outperform it. Naive Bayes Optimal Classifier is a version of this that assumes that the data is conditionally independent on the class and makes the computation more feasible. Bootstrap aggregating, often abbreviated as bagging, involves having each model in the ensemble vote with equal weight. In order to promote model variance, bagging trains each model in the ensemble using a randomly drawn subset of the training set. As an example, the random forest algorithm combines random decision trees with bagging to achieve very high classification accuracy. Boosting involves incrementally building an ensemble by training each new model instance to emphasize the training instances that previous models mis-classified. In some cases, boosting has been shown to yield better accuracy than bagging, but it also tends to be more likely to over-fit the training data.
Bayes Optimal Classifier by sampling hypotheses from the hypothesis space, and combining them using Bayes’ law. It has been shown that under certain circumstances, when hypotheses are drawn in this manner and averaged according to Bayes’ law, this technique has an expected error that is bounded to be at most twice the expected error of the Bayes optimal classifier. This modification overcomes the tendency of BMA to converge toward giving all of the weight to a single model. The use of Bayes’ law to compute model weights necessitates computing the probability of the data given each model. Typically, none of the models in the ensemble are exactly the distribution from which the training data were generated, so all of them correctly receive a value close to zero for this term. This would work well if the ensemble were big enough to sample the entire model-space, but such is rarely possible.