Ensemble Methods (Bagging and Boosting)
Ensemble Methods in Machine Learning rest on a very basic hypothesis: if we combine multiple models (weak learners) together, the result will be a more powerful model.
In this article, I will once again try to explain the core concept behind ensemble methods without getting too much into technicalities, since the idea is to understand what ensemble methods are all about.
There are many models available in Machine Learning, and irrespective of the type of problem statement we have, regression or classification, we always try to find the best-fit model. Ensemble methods combine these base models (often referred to as weak learners) in a way that helps us achieve a better and more powerful model.
There are various kinds of ensemble models. The most popular of them all are:
- Bagging (Bootstrap Aggregating)
- Boosting
The basic difference between the Bagging and Boosting techniques is the way the system learns.
In Bagging, all weak learners learn independently, in parallel, and the results from the individual learners are then combined to get the final result.
In Boosting, the weak learners learn in a sequential order: each learner adapts based on the result of the previous one, and what it has learned is passed on to the next weak learner.
Bagging (Bootstrap Aggregating)
Bagging is short for Bootstrap Aggregating. As discussed, in bagging we train several weak learners and then combine their results to get the final result. If we trained all these weak learners (models) on the entire dataset, they would all see exactly the same data, and it would also be computationally expensive.
That is one of the primary reasons why we first bootstrap our data: we generate samples (subsets drawn with replacement) of the data, train each weak learner on its own sample in parallel, and then combine these learners.
Now, if we have a regression problem, we average out the results of the individual weak learners to get our final answer, whereas if we have a classification problem, we take the majority vote of their outputs as the final answer.
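To make this concrete, here is a minimal sketch using scikit-learn's BaggingClassifier on a synthetic dataset; the dataset and all parameter values are just illustrative assumptions, not a definitive recipe. Each tree is trained on its own bootstrap sample, and the ensemble combines the trees by majority vote (for regression, BaggingRegressor averages the predictions instead).

```python
# A minimal bagging sketch (illustrative only): each tree is trained on a
# bootstrap sample of the data, and the final prediction is a majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, just so the example runs end to end.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 50 weak learners (decision trees by default), each fit on a bootstrap
# sample drawn with replacement, trained in parallel.
bagging = BaggingClassifier(
    n_estimators=50,
    bootstrap=True,   # sample the training data with replacement
    n_jobs=-1,        # train the learners in parallel
    random_state=42,
)
bagging.fit(X_train, y_train)

# Classification: the ensemble takes a majority vote over the 50 trees.
print("Bagging accuracy:", bagging.score(X_test, y_test))
```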
Boosting
Boosting is also sometimes referred to as a sequential method. As the name suggests, the idea behind this method is that each weak learner learns from the output of the previous learner and adapts based on its result. So it is an iterative method in which the fit of the model improves with each weak learner.
There are various permutations and combinations possible when it comes to selecting the weak learners in boosting, and the sequence in which these models are applied also matters a lot. The sequence depends on what we intend to achieve with our model. For example, we may choose a different model sequence for a high-bias model than for a high-variance model.
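To illustrate the sequential idea in code, here is a deliberately simplified sketch (not AdaBoost or full Gradient Boosting, just the bare mechanism) in which each new weak learner, a shallow decision tree, is fit on the residuals of the ensemble built so far. The toy dataset, the learning rate of 0.1, and the choice of 100 rounds are assumptions made purely for this example.

```python
# A simplified boosting sketch for regression with squared loss:
# each new weak learner is fit on the residuals, i.e. on what the
# ensemble built so far still gets wrong.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)  # toy data

learning_rate = 0.1
prediction = np.zeros_like(y)   # the ensemble starts by predicting 0
learners = []

for _ in range(100):
    residuals = y - prediction                  # what is still unexplained
    stump = DecisionTreeRegressor(max_depth=1)  # a weak learner (a stump)
    stump.fit(X, residuals)                     # learn from the previous error
    prediction += learning_rate * stump.predict(X)
    learners.append(stump)

print("Mean squared error after boosting:", np.mean((y - prediction) ** 2))
```

The learning rate shrinks each learner's contribution so that no single weak learner dominates, which is why boosting typically uses many small, sequential steps.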
There are multiple boosting techniques like Adaptive Boosting and Gradient Boosting which I would be covering in a different article.
In a nutshell, Ensemble Methods are quite effective: by combining different models and aggregating their results or adapting from them, they improve the overall performance of our model.
I hope you liked this article. Thank you for reading.