Under the Hood of AdaBoost

by Holly Emblem, December 28th, 2018

A short introduction to the AdaBoost algorithm

In this post, we will cover a very brief introduction to boosting algorithms and then delve under the hood of a popular boosting algorithm, AdaBoost. The aim is to provide a gentle introduction to some of the key concepts behind boosting and AdaBoost. This isn’t a definitive comparison of AdaBoost versus Gradient Boosting and the like, but rather a summary of the key theoretical points needed to understand the algorithm.

Real World Applications for AdaBoost

AdaBoost can be used to solve a variety of real-world problems, such as predicting customer churn and classifying the topics customers are calling or writing in about. The algorithm is widely used for classification problems, given its relative ease of implementation in languages such as R and Python.
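
As a rough illustration of that ease of implementation, here is a minimal sketch using scikit-learn's AdaBoostClassifier; the synthetic dataset and hyperparameter values are placeholders standing in for a real problem such as churn prediction.

```python
# Minimal AdaBoost sketch with scikit-learn; the synthetic data and
# hyperparameters below are illustrative placeholders only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in for a real dataset (e.g. customer churn): 1,000 samples, 20 features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 50 weak learners fitted sequentially (decision stumps by default).
model = AdaBoostClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```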

What are Boosting Algorithms?

Boosting algorithms fall within the broader family of ensemble modelling. Broadly speaking, there are two key approaches to model building within data science: building a single model and building an ensemble of models. Boosting takes the latter approach, and in AdaBoost the models are constructed as follows: at each iteration, a new weak learner is introduced sequentially and aims to compensate for the “shortcomings” of the prior models, so that together they form a strong classifier. The overall goal is to successively fit new models that provide increasingly accurate estimates of our response variable.

Boosting works from the assumption that each weak hypothesis, or model, achieves higher accuracy than random guessing. This assumption is known as the “weak learning condition”.

What is AdaBoost?

The AdaBoost algorithm was developed by Freund and Schapire in 1996 and is still heavily used across a range of industries. AdaBoost reaches its end goal of a strong classifier by sequentially introducing new models that compensate for the “shortcomings” of prior models. Scikit-learn summarises AdaBoost’s core principle as follows: it “fits a sequence of weak learners on repeatedly modified versions of the data.” This definition will allow us to understand and expand upon AdaBoost’s processes.

Getting Started

To begin with, a weak classifier is trained, with all of the training samples given an equal weight. Once this initial classifier is trained, two things happen. First, a weight is calculated for the classifier itself, with more accurate classifiers receiving a higher weight and less accurate classifiers a lower one. This weight is based on the classifier’s error rate: the weighted proportion of training samples it misclassifies, which with equal sample weights is simply the number of misclassifications divided by the size of the training set. This per-model output weight is known as the “alpha”.
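
As a toy illustration of that error-rate calculation (the labels and predictions below are made up for the sketch, not taken from the article):

```python
import numpy as np

# Toy labels and predictions from a single weak classifier (illustrative only).
y_true = np.array([1, 1, -1, -1, 1, -1])
y_pred = np.array([1, -1, -1, -1, 1, 1])

# All samples start with equal weight, so the weighted error rate here is
# simply the fraction of misclassified training examples.
weights = np.full(len(y_true), 1 / len(y_true))
error_rate = np.sum(weights * (y_pred != y_true))
print(error_rate)  # 2 misclassifications out of 6 samples ≈ 0.333
```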

Calculating the Alpha

Each classifier will have a weight calculated, which is based on the classifier’s error rate.

At each iteration, the alpha of the classifier is calculated: the lower the error rate, the higher the alpha. This is visualised as follows:

Image from Chris McCormick’s excellent AdaBoost tutorial
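
The figure above plots alpha against the classifier's error rate. For reference, the standard formula is alpha = ½ ln((1 − ε) / ε), where ε is the error rate; a small sketch, assuming that standard form:

```python
import numpy as np

def alpha(error_rate):
    """Standard AdaBoost classifier weight: the lower the error, the higher the alpha."""
    return 0.5 * np.log((1 - error_rate) / error_rate)

print(alpha(0.10))  # accurate classifier -> large positive alpha (~1.10)
print(alpha(0.40))  # mediocre classifier -> small alpha (~0.20)
print(alpha(0.50))  # no better than random guessing -> alpha of 0
```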

Intuitively, there is also a relationship between a training example’s weight and the alpha. If a classifier with a high alpha misclassifies a training example, that example will be given more weight than an example misclassified by a classifier with a low alpha. This is intuitive in the sense that a classifier with a higher alpha is a more reliable witness: when it misclassifies something, we want to investigate that example further.

Understanding Weights for Training Samples

Secondly, the AdaBoost algorithm directs its attention to the examples misclassified by our first weak classifier. It does this by assigning a weight to each data sample, whose value depends on whether the classifier classified that sample correctly or incorrectly.

We can break down a visualisation of the weights per example below:

Step 1: Our first model, where w_i = 1/N

In this instance, we can see that each training example has an equal weight, and that the model classifies some examples correctly and others incorrectly. After each iteration, the sample weights are modified: samples with higher weights (those that were incorrectly classified) are more likely to be included in the next training set. When a sample is correctly classified, it is given less weight in the next step of model building.

An example of a later model, where the weights have been changed:

The formula for this weight update is shown below:
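
For reference, the standard AdaBoost update multiplies each sample’s weight by exp(+alpha) when the sample was misclassified and exp(−alpha) when it was classified correctly, and then re-normalises. A sketch under that assumption:

```python
import numpy as np

# Toy values carried over from the error-rate sketch above (illustrative only).
y_true = np.array([1, 1, -1, -1, 1, -1])
y_pred = np.array([1, -1, -1, -1, 1, 1])
weights = np.full(len(y_true), 1 / len(y_true))
alpha = 0.35  # the weak classifier's alpha

# Misclassified samples (y_true * y_pred == -1) are multiplied by exp(+alpha),
# gaining weight; correct ones (product == +1) by exp(-alpha), losing weight.
new_weights = weights * np.exp(-alpha * y_true * y_pred)
new_weights /= new_weights.sum()  # re-normalise so the weights sum to 1
print(new_weights)
```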

Building the Final Classifier

Once all of the iterations have been completed, all of the weak learners are combined with their weights to form a strong classifier, as expressed in the equation below:

The final classifier is therefore built up of T weak classifiers, where h_t(x) is the output of weak classifier t and α_t is the weight applied to that classifier. The final output is a weighted combination of all of the classifiers.
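
Assuming the standard combination rule, H(x) = sign(Σ_t α_t·h_t(x)), the final prediction can be sketched as an alpha-weighted vote; the stumps and alpha values below are invented purely for illustration:

```python
import numpy as np

def adaboost_predict(weak_classifiers, alphas, X):
    """Final strong classifier: the sign of the alpha-weighted sum of each
    weak classifier's +/-1 output."""
    weighted_votes = sum(a * clf(X) for clf, a in zip(weak_classifiers, alphas))
    return np.sign(weighted_votes)

# Two illustrative weak learners (simple threshold rules, i.e. decision stumps).
stumps = [
    lambda X: np.where(X[:, 0] > 0, 1, -1),
    lambda X: np.where(X[:, 1] > 0, 1, -1),
]
alphas = [0.8, 0.4]

X = np.array([[1.0, -2.0], [-1.0, 3.0], [2.0, 2.0]])
print(adaboost_predict(stumps, alphas, X))  # -> [ 1. -1.  1.]
```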

This is a whistle-stop tour of the theory of AdaBoost, and should be seen as an introductory exploration of the boosting algorithm. For further reading, I recommend the following resources:

Recommended Further Reading

Natekin, A. and Knoll, A. (2018). Gradient boosting machines, a tutorial. Frontiers in Neurorobotics. [online] Available at: https://core.ac.uk/download/pdf/82873637.pdf

Li, C. (2018). A Gentle Introduction to Gradient Boosting. [online] Available at: http://www.ccs.neu.edu/home/vip/teach/MLcourse/4_boosting/slides/gradient_boosting.pdf

Schapire, R. (2018). Explaining AdaBoost. [online] Available at: https://math.arizona.edu/~hzhang/math574m/Read/explaining-adaboost.pdf

YouTube. (2018). Extending Machine Learning Algorithms — AdaBoost Classifier | packtpub.com. [online] Available at: https://www.youtube.com/watch?v=BoGNyWW9-mE

Scikit-learn.org. (2018). 1.11. Ensemble methods — scikit-learn 0.20.2 documentation. [online] Available at: https://scikit-learn.org/stable/modules/ensemble.html#AdaBoost

McCormick, C. (2013). AdaBoost Tutorial. [online] Available at: http://mccormickml.com/2013/12/13/adaboost-tutorial/

Mohri, M. (2018). Foundations of Machine Learning. [online] Available at: https://cs.nyu.edu/~mohri/mls/ml_boosting.pdf