Classification problems are common in domains such as finance and telecommunications, for example predicting customer churn in telecom. The choice of method, whether classical or machine learning, depends on business priorities, and each option comes with certain advantages and disadvantages. In this blog, I will examine two machine learning algorithms: boosting and random forest.
In boosting, classifiers are trained one after another on reweighted versions of the training data (some variants draw weighted samples with replacement instead). Boosting works iteratively on weak learners, which have high bias and low variance, and gives more weight to the misclassified observations in the next iteration. The final classification is made by combining the predictions of all the classifiers. Boosting generally uses a decision tree as the base model, although linear regression or logistic regression can be used as a base model too. The trees are kept small and are not pruned; trees with 2 to 8 leaves work well. AdaBoost is a common boosting method that follows this process.
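The reweighting loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used for the comparison in this post: it handles two classes coded as -1 and +1, uses one-split trees (stumps) as the weak learner, and the function names (fit_stump, adaboost, predict) and the toy data are made up for the example.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best one-split rule."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def adaboost(X, y, rounds=10):
    """Binary AdaBoost sketch; labels must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with equal observation weights
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # this learner's vote strength
        model.append((alpha, stump))
        # Raise the weight of misclassified observations, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Combine all weak learners by a weighted vote."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single stump can separate this pattern.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
print([predict(model, row) for row in X])
```

A single stump gets at least two of these six points wrong, while the weighted vote of a few stumps can recover the pattern, which is the point of combining weak learners. Multi-class problems such as the glass data need a multi-class variant of the algorithm, which this sketch does not cover.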
Random forest is an ensemble method that works on bootstrapped samples and uncorrelated classifiers. The model learns from many overgrown trees, and the final decision is made by majority vote. In this method, a random subset of predictors is also sampled at each node. It works best with overfitted base models that have low bias and high variance, and it is a bagged model.
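The three ingredients named above (bootstrapped samples, predictors sampled at each node, and a majority vote over overgrown trees) can be put together in a short Python sketch. It is a simplified illustration and not the R implementation: the function names, the Gini split criterion, the mtry parameter name, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(rows, labels, mtry, rng):
    """Grow an unpruned tree; each node looks at only mtry randomly chosen predictors."""
    if len(set(labels)) == 1:
        return labels[0]                              # pure leaf
    best = None
    for j in rng.sample(range(len(rows[0])), mtry):   # predictor sampling per node
        for t in sorted({r[j] for r in rows})[:-1]:   # skip max so both sides are non-empty
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:                                  # sampled predictors are constant here
        return Counter(labels).most_common(1)[0][0]
    _, j, t, left, right = best
    return (j, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left], mtry, rng),
            grow_tree([rows[i] for i in right], [labels[i] for i in right], mtry, rng))

def tree_predict(node, row):
    # Internal nodes are tuples; leaves are class labels (assumed not to be tuples).
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def random_forest(rows, labels, n_trees=25, mtry=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]    # bootstrap: sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], mtry, rng))
    return forest

def forest_predict(forest, row):
    """Final decision by majority vote across the trees."""
    votes = Counter(tree_predict(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated groups with two predictors.
rows = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [0.1, 0.3],
        [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
forest = random_forest(rows, labels)
print(forest_predict(forest, [-1.0, -1.0]), forest_predict(forest, [6.0, 6.0]))
```

Because every tree sees a different bootstrap sample and a different subset of predictors at each split, the individual trees overfit in different ways, and the vote averages much of that variance away.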
I used the fgl dataset in R (from the MASS package) to compare these two methods. Its predictors are the refractive index (RI) and the oxide content of eight chemical elements (Na, Mg, Al, Si, K, Ca, Ba, and Fe), and the target variable is the type of forensic glass: WinF, WinNF, Veh, Con, Tabl, or Head.
In short, in this comparison the random forest was not only more accurate than the boosting model but also more explainable, because it reports the importance of the various predictors. See the importance of predictors as given by random forest. Boosting is typically used for high-dimensional data sets.
This is a guest post by Rajiv Pandey. He is currently doing the Executive Program In Business Analytics course (EPBA) from Jigsaw Academy.