AdaBoost Regressor.
All about AdaBoost Regressor and associated interview questions.
THEORETICAL UNDERSTANDING
- It is a sequential learning process in which each model depends on the output of the previous model.
- The models do not contribute equally: a model that predicts well gets a higher weightage (importance) in the final prediction, and a model that makes mistakes gets a lower weightage.
- Here the decision trees are called Stumps (one root node and two leaf nodes).
- STEP 1: Assign equal weights to all records by setting each weight to 1/N, where N is the total number of records, so that the weights sum to 1.
- STEP 2: A decision tree STUMP is created by the usual process, which I have described in my previous post on decision trees LINK.
- STEP 3: If the STUMP created in the previous step predicts an incorrect output for a record, we give that record a higher weight so that it gets corrected in the next iteration. This is done in the following steps:
- Calculate the Total Error (TE), which is the sum of the weights of the records that our STUMP misclassified.
- Calculate the Performance (p) of the STUMP using the following formula:
  p = (1/2) * ln((1 - TE) / TE)
- Update the weight of each incorrectly classified record using the following formula, which gives it a higher weight:
  new weight = old weight * e^p
- Update the weight of each correctly classified record using the following formula, which gives it a lower weight:
  new weight = old weight * e^(-p)
- STEP 4: Update the weights and normalize them by dividing each calculated weight by the sum of all calculated weights, so that the WEIGHTS AGAIN SUM TO 1:
  normalized weight = calculated weight / (sum of all calculated weights)
- STEP 5: Create a new dataset by sampling records according to the normalized weights: build buckets (cumulative weight ranges) so that in each iteration the incorrectly predicted records, which carry higher weights, are more likely to be captured and populated into the new dataset. With this new dataset the process starts again from STEP 1.
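The weight-update steps above can be sketched numerically. Here is a toy example in plain Python, assuming 5 records of which the stump misclassifies only one (all numbers are purely illustrative):

```python
import math

# STEP 1: equal initial weights, 1/N each, so they sum to 1
N = 5
weights = [1 / N] * N          # [0.2, 0.2, 0.2, 0.2, 0.2]

# Suppose the stump misclassifies only the record at index 2
incorrect = {2}

# Total Error (TE): sum of the weights of the misclassified records
TE = sum(w for i, w in enumerate(weights) if i in incorrect)   # 0.2

# Performance of the stump: p = (1/2) * ln((1 - TE) / TE)
p = 0.5 * math.log((1 - TE) / TE)

# Incorrect records get weight * e^p (higher),
# correct records get weight * e^(-p) (lower)
updated = [w * math.exp(p if i in incorrect else -p)
           for i, w in enumerate(weights)]

# STEP 4: normalize so the weights sum to 1 again
total = sum(updated)
normalized = [w / total for w in updated]

print(normalized)   # the misclassified record now carries the largest weight
```

With these numbers the misclassified record ends up with weight 0.5 and each correct record with 0.125, so the next STUMP focuses on the hard record.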
INTERVIEW TOPICS
- What are the assumptions of AdaBoost?
- There are no such assumptions.
- Can AdaBoost handle missing values?
- AdaBoost can handle missing values.
- What are the advantages of AdaBoost?
- It is less prone to overfitting.
- It has few parameters to tune.
- Is feature scaling required for AdaBoost?
- Since it is a rule-based model and no distance calculation is required, there is no need for feature scaling.
- How does AdaBoost behave with outliers?
- AdaBoost is sensitive to outliers.
- What types of problems can AdaBoost solve?
- Classification
- Regression
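For the regression case, a minimal scikit-learn sketch (assuming scikit-learn is installed; the dataset and hyperparameters here are illustrative, not a recommendation):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=200, n_features=4, n_informative=4,
                       noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# By default scikit-learn boosts DecisionTreeRegressor(max_depth=3);
# passing a max_depth=1 tree via the `estimator` argument (named
# `base_estimator` before scikit-learn 1.2) boosts stumps instead.
model = AdaBoostRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print(model.score(X_test, y_test))   # R2 on the held-out split
```

AdaBoostClassifier is used the same way for classification problems.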
- Which performance metrics are used for classification?
- Confusion Matrix
- Precision, Recall, F1 score
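A quick sketch of these classification metrics with scikit-learn (the labels below are made up for illustration):

```python
from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_pred)

precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(cm, precision, recall, f1)
```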
- Which performance metrics are used for regression?
- R2, Adjusted R2
- MSE, RMSE, MAE
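And the regression metrics, again with scikit-learn (toy values; the adjusted-R2 line assumes n samples and k features, with k = 1 chosen here just for illustration):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_true, y_pred)   # mean of squared errors
rmse = np.sqrt(mse)                        # same units as the target
mae = mean_absolute_error(y_true, y_pred)  # mean of absolute errors
r2 = r2_score(y_true, y_pred)

# Adjusted R2 penalizes extra features: 1 - (1 - R2) * (n - 1) / (n - k - 1)
n, k = len(y_true), 1   # k = number of features (assumed 1 here)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(mse, rmse, mae, r2, adj_r2)
```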