THEORETICAL UNDERSTANDING

  • How does the Gradient Boosting Regressor work?

    • STEP 1: Calculate the average of the target values, set it as the initial predicted value, and store it in a new column.
    • STEP 2: Calculate the residual (the difference between the actual and the predicted value) and store it in a separate column.
    • STEP 3: The residual column now becomes the new target, so the model fits a weak learner (decision tree) on it to predict the residuals.
    • STEP 4: Update the current predicted value using the following formula:
    • New Prediction = Previous Prediction + Learning Rate × Predicted Residual
    • The model repeats STEPS 2–4 until the requested number of trees has been grown (see the sketch below).
    • Final Prediction = Initial Prediction (average) + Learning Rate × (Tree 1 Prediction + Tree 2 Prediction + … + Tree n Prediction)
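A minimal from-scratch sketch of STEPS 1–4, using scikit-learn's DecisionTreeRegressor as the weak learner; the toy data, the learning rate of 0.1 and the tree count of 100 are made up purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (made up for illustration)
X = np.array([[1], [2], [3], [4], [5]], dtype=float)
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

learning_rate = 0.1
n_trees = 100

# STEP 1: start from the average of the target as the initial prediction
pred = np.full_like(y, y.mean())

trees = []
for _ in range(n_trees):
    residual = y - pred                      # STEP 2: residual = actual - predicted
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residual)                    # STEP 3: fit a weak learner on the residuals
    pred += learning_rate * tree.predict(X)  # STEP 4: update prediction with a shrunken step
    trees.append(tree)

print(pred)  # the predictions move closer to y at every iteration
```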
    • INTERVIEW TOPICS

  • 1. What are the basic assumptions?

    • There are none; Gradient Boosting makes no assumptions about the underlying data distribution.
  • Missing Values

    • Gradient Boosting (as implemented in scikit-learn's GradientBoostingRegressor) cannot handle missing values directly; they must be imputed first (see the sketch below).
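Since scikit-learn's GradientBoostingRegressor raises an error on NaN inputs, a common workaround is to impute before fitting. A minimal sketch with made-up data and an assumed mean-imputation strategy:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.ensemble import GradientBoostingRegressor

# Toy data containing a missing value (made up for illustration)
X = np.array([[1.0], [2.0], [np.nan], [4.0], [5.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Fill missing values, then boost
model = make_pipeline(SimpleImputer(strategy="mean"),
                      GradientBoostingRegressor())
model.fit(X, y)
```

Note that scikit-learn's histogram-based variant, HistGradientBoostingRegressor, supports missing values natively.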
  • 2. Advantages of Gradient Boosting

    • It delivers strong predictive performance.
    • It can model complex non-linear relationships.
    • It adapts well to a wide range of ML use cases.
  • 3. Disadvantages of Gradient Boosting

    • It requires some amount of parameter tuning.
  • 4. Is feature scaling required?

    • It is a tree-based (rule-based) algorithm that does not rely on distance calculations, so feature scaling is not required (see the check below).
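A quick check on made-up data: because tree splits depend only on the ordering of feature values, scaling the inputs should leave the predictions unchanged.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingRegressor

X = np.array([[1.0], [20.0], [300.0], [4000.0], [50000.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X_scaled = StandardScaler().fit_transform(X)

pred_raw = GradientBoostingRegressor(random_state=0).fit(X, y).predict(X)
pred_scaled = GradientBoostingRegressor(random_state=0).fit(X_scaled, y).predict(X_scaled)

print(np.allclose(pred_raw, pred_scaled))  # True: splits use value order, not magnitude
```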
  • 6. Impact of outliers?

    • Decision tree splits are defined by the sample proportions in each split region, not by absolute values, and they do not care how far a point lies from the split threshold, so outliers in the input features have limited impact. However, with a squared-error loss, outliers in the target produce large residuals that later trees chase, so robust losses such as Huber or absolute error are commonly used when target outliers are present.
  • 7. What elements are involved in the Gradient Boosting algorithm?

    • A loss function to be optimized.
    • A weak learner (decision tree) for making predictions.
    • An additive model that adds weak learners one at a time to minimize the loss function (these map onto parameters as shown below).
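In scikit-learn's GradientBoostingRegressor (recent versions) these three elements map directly onto constructor parameters; the values below are illustrative only.

```python
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(
    loss="squared_error",  # the loss function being optimized
    max_depth=3,           # size of each weak learner (a shallow decision tree)
    n_estimators=100,      # how many weak learners the additive model stacks up
    learning_rate=0.1,     # shrinkage applied to each additive step
)
```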
  • 8. How can we improve the Gradient Boosting algorithm?

    • Pick a lower learning rate (shrinkage), typically between 0.1 and 0.3.
    • Apply tree constraints on the number of trees, tree depth, minimum improvement in loss, and number of observations per split.
    • Lower the learning rate and proportionally increase the number of decision trees/estimators to achieve more robust models.
    • Establish penalized learning.
    • Implement random sampling of rows and columns (stochastic gradient boosting).
    • Utilize regularization (see the parameter sketch below).
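Most of these levers appear directly as GradientBoostingRegressor parameters; the values below are assumptions for illustration, not tuned recommendations.

```python
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(
    learning_rate=0.1,          # shrinkage: lower values give smaller, safer steps
    n_estimators=500,           # grow more trees as the learning rate drops
    max_depth=3,                # tree constraint: depth
    min_samples_split=10,       # tree constraint: observations required per split
    min_impurity_decrease=0.0,  # tree constraint: minimum improvement to allow a split
    subsample=0.8,              # random row sampling (stochastic gradient boosting)
)
```

For explicit penalized learning, libraries such as XGBoost expose L1/L2 regularization terms (reg_alpha, reg_lambda) on the leaf weights.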
  • What is the difference between the Gradient Boosting algorithm and the Random Forest algorithm?

    • GB is more prone to overfitting when the data is noisy, which is not the case for the RF algorithm.
    • GB takes longer to train because its decision trees are built sequentially, whereas Random Forest trees are independent and can be built in parallel.
    • The GB algorithm is harder to tune.
    • GB combines weak learners to obtain a strong prediction, whereas Random Forest takes a majority vote (classification) or an average (regression) to get the prediction.
    • Random Forest mainly reduces variance, so it can retain more bias than GB, which reduces bias at every iteration.
    • RF does not use a sequential approach, so it does not handle unbalanced datasets as well as boosting does.
    • RF uses fully grown decision trees, whereas GB typically uses shallow ones (see the side-by-side sketch below).
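The contrast shows up directly in scikit-learn's defaults; a side-by-side sketch (no data needed to see the difference in configuration):

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Boosting: shallow trees (max_depth=3 by default), built one after another on residuals
gb = GradientBoostingRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)

# Random Forest: fully grown trees (max_depth=None by default), built independently,
# so training can be spread across all CPU cores
rf = RandomForestRegressor(n_estimators=100, max_depth=None, n_jobs=-1)
```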
  • What is the difference between the AdaBoost and Gradient Boosting algorithms?

    • GB trains each learner by minimizing a loss function (fitting the residuals/gradients), while AdaBoost trains by re-weighting the data to concentrate on misclassified observations.
    • The weak learners in AdaBoost are a very basic form of decision tree called stumps (a single split), while in Gradient Boosting they are more complex (deeper) trees.
    • All the learners in GB contribute with the same weight (the learning rate), but in AdaBoost the final prediction is a weighted combination of the weak learners' predictions, weighted by their individual accuracy (see the sketch below).
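A sketch of the two set-ups in scikit-learn; note that the stump is passed explicitly, and the parameter is named estimator in recent versions (base_estimator in older ones).

```python
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# AdaBoost: stump (depth-1 tree) weak learners; the final prediction weights each
# learner by its accuracy on the re-weighted training data
ada = AdaBoostRegressor(estimator=DecisionTreeRegressor(max_depth=1), n_estimators=100)

# Gradient Boosting: deeper trees fitted to the residuals, each added with the same
# shrinkage factor (learning_rate)
gb = GradientBoostingRegressor(max_depth=3, learning_rate=0.1, n_estimators=100)
```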