Interview Preparation for Logistic Regression
Most Important Interview Topics in Logistic Regression
- Theoretical Understanding:
- 1. What Are the Basic Assumptions?
- 2. Advantages
- 3. Disadvantages
- 4. Is Feature Scaling Required?
- 5. Missing Values
- 6. Impact of outliers?
- Types of Problems it can solve (Supervised)
- Practical Implementation
- Performance Metrics
Theoretical Understanding:
- Before studying Logistic Regression, we should understand why Linear Regression is not suitable for binary classification:
- Linear Regression works by finding a best-fit line. The accompanying figure shows Linear Regression used as a binary classifier: the fitted line is thresholded (for example at 0.5) to separate the two classes.
- When a new data point arrives that is an outlier, the best-fit line shifts, and the decision threshold shifts with it, so points that were classified correctly can become misclassified. The fitted line can also produce values below 0 or above 1, which cannot be read as probabilities. These problems with Linear Regression as a binary classifier are what motivate Logistic Regression.
- I have tried to write this out on paper so that it is easier to comprehend: [figure: handwritten notes]
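The outlier problem described above can be illustrated with a small sketch. It fits an ordinary least-squares line to made-up one-dimensional data, thresholds the line at 0.5, and then adds a single extreme point. The data values, the 0.5 threshold, and the helper names `fit_line` and `decision_boundary` are assumptions chosen for illustration only.

```python
# Least-squares line used as a binary classifier by thresholding at 0.5.
# All data values below are made up for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def decision_boundary(xs, ys, threshold=0.5):
    """Return the x value at which the fitted line crosses the threshold."""
    slope, intercept = fit_line(xs, ys)
    return (threshold - intercept) / slope


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
boundary_clean = decision_boundary(xs, ys)

# Add one extreme but correctly labelled point (class 1 at x = 40).
boundary_outlier = decision_boundary(xs + [40], ys + [1])

print(f"boundary without outlier: {boundary_clean:.2f}")
print(f"boundary with outlier:    {boundary_outlier:.2f}")
```

With this data the boundary moves from 4.5 to about 5.7, so the class-1 point at x = 5 ends up on the wrong side even though the added point was labelled correctly.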
1. What Are the Basic Assumptions?
- A linear relationship between the independent features and the log odds of the outcome
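A minimal sketch of what this assumption means: the model computes a linear score, the sigmoid turns it into a probability, and taking the log odds of that probability recovers the linear score. The coefficients `w` and `b` are hypothetical values, not fitted to any data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Log odds (logit) of a probability p."""
    return math.log(p / (1.0 - p))


w, b = 0.8, -2.0  # hypothetical coefficients for a single feature

for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    z = w * x + b           # linear in the feature
    p = sigmoid(z)          # predicted probability of the positive class
    print(f"x={x:.1f}  z={z:+.2f}  p={p:.3f}  log-odds={log_odds(p):+.2f}")
```

The printed log odds change by the same amount for each unit step in `x`, while the probability itself does not.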
2. Advantages
Advantages of Logistic Regression
- Logistic Regression is very easy to understand.
- It requires relatively little training time.
- It gives good accuracy on many simple datasets and performs well when the dataset is linearly separable.
- It makes no assumptions about the distributions of the classes in feature space.
- Logistic Regression is less prone to overfitting than more complex models, but it can overfit on high-dimensional datasets. Regularization (L1 and L2) can be used to avoid overfitting in these scenarios.
- Logistic Regression is easy to implement and interpret, and very efficient to train.
3. Disadvantages
- Sometimes a lot of feature engineering is required.
- If the independent features are correlated (multicollinearity), the coefficient estimates become unstable, which may affect performance.
- It can be sensitive to noisy data and may overfit, especially when the number of features is large.
- If the number of observations is smaller than the number of features, Logistic Regression should not be used, because it is likely to overfit.
- Non-linear problems cannot be solved with plain Logistic Regression because its decision surface is linear. Linearly separable data is rarely found in real-world scenarios.
- It is difficult to capture complex relationships with Logistic Regression. More powerful algorithms such as neural networks can easily outperform it on such problems.
- In Linear Regression, the independent and dependent variables are related linearly. Logistic Regression instead requires that the independent variables be linearly related to the log odds, log(p/(1-p)).
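4. Is Feature Scaling Required?
Yes. Scaling is recommended, especially when regularization (L1 or L2) or a gradient-based solver is used: the penalty treats all coefficients alike, and features on very different scales slow down convergence.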
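One common way to scale features is standardization (zero mean, unit variance per feature). The sketch below is a minimal illustration; the helper name `standardize` and the sample values are assumptions. In practice the mean and standard deviation would be computed on the training data only and then reused for new data.

```python
import statistics


def standardize(column):
    """Rescale a list of numbers to zero mean and unit (population) variance."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mean) / std for value in column]


# Two made-up features on very different scales.
ages = [22, 35, 47, 58, 63]
incomes = [18000, 42000, 56000, 91000, 120000]

scaled_ages = standardize(ages)
scaled_incomes = standardize(incomes)

print([round(v, 2) for v in scaled_ages])
print([round(v, 2) for v in scaled_incomes])
```

After scaling, both features span a comparable range, so neither dominates the gradient or the regularization penalty simply because of its units.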
5. Missing Values
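Logistic Regression is sensitive to missing values: they must be imputed, or the affected rows removed, before the model is fitted.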
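A minimal sketch of one simple way to handle missing values, mean imputation, is shown here. The helper name `impute_mean`, the use of `None` to mark a missing entry, and the sample values are assumptions for illustration; other strategies (median imputation, dropping rows) may suit a given dataset better.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [value for value in column if value is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = sum(observed) / len(observed)
    return [fill if value is None else value for value in column]


# Made-up feature column with two missing entries.
glucose = [85.0, None, 110.0, 95.0, None]
imputed = impute_mean(glucose)

print(imputed)
```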
6. Impact of outliers?
Like linear regression, the estimates of logistic regression are sensitive to unusual observations: outliers, high-leverage points, and influential observations.
Types of Problems it can solve (Supervised)
- Binary Classification
- Multiclass Classification
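For the multiclass case, Logistic Regression is usually extended either with a one-vs-rest scheme or with a softmax over one linear score per class. The sketch below shows the softmax variant for prediction only; the weights, biases, and the helper name `predict_class` are hypothetical values chosen for illustration, not a fitted model.

```python
import math


def softmax(scores):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(x, weights, biases):
    """Return (predicted class index, class probabilities) for one sample."""
    scores = [
        sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    return probs.index(max(probs)), probs


# Hypothetical parameters: three classes, two features.
weights = [[1.0, -0.5], [-0.5, 1.0], [-0.5, -0.5]]
biases = [0.0, 0.0, 0.5]

label, probs = predict_class([2.0, 0.5], weights, biases)
print(label, [round(p, 3) for p in probs])
```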
Practical Implementation
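The following is a minimal from-scratch sketch of binary Logistic Regression trained with batch gradient descent on the log loss, with an optional L2 penalty. It is meant to show the mechanics, not to replace a library implementation such as scikit-learn's `LogisticRegression`. The function names, hyperparameter defaults, and the small, already standardized toy dataset are all assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w, b):
    """Probability of the positive class for one sample."""
    return sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)


def train(X, y, lr=0.1, epochs=1000, l2=0.0):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for x_i, y_i in zip(X, y):
            error = predict_proba(x_i, w, b) - y_i  # derivative of log loss w.r.t. the score
            for j in range(d):
                grad_w[j] += error * x_i[j]
            grad_b += error
        for j in range(d):
            # The L2 penalty shrinks weights toward zero; the bias is not penalized.
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b, threshold=0.5):
    """Class label (0 or 1) for one sample."""
    return 1 if predict_proba(x, w, b) >= threshold else 0


# Made-up, already standardized toy data: two features, two classes.
X = [[-2.0, -1.5], [-1.5, -2.0], [-1.0, -0.5], [-0.5, -1.0],
     [0.5, 1.0], [1.0, 0.5], [1.5, 2.0], [2.0, 1.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train(X, y)
preds = [predict(x, w, b) for x in X]
print("weights:", [round(v, 3) for v in w], "bias:", round(b, 3))
print("predictions:", preds)
```

Because the toy data is linearly separable, the unregularized weights keep growing with more epochs; passing a positive `l2` value keeps them bounded, which is the role of regularization mentioned in the advantages above.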
Performance Metrics
Classification
- Confusion Matrix
- Precision, Recall, F1 score
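The metrics listed above can be computed directly from true and predicted labels. The sketch below is a minimal illustration; the helper names and the label lists are made up, and class 1 is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 score; each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels for ten samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print("confusion matrix (rows = actual, columns = predicted):")
print(f"  actual 0: [{tn}, {fp}]")
print(f"  actual 1: [{fn}, {tp}]")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision answers "of the samples predicted positive, how many were positive", while recall answers "of the positive samples, how many were found"; F1 is their harmonic mean.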



