Interview Query

Linear model assumptions wrong (The data science course)

I was browsing through the “Data Science Course: Modeling & Machine Learning” and, in the section Model Selection/Booking regression, found the following list of main assumptions of linear regression:

*There are four main assumptions in linear regression:

  1. A normal distribution of error terms
  2. Independence in the predictors
  3. The mean residuals must equal zero with constant variance
  4. No correlation between the features*

These assumptions seem wrong, or at best imprecise.

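  1. This point should also state that the errors are independent of each other.
  2. Independence of the predictors (also called features) is definitely not required for linear regression. In almost all real data the predictors are not independent. What is required is only that no predictor is an exact linear combination of the others (no perfect multicollinearity).
  3. “The mean residuals must equal zero” does not make sense either: when the model includes an intercept, the OLS residuals average to zero by construction, so this is not an assumption. It should probably read “The error terms should have mean zero and constant variance”.
  4. “No correlation between the features” is wrong again, and it largely repeats point 2. In almost all cases we do have non-zero correlation between features (predictors).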
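
A quick way to check points 2 to 4 is a small simulation. This is only a minimal sketch with NumPy on made-up data: the coefficients, sample size, seed and the roughly 0.8 correlation between the two predictors are all assumptions chosen for illustration, not anything taken from the course.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 5000                        # arbitrary sample size

# Two deliberately correlated predictors (population correlation 0.8).
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)

# Errors that DO satisfy the usual assumptions:
# independent, mean zero, constant variance.
eps = rng.normal(scale=1.0, size=n)

# Hypothetical true coefficients: intercept, x1, x2.
true_beta = np.array([1.0, 2.0, -3.0])
X = np.column_stack([np.ones(n), x1, x2])
y = X @ true_beta + eps

# Ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

predictor_corr = np.corrcoef(x1, x2)[0, 1]
residual_mean = residuals.mean()

print("correlation between predictors:", predictor_corr)
print("estimated coefficients:", beta_hat)
print("mean of residuals:", residual_mean)
```

In this setup the predictors are strongly correlated, yet the estimates should land close to the true coefficients, and the residual mean is zero up to floating-point error simply because the model has an intercept. Strong correlation does inflate the variance of the estimates, but it does not break the model unless the correlation is perfect.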

This does not seem like good quality material. What do you guys think?

