17.8 Heteroscedasticity
The constant-variance assumption of the errors is also key assumption for Multiple Linear Regression.
For example, if the assumption does not hold, then the CIs for prediction will not respect the confidence for which they were built.
How to diagnose: look at a plot of residuals versus predicted values. Heteroskedasticity can be detected by looking into irregular vertical dispersion patterns in the residuals vs. fitted values plot.
The homoskedasticity assumption may be violated for a variety of reasons:
- For example, if we are regressing non-essential spending for a family based on income, then we might expect more variability for richer families compared to poorer families.
- Also, misspecification can cause heteroskedasticity. E.g. using a regression model that includes independent variables \(x_1\) and \(x_2\) but excludes \(x_1^2\) or \(x_1x_2\) when one of these is relevant.
References:
- Adapted from here
- Also Real-Statistics and here