Ordinary Least Squares and the Line of Best Fit #
Direct Solution Method
Ordinary Least Squares (OLS) is the standard way to choose the “best” line in linear regression by minimising squared prediction errors.
Key takeaway: OLS defines “best fit” as the line that minimises the total squared residual error across all data points.
Predicted Value and Residual #
For each data point \((x_i, y_i)\), the model predicts:
\[ \hat{y}_i = \beta_0 + \beta_1 x_i \]
The residual (error) is:
\[ r_i = y_i - \hat{y}_i \]
Residual intuition: a positive residual means the point lies above the line; a negative residual means it lies below the line.
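The two definitions above can be sketched in a few lines of NumPy. The data points and the coefficients `beta0` and `beta1` here are made up purely for illustration; they are not a fitted result.

```python
import numpy as np

# Hypothetical data and coefficients, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.5, 4.0, 7.5, 8.0])
beta0, beta1 = 1.0, 2.0

y_hat = beta0 + beta1 * x   # predicted values: y_hat_i = beta0 + beta1 * x_i
residuals = y - y_hat       # residuals: r_i = y_i - y_hat_i

# Report which side of the line each point falls on.
for xi, ri in zip(x, residuals):
    side = "above" if ri > 0 else "below" if ri < 0 else "on"
    print(f"x={xi:.1f}: residual={ri:+.2f} ({side} the line)")
```

With these numbers only the third point sits above the line; the other three sit below it.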
Sum of Squared Errors (SSE) #
OLS chooses \(\beta_0, \beta_1\) to minimise the Sum of Squared Errors:
\[ SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]
Why square the residuals:
- Prevents positive and negative errors cancelling out.
- Penalises large errors more strongly.
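- Produces a smooth, convex objective (a quadratic in the parameters) that is easier to optimise than, for example, absolute error, which is not differentiable at zero.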
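A small sketch of the cancellation point, using a hypothetical helper `sse` and toy data that lie exactly on \(y = 2x\). A flat line through the mean of \(y\) is a poor fit, yet its raw residuals sum to zero; squaring exposes the error.

```python
import numpy as np

def sse(beta0, beta1, x, y):
    """Sum of squared errors for the line y = beta0 + beta1 * x."""
    residuals = y - (beta0 + beta1 * x)
    return float(np.sum(residuals ** 2))

# Toy data lying exactly on y = 2x (an assumption for illustration).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Flat line at the mean of y (beta0 = 5, beta1 = 0): residuals are -3, -1, 1, 3.
flat_residuals = y - 5.0
print("sum of raw residuals:", flat_residuals.sum())   # positives and negatives cancel
print("SSE of flat line:    ", sse(5.0, 0.0, x, y))    # squaring removes the cancellation
print("SSE of true line:    ", sse(0.0, 2.0, x, y))    # the exact line has no error
```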
Closed-Form Solution (Single Predictor) #
When there is a single predictor variable, the solution can be written using covariance and variance:
\[ \beta_1 = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)} \]
And the intercept:
\[ \beta_0 = \bar{y} - \beta_1 \bar{x} \]
Interpretation: \(\beta_1\) is the slope (how much \(y\) changes per unit change in \(x\)). \(\beta_0\) is the intercept (the predicted value when \(x = 0\)).
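The closed-form expressions translate almost directly into code. The following is a minimal sketch, assuming NumPy and a hypothetical helper name `fit_line`; the sample data are invented. Because the same normalising factor appears in both the covariance and the variance, it cancels in the ratio, so plain sums of centred products are enough.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form OLS for one predictor: returns (beta0, beta1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # The 1/n (or 1/(n-1)) factors of cov and var cancel in the ratio.
    cov_xy = np.sum((x - x_mean) * (y - y_mean))
    var_x = np.sum((x - x_mean) ** 2)   # zero if every x is identical: no unique slope
    beta1 = cov_xy / var_x
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

# Invented sample data, roughly following y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.1, 7.9, 10.2]
beta0, beta1 = fit_line(x, y)
print(f"intercept={beta0:.2f}, slope={beta1:.2f}")
```

As a sanity check, the result should agree with `np.polyfit(x, y, 1)`, which returns the slope first and the intercept second.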
Matrix View (General Linear Regression Framework) #
Fitting a model can be expressed as an overdetermined system:
\[ Ap = b \]
Where:
- \(A\) is the design matrix.
- \(p\) is the parameter vector.
- \(b\) is the output vector.
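For a straight-line model, one way to set this up is a design matrix with a column of ones (for the intercept) and a column of \(x\) values. The sketch below uses invented data and NumPy's `np.linalg.lstsq`, which returns the \(p\) that minimises \(\lVert Ap - b \rVert^2\). With four equations and two unknowns the system has no exact solution, which is what "overdetermined" means here.

```python
import numpy as np

# Invented data: four observations, two unknown parameters.
x = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.1, 2.9, 5.2, 6.8])

# Design matrix: one row per observation, one column per parameter.
A = np.column_stack([np.ones_like(x), x])
print("A has shape", A.shape)   # more rows than columns

# Least-squares solution of the overdetermined system A p = b.
p, *_ = np.linalg.lstsq(A, b, rcond=None)
print("intercept, slope:", p)

# The fitted line does not pass through every point, so A p != b exactly.
print("remaining residuals:", b - A @ p)
```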
The least-squares solution satisfies the normal equations:
\[ A^T A p = A^T b \]
And (when \(A^T A\) is invertible):
\[ p^* = (A^T A)^{-1}A^T b \]
Practical Meaning #
OLS is:
- Fast and standard for many regression problems.
- Sensitive to outliers (because squaring makes large errors dominate).
- A foundation for extensions like regularisation (ridge, LASSO).
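The outlier sensitivity can be seen in a small experiment. This sketch solves the normal equations \(A^T A p = A^T b\) with `np.linalg.solve` (avoiding an explicit inverse) through a hypothetical helper `ols_normal_equations`, first on synthetic points lying exactly on \(y = 1 + 2x\), then again after corrupting a single point.

```python
import numpy as np

def ols_normal_equations(x, y):
    """Fit y = p[0] + p[1] * x by solving the normal equations."""
    A = np.column_stack([np.ones_like(x), x])
    # Solve (A^T A) p = A^T y directly rather than forming the inverse.
    return np.linalg.solve(A.T @ A, A.T @ y)

# Synthetic data lying exactly on y = 1 + 2x.
x = np.arange(10, dtype=float)
y = 1.0 + 2.0 * x

clean = ols_normal_equations(x, y)

# Corrupt one observation out of ten and refit.
y_outlier = y.copy()
y_outlier[-1] += 50.0
shifted = ols_normal_equations(x, y_outlier)

print("clean fit   (intercept, slope):", clean)
print("with outlier (intercept, slope):", shifted)
```

A single bad point more than doubles the estimated slope in this setup, because its squared residual dominates the objective.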