Direct solution method - Ordinary Least Squares and the Line of Best Fit #
It is possible to compute the best parameters for linear regression in one shot (closed-form), instead of iteratively improving them step-by-step.
For linear regression, the standard direct method is Ordinary Least Squares (OLS), which chooses the “best” line by minimising squared prediction errors.
Key takeaway: OLS defines “best fit” as the line that minimises the total squared residual error across all data points.
Predicted value and residual #
For each data point $(x_i, y_i)$, the model predicts:
\[ \hat{y}_i = \beta_0 + \beta_1 x_i \]
The residual (error) is:
\[ r_i = y_i - \hat{y}_i \]
Residual intuition:
- Positive residual: the point is above the line
- Negative residual: the point is below the line
Sum of squared errors (SSE) #
Ordinary Least Squares chooses $\beta_0, \beta_1$ to minimise the sum of squared errors:
\[ SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]
Why square the residuals:
- Prevents positive and negative errors cancelling out
- Penalises large errors more strongly (this is why OLS is called “least squares”)
- Produces a smooth objective that is easier to optimise
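The residual and SSE definitions above can be sketched in a few lines of Python. This is a minimal illustration with made-up data and an arbitrary candidate line, not part of the original notes:

```python
# Sketch: residuals and SSE for a candidate line (toy data, hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.3, 5.9, 8.2]
b0, b1 = 0.0, 2.0  # candidate intercept and slope (not the OLS optimum)

preds = [b0 + b1 * x for x in xs]                # y_hat_i = b0 + b1 * x_i
residuals = [y - p for y, p in zip(ys, preds)]   # r_i = y_i - y_hat_i
sse = sum(r ** 2 for r in residuals)             # sum of squared errors

print(residuals)  # positive -> point above the line, negative -> below
print(sse)
```

OLS would pick the $(\beta_0, \beta_1)$ pair that makes this `sse` as small as possible over all candidate lines.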
Closed-form solution (single predictor) #
When there is a single predictor variable, the solution can be written using covariance and variance:
\[ \beta_1 = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)} \]
And the intercept:
\[ \beta_0 = \bar{y} - \beta_1 \bar{x} \]
Interpretation:
- $\beta_1$ is the slope (how much $y$ changes per unit change in $x$)
- $\beta_0$ is the intercept (predicted value when $x=0$)
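The covariance/variance form translates directly into code. A minimal sketch with toy data (the dataset here is invented for illustration):

```python
# Sketch: single-predictor OLS via the covariance/variance form (toy data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# cov(x, y) and var(x); the 1/n factors cancel in the ratio,
# so dividing both by n (or n-1) gives the same slope.
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
var_x = sum((x - x_bar) ** 2 for x in xs) / n

b1 = cov_xy / var_x       # slope
b0 = y_bar - b1 * x_bar   # intercept
print(b1, b0)             # -> slope 0.6, intercept 2.2
```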
OLS numerical (simple linear regression) #
If the question gives a small table of $(x_i, y_i)$ and asks “find the best-fit line / optimal $\theta_0,\theta_1$” (with no learning rate / iteration info), use OLS.
Step-by-step (what you write in the exam) #
- Compute means:
\[ \bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i, \qquad \bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i \]
- Compute the two sums:
\[ S_{xy}=\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}), \qquad S_{xx}=\sum_{i=1}^{n}(x_i-\bar{x})^2 \]
- Slope and intercept:
\[ \beta_1=\frac{S_{xy}}{S_{xx}}, \qquad \beta_0=\bar{y}-\beta_1\bar{x} \]
- Final regression line: $\hat{y}=\beta_0+\beta_1 x$.
Step-by-step (OLS for simple linear regression) - alternate representation #
Identify the given data
- Independent variable: $X$ (Minutes Studied)
- Dependent variable: $Y$ (Score)
Calculate the means
- Compute the mean of $X$: $\,\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$
- Compute the mean of $Y$: $\,\bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i$
Determine the slope ($\theta_1$)
OLS gives the slope as:
\[ \theta_1=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \]
Compute the components:
| $x_i$ | $y_i$ | $x_i-\bar{x}$ | $y_i-\bar{y}$ | $(x_i-\bar{x})(y_i-\bar{y})$ | $(x_i-\bar{x})^2$ |
|---|---|---|---|---|---|
| 10 | 60 | -15 | -11.25 | 168.75 | 225 |
| 20 | 65 | -5 | -6.25 | 31.25 | 25 |
| 30 | 75 | 5 | 3.75 | 18.75 | 25 |
| 40 | 85 | 15 | 13.75 | 206.25 | 225 |
| Sum | | | | 425 | 500 |

So: $\theta_1=\frac{425}{500}=0.85$
Determine the intercept ($\theta_0$)
Using the calculated $\theta_1$ and the means $\bar{x}=25$, $\bar{y}=71.25$:
\[ \theta_0=\bar{y}-\theta_1\bar{x}=71.25-0.85\times 25=50 \]
Final regression line: $\hat{y}=50+0.85x$.
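The worked table can be checked with a short script. This sketch reproduces the computation from the Minutes Studied / Score data above:

```python
# Sketch: reproduce the worked example (Minutes Studied vs Score).
xs = [10, 20, 30, 40]
ys = [60, 65, 75, 85]

x_bar = sum(xs) / len(xs)   # 25.0
y_bar = sum(ys) / len(ys)   # 71.25

# Numerator and denominator from the table columns
num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # 425.0
den = sum((x - x_bar) ** 2 for x in xs)                       # 500.0

theta1 = num / den               # 425 / 500 = 0.85
theta0 = y_bar - theta1 * x_bar  # 71.25 - 21.25 = 50.0
print(theta1, theta0)
```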
Matrix view (general linear regression framework) #
Matrix calculation is very important for exam numericals. You can compute the OLS solution using either:
- direct matrix multiplication, or
- covariance/variance form (single-feature case).
Design matrix (bias term) #
For linear regression, we include a bias term by setting the first feature $x_0=1$ for every record.
Example (one feature):
- parameters: $\theta = [\theta_0,\theta_1]^T$
- design matrix: $X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$
- target vector: $y = [y_1,\dots,y_n]^T$
Normal equation (closed-form) #
\[ \theta^* = (X^T X)^{-1}X^T y \]
Dimension check (quick sanity):
- $X$: $n\times d$
- $X^T X$: $d\times d$
- $(X^T X)^{-1}X^T y$: $d\times 1$ (same shape as $\theta$)
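The normal equation is a one-liner with NumPy. This sketch reuses the study-time data from the worked example and should recover the same $\theta$:

```python
# Sketch: normal equation theta* = (X^T X)^{-1} X^T y with NumPy.
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([60.0, 65.0, 75.0, 85.0])

# Design matrix: first column of ones is the bias feature x_0 = 1
X = np.column_stack([np.ones_like(x), x])   # shape (n, d) = (4, 2)

theta = np.linalg.inv(X.T @ X) @ X.T @ y    # shape (d,) = (2,)
print(theta)                                # [intercept, slope]
```

In real code you would prefer `np.linalg.lstsq(X, y, rcond=None)` or `np.linalg.solve(X.T @ X, X.T @ y)` over forming the explicit inverse, for numerical stability.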
Multicollinearity (perfectly correlated features) #
If two (or more) input features are perfectly correlated, the design matrix becomes rank-deficient: $X^T X$ can become singular.
Practical result: the OLS solution may not be unique (many parameter vectors fit equally well).
What you do in practice:
- remove/reduce redundant features
- or use regularisation (e.g. ridge)
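The singularity is easy to demonstrate: duplicating a feature (up to a constant factor) drops the rank of $X^T X$, and a ridge penalty restores invertibility. A sketch with invented data and an arbitrary $\lambda$:

```python
# Sketch: a perfectly correlated feature makes X^T X rank-deficient;
# ridge regularisation makes the matrix invertible again.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Third column is exactly 2 * second column -> perfectly correlated features
X = np.column_stack([np.ones_like(x), x, 2 * x])   # shape (4, 3)

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))   # 2, not 3 -> singular, OLS not unique

# Ridge: add lambda * I to X^T X before inverting (lambda chosen arbitrarily here)
lam = 0.1
theta_ridge = np.linalg.inv(XtX + lam * np.eye(3)) @ X.T @ y
print(theta_ridge)                  # now a unique, well-defined solution
```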
OLS vs Gradient Descent (how to recognise which one to use) #
Your lecturer explained:
- gradient descent is iterative and needs a learning rate $\alpha$,
- closed-form OLS is non-iterative and does not need $\alpha$,
- for large datasets / many features, closed-form becomes heavy because of matrix inversion, so gradient descent is preferred in practice.
Use OLS (direct / closed form) when the question includes: #
- “find the optimal / best-fit $\theta_0,\theta_1$”
- “using OLS / normal equation / matrix method”
- “compute $\theta=(X^T X)^{-1}X^T y$”
- no learning rate $\alpha$ and no “iterations” wording
Use gradient descent (iterative) when the question includes: #
- initial values (e.g. $\theta_0=0.1,\theta_1=0.5$)
- learning rate $\alpha$ (or $\eta$)
- “show only the first iteration / two iterations / update rule”
- “batch vs stochastic vs mini-batch”
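For contrast with the closed-form solution, here is one batch gradient-descent update on the same study-time data. The starting values and $\alpha$ are hypothetical, chosen only to illustrate the update rule:

```python
# Sketch: one batch gradient-descent update for the MSE cost
# J = (1/2n) * sum((theta0 + theta1*x_i - y_i)^2). Alpha and the
# initial thetas are hypothetical exam-style values.
xs = [10.0, 20.0, 30.0, 40.0]
ys = [60.0, 65.0, 75.0, 85.0]
theta0, theta1 = 0.0, 0.0
alpha = 0.001
n = len(xs)

# Prediction errors (y_hat - y) at the current parameters
errs = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]

# Partial derivatives of J w.r.t. theta0 and theta1
grad0 = sum(errs) / n
grad1 = sum(e * x for e, x in zip(errs, xs)) / n

# Simultaneous update
theta0 -= alpha * grad0
theta1 -= alpha * grad1
print(theta0, theta1)
```

Unlike the normal equation, this only moves one small step toward the optimum; many iterations would be needed to approach $\theta_0=50$, $\theta_1=0.85$.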
What to watch out for in numerical problems (quick cues) #
- If you see $\alpha$ and “one iteration” → it is definitely gradient descent.
- If you see $(X^T X)^{-1}$ or “normal equation” → it is definitely OLS.
- If it is a small table and they ask “optimal line” → use OLS unless told “use GD”.
- If features are large-scale (e.g. $x$ in thousands), GD updates may jump a lot → feature scaling helps.
Summary #
Ordinary Least Squares is:
- Fast and standard for many regression problems
- Sensitive to outliers (because squaring makes large errors dominate)
- A foundation for extensions like regularisation (ridge, LASSO)