Prediction & Forecasting #

Prediction and forecasting use statistical models to estimate unknown or future values.

In this module, the focus is on correlation, regression, and time series forecasting.

Key takeaway:
Prediction estimates a value using a model.

Forecasting is prediction where the order of time matters.

  • Correlation
  • Regression
  • Time series analysis
  • Components of time series data
  • Moving average and weighted moving average
  • AR model
  • ARMA model
  • ARIMA model
  • SARIMA and SARIMAX
  • VAR and VARMAX
  • Simple exponential smoothing

Prediction vs Forecasting ☆ #

| Concept | Meaning | Example |
| --- | --- | --- |
| Prediction | Estimate an unknown output | Predict house price from area and rooms |
| Forecasting | Predict future values using time order | Forecast sales for next month |

All forecasting is prediction, but not all prediction is forecasting.

Overall Workflow #

flowchart LR
    A[Data] --> B[Explore Pattern]
    B --> C[Choose Model]
    C --> D[Train or Fit]
    D --> E[Validate]
    E --> F[Predict or Forecast]
    F --> G[Interpret Error]

    style A fill:#E1F5FE
    style B fill:#C8E6C9
    style C fill:#FFF9C4
    style D fill:#EDE7F6
    style E fill:#C8E6C9
    style F fill:#E1F5FE
    style G fill:#FFF9C4

Correlation ☆ #

Correlation measures the direction and strength of the linear relationship between two variables.

The Pearson correlation coefficient is:

\[ r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} \]

Range:

\[ -1 \leq r \leq 1 \]

| Value of correlation | Interpretation |
| --- | --- |
| Close to 1 | Strong positive linear relationship |
| Close to -1 | Strong negative linear relationship |
| Close to 0 | Weak or no linear relationship |

Correlation does not prove causation.

Two variables may move together because of a third hidden factor.
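
As a small illustration, Pearson's \( r \) can be computed directly from the formula above. The data here is hypothetical, and NumPy's built-in `np.corrcoef` is used as a cross-check:

```python
import numpy as np

# Hypothetical example data: advertising spend (x) vs sales (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson r, term by term from the formula above
num = np.sum((x - x.mean()) * (y - y.mean()))
den = np.sqrt(np.sum((x - x.mean()) ** 2)) * np.sqrt(np.sum((y - y.mean()) ** 2))
r = num / den

# NumPy's built-in version agrees
assert abs(r - np.corrcoef(x, y)[0, 1]) < 1e-12
```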


Regression ☆ #

Regression models the relationship between an input variable and an output variable.

Simple linear regression has the form:

\[ \hat{y} = b_0 + b_1x \]

Where:

  • \( \hat{y} \) is the predicted value
  • \( b_0 \) is the intercept
  • \( b_1 \) is the slope
  • \( x \) is the input variable

Regression Residual #

A residual is the difference between the actual value and the predicted value.

\[ e_i = y_i - \hat{y}_i \]

Least Squares Method ☆ #

Linear regression commonly estimates parameters by minimising the sum of squared errors.

\[ SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]

The best-fit line is the line that minimises this quantity.

Slope and Intercept #

\[ b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \]

\[ b_0 = \bar{y} - b_1\bar{x} \]
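
These closed-form formulas can be applied directly; a minimal sketch with hypothetical data (NumPy only), cross-checked against `np.polyfit`:

```python
import numpy as np

# Hypothetical data: x = house area, y = price
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.1, 6.9, 9.2, 11.0])

# Slope and intercept from the formulas above
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x       # fitted values
residuals = y - y_hat     # e_i = y_i - y_hat_i

# np.polyfit computes the same least-squares line
slope, intercept = np.polyfit(x, y, 1)
assert abs(b1 - slope) < 1e-9 and abs(b0 - intercept) < 1e-9
```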

Model Evaluation for Regression ☆ #

Mean Absolute Error #

\[ MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i| \]

Mean Squared Error #

\[ MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \]

Root Mean Squared Error #

\[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \]

Coefficient of Determination #

\[ R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} \]

A higher \( R^2 \) means the regression model explains more of the variation in the output variable.
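
All four metrics follow directly from the residuals; a sketch on hypothetical actual and predicted values:

```python
import numpy as np

# Hypothetical actual (y) and predicted (y_hat) values
y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 6.5, 9.5])

mae = np.mean(np.abs(y - y_hat))   # average absolute error
mse = np.mean((y - y_hat) ** 2)    # average squared error
rmse = np.sqrt(mse)                # squared error back in original units
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```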


Time Series Analysis ☆ #

A time series is a sequence of observations collected over time.

Examples:

  • daily stock price
  • monthly sales
  • hourly temperature
  • weekly website traffic

The order of observations matters.

Do not randomly shuffle time series data before forecasting.

Shuffling destroys the temporal pattern.


Components of Time Series Data ☆ #

| Component | Meaning | Example |
| --- | --- | --- |
| Trend | Long-term increase or decrease | Sales rising over years |
| Seasonality | Repeating pattern at fixed intervals | Higher sales every December |
| Cyclic pattern | Long-term waves without a fixed period | Business cycles |
| Irregular noise | Random variation | Unexpected shocks |

Moving Average ☆ #

Moving average smooths short-term fluctuations.

For a window of size \( m \):

\[ MA_t = \frac{Y_t + Y_{t-1} + \cdots + Y_{t-m+1}}{m} \]

Weighted Moving Average #

Weighted moving average gives different importance to different time points.

\[ WMA_t = \sum_{i=0}^{m-1} w_iY_{t-i} \]

where:

\[ \sum_{i=0}^{m-1} w_i = 1 \]
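
A quick sketch of both averages (hypothetical series; the weights here are an arbitrary choice that sums to 1):

```python
import numpy as np

# Hypothetical weekly observations, oldest first
y = np.array([10.0, 12.0, 13.0, 12.0, 15.0, 16.0])
m = 3

# Simple moving average: equal weights over the last m points
ma = np.convolve(y, np.ones(m) / m, mode="valid")

# Weighted moving average at the last time point:
# w[0] weights the newest value, w[m-1] the oldest, and the weights sum to 1
w = np.array([0.5, 0.3, 0.2])
wma_t = np.sum(w * y[-1:-m-1:-1])
```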

Simple Exponential Smoothing ☆ #

Simple exponential smoothing updates the forecast using the latest observation and previous forecast.

\[ F_{t+1} = \alpha Y_t + (1 - \alpha)F_t \]

Where:

  • \( F_{t+1} \) is the next forecast
  • \( Y_t \) is the current observation
  • \( F_t \) is the current forecast
  • \( \alpha \) is the smoothing constant, with \( 0 < \alpha < 1 \)
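
The update rule is a one-line recursion; a minimal pure-Python sketch (initialising the forecast at the first observation is a common convention, not the only one):

```python
def ses_forecast(series, alpha, f0=None):
    """Return the next-step forecast F_{t+1} = alpha*Y_t + (1 - alpha)*F_t."""
    f = series[0] if f0 is None else f0  # common choice: start at the first value
    for y in series:
        f = alpha * y + (1 - alpha) * f
    return f

# Hypothetical monthly sales; small alpha smooths more, large alpha reacts faster
sales = [100, 110, 105, 115, 120]
next_month = ses_forecast(sales, alpha=0.3)
```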

Stationarity ☆ #

A stationary time series has statistical properties that do not change systematically over time.

Commonly, stationarity means approximately stable:

  • mean
  • variance
  • autocorrelation pattern

AR, MA and ARMA models assume the series is stationary.


Differencing ☆ #

Differencing is used to reduce trend and make a time series more stationary.

\[ Z_t = Y_t - Y_{t-1} \]

If the first difference is not enough, higher-order differencing may be used.
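
In NumPy, `np.diff` implements exactly this; a trend-removal sketch on a hypothetical series:

```python
import numpy as np

# Hypothetical series with a steady upward trend
y = np.array([10.0, 13.0, 16.0, 19.0, 22.0])

z = np.diff(y)        # first difference: Z_t = Y_t - Y_{t-1}, removes the trend
z2 = np.diff(y, n=2)  # second-order difference, if one pass is not enough
```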


Autocorrelation and Partial Autocorrelation ☆ #

Autocorrelation measures how a time series relates to its own past values.

\[ \rho_k = \text{Corr}(Y_t, Y_{t-k}) \]

| Tool | Helps Identify |
| --- | --- |
| ACF | Moving average order |
| PACF | Autoregressive order |
| AIC and BIC | Model selection among candidates |
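
The lag-\( k \) sample autocorrelation can be sketched directly from the definition (hypothetical data; in practice ACF/PACF plots from a library such as statsmodels are the usual tool):

```python
import numpy as np

def acf(y, k):
    """Sample autocorrelation at lag k >= 1, normalised by the overall variance."""
    y = np.asarray(y, dtype=float)
    yc = y - y.mean()
    return np.sum(yc[k:] * yc[:-k]) / np.sum(yc ** 2)

# A steadily rising series is strongly autocorrelated at short lags
y = [1, 2, 3, 4, 5, 6, 7, 8]
rho1 = acf(y, 1)
```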

AR Model ☆ #

An autoregressive model predicts the current value using previous values of the same series.

AR(1):

\[ Y_t = \phi_0 + \phi_1Y_{t-1} + \epsilon_t \]

AR(p):

\[ Y_t = \phi_0 + \phi_1Y_{t-1} + \phi_2Y_{t-2} + \cdots + \phi_pY_{t-p} + \epsilon_t \]
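
Estimating an AR(1) amounts to regressing \( Y_t \) on \( Y_{t-1} \); a simulation sketch (synthetic data, least-squares fit via NumPy rather than a dedicated time-series library):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) process: Y_t = 2 + 0.6*Y_{t-1} + noise
n, phi0, phi1 = 500, 2.0, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi0 + phi1 * y[t - 1] + rng.normal(scale=0.5)

# Least-squares fit of Y_t on (1, Y_{t-1}) recovers phi0 and phi1 approximately
X = np.column_stack([np.ones(n - 1), y[:-1]])
phi_hat, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
```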

MA Model ☆ #

A moving average model predicts the current value using current and past error terms.

MA(1):

\[ Y_t = \mu + \epsilon_t + \theta_1\epsilon_{t-1} \]

MA(q):

\[ Y_t = \mu + \epsilon_t + \theta_1\epsilon_{t-1} + \cdots + \theta_q\epsilon_{t-q} \]

ARMA Model ☆ #

ARMA combines autoregressive and moving average components.

ARMA(1,1):

\[ Y_t = \phi_0 + \phi_1Y_{t-1} + \theta_1\epsilon_{t-1} + \epsilon_t \]

ARMA is normally used for stationary series.


ARIMA Model ☆ #

ARIMA stands for Autoregressive Integrated Moving Average.

It combines:

  • AR: past values
  • I: differencing to make the series stationary
  • MA: past forecast errors

ARIMA is written as:

\[ ARIMA(p,d,q) \]

Where:

  • \( p \) is the AR order
  • \( d \) is the differencing order
  • \( q \) is the MA order

SARIMA and SARIMAX ☆ #

SARIMA extends ARIMA by modelling seasonality.

\[ SARIMA(p,d,q)(P,D,Q)_s \]

SARIMAX further includes external variables.

Example external variables:

  • holiday indicator
  • marketing spend
  • temperature
  • interest rate

VAR and VARMAX ☆ #

VAR is used for multivariate time series.

Each variable can depend on its own past values and the past values of other variables.

VARMAX extends VAR by adding moving-average terms and exogenous variables.

| Model | Use Case |
| --- | --- |
| AR | One series, depends on its own past |
| MA | One series, depends on past errors |
| ARMA | Stationary series with AR and MA behaviour |
| ARIMA | Non-stationary series that needs differencing |
| SARIMA | Seasonal univariate forecasting |
| SARIMAX | Seasonal forecasting with external variables |
| VAR | Multiple interacting time series |
| VARMAX | Multiple series with moving-average terms and external variables |

Exam-Oriented Model Selection ☆ #

flowchart TD
    A[Time Series Data] --> B{Stationary?}
    B -- Yes --> C{AR or MA pattern?}
    C -- AR only --> D[AR]
    C -- MA only --> E[MA]
    C -- Both --> F[ARMA]
    B -- No --> G[Differencing]
    G --> H[ARIMA]
    H --> I{Seasonality?}
    I -- No --> K[ARIMA]
    I -- Yes --> L{External variables?}
    L -- Yes --> M[SARIMAX]
    L -- No --> J[SARIMA]
    A --> N{Multiple variables?}
    N -- Yes --> O[VAR or VARMAX]

    style A fill:#E1F5FE
    style B fill:#FFF9C4
    style C fill:#FFF9C4
    style D fill:#C8E6C9
    style E fill:#C8E6C9
    style F fill:#EDE7F6
    style G fill:#E1F5FE
    style H fill:#C8E6C9
    style I fill:#FFF9C4
    style J fill:#EDE7F6
    style K fill:#C8E6C9
    style L fill:#FFF9C4
    style M fill:#E1F5FE
    style N fill:#FFF9C4
    style O fill:#C8E6C9

Why It Matters in AI and ML #

Prediction and forecasting are core ML tasks.

They are used in:

  • demand forecasting
  • stock and finance modelling
  • energy usage prediction
  • anomaly detection
  • recommendation systems
  • risk modelling
  • sensor and IoT monitoring

Regression predicts from relationships between variables.

Time series forecasting predicts from temporal patterns.


Revision Checklist ☆ #

  • Can I distinguish prediction from forecasting?
  • Can I explain correlation without claiming causation?
  • Can I write the simple linear regression equation?
  • Can I calculate residuals and interpret error metrics?
  • Can I identify trend, seasonality and noise?
  • Can I explain AR, MA, ARMA and ARIMA?
  • Can I choose SARIMA when seasonality is present?
  • Can I choose VAR when multiple time series interact?
