Hypothesis Testing

AI, ML, Stats, Hypothesis Testing, ANOVA, Maximum Likelihood

Hypothesis Testing #

Hypothesis testing is a statistical method for deciding whether sample evidence is strong enough to reject an initial assumption about a population, called the null hypothesis.

It connects probability, sampling distributions, confidence intervals, significance levels, and decision rules.

Key takeaway:
Hypothesis testing is not about proving something with certainty.
It is about asking:
If the null hypothesis were true, how surprising would this sample result be?
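
The question above can be made concrete with a p-value. The following is a minimal sketch, assuming a hypothetical coin-flip experiment (60 heads in 100 flips) and a null hypothesis that the coin is fair; the function name and the 0.05 significance level are illustrative choices, not part of the original notes.

```python
from math import comb


def binomial_two_sided_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value for a binomial test.

    Sums the probability, under the null hypothesis, of every outcome that is
    no more likely than the observed one.
    """
    def pmf(k: int) -> float:
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Small relative tolerance so outcomes tied with the observed one are included.
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical sample: 60 heads in 100 flips; null hypothesis: P(heads) = 0.5.
p_value = binomial_two_sided_p_value(60, 100)
alpha = 0.05  # assumed significance level
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null at alpha={alpha}: {reject_null}")
```

A small p-value means the sample would be surprising if the null hypothesis were true. It does not prove the null hypothesis false, and a large p-value does not prove it true.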

Prediction & Forecasting

AI, ML, Stats, Prediction, Forecasting, Regression, Time Series

Prediction & Forecasting #

Prediction and forecasting use statistical models to estimate unknown or future values.

In this module, the focus is on correlation, regression, and time series forecasting.

Key takeaway:
Prediction estimates a value using a model.
Forecasting is prediction where the order of time matters.

Topics covered:
Correlation
Regression
Time series analysis
Components of time series data
Moving average and weighted moving average
AR model
ARMA model
ARIMA model
SARIMA and SARIMAX
VAR and VARMAX
Simple exponential smoothing

Prediction vs Forecasting ☆ #

Concept	Meaning	Example
Prediction	Estimate an unknown output	Predict house price from area and rooms
Forecasting	Predict future values using time order	Forecast sales for next month

All forecasting is prediction, but not all prediction is forecasting.
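
The distinction can be sketched in code. This is an illustrative example with made-up numbers: `fit_line` and `moving_average_forecast` are hypothetical helper names, and the house-price model uses area only (not rooms) to keep the sketch short.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(series[-window:]) / window


# Prediction: row order is irrelevant (hypothetical data, price = 3 * area).
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept

# Same rows in a different order give the same fitted model.
shuffled_slope, shuffled_intercept = fit_line(areas[::-1], prices[::-1])

# Forecasting: the order of observations changes the answer.
monthly_sales = [10, 12, 14, 16]
forecast = moving_average_forecast(monthly_sales, window=2)
reversed_forecast = moving_average_forecast(monthly_sales[::-1], window=2)

print(predicted_price, forecast, reversed_forecast)
```

Reordering the house rows leaves the regression unchanged, while reversing the sales series changes the forecast, which is what "the order of time matters" means in practice.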

Overall Workflow #

flowchart LR
    A[Data] --> B[Explore Pattern]
    B --> C[Choose Model]
    C --> D[Train or Fit]
    D --> E[Validate]
    E --> F[Predict or Forecast]
    F --> G[Interpret Error]

    style A fill:#E1F5FE
    style B fill:#C8E6C9
    style C fill:#FFF9C4
    style D fill:#EDE7F6
    style E fill:#C8E6C9
    style F fill:#E1F5FE
    style G fill:#FFF9C4
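
The validate and interpret-error steps of the workflow might look like the sketch below for time-ordered data. It assumes a hypothetical series and two deliberately simple candidate models (last value and historical mean); the function names are illustrative. Validation is walk-forward: each test point is forecast using only the observations before it.

```python
def naive_forecast(history):
    """Candidate model 1: repeat the most recent observation."""
    return history[-1]


def mean_forecast(history):
    """Candidate model 2: use the mean of all past observations."""
    return sum(history) / len(history)


def walk_forward_mae(series, n_test, forecaster):
    """Mean absolute error over the last n_test points.

    Each point is forecast from earlier observations only, so no future
    information leaks into the model.
    """
    errors = []
    for t in range(len(series) - n_test, len(series)):
        prediction = forecaster(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


# Hypothetical time-ordered data.
series = [100, 102, 101, 105, 107, 110, 108, 112]

mae_naive = walk_forward_mae(series, n_test=3, forecaster=naive_forecast)
mae_mean = walk_forward_mae(series, n_test=3, forecaster=mean_forecast)

# Choose the model with the lower validation error, then forecast the next value.
best = naive_forecast if mae_naive <= mae_mean else mean_forecast
next_value = best(series)

print(f"MAE naive={mae_naive:.2f}, MAE mean={mae_mean:.2f}, next forecast={next_value}")
```

On this upward-trending series the historical mean lags behind, so the naive model has the lower error; the error value itself is in the same units as the data.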

Correlation ☆ #

Correlation measures the direction and strength of the linear relationship between two variables.
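
A minimal sketch of the Pearson correlation coefficient, written out by hand so each term is visible. The data and the function name `pearson_r` are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # y rises with x
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # y falls as x rises
r_quadratic = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x**2

print(r_positive, r_negative, r_quadratic)
```

The sign gives the direction and the magnitude gives the strength. The third case shows the "linear" caveat: y depends entirely on x, yet the coefficient is zero because the relationship is not linear.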

Gaussian Mixture Model & Expectation Maximization

AI, ML, Stats, GMM, Expectation Maximization, Clustering

Gaussian Mixture Model & Expectation Maximization #

A Gaussian Mixture Model represents data as a weighted combination of multiple Gaussian distributions.
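
In symbols, this weighted combination is usually written as follows. The notation is an assumption, not taken from the original notes: $K$ is the number of components, $\pi_k$ are the mixing coefficients, and $\mu_k$, $\Sigma_k$ are the mean and covariance of component $k$.

$$
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1
$$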

It is commonly used for soft clustering and density estimation.

Key takeaway:
K-means gives hard cluster membership.
GMM gives probabilities of belonging to each cluster.
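
The difference can be illustrated with a one-dimensional, two-component mixture. This is a sketch with made-up parameters (they are not fitted to any data), and the function names are hypothetical. The probabilities computed here are the responsibilities used in the E-step.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return exp(-0.5 * z * z) / (std * sqrt(2 * pi))


def responsibilities(x, weights, means, stds):
    """Probability that x belongs to each component (soft membership)."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(weighted)
    return [v / total for v in weighted]


# Assumed mixture parameters, chosen only for illustration.
weights = [0.5, 0.5]
means = [0.0, 4.0]
stds = [1.0, 1.0]

soft = responsibilities(1.5, weights, means, stds)   # GMM-style soft membership
hard = soft.index(max(soft))                         # K-means-style hard label

midpoint = responsibilities(2.0, weights, means, stds)

print(f"soft membership at x=1.5: {soft}, hard label: {hard}")
print(f"soft membership at x=2.0: {midpoint}")
```

A hard assignment reports only component 0 for x = 1.5 and discards how uncertain that choice is. The soft assignment keeps that information, and at the midpoint between the two means it splits membership evenly.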

Topics covered:
Gaussian Mixture Model
soft clustering
mixing coefficients
latent variables
likelihood and log-likelihood
Expectation-Maximization algorithm
E-step and M-step
responsibilities
convergence

Motivation ☆ #

Many real datasets are not described well by one Gaussian distribution.