Gaussian Mixture Model & Expectation Maximization

A Gaussian Mixture Model represents data as a weighted combination of multiple Gaussian distributions.

It is commonly used for soft clustering and density estimation.
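Written out, the weighted combination takes the form below. The symbols are notation assumed here rather than taken from the text: K is the number of components, \pi_k is the mixing weight of component k, and \mu_k and \Sigma_k are its mean and covariance.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```

Because the weights are non-negative and sum to one, p(x) is itself a valid probability density, which is what makes the model usable for density estimation.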

Key takeaway:
K-means gives hard cluster membership.
GMM gives probabilities of belonging to each cluster.
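The difference between hard and soft membership can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a fitted model: the data is one-dimensional, the two components use hand-picked weights, means, and standard deviations, and the function names (`hard_assignment`, `soft_assignment`) are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def hard_assignment(x, means):
    """K-means style: index of the nearest cluster centre."""
    return min(range(len(means)), key=lambda k: abs(x - means[k]))

def soft_assignment(x, weights, means, stds):
    """GMM style: posterior probability of each component given x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical, hand-picked parameters (not fitted to any data).
weights = [0.5, 0.5]
means = [0.0, 4.0]
stds = [1.0, 1.0]

for x in (0.0, 1.9, 2.0, 4.0):
    probs = soft_assignment(x, weights, means, stds)
    print(x, hard_assignment(x, means), [round(p, 3) for p in probs])
```

At x = 2.0, which is equidistant from both means, the hard rule still has to pick a single cluster, while the soft rule reports a probability of 0.5 for each component.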

Many real datasets are not described well by one Gaussian distribution.
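One way to see this is to compare how well a single Gaussian and a two-component mixture explain data with two separate peaks. The sketch below uses synthetic data and, for simplicity, evaluates the mixture with the same parameters that generated the data rather than fitting it. Those parameters, the sample size, and the random seed are all assumptions made for illustration.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

random.seed(0)

# Synthetic bimodal data: half the points near -3, half near +3 (assumed setup).
data = [random.gauss(-3.0, 1.0) for _ in range(500)] + \
       [random.gauss(3.0, 1.0) for _ in range(500)]

# Maximum-likelihood fit of a single Gaussian: sample mean and standard deviation.
n = len(data)
mean = sum(data) / n
std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

def avg_log_likelihood(density):
    """Average log-density of the data under the given density function."""
    return sum(math.log(density(x)) for x in data) / n

single_ll = avg_log_likelihood(lambda x: gaussian_pdf(x, mean, std))

# Two-component mixture using the generating parameters, not a fitted model.
mixture_ll = avg_log_likelihood(
    lambda x: 0.5 * gaussian_pdf(x, -3.0, 1.0) + 0.5 * gaussian_pdf(x, 3.0, 1.0)
)

print(f"single Gaussian:       {single_ll:.3f}")
print(f"two-component mixture: {mixture_ll:.3f}")
```

The single Gaussian has to centre itself between the two peaks, where there is little data, so its average log-likelihood comes out lower than that of the mixture.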

Unsupervised Learning is used when we have input data but no target labels.

The model is not told the correct answer. Instead, it tries to discover hidden structure in the data.

Aspect                       Supervised Learning                 Unsupervised Learning
Data contains target label?  Yes                                 No
Learns from                  Input-output pairs                  Input features only
Main goal                    Predict output                      Discover structure
Example task                 Classification, regression          Clustering
Example algorithm            Logistic regression, decision tree  K-means, GMM

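Key characteristics of unsupervised learning:
It works on unlabelled raw data.
The algorithm discovers hidden patterns without prior knowledge of outcomes.
It requires no human-provided labels during training, although a person still chooses the algorithm and its settings (for example, the number of clusters).
It does not predict a target value; it groups or organises the data instead.
Its results are harder to validate because there is no ground truth to check them against.
Common techniques include clustering, association rule mining, and dimensionality reduction.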

The most common example is clustering, where similar records are grouped together.
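As a concrete illustration of grouping similar records, here is a minimal sketch of K-means on one-dimensional data. The function name `kmeans_1d`, the sample points, and the starting centres are hypothetical choices for this example. A real application would normally rely on a library implementation.

```python
def kmeans_1d(points, centres, n_iter=10):
    """Plain k-means on 1-D data, starting from the given centres."""
    centres = list(centres)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(len(centres)), key=lambda k: abs(p - centres[k]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's centre where it is
                centres[k] = sum(members) / len(members)
    return labels, centres

# Hypothetical records: three values near 1 and three values near 5.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centres = kmeans_1d(points, centres=[0.0, 6.0])
print(labels)
print(centres)
```

No labels are supplied at any point. The two groups emerge only from the distances between the values.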