Gaussian Mixture Model & Expectation Maximization
A Gaussian Mixture Model represents data as a weighted combination of multiple Gaussian distributions.
It is commonly used for soft clustering and density estimation.
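The "weighted combination" can be written out as a density. The notation below is the conventional one and is an assumption here, since these notes do not fix symbols: $K$ is the number of components, $\pi_k$ are the mixing coefficients, and $\mu_k$, $\Sigma_k$ are the mean and covariance of component $k$.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0,
\qquad \sum_{k=1}^{K} \pi_k = 1
```

The constraints on $\pi_k$ ensure that $p(x)$ is itself a valid probability density.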
Key takeaway:
K-means gives hard cluster membership.
GMM gives probabilities of belonging to each cluster.
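To make the contrast concrete, here is a minimal one-dimensional sketch. The function names (`hard_assignment`, `soft_assignment`) and the two-cluster parameters are hypothetical, chosen only for illustration; the parameters are taken as given rather than learned.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def hard_assignment(x, means):
    """K-means style: index of the nearest cluster centre."""
    return min(range(len(means)), key=lambda k: abs(x - means[k]))

def soft_assignment(x, weights, means, stds):
    """GMM style: posterior probability of each component given x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-cluster setup (values chosen only for illustration).
weights = [0.5, 0.5]
means = [0.0, 4.0]
stds = [1.0, 1.0]

for x in (0.0, 1.9, 4.0):
    probs = soft_assignment(x, weights, means, stds)
    print(x, hard_assignment(x, means), [round(p, 3) for p in probs])
```

For the point `x = 1.9`, which lies close to the midpoint between the two centres, the hard assignment simply reports cluster 0, while the soft assignment reports roughly 0.6 versus 0.4, which preserves the information that the point is ambiguous.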
- Gaussian Mixture Model
- soft clustering
- mixing coefficients
- latent variables
- likelihood and log-likelihood
- Expectation-Maximization algorithm
- E-step and M-step
- responsibilities
- convergence
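Several of the terms above (E-step, M-step, responsibilities, log-likelihood, convergence) fit together in one loop. The following is a minimal sketch of EM for a one-dimensional GMM, not a definitive implementation: the function name `em_gmm_1d`, the toy data, the initial parameters, the tolerance, and the variance floor are all assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, max_iter=200, tol=1e-8):
    """Fit a 1-D GMM with EM, starting from the given initial parameters."""
    n, k = len(data), len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # E-step: responsibilities resp[i][j] = P(component j | x_i).
        resp, ll = [], 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)  # assumed > 0; extreme outliers could underflow
            ll += math.log(total)
            resp.append([p / total for p in joint])
        # Convergence: stop when the log-likelihood no longer improves.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
        # M-step: re-estimate each component from responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)  # effective number of points
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor guards against collapse
    # ll is the log-likelihood computed in the last E-step that ran.
    return weights, means, variances, ll

# Toy data with two visible groups (illustrative values only).
data = [-0.2, 0.1, 0.0, 0.3, -0.1, 3.9, 4.2, 4.0, 4.1, 3.8]
weights, means, variances, log_lik = em_gmm_1d(
    data, means=[1.0, 3.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(weights, means, variances, log_lik)
```

EM only finds a local optimum, so the result depends on the initial parameters; in practice several restarts or a K-means initialisation are common choices.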
Motivation ☆
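Many real datasets are not described well by a single Gaussian distribution: a single Gaussian has only one peak, so it cannot capture data that forms several distinct groups.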
Unsupervised Learning
Unsupervised Learning is used when we have input data but no target labels.
The model is not told the correct answer. Instead, it tries to discover hidden structure in the data.
- K-means Clustering and variants
- Review of EM algorithm
- GMM based Soft Clustering
- Applications
Supervised vs Unsupervised Learning
| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data contains target label? | Yes | No |
| Learns from | Input-output pairs | Input features only |
| Main goal | Predict output | Discover structure |
| Example task | Classification, regression | Clustering |
| Example algorithm | Logistic regression, decision tree | K-means, GMM |
- Works on unlabelled raw data.
- The algorithm discovers hidden patterns without prior knowledge of outcomes.
- Requires no human-labelled examples during training.
- Does not make direct predictions; it groups or organises data instead.
- Carries a higher risk of unreliable results because there is no ground truth to verify them against.
- Common techniques include Clustering, Association, and Dimensionality Reduction.
The most common example is clustering, where similar records are grouped together.
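As a rough illustration of grouping similar records without labels, here is a minimal one-dimensional K-means sketch (Lloyd's algorithm). The function name `kmeans_1d`, the toy data, and the starting centres are assumptions made for this example.

```python
def kmeans_1d(data, centres, max_iter=100):
    """Cluster 1-D points by alternating assignment and centre updates."""
    centres = list(centres)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(len(centres)), key=lambda k: (x - centres[k]) ** 2)
                  for x in data]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for k in range(len(centres)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            # An empty cluster keeps its previous centre.
            new_centres.append(sum(members) / len(members) if members else centres[k])
        if new_centres == centres:  # no centre moved, so assignments are stable
            break
        centres = new_centres
    return centres, labels

# Unlabelled toy data with two visible groups (illustrative values only).
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centres, labels = kmeans_1d(data, centres=[0.0, 6.0])
print(centres, labels)
```

No target labels are supplied anywhere: the grouping emerges only from distances between the input values, which is what distinguishes this from a supervised classifier.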