Unsupervised Learning
#
- Works on unlabelled raw data.
- The algorithm discovers hidden patterns without prior knowledge of outcomes.
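- Requires no labelled examples during training, although a person still chooses the algorithm, the features, and settings such as the number of clusters.
- Does not predict a known target; it groups or organises the data instead.
- Is harder to validate because there is no ground truth to check results against, so the risk of acting on misleading output is higher.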
- Common techniques include Clustering, Association, and Dimensionality Reduction.
stateDiagram-v2
%% ML maths-based colours (same palette as supervised)
classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151
%% Root
USL: Unsupervised Learning
%% Main branches
USL --> CLU:::category
CLU: Clustering
USL --> DR:::category
DR: Dimensionality Reduction
%% Clustering algorithms
CLU --> KM:::geometry
KM: K-Means
CLU --> HC:::geometry
HC: Hierarchical Clustering
CLU --> DB:::geometry
DB: DBSCAN
%% Probabilistic models
USL --> PM:::category
PM: Probabilistic Models
PM --> GMM:::probability
GMM: Gaussian Mixture Model
PM --> HMM:::probability
HMM: Hidden Markov Model
Clustering
#
- Groups similar data points together based on shared features.
- Commonly used for market segmentation, image compression, and anomaly detection.
Common Types of Clustering
#
- K-Means Clustering – Divides data into K groups by assigning each point to its nearest cluster centre (centroid).
- Hierarchical Clustering – Builds a hierarchy (tree) of clusters.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise) – Groups points that lie in dense regions and marks points in sparse regions as noise/outliers.
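To make the K-Means bullet concrete, here is a minimal sketch of the usual assign-then-update loop in plain Python. The function names (`kmeans`, `squared_distance`), the toy points, and the choice to pass the starting centroids in explicitly are assumptions made for illustration; library implementations normally pick starting centroids randomly or with a smarter seeding scheme.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iterations=100):
    """Illustrative K-Means loop; returns (centroids, labels)."""
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Toy data (assumed): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, initial_centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the starting centroids, which is why practical implementations often run the loop several times from different starts and keep the best outcome.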
Association
#
- Identifies relationships or correlations between variables in a dataset.
- Commonly used in market basket analysis (e.g. “Customers who bought X also bought Y”).
Common Techniques
#
- Apriori Algorithm – Finds frequent itemsets and generates association rules.
- Eclat Algorithm – Similar to Apriori but stores, for each item, the set of transactions that contain it, and finds frequent itemsets by intersecting those sets, which is often faster.
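As a rough illustration of the frequent-itemset idea behind Apriori, the sketch below grows itemsets level by level and keeps only those that meet a minimum support. The function names (`frequent_itemsets`, `confidence`), the shopping baskets, and the 0.5 threshold are all assumptions for the example; it is a simplified teaching version, not an optimised implementation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search in the spirit of Apriori; returns {itemset: support}."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        kept = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        size += 1
        # Build larger candidates from the itemsets that survived, then prune
        # any candidate that has an infrequent subset one item smaller.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Toy baskets (assumed data).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(baskets, min_support=0.5)
rule_confidence = confidence(frequent, frozenset({"butter"}), frozenset({"bread"}))
print(rule_confidence)  # 1.0: every basket with butter also contains bread
```

Here the pair {milk, butter} appears in only one of four baskets, so it is dropped, and the three-item set is pruned without being counted because one of its pairs is infrequent.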
Dimensionality Reduction
#
- Reduces the number of input variables to simplify data.
- Helps remove noise and redundancy.
- Commonly used in data pre-processing and visualisation.
Common Techniques
#
- Principal Component Analysis (PCA) – Projects data onto fewer dimensions while keeping as much of the variance as possible.
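- Linear Discriminant Analysis (LDA) – Finds directions that best separate known classes. Unlike the other techniques listed here it needs class labels, so it is a supervised method.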
- t-SNE (t-Distributed Stochastic Neighbour Embedding) – Used for visualising high-dimensional data.
- Autoencoders – Neural networks that compress and reconstruct data.
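The PCA bullet can be sketched in a few lines: centre the data, build the covariance matrix, and find the direction of greatest variance. The version below uses power iteration to find that direction, which is a simplification chosen to stay within the standard library; numerical libraries typically use an eigendecomposition or SVD instead. The function name `first_principal_component` and the sample rows are assumptions for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the top principal component.

    Assumes the data is not constant and that the all-ones starting vector
    is not orthogonal to the top component.
    """
    n, d = len(rows), len(rows[0])
    # Centre each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centred row onto the direction: one number per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


# Toy data (assumed): the second column is roughly twice the first.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
direction, scores = first_principal_component(rows)
print(direction)  # a unit vector pointing roughly along (1, 2)
print(scores)     # the 1-D coordinates that replace the original 2-D rows
```

Because the two columns are almost perfectly correlated, the single score per row retains nearly all of the variance in the original two columns.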
mindmap
  root(Unsupervised Learning)
    Clustering
      K Means
      Hierarchical Clustering
      DBSCAN
    Dimensionality Reduction
      PCA
      t SNE
      Autoencoders
    Probabilistic Models
      Gaussian Mixture Model
      Hidden Markov Model