January 3, 2026
Supervised Learning
Trained using labelled data.
Each example in the training set includes the correct output.
The algorithm learns to generalise and make predictions on unseen data.
Generally more accurate than unsupervised methods.
Requires human intervention for labelling and setup.
Widely used due to its accuracy and efficiency.
Produces highly accurate results when trained on good-quality labelled data.
Classification
Output is discrete (e.g. Yes/No, Spam/Not Spam).
Used for categorising data into predefined classes.
Support Vector Machine (SVM) is a common classifier: it separates classes with a maximum-margin boundary (linear in its basic form; kernel functions allow non-linear boundaries).
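The margin idea behind an SVM can be sketched from scratch. This is an illustrative toy (hand-picked 2-D points, sub-gradient descent on the hinge loss), not a production implementation; real use would normally reach for a library such as scikit-learn.

```python
# Minimal linear SVM trained by sub-gradient descent on the hinge loss.
# Toy data: two linearly separable groups of 2-D points, labels +1 / -1.
def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=1000):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:
                # Margin violated: move the boundary away from this point.
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:
                # Margin satisfied: only apply regularisation shrinkage.
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

points = [(1.0, 1.0), (1.5, 1.2), (3.0, 3.0), (3.5, 3.2)]
labels = [-1, -1, 1, 1]
w, b = train_linear_svm(points, labels)
```

The `if margin < 1` test is what makes this margin-based rather than a plain perceptron: points inside the margin still push the boundary even when correctly classified.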
January 3, 2026
Unsupervised Learning
- Works on unlabelled raw data.
- The algorithm discovers hidden patterns without prior knowledge of outcomes.
- Requires no human intervention during training.
- Does not make direct predictions — it groups or organises data instead.
- Carries a higher risk because there’s no ground truth to verify results.
- Common techniques include Clustering, Association, and Dimensionality Reduction.
stateDiagram-v2
%% ML maths-based colours (same palette as supervised)
classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151
%% Root
USL: Unsupervised Learning
%% Main branches
USL --> CLU:::category
CLU: Clustering
USL --> DR:::category
DR: Dimensionality Reduction
%% Clustering algorithms
CLU --> KM:::geometry
KM: K-Means
CLU --> HC:::geometry
HC: Hierarchical Clustering
CLU --> DB:::geometry
DB: DBSCAN
%% Probabilistic models
USL --> PM:::category
PM: Probabilistic Models
PM --> GMM:::probability
GMM: Gaussian Mixture Model
PM --> HMM:::probability
HMM: Hidden Markov Model
Clustering
- Groups similar data points together based on shared features.
- Commonly used for market segmentation, image compression, and anomaly detection.
Common Types of Clustering
- K-Means Clustering – Divides data into K groups based on similarity.
- Hierarchical Clustering – Builds a hierarchy (tree) of clusters.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise) – Groups points that lie in dense regions and flags low-density points as noise/outliers.
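The assign-then-update loop of K-Means can be sketched in a few lines. This is an illustrative 1-D version with a naive initialisation (the data and values are made up for the example):

```python
# Minimal K-Means sketch (pure Python, 1-D data for brevity).
def k_means(data, k, iters=20):
    centroids = data[:k]                      # naive initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:                        # assignment step
            idx = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[idx].append(x)
        for i, c in enumerate(clusters):      # update step: move centroid to mean
            if c:
                centroids[i] = sum(c) / len(c)
    return centroids, clusters

data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.9]
centroids, clusters = k_means(data, k=2)
```

Real implementations use smarter initialisation (e.g. k-means++) and a convergence check instead of a fixed iteration count.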
Association
- Identifies relationships or correlations between variables in a dataset.
- Commonly used in market basket analysis (e.g. “Customers who bought X also bought Y”).
Common Techniques
- Apriori Algorithm – Finds frequent itemsets and generates association rules.
- Eclat Algorithm – Similar to Apriori but uses set intersections for faster computation.
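The core Apriori idea can be shown on a tiny market-basket example (the transactions below are made up for illustration): count frequent single items first, then extend only those into candidate pairs, since no superset of an infrequent itemset can be frequent.

```python
# Apriori sketch: frequent 1-itemsets, then candidate 2-itemsets built
# only from items that survived the first pass (the pruning principle).
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
min_support = 2  # minimum number of transactions an itemset must appear in

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

items = {i for t in transactions for i in t}
frequent_1 = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
candidates = [a | b for a, b in combinations(frequent_1, 2)]
frequent_2 = [c for c in candidates if support(c) >= min_support]
```

From the frequent itemsets, association rules like "bread → milk" are then scored by confidence (support of the pair divided by support of the antecedent).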
Dimensionality Reduction
- Reduces the number of input variables to simplify data.
- Helps remove noise and redundancy.
- Commonly used in data pre-processing and visualisation.
Common Techniques
- Principal Component Analysis (PCA) – Projects data onto fewer dimensions while keeping most variance.
- Linear Discriminant Analysis (LDA) – Focuses on class separation.
- t-SNE (t-Distributed Stochastic Neighbour Embedding) – Used for visualising high-dimensional data.
- Autoencoders – Neural networks that compress and reconstruct data.
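PCA's "keep most of the variance" step can be sketched via an eigen-decomposition of the covariance matrix. Illustrative only (NumPy assumed available; the data values are made up):

```python
# PCA sketch: centre the data, diagonalise the covariance matrix,
# and project onto the eigenvector with the largest eigenvalue.
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])
Xc = X - X.mean(axis=0)                  # centre each feature at zero
cov = np.cov(Xc, rowvar=False)           # 2x2 sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric input, ascending order
pc1 = eigvecs[:, -1]                     # first principal component
projected = Xc @ pc1                     # 1-D projection keeping most variance
```

The variance of the projected data equals the largest eigenvalue, which is exactly the "most variance retained" guarantee.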
mindmap
  root(Unsupervised Learning)
    Clustering
      K Means
      Hierarchical Clustering
      DBSCAN
    Dimensionality Reduction
      PCA
      t SNE
      Autoencoders
    Probabilistic Models
      Gaussian Mixture Model
      Hidden Markov Model
January 3, 2026
Semi-Supervised Learning
- A combination of labelled and unlabelled data.
- Useful when labelling large datasets is expensive or time-consuming.
- Works well with high-volume datasets (e.g. millions of images).
- Only a small fraction of data is labelled (e.g. a few thousand).
- The algorithm learns from both labelled examples and structure in unlabelled data.
- Ideal for medical imaging where labelled data is limited.
- For example, a radiologist can label a small set of medical scans, and the model uses those labels to learn from thousands of unlabelled scans.
- Helps improve accuracy and generalisation with minimal manual effort.
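One common semi-supervised recipe is self-training: fit on the labelled examples, pseudo-label the unlabelled pool, then refit on both. The sketch below uses a nearest-centroid classifier on 1-D toy data; it is an illustrative example, not a technique named in these notes.

```python
# Self-training sketch: fit a nearest-centroid classifier on the few
# labelled points, pseudo-label the unlabelled pool, then refit.
labelled = [(1.0, "a"), (1.2, "a"), (9.0, "b")]
unlabelled = [0.9, 1.1, 8.8, 9.2]

def centroids(pairs):
    sums, counts = {}, {}
    for x, y in pairs:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    return min(cents, key=lambda y: abs(x - cents[y]))

cents = centroids(labelled)                            # step 1: labelled data only
pseudo = [(x, predict(cents, x)) for x in unlabelled]  # step 2: pseudo-label the rest
cents = centroids(labelled + pseudo)                   # step 3: refit on both
```

Real self-training usually keeps only high-confidence pseudo-labels per round; this sketch keeps them all for brevity.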
Neural Networks
- A network of artificial neurons inspired by how neurons function in the human brain.
- At its core, it is a mathematical model designed to process and learn from data.
- Neural networks form the foundation of Deep Learning, which involves training large, complex networks on vast amounts of data.
flowchart LR
subgraph subGraph0["Input Layer"]
I1(("Input 1"))
I2(("Input 2"))
I3(("Input 3"))
end
subgraph subGraph1["Hidden Layer"]
H1(("Hidden 1"))
H2(("Hidden 2"))
H3(("Hidden 3"))
end
subgraph subGraph2["Output Layer"]
O(("Output"))
end
I1 --> H1 & H2 & H3
I2 --> H1 & H2 & H3
I3 --> H1 & H2 & H3
H1 --> O
H2 --> O
H3 --> O
style I1 fill:#C8E6C9
style I2 fill:#C8E6C9
style I3 fill:#C8E6C9
style H1 stroke:#2962FF,fill:#BBDEFB
style H2 fill:#BBDEFB
style H3 fill:#BBDEFB
style O fill:#FFCDD2
style subGraph0 stroke:none,fill:transparent
style subGraph1 stroke:none,fill:transparent
style subGraph2 stroke:none,fill:transparent
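The forward pass through the 3–3–1 network in the diagram above can be written directly: each neuron takes a weighted sum of its inputs plus a bias and applies an activation function. The weight values below are arbitrary illustrative numbers, not trained parameters.

```python
# Forward pass matching the diagram: 3 inputs -> 3 hidden neurons -> 1 output.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    # Each hidden neuron: sigmoid(weighted sum of all inputs + bias).
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    # Output neuron: sigmoid(weighted sum of hidden activations + bias).
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

inputs = [0.5, 0.1, 0.9]
w_hidden = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2], [-0.1, 0.5, 0.6]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [0.4, -0.6, 0.8]
b_out = 0.05
y = forward(inputs, w_hidden, b_hidden, w_out, b_out)
```

Training would then adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out` (typically by backpropagation); only inference is shown here.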
Structure of a Neural Network
A typical neural network has three main layers: an input layer, one or more hidden layers, and an output layer.
August 6, 2024
Machine Learning
stateDiagram-v2
%% ===== CLASS DEFINITIONS (Math-based colours) =====
classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px
%% ===== ROOT =====
ML: Machine Learning
%% ===== SUPERVISED =====
ML --> SL:::category
SL: Supervised Learning
SL --> Regression
Regression --> LR:::algebra
LR: Linear Regression
LR --> NN:::algebra
NN: Neural Network
NN --> DT:::logic
DT: Decision Tree
SL --> Classification
Classification --> NB:::probability
NB: Naive Bayes
NB --> KNN:::geometry
KNN: k-Nearest Neighbours
KNN --> SVM:::algebra
SVM: Support Vector Machine
%% ===== UNSUPERVISED =====
ML --> USL:::category
USL: Unsupervised Learning
USL --> Clustering
Clustering --> KM:::geometry
KM: K-Means
KM --> GMM:::probability
GMM: Gaussian Mixture Model
GMM --> HMM:::probability
HMM: Hidden Markov Model
%% ===== REINFORCEMENT =====
ML --> RL:::category
RL: Reinforcement Learning
RL --> DM:::logic
DM: Decision Making
Mathematical Legend
Algebra / Linear Algebra (Blue)
Used heavily when models rely on vectors, matrices, and weighted sums (e.g. Linear Regression, Neural Networks, SVMs in the diagram above).
Artificial Neuron and Perceptron
Knowledge in neural networks is stored in connection weights, and learning means modifying those weights.
Biological Neuron
A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.
Core components:
- Dendrites: receive signals from other neurons
- Cell body (soma): processes incoming signals
- Axon: transmits the output signal
- Synapses: connection points between neurons
Biological intuition:
- many inputs arrive to one neuron
- one neuron can connect out to many neurons
- massive parallelism enables fast perception and recognition
Artificial Neuron
An artificial neuron is a simplified computational model inspired by biological neurons.
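The simplification is just a weighted sum plus a bias, passed through an activation function. A minimal sketch (with a step activation, so the neuron either "fires" or not; the AND-gate weights are illustrative):

```python
# A single artificial neuron: weighted sum of inputs plus bias,
# passed through a step activation function.
def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z >= 0 else 0   # fires only when the sum clears the threshold

# Example: weights chosen so the neuron behaves like a logical AND gate.
weights, bias = [1.0, 1.0], -1.5
```

Here the weights play the role of synaptic strengths and the bias sets the firing threshold, mirroring the biological analogy above.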
Machine Learning Workflow
Data is the foundation of any machine learning system.
Quality of data matters more than model complexity.
Role of Data
Data determines:
- What patterns the model can learn
- How well it generalises
- Whether bias or noise is introduced
Bad data → bad model (even with perfect algorithms).
Data Preprocessing (Wrangling)
Raw data is never ready for training.
Data Issues
- Noise
- For objects, noise is an extraneous object
- For attributes, noise refers to modification of original values
- Handle: apply a log or z-score transformation (standardising values around the mean) to reduce the effect of noise
- Outliers
- Data objects with characteristics that are considerably different than most of the other data objects in the data set
- Handle: Use IQR method
- Find the lower and upper bounds and replace each outlier with the nearest bound (capping)
- Missing Values
- Eliminate data objects or variables
- Handle: Estimate missing values
- Mean, Median or Mode
- Prefer the median when outliers are present, since the mean is distorted by them
- Ignore the missing value during analysis
- Duplicate Data
- Major issue when merging data from heterogeneous sources
- Inconsistent Codes
- Find all unique codes and map the inconsistent ones to a single consistent value
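The IQR method described above is short enough to sketch directly. The data values are made up; `statistics.quantiles` with `method="inclusive"` supplies the quartiles.

```python
# IQR outlier handling: compute Q1/Q3, derive bounds at 1.5 * IQR,
# and replace out-of-range values with the nearest bound (capping).
import statistics

def iqr_cap(values, k=1.5):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]

data = [10, 12, 11, 13, 12, 95]   # 95 is an obvious outlier
cleaned = iqr_cap(data)
```

Capping keeps the row (unlike deletion), so it pairs well with the missing-value strategies above when data is scarce.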
Data Preprocessing techniques
Linear Regression
Linear Regression is a supervised ML method used to predict a numerical target by fitting a model that is linear in its parameters.
In ML, linear models are a core baseline:
they’re fast, often surprisingly strong, and usually easy to interpret.
Key takeaway:
Linear Regression learns parameters by minimising a squared-error cost.
You can solve it directly (closed form) or iteratively (gradient descent),
and you can extend it using basis functions and regularisation.
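The iterative route can be sketched as plain gradient descent on the squared-error cost for a one-feature model y ≈ w·x + b (the data and learning rate are illustrative):

```python
# Gradient descent on the mean-squared-error cost for y = w*x + b.
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w           # step downhill on the cost surface
        b -= lr * grad_b
    return w, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                  # generated exactly by y = 2x + 1
w, b = gradient_descent(xs, ys)
```

Too large a learning rate makes the updates diverge; too small a rate just converges slowly, which is why the closed-form solution in the next section is attractive when it is feasible.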
February 21, 2026
Direct Solution Method: Ordinary Least Squares and the Line of Best Fit
It is possible to compute the best parameters for linear regression in one shot (closed form),
instead of iteratively improving them step by step.
For linear regression, the direct method is usually Ordinary Least Squares (OLS).
Ordinary Least Squares (OLS) chooses the “best” line by minimising squared prediction errors.
Key takeaway:
OLS defines “best fit” as the line that minimises the total squared residual error across all data points.
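For a single feature, the closed-form OLS solution reduces to two textbook formulas: slope = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)², intercept = ȳ − slope·x̄. A sketch with made-up data:

```python
# Closed-form OLS for one feature: the slope and intercept that
# minimise the total squared residual error.
def ols_fit(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.1, 5.9, 8.2, 9.9]     # roughly y = 2x with noise
slope, intercept = ols_fit(xs, ys)
```

No iteration is needed: the formulas come from setting the derivatives of the squared-error cost to zero, which is exactly the sense in which this is a "direct" method.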
February 21, 2026
Cost Function
Also known as an objective function.
It quantifies the error between a model’s predicted values and the actual values, measured over a group of data points.
It is used to evaluate the accuracy of a model’s predictions.
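The mean squared error used throughout this section makes the definition concrete (the predictions and targets below are made-up example values):

```python
# Mean squared error: the squared prediction errors averaged
# over all data points.
def mse(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

predicted = [2.5, 0.0, 2.0, 8.0]
actual = [3.0, -0.5, 2.0, 7.0]
error = mse(predicted, actual)
```

Lower is better, and zero means every prediction matched exactly; both OLS and gradient descent minimise precisely this quantity.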