ML

Supervised Learning

Supervised Learning #

  • Trained on labelled data: each example in the training set includes the correct output.
  • The algorithm learns to generalise and make predictions on unseen data.
  • Generally more accurate than unsupervised methods, but requires human intervention for labelling and setup.
  • Widely used because, given good-quality labelled data, it produces highly accurate results efficiently.


Classification #

Output is discrete (e.g. Yes/No, Spam/Not Spam).
Used for categorising data into predefined classes.
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).
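As a sketch of the margin idea, a linear SVM can be trained by sub-gradient descent on the hinge loss (Pegasos-style updates). The toy data, learning rate, and regularisation strength below are illustrative assumptions, and the bias term is omitted so the separating hyperplane passes through the origin:

```python
import numpy as np

# Linear SVM sketch: sub-gradient descent on the hinge loss.
# Toy data and hyperparameters are illustrative; no bias term, so the
# separating hyperplane passes through the origin.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])        # labels must be +/-1 for the hinge loss

w = np.zeros(2)
lr, lam = 0.1, 0.01                 # step size and regularisation strength

for _ in range(200):
    for xi, yi in zip(X, y):
        if yi * (w @ xi) < 1:       # point inside the margin: push it out
            w += lr * (yi * xi - lam * w)
        else:                       # point outside the margin: only shrink w
            w -= lr * lam * w

def predict(x):
    return 1 if w @ x >= 0 else -1
```

The `if` branch is exactly the margin condition: only points that violate the margin contribute to the weight update.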

Unsupervised Learning

Unsupervised Learning #

  • Works on unlabelled raw data.
  • The algorithm discovers hidden patterns without prior knowledge of outcomes.
  • Requires no human intervention during training.
  • Does not make direct predictions — it groups or organises data instead.
  • Carries a higher risk because there’s no ground truth to verify results.
  • Common techniques include Clustering, Association, and Dimensionality Reduction.

stateDiagram-v2

  %% ML maths-based colours (same palette as supervised)
  classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
  classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
  classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151

  %% Root
  USL: Unsupervised Learning

  %% Main branches
  USL --> CLU:::category
  CLU: Clustering

  USL --> DR:::category
  DR: Dimensionality Reduction

  %% Clustering algorithms
  CLU --> KM:::geometry
  KM: K-Means

  CLU --> HC:::geometry
  HC: Hierarchical Clustering

  CLU --> DB:::geometry
  DB: DBSCAN

  %% Probabilistic models
  USL --> PM:::category
  PM: Probabilistic Models

  PM --> GMM:::probability
  GMM: Gaussian Mixture Model

  PM --> HMM:::probability
  HMM: Hidden Markov Model

Clustering #

  • Groups similar data points together based on shared features.
  • Commonly used for market segmentation, image compression, and anomaly detection.

Common Types of Clustering #

  • K-Means Clustering – Divides data into K groups based on similarity.
  • Hierarchical Clustering – Builds a hierarchy (tree) of clusters.
  • DBSCAN (Density-Based Spatial Clustering) – Groups points that lie in dense regions; flags points in sparse regions as noise/outliers.
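K-Means, the first of these, can be sketched in a few lines of NumPy. The two-cluster toy data, the seed, and K = 2 are assumptions for illustration:

```python
import numpy as np

# Minimal K-Means sketch (K = 2) on a toy two-cluster dataset.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)),   # cluster near (0, 0)
               rng.normal(5.0, 0.5, (20, 2))])  # cluster near (5, 5)

k = 2
centroids = X[rng.choice(len(X), k, replace=False)]  # random initialisation
for _ in range(10):
    # Assignment step: each point joins its nearest centroid
    dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each centroid moves to the mean of its assigned points
    for j in range(k):
        pts = X[labels == j]
        if len(pts):                # guard against an empty cluster
            centroids[j] = pts.mean(axis=0)
```

The two steps (assign, update) repeat until the centroids stop moving; here a fixed iteration count stands in for a convergence check.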

Association #

  • Identifies relationships or correlations between variables in a dataset.
  • Commonly used in market basket analysis (e.g. “Customers who bought X also bought Y”).

Common Techniques #

  • Apriori Algorithm – Finds frequent itemsets and generates association rules.
  • Eclat Algorithm – Similar to Apriori but uses set intersections for faster computation.
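The frequent-itemset step at the heart of Apriori can be sketched in plain Python; the basket data and support threshold below are invented for illustration:

```python
from itertools import combinations

# Apriori-style frequent-itemset mining on a toy basket dataset.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]
min_support = 2            # "frequent" = appears in at least 2 transactions

def support(itemset):
    return sum(itemset <= t for t in transactions)

# Level 1: frequent single items
items = sorted({i for t in transactions for i in t})
freq_items = [i for i in items if support({i}) >= min_support]

# Level 2: candidate pairs built only from frequent items
# (the Apriori pruning idea: a superset of an infrequent set is infrequent)
freq_pairs = [set(p) for p in combinations(freq_items, 2)
              if support(set(p)) >= min_support]
```

Association rules (e.g. "bread → milk") would then be generated from the frequent itemsets by comparing their supports.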

Dimensionality Reduction #

  • Reduces the number of input variables to simplify data.
  • Helps remove noise and redundancy.
  • Commonly used in data pre-processing and visualisation.

Common Techniques #

  • Principal Component Analysis (PCA) – Projects data onto fewer dimensions while keeping most variance.
  • Linear Discriminant Analysis (LDA) – Focuses on class separation.
  • t-SNE (t-Distributed Stochastic Neighbour Embedding) – Used for visualising high-dimensional data.
  • Autoencoders – Neural networks that compress and reconstruct data.
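PCA in particular reduces to an SVD of the centred data; the stretched toy cloud below is an assumption for illustration:

```python
import numpy as np

# PCA via SVD of the centred data, projecting 2-D points onto the first
# principal component. The stretched toy cloud is illustrative.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])

Xc = X - X.mean(axis=0)                 # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)         # fraction of variance per component
Z = Xc @ Vt[:1].T                       # 1-D projection keeping most variance
```

Because one axis is stretched far more than the other, almost all the variance survives the reduction from 2 dimensions to 1.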

mindmap
  root(Unsupervised Learning)
    Clustering
      K Means
      Hierarchical Clustering
      DBSCAN
    Dimensionality Reduction
      PCA
      t SNE
      Autoencoders
    Probabilistic Models
      Gaussian Mixture Model
      Hidden Markov Model


Semi-Supervised Learning

Semi-Supervised Learning #

  • A combination of labelled and unlabelled data.
  • Useful when labelling large datasets is expensive or time-consuming.
  • Works well with high-volume datasets (e.g. millions of images).
  • Only a small fraction of data is labelled (e.g. a few thousand).
  • The algorithm learns from both labelled examples and structure in unlabelled data.
  • Ideal for medical imaging where labelled data is limited.
  • For example, a radiologist can label a small set of medical scans,
    and the model uses that to learn from thousands of unlabelled scans.
  • Helps improve accuracy and generalisation with minimal manual effort.
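One common recipe is self-training: pseudo-label the unlabelled points the model is most confident about, then treat those labels as real. A minimal sketch, using 1-NN as a stand-in for any base classifier (the points and labels are invented for illustration):

```python
import math

# Self-training sketch: start from a few labelled points, repeatedly
# pseudo-label the unlabelled point closest to the labelled set, and add it
# as if it were a real label. 1-NN stands in for "any base classifier".
labelled = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
unlabelled = [(1.0, 1.0), (9.0, 9.0), (0.5, 0.0), (10.0, 9.5)]

def nearest_label(p, data):
    # 1-NN: return the label of the closest labelled point
    return min(data, key=lambda d: math.dist(p, d[0]))[1]

while unlabelled:
    # pick the unlabelled point we are most confident about (the closest)
    p = min(unlabelled, key=lambda q: min(math.dist(q, x) for x, _ in labelled))
    labelled.append((p, nearest_label(p, labelled)))
    unlabelled.remove(p)
```

Each newly labelled point expands the labelled set, which in turn helps label the remaining points, mirroring the radiologist example above.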


Neural Networks

Neural Networks #

  • A network of artificial neurons inspired by how neurons function in the human brain.
  • At its core, a mathematical model designed to process and learn from data.
  • Neural networks form the foundation of Deep Learning (involves training large and complex networks on vast amounts of data).

flowchart LR
 subgraph subGraph0["Input Layer"]
        I1(("Input 1"))
        I2(("Input 2"))
        I3(("Input 3"))
  end
 subgraph subGraph1["Hidden Layer"]
        H1(("Hidden 1"))
        H2(("Hidden 2"))
        H3(("Hidden 3"))
  end
 subgraph subGraph2["Output Layer"]
        O(("Output"))
  end
    I1 --> H1 & H2 & H3
    I2 --> H1 & H2 & H3
    I3 --> H1 & H2 & H3
    H1 --> O
    H2 --> O
    H3 --> O

    style I1 fill:#C8E6C9
    style I2 fill:#C8E6C9
    style I3 fill:#C8E6C9
    style H1 stroke:#2962FF,fill:#BBDEFB
    style H2 fill:#BBDEFB
    style H3 fill:#BBDEFB
    style O fill:#FFCDD2
    style subGraph0 stroke:none,fill:transparent
    style subGraph1 stroke:none,fill:transparent
    style subGraph2 stroke:none,fill:transparent

Structure of a Neural Network #

A typical neural network has three main layers:

  • Input layer: receives the raw input features.
  • Hidden layer(s): transform the inputs through weighted connections and activation functions.
  • Output layer: produces the final prediction.
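Wired up in code, the 3-3-1 network in the diagram above becomes a single forward pass; the input and weights below are arbitrary illustrative values, not trained:

```python
import numpy as np

# Forward pass through the 3-3-1 network sketched above.
# All weights, biases, and the input are illustrative, not trained.
x = np.array([0.5, -1.0, 2.0])           # input layer: 3 features

W1 = np.array([[ 0.2, -0.5,  0.1],
               [ 0.4,  0.3, -0.2],
               [-0.1,  0.6,  0.7]])      # 3 hidden neurons x 3 inputs
b1 = np.array([0.1, 0.0, -0.1])
h = np.tanh(W1 @ x + b1)                 # hidden layer activations

W2 = np.array([0.8, -0.3, 0.5])          # 1 output neuron x 3 hidden units
b2 = 0.2
y = W2 @ h + b2                          # output layer: a single value
```

Each arrow in the diagram corresponds to one weight; each layer is just a matrix-vector product followed by an activation.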

Machine Learning

Machine Learning #

stateDiagram-v2

    %% ===== CLASS DEFINITIONS (Math-based colours) =====
    classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
    classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
    classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
    classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
    classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

    %% ===== ROOT =====
    ML: Machine Learning

    %% ===== SUPERVISED =====
    ML --> SL:::category
    SL: Supervised Learning

    SL --> Regression
    Regression --> LR:::algebra
    LR: Linear Regression

    LR --> NN:::algebra
    NN: Neural Network

    NN --> DT:::logic
    DT: Decision Tree

    SL --> Classification
    Classification --> NB:::probability
    NB: Naive Bayes

    NB --> KNN:::geometry
    KNN: k-Nearest Neighbours

    KNN --> SVM:::algebra
    SVM: Support Vector Machine
    
    %% ===== UNSUPERVISED =====
    ML --> USL:::category
    USL: Unsupervised Learning

    USL --> Clustering
    Clustering --> KM:::geometry
    KM: K-Means

    KM --> GMM:::probability
    GMM: Gaussian Mixture Model

    GMM --> HMM:::probability
    HMM: Hidden Markov Model

    %% ===== REINFORCEMENT =====
    ML --> RL:::category
    RL: Reinforcement Learning

    RL --> DM:::logic
    DM: Decision Making

Mathematical Legend

Algebra / Linear Algebra (Blue) #

Used heavily when models rely on vectors, matrices, and linear transformations, e.g. the blue nodes above: Linear Regression, Neural Networks, and SVMs.

Artificial Neuron and Perceptron

Artificial Neuron and Perceptron #

Knowledge in neural networks is stored in connection weights, and learning means modifying those weights.


Biological Neuron #

A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.

Core components:

  • Dendrites: receive signals from other neurons
  • Cell body (soma): processes incoming signals
  • Axon: transmits the output signal
  • Synapses: connection points between neurons

Biological intuition:

  • many inputs arrive to one neuron
  • one neuron can connect out to many neurons
  • massive parallelism enables fast perception and recognition

Artificial Neuron #

An artificial neuron is a simplified computational model inspired by biological neurons.
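A minimal sketch: the classic perceptron computes a weighted sum of its inputs plus a bias, then applies a step activation. The weights below are illustrative:

```python
# A single artificial neuron (perceptron): weighted sum of inputs plus a
# bias, passed through a step activation. The weights are illustrative.
def perceptron(inputs, weights, bias):
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0

# With weights [1, 1] and bias -1.5, the neuron behaves as a logical AND:
# it fires only when both binary inputs are 1.
def and_gate(a, b):
    return perceptron([a, b], [1.0, 1.0], -1.5)
```

The weights play the role of synapse strengths, and "learning" means adjusting them until the neuron fires on the right inputs.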

ML Workflow

Machine Learning Workflow #

Data is the foundation of any machine learning system. Quality of data matters more than model complexity.

Role of Data #

Data determines:

  • What patterns the model can learn
  • How well it generalises
  • Whether bias or noise is introduced

Bad data → bad model (even with perfect algorithms).


Data Preprocessing, wrangling #

Raw data is never ready for training.

Data Issues

  • Noise
    • For objects, noise is an extraneous object
    • For attributes, noise refers to modification of original values
    • Handle: apply a log or Z-score transform to normalise the values
  • Outliers
    • Data objects with characteristics that are considerably different than most of the other data objects in the data set
    • Handle: Use IQR method
    • Find Lower and Upper Bound and replace Outlier with Lower or Upper Bound
  • Missing Values
    • Eliminate data objects or variables
    • Handle: Estimate missing values
      • Mean, Median or Mode
      • Prefer the median when outliers are present, since it is robust to them
    • Ignore the missing value during analysis
  • Duplicate Data
    • Major issue when merging data from heterogeneous sources
  • Inconsistent Codes
    • Handle: find all unique values and map the inconsistent codes to a single consistent form
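Two of the handling steps above, median imputation and the IQR method for outliers, can be sketched with the standard library; the sample values are invented:

```python
import statistics

# Sketch of two handling steps: median imputation for missing values and
# IQR-based replacement of outliers. The sample values are invented.
raw = [12.0, 14.0, None, 13.0, 15.0, 14.0, 95.0, None, 13.5]

# Missing values: fill with the median of the observed values
observed = [v for v in raw if v is not None]
med = statistics.median(observed)
filled = [med if v is None else v for v in raw]

# Outliers: find the IQR bounds and replace out-of-range values
# with the nearest bound
q1, _, q3 = statistics.quantiles(observed, n=4)   # quartiles
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [min(max(v, lower), upper) for v in filled]
```

Here the extreme value 95.0 is pulled back to the upper bound rather than dropped, so no rows are lost.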

Data Preprocessing Techniques

Regression (Linear Models)

Linear Regression #

Linear Regression is a supervised ML method used to predict a numerical target by fitting a model that is linear in its parameters.

In ML, linear models are a core baseline: they’re fast, often surprisingly strong, and usually easy to interpret.

Key takeaway: Linear Regression learns parameters by minimising a squared-error cost. You can solve it directly (closed form) or iteratively (gradient descent), and you can extend it using basis functions and regularisation.

Ordinary Least Squares

Direct solution method - Ordinary Least Squares and the Line of Best Fit #

It is possible to compute the best parameters for linear regression in one shot (closed form), instead of iteratively improving them step by step.

For linear regression, the direct method is usually Ordinary Least Squares (OLS).

Ordinary Least Squares (OLS) chooses the “best” line by minimising squared prediction errors.

Key takeaway: OLS defines “best fit” as the line that minimises the total squared residual error across all data points.
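The closed-form solution comes from the normal equations, w = (XᵀX)⁻¹Xᵀy. A sketch on toy data generated from a known line (y = 2x + 1 is an assumption so the recovered parameters are easy to check):

```python
import numpy as np

# Closed-form OLS via the normal equations: w = (X^T X)^{-1} X^T y.
# The toy data comes from y = 2x + 1, so OLS should recover an
# intercept of 1 and a slope of 2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

X = np.column_stack([np.ones_like(x), x])    # design matrix with intercept
w = np.linalg.solve(X.T @ X, X.T @ y)        # [intercept, slope]
```

Using `np.linalg.solve` on the normal equations avoids explicitly inverting XᵀX, which is both faster and numerically safer.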

Cost Function

Cost Function #

  • Also known as an objective function.
  • Quantifies the error between a model’s predicted values and the actual values.
  • Measures the model’s error over a group of data points, not just a single prediction.
  • Used to evaluate the accuracy of a model’s predictions.
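For regression, the usual cost function is mean squared error; a minimal sketch:

```python
# Mean squared error, the usual cost function for regression: the average
# of the squared differences between predicted and actual values.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Perfect predictions give a cost of 0, and squaring means large errors are penalised much more heavily than small ones.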