ML

Supervised Learning

Supervised Learning #

Trained using labelled data.
Each example in the training set includes the correct output.
The algorithm learns to generalise and make predictions on unseen data.
Usually more accurate than unsupervised methods on tasks with a known target, because the labels give direct feedback during training.
Requires human intervention for labelling and setup.
Widely used, and produces highly accurate results when trained on good-quality labelled data.


Classification #

Output is discrete (e.g. Yes/No, Spam/Not Spam).
Used for categorising data into predefined classes.
Support Vector Machine (SVM) is a common classifier: it separates the classes with the maximum-margin boundary, which is linear in its basic form and can be made non-linear with kernels.
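To make the idea of a discrete output and a margin concrete, here is a minimal sketch of a linear decision rule. It does not train an SVM: the weights `w` and bias `b` are hand-picked for a hypothetical toy dataset, and the function names are illustrative only. An SVM would search for the `w` and `b` that make this margin as large as possible.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign decides the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Map the continuous score to a discrete label: +1 or -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labelled point to the boundary."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm
               for x, y in zip(points, labels))

# Hypothetical toy data: two classes separated by the line x1 + x2 = 3.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
labels = [-1, -1, 1, 1]
w, b = (1.0, 1.0), -3.0  # hand-picked for illustration, not learned

predictions = [predict(w, b, x) for x in points]
margin = geometric_margin(w, b, points, labels)
print(predictions, round(margin, 4))
```

A positive margin means every point lies on the correct side of the boundary; the larger it is, the more room the boundary leaves between the classes.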

Semi-Supervised Learning

Semi-Supervised Learning #

  • A combination of labelled and unlabelled data.
  • Useful when labelling large datasets is expensive or time-consuming.
  • Works well with high-volume datasets (e.g. millions of images).
  • Only a small fraction of data is labelled (e.g. a few thousand).
  • The algorithm learns from both labelled examples and structure in unlabelled data.
  • Well suited to domains such as medical imaging, where labelled data is limited.
  • For example, a radiologist can label a small set of medical scans,
    and the model uses that to learn from thousands of unlabelled scans.
  • Helps improve accuracy and generalisation with minimal manual effort.
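One common way to combine the two kinds of data is self-training (pseudo-labelling), sketched below under strong simplifying assumptions: each example is a single number, the classifier is a nearest-centroid rule with exactly two classes, and the `min_gap` confidence threshold is a hypothetical parameter. This is an illustration of the idea, not the method used by any particular library.

```python
def centroids(values, labels):
    """Mean feature value for each class."""
    sums, counts = {}, {}
    for v, y in zip(values, labels):
        sums[y] = sums.get(y, 0.0) + v
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(value, cents):
    """Return the nearest class and how much closer it is than the runner-up."""
    ranked = sorted(cents, key=lambda y: abs(value - cents[y]))
    best, second = ranked[0], ranked[1]  # assumes at least two classes
    gap = abs(value - cents[second]) - abs(value - cents[best])
    return best, gap

def self_train(lab_x, lab_y, unlab_x, min_gap=1.0, max_rounds=10):
    """Repeatedly pseudo-label the confident unlabelled points and refit."""
    lab_x, lab_y, pool = list(lab_x), list(lab_y), list(unlab_x)
    for _ in range(max_rounds):
        cents = centroids(lab_x, lab_y)
        newly_labelled = []
        for v in pool:
            label, gap = classify(v, cents)
            if gap >= min_gap:  # only trust confident predictions
                newly_labelled.append((v, label))
        if not newly_labelled:
            break
        for v, label in newly_labelled:
            lab_x.append(v)
            lab_y.append(label)
            pool.remove(v)
    return centroids(lab_x, lab_y), pool

# Two labelled examples and seven unlabelled ones (made-up numbers).
cents, still_unlabelled = self_train(
    [1.0, 9.0], ["normal", "abnormal"],
    [0.5, 1.5, 2.0, 8.0, 8.5, 9.5, 5.1],
)
print(cents, still_unlabelled)
```

The ambiguous point 5.1 is never pseudo-labelled because it sits almost midway between the two classes, which is the behaviour the confidence threshold is meant to enforce.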


Neural Networks

Neural Networks #

  • A network of artificial neurons inspired by how neurons function in the human brain.
  • At its core, it is a mathematical model designed to process and learn from data.
  • Neural networks form the foundation of Deep Learning, which involves training large and complex networks on vast amounts of data.
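As a rough illustration of a network as a mathematical model, the sketch below runs one forward pass through a small network with three inputs, three hidden neurons and one output. The weights, biases and the choice of a sigmoid activation are all assumptions made for the example; a real network would learn its weights from data.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical parameters: 3 inputs -> 3 hidden neurons -> 1 output.
hidden_w = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2], [-0.3, 0.8, 0.6]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[0.7, -0.5, 0.2]]
output_b = [0.05]

def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, hidden_w, hidden_b)
    return dense(hidden, output_w, output_b)[0]

y = forward([1.0, 0.5, -1.0])
print(y)
```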

flowchart LR
 subgraph subGraph0["Input Layer"]
        I1(("Input 1"))
        I2(("Input 2"))
        I3(("Input 3"))
  end
 subgraph subGraph1["Hidden Layer"]
        H1(("Hidden 1"))
        H2(("Hidden 2"))
        H3(("Hidden 3"))
  end
 subgraph subGraph2["Output Layer"]
        O(("Output"))
  end
    I1 --> H1 & H2 & H3
    I2 --> H1 & H2 & H3
    I3 --> H1 & H2 & H3
    H1 --> O
    H2 --> O
    H3 --> O

    style I1 fill:#C8E6C9
    style I2 fill:#C8E6C9
    style I3 fill:#C8E6C9
    style H1 stroke:#2962FF,fill:#BBDEFB
    style H2 fill:#BBDEFB
    style H3 fill:#BBDEFB
    style O fill:#FFCDD2
    style subGraph0 stroke:none,fill:transparent
    style subGraph1 stroke:none,fill:transparent
    style subGraph2 stroke:none,fill:transparent

Structure of a Neural Network #

A typical neural network has three main layers: input, hidden, and output.

Machine Learning

Machine Learning #

stateDiagram-v2

    %% ===== CLASS DEFINITIONS (Math-based colours) =====
    classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
    classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
    classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
    classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
    classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

    %% ===== ROOT =====
    ML: Machine Learning

    %% ===== SUPERVISED =====
    ML --> SL:::category
    SL: Supervised Learning

    SL --> Regression
    Regression --> LR:::algebra
    LR: Linear Regression

    LR --> NN:::algebra
    NN: Neural Network

    NN --> DT:::logic
    DT: Decision Tree

    SL --> Classification
    Classification --> NB:::probability
    NB: Naive Bayes

    NB --> KNN:::geometry
    KNN: k-Nearest Neighbours

    KNN --> SVM:::algebra
    SVM: Support Vector Machine
    
    %% ===== UNSUPERVISED =====
    ML --> USL:::category
    USL: Unsupervised Learning

    USL --> Clustering
    Clustering --> KM:::geometry
    KM: K-Means

    KM --> GMM:::probability
    GMM: Gaussian Mixture Model

    GMM --> HMM:::probability
    HMM: Hidden Markov Model

    %% ===== REINFORCEMENT =====
    ML --> RL:::category
    RL: Reinforcement Learning

    RL --> DM:::logic
    DM: Decision Making

Mathematical Legend

Algebra / Linear Algebra (Blue) #

Used heavily when models rely on:

Artificial Neuron and Perceptron

Artificial Neuron and Perceptron #

Knowledge in neural networks is stored in connection weights, and learning means modifying those weights.


Biological Neuron #

A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.

Core components:

  • Dendrites: receive signals from other neurons
  • Cell body (soma): processes incoming signals
  • Axon: transmits the output signal
  • Synapses: connection points between neurons

Biological intuition:

  • many inputs arrive at one neuron
  • one neuron can connect out to many neurons
  • massive parallelism enables fast perception and recognition

Artificial Neuron #

An artificial neuron is a simplified computational model inspired by biological neurons.
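The sketch below shows one such model, a perceptron: a weighted sum of the inputs followed by a step activation, with learning implemented as small corrections to the weights. The training data (logical AND), the learning rate and the function names are assumptions chosen for illustration.

```python
def step(z):
    """Step activation: the neuron fires (1) when the weighted sum is >= 0."""
    return 1 if z >= 0 else 0

def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_perceptron(samples, targets, learning_rate=1.0, epochs=20):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(samples, targets):
            error = t - neuron_output(weights, bias, x)
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if errors == 0:  # every sample classified correctly
            break
    return weights, bias

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]  # logical AND, which is linearly separable
weights, bias = train_perceptron(samples, targets)
outputs = [neuron_output(weights, bias, x) for x in samples]
print(weights, bias, outputs)
```

After training, everything the neuron has learned about AND lives in `weights` and `bias`, which is the sense in which knowledge is stored in the connection weights.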

ML Workflow

Machine Learning Workflow #

Data is the foundation of any machine learning system. Data quality often matters more than model complexity.

Role of Data #

Data determines:

  • What patterns the model can learn
  • How well it generalises
  • Whether bias or noise is introduced

Bad data → bad model (even with perfect algorithms).


Data Preprocessing and Wrangling #

Raw data is rarely ready for training.

Data Issues

  • Noise
    • For objects, noise is an extraneous object
    • For attributes, noise is a modification of the original values
    • Handle: apply a log transform or z-score standardisation (rescale to zero mean and unit variance)
  • Outliers
    • Data objects whose characteristics differ considerably from most other objects in the data set
    • Handle: use the IQR method
    • Compute the lower and upper bounds (commonly Q1 − 1.5 × IQR and Q3 + 1.5 × IQR) and replace each outlier with the nearest bound
  • Missing Values
    • Eliminate the affected data objects or variables
    • Handle: estimate the missing values
      • Mean, median or mode
      • Prefer the median if the data contains outliers
    • Ignore the missing values during analysis
  • Duplicate Data
    • Major issue when merging data from heterogeneous sources
  • Inconsistent Codes
    • Find all Unique and transfer all inconsistent to
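The missing-value and outlier handling described in the list above can be sketched with the standard library alone. This is a minimal example under stated assumptions: missing values are represented as `None`, the fence multiplier `k=1.5` is the common convention rather than something fixed by the method, and quartiles use the "inclusive" convention of `statistics.quantiles` (other conventions give slightly different bounds).

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Lower and upper fences: Q1 - k * IQR and Q3 + k * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def clip_outliers(values, k=1.5):
    """Replace any value outside the fences with the nearest fence."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]

# Made-up column with one missing value and one obvious outlier.
raw = [10.0, 12.0, None, 11.0, 13.0, 12.0, 95.0]
filled = impute_median(raw)
cleaned = clip_outliers(filled)
print(filled)
print(iqr_bounds(filled))
print(cleaned)
```

Here the median of the observed values is 12.0, whereas the mean would be 25.5 because of the single outlier, which is why the median is the safer estimate when outliers are present.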

Data Preprocessing techniques

Regression (Linear)

Linear Regression #

Linear Regression is a supervised ML method used to predict a numerical target by fitting a model that is linear in its parameters.

In ML, linear models are a core baseline: they’re fast, often surprisingly strong, and usually easy to interpret.

Key takeaway: Linear Regression learns parameters by minimising a squared-error cost. You can solve it directly (closed form) or iteratively (gradient descent), and you can extend it using basis functions and regularisation.

Ordinary Least Squares

Direct solution method - Ordinary Least Squares and the Line of Best Fit #

Revision:
OLS is the direct method for linear regression. It finds the best-fit line by minimising the sum of squared residuals without iterative updates.
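For a single input feature, the direct solution reduces to two short formulas for the slope and intercept. The sketch below assumes that single-feature case, and the function name `fit_ols` and the sample data are illustrative only.

```python
def fit_ols(xs, ys):
    """Closed-form least-squares line for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_ols(xs, ys)
print(slope, intercept)
```

There is no learning rate and no loop over updates: the parameters come out in one pass over the data.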


Direct Method vs Iterative Method ☆ #

Linear regression parameters can be found in two main ways.

| Method | Main idea | When used |
| --- | --- | --- |
| Ordinary Least Squares | Compute the best parameters directly | Small or moderate datasets |
| Gradient Descent | Start with parameters and update repeatedly | Large datasets or many features |

flowchart LR
    A["Linear Regression"] --> B["Direct Solution<br/>OLS"]
    A --> C["Iterative Solution<br/>Gradient Descent"]

    B --> B1["Normal Equation"]
    B --> B2["No learning rate"]
    B --> B3["One-shot solution"]

    C --> C1["Learning rate"]
    C --> C2["Repeated updates"]
    C --> C3["Stops after convergence"]

    style A fill:#E1F5FE,stroke:#5b7db1,color:#000
    style B fill:#C8E6C9,stroke:#5f8f6a,color:#000
    style C fill:#FFF9C4,stroke:#b59b3b,color:#000
    style B1 fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style B2 fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style B3 fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style C1 fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style C2 fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style C3 fill:#EDE7F6,stroke:#8a6fb3,color:#000

Why It Is Called “Least Squares” ☆ #

OLS is called least squares because it chooses parameters that make the squared residual errors as small as possible.
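Written out for a line with one feature, the idea looks as follows. The notation is an assumption introduced here, not taken from the original notes: the n training points are (x_i, y_i), the intercept is θ0 and the slope is θ1.

```latex
% Assumed notation: (x_i, y_i) are the n training points,
% \theta_0 is the intercept and \theta_1 is the slope.
\[
e_i = y_i - (\theta_0 + \theta_1 x_i), \qquad
(\hat{\theta}_0, \hat{\theta}_1)
  = \arg\min_{\theta_0,\,\theta_1} \sum_{i=1}^{n} e_i^{2}
\]
```

Each residual e_i is the vertical gap between a data point and the line, and squaring it makes positive and negative gaps count equally.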

Cost Function

Cost Function #

Revision:
A cost function converts model error into a single number. Training means changing the model parameters until this number becomes as small as possible.


Why Cost Function Matters in ML ☆ #

A machine learning model needs a way to decide whether one set of parameters is better than another.

For linear regression, every possible value of the parameters gives a different line. The cost function tells us which line is better by measuring how far the predictions are from the true values.
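A small sketch of this comparison, assuming mean squared error as the cost (some texts scale it by 1/2n instead, which does not change which line wins). The data and the two candidate lines are made up for illustration.

```python
def mse_cost(slope, intercept, xs, ys):
    """Mean squared error of the line y = slope * x + intercept on the data."""
    n = len(xs)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n

# Made-up points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

cost_good = mse_cost(2.0, 1.0, xs, ys)  # the line that generated the data
cost_poor = mse_cost(1.0, 1.0, xs, ys)  # a line with the wrong slope
print(cost_good, cost_poor)
```

The lower number identifies the better line, which is all the training procedure needs in order to choose between candidate parameters.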

Gradient Descent

Gradient Descent for Linear Regression #

Revision:
Gradient descent is the step-by-step method for reducing the cost function when a direct closed-form solution is not convenient.


Where Gradient Descent Fits in ML ☆ #

Gradient descent is used when we want the model to learn parameters by repeatedly improving them.

For linear regression, it adjusts the slope and intercept until the prediction error becomes small.

flowchart LR
    A["Initial Parameters"] --> B["Make Predictions"]
    B --> C["Compute Cost"]
    C --> D["Compute Gradient"]
    D --> E["Update Parameters"]
    E --> B

    style A fill:#E1F5FE,stroke:#5b7db1,color:#000
    style B fill:#C8E6C9,stroke:#5f8f6a,color:#000
    style C fill:#FFF9C4,stroke:#b59b3b,color:#000
    style D fill:#EDE7F6,stroke:#8a6fb3,color:#000
    style E fill:#C8E6C9,stroke:#5f8f6a,color:#000

Core Idea ☆ #

The gradient tells us the direction in which the cost increases fastest, so gradient descent moves the parameters in the opposite direction.
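The loop of predicting, measuring the cost gradient and updating can be sketched as below. This is a minimal version under stated assumptions: mean squared error as the cost, a single feature, a hand-picked learning rate, and a fixed number of steps instead of a convergence test.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ slope * x + intercept by repeatedly stepping against the gradient."""
    slope, intercept = 0.0, 0.0  # initial parameters
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training point.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2.0 / n) * sum(errors)
        # Move opposite to the gradient, scaled by the learning rate.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = gradient_descent(xs, ys)
print(round(slope, 4), round(intercept, 4))
```

The learning rate matters: on this data a value much above 0.1 makes the updates overshoot and diverge, while a very small value needs many more steps to get close to the same answer.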