ML

Supervised Learning

Supervised Learning #

Trained using labelled data.
Each example in the training set includes the correct output.
The algorithm learns to generalise and make predictions on unseen data.
Generally more accurate than unsupervised methods.
Requires human intervention for labelling and setup.
Widely used due to its accuracy and efficiency.
Produces highly accurate results when trained on good-quality labelled data.


Classification #

Output is discrete (e.g. Yes/No, Spam/Not Spam).
Used for categorising data into predefined classes.
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).

Partial Differentiation and Gradients

Partial Differentiation and Gradients #

For f(x1, x2, …, xn):

[ \frac{\partial f}{\partial x_i} ]

Gradient vector:

[ \nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \ \vdots \ \frac{\partial f}{\partial x_n} \end{bmatrix} ]

Gradient points in direction of steepest ascent.

flowchart LR
    Input --> Function
    Function --> Gradient
    Gradient --> Optimisation

Home | Vector Calculus

Linear Independence

Linear Independence #

A set of vectors is linearly independent if none of them can be written as a linear combination of the others.

\[ c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \;\Rightarrow\; c_1=\cdots=c_k=0 \]

Independence means each vector adds new information.

Semi-Supervised Learning

Semi-Supervised Learning #

  • A combination of labelled and unlabelled data.
  • Useful when labelling large datasets is expensive or time-consuming.
  • Works well with high-volume datasets (e.g. millions of images).
  • Only a small fraction of data is labelled (e.g. a few thousand).
  • The algorithm learns from both labelled examples and structure in unlabelled data.
  • Ideal for medical imaging where labelled data is limited.
  • For example, a radiologist can label a small set of medical scans,
    and the model uses that to learn from thousands of unlabelled scans.
  • Helps improve accuracy and generalisation with minimal manual effort.

Home | Machine Learning

Gradients of Vector-Valued and Matrix Functions

Gradients of Vector-Valued and Matrix Functions #

Covers gradients when outputs or parameters are vectors/matrices.

If f: R^n -> R^m, the derivative is the Jacobian.

[ J = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} \ \vdots & \ddots & \vdots \ \frac{\partial f_m}{\partial x_1} & \dots & \frac{\partial f_m}{\partial x_n} \end{bmatrix} ]

For scalar f(x):

[ H = \nabla^2 f ]

Hessian captures curvature.

Reinforcement Learning

Reinforcement Learning (RL) #

RL is learning by trial and error.

Reinforcement Learning (RL) is a type of machine learning where an autonomous agent learns to make decisions by interacting with an environment.

Instead of being told the correct answer, the agent:

  • takes actions
  • observes outcomes
  • receives rewards or penalties
  • gradually learns a strategy that maximises long-term reward

Reinforcement Learning teaches an agent how to act, not what to predict.

Inner Products and Dot Product

Inner Products and Dot Product #

An inner product maps two vectors to a single scalar.

It allows us to measure:

  • similarity
  • vector length
  • projections
  • orthogonality
flowchart TD
T["Inner<br/>products<br/>(types)"] --> DOT["Euclidean<br/>Dot product"]
T --> WIP["Weighted<br/>inner product"]
T --> FN["Function-space<br/>(integral)"]
T --> HERM["Complex<br/>Hermitian"]
T --> MAT["Matrix<br/>inner product<br/>(Frobenius)"]

DOT --> Rn["Vectors in<br/>
<span>
  \( \mathbb{R}^n \)
  </span>

"]
WIP --> SPD["SPD matrix<br/>W"]
FN --> L2["L2 space<br/>functions"]
HERM --> Cn["Vectors in<br/>C^n"]
MAT --> Mnm["Matrices<br/>R^{m×n}"]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style DOT fill:#C8E6C9,stroke:#2E7D32,color:#000
style WIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style FN fill:#C8E6C9,stroke:#2E7D32,color:#000
style HERM fill:#C8E6C9,stroke:#2E7D32,color:#000
style MAT fill:#C8E6C9,stroke:#2E7D32,color:#000

style Rn fill:#CE93D8,stroke:#8E24AA,color:#000
style SPD fill:#CE93D8,stroke:#8E24AA,color:#000
style L2 fill:#CE93D8,stroke:#8E24AA,color:#000
style Cn fill:#CE93D8,stroke:#8E24AA,color:#000
style Mnm fill:#CE93D8,stroke:#8E24AA,color:#000

Definition #

For vectors
\( \mathbf{a}, \mathbf{b} \in \mathbb{R}^n \)

Backpropagation and Automatic Differentiation

Backpropagation and Automatic Differentiation #

Backpropagation applies the chain rule:

  • efficiently across a computational graph.
  • repeatedly.

Chain rule:

[ \frac{dL}{dx} = \frac{dL}{dy} \cdot \frac{dy}{dx} ]
flowchart LR
    x --> y
    y --> L

Automatic differentiation computes exact derivatives efficiently using computational graphs.


Home | Vector Calculus