January 3, 2026
Supervised Learning
# Trained using labelled data . Each example in the training set includes the correct output . The algorithm learns to generalise and make predictions on unseen data. Generally more accurate than unsupervised methods. Requires human intervention for labelling and setup. Widely used due to its accuracy and efficiency . Produces highly accurate results when trained on good-quality labelled data.
Classification
# Output is discrete (e.g. Yes/No, Spam/Not Spam). Used for categorising data into predefined classes. Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).
Differentiation of Univariate Functions
# Differentiation measures rate of change.
For a function f(x), the derivative measures the rate of change.
\[
f'(x) = \lim_{h \to 0} \frac{f(x+h)-f(x)}{h}
\] Interpretation:
Slope of tangent Instantaneous rate of change Home | Vector Calculus
Partial Differentiation and Gradients
# For f(x1, x2, …, xn):
[
\frac{\partial f}{\partial x_i}
] Gradient vector:
[
\nabla f =
\begin{bmatrix}
\frac{\partial f}{\partial x_1} \
\vdots \
\frac{\partial f}{\partial x_n}
\end{bmatrix}
] Gradient points in direction of steepest ascent.
flowchart LR
Input --> Function
Function --> Gradient
Gradient --> Optimisation
Home | Vector Calculus
Linear Independence
# A set of vectors is linearly independent if none of them can be written as a linear combination of the others.
\[
c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}
\;\Rightarrow\;
c_1=\cdots=c_k=0
\] Independence means each vector adds new information .
January 3, 2026
Semi-Supervised Learning
# A combination of labelled and unlabelled data . Useful when labelling large datasets is expensive or time-consuming . Works well with high-volume datasets (e.g. millions of images). Only a small fraction of data is labelled (e.g. a few thousand). The algorithm learns from both labelled examples and structure in unlabelled data. Ideal for medical imaging where labelled data is limited.For example, a radiologist can label a small set of medical scans, and the model uses that to learn from thousands of unlabelled scans. Helps improve accuracy and generalisation with minimal manual effort. Home | Machine Learning
Gradients of Vector-Valued and Matrix Functions
# Covers gradients when outputs or parameters are vectors/matrices.
If f: R^n -> R^m, the derivative is the Jacobian.
[
J =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} \
\vdots & \ddots & \vdots \
\frac{\partial f_m}{\partial x_1} & \dots & \frac{\partial f_m}{\partial x_n}
\end{bmatrix}
] For scalar f(x):
[
H = \nabla^2 f
] Hessian captures curvature.
Reinforcement Learning (RL)
# RL is learning by trial and error .
Reinforcement Learning (RL) is a type of machine learning where an autonomous agent learns to make decisions by interacting with an environment .
Instead of being told the correct answer, the agent:
takes actions observes outcomes receives rewards or penalties gradually learns a strategy that maximises long-term reward Reinforcement Learning teaches an agent how to act, not what to predict.
Useful Gradient Identities
# [
\nabla (a^T x) = a
]
[
\nabla (x^T A x) = (A + A^T)x
] If A symmetric:
[
\nabla (x^T A x) = 2Ax
] These are heavily used in optimisation .
Home | Vector Calculus
Inner Products and Dot Product
# An inner product maps two vectors to a single scalar .
It allows us to measure:
similarity vector length projections orthogonality
flowchart TD
T["Inner<br/>products<br/>(types)"] --> DOT["Euclidean<br/>Dot product"]
T --> WIP["Weighted<br/>inner product"]
T --> FN["Function-space<br/>(integral)"]
T --> HERM["Complex<br/>Hermitian"]
T --> MAT["Matrix<br/>inner product<br/>(Frobenius)"]
DOT --> Rn["Vectors in<br/>
<span>
\( \mathbb{R}^n \)
</span>
"]
WIP --> SPD["SPD matrix<br/>W"]
FN --> L2["L2 space<br/>functions"]
HERM --> Cn["Vectors in<br/>C^n"]
MAT --> Mnm["Matrices<br/>R^{m×n}"]
style T fill:#90CAF9,stroke:#1E88E5,color:#000
style DOT fill:#C8E6C9,stroke:#2E7D32,color:#000
style WIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style FN fill:#C8E6C9,stroke:#2E7D32,color:#000
style HERM fill:#C8E6C9,stroke:#2E7D32,color:#000
style MAT fill:#C8E6C9,stroke:#2E7D32,color:#000
style Rn fill:#CE93D8,stroke:#8E24AA,color:#000
style SPD fill:#CE93D8,stroke:#8E24AA,color:#000
style L2 fill:#CE93D8,stroke:#8E24AA,color:#000
style Cn fill:#CE93D8,stroke:#8E24AA,color:#000
style Mnm fill:#CE93D8,stroke:#8E24AA,color:#000
Definition
# For vectors\( \mathbf{a}, \mathbf{b} \in \mathbb{R}^n \)
Backpropagation and Automatic Differentiation
# Backpropagation applies the chain rule:
efficiently across a computational graph. repeatedly. Chain rule:
[
\frac{dL}{dx} = \frac{dL}{dy} \cdot \frac{dy}{dx}
]
flowchart LR
x --> y
y --> L
Automatic differentiation computes exact derivatives efficiently using computational graphs.
Home | Vector Calculus