MFML Lecture to Course Content Map

MFML Lecture to Course Content Map #

This file maps the uploaded Maths lecture PDFs and webinar PDFs against the official course handout/contact-session plan. It is intended as an exam preparation index and as a source map for future Hugo Markdown notes.

Course identity #

Course: Mathematical Foundations for Machine Learning
Course code: AIML ZC416
Main areas: linear algebra, vector spaces, matrix decompositions, vector calculus, optimisation, PCA, and SVM.

Official module structure #

| Module | Course handout area | Main ideas | Uploaded lecture coverage |
| --- | --- | --- | --- |
| M1 | Solution of linear systems | Systems of equations, matrices, solving Ax = b | Lecture 1, Webinar 1 |
| M2 | Vector spaces and analytic geometry | Vector spaces, linear independence, basis, rank, norms, inner products, angles, orthogonality, orthonormal basis | Lecture 2, Lecture 3, Webinar 1 |
| M3 | Matrix decomposition methods | Determinant, trace, eigenvalues, eigenvectors, Cholesky, eigendecomposition, diagonalisation, SVD, matrix approximation | Lecture 4, Lecture 5, Webinar 1, Webinar 2 |
| M4 | Vector calculus | Univariate differentiation, partial derivatives, gradients, matrix gradients, Taylor/Maclaurin series, Hessian, backpropagation, automatic differentiation | Lecture 6, Lecture 7, Lecture 8, Webinar 2 |
| M5 | Continuous optimisation | Gradient descent, constrained optimisation, Lagrange multipliers, convex optimisation | Lecture 9, Lecture 14, Webinar 2, Webinar 3, Webinar 4 |
| M6 | Nonlinear optimisation | Learning rate, initialisation, SGD, feature preprocessing, local optima, cliffs/valleys, momentum, AdaGrad, RMSProp, Adam | Lecture 10, Lecture 11, Webinar 3 |
| M7 | Dimensionality reduction, PCA, SVM | PCA perspectives, low-rank approximation, high-dimensional PCA, practical PCA, SVM preliminaries, primal/dual SVM, kernels | Lecture 12, Lecture 13, Lecture 14, Lecture 15, Webinar 4 |

Contact session by lecture #

| Session | Course handout topic | Uploaded file | What the lecture appears to cover | Exam relevance |
| --- | --- | --- | --- | --- |
| 1 | Solution of linear systems | `Lecture_1.pdf` | Linear algebra introduction, closure, systems of linear equations, matrix representation, solution types: no solution, unique solution, infinite solutions, pivot/free variables, matrix operations, inverse, transpose, compact Ax=b form | Very high for Mid-Sem and Comprehensive |
| 2 | Vector spaces, linear independence, basis, rank | `Lecture_2.pdf` | Groups, Abelian groups, vector spaces, vector subspaces, closure tests, linear combinations, span, linear independence, basis, rank, nullspace/column space ideas | Very high for Mid-Sem and Comprehensive |
| 3 | Analytic geometry | `Lecture_3.pdf` | Norms, dot product, inner products, bilinear mappings, symmetric positive-definite matrices, lengths, distances, angles, orthogonality, orthonormal basis, Gram-Schmidt ideas | Very high for Mid-Sem and Comprehensive |
| 4 | Matrix Decomposition I | `lecture_4.pdf` | Determinant, cofactor formula, determinant behaviour under row operations, rank-det relation, eigenvalues/eigenvectors, Cholesky-related positive definite ideas | Very high for Mid-Sem and Comprehensive |
| 5 | Matrix Decomposition II | `lecture_5.pdf` | Diagonal matrices, diagonalisation, eigendecomposition, spectral theorem for symmetric matrices, SVD, matrix approximation | Very high for Mid-Sem and Comprehensive |
| 6 | Vector Calculus I | `lecture_6.pdf` | Differentiation of univariate functions, polynomial derivatives, Taylor polynomial/series, partial derivatives, gradients, vector-valued gradients | Very high for Mid-Sem and Comprehensive |
| 7 | Vector Calculus II | `lecture_7_edited.pdf` | Matrix gradients, useful gradient identities, backpropagation, automatic differentiation, chain rule through neural-network layers | High for Mid-Sem and Comprehensive |
| 8 | Vector Calculus III | `lecture_8.pdf` | Taylor/Maclaurin series theory, remainder term, two-variable Taylor series, Hessian matrix, maxima/minima, unconstrained optimisation preliminaries | Very high for Mid-Sem and Comprehensive |
| 9 | Continuous Optimisation | `Lecture_9.pdf` | Gradient descent, negative gradient direction, local minima, step size, line search, convergence intuition, quadratic examples | Very high for Comprehensive; likely useful for quizzes/problems |
| 10 | Nonlinear Optimisation I | `Lecture_10.pdf` | Initialisation, objective functions in ML, overfitting, feature processing/preprocessing, SGD and practical optimisation behaviour | High for Comprehensive |
| 11 | Nonlinear Optimisation II | `Lecture_11.pdf` | Difficult topologies: cliffs, valleys, flat regions, curvature; momentum, AdaGrad, RMSProp, Adam | High for Comprehensive |
| 12 | PCA I | `Lecture_12.pdf` | Dimensionality reduction, PCA problem setting, centred data, covariance, maximum variance perspective, projection perspective | Very high for Comprehensive |
| 13 | PCA II | `Lecture_13.pdf` | Practical PCA, eigenvector computation, SVD relationship, low-rank approximation, high-dimensional PCA, key PCA steps | Very high for Comprehensive |
| 14 | Mathematical preliminaries for SVM | `Lecture 14.pdf` | Constrained optimisation, Lagrangian, quadratic programming, primal/dual, weak/strong duality, Slater condition, KKT conditions, kernels, linear classifiers | Very high for Comprehensive |
| 15 | Primal/dual linear SVM | `Lecture_15.pdf` | SVM primal problem, dual formulation, KKT conditions, support vectors, hinge loss, linear SVM numerical problem, hard/soft-margin direction | Very high for Comprehensive |
| 16 | Nonlinear SVM / kernels | Not clearly uploaded as a separate Lecture 16 PDF | Kernel functions, nonlinear SVM examples; likely partly covered in Lecture 14/15 and webinars | Very high for Comprehensive; gap to fill if Lecture 16 exists |

Webinar mapping #

| Webinar file | Main role | Best linked lectures | Exam use |
| --- | --- | --- | --- |
| `Webinar_1.pdf` | Problem sheet on linear systems, REF/RREF, column space, nullspace, row independence, subspaces, inner products, Cauchy-Schwarz, Cholesky, eigenvalues | Lectures 1-5 | Excellent for Mid-Sem problem practice |
| `Webinar_2.pdf` | Worked problems on maxima/minima, eigenvalues/spectral decomposition, gradient-related calculations and PCA-style examples | Lectures 4-9, 12-13 | Excellent for Mid-Sem revision and Comprehensive practice |
| `Webinar_3.pdf` | Gradient descent algorithm, step-size derivation for quadratic functions, worked gradient descent examples | Lectures 8-11 | Excellent for optimisation exam problems |
| `webinar_4.pdf` | Appears linked to optimisation/SVM/PCA practice based on uploaded set; use as problem-solving supplement after Lecture 12 onwards | Lectures 12-15 | Comprehensive exam practice |

Mid-Sem focus #

The course handout states that the Mid-Semester Test covers Weeks 1-8. So for Mid-Sem, focus on:

MFML Exam Revision Index

AI, Maths

MFML, Maths, Exam Revision

MFML Exam Revision Index #

This is a practical revision index for the uploaded Mathematical Foundations for Machine Learning material.

Exam split #

| Exam | Coverage | Main files |
| --- | --- | --- |
| Mid-Semester | Weeks/Sessions 1-8 | Lecture 1 to Lecture 8, Webinar 1, Webinar 2 |
| Comprehensive | Sessions 1-16 | Lecture 1 to Lecture 15, webinars, and any missing Lecture 16/kernel material |

High-priority concept checklist #

Linear systems and matrices #

Convert equations into matrix form Ax = b
Understand solution types: no solution, unique solution, infinite solutions
Identify pivot and free variables
Understand row operations, REF/RREF, rank, nullity
Know matrix inverse and transpose properties
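
The solution-type check in this list can be sketched in code. This is a minimal illustration, not taken from the lectures: it row-reduces with exact fractions and compares rank(A) with rank([A | b]). The function names `rref` and `classify` are made up for this example.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot column indices), using exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c corresponds to a free variable
        m[r], m[p] = m[p], m[r]
        m[r] = [v / m[r][c] for v in m[r]]  # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def classify(A, b):
    """Classify Ax = b by comparing rank(A) with rank([A | b])."""
    n_unknowns = len(A[0])
    _, piv_a = rref(A)
    _, piv_aug = rref([row + [bi] for row, bi in zip(A, b)])
    if len(piv_aug) > len(piv_a):
        return "no solution"          # a pivot landed in the b column
    if len(piv_a) == n_unknowns:
        return "unique solution"      # every variable is a pivot variable
    return "infinite solutions"       # at least one free variable

print(classify([[1, 1], [1, -1]], [2, 0]))
print(classify([[1, 1], [2, 2]], [2, 4]))
print(classify([[1, 1], [2, 2]], [2, 5]))
```

The pivot columns returned by `rref` identify the pivot variables; the remaining columns of A are the free variables.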

Vector spaces #

Definition of vector space and subspace
Closure under addition and scalar multiplication
Span, linear combination, linear independence
Basis, dimension, rank
Column space, row space, nullspace

Analytic geometry #

Norm properties
Manhattan norm and Euclidean norm
Inner product definition
Symmetric positive-definite matrices
Distance, angle, orthogonality
Orthonormal basis and Gram-Schmidt
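
A short sketch of the quantities in this list, assuming the standard dot product as the inner product (the lectures also cover general inner products, which this example does not). The helper names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean (L2) norm induced by the dot product."""
    return math.sqrt(dot(u, u))

def manhattan(u):
    """Manhattan (L1) norm."""
    return sum(abs(a) for a in u)

def angle(u, v):
    """Angle in radians between two nonzero vectors."""
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise vectors, skipping any that are numerically dependent."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = dot(w, q)  # component of w along q
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        length = norm(w)
        if length > tol:
            basis.append([wi / length for wi in w])
    return basis

q = gram_schmidt([[1, 1, 0], [1, 0, 1]])
print(dot(q[0], q[1]))   # close to zero: the vectors are orthogonal
print(norm(q[0]), norm(q[1]))
```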

Matrix decompositions #

Determinant and trace
Cofactor expansion
Row operation effect on determinant
Eigenvalue equation Av = λv
Characteristic equation det(A - λI) = 0
Diagonalisation A = PDP^{-1}
Spectral theorem for symmetric matrices
Cholesky decomposition
SVD A = UΣV^T
Low-rank approximation
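
For a 2x2 symmetric matrix, the characteristic equation reduces to λ² - tr(A)λ + det(A) = 0, which links the determinant, the trace and the eigenvalues in this list. The following is a minimal sketch under that assumption (2x2, symmetric); `eig_2x2_symmetric` is a hypothetical helper, not a library function.

```python
import math

def eig_2x2_symmetric(a, b, d):
    """Eigenpairs of [[a, b], [b, d]], largest eigenvalue first."""
    if b == 0:  # already diagonal: the standard basis vectors are eigenvectors
        return sorted([(float(a), (1.0, 0.0)), (float(d), (0.0, 1.0))], reverse=True)
    tr = a + d
    # sqrt(tr^2 - 4 det) equals sqrt((a - d)^2 + 4 b^2), which is never negative
    disc = math.hypot(a - d, 2 * b)
    pairs = []
    for lam in ((tr + disc) / 2, (tr - disc) / 2):
        v = (b, lam - a)  # solves (a - lam) v1 + b v2 = 0
        length = math.hypot(*v)
        pairs.append((lam, (v[0] / length, v[1] / length)))
    return pairs

(l1, v1), (l2, v2) = eig_2x2_symmetric(2, 1, 2)
print(l1, l2)             # eigenvalues; their sum is the trace, their product the determinant
print(v1, v2)             # unit eigenvectors, orthogonal as the spectral theorem predicts
```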

Vector calculus #

Derivative from first principles
Partial derivatives
Gradient as direction of steepest ascent
Gradient of vector-valued functions
Matrix-gradient identities
Chain rule
Backpropagation and automatic differentiation
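
The chain rule and backpropagation items can be illustrated on a single neuron. This is a hedged sketch: the sigmoid activation, squared-error loss and the name `forward_backward` are assumptions chosen for the example, not the network used in the lecture.

```python
import math

def forward_backward(w, b, x, y):
    """Loss L = (sigmoid(w*x + b) - y)^2 and its gradients via the chain rule."""
    # Forward pass
    z = w * x + b
    a = 1.0 / (1.0 + math.exp(-z))
    loss = (a - y) ** 2
    # Backward pass: multiply local derivatives from the loss back to the parameters
    dL_da = 2.0 * (a - y)
    da_dz = a * (1.0 - a)       # derivative of the sigmoid
    dL_dz = dL_da * da_dz
    return loss, dL_dz * x, dL_dz   # loss, dL/dw, dL/db

loss, dw, db = forward_backward(0.5, -0.2, 1.5, 1.0)

# Check dL/dw against a central finite difference
h = 1e-6
num_dw = (forward_backward(0.5 + h, -0.2, 1.5, 1.0)[0]
          - forward_backward(0.5 - h, -0.2, 1.5, 1.0)[0]) / (2 * h)
print(dw, num_dw)
```

Comparing analytic gradients with finite differences in this way is a common sanity check when deriving gradients by hand.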

Taylor series and Hessian #

Taylor polynomial
Taylor series and Maclaurin series
Remainder term
Taylor series in two variables
Hessian matrix
First derivative and second derivative tests
Maxima, minima and saddle points
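
For a function of two variables, the second derivative test can be written in a few lines. This sketch assumes the point is already known to be critical (gradient equal to zero) and takes the three distinct Hessian entries as inputs; the function name is illustrative.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Second derivative test for f(x, y) at a point where the gradient is zero."""
    det = fxx * fyy - fxy * fxy  # determinant of the Hessian [[fxx, fxy], [fxy, fyy]]
    if det > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"  # det == 0: the test gives no information

print(classify_critical_point(2, 0, 2))    # f = x^2 + y^2 at the origin
print(classify_critical_point(2, 0, -2))   # f = x^2 - y^2 at the origin
print(classify_critical_point(0, 0, 0))    # f = x^4 + y^4 at the origin
```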

Gradient descent and optimisation #

Negative gradient direction
Learning rate/step size
Line search
Convergence and local minima
Constrained vs unconstrained optimisation
Lagrange multipliers
Convex optimisation
SGD and optimisation in ML
Feature preprocessing and scaling
Overfitting in optimisation examples
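
A minimal sketch of gradient descent with an exact line search, assuming the quadratic objective f(x) = ½ xᵀAx - bᵀx with A symmetric positive definite. The webinar may parametrise the quadratic differently; under this form the gradient is g = Ax - b and the exact step size is α = gᵀg / gᵀAg.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gradient_descent_quadratic(A, b, x0, tol=1e-10, max_iter=1000):
    """Minimise 0.5 x^T A x - b^T x; returns (x, number of iterations used)."""
    x = list(x0)
    for k in range(max_iter):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient Ax - b
        gg = dot(g, g)
        if gg ** 0.5 < tol:
            return x, k
        alpha = gg / dot(g, matvec(A, g))                  # exact line search step
        x = [xi - alpha * gi for xi, gi in zip(x, g)]      # step along the negative gradient
    return x, max_iter

x, iters = gradient_descent_quadratic([[3, 1], [1, 2]], [1, 1], [0.0, 0.0])
print(x, iters)  # x approaches the solution of Ax = b, which is (0.2, 0.4)
```

With a fixed learning rate instead of the exact step, the same loop converges only if the rate is small enough relative to the largest eigenvalue of A.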

Nonlinear optimisation algorithms #

Difficult surfaces: cliffs, valleys, flat regions
Curvature and why first-order methods can struggle
Momentum update and intuition
AdaGrad
RMSProp
Adam
Learning rate decay
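
The update rules for momentum and Adam can be sketched as single-step functions. Several conventions exist for momentum; this example assumes the form v ← βv + g, x ← x - ηv, and the commonly quoted default values for Adam's hyperparameters. Neither choice is confirmed by the lecture slides.

```python
import math

def momentum_step(x, v, grad, lr=0.1, beta=0.9):
    """Momentum: v accumulates past gradients, x moves along -v."""
    v = [beta * vi + gi for vi, gi in zip(v, grad)]
    x = [xi - lr * vi for xi, vi in zip(x, v)]
    return x, v

def adam_step(x, m, s, grad, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: momentum-style first moment m, RMSProp-style second moment s."""
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    s = [beta2 * si + (1 - beta2) * gi * gi for si, gi in zip(s, grad)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias correction; t starts at 1
    s_hat = [si / (1 - beta2 ** t) for si in s]
    x = [xi - lr * mh / (math.sqrt(sh) + eps) for xi, mh, sh in zip(x, m_hat, s_hat)]
    return x, m, s

# Narrow valley f(x, y) = 0.5 * (x^2 + 25 y^2), with gradient (x, 25 y)
x0 = [1.0, 1.0]
grad = [x0[0], 25 * x0[1]]
print(momentum_step(x0, [0.0, 0.0], grad, lr=0.01)[0])
print(adam_step(x0, [0.0, 0.0], [0.0, 0.0], grad, t=1)[0])
```

On the first step Adam moves each coordinate by roughly the learning rate regardless of gradient scale, which is why it copes better with the steep and shallow directions of a valley than a single global step size.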

PCA #

Dimensionality reduction problem
Centred data and covariance matrix
Maximum variance view
Projection/reconstruction view
Principal components as eigenvectors of covariance matrix
SVD relation to PCA
Low-rank approximation and Eckart-Young theorem
PCA in high dimensions
Practical PCA steps
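
The practical steps (centre the data, form the covariance matrix, take the leading eigenvector, project) can be sketched for 2-D data, where the eigenproblem has a closed form. This example assumes the sample covariance with an n - 1 divisor; some texts divide by n, which changes the eigenvalues but not the principal directions. `pca_2d` is a hypothetical helper.

```python
import math

def pca_2d(points):
    """First principal component of 2-D data via the covariance eigenproblem."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    # Largest root of lambda^2 - (sxx + syy) lambda + (sxx*syy - sxy^2) = 0
    lam1 = (sxx + syy + math.hypot(sxx - syy, 2 * sxy)) / 2
    if sxy != 0:
        v = (sxy, lam1 - sxx)
    else:
        v = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    length = math.hypot(*v)
    pc1 = (v[0] / length, v[1] / length)
    scores = [x * pc1[0] + y * pc1[1] for x, y in centred]  # 1-D coordinates
    explained = lam1 / (sxx + syy)                           # share of total variance
    return pc1, scores, explained

pc1, scores, explained = pca_2d([(0, 0), (1, 2), (2, 4), (3, 6)])
print(pc1, explained)  # points lie on y = 2x, so pc1 is parallel to (1, 2)
```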

SVM #

Linear classifiers
Margin and support vectors
Hard-margin SVM primal formulation
Lagrangian for SVM
KKT conditions
Primal vs dual perspective
Role of inner products
Kernel trick
Hinge loss
Soft-margin SVM
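
Hinge loss and the soft-margin objective can be evaluated directly for a given linear classifier. The sketch below assumes labels in {-1, +1} and the common primal form ½‖w‖² + C Σ max(0, 1 - yᵢ(w·xᵢ + b)); it only evaluates the objective and does not solve the optimisation problem.

```python
def hinge_loss(w, b, x, y):
    """max(0, 1 - y (w.x + b)) for a label y in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def soft_margin_objective(w, b, X, Y, C=1.0):
    """0.5 ||w||^2 + C * (sum of hinge losses over the data)."""
    reg = 0.5 * sum(wi * wi for wi in w)
    return reg + C * sum(hinge_loss(w, b, x, y) for x, y in zip(X, Y))

w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
Y = [1, 1, -1]
print([hinge_loss(w, b, x, y) for x, y in zip(X, Y)])
print(soft_margin_objective(w, b, X, Y))
```

The first point lies outside the margin and the third lies exactly on it, so both have zero loss; the second point is inside the margin and is penalised even though it is classified correctly.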

Suggested revision order #

Phase 1: Foundations #

Lecture 1
Lecture 2
Lecture 3
Webinar 1 problems related to REF, nullspace, column space and subspaces

Phase 2: Matrix decompositions #

Lecture 4
Lecture 5
Webinar 1 and Webinar 2 eigenvalue/eigendecomposition problems

Phase 3: Calculus and optimisation foundations #

Lecture 6
Lecture 7
Lecture 8
Webinar 2 maxima/minima and Hessian problems

Phase 4: Optimisation for ML #

Lecture 9
Lecture 10
Lecture 11
Webinar 3 gradient-descent step-size problems

Phase 5: PCA and SVM #

Lecture 12
Lecture 13
Lecture 14
Lecture 15
Webinar 4 / SVM problems

What to ask me next #

Use these prompts when generating Hugo pages:

MFML Topic to Source Index

AI, Maths

MFML, Maths, Source Index

MFML Topic to Source Index #

This index tells you where to look when you want to create future notes or revise a topic.

| Topic | Primary source PDFs | Supporting source PDFs | Future Hugo page |
| --- | --- | --- | --- |
| Linear systems | Lecture 1 | Webinar 1 | `01-linear-systems-and-matrices.md` |
| Matrix operations | Lecture 1 | Webinar 1 | `01-linear-systems-and-matrices.md` |
| Vector spaces | Lecture 2 | Webinar 1 | `02-vector-spaces-subspaces-basis-rank.md` |
| Subspaces | Lecture 2 | Webinar 1 | `02-vector-spaces-subspaces-basis-rank.md` |
| Linear independence, span, basis | Lecture 2 | Webinar 1 | `02-vector-spaces-subspaces-basis-rank.md` |
| Rank and nullity | Lecture 2 | Webinar 1 | `02-vector-spaces-subspaces-basis-rank.md` |
| Norms and distances | Lecture 3 | Webinar 1 | `03-analytic-geometry-norms-inner-products.md` |
| Inner products | Lecture 3 | Webinar 1 | `03-analytic-geometry-norms-inner-products.md` |
| Orthogonality and Gram-Schmidt | Lecture 3 | Webinar 1 | `03-analytic-geometry-norms-inner-products.md` |
| Determinant and trace | Lecture 4 | Webinar 1 | `04-determinants-trace-eigenvalues.md` |
| Eigenvalues/eigenvectors | Lecture 4 | Webinar 1, Webinar 2 | `04-determinants-trace-eigenvalues.md` |
| Cholesky | Lecture 4 | Webinar 1 | `04-determinants-trace-eigenvalues.md` |
| Diagonalisation | Lecture 5 | Webinar 2 | `05-eigendecomposition-svd-matrix-approximation.md` |
| Eigendecomposition | Lecture 5 | Webinar 2 | `05-eigendecomposition-svd-matrix-approximation.md` |
| SVD | Lecture 5 | Lecture 13, Webinar 1 | `05-eigendecomposition-svd-matrix-approximation.md` |
| Differentiation | Lecture 6 | Webinar 2 | `06-vector-calculus-gradients.md` |
| Gradients | Lecture 6, Lecture 7 | Webinar 2, Webinar 3 | `06-vector-calculus-gradients.md` |
| Backpropagation | Lecture 7 | - | `07-backpropagation-automatic-differentiation.md` |
| Automatic differentiation | Lecture 7 | - | `07-backpropagation-automatic-differentiation.md` |
| Taylor/Maclaurin series | Lecture 6, Lecture 8 | Webinar 2 | `08-taylor-series-hessian-maxima-minima.md` |
| Hessian | Lecture 8 | Webinar 2 | `08-taylor-series-hessian-maxima-minima.md` |
| Maxima/minima | Lecture 8 | Webinar 2 | `08-taylor-series-hessian-maxima-minima.md` |
| Gradient descent | Lecture 9 | Webinar 3 | `09-gradient-descent-continuous-optimisation.md` |
| Step size / line search | Lecture 9 | Webinar 3 | `09-gradient-descent-continuous-optimisation.md` |
| Constrained optimisation | Lecture 9, Lecture 14 | Webinar 4 | `14-lagrangian-duality-kkt.md` |
| Lagrange multipliers | Lecture 14 | Webinar 4 | `14-lagrangian-duality-kkt.md` |
| KKT conditions | Lecture 14, Lecture 15 | Webinar 4 | `14-lagrangian-duality-kkt.md` |
| Feature preprocessing | Lecture 10 | - | `10-nonlinear-optimisation-sgd-feature-preprocessing.md` |
| Overfitting | Lecture 10 | - | `10-nonlinear-optimisation-sgd-feature-preprocessing.md` |
| SGD | Lecture 10 | Webinar 3 | `10-nonlinear-optimisation-sgd-feature-preprocessing.md` |
| Cliffs and valleys | Lecture 11 | - | `11-momentum-adagrad-rmsprop-adam.md` |
| Momentum | Lecture 11 | Webinar 3 | `11-momentum-adagrad-rmsprop-adam.md` |
| AdaGrad, RMSProp, Adam | Lecture 11 | - | `11-momentum-adagrad-rmsprop-adam.md` |
| PCA foundations | Lecture 12 | Webinar 4 | `12-pca-foundations.md` |
| PCA computation | Lecture 13 | Webinar 4 | `13-pca-practical-computation-svd.md` |
| Low-rank PCA | Lecture 13 | Lecture 5 | `13-pca-practical-computation-svd.md` |
| SVM preliminaries | Lecture 14 | Webinar 4 | `15-support-vector-machines.md` |
| Linear SVM | Lecture 15 | Webinar 4 | `15-support-vector-machines.md` |
| Hinge loss | Lecture 15 | Webinar 4 | `15-support-vector-machines.md` |
| Kernels / nonlinear SVM | Lecture 14/15, possibly missing Lecture 16 | Webinar 4 | `16-nonlinear-svm-kernels.md` |

Formula Sheet

March 12, 2026

AI, Statistics

AI, Statistics, Probability, Revision

Formula Sheet #

This page is a quick reference of definitions and formulas, grouped by module.

Notation #

Sample size: \( n \); population size: \( N \)
Sample mean: \( \bar{x} \); population mean: \( \mu \)
Sample variance: \( s^2 \); population variance: \( \sigma^2 \)
Sample SD: \( s \); population SD: \( \sigma \)
Complement: \( A^c \)
Intersection (“and”): \( A\cap B \); union (“or”): \( A\cup B \)
Conditional probability: \( P(A\mid B) \)

1. Basic Probability & Statistics #

1.1 Measures of Central Tendency #

Arithmetic mean #

Sample mean (ungrouped):

Supervised Learning

January 3, 2026

AI, ML

Supervised Learning #

Trained using labelled data.
Each example in the training set includes the correct output.
The algorithm learns to generalise and make predictions on unseen data.
Typically more accurate than unsupervised methods on prediction tasks, because the labels give a direct training signal.
Requires human effort to label the data and set up training.
Widely used because it is accurate and efficient when labelled data is available.
Accuracy depends on the quality and quantity of the labelled data.

Classification #

Output is discrete (e.g. Yes/No, Spam/Not Spam).
Used for categorising data into predefined classes.
Support Vector Machine (SVM) is a common classifier: a margin-based method that is linear in its basic form and can learn nonlinear boundaries through kernels.

Differentiation of Univariate Functions

AI, ML

Machine Learning, Mathematics, Vector Calculus

Differentiation of Univariate Functions #

Differentiation measures rate of change.

For a function f(x), the derivative f'(x) is the limit of the average rate of change over an interval of width h as h shrinks to zero:

\[ f'(x) = \lim_{h \to 0} \frac{f(x+h)-f(x)}{h} \]

Interpretation:

Slope of tangent
Instantaneous rate of change


Artificial Intelligence

July 4, 2024

AI

My AI Notes #

Learning how machines learn! My working notes as I learn AI.

flowchart LR
    AI[Artificial Intelligence]
    ML[Machine Learning]
    DL[Deep Learning]
    FM[Foundation Models]
    LLM[LLM Models]

    AI --> ML
    ML --> DL
    DL --> FM
    FM --> LLM

    style AI fill:#E1F5FE
    style ML fill:#C8E6C9
    style DL fill:#90CAF9
    style FM fill:#64B5F6
    style LLM fill:#FFCCBC

Mathematical Foundations for Machine Learning
Statistical Methods
Machine Learning
Deep Neural Networks

Machine Learning → The broad field where systems learn patterns from data to make predictions or decisions.
Neural Networks → A subset of machine learning that uses interconnected artificial neurons to model complex relationships.
Deep Learning → A subset of neural networks that uses many hidden layers to learn high-level features from large datasets.
Foundation Models → Large deep learning models trained on massive datasets and reused across many tasks using transfer learning.
LLMs (Large Language Models) → A specialised type of foundation model focused on understanding and generating human language.

flowchart TD
AI["Artificial<br/>Intelligence"]
ML["Machine<br/>Learning"]
NN["Neural<br/>Networks"]
DL["Deep<br/>Learning"]
FM["Foundation<br/>Models"]
LLM["LLM<br/>Models"]

AI --> ML
ML --> NN
NN --> DL
DL --> FM
FM --> LLM

LR["Linear<br/>Regression"]
DT["Decision<br/>Trees"]
ML --> LR
ML --> DT

MLP["MLP"]
CNN["CNN"]
NN --> MLP
NN --> CNN

CNNDL["CNN<br/>(deep)"]
RNN["RNN"]
DL --> CNNDL
DL --> RNN

BERT["BERT"]
CLIP["CLIP"]
FM --> BERT
FM --> CLIP

GPT["GPT"]
LLAMA["LLaMA"]
LLM --> GPT
LLM --> LLAMA

TEXT["Text"]
IMAGE["Images"]
AUDIO["Audio"]
VIDEO["Video"]
LLM --> TEXT
LLM --> IMAGE
LLM --> AUDIO
LLM --> VIDEO

style AI fill:#90CAF9,stroke:#1E88E5,color:#000
style ML fill:#90CAF9,stroke:#1E88E5,color:#000
style NN fill:#90CAF9,stroke:#1E88E5,color:#000

style DL fill:#CE93D8,stroke:#8E24AA,color:#000
style FM fill:#CE93D8,stroke:#8E24AA,color:#000

style LLM fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style DT fill:#C8E6C9,stroke:#2E7D32,color:#000
style MLP fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNNDL fill:#C8E6C9,stroke:#2E7D32,color:#000
style RNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style BERT fill:#C8E6C9,stroke:#2E7D32,color:#000
style CLIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style GPT fill:#C8E6C9,stroke:#2E7D32,color:#000
style LLAMA fill:#C8E6C9,stroke:#2E7D32,color:#000
style TEXT fill:#C8E6C9,stroke:#2E7D32,color:#000
style IMAGE fill:#C8E6C9,stroke:#2E7D32,color:#000
style AUDIO fill:#C8E6C9,stroke:#2E7D32,color:#000
style VIDEO fill:#C8E6C9,stroke:#2E7D32,color:#000

AI, ML, DL, and Data Science Diagram

Stats Formula Sheet

February 25, 2026

AI, Statistics

AI, Statistics, Probability, Revision

Stats Formula Sheet #

Keep this page as a quick reference of definitions + formulas.

Notation #

Sample size: \( n \) (sample), \( N \) (population)
Mean: \( \bar{x} \) (sample), \( \mu \) (population)
Variance: \( s^2 \) (sample), \( \sigma^2 \) (population)
Standard deviation: \( s \) (sample), \( \sigma \) (population)

Module 1: Basic Statistics #

Measures of Central Tendency #

Sample mean (ungrouped):

Partial Differentiation and Gradients

AI, ML

Machine Learning, Mathematics, Vector Calculus

Partial Differentiation and Gradients #

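For a function f(x_1, x_2, …, x_n), the partial derivative with respect to x_i measures the rate of change of f along x_i while the other variables are held fixed:

\[ \frac{\partial f}{\partial x_i} \]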

Gradient vector:

\[ \nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix} \]

The gradient points in the direction of steepest ascent; its negative is the steepest-descent direction used in gradient descent.
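
As a small worked example (the function is chosen here for illustration and is not taken from the lecture), take f(x_1, x_2) = x_1^2 + 3x_1x_2. Differentiating with respect to each variable while holding the other fixed gives:

\[ \frac{\partial f}{\partial x_1} = 2x_1 + 3x_2, \qquad \frac{\partial f}{\partial x_2} = 3x_1 \]

\[ \nabla f(1, 2) = \begin{bmatrix} 2(1) + 3(2) \\ 3(1) \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \end{bmatrix} \]

So at the point (1, 2), f increases fastest along the direction (8, 3).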

flowchart LR
    Input --> Function
    Function --> Gradient
    Gradient --> Optimisation


Linear Independence

AI, ML

Machine Learning, Mathematics, Linear Algebra

Linear Independence #

A set of vectors is linearly independent if none of them can be written as a linear combination of the others.

\[ c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \;\Rightarrow\; c_1=\cdots=c_k=0 \]

Independence means each vector adds new information: a direction that is not already in the span of the others.
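
One practical way to test the condition above is to stack the vectors as rows of a matrix and compare its rank with the number of vectors. This is a minimal sketch using exact fractions; `rank` and `is_linearly_independent` are illustrative names, not library functions.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (exact arithmetic)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_linearly_independent(vectors):
    """True when no vector is a linear combination of the others."""
    return rank(vectors) == len(vectors)

print(is_linearly_independent([[1, 2], [3, 4]]))
print(is_linearly_independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))
```

In the second call the third vector is the sum of the first two, so the rank is 2 and the set is dependent.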