MFML Lecture to Course Content Map #

This file maps the uploaded Maths lecture PDFs and webinar PDFs against the official course handout/contact-session plan. It is intended as an exam preparation index and as a source map for future Hugo Markdown notes.

Course identity #

  • Course: Mathematical Foundations for Machine Learning
  • Course code: AIML ZC416
  • Main areas: linear algebra, vector spaces, matrix decompositions, vector calculus, optimisation, PCA, and SVM.

Official module structure #

| Module | Course handout area | Main ideas | Uploaded lecture coverage |
|---|---|---|---|
| M1 | Solution of linear systems | Systems of equations, matrices, solving Ax = b | Lecture 1, Webinar 1 |
| M2 | Vector spaces and analytic geometry | Vector spaces, linear independence, basis, rank, norms, inner products, angles, orthogonality, orthonormal basis | Lecture 2, Lecture 3, Webinar 1 |
| M3 | Matrix decomposition methods | Determinant, trace, eigenvalues, eigenvectors, Cholesky, eigendecomposition, diagonalisation, SVD, matrix approximation | Lecture 4, Lecture 5, Webinar 1, Webinar 2 |
| M4 | Vector calculus | Univariate differentiation, partial derivatives, gradients, matrix gradients, Taylor/Maclaurin series, Hessian, backpropagation, automatic differentiation | Lecture 6, Lecture 7, Lecture 8, Webinar 2 |
| M5 | Continuous optimisation | Gradient descent, constrained optimisation, Lagrange multipliers, convex optimisation | Lecture 9, Lecture 14, Webinar 2, Webinar 3, Webinar 4 |
| M6 | Nonlinear optimisation | Learning rate, initialisation, SGD, feature preprocessing, local optima, cliffs/valleys, momentum, AdaGrad, RMSProp, Adam | Lecture 10, Lecture 11, Webinar 3 |
| M7 | Dimensionality reduction, PCA, SVM | PCA perspectives, low-rank approximation, high-dimensional PCA, practical PCA, SVM preliminaries, primal/dual SVM, kernels | Lecture 12, Lecture 13, Lecture 14, Lecture 15, Webinar 4 |

Contact session by lecture #

| Session | Course handout topic | Uploaded file | What the lecture appears to cover | Exam relevance |
|---|---|---|---|---|
| 1 | Solution of linear systems | Lecture_1.pdf | Linear algebra introduction, closure, systems of linear equations, matrix representation, solution types: no solution, unique solution, infinite solutions, pivot/free variables, matrix operations, inverse, transpose, compact Ax=b form | Very high for Mid-Sem and Comprehensive |
| 2 | Vector spaces, linear independence, basis, rank | Lecture_2.pdf | Groups, Abelian groups, vector spaces, vector subspaces, closure tests, linear combinations, span, linear independence, basis, rank, nullspace/column space ideas | Very high for Mid-Sem and Comprehensive |
| 3 | Analytic geometry | Lecture_3.pdf | Norms, dot product, inner products, bilinear mappings, symmetric positive-definite matrices, lengths, distances, angles, orthogonality, orthonormal basis, Gram-Schmidt ideas | Very high for Mid-Sem and Comprehensive |
| 4 | Matrix Decomposition I | lecture_4.pdf | Determinant, cofactor formula, determinant behaviour under row operations, rank-det relation, eigenvalues/eigenvectors, Cholesky-related positive definite ideas | Very high for Mid-Sem and Comprehensive |
| 5 | Matrix Decomposition II | lecture_5.pdf | Diagonal matrices, diagonalisation, eigendecomposition, spectral theorem for symmetric matrices, SVD, matrix approximation | Very high for Mid-Sem and Comprehensive |
| 6 | Vector Calculus I | lecture_6.pdf | Differentiation of univariate functions, polynomial derivatives, Taylor polynomial/series, partial derivatives, gradients, vector-valued gradients | Very high for Mid-Sem and Comprehensive |
| 7 | Vector Calculus II | lecture_7_edited.pdf | Matrix gradients, useful gradient identities, backpropagation, automatic differentiation, chain rule through neural-network layers | High for Mid-Sem and Comprehensive |
| 8 | Vector Calculus III | lecture_8.pdf | Taylor/Maclaurin series theory, remainder term, two-variable Taylor series, Hessian matrix, maxima/minima, unconstrained optimisation preliminaries | Very high for Mid-Sem and Comprehensive |
| 9 | Continuous Optimisation | Lecture_9.pdf | Gradient descent, negative gradient direction, local minima, step size, line search, convergence intuition, quadratic examples | Very high for Comprehensive; likely useful for quizzes/problems |
| 10 | Nonlinear Optimisation I | Lecture_10.pdf | Initialisation, objective functions in ML, overfitting, feature processing/preprocessing, SGD and practical optimisation behaviour | High for Comprehensive |
| 11 | Nonlinear Optimisation II | Lecture_11.pdf | Difficult topologies: cliffs, valleys, flat regions, curvature; momentum, AdaGrad, RMSProp, Adam | High for Comprehensive |
| 12 | PCA I | Lecture_12.pdf | Dimensionality reduction, PCA problem setting, centred data, covariance, maximum variance perspective, projection perspective | Very high for Comprehensive |
| 13 | PCA II | Lecture_13.pdf | Practical PCA, eigenvector computation, SVD relationship, low-rank approximation, high-dimensional PCA, key PCA steps | Very high for Comprehensive |
| 14 | Mathematical preliminaries for SVM | Lecture 14.pdf | Constrained optimisation, Lagrangian, quadratic programming, primal/dual, weak/strong duality, Slater condition, KKT conditions, kernels, linear classifiers | Very high for Comprehensive |
| 15 | Primal/dual linear SVM | Lecture_15.pdf | SVM primal problem, dual formulation, KKT conditions, support vectors, hinge loss, linear SVM numerical problem, hard/soft-margin direction | Very high for Comprehensive |
| 16 | Nonlinear SVM / kernels | Not clearly uploaded as a separate Lecture 16 PDF | Kernel functions, nonlinear SVM examples; likely partly covered in Lecture 14/15 and webinars | Very high for Comprehensive; gap to fill if Lecture 16 exists |

Webinar mapping #

| Webinar file | Main role | Best linked lectures | Exam use |
|---|---|---|---|
| Webinar_1.pdf | Problem sheet on linear systems, REF/RREF, column space, nullspace, row independence, subspaces, inner products, Cauchy-Schwarz, Cholesky, eigenvalues | Lectures 1-5 | Excellent for Mid-Sem problem practice |
| Webinar_2.pdf | Worked problems on maxima/minima, eigenvalues/spectral decomposition, gradient-related calculations and PCA-style examples | Lectures 4-9, 12-13 | Excellent for Mid-Sem revision and Comprehensive practice |
| Webinar_3.pdf | Gradient descent algorithm, step-size derivation for quadratic functions, worked gradient descent examples | Lectures 8-11 | Excellent for optimisation exam problems |
| webinar_4.pdf | Appears linked to optimisation/SVM/PCA practice based on uploaded set; use as problem-solving supplement after Lecture 12 onwards | Lectures 12-15 | Comprehensive exam practice |

Mid-Sem focus #

The course handout states that the Mid-Semester Test covers Weeks 1-8. So for Mid-Sem, focus on:

MFML Exam Revision Index #

This is a practical revision index for the uploaded Mathematical Foundations for Machine Learning material.

Exam split #

| Exam | Coverage | Main files |
|---|---|---|
| Mid-Semester | Weeks/Sessions 1-8 | Lecture 1 to Lecture 8, Webinar 1, Webinar 2 |
| Comprehensive | Sessions 1-16 | Lecture 1 to Lecture 15, webinars, and any missing Lecture 16/kernel material |

High-priority concept checklist #

Linear systems and matrices #

  • Convert equations into matrix form Ax = b
  • Understand solution types: no solution, unique solution, infinite solutions
  • Identify pivot and free variables
  • Understand row operations, REF/RREF, rank, nullity
  • Know matrix inverse and transpose properties
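
The checklist items on solution types, pivots and rank can be practised together with a small sketch. The example below is an illustration only, not taken from the lectures: it row-reduces an augmented matrix \( [A \mid b] \) with exact fractions, reports the pivot columns, and classifies the system. The function names `rref` and `classify` and the sample system are assumptions chosen for this sketch.

```python
from fractions import Fraction


def rref(rows):
    """Row-reduce a matrix; return (reduced rows, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots, r = [], 0
    for c in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        # Clear column c in every other row.
        for i in range(n_rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def classify(augmented):
    """Classify Ax = b from its augmented matrix [A | b]."""
    n_unknowns = len(augmented[0]) - 1
    _, pivots = rref(augmented)
    if n_unknowns in pivots:
        return "no solution"  # a pivot in the b column means a row 0 = 1
    if len(pivots) == n_unknowns:
        return "unique solution"
    return "infinitely many solutions"


# Hypothetical system: x + 2y + z = 4, 2x + 4y + 3z = 9, 3x + 6y + 4z = 13
system = [[1, 2, 1, 4], [2, 4, 3, 9], [3, 6, 4, 13]]
reduced, pivots = rref(system)
# pivots == [0, 2], so x and z are pivot variables and y is free; rank is 2.
print(pivots, classify(system))
```

Reading the reduced matrix row by row gives \( x = 3 - 2y \) and \( z = 1 \), with \( y \) free.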

Vector spaces #

  • Definition of vector space and subspace
  • Closure under addition and scalar multiplication
  • Span, linear combination, linear independence
  • Basis, dimension, rank
  • Column space, row space, nullspace

Analytic geometry #

  • Norm properties
  • Manhattan norm and Euclidean norm
  • Inner product definition
  • Symmetric positive-definite matrices
  • Distance, angle, orthogonality
  • Orthonormal basis and Gram-Schmidt
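
For the last item, a minimal sketch of Gram-Schmidt under the standard dot product may help with revision. It uses the modified variant (each projection is subtracted from the running remainder), which may differ in ordering from the version presented in Lecture 3. The helper names `dot` and `gram_schmidt` and the tolerance are assumptions.

```python
import math


def dot(u, v):
    """Standard dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of the input vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along each basis vector found so far.
        for q in basis:
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


q = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for vec in q:
    print([round(x, 4) for x in vec])
```

Each returned vector has Euclidean norm 1 and is orthogonal to the others, which is easy to confirm by hand with `dot`.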

Matrix decompositions #

  • Determinant and trace
  • Cofactor expansion
  • Row operation effect on determinant
  • Eigenvalue equation Av = λv
  • Characteristic equation det(A - λI) = 0
  • Diagonalisation A = PDP^{-1}
  • Spectral theorem for symmetric matrices
  • Cholesky decomposition
  • SVD A = UΣV^T
  • Low-rank approximation
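
As a quick self-check on several of these items, here is a small worked example. The matrix is chosen for illustration and is not taken from the lectures.

\[ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \det(A - \lambda I) = (2 - \lambda)^2 - 1 = (\lambda - 1)(\lambda - 3) = 0 \]

So \( \lambda_1 = 3 \) and \( \lambda_2 = 1 \), which agrees with the trace (\( 3 + 1 = 4 \)) and the determinant (\( 3 \cdot 1 = 3 \)). Unit eigenvectors are

\[ v_1 = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad v_2 = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \]

Because \( A \) is symmetric, the eigenvectors are orthogonal and \( P^{-1} = P^T \):

\[ A = P D P^{-1}, \qquad P = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad D = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \]

Since \( A \) is also positive definite, its singular values equal its eigenvalues, and the best rank-1 approximation keeps only the largest one:

\[ A_1 = 3\, v_1 v_1^T = \tfrac{3}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \]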

Vector calculus #

  • Derivative from first principles
  • Partial derivatives
  • Gradient as direction of steepest ascent
  • Gradient of vector-valued functions
  • Matrix-gradient identities
  • Chain rule
  • Backpropagation and automatic differentiation
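
A common way to check a hand-derived gradient is to compare it with a central finite-difference estimate. The sketch below does this for a hypothetical function \( f(x_1, x_2) = \sin(x_1 x_2) + x_1^2 \); the function, the step size `h` and the helper names are assumptions for illustration. It is a numerical check, not automatic differentiation.

```python
import math


def f(x):
    """Example scalar function of two variables."""
    return math.sin(x[0] * x[1]) + x[0] ** 2


def grad_f(x):
    """Analytic gradient obtained with the chain rule."""
    c = math.cos(x[0] * x[1])
    return [x[1] * c + 2 * x[0], x[0] * c]


def numerical_grad(func, x, h=1e-6):
    """Central-difference estimate of each partial derivative."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad


point = [0.5, -1.2]
print(grad_f(point))
print(numerical_grad(f, point))
```

The two printed gradients should agree to several decimal places; a large mismatch usually points to a missed chain-rule factor.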

Taylor series and Hessian #

  • Taylor polynomial
  • Taylor series and Maclaurin series
  • Remainder term
  • Taylor series in two variables
  • Hessian matrix
  • First derivative and second derivative tests
  • Maxima, minima and saddle points
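
A short worked example of the second derivative test, using a function chosen for illustration rather than one from the lectures:

\[ f(x, y) = x^3 - 3x + y^2, \qquad \nabla f = \begin{pmatrix} 3x^2 - 3 \\ 2y \end{pmatrix}, \qquad H = \begin{pmatrix} 6x & 0 \\ 0 & 2 \end{pmatrix} \]

Setting \( \nabla f = 0 \) gives the critical points \( (1, 0) \) and \( (-1, 0) \). At \( (1, 0) \) the Hessian is \( \mathrm{diag}(6, 2) \), which is positive definite, so the point is a local minimum. At \( (-1, 0) \) the Hessian is \( \mathrm{diag}(-6, 2) \), which is indefinite, so the point is a saddle.

The second-order Taylor polynomial around \( (1, 0) \), where \( f(1, 0) = -2 \) and the gradient vanishes, is

\[ f(x, y) \approx -2 + \tfrac{1}{2} \left( 6 (x - 1)^2 + 2 y^2 \right) = -2 + 3 (x - 1)^2 + y^2 \]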

Gradient descent and optimisation #

  • Negative gradient direction
  • Learning rate/step size
  • Line search
  • Convergence and local minima
  • Constrained vs unconstrained optimisation
  • Lagrange multipliers
  • Convex optimisation
  • SGD and optimisation in ML
  • Feature preprocessing and scaling
  • Overfitting in optimisation examples
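
The step-size and line-search items can be made concrete on a quadratic. For \( f(x) = \tfrac{1}{2} x^T A x - b^T x \) with \( A \) symmetric positive definite, the gradient is \( g = Ax - b \), and minimising \( f(x - \alpha g) \) over \( \alpha \) gives \( \alpha = \frac{g^T g}{g^T A g} \). The sketch below applies that rule; the matrix, vector, helper names and stopping tolerance are assumptions for illustration.

```python
def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, x) for row in A]


def steepest_descent(A, b, x0, iters=50, tol=1e-20):
    """Gradient descent with exact line search on 0.5 x^T A x - b^T x."""
    x = list(x0)
    for _ in range(iters):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient Ax - b
        gg = dot(g, g)
        if gg < tol:  # gradient is numerically zero, stop
            break
        alpha = gg / dot(g, matvec(A, g))  # exact step for a quadratic
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
x_min = steepest_descent(A, b, [0.0, 0.0])
print(x_min)  # approaches the solution of Ax = b, which is (0.2, 0.4)
```

The minimiser of this quadratic is the solution of \( Ax = b \), so the result can be checked by substituting back into the two equations.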

Nonlinear optimisation algorithms #

  • Difficult surfaces: cliffs, valleys, flat regions
  • Curvature and why first-order methods can struggle
  • Momentum update and intuition
  • AdaGrad
  • RMSProp
  • Adam
  • Learning rate decay
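
To see how the moving averages in Adam fit together, here is a minimal one-dimensional sketch of the commonly published update with bias correction. The notation in Lecture 11 may differ; the function name `adam_minimise`, the hyperparameter defaults and the test objective \( f(x) = (x - 3)^2 \) are assumptions for illustration.

```python
import math


def adam_minimise(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Run Adam on a scalar parameter and return the final value."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum-like)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (RMSProp-like)
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Gradient of the hypothetical objective f(x) = (x - 3)^2
x_final = adam_minimise(lambda x: 2 * (x - 3), x0=0.0)
print(x_final)
```

Dropping the first-moment line (using `g` directly) gives an RMSProp-style update, and replacing the exponential average in `v` with a running sum gives an AdaGrad-style update, which is a useful way to remember how the three methods relate.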

PCA #

  • Dimensionality reduction problem
  • Centred data and covariance matrix
  • Maximum variance view
  • Projection/reconstruction view
  • Principal components as eigenvectors of covariance matrix
  • SVD relation to PCA
  • Low-rank approximation and Eckart-Young theorem
  • PCA in high dimensions
  • Practical PCA steps
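
The practical steps can be traced on a tiny two-dimensional dataset where everything is computable by hand. The sketch below centres the data, forms the covariance matrix, and uses the closed-form eigenvalues of a symmetric 2x2 matrix. It assumes the covariance is divided by \( N \) (some texts use \( N - 1 \)); the function name `pca_2d` and the data points are hypothetical.

```python
import math


def pca_2d(points):
    """PCA for 2-D points: return mean, eigenvalues, first PC, scores."""
    n = len(points)
    mean = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    centred = [(p[0] - mean[0], p[1] - mean[1]) for p in points]
    # Covariance matrix [[a, b], [b, d]], dividing by N (assumption).
    a = sum(x * x for x, _ in centred) / n
    b = sum(x * y for x, y in centred) / n
    d = sum(y * y for _, y in centred) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    lam1, lam2 = mid + rad, mid - rad
    # Eigenvector for the largest eigenvalue.
    if b != 0:
        v = (b, lam1 - a)
    else:
        v = (1.0, 0.0) if a >= d else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    pc1 = (v[0] / norm, v[1] / norm)
    # One-dimensional codes: projections of centred data onto pc1.
    scores = [x * pc1[0] + y * pc1[1] for x, y in centred]
    return mean, (lam1, lam2), pc1, scores


data = [(5.0, 6.0), (4.0, 7.0), (1.0, 4.0), (2.0, 3.0)]
mean, (lam1, lam2), pc1, scores = pca_2d(data)
print(mean, lam1, lam2, pc1)
print("explained variance ratio:", lam1 / (lam1 + lam2))
```

For this data the mean is \( (3, 5) \), the covariance matrix has entries 2.5 on the diagonal and 2 off the diagonal, and the eigenvalues are 4.5 and 0.5, so the first principal component along \( (1, 1)/\sqrt{2} \) captures 90% of the variance.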

SVM #

  • Linear classifiers
  • Margin and support vectors
  • Hard-margin SVM primal formulation
  • Lagrangian for SVM
  • KKT conditions
  • Primal vs dual perspective
  • Role of inner products
  • Kernel trick
  • Hinge loss
  • Soft-margin SVM
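
For reference while revising, the standard textbook form of the hard-margin problem is summarised here. The notation (training pairs \( (x_n, y_n) \) with \( y_n \in \{-1, +1\} \), multipliers \( \alpha_n \)) is an assumption and may differ from Lectures 14 and 15.

Primal problem:

\[ \min_{w, b} \ \tfrac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_n (w^T x_n + b) \ge 1, \quad n = 1, \dots, N \]

Lagrangian, with \( \alpha_n \ge 0 \):

\[ L(w, b, \alpha) = \tfrac{1}{2} \|w\|^2 - \sum_{n=1}^{N} \alpha_n \left[ y_n (w^T x_n + b) - 1 \right] \]

Setting the derivatives with respect to \( w \) and \( b \) to zero gives

\[ w = \sum_{n=1}^{N} \alpha_n y_n x_n, \qquad \sum_{n=1}^{N} \alpha_n y_n = 0 \]

Substituting back yields the dual problem, which depends on the data only through inner products:

\[ \max_{\alpha} \ \sum_{n=1}^{N} \alpha_n - \tfrac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \alpha_n \alpha_m y_n y_m \, x_n^T x_m \quad \text{subject to} \quad \alpha_n \ge 0, \quad \sum_{n=1}^{N} \alpha_n y_n = 0 \]

Complementary slackness from the KKT conditions,

\[ \alpha_n \left[ y_n (w^T x_n + b) - 1 \right] = 0, \]

means \( \alpha_n \) can be nonzero only for points lying exactly on the margin, which are the support vectors.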

Suggested revision order #

Phase 1: Foundations #

  1. Lecture 1
  2. Lecture 2
  3. Lecture 3
  4. Webinar 1 problems related to REF, nullspace, column space and subspaces

Phase 2: Matrix decompositions #

  1. Lecture 4
  2. Lecture 5
  3. Webinar 1 and Webinar 2 eigenvalue/eigendecomposition problems

Phase 3: Calculus and optimisation foundations #

  1. Lecture 6
  2. Lecture 7
  3. Lecture 8
  4. Webinar 2 maxima/minima and Hessian problems

Phase 4: Optimisation for ML #

  1. Lecture 9
  2. Lecture 10
  3. Lecture 11
  4. Webinar 3 gradient-descent step-size problems

Phase 5: PCA and SVM #

  1. Lecture 12
  2. Lecture 13
  3. Lecture 14
  4. Lecture 15
  5. Webinar 4 / SVM problems

What to ask me next #

Use these prompts when generating Hugo pages:

MFML Topic to Source Index #

This index tells you where to look when you want to create future notes or revise a topic.

| Topic | Primary source PDFs | Supporting source PDFs | Future Hugo page |
|---|---|---|---|
| Linear systems | Lecture 1 | Webinar 1 | 01-linear-systems-and-matrices.md |
| Matrix operations | Lecture 1 | Webinar 1 | 01-linear-systems-and-matrices.md |
| Vector spaces | Lecture 2 | Webinar 1 | 02-vector-spaces-subspaces-basis-rank.md |
| Subspaces | Lecture 2 | Webinar 1 | 02-vector-spaces-subspaces-basis-rank.md |
| Linear independence, span, basis | Lecture 2 | Webinar 1 | 02-vector-spaces-subspaces-basis-rank.md |
| Rank and nullity | Lecture 2 | Webinar 1 | 02-vector-spaces-subspaces-basis-rank.md |
| Norms and distances | Lecture 3 | Webinar 1 | 03-analytic-geometry-norms-inner-products.md |
| Inner products | Lecture 3 | Webinar 1 | 03-analytic-geometry-norms-inner-products.md |
| Orthogonality and Gram-Schmidt | Lecture 3 | Webinar 1 | 03-analytic-geometry-norms-inner-products.md |
| Determinant and trace | Lecture 4 | Webinar 1 | 04-determinants-trace-eigenvalues.md |
| Eigenvalues/eigenvectors | Lecture 4 | Webinar 1, Webinar 2 | 04-determinants-trace-eigenvalues.md |
| Cholesky | Lecture 4 | Webinar 1 | 04-determinants-trace-eigenvalues.md |
| Diagonalisation | Lecture 5 | Webinar 2 | 05-eigendecomposition-svd-matrix-approximation.md |
| Eigendecomposition | Lecture 5 | Webinar 2 | 05-eigendecomposition-svd-matrix-approximation.md |
| SVD | Lecture 5 | Lecture 13, Webinar 1 | 05-eigendecomposition-svd-matrix-approximation.md |
| Differentiation | Lecture 6 | Webinar 2 | 06-vector-calculus-gradients.md |
| Gradients | Lecture 6, Lecture 7 | Webinar 2, Webinar 3 | 06-vector-calculus-gradients.md |
| Backpropagation | Lecture 7 | | 07-backpropagation-automatic-differentiation.md |
| Automatic differentiation | Lecture 7 | | 07-backpropagation-automatic-differentiation.md |
| Taylor/Maclaurin series | Lecture 6, Lecture 8 | Webinar 2 | 08-taylor-series-hessian-maxima-minima.md |
| Hessian | Lecture 8 | Webinar 2 | 08-taylor-series-hessian-maxima-minima.md |
| Maxima/minima | Lecture 8 | Webinar 2 | 08-taylor-series-hessian-maxima-minima.md |
| Gradient descent | Lecture 9 | Webinar 3 | 09-gradient-descent-continuous-optimisation.md |
| Step size / line search | Lecture 9 | Webinar 3 | 09-gradient-descent-continuous-optimisation.md |
| Constrained optimisation | Lecture 9, Lecture 14 | Webinar 4 | 14-lagrangian-duality-kkt.md |
| Lagrange multipliers | Lecture 14 | Webinar 4 | 14-lagrangian-duality-kkt.md |
| KKT conditions | Lecture 14, Lecture 15 | Webinar 4 | 14-lagrangian-duality-kkt.md |
| Feature preprocessing | Lecture 10 | | 10-nonlinear-optimisation-sgd-feature-preprocessing.md |
| Overfitting | Lecture 10 | | 10-nonlinear-optimisation-sgd-feature-preprocessing.md |
| SGD | Lecture 10 | Webinar 3 | 10-nonlinear-optimisation-sgd-feature-preprocessing.md |
| Cliffs and valleys | Lecture 11 | | 11-momentum-adagrad-rmsprop-adam.md |
| Momentum | Lecture 11 | Webinar 3 | 11-momentum-adagrad-rmsprop-adam.md |
| AdaGrad, RMSProp, Adam | Lecture 11 | | 11-momentum-adagrad-rmsprop-adam.md |
| PCA foundations | Lecture 12 | Webinar 4 | 12-pca-foundations.md |
| PCA computation | Lecture 13 | Webinar 4 | 13-pca-practical-computation-svd.md |
| Low-rank PCA | Lecture 13 | Lecture 5 | 13-pca-practical-computation-svd.md |
| SVM preliminaries | Lecture 14 | Webinar 4 | 15-support-vector-machines.md |
| Linear SVM | Lecture 15 | Webinar 4 | 15-support-vector-machines.md |
| Hinge loss | Lecture 15 | Webinar 4 | 15-support-vector-machines.md |
| Kernels / nonlinear SVM | Lecture 14/15, possibly missing Lecture 16 | Webinar 4 | 16-nonlinear-svm-kernels.md |

Primal and Dual Perspective for Linear SVM #

A linear Support Vector Machine finds a hyperplane that separates two classes with the maximum possible margin.

The primal view gives the direct geometric optimisation problem. The dual view rewrites the problem using Lagrange multipliers and reveals why only support vectors matter.

Key takeaway: Linear SVM maximises the margin by minimising \( \frac{1}{2}\|w\|^2 \) subject to correct-classification constraints. The dual solution expresses \( w \)

Mathematical Preliminaries for SVM #

Support Vector Machines use optimisation, geometry and kernels. Before deriving SVM, we need constrained optimisation, Lagrange multipliers, primal and dual problems, KKT conditions, hyperplanes and kernel functions.

Key takeaway: SVM is built on constrained optimisation. The hard-margin SVM primal problem is a quadratic optimisation problem with linear inequality constraints. The dual problem uses Lagrange multipliers and leads naturally to support vectors and kernels.

Nonlinear SVM #

A linear SVM works well when the data can be separated by a straight line or hyperplane. When the data is not linearly separable in the original input space, nonlinear SVM maps the data to a higher-dimensional feature space where a linear separator may exist.
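
The idea that an inner product in a higher-dimensional feature space can be computed without building the feature vectors is easy to verify numerically. The sketch below uses the homogeneous degree-2 polynomial kernel \( k(x, z) = (x^T z)^2 \) on two-dimensional inputs, whose explicit feature map is \( \phi(x) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2) \). This kernel is a standard textbook choice, not necessarily the one used in the lectures, and the function names are assumptions.

```python
import math


def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def poly_kernel(x, z):
    """Kernel value computed directly in the original input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))


x, z = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(x, z))      # kernel evaluated in 2-D
print(dot(phi(x), phi(z)))    # same quantity via the 3-D feature map
```

Both lines print the same value up to floating-point rounding, which is the reason a dual SVM can work with kernel evaluations alone.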

Key takeaway: Nonlinear SVM uses the kernel trick. Instead of explicitly mapping \( x \) to \( \phi(x) \)