Support Vector Machine (SVM) #

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for:

  • Classification (most common)
  • Regression (SVR – Support Vector Regression)

The goal is to find the decision boundary that separates the classes with the maximum margin.

A Support Vector Machine is a supervised learning algorithm that finds an optimal hyperplane by maximising the margin between classes, using support vectors and kernel functions to handle non-linear data.
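
A minimal sketch of this idea with scikit-learn (the toy data, labels, and `C` value below are illustrative assumptions):

```python
# Train a maximum-margin linear SVM on two separable 2-D clusters.
import numpy as np
from sklearn.svm import SVC

# Toy data: two linearly separable classes.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel finds the maximum-margin hyperplane;
# swapping kernel="rbf" handles non-linear boundaries.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The support vectors are the training points that define the margin.
print(clf.support_vectors_)
print(clf.predict([[0.5, 0.5], [3.5, 3.5]]))
```

The kernel argument is what lets the same algorithm cover both the linear case and non-linear data.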

Deep Recurrent Neural Networks #

Vanilla RNNs introduce the hidden-state idea, but they struggle on longer and more complex sequences because gradients can vanish across time. Deep recurrent models extend the RNN idea in two important ways:

  1. make the recurrent architecture richer, for example by stacking multiple recurrent layers or using information from both directions,
  2. use gates and memory cells to control what should be remembered, forgotten, updated, and exposed.

This is why practical recurrent modelling usually moves from a simple RNN to stacked RNNs, bidirectional RNNs, GRUs, or LSTMs.
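
A minimal PyTorch sketch of the two extensions above: `num_layers=2` stacks recurrent layers, `bidirectional=True` uses information from both directions, and the LSTM cell supplies the gates and memory (all sizes below are arbitrary assumptions):

```python
# A stacked, bidirectional LSTM: depth + both directions + gated memory.
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.LSTM(
    input_size=8,        # features per time step
    hidden_size=16,      # size of each hidden state
    num_layers=2,        # stacked recurrent layers
    bidirectional=True,  # read the sequence in both directions
    batch_first=True,
)

x = torch.randn(4, 10, 8)  # (batch, time, features)
out, (h, c) = rnn(x)
print(out.shape)           # (4, 10, 32): 2 directions x 16 hidden units
```

The output concatenates the forward and backward hidden states at every time step, which is why its last dimension is twice the hidden size.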

Attention Mechanism #

  • Queries, Keys, and Values
  • Attention Pooling by Similarity
  • Attention Pooling via Nadaraya–Watson Regression
  • Attention Scoring Functions
  • Dot Product Attention
  • Convenience Functions
  • Scaled Dot Product Attention
  • Additive Attention
  • Bahdanau Attention Mechanism
  • Multi-Head Attention
  • Self-Attention
  • Positional Encoding
  • Code implementation (webinar)
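
Several items in the list above (scoring functions, scaled dot product attention) reduce to one formula, softmax(QKᵀ/√d)·V; a minimal NumPy sketch with arbitrary toy shapes:

```python
# Scaled dot-product attention: weight values by query-key similarity.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights    # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))    # 2 queries of dimension 4
K = rng.standard_normal((3, 4))    # 3 keys
V = rng.standard_normal((3, 4))    # 3 values
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.sum(axis=-1))   # (2, 4); each weight row sums to 1
```

Dividing by √d keeps the scores at a reasonable scale so the softmax does not saturate for large dimensions.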

Reference #

  • Dive into Deep Learning. Cambridge University Press. (Ch. 10, Ch. 7)

Transformer #

  • a neural network architecture based on the multi-head attention mechanism

  • text is converted into numerical representations called tokens, and each token is mapped to a vector by lookup in a word-embedding table

  • takes a text sequence as input and produces another text sequence as output

  • the foundation for modern Large Language Models (LLMs) such as ChatGPT and Gemini

  • Transformer architecture

  • Model, Positionwise Feed-Forward Networks, Residual Connection and Layer Normalization
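
The token-to-vector step described above can be sketched in NumPy (the toy vocabulary, sentence, and embedding dimension are illustrative assumptions):

```python
# Tokens become integer ids; each id indexes a row of an embedding table.
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2}  # toy vocabulary
d_model = 4                             # embedding dimension
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((len(vocab), d_model))

tokens = ["the", "cat", "sat"]
ids = [vocab[t] for t in tokens]
vectors = embedding_table[ids]          # lookup: one row per token
print(vectors.shape)                    # (3, 4): one d_model vector per token
```

In a real Transformer the table is learned during training and positional encodings are added to these vectors before the attention layers.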

Optimisation of Deep models #

  • Goal of Optimization
  • Optimization Challenges in Deep Learning
  • Gradient Descent
  • Stochastic Gradient Descent
  • Minibatch Stochastic Gradient Descent
  • Momentum
  • Adagrad and its algorithm
  • RMSProp and its algorithm
  • Adadelta and its algorithm
  • Adam and its algorithm
  • Code Implementation and comparison of algorithms (webinar)
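
As a sketch of the first two ideas in the list, gradient descent with and without momentum on a toy quadratic loss f(w) = (w − 3)², whose gradient is 2(w − 3) (the learning rate and momentum coefficient are illustrative assumptions):

```python
# Compare plain gradient descent with momentum on a 1-D quadratic.
def grad(w):
    return 2.0 * (w - 3.0)   # gradient of f(w) = (w - 3)^2

lr = 0.1

# Plain gradient descent: step against the current gradient.
w = 0.0
for _ in range(100):
    w -= lr * grad(w)

# Momentum: a velocity term accumulates past gradients.
wm, v, beta = 0.0, 0.0, 0.9
for _ in range(100):
    v = beta * v + grad(wm)
    wm -= lr * v

print(w, wm)   # both approach the minimiser w* = 3
```

On this convex toy problem both converge; momentum mainly pays off on ill-conditioned or noisy objectives, where it damps oscillations across steep directions.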

Reference #

  • Dive into Deep Learning. Cambridge University Press. (Ch. 12)

Regularisation for Deep models #

  • Generalization for regression
  • Training Error and Generalization Error
  • Underfitting or Overfitting
  • Model Selection
  • Weight Decay and Norms
  • Generalization in Classification
  • Environment and Distribution Shift
  • Generalization in Deep Learning
  • Dropout
  • Batch Normalization
  • Layer Normalization
  • Code implementation (webinar)
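
Dropout, one of the regularisers listed above, can be sketched in a few lines of NumPy (this is the common "inverted dropout" variant; the drop probability and activation values are illustrative assumptions):

```python
# Inverted dropout: zero each activation with probability p during
# training and rescale survivors so the expected activation is unchanged.
import numpy as np

def dropout(h, p, rng):
    if p == 0.0:
        return h
    mask = rng.random(h.shape) > p   # keep each unit with probability 1 - p
    return h * mask / (1.0 - p)      # rescale so E[output] == h

rng = np.random.default_rng(0)
h = np.ones((2, 8))
print(dropout(h, 0.5, rng))  # roughly half the entries 0, the rest 2.0
```

At test time no mask is applied; because of the rescaling during training, the network can be used unchanged.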

Linear Algebra #

Linear Algebra is the study of vectors and matrices.

It provides the mathematical language used to represent data, transformations, and structure in machine learning.


Why Linear Algebra Matters in ML #

  • Every machine learning model uses matrices
  • All data in ML is represented using vectors and matrices
  • Neural networks are pipelines of matrix operations
  • Models apply matrix transformations to data
  • Optimisation relies on linear algebra operations

What to Learn #

  • Scalars, vectors, and matrices
  • Vector operations (addition, dot product)
  • Matrix multiplication (critical)
  • Identity matrices and transpose
  • Eigenvalues and eigenvectors (conceptual understanding)

  • Scalar → a number
  • Vector → a directed point
  • Matrix → a space transformer
  • Linear transformation → structured mapping
  • Feature → one axis
  • Feature space → where data lives
  • Vector space → where vectors live
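
The vocabulary above can be made concrete with a small NumPy sketch: a data matrix (rows = samples, columns = features) transformed by a weight matrix, which is exactly the operation a neural-network layer applies (the numbers are illustrative):

```python
# A matrix as a "space transformer": map 2-D feature vectors to 3-D.
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])       # 2 samples, 2 features each
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])  # linear transformation: 2-D -> 3-D

Y = X @ W                        # matrix multiplication (the critical op)
print(Y)
# [[1. 2. 3.]
#  [3. 4. 7.]]
print(W.T.shape)                 # transpose swaps the axes: (3, 2)
```

Each row of Y is one transformed sample: the matrix multiplication applies the same structured mapping to every point of the feature space at once.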
