Linear Algebra

Characteristic Polynomial #

The characteristic polynomial of a square matrix is the key tool used to compute eigenvalues.

It connects:

  • Determinants
  • Trace
  • Eigenvalues
  • Matrix structure

Definition #

Let
\( A \in \mathbb{R}^{n \times n} \)
and \( \lambda \in \mathbb{R} \) .

The characteristic polynomial of \( A \) is defined as:

\[ p_A(\lambda) = \det(A - \lambda I) \]

It is a polynomial in \( \lambda \) of degree \( n \).
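
For a \( 2 \times 2 \) matrix the definition expands to \( p_A(\lambda) = \lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A) \), which makes the link to trace and determinant explicit. As a minimal NumPy sketch (illustrative, not part of the original notes), the roots of the characteristic polynomial are exactly the eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.poly returns the coefficients of det(A - lambda I), highest degree first.
coeffs = np.poly(A)
print(coeffs)                 # [ 1. -7. 10.]  ->  lambda^2 - 7*lambda + 10

# The roots of p_A(lambda) coincide with the eigenvalues of A.
print(np.roots(coeffs))       # [5. 2.]
print(np.linalg.eigvals(A))   # [5. 2.] (order may differ)
```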

Determinant and Trace #


Minor #

The minor of an element \( a_{ij} \) is the determinant of the \( (n-1) \times (n-1) \) submatrix formed by:

  • removing row \( i \)
  • removing column \( j \)

The minor is denoted \( M_{ij} \) .

Minors are used to compute cofactors, which are used for determinants and inverses (via adjoint/adjugate).


Cofactor #

The cofactor of \( a_{ij} \), denoted \( C_{ij} \), is the signed minor:

\[ C_{ij} = (-1)^{i+j} M_{ij} \]
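
As an illustrative sketch (not from the notes), minors and cofactors can be computed directly by deleting a row and a column; cofactor expansion along a row then reproduces the determinant:

```python
import numpy as np

def minor(A, i, j):
    """Determinant of A with row i and column j removed (0-indexed)."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(A, i, j):
    """Signed minor: C_ij = (-1)**(i + j) * M_ij."""
    return (-1) ** (i + j) * minor(A, i, j)

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])

# Cofactor expansion along the first row: det(A) = sum_j a_0j * C_0j.
det_by_expansion = sum(A[0, j] * cofactor(A, 0, j) for j in range(3))
print(det_by_expansion, np.linalg.det(A))   # both ~ 22.0
```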

Eigenvalues and Eigenvectors #

  • Eigenvalues give scaling.
  • Eigenvectors define invariant directions of the transformation.

Eigenvalues and eigenvectors describe directions that remain unchanged under a linear transformation, except for scaling. Formally, a nonzero vector \( v \) is an eigenvector of \( A \) with eigenvalue \( \lambda \) if:

\[ A v = \lambda v \]

From lectures: matrix multiplication represents a transformation of space. Most vectors change both direction and magnitude under the transformation; some special vectors are only scaled. These are the eigenvectors.

Key Idea: A matrix transformation stretches or compresses vectors. Eigenvectors are directions that remain unchanged. Eigenvalues tell how much scaling happens.
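
A quick NumPy check (illustrative, not from the lectures) that eigenvectors are only scaled by the transformation:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # [3. 1.]

# Each column v of `eigenvectors` satisfies A v = lambda v.
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```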

Cholesky Decomposition #

Cholesky decomposition is a special matrix factorisation used for symmetric positive definite matrices.

From lecture discussions, this decomposition is powerful because it factorises the matrix into triangular form, making computations easier and more numerically stable.

Key Idea: Cholesky decomposition expresses a matrix as a product of a lower triangular matrix and its transpose. It is efficient and numerically stable.


Definition #

For a symmetric positive definite matrix \( A \in \mathbb{R}^{n \times n} \):

\[ A = L L^\top \]

where \( L \) is a lower triangular matrix with positive diagonal entries.
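
A minimal sketch (assuming NumPy and SciPy; not part of the original notes): `np.linalg.cholesky` returns the lower triangular factor, and a linear system can then be solved with two cheap triangular solves:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])     # symmetric positive definite

L = np.linalg.cholesky(A)      # lower triangular, A = L @ L.T
print(np.allclose(L @ L.T, A))   # True

# Solve A x = b: first L y = b, then L.T x = y.
b = np.array([1.0, 2.0])
y = solve_triangular(L, b, lower=True)
x = solve_triangular(L.T, y, lower=False)
print(np.allclose(A @ x, b))     # True
```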

Eigen Decomposition #

Eigen decomposition expresses a matrix using its eigenvectors and eigenvalues:

\[ A = P D P^{-1} \]

where the columns of \( P \) are the eigenvectors of \( A \) and \( D \) is a diagonal matrix holding the corresponding eigenvalues.

From lecture discussions, this is one of the most important ways to understand the internal structure of a matrix.

Instead of treating the matrix as a black box, eigen decomposition reveals its fundamental directions and scaling behaviour.

Key Idea: Eigen decomposition rewrites a matrix in terms of directions (eigenvectors) and scaling factors (eigenvalues). This makes complex transformations easier to understand and compute.
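
A short NumPy sketch (illustrative, not from the lectures) that reconstructs a matrix from its eigenvectors and eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors
D = np.diag(eigvals)

# A = P D P^{-1} holds here because A has n independent eigenvectors.
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
```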

Diagonalization #

Diagonalisation expresses a matrix using its eigenvectors and eigenvalues when possible.

From lecture explanation, diagonalisation is one of the most powerful tools because it converts a complicated matrix into a much simpler form.

Instead of working with a full matrix, we work with a diagonal matrix, which is much easier to analyse and compute.

Key Idea: If an \( n \times n \) matrix has \( n \) linearly independent eigenvectors, it can be rewritten as a diagonal matrix using a change of basis. This simplifies matrix operations significantly.
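
One payoff, sketched with NumPy (illustrative, not from the lectures): matrix powers become trivial in the eigenbasis, since \( A^k = P D^k P^{-1} \) and \( D^k \) just raises each diagonal entry to the \( k \)-th power:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)
k = 5
A_pow = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)

print(np.allclose(A_pow, np.linalg.matrix_power(A, k)))   # True
```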

Singular Value Decomposition (SVD) #

Singular Value Decomposition (SVD) is one of the most important matrix decomposition techniques in linear algebra and machine learning.

It factorises any matrix into three simpler matrices that reveal its structure.

Key Idea: SVD decomposes a matrix into rotations + scaling. It tells us how data is transformed along orthogonal directions.


Definition #

For any matrix \( A \in \mathbb{R}^{m \times n} \), the SVD is the factorisation:

\[ A = U \Sigma V^\top \]

where \( U \in \mathbb{R}^{m \times m} \) and \( V \in \mathbb{R}^{n \times n} \) are orthogonal and \( \Sigma \in \mathbb{R}^{m \times n} \) is diagonal with the non-negative singular values on its diagonal.
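
A minimal NumPy sketch (not part of the original notes) of the full SVD and the reconstruction \( A = U \Sigma V^\top \):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])    # any shape works, here 3x2

U, s, Vt = np.linalg.svd(A)   # s holds the singular values, descending
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(U @ Sigma @ Vt, A))   # True
```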

Optimisation using Gradient Descent #

Gradient descent is an optimisation algorithm used to train machine learning models and neural networks. It updates the parameters by moving opposite the gradient of the loss:

\[ \theta \leftarrow \theta - \eta \, \nabla_\theta L(\theta) \]

where \( L \) is the loss function and \( \eta \) is the learning rate.

It trains a model by minimising the error:

  • between predicted and actual results
  • by iteratively adjusting the model's parameters
  • moving step by step in the direction of steepest decrease of the loss function, which helps the model learn the weights that give the best predictions (see the sketch after the list of variants below)

Types of Gradient Descent learning algorithms #

  1. Batch gradient descent
  2. Stochastic gradient descent
  3. Mini-batch gradient descent
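
As a sketch of the update rule above (illustrative values, not from the lectures), here is batch gradient descent fitting a least-squares linear regression, with loss \( L(w) = \lVert Xw - y \rVert^2 / n \):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(2)    # initial parameters
eta = 0.1          # learning rate (assumed value)

for _ in range(200):
    grad = 2 / len(y) * X.T @ (X @ w - y)   # gradient over the full batch
    w -= eta * grad                          # step opposite the gradient

print(w)   # close to [ 2. -1.]
```

Stochastic and mini-batch gradient descent use the same update, but estimate the gradient from a single example or a small batch instead of the full dataset.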
