Mathematics

Vector Spaces

Vector Spaces #

A vector space is the mathematical “home” where vectors live and where addition and scaling are valid operations.

  • A vector space is a set closed under vector addition and scalar multiplication.

  • Machine learning operates in vector spaces.

  • The topic covers linear independence, bases, rank, and geometric tools such as norms and inner products, which are used to measure length, distance, and angles.

A vector space is a set of vectors that satisfies ten axioms, defined under two operations: vector addition and scalar multiplication.
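
As a small illustration of the two operations, the closure requirements can be written as follows (the scalar field is taken to be \( \mathbb{R} \), consistent with the rest of these notes):

\[ \mathbf{u}, \mathbf{v} \in V \;\Rightarrow\; \mathbf{u} + \mathbf{v} \in V, \qquad c \in \mathbb{R},\ \mathbf{v} \in V \;\Rightarrow\; c\,\mathbf{v} \in V \]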

Feature Space

Feature #

A feature is an individual measurable property or characteristic of a data point used as input to a machine learning model.

Each feature corresponds to one dimension.

\[ x_i \in \mathbb{R} \]

A data point with \( d \) features is represented as a vector in \( d \)-dimensional space:

\[ \mathbf{x} = (x_1, x_2, \dots, x_d)^\top \in \mathbb{R}^d \]
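
A minimal sketch of this representation in code (assuming NumPy, which is not part of the notes; the feature values are made up):

```python
import numpy as np

# One data point with d = 4 features (hypothetical values),
# i.e. a vector x in R^4.
x = np.array([5.1, 3.5, 1.4, 0.2])

d = x.shape[0]   # number of features = dimension of the feature space
print(d)         # 4
print(x[0])      # the first feature, x_1
```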

Cauchy–Schwarz

Cauchy–Schwarz Inequality #

The Cauchy–Schwarz Inequality is one of the most important results in linear algebra.

It places a fundamental bound on the inner product of two vectors.

If you see angle, cosine, similarity, or inner product bounds
→ think Cauchy–Schwarz Inequality

Key Idea: The magnitude of the inner product (dot product) can never exceed the product of the vectors' norms. This ensures all geometric interpretations (angles, cosine similarity) are valid.


Statement of the Inequality #

For any vectors \( \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \) (or, more generally, any vectors in an inner product space):

\[ |\langle \mathbf{x}, \mathbf{y} \rangle| \leq \lVert \mathbf{x} \rVert \, \lVert \mathbf{y} \rVert \]

Equality holds exactly when one vector is a scalar multiple of the other.
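
A quick numerical check of the inequality (a sketch assuming NumPy; the vectors are arbitrary examples):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([-4.0, 0.5, 2.0])

inner = np.dot(x, y)                            # <x, y>
bound = np.linalg.norm(x) * np.linalg.norm(y)   # ||x|| * ||y||

print(abs(inner) <= bound)        # True: |<x, y>| <= ||x|| ||y||

# Consequence: cosine similarity always lies in [-1, 1].
cos_sim = inner / bound
print(-1.0 <= cos_sim <= 1.0)     # True
```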

Matrix Decompositions

Matrix Decompositions #

Decompositions reveal structure in matrices and power algorithms like PCA.

Matrix decompositions break complex matrices into simpler parts.

From the lecture introduction, matrices are used to describe mappings and transformations of vectors.

That is why decomposition is important: it lets us understand a complicated transformation by rewriting it using simpler building blocks.

In the slides, the topic is introduced as part of three closely connected goals: how to summarise matrices, how matrices can be decomposed, and how the decompositions can be used for matrix approximations.
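
As a sketch of the third goal (matrix approximation), the example below builds a rank-1 approximation from the largest singular value; it assumes NumPy and an arbitrary example matrix, and is not taken from the slides:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 3.0]])

# Decompose A into simpler parts: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)

# Keep only the largest singular value -> best rank-1 approximation.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])

print(np.linalg.norm(A - A1))   # approximation error (Frobenius norm)
```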

Characteristic Polynomial

Characteristic Polynomial #

The characteristic polynomial of a square matrix is the key tool used to compute eigenvalues.

It connects:

  • Determinants
  • Trace
  • Eigenvalues
  • Matrix structure

Definition #

Let \( A \in \mathbb{R}^{n \times n} \) and \( \lambda \in \mathbb{R} \).

The characteristic polynomial of \( A \) is defined as:

\[ p_A(\lambda) = \det(A - \lambda I) \]

It is a polynomial in \( \lambda \) of degree \( n \).
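
For example, in the \( 2 \times 2 \) case the coefficients are exactly the trace and the determinant, which is how the characteristic polynomial ties together the quantities listed above:

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad\Rightarrow\quad p_A(\lambda) = (a - \lambda)(d - \lambda) - bc = \lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A) \]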

Determinant and Trace

Determinant and Trace #


Minor #

The minor of an element \( a_{ij} \) is the determinant of the smaller square matrix formed by:

  • removing row \( i \)
  • removing column \( j \)

The minor is denoted \( M_{ij} \).

Minors are used to compute cofactors, which are used for determinants and inverses (via adjoint/adjugate).


Cofactor #

The cofactor of \( a_{ij} \), denoted \( C_{ij} \), is the signed minor:

\[ C_{ij} = (-1)^{i+j} M_{ij} \]
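
A minimal sketch (assuming NumPy and an arbitrary \( 3 \times 3 \) example) of computing one minor and its cofactor by deleting a row and a column:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

i, j = 0, 1   # zero-based indices for the element a_{12}

# Minor M_ij: determinant of A with row i and column j removed.
sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
M_ij = np.linalg.det(sub)

# Cofactor C_ij = (-1)^(i+j) * M_ij.
# (The sign is the same for zero- and one-based indices, since they differ by 2.)
C_ij = (-1) ** (i + j) * M_ij

print(M_ij, C_ij)
```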

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors #

  • Eigenvalues give scaling.
  • Eigenvectors define invariant directions of transformation.

Eigenvalues and eigenvectors describe directions that remain unchanged under a linear transformation, except for scaling.

From lectures: matrix multiplication represents a transformation of space.
Most vectors change direction and magnitude.
Some special vectors only scale.
These are eigenvectors.

Key Idea: A matrix transformation stretches or compresses vectors. Eigenvectors are directions that remain unchanged. Eigenvalues tell how much scaling happens.
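
A small numerical sketch (assuming NumPy and a symmetric example matrix) checking that an eigenvector is only scaled, \( A\mathbf{v} = \lambda \mathbf{v} \):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues in w; the columns of V are the corresponding eigenvectors.
w, V = np.linalg.eig(A)

v, lam = V[:, 0], w[0]

# The eigenvector keeps its direction and is only scaled by lambda.
print(np.allclose(A @ v, lam * v))   # True
```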

Cholesky Decomposition

Cholesky Decomposition #

Cholesky decomposition is a special matrix factorisation used for symmetric positive definite matrices.

From lecture discussions, this decomposition is powerful because it reduces a matrix into a triangular form, making computations easier and more stable.

Key Idea: Cholesky decomposition expresses a matrix as a product of a lower triangular matrix and its transpose. It is efficient and numerically stable.


Definition #

For a symmetric positive definite matrix \( A \in \mathbb{R}^{n \times n} \):

\[ A = L L^{\top} \]

where \( L \) is a lower triangular matrix with positive diagonal entries.
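
A minimal sketch (assuming NumPy and a small example matrix) of computing the factor and verifying the definition:

```python
import numpy as np

# A symmetric positive definite matrix (example values).
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# L is lower triangular with positive diagonal entries.
L = np.linalg.cholesky(A)

print(np.allclose(L @ L.T, A))   # True: A = L L^T
```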

Eigen Decomposition

Eigen Decomposition #

Eigen decomposition expresses a matrix using its eigenvectors and eigenvalues.

From lecture discussions, this is one of the most important ways to understand the internal structure of a matrix.

Instead of treating the matrix as a black box, eigen decomposition reveals its fundamental directions and scaling behaviour.

Key Idea: Eigen decomposition rewrites a matrix in terms of directions (eigenvectors) and scaling factors (eigenvalues). This makes complex transformations easier to understand and compute.
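
A small sketch (assuming NumPy and a symmetric example matrix, so the eigenvectors are real) that reconstructs the matrix from its eigenvectors and eigenvalues, \( A = P D P^{-1} \):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues in w; the columns of P are the corresponding eigenvectors.
w, P = np.linalg.eig(A)
D = np.diag(w)

# Eigen decomposition: directions (P) and scaling factors (D) rebuild A.
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
```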

Diagonalization

Diagonalization #

Diagonalisation expresses a matrix using its eigenvectors and eigenvalues when possible.

From lecture explanation, diagonalisation is one of the most powerful tools because it converts a complicated matrix into a much simpler form.

Instead of working with a full matrix, we work with a diagonal matrix, which is much easier to analyse and compute.

Key Idea: If a matrix has enough independent eigenvectors, it can be rewritten as a diagonal matrix using a change of basis. This simplifies matrix operations significantly.
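
In symbols, with the independent eigenvectors collected as the columns of \( P \) (the standard change-of-basis statement, added here for reference):

\[ D = P^{-1} A P \quad\Longleftrightarrow\quad A = P D P^{-1}, \qquad D = \operatorname{diag}(\lambda_1, \dots, \lambda_n) \]

One practical consequence is that matrix powers become simple: \( A^k = P D^k P^{-1} \), and raising a diagonal matrix to a power only raises its diagonal entries.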