Linear Algebra

Systems of Linear Equations #

A system of linear equations can be written compactly as:

\[ A\mathbf{x}=\mathbf{b} \]

This represents:

  • a linear transformation $A$ applied to an unknown vector $\mathbf{x}$
  • producing an output vector $\mathbf{b}$

Key components #

Coefficient matrix $A$ #

$A$ contains the coefficients of the variables.
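
For instance (the numbers here are our own illustration, not from the lecture), the pair of equations $2x + y = 5$ and $x - 3y = -1$ takes this form:

\[ \underbrace{\begin{pmatrix} 2 & 1 \\ 1 & -3 \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} x \\ y \end{pmatrix}}_{\mathbf{x}} = \underbrace{\begin{pmatrix} 5 \\ -1 \end{pmatrix}}_{\mathbf{b}} \]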

Matrices #

Matrices are the core data structure of linear algebra and the workhorse of machine learning.
Almost every ML model can be described as a sequence of matrix operations.


Matrix #

A matrix is a rectangular array of numbers arranged in rows and columns.

\[ A \in \mathbb{R}^{m \times n} \]

An $m \times n$ matrix has:

  • $m$ rows
  • $n$ columns

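A minimal NumPy sketch (our own illustration) of the same idea:

```python
import numpy as np

# A 2 x 3 matrix: 2 rows and 3 columns
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A.shape)  # (2, 3) -> (rows, columns)
```
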
Solving Linear Systems #

A linear system can be solved using:

  • Substitution Method
  • Elimination Method (Multiply & then Subtract)
  • Cross Multiplication

A linear system can have:

  • no solution
  • a unique solution
  • infinitely many solutions
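
A minimal sketch of the unique-solution case, assuming NumPy and numbers of our own choosing:

```python
import numpy as np

# Coefficient matrix and right-hand side for:
#   2x +  y = 5
#    x - 3y = -1
A = np.array([[2.0, 1.0],
              [1.0, -3.0]])
b = np.array([5.0, -1.0])

# np.linalg.solve requires A to be square and invertible,
# i.e. the system must have a unique solution.
x = np.linalg.solve(A, b)
print(x)  # [2. 1.]
```

If $A$ is singular, `np.linalg.solve` raises a `LinAlgError`; that is the situation in which the system has no solution or infinitely many.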

Positive Definite Matrices #

A square matrix $A$ is positive definite if pre-multiplying and post-multiplying it by the same nonzero vector always gives a positive number, no matter which nonzero vector we choose:

\[ \mathbf{x}^\top A \mathbf{x} > 0 \quad \text{for all } \mathbf{x} \neq \mathbf{0} \]

Positive definite symmetric matrices have the property that all their eigenvalues are positive.
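
A quick numerical check (our own sketch, not from the slides): for a symmetric matrix, attempting a Cholesky factorisation succeeds exactly when the matrix is positive definite:

```python
import numpy as np

def is_positive_definite(A: np.ndarray) -> bool:
    """Check positive definiteness of a symmetric matrix via Cholesky."""
    try:
        np.linalg.cholesky(A)  # succeeds iff A is symmetric positive definite
        return True
    except np.linalg.LinAlgError:
        return False

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])      # eigenvalues 1 and 3, both positive
print(is_positive_definite(A))   # True
```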

Forward and Backward Substitution #

Forward and backward substitution are efficient algorithms used to solve linear systems when the coefficient matrix is triangular.

They are typically used after:

  • Gaussian elimination
  • LU decomposition

1. Forward Substitution (Lower Triangular Systems) #

Used to solve:

\[ L\mathbf{x} = \mathbf{b} \]

where $L$ is a lower triangular matrix:

\[ L = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix} \]
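
A minimal forward-substitution sketch (our own illustration): each row introduces exactly one new unknown, so we can solve from the top row downward:

```python
import numpy as np

def forward_substitution(L: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve L x = b for lower triangular L with nonzero diagonal."""
    n = L.shape[0]
    x = np.zeros(n)
    for i in range(n):
        # Row i reads l_i1*x_1 + ... + l_ii*x_i = b_i, with x_1..x_{i-1} known
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0],
              [1.0, 3.0]])
b = np.array([4.0, 8.0])
print(forward_substitution(L, b))  # [2. 2.]
```

Backward substitution mirrors this for upper triangular systems, solving from the last row upward.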

Inverse Matrix #

The inverse of a matrix is a matrix that, when multiplied with the original matrix, produces the identity matrix.

A square matrix $A$ is invertible if there exists a matrix $A^{-1}$ such that:

\[ AA^{-1} = A^{-1}A = I \]

Here:

  • $I$ is the identity matrix of the same size as $A$.
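
A small NumPy sketch (our own numbers) computing an inverse and confirming the defining property:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])  # det = 10, so A is invertible

A_inv = np.linalg.inv(A)

# A @ A_inv should be (numerically) the identity matrix
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```

As a design note, solving $A\mathbf{x} = \mathbf{b}$ via `np.linalg.solve(A, b)` is generally preferred to forming `A_inv @ b`, since it is faster and more numerically stable.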

Convex Combination of Two Points #

A convex combination describes how to form a point between two points using weighted averages.

It is a fundamental building block in several advanced fields:

  • Linear Algebra & Geometry
  • Optimization Theory
  • Machine Learning (specifically in SVMs, clustering, and data interpolation)

Given two points (or vectors) $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n$, a convex combination of these points is defined as:

$$\mathbf{x} = \lambda \mathbf{x}_1 + (1 - \lambda)\mathbf{x}_2$$

Where:

  • $\lambda \in [0, 1]$ is the weight: $\lambda = 1$ gives $\mathbf{x}_1$, $\lambda = 0$ gives $\mathbf{x}_2$, and intermediate values trace the line segment between them.
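
A short sketch (our own example) sweeping $\lambda$ to trace points between $\mathbf{x}_2$ and $\mathbf{x}_1$:

```python
import numpy as np

def convex_combination(x1: np.ndarray, x2: np.ndarray, lam: float) -> np.ndarray:
    """Return lam * x1 + (1 - lam) * x2 for lam in [0, 1]."""
    assert 0.0 <= lam <= 1.0, "lam must lie in [0, 1] for a convex combination"
    return lam * x1 + (1.0 - lam) * x2

x1 = np.array([0.0, 0.0])
x2 = np.array([4.0, 2.0])

for lam in (0.0, 0.5, 1.0):
    print(lam, convex_combination(x1, x2, lam))
# 0.0 -> [4. 2.] (all of x2), 0.5 -> midpoint [2. 1.], 1.0 -> [0. 0.] (all of x1)
```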

Vector Spaces #

A vector space is the mathematical “home” where vectors live and where addition and scaling are valid operations.

  • A vector space is a set closed under vector addition and scalar multiplication.

  • Machine learning operates in vector spaces.

  • The topic covers independence, bases, rank, and geometric tools like norms and inner products that are used to measure length, distance, and angles.

A vector space is a set of vectors that satisfies ten axioms, defined under two operations:

  • vector addition
  • scalar multiplication

Feature #

A feature is an individual measurable property or characteristic of a data point used as input to a machine learning model.

Each feature corresponds to one dimension.

\[ x_i \in \mathbb{R} \]

A data point with $d$ features is represented as a vector:

\[ \mathbf{x} = (x_1, x_2, \ldots, x_d)^\top \in \mathbb{R}^d \]
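
A tiny illustration (with made-up feature names of our own) of a data point as a feature vector:

```python
import numpy as np

# Hypothetical features for one data point: [height_cm, weight_kg, age_years]
x = np.array([170.0, 65.0, 30.0])

print(x.shape)  # (3,) -> this data point lives in R^3
```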

Cauchy–Schwarz Inequality #

The Cauchy–Schwarz Inequality is one of the most important results in linear algebra.

It places a fundamental bound on the inner product of two vectors.

If you see angle, cosine, similarity, or inner product bounds
→ think Cauchy–Schwarz Inequality

Key Idea: The magnitude of the inner product (dot product) can never exceed the product of the vectors' magnitudes. This ensures all geometric interpretations (angles, cosine) are valid.


Statement of the Inequality #

For any vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$:

\[ |\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\| \, \|\mathbf{v}\| \]

with equality if and only if $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent.

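A quick numerical check (our own sketch) that the bound holds, which is exactly what keeps cosine similarity inside $[-1, 1]$:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

lhs = abs(u @ v)                             # |<u, v>|
rhs = np.linalg.norm(u) * np.linalg.norm(v)  # ||u|| * ||v||

print(lhs <= rhs)   # True, by Cauchy-Schwarz
print(u @ v / rhs)  # cosine of the angle, guaranteed in [-1, 1]
```
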
Matrix Decompositions #

Decompositions reveal structure in matrices and power algorithms like PCA.

Matrix decompositions break complex matrices into simpler parts.

As the lecture introduction notes, matrices are used to describe mappings and transformations of vectors.

That is why decomposition is important: it lets us understand a complicated transformation by rewriting it using simpler building blocks.

In the slides, the topic is introduced as part of three closely connected goals: how to summarise matrices, how matrices can be decomposed, and how the decompositions can be used for matrix approximations.
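
As a concrete sketch (our own illustration; the source does not single out a particular decomposition here), the singular value decomposition rewrites a matrix as a product of simpler factors:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# Singular value decomposition: A = U @ diag(s) @ Vt,
# i.e. a rotation, a scaling, and another rotation.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the factors back together recovers the original matrix
print(np.allclose(A, U @ np.diag(s) @ Vt))  # True
```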