Determinant and Trace #


Minor #

The minor of an element \( a_{ij} \) is the determinant of the smaller square matrix formed by:

  • removing row \( i \)
  • removing column \( j \)

The minor is denoted \( M_{ij} \) .

Minors are used to compute cofactors, which are used for determinants and inverses (via adjoint/adjugate).


Cofactor #

The cofactor of \( a_{ij} \) , denoted \( C_{ij} \) , is:

\[ C_{ij} = (-1)^{i+j} M_{ij} \]

Where:

  • \( i \) is the row index
  • \( j \) is the column index
  • \( M_{ij} \) is the minor

Why the sign term exists #

The factor \( (-1)^{i+j} \) accounts for alternating signs depending on position in the matrix.
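As a sketch (assuming NumPy; `minor` and `cofactor` are hypothetical helper names, not library functions), a minor is computed by deleting one row and one column and taking the determinant of what remains, and the cofactor just attaches the sign:

```python
import numpy as np

def minor(A, i, j):
    """Determinant of A with row i and column j removed (0-indexed)."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(A, i, j):
    """Cofactor C_ij = (-1)^(i+j) * M_ij."""
    return (-1) ** (i + j) * minor(A, i, j)

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

print(minor(A, 0, 0))     # det([[5, 6], [8, 10]]) = 50 - 48 = 2
print(cofactor(A, 0, 1))  # -det([[4, 6], [7, 10]]) = -(40 - 42) = 2
```

Note the indices are 0-based here, whereas the text above uses the conventional 1-based \( i, j \).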


Cofactor Matrix and Adjoint (Adjugate) #

Cofactor matrix #

The cofactor matrix is the matrix formed by taking the cofactor of every entry.


Determinant #

The determinant is a scalar value defined for any square ( \( n \times n \) ) matrix.

The determinant of a square matrix \( A \) maps matrices to real scalars and equals the product of all the eigenvalues of the matrix.

It is written as:

  • \( \det(A) \)
  • \( |A| \)

Key roles:

  • Acts as the scaling factor of the linear transformation represented by the matrix

  • Indicates whether a matrix is invertible ( \( \det(A) \neq 0 \) )

  • Plays a key role in matrix operations such as computing the inverse of a matrix and solving systems of linear equations

  • Enables the computation of eigenvalues (via the characteristic polynomial), which are fundamental to PCA and dimensionality reduction in machine learning

  • Appears in calculus, optimisation, and probability (e.g., Jacobians, covariance matrices)


Determinants of different sizes #

1×1 matrix #

If \( X = [a] \) , then:

\[ \det(X) = a \]

2×2 matrix #

For:

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \] \[ \det(A) = ad - bc \]
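A quick numerical check of the \( ad - bc \) formula against NumPy, with illustrative values:

```python
import numpy as np

a, b, c, d = 3.0, 8.0, 4.0, 6.0
A = np.array([[a, b], [c, d]])

manual = a * d - b * c          # ad - bc = 18 - 32 = -14
via_numpy = np.linalg.det(A)

print(manual, via_numpy)  # -14.0 in both cases (up to float rounding)
```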

3×3 matrix (concept) #

A 3×3 determinant is computed via cofactor (Laplace) expansion, which reduces it to 2×2 determinants.

This can be done by expanding along:

  • any row ( \( R_1, R_2, R_3 \) )
  • or any column ( \( C_1, C_2, C_3 \) )
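For example, expanding along the first row \( R_1 \) in terms of cofactors:

\[ \det(A) = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]

Expanding along any other row or column gives the same value.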

Adjoint / Adjugate #

The adjoint (more precisely, adjugate) of \( A \) is:

\[ \operatorname{adj}(A) = C^T \]

Where:

  • \( C \) is the cofactor matrix
  • \( C^T \) is its transpose

Used in the classical formula for the inverse: \[ A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A) \quad \text{when } \det(A)\neq 0 \]
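A minimal sketch of this formula in Python (assuming NumPy; `adjugate` is a hypothetical helper, not a library function), checked against `np.linalg.inv`:

```python
import numpy as np

def adjugate(A):
    """Adjugate = transpose of the cofactor matrix."""
    n = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C.T

A = np.array([[4.0, 7.0], [2.0, 6.0]])  # det(A) = 24 - 14 = 10
A_inv = adjugate(A) / np.linalg.det(A)

print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```

The classical adjugate formula is \( O(n!) \) if applied recursively, so in practice inverses are computed by factorisation methods; the formula is mainly of theoretical interest.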


Properties of the determinant #

  • Transpose property \[ \det(A)=\det(A^T) \]

  • Zero property If a matrix has:

    • a zero row/column, or
    • two identical rows/columns, or
    • two proportional rows/columns

    then \( \det(A)=0 \)

  • Row/column swap Swapping two rows/columns changes the sign: \[ \det(A) \rightarrow -\det(A) \]

  • Scalar multiple Multiplying one row/column by \( k \) multiplies the determinant by \( k \) .

  • Row operation invariance Adding a multiple of one row/column to another does not change the determinant: \[ R_i \rightarrow R_i + kR_j \]

  • Product rule \[ \det(AB)=\det(A)\det(B) \]

  • Inverse property \[ \det(A^{-1})=\frac{1}{\det(A)} \quad (\det(A)\neq 0) \]

  • Triangular matrices For upper/lower triangular matrices, the determinant equals the product of diagonal entries.
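Several of these properties can be spot-checked numerically (a sketch assuming NumPy; random matrices make this a sanity check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Transpose property: det(A) = det(A^T)
assert np.isclose(np.linalg.det(A), np.linalg.det(A.T))

# Product rule: det(AB) = det(A) det(B)
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

# Row swap flips the sign
A_swapped = A[[1, 0, 2], :]
assert np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A))

# Triangular matrix: determinant = product of diagonal entries
T = np.triu(A)
assert np.isclose(np.linalg.det(T), np.prod(np.diag(T)))

print("all determinant properties verified")
```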


Trace #

The trace of an \( n \times n \) square matrix \( A \) is defined as the sum of its diagonal elements.

\[ \operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii} \]

In simple terms:

Trace = sum of diagonal entries.


Example #

For:

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]
\[ \operatorname{tr}(A) = a_{11} + a_{22} \]
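Computed with NumPy, using illustrative values:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(np.trace(A))         # 1 + 4 = 5.0
print(np.sum(np.diag(A)))  # same thing, written out
```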

Properties of the Trace #

Let \( A, B \in \mathbb{R}^{n \times n} \) and \( \alpha \in \mathbb{R} \) .


1. Linearity (Addition) #

\[ \operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B) \]

2. Scalar Multiplication #

\[ \operatorname{tr}(\alpha A) = \alpha \operatorname{tr}(A) \]

3. Trace of Identity #

\[ \operatorname{tr}(I_n) = n \]

Because the identity matrix has \( n \) ones on the diagonal.


4. Cyclic Property (Very Important ⭐) #

If
\( A \in \mathbb{R}^{n \times k} \) and
\( B \in \mathbb{R}^{k \times n} \) , then:

\[ \operatorname{tr}(AB) = \operatorname{tr}(BA) \]

This does not mean \( AB = BA \). It only means their traces are equal.
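A numerical illustration (assuming NumPy): here \( AB \) is \( 2 \times 2 \) while \( BA \) is \( 5 \times 5 \), so the two products cannot possibly be equal, yet their traces coincide:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 5))  # 2x5
B = rng.standard_normal((5, 2))  # 5x2

# AB is 2x2 and BA is 5x5 -- different shapes, so AB != BA --
# yet tr(AB) = tr(BA).
print(np.trace(A @ B), np.trace(B @ A))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```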

This property is extremely important in:

  • Optimisation
  • Matrix calculus
  • Machine learning derivations

Important Identity #

If \( A \) has eigenvalues \( \lambda_1, \lambda_2, \dots, \lambda_n \) , then:

\[ \operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i \]
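A quick numerical check (assuming NumPy; for a general real matrix the eigenvalues may be complex, but their sum is real up to rounding):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

eigvals = np.linalg.eigvals(A)  # may be complex for a general real matrix
print(np.trace(A), np.sum(eigvals).real)
assert np.isclose(np.trace(A), np.sum(eigvals).real)
```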

Why Trace Matters in Machine Learning #

Trace appears in:

  • Matrix derivatives
  • Quadratic forms
  • PCA
  • Gaussian likelihoods
  • Optimisation proofs

Example quadratic form (a scalar is a \( 1 \times 1 \) matrix, so it equals its own trace):

\[ \mathbf{x}^T A \mathbf{x} = \operatorname{tr}(\mathbf{x}^T A \mathbf{x}) \]

Trace is often used to simplify matrix expressions.
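Combining this with the cyclic property gives \( \operatorname{tr}(\mathbf{x}^T A \mathbf{x}) = \operatorname{tr}(A\mathbf{x}\mathbf{x}^T) \), a standard trick in matrix-calculus derivations. A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
x = rng.standard_normal((3, 1))

quad = float(x.T @ A @ x)          # scalar quadratic form x^T A x
via_trace = np.trace(A @ x @ x.T)  # cyclic property: tr(x^T A x) = tr(A x x^T)

assert np.isclose(quad, via_trace)
print(quad, via_trace)
```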


The proofs of these properties are straightforward and follow directly from the definition of the trace and properties of summation.

