Characteristic Polynomial #
The characteristic polynomial of a square matrix is the key tool used to compute eigenvalues.
It connects:
- Determinants
- Trace
- Eigenvalues
- Matrix structure
Definition #
Let \( A \in \mathbb{R}^{n \times n} \) and \( \lambda \in \mathbb{R} \).
The characteristic polynomial of \( A \) is defined as:
\[ p_A(\lambda) = \det(A - \lambda I) \]
It is a polynomial in \( \lambda \) of degree \( n \).
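As a quick numerical check of the definition (a minimal sketch using NumPy; the example matrix is arbitrary), \( p_A(\lambda) \) can be evaluated pointwise:

```python
import numpy as np

# Arbitrary example matrix for illustration
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def char_poly(A, lam):
    """Evaluate p_A(lambda) = det(A - lambda * I) at a given lambda."""
    n = A.shape[0]
    return np.linalg.det(A - lam * np.eye(n))

# For this 2x2 matrix, p_A(lambda) = (2 - lambda)(3 - lambda) - 1
#                                  = lambda^2 - 5*lambda + 5
print(char_poly(A, 0.0))  # equals det(A) = 5
```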
General Form #
We can show that:
\[ p_A(\lambda) = c_0 + c_1 \lambda + \dots + c_{n-1} \lambda^{n-1} + (-1)^n \lambda^n \]
where \( c_0, c_1, \dots, c_{n-1} \in \mathbb{R} \).
Important Coefficients #
Two coefficients are especially important.
1. Constant Term #
Set \( \lambda = 0 \):
\[ p_A(0) = \det(A) \]
So:
\( c_0 = \det(A) \)
2. Coefficient of \( \lambda^{n-1} \) #
One can show (via determinant expansion) that:
\[ c_{n-1} = (-1)^{n-1} \operatorname{tr}(A) \]
So the trace of a matrix appears directly inside the characteristic polynomial.
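Both coefficient identities can be verified numerically (a sketch using NumPy; note that `np.poly(A)` returns the monic coefficients of \( \det(\lambda I - A) \), which differs from \( p_A(\lambda) \) by a factor of \( (-1)^n \)):

```python
import numpy as np

# Arbitrary 3x3 example: tr(A) = 6, det(A) = 10
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 4.0],
              [1.0, 0.0, 3.0]])
n = A.shape[0]

# np.poly(A) gives det(lambda*I - A), highest degree first;
# multiplying by (-1)^n converts to p_A(lambda) = det(A - lambda*I).
coeffs = (-1) ** n * np.poly(A)   # [(-1)^n, c_{n-1}, ..., c_1, c_0]

c0 = coeffs[-1]          # constant term
c_n_minus_1 = coeffs[1]  # coefficient of lambda^(n-1)

print(np.isclose(c0, np.linalg.det(A)))                    # True
print(np.isclose(c_n_minus_1, (-1)**(n-1) * np.trace(A)))  # True
```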
Leading Term #
The highest-degree term is always:
\[ (-1)^n \lambda^n \]
So:
The characteristic polynomial always has degree \( n \).
Why These Coefficients Appear (Intuition from Expansion) #
Consider a \( 3 \times 3 \) matrix:
\[ A - \lambda I = \begin{bmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{bmatrix} \]
When expanding the determinant:
The product
\( \prod_{i=1}^{3} (a_{ii} - \lambda) \)
generates the highest powers of \( \lambda \). Other determinant terms contain fewer factors of \( \lambda \) and therefore produce lower powers.
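This expansion can be carried out symbolically (a sketch using SymPy with a fully symbolic \( 3 \times 3 \) matrix) to read off the top coefficients directly:

```python
import sympy as sp

lam = sp.symbols('lambda')

# Fully symbolic 3x3 matrix with entries a11, a12, ..., a33
a = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f'a{i + 1}{j + 1}'))

# Expand det(A - lambda*I) into a polynomial in lambda
p = sp.expand((a - lam * sp.eye(3)).det())

print(p.coeff(lam, 3))  # -1, i.e. (-1)^3
print(p.coeff(lam, 2))  # a11 + a22 + a33, i.e. (-1)^2 * tr(A)
```

Only the diagonal product contributes to the two highest powers; every other term of the expansion stops at \( \lambda^1 \) or below for \( n = 3 \).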
General Case (n × n) #
When expanding along the first row:
The term
\( \prod_{i=1}^{n}(a_{ii} - \lambda) \)
produces powers up to \( \lambda^n \). All other expansion terms contain minors from which at least one \( (a_{ii} - \lambda) \) factor has been removed.
So:
- Only this diagonal product produces the \( \lambda^n \) term
- Only this same product produces the \( \lambda^{n-1} \) term: any expansion term that picks an off-diagonal entry \( a_{ij} \) loses both \( (a_{ii} - \lambda) \) and \( (a_{jj} - \lambda) \), so it has degree at most \( n - 2 \)
Which leads to:
\[ \text{Coefficient of } \lambda^n = (-1)^n \]
and
\[ \text{Coefficient of } \lambda^{n-1} = (-1)^{n-1} \sum_{i=1}^{n} a_{ii} = (-1)^{n-1} \operatorname{tr}(A) \]
Connection to Eigenvalues #
The eigenvalues of \( A \) are exactly the roots of the characteristic polynomial:
\[ \det(A - \lambda I) = 0 \]
Solving this equation gives all eigenvalues of \( A \).
Important Consequences #
From the polynomial structure we obtain:
1️⃣ Product of Eigenvalues #
\[ \prod_{i=1}^{n} \lambda_i = \det(A) \]
2️⃣ Sum of Eigenvalues #
\[ \sum_{i=1}^{n} \lambda_i = \operatorname{tr}(A) \]
Both identities follow from Vieta's formulas relating a polynomial's coefficients to its roots, with eigenvalues counted with algebraic multiplicity (and, in general, taken over \( \mathbb{C} \)).
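Both identities are easy to confirm numerically (a sketch using NumPy; the matrix is arbitrary):

```python
import numpy as np

# Arbitrary example: eigenvalues of [[4,1],[2,3]] are 5 and 2
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eig = np.linalg.eigvals(A)

print(np.isclose(np.prod(eig), np.linalg.det(A)))  # True: 5 * 2 = 10 = det(A)
print(np.isclose(np.sum(eig),  np.trace(A)))       # True: 5 + 2 = 7 = tr(A)
```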
Why This Matters in Machine Learning #
Characteristic polynomials are used to:
- Compute eigenvalues (PCA, SVD foundations)
- Analyse stability of systems
- Understand diagonalisation
- Study covariance matrices
- Analyse Hessians in optimisation
Every time you compute eigenvalues, you are solving
\( \det(A - \lambda I) = 0 \).
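In practice, libraries solve this root-finding problem for you; the two routes agree (a sketch using NumPy — `np.roots` on the characteristic coefficients versus `np.linalg.eigvals` directly; the matrix is arbitrary):

```python
import numpy as np

# Arbitrary symmetric example; eigenvalues are {1, 3, 3}
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])

# Roots of det(lambda*I - A) = 0 -- the same roots as det(A - lambda*I) = 0,
# since the two polynomials differ only by a factor of (-1)^n.
roots = np.roots(np.poly(A))
eig = np.linalg.eigvals(A)

print(np.allclose(np.sort(roots), np.sort(eig)))  # True
```

Note that production eigensolvers do not actually form the polynomial (root-finding on coefficients is numerically fragile); they work on the matrix directly.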
Summary #
- The characteristic polynomial is \( p_A(\lambda) = \det(A - \lambda I) \)
- It is a polynomial of degree \( n \)
- Constant term = determinant: \( c_0 = \det(A) \)
- The \( \lambda^{n-1} \) coefficient involves the trace: \( c_{n-1} = (-1)^{n-1} \operatorname{tr}(A) \)
- Roots of the polynomial are the eigenvalues