Characteristic Polynomial #

The characteristic polynomial of a square matrix is the key tool used to compute eigenvalues.

It connects:

  • Determinants
  • Trace
  • Eigenvalues
  • Matrix structure

Definition #

Let
\( A \in \mathbb{R}^{n \times n} \)
and \( \lambda \in \mathbb{R} \) .

The characteristic polynomial of \( A \) is defined as:

\[ p_A(\lambda) = \det(A - \lambda I) \]

It is a polynomial in \( \lambda \) of degree \( n \).
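As a quick numerical sketch (the matrix here is a small, hypothetical example), we can evaluate \( p_A(\lambda) = \det(A - \lambda I) \) directly and recover its coefficients with NumPy. Note that `np.poly` returns the coefficients of the monic polynomial \( \det(\lambda I - A) \), which differs from \( p_A \) by a factor of \( (-1)^n \):

```python
import numpy as np

# Hypothetical 2x2 example matrix (not from the text).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
n = A.shape[0]

def p_A(lam):
    """Evaluate the characteristic polynomial det(A - lam*I) directly."""
    return np.linalg.det(A - lam * np.eye(n))

# np.poly(A) gives the coefficients of det(lam*I - A), highest degree
# first; multiplying by (-1)**n converts them to those of det(A - lam*I).
coeffs = (-1) ** n * np.poly(A)

# The direct evaluation and the polynomial agree at every sample point.
for lam in [0.0, 1.5, -2.0]:
    assert np.isclose(p_A(lam), np.polyval(coeffs, lam))
```

Evaluating `p_A` at a few sample points and comparing against `np.polyval` is a cheap sanity check that the degree and coefficients are what the definition predicts.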


General Form #

We can show that:

\[ p_A(\lambda) = c_0 + c_1 \lambda + \dots + c_{n-1} \lambda^{n-1} + (-1)^n \lambda^n \]

where
\( c_0, c_1, \dots, c_{n-1} \in \mathbb{R} \) .


Important Coefficients #

Two coefficients are especially important.

1. Constant Term #

Set \( \lambda = 0 \) :

\[ p_A(0) = \det(A) \]

So:

\( c_0 = \det(A) \)

2. Coefficient of \( \lambda^{n-1} \) #

One can show (via determinant expansion) that:

\[ c_{n-1} = (-1)^{n-1} \operatorname{tr}(A) \]

So the trace of a matrix appears directly inside the characteristic polynomial.
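Both special coefficients can be checked numerically. In this sketch (with a hypothetical \( 3 \times 3 \) matrix), we read \( c_0 \) and \( c_{n-1} \) off the coefficient array and compare them with the determinant and trace:

```python
import numpy as np

# Hypothetical 3x3 example (n = 3, so (-1)**n = -1).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 5.0]])
n = A.shape[0]

# Coefficients of p_A(lam) = det(A - lam*I), highest degree first.
coeffs = (-1) ** n * np.poly(A)

c0 = coeffs[-1]    # constant term
c_n1 = coeffs[1]   # coefficient of lam**(n-1)

assert np.isclose(c0, np.linalg.det(A))                 # c_0 = det(A)
assert np.isclose(c_n1, (-1) ** (n - 1) * np.trace(A))  # c_{n-1} = (-1)^{n-1} tr(A)
```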


Leading Term #

The highest-degree term is always:

\[ (-1)^n \lambda^n \]

So:

The characteristic polynomial is always degree \( n \).


Why These Coefficients Appear (Intuition from Expansion) #

Consider a \( 3 \times 3 \) matrix:

\[ A - \lambda I = \begin{bmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{bmatrix} \]

When expanding the determinant:

  • The product
    \( \prod_{i=1}^{3} (a_{ii} - \lambda) \)
    generates the highest powers of \( \lambda \) .

  • Other determinant terms contain at most one factor of the form
    \( (a_{ii} - \lambda) \) and therefore produce lower powers of \( \lambda \) .


General Case (n × n) #

When expanding along the first row:

  • The term
    \( \prod_{i=1}^{n}(a_{ii} - \lambda) \)
    produces powers up to \( \lambda^n \) .

  • All other expansion terms involve an off-diagonal entry and a minor
    in which at least two of the \( (a_{ii} - \lambda) \) factors are removed,
    so their degree in \( \lambda \) is at most \( n - 2 \) .

So:

  • Only one contributor produces the \( \lambda^n \) term
  • Only that same contributor produces the \( \lambda^{n-1} \) term

Which leads to:

\[ \text{Coefficient of } \lambda^n = (-1)^n \]

and

\[ \text{Coefficient of } \lambda^{n-1} = (-1)^{n-1} \sum_{i=1}^{n} a_{ii} = (-1)^{n-1} \operatorname{tr}(A) \]
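As a concrete check of these formulas, take a general \( 2 \times 2 \) matrix
\( A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \) :

\[ p_A(\lambda) = (a - \lambda)(d - \lambda) - bc = \lambda^2 - (a + d)\lambda + (ad - bc) \]

The leading coefficient is \( (-1)^2 = 1 \) , the coefficient of \( \lambda^{n-1} = \lambda \) is \( -(a + d) = (-1)^{1} \operatorname{tr}(A) \) , and the constant term is \( ad - bc = \det(A) \) , matching all three results above.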

Connection to Eigenvalues #

Eigenvalues are defined as the roots of the characteristic polynomial:

\[ \det(A - \lambda I) = 0 \]

Solving this equation gives all eigenvalues of \( A \).
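A minimal numerical sketch of this connection (with a hypothetical symmetric matrix, chosen so the eigenvalues are real): the roots of the characteristic polynomial coincide with the eigenvalues computed by a standard eigensolver.

```python
import numpy as np

# Hypothetical symmetric example matrix; its eigenvalues are real.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# np.poly(A) gives the coefficients of det(lam*I - A), which has
# exactly the same roots as det(A - lam*I).
roots = np.sort(np.roots(np.poly(A)))

# The roots match the eigenvalues computed directly.
eigs = np.sort(np.linalg.eigvals(A))
assert np.allclose(roots, eigs)
```

In practice, libraries do not find eigenvalues by root-finding on this polynomial (it is numerically ill-conditioned for large \( n \) ), but the equivalence holds exactly in theory.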


Important Consequences #

From the polynomial structure we obtain:

1. Product of Eigenvalues #

\[ \prod_{i=1}^{n} \lambda_i = \det(A) \]

2. Sum of Eigenvalues #

\[ \sum_{i=1}^{n} \lambda_i = \operatorname{tr}(A) \]

These follow from the relationship between polynomial coefficients and roots.
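Both identities are easy to verify numerically; this sketch uses an arbitrary random matrix. For a general real matrix the eigenvalues can be complex, but they occur in conjugate pairs, so their product and sum are still real:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))  # arbitrary 4x4 example matrix

eigs = np.linalg.eigvals(A)      # complex in general for a real matrix

# Product of the eigenvalues equals the determinant ...
assert np.isclose(np.prod(eigs), np.linalg.det(A))
# ... and their sum equals the trace.
assert np.isclose(np.sum(eigs), np.trace(A))
```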


Why This Matters in Machine Learning #

Characteristic polynomials are used to:

  • Compute eigenvalues (PCA, SVD foundations)
  • Analyse stability of systems
  • Understand diagonalisation
  • Study covariance matrices
  • Analyse Hessians in optimisation

Every time you compute eigenvalues, you are solving
\( \det(A - \lambda I) = 0 \).


Summary #

  • The characteristic polynomial is
    \( p_A(\lambda) = \det(A - \lambda I) \)
  • It is a degree \( n \) polynomial
  • Constant term = determinant
  • \( \lambda^{n-1} \) coefficient involves the trace
  • Roots of the polynomial are the eigenvalues
