Jordan Canonical Form

Section 5.7 Jordan Canonical Form

The results of Theorem 5.6.1 and Theorem 5.6.3 tell us that for an eigenvalue

λ

of

T : V \to V

with multiplicity

m,

we have a sequence of subspace inclusions

E_{λ} (T) = \ker (T - λ I) \subseteq \ker (T - λ I)^{2} \subseteq \dots \subseteq \ker (T - λ I)^{m} = G_{λ} (T) .

Not all subspaces in this sequence are necessarily distinct. Indeed, it is entirely possible that

\dim E_{λ} (T) = m,

in which case

E_{λ} (T) = G_{λ} (T) .

In general there will be some

l \leq m

such that

\ker (T - λ I)^{l} = G_{λ} (T) .
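The chain of kernels can be watched stabilizing on a small example. The sketch below is only an illustration: the 3 × 3 matrix is an assumption chosen for this purpose (it does not come from the text), with eigenvalue 2 of multiplicity 3.

```python
from sympy import Matrix, eye

# Illustrative matrix (an assumption, not from the text):
# the only eigenvalue is 2, with multiplicity m = 3.
A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 2]])
lam, m = 2, 3
N = A - lam * eye(3)

# dim ker (A - lam*I)^k for k = 1, ..., m
nullities = [len((N**k).nullspace()) for k in range(1, m + 1)]
print(nullities)  # [2, 3, 3]
```

Here the kernels stop growing at the second power, so l = 2 < m = 3 for this matrix.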


Our goal in this section is to determine a basis for

G_{λ} (T)

in a standard way. We begin with a couple of important results, which we state without proof. The first can be found in Axler’s book; the second in Nicholson’s.


Theorem 5.7.1.

Suppose

V

is a complex vector space, and

T : V \to V

is a linear operator. Let

λ_{1}, \dots, λ_{k}

denote the distinct eigenvalues of

T .

(We can assume

V

is real if we also assume that all eigenvalues of

T

are real.) Then:

1. Generalized eigenvectors corresponding to distinct eigenvalues are linearly independent.
2. $V = G_{λ_{1}} (T) \oplus G_{λ_{2}} (T) \oplus \dots \oplus G_{λ_{k}} (T) .$
3. Each generalized eigenspace $G_{λ_{j}} (T)$ is $T$-invariant.
4. Each restriction $(T - λ_{j} I) |_{G_{λ_{j}} (T)}$ is nilpotent.

Theorem 5.7.2.

Let

T : V \to V

be a linear operator. If the characteristic polynomial of

T

is given by

c_{T} (x) = (x - λ_{1})^{m_{1}} (x - λ_{2})^{m_{2}} \dots (x - λ_{k})^{m_{k}},

then

\dim G_{λ_{j}} (T) = m_{j}

for each

j = 1, \dots, k .


Moreover, if we let

B = B_{1} \cup B_{2} \cup \dots \cup B_{k},

where

B_{j}

is any basis for

G_{λ_{j}} (T)

for

j = 1, \dots, k,

then

B

is a basis for

V

(this follows immediately from Theorem 5.7.1) and the matrix of

T

with respect to this basis has the block-diagonal form

M_{B} (T) = [\begin{matrix} A_{1} & 0 & \dots & 0 \\ 0 & A_{2} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & A_{k} \end{matrix}],

where each

A_{j}

has size

m_{j} \times m_{j} .
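As a rough check of the block-diagonal form, the sketch below borrows the 5 × 5 matrix from Example 5.7.6 later in this section, whose eigenvalues are 2 (multiplicity 2) and 3 (multiplicity 3). The bases are whatever SymPy's `nullspace()` happens to return, so the diagonal blocks are not yet in Jordan form; only the zero off-diagonal blocks are being illustrated.

```python
from sympy import Matrix, eye, zeros

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])

# Bases for the generalized eigenspaces G_2(A) and G_3(A).
B2 = ((A - 2 * eye(5))**2).nullspace()   # two vectors
B3 = ((A - 3 * eye(5))**3).nullspace()   # three vectors

# Columns of P: the basis B = B2 followed by B3.
P = Matrix.hstack(*B2, *B3)
M = P.inv() * A * P

# The off-diagonal blocks vanish because each G_lambda is A-invariant.
print(M[0:2, 2:5] == zeros(2, 3), M[2:5, 0:2] == zeros(3, 2))
```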


A few remarks are called for here.

One of the ways to see that $\dim G_{λ_{j}} (T) = m_{j}$ is to consider $(M_{B} (T) - λ_{j} I_{n})^{m_{j}} .$ This will have the form $\operatorname{diag} (U_{1}^{m_{j}}, U_{2}^{m_{j}}, \dots, U_{k}^{m_{j}}),$ where $U_{i}$ is the matrix of $T - λ_{j} I$ restricted to $G_{λ_{i}} (T) .$ If $i \neq j,$ $T - λ_{j} I$ restricts to an invertible operator on $G_{λ_{i}} (T),$ but its restriction to $G_{λ_{j}} (T)$ is nilpotent, by Theorem 5.7.1. So $U_{j}$ is nilpotent (with $U_{j}^{m_{j}} = 0$), and has size $m_{j} \times m_{j},$ while $U_{i}$ is invertible if $i \neq j .$ The matrix $(M_{B} (T) - λ_{j} I_{n})^{m_{j}}$ thus consists of an $m_{j} \times m_{j}$ block of zeros together with invertible diagonal blocks, so $\dim \ker (T - λ_{j} I)^{m_{j}} = m_{j} .$

If the previous point wasn’t clear, note that with an appropriate choice of basis, the block $A_{i}$ in Theorem 5.7.2 has the form

$A_{i} = [\begin{matrix} λ_{i} & * & \dots & * \\ 0 & λ_{i} & \dots & * \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & λ_{i} \end{matrix}] .$

Thus, $M_{B} (T) - λ_{j} I$ will have blocks that are upper triangular, with diagonal entries $λ_{i} - λ_{j} \neq 0$ when $i \neq j,$ but when $i = j$ we get a matrix that is strictly upper triangular, and therefore nilpotent, since its diagonal entries will be $λ_{j} - λ_{j} = 0 .$

If $l_{j}$ is the least integer such that $\ker (T - λ_{j} I)^{l_{j}} = G_{λ_{j}} (T),$ then it is possible to choose the basis of $G_{λ_{j}} (T)$ so that $A_{j}$ is itself block-diagonal, with the largest block having size $l_{j} \times l_{j} .$ The remainder of this section is devoted to determining how to choose such a basis.

The basic principle for choosing a basis for each generalized eigenspace is as follows. We know that

E_{λ} (T) \subseteq G_{λ} (T)

for each eigenvalue

λ .

So we start with a basis for

E_{λ} (T),

by finding eigenvectors as usual. If

\ker (T - λ I)^{2} = \ker (T - λ I),

then we’re done:

E_{λ} (T) = G_{λ} (T) .

Otherwise, we enlarge the basis for

E_{λ} (T)

to a basis of

\ker (T - λ I)^{2} .

If

\ker (T - λ I)^{3} = \ker (T - λ I)^{2},

then we’re done, and

G_{λ} (T) = \ker (T - λ I)^{2} .

If not, we enlarge our existing basis to a basis of

\ker (T - λ I)^{3} .

We continue this process until we reach some power

l

such that

\ker (T - λ I)^{l} = \ker (T - λ I)^{l + 1} .

(This is guaranteed to happen by Theorem 5.6.1.) We then conclude that

G_{λ} (T) = \ker (T - λ I)^{l} .
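The procedure just described can be sketched in SymPy as follows. The helper name `generalized_eigenspace_basis` is hypothetical (it is not a SymPy function), and the matrix is the one from Example 5.7.6 later in this section. At each power the helper keeps only those kernel vectors that enlarge the span of the basis found so far.

```python
from sympy import Matrix, eye

def generalized_eigenspace_basis(A, lam):
    """Hypothetical helper: grow a basis of ker(A - lam*I)^k power by power.

    Returns (basis, l), where l is the first power at which the kernel
    stops growing. Assumes lam is an eigenvalue of A.
    """
    n = A.rows
    N = A - lam * eye(n)
    basis, power, Nk = [], 0, eye(n)
    while True:
        Nk = Nk * N                      # Nk = N^(power + 1)
        candidates = Nk.nullspace()
        if len(candidates) == len(basis):
            return basis, power          # kernel did not grow: stop
        for v in candidates:
            trial = Matrix.hstack(*(basis + [v]))
            if trial.rank() == len(basis) + 1:
                basis.append(v)          # v is independent of what we have
        power += 1

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])
basis3, l3 = generalized_eigenspace_basis(A, 3)
print(len(basis3), l3)
```

This yields some basis of the generalized eigenspace, but not yet the "best" one discussed next.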


The above produces a basis for

G_{λ} (T),

but we want what is, in some sense, the “best” basis. For our purposes, the best basis is the one in which the matrix of

T

restricted to each generalized eigenspace is block diagonal, where each block is a Jordan block.


Definition 5.7.3.

Let

λ

be a scalar. A Jordan block is an

m \times m

matrix of the form

J (m, λ) = [\begin{matrix} λ & 1 & 0 & \dots & 0 \\ 0 & λ & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋱ & ⋮ \\ 0 & 0 & \dots & λ & 1 \\ 0 & 0 & 0 & \dots & λ \end{matrix}] .

That is,

J (m, λ)

has each diagonal entry equal to

λ,

and each “superdiagonal” entry (those just above the diagonal) equal to 1, with all other entries equal to zero.


Example 5.7.4.

The following are examples of Jordan blocks:

J (2, 4) = [\begin{matrix} 4 & 1 \\ 0 & 4 \end{matrix}], J (3, \sqrt{2}) = [\begin{matrix} \sqrt{2} & 1 & 0 \\ 0 & \sqrt{2} & 1 \\ 0 & 0 & \sqrt{2} \end{matrix}], J (4, 2 i) = [\begin{matrix} 2 i & 1 & 0 & 0 \\ 0 & 2 i & 1 & 0 \\ 0 & 0 & 2 i & 1 \\ 0 & 0 & 0 & 2 i \end{matrix}] .
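Jordan blocks like these are easy to build directly from the definition. In the sketch below, `make_jordan_block` is a hypothetical helper written for illustration; it also checks that J(m, λ) - λI is nilpotent of index exactly m.

```python
from sympy import Matrix, eye, sqrt, zeros

def make_jordan_block(m, lam):
    """Hypothetical helper: lam on the diagonal, 1 on the superdiagonal."""
    return Matrix(m, m, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

J24 = make_jordan_block(2, 4)          # the block J(2, 4) from the example
J3 = make_jordan_block(3, sqrt(2))     # the block J(3, sqrt(2))

# N = J - lam*I is strictly upper triangular, hence nilpotent.
N = J3 - sqrt(2) * eye(3)
print(N**2 == zeros(3, 3), N**3 == zeros(3, 3))  # False True
```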


Insight 5.7.5. Finding a chain basis.

A Jordan block corresponds to basis vectors

v_{1}, v_{2}, \dots, v_{m}

with the following properties:

\begin{aligned} T (v_{1}) & = λ v_{1} \\ T (v_{2}) & = v_{1} + λ v_{2} \\ T (v_{3}) & = v_{2} + λ v_{3}, \end{aligned}

and so on. Notice that

v_{1}

is an eigenvector, and for each

j = 2, \dots, m,

(T - λ I) v_{j} = v_{j - 1} .


Notice also that if we set

N = T - λ I,

then

v_{1} = N v_{2}, v_{2} = N v_{3}, \dots, v_{m - 1} = N v_{m}

so our basis for

G_{λ} (T)

is of the form

v, N v, N^{2} v, \dots, N^{m - 1} v,

where

v = v_{m},

and

v_{1} = N^{m - 1} v

is an eigenvector. (Note that

N^{m} v = (T - λ I) v_{1} = 0,

and indeed

N^{m} v_{j} = 0

for each

j = 1, \dots, m .

) Such a basis is known as a chain basis.
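To see a chain basis in action, here is a minimal sketch using an illustrative 3 × 3 matrix (an assumption, not taken from the text) whose characteristic polynomial is (x - 1)^3. Starting from a vector v with N^2 v ≠ 0, the chain v, Nv, N^2 v, taken in reverse order, gives a basis in which the matrix becomes a single Jordan block.

```python
from sympy import Matrix, eye

# Illustrative matrix (an assumption): companion matrix of (x - 1)^3.
A = Matrix([[0, 1, 0],
            [0, 0, 1],
            [1, -3, 3]])
lam, m = 1, 3
N = A - lam * eye(3)

v = Matrix([1, 0, 0])                   # chosen so that N**(m - 1) * v != 0
chain = [N**k * v for k in range(m)]    # v, N v, N^2 v

# Order the columns as v_1 = N^2 v, v_2 = N v, v_3 = v.
P = Matrix.hstack(*reversed(chain))
J = P.inv() * A * P
print(J)
```

The first column of P is the eigenvector N^2 v, matching the role of v_1 in the insight above.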

If

\dim E_{λ} (T) > 1,

we might have to repeat this process for each eigenvector in a basis for the eigenspace. The full matrix of

T

might have several Jordan blocks of possibly different sizes for each eigenvalue.


Example 5.7.6.

Determine a Jordan basis for the operator

T : R^{5} \to R^{5}

whose matrix with respect to the standard basis is given by

A = [\begin{matrix} 7 & 1 & - 3 & 2 & 1 \\ - 6 & 2 & 4 & - 2 & - 2 \\ 0 & 1 & 3 & 1 & - 1 \\ - 8 & - 1 & 6 & 0 & - 3 \\ - 4 & 0 & 3 & - 1 & 1 \end{matrix}]


Solution.

First, we need the characteristic polynomial.


Aside

Programming note: it is generally not considered good practice to use a wildcard when importing a library. The command from sympy import * imports everything. This can lead to clashes when working with multiple libraries with functions of the same name.


Unfortunately, our usual method (using import sympy as sy) runs into some trouble here, since we are using Sage cells rather than pure Python. In a Jupyter notebook, things will work just fine, but we need to clobber some pre-defined Sage functions. So although it is usually a bad thing to use the biggest available hammer to clobber everything, it turns out to be necessary here.


The other option would be to force the Sage cells to use Python instead of Sage, but then we would need to use the Python print command to display results, and we’d lose the nice formatting we get from SymPy.

from sympy import *
init_printing()
A = Matrix([[7,1,-3,2,1],
            [-6,2,4,-2,-2],
            [0,1,3,1,-1],
            [-8,-1,6,0,-3],
            [-4,0,3,-1,1]])
p = A.charpoly().as_expr()
factor(p)

The characteristic polynomial of

A

is given by

c_{A} (x) = (x - 2)^{2} (x - 3)^{3} .

We thus have two eigenvalues:

2,

of multiplicity

2,

and

3,

of multiplicity

3 .
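In a plain Python session or Jupyter notebook (rather than a Sage cell), the same computation can be done without the wildcard import. The sketch below assumes that setting; the alias `sp` is a choice made here for illustration. It also reads the multiplicities off directly with `eigenvals()`.

```python
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[7, 1, -3, 2, 1],
               [-6, 2, 4, -2, -2],
               [0, 1, 3, 1, -1],
               [-8, -1, 6, 0, -3],
               [-4, 0, 3, -1, 1]])

# Factored characteristic polynomial in the variable x.
c = sp.factor(A.charpoly(x).as_expr())

# Dictionary {eigenvalue: algebraic multiplicity}.
mults = A.eigenvals()
print(c, mults)
```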

We next find the

E_{2} (A)

eigenspace.

N2 = A-2*eye(5)
E2 = N2.nullspace()
E2

The computer gives us

E_{2} (A) = null (A - 2 I) = span {x_{1}}, where x_{1} = [\begin{matrix} - 1 \\ 0 \\ - 1 \\ 1 \\ 0 \end{matrix}],

so we have only one independent eigenvector, which means that

G_{2} (A) = null (A - 2 I)^{2} .


Following Insight 5.7.5, we extend

{x_{1}}

to a basis of

G_{2} (A)

by solving the system

(A - 2 I) x = x_{1} .

B2 = N2.col_insert(5,E2[0])
B2.rref()

Using the results above from the computer (or Gaussian elimination), we find a general solution

x = [\begin{matrix} - t \\ - 1 \\ - t \\ t \\ 0 \end{matrix}] = t [\begin{matrix} - 1 \\ 0 \\ - 1 \\ 1 \\ 0 \end{matrix}] + [\begin{matrix} 0 \\ - 1 \\ 0 \\ 0 \\ 0 \end{matrix}] .

Note that our solution is of the form

x = t x_{1} + x_{2} .

We set

t = 0,

and get

x_{2} = {[\begin{matrix} 0 & - 1 & 0 & 0 & 0 \end{matrix}]}^{T} .
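As a quick sanity check, the chain relations for the eigenvalue 2 can be confirmed directly. This is a small sketch that re-enters the matrix and the two vectors by hand rather than reusing the cells above.

```python
from sympy import Matrix, eye, zeros

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])
N2 = A - 2 * eye(5)

x1 = Matrix([-1, 0, -1, 1, 0])   # eigenvector for eigenvalue 2
x2 = Matrix([0, -1, 0, 0, 0])    # solution of (A - 2I)x = x1 with t = 0

assert N2 * x1 == zeros(5, 1)        # x1 is in ker(A - 2I)
assert N2 * x2 == x1                 # (A - 2I) x2 = x1
assert N2**2 * x2 == zeros(5, 1)     # so x2 is in ker(A - 2I)^2
```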


Next, we consider the eigenvalue

λ = 3 .

The computer gives us the following:

N3 = A-3*eye(5)
E3 = N3.nullspace()
E3

Rescaling to remove fractions, we find

E_{3} (A) = null (A - 3 I) = span {y_{1}, y_{2}}, where y_{1} = [\begin{matrix} 1 \\ - 2 \\ 2 \\ 2 \\ 0 \end{matrix}], y_{2} = [\begin{matrix} - 1 \\ 2 \\ 0 \\ 0 \\ 2 \end{matrix}] .

Again, we’re one eigenvector short of the multiplicity, so we need to consider

G_{3} (A) = null (A - 3 I)^{3} .


In the next cell, note that we doubled the eigenvectors in E3 to avoid fractions. To follow the solution in our example, we append 2*E3[0], and reduce the resulting matrix. You should find that using the eigenvector

y_{1}

corresponding to E3[0] leads to an inconsistent system. Once you confirm this, replace E3[0] with E3[1] and re-run the cell to see that we get an inconsistent system using

y_{2}

as well!

B3 = N3.col_insert(5,2*E3[0])
B3.rref()

The systems

(A - 3 I) y = y_{1}

and

(A - 3 I) y = y_{2}

are both inconsistent, but we can salvage the situation by replacing the eigenvector

y_{2}

by some linear combination

z_{2} = a y_{1} + b y_{2} .

We row-reduce, and look for values of

a

and

b

that give a consistent system.


The rref command takes things a bit farther than we’d like, so we use the command echelon_form() instead. Note that if

a \neq b,

the system is inconsistent.

a, b = symbols('a b')
C3 = N3.col_insert(5,a*E3[0]+b*E3[1])
C3.echelon_form()

We find that

a = b

does the job, so we set

z_{2} = y_{1} + y_{2} = [\begin{matrix} 0 \\ 0 \\ 2 \\ 2 \\ 2 \end{matrix}] .
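Another way to test consistency, offered here as an alternative sketch to reading off the echelon form, is to compare ranks: a system Mx = b is consistent exactly when M and the augmented matrix [M | b] have the same rank. The helper name `is_consistent` is hypothetical.

```python
from sympy import Matrix, eye

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])
N3 = A - 3 * eye(5)

y1 = Matrix([1, -2, 2, 2, 0])
y2 = Matrix([-1, 2, 0, 0, 2])

def is_consistent(M, rhs):
    """Hypothetical helper: M x = rhs is solvable iff rank M == rank [M | rhs]."""
    return M.rank() == M.row_join(rhs).rank()

print(is_consistent(N3, y1), is_consistent(N3, y2), is_consistent(N3, y1 + y2))
# False False True
```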

D3 = N3.col_insert(5,E3[0]+E3[1])
D3.rref()

The code above uses the unscaled vectors E3[0] and E3[1], which are half of $y_{1}$ and $y_{2},$ so it solves the system

(A - 3 I) z = \frac{1}{2} (y_{1} + y_{2}),

and we find

\begin{aligned} z & = [\begin{array}{c} \frac{1}{2} + \frac{1}{2} s - \frac{1}{2} t \\ 1 - s + t \\ 1 + s \\ s \\ t \end{array}] \\ & = [\begin{array}{c} \frac{1}{2} \\ 1 \\ 1 \\ 0 \\ 0 \end{array}] + s [\begin{array}{c} \frac{1}{2} \\ - 1 \\ 1 \\ 1 \\ 0 \end{array}] + t [\begin{array}{c} - \frac{1}{2} \\ 1 \\ 0 \\ 0 \\ 1 \end{array}] \\ & = [\begin{array}{c} \frac{1}{2} \\ 1 \\ 1 \\ 0 \\ 0 \end{array}] + \frac{s}{2} y_{1} + \frac{t}{2} y_{2} . \end{aligned}

Setting

s = t = 0

and doubling the result, so that the right-hand side becomes

z_{2} = y_{1} + y_{2},

we let

z_{3} = [\begin{matrix} 1 \\ 2 \\ 2 \\ 0 \\ 0 \end{matrix}],

and check that

A z_{3} = 3 z_{3} + z_{2},

as required:

Z3 = Matrix(5,1,[1,2,2,0,0])
A*Z3-3*Z3-2*(E3[0]+E3[1])

This gives us the basis

B = {x_{1}, x_{2}, y_{1}, z_{2}, z_{3}}

for

R^{5},

and with respect to this basis, we have the Jordan canonical form

M_{B} (T) = [\begin{matrix} 2 & 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{matrix}] .
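The whole example can be double-checked by assembling the basis vectors into the columns of a matrix P and conjugating. The sketch below re-enters the five vectors found above by hand.

```python
from sympy import Matrix

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])

x1 = Matrix([-1, 0, -1, 1, 0])
x2 = Matrix([0, -1, 0, 0, 0])
y1 = Matrix([1, -2, 2, 2, 0])
z2 = Matrix([0, 0, 2, 2, 2])
z3 = Matrix([1, 2, 2, 0, 0])

# Columns of P are the basis B = {x1, x2, y1, z2, z3}, in that order.
P = Matrix.hstack(x1, x2, y1, z2, z3)
J = P.inv() * A * P
print(J)
```

The order of the columns matters: each chain must list its eigenvector first, followed by the vector that maps onto it.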


Now that we’ve done all the work required for Example 5.7.6, we should confess that there was an easier way all along:

A.jordan_form()

The jordan_form() command returns a pair

P, J,

where

J

is the Jordan canonical form of

A,

and

P

is an invertible matrix such that

P^{- 1} A P = J .

You might find that the computer’s answer is not quite the same as ours. This is because the Jordan canonical form is only unique up to permutation of the Jordan blocks. Changing the order of the blocks amounts to changing the order of the columns of

P,

which are given by a basis of generalized eigenvectors.
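A short sketch of how the returned pair might be unpacked and checked is given below. Since the block order SymPy chooses may differ from ours, the checks avoid comparing J against a fixed matrix and look only at order-independent features.

```python
from sympy import Matrix

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])

P, J = A.jordan_form()

# P J P^{-1} recovers A, whatever order the blocks appear in.
recovered = P * J * P.inv()

# Order-independent summaries of J: its diagonal, and the number of 1s
# on the superdiagonal (n minus the number of Jordan blocks).
diagonal = sorted(J[i, i] for i in range(5))
superdiagonal_ones = sum(J[i, i + 1] for i in range(4))
print(diagonal, superdiagonal_ones)
```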


Exercise 5.7.7.

Determine a Jordan basis for the linear operator

T : R^{4} \to R^{4}

given by

T (w, x, y, z) = (w + x, x, - x + 2 y, w - x + y + z) .

You may wish to carry out the computations with SymPy, following the steps demonstrated in Example 5.7.6.

One final note: we mentioned above that the minimal polynomial of an operator has the form

m_{T} (x) = (x - λ_{1})^{l_{1}} (x - λ_{2})^{l_{2}} \dots (x - λ_{k})^{l_{k}},

where for each

j = 1, 2, \dots, k,

l_{j}

is the size of the largest Jordan block corresponding to

λ_{j} .

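Knowing the minimal polynomial therefore tells us a lot about the Jordan canonical form, but not everything. Of course, if

l_{j} = 1

for all

j,

then our operator can be diagonalized. If

\dim V \leq 3,

the characteristic and minimal polynomials together tell us everything, except for the order of the Jordan blocks. (In dimension 4 this can already fail: for a single eigenvalue, blocks of sizes 2, 2 and blocks of sizes 2, 1, 1 give the same characteristic and minimal polynomials.)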
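The exponents in the minimal polynomial can be computed as the power at which the kernels of (A - λI)^k stop growing. The sketch below applies this to the matrix of Example 5.7.6; the helper name `largest_block_size` is hypothetical, and it assumes the value passed in really is an eigenvalue.

```python
from sympy import Matrix, eye, zeros

def largest_block_size(A, lam):
    """Hypothetical helper: least l with rank (A - lam*I)^l == rank (A - lam*I)^(l+1).

    Assumes lam is an eigenvalue of A.
    """
    N = A - lam * eye(A.rows)
    l = 1
    while (N**l).rank() != (N**(l + 1)).rank():
        l += 1
    return l

A = Matrix([[7, 1, -3, 2, 1],
            [-6, 2, 4, -2, -2],
            [0, 1, 3, 1, -1],
            [-8, -1, 6, 0, -3],
            [-4, 0, 3, -1, 1]])

l2 = largest_block_size(A, 2)
l3 = largest_block_size(A, 3)

# With these exponents, m_A(A) = (A - 2I)^l2 (A - 3I)^l3 should vanish.
m_of_A = (A - 2 * eye(5))**l2 * (A - 3 * eye(5))**l3
print(l2, l3, m_of_A == zeros(5, 5))  # 2 2 True
```

Both exponents equal 2, matching the largest Jordan blocks found in the example.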


In Exercise 5.7.7, the minimal polynomial is

m_{T} (x) = (x - 1)^{3} (x - 2),

the same as the characteristic polynomial. If we knew this in advance, then the only possible Jordan canonical forms would be

[\begin{matrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{matrix}] or [\begin{matrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{matrix}] .

If instead the minimal polynomial had turned out to be

(x - 1)^{2} (x - 2)

(with the same characteristic polynomial), then, up to permutation of the Jordan blocks, our Jordan canonical form would be

[\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{matrix}] .


For larger matrices, the characteristic and minimal polynomials need not determine the Jordan canonical form. For example, suppose a

5 \times 5

matrix has

c_{A} (x) = (x - 2)^{5}

with

m_{A} (x) = (x - 2)^{2} .

Then the largest Jordan block is

2 \times 2,

but we could have two

2 \times 2

blocks and a

1 \times 1,

or three

1 \times 1

blocks and one

2 \times 2 .
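The two possibilities can be built explicitly and compared. In this sketch the names `A1` and `A2` are chosen here for illustration; both matrices have characteristic polynomial (x - 2)^5 and minimal polynomial (x - 2)^2, yet they are not similar, since their eigenspaces have different dimensions.

```python
from sympy import Matrix, diag, eye, symbols, zeros

x = symbols('x')
J2 = Matrix([[2, 1], [0, 2]])   # 2 x 2 Jordan block
J1 = Matrix([[2]])              # 1 x 1 Jordan block

A1 = diag(J2, J2, J1)           # block sizes 2, 2, 1
A2 = diag(J2, J1, J1, J1)       # block sizes 2, 1, 1, 1

N1 = A1 - 2 * eye(5)
N2 = A2 - 2 * eye(5)

same_charpoly = A1.charpoly(x).as_expr() == A2.charpoly(x).as_expr()
same_minpoly_exponent = (N1**2 == zeros(5, 5)) and (N2**2 == zeros(5, 5))

# dim E_2 equals the number of Jordan blocks: 3 versus 4.
print(same_charpoly, same_minpoly_exponent,
      len(N1.nullspace()), len(N2.nullspace()))
```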

Exercises

1.

Find the minimal polynomial

m (x)

of

[\begin{array}{cccc} 6 & 0 & - 6 & 13 \\ - 4 & 1 & 6 & - 8 \\ 2 & 0 & - 1 & 6 \\ - 1 & 0 & 2 & 0 \end{array}] .


2.

Let

A = [\begin{array}{cccc} 1 & 8 & - 20 & 16 \\ 8 & 13 & - 38 & 32 \\ 0 & - 8 & 21 & - 16 \\ - 4 & - 18 & 49 & - 39 \end{array}] .

Find a matrix

P

such that

D = P^{- 1} A P

is the Jordan canonical form of

A .


3.

Let

A = [\begin{array}{cccc} - 4 & 0 & 2 & - 4 \\ - 15 & - 5 & 18 & - 30 \\ - 30 & - 6 & 34 & - 60 \\ - 14 & - 3 & 17 & - 30 \end{array}] .

Find a matrix

P

such that

D = P^{- 1} A P

is the Jordan canonical form of

A .
