
Section 5.2 Finding eigenvectors numerically

We have typically found eigenvalues of a square matrix A as the roots of the characteristic polynomial det(A − λI) = 0 and the associated eigenvectors as the null space Nul(A − λI). Unfortunately, this approach is not practical when we are working with large matrices. First, finding the characteristic polynomial of a large matrix requires considerable computation, as does finding the roots of that polynomial. Second, finding the null space of a singular matrix is plagued by numerical problems, as we will see in the preview activity.
For this reason, we will explore a technique called the power method that finds numerical approximations to the eigenvalues and eigenvectors of a square matrix.

Preview Activity 5.2.1.

Let’s recall some earlier observations about eigenvalues and eigenvectors.
  1. How are the eigenvalues and associated eigenvectors of A related to those of A^{-1}?
  2. How are the eigenvalues and associated eigenvectors of A related to those of A − 3I?
  3. If λ is an eigenvalue of A, what can we say about the pivot positions of A − λI?
  4. Suppose that A = [0.8 0.4; 0.2 0.6]. Explain how we know that 1 is an eigenvalue of A and then explain why the following Sage computation is incorrect.
  5. Suppose that x_0 = [1; 0], and we define a sequence x_{k+1} = Ax_k; in other words, x_k = A^k x_0. What happens to x_k as k grows increasingly large?
  6. Explain how the eigenvalues of A are responsible for the behavior noted in the previous question.

Subsection 5.2.1 The power method

Our goal is to find a technique that produces numerical approximations to the eigenvalues and associated eigenvectors of a matrix A. We begin by searching for the eigenvalue having the largest absolute value, which is called the dominant eigenvalue. The next two examples demonstrate this technique.

Example 5.2.1.

Let’s begin with the positive stochastic matrix A = [0.7 0.6; 0.3 0.4]. We spent quite a bit of time studying this type of matrix in Section 4.5; in particular, we saw that any Markov chain will converge to the unique steady-state vector. Let’s rephrase this statement in terms of the eigenvectors of A.
This matrix has eigenvalues λ_1 = 1 and λ_2 = 0.1, so the dominant eigenvalue is λ_1 = 1. The associated eigenvectors are v_1 = [2; 1] and v_2 = [1; -1]. Suppose we begin with the vector

x_0 = [1; 0] = (1/3)v_1 + (1/3)v_2

and find

x_1 = Ax_0 = (1/3)v_1 + (1/3)(0.1)v_2
x_2 = A^2 x_0 = (1/3)v_1 + (1/3)(0.1)^2 v_2
x_3 = A^3 x_0 = (1/3)v_1 + (1/3)(0.1)^3 v_2
...
x_k = A^k x_0 = (1/3)v_1 + (1/3)(0.1)^k v_2

and so forth. Notice that the powers 0.1^k become increasingly small as k grows so that x_k ≈ (1/3)v_1 when k is large. Therefore, the vectors x_k become increasingly close to a vector in the eigenspace E_1, the eigenspace associated to the dominant eigenvalue. If we did not know the eigenvector v_1, we could use a Markov chain in this way to find a basis vector for E_1, which, as seen in Section 4.5, is essentially how the Google PageRank algorithm works.
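The convergence in this example is easy to check numerically. The following Python/NumPy sketch (not part of the original text) simply iterates x_{k+1} = Ax_k:

```python
import numpy as np

# The stochastic matrix from the example and the initial vector
# x_0 = [1; 0] = (1/3)v_1 + (1/3)v_2.
A = np.array([[0.7, 0.6],
              [0.3, 0.4]])
x = np.array([1.0, 0.0])

# Each multiplication by A damps the v_2 component by a factor of 0.1,
# so x_k rapidly approaches (1/3)v_1 = [2/3; 1/3].
for k in range(20):
    x = A @ x

print(x)
```

After 20 steps the v_2 contribution has shrunk by a factor of 0.1^20, so the printed vector agrees with (1/3)v_1 = [2/3; 1/3] to machine precision.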

Example 5.2.2.

Let’s now look at the matrix A = [2 1; 1 2], which has eigenvalues λ_1 = 3 and λ_2 = 1. The dominant eigenvalue is λ_1 = 3, and the associated eigenvectors are v_1 = [1; 1] and v_2 = [1; -1]. Once again, begin with the vector x_0 = [1; 0] = (1/2)v_1 + (1/2)v_2 so that

x_1 = Ax_0 = 3(1/2)v_1 + (1/2)v_2
x_2 = A^2 x_0 = 3^2 (1/2)v_1 + (1/2)v_2
x_3 = A^3 x_0 = 3^3 (1/2)v_1 + (1/2)v_2
...
x_k = A^k x_0 = 3^k (1/2)v_1 + (1/2)v_2.

As the figure shows, the vectors x_k are stretched by a factor of 3 in the v_1 direction and not at all in the v_2 direction. Consequently, the vectors x_k become increasingly long, but their direction becomes closer to the direction of the eigenvector v_1 = [1; 1] associated to the dominant eigenvalue.
To find an eigenvector associated to the dominant eigenvalue, we will prevent the length of the vectors x_k from growing arbitrarily large by multiplying by an appropriate scaling constant. Here is one way to do this. Given the vector x_k, we identify its component having the largest absolute value and call it m_k. We then define x̄_k = (1/m_k) x_k, which means that the component of x̄_k having the largest absolute value is 1.
For example, beginning with x_0 = [1; 0], we find x_1 = Ax_0 = [2; 1]. The component of x_1 having the largest absolute value is m_1 = 2, so we multiply by 1/m_1 = 1/2 to obtain x̄_1 = [1; 1/2]. Then x_2 = Ax̄_1 = [5/2; 2]. Now the component having the largest absolute value is m_2 = 5/2, so we multiply by 2/5 to obtain x̄_2 = [1; 4/5].
The resulting sequence of vectors x̄_k is shown in the figure. Notice how the vectors x̄_k now approach the eigenvector v_1, which gives us a way to find the eigenvector v_1 = [1; 1]. This is the power method for finding an eigenvector associated to the dominant eigenvalue of a matrix.
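This rescaling scheme takes only a few lines to express. Here is a minimal Python/NumPy sketch (not the book's Sage code) applied to the matrix of this example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x = np.array([1.0, 0.0])

for k in range(30):
    x = A @ x
    m = x[np.argmax(np.abs(x))]  # component of largest absolute value, with its sign
    x = x / m                    # rescale so that component becomes 1

print(m, x)
```

The multipliers m_k approach the dominant eigenvalue 3, and the rescaled vectors approach the eigenvector [1; 1].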

Activity 5.2.2.

Let’s begin by considering the matrix A = [0.5 0.2; 0.4 0.7] and the initial vector x_0 = [1; 0].
  1. Compute the vector x_1 = Ax_0.
  2. Find m_1, the component of x_1 that has the largest absolute value. Then form x̄_1 = (1/m_1) x_1. Notice that the component of x̄_1 having the largest absolute value is 1.
  3. Find the vector x_2 = Ax̄_1. Identify the component m_2 of x_2 having the largest absolute value. Then form x̄_2 = (1/m_2) x_2 to obtain a vector in which the component with the largest absolute value is 1.
  4. The Sage cell below defines a function that implements the power method. Define the matrix A and initial vector x0 below. The command power(A, x0, N) will print out the multiplier m and the vectors xk for N steps of the power method.
    How does this computation identify an eigenvector of the matrix A?
  5. What is the corresponding eigenvalue of this eigenvector?
  6. How do the values of the multipliers mk tell us the eigenvalue associated to the eigenvector we have found?
  7. Consider now the matrix A = [5.1 -5.7; 3.8 -4.4]. Use the power method to find the dominant eigenvalue of A and an associated eigenvector.
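The Sage cell defining power is not reproduced here, but the command it describes can be sketched in Python/NumPy as follows (a sketch, with the printing format an assumption):

```python
import numpy as np

def power(A, x0, N):
    """Run N steps of the power method, printing the multiplier m_k and the
    rescaled vector at each step; return the last of each."""
    x = np.array(x0, dtype=float)
    m = 1.0
    for k in range(N):
        x = A @ x
        m = x[np.argmax(np.abs(x))]  # component of largest absolute value
        x = x / m                    # now that component equals 1
        print(k + 1, m, x)
    return m, x

# The matrix from the start of the activity
A = np.array([[0.5, 0.2],
              [0.4, 0.7]])
m, v = power(A, [1.0, 0.0], 40)
```

Here the multipliers m_k converge to the dominant eigenvalue 0.9 of this matrix, and the vectors to an eigenvector [0.5; 1].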
Notice that the power method gives us not only an eigenvector v but also its associated eigenvalue. As in the activity, consider the matrix A = [5.1 -5.7; 3.8 -4.4], which has eigenvector v = [3; 2]. The first component has the largest absolute value, so we multiply by 1/3 to obtain v̄ = [1; 2/3]. When we multiply by A, we have Av̄ = [1.3; 0.86]. Notice that the first component still has the largest absolute value, so the multiplier m = 1.3 is the eigenvalue λ corresponding to the eigenvector. This demonstrates the fact that the multipliers m_k approach the eigenvalue λ having the largest absolute value.
Notice that the power method requires us to choose an initial vector x_0. For most choices, this method will find the eigenvalue having the largest absolute value. However, an unfortunate choice of x_0 may not. For instance, if we had chosen x_0 = v_2 in our example above, the vectors in the sequence x_k = A^k x_0 = λ_2^k v_2 will never detect the eigenvector v_1. However, it usually happens that our initial guess x_0 has some contribution from v_1 that enables us to find it.
The power method, as presented here, will fail for certain unlucky matrices. This is examined in Exercise 5.2.4.5 along with a means to improve the power method to work for all matrices.

Subsection 5.2.2 Finding other eigenvalues

The power method gives a technique for finding the dominant eigenvalue of a matrix. We can modify the method to find the other eigenvalues as well.

Activity 5.2.3.

The key to finding the eigenvalue of A having the smallest absolute value is to note that the eigenvectors of A are the same as those of A^{-1}.
  1. If v is an eigenvector of A with associated eigenvalue λ, explain why v is an eigenvector of A^{-1} with associated eigenvalue λ^{-1}.
  2. Explain why the eigenvalue of A having the smallest absolute value is the reciprocal of the dominant eigenvalue of A^{-1}.
  3. Explain how to use the power method applied to A^{-1} to find the eigenvalue of A having the smallest absolute value.
  4. If we apply the power method to A^{-1}, we begin with an initial vector x_0 and generate the sequence x_{k+1} = A^{-1} x_k. It is not computationally efficient to compute A^{-1}, however, so instead we solve the equation Ax_{k+1} = x_k. Explain why an LU factorization of A is useful for implementing the power method applied to A^{-1}.
  5. The following Sage cell defines a command called inverse_power that applies the power method to A^{-1}. That is, inverse_power(A, x0, N) prints the vectors x_k, where x_{k+1} = A^{-1} x_k, and multipliers 1/m_k, which approximate the eigenvalue of A. Use it to find the eigenvalue of A = [5.1 -5.7; 3.8 -4.4] having the smallest absolute value.
  6. The inverse power method only works if A is invertible. If A is not invertible, what is its eigenvalue having the smallest absolute value?
  7. Use the power method and the inverse power method to find the eigenvalues and associated eigenvectors of the matrix A = [0.23 2.33; 1.16 1.08].
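An inverse_power command along the lines described in the activity can be sketched in Python/NumPy. As in item 4, each step solves Ax_{k+1} = x_k rather than forming A^{-1}; a production version would compute an LU factorization of A once (for instance with scipy.linalg.lu_factor) and reuse it for every solve, but for clarity this sketch calls a generic solver each step:

```python
import numpy as np

def inverse_power(A, x0, N):
    """N steps of the power method applied to A^{-1} without forming the inverse."""
    x = np.array(x0, dtype=float)
    m = 1.0
    for k in range(N):
        x = np.linalg.solve(A, x)    # x_{k+1} satisfies A x_{k+1} = x_k
        m = x[np.argmax(np.abs(x))]  # multiplier m_k for the inverse iteration
        x = x / m                    # rescale so the largest component is 1
    return 1.0 / m, x                # 1/m_k approximates the eigenvalue of A

# The 2x2 matrix from the previous activity, whose eigenvalues are 0.9 and 0.3
A = np.array([[0.5, 0.2],
              [0.4, 0.7]])
lam, v = inverse_power(A, [1.0, 0.0], 40)
```

Here lam approaches 0.3, the eigenvalue of A having the smallest absolute value, and v an associated eigenvector [1; -1].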
With the power method and the inverse power method, we can now find the eigenvalues of a matrix A having the largest and smallest absolute values. With one more modification, we can find all the eigenvalues of A.

Activity 5.2.4.

Remember that the absolute value of a number tells us how far that number is from 0 on the real number line. We may therefore think of the inverse power method as telling us the eigenvalue closest to 0.
  1. If v is an eigenvector of A with associated eigenvalue λ, explain why v is an eigenvector of A − sI where s is some scalar.
  2. What is the eigenvalue of A − sI associated to the eigenvector v?
  3. Explain why the eigenvalue of A closest to s corresponds to the eigenvalue of A − sI closest to 0.
  4. Explain why applying the inverse power method to A − sI gives the eigenvalue of A closest to s.
  5. Consider the matrix A = [3.6 1.6 4.0 7.6; 1.6 2.2 4.4 4.1; 3.9 4.3 9.0 0.6; 7.6 4.1 0.6 5.0]. If we use the power method and inverse power method, we find two eigenvalues, λ_1 = 16.35 and λ_2 = 0.75. Viewing these eigenvalues on a number line, we know that the other eigenvalues lie in the range between −λ_1 and λ_1, as shaded in Figure 5.2.3.
    Figure 5.2.3. The range of eigenvalues of A.
    The Sage cell below has a function find_closest_eigenvalue(A, s, x, N) that implements N steps of the inverse power method using the matrix A − sI and an initial vector x. This function prints approximations to the eigenvalue of A closest to s and its associated eigenvector. By trying different values of s in the shaded regions of the number line shown in Figure 5.2.3, find the other two eigenvalues of A.
  6. Write a list of the four eigenvalues of A in increasing order.
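A find_closest_eigenvalue command of this kind can be sketched in Python/NumPy (a sketch, not the book's Sage cell). Since the section only gives eigenvalue estimates for the 4×4 matrix, the usage below checks the idea on the small matrix of Example 5.2.2, whose eigenvalues 3 and 1 we know exactly:

```python
import numpy as np

def find_closest_eigenvalue(A, s, x0, N):
    """Inverse power method applied to A - sI: estimates the eigenvalue of A
    closest to the shift s, together with an associated eigenvector."""
    B = A - s * np.eye(A.shape[0])
    x = np.array(x0, dtype=float)
    m = 1.0
    for k in range(N):
        x = np.linalg.solve(B, x)    # one step of the power method for B^{-1}
        m = x[np.argmax(np.abs(x))]  # m_k approximates 1/(lambda - s)
        x = x / m
    return s + 1.0 / m, x

# A shift of s = 0.8 should pick out the eigenvalue of A closest to 0.8,
# namely 1, with eigenvector [1; -1].
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = find_closest_eigenvalue(A, 0.8, [1.0, 0.0], 30)
```

Trying shifts in different regions of the number line, as the activity suggests, then locates the remaining eigenvalues of a larger matrix one at a time.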
There are some restrictions on the matrices to which this technique applies as we have assumed that the eigenvalues of A are real and distinct. If A has repeated or complex eigenvalues, this technique will need to be modified, as explored in some of the exercises.

Subsection 5.2.3 Summary

We have explored the power method as a tool for numerically approximating the eigenvalues and eigenvectors of a matrix.
  • After choosing an initial vector x_0, we define the sequence x_{k+1} = Ax_k. As k grows larger, the direction of the vectors x_k closely approximates the direction of the eigenspace corresponding to the eigenvalue λ_1 having the largest absolute value.
  • We normalize the vectors x_k by multiplying by 1/m_k, where m_k is the component having the largest absolute value. In this way, the vectors x̄_k approach an eigenvector associated to λ_1, and the multipliers m_k approach the eigenvalue λ_1.
  • To find the eigenvalue having the smallest absolute value, we apply the power method using the matrix A^{-1}.
  • To find the eigenvalue closest to some number s, we apply the power method using the matrix (A − sI)^{-1}.

Exercises 5.2.4 Exercises

This Sage cell has the commands power, inverse_power, and find_closest_eigenvalue that we have developed in this section. After evaluating this cell, these commands will be available in any other cell on this page.

1.

Suppose that A is a matrix having eigenvalues 3, 0.2, 1, and 4.
  1. What are the eigenvalues of A^{-1}?
  2. What are the eigenvalues of A+7I?

2.

Use the commands power, inverse_power, and find_closest_eigenvalue to approximate the eigenvalues and associated eigenvectors of the following matrices.
  1. A = [2 2; 8 2].
  2. A = [0.6 0.7; 0.5 0.2].
  3. A = [1.9 16.0 13.0 27.0; 2.4 20.3 4.6 17.7; 0.51 11.7 1.4 13.1; 2.1 15.3 6.9 20.5].

3.

Use the techniques we have seen in this section to find the eigenvalues of the matrix
A = [14.6 9.0 14.1 5.8 13.0; 27.8 4.2 16.0 0.9 21.3; 5.5 3.4 3.4 3.3 1.1; 25.4 11.3 15.4 4.7 20.3; 33.7 14.8 22.5 9.7 26.6].

4.

Consider the matrix A = [0 1; 4 0].
  1. Describe what happens if we apply the power method and the inverse power method using the initial vector x_0 = [1; 0].
  2. Find the eigenvalues of this matrix and explain this observed behavior.
  3. How can we apply the techniques of this section to find the eigenvalues of A?

5.

We have seen that the matrix A = [1 2; 2 1] has eigenvalues λ_1 = 3 and λ_2 = -1 and associated eigenvectors v_1 = [1; 1] and v_2 = [1; -1].
  1. Describe what happens when we apply the inverse power method using the initial vector x_0 = [1; 0].
  2. Explain why this is happening and provide a contrast with how the power method usually works.
  3. How can we modify the power method to give the dominant eigenvalue in this case?

6.

Suppose that A is a 2×2 matrix with eigenvalues 4 and 3 and that B is a 2×2 matrix with eigenvalues 4 and 1. If we apply the power method to find the dominant eigenvalue of these matrices to the same degree of accuracy, which matrix will require more steps in the algorithm? Explain your response.

7.

Suppose that we apply the power method to the matrix A with an initial vector x0 and find the eigenvalue λ=3 and eigenvector v. Suppose that we then apply the power method again with a different initial vector and find the same eigenvalue λ=3 but a different eigenvector w. What can we conclude about the matrix A in this case?

8.

The power method we have developed only works if the matrix has real eigenvalues. Suppose that A is a 2×2 matrix that has a complex eigenvalue λ=2+3i. What would happen if we apply the power method to A?

9.

Consider the matrix A = [1 1; 0 1].
  1. Find the eigenvalues and associated eigenvectors of A.
  2. Make a prediction about what happens if we apply the power method and the inverse power method to find eigenvalues of A.
  3. Verify your prediction using Sage.