You might reasonably wonder: where does this definition come from? And why should I care? We are assuming that you saw at least a basic introduction to eigenvalues in your first course on linear algebra, but that course probably focused on mechanics. Possibly you learned that diagonalizing a matrix lets you compute powers of that matrix.
But why should we be interested in computing powers (in particular, large powers) of a matrix? An important context comes from the study of discrete linear dynamical systems, as well as Markov chains, where the evolution of a state is modelled by repeated multiplication of a vector by a matrix.
When we're able to diagonalize our matrix using eigenvalues and eigenvectors, not only does it become easy to compute powers of a matrix, it also enables us to see that the entire process is just a linear combination of geometric sequences! If you have completed Section 2.5, you probably will not be surprised to learn that the polynomial roots you found are, in fact, eigenvalues of a suitable matrix.
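To make the remark about powers concrete, here is a minimal SymPy sketch. The \(2\times 2\) matrix and the initial state are hypothetical, chosen only for illustration; they do not come from the text.

```python
from sympy import Matrix, Rational, diag, zeros

# Hypothetical transition matrix (each column sums to 1); not from the text.
A = Matrix([[Rational(1, 2), Rational(1, 4)],
            [Rational(1, 2), Rational(3, 4)]])

# diagonalize() returns P and D with A = P * D * P**-1.
P, D = A.diagonalize()

k = 10
# A power of a diagonal matrix is computed entry by entry on the diagonal,
# so each diagonal entry contributes a geometric sequence in k.
D_k = diag(*[D[i, i]**k for i in range(D.rows)])
A_k = P * D_k * P.inv()

# The difference from direct repeated multiplication is the zero matrix.
print((A_k - A**k) == zeros(2, 2))

# Evolve a hypothetical initial state k steps.
x0 = Matrix([1, 0])
x_k = A_k * x0
print(x_k)
```

Each entry of `x_k` is a combination of the \(k\)th powers of the eigenvalues on the diagonal of `D`, which is the "linear combination of geometric sequences" described above.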
Eigenvalues and eigenvectors can just as easily be defined for a general linear operator \(T:V\to V\text{.}\) In this context, an eigenvector \(\xx\) is sometimes referred to as a characteristic vector (or characteristic direction) for \(T\text{,}\) since the property \(T(\xx)=\lambda \xx\) simply states that the transformed vector \(T(\xx)\) is parallel to the original vector \(\xx\text{.}\) Some linear algebra textbooks that focus more on general linear transformations frame this topic in the context of invariant subspaces for a linear operator.
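A subspace \(U\subseteq V\) is invariant with respect to \(T\) if \(T(\uu)\in U\) for all \(\uu\in U\text{.}\) Note that if \(\xx\) is an eigenvector of \(T\text{,}\) then \(\spn\{\xx\}\) is an invariant subspace. To see this, note that if \(T(\xx)=\lambda \xx\) and \(\yy=k\xx\text{,}\) then
\begin{equation*}
T(\yy) = T(k\xx) = kT(\xx) = k(\lambda \xx) = \lambda(k\xx) = \lambda \yy\text{,}
\end{equation*}
so \(T(\yy)\in \spn\{\xx\}\text{.}\)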
For the matrix \(A = \bbm -1\amp 0\amp 3\\1\amp -1\amp 0\\1\amp 0\amp 1\ebm\text{,}\) match each vector on the left with the corresponding eigenvalue on the right. (For typographical reasons, column vectors have been transposed.)
where \(I_n\) denotes the \(n\times n\) identity matrix. Thus, if \(\lambda\) is an eigenvalue of \(A\text{,}\) any corresponding eigenvector is an element of \(\nll(A-\lambda I_n)\text{.}\)
Note that \(E_\lambda(A)\) can be defined for any real number \(\lambda\text{,}\) whether or not \(\lambda\) is an eigenvalue. However, the eigenvalues of \(A\) are distinguished by the property that there is a nonzero solution to (4.1.1). Furthermore, we know that (4.1.1) can only have nontrivial solutions if the matrix \(A-\lambda I_n\) is not invertible. We also know that \(A-\lambda I_n\) is non-invertible if and only if \(\det (A-\lambda I_n) = 0\text{.}\) This gives us the following theorem.
To prove a theorem involving a "the following are equivalent" statement, a good strategy is to show that the first implies the second, the second implies the third, and the third implies the first. The ideas needed for the proof are given in the paragraph preceding the theorem. See if you can turn them into a formal proof.
The polynomial \(c_A(x)=\det(xI_n -A)\) is called the characteristic polynomial of \(A\text{.}\) (Note that \(\det(x I_n-A) = (-1)^n\det(A-x I_n)\text{.}\) We choose this order so that the coefficient of \(x^n\) is always 1.) The equation
\begin{equation}
\det(xI_n - A) = 0\tag{4.1.2}
\end{equation}
is called the characteristic equation of \(A\text{.}\) The solutions to this equation are precisely the eigenvalues of \(A\text{.}\)
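As a hedged illustration of the characteristic polynomial and characteristic equation, the following SymPy sketch computes \(\det(xI_n-A)\) directly and solves for its roots. The matrix is an assumption made for illustration: it is the matrix for which \(A+I_n\) has every entry equal to 1.

```python
from sympy import Matrix, symbols, eye, expand, factor, roots

x = symbols('x')

# Illustrative matrix (an assumption): A + I has every entry equal to 1.
A = Matrix([[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]])

# Characteristic polynomial c_A(x) = det(x*I_n - A), computed from the definition.
c_A = expand((x * eye(3) - A).det())
print(c_A)
print(factor(c_A))

# SymPy's built-in charpoly uses the same det(x*I - A) convention.
c_builtin = A.charpoly(x).as_expr()

# Solutions of the characteristic equation, with multiplicities.
eigenvalue_multiplicities = roots(c_A, x)
print(eigenvalue_multiplicities)
```

Because the polynomial is computed as \(\det(xI_n-A)\) rather than \(\det(A-xI_n)\), the coefficient of \(x^3\) is 1, as noted above.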
A careful study of eigenvalues and eigenvectors relies heavily on polynomials. An interesting fact is that we can plug any square matrix into a polynomial! Given the polynomial \(p(x) = a_0+a_1x+a_2 x^2 + \cdots + a_nx^n\) and an \(n\times n\) matrix \(A\text{,}\) we define
\begin{equation*}
p(A) = a_0I_n+a_1A+a_2A^2+\cdots + a_nA^n\text{.}
\end{equation*}
One interesting aspect of this is the relationship between the eigenvalues of \(A\) and the eigenvalues of \(p(A)\text{.}\) For example, if \(A\) has the eigenvalue \(\lambda\text{,}\) see if you can prove that \(A^k\) has the eigenvalue \(\lambda^k\text{.}\)
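The relationship between the eigenvalues of \(A\) and those of \(p(A)\) can be explored numerically before attempting a proof. The sketch below is illustrative only: the helper `poly_at_matrix`, the matrix, and the polynomial \(p(x)=1-3x+x^2\) are all assumptions, not taken from the text.

```python
from sympy import Matrix, eye, zeros

def poly_at_matrix(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_n*A**n by Horner's rule.

    coeffs is the list [a_0, a_1, ..., a_n] (hypothetical helper).
    """
    result = zeros(A.rows, A.cols)
    for a in reversed(coeffs):
        result = result * A + a * eye(A.rows)
    return result

# Illustrative matrix with eigenvector v = (1, 1, 1) for eigenvalue 2.
A = Matrix([[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]])
v = Matrix([1, 1, 1])
lam = 2

coeffs = [1, -3, 1]  # p(x) = 1 - 3x + x^2
p_lam = sum(a * lam**i for i, a in enumerate(coeffs))

# v is also an eigenvector of p(A), with eigenvalue p(lam).
print(poly_at_matrix(coeffs, A) * v == p_lam * v)

# Special case p(x) = x^k: A^k has eigenvalue lam^k.
k = 3
print(A**k * v == lam**k * v)
```

A check like this is evidence, not a proof; the proof follows from applying \(A\) repeatedly to \(A\xx=\lambda\xx\).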
In order for certain properties of a matrix \(A\) to be satisfied, the eigenvalues of \(A\) need to have particular values. Match each property of a matrix \(A\) on the left with the corresponding information about the eigenvalues of \(A\) on the right. Be sure that you can justify your answers with a suitable proof.
Recall that a matrix \(B\) is said to be similar to a matrix \(A\) if there exists an invertible matrix \(P\) such that \(B = P^{-1}AP\text{.}\) Much of what follows concerns the question of whether or not a given \(n\times n\) matrix \(A\) is diagonalizable.
The roots of the characteristic polynomial are our eigenvalues, so we have \(\lambda_1=-1\) and \(\lambda_2=2\text{.}\) Note that the first eigenvalue comes from a repeated root. This is typically where things get interesting. If an eigenvalue does not come from a repeated root, then there will only be one (independent) eigenvector that corresponds to it. (That is, \(\dim E_\lambda(A)=1\text{.}\)) If an eigenvalue is repeated, it could have more than one eigenvector, but this is not guaranteed.
We find that \(A-(-1)I_n = \bbm 1\amp 1\amp 1\\1\amp 1\amp 1\\1\amp 1\amp 1\ebm\text{,}\) which has reduced row-echelon form \(\bbm 1\amp 1\amp 1\\0\amp 0\amp 0\\0\amp 0\amp 0\ebm\text{.}\) Solving for the nullspace, we find that there are two independent eigenvectors, for example \(\bbm -1\\1\\0\ebm\) and \(\bbm -1\\0\\1\ebm\text{.}\)
For the second eigenvalue, we have \(A-2I_n = \bbm -2\amp 1\amp 1\\1\amp -2\amp 1\\1\amp 1\amp -2\ebm\text{,}\) which has reduced row-echelon form \(\bbm 1\amp 0\amp -1\\0\amp 1\amp -1\\0\amp 0\amp 0\ebm\text{.}\) An eigenvector in this case is given by \(\bbm 1\\1\\1\ebm\text{.}\)
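The row reductions in this example can be reproduced with SymPy. This is a sketch under one assumption: the matrix \(A\) is reconstructed from the displayed \(A-(-1)I_n\), whose entries are all 1.

```python
from sympy import Matrix, eye

# Reconstructed from A - (-1)I_n having every entry equal to 1 (an assumption).
A = Matrix([[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]])

eigenspace_bases = {}
for lam in (-1, 2):
    M = A - lam * eye(3)
    rref_form, pivot_columns = M.rref()   # reduced row-echelon form of A - lam*I
    basis = M.nullspace()                 # basis for E_lam(A) = null(A - lam*I)
    eigenspace_bases[lam] = basis
    print(lam, rref_form, [list(b) for b in basis])

# Every basis vector should satisfy A*v = lam*v.
all_ok = all(A * b == lam * b
             for lam, basis in eigenspace_bases.items()
             for b in basis)
print(all_ok)
```

The basis vectors SymPy returns may differ from a hand computation by scaling or by a different choice of free variables; any basis of the nullspace is equally valid.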
If \(c_A(x) = (x-\lambda)^m q(x)\text{,}\) where \(q(x)\) is not divisible by \(x-\lambda\text{,}\) then we say that \(\lambda\) is an eigenvalue of multiplicity \(m\text{.}\) In the example above, \(\lambda_1=-1\) has multiplicity 2, and \(\lambda_2=2\) has multiplicity 1.
The eigenvects command in SymPy takes a square matrix as input, and outputs a list of tuples (one for each eigenvalue). For a given eigenvalue, the corresponding tuple has the form (eigenvalue, multiplicity, eigenvectors). Using SymPy to solve Example 4.1.11 looks as follows:
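The original code cell is not reproduced here. A minimal sketch of such a call, assuming the example's matrix is the one for which \(A-(-1)I_n\) has every entry equal to 1, might look like this:

```python
from sympy import Matrix

# Matrix reconstructed from the row reductions in the example (an assumption).
A = Matrix([[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]])

# Each entry is a tuple (eigenvalue, multiplicity, [basis vectors of the eigenspace]).
for eigenvalue, multiplicity, vectors in A.eigenvects():
    print(eigenvalue, multiplicity, [list(v) for v in vectors])

# Summarize as eigenvalue -> (multiplicity, dimension of eigenspace).
summary = {ev: (mult, len(vecs)) for ev, mult, vecs in A.eigenvects()}
print(summary)
```

The length of the third component is \(\dim E_\lambda(A)\), so it can be compared directly against the multiplicity in the second component.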
Let \(\{\xx_1,\ldots, \xx_k\}\) be a set of linearly independent eigenvectors of a matrix \(A\text{,}\) with corresponding eigenvalues \(\lambda_1,\ldots, \lambda_k\) (not necessarily distinct). Extend this set to a basis \(\{\xx_1,\ldots, \xx_k,\xx_{k+1},\ldots, \xx_n\}\text{,}\) and let \(P=\bbm \xx_1\amp \cdots \amp \xx_n\ebm\) be the matrix whose columns are the basis vectors. (Note that \(P\) is necessarily invertible.) Then
We can use Lemma 4.1.13 to prove that \(\dim E_\lambda(A)\leq m\) as follows. Suppose \(\{\xx_1,\ldots, \xx_k\}\) is a basis for \(E_\lambda(A)\text{.}\) Then this is a linearly independent set of eigenvectors, so our lemma guarantees the existence of a matrix \(P\) such that
This shows that \(c_A(x)\) is divisible by \((x-\lambda)^k\text{.}\) Since \(m\) is the largest integer such that \(c_A(x)\) is divisible by \((x-\lambda)^m\text{,}\) we must have \(\dim E_\lambda(A)=k\leq m\text{.}\)
Let \(\vv_1,\ldots, \vv_k\) be eigenvectors corresponding to distinct eigenvalues \(\lambda_1,\ldots, \lambda_k\) of a matrix \(A\text{.}\) Then \(\{\vv_1,\ldots, \vv_k\}\) is linearly independent.
The proof is by induction on the number \(k\) of distinct eigenvalues. Since eigenvectors are nonzero, any set consisting of a single eigenvector \(\vv_1\) is independent. Suppose, then, that a set of eigenvectors corresponding to \(k-1\) distinct eigenvalues is independent, and let \(\vv_1,\ldots, \vv_k\) be eigenvectors corresponding to distinct eigenvalues \(\lambda_1,\ldots, \lambda_k\text{.}\)
By hypothesis, the set \(\{\vv_2,\ldots, \vv_k\}\) of \(k-1\) eigenvectors is linearly independent. We know that \(\lambda_j-\lambda_1\neq 0\) for \(j=2,\ldots, k\text{,}\) since the eigenvalues are all distinct. Therefore, the only way this linear combination can equal zero is if \(c_2=0,\ldots, c_k=0\text{.}\) This leaves us with \(c_1\vv_1=\zer\text{,}\) but \(\vv_1\neq \zer\text{,}\) so \(c_1=0\) as well.
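The conclusion of this theorem can be checked on a small example. The sketch below uses a hypothetical upper triangular matrix (not from the text) whose diagonal entries 1, 4, and 6 are its distinct eigenvalues, picks one eigenvector for each, and confirms that the resulting set is independent by computing a rank.

```python
from sympy import Matrix

# Hypothetical matrix with three distinct eigenvalues (its diagonal entries).
B = Matrix([[1, 2, 3],
            [0, 4, 5],
            [0, 0, 6]])

# One eigenvector for each distinct eigenvalue.
chosen = [vectors[0] for _, _, vectors in B.eigenvects()]

# Place the eigenvectors in the columns of a matrix; independence means full rank.
V = Matrix.hstack(*chosen)
print(V.rank())
```

A rank equal to the number of columns means the only solution of \(c_1\vv_1+\cdots+c_k\vv_k=\zer\) is the trivial one.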
Theorem 4.1.14 tells us that vectors from different eigenspaces are independent. In particular, a union of bases from each eigenspace will be an independent set. Therefore, Theorem 4.1.12 provides an initial criterion for diagonalization: if the dimension of each eigenspace \(E_\lambda(A)\) is equal to the multiplicity of \(\lambda\text{,}\) then \(A\) is diagonalizable.
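The criterion just stated can be sketched in SymPy by comparing, for each eigenvalue, the number of independent eigenvectors with the multiplicity. The helper name `dims_match_multiplicities` and the \(2\times 2\) matrix `J` are hypothetical, introduced only for illustration; the \(3\times 3\) matrix is the one reconstructed from the row reductions earlier in this section.

```python
from sympy import Matrix

def dims_match_multiplicities(M):
    """True when dim E_lambda(M) equals the multiplicity of every eigenvalue."""
    return all(len(vectors) == multiplicity
               for _, multiplicity, vectors in M.eigenvects())

A = Matrix([[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]])
J = Matrix([[1, 1],
            [0, 1]])   # eigenvalue 1 has multiplicity 2, but a 1-dimensional eigenspace

print(dims_match_multiplicities(A), dims_match_multiplicities(J))

# When the criterion holds, the eigenvectors assemble into an invertible P
# with P**-1 * A * P diagonal.
P, D = A.diagonalize()
print(P.inv() * A * P == D)
```

Note that `eigenvects` works over the complex numbers, so this check says nothing about whether the eigenvalues are real.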
Our focus in the next section will be on diagonalization of symmetric matrices, and soon we will see that for such matrices, eigenvectors corresponding to different eigenvalues are not just independent, but orthogonal.
The matrix \(A={\left[\begin{array}{ccc}
-8 \amp -4 \amp -12\cr
-4 \amp -8 \amp -12\cr
4 \amp 4 \amp 8
\end{array}\right]}\) has two real eigenvalues, one of multiplicity \(1\) and one of multiplicity \(2\text{.}\) Find the eigenvalues and a basis for each eigenspace.
has two real eigenvalues \(\lambda_1 \lt \lambda_2\text{.}\) Find these eigenvalues, their multiplicities, and the dimensions of their corresponding eigenspaces.
Suppose \(A\) is an invertible \(n\times n\) matrix and \(\vec{v}\) is an eigenvector of \(A\) with associated eigenvalue \(3\text{.}\) Convince yourself that \(\vec{v}\) is an eigenvector of the following matrices, and find the associated eigenvalues.
be eigenvectors of the matrix \(A\) which correspond to the eigenvalues \(\lambda_1 = -1\text{,}\) \(\lambda_2 = 0\text{,}\) and \(\lambda_3 = 4\text{,}\) respectively, and let
Verify that \(A={\left[\begin{array}{cc}
0 \amp 1\cr
1 \amp -1
\end{array}\right]}\) is similar to itself by finding a \(T\) such that \(A = T^{-1} A T\text{.}\)
We know that \(A\) and \(B={\left[\begin{array}{cc}
1 \amp -1\cr
1 \amp -2
\end{array}\right]}\) are similar since \(A = P^{-1} B P\) where \(P = {\left[\begin{array}{cc}
1 \amp -1\cr
2 \amp -3
\end{array}\right]}\text{.}\)
We also know that \(B\) and \(C={\left[\begin{array}{cc}
-3 \amp 5\cr
-1 \amp 2
\end{array}\right]}\) are similar since \(B = Q^{-1} C Q\) where \(Q = {\left[\begin{array}{cc}
1 \amp 1\cr
1 \amp 0
\end{array}\right]}\text{.}\)