The Spectral Theorem has motivated the past few sections. In particular, we applied the fact that symmetric matrices can be orthogonally diagonalized to simplify quadratic forms, which enabled us to use principal component analysis to reduce the dimension of a dataset.
But what can we do with matrices that are not symmetric or even square? For instance, a matrix such as $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ is not diagonalizable, much less orthogonally so, and a non-square matrix has no eigenvectors at all.
In this section, we will develop a description of matrices called the singular value decomposition that is, in many ways, analogous to an orthogonal diagonalization. For example, we have seen that any symmetric matrix $A$ can be written in the form $A = QDQ^T$, where $Q$ is an orthogonal matrix and $D$ is diagonal. A singular value decomposition will have the form $A = U\Sigma V^T$, where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal. Most notably, we will see that every matrix has a singular value decomposition, whether it's symmetric or not.
Recall how the orthogonal diagonalization of a symmetric matrix is formed: if $A$ is symmetric, we write $A = QDQ^T$, where the diagonal entries of $D$ are the eigenvalues of $A$ and the columns of $Q$ are the associated eigenvectors. Moreover, the eigenvalues are related to the maximum and minimum values of the associated quadratic form $q_A(\mathbf{x}) = \mathbf{x} \cdot (A\mathbf{x})$ among all unit vectors.
A general matrix, particularly a matrix that is not square, may not have eigenvalues and eigenvectors, but we can discover analogous features, called singular values and singular vectors, by studying a function somewhat similar to a quadratic form. More specifically, any matrix $A$ defines a function $$l_A(\mathbf{x}) = |A\mathbf{x}|,$$
which measures the length of $A\mathbf{x}$. For example, if $D$ is a diagonal matrix with diagonal entries $a$ and $b$, then $l_D(\mathbf{x}) = \sqrt{a^2x_1^2 + b^2x_2^2}$. The presence of the square root means that this function is not a quadratic form. We can, however, define the singular values and vectors by looking for the maximum and minimum of this function among all unit vectors $\mathbf{x}$.
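To get a feel for this function, we can sample $l_A(\mathbf{x})$ over many unit vectors numerically. The following sketch, which is not part of the original text, uses NumPy and an arbitrarily chosen matrix $A$; the estimated maximum and minimum preview the singular values defined below.

```python
import numpy as np

# An arbitrarily chosen 2x2 matrix; replace it with any matrix of interest.
A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])

# Sample unit vectors x = (cos t, sin t) around the circle.
t = np.linspace(0, 2 * np.pi, 10000)
X = np.vstack([np.cos(t), np.sin(t)])    # each column is a unit vector

lengths = np.linalg.norm(A @ X, axis=0)  # l_A(x) = |Ax| for each sample

i_max, i_min = lengths.argmax(), lengths.argmin()
print("max of l_A:", lengths[i_max], "in direction", X[:, i_max])
print("min of l_A:", lengths[i_min], "in direction", X[:, i_min])
```

For this choice of $A$, the two extremes come out near 2.83 and 1.41, which will turn out to be the singular values $2\sqrt{2}$ and $\sqrt{2}$.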
You may choose a matrix $A$ using the four sliders at the top of the diagram. On the left, you may vary the red unit vector $\mathbf{x}$ by clicking and dragging the head of the vector.
The first singular value $\sigma_1$ of $A$ is the maximum value of $l_A(\mathbf{x})$ over all unit vectors, and an associated right singular vector $\mathbf{v}_1$ is a unit vector describing a direction in which this maximum occurs.
The second singular value $\sigma_2$ is the minimum value of $l_A(\mathbf{x})$ over all unit vectors, and an associated right singular vector $\mathbf{v}_2$ is a unit vector describing a direction in which this minimum occurs.
Here's how we can find the singular values and right singular vectors without using the diagram. Remember that $$l_A(\mathbf{x})^2 = |A\mathbf{x}|^2 = (A\mathbf{x}) \cdot (A\mathbf{x}) = \mathbf{x} \cdot (A^TA\mathbf{x}) = q_{A^TA}(\mathbf{x}),$$ where $G = A^TA$ is the Gram matrix associated to $A$. Since $G$ is symmetric, it is orthogonally diagonalizable. Find $G = A^TA$ and an orthogonal diagonalization of it.
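If you'd rather compute than read values off the diagram, here is a minimal NumPy sketch of this step, reusing the illustrative matrix from above (the activity's own matrix is not reproduced in the text). `numpy.linalg.eigh` is designed for symmetric matrices and returns orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])   # illustrative matrix; substitute your own

G = A.T @ A                   # the Gram matrix, always symmetric

# eigh returns eigenvalues in increasing order and orthonormal
# eigenvectors as the columns of Q, so that G = Q D Q^T.
eigenvalues, Q = np.linalg.eigh(G)
print("eigenvalues of G:", eigenvalues)    # here: [2. 8.]
print("orthonormal eigenvectors:\n", Q)
```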
What is the maximum value of the quadratic form $q_G(\mathbf{x})$ among all unit vectors, and in which direction does it occur? What is the minimum value of $q_G(\mathbf{x})$, and in which direction does it occur?
Because $l_A(\mathbf{x})^2 = q_G(\mathbf{x})$, the first singular value $\sigma_1$ will be the square root of the maximum value of $q_G(\mathbf{x})$ and the second singular value $\sigma_2$ the square root of the minimum. Verify that the singular values that you found from the diagram are the square roots of the maximum and minimum values of $q_G(\mathbf{x})$.
As this activity shows, the singular values of $A$ are the maximum and minimum values of $l_A(\mathbf{x})$ among all unit vectors, and the right singular vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are the directions in which they occur. The key to finding the singular values and vectors is to utilize the Gram matrix $G = A^TA$ and its associated quadratic form $q_G(\mathbf{x})$. We will illustrate with some more examples.
We now know that the maximum value of the quadratic form $q_G(\mathbf{x})$ is 8, which occurs in the direction $\mathbf{v}_1$. Since $l_A(\mathbf{x})^2 = q_G(\mathbf{x})$, this tells us that the maximum value of $l_A(\mathbf{x})$, the first singular value, is $\sigma_1 = \sqrt{8} = 2\sqrt{2}$ and that this occurs in the direction of the first right singular vector $\mathbf{v}_1$.
Because the right singular vectors, the columns of $V$, are eigenvectors of the symmetric matrix $A^TA$, they form an orthonormal basis, which means that $V$ is orthogonal. Therefore, since each left singular vector satisfies $A\mathbf{v}_i = \sigma_i\mathbf{u}_i$, we have $AV = U\Sigma$. This gives the singular value decomposition $$A = U\Sigma V^{-1} = U\Sigma V^T.$$
The singular values of $A$ are the square roots of the eigenvalues of $A^TA$; that is, $\sigma_i = \sqrt{\lambda_i}$. By convention, the singular values are listed in decreasing order: $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$. The right singular vectors $\mathbf{v}_i$ are the associated eigenvectors of $A^TA$.
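This characterization is easy to verify numerically: `numpy.linalg.svd` returns the singular values in decreasing order, and they should agree with the square roots of the eigenvalues of $A^TA$. A quick check with the same illustrative matrix:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])

sigma = np.linalg.svd(A, compute_uv=False)        # decreasing order
lam = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]  # eigenvalues, decreasing

print(sigma)          # [2.828... 1.414...]
print(np.sqrt(lam))   # the same values
```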
In fact, the left singular vectors $\mathbf{u}_i = \frac{1}{\sigma_i}A\mathbf{v}_i$ will also form an orthonormal basis. To see this, suppose that the associated singular values are nonzero. We then have: $$\mathbf{u}_i \cdot \mathbf{u}_j = \frac{1}{\sigma_i\sigma_j}(A\mathbf{v}_i) \cdot (A\mathbf{v}_j) = \frac{1}{\sigma_i\sigma_j}\,\mathbf{v}_i \cdot (A^TA\mathbf{v}_j) = \frac{\sigma_j^2}{\sigma_i\sigma_j}\,\mathbf{v}_i \cdot \mathbf{v}_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$
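As a numerical sanity check (again with the illustrative matrix rather than the text's example), we can form the $\mathbf{u}_i$ this way and confirm that they are orthonormal:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])

lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]   # reorder to decreasing eigenvalues
sigma = np.sqrt(lam)

U = (A @ V) / sigma              # column i is u_i = A v_i / sigma_i
print(U.T @ U)                   # the identity matrix: the u_i are orthonormal
```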
The case of a symmetric matrix $A$ is special. With a little thought, it's possible to relate its singular value decomposition to an orthogonal diagonalization of $A$, using the fact that $A^TA = A^2$.
In this activity, we will construct the singular value decomposition of a $2\times 3$ matrix $A$. Notice that this matrix is not square, so there are no eigenvalues and eigenvectors associated to it.
Construct the Gram matrix $G = A^TA$ and find an orthogonal diagonalization of it.
Identify the singular values of $A$ and the right singular vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. What is the dimension of these vectors? How many nonzero singular values are there?
Find the left singular vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ using the fact that $\mathbf{u}_i = \frac{1}{\sigma_i}A\mathbf{v}_i$. What is the dimension of these vectors? What happens if you try to find a third left singular vector in this way?
As before, form the orthogonal matrices $U$ and $V$ from the left and right singular vectors. What are the shapes of $U$ and $V$? How do these shapes relate to the number of rows and columns of $A$?
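Since the activity's matrix appears only in an untranscribed display, here is the full computation for a hypothetical $2\times 3$ matrix, following exactly the steps above. Note that the Gram matrix has a zero eigenvalue, so only two left singular vectors arise from $\mathbf{u}_i = \frac{1}{\sigma_i}A\mathbf{v}_i$:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0]])          # hypothetical 2x3 matrix

G = A.T @ A                                # 3x3 Gram matrix
lam, V = np.linalg.eigh(G)
lam, V = lam[::-1], V[:, ::-1]             # decreasing eigenvalue order
sigma = np.sqrt(np.clip(lam, 0, None))     # clip tiny negatives from roundoff
print("singular values:", sigma)           # [sqrt(3), sqrt(2), 0]

# Left singular vectors exist only for the nonzero singular values.
r = int(np.sum(sigma > 1e-10))             # here r = 2
U = (A @ V[:, :r]) / sigma[:r]             # 2x2; a third u would need sigma_3 > 0
print("U is", U.shape, "and V is", V.shape)

# Assemble the 2x3 matrix Sigma and verify the factorization.
Sigma = np.zeros(A.shape)
Sigma[:r, :r] = np.diag(sigma[:r])
print(np.allclose(A, U @ Sigma @ V.T))     # True
```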
As we'll see in the next section, some additional work may be needed to construct the left singular vectors if some of the singular values are zero, but we won't worry about that now. For the time being, let's record our work in the following theorem.
An $m \times n$ matrix $A$ may be written as $A = U\Sigma V^T$, where $U$ is an orthogonal $m \times m$ matrix, $V$ is an orthogonal $n \times n$ matrix, and $\Sigma$ is an $m \times n$ matrix whose entries are zero except for the singular values of $A$, which appear in decreasing order on the diagonal.
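NumPy's built-in routine returns a decomposition in exactly this shape. With the default `full_matrices=True`, `np.linalg.svd` gives a square $U$, the singular values, and $V^T$ (note the transpose):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0]])       # the same hypothetical 2x3 matrix

U, s, Vt = np.linalg.svd(A)            # U is 2x2, Vt is 3x3, s has 2 entries
print(U.shape, s.shape, Vt.shape)

Sigma = np.zeros(A.shape)              # 2x3, zero except the diagonal
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))  # True
```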
As we said earlier, a singular value decomposition should be thought of as a generalization of an orthogonal diagonalization. For instance, the Spectral Theorem tells us that a symmetric matrix can be written as $A = QDQ^T$. Many matrices, however, are not symmetric and so are not orthogonally diagonalizable, yet every matrix has a singular value decomposition $A = U\Sigma V^T$. The price of this generalization is that we usually have two sets of singular vectors that form the orthogonal matrices $U$ and $V$, whereas a symmetric matrix has a single set of eigenvectors that form the orthogonal matrix $Q$.
Subsection 7.4.2 The structure of singular value decompositions
Now that we have an understanding of what a singular value decomposition is and how to construct it, let's explore the ways in which a singular value decomposition reveals the underlying structure of the matrix. As we'll see, the matrices $U$ and $V$ in a singular value decomposition provide convenient bases for some important subspaces, such as the column and null spaces of the matrix. This observation will provide the key to some of our uses of these decompositions in the next section.
This activity shows how a singular value decomposition of a matrix $A$ encodes important information about its null and column spaces. More specifically, the left and right singular vectors provide orthonormal bases for $\text{Nul}(A)$ and $\text{Col}(A)$. This is one of the reasons that singular value decompositions are so useful.
For the matrix in this example, which has three nonzero singular values, we find $$A\mathbf{x} = U\Sigma V^T\mathbf{x} = \sigma_1(\mathbf{v}_1\cdot\mathbf{x})\,\mathbf{u}_1 + \sigma_2(\mathbf{v}_2\cdot\mathbf{x})\,\mathbf{u}_2 + \sigma_3(\mathbf{v}_3\cdot\mathbf{x})\,\mathbf{u}_3,$$ which says that $A\mathbf{x}$ is a linear combination of $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$. These three vectors therefore form a basis for $\text{Col}(A)$. In fact, since they are columns in the orthogonal matrix $U$, they form an orthonormal basis for $\text{Col}(A)$.
Remembering that $\text{rank}(A) = \dim\text{Col}(A)$, we see that $\text{rank}(A) = 3$, which results from the three nonzero singular values. In general, the rank $r$ of a matrix equals the number of nonzero singular values, and $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r$ form an orthonormal basis for $\text{Col}(A)$.
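In floating-point arithmetic, "nonzero" means "above a small tolerance"; this is in fact how `numpy.linalg.matrix_rank` computes rank. A sketch, using the hypothetical matrix from earlier:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))      # count singular values above a tolerance
print("rank:", r)               # agrees with np.linalg.matrix_rank(A)

col_basis = U[:, :r]            # u_1, ..., u_r: orthonormal basis for Col(A)
print(col_basis)
```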
Remember that Proposition 7.4.6 says that $A$ and its transpose $A^T$ share the same singular values. Since the rank of a matrix equals its number of nonzero singular values, this means that $\text{rank}(A) = \text{rank}(A^T)$, a fact that we cited back in Section 6.2.
If we have a singular value decomposition of an $m \times n$ matrix $A = U\Sigma V^T$, Proposition 7.4.6 also tells us that the left singular vectors of $A$ are the right singular vectors of $A^T$. Therefore, $U$ is the $m \times m$ matrix whose columns are the right singular vectors of $A^T$. This means that the last $m - r$ vectors, $\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m$, form an orthonormal basis for $\text{Nul}(A^T)$. Therefore, the columns of $U$ provide orthonormal bases for $\text{Col}(A)$ and $\text{Nul}(A^T)$:
$$\text{Col}(A) = \text{Span}\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}, \qquad \text{Nul}(A^T) = \text{Span}\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}.$$
This reflects the familiar fact that $\text{Nul}(A^T)$ is the orthogonal complement of $\text{Col}(A)$.
In the same way, $V$ is the $n \times n$ matrix whose columns are the left singular vectors of $A^T$, which means that the first $r$ vectors, $\mathbf{v}_1, \ldots, \mathbf{v}_r$, form an orthonormal basis for $\text{Col}(A^T)$. Because the columns of $A^T$ are the rows of $A$, this subspace is sometimes called the row space of $A$ and denoted $\text{Row}(A)$. While we have yet to have an occasion to use $\text{Row}(A)$, there are times when it is important to have an orthonormal basis for it, and a singular value decomposition provides just that. To summarize, the columns of $V$ provide orthonormal bases for $\text{Col}(A^T)$ and $\text{Nul}(A)$:
$$\text{Col}(A^T) = \text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}, \qquad \text{Nul}(A) = \text{Span}\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}.$$
Considered altogether, the subspaces $\text{Col}(A)$, $\text{Nul}(A)$, $\text{Col}(A^T)$, and $\text{Nul}(A^T)$ are called the four fundamental subspaces associated to $A$. In addition to telling us the rank of a matrix, a singular value decomposition gives us orthonormal bases for all four fundamental subspaces.
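One decomposition hands us orthonormal bases for all four subspaces at once. The sketch below uses a hypothetical $3\times 3$ matrix of rank 2 (its third row is the sum of the first two) so that every subspace is nontrivial:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0],
              [2.0, 1.0, 0.0]])   # hypothetical: row 3 = row 1 + row 2, rank 2

U, s, Vt = np.linalg.svd(A)
V = Vt.T
r = int(np.sum(s > 1e-10))

col_A  = U[:, :r]    # orthonormal basis for Col(A)
nul_At = U[:, r:]    # orthonormal basis for Nul(A^T)
row_A  = V[:, :r]    # orthonormal basis for Col(A^T) = Row(A)
nul_A  = V[:, r:]    # orthonormal basis for Nul(A)

print(np.allclose(A @ nul_A, 0))     # True: these vectors are in Nul(A)
print(np.allclose(A.T @ nul_At, 0))  # True: these vectors are in Nul(A^T)
```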
When we previously outlined a procedure for finding a singular value decomposition of an $m \times n$ matrix $A$, we found the left singular vectors using the expression $\mathbf{u}_i = \frac{1}{\sigma_i}A\mathbf{v}_i$. This produces left singular vectors $\mathbf{u}_1, \ldots, \mathbf{u}_r$, where $r = \text{rank}(A)$. If $r < m$, however, we still need to find the left singular vectors $\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m$. Theorem 7.4.9 tells us how to do that: because those vectors form an orthonormal basis for $\text{Nul}(A^T)$, we can find them by solving $A^T\mathbf{x} = \mathbf{0}$ to obtain a basis for $\text{Nul}(A^T)$ and applying the Gram-Schmidt algorithm.
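Here is a sketch of that completion step, in a slight variant of the procedure just described: instead of solving $A^T\mathbf{x} = \mathbf{0}$ separately, it feeds the standard basis vectors to Gram-Schmidt after $\mathbf{u}_1, \ldots, \mathbf{u}_r$. The vectors that survive are orthogonal to $\text{Col}(A)$ and hence automatically lie in $\text{Nul}(A^T)$:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Orthonormalize a sequence of vectors, discarding dependent ones."""
    basis = []
    for v in vectors:
        w = v - sum((v @ q) * q for q in basis)  # remove existing components
        if np.linalg.norm(w) > tol:
            basis.append(w / np.linalg.norm(w))
    return basis

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0],
              [2.0, 1.0, 0.0]])      # hypothetical 3x3 matrix of rank 2

lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]
sigma = np.sqrt(np.clip(lam, 0, None))
r = int(np.sum(sigma > 1e-10))
us = [A @ V[:, i] / sigma[i] for i in range(r)]   # u_1, ..., u_r

# Complete to an orthonormal basis of R^3; the extra vectors span Nul(A^T).
U = np.column_stack(gram_schmidt(us + list(np.eye(3))))
print(np.allclose(U.T @ U, np.eye(3)))   # True: U is orthogonal
print(np.allclose(A.T @ U[:, r:], 0))    # True: last columns lie in Nul(A^T)
```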
If $A$ has rank $r$, explain why $A = U_r\Sigma_r V_r^T$, where the columns of $U_r$ are an orthonormal basis for $\text{Col}(A)$, $\Sigma_r$ is a square, diagonal, invertible $r \times r$ matrix, and the columns of $V_r$ form an orthonormal basis for $\text{Col}(A^T)$.
This section has explored singular value decompositions, how to find them, and how they organize important information about a matrix.
A singular value decomposition of a matrix $A$ is a factorization $A = U\Sigma V^T$. The matrix $\Sigma$ has the same shape as $A$, and its only nonzero entries are the singular values of $A$, which appear in decreasing order on the diagonal. The matrices $U$ and $V$ are orthogonal and contain the left and right singular vectors, respectively, as their columns.
To find a singular value decomposition of a matrix, we construct the Gram matrix $G = A^TA$, which is symmetric. The singular values of $A$ are the square roots of the eigenvalues of $G$, and the right singular vectors $\mathbf{v}_i$ are the associated eigenvectors of $G$. The left singular vectors $\mathbf{u}_i$ are determined from the relationship $A\mathbf{v}_i = \sigma_i\mathbf{u}_i$.
A singular value decomposition reveals fundamental information about a matrix. For instance, the number of nonzero singular values is the rank $r$ of the matrix. The first $r$ left singular vectors form an orthonormal basis for $\text{Col}(A)$, with the remaining left singular vectors forming an orthonormal basis of $\text{Nul}(A^T)$. The first $r$ right singular vectors form an orthonormal basis for $\text{Col}(A^T)$, while the remaining right singular vectors form an orthonormal basis of $\text{Nul}(A)$.
If $A$ is a rank $r$ matrix, we can write a reduced singular value decomposition as $A = U_r\Sigma_r V_r^T$, where the columns of $U_r$ form an orthonormal basis for $\text{Col}(A)$, the columns of $V_r$ form an orthonormal basis for $\text{Col}(A^T)$, and $\Sigma_r$ is an $r \times r$ diagonal, invertible matrix.
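In NumPy terms, the reduced form is just the full decomposition truncated at $r$; a sketch with the same hypothetical rank-2 matrix:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0],
              [2.0, 1.0, 0.0]])               # hypothetical rank-2 matrix

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

U_r, Sigma_r, V_r = U[:, :r], np.diag(s[:r]), Vt[:r, :].T
print(np.allclose(A, U_r @ Sigma_r @ V_r.T))  # True: reduced SVD recovers A
print(Sigma_r)                                # r x r, diagonal, invertible
```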
Construct the Gram matrix $G = A^TA$ and use it to find the singular values and right singular vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ of $A$. What are the matrices $\Sigma$ and $V$ in a singular value decomposition?