
Section 7.4 Singular Value Decompositions

The Spectral Theorem has motivated the past few sections. In particular, we applied the fact that symmetric matrices can be orthogonally diagonalized to simplify quadratic forms, which enabled us to use principal component analysis to reduce the dimension of a dataset.
But what can we do with matrices that are not symmetric or even square? For instance, the following matrices are not diagonalizable, much less orthogonally so:
\[
\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}.
\]
In this section, we will develop a description of matrices called the singular value decomposition that is, in many ways, analogous to an orthogonal diagonalization. For example, we have seen that any symmetric matrix can be written in the form \(QDQ^T\), where \(Q\) is an orthogonal matrix and \(D\) is diagonal. A singular value decomposition will have the form \(U\Sigma V^T\), where \(U\) and \(V\) are orthogonal and \(\Sigma\) is diagonal. Most notably, we will see that every matrix has a singular value decomposition, whether it is symmetric or not.

Preview Activity 7.4.1.

Let’s review orthogonal diagonalizations and quadratic forms, as our understanding of singular value decompositions will rely on them.
  1. Suppose that \(A\) is any matrix. Explain why the matrix \(G=A^TA\) is symmetric.
  2. Suppose that \(A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\). Find the matrix \(G=A^TA\) and write out the quadratic form \(q_G\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)\) as a function of \(x_1\) and \(x_2\).
  3. What is the maximum value of \(q_G(\mathbf{x})\) over all unit vectors, and in which direction does it occur?
  4. What is the minimum value of \(q_G(\mathbf{x})\) over all unit vectors, and in which direction does it occur?
  5. What is the geometric relationship between the directions in which the maximum and minimum values occur?

Subsection 7.4.1 Finding singular value decompositions

We will begin by explaining what a singular value decomposition is and how we can find one for a given matrix \(A\).
Recall how the orthogonal diagonalization of a symmetric matrix is formed: if \(A\) is symmetric, we write \(A=QDQ^T\), where the diagonal entries of \(D\) are the eigenvalues of \(A\) and the columns of \(Q\) are the associated eigenvectors. Moreover, the eigenvalues are related to the maximum and minimum values of the associated quadratic form \(q_A(\mathbf{u})\) among all unit vectors.
A general matrix, particularly a matrix that is not square, may not have eigenvalues and eigenvectors, but we can discover analogous features, called singular values and singular vectors, by studying a function somewhat similar to a quadratic form. More specifically, any matrix \(A\) defines a function
\[
l_A(\mathbf{x}) = |A\mathbf{x}|,
\]
which measures the length of \(A\mathbf{x}\). For example, the diagonal matrix \(D=\begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}\) gives the function \(l_D(\mathbf{x}) = \sqrt{9x_1^2 + 4x_2^2}\). The presence of the square root means that this function is not a quadratic form. We can, however, define the singular values and vectors by looking for the maximum and minimum of this function \(l_A(\mathbf{u})\) among all unit vectors \(\mathbf{u}\).
While \(l_A(\mathbf{x})\) is not itself a quadratic form, it becomes one if we square it:
\[
\left(l_A(\mathbf{x})\right)^2 = |A\mathbf{x}|^2 = (A\mathbf{x})\cdot(A\mathbf{x}) = \mathbf{x}\cdot(A^TA\mathbf{x}) = q_{A^TA}(\mathbf{x}).
\]
We call \(G=A^TA\) the Gram matrix associated to \(A\) and note that
\[
l_A(\mathbf{x}) = \sqrt{q_G(\mathbf{x})}.
\]
This relationship is important in the next activity, which introduces singular values and singular vectors.
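The relationship \(l_A(\mathbf{x}) = \sqrt{q_G(\mathbf{x})}\) is easy to check numerically. Here is a short sketch (our own illustration, not part of the text's development) using NumPy and the diagonal matrix \(D\) above:

```python
import numpy as np

# Check the identity l_A(x) = sqrt(q_G(x)) for the diagonal
# matrix D = [[3, 0], [0, 2]] from the text.
D = np.array([[3.0, 0.0], [0.0, 2.0]])
G = D.T @ D                      # Gram matrix, here diag(9, 4)

rng = np.random.default_rng(0)
x = rng.standard_normal(2)       # any vector will do

l_D = np.linalg.norm(D @ x)      # l_D(x) = |Dx|
q_G = x @ (G @ x)                # quadratic form q_G(x) = x . (Gx)

assert np.isclose(l_D, np.sqrt(q_G))
```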

Activity 7.4.2.

The following interactive figure will help us explore singular values and vectors geometrically before we begin a more algebraic approach.
Instructions.
You may choose a \(2\times2\) matrix \(A=\begin{bmatrix} a & b \\ c & d \end{bmatrix}\) using the four sliders at the top of the diagram. On the left, you may vary the red unit vector \(\mathbf{x}\) by clicking on the head of the vector.
Figure 7.4.1. Singular values, right singular vectors, and left singular vectors
Select the matrix \(A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\). As we vary the vector \(\mathbf{x}\), we see the vector \(A\mathbf{x}\) on the right in gray, while the height of the blue bar to the right tells us \(l_A(\mathbf{x}) = |A\mathbf{x}|\).
  1. The first singular value \(\sigma_1\) is the maximum value of \(l_A(\mathbf{x})\) over all unit vectors, and an associated right singular vector \(\mathbf{v}_1\) is a unit vector describing a direction in which this maximum occurs.
    Use the diagram to find the first singular value \(\sigma_1\) and an associated right singular vector \(\mathbf{v}_1\).
  2. The second singular value \(\sigma_2\) is the minimum value of \(l_A(\mathbf{x})\) over all unit vectors, and an associated right singular vector \(\mathbf{v}_2\) is a unit vector describing a direction in which this minimum occurs.
    Use the diagram to find the second singular value \(\sigma_2\) and an associated right singular vector \(\mathbf{v}_2\).
  3. Here’s how we can find the right singular values and vectors without using the diagram. Remember that \(l_A(\mathbf{x}) = \sqrt{q_G(\mathbf{x})}\), where \(G=A^TA\) is the Gram matrix associated to \(A\). Since \(G\) is symmetric, it is orthogonally diagonalizable. Find \(G\) and an orthogonal diagonalization of it.
    What is the maximum value of the quadratic form \(q_G(\mathbf{x})\) among all unit vectors, and in which direction does it occur? What is the minimum value of \(q_G(\mathbf{x})\), and in which direction does it occur?
  4. Because \(l_A(\mathbf{x}) = \sqrt{q_G(\mathbf{x})}\), the first singular value \(\sigma_1\) will be the square root of the maximum value of \(q_G(\mathbf{x})\) and \(\sigma_2\) the square root of the minimum. Verify that the singular values that you found from the diagram are the square roots of the maximum and minimum values of \(q_G(\mathbf{x})\).
  5. Verify that the right singular vectors \(\mathbf{v}_1\) and \(\mathbf{v}_2\) that you found from the diagram are the directions in which the maximum and minimum values occur.
  6. Finally, we introduce the left singular vectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\) by requiring that \(A\mathbf{v}_1 = \sigma_1\mathbf{u}_1\) and \(A\mathbf{v}_2 = \sigma_2\mathbf{u}_2\). Find the two left singular vectors.
  7. Form the matrices
    \[
    U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix}, \qquad
    \Sigma = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix}, \qquad
    V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}
    \]
    and explain why \(AV = U\Sigma\).
  8. Finally, explain why \(A = U\Sigma V^T\) and verify that this relationship holds for this specific example.
As this activity shows, the singular values of \(A\) are the maximum and minimum values of \(l_A(\mathbf{x}) = |A\mathbf{x}|\) among all unit vectors, and the right singular vectors \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are the directions in which they occur. The key to finding the singular values and vectors is to utilize the Gram matrix \(G\) and its associated quadratic form \(q_G(\mathbf{x})\). We will illustrate with some more examples.

Example 7.4.2.

We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix}\). Notice that this matrix is not symmetric, so it cannot be orthogonally diagonalized.
We begin by constructing the Gram matrix \(G = A^TA = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}\). Since \(G\) is symmetric, it can be orthogonally diagonalized with
\[
D = \begin{bmatrix} 8 & 0 \\ 0 & 2 \end{bmatrix}, \qquad
Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
\]
We now know that the maximum value of the quadratic form \(q_G(\mathbf{x})\) is 8, which occurs in the direction \(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\). Since \(l_A(\mathbf{x}) = \sqrt{q_G(\mathbf{x})}\), this tells us that the maximum value of \(l_A(\mathbf{x})\), the first singular value, is \(\sigma_1 = \sqrt{8}\) and that this occurs in the direction of the first right singular vector \(\mathbf{v}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\).
In the same way, we also know that the second singular value \(\sigma_2 = \sqrt{2}\) with associated right singular vector \(\mathbf{v}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\).
The first left singular vector \(\mathbf{u}_1\) is defined by \(A\mathbf{v}_1 = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \sigma_1\mathbf{u}_1\). Because \(\sigma_1 = \sqrt{8}\), we have \(\mathbf{u}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}\). Notice that \(\mathbf{u}_1\) is a unit vector because \(\sigma_1 = |A\mathbf{v}_1|\).
In the same way, the second left singular vector is defined by \(A\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \sigma_2\mathbf{u}_2\), which gives us \(\mathbf{u}_2 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}\).
We then construct
\[
U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} = \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \end{bmatrix}, \qquad
V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
\]
We now have \(AV = U\Sigma\) because
\[
AV = \begin{bmatrix} A\mathbf{v}_1 & A\mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} \sigma_1\mathbf{u}_1 & \sigma_2\mathbf{u}_2 \end{bmatrix} = U\Sigma.
\]
Because the right singular vectors, the columns of \(V\), are eigenvectors of the symmetric matrix \(G\), they form an orthonormal basis, which means that \(V\) is orthogonal. Therefore, we have \((AV)V^T = A = U\Sigma V^T\). This gives the singular value decomposition
\[
A = \begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix} \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^T = U\Sigma V^T.
\]
To summarize, we find a singular value decomposition of a matrix \(A\) in the following way:
  • Construct the Gram matrix \(G = A^TA\) and find an orthogonal diagonalization of it to obtain eigenvalues \(\lambda_i\) and an orthonormal basis of eigenvectors.
  • The singular values of \(A\) are the square roots of the eigenvalues \(\lambda_i\) of \(G\); that is, \(\sigma_i = \sqrt{\lambda_i}\). By convention, the singular values are listed in decreasing order: \(\sigma_1 \geq \sigma_2 \geq \cdots\). The right singular vectors \(\mathbf{v}_i\) are the associated eigenvectors of \(G\).
  • The left singular vectors \(\mathbf{u}_i\) are found by \(A\mathbf{v}_i = \sigma_i\mathbf{u}_i\). Because \(\sigma_i = |A\mathbf{v}_i|\), we know that \(\mathbf{u}_i\) will be a unit vector.
    In fact, the left singular vectors will also form an orthonormal basis. To see this, suppose that the associated singular values are nonzero and that \(i \neq j\). We then have
    \[
    \sigma_i\sigma_j(\mathbf{u}_i\cdot\mathbf{u}_j) = (\sigma_i\mathbf{u}_i)\cdot(\sigma_j\mathbf{u}_j) = (A\mathbf{v}_i)\cdot(A\mathbf{v}_j) = \mathbf{v}_i\cdot(A^TA\mathbf{v}_j) = \mathbf{v}_i\cdot(G\mathbf{v}_j) = \lambda_j\,\mathbf{v}_i\cdot\mathbf{v}_j = 0
    \]
    since the right singular vectors are orthogonal, and so \(\mathbf{u}_i\cdot\mathbf{u}_j = 0\).
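The three steps above translate directly into a computation. The following sketch (our own illustration, assuming NumPy is available) carries out the procedure for a sample \(2\times2\) matrix and checks the result against the library's SVD routine:

```python
import numpy as np

A = np.array([[1.0, 2.0], [-1.0, 2.0]])

# Step 1: orthogonally diagonalize the Gram matrix G = A^T A.
G = A.T @ A
lam, Q = np.linalg.eigh(G)          # eigenvalues returned in increasing order

# Step 2: singular values are sqrt(eigenvalues), listed in decreasing
# order; the right singular vectors are the corresponding eigenvectors.
order = np.argsort(lam)[::-1]
sigma = np.sqrt(lam[order])
V = Q[:, order]

# Step 3: left singular vectors from A v_i = sigma_i u_i.
U = (A @ V) / sigma                 # divides column i by sigma_i

assert np.allclose(U @ np.diag(sigma) @ V.T, A)

# The singular values must agree with NumPy's own SVD.
assert np.allclose(sigma, np.linalg.svd(A, compute_uv=False))
```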

Example 7.4.3.

Let’s find a singular value decomposition for the symmetric matrix \(A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\). The associated Gram matrix is
\[
G = A^TA = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix},
\]
which has an orthogonal diagonalization with
\[
D = \begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}.
\]
This gives singular values and vectors
\[
\begin{aligned}
\sigma_1 = 3, \qquad & \mathbf{v}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \qquad & \mathbf{u}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \\
\sigma_2 = 1, \qquad & \mathbf{v}_2 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}, \qquad & \mathbf{u}_2 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}
\end{aligned}
\]
and the singular value decomposition \(A = U\Sigma V^T\) where
\[
U = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
V = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}.
\]
This example is special because \(A\) is symmetric. With a little thought, it’s possible to relate this singular value decomposition to an orthogonal diagonalization of \(A\) using the fact that \(G = A^TA = A^2\).
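One consequence of this observation, explored further in Exercise 9, is that the singular values of a symmetric matrix are the absolute values of its eigenvalues. A quick NumPy check (our own addition, not part of the text):

```python
import numpy as np

# For a symmetric matrix, the singular values should be the
# absolute values of the eigenvalues.
A = np.array([[1.0, 2.0], [2.0, 1.0]])

eigvals = np.linalg.eigvalsh(A)               # eigenvalues of A
sigma = np.linalg.svd(A, compute_uv=False)    # singular values of A

assert np.allclose(sorted(np.abs(eigvals), reverse=True), sigma)
```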

Activity 7.4.3.

In this activity, we will construct the singular value decomposition of \(A=\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}\). Notice that this matrix is not square, so there are no eigenvalues and eigenvectors associated to it.
  1. Construct the Gram matrix \(G = A^TA\) and find an orthogonal diagonalization of it.
  2. Identify the singular values of \(A\) and the right singular vectors \(\mathbf{v}_1\), \(\mathbf{v}_2\), and \(\mathbf{v}_3\). What is the dimension of these vectors? How many nonzero singular values are there?
  3. Find the left singular vectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\) using the fact that \(A\mathbf{v}_i = \sigma_i\mathbf{u}_i\). What is the dimension of these vectors? What happens if you try to find a third left singular vector \(\mathbf{u}_3\) in this way?
  4. As before, form the orthogonal matrices \(U\) and \(V\) from the left and right singular vectors. What are the shapes of \(U\) and \(V\)? How do these shapes relate to the number of rows and columns of \(A\)?
  5. Now form \(\Sigma\) so that it has the same shape as \(A\):
    \[
    \Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \end{bmatrix}
    \]
    and verify that \(A = U\Sigma V^T\).
  6. How can you use this singular value decomposition of \(A = U\Sigma V^T\) to easily find a singular value decomposition of \(A^T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}\)?

Example 7.4.4.

We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 2 & -2 & 1 \\ 4 & 8 & 8 \end{bmatrix}\).
Finding an orthogonal diagonalization of \(G = A^TA\) gives
\[
D = \begin{bmatrix} 144 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
Q = \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix},
\]
which gives singular values \(\sigma_1 = \sqrt{144} = 12\), \(\sigma_2 = \sqrt{9} = 3\), and \(\sigma_3 = 0\). The right singular vectors \(\mathbf{v}_i\) appear as the columns of \(Q\) so that \(V = Q\).
We now find
\[
\begin{aligned}
A\mathbf{v}_1 & = \begin{bmatrix} 0 \\ 12 \end{bmatrix} = 12\,\mathbf{u}_1, \qquad & \mathbf{u}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
A\mathbf{v}_2 & = \begin{bmatrix} 3 \\ 0 \end{bmatrix} = 3\,\mathbf{u}_2, \qquad & \mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\
A\mathbf{v}_3 & = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. &
\end{aligned}
\]
Notice that it’s not possible to find a third left singular vector in this way since \(A\mathbf{v}_3 = \mathbf{0}\). We therefore form the matrices
\[
U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix}, \qquad
V = \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix},
\]
which gives the singular value decomposition \(A = U\Sigma V^T\).
Notice that \(U\) is a \(2\times2\) orthogonal matrix because \(A\) has two rows, and \(V\) is a \(3\times3\) orthogonal matrix because \(A\) has three columns.
As we’ll see in the next section, some additional work may be needed to construct the left singular vectors \(\mathbf{u}_j\) if more of the singular values are zero, but we won’t worry about that now. For the time being, let’s record our work in the following theorem.

Theorem 7.4.5.

Every \(m\times n\) matrix \(A\) has a singular value decomposition \(A = U\Sigma V^T\), where \(U\) is an \(m\times m\) orthogonal matrix, \(V\) is an \(n\times n\) orthogonal matrix, and \(\Sigma\) is an \(m\times n\) matrix whose only nonzero entries are the singular values of \(A\), which appear in decreasing order on the diagonal.

Notice that a singular value decomposition of \(A\) gives us a singular value decomposition of \(A^T\). More specifically, if \(A = U\Sigma V^T\), then
\[
A^T = (U\Sigma V^T)^T = V\Sigma^TU^T.
\]

Proposition 7.4.6.

If \(A = U\Sigma V^T\) is a singular value decomposition of \(A\), then \(A^T = V\Sigma^TU^T\) is a singular value decomposition of \(A^T\). In particular, \(A\) and \(A^T\) share the same singular values, and the left singular vectors of \(A\) are right singular vectors of \(A^T\), and vice versa.

As we said earlier, a singular value decomposition should be thought of as a generalization of an orthogonal diagonalization. For instance, the Spectral Theorem tells us that a symmetric matrix can be written as \(QDQ^T\). Many matrices, however, are not symmetric and so are not orthogonally diagonalizable. However, every matrix has a singular value decomposition \(U\Sigma V^T\). The price of this generalization is that we usually have two sets of singular vectors that form the orthogonal matrices \(U\) and \(V\), whereas a symmetric matrix has a single set of eigenvectors that form the orthogonal matrix \(Q\).
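The transpose relationship is easy to verify numerically. In the following NumPy sketch (our own illustration; the sample matrix is our reading of Example 7.4.4), \(A\) and \(A^T\) share the same nonzero singular values, and \(V\Sigma^TU^T\) reassembles \(A^T\):

```python
import numpy as np

# If A = U Sigma V^T, then A^T = V Sigma^T U^T, so A and A^T
# share the same nonzero singular values.
A = np.array([[2.0, -2.0, 1.0], [4.0, 8.0, 8.0]])

U, s, Vt = np.linalg.svd(A)                    # full SVD of A
s_t = np.linalg.svd(A.T, compute_uv=False)     # singular values of A^T

assert np.allclose(s, [12.0, 3.0])             # values found in Example 7.4.4
assert np.allclose(s_t, s)                     # same nonzero singular values

# Reassemble A^T as V Sigma^T U^T.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)
assert np.allclose(Vt.T @ Sigma.T @ U.T, A.T)
```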

Subsection 7.4.2 The structure of singular value decompositions

Now that we have an understanding of what a singular value decomposition is and how to construct it, let’s explore the ways in which a singular value decomposition reveals the underlying structure of the matrix. As we’ll see, the matrices U and V in a singular value decomposition provide convenient bases for some important subspaces, such as the column and null spaces of the matrix. This observation will provide the key to some of our uses of these decompositions in the next section.

Activity 7.4.4.

Let’s suppose that a matrix \(A\) has a singular value decomposition \(A = U\Sigma V^T\) where
\[
U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 \end{bmatrix}.
\]
  1. What is the shape of \(A\); that is, how many rows and columns does \(A\) have?
  2. Suppose we write a three-dimensional vector \(\mathbf{x}\) as a linear combination of right singular vectors:
    \[
    \mathbf{x} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3.
    \]
    We would like to find an expression for \(A\mathbf{x}\).
    To begin, \(V^T\mathbf{x} = \begin{bmatrix} \mathbf{v}_1\cdot\mathbf{x} \\ \mathbf{v}_2\cdot\mathbf{x} \\ \mathbf{v}_3\cdot\mathbf{x} \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}\).
    Now \(\Sigma V^T\mathbf{x} = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} 20c_1 \\ 5c_2 \\ 0 \\ 0 \end{bmatrix}\).
    And finally, \(A\mathbf{x} = U\Sigma V^T\mathbf{x} = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 \end{bmatrix} \begin{bmatrix} 20c_1 \\ 5c_2 \\ 0 \\ 0 \end{bmatrix} = 20c_1\mathbf{u}_1 + 5c_2\mathbf{u}_2\).
    To summarize, we have \(A\mathbf{x} = 20c_1\mathbf{u}_1 + 5c_2\mathbf{u}_2\).
    What condition on \(c_1\), \(c_2\), and \(c_3\) must be satisfied if \(\mathbf{x}\) is a solution to the equation \(A\mathbf{x} = 40\mathbf{u}_1 + 20\mathbf{u}_2\)? Is there a unique solution or infinitely many?
  3. Remembering that \(\mathbf{u}_1\) and \(\mathbf{u}_2\) are linearly independent, what condition on \(c_1\), \(c_2\), and \(c_3\) must be satisfied if \(A\mathbf{x} = \mathbf{0}\)?
  4. How do the right singular vectors \(\mathbf{v}_i\) provide a basis for \(\operatorname{Nul}(A)\), the subspace of solutions to the equation \(A\mathbf{x} = \mathbf{0}\)?
  5. Remember that \(\mathbf{b}\) is in \(\operatorname{Col}(A)\) if the equation \(A\mathbf{x} = \mathbf{b}\) is consistent, which means that
    \[
    A\mathbf{x} = 20c_1\mathbf{u}_1 + 5c_2\mathbf{u}_2 = \mathbf{b}
    \]
    for some coefficients \(c_1\) and \(c_2\). How do the left singular vectors \(\mathbf{u}_i\) provide an orthonormal basis for \(\operatorname{Col}(A)\)?
  6. Remember that \(\operatorname{rank}(A)\) is the dimension of the column space. What is \(\operatorname{rank}(A)\), and how does the number of nonzero singular values determine \(\operatorname{rank}(A)\)?
This activity shows how a singular value decomposition of a matrix encodes important information about its null and column spaces. More specifically, the right and left singular vectors provide orthonormal bases for \(\operatorname{Nul}(A)\) and \(\operatorname{Col}(A)\), respectively. This is one of the reasons that singular value decompositions are so useful.
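In computational terms, these bases can be read directly off a full SVD. The following sketch (our own illustration using NumPy, with a sample matrix) extracts the rank and orthonormal bases for \(\operatorname{Col}(A)\) and \(\operatorname{Nul}(A)\):

```python
import numpy as np

# Read the rank and orthonormal bases for Col(A) and Nul(A)
# off a full SVD. The matrix here is our own example.
A = np.array([[2.0, -2.0, 1.0],
              [4.0, 8.0, 8.0]])

U, s, Vt = np.linalg.svd(A)          # full SVD: U is 2x2, Vt is 3x3
r = int(np.sum(s > 1e-10))           # rank = number of nonzero singular values

col_basis = U[:, :r]                 # first r left singular vectors span Col(A)
nul_basis = Vt[r:, :].T              # last n - r right singular vectors span Nul(A)

assert r == 2
assert np.allclose(A @ nul_basis, 0)                        # lies in Nul(A)
assert np.allclose(nul_basis.T @ nul_basis, np.eye(3 - r))  # orthonormal
```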

Example 7.4.7.

Suppose we have a singular value decomposition \(A = U\Sigma V^T\) where
\[
\Sigma = \begin{bmatrix}
\sigma_1 & 0 & 0 & 0 & 0 \\
0 & \sigma_2 & 0 & 0 & 0 \\
0 & 0 & \sigma_3 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}.
\]
This means that \(A\) has four rows and five columns, just as \(\Sigma\) does.
As in the activity, if \(\mathbf{x} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_5\mathbf{v}_5\), we have
\[
A\mathbf{x} = \sigma_1c_1\mathbf{u}_1 + \sigma_2c_2\mathbf{u}_2 + \sigma_3c_3\mathbf{u}_3.
\]
If \(\mathbf{b}\) is in \(\operatorname{Col}(A)\), then \(\mathbf{b}\) must have the form
\[
\mathbf{b} = \sigma_1c_1\mathbf{u}_1 + \sigma_2c_2\mathbf{u}_2 + \sigma_3c_3\mathbf{u}_3,
\]
which says that \(\mathbf{b}\) is a linear combination of \(\mathbf{u}_1\), \(\mathbf{u}_2\), and \(\mathbf{u}_3\). These three vectors therefore form a basis for \(\operatorname{Col}(A)\). In fact, since they are columns in the orthogonal matrix \(U\), they form an orthonormal basis for \(\operatorname{Col}(A)\).
Remembering that \(\operatorname{rank}(A) = \dim\operatorname{Col}(A)\), we see that \(\operatorname{rank}(A) = 3\), which results from the three nonzero singular values. In general, the rank \(r\) of a matrix \(A\) equals the number of nonzero singular values, and \(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\) form an orthonormal basis for \(\operatorname{Col}(A)\).
Moreover, if \(\mathbf{x} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_5\mathbf{v}_5\) satisfies \(A\mathbf{x} = \mathbf{0}\), then
\[
A\mathbf{x} = \sigma_1c_1\mathbf{u}_1 + \sigma_2c_2\mathbf{u}_2 + \sigma_3c_3\mathbf{u}_3 = \mathbf{0},
\]
which implies that \(c_1 = 0\), \(c_2 = 0\), and \(c_3 = 0\). Therefore, \(\mathbf{x} = c_4\mathbf{v}_4 + c_5\mathbf{v}_5\), so \(\mathbf{v}_4\) and \(\mathbf{v}_5\) form an orthonormal basis for \(\operatorname{Nul}(A)\).
More generally, if \(A\) is an \(m\times n\) matrix and if \(\operatorname{rank}(A) = r\), the last \(n-r\) right singular vectors form an orthonormal basis for \(\operatorname{Nul}(A)\).
Generally speaking, if the rank of an \(m\times n\) matrix \(A\) is \(r\), then there are \(r\) nonzero singular values and \(\Sigma\) has the form
\[
\Sigma = \begin{bmatrix}
\sigma_1 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & \sigma_r & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0
\end{bmatrix}.
\]
The first \(r\) columns of \(U\) form an orthonormal basis for \(\operatorname{Col}(A)\):
\[
U = \begin{bmatrix} \underbrace{\mathbf{u}_1 \ \cdots \ \mathbf{u}_r}_{\operatorname{Col}(A)} & \mathbf{u}_{r+1} \ \cdots \ \mathbf{u}_m \end{bmatrix}
\]
and the last \(n-r\) columns of \(V\) form an orthonormal basis for \(\operatorname{Nul}(A)\):
\[
V = \begin{bmatrix} \mathbf{v}_1 \ \cdots \ \mathbf{v}_r & \underbrace{\mathbf{v}_{r+1} \ \cdots \ \mathbf{v}_n}_{\operatorname{Nul}(A)} \end{bmatrix}.
\]
Remember that Proposition 7.4.6 says that \(A\) and its transpose \(A^T\) share the same singular values. Since the rank of a matrix equals its number of nonzero singular values, this means that \(\operatorname{rank}(A) = \operatorname{rank}(A^T)\), a fact that we cited back in Section 6.2.
If we have a singular value decomposition of an \(m\times n\) matrix \(A = U\Sigma V^T\), Proposition 7.4.6 also tells us that the left singular vectors of \(A\) are the right singular vectors of \(A^T\). Therefore, \(U\) is the \(m\times m\) matrix whose columns are the right singular vectors of \(A^T\). This means that the last \(m-r\) vectors form an orthonormal basis for \(\operatorname{Nul}(A^T)\). Therefore, the columns of \(U\) provide orthonormal bases for \(\operatorname{Col}(A)\) and \(\operatorname{Nul}(A^T)\):
\[
U = \begin{bmatrix} \underbrace{\mathbf{u}_1 \ \cdots \ \mathbf{u}_r}_{\operatorname{Col}(A)} & \underbrace{\mathbf{u}_{r+1} \ \cdots \ \mathbf{u}_m}_{\operatorname{Nul}(A^T)} \end{bmatrix}.
\]
This reflects the familiar fact that \(\operatorname{Nul}(A^T)\) is the orthogonal complement of \(\operatorname{Col}(A)\).
In the same way, \(V\) is the \(n\times n\) matrix whose columns are the left singular vectors of \(A^T\), which means that the first \(r\) vectors form an orthonormal basis for \(\operatorname{Col}(A^T)\). Because the columns of \(A^T\) are the rows of \(A\), this subspace is sometimes called the row space of \(A\) and denoted \(\operatorname{Row}(A)\). While we have yet to have an occasion to use \(\operatorname{Row}(A)\), there are times when it is important to have an orthonormal basis for it, and a singular value decomposition provides just that. To summarize, the columns of \(V\) provide orthonormal bases for \(\operatorname{Col}(A^T)\) and \(\operatorname{Nul}(A)\):
\[
V = \begin{bmatrix} \underbrace{\mathbf{v}_1 \ \cdots \ \mathbf{v}_r}_{\operatorname{Col}(A^T)} & \underbrace{\mathbf{v}_{r+1} \ \cdots \ \mathbf{v}_n}_{\operatorname{Nul}(A)} \end{bmatrix}.
\]
Considered altogether, the subspaces \(\operatorname{Col}(A)\), \(\operatorname{Nul}(A)\), \(\operatorname{Col}(A^T)\), and \(\operatorname{Nul}(A^T)\) are called the four fundamental subspaces associated to \(A\). In addition to telling us the rank of a matrix, a singular value decomposition gives us orthonormal bases for all four fundamental subspaces.
When we previously outlined a procedure for finding a singular value decomposition of an \(m\times n\) matrix \(A\), we found the left singular vectors \(\mathbf{u}_j\) using the expression \(A\mathbf{v}_j = \sigma_j\mathbf{u}_j\). This produces left singular vectors \(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\), where \(r = \operatorname{rank}(A)\). If \(r < m\), however, we still need to find the left singular vectors \(\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\). Theorem 7.4.9 tells us how to do that: because those vectors form an orthonormal basis for \(\operatorname{Nul}(A^T)\), we can find them by solving \(A^T\mathbf{x} = \mathbf{0}\) to obtain a basis for \(\operatorname{Nul}(A^T)\) and applying the Gram-Schmidt algorithm.
We won’t worry about this issue too much, however, as we will frequently use software to find singular value decompositions for us.
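Here is one way the completion step might be carried out numerically (our own sketch, assuming NumPy and a rank-deficient sample matrix). Rather than solving \(A^T\mathbf{x} = \mathbf{0}\) explicitly, it uses the fact that \(\operatorname{Nul}(A^T)\) is the orthogonal complement of \(\operatorname{Col}(A)\) and applies Gram-Schmidt to the standard basis vectors:

```python
import numpy as np

# Completing U when rank(A) < m, for a 3x2 matrix of rank 1.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [2.0, 4.0]])

G = A.T @ A
lam, Q = np.linalg.eigh(G)
order = np.argsort(lam)[::-1]
sigma = np.sqrt(np.maximum(lam[order], 0))   # guard against round-off
V = Q[:, order]

r = int(np.sum(sigma > 1e-10))               # here r = 1, so u_2, u_3 remain
U = np.zeros((3, 3))
U[:, :r] = (A @ V[:, :r]) / sigma[:r]        # u_i from A v_i = sigma_i u_i

# The remaining columns must be orthonormal and orthogonal to Col(A),
# i.e. lie in Nul(A^T). Gram-Schmidt on the standard basis vectors:
k = r
for e in np.eye(3):
    w = e - U[:, :k] @ (U[:, :k].T @ e)      # subtract the projection
    if np.linalg.norm(w) > 1e-10:
        U[:, k] = w / np.linalg.norm(w)
        k += 1
    if k == 3:
        break

Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, sigma)
assert np.allclose(U.T @ U, np.eye(3))       # U is orthogonal
assert np.allclose(U @ Sigma @ V.T, A)       # and A = U Sigma V^T
assert np.allclose(A.T @ U[:, r:], 0)        # last columns lie in Nul(A^T)
```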

Subsection 7.4.3 Reduced singular value decompositions

As we’ll see in the next section, there are times when it is helpful to express a singular value decomposition in a slightly different form.

Activity 7.4.5.

Suppose we have a singular value decomposition \(A = U\Sigma V^T\) where
\[
U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 \end{bmatrix}.
\]
  1. What is the shape of \(A\)? What is \(\operatorname{rank}(A)\)?
  2. Identify bases for \(\operatorname{Col}(A)\) and \(\operatorname{Col}(A^T)\).
  3. Explain why
    \[
    U\Sigma = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \end{bmatrix}.
    \]
  4. Explain why
    \[
    \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \end{bmatrix} V^T = \begin{bmatrix} 18 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^T.
    \]
  5. If \(A = U\Sigma V^T\), explain why \(A = U_r\Sigma_rV_r^T\), where the columns of \(U_r\) are an orthonormal basis for \(\operatorname{Col}(A)\), \(\Sigma_r\) is a square, diagonal, invertible matrix, and the columns of \(V_r\) form an orthonormal basis for \(\operatorname{Col}(A^T)\).
We call this a reduced singular value decomposition.

Example 7.4.11.

In Example 7.4.4, we found the singular value decomposition
\[
A = \begin{bmatrix} 2 & -2 & 1 \\ 4 & 8 & 8 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix}^T.
\]
Since there are two nonzero singular values, \(\operatorname{rank}(A) = 2\), so the reduced singular value decomposition is
\[
A = \begin{bmatrix} 2 & -2 & 1 \\ 4 & 8 & 8 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 12 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 \\ 2/3 & -2/3 \\ 2/3 & 1/3 \end{bmatrix}^T.
\]
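A reduced singular value decomposition is easy to form computationally: keep the first \(r\) columns of \(U\) and \(V\) and the leading \(r\times r\) block of \(\Sigma\). A sketch with NumPy (our own illustration, using a sample matrix):

```python
import numpy as np

# Form the reduced SVD A = U_r Sigma_r V_r^T from a full SVD.
A = np.array([[2.0, -2.0, 1.0],
              [4.0, 8.0, 8.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))          # rank: number of nonzero singular values

U_r = U[:, :r]                      # orthonormal basis for Col(A)
Sigma_r = np.diag(s[:r])            # square, diagonal, invertible
V_r = Vt[:r, :].T                   # orthonormal basis for Col(A^T)

assert np.allclose(U_r @ Sigma_r @ V_r.T, A)
```

Note that `np.linalg.svd(A, full_matrices=False)` returns the thin form with \(\min(m,n)\) columns; truncating that to the first \(r\) columns gives the reduced decomposition.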

Subsection 7.4.4 Summary

This section has explored singular value decompositions, how to find them, and how they organize important information about a matrix.
  • A singular value decomposition of a matrix \(A\) is a factorization \(A = U\Sigma V^T\). The matrix \(\Sigma\) has the same shape as \(A\), and its only nonzero entries are the singular values of \(A\), which appear in decreasing order on the diagonal. The matrices \(U\) and \(V\) are orthogonal and contain the left and right singular vectors, respectively, as their columns.
  • To find a singular value decomposition of a matrix, we construct the Gram matrix \(G = A^TA\), which is symmetric. The singular values of \(A\) are the square roots of the eigenvalues of \(G\), and the right singular vectors \(\mathbf{v}_j\) are the associated eigenvectors of \(G\). The left singular vectors \(\mathbf{u}_j\) are determined from the relationship \(A\mathbf{v}_j = \sigma_j\mathbf{u}_j\).
  • A singular value decomposition reveals fundamental information about a matrix. For instance, the number of nonzero singular values is the rank \(r\) of the matrix. The first \(r\) left singular vectors form an orthonormal basis for \(\operatorname{Col}(A)\), with the remaining left singular vectors forming an orthonormal basis of \(\operatorname{Nul}(A^T)\). The first \(r\) right singular vectors form an orthonormal basis for \(\operatorname{Col}(A^T)\), while the remaining right singular vectors form an orthonormal basis of \(\operatorname{Nul}(A)\).
  • If \(A\) is a rank \(r\) matrix, we can write a reduced singular value decomposition as \(A = U_r\Sigma_rV_r^T\), where the columns of \(U_r\) form an orthonormal basis for \(\operatorname{Col}(A)\), the columns of \(V_r\) form an orthonormal basis for \(\operatorname{Col}(A^T)\), and \(\Sigma_r\) is an \(r\times r\) diagonal, invertible matrix.

Exercises 7.4.5 Exercises

1.

Consider the matrix \(A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}\).
  1. Find the Gram matrix \(G = A^TA\) and use it to find the singular values and right singular vectors of \(A\).
  2. Find the left singular vectors.
  3. Form the matrices \(U\), \(\Sigma\), and \(V\) and verify that \(A = U\Sigma V^T\).
  4. What is \(\operatorname{rank}(A)\), and what does this say about \(\operatorname{Col}(A)\)?
  5. Determine an orthonormal basis for \(\operatorname{Nul}(A)\).

2.

Find singular value decompositions for the following matrices:
  1. \(\begin{bmatrix} 0 & 0 \\ 0 & 8 \end{bmatrix}\)
  2. \(\begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}\)
  3. \(\begin{bmatrix} 4 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}\)
  4. \(\begin{bmatrix} 4 & 0 \\ 0 & 0 \\ 0 & 2 \end{bmatrix}\)

3.

Consider the matrix \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\).
  1. Find a singular value decomposition of \(A\) and verify that it is also an orthogonal diagonalization of \(A\).
  2. If \(A\) is a symmetric, positive semidefinite matrix, explain why a singular value decomposition of \(A\) is an orthogonal diagonalization of \(A\).

4.

Suppose that the matrix \(A\) has the singular value decomposition
\[
\begin{bmatrix}
0.46 & 0.52 & 0.46 & 0.55 \\
0.82 & 0.00 & 0.14 & 0.55 \\
0.04 & 0.44 & 0.85 & 0.28 \\
0.34 & 0.73 & 0.18 & 0.55
\end{bmatrix}
\begin{bmatrix}
6.2 & 0.0 & 0.0 \\
0.0 & 4.1 & 0.0 \\
0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0
\end{bmatrix}
\begin{bmatrix}
0.74 & 0.62 & 0.24 \\
0.28 & 0.62 & 0.73 \\
0.61 & 0.48 & 0.64
\end{bmatrix}.
\]
  1. What are the dimensions of \(A\)?
  2. What is \(\operatorname{rank}(A)\)?
  3. Find orthonormal bases for \(\operatorname{Col}(A)\), \(\operatorname{Nul}(A)\), \(\operatorname{Col}(A^T)\), and \(\operatorname{Nul}(A^T)\).
  4. Find the orthogonal projection of \(\mathbf{b} = \begin{bmatrix} 1 \\ 0 \\ 2 \\ 1 \end{bmatrix}\) onto \(\operatorname{Col}(A)\).

5.

Consider the matrix \(A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & -2 & 0 \\ 1 & 1 & 2 \end{bmatrix}\).
  1. Construct the Gram matrix \(G\) and use it to find the singular values and right singular vectors \(\mathbf{v}_1\), \(\mathbf{v}_2\), and \(\mathbf{v}_3\) of \(A\). What are the matrices \(\Sigma\) and \(V\) in a singular value decomposition?
  2. What is \(\operatorname{rank}(A)\)?
  3. Find as many left singular vectors \(\mathbf{u}_j\) as you can using the relationship \(A\mathbf{v}_j = \sigma_j\mathbf{u}_j\).
  4. Find an orthonormal basis for \(\operatorname{Nul}(A^T)\) and use it to construct the matrix \(U\) so that \(A = U\Sigma V^T\).
  5. State an orthonormal basis for \(\operatorname{Nul}(A)\) and an orthonormal basis for \(\operatorname{Col}(A)\).

6.

Consider the matrix \(B = \begin{bmatrix} 1 & 0 \\ 2 & 1 \\ 1 & 2 \end{bmatrix}\) and notice that \(B = A^T\), where \(A\) is the matrix in Exercise 7.4.5.1.
  1. Use your result from Exercise 7.4.5.1 to find a singular value decomposition of \(B = U\Sigma V^T\).
  2. What is \(\operatorname{rank}(B)\)? Determine a basis for \(\operatorname{Col}(B)\) and \(\operatorname{Col}(B)^\perp\).
  3. Suppose that \(\mathbf{b} = \begin{bmatrix} 3 \\ 4 \\ 7 \end{bmatrix}\). Use the bases you found in the previous part of this exercise to write \(\mathbf{b} = \widehat{\mathbf{b}} + \mathbf{b}^\perp\), where \(\widehat{\mathbf{b}}\) is in \(\operatorname{Col}(B)\) and \(\mathbf{b}^\perp\) is in \(\operatorname{Col}(B)^\perp\).
  4. Find the least-squares approximate solution to the equation \(B\mathbf{x} = \mathbf{b}\).

7.

Suppose that \(A\) is a square \(m\times m\) matrix with singular value decomposition \(A = U\Sigma V^T\).
  1. If \(A\) is invertible, find a singular value decomposition of \(A^{-1}\).
  2. What condition on the singular values must hold for \(A\) to be invertible?
  3. How are the singular values of \(A\) and the singular values of \(A^{-1}\) related to one another?
  4. How are the right and left singular vectors of \(A\) related to the right and left singular vectors of \(A^{-1}\)?

8.

  1. If \(Q\) is an orthogonal matrix, remember that \(Q^TQ = I\). Explain why \(\det Q = \pm1\).
  2. If \(A = U\Sigma V^T\) is a singular value decomposition of a square matrix \(A\), explain why \(|\det A|\) is the product of the singular values of \(A\).
  3. What does this say about the singular values of \(A\) if \(A\) is invertible?

9.

If \(A\) is a matrix and \(G = A^TA\) its Gram matrix, remember that
\[
\mathbf{x}\cdot(G\mathbf{x}) = \mathbf{x}\cdot(A^TA\mathbf{x}) = (A\mathbf{x})\cdot(A\mathbf{x}) = |A\mathbf{x}|^2.
\]
  1. For a general matrix \(A\), explain why the eigenvalues of \(G\) are nonnegative.
  2. Given a symmetric matrix \(A\) having an eigenvalue \(\lambda\), explain why \(\lambda^2\) is an eigenvalue of \(G\).
  3. If \(A\) is symmetric, explain why the singular values of \(A\) equal the absolute values of its eigenvalues: \(\sigma_j = |\lambda_j|\).

10.

Determine whether the following statements are true or false and explain your reasoning.
  1. If \(A = U\Sigma V^T\) is a singular value decomposition of \(A\), then \(G = V(\Sigma^T\Sigma)V^T\) is an orthogonal diagonalization of its Gram matrix.
  2. If \(A = U\Sigma V^T\) is a singular value decomposition of a rank 2 matrix \(A\), then \(\mathbf{v}_1\) and \(\mathbf{v}_2\) form an orthonormal basis for the column space \(\operatorname{Col}(A)\).
  3. If \(A\) is a symmetric matrix, then its set of singular values is the same as its set of eigenvalues.
  4. If \(A\) is a \(10\times7\) matrix and \(\sigma_7 = 4\), then the columns of \(A\) are linearly independent.
  5. The Gram matrix is always orthogonally diagonalizable.

11.

Suppose that \(A = U\Sigma V^T\) is a singular value decomposition of the \(m\times n\) matrix \(A\). If \(\sigma_1, \ldots, \sigma_r\) are the nonzero singular values, the general form of the matrix \(\Sigma\) is
\[
\Sigma = \begin{bmatrix}
\sigma_1 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & \sigma_r & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0
\end{bmatrix}.
\]
  1. If you know that the columns of \(A\) are linearly independent, what more can you say about the form of \(\Sigma\)?
  2. If you know that the columns of \(A\) span \(\mathbb{R}^m\), what more can you say about the form of \(\Sigma\)?
  3. If you know that the columns of \(A\) are linearly independent and span \(\mathbb{R}^m\), what more can you say about the form of \(\Sigma\)?