
Section 6.4 Finding orthogonal bases

The last section demonstrated the value of working with orthogonal, and especially orthonormal, sets. If we have an orthogonal basis w1, w2, …, wn for a subspace W, the Projection Formula 6.3.15 tells us that the orthogonal projection of a vector b onto W is
\[
\hat{b} = \frac{b\cdot w_1}{w_1\cdot w_1}\,w_1 + \frac{b\cdot w_2}{w_2\cdot w_2}\,w_2 + \cdots + \frac{b\cdot w_n}{w_n\cdot w_n}\,w_n.
\]
An orthonormal basis u1, u2, …, un is even more convenient: after forming the matrix \(Q=\begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}\), we have \(\hat{b} = QQ^Tb\).
In the examples we’ve seen so far, however, orthogonal bases were given to us. What we need now is a way to form orthogonal bases. In this section, we’ll explore an algorithm that begins with a basis for a subspace and creates an orthogonal basis. Once we have an orthogonal basis, we can scale each of the vectors appropriately to produce an orthonormal basis.
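To make the two formulas above concrete, here is a small numerical sketch in Python with NumPy (NumPy rather than the text's Sage setup, and the vectors below are our own illustration, not taken from the text):

```python
import numpy as np

# An orthonormal basis u1, u2 for a plane W in R^3 (chosen for illustration).
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])

b = np.array([3.0, 5.0, 2.0])

# Projection Formula with an orthonormal basis: the denominators u_i . u_i are 1.
bhat_sum = (b @ u1) * u1 + (b @ u2) * u2

# Matrix form: Q has the u_i as columns, and bhat = Q Q^T b.
Q = np.column_stack([u1, u2])
bhat_matrix = Q @ Q.T @ b

print(bhat_sum)                            # [4. 4. 2.]
print(np.allclose(bhat_sum, bhat_matrix))  # True
```

Both computations produce the same projection, which is exactly the convenience of forming the matrix Q.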

Preview Activity 6.4.1.

Suppose we have a basis for R2 consisting of the vectors
\[
v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 0 \\ 2 \end{bmatrix}
\]
as shown in Figure 6.4.1. Notice that this basis is not orthogonal.
Figure 6.4.1. A basis for R2.
  1. Find the vector \(\hat{v}_2\) that is the orthogonal projection of v2 onto the line defined by v1.
  2. Explain why \(v_2-\hat{v}_2\) is orthogonal to v1.
  3. Define the new vectors \(w_1=v_1\) and \(w_2=v_2-\hat{v}_2\) and sketch them in Figure 6.4.2. Explain why w1 and w2 define an orthogonal basis for R2.
    Figure 6.4.2. Sketch the new basis w1 and w2.
  4. Write the vector \(b=\begin{bmatrix} 8 \\ 10 \end{bmatrix}\) as a linear combination of w1 and w2.
  5. Scale the vectors w1 and w2 to produce an orthonormal basis u1 and u2 for R2.
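The key computation in this activity is projecting one vector onto the line defined by another. A minimal sketch in Python with NumPy (the helper name `project_onto_line` is ours, and the vectors here are illustrative rather than the activity's):

```python
import numpy as np

def project_onto_line(v, w):
    """Orthogonal projection of v onto the line spanned by w: ((v.w)/(w.w)) w."""
    return (v @ w) / (w @ w) * w

v1 = np.array([2.0, 1.0])
v2 = np.array([1.0, 3.0])

vhat2 = project_onto_line(v2, v1)  # component of v2 along v1
w2 = v2 - vhat2                    # what remains is orthogonal to v1

print(vhat2)     # [2. 1.]
print(w2 @ v1)   # 0.0
```

Subtracting the projection always leaves a vector orthogonal to the line, which is the observation the rest of the section builds on.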

Subsection 6.4.1 Gram-Schmidt orthogonalization

The preview activity illustrates the main idea behind an algorithm, known as Gram-Schmidt orthogonalization, that begins with a basis for some subspace of Rm and produces an orthogonal or orthonormal basis. The algorithm relies on our construction of the orthogonal projection. Remember that we formed the orthogonal projection \(\hat{b}\) of b onto a subspace W by requiring that \(b-\hat{b}\) is orthogonal to W, as shown in Figure 6.4.3.
Figure 6.4.3. If \(\hat{b}\) is the orthogonal projection of b onto W, then \(b-\hat{b}\) is orthogonal to W.
This observation guides our construction of an orthogonal basis because it allows us to create a vector that is orthogonal to a given subspace. Let’s see how the Gram-Schmidt algorithm works.

Activity 6.4.2.

Suppose that W is a three-dimensional subspace of R4 with basis:
\[
v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 2 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 1 \\ 3 \\ 3 \\ 3 \end{bmatrix}.
\]
We can see that this basis is not orthogonal by noting that \(v_1\cdot v_2 = 8\). Our goal is to create an orthogonal basis w1, w2, and w3 for W.
To begin, we declare that w1=v1, and we call W1 the line defined by w1.
  1. Find the vector \(\hat{v}_2\) that is the orthogonal projection of v2 onto W1, the line defined by w1.
  2. Form the vector \(w_2=v_2-\hat{v}_2\) and verify that it is orthogonal to w1.
  3. Explain why Span{w1,w2}=Span{v1,v2} by showing that any linear combination of v1 and v2 can be written as a linear combination of w1 and w2 and vice versa.
  4. The vectors w1 and w2 are an orthogonal basis for a two-dimensional subspace W2 of R4. Find the vector \(\hat{v}_3\) that is the orthogonal projection of v3 onto W2.
  5. Verify that \(w_3=v_3-\hat{v}_3\) is orthogonal to both w1 and w2.
  6. Explain why w1, w2, and w3 form an orthogonal basis for W.
  7. Now find an orthonormal basis for W.
As this activity illustrates, Gram-Schmidt orthogonalization begins with a basis v1, v2, …, vn for a subspace W of Rm and creates an orthogonal basis for W. Let’s work through a second example.

Example 6.4.4.

Let’s start with the basis
\[
v_1 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 3 \\ 3 \\ 0 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 2 \\ 7 \\ -1 \end{bmatrix},
\]
which is a basis for R3.
To get started, we’ll simply set \(w_1=v_1=\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}\). We construct w2 from v2 by subtracting its orthogonal projection onto W1, the line defined by w1. This gives
\[
w_2 = v_2 - \frac{v_2\cdot w_1}{w_1\cdot w_1}\,w_1 = v_2 - w_1 = \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}.
\]
Notice that we found v2 = w1 + w2. Therefore, we can rewrite any linear combination of v1 and v2 as
\[
c_1v_1 + c_2v_2 = c_1w_1 + c_2(w_1+w_2) = (c_1+c_2)\,w_1 + c_2\,w_2,
\]
a linear combination of w1 and w2. This tells us that
W2=Span{w1,w2}=Span{v1,v2}.
In other words, w1 and w2 form an orthogonal basis for W2, the two-dimensional subspace that is the span of v1 and v2.
Finally, we form w3 from v3 by subtracting its orthogonal projection onto W2:
\[
w_3 = v_3 - \frac{v_3\cdot w_1}{w_1\cdot w_1}\,w_1 - \frac{v_3\cdot w_2}{w_2\cdot w_2}\,w_2 = v_3 - w_1 - 2w_2 = \begin{bmatrix} -2 \\ 2 \\ 1 \end{bmatrix}.
\]
We can now check that
\[
w_1 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \qquad
w_2 = \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}, \qquad
w_3 = \begin{bmatrix} -2 \\ 2 \\ 1 \end{bmatrix},
\]
is an orthogonal set. Furthermore, we have, as before, Span{w1,w2,w3}=Span{v1,v2,v3}, which says that we have found a new orthogonal basis for R3.
To create an orthonormal basis, we form unit vectors parallel to each of the vectors in the orthogonal basis:
\[
u_1 = \begin{bmatrix} 2/3 \\ 1/3 \\ 2/3 \end{bmatrix}, \qquad
u_2 = \begin{bmatrix} 1/3 \\ 2/3 \\ -2/3 \end{bmatrix}, \qquad
u_3 = \begin{bmatrix} -2/3 \\ 2/3 \\ 1/3 \end{bmatrix}.
\]
More generally, if we have a basis v1, v2, …, vn for a subspace W of Rm, the Gram-Schmidt algorithm creates an orthogonal basis for W in the following way:
\[
\begin{aligned}
w_1 &= v_1 \\
w_2 &= v_2 - \frac{v_2\cdot w_1}{w_1\cdot w_1}\,w_1 \\
w_3 &= v_3 - \frac{v_3\cdot w_1}{w_1\cdot w_1}\,w_1 - \frac{v_3\cdot w_2}{w_2\cdot w_2}\,w_2 \\
&\;\;\vdots \\
w_n &= v_n - \frac{v_n\cdot w_1}{w_1\cdot w_1}\,w_1 - \frac{v_n\cdot w_2}{w_2\cdot w_2}\,w_2 - \cdots - \frac{v_n\cdot w_{n-1}}{w_{n-1}\cdot w_{n-1}}\,w_{n-1}.
\end{aligned}
\]
From here, we may form an orthonormal basis by constructing a unit vector parallel to each vector in the orthogonal basis: \(u_j = \frac{1}{|w_j|}\,w_j\).
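The displayed algorithm translates almost line for line into code. Here is a sketch in Python with NumPy (the function name `gram_schmidt` is ours; the text's Sage environment provides its own helpers instead):

```python
import numpy as np

def gram_schmidt(vectors, normalize=False):
    """Orthogonalize a list of linearly independent vectors.

    Each new vector has its projections onto the earlier w's subtracted off;
    with normalize=True, each w_j is scaled to u_j = w_j / |w_j|.
    """
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in basis:
            w = w - (w @ u) / (u @ u) * u  # subtract projection onto u
        basis.append(w)
    if normalize:
        basis = [w / np.linalg.norm(w) for w in basis]
    return basis

# The basis from Example 6.4.4:
w1, w2, w3 = gram_schmidt([[2, 1, 2], [3, 3, 0], [2, 7, -1]])
print(w1, w2, w3)  # [2. 1. 2.] [ 1.  2. -2.] [-2.  2.  1.]
```

(Subtracting from the running vector w, rather than from the original v, is the "modified" Gram-Schmidt variant; in exact arithmetic it produces the same basis, and it behaves better in floating point.)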

Activity 6.4.3.

Sage can automate these computations for us. Before we begin, however, it will be helpful to understand how we can combine things using a list in Python. For instance, if the vectors v1, v2, and v3 form a basis for a subspace, we can bundle them together using square brackets: [v1, v2, v3]. Furthermore, we could assign this to a variable, such as basis = [v1, v2, v3].
Evaluating the following cell will load in some special commands.
  • There is a command to apply the projection formula: projection(b, basis) returns the orthogonal projection of b onto the subspace spanned by basis, which is a list of vectors.
  • The command unit(w) returns a unit vector parallel to w.
  • Given a collection of vectors, say, v1 and v2, we can form the matrix whose columns are v1 and v2 using matrix([v1, v2]).T. When given a list of vectors, Sage constructs a matrix whose rows are the given vectors. For this reason, we need to apply the transpose.
Let’s now consider W, the subspace of R5 having basis
v1=[146826],v2=[53437],v3=[23021].
  1. Apply the Gram-Schmidt algorithm to find an orthogonal basis w1, w2, and w3 for W.
  2. Find b^, the orthogonal projection of b=[511015] onto W.
  3. Explain why we know that b^ is a linear combination of the original vectors v1, v2, and v3 and then find weights so that
    \(\hat{b} = c_1v_1 + c_2v_2 + c_3v_3.\)
  4. Find an orthonormal basis u1, u2, and u3 for W and form the matrix Q whose columns are these vectors.
  5. Find the product QTQ and explain the result.
  6. Find the matrix P that projects vectors orthogonally onto W and verify that Pb gives b^, the orthogonal projection that you found earlier.

Subsection 6.4.2 QR factorizations

Now that we’ve seen how the Gram-Schmidt algorithm forms an orthonormal basis for a given subspace, we will explore how the algorithm leads to an important matrix factorization known as the QR factorization.

Activity 6.4.4.

Suppose that A is the 4×3 matrix whose columns are
\[
v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 2 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 1 \\ 3 \\ 3 \\ 3 \end{bmatrix}.
\]
These vectors form a basis for W, the subspace of R4 that we encountered in Activity 6.4.2. Since these vectors are the columns of A, we have Col(A)=W.
  1. When we implemented Gram-Schmidt, we first found an orthogonal basis w1, w2, and w3 using
    \[
    \begin{aligned}
    w_1 &= v_1 \\
    w_2 &= v_2 - \frac{v_2\cdot w_1}{w_1\cdot w_1}\,w_1 \\
    w_3 &= v_3 - \frac{v_3\cdot w_1}{w_1\cdot w_1}\,w_1 - \frac{v_3\cdot w_2}{w_2\cdot w_2}\,w_2.
    \end{aligned}
    \]
    Use these expressions to write v1, v2, and v3 as linear combinations of w1, w2, and w3.
  2. We next normalized the orthogonal basis w1, w2, and w3 to obtain an orthonormal basis u1, u2, and u3.
    Write the vectors wi as scalar multiples of ui. Then use these expressions to write v1, v2, and v3 as linear combinations of u1, u2, and u3.
  3. Suppose that Q=[u1u2u3]. Use the result of the previous part to find a vector r1 so that Qr1=v1.
  4. Then find vectors r2 and r3 such that Qr2=v2 and Qr3=v3.
  5. Construct the matrix R=[r1r2r3]. Remembering that A=[v1v2v3], explain why A=QR.
  6. What is special about the shape of R?
  7. Suppose that A is a 10×6 matrix whose columns are linearly independent. This means that the columns of A form a basis for W=Col(A), a 6-dimensional subspace of R10. Suppose that we apply Gram-Schmidt orthogonalization to create an orthonormal basis whose vectors form the columns of Q and that we write A=QR. What are the shapes of Q and R?
When the columns of a matrix A are linearly independent, they form a basis for Col(A) so that we can perform the Gram-Schmidt algorithm. The previous activity shows how this leads to a factorization of A as the product of a matrix Q whose columns are an orthonormal basis for Col(A) and an upper triangular matrix R.
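The factorization can be built directly from the algorithm: each column of Q is a unit vector u_j, and column j of R records the coefficients needed to rebuild v_j from u_1, …, u_j. A sketch in Python with NumPy (the function name is ours, and this is a sketch rather than a production routine):

```python
import numpy as np

def qr_by_gram_schmidt(A):
    """QR factorization of a matrix with linearly independent columns."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        w = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]  # coefficient of v_j along u_i
            w -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(w)      # length of the orthogonalized vector
        Q[:, j] = w / R[j, j]
    return Q, R

# Columns are the basis from Example 6.4.4:
A = np.array([[2, 3, 2],
              [1, 3, 7],
              [2, 0, -1]])
Q, R = qr_by_gram_schmidt(A)
print(R)                      # upper triangular
print(np.allclose(A, Q @ R))  # True
```

Because R[i, j] is only filled in for i ≤ j, the matrix R comes out upper triangular automatically.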

Example 6.4.6.

We’ll consider the matrix
\[
A = \begin{bmatrix} 2 & 3 & 2 \\ 1 & 3 & 7 \\ 2 & 0 & -1 \end{bmatrix},
\]
whose columns, which we’ll denote v1, v2, and v3, are the basis of R3 that we considered in Example 6.4.4. There we found an orthogonal basis w1, w2, and w3 that satisfied
\[
\begin{aligned}
v_1 &= w_1 \\
v_2 &= w_1 + w_2 \\
v_3 &= w_1 + 2w_2 + w_3.
\end{aligned}
\]
In terms of the resulting orthonormal basis u1, u2, and u3, we had
w1=3u1,w2=3u2,w3=3u3
so that
\[
\begin{aligned}
v_1 &= 3u_1 \\
v_2 &= 3u_1 + 3u_2 \\
v_3 &= 3u_1 + 6u_2 + 3u_3.
\end{aligned}
\]
Therefore, if \(Q=\begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}\), we have the QR factorization
\[
A = Q\begin{bmatrix} 3 & 3 & 3 \\ 0 & 3 & 6 \\ 0 & 0 & 3 \end{bmatrix} = QR.
\]
The value of the QR factorization will become clear in the next section where we use it to solve least-squares problems.
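The factorization in this example is easy to verify numerically; a quick check in Python with NumPy, entering Q and R from the example by hand:

```python
import numpy as np

A = np.array([[2, 3, 2],
              [1, 3, 7],
              [2, 0, -1]])

# Columns of Q are the orthonormal vectors u1, u2, u3 from Example 6.4.4.
Q = np.array([[2,  1, -2],
              [1,  2,  2],
              [2, -2,  1]]) / 3

R = np.array([[3, 3, 3],
              [0, 3, 6],
              [0, 0, 3]])

print(np.allclose(A, Q @ R))            # True
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: the columns of Q are orthonormal
```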

Activity 6.4.5.

As before, we would like to use Sage to automate the process of finding and using the QR factorization of a matrix A. Evaluating the following cell provides a command QR(A) that returns the factorization, which may be stored using, for example, Q, R = QR(A).
Suppose that A is the following matrix whose columns are linearly independent.
\[
A = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 2 & 1 \\ 1 & 0 & 1 \\ 1 & 3 & 5 \end{bmatrix}.
\]
  1. If A=QR, what is the shape of Q and R? What is special about the form of R?
  2. Find the QR factorization using Q, R = QR(A) and verify that R has the predicted shape and that A=QR.
  3. Find the matrix P that orthogonally projects vectors onto Col(A).
  4. Find b^, the orthogonal projection of b=[4171422] onto Col(A).
  5. Explain why the equation Ax=b^ must be consistent and then find x.
In fact, Sage provides its own version of the QR factorization that is a bit different from the one we’ve developed here. For this reason, we have provided our own version of the factorization.
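The same caveat applies outside Sage. NumPy's built-in routine, for instance, produces a valid QR factorization but may choose different signs for the columns of Q (equivalently, for the rows of R) than Gram-Schmidt does. A small sketch, with a matrix chosen here for illustration:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(A)  # "reduced" factorization: Q is 3x2, R is 2x2

# Whatever sign conventions are used, the defining properties still hold.
print(np.allclose(A, Q @ R))            # True
print(np.allclose(Q.T @ Q, np.eye(2)))  # True
```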

Subsection 6.4.3 Summary

This section explored the Gram-Schmidt orthogonalization algorithm and how it leads to the matrix factorization A=QR when the columns of A are linearly independent.
  • Beginning with a basis v1, v2, …, vn for a subspace W of Rm, the vectors
    \[
    \begin{aligned}
    w_1 &= v_1 \\
    w_2 &= v_2 - \frac{v_2\cdot w_1}{w_1\cdot w_1}\,w_1 \\
    w_3 &= v_3 - \frac{v_3\cdot w_1}{w_1\cdot w_1}\,w_1 - \frac{v_3\cdot w_2}{w_2\cdot w_2}\,w_2 \\
    &\;\;\vdots \\
    w_n &= v_n - \frac{v_n\cdot w_1}{w_1\cdot w_1}\,w_1 - \cdots - \frac{v_n\cdot w_{n-1}}{w_{n-1}\cdot w_{n-1}}\,w_{n-1}
    \end{aligned}
    \]
    form an orthogonal basis for W.
  • We may scale each vector wi appropriately to obtain an orthonormal basis u1,u2,,un.
  • Expressing the Gram-Schmidt algorithm in matrix form shows that, if the columns of A are linearly independent, then we can write A=QR, where the columns of Q form an orthonormal basis for Col(A) and R is upper triangular.

Exercises 6.4.4 Exercises

1.

Suppose that a subspace W of R3 has a basis formed by
\[
v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}.
\]
  1. Find an orthogonal basis for W.
  2. Find an orthonormal basis for W.
  3. Find the matrix P that projects vectors orthogonally onto W.
  4. Find the orthogonal projection of \(\begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix}\) onto W.

2.

Find the QR factorization of A=[472444].

3.

Consider the basis of R3 given by the vectors
\[
v_1 = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 2 \\ 0 \\ 5 \end{bmatrix}.
\]
  1. Apply the Gram-Schmidt orthogonalization algorithm to find an orthonormal basis u1, u2, u3 for R3.
  2. If A is the 3×3 matrix whose columns are v1, v2, and v3, find the QR factorization of A.
  3. Suppose that we want to solve the equation \(Ax = b = \begin{bmatrix} 9 \\ 1 \\ 7 \end{bmatrix}\), which we can rewrite as QRx=b.
    1. If we set y=Rx, the equation QRx=b becomes Qy=b. Explain how to solve the equation Qy=b in a computationally efficient manner.
    2. Explain how to solve the equation Rx=y in a computationally efficient manner.
    3. Find the solution x by first solving Qy=b and then Rx=y.

4.

Consider the vectors
\[
v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 2 \\ 1 \\ 4 \\ 4 \\ 2 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 5 \\ 4 \\ 3 \\ 7 \\ 1 \end{bmatrix}
\]
and the subspace W of R5 that they span.
  1. Find an orthonormal basis for W.
  2. Find the 5×5 matrix that projects vectors orthogonally onto W.
  3. Find b^, the orthogonal projection of b=[831284] onto W.
  4. Express b^ as a linear combination of v1, v2, and v3.

5.

Consider the set of vectors
\[
v_1 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \qquad
v_3 = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}.
\]
  1. What happens when we apply the Gram-Schmidt orthogonalization algorithm?
  2. Why does the algorithm fail to produce an orthogonal basis for R3?

6.

Suppose that A is a matrix with linearly independent columns and having the factorization A=QR. Determine whether the following statements are true or false and explain your thinking.
  1. It follows that R=QTA.
  2. The matrix R is invertible.
  3. The product QTQ projects vectors orthogonally onto Col(A).
  4. The columns of Q are an orthogonal basis for Col(A).
  5. The orthogonal complement \(\mathrm{Col}(A)^\perp = \mathrm{Nul}(Q^T)\).

7.

Suppose we have the QR factorization A=QR, where A is a 7×4 matrix.
  1. What is the shape of the product QQT? Explain the significance of this product.
  2. What is the shape of the product QTQ? Explain the significance of this product.
  3. What is the shape of the matrix R?
  4. If R is a diagonal matrix, what can you say about the columns of A?

8.

Suppose we have the QR factorization A=QR where the columns of A are a1,a2,,an and the columns of R are r1,r2,,rn.
  1. How can the matrix product ATA be expressed in terms of dot products?
  2. How can the matrix product RTR be expressed in terms of dot products?
  3. Explain why ATA=RTR.
  4. Explain why the dot products satisfy \(a_i\cdot a_j = r_i\cdot r_j\).