
Section 6.3 Orthogonal bases and projections

We know that a linear system Ax=b is inconsistent when b is not in Col(A), the column space of A. Later in this chapter, we’ll develop a strategy for dealing with inconsistent systems by finding b^, the vector in Col(A) that minimizes the distance to b. The equation Ax=b^ is therefore consistent and its solution set can provide us with useful information about the original system Ax=b.
In this section and the next, we’ll develop some techniques that enable us to find b^, the vector in a given subspace W that is closest to a given vector b.

Preview Activity 6.3.1.

For this activity, it will be helpful to recall the distributive property of dot products:
v·(c1w1 + c2w2) = c1 v·w1 + c2 v·w2.
We’ll work with the basis of R^2 formed by the vectors
w1 = [1, 2], w2 = [2, −1].
  1. Verify that the vectors w1 and w2 are orthogonal.
  2. Suppose that b = [7, 4] and find the dot products w1·b and w2·b.
  3. We would like to express b as a linear combination of w1 and w2, which means that we need to find weights c1 and c2 such that
    b=c1w1+c2w2.
    To find the weight c1, dot both sides of this expression with w1:
    b·w1 = (c1w1 + c2w2)·w1,
    and apply the distributive property.
  4. In a similar fashion, find the weight c2.
  5. Verify that b=c1w1+c2w2 using the weights you have found.
We frequently need to write a given vector as a linear combination of given basis vectors. In the past, we have done this by solving a linear system. The preview activity illustrates how this task is simplified when the basis vectors are orthogonal to one another. We’ll explore this and other uses of orthogonal bases in this section.
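The shortcut illustrated by the preview activity is easy to check numerically. The sketch below uses NumPy with a hypothetical orthogonal basis of R^2 (not the one from the activity), so the specific vectors are illustrative assumptions:

```python
import numpy as np

# A hypothetical orthogonal basis of R^2 (illustrative choice)
w1 = np.array([3.0, 1.0])
w2 = np.array([-1.0, 3.0])
assert np.dot(w1, w2) == 0          # the basis vectors are orthogonal

b = np.array([5.0, 10.0])

# The weights come from dot products alone -- no linear system needed
c1 = np.dot(b, w1) / np.dot(w1, w1)
c2 = np.dot(b, w2) / np.dot(w2, w2)

reconstructed = c1 * w1 + c2 * w2   # should rebuild b exactly
```

Comparing `reconstructed` with `b` confirms that the dot-product formula reproduces the weights that row reduction would otherwise find.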

Subsection 6.3.1 Orthogonal sets

The preview activity dealt with a basis of R^2 formed by two orthogonal vectors. More generally, we will consider a set of orthogonal vectors, as described in the next definition.

Definition 6.3.1.

By an orthogonal set of vectors, we mean a set of nonzero vectors each of which is orthogonal to the others.

Example 6.3.2.

The 3-dimensional vectors
w1 = [1, 1, 1], w2 = [1, −1, 0], w3 = [1, 1, −2]
form an orthogonal set, which can be verified by computing
w1·w2 = 0, w1·w3 = 0, w2·w3 = 0.
Notice that this set of vectors forms a basis for R^3.

Example 6.3.3.

The vectors
w1 = [1, 1, 1, 1], w2 = [1, −1, 1, −1], w3 = [1, 1, −1, −1]
form an orthogonal set of 4-dimensional vectors. Since there are only three vectors, this set does not form a basis for R^4. It does, however, form a basis for a 3-dimensional subspace W of R^4.
Suppose that a vector b is a linear combination of an orthogonal set of vectors w1, w2, …, wn; that is, suppose that
c1w1 + c2w2 + ⋯ + cnwn = b.
Just as in the preview activity, we can find the weight c1 by dotting both sides with w1 and applying the distributive property of dot products:
(c1w1 + c2w2 + ⋯ + cnwn)·w1 = b·w1
c1 w1·w1 + c2 w2·w1 + ⋯ + cn wn·w1 = b·w1
c1 w1·w1 = b·w1
c1 = (b·w1)/(w1·w1).
Notice how the presence of an orthogonal set causes most of the terms in the sum to vanish. In the same way, we find that
ci = (b·wi)/(wi·wi)
so that
b = (b·w1)/(w1·w1) w1 + (b·w2)/(w2·w2) w2 + ⋯ + (b·wn)/(wn·wn) wn.
We’ll record this fact in the following proposition.
Using this proposition, we can see that an orthogonal set of vectors must be linearly independent. Suppose, for instance, that w1, w2, …, wn is a set of nonzero orthogonal vectors and that one of the vectors is a linear combination of the others, say,
w3 = c1w1 + c2w2.
Since w3 lies in Span{w1, w2}, the proposition tells us that
w3 = (w3·w1)/(w1·w1) w1 + (w3·w2)/(w2·w2) w2 = 0
because w3 is orthogonal to both w1 and w2. This cannot happen since we know that w3 is nonzero. This tells us that
an orthogonal set of n vectors in R^m is linearly independent and therefore forms a basis for the subspace W = Span{w1, w2, …, wn}. If n = m, the vectors form a basis for R^m.

Activity 6.3.2.

Consider the vectors
w1 = [1, 1, 1], w2 = [1, −1, 0], w3 = [1, 1, −2].
  1. Verify that this set forms an orthogonal set of 3-dimensional vectors.
  2. Explain why we know that this set of vectors forms a basis for R^3.
  3. Suppose that b = [2, 4, 4]. Find the weights c1, c2, and c3 that express b as a linear combination b = c1w1 + c2w2 + c3w3 using Proposition 6.3.4.
  4. If we multiply a vector v by a positive scalar s, the length of v is also multiplied by s; that is, |sv|=s|v|.
    Using this observation, find a vector u1 that is parallel to w1 and has length 1. Such vectors are called unit vectors.
  5. Similarly, find a unit vector u2 that is parallel to w2 and a unit vector u3 that is parallel to w3.
  6. Construct the matrix Q = [u1 u2 u3] and find the product Q^T Q. Use Proposition 6.2.8 to explain your result.
This activity introduces an important way of modifying an orthogonal set so that the vectors in the set have unit length. Recall that we may multiply any nonzero vector w by a scalar so that the new vector has length 1. For instance, we know that if s is a positive scalar, then |sw|=s|w|. To obtain a vector u having unit length, we want
|u|=|sw|=s|w|=1
so that s=1/|w|. Therefore,
u = (1/|w|) w
becomes a unit vector parallel to w.
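As a quick sketch of this rescaling (the vector w here is an arbitrary nonzero choice):

```python
import numpy as np

w = np.array([1.0, 2.0, 2.0])   # arbitrary nonzero vector; |w| = 3
u = w / np.linalg.norm(w)       # multiply by 1/|w| to obtain a unit vector

# u is parallel to w and has length 1
```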
Orthogonal sets in which the vectors have unit length are called orthonormal and are especially convenient.

Definition 6.3.6.

An orthonormal set is an orthogonal set of vectors each of which has unit length.

Example 6.3.7.

The vectors
u1 = [1/√2, 1/√2], u2 = [1/√2, −1/√2]
are an orthonormal set of vectors in R^2 and form an orthonormal basis for R^2.
If we form the matrix
Q = [u1 u2] = [1/√2 1/√2; 1/√2 −1/√2],
we find that Q^T Q = I since Proposition 6.2.8 tells us that
Q^T Q = [u1·u1 u1·u2; u2·u1 u2·u2] = [1 0; 0 1].
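This computation is easy to verify numerically; the sketch below builds Q from u1 = (1/√2)[1, 1] and u2 = (1/√2)[1, −1] and checks that the product Q^T Q is the identity:

```python
import numpy as np

s = 1 / np.sqrt(2)
u1 = np.array([s,  s])
u2 = np.array([s, -s])

Q = np.column_stack([u1, u2])   # columns are the orthonormal vectors
G = Q.T @ Q                     # entries are the dot products ui . uj
```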
The previous activity and example illustrate the next proposition.

Subsection 6.3.2 Orthogonal projections

We now turn to an important problem that will appear in many forms in the rest of our explorations. Suppose, as shown in Figure 6.3.9, that we have a subspace W of R^m and a vector b that is not in that subspace. We would like to find the vector b^ in W that is closest to b, meaning the distance between b^ and b is as small as possible.
Figure 6.3.9. Given a plane in R3 and a vector b not in the plane, we wish to find the vector b^ in the plane that is closest to b.
To get started, let’s consider a simpler problem where we have a line L in R^2, defined by the vector w, and another vector b that is not on the line, as shown on the left of Figure 6.3.10. We wish to find b^, the vector on the line that is closest to b, as illustrated in the right of Figure 6.3.10.
Figure 6.3.10. Given a line L and a vector b, we seek the vector b^ on L that is closest to b.
To find b^, we require that b − b^ be orthogonal to L. For instance, if y is another vector on the line, as shown in Figure 6.3.11, then the Pythagorean theorem implies that
|b − y|² = |b − b^|² + |b^ − y|²,
which means that |b − y| ≥ |b − b^|. Therefore, b^ is closer to b than any other vector on the line L.
Figure 6.3.11. The vector b^ is closer to b than y because b − b^ is orthogonal to L.
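The same comparison can be carried out numerically. This sketch assumes a hypothetical direction w for the line and a hypothetical vector b, projects b onto the line, and checks that no sampled point on the line comes closer:

```python
import numpy as np

w = np.array([1.0, 2.0])   # hypothetical direction of the line L
b = np.array([3.0, 1.0])   # hypothetical vector not on L

# Orthogonal projection of b onto the line spanned by w
bhat = (np.dot(b, w) / np.dot(w, w)) * w

# b - bhat is orthogonal to the line ...
assert abs(np.dot(b - bhat, w)) < 1e-12

# ... so bhat is at least as close to b as any sampled point y = t*w
for t in np.linspace(-10, 10, 201):
    y = t * w
    assert np.linalg.norm(b - y) >= np.linalg.norm(b - bhat) - 1e-12
```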

Definition 6.3.12.

Given a vector b in R^m and a subspace W of R^m, the orthogonal projection of b onto W is the vector b^ in W that is closest to b. It is characterized by the property that b − b^ is orthogonal to W.

Activity 6.3.3.

This activity demonstrates how to determine the orthogonal projection of a vector onto a subspace of R^m.
  1. Let’s begin by considering a line L, defined by the vector w = [2, 1], and a vector b = [2, 4] not on L, as illustrated in Figure 6.3.13.
    Figure 6.3.13. Finding the orthogonal projection of b onto the line defined by w.
    1. To find b^, first notice that b^ = sw for some scalar s. Since b − b^ = b − sw is orthogonal to w, what do we know about the dot product
      (b − sw)·w?
    2. Apply the distributive property of dot products to find the scalar s. What is the vector b^, the orthogonal projection of b onto L?
    3. More generally, explain why the orthogonal projection of b onto the line defined by w is
      b^ = (b·w)/(w·w) w.
  2. The same ideas apply more generally. Suppose we have an orthogonal set of vectors w1 = [2, 2, 1] and w2 = [1, 0, −2] that define a plane W in R^3. If b = [3, 9, 6] is another vector in R^3, we seek the vector b^ on the plane W closest to b. As before, the vector b − b^ will be orthogonal to W, as illustrated in Figure 6.3.14.
    Figure 6.3.14. Given a plane W defined by the orthogonal vectors w1 and w2 and another vector b, we seek the vector b^ on W closest to b.
    1. The vector b − b^ is orthogonal to W. What does this say about the dot products (b − b^)·w1 and (b − b^)·w2?
    2. Since b^ is in the plane W, we can write it as a linear combination b^=c1w1+c2w2. Then
      b − b^ = b − (c1w1 + c2w2).
      Find the weight c1 by dotting b − b^ with w1 and applying the distributive property of dot products. Similarly, find the weight c2.
    3. What is the vector b^, the orthogonal projection of b onto the plane W?
  3. Suppose that W is a subspace of R^m with orthogonal basis w1, w2, …, wn and that b is a vector in R^m. Explain why the orthogonal projection of b onto W is the vector
    b^ = (b·w1)/(w1·w1) w1 + (b·w2)/(w2·w2) w2 + ⋯ + (b·wn)/(wn·wn) wn.
  4. Suppose that u1, u2, …, un is an orthonormal basis for W; that is, the vectors are orthogonal to one another and have unit length. Explain why the orthogonal projection is
    b^ = (b·u1) u1 + (b·u2) u2 + ⋯ + (b·un) un.
  5. If Q = [u1 u2 ⋯ un] is the matrix whose columns are an orthonormal basis of W, use Proposition 6.2.8 to explain why b^ = Q Q^T b.
In all the cases considered in the activity, we are looking for b^, the vector in a subspace W closest to a vector b, which is found by requiring that b − b^ be orthogonal to W. This means that (b − b^)·w = 0 for any vector w in W.
If we have an orthogonal basis w1, w2, …, wn for W, then b^ = c1w1 + c2w2 + ⋯ + cnwn. Therefore,
(b − b^)·wi = 0
b·wi = b^·wi
b·wi = (c1w1 + c2w2 + ⋯ + cnwn)·wi
b·wi = ci wi·wi
ci = (b·wi)/(wi·wi).
This leads to the projection formula:

Caution.

Remember that the projection formula given in Proposition 6.3.15 applies only when the basis w1, w2, …, wn of W is orthogonal.
If we have an orthonormal basis u1, u2, …, un for W, the projection formula simplifies to
b^ = (b·u1) u1 + (b·u2) u2 + ⋯ + (b·un) un.
If we then form the matrix
Q = [u1 u2 ⋯ un],
this expression may be succinctly written as
b^ = (b·u1) u1 + (b·u2) u2 + ⋯ + (b·un) un = [u1 u2 ⋯ un] [u1·b; u2·b; ⋯; un·b] = Q Q^T b.
This leads to the following proposition.

Example 6.3.17.

In the previous activity, we looked at the plane W defined by the two orthogonal vectors
w1 = [2, 2, 1], w2 = [1, 0, −2].
We can form an orthonormal basis by multiplying these vectors by scalars so that they have unit length:
u1 = (1/3)[2, 2, 1] = [2/3, 2/3, 1/3], u2 = (1/√5)[1, 0, −2] = [1/√5, 0, −2/√5].
Using these vectors, we form the matrix
Q = [2/3 1/√5; 2/3 0; 1/3 −2/√5].
The projection onto the plane W is then given by the matrix
Q Q^T = [2/3 1/√5; 2/3 0; 1/3 −2/√5] [2/3 2/3 1/3; 1/√5 0 −2/√5] = [29/45 4/9 −8/45; 4/9 4/9 2/9; −8/45 2/9 41/45].
Let’s check that this works by considering the vector b = [1, 0, 0] and finding b^, its orthogonal projection onto the plane W. In terms of the original basis w1 and w2, the projection formula from Proposition 6.3.15 tells us that
b^ = (b·w1)/(w1·w1) w1 + (b·w2)/(w2·w2) w2 = [29/45, 4/9, −8/45].
Alternatively, we use the matrix Q Q^T, as in Proposition 6.3.16, to find that
b^ = Q Q^T b = [29/45 4/9 −8/45; 4/9 4/9 2/9; −8/45 2/9 41/45] [1, 0, 0] = [29/45, 4/9, −8/45].
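This example can be reproduced with a few lines of NumPy. The signs below follow the reading u2 = (1/√5)[1, 0, −2], an assumption consistent with the orthogonality of the basis in the example:

```python
import numpy as np

u1 = np.array([2.0, 2.0, 1.0]) / 3
u2 = np.array([1.0, 0.0, -2.0]) / np.sqrt(5)
Q = np.column_stack([u1, u2])

P = Q @ Q.T                     # projects R^3 orthogonally onto the plane W
b = np.array([1.0, 0.0, 0.0])
bhat = P @ b                    # equals [29/45, 4/9, -8/45]
```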

Activity 6.3.4.

  1. Suppose that L is the line in R^3 defined by the vector w = [1, 2, 2].
    1. Find an orthonormal basis u for L.
    2. Construct the matrix Q=[u] and use it to construct the matrix P that projects vectors orthogonally onto L.
    3. Use your matrix to find b^, the orthogonal projection of b = [1, 1, 1] onto L.
    4. Find rank(P) and explain its geometric significance.
  2. The vectors
    w1 = [1, 1, 1, 1], w2 = [0, 1, 1, −2]
    form an orthogonal basis of W, a two-dimensional subspace of R^4.
    1. Use the projection formula from Proposition 6.3.15 to find b^, the orthogonal projection of b = [9, 2, 2, 3] onto W.
    2. Find an orthonormal basis u1 and u2 for W and use it to construct the matrix P that projects vectors orthogonally onto W. Check that Pb=b^, the orthogonal projection you found in the previous part of this activity.
    3. Find rank(P) and explain its geometric significance.
    4. Find a basis for W⊥.
    5. Find a vector b⊥ in W⊥ such that
      b = b^ + b⊥.
    6. If Q is the matrix whose columns are u1 and u2, find the product Q^T Q and explain your result.
This activity demonstrates one issue of note. We found b^, the orthogonal projection of b onto W, by requiring that b − b^ be orthogonal to W. In other words, b − b^ is a vector in the orthogonal complement W⊥, which we may denote b⊥. This explains the following proposition, which is illustrated in Figure 6.3.19.
Figure 6.3.19. A vector b along with b^, its orthogonal projection onto the line L, and b⊥, its orthogonal projection onto the orthogonal complement L⊥.
Let’s summarize what we’ve found. If Q is a matrix whose columns u1, u2, …, un form an orthonormal set in R^m, then
  • Q^T Q = In, the n×n identity matrix, because this product computes the dot products between the columns of Q.
  • Q Q^T is the matrix that projects vectors orthogonally onto W, the subspace of R^m spanned by u1, …, un.
As we’ve said before, matrix multiplication depends on the order in which we multiply the matrices, and we see this clearly here.
Because Q^T Q = I, there is a temptation to say that Q is invertible. This is usually not the case, however. Remember that an invertible matrix must be a square matrix, and the matrix Q will only be square if n = m. In this case, there are m vectors in the orthonormal set, so the subspace W spanned by the vectors u1, u2, …, um is R^m. If b is a vector in R^m, then b^ = Q Q^T b is the orthogonal projection of b onto R^m. In other words, Q Q^T b is the closest vector in R^m to b, and this closest vector must be b itself. Therefore, Q Q^T b = b, which means that Q Q^T = I. In this case, Q is an invertible matrix.
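The asymmetry between Q^T Q and Q Q^T shows up immediately in a computation. The sketch below uses a hypothetical 3×2 matrix with orthonormal columns:

```python
import numpy as np

# Orthonormal columns spanning a plane in R^3 (hypothetical choice)
u1 = np.array([1.0,  1.0, 1.0]) / np.sqrt(3)
u2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
Q = np.column_stack([u1, u2])

left  = Q.T @ Q   # 2x2 identity: the columns are orthonormal
right = Q @ Q.T   # 3x3 projection onto the plane, NOT the identity
```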

Example 6.3.20.

Consider the orthonormal set of vectors
u1 = [1/√3, 1/√3, 1/√3], u2 = [1/√2, −1/√2, 0]
and the matrix they define
Q = [1/√3 1/√2; 1/√3 −1/√2; 1/√3 0].
In this case, u1 and u2 span a plane, a 2-dimensional subspace of R^3. We know that Q^T Q = I2 and Q Q^T projects vectors orthogonally onto the plane. However, Q is not a square matrix so it cannot be invertible.

Example 6.3.21.

Now consider the orthonormal set of vectors
u1 = [1/√3, 1/√3, 1/√3], u2 = [1/√2, −1/√2, 0], u3 = [1/√6, 1/√6, −2/√6]
and the matrix they define
Q = [1/√3 1/√2 1/√6; 1/√3 −1/√2 1/√6; 1/√3 0 −2/√6].
Here, u1, u2, and u3 form a basis for R^3 so that both Q^T Q = I3 and Q Q^T = I3. Therefore, Q is a square matrix and is invertible.
Moreover, since Q^T Q = I, we see that Q⁻¹ = Q^T, so finding the inverse of Q is as simple as writing its transpose. Matrices with this property are very special and will play an important role in our upcoming work. We will therefore give them a special name.

Definition 6.3.22.

A square m×m matrix Q whose columns form an orthonormal basis for Rm is called orthogonal.
This terminology can be a little confusing. We call a basis orthogonal if the basis vectors are orthogonal to one another. However, a matrix is orthogonal if the columns are orthogonal to one another and have unit length. It pays to keep this in mind when reading statements about orthogonal bases and orthogonal matrices. In the meantime, we record the following proposition.
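As a concrete check of these properties, a 2×2 rotation matrix (orthogonal for every angle; the angle below is an arbitrary choice) satisfies Q^T Q = I, Q⁻¹ = Q^T, and preserves lengths:

```python
import numpy as np

theta = 0.7   # arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(Q.T @ Q, np.eye(2))               # orthonormal columns
assert np.allclose(np.linalg.inv(Q), Q.T)            # inverse is the transpose
x = np.array([3.0, 4.0])
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))  # |Qx| = |x|
```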

Subsection 6.3.3 Summary

This section introduced orthogonal sets and the projection formula that allows us to project vectors orthogonally onto a subspace.
  • Given an orthogonal set w1, w2, …, wn that spans an n-dimensional subspace W of R^m, the orthogonal projection of b onto W is the vector in W closest to b and may be written as
    b^ = (b·w1)/(w1·w1) w1 + (b·w2)/(w2·w2) w2 + ⋯ + (b·wn)/(wn·wn) wn.
  • If u1, u2, …, un is an orthonormal basis of W and Q is the matrix whose columns are u1, u2, …, un, then the matrix P = Q Q^T projects vectors orthogonally onto W.
  • If the columns of Q form an orthonormal basis for an n-dimensional subspace of R^m, then Q^T Q = In.
  • An orthogonal matrix Q is a square matrix whose columns form an orthonormal basis. In this case, Q Q^T = Q^T Q = I so that Q⁻¹ = Q^T.

Exercises 6.3.4 Exercises

1.

Suppose that
w1 = [1, 1, 1], w2 = [1, −2, 1].
  1. Verify that w1 and w2 form an orthogonal basis for a plane W in R^3.
  2. Use Proposition 6.3.15 to find b^, the orthogonal projection of b = [2, 1, 1] onto W.
  3. Find an orthonormal basis u1, u2 for W.
  4. Find the matrix P representing the matrix transformation that projects vectors in R^3 orthogonally onto W. Verify that b^ = Pb.
  5. Determine rank(P) and explain its geometric significance.

2.

Consider the vectors
w1 = [1, 1, 1], w2 = [1, 0, −1], w3 = [1, −2, 1].
  1. Explain why these vectors form an orthogonal basis for R^3.
  2. Suppose that A = [w1 w2 w3] and evaluate the product A^T A. Why is this product a diagonal matrix, and what is the significance of the diagonal entries?
  3. Express the vector b = [3, 6, 3] as a linear combination of w1, w2, and w3.
  4. Multiply the vectors w1, w2, w3 by appropriate scalars to find an orthonormal basis u1, u2, u3 of R^3.
  5. If Q = [u1 u2 u3], find the matrix product Q Q^T and explain the result.

3.

Suppose that
w1 = [1, 1, 0, −1], w2 = [1, 0, 1, 1]
form an orthogonal basis for a subspace W of R^4.
  1. Find b^, the orthogonal projection of b = [2, 1, 6, 7] onto W.
  2. Find the vector b⊥ in W⊥ such that b = b^ + b⊥.
  3. Find a basis for W⊥ and express b⊥ as a linear combination of the basis vectors.

4.

Consider the vectors
w1 = [1, 1, 0, 0], w2 = [0, 0, 1, 1], b = [2, 4, 1, 3].
  1. If L is the line defined by the vector w1, find the vector in L closest to b. Call this vector b^1.
  2. If W is the subspace spanned by w1 and w2, find the vector in W closest to b. Call this vector b^2.
  3. Determine whether b^1 or b^2 is closer to b and explain why.

5.

Suppose that w = [2, 1, 2] defines a line L in R^3.
  1. Find the orthogonal projections of the vectors [1, 0, 0], [0, 1, 0], and [0, 0, 1] onto L.
  2. Find the matrix P = (1/|w|²) w w^T.
  3. Use Proposition 2.5.6 to explain why the columns of P are related to the orthogonal projections you found in the first part of this exercise.

6.

Suppose that
v1 = [1, 0, 3], v2 = [2, 2, 2]
form a basis for a plane W in R^3.
  1. Find a basis for the line that is the orthogonal complement W⊥.
  2. Given the vector b = [6, 6, 2], find y, the orthogonal projection of b onto the line W⊥.
  3. Explain why the vector z = b − y must be in W, and write z as a linear combination of v1 and v2.

7.

Determine whether the following statements are true or false and explain your thinking.
  1. If the columns of Q form an orthonormal basis for a subspace W and w is a vector in W, then Q Q^T w = w.
  2. An orthogonal set of vectors in R^8 can have no more than 8 vectors.
  3. If Q is a 7×5 matrix whose columns are orthonormal, then Q Q^T = I7.
  4. If Q is a 7×5 matrix whose columns are orthonormal, then Q^T Q = I5.
  5. If the orthogonal projection of b onto a subspace W satisfies b^=0, then b is in W.

8.

Suppose that Q is an orthogonal matrix.
  1. Remembering that v·w = v^T w, explain why
    (Qx)·(Qy) = x·y.
  2. Explain why |Qx|=|x|.
    This means that the length of a vector is unchanged after multiplying by an orthogonal matrix.
  3. If λ is a real eigenvalue of Q, explain why λ=±1.

9.

Explain why the following statements are true.
  1. If Q is an orthogonal matrix, then det Q = ±1.
  2. If Q is an 8×4 matrix whose columns are orthonormal, then Q Q^T is an 8×8 matrix whose rank is 4.
  3. If b^ is the orthogonal projection of b onto a subspace W, then b − b^ is the orthogonal projection of b onto W⊥.

10.

This exercise is about 2×2 orthogonal matrices.
  1. In Section 2.6, we saw that the matrix [cos θ −sin θ; sin θ cos θ] represents a rotation by an angle θ. Explain why this matrix is an orthogonal matrix.
  2. We also saw that the matrix [cos θ sin θ; sin θ −cos θ] represents a reflection in a line. Explain why this matrix is an orthogonal matrix.
  3. Suppose that u1 = [cos θ, sin θ] is a 2-dimensional unit vector. Use a sketch to indicate all the possible vectors u2 such that u1 and u2 form an orthonormal basis of R^2.
  4. Explain why every 2×2 orthogonal matrix is either a rotation or a reflection.