We know that a linear system $A\mathbf{x}=\mathbf{b}$ is inconsistent when $\mathbf{b}$ is not in $\operatorname{Col}(A)$, the column space of $A$. Later in this chapter, we’ll develop a strategy for dealing with inconsistent systems by finding $\widehat{\mathbf{b}}$, the vector in $\operatorname{Col}(A)$ that minimizes the distance to $\mathbf{b}$. The equation $A\mathbf{x}=\widehat{\mathbf{b}}$ is therefore consistent, and its solution set can provide us with useful information about the original system $A\mathbf{x}=\mathbf{b}$.
In this section and the next, we’ll develop some techniques that enable us to find $\widehat{\mathbf{b}}$, the vector in a given subspace $W$ that is closest to a given vector $\mathbf{b}$.
We frequently need to write a given vector as a linear combination of given basis vectors. In the past, we have done this by solving a linear system. The preview activity illustrates how this task is simplified when the basis vectors are orthogonal to each other. We’ll explore this and other uses of orthogonal bases in this section.
The preview activity dealt with a basis of $\mathbb{R}^2$ formed by two orthogonal vectors. More generally, we will consider a set of orthogonal vectors, as described in the next definition.
The vectors $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ form an orthogonal set of 4-dimensional vectors. Since there are only three vectors, this set does not form a basis for $\mathbb{R}^4$. It does, however, form a basis for a 3-dimensional subspace $W$ of $\mathbb{R}^4$.
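To make this concrete, here is a short numerical sketch in Python with numpy. The vectors $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ below are illustrative stand-ins, not the example’s actual vectors, but any pairwise-orthogonal set behaves the same way.

```python
import numpy as np

# Hypothetical stand-ins for the example's 4-dimensional vectors:
# any set of pairwise-orthogonal vectors illustrates the definition.
w1 = np.array([1.0,  1.0,  1.0,  1.0])
w2 = np.array([1.0, -1.0,  1.0, -1.0])
w3 = np.array([1.0,  1.0, -1.0, -1.0])

# An orthogonal set: every pair of distinct vectors has dot product zero.
vectors = [w1, w2, w3]
for i in range(len(vectors)):
    for j in range(i + 1, len(vectors)):
        print(f"w{i+1} . w{j+1} =", np.dot(vectors[i], vectors[j]))  # 0.0 each time
```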
Using this proposition, we can see that an orthogonal set of nonzero vectors must be linearly independent. Suppose, for instance, that $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$ is a set of nonzero orthogonal vectors and that one of the vectors is a linear combination of the others, say,

$$\mathbf{w}_3 = c_1\mathbf{w}_1 + c_2\mathbf{w}_2.$$
We therefore know that

$$\mathbf{w}_3\cdot\mathbf{w}_3 = \mathbf{w}_3\cdot(c_1\mathbf{w}_1 + c_2\mathbf{w}_2) = c_1\,\mathbf{w}_3\cdot\mathbf{w}_1 + c_2\,\mathbf{w}_3\cdot\mathbf{w}_2 = 0,$$

which cannot happen since we know that $\mathbf{w}_3$ is nonzero. This tells us that an orthogonal set of nonzero vectors is linearly independent.
If the vectors in an orthogonal set have dimension $m$, they form a linearly independent set in $\mathbb{R}^m$ and are therefore a basis for the subspace $\operatorname{Span}\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$. If there are $m$ vectors in the orthogonal set, they form a basis for $\mathbb{R}^m$.
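As a quick check of this fact, the following sketch stacks the illustrative orthogonal vectors from above as the columns of a matrix and confirms that the rank equals the number of vectors, which is exactly linear independence.

```python
import numpy as np

# The same illustrative orthogonal vectors as above, stacked as columns.
A = np.column_stack([
    [1.0,  1.0,  1.0,  1.0],
    [1.0, -1.0,  1.0, -1.0],
    [1.0,  1.0, -1.0, -1.0],
])

# Full column rank confirms that an orthogonal set of nonzero
# vectors is linearly independent.
print(np.linalg.matrix_rank(A))  # 3
```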
This activity introduces an important way of modifying an orthogonal set so that the vectors in the set have unit length. Recall that we may multiply any nonzero vector $\mathbf{v}$ by a scalar so that the new vector has length 1. For instance, we know that if $s$ is a positive scalar, then $\|s\mathbf{v}\| = s\|\mathbf{v}\|$. To obtain a vector having unit length, we want $s\|\mathbf{v}\| = 1$, which means that $s = \frac{1}{\|\mathbf{v}\|}$ and that the new unit vector is $\mathbf{u} = \frac{1}{\|\mathbf{v}\|}\,\mathbf{v}$.
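A one-line computation carries this out; the vector $\mathbf{v}$ below is an arbitrary illustrative choice.

```python
import numpy as np

# Scale a nonzero vector v by s = 1/|v| to obtain a unit vector u = v/|v|.
v = np.array([2.0, -1.0, 2.0])   # |v| = 3 (an arbitrary illustrative vector)
u = v / np.linalg.norm(v)
print(u)                         # approximately [ 0.667 -0.333  0.667]
print(np.linalg.norm(u))         # 1.0
```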
We now turn to an important problem that will appear in many forms in the rest of our explorations. Suppose, as shown in Figure 6.3.9, that we have a subspace $W$ of $\mathbb{R}^m$ and a vector $\mathbf{b}$ that is not in that subspace. We would like to find the vector $\widehat{\mathbf{b}}$ in $W$ that is closest to $\mathbf{b}$, meaning the distance between $\mathbf{b}$ and $\widehat{\mathbf{b}}$ is as small as possible.
To get started, let’s consider a simpler problem where we have a line in $\mathbb{R}^2$, defined by the vector $\mathbf{w}$, and another vector $\mathbf{b}$ that is not on the line, as shown on the left of Figure 6.3.10. We wish to find $\widehat{\mathbf{b}}$, the vector on the line that is closest to $\mathbf{b}$, as illustrated on the right of Figure 6.3.10.
To find $\widehat{\mathbf{b}}$, we require that $\mathbf{b}-\widehat{\mathbf{b}}$ be orthogonal to the line. For instance, if $\mathbf{z}$ is another vector on the line, as shown in Figure 6.3.11, then the Pythagorean theorem implies that

$$\|\mathbf{b}-\mathbf{z}\|^2 = \|\mathbf{b}-\widehat{\mathbf{b}}\|^2 + \|\widehat{\mathbf{b}}-\mathbf{z}\|^2,$$
which means that $\|\mathbf{b}-\mathbf{z}\| \geq \|\mathbf{b}-\widehat{\mathbf{b}}\|$. Therefore, $\widehat{\mathbf{b}}$ is closer to $\mathbf{b}$ than any other vector on the line.
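The orthogonality condition $(\mathbf{b}-\widehat{\mathbf{b}})\cdot\mathbf{w} = 0$ leads to an explicit expression for the projection onto a line, $\widehat{\mathbf{b}} = \frac{\mathbf{b}\cdot\mathbf{w}}{\mathbf{w}\cdot\mathbf{w}}\,\mathbf{w}$, which the following sketch demonstrates with illustrative vectors.

```python
import numpy as np

# Project b onto the line defined by w: the orthogonality condition
# (b - bhat) . w = 0 gives bhat = (b.w / w.w) w.
w = np.array([2.0, 1.0])
b = np.array([1.0, 3.0])

bhat = (np.dot(b, w) / np.dot(w, w)) * w
print(bhat)                 # [2. 1.]: the closest vector to b on the line
print(np.dot(b - bhat, w))  # 0.0: b - bhat is orthogonal to the line
```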
Given a vector $\mathbf{b}$ in $\mathbb{R}^m$ and a subspace $W$ of $\mathbb{R}^m$, the orthogonal projection of $\mathbf{b}$ onto $W$ is the vector $\widehat{\mathbf{b}}$ in $W$ that is closest to $\mathbf{b}$. It is characterized by the property that $\mathbf{b}-\widehat{\mathbf{b}}$ is orthogonal to $W$.
The same ideas apply more generally. Suppose we have an orthogonal set of vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ that define a plane $W$ in $\mathbb{R}^3$. If $\mathbf{b}$ is another vector in $\mathbb{R}^3$, we seek the vector $\widehat{\mathbf{b}}$ on the plane closest to $\mathbf{b}$. As before, the vector $\mathbf{b}-\widehat{\mathbf{b}}$ will be orthogonal to $W$, as illustrated in Figure 6.3.14.
Suppose that $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ is an orthonormal basis for $W$; that is, the vectors are orthogonal to one another and have unit length. Explain why the orthogonal projection of $\mathbf{b}$ onto $W$ is

$$\widehat{\mathbf{b}} = (\mathbf{b}\cdot\mathbf{u}_1)\,\mathbf{u}_1 + (\mathbf{b}\cdot\mathbf{u}_2)\,\mathbf{u}_2 + \cdots + (\mathbf{b}\cdot\mathbf{u}_n)\,\mathbf{u}_n.$$
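Here is a numerical illustration of this formula, using an assumed orthonormal basis $\mathbf{u}_1$, $\mathbf{u}_2$ for a plane $W$ in $\mathbb{R}^3$ and an arbitrary vector $\mathbf{b}$.

```python
import numpy as np

# An assumed orthonormal basis u1, u2 for a plane W in R^3.
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])
b  = np.array([2.0, -3.0, 5.0])

# The projection formula: bhat = (b.u1) u1 + (b.u2) u2.
bhat = np.dot(b, u1) * u1 + np.dot(b, u2) * u2
print(bhat)                                        # [-0.5 -0.5  5. ]
print(np.dot(b - bhat, u1), np.dot(b - bhat, u2))  # 0.0 0.0: b - bhat is orthogonal to W
```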
In all the cases considered in the activity, we are looking for $\widehat{\mathbf{b}}$, the vector in a subspace $W$ closest to a vector $\mathbf{b}$, which is found by requiring that $\mathbf{b}-\widehat{\mathbf{b}}$ be orthogonal to $W$. This means that $(\mathbf{b}-\widehat{\mathbf{b}})\cdot\mathbf{w} = 0$ for any vector $\mathbf{w}$ in $W$.
If $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ is an orthonormal basis for a subspace $W$ of $\mathbb{R}^m$, then the matrix transformation that projects vectors in $\mathbb{R}^m$ orthogonally onto $W$ is represented by the matrix $QQ^T$, where

$$Q = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_n \end{bmatrix}.$$
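The following sketch builds this matrix from the same assumed orthonormal basis used above and checks two things: the product $QQ^T\mathbf{b}$ reproduces the projection formula, and projecting a second time changes nothing.

```python
import numpy as np

# Q has the orthonormal basis vectors as its columns; P = Q Q^T is then
# the matrix that projects vectors orthogonally onto W = Col(Q).
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])
Q  = np.column_stack([u1, u2])   # a 3x2 matrix

P = Q @ Q.T
b = np.array([2.0, -3.0, 5.0])
print(P @ b)                     # [-0.5 -0.5  5. ], matching the formula above
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```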
Let’s check that this works by considering the vector $\mathbf{b}$ and finding $\widehat{\mathbf{b}}$, its orthogonal projection onto the plane $W$. In terms of the original basis $\mathbf{w}_1$ and $\mathbf{w}_2$, the projection formula from Proposition 6.3.15 tells us that

$$\widehat{\mathbf{b}} = \frac{\mathbf{b}\cdot\mathbf{w}_1}{\mathbf{w}_1\cdot\mathbf{w}_1}\,\mathbf{w}_1 + \frac{\mathbf{b}\cdot\mathbf{w}_2}{\mathbf{w}_2\cdot\mathbf{w}_2}\,\mathbf{w}_2.$$
Find an orthonormal basis $\mathbf{u}_1$ and $\mathbf{u}_2$ for $W$ and use it to construct the matrix $P$ that projects vectors orthogonally onto $W$. Check that $P\mathbf{b} = \widehat{\mathbf{b}}$, the orthogonal projection you found in the previous part of this activity.
This activity demonstrates one issue of note. We found $\widehat{\mathbf{b}}$, the orthogonal projection of $\mathbf{b}$ onto $W$, by requiring that $\mathbf{b}-\widehat{\mathbf{b}}$ be orthogonal to $W$. In other words, $\mathbf{b}-\widehat{\mathbf{b}}$ is a vector in the orthogonal complement of $W$, which we may denote $W^\perp$. This explains the following proposition, which is illustrated in Figure 6.3.19.
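Numerically, the decomposition looks like this; as before, the basis and the vector $\mathbf{b}$ are illustrative choices.

```python
import numpy as np

# Decompose b = bhat + (b - bhat): bhat lies in W, while b - bhat lies in
# the orthogonal complement of W, being orthogonal to each column of Q.
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])
Q  = np.column_stack([u1, u2])
b  = np.array([2.0, -3.0, 5.0])

bhat = Q @ Q.T @ b
print(Q.T @ (b - bhat))   # [0. 0.]: b - bhat is in the orthogonal complement of W
```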
Because $Q^TQ = I$, there is a temptation to say that $Q$ is invertible. This is usually not the case, however. Remember that an invertible matrix must be square, and the matrix $Q$ will only be square if $n = m$. In this case, there are $m$ vectors in the orthonormal set, so the subspace $W$ spanned by the vectors is all of $\mathbb{R}^m$. If $\mathbf{b}$ is a vector in $\mathbb{R}^m$, then $QQ^T\mathbf{b}$ is the orthogonal projection of $\mathbf{b}$ onto $\mathbb{R}^m$. In other words, $QQ^T\mathbf{b}$ is the closest vector in $\mathbb{R}^m$ to $\mathbf{b}$, and this closest vector must be $\mathbf{b}$ itself. Therefore, $QQ^T\mathbf{b} = \mathbf{b}$ for every $\mathbf{b}$, which means that $QQ^T = I$. In this case, $Q$ is an invertible matrix.
In this case, $\mathbf{u}_1$ and $\mathbf{u}_2$ span a plane, a 2-dimensional subspace $W$ of $\mathbb{R}^3$. We know that $Q^TQ = I$ and that $QQ^T$ projects vectors orthogonally onto the plane. However, $Q$ is not a square matrix, so it cannot be invertible.
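A quick computation with an illustrative $3\times 2$ matrix $Q$ shows both facts at once.

```python
import numpy as np

# A 3x2 matrix Q with orthonormal columns: Q^T Q = I, yet Q is not square,
# so it has no inverse, and Q Q^T is a projection rather than the identity.
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0])
Q  = np.column_stack([u1, u2])

print(np.allclose(Q.T @ Q, np.eye(2)))  # True
print(np.allclose(Q @ Q.T, np.eye(3)))  # False: Q Q^T only projects onto the plane
```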
Moreover, since $Q^TQ = I = QQ^T$, we see that $Q^{-1} = Q^T$, so finding the inverse of $Q$ is as simple as writing its transpose. Matrices with this property are very special and will play an important role in our upcoming work. We will therefore give them a special name.
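For instance, a $2\times 2$ rotation matrix has orthonormal columns and is square, and a short computation confirms that its inverse is its transpose; the angle below is an arbitrary choice.

```python
import numpy as np

# A rotation matrix is a familiar example: its columns are orthonormal,
# it is square, and its inverse is simply its transpose.
theta = np.pi / 3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))     # True
print(np.allclose(Q @ Q.T, np.eye(2)))     # True
print(np.allclose(np.linalg.inv(Q), Q.T))  # True: the inverse is the transpose
```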
This terminology can be a little confusing. We call a basis orthogonal if the basis vectors are orthogonal to one another. However, a matrix is orthogonal if the columns are orthogonal to one another and have unit length. It pays to keep this in mind when reading statements about orthogonal bases and orthogonal matrices. In the meantime, we record the following proposition.