Subsection 3.1.1 Invertible matrices
The preview activity began with a familiar type of equation, \(3x = 5\text{,}\) and asked for a strategy to solve it. One possible response is to divide both sides by 3. Instead, let’s rephrase this as multiplying by \(3^{-1} = \frac13\text{,}\) the multiplicative inverse of 3.
Now that we are interested in solving equations of the form \(A\xvec = \bvec\text{,}\) we might try to find a similar approach. Is there a matrix \(A^{-1}\) that plays the role of the multiplicative inverse of \(A\text{?}\) Of course, the real number \(0\) does not have a multiplicative inverse so we probably shouldn’t expect every matrix to have a multiplicative inverse. We will see, however, that many do.
Definition 3.1.1.
An \(n\times n\) matrix \(A\) is called invertible if there is a matrix \(B\) such that \(AB = I_n\text{,}\) where \(I_n\) is the \(n\times n\) identity matrix. The matrix \(B\) is called the inverse of \(A\) and denoted \(A^{-1}\text{.}\)
Notice that we define invertibility only for matrices that have the same number of rows and columns, in which case we say that the matrix is square.
Example 3.1.2.
Suppose that \(A\) is the matrix that rotates two-dimensional vectors counterclockwise by \(90^\circ\) and that \(B\) rotates vectors by \(-90^\circ\text{.}\) We have
\begin{equation*}
A=\left[\begin{array}{rr}
0 \amp -1 \\
1 \amp 0 \\
\end{array}\right],~~~
B=\left[\begin{array}{rr}
0 \amp 1 \\
-1 \amp 0 \\
\end{array}\right].
\end{equation*}
We can check that
\begin{equation*}
AB = \begin{bmatrix}
0 \amp -1 \\
1 \amp 0 \\
\end{bmatrix}
\begin{bmatrix}
0 \amp 1 \\
-1 \amp 0 \\
\end{bmatrix}
= \begin{bmatrix}
1 \amp 0 \\
0 \amp 1 \\
\end{bmatrix}
= I
\end{equation*}
which shows that \(A\) is invertible and that \(A^{-1}=B\text{.}\)
Notice that if we multiply the matrices in the opposite order, we find that \(BA=I\text{,}\) which says that \(B\) is also invertible and that \(B^{-1} = A\text{.}\) In other words, \(A\) and \(B\) are inverses of each other.
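Checks like this are easy to automate. The sketch below, written in Python with numpy (our choice here; the text itself does not prescribe a language), verifies both products for the rotation matrices above.
\begin{verbatim}
import numpy as np

# A rotates vectors counterclockwise by 90 degrees; B rotates by -90 degrees.
A = np.array([[0, -1],
              [1,  0]])
B = np.array([[0,  1],
              [-1, 0]])

# Both products should be the 2x2 identity matrix.
print(A @ B)    # [[1 0], [0 1]]
print(B @ A)    # [[1 0], [0 1]]
\end{verbatim}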
Activity 3.1.2.
This activity demonstrates a procedure for finding the inverse of a matrix \(A\text{.}\)
Suppose that \(A = \begin{bmatrix}
3 \amp -2 \\
1 \amp -1 \\
\end{bmatrix}
\text{.}\) To find an inverse \(B\text{,}\) we write its columns as \(B = \begin{bmatrix}\bvec_1 \amp \bvec_2
\end{bmatrix}\) and require that
\begin{equation*}
\begin{aligned}
AB \amp = I \\
\begin{bmatrix}
A\bvec_1 \amp A\bvec_2
\end{bmatrix}
\amp =
\begin{bmatrix}
1 \amp 0 \\
0 \amp 1 \\
\end{bmatrix}.
\end{aligned}
\end{equation*}
In other words, we can find the columns of \(B\) by solving the equations
\begin{equation*}
A\bvec_1 = \twovec10,~~~
A\bvec_2 = \twovec01.
\end{equation*}
Solve these equations to find
\(\bvec_1\) and
\(\bvec_2\text{.}\) Then write the matrix
\(B\) and verify that
\(AB=I\text{.}\) This is enough for us to conclude that
\(B\) is the inverse of
\(A\text{.}\)
Find the product
\(BA\) and explain why we now know that
\(B\) is invertible and
\(B^{-1}=A\text{.}\)
What happens when you try to find the inverse of \(C = \begin{bmatrix}
-2 \amp 1 \\
4 \amp -2 \\
\end{bmatrix}\text{?}\)
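If you would like to check your work, here is a minimal sketch, assuming Python with numpy, that solves the two column equations for \(A\) and shows how the same approach breaks down for \(C\text{.}\)
\begin{verbatim}
import numpy as np

A = np.array([[3, -2],
              [1, -1]])

# The columns of B solve A b1 = (1,0) and A b2 = (0,1).
b1 = np.linalg.solve(A, [1, 0])
b2 = np.linalg.solve(A, [0, 1])
B = np.column_stack([b1, b2])
print(A @ B)    # the 2x2 identity matrix

C = np.array([[-2, 1],
              [4, -2]])
# C is singular, so the corresponding equations are inconsistent and
# the next line raises numpy.linalg.LinAlgError.
# np.linalg.solve(C, [1, 0])
\end{verbatim}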
We now develop a condition that must be satisfied by an invertible matrix. Suppose that \(A\) is an invertible \(n\times n\) matrix with inverse \(B\) and suppose that \(\bvec\) is any \(n\)-dimensional vector. Since \(AB=I\text{,}\) we have
\begin{equation*}
A(B\bvec) = (AB)\bvec = I\bvec = \bvec.
\end{equation*}
This says that the equation \(A\xvec = \bvec\) is consistent and that \(\xvec=B\bvec\) is a solution.
Since we know that \(A\xvec = \bvec\) is consistent for any vector \(\bvec\text{,}\) what does this say about the span of the columns of \(A\text{?}\)
Since \(A\) is a square matrix, what does this say about the pivot positions of \(A\text{?}\) What is the reduced row echelon form of \(A\text{?}\)
In this activity, we have studied the matrices
\begin{equation*}
A = \begin{bmatrix}
3 \amp -2 \\
1 \amp -1 \\
\end{bmatrix},~~~
C = \begin{bmatrix}
-2 \amp 1 \\
4 \amp -2 \\
\end{bmatrix}.
\end{equation*}
Find the reduced row echelon form of each and explain how those forms enable us to conclude that one matrix is invertible and the other is not.
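A computer algebra system can carry out this row reduction for us; the sketch below assumes Python with sympy, whose rref method returns the reduced row echelon form together with the pivot columns.
\begin{verbatim}
from sympy import Matrix

A = Matrix([[3, -2],
            [1, -1]])
C = Matrix([[-2, 1],
            [4, -2]])

print(A.rref())  # (Matrix([[1, 0], [0, 1]]), (0, 1)): A ~ I, so A is invertible
print(C.rref())  # (Matrix([[1, -1/2], [0, 0]]), (0,)): a row with no pivot
\end{verbatim}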
Example 3.1.3.
We can reformulate this procedure for finding the inverse of a matrix. For the sake of convenience, suppose that \(A\) is a \(2\times2\) invertible matrix with inverse \(B=\begin{bmatrix} \bvec_1 \amp \bvec_2 \end{bmatrix}\text{.}\) Rather than solving the equations
\begin{equation*}
A\bvec_1 = \twovec10,~~~
A\bvec_2 = \twovec01
\end{equation*}
separately, we can solve them at the same time by augmenting \(A\) by both vectors \(\twovec10\) and \(\twovec01\) and finding the reduced row echelon form.
For example, if \(A =
\begin{bmatrix}
1 \amp 2 \\
1 \amp 1 \\
\end{bmatrix}\text{,}\) we form
\begin{equation*}
\left[
\begin{array}{rr|rr}
1 \amp 2 \amp 1 \amp 0 \\
1 \amp 1 \amp 0 \amp 1 \\
\end{array}
\right]
\sim
\left[
\begin{array}{rr|rr}
1 \amp 0 \amp -1 \amp 2 \\
0 \amp 1 \amp 1 \amp -1 \\
\end{array}
\right].
\end{equation*}
This shows that the matrix \(B =
\begin{bmatrix}
-1 \amp 2 \\
1 \amp -1 \\
\end{bmatrix}\) is the inverse of \(A\text{.}\)
In other words, beginning with \(A\text{,}\) we augment by the identity and find the reduced row echelon form to determine \(A^{-1}\text{:}\)
\begin{equation*}
\left[
\begin{array}{r|r}
A \amp I \\
\end{array}
\right]
\sim
\left[
\begin{array}{r|r}
I \amp A^{-1} \\
\end{array}
\right].
\end{equation*}
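This augmented computation translates directly into code. The following sketch, again assuming sympy, row reduces \(\left[\begin{array}{r|r} A \amp I \end{array}\right]\) and reads \(A^{-1}\) off the right half of the result.
\begin{verbatim}
from sympy import Matrix, eye

A = Matrix([[1, 2],
            [1, 1]])

# Augment A by the identity and row reduce: [A | I] ~ [I | A^{-1}].
reduced, pivots = A.row_join(eye(2)).rref()
A_inv = reduced[:, 2:]   # the right half of the reduced matrix
print(A_inv)             # Matrix([[-1, 2], [1, -1]])
print(A * A_inv)         # the 2x2 identity
\end{verbatim}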
In fact, this reformulation will always work. Suppose that \(A\) is an invertible \(n\times n\) matrix with inverse \(B\text{.}\) Suppose furthermore that \(\bvec\) is any \(n\)-dimensional vector and consider the equation \(A\xvec = \bvec\text{.}\) We know that \(\xvec=B\bvec\) is a solution because \(A(B\bvec) = (AB)\bvec = I\bvec = \bvec\text{.}\)
Proposition 3.1.4.
If \(A\) is an invertible matrix with inverse \(B\text{,}\) then any equation \(A\xvec = \bvec\) is consistent and \(\xvec = B\bvec\) is a solution. In other words, the solution to \(A\xvec = \bvec\) is \(\xvec =
A^{-1}\bvec\text{.}\)
Notice that this is similar to saying that the solution to \(3x=5\) is \(x = \frac13\cdot 5\text{,}\) as we saw in the preview activity.
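In code, the proposition corresponds to solving the equation, either by forming \(A^{-1}\) explicitly or, preferably, by calling a solver directly; a short numpy sketch:
\begin{verbatim}
import numpy as np

A = np.array([[1, 2],
              [1, 1]])
b = np.array([3, 5])

# x = A^{-1} b, computed two ways.  In practice, solve is preferred to
# computing the inverse, both for speed and for numerical accuracy.
x1 = np.linalg.inv(A) @ b
x2 = np.linalg.solve(A, b)
print(x1, x2)    # both print [ 7. -2.]
print(A @ x2)    # recovers b = [3. 5.]
\end{verbatim}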
Now since \(A\xvec=\bvec\) is consistent for every vector \(\bvec\text{,}\) the columns of \(A\) must span \(\real^n\) so there is a pivot position in every row. Since \(A\) is also square, this means that the reduced row echelon form of \(A\) is the identity matrix.
Proposition 3.1.5.
The matrix \(A\) is invertible if and only if the reduced row echelon form of \(A\) is the identity matrix: \(A\sim I\text{.}\) In addition, we can find the inverse by augmenting \(A\) by the identity and finding the reduced row echelon form:
\begin{equation*}
\left[
\begin{array}{r|r}
A \amp I \\
\end{array}
\right]
\sim
\left[
\begin{array}{r|r}
I \amp A^{-1} \\
\end{array}
\right].
\end{equation*}
You may have noticed that
Proposition 3.1.4 says that
the solution to the equation
\(A\xvec = \bvec\) is
\(\xvec =
A^{-1}\bvec\text{.}\) Indeed, we know that this equation has a unique solution because
\(A\) has a pivot position in every column.
It is important to remember that the product of two matrices depends on the order in which they are multiplied. That is, if
\(C\) and
\(D\) are matrices, then it sometimes happens that
\(CD \neq DC\text{.}\) However, something fortunate happens when we consider invertibility. It turns out that if
\(A\) is an
\(n\times n\) matrix and that
\(AB=I\text{,}\) then it is also true that
\(BA=I\text{.}\) We have verified this in a few examples so far, and
Exercise 3.1.5.12 explains why it always happens. This leads to the following proposition.
Proposition 3.1.6.
If \(A\) is an \(n\times n\) invertible matrix with inverse \(B\text{,}\) then \(BA=I\text{,}\) which tells us that \(B\) is invertible with inverse \(A\text{.}\) In other words,
\begin{equation*}
(A^{-1})^{-1} = A.
\end{equation*}
Subsection 3.1.3 Triangular matrices and Gaussian elimination
With some of the ideas we’ve developed, we can recast the Gaussian elimination algorithm in terms of matrix multiplication and invertibility. This will be especially helpful later when we consider
determinants and
LU factorizations. Triangular matrices will play an important role.
Definition 3.1.8.
We say that a matrix \(A\) is lower triangular if all its entries above the diagonal are zero. Similarly, \(A\) is upper triangular if all the entries below the diagonal are zero.
For example, the matrix \(L\) below is a lower triangular matrix while \(U\) is an upper triangular one.
\begin{equation*}
L = \left[\begin{array}{rrrr}
* \amp 0 \amp 0 \amp 0 \\
* \amp * \amp 0 \amp 0 \\
* \amp * \amp * \amp 0 \\
* \amp * \amp * \amp * \\
\end{array}\right],
\hspace{24pt}
U =
\left[\begin{array}{rrrr}
* \amp * \amp * \amp * \\
0 \amp * \amp * \amp * \\
0 \amp 0 \amp * \amp * \\
0 \amp 0 \amp 0 \amp * \\
\end{array}\right]\text{.}
\end{equation*}
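When experimenting in code, numpy’s tril and triu functions produce these patterns by zeroing out the complementary entries; a quick sketch, again assuming numpy:
\begin{verbatim}
import numpy as np

M = np.arange(1, 17).reshape(4, 4)   # any 4x4 matrix

L = np.tril(M)   # keeps the entries on and below the diagonal
U = np.triu(M)   # keeps the entries on and above the diagonal
print(L)
print(U)
\end{verbatim}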
We can develop a simple test to determine whether an \(n\times n\) lower triangular matrix is invertible. Let’s use Gaussian elimination to find the reduced row echelon form of the lower triangular matrix
\begin{equation*}
\begin{aligned}
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
2 \amp -2 \amp 0 \\
-3 \amp 4 \amp -4 \\
\end{array}\right]
\amp {}\sim{}
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp -2 \amp 0 \\
0 \amp 4 \amp -4 \\
\end{array}\right]
\\
\amp {}\sim{}
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp -2 \amp 0 \\
0 \amp 0 \amp -4 \\
\end{array}\right]
\sim
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{array}\right]\text{.}
\end{aligned}
\end{equation*}
Because the entries on the diagonal are nonzero, we find a pivot position in every row, which tells us that the matrix is invertible.
If, however, there is a zero entry on the diagonal, the matrix cannot be invertible. Considering the matrix below, we see that having a zero on the diagonal leads to a row without a pivot position.
\begin{equation*}
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
2 \amp 0 \amp 0 \\
-3 \amp 4 \amp -4 \\
\end{array}\right]
\sim
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp 0 \amp 0 \\
0 \amp 4 \amp -4 \\
\end{array}\right]
\sim
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp -1 \\
0 \amp 0 \amp 0 \\
\end{array}\right]\text{.}
\end{equation*}
Proposition 3.1.9.
An \(n\times n\) triangular matrix is invertible if and only if the entries on the diagonal are all nonzero.
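Because the diagonal tells the whole story, the invertibility test for a triangular matrix amounts to one line of code; here is a sketch, assuming numpy.
\begin{verbatim}
import numpy as np

def triangular_is_invertible(T):
    # By Proposition 3.1.9, a triangular matrix is invertible exactly
    # when every entry on its diagonal is nonzero.
    return bool(np.all(np.diag(T) != 0))

T1 = np.array([[1, 0, 0], [2, -2, 0], [-3, 4, -4]])
T2 = np.array([[1, 0, 0], [2, 0, 0], [-3, 4, -4]])
print(triangular_is_invertible(T1))   # True
print(triangular_is_invertible(T2))   # False: zero on the diagonal
\end{verbatim}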
Activity 3.1.4. Gaussian elimination and matrix multiplication.
This activity explores how the row operations of scaling, interchange, and replacement can be performed using matrix multiplication.
As an example, we consider the matrix
\begin{equation*}
A = \left[\begin{array}{rrr}
1 \amp 2 \amp 1 \\
2 \amp 0 \amp -2 \\
-1 \amp 2 \amp -1 \\
\end{array}\right]
\end{equation*}
and apply a replacement operation that multiplies the first row by \(-2\) and adds it to the second row. Rather than performing this operation in the usual way, we construct a new matrix by applying the desired replacement operation to the identity matrix. To illustrate, we begin with the identity matrix
\begin{equation*}
I = \begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{bmatrix}
\end{equation*}
and form a new matrix by multiplying the first row by \(-2\) and adding it to the second row to obtain
\begin{equation*}
R = \begin{bmatrix}
1 \amp 0 \amp 0 \\
-2 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{bmatrix}.
\end{equation*}
Show that the product
\(RA\) is the result of applying the replacement operation to
\(A\text{.}\)
Explain why \(R\) is invertible and find its inverse \(R^{-1}\text{.}\)
Describe the relationship between \(R\) and \(R^{-1}\) and use the connection to replacement operations to explain why it holds.
Other row operations can be performed using a similar procedure. For instance, suppose we want to scale the second row of
\(A\) by
\(4\text{.}\) Find a matrix
\(S\) so that
\(SA\) is the same as that obtained from the scaling operation. Why is
\(S\) invertible and what is
\(S^{-1}\text{?}\)
Finally, suppose we want to interchange the first and third rows of \(A\text{.}\) Find a matrix \(P\text{,}\) usually called a permutation matrix, that performs this operation. What is \(P^{-1}\text{?}\)
Performing three replacement operations on the original matrix \(A\) shows that it is row equivalent to the upper triangular matrix \(U\text{:}\)
\begin{equation*}
A = \left[\begin{array}{rrr}
1 \amp 2 \amp 1 \\
2 \amp 0 \amp -2 \\
-1 \amp 2 \amp -1 \\
\end{array}\right]
\sim
\left[\begin{array}{rrr}
1 \amp 2 \amp 1 \\
0 \amp -4 \amp -4 \\
0 \amp 0 \amp -4 \\
\end{array}\right] = U.
\end{equation*}
Find the matrices \(L_1\text{,}\) \(L_2\text{,}\) and \(L_3\) that perform these row replacement operations so that \(L_3L_2L_1 A = U\text{.}\)
Explain why the matrix product
\(L_3L_2L_1\) is invertible and use this fact to write
\(A = LU\text{.}\) What is the matrix
\(L\) that you find? Why do you think we denote it by
\(L\text{?}\)
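To check your answers, the sketch below, assuming numpy, builds the replacement matrix \(R\) by applying the row operation to the identity and verifies that multiplying by \(R\) performs the operation on \(A\text{;}\) the same pattern works for \(S\text{,}\) \(P\text{,}\) and the matrices \(L_1\text{,}\) \(L_2\text{,}\) and \(L_3\text{.}\)
\begin{verbatim}
import numpy as np

A = np.array([[1, 2, 1],
              [2, 0, -2],
              [-1, 2, -1]])

# Apply the replacement operation to the identity matrix to build R.
R = np.eye(3)
R[1] += -2 * R[0]        # multiply row 1 by -2 and add it to row 2

print(R @ A)             # the second row of A becomes [0, -4, -4]
print(np.linalg.inv(R))  # undoes the replacement: row 1 times +2 added to row 2
\end{verbatim}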
The following are examples of matrices, known as elementary matrices, that perform the row operations on a matrix having three rows.
- Replacement
Multiplying the second row by 3 and adding it to the third row is performed by
\begin{equation*}
L = \begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 3 \amp 1 \\
\end{bmatrix}.
\end{equation*}
We often use \(L\) to describe these matrices because they are lower triangular.
- Scaling
Multiplying the third row by 2 is performed by
\begin{equation*}
S = \begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 2 \\
\end{bmatrix}.
\end{equation*}
- Interchange
Interchanging the first two rows is performed by
\begin{equation*}
P = \begin{bmatrix}
0 \amp 1 \amp 0 \\
1 \amp 0 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{bmatrix}.
\end{equation*}
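To see each elementary matrix act, the following sketch, numpy assumed, applies the three matrices above to a sample matrix with three rows.
\begin{verbatim}
import numpy as np

M = np.array([[1, 1, 1],
              [2, 2, 2],
              [3, 3, 3]])

L = np.array([[1, 0, 0], [0, 1, 0], [0, 3, 1]])   # replacement
S = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 2]])   # scaling
P = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # interchange

print(L @ M)   # row 3 becomes 3*(row 2) + row 3 = [9, 9, 9]
print(S @ M)   # row 3 is doubled: [6, 6, 6]
print(P @ M)   # rows 1 and 2 are swapped
\end{verbatim}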
Example 3.1.10.
Suppose we have
\begin{equation*}
A = \left[\begin{array}{rrr}
1 \amp 3 \amp -2 \\
-3 \amp -6 \amp 3 \\
2 \amp 0 \amp -2 \\
\end{array}\right].
\end{equation*}
For the forward elimination phase of Gaussian elimination, we perform a sequence of three replacement operations. The first replacement operation multiplies the first row by \(3\) and adds the result to the second row. We can perform this operation by multiplying \(A\) by the lower triangular matrix \(L_1\) where
\begin{equation*}
L_1A =
\left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
3 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{array}\right]
\left[\begin{array}{rrr}
1 \amp 3 \amp -2 \\
-3 \amp -6 \amp 3 \\
2 \amp 0 \amp -2 \\
\end{array}\right]
=
\left[\begin{array}{rrr}
1 \amp 3 \amp -2 \\
0 \amp 3 \amp -3 \\
2 \amp 0 \amp -2 \\
\end{array}\right].
\end{equation*}
The next two replacement operations are performed by the matrices
\begin{equation*}
L_2 = \left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
-2 \amp 0 \amp 1 \\
\end{array}\right],
\hspace{24pt}
L_3 = \left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 2 \amp 1 \\
\end{array}\right]
\end{equation*}
so that
\begin{equation*}
L_3L_2L_1A = U = \begin{bmatrix}
1 \amp 3 \amp -2 \\
0 \amp 3 \amp -3 \\
0 \amp 0 \amp -4 \\
\end{bmatrix}.
\end{equation*}
Notice that the inverse of \(L_1\) has the simple form:
\begin{equation*}
L_1 = \left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
3 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{array}\right],
\hspace{24pt}
L_1^{-1} = \left[\begin{array}{rrr}
1 \amp 0 \amp 0 \\
-3 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
\end{array}\right]\text{.}
\end{equation*}
This says that if we want to undo the operation of multiplying the first row by \(3\) and adding to the second row, we should multiply the first row by \(-3\) and add it to the second row. That is the effect of \(L_1^{-1}\text{.}\)
Notice that we now have \(L_3L_2L_1A = U\text{,}\) which gives
\begin{equation*}
\begin{aligned}
(L_3L_2L_1)A \amp = U \\
(L_3L_2L_1)^{-1}(L_3L_2L_1)A \amp =
(L_3L_2L_1)^{-1}U \\
A \amp = (L_3L_2L_1)^{-1}U = LU\\
\end{aligned}
\end{equation*}
where \(L\) is the lower triangular matrix
\begin{equation*}
L = (L_3L_2L_1)^{-1}=\begin{bmatrix}
1 \amp 0 \amp 0 \\
-3 \amp 1 \amp 0 \\
2 \amp -2 \amp 1 \\
\end{bmatrix}.
\end{equation*}
This way of writing
\(A=LU\) as the product of a lower and an upper triangular matrix is known as an
\(LU\) factorization of
\(A\text{,}\) and its usefulness will be explored in
Section 5.1.
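The factorization in this example is easy to verify numerically; the sketch below, assuming numpy, recomputes \(U\) and \(L\) and checks that \(LU\) recovers \(A\text{.}\) Note that library routines such as scipy.linalg.lu perform partial pivoting, so the factors they return may differ from these hand-computed ones.
\begin{verbatim}
import numpy as np

A = np.array([[1, 3, -2],
              [-3, -6, 3],
              [2, 0, -2]])

# The three replacement matrices from Example 3.1.10.
L1 = np.array([[1, 0, 0], [3, 1, 0], [0, 0, 1]])
L2 = np.array([[1, 0, 0], [0, 1, 0], [-2, 0, 1]])
L3 = np.array([[1, 0, 0], [0, 1, 0], [0, 2, 1]])

U = L3 @ L2 @ L1 @ A
L = np.linalg.inv(L3 @ L2 @ L1)
print(U)        # [[1, 3, -2], [0, 3, -3], [0, 0, -4]]
print(L)        # [[1, 0, 0], [-3, 1, 0], [2, -2, 1]]
print(L @ U)    # recovers A
\end{verbatim}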