Subsection 5.1.1 Partial pivoting
The first issue that we address is the fact that computers do not perform arithmetic operations exactly. For instance, Python will evaluate
0.1 + 0.2
and report
0.30000000000000004
even though we know that the true value is 0.3. There are a couple of reasons for this.
First, computers perform arithmetic using base 2 numbers, which means that numbers we enter in decimal form, such as 0.1, must be converted to base 2. Even though 0.1 has a simple decimal form, its representation in base 2 is the repeating expansion
0.000110011001100...
To represent this number exactly inside a computer would require infinitely many digits. Since a computer can only hold a finite number of digits, we are necessarily using an approximation just by representing this number in a computer.
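We can check this in Python itself; the fractions module reveals the exact value of the stored number.

```python
from fractions import Fraction

print(0.1 + 0.2)        # 0.30000000000000004
# The double-precision number closest to 0.1 is not exactly 1/10:
print(Fraction(0.1))    # 3602879701896397/36028797018963968
```

The second line shows that the number the computer actually stores for 0.1 is a nearby fraction whose denominator is a power of 2.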
In addition, arithmetic operations, such as addition, are prone to error. To keep things simple, suppose we have a computer that represents numbers using only three decimal digits, that is, three significant digits. For instance, the number 1.023 would be represented as
1.02
while 0.023421 would be
0.0234
If we add these numbers exactly, we have 1.023 + 0.023421 = 1.046421; the computer, however, reports the sum as
1.02 + 0.0234 = 1.04
whose last digit is not correctly rounded, since the true sum rounds to 1.05. Generally speaking, we will see this problem, which is called round off error, whenever we add numbers of significantly different magnitudes.
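Python's decimal module can mimic this hypothetical three-digit computer; here is a minimal sketch.

```python
from decimal import Decimal, getcontext

getcontext().prec = 3        # keep only three significant digits

a = +Decimal("1.023")        # the unary + rounds: stored as 1.02
b = +Decimal("0.023421")     # stored as 0.0234
print(a + b)                 # 1.04, although the true sum 1.046421 rounds to 1.05
```

Every arithmetic operation rounds its result to three digits, which is exactly the behavior the activity below asks us to imagine.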
Remember that Gaussian elimination, when applied to an n x n matrix, requires approximately 2n^3/3 operations. If we have a 1000 x 1000 matrix, performing Gaussian elimination requires roughly two-thirds of a billion operations, and the errors introduced in each operation could accumulate. How can we have confidence in the final result? We can never completely avoid these errors, but we can take steps to mitigate them. The next activity will introduce one such technique.
Activity 5.1.2.
Suppose we have a hypothetical computer that represents numbers using only three decimal digits. We will consider the linear system
0.0001x + y = 1
x + y = 2
Show that this system has the unique solution
x = 10000/9999, y = 9998/9999.
If we represent this solution inside our computer that only holds 3 decimal digits, what do we find for the solution? This is the best that we can hope to find using our computer.
Let's imagine that we use our computer to find the solution using Gaussian elimination; that is, after every arithmetic operation, we keep only three decimal digits. Our first step is to multiply the first equation by 10000 and subtract it from the second equation. If we represent numbers using only three decimal digits, what does this give for the value of y?
By substituting our value for y into the first equation, what do we find for x?
Compare the solution we find on our computer with the actual solution and assess the quality of the approximation.
Let's now modify the linear system by simply interchanging the equations:
x + y = 2
0.0001x + y = 1
Of course, this doesn't change the actual solution. Let's imagine we use our computer to find the solution using Gaussian elimination. Perform the first step where we multiply the first equation by 0.0001 and subtract from the second equation. What does this give for y if we represent numbers using only three decimal digits?
Substitute the value you found for y into the first equation and solve for x. Then compare the approximate solution found with our hypothetical computer to the exact solution.
Which approach produces the most accurate approximation?
This activity demonstrates how the practical aspects of computing differ from the theoretical. We know that the order in which we write the equations has no effect on the solution space; row interchange is one of our three allowed row operations in the Gaussian elimination algorithm. However, when we are only able to perform arithmetic operations approximately, applying row interchanges can dramatically improve the accuracy of our approximations.
If we could compute the solution exactly, we would find
x = 10000/9999 = 1.000100..., y = 9998/9999 = 0.999899...
Since our hypothetical computer represents numbers using only three decimal digits, our computer finds
x = 1.00, y = 1.00.
This is the best we can hope to do with our computer since it is impossible to represent the solution exactly.
When the equations are written in their original order and we multiply the first equation by 10000 and subtract from the second, we find
-10000y = -10000
so that y = 1.00. In fact, we find the same value for y when we interchange the equations. Here we multiply the first equation by 0.0001 and subtract from the second equation. We then find
1.00y = 1.00
so that y = 1.00 once again.
The difference occurs when we substitute y = 1.00 into the first equation. When the equations are written in their original order, the first equation is 0.0001x + y = 1, and substituting gives x = 0. We therefore find the solution
x = 0, y = 1.00.
When we write the equations in the opposite order, however, substituting y = 1.00 into the first equation x + y = 2 gives
x = 1.00.
In this case, we find the approximate solution
x = 1.00, y = 1.00,
which is the most accurate solution that our hypothetical computer can find. Simply interchanging the order of the equations produces a much more accurate solution.
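The effect of the tiny pivot can be simulated directly. Here is a minimal sketch of three-digit Gaussian elimination, assuming the activity's system is 0.0001x + y = 1 and x + y = 2 (reconstructed from the multipliers 10000 and 0.0001 used above).

```python
from decimal import Decimal, getcontext

getcontext().prec = 3   # every arithmetic result is rounded to 3 digits

def solve2(a11, a12, b1, a21, a22, b2):
    """Eliminate x from the second equation, then back substitute."""
    m = a21 / a11                         # the multiplier of the first row
    y = (b2 - m * b1) / (a22 - m * a12)   # reduced second equation
    x = (b1 - a12 * y) / a11              # substitute back into the first
    return x, y

D = Decimal
# Original order: the tiny pivot 0.0001 gives the poor answer (0, 1).
print(solve2(D("0.0001"), D(1), D(1), D(1), D(1), D(2)))
# Interchanged order: the pivot 1 gives the accurate answer (1, 1).
print(solve2(D(1), D(1), D(2), D("0.0001"), D(1), D(1)))
```

The only difference between the two calls is the order of the equations, yet the computed value of x changes from 0 to 1.00.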
We can understand why this works graphically. Each equation represents a line in the plane, and the solution is the intersection point. Notice that the slopes of these lines differ considerably.
When the equations are written in their original order, we substitute y = 1.00 into the equation
0.0001x + y = 1,
which describes a nearly horizontal line. Along this line, a small change in y leads to a large change in x. The slight difference of our approximation y = 1.00 from the exact value 9998/9999 therefore leads to a large difference of the approximation x = 0 from the exact value 10000/9999.
If we exchange the order in which the equations are written, we substitute our approximation y = 1.00 into the equation
x + y = 2.
Notice that the slope of the associated line is -1. On this line, a small change in y leads to a relatively small change in x as well. Therefore, the difference of our approximation from the exact value leads to only a small difference in the approximation of x from its exact value.
This example motivates the technique that computers usually use to perform Gaussian elimination. In theory, we only need to perform a row interchange when a zero occurs in a pivot position. In practice, however, we perform a row interchange to put the entry having the largest possible absolute value into the pivot position. For instance, if the entry of largest absolute value in the first column appears in the third row, we begin by interchanging the first and third rows so that the upper left entry has the largest possible absolute value.
This technique is called partial pivoting, and it means that, in practice, we will perform many more row interchange operations than we typically do when computing exactly by hand.
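As a sketch of how partial pivoting chooses rows, here is a minimal forward elimination in Python. NumPy is an assumption of this sketch; it is not used elsewhere in this section.

```python
import numpy as np

def forward_eliminate(A):
    """Reduce A to an upper triangular matrix, interchanging rows so that
    each pivot has the largest possible absolute value (partial pivoting)."""
    A = A.astype(float).copy()
    n = A.shape[0]
    for j in range(n - 1):
        p = j + np.argmax(np.abs(A[j:, j]))  # row with the largest |entry|
        A[[j, p]] = A[[p, j]]                # row interchange
        for i in range(j + 1, n):
            A[i] -= (A[i, j] / A[j, j]) * A[j]
    return A
```

For instance, forward_eliminate(np.array([[0.0001, 1.0], [1.0, 1.0]])) interchanges the two rows before eliminating, just as in the activity above.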
Subsection 5.1.2 LU factorizations
In Subsection 1.3.3, we saw that the number of arithmetic operations needed to perform Gaussian elimination on an n x n matrix is about 2n^3/3. This means that a 1000 x 1000 matrix requires about two-thirds of a billion operations.
Suppose that we have two equations, Ax = b1 and Ax = b2, that we would like to solve. Usually, we would form the augmented matrices [ A | b1 ] and [ A | b2 ] and apply Gaussian elimination. Of course, the steps we perform in these two computations are nearly identical. Is there a way to store some of the computation we perform in reducing the first augmented matrix and reuse it in solving subsequent equations? The next activity will point us in the right direction.
Activity 5.1.3.
We will consider the matrix A and begin performing Gaussian elimination without using partial pivoting.
Perform two row replacement operations to find a row equivalent matrix.
Find the elementary matrices E1 and E2 that perform these two operations, so that E2E1A is the matrix you found.
Perform a third row replacement to find an upper triangular matrix U.
Find the elementary matrix E3 such that E3E2E1A = U.
We can then write A = (E3E2E1)^-1 U. Find the inverse matrices E1^-1, E2^-1, and E3^-1, and the product L = E1^-1 E2^-1 E3^-1. Then verify that A = LU.
Suppose that we want to solve the equation Ax = b. We will write A = LU and introduce an unknown vector c such that Ux = c. Find c by noting that Lc = L(Ux) = Ax = b and solving this equation.
Now that we have found c, find x by solving Ux = c.
Using the factorization A = LU and this two-step process, solve the equation Ax = b.
This activity introduces a method for factoring a matrix A as a product of two triangular matrices, A = LU, where L is lower triangular and U is upper triangular. The key to finding this factorization is to represent the row operations that we apply in the Gaussian elimination algorithm through multiplication by elementary matrices.
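The bookkeeping in the activity can be sketched in code: U is the end result of the row replacements, and L collects the multipliers, which is the same as the product of the inverse elementary matrices. Here is a minimal NumPy sketch without pivoting (NumPy is an assumption of the sketch).

```python
import numpy as np

def lu_no_pivot(A):
    """Factor A = LU by Gaussian elimination without row interchanges.
    U is the reduced matrix; L records the multiplier of each row replacement."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]   # multiplier used to clear U[i, j]
            U[i] -= L[i, j] * U[j]
    return L, U
```

Multiplying the two factors back together recovers A, which is the verification step asked for in the activity.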
Example 5.1.1.
Suppose we have a linear system, which we write in the form Ax = b. We begin by applying the Gaussian elimination algorithm to find an LU factorization of A. The first step is to multiply the first row of A by a suitable multiplier and add it to the second row; an elementary matrix performs this operation. Continuing with similar row replacement operations, we obtain the upper triangular matrix U.
Writing the product of these elementary matrices applied to A as U tells us that A = LU, where L is the product of the inverses of the elementary matrices. Notice that the matrix L is lower triangular, a result of the fact that these elementary matrices and their inverses are lower triangular.
Now that we have factored A into two triangular matrices, we can solve the equation Ax = b by solving two triangular systems. We write LUx = b and define the unknown vector c = Ux, which is determined by the equation Lc = b. Because L is lower triangular, we find c using forward substitution. Finally, we find x, the solution to our original system, by applying back substitution to solve Ux = c.
If we want to solve Ax = b for a different right-hand side b, we can simply repeat this two-step process.
An LU factorization allows us to trade in one equation, Ax = b, for two simpler equations, Lc = b and Ux = c.
For instance, the equation Lc = b in our example is lower triangular. Because of this form, we can read off the first component of c directly from the first equation; substituting it into the second equation then determines the second component, and each remaining equation determines the next component in turn. Solving a triangular system is simplified because we only need to perform a sequence of substitutions.
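These substitutions are straightforward to write out; here is a minimal NumPy sketch (NumPy is an assumption of the sketch, not part of the text).

```python
import numpy as np

def forward_substitute(L, b):
    """Solve Lc = b when L is lower triangular, top row first."""
    c = np.zeros(len(b))
    for i in range(len(b)):
        c[i] = (b[i] - L[i, :i] @ c[:i]) / L[i, i]
    return c

def back_substitute(U, c):
    """Solve Ux = c when U is upper triangular, bottom row first."""
    x = np.zeros(len(c))
    for i in reversed(range(len(c))):
        x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x
```

Applying forward_substitute to L and b and then back_substitute to U and the result carries out the two-step process described above.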
In fact, solving an equation with an n x n triangular matrix requires approximately n^2 operations. Once we have the factorization A = LU, we solve the equation Ax = b by solving two equations involving triangular matrices, which requires about 2n^2 operations. For example, if A is a 1000 x 1000 matrix, we solve the equation using about two million steps. This compares with the roughly two-thirds of a billion operations needed to perform Gaussian elimination, which represents a significant savings. Of course, we have to first find the LU factorization of A, and this requires roughly the same amount of work as performing Gaussian elimination. However, once we have the LU factorization, we can use it to solve Ax = b for many different right-hand sides b.
Our discussion so far has ignored one issue, however. Remember that we sometimes have to perform row interchange operations in addition to row replacement. A typical row interchange is represented by multiplication by a permutation matrix such as
0 0 1
0 1 0
1 0 0
which has the effect of interchanging the first and third rows. Notice that this matrix is not triangular, so performing a row interchange will disrupt the structure of the LU factorization we seek. Without giving the details, we simply note that linear algebra software packages provide a matrix P that describes how the rows are permuted in the Gaussian elimination process. In particular, we will write PA = LU, where P is a permutation matrix, L is lower triangular, and U is upper triangular.
Therefore, to solve the equation Ax = b, we first multiply both sides by P to obtain PAx = Pb, that is, LUx = Pb. In other words, we multiply b by P and then find x using the factorization: we solve Lc = Pb by forward substitution and then Ux = c by back substitution.
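Outside of Sage, SciPy exposes the same idea: factor once with partial pivoting, then reuse the factorization for each right-hand side. A small sketch, where the matrix entries are the (assumed) coefficients of the earlier activity:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[0.0001, 1.0],   # assumed coefficients from the earlier activity
              [1.0,    1.0]])
lu, piv = lu_factor(A)         # factor once, with partial pivoting
x1 = lu_solve((lu, piv), np.array([1.0, 2.0]))   # first right-hand side
x2 = lu_solve((lu, piv), np.array([0.0, 1.0]))   # reuse for a second one
```

The expensive factorization happens a single time in lu_factor; each call to lu_solve only performs the two cheap triangular solves.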
Activity 5.1.4.
Sage will create LU factorizations; once we have a matrix A, we write
P, L, U = A.LU()
to obtain the matrices P, L, and U such that A = PLU. Since P is a permutation matrix, this is equivalent to the factorization described above.
Using Sage, define the matrix A and then ask Sage for the LU factorization. What are the matrices P, L, and U?
Notice that Sage finds a different factorization than the one we found in the previous example because Sage uses partial pivoting, as described in the previous section, when it performs Gaussian elimination.
Define the vector b in Sage and compute P^-1 b.
Use the matrices L and U to solve Lc = P^-1 b and Ux = c. You should find the same solution that we found in the previous example.
Use the factorization to solve the equation Ax = b for a different right-hand side b.
How does the factorization show us that A is invertible and that, therefore, every equation Ax = b has a unique solution?
Suppose that we have the matrix A. Use Sage to find its LU factorization. Explain how this factorization shows that A is not invertible.
Consider the matrix A and find its LU factorization. Explain why A and U have the same null space, and use this observation to find a basis for Nul(A).