This section is a rather rapid tour of some cool ideas that get a lot of use in applied linear algebra. We are rather light on details here. The interested reader can consult sections 8.3–8.6 in the Nicholson textbook.
Let $T:V\to V$ be a linear operator defined by a matrix $A$. If $A$ is symmetric (for the case of $V=\mathbb{R}^n$) or hermitian (for the case of $V=\mathbb{C}^n$), we say that the operator $T$ is self-adjoint.
A self-adjoint operator $T$ is positive if $\langle T(\mathbf{x}),\mathbf{x}\rangle \geq 0$ for all vectors $\mathbf{x}$. It is positive-definite if $\langle T(\mathbf{x}),\mathbf{x}\rangle > 0$ for all nonzero $\mathbf{x}$. If $T(\mathbf{x}) = A\mathbf{x}$ for some matrix $A$, we also refer to $A$ as a positive(-definite) matrix.
The definition of a positive matrix is equivalent to requiring that all its eigenvalues are non-negative. Every positive matrix $A$ has a unique positive square root: a matrix $R$ such that $R^2 = A$. Since $A$ is symmetric/hermitian, it can be diagonalized. Writing $A = PDP^{-1}$, where $P$ is orthogonal/unitary and $D = \operatorname{diag}(\lambda_1,\ldots,\lambda_n)$ with each $\lambda_i \geq 0$, the square root is given by $R = P\sqrt{D}\,P^{-1}$, where $\sqrt{D} = \operatorname{diag}(\sqrt{\lambda_1},\ldots,\sqrt{\lambda_n})$.
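Here is a quick sketch of that construction in SymPy, using a small symmetric matrix chosen just for illustration. (Note that diagonalize() does not normalize the columns of $P$, but the formula below still produces the positive square root.)

```python
from sympy import Matrix, sqrt, simplify

A = Matrix([[5, 4], [4, 5]])          # symmetric, with eigenvalues 9 and 1
P, D = A.diagonalize()                # A == P * D * P**(-1)
R = P * D.applyfunc(sqrt) * P.inv()   # take square roots of the diagonal entries
simplify(R**2 - A)                    # the zero matrix: R is a square root of A
```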
For any invertible matrix $U$, the matrix $U^TU$ is positive-definite. What is interesting is that the converse to this statement is also true. The Cholesky factorization of a positive-definite matrix $A$ is given by $A = U^TU$, where $U$ is upper-triangular, with positive diagonal entries.
Even better is that there is a very simple algorithm for obtaining the factorization: carry the matrix $A$ to upper triangular form, using only row operations of the type $R_i + kR_j \to R_i$ (adding a multiple of one row to a lower row). Then divide each row by the square root of its diagonal entry to obtain the matrix $U$.
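Here is one way that recipe might be carried out in SymPy. This is only a sketch (no pivoting or error checking), and it assumes the input matrix really is positive-definite:

```python
from sympy import Matrix, sqrt

def cholesky_by_row_reduction(A):
    """Return an upper triangular U with U.T * U == A, following the
    row-reduction recipe above (assumes A is positive-definite)."""
    U = A.as_mutable()
    n = U.rows
    # Carry A to upper triangular form, adding multiples of rows to lower rows.
    for j in range(n):
        for i in range(j + 1, n):
            U[i, :] = U[i, :] - (U[i, j] / U[j, j]) * U[j, :]
    # Divide each row by the square root of its diagonal entry.
    for i in range(n):
        U[i, :] = U[i, :] / sqrt(U[i, i])
    return U

U = cholesky_by_row_reduction(Matrix([[4, 2], [2, 2]]))
U, U.T * U
```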
The SymPy library contains the cholesky() algorithm. Note however that it produces a lower triangular matrix, rather than upper triangular. (That is, the output gives $L = U^T$ rather than $U$, so you will have $A = LL^T$.) Let’s give it a try. First, let’s create a positive-definite matrix.
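For instance, we might do something like the following. (The matrix B is an arbitrary choice; any invertible B will make B.T*B positive-definite.)

```python
from sympy import Matrix

B = Matrix([[1, 2, 0], [0, 1, 3], [1, 0, 1]])   # invertible
A = B.T * B                                     # positive-definite
L = A.cholesky()                                # lower triangular, with A == L * L.T
L, L * L.T == A
```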
The SymPy library has a function for computing the singular values of a matrix. Given a matrix A, the command A.singular_values() will return its singular values. Try this for a few different matrices below:
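For example (with a matrix chosen at random; feel free to change the entries):

```python
from sympy import Matrix

A = Matrix([[1, 0, 1], [0, 1, 1]])
A.singular_values()
```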
For an $n\times n$ matrix $A$, we might not be able to diagonalize $A$ (with a single orthonormal basis). However, it turns out that it’s always possible to find a pair of orthonormal bases $\{\mathbf{q}_1,\ldots,\mathbf{q}_n\}$ and $\{\mathbf{f}_1,\ldots,\mathbf{f}_n\}$ such that $A\mathbf{q}_i = \sigma_i\mathbf{f}_i$ for each $i$.
In fact, this can be done even if $A$ is not square, which is arguably the more interesting case! Let $A$ be an $m\times n$ matrix. We will find an orthogonal $m\times m$ matrix $P$ and an orthogonal $n\times n$ matrix $Q$, such that $A = P\Sigma_A Q^T$, where $\Sigma_A$ is also $m\times n$.
Next, we compute the vectors $\mathbf{f}_i = \frac{1}{\sigma_i}A\mathbf{q}_i$, for $i = 1,\ldots,r$, where $\mathbf{q}_1,\ldots,\mathbf{q}_n$ are the orthonormal eigenvectors of $A^TA$ (the columns of $Q$) and $r$ is the number of nonzero singular values. As shown in Nicholson, $\{\mathbf{f}_1,\ldots,\mathbf{f}_r\}$ will be an orthonormal basis for the column space of $A$. The matrix $P$ is constructed by extending this to an orthonormal basis of $\mathbb{R}^m$.
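If you want to see the construction as code, here is a rough sketch (our own, not the routine SymPy uses), assuming a real matrix with exact entries. It follows the steps above directly and makes no attempt at efficiency:

```python
from sympy import Matrix, sqrt, GramSchmidt

def svd_by_hand(A):
    """Return P, Sigma, Q with A == P * Sigma * Q.T, built as described above."""
    m, n = A.shape
    # Orthonormal eigenvectors of A^T A, sorted by decreasing eigenvalue.
    pairs = []
    for val, mult, vects in (A.T * A).eigenvects():
        for q in GramSchmidt(vects, True):
            pairs.append((val, q))
    pairs.sort(key=lambda p: p[0], reverse=True)
    Q = Matrix.hstack(*[q for _, q in pairs])

    # The nonzero singular values go on the diagonal of the m x n matrix Sigma.
    sigmas = [sqrt(val) for val, _ in pairs if val != 0]
    r = len(sigmas)
    Sigma = Matrix.zeros(m, n)
    for i, s in enumerate(sigmas):
        Sigma[i, i] = s

    # f_i = (1/sigma_i) A q_i gives an orthonormal basis of col(A); extend it to
    # an orthonormal basis of R^m using the orthogonal complement null(A^T).
    fs = [A * pairs[i][1] / sigmas[i] for i in range(r)]
    extra = GramSchmidt(A.T.nullspace(), True) if r < m else []
    P = Matrix.hstack(*(fs + extra))
    return P, Sigma, Q
```

You can check the output against $A$ by computing P*Sigma*Q.T, and compare the diagonal of Sigma with A.singular_values().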
All of this is a lot of work to do by hand, but it turns out that it can be done numerically, and more importantly, efficiently, by a computer. The SymPy library has an SVD algorithm, but it will not be efficient for larger matrices. In practice, most Python users will use the SVD algorithm provided by NumPy; we will stick with SymPy for simplicity and consistency.
The version of the SVD given above is not used in computations, since it tends to be more resource intensive. In particular, it requires us to store more information than necessary: the last $n-r$ rows of $Q^T$, and the last $m-r$ columns of $P$, get multiplied by columns/rows of zeros in $\Sigma_A$, so we don’t really need to keep track of them.
Instead, most algorithms that you find will give the $r\times r$ diagonal matrix $\Sigma_r$, consisting of the nonzero singular values, and $P$ will be replaced by the $m\times r$ matrix consisting of its first $r$ columns, while $Q^T$ gets replaced by the $r\times n$ matrix consisting of its first $r$ rows. The resulting product is still equal to the original matrix $A$.
In some cases, even the matrix $\Sigma_r$ is too large, and a decision is made to truncate it to some smaller subset of the singular values. In this case, the resulting product is no longer equal to the original matrix, but it does provide an approximation. A discussion can be found on Wikipedia.
Note that the values are not listed in decreasing order. Now, let’s ask for the singular value decomposition. The output consists of three matrices; the first line below assigns those matrices to the names P,S,Q.
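The cell might look like this, assuming A is the matrix defined above (not reproduced in this excerpt) and a version of SymPy recent enough to include singular_value_decomposition():

```python
P, S, Q = A.singular_value_decomposition()
P, S, Q
```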
Note that the output is the “condensed” version, which doesn’t match the exposition above. It also doesn’t follow the same ordering convention: we’ll need to swap columns in each of the matrices. But it does give us a decomposition of the matrix $A$:
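We can verify this by multiplying the pieces back together. (SymPy returns the third matrix with the right singular vectors as its columns, so the product to form is P*S*Q.H; for a real matrix, Q.H is just Q.T.)

```python
P * S * Q.T    # should reproduce the original matrix A
```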
To match our earlier presentation, we first set $\Sigma_A$ equal to the matrix of the same size as $A$ whose diagonal entries are the singular values, in decreasing order. Next, we need to extend the matrix $P$ in the output above to a square orthogonal matrix. We can do this by choosing any vector orthogonal to the two existing columns, and normalizing. Noting that we also need to swap the first two columns (to match the fact that we reordered the singular values), we get the matrix $P$ in the decomposition $A = P\Sigma_A Q^T$.
The Singular Value Decomposition has a lot of useful applications, some of which are described in Nicholson’s book. On a very fundamental level, the SVD provides us with information on some of the most essential properties of the matrix $A$, and of any system of equations with $A$ as its coefficient matrix.
The rank of $A$ is the number of leading ones in the RREF of $A$, which is also equal to the dimension of the column space of $A$ (or if you prefer, the dimension of $\operatorname{im}(T_A)$).
The column space of $A$, denoted $\operatorname{col}(A)$, is the subspace of $\mathbb{R}^m$ spanned by the columns of $A$. (This is the image of the matrix transformation $T_A$; it is also the space of all vectors $\mathbf{b}$ for which the system $A\mathbf{x}=\mathbf{b}$ is consistent.)
Here’s the cool thing about the SVD. Let $\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_r>0$ be the positive singular values of $A$. Let $\mathbf{q}_1,\ldots,\mathbf{q}_n$ be the orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A^TA$, and let $\mathbf{f}_1,\ldots,\mathbf{f}_m$ be the orthonormal basis of $\mathbb{R}^m$ constructed in the SVD algorithm. Then: the rank of $A$ is $r$, the number of positive singular values; the vectors $\mathbf{f}_1,\ldots,\mathbf{f}_r$ form an orthonormal basis for $\operatorname{col}(A)$; and the vectors $\mathbf{q}_{r+1},\ldots,\mathbf{q}_n$ form an orthonormal basis for the null space of $A$.
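For example (again with an arbitrarily chosen matrix), the number of nonzero singular values matches the rank and the dimension of the column space:

```python
from sympy import Matrix

A = Matrix([[1, 1, 0], [0, 1, 1]])
A.singular_values(), A.rank(), len(A.columnspace())
```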
If you want to explore this further, have a look at the excellent notebook by Dr. Juan H Klopper. The ipynb file can be found on his GitHub page. In it, he takes you through various approaches to finding the singular value decomposition, using the method above, as well as using NumPy and SciPy (which, for industrial applications, are superior to SymPy).
A $QR$-factorization of $A$ is a factorization of the form $A = QR$, where $Q$ is $m\times n$, with orthonormal columns, and $R$ is an invertible upper-triangular ($n\times n$) matrix with positive diagonal entries. If $A$ is a square matrix, $Q$ will be orthogonal.
A lot of the methods we’re looking at here involve more sophisticated numerical techniques than SymPy is designed to handle. If we wanted to spend time on these topics, we’d have to learn a bit about the NumPy package, which has built-in tools for finding things like the polar decomposition and the singular value decomposition. However, SymPy does know how to do $QR$ factorization. After defining a matrix A, we can use the command Q, R = A.QRdecomposition().
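For example (the matrix is an arbitrary choice; any matrix with independent columns will do):

```python
from sympy import Matrix

A = Matrix([[1, 1], [1, 0], [0, 1]])
Q, R = A.QRdecomposition()
Q, R, Q * R == A
```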
Details of how to perform the QR factorization can be found in Nicholson’s textbook. It’s essentially a consequence of performing the Gram-Schmidt algorithm on the columns of $A$, and keeping track of our work.
The calculation above is a symbolic computation, which is nice for understanding what’s going on. The reason why the $QR$ factorization is useful in practice is that there are efficient numerical methods for doing it (with good control over rounding errors). Our next topic looks at a useful application of the $QR$ factorization.
Our first method focuses on the dominant eigenvalue of a matrix. An eigenvalue $\lambda_1$ of $A$ is dominant if it is larger in absolute value than all other eigenvalues. For example, if $A$ has eigenvalues $1$, $3$, and $-4$, then $-4$ is the dominant eigenvalue, since $|-4| > |3| > |1|$.
If a matrix $A$ has a dominant eigenvalue, there is a method for finding it (approximately) that does not involve finding and factoring the characteristic polynomial of $A$.
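The idea (the power method) is to pick an initial guess $\mathbf{x}_0$ and multiply by $A$ over and over: the vectors $A^k\mathbf{x}_0$ line up with the dominant eigenvector, and the ratio of corresponding entries of $A\mathbf{x}$ and $\mathbf{x}$ approaches the dominant eigenvalue. Here is a minimal sketch, with a matrix and initial guess chosen purely for illustration:

```python
from sympy import Matrix

A = Matrix([[5, 2], [2, 1]])   # dominant eigenvalue is 3 + 2*sqrt(2), about 5.83
x = Matrix([1, 1])             # initial guess x_0

for k in range(10):
    x = A * x                  # x is now A^10 * x_0

ratio = (A * x)[0] / x[0]      # compare the first entries of A*x and x
ratio, float(ratio)
```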
We might want to confirm whether that rather large fraction is close to the dominant eigenvalue. To do so, we can get the computer to divide the numerator by the denominator.
The above might show you the fraction rather than its decimal approximation. (This may depend on whether you’re in Sage or Jupyter.) To get the decimal, try wrapping the above in float() (or N(), or appending .evalf()).
So why does this work? Suppose $A$ is diagonalizable, so that there is a basis $\{\mathbf{v}_1,\ldots,\mathbf{v}_n\}$ of eigenvectors for $\mathbb{R}^n$, with corresponding eigenvalues $\lambda_1,\ldots,\lambda_n$. Suppose also that $A$ has dominant eigenvalue $\lambda_1$, with corresponding eigenvector $\mathbf{v}_1$. Then our initial guess $\mathbf{x}_0$ can be written in terms of this basis: $\mathbf{x}_0 = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$. Applying $A$ repeatedly gives $A^k\mathbf{x}_0 = c_1\lambda_1^k\mathbf{v}_1 + \cdots + c_n\lambda_n^k\mathbf{v}_n = \lambda_1^k\bigl(c_1\mathbf{v}_1 + c_2(\lambda_2/\lambda_1)^k\mathbf{v}_2 + \cdots + c_n(\lambda_n/\lambda_1)^k\mathbf{v}_n\bigr)$. Since $|\lambda_j/\lambda_1| < 1$ for $j\geq 2$, every term other than $c_1\mathbf{v}_1$ shrinks as $k$ grows, so $A^k\mathbf{x}_0$ points more and more in the direction of the dominant eigenvector $\mathbf{v}_1$.
The result does not depend on the initial guess $\mathbf{x}_0$, unless we choose an $\mathbf{x}_0$ for which the coefficient $c_1$ is 0. If this turns out to be the case, we must choose another initial guess.
The $QR$ algorithm starts with $A_1 = A$ and, at each step, computes the $QR$ factorization $A_k = Q_kR_k$ and sets $A_{k+1} = R_kQ_k$. After $k$ steps we have $A_{k+1} = R_kQ_k = Q_k^TA_kQ_k$, which still has the same eigenvalues as $A$. By some sort of dark magic, this sequence of matrices converges to an upper triangular matrix with the eigenvalues of $A$ on the diagonal!
Do this a few more times, and see what results! (If someone can come up with a way to code this as a loop, let me know! One possible approach is sketched below.) The diagonal entries should get closer to the eigenvalues of $A$, and the entries below the diagonal should get closer to $0$.
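Here is one way it might be coded as a loop, using the same sample matrix as in the power method sketch above (swap in the matrix from the example). We convert to floating point first, since exact radicals pile up quickly under repeated $QR$ factorizations:

```python
from sympy import Matrix

A = Matrix([[5, 2], [2, 1]])
Ak = A.evalf()                 # work numerically to keep the entries readable

for k in range(10):
    Q, R = Ak.QRdecomposition()
    Ak = R * Q                 # same eigenvalues as A at every step

Ak, A.eigenvals()              # compare the diagonal with the exact eigenvalues
```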