
Section 7.2 Quadratic forms

With our understanding of symmetric matrices and variance in hand, we’ll now explore how to determine the directions in which the variance of a dataset is as large as possible and where it is as small as possible. This is part of a much larger story involving a type of function, called a quadratic form, that we’ll introduce here.

Preview Activity 7.2.1.

Let’s begin by looking at an example. Suppose we have three data points that form the demeaned data matrix
$$A = \begin{bmatrix} 2 & 1 & -3 \\ 1 & 2 & -3 \end{bmatrix}.$$
  1. Plot the demeaned data points in Figure 7.2.1. In which direction does the variance appear to be largest and in which does it appear to be smallest?
    Figure 7.2.1. Use this coordinate grid to plot the demeaned data points.
  2. Construct the covariance matrix $C$ and determine the variance in the direction of $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and the variance in the direction of $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$.
  3. What is the total variance of this dataset?
  4. Generally speaking, if $C$ is the covariance matrix of a dataset and $\mathbf{u}$ is an eigenvector of $C$ having unit length and with associated eigenvalue $\lambda$, what is $V_{\mathbf{u}}$?

Subsection 7.2.1 Quadratic forms

Given a matrix $A$ of $N$ demeaned data points, the symmetric covariance matrix $C=\frac{1}{N}AA^T$ determines the variance in a particular direction
$$V_{\mathbf{u}} = \mathbf{u}\cdot(C\mathbf{u}),$$
where $\mathbf{u}$ is a unit vector defining the direction.
More generally, a symmetric $m\times m$ matrix $A$ defines a function $q:\mathbb{R}^m\to\mathbb{R}$ by
$$q(\mathbf{x}) = \mathbf{x}\cdot(A\mathbf{x}).$$
Notice that this expression is similar to the one we use to find the variance $V_{\mathbf{u}}$ in terms of the covariance matrix $C$. The only difference is that we allow $\mathbf{x}$ to be any vector rather than requiring it to be a unit vector.

Example 7.2.2.

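Suppose that $A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. If we write $\mathbf{x}=\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, then we have
$$\begin{aligned}
q\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)
&= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\cdot\left(\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) \\
&= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\cdot\begin{bmatrix} x_1+2x_2 \\ 2x_1+x_2 \end{bmatrix} \\
&= x_1^2+2x_1x_2+2x_1x_2+x_2^2 \\
&= x_1^2+4x_1x_2+x_2^2.
\end{aligned}$$
We may evaluate the quadratic form using some input vectors:
$$q\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)=1,\qquad q\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)=6,\qquad q\left(\begin{bmatrix} 2 \\ 4 \end{bmatrix}\right)=52.$$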
Notice that the value of the quadratic form is a scalar.
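The evaluations in this example can be reproduced with a few lines of code. The following is a minimal sketch in plain Python rather than the Sage cells used elsewhere in the text; the helper name `quadratic_form` is a hypothetical choice made for illustration.

```python
def quadratic_form(A, x):
    """Return q_A(x) = x . (A x) for a square matrix A given as a list of rows."""
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return sum(x_i * Ax_i for x_i, Ax_i in zip(x, Ax))

A = [[1, 2], [2, 1]]

# Evaluate the quadratic form on the three input vectors from the example.
values = [quadratic_form(A, x) for x in ([1, 0], [1, 1], [2, 4])]
print(values)  # [1, 6, 52]
```

Each result is a single number, which matches the observation that the value of a quadratic form is a scalar.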

Definition 7.2.3.

If $A$ is a symmetric $m\times m$ matrix, the quadratic form defined by $A$ is the function $q_A(\mathbf{x})=\mathbf{x}\cdot(A\mathbf{x})$.

Activity 7.2.2.

Let’s look at some more examples of quadratic forms.
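  1. Consider the symmetric matrix $D=\begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}$. Write the quadratic form $q_D(\mathbf{x})$ defined by $D$ in terms of the components of $\mathbf{x}=\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$. What is the value of $q_D\left(\begin{bmatrix} 2 \\ -4 \end{bmatrix}\right)$?
  2. Given the symmetric matrix $A=\begin{bmatrix} 2 & 5 \\ 5 & -3 \end{bmatrix}$, write the quadratic form $q_A(\mathbf{x})$ defined by $A$ and evaluate $q_A\left(\begin{bmatrix} 2 \\ -1 \end{bmatrix}\right)$.
  3. Suppose that $q\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)=3x_1^2-4x_1x_2+4x_2^2$. Find a symmetric matrix $A$ such that $q$ is the quadratic form defined by $A$.
  4. Suppose that $q$ is a quadratic form and that $q(\mathbf{x})=3$. What is $q(2\mathbf{x})$? $q(-\mathbf{x})$? $q(10\mathbf{x})$?
  5. Suppose that $A$ is a symmetric matrix and $q_A(\mathbf{x})$ is the quadratic form defined by $A$. Suppose that $\mathbf{x}$ is an eigenvector of $A$ with associated eigenvalue $-4$ and with length 7. What is $q_A(\mathbf{x})$?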
Linear algebra is principally about things that are linear. However, quadratic forms, as the name implies, have a distinctly non-linear character. First, if $A=\begin{bmatrix} a & b \\ b & c \end{bmatrix}$ is a symmetric matrix, then the associated quadratic form is
$$q_A\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)=ax_1^2+2bx_1x_2+cx_2^2.$$
Notice how the variables $x_1$ and $x_2$ are multiplied together, which tells us this isn’t a linear function.
This expression assumes an especially simple form when the matrix is diagonal. In particular, if $D=\begin{bmatrix} a & 0 \\ 0 & c \end{bmatrix}$, then $q_D\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)=ax_1^2+cx_2^2$. This is special because there is no cross-term involving $x_1x_2$.
Remember that matrix transformations have the property that $T(s\mathbf{x})=sT(\mathbf{x})$. Quadratic forms behave differently:
$$q_A(s\mathbf{x})=(s\mathbf{x})\cdot(A(s\mathbf{x}))=s^2\,\mathbf{x}\cdot(A\mathbf{x})=s^2q_A(\mathbf{x}).$$
For instance, when we multiply $\mathbf{x}$ by the scalar 2, then $q_A(2\mathbf{x})=4q_A(\mathbf{x})$. Also, notice that $q_A(-\mathbf{x})=q_A(\mathbf{x})$ since the scalar is squared.
Finally, evaluating a quadratic form on an eigenvector has a particularly simple form. Suppose that $\mathbf{x}$ is an eigenvector of $A$ with associated eigenvalue $\lambda$. We then have
$$q_A(\mathbf{x})=\mathbf{x}\cdot(A\mathbf{x})=\lambda\,\mathbf{x}\cdot\mathbf{x}=\lambda|\mathbf{x}|^2.$$
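Both properties can be checked numerically. The sketch below is plain Python with a hypothetical helper `quadratic_form`; it uses the matrix with rows $(1, 2)$ and $(2, 1)$ from Example 7.2.2, for which the vector with entries $1$ and $1$ is an eigenvector with eigenvalue $3$.

```python
def quadratic_form(A, x):
    """Return q_A(x) = x . (A x) for a square matrix A given as a list of rows."""
    Ax = [sum(a * x_j for a, x_j in zip(row, x)) for row in A]
    return sum(x_i * y_i for x_i, y_i in zip(x, Ax))

A = [[1, 2], [2, 1]]

# Scaling: q_A(s x) should equal s^2 q_A(x).
x = [2, 4]
base = quadratic_form(A, x)
scaled = {s: quadratic_form(A, [s * x_i for x_i in x]) for s in (2, -1, 10)}

# Eigenvector: A [1, 1] = 3 [1, 1], so q_A([1, 1]) should equal 3 |[1, 1]|^2.
eigvec, eigval = [1, 1], 3
on_eigvec = quadratic_form(A, eigvec)
length_squared = sum(c * c for c in eigvec)

print(base, scaled, on_eigvec, eigval * length_squared)
```

Scaling by $-1$ leaves the value unchanged, while scaling by $2$ and $10$ multiplies it by $4$ and $100$.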
Let’s now return to our motivating question: in which direction $\mathbf{u}$ is the variance $V_{\mathbf{u}}=\mathbf{u}\cdot(C\mathbf{u})$ of a dataset as large as possible and in which is it as small as possible? Remembering that the vector $\mathbf{u}$ is a unit vector, we can now state a more general form of this question: If $q_A(\mathbf{x})$ is a quadratic form, for which unit vectors $\mathbf{u}$ is $q_A(\mathbf{u})=\mathbf{u}\cdot(A\mathbf{u})$ as large as possible and for which is it as small as possible? Since a unit vector specifies a direction, we will often ask for the directions in which the quadratic form $q(\mathbf{x})$ is at its maximum or minimum value.

Activity 7.2.3.

We can gain some intuition about this problem by graphing the quadratic form and paying particular attention to the unit vectors.
  1. Evaluating the following cell defines the matrix $D=\begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}$ and displays the graph of the associated quadratic form $q_D(\mathbf{x})$. In addition, the points corresponding to vectors $\mathbf{u}$ with unit length are displayed as a curve.
    Notice that the matrix $D$ is diagonal. In which directions does the quadratic form have its maximum and minimum values?
  2. Write the quadratic form $q_D$ associated to $D$. What is the value of $q_D\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)$? What is the value of $q_D\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)$?
  3. Consider a unit vector $\mathbf{u}=\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ so that $u_1^2+u_2^2=1$, an expression we can rewrite as $u_1^2=1-u_2^2$. Write the quadratic form $q_D(\mathbf{u})$ and replace $u_1^2$ by $1-u_2^2$. Now explain why the maximum of $q_D(\mathbf{u})$ is 3. In which direction does the maximum occur? Does this agree with what you observed from the graph above?
  4. Write the quadratic form $q_D(\mathbf{u})$ and replace $u_2^2$ by $1-u_1^2$. What is the minimum value of $q_D(\mathbf{u})$ and in which direction does the minimum occur?
  5. Use the previous Sage cell to change the matrix to $A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ and display the graph of the quadratic form $q_A(\mathbf{x})=\mathbf{x}\cdot(A\mathbf{x})$. Determine the directions in which the maximum and minimum occur.
  6. Remember that $A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ is symmetric so that $A=QDQ^T$ where $D$ is the diagonal matrix above and $Q$ is the orthogonal matrix that rotates vectors by $45^\circ$. Notice that
    $$q_A(\mathbf{u})=\mathbf{u}\cdot(A\mathbf{u})=\mathbf{u}\cdot(QDQ^T\mathbf{u})=(Q^T\mathbf{u})\cdot(DQ^T\mathbf{u})=q_D(\mathbf{v})$$
    where $\mathbf{v}=Q^T\mathbf{u}$. That is, we have $q_A(\mathbf{u})=q_D(\mathbf{v})$.
    Explain why $\mathbf{v}=Q^T\mathbf{u}$ is also a unit vector; that is, explain why
    $$|\mathbf{v}|^2=|Q^T\mathbf{u}|^2=(Q^T\mathbf{u})\cdot(Q^T\mathbf{u})=1.$$
  7. Using the fact that $q_A(\mathbf{u})=q_D(\mathbf{v})$, explain how we now know the maximum value of $q_A(\mathbf{u})$ is 3 and determine the direction in which it occurs. Also, determine the minimum value of $q_A(\mathbf{u})$ and determine the direction in which it occurs.
This activity demonstrates how the eigenvalues of $A$ determine the maximum and minimum values of the quadratic form $q_A(\mathbf{u})$ when evaluated on unit vectors and how the associated eigenvectors determine the directions in which the maximum and minimum values occur. Let’s look at another example so that this connection is clear.
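As a rough numerical stand-in for the graph in the activity, one can sample unit vectors $\mathbf{u}=(\cos t,\sin t)$ and record where the quadratic form is largest and smallest. This is a sketch in plain Python, not the Sage cell from the activity; the function names and the number of samples are assumptions chosen for illustration.

```python
import math

def quadratic_form(A, x):
    """Return q_A(x) = x . (A x) for a square matrix A given as a list of rows."""
    Ax = [sum(a * x_j for a, x_j in zip(row, x)) for row in A]
    return sum(x_i * y_i for x_i, y_i in zip(x, Ax))

def extremes_on_unit_circle(A, samples=3600):
    """Sample u = (cos t, sin t) and return ((min value, t), (max value, t))."""
    best_min = best_max = None
    for k in range(samples):
        t = 2 * math.pi * k / samples
        value = quadratic_form(A, [math.cos(t), math.sin(t)])
        if best_min is None or value < best_min[0]:
            best_min = (value, t)
        if best_max is None or value > best_max[0]:
            best_max = (value, t)
    return best_min, best_max

# Diagonal matrix D and the symmetric matrix A from the activity.
(min_D, _), (max_D, _) = extremes_on_unit_circle([[3, 0], [0, -1]])
(min_A, t_min), (max_A, t_max) = extremes_on_unit_circle([[1, 2], [2, 1]])

print(max_D, min_D)
print(max_A, math.degrees(t_max), min_A, math.degrees(t_min))
```

The sampled extremes for both matrices are close to $3$ and $-1$. For $A$ they occur along the lines through the eigenvectors, which are rotated by $45^\circ$ from the coordinate axes.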

Example 7.2.4.

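Consider the symmetric matrix $A=\begin{bmatrix} -7 & -6 \\ -6 & 2 \end{bmatrix}$. Because $A$ is symmetric, we know that it can be orthogonally diagonalized. In fact, we have $A=QDQ^T$ where
$$D=\begin{bmatrix} 5 & 0 \\ 0 & -10 \end{bmatrix},\qquad Q=\begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix}.$$
From this diagonalization, we know that $\lambda_1=5$ is the largest eigenvalue of $A$ with associated eigenvector $\mathbf{u}_1=\begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}$ and that $\lambda_2=-10$ is the smallest eigenvalue with associated eigenvector $\mathbf{u}_2=\begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}$.
Let’s first study the quadratic form $q_D(\mathbf{u})=5u_1^2-10u_2^2$ because the absence of the cross-term makes it comparatively simple. Remembering that $\mathbf{u}$ is a unit vector, we have $u_1^2+u_2^2=1$, which means that $u_1^2=1-u_2^2$. Therefore,
$$q_D(\mathbf{u})=5u_1^2-10u_2^2=5(1-u_2^2)-10u_2^2=5-15u_2^2.$$
This tells us that $q_D(\mathbf{u})$ has a maximum value of 5, which occurs when $u_2=0$ or in the direction $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$.
In the same way, rewriting $u_2^2=1-u_1^2$ allows us to conclude that the minimum value of $q_D(\mathbf{u})$ is $-10$, which occurs in the direction $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
Let’s now return to the matrix $A$ whose quadratic form $q_A$ is related to $q_D$ because $A=QDQ^T$. In particular, we have
$$q_A(\mathbf{u})=\mathbf{u}\cdot(A\mathbf{u})=\mathbf{u}\cdot(QDQ^T\mathbf{u})=(Q^T\mathbf{u})\cdot(DQ^T\mathbf{u})=\mathbf{v}\cdot(D\mathbf{v})=q_D(\mathbf{v}).$$
In other words, we have $q_A(\mathbf{u})=q_D(\mathbf{v})$ where $\mathbf{v}=Q^T\mathbf{u}$. This is quite useful because it allows us to relate the values of $q_A$ to those of $q_D$, which we already understand quite well.
Now it turns out that $\mathbf{v}$ is also a unit vector because
$$|\mathbf{v}|^2=\mathbf{v}\cdot\mathbf{v}=(Q^T\mathbf{u})\cdot(Q^T\mathbf{u})=\mathbf{u}\cdot(QQ^T\mathbf{u})=\mathbf{u}\cdot\mathbf{u}=|\mathbf{u}|^2=1.$$
Therefore, the maximum value of $q_A(\mathbf{u})$ among unit vectors is the same as the maximum value of $q_D(\mathbf{v})$, which we know to be 5 and which occurs in the direction $\mathbf{v}=\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since $\mathbf{v}=Q^T\mathbf{u}$ and $Q$ is orthogonal, this maximum occurs in the direction $\mathbf{u}=Q\mathbf{v}=Q\begin{bmatrix} 1 \\ 0 \end{bmatrix}=\begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}$. We now know that the maximum value of $q_A(\mathbf{u})$ is the largest eigenvalue $\lambda_1=5$ and that this maximum value occurs in the direction of an associated eigenvector.
In the same way, we see that the minimum value of $q_A(\mathbf{u})$ is the smallest eigenvalue $\lambda_2=-10$ and that this minimum occurs in the direction of $\mathbf{u}=Q\begin{bmatrix} 0 \\ 1 \end{bmatrix}=\begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}$, an associated eigenvector.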
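The conclusions of this example can be spot-checked numerically. The sketch below, in plain Python with hypothetical helper names, evaluates the quadratic form on the two unit eigenvectors and on a sample of other unit vectors.

```python
import math

def mat_vec(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]

def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))

A = [[-7, -6], [-6, 2]]
s = 1 / math.sqrt(5)
u1 = [s, -2 * s]   # unit eigenvector for the eigenvalue 5
u2 = [2 * s, s]    # unit eigenvector for the eigenvalue -10

q_u1 = dot(u1, mat_vec(A, u1))
q_u2 = dot(u2, mat_vec(A, u2))

# Values of the quadratic form on unit vectors (cos t, sin t), t = 0, 0.01, ..., 6.28.
sampled = [
    dot(u, mat_vec(A, u))
    for u in ([math.cos(k / 100), math.sin(k / 100)] for k in range(629))
]

print(q_u1, q_u2, min(sampled), max(sampled))
```

Up to rounding, the eigenvectors give the values $5$ and $-10$, and every sampled value lies between them.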
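More generally, we have the following result: if $A$ is a symmetric matrix whose largest eigenvalue is $\lambda_1$ and whose smallest eigenvalue is $\lambda_m$, then the maximum value of $q_A(\mathbf{u})$ among unit vectors $\mathbf{u}$ is $\lambda_1$, which occurs in the direction of an associated unit eigenvector $\mathbf{u}_1$, and the minimum value is $\lambda_m$, which occurs in the direction of an associated unit eigenvector $\mathbf{u}_m$.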

Example 7.2.6.

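Suppose that $A$ is the symmetric matrix $A=\begin{bmatrix} 0 & 6 & 0 \\ 6 & 3 & 6 \\ 0 & 6 & 6 \end{bmatrix}$, which may be orthogonally diagonalized as $A=QDQ^T$ where
$$D=\begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -6 \end{bmatrix},\qquad Q=\begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & 1/3 & -2/3 \\ 2/3 & -2/3 & 1/3 \end{bmatrix}.$$
We see that the maximum value of $q_A(\mathbf{u})$ is 12, which occurs in the direction $\begin{bmatrix} 1/3 \\ 2/3 \\ 2/3 \end{bmatrix}$, and the minimum value is $-6$, which occurs in the direction $\begin{bmatrix} 2/3 \\ -2/3 \\ 1/3 \end{bmatrix}$.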
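As a consistency check on this example, the following sketch rebuilds the product $QDQ^T$ from the $Q$ and $D$ given above and evaluates the resulting quadratic form on each column of $Q$. It uses exact fractions from the Python standard library; the variable and function names are illustrative assumptions.

```python
from fractions import Fraction as F

Q = [[F(1, 3), F(2, 3), F(2, 3)],
     [F(2, 3), F(1, 3), F(-2, 3)],
     [F(2, 3), F(-2, 3), F(1, 3)]]
eigenvalues = [12, 3, -6]   # diagonal entries of D
n = len(Q)

# (Q D Q^T)[i][j] = sum over k of Q[i][k] * eigenvalues[k] * Q[j][k]
A = [[sum(Q[i][k] * eigenvalues[k] * Q[j][k] for k in range(n))
      for j in range(n)] for i in range(n)]

def quadratic_form(M, x):
    """Return q_M(x) = x . (M x) for a square matrix M given as a list of rows."""
    Mx = [sum(m * x_j for m, x_j in zip(row, x)) for row in M]
    return sum(x_i * y_i for x_i, y_i in zip(x, Mx))

# The columns of Q are unit eigenvectors, so q_A should return the eigenvalues.
columns = [[Q[i][k] for i in range(n)] for k in range(n)]
values = [quadratic_form(A, u) for u in columns]

print([[int(entry) for entry in row] for row in A])
print([int(v) for v in values])
```

The first column of $Q$ gives the largest value and the third column gives the smallest, in agreement with the directions named in the example.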

Example 7.2.7.

Suppose we have the matrix of demeaned data points $A=\begin{bmatrix} 2 & 1 & -3 \\ 1 & 2 & -3 \end{bmatrix}$ that we considered in Preview Activity 7.2.1. The data points are shown in Figure 7.2.8.
Figure 7.2.8. The set of demeaned data points from Preview Activity 7.2.1.
Constructing the covariance matrix $C=\frac{1}{3}AA^T$ gives $C=\begin{bmatrix} 14/3 & 13/3 \\ 13/3 & 14/3 \end{bmatrix}$, which has eigenvalues $\lambda_1=9$, with associated eigenvector $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$, and $\lambda_2=1/3$, with associated eigenvector $\begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$.
Remember that the variance in a direction $\mathbf{u}$ is $V_{\mathbf{u}}=\mathbf{u}\cdot(C\mathbf{u})=q_C(\mathbf{u})$. Therefore, the variance attains a maximum value of 9 in the direction $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ and a minimum value of 1/3 in the direction $\begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$. Figure 7.2.9 shows the data projected onto the lines defined by these vectors.
Figure 7.2.9. The demeaned data from Preview Activity 7.2.1 is shown projected onto the lines of maximal and minimal variance.
Remember that variance is additive, as stated in Proposition 7.1.16, which tells us that the total variance is $V=9+1/3=28/3$.
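The numbers in this example can be recomputed directly from the data. The sketch below uses exact fractions in plain Python; the helper names are hypothetical. To avoid square roots, it evaluates $q_C$ on a vector $\mathbf{d}$ that is not normalized and divides by $\mathbf{d}\cdot\mathbf{d}$, which equals $q_C$ evaluated on the unit vector in the direction of $\mathbf{d}$ by the scaling property of quadratic forms.

```python
from fractions import Fraction

# Each column of the demeaned data matrix is one data point.
A = [[2, 1, -3],
     [1, 2, -3]]
N = len(A[0])

# Covariance matrix C = (1/N) A A^T, computed entry by entry.
C = [[Fraction(sum(a * b for a, b in zip(row_i, row_j)), N) for row_j in A]
     for row_i in A]

def quadratic_form(M, x):
    """Return q_M(x) = x . (M x) for a square matrix M given as a list of rows."""
    Mx = [sum(m * x_j for m, x_j in zip(row, x)) for row in M]
    return sum(x_i * y_i for x_i, y_i in zip(x, Mx))

def variance_in_direction(C, d):
    """Variance along the line spanned by d, where d need not have unit length."""
    return quadratic_form(C, d) / sum(c * c for c in d)

v_max = variance_in_direction(C, [1, 1])
v_min = variance_in_direction(C, [-1, 1])

# Total variance as the sum of the variances in the two coordinate directions.
total = C[0][0] + C[1][1]

print(v_max, v_min, total)  # 9 1/3 28/3
```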
We’ve been focused on finding the directions in which a quadratic form attains its maximum and minimum values, but there’s another important observation to make after this activity. Recall how we used the fact that a symmetric matrix is orthogonally diagonalizable: if $A=QDQ^T$, then $q_A(\mathbf{u})=q_D(\mathbf{v})$ where $\mathbf{v}=Q^T\mathbf{u}$.
More generally, if we define $\mathbf{y}=Q^T\mathbf{x}$, we have
$$q_A(\mathbf{x})=\mathbf{x}\cdot(A\mathbf{x})=\mathbf{x}\cdot(QDQ^T\mathbf{x})=(Q^T\mathbf{x})\cdot(DQ^T\mathbf{x})=\mathbf{y}\cdot(D\mathbf{y})=q_D(\mathbf{y}).$$
Remembering that the quadratic form associated to a diagonal matrix has no cross terms, we obtain
$$q_A(\mathbf{x})=q_D(\mathbf{y})=\lambda_1y_1^2+\lambda_2y_2^2+\ldots+\lambda_my_m^2.$$
In other words, after a change of coordinates, the quadratic form $q_A$ can be written without cross terms. This is known as the Principal Axes Theorem.
We will put this to use in the next section.
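As a worked instance of this change of coordinates, consider the matrix with rows $(1, 2)$ and $(2, 1)$ from Example 7.2.2, whose eigenvalues are $3$ and $-1$. Taking $Q$ to be the rotation by $45^\circ$ described in Activity 7.2.3, written out here explicitly, the cross term disappears in the coordinates $\mathbf{y}=Q^T\mathbf{x}$:

```latex
\begin{aligned}
\mathbf{y} = Q^T\mathbf{x}
  &= \frac{1}{\sqrt{2}}
     \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
     \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = \begin{bmatrix} (x_1+x_2)/\sqrt{2} \\ (x_2-x_1)/\sqrt{2} \end{bmatrix}, \\
q_D(\mathbf{y})
  &= 3y_1^2 - y_2^2
   = \tfrac{3}{2}(x_1+x_2)^2 - \tfrac{1}{2}(x_2-x_1)^2 \\
  &= x_1^2 + 4x_1x_2 + x_2^2
   = q_A(\mathbf{x}).
\end{aligned}
```

The final line agrees with the expression for the quadratic form found in Example 7.2.2.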

Subsection 7.2.2 Definite symmetric matrices

While our questions about variance provide some motivation for exploring quadratic forms, these functions appear in a variety of other contexts so it’s worth spending some more time with them. For example, quadratic forms appear in multivariable calculus when describing the behavior of a function of several variables near a critical point and in physics when describing the kinetic energy of a rigid body.
The following definition will be important in this section.

Definition 7.2.11.

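A symmetric matrix $A$ is called positive definite if its associated quadratic form satisfies $q_A(\mathbf{x})>0$ for any nonzero vector $\mathbf{x}$. If $q_A(\mathbf{x})\geq 0$ for all nonzero vectors $\mathbf{x}$, we say that $A$ is positive semidefinite.
Likewise, we say that $A$ is negative definite if $q_A(\mathbf{x})<0$ for all nonzero vectors $\mathbf{x}$ and negative semidefinite if $q_A(\mathbf{x})\leq 0$ for all nonzero vectors $\mathbf{x}$.
Finally, $A$ is called indefinite if $q_A(\mathbf{x})>0$ for some $\mathbf{x}$ and $q_A(\mathbf{x})<0$ for others.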

Activity 7.2.4.

This activity explores the relationship between the eigenvalues of a symmetric matrix and its definiteness.
  1. Consider the diagonal matrix $D=\begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$ and write its quadratic form $q_D(\mathbf{x})$ in terms of the components of $\mathbf{x}=\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$. How does this help you decide whether $D$ is positive definite or not?
  2. Now consider $D=\begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix}$ and write its quadratic form $q_D(\mathbf{x})$ in terms of $x_1$ and $x_2$. What can you say about the definiteness of $D$?
  3. If $D$ is a diagonal matrix, what condition on the diagonal entries guarantees that $D$ is
    1. positive definite?
    2. positive semidefinite?
    3. negative definite?
    4. negative semidefinite?
    5. indefinite?
  4. Suppose that $A$ is a symmetric matrix with eigenvalues 4 and 2 so that $A=QDQ^T$ where $D=\begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$. If $\mathbf{y}=Q^T\mathbf{x}$, then we have $q_A(\mathbf{x})=q_D(\mathbf{y})$. Explain why this tells us that $A$ is positive definite.
  5. Suppose that A is a symmetric matrix with eigenvalues 4 and 0. What can you say about the definiteness of A in this case?
  6. What condition on the eigenvalues of a symmetric matrix A guarantees that A is
    1. positive definite?
    2. positive semidefinite?
    3. negative definite?
    4. negative semidefinite?
    5. indefinite?
As seen in this activity, it is straightforward to determine the definiteness of a diagonal matrix. For instance, if $D=\begin{bmatrix} 7 & 0 \\ 0 & 5 \end{bmatrix}$, then
$$q_D(\mathbf{x})=7x_1^2+5x_2^2.$$
This shows that $q_D(\mathbf{x})>0$ when either $x_1$ or $x_2$ is not zero so we conclude that $D$ is positive definite. In the same way, we see that a diagonal matrix is positive semidefinite if all the diagonal entries are nonnegative.
Understanding this behavior for diagonal matrices enables us to understand more general symmetric matrices. As we saw previously, the quadratic form for a symmetric matrix $A=QDQ^T$ agrees with the quadratic form for the diagonal matrix $D$ after a change of coordinates. In particular,
$$q_A(\mathbf{x})=q_D(\mathbf{y})$$
where $\mathbf{y}=Q^T\mathbf{x}$. Now the diagonal entries of $D$ are the eigenvalues of $A$, from which we conclude that $q_A(\mathbf{x})>0$ for every nonzero $\mathbf{x}$ if all the eigenvalues of $A$ are positive. Likewise, $q_A(\mathbf{x})\geq 0$ if all the eigenvalues are nonnegative.
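The eigenvalue criteria can be turned into a small classifier. The sketch below handles only symmetric $2\times 2$ matrices and relies on the standard closed form for their eigenvalues, $\frac{a+c}{2}\pm\sqrt{\left(\frac{a-c}{2}\right)^2+b^2}$, which is not derived in this section. The function names and the tolerance `tol` are assumptions made for illustration.

```python
import math

def symmetric_2x2_eigenvalues(A):
    """Eigenvalues (smallest first) of a symmetric matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = A
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

def definiteness(A, tol=1e-12):
    """Classify a symmetric 2x2 matrix by the signs of its eigenvalues."""
    low, high = symmetric_2x2_eigenvalues(A)
    if low > tol:
        return "positive definite"
    if high < -tol:
        return "negative definite"
    if low >= -tol:
        # Smallest eigenvalue is zero (up to tol), so no eigenvalue is negative.
        return "positive semidefinite"
    if high <= tol:
        # Largest eigenvalue is zero (up to tol), so no eigenvalue is positive.
        return "negative semidefinite"
    return "indefinite"

for M in ([[4, 0], [0, 2]], [[4, 0], [0, 0]], [[1, 2], [2, 1]], [[-7, -6], [-6, 2]]):
    print(M, symmetric_2x2_eigenvalues(M), definiteness(M))
```

The zero matrix satisfies both semidefinite conditions; this sketch reports it as positive semidefinite because that test comes first.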
We will now apply what we’ve learned about quadratic forms to study the nature of critical points in multivariable calculus. The rest of this section assumes that the reader is familiar with ideas from multivariable calculus and can be skipped by others.
First, suppose that $f(x,y)$ is a differentiable function. We will use $f_x$ and $f_y$ to denote the partial derivatives of $f$ with respect to $x$ and $y$. Similarly, $f_{xx}$, $f_{xy}$, $f_{yx}$ and $f_{yy}$ denote the second partial derivatives. You may recall that the mixed partials, $f_{xy}$ and $f_{yx}$, are equal under a mild assumption on the function $f$. A typical question in calculus is to determine where this function has its maximum and minimum values.
Any local maximum or minimum of $f$ appears at a critical point $(x_0,y_0)$ where
$$f_x(x_0,y_0)=0,\qquad f_y(x_0,y_0)=0.$$
Near a critical point, the quadratic approximation of $f$ tells us that
$$f(x,y)\approx f(x_0,y_0)+\frac{1}{2}f_{xx}(x_0,y_0)(x-x_0)^2+f_{xy}(x_0,y_0)(x-x_0)(y-y_0)+\frac{1}{2}f_{yy}(x_0,y_0)(y-y_0)^2.$$

Activity 7.2.5.

Let’s explore how our understanding of quadratic forms helps us determine the behavior of a function f near a critical point.
  1. Consider the function $f(x,y)=2x^3-6xy+3y^2$. Find the partial derivatives $f_x$ and $f_y$ and use these expressions to determine the critical points of $f$.
  2. Evaluate the second partial derivatives $f_{xx}$, $f_{xy}$, and $f_{yy}$.
  3. Let’s first consider the critical point $(1,1)$. Use the quadratic approximation as written above to find an expression approximating $f$ near the critical point.
  4. Using the vector $\mathbf{w}=\begin{bmatrix} x-1 \\ y-1 \end{bmatrix}$, rewrite your approximation as
    $$f(x,y)\approx f(1,1)+q_A(\mathbf{w})$$
    for some matrix $A$. What is the matrix $A$ in this case?
  5. Find the eigenvalues of $A$. What can you conclude about the definiteness of $A$?
  6. Recall that $(x_0,y_0)$ is a local minimum for $f$ if $f(x,y)>f(x_0,y_0)$ for nearby points $(x,y)$. Explain why our understanding of the eigenvalues of $A$ shows that $(1,1)$ is a local minimum for $f$.
Near a critical point $(x_0,y_0)$ of a function $f(x,y)$, we can write
$$f(x,y)\approx f(x_0,y_0)+q_A(\mathbf{w})$$
where $\mathbf{w}=\begin{bmatrix} x-x_0 \\ y-y_0 \end{bmatrix}$ and $A=\frac{1}{2}\begin{bmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{bmatrix}$. If $A$ is positive definite, then $q_A(\mathbf{w})>0$ for every nonzero $\mathbf{w}$, which tells us that
$$f(x,y)\approx f(x_0,y_0)+q_A(\mathbf{w})>f(x_0,y_0)$$
and that the critical point $(x_0,y_0)$ is therefore a local minimum.
The matrix
$$H=\begin{bmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{bmatrix}$$
is called the Hessian of $f$, and we see now that the eigenvalues of this symmetric matrix determine the nature of the critical point $(x_0,y_0)$. In particular, if the eigenvalues are both positive, then $H$ is positive definite, and the critical point is a local minimum.
This observation leads to the Second Derivative Test for multivariable functions.
Most multivariable calculus texts assume that the reader is not familiar with linear algebra and so write the second derivative test for functions of two variables in terms of $D=\det(H)$. If
  • $D>0$ and $f_{xx}(x_0,y_0)>0$, then $(x_0,y_0)$ is a local minimum.
  • $D>0$ and $f_{xx}(x_0,y_0)<0$, then $(x_0,y_0)$ is a local maximum.
  • $D<0$, then $(x_0,y_0)$ is neither a local maximum nor minimum.
The conditions in this version of the second derivative test are simply algebraic criteria that tell us about the definiteness of the Hessian matrix H.
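The determinant version of the test is easy to apply by machine. The sketch below, in plain Python, uses the function $f(x,y)=2x^3-6xy+3y^2$ from Activity 7.2.5 with its partial derivatives computed by hand and hard-coded. The function names are hypothetical, and the `"inconclusive"` label for the case $D=0$ is an addition that is not among the three cases listed above.

```python
# f(x, y) = 2x^3 - 6xy + 3y^2
def gradient(x, y):
    """Return (f_x, f_y)."""
    return (6 * x**2 - 6 * y, -6 * x + 6 * y)

def hessian(x, y):
    """Return [[f_xx, f_xy], [f_yx, f_yy]]."""
    return [[12 * x, -6], [-6, 6]]

def classify(H):
    """Second derivative test using D = det(H) and the sign of f_xx."""
    f_xx = H[0][0]
    D = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if D > 0 and f_xx > 0:
        return "local minimum"
    if D > 0 and f_xx < 0:
        return "local maximum"
    if D < 0:
        return "neither"
    return "inconclusive"   # D == 0 is not covered by the test as stated

# Keep only candidate points where both partial derivatives vanish.
candidates = [(0, 0), (1, 1), (2, 2)]
results = {p: classify(hessian(*p)) for p in candidates if gradient(*p) == (0, 0)}
print(results)  # {(0, 0): 'neither', (1, 1): 'local minimum'}
```

The point $(2,2)$ is included only to show that points that are not critical are filtered out before the test is applied.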

Subsection 7.2.3 Summary

This section explored quadratic forms, functions that are defined by symmetric matrices.
  • If $A$ is a symmetric matrix, then the quadratic form defined by $A$ is the function $q_A(\mathbf{x})=\mathbf{x}\cdot(A\mathbf{x})$. Quadratic forms appear when studying the variance of a dataset. If $C$ is the covariance matrix, then the variance in the direction defined by a unit vector $\mathbf{u}$ is $q_C(\mathbf{u})=\mathbf{u}\cdot(C\mathbf{u})=V_{\mathbf{u}}$.
    Similarly, quadratic forms appear in multivariable calculus when analyzing the behavior of a function of several variables near a critical point.
  • If $\lambda_1$ is the largest eigenvalue of a symmetric matrix $A$ and $\lambda_m$ the smallest, then the maximum value of $q_A(\mathbf{u})$ among unit vectors $\mathbf{u}$ is $\lambda_1$, and this maximum value occurs in the direction of $\mathbf{u}_1$, a unit eigenvector associated to $\lambda_1$.
    Similarly, the minimum value of $q_A(\mathbf{u})$ is $\lambda_m$, which occurs in the direction of $\mathbf{u}_m$, a unit eigenvector associated to $\lambda_m$.
  • A symmetric matrix is positive definite if its eigenvalues are all positive, positive semidefinite if its eigenvalues are all nonnegative, and indefinite if it has both positive and negative eigenvalues.
  • If the Hessian H of a multivariable function f is positive definite at a critical point, then the critical point is a local minimum. Likewise, if the Hessian is negative definite, the critical point is a local maximum.

Exercises 7.2.4 Exercises

1.

Suppose that $A=\begin{bmatrix} 4 & 2 \\ 2 & 7 \end{bmatrix}$.
  1. Find an orthogonal diagonalization of $A$.
  2. Evaluate the quadratic form $q_A\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)$.
  3. Find the unit vector $\mathbf{u}$ for which $q_A(\mathbf{u})$ is as large as possible. What is the value of $q_A(\mathbf{u})$ in this direction?
  4. Find the unit vector $\mathbf{u}$ for which $q_A(\mathbf{u})$ is as small as possible. What is the value of $q_A(\mathbf{u})$ in this direction?

2.

Consider the quadratic form
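$$q\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)=3x_1^2-4x_1x_2+6x_2^2.$$
  1. Find a matrix $A$ such that $q(\mathbf{x})=\mathbf{x}^TA\mathbf{x}$.
  2. Find the maximum and minimum values of $q(\mathbf{u})$ among all unit vectors $\mathbf{u}$ and describe the directions in which they occur.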

3.

Suppose that A is a demeaned data matrix:
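$$A=\begin{bmatrix} 1 & -2 & 0 & 1 \\ 1 & -1 & -1 & 1 \end{bmatrix}.$$
  1. Find the covariance matrix $C$.
  2. What is the variance of the data projected onto the line defined by $\mathbf{u}=\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$?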
  3. What is the total variance?
  4. In which direction is the variance greatest and what is the variance in this direction?

4.

Consider the matrix $A=\begin{bmatrix} 4 & -3 & -3 \\ -3 & 4 & -3 \\ -3 & -3 & 4 \end{bmatrix}$.
  1. Find $Q$ and $D$ such that $A=QDQ^T$.
  2. Find the maximum and minimum values of $q(\mathbf{u})=\mathbf{u}^TA\mathbf{u}$ among all unit vectors $\mathbf{u}$.
  3. Describe the direction in which the minimum value occurs. What can you say about the direction in which the maximum occurs?

5.

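Consider the matrix $B=\begin{bmatrix} -2 & 1 \\ 4 & -2 \\ 2 & -1 \end{bmatrix}$.
  1. Find the matrix $A$ so that $q\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)=|B\mathbf{x}|^2=q_A(\mathbf{x})$.
  2. Find the maximum and minimum values of $q(\mathbf{u})$ among all unit vectors $\mathbf{u}$ and describe the directions in which they occur.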
  3. What does the minimum value of q(u) tell you about the matrix B?

6.

Consider the quadratic form
$$q\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right)=7x_1^2+4x_2^2+7x_3^2-2x_1x_2-4x_1x_3-2x_2x_3.$$
  1. What can you say about the definiteness of the matrix $A$ that defines the quadratic form?
  2. Find a matrix $Q$ so that the change of coordinates $\mathbf{y}=Q^T\mathbf{x}$ transforms the quadratic form into one that has no cross terms. Write the quadratic form in terms of $\mathbf{y}$.
  3. What are the maximum and minimum values for $q(\mathbf{u})$ among all unit vectors $\mathbf{u}$?

7.

Explain why the following statements are true.
  1. Given any matrix $B$, the matrix $B^TB$ is a symmetric, positive semidefinite matrix.
  2. If both $A$ and $B$ are symmetric, positive definite matrices, then $A+B$ is a symmetric, positive definite matrix.
  3. If $A$ is a symmetric, invertible, positive definite matrix, then $A^{-1}$ is also.

8.

Determine whether the following statements are true or false and explain your reasoning.
  1. If A is an indefinite matrix, we can’t know whether it is positive definite or not.
  2. If the smallest eigenvalue of A is 3, then A is positive definite.
  3. If C is the covariance matrix associated with a dataset, then C is positive semidefinite.
  4. If $A$ is a symmetric $2\times 2$ matrix and the maximum and minimum values of $q_A(\mathbf{u})$ occur at $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, then $A$ is diagonal.
  5. If $A$ is negative definite and $Q$ is an orthogonal matrix with $B=QAQ^T$, then $B$ is negative definite.

9.

Determine the critical points for each of the following functions. At each critical point, determine the Hessian H, describe the definiteness of H, and determine whether the critical point is a local maximum or minimum.
  1. $f(x,y)=xy+2x+2y$.
  2. $f(x,y)=x^4+y^4-4xy$.

10.

Consider the function $f(x,y,z)=x^4+y^4+z^4-4xyz$.
  1. Show that $f$ has a critical point at $(-1,1,-1)$ and construct the Hessian $H$ at that point.
  2. Find the eigenvalues of $H$. Is this a definite matrix of some kind?
  3. What does this imply about whether $(-1,1,-1)$ is a local maximum or minimum?