Let \(V\) and \(W\) be vector spaces. At their most basic, all vector spaces are sets. Given any two sets, we can consider functions from one to the other. The functions of interest in linear algebra are those that respect the vector space structure of the sets.
Definition 2.1.1.
Let \(V\) and \(W\) be vector spaces. A function \(T:V\to W\) is called a linear transformation if:
For all \(\vv_1,\vv_2\in V\text{,}\) \(T(\vv_1+\vv_2)=T(\vv_1)+T(\vv_2)\text{.}\)
For all \(\vv\in V\) and scalars \(c\text{,}\) \(T(c\vv)=cT(\vv)\text{.}\)
We often use the term linear operator to refer to a linear transformation \(T:V\to V\) from a vector space to itself.
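To see the definition in action, here is a minimal SymPy sketch (illustrative only; the map \(T(x,y)=(x+2y,\,3x-y)\) is a made-up example, not from the text) that checks both defining properties symbolically:

```python
import sympy as sp

x1, y1, x2, y2, c = sp.symbols('x1 y1 x2 y2 c')

def T(x, y):
    # A made-up linear map T(x, y) = (x + 2y, 3x - y)
    return sp.Matrix([x + 2*y, 3*x - y])

# Additivity: T(v1 + v2) - (T(v1) + T(v2)) should be the zero vector
print(sp.simplify(T(x1 + x2, y1 + y2) - (T(x1, y1) + T(x2, y2))))
# Homogeneity: T(c*v) - c*T(v) should be the zero vector
print(sp.simplify(T(c*x1, c*y1) - c*T(x1, y1)))
```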
The properties of a linear transformation tell us that a linear map \(T\) preserves the operations of addition and scalar multiplication. (When the domain and codomain are different vector spaces, we might say that \(T\) intertwines the operations of the two vector spaces.) In particular, any linear transformation \(T\) must preserve the zero vector, and respect linear combinations.
Theorem 2.1.2.
Let \(T:V\to W\) be a linear transformation. Then
\(T(\zer_V) = \zer_W\text{,}\) and
For any scalars \(c_1,\ldots, c_n\) and vectors \(\vv_1,\ldots, \vv_n\in V\text{,}\)
\begin{equation*}
T(c_1\vv_1+\cdots +c_n\vv_n) = c_1T(\vv_1)+\cdots +c_nT(\vv_n)\text{.}
\end{equation*}
For the first part, remember that old trick we’ve used a couple of times before: \(\zer + \zer = \zer\text{.}\) What happens if you apply \(T\) to both sides of this equation?
For the second part, note that the addition property of a linear transformation looks an awful lot like a distributive property, and we can distribute over a sum of three or more vectors using the associative property. You’ll want to deal with the addition first, and then the scalar multiplication.
Indeed,
\begin{align*}
T(c_1\vv_1+\cdots +c_n\vv_n) \amp = T(c_1\vv_1)+\cdots +T(c_n\vv_n)\\
\amp = c_1T(\vv_1)+\cdots +c_nT(\vv_n)\text{,}
\end{align*}
where the second line follows from the scalar multiplication property.
Remark 2.1.3.
Technically, we skipped over some details in the above proof: how, exactly, is associativity being applied? It turns out there’s actually a proof by induction lurking in the background!
By definition, we know that \(T(\vv_1+\vv_2)=T(\vv_1)+T(\vv_2)\text{.}\) For three vectors,
\begin{align*}
T(\vv_1+\vv_2+\vv_3) \amp = T((\vv_1+\vv_2)+\vv_3)\\
\amp = T(\vv_1+\vv_2)+T(\vv_3)\\
\amp = (T(\vv_1)+T(\vv_2))+T(\vv_3)\text{.}
\end{align*}
For an arbitrary number of vectors \(n\geq 3\text{,}\) we can assume that distribution over addition works for \(n-1\) vectors, and then use associativity to write
\begin{equation*}
\vv_1+\vv_2+\cdots +\vv_n = \vv_1+(\vv_2+\cdots +\vv_n)\text{.}
\end{equation*}
The right-hand side is technically a sum of two vectors, so we can apply the definition of a linear transformation directly, and then apply our induction hypothesis to \(T(\vv_2+\cdots + \vv_n)\text{.}\)
Example 2.1.4.
Let \(V=\R^n\) and let \(W=\R^m\text{.}\) For any \(m\times n\) matrix \(A\text{,}\) the map \(T_A:\R^n\to \R^m\) defined by
\begin{equation*}
T_A(\xx) = A\xx
\end{equation*}
is a linear transformation. (This follows immediately from properties of matrix multiplication.)
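Since the example asserts that linearity follows from properties of matrix multiplication, it can be reassuring to verify both properties symbolically. A small illustrative SymPy check (the \(2\times 3\) shape is an arbitrary choice):

```python
import sympy as sp

# Symbolic 2x3 matrix A and vectors x, y in R^3, plus a scalar c
A = sp.Matrix(2, 3, lambda i, j: sp.Symbol(f'a{i}{j}'))
x = sp.Matrix(3, 1, lambda i, j: sp.Symbol(f'x{i}'))
y = sp.Matrix(3, 1, lambda i, j: sp.Symbol(f'y{i}'))
c = sp.Symbol('c')

print(sp.expand(A*(x + y) - (A*x + A*y)))  # zero matrix: additivity
print(sp.expand(A*(c*x) - c*(A*x)))        # zero matrix: homogeneity
```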
Let \(B = \{\mathbf{e}_1,\ldots, \mathbf{e}_n\}\) denote the standard basis of \(\R^n\text{.}\) (See Example 1.7.6.) Recall (or convince yourself, with a couple of examples) that \(A\mathbf{e}_i\) is equal to the \(i\)th column of \(A\text{.}\) Thus, if we know the value of a linear transformation \(T:\R^n\to \R^m\) on each basis vector, we can immediately determine the matrix \(A\) such that \(T=T_A\text{:}\)
\begin{equation*}
A = \bbm T(\mathbf{e}_1) \amp T(\mathbf{e}_2) \amp \cdots \amp T(\mathbf{e}_n)\ebm\text{.}
\end{equation*}
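The observation that the columns of \(A\) are the vectors \(T(\mathbf{e}_i)\) is easy to test numerically. A hedged NumPy sketch, using a made-up linear map \(T(x,y) = (x-y,\,2x+y,\,3y)\text{:}\)

```python
import numpy as np

def T(v):
    # A made-up linear map R^2 -> R^3: T(x, y) = (x - y, 2x + y, 3y)
    x, y = v
    return np.array([x - y, 2*x + y, 3*y])

E = np.eye(2)                                       # e1, e2 as columns
A = np.column_stack([T(E[:, i]) for i in range(2)])
print(A)                                            # i-th column is T(e_i)

v = np.array([4.0, -1.0])
print(np.allclose(A @ v, T(v)))                     # True: T = T_A
```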
Moreover, if two linear transformations agree on a basis, they must be equal. Given any \(\xx\in \R^n\text{,}\) we can write \(\xx\) uniquely as a linear combination
\begin{equation*}
\xx = c_1\mathbf{e}_1+\cdots +c_n\mathbf{e}_n\text{,}
\end{equation*}
so linearity gives \(T(\xx)=c_1T(\mathbf{e}_1)+\cdots +c_nT(\mathbf{e}_n)\text{:}\) the values \(T(\mathbf{e}_i)\) determine \(T\) completely.
Let’s look at some other examples of linear transformations.
For any vector spaces \(V,W\) we can define the zero transformation \(0:V\to W\) by \(0(\vv)=\zer\) for all \(\vv\in V\text{.}\)
On any vector space \(V\) we have the identity transformation \(1_V:V\to V\) defined by \(1_V(\vv)=\vv\) for all \(\vv\in V\text{.}\)
Let \(V = F[a,b]\) be the space of all functions \(f:[a,b]\to \R\text{.}\) For any \(c\in [a,b]\) we have the evaluation map \(E_c: V\to \R\) defined by \(E_c(f) = f(c)\text{.}\)
To see that this is linear, note that \(E_c(\zer)=\zer(c)=0\text{,}\) where \(\zer\) denotes the zero function; for any \(f,g\in V\text{,}\)
\begin{equation*}
E_c(f+g) = (f+g)(c) = f(c)+g(c) = E_c(f)+E_c(g)\text{,}
\end{equation*}
and similarly, \(E_c(kf)=(kf)(c)=kf(c)=kE_c(f)\) for any scalar \(k\text{.}\)
Note that the evaluation map can similarly be defined as a linear transformation on any vector space of polynomials.
On the vector space \(C[a,b]\) of all continuous functions on \([a,b]\text{,}\) we have the integration map \(I:C[a,b]\to \R\) defined by \(I(f)=\int_a^b f(x)\,dx\text{.}\) The fact that this is a linear map follows from properties of integrals proved in a calculus class.
On the vector space \(C^1(a,b)\) of continuously differentiable functions on \((a,b)\text{,}\) we have the differentiation map \(D: C^1(a,b)\to C(a,b)\) defined by \(D(f) = f'\text{.}\) Again, linearity follows from properties of the derivative.
Let \(\R^\infty\) denote the set of sequences \((a_1,a_2,a_3,\ldots)\) of real numbers, with term-by-term addition and scalar multiplication. The shift operators
\begin{align*}
L(a_1,a_2,a_3,\ldots) \amp = (a_2,a_3,a_4,\ldots)\\
R(a_1,a_2,a_3,\ldots) \amp = (0,a_1,a_2,\ldots)
\end{align*}
are both linear.
On the space \(M_{nn}(\R)\) of \(n\times n\) matrices, the trace defines a linear map \(\operatorname{tr}:M_{nn}(\R)\to \R\text{,}\) and on the space \(M_{mn}(\R)\) of \(m\times n\) matrices, the transpose defines a linear map \(T:M_{mn}(\R)\to M_{nm}(\R)\text{.}\) The determinant and inverse operations on \(M_{nn}\) are not linear. (A quick symbolic check of the differentiation example appears below.)
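Here is the promised check of the differentiation example: an illustrative SymPy sketch confirming that \(D(f+g)=D(f)+D(g)\) and \(D(cf)=cD(f)\) for two sample functions (the particular \(f\) and \(g\) are arbitrary choices):

```python
import sympy as sp

x, c = sp.symbols('x c')
f = sp.sin(x)*sp.exp(x)   # any differentiable expressions will do
g = x**3 + 2*x

# Both differences should simplify to 0
print(sp.simplify(sp.diff(f + g, x) - (sp.diff(f, x) + sp.diff(g, x))))
print(sp.simplify(sp.diff(c*f, x) - c*sp.diff(f, x)))
```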
Exercise 2.1.5.
Which of the following are linear transformations?
The function \(T:\R^2\to \R^2\) given by \(T(x,y)=(x-y, x+2y+1)\text{.}\)
Since \(T(0,0)=(0,1)\neq (0,0)\text{,}\) this can’t be a linear transformation.
The function \(f:P_2(\R)\to \R^2\) given by \(f(p(x))=(p(1),p(2))\text{.}\)
This looks unusual, but it’s linear! You can check that \(f(p(x)+q(x))=f(p(x))+f(q(x))\text{,}\) and \(f(cp(x))=cf(p(x))\text{.}\)
The function \(g:\R^2\to \R^2\) given by \(g(x,y)=(2x-y,2xy)\text{.}\)
Although this function preserves the zero vector, it doesn’t preserve addition or scalar multiplication. For example, \(g(1,0)+g(0,1)=(2,0)+(-1,0)=(1,0)\text{,}\) but \(g((1,0)+(0,1))=g(1,1)=(1,2)\text{.}\)
The function \(M:P_2(\R)\to P_3(\R)\) given by \(M(p(x))=xp(x)\text{.}\)
Multiplication by \(x\) might feel non-linear, but remember that \(x\) is not a “variable” as far as the transformation is concerned! It’s more of a placeholder. Try checking the definition directly.
The function \(D:M_{2\times 2}(\R)\to\R\) given by \(D(A)=\det(A)\text{.}\)
Remember that \(\det(A+B)\neq \det(A)+\det(B)\) in general!
The function \(f:\R\to V\) given by \(f(x)=e^x\text{,}\) where \(V=(0,\infty)\text{,}\) with the vector space structure defined in Exercise 1.1.1.
An exponential function that’s linear? Seems impossible, but remember that “addition” \(x\oplus y\) in \(V\) is really multiplication, so \(f(x+y)=e^{x+y}=e^xe^y=f(x)\oplus f(y)\text{,}\) and similarly, \(f(cx)=c\odot f(x)\text{.}\) (A numeric check of these identities appears after the hint below.)
Hint.
Usually, you can expect a linear transformation to involve homogeneous linear expressions. Things like products, powers, and added constants are usually clues that something is nonlinear.
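As promised above, the exotic-operations example can be checked numerically. A small sketch (the sample values of \(x\text{,}\) \(y\text{,}\) \(c\) are arbitrary):

```python
import math

def oplus(u, v):   # "addition" in V = (0, infinity) is multiplication
    return u * v

def odot(c, u):    # "scalar multiplication" in V is exponentiation
    return u ** c

x, y, c = 1.3, -0.7, 2.5
print(math.isclose(math.exp(x + y), oplus(math.exp(x), math.exp(y))))  # True
print(math.isclose(math.exp(c * x), odot(c, math.exp(x))))             # True
```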
For finite-dimensional vector spaces, it is often convenient to work in terms of a basis. The properties of a linear transformation tell us that we can completely define any linear transformation by giving its values on a basis. In fact, it’s enough to know the value of a transformation on a spanning set. The argument given in Example 2.1.4 can be applied to any linear transformation, to obtain the following result.
Theorem 2.1.6.
Let \(T:V\to W\) and \(S:V\to W\) be two linear transformations. If \(V = \spn\{\vv_1,\ldots, \vv_n\}\) and \(T(\vv_i)=S(\vv_i)\) for each \(i=1,2,\ldots, n\text{,}\) then \(T=S\text{.}\)
Caution: If the above spanning set is not also independent, then we can’t just define the values \(T(\vv_i)\) however we want. For example, suppose we want to define \(T:\R^2\to\R^2\text{,}\) and we set \(\R^2=\spn\{(1,2),(4,-1),(5,1)\}\text{.}\) If \(T(1,2)=(3,4)\) and \(T(4,-1)=(-2,2)\text{,}\) then we must have \(T(5,1)=(1,6)\text{.}\) Why? Because \((5,1)=(1,2)+(4,-1)\text{,}\) and if \(T\) is to be linear, then we have to have \(T((1,2)+(4,-1))=T(1,2)+T(4,-1)\text{.}\)
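A quick numeric sanity check of the caution above:

```python
import numpy as np

v1, v2, v3 = np.array([1, 2]), np.array([4, -1]), np.array([5, 1])
print(np.array_equal(v1 + v2, v3))   # True: (5,1) = (1,2) + (4,-1)

Tv1, Tv2 = np.array([3, 4]), np.array([-2, 2])
print(Tv1 + Tv2)                     # [1 6], so T(5,1) is forced to be (1,6)
```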
Remark 2.1.7.
If for some reason we already know that our transformation is linear, we might still be concerned about the fact that if a spanning set is not independent, there will be more than one way to express a vector as linear combination of vectors in that set. If we define \(T\) by giving its values on a spanning set, will it be well-defined? (That is, could we get two different values for \(T(\vv)\) by expressing \(\vv\) in terms of the spanning set in two different ways?) Suppose that we have scalars \(a_1,\ldots, a_n, b_1,\ldots, b_n\) such that
\begin{equation*}
\vv = a_1\vv_1+\cdots + a_n\vv_n = b_1\vv_1+\cdots + b_n\vv_n\text{.}
\end{equation*}
Then \((a_1-b_1)\vv_1+\cdots +(a_n-b_n)\vv_n=\zer\text{,}\) and since \(T\) is linear and \(T(\zer)=\zer\text{,}\) we get \((a_1-b_1)T(\vv_1)+\cdots +(a_n-b_n)T(\vv_n)=\zer\text{.}\) Rearranging, \(a_1T(\vv_1)+\cdots +a_nT(\vv_n)=b_1T(\vv_1)+\cdots +b_nT(\vv_n)\text{,}\) so both expressions give the same value for \(T(\vv)\text{.}\)
Of course, we can avoid all of this unpleasantness by using a basis to define a transformation. Given a basis \(B = \{\vv_1,\ldots, \vv_n\}\) for a vector space \(V\text{,}\) we can define a transformation \(T:V\to W\) by setting \(T(\vv_i)=\ww_i\) for some choice of vectors \(\ww_1,\ldots, \ww_n\) and defining
\begin{equation*}
T(c_1\vv_1+\cdots +c_n\vv_n) = c_1\ww_1+\cdots +c_n\ww_n\text{.}
\end{equation*}
Because each vector \(\vv\in V\) can be written uniquely in terms of a basis, we know that our transformation is well-defined.
The next theorem seems like an obvious consequence of the above, and indeed, one might wonder where the assumption of a basis is needed. The distinction here is that the vectors \(\ww_1,\ldots, \ww_n\in W\) are chosen in advance, and then we define \(T\) by setting \(T(\mathbf{b}_i)=\ww_i\text{,}\) rather than simply defining each \(\ww_i\) as \(T(\mathbf{b}_i)\text{.}\)
Theorem 2.1.8.
Let \(V,W\) be vector spaces. Let \(B=\{\mathbf{b}_1,\ldots, \mathbf{b}_n\}\) be a basis of \(V\text{,}\) and let \(\ww_1,\ldots, \ww_n\) be any vectors in \(W\text{.}\) (These vectors need not be distinct.) Then there exists a unique linear transformation \(T:V\to W\) such that \(T(\mathbf{b}_i)=\ww_i\) for each \(i=1,2,\ldots, n\text{;}\) indeed, we can define \(T\) as follows: given \(\vv\in V\text{,}\) write \(\vv=c_1\mathbf{b}_1+\cdots +c_n\mathbf{b}_n\text{.}\) Then
\begin{equation*}
T(\vv) = c_1\ww_1+\cdots +c_n\ww_n\text{.}
\end{equation*}
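In coordinates, the construction in Theorem 2.1.8 is easy to implement. A hedged NumPy sketch, using the (non-standard) basis and images that appear in the second part of Example 2.1.9 below:

```python
import numpy as np

B = np.column_stack([[3, 1], [2, -5]])   # basis vectors b1, b2 as columns
W = np.column_stack([[1, 4], [2, -1]])   # chosen images w1, w2 as columns

def T(v):
    c = np.linalg.solve(B, v)            # coordinates of v in the basis B
    return W @ c                         # T(v) = c1*w1 + c2*w2

print(T(np.array([3, 1])))               # [1. 4.]  = w1, as required
print(T(np.array([2, -5])))              # [ 2. -1.] = w2
```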
With the basic theory out of the way, let’s look at a few examples.
Example 2.1.9.
Suppose \(T:\R^2\to \R^2\) is a linear transformation. If \(T\bbm 1\\0\ebm = \bbm 3\\-4\ebm\) and \(T\bbm 0\\1\ebm =\bbm 5\\2\ebm\text{,}\) find \(T\bbm -2\\4\ebm\text{.}\)
Solution.
Since we know the value of \(T\) on the standard basis, we can use properties of linear transformations to immediately obtain the answer:
\begin{align*}
T\bbm -2\\4\ebm \amp = T\left(-2\bbm 1\\0\ebm + 4\bbm 0\\1\ebm\right)\\
\amp = -2T\bbm 1\\0\ebm + 4T\bbm 0\\1\ebm\\
\amp = -2\bbm 3\\-4\ebm + 4\bbm 5\\2\ebm = \bbm 14\\16\ebm\text{.}
\end{align*}
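This computation amounts to multiplying by the matrix whose columns are \(T(\mathbf{e}_1)\) and \(T(\mathbf{e}_2)\text{;}\) a one-line NumPy check:

```python
import numpy as np

A = np.column_stack([[3, -4], [5, 2]])   # columns: T(e1), T(e2)
print(A @ np.array([-2, 4]))             # [14 16]
```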
Suppose \(T:\R^2\to \R^2\) is a linear transformation. Given that \(T\bbm 3\\1\ebm = \bbm 1\\4\ebm\) and \(T\bbm 2\\-5\ebm = \bbm 2\\-1\ebm\text{,}\) find \(T\bbm 4\\3\ebm\text{.}\)
Solution.
At first, this example looks the same as the one above, and to some extent, it is. The difference is that this time, we’re given the values of \(T\) on a basis that is not the standard one. This means we first have to do some work to determine how to write the given vector in terms of the given basis.
Suppose we have \(a\bbm 3\\1\ebm+b\bbm 2\\-5\ebm = \bbm 4\\3\ebm\) for scalars \(a,b\text{.}\) This is equivalent to the matrix equation
\begin{equation*}
\bbm 3\amp 2\\1\amp -5\ebm\bbm a\\b\ebm = \bbm 4\\3\ebm\text{,}
\end{equation*}
whose solution is \(a = \frac{26}{17}, b = -\frac{5}{17}\text{.}\) Therefore,
\begin{equation*}
T\bbm 4\\3\ebm = \frac{26}{17}T\bbm 3\\1\ebm - \frac{5}{17}T\bbm 2\\-5\ebm = \frac{26}{17}\bbm 1\\4\ebm - \frac{5}{17}\bbm 2\\-1\ebm = \frac{1}{17}\bbm 16\\109\ebm\text{.}
\end{equation*}
Since \(\{(1,2),(-1,1)\}\) forms a basis of \(\R^2\) (the vectors are not parallel and there are two of them), it suffices to determine how to write a general vector in terms of this basis. Suppose
\begin{equation*}
x\bbm 1\\2\ebm + y\bbm -1\\1\ebm = \bbm a\\b\ebm
\end{equation*}
for a general element \((a,b)\in \R^2\text{.}\) This is equivalent to the matrix equation \(\bbm 1\amp -1\\2\amp 1\ebm\bbm x\\y\ebm = \bbm a\\b\ebm\text{,}\) which we can solve as \(\bbm x\\y\ebm = \bbm 1\amp -1\\2\amp 1\ebm^{-1}\bbm a\\b\ebm\text{:}\)
\begin{equation*}
\bbm x\\y\ebm = \frac{1}{3}\bbm 1\amp 1\\-2\amp 1\ebm\bbm a\\b\ebm = \frac{1}{3}\bbm a+b\\b-2a\ebm\text{.}
\end{equation*}
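The arithmetic in both parts of this solution can be double-checked in exact rational arithmetic with SymPy:

```python
import sympy as sp

# The coefficients of (4,3) in the basis {(3,1), (2,-5)}, and T(4,3)
B = sp.Matrix([[3, 2], [1, -5]])        # basis vectors as columns
coeffs = B.solve(sp.Matrix([4, 3]))
print(coeffs.T)                         # Matrix([[26/17, -5/17]])

W = sp.Matrix([[1, 2], [4, -1]])        # columns: T(3,1) and T(2,-5)
print((W * coeffs).T)                   # Matrix([[16/17, 109/17]])

# The inverse used to express (a,b) in the basis {(1,2), (-1,1)}
M = sp.Matrix([[1, -1], [2, 1]])
print(M.inv())                          # Matrix([[1/3, 1/3], [-2/3, 1/3]])
```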
Let \(T:V\to W\) be a linear transformation. Rearrange the blocks below to create a proof of the following statement:
For any vectors \(\vv_1,\ldots, \vv_n\in V\text{,}\) if \(\{T(\vv_1),\ldots, T(\vv_n)\}\) is linearly independent in \(W\text{,}\) then \(\{\vv_1,\ldots, \vv_n\}\) is linearly independent in \(V\text{.}\)
Suppose that \(\{T(\vv_1),\ldots, T(\vv_n)\}\) is linearly independent.
---
We want to show that \(\{\vv_1,\ldots, \vv_n\}\) is linearly independent, so suppose that we have
\begin{equation*}
c_1\vv_1+c_2\vv_2+\cdots +c_n\vv_n = \zer\text{.}
\end{equation*}
---
Applying \(T\) to both sides of this equation, and using Theorem 2.1.2, we get
\begin{equation*}
c_1T(\vv_1)+c_2T(\vv_2)+\cdots +c_nT(\vv_n) = T(\zer)=\zer\text{.}
\end{equation*}
By hypothesis, the vectors \(T(\vv_i)\) are linearly independent, so we must have \(c_1=0,c_2=0,\ldots, c_n=0\text{.}\)
---
Since the only solution to \(c_1\vv_1+\cdots + c_n\vv_n=\zer\) is \(c_1=0,\ldots, c_n=0\text{,}\) the set \(\{\vv_1,\ldots, \vv_n\}\) is linearly independent.
Hint.
This is mostly a matter of using Theorem 2.1.2, but it’s important to get the logic correct. We have a conditional statement of the form \(P\Rightarrow Q\text{,}\) where both \(P\) (“the set \(\{T(\vv_1),\ldots, T(\vv_n)\}\) is independent”) and \(Q\) (“the set \(\{\vv_1,\ldots, \vv_n\}\) is independent”) are themselves conditional statements.
The overall structure therefore looks like \((A\Rightarrow B)\Rightarrow (C\Rightarrow D)\text{.}\) A direct proof should be structured as follows:
Assume the main hypothesis: \(A\Rightarrow B\text{.}\)
Assume the “sub”-hypothesis \(C\text{.}\)
Figure out how to show that \(C\Rightarrow A\text{.}\) (This is the “apply \(T\) to both sides” step.)
If we know \(A\text{,}\) and we’ve assumed \(A\Rightarrow B\text{,}\) we know \(B\text{.}\)
Realize that \(B\Rightarrow D\text{.}\)
2.
(a) Suppose \(f : \mathbb{R}^{2} \to \mathbb{R}^{3}\) is a linear transformation such that
Compute \(\displaystyle{ f ( 5 {\vec{e}}_4 + 4 {\vec{e}}_7 ) - f ( 6 {\vec{e}}_8 + 2 {\vec{e}}_7 )}\text{.}\)
(c) Let \(V\) be a vector space and let \({\vec{v}}_1, {\vec{v}}_2, {\vec{v}}_3 \in V\text{.}\) Suppose \(T : V \to \mathbb{R}^{2}\) is a linear transformation such that
3.
Let \(M_{n,n}(\mathbb{R})\) denote the vector space of \(n \times n\) matrices with real entries. Let \(f : M_{2,2}(\mathbb{R}) \to M_{2,2}(\mathbb{R})\) be the function defined by \(f(A) = A^T\) for any \(A \in M_{2,2}(\mathbb{R})\text{.}\) Determine if \(f\) is a linear transformation, as follows:
Let \(A = {\left[\begin{array}{cc}
a_{11} \amp a_{12}\cr
a_{21} \amp a_{22}\cr
\end{array}\right]}\) and \(B = {\left[\begin{array}{cc}
b_{11} \amp b_{12}\cr
b_{21} \amp b_{22}\cr
\end{array}\right]}\) be any two matrices in \(M_{2,2}(\mathbb{R})\) and let \(c \in \mathbb{R}\text{.}\)
(a) \(f(A+B) = \underline{\qquad}\text{.}\)
\(f(A) + f(B) = \underline{\qquad} + \underline{\qquad}\text{.}\)
Does \(f(A+B) = f(A) + f(B)\) for all \(A, B \in M_{2,2}(\mathbb{R})\text{?}\)
(b) \(f(cA) = \underline{\qquad}\text{.}\)
\(c(f(A)) = c\Bigg(\underline{\qquad}\Bigg)\text{.}\)
Does \(f(cA) = c(f(A))\) for all \(c \in \mathbb{R}\) and all \(A \in M_{2,2}(\mathbb{R})\text{?}\)
(c) Is \(f\) a linear transformation?
4.
Let \(f : \mathbb{R} \to \mathbb{R}\) be defined by \(f(x) = {2x-3}\text{.}\) Determine if \(f\) is a linear transformation, as follows:
(a) \(f(x+y) = \underline{\qquad}\text{.}\)
\(f(x) + f(y) = \underline{\qquad} + \underline{\qquad}\text{.}\)
Does \(f(x+y) = f(x) + f(y)\) for all \(x,y \in \mathbb{R}\text{?}\)
(b) \(f(cx) = \underline{\qquad}\text{.}\)
\(c(f(x)) = c\Big(\underline{\qquad}\Big)\text{.}\)
Does \(f(cx) = c(f(x))\) for all \(c, x \in \mathbb{R}\text{?}\)
(c) Is \(f\) a linear transformation?
5.
Let \(V\) and \(W\) be vector spaces and let \(\vec{v}_1, \vec{v}_2 \in V\) and \(\vec{w}_1, \vec{w}_2 \in W\text{.}\)
(a) Suppose \(T : V \to W\) is a linear transformation.
Find \(T( 6 \vec{v}_1 - \vec{v}_2)\) and write your answer in terms of \(T(\vec{v}_1)\) and \(T(\vec{v}_2)\text{.}\)
(b) Suppose \(L : V \to W\) is a linear transformation such that \(L(\vec{v}_1) = \vec{w}_1 + \vec{w}_2\) and \(L(\vec{v}_2) = - 8 \vec{w}_2\text{.}\)
Find \(L(6 \vec{v}_1 + 3 \vec{v}_2)\) in terms of \(\vec{w}_1\) and \(\vec{w}_2\text{.}\)
6.
Let \(T:{\mathbb R}^2 \rightarrow {\mathbb R}^2\) be a linear transformation that sends the vector \(\vec{u} =(5,2)\) into \((2,1)\) and maps \(\vec{v}= (1,3)\) into \((-1, 3)\text{.}\) Use properties of a linear transformation to calculate the following.
(a) \(T(4 \vec{u})\)
(b) \(T(-6 \vec{v})\)
(c) \(T(4 \vec{u} - 6 \vec{v})\)
7.
Let \(\vec{e}_1=(1,0)\text{,}\) \(\vec{e}_2=(0,1)\text{,}\) \(\vec{x}_1=(7,-8)\text{,}\) and \(\vec{x}_2=(2,9)\text{.}\)
Let \(T: {\mathbb R}^2 \rightarrow {\mathbb R}^2\) be a linear transformation that sends \(\vec{e}_1\) to \(\vec{x}_1\) and \(\vec{e}_2\) to \(\vec{x}_2\text{.}\)
If \(T\) maps \((1, 6)\) to the vector \(\vec{y}\text{,}\) find \(\vec{y}\text{.}\)
Compute \(T(ax^2+bx+c)\text{,}\) where \(a\text{,}\)\(b\text{,}\) and \(c\) are arbitrary real numbers.
11.
If \(T: P_1 \rightarrow P_1\) is a linear transformation such that \(T(1+4 x) = -2 + 4 x\) and \(T(3 + 11 x) = 3 + 2 x\text{,}\) find the value of \(T(4 - 5 x)\text{.}\)