7.1 Properties of linear maps
In this section we will study some properties of linear maps and develop some of the related structures. We will later see that this connects to the exploration of matrices that we looked at earlier in the course.
7.1.1 Combinations of linear maps
Throughout this section we assume that \(U, V, W\) and \(X\) are vector spaces over \(\mathbb{F}\).
We first notice that we can add linear maps if they relate the same spaces, and multiply them by scalars:
Let \(S:V\to W\) and \(T:V\to W\) be linear maps, and \(\lambda\in\mathbb{F}\). Then we define
\((\lambda T)(x):=\lambda T(x)\) and
\((S+T)(x):=S(x)+T(x)\).
Let \(\lambda \in \mathbb{F}\) and \(T, S:V \to W\) be linear maps. The maps \(\lambda T\) and \(S+T\) are linear maps from \(V\) to \(W\).
The proof follows directly from the definitions.
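Although the correspondence between linear maps and matrices is only developed later in the course, a quick numerical sketch may already be helpful: if \(S,T:\mathbb{R}^3\to\mathbb{R}^2\) are given by matrices, then \(S+T\) and \(\lambda T\) correspond to the entrywise sum and scalar multiple of those matrices. The matrices \(A, B\) and the vector \(x\) below are illustrative choices.

```python
import numpy as np

# Illustrative matrices representing S, T : R^3 -> R^2.
A = np.array([[1., 2., 0.],
              [0., 1., 3.]])   # represents S
B = np.array([[2., 0., 1.],
              [1., 1., 0.]])   # represents T

lam = 5.0
x = np.array([1., -2., 4.])

# (S + T)(x) := S(x) + T(x) corresponds to the matrix sum A + B.
assert np.allclose((A + B) @ x, A @ x + B @ x)

# (lam T)(x) := lam T(x) corresponds to the scalar multiple lam * B.
assert np.allclose((lam * B) @ x, lam * (B @ x))
```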
We can also compose maps in general, and for linear maps we find that the composition (if defined) is linear.
Let \(T:U\to V\) and \(S:V\to W\) be linear maps, then the composition \[S\circ T(x):=S(T(x))\] is a linear map \(S\circ T:U\to W\).
We consider first the action of \(S\circ T\) on \(\lambda x\): Since \(T\) is linear we have \(S\circ T(\lambda x)=S(T(\lambda x))=S(\lambda T(x))\) and since \(S\) is linear, too, we get \(S(\lambda T(x))=\lambda S(T(x))=\lambda S\circ T(x)\). Now we apply \(S\circ T\) to \(x+y\): \[\begin{aligned} S\circ T(x+y)&=S(T(x+y))\\ &=S(T(x)+T(y))\\ &=S(T(x))+S(T(y))=S\circ T(x)+S\circ T(y) . \end{aligned}\]
□
In a similar way one can prove the following.
Let \(T:U\to V\) and \(R, S:V\to W\) be linear maps, then \[(R+S)\circ T=R\circ T+S\circ T ,\] and if \(S, T:U\to V\) and \(R:V\to W\) are linear maps, then \[R\circ(S+T)=R\circ S+R\circ T .\] Furthermore, if \(T:U\to V\), \(S:V\to W\) and \(R:W\to X\) are linear maps, then \[(R\circ S)\circ T=R\circ(S\circ T) .\]
Note that the first and third properties hold for any functions, but the second relies on the linearity of \(R\).
7.1.2 Image and kernel
Let us define two subsets related naturally to each linear map.
Let \(T:V\to W\) be a linear map, then we define
the image of \(T\) to be \[\operatorname{Im}T=\{ y\in W: \text{there exists } x\in V \text{ with } T(x)=y\}\subseteq W,\]
the kernel of \(T\) to be \[\ker T=\{ x\in V: T(x)=\mathbf{0}\}\subseteq V .\]
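For a map given by a matrix \(A\), the kernel consists of the solutions of \(Ax=\mathbf{0}\), and a vector lies in the image precisely when it is \(Av\) for some \(v\). A small numerical sketch (the matrix and vectors are illustrative):

```python
import numpy as np

# Illustrative map T : R^3 -> R^2 with a nontrivial kernel.
A = np.array([[1., 2., 3.],
              [0., 1., 1.]])

# x = (1, 1, -1) solves A x = 0, so x lies in ker T.
x = np.array([1., 1., -1.])
assert np.allclose(A @ x, 0)

# Scalar multiples of x stay in ker T (a hint that ker T is a subspace).
assert np.allclose(A @ (3.0 * x), 0)

# y = T(v) lies in Im T by definition.
v = np.array([2., 0., 1.])
y = A @ v
```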
The image and the kernel are in fact subspaces rather than just subsets. This is no coincidence, as the following theorem shows.
Let \(T:V\to W\) be a linear map, then \(\operatorname{Im}T\) is a linear subspace of \(W\) and \(\ker T\) is a linear subspace of \(V\).
The proof is left as an exercise.
Now let us relate the image and the kernel to some general properties of a map.
Recall that if \(A,B\) are sets and \(f:A\to B\) is a map (not necessarily linear), then \(f\) is called
surjective, if for any \(b\in B\) there exists an \(a\in A\) such that \(f(a)=b\).
injective, if whenever \(f(a)=f(a')\) then \(a=a'\).
bijective, if \(f\) is injective and surjective, that is if for any \(b\in B\) there exists exactly one \(a\in A\) with \(f(a)=b\).
If \(f:A\to B\) is bijective, then there exists a unique map \(f^{-1}:B\to A\) with \(f\circ f^{-1}(b)=b\) for all \(b\in B\), \(f^{-1}\circ f(a)=a\) for all \(a\in A\) and \(f^{-1}\) is bijective, too.
Let us first show existence: For any \(b\in B\), there exists an \(a\in A\) such that \(f(a)=b\), since \(f\) is surjective. Since \(f\) is injective, this \(a\) is unique, i.e., if \(f(a')=b\), then \(a'=a\), so we can set \[f^{-1}(b):=a .\] By definition this map satisfies \(f\circ f^{-1}(b)=f(f^{-1}(b))=f(a)=b\) and \(f^{-1}(f(a))=f^{-1}(b)=a\). From these we also get that \(f^{-1}\) is bijective. For uniqueness, suppose \(g:B\to A\) also satisfies \(f\circ g(b)=b\) for all \(b\in B\); then \(f(g(b))=b=f(f^{-1}(b))\), and injectivity of \(f\) gives \(g(b)=f^{-1}(b)\), so \(g=f^{-1}\).
□
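For a finite set, a bijection and its inverse can be sketched concretely as a dictionary whose keys and values are swapped (the sets \(A\) and \(B\) below are illustrative):

```python
# A finite bijection f : A -> B, stored as a dict.
f = {1: 'a', 2: 'b', 3: 'c'}

# The inverse map is obtained by swapping keys and values; this only
# defines a map when f is injective, and only covers all of B when f
# is surjective.
f_inv = {b: a for a, b in f.items()}

# f o f_inv = id_B and f_inv o f = id_A.
assert all(f[f_inv[b]] == b for b in f_inv)
assert all(f_inv[f[a]] == a for a in f)
```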
In the special case of linear maps we can connect these general properties to the image and kernel of our map.
Let \(T: V\to W\) be a linear map, then
\(T\) is surjective if and only if \(\operatorname{Im}T=W\),
\(T\) is injective if and only if \(\ker T=\{\mathbf{0}\}\), and
\(T\) is bijective if and only if \(\operatorname{Im}T=W\) and \(\ker T=\{\mathbf{0}\}\).
That surjective is equivalent to \(\operatorname{Im}T=W\) follows directly from the definitions of surjectivity and \(\operatorname{Im}T\).
Notice that we always have \(\mathbf{0}\in \ker T\), since \(T(\mathbf{0})=\mathbf{0}\). Now assume \(T\) is injective and let \(x\in \ker T\), i.e., \(T(x)=\mathbf{0}\). But then \(T(x)=T(\mathbf{0})\) and injectivity of \(T\) gives \(x=\mathbf{0}\), hence \(\ker T=\{\mathbf{0}\}\). For the converse, let \(\ker T=\{\mathbf{0}\}\), and assume there are \(x,x'\in V\) with \(T(x)=T(x')\). Using linearity of \(T\) we then get \(\mathbf{0}=T(x)-T(x')=T(x-x')\) and hence \(x-x'\in \ker T\), and since \(\ker T=\{\mathbf{0}\}\) this means that \(x=x'\) and hence \(T\) is injective.
This follows immediately from the previous two parts of the theorem.
□
An important property of linear maps with \(\ker T=\{\mathbf{0}\}\) is the following.
Let \(x_1,x_2,\cdots,x_k\in V\) be linearly independent, and \(T:V\to W\) be a linear map with \(\ker T=\{\mathbf{0}\}\). Then \(T(x_1), T(x_2) ,\cdots ,T(x_k)\) are linearly independent.
Assume \(T(x_1), T(x_2),\cdots ,T(x_k)\) are linearly dependent, i.e., there exist \(\lambda_1,\lambda_2,\cdots,\lambda_k\), not all \(0\), such that \[\lambda_1T(x_1)+\lambda_2T(x_2)+\cdots +\lambda_kT(x_k)=\mathbf{0}.\] But since \(T\) is linear we have \[T(\lambda_1x_1+\lambda_2x_2+\cdots +\lambda_kx_k)= \lambda_1T(x_1)+\lambda_2T(x_2)+\cdots +\lambda_kT(x_k)=\mathbf{0},\] and hence \(\lambda_1x_1+\lambda_2x_2+\cdots +\lambda_kx_k\in \ker T\). But since \(\ker T=\{\mathbf{0}\}\) it follows that \[\lambda_1x_1+\lambda_2x_2+\cdots +\lambda_kx_k=\mathbf{0},\] which means that the vectors \(x_1,x_2, \cdots ,x_k\) are linearly dependent, and this contradicts the assumption. Therefore \(T(x_1), T(x_2),\cdots ,T(x_k)\) are linearly independent.
□
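This preservation of linear independence can be checked numerically: for a map with trivial kernel, the rank of a set of vectors equals the rank of their images. The matrix below is an illustrative injective map \(T:\mathbb{R}^2\to\mathbb{R}^3\).

```python
import numpy as np

# Illustrative map T : R^2 -> R^3 with ker T = {0}.
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])

x1 = np.array([1., 2.])
x2 = np.array([0., 1.])
X = np.column_stack([x1, x2])
assert np.linalg.matrix_rank(X) == 2          # x1, x2 are independent

# Their images T(x1), T(x2) are again independent.
assert np.linalg.matrix_rank(A @ X) == 2
```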
Notice that this result implies that if \(T\) is bijective, it maps a basis of \(V\) to a basis of \(W\), hence \(\dim V=\dim W\) (for finite-dimensional spaces).
We saw that a bijective map has an inverse; we now show that if \(T\) is linear, then the inverse is linear, too.
Let \(T:V\to V\) be a linear map and assume \(T\) is bijective. Then \(T^{-1}:V\to V\) is also linear.
Let \(y, y' \in V\). We want to show \(T^{-1}(y+y')=T^{-1}(y)+T^{-1}(y')\). Since \(T\) is bijective we know that there are unique \(x,x'\in V\) with \(y=T(x)\) and \(y'=T(x')\), therefore \[y+y'=T(x)+T(x')=T(x+x')\] and applying \(T^{-1}\) to both sides of this equation gives \[T^{-1}(y+y')=T^{-1}(T(x+x'))=x+x'=T^{-1}(y)+T^{-1}(y') .\]
The second property of a linear map, \(T^{-1}(\lambda y)=\lambda T^{-1}(y)\), is shown in a similar way and is left as an exercise.□
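Numerically, the inverse of an invertible matrix indeed acts linearly; the matrix, vectors and scalar below are illustrative.

```python
import numpy as np

# Illustrative invertible map T : R^2 -> R^2 and its inverse.
A = np.array([[2., 1.],
              [1., 1.]])
A_inv = np.linalg.inv(A)

y  = np.array([3., -1.])
y2 = np.array([0.5, 4.])
lam = 2.5

# T^{-1} is additive and homogeneous.
assert np.allclose(A_inv @ (y + y2), A_inv @ y + A_inv @ y2)
assert np.allclose(A_inv @ (lam * y), lam * (A_inv @ y))
```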
7.1.3 Rank and nullity
As the image and kernel of a linear map are subspaces, we can find bases for them and hence determine their dimensions. These dimensions are important enough to deserve names, as they can tell us a lot about a map: the rank of \(T\) is \(\operatorname{rank}T:=\dim \operatorname{Im}T\), and the nullity of \(T\) is \(\operatorname{nullity}T:=\dim \ker T\).
So in view of our discussion in the previous subsection, a map \(T: V\to W\) is injective if and only if \(\operatorname{nullity}T=0\), and surjective if and only if \(\operatorname{rank}T=\dim W\). It turns out that rank and nullity are actually related; this is the content of the Rank-Nullity Theorem.
Let \(T:V \to W\) be a linear map, then \[\operatorname{rank}T+\operatorname{nullity}T=\dim V .\]
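The theorem can be illustrated numerically: for the (illustrative) matrix below, whose third row is the sum of the first two, a basis of the kernel is found by solving \(Ax=\mathbf{0}\) by hand, and the dimensions add up to \(\dim V\).

```python
import numpy as np

# Illustrative map T : R^4 -> R^3; the third row is the sum of the
# first two, so rank T = 2.
A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 0.],
              [1., 3., 1., 1.]])
n = A.shape[1]                          # dim V = 4

rank = np.linalg.matrix_rank(A)

# A basis of ker T, found by solving A x = 0 by hand:
K = np.array([[ 2., -1., 1., 0.],
              [-1.,  0., 0., 1.]]).T    # columns are kernel vectors
assert np.allclose(A @ K, 0)
nullity = np.linalg.matrix_rank(K)      # the two columns are independent

assert rank + nullity == n              # rank-nullity
```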
We will explore this theorem and its proof further later in the course. For now, let us just consider a few consequences of this.
We must have that \(n=m\). In fact we can state this as a general result about vector spaces.
If the linear map \(T:V\to W\) is invertible then \(\dim V= \dim W\).
Let \(\dim V=n\) and \(\dim W=m\). Since \(T\) is invertible we have that \(\operatorname{rank}T=m\) and \(\operatorname{nullity}T=0\). Hence by the Rank-Nullity Theorem we have that \(m=\operatorname{rank}T=n\).
□
Let \(T: V\to V\) be a linear map, then
if \(\operatorname{rank}T=\dim V\), then \(T\) is invertible,
if \(\operatorname{nullity}T=0\), then \(T\) is invertible.
We have that \(T\) is invertible if and only if \(\operatorname{rank}T=\dim V\) and \(\operatorname{nullity}T=0\); but by the Rank-Nullity Theorem \(\operatorname{rank}T+\operatorname{nullity}T=\dim V\), hence each of the two conditions implies the other.
□
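For a square matrix this corollary says that full rank alone already guarantees invertibility; a small numerical sketch (the matrix is illustrative):

```python
import numpy as np

# Illustrative square matrix, representing a map T : R^2 -> R^2.
A = np.array([[1., 2.],
              [3., 5.]])
n = A.shape[0]

assert np.linalg.matrix_rank(A) == n    # rank T = dim V ...
A_inv = np.linalg.inv(A)                # ... so T is invertible
assert np.allclose(A @ A_inv, np.eye(n))
```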
We have already seen a couple of examples where we have defined linear maps using matrices. However, the connection between matrices and linear maps is even stronger, as we will now explore.