3.3 The structure of the set of solutions

In this section we will study the general structure of the set of solutions to a system of linear equations, when it has solutions at all. In the next section we will then look at methods to actually solve a system of linear equations.

Definition 3.27: (S(A,b))
Let \(A\in M_{m,n}(\mathbb{R})\) and \(b\in \mathbb{R}^m\), then we set \[S(A,b):=\{x\in \mathbb{R}^n: Ax=b\} .\] This is a subset of \(\mathbb{R}^n\) and consists of all the solutions to the system of linear equations \(Ax=b\). If there are no solutions then \(S(A,b)=\emptyset\).

One often distinguishes between two types of systems of linear equations based on their constant terms.

Definition 3.28: (Homogeneous and inhomogeneous)
The system of linear equations \(Ax=b\) is called homogeneous if \(b=\mathbf{0}\), i.e., if it is of the form \[Ax=\mathbf{0}.\] If \(b\ne \mathbf{0}\) the system is called inhomogeneous.

A homogeneous system always has at least the trivial solution \(x=\mathbf{0}\). An inhomogeneous system, by contrast, does not necessarily have a solution. But for those which do have a solution we can determine the structure of the set of solutions. The key observation is that if we have one solution, say \(x_0\in \mathbb{R}^n\) which satisfies \(Ax_0=b\), then we can create further solutions by adding solutions of the corresponding homogeneous system \(Ax=\mathbf{0}\): if \(Ax=\mathbf{0}\), then \[A(x_0+x)=Ax_0+Ax=b+\mathbf{0}=b,\] and so \(x_0+x\) is another solution to the inhomogeneous system.
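The observation can be checked numerically. The following is a minimal sketch in Python; the \(2\times 3\) system and the helper names `matvec` and `add` are assumptions made for illustration and do not come from the text.

```python
def matvec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def add(x, y):
    """Return the componentwise sum of two vectors of equal length."""
    return [xi + yi for xi, yi in zip(x, y)]


# Hypothetical 2x3 system, chosen only for illustration.
A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]

x0 = [2, 0, 3]    # a particular solution: A x0 = b
x = [1, -1, 1]    # a solution of the homogeneous system: A x = 0

assert matvec(A, x0) == b
assert matvec(A, x) == [0, 0]

# By linearity, A(x0 + x) = A x0 + A x = b + 0 = b.
assert matvec(A, add(x0, x)) == b
```

Replacing `x` by any multiple of itself gives yet another solution, since multiples of a homogeneous solution are again homogeneous solutions.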

Theorem 3.29:

Let \(A\in M_{m,n}(\mathbb{R})\) and \(b\in \mathbb{R}^m\) and assume there exists \(x_0\in \mathbb{R}^n\) with \(Ax_0=b\). Then \[S(A,b)=\{x_0\}+S(A,\mathbf{0}):=\{x_0+x: x\in S(A,\mathbf{0})\}.\]

Proof.

As we noticed above, if \(x\in S(A,\mathbf{0})\), then \(A(x_0+x)=b\), hence \(\{x_0\}+S(A,\mathbf{0})\subseteq S(A,b)\).

On the other hand, if \(y\in S(A,b)\) then \(A(y-x_0)=Ay-Ax_0=b-b=\mathbf{0}\), and so \(y-x_0\in S(A,\mathbf{0})\). Therefore \(S(A,b)\subseteq \{x_0\}+S(A,\mathbf{0})\) and so \(S(A,b)=\{x_0\}+S(A,\mathbf{0})\).
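The second inclusion in the proof says that any two solutions of \(Ax=b\) differ by a solution of the homogeneous system. A small sketch of this step in Python follows; the matrix, the two solutions and the helper names `matvec` and `sub` are hypothetical and serve only as an illustration.

```python
def matvec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def sub(y, x):
    """Return the componentwise difference y - x."""
    return [yi - xi for yi, xi in zip(y, x)]


# Hypothetical 2x3 system, chosen only for illustration.
A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]

x0 = [2, 0, 3]    # one solution of A x = b
y = [0, 2, 1]     # a second solution of A x = b

assert matvec(A, x0) == b
assert matvec(A, y) == b

# The difference solves the homogeneous system: A(y - x0) = b - b = 0.
d = sub(y, x0)
assert matvec(A, d) == [0, 0]

# Hence y = x0 + d with d in S(A, 0).
assert [p + q for p, q in zip(x0, d)] == y
```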

Remarks:

  • The structure of the set of solutions is often described as follows: The general solution of the inhomogeneous system \(Ax=b\) is given by a special solution \(x_0\) to the inhomogeneous system plus a general solution to the corresponding homogeneous system \(Ax=\mathbf{0}\).

  • The case that there is a unique solution to \(Ax=b\) corresponds to \(S(A,\mathbf{0})=\{\mathbf{0}\}\), in which case \(S(A,b)=\{x_0\}\).

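At first sight the definition of the set \(\{x_0\}+ S(A,\mathbf{0})\) seems to depend on the choice of the particular solution \(x_0\) to \(Ax_0=b\). But this is not so: Theorem 3.29 holds for every particular solution, so for another choice \(y_0\) with \(Ay_0=b\) we have \(\{y_0\}+S(A,\mathbf{0})=S(A,b)=\{x_0\}+S(A,\mathbf{0})\). Another choice \(y_0\) just corresponds to a different labelling of the elements of the set.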

Example 3.30:

Let us look at an example of three equations with three unknowns: \[\begin{aligned} 3 x +z&= 0,\\ y-z& =1,\\ 3x+y&=1. \end{aligned}\] This set of equations corresponds to \[A=\begin{pmatrix}3 & 0 & 1\\ 0 & 1 &-1\\ 3 & 1 &0\end{pmatrix}\quad\text{and}\quad b=\begin{pmatrix}0\\1\\1\end{pmatrix}.\] To solve this set of equations we try to simplify it: if we subtract the first equation from the third, the third equation becomes \(y-z=1\), which is identical to the second equation. Hence the initial system of three equations is equivalent to the following system of two equations: \[3 x +z= 0 ,\quad y-z=1 .\] In the first one we can solve for \(x\) as a function of \(z\) and in the second for \(y\) as a function of \(z\), hence \[\begin{equation} x=-\frac{1}{3}z ,\quad y=1+z. \tag{3.7}\end{equation}\] So \(z\) is arbitrary, but once \(z\) is chosen, \(x\) and \(y\) are fixed, and the set of solutions is given by \[S(A,b)=\{(-z/3,1+z,z): z\in \mathbb{R}\} .\] A similar computation for the corresponding homogeneous system of equations \[\begin{aligned} 3 x +z&= 0,\\ y-z& =0,\\ 3x+y&=0 \end{aligned}\] gives us the solutions \(x=-z/3\), \(y=z\), and \(z\in \mathbb{R}\) arbitrary, hence \[S(A,\mathbf{0})=\{(-z/3,z,z): z\in \mathbb{R}\} .\]

A particular solution to the inhomogeneous system is given by choosing \(z=0\) in (3.7), i.e., \(x_0=(0,1,0)\), and then the relation \[S(A,b)=\{x_0\}+S(A,\mathbf{0})\] can be seen directly, since for \(x=(-z/3,z,z)\in S(A,\mathbf{0})\) we have \(x_0+x=(0,1,0)+(-z/3,z,z)=(-z/3,1+z,z)\) which was the general form of an element in \(S(A,b)\). But what happens if we choose another element of \(S(A,b)\)? Let \(\lambda\in \mathbb{R}\), then \(x_{\lambda}:=(-\lambda/3,1+\lambda,\lambda)\) is in \(S(A,b)\) and we again have \[S(A,b)=\{x_{\lambda}\}+S(A,\mathbf{0}) ,\] since \(x_{\lambda}+x=(-\lambda/3,1+\lambda ,\lambda)+(-z/3,z,z)=(-(\lambda+z)/3,1+(\lambda+z),(\lambda+z))\). Then if \(z\) runs through \(\mathbb{R}\) we again obtain the whole set \(S(A,b)\), independent of which \(\lambda\) we chose initially. The choice of \(\lambda\) only determines the way in which we label the elements in \(S(A,b)\).

Finally we should notice that the set \(S(A,\mathbf{0})\) is spanned by one vector, namely we have \((-z/3,z,z)=z(-1/3,1,1)\) and hence with \(v=(-1/3,1,1)\) we have \(S(A,\mathbf{0})=\operatorname{span}\{v\}\) and \[S(A,b)=\{x_{\lambda}\}+\operatorname{span}\{v\} .\]
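The computations in Example 3.30 can be verified with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the helper names `matvec` and `x_lam` are assumptions introduced for illustration, and the sample values of \(\lambda\) and \(z\) are arbitrary.

```python
from fractions import Fraction


def matvec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Matrix and right-hand side from Example 3.30.
A = [[3, 0, 1],
     [0, 1, -1],
     [3, 1, 0]]
b = [0, 1, 1]

# Spanning vector v = (-1/3, 1, 1) of S(A, 0).
v = [Fraction(-1, 3), Fraction(1), Fraction(1)]


def x_lam(lam):
    """Particular solution x_lambda = (-lambda/3, 1 + lambda, lambda)."""
    lam = Fraction(lam)
    return [-lam / 3, 1 + lam, lam]


# v solves the homogeneous system A x = 0.
assert matvec(A, v) == [0, 0, 0]

for lam in (0, 1, Fraction(-5, 2)):
    # Every x_lambda solves the inhomogeneous system A x = b.
    assert matvec(A, x_lam(lam)) == b
    for z in (-2, 0, Fraction(7, 3)):
        shifted = [p + z * q for p, q in zip(x_lam(lam), v)]
        # x_lambda + z v is again a solution ...
        assert matvec(A, shifted) == b
        # ... and it is the element of S(A, b) with parameter lambda + z.
        assert shifted == x_lam(lam + z)
```

The last assertion reflects the relabelling described above: starting from \(x_{\lambda}\) instead of \(x_0\) shifts the parameter by \(\lambda\) but produces the same set.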

In the next section we will develop systematic methods to solve large systems of linear equations.