1 Introduction to Matrices and Linear Systems
1.1 Linear Geometry: Vectors in Space and Angle & Length
Vectors in Space
Definition: A vector is a quantity with both a magnitude and a direction.
\vec{v} = \text{head} - \text{tail} = \begin{bmatrix} b_1 - a_1 \\ b_2 - a_2 \end{bmatrix}
Vectors or Scalars?
- 60 mph (scalar)
- 60 mph due East (vector)
- 100 lbs (vector)
- 5 kg (scalar)
NOTE: Vectors can exist independently of a coordinate system
Connection to Free Variables
Example: S = \{\begin{bmatrix} 1 \\ 2 \end{bmatrix}t \mid t\in\mathbb{R}\}
\begin{bmatrix} 1 \\ 2 \end{bmatrix} can be scaled to any length. We say S generates or spans a line in \mathbb{R}^2.
Example: S = \{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}t \mid t\in\mathbb{R}\}
Precalculus/Calculus Way: 2x + y + z = 4 \rightarrow x = 2 - \frac{1}{2}y - \frac{1}{2}z \Rightarrow (0,0,4), (0,4,0), (2,0,0)
Linear Algebra Way: S = \{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}s + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}t \mid s,t\in\mathbb{R}\}
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} and \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} span the xy-plane.
Vector Length
Definition: The length (magnitude) of a vector \vec{v}\in\mathbb{R}^n is the square root of the sum of the squares of its components. Let \vec{v}=\langle v_1,\dots,v_n\rangle; then \lVert\vec{v}\rVert = \sqrt{v_1^2+\dots+v_n^2}.
Angle between Vectors
Definition: The angle formed by two vectors is defined as the angle formed when they are joined tail-to-tail (subtended), and is denoted as \theta=\cos^{-1}(\frac{\vec{u}\cdot\vec{v}}{\lVert\vec{u}\rVert\lVert\vec{v}\rVert}).
Proof:
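A standard argument (sketch), using the Law of Cosines on the triangle with sides \vec{u}, \vec{v}, and \vec{u}-\vec{v}:
\lVert\vec{u}-\vec{v}\rVert^2 = \lVert\vec{u}\rVert^2 + \lVert\vec{v}\rVert^2 - 2\lVert\vec{u}\rVert\lVert\vec{v}\rVert\cos\theta
Expanding the left side with the dot product gives \lVert\vec{u}-\vec{v}\rVert^2 = (\vec{u}-\vec{v})\cdot(\vec{u}-\vec{v}) = \lVert\vec{u}\rVert^2 - 2\,\vec{u}\cdot\vec{v} + \lVert\vec{v}\rVert^2. Cancelling the common terms leaves \vec{u}\cdot\vec{v} = \lVert\vec{u}\rVert\lVert\vec{v}\rVert\cos\theta, and solving for \theta gives the formula above.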
1.2 Dot Product
Definition: The dot product of \vec{u},\vec{v}\in\mathbb{R}^n is \vec{u}\cdot\vec{v}=u_1v_1+u_2v_2+\dots+u_nv_n
Note: if \vec{u}\cdot\vec{v}=0 (with \vec{u},\vec{v}\neq\vec{0}), then \cos(\theta)=0, which implies \theta=\frac{\pi}{2} (such vectors are orthogonal).
Properties of Dot Product
\vec{u}\cdot\vec{v}=\vec{v}\cdot\vec{u}\;\;\;\; (dot products are commutative)
c(\vec{u}\cdot\vec{v})=(c\vec{u})\cdot\vec{v}=\vec{u}\cdot(c\vec{v})\;\;\;\; (scalars can be factored in and out of dot products)
\vec{u}\cdot(\vec{v}+\vec{w})=\vec{u}\cdot\vec{v}+\vec{u}\cdot\vec{w}\;\;\;\; (dot products distribute over addition)
if \vec{u},\vec{v}\neq\vec{0} and \vec{u}\cdot\vec{v}=0 then \vec{u} is orthogonal to \vec{v}
\vec{u}\cdot\vec{u}=\lVert\vec{u}\rVert^2\;\;\;\;
Examples: Let \vec{u}=\begin{bmatrix} 3\\1 \end{bmatrix}, \vec{v}=\begin{bmatrix} -1\\3 \end{bmatrix}, \vec{w}=\begin{bmatrix} 2\\5 \end{bmatrix}
\vec{u}\cdot\vec{w}=3(2)+1(5)=11=2(3)+5(1)=\vec{w}\cdot\vec{u}
2(\vec{u}\cdot\vec{w})=2(11)=22, \;\; (2\vec{u})\cdot\vec{w}=6(2)+2(5)=22, \;\; \vec{u}\cdot(2\vec{w})=3(4)+1(10)=22
\vec{u}\cdot\vec{v}=3(-1)+1(3)=0\;\;\;\;\therefore\theta=\frac{\pi}{2}
\vec{u}\cdot\vec{u}=3(3)+1(1)=3^2+1^2=\lVert\vec{u}\rVert^2
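These computations are easy to check numerically. Below is a short NumPy sketch (using the vectors \vec{u}, \vec{v}, \vec{w} from the example above) verifying the dot-product properties and the angle formula.

```python
import numpy as np

u = np.array([3.0, 1.0])
v = np.array([-1.0, 3.0])
w = np.array([2.0, 5.0])

# Commutativity: u.w == w.u
print(np.dot(u, w), np.dot(w, u))                             # 11.0 11.0

# Scalars factor out: 2(u.w) == (2u).w == u.(2w)
print(2 * np.dot(u, w), np.dot(2 * u, w), np.dot(u, 2 * w))   # 22.0 22.0 22.0

# Orthogonality: u.v == 0, so the angle is pi/2
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.arccos(cos_theta), np.pi / 2)                        # both 1.5707963...

# u.u equals ||u||^2 (up to floating-point rounding)
print(np.dot(u, u), np.linalg.norm(u) ** 2)                   # both 10
```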
Parallel Vectors: \vec{u}\parallel\vec{v} if their directions are either the same or opposite. In other words, \vec{u}\parallel\vec{v}\Leftrightarrow c\vec{u}=\vec{v} for some nonzero c\in\mathbb{R}
Example: Let \vec{u}=\langle2,2,2\rangle and \vec{v}=\langle-1,-1,-1\rangle; then \vec{u}\parallel\vec{v} because \vec{u}=-2\vec{v}.
Theorem [Cauchy-Schwarz Inequality]: For any \vec{u},\vec{v}\in\mathbb{R}^n, \lvert\vec{u}\cdot\vec{v}\rvert\leq\lVert\vec{u}\rVert\lVert\vec{v}\rVert.
Proof:
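A standard argument (sketch): for any t \in \mathbb{R},
0 \leq \lVert\vec{u}+t\vec{v}\rVert^2 = (\vec{u}+t\vec{v})\cdot(\vec{u}+t\vec{v}) = \lVert\vec{u}\rVert^2 + 2t\,(\vec{u}\cdot\vec{v}) + t^2\lVert\vec{v}\rVert^2.
This quadratic in t is never negative, so its discriminant satisfies 4(\vec{u}\cdot\vec{v})^2 - 4\lVert\vec{u}\rVert^2\lVert\vec{v}\rVert^2 \leq 0, which gives \lvert\vec{u}\cdot\vec{v}\rvert \leq \lVert\vec{u}\rVert\lVert\vec{v}\rVert. (If \vec{v}=\vec{0}, both sides are 0.)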
Theorem [Triangle Inequality]: For any \vec{u},\vec{v}\in\mathbb{R}^n, \lVert\vec{u}+\vec{v}\rVert\leq\lVert\vec{u}\rVert+\lVert\vec{v}\rVert.
Proof:
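A standard argument (sketch), using the Cauchy-Schwarz Inequality:
\lVert\vec{u}+\vec{v}\rVert^2 = \lVert\vec{u}\rVert^2 + 2\,\vec{u}\cdot\vec{v} + \lVert\vec{v}\rVert^2 \leq \lVert\vec{u}\rVert^2 + 2\lVert\vec{u}\rVert\lVert\vec{v}\rVert + \lVert\vec{v}\rVert^2 = (\lVert\vec{u}\rVert+\lVert\vec{v}\rVert)^2,
and taking square roots of both (nonnegative) sides gives the result.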
1.3 Solving Linear Equations
Gauss’s Method
Definition: A linear combination of x_1, \dots, x_n is of the form a_1x_1 + \dots + a_nx_n where the a_i are coefficients.
An n-tuple (s_1, \dots, s_n) is a solution of (or satisfies) a system of equations if substituting x_1 = s_1, \dots, x_n = s_n makes every equation true:
\begin{matrix} a_{1,1}x_1 + \dots + a_{1,n}x_n = d_1 \\ \vdots \\ a_{m,1}x_1 + \dots + a_{m,n}x_n = d_m \end{matrix}
Example: S = \{\sin x, \cos x\}
4\sin x - 3\cos x is a linear combination of \sin x, \cos x
3(\sin x)^2 - 3\cos\sqrt{x} is NOT a linear combination of \sin x, \cos x
Gauss’ Method
Elementary row operations are:
Swapping Equations
Multiplication by nonzero constant \leftarrow (Re-scaling)
Replace an equation with the sum of itself and a multiple of another equation \leftarrow (Row Combination)
Theorem: Performing elementary row operations on a system of equations does not change the solution set.
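As a concrete illustration, here is a short NumPy sketch (the matrix and operations are just an example) applying the three elementary row operations to an augmented matrix.

```python
import numpy as np

# Augmented matrix [A | d] for an example system (values are illustrative)
M = np.array([[2.0, 1.0, 5.0],
              [1.0, 3.0, 10.0]])

# 1. Swap rows I and II
M[[0, 1]] = M[[1, 0]]

# 2. Re-scale: multiply row I by a nonzero constant
M[0] = 2.0 * M[0]

# 3. Row combination: replace row II with (row II) + (-1) * (row I)
M[1] = M[1] + (-1.0) * M[0]

print(M)  # the solution set of the corresponding system is unchanged
```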
General = Particular + Homogeneous
Theorem: Any linear system has general solution of the form:
\{\vec{p} + c_1\vec{\beta_1}+\dots+c_k\vec{\beta_k} \mid c_1,\dots,c_k \in\mathbb{R}\}
where k = Number of free variables, \vec{\beta_1}, ... , \vec{\beta_k} are the vectors associated with free variables, and \vec{p} is a particular solution (vector of constants).
Definition: A linear system is homogeneous if every equation is set equal to 0 (all constant terms are zero).
A homogeneous system is always guaranteed to have at least one solution: the trivial solution (x_1 = x_2 = \dots = 0)
For homogeneous systems \vec{p}=\vec{0}
Theorem: Any linear system has general solution of the form: \{\vec{p} + \vec{h}\}
where \vec{p} is a particular solution and \vec{h} satisfies the associated homogeneous system. This theorem suggests that \vec{h} = c_1\vec{\beta_1} + \dots + c_k\vec{\beta_k}; in other words, the free-variable terms satisfy the associated homogeneous system.
Theorem: Any system of linear equations must have either
Exactly 1 solution
No solutions
Infinitely many solutions
Definitions:
A square matrix is a matrix with the same number of rows and columns (n = m).
A square matrix is non-singular if it is the matrix of coefficients of a homogeneous system with a unique solution.
A square matrix is singular if it is not non-singular.
Example:
A = \begin{bmatrix} 2 & 1\\ -1 & 3\\ \end{bmatrix} \Rightarrow \begin{bmatrix} 2 & 1\\ 0 & 7\\ \end{bmatrix} \;\; (I + 2II, replace II)
Associated System: 2x+y=0
-x+3y=0
\therefore y=x=0
\therefore A is non-singular
Example:
B = \begin{bmatrix} 2 & 2\\ 1 & 1 \\ \end{bmatrix} \Rightarrow \begin{bmatrix} 2 & 2\\ 0 & 0\\ \end{bmatrix} \;\; (-\frac{1}{2}I + II, replace II)
The associated homogeneous system 2x + 2y = 0 has y as a free variable, so it has infinitely many solutions. \therefore B is singular
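Singularity can also be tested numerically. Here is a small NumPy sketch (using the matrices A and B from the examples above) based on the rank of the coefficient matrix: a square matrix is non-singular exactly when its rank equals its size, i.e. when the homogeneous system has only the trivial solution.

```python
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
B = np.array([[2.0, 2.0], [1.0, 1.0]])

print(np.linalg.matrix_rank(A))  # 2 -> non-singular
print(np.linalg.matrix_rank(B))  # 1 -> singular
```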
1.4 Row Echelon Form & Reduced Row Echelon Form
Definition: A system (or matrix) is in row echelon form when the leading variable of each row is to the right of the leading variable of the row above it, AND all rows of zeros are at the bottom.
Example:
x - y = 0
2x - 2y + z + 2w = 4
y + w = 0
2z + w = 5
-2I + II, replace II
II swap III
x - y = 0
y + w = 0
z + 2w = 4
2z + w = 5
-2III + IV, replace IV
x - y = 0
y + w = 0
z + 2w = 4
-3w = -3
w=1
z=2
y=-1
x=-1
Solution: (-1, -1, 2, 1)
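The same answer can be checked numerically; below is a NumPy sketch that solves the original 4×4 system above directly.

```python
import numpy as np

# Coefficient matrix and right-hand side for the system above
A = np.array([[1.0, -1.0, 0.0, 0.0],
              [2.0, -2.0, 1.0, 2.0],
              [0.0,  1.0, 0.0, 1.0],
              [0.0,  0.0, 2.0, 1.0]])
d = np.array([0.0, 4.0, 0.0, 5.0])

print(np.linalg.solve(A, d))  # [-1. -1.  2.  1.]
```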
1.4.1 Reduced Row Echelon Form
*Gauss-Jordan reduction is an extension of Gauss’ method.
Definition: A matrix is in reduced row echelon form (RREF) if the matrix is in REF and:
All nonzero leading entries are 1.
Any rows containing all 0’s are below all nonzero rows.
Each leading 1 is the only nonzero entry in its column (every other entry in that column, above and below, is 0).
Reminders
Definition: (i) Elementary row operations are: 1. Swapping, 2. Row Scaling, 3. Row Replacement (linear combination of rows)
Theorem: (ii) Elementary row operations are reversible
Definition: (iii) Two matrices are row equivalent if one can be reduced (changed) to the other by a sequence of elementary row operations.
Pivot Positions: location of a leading entry in RREF matrix.
Example: Compute RREF(A) if A = \begin{bmatrix} 2 & 6 & 1 & 2 & 5 \\ 0 & 3 & 1 & 4 & 1 \\ 0 & 3 & 1 & 2 & 5 \end{bmatrix}
-II + III \Rightarrow \begin{bmatrix} 2 & 6 & 1 & 2 & | & 5 \\ 0 & 3 & 1 & 4 & | & 1 \\ 0 & 0 & 0 & -2 & | & 4 \end{bmatrix}
(1/2)I, (1/3)II, (-1/2)III \Rightarrow \begin{bmatrix} 1 & 3 & 1/2 & 1 & | & 5/2 \\ 0 & 1 & 1/3 & 4/3 & | & 1/3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}
II - (4/3)III, I - III \Rightarrow \begin{bmatrix} 1 & 3 & 1/2 & 0& | & 9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}
(-3)II + I \Rightarrow \begin{bmatrix} 1 & 0 & -1/2 & 0& | & -9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix} = \mathrm{RREF}(A)
Columns 1, 2 & 4 are pivot columns and a_{11}, a_{22}, a_{34} are the pivots.
\begin{bmatrix} 2 & 6 & 1 & 2 & | & 5 \\ 0 & 3 & 1 & 4 & | & 1 \\ 0 & 0 & 0 & -2 & | & 4 \end{bmatrix} is row equivalent to \begin{bmatrix} 1 & 0 & -1/2 & 0& | & -9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}
Corresponding system of equations:
2x+6y+z+2w = 5 \Rightarrow x-(1/2)z = -9/2
3y+z+4w = 1 \Rightarrow y+(1/3)z = 3
3y+z+2w = 5 \Rightarrow w = -2
z is a free variable.
S = \{\begin{bmatrix} -9/2\\ 3\\ 0\\ -2 \end{bmatrix} + \begin{bmatrix} 1/2\\ -1/3\\ 1\\ 0 \end{bmatrix}z \mid z \in\mathbb{R}\}
Important Note: The RREF of a matrix is unique. There’s only 1 RREF of a matrix.
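Symbolic tools compute the RREF directly. A short SymPy sketch (using the augmented matrix from the example above):

```python
from sympy import Matrix

# Augmented matrix [A | d] from the example above
M = Matrix([[2, 6, 1, 2, 5],
            [0, 3, 1, 4, 1],
            [0, 3, 1, 2, 5]])

rref_M, pivot_cols = M.rref()
print(rref_M)      # Matrix([[1, 0, -1/2, 0, -9/2], [0, 1, 1/3, 0, 3], [0, 0, 0, 1, -2]])
print(pivot_cols)  # (0, 1, 3) -> columns 1, 2 and 4 (SymPy indexes columns from 0)
```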
Equivalence Relations
Definition: A relation ~ on a set is an equivalence relation if the relation is reflexive, symmetric, and transitive.
Example 1: Equality:
A = A \Rightarrow Reflexive
if A = B then B = A \Rightarrow Symmetric
if A = B and B = C then A = C \Rightarrow Transitive
Example 2: Is "has the same birthday as" an equivalence relation?
Yes: reflexivity, symmetry, and transitivity all hold.
Example 3: Is "is friends with" an equivalence relation?
No; any of the conditions could fail.
Example 4: S = \mathbb{Z}, relation: R = \{ x ~ y \mid x and y have the same parity\}
Reflexive: if x is odd then it is odd (same for even) \therefore x ~ x
Symmetric: if x ~ y then y ~ x
Transitive: if x ~ y and y ~ z then x ~ z (all three have the same parity)
Example 5: Let S be the set of n\times m matrices
R = \{A ~ B \mid A can be reduced to B by elementary row operations\}
This is an equivalence relation (row equivalence): since elementary row operations are reversible, the relation is reflexive, symmetric, and transitive.
Supporting Theorems
Linear Combination Lemma:
A linear combination of linear combinations is a linear combination.
Proof: Let S = \{x_1,...,x_n\}. Consider the following linear combinations of S:
c_{11}x_{1} + \dots + c_{1n}x_{n},\;\;\dots,\;\;c_{m1}x_{1} + \dots + c_{mn}x_{n}
By definition, a linear combination of these combinations is:
d_{1}(c_{11}x_1+...+c_{1n}x_n) +...+d_m(c_{m1}x_1+...+c_{mn}x_n) = d_1c_{11}x_1+...+d_1c_{1n}x_n +...+d_mc_{m1}x_1+...+d_mc_{mn}x_n
= (d_1c_{11}+\dots+d_mc_{m1})x_1 + \dots + (d_1c_{1n}+\dots+d_mc_{mn})x_n, which is a linear combination of x_1,\dots,x_n.
In a REF matrix, no nonzero row is a linear combination of the other rows:
Example: B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 3 & 2 \\ 0 & 0 & 5 \end{bmatrix} is a REF matrix.
c_1(II) + c_2(III) \neq (I) since b_{21} = 0 = b_{31}, so the first entry of any such combination is 0 \neq 1 = b_{11}
c_1(I) + c_2(II) \neq (III) since matching the first column forces c_1 = 0, then matching the second column forces c_2 = 0, leaving the third entry 0 \neq 5 = b_{33}
c_1(I) + c_2(III) \neq (II) since matching the first column forces c_1 = 0, but matching the second column requires c_1 = 3
1.5 Determinants and Inverse Matrices
1.5.1 Determinants
The Determinant as a Function
The determinant of a matrix is a function of the form f: \mathbb{R}^{n \times n} \rightarrow \mathbb{R} (it is defined only for square matrices)
Example: \mathrm{det}(\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}) = 0
Background for Understanding the Function \mathrm{det}()
Permutation: Given \{1, 2, ..., n\}, a permutation is an arrangement of these integers without omissions or repetitions.
Example: Let S = \{1, 2, 3\}. The permutations of S are (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1).
Inversion: Any time a larger integer precedes a smaller integer in a permutation.
Example: The permutation (6, 1, 3, 4, 5, 2) has 5+0+1+1+1=8 inversions.
Odd or Even Permutation: A permutation is even if its number of inversions is even, and odd otherwise.
Example: the permutation (6, 1, 3, 4, 5, 2) is even because there are 8 inversions.
Elementary Products: From an n\times n matrix A, an elementary product is a product of n entries of A, no two of which come from the same row or the same column (so the column indices form a permutation of \{1,\dots,n\}).
Let A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
Definition of Determinant Function
Let A be a square matrix. The determinant of A denoted \mathrm{det}(A), is the sum of all signed elementary products.
Example: \mathrm{det}(\begin{bmatrix} 4 & -1 \\ 3 & 2 \end{bmatrix}) = 8 - (-3) = 11 = a_{11}a_{22} - a_{12}a_{21}
Use the signed elementary products to verify the formula below
Example: \mathrm{det}(\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}
Trick: Remember 3\times3 determinant via sum of signed elementary products. (NOTE only valid for 3\times3 matrices)
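A quick numerical sanity check of the 3×3 formula (a NumPy sketch; the matrix is the one used in the cofactor-expansion example later in this section):

```python
import numpy as np

a = np.array([[3.0, 1.0, 0.0],
              [-2.0, -4.0, 3.0],
              [5.0, 4.0, -2.0]])

# Sum of signed elementary products for a 3x3 matrix
det_formula = (a[0,0]*a[1,1]*a[2,2] + a[0,1]*a[1,2]*a[2,0] + a[0,2]*a[1,0]*a[2,1]
               - a[0,0]*a[1,2]*a[2,1] - a[0,1]*a[1,0]*a[2,2] - a[0,2]*a[1,1]*a[2,0])

print(det_formula)       # -1.0
print(np.linalg.det(a))  # -1.0 (up to floating point)
```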
Evaluating Determinants by Row Reduction
Theorem 1: If A is a square matrix containing a row of zeros then \mathrm{det}(A) = 0
This is so because each signed elementary product contains one factor from each row, so each product includes a zero factor and is therefore zero.
Theorem 2: If A is an n \times n triangular matrix (upper, lower, or diagonal) then \mathrm{det}(A)= a_{11} \cdot a_{22} \cdot \dots \cdot a_{nn}, the product of the diagonal entries.
This is so because every signed elementary product other than the one formed from the diagonal entries contains a zero factor.
Theorem 3: Let A be an n\times n matrix
- Row scaling scales \mathrm{det}(A): If A' is constructed by multiplying one row of A by c \in \mathbb{R} then \mathrm{det}(A') = c\,\mathrm{det}(A)
- Row swaps flip the sign of \mathrm{det}(A): If A' is constructed by interchanging 2 rows of A then \mathrm{det}(A') = -\mathrm{det}(A)
- Row replacement doesn't change \mathrm{det}(A): If A' is constructed by replacing a row of A with the sum of itself and a multiple of another row then \mathrm{det}(A') = \mathrm{det}(A)
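These three effects are easy to verify numerically. A NumPy sketch (the matrix is just an example):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])        # det(A) = -2

# Scaling a row by c scales the determinant by c
A_scaled = A.copy(); A_scaled[0] *= 5.0
print(np.linalg.det(A), np.linalg.det(A_scaled))  # -2.0, -10.0

# Swapping two rows flips the sign
A_swapped = A[[1, 0]]
print(np.linalg.det(A_swapped))                   # 2.0

# Adding a multiple of one row to another leaves the determinant unchanged
A_replaced = A.copy(); A_replaced[1] += 7.0 * A_replaced[0]
print(np.linalg.det(A_replaced))                  # -2.0
```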
Properties of the Determinant Function
Theorem 1: If A is a square matrix then \mathrm{det}(A) = \mathrm{det}(A^T)
Theorem 2: \mathrm{det}(kA) = k^n\mathrm{det}(A) where A is n\times n and k \in \mathbb{R}
Theorem 3: A is invertible if and only if \mathrm{det}{A} \neq 0
NOTE: In general, \mathrm{det}(A + B) \neq \mathrm{det}(A) + \mathrm{det}(B)
1.5.2 Cofactor Expansion
Definition: Cofactor Expansion is a computational method to evaluate determinants.
Let A be a square matrix. The minor of a_{ij}, denoted by M_{ij}, is the determinant of the submatrix that remains after deleting the ith row and jth column from A. The number C_{ij} = (-1)^{i+j}M_{ij} is called the cofactor of a_{ij}
Cofactor Expansion Method
Recall: \mathrm{det}(A) = \mathrm{det}(\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}
Which we can manipulate to get the following:
= a_{11}(a_{22}a_{33} - a_{23}a_{32}) + a_{21}(a_{13}a_{32} - a_{12}a_{33}) + a_{31}(a_{12}a_{23} - a_{13}a_{22})
= a_{11}C_{11} + a_{21}C_{21} + a_{31}C_{31} - this is cofactor expansion along the 1st column
NOTE: we can expand along any column OR row
Example: A = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix}
\therefore \mathrm{det}(A) = 3\begin{vmatrix}-4 & 3 \\ 4 & -2 \end{vmatrix} - (-2)\begin{vmatrix} 1 & 0 \\ 4 & -2 \end{vmatrix} + 5\begin{vmatrix} 1 & 0 \\ -4 & 3 \end{vmatrix}
= 3(8 - 12) + 2(-2 - 0) + 5(3-0)
=3(-4) - 4 + 15
=-12 -4 + 15 = -1
Try expanding along a different row or column.
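Cofactor expansion translates directly into a short recursive function. A Python sketch (expanding along the first column, as in the example above):

```python
def det(A):
    """Determinant by cofactor expansion along the first column."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for i in range(n):
        # Minor M_{i1}: delete row i and the first column
        minor = [row[1:] for k, row in enumerate(A) if k != i]
        cofactor = (-1) ** i * det(minor)   # sign (-1)^(i+j) with j fixed at the first column
        total += A[i][0] * cofactor
    return total

A = [[3, 1, 0],
     [-2, -4, 3],
     [5, 4, -2]]
print(det(A))  # -1
```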
1.5.3 Inverses of Matrices
Definition:
- An identity matrix, denoted I_n, is an n\times n matrix with the value 1 in every main diagonal entry and zeros everywhere else.
Examples:
I_2 = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}
I_3 = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}
Unique Property - identity matrix acts like the number one: AI = IA = A
The inverse of an n \times n matrix A, denoted A^{-1}, is a matrix such that AA^{-1} = A^{-1}A = I_n
A matrix with an inverse is called invertible
Example:
A = \begin{bmatrix}1 & 2 \\ 1 & 3 \end{bmatrix}, M = \begin{bmatrix}3 & -2 \\ -1 & 1 \end{bmatrix}
AM = \begin{bmatrix}1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix}3 & -2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}
\therefore M = A^{-1}
Theorem: If B and C are inverses of A then B=C.
Proof: BA = I
(BA)C = IC
(BA)C = C
B(AC) = C
BI = C
B = C
Theorem: If A and B are invertible n\times n matrices then
AB is invertible
(AB)^{-1} = B^{-1}A^{-1}
Proof:
(AB)(B^{-1}A^{-1})
=A(BB^{-1})A^{-1}
=AIA^{-1}
=AA^{-1}
=I \therefore (AB)^{-1} = B^{-1}A^{-1}
Theorem: If A is invertible then
(A^{-1})^{-1} = A
(A^n)^{-1} = (A^{-1})^n for n \in \mathbb{N}
(kA)^{-1} = \frac{1}{k}A^{-1} for k \neq 0
Proof:
(kA)(\frac{1}{k} A^{-1}) = \frac{1}{k} (kA) A^{-1}
= \frac{k}{k}A A^{-1}
= 1 \cdot I
= I
\therefore (kA)^{-1} = \frac{1}{k}A^{-1}
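A numerical spot-check of these identities (a NumPy sketch; the matrices are just examples):

```python
import numpy as np

A = np.array([[1.0, 2.0], [1.0, 3.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])

# (AB)^{-1} == B^{-1} A^{-1}
print(np.linalg.inv(A @ B))
print(np.linalg.inv(B) @ np.linalg.inv(A))

# (kA)^{-1} == (1/k) A^{-1}
k = 3.0
print(np.linalg.inv(k * A))
print((1.0 / k) * np.linalg.inv(A))
```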
How to Compute the Inverse of a Matrix
Recall: Elementary row operations are reversible.
Procedure for Computing the Inverse
Adjoin I to A, forming the augmented matrix [A \mid I]
Row reduce A to RREF, applying the same operations to the right-hand side
If the left-hand side reduces to I, the resulting right-hand side is A^{-1}
Example: Compute the inverse of A = \begin{bmatrix}1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8\end{bmatrix}
[A | I]
= \begin{bmatrix}1 & 2 & 3 & | & 1 & 0 & 0 \\ 2 & 5 & 3 & | & 0 & 1 & 0\\ 1 & 0 & 8 & | & 0 & 0 & 1\end{bmatrix}
-2I + II and -I + III \rightarrow
\begin{bmatrix}1 & 2 & 3 & | & 1 & 0 & 0 \\ 0 & 1 & -3 & | & -2 & 1 & 0\\ 0 & -2 & 5 & | & -1 & 0 & 1\end{bmatrix}
2II + III \rightarrow
...A^{-1} = \begin{bmatrix}-40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1\end{bmatrix}
Comment: If the row operations produce a row of zeros on the left-hand side, then A is not invertible.
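A NumPy check of the result above (a sketch verifying the inverse computed by Gauss-Jordan reduction):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 5.0, 3.0],
              [1.0, 0.0, 8.0]])

A_inv = np.linalg.inv(A)
print(np.round(A_inv))                    # [[-40. 16. 9.] [13. -5. -3.] [5. -2. -1.]]
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```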