2.1 Matrix Theory

\[\begin{equation} \begin{split} A= \left[\begin{array} {cc} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{array} \right] \end{split} \end{equation}\]

\[\begin{equation} \begin{split} A' = \left[\begin{array} {cc} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{array} \right] \end{split} \end{equation}\]

\[ \mathbf{(ABC)'=C'B'A'} \\ \mathbf{A(B+C)= AB + AC} \\ \mathbf{AB \neq BA} \\ \mathbf{(A')'=A} \\ \mathbf{(A+B)' = A' + B'} \\ \mathbf{(AB)' = B'A'} \\ \mathbf{(AB)^{-1}= B^{-1}A^{-1}} \\ \mathbf{A+B = B +A} \\ \mathbf{AA^{-1} = I } \]
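These identities (which assume conformable dimensions and, where an inverse appears, invertibility) can be spot-checked numerically. A minimal sketch in Python with NumPy; the random 3 x 3 matrices are arbitrary illustrations, not from the text:

```python
# Numerical spot-check of the identities above (NumPy; the matrices are
# arbitrary invertible examples).
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(3, 3, 3))  # three random 3 x 3 matrices

print(np.allclose((A @ B @ C).T, C.T @ B.T @ A.T))        # (ABC)' = C'B'A'
print(np.allclose(A @ (B + C), A @ B + A @ C))            # A(B+C) = AB + AC
print(np.allclose(A @ B, B @ A))                          # generally False: AB != BA
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # (AB)^{-1} = B^{-1}A^{-1}
print(np.allclose(A @ np.linalg.inv(A), np.eye(3)))       # A A^{-1} = I
```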

If A has an inverse, it is called invertible. If A is not invertible, it is called singular.


\[\begin{equation} \begin{split} \mathbf{AB} &= \left(\begin{array} {ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{array}\right) \left(\begin{array} {ccc} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \\ \end{array}\right) \\ &= \left(\begin{array} {ccc} a_{11}b_{11}+a_{12}b_{21}+a_{13}b_{31} & \sum_{i=1}^{3}a_{1i}b_{i2} & \sum_{i=1}^{3}a_{1i}b_{i3} \\ \sum_{i=1}^{3}a_{2i}b_{i1} & \sum_{i=1}^{3}a_{2i}b_{i2} & \sum_{i=1}^{3}a_{2i}b_{i3} \\ \end{array}\right) \end{split} \end{equation}\]
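A quick way to see the entry-wise formula is to check one entry of the product directly. A minimal NumPy sketch; the particular 2 x 3 and 3 x 3 matrices are arbitrary:

```python
# Check that entry (i, j) of AB equals sum_k a_{ik} b_{kj}.
import numpy as np

A = np.arange(1, 7).reshape(2, 3)     # 2 x 3
B = np.arange(1, 10).reshape(3, 3)    # 3 x 3
AB = A @ B

i, j = 0, 1
print(AB[i, j] == sum(A[i, k] * B[k, j] for k in range(3)))  # True
```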

Let \(\mathbf{a}\) be a 3 x 1 vector and \(\mathbf{B}\) a 3 x 3 matrix; then the quadratic form is

\[ \mathbf{a'Ba} = \sum_{i=1}^{3}\sum_{j=1}^{3}a_i b_{ij} a_{j} \]

Length of a vector
Let \(\mathbf{a}\) be a vector. Its length \(||\mathbf{a}||\) (the 2-norm of the vector) is the square root of the inner product of the vector with itself:

\[ ||\mathbf{a}|| = \sqrt{\mathbf{a'a}} \]
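Both the quadratic form and the 2-norm can be verified element-wise. A minimal NumPy sketch; the vector and matrix values are arbitrary illustrations:

```python
# Quadratic form as a double sum, and the 2-norm as sqrt(a'a).
import numpy as np

a = np.array([1.0, 2.0, 3.0])
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

quad = a @ B @ a                          # a'Ba
double_sum = sum(a[i] * B[i, j] * a[j]
                 for i in range(3) for j in range(3))
print(np.isclose(quad, double_sum))       # True

print(np.isclose(np.linalg.norm(a), np.sqrt(a @ a)))  # ||a|| = sqrt(a'a)
```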

2.1.1 Rank

  • Dimension of space spanned by its columns (or its rows).
  • Number of linearly independent columns/rows

For an n x k matrix A and a k x k matrix B (see the numerical sketch after this list):

  • \(rank(A)\leq min(n,k)\)
  • \(rank(A) = rank(A') = rank(A'A)=rank(AA')\)
  • \(rank(AB)\leq min(rank(A),rank(B))\), with equality when B is non-singular
  • B is invertible if and only if rank(B) = k (non-singular)
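A numerical sketch of these rank properties (NumPy; the example matrices are arbitrary):

```python
# Rank properties illustrated numerically.
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])   # 2 x 3, rank 1 (second row = 2 * first row)
B = np.eye(3)                  # 3 x 3, rank 3 (invertible)

print(np.linalg.matrix_rank(A))                      # 1 <= min(2, 3)
print(np.linalg.matrix_rank(A) ==
      np.linalg.matrix_rank(A.T @ A))                # rank(A) = rank(A'A)
print(np.linalg.matrix_rank(A @ B) <=
      min(np.linalg.matrix_rank(A),
          np.linalg.matrix_rank(B)))                 # rank(AB) <= min(rank A, rank B)
```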


2.1.2 Inverse

For a scalar a, if a = 0 then 1/a does not exist. For matrices, being non-zero is not enough: a square matrix has an inverse only when it is non-singular (full rank).

A square matrix A is invertible if there exists a square matrix B such that \[AB=BA=I\] in which case \(A^{-1}=B\). For a 2x2 matrix,

\[ A = \left(\begin{array}{cc} a & b \\ c & d \\ \end{array} \right) \]

\[ A^{-1}= \frac{1}{ad-bc} \left(\begin{array}{cc} d & -b \\ -c & a \\ \end{array} \right) \]

provided \(ad-bc \neq 0\).
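A quick check of the 2 x 2 formula against a numerical inverse (NumPy sketch; the entries are arbitrary, chosen so that \(ad-bc \neq 0\)):

```python
# 2 x 2 inverse via the adjugate formula vs. numpy.linalg.inv.
import numpy as np

a, b, c, d = 4.0, 7.0, 2.0, 6.0
A = np.array([[a, b], [c, d]])
A_inv_formula = (1.0 / (a * d - b * c)) * np.array([[d, -b], [-c, a]])

print(np.allclose(A_inv_formula, np.linalg.inv(A)))  # True
```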

For a partitioned matrix,

\[\begin{equation} \begin{split} \left[\begin{array} {cc} A & B \\ C & D \\ \end{array} \right]^{-1} = \left[\begin{array} {cc} \mathbf{(A-BD^{-1}C)^{-1}} & \mathbf{-(A-BD^{-1}C)^{-1}BD^{-1}}\\ \mathbf{-D^{-1}C(A-BD^{-1}C)^{-1}} & \mathbf{D^{-1}+D^{-1}C(A-BD^{-1}C)^{-1}BD^{-1}} \\ \end{array} \right] \end{split} \end{equation}\]
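The block formula can be verified numerically as well. A minimal NumPy sketch, assuming the random blocks are such that \(D\) and \(A-BD^{-1}C\) are invertible:

```python
# Verify the partitioned-inverse formula on a random block matrix.
import numpy as np

rng = np.random.default_rng(1)
A, B, C, D = rng.normal(size=(4, 2, 2))   # four arbitrary 2 x 2 blocks
M = np.block([[A, B], [C, D]])

Dinv = np.linalg.inv(D)
S = np.linalg.inv(A - B @ Dinv @ C)        # (A - B D^{-1} C)^{-1}
M_inv_formula = np.block([[S,              -S @ B @ Dinv],
                          [-Dinv @ C @ S,  Dinv + Dinv @ C @ S @ B @ Dinv]])

print(np.allclose(M_inv_formula, np.linalg.inv(M)))  # True
```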


Properties for a non-singular square matrix

  • \(\mathbf{(A^{-1})^{-1}=A}\)
  • for a non-zero scalar b, \(\mathbf{(bA)^{-1}=b^{-1}A^{-1}}\)
  • for a non-singular matrix B, \(\mathbf{(BA)^{-1}=A^{-1}B^{-1}}\)
  • \(\mathbf{(A^{-1})'=(A')^{-1}}\)
  • Never notate \(\mathbf{1/A}\)


2.1.3 Definiteness

A symmetric square k x k matrix, \(\mathbf{A}\), is Positive Semi-Definite if for any non-zero k x 1 vector \(\mathbf{x}\), \[\mathbf{x'Ax \geq 0 }\]

A symmetric square k x k matrix, \(\mathbf{A}\), is Negative Semi-Definite if for any non-zero k x 1 vector \(\mathbf{x}\) \[\mathbf{x'Ax \leq 0 }\]

\(\mathbf{A}\) is Positive Definite (PD) if \(\mathbf{x'Ax} > 0\) for every non-zero \(\mathbf{x}\), and Negative Definite (ND) if \(\mathbf{x'Ax} < 0\) for every non-zero \(\mathbf{x}\). \(\mathbf{A}\) is indefinite if it is neither positive semi-definite nor negative semi-definite.

The identity matrix is positive definite.

Example: Let \(\mathbf{x} =(x_1 \; x_2)'\). Then, for the 2 x 2 identity matrix,

\[\begin{equation} \begin{split} \mathbf{x'Ix} &= (x_1 x_2) \left(\begin{array} {cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right) \left(\begin{array}{c} x_1 \\ x_2 \\ \end{array} \right) \\ &= (x_1 x_2) \left(\begin{array} {c} x_1 \\ x_2 \\ \end{array} \right) \\ &= x_1^2 + x_2^2 >0 \end{split} \end{equation}\]

Definiteness gives us a way to compare matrices: \(\mathbf{A} \geq \mathbf{B}\) in the matrix sense when \(\mathbf{A-B}\) is PSD. This property is used to establish efficiency, i.e., to show that the variance-covariance matrix of one estimator is smaller than that of another.

Properties

  • any variance matrix is PSD
  • a matrix \(\mathbf{A}\) is PSD if and only if there exists a matrix \(\mathbf{B}\) such that \(\mathbf{A=B'B}\)
  • if \(\mathbf{A}\) is PSD, then \(\mathbf{B'AB}\) is PSD
  • if A and C are non-singular, then A-C is PSD if and only if \(\mathbf{C^{-1}-A^{-1}}\) is PSD
  • if A is PD (ND) then \(A^{-1}\) is PD (ND)

Note

  • An indefinite \(\mathbf{A}\) is neither PSD nor NSD; there is no comparable concept for scalars.
  • If a square matrix is PSD and invertible, then it is PD.

Examples:

  1. Invertible / Indefinite

\[ \left[ \begin{array} {cc} -1 & 0 \\ 0 & 10 \\ \end{array} \right] \]

  2. Non-invertible / Indefinite

\[ \left[ \begin{array} {cc} 0 & 1 \\ 0 & 0 \\ \end{array} \right] \]

  3. Invertible / PSD

\[ \left[ \begin{array} {cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right] \]

  4. Non-Invertible / PSD

\[ \left[ \begin{array} {cc} 0 & 0 \\ 0 & 1 \\ \end{array} \right] \]
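For symmetric matrices, definiteness can be read off the eigenvalues: all non-negative means PSD, all strictly positive means PD, and mixed signs mean indefinite. The sketch below (NumPy) classifies the four example matrices above this way; the non-symmetric example is symmetrized first, which leaves the quadratic form unchanged:

```python
# Classify the four example matrices by eigenvalues and invertibility.
import numpy as np

examples = {
    "invertible / indefinite":     np.array([[-1., 0.], [0., 10.]]),
    "non-invertible / indefinite": np.array([[0., 1.], [0., 0.]]),
    "invertible / PSD (PD)":       np.eye(2),
    "non-invertible / PSD":        np.array([[0., 0.], [0., 1.]]),
}

for name, M in examples.items():
    sym = (M + M.T) / 2                      # symmetrize; x'Mx is unchanged
    eig = np.linalg.eigvalsh(sym)
    invertible = not np.isclose(np.linalg.det(M), 0)
    print(name, "| eigenvalues:", np.round(eig, 3), "| invertible:", invertible)
```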

2.1.4 Matrix Calculus

Let \(y=f(x_1,x_2,...,x_k)=f(x)\), where x is a 1 x k row vector. The Gradient (first-order derivative with respect to a vector) is,

\[ \frac{\partial{f(x)}}{\partial{x}}= \left(\begin{array}{c} \frac{\partial{f(x)}}{\partial{x_1}} \\ \frac{\partial{f(x)}}{\partial{x_2}} \\ ... \\ \frac{\partial{f(x)}}{\partial{x_k}} \end{array} \right) \]

The Hessian (second order derivative with respect to a vector) is,

\[ \frac{\partial^2{f(x)}}{\partial{x}\partial{x'}}= \left(\begin{array} {cccc} \frac{\partial^2{f(x)}}{\partial{x_1}\partial{x_1}} & \frac{\partial^2{f(x)}}{\partial{x_1}\partial{x_2}} & ... & \frac{\partial^2{f(x)}}{\partial{x_1}\partial{x_k}} \\ \frac{\partial^2{f(x)}}{\partial{x_2}\partial{x_1}} & \frac{\partial^2{f(x)}}{\partial{x_2}\partial{x_2}} & ... & \frac{\partial^2{f(x)}}{\partial{x_2}\partial{x_k}} \\ ... & ...& & ...\\ \frac{\partial^2{f(x)}}{\partial{x_k}\partial{x_1}} & \frac{\partial^2{f(x)}}{\partial{x_k}\partial{x_2}} & ... & \frac{\partial^2{f(x)}}{\partial{x_k}\partial{x_k}} \end{array} \right) \]

Define the derivative of \(f(\mathbf{X})\) with respect to \(\mathbf{X}_{(n \times p)}\) as the matrix

\[ \frac{\partial f(\mathbf{X})}{\partial \mathbf{X}} = (\frac{\partial f(\mathbf{X})}{\partial x_{ij}}) \]

Define \(\mathbf{a}\) to be a vector and \(\mathbf{A}\) to be a matrix which does not depend upon \(\mathbf{y}\). Then

\[ \frac{\partial \mathbf{a'y}}{\partial \mathbf{y}} = \mathbf{a} \]

\[ \frac{\partial \mathbf{y'y}}{\partial \mathbf{y}} = 2\mathbf{y} \]

\[ \frac{\partial \mathbf{y'Ay}}{\partial \mathbf{y}} = \mathbf{(A + A')y} \]
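These vector-derivative rules can be checked with a finite-difference gradient. A minimal NumPy sketch; the values of \(\mathbf{a}\), \(\mathbf{A}\), and \(\mathbf{y}\) are arbitrary:

```python
# Check d(a'y)/dy = a, d(y'y)/dy = 2y, and d(y'Ay)/dy = (A + A')y
# with a central finite difference.
import numpy as np

def num_grad(f, y, eps=1e-6):
    g = np.zeros_like(y)
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = eps
        g[i] = (f(y + e) - f(y - e)) / (2 * eps)
    return g

rng = np.random.default_rng(2)
a, y = rng.normal(size=(2, 3))
A = rng.normal(size=(3, 3))

print(np.allclose(num_grad(lambda v: a @ v, y), a))                  # a
print(np.allclose(num_grad(lambda v: v @ v, y), 2 * y))              # 2y
print(np.allclose(num_grad(lambda v: v @ A @ v, y), (A + A.T) @ y))  # (A + A')y
```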

If \(\mathbf{X}\) is a symmetric matrix then

\[ \frac{\partial |\mathbf{X}|}{\partial x_{ij}} = \begin{cases} X_{ii}, & i = j \\ X_{ij}, & i \neq j \end{cases} \] where \(X_{ij}\) is the (i,j)th cofactor of \(\mathbf{X}\).

If \(\mathbf{X}\) is symmetric and \(\mathbf{A}\) is a matrix which does not depend upon \(\mathbf{X}\) then

\[ \frac{\partial tr \mathbf{XA}}{\partial \mathbf{X}} = \mathbf{A} + \mathbf{A}' - diag(\mathbf{A}) \]
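This rule can also be checked numerically, keeping in mind that for symmetric \(\mathbf{X}\) a change in an off-diagonal \(x_{ij}\) moves \(x_{ji}\) as well. A NumPy sketch with arbitrary \(\mathbf{X}\) and \(\mathbf{A}\):

```python
# Finite-difference check of d tr(XA)/dX = A + A' - diag(A) for symmetric X.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(3, 3))
X = (X + X.T) / 2                                  # make X symmetric

eps = 1e-6
D = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        E = np.zeros((3, 3))
        E[i, j] += eps
        if i != j:
            E[j, i] += eps                         # keep X + E symmetric
        D[i, j] = (np.trace((X + E) @ A) - np.trace(X @ A)) / eps

target = A + A.T - np.diag(np.diag(A))
print(np.allclose(D, target))                      # True
```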

If \(\mathbf{X}\) is symmetric and we let \(\mathbf{J}_{ij}\) be a matrix which has a 1 in the (i,j)th position and 0s elsewhere, then

\[ \frac{\partial \mathbf{X}^{-1}}{\partial x_{ij}} = \begin{cases} - \mathbf{X}^{-1}\mathbf{J}_{ii} \mathbf{X}^{-1} , & i = j \\ - \mathbf{X}^{-1}(\mathbf{J}_{ij} + \mathbf{J}_{ji}) \mathbf{X}^{-1} , & i \neq j \end{cases} \]

2.1.5 Optimization

Scalar Optimization vs. Vector Optimization

First Order Condition:

\[\frac{\partial{f(x_0)}}{\partial{x}}=0 \quad \text{(scalar)} \qquad\qquad \frac{\partial{f(x_0)}}{\partial{x}}=\left(\begin{array}{c}0 \\ \vdots \\ 0\end{array}\right) \quad \text{(vector)}\]

Second Order Condition:

  • Convex \(\rightarrow\) Min: \(\frac{\partial^2{f(x_0)}}{\partial{x^2}} > 0\) (scalar); \(\frac{\partial^2{f(x_0)}}{\partial{x}\partial{x'}}\) is positive definite (vector)
  • Concave \(\rightarrow\) Max: \(\frac{\partial^2{f(x_0)}}{\partial{x^2}} < 0\) (scalar); \(\frac{\partial^2{f(x_0)}}{\partial{x}\partial{x'}}\) is negative definite (vector)
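As an illustration of the vector case, consider minimizing \(f(\mathbf{x}) = \mathbf{x'Ax} - 2\mathbf{b'x}\) with \(\mathbf{A}\) symmetric and positive definite: the first-order condition \((\mathbf{A+A'})\mathbf{x} - 2\mathbf{b} = \mathbf{0}\) gives \(\mathbf{x}^* = \mathbf{A}^{-1}\mathbf{b}\), and the Hessian \(\mathbf{A+A'} = 2\mathbf{A}\) is positive definite, so \(\mathbf{x}^*\) is a minimum. A minimal NumPy sketch (the particular \(\mathbf{A}\) and \(\mathbf{b}\) are arbitrary):

```python
# FOC/SOC for a quadratic objective: f(x) = x'Ax - 2b'x.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # symmetric, positive definite
b = np.array([1.0, 4.0])

x_star = np.linalg.solve(A, b)       # solves the first-order condition Ax = b
hessian = A + A.T                    # Hessian of f
print(x_star)
print(np.all(np.linalg.eigvalsh(hessian) > 0))   # SOC: Hessian is PD -> minimum
```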