6.5 Useful matrix information in statistics

In statistics, we often work with vectors and matrices. The vector of response variables is often written as $\textbf{y} = \left[ \begin{array}{c} y_1\\ y_2\\ y_3\\ \vdots\\ y_n \end{array} \right].$ The values $y_i$ are often assumed to have been generated from a process with means $\mu_i$, so we can write $\boldsymbol{\mu} = \left[ \begin{array}{c} \mu_1\\ \mu_2\\ \mu_3\\ \vdots\\ \mu_n \end{array} \right],$ or $E[\textbf{y}] = \boldsymbol{\mu}$.

For an appropriately conformable $m\times n$ matrix of constants $C$, $E[C \textbf{y}] = C\boldsymbol{\mu}$. This is the matrix equivalent of $E[cX] = c E[X] = c\mu$ for a scalar constant $c$ and a random variable $X$ with mean $\mu$.

Similarly, $\text{var}[C\textbf{y}] = C\text{var}[\textbf{y}]C^T$, where $\text{var}[\textbf{y}]$ is the $n\times n$ variance-covariance matrix of $\textbf{y}$. This is the matrix equivalent of stating $\text{var}[c X] = c^2 \text{var}[X]$.
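Both rules can be checked exactly on a small made-up example. The sketch below assumes a hypothetical random vector $\textbf{y}$ that takes one of four outcomes with equal probability. The outcomes, the matrix `C`, and the helper functions are illustrative choices, not part of the text. Exact fractions are used so the two sides can be compared without rounding error.

```python
from fractions import Fraction


def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def matmul(A, B):
    """Matrix-matrix product A B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def mean(outcomes):
    """Mean vector when every outcome is equally likely."""
    n = len(outcomes)
    return [sum(y[i] for y in outcomes) / n for i in range(len(outcomes[0]))]


def cov(outcomes):
    """Variance-covariance matrix when every outcome is equally likely."""
    m = mean(outcomes)
    n, k = len(outcomes), len(m)
    return [
        [sum((y[i] - m[i]) * (y[j] - m[j]) for y in outcomes) / n for j in range(k)]
        for i in range(k)
    ]


# Hypothetical distribution: y is one of these four vectors, each with probability 1/4.
raw = [[1, 2, 0], [3, 1, 4], [0, 5, 2], [2, 2, 2]]
outcomes = [[Fraction(v) for v in row] for row in raw]

# Hypothetical 2 x 3 matrix of constants.
C = [[Fraction(1), Fraction(-1), Fraction(2)],
     [Fraction(0), Fraction(3), Fraction(1)]]

mu = mean(outcomes)          # E[y]
V = cov(outcomes)            # var[y]

# Transform every outcome, then compute the mean and variance of C y directly.
transformed = [matvec(C, y) for y in outcomes]
mean_Cy = mean(transformed)  # E[C y]
var_Cy = cov(transformed)    # var[C y]

# The same quantities from the two rules.
C_mu = matvec(C, mu)                            # C mu
C_V_Ct = matmul(matmul(C, V), transpose(C))     # C var[y] C^T

print(mean_Cy == C_mu)
print(var_Cy == C_V_Ct)
```

Note that `var_Cy` is a $2\times 2$ matrix even though `V` is $3\times 3$: multiplying by $C$ on the left and $C^T$ on the right changes the dimension from $n\times n$ to $m\times m$.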

A further useful thing to note: Consider an unknown $n\times 1$ vector $\textbf{z}$ and a symmetric $n\times n$ matrix $M$. Then differentiating $S = \textbf{z}^T M \textbf{z}$ with respect to $\textbf{z}$ gives $\frac{dS}{d\textbf{z}} = 2M\textbf{z}.$ This is the matrix equivalent of differentiating $y = a x^2$ and getting $2ax$. If $M$ is not symmetric, the general result is $\frac{dS}{d\textbf{z}} = (M + M^T)\textbf{z}$, which reduces to $2M\textbf{z}$ when $M = M^T$.
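The derivative of a quadratic form can be checked numerically. The sketch below compares a central finite-difference gradient of $\textbf{z}^T M \textbf{z}$ with $2M\textbf{z}$ for a symmetric $M$, and with $(M + M^T)\textbf{z}$ for a non-symmetric one, where $2M\textbf{z}$ no longer matches. The matrices, the vector `z`, and the function names are hypothetical values chosen for illustration.

```python
def quad_form(M, z):
    """S = z^T M z."""
    n = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(n) for j in range(n))


def numerical_gradient(f, z, h=1e-5):
    """Central finite-difference approximation to dS/dz."""
    grad = []
    for i in range(len(z)):
        up, down = z[:], z[:]
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def two_M_z(M, z):
    """2 M z, the gradient when M is symmetric."""
    return [2 * sum(m * x for m, x in zip(row, z)) for row in M]


def general_gradient(M, z):
    """(M + M^T) z, the gradient for any square M."""
    n = len(z)
    return [sum((M[i][j] + M[j][i]) * z[j] for j in range(n)) for i in range(n)]


def close(u, v, tol=1e-6):
    return all(abs(a - b) < tol for a, b in zip(u, v))


# Hypothetical inputs.
M_sym = [[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 4.0]]   # symmetric
M_asym = [[2.0, 5.0, 0.0], [1.0, 3.0, -1.0], [0.0, 2.0, 4.0]]   # not symmetric
z = [1.0, -2.0, 0.5]

grad_sym = numerical_gradient(lambda v: quad_form(M_sym, v), z)
grad_asym = numerical_gradient(lambda v: quad_form(M_asym, v), z)

print(close(grad_sym, two_M_z(M_sym, z)))            # symmetric case: 2 M z
print(close(grad_asym, general_gradient(M_asym, z))) # general case: (M + M^T) z
print(close(grad_asym, two_M_z(M_asym, z)))          # 2 M z without symmetry
```

In statistical applications $M$ is usually symmetric (for example $X^T X$ in least squares), so the simpler form $2M\textbf{z}$ is the one most often needed.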