## 4.6 Moments of coefficient estimates, MLR matrix edition

Let’s start with the expected value. If we consider our whole vector of coefficient estimates, $$\boldsymbol{b}$$, what is the expected value of this vector?

Check yourself! Why is the expected value of the vector $$\boldsymbol{y}$$ equal to $$\boldsymbol{X\beta}$$?

\begin{aligned} E({\boldsymbol b}) &= E[(\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X' \boldsymbol y]\\ &= (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X' E[{\boldsymbol y}]\\ &= (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X' \boldsymbol X \boldsymbol \beta\\ &= \boldsymbol \beta \end{aligned}
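If you'd like to see unbiasedness in action, here's a quick simulation sketch (Python/NumPy; the sample size, true $$\boldsymbol{\beta}$$, and error SD are all made-up values for illustration):

```python
import numpy as np

# Made-up truth: beta = (1, 3), error SD = 2 (values are mine)
rng = np.random.default_rng(0)
n, sigma = 50, 2.0
beta = np.array([1.0, 3.0])
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])

# Average b = (X'X)^{-1} X'y over many simulated error vectors
reps = 5_000
b_sum = np.zeros(2)
for _ in range(reps):
    y = X @ beta + rng.normal(0, sigma, n)
    b_sum += np.linalg.solve(X.T @ X, X.T @ y)
b_bar = b_sum / reps
print(b_bar)  # should land close to beta
```

Each individual $$\boldsymbol{b}$$ misses $$\boldsymbol{\beta}$$, but the average over many simulated datasets settles right on it.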

How about variance? Well, since we have a whole vector of coefficient estimates, we need a variance-covariance matrix. Using the fact that $$Var(\boldsymbol{Ay}) = \boldsymbol{A}Var(\boldsymbol{y})\boldsymbol{A}'$$ for a constant matrix $$\boldsymbol{A}$$: \begin{aligned} Var({\boldsymbol b}) &= Var[ (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X' \boldsymbol y]\\ &= (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X'Var[{\boldsymbol y}] ((\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X')'\\ &= (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X'Var[{\boldsymbol y}] \boldsymbol X(\boldsymbol X' \boldsymbol X)^{-1}\\ &= (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X'\sigma^2_{\varepsilon} \boldsymbol I \boldsymbol X(\boldsymbol X' \boldsymbol X)^{-1}\\ &= \sigma^2_{\varepsilon} (\boldsymbol X' \boldsymbol X)^{-1}\boldsymbol X' \boldsymbol X(\boldsymbol X' \boldsymbol X)^{-1}\\ &= \sigma^2_{\varepsilon}(\boldsymbol X' \boldsymbol X)^{-1} \end{aligned}
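We can sanity-check this formula by simulation too: the empirical covariance matrix of $$\boldsymbol{b}$$ across many simulated datasets should match $$\sigma^2_{\varepsilon}(\boldsymbol{X}'\boldsymbol{X})^{-1}$$. A sketch (all design choices and values are made up):

```python
import numpy as np

# Made-up design and parameters (names and values are mine)
rng = np.random.default_rng(1)
n, sigma = 40, 1.5
beta = np.array([2.0, -1.0])
X = np.column_stack([np.ones(n), rng.normal(0, 3, n)])

# Theoretical variance-covariance matrix: sigma^2 (X'X)^{-1}
theory = sigma**2 * np.linalg.inv(X.T @ X)

# Empirical version: covariance of b across many simulated datasets
reps = 20_000
bs = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + rng.normal(0, sigma, n)
    bs[r] = np.linalg.solve(X.T @ X, X.T @ y)
empirical = np.cov(bs, rowvar=False)
print(empirical)
print(theory)
```

The two matrices agree up to simulation noise, entry by entry.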

Let’s check what happens with the simple linear regression case, with just an intercept and a single slope coefficient. Previously we found $$(\boldsymbol{X}'\boldsymbol{X})^{-1}$$ for this scenario, so let’s use it!

\begin{aligned} (\boldsymbol X' \boldsymbol X)^{-1} &= \frac{1}{nS_{xx}}\left(\begin{array}{cc} \sum_{i=1}^nx_i^2&-\sum_{i=1}^nx_i\\ -\sum_{i=1}^n x_i&n \end{array}\right)\\ &= \frac{1}{S_{xx}}\left(\begin{array}{cc} n^{-1}\sum_{i=1}^nx_i^2&-\bar{x}\\ -\bar{x}&1 \end{array}\right)\\ &= \frac{1}{S_{xx}}\left(\begin{array}{cc} n^{-1}(\sum_{i=1}^nx_i^2 - n\bar{x}^2+ n\bar{x}^2)&-\bar{x}\\ -\bar{x}&1 \end{array}\right)\\ &= \frac{1}{S_{xx}}\left(\begin{array}{cc} n^{-1}S_{xx} + \bar{x}^2&-\bar{x}\\ -\bar{x}&1 \end{array}\right)\\ &= \left(\begin{array}{cc} \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}&\frac{-\bar{x}}{S_{xx}}\\ \frac{-\bar{x}}{S_{xx}}&\frac{1}{S_{xx}} \end{array}\right)\\ \end{aligned} So, multiplying through by the error variance $$\sigma^2_{\varepsilon}$$,

\begin{aligned} \sigma^2_{\varepsilon}(\boldsymbol X' \boldsymbol X)^{-1} &= \left(\begin{array}{cc} \sigma^2_{\varepsilon}(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}})&\frac{-\bar{x}\sigma^2_{\varepsilon}}{S_{xx}}\\ \frac{-\bar{x}\sigma^2_{\varepsilon}}{S_{xx}}&\frac{\sigma^2_{\varepsilon}}{S_{xx}} \end{array}\right)\\ \end{aligned}

The diagonal elements of the variance-covariance matrix are the variances of the individual vector components (so $$Var(b_0)$$ and $$Var(b_1)$$ here). If you've seen formulas for the variance or standard error of $$b_0$$ or $$b_1$$ before, these are equivalent – though you may not have seen them written in sum-of-squares notation. Meanwhile, on the off-diagonal, we have the covariance of $$b_0$$ and $$b_1$$. Are they independent?