# Chapter 2 Review of Linear Models

• General Linear Model (GLM): suppose $$y = X\beta + \epsilon$$ with $$E(\epsilon) = 0$$

• distribution of $$y$$ is left unspecified
• $$E(y) \in \mathcal{C}(X)$$
• Ordinary Least Squares Estimator (OLSE): $$\hat y = P_X y = X(X'X)^-X'y$$

• Orthogonal projection matrix $$P_X$$:

• $$P_X$$ is symmetric and idempotent
• $$P_XX = X$$ and $$X'P_X = X'$$
• $$rank(X)= rank(P_X) = tr(P_X)$$
• other properties:
• $$X'XA = X'XB \Leftrightarrow XA = XB$$
• for any generalized inverse $$(X'X)^-$$: $$X(X'X)^-X'X = X$$ and $$X'X(X'X)^-X' = X'$$
• if $$A' = A$$ and $$AGA = A$$, then $$AG'A = A$$
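The projection-matrix properties above can be checked numerically. A minimal sketch in numpy, using a deliberately rank-deficient design matrix of my own invention (the Moore-Penrose pseudoinverse is one valid choice of $$(X'X)^-$$):

```python
import numpy as np

# Rank-deficient design: the third column is the sum of the first two,
# so rank(X) = 2 < 3 columns.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 5.0]])

# Orthogonal projection onto C(X) via a generalized inverse of X'X
XtX_ginv = np.linalg.pinv(X.T @ X)
P = X @ XtX_ginv @ X.T

# P_X is symmetric and idempotent
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)

# P_X X = X and X' P_X = X'
assert np.allclose(P @ X, X)
assert np.allclose(X.T @ P, X.T)

# rank(X) = rank(P_X) = tr(P_X)
print(np.linalg.matrix_rank(X), round(np.trace(P)))  # both are 2
```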
• Estimable: if $$C$$ is any $$q\times p$$ matrix, the linear function of $$\beta$$ given by $$C\beta$$ is estimable if and only if $$C = AX$$ for some $$q\times n$$ matrix $$A$$

• OLSE of an estimable linear function $$C\beta$$ is $$C(X'X)^-X'y$$
• Normal equations: $$X'Xb = X'y$$

• The OLSE of estimable $$C\beta$$ is $$C\hat\beta$$ where $$\hat\beta$$ is any solution for $$b$$ in the normal equations
• $$\hat\beta = (X'X)^-X'y$$ is always a solution to the Normal Equations for any $$(X'X)^-$$
• if $$C\beta$$ is estimable, then $$C\hat\beta$$ is the same for all solutions $$\hat\beta$$ to the Normal equations, and $$C\hat\beta = AP_Xy$$ where $$C = AX$$
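The invariance of $$C\hat\beta$$ across solutions of the normal equations can be illustrated with a small one-way ANOVA design (the data and the contrast below are illustrative, not from the text):

```python
import numpy as np

# Rank-deficient cell-means-with-intercept design: mu + tau_1, mu + tau_2
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
y = np.array([2.0, 3.0, 5.0, 7.0])

XtX, Xty = X.T @ X, X.T @ y

# Two different solutions to the normal equations X'X b = X'y:
# (1) the minimum-norm solution via the Moore-Penrose inverse,
b1 = np.linalg.pinv(XtX) @ Xty
# (2) b1 shifted by a null-space vector of X'X (here X'X @ v = 0).
v = np.array([1.0, -1.0, -1.0])
b2 = b1 + 0.7 * v
assert np.allclose(XtX @ b2, Xty)  # b2 also solves the normal equations

# C beta = tau_1 - tau_2 is estimable: [0, 1, -1] is row1(X) - row3(X), so C = AX
C = np.array([[0.0, 1.0, -1.0]])
print(C @ b1, C @ b2)  # identical even though b1 != b2
```

Here both solutions give $$C\hat\beta = -3.5$$, the difference of the two group means, even though $$\hat\beta$$ itself is not unique.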
• Gauss-Markov Model (GMM): suppose $$y = X\beta + \epsilon$$ with $$E(\epsilon) = 0$$ and $$Var(\epsilon) = \sigma^2I$$

• Gauss-Markov Theorem: the OLSE of an estimable function $$C\beta$$ is the BLUE (best linear unbiased estimator) of $$C\beta$$

• an unbiased estimator of $$\sigma^2$$ under GMM is given by $$\hat\sigma^2 = \frac{y'(I-P_X)y}{n-r}$$, where $$r = rank(X)$$
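Unbiasedness of $$\hat\sigma^2$$ can be checked with a quick Monte Carlo sketch (the design, $$\beta$$, and $$\sigma^2$$ values below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate many GMM datasets and average sigma_hat^2 over replications
n, sigma2 = 20, 4.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
r = np.linalg.matrix_rank(X)
P = X @ np.linalg.pinv(X.T @ X) @ X.T
I_minus_P = np.eye(n) - P

beta = np.array([1.0, 0.5])
estimates = []
for _ in range(2000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    estimates.append(y @ I_minus_P @ y / (n - r))

print(np.mean(estimates))  # close to sigma^2 = 4
```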
• Gauss-Markov Model with Normal Errors (GMMNE): suppose $$y=X\beta+\epsilon$$ with $$\epsilon\sim N(0, \sigma^2I)$$

• GMMNE is useful for drawing statistical inferences regarding estimable $$C\beta$$
• assume: 1. the GMMNE holds, 2. $$C\beta$$ is estimable, 3. $$rank(C) = q$$ and $$d$$ is a known $$q\times 1$$ vector. Then $$H_0: C\beta = d$$ is a testable hypothesis
• $$C\hat\beta\sim N(C\beta, \sigma^2C(X'X)^-C')$$ and $$\hat\sigma^2\sim \frac{\sigma^2}{n-r}\chi_{n-r}^2$$ are independent

• The F test statistic \begin{aligned} F & = (C\hat\beta-d)'[\widehat{Var}(C\hat\beta)]^{-1}(C\hat\beta - d)/q \\ & = \frac{(C\hat\beta-d)'[C(X'X)^-C']^{-1}(C\hat\beta-d)/q}{\hat\sigma^2} \\ & \sim F_{q, n-r}(\frac{(C\beta-d)'[C(X'X)^{-}C']^{-1}(C\beta-d)}{2\sigma^2}) \end{aligned} Under the null hypothesis $$H_0: C\beta = d$$, the non-negative non-centrality parameter is 0.
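The F statistic can be computed directly from the pieces above. A worked sketch on a small, made-up one-way ANOVA dataset, testing the estimable hypothesis $$H_0: \tau_1 - \tau_2 = 0$$ (so $$q = 1$$):

```python
import numpy as np

# Unbalanced one-way ANOVA: 2 observations in group 1, 3 in group 2
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
y = np.array([2.0, 4.0, 5.0, 6.0, 7.0])

XtX_ginv = np.linalg.pinv(X.T @ X)      # one choice of (X'X)^-
beta_hat = XtX_ginv @ X.T @ y
P = X @ XtX_ginv @ X.T
n, r = X.shape[0], np.linalg.matrix_rank(X)
sigma2_hat = y @ (np.eye(n) - P) @ y / (n - r)

# H0: C beta = d with the estimable contrast tau_1 - tau_2
C = np.array([[0.0, 1.0, -1.0]])
d = np.zeros(1)
q = C.shape[0]

diff = C @ beta_hat - d
middle = np.linalg.inv(C @ XtX_ginv @ C.T)  # invariant for estimable C
F = (diff @ middle @ diff) / q / sigma2_hat
print(F)  # compare with the F_{q, n-r} = F_{1, 3} distribution
```

Here the group means are 3 and 6, $$\hat\sigma^2 = 4/3$$, and $$C(X'X)^-C' = 1/2 + 1/3 = 5/6$$, giving $$F = 8.1$$.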

• The t test statistic \begin{aligned} t &= \frac{c'\hat\beta-d}{\sqrt{\widehat{Var}(c'\hat\beta)}} = \frac{c'\hat\beta-d}{\sqrt{\hat\sigma^2c'(X'X)^-c}} \\ & \sim t_{n-r}(\frac{c'\beta-d}{\sqrt{\sigma^2c'(X'X)^-c}}) \end{aligned} Under the null hypothesis $$H_0: c'\beta = d$$, the non-centrality parameter is 0.
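For a single estimable contrast ($$q = 1$$), the t statistic squares to the corresponding F statistic. A sketch on the same kind of made-up unbalanced ANOVA data as above:

```python
import numpy as np

# Unbalanced one-way ANOVA: 2 observations in group 1, 3 in group 2
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
y = np.array([2.0, 4.0, 5.0, 6.0, 7.0])

XtX_ginv = np.linalg.pinv(X.T @ X)
beta_hat = XtX_ginv @ X.T @ y
n, r = X.shape[0], np.linalg.matrix_rank(X)
sigma2_hat = y @ (np.eye(n) - X @ XtX_ginv @ X.T) @ y / (n - r)

# H0: c'beta = d with the estimable contrast tau_1 - tau_2
c = np.array([0.0, 1.0, -1.0])
d = 0.0
t = (c @ beta_hat - d) / np.sqrt(sigma2_hat * (c @ XtX_ginv @ c))
print(t, t**2)  # t ~ t_{n-r} under H0; t^2 equals the q = 1 F statistic
```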