Chapter 6 Analysis of Variance (ANOVA)
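Notation: $y = X\beta + \epsilon$, $\epsilon \sim N(0, \sigma^2 I)$. Let $X_1 = \mathbf{1}$ (the $n \times 1$ vector of ones), $X_m = X$ and $X_{m+1} = I$. Suppose $C(X_1) \subset C(X_2) \subset \cdots \subset C(X_{m-1}) \subset C(X_m)$. Let $P_j = P_{X_j}$ and $r_j = \operatorname{rank}(X_j)$ for all $j = 1, \ldots, m+1$.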
- Total sum of squares: $SSTo = \sum_{i=1}^{n} (y_i - \bar{y}_{\cdot})^2 = y'(I - P_1)y = \sum_{j=1}^{m} y'(P_{j+1} - P_j)y$
- $SSE = y'(I - P_X)y$
- Sums of squares: $SS(2 \mid 1) = y'(P_2 - P_1)y, \ldots, SS(m \mid m-1) = y'(P_m - P_{m-1})y$, and $SS(m+1 \mid m) = y'(P_{m+1} - P_m)y = SSE$
- $\operatorname{rank}(P_{j+1} - P_j) = \operatorname{tr}(P_{j+1}) - \operatorname{tr}(P_j) = r_{j+1} - r_j$
- Zero cross-products: $(P_{j+1} - P_j)(P_{l+1} - P_l) = 0$ for $j \neq l$
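- Because $\left(\frac{P_{j+1} - P_j}{\sigma^2}\right)(\sigma^2 I)$ is idempotent, $\frac{y'(P_{j+1} - P_j)y}{\sigma^2} \sim \chi^2_{r_{j+1} - r_j}\left(\frac{\beta' X'(P_{j+1} - P_j)X\beta}{2\sigma^2}\right)$ for all $j = 1, \ldots, m$
- Mean squares: $MS(j+1 \mid j) = \frac{SS(j+1 \mid j)}{r_{j+1} - r_j}$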
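
The decomposition above can be checked numerically. The following is a minimal NumPy sketch, not part of the notes: the nested design (intercept, then linear, then quadratic in a made-up covariate `x`), the simulated `y`, and the helper `proj` are all illustrative assumptions.

```python
import numpy as np

def proj(A):
    """Orthogonal projection matrix onto C(A), via the Moore-Penrose pseudoinverse."""
    return A @ np.linalg.pinv(A)

rng = np.random.default_rng(0)
n = 12
x = np.linspace(-1.0, 1.0, n)                  # hypothetical covariate

# Assumed nested sequence C(X1) in C(X2) in C(X3), with X3 = X (m = 3) and X4 = I.
X1 = np.ones((n, 1))
X2 = np.column_stack([np.ones(n), x])
X3 = np.column_stack([np.ones(n), x, x**2])
Xs = [X1, X2, X3, np.eye(n)]

y = 1.0 + 2.0 * x + rng.normal(size=n)         # simulated response

P = [proj(A) for A in Xs]
r = [int(np.linalg.matrix_rank(A)) for A in Xs]
D = [P[j + 1] - P[j] for j in range(len(P) - 1)]    # P_{j+1} - P_j

ss = [float(y @ Dj @ y) for Dj in D]           # SS(2|1), SS(3|2), SSE
df = [r[j + 1] - r[j] for j in range(len(D))]  # r_{j+1} - r_j
ms = [s / d for s, d in zip(ss, df)]           # mean squares
ss_total = float(np.sum((y - y.mean()) ** 2))  # SSTo

# The sequential sums of squares add up to SSTo.
print(np.isclose(sum(ss), ss_total))
# Each difference of projections has rank equal to its trace, r_{j+1} - r_j.
print([int(np.linalg.matrix_rank(Dj, tol=1e-8)) for Dj in D], df)
# Cross-products of distinct differences vanish.
print(all(np.allclose(D[j] @ D[l], 0.0, atol=1e-8)
          for j in range(len(D)) for l in range(len(D)) if j != l))
```

The pseudoinverse is used so that `proj` also works when a design matrix is not of full column rank; any other way of forming the orthogonal projection would serve equally well.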
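ANOVA F statistics: for $j = 1, \ldots, m-1$, $F_j = \frac{MS(j+1 \mid j)}{MSE} = \frac{y'(P_{j+1} - P_j)y / (r_{j+1} - r_j)}{y'(I - P_X)y / (n - r)} \sim F_{r_{j+1} - r_j,\, n - r}\left(\frac{\beta' X'(P_{j+1} - P_j)X\beta}{2\sigma^2}\right)$, where $r = \operatorname{rank}(X) = r_m$.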
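- $F_j$ can be used to test $H_{0j}: \frac{\beta' X'(P_{j+1} - P_j)X\beta}{2\sigma^2} = 0$ vs. $H_{Aj}: \frac{\beta' X'(P_{j+1} - P_j)X\beta}{2\sigma^2} \neq 0$
- $\frac{\beta' X'(P_{j+1} - P_j)X\beta}{2\sigma^2} = 0 \iff (P_{j+1} - P_j)X\beta = 0 \iff P_j E(y) = P_{j+1} E(y) \iff P_{j+1} E(y) \in C(X_j)$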
- $C^*_j = (P_{j+1} - P_j)X$ does not have full row rank, so $(P_{j+1} - P_j)X\beta = 0$ is not in the form of a testable hypothesis. We can write $H_{0j}$ as a testable hypothesis by replacing $C^*_j$ with any matrix $C_j$ whose $q = r_{j+1} - r_j$ rows form a basis for the row space of $C^*_j$.
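
To illustrate the last point, here is a hedged NumPy sketch that builds one possible $C_j$ and compares the resulting test with $F_j$. Everything specific in it is an assumption made for illustration: the polynomial design, the simulated data, the helper names `proj`, `row_space_basis` and `glh_F`, the use of an SVD to pick a basis, and the restriction to a full-column-rank $X$ so that $(X'X)^{-1}$ exists. The statistic in `glh_F` is the usual general linear hypothesis form $\hat\beta' C' [C (X'X)^{-1} C']^{-1} C \hat\beta / (q \cdot MSE)$.

```python
import numpy as np

def proj(A):
    """Orthogonal projection matrix onto C(A)."""
    return A @ np.linalg.pinv(A)

def row_space_basis(C_star, tol=1e-8):
    """Rows forming an orthonormal basis of the row space of C_star (via SVD)."""
    _, s, Vt = np.linalg.svd(C_star)
    q = int(np.sum(s > tol))                   # numerical rank of C_star
    return Vt[:q]

rng = np.random.default_rng(1)
n = 12
x = np.linspace(-1.0, 1.0, n)                  # hypothetical covariate
X1 = np.ones((n, 1))
X2 = np.column_stack([np.ones(n), x])
X = np.column_stack([np.ones(n), x, x**2])     # X_m = X, assumed full column rank
y = 1.0 + 2.0 * x + rng.normal(size=n)         # simulated response

P1, P2, PX = proj(X1), proj(X2), proj(X)
r1, r2, r = 1, 2, 3
mse = float(y @ (np.eye(n) - PX) @ y) / (n - r)

# ANOVA F statistics for the two nested comparisons.
F1 = float(y @ (P2 - P1) @ y) / (r2 - r1) / mse
F2 = float(y @ (PX - P2) @ y) / (r - r2) / mse

# Full-row-rank replacements for C*_j = (P_{j+1} - P_j) X.
C1 = row_space_basis((P2 - P1) @ X)
C2 = row_space_basis((PX - P2) @ X)

def glh_F(C):
    """F statistic for H0: C beta = 0 when X has full column rank."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    cb = C @ beta_hat
    q = C.shape[0]
    return float(cb @ np.linalg.solve(C @ XtX_inv @ C.T, cb)) / q / mse

print(C1.shape, C2.shape)                      # each C_j has q = r_{j+1} - r_j rows
print(np.isclose(glh_F(C1), F1), np.isclose(glh_F(C2), F2))
```

Any other basis for the row space of $C^*_j$ gives the same statistic, since the hypothesis $C_j\beta = 0$ describes the same constraint on $E(y)$.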