Chapter 6 Analysis of Variance (ANOVA)

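  • Notation: $y = X\beta + \epsilon$, $\epsilon \sim N(0, \sigma^2 I)$. Let $X_1 = 1$, $X_m = X$ and $X_{m+1} = I$. Suppose $C(X_1) \subset C(X_2) \subset \cdots \subset C(X_{m-1}) \subset C(X_m)$. Let $P_j = P_{X_j}$ and $r_j = \mathrm{rank}(X_j)$, $j = 1, \ldots, m+1$.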

    • Total sum of squares: $SSTo = \sum_{i=1}^{n} (y_i - \bar{y}_{\cdot})^2 = y^\top (I - P_1) y = \sum_{j=1}^{m} y^\top (P_{j+1} - P_j) y$
    • $SSE = y^\top (I - P_X) y$
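    • Sums of squares: $SS(j+1 \mid j) = y^\top (P_{j+1} - P_j) y$, that is, $SS(2 \mid 1) = y^\top (P_2 - P_1) y, \ldots, SS(m \mid m-1) = y^\top (P_m - P_{m-1}) y$; the last term, $SS(m+1 \mid m) = y^\top (P_{m+1} - P_m) y = y^\top (I - P_X) y$, is the $SSE$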
    • $\mathrm{rank}(P_{j+1} - P_j) = \mathrm{tr}(P_{j+1}) - \mathrm{tr}(P_j) = r_{j+1} - r_j$
    • zero cross-products: $(P_{j+1} - P_j)(P_{l+1} - P_l) = 0$ for $j \neq l$
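    • because $\left(\frac{P_{j+1} - P_j}{\sigma^2}\right)(\sigma^2 I)$ is idempotent, $\frac{y^\top (P_{j+1} - P_j) y}{\sigma^2} \sim \chi^2_{r_{j+1} - r_j}\left(\frac{\beta^\top X^\top (P_{j+1} - P_j) X \beta}{2\sigma^2}\right)$ for all $j = 1, \ldots, m$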
    • Mean squares: $MS(j+1 \mid j) = \frac{SS(j+1 \mid j)}{r_{j+1} - r_j}$
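  • ANOVA F statistics: $F_j = \frac{MS(j+1 \mid j)}{MSE} = \frac{y^\top (P_{j+1} - P_j) y / (r_{j+1} - r_j)}{y^\top (I - P_X) y / (n - r)} \sim F_{r_{j+1} - r_j,\, n - r}\left(\frac{\beta^\top X^\top (P_{j+1} - P_j) X \beta}{2\sigma^2}\right)$ for $j = 1, \ldots, m-1$, where $r = \mathrm{rank}(X) = r_m$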

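    • $F_j$ can be used to test $H_{0j}: \frac{\beta^\top X^\top (P_{j+1} - P_j) X \beta}{2\sigma^2} = 0$ vs. $H_{Aj}: \frac{\beta^\top X^\top (P_{j+1} - P_j) X \beta}{2\sigma^2} \neq 0$
    • $\frac{\beta^\top X^\top (P_{j+1} - P_j) X \beta}{2\sigma^2} = 0 \iff (P_{j+1} - P_j) X \beta = 0 \iff P_j E(y) = P_{j+1} E(y) \iff P_{j+1} E(y) \in C(X_j)$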
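    • $C_j = P_{j+1} - P_j$ is $n \times n$ with rank $r_{j+1} - r_j < n$, so it is not of full row rank and $(P_{j+1} - P_j) X \beta = 0$ is not a testable hypothesis as written. We can write $H_{0j}$ as a testable hypothesis by replacing $C_j$ with any matrix $C_j^*$ whose $q = r_{j+1} - r_j$ rows form a basis for the row space of $C_j$.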
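
The quantities above can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes NumPy, a hypothetical helper `proj`, simulated data, and an illustrative nested sequence of polynomial designs (intercept, linear, quadratic) with $m = 3$. It builds the projections $P_j$, the sequential sums of squares, their degrees of freedom, the $F_j$ statistics, and one possible choice of $C_1^*$ obtained from the SVD.

```python
import numpy as np

def proj(X):
    """Orthogonal projection matrix onto the column space C(X)."""
    return X @ np.linalg.pinv(X)

# Illustrative (assumed) data: nested polynomial designs, m = 3.
rng = np.random.default_rng(0)               # fixed seed for repeatability
n = 12
x = np.linspace(-1.0, 1.0, n)
X1 = np.ones((n, 1))                         # X_1 = 1
X2 = np.column_stack([np.ones(n), x])        # adds a linear term
X3 = np.column_stack([np.ones(n), x, x**2])  # X_m = X
designs = [X1, X2, X3]
m = len(designs)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# P_1, ..., P_m with P_{m+1} = I, and r_1, ..., r_m with r_{m+1} = n.
P = [proj(Xj) for Xj in designs] + [np.eye(n)]
r = [np.linalg.matrix_rank(Xj) for Xj in designs] + [n]

# D[j-1] = P_{j+1} - P_j for j = 1, ..., m.
D = [P[j + 1] - P[j] for j in range(m)]
ss = [float(y @ Dj @ y) for Dj in D]         # SS(2|1), ..., SS(m|m-1), SSE
df = [r[j + 1] - r[j] for j in range(m)]     # r_{j+1} - r_j

ssto = float(np.sum((y - y.mean()) ** 2))    # total sum of squares
sse, df_error = ss[-1], df[-1]               # y'(I - P_X)y and n - r
mse = sse / df_error
F = [(ss[j] / df[j]) / mse for j in range(m - 1)]

# One choice of C_1^*: the first q right singular vectors of
# C_1 = P_2 - P_1 form an orthonormal basis for its row space.
q = df[0]
_, _, Vt = np.linalg.svd(D[0])
C_star = Vt[:q]

print("SS:", ss, "sum:", sum(ss), "SSTo:", ssto)
print("df:", df)
print("F:", F)
```

Because the rows of `C_star` are orthonormal and span the row space of $C_1$, `C_star.T @ C_star` reproduces $P_2 - P_1$, so $C_1^* X\beta = 0$ holds exactly when $(P_2 - P_1) X\beta = 0$.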