25.1 Completely Randomized Design (CRD)
Treatment factor A with a \ge 2 treatment levels. Experimental units are randomly assigned to each treatment. The number of experimental units in each group can be
- equal (balanced): n
- unequal (unbalanced): n_i for the i-th group (i = 1,…,a).
The total sample size is N = \sum_{i=1}^a n_i.
The number of possible assignments of units to treatments is k = \frac{N!}{n_1! n_2! \cdots n_a!}
Each has probability 1/k of being selected. Each experimental unit is measured with a response Y_{ij}, where i denotes the treatment and j denotes the unit.
Treatment | 1 | 2 | … | a |
---|---|---|---|---|
 | Y_{11} | Y_{21} | … | Y_{a1} |
 | Y_{12} | Y_{22} | … | Y_{a2} |
 | … | … | … | … |
Sample Mean | \bar{Y}_{1.} | \bar{Y}_{2.} | … | \bar{Y}_{a.} |
Sample SD | s_1 | s_2 | … | s_a |
where \bar{Y}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} Y_{ij}
s_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2
And the grand mean is \bar{Y}_{..} = \frac{1}{N}\sum_i \sum_j Y_{ij}
25.1.1 Single Factor Fixed Effects Model
Also known as the Single Factor (One-Way) ANOVA or ANOVA Type I model.
Partitioning the Variance
The total variability of the Y_{ij} observations can be measured as the deviation of Y_{ij} around the overall mean \bar{Y}_{..}: Y_{ij} - \bar{Y}_{..}
This can be rewritten as:
Y_{ij} - \bar{Y}_{..} = Y_{ij} - \bar{Y}_{i.} + \bar{Y}_{i.} - \bar{Y}_{..} = (\bar{Y}_{i.} - \bar{Y}_{..}) + (Y_{ij} - \bar{Y}_{i.})
where
- the first term is the between treatment differences (i.e., the deviation of the treatment mean from the overall mean)
- the second term is within treatment differences (i.e., the deviation of the observation around its treatment mean)
\begin{aligned} \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 &= \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 + \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2 \\ SSTO &= SSTR + SSE \\ \text{total SS} &= \text{treatment SS} + \text{error SS} \\ (N-1) \text{ d.f.} &= (a-1) \text{ d.f.} + (N-a) \text{ d.f.} \end{aligned}
We lose one d.f. for the total corrected SSTO because of the estimation of the mean (\sum_i \sum_j (Y_{ij} - \bar{Y}_{..}) = 0), and likewise one d.f. for SSTR, since \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..}) = 0.
Accordingly, MSTR = \frac{SSTR}{a-1} and MSE = \frac{SSE}{N-a}
ANOVA Table
Source of Variation | SS | df | MS |
---|---|---|---|
Between Treatments | \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 | a-1 | SSTR/(a-1) |
Error (within treatments) | \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2 | N-a | SSE/(N-a) |
Total (corrected) | \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 | N-1 | |
Linear Model Explanation of ANOVA
25.1.1.1 Cell means model
Y_{ij} = \mu_i + \epsilon_{ij}
where
- Y_{ij}: response variable for the j-th subject in the i-th treatment
- \mu_i: parameters (fixed) representing the unknown population mean for the i-th treatment
- \epsilon_{ij}: independent N(0, \sigma^2) errors
\begin{aligned} E(Y_{ij}) &= \mu_i \\ var(Y_{ij}) &= var(\epsilon_{ij}) = \sigma^2 \end{aligned}
All observations have the same variance
Example:
a = 3 (3 treatments), n_1 = n_2 = n_3 = 2:
\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon
X_{k, ij} = \begin{cases} 1 & \text{if the k-th treatment is used} \\ 0 & \text{otherwise} \end{cases}
Note: no intercept term.
\mathbf{b} = \begin{pmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{pmatrix} = (\mathbf{X'X})^{-1}\mathbf{X'y} = \begin{pmatrix} n_1 & 0 & 0 \\ 0 & n_2 & 0 \\ 0 & 0 & n_3 \end{pmatrix}^{-1} \begin{pmatrix} Y_{1.} \\ Y_{2.} \\ Y_{3.} \end{pmatrix} = \begin{pmatrix} \bar{Y}_{1.} \\ \bar{Y}_{2.} \\ \bar{Y}_{3.} \end{pmatrix}
is the BLUE (best linear unbiased estimator) for \beta = (\mu_1, \mu_2, \mu_3)'
E(\mathbf{b}) = \beta
var(\mathbf{b}) = \sigma^2 (\mathbf{X'X})^{-1} = \sigma^2 \begin{pmatrix} 1/n_1 & 0 & 0 \\ 0 & 1/n_2 & 0 \\ 0 & 0 & 1/n_3 \end{pmatrix}
var(b_i) = var(\hat{\mu}_i) = \sigma^2/n_i, where \mathbf{b} \sim N(\beta, \sigma^2 (\mathbf{X'X})^{-1})
MSE = \frac{1}{N-a}\sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2 = \frac{1}{N-a}\sum_i (n_i - 1)\left[\frac{\sum_j (Y_{ij} - \bar{Y}_{i.})^2}{n_i - 1}\right] = \frac{1}{N-a}\sum_i (n_i - 1) s_i^2
We have E(s_i^2) = \sigma^2, so
E(MSE) = \frac{1}{N-a}\sum_i (n_i - 1)\sigma^2 = \sigma^2
Hence, MSE is an unbiased estimator of σ2, regardless of whether the treatment means are equal or not.
E(MSTR) = \sigma^2 + \frac{\sum_i n_i (\mu_i - \mu_.)^2}{a-1}
where \mu_. = \frac{\sum_{i=1}^a n_i \mu_i}{\sum_{i=1}^a n_i}
If all treatment means are equal (= \mu_.), then E(MSTR) = \sigma^2.
Then we can use an F-test for the equality of all treatment means:
\begin{aligned} &H_0: \mu_1 = \mu_2 = \dots = \mu_a \\ &H_a: \text{not all } \mu_i \text{ are equal} \end{aligned}
F = \frac{MSTR}{MSE}
where large values of F support H_a (since MSTR will tend to exceed MSE when H_a holds)
and F near 1 supports H_0 (upper tail test)
Equivalently, when H_0 is true, F \sim f_{(a-1, N-a)}
- If F \le f_{(1-\alpha; a-1, N-a)}, we cannot reject H_0
- If F > f_{(1-\alpha; a-1, N-a)}, we reject H_0
Note: if a = 2 (2 treatments), the F-test is equivalent to the two-sample t-test.
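As a minimal sketch (with simulated, hypothetical data), this F-test can be run in R with aov():

```r
# One-way ANOVA F-test for H0: mu_1 = mu_2 = mu_3,
# with a = 3 treatments and n_i = 6 units each (hypothetical data).
set.seed(1)
d <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 6)),
  y   = c(rnorm(6, mean = 10), rnorm(6, mean = 12), rnorm(6, mean = 11))
)
fit <- aov(y ~ trt, data = d)
summary(fit)  # F = MSTR/MSE on (a - 1, N - a) = (2, 15) df
```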
25.1.1.2 Treatment Effects (Factor Effects)
Besides the Cell Means Model, there is another way to formalize one-way ANOVA: Y_{ij} = \mu + \tau_i + \epsilon_{ij}, where
- Y_{ij} is the j-th response for the i-th treatment
- \tau_i is the i-th treatment effect
- \mu is a constant component, common to all observations
- \epsilon_{ij} are independent random errors \sim N(0, \sigma^2)
For example, with a = 3 and n_1 = n_2 = n_3 = 2:
\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \tau_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon
However,
\mathbf{X'X} = \begin{pmatrix} \sum_i n_i & n_1 & n_2 & n_3 \\ n_1 & n_1 & 0 & 0 \\ n_2 & 0 & n_2 & 0 \\ n_3 & 0 & 0 & n_3 \end{pmatrix}
is singular, so its inverse does not exist and \mathbf{b} cannot be solved for uniquely (there are infinitely many solutions).
Hence, we have to impose restrictions on the parameters to obtain a model matrix \mathbf{X} of full rank.
Whatever restriction we use, we still have:
E(Yij)=μ+τi=μi=mean response for i−th treatment
25.1.1.2.1 Restriction on sum of tau
\sum_{i=1}^a \tau_i = 0
implies
\mu = \frac{1}{a}\sum_{i=1}^a (\mu + \tau_i) = \frac{1}{a}\sum_{i=1}^a \mu_i
is the average of the treatment means (the grand or overall mean), and
\tau_i = (\mu + \tau_i) - \mu = \mu_i - \mu = \text{treatment mean} - \text{grand mean} = \text{treatment effect}
\tau_a = -\tau_1 - \tau_2 - \dots - \tau_{a-1}
Hence, the mean for the a-th treatment is
\mu_a = \mu + \tau_a = \mu - \tau_1 - \tau_2 - \dots - \tau_{a-1}
Hence, the model needs only a parameters:
\mu, \tau_1, \tau_2, \dots, \tau_{a-1}
Equation (25.2) becomes
\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & -1 \end{pmatrix} \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon, \quad \beta \equiv (\mu, \tau_1, \tau_2)'
Equation (25.1) with \sum_i \tau_i = 0 becomes
\mathbf{b} = \begin{pmatrix} \hat{\mu} \\ \hat{\tau}_1 \\ \hat{\tau}_2 \end{pmatrix} = (\mathbf{X'X})^{-1}\mathbf{X'y} = \begin{pmatrix} \sum_i n_i & n_1 - n_3 & n_2 - n_3 \\ n_1 - n_3 & n_1 + n_3 & n_3 \\ n_2 - n_3 & n_3 & n_2 + n_3 \end{pmatrix}^{-1} \begin{pmatrix} Y_{..} \\ Y_{1.} - Y_{3.} \\ Y_{2.} - Y_{3.} \end{pmatrix} = \begin{pmatrix} \frac{1}{3}\sum_{i=1}^3 \bar{Y}_{i.} \\ \bar{Y}_{1.} - \frac{1}{3}\sum_{i=1}^3 \bar{Y}_{i.} \\ \bar{Y}_{2.} - \frac{1}{3}\sum_{i=1}^3 \bar{Y}_{i.} \end{pmatrix}
and \hat{\tau}_3 = -\hat{\tau}_1 - \hat{\tau}_2 = \bar{Y}_{3.} - \frac{1}{3}\sum_i \bar{Y}_{i.}
25.1.1.2.2 Restriction on first tau
In R, lm() uses the restriction \tau_1 = 0.
For the previous example, with n_1 = n_2 = n_3 = 2 and \tau_1 = 0,
Then the treatment means can be written as:
\begin{aligned} \mu_1 &= \mu + \tau_1 = \mu + 0 = \mu \\ \mu_2 &= \mu + \tau_2 \\ \mu_3 &= \mu + \tau_3 \end{aligned}
Hence, μ is the mean response for the first treatment
In the matrix form,
\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu \\ \tau_2 \\ \tau_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon
where \beta = (\mu, \tau_2, \tau_3)'
\mathbf{b} = \begin{pmatrix} \hat{\mu} \\ \hat{\tau}_2 \\ \hat{\tau}_3 \end{pmatrix} = (\mathbf{X'X})^{-1}\mathbf{X'y} = \begin{pmatrix} \sum_i n_i & n_2 & n_3 \\ n_2 & n_2 & 0 \\ n_3 & 0 & n_3 \end{pmatrix}^{-1} \begin{pmatrix} Y_{..} \\ Y_{2.} \\ Y_{3.} \end{pmatrix} = \begin{pmatrix} \bar{Y}_{1.} \\ \bar{Y}_{2.} - \bar{Y}_{1.} \\ \bar{Y}_{3.} - \bar{Y}_{1.} \end{pmatrix}
E(\mathbf{b}) = \beta = \begin{pmatrix} \mu \\ \tau_2 \\ \tau_3 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 - \mu_1 \\ \mu_3 - \mu_1 \end{pmatrix}
\begin{aligned} var(\mathbf{b}) &= \sigma^2 (\mathbf{X'X})^{-1} \\ var(\hat{\mu}) &= var(\bar{Y}_{1.}) = \sigma^2/n_1 \\ var(\hat{\tau}_2) &= var(\bar{Y}_{2.} - \bar{Y}_{1.}) = \sigma^2/n_2 + \sigma^2/n_1 \\ var(\hat{\tau}_3) &= var(\bar{Y}_{3.} - \bar{Y}_{1.}) = \sigma^2/n_3 + \sigma^2/n_1 \end{aligned}
Note: for all three parameterizations, the ANOVA table is the same.
All models have the same calculation for \hat{\mathbf{Y}}:
\hat{\mathbf{Y}} = \mathbf{X(X'X)^{-1}X'Y} = \mathbf{PY} = \mathbf{Xb}
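A short sketch of the three parameterizations in R, reusing the hypothetical data frame d from the sketch above; the no-intercept formula and "contr.sum" are base R:

```r
# Same model, three parameterizations; only the coefficient meanings differ.
coef(lm(y ~ 0 + trt, data = d))  # cell means model: mu_1, mu_2, mu_3
coef(lm(y ~ trt, data = d))      # R default tau_1 = 0: mu_1, tau_2, tau_3
coef(lm(y ~ trt, data = d,       # sum-to-zero restriction: mu, tau_1, tau_2
        contrasts = list(trt = "contr.sum")))
anova(lm(y ~ trt, data = d))     # the ANOVA table is identical for all three
```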
ANOVA Table
Source of Variation | SS | df | MS | F |
---|---|---|---|---|
Between Treatments | \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 = \mathbf{Y'(P - P_1)Y} | a-1 | \frac{SSTR}{a-1} | \frac{MSTR}{MSE} |
Error (within treatments) | \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2 = \mathbf{e'e} | N-a | \frac{SSE}{N-a} | |
Total (corrected) | \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 = \mathbf{Y'Y - Y'P_1Y} | N-1 | | |
where \mathbf{P}_1 = \frac{1}{N}\mathbf{J}
The F-statistic here has (a-1, N-a) degrees of freedom and takes the same value for all three parameterizations, but the null hypothesis is written a bit differently:
\begin{aligned} &H_0: \mu_1 = \mu_2 = \dots = \mu_a \\ &H_0: \mu + \tau_1 = \mu + \tau_2 = \dots = \mu + \tau_a \\ &H_0: \tau_1 = \tau_2 = \dots = \tau_a \end{aligned}
The F-test here serves as a preliminary analysis, to see whether there is any difference across the treatment levels. For more in-depth analysis, we consider different tests of treatment effects.
25.1.1.3 Testing of Treatment Effects
- A Single Treatment Mean \mu_i
- A Difference Between Two Treatment Means
- A Contrast Among Treatment Means
- A Linear Combination of Treatment Means
25.1.1.3.1 Single Treatment Mean
We have \hat{\mu}_i = \bar{Y}_{i.}, where
- E(\bar{Y}_{i.}) = \mu_i
- var(\bar{Y}_{i.}) = \sigma^2/n_i, estimated by s^2(\bar{Y}_{i.}) = MSE/n_i
Since \frac{\bar{Y}_{i.} - \mu_i}{s(\bar{Y}_{i.})} \sim t_{N-a} and the confidence interval for \mu_i is \bar{Y}_{i.} \pm t_{(1-\alpha/2; N-a)} s(\bar{Y}_{i.}),
then we can do a t-test for the means difference with some constant c
\begin{aligned} &H_0: \mu_i = c \\ &H_a: \mu_i \neq c \end{aligned}
where
T = \frac{\bar{Y}_{i.} - c}{s(\bar{Y}_{i.})}
follows t_{N-a} when H_0 is true.
If |T| > t_{(1-\alpha/2; N-a)}, we reject H_0.
25.1.1.3.2 Differences Between Treatment Means
Let D = \mu_i - \mu_{i'}; this is known as a pairwise comparison.
D can be estimated by \hat{D} = \bar{Y}_{i.} - \bar{Y}_{i'.}, which is unbiased (E(\hat{D}) = \mu_i - \mu_{i'}).
Since \bar{Y}_{i.} and \bar{Y}_{i'.} are independent,
var(\hat{D}) = var(\bar{Y}_{i.}) + var(\bar{Y}_{i'.}) = \sigma^2 (1/n_i + 1/n_{i'})
which can be estimated with
s^2(\hat{D}) = MSE(1/n_i + 1/n_{i'})
As with the single treatment inference,
\frac{\hat{D} - D}{s(\hat{D})} \sim t_{N-a}
hence,
\hat{D} \pm t_{(1-\alpha/2; N-a)} s(\hat{D})
Hypothesis tests:
\begin{aligned} &H_0: \mu_i = \mu_{i'} \\ &H_a: \mu_i \neq \mu_{i'} \end{aligned}
can be tested by the following statistic
T = \frac{\hat{D}}{s(\hat{D})}
which follows t_{N-a} under H_0; reject H_0 if |T| > t_{(1-\alpha/2; N-a)}
25.1.1.3.3 Contrast Among Treatment Means
Generalizing the comparison of two means, we have contrasts.
A contrast is a linear combination of treatment means:
L = \sum_{i=1}^a c_i \mu_i
where each c_i is a non-random constant and the coefficients sum to 0:
\sum_{i=1}^a c_i = 0
An unbiased estimator of a contrast L is
\hat{L} = \sum_{i=1}^a c_i \bar{Y}_{i.}
with E(\hat{L}) = L. Since the \bar{Y}_{i.}, i = 1,…,a, are independent,
var(\hat{L}) = var\left(\sum_{i=1}^a c_i \bar{Y}_{i.}\right) = \sum_{i=1}^a c_i^2 var(\bar{Y}_{i.}) = \sum_{i=1}^a c_i^2 \sigma^2/n_i = \sigma^2 \sum_{i=1}^a c_i^2/n_i
Estimation of the variance:
s^2(\hat{L}) = MSE \sum_{i=1}^a \frac{c_i^2}{n_i}
ˆL is normally distributed (since it is a linear combination of independent normal random variables).
Then, since SSE/\sigma^2 \sim \chi^2_{N-a},
\frac{\hat{L} - L}{s(\hat{L})} \sim t_{N-a}
The 1-\alpha confidence limits are given by
\hat{L} \pm t_{(1-\alpha/2; N-a)} s(\hat{L})
Hypothesis testing
\begin{aligned} &H_0: L = 0 \\ &H_a: L \neq 0 \end{aligned}
with
T = \frac{\hat{L}}{s(\hat{L})}
and reject H_0 if |T| > t_{(1-\alpha/2; N-a)}
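As a sketch, this contrast t-test can be computed by hand from an aov() fit; the contrast L = \mu_1 - (\mu_2 + \mu_3)/2 and the data d are the hypothetical ones from above:

```r
# Contrast test: L = mu_1 - (mu_2 + mu_3)/2; coefficients sum to zero.
cc   <- c(1, -1/2, -1/2)
ybar <- tapply(d$y, d$trt, mean)
ni   <- tapply(d$y, d$trt, length)
fit  <- aov(y ~ trt, data = d)
MSE  <- deviance(fit) / df.residual(fit)  # SSE / (N - a)
Lhat <- sum(cc * ybar)                    # estimate of L
sL   <- sqrt(MSE * sum(cc^2 / ni))        # s(L-hat)
Tstat <- Lhat / sL
2 * pt(abs(Tstat), df.residual(fit), lower.tail = FALSE)  # p-value
Lhat + c(-1, 1) * qt(0.975, df.residual(fit)) * sL        # 95% CI
```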
25.1.1.3.4 Linear Combination of Treatment Means
A linear combination is just like a contrast, L = \sum_{i=1}^a c_i \mu_i, but with no restrictions on the c_i coefficients.
Tests on a single treatment mean, on two treatment means, and on contrasts can all be considered from this same perspective:
\begin{aligned} &H_0: \sum c_i \mu_i = c \\ &H_a: \sum c_i \mu_i \neq c \end{aligned}
The t-statistics can equivalently be considered as F-tests: F = T^2, where F \sim F_{1, N-a}. Since the numerator degrees of freedom is always 1 in these cases, we refer to them as single-degree-of-freedom tests.
Multiple Contrasts
To test k \ge 2 contrasts simultaneously, let T_1, \dots, T_k be the t-statistics. The joint distribution of these random variables is a multivariate t-distribution (the tests are dependent since they are based on the same data).
Limitations for comparing multiple contrasts:
The confidence coefficient 1-\alpha only applies to a particular estimate, not a series of estimates; similarly, the Type I error rate \alpha applies to a particular test, not a series of tests. Example: for 3 t-tests each at \alpha = 0.05, if the tests were independent (which they are not), the family-wise coverage would be 0.95^3 = 0.857 (thus \alpha = 0.143, not 0.05).
The confidence coefficient 1-\alpha and significance level \alpha are appropriate only if the test was not suggested by the data.
- Often, the results of an experiment suggest important (i.e., potentially significant) relationships.
- The process of studying effects suggested by the data is called data snooping.
Multiple Comparison Procedures:
25.1.1.3.4.1 Tukey
All pairwise comparisons of factor level means: all pairs D = \mu_i - \mu_{i'}, or all tests of the form
\begin{aligned} &H_0: \mu_i - \mu_{i'} = 0 \\ &H_a: \mu_i - \mu_{i'} \neq 0 \end{aligned}
- When all sample sizes are equal (n_1 = n_2 = \dots = n_a), the Tukey method family confidence coefficient is exactly 1-\alpha and the significance level is exactly \alpha.
- When the sample sizes are not equal, the family confidence coefficient is greater than 1-\alpha (i.e., the significance level is less than \alpha), so the test is conservative.
- Tukey considers the studentized range distribution. If we have Y_1, \dots, Y_r, observations from a normal distribution with mean \mu and variance \sigma^2, define w = \max(Y_i) - \min(Y_i) as the range of the observations. Let s^2 be an estimate of \sigma^2 with v degrees of freedom. Then q(r, v) = \frac{w}{s} is called the studentized range. The distribution of q uses a special table.
Notes
- When we are not interested in testing all pairwise comparisons, the confidence coefficient for the family of comparisons under consideration will be greater than 1-\alpha (with the significance level less than \alpha).
- Tukey can be used for “data snooping” as long as the effects to be studied on the basis of preliminary data analysis are pairwise comparisons.
25.1.1.3.4.2 Scheffe
This method applies when the family of interest is the set of possible contrasts among the treatment means:
L = \sum_{i=1}^a c_i \mu_i
where \sum_{i=1}^a c_i = 0
That is, the family of all possible contrasts L or
\begin{aligned} &H_0: L = 0 \\ &H_a: L \neq 0 \end{aligned}
The family confidence level for the Scheffe procedure is exactly 1−α (i.e., significance level = α) whether the sample sizes are equal or not.
For simultaneous confidence intervals,
\hat{L} \pm S s(\hat{L})
where \hat{L} = \sum c_i \bar{Y}_{i.}, s^2(\hat{L}) = MSE \sum c_i^2/n_i, and S^2 = (a-1) f_{(1-\alpha; a-1, N-a)}
The Scheffe procedure considers
F = \frac{\hat{L}^2}{(a-1) s^2(\hat{L})}
where we reject H0 at the family significance level α if F>f(1−α;a−1,N−a)
Note
- Since applications of the Scheffe procedure never involve all conceivable contrasts, the true family confidence coefficient will be larger than 1-\alpha, so 1-\alpha is a lower bound. Thus, people often use a larger \alpha (e.g., a 90% confidence interval).
- Scheffe can be used for "data snooping" since the family of statements contains all possible contrasts.
- If only pairwise comparisons are to be considered, the Tukey procedure gives narrower confidence limits.
25.1.1.3.4.3 Bonferroni
Applicable whether the sample sizes are equal or unequal.
For the confidence intervals,
\hat{L} \pm B s(\hat{L})
where B = t_{(1-\alpha/(2g); N-a)} and g is the number of comparisons in the family.
Hypothesis testing
\begin{aligned} &H_0: L = 0 \\ &H_a: L \neq 0 \end{aligned}
Let T = \frac{\hat{L}}{s(\hat{L})} and reject H_0 if |T| > t_{(1-\alpha/(2g); N-a)}
Notes
- If all pairwise comparisons are of interest, the Tukey procedure is superior (narrower confidence intervals). If not, Bonferroni may be better.
- Bonferroni is better than Scheffe when the number of contrasts is about the same as (or less than) the number of treatment levels.
- Recommendation: compute all three and pick the smallest.
- Bonferroni can't be used for data snooping.
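A base-R sketch of the three procedures on the hypothetical data d from above; TukeyHSD() and pairwise.t.test() are base R, while the Scheffe multiplier is computed by hand:

```r
fit <- aov(y ~ trt, data = d)
TukeyHSD(fit, conf.level = 0.95)   # all pairwise, family level 1 - alpha
pairwise.t.test(d$y, d$trt,        # Bonferroni-adjusted pairwise tests
                p.adjust.method = "bonferroni")
# Scheffe multiplier: compare |T| for any contrast to S, not a t quantile
a <- nlevels(d$trt); N <- nrow(d)
(S <- sqrt((a - 1) * qf(0.95, a - 1, N - a)))
```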
25.1.2 Single Factor Random Effects Model
Also known as ANOVA Type II models.
Treatments are chosen at random from a larger population. We extend inference to all treatments in the population rather than restricting our inference to those treatments that happened to be selected for the study.
25.1.2.1 Random Cell Means
Yij=μi+ϵij
where
- \mu_i \sim N(\mu, \sigma^2_\mu) and independent
- \epsilon_{ij} \sim N(0, \sigma^2) and independent
\mu_i and \epsilon_{ij} are mutually independent for i = 1, \dots, a; j = 1, \dots, n
With all treatment sample sizes equal,
\begin{aligned} E(Y_{ij}) &= E(\mu_i) = \mu \\ var(Y_{ij}) &= var(\mu_i) + var(\epsilon_{ij}) = \sigma^2_\mu + \sigma^2 \end{aligned}
Since the Y_{ij} are not independent,
\begin{aligned} cov(Y_{ij}, Y_{ij'}) &= E(Y_{ij}Y_{ij'}) - E(Y_{ij})E(Y_{ij'}) \\ &= E(\mu_i^2 + \mu_i \epsilon_{ij'} + \mu_i \epsilon_{ij} + \epsilon_{ij}\epsilon_{ij'}) - \mu^2 \\ &= \sigma^2_\mu + \mu^2 - \mu^2 = \sigma^2_\mu, \quad j \neq j' \end{aligned}
\begin{aligned} cov(Y_{ij}, Y_{i'j'}) &= E(\mu_i \mu_{i'} + \mu_i \epsilon_{i'j'} + \mu_{i'}\epsilon_{ij} + \epsilon_{ij}\epsilon_{i'j'}) - \mu^2 \\ &= \mu^2 - \mu^2 = 0, \quad i \neq i' \end{aligned}
Hence,
- all observations have the same variance
- any two observations from the same treatment have covariance σ2μ
- The correlation between any two responses from the same treatment:
\rho(Y_{ij}, Y_{ij'}) = \frac{\sigma^2_\mu}{\sigma^2_\mu + \sigma^2}, \quad j \neq j'
Inference
Intraclass Correlation Coefficient
\frac{\sigma^2_\mu}{\sigma^2 + \sigma^2_\mu}
which measures the proportion of total variability of Yij accounted for by the variance of μi
\begin{aligned} &H_0: \sigma^2_\mu = 0 \\ &H_a: \sigma^2_\mu \neq 0 \end{aligned}
H0 implies μi=μ for all i, which can be tested by the F-test in ANOVA.
Although the interpretations of the Single Factor Fixed Effects Model and the Single Factor Random Effects Model differ, the ANOVA table is the same for the one-factor model. The difference is in the expected mean squares:
Random Effects Model | Fixed Effects Model |
---|---|
E(MSE) = \sigma^2 | E(MSE) = \sigma^2 |
E(MSTR) = \sigma^2 + n\sigma^2_\mu | E(MSTR) = \sigma^2 + \frac{\sum_i n_i(\mu_i - \mu_.)^2}{a-1} |
If \sigma^2_\mu = 0, then MSE and MSTR have the same expectation (\sigma^2); otherwise, E(MSTR) > E(MSE). Large values of the statistic
F = \frac{MSTR}{MSE}
suggest we reject H_0,
since F \sim F_{(a-1, a(n-1))} when H_0 holds. If F > f_{(1-\alpha; a-1, a(n-1))}, we reject H_0.
If the sample sizes are not equal, the F-test can still be used, but the df are a-1 and N-a.
25.1.2.1.1 Estimation of μ
An unbiased estimator of E(Y_{ij}) = \mu is the grand mean: \hat{\mu} = \bar{Y}_{..}
The variance of this estimator is
var(\bar{Y}_{..}) = var\left(\frac{1}{a}\sum_i \bar{Y}_{i.}\right) = \frac{1}{a^2}\sum_i var(\bar{Y}_{i.}) = \frac{1}{a^2}\sum_i (\sigma^2_\mu + \sigma^2/n) = \frac{1}{a}(\sigma^2_\mu + \sigma^2/n) = \frac{n\sigma^2_\mu + \sigma^2}{an}
An unbiased estimator of this variance is s^2(\bar{Y}_{..}) = \frac{MSTR}{an}. Thus, \frac{\bar{Y}_{..} - \mu}{s(\bar{Y}_{..})} \sim t_{a-1}
A 1-\alpha confidence interval is \bar{Y}_{..} \pm t_{(1-\alpha/2; a-1)} s(\bar{Y}_{..})
25.1.2.1.2 Estimation of σ2μ/(σ2μ+σ2)
In the random and fixed effects models, MSTR and MSE are independent. When the sample sizes are equal (n_i = n for all i),
\frac{MSTR/(n\sigma^2_\mu + \sigma^2)}{MSE/\sigma^2} \sim f_{(a-1, a(n-1))}
P\left(f_{(\alpha/2; a-1, a(n-1))} \le \frac{MSTR/(n\sigma^2_\mu + \sigma^2)}{MSE/\sigma^2} \le f_{(1-\alpha/2; a-1, a(n-1))}\right) = 1-\alpha
\begin{aligned} L &= \frac{1}{n}\left(\frac{MSTR}{MSE} \cdot \frac{1}{f_{(1-\alpha/2; a-1, a(n-1))}} - 1\right) \\ U &= \frac{1}{n}\left(\frac{MSTR}{MSE} \cdot \frac{1}{f_{(\alpha/2; a-1, a(n-1))}} - 1\right) \end{aligned}
The lower and upper (L^*, U^*) confidence limits for \frac{\sigma^2_\mu}{\sigma^2_\mu + \sigma^2} are
L^* = \frac{L}{1+L} \qquad U^* = \frac{U}{1+U}
If the lower limit for \sigma^2_\mu/\sigma^2 is negative, it is customary to set L = 0.
25.1.2.1.3 Estimation of σ2
Since a(n-1)MSE/\sigma^2 \sim \chi^2_{a(n-1)}, the (1-\alpha) confidence interval for \sigma^2 is
\frac{a(n-1)MSE}{\chi^2_{(1-\alpha/2; a(n-1))}} \le \sigma^2 \le \frac{a(n-1)MSE}{\chi^2_{(\alpha/2; a(n-1))}}
This can also be used when the sample sizes are not equal; the df are then N-a.
25.1.2.1.4 Estimation of σ2μ
Since E(MSE) = \sigma^2 and E(MSTR) = \sigma^2 + n\sigma^2_\mu, we have
\sigma^2_\mu = \frac{E(MSTR) - E(MSE)}{n}
An unbiased estimator of \sigma^2_\mu is given by
s^2_\mu = \frac{MSTR - MSE}{n}
If s^2_\mu < 0, set s^2_\mu = 0.
If the sample sizes are not equal,
s^2_\mu = \frac{MSTR - MSE}{n'}
where n' = \frac{1}{a-1}\left(\sum_i n_i - \frac{\sum_i n_i^2}{\sum_i n_i}\right)
There are no exact confidence intervals for \sigma^2_\mu, but we can construct approximate intervals.
The Satterthwaite procedure can be used to construct approximate confidence intervals for linear combinations of expected mean squares.
\sigma^2_\mu is such a linear combination:
\sigma^2_\mu = \frac{1}{n} E(MSTR) + \left(-\frac{1}{n}\right) E(MSE)
In general, let
S = d_1 E(MS_1) + \dots + d_h E(MS_h)
where d_i are coefficients.
An unbiased estimator of S is
\hat{S} = d_1 MS_1 + ...+ d_h MS_h
Let df_i be the degrees of freedom associated with the mean square MS_i. The Satterthwaite approximation:
\frac{(df)\hat{S}}{S} \sim \chi^2_{df}
where
df = \frac{(d_1MS_1+...+d_hMS_h)^2}{(d_1MS_1)^2/df_1 + ...+ (d_hMS_h)^2/df_h}
An approximate 1-\alpha confidence interval for S:
\frac{(df)\hat{S}}{\chi^2_{1-\alpha/2;df}} \le S \le \frac{(df)\hat{S}}{\chi^2_{\alpha/2;df}}
For the single factor random effects model
\frac{(df)s^2_\mu}{\chi^2_{1-\alpha/2;df}} \le \sigma^2_\mu \le \frac{(df)s^2_\mu}{\chi^2_{\alpha/2;df}}
where
df = \frac{(n s^2_\mu)^2}{\frac{(MSTR)^2}{a-1}+ \frac{(MSE)^2}{a(n-1)}}
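A sketch of the point estimate and Satterthwaite interval for \sigma^2_\mu, computed from the ANOVA mean squares of simulated, hypothetical balanced data:

```r
# Balanced random effects design: a = 4 sampled treatments, n = 5 each.
set.seed(2)
a <- 4; n <- 5
g <- data.frame(trt = factor(rep(1:a, each = n)),
                y   = rep(rnorm(a, sd = 2), each = n) + rnorm(a * n))
av   <- anova(lm(y ~ trt, data = g))
MSTR <- av["trt", "Mean Sq"]
MSE  <- av["Residuals", "Mean Sq"]
s2mu <- (MSTR - MSE) / n   # point estimate; set to 0 if negative
df   <- (n * s2mu)^2 / (MSTR^2 / (a - 1) + MSE^2 / (a * (n - 1)))
c(lower = df * s2mu / qchisq(0.975, df),  # approximate 95% interval
  upper = df * s2mu / qchisq(0.025, df))
```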
25.1.2.2 Random Treatment Effects Model
Define \tau_i = \mu_i - E(\mu_i) = \mu_i - \mu
so that \mu_i = \mu + \tau_i and
Y_{ij} = \mu + \tau_i + \epsilon_{ij}
where
- \mu = constant, common to all observations
- \tau_i \sim N(0,\sigma^2_\tau) independent (random variables)
- \epsilon_{ij} \sim N(0,\sigma^2) independent.
- \tau_{i}, \epsilon_{ij} are independent (i=1,…,a; j =1,..,n)
- this model is concerned only with the balanced single factor ANOVA.
Diagnostics Measures
- Non-constant error variance (plots, Levene test, Hartley test).
- Non-independence of errors (plots, Durbin-Watson test).
- Outliers (plots, regression methods).
- Non-normality of error terms (plots, Shapiro-Wilk, Anderson-Darling).
- Omitted Variable Bias (plots)
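A base-R sketch of these checks on the fixed effects fit from above; note that Levene's test lives in the car package, so Bartlett's test stands in here as the equal-variance check:

```r
fit <- aov(y ~ trt, data = d)
par(mfrow = c(1, 2))
plot(fit, which = 1:2)             # residuals vs fitted; normal Q-Q plot
shapiro.test(residuals(fit))       # normality of the error terms
bartlett.test(y ~ trt, data = d)   # homogeneity of variance across groups
```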
Remedial
- Weighted Least Squares
- Transformations
- Non-parametric Procedures.
Note
Fixed effects ANOVA is relatively robust to
- non-normality
- unequal variances when sample sizes are approximately equal, at least for the F-test and multiple comparisons. However, single comparisons of treatment means are sensitive to unequal variances.
Lack of independence can seriously affect both fixed and random effects ANOVA.
25.1.3 Two Factor Fixed Effect ANOVA
A multi-factor experiment
- is more efficient,
- provides more information, and
- gives more validity to the findings.
25.1.3.1 Balanced
Assumption:
- All treatment sample sizes are equal
- All treatment means are of equal importance
Assume:
- Factor A has a levels and Factor B has b levels; all a \times b factor level combinations are considered.
- The number of replications for each treatment combination is n, giving N = abn observations in the study.
25.1.3.1.1 Cell Means Model
Y_{ijk} = \mu_{ij} + \epsilon_{ijk}
where
- \mu_{ij} are fixed parameters (cell means)
- i = 1,...,a = the levels of Factor A
- j = 1,...,b = the levels of Factor B.
- \epsilon_{ijk} \sim \text{indep } N(0,\sigma^2) for i = 1,...,a, j = 1,..,b and k = 1,..,n
And
\begin{aligned} E(Y_{ijk}) &= \mu_{ij} \\ var(Y_{ijk}) &= var(\epsilon_{ijk}) = \sigma^2 \end{aligned}
Hence,
Y_{ijk} \sim \text{indep } N(\mu_{ij},\sigma^2)
And the model is
\mathbf{Y} = \mathbf{X} \beta + \epsilon
Thus,
\begin{aligned} E(\mathbf{Y}) &= \mathbf{X}\beta \\ var(\mathbf{Y}) &= \sigma^2 \mathbf{I} \end{aligned}
Interaction
(\alpha \beta)_{ij} = \mu_{ij} - (\mu_{..}+ \alpha_i + \beta_j)
where
- \mu_{..} = \sum_i \sum_j \mu_{ij}/ab is the grand mean
- \alpha_i = \mu_{i.}-\mu_{..} is the main effect for factor A at the i-th level
- \beta_j = \mu_{.j} - \mu_{..} is the main effect for factor B at the j-th level
- (\alpha \beta)_{ij} is the interaction effect when factor A is at the i-th level and factor B is at the j-th level.
- (\alpha \beta)_{ij} = \mu_{ij} - \mu_{i.}-\mu_{.j}+ \mu_{..}
Examine interactions:
- Examine whether all \mu_{ij} can be expressed as the sums \mu_{..} + \alpha_i + \beta_j
- Examine whether the difference between the mean responses for any two levels of factor B is the same for all levels of factor A.
- Examine whether the difference between the mean response for any two levels of factor A is the same for all levels of factor B
- Examine whether the treatment mean curves for the different factor levels in a treatment plot are parallel.
For j = 1,...,b
\begin{aligned} \sum_i(\alpha \beta)_{ij} &= \sum_i (\mu_{ij} - \mu_{..} - \alpha_i - \beta_j) \\ &= \sum_i \mu_{ij} - a \mu_{..} - \sum_i \alpha_i - a \beta_j \\ &= a \mu_{.j} - a \mu_{..}- \sum_i (\mu_{i.} - \mu_{..}) - a(\mu_{.j}-\mu_{..}) \\ &= a \mu_{.j} - a \mu_{..} - a \mu_{..}+ a \mu_{..} - a (\mu_{.j} - \mu_{..}) \\ &= 0 \end{aligned}
Similarly, \sum_j (\alpha \beta)_{ij} = 0 for i = 1,...,a, and \sum_i \sum_j (\alpha \beta)_{ij} = 0, \sum_i \alpha_i = 0, \sum_j \beta_j = 0
25.1.3.1.2 Factor Effects Model
\begin{aligned} \mu_{ij} &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} \\ Y_{ijk} &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk} \end{aligned}
where
- \mu_{..} is a constant
- \alpha_i are constants subject to the restriction \sum_i \alpha_i=0
- \beta_j are constants subject to the restriction \sum_j \beta_j = 0
- (\alpha \beta)_{ij} are constants subject to the restriction \sum_i(\alpha \beta)_{ij} = 0 for j=1,...,b and \sum_j(\alpha \beta)_{ij} = 0 for i = 1,...,a
- \epsilon_{ijk} \sim \text{indep } N(0,\sigma^2) for k = 1,..,n
We have
\begin{aligned} E(Y_{ijk}) &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij}\\ var(Y_{ijk}) &= \sigma^2 \\ Y_{ijk} &\sim N (\mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij}, \sigma^2) \end{aligned}
We have 1+a+b+ab parameters. But there are ab parameters in the Cell Means Model. In the Factor Effects Model, the restrictions limit the number of parameters that can be estimated:
\begin{aligned} 1 &\text{ for } \mu_{..} \\ (a-1) &\text{ for } \alpha_i \\ (b-1) &\text{ for } \beta_j \\ (a-1)(b-1) &\text{ for } (\alpha \beta)_{ij} \end{aligned}
Hence, there are
1 + a - 1 + b - 1 + ab - a- b + 1 = ab
parameters in the model.
We can have several restrictions when considering the model in the form \mathbf{Y} = \mathbf{X} \beta + \epsilon
One way:
\begin{aligned} \alpha_a &= -\alpha_1 - \alpha_2 - ... - \alpha_{a-1} \\ \beta_b &= -\beta_1 - \beta_2 - ... - \beta_{b-1} \\ (\alpha \beta)_{ib} &= -(\alpha \beta)_{i1} -(\alpha \beta)_{i2} -...-(\alpha \beta)_{i,b-1} ; i = 1,..,a \\ (\alpha \beta)_{aj}& = -(\alpha \beta)_{1j}-(\alpha \beta)_{2j} - ... -(\alpha \beta)_{a-1,j}; j = 1,..,b \end{aligned}
We can fit the model by least squares or maximum likelihood
Cell Means Model
minimize
Q = \sum_i \sum_j \sum_k (Y_{ijk}-\mu_{ij})^2
estimators
\begin{aligned} \hat{\mu}_{ij} &= \bar{Y}_{ij.} \\ \hat{Y}_{ijk} &= \bar{Y}_{ij.} \\ e_{ijk} = Y_{ijk} - \hat{Y}_{ijk} &= Y_{ijk} - \bar{Y}_{ij.} \end{aligned}
Factor Effects Model
Q = \sum_i \sum_j \sum_k (Y_{ijk} - \mu_{..} - \alpha_i - \beta_j - (\alpha \beta)_{ij})^2
subject to the restrictions
\begin{aligned} \sum_i \alpha_i &= 0 \\ \sum_j \beta_j &= 0 \\ \sum_i (\alpha \beta)_{ij} &= 0 \\ \sum_j (\alpha \beta)_{ij} &= 0 \end{aligned}
estimators
\begin{aligned} \hat{\mu}_{..} &= \bar{Y}_{...} \\ \hat{\alpha}_i &= \bar{Y}_{i..} - \bar{Y}_{...} \\ \hat{\beta}_j &= \bar{Y}_{.j.}-\bar{Y}_{...} \\ (\hat{\alpha \beta})_{ij} &= \bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.}+ \bar{Y}_{...} \end{aligned}
The fitted values
\hat{Y}_{ijk} = \bar{Y}_{...}+ (\bar{Y}_{i..}- \bar{Y}_{...})+ (\bar{Y}_{.j.}- \bar{Y}_{...}) + (\bar{Y}_{ij.} - \bar{Y}_{i..}-\bar{Y}_{.j.}+\bar{Y}_{...}) = \bar{Y}_{ij.}
where
\begin{aligned} e_{ijk} &= Y_{ijk} - \bar{Y}_{ij.} \\ e_{ijk} &\sim \text{ indep } (0,\sigma^2) \end{aligned}
and
\begin{aligned} s^2_{\hat{\mu}..} &= \frac{MSE}{nab} \\ s^2_{\hat{\alpha}_i} &= MSE(\frac{1}{nb} - \frac{1}{nab}) \\ s^2_{\hat{\beta}_j} &= MSE(\frac{1}{na} - \frac{1}{nab}) \\ s^2_{(\hat{\alpha\beta})_{ij}} &= MSE (\frac{1}{n} - \frac{1}{na}- \frac{1}{nb} + \frac{1}{nab}) \end{aligned}
25.1.3.1.2.1 Partitioning the Total Sum of Squares
Y_{ijk} - \bar{Y}_{...} = \bar{Y}_{ij.} - \bar{Y}_{...} + Y_{ijk} - \bar{Y}_{ij.}
Y_{ijk} - \bar{Y}_{...}: Total deviation
\bar{Y}_{ij.} - \bar{Y}_{...}: Deviation of treatment mean from overall mean
Y_{ijk} - \bar{Y}_{ij.}: Deviation of observation around treatment mean (residual).
\begin{aligned} \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2 &= n \sum_i \sum_j (\bar{Y}_{ij.}- \bar{Y}_{...})^2+ \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij.})^2 \\ SSTO &= SSTR + SSE \end{aligned}
(cross product terms are 0)
\bar{Y}_{ij.}- \bar{Y}_{...} = \bar{Y}_{i..}-\bar{Y}_{...} + \bar{Y}_{.j.}-\bar{Y}_{...} + \bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...}
squaring and summing:
\begin{aligned} n\sum_i \sum_j (\bar{Y}_{ij.}-\bar{Y}_{...})^2 &= nb\sum_i (\bar{Y}_{i..}-\bar{Y}_{...})^2 + na \sum_j (\bar{Y}_{.j.}-\bar{Y}_{...})^2 \\ &+ n \sum_i \sum_j (\bar{Y}_{ij.}-\bar{Y}_{i..}- \bar{Y}_{.j.}+ \bar{Y}_{...})^2 \\ SSTR &= SSA + SSB + SSAB \end{aligned}
The interaction term from
\begin{aligned} SSAB &= SSTO - SSE - SSA - SSB \\ SSAB &= SSTR - SSA - SSB \end{aligned}
where
- SSA is the factor A sum of squares (measures the variability of the estimated factor A level means \bar{Y}_{i..})- the more variable, the larger SSA
- SSB is the factor B sum of squares
- SSAB is the interaction sum of squares, measuring the variability of the estimated interactions.
25.1.3.1.2.2 Partitioning the df
N = abn cases and ab treatments.
For one-way ANOVA and regression, the partition has df:
SS: SSTO = SSTR + SSE
df: N-1 = (ab-1) + (N-ab)
we must further partition the ab-1 df with SSTR
SSTR = SSA + SSB + SSAB
ab-1 = (a-1) + (b-1) + (a-1)(b-1)
- df_{SSA} = a-1: a treatment deviations but 1 df is lost due to the restriction \sum (\bar{Y}_{i..}- \bar{Y}_{...})=0
- df_{SSB} = b-1: b treatment deviations but 1 df is lost due to the restriction \sum (\bar{Y}_{.j.}- \bar{Y}_{...})=0
- df_{SSAB} = (a-1)(b-1) = (ab-1)-(a-1)-(b-1): ab interactions with (a+b-1) restrictions, so df = ab - (a+b-1) = (a-1)(b-1)
25.1.3.1.2.3 Mean Squares
\begin{aligned} MSA &= \frac{SSA}{a-1}\\ MSB &= \frac{SSB}{b-1}\\ MSAB &= \frac{SSAB}{(a-1)(b-1)} \end{aligned}
The expected mean squares are
\begin{aligned} E(MSE) &= \sigma^2 \\ E(MSA) &= \sigma^2 + nb \frac{\sum \alpha_i^2}{a-1} = \sigma^2 + nb \frac{\sum(\mu_{i.}-\mu_{..})^2}{a-1} \\ E(MSB) &= \sigma^2 + na \frac{\sum \beta_j^2}{b-1} = \sigma^2 + na \frac{\sum(\mu_{.j}-\mu_{..})^2}{b-1} \\ E(MSAB) &= \sigma^2 + n \frac{\sum \sum (\alpha \beta)_{ij}^2}{(a-1)(b-1)} = \sigma^2 + n \frac{\sum \sum (\mu_{ij}- \mu_{i.}- \mu_{.j}+ \mu_{..} )^2}{(a-1)(b-1)} \end{aligned}
If there are no factor A main effects (all \mu_{i.} are equal, i.e., all \alpha_i = 0), then MSA and MSE have the same expectation; otherwise E(MSA) > E(MSE). The same holds for factor B and for the interaction effects, in which case we can examine F-statistics.
Interaction
\begin{aligned} H_0: \mu_{ij}- \mu_{i.} - \mu_{.j} + \mu_{..} = 0 && \text{for all i,j} \\ H_a: \mu_{ij}- \mu_{i.} - \mu_{.j} + \mu_{..} \neq 0 && \text{for some i,j} \end{aligned}
or
\begin{aligned} &H_0: \text{All}(\alpha \beta)_{ij} = 0 \\ &H_a: \text{Not all} (\alpha \beta) = 0 \end{aligned}
Let F = \frac{MSAB}{MSE}. When H_0 is true, F \sim f_{((a-1)(b-1),ab(n-1))}. So reject H_0 when F > f_{(1-\alpha; (a-1)(b-1), ab(n-1))}
Factor A main effects:
\begin{aligned} &H_0: \mu_{1.} = \mu_{2.} = ... = \mu_{a.} \\ &H_a: \text{Not all $\mu_{i.}$ are equal} \end{aligned}
or
\begin{aligned} &H_0: \alpha_1 = ... = \alpha_a = 0 \\ &H_a: \text{Not all $\alpha_i$ are equal to 0} \end{aligned}
F= \frac{MSA}{MSE} and reject H_0 if F>f_{(1-\alpha;a-1,ab(n-1))}
25.1.3.1.2.4 Two-way ANOVA
Source of Variation | SS | df | MS | F |
---|---|---|---|---|
Factor A | SSA | a-1 | MSA = SSA/(a-1) | MSA/MSE |
Factor B | SSB | b-1 | MSB = SSB/(b-1) | MSB/MSE |
AB interactions | SSAB | (a-1)(b-1) | MSAB = SSAB/[(a-1)(b-1)] | MSAB/MSE |
Error | SSE | ab(n-1) | MSE = SSE/[ab(n-1)] | |
Total (corrected) | SSTO | abn - 1 | | |
Doing two-way ANOVA means you always check the interaction first, because if there are significant interactions, checking the significance of the main effects becomes moot.
The main effects concern the mean responses for levels of one factor averaged over the levels of the other factor. When interaction is present, we can’t conclude that a given factor has no effect, even if these averages are the same. It means that the effect of the factor depends on the level of the other factor.
On the other hand, if you can establish that there is no interaction, then you can consider inference on the factor main effects, which are then said to be additive.
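A sketch of a balanced two-way fixed effects fit on simulated, hypothetical data (a = 2, b = 3, n = 4), reading the interaction line first:

```r
set.seed(3)
d2 <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:4)
d2$y <- rnorm(nrow(d2)) + as.numeric(d2$A) + 0.5 * as.numeric(d2$B)
fit2 <- aov(y ~ A * B, data = d2)     # expands to A + B + A:B
summary(fit2)                         # check the A:B line before A and B
with(d2, interaction.plot(B, A, y))   # near-parallel curves: no interaction
```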
And we can also compare factor means like the Single Factor Fixed Effects Model using Tukey, Scheffe, Bonferroni.
We can also consider contrasts in the 2-way model
L = \sum c_i \mu_i
where \sum c_i =0
which is estimated by
\hat{L} = \sum c_i \bar{Y}_{i..}
with variance
\sigma^2(\hat{L}) = \frac{\sigma^2}{bn} \sum c_i^2
and variance estimate
\frac{MSE}{bn} \sum c_i^2
Orthogonal Contrasts
\begin{aligned} L_1 &= \sum c_i \mu_i, \sum c_i = 0 \\ L_2 &= \sum d_i \mu_i , \sum d_i = 0 \end{aligned}
these contrasts are said to be orthogonal if
\sum \frac{c_i d_i}{n_i} = 0
in balanced case \sum c_i d_i =0
\begin{aligned} cov(\hat{L}_1, \hat{L}_2) &= cov(\sum_i c_i \bar{Y}_{i..}, \sum_l d_l \bar{Y}_{l..}) \\ &= \sum_i \sum_l c_i d_l cov(\bar{Y}_{i..},\bar{Y}_{l..}) \\ &= \sum_i c_i d_i \frac{\sigma^2}{bn} = 0 \end{aligned}
Orthogonal contrasts can be used to further partition the model sum of squares. There are many sets of orthogonal contrasts and thus, many ways to partition the sum of squares.
A special set of orthogonal contrasts that are used when the levels of a factor can be assigned values on a metric scale are called orthogonal polynomials
Coefficients can be found for the special case of
- equal spaced levels (e.g., (0 15 30 45 60))
- equal sample sizes (n_1 = n_2 = ... = n_{ab})
We can define the SS for a given contrast:
SS_L = \frac{\hat{L}^2}{\sum_{i=1}^a (c^2_i/bn_i)}
T = \frac{\hat{L}}{\sqrt{MSE\sum_{i=1}^a(c_i^2/bn_i)}} \sim t
Moreover,
t^2_{(1-\alpha/2;df)}=F_{(1-\alpha;1,df)}
So,
\frac{SS_L}{MSE} \sim F_{(1, df_{MSE})} \text{ when } H_0: L = 0 \text{ holds}
since all contrasts have 1 d.f.
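A sketch of orthogonal polynomial contrasts in base R with contr.poly(), for hypothetical equally spaced dose levels with equal replication:

```r
# Five equally spaced dose levels (0, 15, 30, 45, 60), n = 3 each.
dose <- factor(rep(c(0, 15, 30, 45, 60), each = 3))
contr.poly(5)   # linear, quadratic, cubic, quartic coefficient columns
set.seed(4)
y <- 0.05 * rep(c(0, 15, 30, 45, 60), each = 3) + rnorm(15)
summary(lm(y ~ dose, contrasts = list(dose = "contr.poly")))
# each dose.L, dose.Q, ... line is a single-df contrast test (t^2 = F)
```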
25.1.3.2 Unbalanced
We could have unequal numbers of replications for all treatment combinations:
- Observational studies
- Dropouts in designed studies
- Larger sample sizes for inexpensive treatments
- Sample sizes to match population makeup.
Assume that each factor combination has at least 1 observation (no empty cells)
Consider the same model as:
Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}
where the cell sample sizes are n_{ij}, with
\begin{aligned} n_{i.} &= \sum_j n_{ij} \\ n_{.j} &= \sum_i n_{ij} \\ n_T &= \sum_i \sum_j n_{ij} \end{aligned}
Problem here is that
SSTO \neq SSA + SSB + SSAB + SSE
(the design is non-orthogonal)
- For i = 1,...,a-1,
u_i = \begin{cases} +1 & \text{if the obs is from the i-th level of factor A} \\ -1 & \text{if the obs is from the a-th level of factor A} \\ 0 & \text{otherwise} \\ \end{cases}
- For j=1,...,b-1,
v_j = \begin{cases} +1 & \text{if the obs is from the j-th level of factor B} \\ -1 & \text{if the obs is from the b-th level of factor B} \\ 0 & \text{otherwise} \\ \end{cases}
We can use these indicator variables as predictor variables and \mu_{..}, \alpha_i ,\beta_j, (\alpha \beta)_{ij} as unknown parameters.
Y = \mu_{..} + \sum_{i=1}^{a-1} \alpha_i u_i + \sum_{j=1}^{b-1} \beta_j v_j + \sum_{i=1}^{a-1} \sum_{j=1}^{b-1}(\alpha \beta)_{ij} u_i v_j + \epsilon
To test hypotheses, we use the extra sum of squares idea.
For interaction effects
\begin{aligned} &H_0: \text{all } (\alpha \beta)_{ij} = 0 \\ &H_a: \text{not all }(\alpha \beta)_{ij} = 0 \end{aligned}
Or to test
\begin{aligned} &H_0: \beta_1 = \beta_2 = \beta_3 = 0 \\ &H_a: \text{not all } \beta_j = 0 \end{aligned}
Analysis of Factor Means
(e.g., contrasts) is analogous to the balanced case, with modifications in the formulas for means and standard errors to account for unequal sample sizes.
Or, we can fit the cell means model and consider it from a regression perspective.
If you have empty cells (i.e., some factor combinations have no observation), then the equivalent regression approach can’t be used. But you can still do partial analyses
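A sketch of the extra sum of squares tests via nested model comparisons with anova(), after unbalancing the hypothetical d2 from above:

```r
# Drop one observation so the design is non-orthogonal
# (SSTO no longer equals SSA + SSB + SSAB + SSE).
d3     <- d2[-1, ]
full   <- lm(y ~ A * B, data = d3)
no.int <- lm(y ~ A + B, data = d3)
anova(no.int, full)    # extra SS F-test of H0: all (alpha beta)_ij = 0
no.B <- lm(y ~ A, data = d3)
anova(no.B, no.int)    # extra SS F-test of H0: all beta_j = 0
```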
25.1.4 Two-Way Random Effects ANOVA
Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}
where
- \mu_{..}: constant
- \alpha_i \sim N(0,\sigma^2_{\alpha}), i = 1,..,a (independent)
- \beta_j \sim N(0,\sigma^2_{\beta}), j = 1,..,b (independent)
- (\alpha \beta)_{ij} \sim N(0,\sigma^2_{\alpha \beta}),i=1,...,a,j=1,..,b (independent)
- \epsilon_{ijk} \sim N(0,\sigma^2) (independent)
All \alpha_i, \beta_j, (\alpha \beta)_{ij} are pairwise independent
Theoretical means, variances, and covariances are
\begin{aligned} E(Y_{ijk}) &= \mu_{..} \\ var(Y_{ijk}) &= \sigma^2_Y= \sigma^2_\alpha + \sigma^2_\beta + \sigma^2_{\alpha \beta} + \sigma^2 \end{aligned}
So
Y_{ijk} \sim N(\mu_{..},\sigma^2_\alpha + \sigma^2_\beta + \sigma^2_{\alpha \beta} + \sigma^2)
\begin{aligned} cov(Y_{ijk},Y_{ij'k'}) &= \sigma^2_{\alpha}, \quad j \neq j' \\ cov(Y_{ijk},Y_{i'jk'}) &= \sigma^2_{\beta}, \quad i \neq i'\\ cov(Y_{ijk},Y_{ijk'}) &= \sigma^2_\alpha + \sigma^2_{\beta} + \sigma^2_{\alpha \beta}, \quad k \neq k' \\ cov(Y_{ijk},Y_{i'j'k'}) &= 0, \quad i \neq i', j \neq j' \end{aligned}
25.1.5 Two-Way Mixed Effects ANOVA
25.1.5.1 Balanced
When one factor is fixed while the other has randomly selected treatment levels, we have a mixed effects model, or a mixed model.
Restricted mixed model for 2-way ANOVA:
Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}
where
- \mu_{..}: constant
- \alpha_i: fixed effects with constraints subject to restriction \sum \alpha_i = 0
- \beta_j \sim indep N(0,\sigma^2_\beta)
- (\alpha \beta)_{ij} \sim N(0,\frac{a-1}{a}\sigma^2_{\alpha \beta}), subject to the restriction \sum_i (\alpha \beta)_{ij} = 0 for all j. The variance is written with the proportion \frac{a-1}{a} for convenience; it makes the expected mean squares simpler (others assume var((\alpha \beta)_{ij}) = \sigma^2_{\alpha \beta})
- cov((\alpha \beta)_{ij},(\alpha \beta)_{i'j'}) = - \frac{1}{a} \sigma^2_{\alpha \beta}, i \neq i'
- \epsilon_{ijk} \sim indep N(0,\sigma^2)
- \beta_j, (\alpha \beta)_{ij}, \epsilon_{ijk} are pairwise independent
Two-way mixed models can also be written in an "unrestricted" form, with no restrictions on the interaction effects (\alpha \beta)_{ij}; there they are pairwise independent.
Let \beta_j^*, (\alpha \beta)^*_{ij} be the unrestricted random effects, and (\overline{\alpha \beta})_{.j}^* the interaction effects averaged over the fixed factor for each level of the random factor B. Then
\begin{aligned} \beta_j &= \beta_j^* + (\overline{\alpha \beta})_{.j}^* \\ (\alpha \beta)_{ij} &= (\alpha \beta)_{ij}^* - (\overline{\alpha \beta})_{.j}^* \end{aligned}
Some consider the restricted model to be more general, but here we consider the restricted form.
\begin{aligned} E(Y_{ijk}) &= \mu_{..} + \alpha_i \\ var(Y_{ijk}) &= \sigma^2_\beta + \frac{a-1}{a} \sigma^2_{\alpha \beta} + \sigma^2 \end{aligned}
Responses from the same random factor (B) level are correlated
\begin{aligned} cov(Y_{ijk},Y_{ijk'}) &= E(Y_{ijk}Y_{ijk'}) - E(Y_{ijk})E(Y_{ijk'}) \\ &= \sigma^2_\beta + \frac{a-1}{a} \sigma^2_{\alpha \beta} , k \neq k' \end{aligned}
Similarly,
\begin{aligned} cov(Y_{ijk},Y_{i'jk'}) &= \sigma^2_\beta - \frac{1}{a} \sigma^2_{\alpha \beta}, \quad i \neq i' \\ cov(Y_{ijk},Y_{i'j'k'}) &= 0, \quad j \neq j' \end{aligned}
Hence, the only way two Y observations are independent is when they do not share the same random effect.
An advantage of the restricted mixed model is that two observations from the same random factor B level can be positively or negatively correlated. In the unrestricted model, they can only be positively correlated.
Mean Square | df | Fixed ANOVA (A, B fixed) | Random ANOVA (A, B random) | Mixed ANOVA (A fixed, B random) |
---|---|---|---|---|
MSA | a-1 | \sigma^2 + nb \frac{\sum \alpha_i^2}{a-1} | \sigma^2 + nb\sigma^2_\alpha + n \sigma^2_{\alpha \beta} | \sigma^2 + n \sigma^2_{\alpha \beta} + nb \frac{\sum \alpha_i^2}{a-1} |
MSB | b-1 | \sigma^2 + na \frac{\sum \beta_j^2}{b-1} | \sigma^2 + na\sigma^2_\beta + n \sigma^2_{\alpha \beta} | \sigma^2 + na\sigma^2_\beta |
MSAB | (a-1)(b-1) | \sigma^2 + n \frac{\sum \sum (\alpha \beta)^2_{ij}}{(a-1)(b-1)} | \sigma^2 + n \sigma^2_{\alpha \beta} | \sigma^2 + n \sigma^2_{\alpha \beta} |
MSE | ab(n-1) | \sigma^2 | \sigma^2 | \sigma^2 |
For fixed, random, and mixed models (balanced), the ANOVA table sums of squares calculations are identical. (also true for df and mean squares). The only difference is with the expected mean squares, thus the test statistics.
In Random ANOVA, we test
\begin{aligned} &H_0: \sigma^2_\alpha = 0 \\ &H_a: \sigma^2_\alpha > 0 \end{aligned}
by considering F = \frac{MSA}{MSAB} \sim F_{(a-1, (a-1)(b-1))} under H_0.
The same test statistic is used for the mixed model, but in that case we are testing the null hypothesis that all of the \alpha_i = 0.
The test statistic is different for the same null hypothesis under the fixed effects model.
Test for effects of | Fixed ANOVA (A, B fixed) | Random ANOVA (A, B random) | Mixed ANOVA (A fixed, B random) |
---|---|---|---|
Factor A | \frac{MSA}{MSE} | \frac{MSA}{MSAB} | \frac{MSA}{MSAB} |
Factor B | \frac{MSB}{MSE} | \frac{MSB}{MSAB} | \frac{MSB}{MSE} |
AB interactions | \frac{MSAB}{MSE} | \frac{MSAB}{MSE} | \frac{MSAB}{MSE} |
Estimation Of Variance Components
In random and mixed effects models, we are interested in estimating the variance components
Consider the variance component \sigma^2_\beta in the mixed ANOVA:
\frac{E(MSB)-E(MSE)}{na} = \frac{(\sigma^2 + na \sigma^2_\beta) - \sigma^2}{na} = \sigma^2_\beta
which can be estimated with
\hat{\sigma}^2_\beta = \frac{MSB - MSE}{na}
Confidence intervals for variance components can be constructed (approximately) by using the Satterthwaite procedure or the MLS procedure (like the 1-way random effects)
Estimation of Fixed Effects in Mixed Models
\begin{aligned} \hat{\alpha}_i &= \bar{Y}_{i..} - \bar{Y}_{...} \\ \hat{\mu}_{i.} &= \bar{Y}_{...} + (\bar{Y}_{i..}- \bar{Y}_{...}) = \bar{Y}_{i..} \\ \sigma^2(\hat{\alpha}_i) &= \frac{\sigma^2 + n \sigma^2_{\alpha \beta}}{bn} = \frac{E(MSAB)}{bn} \\ s^2(\hat{\alpha}_i) &= \frac{MSAB}{bn} \end{aligned}
Contrasts on the Fixed Effects
\begin{aligned} L &= \sum c_i \alpha_i \\ \sum c_i &= 0 \\ \hat{L} &= \sum c_i \hat{\alpha}_i \\ \sigma^2(\hat{L}) &= \sum c^2_i \sigma^2 (\hat{\alpha}_i) \\ s^2(\hat{L}) &= \frac{MSAB}{bn} \sum c^2_i \end{aligned}
Confidence intervals and tests can be constructed as usual
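A sketch using the lme4 package (an assumption; it is not part of this section) to fit such a mixed model on the hypothetical d2 from above, with A fixed and B random; note that lmer() fits the unrestricted form:

```r
# install.packages("lme4")  # assumed available
library(lme4)
fitm <- lmer(y ~ A + (1 | B) + (1 | A:B), data = d2)
summary(fitm)   # variance components for B, A:B, and the residual
fixef(fitm)     # estimated fixed effects for factor A
```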
25.1.5.2 Unbalanced
For a mixed model with a = 2, b = 4
\begin{aligned} Y_{ijk} &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk} \\ var(\beta_j)&= \sigma^2_\beta \\ var((\alpha \beta)_{ij})&= \frac{2-1}{2}\sigma^2_{\alpha \beta} = \frac{\sigma^2_{\alpha \beta}}{2} \\ var(\epsilon_{ijk}) &= \sigma^2 \\ E(Y_{ijk}) &= \mu_{..} + \alpha_i \\ var(Y_{ijk}) &= \sigma^2_{\beta} + \frac{\sigma^2_{\alpha \beta}}{2} + \sigma^2 \\ cov(Y_{ijk},Y_{ijk'}) &= \sigma^2_\beta + \frac{\sigma^2_{\alpha \beta}}{2}, \quad k \neq k' \\ cov(Y_{ijk},Y_{i'jk'}) &= \sigma^2_{\beta} - \frac{\sigma^2_{\alpha \beta}}{2}, \quad i \neq i' \\ cov(Y_{ijk},Y_{i'j'k'}) &= 0, \quad j \neq j' \end{aligned}
assume
\mathbf{Y} \sim N(\mathbf{X}\beta, M)
where M is block diagonal
The density function is
f(\mathbf{Y}) = \frac{1}{(2\pi)^{N/2}|\mathbf{M}|^{1/2}}\exp\left(-\frac{1}{2}(\mathbf{Y} - \mathbf{X}\beta)' \mathbf{M}^{-1}(\mathbf{Y}-\mathbf{X}\beta)\right)
if we knew the variance components, we could use GLS:
\hat{\beta}_{GLS} = \mathbf{(X'M^{-1}X)^{-1}X'M^{-1}Y}
but we usually don’t know the variance components \sigma^2, \sigma^2_\beta, \sigma^2_{\alpha \beta} that make up M
Another way to get estimates is maximum likelihood estimation,
where we maximize the log likelihood
\ln L = - \frac{N}{2} \ln (2\pi) - \frac{1}{2}\ln|\mathbf{M}| - \frac{1}{2} (\mathbf{Y}-\mathbf{X}\beta)'\mathbf{M}^{-1}(\mathbf{Y}-\mathbf{X}\beta)
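As a sketch (again assuming lme4), setting REML = FALSE maximizes this log likelihood directly:

```r
# ML rather than REML: beta and the variance components in M are
# estimated jointly by maximizing ln L.
fit.ml <- lmer(y ~ A + (1 | B) + (1 | A:B), data = d2, REML = FALSE)
logLik(fit.ml)   # the maximized log likelihood
```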