24.1 Completely Randomized Design

A Completely Randomized Design (CRD) is the simplest type of experimental design, where experimental units are randomly assigned to treatments.

Consider a treatment factor A with a \ge 2 treatment levels. Each experimental unit is randomly assigned to one of these levels. The number of units in each group can be:

  • Balanced: All groups have equal sample sizes n.
  • Unbalanced: Groups have different sample sizes ni (for i=1,...,a).

The total sample size is given by:

N = \sum_{i=1}^{a} n_i

The number of possible assignments of units to treatments is:

k = \frac{N!}{n_1! n_2! \cdots n_a!}

Each assignment has an equal probability of being selected: 1/k. The response of each experimental unit is denoted as Yij, where:

  • i indexes the treatment group.
  • j indexes the individual unit within treatment i.
Treatment Response Table

Treatment        1                  2                  \dots    a
Observations     Y_{11}             Y_{21}             \dots    Y_{a1}
                 Y_{12}             Y_{22}             \dots    Y_{a2}
                 \vdots             \vdots                      \vdots
Sample Mean      \bar{Y}_{1.}       \bar{Y}_{2.}       \dots    \bar{Y}_{a.}
Sample SD        s_1                s_2                \dots    s_a

Where:

\bar{Y}_{i.} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}

s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2

The grand mean is:

\bar{Y}_{..} = \frac{1}{N} \sum_{i} \sum_{j} Y_{ij}
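
A minimal R sketch of these summary statistics, using a hypothetical data frame `dat` with response `y` and treatment factor `trt` (this toy data set is reused in later sketches):

```r
# Hypothetical data: a = 3 treatments with unequal group sizes
set.seed(1)
dat <- data.frame(
  trt = factor(rep(c("T1", "T2", "T3"), times = c(4, 5, 6))),
  y   = rnorm(15, mean = rep(c(10, 12, 9), times = c(4, 5, 6)), sd = 2)
)

# Group sizes n_i, treatment means Ybar_i., and SDs s_i
tapply(dat$y, dat$trt, length)  # n_i
tapply(dat$y, dat$trt, mean)    # treatment sample means
tapply(dat$y, dat$trt, sd)      # treatment sample SDs

# Grand mean Ybar_.. = (1/N) * sum of all observations
mean(dat$y)
```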


24.1.1 Single-Factor Fixed Effects ANOVA

Also known as One-Way ANOVA or ANOVA Type I Model.

The total variability in the response variable Yij can be decomposed as follows:

Y_{ij} - \bar{Y}_{..} = Y_{ij} - \bar{Y}_{..} + \bar{Y}_{i.} - \bar{Y}_{i.} = (\bar{Y}_{i.} - \bar{Y}_{..}) + (Y_{ij} - \bar{Y}_{i.})

where:

  • The first term represents between-treatment variability (deviation of treatment means from the grand mean).
  • The second term represents within-treatment variability (deviation of observations from their treatment mean).

Thus, we partition the total sum of squares (SSTO) as:

\sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 = \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 + \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2

Or equivalently:

SSTO=SSTR+SSE

Where:

  • SSTO (Total SS): Total variability in the data.
  • SSTR (Treatment SS): Variability due to differences between treatment means.
  • SSE (Error SS): Variability within treatments (unexplained variance).

Degrees of freedom (d.f.):

(N - 1) = (a - 1) + (N - a)

where one degree of freedom is lost for the total corrected SSTO because of the estimation of the grand mean (\sum_i \sum_j (Y_{ij} - \bar{Y}_{..}) = 0), and one for SSTR (\sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..}) = 0).

Mean squares:

MSTR = \frac{SSTR}{a-1}, \quad MSE = \frac{SSE}{N-a}

ANOVA Table

Source of Variation          SS                                                  df       MS
Between Treatments           SSTR = \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2    a - 1    SSTR/(a-1)
Error (within treatments)    SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2        N - a    SSE/(N-a)
Total (corrected)            SSTO = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2       N - 1

For a linear model interpretation of ANOVA, we have either

  1. Cell Means Model
  2. Treatment Effect (Factor Effects Model)

24.1.1.1 Cell Means Model

The cell means model describes the response as:

Y_{ij} = \mu_i + \epsilon_{ij}

where:

  • Y_{ij}: Response for unit j in treatment i.
  • \mu_i: Fixed population mean for treatment i.
  • \epsilon_{ij} \sim N(0, \sigma^2): Independent errors.
  • E(Y_{ij}) = \mu_i, \quad var(Y_{ij}) = \sigma^2.

All observations are assumed to have equal variance across treatments.


Example: ANOVA with a=3 Treatments

Consider a case with three treatments (a=3), where each treatment has two replicates (n1=n2=n3=2). The response vector can be expressed in matrix form as:

\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon

where:

  • X_{k,ij} = 1 if the k-th treatment is applied to unit (i, j).
  • X_{k,ij} = 0 otherwise.

Note: There is no intercept term in this model.

The least squares estimator for β is given by:

\mathbf{b} = \begin{pmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{pmatrix} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \begin{pmatrix} n_1 & 0 & 0 \\ 0 & n_2 & 0 \\ 0 & 0 & n_3 \end{pmatrix}^{-1} \begin{pmatrix} Y_{1.} \\ Y_{2.} \\ Y_{3.} \end{pmatrix} = \begin{pmatrix} \bar{Y}_{1.} \\ \bar{Y}_{2.} \\ \bar{Y}_{3.} \end{pmatrix}

Thus, the estimated treatment means are:

\hat{\mu}_i = \bar{Y}_{i.}, \quad i = 1, 2, 3

This estimator \mathbf{b} = [\bar{Y}_{1.}, \bar{Y}_{2.}, \bar{Y}_{3.}]' is the best linear unbiased estimator (BLUE) of \beta (i.e., E(\mathbf{b}) = \beta).
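
A sketch of the cell means fit in R (intercept removed), reusing the hypothetical `dat`; the coefficients are exactly the treatment sample means:

```r
# Cell means parameterization: drop the intercept so each coefficient is mu_i
fit_cell <- lm(y ~ trt - 1, data = dat)
coef(fit_cell)                  # equals tapply(dat$y, dat$trt, mean)
model.matrix(fit_cell)[1:3, ]   # indicator columns, one per treatment
```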

Since \mathbf{b} \sim N(\beta, \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}), the variance of the estimated treatment means is:

var(\mathbf{b}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1} = \sigma^2 \begin{pmatrix} 1/n_1 & 0 & 0 \\ 0 & 1/n_2 & 0 \\ 0 & 0 & 1/n_3 \end{pmatrix}

Thus, the variance of each estimated treatment mean is:

var(b_i) = var(\hat{\mu}_i) = \frac{\sigma^2}{n_i}, \quad i = 1, 2, 3

The mean squared error (MSE) is given by:

MSE = \frac{1}{N-a} \sum_{i=1}^{a} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2 = \frac{1}{N-a} \sum_{i=1}^{a} (n_i - 1) \underbrace{\left[\frac{1}{n_i - 1} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2\right]}_{= s_i^2} = \frac{1}{N-a} \sum_{i=1}^{a} (n_i - 1) s_i^2.

where s2i is the sample variance within the i-th treatment group.

Since E(s_i^2) = \sigma^2, we get:

E(MSE) = \frac{1}{N-a} \sum_i (n_i - 1)\sigma^2 = \sigma^2

Thus, MSE is an unbiased estimator of σ2, regardless of whether the treatment means are equal.


The expected mean square for treatments (MSTR) is:

E(MSTR) = \sigma^2 + \frac{\sum_i n_i (\mu_i - \mu_.)^2}{a - 1}

where:

\mu_. = \frac{\sum_{i=1}^{a} n_i \mu_i}{\sum_{i=1}^{a} n_i}

If all treatment means are equal (\mu_1 = \mu_2 = \dots = \mu_a = \mu_.), then:

E(MSTR) = \sigma^2


F-Test for Equality of Treatment Means

We test the null hypothesis:

H_0: \mu_1 = \mu_2 = \dots = \mu_a

against the alternative:

H_a: \text{at least one } \mu_i \text{ differs}

The test statistic is:

F = \frac{MSTR}{MSE}

  • Large values of F suggest rejecting H0 (since MSTR will be larger than MSE when Ha is true).
  • Values of F near 1 suggest that we fail to reject H0.

Since SSTR/\sigma^2 and SSE/\sigma^2 are independent chi-square random variables, the ratio of mean squares (each sum of squares scaled by its degrees of freedom) follows, under H_0:

F \sim F_{a-1, N-a}

Decision Rule:

  • If F \le F_{(a-1, N-a; 1-\alpha)}, fail to reject H_0.
  • If F > F_{(a-1, N-a; 1-\alpha)}, reject H_0.

If there are only two treatments (a=2), the ANOVA F-test reduces to the two-sample t-test:

F = t^2

where:

t = \frac{\bar{Y}_{1.} - \bar{Y}_{2.}}{\sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}
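
A sketch of the one-way F-test in R, reusing the hypothetical `dat`, plus a check that F = t^2 when only two treatments are compared:

```r
# One-way fixed effects ANOVA F-test (a = 3 treatments)
fit <- aov(y ~ trt, data = dat)
summary(fit)   # SSTR, SSE, MSTR, MSE, F, and p-value

# With only two treatments, F equals the square of the pooled two-sample t statistic
dat2 <- subset(dat, trt %in% c("T1", "T2"))
dat2$trt <- droplevels(dat2$trt)
t_stat <- t.test(y ~ trt, data = dat2, var.equal = TRUE)$statistic
c(F = summary(aov(y ~ trt, data = dat2))[[1]]$"F value"[1],
  t_squared = t_stat^2)
```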


24.1.1.2 Treatment Effects (Factor Effects)

Besides cell means model, we have another way to formalize one-way ANOVA:

Y_{ij} = \mu + \tau_i + \epsilon_{ij}

where:

  • Y_{ij} is the j-th response for the i-th treatment.
  • \tau_i is the i-th treatment effect.
  • \mu is the constant component common to all observations.
  • \epsilon_{ij} are independent random errors, assumed to be normally distributed: \epsilon_{ij} \sim N(0, \sigma^2).

For example, if we have a=3 treatments and n1=n2=n3=2 observations per treatment, the model representation is:

\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \tau_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon

However, the matrix:

\mathbf{X}'\mathbf{X} = \begin{pmatrix} \sum_i n_i & n_1 & n_2 & n_3 \\ n_1 & n_1 & 0 & 0 \\ n_2 & 0 & n_2 & 0 \\ n_3 & 0 & 0 & n_3 \end{pmatrix}

is singular, meaning \mathbf{X}'\mathbf{X} is not invertible. This results in an infinite number of possible solutions for \mathbf{b}.

To resolve this, we impose restrictions on the parameters to ensure that X has full rank. Regardless of the restriction used, the expected value remains:

E(Y_{ij}) = \mu + \tau_i = \mu_i = \text{mean response for the } i\text{-th treatment}


24.1.1.2.1 Restriction on Sum of Treatment Effects

One common restriction is:

\sum_{i=1}^{a} \tau_i = 0

which implies that:

\mu = \frac{1}{a} \sum_{i=1}^{a} (\mu + \tau_i)

meaning that μ represents the grand mean (the overall mean response across treatments).

Each treatment effect can then be expressed as:

\tau_i = \mu_i - \mu = \text{treatment mean} - \text{grand mean}

Since \sum_i \tau_i = 0, we can solve for \tau_a as:

\tau_a = -(\tau_1 + \tau_2 + \dots + \tau_{a-1})

Thus, the mean for the a-th treatment is:

\mu_a = \mu + \tau_a = \mu - (\tau_1 + \tau_2 + \dots + \tau_{a-1})

This reduces the number of parameters from a+1 to just a, meaning we estimate:

\mu, \tau_1, \tau_2, \dots, \tau_{a-1}

Rewriting Equation (24.2):

\begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & -1 \end{pmatrix} \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix}, \qquad \mathbf{y} = \mathbf{X}\beta + \epsilon

where \beta = [\mu, \tau_1, \tau_2]'.
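
A sketch of the sum-to-zero parameterization in R via `contr.sum`, reusing the hypothetical `dat`:

```r
# Sum-to-zero ("effects") coding: tau_1 + ... + tau_a = 0
fit_sum <- lm(y ~ trt, data = dat, contrasts = list(trt = contr.sum))
coef(fit_sum)            # intercept = mu (unweighted mean of treatment means), then tau_1, tau_2
-sum(coef(fit_sum)[-1])  # tau_3 recovered from the sum-to-zero restriction
```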


24.1.1.2.2 Restriction on the First τ

In R, the default parameterization in lm() for a one-way ANOVA model sets τ1=0. This effectively chooses the first treatment (or group) as a baseline or reference, making its treatment effect τ1 equal to zero.

Consider the last example with three treatments, each having two observations, n1=n2=n3=2. Under the restriction τ1=0, the treatment means can be expressed as:

\mu_1 = \mu + \tau_1 = \mu + 0 = \mu, \quad \mu_2 = \mu + \tau_2, \quad \mu_3 = \mu + \tau_3.

Hence, μ becomes the mean response for the first treatment.

We write the observations in vector form:

\mathbf{y} = \begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}}_{\mathbf{X}} \begin{pmatrix} \mu \\ \tau_2 \\ \tau_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \epsilon_{21} \\ \epsilon_{22} \\ \epsilon_{31} \\ \epsilon_{32} \end{pmatrix} = \mathbf{X}\beta + \epsilon,

where

\beta = \begin{pmatrix} \mu \\ \tau_2 \\ \tau_3 \end{pmatrix}.

The OLS estimator is:

\mathbf{b} = \begin{pmatrix} \hat{\mu} \\ \hat{\tau}_2 \\ \hat{\tau}_3 \end{pmatrix} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}.

In our specific case with equal sample sizes (n_1 = n_2 = n_3 = 2), the (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} calculation yields:

\mathbf{b} = \begin{pmatrix} \sum_i n_i & n_2 & n_3 \\ n_2 & n_2 & 0 \\ n_3 & 0 & n_3 \end{pmatrix}^{-1} \begin{pmatrix} Y_{..} \\ Y_{2.} \\ Y_{3.} \end{pmatrix} = \begin{pmatrix} \bar{Y}_{1.} \\ \bar{Y}_{2.} - \bar{Y}_{1.} \\ \bar{Y}_{3.} - \bar{Y}_{1.} \end{pmatrix}

where \bar{Y}_{1.}, \bar{Y}_{2.}, and \bar{Y}_{3.} are the sample means for treatments 1, 2, and 3, respectively.

Taking the expectation of b confirms:

E(\mathbf{b}) = \beta = \begin{pmatrix} \mu \\ \tau_2 \\ \tau_3 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 - \mu_1 \\ \mu_3 - \mu_1 \end{pmatrix}.

Recall that:

var(\mathbf{b}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.

Hence,

var(\hat{\mu}) = var(\bar{Y}_{1.}) = \frac{\sigma^2}{n_1}, \quad var(\hat{\tau}_2) = var(\bar{Y}_{2.} - \bar{Y}_{1.}) = \frac{\sigma^2}{n_2} + \frac{\sigma^2}{n_1}, \quad var(\hat{\tau}_3) = var(\bar{Y}_{3.} - \bar{Y}_{1.}) = \frac{\sigma^2}{n_3} + \frac{\sigma^2}{n_1}.
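
A sketch of R's default reference-level (treatment) coding, reusing the hypothetical `dat`:

```r
# R default ("treatment") coding: the first factor level is the baseline, tau_1 = 0
fit_ref <- lm(y ~ trt, data = dat)
coef(fit_ref)
# (Intercept)  = Ybar_1.                        (mean of the reference treatment)
# trtT2, trtT3 = Ybar_2. - Ybar_1., Ybar_3. - Ybar_1.
```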


24.1.1.3 Equivalence of Parameterizations

Despite having different ways of writing the model, all three parameterizations yield the same ANOVA table:

  1. Model 1: Y_{ij} = \mu_i + \epsilon_{ij}.
  2. Model 2: Y_{ij} = \mu + \tau_i + \epsilon_{ij}, where \sum_i \tau_i = 0.
  3. Model 3: Y_{ij} = \mu + \tau_i + \epsilon_{ij}, where \tau_1 = 0.

All three lead to the same fitted values, because

\hat{\mathbf{Y}} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} = \mathbf{P}\mathbf{Y} = \mathbf{X}\mathbf{b}.


24.1.1.4 ANOVA Table

The generic form of the ANOVA table is:

Source of Variation          SS                                                                                                         df       MS                     F
Between Treatments           SSTR = \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 = \mathbf{Y}'(\mathbf{P} - \mathbf{P}_1)\mathbf{Y}       a - 1    \frac{SSTR}{a-1}       \frac{MSTR}{MSE}
Error (within treatments)    SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2 = \mathbf{e}'\mathbf{e}                                      N - a    \frac{SSE}{N-a}
Total (corrected)            SSTO = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 = \mathbf{Y}'\mathbf{Y} - \mathbf{Y}'\mathbf{P}_1\mathbf{Y}  N - 1

where \mathbf{P}_1 = \frac{1}{N}\mathbf{J}, N = \sum_i n_i, and \mathbf{J} is the N \times N matrix of ones.

The F-statistic has (a-1, N-a) degrees of freedom and its numeric value is unchanged under any of the three parameterizations. The slight difference lies in how we state the null hypothesis:

H_0: \mu_1 = \mu_2 = \dots = \mu_a, \qquad H_0: \mu + \tau_1 = \mu + \tau_2 = \dots = \mu + \tau_a, \qquad H_0: \tau_1 = \tau_2 = \dots = \tau_a.

The F-test here serves as a preliminary analysis, to see whether there is any difference among the treatment means. For more in-depth analysis, we consider different tests of treatment effects.

24.1.1.5 Testing of Treatment Effects

24.1.1.5.1 Single Treatment Mean

For a single treatment group, the sample mean serves as an estimate of the population mean:

\hat{\mu}_i = \bar{Y}_{i.}

where:

  • E(\bar{Y}_{i.}) = \mu_i, indicating unbiasedness.
  • var(\bar{Y}_{i.}) = \frac{\sigma^2}{n_i}, estimated by s^2(\bar{Y}_{i.}) = \frac{MSE}{n_i}.

Since the standardized test statistic

T = \frac{\bar{Y}_{i.} - \mu_i}{s(\bar{Y}_{i.})}

follows a t-distribution with N-a degrees of freedom (t_{N-a}), a (1-\alpha)100\% confidence interval for \mu_i is:

\bar{Y}_{i.} \pm t_{1-\alpha/2; N-a} \; s(\bar{Y}_{i.})

To test whether \mu_i is equal to some constant c, we set up the hypothesis:

H_0: \mu_i = c \qquad H_a: \mu_i \neq c

The test statistic:

T = \frac{\bar{Y}_{i.} - c}{s(\bar{Y}_{i.})} \sim t_{N-a}

Under H_0, we reject H_0 at the \alpha level if:

|T| > t_{1-\alpha/2; N-a}

24.1.1.5.2 Differences Between Treatment Means

The difference between two treatment means, also called a pairwise comparison, is given by:

D = \mu_i - \mu_{i'}

which is estimated by:

\hat{D} = \bar{Y}_{i.} - \bar{Y}_{i'.}

This estimate is unbiased since:

E(\hat{D}) = \mu_i - \mu_{i'}

Since \bar{Y}_{i.} and \bar{Y}_{i'.} are independent, the variance of \hat{D} is:

var(\hat{D}) = var(\bar{Y}_{i.}) + var(\bar{Y}_{i'.}) = \sigma^2 \left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)

which is estimated by:

s^2(\hat{D}) = MSE\left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)

Using the same inference structure as the single treatment mean:

\frac{\hat{D} - D}{s(\hat{D})} \sim t_{N-a}

A (1-\alpha)100\% confidence interval for D is:

\hat{D} \pm t_{1-\alpha/2; N-a} \; s(\hat{D})

For hypothesis testing:

H_0: \mu_i = \mu_{i'} \qquad H_a: \mu_i \neq \mu_{i'}

we use the test statistic:

T = \frac{\hat{D}}{s(\hat{D})} \sim t_{N-a}

We reject H_0 at the \alpha level if:

|T| > t_{1-\alpha/2; N-a}

24.1.1.5.3 Contrast Among Treatment Means

To generalize the comparison of two means, we introduce contrasts.

A contrast is a linear combination of treatment means:

L = \sum_{i=1}^{a} c_i \mu_i

where the coefficients c_i are non-random constants that satisfy the constraint:

\sum_{i=1}^{a} c_i = 0

This ensures that contrasts focus on relative comparisons rather than absolute magnitudes.

An unbiased estimator of L is given by:

\hat{L} = \sum_{i=1}^{a} c_i \bar{Y}_{i.}

Since expectation is a linear operator:

E(\hat{L}) = \sum_{i=1}^{a} c_i E(\bar{Y}_{i.}) = \sum_{i=1}^{a} c_i \mu_i = L

Thus, ˆL is an unbiased estimator of L.

Since the sample means ˉYi. are independent, the variance of ˆL is:

var(\hat{L}) = var\left(\sum_{i=1}^{a} c_i \bar{Y}_{i.}\right) = \sum_{i=1}^{a} c_i^2 \, var(\bar{Y}_{i.}) = \sum_{i=1}^{a} c_i^2 \frac{\sigma^2}{n_i} = \sigma^2 \sum_{i=1}^{a} \frac{c_i^2}{n_i}

Since \sigma^2 is unknown, we estimate it using the mean squared error:

s^2(\hat{L}) = MSE \sum_{i=1}^{a} \frac{c_i^2}{n_i}

Since \hat{L} is a linear combination of independent normal random variables, it follows a normal distribution:

\hat{L} \sim N\left(L, \; \sigma^2 \sum_{i=1}^{a} \frac{c_i^2}{n_i}\right)

Since SSE/\sigma^2 \sim \chi^2_{N-a} and MSE = SSE/(N-a), we use the t-distribution:

\frac{\hat{L} - L}{s(\hat{L})} \sim t_{N-a}

Thus, a (1α)100% confidence interval for L is:

\hat{L} \pm t_{1-\alpha/2; N-a} \; s(\hat{L})

To test whether a specific contrast equals zero:

H_0: L = 0 \quad \text{(no difference in the contrast)} \qquad H_a: L \neq 0 \quad \text{(significant contrast)}

We use the test statistic:

T = \frac{\hat{L}}{s(\hat{L})} \sim t_{N-a}

We reject H_0 at the \alpha level if:

|T| > t_{1-\alpha/2; N-a}
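
A by-hand sketch of estimating and testing a contrast, reusing the hypothetical `dat` and a hypothetical contrast L = \mu_1 - (\mu_2 + \mu_3)/2:

```r
# Contrast L = mu_1 - (mu_2 + mu_3)/2; coefficients sum to zero
cvec <- c(1, -0.5, -0.5)
ni   <- tapply(dat$y, dat$trt, length)
ybar <- tapply(dat$y, dat$trt, mean)
mse  <- summary(aov(y ~ trt, data = dat))[[1]]$"Mean Sq"[2]  # residual mean square
df_e <- sum(ni) - length(ni)                                  # N - a

L_hat <- sum(cvec * ybar)
se_L  <- sqrt(mse * sum(cvec^2 / ni))
t_obs <- L_hat / se_L

c(estimate = L_hat,
  lower    = L_hat - qt(0.975, df_e) * se_L,
  upper    = L_hat + qt(0.975, df_e) * se_L,
  p_value  = 2 * pt(-abs(t_obs), df_e))
```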

24.1.1.5.4 Linear Combination of Treatment Means

A linear combination of treatment means extends the idea of a contrast:

L = \sum_{i=1}^{a} c_i \mu_i

Unlike contrasts, there are no restrictions on the coefficients ci (i.e., they do not need to sum to zero).

Since tests on a single treatment mean, pairwise differences, and contrasts are all special cases of this general form, we can express the hypothesis test as:

H_0: \sum_{i=1}^{a} c_i \mu_i = c \qquad H_a: \sum_{i=1}^{a} c_i \mu_i \neq c

The test statistic follows a t-distribution:

T = \frac{\hat{L} - c}{s(\hat{L})} \sim t_{N-a}

Since squaring a t-distributed variable results in an F-distributed variable,

F = T^2 \sim F_{1, N-a}

This means that all such tests can be viewed as single-degree-of-freedom F-tests, since the numerator degrees of freedom is always 1.


Multiple Contrasts

When testing k \ge 2 contrasts simultaneously, the test statistics T_1, T_2, \dots, T_k follow a multivariate t-distribution, since they are dependent (as they are based on the same data).

Limitations of Multiple Comparisons

  1. Inflation of Type I Error:
    The confidence coefficient (1-\alpha) applies to a single estimate, not a series of estimates. Similarly, the Type I error rate \alpha applies to an individual test, not a collection of tests.

    Example: If three t-tests are performed at α=0.05, and if they were independent (which they are not), then:

    (1 - 0.05)^3 = 0.857

    meaning the overall Type I error rate would be approximately 0.143, not 0.05.

  2. Data Snooping Concern:
    The significance level α is valid only if the test was planned before examining the data.

    • Often, an experiment suggests relationships to investigate.
    • Exploring effects based on observed data is known as data snooping.

To address these issues, we use Multiple Comparison Procedures, such as:

  • Tukey – for all pairwise comparisons of treatment means.
  • Scheffé – for all possible contrasts.
  • Bonferroni – for a fixed number of planned comparisons.

24.1.1.5.4.1 Tukey

Used for all pairwise comparisons of treatment means:

D = \mu_i - \mu_{i'}

Hypothesis test:

H_0: \mu_i - \mu_{i'} = 0 \qquad H_a: \mu_i - \mu_{i'} \neq 0

Properties:

  • When sample sizes are equal (n1=n2=...=na), the family confidence coefficient is exactly (1α).
  • When sample sizes are unequal, the method is conservative (i.e., the actual significance level is less than α).

The Tukey test is based on the studentized range:

w = \max_i(Y_i) - \min_i(Y_i)

If Y_1, ..., Y_r are observations from a normal distribution with mean \mu and variance \sigma^2, then the statistic:

q(r, v) = \frac{w}{s}

follows the studentized range distribution, which requires a special table.

Notes:

  • When testing only a subset of pairwise comparisons, the confidence coefficient exceeds (1-\alpha), making the test more conservative.

  • Tukey’s method can be used for data snooping, as long as the investigated effects are pairwise comparisons.
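
A minimal sketch of Tukey's procedure in R, reusing the hypothetical `dat` from earlier:

```r
# Tukey HSD: simultaneous CIs and adjusted p-values for all pairwise differences
fit <- aov(y ~ trt, data = dat)
TukeyHSD(fit, conf.level = 0.95)
```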


24.1.1.5.4.2 Scheffé

Scheffé’s method is used for testing all possible contrasts:

L = \sum_{i=1}^{a} c_i \mu_i, \quad \text{where} \quad \sum_{i=1}^{a} c_i = 0

Hypothesis test:

\begin{aligned} &H_0: L = 0 \\ &H_a: L \neq 0 \end{aligned}

Properties:

  • Valid for any set of contrasts, making it the most general multiple comparison procedure.
  • The family confidence level is exactly (1-\alpha), regardless of sample sizes.

Simultaneous Confidence Intervals:

\hat{L} \pm S s(\hat{L})

where:

  • \hat{L} = \sum c_i \bar{Y}_{i.}
  • s^2(\hat{L}) = MSE \sum \frac{c_i^2}{n_i}
  • S^2 = (a-1) f_{1-\alpha; a-1, N-a}

Test Statistic:

F = \frac{\hat{L}^2}{(a-1) s^2(\hat{L})}

We reject H_0 if:

F > f_{1-\alpha; a-1, N-a}

Notes:

  • Finite Family Correction: Since we never test all possible contrasts in practice, the actual confidence coefficient is greater than (1-\alpha). Thus, some researchers use a higher \alpha (e.g., a 90% confidence level instead of 95%).
  • Scheffé is useful for data snooping, since it applies to any contrast.
  • If only pairwise comparisons are needed, Tukey’s method gives narrower confidence intervals than Scheffé.

24.1.1.5.4.3 Bonferroni

The Bonferroni correction is applicable regardless of whether sample sizes are equal or unequal. It is particularly useful when a small number of planned comparisons are of interest.

A (1-\alpha)100\% simultaneous confidence interval for a set of g comparisons is:

\hat{L} \pm B s(\hat{L})

where:

B = t_{1-\alpha/(2g), N-a}

and g is the number of comparisons in the family.

To test:

\begin{aligned} &H_0: L = 0 \\ &H_a: L \neq 0 \end{aligned}

we use the test statistic:

T = \frac{\hat{L}}{s(\hat{L})}

Reject H_0 if:

|T| > t_{1-\alpha/(2g),N-a}

Notes:

  • If all pairwise comparisons are needed, Tukey’s method is superior, as it provides narrower confidence intervals.
  • Bonferroni is better than Scheffé when the number of contrasts is similar to or smaller than the number of treatment levels.
  • Practical recommendation: Compute Tukey, Scheffé, and Bonferroni and use the method with the smallest confidence intervals.
  • Bonferroni cannot be used for data snooping, as it assumes the comparisons were planned before examining the data.
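
A sketch of Bonferroni-adjusted pairwise tests in base R, reusing the hypothetical `dat`; the multiplier B is shown for a hypothetical family of g = 3 planned comparisons:

```r
# Bonferroni-adjusted pairwise comparisons using the pooled SD (i.e., sqrt(MSE))
pairwise.t.test(dat$y, dat$trt, p.adjust.method = "bonferroni", pool.sd = TRUE)

# Bonferroni CI multiplier B = t_{1 - alpha/(2g), N-a} for g planned comparisons
g <- 3
qt(1 - 0.05 / (2 * g), df = nrow(dat) - nlevels(dat$trt))
```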

24.1.1.5.4.4 Fisher’s Least Significant Difference

The Fisher LSD method does not control the family-wise error rate (refer to 16.3), meaning it does not correct for multiple comparisons. However, it can be useful for exploratory analysis when a preliminary ANOVA is significant.

The hypothesis test for comparing two treatment means:

H_0: \mu_i = \mu_j

uses the t-statistic:

t = \frac{\bar{Y}_i - \bar{Y}_j}{\sqrt{MSE \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}

where:

  • \bar{Y}_i and \bar{Y}_j are the sample means for treatments i and j.

  • MSE is the mean squared error from ANOVA.

  • n_i, n_j are the sample sizes for groups i and j.

Notes:

  • The LSD method does not adjust for multiple comparisons, which increases the Type I error rate.
  • It is only valid if the overall ANOVA is significant (i.e., the global null hypothesis of no treatment effect is rejected).
  • Tukey and Bonferroni methods are preferred when many comparisons are made.

24.1.1.5.4.5 Newman-Keuls

The Newman-Keuls procedure is a stepwise multiple comparison test similar to Tukey’s method but less rigorous.

Key Issues:

  • Unlike Tukey, Newman-Keuls does not control the family-wise error rate.
  • It has less power than ANOVA.
  • It is rarely recommended in modern statistical practice.
  • Using the Newman-Keuls test is therefore not recommended.
24.1.1.5.4.6 Summary of Multiple Comparison Procedures
  • Tukey: all pairwise comparisons; controls the family-wise error rate. Best used for comparing all treatment means. Strengths: exact confidence level when sample sizes are equal; more powerful than Scheffé for pairwise tests. Weaknesses: conservative if sample sizes are unequal.
  • Scheffé: all possible contrasts; controls the family-wise error rate. Best used for exploratory analysis, especially when interested in any contrast. Strengths: valid for any contrast; can be used for data snooping. Weaknesses: confidence intervals wider than Tukey for pairwise comparisons.
  • Bonferroni: a fixed number of planned comparisons; controls the family-wise error rate. Best used for a small number of pre-specified tests. Strengths: simple and flexible; better than Scheffé for few comparisons. Weaknesses: less powerful than Tukey for many pairwise tests; cannot be used for data snooping.
  • Fisher’s LSD: pairwise comparisons; does not control the family-wise error rate. Best used for exploratory comparisons after a significant ANOVA. Strengths: most powerful for pairwise comparisons when ANOVA is significant. Weaknesses: inflates the Type I error rate; not valid without a significant ANOVA.
  • Newman-Keuls: pairwise comparisons; does not control the family-wise error rate. Less power than ANOVA; generally not recommended.
24.1.1.5.4.7 Dunnett’s Test

In some experiments, instead of comparing all treatment groups against each other, we are specifically interested in comparing each treatment to a control. This is common in clinical trials or A/B testing, where one group serves as a baseline.

Dunnett’s test is designed for experiments with a groups, where:

  • One group is the control (e.g., placebo or standard treatment).

  • The remaining a-1 groups are treatment groups.

Thus, we perform a-1 pairwise comparisons:

D_i = \mu_i - \mu_c, \quad i = 1, \dots, a-1

where \mu_c is the mean of the control group.

Dunnett’s Test vs. Other Methods

  • Unlike Tukey’s method (which compares all pairs), Dunnett’s method only compares treatments to the control.
  • Dunnett’s test controls the family-wise error rate, making it more powerful than Bonferroni for this scenario.
  • If the goal is to compare treatments against each other as well, Tukey’s method is preferable.
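
A sketch of Dunnett's comparisons, assuming the multcomp package is available and that the first level of `trt` in the hypothetical `dat` plays the role of the control:

```r
# Dunnett comparisons: each treatment vs. the control (first factor level)
library(multcomp)

fit  <- aov(y ~ trt, data = dat)
dunn <- glht(fit, linfct = mcp(trt = "Dunnett"))
summary(dunn)   # adjusted p-values for the a - 1 treatment-vs-control tests
confint(dunn)   # simultaneous confidence intervals
```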

24.1.2 Single Factor Random Effects ANOVA

Also known as an ANOVA Type II model, the single factor random effects model assumes that treatments are randomly selected from a larger population. Thus, inference extends beyond the observed treatments to the entire population of treatments.


24.1.2.1 Random Cell Means Model

The model is given by:

Y_{ij} = \mu_i + \epsilon_{ij}

where:

  • \mu_i \sim N(\mu, \sigma^2_{\mu}), independent across treatments.
  • \epsilon_{ij} \sim N(0, \sigma^2), independent across observations.
  • \mu_i and \epsilon_{ij} are mutually independent for i = 1, \dots, a and j = 1, \dots, n.

When all treatment sample sizes are equal:

\begin{aligned} E(Y_{ij}) &= E(\mu_i) = \mu \\ var(Y_{ij}) &= var(\mu_i) + var(\epsilon_{ij}) = \sigma^2_{\mu} + \sigma^2 \end{aligned}


24.1.2.1.1 Covariance Structure

Since Y_{ij} are not independent, we calculate their covariances:

  1. Same treatment group (i fixed, j \neq j'):

\begin{aligned} cov(Y_{ij}, Y_{ij'}) &= E(Y_{ij} Y_{ij'}) - E(Y_{ij}) E(Y_{ij'}) \\ &= E(\mu_i^2 + \mu_i \epsilon_{ij'} + \mu_i \epsilon_{ij} + \epsilon_{ij} \epsilon_{ij'}) - \mu^2 \\ &= \sigma^2_{\mu} + \mu^2 - \mu^2 \\ &= \sigma^2_{\mu} \end{aligned}

  1. Different treatment groups (i \neq i'):

\begin{aligned} cov(Y_{ij}, Y_{i'j'}) &= E(\mu_i \mu_{i'} + \mu_i \epsilon_{i'j'} + \mu_{i'} \epsilon_{ij} + \epsilon_{ij} \epsilon_{i'j'}) - \mu^2 \\ &= \mu^2 - \mu^2 = 0 \end{aligned}

Thus:

  • All observations have the same variance: var(Y_{ij}) = \sigma^2_{\mu} + \sigma^2.
  • Observations from the same treatment have covariance: \sigma^2_{\mu}.
  • Observations from different treatments are uncorrelated.

The intraclass correlation between two responses from the same treatment:

\rho(Y_{ij}, Y_{ij'}) = \frac{\sigma^2_{\mu}}{\sigma^2_{\mu} + \sigma^2}, \quad j \neq j'


24.1.2.1.2 Inference for Random Effects Model

The Intraclass Correlation Coefficient:

\frac{\sigma^2_{\mu}}{\sigma^2 + \sigma^2_{\mu}}

measures the proportion of total variability in Y_{ij} that is accounted for by treatment differences.

To test whether treatments contribute significantly to variance:

\begin{aligned} &H_0: \sigma_{\mu}^2 = 0 \quad \text{(No treatment effect, all $\mu_i = \mu$)} \\ &H_a: \sigma_{\mu}^2 \neq 0 \end{aligned}

Under H_0, an ANOVA F-test is used:

F = \frac{MSTR}{MSE}

where:

  • MSTR (Mean Square for Treatments) captures variation between treatments.
  • MSE (Mean Square Error) captures variation within treatments.

If H_0 is true, then:

F \sim F_{(a-1, a(n-1))}

Reject H_0 if:

F > f_{(1-\alpha; a-1, a(n-1))}


24.1.2.1.3 Comparison: Fixed Effects vs. Random Effects Models

Although ANOVA calculations are the same for fixed and random effects models, the interpretation of results differs.

Random Effects Model Fixed Effects Model
E(MSE) = \sigma^2 E(MSE) = \sigma^2
E(MSTR) = \sigma^2 + n \sigma^2_{\mu} E(MSTR) = \sigma^2 + \frac{ \sum_i n_i (\mu_i - \mu)^2}{a-1}
  • If \sigma^2_{\mu} = 0, then E(MSTR) = E(MSE), implying no treatment effect.
  • Otherwise, E(MSTR) > E(MSE), suggesting significant treatment variation.

When sample sizes are not equal, the F-test remains valid, but the degrees of freedom change to:

F \sim F_{(a-1, N-a)}


24.1.2.1.4 Estimation of \mu

An unbiased estimator of E(Y_{ij}) = \mu is the grand mean:

\hat{\mu} = \bar{Y}_{..} = \frac{1}{a n} \sum_{i=1}^{a} \sum_{j=1}^{n} Y_{ij}

The variance of this estimator is:

\begin{aligned} var(\bar{Y}_{..}) &= var\left(\frac{1}{a} \sum_{i=1}^{a} \bar{Y}_{i.} \right) \\ &= \frac{1}{a^2} \sum_{i=1}^{a} var(\bar{Y}_{i.}) \\ &= \frac{1}{a^2} \sum_{i=1}^{a} \left(\sigma^2_\mu + \frac{\sigma^2}{n} \right) \\ &= \frac{n \sigma^2_{\mu} + \sigma^2}{a n} \end{aligned}

An unbiased estimator of this variance is:

s^2(\bar{Y}_{..}) = \frac{MSTR}{a n}

Since:

\frac{\bar{Y}_{..} - \mu}{s(\bar{Y}_{..})} \sim t_{a-1}

A (1-\alpha)100\% confidence interval for \mu is:

\bar{Y}_{..} \pm t_{1-\alpha/2; a-1} s(\bar{Y}_{..})


24.1.2.1.5 Estimation of Intraclass Correlation Coefficient \frac{\sigma^2_\mu}{\sigma^2_{\mu}+\sigma^2}

In both random and fixed effects models, MSTR and MSE are independent.

When sample sizes are equal (n_i = n for all i), the test statistic:

\frac{\frac{MSTR}{n\sigma^2_\mu + \sigma^2}}{\frac{MSE}{\sigma^2}} \sim F_{a-1, a(n-1)}

A (1-\alpha)100\% confidence interval for \frac{\sigma^2_\mu}{\sigma^2_\mu + \sigma^2} follows from:

P\left(f_{\alpha/2; a-1, a(n-1)} \leq \frac{\frac{MSTR}{n\sigma^2_\mu + \sigma^2}}{\frac{MSE}{\sigma^2}} \leq f_{1-\alpha/2; a-1, a(n-1)} \right) = 1 - \alpha

Defining:

\begin{aligned} L &= \frac{1}{n} \left( \frac{MSTR}{MSE} \times \frac{1}{f_{1-\alpha/2; a-1, a(n-1)}} - 1 \right) \\ U &= \frac{1}{n} \left( \frac{MSTR}{MSE} \times \frac{1}{f_{\alpha/2; a-1, a(n-1)}} - 1 \right) \end{aligned}

The lower and upper confidence limits for \frac{\sigma^2_\mu}{\sigma^2_\mu + \sigma^2} are:

\begin{aligned} L^* &= \frac{L}{1+L} \\ U^* &= \frac{U}{1+U} \end{aligned}

If L^* is negative, we customarily set it to 0.
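
A small R helper implementing these limits for hypothetical mean squares (balanced design assumed):

```r
# Approximate (1 - alpha) CI for the intraclass correlation in a balanced
# one-way random effects design, following the L and U formulas above
icc_ci <- function(MSTR, MSE, a, n, alpha = 0.05) {
  Fobs <- MSTR / MSE
  L <- (Fobs / qf(1 - alpha / 2, a - 1, a * (n - 1)) - 1) / n
  U <- (Fobs / qf(alpha / 2,     a - 1, a * (n - 1)) - 1) / n
  c(lower = max(L / (1 + L), 0), upper = U / (1 + U))
}

icc_ci(MSTR = 30, MSE = 5, a = 4, n = 6)  # hypothetical mean squares
```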


24.1.2.1.6 Estimation of \sigma^2

Since:

\frac{a(n-1) MSE}{\sigma^2} \sim \chi^2_{a(n-1)}

A (1-\alpha)100\% confidence interval for \sigma^2 is:

\frac{a(n-1) MSE}{\chi^2_{1-\alpha/2; a(n-1)}} \leq \sigma^2 \leq \frac{a(n-1) MSE}{\chi^2_{\alpha/2; a(n-1)}}

If sample sizes are unequal, the same formula applies, but the degrees of freedom change to:

df = N - a


24.1.2.1.7 Estimation of \sigma^2_\mu

From the expectations:

E(MSE) = \sigma^2, \quad E(MSTR) = \sigma^2 + n\sigma^2_\mu

we solve for \sigma^2_{\mu}:

\sigma^2_{\mu} = \frac{E(MSTR) - E(MSE)}{n}

An unbiased estimator of \sigma^2_\mu is:

s^2_\mu = \frac{MSTR - MSE}{n}

If s^2_\mu < 0, we set s^2_\mu = 0 (since variances cannot be negative).

If sample sizes are unequal, we replace n with an effective sample size n':

s^2_\mu = \frac{MSTR - MSE}{n'}

where:

n' = \frac{1}{a-1} \left(\sum_i n_i - \frac{\sum_i n_i^2}{\sum_i n_i} \right)


There are no exact confidence intervals for \sigma^2_\mu, but we can approximate them using the Satterthwaite procedure.

24.1.2.1.7.1 Satterthwaite Approximation

A linear combination of expected mean squares:

\sigma^2_\mu = \frac{1}{n} E(MSTR) + \left(-\frac{1}{n}\right) E(MSE)

For a general linear combination:

S = d_1 E(MS_1) + \dots + d_h E(MS_h)

where d_i are coefficients, an unbiased estimator of S is:

\hat{S} = d_1 MS_1 + \dots + d_h MS_h

Let df_i be the degrees of freedom associated with each mean square MS_i. The Satterthwaite approximation states:

\frac{(df) \hat{S}}{S} \sim \chi^2_{df}

where the degrees of freedom are approximated as:

df = \frac{(d_1 MS_1 + \dots + d_h MS_h)^2}{\sum_{i=1}^{h} \frac{(d_i MS_i)^2}{df_i}}


Applying the Satterthwaite method to the single factor random effects model:

\frac{(df) s^2_\mu}{\chi^2_{1-\alpha/2; df}} \leq \sigma^2_\mu \leq \frac{(df) s^2_\mu}{\chi^2_{\alpha/2; df}}

where the approximate degrees of freedom are:

df = \frac{(s^2_\mu)^2}{\frac{(MSTR/n)^2}{a-1} + \frac{(MSE/n)^2}{a(n-1)}}
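
A small R helper implementing the Satterthwaite interval for \sigma^2_\mu with hypothetical mean squares (balanced design assumed):

```r
# Satterthwaite approximate CI for sigma^2_mu in a balanced design
satterthwaite_ci <- function(MSTR, MSE, a, n, alpha = 0.05) {
  s2_mu <- (MSTR - MSE) / n
  d     <- c(1 / n, -1 / n)                    # coefficients on (MSTR, MSE)
  ms    <- c(MSTR, MSE)
  dfs   <- c(a - 1, a * (n - 1))
  df    <- sum(d * ms)^2 / sum((d * ms)^2 / dfs)
  c(estimate = s2_mu,
    lower = df * s2_mu / qchisq(1 - alpha / 2, df),
    upper = df * s2_mu / qchisq(alpha / 2, df))
}

satterthwaite_ci(MSTR = 30, MSE = 5, a = 4, n = 6)  # hypothetical mean squares
```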


24.1.2.2 Random Treatment Effects Model

In a random effects model, treatment levels are considered random samples from a larger population of possible treatments. The model accounts for variability across all potential treatments, not just those observed in the study.

We define the random treatment effect as:

\tau_i = \mu_i - E(\mu_i) = \mu_i - \mu

where \tau_i represents the deviation of treatment mean \mu_i from the overall mean \mu.

Thus, we rewrite treatment means as:

\mu_i = \mu + \tau_i

Substituting this into the response model:

Y_{ij} = \mu + \tau_i + \epsilon_{ij}

where:

  • \mu = common mean across all observations.
  • \tau_i \sim N(0, \sigma^2_\tau), random treatment effects, assumed independent.
  • \epsilon_{ij} \sim N(0, \sigma^2), random error terms, also independent.
  • \tau_{i} and \epsilon_{ij} are mutually independent for i = 1, \dots, a and j = 1, \dots, n.
  • We consider only balanced single-factor ANOVA (equal sample sizes across treatments).

24.1.2.3 Diagnostic Measures for Model Assumptions

Checking assumptions is crucial for valid inference. Common issues include:

Issue Diagnostic Tools
Non-constant error variance (heteroscedasticity) Residual plots, Levene’s test, Hartley’s test
Non-independence of errors Residual plots, Durbin-Watson test (for autocorrelation)
Outliers Boxplots, residual plots, regression influence measures (e.g., Cook’s distance)
Non-normality of errors Histogram, Q-Q plot, Shapiro-Wilk test, Anderson-Darling test
Omitted variable bias Residual plots, checking for unaccounted sources of variation

24.1.2.4 Remedial Measures

If diagnostic checks indicate violations of assumptions, possible solutions include:

  • Transformations of the response (e.g., Box-Cox) to stabilize variance or improve normality.
  • Weighted least squares when error variances are unequal.
  • Nonparametric alternatives (e.g., the Kruskal-Wallis rank test) when normality is doubtful.


24.1.2.5 Key Notes on Robustness

  • Fixed effects ANOVA is relatively robust to:
    • Non-normality, particularly when sample sizes are moderate to large.
    • Unequal variances when sample sizes are roughly equal.
    • F-test and multiple comparisons remain valid under mild violations.
  • Random effects ANOVA is sensitive to:
    • Lack of independence, which severely affects both fixed and random effects models.
    • Unequal variances, particularly when estimating variance components.

24.1.3 Two-Factor Fixed Effects ANOVA

A multi-factor experiment offers several advantages:

  • Higher efficiency – More precise estimates with fewer observations.
  • Increased information – Allows for testing interactions between factors.
  • Greater validity – Reduces confounding by controlling additional sources of variation.

Balanced Two-Factor ANOVA: Assumptions

  • Equal sample sizes for all treatment combinations.
  • All treatment means are of equal importance (no weighting).
  • Factors are categorical and chosen purposefully.

We assume:

  • Factor A has a levels and Factor B has b levels.
  • All a \times b factor level combinations are included.
  • Each treatment combination has n replications.
  • The total number of observations:
    N = abn

24.1.3.1 Cell Means Model

The response is modeled as:

Y_{ijk} = \mu_{ij} + \epsilon_{ijk}

where:

  • \mu_{ij} are fixed parameters (cell means).
  • i = 1, \dots, a represents levels of Factor A.
  • j = 1, \dots, b represents levels of Factor B.
  • \epsilon_{ijk} \sim \text{independent } N(0, \sigma^2) for all i, j, k.

Expected values and variance:

\begin{aligned} E(Y_{ijk}) &= \mu_{ij} \\ var(Y_{ijk}) &= var(\epsilon_{ijk}) = \sigma^2 \end{aligned}

Thus:

Y_{ijk} \sim \text{independent } N(\mu_{ij}, \sigma^2)

This can be expressed in matrix notation:

\mathbf{Y} = \mathbf{X} \beta + \epsilon

where:

\begin{aligned} E(\mathbf{Y}) &= \mathbf{X} \beta \\ var(\mathbf{Y}) &= \sigma^2 \mathbf{I} \end{aligned}


24.1.3.1.1 Interaction Effects

Interaction measures whether the effect of one factor depends on the level of the other factor. It is defined as:

(\alpha \beta)_{ij} = \mu_{ij} - (\mu_{..} + \alpha_i + \beta_j)

where:

  • Grand mean:
    \mu_{..} = \frac{1}{ab} \sum_i \sum_j \mu_{ij}
  • Main effect for Factor A (average effect of level i):
    \alpha_i = \mu_{i.} - \mu_{..}
  • Main effect for Factor B (average effect of level j):
    \beta_j = \mu_{.j} - \mu_{..}
  • Interaction effect:
    (\alpha \beta)_{ij} = \mu_{ij} - \mu_{i.} - \mu_{.j} + \mu_{..}

To determine whether interactions exist:

  1. Check if all \mu_{ij} can be written as sums \mu_{..} + \alpha_i + \beta_j
    (i.e., check if interaction terms are zero).
  2. Compare mean differences across levels of Factor B at each level of Factor A.
  3. Compare mean differences across levels of Factor A at each level of Factor B.
  4. Graphical method:
    • Plot treatment means for each level of Factor B.
    • If lines are not parallel, an interaction exists.
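
For the graphical check in step 4, a minimal sketch using base R's interaction.plot with a hypothetical two-factor data set `d2`:

```r
# Interaction plot: roughly parallel profiles suggest no interaction
# (hypothetical factors A and B with n = 4 replicates per cell)
set.seed(2)
d2 <- expand.grid(A = factor(1:3), B = factor(1:2), rep = 1:4)
d2$y <- 5 + as.numeric(d2$A) + 2 * (d2$B == "2") +
        1.5 * (d2$A == "3" & d2$B == "2") + rnorm(nrow(d2))

with(d2, interaction.plot(x.factor = A, trace.factor = B, response = y,
                          xlab = "Factor A", ylab = "Mean response"))
```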

The interaction terms satisfy:

For each level of Factor B:

\sum_i (\alpha \beta)_{ij} = \sum_i \left(\mu_{ij} - \mu_{..} - \alpha_i - \beta_j \right)

Expanding:

\begin{aligned} \sum_i (\alpha \beta)_{ij} &= \sum_i \mu_{ij} - a \mu_{..} - \sum_i \alpha_i - a \beta_j \\ &= a \mu_{.j} - a \mu_{..} - \sum_i (\mu_{i.} - \mu_{..}) - a (\mu_{.j} - \mu_{..}) \\ &= a \mu_{.j} - a \mu_{..} - a \mu_{..}+ a \mu_{..} - a (\mu_{.j} - \mu_{..}) \\ &= 0 \end{aligned}

Similarly:

\sum_j (\alpha \beta)_{ij} = 0, \quad i = 1, \dots, a

and:

\sum_i \sum_j (\alpha \beta)_{ij} = 0, \quad \sum_i \alpha_i = 0, \quad \sum_j \beta_j = 0


24.1.3.2 Factor Effects Model

In the Factor Effects Model, we express the response as:

\begin{aligned} \mu_{ij} &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} \\ Y_{ijk} &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk} \end{aligned}

where:

  • \mu_{..} is the grand mean.
  • \alpha_i are main effects for Factor A, subject to:
    \sum_i \alpha_i = 0
  • \beta_j are main effects for Factor B, subject to:
    \sum_j \beta_j = 0
  • (\alpha \beta)_{ij} are interaction effects, subject to:
    \sum_i (\alpha \beta)_{ij} = 0, \quad j = 1, \dots, b
    \sum_j (\alpha \beta)_{ij} = 0, \quad i = 1, \dots, a
  • \epsilon_{ijk} \sim \text{independent } N(0, \sigma^2) for k = 1, \dots, n.

Thus, we have:

\begin{aligned} E(Y_{ijk}) &= \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} \\ var(Y_{ijk}) &= \sigma^2 \\ Y_{ijk} &\sim N (\mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij}, \sigma^2) \end{aligned}


24.1.3.3 Parameter Counting and Restrictions

The Cell Means Model has ab parameters corresponding to each combination of factor levels.
In the Factor Effects Model, the imposed constraints reduce the number of estimable parameters:

Parameter Count
\mu_{..} 1
\alpha_i (Main effects for A) a-1 (due to constraint \sum_i \alpha_i = 0)
\beta_j (Main effects for B) b-1 (due to constraint \sum_j \beta_j = 0)
(\alpha \beta)_{ij} (Interaction effects) (a-1)(b-1) (due to two constraints)

Thus, the total number of parameters:

1 + (a-1) + (b-1) + (a-1)(b-1) = ab

which matches the number of parameters in the Cell Means Model.


To uniquely estimate parameters, we apply constraints:

\begin{aligned} \alpha_a &= -(\alpha_1 + \alpha_2 + \dots + \alpha_{a-1}) \\ \beta_b &= -(\beta_1 + \beta_2 + \dots + \beta_{b-1}) \\ (\alpha \beta)_{ib} &= -(\alpha \beta)_{i1} - (\alpha \beta)_{i2} - \dots - (\alpha \beta)_{i,b-1}, \quad i = 1, \dots, a \\ (\alpha \beta)_{aj} &= -(\alpha \beta)_{1j} - (\alpha \beta)_{2j} - \dots - (\alpha \beta)_{a-1,j}, \quad j = 1, \dots, b \end{aligned}

The model can be fitted using least squares or maximum likelihood estimation.


24.1.3.3.1 Cell Means Model Estimation

Minimizing:

Q = \sum_i \sum_j \sum_k (Y_{ijk} - \mu_{ij})^2

yields estimators:

\begin{aligned} \hat{\mu}_{ij} &= \bar{Y}_{ij} \\ \hat{Y}_{ijk} &= \bar{Y}_{ij} \\ e_{ijk} &= Y_{ijk} - \hat{Y}_{ijk} = Y_{ijk} - \bar{Y}_{ij} \end{aligned}

where e_{ijk} \sim \text{independent } N(0, \sigma^2).


24.1.3.3.2 Factor Effects Model Estimation

Minimizing:

Q = \sum_i \sum_j \sum_k (Y_{ijk} - \mu_{..} - \alpha_i - \beta_j - (\alpha \beta)_{ij})^2

subject to the constraints:

\begin{aligned} \sum_i \alpha_i &= 0 \\ \sum_j \beta_j &= 0 \\ \sum_i (\alpha \beta)_{ij} &= 0, \quad j = 1, \dots, b \\ \sum_j (\alpha \beta)_{ij} &= 0, \quad i = 1, \dots, a \end{aligned}

yields estimators:

\begin{aligned} \hat{\mu}_{..} &= \bar{Y}_{...} \\ \hat{\alpha}_i &= \bar{Y}_{i..} - \bar{Y}_{...} \\ \hat{\beta}_j &= \bar{Y}_{.j.} - \bar{Y}_{...} \\ (\hat{\alpha \beta})_{ij} &= \bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...} \end{aligned}


The fitted values are:

\hat{Y}_{ijk} = \bar{Y}_{...} + (\bar{Y}_{i..} - \bar{Y}_{...}) + (\bar{Y}_{.j.} - \bar{Y}_{...}) + (\bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...})

which simplifies to:

\hat{Y}_{ijk} = \bar{Y}_{ij.}

The residuals are:

e_{ijk} = Y_{ijk} - \bar{Y}_{ij.}

and follow:

e_{ijk} \sim \text{independent } N(0, \sigma^2)


The variances of the estimated effects are:

\begin{aligned} s^2_{\hat{\mu}_{..}} &= \frac{MSE}{nab} \\ s^2_{\hat{\alpha}_i} &= MSE \left(\frac{1}{nb} - \frac{1}{nab} \right) \\ s^2_{\hat{\beta}_j} &= MSE \left(\frac{1}{na} - \frac{1}{nab} \right) \\ s^2_{(\hat{\alpha\beta})_{ij}} &= MSE \left(\frac{1}{n} - \frac{1}{na} - \frac{1}{nb} + \frac{1}{nab} \right) \end{aligned}


24.1.3.3.3 Partitioning the Total Sum of Squares

The total deviation of an observation from the overall mean can be decomposed as:

Y_{ijk} - \bar{Y}_{...} = (\bar{Y}_{ij.} - \bar{Y}_{...}) + (Y_{ijk} - \bar{Y}_{ij.})

where:

  • Y_{ijk} - \bar{Y}_{...}: Total deviation of an observation.
  • \bar{Y}_{ij.} - \bar{Y}_{...}: Deviation of treatment mean from the overall mean.
  • Y_{ijk} - \bar{Y}_{ij.}: Residual deviation of an observation from the treatment mean.

Summing over all observations:

\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2 = n \sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{...})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij.})^2

Thus:

SSTO = SSTR + SSE

where:

  • SSTO = Total Sum of Squares (Total variation).
  • SSTR = Treatment Sum of Squares (Variation due to factor effects).
  • SSE = Error Sum of Squares (Residual variation).

Since the cross-product terms are 0, the model naturally partitions the variance.

From the factor effects model:

\bar{Y}_{ij.} - \bar{Y}_{...} = (\bar{Y}_{i..} - \bar{Y}_{...}) + (\bar{Y}_{.j.} - \bar{Y}_{...}) + (\bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...})

Squaring and summing:

\begin{aligned} n\sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{...})^2 &= nb\sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2 + na\sum_j (\bar{Y}_{.j.} - \bar{Y}_{...})^2 \\ &+ n\sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...})^2 \end{aligned}

Thus, treatment sum of squares can be further partitioned as:

SSTR = SSA + SSB + SSAB

where:

  • SSA: Sum of Squares for Factor A.
  • SSB: Sum of Squares for Factor B.
  • SSAB: Sum of Squares for Interaction.

The interaction term can also be expressed as:

SSAB = SSTO - SSE - SSA - SSB

or equivalently:

SSAB = SSTR - SSA - SSB

where:

  • SSA measures the variability of the estimated factor A level means (\bar{Y}_{i..}). The more variable these means, the larger SSA.
  • SSB measures the variability of the estimated factor B level means (\bar{Y}_{.j.}).
  • SSAB measures the variability in interaction effects.

For Two-Factor ANOVA, the degrees of freedom partitioning follows:

Sum of Squares Degrees of Freedom (df)
SSTO (Total) N - 1 = abn - 1
SSTR (Treatments) ab - 1
SSE (Error) N - ab = ab(n - 1)
SSA (Factor A) a - 1
SSB (Factor B) b - 1
SSAB (Interaction) (a-1)(b-1)

Since:

SSTR = SSA + SSB + SSAB

the treatment degrees of freedom also partition as:

ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)

  • df_{SSA} = a - 1
    (One degree of freedom lost due to the constraint \sum (\bar{Y}_{i..} - \bar{Y}_{...}) = 0).
  • df_{SSB} = b - 1
    (One degree of freedom lost due to the constraint \sum (\bar{Y}_{.j.} - \bar{Y}_{...}) = 0).
  • df_{SSAB} = (a - 1)(b - 1)
    (Due to interaction constraints).

The Mean Squares are obtained by dividing Sum of Squares by the corresponding degrees of freedom:

\begin{aligned} MSA &= \frac{SSA}{a - 1} \\ MSB &= \frac{SSB}{b - 1} \\ MSAB &= \frac{SSAB}{(a - 1)(b - 1)} \end{aligned}

The expectations of the mean squares are:

\begin{aligned} E(MSE) &= \sigma^2 \\ E(MSA) &= \sigma^2 + nb \frac{\sum \alpha_i^2}{a - 1} = \sigma^2 + nb \frac{\sum (\mu_{i..} - \mu_{..})^2}{a - 1} \\ E(MSB) &= \sigma^2 + na \frac{\sum \beta_j^2}{b - 1} = \sigma^2 + na \frac{\sum (\mu_{.j.} - \mu_{..})^2}{b - 1} \\ E(MSAB) &= \sigma^2 + n \frac{\sum \sum (\alpha \beta)^2_{ij}}{(a-1)(b-1)} = \sigma^2 + n \frac{\sum (\mu_{ij} - \mu_{i..} - \mu_{.j.} + \mu_{..})^2}{(a - 1)(b - 1)} \end{aligned}

If Factor A has no effect (\mu_{i..} = \mu_{..}), then MSA and MSE have the same expectation.
Similarly, if Factor B has no effect, then MSB and MSE have the same expectation.

Thus, MSA > MSE and MSB > MSE suggest the presence of factor effects.


24.1.3.4 Testing for Interaction

Hypotheses:

\begin{aligned} H_0: \mu_{ij} - \mu_{i..} - \mu_{.j.} + \mu_{..} = 0 &\quad \text{(No interaction)} \\ H_a: \mu_{ij} - \mu_{i..} - \mu_{.j.} + \mu_{..} \neq 0 &\quad \text{(Interaction present)} \end{aligned}

or equivalently:

\begin{aligned} &H_0: \text{All } (\alpha \beta)_{ij} = 0 \\ &H_a: \text{Not all } (\alpha \beta)_{ij} = 0 \end{aligned}

The F-statistic is:

F = \frac{MSAB}{MSE}

Under H_0, F \sim F_{(a-1)(b-1), ab(n-1)}. Reject H_0 if:

F > F_{1-\alpha; (a-1)(b-1), ab(n-1)}


24.1.3.5 Two-Way ANOVA Summary Table

The Two-Way ANOVA table partitions the total variation into its components:

Source of Variation Sum of Squares (SS) Degrees of Freedom (df) Mean Square (MS) F-Statistic
Factor A SSA a-1 MSA = \frac{SSA}{a-1} F_A = \frac{MSA}{MSE}
Factor B SSB b-1 MSB = \frac{SSB}{b-1} F_B = \frac{MSB}{MSE}
Interaction (A × B) SSAB (a-1)(b-1) MSAB = \frac{SSAB}{(a-1)(b-1)} F_{AB} = \frac{MSAB}{MSE}
Error SSE ab(n-1) MSE = \frac{SSE}{ab(n-1)} -
Total (corrected) SSTO abn - 1 - -
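
A minimal sketch of the corresponding fit in R, reusing the hypothetical `d2` from the interaction-plot sketch above; the interaction row is examined first:

```r
# Balanced two-way fixed effects ANOVA: test the A:B interaction first
fit2 <- aov(y ~ A * B, data = d2)
summary(fit2)   # rows: A, B, A:B, Residuals with SS, df, MS, F, and p-values
```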

Interpreting Two-Way ANOVA Results

When conducting a Two-Way ANOVA, always check interaction effects first:

  1. If the interaction (A \times B) is significant:
    • The effect of one factor depends on the level of the other factor.
    • Main effects are not interpretable alone because their impact varies across levels of the second factor.
  2. If the interaction is NOT significant:
    • The factors have independent (additive) effects.
    • Main effects can be tested individually.

Post-Hoc Comparisons

  • If interaction is not significant, proceed with main effect comparisons using the multiple comparison procedures described earlier (e.g., Tukey, Scheffé, or Bonferroni).
  • If interaction is significant, post-hoc tests should examine simple effects (comparisons within each level of a factor).

24.1.3.5.1 Contrasts in Two-Way ANOVA

In Two-Way ANOVA, we can define contrasts to test specific hypotheses:

L = \sum c_i \mu_i, \quad \text{where } \sum c_i = 0

An unbiased estimator of L:

\hat{L} = \sum c_i \bar{Y}_{i..}

with variance:

\sigma^2(\hat{L}) = \frac{\sigma^2}{bn} \sum c_i^2

and variance estimate:

\frac{MSE}{bn} \sum c_i^2


24.1.3.5.1.1 Orthogonal Contrasts in Two-Way ANOVA

For two contrasts:

\begin{aligned} L_1 &= \sum c_i \mu_i, \quad \sum c_i = 0 \\ L_2 &= \sum d_i \mu_i, \quad \sum d_i = 0 \end{aligned}

They are orthogonal if:

\sum \frac{c_i d_i}{n_i} = 0

For balanced designs (n_i = n):

\sum c_i d_i = 0

This ensures that orthogonal contrasts are uncorrelated:

\begin{aligned} cov(\hat{L}_1, \hat{L}_2) &= cov\left(\sum_i c_i \bar{Y}_{i..}, \sum_l d_l \bar{Y}_{l..}\right) \\ &= \sum_i \sum_l c_i d_l cov(\bar{Y}_{i..},\bar{Y}_{l..}) \\ &= \sum_i c_i d_i \frac{\sigma^2}{bn} = 0 \end{aligned}

Thus, orthogonal contrasts allow us to partition the sum of squares further.


24.1.3.5.1.2 Orthogonal Polynomial Contrasts
  • Used when factor levels are equally spaced (e.g., dose levels: 0, 15, 30, 45, 60).
  • Requires equal sample sizes across factor levels.

The Sum of Squares (SS) for a given contrast:

SS_L = \frac{\hat{L}^2}{\sum_{i=1}^a \frac{c^2_i}{bn_i}}

The t-statistic for testing contrasts:

T = \frac{\hat{L}}{\sqrt{MSE \sum_{i=1}^a \frac{c_i^2}{bn_i}}} \sim t

Since:

t^2_{(1-\alpha/2; df)} = F_{(1-\alpha; 1, df)}

we can equivalently test:

\frac{SS_L}{MSE} \sim F_{(1-\alpha;1,df_{MSE})}

All contrasts have df = 1.


24.1.3.6 Unbalanced Two-Way ANOVA

In many practical situations, sample sizes may be unequal across factor combinations, such as in:

  • Observational studies (e.g., real-world data with missing values).
  • Dropouts in designed studies (e.g., clinical trials with subject attrition).
  • Larger sample sizes for inexpensive treatments.
  • Sample sizes chosen to match population proportions.

We assume the standard Two-Way ANOVA model:

Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}

where sample sizes vary:

\begin{aligned} n_{i.} &= \sum_j n_{ij} \quad \text{(Total for factor level } i) \\ n_{.j} &= \sum_i n_{ij} \quad \text{(Total for factor level } j) \\ n_T &= \sum_i \sum_j n_{ij} \quad \text{(Total sample size)} \end{aligned}

However, for unbalanced designs, a major issue arises:

SSTO \neq SSA + SSB + SSAB + SSE

Unlike the balanced case, the design is non-orthogonal, meaning sum-of-squares partitions do not add up cleanly.


24.1.3.6.1 Indicator Variables for Factor Levels

To handle unbalanced data, we use indicator (dummy) variables as predictors.

For Factor A (i = 1, \dots, a-1):

u_i = \begin{cases} +1 & \text{if observation is from level } i \text{ of Factor A} \\ -1 & \text{if observation is from the reference level (level } a \text{)} \\ 0 & \text{otherwise} \end{cases}

For Factor B (j = 1, \dots, b-1):

v_j = \begin{cases} +1 & \text{if observation is from level } j \text{ of Factor B} \\ -1 & \text{if observation is from the reference level (level } b \text{)} \\ 0 & \text{otherwise} \end{cases}

Rewriting the ANOVA model using indicator variables:

Y = \mu_{..} + \sum_{i=1}^{a-1} \alpha_i u_i + \sum_{j=1}^{b-1} \beta_j v_j + \sum_{i=1}^{a-1} \sum_{j=1}^{b-1}(\alpha \beta)_{ij} u_i v_j + \epsilon

Here, the unknown parameters are:

  • \mu_{..} (grand mean),

  • \alpha_i (main effects for Factor A),

  • \beta_j (main effects for Factor B),

  • (\alpha \beta)_{ij} (interaction effects).


24.1.3.6.2 Hypothesis Testing Using Extra Sum of Squares

For unbalanced designs, we use sequential (type I) or adjusted (type III) sum of squares to test hypotheses.

To test for interaction effects, we test:

\begin{aligned} &H_0: \text{All } (\alpha \beta)_{ij} = 0 \quad \text{(No interaction)} \\ &H_a: \text{Not all } (\alpha \beta)_{ij} = 0 \quad \text{(Interaction present)} \end{aligned}

To test whether Factor B has an effect:

\begin{aligned} &H_0: \beta_1 = \beta_2 = \dots = \beta_b = 0 \\ &H_a: \text{At least one } \beta_j \neq 0 \end{aligned}
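
A sketch contrasting sequential and adjusted tests on an artificially unbalanced version of the hypothetical `d2`; the car package is assumed for Type III sums of squares, and sum-to-zero contrasts are set so the Type III tests are meaningful:

```r
# Unbalanced two-way ANOVA: Type I (sequential) vs. Type III (adjusted) tests
library(car)

d2u <- d2[-c(1, 2, 7), ]                  # drop a few rows to unbalance the design
fit_u <- lm(y ~ A * B, data = d2u,
            contrasts = list(A = contr.sum, B = contr.sum))

anova(fit_u)            # Type I: order-dependent, sequential sums of squares
Anova(fit_u, type = 3)  # Type III: each term adjusted for all other terms
```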


24.1.3.6.3 Factor Mean Analysis and Contrasts

Factor means and contrasts (e.g., pairwise comparisons) work similarly to the balanced case but require adjustments due to unequal sample sizes.

The variance estimate for a contrast:

\sigma^2(\hat{L}) = \frac{\sigma^2}{\sum n_{ij}} \sum c_i^2

is modified to:

\frac{MSE}{\sum n_{ij}} \sum c_i^2

Orthogonal contrasts are harder to define because unequal sample sizes break orthogonality.


24.1.3.6.4 Regression Approach to Unbalanced ANOVA

An alternative is to fit the cell means model as a regression model:

Y_{ij} = \mu_{ij} + \epsilon_{ij}

which allows us to analyze each treatment mean separately.

However, if there are empty cells (some factor combinations have no observations), the regression approach fails, and only partial analyses can be conducted.


24.1.4 Two-Way Random Effects ANOVA

The Two-Way Random Effects ANOVA assumes that both Factor A and Factor B levels are randomly sampled from larger populations.

The model is:

Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}

where:

  • \mu_{..}: Overall mean (constant).
  • \alpha_i \sim N(0, \sigma^2_{\alpha}) for i = 1, \dots, a (random effects for Factor A, independently distributed).
  • \beta_j \sim N(0, \sigma^2_{\beta}) for j = 1, \dots, b (random effects for Factor B, independently distributed).
  • (\alpha \beta)_{ij} \sim N(0, \sigma^2_{\alpha \beta}) for i = 1, \dots, a, j = 1, \dots, b (random interaction effects, independently distributed).
  • \epsilon_{ijk} \sim N(0, \sigma^2) (random error, independently distributed).

Additionally, all random effects (\alpha_i, \beta_j, (\alpha \beta)_{ij}) and error terms (\epsilon_{ijk}) are mutually independent.


24.1.4.1 Expectation

Taking expectations on both sides:

E(Y_{ijk}) = E(\mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk})

Since all random effects have mean zero:

E(Y_{ijk}) = \mu_{..}

Thus, the mean response across all factor levels is \mu_{..}.


24.1.4.2 Variance

The total variance of observations is the sum of all variance components:

\begin{aligned} var(Y_{ijk}) &= var(\alpha_i) + var(\beta_j) + var((\alpha \beta)_{ij}) + var(\epsilon_{ijk}) \\ &= \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\alpha \beta} + \sigma^2 \end{aligned}

Thus:

Y_{ijk} \sim N(\mu_{..}, \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\alpha \beta} + \sigma^2)


24.1.4.3 Covariance Structure

In random effects models, observations are correlated if they share the same factor levels.

Case 1: Same factor A, different factor B

If i is the same but j \neq j', then:

cov(Y_{ijk}, Y_{ij'k'}) = var(\alpha_i) = \sigma^2_{\alpha}

Case 2: Same factor B, different factor A

If j is the same but i \neq i', then:

cov(Y_{ijk}, Y_{i'jk'}) = var(\beta_j) = \sigma^2_{\beta}

Case 3: Same factor A and B, different replication

If both factor levels are the same (i, j fixed), but different replication (k \neq k'):

cov(Y_{ijk}, Y_{ijk'}) = var(\alpha_i) + var(\beta_j) + var((\alpha \beta)_{ij}) = \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\alpha \beta}

Case 4: Completely different factor levels

If neither factor A nor B is the same (i \neq i', j \neq j'), then:

cov(Y_{ijk}, Y_{i'j'k'}) = 0

since all random effects are independent across different factor levels.


Summary of Variance-Covariance Structure

Case Condition Covariance
Same factor A, different factor B i same, j \neq j' \sigma^2_{\alpha}
Same factor B, different factor A j same, i \neq i' \sigma^2_{\beta}
Same factor levels, different replications i same, j same, k \neq k' \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\alpha \beta}
Different factor levels i \neq i', j \neq j' 0

24.1.5 Two-Way Mixed Effects ANOVA

In a Two-Way Mixed Effects Model, one factor is fixed, while the other is random.
This is often referred to as a mixed effects model or simply a mixed model.

24.1.5.1 Balanced

For a balanced design, the restricted mixed model is:

Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}

where:

  • \mu_{..}: Overall mean (constant).
  • \alpha_i: Fixed effects for Factor A, subject to the constraint \sum \alpha_i = 0.
  • \beta_j \sim N(0, \sigma^2_\beta) (random effects for Factor B).
  • (\alpha \beta)_{ij} \sim N(0, \frac{a-1}{a} \sigma^2_{\alpha \beta})
    (interaction effects, constrained so that \sum_i (\alpha \beta)_{ij} = 0 for all j). The variance is written with the \frac{a-1}{a} factor for convenience; it makes the expected mean squares simpler.
  • \epsilon_{ijk} \sim N(0, \sigma^2) (random error).
  • \beta_j, (\alpha \beta)_{ij}, \epsilon_{ijk} are pairwise independent.

The restriction on interaction variance (\frac{a-1}{a} \sigma^2_{\alpha \beta}) simplifies the expected mean squares, though some sources assume var((\alpha \beta)_{ij}) = \sigma^2_{\alpha \beta}.


An unrestricted version of the model removes constraints on interaction terms.

Define:

\begin{aligned} \beta_j &= \beta_j^* + (\overline{\alpha \beta})_{.j}^* \\ (\alpha \beta)_{ij} &= (\alpha \beta)_{ij}^* - (\overline{\alpha \beta})_{.j}^* \end{aligned}

where \beta^* and (\alpha \beta)^*_{ij} are unrestricted random effects.

Some consider the restricted model more general, but we use the restricted form for simplicity.


Taking expectations:

E(Y_{ijk}) = \mu_{..} + \alpha_i

The total variance of responses:

var(Y_{ijk}) = \sigma^2_\beta + \frac{a-1}{a} \sigma^2_{\alpha \beta} + \sigma^2


Covariance Structure

Observations sharing the same random factor (B) level are correlated.

Covariances for Different Cases

Condition Covariance
Same i, j, different replications (k \neq k') cov (Y_{ijk}, Y_{ijk'}) = \sigma^2_\beta + \frac{a-1}{a} \sigma^2_{\alpha \beta}
Same j, different i (i \neq i') cov(Y_{ijk}, Y_{i'jk'}) = \sigma^2_\beta - \frac{1}{a} \sigma^2_{\alpha \beta}
Different i and j (i \neq i', j \neq j') cov(Y_{ijk}, Y_{i'j'k'}) = 0

Thus, observations only become independent when they do not share the same random effect.

An advantage of the restricted mixed model is that 2 observations from the same random factor (B) level can be positively or negatively correlated. In the unrestricted model, they can only be positively correlated.


Comparison of Fixed, Random, and Mixed Effects Models

Mean Square Fixed ANOVA (A, B fixed) Random ANOVA (A, B random) Mixed ANOVA (A fixed, B random)
MSA \sigma^2 + n b \frac{\sum \alpha_i^2}{a-1} \sigma^2 + n b \sigma^2_{\alpha} \sigma^2 + n b \frac{\sum_{i = 1}^a \alpha_i^2}{a-1} + n \sigma^2_{\alpha \beta}
MSB \sigma^2 + n a \frac{\sum \beta_j^2}{b-1} \sigma^2 + n a \sigma^2_{\beta} \sigma^2 + n a \sigma^2_{\beta} + n \sigma^2_{\alpha \beta}
MSAB \sigma^2 + n \frac{\sum (\alpha \beta)_{ij}^2}{(a-1)(b-1)} \sigma^2 + n \sigma^2_{\alpha \beta} \sigma^2 + n \sigma^2_{\alpha \beta}
MSE \sigma^2 \sigma^2 \sigma^2

While SS and df are identical across models, the expected mean squares differ, affecting test statistics.


24.1.5.1.1 Hypothesis Testing in Mixed ANOVA

In random ANOVA, we test:

\begin{aligned} H_0: \sigma^2_{\alpha} = 0 \quad vs. \quad H_a: \sigma^2_{\alpha} > 0 \end{aligned}

using:

F = \frac{MSA}{MSAB} \sim F_{a-1, (a-1)(b-1)}

For mixed models, the same test statistic is used for:

H_0: \alpha_i = 0, \quad \forall i

However, for fixed effects models, the test statistic differs.

Test for Effect of Fixed ANOVA (A, B fixed) Random ANOVA (A, B random) Mixed ANOVA (A fixed, B random)
Factor A \frac{MSA}{MSE} \frac{MSA}{MSAB} \frac{MSA}{MSAB}
Factor B \frac{MSB}{MSE} \frac{MSB}{MSAB} \frac{MSB}{MSE}
Interaction (A × B) \frac{MSAB}{MSE} \frac{MSAB}{MSE} \frac{MSAB}{MSE}

24.1.5.1.2 Variance Component Estimation

In random and mixed effects models, we are interested in estimating variance components.

To estimate \sigma^2_\beta:

\frac{E(MSB) - E(MSE)}{na} = \frac{(\sigma^2 + na \sigma^2_\beta) - \sigma^2}{na} = \sigma^2_\beta

which is estimated by:

\hat{\sigma}^2_\beta = \frac{MSB - MSE}{na}
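
In practice, variance components in mixed models are usually estimated by (restricted) maximum likelihood rather than by the method of moments above; a sketch assuming the lme4 package, with a hypothetical data set in which A is fixed and B is a random sample of levels:

```r
# Hypothetical data: A fixed with 3 levels, B random with 6 sampled levels, n = 4
set.seed(3)
dm <- expand.grid(A = factor(1:3), B = factor(1:6), rep = 1:4)
dm$y <- 5 + as.numeric(dm$A) + rnorm(6, sd = 1.5)[dm$B] + rnorm(nrow(dm))

library(lme4)
fit_mix <- lmer(y ~ A + (1 | B) + (1 | A:B), data = dm)  # REML by default
VarCorr(fit_mix)   # estimated sigma^2_beta, sigma^2_alpha-beta, and sigma^2
fixef(fit_mix)     # estimated fixed effects for A (treatment coding)
```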

Confidence intervals for variance components can be approximated using the Satterthwaite procedure described earlier.


24.1.5.1.3 Estimating Fixed Effects in Mixed Models

Fixed effects \alpha_i are estimated by:

\begin{aligned} \hat{\alpha}_i &= \bar{Y}_{i..} - \bar{Y}_{...} \\ \hat{\mu}_{i.} &= \bar{Y}_{...} + (\bar{Y}_{i..} - \bar{Y}_{...}) = \bar{Y}_{i..} \end{aligned}

Their variances:

\begin{aligned} \sigma^2(\hat{\alpha}_i) &= \frac{\sigma^2 + n \sigma^2_{\alpha \beta}}{bn} = \frac{E(MSAB)}{bn} \\ s^2(\hat{\alpha}_i) &= \frac{MSAB}{bn} \end{aligned}


24.1.5.1.4 Contrasts on Fixed Effects

For a contrast:

L = \sum c_i \alpha_i, \quad \text{where } \sum c_i = 0

Estimate:

\hat{L} = \sum c_i \hat{\alpha}_i

Variance:

\sigma^2(\hat{L}) = \sum c^2_i \sigma^2(\hat{\alpha}_i), \quad s^2(\hat{L}) = \frac{MSAB}{bn} \sum c^2_i


24.1.5.2 Unbalanced Two-Way Mixed Effects ANOVA

In an unbalanced two-way mixed model (e.g., a = 2, b = 4), the model remains:

Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \epsilon_{ijk}

where:

  • \alpha_i: Fixed effects for Factor A.
  • \beta_j \sim N(0, \sigma^2_\beta): Random effects for Factor B.
  • (\alpha \beta)_{ij} \sim N(0, \frac{\sigma^2_{\alpha \beta}}{2}): Interaction effects.
  • \epsilon_{ijk} \sim N(0, \sigma^2): Residual error.

24.1.5.2.1 Variance Components

The variance components are:

\begin{aligned} var(\beta_j) &= \sigma^2_\beta \\ var((\alpha \beta)_{ij}) &= \frac{2-1}{2} \sigma^2_{\alpha \beta} = \frac{\sigma^2_{\alpha \beta}}{2} \\ var(\epsilon_{ijk}) &= \sigma^2 \end{aligned}

24.1.5.2.2 Expectation and Variance

Taking expectations:

E(Y_{ijk}) = \mu_{..} + \alpha_i

Total variance:

var(Y_{ijk}) = \sigma^2_{\beta} + \frac{\sigma^2_{\alpha \beta}}{2} + \sigma^2


24.1.5.2.3 Covariance Structure

Observations sharing Factor B (random effect) are correlated.

Covariances for Different Cases

Condition Covariance
Same i, j, different replications (k \neq k') cov(Y_{ijk}, Y_{ijk'}) = \sigma^2_{\beta} + \frac{\sigma^2_{\alpha \beta}}{2}
Same j, different i (i \neq i') cov (Y_{ijk}, Y_{i'jk'}) = \sigma^2_{\beta} - \frac{\sigma^2_{\alpha \beta}}{2}
Different i and j (i \neq i', j \neq j') cov(Y_{ijk}, Y_{i'j'k'}) = 0

Thus, only observations within the same random factor level share dependence.


24.1.5.2.4 Matrix Representation

Assume:

\mathbf{Y} \sim N(\mathbf{X} \beta, M)

where:

  • \mathbf{X}: Fixed effects design matrix.
  • \beta: Fixed effect coefficients.
  • M: Block diagonal covariance matrix containing variance components.

The density function of \mathbf{Y} is:

f(\mathbf{Y}) = \frac{1}{(2\pi)^{N/2} |M|^{1/2}} \exp \left( -\frac{1}{2} (\mathbf{Y} - \mathbf{X} \beta)' M^{-1} (\mathbf{Y} - \mathbf{X} \beta) \right)

If variance components were known, we could use Generalized Least Squares:

\hat{\beta}_{GLS} = (\mathbf{X}' M^{-1} \mathbf{X})^{-1} \mathbf{X}' M^{-1} \mathbf{Y}

However, since the variance components (\sigma^2, \sigma^2_\beta, \sigma^2_{\alpha \beta}) are unknown, we estimate them by maximizing the log-likelihood:

\ln L = - \frac{N}{2} \ln (2\pi) - \frac{1}{2} \ln |M| - \frac{1}{2} (\mathbf{Y} - \mathbf{X} \beta)' M^{-1} (\mathbf{Y} - \mathbf{X} \beta)

where:

  • |M|: Determinant of the variance-covariance matrix.
  • (\mathbf{Y} - \mathbf{X} \beta)' M^{-1} (\mathbf{Y} - \mathbf{X} \beta): Quadratic form in the likelihood.