## 20.6 Single Factor Covariance Model

$Y_{ij} = \mu_{.} + \tau_i + \gamma(X_{ij} - \bar{X}_{..}) + \epsilon_{ij}$

for $$i = 1,...,r; \; j = 1,...,n_i$$

where

• $$\mu_.$$: overall mean
• $$\tau_i$$: fixed treatment effects ($$\sum \tau_i = 0$$)
• $$\gamma$$: fixed regression coefficient relating X and Y
• $$X_{ij}$$: covariate (fixed, not random)
• $$\epsilon_{ij} \sim \text{iid } N(0,\sigma^2)$$: random errors

If we use just $$\gamma X_{ij}$$ as the regression term (rather than $$\gamma(X_{ij}-\bar{X}_{..})$$), then $$\mu_.$$ is no longer the overall mean; thus we center the covariate.

$E(Y_{ij}) = \mu_. + \tau_i + \gamma(X_{ij}-\bar{X}_{..}) \\ var(Y_{ij}) = \sigma^2$

$$Y_{ij} \sim N(\mu_{ij},\sigma^2)$$, where

$\mu_{ij} = \mu_. + \tau_i + \gamma(X_{ij} - \bar{X}_{..}) \\ \sum \tau_i =0$

Thus, the mean response $$\mu_{ij}$$ is a regression line with intercept $$\mu_. + \tau_i$$ and slope $$\gamma$$ for each treatment $$i$$.

Assumptions:

• All treatment regression lines have the same slope.
• When treatments interact with the covariate X (non-parallel slopes), covariance analysis is not appropriate; in that case we should fit separate regression lines for each treatment.

More complicated regression relations (e.g., quadratic, cubic) or additional covariates can be accommodated, e.g., with two covariates:

$Y_{ij} = \mu_. + \tau_i + \gamma_1(X_{ij1}-\bar{X}_{..1}) + \gamma_2(X_{ij2}-\bar{X}_{..2}) + \epsilon_{ij}$

Regression Formulation

We can use indicator variables for the treatments:

$I_1 = \begin{cases} 1 & \text{if case is from treatment } 1\\ -1 & \text{if case is from treatment } r\\ 0 &\text{otherwise} \end{cases} \quad \cdots \quad I_{r-1} = \begin{cases} 1 & \text{if case is from treatment } r-1\\ -1 & \text{if case is from treatment } r\\ 0 &\text{otherwise} \end{cases}$

Let $$x_{ij} = X_{ij}- \bar{X}_{..}$$. The regression model is

$Y_{ij} = \mu_. + \tau_1 I_{ij,1} + \cdots + \tau_{r-1} I_{ij,r-1} + \gamma x_{ij}+\epsilon_{ij}$

where $$I_{ij,1}$$ is the value of the indicator variable $$I_1$$ for the j-th case from treatment i. The treatment effects $$\tau_1,...,\tau_{r-1}$$ are just regression coefficients for the indicator variables.
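As a sketch of this regression formulation, the model can be fit by ordinary least squares on a design matrix built from the ±1/0 indicator coding and the centered covariate. The data below are simulated for illustration (the treatment effects, slope, and sample sizes are hypothetical, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: r = 3 treatments, n = 10 cases each.
r, n = 3, 10
trt = np.repeat(np.arange(r), n)           # treatment label per case
X = rng.normal(10, 2, size=r * n)          # fixed covariate values
tau = np.array([1.0, -0.5, -0.5])          # true effects, sum to 0
Y = 5.0 + tau[trt] + 0.8 * (X - X.mean()) + rng.normal(0, 0.3, r * n)

x = X - X.mean()                           # centered covariate

# +1/-1/0 coding: I_k = 1 for treatment k, -1 for treatment r, else 0.
def indicators(trt, r):
    I = np.zeros((len(trt), r - 1))
    for k in range(r - 1):
        I[trt == k, k] = 1.0
    I[trt == r - 1, :] = -1.0
    return I

# Design matrix: intercept, r-1 indicators, centered covariate.
Z = np.column_stack([np.ones(r * n), indicators(trt, r), x])
beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
# beta = (mu_hat, tau_1_hat, ..., tau_{r-1}_hat, gamma_hat)
```

With this coding the fitted coefficients recover $$\mu_.$$, the first $$r-1$$ treatment effects, and the common slope $$\gamma$$ directly; $$\hat{\tau}_r = -\hat{\tau}_1 - \cdots - \hat{\tau}_{r-1}$$.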

We can use the same regression diagnostic tools in this setting.

Inference

Treatment effects

$H_0: \tau_1 = \tau_2 = ...= 0 \\ H_a: \text{not all } \tau_i =0$

$\text{Full Model}: Y_{ij} = \mu_. + \tau_i + \gamma x_{ij} +\epsilon_{ij} \\ \text{Reduced Model}: Y_{ij} = \mu_. + \gamma x_{ij} + \epsilon_{ij}$

$F = \frac{SSE(R) - SSE(F)}{(N-2)-(N-(r+1))} \bigg/ \frac{SSE(F)}{N-(r+1)} = \frac{SSE(R) - SSE(F)}{r-1} \bigg/ \frac{SSE(F)}{N-(r+1)} \sim F_{(r-1,\,N-(r+1))}$
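The general linear test above can be sketched numerically: fit the full (treatments + covariate) and reduced (covariate-only) models, then compare their error sums of squares. The data are simulated and the effect sizes hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
r, n = 3, 12
N = r * n
trt = np.repeat(np.arange(r), n)
x = rng.normal(0, 1, N)
x = x - x.mean()                          # centered covariate
tau = np.array([2.0, 0.0, -2.0])          # nonzero treatment effects
Y = 10.0 + tau[trt] + 1.5 * x + rng.normal(0, 1.0, N)

def sse(Z, Y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return float(np.sum((Y - Z @ beta) ** 2))

# +1/-1/0 indicator coding for the treatments.
I = np.zeros((N, r - 1))
for k in range(r - 1):
    I[trt == k, k] = 1.0
I[trt == r - 1, :] = -1.0

Z_full = np.column_stack([np.ones(N), I, x])   # r + 1 parameters
Z_red = np.column_stack([np.ones(N), x])       # 2 parameters

df_num = (N - 2) - (N - (r + 1))               # = r - 1
df_den = N - (r + 1)
F = ((sse(Z_red, Y) - sse(Z_full, Y)) / df_num) / (sse(Z_full, Y) / df_den)
# Compare F against the F(r-1, N-(r+1)) critical value.
```

Since the simulated treatment effects are large relative to the noise, the resulting F statistic far exceeds typical critical values, leading to rejection of $$H_0$$.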

If we are interested in comparisons of treatment effects, for example with r = 3, we estimate $$\tau_1, \tau_2$$, and $$\tau_3 = -\tau_1 - \tau_2$$.

| Comparison | Estimate | Variance of Estimator |
| --- | --- | --- |
| $$\tau_1 - \tau_2$$ | $$\hat{\tau}_1 - \hat{\tau}_2$$ | $$var(\hat{\tau}_1) + var(\hat{\tau}_2) - 2cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
| $$\tau_1 - \tau_3$$ | $$2\hat{\tau}_1 + \hat{\tau}_2$$ | $$4var(\hat{\tau}_1) + var(\hat{\tau}_2) + 4cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
| $$\tau_2 - \tau_3$$ | $$\hat{\tau}_1 + 2\hat{\tau}_2$$ | $$var(\hat{\tau}_1) + 4var(\hat{\tau}_2) + 4cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
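Each comparison is a linear combination $$c'\hat{\beta}$$ of the fitted coefficients, so its variance is $$c' \, Cov(\hat{\beta}) \, c$$, which reproduces the table entries (note that $$var(2\hat{\tau}_1 + \hat{\tau}_2)$$ carries a $$+4cov$$ term). A minimal sketch with simulated, hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(2)
r, n = 3, 8
N = r * n
trt = np.repeat(np.arange(r), n)
x = rng.normal(0, 1, N)
x = x - x.mean()
Y = 4.0 + np.array([1.0, 0.0, -1.0])[trt] + 0.5 * x + rng.normal(0, 1, N)

# Design matrix with +1/-1/0 indicator coding.
I = np.zeros((N, r - 1))
for k in range(r - 1):
    I[trt == k, k] = 1.0
I[trt == r - 1, :] = -1.0
Z = np.column_stack([np.ones(N), I, x])

beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
resid = Y - Z @ beta
s2 = resid @ resid / (N - Z.shape[1])          # MSE
cov = s2 * np.linalg.inv(Z.T @ Z)              # estimated Cov(beta_hat)

# Contrast vectors over (mu, tau_1, tau_2, gamma); tau_3 = -tau_1 - tau_2.
contrasts = {
    "tau1 - tau2": np.array([0, 1, -1, 0]),
    "tau1 - tau3": np.array([0, 2, 1, 0]),
    "tau2 - tau3": np.array([0, 1, 2, 0]),
}
estimates = {name: float(c @ beta) for name, c in contrasts.items()}
variances = {name: float(c @ cov @ c) for name, c in contrasts.items()}
```

Expanding $$c' \, Cov(\hat{\beta}) \, c$$ for, say, $$c = (0, 2, 1, 0)$$ gives exactly $$4var(\hat{\tau}_1) + var(\hat{\tau}_2) + 4cov(\hat{\tau}_1,\hat{\tau}_2)$$ from the table.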

Testing for Parallel Slopes

Example:

r = 3

$Y_{ij} = \mu_{.} + \tau_1 I_{ij,1} + \tau_2 I_{ij,2} + \gamma X_{ij} + \beta_1 I_{ij,1}X_{ij} + \beta_2 I_{ij,2}X_{ij} + \epsilon_{ij}$

where $$\beta_1,\beta_2$$: interaction coefficients.

$H_0: \beta_1 = \beta_2 = 0 \\ H_a: \text{at least one } \beta \neq 0$

If we fail to reject $$H_0$$ with the F-test, we have no evidence against parallel slopes, and the covariance model is reasonable.
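This test is again a general linear test: the reduced model has a common slope, and the full model adds the treatment-by-covariate interaction columns $$I_{ij,k}X_{ij}$$. A sketch with simulated data generated under equal slopes (all names and effect sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
r, n = 3, 15
N = r * n
trt = np.repeat(np.arange(r), n)
X = rng.normal(0, 1, N)
# Equal slopes by construction, so H0 (parallel lines) holds here.
Y = 3.0 + np.array([1.0, -1.0, 0.0])[trt] + 1.0 * X + rng.normal(0, 1, N)

I = np.zeros((N, r - 1))
for k in range(r - 1):
    I[trt == k, k] = 1.0
I[trt == r - 1, :] = -1.0

def sse(Z):
    b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return float(np.sum((Y - Z @ b) ** 2))

reduced = np.column_stack([np.ones(N), I, X])        # common slope
full = np.column_stack([reduced, I * X[:, None]])    # + I_1*X, I_2*X

df_den = N - full.shape[1]                           # N - 6 here
F = ((sse(reduced) - sse(full)) / 2) / (sse(full) / df_den)
# Compare F against the F(2, N-6) critical value; a small F means
# no evidence against parallel slopes.
```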

Adjusted Means

The treatment means of the response after adjusting for the covariate effect:

$\bar{Y}_{i.(\text{adj})} = \bar{Y}_{i.} - \hat{\gamma}(\bar{X}_{i.} - \bar{X}_{..})$
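Adjusted means matter most when the covariate means differ across treatments, since then the raw treatment means are confounded with the covariate effect. A sketch with simulated, hypothetical data where this happens:

```python
import numpy as np

rng = np.random.default_rng(4)
r, n = 3, 20
N = r * n
trt = np.repeat(np.arange(r), n)
# Covariate means differ by treatment (8, 10, 12), so raw means
# of Y mix the treatment effect with the covariate effect.
X = np.array([8.0, 10.0, 12.0])[trt] + rng.normal(0, 1, N)
gamma = 0.7
tau = np.array([0.5, 0.0, -0.5])
Y = 5.0 + tau[trt] + gamma * (X - X.mean()) + rng.normal(0, 0.4, N)

# Fit the covariance model to estimate gamma.
x = X - X.mean()
I = np.zeros((N, r - 1))
for k in range(r - 1):
    I[trt == k, k] = 1.0
I[trt == r - 1, :] = -1.0
Z = np.column_stack([np.ones(N), I, x])
beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
gamma_hat = beta[-1]

raw = np.array([Y[trt == i].mean() for i in range(r)])
Xbar = np.array([X[trt == i].mean() for i in range(r)])
adjusted = raw - gamma_hat * (Xbar - X.mean())   # Ybar_i.(adj)
```

Here the raw mean for treatment 1 is pulled down by its below-average covariate values; the adjustment recovers the mean response at a common covariate level $$\bar{X}_{..}$$.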