## 21.6 Single Factor Covariance Model

$Y_{ij} = \mu_{.} + \tau_i + \gamma(X_{ij} - \bar{X}_{..}) + \epsilon_{ij}$

for $$i = 1,\dots,r;\; j = 1,\dots,n_i$$

where

• $$\mu_.$$: overall mean
• $$\tau_i$$: fixed treatment effects ($$\sum \tau_i =0$$)
• $$\gamma$$: fixed regression coefficient for the relationship between $$X$$ and $$Y$$
• $$X_{ij}$$: covariate (not random)
• $$\epsilon_{ij} \sim iid N(0,\sigma^2)$$: random errors

If we used $$\gamma X_{ij}$$ as the regression term (rather than $$\gamma(X_{ij}-\bar{X}_{..})$$), then $$\mu_.$$ would no longer be the overall mean; this is why the covariate is centered at its overall mean $$\bar{X}_{..}$$.
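A short derivation makes the role of centering concrete. It is a sketch that assumes equal sample sizes $$n_i = n$$ (so $$N = rn$$); the symbol $$\mu^*$$ is introduced here only for illustration. Averaging the mean response over all observations gives

$\frac{1}{N}\sum_{i}\sum_{j} E(Y_{ij}) = \mu_. + \frac{1}{r}\sum_i \tau_i + \gamma(\bar{X}_{..} - \bar{X}_{..}) = \mu_.$

With an uncentered term, $$Y_{ij} = \mu^* + \tau_i + \gamma X_{ij} + \epsilon_{ij}$$ describes the same regression lines only if

$\mu^* = \mu_. - \gamma \bar{X}_{..}$

so $$\mu^*$$ is the average intercept at $$X = 0$$ rather than the overall mean.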

\begin{aligned} E(Y_{ij}) &= \mu_. + \tau_i + \gamma(X_{ij}-\bar{X}_{..}) \\ var(Y_{ij}) &= \sigma^2 \end{aligned}

$$Y_{ij} \sim N(\mu_{ij},\sigma^2)$$,

where

\begin{aligned} \mu_{ij} &= \mu_. + \tau_i + \gamma(X_{ij} - \bar{X}_{..}) \\ \sum \tau_i &=0 \end{aligned}

Thus, the mean response ($$\mu_{ij}$$) is a regression line with intercept $$\mu_. + \tau_i$$ and slope $$\gamma$$ for each treatment i.

Assumption:

• All treatment regression lines have the same slope
• When treatments interact with the covariate $$X$$ (non-parallel slopes), covariance analysis is not appropriate; in that case, we should fit separate regression lines for each treatment.

The model extends to more than one covariate. For example, with two covariates:

$Y_{ij} = \mu_. + \tau_i + \gamma_1(X_{ij1}-\bar{X}_{..1}) + \gamma_2(X_{ij2}-\bar{X}_{..2}) + \epsilon_{ij}$

Regression Formulation

We can use indicator variables for treatments

$l_1 = \begin{cases} 1 & \text{if case is from treatment 1}\\ -1 & \text{if case is from treatment r}\\ 0 &\text{otherwise}\\ \end{cases}$

$\vdots$

$l_{r-1} = \begin{cases} 1 & \text{if case is from treatment r-1}\\ -1 & \text{if case is from treatment r}\\ 0 &\text{otherwise}\\ \end{cases}$

Let $$x_{ij} = X_{ij}- \bar{X}_{..}$$. The regression model is

$Y_{ij} = \mu_. + \tau_1 l_{ij,1} + \dots + \tau_{r-1} l_{ij,r-1} + \gamma x_{ij}+\epsilon_{ij}$

where $$l_{ij,1}$$ is the value of the indicator variable $$l_1$$ for the j-th case from treatment i. The treatment effects $$\tau_1,\dots,\tau_{r-1}$$ are just regression coefficients for the indicator variables.

Because this is an ordinary regression model, the usual regression diagnostic tools apply.
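The regression formulation can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the helper name `effect_coded_design`, the 0-based treatment labels, and the noise-free data are all assumptions made for the example.

```python
import numpy as np

def effect_coded_design(groups, x, r):
    """Build columns [1, l_1, ..., l_{r-1}, x - mean(x)].

    Hypothetical helper: treatments are assumed to be labelled 0..r-1,
    with label r-1 playing the role of "treatment r" in the text.
    """
    groups = np.asarray(groups)
    x = np.asarray(x, dtype=float)
    L = np.zeros((len(groups), r - 1))
    for k in range(r - 1):
        L[groups == k, k] = 1.0      # +1 for cases from treatment k+1
    L[groups == r - 1, :] = -1.0     # -1 in every indicator for treatment r
    return np.column_stack([np.ones(len(groups)), L, x - x.mean()])

# Made-up, noise-free data so the known coefficients are recovered exactly.
groups = np.repeat([0, 1, 2], 4)
x = np.array([1., 2., 3., 4., 2., 3., 5., 6., 1., 4., 6., 7.])
tau = np.array([1.5, -0.5, -1.0])    # treatment effects, sum to zero
mu, gamma = 10.0, 2.0
y = mu + tau[groups] + gamma * (x - x.mean())

D = effect_coded_design(groups, x, r=3)
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
mu_hat, tau1_hat, tau2_hat, gamma_hat = coef
tau3_hat = -tau1_hat - tau2_hat      # recovered from the sum-to-zero constraint
```

Only $$r-1$$ treatment coefficients are estimated directly; the last one follows from the constraint $$\sum \tau_i = 0$$.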

Inference

Treatment effects

\begin{aligned} &H_0: \tau_1 = \tau_2 = \dots = \tau_r = 0 \\ &H_a: \text{not all } \tau_i =0 \end{aligned}

\begin{aligned} &\text{Full Model}: Y_{ij} = \mu_. + \tau_i + \gamma X_{ij} +\epsilon_{ij} \\ &\text{Reduced Model}: Y_{ij} = \mu_. + \gamma X_{ij} + \epsilon_{ij} \end{aligned}

$F = \frac{SSE(R) - SSE(F)}{(N-2)-(N-(r+1))} / \frac{SSE(F)}{N-(r+1)} \sim F_{(r-1,N-(r+1))}$
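The full-versus-reduced comparison can be computed directly from two least-squares fits. The sketch below uses made-up data and hypothetical helper names (`sse`, `treatment_f_stat`). It codes treatments with one indicator column per group, which spans the same column space as the sum-to-zero coding and therefore gives the same $$SSE(F)$$.

```python
import numpy as np

def sse(design, y):
    """Residual sum of squares from a least-squares fit of y on design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def treatment_f_stat(groups, x, y, r):
    """F statistic for H0: all treatment effects are zero (labels 0..r-1)."""
    groups, x, y = np.asarray(groups), np.asarray(x, float), np.asarray(y, float)
    n = len(y)
    dummies = (groups[:, None] == np.arange(r)).astype(float)
    full = np.column_stack([dummies, x])            # r intercepts + slope: r+1 parameters
    reduced = np.column_stack([np.ones(n), x])      # one intercept + slope: 2 parameters
    sse_f, sse_r = sse(full, y), sse(reduced, y)
    df_num, df_den = r - 1, n - (r + 1)
    f = ((sse_r - sse_f) / df_num) / (sse_f / df_den)
    return f, df_num, df_den

# Made-up data: three treatments, four cases each, small fixed "noise".
groups = np.repeat([0, 1, 2], 4)
x = np.array([1., 2., 3., 4., 2., 3., 5., 6., 1., 4., 6., 7.])
noise = np.array([0.1, -0.2, 0.1, 0.0, -0.1, 0.2, 0.0, -0.1, 0.2, -0.1, -0.1, 0.0])
tau = np.array([1.5, -0.5, -1.0])
y = 10.0 + tau[groups] + 2.0 * (x - x.mean()) + noise

f_stat, df_num, df_den = treatment_f_stat(groups, x, y, r=3)
```

The statistic is then compared with the $$F_{(r-1,\,N-(r+1))}$$ distribution; if SciPy is available, `scipy.stats.f.sf(f_stat, df_num, df_den)` would give the p-value.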

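If we are interested in comparisons of treatment effects, we work with the estimated regression coefficients. For example, with $$r = 3$$, we estimate $$\tau_1$$ and $$\tau_2$$ and obtain $$\tau_3 = -\tau_1 - \tau_2$$ from the sum-to-zero constraint. The pairwise comparisons are then: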

| Comparison | Estimate | Variance of Estimator |
|---|---|---|
| $$\tau_1 - \tau_2$$ | $$\hat{\tau}_1 - \hat{\tau}_2$$ | $$var(\hat{\tau}_1) + var(\hat{\tau}_2) - 2cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
| $$\tau_1 - \tau_3$$ | $$2 \hat{\tau}_1 + \hat{\tau}_2$$ | $$4var(\hat{\tau}_1) + var(\hat{\tau}_2) + 4cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
| $$\tau_2 - \tau_3$$ | $$\hat{\tau}_1 + 2 \hat{\tau}_2$$ | $$var(\hat{\tau}_1) + 4var(\hat{\tau}_2) + 4cov(\hat{\tau}_1,\hat{\tau}_2)$$ |
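Each variance is a quadratic form $$c^\top \Sigma c$$, where $$c$$ holds the coefficients of $$\hat{\tau}_1$$ and $$\hat{\tau}_2$$ in the estimate and $$\Sigma$$ is their covariance matrix. A small sketch, with an entirely made-up covariance matrix:

```python
import numpy as np

# Hypothetical covariance matrix of (tau1_hat, tau2_hat); the values are made up.
cov = np.array([[0.40, -0.15],
                [-0.15, 0.50]])

# Coefficients of (tau1_hat, tau2_hat) in each estimate, using tau3 = -tau1 - tau2.
contrasts = {
    "tau1 - tau2": np.array([1.0, -1.0]),
    "tau1 - tau3": np.array([2.0, 1.0]),
    "tau2 - tau3": np.array([1.0, 2.0]),
}

# Var(c' tau_hat) = c' Sigma c for each comparison.
variances = {name: float(c @ cov @ c) for name, c in contrasts.items()}
```

In practice $$\Sigma$$ would be the relevant block of the estimated coefficient covariance matrix from the fitted regression.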

Testing for Parallel Slopes

Example:

r = 3

$Y_{ij} = \mu_{.} + \tau_1 l_{ij,1} + \tau_2 l_{ij,2} + \gamma X_{ij} + \beta_1 l_{ij,1}X_{ij} + \beta_2 l_{ij,2}X_{ij} + \epsilon_{ij}$

where $$\beta_1,\beta_2$$: interaction coefficients.

\begin{aligned} &H_0: \beta_1 = \beta_2 = 0 \\ &H_a: \text{at least one } \beta_k \neq 0 \end{aligned}

If the F-test fails to reject $$H_0$$, the data show no evidence against parallel slopes, and the common-slope covariance model can be used.
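The parallel-slopes test is again a full-versus-reduced F-test, where the full model allows a separate slope per treatment. The sketch below uses made-up data and hypothetical helper names; it parameterizes the full model with one intercept and one slope per group, which is equivalent to adding the interaction terms.

```python
import numpy as np

def sse(design, y):
    """Residual sum of squares from a least-squares fit of y on design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def parallel_slopes_f_stat(groups, x, y, r):
    """F statistic for H0: all interaction coefficients are zero (labels 0..r-1)."""
    groups, x, y = np.asarray(groups), np.asarray(x, float), np.asarray(y, float)
    n = len(y)
    dummies = (groups[:, None] == np.arange(r)).astype(float)
    reduced = np.column_stack([dummies, x])                  # common slope: r+1 parameters
    full = np.column_stack([dummies, dummies * x[:, None]])  # slope per group: 2r parameters
    sse_f, sse_r = sse(full, y), sse(reduced, y)
    df_num, df_den = r - 1, n - 2 * r
    f = ((sse_r - sse_f) / df_num) / (sse_f / df_den)
    return f, df_num, df_den

# Made-up data generated with a common slope of 2 and small fixed "noise".
groups = np.repeat([0, 1, 2], 4)
x = np.array([1., 2., 3., 4., 2., 3., 5., 6., 1., 4., 6., 7.])
noise = np.array([0.1, -0.2, 0.1, 0.0, -0.1, 0.2, 0.0, -0.1, 0.2, -0.1, -0.1, 0.0])
tau = np.array([1.5, -0.5, -1.0])
y = 10.0 + tau[groups] + 2.0 * x + noise

f_stat, df_num, df_den = parallel_slopes_f_stat(groups, x, y, r=3)
```

With $$r = 3$$ and $$N = 12$$ the reference distribution is $$F_{(2,6)}$$; a large statistic would point to non-parallel slopes.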

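The adjusted treatment means, which compare treatments at the common covariate value $$\bar{X}_{..}$$, are

$\bar{Y}_{i.}(adj) = \bar{Y}_{i.} - \hat{\gamma}(\bar{X}_{i.} - \bar{X}_{..})$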
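A minimal sketch of the adjusted-mean calculation, assuming treatments are labelled 0..r-1 and using made-up, noise-free data so the result can be checked by hand (the helper name `adjusted_means` is hypothetical):

```python
import numpy as np

def adjusted_means(groups, x, y, r):
    """Return (adjusted means per treatment, common slope estimate)."""
    groups, x, y = np.asarray(groups), np.asarray(x, float), np.asarray(y, float)
    dummies = (groups[:, None] == np.arange(r)).astype(float)
    design = np.column_stack([dummies, x])       # one intercept per group + common slope
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    gamma_hat = coef[-1]
    x_bar = x.mean()                             # overall covariate mean
    adj = np.array([
        y[groups == i].mean() - gamma_hat * (x[groups == i].mean() - x_bar)
        for i in range(r)
    ])
    return adj, gamma_hat

# Made-up, noise-free data: mu = 10, tau = (1.5, -0.5, -1.0), gamma = 2.
groups = np.repeat([0, 1, 2], 4)
x = np.array([1., 2., 3., 4., 2., 3., 5., 6., 1., 4., 6., 7.])
tau = np.array([1.5, -0.5, -1.0])
y = 10.0 + tau[groups] + 2.0 * (x - x.mean())

adj, gamma_hat = adjusted_means(groups, x, y, r=3)
```

Without noise, each adjusted mean equals $$\mu_. + \tau_i$$, the height of treatment i's regression line at $$X = \bar{X}_{..}$$.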