2.4 Follow-up Tests and Contrasts

2.4.1 Introduction: Why follow up?

So now we have discovered ANOVA, which gives us this very nice way of thinking about many groups at once. We don’t have to ask ourselves “Does diet 1 make baby chicks grow faster than diet 3?”; we can ask “Is there some difference in weight based on diet?”

But what happens if we do an ANOVA test and we reject the null hypothesis? All we can say is: “I found evidence that there is some difference – not all the diet groups are the same.” That’s great and all, but it doesn’t tell me what to feed my chickens.

If we have rejected \(H_0\) in an ANOVA test, we want to follow up and try to find out why we rejected it. Are all the diets different? Is there just one standout? Which ones are the best? And so on.

2.4.2 Follow-up tests

Now when I say we want to get more specific about which groups are different, your instinctive response may be: well, we know how to test whether two groups are different. That’s just a \(t\) test for the difference of two means.

And yes, indeed you can do that. Suppose I’m really mostly interested in whether diet 1 and diet 4 are different, at a 99% confidence level. Well, I can run a \(t\) test to find out:

t.test(x = chicken_dat %>% filter(Diet == 1) %>% .$weight,
       y = chicken_dat %>% filter(Diet == 4) %>% .$weight,
       paired = FALSE,
       conf.level = 0.99)
## 
##  Welch Two Sample t-test
## 
## data:  chicken_dat %>% filter(Diet == 1) %>% .$weight and chicken_dat %>% filter(Diet == 4) %>% .$weight
## t = -3.4548, df = 22.275, p-value = 0.002226
## alternative hypothesis: true difference in means is not equal to 0
## 99 percent confidence interval:
##  -115.20820  -11.74605
## sample estimates:
## mean of x mean of y 
##  170.4118  233.8889

Yay! Now I have a p-value, 0.002, which is below my chosen alpha; this tells me I can reject the null hypothesis that the two group means are equal, and conclude that diet 1 and diet 4 are different. Plus, I get a confidence interval for what the difference actually is – I’m 99% confident that on average, diet 1 chickens weigh between 115 and 11 grams less than chickens fed on diet 4.

2.4.3 Contrasts

But what if I want to ask a somewhat more complicated question? For example, suppose that diets 1 and 2 are low-carb, and diets 3 and 4 are high-carb. I want to know if, on average, low-carb diets are different from high-carb diets.

I can answer this question using contrasts. The idea is that I’m interested in the contrast between some combinations of factor levels. In this example, my null hypothesis is: \[H_0: \mu_1 + \mu_2 = \mu_3 + \mu_4\] and my alternative is: \[H_A: \mu_1 + \mu_2 \ne \mu_3 + \mu_4\]

Now I translate my null hypothesis into a mathematical contrast. If my null hypothesis is true, \[\mu_1 + \mu_2 = \mu_3 + \mu_4\]

which means \[\mu_1 + \mu_2 - \mu_3 - \mu_4 = 0\]

Note the “coefficients” of the \(\mu_i\)’s here: 1, 1, -1, and -1. These add up to 0!

Formally speaking, we have a null hypothesis of the form \[\sum_{i=1}^t c_i \mu_i = 0\] where \[\sum_{i=1}^t c_i = 0\] as well. These “coefficients” \(c_i\) define the contrast.

Let’s find out whether the data disagree with my fancy null hypothesis. For this you use R’s other ANOVA function, which is called aov(), plus a special function from the gmodels library called fit.contrast().

Here’s what happens when I use aov(). The results are the same as using lm with anova:

chicken_aov = aov(weight ~ Diet, data = chicken_dat)
chicken_aov %>% summary()
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Diet         3  55881   18627   5.464 0.00291 **
## Residuals   42 143190    3409                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, I tell R about the contrast I want to test, by giving it the \(c_i\) values associated with my fancy null hypothesis. R is picky about the syntax here – it needs a matrix with one column for each level of the factor:

chicken_contrasts = matrix(c(1, 1, -1, -1), nrow = 1, byrow = TRUE)
rownames(chicken_contrasts) = c(": lo carb vs hi carb")
chicken_contrasts
##                      [,1] [,2] [,3] [,4]
## : lo carb vs hi carb    1    1   -1   -1

Now I ask R to test this contrast on the chicken data. This function works on the aov object:

fit.contrast(model = chicken_aov, 
             varname = "Diet", 
             coeff = chicken_contrasts)
##                           Estimate Std. Error   t value    Pr(>|t|)
## Diet: lo carb vs hi carb -116.7771    35.5136 -3.288237 0.002043969
## attr(,"class")
## [1] "fit_contrast"

Behold, a p-value! Assuming I’m still using \(\alpha = 0.01\), this tells me that I have strong evidence that low-carb and high-carb diets are not the same on average.

There is a lot more to say about contrasts, but I don’t want to spend too much time on them here. The important takeaway at this point is that they are a thing: you can construct complex null hypotheses involving multiple groups and then test them!

Response moment: The contrast above – diets 1 and 2 vs. diets 3 and 4 – was defined using the \(c_i\) values (1, 1, -1, -1). Suppose your null hypothesis was “diet 3 is the same as diet 4.” What \(c_i\) values would you use to define this contrast?