Chapter 14 ANOVA test

The ANOVA test relies on the F statistic, which compares the variation in cocoa pod weight between varieties with the variation within varieties.
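
For reference, the usual one-way ANOVA F statistic can be written as below. The notation is introduced here for illustration and is not taken from the briefing paper: $k$ is the number of varieties (here $k = 3$), $n_i$ the number of pods sampled from variety $i$, $N$ the total number of pods, $y_{ij}$ the weight of pod $j$ of variety $i$, $\bar{y}_i$ the mean weight for variety $i$, and $\bar{y}$ the overall mean weight.

$$
F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}}
  = \frac{\sum_{i=1}^{k} n_i \left(\bar{y}_i - \bar{y}\right)^2 / (k - 1)}
         {\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_i\right)^2 / (N - k)}
$$

A large value of $F$ means that the variety means are spread out much more than the pod-to-pod variation within varieties would suggest.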

The results from the ANOVA test were presented in the Cocoabix factory briefing paper. The p value associated with this test, $2 \times 10^{-16}$, indicates that the mean cocoa pod weight differs across the three varieties; that is, at least two of the means are different from each other.

The p value quantifies the probability of observing differences in mean cocoa pod weight between the varieties at least as large as those seen in the sample if the null hypothesis (no difference in mean weight between varieties) were true, that is, if the differences arose “by chance”. Because this p value is very small, the team concluded that there was strong evidence against the null hypothesis.

One of the first questions that the press asked the team was whether the statistical assumptions for ANOVA were met. The analytical team informed them that

the variances of the three populations under consideration were similar,
the weights obtained for cocoa pods of a given variety did not affect those obtained for the other varieties, and
the weights obtained for each variety were drawn from Normally distributed populations.
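
The chapter does not say how the team checked these assumptions. As a minimal sketch, two of them can be examined with standard tests from base R. The data frame `sim` below is a simulated, hypothetical stand-in for the `Investigate` data set, and its means and standard deviations are arbitrary. Independence between varieties cannot be tested this way; it depends on how the pods were sampled.

```r
# Simulated stand-in for the Investigate data (hypothetical values, not the real data)
set.seed(1)
sim <- data.frame(
  Variety = factor(rep(c("Criollo", "Forastero", "Trinitario"), each = 50)),
  Weight  = c(rnorm(50, mean = 500, sd = 2),
              rnorm(50, mean = 499, sd = 2),
              rnorm(50, mean = 470, sd = 2))
)

# Fit the same kind of one-way model as in the chapter
fit <- aov(Weight ~ Variety, data = sim)

# Similar variances across varieties: Bartlett's test
bt <- bartlett.test(Weight ~ Variety, data = sim)
print(bt)

# Normality: Shapiro-Wilk test applied to the model residuals
sw <- shapiro.test(residuals(fit))
print(sw)
```

In both tests a small p value would signal a departure from the assumption. Diagnostic plots, for example `plot(fit)`, are a common complement to these tests.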

The team also discussed the Tukey Honest Significant Differences test (see results below), which provided confidence intervals for the difference in mean cocoa pod weight between each pair of varieties, with an overall (family-wise) error rate of 5%.

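# Fit a one-way ANOVA of pod weight on variety
ANOVA <- aov(Weight ~ Variety, data = Investigate)
# summary(ANOVA)  # uncomment to print the ANOVA table
# Pairwise comparisons of variety means (95% family-wise confidence level)
TukeyHSD(ANOVA)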

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Weight ~ Variety, data = Investigate)
## 
## $Variety
##                           diff        lwr         upr     p adj
## Forastero-Criollo     -0.61934  -1.640964   0.4022838 0.3298715
## Trinitario-Criollo   -30.16429 -31.185914 -29.1426662 0.0000000
## Trinitario-Forastero -29.54495 -30.566574 -28.5233262 0.0000000
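
One way to read the table above is to check, for each pair, whether the confidence interval excludes zero. The sketch below copies the interval bounds from the output into a small data frame; the names `tukey` and `excludes_zero` are introduced here for illustration only.

```r
# Interval bounds copied from the TukeyHSD output above
tukey <- data.frame(
  pair = c("Forastero-Criollo", "Trinitario-Criollo", "Trinitario-Forastero"),
  lwr  = c(-1.640964, -31.185914, -30.566574),
  upr  = c(0.4022838, -29.1426662, -28.5233262)
)

# A pair of means differs at the 5% family-wise level when its interval excludes zero
tukey$excludes_zero <- tukey$lwr > 0 | tukey$upr < 0
print(tukey[, c("pair", "excludes_zero")])
```

The intervals for both comparisons involving Trinitario lie entirely below zero, whereas the Forastero-Criollo interval contains zero. This agrees with the adjusted p values in the table: the output gives evidence that Trinitario differs from the other two varieties, but not that Forastero and Criollo differ from each other.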