5.5 One-way ANOVA

Analysis of variance (ANOVA) is a method to compare the mean values of a continuous variable between groups of a categorical independent variable. ANOVA is typically used to analyze the response to a manipulation of the independent variable in a controlled experiment, but it can also be used to analyze the difference in the observed value among groups in a non-experimental setting.[^1]

How it Works

ANOVA decomposes the variability of the observations around the overall mean, \(Y_{ij} - \bar{Y}_{..}\), into two parts: the variability of the factor level means around the overall mean, \(\bar{Y}_{i.} - \bar{Y}_{..}\) (between-group variability), plus the variability of the observations around their factor level means, \(Y_{ij} - \bar{Y}_{i.}\) (within-group variability). In the table below, the ratio of the treatment mean square to the mean squared error, \(F = \frac{MSR}{MSE}\), follows an F distribution with \(k-1\) numerator degrees of freedom and \(N-k\) denominator degrees of freedom. The more of the observed variance the treatments capture, the larger the between-group variability is relative to the within-group variability, hence the larger \(F\) is, and the less likely the null hypothesis, \(H_0: \mu_1 = \mu_2 = \cdots = \mu_k\), is true.

Table 5.1: ANOVA Table

| Source    | SS                                                    | df        | MS                     | F             |
|-----------|-------------------------------------------------------|-----------|------------------------|---------------|
| Treatment | \(SSR = \sum{n_i(\bar{Y}_{i.} - \bar{Y}_{..})^2}\)    | \(k - 1\) | \(MSR = SSR/(k - 1)\)  | \(MSR/MSE\)   |
| Error     | \(SSE = \sum(Y_{ij} - \bar{Y}_{i.})^2\)               | \(N - k\) | \(MSE = SSE/(N - k)\)  |               |
| Total     | \(SST = \sum(Y_{ij} - \bar{Y}_{..})^2\)               | \(N - 1\) |                        |               |
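
The decomposition in Table 5.1 can be computed directly. A minimal sketch in Python (the data values here are made up for illustration):

```python
def one_way_anova(groups):
    """One-way ANOVA by hand: returns (SSR, SSE, SST, F).

    groups: a list of lists, one list of observations per factor level.
    """
    k = len(groups)                                  # number of factor levels
    N = sum(len(g) for g in groups)                  # total observations
    grand_mean = sum(sum(g) for g in groups) / N     # Y-bar..
    group_means = [sum(g) / len(g) for g in groups]  # Y-bar_i.

    # Between-group sum of squares: SSR = sum n_i (Y-bar_i. - Y-bar..)^2
    ssr = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within-group sum of squares: SSE = sum (Y_ij - Y-bar_i.)^2
    sse = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g)
    sst = ssr + sse                                  # total sum of squares

    msr = ssr / (k - 1)                              # treatment mean square
    mse = sse / (N - k)                              # mean squared error
    return ssr, sse, sst, msr / mse                  # F = MSR / MSE

# Illustration with made-up data for three factor levels:
ssr, sse, sst, f = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

For these made-up data the group means are 2, 3, and 4, so \(SSR = SSE = 6\) and \(F = 3.0\); the decision to reject \(H_0\) then depends on comparing \(F\) with the critical value of the F distribution with \(k-1\) and \(N-k\) degrees of freedom.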


Assumptions

The ANOVA test applies when the independent variable is categorical and the dependent variable is continuous, with independent observations within groups. Independence means the observations come from a random sample, or from an experiment using random assignment. Each group's size should be less than 10% of its population size. The groups must also be independent of each other (non-paired, non-repeated measures). Additionally, there are three assumptions related to the distribution of the dependent variable. If any assumption fails, either try the work-around or revert to the nonparametric Kruskal-Wallis test (Chapter 5.6).

  1. No outliers. There should be no significant outliers in the groups. Outliers exert a large influence on the mean and variance. Test with a box plot or residuals vs predicted plot. Work-arounds are dropping the outliers or transforming the dependent variable.
  2. Normality. The dependent variable should be nearly normally distributed. ANOVA is robust to this condition, but it is important with small sample sizes. Test with a Q-Q plot or the Shapiro-Wilk test for normality. Work-around is transforming the dependent variable.
  3. Equal Variances. The group variances should be roughly equal. This condition is especially important with differing sample sizes. Test with a box plot, residuals vs predicted plot, rule of thumb (see case study in Chapter ??), or one of the formal homogeneity of variance tests such as Bartlett's and Levene's (be careful here because the formal tests can be overly sensitive, esp. Bartlett's). Work-around is the Games-Howell post hoc test instead of the Tukey post hoc test.
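
The equal-variance condition can also be checked numerically. Levene's test is itself a one-way ANOVA run on the absolute deviations of each observation from its group mean (using group medians instead gives the Brown-Forsythe variant). A minimal pure-Python sketch, with made-up data; in practice you would use a library routine such as `scipy.stats.levene`:

```python
def levene_f(groups):
    """Levene's test statistic: the one-way ANOVA F computed on the
    absolute deviations of each observation from its group mean."""
    # Transform each observation to its absolute deviation from the group mean.
    abs_dev = []
    for g in groups:
        m = sum(g) / len(g)
        abs_dev.append([abs(y - m) for y in g])

    # Ordinary one-way ANOVA F on the transformed data.
    k = len(abs_dev)
    N = sum(len(g) for g in abs_dev)
    grand = sum(sum(g) for g in abs_dev) / N
    means = [sum(g) / len(g) for g in abs_dev]
    ssr = sum(len(g) * (m - grand) ** 2 for g, m in zip(abs_dev, means))
    sse = sum((z - m) ** 2 for g, m in zip(abs_dev, means) for z in g)
    return (ssr / (k - 1)) / (sse / (N - k))

# Two made-up groups with identical spread: the statistic is 0,
# consistent with equal variances.
w = levene_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

A large statistic relative to the F critical value with \(k-1\) and \(N-k\) degrees of freedom indicates unequal variances.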

Post Hoc Tests

If the ANOVA procedure rejects the null hypothesis, use a post hoc procedure to determine which groups differ. The Tukey test is the most common. The test compares the differences in group means to Tukey's \(w\), \(w = q_\alpha(p, df_{Err}) \cdot s_{\bar{Y}}\), where \(q_\alpha(p, df_{Err})\) is a studentized-range lookup table value for \(p\) group means, \(s_{\bar{Y}} = \sqrt{MSE/r}\), and \(r\) is the common sample size per group. Any difference in group means greater than Tukey's \(w\) is statistically significant. The Tukey test is only valid with equal sample sizes; otherwise, the Tukey-Cramer method calculates the standard error for each pairwise comparison separately.
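
A minimal sketch of this comparison in Python. The critical value `q` is assumed to come from a studentized-range table (or, in practice, `scipy.stats.studentized_range`), and the group means, `q`, and `mse` below are made up for illustration:

```python
from itertools import combinations
from math import sqrt

def tukey_pairs(group_means, q, mse, r):
    """Flag pairwise mean differences larger than Tukey's w.

    group_means: list of factor-level means (equal sample sizes assumed)
    q: studentized-range critical value q_alpha(p, df_Err) from a table
    mse: mean squared error from the ANOVA table
    r: common sample size per group
    """
    w = q * sqrt(mse / r)  # Tukey's w
    return [(i, j) for i, j in combinations(range(len(group_means)), 2)
            if abs(group_means[i] - group_means[j]) > w]

# Made-up example: three group means of size r = 3 each, with mse = 1.0
# and q looked up for alpha = 0.05, p = 3 means, df_Err = 6
# (roughly 4.34 in standard tables). Here w is about 2.5, so only the
# comparisons involving the third group are flagged as significant.
pairs = tukey_pairs([2.0, 3.0, 7.0], q=4.34, mse=1.0, r=3)
```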

There are other post hoc tests. Fisher's Protected Least Significant Difference (LSD) test is an older approach and less commonly used today. The Bonferroni and Scheffé methods are used for general tests of contrasts, including combinations of groups. The Bonferroni method is better when the number of contrasts is about the same as the number of factor levels. The Scheffé method is better for testing all possible contrasts. Dunnett's mean comparison method is appropriate for comparisons of treatment levels against a control.

ANOVA and OLS

ANOVA is related to linear regression. Regressing the dependent variable on indicator (dummy) variables for the factor levels reproduces the ANOVA exactly: with effect (sum-to-zero) coding the intercept is the overall mean and the coefficients are the group deviations from it, while with treatment (reference) coding the intercept is the reference group's mean and the coefficients are the other groups' differences from it. Either way, the analysis of variance table for the regression model decomposes the overall variance into the same SSR and SSE, and its F test is the ANOVA F test.
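
A minimal pure-Python sketch of this equivalence for two made-up groups: regressing on a 0/1 dummy recovers the group means, and the regression residual sum of squares equals ANOVA's within-group SSE.

```python
# Two made-up groups.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
y = a + b
x = [0.0] * len(a) + [1.0] * len(b)  # dummy: 1 for group b
n = len(y)

# Simple OLS of y on the dummy (closed-form slope and intercept).
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# With treatment coding, b0 is group a's mean and b1 is the difference
# between group b's mean and group a's mean.
mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)

# The regression residual sum of squares equals ANOVA's within-group SSE.
fitted = [b0 + b1 * xi for xi in x]
sse_reg = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sse_anova = sum((yi - mean_a) ** 2 for yi in a) + \
            sum((yi - mean_b) ** 2 for yi in b)
```

For these data the fitted intercept is 2.0 (group a's mean), the slope is 3.0 (the difference of means), and both residual sums of squares are 4.0.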


[^1]: These notes are gleaned from PSU STAT-502, "Analysis of Variance and Design of Experiments," and Laerd Statistics.