14.5 Type I, Type II, and Type III ANOVAs
It turns out that there is not just one way to calculate ANOVAs. In fact, there are three different types, called Type 1, 2, and 3 (or Type I, II and III). These types differ in how they calculate variability (specifically the sums of squares). If your data are relatively balanced, meaning that there are relatively equal numbers of observations in each group, then all three types will give you essentially the same answer (and exactly the same answer when the groups are perfectly balanced). However, if your data are unbalanced, meaning that some groups of data have many more observations than others, then you need to use Type II (2) or Type III (3).
The aov() function in base-R uses Type I sums of squares. Therefore, it is only appropriate when your data are balanced. If your data are unbalanced, you should conduct an ANOVA with Type II or Type III sums of squares. To do this, you can use the Anova() function in the car package. The Anova() function has an argument called type that allows you to specify the type of ANOVA you want to calculate.
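To see why Type I sums of squares are risky with unbalanced data, here is a minimal sketch. The data frame sim and its columns are simulated purely for illustration (they are not the poopdeck data), and I use anova() on an lm() object because it reports the same sequential (Type I) sums of squares that aov() uses. With unbalanced groups, the sum of squares credited to a variable changes depending on whether it is entered first or second:

```r
# Simulated, deliberately unbalanced data (hypothetical; NOT the poopdeck data)
set.seed(1)
sim <- data.frame(
  type    = factor(rep(c("parrot", "shark"), times = c(30, 10))),
  cleaner = factor(c(rep(c("a", "b"), times = c(25, 5)),   # parrot rows
                     rep(c("a", "b"), times = c(2, 8))))   # shark rows
)
sim$time <- 50 + 5 * (sim$type == "shark") + 8 * (sim$cleaner == "b") +
  rnorm(nrow(sim), sd = 4)

# Sequential (Type I) sums of squares for the two possible term orders
ss.type.first    <- anova(lm(time ~ type + cleaner, data = sim))
ss.cleaner.first <- anova(lm(time ~ cleaner + type, data = sim))

# Sum of squares credited to 'type' under each order
ss.type.first["type", "Sum Sq"]      # type entered first
ss.cleaner.first["type", "Sum Sq"]   # type entered after cleaner
```

The residual sum of squares is identical in both tables; only the way the explained variability is split between type and cleaner changes with the order.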
In the next code chunk, I’ll calculate 3 separate ANOVAs from the poopdeck data using the three different types. First, I’ll create a regression object with lm(). As you’ll see, the Anova() function requires you to enter a regression object as the main argument, and not a formula and dataset. That is, you need to first create a regression object from the data with lm() (or glm()), and then enter that object into the Anova() function. You can also do the same thing with the standard aov() function.
# Step 1: Calculate regression object with lm()
time.lm <- lm(formula = time ~ type + cleaner,
              data = poopdeck)
Now that I’ve created the regression object time.lm, I can calculate the three different types of ANOVAs by entering the object as the main argument to either aov() for a Type I ANOVA, or Anova() in the car package for a Type II or Type III ANOVA:
# Type I ANOVA - aov()
time.I.aov <- aov(time.lm)

# Type II ANOVA - Anova(type = 2)
time.II.aov <- car::Anova(time.lm, type = 2)

# Type III ANOVA - Anova(type = 3)
time.III.aov <- car::Anova(time.lm, type = 3)
As it happens, the data in the poopdeck dataframe are perfectly balanced, so we’ll get exactly the same result for each ANOVA type. However, if they were not balanced, then we should not use the Type I ANOVA calculated with the aov() function.
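Here is a small sketch of the balanced case that does not need the car package. The data frame bal is simulated for illustration only, and drop1() is a base-R stand-in I am using to get sums of squares where each term is adjusted for the other one; for a model with no interaction like this, that is the same idea as Type II, but it is not the car::Anova() approach used above. In a balanced design, the order of the terms no longer matters and the sequential and adjusted sums of squares agree:

```r
# Simulated, perfectly balanced data (hypothetical; NOT the poopdeck data)
set.seed(2)
bal <- expand.grid(rep     = 1:10,
                   type    = c("parrot", "shark"),
                   cleaner = c("a", "b", "c"))
bal$time <- 60 + 10 * (bal$type == "shark") + rnorm(nrow(bal), sd = 5)

bal.lm <- lm(time ~ type + cleaner, data = bal)

seq.ss     <- anova(bal.lm)                                 # sequential, type first
seq.ss.rev <- anova(lm(time ~ cleaner + type, data = bal))  # sequential, cleaner first
adj.ss     <- drop1(bal.lm, test = "F")                     # each term adjusted for the other

# Both comparisons should return TRUE for balanced data
all.equal(seq.ss["type", "Sum Sq"], seq.ss.rev["type", "Sum Sq"])
all.equal(seq.ss["type", "Sum Sq"], adj.ss["type", "Sum of Sq"])
```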
To see if your data are balanced, you can use the table() function:
# Are observations in the poopdeck data balanced?
with(poopdeck,
     table(cleaner, type))
##        type
## cleaner parrot shark
##       a    100   100
##       b    100   100
##       c    100   100
As you can see, in the poopdeck data, the observations are perfectly balanced, so it doesn’t matter which type of ANOVA we use to analyse the data.
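If you have many groups and don’t want to check the table by eye, you could wrap the check in a small function. This is only a sketch: cells.balanced() is a hypothetical helper I made up (it is not part of base R or car), and the two data frames are simulated stand-ins rather than the real poopdeck data.

```r
# Hypothetical helper: TRUE when every cell of a count table is the same size
cells.balanced <- function(counts) {
  length(unique(as.vector(counts))) == 1
}

# Simulated balanced data: 100 rows for every type x cleaner combination
balanced.df <- expand.grid(rep     = 1:100,
                           type    = c("parrot", "shark"),
                           cleaner = c("a", "b", "c"))

# Drop the first 40 rows (all parrot / cleaner a) to unbalance the design
unbalanced.df <- balanced.df[-(1:40), ]

cells.balanced(with(balanced.df, table(cleaner, type)))     # TRUE
cells.balanced(with(unbalanced.df, table(cleaner, type)))   # FALSE
```

Note that this check is strict: it returns FALSE even if the groups differ by a single observation, so you will still need to judge for yourself whether a small imbalance matters.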
For more detail on the different types, check out https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/.