## 14.5 Type I, Type II, and Type III ANOVAs

It turns out that there is not just one way to calculate ANOVAs. In fact, there are three different types, called Type 1, 2, and 3 (or Type I, II, and III). These types differ in how they calculate variability (specifically the `sums of squares`). If your data are relatively `balanced`, meaning that there are roughly equal numbers of observations in each group, then all three types will give you the same answer. However, if your data are `unbalanced`, meaning that some groups of data have many more observations than others, then you need to use Type II (2) or Type III (3).
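To make the difference concrete, here is a minimal sketch using only base R. The data frame `toy`, its factors `a` and `b`, and the simulated outcome `y` are all made up for illustration. With unequal cell sizes, Type I (sequential) sums of squares for a factor change depending on whether it is entered first or second in the formula:

```r
# Hypothetical unbalanced data: the four cells have 20, 5, 5, and 20 observations
set.seed(100)
toy <- data.frame(
  a = factor(rep(c("a1", "a1", "a2", "a2"), times = c(20, 5, 5, 20))),
  b = factor(rep(c("b1", "b2", "b1", "b2"), times = c(20, 5, 5, 20)))
)
toy$y <- rnorm(nrow(toy),
               mean = 10 + 2 * (toy$a == "a2") + 3 * (toy$b == "b2"))

# anova() on an lm object reports Type I (sequential) sums of squares
ss.a.first <- anova(lm(y ~ a + b, data = toy))
ss.b.first <- anova(lm(y ~ b + a, data = toy))

ss.a.first["a", "Sum Sq"]   # a entered first
ss.b.first["a", "Sum Sq"]   # a entered after b: a different value
```

The residual sums of squares are identical in both tables; only the way the explained variability is split between `a` and `b` changes with the order.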

The standard `aov()` function in base R uses Type I sums of squares. Therefore, it is only appropriate when your data are balanced. If your data are unbalanced, you should conduct an ANOVA with Type II or Type III sums of squares. To do this, you can use the `Anova()` function in the `car` package. The `Anova()` function has an argument called `type` that allows you to specify the type of ANOVA you want to calculate.
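As a rough illustration of the `type` argument, the sketch below fits the same model with its terms in two different orders and compares the Type II sums of squares. It assumes the `car` package is installed; the data frame `d` and its columns `g`, `h`, and `y` are hypothetical and simulated just for this example.

```r
library(car)  # assumes the car package is installed

# Hypothetical unbalanced data: cell sizes are 12, 3, 4, and 11
set.seed(1)
d <- data.frame(
  g = factor(rep(c("g1", "g1", "g2", "g2"), times = c(12, 3, 4, 11))),
  h = factor(rep(c("h1", "h2", "h1", "h2"), times = c(12, 3, 4, 11)))
)
d$y <- rnorm(nrow(d), mean = 5 + (d$g == "g2") + 2 * (d$h == "h2"))

# Same model, terms entered in two different orders
fit.gh <- lm(y ~ g + h, data = d)
fit.hg <- lm(y ~ h + g, data = d)

# Type II sums of squares for g do not depend on the order of the terms
type2.gh <- Anova(fit.gh, type = 2)
type2.hg <- Anova(fit.hg, type = 2)
all.equal(type2.gh["g", "Sum Sq"], type2.hg["g", "Sum Sq"])
```

This order-independence is the practical reason to prefer `Anova()` over `aov()` when group sizes differ.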

In the next code chunk, I’ll calculate three separate ANOVAs from the poopdeck data using the three different types. First, I’ll create a regression object with `lm()`. As you’ll see, the `Anova()` function requires you to enter a regression object as the main argument, and *not* a formula and dataset. That is, you need to first create a regression object from the data with `lm()` (or `glm()`), and then enter that object into the `Anova()` function. You can also do the same thing with the standard `aov()` function.

```
# Step 1: Calculate regression object with lm()
time.lm <- lm(formula = time ~ type + cleaner,
              data = poopdeck)
```

Now that I’ve created the regression object `time.lm`, I can calculate the three different types of ANOVAs by entering the object as the main argument to either `aov()` for a Type I ANOVA, or `Anova()` in the `car` package for a Type II or Type III ANOVA:

```
# Type I ANOVA - aov()
time.I.aov <- aov(time.lm)
# Type II ANOVA - Anova(type = 2)
time.II.aov <- car::Anova(time.lm, type = 2)
# Type III ANOVA - Anova(type = 3)
time.III.aov <- car::Anova(time.lm, type = 3)
```

As it happens, the data in the poopdeck dataframe are perfectly balanced, so we’ll get exactly the same result for each ANOVA type. However, if they were not balanced, then we should *not* use the Type I ANOVA calculated with the `aov()` function.

To see if your data are balanced, you can use the `table()` function to count the observations in each combination of groups:

```
# Are observations in the poopdeck data balanced?
with(poopdeck,
     table(cleaner, type))
##        type
## cleaner parrot shark
##       a    100   100
##       b    100   100
##       c    100   100
```

As you can see, in the poopdeck data, the observations are perfectly balanced, so it doesn’t matter which type of ANOVA we use to analyse the data.
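If you would rather not eyeball the table, the same check can be sketched in code. The helper `is.balanced()` below is hypothetical (it is not part of base R or any package), and `toy` is a made-up dataframe standing in for your own data. It simply tests whether every cell of the cross-tabulation has the same count:

```r
# Hypothetical helper: TRUE if every combination of the factors
# has the same number of observations
is.balanced <- function(data, factors) {
  counts <- table(data[factors])
  length(unique(as.vector(counts))) == 1
}

# Made-up balanced design: 4 observations per cleaner x type cell
toy <- expand.grid(cleaner = c("a", "b", "c"),
                   type = c("parrot", "shark"),
                   rep = 1:4)

is.balanced(toy, c("cleaner", "type"))        # TRUE
is.balanced(toy[-1, ], c("cleaner", "type"))  # FALSE: one row dropped
```

Note that this check is strict: it returns `FALSE` even when the group sizes differ by a single observation, so you still need to judge whether the imbalance is large enough to matter.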

For more detail on the different types, check out https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/.