# Chapter 14 ANOVA

In the last chapter we covered one- and two-sample hypothesis tests. In these tests, you either compare one group to a hypothesized value, or compare two groups (either their means or their correlation). In this chapter, we’ll cover how to analyse more complex experimental designs with ANOVAs.

When do you conduct an ANOVA? You conduct an ANOVA when you are testing the effect of one or more nominal (aka factor) independent variables on a numerical dependent variable. A nominal (factor) variable is one that contains a finite number of categories with no inherent order. Gender, profession, experimental conditions, and Justin Bieber albums are good examples of factors (not necessarily of good music). If you include only one independent variable, this is called a One-way ANOVA. If you include two independent variables, this is called a Two-way ANOVA. If you include three independent variables, it is called a Menage a trois NOVA.

OK, maybe it’s not yet, but if we repeat it enough it will be, and we can change the world.
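To make the factor versus numeric distinction concrete, here is a minimal sketch in base R. The data frame `toy` and its values are made up purely for illustration; they are not part of the chapter's data.

```r
# Toy values invented for illustration (hypothetical, not real data)
toy <- data.frame(
  cleaner = factor(c("a", "b", "c", "a", "b", "c")),  # nominal independent variable
  time    = c(50, 60, 70, 55, 65, 75)                 # numeric dependent variable
)

is.factor(toy$cleaner)   # TRUE if the independent variable is stored as a factor
levels(toy$cleaner)      # the finite set of categories the factor can take
is.numeric(toy$time)     # TRUE if the dependent variable is numeric
```

If the independent variable fails the first check and the dependent variable passes the last one only after conversion, an ANOVA is probably not the right tool without some recoding.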

For example, let’s say you want to test how good each of three different cleaning fluids is at getting poop off of your poop deck. To test this, you could do the following: over the course of 300 cleaning days, you clean different areas of the deck with the three different cleaners. You then record how long it takes for each cleaner to clean its portion of the deck. At the same time, you could also measure how well each cleaner cleans the two different types of poop that typically show up on your deck: shark and parrot. Here, your independent variables cleaner and type are factors, and your dependent variable time is numeric.

Thankfully, this experiment has already been conducted. The data are recorded in a dataframe called poopdeck in the yarrr package. Here’s how the first few rows of the data look:

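```r
library(yarrr)  # the poopdeck data live in the yarrr package

# Show the first six rows of the data
head(poopdeck)
##   day cleaner   type time int.fit me.fit
## 1   1       a parrot   47      46     54
## 2   1       b parrot   55      54     54
## 3   1       c parrot   64      56     47
## 4   1       a  shark  101      86     78
## 5   1       b  shark   76      77     77
## 6   1       c  shark   63      62     71
```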

We can visualize the poopdeck data using (of course) a pirate plot:

```r
pirateplot(formula = time ~ cleaner + type,
           data = poopdeck,
           ylim = c(0, 150),
           xlab = "Cleaner",
           ylab = "Cleaning Time (minutes)",
           main = "poopdeck data",
           back.col = gray(.97),
           cap.beans = TRUE,
           theme = 2)
```
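As a numerical companion to the plot, the mean cleaning time in each cleaner and poop type combination can be tabulated. This is a small sketch using base R's `aggregate()`; the object name `cell_means` is just an illustrative choice, and the code assumes the yarrr package is installed.

```r
library(yarrr)  # provides the poopdeck data

# Mean cleaning time for each combination of cleaner and poop type
cell_means <- aggregate(time ~ cleaner + type,
                        data = poopdeck,
                        FUN = mean)

# One row per cleaner-by-type cell, with the mean in the time column
cell_means
```

These cell means are the quantities the ANOVAs in this chapter compare, so it helps to glance at them before running any tests.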

Given this data, we can use ANOVAs to answer four separate questions:

| Question | Analysis | Formula |
|---|---|---|
| Is there a difference between the different cleaners on cleaning time (ignoring poop type)? | One-way ANOVA | `time ~ cleaner` |
| Is there a difference between the different poop types on cleaning time (ignoring which cleaner is used)? | One-way ANOVA | `time ~ type` |
| Is there a unique effect of the cleaner or poop types on cleaning time? | Two-way ANOVA | `time ~ cleaner + type` |
| Does the effect of cleaner depend on the poop type? | Two-way ANOVA with interaction term | `time ~ cleaner * type` |
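Each formula in the table can be handed to a model-fitting function. The sketch below uses base R's `aov()`, which is one common way to fit an ANOVA and is not necessarily the only approach; the object names are illustrative assumptions, and the code assumes the yarrr package is installed.

```r
library(yarrr)  # provides the poopdeck data

# One-way ANOVAs: a single independent variable each
cleaner_aov <- aov(formula = time ~ cleaner, data = poopdeck)
type_aov    <- aov(formula = time ~ type, data = poopdeck)

# Two-way ANOVA: both independent variables, no interaction
cleaner_type_aov <- aov(formula = time ~ cleaner + type, data = poopdeck)

# Two-way ANOVA with interaction: cleaner * type expands to
# cleaner + type + cleaner:type
interaction_aov <- aov(formula = time ~ cleaner * type, data = poopdeck)

# Print the ANOVA table for the interaction model
summary(interaction_aov)
```

The only thing that changes between the four analyses is the right-hand side of the formula, which is why the table pairs each question with a formula.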