3.1 Factorial design basics

3.1.1 What’s a factorial design?

Here’s the short definition: a factorial design is one in which you have cases at each combination of levels of the treatment factors.

We’ve actually already seen an example of a factorial design, in the guinea pig experiment. There were two treatment factors, vitamin C dosage (with three levels) and the type of supplement used to administer it (with two levels). And for each combination of these levels, we observed some number of guinea pigs. The same number of guinea pigs, in fact, which made the design balanced.

One important feature of factorial designs is that, because we observe every combination of levels of the treatment factors, we can detect interactions between them – as, indeed, we did in the guinea pig experiment. They also tend to be more efficient than designs that vary one factor at a time, a topic we’ll return to elsewhere.

There are a couple of tools that we use to talk about factorial designs, which become increasingly handy as we start working with more factors.

3.1.2 Visualizing the design space

The first is to think about the design space. This is the mathematical/metaphorical space of all possible combinations of levels of the factors. In the guinea pig experiment, we had two factors, so we can picture the design space as a two-dimensional grid of points, with dosage along one axis and supplement type along the other.

Each point here corresponds to a run – actually, in this case, 10 runs. The point’s “coordinates” in the space show the level of dosage and the level of supplement-type used.
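Since the design space is just every combination of factor levels, it can be enumerated directly. Here is a minimal Python sketch, assuming the levels and the 10 replicates per group described above; the variable names are made up for illustration.

```python
from itertools import product

# Factor levels from the guinea pig experiment described above.
dose_levels = [0.5, 1, 2]
supp_levels = ["OJ", "VC"]
replicates_per_point = 10  # each point in the design space stands for 10 runs

# The design space is the Cartesian product of the factor levels.
design_points = list(product(dose_levels, supp_levels))

for dose, supp in design_points:
    print(f"dose={dose}, supp={supp}: {replicates_per_point} runs")

total_runs = len(design_points) * replicates_per_point
print(len(design_points), "design points,", total_runs, "runs in total")
```

With three dosage levels and two supplement types, this gives six points and 60 runs in total.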

3.1.3 The design matrix

We can also create a design matrix to show the factor levels for each run.

There are a couple of different versions of design matrices. In the most mathematical one, each column corresponds to a particular factor level, and each row corresponds to a run. In this case, I’m going to let each row stand for the 10 replicates in the group. We put a 1 in the cell if that factor level applies to that run, and a 0 otherwise.

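| Intercept | Dose 0.5 | Dose 1 | Dose 2 | Supp OJ | Supp VC |
|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 0 |
| 1 | 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 | 1 | 0 |
| 1 | 0 | 1 | 0 | 0 | 1 |
| 1 | 0 | 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 | 0 | 1 |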
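To make the 0/1 coding concrete, here is a small Python sketch that builds the same indicator matrix, one row per treatment group. The helper name `indicator_row` and the list-of-lists layout are assumptions made for illustration, not a standard API.

```python
from itertools import product

dose_levels = [0.5, 1, 2]
supp_levels = ["OJ", "VC"]

# Column labels: intercept first, then one column per factor level.
columns = (
    ["Intercept"]
    + [f"Dose {d}" for d in dose_levels]
    + [f"Supp {s}" for s in supp_levels]
)


def indicator_row(dose, supp):
    """Return one row: intercept, dose indicators, supplement indicators."""
    dose_part = [1 if dose == d else 0 for d in dose_levels]
    supp_part = [1 if supp == s else 0 for s in supp_levels]
    return [1] + dose_part + supp_part


# One row per combination of levels, in the same order as the table above.
design_matrix = [indicator_row(d, s) for d, s in product(dose_levels, supp_levels)]

print(columns)
for row in design_matrix:
    print(row)
```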

The fun thing about this kind of design matrix is that you can actually use it to see which interaction effects apply to each run. We add a column for each interaction effect. The value of an interaction effect’s column is just the element-wise product of the two main-effect columns involved. Like so:

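| Intercept | Dose 0.5 | Dose 1 | Dose 2 | Supp OJ | Supp VC | 0.5_OJ | 0.5_VC | 1_OJ | 1_VC | 2_OJ | 2_VC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |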
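The element-wise product is easy to check by hand, but a short sketch may help. The Python below starts from the main-effect columns of the table above, typed in as plain lists, and multiplies each dose column by each supplement column; the dictionary layout and label format are just illustrative choices.

```python
# Main-effect indicator columns, one list per column,
# in the row order of the design matrix above.
main_effects = {
    "Dose 0.5": [1, 1, 0, 0, 0, 0],
    "Dose 1": [0, 0, 1, 1, 0, 0],
    "Dose 2": [0, 0, 0, 0, 1, 1],
    "Supp OJ": [1, 0, 1, 0, 1, 0],
    "Supp VC": [0, 1, 0, 1, 0, 1],
}

dose_names = ["Dose 0.5", "Dose 1", "Dose 2"]
supp_names = ["Supp OJ", "Supp VC"]

# Each interaction column is the element-wise product of one dose column
# and one supplement column.
interactions = {}
for dose_name in dose_names:
    for supp_name in supp_names:
        label = f"{dose_name.split()[1]}_{supp_name.split()[1]}"  # e.g. "0.5_OJ"
        interactions[label] = [
            a * b for a, b in zip(main_effects[dose_name], main_effects[supp_name])
        ]

for label, column in interactions.items():
    print(label, column)
```

Each interaction column ends up with a single 1, in the row for the group that received that particular dose via that particular supplement.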

For example, the first row here indicates the group with dosage 0.5 administered via orange juice. That row “involves” the effects of dosage 0.5, OJ, and any interaction between dosage 0.5 and OJ, as well as the intercept or overall mean. The predicted response for such a guinea pig is:

$\hat{\mu} + \widehat{\alpha}_{0.5} + \widehat{\beta}_{OJ} + \widehat{(\alpha\beta)}_{0.5, OJ}$
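The same pattern holds for every row. As a sketch in general notation, where the indices $i$ (dose level) and $j$ (supplement type) and the symbol $\hat{y}_{ij}$ for the predicted response are labels introduced here for illustration:

$$\hat{y}_{ij} = \hat{\mu} + \widehat{\alpha}_{i} + \widehat{\beta}_{j} + \widehat{(\alpha\beta)}_{i,j}, \qquad i \in \{0.5, 1, 2\}, \quad j \in \{OJ, VC\}$$

Reading across a row of the design matrix, the columns containing a 1 pick out exactly the terms that appear in that group’s sum.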

There’s another way of showing experimental design in table form, which is especially useful for two-level designs. Stay tuned!

Response moment: Hey, why is that first column in the design matrix all 1’s?