## 23.1 Mean differences

House insulation is important for saving energy, particularly in cold climates.

Consider a study to estimate the average energy savings made by using a new type of house insulation. Different study designs could be used to address this.

One approach is to take a sample of homes,
and measure the energy consumption
*before* adding the insulation,
and then *after* adding the insulation for the same houses.
Each home gets *two* observations:
the energy consumption *before*
and
*after* adding the insulation.

This is a *descriptive RQ*:
the *Outcome* is the mean energy *saving*,
and the response variable is the energy saving for each house.
There is *no* Comparison:
units of analysis that have been treated differently are not compared.

Alternatively,
the researchers could take a sample of homes *without* the insulation,
and measure their energy consumption;
then take a *different* sample of homes with the insulation,
and measure their energy consumption.

This is a *relational RQ*:
the *Outcome* is the mean energy consumption,
and the response variable is the energy consumption for each house.
The Comparison
is between units of analysis *with* the insulation,
and units of analysis *without* the insulation.

Either study is possible,
and each has advantages and disadvantages
(Zimmerman 1997).
Here the *first* (Descriptive) design would seem superior (why?).
In the first design,
each home gets a *pair* of energy consumption measurements:
this is *paired data*,
which is the subject of this chapter.
The second (Relational) design
requires the means of two different groups of homes to be compared,
which is the topic of the next chapter.

**Definition 23.1 (Paired data)** Data are *paired* when two observations about the same variable are recorded for each unit of analysis.

Since each unit of analysis has two observations about energy consumption,
the *change* (or the *difference*, or the *reduction*) in energy consumption
can be computed for each house.
Then,
questions can be asked about the *population mean difference*,
which is not the same as the *difference between two separate population means*
(the subject of the next chapter).
In paired data,
finding the difference between the two measurements
for each individual unit of analysis makes sense:
each unit of analysis (each house) has two related observations.
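As a minimal sketch of this idea, the differences can be computed house by house and then averaged. The consumption values below are hypothetical and invented purely for illustration (they are not data from any study), and the names `before`, `after` and `savings` are assumptions made for this example.

```python
from statistics import mean

# Hypothetical energy consumption for five houses (arbitrary units);
# these values are invented for illustration only.
before = [1200, 950, 1100, 1300, 1020]  # before adding the insulation
after = [1100, 900, 1000, 1250, 1000]   # after adding the insulation (same houses, same order)

# Paired data: one difference per house.
# Here the difference is defined as before minus after, so a positive value is a saving.
savings = [b - a for b, a in zip(before, after)]

# The sample mean difference (the mean energy saving per house).
mean_saving = mean(savings)

print(savings)
print(mean_saving)
```

The direction of the subtraction (here, *before* minus *after*) is a choice; whichever direction is used should be stated and applied consistently to every house.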

**Think 23.1 (Paired situations)** Which of these are paired situations?

1. The mean difference between blood pressure for 36 people, before and after taking a drug.
2. The difference between the mean HDL cholesterol levels for 22 males and 19 females.
3. The mean protein levels were compared in sea turtles before and after being rehabilitated (March et al. 2018).

**1** and **3** are paired.