34 Relationships between two quantitative variables

So far, you have learnt to ask a RQ, identify different ways of obtaining data, design the study, collect the data, describe the data, summarise data graphically and numerically, understand the tools of inference, form confidence intervals, and perform hypothesis tests.

In this chapter, you will learn about relationships between two quantitative variables. You will learn to:

  • describe the relationships between two quantitative variables.

34.1 Introduction: The red deer data

Consider a study513 that examined the relationship between the age of \(n = 78\) male red deer and the weight of their molars; the data are shown below.

FIGURE 34.1: The male red deer data

34.2 Two quantitative variables: Graphical summaries

For the red deer data, both variables are quantitative, so the appropriate graphical summary (Sect. 12.5) is a scatterplot (Fig. 34.2).

In the graph, the response variable is graphed on the vertical axis, and denoted \(y\); the explanatory variable is graphed on the horizontal axis, and denoted \(x\). The explanatory variable (potentially) influences the response variable, so in this example:

  • The explanatory variable (\(x\)) is the age of the deer (in years), and
  • The response variable (\(y\)) is the weight of molars (in grams).

In other words, the age of the deer would seem likely to influence the weight of the molars. (In some cases, it doesn't matter which is \(x\) and which is \(y\), such as exploring the relationship between height and weight of red deer.)

Each row in the data set (and each point on the scatterplot) corresponds to a single deer (that is, the individual deer are the units of analysis), and two different variables (age; molar weight) are measured on each deer.
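The layout described above (one row per unit of analysis, the explanatory variable on the horizontal axis, the response on the vertical axis) can be sketched in a few lines of Python. This is a minimal illustration only: the values are invented and are not the red deer measurements, and the names `records` and `text_scatter` are hypothetical. A plotting library would normally be used; a crude character grid is used here so the sketch needs nothing beyond the standard library.

```python
# Invented values for illustration only; NOT the red deer measurements.
# Each record is one unit of analysis with two variables measured on it.
records = [
    {"age": 4, "molar_weight": 4.2},
    {"age": 5, "molar_weight": 3.9},
    {"age": 6, "molar_weight": 3.8},
    {"age": 8, "molar_weight": 3.1},
    {"age": 9, "molar_weight": 2.9},
    {"age": 11, "molar_weight": 2.3},
    {"age": 12, "molar_weight": 2.0},
]

# Explanatory variable (x) and response variable (y), in matching order.
xs = [r["age"] for r in records]
ys = [r["molar_weight"] for r in records]


def text_scatter(xs, ys, width=40, height=10):
    """Return a crude character-grid scatterplot: x horizontal, y vertical."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid dividing by zero if every value of a variable is the same.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # Row 0 of the grid is the top of the plot, so flip the row index.
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line) for line in grid)


print(text_scatter(xs, ys))
```

Each `*` corresponds to one record, just as each point on a scatterplot corresponds to one unit of analysis.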

FIGURE 34.2: Molar weight versus age for the red deer data

34.3 Understanding scatterplots

The purpose of a graph is to help us understand the data (Sect. 12.1). To understand the data displayed in a scatterplot, the form, direction, and variation (or the strength) are described:

  1. Form: Identify the overall form or structure of the relationship (e.g., linear; curved upwards; etc.).
  2. Direction: Identify the direction of the relationship (sometimes not relevant if the relationship is non-linear):
    • The variables are positively associated if high values of one variable accompany high values of the other variable, in general.
    • The variables are negatively associated if high values of one variable accompany low values of the other variable, in general.
  3. Variation: The amount of variation in the relationship. A small amount of variation in the response variable for given values of the explanatory variable means the relationship is strong; a lot of variation in the response variable for given values of the explanatory variable means the relationship is less strong.

Anything unusual or noteworthy should also be discussed. These three features help us understand the type of relationship (form and direction), and the strength of that relationship (variation).
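Direction and variation are judged by eye in this chapter, but a rough numerical check can make the definitions concrete. The sketch below is an illustration under stated assumptions, not the chapter's method: the function names `direction` and `spread_by_x` are hypothetical, the data are invented, and the variation is summarised as the standard deviation of the response for each repeated value of the explanatory variable.

```python
from collections import defaultdict
from statistics import mean, stdev


def direction(xs, ys):
    """Classify direction from the sign of the summed products of deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"  # high x tends to accompany high y
    if total < 0:
        return "negative"  # high x tends to accompany low y
    return "none"


def spread_by_x(xs, ys):
    """Standard deviation of y for each x value observed at least twice."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: stdev(vals) for x, vals in sorted(groups.items()) if len(vals) >= 2}


# Invented values for illustration only (not data from any study cited here).
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [9.0, 8.6, 8.1, 7.5, 7.0, 6.6, 6.1, 5.3]

print(direction(xs, ys))    # direction of the association
print(spread_by_x(xs, ys))  # variation in y at each value of x
```

For these invented values, `direction` reports a negative association. Small standard deviations at each value of \(x\), relative to the overall change in \(y\), correspond to a strong relationship. These summaries say nothing about form, which still needs to be judged from the scatterplot.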

Example 34.1 (Describing scatterplots) A study514 examined the lung capacity of children in Boston, measured using the forced expiratory volume (FEV). The scatterplot (Fig. 34.3) could be described as curved (form), where taller children have larger FEVs, in general (direction). The variation in FEV gets larger for taller children.

FIGURE 34.3: FEV plotted against height for children in Boston

Describe the scatterplot of diastolic BP against age (Fig. 34.4), from the NHANES data.

(Answer is here515.)

FIGURE 34.4: Diastolic blood pressure plotted against age for the NHANES data

Example 34.2 (Scatterplots) For the red deer data (Fig. 34.2), the scatterplot could be described as approximately linear (form), with a negative direction (older deer generally have lighter molars); the variation is perhaps moderate.

34.4 Summary

A scatterplot is used to show the relationship between two quantitative variables (the response variable denoted \(y\); the explanatory variable denoted \(x\)). The relationship can be described by the form (linear, or otherwise), the direction of the relationship (sometimes not relevant if the relationship is not linear), and the variation in the relationship (or the strength of the relationship).

34.5 Quick review questions

A study of onion growth516 produced the scatterplot shown in Fig. 34.5.

FIGURE 34.5: Onion yield plotted against planting density

  1. The \(x\)-variable is
  2. The form is best described as
  3. The direction is best described as
  4. The variation is best described as

34.6 Exercises

Selected answers are available in Sect. D.31.

Exercise 34.1 A study evaluated various food mixtures for sheep.517 Describe the scatterplot (Fig. 34.6) in terms of the form of the relationship, the direction of the relationship (if relevant), and the variation in the relationship.

FIGURE 34.6: Scatterplots for the sheep-food data

Exercise 34.2 A study examined the direct current generated by a windmill and its association with wind speed.518 Describe the relationship (Fig. 34.7).

FIGURE 34.7: The relationship between DC output and wind speed

Exercise 34.3 A study examined the mandible length and gestational age for 167 foetuses from the 12th week of gestation onward.519 How would you describe the relationship (Fig. 34.8)?

FIGURE 34.8: The relationship between gestational age and mandible length

Exercise 34.4 A study examined the time taken to deliver soft drinks to vending machines.520 How would you describe the relationship (Fig. 34.9)?

FIGURE 34.9: The time taken to deliver soft drinks to vending machines