# 35 Relationships between two quantitative variables

So far, you have learnt about the research process, including analysing data using confidence intervals and hypothesis tests. Specifically, you have learnt to construct confidence intervals, and perform hypothesis tests, for one group and for comparing two separate groups.

In this chapter, you will learn about relationships between two quantitative variables. You will learn to:

• describe the relationships between two quantitative variables.

## 35.1 Introduction: red deer

So far, RQs about single variables and RQs for comparing two groups have been studied. Comparing the mean value of a quantitative variable in two groups was studied in Sects. 24 and 32. Comparing the percentage of times an outcome of interest appears in two groups was studied in Sects. 25 and 33. In this Chapter (and the next two), the relationship between two quantitative variables is studied.

Our main example in the next three chapters is a study that examined the relationship between the age of $$n = 78$$ male red deer and the weight of their molars; the data are shown below. The data comprise two quantitative variables.

FIGURE 35.1: The male red deer data

## 35.2 Two quantitative variables: graphical summaries

For the red deer data, both variables are quantitative, so the appropriate graphical summary (Sect. 12.6) is a scatterplot (Fig. 35.2). The response variable is graphed on the vertical axis, and denoted $$y$$; the explanatory variable is graphed on the horizontal axis, and denoted $$x$$. In some cases, when only a relationship is being explored, which variable is $$x$$ and which is $$y$$ is not important (for example, see Example 36.7).

Since the explanatory variable (potentially) influences the response variable, in this example:

• The explanatory variable ($$x$$) is the age of the deer (in years), and
• The response variable ($$y$$) is the weight of molars (in grams).

In other words, the age of the deer may influence the weight of the molars. (Supposing that the weight of the molars may influence the age of the deer is silly.)

Each row in the dataset (and each point on the scatterplot) corresponds to a single deer (the units of analysis); two quantitative variables (age; molar weight) are measured on each deer.

FIGURE 35.2: Molar weight versus age for the red deer data
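As a rough illustration of how a scatterplot is laid out, the following Python sketch places the explanatory variable on the horizontal axis and the response variable on the vertical axis, with one point per unit of analysis. The data values and the function name `text_scatterplot` are hypothetical, invented for this sketch; they are not the red deer measurements. The plot is drawn as a character grid so that the sketch needs no plotting library, though in practice statistical software would be used.

```python
# Hypothetical (age, molar weight) pairs, invented for illustration only;
# these are NOT the red deer measurements.
ages = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]  # explanatory variable, x (years)
weights = [4.6, 4.3, 4.4, 3.9, 3.7, 3.8, 3.2, 3.0, 2.9, 2.5]  # response, y (grams)


def text_scatterplot(x, y, width=40, height=12):
    """Return a character-grid scatterplot: x runs left to right, y bottom to top."""
    if len(x) != len(y):
        raise ValueError("each unit of analysis needs one x-value and one y-value")
    x_min, y_min = min(x), min(y)
    # Guard against a zero range (all values equal) to avoid dividing by zero.
    x_range = (max(x) - x_min) or 1
    y_range = (max(y) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for xi, yi in zip(x, y):
        # Scale each observation to a column (from x) and a row (from y).
        col = round((xi - x_min) / x_range * (width - 1))
        row = round((yi - y_min) / y_range * (height - 1))
        # The first grid row is printed first, so flip rows to put large y at the top.
        grid[height - 1 - row][col] = "*"
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


print(text_scatterplot(ages, weights))
```

Each `*` corresponds to one (hypothetical) deer. Swapping the two arguments would swap the roles of the explanatory and response variables.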

## 35.3 Understanding scatterplots

The purpose of a graph is to help understand the data (Sect. 12.1). For a scatterplot, the form, direction, and variation in the relationship (or the strength of the relationship) are described:

1. Form: The overall form or structure of the relationship (e.g., linear; curved upwards; etc.).
2. Direction: The direction of the relationship (sometimes not relevant if the relationship is non-linear):
• The variables are positively associated if high values of one variable accompany high values of the other variable, in general.
• The variables are negatively associated if high values of one variable accompany low values of the other variable, in general.
3. Variation: The amount of variation in the relationship. A small amount of variation in the response variable for given values of the explanatory variable means the relationship is strong; a lot of variation in the response variable for given values of the explanatory variable means the relationship is less strong.

Anything unusual or noteworthy should also be discussed. These three features explain the type of relationship (form; direction), and the strength of that relationship (variation).
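The ideas of direction and variation can be loosely sketched numerically. The Python code below is a minimal sketch only, using hypothetical data invented for illustration and two hypothetical helper functions (`direction` and `typical_spread`). It assumes the relationship is roughly monotonic and that several units share each $$x$$-value; it is not a substitute for looking at the scatterplot.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical data, invented for illustration only: three units observed
# at each of four values of the explanatory variable x.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
y_tight = [2.0, 2.1, 1.9, 4.0, 4.1, 3.9, 6.0, 6.1, 5.9, 8.0, 8.1, 7.9]
y_loose = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0, 9.0]


def direction(x, y):
    """Crude check of direction: is y larger, on average, when x is high?"""
    cut = median(x)
    low = [yi for xi, yi in zip(x, y) if xi <= cut]
    high = [yi for xi, yi in zip(x, y) if xi > cut]
    if not low or not high:
        return "undetermined"  # e.g., every x-value is the same
    diff = mean(high) - mean(low)
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "undetermined"


def typical_spread(x, y):
    """Average standard deviation of y among units sharing the same x-value."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    # A standard deviation needs at least two observations in a group.
    spreads = [stdev(values) for values in groups.values() if len(values) >= 2]
    if not spreads:
        raise ValueError("no x-value is shared by two or more units")
    return mean(spreads)


for label, y in [("tight", y_tight), ("loose", y_loose)]:
    print(label, direction(x, y), round(typical_spread(x, y), 2))
```

Both hypothetical datasets have a positive direction, but the response varies less for given values of $$x$$ in `y_tight`, so that relationship would be described as stronger.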

Example 35.1 (Describing scatterplots) A study measured the lung capacity of children in Boston (using the forced expiratory volume, FEV). The scatterplot (Fig. 35.3) is curved (form), where taller children have larger FEVs, in general (direction). The variation gets larger for taller children.

FIGURE 35.3: FEV plotted against height for children in Boston

Describe the scatterplot of the foliage biomass against the age of $$385$$ small-leaved lime trees (Schepaschenko et al. 2017), shown in Fig. 35.4.

• Form: may start off straight-ish, but then seems hard to assess.
• Direction: biomass increases as age increases (on average).
• Variation: small-ish for small ages; large-ish for older trees (after about 60 years old).

FIGURE 35.4: The age and foliage biomass of small-leaved lime trees grown in Russia ($$n = 385$$)

Example 35.2 (Scatterplots) For the red deer data (Fig. 35.2), the relationship is approximately linear (form) with a negative direction (older deer generally have lighter teeth); the variation is... perhaps moderate.

## 35.4 Summary

A scatterplot displays the relationship between two quantitative variables (the response denoted $$y$$; the explanatory denoted $$x$$). The relationship is described by the form (linear, or otherwise), the direction of the relationship (sometimes not relevant if the graph is not linear), and the variation in the relationship (or the strength of the relationship).

## 35.5 Quick review questions

A study of onion growth produced the scatterplot shown in Fig. 35.5.

FIGURE 35.5: Onion yield plotted against planting density

1. The $$x$$-variable is
2. The form is best described as
3. The direction is best described as
4. The variation is best described as

## 35.6 Exercises

Selected answers are available in Sect. D.32.

Exercise 35.1 A study examined the time taken to deliver soft drinks to vending machines. Describe the relationship (Fig. 35.6, left panel).

Exercise 35.2 A study examined the mandible length and gestational age for 167 foetuses from the 12th week of gestation onward. Describe the relationship (Fig. 35.6, right panel).

FIGURE 35.6: Two scatterplots. Left: The time taken to deliver soft drinks to vending machines. Right: The relationship between gestational age and mandible length.

Exercise 35.3 A study of 25 gorillas recorded information about their chest beating and their size (measured by the breadth of the gorillas' backs). Describe the relationship (Fig. 35.7, left panel).

FIGURE 35.7: Two scatterplots. Left: Chest beating in gorillas. Right: The relationship between DC output and wind speed.

Exercise 35.4 A study examined the relationship between direct current generated by a windmill and wind speed. Describe the relationship (Fig. 35.7, right panel).