So far, you have learnt to ask a RQ, identify different ways of obtaining data, design the study, collect the data, describe the data, summarise data graphically and numerically, understand the tools of inference, form confidence intervals, and perform hypothesis tests.
In this chapter, you will learn about relationships between two quantitative variables. You will learn to:
- describe the relationships between two quantitative variables.
Consider a study [513] that examined the relationship between the age of \(n = 78\) male red deer and the weight of their molars; the data are shown below.
In the graph, the response variable is graphed on the vertical axis, and denoted \(y\); the explanatory variable is graphed on the horizontal axis, and denoted \(x\). The explanatory variable (potentially) influences the response variable, so in this example:
- The explanatory variable (\(x\)) is the age of the deer (in years), and
- The response variable (\(y\)) is the weight of molars (in grams).
In other words, the age of the deer would seem likely to influence the weight of the molars. (In some cases, it doesn't matter which is \(x\) and which is \(y\), such as exploring the relationship between height and weight of red deer.)
Each row in the data set (and each point on the scatterplot) corresponds to a single deer (that is, the individual deer are the units of analysis), and two different variables (age; molar weight) are measured on each deer.
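The structure described above (one row per unit of analysis, with two variables measured on each) can be sketched in code. This is a minimal illustration only: the values are made up for demonstration and are not the red deer data, and the names `deer`, `ages`, `molar_weights` and `points` are hypothetical.

```python
# Made-up values for illustration only; NOT the data from the red deer study.
# Each dictionary is one row of the data set: one deer, two variables.
deer = [
    {"age": 4.0, "molar_weight": 4.5},
    {"age": 7.0, "molar_weight": 3.9},
    {"age": 10.0, "molar_weight": 3.1},
    {"age": 13.0, "molar_weight": 2.6},
]

# The explanatory variable (x) goes on the horizontal axis,
# and the response variable (y) goes on the vertical axis.
ages = [row["age"] for row in deer]                    # x: age (in years)
molar_weights = [row["molar_weight"] for row in deer]  # y: molar weight (in grams)

# Each (x, y) pair is one point on the scatterplot: one point per deer.
points = list(zip(ages, molar_weights))

for x, y in points:
    print(f"age = {x} years, molar weight = {y} g")
```

Any plotting tool could then draw `points` as a scatterplot; the important part is that the number of points equals the number of units of analysis.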
The purpose of a graph is to help us understand the data (Sect. 12.1). To understand the data displayed in a scatterplot, the form, direction, and variation (or the strength) are described:
- Form: Identify the overall form or structure of the relationship (e.g., linear; curved upwards; etc.).
- Direction: Identify the direction of the relationship (sometimes not relevant if the relationship is non-linear):
  - The variables are positively associated if high values of one variable accompany high values of the other variable, in general.
  - The variables are negatively associated if high values of one variable accompany low values of the other variable, in general.
- Variation: The amount of variation in the relationship. A small amount of variation in the response variable for given values of the explanatory variable means the relationship is strong; a lot of variation in the response variable for given values of the explanatory variable means the relationship is less strong.
Anything unusual or noteworthy should also be discussed. These three features help us understand the type of relationship (form and direction), and the strength of that relationship (variation).
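Direction and variation are judged by eye from the scatterplot, but a rough numerical check can make the definitions concrete. The sketch below is an assumption-laden illustration, not a method from this chapter: the function names `direction` and `spread_by_x` are hypothetical, the data are made up, and the direction check is only sensible when the form is roughly linear. Direction is taken from whether above-average \(x\) values tend to accompany above-average or below-average \(y\) values; variation is summarised as the range of \(y\) for each given value of \(x\).

```python
from collections import defaultdict


def direction(x, y):
    """Crude direction check: do high x values accompany high or low y values?"""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Positive when points above (below) average in x tend to be
    # above (below) average in y; negative when the pattern is reversed.
    co_movement = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "no clear direction"


def spread_by_x(x, y):
    """Range (max - min) of the response for each given value of x."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    return {xi: max(ys) - min(ys) for xi, ys in sorted(groups.items())}


# Made-up data for illustration only.
x_demo = [1, 1, 2, 2, 3, 3]
y_demo = [10, 12, 8, 9, 5, 7]

print(direction(x_demo, y_demo))    # high x values accompany low y values
print(spread_by_x(x_demo, y_demo))  # small ranges suggest a stronger relationship
```

Small ranges of \(y\) at each value of \(x\), relative to the overall change in \(y\), correspond to a strong relationship; large ranges correspond to a weaker one. These numbers supplement, and do not replace, looking at the scatterplot itself, since they say nothing about the form.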
To demonstrate the use of these descriptions, see the example scatterplots below.
Example 34.1 (Describing scatterplots) A study [514] examined the lung capacity of children in Boston, measured using the forced expiratory volume (FEV). The scatterplot (Fig. 34.3) could be described as curved (form), where older children have larger FEVs, in general (direction). The variation gets larger for taller youth.
Describe the scatterplot of diastolic BP against age (Fig. 34.4), from the NHANES data.
(The answer is given in footnote [515].)
Example 34.2 (Scatterplots) For the red deer data (Fig. 34.2), the scatterplot could be described as approximately linear (form), with a negative direction (older deer generally have lighter molars); the variation is... perhaps moderate.
A scatterplot is used to show the relationship between two quantitative variables (the response denoted \(y\); the explanatory denoted \(x\)). The relationship can be described by the form (linear, or otherwise), the direction of the relationship (sometimes not relevant if the graph is not linear), and the variation in the relationship (or the strength of the relationship).
- The \(x\)-variable is: ____
- The form is best described as: ____
- The direction is best described as: ____
- The variation is best described as: ____
Selected answers are available in Sect. D.31.
Exercise 34.1 A study evaluated various food mixtures for sheep [517]. Describe the scatterplot (Fig. 34.6) in terms of the form of the relationship, the direction of the relationship (if relevant), and the variation in the relationship.