## 33.3 Understanding scatterplots

The purpose of a graph is to help us understand the data (Sect. 12.1). To understand the data displayed in a scatterplot, describe the form, direction, and variation (or strength) of the relationship:

1. Form: Identify the overall form or structure of the relationship (e.g., linear, or curved upwards).
2. Direction: Identify the direction of the relationship (sometimes not relevant if the relationship is non-linear):
   - The variables are positively associated if high values of one variable accompany high values of the other variable, in general.
   - The variables are negatively associated if high values of one variable accompany low values of the other variable, in general.
3. Variation: Identify the amount of variation in the relationship. A small amount of variation in the response variable for given values of the explanatory variable means the relationship is strong; a lot of variation in the response variable for given values of the explanatory variable means the relationship is weaker.

These three features help us understand the type of relationship (form and direction) and the strength of that relationship (variation). Anything unusual or noteworthy should also be discussed.
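For an approximately linear relationship, direction and variation can also be summarised numerically. The sketch below is only an illustration, not a method used in this section: it assumes the form is roughly linear, uses the sign of the Pearson correlation coefficient for direction, and uses the spread of the residuals about a least-squares line as one possible measure of variation. The function names and the data values are hypothetical, invented purely for demonstration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residual_sd(x, y):
    """Standard deviation of the residuals about the least-squares line.

    Smaller values mean less variation in the response for given values
    of the explanatory variable (a stronger relationship).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return math.sqrt(rss / (n - 2))


def direction(x, y):
    """Label the direction using the sign of the correlation coefficient."""
    r = pearson_r(x, y)
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6]
y_tight = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # little variation about a line
y_loose = [4.0, 1.5, 8.0, 5.0, 12.5, 9.0]    # more variation about a line
y_down = [12.0, 10.1, 7.8, 6.2, 3.9, 2.1]    # decreasing trend

for label, y in [("tight", y_tight), ("loose", y_loose), ("down", y_down)]:
    print(label, direction(x, y), round(pearson_r(x, y), 3),
          round(residual_sd(x, y), 3))
```

These summaries do not replace looking at the scatterplot: the form must still be judged by eye, and a correlation coefficient can be misleading when the relationship is curved.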

The following examples demonstrate the use of these descriptions.

Example 33.1 (Describing scatterplots) A study (Tager et al. 1979; Kahn 2005) examined the lung capacity of children in Boston, measured using the forced expiratory volume (FEV). The scatterplot (Fig. 33.3) could be described as curved (form), where older children have larger FEVs, in general (direction). The variation gets larger for taller youth.

Think 33.1 (Scatterplots) Describe the scatterplot of diastolic blood pressure (BP) against age (Fig. 33.4), from the NHANES data.

Form: curved. Direction: not relevant (up, then down). Variation: large.

Example 33.2 (Scatterplots) For the red deer data (Fig. 33.2), the scatterplot could be described as approximately linear (form), with a negative direction (older deer generally have lighter teeth); the variation is perhaps moderate.

### References

Kahn M. An exhalent problem for teaching statistics. Journal of Statistics Education. 2005;13(2).
Tager IB, Weiss ST, Rosner B, Speizer FE. Effect of parental cigarette smoking on the pulmonary function of children. American Journal of Epidemiology. 1979;110(1):15–26.