15 Comparing quantitative data within individuals

So far, you have learnt to ask a RQ, design a study, collect the data, and describe the data. In this chapter, you will learn to:

  • summarise within-individual changes in quantitative data using appropriate graphs.
  • summarise within-individual changes using summary tables.

15.1 Introduction

Sometimes the same variable is measured more than once on each individual (i.e., on each unit of analysis), but only a small number of times, so that within-individual changes can be studied. Examples of this type of data include:

  • Measurements of household water consumption for many households, before and after installing water-saving devices.
  • Blood pressure recorded for people at \(8\)am, \(1\)pm and \(8\)pm each day.

In both cases, the same variable is measured multiple times for each individual. Within-individuals changes require different methods than between-individuals comparisons. This chapter considers how to summarise within-individuals changes in quantitative variables.

15.2 Summarising the comparison: mean differences

The best way to compare the two sets of measurements is to summarise the differences between the before and after measurements for each individual; for example, using the mean difference. A numerical summary table can be constructed summarising both sets of measurements, plus the differences.

Example 15.1 (Within-individual comparisons) Lothian, Grey, and Lands (2006) studied children with atopic asthma, and measured the immunoglobulin E concentrations (IgE) before and after an intervention for each child (Table 15.1). The child is the individual.

For the IgE data, the numerical summary table is shown in Table 15.2. The direction of the difference is implied by the word 'reduction'.

TABLE 15.1: The IgE before and after an intervention, and the change in IgE (in micrograms/L)
IgE (before) in micrograms/L IgE (after) in micrograms/L IgE reduction in micrograms/L
\(\phantom{0}\phantom{0}83\) \(\phantom{0}\phantom{0}83\) \(\phantom{0}\phantom{0}\phantom{0}0\)
\(\phantom{0}292\) \(\phantom{0}292\) \(\phantom{0}\phantom{0}\phantom{0}0\)
\(\phantom{0}293\) \(\phantom{0}292\) \(\phantom{0}\phantom{0}\phantom{0}1\)
\(\phantom{0}623\) \(\phantom{0}542\) \(\phantom{0}\phantom{0}81\)
\(\phantom{0}792\) \(\phantom{0}709\) \(\phantom{0}\phantom{0}83\)
\(1543\) \(1000\) \(\phantom{0}543\)
\(1668\) \(1000\) \(\phantom{0}668\)
\(1960\) \(1626\) \(\phantom{0}334\)
\(2877\) \(2502\) \(\phantom{0}375\)
\(2961\) \(2711\) \(\phantom{0}250\)
\(5504\) \(4504\) \(1000\)
TABLE 15.2: A numerical summary of the IgE data (in \(\mu\)g/L)
Mean Std. dev. Sample size
Before \(\phantom{0}1690.5\) \(1615.53\) \(11\)
After \(\phantom{0}1387.4\) \(1354.28\) \(11\)
Reduction \(\phantom{0}\phantom{0}303.2\) \(\phantom{0}325.28\) \(11\)
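The entries in Table 15.2 can be checked with a few lines of code. The following is a minimal sketch in Python, using only the standard library; the variable and function names are illustrative choices, not part of the original study. The reduction is computed as the before value minus the after value, so positive values mean that IgE decreased.

```python
from statistics import mean, stdev

# IgE concentrations (micrograms/L) from Table 15.1, one pair per child
before = [83, 292, 293, 623, 792, 1543, 1668, 1960, 2877, 2961, 5504]
after = [83, 292, 292, 542, 709, 1000, 1000, 1626, 2502, 2711, 4504]

# Reduction = before minus after, computed within each child
reduction = [b - a for b, a in zip(before, after)]

def summarise(values):
    """Return the mean, sample standard deviation and sample size."""
    return {"mean": mean(values), "sd": stdev(values), "n": len(values)}

summary = {
    "Before": summarise(before),
    "After": summarise(after),
    "Reduction": summarise(reduction),
}

for name, row in summary.items():
    print(f"{name:<10}{row['mean']:>9.1f}{row['sd']:>10.2f}{row['n']:>5d}")
```

The standard deviation of the reductions is computed from the eleven individual reductions; it cannot be found from the standard deviations of the before and after measurements alone.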

15.3 Graphs

For within-individual changes for a quantitative variable, options for plotting include:

  • Histograms of differences (Sect. 15.3.1): useful for changes in pairs of measurements or observations.
  • Case-profile plots (Sect. 15.3.2): useful when the same individuals are measured or observed a small number of times.

15.3.1 Histogram of differences

Sometimes the same variable is measured twice on each unit of analysis. In this case, the change (or difference) for each individual can be computed, and a histogram of these differences constructed. The direction of the differences should be clear (e.g., first measurement minus second, or second measurement minus first).

Example 15.2 (Within-individual comparisons) For the IgE data (Table 15.1), the reduction in IgE for each child can be shown using a histogram (Fig. 15.1, bottom panel).

FIGURE 15.1: The IgE data. Top: a case-profile plot. Each line represents one subject, joining that person's pre-intervention score to their post-intervention score. Bottom: a histogram of the differences
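The counts behind a histogram of differences can be sketched as follows (Python, standard library only). The bin width of 250 micrograms/L is an assumption made for illustration, and may not match the bins used in Fig. 15.1; the function name is also illustrative.

```python
# IgE reductions (before minus after, in micrograms/L) from Table 15.1
reduction = [0, 0, 1, 81, 83, 543, 668, 334, 375, 250, 1000]

BIN_WIDTH = 250  # assumed bin width; other widths are equally valid

def bin_counts(values, width):
    """Count the values in each bin [lower, lower + width), keeping empty bins."""
    low = min(values) // width
    top = max(values) // width
    counts = {k * width: 0 for k in range(low, top + 1)}
    for v in values:
        counts[(v // width) * width] += 1
    return counts

counts = bin_counts(reduction, BIN_WIDTH)

# A rough text histogram: one '#' per child
for lower, count in counts.items():
    print(f"{lower:>5} to under {lower + BIN_WIDTH:<5} {'#' * count}")
```

Each value is placed in the bin whose lower limit it equals or exceeds, so a reduction of exactly 250 is counted in the bin starting at 250.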

15.3.2 Case-profile plots

Sometimes the variable is measured or recorded more than twice, and so a single set of differences cannot be produced. In these cases, the values for each individual can be plotted using a case-profile plot. A case-profile plot is still useful for paired data, of course.

Example 15.3 (Case-profile plot) For the IgE data (Table 15.1), the measurements of IgE for each child at both times can be shown in a case-profile plot (Fig. 15.1, top panel). Each line corresponds to a unit of analysis (i.e., a child).
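A case-profile plot for paired data needs one line segment per individual. The sketch below (in Python) prepares those segments for the IgE data, and draws them only if the matplotlib plotting library happens to be installed. The function name, the output file name and the styling are illustrative assumptions, not a reproduction of Fig. 15.1.

```python
# IgE concentrations (micrograms/L) from Table 15.1, one pair per child
before = [83, 292, 293, 623, 792, 1543, 1668, 1960, 2877, 2961, 5504]
after = [83, 292, 292, 542, 709, 1000, 1000, 1626, 2502, 2711, 4504]

def profile_lines(first, second):
    """Return one (x, y) pair of coordinate lists per individual."""
    return [([0, 1], [f, s]) for f, s in zip(first, second)]

lines = profile_lines(before, after)

try:
    import matplotlib
    matplotlib.use("Agg")  # draw to a file, so no display is needed
    import matplotlib.pyplot as plt
except ImportError:
    plt = None  # the drawing step is skipped without matplotlib

if plt is not None:
    fig, ax = plt.subplots()
    for x, y in lines:
        ax.plot(x, y, marker="o", color="grey")  # one line per child
    ax.set_xticks([0, 1])
    ax.set_xticklabels(["Before", "After"])
    ax.set_ylabel("IgE (micrograms/L)")
    fig.savefig("ige_case_profile.png")  # hypothetical file name
    plt.close(fig)
```

With more than two measurements per individual, the same idea applies: each individual contributes one line joining all of that individual's values.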

Example 15.4 (Case-profile plot) Runners use wearable devices to measure many performance indicators, including vertical oscillation (VO). VO contributes to running economy and injury risk, so reliable VO measurements are crucial. Smith et al. (2022) compared four devices, and also obtained data from video analysis, for \(n = 15\) athletes; that is, each participant had the same runs measured using five methods. The case-profile plot (Fig. 15.2) shows the means for each method using a solid point. NOVA and Footpod give smaller VO measurements in general.

FIGURE 15.2: Vertical oscillation measured using five methods for \(15\) runners. The solid black points represent the means for each method. Left: a line is plotted for each individual. Right: only the means are shown, with vertical lines from the minimum value to the maximum value.

As in the last example, the case-profile plot is hard to read with large numbers of individuals, and so sometimes the mean (or median, as appropriate) is shown, with some measure of the variation of the observations (Fig. 15.2 shows the minimum and maximum values for each method, for instance).

15.4 Example: invasive plants

Skypilot (Polemonium viscosum) is a native alpine wildflower growing in the Colorado Rocky Mountains (USA). In recent years, a willow shrub (Salix) has been encroaching on skypilot territory and, because willow often flowers early, researchers (Kettenbach et al. 2017) are concerned that the willow may 'negatively affect pollination regimes of resident alpine wildflower species' (p. 6965). One RQ was:

In the Colorado Rocky Mountains, what is the mean difference between first-flowering day for the native skypilot and the encroaching willow?

Data for both species were collected at \(25\) different sites. The site is the individual; the data are paired (Sect. 27.1), a form of blocking (Sect. 7.2). The data are shown in Table 15.3. The 'first-flowering day' is the number of days since the start of the year (e.g., January 12 is 'day 12') when flowers were first observed.
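The conversion from a calendar date to a day of the year can be sketched in Python using the standard library. The year 2017 below is a hypothetical choice for illustration only (the year of data collection is not given here), and the function name is illustrative.

```python
from datetime import date

def day_of_year(d):
    """Return the number of days since the start of the year (January 1 is day 1)."""
    return d.timetuple().tm_yday

# Hypothetical dates in a non-leap year
print(day_of_year(date(2017, 1, 12)))  # prints 12
print(day_of_year(date(2017, 7, 20)))  # prints 201
```

In a leap year, every date after February 29 has a day number one larger than the same calendar date in a non-leap year.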

TABLE 15.3: The day of the year of first flowering by encroaching willow and native skypilot.
Site Willow Skypilot
\(\phantom{0}1\) \(201\) \(201\)
\(\phantom{0}2\) \(178\) \(179\)
\(\phantom{0}3\) \(189\) \(189\)
\(\phantom{0}4\) \(189\) \(189\)
\(\phantom{0}5\) \(196\) \(203\)
\(\phantom{0}6\) \(207\) \(203\)
\(\phantom{0}7\) \(199\) \(199\)
\(\phantom{0}8\) \(178\) \(182\)
\(\phantom{0}9\) \(178\) \(178\)
\(10\) \(191\) \(191\)
\(11\) \(187\) \(192\)
\(12\) \(190\) \(197\)
\(13\) \(190\) \(190\)
\(14\) \(209\) \(209\)
\(15\) \(221\) \(221\)
\(16\) \(179\) \(188\)
\(17\) \(174\) \(179\)
\(18\) \(172\) \(166\)
\(19\) \(196\) \(196\)
\(20\) \(173\) \(173\)
\(21\) \(180\) \(173\)
\(22\) \(181\) \(179\)
\(23\) \(186\) \(186\)
\(24\) \(194\) \(209\)
\(25\) \(197\) \(197\)

Since the raw data are available, the data should be summarised graphically (Fig. 15.4) and numerically (Table 15.4), using software output (Fig. 15.3).

FIGURE 15.3: jamovi output for the flowering-day data

TABLE 15.4: The day of first flowering for encroaching willow and native skypilot
Mean Std. dev. Std. error Sample size
Willow (encroaching) \(189.40\) \(\phantom{0}12.200\) \(\phantom{0}2.440\) \(25\)
Skypilot (native) \(190.76\) \(\phantom{0}13.062\) \(\phantom{0}2.612\) \(25\)
Differences \(\phantom{0}\phantom{0}1.36\) \(\phantom{0}\phantom{0}4.698\) \(\phantom{0}0.940\) \(25\)

FIGURE 15.4: The flowering-day data. Left: a histogram of the difference between the first-flowering days (skypilot minus willow). Right: a case-profile plot of days of first flowering (unfilled points and dashed lines indicate earlier dates (smaller values) for willow)
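The 'Differences' row of Table 15.4 can be reproduced from the data in Table 15.3. The following Python sketch (standard library only) computes the differences as skypilot minus willow, matching the histogram in Fig. 15.4, and assumes the usual formula for the standard error of a mean: the standard deviation divided by the square root of the sample size. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Day of first flowering at each of the 25 sites (Table 15.3)
willow = [201, 178, 189, 189, 196, 207, 199, 178, 178, 191, 187, 190, 190,
          209, 221, 179, 174, 172, 196, 173, 180, 181, 186, 194, 197]
skypilot = [201, 179, 189, 189, 203, 203, 199, 182, 178, 191, 192, 197, 190,
            209, 221, 188, 179, 166, 196, 173, 173, 179, 186, 209, 197]

# Differences within each site: skypilot minus willow,
# so positive values mean willow flowered first
diffs = [s - w for s, w in zip(skypilot, willow)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)         # sample standard deviation
se_diff = sd_diff / sqrt(n)    # standard error of the mean difference

print(f"Mean: {mean_diff:.2f}  Std. dev.: {sd_diff:.3f}  "
      f"Std. error: {se_diff:.3f}  n: {n}")
```

Many of the differences are zero, since the two species first flowered on the same day at many sites.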

15.5 Example: pain-relieving tape

A study examined the effect of using Kinesio Tape (Naugle et al. 2021) to alleviate pain in athletes. Pain was measured by applying pressure at a slow, constant rate to the left arm; subjects pressed a button when the sensation changed from pressure to pain, and the pressure at that moment (the pain threshold) was recorded. Measurements were taken \(5\) mins before applying the tape, \(5\) mins after applying the tape, and again \(15\)--\(20\) mins after applying the tape.

Figure 15.5 shows the pain thresholds for the \(16\) subjects. A summary table is shown in Table 15.5.

FIGURE 15.5: Pain threshold (left arm) at three time points when using Kinesio Tape, without applying tension, for \(n = 16\) subjects. The black points represent the means for each time point.

TABLE 15.5: A numerical summary of the Tape data
Mean (in kPa) Std. dev. (in kPa) Sample size Mean change from Pre (in kPa) Std. dev. of change (in kPa)
Pre: 5 mins \(446.5\) \(175.18\) \(16\)
Post: 5 mins \(479.6\) \(199.61\) \(16\) \(33.1\) \(\phantom{0}73.93\)
Post: 15-20 mins \(506.9\) \(214.36\) \(16\) \(60.4\) \(102.72\)
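The mean changes in Table 15.5 can be checked from the means alone, because the mean of the within-individual changes equals the difference between the two means. A minimal Python sketch, with illustrative variable names and the changes measured from the pre-tape value:

```python
# Mean pain thresholds (in kPa) from Table 15.5
means_kpa = {
    "Pre: 5 mins": 446.5,
    "Post: 5 mins": 479.6,
    "Post: 15-20 mins": 506.9,
}

baseline = means_kpa["Pre: 5 mins"]

# Mean change = mean at the later time minus mean before applying the tape
mean_change = {
    label: round(value - baseline, 1)
    for label, value in means_kpa.items()
    if label != "Pre: 5 mins"
}

for label, change in mean_change.items():
    print(f"{label}: mean change of {change} kPa")
```

The standard deviations of the changes cannot be checked in this way: they depend on how each subject's measurements vary together, and so require the individual changes.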

15.6 Chapter summary

Quantitative data measured within individuals can be summarised using a histogram of differences when the variable is measured (or observed) twice, or a case-profile plot (with two or more measurements or observations). A summary table should show the numerical summaries for the quantitative variable at each measurement or observation and, if appropriate, the changes.

15.7 Quick review questions

Are the following statements true or false?

  1. A histogram of the differences is only appropriate for showing changes for two measurements or observations.
  2. A case-profile plot is only appropriate for showing changes for two measurements or observations.
  3. The median and IQR are not appropriate for summarising differences.
  4. Explaining how the differences are computed is important.

15.8 Exercises

Answers to odd-numbered exercises are available in App. E.

Exercise 15.1 [Dataset: Insulation] The Electricity Council in Bristol wanted to determine if a certain type of wall-cavity insulation reduced energy consumption in winter (The Open University (1983), Hand et al. (1996)). Their RQ was:

In Bristol homes, what is the mean reduction in energy consumption after adding home insulation?

  1. What are the individuals (units of analysis)?
  2. Use the collected data (shown below) to sketch a case-profile plot.
  3. Use the data to sketch a histogram of the differences.
  4. Use software or a calculator to prepare a summary table.

Exercise 15.2 [Dataset: Captopril] In a study of hypertension (Hand et al. 1996; MacGregor et al. 1979), \(15\) patients were given a drug (Captopril) and their systolic blood pressure measured (in mm Hg) immediately before and two hours after being given the drug (Table 15.6).

  1. Explain why this is a within-individuals comparison.
  2. Construct a histogram of the differences.
  3. Construct a case-profile plot for the data.
TABLE 15.6: The Captopril data: before and after systolic blood pressures (in mm Hg)
Before After Before After
\(210\) \(201\) \(173\) \(147\)
\(169\) \(165\) \(146\) \(136\)
\(187\) \(166\) \(174\) \(151\)
\(160\) \(157\) \(201\) \(168\)
\(167\) \(147\) \(198\) \(179\)
\(176\) \(145\) \(148\) \(129\)
\(185\) \(168\) \(154\) \(131\)
\(206\) \(180\)

Exercise 15.3 [Dataset: PainRelief] Augustino et al. (2023) measured the reported pain of new mothers in Dodoma (Tanzania) at four times: near giving birth, then \(20\), \(40\) and \(60\) minutes after giving birth. Mothers were administered either paracetamol or a cold pack as pain relief. Pain was recorded using a 'numeric rating scale represented by the horizontal line marked from zero to ten', where higher scores mean greater pain.

Since the number of individuals is large (\(n = 912\)), use the summary data in Table 15.7 to sketch a plot of the means and the range, like that in Fig. 15.2 (right panel).

TABLE 15.7: Reported pain for mothers after giving birth
At birth After 20 mins After 40 mins After 60 mins
Paracetamol Mean \(7.44\) \(6.89\) \(4.69\) \(2.84\)
(\(n = 456\)) Std. deviation \(2.01\) \(1.83\) \(1.49\) \(1.19\)
Minimum \(2.00\) \(2.00\) \(2.00\) \(0.00\)
Maximum \(10.00\) \(10.00\) \(9.00\) \(7.00\)
Cold pack Mean \(8.63\) \(5.67\) \(3.19\) \(0.99\)
(\(n = 455\)) Std. deviation \(1.40\) \(2.03\) \(1.63\) \(0.99\)
Minimum \(4.00\) \(0.00\) \(0.00\) \(0.00\)
Maximum \(10.00\) \(9.00\) \(6.00\) \(4.00\)

Exercise 15.4 [Dataset: Stress] The concentration of beta-endorphins in the blood is a sign of stress. One study (Hand et al. (1996), Dataset 232; Hoaglin, Mosteller, and Tukey (2011)) measured the beta-endorphin concentration for \(19\) patients about to undergo surgery.

Each patient had their beta-endorphin concentrations measured \(12\)--\(14\) hours before surgery, and also \(10\) minutes before surgery. A numerical summary (from the jamovi output) is in Table 15.8.

TABLE 15.8: The numerical summary for the presurgical stress data
Mean Std deviation Std error Sample size
12--14 hours before surgery \(\phantom{0}8.35\) \(\phantom{0}4.397\) \(1.009\) \(19\)
10 minutes before surgery \(16.05\) \(12.509\) \(2.870\) \(19\)
Increase \(\phantom{0}7.70\) \(13.519\) \(3.102\) \(19\)
  1. Explain why this is a within-individuals comparison.
  2. Construct a histogram of the differences.
  3. Construct a case-profile plot for the data.

Exercise 15.5 Romero-Blanco et al. (2020) measured (among other things) the number of minutes of vigorous physical activity (PA) performed by Spanish health students before and during the COVID-19 lockdown (from March to April 2020 in Spain). Since PA was measured both before and during the lockdown for each participant, the data are paired (within individuals). The data are summarised in Table 15.9.

  1. Explain why this is a within-individuals comparison.
  2. Construct a histogram of the differences.
  3. Construct a case-profile plot for the data.
TABLE 15.9: Summary information for the COVID-lockdown exercise data for \(n = 214\) Spanish students
Mean (mins) Std. dev. (mins)
Before \(28.47\) \(54.13\)
During \(30.66\) \(30.04\)
Increase \(\phantom{0}2.68\) \(51.30\)

Exercise 15.6 [Dataset: Running] Create a summary table for the data in Example 15.4.