19 Making decisions

So far, you have learnt to ask an RQ, design a study, and describe and summarise the data. In this chapter, you will learn how decisions are made in science, so you can answer RQs. You will learn to:

  • state the two broad reasons that can explain why the value of a statistic differs from the value of a parameter.
  • explain how decisions are made in research.

19.1 Introduction

In Sect. 17.4, a study of three rural communities in Cameroon and their access to water was described (López-Serrano et al. 2022). One purpose of the study was to determine contributors to the incidence of diarrhoea in young children.

In the observed sample, the odds of an incidence of diarrhoea in children in the last two weeks among households without livestock were \(0.176\) (Table 15.3). However, the odds in households with livestock were \(0.548\). In other words, the odds of an incidence of diarrhoea in young children were more than three times as great in households with livestock as in households without (i.e., \(\text{OR} = 0.548/0.176 = 3.11\)). That is, the odds are not the same in the sample; however, RQs ask about the population.
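The odds ratio quoted above is just the ratio of the two sample odds. As a quick illustration (not part of the original study's analysis), the arithmetic can be checked in Python using the odds reported in Table 15.3:

```python
# Sample odds of diarrhoea in the last two weeks (from Table 15.3)
odds_with_livestock = 0.548     # households with livestock
odds_without_livestock = 0.176  # households without livestock

# The odds ratio compares the odds with livestock to the odds without
odds_ratio = odds_with_livestock / odds_without_livestock
print(round(odds_ratio, 2))     # 3.11
```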

Of course, the sample studied is just one of countless possible samples that could have been studied. Some other samples may even have produced the opposite result: a greater incidence of diarrhoea in households without livestock. Every possible sample comprises different households, and the sample results depend on which households are in the studied sample. This leads to one of the most important observations about sampling.

Studying a sample leads to the following observations:

  • Every sample is likely to be different.
  • We observe just one of the many possible samples.
  • Every sample is likely to yield a different value for the sample statistic.
  • We observe just one of the many possible values for the statistic.

Since many samples are possible, the value of the statistic varies from sample to sample (this is called sampling variation), and the possible values of the statistic have a distribution (this is called a sampling distribution).

We observe just one of the many possible samples, and in that sample the odds are not the same in the two groups. What does this imply about the population odds in the two groups? Two reasons could explain why the sample odds in the two groups are not the same:

  1. The two population odds are the same, but the two sample odds are not the same due to sampling variation. That is, we just happen to have---by chance---one of those samples where the two sample odds are not the same, even though they are the same in the population.
  2. The two population odds are different, and the different sample odds simply reflect the situation in the population.

These two possible explanations ('statistical hypotheses') for the sample odds being unequal have special names:

  1. The first explanation is the null hypothesis, or \(H_0\): the population odds are the same, and the sample odds are not equal simply due to sampling variation.

  2. The second explanation is the alternative hypothesis, or \(H_1\): the population odds are not the same, which is reflected in the sample odds.

How do we decide which of these explanations is supported by the data? What is the decision-making process?

The usual approach to decision making in science begins by assuming the null hypothesis (the sampling variation explanation) is true. Then the data are examined to see if compelling evidence exists to change our mind (and support the alternative hypothesis). Conclusions drawn about the population from the sample can never be certain, since the sample studied is just one of many possible samples that could have been taken (and every sample is likely to be different).

The onus is on the data to refute the null hypothesis. That is, the null hypothesis is retained unless compelling evidence suggests otherwise.

19.2 The need for making decisions

In research, decisions need to be made about populations based on samples. That means that decisions need to be made about parameters based on statistics. The challenge is that the decision must be made using only one of the many possible samples, and every sample is likely to be different, and so each sample will produce different statistics. This is called sampling variation.

Definition 19.1 (Sampling variation) Sampling variation refers to how the sample estimates (statistics) vary from sample to sample, because every possible sample is different.

Sensible decisions can be made (and are made) about parameters based on statistics. To do this though, the process of how decisions are made needs to be articulated, which is the purpose of this chapter.

To begin, suppose I produce a pack of cards, shuffle it well, and am interested in 'the number of red cards when I draw \(15\) cards from the pack'. The pack of cards can be considered the population (where the proportion of red cards is \(p = 0.5\)). Suppose I draw a sample of \(15\) cards from the pack. Define \(\hat{p}\) as the proportion of red cards in the sample.

Suppose my sample of \(15\) cards produces \(\hat{p} = 1\); that is, all \(n = 15\) cards are red cards. What should you conclude? How likely is it that this would happen by chance from a fair pack (see the animation below)? Is this evidence that the pack of cards is somehow unfair, poorly shuffled, or manipulated?

Getting \(15\) red cards out of \(15\) (i.e., \(\hat{p} = 1\)) from a well-shuffled pack seems very unlikely; you would probably conclude that the pack is somehow unfair. Importantly, how did you reach that decision? Your unconscious decision-making process may have been this:

  1. You assumed, quite reasonably, that I used a standard, well-shuffled pack of cards, where half the cards are red and half the cards are black. That is, you assumed the population proportion of red cards really was \(p = 0.5\).
  2. Based on that assumption, you expected about half the cards in the sample of \(15\) to be red (i.e., expect \(\hat{p}\) to be about \(0.5\)). You wouldn't necessarily expect exactly half red and half black, but you'd probably expect something close to that. That is, you would expect that \(\hat{p}\) would be close to \(0.5\).
  3. You then observed that all \(15\) cards were red. That is, \(\hat{p} = 1\).
  4. You were expecting \(\hat{p} = 0.5\) approximately, but instead observed \(\hat{p} = 1\). Since what you observed ('all red cards') was not at all like what you were expecting ('about half red cards'), the sample contradicts what you were expecting if the pack was fair. This suggests your assumption of a fair pack is probably wrong.

Of course, getting \(15\) red cards in a row is possible; it's just very unlikely. For this reason, based on the evidence, you would probably conclude that the pack is not fair.
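Just how unlikely is it? As an illustration (not part of the original text), the chance of dealing \(15\) red cards out of \(15\) from a standard, well-shuffled pack (\(52\) cards, \(26\) of them red, dealt without replacement) can be computed directly; a minimal Python sketch:

```python
from math import comb

# A standard, well-shuffled pack: 52 cards, 26 of them red
total_cards, red_cards, hand_size = 52, 26, 15

# Probability that all 15 dealt cards are red: the number of all-red
# hands divided by the total number of possible 15-card hands
p_all_red = comb(red_cards, hand_size) / comb(total_cards, hand_size)
print(p_all_red)   # roughly 0.0000017: about 2 chances in a million
```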

You probably didn't consciously go through this process, but it seems reasonable. This process of decision making is similar to the process used in research.

19.3 How decisions are made

Based on the ideas in the last section, a formal process of decision making in research can be described.

  1. Make an assumption. Make an assumption about the value of the parameter. By doing so, we are assuming that sampling variation explains any discrepancy between the value of the observed statistic and this assumed value of the parameter.
  2. Define the expectations of the statistic. Based on the assumption made about the parameter, describe what values of the statistic might reasonably be observed from all the possible samples from the population (due to sampling variation).
  3. Make the observations. Take a sample (one of the many possible samples), and compute the observed sample statistic from these data.
  4. Make a decision. If the observed sample statistic:
    • is unlikely to have been observed by chance, the statistic (i.e., the evidence) contradicts the assumption about the parameter: the assumption is probably (but not certainly) wrong.
    • could easily be explained by chance, the statistic (i.e., the evidence) is consistent with the assumption about the parameter, and the assumption may be (but is not certainly) correct.

This is one way to describe the process of decision making in science (Fig. 19.2).

FIGURE 19.1: A way to make decisions.

FIGURE 19.2: A way to make decisions.

This approach is similar to how we unconsciously make decisions every day. For example, suppose I ask my son to brush his teeth (Budgett et al. 2013). Later, I wish to decide if he really did.

  1. Assumption: I assume my son brushed his teeth (because I asked him to).
  2. Expectation: Based on my assumption, I would expect to find his toothbrush is damp.
  3. Observation: When I check later, I observe a dry toothbrush.
  4. Decision: The evidence contradicts what I expected to find based on my assumption, so my assumption is probably false: he probably didn't brush his teeth.

I may have made the wrong decision: he may have brushed his teeth, but dried his toothbrush with a hair dryer. However, based on the evidence, it is more likely that he has not brushed his teeth.

The situation may have ended differently: when I check later, suppose I observe a damp toothbrush. Then, the evidence seems consistent with what I expected if he had brushed his teeth (my assumption), so my assumption is probably true. He probably did brush his teeth. Again, I may be wrong: he may have run his toothbrush under a tap. Nonetheless, I don't have any evidence that he didn't brush his teeth.

Similar logic underlies most decision making in science.

Example 19.1 (The decision-making process) Consider the cards example from Sect. 19.2 again. The formal process might look like this:

  1. Assumption: Assume the pack is a fair, well-shuffled pack of cards: the population proportion of red cards is \(p = 0.5\) (the value of the parameter).
  2. Expectation: Based on this assumption, roughly (but not necessarily exactly) equal proportions of red and black cards would be expected in a sample of \(15\) cards. The sample proportion of red cards \(\hat{p}\) (the value of the statistic) is expected to be close to, but maybe not exactly, \(0.5\).
  3. Observation: Suppose I then deal \(15\) cards, and all \(15\) are red cards: then \(\hat{p} = 1\).
  4. Decision: \(15\) red cards from \(15\) cards seems unlikely to occur if the pack is fair and well-shuffled. The data seem inconsistent with what I was expecting based on the assumption (Fig. 19.3), suggesting the assumption is probably false.

Of course, getting \(15\) red cards out of \(15\) is not impossible, so my conclusion may be wrong. However, the evidence certainly suggests the pack is not fair.

FIGURE 19.3: A way to make decisions for the cards example.

19.4 Making decisions in research

Let's think about each step in the decision-making process (Fig. 19.2) individually.

  • Making an assumption about the parameter (Sect. 19.4.1);
  • Describing the expectations of the statistic (Sect. 19.4.2);
  • Making the sample observations (Sect. 19.4.3); and
  • Making a decision (Sect. 19.4.4).

19.4.1 Making an assumption about the parameter

The initial assumption is that sampling variation explains any discrepancy between the parameter and the statistic. This assumption about the parameter is called the null hypothesis, denoted \(H_0\). The null hypothesis is always the 'sampling variation' explanation; that is, there is 'no change, no difference, no relationship' in the population (depending on the context). Any change, difference or relationship seen in the sample is due only to sampling variation.

Using this idea, a reasonable assumption can be made about the parameter. For example, when comparing the mean of two groups, we would initially assume no difference between the population means: any difference between the sample means would be attributed to sampling variation.

Example 19.2 (Assumptions about the population) Many dental associations recommend brushing teeth for two minutes. I. D. M. Macgregor and Rugg-Gunn (1979) recorded the tooth-brushing time for \(85\) uninstructed schoolchildren from England to assess compliance with the guidelines.

We initially assume the population mean tooth-brushing time is two minutes (\(H_0\): \(\mu = 2\)), so that if the sample mean is not two minutes, the discrepancy is attributed to sampling variation. A sample can then be obtained to determine if the sample mean is consistent with, or contradicts, this assumption.

19.4.2 Describing the expectations of the statistic

Based on the assumed value for the parameter, we then determine what values to expect from the statistic from all the possible samples we could select (of which we only select one). Since many samples are possible, and every sample is likely to be different (sampling variation), the observed value of the statistic depends on which one of the countless possible samples we obtain.

Think about the cards example in Sect. 19.2. If the pack is fair, then half the cards in the population (the pack of cards) are red, so the population proportion is assumed to be \(p = 0.5\). In a sample of \(15\) cards, what values could reasonably be expected for the sample proportion \(\hat{p}\) of red cards (the statistic)? If samples of size \(15\) were repeatedly taken from a fair pack with \(p = 0.5\), the sample proportion of red cards would vary from hand to hand, of course. But how would \(\hat{p}\) vary from sample to sample? Perhaps \(15\) red cards out of \(15\) cards happens reasonably frequently, or perhaps not. How could we find out how often \(\hat{p} = 1\) occurs in a sample of \(n = 15\)? We could:

  • use mathematical theory.
  • shuffle a pack of cards and deal \(15\) cards many hundreds of times, then count how often we see \(15\) red cards out of \(15\) cards.
  • simulate (using a computer) dealing \(15\) cards many hundreds of times from a fair pack (i.e., \(p = 0.5\)), and count how often we get \(15\) red cards out of \(15\) cards (i.e., \(\hat{p} = 1\)).

The third option is the most practical. To begin, suppose we simulated only ten hands of \(n = 15\) cards each; the animation below shows the sample proportion of red cards from each of the ten repetitions. None of those hands produced \(15\) red cards in \(15\) cards.

Suppose we repeated this for hundreds of hands of \(15\) cards (rather than the ten above), and for each hand we recorded \(\hat{p}\), the sample proportion of cards that were red. The value of \(\hat{p}\) would vary from sample to sample (sampling variation), and we could record the value of \(\hat{p}\) from each of those hundreds of hands. A histogram of these hundreds of sample proportions could be constructed; the animation below shows a histogram of the sample proportions from \(1\,000\) repetitions of a hand of \(15\) cards. This histogram shows how we might expect the sample proportion \(\hat{p}\) to vary from sample to sample, when the population proportion of red cards is \(p = 0.5\). None of these \(1\,000\) simulations produced a value of \(\hat{p}\) close to \(1\).
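A simulation like the one just described (the third option above) is easy to carry out. The sketch below is for illustration only and is not part of the original text: written in Python, it deals \(1\,000\) hands of \(15\) cards without replacement from a fair pack, records the sample proportion of red cards in each hand, prints a crude text histogram of those proportions, and counts how often an all-red hand (\(\hat{p} = 1\)) appeared.

```python
import random
from collections import Counter

random.seed(1)  # so this illustration is reproducible

pack = ['red'] * 26 + ['black'] * 26     # a fair pack: 26 red, 26 black
n, repetitions = 15, 1000

sample_proportions = []
for _ in range(repetitions):
    hand = random.sample(pack, n)        # deal 15 cards without replacement
    sample_proportions.append(hand.count('red') / n)

# A crude text histogram of the simulated sampling distribution of p-hat
for p_hat, count in sorted(Counter(sample_proportions).items()):
    print(f"{p_hat:5.3f} | {'*' * (count // 5)}")

# How many of the simulated hands were entirely red (p-hat = 1)?
all_red = sum(p == 1 for p in sample_proportions)
print(all_red, "of", repetitions, "hands were all red")
```

The simulated proportions cluster around \(0.5\) in a roughly bell-shaped histogram, and an all-red hand almost certainly never appears.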

Interestingly, the shape of the distribution in the animation above is roughly bell-shaped. This is no coincidence. These bell-shaped distributions appear in many places, and are studied further in Chap. 21.

19.4.3 Making the sample observations

While many samples are possible, we only observe one of those countless possible samples. From our sample of \(15\) cards, all cards are red so the statistic is \(\hat{p} = 1\) (i.e., \(15\) red cards out of \(15\) cards). Assuming \(p = 0.5\), is this value of \(\hat{p}\) unusual, or not unusual?

19.4.4 Making a decision

Observing \(15\) red cards out of \(15\) cards is rare: it did not happen even once in the \(1\,000\) simulations (see the animation above), so observing \(\hat{p} = 1\) by chance from a fair pack is incredibly unlikely. Based on simulating \(1\,000\) hands, then, we could conclude that \(\hat{p} = 1\) would almost never occur... if the assumption of a fair pack were true.

But we did find \(\hat{p} = 1\) (i.e., \(15\) red cards in \(15\) cards)... so the assumption of a fair pack (i.e., \(p = 0.5\)) is probably wrong. That is, there seems to be compelling evidence that the pack of cards has been manipulated somehow.

What if we had observed \(11\) red cards in a hand of \(15\) cards, a sample proportion of \(\hat{p} = 11/15 = 0.733\)? The conclusion is not quite so obvious then: the animation above shows that \(\hat{p} = 0.733\) is unlikely... but \(\hat{p} = 0.733\) (and even larger values) did occur in the \(1\,000\) simulations.

What if \(9\) red cards were found in the \(15\) (i.e., \(\hat{p} = 0.6\))? The animation above shows that \(\hat{p} = 0.6\) could reasonably be observed, since many possible samples lead to \(\hat{p} = 0.6\), or even higher. This would not seem unusual at all; it is not compelling evidence to change our mind. Many of the possible samples produce values of \(\hat{p}\) near \(0.6\).
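These impressions can also be checked using mathematical theory (the first option listed in Sect. 19.4.2). The sketch below is an illustrative calculation, not part of the original text: for a fair pack of \(52\) cards dealt without replacement, it computes the probability of seeing at least as many red cards as in each of the three scenarios above.

```python
from math import comb

def prob_at_least(k_red, n=15, reds=26, total=52):
    """Probability of at least k_red red cards in a hand of n cards
    dealt without replacement from a fair pack (hypergeometric)."""
    return sum(comb(reds, k) * comb(total - reds, n - k)
               for k in range(k_red, n + 1)) / comb(total, n)

for k in (15, 11, 9):   # the three scenarios discussed above
    print(f"P(at least {k} red cards out of 15) = {prob_at_least(k):.4g}")
```

At least \(15\) red cards is vanishingly rare, at least \(11\) is unlikely but possible, and at least \(9\) is quite common, which matches the impressions gained from the simulation.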

Example 19.3 (Sampling variation) Many dental associations recommend brushing teeth for two minutes. I. D. M. Macgregor and Rugg-Gunn (1979) recorded the tooth-brushing time for \(85\) uninstructed schoolchildren from England to assess compliance with the guidelines.

Of course, every possible sample of \(85\) children will include different children, and so produce a different sample mean brushing time \(\bar{x}\). Even if the population mean tooth-brushing time really was two minutes (\(\mu = 2\)), the sample mean probably won't be exactly two minutes, because of sampling variation.

Assume the population mean tooth-brushing time is two minutes (\(\mu = 2\)). If this is true, we then could describe what values of the sample statistic \(\bar{x}\) to expect from all possible samples. Then, after obtaining a sample and computing the sample mean, we could determine if the sample mean seems consistent with the assumption of two minutes, or whether it seems to contradict this assumption.
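To make this concrete, the variation in \(\bar{x}\) from sample to sample could itself be simulated. The sketch below is purely illustrative and is not based on the study's data: it assumes brushing times are normally distributed with mean \(\mu = 2\) minutes and a hypothetical standard deviation of \(0.5\) minutes, and repeatedly draws samples of \(n = 85\) children to see how much the sample mean wanders around two minutes.

```python
import random

random.seed(1)  # so this illustration is reproducible

mu = 2.0        # assumed population mean brushing time (minutes)
sigma = 0.5     # hypothetical population standard deviation (minutes)
n, repetitions = 85, 1000

sample_means = []
for _ in range(repetitions):
    sample = [random.gauss(mu, sigma) for _ in range(n)]   # one possible sample
    sample_means.append(sum(sample) / n)                   # its sample mean

# The simulated sample means cluster around 2 minutes; how far do they stray?
print(min(sample_means), max(sample_means))
```

Under this (hypothetical) set-up, the simulated sample means all stay fairly close to two minutes, so a sample mean far from two minutes would contradict the assumption that \(\mu = 2\).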

19.5 Tools for describing sampling variation

Making decisions about parameters based on a statistic is difficult: only one of the many possible samples is observed. Since every sample is likely to be different, different values of the sample statistic are possible. In this chapter, though, a process for making decisions has been studied (Fig. 19.2). To apply this process to research, describing how sample statistics vary from sample to sample (sampling variation) is necessary. Those tools are discussed in the following chapters:

  • tools to describe how sample statistics vary from sample to sample (sampling variation), and hence what to expect from the sample statistic: Chap. 20.
  • tools to describe the distribution of the population and the sample: Chap. 21.

19.6 Chapter summary

Decisions are often made by making an assumption about the parameter, which leads to an expectation of what values of the statistic are reasonably possible. We can then make observations about our sample, and then make a decision about whether the sample data support or contradict the initial assumption.

19.7 Quick review questions

Are the following statements true or false?

  1. Parameters describe populations.
  2. Both \(\bar{x}\) and \(\mu\) are statistics.
  3. The value of a statistic is likely to be the same in every sample.
  4. Sampling variation describes how the value of a statistic varies from sample to sample.
  5. An initial assumption is made about the sample statistic.
  6. The variation in the values of the statistic from sample to sample is called sampling variation.
  7. If the sample results seem inconsistent with what was expected, then the assumption about the population is probably true.
  8. In the sample, we know exactly what to expect.
  9. Hypotheses are made about the population.

19.8 Exercises

Answers to odd-numbered exercises are available in App. E.

Exercise 19.1 Suppose you are playing a die-based game, and your opponent rolls a ten times in a row.

  1. Do you think there is a problem with the die?
  2. Explain how you came to this decision.

Exercise 19.2 In a 2012 advertisement, an Australian pizza company claimed that their \(12\)-inch pizzas were 'real \(12\)-inch pizzas' (P. K. Dunn 2012).

  1. What is a reasonable assumption to make to test this claim?
  2. The claim is based on a sample of \(125\) pizzas, for which the sample mean pizza diameter was \(\bar{x} = 11.48\) inches. (See Sect. 19.1.) What are the two reasons why the sample mean is not exactly \(12\) inches?
  3. Does the claim appear to be supported by, or contradicted by, the data? Explain.
  4. Would your conclusion change if the sample mean was \(\bar{x} = 11.25\) inches? Explain.
  5. Does your answer depend on the sample size? For example, is observing a sample mean of \(11.25\) inches from a sample of size \(n = 10\) equivalent to observing a sample mean of \(11.25\) inches from a sample of size \(n = 125\)? Explain.

Exercise 19.3 Since my wife and I have been married, I have been called to jury service four times. The latest notice reads: 'Your name has been selected at random from the electoral roll'.

In the same length of time, my wife has never been called to jury service. Do you think the selection process really is 'at random'? Explain.