15 Making decisions: an introduction

So far, you have learnt to ask an RQ, design a study, and describe and summarise the data. In this chapter, you will learn how decisions are made in science, so you can answer RQs. You will learn to:

  • explain the two broad reasons why differences are seen between sample statistics and population parameters.
  • explain how decisions are made in research.

15.1 Introduction

In Sect. 14.8, the NHANES data (Centers for Disease Control and Prevention (CDC) 1994) were numerically summarised. The sample mean direct HDL cholesterol concentration was different for smokers (\(\bar{x} = 1.31\) mmol/L) and for non-smokers (\(\bar{x} = 1.39\) mmol/L).

Importantly, we must realise that the sample studied is only one of countless possible samples that could have been chosen. If a different sample of people was chosen, a different value for the difference between the sample means would have been produced. And, of course, since countless samples are possible, countless values for the sample means (and for the difference between them) are possible.

This leads to one of the most important observations about sampling.

Studying a sample leads to the following observations:

  • Each sample is likely to be different.
  • Our sample is just one of countless possible samples from the population.
  • Each sample is likely to produce a different value for the sample statistic.
  • Hence we only observe one of the many possible values for the sample statistic.

Since many values for the sample statistic are possible, the possible values of the sample statistic vary (called sampling variation) and have a distribution (called a sampling distribution).

Definition 15.1 (Sampling variation) Sampling variation refers to how the sample estimates (statistics) vary from sample to sample, because each sample is different.
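
As a minimal sketch of this definition (using a hypothetical population, not the NHANES data; the mean of 1.35 mmol/L and standard deviation of 0.4 mmol/L below are assumed values for illustration), two samples can be repeatedly drawn from the same population: even though the population means are identical, the difference between the sample means is rarely exactly zero, and changes from repetition to repetition.

  # A minimal sketch of sampling variation (the population here is hypothetical,
  # not the NHANES data): both groups are drawn from the SAME population, so the
  # population means are identical, yet the sample means still differ by chance.
  import random

  random.seed(1)
  # Hypothetical population of HDL-like values (assumed mean 1.35, SD 0.4 mmol/L)
  population = [random.gauss(1.35, 0.4) for _ in range(100_000)]

  def sample_mean(n):
      # Mean of one randomly-chosen sample of size n from the population
      return sum(random.sample(population, n)) / n

  for i in range(5):
      diff = sample_mean(250) - sample_mean(250)
      print(f"Repetition {i + 1}: difference between sample means = {diff:+.3f} mmol/L")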

Since we only have one value of the sample statistic, out of the many values of the sample statistic that are possible, what does this difference between the sample means imply about the difference between the population means?

Two reasons could explain why the sample means are different:

  1. The population means are the same; the difference is due to sampling variation.
    That is, we just happen to have---by chance---one of those samples where the difference between the means is quite noticeable. The sample means are different only because we have data from one of the many possible samples, and every sample is likely to be different.
  2. Alternatively, the population means are different, and the sample means reflect this.

How do we decide which of these explanations is supported by the data?

Similarly, in Sect. 14.8 the odds of being diabetic were different for smokers (0.181) and non-smokers (0.084). What does this difference between the sample odds imply about the population odds?

Again, two possible reasons could explain why the sample odds are different:

  1. The population odds are the same; the difference is due to sampling variation. That is, we just happen to have---by chance---one of those samples where the difference between the odds is quite noticeable. The sample odds are different only because we have data from one of the many possible samples, and every sample is likely to be different.
  2. Alternatively, the odds are different in the population, and the sample odds reflect this.

In both situations (means; odds), the two possible explanations ('statistical hypotheses') have special names:

  1. There is no difference between the population parameters: the difference is simply due to sampling variation. This is the null hypothesis, or \(H_0\).
  2. There is a difference between the population parameters. This is the alternative hypothesis, or \(H_1\).
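
For the HDL example, these hypotheses could be written in symbols as follows, where \(\mu_S\) and \(\mu_N\) denote the population mean HDL concentration for smokers and non-smokers (the subscripts are simply labels introduced here for illustration):

  • \(H_0\): \(\mu_S = \mu_N\); any difference between the sample means is due to sampling variation.
  • \(H_1\): \(\mu_S \neq \mu_N\); the population means are different, and the sample means reflect this.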

How do we decide which of these explanations is supported by the data? What is the decision-making process?

One approach to the decision-making process begins by assuming the null hypothesis is true. Then the data are examined to see if sufficient information exists to support the alternative hypothesis. However, conclusions drawn about the population from the sample can never be certain, since the sample studied is just one of many possible samples that could have been taken.

15.2 The need for making decisions

In research, decisions need to be made about population parameters based on sample statistics. The difficulty is that the decision must be made using one of the many possible samples; every sample is likely to be different (comprising different individuals from the population), and so each sample will produce different summary statistics. This is called sampling variation.

Sampling variation refers to how much a sample estimate (a statistic) is likely to vary across all possible samples, because each sample is different.

However, sensible decisions can be made (and are made) about population parameters based on sample statistics. To do this though, the process of how decisions are made needs to be articulated, which is the purpose of this chapter.

To begin, consider the following scenario. Suppose I produce a standard pack of cards, and shuffle them well. The pack of cards can be considered a population.

A standard pack of cards has 52 cards, with four suits: spades and clubs (both black), and hearts and diamonds (both red). Each suit has 13 denominations: 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack (J), Queen (Q), King (K), Ace (A). The Ace, King, Queen and Jack are referred to as picture cards. (Most packs also contain two jokers, but these are not usually considered part of a standard pack.)

Suppose I draw a sample of 15 cards from the pack, and all are red cards. What should you conclude? How likely is it that this would happen simply by chance? See the animation below. Is this evidence that the pack of cards is somehow unfair, or rigged?

Getting 15 red cards out of 15 from a well-shuffled pack seems very unlikely, so you probably conclude that the pack is somehow unfair. But importantly, how did you reach that decision? Your unconscious decision-making process may have worked like this:

  1. You assumed, quite reasonably, that I used a standard, well-shuffled pack of cards, where half the cards are red and half the cards are black. That is, you assumed the population proportion is \(p = 0.5\).
  2. Based on that assumption, you expected about half the cards in the sample of 15 to be red, and about half to be black. You wouldn't necessarily expect exactly half red and half black, but you'd probably expect something close to that. That is, you would expect that \(\hat{p}\) would be close to 0.5.
  3. But what you observed was nothing like that: All 15 cards were red. That is, \(\hat{p} = 1\).
  4. You then made a decision: since what you observed ('all red cards') was not like what you were expecting ('about half red cards'), the 15 red cards contradict what you were expecting, based on your assumption of a fair pack... so your assumption of a fair pack is probably wrong.

Of course, getting 15 red cards in a row is possible... but very unlikely. For this reason, you would probably conclude that there is strong evidence that the pack is not a fair pack.
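
To put a number on 'very unlikely', the chance of this outcome can be computed under the fair-pack assumption. The sketch below (an illustrative addition, not part of the original text) counts the ways of choosing 15 red cards from the 26 red cards in the pack, out of all ways of dealing 15 cards from 52:

  # Chance of dealing 15 red cards out of 15 from a well-shuffled standard pack
  # (26 red and 26 black cards), dealing without replacement.
  from math import comb

  p_all_red = comb(26, 15) / comb(52, 15)
  print(f"P(all 15 cards are red) = {p_all_red:.7f}")  # about 0.0000017: under 2 in a million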

You probably didn't consciously go through this process, but it seems reasonable. This process of decision making is similar to the process used in research.

15.3 How decisions are made

Based on the ideas in the last section, a formal process of decision making in research can be described. To expand:

  1. Assumption: Make an assumption about the population parameter. Initially, assume that sampling variation explains any discrepancy between the observed sample and the assumed value of the population parameter. The initial assumption is that there has been 'no change, no difference, no relationship', depending on the context.

  2. Expectation: Based on the assumption about the parameter, describe what values of the sample statistic might reasonably be observed from all the possible samples that might be obtained (due to sampling variation).

  3. Observation: Observe the data from one of the many possible samples, and compute the observed sample statistic from this sample.

  4. Decision: If the observed sample statistic is:

    • unlikely to have happened by chance, it contradicts the assumption about the population parameter, and the assumption is probably wrong. The evidence suggests that the assumption is wrong (but it is not certainly wrong).
    • likely to have happened by chance, it is consistent with the assumption about the population parameter, and the assumption may be correct. No evidence exists to suggest the assumption is wrong (though it may be wrong).

This is one way to describe the formal process of decision making in science (Fig. 15.2).

FIGURE 15.1: A way to make decisions

FIGURE 15.2: A way to make decisions

This approach is similar to how we unconsciously make decisions every day. For example, suppose I ask my son to brush his teeth (Budgett et al. 2013), and later I want to decide if he really did.

  1. Assumption: I assume my son brushed his teeth (because I told him to).
  2. Expectation: Based on that assumption, I expect to find a damp toothbrush when I check.
  3. Observation: When I check later, I observe a dry toothbrush.
  4. Decision: The evidence contradicts what I expected to find based on my assumption, so my assumption is probably false. He probably didn't brush his teeth.

I may have made the wrong decision: He may have brushed his teeth, but dried his brush with a hair dryer. However, based on the evidence, quite probably he has not brushed his teeth.

The situation may have ended differently: When I check later, I observe a damp toothbrush. In this case, the evidence seems consistent with what I expected to find based on my assumption, so my assumption is probably true. He probably did brush his teeth.

Again, I may be wrong: He may have just run his toothbrush under a tap. I don't have any evidence that he didn't brush his teeth, though.

Similar logic underlies most decision making in science.

Example 15.1 (The decision-making process) Consider the cards example from Sect. 15.2 again. The formal process might look like this:

  1. Assumption: Assume the pack is a fair, well-shuffled pack of cards: the population proportion of red cards is \(p = 0.5\) (the value of the parameter).

  2. Expectation: Based on this assumption, roughly (but not necessarily exactly) equal numbers of red and black cards would be expected in a sample of 15 cards. The sample proportion of red cards (the value of the statistic) is expected to be close to, but maybe not exactly, 0.5.

  3. Observation: Suppose I then deal 15 cards, and all 15 are red cards: \(\hat{p} = 1\).

  4. Decision: 15 red cards from 15 cards seems unlikely to occur if the pack is fair and well-shuffled. The data seem inconsistent with what I was expecting based on the assumption (Fig. 15.3). The evidence suggests that the assumption is probably false.

Of course, getting 15 red cards out of 15 is not impossible, so I may be wrong... but it is very unlikely. Based on the evidence, concluding that a problem exists with the pack of cards seems reasonable.

FIGURE 15.3: A way to make decisions for the cards example

15.4 Making decisions in research

Let's think about each step in the decision-making process (Fig. 15.2) individually.

  • The assumption about the parameter (Sect. 15.4.1);
  • The expectation of the statistic (Sect. 15.4.2);
  • The observations (Sect. 15.4.3); and
  • The decision (Sect. 15.4.4).

15.4.1 Assumption about the population parameter

The initial assumption is that there has been 'no change, no difference, no relationship', depending on the context. Using this idea, a reasonable assumption can be made about the population parameter:

  • We might assume that no difference exists between the parameter for two groups in the population, since we don't have any evidence yet to say there is a difference. For example, we might assume that the mean HDL cholesterol is the same for current smokers and non-smokers in the population, for the NHANES data. If we already knew there was a difference, why would we be performing a study to see if there is a difference?

  • We might be interested in testing a claim, or evaluating a benchmark, about a population parameter, to determine if the evidence supports this claim or benchmark.

These assumptions about the population parameter are called null hypotheses.

Example 15.2 (Assumptions about the population) Many dental associations recommend brushing teeth for two minutes. One study (I. D. M. Macgregor and Rugg-Gunn 1979) recorded the tooth-brushing time for 85 uninstructed schoolchildren (11 to 13 years old) from England.

We could assume the population mean tooth-brushing time in the population (11 to 13 year-old children from England) is two minutes, as recommended. After all, we don't have evidence to suggest any other value for the mean. A sample can then be obtained to determine if the sample mean is consistent with, or contradicts, this assumption.

15.4.2 Expectations of sample statistics

Having assumed a value for the population parameter, the second step is to determine what values to expect from the sample statistic, based on this assumption.

Since many samples are possible, and every sample is likely to be different ('sampling variation'), the value of the sample statistic depends on which one of the possible samples we obtain: the sample statistic is likely to be different for every sample.

Think about the cards in Sect. 15.2. Assuming a fair pack, half the cards are red in the population (the pack of cards), so the population proportion is assumed to be \(p = 0.5\). In a sample of 15 cards, what values could be reasonably expected for the sample proportion \(\hat{p}\) of red cards (the statistic)? If samples of size 15 were repeatedly taken, the sample proportion of red cards would vary from hand to hand, of course.

How would \(\hat{p}\) vary from sample to sample? Perhaps 15 red cards out of 15 cards happens reasonably frequently... or perhaps it doesn't. How could we find out? We could:

  • use mathematical theory.
  • shuffle a pack of cards and deal 15 cards many hundreds of times, then count how often we see 15 red cards out of 15 cards.
  • simulate (using a computer) dealing 15 cards many hundreds of times, and count how often we get 15 red cards out of 15 cards.

The third option is the most practical... To begin, suppose we simulated only ten hands of 15 cards each; the animation below shows the sample proportion of red cards from each of the ten repetitions. Not one of those ten hands produced 15 red cards in 15 cards.

Suppose we repeated this for hundreds of hands of 15 cards (rather than the ten above), and for each hand we recorded the sample proportion of cards that were red. The proportion of red cards would vary from sample to sample ('sampling variation'), and we could record the proportion of red cards from each of those hundreds of hands. A histogram of these hundreds of sample proportions could be constructed; the animation below shows a histogram of the sample proportions from 1000 repetitions of a hand of 15 cards. This histogram shows how we might expect the sample proportions \(\hat{p}\) to vary from sample to sample, when the population proportion of red cards is \(p = 0.5\).
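
A simulation along those lines can be written in a few lines. The sketch below (an illustrative addition; the exact counts depend on the random seed) deals 1000 hands of 15 cards and tallies the sample proportion of red cards in each hand, which is a text version of that histogram:

  # Simulate dealing 1000 hands of 15 cards from a well-shuffled pack,
  # recording the sample proportion of red cards in each hand.
  import random

  random.seed(15)
  deck = ["red"] * 26 + ["black"] * 26      # a standard pack: 26 red, 26 black

  phats = []
  for _ in range(1000):
      hand = random.sample(deck, 15)        # deal 15 cards without replacement
      phats.append(hand.count("red") / 15)

  # Tally how often each possible sample proportion occurred
  for k in range(16):
      count = sum(1 for p in phats if round(p * 15) == k)
      print(f"{k:2d} red cards (p-hat = {k / 15:.2f}): {count:4d} of 1000 hands")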

15.4.3 Observations about our sample

We then take a sample (one of the many samples that are possible), and observe the sample statistic. In this situation, we observe 15 red cards out of 15 cards.

15.4.4 Making a decision

Using the sample data, we make a decision.

We note that observing 15 red cards out of 15 cards is quite rare: it never happened once in the 1000 simulations. So based on simulating one thousand hands, we could conclude that we would almost never find 15 red cards in 15 cards... if the assumption of a fair pack was true. But we did find 15 red cards in 15 cards... so the assumption ('a fair pack') is probably wrong.

What if we had observed 4 red cards in a hand of 15 cards (a sample proportion of \(\hat{p} = 4/15 = 0.267\)), rather than 15 red cards out of 15? The conclusion is not quite so obvious then: these values of \(\hat{p}\) are uncommon, but they certainly do happen when \(p = 0.5\). In these situations, a more sophisticated approach for making a decision is needed.
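
One way to quantify 'uncommon' is to compute the exact chance of four or fewer red cards under the fair-pack assumption, rather than relying on the simulation. Again, this sketch is an illustrative addition:

  # Exact chance of 4 or fewer red cards in a hand of 15, dealt without replacement
  # from a fair pack with 26 red and 26 black cards (a hypergeometric calculation).
  from math import comb

  total = comb(52, 15)
  p_at_most_4 = sum(comb(26, k) * comb(26, 15 - k) for k in range(5)) / total
  print(f"P(4 or fewer red cards out of 15) = {p_at_most_4:.4f}")  # a few per cent: uncommon, but not impossible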

Special tools are needed to describe what to expect from the sample statistic after making assumptions about the population parameter. These special tools are discussed in the next chapters.

Example 15.3 (Sampling variation) Many dental associations recommend brushing teeth for two minutes. One study (I. D. M. Macgregor and Rugg-Gunn 1979) recorded the tooth-brushing time for 85 uninstructed schoolchildren (11 to 13 years old) from England.

Of course, every possible sample of 85 children will include different children, and so produce a different sample mean \(\bar{x}\). Even if the population mean toothbrushing time really is two minutes (\(\mu = 2\)), the sample mean probably won't be exactly two minutes, because of sampling variation.

We could assume the population mean tooth-brushing time is two minutes (\(\mu = 2\)). If this assumption is true, we then could describe what values of the sample statistic \(\bar{x}\) to expect from all possible samples. Then, after obtaining a sample and computing the sample mean, we could determine if the sample mean seems consistent with the assumption of two minutes, or whether it seems to contradict this assumption.
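
A rough sketch of this idea is below; the standard deviation of 0.75 min used to generate the data is an assumed value for illustration only (the study's actual variability is not used here):

  # Sketch: how the sample mean brushing time might vary across samples of 85 children
  # if the population mean really is 2 minutes (the SD of 0.75 min is an assumed value).
  import random

  random.seed(85)
  means = []
  for _ in range(1000):
      sample = [random.gauss(2.0, 0.75) for _ in range(85)]
      means.append(sum(sample) / len(sample))

  print(f"Smallest simulated sample mean: {min(means):.2f} min")
  print(f"Largest  simulated sample mean: {max(means):.2f} min")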

15.5 Tools for describing sampling variation

Making decisions about population parameters based on a sample statistic is difficult: Only one of the many possible samples is selected, and every sample is likely to be different, and can produce a different value of the sample statistic. In this chapter, though, a process for making decisions has been studied (Fig. 15.2).

To apply this process to research, we need tools to describe how sample statistics vary from sample to sample (sampling variation). Some of those tools are discussed in the following chapters:

  • Tools to describe the distribution of the population and the sample: Chap. 17.
  • Tools to describe how sample statistics vary from sample to sample (sampling variation), and hence what to expect from the sample statistic: Chap. 18.
  • Tools to describe the random nature of what happens with sample statistics, and so determine if the sample statistic is consistent with the assumption: Chap. 16.

15.6 Summary

Decisions are often made by making an assumption about the population parameter, which leads to an expectation of what might occur in the sample statistics. We can then make observations about our sample, and then make a decision about whether the sample data support or contradict the initial assumption.

15.7 Quick review questions

  1. True or false: Parameters describe populations.
  2. True or false: Both \(\bar{x}\) and \(\mu\) are statistics.
  3. True or false: The value of a statistic is likely to be different in every sample.
  4. True or false: Sampling variation describes how the value of a statistic varies from sample to sample.
  5. True or false: The initial assumption is made about the sample statistic.
  6. True or false: The variation in statistics from sample to sample is called sampling variation.
  7. True or false: If the sample results seem inconsistent with what was expected, then the assumption about the population is probably true.
  8. True or false: In the sample, we know exactly what to expect.
  9. True or false: Hypotheses are made about the population.

15.8 Exercises

Selected answers are available in Sect. D.15.

Exercise 15.1 Suppose you are playing a die-based game, and your opponent rolls the same number ten times in a row.

  1. Do you think there is a problem with the die?
  2. Explain how you came to this decision.

Exercise 15.2 In a 2012 advertisement, an Australian pizza company claimed that their 12-inch pizzas were 'real 12-inch pizzas' (P. K. Dunn 2012).

  1. What is a reasonable assumption to make to test this claim?
  2. The claim is based on a sample of 125 pizzas, for which the sample mean pizza diameter was \(\bar{x} = 11.48\) inches. What are the two reasons why the sample mean is not 12 inches?
  3. Does the claim appear to be supported by, or contradicted by, the data? Why?
  4. Would your conclusion change if the sample mean was \(\bar{x} = 11.25\) inches, rather than 11.48 inches? Does the claim appear to be supported by, or contradicted by, the data? Why?
  5. Does your answer depend on the sample size? For example, is observing a sample mean of 11.25 inches from a sample of size 10 equivalent to observing a sample mean of 11.25 inches from a sample of size 125?