15.2 The need for making decisions

In research, decisions need to be made about population parameters based on sample statistics. The difficulty is that every sample is likely to be different (comprising different individuals from the population), so each sample is likely to produce different summary statistics. This is called sampling variation.

Sampling variation refers to how much a sample estimate (a statistic) is likely to vary from sample to sample, because each sample is different.
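Sampling variation is easy to see in a small simulation. The sketch below is only an illustration: the population is made up (1000 simulated values, not real data), and the sample size of 20 and the number of samples are arbitrary choices. Each sample gives a different sample mean, even though every sample comes from the same population.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1000 made-up measurements (not real data)
population = [rng.gauss(50, 10) for _ in range(1000)]
population_mean = statistics.mean(population)  # the parameter

# Five different samples of 20 individuals; each gives its own statistic
sample_means = [statistics.mean(rng.sample(population, 20)) for _ in range(5)]

print(f"Population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean:   {m:.2f}")
```

The population mean is a single fixed value, while the five sample means differ from it and from each other.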

However, sensible decisions can be (and are) made about population parameters based on sample statistics. For example, to determine if a pot of soup is ready to serve, we don’t have to consume the whole pot of soup (the ‘population’); a sensible decision can be made from a small taste (the ‘sample’). Likewise, in research, sensible decisions about the population parameter can be made from the sample statistic.

To do this though, the process of how decisions are made needs to be articulated. In this chapter, the logic of making decisions is discussed.

To begin, consider the following scenario. Suppose I produce a standard pack of cards, and shuffle them well. The pack of cards can be considered a population.

A standard pack of cards has 52 cards, with four suits: spades and clubs (which are both black), and hearts and diamonds (which are both red). Each suit has 13 denominations: 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack (J), Queen (Q), King (K), Ace (A). Most packs also contain two jokers, but these special cards are not usually considered part of a standard pack.

Suppose I draw a sample of 15 cards from the pack, and notice that all are red cards. How likely is it that this would happen simply by chance? See the animation below. Is that evidence that the pack of cards is somehow unfair, or rigged?

Getting 15 red cards out of 15 seems very unlikely, so perhaps you may conclude that the pack is unfair in some way. But importantly, how did you reach that decision? Your unconscious decision-making process may have worked like this:

1. You assumed, quite reasonably, that this is a standard, well-shuffled pack of cards, so that half the cards are red and half the cards are black.
2. Based on that assumption then, you, quite reasonably, expected about half the cards in the sample of 15 to be red, and about half to be black. You wouldn’t necessarily expect to see exactly half red and half black, but you’d probably expect something close to that.
3. But what you observed was nothing like that: All 15 cards were red. Since what you observed (‘all red cards’) was not like what you were expecting (‘about half red cards’), the 15 cards in my hand contradict what you were expecting, based on your assumption of a fair pack… so your assumption of a fair pack is probably wrong.
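The expectation of 'about half red' can be checked with a simulation. The sketch below assumes a fair, well-shuffled pack (26 red and 26 black cards), repeatedly draws 15 cards without replacement, and tallies how many red cards each sample contains. The function name `simulate_red_counts`, the 10,000 repetitions and the seed are illustrative choices only.

```python
import random
from collections import Counter

def simulate_red_counts(n_drawn=15, n_samples=10_000, seed=1):
    """Tally the number of red cards in repeated samples from a fair pack."""
    rng = random.Random(seed)
    pack = ["red"] * 26 + ["black"] * 26  # a fair pack: half red, half black
    counts = Counter()
    for _ in range(n_samples):
        hand = rng.sample(pack, n_drawn)  # draw without replacement
        counts[hand.count("red")] += 1
    return counts

counts = simulate_red_counts()
for n_red in range(16):
    print(f"{n_red:2d} red: {counts[n_red]:5d} samples")
```

Under the fair-pack assumption, most samples should land near 7 or 8 red cards, with samples at the extremes (close to 0 or 15 red cards) being rare. This is the sense in which 'all red' is not like what a fair pack leads us to expect.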

Of course, getting 15 red cards in a row is possible… but very unlikely⁵. For this reason, we would probably conclude that the most likely explanation is that the pack is not a fair pack.
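How unlikely can be computed exactly rather than guessed. The sketch below assumes the 15 cards are drawn without replacement from a fair 52-card pack with 26 red cards, and counts hands: the number of all-red hands divided by the number of possible hands. The function name `prob_all_red` is just for illustration.

```python
from math import comb

def prob_all_red(n_drawn=15, n_red=26, n_cards=52):
    """Probability that every card drawn (without replacement) is red."""
    # (number of all-red hands) / (number of possible hands)
    return comb(n_red, n_drawn) / comb(n_cards, n_drawn)

p_red = prob_all_red()
# 'All red' and 'all black' are equally likely and cannot both happen
p_same_colour = 2 * p_red

print(f"P(all 15 red)         = {p_red:.3e}")
print(f"P(all 15 same colour) = {p_same_colour:.3e}")
```

Changing `n_drawn` shows how quickly the probability shrinks as the all-red sample gets larger.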

You probably didn’t consciously go through this process, but it does seem reasonable. This process of decision making is similar to the process used in research.

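⁵ In fact, the probability of getting 15 cards of the same colour (either all red or all black) when drawing from a fair pack is about 0.000345%; the probability of getting 15 red cards specifically is about 0.000172% (about 1 in 580,000).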