Chapter 4 Populations and samples

When we collect data, we are recording some kind of observation or measurement. If we are working in a forest, for example, we might want to measure the heights of different trees, or measure the concentration of carbon in the soil. The idea might be to use these measurements to make some kind of inference about the forest. But as scientists, we are almost always limited in the amount of data that we can collect. We cannot measure everything, so we need to collect a sample of data and use it to make inferences about the population of interest. For example, while we probably cannot measure the height of every tree in a forest, nor can we measure the concentration of carbon at every possible location in the forest’s soil, we can collect a smaller number of measurements and still make useful conclusions about overall forest tree height and carbon concentration.

Statistics thereby allows us to approximate properties of entire populations from a limited number of samples. This needs to be done with caution, but before getting into the details of how, it is important to fully understand the difference between a population and a sample to avoid confusing these two concepts. A population is the entire set of possible observations that could be collected (Sokal & Rohlf, 1995). Some examples will make it easier to understand:

  • All of the birch trees in Scotland
  • All genes in the house mouse (Mus musculus)
  • All wrens in the United Kingdom

These populations might be important for a particular research question. For example, we might want to know something about the feeding behaviours of wrens in the UK. But there is no way that we can find and observe the behaviour of every single wren, so we need to take a subset of the population (sample) instead (Fowler et al., 1998). Examples of samples include the following:

  • 25 birch trees from a Scottish forest
  • 10 genes from the Mus musculus genome
  • 60 caught wrens in UK nest boxes

It is important to recognise that the word ‘population’ means something different in statistics than it does in biology. A biological population, for example, could be defined as all of the individuals of the same species in the same general location. A statistical population, in contrast, refers to a set of observations (i.e., things that we can measure). Sokal & Rohlf (1995) provide a more technical definition for ‘population’,

In statistics, population always means the totality of individual observations about which inferences are to be made, existing anywhere in the world or at least within a definitely specified sampling area limited in space and time [p. 9, emphasis theirs].

They define a sample to be ‘a collection of individual observations selected by a specified procedure’ (Sokal & Rohlf, 1995). For our purposes, it is not necessary to be able to recite the technical definitions, but it is important to understand the relationship between a population and a sample. When we collect data, we are almost always taking a small sample of observations from a much larger number of possible observations in a population (Figure 4.1).

Conceptual figure illustrating how a statistical population relates to a statistical sample. The population is represented by 35 smiling faces enclosed within a dashed box. The sample is represented by a solid box within the dashed box, within which there are 3 smiling faces. Hence, we have a sample of 3 measurements from the total population.

Figure 4.1: Conceptual figure illustrating how a statistical population relates to a statistical sample. The population is represented by 35 smiling faces enclosed within a dashed box. The sample is represented by a solid box within the dashed box, within which there are three smiling faces. Hence, we have a sample of three measurements from the total population.

References

Fowler, J., Cohen, L., & Jarvis, P. (1998). Practical Statistics for Field Biology (2nd ed., p. 259). John Wiley & Sons.
Sokal, R. R., & Rohlf, F. J. (1995). Biometry (3rd ed., p. 887). W. H. Freeman & Company, New York, USA.