3.7 Statistical estimation

Imagine you are cooking a pot of soup and you want to determine how salty the soup is. You would probably stir the soup, try a sample, and infer that the whole pot of soup is about as salty as your sample.

This is an example of statistical estimation: making an inference about the value of a population parameter based on a sample.

Statisticians, scientists, and pollsters are all interested in statistical estimation. For example, consider a pollster who wants to estimate the approval rating of the president among all adults in the U.S. It is not feasible to ask every single adult in the U.S. whether they approve of the president. Instead, the pollster will ask a sample of adults in the U.S. and use that sample to estimate the president’s approval rating in the population.

Vocab

We use the term population parameter to refer to the value that we want to estimate in the population. For example, the saltiness of the whole pot of soup, or the president’s approval rating for all adults in the U.S.
We use the term sample statistic to refer to the observed result in the sample. For example, the saltiness of the soup that we observe in the sample, or the president’s approval rating that we observe in the sample.

3.7.1 Making a statistical estimate

In the soup example, we estimated that the whole pot would be about as salty as the sample. In formal terms, we used the sample statistic (the saltiness in the sample) to estimate the population parameter (the saltiness of the whole pot). Why is this reasonable?

When we stirred the soup, we ensured that we were taking a representative sample. The saltiness of a typical sample taken this way will, on average, match the saltiness of the pot. And, even if the saltiness of the sample does not perfectly match the pot, it is not systematically biased in terms of saltiness (it is not systematically too salty or not salty enough). This means that you would be justified in inferring that the pot is about as salty as your sample.
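To see why stirring matters, here is a minimal simulation sketch in Python. Everything in it is invented for illustration: the pot of 1,000 spoonfuls, the assumption that salt in an unstirred pot settles so that saltiness rises steadily from top to bottom, and the function names. Stirring is modeled simply as making every spoonful equally likely to be tasted.

```python
import random

# Hypothetical pot: 1,000 spoonfuls, listed from the top of the pot to the bottom.
# Assumption for illustration only: the salt has settled, so saltiness
# (grams of salt per spoonful) rises steadily from 0.0 at the top to 2.0 at the bottom.
N_SPOONFULS = 1000
pot = [2.0 * i / (N_SPOONFULS - 1) for i in range(N_SPOONFULS)]

# Population parameter: the average saltiness of the whole pot (about 1.0).
pot_saltiness = sum(pot) / N_SPOONFULS


def taste_from_top(pot, n_spoonfuls=10):
    """Unrepresentative sample: only the top spoonfuls of an unstirred pot."""
    sample = pot[:n_spoonfuls]
    return sum(sample) / n_spoonfuls


def taste_after_stirring(pot, rng, n_spoonfuls=10):
    """Representative sample: every spoonful is equally likely to be tasted."""
    sample = rng.sample(pot, n_spoonfuls)
    return sum(sample) / n_spoonfuls


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Taste the stirred pot many times and average the results.
stirred_tastes = [taste_after_stirring(pot, rng) for _ in range(2000)]
avg_stirred = sum(stirred_tastes) / len(stirred_tastes)

top_taste = taste_from_top(pot)

print(f"Saltiness of the whole pot:       {pot_saltiness:.2f}")
print(f"Average taste after stirring:     {avg_stirred:.2f}")
print(f"Taste from the top, no stirring:  {top_taste:.2f}")
```

Under these assumptions, individual tastes from the stirred pot vary, but they are not systematically off in either direction. The taste from the top of the unstirred pot is systematically not salty enough, no matter how many times it is repeated.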

The same principle applies whenever we make an estimate about a population using a sample. If we have a representative sample, the sample statistic is our best estimate for the population parameter. Again, this is reasonable because (a) sample statistics from representative samples will, on average, match the population parameter, and (b) even if the sample statistic is not perfectly accurate, it does not systematically over- or under-estimate the population parameter.
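Claim (a) can be checked with a small simulation. The Python sketch below is only an illustration under made-up assumptions: a population of 100,000 adults in which exactly 42% approve, simple random samples of 500 adults, and function names chosen here for readability. None of these numbers come from a real poll.

```python
import random

# Hypothetical population (numbers invented for illustration):
# 100,000 adults, of whom 42,000 approve (1) and 58,000 do not (0).
POPULATION_SIZE = 100_000
N_APPROVE = 42_000
population = [1] * N_APPROVE + [0] * (POPULATION_SIZE - N_APPROVE)

# Population parameter: the approval rating among all adults.
true_approval = N_APPROVE / POPULATION_SIZE


def sample_approval(rng, sample_size=500):
    """Return the sample statistic: the approval rate in one random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Repeat the poll many times and average the sample statistics.
sample_stats = [sample_approval(rng) for _ in range(1000)]
average_stat = sum(sample_stats) / len(sample_stats)

print(f"Population parameter:         {true_approval:.3f}")
print(f"Average of sample statistics: {average_stat:.3f}")
```

Individual samples land above or below the population parameter, but averaged over many representative samples the sample statistic sits very close to it.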

Of course, even if the sample statistic is our best estimate for the population parameter, that does not mean it is perfect. It is just the least wrong, on average.

Thus, even if we make our best estimate, we have to recognize that there is still some uncertainty in the estimate. This uncertainty is due to sampling variability: if we had taken a different sample, we might have gotten a different result. The next sections discuss how statisticians account for uncertainty in a statistical estimate due to sampling variability. For now, the key points are:

Key points: Statistical estimation

If we have a representative sample, the sample statistic is our best estimate for the population parameter.
“Best estimate” does not mean “perfectly accurate.” There is still some uncertainty in the estimate due to sampling variability.
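Sampling variability can be seen directly by running the same poll several times. The Python sketch below uses invented numbers (a population of 100,000 adults with 42% approval, and samples of 500 adults) and a hypothetical `poll` function. It is an illustration, not a description of how any real poll is conducted.

```python
import random

# Hypothetical population (numbers invented for illustration):
# 100,000 adults, of whom 42,000 approve (1) and 58,000 do not (0).
POPULATION_SIZE = 100_000
N_APPROVE = 42_000
population = [1] * N_APPROVE + [0] * (POPULATION_SIZE - N_APPROVE)
true_approval = N_APPROVE / POPULATION_SIZE


def poll(rng, sample_size=500):
    """Run one poll: draw a random sample and return its approval rate."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Five polls of the same population, each using a different random sample.
poll_results = [poll(rng) for _ in range(5)]
for i, result in enumerate(poll_results, start=1):
    print(f"Poll {i}: {result:.1%} approve (population: {true_approval:.1%})")

# How far apart the highest and lowest poll results are.
spread = max(poll_results) - min(poll_results)
print(f"Spread between highest and lowest poll: {spread:.1%}")
```

Each poll is a reasonable best estimate, yet the polls disagree with one another and with the population parameter by small amounts. That disagreement is the uncertainty due to sampling variability.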