## 3.8 Estimating sampling variability from a single sample: Bootstrapping

When making a statistical estimate, if we have a representative sample, the sample statistic is our best estimate for the population parameter. This is reasonable because (a) sample statistics from representative samples will, on average, match the population parameter, and (b) even if the sample statistic is not perfectly accurate, it does not systematically over- or under-estimate the population parameter.

But even if the sample statistic is not systematically wrong, that doesn’t mean that the sample statistic will exactly match the population parameter. There is still some uncertainty in the estimate. This uncertainty comes from sampling variability: a different sample could have given us a different result, and thus a different estimate. To account for this uncertainty, we need to ask: how much sampling variability is there? In other words, if we had taken a different sample, just how different would the results have been?

Let’s consider some ways that we might address this question.

1. First, if we had unlimited time and resources, we could just take a whole bunch of different samples from the population. We could find the percentage of climbers that use drugs in each sample, and then find the sampling distribution—the distribution of all the sample results. The SD of this distribution would be a good measure of sampling variability.

In practice, we don’t get to take a bunch of samples from the population. So what about…

2. If we had a hypothesized model of the population, we could use that to simulate multiple samples. This is what we do with a hypothesis test. The null hypothesis gives us a hypothesized model of the world, and we use Monte Carlo simulation to determine the sampling variability if that model were true.
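To make this concrete, here is a minimal sketch of simulating sampling variability under a hypothesized model. The 25% drug-use rate and sample size of 50 are made-up numbers for illustration, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical null model: suppose 25% of climbers use drugs.
null_p, n = 0.25, 50

# Simulate many samples of size n under this model; each simulated
# sample yields a sample percentage.
null_pcts = rng.binomial(n, null_p, size=10_000) / n * 100

# The SD of these simulated results measures sampling variability
# under the hypothesized model.
print(round(np.std(null_pcts), 1))
```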

For a statistical estimate, though, we don’t have a hypothesized model for the population. Here, we don’t have anything like a null hypothesis. We only have one sample, and somehow we need to use it to see how much variability there would be if we were to take multiple samples. So that leads to…

3. Bootstrapping! Bootstrapping is an ingenious method for modeling random sampling when you only have a sample.

### 3.8.1 Estimating sampling variability using bootstrapping

If we had access to the entire population, we could estimate sampling variability using Monte Carlo simulation. We would repeatedly draw random samples from the population and create a sampling distribution. We would use the SD of the sampling distribution to estimate the sampling variability.
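If we did have the whole population, the Monte Carlo approach would look something like the sketch below. The population here is simulated (a 30% drug-use rate is assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 1 = climber uses drugs, 0 = does not.
population = rng.binomial(1, 0.30, size=100_000)

# Repeatedly draw random samples of size n from the population and
# record each sample percentage -- this is the sampling distribution.
n = 50
sample_pcts = [
    rng.choice(population, size=n, replace=False).mean() * 100
    for _ in range(5_000)
]

# The SD of the sampling distribution estimates sampling variability.
print(round(np.std(sample_pcts), 1))
```

In practice this loop is exactly what we cannot run, because we never observe the full population.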

The problem is, we don’t have access to the entire population. We only have one sample. The key insight behind bootstrapping is that (assuming we have a representative sample) the sample is a good model for the population. Thus, we can model random sampling from the population by instead randomly sampling from the sample.

Now, there’s an obvious problem. Say we have a sample of size $$n = 50$$. To model random sampling, we need to draw samples of size $$n = 50$$. But if we repeatedly drew samples of size $$n = 50$$ from our sample, we would just get the same sample, over and over!
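A quick demonstration of the problem, using a simulated sample (the 30% rate is an assumption for illustration). Drawing all 50 values without replacement just reshuffles the same 50 values, so every "new" sample has an identical mean:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.binomial(1, 0.30, size=50)  # our one sample of n = 50

# Sampling 50 values WITHOUT replacement from a sample of 50 is just a
# permutation of the same values -- the mean cannot change.
redraw = rng.choice(sample, size=50, replace=False)
print(redraw.mean() == sample.mean())
```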

We can, however, use our sample to estimate the sampling variability by drawing repeated samples of size $$n = 50$$. Here’s how: when we draw our samples of size $$n = 50$$, we sample with replacement. In this way, we essentially create an infinite population, and thus we model random sampling from a larger population without actually needing the larger population. This nifty method is called bootstrapping. It was developed by Brad Efron in the late 1970s17. Efron’s big discovery was that by bootstrapping (sampling with replacement) from a random sample, one can come up with a good estimate of the sampling variability.
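The bootstrap procedure can be sketched in a few lines. The sample here is simulated for illustration (a 30% rate is assumed); in practice you would start from your actual data:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.binomial(1, 0.30, size=50)  # stand-in for our one real sample

# Bootstrap: resample WITH replacement, same size as the original sample,
# and record the percentage in each bootstrap sample.
boot_pcts = [
    rng.choice(sample, size=50, replace=True).mean() * 100
    for _ in range(10_000)
]

# The SD of the bootstrap distribution estimates sampling variability.
print(round(np.std(boot_pcts), 1))
```

Note that only the one observed sample is used; no access to the population is needed.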

- To model random sampling, we take repeated samples of size $$n$$ from our sample, with replacement.