3.9 Accounting for uncertainty in a statistical estimate: Compatibility intervals and margin of error

When statisticians report statistical estimates, they typically provide the value of the estimate along with the quantification of uncertainty in this estimate. This uncertainty measurement gives us an indication of the “precision” of the estimate. For example, consider the following poll results reported in the New York Times on June 30, 2011:

As the housing market slumped over the last few years with a speed and magnitude not seen since the Great Depression, aspects of homeownership have been debated as never before. There are tough questions about the role the government should take…includ[ing] how much of a down payment lenders should demand. Whether buyers need to come up with a 20 percent down payment—the standard for decades, but beyond the reach of many families now—is hotly debated. Fifty-eight percent of respondents say lenders should require this, while 36 percent say they should not. The nationwide telephone poll was conducted June 24–28 with 979 adults and has a margin of sampling error of plus or minus three percentage points for all adults.

In this article, the percentage of all adults in the United States who believe that lenders should require a 20% down payment on a house was reported as:

  • Estimate: 58%
  • Margin of error: 3 percentage points

The margin of error quantifies the uncertainty in the estimate. It tells us the amount of “give or take” around the sample estimate that is reasonable. We can read this as saying: “Based on our sample, we estimate that 58% of adults in the U.S. believe lenders should require a 20% down payment. Because this estimate is based on a sample, there is some uncertainty in the estimate. Any estimate within 3 percentage points of 58% is compatible with our sample.”

In other words, the margin of error defines a compatibility interval: An interval of estimates for the population parameter, such that any estimate in the interval is compatible with the sample:

\[ \mathrm{Compatibility~Interval} = \mathrm{Sample~Estimate} \pm \mathrm{Margin~of~Error} \]

In this example,

\[ \mathrm{Compatibility~Interval} = 58\% \pm 3\% = 55\%~\mathrm{to}~61\% \]

Statisticians refer to this as a compatibility interval because it gives an interval of plausible values for the percentage of all adults in the United States who believe that lenders should require a 20% down payment on a house that are compatible with the observed data. Based on the observed data, the best estimate for the “truth” (the population parameter) is that 58% of all adults in the United States believe that lenders should require a 20% down payment on a house. However, because of the uncertainty associated with random sampling, we are more comfortable giving a range of plausible estimates: Based on our sample, we estimate that between 55% and 61% of all adults in the United States believe that lenders should require a 20% down payment on a house.

3.9.1 Calculating the margin of error and compatibility interval

The uncertainty in an estimate is due to sampling variability. We quantify the uncertainty as follows:

\[ \mathrm{Margin~of~Error} = 2 \times \mathrm{SD~of~sampling~distribution} \]

And therefore a compatibility interval can be calculated using:

\[ \mathrm{Compatibility~interval} = \mathrm{sample~estimate} \pm 2 \times \mathrm{SD~of~sampling~distribution} \]

Using 2 SDs of the sampling distribution gives us an interval that captures the true population parameter about 95% of the time, provided the sampling distribution is roughly bell-shaped. We say that we are “95% confident” that the compatibility interval captures the true population parameter. For this reason, the compatibility interval is sometimes called a 95% confidence interval.
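The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is reconstructed from the rounded figures in the article (979 respondents, about 58% answering "yes"), the SD of the sampling distribution is approximated by bootstrapping that sample, and the function name, number of trials, and seed are hypothetical choices rather than anything prescribed by the text.

```python
import random
import statistics


def bootstrap_margin_of_error(sample, n_trials=1000, seed=1):
    """Return 2 x SD of the bootstrap distribution of the sample proportion."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    boot_proportions = []
    for _ in range(n_trials):
        # Resample n responses from the observed sample, with replacement.
        resample = rng.choices(sample, k=n)
        boot_proportions.append(sum(resample) / n)
    # The SD of the bootstrap proportions approximates the SD of the
    # sampling distribution.
    return 2 * statistics.stdev(boot_proportions)


# Assumed reconstruction of the poll: 979 respondents, about 58% "yes".
n_respondents = 979
n_yes = round(0.58 * n_respondents)
sample = [1] * n_yes + [0] * (n_respondents - n_yes)

estimate = sum(sample) / n_respondents
margin_of_error = bootstrap_margin_of_error(sample)
lower = estimate - margin_of_error
upper = estimate + margin_of_error

print(f"Estimate: {estimate:.1%}")
print(f"Margin of error: {margin_of_error:.1%}")
print(f"Compatibility interval: {lower:.1%} to {upper:.1%}")
```

With these assumed inputs, the margin of error should come out close to the 3 percentage points reported in the article, although the exact value changes slightly with the seed and the number of bootstrap trials.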

While statisticians and polling organizations tend to use two SDs to compute the margin of error, this is a somewhat arbitrary choice. Some researchers choose one or three SDs of the sampling distribution.
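To see what the choice of multiplier implies, the short sketch below computes how much of a sampling distribution lies within one, two, or three SDs of its center. It assumes the sampling distribution is approximately normal, which the text does not state explicitly; the function name is a hypothetical one chosen for illustration.

```python
from statistics import NormalDist


def coverage_for_multiplier(k):
    """Share of a normal sampling distribution within k SDs of its center."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    share = coverage_for_multiplier(k)
    print(f"{k} SD: about {share:.1%} of estimates fall within the margin")
```

Under the normality assumption, a larger multiplier gives a wider interval that captures the population parameter more often, at the cost of a less precise statement.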

3.9.2 Interpreting margins of error and compatibility intervals

The margin of error and compatibility interval are used to account for uncertainty in a statistical estimate due to sampling variability. Remember, the uncertainty is in the estimate, not in the population parameter. There is only one true value of the population parameter. We do our best to estimate it, giving both the best single estimate and an interval of estimates that are compatible with our sample. Over many repeated samples, compatibility intervals built this way will capture the “true” population parameter about 95% of the time.

Thus, we can interpret the compatibility interval as follows:

  • Any estimate for the population parameter in the compatibility interval is compatible with our sample
  • We are 95% confident that the true population parameter is captured by the compatibility interval
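The "95% confident" statement can be checked with a small simulation. The sketch below invents a population in which the true percentage is known (58% is a hypothetical value chosen to mirror the poll), draws many random samples of 979 people, builds an interval of 2 SDs around each sample estimate, and counts how often the interval captures the truth. As a shortcut, the SD of the sampling distribution is computed with the standard formula for a proportion, sqrt(p(1 - p) / n), which is swapped in here in place of a bootstrap to keep the simulation fast; all names and settings are illustrative assumptions.

```python
import math
import random


def interval_captures_truth(true_p, n, rng):
    """Draw one sample and report whether its interval contains true_p."""
    n_yes = sum(1 for _ in range(n) if rng.random() < true_p)
    estimate = n_yes / n
    # Shortcut (an assumption of this sketch): formula-based SD of the
    # sampling distribution of a proportion, used instead of a bootstrap.
    sd = math.sqrt(estimate * (1 - estimate) / n)
    margin_of_error = 2 * sd
    return estimate - margin_of_error <= true_p <= estimate + margin_of_error


def simulate_coverage(true_p=0.58, n=979, n_samples=2000, seed=1):
    """Fraction of simulated samples whose interval captures true_p."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(interval_captures_truth(true_p, n, rng) for _ in range(n_samples))
    return hits / n_samples


coverage = simulate_coverage()
print(f"Intervals capturing the true value: {coverage:.1%}")
```

Notice that the true value never changes across the simulated samples; only the estimates and their intervals move, which is why the uncertainty is attributed to the estimate rather than to the population parameter.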

As you interpret compatibility intervals, there are a few things to keep in mind.

  • The compatibility interval is being used to estimate the population parameter.
  • The compatibility interval gives a range of compatible values for the population parameter.
  • The uncertainty is in the estimate, not in the population parameter.

To learn more about accounting for uncertainty due to sampling variability, please read the short pamphlet, What is a Margin of Error? from the American Statistical Association’s Section on Survey Research.

See the video below to learn how to set up a bootstrap model in TinkerPlots

Key points: margin of error and compatibility interval
  • The margin of error and compatibility interval are used to account for the uncertainty in a statistical estimate

  • They give a range of plausible estimates for the true population parameter. Any estimate in this range is compatible with the sample.

  • We calculate the margin of error and compatibility interval as follows:

    • \(\mathrm{Margin~of~Error} = 2 \times \mathrm{SD~of~sampling~distribution}\)
    • \(\mathrm{CI} = \mathrm{Estimate} \pm 2 \times \mathrm{SD~of~sampling~dist}\)
  • We can interpret the compatibility interval as follows:

    • Any estimate for the population parameter in the compatibility interval is compatible with our sample
    • We are 95% confident that the true population parameter is captured by the compatibility interval