30 Sample sizes for CIs

So far, you have learnt to ask a RQ, design a study, classify and summarise the data, form confidence intervals, and conduct hypothesis tests. In this chapter, you will learn to:

  • estimate the sample size for producing a CI of given width for a proportion.
  • estimate the sample size for producing a CI of given width for a mean.
  • estimate the sample size for producing a CI of given width for a mean difference.
  • estimate the sample size for producing a CI of given width for a difference between two means.
  • explain issues around estimating sample sizes.

30.1 Introduction

A confidence interval gives a range of values of the population parameter that could plausibly have produced the observed value of the statistic. All other things being equal, a larger sample size gives a more precise estimate of the population parameter. After all, that is why we prefer larger samples: to get more precise estimates, and hence narrower CIs. If that were not the case, we could take the smallest, cheapest and easiest possible sample (a sample of size one), which is clearly absurd.

For a given level of confidence (e.g., \(95\)%), the width of a CI depends on the size of the sample. All other things being equal, larger samples produce more precise estimates of the parameter (Sect. 5.2), and hence narrower CIs.

Example 30.1 (Impact of sample size on CIs) Suppose we wish to estimate an unknown proportion, and find that \(\hat{p} = 0.55\) from a sample of size \(n = 25\). The approximate \(95\)% CI is \(0.55 \pm 0.199\) (so the margin of error is \(0.199\)).

If the estimate of \(\hat{p} = 0.55\) was found from a sample of size \(n = 100\), a more precise estimate should be expected. The approximate \(95\)% CI is \(0.55\pm 0.099\); the margin of error is \(0.099\).

If the estimate of \(\hat{p} = 0.55\) was found from a sample of size \(n = 400\), the approximate \(95\)% CI is \(0.55\pm 0.050\); the margin of error is \(0.050\).

(Notice that in each case, the sample size was increased by a factor of \(4\), and the margin of error was halved.)
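The margins of error in Example 30.1 can be checked with a short calculation. This is a minimal sketch, assuming the approximate \(95\)% CI uses a multiplier of \(2\) on the standard error of the sample proportion; the function name `approx_margin_of_error` is illustrative only.

```python
import math

def approx_margin_of_error(p_hat, n):
    """Approximate 95% margin of error for a proportion (assumed: 2 x standard error)."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

# Each sample size is four times the previous one
for n in (25, 100, 400):
    moe = approx_margin_of_error(0.55, n)
    print(f"n = {n:3d}: margin of error = {moe:.4f}")
```

Each fourfold increase in \(n\) halves the margin of error, because \(n\) appears under a square root.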

The approximate width of the CI changes for different sample sizes (all else being equal), as Fig. 30.1 shows. From this graph, we can see that:

  • greater precision (a smaller CI width) is obtained using larger sample sizes.
  • for small sample sizes (say, smaller than 15), precision is greatly increased with small increases in the sample size.
  • for large sample sizes (say, greater than 30), precision only improves slightly when the sample size is increased.

FIGURE 30.1: The approximate 95% confidence interval for various sample sizes

That is, improving precision gets more difficult as sample sizes get larger. Large gains in precision are made by moderately increasing small sample sizes, but only small gains in precision are made by large increases in already-large sample sizes.
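The diminishing returns can be illustrated numerically. The sketch below assumes the same approximate \(95\)% margin of error for a proportion as in Example 30.1 (with \(\hat{p} = 0.55\)), and compares how much the margin of error shrinks when ten extra units are added to samples of different sizes. The function names and the choice of ten extra units are illustrative assumptions.

```python
import math

def approx_margin_of_error(p_hat, n):
    """Approximate 95% margin of error for a proportion (assumed: 2 x standard error)."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

def margin_gain(n, extra=10, p_hat=0.55):
    """Reduction in the margin of error from adding `extra` units to a sample of size n."""
    return approx_margin_of_error(p_hat, n) - approx_margin_of_error(p_hat, n + extra)

for n in (10, 100, 1000):
    print(f"n = {n:4d} -> {n + 10:4d}: margin of error shrinks by {margin_gain(n):.4f}")
```

The same ten extra units buy far less precision when the sample is already large.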

Remember that the sample size is the number of units of analysis.

30.2 General ideas

If larger samples give more precise estimates, should the largest sample possible always be used? Not necessarily: using large samples also has disadvantages:

  • Studies with larger sample sizes take longer to complete.
  • Studies with larger sample sizes are more expensive.
  • Ethics committees aim to keep sample sizes as small as possible, so that:
    • The environment is impacted as little as possible.
    • The fewest possible animals are harmed.
    • The fewest possible people are harmed or inconvenienced.
    • Resources, time and money are not wasted.

Determining the sample size to use is a trade-off between the advantages of increasing precision, and the challenges of cost, time, and remaining ethical (Chap. 4).

Example 30.2 (The cost of research) A project studied the residual effect of organic biochar compound fertilizers (BCFs) two years after application (Farrar et al. 2021). This study required planting turmeric in pots using soil previously treated with BCFs.

After the turmeric was grown, the concentration of potassium, phosphorus and nitrogen---as well as many trace minerals---was determined from the soil in every pot. In addition, every turmeric plant was analysed for the number of shoots, the leaf mass fraction, and foliar nutrient information.

Clearly, every pot that is used comes with a substantial cost, both in terms of time and money.

In this chapter, we learn how to compute the (approximate) minimum sample size needed to obtain a given precision (i.e., for a given margin of error) for a confidence interval. We only study the estimation of sample sizes for constructing a CI in these situations:

  • Estimating a proportion: Sect. 30.3.
  • Estimating a mean: Sect. 30.4.
  • Estimating a mean difference: Sect. 30.5.
  • Estimating a difference between two means: Sect. 30.6.

The formulas given in this chapter only apply for forming \(95\)% CIs, and are very conservative: they will probably give sample sizes a bit too large, but that is better than being too small. In any case, larger-than-calculated sample sizes are often used anyway to allow for drop-outs: animals or plants that die; people who can no longer be contacted; and so on.

30.3 One proportion

In Sect. 24.7, a CI was formed for the population proportion of female college students in the United States that drink coffee daily (Kelpin et al. 2018). From a sample of \(n = 360\), the CI was \(0.1694 \pm 0.0395\) (i.e., the margin of error is \(0.0395\)), or from \(0.130\) to \(0.209\).

To obtain a more precise estimate (i.e., a narrower CI), a larger sample is needed. For instance, suppose we would like a CI with margin of error of \(0.02\). What size sample is needed? Since we seek a more precise estimate, a larger sample is needed... but how much larger?

Definition 30.1 (Sample size: proportion) Conservatively, the size of the simple random sample needed for a \(95\)% CI for a proportion with a specified margin of error is at least
\[ \frac{1}{(\text{Margin of error})^2}. \]

For the coffee-drinking situation above, a sample size of at least \(\displaystyle 1\div (0.02^2) = 2\ 500\) female college students in the US is needed. This is a substantial increase from the original sample size of \(360\).
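As a brief sketch of where this conservative rule comes from, assume the approximate \(95\)% CI for a proportion uses a multiplier of \(2\), as in Example 30.1. Since \(\hat{p}(1 - \hat{p})\) can never exceed \(1/4\) (its value when \(\hat{p} = 0.5\)),
\[ \text{Margin of error} = 2\times\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le 2\times\sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}}. \]
So choosing \(n\) such that \(1/\sqrt{n}\) is no larger than the desired margin of error, which means \(n \ge 1/(\text{Margin of error})^2\), is enough whatever the value of \(\hat{p}\) turns out to be. This is why the rule needs no prior estimate of the proportion, and why it tends to give sample sizes that are larger than strictly necessary.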

Always round up the result of the sample size calculation.

Example 30.3 (Sample size calculations for one proportion) To estimate the population proportion of Australians that smoke, to within \(0.07\) with \(95\)% confidence, a sample size of at least
\[ \frac{1}{(\text{Margin of error})^2} = \frac{1}{0.07^2} = 204.0816 \] is needed. In practice, at least \(205\) people are needed to achieve this desired level of precision (that is, always round up in sample size calculations).
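The calculation in Definition 30.1 can be sketched in a few lines of code. The names `round_up` and `sample_size_proportion` are illustrative, and the small tolerance in `round_up` is an added assumption: it guards against floating-point noise pushing an exact answer (such as \(2\,500\)) up to the next whole number.

```python
import math

def round_up(x, tol=1e-9):
    """Round up to the next whole unit, ignoring tiny floating-point noise."""
    return math.ceil(x - tol)

def sample_size_proportion(margin_of_error):
    """Conservative sample size for a 95% CI for a proportion: 1 / (margin of error)^2."""
    return round_up(1 / margin_of_error**2)

print(sample_size_proportion(0.02))  # coffee-drinking study: 2500
print(sample_size_proportion(0.07))  # Example 30.3: 205
```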


30.4 One mean

Definition 30.2 (Sample size: mean) Conservatively, the size of the simple random sample needed for a \(95\)% CI for the mean with a specified margin of error is at least
\[ \left( \frac{2 \times s}{\text{Margin of error}}\right)^2, \] where \(s\) is an estimate of the standard deviation in the population.
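A sketch of the reasoning behind this formula, assuming the approximate \(95\)% CI for a mean has the form \(\bar{x} \pm (2\times s/\sqrt{n})\): rearranging the margin of error for \(n\) gives
\[ \text{Margin of error} = \frac{2\times s}{\sqrt{n}} \quad\text{so that}\quad \sqrt{n} = \frac{2\times s}{\text{Margin of error}} \quad\text{and hence}\quad n = \left(\frac{2\times s}{\text{Margin of error}}\right)^2. \]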

The formula requires a value for the sample standard deviation, \(s\). But if we don't have a sample yet... how can we have a value for the standard deviation of the sample? An approximate value for \(s\) is used, which can come from:

  • the value of \(s\) from the results of a pilot study (Sect. 10.1).
  • the results of a similar study, where the value \(s\) there can be used (see Example 30.4).

Always round up the results of a sample size calculation.

Example 30.4 (Sample size estimation for one mean) Sect. 25.5 discusses a study about the mean cadmium concentrations in peanuts in the United States, where \(s = 0.0460\) ppm (Blair and Lamb 2017).

Suppose we wanted to estimate the mean cadmium concentration in Australian peanuts, to give-or-take \(0.005\) ppm with \(95\)% confidence. We could use this value for \(s\) as a starting point, and then compute:
\[ \left( \frac{2 \times 0.0460}{0.005}\right)^2 = 338.56; \] we would need at least \(339\) peanuts.
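The calculation in Example 30.4 could be reproduced with a short function. This is only a sketch of Definition 30.2; the names `round_up` and `sample_size_mean` are illustrative, and the tolerance used when rounding up is an added safeguard against floating-point noise.

```python
import math

def round_up(x, tol=1e-9):
    """Round up to the next whole unit, ignoring tiny floating-point noise."""
    return math.ceil(x - tol)

def sample_size_mean(s, margin_of_error):
    """Conservative sample size for a 95% CI for a mean: (2 s / margin of error)^2."""
    return round_up((2 * s / margin_of_error) ** 2)

# Cadmium in peanuts: s = 0.0460 ppm, desired margin of error 0.005 ppm
print(sample_size_mean(0.0460, 0.005))  # 339
```

Because the answer depends on the square of \(s\), a poor guess for \(s\) can change the required sample size substantially, so trying a few plausible values of \(s\) may be worthwhile.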

30.5 Mean differences

The ideas in the previous section also work for computing sample sizes for estimating mean differences, since the differences can be treated like a single sample.

Definition 30.3 (Sample size: mean difference) Conservatively, the size of the simple random sample needed for a \(95\)% CI for the mean difference with a specified margin of error is at least
\[ \left( \frac{2 \times s_d}{\text{Margin of error}}\right)^2, \] where \(s_d\) is an estimate of the standard deviation of the differences in the population.

Example 30.5 (Sample size estimation for mean differences) In Sect. 27.7, a CI is computed for the mean weight gain by Cornell University students from Week 1 to Week 12 (D. A. Levitsky, Halbmaier, and Mrdjenovic 2004; D. Levitsky n.d.). The CI is \(0.862\pm 0.232\) kg, where the margin of error is \(0.232\) kg.

Suppose we wanted to estimate the mean weight change at a different university; we could use the value of \(s_d\) from this study as a starting point (i.e., \(s_d = 0.956\) kg). Also, suppose we wanted a more precise estimate, to give-or-take \(0.15\) kg. For a more precise estimate, we would need a larger sample. So we compute:
\[ \left( \frac{2 \times 0.956}{0.15}\right)^2 = 162.4775; \] we would need at least \(163\) students after rounding up (which is indeed larger than the \(68\) students used at Cornell University).
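As a rough check on Example 30.5, the sketch below first confirms that a standard deviation of \(0.956\) kg and \(n = 68\) students reproduce the original margin of error of \(0.232\) kg, and then computes the sample size for the tighter margin of error. It assumes the approximate \(95\)% margin of error is \(2\times s_d/\sqrt{n}\); the function names are illustrative.

```python
import math

def round_up(x, tol=1e-9):
    """Round up to the next whole unit, ignoring tiny floating-point noise."""
    return math.ceil(x - tol)

def margin_of_error_mean(s, n):
    """Approximate 95% margin of error for a mean (assumed: 2 x s / sqrt(n))."""
    return 2 * s / math.sqrt(n)

def sample_size_mean_difference(s_d, margin_of_error):
    """Conservative sample size for a 95% CI for a mean difference."""
    return round_up((2 * s_d / margin_of_error) ** 2)

# Original study: 68 students, standard deviation of the differences 0.956 kg
print(f"{margin_of_error_mean(0.956, 68):.3f}")   # 0.232

# New study: desired margin of error 0.15 kg
print(sample_size_mean_difference(0.956, 0.15))   # 163
```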

30.6 Difference between two means

A formula for computing sample sizes for estimating the difference between two means is simple if we make some assumptions:

  • the sample size in each group is the same; and
  • the standard deviation in each group is the same.

Formulas are available for computing sample sizes without these restrictions, but they are more complicated than the one given here.

Definition 30.4 (Sample size: difference between two means) Conservatively, the size of the simple random sample needed for a \(95\)% CI for the difference between two means with a specified margin of error is at least
\[ 2\times \left( \frac{2 \times s}{\text{Margin of error}}\right)^2 \] for each sample, where \(s\) is an estimate of the common standard deviation in the population for both groups.
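The extra factor of \(2\) can be explained with a short sketch. Assume both groups have the same sample size \(n\) and the same standard deviation \(s\), and that the standard error of the difference between the two sample means is \(\sqrt{s^2/n + s^2/n}\). Then, using an approximate multiplier of \(2\) for a \(95\)% CI,
\[ \text{Margin of error} = 2\times\sqrt{\frac{s^2}{n} + \frac{s^2}{n}} = 2\times s\times\sqrt{\frac{2}{n}}, \quad\text{so that}\quad n = 2\times\left(\frac{2\times s}{\text{Margin of error}}\right)^2. \]
Each group therefore needs twice as many units as a single sample would need to estimate one mean with the same margin of error.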

Example 30.6 (Sample size estimation for difference between means) In Sect. 28.1, a CI is computed for the difference between the reaction times for drivers using and not using a mobile phone (Strayer and Johnston 2001; Agresti and Franklin 2007). Suppose we wanted to estimate the difference between the mean reaction times to within \(15\) ms.

Table 28.2 shows that \(s = 65.36\) ms for those not using a phone, and \(s = 89.65\) ms for those using a phone. The sample-size calculation requires using a common value for the standard deviations in both populations. To use the formula, we will use \(s = 78\) ms (approximately the mean of the two values) as the common standard deviation, understanding that this may give only a rough sample size. Then:
\[ 2 \times \left( \frac{2 \times 78}{15}\right)^2 = 216.32. \] We would need at least \(217\) students in each sample.
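The calculation in Example 30.6 can be sketched as below. The function names are illustrative. The final line is an added sensitivity check, not part of the example: it uses the larger of the two standard deviations as a more cautious choice for the common value.

```python
import math

def round_up(x, tol=1e-9):
    """Round up to the next whole unit, ignoring tiny floating-point noise."""
    return math.ceil(x - tol)

def sample_size_two_means(s, margin_of_error):
    """Conservative per-group sample size for a 95% CI for a difference between two means."""
    return round_up(2 * (2 * s / margin_of_error) ** 2)

# Common standard deviation taken as 78 ms; desired margin of error 15 ms
per_group = sample_size_two_means(78, 15)
print(per_group, 2 * per_group)  # 217 in each group, 434 in total

# Sensitivity check (an assumption, not from the example): use the larger standard deviation
print(sample_size_two_means(89.65, 15))
```

The required sample size is noticeably larger under the more cautious choice, which shows how strongly the result depends on the value used for the common standard deviation.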

30.7 Other issues related to sample size

The above calculations form just one part of the information needed to make the final decision about the necessary sample size. For example, the cost (time and money) of taking a sample of this size has not been considered.

The calculations in this chapter assume a simple random sample will be used, which is often unreasonable. Other, more complex, formulas are available for computing sample sizes for other random-sampling schemes (such as stratified samples). However, the above calculations do give an estimate of the sample size that would be required. In addition, the calculations in this chapter are only for producing \(95\)% confidence intervals.

In practice, researchers often start with a slightly larger sample than required to allow for drop-outs (for example, plants die, or people withdraw from the study).

30.8 Example: emergency residential aged care

A study examined residential aged care residents in Australia needing emergency care (Dwyer et al. 2021) and recorded, among other information, the average age of such residents (\(\bar{x} = 85\); \(s = 7.3\)) and the proportion of calls related to falls (\(\hat{p} = 0.156\)).

Suppose a similar study was to be conducted in New Zealand. The aim was to estimate the mean age of residents to within \(2\) years, and the proportion of incidents related to falls to within \(0.10\).

The sample size required to meet the age requirement is at least
\[ n = \left(\frac{2\times s}{\text{Margin of error}}\right)^2 = \left(\frac{2\times 7.3}{2}\right)^2 = 53.29, \] or at least \(54\) residents (rounding up!). The sample size required to meet the falls requirement is at least
\[ n = \frac{1}{(\text{Margin of error})^2} = \frac{1}{0.1^2} = 100. \] Since the same subjects will be required for both estimates, the study should use at least \(100\) residents.
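When one sample must satisfy several precision requirements at once, the largest of the individual sample sizes is the one to use. A minimal sketch for this example follows; the function and variable names are illustrative, and the tolerance in `round_up` is an added safeguard against floating-point noise.

```python
import math

def round_up(x, tol=1e-9):
    """Round up to the next whole unit, ignoring tiny floating-point noise."""
    return math.ceil(x - tol)

def sample_size_mean(s, margin_of_error):
    """Conservative sample size for a 95% CI for a mean."""
    return round_up((2 * s / margin_of_error) ** 2)

def sample_size_proportion(margin_of_error):
    """Conservative sample size for a 95% CI for a proportion."""
    return round_up(1 / margin_of_error**2)

n_age = sample_size_mean(7.3, 2)        # mean age to within 2 years
n_falls = sample_size_proportion(0.10)  # proportion of falls to within 0.10

# The same residents supply both estimates, so the larger requirement governs
n_required = max(n_age, n_falls)
print(n_age, n_falls, n_required)  # 54 100 100
```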

30.9 Chapter summary

30.10 Quick review questions

  1. True or false: A larger sample size produces a more precise estimate of the parameter, all other things being equal.
  2. True or false: A larger sample size produces a more random sample.
  3. True or false: We should always take the largest possible sample size.

Answers:

  1. TRUE. The reason why larger samples are "better" is that they estimate the unknown population parameter with greater precision.
  2. FALSE. The size of the sample, and how the sample was obtained, are two different issues.
  3. FALSE. We also need to consider the cost (in terms of money and time) and ethical issues.

30.11 Exercises

Selected answers are available in App. E.

Exercise 30.1 Suppose we need to estimate a population proportion (with \(95\)% confidence).

  1. What size sample is needed to estimate the population proportion within \(0.04\)?
  2. What size sample is needed to estimate the population proportion within \(0.02\) (that is, the confidence interval will be half as wide as in the first calculation)?
  3. What size sample is needed to estimate the population proportion within \(0.01\) (that is, the confidence interval will be a quarter as wide as in the first calculation)?
  4. To get an estimate half as wide, how many times more units of analysis are needed?
  5. To get an estimate a quarter as wide, how many times more units of analysis are needed?

Exercise 30.2 Section 24.3 discusses a study of the eating habits of university students in Canada (Mann and Blotnicky 2017). In that study, they estimated the proportion of Canadian students that ate a sufficient number of servings of grains each day.

Suppose we wished to repeat the study but for New Zealand university students; that is, we seek an estimate of the population proportion of New Zealand students that eat a sufficient number of servings of grains each day (with \(95\)% confidence).

  1. What size sample is needed to estimate the proportion to give-or-take \(0.01\)?
  2. What size sample is needed to estimate the proportion to give-or-take \(0.02\)?
  3. What size sample is needed to estimate the proportion to give-or-take \(0.10\)?
  4. Do you think this study would be costly, in terms of time and money?

Exercise 30.3 In Exercise 25.3, a study by Tager et al. (1979) was discussed that measured the lung capacity of 11-year-old girls in East Boston (using the forced expiratory volume (FEV) of the children). Suppose we wished to repeat the study, and find a \(95\)% confidence interval for the mean FEV for 11-year-old Australian girls.

Since Australian and American children might be somewhat similar, we could use (as a first approximation) the standard deviation from that study: \(s = 0.43\) litres.

  1. What size sample is needed to estimate the mean to give-or-take \(0.02\) litres?
  2. What size sample is needed to estimate the mean to give-or-take \(0.05\) litres?
  3. What size sample is needed to estimate the mean to give-or-take \(0.10\) litres?
  4. Suppose we wished to find a \(99\)% (not \(95\)%) confidence interval for the mean FEV for 11-year-old Australian girls, to give-or-take \(0.10\) litres. Would this sample size be larger or smaller than the sample size found for a \(95\)% confidence interval (also with give-or-take \(0.10\) litres)?
  5. Do you think this study would be costly, in terms of time and money?

Exercise 30.4 A study of paramedics (B. Williams and Boyle 2007) asked participants (\(n = 199\)) to estimate the amount of blood loss on four different surfaces. When the actual amount of blood spill on concrete was \(1000\) ml, the mean guess was \(846.4\) ml (with a standard deviation of \(651.1\) ml).

  1. How many paramedics are needed to estimate the mean with a precision of give-or-take \(50\) ml?
  2. How many paramedics are needed to estimate the mean with a precision of give-or-take \(25\) ml?
  3. How many times greater does the sample size need to be to halve the margin of error?

Exercise 30.5 We wish to estimate the population proportion of Australians that smoke.

  1. Suppose we wish our \(95\)% CI to be give-or-take \(0.05\). How many Australians would need to be surveyed?
  2. Suppose we wish our \(95\)% CI to be give-or-take \(0.025\); that is, we wish to halve the width of the interval above. How many Australians would need to be surveyed?
  3. How many times as many Australians are needed to halve the width of the interval?

Exercise 30.6 In Sect. 28.9, a study was described that measured the reduction in mean vehicle speed after installation of additional signage (Ma et al. 2019).

  1. What would be a useful value to use for the common standard deviation (see Fig. 28.5)?
  2. Suppose we wish our \(95\)% CI to be give-or-take \(5\) km.h\(^{-1}\). How many vehicle speeds would need to be measured?