Methods: Study Size (10)

The STROBE item states that you should:
- Explain how the study size was arrived at

The sample size needed for a study depends on many factors including the size of the model, distribution of the variables, amount of missing data, reliability of the variables, and strength of the relationships among the variables.

Some key items to consider adding:
- Any unique restrictions placed on the study sample size
- Different determinants of sample size for different levels of organization (e.g., parent and offspring, family unit, etc.)
- How non-independence of measurements was incorporated into sample-size considerations
- The parameters, assumptions, methods, and effect size justification of the sample size calculation
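
One common way to fold non-independence into a sample-size calculation, offered here only as an illustrative sketch, is to inflate a size computed under independence by the design effect 1 + (m - 1) × ICC, where m is the average cluster size (for example, members per family) and ICC is the intraclass correlation. The function names and the numbers below are hypothetical, and the formula assumes roughly equal cluster sizes.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for clustered data: 1 + (m - 1) * ICC."""
    if cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("cluster_size must be >= 1 and icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc


def inflate_sample_size(n_independent: int, cluster_size: float, icc: float) -> int:
    """Total individuals needed once clustering is taken into account."""
    return math.ceil(n_independent * design_effect(cluster_size, icc))


# Hypothetical inputs: 200 individuals needed if all were independent,
# sampled as families of 3 with an assumed ICC of 0.25.
deff = design_effect(3, 0.25)
n_total = inflate_sample_size(200, 3, 0.25)
print(f"design effect = {deff}, inflated sample size = {n_total}")
```

Reporting the assumed cluster size and ICC alongside the inflated number lets readers judge how sensitive the study size is to those assumptions.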

Examples

Example 1.
“The number of cases in the area during the study period determined the sample size” (Vandenbroucke et al., 2007; Yadon et al., 2003).

Example 2.
“A survey of postnatal depression in the region had documented a prevalence of 19.8%. Assuming depression in mothers with normal weight children to be 20% and an odds ratio of 3 for depression in mothers with a malnourished child we needed 72 case-control sets (one case to one control) with an 80% power and 5% significance” (Anoop et al., 2004; Vandenbroucke et al., 2007).
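
The calculation described in Example 2 can be approximated with a standard two-proportion formula. The sketch below is not the original authors' method, which the quotation does not specify: it assumes an unmatched design with equal group sizes, a two-sided test, and the normal approximation, and it optionally applies a Fleiss-style continuity correction. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def exposure_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Exposure prevalence in cases implied by the control prevalence and the OR."""
    return odds_ratio * p_controls / (1.0 + p_controls * (odds_ratio - 1.0))


def n_per_group(p0: float, p1: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Uncorrected sample size per group for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2.0
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    return numerator / (p1 - p0) ** 2


def with_continuity_correction(n: float, p0: float, p1: float) -> float:
    """Continuity-corrected sample size per group (Fleiss-style adjustment)."""
    return n / 4.0 * (1.0 + math.sqrt(1.0 + 4.0 / (n * abs(p1 - p0)))) ** 2


# Inputs taken from Example 2: 20% exposure among controls, OR = 3,
# 80% power, 5% two-sided significance.
p0 = 0.20
p1 = exposure_in_cases(p0, 3.0)
n_plain = n_per_group(p0, p1)
n_corrected = with_continuity_correction(n_plain, p0, p1)
print(f"implied exposure in cases: {p1:.3f}")
print(f"per group, uncorrected: {n_plain:.1f}; continuity-corrected: {n_corrected:.1f}")
```

Under these assumptions the uncorrected figure is in the mid-60s per group, and the continuity-corrected figure lands close to the 72 sets quoted. That agreement is suggestive only; a matched-pair formula or different software could give a different number.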

Explanation

A study should be large enough to obtain a point estimate with a sufficiently narrow confidence interval to meaningfully answer a research question. Large samples are needed to distinguish a small association from no association. Small studies often provide valuable information, but wide confidence intervals may indicate that they contribute less to current knowledge in comparison with studies providing estimates with narrower confidence intervals. Also, small studies that show ‘interesting’ or ‘statistically significant’ associations are published more frequently than small studies that do not have ‘significant’ findings. While these studies may provide an early signal in the context of discovery, readers should be informed of their potential weaknesses.
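
To illustrate how precision depends on study size, the following sketch computes the approximate width of a 95% confidence interval for an odds ratio from expected cell counts, using the usual large-sample standard error of the log odds ratio. The exposure prevalences (20% in controls, 30% in cases), the equal group sizes, and the function names are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist


def se_log_odds_ratio(n_per_group: int, p_controls: float, p_cases: float) -> float:
    """Large-sample SE of log(OR) based on expected 2x2 cell counts."""
    a = n_per_group * p_cases              # exposed cases
    b = n_per_group * (1.0 - p_cases)      # unexposed cases
    c = n_per_group * p_controls           # exposed controls
    d = n_per_group * (1.0 - p_controls)   # unexposed controls
    return math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)


def ci_ratio(n_per_group: int, p_controls: float, p_cases: float,
             level: float = 0.95) -> float:
    """Ratio of the upper to the lower confidence limit of the odds ratio."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(2.0 * z * se_log_odds_ratio(n_per_group, p_controls, p_cases))


for n in (50, 200, 800):
    print(f"n per group = {n}: upper/lower CI limit ratio = {ci_ratio(n, 0.20, 0.30):.2f}")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the number of participants only halves the width of the interval on the log scale.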

The importance of sample size determination in observational studies depends on the context. If an analysis is performed on data that were already available for other purposes, the main question is whether the analysis of the data will produce results with sufficient statistical precision to contribute substantially to the literature, and sample size considerations will be informal. Formal, a priori calculation of sample size may be useful when planning a new study (Carlin & Doyle, 2002; Rigby & Vail, 1998). Such calculations are associated with more uncertainty than implied by the single number that is generally produced. For example, estimates of the rate of the event of interest or other assumptions central to calculations are commonly imprecise, if not guesswork (KF & DA, n.d.). The precision obtained in the final analysis often cannot be determined beforehand because it will be reduced by the inclusion of confounding variables in multivariable analyses (Drescher et al., 1990), by imprecise measurement of key variables (Devine & Smith, 1998), and by the exclusion of some individuals.

Few epidemiological studies explain or report deliberations about sample size (Pocock et al., 2004; Tooth et al., 2005). We encourage investigators to report pertinent formal sample size calculations if they were done. In other situations they should indicate the considerations that determined the study size (e.g., a fixed available sample, as in the first example above). If the observational study was stopped early when statistical significance was achieved, readers should be told. Do not bother readers with post hoc justifications for study size or retrospective power calculations (KF & DA, n.d.). From the point of view of the reader, confidence intervals indicate the statistical precision that was ultimately obtained. It should be realized that confidence intervals reflect statistical uncertainty only, and not all uncertainty that may be present in a study (see item 20) (Vandenbroucke et al., 2007).

Box 4. Grouping

There are several reasons why continuous data may be grouped (Altman, n.d.). When collecting data it may be better to use an ordinal variable than to seek an artificially precise continuous measure for an exposure based on recall over several years. Categories may also be helpful for presentation, for example to present all variables in a similar style, or to show a dose-response relationship.

Grouping may also be done to simplify the analysis, for example to avoid an assumption of linearity. However, grouping loses information and may reduce statistical power (Cohen, 1983), especially when dichotomization is used (MacCallum et al., 2002; Royston et al., 2006; Zhao & Kolonel, 1992). If a continuous confounder is grouped, residual confounding may occur, whereby some of the variable’s confounding effect remains unadjusted for (see Box 5) (Becher, 1992; Cochran, 1968). Increasing the number of categories can diminish power loss and residual confounding, and is especially appropriate in large studies. Small studies may use few groups because of limited numbers.
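
The information lost by dichotomization can be quantified under a simplifying assumption that is introduced here for illustration: if exposure and outcome are bivariate normal, splitting the exposure at a cut-point with a proportion p of values below it multiplies the correlation by φ(z_p) / sqrt(p(1 - p)), where φ is the standard normal density and z_p the corresponding quantile. The sketch below evaluates that factor; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def attenuation_factor(proportion_below_cut: float) -> float:
    """Factor by which a correlation shrinks when a normal variable is dichotomized."""
    p = proportion_below_cut
    if not 0.0 < p < 1.0:
        raise ValueError("proportion_below_cut must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(p)  # cut-point on the standard normal scale
    return NormalDist().pdf(z) / math.sqrt(p * (1.0 - p))


for p in (0.5, 0.75, 0.9):
    factor = attenuation_factor(p)
    # The squared factor approximates the fraction of the sample's information retained.
    print(f"cut at {p:.0%}: correlation x {factor:.3f}, relative efficiency {factor ** 2:.2f}")
```

Under this assumption a median split retains roughly two thirds of the information, and more extreme cut-points retain less.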

Investigators may choose cut-points for groupings based on commonly used values that are relevant for diagnosis or prognosis, for practicality, or on statistical grounds. They may choose equal numbers of individuals in each group using quantiles (Clayton & Hills, 1993). On the other hand, one may gain more insight into the association with the outcome by choosing more extreme outer groups and having the middle group(s) larger than the outer groups (Cox, 1957). In case-control studies, deriving a distribution from the control group is preferred since it is intended to reflect the source population. Readers should be informed if cut-points are selected post hoc from several alternatives. In particular, if the cut-points were chosen to minimise a P value, the true strength of an association will be exaggerated (Altman, 1994).
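
As a minimal sketch of quantile-based grouping in a case-control study, the code below derives quartile cut-points from the controls only and then applies the same cut-points to the cases. The exposure values are invented for illustration, the helper name is hypothetical, and `statistics.quantiles` is used with its default method; other quantile definitions give slightly different cut-points.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Hypothetical exposure measurements (arbitrary units).
controls = list(range(1, 21))
cases = [3, 8, 12, 14, 16, 17, 18, 19, 22, 25]

# Three cut-points that split the control distribution into four groups.
cut_points = quantiles(controls, n=4)


def group_counts(values, cuts):
    """Count how many values fall into each group defined by the cut-points."""
    counts = Counter(bisect_right(cuts, v) for v in values)
    return [counts.get(g, 0) for g in range(len(cuts) + 1)]


control_counts = group_counts(controls, cut_points)
case_counts = group_counts(cases, cut_points)
print("cut-points from controls:", cut_points)
print("controls per group:", control_counts)
print("cases per group:", case_counts)
```

Because the cut-points come from the controls, the controls are spread evenly across groups, while any shift in the case distribution shows up as uneven case counts.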

When analysing grouped variables, it is important to recognise their underlying continuous nature. For instance, a possible trend in risk across ordered groups can be investigated. A common approach is to model the rank of the groups as a continuous variable. Such linearity across group scores will approximate an actual linear relation if groups are equally spaced (e.g., 10 year age groups) but not otherwise. Il’yasova et al. (2005) recommend publication of both the categorical and the continuous estimates of effect, with their standard errors, in order to facilitate meta-analysis, as well as providing intrinsically valuable information on dose-response. One analysis may inform the other and neither is assumption-free. Authors often ignore the ordering and consider the estimates (and P values) separately for each category compared to the reference category. This may be useful for description, but may fail to detect a real trend in risk across groups. If a trend is observed, a confidence interval for a slope might indicate the strength of the observation.

Field-specific guidance

Seroepidemiologic studies for influenza (Horby et al., 2017)
- Describe the baseline estimated seroprevalence at given antibody titers or incidence of infection and cite published literature to support these estimates

## Resources
Coppock, A. Power Calculator (Shiny App). Retrieved January 27, 2020, from https://egap.shinyapps.io/Power_Calculator/ (Coppock, n.d.)

References

Altman, D. (n.d.). Categorizing continuous variables. In Encyclopedia of biostatistics (2nd ed., pp. 708–711). John Wiley & Sons, Ltd.

Altman, D. (1994). Dangers of using “optimal” cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst, 86, 829–835.

Anoop, S., Saravanan, B., Joseph, A., Cherian, A., & Jacob, K. S. (2004). Maternal depression and low maternal intelligence as risk factors for malnutrition in children: A community based case-control study from South India. Archives of Disease in Childhood, 89(4), 325–329. https://doi.org/10.1136/adc.2002.009738

Becher, H. (1992). The concept of residual confounding in regression models and some applications. Statistics in Medicine, 11(13), 1747–1758. https://doi.org/10.1002/sim.4780111308

Carlin, J. B., & Doyle, L. W. (2002). Sample size. Journal of Paediatrics and Child Health, 38(3), 300–304. https://doi.org/10.1046/j.1440-1754.2002.00855.x

Clayton, D., & Hills, M. (1993). Models for dose-response (chapter 25). In Statistical models in epidemiology (pp. 249–260). Oxford University Press.

Cochran, W. G. (1968). The Effectiveness of Adjustment by Subclassification in Removing Bias in Observational Studies. Biometrics, 24(2), 295–313. https://doi.org/10.2307/2528036

Cohen, J. (1983). The Cost of Dichotomization. Applied Psychological Measurement, 7(3), 249–253. https://doi.org/10.1177/014662168300700301

Coppock, A. (n.d.). Power Calculator. Retrieved January 27, 2020, from https://egap.shinyapps.io/Power_Calculator/

Cox, D. R. (1957). Note on Grouping. Journal of the American Statistical Association, 52(280), 543–547. https://doi.org/10.1080/01621459.1957.10501411

Devine, O. J., & Smith, J. M. (1998). Estimating sample size for epidemiologic studies: The impact of ignoring exposure measurement uncertainty. Statistics in Medicine, 17(12), 1375–1389. https://doi.org/10.1002/(SICI)1097-0258(19980630)17:12<1375::AID-SIM857>3.0.CO;2-D

Drescher, K., Timm, J., & Jöckel, K.-H. (1990). The design of case-control studies: The effect of confounding on sample size requirements. Statistics in Medicine, 9(7), 765–776. https://doi.org/10.1002/sim.4780090706

Horby, P. W., Laurie, K. L., Cowling, B. J., Engelhardt, O. G., Sturm-Ramirez, K., Sanchez, J. L., Katz, J. M., Uyeki, T. M., Wood, J., Van Kerkhove, M. D., & the CONSISE Steering Committee. (2017). CONSISE statement on the reporting of Seroepidemiologic Studies for influenza (ROSES-I statement): An extension of the STROBE statement. Influenza and Other Respiratory Viruses, 11(1), 2–14. https://doi.org/10.1111/irv.12411

Il’yasova, D., Hertz-Picciotto, I., Peters, U., Berlin, J. A., & Poole, C. (2005). Choice of exposure scores for categorical regression in meta-analysis: A case study of a common problem. Cancer Causes & Control, 16(4), 383–388. https://doi.org/10.1007/s10552-004-5025-x

KF, S., & DA, G. (n.d.). Sample size calculations in randomised trials: Mandatory and mystical. The Lancet, 365, 1348–1353. https://www.sciencedirect.com/science/article/pii/S0140673605610343

MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (2002). Sample Size in Factor Analysis. Psychol Methods, 7, 19–40.

Pocock, S. J., Collier, T. J., Dandreo, K. J., Stavola, B. L. de, Goldman, M. B., Kalish, L. A., Kasten, L. E., & McCormack, V. A. (2004). Issues in the reporting of epidemiological studies: A survey of recent practice. The BMJ, 329(7471), 883. https://doi.org/10.1136/bmj.38250.571088.55

Rigby, A. S., & Vail, A. (1998). Statistical methods in epidemiology. II: A commonsense approach to sample size estimation. Disability and Rehabilitation, 20(11), 405–410. https://doi.org/10.3109/09638289809166102

Royston, P., Altman, D. G., & Sauerbrei, W. (2006). Dichotomizing continuous predictors in multiple regression: A bad idea. Statistics in Medicine, 25(1), 127–141. https://doi.org/10.1002/sim.2331

Tooth, L., Ware, R., Bain, C., Purdie, D. M., & Dobson, A. (2005). Quality of Reporting of Observational Longitudinal Research. American Journal of Epidemiology, 161(3), 280–288. https://doi.org/10.1093/aje/kwi042

Vandenbroucke, J. P., Elm, E. von, Altman, D. G., Gotzsche, P. C., Mulrow, C. D., Pocock, S. J., Poole, C., Schlesselman, J. J., & Egger, M. (2007). Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and Elaboration. Epidemiology, 18(6), 805–835. https://doi.org/10.1097/EDE.0b013e3181577511

Yadon, Z. E., Rodrigues, L. C., Davies, C. R., & Quigley, M. A. (2003). Indoor and peridomestic transmission of american cutaneous leishmaniasis in northwestern argentina: A retrospective case-control study. The American Journal of Tropical Medicine and Hygiene, 68(5), 519–526. https://doi.org/10.4269/ajtmh.2003.68.519

Zhao, L. P., & Kolonel, L. N. (1992). Efficiency Loss from Categorizing Quantitative Exposures into Qualitative Exposures in Case-Control Studies. American Journal of Epidemiology, 136(4), 464–474. https://doi.org/10.1093/oxfordjournals.aje.a116520