Chapter 9 Publication Bias

In the previous chapters, we have shown you how to pool effects in meta-analysis, choose the right pooling model, assess the heterogeneity of your effect estimate, and determine sources of heterogeneity through outlier, influence, and subgroup analyses.

Nevertheless, even the most thoroughly conducted meta-analysis can only work with the study material at hand. An issue commonly discussed in research, however, is the file-drawer or publication bias problem, which states that studies with high effect sizes are more likely to be published than studies with low effect sizes (Rothstein, Sutton, and Borenstein 2006).

Studies with low effect sizes, it is assumed, thus never get published and therefore cannot be integrated into our meta-analysis. This results in publication bias: the pooled effect we estimate in our meta-analysis might be higher than the true effect size because we did not consider the missing studies with lower effects, for the simple reason that they were never published.

Although this practice is gradually changing (Nelson, Simmons, and Simonsohn 2018), whether a study is published or not heavily depends on the statistical significance (\(p<0.05\)) of its results (Dickersin 2005). For any sample size, a result is more likely to become statistically significant if its effect size is high. This is particularly true for small studies, for which very large effect sizes are needed to attain a statistically significant result.
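The link between sample size, effect size, and significance can be illustrated with a quick power calculation. The sketch below is not part of the original text: it uses a standard normal approximation for the power of a two-sided, two-sample test of a standardized mean difference *d*, and the function names and numbers are purely illustrative.

```python
import math

def normal_cdf(x):
    # Standard normal cumulative distribution function via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_sample(d, n_per_group, z_crit=1.96):
    """Approximate power of a two-sided two-sample test for a
    standardized mean difference d, using a normal approximation."""
    se = math.sqrt(2 / n_per_group)  # approximate standard error of d
    z = d / se
    # Probability of landing in either rejection region
    return (1 - normal_cdf(z_crit - z)) + normal_cdf(-z_crit - z)

# A small trial (n = 20 per arm) will rarely detect a small effect,
# but a very large effect is usually significant even at this size:
print(round(power_two_sample(0.2, 20), 2))  # ≈ 0.10
print(round(power_two_sample(0.9, 20), 2))  # ≈ 0.81
```

In other words, among small published studies that crossed the significance threshold, large effect sizes are strongly over-represented, which is exactly why the methods discussed below focus on small studies.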

In the following chapter, we will describe the idea behind statistical models for publication bias in further depth. We call these concepts and methods small sample bias methods, as it is small studies that they mostly focus on. These methods assume that publication bias is primarily driven by effect size, and that researchers immediately put a study into the file drawer once its results turn out to be non-significant.

Recently, it has been argued that these assumptions may not be true, and that publication bias is mostly caused by significance levels and p-hacking (Simonsohn, Nelson, and Simmons 2014a). An alternative method called P-curve has therefore been suggested to examine publication bias. We will present this method in the last chapter of this section.

Which method should I use for my meta-analysis?

Recent research suggests that the conventional small sample bias methods may have substantial limitations, and that P-curve may be able to estimate the true effect with less bias (Simonsohn, Nelson, and Simmons 2014a, 2014b; Simonsohn, Simmons, and Nelson 2015). Please note, however, that the two approaches are based on different theoretical assumptions about the origin of publication bias. As we cannot ultimately decide which assumption is the "true" one in a specific research field, and as, in practice, the true effect is unknown when conducting a meta-analysis, we argue that you may use both methods and compare the results as sensitivity analyses (Harrer et al. 2019).

P-curve was developed with full-blown experimental psychological research in mind, in which researchers often have high degrees of "researcher freedom" (Simmons, Nelson, and Simonsohn 2011) in deleting outliers and performing statistical tests on their data.

We argue that this looks slightly different for clinical psychology and the medical field, where researchers conduct randomized controlled trials with a clear primary outcome: the difference between the control and the intervention group after the treatment. While it is also true for medicine and clinical psychology that statistical significance plays an important role, the effect size of an intervention is often of greater interest, as treatments in this field are commonly compared in terms of their treatment effects.

Furthermore, best practice for randomized controlled trials is to perform intention-to-treat analyses, in which all data collected in a trial have to be considered, giving researchers less room to "play around" with their data and perform p-hacking. While we certainly do not want to insinuate that outcome research in clinical psychology is free from p-hacking and bad data analysis practices, these considerations suggest that the assumptions of the small sample bias methods may be more adequate for clinical psychology than for other fields within psychology, especially when *the risk of bias for each study is also taken into account*.

Facing this uncertainty, we think that conducting both analyses and reporting them in your research paper may be the most adequate approach until meta-scientific research gives us more certainty about which assumption actually best reflects the field of clinical psychology.


Rothstein, Hannah R, Alexander J Sutton, and Michael Borenstein. 2006. Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments. John Wiley & Sons.

Nelson, Leif D, Joseph Simmons, and Uri Simonsohn. 2018. “Psychology’s Renaissance.” Annual Review of Psychology 69.

Dickersin, Kay. 2005. “Publication Bias: Recognizing the Problem, Understanding Its Origins and Scope, and Preventing Harm.” Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments. Wiley Chichester, UK, 11–33.

Simonsohn, Uri, Leif D Nelson, and Joseph P Simmons. 2014a. “P-Curve: A Key to the File-Drawer.” Journal of Experimental Psychology: General 143 (2). American Psychological Association: 534.

Simonsohn, Uri, Leif D Nelson, and Joseph P Simmons. 2014b. “P-Curve and Effect Size: Correcting for Publication Bias Using Only Significant Results.” Perspectives on Psychological Science 9 (6). Sage Publications Sage CA: Los Angeles, CA: 666–81.

Simonsohn, Uri, Joseph P Simmons, and Leif D Nelson. 2015. “Better P-Curves: Making P-Curve Analysis More Robust to Errors, Fraud, and Ambitious P-Hacking, a Reply to Ulrich and Miller (2015).” American Psychological Association.

Harrer, Mathias, Sophia H Adam, Harald Baumeister, Eirini Karyotaki, Pim Cuijpers, Ronny Bruffaerts, Randy P Auerbach, Ronald C Kessler, Matthias Berking, and David D Ebert. 2019. “Internet Interventions for Mental Health in University Students: A Systematic Review and Meta-Analysis.” International Journal of Methods in Psychiatric Research.

Simmons, Joseph P, Leif D Nelson, and Uri Simonsohn. 2011. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22 (11). Sage Publications Sage CA: Los Angeles, CA: 1359–66.