Chapter 20 Pre-registration and Registered Reports

20.1 Study registration

The basic idea of study registration is that the researcher declares in advance key details of the study - effectively a full protocol that explains the research question and the methods that will be used to address it. Crucially, this should specify the primary outcome and the method of analysis, without leaving the researcher any wiggle room to tweak results to make them look more favourable. A study record in a trial registry should be public and time-stamped, and completed before any results have been collected. Since the Food and Drug Administration Modernization Act of 1997 first required FDA-regulated trials to be deposited in a registry, other areas of medicine have followed, with a general registry, https://clinicaltrials.gov/, established in 2000. Registration of clinical trials has become widely adopted, and is now required if one wants to publish clinical trial results in a reputable journal (De Angelis et al., 2004).

Study registration serves several functions, but perhaps the most important one is that it makes research studies visible, regardless of whether they obtain positive results. In Chapter 19, we showed results from a study that was able to document publication bias (De Vries et al., 2018) precisely because trials in this area were registered. Without registration, we would have no way of telling that the unpublished trials had ever existed.

A second important function of trial registration is that it allows us to see whether researchers did what they planned to do. Of course, “The best laid schemes o’ mice an’ men / Gang aft a-gley”, as Robert Burns put it. It may turn out to be impossible to recruit all the participants one hoped for. An outcome variable may turn out to be unsuitable for analysis. A new analytic method may come along that is much more appropriate for the study. The purpose of registration is not to put the study in a straitjacket, but rather to make it transparent when there are departures from the protocol. As noted in Chapter 11, it is not uncommon for researchers to (illegitimately) change their hypothesis on the basis of seeing the data. This practice can be highly misleading to both the researcher and the intended audience. It can happen that the researcher is initially disappointed by a null result, but then notices that their hypothesis might be supported if a covariate is used to adjust the analysis, or if a subgroup of particular individuals is analysed instead. But if we look at the data and observe interesting patterns, then form subgroups of individuals based on these observations, we raise the likelihood that we will pursue chance findings rather than a true phenomenon (Senn, 2018). As discussed in Chapter 13, it can be entirely reasonable to suppose that some people are more responsive to the intervention than others, but there is a real risk of misinterpreting chance variation as meaningful difference if we identify subgroups only after examining the results.
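
To make the subgroup problem concrete, here is a minimal simulation (our own sketch, not taken from Senn, 2018). A trial is simulated with no true treatment effect at all, yet the analyst tests the treatment effect within each of several arbitrary post-hoc subgroups and keeps the best-looking result; the subgroup count and sample size are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def best_subgroup_p(n=200, n_subgroups=8):
    """Simulate a null trial (no true effect), then return the smallest
    p-value obtained by testing the treatment effect separately in each
    of several post-hoc subgroups."""
    group = rng.integers(0, 2, n)               # 0 = control, 1 = treatment
    outcome = rng.normal(size=n)                # no true treatment effect
    subgroup = rng.integers(0, n_subgroups, n)  # arbitrary after-the-fact split
    p_values = []
    for s in range(n_subgroups):
        mask = subgroup == s
        treated = outcome[mask & (group == 1)]
        control = outcome[mask & (group == 0)]
        if len(treated) > 1 and len(control) > 1:
            p_values.append(stats.ttest_ind(treated, control).pvalue)
    return min(p_values)

n_sims = 2000
hits = sum(best_subgroup_p() < 0.05 for _ in range(n_sims))
print(f"Null trials with a 'significant' subgroup: {hits / n_sims:.0%}")
# With 8 subgroups this approaches 1 - 0.95**8, i.e. about a third of
# null trials yield at least one nominally significant subgroup effect.
```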

Does this mean we are prohibited from exploring our data to discover unexpected findings? A common criticism of pre-registration is that it kills creativity and prevents us from making new discoveries, but this is not the case. Data exploration is an important part of scientific discovery and is to be encouraged, provided that the complete analysis timeline is presented and unregistered analyses are labelled as exploratory. An interesting-looking subgroup effect can then be followed up in a new study to see if it replicates. The problem arises when such analyses are presented as if they were part of the original plan, with results that favour an intervention effect being cherry-picked. As we saw in Chapter 11, the interpretation of statistical analyses is highly dependent on whether a specific hypothesis is tested prospectively, or whether the researcher is data-dredging - running numerous analyses in search of the elusive “significant” result; registration of the study protocol means that this distinction cannot be obscured.

In psychology, a move towards registration of studies has been largely prompted by concerns about the high rates of p-hacking and HARKing in this literature (Simmons et al., 2011) (see Chapter 11), and the focus is less on clinical trials than on basic observational or experimental studies. The term “pre-registration” has been adopted to cover this situation. For psychologists, the Open Science Framework has become the most popular repository for pre-registrations, allowing researchers to deposit a time-stamped protocol, which can be embargoed for a period of time if it is desired to keep this information private (Hardwicke & Wagenmakers, 2021). Examples of pre-registration templates are available on the Open Science Framework website.

Does trial registration prevent outcome-switching?

A registered clinical trial protocol should specify a primary outcome measure, which will be used in the principal analysis of the study data. This should protect against cases where the researcher looks at numerous outcomes and picks the one that looks best - in effect, p-hacking. In practice, trial registration does not always achieve its goals: Goldacre et al. (2019) identified 76 trials published in a six-week period in one of five journals: New England Journal of Medicine, The Lancet, Journal of the American Medical Association, British Medical Journal, and Annals of Internal Medicine. These are all high-impact journals that officially endorse the Consolidated Standards of Reporting Trials (CONSORT), which specify that pre-specified primary outcomes should be reported. Not only did Goldacre et al. find high rates of outcome-switching in these trial reports; they also found that some of the journals were reluctant to publish a letter that drew attention to the mismatch, with qualitative analysis demonstrating “extensive misunderstandings among journal editors about correct outcome reporting”.
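
A back-of-envelope calculation shows what is at stake. If a trial measures m outcomes, none of which is truly affected by the intervention, and the best-looking one is reported, the chance of at least one nominally significant result grows rapidly with m (this sketch assumes independent outcomes, a simplification):

```python
# Chance of at least one p < .05 among m outcomes when no true effect exists.
alpha = 0.05
for m in (1, 2, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(f"{m:>2} outcomes tested: {fwer:.0%} chance of a 'positive' result")
# 1 outcome: 5%; 5 outcomes: 23%; 20 outcomes: 64%.
```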

20.2 Registered Reports

Michael Mahoney, whose book was mentioned in Chapter 19, provided an early demonstration of publication bias with his little study of journal reviewers (Mahoney, 1976). Having found that reviewers are far too readily swayed by a paper’s results, he recommended:

Manuscripts should be evaluated solely on the basis of their relevance and their methodology. Given that they ask an important question in an experimentally meaningful way, they should be published - regardless of their results. (p. 105).

Thirty-seven years later, Chris Chambers independently came to the same conclusion. In his editorial role at the journal Cortex, he introduced a new publishing initiative adopting this model, which was heralded by an open letter in the Guardian newspaper. The registered report is a specific type of journal article that embraces pre-registration as one element of the process; crucially, peer review occurs before any data are collected.

Figure 20.1: Comparison of stages in regular publishing model and Registered Reports

Figure 20.1 shows how registered reports differ from the traditional publishing model. In traditional publishing, reviewers evaluate the study after it has been completed, giving ample opportunity for them to be swayed by the results. In addition, their input comes at a stage when it is too late to remedy serious flaws in the experimental design: there is a huge amount of waste when researchers put months or years into a piece of research that is then rejected because peer reviewers find serious flaws. In contrast, with registered reports, reviewers are involved at a much earlier stage. The decision whether or not to accept the article for publication is based on the introduction, methods and analysis plan, before results are collected. At this point reviewers cannot be influenced by the results, as they do not yet exist. The second-stage review is conducted after the study has been completed, but this is much lighter touch: it checks only whether the authors did as they said they would and whether the conclusions are reasonable given the data. The “in principle” acceptance cannot be overturned by reviewers coming up with new demands at this point. This format turns the traditional publishing paradigm on its head: peer review acts less as “here’s what you got wrong and should have done” and more as feedback from a helpful collaborator, arriving at a stage in the project when things can still be adapted to improve the study.

Methodological quality of registered reports tends to be high because no editor or reviewer wants to commit to publish a study that is poorly conducted, underpowered or unlikely to give a clear answer to an interesting question. Registered reports are required to specify clear hypotheses, specify an analysis plan to test them, justify the sample size, and document how issues such as outlier exclusion and participant selection criteria will be handled. These requirements are more rigorous than those for clinical trial registration.
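
As an illustration of the kind of sample size justification involved, here is a sketch of a conventional power analysis using the statsmodels library in Python. The effect size (Cohen’s d = 0.4) and target power are placeholder assumptions; in a real registered report they would be justified from pilot data, prior literature, or a smallest effect size of interest.

```python
# Sketch of a sample-size justification for a two-group comparison.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# How many participants per group for 90% power to detect d = 0.4?
n_per_group = analysis.solve_power(effect_size=0.4, alpha=0.05, power=0.9,
                                   alternative='two-sided')
print(f"Required n per group: {n_per_group:.1f}")   # ~132; round up to 133

# Conversely, how much power would a fixed budget of 40 per group buy?
power = analysis.solve_power(effect_size=0.4, nobs1=40, alpha=0.05)
print(f"Power with n = 40 per group: {power:.2f}")  # ~0.42: badly underpowered
```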

The terminology in this area can be rather confusing, and it is important to distinguish between pre-registration, as described in the previous section (which in the clinical trials literature is simply referred to as “trial registration”), and registered reports, which include peer review prior to data collection, with “in principle acceptance” of the paper before results are known. Another point of difference is that trial registration is always made public; that is not necessarily the case for registered reports, where the initial protocol may be deposited with the journal but not placed in the public domain. Pre-registrations outside the clinical trials domain need to be deposited in a repository with a time-stamp, but there is flexibility about when, or indeed whether, they are made public.

The more stringent requirements for a registered report, versus standard pre-registration, mean that this publication model can counteract four major sources of bias in scientific publications - referred to by Bishop (2019) as the four horsemen of the reproducibility apocalypse, namely:

  • Publication bias. By basing reviews on introduction and methods only, it is no longer possible for knowledge of results to influence publication decisions. As Mahoney (1976) put it, it allows us to “place our trust in good questions rather than cheap answers”.

  • Low power. No journal editor wants to publish an ambiguous null result that could just be the consequence of low statistical power - see Chapter 10. However, in an adequately powered intervention study, a null result is important and informative for telling us what does not work. Registered reports require authors to justify their sample size, minimising the likelihood of type II errors.

  • P-hacking. Pre-specification of the analysis plan makes transparent the distinction between pre-planned hypothesis-testing analyses and post hoc exploration of the data. Note that exploratory analyses are not precluded in a registered report, but they are reported separately, on the grounds that statistical inferences need to be handled differently in this case - see Chapter 11. (A simulation of one common p-hacking practice is sketched after this list.)

  • HARKing. Because hypotheses are specified before the data are collected, it will be obvious if the researcher uses their data to generate a new hypothesis. HARKing is so common as to be normative in many fields, but it generates a high rate of false positives when a hypothesis that is only specified after seeing the data is presented as if it were the primary motivation for the study. Instead, in a registered report, authors are encouraged to present new ideas that emerge from the data in a separate section entitled “Exploratory analyses”.
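
As promised above, here is a small simulation of one common undisclosed practice, “optional stopping”: analysing the data repeatedly as participants accumulate and stopping as soon as p < .05. The batch sizes and stopping rule are illustrative assumptions, and there is no true effect in the simulated data; a pre-registered analysis plan would rule this flexibility out (or require a sequential design that properly corrects for it).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def optional_stopping_trial(start_n=20, max_n=100, step=10):
    """Run a null study, testing after every batch of participants and
    stopping as soon as p < .05. Returns True if a 'significant' result
    was ever declared, despite there being no true effect."""
    a = list(rng.normal(size=start_n))   # treatment arm (no real effect)
    b = list(rng.normal(size=start_n))   # control arm
    while len(a) <= max_n:
        if stats.ttest_ind(a, b).pvalue < 0.05:
            return True                  # stop and declare 'success'
        a.extend(rng.normal(size=step))
        b.extend(rng.normal(size=step))
    return False

n_sims = 2000
hits = sum(optional_stopping_trial() for _ in range(n_sims))
print(f"False-positive rate with optional stopping: {hits / n_sims:.0%}")
# Roughly three times the nominal 5%, from this one practice alone;
# Simmons et al. (2011) show that combining several such 'researcher
# degrees of freedom' can push the rate above 60%.
```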

Registered reports are becoming increasingly popular in psychology, and are beginning to be adopted in other fields, but many journal editors have resisted adopting this format. In part this is because any novel system requires extra work, and in part it is because of other concerns - e.g. that this might lead to less interesting work being published in the journal. Answers to frequently asked questions about registered reports can be found on the Open Science Framework. As might be gathered from this account, we are enthusiastic advocates of this approach, and have co-authored several registered reports ourselves.

References

Bishop, D. V. M. (2019). Rein in the four horsemen of irreproducibility. Nature, 568(7753), 435. https://doi.org/10.1038/d41586-019-01307-2
De Angelis, C., Drazen, J. M., Frizelle, F. A., Haug, C., Hoey, J., Horton, R., Kotzin, S., Laine, C., Marusic, A., Overbeke, A. J. P. M., Schroeder, T. V., Sox, H. C., Van Der Weyden, M. B., & International Committee of Medical Journal Editors. (2004). Clinical trial registration: A statement from the International Committee of Medical Journal Editors. Lancet (London, England), 364(9438), 911–912. https://doi.org/10.1016/S0140-6736(04)17034-7
De Vries, Y. A., Roest, A. M., Jonge, P. de, Cuijpers, P., Munafò, M. R., & Bastiaansen, J. A. (2018). The cumulative effect of reporting and citation biases on the apparent efficacy of treatments: The case of depression. Psychological Medicine, 48(15), 2453–2455. https://doi.org/10.1017/S0033291718001873
Goldacre, B., Drysdale, H., Dale, A., Milosevic, I., Slade, E., Hartley, P., Marston, C., Powell-Smith, A., Heneghan, C., & Mahtani, K. R. (2019). COMPare: A prospective cohort study correcting and monitoring 58 misreported trials in real time. Trials, 20(1), 118. https://doi.org/10.1186/s13063-019-3173-2
Hardwicke, T. E., & Wagenmakers, E.-J. (2021). Preregistration: A pragmatic tool to reduce bias and calibrate confidence in scientific research. MetaArXiv. https://doi.org/10.31222/osf.io/d7bcu
Mahoney, M. J. (1976). Scientist as Subject: The Psychological Imperative. Ballinger Publishing Company.
Senn, S. (2018). Statistical pitfalls of personalized medicine. Nature, 563(7733), 619–621. https://doi.org/10.1038/d41586-018-07535-2
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632