What are pre-registrations good for? (Absolutely nothing!?) A quick review of some relevant references and results.
In late 2023, Uri Simonsohn posted a piece on the prevalence of pre-registration on Data Colada [115]. He motivated it as follows:
Pre-registration is the best and possibly only solution to p-hacking. Ten years ago, pre-registrations were virtually unheard of in psychology, but they have become increasingly common since then. I was curious just how common they have become, and so I collected some data. This post shares the results.
Here are the results:
The arguments for (and against) pre-registration, and registered reports for that matter, have been laid out in Logg & Dorison (OBHDP 2021) and Chambers & Tzavella (NHB 2022).
At about the same time that Simonsohn updated us on the prevalence of pre-registration in psychology journals, we learned about a study titled “Preregistration in practice: A comparison of preregistered and non‑preregistered studies in psychology”, in which the authors (van den Akker et al., 2023) find:
Overall, our data indicate that preregistration has beneficial effects in the realm of statistical power and impact, but we did not find robust evidence that preregistration prevents p-hacking and HARKing (Hypothesizing After the Results are Known). (from the abstract of their paper)
So much for pre-registration being the best and possibly only solution to p-hacking, no?!
It certainly is a surprising result, and one that contradicts both (my) intuition and other empirical findings. When you impose constraints such as pre-registration, it seems self-evident that p-hacking and HARKing will be reduced. And Scheel, Schijen, & Lakens (2021), for example, comparing the standard psychology literature with registered reports, found an excess of positive results in the standard literature, and quite dramatically so:
We compared the results in published RRs (N = 71 as of November 2018) with a random sample of hypothesis-testing studies from the standard literature (N = 152) in psychology. Analyzing the first hypothesis of each article, we found 96% positive results in standard reports but only 44% positive results in RRs. (from the abstract of their paper)
This seems very much in line with (my) intuition.
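To get a feel for how large the gap in the Scheel et al. abstract is, here is a rough back-of-the-envelope check. It is only a sketch: the counts of positive results are reconstructed from the rounded percentages in the abstract (an assumption, not figures taken from the paper), and the pooled two-proportion z-test is my choice of illustration, not necessarily the analysis the authors ran.

```python
import math

def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test; returns (difference, z, two-sided p)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, z, p_value

# Sample sizes as quoted in the abstract.
n_standard, n_rr = 152, 71
# ASSUMPTION: counts reconstructed from the rounded 96% and 44%.
pos_standard = round(0.96 * n_standard)
pos_rr = round(0.44 * n_rr)

diff, z, p_value = two_proportion_z(pos_standard, n_standard, pos_rr, n_rr)
print(f"difference in share of positive results: {diff:.2f}")
print(f"z = {z:.1f}, two-sided p = {p_value:.1e}")
```

Under these assumptions the difference is far too large to be attributed to sampling noise. What the test cannot tell us is why the two literatures differ.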
Granted, registered reports are not the same as pre-registration. The latter is, however, a necessary condition for registered reports, which pair pre-registration with peer review and a decision to publish before the results are known.
Relatedly, Kvarven et al. (NHB 2019, correction 2020), comparing meta-analyses and preregistered multiple-laboratory replication projects, find:
The multiple-laboratory replications provide precisely estimated effect sizes that do not suffer from publication bias or selective reporting. We searched the literature and identified 15 meta-analyses on the same topics as multiple-laboratory replications. We find that meta-analytic effect sizes are significantly different from replication effect sizes for 12 out of the 15 meta-replication pairs. These differences are systematic and, on average, meta-analytic effect sizes are almost three times as large as replication effect sizes. (from their original abstract)
The van den Akker et al. results are thus a puzzle and, as the authors themselves write, came “unexpectedly” (see their conclusion and discussion). It is likely that selection bias confounds their results: studies whose authors chose to pre-register may differ systematically from those whose authors did not. They report, for example, that pre-registered studies “more often contained a power analysis and larger sample sizes than non-preregistered studies. … and that preregistered studies had a greater impact in terms of citations, Altmetric Attention Score, and journal impact factor than non-preregistered studies.” (from the conclusion and discussion section)
Plus there is of course the interesting question of how you reliably determine, for non-preregistered studies, the degree to which p-hacking and HARKing did indeed happen.
Update 2024_01_23: I just came across this paper by Brodeur et al.: Brodeur, Abel and Cook, Nikolai and Hartley, Jonathan and Heyes, Anthony, Do Pre-Registration and Pre-Analysis Plans Reduce p-Hacking and Publication Bias? (December 15, 2022).
Randomized controlled trials (RCTs) are increasingly prominent in economics, with pre-registration and pre-analysis plans (PAPs) promoted as important in ensuring the credibility of findings. We investigate whether these tools reduce the extent of p-hacking and publication bias by collecting and studying the universe of test statistics, 15,992 in total, from RCTs published in 15 leading economics journals from 2018 through 2021. In our primary analysis, we find no meaningful difference in the distribution of test statistics from pre-registered studies, compared to their non-pre-registered counterparts. However, pre-registered studies that have a complete PAP are significantly less p-hacked. These results point to the importance of PAPs, rather than pre-registration in itself, in ensuring credibility. (from their abstract)
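How might one look for p-hacking in a distribution of test statistics? One approach used in this literature is a caliper test: count the test statistics that fall just above and just below a significance threshold. Absent p-hacking and publication bias, the two counts should be roughly equal in a narrow window. The sketch below is illustrative only. The function name, the window width, and the z-values are all made up for the example, and I am not claiming this is the exact procedure Brodeur et al. used.

```python
import math

def caliper_test(z_stats, threshold=1.96, width=0.2):
    """Compare counts of |z| just above vs. just below the threshold.

    Returns (above, below, p_value), where p_value is the one-sided exact
    binomial probability of seeing at least `above` statistics on the
    upper side if either side were equally likely.
    """
    above = sum(1 for z in z_stats if threshold <= abs(z) < threshold + width)
    below = sum(1 for z in z_stats if threshold - width <= abs(z) < threshold)
    n = above + below
    # Exact binomial tail P(X >= above) with success probability 0.5.
    p_value = sum(math.comb(n, k) for k in range(above, n + 1)) / 2 ** n
    return above, below, p_value

# HYPOTHETICAL z-statistics, invented purely for illustration.
z_stats = [0.50, 1.80, 1.85, 1.90, 1.97, 1.98, 2.00,
           2.01, 2.05, 2.10, 2.12, 2.14, 3.20]

above, below, p_value = caliper_test(z_stats)
print(f"just above 1.96: {above}, just below: {below}, p = {p_value:.3f}")
```

A pile-up just above the threshold is suggestive rather than conclusive. It can reflect p-hacking, publication bias, or both, and the result is sensitive to the chosen window width.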
As an aside, there’s a (widespread) misconception that pre-registration prevents you from reporting what you did not pre-register. Not so. See this recent pre-print:
“When and How to Deviate from a Preregistration” (PsyArXiv Preprints, osf.io)
The point of pre-registration is not to report only what you pre-registered but to separate clearly what you pre-registered from what you did not. The latter is then flagged as exploratory and ex post.
I personally have pre-registered all my studies since 2019 or so. In my view the benefits outweigh the costs.