Author manuscript; available in PMC 2025 Mar 2.
Published in final edited form as: Assessment. 2024 Sep 11;32(2):235–243. doi: 10.1177/10731911241275256

On the Use and Misuses of Preregistration: A Reply to Klonsky (2024)

Colin E. Vize, Nathaniel L. Phillips, Joshua D. Miller, Donald R. Lynam

Abstract

In his commentary, Klonsky (2024) outlines several arguments for why preregistration mandates (PRMs) will have a negative impact on the field. Klonsky’s overarching concern is that when preregistration ceases to be a tool for research and becomes an indicator of quality itself (a primary example being preregistration badges), it loses its intended benefits. Separate from his concerns surrounding policies like preregistration badges, Klonsky also critiques the practice of preregistration itself, arguing that it can impede our use of other valuable research tools (e.g., multiverse analyses, exploratory analyses). We provide a response to Klonsky’s concerns about preregistration and related policies. First, we provide conceptual clarification on the purpose of preregistration, which was missing in Klonsky’s commentary. Second, with a clearer conceptual framework, we highlight where some of Klonsky’s concerns are warranted, but also highlight where Klonsky’s concerns, critiques, and proposed alternatives to the use of preregistration fall short. Third, with this conceptual understanding of preregistration, we briefly outline some challenges related to the effective implementation of preregistration in psychological science.

Keywords: Preregistration, falsification, severity, open science, methodological reform


Klonsky (2024) outlines several arguments for why preregistration becoming a norm in psychological science will be detrimental to the field. His primary concern is that preregistration will cease to be a tool for research and will instead become a target itself (a primary example being preregistration badges), at which point it cannot achieve its intended goal. Klonsky points to psychology’s unfortunate history of turning tools into superficial indicators of robust findings (e.g., p < .05), which in turn has led to negative consequences (e.g., in Klonsky’s view, it is the driving force behind the replication crisis in psychology). Although Klonsky’s overarching criticism is focused on how the mere presence of a preregistration badge or document is being conflated with strong science, he also criticizes preregistration more generally, arguing that it can impede our use of other valuable research tools (e.g., multiverse analyses, exploratory analyses).

Specifically, Klonsky (2024) outlines several concerns with what he calls “preregistration mandates” (PRMs). Klonsky defines PRMs as policies that seek to elevate preregistration to a norm in the field, best exemplified by preregistration badges (p. 3; emphasis ours). But Klonsky also uses PRMs to refer to preregistration becoming a norm in the field in general, which he believes will negatively impact ideal scientific practice.1 Labeling these distinct implementations of preregistration with the same PRM term blurs legitimate concerns about unintended consequences of preregistration policies (e.g., Campbell’s Law and the inadvertent consequences of preregistration badges) with misguided concerns about the practice of preregistration itself (e.g., that preregistration becoming a norm in the field will impede ideal scientific practice). In what follows, we separate these distinct components of Klonsky’s criticism and discuss them accordingly.

The focus of the present response is threefold. First, we provide conceptual clarification surrounding the goal of preregistration, which was missing from Klonsky’s commentary. Second, by providing a clearer conceptual background, we highlight where some of Klonsky’s concerns about preregistration remain warranted, but also highlight where Klonsky’s criticisms and proposed alternatives to the use of preregistration fall short. Third, with a clearer understanding of preregistration in mind, we briefly outline some challenges and, importantly, opportunities regarding the effective implementation of preregistration in psychological science.

Campbell’s Law and Methodological Reform in Psychology

Klonsky’s primary concern is that preregistration incentives like badges are becoming an example of Campbell’s Law, which states that the more a tool or indicator is used for decision-making, the more likely it is to distort the process it was initially designed to monitor (Campbell, 1979). He highlights empirical evidence suggesting that preregistration badges in flagship journals like Psychological Science are not functioning as intended (e.g., published papers with preregistration badges contain non-trivial, undisclosed deviations from the preregistration; Claesen et al., 2021). Overall, Klonsky’s concern about Campbell’s Law and the potential for history to repeat itself with regard to incentivization policies for preregistration (i.e., badges) is well-placed. It is critical that we learn from past missteps and work to ensure that we understand the ways preregistration can and cannot improve our science. Moreover, we share Klonsky’s broader concerns about common research practices that negatively impact the credibility of research findings (e.g., undisclosed use of researcher degrees of freedom; Simmons et al., 2011), but we also share his excitement for the tangible methodological reforms occurring in psychological science.

Although Klonsky’s commentary on preregistration and Campbell’s Law raises an important concern about the mere use of a tool becoming a shallow indicator of rigor, many of his other concerns (i.e., those focused on preregistration becoming a norm in the field) reflect misunderstandings of the principles and applications of preregistration. We believe that when used as intended, preregistration as a norm will improve the field. Importantly, as Klonsky notes, the “used as intended” part is the crux. In order for preregistration to be used as intended, prominent misconceptions must be addressed.

Clarifying the Goal of Preregistration

In his commentary, Klonsky does not provide a clear definition of what preregistration is designed to do. Klonsky names a variety of potential benefits related to the use of preregistration, including improved replication rates, increased thoughtfulness in study planning, increased statistical power, clear demarcation of confirmatory and exploratory tests, and the potential to constrain biasing effects of researcher degrees of freedom. However, these are better understood as indirect consequences or “positive externalities” of preregistration (Lakens, 2019). There is nothing about preregistration that guarantees these secondary benefits will be realized—one can always preregister an underpowered, poorly designed study with inappropriate analyses, vague hypotheses, poorly chosen measures, and non-specific inferential criteria (Rubin, 2020; Szollosi et al., 2020). Moreover, preregistration will not necessarily stop researchers from inadvertently capitalizing on researcher degrees of freedom (Navarro, 2020). We agree with Klonsky that preregistration is not a sufficient condition for good science. It should not, by itself, be taken as a marker of credibility.

Because preregistration cannot guarantee the benefits listed by Klonsky, it is essential to be clear about what preregistration is intended to do: the goal of preregistration is to increase the transparency of the research process in order to allow evaluation of the severity with which a particular hypothesis or empirical claim has been tested (Lakens, 2019). Preregistration is not designed to increase credibility per se; it is designed to increase transparency, which allows evaluations of credibility (i.e., severity). A test is considered severe when it has a high probability of demonstrating that a claim is false when it is, in fact, false (Mayo, 2018; Lakens, 2019). More generally, the principle of severity provides an answer to the question, “When do data provide good evidence in support of a claim or hypothesis?” (Mayo & Spanos, 2011). The short answer is that more severe tests provide better evidence.
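
For readers who want this logic stated formally, a minimal sketch of the severity requirement can be written as follows (the notation is ours, adapted from Mayo, 2018, and is not a formulation given by Klonsky):

\[ \Pr\big(\text{test } T \text{ yields a result that accords with claim } C \text{ as well as the observed data do} \;\big|\; C \text{ is false}\big) \text{ is low.} \]

That is, had the claim been false, the test would very probably have produced a result that fits the claim less well than the observed result does. Undisclosed analytic flexibility makes this probability impossible to assess, which is precisely the gap that transparency is meant to close.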

The concept of a “severe test” clearly applies to straightforward confirmatory research questions in psychology (e.g., “Does a therapeutic intervention reduce symptoms of depression?”; “Does amygdalar activation predict psychopathy?”), but evaluating the severity of a test is important for various other empirical claims in psychology as well (e.g., “Does a newly developed scale demonstrate predictive improvements over an established scale?”; “Is a measure unidimensional or does it contain subfactors?”; “Which hierarchical model fits the data best?”; “How strongly does psychopathy in adolescence predict psychopathy in adulthood?”). Thus, the goal of preregistration is relevant for a wide array of psychological research questions.

Being able to transparently evaluate the severity of a test is critical because, in many areas of psychology, available theories do not constrain all possible choices (e.g., which measurement instrument to use, how to exclude outliers, which covariates to include; Lakens, 2019). Others have highlighted that the various crises in psychology (e.g., the replication crisis, the generalizability crisis, the theory crisis) share a core problem: empirical claims are not based on severe tests (Claesen et al., 2022). The field would likely be in a different place if it were aware of how each study was conducted and, thus, of the severity of the test it offered. We would not weigh very heavily papers that extensively searched the data for significant results and then wrote the introduction to predict those findings. We would not weigh very heavily papers that reported the single study out of 10 that found significant results. We would not weigh very heavily papers that determined outlier status or covariate inclusion based on what these decisions did to the results. When such decisions remain opaque, severe tests cannot be distinguished from non-severe ones. The field requires transparency if it is to move forward—this is the core of Open Science.

Klonsky’s claims about the benefits of preregistration (e.g., restraining researcher degrees of freedom, increasing power) are not unique to his commentary; these secondary benefits are oft-cited reasons to preregister a study. There is correlational evidence suggesting that preregistration, particularly when peer reviewed (i.e., in a Registered Report format), is related to increased rigor via these secondary benefits (e.g., Soderberg et al., 2021). Nonetheless, the field’s shift toward understanding preregistration as a tool to increase transparency in order to allow the evaluation of a test’s severity reflects conceptual advances regarding preregistration and helps connect its use to an explicit philosophy of science (Lakens, 2019; Mayo & Spanos, 2011; Meehl, 1990). What is most noteworthy about this conceptual and theoretical development is that it explains why researchers want to know which analyses were exploratory versus confirmatory, what the statistical power was for a primary test, whether researchers examined a separate outcome after initial null findings, and so on: this information is critical to evaluating the extent to which a claim has been severely tested. Underpowered, exploratory studies with numerous undisclosed researcher degrees of freedom provide very weak (i.e., non-severe) tests, and absent the transparency provided by preregistration, these weak tests cannot be distinguished from stronger ones. With this conceptual clarification in mind, we are better able to evaluate the validity of Klonsky’s concerns with preregistration.

First, Klonsky’s concerns about the presence of preregistration being treated as a sufficient indicator of robustness still deserve attention. We agree that viewing preregistration in this way lends itself to the kind of “checklist science” that Klonsky rightfully critiques. As a field, we must correct course in light of evidence that preregistration is, at times, being used not as a tool but as a superficial signal of rigor. Although a preregistration badge or a link to a preregistration provides some indication of transparency, as noted by Klonsky, it tells us nothing about the quality of the study. We echo Klonsky’s call for researchers to resist simplistic thinking that regards the presence or absence of a research tool as a sufficient indicator of either good or bad science. As such, it is worth considering whether preregistration badges serve much purpose at this stage of the reform movement.2 The conceptual framework for preregistration offered above also clarifies why the use of preregistration does not itself serve as an indicator of robust science. Preregistration simply allows others to evaluate the extent to which a particular claim has been severely tested; it makes the process more transparent and, thus, open to evaluation. Preregistration does not, in itself, guarantee a severe test. In some cases, a preregistration will reveal that the methodological approach produced a test that lacked severity and was, therefore, unable to provide strong evidence for a hypothesis or empirical claim. In other cases, there may be so many deviations from the preregistration that readers will take the findings with a much larger, and more appropriate, grain of salt than they would have in the absence of such a preregistration (Lakens, 2024).

Importantly, the conceptual clarification offered above also indicates why Klonsky’s separate concern over preregistration becoming a norm is misguided. It is difficult to see how it would harm the field if all studies were made more transparent through preregistration in order to facilitate evaluation of the severity of the tests they provided. Although this depends on researchers assembling preregistrations that do, in fact, increase transparency and facilitate this kind of evaluation (and on those preregistrations being read and evaluated by peer reviewers and subsequent readers), it is important to separate criticisms of a tool from criticisms of the tool’s misuse. Klonsky’s overarching concern is focused on the misuse of preregistration, and advocates of preregistration will find no reason to disagree with concerns about its misuse. However, Klonsky’s other concerns, focused on the use of preregistration itself, reflect a misunderstanding of preregistration’s limits.

Misunderstanding the Limits of Preregistration

Separate from his concerns about the presence of preregistration being uncritically accepted as an indicator of strong science, Klonsky highlights four concerns with preregistration becoming the norm. First, Klonsky argues that preregistration becoming prevalent will “establish a norm that researchers should choose one or a subset of reasonable analyses and not others” (p. 14). This is simply not true, but it is a common misperception regarding preregistration. A number of previous articles have highlighted this misperception and discussed how preregistration can accommodate multiverse analyses or the various other robustness checks described by Klonsky (e.g., see Box 2 in Hardwicke & Wagenmakers, 2023; Table 1 in Chambers & Tzavella, 2022; Figure 1 in Simmons et al., 2021). Klonsky does not provide data to support his claim that preregistrations offer “a blanket reward for a selective rather than comprehensive approach to understanding data, even though the latter approach is optimal.” In fact, when used together, robustness checks and preregistration can be a particularly effective way to demonstrate that a claim or hypothesis has been severely tested. The same is true for Klonsky’s emphasis on replication. The impact of replication results can be amplified by preregistration; one need look no further than Bem (2011) to understand why non-preregistered replications can be far from compelling. Importantly, and ironically, all of Klonsky’s methodological alternatives to preregistration (i.e., robustness checks, multiverse analyses, and replications) can most effectively advance the field when they are preregistered and, thus, transparent.

When preregistration is understood as a tool to increase transparency in order to facilitate the evaluation of severity, there is no reason to fear that it will detrimentally confine researchers to a limited set of analyses. Researchers should select the analysis or set of analyses that will provide the most severe test(s) of their hypotheses. For example, Watts and colleagues (2023), in discussing the “dos and don’ts” of factor analysis, explicitly encourage researchers to examine multiple factor models, rotations, and extraction approaches because of the “value in approaching statistical modeling using varied methods to converge on shared, and thus robust, conclusions and to minimize the likelihood of reporting spurious effects” (p. 114). In this case, such explorations can be explicitly built into a detailed preregistration so as to (a) increase the severity of the test (of the factor models) and (b) provide some protection for authors from reviewers who may object to this comprehensive, thorough analytic approach.

Second, Klonsky argues that preregistration becoming the norm will “devalue analyses conceived after engaging with the data. From the PRM [Preregistration Mandate] perspective, only the former is badge-worthy, and the latter perhaps worthy of suspicion” (p. 15). This is true neither in principle nor in practice (e.g., see response 3 in Chambers et al., 2014). In principle, preregistration is designed to increase transparency in order to facilitate the evaluation of severity; thus, analyses conceived after engaging with the data can be appropriate (whether instigated by authors or by peer reviewers), and these analyses may enhance, decrease, or be unrelated to the severity of a test. The important thing is that preregistration allows us to properly evaluate analyses conceived after engagement with the data and the context in which they were undertaken.

In practice, there are published guidelines for how to deviate from preregistrations (e.g., Lakens, 2024; Willroth & Atherton, 2024), which is difficult to square with Klonsky’s belief that these analyses are devalued in the context of preregistration. Lakens (2024) specifically outlines how to report deviations with the principle of severity in mind, and Willroth and Atherton (2024) provide a valuable template for researchers to use when reporting deviations from their preregistration. Moreover, Willroth and Atherton (2024) provide empirical data on journal editors’ perceptions of preregistration deviations. They found that, on average, editors indicated that reported deviations would either not affect their perception of a manuscript or would have a slightly positive impact on it, which does not support Klonsky’s claim that deviations are worthy of suspicion.

Third, Klonsky argues that preregistration becoming a norm will produce a different kind of “file drawer” problem. Research labs will continue to conduct numerous studies at a time, only some of which are selectively submitted for publication. However, when preregistration is the norm, the results that are selectively submitted for publication can attain badges vouching for their (unwarranted) credibility. Klonsky does not specify how and when preregistration is being carried out in this hypothetical scenario, but assuming that all studies are preregistered before they are conducted, we argue that preregistration can help bring about the exact opposite of the effect Klonsky suggests (i.e., preregistration can combat publication bias). Specifically, when null findings can be shown to be the product of a severe test (an evaluation facilitated by preregistration), these research labs should want to publish all their findings, not just positive ones. Indeed, in a footnote, Klonsky excludes registered reports from his concern about this new kind of publication bias, but he does not articulate why registered reports would be immune from his concern. We argue it is largely because of their use of preregistration.3 More specifically, by ensuring that any result, null or otherwise, is the product of a severe test, we can be more confident that the result is informative and thus that there is good reason for it to be published. Preregistration can help us differentiate null results that are the product of a severe test from null results that are uninformative (e.g., when an underpowered test produces a null result).

Last, Klonsky states that preregistration becoming the norm will lead to various forms of “preregistering after the results are known,” or PARKing (Yamada, 2018). Klonsky notes that in some cases (e.g., lying in a preregistration about previous access to data), PARKing is simply fraud. We agree, and it should be labeled, described, and sanctioned as such. However, Klonsky also provides a different hypothetical example of PARKing that he believes would be more common. In his example, a student conducts a partial exploration of a dataset, discovers an interesting relation between variables, and then, on the advice of their advisor, stops exploring and drafts a preregistration. First, without knowing what information is reported in the preregistration, there is nothing inherently problematic about Klonsky’s example. What matters is transparently reporting how the analyses were carried out so that their severity can be evaluated. If the student and advisor claim in their preregistration that they had no prior knowledge of the data, that would be fraudulent behavior that should be considered scientific misconduct. In a more scientifically rigorous (and ethical) scenario, the student and mentor would transparently report in the preregistration that their subsequent analyses were based on some initial exploration of relations in the data. We see no issue with their behavior per se, though their approach is likely to have implications for the severity of their subsequent tests, as it should. If the student and mentor transparently report their approach, preregistration can function as intended and judgments about the strength of their findings can be appropriately calibrated. Without the transparency provided by preregistration, this calibration cannot be applied.

Taken together, we believe that Klonsky’s concerns about preregistration becoming a norm are products of misunderstanding the purpose of preregistration and of a failure to attend to the growing meta-scientific literature surrounding it. Regarding the latter point, Klonsky relies on Pham and Oh (2021), McDermott (2022), and hypothetical scenarios to criticize preregistration, with the unfortunate consequence that he repeats common misunderstandings about preregistration (e.g., that preregistration devalues exploratory analyses or suppresses creativity). There is a rapidly expanding meta-science literature providing conceptual and empirical overviews of the benefits and limitations of preregistration (see Lakens et al., 2024, for a recent review). Engagement with this literature can help increase the understanding of preregistration and its nuances, hopefully improving its use in the field.

Preregistration’s Place in Methodological Reforms

In the final section of his commentary, Klonsky argues that the best way to improve our science is sociocultural in nature—we must build a scientific culture that celebrates robust findings. In his opinion, this sociocultural change will serve as the “straitjacket” against practices that reduce the robustness of our science. But this final section of the commentary highlights an important inconsistency in Klonsky’s views. Although Klonsky believes that researchers cannot be trusted to use preregistration honestly or to avoid selectively publishing only significant findings, this cynicism is notably absent from his beliefs about researchers’ responsiveness to sociocultural pressures to produce robust findings. Presumably, if researchers are responsive to the kind of sociocultural pressures encouraged by Klonsky, there is no reason to think that they cannot use preregistration effectively. We agree with Klonsky that sociocultural change is essential but highlight that preregistration, when understood appropriately, is an essential tool in the push for the kinds of sociocultural change that are necessary to build a more robust psychological science.

Thus, the most immediate challenge regarding preregistration is to clarify its intended purpose, which is to increase transparency in order to allow evaluation of the severity of a test. As this conceptual understanding of preregistration increases, researchers will be able to more effectively implement preregistration in their own work, understand when it may be less appropriate, and also more effectively evaluate the preregistrations of other researchers. Importantly, increasing the field’s understanding of preregistration is also the best remedy to researchers mistakenly viewing the mere presence of a preregistration (e.g., a preregistration badge) as an indicator of strong science. An accurate understanding of preregistration will help researchers avoid thinking that a preregistration (or badge) tells us anything about the quality of the research, in the same way that an accurate understanding of p values helps researchers avoid the common inferential pitfalls associated with “p < .05”.

Relatedly, concerns about researchers who engage in preregistration only to reap the benefits of a preregistration badge (e.g., other researchers’ belief that the results are more robust) are also addressed by an improved understanding of what preregistration is designed to do. Researchers who preregister because they know it will give them an advantage in getting otherwise low-quality results published are bad-faith actors who capitalize on others’ lack of understanding of preregistration and its purpose; they will be “successful” only to the extent that researchers continue to misunderstand that purpose. Although we believe that greater conceptual understanding of preregistration’s purpose will help mitigate the belief that mere use of preregistration is an indicator of rigor, this is an empirical question ripe for future meta-scientific research. For now, we must continue to work to convey what preregistration is designed to do and push back against misunderstandings when they arise (e.g., the belief that if a study is preregistered, we can trust the findings).

Practically, researchers have developed many tools to aid in the effective implementation of preregistration, including guidelines for preregistering secondary data analyses (van den Akker et al., 2021), for using preregistration in clinical assessment research (Tackett et al., 2019), for preregistering qualitative studies (Haven et al., 2020; but also see Rubin, 2023), and for deviating from preregistrations (Willroth & Atherton, 2024). Meta-science research has also begun to provide empirical data on the use of these tools. There is emerging evidence that more structured preregistration templates can increase the rigor and transparency of preregistrations (Bakker et al., 2020; van den Akker et al., 2023), and such tools are valuable aids in bringing the practice of preregistration in line with its goal. As others have noted, developing an effective preregistration is difficult (Nosek et al., 2019). Creating and evaluating preregistrations will often require domain expertise and, like other skills, will take time to develop (Hardwicke & Wagenmakers, 2023). As advocates of preregistration ourselves, we note that our own preregistrations have grown more detailed and nuanced with practice and suspect that such growth and improvement is common.

Nonetheless, an immediate practical concern is how preregistrations are assessed during peer review.4 Although it may not always be necessary to evaluate a preregistration (e.g., a fatal study design flaw may be identifiable in the manuscript alone), there is evidence from some journals that preregistrations are rarely accessed during the review process (e.g., Syed, 2023). However, it may be unreasonable to ask an already overburdened peer review system to take on the additional responsibility of evaluating both the preregistration and the primary manuscript. A valuable direction journals can explore is creating paid roles for experts, or teams of experts, to evaluate preregistrations and provide summaries of relevant discrepancies to editors and/or reviewers. For example, Psychological Science and Clinical Psychological Science have specific editorial teams that evaluate the various open science practices in submitted and accepted manuscripts, albeit in a limited way given the limited resources available.5 These kinds of implementation issues are highly relevant as the use of preregistration expands. Journals and publishers, who benefit from their alignment with the Open Science movement and the perception of greater credibility, must provide the resources necessary to ensure that preregistrations are vetted during the review process rather than leaving this work entirely to overworked and unpaid reviewers.

While these structural-level policies will take time to implement and streamline, we also strongly encourage researchers who preregister their studies to report deviations from their preregistrations using a comprehensive template like the one provided by Willroth and Atherton (2024). Such templates can substantially increase readers’ ability to evaluate a deviation’s influence on severity and help bring the practice of preregistration in line with its intended goal. We also note that preregistrations have benefits even when they are not evaluated during the review process: in an increasingly “online” academic world, post-publication review has become more common. For instance, in a hypothetical case, scholars may review a preregistration and its associated publication after the latter has appeared and point out important deviations that went unreported and unevaluated. The field may then come to see that the initial test was less severe than previously assumed and update its priors accordingly.

We also encourage increased use of the registered report format, whereby a study preregistration benefits from peer review prior to data collection or data analysis. By explicitly requiring peer review of preregistrations, registered reports help ensure that (1) the preregistration is actually evaluated, (2) the preregistration provides sufficient transparency about the planned approach, and (3) the approach is maximally severe so that results, null or otherwise, are informative. The registered report format is the most effective way to ensure that preregistration achieves its intended goal. Although the Stage 1 review process in a registered report can still miss important methodological shortcomings of a planned study or analysis, oversights are far more likely to be detected and adjusted prior to the study being completed than when a preregistration does not undergo review. Despite the fact that more and more journals offer the registered report format, the actual number of published registered reports remains low. For example, Montoya and colleagues (2021) found that, at the time of their review, 278 journals had adopted the registered report format (137 of which were psychology journals); however, most journals (71.58%) had yet to publish a registered report. The field will benefit significantly from more researchers taking advantage of the registered report format.

Ultimately, preregistration is simply one means among many designed to help improve the robustness of psychological science. Like Klonsky (2024), we believe our science will benefit from using all the open science tools at our disposal (e.g., sharing code and data, registered reports, preregistration). Many researchers may be skeptical of preregistration because it implies that researchers are not capable of developing and enacting a severe testing approach without it. It is true that researchers can develop severe tests without preregistration, and preregistration does not automatically increase the severity of a test (Lakens, 2019). As Vazire (2019) has written, “Transparency doesn’t guarantee credibility; transparency and scrutiny together guarantee that research gets the credibility it deserves.” In other words, although transparency is not sufficient for credibility, it is necessary for it. Preregistration is the single tool designed to increase the transparency of the research process so that others can evaluate the severity with which a claim has been tested, and it thus provides an additional way for others to see that researchers are conducting rigorous work and to evaluate that work appropriately. Like Klonsky, we believe that “we must create a culture that reveres and celebrates robust findings as much as it once revered and celebrated flashy but fragile findings” (p. 23). Preregistration, when understood and used appropriately, is a tool that aligns with Klonsky’s vision for the future and can help support this cultural change.

Footnotes

1

As one example, Klonsky (2024) writes, “We can only evaluate the robustness or fragility of particular statistical findings by examining what happens when different reasonable versions of the analysis are conducted. Converging findings support robustness, diverging findings suggest fragility. Many ideas for robustness checks will only be stimulated after engaging with the data. This is not a problem scientifically! Ideal data analytic practice requires constant care and thought both before and after engaging with data. However, by selectively rewarding only one part of the data analytic enterprise, PRMs disincentivize other critical parts of the process, including robustness checks, and even imply that post-hoc analyses are of inferior relevance when the opposite is often true.” (p. 18). It is clear that Klonsky’s use of PRMs in this context refers to the practice of preregistration itself, which is distinct from incentivization policies like preregistration badges.

2

We note that Psychological Science has already discontinued the use of preregistration badges for published articles (see Hardwicke & Vazire, 2023), due in part to the evidence cited by Klonsky (i.e., Claesen et al., 2021). To provide additional context, we note that badges at Psychological Science were designed to be temporary and not “an ideal end state” (Eich, 2014) nor “an end in themselves” (Lindsay, 2019). This additional context does not diminish the importance of Klonsky’s concerns surrounding Campbell’s Law (in fact, it suggests that despite people’s best intentions, Campbell’s Law still wins out), but it does provide room for optimism—the reform movement appears to be far more responsive to evidence of problems than the field was to evidence of other, more impactful issues (e.g., concerns about insufficient statistical power went unheeded for decades; Sedlmeier & Gigerenzer, 1989). However, many journals still offer badges for preregistration, so Klonsky’s concerns about badges remain relevant.

3

Registered reports also provide peer review of the proposed study or analysis plan at Stage 1, which allows researchers to receive feedback on the severity of their planned test(s). Thus, registered reports provide a form of quality control that is not available in typical preregistrations. Registered reports have an additional component that helps combat publication bias: manuscripts receive in-principle acceptance after peer review at Stage 1. In-principle acceptance means that the manuscript will be published regardless of the nature of the results (i.e., null or positive) following review at Stage 2, so long as no deviations have occurred that impact the severity of the test(s). This added benefit of registered reports protects against publication bias and against bias on the part of reviewers (e.g., reviewers rejecting a paper based on results they do not find favorable). We believe these additional components of registered reports add important value beyond simple preregistration, but we argue that the development of a severe testing strategy at Stage 1, strengthened by peer review, is what makes null results publishable (i.e., informative) in the context of registered reports.

4

While we focus on preregistration during the peer review stage, preregistration can continue to serve its intended goal after peer review.

5

Personality Disorders: Theory, Research, and Treatment also has an Open Science Consulting Editor (thus far, a doctoral student interested in open science) who reviews all relevant preregistrations and provides the editor and authors with feedback on discrepancies. To date, however, this has largely been an unpaid position due to a lack of available funds from APA.

Conflict of Interest Statement: All authors declare that they have no conflict of interest to report.

Contributor Information

Colin E. Vize, University of Pittsburgh

Nathaniel L. Phillips, University of Georgia

Joshua D. Miller, University of Georgia

Donald R. Lynam, Purdue University

References

  1. Bakker M, Veldkamp CLS, van Assen MALM, Crompvoets EAV, Ong HH, Nosek BA, Soderberg CK, Mellor D, & Wicherts JM (2020). Ensuring the quality and specificity of preregistrations. PLOS Biology, 18(12), e3000937. 10.1371/journal.pbio.3000937
  2. Bem D (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100, 407–425.
  3. Campbell DT (1979). Assessing the impact of planned social change. Evaluation and Program Planning, 2(1), 67–90. 10.1016/0149-7189(79)90048-X
  4. Chambers CD, Feredoes E, Muthukumaraswamy SD, & Etchells PJ (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1, 4–17.
  5. Chambers CD, & Tzavella L (2022). The past, present and future of Registered Reports. Nature Human Behaviour, 6, 29–42. 10.1038/s41562-021-01193-7
  6. Claesen A, Gomes S, Tuerlinckx F, & Vanpaemel W (2021). Comparing dream to reality: An assessment of adherence of the first generation of preregistered studies. Royal Society Open Science, 8(10). 10.1098/rsos.211037
  7. Claesen A, Lakens D, Vanpaemel W, & van Dongen N (2022). Severity and crises in science: Are we getting it right when we’re right and wrong when we’re wrong? OSF. 10.31234/osf.io/ekhc8
  8. Eich E (2014). Business not as usual. Psychological Science, 25(1), 3–6. 10.1177/0956797613512465
  9. Hardwicke TE, & Wagenmakers E (2023). Reducing bias, increasing transparency, and calibrating confidence with preregistration. Nature Human Behaviour, 7. 10.1038/s41562-022-01497-2
  10. Haven TL, Errington TM, Gleditsch KS, van Grootel L, Jacobs AM, Kern FG, Piñeiro R, Rosenblatt F, & Mokkink LB (2020). Preregistering qualitative research: A Delphi study. International Journal of Qualitative Methods, 19, 1609406920976417. 10.1177/1609406920976417
  11. Lakens D (2019). The value of preregistration for psychological science: A conceptual analysis. Japanese Psychological Review, 62(3), 221–230. 10.31234/osf.io/jbh4w
  12. Lakens D (2024). When and how to deviate from a preregistration. Collabra: Psychology. 10.31234/osf.io/ha29k
  13. Lakens D, Mesquida C, Rasti S, & Ditroilo M (2024). The benefits of preregistration and Registered Reports. OSF. 10.31234/osf.io/dqap7
  14. Lindsay DS (2019). Swan song editorial. Psychological Science, 30(12), 1669–1673. 10.1177/0956797619893653
  15. McDermott R (2022). Breaking free: How preregistration hurts scholars and science. Politics and the Life Sciences, 41(1), 55–59. 10.1017/pls.2022.4
  16. Meehl PE (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1, 108–141.
  17. Montoya AK, Krenzer WLD, & Fossum JL (2021). Opening the door to registered reports: Census of journals publishing registered reports (2013–2020). Collabra: Psychology, 7(1), 24404.
  18. Navarro D (2020). Paths in strange spaces: A comment on preregistration. OSF. 10.31234/osf.io/wxn58
  19. Nosek BA, Beck ED, Campbell L, Flake JK, & Hardwicke TE (2019). Preregistration is hard, and worthwhile. Trends in Cognitive Sciences, 23(10), 815–818. 10.1016/j.tics.2019.07.009
  20. Pham MT, & Oh TT (2021). Preregistration is neither sufficient nor necessary for good science. Journal of Consumer Psychology, 31(1), 163–176. 10.1002/jcpy.1209
  21. Rubin M (2020). Does preregistration improve the credibility of research findings? The Quantitative Methods for Psychology, 16(4). 10.20982/tqmp.16.4.p376
  22. Rubin M (2023). Opening up open science to epistemic pluralism: Comment on Bazzoli (2022) and some additional thoughts. MetaArXiv. https://osf.io/dgzxa/download
  23. Sedlmeier P, & Gigerenzer G (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105(2), 309–316. 10.1037/0033-2909.105.2.309
  24. Simmons JP, Nelson LD, & Simonsohn U (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. 10.1177/0956797611417632
  25. Simmons JP, Nelson LD, & Simonsohn U (2021). Pre-registration is a game changer. But, like random assignment, it is neither necessary nor sufficient for credible science. Journal of Consumer Psychology, 31(1), 177–180. 10.1002/jcpy.1207
  26. Soderberg CK, Errington TM, Schiavone SR, Bottesini J, Thorn FS, Vazire S, Esterling KM, & Nosek BA (2021). Initial evidence of research quality of registered reports compared with the standard publishing model. Nature Human Behaviour, 5, 990–997. 10.1038/s41562-021-01142-4
  27. Syed M (2023). Some data indicating that editors and reviewers do not check preregistrations during the review process. OSF. 10.31234/osf.io/nh7qw
  28. Szollosi A, Kellen D, Navarro DJ, Shiffrin R, van Rooij I, Van Zandt T, & Donkin C (2020). Is preregistration worthwhile? Trends in Cognitive Sciences, 24(2), 94–95.
  29. Hardwicke TE, & Vazire S (2023). Transparency is now the default at Psychological Science. Psychological Science. 10.1177/09567976231221573
  30. Tackett JL, Brandes CM, & Reardon KW (2019). Leveraging the Open Science Framework in clinical psychological assessment research. Psychological Assessment, 31(12), 1386–1394. 10.1037/pas0000583
  31. van den Akker OR, van Assen MALM, Bakker M, Elsherif M, Wong TK, & Wicherts JM (2023). Preregistration in practice: A comparison of preregistered and non-preregistered studies in psychology. Behavior Research Methods. 10.3758/s13428-023-02277-0
  32. van den Akker O, Weston S, Campbell L, Chopik B, Damian R, Davis-Kean P, Hall A, Kosie J, Kruse E, Olsen J, Ritchie S, Valentine K, van ’t Veer A, & Bakker M (2021). Preregistration of secondary data analysis: A template and tutorial. Meta-Psychology, 5. 10.15626/MP.2020.2625
  33. Vazire S (2019). Do we want to be credible or incredible? APS Observer. https://www.psychologicalscience.org/observer/do-we-want-to-be-credible-or-incredible
  34. Watts AL, Greene AL, Ringwald W, Forbes MK, Brandes CM, Levin-Aspenson HF, & Delawalla C (2023). Factor analysis in personality disorders research: Modern issues and illustrations of practical recommendations. Personality Disorders: Theory, Research, and Treatment, 14(1), 105–117.
  35. Willroth EC, & Atherton OE (2024). Best laid plans: A guide to reporting preregistration deviations. Advances in Methods and Practices in Psychological Science, 7(1), 25152459231213802. 10.1177/25152459231213802
  36. Yamada Y (2018). How to crack pre-registration: Toward transparent and open science. Frontiers in Psychology, 9, 1831. 10.3389/fpsyg.2018.01831
