How to design a pre-specified statistical analysis approach to limit p-hacking in clinical trials: the Pre-SPEC framework

Brennan C Kahan; Gordon Forbes; Suzie Cro

doi:10.1186/s12916-020-01706-7

2020 Sep 7;18:253. doi: 10.1186/s12916-020-01706-7

PMCID: PMC7487509 PMID: 32892743

Abstract

Results from clinical trials can be susceptible to bias if investigators choose their analysis approach after seeing trial data, as this can allow them to perform multiple analyses and then choose the method that provides the most favourable result (commonly referred to as ‘p-hacking’). Pre-specification of the planned analysis approach is essential to help reduce such bias, as it ensures analytical methods are chosen in advance of seeing the trial data. For this reason, guidelines such as SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) and ICH-E9 (International Conference for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use) require the statistical methods for a trial’s primary outcome be pre-specified in the trial protocol. However, pre-specification is only effective if done in a way that does not allow p-hacking. For example, investigators may pre-specify a certain statistical method such as multiple imputation, but give little detail on how it will be implemented. Because there are many different ways to perform multiple imputation, this approach to pre-specification is ineffective, as it still allows investigators to analyse the data in different ways before deciding on a final approach. In this article, we describe a five-point framework (the Pre-SPEC framework) for designing a pre-specified analysis approach that does not allow p-hacking. This framework was designed based on the principles in the SPIRIT and ICH-E9 guidelines and is intended to be used in conjunction with these guidelines to help investigators design the statistical analysis strategy for the trial’s primary outcome in the trial protocol.

Keywords: Randomised trial, Pre-specification, Transparency, Bias, p-hacking

Background

Results from clinical trials depend upon the statistical methods used for analysis [1–5]. Different methods of analysis applied to the same trial can lead to different conclusions around effectiveness and safety [1–14]. Therefore, results from clinical trials can be susceptible to bias if investigators choose their analysis approach after seeing trial data, as this can allow them to perform multiple analyses and then choose the approach that provides the most favourable result. This is commonly referred to as ‘p-hacking’ and can lead to bias in treatment effect estimates, confidence intervals, and p values [1–5, 7–10, 12, 15]. Pre-specification of the planned analysis approach is therefore essential to help reduce such bias, as it ensures that analytical methods are chosen in advance of seeing the trial data [1–5, 7, 9, 10, 12]. The SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) and ICH-E9 (International Conference for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use) guidelines require that the method of analysis for the trial’s primary outcome be pre-specified in the trial protocol [1, 3, 4].

However, pre-specification is only effective if done in a way that does not allow p-hacking. For example, investigators may pre-specify a certain statistical method, such as multiple imputation to handle missing data, but give little detail on how it will be implemented. Yet there are many different ways to implement multiple imputation, such as including different variables in the imputation model and imputing under different statistical models. This approach to pre-specification is therefore ineffective, as it still allows investigators to analyse the data in many different ways before deciding on a final approach. This issue of ‘incomplete’ pre-specification, where methods are pre-specified to some extent but the specification still allows for some degree of p-hacking, is common in clinical trials (Table 1) [2–5]. For example, two reviews that examined trial protocols found that 11–20% of protocols did not specify the analysis model that would be used for the primary outcome, 42% did specify the model but omitted essential detail on how the model would be implemented, and 19% specified an approach that would allow the investigators to subjectively choose the final analysis model after seeing the trial data [2, 5].

Table 1.

Common issues in pre-specifying statistical analysis approaches in clinical trial protocols

Issue	Problems associated with issue	Aspect	Estimated prevalence^a
Omitting an aspect of the analysis approach	Investigators could run multiple analyses, and selectively report the most favourable	Analysis population	27–47%
		Analysis model	11–20%
		Covariates	27%
		Missing data	66–77%
Insufficient detail around an aspect of the analysis approach	Investigators could run multiple analyses, and selectively report the most favourable	Analysis population	64%
		Analysis model	42%
		Covariates	23%
		Missing data	17%^b
Analysis approach allows some aspects of the final analysis to be subjectively chosen based on trial data	Investigators could run multiple analyses, and selectively report the most favourable	Analysis model	19%
		Covariates	8%
Multiple analysis approaches specified, without one being identified as the primary	Investigators could selectively report the most favourable result, or elevate its importance compared to less favourable results	Analysis population	11%
		Analysis model	11%
		Covariates	9%
		Missing data	2%

^aBased on references [5] and [2]; one study evaluated protocols and published results for 70 randomised trials approved by the ethics committees for Copenhagen and Frederiksberg, Denmark in 1994–5; the other study evaluated 100 protocols of randomised trials indexed in PubMed in November 2016.

^b15/99 protocols gave insufficient detail around how they planned to implement multiple imputation, and 2/99 protocols gave insufficient detail around their planned inverse probability weighting procedure.

The SPIRIT and ICH-E9 documents contain guidance on what statistical content should be included in the trial protocol [1, 4], and there are also guidelines for the content of Statistical Analysis Plans [9]. These guidance documents contain some statistical principles which help to limit p-hacking (e.g. requiring that when multiple analysis strategies are planned, one of them is identified as the primary analysis); however, the primary aim of these guidelines is to describe what information should be included in the protocol or Statistical Analysis Plan, rather than describe exactly how the analysis should be designed. As such, these guidelines do not offer a prescriptive approach for how analysis strategies should be designed in order to limit p-hacking. In this article, we describe a framework for how a statistical analysis strategy could be designed to ensure it does not allow p-hacking (i.e. so that no part of the statistical methods can be chosen after seeing the trial data in order to ‘improve’ results) [2–4]. This framework was developed to be consistent with the statistical principles outlined in the SPIRIT and ICH-E9 guidelines (a comparison is shown in Additional file 1: Table S1) and is intended to be used in conjunction with these guidelines [1, 3, 4] to help investigators design the statistical analysis strategy for the trial’s primary outcome in the trial protocol.

The Pre-SPEC framework

We now outline the Pre-SPEC framework (Table 2). The five points are as follows: (1) pre-specify the analysis strategy before recruitment to the trial begins, (2) specify a single primary analysis strategy, (3) plan each aspect of the analysis, (4) provide enough detail that a third party could independently perform the analysis, and (5) use deterministic decision rules for adaptive analysis strategies. We expand on each of these points below.

Table 2.

Framework for pre-specifying a statistical analysis strategy (Pre-SPEC)

Pre-specify before recruitment	Pre-specify the analysis strategy before recruitment to the trial begins.
Single analysis strategy	Specify a single primary analysis strategy.
Plan each aspect	Each aspect of the planned analysis should be covered, including analysis population, statistical model, covariates, and handling of missing data.
Enough detail	Provide sufficient detail to allow a third party to independently perform the analysis (ideally through statistical code).
Choices made deterministically	For adaptive analysis strategies which use the trial data to inform some aspect of the analysis, use deterministic decision rules that prevent analysis choices being driven by results.

Pre-specify the analysis strategy before recruitment to the trial begins

Pre-specifying the analysis strategy before the trial begins ensures the choice of methods is not influenced by any trial data. This can give readers confidence that trial results are not due to p-hacking [1, 3, 4], as they will generally have no way to verify that analyses specified after the trial began were not based on trial data.

Pre-specifying the analysis approach for the trial’s primary outcome in the protocol before the trial begins is a requirement of both the SPIRIT and ICH-E9 guidelines (see Additional file 1: Table S1). For instance, ICH-E9 states that ‘… the principal features of its proposed statistical analysis should be clearly specified in a protocol written before the trial begins’, and ‘… the principal features of the eventual statistical analysis of the data should be described in the statistical section of the protocol. This section should include all the principal features of the proposed confirmatory analysis of the primary variable(s) and the way in which anticipated analysis problems will be handled’ [1], while SPIRIT states ‘The planned methods of statistical analysis should be fully described in the protocol’ and ‘The protocol should indicate explicitly each intended analysis comparing study groups. An unambiguous, complete, and transparent description of statistical methods facilitates execution, replication, critical appraisal, and the ability to track any changes from the original pre-specified methods’ [4].

Specify a single primary analysis strategy

Specifying a single primary analysis strategy ensures investigators cannot perform multiple analyses and then selectively report the most favourable as their main approach. There are often valid reasons to specify additional methods of analysis, for instance to answer different questions about the intervention (e.g. the effect of a treatment policy vs. the effect if everyone adheres [16]), or to assess the robustness of the main results to different assumptions about the data (e.g. sensitivity analyses for missing data [17]). In these instances, a single approach should be clearly labelled as the primary analysis strategy, with other approaches identified as sensitivity or supplementary analyses as appropriate [1, 3, 4].

Plan each aspect of the statistical analysis

Omission of a particular aspect from the analysis strategy could allow investigators to run multiple analyses for that aspect, and selectively report the most favourable. For example, if the analysis population is not specified, investigators could run both an intention-to-treat and per-protocol analysis and present whichever is most favourable.

The minimum set of essential aspects to cover is as follows:

Analysis population
Statistical model
The use of covariates
Handling of missing data

However, for many trials, there will be additional aspects to cover; for instance, a non-inferiority trial would need to specify the non-inferiority margin.

It is also useful to specify the target estimand [16] and what information will be presented from the analysis, such as the level of the confidence interval and the threshold for statistical significance if applicable.

Enough detail should be provided so that a third party could independently perform the analysis

There is often a substantial amount of detail required to implement an analysis. For example, using multiple imputation for missing data requires specification of the method of imputing data; this includes specifying which variables are included in the imputation model (and how they are included), whether multivariate normal, chained equations or some other imputation approach is used, the number of imputed datasets to be used, and how imputed datasets will be combined. Simply stating that multiple imputation will be used is not sufficient, as this allows the investigator to carry out multiple analyses based on different imputation approaches, each of which could give a different result.
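As a minimal sketch of the level of detail described above, a fully specified imputation approach might be written out in Stata along the following lines. The variable names (outcome, baseline, treat_group, strat1, strat2), the choice of chained equations with linear regression, the 50 imputed datasets, and the seed are all illustrative assumptions rather than recommendations; the point is only that each choice is stated explicitly.

```stata
* Illustrative only: variable names, imputation model, number of
* imputations, and seed are assumptions chosen for this sketch.
* Data are assumed to be one row per participant.

* Declare the multiple imputation structure and the variables to impute
mi set wide
mi register imputed outcome baseline
mi register regular treat_group strat1 strat2

* Imputation by chained equations: linear regression for each incomplete
* variable, with treatment group and stratification variables as predictors;
* 50 imputed datasets, fixed seed for reproducibility
mi impute chained (regress) outcome baseline = i.treat_group i.strat1 i.strat2, add(50) rseed(20200907)

* Analyse each imputed dataset and combine the results using Rubin's rules
mi estimate: regress outcome i.treat_group baseline i.strat1 i.strat2
```

Written this way, the variables in the imputation model, the imputation approach, the number of imputed datasets, and the method of combining them are each fixed in advance.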

Fully pre-specifying these details to such a degree that a third party could independently perform the analysis helps to ensure investigators cannot perform multiple analyses. A good test of whether there is sufficient detail is to write out the statistical code that would be used to implement the analysis in a statistical software program; if investigators are unable to write out their planned code, this likely means the analysis strategy is not sufficiently well specified. This code could be tested on a simulated (fake) dataset to ensure it performs as intended.
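One way to carry out such a check is sketched below in Stata: a fake dataset is simulated with a known treatment effect, and the planned analysis code is then run on it. The sample size, effect size, variable names, and the simple linear regression standing in for the 'planned analysis' are assumptions made for illustration, not part of any particular protocol.

```stata
* Illustrative only: all names and parameter values are assumptions.
clear
set seed 12345
set obs 200                                   // 200 fake participants

generate patient_id  = _n
generate treat_group = runiform() < 0.5       // simple 1:1 randomisation
generate strat1      = runiform() < 0.5       // fake binary stratification factor
generate baseline    = rnormal(50, 10)        // fake baseline measurement

* Outcome with a known (simulated) treatment effect of 5 units
generate outcome = 10 + 0.5*baseline + 5*treat_group + 2*strat1 + rnormal(0, 10)

* Make about 10% of outcomes missing completely at random
replace outcome = . if runiform() < 0.10

* Run the planned analysis code on the fake data
regress outcome i.treat_group baseline i.strat1
```

If the code runs without error and the estimated coefficient for treat_group is reasonably close to the simulated value of 5, this gives some reassurance that the code does what the protocol describes.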

An additional benefit to providing this code as a supplement to the description of the planned analysis in the protocol is that it leaves no room for ambiguity, and ensures all necessary detail is provided [18].

Adaptive analysis strategies should use deterministic decision rules

Sometimes investigators use adaptive analysis strategies, where some aspect of the final analysis is chosen based on the trial data. For instance, they may specify that either multiple imputation or a complete case analysis will be used depending on the level of missing data. Many clinical trials will not require such decision rules, as there will often be an available analysis approach which can provide valid results under minimal assumptions about the data. However, investigators may find these rules useful in certain settings where their preferred approach will depend on some features of the data, which are not known in advance.

Adaptive analysis strategies can be problematic if the decision rules are subjective, as this allows investigators to perform each potential analysis and selectively report the most favourable. For example, without a clear rule about when to use multiple imputation vs. complete cases, investigators could perform both and then select whichever gives a ‘better’ result.

In order to prevent decisions from being driven by results, adaptive analysis strategies should use deterministic decision rules for selection of the final analysis approach. A decision rule is deterministic if two different people are guaranteed to get the exact same result by following the rule. This removes the investigators’ ability to influence decisions and will therefore ensure results cannot be p-hacked. In the example above, investigators could specify that multiple imputation will be used if the level of missing outcome data is > 5%, and a complete case analysis will be used otherwise.
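The rule in this example could be written so that no judgement is involved in applying it. The following Stata sketch assumes one row per randomised participant and uses hypothetical variable names (outcome, treat_group, baseline) and an illustrative imputation model; only the 5% threshold comes from the example above.

```stata
* Illustrative only: variable names, the imputation model, the number of
* imputations, and the seed are assumptions for this sketch.
* baseline is assumed to be fully observed.

* Proportion of randomised participants with a missing outcome
quietly count if missing(outcome)
local prop_missing = r(N) / _N
display "Proportion with missing outcome: " %5.3f `prop_missing'

if `prop_missing' > 0.05 {
    * More than 5% missing: multiple imputation
    mi set wide
    mi register imputed outcome
    mi impute regress outcome i.treat_group baseline, add(50) rseed(20200907)
    mi estimate: regress outcome i.treat_group baseline
}
else {
    * 5% or less missing: complete case analysis
    regress outcome i.treat_group baseline
}
```

Because the branch depends only on the observed proportion of missing outcomes, two analysts applying this code to the same dataset would necessarily choose the same analysis.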

We note that in many instances, adaptive analysis strategies can lead to biased estimates or incorrect standard errors even when decision rules are fully deterministic. For example, this occurs when using stepwise selection to choose which covariates to adjust for, when using a test for carryover to determine the final analysis model in a crossover design, or when using a test for interaction to determine the final analysis model for a factorial trial [19–21]. Therefore, caution should be applied when considering adaptive strategies, even if deterministic decision rules are planned.

Example

We now illustrate our framework in an example. Consider the following analysis section from a trial protocol for a continuous primary outcome measured at multiple follow-up time points:

‘Primary analyses will be undertaken on an intention-to-treat basis, including all participants as randomised, regardless of treatment actually received. The intervention group will be compared with the control group using a planned contrast of change from baseline to the week 12 endpoint using a mixed-model repeated measures analysis. Stratification variables will be evaluated and retained in analyses where they are measured as significant or quasi-significant. Transformation of outcomes, including categorisation, may be undertaken to meet distributional assumptions and accommodate outliers.’

Evaluating whether the analysis approach is designed to prevent p-hacking

This analysis approach meets our first two points; it was described in the trial protocol before recruitment began and consists of a single overall analysis strategy.

For our third point, the analysis approach covers three analysis aspects (population, analysis model, covariates); however, it does not specify how missing data will be handled. We can guess that participants with missing outcome data at all follow-up time points will be excluded from the analysis; however, this is not entirely clear.

For our fourth point, there is insufficient detail for a third party to independently replicate the analysis model; there are numerous ways to implement a mixed-model repeated measures analysis (for instance, different approaches to specifying random-effects, or different correlation structures to model the correlation between outcomes from the same participant at different time points), and it is not clear which approach the authors intend to use.

For our fifth point, the authors plan to use an adaptive analysis strategy for two components: which stratification variables to include in the analysis, and whether to transform the outcome (and if so, which transformation to use). In both instances, they do not include deterministic decision rules on how the final analysis approach should be decided (e.g. for stratification variables, there is no definition of what quasi-significant means). Therefore, this strategy would allow investigators to perform multiple analyses on the final trial data before choosing their preferred approach.

Overall, the specified analysis approach could allow investigators to implement a number of different analysis strategies (relating to handling of missing data, the analysis model, covariates, and transformation of the outcome) and present the most favourable result. As such, although this approach has been pre-specified, it still allows p-hacking.

Modifying the analysis approach so it is designed to prevent p-hacking using the Pre-SPEC framework

We can modify the approach described in the previous section so that it does not allow p-hacking by resolving the issues relating to points 3–5 above. First, we could explicitly state that the analysis will use all available follow-up data; participants with an available outcome from at least one follow-up time point will be included in the analysis, and participants with missing outcome data at all follow-up time points will be excluded from the analysis.

Second, we could provide additional information on how the analysis model will be implemented; for instance, we could specify a linear mixed-effects model with an unstructured correlation matrix for observations at different time points, estimated using restricted maximum likelihood. We could supplement this description by including the planned statistical code to remove any ambiguity from our description (see below for example code for the statistical package Stata).

Finally, we need to resolve the issues around the adaptive analysis strategies related to the stratification variables and the transformation of the outcome. In this scenario, it is unlikely that the adaptive strategies are necessary, or even beneficial. All stratification variables should be included in the model regardless of statistical significance, as failure to do so can lead to incorrect confidence intervals and p values [22, 23]. Furthermore, linear regression models are usually very robust to violations of distributional assumptions [24], and transformation can lead to issues of interpretability (in particular, categorisation could lead to a substantial reduction in power [25]). Therefore, the simplest way to resolve this issue is to remove the adaptive part and use a strategy which includes all stratification variables in the model and does not consider transformations of the outcome. This approach would guarantee valid results under minimal assumptions about the data, which are easily interpretable. If an adaptive strategy was deemed necessary, then a deterministic decision rule would need to be specified, for example by giving the exact p value threshold for retaining stratification variables in the model (though we note this approach can be problematic even if fully pre-specified [21]).

Incorporating these changes, we could re-write the planned analysis strategy as follows:

Primary analyses will be undertaken on an intention-to-treat basis, including all participants as randomised, regardless of treatment actually received. The analysis will use all available outcome data; participants with an available outcome from at least one follow-up time point will be included in the analysis, and participants with no recorded outcomes will be excluded from the analysis. The intervention group will be compared with the control group using a planned contrast of change from baseline to the week 12 endpoint and will be fit using a linear mixed-model which includes outcomes at all time-points in the model. The model will use an unstructured correlation matrix for observations at different time points, and will be fit using restricted maximum likelihood. The model will include treatment group, time point, a treatment-by-time interaction, and the stratification variables as fixed factors. This analysis will be implemented using the following Stata code:

mixed outcome treat_group i.time_point treat_group#i.time_point strat1 strat2 || patient_id:, residuals(unstructured, t(time_point)) noconstant reml

lincom treat_group+treat_group#12.time_point

Where ‘outcome’ refers to the primary outcome (change from baseline), ‘treat_group’ refers to the treatment group, ‘time_point’ refers to the follow-up time-point, ‘treat_group#i.time_point’ refers to the treatment group by follow-up time-point interaction, ‘strat1’ and ‘strat2’ refer to the stratification variables, and ‘patient_id’ is a unique ID for each participant. The treatment effect at week 12 (primary outcome) is estimated using the Stata code: lincom treat_group+treat_group#12.time_point

We note that Stata automatically excludes participants with no recorded outcomes from the analysis and so does not require additional code to perform this step. Further, we note that the above strategy is not necessarily the optimal statistical approach, but is used simply to illustrate how the original approach could be fully pre-specified.

Discussion

Pre-specification of the planned statistical analysis approach can help reduce bias from p-hacking in clinical trials, as it ensures analytical methods are chosen in advance of seeing the trial data. However, ‘incomplete’ pre-specification, which still allows some degree of p-hacking, is common in clinical trials [2, 5]. Pre-SPEC is a framework that describes how a statistical analysis strategy could be designed to ensure it does not allow p-hacking.

This framework was designed to be consistent with the SPIRIT and ICH-E9 guidelines [1, 4] and is intended to be used in conjunction with these and other guidelines [9]. The SPIRIT and ICH-E9 guidelines require the analysis strategy for a trial’s primary outcome be documented in the trial protocol, and as such, the Pre-SPEC framework is intended to help investigators design the analysis strategy for the trial’s primary outcome in the trial protocol. Our intention is not for the use of this framework to be mandated, but rather for it to provide guidance for those who wish to design a statistical analysis approach which both (i) does not allow p-hacking and (ii) can be seen by others to not allow p-hacking.

The statistical analysis approach for the trial’s primary outcome is usually specified well in advance of the trial start date, as it is often required for grant application or the sample size calculation. Therefore, this information will usually be available to include in the trial protocol. However, for trials for which this information is not known at the protocol stage, and where investigators feel that specifying this information would pose an insurmountable barrier to the timely start of the trial, then investigators should specify the planned analysis approach for the primary outcome as soon after the trial has begun as possible. For these trials, it may be difficult for readers to determine whether the planned analysis approach was specified before investigators had access to unblinded trial data, and so accurate reporting around when trial investigators and statisticians received data, and whether they were blinded to treatment allocation codes within the dataset, is essential to allow transparent evaluation of results [26, 27].

Although this framework was developed with a trial’s primary outcome in mind, it could also be used for secondary outcomes. As above, where investigators feel that specifying this information would pose an insurmountable barrier to the timely start of the trial, then investigators should simply specify the planned analysis approach as soon after the trial has begun as possible. Importantly, we note that our framework does not require that a detailed Statistical Analysis Plan be written before the trial begins.

We note that the Pre-SPEC framework is not intended to preclude changes or force investigators to stick with an analysis strategy they feel is no longer appropriate. There are sometimes good reasons for investigators to change their statistical methods during the course of the trial, for instance because of an advance in statistical methodology or the implementation of new methods in statistical software packages. Rather, if it is anticipated beforehand that the preferred method of analysis may depend on some aspect of the trial data (for instance, the distribution of outcome data), then the manner in which this decision will be made should be pre-specified. If the analysis strategy needs to change due to an unanticipated issue (for instance, the occurrence of unanticipated intercurrent events [28], or new methodology becoming available in statistical software packages), then these changes should be documented and explained [26]. Instead of preventing useful or necessary changes, Pre-SPEC simply increases transparency around the process; as stated in the SPIRIT guidelines, ‘An unambiguous, complete, and transparent description of statistical methods facilitates execution, replication, critical appraisal, and the ability to track any changes from the original pre-specified methods.’ [4].

We note that transparency around the statistical methods used in clinical trials is increasing, and there are initiatives in place to further increase transparency (for example, those conducted by the UKCRC CTU network, https://www.ukcrc-ctu.org.uk/). However, there is still a long way to go; evidence shows that the statistical methods for the trial’s primary outcome are often poorly specified in both trial protocols [5, 26, 27] and Statistical Analysis Plans [26]; that protocols and Statistical Analysis Plans are often not made publicly available, or are made available only after they may have already been modified during the course of the trial [26, 27, 29]; that undisclosed changes to the planned analysis approach are frequent [2, 26, 27]; and that reporting around data access and blinding status of statisticians is often poor [26, 27], hampering the ability of readers to evaluate whether changes have been made based on unblinded trial data. Pre-SPEC can play a part, alongside other initiatives, to help increase transparency in clinical trials and resolve some of the issues outlined above.

Conclusion

Use of the Pre-SPEC framework can help ensure that statistical analyses are designed so they do not allow p-hacking.

Supplementary information

Additional file 1. (16.5 KB, docx)

Acknowledgements

We would like to thank Thomas Bandholm, Victoria Cornelius, Rachel Phillips, Francesca Fiorentino, Nicholas Johnson, Consuelo Nohpal de la Rosa, and Jinky Lozano-Kuehne for helpful comments on a draft of the manuscript.

Authors’ contributions

BK conceived the idea for this article and wrote the first draft. GF and SC contributed to the manuscript and helped refine the Pre-SPEC framework. All authors read and approved the final manuscript.

Funding

None

Availability of data and materials

Not applicable

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information accompanies this paper at 10.1186/s12916-020-01706-7.

References

1. ICH Harmonised Tripartite Guideline. Statistical principles for clinical trials. International conference on harmonisation E9 expert working group. Stat Med. 1999;18(15):1905–1942.
2. Chan AW, Hrobjartsson A, Jorgensen KJ, Gotzsche PC, Altman DG. Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols. BMJ. 2008;337:a2299. doi: 10.1136/bmj.a2299.
3. Chan AW, Tetzlaff JM, Altman DG, Laupacis A, Gotzsche PC, Krleza-Jeric K, et al. SPIRIT 2013 statement: defining standard protocol items for clinical trials. Ann Intern Med. 2013;158(3):200–207. doi: 10.7326/0003-4819-158-3-201302050-00583.
4. Chan AW, Tetzlaff JM, Gotzsche PC, Altman DG, Mann H, Berlin JA, et al. SPIRIT 2013 explanation and elaboration: guidance for protocols of clinical trials. BMJ. 2013;346:e7586. doi: 10.1136/bmj.e7586.
5. Greenberg L, Jairath V, Pearse R, Kahan BC. Pre-specification of statistical analysis approaches in published clinical trial protocols was inadequate. J Clin Epidemiol. 2018;101:53–60. doi: 10.1016/j.jclinepi.2018.05.023.
6. Abraha I, Cherubini A, Cozzolino F, De Florio R, Luchetta ML, Rimland JM, et al. Deviation from intention to treat analysis in randomised trials and treatment effect estimates: meta-epidemiological study. BMJ. 2015;350:h2445. doi: 10.1136/bmj.h2445.
7. Dwan K, Altman DG, Clarke M, Gamble C, Higgins JP, Sterne JA, et al. Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials. PLoS Med. 2014;11(6):e1001666. doi: 10.1371/journal.pmed.1001666.
8. Dworkin JD, McKeown A, Farrar JT, Gilron I, Hunsinger M, Kerns RD, et al. Deficiencies in reporting of statistical methodology in recent randomized trials of nonpharmacologic pain treatments: ACTTION systematic review. J Clin Epidemiol. 2016;72:56–65. doi: 10.1016/j.jclinepi.2015.10.019.
9. Gamble C, Krishan A, Stocken D, Lewis S, Juszczak E, Dore C, et al. Guidelines for the content of statistical analysis plans in clinical trials. JAMA. 2017;318(23):2337–2343. doi: 10.1001/jama.2017.18556.
10. Grant S, Booth M, Khodyakov D. Lack of pre-registered analysis plan allows unacceptable data mining for and selective reporting of consensus in Delphi studies. J Clin Epidemiol. 2018;99:96–105.
11. Nuesch E, Trelle S, Reichenbach S, Rutjes AW, Burgi E, Scherer M, et al. The effects of excluding patients from the analysis in randomised controlled trials: meta-epidemiological study. BMJ. 2009;339:b3244. doi: 10.1136/bmj.b3244.
12. Page MJ, McKenzie JE, Forbes A. Many scenarios exist for selective inclusion and reporting of results in randomized trials and systematic reviews. J Clin Epidemiol. 2013;66(5):524–537. doi: 10.1016/j.jclinepi.2012.10.010.
13. Porta N, Bonet C, Cobo E. Discordance between reported intention-to-treat and per protocol analyses. J Clin Epidemiol. 2007;60(7):663–669. doi: 10.1016/j.jclinepi.2006.09.013.
14. Saquib N, Saquib J, Ioannidis JP. Practices and impact of primary outcome adjustment in randomized controlled trials: meta-epidemiologic study. BMJ. 2013;347:f4313. doi: 10.1136/bmj.f4313.
15. Schulz KF, Altman DG, Moher D, CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. PLoS Med. 2010;7(3):e1000251. doi: 10.1371/journal.pmed.1000251.
16. Committee for Human Medicinal Products. ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials, Step 2b. http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2017/08/WC500233916.pdf. Accessed 21 Sept 2019.
17. Morris TP, Kahan BC, White IR. Choosing sensitivity analyses for randomised trials: principles. BMC Med Res Methodol. 2014;14:11. doi: 10.1186/1471-2288-14-11.
18. Goldacre B, Morton CE, DeVito NJ. Why researchers should share their analytic code. BMJ. 2019;367:l6365. doi: 10.1136/bmj.l6365.
19.Freeman PR. The performance of the two-stage analysis of two-treatment, two-period crossover trials. Stat Med. 1989;8(12):1421–1432. doi: 10.1002/sim.4780081202. [DOI] [PubMed] [Google Scholar]
20.Kahan BC. Bias in randomised factorial trials. Stat Med. 2013;32(26):4540–4549. doi: 10.1002/sim.5869. [DOI] [PubMed] [Google Scholar]
21.Raab GM, Day S, Sales J. How to select covariates to include in the analysis of a clinical trial. Control Clin Trials. 2000;21(4):330–342. doi: 10.1016/S0197-2456(00)00061-1. [DOI] [PubMed] [Google Scholar]
22.Kahan BC, Morris TP. Reporting and analysis of trials using stratified randomisation in leading medical journals: review and reanalysis. BMJ. 2012;345:e5840. doi: 10.1136/bmj.e5840. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Kahan BC, Morris TP. Improper analysis of trials randomised using stratified blocks or minimisation. Stat Med. 2012;31(4):328–340. doi: 10.1002/sim.4431. [DOI] [PubMed] [Google Scholar]
24.Wang B, Ogburn EL, Rosenblum M. Analysis of covariance (ANCOVA) in randomized trials: more precision and valid confidence intervals, without model assumptions. Biometrics. 2019;75(4):1391–1400. [DOI] [PubMed]
25.Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332(7549):1080. doi: 10.1136/bmj.332.7549.1080. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Cro S, Forbes G, Johnson NA, et al. Evidence of unexplained discrepancies between planned and conducted statistical analyses: a review of randomized trials. BMC Med. 2020;18:137. 10.1186/s12916-020-01590-1. [DOI] [PMC free article] [PubMed]
27.Kahan BC, Ahmad T, Forbes G, Cro S. Availability and adherence to pre-specified statistical analysis approaches was low in published randomised trials. OSF (osfio/nbp8v). 2020. [DOI] [PubMed]
28.ICH E9 working group. ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials [Available from: https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e9-r1-addendum-estimands-sensitivity-analysis-clinical-trials-guideline-statistical-principles_en.pdf. Accessed 15 Dec 2019.
29.Spence O, Hong K, Onwuchekwa Uba R, Doshi P. Availability of study protocols for randomized trials published in high-impact medical journals: a cross-sectional analysis. Clin Trials. 2019;1740774519868310. [DOI] [PubMed]

Associated Data

Supplementary Materials

Additional file 1 (16.5 KB, docx).

Data Availability Statement

Not applicable

