BMJ Open. 2025 Nov 28;15(11):e111418. doi: 10.1136/bmjopen-2025-111418

Comparison of intention-to-treat and per-protocol results in non-inferiority trials: a methodological review protocol

Sameer Parpia 1,2, Sandra Ofori 2, Tyler McKechnie 3, Naveen Rajan 2, Yang Wang 1,2, Borong Wang 3, Gordon Guyatt 2
PMCID: PMC12666176  PMID: 41314826

Abstract

Introduction

Non-inferiority (NI) trial designs, which assess whether an experimental intervention is no worse than the standard of care, have become increasingly prevalent in recent years. Current thinking holds that, in the presence of protocol violations, the intention-to-treat (ITT) analysis is anti-conservative compared with the per-protocol (PP) analysis.

Methods and analysis

We aim to conduct a methodological review comparing the results of ITT and PP analyses in NI trials. A comprehensive electronic search strategy will be used to identify studies indexed in MEDLINE, Embase and Cochrane Central Register of Controlled Trials databases. We will include 390 NI trials published prior to 31 December 2024. The primary outcomes are the treatment effect estimates from ITT and PP analyses. Secondary outcomes are the CI widths and the bounds of the CIs from the ITT and PP analyses. The analysis will calculate the relative difference in the point estimates, CI widths and CI bounds between the two approaches. Linear models will be used to investigate the relationship between the outcomes and the proportion of patients excluded from the PP analysis.

Ethics and dissemination

This is a methodological review that has been registered on the International Prospective Register of Systematic Reviews (PROSPERO, CRD420251125360). Research ethics approval is not required as the project is a methodological review of previously published trials. Study findings will be shared via peer-reviewed publications and presentations at academic conferences.

Keywords: STATISTICS & RESEARCH METHODS, Methods, Clinical Trial


STRENGTHS AND LIMITATIONS OF THIS STUDY.

  • Within-trial empirical comparison of the results from the intention-to-treat and per-protocol analysis in non-inferiority (NI) trials.

  • Assessment of differences in point estimates as well as CIs of the two approaches.

  • There is no restriction on disease or clinical area.

  • Possible sampling bias from the restricted timeframe.

  • Findings will not be applicable to NI trials using continuous outcomes.

Introduction

Randomised controlled trials (RCTs) are regarded as the gold standard in biomedical research for evaluating new interventions against the standard of care (SOC).1 While superiority RCTs assess whether a novel intervention is better than the SOC in efficacy, non-inferiority (NI) designs are used when a new intervention may reduce harm or burdens.2 These NI trials aim to demonstrate that any potential loss of benefit from the new intervention is sufficiently small that its use in clinical practice is warranted. To do so, NI trials establish a pre-determined NI margin that represents the maximum acceptable loss in efficacy compared with the SOC, given the advantages of the new intervention.3

NI trials present unique challenges in design, analysis and interpretation.4–7 One challenge is the choice of the analysis population set. In superiority trials, the intention-to-treat (ITT) analysis population is widely accepted. However, in the presence of protocol non-adherence, the role of the ITT in NI trials remains debatable. The concern is that the ITT analysis tends to attenuate the treatment difference of interest towards the null, suggesting that the ITT is conservative for showing differences but anti-conservative for demonstrating similarity.2 5 8 9 Consequently, it may increase type I error, potentially leading researchers to incorrectly conclude that the intervention is non-inferior to the SOC.

The per-protocol (PP) analysis aims to measure the intervention effect among protocol-adherent patients. However, excluding patients after randomisation is likely to disturb the prognostic balance that randomisation initially achieved and may therefore result in biased attenuation or amplification of the treatment effect. Despite this vulnerability, researchers recommend that the ITT analysis be supplemented by the PP analysis in NI trials, as the PP analysis is assumed to run less risk of a false conclusion of NI.8 10 11

Our objective is to conduct a methodological review to empirically compare the results from ITT and PP analyses in NI trials to investigate whether PP analyses are more conservative in NI trials.

Materials and methods

Study design

This study is a methodological review of individual randomised NI trials published across all medical and surgical disciplines. The methodology for this review has been reported per the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (online supplemental file 1). The protocol has been registered on the International Prospective Register of Systematic Reviews (PROSPERO - CRD420251125360). The study will begin in January 2026 and anticipates completion by August 2027.

Primary research objective

To evaluate the extent to which treatment effect, estimated using the ITT and PP analysis populations, differs in individually randomised NI trials published in peer-reviewed medical literature prior to 31 December 2024 and the nature of any differences.

Secondary research objective

  1. To assess whether the width of the CI, estimated using the ITT and PP analysis populations, differs in NI trials.

  2. To assess whether the CI bound of interest (ie the CI bound that is compared with the NI margin to draw the trial conclusion), estimated using the ITT and PP analysis populations, differs in NI trials.

  3. To assess the relationship between the proportion of patients excluded from the PP analysis and the:

    1. difference in treatment effect, estimated using the ITT and PP analysis populations

    2. difference in CI width, estimated using the ITT and PP analysis populations

    3. difference in the CI bound of interest estimated using the ITT and PP analysis populations.

  4. To assess the extent to which treatment effect, estimated using the ITT and PP analysis populations, differs in relevant subgroups.

Eligibility criteria

The inclusion criteria are:

  1. Peer-reviewed NI RCTs defined as studies self-described as a NI trial in their title, abstract, introduction or methods of publication.

  2. Published prior to 31 December 2024 in any medical or surgical discipline.

  3. Published in English.

The exclusion criteria are:

  1. Pilot or feasibility randomised trials.

  2. Secondary analyses of randomised trial data or any other type of study not reporting primary data.

  3. Conference presentations or abstracts.

  4. Cluster or crossover trials.

  5. Primary outcome that is not binary or time-to-event (measured in terms of HR).

  6. Trials that do not report both ITT and PP results.

Information sources

A systematic search will be conducted across MEDLINE, Embase and the Cochrane Central Register of Controlled Trials databases for trials published prior to 31 December 2024.

Search strategy

The search strategy was developed in collaboration with study investigators and a medical research librarian. The Medical Subject Heading term ‘Non-inferiority’ was applied across all databases, supplemented by additional keywords related to NI trials, including ‘non-inferiority randomised trial’, ‘non-inferiority design’, and ‘non-inferiority clinical trial’. A comprehensive list of search strategies is provided in online supplemental file 2.

Study selection

In this methodological review, we will examine trials published prior to 31 December 2024, and work backward in time until we have identified a total of 390 trials. Two reviewers will independently assess the titles and abstracts identified through a systematic search. To ensure accuracy and consistency, prior to screening, both reviewers will receive comprehensive training on the predefined inclusion and exclusion criteria. Any disagreements arising during the title and abstract screening process will be resolved by including the study for further evaluation. At the full-text screening stage, discrepancies will be addressed through consensus, and if consensus cannot be reached, a third reviewer will be consulted to make the final determination.

Outcome definitions

The primary outcomes for this study are the point estimates of the treatment effect, as estimated using the ITT and PP analysis populations.

Secondary outcomes include the width of the CI and the CI bounds, as estimated using the ITT and PP analysis populations.

Data management

Two reviewers will independently conduct data extraction into a data collection form designed a priori. Discrepancies will be reviewed in detail by a third reviewer who will resolve the conflict. The extracted data will include:

  1. Trial characteristics: author, year of publication, journal of publication, funding, disease area, intervention details and control details.

  2. Primary outcome and whether it is a desirable or undesirable outcome.

  3. Description of the ITT and PP analyses.

  4. Number of patients included in the ITT and PP analyses.

  5. Direction of the comparison (experimental vs control, or control vs experimental).

  6. Point estimates of treatment effect from the ITT and PP analyses.

  7. Confidence levels used in the analysis (eg, 90% or 95%).

  8. Width of the CI from the ITT and PP analyses.

  9. CI bounds from the ITT and PP analyses.

  10. NI margin.

  11. Trial conclusion (inferior, non-inferior, superior and inconclusive).

Statistical analysis

The trial characteristics will be summarised descriptively. To standardise results, the following steps will be taken:

  1. For binary outcome trials that report results on the absolute scale, the results will be converted into the relative scale by extracting the number of events and patients and estimating the relative risk and corresponding 95% CIs.

  2. To ensure that the CI bound of interest is the upper bound, the following will be done:

    1. For trials with an undesirable primary outcome in which the comparison is not reported as experimental versus control, we will convert the results to correspond to an experimental versus control comparison.

    2. For trials with a desirable primary outcome in which the comparison is not reported as control versus experimental, we will convert the results to correspond to a control versus experimental comparison.
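
The standardisation steps can be illustrated with a minimal Python sketch. It assumes a Wald interval on the log scale for the relative risk and assumes that reversing the direction of a ratio measure amounts to taking reciprocals; the protocol does not specify either computation. The function names and the event counts are hypothetical.

```python
import math
from statistics import NormalDist


def relative_risk(events_exp, n_exp, events_ctl, n_ctl, level=0.95):
    """Relative risk (experimental vs control) with a log-scale Wald CI.

    Assumption: no arm has zero events; the protocol does not state how
    such trials would be handled.
    """
    rr = (events_exp / n_exp) / (events_ctl / n_ctl)
    # Standard error of log(RR) from the 2x2 table counts.
    se = math.sqrt(1 / events_exp - 1 / n_exp + 1 / events_ctl - 1 / n_ctl)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


def flip_comparison(estimate, lower, upper):
    """Re-express a ratio (RR or HR) for the opposite comparison direction.

    Taking reciprocals swaps the roles of the two CI bounds.
    """
    return 1 / estimate, 1 / upper, 1 / lower


# Hypothetical trial: 30/200 events (experimental) vs 25/200 (control).
rr, lo, hi = relative_risk(30, 200, 25, 200)
flipped = flip_comparison(rr, lo, hi)
```

After any required flip, the bound compared with the NI margin is the upper bound, so trials can be handled uniformly in the later steps.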

The difference in the point estimates between the ITT and PP analysis for each trial will first be calculated using the relative difference: $\left(\frac{\hat{\theta}_{\mathrm{ITT}} - \hat{\theta}_{\mathrm{PP}}}{\hat{\theta}_{\mathrm{ITT}}}\right) \times 100$, where $\hat{\theta}_{\mathrm{ITT}}$ and $\hat{\theta}_{\mathrm{PP}}$ are the respective point estimates. The relative difference will then be summarised over all trials using the mean and corresponding 95% CIs. A similar approach will be undertaken to estimate the difference in the width of CI and in the CI bound of interest between the ITT and PP analyses.
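
As an illustration of this calculation, the sketch below computes the within-trial relative difference and summarises it across trials. The CI for the mean uses a normal approximation, which is an assumption (the protocol does not state the interval method), and the point estimates are invented for the example.

```python
import math
from statistics import NormalDist, mean, stdev


def relative_difference(itt, pp):
    """Relative difference (%) between the ITT and PP values, scaled by ITT."""
    return (itt - pp) / itt * 100


def mean_with_ci(values, level=0.95):
    """Mean with a normal-approximation CI (assumed; reasonable for large n)."""
    m = mean(values)
    se = stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m, m - z * se, m + z * se


# Hypothetical (ITT, PP) point estimates on the ratio scale.
trials = [(1.10, 1.15), (0.95, 0.90), (1.20, 1.32), (1.05, 1.05)]
rel_diffs = [relative_difference(itt, pp) for itt, pp in trials]
summary = mean_with_ci(rel_diffs)
```

The same two functions apply unchanged to CI widths and to the CI bound of interest, since each is a single number per analysis population.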

Linear regression models will be used to investigate the relationship between the proportion of patients excluded from the PP analysis and the relative difference in the three outcomes. The proportion of patients excluded from the PP analysis will be modelled using restricted cubic splines to account for potential non-linear relationships.
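
A restricted cubic spline model of this kind can be sketched as follows. The sketch uses one common truncated-power parametrisation of the spline basis and a small normal-equations solver. The knot locations, variable names and synthetic data are assumptions for illustration and are not taken from the protocol.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline terms for x: linear term plus k-2 nonlinear terms.

    The nonlinear terms are zero below the first knot and linear beyond the
    last knot, which is what "restricted" refers to.
    """
    t = list(knots)
    scale = (t[-1] - t[0]) ** 2  # keeps nonlinear terms on a scale similar to x
    gap = t[-1] - t[-2]

    def cube(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for j in range(len(t) - 2):
        term = (cube(x - t[j])
                - cube(x - t[-2]) * (t[-1] - t[j]) / gap
                + cube(x - t[-1]) * (t[-2] - t[j]) / gap)
        terms.append(term / scale)
    return terms


def ols(rows, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


knots = (0.05, 0.15, 0.30)                   # hypothetical knot locations
excluded = [i / 40 for i in range(21)]       # proportion excluded, 0 to 0.5
rel_diff = [2 + 3 * x for x in excluded]     # synthetic, exactly linear outcome
design = [[1.0] + rcs_basis(x, knots) for x in excluded]
coefs = ols(design, rel_diff)                # intercept, linear, nonlinear
```

With an exactly linear synthetic outcome, the coefficient on the nonlinear term should be close to zero; in real data, a non-zero coefficient would indicate departure from linearity in the proportion excluded.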

We will conduct a descriptive subgroup analysis of differences in point estimates and CIs by disease type, limited to scenarios where the disease has at least 30 trials.
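
The subgroup rule can be expressed as a short sketch, assuming each trial is recorded as a disease label paired with its relative difference. The disease labels and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def subgroup_means(records, min_trials=30):
    """Mean relative difference per disease area with at least min_trials trials."""
    by_disease = defaultdict(list)
    for disease, rel_diff in records:
        by_disease[disease].append(rel_diff)
    return {d: mean(v) for d, v in by_disease.items() if len(v) >= min_trials}


# Hypothetical data: one area meets the 30-trial threshold, one does not.
records = [("infectious disease", 1.0)] * 30 + [("oncology", -2.0)] * 29
result = subgroup_means(records)
```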

All analyses will be performed using R statistical software (v4.0.3; R Core Team 2025).

Sample size

Sample size is based on precision around the mean estimate of the relative difference between the ITT and PP point estimates. Assuming an SD of 20 estimated from a smaller review (unpublished), 390 studies will provide 95% CIs that have a width of 4, which we believe is adequate precision.
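
The stated precision can be checked with a short calculation, assuming a normal-approximation CI for a mean (the protocol does not give the formula used). The function names are illustrative.

```python
import math
from statistics import NormalDist


def ci_width(sd, n, level=0.95):
    """Full width of a normal-approximation CI for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * sd / math.sqrt(n)


def required_n(sd, width, level=0.95):
    """Smallest n whose CI width does not exceed the target width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil((2 * z * sd / width) ** 2)


width = ci_width(sd=20, n=390)       # about 3.97, consistent with a width of 4
n_needed = required_n(sd=20, width=4)
```

Under these assumptions, 390 trials give a CI width just under 4, and a width of exactly 4 would need slightly fewer trials.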

Risk of bias assessment

As the review is assessing methods, a risk of bias assessment is not required for this review.

Ethics and dissemination

The findings of this review will be submitted for publication in a peer-reviewed journal. Additionally, the results will be presented at scientific conferences and disseminated to relevant stakeholders through professional societies and research networks. Research ethics approval is not required as the project is a methodological review of previously published trials.

Patient and public involvement

There was no patient or public involvement in the development of this study.

Discussion

The selection of patient analysis populations in NI trials remains a topic of methodological concern. A common view is that the ITT analysis may be anti-conservative in the presence of protocol deviations: it could shrink the differences between treatment groups that would exist if the protocol were optimally followed, resulting in spurious conclusions of NI. The alternative PP analysis is also susceptible to bias, because the direction in which post-randomisation exclusions shift the treatment effect is not predictable and depends on the nature of the violations. In this methodological review, we aim to empirically assess the differences between ITT and PP analyses, and examine the relationship between these differences and the proportion of patients excluded from the PP analysis.

Several researchers have used statistical simulations to study the effect of missing data, treatment discontinuations and crossovers on the ITT and PP analyses in NI trials.12–15 Brittain and Lin compared the ITT and PP results in antibiotic NI trials presented at the drug advisory committee of the Food and Drug Administration and found no evidence that the PP analysis yields larger differences in effect favouring the control than the ITT analysis does.16 However, this review was based on only 20 trials. A larger review of 164 antibiotic NI trials showed that, contrary to the rationale for including the PP analysis, the ITT was more conservative in the majority of these trials.17 Previous research by Aberegg et al demonstrated similar patterns in the relationship between ITT and PP analyses; however, their review was limited to only five trials that reported results of both approaches.18 More broadly, prior work in this area has been constrained by small sample sizes and restriction to specific disease areas, limiting the generalisability of its findings.

Our study has several strengths. First, we plan to include 390 trials from diverse therapeutic areas, providing greater precision in our estimates. Second, by not restricting our search to specific disease areas or interventions, our findings will be more generalisable. Third, our study quantifies the relative differences in both point estimates and CI widths between ITT and PP analysis sets, providing insights into how the choice of analytical approach affects both effect size estimation and precision. Finally, our standardised approach to calculating within-trial relative differences allows for combination across trials using different effect measures.

The study has limitations: (1) excluding abstracts, conference proceedings and non-English publications may introduce selection bias and affect generalisability; (2) the findings will not be applicable to NI trials using continuous outcomes; and (3) the restricted timeframe may introduce sampling bias, though it was intentionally chosen to focus on contemporary NI trials. Finally, we acknowledge that the ITT and PP analyses target different estimands,11 yet researchers recommend conducting both analyses in NI trials, making comparison of results from these analyses important.

This study will offer valuable insights into the treatment effects generated by ITT and PP analyses in NI trials and will examine empirically whether the conventional belief that ITT analysis produces smaller treatment effects compared with PP analysis holds true in NI trials.

Supplementary material

online supplemental file 1
bmjopen-15-11-s001.docx (33.8KB, docx)
DOI: 10.1136/bmjopen-2025-111418
online supplemental file 2
bmjopen-15-11-s002.docx (16.5KB, docx)
DOI: 10.1136/bmjopen-2025-111418

Footnotes

Funding: The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Prepublication history and additional supplemental material for this paper are available online. To view these files, please visit the journal online (https://doi.org/10.1136/bmjopen-2025-111418).

Provenance and peer review: Not commissioned; externally peer reviewed.

Patient consent for publication: Not applicable.

Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

References

  1. Gale RP, Zhang MJ, Lazarus HM. The role of randomized controlled trials, registries, observational databases in evaluating new interventions. Best Pract Res Clin Haematol. 2023;36:101523. doi: 10.1016/j.beha.2023.101523.
  2. Christensen E. Methodology of superiority vs. equivalence trials and non-inferiority trials. J Hepatol. 2007;46:947–54. doi: 10.1016/j.jhep.2007.02.015.
  3. Schumi J, Wittes JT. Through the looking glass: understanding non-inferiority. Trials. 2011;12:106. doi: 10.1186/1745-6215-12-106.
  4. Chen R, Shi Q. Challenges and considerations in non-inferiority trials: a narrative review from statisticians’ perspectives. Chin Clin Oncol. 2025;14:8. doi: 10.21037/cco-24-84.
  5. Mauri L, D’Agostino RB Sr. Challenges in the Design and Interpretation of Noninferiority Trials. N Engl J Med. 2017;377:1357–67. doi: 10.1056/NEJMra1510063.
  6. Gupta SK. Non-inferiority clinical trials: Practical issues and current regulatory perspective. Indian J Pharmacol. 2011;43:371–4. doi: 10.4103/0253-7613.83103.
  7. D’Agostino RB Sr, Massaro JM, Sullivan LM. Non-inferiority trials: design concepts and issues - the encounters of academic consultants in statistics. Stat Med. 2003;22:169–86. doi: 10.1002/sim.1425.
  8. Mo Y, Lim C, Watson JA, et al. Non-adherence in non-inferiority trials: pitfalls and recommendations. BMJ. 2020;370:m2215. doi: 10.1136/bmj.m2215.
  9. Leung JT, Barnes SL, Lo ST, et al. Non-inferiority trials in cardiology: what clinicians need to know. Heart. 2020;106:99–104. doi: 10.1136/heartjnl-2019-315772.
  10. Rehal S, Morris TP, Fielding K, et al. Non-inferiority trials: are they inferior? A systematic review of reporting in major medical journals. BMJ Open. 2016;6:e012594. doi: 10.1136/bmjopen-2016-012594.
  11. Morgan KE, White IR, Leyrat C, et al. Applying the Estimands Framework to Non-Inferiority Trials: Guidance on Choice of Hypothetical Estimands for Non-Adherence and Comparison of Estimation Methods. Stat Med. 2025;44:e10348. doi: 10.1002/sim.10348.
  12. Matilde Sanchez M, Chen X. Choosing the analysis population in non-inferiority studies: per protocol or intent-to-treat. Stat Med. 2006;25:1169–81. doi: 10.1002/sim.2244.
  13. Parpia S, Julian JA, Thabane L, et al. Treatment crossovers in time-to-event non-inferiority randomised trials of radiotherapy in patients with breast cancer. BMJ Open. 2014;4:e006531. doi: 10.1136/bmjopen-2014-006531.
  14. Wiens BL, Zhao W. The role of intention to treat in analysis of noninferiority studies. Clin Trials. 2007;4:286–91. doi: 10.1177/1740774507079443.
  15. Sheng D, Kim MY. The effects of non-compliance on intent-to-treat analysis of equivalence trials. Stat Med. 2006;25:1183–99. doi: 10.1002/sim.2230.
  16. Brittain E, Lin D. A comparison of intent-to-treat and per-protocol results in antibiotic non-inferiority trials. Stat Med. 2005;24:1–10. doi: 10.1002/sim.1934.
  17. Bai AD, Komorowski AS, Lo CKL, et al. Intention-to-treat analysis may be more conservative than per protocol analysis in antibiotic non-inferiority trials: a systematic review. BMC Med Res Methodol. 2021;21:75. doi: 10.1186/s12874-021-01260-7.
  18. Aberegg SK, Hersh AM, Samore MH. Empirical Consequences of Current Recommendations for the Design and Interpretation of Noninferiority Trials. J Gen Intern Med. 2018;33:88–96. doi: 10.1007/s11606-017-4161-4.
