Abstract
The goal of a non-inferiority trial is to evaluate whether the effect of an experimental treatment is not inferior to that of the active control. Determining an appropriate non-inferiority margin is critical to the demonstration of non-inferiority. A commonly used method is the fixed-margin approach recommended by the FDA. The fixed-margin approach consists of two steps: first, the lower limit of the two-sided confidence interval (CI) of the active-control effect versus placebo is calculated from relevant historical trials or a meta-analysis; second, the non-inferiority margin is obtained as a fraction of the lower confidence limit of the control effect, so that part of the control effect is preserved. An alternative method is to use the point estimate, instead of the lower confidence limit, of the active-control effect. The fixed-margin approach based on the lower limit may be ultra-conservative, with an unconditional Type 1 error rate much smaller than the target level, while the margin based on the point estimate is liberal. We derive the Type 1 error rate as a function of the variances of the effect estimates in the historical and the current non-inferiority trials. We also propose an alternative approach for the non-inferiority margin that maintains the target Type 1 error rate. For the endpoint of landmark survival, we conduct simulations to compare the fixed-margin methods and the proposed method. For illustration, we apply the proposed method to an oncology non-inferiority clinical trial to determine an alternative non-inferiority margin.
Keywords: Fixed-margin approach, Landmark survival, Non-inferiority test, Type 1 error
1. Introduction
For indications where effective treatments exist, a placebo-controlled trial is considered unethical if the absence of any treatment may lead to death or serious irreversible morbidity [1,2]. In addition to superiority trials that demonstrate the efficacy of a new treatment over an existing therapy, non-inferiority (NI) trials are often used to evaluate whether the experimental treatment is not unacceptably less efficacious than the current active-control treatment and at the same time is more effective than placebo. The maximum acceptable loss of clinical efficacy is represented by the NI margin [3].
It is critical to set an appropriate NI margin for an NI trial, as the choice of the NI margin dictates the conclusions of the trial and clinical decision-making. The choice of NI margin must satisfy the requirement that the difference in clinical benefits between the two treatments is negligible. In practice, the methods of determining the NI margin vary considerably [4] and often the rationales for defining the margins are not reported [5]. There are several guidance documents available for the design, conduct and analysis of NI clinical trials [6,7]. While ‘clinical judgement’ is mentioned in the guidance, the NI margin in a clinical trial is rarely based on clinical judgement alone. In practice, the determination of the NI margin is also a statistical issue that summarizes information on treatment effects from historical data [8]. Therefore, the NI margin must be pre-specified based on both clinical and statistical reasoning.
There has been extensive research on the statistical determination of the NI margin. The most commonly used method is the fixed-margin approach, which is a two-step process. Here we consider oncology clinical trials where the primary endpoint is the objective response rate or landmark survival rate. In this scenario, a higher response rate indicates better clinical efficacy. The treatment effect is measured as the difference in response rates between the treatment and placebo groups. The first step of the fixed-margin approach is to calculate the lower limit of the two-sided confidence interval (CI) of the clinical effect of active control versus placebo. Typically, is used to obtain the 95% CI. The second step is to multiply the lower confidence limit by a factor of λ to preserve of the active-control effect. The fixed-margin approach has been shown to be conservative, in that the unconditional Type 1 error rate may be substantially lower than the target level in some cases [9,10]. An ultra-conservative NI margin may lead to very large or even infeasible sample sizes for NI studies. Sankoh [10] demonstrated the conservativeness of the fixed-margin approach for continuous endpoints with simulations.
In this article, we derive the formula for the unconditional Type 1 error rate for the NI margin based on the CI and propose a simple alternative for determining the NI margin, which is still conservative but has a Type 1 error rate closer to the target level. For survival endpoints, e.g., landmark survival, we use simulation to demonstrate the conservativeness of the fixed-margin approach and to evaluate the performance of the proposed method. The rest of the article is organized as follows. In Section 2, we describe the NI test and two methods for determining the NI margin, derive the formula for the unconditional Type 1 error rate and propose the modified CI approach. In Section 3, we evaluate the performance of the various methods using simulations. The proposed method is illustrated using a hypothetical NI trial for nivolumab in Section 4. The article ends with a discussion in Section 5.
2. Methods for determining the non-inferiority margin
We consider the oncology clinical trials where the endpoint is the objective response rate or survival rate. Let and be the response rates for the placebo (P), experimental treatment (E) and active-control (C), respectively. Without loss of generality, we assume that a higher response rate indicates a better clinical effect and . Let be the true effect of active control. If the true effect is known, the NI margin is often defined as to preserve of the control effect.
Let denote the difference between the control and the experimental treatments. The hypothesis to be tested in a NI trial is
| (1) |
Let and be the point estimate of the difference and the associated standard error. If the upper confidence limit , the null hypothesis is rejected and the NI conclusion is reached. Because the upper limit of a two-sided CI is used, the Type 1 error rate is . Note that under the null hypothesis with the known effect of and the constancy assumption that the active-control effect in the current NI study is similar to that observed in historical data, the asymptotic estimates should satisfy that
| (2) |
| (3) |
Therefore, the asymptotic distribution of the difference is
| (4) |
The true active-control effect is often unknown and must be estimated from historical data. The estimate may be based on one relevant study with the same patient population and trial design. If there is more than one study, one may estimate using meta-analysis by combining the control effect estimates from individual studies. Here, we consider the situation where one historical trial is available for the estimation of placebo and active-control effects, and we assume that the constancy assumption holds. Let and be the estimates of mean response rates and from the historical trial, respectively. Let be the point estimate of difference and let be the standard error. We start by discussing two common approaches to the determination of the NI margin. The first one is the point-estimate (PE) approach by setting and the second one is to use the lower limit of a two-sided CI
| (5) |
where is the percentile of a standard normal distribution. The CI approach is also called the fixed-margin approach when . Particularly when , the CI approach is sometimes called the 95%–95% method [7]. The Type 1 error rate of an NI test with NI margin is calculated as
| (6) |
where is defined by the PE or CI approaches. The Type 1 error probability in (6) is an unconditional error because the probability incorporates the statistical uncertainty of the estimate of the NI margin [11,12]. If the margin is treated as a fixed known constant after it is estimated, the conditional error probability is controlled at level . However, when is estimated by the PE or CI approach, the unconditional error probability may differ from the target level because of the uncertainty associated with the estimation of the NI margin. The uncertainty may arise from sampling variability within a trial or from between-trial variability.
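As a concrete illustration of the two margin constructions, the sketch below computes a point-estimate margin and a confidence-limit margin from a single historical estimate. The function names, the argument `margin_factor` (the multiplier applied to the estimated control effect) and the example numbers are assumptions made for illustration; they are not notation or data from this article.

```python
from statistics import NormalDist


def pe_margin(effect_hat, margin_factor):
    """Point-estimate margin: a fixed fraction of the estimated control effect."""
    return margin_factor * effect_hat


def ci_margin(effect_hat, se_hat, margin_factor, gamma=0.05):
    """Confidence-limit margin: the same fraction applied to the lower limit
    of the two-sided 100*(1 - gamma)% CI of the control effect."""
    z = NormalDist().inv_cdf(1 - gamma / 2)
    return margin_factor * (effect_hat - z * se_hat)


# Hypothetical historical estimate: control effect 0.20 with standard error 0.05.
pe = pe_margin(0.20, 0.5)
ci = ci_margin(0.20, 0.05, 0.5)
print(f"PE margin = {pe:.3f}, CI margin = {ci:.3f}")
```

The confidence-limit margin is always the smaller of the two, which is why it leads to a stricter NI test.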
Hereafter, the Type 1 error rate under consideration is the unconditional probability. We have the following theorem for the Type 1 error rates of the PE and CI approaches.
Theorem 1
Let and be the Type 1 error rates for the PE and fixed-margin CI approaches, respectively. If and are asymptotically normal as specified in (2) and (3) and and are consistent estimates of and , the asymptotic Type 1 error rates
PROOF. Note that When the PE approach is used, and the Type 1 error rate is
Because and are consistent estimates of and and is asymptotically normal as in equation (4),
Therefore, the Type 1 error for the PE approach is liberal with .
When the fixed-margin or CI approach is used as defined in Equation (5), , the Type 1 error rate is
This indicates the fixed margin CI method is conservative with .
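The calculation behind a result of this kind can be sketched as follows. The notation is assumed here for illustration and is not taken from the article: $\hat\theta$ is the estimated control-minus-experimental difference in the NI trial with standard error $\sigma_\theta$, $\hat\delta$ is the independent historical estimate of the control effect $\delta$ with standard error $\sigma_\delta$, $f$ is the factor applied to the control effect to form the margin, and the test is evaluated on the null boundary $\theta = f\delta$ with both standard errors treated as known.

```latex
\begin{align*}
\hat\theta - f\hat\delta &\sim N\!\left(0,\; \sigma_\theta^2 + f^2\sigma_\delta^2\right), \\
\alpha_{\mathrm{PE}} &= P\!\left(\hat\theta + z_{1-\alpha/2}\,\sigma_\theta < f\hat\delta\right)
  = \Phi\!\left(-\frac{z_{1-\alpha/2}\,\sigma_\theta}{\sqrt{\sigma_\theta^2 + f^2\sigma_\delta^2}}\right), \\
\alpha_{\mathrm{CI}} &= P\!\left(\hat\theta + z_{1-\alpha/2}\,\sigma_\theta < f\,(\hat\delta - z_{1-\gamma/2}\,\sigma_\delta)\right)
  = \Phi\!\left(-\frac{z_{1-\alpha/2}\,\sigma_\theta + f\,z_{1-\gamma/2}\,\sigma_\delta}{\sqrt{\sigma_\theta^2 + f^2\sigma_\delta^2}}\right).
\end{align*}
```

Because $\sigma_\theta \le \sqrt{\sigma_\theta^2 + f^2\sigma_\delta^2} \le \sigma_\theta + f\sigma_\delta$, taking $\gamma = \alpha$ gives $\alpha_{\mathrm{CI}} \le \alpha/2 \le \alpha_{\mathrm{PE}}$ under these assumptions, with equality only when $f\sigma_\delta = 0$.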
Corollary 1
Let . Under the assumptions for Theorem 1, the actual Type 1 error rates for the PE and fixed margin CI approaches are
(7)
(8)
PROOF. The actual Type 1 error rate for the fixed-margin CI approach is
Note that asymptotically follows a standard normal distribution.
The Type 1 error rate for the PE approach is obtained by removing the term in Equation (7).
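The dependence of the unconditional Type 1 error on the two standard errors can be explored numerically. The sketch below uses the normal-approximation expression for a margin of the form `margin_factor` times the estimated control effect (or its lower confidence limit). It is written in terms of the two standard errors rather than their ratio, and all names and example values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def type1_error(sigma_ni, sigma_hist, margin_factor, alpha=0.05, gamma=0.05):
    """Asymptotic unconditional Type 1 error of the NI test.

    sigma_ni      standard error of the NI-trial difference estimate
    sigma_hist    standard error of the historical control-effect estimate
    margin_factor multiplier applied to the control effect to form the margin
    gamma         level of the historical CI; None means point-estimate margin
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_gamma = 0.0 if gamma is None else _STD_NORMAL.inv_cdf(1 - gamma / 2)
    pooled_sd = sqrt(sigma_ni ** 2 + (margin_factor * sigma_hist) ** 2)
    shift = z_alpha * sigma_ni + margin_factor * z_gamma * sigma_hist
    return _STD_NORMAL.cdf(-shift / pooled_sd)


# Hypothetical case: equal standard errors, margin set to half the control effect.
pe_rate = type1_error(1.0, 1.0, 0.5, gamma=None)
ci_rate = type1_error(1.0, 1.0, 0.5, gamma=0.05)
print(f"PE approach: {pe_rate:.4f}, 95%-95% CI approach: {ci_rate:.4f}")
```

In this hypothetical case the point-estimate margin exceeds the 0.025 target while the 95%–95% margin falls well below it, in line with the direction of Theorem 1.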
We have the following remarks about the Type 1 error rates of the NI test.
Remark 1
When , i.e., 100% of the control effect is retained, both the PE and the fixed-margin CI approaches maintain the α level .
Remark 2
If the value of λ is fixed and , is an increasing function of k.
Remark 3
If the control effect is estimated very precisely, with a small standard error, i.e., , then . Then and are both close to .
Remark 4
If the NI trial involves a larger sample size than the historical trials, or the control effect is estimated from a meta-analysis of heterogeneous studies, we expect that . The upper bound of can be obtained by setting in Equation (8).
In order to maintain the target Type 1 error rate for the fixed-margin CI approach, one has to adjust the value so that
(9)
The correct value that is needed for maintaining the target Type 1 error rate is
(10)
The CI approach with the correct level as in (10) is referred to as the modified CI approach, in contrast to the fixed-margin CI approach.
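To make the adjustment concrete, the sketch below solves for the critical value of the historical CI that brings the normal-approximation Type 1 error back to α/2, and then checks the result by plugging it back in. It is derived under the assumptions of independent, asymptotically normal estimates and a margin equal to `margin_factor` times the lower confidence limit. The function names and inputs are hypothetical, and the expression is stated in terms of the two standard errors rather than copied from Equation (10).

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()


def adjusted_z(sigma_ni, sigma_hist, margin_factor, alpha=0.05):
    """Critical value for the historical CI that makes the unconditional
    Type 1 error equal to alpha/2 (requires margin_factor > 0)."""
    z_alpha = N01.inv_cdf(1 - alpha / 2)
    pooled_sd = sqrt(sigma_ni ** 2 + (margin_factor * sigma_hist) ** 2)
    return z_alpha * (pooled_sd - sigma_ni) / (margin_factor * sigma_hist)


def unconditional_error(sigma_ni, sigma_hist, margin_factor, z_hist, alpha=0.05):
    """Type 1 error when the margin uses critical value z_hist for the historical CI."""
    z_alpha = N01.inv_cdf(1 - alpha / 2)
    pooled_sd = sqrt(sigma_ni ** 2 + (margin_factor * sigma_hist) ** 2)
    shift = z_alpha * sigma_ni + margin_factor * z_hist * sigma_hist
    return N01.cdf(-shift / pooled_sd)


# Hypothetical case: equal standard errors, margin set to half the control effect.
z_star = adjusted_z(1.0, 1.0, 0.5)
gamma_star = 2 * (1 - N01.cdf(z_star))
check = unconditional_error(1.0, 1.0, 0.5, z_star)
print(f"z* = {z_star:.3f}, CI level = {100 * (1 - gamma_star):.1f}%, error = {check:.4f}")
```

In this example the adjusted interval is far narrower than a 95% CI, which is what relaxes the margin.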
The conservativeness of the fixed-margin CI approach has been noticed by several authors [11,13]. Rothmann [14] proposed a two-CI procedure based on the CI for and , where the NI will be inferred when
(11)
where is defined as in equation (10). Rothmann's procedure relaxes the conservatism of the 95%–95% fixed-margin method and maintains the correct Type 1 error. This procedure belongs to the category of synthesis methods, which combine data from historical trials and the current NI trial into one analysis [7]. Rothmann's two-CI procedure and the modified CI approach are analytically equivalent and both control the Type 1 error. The two-CI procedure can only be conducted when both the historical data and the NI trial data are available. The proposed modified CI approach is more suitable for the conduct of clinical trials, where the NI margin is defined in advance of the NI trial. It also allows clinical judgement on the appropriateness of the NI margin to inform the planning of the NI trial [7].
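The analytical equivalence can be checked numerically. The sketch below compares a synthesis-type decision rule, assumed here to take the usual normal-approximation form, with a margin-based rule that uses an adjusted critical value for the historical CI. Both rules use the same standard errors, which is what the equivalence relies on. The function names and simulated inputs are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # alpha = 0.05, one-sided level 0.025


def synthesis_rejects(theta_hat, delta_hat, se_ni, se_hist, f):
    """Single combined z-type test on theta_hat - f * delta_hat."""
    return theta_hat - f * delta_hat < -Z * sqrt(se_ni ** 2 + (f * se_hist) ** 2)


def modified_ci_rejects(theta_hat, delta_hat, se_ni, se_hist, f):
    """Compare the NI upper limit with a margin built from an adjusted historical CI."""
    z_star = Z * (sqrt(se_ni ** 2 + (f * se_hist) ** 2) - se_ni) / (f * se_hist)
    margin = f * (delta_hat - z_star * se_hist)
    return theta_hat + Z * se_ni < margin


rng = random.Random(2024)
draws = [
    (rng.gauss(0.0, 0.1), rng.gauss(0.2, 0.1),
     rng.uniform(0.02, 0.08), rng.uniform(0.02, 0.08))
    for _ in range(1000)
]
agree = all(
    synthesis_rejects(t, d, s1, s2, 0.5) == modified_ci_rejects(t, d, s1, s2, 0.5)
    for t, d, s1, s2 in draws
)
print("decisions agree on all draws:", agree)
```

The practical difference is one of timing: the margin-based rule needs a planned NI-trial standard error in order to fix the margin in advance, whereas the synthesis rule uses the observed one.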
In practice, it is difficult to determine the exact value of k, which is the ratio of the standard deviation of the NI trial to that of the historical trial. As shown in Fig. 1, the half-width of the corrected CI is an increasing function of k, so one may use the lower bound of k to set the correct α level. Another choice is to set , where and are the number of subjects in the historical and the current NI trials. This is because the ratio of variances is inversely related to the number of subjects under the hypothesis of equivalent survival. In many situations it is difficult to project the sample size of the NI trial; one may then set a reasonable range for k to obtain bounds on the Type 1 error. Based on equation (10), the modified NI margin based on is a decreasing function of k. For example, if we anticipate that the original NI trial size needs to be doubled, we may set the NI margin to be a value corresponding to . This adjustment may still control the Type 1 error below , without being overly conservative.
Fig. 1.
The ratio of the half-width with the confidence level to.
3. Numerical study
Sankoh [10] conducted numeric studies to show the inflated Type 1 error rates for the PE approach and the deflated Type 1 error rates for the CI approach. Here we focus on the CI approach, which is recommended by the regulatory agencies for setting the NI margin. First, we calculate the correct value that is needed to maintain the correct Type 1 error rate. Second, we run simulations to examine the performance of the two CI approaches and the impact of using an adjusted value for testing the NI of landmark survival rates at a fixed time point.
3.1. Ratio of the NI margins from the 95%–95% fixed margin approach and the modified CI approach
To compare with the 95%–95% fixed margin approach, we examine the reduction of half-width of the CI for constructing the NI margin in Equation (5). Let , where is half-width of the CI that maintains the correct Type 1 error rate in Equation (10) and for the 95%–95% fixed margin approach. The plot of R with respect to different values of k and preservation fraction is shown in Fig. 1. We see clearly that the actual half-width of the CI that leads to the correct Type 1 error rate is much smaller than the full 95% CI. In a typical scenario where and 50% control effect is preserved, . Therefore, the NI margin can be constructed as .
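A ratio of this kind can be tabulated directly. The sketch below assumes the same level α for the NI test and for the unadjusted historical CI, and uses the normal approximation with independent estimates. Here `se_ratio` is taken as the NI-trial standard error divided by the historical standard error, and `margin_factor` is the multiplier applied to the lower confidence limit. Both names, and the grid of values, are assumptions for illustration.

```python
from math import sqrt


def half_width_ratio(se_ratio, margin_factor):
    """R = adjusted half-width / unadjusted half-width of the historical CI.

    se_ratio is (NI-trial standard error) / (historical standard error);
    margin_factor must be positive.
    """
    return (sqrt(se_ratio ** 2 + margin_factor ** 2) - se_ratio) / margin_factor


for se_ratio in (0.5, 1 / sqrt(2), 1.0, 2.0):
    row = ", ".join(
        f"f={f:.2f}: R={half_width_ratio(se_ratio, f):.3f}" for f in (0.25, 0.50, 0.75)
    )
    print(f"SE ratio {se_ratio:.3f} -> {row}")
```

Under these assumptions the ratio always lies strictly between 0 and 1, so the adjusted interval is always narrower than the unadjusted one.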
3.2. Type 1 error rate of the modified CI approach
To evaluate the performance of the NI margins based on the fixed-margin CI approach and the modified CI approach in oncology NI trials, we run additional simulations. For the historical trial, we assume a mixture cure model with a Weibull model for the uncured patients. The overall survival function is
where for the placebo group and for the control group, is the cure or long-term survival rate, a is the shape parameter and is the scale parameter. For the simulation, we set and . The survival curves for the control and placebo groups are shown in Fig. 2. The primary endpoint is the landmark survival rate at month 24. The true difference in the 24-month survival rates is . The NI margin is to preserve of the control effect.
Fig. 2.
Survival curves for the historical data in the simulation.
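For readers who want to experiment with a model of this type, the sketch below evaluates and samples from a mixture cure model with a Weibull distribution for uncured subjects. The Weibull parameterisation (survival exp(-(t/scale)^shape)), the function names and the parameter values are assumptions for illustration; they are not the settings used in this simulation study.

```python
import math
import random


def mixture_cure_survival(t, cure_rate, shape, scale):
    """S(t) = cure_rate + (1 - cure_rate) * exp(-(t / scale) ** shape)."""
    return cure_rate + (1.0 - cure_rate) * math.exp(-((t / scale) ** shape))


def draw_survival_time(rng, cure_rate, shape, scale):
    """Cured subjects never fail (infinite time); the rest follow a Weibull law."""
    if rng.random() < cure_rate:
        return math.inf
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return scale * (-math.log(u)) ** (1.0 / shape)


# Hypothetical parameters, chosen only to exercise the functions.
rng = random.Random(1)
times = [draw_survival_time(rng, 0.2, 1.2, 12.0) for _ in range(20000)]
empirical_24 = sum(t > 24 for t in times) / len(times)
model_24 = mixture_cure_survival(24, 0.2, 1.2, 12.0)
print(f"S(24) = {model_24:.4f}, empirical = {empirical_24:.4f}")
```

The landmark survival rate at month 24 is then simply the survival function evaluated at t = 24, and the curve levels off at the cure rate for large t.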
Suppose that an experimental treatment is under development with the goal to show that the survival rate at month 24 is not inferior to that for the active control by a margin of . The hypothesis to be tested is
The simulation is conducted as follows:
1. Historical data for the active control and placebo are generated from the mixture cure models with sample sizes 100, 150, 200 and 300 per treatment arm.
2. The NI margin is determined as the lower 95% confidence limit per FDA guidance or by the modified CI approach using Equation (9) with . When we do not know the actual sample size , but expect that , so that , the NI margin is set at the minimum value of the lower 95% confidence limit with .
3. The data from the NI trials are generated from the mixture cure model:
(a) Under the constancy assumption, the survival data for the active control are generated from the survival function .
(b) The survival data for the experimental treatment group are generated from the mixture cure model such that .
4. Perform the NI test by comparing the upper bound of the survival difference and the NI margin.
The point estimate and the CI of the landmark survival difference are calculated using the non-parametric Beta product method [15]. The Type 1 error rates are calculated as the proportion of simulations in which the null hypothesis is rejected when it is true. We set α = 0.05 so the target Type 1 error rate is 0.025. Because a new set of historical data is generated in each simulation, the resulting Type 1 error rate is unconditional on the historical data. The simulation results are based on 5,000 simulations with sample size and 400. Here we only present the results for as the results for are similar.
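The four simulation steps can be mimicked in a much simplified form. The sketch below is not the simulation reported here: it replaces the mixture cure model and the Beta product intervals with uncensored binomial landmark survival and Wald intervals, uses hypothetical survival rates and sample sizes, and takes the margin as `f` times the lower confidence limit. It only illustrates how an unconditional Type 1 error is estimated when the historical data are regenerated in every replicate.

```python
import random
from math import sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)


def rate_and_var(rng, n, p):
    """Observed landmark survival proportion and its Wald variance."""
    p_hat = sum(rng.random() < p for _ in range(n)) / n
    return p_hat, p_hat * (1 - p_hat) / n


def simulate(z_hist, n_sims=2000, n=200, p_placebo=0.20, p_control=0.40, f=0.5, seed=7):
    """Unconditional Type 1 error when the margin is f * (lower limit using z_hist)."""
    rng = random.Random(seed)
    p_exp = p_control - f * (p_control - p_placebo)  # experimental arm on the null boundary
    rejections = 0
    for _ in range(n_sims):
        # Step 1: historical trial, control versus placebo.
        pc_h, vc_h = rate_and_var(rng, n, p_control)
        pp_h, vp_h = rate_and_var(rng, n, p_placebo)
        # Step 2: margin from the lower confidence limit of the control effect.
        margin = f * ((pc_h - pp_h) - z_hist * sqrt(vc_h + vp_h))
        # Step 3: NI trial under constancy.
        pc, vc = rate_and_var(rng, n, p_control)
        pe, ve = rate_and_var(rng, n, p_exp)
        # Step 4: reject when the upper limit of (control - experimental) is below the margin.
        if (pc - pe) + Z * sqrt(vc + ve) < margin:
            rejections += 1
    return rejections / n_sims


# Adjusted critical value from the design-stage standard errors (normal approximation).
sigma_ni = sqrt(0.40 * 0.60 / 200 + 0.30 * 0.70 / 200)
sigma_hist = sqrt(0.40 * 0.60 / 200 + 0.20 * 0.80 / 200)
z_adjusted = Z * (sqrt(sigma_ni ** 2 + (0.5 * sigma_hist) ** 2) - sigma_ni) / (0.5 * sigma_hist)

fixed_rate = simulate(Z)
adjusted_rate = simulate(z_adjusted)
print(f"95%-95% margin: {fixed_rate:.4f}, adjusted margin: {adjusted_rate:.4f}")
```

Both calls use the same seed, so the two margins are evaluated on identical simulated trials. Every rejection under the 95%–95% margin is therefore also a rejection under the wider adjusted margin.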
In Fig. 3, we see that the Type 1 error rates based on the 95%–95% fixed margin are substantially lower than 0.025. When the percentage of retained control effect decreases, the Type 1 error rates may drop below 0.005. This indicates that the fixed-margin approach may be ultra-conservative for the NI test of landmark survival rates. The Type 1 error rates for the modified CI approach with adjusted value are shown in Fig. 4. The adjusted value is based on equation (10) with . We see that the Type 1 error rates increase markedly to around 0.02, but all remain below the target level of 0.025.
Fig. 3.
Type 1 error rate for the 95%–95% fixed-margin CI approach.
Fig. 4.
Type 1 error rate for the CI approach with the adjusted value.
When the sample size for the NI trial may be increased in an adaptive clinical trial, one may set the NI margin as if the NI trial sample size would be doubled, i.e., . The resulting Type 1 error rate is shown in Fig. 5. We see that the Type 1 error rate is about 0.01, which is still conservative. In a special case where , . As shown in Fig. 1, the resulting NI margin is about double the width of the 95%–95% fixed NI margin. This shows that the NI margin based on the modified CI approach is still conservative, while being much wider than the 95%–95% fixed margin.
Fig. 5.
Type 1 error rate for the CI approach with the NI trial sample size doubled.
4. Application
For illustration, we apply the proposed method to the overall survival data from the ATTRACTION-2 clinical trial. This is a randomised, double-blind, placebo-controlled, phase 3 trial evaluating the effect of nivolumab versus placebo in patients with advanced gastric or gastro-oesophageal junction cancer refractory to, or intolerant of, at least two previous chemotherapy regimens [16]. The survival data are reconstructed from the plots of the Kaplan-Meier curves of overall survival using the algorithm described in Ref. [17]. The digitized data for the overall survival curves are extracted using WebPlotDigitizer [18]. The Kaplan-Meier curves of the reconstructed data for all patients in the study are shown in Fig. 6.
Fig. 6.
Kaplan-Meier curves by treatment for the reconstructed overall survival data for the ATTRACTION-2 trial.
Suppose that a competitor company is developing an experimental treatment that has non-inferior survival but may reduce side effects and improve quality of life substantially. For simplicity, we assume that the estimated effect of nivolumab in the ATTRACTION-2 trial is unbiased and that there is little variability across different trials of nivolumab for the indication under study. We use the landmark survival rate at month 12 as the endpoint. The difference in the 1-year survival probabilities between the two groups is 0.162 with standard error 0.046 and 95% CI (0.068, 0.246). What would be an appropriate NI margin for a non-inferiority test? The 95%–95% fixed NI margin is 0.034, while the NI margins based on the modified CI approach are 0.140 for and 0.123 for . The 95%–95% NI margin of 0.034 is smaller than the standard error of the 1-year survival difference. Using quality-adjusted time without symptoms of toxicity (Q-TWiST) as the outcome in cancer clinical trials, Revicki et al. [19] recommended that a difference of 10% in overall survival is clinically important and meaningful. If less is known about a specific treatment and/or disease area, a clinically meaningful improvement in survival should be greater than 5%, but not more than 10%. Furthermore, using the NI margin of 0.034 may lead to a sample size greater than 1,000 for the NI trial if indeed there is no difference between the experimental and the control treatments. Therefore, the 95%–95% fixed margin is too tight and not feasible. The alternative NI margin of 0.123 is slightly more conservative than 0.140 and falls within the range of 0.10–0.15 for NI margins for the difference in probability of survival (DPS) at a particular time, i.e., the landmark survival, in Table 3 of the summary by Tanaka et al. [20]. Ideally, the NI margin should be based on both clinical reasoning and historical data. An appropriate NI margin is necessary to maintain the correct Type 1 error rate.
5. Discussion
For the determination of NI margin, we prove that the 95%–95% fixed margin CI approach is ultra-conservative while the PE approach is liberal. For the non-inferiority test of landmark survival in oncology clinical trials, we confirm this conclusion with simulations. The proposed modified CI approach yields the unconditional Type 1 error rates closer to the target level, thus avoiding many false negative results. The NI margin based on the modified CI approach is still conservative, but is typically wider than the 95%–95% margin. The statistically determined NI margin should be evaluated by clinical judgement to ensure that the margin is clinically meaningful.
In this article, we also assume constancy, i.e., that the active-control effect in the current NI study is similar to that observed in historical data. It would be of interest to assess the robustness of the proposed method to unknown bias [21] and non-constancy [22].
Acknowledgement
The authors wish to thank Dr. Ralph Bloomfield and two anonymous reviewers for their many insightful comments and suggestions in the NI study design.
Footnotes
The authors alone are responsible for the content and writing of the paper. The views of the authors are solely their own and do not necessarily reflect the views of AstraZeneca.
References
- 1. Schumi J., Wittes J.T. Through the looking glass: understanding non-inferiority. Trials. 2011;12(1):106. doi: 10.1186/1745-6215-12-106.
- 2. Hahn S. Understanding noninferiority trials. Korean J. Pediatr. 2012;55(11):403. doi: 10.3345/kjp.2012.55.11.403.
- 3. FDA. Guidance for Industry: Non-inferiority Clinical Trials. Center for Biologics Evaluation and Research (CBER).
- 4. Wangge G., Roes K.C., de Boer A., Hoes A.W., Knol M.J. The challenges of determining noninferiority margins: a case study of noninferiority randomized controlled trials of novel oral anticoagulants. CMAJ (Can. Med. Assoc. J.) 2013;185(3):222–227. doi: 10.1503/cmaj.120142.
- 5. Althunian T.A., de Boer A., Groenwold R.H.H., Klungel O.H. Defining the noninferiority margin and analysing noninferiority: an overview. Br. J. Clin. Pharmacol. 2017;83(8):1636–1642. doi: 10.1111/bcp.13280.
- 6. EMA. Guideline on the Choice of the Non-inferiority Margin. 2005.
- 7. FDA. Non-Inferiority Clinical Trials to Establish Effectiveness, Guidance for Industry. November 2016. https://www.fda.gov/media/78504/download
- 8. D'Agostino R.B., Sr., Massaro J.M., Sullivan L.M. Non-inferiority trials: design concepts and issues–the encounters of academic consultants in statistics. Stat. Med. 2003;22(2):169–186. doi: 10.1002/sim.1425.
- 9. Hung H.J., Wang S.-J., O'Neill R. A regulatory perspective on choice of margin and statistical inference issue in non-inferiority trials. Biom. J.: J. Math. Methods Biosci. 2005;47(1):28–36. doi: 10.1002/bimj.200410084.
- 10. Sankoh A.J. A note on the conservativeness of the confidence interval approach for the selection of non-inferiority margin in the two-arm active-control trial. Stat. Med. 2008;27(19):3732–3742. doi: 10.1002/sim.3256.
- 11. James Hung H., Wang S.-J., Tsong Y., Lawrence J., O'Neil R.T. Some fundamental issues with non-inferiority testing in active controlled trials. Stat. Med. 2003;22(2):213–225. doi: 10.1002/sim.1315.
- 12. Lawrence J. Some remarks about the analysis of active control studies. Biom. J. 2005;47(5):616–622. doi: 10.1002/bimj.200410145.
- 13. Holmgren E.B. Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. J. Biopharm. Stat. 1999;9(4):651–659. doi: 10.1081/bip-100101201.
- 14. Rothmann M., Li N., Chen G., Chi G.Y., Temple R., Tsou H.-H. Design and analysis of non-inferiority mortality trials in oncology. Stat. Med. 2003;22(2):239–264. doi: 10.1002/sim.1400.
- 15. Fay M.P., Brittain E.H., Proschan M.A. Pointwise confidence intervals for a survival distribution with small samples or heavy censoring. Biostatistics. 2013;14(4):723–736. doi: 10.1093/biostatistics/kxt016.
- 16. Kang Y.-K., Boku N., Satoh T., Ryu M.-H., Chao Y., Kato K., Chung H.C., Chen J.-S., Muro K., Kang W.K., Yeh K.-H., Yoshikawa T., Oh S.C., Bai L.-Y., Tamura T., Lee K.-W., Hamamoto Y., Kim J.G., Chin K., Oh D.-Y., Minashi K., Cho J.Y., Tsuda M., Chen L.-T. Nivolumab in patients with advanced gastric or gastro-oesophageal junction cancer refractory to, or intolerant of, at least two previous chemotherapy regimens (ONO-4538-12, ATTRACTION-2): a randomised, double-blind, placebo-controlled, phase 3 trial. The Lancet. 2017;390(10111):2461–2471. doi: 10.1016/S0140-6736(17)31827-5.
- 17. Guyot P., Ades A., Ouwens M.J., Welton N.J. Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves. BMC Med. Res. Methodol. 12(1). doi: 10.1186/1471-2288-12-9.
- 18. Rohatgi A. WebPlotDigitizer, version 4.2 (Apr. 2019). https://automeris.io/WebPlotDigitizer
- 19. Revicki D.A., Feeny D., Hunt T.L., Cole B.F. Analyzing oncology clinical trial data using the Q-TWiST method: clinical importance and sources for health state preference data. Qual. Life Res. 2006;15(3):411–423. doi: 10.1007/s11136-005-1579-7.
- 20. Tanaka S., Kinjo Y., Kataoka Y., Yoshimura K., Teramukai S. Statistical issues and recommendations for noninferiority trials in oncology: a systematic review. Clin. Cancer Res. 2012;18(7):1837–1847. doi: 10.1158/1078-0432.CCR-11-1653.
- 21. Odem-Davis K., Fleming T.R. Adjusting for unknown bias in noninferiority clinical trials. Stat. Biopharm. Res. 2013;5(3):248–258. doi: 10.1080/19466315.2013.795910.
- 22. Liu Q., Li Y., Odem-Davis K. On robustness of noninferiority clinical trial designs against bias, variability, and nonconstancy. J. Biopharm. Stat. 2015;25(1):206–225. doi: 10.1080/10543406.2014.923738.






