Biometrika. 2024 Dec 17;112(2):asae069. doi: 10.1093/biomet/asae069

Improving randomized controlled trial analysis via data-adaptive borrowing

Chenyin Gao 1, Shu Yang 2, Mingyang Shan 3, Wenyu Ye 4, Ilya Lipkovich 5, Douglas Faries 6
PMCID: PMC11972012  PMID: 40191435

Summary

In recent years, real-world external controls have grown in popularity as a tool to empower randomized placebo-controlled trials, particularly in rare diseases or cases where balanced randomization is unethical or impractical. However, as external controls are not always comparable to the trials, direct borrowing without scrutiny may heavily bias the treatment effect estimator. Our paper proposes a data-adaptive integrative framework capable of preventing unknown biases of the external controls. The adaptive nature is achieved by dynamically sorting out a comparable subset of external controls via bias penalization. Our proposed method can simultaneously achieve (a) the semiparametric efficiency bound when the external controls are comparable and (b) selective borrowing that mitigates the impact of the existence of incomparable external controls. Furthermore, we establish statistical guarantees, including consistency, asymptotic distribution and inference, providing Type-I error control and good power. Extensive simulations and two real-data applications show that the proposed method leads to improved performance over the trial-only estimator across various bias-generating scenarios.

Keywords: Adaptive lasso, Calibration weighting, Dynamic borrowing, Study heterogeneity

1. Introduction

Randomized controlled trials have been considered the gold standard of clinical research to provide confirmatory evidence on the safety and efficacy of treatments. However, randomized placebo-controlled trials are expensive, require lengthy recruitment periods and may not always be ethical, feasible or practical in rare or life-threatening diseases. In response, quality patient-level real-world data from disease registries and electronic health records have become increasingly available and can generate fit-for-purpose real-world evidence to facilitate healthcare and regulatory decision-making (FDA, 2021). Studies using real-world data may have advantages over randomized placebo-controlled trials, including longer observation windows, larger and more heterogeneous patient populations, and reduced burden on investigators and patients (Visvanathan et al., 2017; Colnet et al., 2020). There is interest in novel clinical trial designs that leverage external controls from real-world data to improve the efficiency of randomized placebo-controlled trials while maintaining robust evidence on the safety and efficacy of treatments (Silverman, 2018; FDA, 2019; Ghadessi et al., 2020). The focus of this paper is on hybrid control arm designs using real-world data, where the concurrent control arm is augmented with real-world external controls to form a hybrid comparator group.

The concept of hybrid controls dates back to Pocock (1976), who combined the trial data and historical controls by adjusting for data-source-level differences. Since then, numerous methods for using external controls have been developed. However, regulatory approvals of external control arm designs as confirmatory trials are rare and limited to ultra-rare diseases, pediatric trials or oncology trials (FDA, 2014, 2016; Odogwu et al., 2018). Concerns regarding the validity and comparability of the external controls have limited their use in a broader context. Guidance documents from regulatory agencies, including the recent FDA draft guidance (FDA, 2023), note several potential issues with the external controls, including selection bias, lack of concurrency, differences in the definitions of covariates, treatments or outcomes, and unmeasured confounding (FDA, 2001, 2019, 2023). Without proper scrutiny, each of these concerns may lead to biased treatment effect estimates and misleading conclusions.

Selection bias is a type of data heterogeneity often encountered in nonrandomized studies. In the context of external control augmentation, it arises when the real-world baseline subjects’ characteristics differ from those in the trial data. Multiple methods are available to adjust for selection bias by balancing the baseline covariates’ distributions across the different data sources. For example, matching and subclassification approaches select a subset of comparable external controls to construct the hybrid control arm (Stuart, 2010). Matching on the propensity score or the probability of trial inclusion can balance numerous baseline covariates simultaneously (Rosenbaum & Rubin, 1983). Weighting approaches that reweight external controls using the probability of trial inclusion or other balancing scores have also been proposed, e.g., empirical likelihood (Qin et al., 2015), entropy balancing (Lee et al., 2022b; Wu & Yang, 2022b; Chu et al., 2023), constrained maximum likelihood (Chatterjee et al., 2016; Zhang et al., 2020) and Bayesian power priors (Neuenschwander et al., 2010; van Rosmalen et al., 2018). Furthermore, matching or weighting can be combined with outcome modelling to enhance robustness against model misspecification in addressing selection bias of external controls (Li et al., 2023).

Differences in the outcomes may still exist between the concurrent controls and the external controls after matching or weighting, owing to differences in study settings, time frames, data quality or the definitions of covariates or outcomes (Phelan et al., 2017). Methods have been proposed to adaptively select the degree of borrowing or to adjust the outcomes of external controls based on observed outcome differences with the concurrent controls. Some researchers suggested first testing for heterogeneity in control outcomes before deciding whether to incorporate external subjects into the hybrid control arm (Viele et al., 2014; Li et al., 2023). More dynamic borrowing approaches have also been proposed, including matching and bias adjustment (Stuart & Rubin, 2008), power priors (Ibrahim & Chen, 2000; Neuenschwander et al., 2009), Bayesian hierarchical models including meta-analytic predictive priors (Neuenschwander et al., 2010; Schoenfeld et al., 2019) and commensurate priors (Hobbs et al., 2011). While these existing methods seem appealing, simulation studies could not identify a single approach that performs well across all scenarios where hidden biases exist (Shan et al., 2022). The surveyed Bayesian methods often have inflated Type-I errors, while frequentist methods suffer from lower power when hidden biases exist. Nearly all methods performed poorly in the presence of unmeasured confounding and could not simultaneously minimize bias and gain power. Furthermore, many existing methods rely on parametric assumptions that are sensitive to model misspecification and cannot capture the complex relationships that are prevalent in practice.

In this paper, we propose an approach to achieve an efficient estimation of treatment effects that is robust to various potential discrepancies that may arise in the external controls. When handling the selection bias of external controls, our proposal is based on calibration weighting (Lee et al., 2022b) so that the covariate distribution of external controls matches with that of the trial subjects. Furthermore, leveraging semiparametric theory, we develop an integrative augmented calibration weighting estimator, motivated by the efficient influence function (Bickel et al., 1998; Tsiatis, 2006), which is semiparametrically efficient and doubly robust against model misspecification. Despite the potential to view the selection bias problem as a generalizability or transportability issue (Lee et al., 2022b), our framework fundamentally diverges from theirs as our context encompasses the outcomes from both the trial data and external controls, while Lee et al. (2022b) solely considered the trial outcomes.

To deal with potential outcome heterogeneity, we develop a selective borrowing framework to determine an optimal subset of the external controls for integration. Specifically, we introduce a bias parameter for each external subject that encodes his or her comparability with the concurrent controls. To prevent bias in the integrative estimator, the goal is to select the comparable external controls with zero bias and exclude any others with nonzero bias. Thus, this formulation recasts the selective borrowing strategy as a model selection problem, which can be solved by penalized estimation (e.g., with the adaptive lasso penalty; Zou, 2006). After the selection process, the comparable external controls are used to construct the integrative estimator. Prior works, such as those by Chen et al. (2021), Liu et al. (2021) and Zhai & Han (2022), although able to identify biases, exclude the entire external sample when confronted with incomparability. Moreover, compared to these existing selective borrowing approaches, our method leverages off-the-shelf machine learning models to achieve semiparametric efficiency and does not require stringent parametric assumptions on the distribution of outcomes.

2. Methodology

2.1. Notation, assumptions and objectives

Let Inline graphic represent a randomized placebo-controlled trial and Inline graphic represent an external control source, which contain Inline graphic and Inline graphic subjects, respectively. The total sample size is Inline graphic. An extension to multiple external control groups is discussed in the Supplementary Material. A total of Inline graphic and Inline graphic subjects receive the active treatment and control treatment in Inline graphic, while we assume that all Inline graphic subjects in Inline graphic receive the control. Each observation Inline graphic comprises the outcomes Inline graphic, the treatment assignment Inline graphic and a set of baseline covariates Inline graphic. Similarly, each observation Inline graphic comprises Inline graphic, Inline graphic and Inline graphic. Let Inline graphic represent a data source indicator, which is 1 for all subjects Inline graphic and 0 for all subjects Inline graphic. To sum up, an independent and identically distributed sample Inline graphic is observed, where Inline graphic. Let Inline graphic denote the potential outcomes under treatment Inline graphic (Rubin, 1974). The causal estimand of interest is defined as the average treatment effect among the trial population, Inline graphic, where Inline graphic for Inline graphic. The clinical trials for treatment effect estimation satisfy the following assumption.

Assumption 1

(Consistency, randomization and positivity).

Suppose that

  • i.

    $Y = A Y(1) + (1 - A) Y(0)$,

  • ii.

    $Y(a) \perp A \mid (X, \delta = 1)$ for $a = 0, 1$ and

  • iii.
    the known treatment propensity score satisfies
    $0 < \mathrm{pr}(A = 1 \mid X = x, \delta = 1) < 1$

    for all $x$ such that $\mathrm{pr}(\delta = 1 \mid X = x) > 0$, where $\delta$ is the data source indicator.

Assumption 1 is standard in the causal inference literature (Rosenbaum & Rubin, 1983; Imbens, 2004) and holds for well-controlled clinical trials, where it is guaranteed by the randomization mechanism. Under Assumption 1, Inline graphic is identifiable from the trial data.

Moreover, the external controls should ideally be comparable with the concurrent controls.

Assumption 2

(External control compatibility).

Suppose that

  • i.

    $E\{Y(0) \mid X, \delta = 1\} = E\{Y(0) \mid X, \delta = 0\}$ and

  • ii.

    $\mathrm{pr}(\delta = 1 \mid X = x) > 0$ for all $x$ such that $f(x \mid \delta = 0) > 0$, where $f(\cdot \mid \delta = 0)$ denotes the covariate density of the external controls.

Assumption 2 states that the conditional mean of Inline graphic is the same for the trial data and external controls. This assumption holds if Inline graphic captures all the outcome predictors that are correlated with Inline graphic. The FDA (2023) guidance on drug development in rare diseases identifies five main concerns regarding the use of external controls: (i) selection bias, (ii) unmeasured confounding, (iii) lack of concurrency, (iv) data quality and (v) outcome validity. Assumption 2 does not require the covariate distribution of the external controls to be the same as that of the trial data, a discrepancy referred to as selection bias in the guidance. Under Assumption 2, borrowing external controls to improve treatment effect estimation is similar to a transportability or covariate shift problem. However, the presence of concerns (ii)–(v) can result in violations of Assumption 2. Our paper has two main objectives: (i) under Assumption 2, similarly to the work of Li et al. (2023), we develop a semiparametrically efficient and robust strategy to borrow external controls to improve estimation while correcting for selection bias (§ 2.2); (ii) considering that Assumption 2 can be violated, we incorporate a selective borrowing procedure that detects the biases and retains only a subset of comparable external controls for integration (§ 2.3).

2.2. Semiparametric efficient estimation under the ideal situation

Based on semiparametric theory (Bickel et al., 1998), we derive efficient and robust estimators for Inline graphic under Assumptions 1 and 2. The derivation arrives at the same estimator as Li et al. (2023) and will serve as the basis for our selective borrowing strategy. The semiparametric model is attractive because it exploits the observed data without making assumptions about the nuisance parts of the data-generating process that are not of substantive interest. We derive the efficient influence function of Inline graphic in Theorem 1 below, which serves as the foundational component of our proposed framework.

Theorem 1.

Under Assumptions 1 and 2, the efficient influence function of Inline graphic is  

Theorem 1.

where  

Theorem 1.

Based on Theorem 1, the semiparametric efficiency bound for Inline graphic is Inline graphic. Hence, a principled estimator can be motivated by solving the empirical analogue of Inline graphic for Inline graphic.

Let the estimators of Inline graphic be Inline graphic, and define Inline graphic  Inline graphic. Then, by solving the empirical version of the efficient influence function for Inline graphic, we have

2.2. (1)

We now discuss the estimators for the nuisance functions Inline graphic. To estimate Inline graphic, Inline graphic and Inline graphic, one can follow the standard approach by fitting parametric models based on the trial data.
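The doubly robust structure underlying (1) can be illustrated with a trial-only augmented inverse probability weighting estimator. The sketch below is hedged: the data-generating model, the deliberately crude (zero) outcome model and the 1:1 randomization ratio are illustrative assumptions, and the paper's integrative estimator additionally borrows the calibrated external controls.

```python
import numpy as np

def aipw_trial(Y, A, mu1, mu0, pi):
    """Trial-only augmented inverse probability weighting estimate of the
    average treatment effect with a known randomization probability pi.
    mu1, mu0 are per-subject outcome-model predictions under each arm."""
    scores = (A * (Y - mu1) / pi
              - (1 - A) * (Y - mu0) / (1 - pi)
              + mu1 - mu0)
    return scores.mean()

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=n)
A = rng.binomial(1, 0.5, size=n)                 # 1:1 randomization
Y = 1.0 * A + X + rng.normal(scale=0.5, size=n)  # true effect = 1
# Even with a deliberately crude (zero) outcome model, randomization
# keeps the inverse probability weighting part consistent.
tau = aipw_trial(Y, A, mu1=np.zeros(n), mu0=np.zeros(n), pi=0.5)
print(tau)
```

Plugging in fitted outcome means instead of zeros shrinks the variance without changing consistency, which is the double robustness discussed below.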

For estimating the weight Inline graphic, a direct approach is to predict Inline graphic, which, however, is unstable because it inverts estimated probabilities. To stabilize the weighting, the key insight is the central role of Inline graphic in balancing the covariate distribution between the two groups: Inline graphic for any Inline graphic, which is a Inline graphic-dimensional function of Inline graphic. Thus, we estimate Inline graphic by calibrating the covariate balance between the trial data and external controls. In particular, we assign a weight Inline graphic to each subject Inline graphic and then solve the following optimization problem for Inline graphic:

2.2.

subject to (i) Inline graphic and (ii) Inline graphic. First, Inline graphic is the entropy of the weights; minimizing this criterion ensures that the calibration weights are not too far from uniform, which limits the variability due to heterogeneous weights. Constraint (i) is a standard normalization condition for the weights. Constraint (ii) forces the empirical moments of the covariates to match after calibration, leading to better-matched distributions between the trial data and the external controls.

The optimization problem can be solved by constrained convex optimization. The estimated calibration weight is Inline graphic, and Inline graphic solves Inline graphic, which is the Lagrangian dual of the optimization problem. The dual problem also shows that the calibration weighting approach implicitly posits a log-linear regression model for Inline graphic. We refer to Inline graphic with calibration weights as the augmented calibration weighting estimator Inline graphic.
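The calibration step can be sketched as an entropy-balancing problem solved through its Lagrangian dual, under which the weights take the exponential-tilting (log-linear) form noted above. The Newton iteration and the toy covariate shift below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def entropy_balance(X_ext, target_moments, n_iter=100, tol=1e-10):
    """Entropy-balancing calibration weights for the external controls:
    minimize the entropy criterion subject to matching the trial-arm
    covariate moments, solved through the Lagrangian dual by Newton's
    method (the weights are exponential in the dual variables)."""
    n, p = X_ext.shape
    lam = np.zeros(p)                        # dual variables
    for _ in range(n_iter):
        eta = X_ext @ lam
        w = np.exp(eta - eta.max())          # shift for numerical stability
        w /= w.sum()                         # normalized weights
        grad = X_ext.T @ w - target_moments  # dual gradient
        if np.abs(grad).max() < tol:
            break
        Xc = X_ext - (w[:, None] * X_ext).sum(axis=0)
        hess = (w[:, None] * Xc).T @ Xc      # dual Hessian: weighted covariance
        lam -= np.linalg.solve(hess + 1e-10 * np.eye(p), grad)
    return w

rng = np.random.default_rng(0)
X_trial = rng.normal(0.5, 1.0, size=(300, 2))   # trial covariates
X_ext = rng.normal(0.0, 1.0, size=(500, 2))     # shifted external controls
w = entropy_balance(X_ext, X_trial.mean(axis=0))
# The weighted external covariate means now match the trial means.
print(np.abs(X_ext.T @ w - X_trial.mean(axis=0)).max())
```

The dual problem is strictly convex, so Newton's method converges quickly whenever the trial moments lie inside the convex hull of the external covariates.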

Remark 1.

The variance ratio Inline graphic quantifies the relative residual variability of Inline graphic given Inline graphic between the trial data and external controls. In general, estimating the conditional variance ratio involves nonparametric regression, which can be challenging; see Shen et al. (2020) and the references therein. Fortunately, the consistency of Inline graphic does not rely on the correct specification of Inline graphic. For example, if Inline graphic is set to zero, Inline graphic reduces to the trial-only estimator without borrowing any external information, which is always consistent. To leverage external information and estimate Inline graphic in practice, we can make the simplifying homoscedasticity assumption that the residual variances of Inline graphic after adjusting for Inline graphic are constant across studies. In this case, Inline graphic can be estimated by Inline graphic.
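Under the homoscedasticity simplification in Remark 1, the ratio reduces to a ratio of pooled residual variances. The orientation of the ratio and the synthetic residuals below are illustrative assumptions.

```python
import numpy as np

def variance_ratio(resid_trial, resid_ext):
    """Homoscedastic estimate of the variance ratio in Remark 1 as a
    ratio of pooled residual variances, computed from the residuals of
    outcome regressions fitted separately in each data source. The
    orientation of the ratio here is an illustrative assumption."""
    return resid_trial.var(ddof=1) / resid_ext.var(ddof=1)

rng = np.random.default_rng(3)
# Synthetic residuals: the external source is noisier (sd 2 vs sd 1).
r_hat = variance_ratio(rng.normal(0, 1.0, 500), rng.normal(0, 2.0, 800))
print(r_hat)  # close to (1/2)^2 = 0.25
```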

We show that Inline graphic has the following desirable properties. (i) Local efficiency: Inline graphic achieves the semiparametric efficiency bound if the nuisance functions are correctly specified. (ii) Double robustness: Inline graphic is consistent for Inline graphic if either the model for Inline graphic or that for Inline graphic is correct; see the proof in the Supplementary Material.

Doubly robust estimators were initially developed to gain robustness to parametric misspecification, but are now known to also be robust to approximation errors from machine learning methods (e.g., Chernozhukov et al., 2018). We investigate this doubly robust feature for the proposed estimator Inline graphic, and use flexible semiparametric or nonparametric methods to estimate Inline graphic (Inline graphic), Inline graphic and Inline graphic in (1). First, we consider the method of sieves (Chen, 2007) for Inline graphic. Compared with other nonparametric methods such as kernels, the method of sieves is particularly well suited to calibration weighting. We consider general sieve basis functions such as power series, Fourier series, splines, wavelets and artificial neural networks; see Chen (2007) for a comprehensive review. The number of basis functions can be selected by cross-validation. Second, we consider flexible outcome models, e.g., generalized additive models, kernel regression and the method of sieves, for Inline graphic (Inline graphic). Using flexible methods alleviates the bias arising from misspecified parametric models. The following regularity conditions are required for the nuisance function estimators.
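A minimal sketch of sieve nuisance estimation with the basis dimension chosen by cross-validation, assuming a one-dimensional covariate and a power-series basis; splines or wavelets would slot in the same way.

```python
import numpy as np

def poly_design(x, d):
    # Power-series sieve basis: 1, x, ..., x^d.
    return np.vander(x, d + 1, increasing=True)

def sieve_fit(x, y, max_degree=8, n_folds=5, seed=0):
    """Sieve regression with the number of basis functions chosen by
    K-fold cross-validation (a minimal sketch of the flexible nuisance
    estimation; the basis family and tuning grid are illustrative)."""
    folds = np.random.default_rng(seed).permutation(len(x)) % n_folds
    best_d, best_mse = 1, np.inf
    for d in range(1, max_degree + 1):
        mse = 0.0
        for k in range(n_folds):
            tr, te = folds != k, folds == k
            beta, *_ = np.linalg.lstsq(poly_design(x[tr], d), y[tr], rcond=None)
            mse += np.mean((poly_design(x[te], d) @ beta - y[te]) ** 2)
        if mse < best_mse:
            best_d, best_mse = d, mse
    beta, *_ = np.linalg.lstsq(poly_design(x, best_d), y, rcond=None)
    return best_d, beta

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 400)
y = np.sin(2 * x) + rng.normal(scale=0.2, size=400)
d, beta = sieve_fit(x, y)
print(d)  # cross-validation picks a nonlinear basis (degree > 1)
```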

Assumption 3.

For a function Inline graphic with a generic random variable Inline graphic, define its Inline graphic norm as Inline graphic. Assume that

Assumption 3 is a set of typical regularity conditions for Inline graphic-estimation to achieve rate double robustness (Van der Vaart, 2000). Under these regularity conditions, our proposed framework can incorporate flexible methods for estimating the nuisance functions, while maintaining parametric rate consistency for Inline graphic.

Theorem 2.

Under Assumptions 1–3, we have Inline graphic, where Inline graphic. If Inline graphic, Inline graphic achieves semiparametric efficiency.

Theorem 2 motivates variance estimation by Inline graphic, which is consistent for Inline graphic under Assumptions 1–3.

2.3. Bias detection and selective borrowing

In practical situations, Assumption 2 may not hold, and the augmentation in (1) can be biased. We develop a selective borrowing framework to select external subjects that are comparable with the concurrent controls for integration. To account for potential violations, we introduce a vector of bias parameters Inline graphic for all Inline graphic, where Inline graphic. When Assumption 2 holds, we have Inline graphic. Otherwise, there exists at least one Inline graphic such that Inline graphic. To prevent bias in Inline graphic from incomparable external controls, the goal is to select the comparable subset with Inline graphic and exclude any others with Inline graphic.

Let Inline graphic be a consistent estimator for Inline graphic where Inline graphic is a consistent estimator for Inline graphic. Let Inline graphic be an initial estimator for Inline graphic. We propose a refined estimator of Inline graphic by penalized estimation:

2.3. (2)

Here Inline graphic is the estimated variance of Inline graphic, Inline graphic is the adaptive lasso penalty term and Inline graphic are two tuning parameters. Intuitively, if Inline graphic is close to zero, the associated penalty will be large, which shrinks the estimate Inline graphic towards zero. According to Zou (2006), Huang et al. (2008) and Lin et al. (2009), the adaptive lasso penalty leads to desirable selection properties under the following regularity conditions.
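When the penalized criterion separates across external subjects, each coordinate admits a closed-form adaptive-lasso solution by soft-thresholding. The exact objective written in the docstring, the tuning values and the toy biases below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def adaptive_lasso_bias(b_init, var_b, lam, nu=2.0):
    """Coordinate-wise solution of an adaptive-lasso bias penalization in
    the spirit of (2): each term (b - b_init)^2 / (2 var_b)
    + lam |b| / |b_init|^nu separates across external subjects and is
    solved by soft-thresholding b_init. Small initial biases receive a
    large penalty weight, so they are set exactly to zero."""
    thresh = lam * var_b / np.maximum(np.abs(b_init), 1e-12) ** nu
    b_hat = np.sign(b_init) * np.maximum(np.abs(b_init) - thresh, 0.0)
    return b_hat, b_hat == 0.0   # zero estimated bias => borrowed subject

# Eight roughly comparable external subjects and two clearly biased ones.
b_init = np.array([0.05, -0.02, 0.04, 0.01, -0.03, 0.02, 0.0, -0.05,
                   2.1, -1.8])
b_hat, keep = adaptive_lasso_bias(b_init, var_b=0.04, lam=0.1)
print(keep)
```

The two large initial biases survive thresholding and their subjects are excluded, while the small noisy biases are shrunk exactly to zero.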

Assumption 4.

Suppose that

  • i.

    Inline graphic and Inline graphic for all Inline graphic ,

  • ii.

    there exist constants Inline graphic and Inline graphic such that Inline graphic , where Inline graphic and Inline graphic are the smallest and largest eigenvalues of Inline graphic ,

  • iii.

    Inline graphic , where Inline graphic , and

  • iv.

    Inline graphic and Inline graphic .

Lemma 1.

Suppose that the assumptions in Theorem 2 and Assumption 4 hold except that Assumption 2 may be violated. We have Inline graphic.

Lemma 1 shows that the adaptive lasso penalty can consistently select the zero-valued bias parameters when an Inline graphic-consistent initial estimator Inline graphic and proper choices of Inline graphic are used, provided that the minimum of the nonzero biases Inline graphic does not diminish too fast and the initial estimator Inline graphic is sufficiently accurate. In practice, the initial estimator Inline graphic can be obtained by leveraging off-the-shelf machine learning models with a guaranteed convergence rate, and Inline graphic are selected by minimizing the mean squared error using cross-validation. Given Inline graphic, the selected set of comparable external controls is Inline graphic. The modified integrative estimator is

2.3. (3)

where Inline graphic is the estimated function of Inline graphic, which is used to adjust for changes in the covariate distribution from all external controls in Inline graphic to Inline graphic.

Following the suggestions of Ho et al. (2007) to improve finite-sample performance, nearest-neighbour matching based on the estimated probability of trial inclusion Inline graphic is performed after selecting the comparable subset Inline graphic, which ensures a more balanced allocation ratio between the treated group and the hybrid control arm; see Algorithm 1 below for an overview of our selective borrowing framework.
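The matching refinement can be sketched as follows; the greedy nearest-neighbour rule on estimated trial-inclusion probabilities is an illustrative simplification, not necessarily the paper's exact matching rule.

```python
import numpy as np

def nn_select(p_ext, p_trial, n_needed):
    """Pick the n_needed external controls whose estimated trial-inclusion
    probabilities lie nearest to those of the trial subjects: a greedy
    sketch of the nearest-neighbour refinement of Ho et al. (2007)."""
    dist = np.abs(p_ext[:, None] - p_trial[None, :]).min(axis=1)
    return np.argsort(dist)[:n_needed]

p_ext = np.array([0.05, 0.42, 0.38, 0.90, 0.41])   # external controls
p_trial = np.array([0.40, 0.45, 0.35])             # concurrent controls
idx = nn_select(p_ext, p_trial, n_needed=3)
print(sorted(idx.tolist()))  # the three nearest externals: [1, 2, 4]
```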

Algorithm 1.

Proposed selective integrative estimator.

   Input: a randomized controlled trial with size Inline graphic and external controls.

  • Step 1. Fit the models for the outcome means Inline graphic and weights Inline graphic.

  • Step 2. Construct the initial estimator Inline graphic for the bias parameter Inline graphic.

  • Step 3. Select the comparable subset Inline graphic via the bias penalization (2).

  • Step 4. If Inline graphic, then perform nearest-neighbour matching to select Inline graphic external controls as the final Inline graphic; otherwise, jump to Step 5.

  • Step 5. Compute Inline graphic in (3) using the selected external controls in Inline graphic.
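A compact end-to-end toy run of Algorithm 1 illustrates the flow. It is a hedged sketch: there are no covariates, so Step 1 reduces to arm means and the matching in Step 4 is skipped, and the thresholding rule and every number below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
y_trt = rng.normal(1.0, 1.0, 100)          # treated arm, true effect 1
y_ctl = rng.normal(0.0, 1.0, 30)           # small concurrent control arm
y_ext = np.r_[rng.normal(0.0, 1.0, 150),   # comparable externals
              rng.normal(5.0, 1.0, 50)]    # incomparable externals

# Step 2: initial per-subject bias estimates against the concurrent mean.
b_init = y_ext - y_ctl.mean()
# Step 3: adaptive-lasso-style soft-threshold; zero bias => borrow.
lam, nu = 1.0, 2.0
thresh = lam / np.maximum(np.abs(b_init), 1e-12) ** nu
b_hat = np.sign(b_init) * np.maximum(np.abs(b_init) - thresh, 0.0)
keep = b_hat == 0.0

# Step 5: difference in means against the hybrid control arm.
tau_sel = y_trt.mean() - np.r_[y_ctl, y_ext[keep]].mean()
tau_full = y_trt.mean() - np.r_[y_ctl, y_ext].mean()
print(keep.sum(), round(tau_sel, 2), round(tau_full, 2))
```

Full borrowing is dragged away from the true effect by the 50 incomparable externals, while selective borrowing stays close to it and still enlarges the control arm.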

We show the efficiency gain of the proposed estimator compared to the trial-only estimator.

Theorem 3.

Suppose that the assumptions in Theorem 2 and Assumption 4 hold except that Assumption 2 may be violated. Let Inline graphic. The reduction of the asymptotic variance of Inline graphic compared to the trial-only estimator is  

Theorem 3. (4)

which is strictly positive unless Inline graphic or Inline graphic or Inline graphic for all Inline graphic such that Inline graphic.

We derive (4) using the orthogonality of the efficient influence function of Inline graphic to the nuisance tangent space, and relegate the details to the Supplementary Material. Theorem 3 showcases the advantage of including external controls in a data-adaptive manner: the asymptotic variance of Inline graphic is strictly smaller than that of the trial-only estimator unless the external controls are excessively noisy, i.e., Inline graphic, or the compatible subset Inline graphic of the external controls is empty, i.e., Inline graphic, or the covariate Inline graphic captures all the variability of Inline graphic in the trial data, i.e., Inline graphic. Below, we establish the asymptotic properties and provide a valid inferential framework for the proposed integrative estimator; more details are provided in the Supplementary Material.

Theorem 4.

Suppose that the assumptions in Theorem 2 and Assumption 4 hold except that Assumption 2 may be violated. We have Inline graphic. Furthermore, the Inline graphic confidence interval Inline graphic for Inline graphic can be constructed as  

Theorem 4.

where Inline graphic is a variance estimator of Inline graphic, Inline graphic is the Inline graphic quantile for the standard normal distribution and Inline graphic satisfies Inline graphic as Inline graphic.
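The interval in Theorem 4 is a standard Wald construction from the stated normal limit; a minimal sketch follows, with an arbitrary point estimate and standard error.

```python
from statistics import NormalDist

def wald_ci(tau_hat, se_hat, alpha=0.05):
    """Wald-type (1 - alpha) confidence interval from the normal limit in
    Theorem 4; se_hat is the estimated standard error of the estimator."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile
    return tau_hat - z * se_hat, tau_hat + z * se_hat

lo, hi = wald_ci(1.2, 0.3)   # illustrative point estimate and s.e.
print(round(lo, 3), round(hi, 3))
```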

3. Simulation

In this section, we evaluate the finite-sample performance of the proposed framework to estimate treatment effects under potential bias scenarios via plasmode simulations. First, a set of Inline graphic baseline covariates Inline graphic is generated by mimicking the correlation structure and the moments (up to the sixth) of variables from an oncology randomized placebo-controlled trial (i.e., the trial data) and the Flatiron Health Spotlight Phase 2 cohort (© 2020 Flatiron Health, all rights reserved; external controls).

Next, we generate the data source indicator Inline graphic as Inline graphic given the sample sizes Inline graphic, where Inline graphic represents an unmeasured confounder. The treatment assignment for the trial data is completely at random (i.e., Inline graphic), while all external subjects receive the control (i.e., Inline graphic). The outcomes Inline graphic are generated as

3.

We consider three data-generating scenarios in Table 1(a), where Inline graphic is chosen adaptively to ensure the desired sample sizes Inline graphic, and Inline graphic are chosen empirically based on model fits to the observed oncology clinical trial data. In all scenarios, we use the linear predictor of Inline graphic to fit Inline graphic; thus, the models are correctly specified under the model choices Inline graphic, where the linear predictor of Inline graphic governs the true data generation, but are misspecified under the choices Inline graphic, where the data generation depends on a new set of covariates Inline graphic, which includes the quadratic and cubic terms of the Inline graphicth and Inline graphicth covariates (i.e., Inline graphic) in addition to the baseline covariates Inline graphic. Moreover, we use the cross-fitting procedure to select tuning parameters for the gradient boosting model.

Table 1:

Simulation settings: (a) model choices (C and W), where Inline graphic  Inline graphic, and (b) descriptions of the five estimators

(a) Model choices
Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic Inline graphic
(b) Estimators
Inline graphic The augmented inverse probability weighting estimator without borrowing (Cao et al., 2009)
Inline graphic The integrative augmented calibration weighting estimator with full borrowing (Li et al., 2023)
Inline graphic The data-adaptive integrative estimator using the linear regressions for Inline graphic
Inline graphic The data-adaptive integrative estimator using the tree-based gradient boosting for Inline graphic
Inline graphic The Bayesian predictive Inline graphic-value power prior estimator (Kwiatkowski et al., 2023)

The proposed framework is evaluated on imbalanced trial data, where Inline graphic and Inline graphic with an external control group of size Inline graphic. We investigate the performance of our proposed estimator under two levels of unmeasured confounding (Inline graphic and 0.3) by comparing with other estimators in Table 1(b). The trial-only augmented inverse probability weighting estimator Inline graphic (Cao et al., 2009) and the augmented calibration weighting estimator Inline graphic with full borrowing (Li et al., 2023) are used as benchmarks. Two data-adaptive integrative estimators, Inline graphic and Inline graphic, are considered, where linear regressions and tree-based gradient boosting are used to estimate the nuisance models. Other machine learning algorithms that satisfy pointwise consistency, such as the generalized additive model, can also be utilized to select a comparable subset of external controls consistently. The Bayesian predictive Inline graphic-value power prior estimator, Inline graphic, is an extension of the power prior, which discounts each external control according to its outcome compatibility using Box’s Inline graphic-value (Kwiatkowski et al., 2023).

Figure 1 displays the average bias, variance, mean squared error and Type-I error when Inline graphic, and power for testing Inline graphic when Inline graphic based on 1000 sets of data replications. Over the three model scenarios, the trial-only estimator Inline graphic is always consistent, but lacks efficiency as it only utilizes the concurrent controls for estimation, especially when Inline graphic is small. When the conditional mean exchangeability in Assumption 2 holds (i.e., Inline graphic), the full-borrowing estimator Inline graphic is most efficient, shown by its low mean squared error and high power for detecting a significant treatment effect. Our proposed selective integrative estimators, Inline graphic and Inline graphic, may be less efficient than Inline graphic due to finite-sample selection error. However, they maintain smaller variance and improved power compared to Inline graphic, regardless of whether the nuisance models are misspecified. When Assumption 2 is violated (i.e., Inline graphic), Inline graphic becomes biased, leading to an inflated Type-I error and low power. The Bayesian estimator Inline graphic requires correct parametric specification of the outcome model and performs poorly when the model omits a key confounder that is imbalanced between data sources. In our simulations, high weights were assigned to the external control subjects, which led to some bias in the treatment effect estimates when Inline graphic was small. However, both Inline graphic and Inline graphic achieve smaller mean squared errors than the trial-only estimator by incorporating external control subjects. In cases where the outcome model is incorrectly specified and Inline graphic, the benefit of using machine learning methods becomes apparent. Specifically, the flexibility of the gradient boosting model ensures the convergence rate assumption for Inline graphic, i.e., Inline graphic for a certain sequence Inline graphic (Zhang & Yu, 2005). 
By incorporating compatible external controls more accurately, Inline graphic better controls bias and achieves comparable power levels to Inline graphic. However, the adaptive lasso estimation based on the misspecified linear model lacks such properties and may not provide gains in power. One notable trade-off of our proposed estimators is the slight Type-I error inflation when Inline graphic is small and Assumption 2 is violated, which can be attributed to finite-sample selection error and was also observed by Viele et al. (2014).

Figure 1:

Simulation results under various levels of Inline graphic, and different model choices of Inline graphic and Inline graphic.

4. Real-data application

In this section, we present an application of the proposed methodology to investigate the effectiveness of basal insulin lispro against regular insulin glargine in patients with Type-I diabetes. Basal insulin lispro and insulin glargine are two long-acting insulin formulations used, in combination with preprandial insulin lispro, for patients with Type-I diabetes mellitus. We analyse the IMAGINE-1 study, a randomized controlled trial in which participants were unevenly assigned to either basal insulin lispro (treatment group) or insulin glargine (control group); external control subjects were drawn from the IMAGINE-3 trial. In the Supplementary Material, we also explore the effectiveness of solanezumab versus placebo in slowing Alzheimer’s disease progression using external observational data.

Our primary objective is to test whether basal insulin lispro is superior to regular insulin glargine in glycemic control for patients with Type-I diabetes mellitus. This is assessed by comparing the change in hemoglobin A1c level from baseline after 52 weeks of treatment. Both studies contain a rich set of baseline covariates Inline graphic, such as age, gender, baseline hemoglobin A1c (%), baseline fasting serum glucose (mmol/L), baseline triglycerides (mmol/L), baseline low-density lipoprotein cholesterol (mmol/L) and baseline alanine transaminase (U/L). The primary analysis population in IMAGINE-1 comprised the randomized patients who received at least one treatment dose. To mimic the full-analysis population from IMAGINE-1, external control subjects with missing baseline assessments are discarded from IMAGINE-3, and last observation carried forward is used to impute missing postbaseline outcomes. The IMAGINE-1 study consists of Inline graphic subjects, with 286 in the treated group and 153 in the control group, while the IMAGINE-3 study includes Inline graphic patients in the control arm. In our statistical analysis, we first use the baseline covariates Inline graphic to model the trial inclusion probability by calibration weighting under the entropy loss function. Next, we assume a linear heterogeneous treatment effect function for the outcomes with Inline graphic as the treatment modifier, and compare the same set of estimators as in the simulation study.
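The calibration-weighting step described above can be sketched in code. The following Python snippet is a minimal illustration with simulated covariates (the sample sizes, dimensions and variable names are hypothetical, not from the IMAGINE studies): it solves the dual of the entropy-loss calibration problem so that the weighted covariate means of the external controls match the trial means.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X_trial = rng.normal(0.5, 1.0, size=(200, 3))  # hypothetical trial covariates
X_ext = rng.normal(0.0, 1.0, size=(300, 3))    # hypothetical external controls

target = X_trial.mean(axis=0)  # trial covariate moments to be matched

def dual(lam):
    # dual objective of the entropy-loss calibration problem:
    # minimizing log-sum-exp minus the moment constraint term
    return np.log(np.exp(X_ext @ lam).sum()) - target @ lam

lam_hat = minimize(dual, np.zeros(3), method="BFGS").x
w = np.exp(X_ext @ lam_hat)
w /= w.sum()  # calibration weights, normalized to sum to one

# the weighted external-control covariate means now match the trial means
print((w[:, None] * X_ext).sum(axis=0) - target)
```

At the dual optimum, the gradient condition forces the weighted external-control means onto the trial means, which is exactly the moment-matching property that calibration weighting exploits.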

Table 2 reports the estimation results. The trial-only estimator Inline graphic shows that basal insulin lispro has a significant treatment effect on reducing the glucose level based solely on the IMAGINE-1 study. The naive integrative estimators Inline graphic and Inline graphic, albeit significant, differ slightly from Inline graphic, as they may be subject to population biases of the external controls. After filtering out the incompatible patients from the external controls via our adaptive lasso selection, the final integrative estimates Inline graphic and Inline graphic are closer to the benchmark and have narrower confidence intervals. According to our adaptive analysis, basal insulin lispro is significantly more effective than regular insulin glargine in glycemic control for patients with Type-I diabetes mellitus.

Table 2:

Point estimates, standard errors and 95% confidence intervals of the treatment effect of BIL against regular GL based on the IMAGINE-1 and IMAGINE-3 studies

Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
Est. (SE) Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
CI Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic

Est., estimate; SE, standard error; CI, confidence interval; BIL, basal insulin lispro; GL, regular insulin glargine.

Next, we compare the performance of Inline graphic with that of our data-adaptive integrative estimators to highlight the advantages of our dynamic borrowing framework. To this end, we retain the size of the treatment group, but create 100 subsamples by randomly selecting Inline graphic patients from its control group, where Inline graphic. Then, the patients treated with regular insulin glargine in the IMAGINE-3 study are added to each selected subsample, and the treatment effect is evaluated under the hybrid control arm design. Figure 2 presents the average probability of successfully detecting Inline graphic, the so-called probability of success, against the size of the subsamples. When solely utilizing patients from the IMAGINE-1 study, Inline graphic produces a probability of success greater than 0.8 only if the size of the control group exceeds 25. Combined with the IMAGINE-3 study, Inline graphic and Inline graphic refine the treatment effect estimation, and only 15 patients are needed in the concurrent control group to attain a probability of success above 0.8. Therefore, by properly leveraging the external controls, we may accelerate drug development by decreasing the number of patients on the concurrent control arm, thereby reducing the duration and cost of the clinical trial.
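The subsampling exercise above can be illustrated schematically. The Python sketch below uses synthetic outcomes and a plain two-sample t-test in place of the paper's estimators; the effect size, arm sizes and significance level are all hypothetical stand-ins chosen only to show the mechanics of estimating the probability of success as a function of the retained control-group size.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# hypothetical change-from-baseline outcomes (not the IMAGINE data)
y_treat = rng.normal(-0.6, 1.0, 286)    # treated arm, kept at full size
y_control = rng.normal(-0.3, 1.0, 153)  # concurrent control arm

def prob_success(m, n_rep=100, alpha=0.05):
    """Fraction of random subsamples in which a significant treatment
    effect is detected when only m concurrent controls are retained."""
    hits = 0
    for _ in range(n_rep):
        sub = rng.choice(y_control, size=m, replace=False)
        hits += stats.ttest_ind(y_treat, sub).pvalue < alpha
    return hits / n_rep

# probability of success rises with the retained control-group size
print(prob_success(10), prob_success(150))
```

In the paper's analysis the test is based on the integrative estimators and the external IMAGINE-3 controls are appended to each subsample; the loop structure, however, is the same.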

Figure 2: Probability of success for detecting Inline graphic by Inline graphic, Inline graphic and Inline graphic with varying control group sizes in the IMAGINE-1 study.

5. Discussion

Interest in the use of external control arms for drug development is growing. However, concerns regarding their quality and validity have so far limited their use in healthcare decision-making, necessitating careful and appropriate assessment. To adjust for potential selection bias, our proposed method calibrates the covariate moments across the two data sources, ensuring that the covariate distributions in the two sources match each other. Alternative predictive model-based strategies are applicable when only a subset of covariates is shared (Stuart et al., 2011; Tipton, 2014). To address differences in outcomes, we select comparable external subsets based on the adaptive lasso penalty. Alternative penalties, such as the smoothly clipped absolute deviation penalty (Fan & Li, 2001), can be considered provided the selection consistency property is attained. Moreover, our framework can easily be extended to augment observational studies with external data, which may require additional modelling and assumptions to achieve double robustness. Slight Type-I error inflation is observed in our simulations when the concurrent control group is small, attributable to selection error in finite samples. One future direction is to rigorously construct a data-adaptive confidence interval that accounts for finite-sample selection uncertainty without being overly conservative (Lee et al., 2016; Tibshirani et al., 2016). Other future directions include extending the proposed integrative inferential framework to survival outcomes (Lee et al., 2022a), estimating heterogeneous treatment effects (Wu & Yang, 2022a; Yang et al., 2022) and combining probability and nonprobability samples (Yang et al., 2020; Gao & Yang, 2023).
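The selective-borrowing idea behind the adaptive lasso penalty can be conveyed with a toy sketch. The Python snippet below is a deliberate simplification: it assumes per-subject bias estimates for the external controls are already available (in the paper these arise from the estimating equations; here they are simulated), in which case the separable adaptive-lasso problem reduces to soft-thresholding with a per-subject penalty inversely proportional to the initial bias estimate. All names and tuning values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
# hypothetical initial bias estimates for 50 external controls:
# most are exchangeable (mean-zero noise); the first few carry real bias
b_init = rng.normal(0, 0.1, 50)
b_init[:5] += 1.5  # incomparable external-control subjects

# adaptive lasso: per-subject penalty lambda / |b_init|^gamma, so subjects
# with small initial bias are penalized heavily (shrunk exactly to zero)
# while clearly biased subjects are barely shrunk at all
lam, gamma = 0.05, 2.0
penalty = lam / np.abs(b_init) ** gamma

# closed-form soft-thresholding solution of this separable problem
b_hat = np.sign(b_init) * np.maximum(np.abs(b_init) - penalty, 0.0)

# subjects whose estimated bias is shrunk to zero are deemed comparable
borrowed = b_hat == 0
print(borrowed.sum(), "external controls selected for borrowing")
```

The oracle property of the adaptive lasso (Zou, 2006) is what lets this selection recover, asymptotically, exactly the comparable subset, which underlies the Type-I error control and power gains reported above.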

Supplementary Material

asae069_Supplementary_Data

Acknowledgement

This project was supported by the Food and Drug Administration (FDA) of the U.S. Department of Health and Human Services (HHS) (U01FD007934) and the National Institute on Aging of the National Institutes of Health (R01AG06688). The views and opinions expressed herein are those of the authors and do not necessarily represent those of, nor endorsement by, FDA/HHS, the National Institutes of Health or the U.S. Government.

Contributor Information

Chenyin Gao, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, USA.

Shu Yang, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, USA.

Mingyang Shan, Eli Lilly & Company, Lilly Corporate Center, 893 Delaware Street, Indianapolis, Indiana 46285, USA.

Wenyu Ye, Eli Lilly & Company, Lilly Corporate Center, 893 Delaware Street, Indianapolis, Indiana 46285, USA.

Ilya Lipkovich, Eli Lilly & Company, Lilly Corporate Center, 893 Delaware Street, Indianapolis, Indiana 46285, USA.

Douglas Faries, Eli Lilly & Company, Lilly Corporate Center, 893 Delaware Street, Indianapolis, Indiana 46285, USA.

Supplementary material

The Supplementary Material includes all technical proofs, additional simulation results and other real-data applications. An open-source software R package (R Development Core Team, 2025) is available for implementing our proposed methodology at https://github.com/IntegrativeStats/SelectiveIntegrative.

References

  1. Bickel P. J., Klaassen C., Ritov Y. & Wellner J. (1998). Efficient and Adaptive Inference in Semiparametric Models, vol. 50. Baltimore, MD: Johns Hopkins University Press.
  2. Cao W., Tsiatis A. A. & Davidian M. (2009). Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika 96, 723–34.
  3. Chatterjee N., Chen Y.-H., Maas P. & Carroll R. J. (2016). Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. J. Am. Statist. Assoc. 111, 107–17.
  4. Chen X. (2007). Large sample sieve estimation of semi-nonparametric models. In Handbook of Econometrics, vol. 6, Ed. Heckman J. J. & Leamer E. E., pp. 5549–5632. Amsterdam: Elsevier.
  5. Chen Z., Ning J., Shen Y. & Qin J. (2021). Combining primary cohort data with external aggregate information without assuming comparability. Biometrics 77, 1024–36.
  6. Chernozhukov V., Chetverikov D., Demirer M., Duflo E., Hansen C., Newey W. & Robins J. (2018). Double/debiased machine learning for treatment and structural parameters. Econom. J. 21, 1–68.
  7. Chu J., Lu W. & Yang S. (2023). Targeted optimal treatment regime learning using summary statistics. Biometrika 110, 913–31.
  8. Colnet B., Mayer I., Chen G., Dieng A., Li R., Varoquaux G., Vert J.-P., Josse J. & Yang S. (2020). Causal inference methods for combining randomized trials and observational studies: a review. Statist. Sci. 39, 165–91.
  9. Fan J. & Li R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Assoc. 96, 1348–60.
  10. FDA (2001). E10 Choice of Control Group and Related Issues in Clinical Trials. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/e10-choice-control-group-and-related-issues-clinical-trials
  11. FDA (2014). Blinatumomab Drug Approval Package. https://www.accessdata.fda.gov/drugsatfda_docs/nda/2014/125557Orig1s000TOC.cfm
  12. FDA (2016). Avelumab Drug Approval Package. https://www.fda.gov/drugs/resources-information-approved-drugs/avelumab-bavencio
  13. FDA (2019). Rare Diseases: Natural History Studies for Drug Development. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/rare-diseases-natural-history-studies-drug-development
  14. FDA (2021). Real-World Data: Assessing Registries to Support Regulatory Decision-Making for Drug and Biological Products Guidance for Industry. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/real-world-data-assessing-registries-support-regulatory-decision-making-drug-and-biological-products
  15. FDA (2023). Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products Guidance for Industry. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-design-and-conduct-externally-controlled-trials-drug-and-biological-products
  16. Gao C. & Yang S. (2023). Pretest estimation in combining probability and non-probability samples. Electron. J. Statist. 17, 1492–546.
  17. Ghadessi M., Tang R., Zhou J., Liu R., Wang C., Toyoizumi K., Mei C., Zhang L., Deng C. & Beckman R. A. (2020). A roadmap to using historical controls in clinical trials – by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG). Orphanet J. Rare Dis. 15, 1–19.
  18. Ho D. E., Imai K., King G. & Stuart E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit. Anal. 15, 199–236.
  19. Hobbs B. P., Carlin B. P., Mandrekar S. J. & Sargent D. J. (2011). Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials. Biometrics 67, 1047–56.
  20. Huang J., Ma S. & Zhang C.-H. (2008). Adaptive lasso for sparse high-dimensional regression models. Statist. Sinica 18, 1603–18.
  21. Ibrahim J. G. & Chen M.-H. (2000). Power prior distributions for regression models. Statist. Sci. 15, 46–60.
  22. Imbens G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: a review. Rev. Econ. Statist. 86, 4–29.
  23. Kwiatkowski E., Zhu J., Li X., Pang H., Lieberman G. & Psioda M. A. (2023). Case weighted adaptive power priors for hybrid control analyses with time-to-event data. arXiv: 2305.05913v1.
  24. Lee D., Yang S., Dong L., Wang X., Zeng D. & Cai J. (2022b). Improving trial generalizability using observational studies. Biometrics 79, 1213–25.
  25. Lee D., Yang S. & Wang X. (2022a). Doubly robust estimators for generalizing treatment effects on survival outcomes from randomized controlled trials to a target population. J. Causal Infer. 10, 415–40.
  26. Lee J. D., Sun D. L., Sun Y. & Taylor J. E. (2016). Exact post-selection inference, with application to the lasso. Ann. Statist. 44, 907–27.
  27. Li X., Miao W., Lu F. & Zhou X.-H. (2023). Improving efficiency of inference in clinical trials with external control data. Biometrics 79, 394–403.
  28. Lin Z., Xiang Y. & Zhang C. (2009). Adaptive lasso in high-dimensional settings. J. Nonparam. Statist. 21, 683–96.
  29. Liu M., Bunn V., Hupf B., Lin J. & Lin J. (2021). Propensity-score-based meta-analytic predictive prior for incorporating real-world and historical data. Statist. Med. 40, 4794–808.
  30. Neuenschwander B., Branson M. & Spiegelhalter D. J. (2009). A note on the power prior. Statist. Med. 28, 3562–6.
  31. Neuenschwander B., Capkun-Niggli G., Branson M. & Spiegelhalter D. J. (2010). Summarizing historical information on controls in clinical trials. Clin. Trials 7, 5–18.
  32. Odogwu L., Mathieu L., Blumenthal G., Larkins E., Goldberg K. B., Griffin N., Bijwaard K., Lee E. Y., Philip R., Jiang X. et al. (2018). FDA approval summary: dabrafenib and trametinib for the treatment of metastatic non-small cell lung cancers harboring BRAF V600E mutations. The Oncologist 23, 740–5.
  33. Phelan M., Bhavsar N. A. & Goldstein B. A. (2017). Illustrating informed presence bias in electronic health records data: how patient interactions with a health system can impact inference. J. Electron. Health Data Meth. 5, 22–36.
  34. Pocock S. J. (1976). The combination of randomized and historical controls in clinical trials. J. Chronic Dis. 29, 175–88.
  35. Qin J., Zhang H., Li P., Albanes D. & Yu K. (2015). Using covariate-specific disease prevalence information to increase the power of case-control studies. Biometrika 102, 169–80.
  36. R Development Core Team (2025). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0. http://www.R-project.org
  37. Rosenbaum P. R. & Rubin D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55.
  38. Rubin D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66, 688–701.
  39. Schoenfeld D. A., Finkelstein D. M., Macklin E., Zach N., Ennist D. L., Taylor A. A., Atassi N. & Pooled Resource Open-Access ALS Clinical Trials Consortium (2019). Design and analysis of a clinical trial using previous trials as historical control. Clin. Trials 16, 531–8.
  40. Shan M., Faries D., Dang A., Zhang X., Cui Z. & Sheffield K. M. (2022). A simulation-based evaluation of statistical methods for hybrid real-world control arms in clinical trials. Statist. Biosci. 14, 259–84.
  41. Shen Y., Gao C., Witten D. & Han F. (2020). Optimal estimation of variance in nonparametric regression with random design. Ann. Statist. 48, 3589–618.
  42. Silverman B. (2018). A baker’s dozen of US FDA efficacy approvals using real world evidence. Pharma Intelligence Pink Sheet, 7 August.
  43. Stuart E. A. (2010). Matching methods for causal inference: a review and a look forward. Statist. Sci. 25, 1–21.
  44. Stuart E. A. & Rubin D. B. (2008). Matching with multiple control groups with adjustment for group differences. J. Educ. Behav. Statist. 33, 279–306.
  45. Stuart E. A., Cole S. R., Bradshaw C. P. & Leaf P. J. (2011). The use of propensity scores to assess the generalizability of results from randomized trials. J. R. Statist. Soc. A 174, 369–86.
  46. Tibshirani R. J., Taylor J., Lockhart R. & Tibshirani R. (2016). Exact post-selection inference for sequential regression procedures. J. Am. Statist. Assoc. 111, 600–20.
  47. Tipton E. (2014). How generalizable is your experiment? An index for comparing experimental samples and populations. J. Educ. Behav. Statist. 39, 478–501.
  48. Tsiatis A. (2006). Semiparametric Theory and Missing Data. New York: Springer.
  49. Van der Vaart A. W. (2000). Asymptotic Statistics, vol. 3. Cambridge: Cambridge University Press.
  50. van Rosmalen J., Dejardin D., van Norden Y., Löwenberg B. & Lesaffre E. (2018). Including historical data in the analysis of clinical trials: is it worth the effort? Statist. Meth. Med. Res. 27, 3167–82.
  51. Viele K., Berry S., Neuenschwander B., Amzal B., Chen F., Enas N., Hobbs B., Ibrahim J. G., Kinnersley N., Lindborg S. et al. (2014). Use of historical control data for assessing treatment effects in clinical trials. Pharm. Statist. 13, 41–54.
  52. Visvanathan K., Levit L. A., Raghavan D., Hudis C. A., Wong S., Dueck A. & Lyman G. H. (2017). Untapped potential of observational research to inform clinical decision making: American Society of Clinical Oncology research statement. J. Clin. Oncol. 35, 1845–54.
  53. Wu L. & Yang S. (2022a). Integrative Inline graphic-learner of heterogeneous treatment effects combining experimental and observational studies. In Proc. 1st Conf. Causal Learn. Reason., pp. 904–26. PMLR.
  54. Wu L. & Yang S. (2022b). Transfer learning of individualized treatment rules from experimental to real-world data. J. Comp. Graph. Statist. 32, 1036–45.
  55. Yang S., Kim J. K. & Song R. (2020). Doubly robust inference when combining probability and non-probability samples with high dimensional data. J. R. Statist. Soc. B 82, 445–65.
  56. Yang S., Zeng D. & Wang X. (2022). Elastic integrative analysis of randomized trial and real-world data for treatment heterogeneity estimation. arXiv: 2005.10579v3.
  57. Zhai Y. & Han P. (2022). Data integration with oracle use of external information from heterogeneous populations. J. Comp. Graph. Statist. 31, 1001–12.
  58. Zhang H., Deng L., Schiffman M., Qin J. & Yu K. (2020). Generalized integration model for improved statistical inference by leveraging external summary data. Biometrika 107, 689–703.
  59. Zhang T. & Yu B. (2005). Boosting with early stopping: convergence and consistency. Ann. Statist. 33, 1538–79.
  60. Zou H. (2006). The adaptive LASSO and its oracle properties. J. Am. Statist. Assoc. 101, 1418–29.


Articles from Biometrika are provided here courtesy of Oxford University Press
