Clinical and Translational Science. 2024 Aug 7;17(8):e13909. doi: 10.1111/cts.13909

Exploring the discrepancies between clinical trials and real‐world data: A small‐cell lung cancer study

Luca Marzano 1, Adam S Darwich 1, Asaf Dan 2, Salomon Tendler 2,3, Rolf Lewensohn 2, Luigi De Petris 2, Jayanth Raghothama 1, Sebastiaan Meijer 1
PMCID: PMC11306525  PMID: 39113428

Abstract

The potential of real‐world data to inform clinical trial design and supplement control arms has gained much interest in recent years. The most common approach relies on reproducing control arm outcomes by matching real‐world patient cohorts to clinical trial baseline populations. However, recent studies have pointed out a lack of replicability, generalizability, and consensus. In this article, we propose a novel approach that aims to explore and examine these discrepancies by concomitantly investigating the impact of selection criteria and operations on the measurements of outcomes from the patient data. We tested the approach on a dataset consisting of small‐cell lung cancer patients receiving platinum‐based chemotherapy regimens from a real‐world data cohort (n = 223) and six clinical trial control arms (n = 1224). The results showed that the discrepancy between real‐world and clinical trial data potentially depends on differences in both patient populations and operational conditions (e.g., frequency of assessments and censoring), for which further investigation is required. Discovering and accounting for confounders, including hidden effects of differences in operations related to the treatment process and clinical trial study protocol, would potentially allow for improved translation between clinical trials and real‐world data. Continued development of the method presented here to systematically explore and account for these differences could pave the way for transferring learning across clinical studies and developing mutual translation between the real world and clinical trials to inform clinical study design.


Study highlights.

  • WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC?

Real‐world data have the potential to inform clinical trial design, control arms, and regulatory assessment. However, real‐world evidence studies have shown poor replication and generalizability and a lack of consensus on the analytical process, thus underlining that the mechanisms that would allow the translation between clinical trials and real‐world populations are still not completely understood.

  • WHAT QUESTION DID THIS STUDY ADDRESS?

What are the mechanisms that would allow translation between clinical trials and the real world? How can we design a comprehensive and systematic approach to explore the degree of translation? Does the approach work in a challenging real‐world case study such as small‐cell lung cancer chemotherapy?

  • WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE?

Differences in operations and protocols have a relevant impact on the gap in clinical outcomes and must be studied together with the selection criteria of the baseline populations. Previous works proposed purely empirical approaches (such as propensity score matching), and the limitations of their findings may be related to the lack of consideration of operational differences between trials and real‐world practice. Our approach yielded novel insights regarding which aspects would benefit from further investigation to improve the design of small‐cell lung cancer studies (ECOG 2 underrepresentation and pre‐trial biases, exploring therapies with the new TNM staging categories, and operational biases of trial censoring and progression‐free survival).

  • HOW MIGHT THIS CHANGE CLINICAL PHARMACOLOGY OR TRANSLATIONAL SCIENCE?

Designing a comprehensive and systematic approach to investigate how selection criteria and operations impact the measurements of outcome would allow us to estimate the trade‐off between internal validity and generalizability of clinical trials, thus pushing real‐world evidence toward a learn‐and‐confirm cycle in which we learn case by case and close the translational gap between clinical trials and real‐world populations.

INTRODUCTION

During the past few years, discussions have highlighted the limitations of randomized clinical trials (RCTs) due to their high costs and the challenge of translating clinical outcomes between RCT cohorts and real‐world patient populations. 1 , 2 Since the 21st Century Cures Act in 2015, the potential for translating between real‐world data (RWD) and clinical trials to inform regulatory decision‐making has gained attention. 3 , 4 , 5 , 6 , 7 A growing body of research has focused on the extrapolation of RWD to inform clinical trial design, often described as emulation of RCT control arms or simulated (synthetic) data, 8 , 9 , 10 , 11 , 12 with the purpose of reproducing clinical trial outcomes. 8 , 9 , 10 , 11 , 13 , 14 , 15 , 16 , 17

In an attempt to adjust for confounders, analyses have often focused on recreating the inclusion criteria of clinical trials in RWD cohorts. 8 , 9 , 10 , 14 , 16 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 The most common method is patient matching based on propensity score. 27 Propensity score is formally defined by Rosenbaum and Rubin as the conditional probability of assignment to a particular treatment. 28 The method gives a probability of an RWD patient being enrolled in a clinical trial study arm given a vector of observed covariates. Then, the RWD cohort is adjusted by including only patients with high propensity scores. Propensity score approaches, including variations of this method, still constitute the main proposed technique for adjusting for confounders in RWD cohorts based on the characteristics of RCT control arms. 8 , 10 , 16 , 19 , 21 , 22 , 24 , 26 , 29 , 30
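As a general illustration of this matching procedure (not the specific implementations used in the cited studies), a minimal propensity score matching sketch in Python might look as follows; the covariate names, the greedy 1:1 nearest‐neighbour rule, and the requirement that covariates be numerically encoded are assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def propensity_match(rct: pd.DataFrame, rwd: pd.DataFrame, covariates: list) -> pd.DataFrame:
    """Select the RWD patients most similar to an RCT arm via propensity scores.

    The propensity score here is the probability of belonging to the RCT cohort
    given the baseline covariates, in the spirit of Rosenbaum and Rubin.
    Covariates are assumed to be numerically encoded.
    """
    data = pd.concat([rct.assign(in_rct=1), rwd.assign(in_rct=0)], ignore_index=True)
    model = LogisticRegression(max_iter=1000).fit(data[covariates], data["in_rct"])
    data["ps"] = model.predict_proba(data[covariates])[:, 1]

    # Greedy 1:1 nearest-neighbour matching on the propensity score, without replacement.
    rwd_pool = data[data["in_rct"] == 0].copy()
    matched_idx = []
    for _, trial_row in data[data["in_rct"] == 1].iterrows():
        if rwd_pool.empty:
            break
        closest = (rwd_pool["ps"] - trial_row["ps"]).abs().idxmin()
        matched_idx.append(closest)
        rwd_pool = rwd_pool.drop(index=closest)
    return data.loc[matched_idx]  # the adjusted real-world cohort

# Hypothetical usage (column names are illustrative and must be numeric):
# matched_rwd = propensity_match(rct_df, rwd_df, ["AGE", "SEX_CODE", "ECOG"])
```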

However, propensity score approaches have been shown to be limited in this aspect since a probabilistic empirical approach is highly sensitive to undetected confounders and biases of the data. 1 , 8 , 10 , 11 , 14 , 25 Indeed, the alignment of patient characteristics is seldom sufficient to reproduce outcomes across populations. 10 , 11 , 16 It has repeatedly been pointed out that there is a lack of replicability and generalisability 1 , 9 , 10 , 13 , 17 with only a few clinical trials being replicable based on RWD. 10 , 13 , 17 Results have been mixed depending on the specific case study 13 , 17 , 31 and measurements of outcome 10 , 16 (e.g., overall survival and intermediate end points 10 , 14 ).

The lack of translatability has been attributed to the differences in populations, such as baseline confounders and key eligibility criteria that are not available in the data. 10 , 11 , 13 , 14 The main focus of improvement has been on how patient demographics affect measurements of outcome with little focus on the operations and processes behind the data. 8 , 10 , 13 , 23 , 32 The potential operational differences between clinical practice and clinical trial protocols have been mentioned as potential confounders but are yet to be fully explored. 1 , 2 , 8 , 10 , 11 , 13 , 32 There is a need to investigate the bias that is introduced by differences in investigation and clinical assessment 1 , 10 , 11 , 23 , 32 (e.g., lack of pre‐trial monitoring in clinical practice 1 ), and potential differences in the RCT monitoring process compared with real‐world patients, with more detailed and potentially more frequent follow‐up on tumor response and adverse effects, and decision‐making such as withdrawal of therapy. 1 , 10 , 11 , 25 , 32

Translation between clinical trials and real‐world populations is therefore still not completely understood. 1 , 8 , 9 , 10 , 13 , 14 , 16 , 17 Further refinement of proposed methodologies is required to realize the potential of RWD to inform clinical trial design. 6 , 9 , 10 , 13 , 33 In the past, the added value of a mechanistic systems view of translation in drug development has been beneficial for other areas of model‐informed drug development, such as quantitative in vitro in vivo extrapolation and physiologically‐based pharmacokinetics of metabolic drug–drug interactions. 34 Developing systems approaches for real‐world evidence could enable a similar learn‐confirm cycle and learning across studies, where the represented systems include not only the patient, disease, and treatment but also the operational context. 35

In this article, we propose an approach that aims to systematically explore the discrepancies between RWD and RCTs by discovering and accounting for the differences in population samples (randomization) and operations (protocols and clinical practice). We abbreviate the approach as SOMO, as it is based on exploring the effects of Selection criteria (S) and Operations (O) and study protocols on the replication of the Measurements of Outcome (MO). We developed and tested the SOMO approach using RCTs and RWD on extensive disease (ED, multiple distant metastases) small‐cell lung cancer (SCLC) patients receiving platinum‐etoposide chemotherapy. This was done to the extent possible given the available information; many factors were determined to be known unknowns.

SCLC is a case study that fits the scope of our study particularly well. The disease is very aggressive, with a reported median overall survival of about 6 months and limited therapy options, 36 , 37 and treatment approaches have unfortunately improved only marginally, if at all, during the past two decades. 36 Hence, SCLC would benefit from leveraging RCT‐RWD discrepancies.

METHODS

The SOMO approach

Figure 1 shows the SOMO approach and the accompanying data analysis that was carried out. In short, the analysis included the following components:

  • (S). Selection Criteria refers to all aspects related to the baseline variables that define the population, the biology of the disease, and the inclusion and exclusion criteria of the clinical datasets.

  • (O). Operations and study protocols refer to aspects related to operational processes (or mechanisms) occurring during treatment. These can be grouped into longitudinal disease factors (e.g., tumor progression), the study protocol operations (e.g., removal and censoring of patients that did not adhere to the study protocol), or the potential differences from the real‐world routine healthcare operations (e.g., adjustment of doses, change of treatment due to relapse, patients opting to not receive treatment).

  • (MO). Measurements of Outcome refer to the metrics used to evaluate treatment efficacy (and safety, when available) and to estimate the feasibility of translation between RWD and RCTs using a comparative approach. These can be one or multiple outcomes depending on the study end points and the statistical analysis defined by the protocol (e.g., overall survival, progression‐free survival, toxicity, exposure, or overall response), as well as the available information from the real‐world cohort.

FIGURE 1.

Summary description of the SOMO approach: data retrieval and pre‐processing (step a), explorative analysis (step b), estimation of the impact of factors on translation between cohorts (step c), and validation and evaluation of the results (step d).

Available RCT and RWD data are retrieved, pre‐processed, and harmonized into a combined dataset. Then, clinical experts interpret and contextualize information (Figure 1, steps a.1–2). The analysis is then carried out in the following steps:

  • Explorative analysis. An exploratory data analysis is performed to define the SOMO components (steps b.1–5). During this phase, outcomes, relevant available factors, and potentially important missing aspects are mapped into the three categories as detailed below. First, the comparison between RCT and RWD outcomes is carried out to estimate baseline differences in MOs between cohorts. Then, a comparison between the two populations is performed to detect any potential mismatch between patient covariates due to selection and randomization. Finally, for the operational aspects, potential confounders are investigated through a comparison between study protocols and clinical practice with the aid of longitudinal outcome measures (e.g., Kaplan–Meier or dose–response curves) and clinical expert feedback (a minimal sketch of such an outcome comparison is given at the end of this subsection).

  • Estimating the impact of factors on the discrepancies in outcomes between cohorts. The potential impact of selection criteria and operations is explored by applying matching based on the relevant available variables and simulating the effects of operations. The impact on MO is then compared (steps c.1–2). Clinical expert involvement allows verification of the results by identifying and discussing potential biases and confounders (step d).

Hence, one of the outcomes was a list of potential factors related to selection criteria and operations that could contribute to explaining the translational gap, together with a quantification of their effects, or an indication of future aspects to explore when the hypotheses could not be tested due to the lack of available information.
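As an illustration of the explorative outcome comparison (step b), a minimal sketch in Python is shown below, assuming a harmonized dataframe with STUDY_TYPE, duration, and event columns as introduced later in the case study; the column names, the lifelines package, and the use of a single binary cohort indicator in the Cox model are assumptions for illustration, not the exact implementation used in this study.

```python
import pandas as pd
from lifelines import CoxPHFitter, KaplanMeierFitter
from lifelines.statistics import logrank_test

def compare_survival(df: pd.DataFrame, duration_col: str = "OS_DAYS", event_col: str = "OS_EVENT"):
    """Compare one survival outcome between the RWD and RCT cohorts (explorative step)."""
    rwd = df[df["STUDY_TYPE"] == "RWD"]
    rct = df[df["STUDY_TYPE"] == "RCT"]

    # Kaplan-Meier estimates per cohort (plotting left out for brevity).
    km_rwd = KaplanMeierFitter().fit(rwd[duration_col], rwd[event_col], label="RWD")
    km_rct = KaplanMeierFitter().fit(rct[duration_col], rct[event_col], label="RCT")

    # Log-rank test and Cox hazard ratio, with the RWD cohort as reference.
    lr = logrank_test(rwd[duration_col], rct[duration_col], rwd[event_col], rct[event_col])
    cox_df = df.assign(is_rct=(df["STUDY_TYPE"] == "RCT").astype(int))[[duration_col, event_col, "is_rct"]]
    cph = CoxPHFitter().fit(cox_df, duration_col=duration_col, event_col=event_col)
    return km_rwd, km_rct, lr.p_value, cph.hazard_ratios_["is_rct"]
```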

Case study: Extensive disease small‐cell lung cancer patients receiving platinum‐etoposide chemotherapy

Cohort description

In this study, a mixed cohort was collated, including RWD and RCTs of ED‐SCLC patients that had received platinum‐etoposide chemotherapy as first‐line treatment:

  • The RWD included in this analysis was part of a retrospective cohort of SCLC patients treated at Karolinska University Hospital (Stockholm, Sweden) between 2008 and 2016 37 (RWD KI, n patients = 223). The study was approved by the institutional review boards at Karolinska Institutet and Stockholm County Council (2016/8‐31). This cohort has been used in several studies and has proven robust for RWD analyses. 37 , 38 , 39

  • The RCT comparative groups originated from open data shared through the Project Data Sphere Initiative, 40 including participants receiving the standard platinum‐etoposide treatment who were randomized into the control arm of three randomized phase III clinical trials: PDS_Amgen (NCT00119613, n = 232), PDS_Alliance (NCT00003299, n = 270), and PDS_EliLilly (NCT00363415, n = 370), and three phase Ib‐II trials: PDS_PHASE2_Alliance (NCT00453154, n = 46), PDS_PHASE2_EliLilly (NCT01439568, n = 41), and PDS_PHASE2_G1Thera (NCT02499770, n = 37). A subset of patients (n = 85) in PDS_EliLilly was censored after the study was declared futile at the interim analysis; these patients were removed prior to the analysis.

The combined mixed cohort encompassed n = 1224 patients in total. The common patient variables were age, sex, brain metastasis (BM), Eastern Cooperative Oncology Group performance status (ECOG), the cohort from which the data were retrieved (STUDY), and the label indicating whether patients originated from the RCTs or the real‐world cohort (STUDY_TYPE). Survival outcomes were progression‐free survival (PFS) and overall survival (OS). PFS was calculated from the date of randomization in the RCT cohorts, or from the date of treatment start in the RWD cohort, until the occurrence of radiological or clinical progression or death of any cause. OS was calculated from the same starting date as PFS until death of any cause. At the time of the last follow‐up, patients alive and without disease progression were censored for PFS, and those still alive were censored for OS (CENSOR). The criteria for censoring were the same in both the RCT and RWD cohorts.

The presence of BM was considered a variable of special interest. The course of the disease in these patients is particularly unfavorable compared with metastases in other organs (liver, bone). In clinical trials, brain metastases are underrepresented, since such patients are usually considered eligible only if asymptomatic, which is seldom the case. RWD is instead a valuable source for studying this patient category, since SCLC with BM represents a rather common challenge in clinical practice.

The real‐world cohort had been re‐staged using the 8th version of the International Association for the Study of Lung Cancer (IASLC) TNM classification in a previous validation study. 38 A stage variable (STAGE) was created in which real‐world patients were defined using TNM staging (IVA or IVB), while the clinical trial patients were staged using the traditional Veterans' Administration Lung Study Group (VALSG) method (ED stage). Although using the TNM staging classification is strongly recommended in clinical practice, a simplified staging system that categorizes SCLC patients into two groups according to treatment intent, potentially curable (limited disease, LD) or purely palliative (extensive disease, ED), has been adopted as a selection criterion in a large number of clinical trials. 38 TNM‐IVA patients correspond to ED patients with multiple tumoral nodules in the contralateral lung, pleural or pericardial nodules, malignant pleural or pericardial effusion, or a single extrathoracic metastasis in a single organ, while IVB patients are ED patients with multiple extrathoracic metastases in one or more organs. 38 A summary of the RWD and clinical trial cohorts is reported in Table S1.
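For illustration only, a minimal sketch of how the OS and PFS definitions above could be derived from patient‐level dates is given below; all column names are hypothetical and the handling of follow‐up is simplified relative to the actual data preparation.

```python
import pandas as pd

def derive_survival_outcomes(df: pd.DataFrame) -> pd.DataFrame:
    """Derive OS/PFS durations and event flags from (hypothetical) date columns.

    In the RCTs the start date is the randomization date; in the RWD cohort
    it is the treatment start date.
    """
    out = df.copy()
    out["START"] = out["RANDOMIZATION_DATE"].fillna(out["TREATMENT_START_DATE"])

    # Overall survival: time to death of any cause; patients alive at last follow-up are censored.
    out["OS_EVENT"] = out["DEATH_DATE"].notna().astype(int)
    out["OS_DAYS"] = (out["DEATH_DATE"].fillna(out["LAST_FOLLOWUP_DATE"]) - out["START"]).dt.days

    # Progression-free survival: time to radiological/clinical progression or death,
    # whichever occurs first; progression-free survivors are censored at last follow-up.
    first_event = out[["PROGRESSION_DATE", "DEATH_DATE"]].min(axis=1)
    out["PFS_EVENT"] = first_event.notna().astype(int)
    out["PFS_DAYS"] = (first_event.fillna(out["LAST_FOLLOWUP_DATE"]) - out["START"]).dt.days
    return out
```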

Survival analysis

Table 1 summarizes how the SOMO approach was applied to the ED‐SCLC case study, reporting the aspects detected during the explorative analysis and how they were investigated. The main techniques used during the analysis were: cohort stratification given the available variables; propensity score matching with weights computed using logistic regression; 27 and oversampling to generate simulated cohorts using the standard Synthetic Minority Over‐sampling TEchnique‐Nominal Continuous (SMOTENC) algorithm based on k‐nearest neighbors 41 (a minimal sketch of this oversampling step is given after Table 1). Table 1 details how these techniques were used to investigate the impact of variables on the observed outcomes.

TABLE 1.

SOMO analysis description for mixed cohort SCLC study.

| SOMO component | Parameter/factor to investigate | Is it possible to investigate using the current data? | Analysis description |
| --- | --- | --- | --- |
| Selection criteria (S) | ECOG performance status | Yes | Cohort stratification based on ECOG |
| | Sex | Yes | Cohort stratification based on sex |
| | Age | Yes | Cohort stratification of patients older than 75 years |
| | Brain metastasis | Yes. Baseline brain metastasis records were not available for all clinical trial patients, or not explicitly mentioned as an exclusion criterion | Cohort stratification based on brain metastasis where possible |
| | Cancer stage | Yes. For RWD: 8th TNM version ("IVA" or "IVB"). For RCTs: VALSG stage (unique value: "ED‐SCLC") | Cohort stratification of RWD TNM staging IVA and IVB; oversampling with SMOTENC of RWD TNM‐IVA to balance with IVB patients |
| | RCT selection criterion "probability of surviving more than 3 months" | Yes. This information is not explicitly expressed in the RCT data; the implicit selection can only be simulated in the RWD | Simulated by matching the RWD patients using propensity score matching with Y = "OS >90 days" (y/n) |
| | RWD propensity score matching | Yes | Propensity score matching with Y = "probability of being included in the RCT study cohort" (y/n) |
| | Lab tests and blood chemistry values before initiating the treatment | Few overlaps between RWD and RCTs. The only common baseline available in all the studies was hemoglobin; few RCT instances matched the blood values available in the RWD (lactate dehydrogenase, albumin, and sodium) | — |
| | Other key baselines: smoking status, ethnicity, and comorbidity score | Smoking status: information not available from the RCTs. Ethnicity: information not available from the RWD. Comorbidity score: information not available from RWD or RCTs | — |
| Operations and protocols (O) | Outcome definition correction (RWD survival outcomes are calculated from the start of therapy, RCT outcomes from randomization) | Yes. Correction made from the RCT data when timestamps were available; for the RCTs missing this information, the correction was imputed as a random effect informed by the available data | Analysis performed with the corrected RCT outcomes |
| | ECOG 2 RCT pre‐trial enrolment bias | Yes. RCT patients with ECOG 2 were compared with RWD ECOG 2–3 patients to assess whether there was a worsening of baseline for the RCT patients | Sub‐cohort stratification |
| | Clinical trial censoring | Yes | Inclusion of the records censored for futility in the PDS_EliLilly study to estimate the censoring bias; removing all censored patients from the analysis; propensity score matching including censoring as a variable |
| | Variability and robustness of the longitudinal profile of survival curve differences | Yes | Creation of a real‐world synthetic cohort with the same RCT sample size using SMOTENC oversampling and simulation of the censoring effect with the same distribution as the clinical trials |
| | Treatment administration (e.g., dose–response variables) | Information not available from the RWD | — |
| | Longitudinal biomarkers, and lab test or chemistry values | Information not available from the RWD | — |
| | Longitudinal events and intermediate decisions (e.g., tumor regression, adherence to therapy, radiological progression, worsening of patient status) | Information not available from the RWD | — |
| | RECIST assessment of the radiological processes | Information not available from the RWD. RECIST assessment is performed systematically in RCTs, but not in RWD | — |
| | Treatment lines after first‐line therapy | Information not available from the RCTs | — |
| Measurements of outcome (MO) | Survival outcomes: overall survival (OS) and progression‐free survival (PFS) | Yes. PFS is not available for all the RCT data | Commonly reported outcomes. Analysis with Kaplan–Meier curves and Cox hazard ratios of the entire cohort, hazard ratio variability, and hazard ratio distributions obtained from the pairwise matching simulation of interventions on the matching (Figure S1) |
| | Toxicity analysis (adverse effects incidence, toxicity grade) | Information not available from the RWD | — |

Abbreviations: OS, overall survival; PFS, progression‐free survival; RCT, randomized clinical trial; RECIST, Response Evaluation Criteria in Solid Tumors; RWD, real‐world data; SMOTENC, Synthetic Minority Over‐sampling TEchnique‐Nominal Continuous.
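As referenced above, a minimal sketch of the SMOTENC oversampling step is given below, using the imbalanced‐learn implementation of SMOTENC; the covariate set, the column names, and the balancing of TNM‐IVA against IVB patients as the target are illustrative assumptions rather than the exact configuration used in the study.

```python
import pandas as pd
from imblearn.over_sampling import SMOTENC

def oversample_iva(rwd: pd.DataFrame) -> pd.DataFrame:
    """Oversample TNM-IVA real-world patients so they balance the IVB patients."""
    features = ["AGE", "SEX", "ECOG", "BM"]   # illustrative covariates; AGE must be numeric
    categorical_idx = [1, 2, 3]               # positions of the nominal columns (SEX, ECOG, BM)
    X, y = rwd[features], rwd["STAGE"]        # minority class "IVA" vs. majority "IVB"

    sampler = SMOTENC(categorical_features=categorical_idx, k_neighbors=5, random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)

    # Reassemble a balanced cohort dataframe with the stage label attached.
    out = pd.DataFrame(X_res, columns=features).reset_index(drop=True)
    out["STAGE"] = list(y_res)
    return out
```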

First, the RCTs were combined into one dataset and directly compared with the RWD. Then, a pairwise analysis was carried out using a simulation approach to compare the RWD with each RCT to explore the between‐study variability in MO.

Differences in the survival outcomes (OS and PFS) were explored using Kaplan–Meier curves and Cox proportional hazards ratios. The reference for computing the hazard ratios was the RWD cohort.

For the pairwise analysis, simulations were carried out by creating a surrogate cohort (n = 250) with patients randomly selected at a 1:1 ratio from the two populations, the RWD and the analyzed RCT (Figure S1). Then, the Cox hazard ratios between the real‐world and clinical trial cohorts were computed. This was repeated 100 times to estimate the variability in the outcomes; this number of iterations was sufficient to reproduce the baseline MO discrepancy between the trial and RWD cohorts. Different scenarios were simulated by adjusting the selection of patients according to the identified variables across the datasets (e.g., stratification by individual variables). The main assumption was that, in the ideal scenario in which all confounders in the RWD relative to the RCTs could be blindly adjusted for, there would be no significant differences in outcome between the two populations. Hence, assessing the impact of the explored factors on MO could be used to inform future trial designs.
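A minimal sketch of this pairwise simulation loop is given below, assuming the same hypothetical survival columns as before; the surrogate cohort size (n = 250), the 1:1 ratio, and the 100 iterations follow the description above, while sampling with replacement and the random seed handling are simplifying assumptions.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def pairwise_hr_distribution(rwd: pd.DataFrame, rct: pd.DataFrame,
                             duration_col: str = "OS_DAYS", event_col: str = "OS_EVENT",
                             n_iter: int = 100, cohort_size: int = 250, seed: int = 0) -> np.ndarray:
    """Distribution of Cox hazard ratios (RCT vs. RWD reference) over repeated
    surrogate cohorts drawn 1:1 from the two populations."""
    rng = np.random.default_rng(seed)
    hazard_ratios = []
    for _ in range(n_iter):
        # Draw half of the surrogate cohort from each population (here with replacement).
        surrogate = pd.concat([
            rwd.sample(cohort_size // 2, replace=True,
                       random_state=int(rng.integers(1_000_000))).assign(is_rct=0),
            rct.sample(cohort_size // 2, replace=True,
                       random_state=int(rng.integers(1_000_000))).assign(is_rct=1),
        ], ignore_index=True)
        cph = CoxPHFitter().fit(surrogate[[duration_col, event_col, "is_rct"]],
                                duration_col=duration_col, event_col=event_col)
        hazard_ratios.append(cph.hazard_ratios_["is_rct"])
    return np.asarray(hazard_ratios)
```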

RESULTS

Mapping available data and SOMO components

In total, eight selection criteria factors, nine potential operational discrepancies, and three potential effects on MO were identified. Among these, it was possible to study from the available data the effects of seven selection criteria and four operational factors on the two MOs (see Table 1 for more details).

From the available data, it was possible to explore the effects of the commonly available variables reported in the previous section by observing the differences in outcomes for stratified subsamples of the cohort. The RWD stage IVA patients (n = 46) were oversampled with the SMOTENC algorithm to balance them against the IVB patients (n = 231). Propensity score matching allowed the exploration of the effects of these variables. Moreover, propensity score was used to simulate in the RWD the selection criterion common to all RCTs: patients expected to live more than 3 months.
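One possible way to approximate the implicit "expected to survive more than 3 months" criterion in the RWD (Table 1) is sketched below; modeling the probability of surviving beyond 90 days from baseline covariates and retaining the most likely patients is a simplified stand‐in for the propensity‐score‐based simulation used in the study, and the covariates and retention fraction are illustrative.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def simulate_three_month_criterion(rwd: pd.DataFrame, covariates: list,
                                   keep_fraction: float = 0.8) -> pd.DataFrame:
    """Approximate the implicit RCT criterion 'expected survival > 3 months' in the RWD."""
    # Observed surrogate label: did the patient actually survive beyond 90 days?
    y = (rwd["OS_DAYS"] > 90).astype(int)
    model = LogisticRegression(max_iter=1000).fit(rwd[covariates], y)
    scored = rwd.assign(p_survive_90=model.predict_proba(rwd[covariates])[:, 1])

    # Retain the real-world patients with the highest predicted probability.
    threshold = scored["p_survive_90"].quantile(1 - keep_fraction)
    return scored[scored["p_survive_90"] >= threshold]
```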

The operational aspects that could be explored were: discrepancies in outcome definitions (RWD survival outcomes were calculated from the start of therapy, whereas for the RCTs they were defined from randomization), the potential progression of ECOG from 2 to 3 in the RCTs as compared with the outcomes of ECOG 3 RWD patients, and RCT censoring effects. The censoring was studied from different perspectives: removal of all censored patients, simulation of censoring in a synthetic RWD cohort (generated with SMOTENC sampling) using the same distribution as in the RCTs, and inclusion of censoring as a variable in the propensity score matching. This latter aspect is of particular importance, since a too short follow‐up time may lead to an over‐representation of censored cases, namely, cases that would eventually experience disease progression or death within a few months after study closure, and for which the event would instead have been captured with a longer and more adequate follow‐up time.
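A minimal sketch of imposing the RCT censoring distribution on a (synthetic) RWD cohort is given below; drawing candidate censoring times from the empirical RCT distribution is an illustrative choice and not necessarily the exact procedure applied to the SMOTENC‐generated cohort.

```python
import numpy as np
import pandas as pd

def apply_rct_censoring(rwd: pd.DataFrame, rct: pd.DataFrame,
                        duration_col: str = "OS_DAYS", event_col: str = "OS_EVENT",
                        seed: int = 0) -> pd.DataFrame:
    """Impose the empirical RCT censoring-time distribution on a (synthetic) RWD cohort."""
    rng = np.random.default_rng(seed)
    rct_censor_times = rct.loc[rct[event_col] == 0, duration_col].to_numpy()
    out = rwd.copy()

    # Draw one candidate censoring time per RWD patient from the RCT distribution;
    # patients whose drawn time precedes their observed time become censored there.
    drawn = rng.choice(rct_censor_times, size=len(out), replace=True)
    becomes_censored = drawn < out[duration_col].to_numpy()
    out.loc[becomes_censored, duration_col] = drawn[becomes_censored]
    out.loc[becomes_censored, event_col] = 0
    return out
```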

Due to a lack of overlap in information between RWD and RCTs, it was not possible to investigate the effect of blood chemistry values and biomarkers, dose administration, intermediate decisions and longitudinal events (tumor progression or adverse effects), and treatments after the first chemotherapy cycle. Toxicity outcomes could not be considered due to the lack of longitudinal information. Furthermore, information on PFS was limited and only reported for a small subset of RCT patients (mainly from PDS_EliLilly).

Cohort discrepancies

Table 2 reports the baseline discrepancies between the studies and the main results of the comparison of the RWD with the aggregated RCT data. A significant difference was observed when comparing survival outcomes in RWD and RCT patients: OS (hazard ratio: 0.65 [0.55, 0.75], reference: RWD) and PFS (hazard ratio: 0.70 [0.58, 0.85]). The full set of results is reported in Table S2.

TABLE 2.

Main results for the aggregated cohort analysis.

| Measurement of outcome (MO) | Parameter analysis | Total cohort (RCT, RWD) | Hazard ratio (ref = RWD) |
| --- | --- | --- | --- |
| Overall survival | Baseline MO difference | 1224 (996, 228) | 0.65 [0.55–0.75]*** |
| | (S) ECOG 0 | 386 (331, 55) | 0.71 [0.52–0.97]* |
| | (S) ECOG 1 | 669 (564, 105) | 0.75 [0.6–0.94]* |
| | (S) ECOG 2 | 169 (101, 68) | 0.73 [0.53–1] |
| | (S) TNM staging STRAT | 1224 (996, IVA: 40, IVB: 188) | IVA: (ref), IVB: 1.9 [1.36–2.7]***, RCT: 1.1 [0.78–1.5] |
| | (S) Oversampling TNM‐IVA stage | 1372 (996, IVA: 188, IVB: 188) | IVA: (ref), IVB: 1.72 [1.40–2.1]***, RCT: 0.96 [0.81–1.1] |
| | (O) Propensity score including censoring as a variable | 456 (228, 228) | 1.1 [0.87–1.3] |
| | (O) Oversampling RWD simulating RCT censoring | 1992 (996, 996) | 0.89 [0.8–0.99]* |
| Progression‐free survival | Baseline MO difference | 689 (461, 228) | 0.7 [0.58–0.85]*** |
| | (S) ECOG 0 | 261 (206, 55) | 0.78 [0.54–1.1] |
| | (S) ECOG 1 | 330 (225, 105) | 0.85 [0.64–1.1] |
| | (S) ECOG 2 | 98 (30, 68) | 1.2 [0.74–2] |
| | (S) TNM staging STRAT | 689 (461, IVA: 40, IVB: 188) | IVA: (ref), IVB: 1.7 [1.21–2.5]**, RCT: 1.1 [0.76–1.6] |
| | (S) Oversampling TNM‐IVA stage | 837 (461, IVA: 188, IVB: 188) | IVA: (ref), IVB: 1.6 [1.26–1.9]**, RCT: 1.0 [0.81–1.2] |
| | (O) ECOG 2 pre‐trial effect | 216 (ECOG 2 RCT: 101, ECOG 2 RWD: 169; ECOG 3 RWD: 47) | ECOG 2 RCT: (ref), ECOG 2 RWD: 0.79 [0.48–1.3], ECOG 3 RWD: 1.26 [0.74–2.1] |
| | (O) Including records censored for study futility | 772 (544, 228) | 0.67 [0.56–0.81]*** |
| | (O) Removing all censored patients | 433 (210, 223) | 1.5 [1.2–1.8]*** |
| | (O) Propensity score including censoring as a variable | 456 (228, 228) | 1.5 [1.2–1.8]*** |
| | (O) Oversampling RWD simulating RCT censoring | 922 (461, 461) | 1.5 [1.2–1.8]*** |

Note: The log‐rank test of the Kaplan–Meier curves confirmed the results obtained with the Cox hazard ratios.

Bold text: detected similarity of outcomes; underlined text: detected switch of survival outcome (better prognosis for real‐world patients). (S), selection criteria; (O), operations; MO, measurement of outcome; RCT, randomized clinical trial; RWD, real‐world data. Hazard ratios are reported with 95% confidence intervals. *p‐value <0.05, **p < 0.01, ***p < 0.001.

Analysis of selection criteria identified several differences in the RWD as compared with the RCTs. In the RWD, the age distribution was skewed toward older age (median: 70 years, range: 42–86 years), the sex ratio was more balanced (male: 43.4%, female: 56.6%), and the frequency of patients with ECOG 2 was higher (n = 68, 29.8%; Table S1). Baseline BM records were not available for all clinical trial patients, or BM was not explicitly mentioned as an exclusion criterion (e.g., PDS_Alliance).

The analysis of operational aspects highlighted differences in censoring between the two settings. Real‐world censoring concerned only a few patients (n = 5) with long survival (subject to right censoring). For trial patients, censoring occurred with higher frequency (e.g., n = 225, 60.8% in PDS_EliLilly) across the survival range of 0–300 days. This resulted in higher estimates of OS in the RCTs compared with the RWD (Figure 2a).

FIGURE 2.

Overall survival Kaplan–Meier curves and number of censored patients over time for (a) the baseline survival gap in the measurement of outcome (n = 1224), and (b) the synthetic oversampled RWD cohort with the same trial censoring (n = 1992). The cohort in (b) was obtained by oversampling the RWD with the SMOTENC algorithm to match the RCT sample size and simulating RWD censoring using the same distribution as the RCT cohort. RWD, real‐world data.

Selection criteria: Performance status and TNM staging

The patient subcohorts that showed similar OS across the two cohorts (i.e., hazard ratios not statistically different from 1) were ECOG 2 patients (Table 2) and RWD patients with TNM stage IVA, using both conventional stratification and SMOTENC (Table 2; Figure S2). Traditional propensity score matching did not overcome the difference in OS between the two cohorts (hazard ratio: 0.55 [0.44, 0.70]; Table S1).

Regarding PFS, the outcomes were similar for the RCT and RWD cohorts across ECOG subgroups, for TNM‐IVA cancer staging of RWD patients using both conventional stratification and SMOTENC oversampling, and for traditional propensity score matching (Table 2).

Operations: The effect of RCT censoring on survival estimation and propensity score matching

The effect of operational aspects could be deduced from the comparison of the survival curve shapes. Figure 2a shows the differing shapes of the Kaplan–Meier curves between the two cohorts (i.e., lower survival in the first 30 days for the RCTs, followed by higher survival for the RCTs until 130 days, before reaching a similar trend for the rest of the longitudinal curve).

The high RCT censoring was a key operational aspect impacting the survival discrepancies. In fact, the higher survival in the RCTs during the first 130 days was strongly related to censoring (Figure 2a): removing the censored patients or simulating the same censoring distribution in the synthetic RWD cohort reduced the difference in OS (Figure 2b; Table S1).

Moreover, adjusting the propensity score matching to account for the censoring operations reduced the discrepancy in OS between RCTs and RWD (Figure 3). In contrast, PFS was higher in the RWD than in the RCT cohort when correcting for censoring (hazard ratio: 1.5 [1.2, 1.8]; Table 2).

FIGURE 3.

Overall survival Kaplan–Meier curves using (a) traditional propensity score matching (n = 456), and (b) propensity score matching accounting for trial censoring (n = 456). RWD, real‐world data.

The lower OS in the RCTs during the first 30 days could be related to a potential progression of baseline ECOG 2 between the time of screening and randomization in the RCTs. In fact, Figure S3 shows that ECOG 2 patients in the RCTs have OS similar to that of ECOG 3 patients from the RWD (n = 47) in the range of 0 to 30 days, and an identical PFS curve.

Pairwise randomized controlled trial comparison to the real‐world population

Figures 4 and 5 show the OS of PDS_EliLilly and PDS_Amgen, and PFS for PDS_EliLilly following simulation‐based oversampling of subgroups. Figure S4 shows the OS for PDS_Alliance and the aggregated cohort of phases I–II patients following simulation‐based oversampling. Overall, the results confirmed the findings of the aggregated RCT analysis (Table S2). In addition, this allowed us to investigate the variability between the RCTs and the impact of correcting for the known discrepancies between the RCTs and RWD.

FIGURE 4.

Matching simulation results for overall survival hazard ratios for PDS_EliLilly and PDS_Amgen. For each simulation scenario, boxplots of the hazard ratio distribution are reported. BM, brain metastases; PR.SC., propensity score; RWD, real‐world data; Strat., stratification. The hazard ratio and 95% confidence interval for the whole cohort are reported as red dotted lines; the ideal scenario of RCT‐RWD matching is reported as a blue dotted line.

FIGURE 5.

Matching simulation results for progression‐free survival hazard ratios for PDS_EliLilly. For each simulation scenario, boxplots of the hazard ratio distribution are reported. BM, brain metastases; PR.SC., propensity score; RWD, real‐world data; Strat., stratification. The hazard ratio and 95% confidence interval for the whole cohort are reported as red dotted lines; the ideal scenario of RCT‐RWD matching is reported as a blue dotted line.

Oversampling of stage IVA patients in the RWD played a role in reducing the discrepancies in OS across all the trials. Similarly, it reduced the discrepancy in PFS for the PDS_EliLilly trial. For the trials with a relevant number of censored patients (i.e., 60% for PDS_EliLilly, 18.28% for the phase I/II aggregated cohort), censoring was the most impactful factor and was associated with a larger difference in the baseline discrepancy of OS between the trial and the RWD cohort.

Figure 5 underlines the impact of operations on PFS: similar outcomes were achieved when correcting for ECOG and TNM stage, whereas correcting for censoring alone increased the discrepancy in PFS between PDS_EliLilly and the RWD, with higher PFS in the RWD cohort.

DISCUSSION

In this study, we developed and tested an approach that aims to explore and estimate the impact of selection criteria and operations on outcome replication between RWD and RCTs. This was done using a systematic approach, attempting to account for known differences in the population samples.

The results of our work are relevant from a general perspective on how to improve future clinical trial design. For example, censoring in clinical trials is a known potential confounder. 42 This work underlined that censoring was a key operational factor with a relatively high impact, leading to a potential overestimation of OS in clinical trials (Table 2; Figures 2 and 3). Moreover, when censored patients were removed, the other variables reported in Table S2 reduced the difference in OS between the RCTs and RWD, indicating how censoring could bias the estimated difference in OS between the cohorts. The interventions we applied to assess the censoring effect were useful as a retrospective correction of the outcomes, but are not applicable as a prospective correction prior to RWD‐RCT translation.

As shown in Figure S3, ECOG 2 RCT patients showed OS and PFS more similar to those of ECOG 3 RWD patients. This could potentially be explained by differences in patient condition between the pre‐trial phase and randomization that could lead to a worsening of baseline ECOG.

The results showed that the challenge of replicating clinical trial outcomes from RWD patients depends not only on discrepancies in patient characteristics, but also on differences in operations. This can be seen in Figures 2 and 5 and Figure S3, where the survival curves of the RCT and RWD patients present differences in OS, potentially due to operational differences in follow‐up. This is underlined by the PFS analysis, where matching of population variables (performance status and TNM staging) resulted in indistinguishable PFS between RCTs and RWD. However, the discrepancy in PFS was larger when accounting for the censoring (Table 2; Figure 5). This result is surprising, and we suspect it could be due to more frequent monitoring of clinical trial patients, which could result in a shorter reported PFS when adverse events or relapse occur.

In previous works, the dominant approach has been the propensity score informed by clinical trial data. 10 , 24 , 25 , 26 , 30 , 31 , 43 No in‐depth examination of differences in operations has been done before. Figure 3 demonstrates the relevant contribution of operations (i.e., censoring) to biasing propensity score matching. This can potentially explain why replication was not achieved in previous works. Indeed, in the ideal condition of having two populations with the same support of baseline covariates and a sufficient population sample, accounting for the potential operational differences should theoretically resolve any misalignment in study outcomes. 9 , 11 , 31 , 43 One limitation of the propensity score is that it works in only one direction, by selecting RWD patients that match the RCTs, thus not accounting for any additional findings and confounders that are only detectable in the RWD population. This approach is limited for prospective applications since it would blindly work only in the scenario where all relevant confounders are accounted for in the RCTs.

RWD can inform additional disparities between RCTs and clinical practice. 1 Previous studies indicate that this difference is consistent and independent of the studied case. 9 , 10 , 13 , 17 One of the observations in this study related to the time‐dynamic differences in Kaplan–Meier curves between RCTs and RWD (Figure 2). Kaplan–Meier curves similar to those in Figure 2a have previously been found in other oncology case studies, 10 , 14 suggesting that the impact of operations is not isolated to small cell lung cancer. Further research is needed to understand the origins of this difference. 35 It has been noted that, in translating between RCTs and RWD, understanding the complexity of clinical practice and treatment processes may be instrumental to explaining this. 35

Previous real‐world evidence studies of lung cancer have mainly focused on non‐small cell lung cancer, leaving small cell lung cancer understudied. 44 To the best of the authors' knowledge, this is the most comprehensive real‐world evidence study of small cell lung cancer to date. Analyses of selection criteria and MO showed results comparable to previous studies, in which a larger difference was observed in OS between RCTs and RWD than in other intermediate end points. 10 , 14 Further, population differences were observed in age, in the underrepresentation of ECOG 2 patients in the trials, and in the higher representation of females in the RWD as compared with the clinical trials. 45 , 46 , 47

The similarity between the stage IVA real‐world patients and the clinical trial patients was an interesting finding. The 8th TNM staging (with sub‐categorization of ED‐SCLC patients into IVA and IVB stages) had not yet been developed when the trials were executed, and the LD/ED staging is still largely used to define treatment intention. 38 Previous studies have pointed out significant differences in survival between IVA and IVB patients. 38 , 39 , 43 , 48 This result may be due to selection bias in RCTs, favoring younger patients with comparably better overall health and lower disease burden, which is only partially captured by the TNM staging. However, the low sample size of IVA patients in the real‐world cohort is a source of potential bias, and further investigation would be needed to confirm this.

The SOMO approach aims to contribute to the research direction of establishing an analytical process to estimate the challenging trade‐off between internal validity and generalizability. 6 , 11 , 13 , 33 In this work we proposed how such a framework could be beneficial for both translational directions: clinical trials can be improved by understanding their discrepancy with the real world, and real‐world therapies can be leveraged by comparing the discrepancies of trials from the real world (e.g., Figure 4; Figure S4).

We are not claiming to have developed a definitive approach; as this work shows, we identified several factors that could not be accounted for due to the lack of overlapping information. However, we believe that further development of the general method will allow for learning across translational activities. This may contribute to building regulatory acceptance of the real‐world evidence approach over time, as has been the case with other mechanistic modeling efforts (e.g., metabolic drug–drug interaction predictions 34 , 49 ). This is a first attempt at pushing the real‐world evidence paradigm into a similar learn‐confirm cycle from which we learn case by case and pragmatically improve the usage of RWD in future clinical trials.

The work presents some limitations to be addressed in future research. There is still a set of key hypotheses reported in Table 1 that could not be addressed in the present study. The RWD represented only a single center and lacked longitudinal variables (e.g., tumor progression, doses, and adverse effects). To address these issues, the collection of these variables and a more extensive collection of data are being performed; expansion of the RWD patient cohort through national cancer registries is the next step of the analysis.

Another set of limitations not addressed in our work lies in the inherent difference of comparing populations with different drug use. RCTs use fixed dosing regimens, and adverse events are managed according to a treatment protocol. In contrast, RWD allows more freedom for dose adjustments. This difference can lead to discrepancies in the evaluation of a drug's effectiveness and safety, and differences in dosing could potentially affect the frequency of reported adverse events and clinical outcomes. 50 With improved longitudinal data collection in the RWD, it could be possible to correct for dose discrepancies using modeling approaches. Furthermore, a potential issue arises in other treatment areas where medication adherence may play an important role in treatment.

In this study, the experts involved represented the clinical side of the real‐world domain. In the future, we will explore the involvement of experts on clinical trials in oncology and from regulatory agencies.

Moreover, SOMO was tested on small cell lung cancer, and research on other cancers or clinical case studies would be beneficial to increase the generalizability of the work.

The increasing presence of RWD in clinical trial studies constitutes a natural step for regulatory decisions and future study designs. Improving upon approaches such as SOMO would pave the way to understanding how RWD can be used to close the gap between the internal validity and generalizability of clinical trials.

AUTHOR CONTRIBUTIONS

L.M. and A.S.D. wrote the manuscript. L.M., A.S.D., A.D., S.T., J.R., and S.M. designed the research. L.M., A.S.D., S.T., A.D., R.L., and L.D.P. performed the research. L.M., A.S.D., S.T., A.D., R.L., and L.D.P. analyzed the data.

FUNDING INFORMATION

This work was supported by the Swedish Cancer Society (grant nos. CAN 2018/597 and CAN2021/1469 Pj01) to R. Lewensohn and by the Stockholm Cancer Society (grant no. #201202 to R. Lewensohn and grant nos. #174063, #201103, and #231123 to L. De Petris).

CONFLICT OF INTEREST STATEMENT

The authors declared no competing interests for this work.

Supporting information

Data S1.

CTS-17-e13909-s001.docx (929.3KB, docx)

ACKNOWLEDGMENTS

This project is a contribution to the Center for Data‐Driven Health (CDDH), KTH Royal Institute of Technology (https://www.kth.se/sv/cddh).

Marzano L, Darwich AS, Dan A, et al. Exploring the discrepancies between clinical trials and real‐world data: A small‐cell lung cancer study. Clin Transl Sci. 2024;17:e13909. doi: 10.1111/cts.13909

REFERENCES

  • 1. Beaulieu‐Jones BK, Finlayson SG, Yuan W, et al. Examining the use of real‐world evidence in the regulatory process. Clin Pharmacol Ther. 2020;107(4):843‐852. doi: 10.1002/CPT.1658 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Franklin JM, Schneeweiss S. When and how can real world data analyses substitute for randomized controlled trials? Clin Pharmacol Ther. 2017;102(6):924‐933. doi: 10.1002/cpt.857 [DOI] [PubMed] [Google Scholar]
  • 3. Schurman B. The framework for FDA's real‐world evidence program. Appl Clin Trials. 2019;28(4):15‐17. [Google Scholar]
  • 4. Dagenais S, Russo L, Madsen A, Webster J, Becnel L. Use of real‐world evidence to drive drug development strategy and inform clinical trial design. Clin Pharmacol Ther. 2022;111(1):77‐89. doi: 10.1002/CPT.2480 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Burns L, Roux NL, Kalesnik‐Orszulak R, et al. Real‐world evidence for regulatory decision‐making: guidance from around the world. Clin Ther. 2022;44(3):420‐437. doi: 10.1016/j.clinthera.2022.01.012 [DOI] [PubMed] [Google Scholar]
  • 6. Baumfeld Andre E, Carrington N, Siami FS, et al. The current landscape and emerging applications for real‐world data in diagnostics and clinical decision support and its impact on regulatory decision making. Clin Pharmacol Ther. 2022;112(6):1172‐1182. doi: 10.1002/CPT.2565 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Campbell UB, Honig N, Gatto NM. SURF: a screening tool (for sponsors) to evaluate whether using real‐world data to support an effectiveness claim in an FDA application has regulatory feasibility. Clin Pharmacol Ther. 2023;114(5):981‐993. doi: 10.1002/cpt.3021 [DOI] [PubMed] [Google Scholar]
  • 8. Wang CY, Berlin JA, Gertz B, et al. Uncontrolled extensions of clinical trials and the use of external controls—scoping opportunities and methods. Clin Pharmacol Ther. 2022;111(1):187‐199. doi: 10.1002/CPT.2346 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Wang SV, Sreedhara SK, Schneeweiss S, et al. Reproducibility of real‐world evidence studies using clinical practice data to inform regulatory and coverage decisions. Nat Commun. 2022;13(1):1‐11. doi: 10.1038/s41467-022-32310-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Jemielita T, Widman L, Fox C, Salomonsson S, Liaw KL, Pettersson A. Replication of oncology randomized trial results using Swedish registry real world‐data: a feasibility study. Clin Pharmacol Ther. 2021;110(6):1613‐1621. doi: 10.1002/CPT.2424 [DOI] [PubMed] [Google Scholar]
  • 11. Ramagopalan SV, Simpson A, Sammon CJ. Can real‐world data really replace randomised clinical trials? BMC Med. 2020;18(1):13. doi: 10.1186/s12916-019-1481-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Lin J, Liao R, Gamalo‐Siebers M. Dynamic incorporation of real world evidence within the framework of adaptive design. J Biopharm Stat. 2022;32(6):986‐998. doi: 10.1080/10543406.2022.2089159 [DOI] [PubMed] [Google Scholar]
  • 13. He Z, Tang X, Yang X, et al. Clinical trial generalizability assessment in the big data era: a review. Clin Transl Sci. 2020;13(4):675‐684. doi: 10.1111/cts.12764 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Tan K, Bryan J, Segal B, et al. Emulating control arms for cancer clinical trials using external cohorts created from electronic health record‐derived real‐world data. Clin Pharmacol Ther. 2022;111(1):168‐178. doi: 10.1002/CPT.2351 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Franklin JM, Glynn RJ, Martin D, Schneeweiss S. Evaluating the use of nonrandomized real‐world data analyses for regulatory decision making. Clin Pharmacol Ther. 2019;105(4):867‐877. doi: 10.1002/CPT.1351 [DOI] [PubMed] [Google Scholar]
  • 16. Franklin JM, Pawar A, Martin D, et al. Nonrandomized real‐world evidence to support regulatory decision making: process for a randomized trial replication project. Clin Pharmacol Ther. 2020;107(4):817‐826. doi: 10.1002/cpt.1633 [DOI] [PubMed] [Google Scholar]
  • 17. Bartlett V, Dhruva S, Shah N, Ryan P, Ross J. Feasibility of using real‐world data to replicate clinical trial evidence. JAMA Netw Open. 2019;2(10):e1912869. doi: 10.1001/jamanetworkopen.2019.12869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Abrahami D, Pradhan R, Yin H, Honig P, Andre EB, Azoulay L. Use of real‐world data to emulate a clinical trial and support regulatory decision making: assessing the impact of temporality, comparator choice, and method of adjustment. Clin Pharmacol Ther. 2021;109(2):452‐461. doi: 10.1002/cpt.2012 [DOI] [PubMed] [Google Scholar]
  • 19. Ho M, Gruber S, Fang Y, et al. Examples of applying RWE causal‐inference roadmap to clinical studies. Stat Biopharm Res. 2023;14:26‐39. doi: 10.1080/19466315.2023.2177333 [DOI] [Google Scholar]
  • 20. Stewart M, Norden AD, Dreyer N, et al. An exploratory analysis of real‐world end points for assessing outcomes among immunotherapy‐treated patients with advanced non–small‐cell lung cancer. JCO Clin Cancer Inform. 2019;3:1‐15. doi: 10.1200/cci.18.00155 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Oksen D, Prince P, Boutmy E, et al. Treatment effectiveness in a rare oncology indication: lessons from an external control cohort study. Clin Transl Sci. 2022;15(8):1990‐1998. doi: 10.1111/CTS.13315 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Rivera DR, Henk HJ, Garrett‐Mayer E, et al. The friends of cancer research real‐world data collaboration pilot 2.0: methodological recommendations from oncology case studies. Clin Pharmacol Ther. 2022;111(1):283‐292. doi: 10.1002/CPT.2453 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Cramer‐van der Welle CM, Verschueren MV, Tonn M, et al. Real‐world outcomes versus clinical trial results of immunotherapy in stage IV non‐small cell lung cancer (NSCLC) in The Netherlands. Sci Rep. 2021;11(1):6306. doi: 10.1038/s41598-021-85696-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Lakdawalla DN, Shafrin J, Hou N, et al. Predicting real‐world effectiveness of cancer therapies using overall survival and progression‐free survival from clinical trials: empirical evidence for the ASCO value framework. Value Health. 2017;20(7):866‐875. doi: 10.1016/J.JVAL.2017.04.003 [DOI] [PubMed] [Google Scholar]
  • 25. Lasiter L, Tymejczyk O, Garrett‐Mayer E, et al. Real‐world overall survival using oncology electronic health record data: friends of cancer research pilot. Clin Pharmacol Ther. 2022;111(2):444‐454. doi: 10.1002/CPT.2443 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Loureiro H, Roller A, Schneider M, Talavera‐López C, Becker T, Bauer‐Mehren A. Matching by OS prognostic score to construct external controls in lung cancer clinical trials. Clin Pharmacol Ther. 2024;115:333‐341. doi: 10.1002/cpt.3109 [DOI] [PubMed] [Google Scholar]
  • 27. Webster‐Clark M, Stürmer T, Wang T, et al. Using propensity scores to estimate effects of treatment initiation decisions: state of the science. Stat Med. 2021;40(7):1718‐1735. doi: 10.1002/sim.8866 [DOI] [PubMed] [Google Scholar]
  • 28. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41‐55. doi: 10.1093/biomet/70.1.41 [DOI] [Google Scholar]
  • 29. Loureiro H, Becker T, Bauer‐Mehren A, Ahmidi N, Weberpals J. Artificial intelligence for prognostic scores in oncology: a benchmarking study. Front Artif Intell. 2021;4:625573. doi: 10.3389/FRAI.2021.625573 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Liu R, Rizzo S, Whipple S, et al. Evaluating eligibility criteria of oncology trials using real‐world data and AI. Nature. 2021;592(7855):629‐633. doi: 10.1038/S41586-021-03430-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Franklin J, Patorno E, Desai R, et al. Emulating randomized clinical trials with nonrandomized real‐world evidence studies first results from the RCT DUPLICATE initiative. Circulation. 2021;143(10):1002‐1013. doi: 10.1161/CIRCULATIONAHA.120.051718 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Feinberg BA, Gajra A, Zettler ME, Phillips TD, Phillips EG, Kish JK. Use of real‐world evidence to support FDA approval of oncology drugs. Value Health. 2020;23(10):1358‐1365. doi: 10.1016/J.JVAL.2020.06.006 [DOI] [PubMed] [Google Scholar]
  • 33. Bolislis WR, Fay M, Kühler TC. Use of real‐world data for new drug applications and line extensions. Clin Ther. 2020;42(5):926‐938. doi: 10.1016/j.clinthera.2020.03.006 [DOI] [PubMed] [Google Scholar]
  • 34. Wang Y, Zhu H, Madabushi R, Liu Q, Huang SM, Zineh I. Model‐informed drug development: current US regulatory practice and future considerations. Clin Pharmacol Ther. 2019;105(4):899‐911. doi: 10.1002/cpt.1363 [DOI] [PubMed] [Google Scholar]
  • 35. Greenhalgh T, Papoutsi C. Studying complexity in health services research: desperately seeking an overdue paradigm shift. BMC Med. 2018;16(1):95. doi: 10.1186/s12916-018-1089-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Johal S, Hettle R, Carroll J, Maguire P, Wynne T. Real‐world treatment patterns and outcomes in small‐cell lung cancer: a systematic literature review. J Thorac Dis. 2021;13(6):3692‐3707. doi: 10.21037/jtd-20-3034 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Tendler S, Zhan Y, Pettersson A, et al. Treatment patterns and survival outcomes for small‐cell lung cancer patients—a Swedish single center cohort study. Acta Oncol. 2020;59(4):388‐394. doi: 10.1080/0284186X.2019.1711165 [DOI] [PubMed] [Google Scholar]
  • 38. Tendler S, Grozman V, Lewensohn R, Tsakonas G, Viktorsson K, De Petris L. Validation of the 8th TNM classification for small‐cell lung cancer in a retrospective material from Sweden. Lung Cancer. 2018;120:75‐81. doi: 10.1016/j.lungcan.2018.03.026 [DOI] [PubMed] [Google Scholar]
  • 39. Marzano L, Darwich AS, Tendler S, et al. A novel analytical framework for risk stratification of real‐world data using machine learning: a small cell lung cancer study. Clin Transl Sci. 2022;15(10):2437‐2447. doi: 10.1111/CTS.13371 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Green AK, Reeder‐Hayes KE, Corty RW, et al. The project data sphere initiative: accelerating cancer research by sharing data. Oncologist. 2015;20(5):464‐e20. doi: 10.1634/theoncologist.2014-0431 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over‐sampling technique. J Artif Intell Res. 2002;16:321‐357. doi: 10.1613/jair.953 [DOI] [Google Scholar]
  • 42. Gilboa S, Pras Y, Mataraso A, Bomze D, Markel G, Meirson T. Informative censoring of surrogate end‐point data in phase 3 oncology trials. Eur J Cancer. 2021;153:190‐202. doi: 10.1016/j.ejca.2021.04.044 [DOI] [PubMed] [Google Scholar]
  • 43. Tan F, Bi N, Zhang H, et al. External validation of the eighth edition of the TNM classification for lung cancer in small cell lung cancer. Lung Cancer. 2022;170:98‐104. doi: 10.1016/j.lungcan.2022.03.011 [DOI] [PubMed] [Google Scholar]
  • 44. Pietanza MC, Byers LA, Minna JD, Rudin CM. Small cell lung cancer: will recent progress lead to improved outcomes? Clin Cancer Res. 2015;21(10):2244‐2255. doi: 10.1158/1078-0432.CCR-14-2958 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Pang HH, Wang X, Stinchcombe TE, et al. Enrollment trends and disparity among patients with lung cancer in National Clinical Trials, 1990 to 2012. JCO. 2016;34(33):3992‐3999. doi: 10.1200/JCO.2016.67.7088 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Hutchins LF, Unger JM, Crowley JJ, Coltman CA, Albain KS. Underrepresentation of patients 65 years of age or older in cancer‐treatment trials. N Engl J Med. 1999;341(27):2061‐2067. doi: 10.1056/NEJM199912303412706 [DOI] [PubMed] [Google Scholar]
  • 47. Jaoude JA, Kouzy R, Mainwaring W, et al. Performance status restriction in phase III cancer clinical trials. J Natl Compr Cancer Netw. 2020;18(10):1322‐1326. doi: 10.1200/jco.2020.38.15_suppl.2059 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Hwang JK, Page BJ, Flynn D, et al. Validation of the eighth edition TNM lung cancer staging system. J Thorac Oncol. 2020;15(4):649‐654. doi: 10.1016/j.jtho.2019.11.030 [DOI] [PubMed] [Google Scholar]
  • 49. Shebley M, Sandhu P, Emami Riedmaier A, et al. Physiologically based pharmacokinetic model qualification and reporting procedures for regulatory submissions: a consortium perspective. Clin Pharmacol Ther. 2018;104(1):88‐110. doi: 10.1002/cpt.1013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Lombard A, Mistry H, Aarons L, Ogungbenro K. Dose individualisation in oncology using chemotherapy‐induced neutropenia: example of docetaxel in non‐small cell lung cancer patients. Br J Clin Pharmacol. 2021;87(4):2053‐2063. doi: 10.1111/bcp.14614 [DOI] [PubMed] [Google Scholar]
