Correlation Between Early Trends of a Prognostic Biomarker and Overall Survival in Non–Small-Cell Lung Cancer Clinical Trials

Hugo Loureiro; Theresa M Kolben; Astrid Kiermaier; Dominik Rüttinger; Narges Ahmidi; Tim Becker; Anna Bauer-Mehren

doi:10.1200/CCI.23.00062

2023 Nov 3;7:e2300062. doi: 10.1200/CCI.23.00062

PMCID: PMC10730042 PMID: 37922432

Abstract

PURPOSE

Overall survival (OS) is the primary end point in phase III oncology trials. Given low success rates, surrogate end points, such as progression-free survival or objective response rate, are used in early go/no-go decision making. Here, we investigate whether early trends of OS prognostic biomarkers, such as the ROPRO and DeepROPRO, can also be used for this purpose.

METHODS

Using real-world data, we emulated a series of 12 advanced non–small-cell lung cancer (aNSCLC) clinical trials, which were originally conducted by six different sponsors and evaluated four different mechanisms of action, in a total of 19,920 individuals. We evaluated early trends (until 6 months) of the OS biomarker alongside early OS within the joint model (JM) framework. Study-level estimates of early OS and ROPRO trends were correlated against the actual final OS hazard ratios (HRs).

RESULTS

We observed a strong correlation between the JM estimates and final OS HR at 3 months (adjusted $R^{2}$ = 0.88) and at 6 months (adjusted $R^{2}$ = 0.85). In the leave-one-out analysis, there was a low overall prediction error of the OS HR at both 3 months (root-mean-square error [RMSE] = 0.11) and 6 months (RMSE = 0.12). In addition, at 3 months, the absolute prediction error of the OS HR was lower than 0.05 for three trials.

CONCLUSION

We describe a pipeline to predict trial OS HRs using emulated aNSCLC studies and their early OS and OS biomarker trends. The method has the potential to accelerate and improve decision making in drug development.

A longitudinal prognostic score showed a high correlation with OS in lung cancer clinical trials.

INTRODUCTION

In oncology clinical trials, the gold standard measure of efficacy is overall survival (OS). To get a reliable estimate of OS, a high number of patients and a long follow-up are required.^1,2 These requirements constrain the estimation of OS across clinical trial phases: (1) in early phases (I/II) where both the number of patients and follow-up time are limited and (2) in interim analyses of late phases, where the follow-up time might still be a limiting factor.

CONTEXT

Key Objective
Early efficacy predictions in interim analyses of clinical trials are usually dependent on surrogate end points such as progression-free survival. In this introductory analysis, we explored the correlation between the longitudinal trend of prognostic scores (risk trend) at early time points (equivalent to interim analysis) and efficacy.
Knowledge Generated
We considered 12 clinical trials emulated with a large real-world database. The risk trend in interim analyses strongly correlated with the efficacy.
Relevance
The observed correlation suggests that the early risk trend could be an interesting additional tool for internal decision making. Still, further validation analyses in different types of data are necessary to develop the methodology.

To assist with early go/no-go decisions in the aforementioned settings, several statistical and machine learning tools have been proposed. Specifically, Shameer et al³ created a software tool that predicts the OS hazard ratio (HR) on the basis of progression-free survival (PFS) values from early interim analyses. Seo et al⁴ approached this problem by analyzing the relationship between the molecular structure, drug target, and drug success. In addition, Beinse et al⁵ and Hegge et al⁶ aimed to predict success of new molecules given drug/trial characteristics and results from phases I and II, respectively. Finally, Schperberg et al⁷ created an algorithm to predict both OS and PFS results given drug, target, and trial characteristics. In this work, we explore whether early OS in combination with the early trend of oncology prognostic scores is predictive of final OS results and can be used to inform go/no-go decisions.

Oncology prognostic scores, such as ROPRO⁸ or DeepROPRO,⁹ that were recently introduced by our group are correlated with OS. Both models combine a set of 27 parameters describing the host (demographics, vitals, and blood test parameters), the lifestyle (BMI and smoking history), and the tumor characteristics, all of which are associated with cancer survival. We use the (prognostic score) risk trend from baseline (treatment start) in a joint modeling framework, which measures the patient-level deviation of these scores from the start of treatment. We hypothesize that the risk trend can represent the actual improvement/deterioration of the patient's condition/fitness over time. Therefore, in a clinical trial setting, we expect that the treatment arm whose patients have the highest improvement in risk trend should be the arm with the highest OS benefit.

In this study, we performed retrospective analyses to investigate the applicability of the risk trends, combined with observed early OS results, to inform go/no-go decisions in interim analyses of late-stage clinical trials. For the benchmarking of our approach, we emulated 12 recent phase III clinical trials (covering a wide range of medication types) using data from a large real-world database and assessed the performance of our approach in these data.

METHODS

Ethics Statement

Institutional Review Board approval of the Flatiron Health (FH) study protocol for data collection from the real-world cohort was obtained before study conduct, including a waiver of informed consent. Additional details on Flatiron's institutional review board approval are outlined below:

• IRB name: WCG IRB
• Protocol No. and title: RWE-001: The Flatiron Health Real World Evidence Parent Protocol
• Registration No.: IRB00000533
• Protocol approval ID/tracking No.: 420180044

Joint Modeling of Risk Trend and OS

We modeled OS alongside the risk trends using Joint Models for Longitudinal and Survival Data (in short JM).¹⁰ JM couples the risk trend from baseline (the longitudinal variable) with the survival information, measuring the impact of the risk trend on survival and hence eliminating the possible bias.¹⁰ Specifically, we defined the JM as

$$h_{i}(t) = h_{0}(t) \exp\{γ \cdot \mathrm{treatment}_{i} + α \cdot \mathrm{risk}_{\mathrm{trend}, i}(t)\},$$

$$\mathrm{risk}_{\mathrm{trend}, i}(t) = β_{0} + (b_{1, i} + β_{1}) \cdot t + β_{2} \, t \cdot \mathrm{treatment}_{i}. \qquad (1)$$

where $h_{i}(t)$ is the hazard function of patient $i$, $h_{0}(t)$ is the baseline hazard, $γ$ is the direct effect of the treatment on the hazard (ie, analogous to the classical OS HR of the treatment), $α$ measures the impact of the risk trend on the hazard, $β_{0}$ and $β_{1}$ are the intercept and slope coefficients, $β_{2}$ is an additional slope coefficient that depends on the arm of treatment, and $b_{1, i}$ is a patient-specific slope random effect. The $\mathrm{risk}_{\mathrm{trend}}$ value at time $t$ for patient $i$ is given by the difference between the risk at time $t$ and the baseline risk: $\mathrm{risk}_{\mathrm{trend}, i}(t) = \mathrm{risk}_{i}(t) - \mathrm{risk}_{i}(t = 0)$ . The risk values at all time points ( $\mathrm{risk}_{i}(t)$ ) were calculated using the selected prognostic scores.
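As an aid to reading Equation 1, the following short derivation sketches what the two treatment-specific coefficients imply. It assumes two hypothetical patients who share the same random slope $b_{1}$ and differ only in treatment arm (treatment = 1 versus 0); this conditional comparison is an illustration and not a quantity reported in the study. Their modeled risk trends differ by $β_{2} t$, so the ratio of their hazards is

$$\frac{h(t \mid \mathrm{treatment} = 1)}{h(t \mid \mathrm{treatment} = 0)} = \frac{h_{0}(t) \exp\{γ + α [β_{0} + (b_{1} + β_{1}) t + β_{2} t]\}}{h_{0}(t) \exp\{α [β_{0} + (b_{1} + β_{1}) t]\}} = \exp\{γ + α β_{2} t\}.$$

Under these assumptions, $γ$ carries the time-constant part of the treatment effect, whereas $α β_{2} t$ carries the part that acts through the diverging risk trends of the two arms.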

As an auxiliary analysis, we also visualize the average progression of the risk trends using the locally estimated scatterplot smoothing (LOESS) smoother¹¹ and provide an illustration of $\mathrm{risk}_{\mathrm{trend}}$ . These visualizations are solely for illustrative purposes and are not used in the actual framework described below.
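The definition of the risk trend as the change in risk from baseline can be sketched in a few lines of Python. This is a minimal illustration rather than the study code: the function name `risk_trend`, the tuple layout, and the toy values are hypothetical, and the earliest measurement of each patient is assumed to be the baseline (treatment start).

```python
from collections import defaultdict


def risk_trend(measurements):
    """Return {patient_id: [(t, risk(t) - risk(0)), ...]} sorted by time.

    `measurements` is an iterable of (patient_id, months_since_start, risk)
    tuples. Assumption: the earliest record of a patient is the baseline.
    """
    by_patient = defaultdict(list)
    for patient_id, t, risk in measurements:
        by_patient[patient_id].append((t, risk))

    trends = {}
    for patient_id, rows in by_patient.items():
        rows.sort()              # earliest measurement first
        baseline = rows[0][1]    # risk at treatment start
        trends[patient_id] = [(t, r - baseline) for t, r in rows]
    return trends


# Hypothetical prognostic scores for two patients (not study data).
toy = [
    ("p1", 0.0, 0.50), ("p1", 1.0, 0.25), ("p1", 3.0, 0.0),
    ("p2", 0.0, -0.25), ("p2", 2.0, 0.25),
]
trends = risk_trend(toy)
print(trends["p1"])  # falling risk, ie, an improving patient
print(trends["p2"])  # rising risk, ie, a deteriorating patient
```

A negative trend corresponds to a risk score that improved relative to treatment start, which is the direction the framework associates with OS benefit.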

Figure 1 summarizes the risk trend framework.

FIG 1. — The risk trend framework workflow. HR, hazard ratio; JM, joint model; OS, overall survival.

Correlation Analysis

Within each study, we fitted the JM using either all information available until 3 months or, alternatively, until 6 months (all later patient information was censored) and extracted the treatment-specific JM coefficients ( $β_{2}$ , $γ$ ). These two time points represent clinical trial interim analyses. Next, we investigated whether the early JM coefficients correlated with the final study OS HRs. Specifically, we performed a weighted linear regression

$$\log(\mathrm{OS\,HR}_{j}) = θ_{0} + θ_{1} β_{2, j} + θ_{2} γ_{j} + ϵ, \quad j \in \{1, \ldots, 12\} \qquad (2)$$

between the JM coefficients ( $β_{2}$ , $γ$ ) from early time points and the final OS HR of the 12 emulated studies ( $j$ is the trial iterator and $θ$ are regression coefficients). The regression was weighted by the inverse variance of $\mathrm{OS\,HR}_{j}$ . The adjusted multiple correlation coefficient ( $R^{2}$ ) of the linear regression and Kendall $τ$ served as quality measures of the regression. In addition, we performed a leave-one-study-out analysis, where we predicted the final OS HR of the study from its specific JM coefficients using the regression formula (Eq 2) derived from the other 11 studies. In the leave-one-out analysis, we used the root-mean-square error (RMSE) to characterize the prediction performance. We estimated the CI of the $R^{2}$ and RMSE with bootstrap¹² by resampling the trial results and refitting Equation 2.
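The regression and leave-one-study-out steps described above can be sketched as follows. This is an illustrative pure-Python version, not the authors' R implementation: all function names and the six toy trials are hypothetical, the toy log HRs are noise-free by construction so that the fit recovers the generating coefficients, and computing the RMSE on the HR scale (rather than the log scale) is an assumption.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def wls(X, y, w):
    """Weighted least squares: minimise sum_j w_j * (y_j - X_j . theta)^2."""
    p = len(X[0])
    xtwx = [[sum(wj * xj[a] * xj[b] for xj, wj in zip(X, w)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(wj * xj[a] * yj for xj, yj, wj in zip(X, y, w)) for a in range(p)]
    return solve(xtwx, xtwy)


def predict(theta, x):
    return sum(t * v for t, v in zip(theta, x))


def adjusted_r2(X, y, w, theta):
    """Weighted R^2 adjusted for the number of predictors (intercept excluded)."""
    n, k = len(y), len(X[0]) - 1
    ybar = sum(wj * yj for wj, yj in zip(w, y)) / sum(w)
    ss_res = sum(wj * (yj - predict(theta, xj)) ** 2 for xj, yj, wj in zip(X, y, w))
    ss_tot = sum(wj * (yj - ybar) ** 2 for yj, wj in zip(y, w))
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)


def loo_rmse(X, y, w):
    """Leave-one-study-out RMSE of the predicted HR (assumed HR scale)."""
    sq = []
    for j in range(len(y)):
        keep = [i for i in range(len(y)) if i != j]
        theta = wls([X[i] for i in keep], [y[i] for i in keep], [w[i] for i in keep])
        sq.append((math.exp(predict(theta, X[j])) - math.exp(y[j])) ** 2)
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical trial-level inputs (not study data).
beta2 = [-0.10, -0.05, 0.00, 0.05, -0.02, 0.03]
gamma = [-0.4, -0.1, 0.0, 0.2, -0.6, 0.1]
se_log_hr = [0.10, 0.15, 0.12, 0.20, 0.10, 0.18]
log_hr = [0.1 + 2.0 * b + 0.5 * g for b, g in zip(beta2, gamma)]  # noise-free toy

X = [[1.0, b, g] for b, g in zip(beta2, gamma)]   # columns: intercept, beta2, gamma
w = [1.0 / s ** 2 for s in se_log_hr]             # inverse-variance weights

theta = wls(X, log_hr, w)
adj_r2 = adjusted_r2(X, log_hr, w, theta)
rmse = loo_rmse(X, log_hr, w)
print(theta, adj_r2, rmse)
```

With real, noisy trial-level estimates the adjusted $R^{2}$ would fall below 1 and the leave-one-out RMSE above 0; a bootstrap CI could be obtained by resampling the rows of `X`, `log_hr`, and `w` with replacement and repeating the fit.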

Finally, to describe the contribution of the $β_{2}$ and $γ$ coefficients to the OS HR prediction, we performed an additional analysis where we fit the JM (Eq 1) without the $γ$ coefficient. Next, we performed the same linear regression and leave-one-out analyses (Eq 2), including only the $β_{2}$ coefficient.
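The 3- and 6-month cutoffs used in this section amount to administratively censoring each patient's record before the JM is fitted. The sketch below shows one way to do this; it is not taken from the study code. The function name `censor_at` and the record layout are hypothetical, and the cutoff is assumed to be measured per patient from treatment start, which the text does not state explicitly.

```python
def censor_at(records, cutoff_months):
    """Administratively censor follow-up at `cutoff_months` after treatment start.

    `records` is a list of dicts with keys:
      time   - months from treatment start to death or last contact
      event  - 1 if death was observed, 0 if censored
      scores - list of (months_since_start, risk) measurements
    """
    out = []
    for rec in records:
        late = rec["time"] > cutoff_months
        out.append({
            "time": cutoff_months if late else rec["time"],
            "event": 0 if late else rec["event"],  # deaths after the cutoff are unseen
            "scores": [(t, r) for t, r in rec["scores"] if t <= cutoff_months],
        })
    return out


# Hypothetical patients (not study data).
toy = [
    {"time": 2.0, "event": 1, "scores": [(0.0, 0.5), (1.0, 0.75)]},
    {"time": 10.0, "event": 1, "scores": [(0.0, 0.25), (2.0, 0.0), (5.0, -0.25)]},
    {"time": 4.0, "event": 0, "scores": [(0.0, 0.0), (3.0, 0.25), (4.0, 0.5)]},
]
interim = censor_at(toy, 3.0)
for rec in interim:
    print(rec["time"], rec["event"], len(rec["scores"]))
```

Only the first toy patient keeps an observed death, because the other two are still under follow-up at the 3-month cutoff.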

Clinical Trial Emulation with RWD

We emulated previously conducted clinical trials with RWD to obtain a data set on which to evaluate our prediction framework since the actual study data were not available to us in the majority of cases. First, we gathered a comprehensive list of phase III lung cancer clinical trials from ClinicalTrials.gov covering multiple medication types (Fig 2). Next, we emulated these clinical trials using the deidentified electronic health record (EHR)–derived FH database. The FH database is a longitudinal database, comprising deidentified patient-level structured and unstructured data, curated via technology-enabled abstraction.^13,14 From FH, we extracted deidentified information collected between January 2011 and December 2020 from approximately 280 US cancer clinics (approximately 800 sites of care) about the first-line treatment of 34,061 patients diagnosed with advanced non–small-cell lung cancer (aNSCLC).

FIG 2. — Flowchart of the clinical trial selection. NSCLC, non–small-cell lung cancer.

We focused on phase III aNSCLC studies since (1) aNSCLC is one of the most common types of cancers and (2) phase III trials usually report OS results. We extracted a list of 184 clinical trials from ClinicalTrials.gov (accessed on October 5, 2021). From the initial list, we excluded 171 clinical trials on the basis of the trial design and patient availability in FH (Fig 2 shows detailed criteria). The final list of 12 potentially reproducible clinical trials is included in Table 1 and the Data Supplement (Table S1; following the study by Yang et al,¹⁵ we incorporated LUX-Lung 3 and LUX-Lung 6 together, lowering the number of trials by one).

TABLE 1. Absolute Prediction Error of the Trial's OS HR Using the ROPRO Risk Trend and Early OS

To emulate the clinical trials, we selected patients from FH who were prescribed the trial's medications. We applied the clinical trial–specific inclusion/exclusion criteria (such as tumor histology or specific tumor mutations). We relaxed bloodwork inclusion/exclusion criteria following the results from the study by Liu et al.¹⁶ Next, to reduce confounding, we applied propensity score matching^17,18 on the baseline (before treatment) ROPRO value. Propensity score matching attempts to create experimental and control arms with reduced confounding bias. The approach of controlling for confounding with prognostic scores was introduced in the landmark study by Stuart et al,¹⁸ where they showed that prognostic score values controlled for the bias. Finally, to verify that our emulated clinical trials are concordant with the original clinical trials, we contrasted their OS HR confidence intervals.
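The matching step can be illustrated with a deliberately simplified sketch. The study used the MatchIt R package; the code below is not that implementation. It matches directly on a single baseline score with a greedy 1:1 nearest-neighbour rule and a caliper, and the function name `match_nearest`, the caliper value, and the toy scores are all hypothetical.

```python
def match_nearest(treated, control, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    `treated` and `control` map patient id -> baseline score. Each treated
    patient is paired with the closest unused control, provided that the
    score difference does not exceed `caliper`.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical baseline scores (not study data).
treated = {"t1": 0.10, "t2": 0.50, "t3": 2.00}
control = {"c1": 0.12, "c2": 0.45, "c3": 0.90, "c4": 0.11}
pairs = match_nearest(treated, control, caliper=0.2)
print(pairs)
```

Patient `t3` remains unmatched because no control lies within the caliper, which mirrors how matching trades sample size for comparability of the arms.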

Implementation

The analyses were performed using R 4.0.4¹⁹ and Python 3.6. The ROPRO⁸ and DeepROPRO⁹ were calculated as specified in the original publications. The propensity score matching, DeepSurv, and JM were implemented using the MatchIt,²⁰ DeepSurv,²¹ and JM²² packages, respectively. The analysis code is available on GitHub.²³

RESULTS

Evaluation and Sensitivity of the Emulated Trials

The data sets of the 12 RWD-emulated clinical trials cover a wide range of patient numbers (from 190 to 2,156 patients) per treatment arm (baseline characteristics are available in the Data Supplement [Tables S3-S14]). The emulated clinical trials were generally more inclusive than the original trials, including more female patients (who are still under-represented in clinical trials²⁴) and slightly older patients (eg, median age of the KEYNOTE-189 control arm: 63.5 years; median age of the KEYNOTE-189 emulated control: 68 years). Despite the slight differences in baseline characteristics, the OS results of the emulated and original clinical trials were consistent in the majority of trials. There was a high correlation between the emulated and actual HR ( $R^{2}$ of 0.86) and a moderate error (RMSE of 0.17). Still, there were some observed differences in the emulated OS. For instance, in both LUX-Lung 3+6 and PROFILE 1014, there was a higher OS benefit in the emulated (OS HR LUX-Lung 3+6: 0.40, PROFILE 1014: 0.50) versus original (OS HR LUX-Lung 3+6: 0.81, PROFILE 1014: 0.67) clinical trials (the Data Supplement [Fig S1] contains a comparison of the actual and emulated HRs).

Correlation of JM Coefficients With Final OS HR

Next, we explored the correlation between the early treatment-specific JM coefficients ( $β_{2}$ , $γ$ ) in our makeshift interim analyses and the final OS HR in the 12 considered trials. The JM coefficients at 3 months highly correlated with the final OS HR (ROPRO JM adjusted $R^{2}$ values [bootstrap CI]: 0.88 [0.62 to 0.98], Kendall $τ$ : 0.82, and Fig 3). The 6-month JM coefficients similarly correlated with the final OS HR (ROPRO JM adjusted $R^{2}$ values [bootstrap CI]: 0.85 [0.52 to 0.98], Kendall $τ$ : 0.82). In addition, the DeepROPRO JM coefficients had similarly high correlation with the final OS HR (3-month adjusted $R^{2}$ values [bootstrap CI]: 0.86 [0.52 to 0.98], and 6-month adjusted $R^{2}$ values [bootstrap CI]: 0.82 [0.40 to 0.98], Kendall $τ$ : 0.70).

FIG 3. — Scatter plot of the final OS HRs versus the ROPRO JM coefficients at 3 months. The plot includes the adjusted $R^{2}$ value and its bootstrapped CI of the regression. Our linear prediction model of OS HR depends on both $β_{2}$ and $γ$ coefficients; here, we consider the plane $γ = 0$ to make a simple 2D representation of the prediction model and correlation. 2D, two-dimensional; HR, hazard ratio; JM, joint model; OS, overall survival.
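Kendall $τ$, reported alongside the adjusted $R^{2}$ above, is a rank-based agreement measure. A minimal sketch of the tau-a variant is given below; which variant the authors used and which two quantities they ranked are not stated in the text, so comparing regression-fitted with observed HRs is an assumption, and the five toy value pairs are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs; ties add 0."""
    score = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
        pairs += 1
    return score / pairs


# Hypothetical fitted and observed OS HRs for five trials (not study data).
fitted = [0.60, 0.70, 0.80, 0.90, 1.00]
observed = [0.55, 0.75, 0.72, 0.95, 1.05]
tau = kendall_tau(fitted, observed)
print(tau)
```

In this toy example, nine of the ten trial pairs are ordered the same way by both quantities and one pair is reversed, giving (9 - 1)/10.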

As a sensitivity analysis, we repeated the previous analysis without the two emulated studies that had a higher OS benefit than the original trials. The correlation results were similar to those in the previous analysis (3-month adjusted $R^{2}$ [bootstrap CI] 0.90 [0.46 to 0.99], Kendall $τ$ 0.87).

Prediction of Final OS HR Using JM Coefficients

Having observed the strong correlation between the JM coefficients and the OS HR, we investigated whether the JM coefficients ( $β_{2}$ , $γ$ ) could predict the final OS HR values in a leave-one-out analysis. The JM coefficients obtained from only 3 months of data (Fig 4) predicted the final OS HR with low error (RMSE [bootstrap CI] for ROPRO JM: 0.11 [0.08 to 0.14] and DeepROPRO JM: 0.11 [0.08 to 0.14]). Remarkably, for the ROPRO JM models (considering both coefficients [ $β_{2}$ , $γ$ ]), five studies had an absolute OS HR error of <0.1 and three trials had an absolute error lower than 0.05 (AURA3: 0.03, KEYNOTE-042: 0.04, and KEYNOTE-024: 0.02). When additional data were added to the models (up to 6 months), there was a similar overall prediction error (RMSE [bootstrap CI] for ROPRO: 0.12 [0.07 to 0.16] and for DeepROPRO: 0.13 [0.08 to 0.17]). For the ROPRO risk trend, however, more studies (eight in total) had an absolute OS HR error lower than 0.1, and five had an absolute OS HR error lower than 0.05 (including FLAURA: 0.03, ClinicalTrials.gov identifier: NCT00540514: 0.01, AURA3: 0.02, and ClinicalTrials.gov identifier: NCT00520676: 0.00). The full prediction errors for the ROPRO and DeepROPRO analyses are available in Table 1 and the Data Supplement (Table S2), respectively.

FIG 4. — Absolute prediction error of the OS HRs obtained with the ROPRO JM coefficients. HR, hazard ratio; JM, joint model; OS, overall survival.

Characterization of the Effect of the $β_{2}$ and $γ$ Parameters

To determine which parameter contributed more to the predictive performance, we performed an additional analysis without the $γ$ parameter. The performance using only the $β_{2}$ parameter remained close to the previous results in both the 3-month (ROPRO adjusted $R^{2}$ [bootstrap CI]: 0.85 [0.53 to 0.97], Kendall $τ$ : 0.88) and 6-month (ROPRO adjusted $R^{2}$ [bootstrap CI]: 0.86 [0.46 to 0.98], Kendall $τ$ : 0.88) analyses.

Additional Analysis: Risk Trend Concordance With the OS Benefit

In addition, we verified that the risk trends (the signs of the $β_{2}$ coefficient) were concordant with the original clinical trial's OS benefit in 11 of 12 clinical trials (all but PRONOUNCE, Table 1, Fig 5). That is, the medications that had the lowest risk trend (ie, highest risk improvement over time) also had the highest OS. Only in PRONOUNCE, the carboplatin + pemetrexed treatment arm had significantly higher risk trend values although there was no difference in OS between the arms in the original study. A representation of the risk curves is available in the Data Supplement (Figs S2-S13).

FIG 5. — Superimposed risk trend and survival curves of the (A) KEYNOTE-024 and (B) NCT00520676 clinical trials. In each panel, the first two plots are the ROPRO and DeepROPRO trends, followed by the OS Kaplan-Meier curves and the risk table. The HR is referent to the effect of the pembrolizumab and carboplatin + docetaxel, respectively. HR, hazard ratio; OS, overall survival.

DISCUSSION

We introduced a new research and development (R&D) decision support methodology leveraging the time course of an OS prognostic marker. It models the difference in the risk trend and early OS between treatment arms with JMs. Our results show that the early JM coefficients (at 3 and 6 months) correlated with the final OS HR and could predict the final OS HR of unseen trials with a small error (in most cases as low as 0.1).

The JM framework had an adequate performance when lower amounts of information were included. Specifically, the correlation performance with 3-month JM coefficients was similar to the performance later at 6 months (Kendall $τ$ of 0.82 and 0.82, adjusted $R^{2}$ of 0.88 and 0.85, respectively). The prediction performance was also similar (RMSE of 0.11 and 0.12, respectively). In addition, the performance of the risk trend framework did not decrease substantially when the sample size was lower (PROFILE 1014 and ClinicalTrials.gov identifier: NCT00520676, both studies with about 190 patients per arm). For both these studies, there was a low absolute prediction error of the final OS HR (always below 0.11 and as low as 0 for ClinicalTrials.gov identifier: NCT00520676).

The JM coefficients obtained higher correlation (adjusted $R^{2}$ ) results than other analyses that considered the correlation between early PFS and final OS HR. Specifically, the recent analysis by Shameer et al,²⁵ which also considered multiple mechanisms of action, reported overall $R^{2}$ values of 0.23 (and of 0.86 for PD-1/PD-L1 inhibitors), whereas our adjusted $R^{2}$ was higher at 0.88. Still, our analysis is based on a comparatively small number of studies and drugs; our adjusted $R^{2}$ value is therefore subject to change, and we cannot rule out that the performance of our approach is to some extent overestimated by chance. Nevertheless, we think that the results presented here are substantial and support the role of early OS/risk trend modeling as a further decision-making tool. The method is not intended as a replacement for, but as an add-on to, prediction via progression results. We note that one could construct a combined framework in the future; progression, early OS, and risk trend could be simultaneously modeled in the JM framework to further improve OS HR prediction.

Following the moderate error obtained in the prediction analysis, we argue that one possible use for the risk trend framework could be to inform futility analyses in phase III interim analyses. In addition, another possible use of the risk trend framework could be as an initial indicator of OS benefit in early clinical trial phases (this setting was not considered in this work and needs to be studied in a future analysis). At the aforementioned stages, the risk trend framework could be used alongside other methods such as those in the studies by Beinse et al⁵ and Shameer et al³ to inform go/no-go decisions. All these methods consider different types of data, and hence, their joint use could more comprehensively describe the effects of the drug. In summary, our analysis suggests that the risk trend framework has the potential to serve as a valuable additional R&D support tool. Still, further analyses would be necessary to validate the risk trend framework in these settings.

At this stage of the risk trend framework, we did not attempt to formally prove its surrogacy to OS according to the guidelines introduced by Prentice.^26-28 Although the framework might have the potential to generate a surrogate end point, we focused here solely on its possible utility in early decision making. Proof of surrogacy would be an effort reaching beyond the scope of this manuscript, requiring the involvement of further types of data (eg, clinical studies and other RWD sources), other pharmaceutical companies, and academic institutions.

In addition, we focused on aNSCLC. The good performance at early time points (3 months) is likely partially due to the generally low median survival time observed in aNSCLC. Further analyses are required to validate the framework in other cancer indications, also with respect to the identification of optimal time points for interim analyses. Nevertheless, from our experience with the baseline pan-cancer ROPRO,^8,29 we believe that the risk trend of ROPRO has the potential to perform well in other indications.

Since our analyses were conducted using RWD, it has to be shown that the results translate to clinical trials. More validation is needed, especially in early clinical trial phases. Finally, our analysis is biased toward trials testing efficacious medications as only these medications are available in RWD. We plan to investigate the risk trend framework further in drugs that failed to show efficacy.

Finally, we used the ROPRO and DeepROPRO prognostic scores to calculate the risk. These prognostic scores are composed mainly of vital and blood test parameters. There are other, independent, prognostic biomarkers that were not included in these models (cancer-specific biomarkers, circulating tumor DNA, and C-reactive protein, among others). These can be investigated in their own right, using the risk trend framework we exemplified here, or be combined into a joint score to further increase the performance.

In conclusion, trustworthy estimates of OS are essential for precise decision making in clinical trials. Our results show that the early OS/risk trend framework (using ROPRO/DeepROPRO) predicted treatment benefit for the majority of emulated clinical trials studied. In our analysis, prediction of the OS HR with a low error was possible at 3 and 6 months after the start of treatment for most considered trials.

The results of this initial analysis introduce the risk trend framework as a potential new R&D decision support tool for aNSCLC clinical trials. Further analyses in clinical studies and other RWD sources are necessary to further validate the risk trend framework in aNSCLC.

ACKNOWLEDGMENT

The authors thank Carlos Talavera-López (Institute for Computational Health, Helmholtz Munich), Fabian Schmich (Roche Innovation Center Munich), and Bruno Gomes (Roche Innovation Center Basel) for their valuable input.

SUPPORT

Supported by the Helmholtz Association under the joint research school Munich School for Data Science (MUDS), in which Hugo Loureiro is a doctoral researcher.

T.B. and A.B.-M. contributed equally to this work.

AUTHOR CONTRIBUTIONS

Conception and design: Theresa M. Kolben, Astrid Kiermaier, Narges Ahmidi, Tim Becker, Anna Bauer-Mehren

Collection and assembly of data: Hugo Loureiro, Tim Becker

Data analysis and interpretation: Hugo Loureiro, Theresa M. Kolben, Narges Ahmidi, Tim Becker

Manuscript writing: All authors

Final approval of manuscript: All authors

Accountable for all aspects of the work: All authors

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/cci/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians.

Hugo Loureiro

Employment: Roche

Theresa M. Kolben

Employment: Bayer HealthCare Pharmaceuticals, Roche

Stock and Other Ownership Interests: Bayer HealthCare Pharmaceuticals, Roche

Patents, Royalties, Other Intellectual Property: I am a patent holder based on my work at Roche

Astrid Kiermaier

Employment: Roche

Stock and Other Ownership Interests: Roche

Patents, Royalties, Other Intellectual Property: Patent applications in context of HER2 disease

Dominik Rüttinger

Employment: Bayer

Leadership: Bayer

Stock and Other Ownership Interests: Bayer

Patents, Royalties, Other Intellectual Property: I am a patent holder for patents derived out of my R&D work at Roche Diagnostics GmbH

Narges Ahmidi

Honoraria: Sanofi

Research Funding: Roche Diagnostics Penzberg (Inst)

Tim Becker

Consulting or Advisory Role: xValue GmbH

Anna Bauer-Mehren

Employment: Roche

Stock and Other Ownership Interests: Roche

No other potential conflicts of interest were reported.

REFERENCES

1. Mushti SL, Mulkey F, Sridhara R: Evaluation of overall response rate and progression-free survival as potential surrogate endpoints for overall survival in immunotherapy trials. Clin Cancer Res 24:2268-2275, 2018
2. Zhuang SH, Xiu L, Elsayed YA: Overall survival: A gold standard in search of a surrogate: The value of progression-free survival and time to progression as end points of drug efficacy. Cancer J 15:395-400, 2009
3. Shameer K, Zhang Y, Prokop A, et al: OSPred tool: A digital health aid for rapid predictive analysis of correlations between early end points and overall survival in non–small-cell lung cancer clinical trials. JCO Clin Cancer Inform 10.1200/CCI.21.00173
4. Seo S, Kim Y, Han H-J, et al: Predicting successes and failures of clinical trials with outer product–based convolutional neural network. Front Pharmacol 12:670670, 2021
5. Beinse G, Tellier V, Charvet V, et al: Prediction of drug approval after phase I clinical trials in oncology: RESOLVED2. JCO Clin Cancer Inform 10.1200/CCI.19.00023
6. Hegge S, Thunecke M, Krings M, et al: Predicting success of phase III trials in oncology. medRxiv 2020.12.15.20248240, 2020
7. Schperberg AV, Boichard A, Tsigelny IF, et al: Machine learning model to predict oncologic outcomes for drugs in randomized clinical trials. Int J Cancer 147:2537-2549, 2020
8. Becker T, Weberpals J, Jegg AM, et al: An enhanced prognostic score for overall survival of patients with cancer derived from a large real-world cohort. Ann Oncol 31:1561-1568, 2020
9. Loureiro H, Becker T, Bauer-Mehren A, et al: Artificial intelligence for prognostic scores in oncology: A benchmarking study. Front Artif Intell 4:9, 2021
10. Rizopoulos D: Joint Models for Longitudinal and Time-to-Event Data: With Applications in R. Boca Raton, FL, CRC Press, 2012
11. Cleveland WS, Devlin SJ: Locally weighted regression: An approach to regression analysis by local fitting. J Am Stat Assoc 83:596-610, 1988
12. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning: Data Mining, Inference, and Prediction (ed 2). New York, NY, Springer Science & Business Media, 2009
13. Ma X, Long L, Moon S, et al: Comparison of population characteristics in real-world clinical oncology databases in the US: Flatiron Health, SEER, and NPCR. medRxiv 2020.03.16.20037143, 2020
14. Birnbaum B, Nussbaum N, Seidl-Rathkopf K, et al: Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research. arXiv 2001.09765 [cs], 2020. http://arxiv.org/abs/2001.09765
15. Yang JC-H, Wu Y-L, Schuler M, et al: Afatinib versus cisplatin-based chemotherapy for EGFR mutation-positive lung adenocarcinoma (LUX-Lung 3 and LUX-Lung 6): Analysis of overall survival data from two randomised, phase 3 trials. Lancet Oncol 16:141-151, 2015
16. Liu R, Rizzo S, Whipple S, et al: Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 592:629-633, 2021
17. Ho DE, Imai K, King G, et al: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit Anal 15:199-236, 2007
18. Stuart EA, Lee BK, Leacy FP: Prognostic score–based balance measures can be a useful diagnostic for propensity score methods in comparative effectiveness research. J Clin Epidemiol 66:S84-S90.e1, 2013
19. R Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria, R Foundation for Statistical Computing, 2018. https://www.R-project.org/
20. Ho D, Imai K, King G, et al: MatchIt: Nonparametric preprocessing for parametric causal inference. J Stat Softw 42:1-28, 2011
21. Katzman JL, Shaham U, Cloninger A, et al: DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol 18:24, 2018
22. Rizopoulos D: JM: An R package for the joint modelling of longitudinal and time-to-event data. J Stat Softw 35:1-33, 2010
23. Loureiro H: Risk trend analysis GitHub repository. https://github.com/loureirh/risktrend
24. Bierer BE, Meloney LG, Ahmed HR, et al: Advancing the inclusion of underrepresented women in clinical research. Cell Rep Med 3:100553, 2022
25. Shameer K, Zhang Y, Jackson D, et al: Correlation between early endpoints and overall survival in non-small-cell lung cancer: A trial-level meta-analysis. Front Oncol 11:672916, 2021
26. Prentice RL: Surrogate endpoints in clinical trials: Definition and operational criteria. Stat Med 8:431-440, 1989
27. Alonso A, Bigirumurame T, Burzykowski T, et al: Applied Surrogate Endpoint Evaluation Methods with SAS and R. Chapman and Hall/CRC, 2016. https://www.taylorfrancis.com/books/9781482249378
28. Burzykowski T, Molenberghs G, Buyse M (eds): The Evaluation of Surrogate Endpoints. New York, NY, Springer New York, 2005
29. Becker T, Mailman M, Tan S, et al: Comparison of overall survival prognostic power of contemporary prognostic scores in prevailing tumor indications. Med Res Arch 10.18103/mra.v11i4.3638


