Abstract
Background
Phase III trials with prospective biomarker validation are essential to drug development in the era of personalized oncology. However, concerns have emerged regarding the design and reporting of these trials.
Methods
We searched MEDLINE for phase III oncology trials with prospective biomarker validation published in high-impact medical journals from 2011 to 2020. Information regarding trial design and reporting was extracted. Descriptive methods were used to summarize the results.
Results
We identified 45 phase III trials with prospective biomarker validation. There was a trend toward increasing use of biomarker validation phase III trials (from 1 trial in 2011 to 12 trials in 2020). For 39 (86.7%) trials, results in the biomarker-negative population were either listed as an exploratory subgroup analysis (62.2%) or not mentioned in the methods (24.4%). Twenty-one (46.7%) trials were originally designed without biomarker validation but were then apparently modified to incorporate prospective biomarker validation after trial commencement, and only 15 (33.3%) trials reported this change. Treatment effect and primary outcome values in biomarker-negative patients were not reported in 24.4% and 40.0% of trials, respectively. Of the 18 trials with statistically significant results in the overall population, only 7 reported a hazard ratio less than 0.8 in the biomarker-negative population.
Conclusions
Although biomarker validation in phase III trials has been increasingly used in the past decade, several issues require substantial improvement: undisclosed changes in trial design after commencement, underreporting of results in biomarker-negative groups, and recommendation of treatment in biomarker-negative groups despite modest effects.
Efficient development of novel molecularly targeted and immune anticancer therapies requires identification of predictive biomarkers that can reliably divide the patient population into a biomarker-defined subgroup that benefits from the therapy (ie, the biomarker-positive population) and the remaining subgroup (ie, the biomarker-negative population). To address this issue, several phase III biomarker validation designs have been proposed (1-5). When there is a strong rationale that the treatment effect of the investigated agent is confined to the biomarker-positive population, the so-called “enrichment design,” which enrolls only biomarker-positive patients, is often used. When a potential treatment benefit in the biomarker-negative population cannot be ruled out, the so-called “biomarker-stratified design,” in which patients are randomly assigned to the new or standard treatment with stratification by biomarker status, would be an appropriate approach. For this approach, a key issue is how to prioritize the biomarker-positive, biomarker-negative, and overall populations in the statistical hierarchy.
A direct method is to use a parallel biomarker subgroup–specific design powered to detect a clinically relevant treatment effect separately in the biomarker-positive and biomarker-negative populations. An alternative, indirect approach is to formally assess treatment benefit in the biomarker-positive population and the overall population but not in the biomarker-negative population, which usually allows for a smaller sample size than the subgroup-specific design. The major difference between the biomarker-positive and overall design and the biomarker subgroup–specific design is whether a confirmatory conclusion can be drawn for the biomarker-negative subgroup. Biomarker-positive and overall designs are not powered to detect a difference in the biomarker-negative subgroup, so the result in that subgroup is considered exploratory. In the biomarker subgroup–specific design, by contrast, the result in the biomarker-negative subgroup is usually a coprimary objective that is adjusted for multiplicity and adequately powered to support a confirmatory conclusion. Biomarker-positive and overall designs have been controversial: they have been criticized for maximizing the potential target population of investigated drugs rather than selecting a sensitive subpopulation that derives tangible clinical benefit from the treatment (6-8). Another concern is that the results for biomarker-negative patients may not be reported, preventing physicians from making robust benefit-risk assessments in biomarker-negative subpopulations.
Our study aimed to assess the design and reporting of phase III oncology trials with predictive biomarker validation published in 7 high-impact medical journals from January 2011 to December 2020. In particular, for trials that reported a significant treatment effect in the overall population, we evaluated the reporting and magnitude of the treatment effect in the biomarker-negative subgroup.
Methods
Identification of phase III randomized controlled trials (RCTs) with prospective biomarker validation
A literature search was performed using MEDLINE to identify all randomized controlled phase III studies published over a 10-year period, from January 2011 to December 2020, in 7 high-impact medical journals that have published the vast majority of practice-changing phase III RCTs: New England Journal of Medicine, Lancet, JAMA, Lancet Oncology, Journal of Clinical Oncology, JAMA Oncology, and Annals of Oncology. The detailed search strategy is provided in the Supplementary Methods (available online). All issues of these journals over the 2011-2020 period were also hand-searched to supplement the search results.
On initial screening, we excluded noncancer studies; phase I, II, or IV trials; secondary reports of published trials; and trials that investigated nondrug interventions. The full article of each remaining trial was then screened to identify biomarker validation phase III trials. Eligible trials met the following 2 criteria. First, the trial was designed to assess the primary endpoint in 2 or more populations, at least 1 of which was the biomarker-positive subgroup. Second, evaluation of the treatment effect in the biomarker-positive population was confirmatory rather than exploratory; that is, the biomarker-positive population was clearly stated as either a coprimary population or a secondary population included in the strategy for controlling the type I error rate (the rate of falsely rejecting a true null hypothesis) (9). Biomarkers were defined as characterization of biologic molecules or diagnostic tests carried out on DNA, RNA, proteins, and metabolites from blood, body fluids, or tissues (2).
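To illustrate what inclusion in a type I error control strategy can mean in practice, the following is a minimal sketch of fixed-sequence (hierarchical) testing, one common multiplicity strategy. It is not taken from any included trial: the function name `fixed_sequence_test`, the ordering of populations, and the P values are hypothetical and serve only as an example.

```python
def fixed_sequence_test(ordered_p_values, alpha=0.05):
    """Fixed-sequence testing: hypotheses are tested in a prespecified
    order, each at the full alpha. Once one hypothesis fails, no later
    hypothesis can be declared significant, whatever its P value.
    """
    results = []
    still_testing = True
    for population, p_value in ordered_p_values:
        significant = still_testing and p_value < alpha
        results.append((population, significant))
        if not significant:
            still_testing = False  # the gate closes for all later tests
    return results


# Hypothetical P values for illustration only (not from any trial).
passing = fixed_sequence_test([("biomarker-positive", 0.012), ("overall", 0.030)])
blocked = fixed_sequence_test([("biomarker-positive", 0.200), ("overall", 0.001)])
```

In the second call the overall population is not declared significant despite its small P value, because the earlier test in the hierarchy failed. A population that is never placed in such a hierarchy, as was typical for the biomarker-negative subgroup, can only be analyzed in an exploratory fashion.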
For trials incorporating more than 1 predictive biomarker, we selected the main biomarker according to the following order: if multiple biomarkers were validated in a hierarchical sequence, the first biomarker was chosen. If multiple biomarkers were tested in parallel, the one validated as the predictive biomarker was chosen. For trials that investigated a continuous biomarker with different cutoffs, the highest cutoff was chosen to define the biomarker-positive population.
Data collection
The following information was extracted from eligible trials: journal, year of publication, cancer type, sample size for biomarker-positive population and overall population, type of intervention (immunotherapy, targeted therapy, or chemotherapy), treatment setting (palliative, adjuvant, or neoadjuvant), funding source, blinding, and primary outcome measure.
We also extracted information on biomarker validation design that we considered essential for evaluating the appropriateness of the methods and the validity of the results: design type (biomarker subgroup–specific design vs biomarker-positive and overall design); randomization according to biomarker status; sample size calculation for the biomarker-positive, biomarker-negative, or overall population; data type of the biomarker (continuous data with cutoff vs binary data); use of an adaptive population-enrichment design; statistical strategy used for multiplicity adjustment; and status of the biomarker-negative population in the objective hierarchy (co-primary objective, secondary objective, or exploratory subgroup). In an adaptive population-enrichment design, all comers are accrued initially regardless of biomarker status, and the eligibility criteria may change adaptively on the basis of planned interim analysis results; if only the biomarker-positive patients are benefiting, further enrollment in the biomarker-negative subgroup is terminated. We also examined whether trials were originally designed as conventional phase III trials but were subsequently changed to predictive biomarker validation trials during trial conduct, either by comparing the primary objectives among different versions of the protocol, if available, or by tracking the “History of Changes” section in ClinicalTrials.gov. If a trial had changed, we recorded whether this change was reported.
The “positivity” of each trial was determined according to the identified multiplicity strategy. For trials reporting positive results in overall population, we screened the conclusion section to see if the authors recommended the investigated experimental treatment for the overall population. We also recorded whether the results of the biomarker-negative population were reported and the type of presentation (figure, forest plot, text only, or appendix only). We extracted the treatment effect size and 95% confidence intervals of primary outcome in biomarker-positive, biomarker-negative, and overall populations. For trials with co-primary endpoints of overall survival (OS) and progression-free survival, the results of OS were extracted unless the OS results were immature.
Two authors (F.L. and L.P.) independently selected eligible trials and extracted data, and disagreements were discussed with a third author (J.S.) to reach consensus.
Statistical analysis
The primary objective of our study was to describe the design and reporting of phase III oncology trials with biomarker validation. Categorical variables are presented as frequencies and percentages, and quantitative variables are presented as median and range. Differences in treatment effect size between the biomarker-positive population and the biomarker-negative population were evaluated by using the ratio of hazard ratios (RHR), which has been used in previous studies (10,11). The RHR was defined as the ratio of the hazard ratio (HR) in the biomarker-positive population to that in the biomarker-negative population, so an RHR below 1 indicates a larger treatment effect in the biomarker-positive population. We then estimated a combined RHR and 95% confidence interval across the included trials by pooling the individual RHRs in a random-effects meta-analysis model with inverse-variance weighting.
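The RHR calculation can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: standard errors are recovered from reported 95% confidence intervals, the 2 subgroups are treated as independent, and the DerSimonian-Laird estimator is assumed for the between-trial variance (the text specifies only a random-effects model). All function names and input values are hypothetical.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% confidence interval


def log_hr_and_se(hr, lower, upper):
    """Log hazard ratio and its standard error, recovered from a 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z95)


def log_rhr(positive, negative):
    """Log RHR (biomarker-positive vs biomarker-negative) and its variance.

    Variances add because the two subgroups contain different patients.
    """
    log_pos, se_pos = log_hr_and_se(*positive)
    log_neg, se_neg = log_hr_and_se(*negative)
    return log_pos - log_neg, se_pos ** 2 + se_neg ** 2


def pool_random_effects(estimates):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed)."""
    y = [est for est, _ in estimates]
    v = [var for _, var in estimates]
    w = [1 / var for var in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    tau2 = 0.0
    if len(y) > 1:
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = [1 / (var + tau2) for var in v]
    mean = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return math.exp(mean), math.exp(mean - Z95 * se), math.exp(mean + Z95 * se)


# Hypothetical (HR, lower, upper) pairs for illustration only.
trials = [
    ((0.60, 0.45, 0.80), (0.85, 0.68, 1.06)),
    ((0.55, 0.40, 0.76), (0.80, 0.62, 1.03)),
    ((0.70, 0.55, 0.89), (0.95, 0.78, 1.16)),
]
pooled_rhr, ci_low, ci_high = pool_random_effects([log_rhr(p, n) for p, n in trials])
```

Pooling is done on the log scale because log hazard ratios are approximately normally distributed; the pooled estimate and its interval are exponentiated at the end.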
Results
The search strategy yielded 3683 results, and 3538 articles were excluded (Supplementary Figure 1, available online). A total of 45 phase III RCTs with predictive biomarker validation were identified and analyzed (Figure 1).
Figure 1.
Number of phase III trials with biomarker validation per year between 2011 and 2020.
The characteristics of included RCTs are detailed in Table 1, and the full list of included trials is provided in Supplementary Table 1 (available online). The journal with the most included trials was the New England Journal of Medicine (31.1%), and the majority of studied cancer types were lung cancer (33.3%) and breast cancer (17.8%). Immunotherapy represented 64.4% of all experimental agents in the study cohort. The median sample size was 719 (range = 333-2312) and 341 (range = 94-853) for the overall population and biomarker-positive population, respectively. Most of the included trials were in the palliative setting (88.9%), superiority trials (90.4%), and open label (57.8%). OS (35.6%) was the most used primary endpoint, followed by progression-free survival (28.9%). The trials that were solely industry funded totalled 91.1%.
Table 1.
Characteristics of included trials (N = 45)a
Characteristics | No. of trials (%) |
---|---|
Journal | |
New England Journal of Medicine | 14 (31.1) |
Lancet | 11 (24.4) |
Lancet Oncology | 12 (26.7) |
Journal of Clinical Oncology | 4 (8.9) |
JAMA Oncology | 3 (6.7) |
Annals of Oncology | 1 (2.2) |
Year of publication | |
2011-2015 | 3 (6.7) |
2016-2020 | 42 (93.3) |
Cancer type | |
Lung | 15 (33.3) |
Breast | 8 (17.8) |
Gastric or esophageal | 5 (11.1) |
Urothelial | 4 (8.9) |
Ovarian | 4 (8.9) |
Others | 9 (20.0) |
Type of intervention | |
Immunotherapy | 29 (64.4) |
Targeted therapy | 14 (31.1) |
Chemotherapy | 2 (4.4) |
Setting | |
Palliative | 40 (88.9) |
Adjuvant | 3 (6.7) |
Neoadjuvant | 2 (4.4) |
Sample size of overall population | |
Median (range) | 719 (333-2312) |
Sample size of biomarker-positive population | |
Median (range) | 341 (94-853) |
Funding source | |
Solely industry funded | 41 (91.1) |
Partially industry funded | 4 (8.9) |
Blinding | |
Double-blind | 19 (42.2) |
Open label | 26 (57.8) |
Primary outcome measure | |
OS | 16 (35.6) |
PFS | 13 (28.9) |
PFS and OS | 11 (24.4) |
DFS | 3 (6.7) |
EFS | 1 (2.2) |
PCR | 1 (2.2) |
DFS = disease-free survival; EFS = event-free survival; OS = overall survival; PCR = pathological complete response; PFS = progression-free survival.
Over time, there has been a numerical trend toward increasing use of prospective predictive biomarker validation in phase III RCTs (from 1 trial in 2011 to 12 trials in 2020) (Figure 1), with only 3 (6.7%) trials published between 2011 and 2015 and 42 (93.3%) trials published between 2016 and 2020.
The design features are shown in Table 2. Among the 45 predictive biomarker validation phase III RCTs, 5 (11.1%) employed the biomarker subgroup–specific design and 40 (88.9%) used the biomarker-positive and overall design. The investigated biomarkers were used as randomization stratification factors in only 22 (48.9%) trials. All trials applied statistical methods for multiplicity adjustment to control the overall type I error rate. Sample size calculations were provided for the biomarker-positive, biomarker-negative, and overall populations in 42 (93.3%), 5 (11.1%), and 39 (86.7%) trials, respectively. A total of 17 (37.8%) trials investigated binary biomarkers, and 27 (60.0%) trials involved continuous biomarkers with cutoff values. Two (4.4%) trials employed an adaptive population-enrichment design. For 39 (86.7%) trials, efficacy results in biomarker-negative populations either were listed as exploratory subgroup analysis (62.2%) or were not even mentioned in the “Methods” section (24.4%). Twenty-one (46.7%) trials were originally designed as conventional RCTs focusing on a single primary population but were changed to incorporate prospective biomarker validation during conduct of the trial; only 15 of these 21 trials (33.3% of all included trials) reported the design change in the text.
Table 2.
Design characteristics of included trials (N = 45)
Characteristics | No. of trials (%) |
---|---|
Design type | |
Biomarker subgroup–specific design | 5 (11.1) |
Biomarker-positive/overall design | 40 (88.9) |
Stratified by biomarker status | |
Yes | 22 (48.9) |
No | 23 (51.1) |
Sample size calculation for biomarker-positive population | |
Yes | 42 (93.3) |
No | 3 (6.7) |
Sample size calculation for overall population | |
Yes | 39 (86.7) |
No | 6 (13.3) |
Sample size calculation for biomarker-negative population | |
Yes | 5 (11.1) |
No | 40 (88.9) |
Adaptive population-enrichment design | |
Yes | 2 (4.4) |
No | 43 (95.6) |
Multiplicity adjustments | |
Yes | 45 (100.0) |
No | 0 |
Data type of biomarker | |
Continuous data with cutoff | 27 (60.0) |
Binary data | 17 (37.8) |
NAa | 1 (2.2) |
Status of biomarker-negative population in objective hierarchy | |
Co-primary objective | 5 (11.1) |
Secondary objective | 1 (2.2) |
Exploratory subgroup | 28 (62.2) |
Not mentioned in “Methods” section | 11 (24.4) |
Changed to predictive biomarker validation trials during trial conduct | |
Yes | 21 (46.7) |
No | 24 (53.3) |
Reported the change in the text | |
Yes | 15 (33.3) |
No | 6 (13.3) |
One trial aimed to validate a gene signature with a split-sample approach based on the adaptive population-enrichment design. One-third (training set) of study patients were to be used to look for a predictive gene signature. The remaining two-thirds of study patients (test set) were to be used for clinical validation of the gene signature. But no gene signature was identified in the training set. NA = not applicable.
For trials with the biomarker-positive and overall design, 15 (33.3%) reported positive results in both the biomarker-positive and overall populations, 5 (11.1%) reported positive results only in the biomarker-positive population, and 2 (4.4%) reported positive results only in the overall population. For trials with the biomarker subgroup–specific design, 1 (2.2%) trial reported positive results in both the biomarker-positive and biomarker-negative populations, and 3 (6.7%) trials reported positive results only in the biomarker-positive population. A total of 42 (93.3%) trials drew their conclusions based on prespecified multiplicity adjustment methods. One trial overruled its prespecified statistical design and concluded that the benefit was limited to biomarker-positive patients. Two trials concluded that the benefit was seen in the overall population regardless of biomarker status, even though statistical significance was not demonstrated under multiplicity control. Eleven (24.4%) trials did not report the treatment effect size in the biomarker-negative population, and 18 (40.0%) trials did not provide the primary outcome measure value by treatment arm in the biomarker-negative population. For 7 (15.6%) trials, the results in the biomarker-negative population were presented only in the online Appendix (Table 3).
Table 3.
Results and reporting of included trials (N = 45)
Characteristics | No. of trials (%) |
---|---|
Results for trials with biomarker-positive and overall design | |
Positive in biomarker-positive and overall populations | 15 (33.3) |
Positive in biomarker-positive population | 5 (11.1) |
Positive in overall population | 2 (4.4) |
Negative in biomarker-positive and overall populations | 18 (40.0) |
Results for trials with biomarker subgroup–specific design | |
Positive in biomarker-positive and biomarker-negative populations | 1 (2.2) |
Positive in biomarker-positive population | 3 (6.7) |
Positive in biomarker-negative population | 0 |
Negative in biomarker-positive and biomarker-negative populations | 1 (2.2) |
Drawing conclusion according to prespecified multiplicity adjustment methods | |
Yes | 42 (93.3) |
No | 3 (6.7) |
Reporting of treatment effect size in biomarker-negative populationa | |
Yes | 33 (73.3) |
No | 11 (24.4) |
NAb | 1 (2.2) |
Reporting of primary outcome measure value by study arm in biomarker-negative populationc | |
Yes | 26 (57.8) |
No | 18 (40.0) |
NAb | 1 (2.2) |
Mode of presentation of efficacy results in biomarker-negative population | |
Figure | 12 (26.7) |
Forest plot | 12 (26.7) |
Text only | 2 (4.4) |
Appendix only | 7 (15.6) |
Treatment effect size refers to hazard ratio for time to event outcome and odds ratio or rate difference for binary outcome.
One trial aimed to validate a gene signature with a split-sample approach based on the adaptive population-enrichment design. One-third (training set) of study patients were to be used to look for a predictive gene signature. The remaining two-thirds of study patients (test set) were to be used for clinical validation of the gene signature. But no gene signature was identified in the training set.
Outcome measure value refers to medians or milestones for time to event outcome and proportions for binary outcome.
Eighteen trials reported a statistically significant treatment effect in the overall population, as summarized in Table 4. Only 1 (5.6%) trial adopted the biomarker subgroup–specific design. In 7 (38.9%) trials, patients were not stratified according to biomarker status. Efficacy results in biomarker-negative populations were either listed as exploratory subgroup analysis (15 trials, 83.3%) or not mentioned in the “Methods” section (2 trials, 11.1%). Efficacy results in biomarker-negative populations were not reported at all in 2 (11.1%) trials. Treatment effects and primary outcome values by study arm in biomarker-negative populations were reported in 16 and 11 trials, respectively. For the 15 trials (12-26) that reported efficacy results in biomarker-negative populations and used a time-to-event primary outcome, the point estimate of the hazard ratio for the primary outcome in the biomarker-negative population was at least 0.8 in 8 trials, and the upper limit of its 95% confidence interval was at least 1 in 8 trials (Table 4). All 8 trials recommended the investigational agent for the overall population based on a statistically significant treatment effect in the overall population. Measured by the hazard ratio, the treatment effect was 31% greater in biomarker-positive patients than in biomarker-negative patients (RHR = 0.69, 95% CI = 0.62 to 0.78, P < .001; Figure 2).
Table 4.
Characteristics of trials reporting statistically significant treatment effect in overall population (N = 18)
Characteristics | No. of trials (%) |
---|---|
Design type | |
Biomarker subgroup–specific design | 1 (5.6) |
Biomarker-positive and overall design | 17 (94.4) |
Sample size calculated for biomarker-negative population | |
Yes | 1 (5.6) |
No | 17 (94.4) |
Stratified by biomarker status | |
Yes | 11 (61.1) |
No | 7 (38.9) |
Status of biomarker-negative population in objectives hierarchy | |
Co-primary objective | 1 (5.6) |
Secondary objective | 0 |
Exploratory subgroup | 15 (83.3) |
Not mentioned in methods | 2 (11.1) |
Reporting of efficacy results in biomarker-negative population | |
Yes | 16 (88.9) |
No | 2 (11.1) |
Reporting of treatment effect size in biomarker-negative populationa | |
Yes | 16 (88.9) |
No | 2 (11.1) |
Reporting of outcome measure value by study arm in biomarker-negative populationb | |
Yes | 11 (61.1) |
No | 7 (38.9) |
Recommending investigated experimental treatment for overall population | |
Yes | 17 (94.4) |
No | 1 (5.6) |
Mode of presentation | |
Figure | 6 (33.3) |
Forest plot | 7 (38.9) |
Text only | 0 |
Appendix only | 3 (16.7) |
Not reported | 2 (11.1) |
Point estimate of HR in biomarker-negative population | |
≥0.8 | 8 (44.4) |
0.8-0.5 | 4 (22.2) |
≤0.5 | 3 (16.7) |
Not reported or NAc | 3 (16.7) |
Upper limit of the 95% confidence interval of HR in biomarker-negative population | |
≥1 | 8 (44.4) |
<1 | 7 (38.9) |
Not reported or NAc | 3 (16.7) |
Treatment effect size refers to hazard ratio for time to event outcome and odds ratio or rate difference for binary outcome. HR = hazard ratio; NA = not applicable.
Outcome measure value refers to medians or milestones for time to event outcome and proportions for binary outcome.
One trial used complete pathologic response as primary outcome.
Figure 2.
Comparison of treatment effect in biomarker-positive population with that in biomarker-negative population among trials reporting positive results in overall population. CI = confidence interval; IV = inverse variance. Error bars indicate the 95% confidence intervals.
Discussion
To our knowledge, this is the first study of the prevalence, design, and reporting of biomarker validation phase III trials. Our study showed a trend toward increasing use of biomarker validation phase III trials, with the biomarker-positive and overall design as the predominant design type. Although the subgroup-specific approach provides the best direct evidence for clinical decision making, because it gives a reliable assessment of the risk-to-benefit ratio in each of the biomarker-defined subgroups, only 11.1% of trials employed this design. The reason usually cited for not using the subgroup-specific approach is that it requires a much larger sample size than the biomarker-positive and overall approach, especially when the treatment effect is hypothesized to be modest in the biomarker-negative subgroup. But even when the sample size is acceptable for the subgroup-specific design, industry sponsors may choose the biomarker-positive and overall approach because it maximizes the potential target population, and thus the market share, of the investigated drug, even if the benefit is mainly driven by the biomarker-positive population. For example, the KEYNOTE-042 trial (22), which compared pembrolizumab with chemotherapy for patients with advanced non-small cell lung cancer (NSCLC), employed a biomarker-positive and overall design with sequential testing of OS in patients with a tumor proportion score (TPS) of at least 50%, at least 20%, and at least 1%. The superiority of pembrolizumab was demonstrated in all 3 co-primary populations, but OS in the TPS 1%-49% group treated with pembrolizumab appeared similar, not superior, to that with chemotherapy (13.4 vs 12.1 months). The result in the TPS 1%-49% group was only exploratory because it was not a primary objective and was not included in the statistical adjustment for multiplicity.
This design is problematic for 2 reasons. First, pembrolizumab had already been shown to improve OS compared with chemotherapy for NSCLC patients with programmed death ligand 1 (PD-L1) expression by TPS of at least 50% in the KEYNOTE-024 trial (27); thus, the real clinically relevant question is whether NSCLC patients with PD-L1 expression between 1% and 49% could benefit from pembrolizumab monotherapy. Second, the 453 deaths observed in the TPS 1%-49% subset would have provided more than 80% power, at the 2-sided 5% level, to detect a hazard ratio of 0.76, the threshold of clinically meaningful improvement of OS set by the American Society of Clinical Oncology Cancer Research Committee for non-squamous cell lung cancer (28).
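The power statement above can be checked approximately with Schoenfeld's formula for the log-rank test. The sketch below is an illustration rather than the calculation used by the authors; it assumes proportional hazards and 1:1 allocation between arms, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def schoenfeld_power(events, hazard_ratio, alpha=0.05, allocation=0.5):
    """Approximate power of a two-sided log-rank test (Schoenfeld).

    `allocation` is the fraction of patients in one arm (0.5 is an
    assumption of 1:1 randomization).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    signal = math.sqrt(events * allocation * (1 - allocation)) * abs(math.log(hazard_ratio))
    return NormalDist().cdf(signal - z_alpha)


def schoenfeld_events(hazard_ratio, power=0.80, alpha=0.05, allocation=0.5):
    """Approximate number of events needed to reach the requested power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    needed = (z_alpha + z_beta) ** 2 / (
        allocation * (1 - allocation) * math.log(hazard_ratio) ** 2
    )
    return math.ceil(needed)


# Inputs taken from the text: 453 deaths, target hazard ratio 0.76.
power_tps_1_49 = schoenfeld_power(events=453, hazard_ratio=0.76)
events_for_80_percent = schoenfeld_events(hazard_ratio=0.76)
```

Under these assumptions, fewer than 453 events are needed for 80% power at a hazard ratio of 0.76, which is consistent with the argument that the TPS 1%-49% subset could have supported a confirmatory analysis.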
We found that 17 trials recommended the investigated experimental treatment for the overall population based on positive results in the overall population, with 15 of these trials reporting results in the biomarker-negative population. For 8 trials, the treatment effect in the biomarker-negative population was either modest or had an upper limit of the 95% confidence interval of 1 or greater. Statistically significant treatment effect heterogeneity according to biomarker status existed, with 31% greater clinical benefit in biomarker-positive patients than in biomarker-negative patients. Recommendation of the therapy for the overall population is appropriate as long as both subgroups benefit; however, when the treatment effect in the biomarker-negative subgroup is modest, the results for the biomarker-positive and biomarker-negative subgroups should be presented and considered separately in the risk–benefit assessment during treatment decision making. The strength of recommendation may vary across biomarker-defined subgroups according to the magnitude of clinical benefit. Simplifying the results of a biomarker-positive and overall design by relying solely on whether the primary outcome is positive in the overall population should be avoided (29).
For trials in which biomarker-negative patients gained considerable benefit, as seen in the PRIMA trial (16) with hazard ratios of 0.43 and 0.68 in the biomarker-positive and biomarker-negative populations, respectively, it would be appropriate for the authors to recommend the investigated agents to the overall population. If only a modest treatment effect was observed for the biomarker-negative subgroup, as seen in the KEYNOTE-042 trial (22), the authors should be cautious in recommending the investigated agents to the overall population and should remind readers that different risk–benefit profiles may exist for biomarker-defined subgroups.
Nearly one-half (46.7%) of included trials were originally designed as conventional phase III trials but were subsequently changed to predictive biomarker validation trials during trial conduct, which involved a change in primary objectives. This proportion is higher than that in a previous study, which showed that 12.2% of oncology trials published in leading journals in 2012 had a discrepancy between their planned and published primary endpoints (30). Furthermore, 13.3% of trials did not report this change in the “Methods” section, contrary to the CONSORT statement (31), which requires that important changes to methods after trial commencement be reported.
If the required sample size is feasible, a biomarker-stratified design powered in each relevant biomarker subgroup would be the best way to validate a biomarker. When the prevalence of biomarker-negative patients is low and a biomarker subgroup–specific design is therefore not feasible, the biomarker-positive and overall approach may be justified. However, the design should specify the biomarker-negative population as a secondary objective, and results should be adequately reported. Our study found that results in the biomarker-negative populations were underreported, with 26.7% of trials not reporting results in the biomarker-negative population and another 15.6% providing results only in the online Appendix. Reporting of trials with the biomarker-positive and overall design needs to be further improved and standardized to ensure that results in the biomarker-negative population are appropriately assessed, reviewed, and published. A CONSORT extension for biomarker validation trials could be developed with input from multidisciplinary key stakeholders in clinical trials research to standardize reporting, with a focus on adequate reporting of results in the biomarker-negative population. For drugs approved based on trials with the biomarker-positive and overall design, regulatory agencies should require the manufacturer to provide the results in the biomarker-negative population in the drug label.
Our study has several limitations. First, the analyzed cohorts have been obtained from published reports only. Trials reported in major international conferences but not published in journals were not included because it is not possible to fully assess the design and reporting using only abstracts. Second, only trials published in 7 selected journals were included in our analysis, which may affect the generalizability of our conclusions. However, most practice-changing and high-quality phase III RCTs are published in these journals.
In summary, our study showed that biomarker-positive and overall design is the predominant design type of biomarker validation phase III trials. Major problems include a change in trial design after trial commencement without disclosure, underreporting of results in biomarker-negative groups, and recommending treatment in biomarker-negative groups despite modest effects. Relevant guidelines should be developed to standardize the design and reporting of biomarker validation phase III trials.
Supplementary Material
Contributor Information
Fei Liang, Department of Biostatistics, Zhongshan Hospital, Fudan University, Shanghai, China; Clinical Research Unit, Institute of Clinical Science, Zhongshan Hospital, Fudan University, Shanghai, China.
Ling Peng, Department of Respiratory Disease, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China.
Zhengyu Wu, Department of Biostatistics, School of Public Health, Key Laboratory of Public Health Safety and Collaborative Innovation Center of Social Risks Governance in Health, Fudan University, Shanghai, China.
Georgios Giamas, Department of Biochemistry and Biomedicine, School of Life Sciences, University of Sussex, Brighton, UK.
Justin Stebbing, Division of Cancer, Department of Surgery and Cancer, Imperial College London, London, UK; Department of Biomedical Sciences, Anglia Ruskin University, Cambridge, UK.
Funding
None.
Notes
Role of the funder: Not applicable.
Disclosures: JS is the Editor-in-Chief of Oncogene and has sat on SABs for Vaccitech, Heat Biologics, Eli Lilly, Alveo Technologies, Pear Bio, Agenus, Equilibre Biopharmaceuticals, Graviton Bioscience Corporation, Celltrion, Volvox, Certis, Greenmantle, vTv Therapeutics, APIM Therapeutics, Onconox, IO Labs, Bryologyx and Benevolent AI. JS has consulted with Lansdowne Partners and Vitruvian. JS chairs the Board of Directors for Xerion and previously BB Biotech Healthcare Trust PLC. GG is Editor-in-Chief of Cancer Gene Therapy and the Founder and Chief Scientific Advisor of Stingray Bio. None are relevant here. FL, LP, and ZW have no disclosures.
Author contributions: FL, LP, and ZW: Conceptualization, data curation, formal analysis, investigation, methodology, software, writing—original draft, writing—review and editing. GG and JS: Conceptualization, supervision, writing—review and editing.
Data availability
Data extracted from published manuscripts are available from the corresponding author at liangfei0726@163.com.