Abstract
Background
A minority of phase III trials in gastrointestinal oncology are positive. We assessed the association between their outcome and the level and characteristics of preexisting evidence.
Methods
EMBASE, PubMed, and proceedings from international meetings were searched for phase III gastrointestinal cancer trials (gastroesophageal, hepatocellular, biliary tract, pancreatic, small bowel, colorectal, anal, stromal, and neuroendocrine) between January 2000 and June 2020. Trials investigating anticancer drugs for advanced disease, with superiority design and standard treatments as control were eligible. The highest level of preexisting evidence was retrieved from the main study report.
Results
A total of 193 phase III trials were included, and 69 (35.8%) met their primary endpoint. Positivity rates were as follows: gastroesophageal 37%, colorectal 48%, pancreatic 17.1%, hepatocellular 20%, neuroendocrine 75%, and both biliary tract and GIST 60%. No information about preexisting evidence was found for 44 trials (22.8%). For the remaining 149, preexisting evidence consisted of phase II studies in 123 cases (82.6%) and phase I studies in 26 cases (17.4%). The probability of success was 34.1% when no preexisting evidence was found, 35.8% after phase II studies, and 38.5% after phase I studies (P = .934). No parameter from prior studies predicted the outcome of phase III trials except β < .2 (P = .048). A numerically increased success rate was observed for phase III trials preceded by positive phase II studies (41.9% vs 18.5%, P = .2).
Conclusions
There does not appear to be an association between level of prior evidence and success of phase III gastrointestinal cancer trials. These data, along with the high phase III failure rate, highlight the need to improve the drug development process in this setting.
Despite major efforts to develop effective new therapeutics, only a small proportion of initially promising investigational compounds are eventually granted regulatory approval across all medical specialties (1). The probability of success in the development of a molecule, from discovery to approval, is particularly dismal in oncology (3%-5%), the main attrition occurring during clinical testing (2). At this stage, many drugs fail to fulfill their initial promise, thereby slowing down therapeutic advances; leading to an enormous waste of time, money, and resources from funders and sponsors; and leaving investigators’ and patients’ efforts unrewarded.
According to previous studies, less than 40% of phase III cancer trials meet their primary endpoint, and only 35% of drugs tested in this setting are approved by regulatory authorities (2). This major bottleneck is largely due to the lack of efficacy of the investigational compound (3). Additionally, for those drugs eventually granted approval, the magnitude of clinical benefit is marginal, with only approximately one-third of FDA-approved agents for the treatment of advanced adult solid tumors meeting the criteria for meaningful benefit according to the European Society for Medical Oncology Magnitude of Clinical Benefit Scale (4,5). Such disappointing figures are especially concerning for gastrointestinal cancers. Although these altogether represent the most common tumor group (>5.1 million new cases per year) and the leading cause of cancer-related death (>3.6 million deaths per year) worldwide (6), they have historically lagged behind other malignancies in terms of clinical impact of novel class anticancer compounds (7). Among all the drugs approved by the FDA between 2006 and 2017 for the treatment of these tumors, only 18% showed a median survival advantage of more than 3 months, 14% were associated with an improvement in quality of life, and 21% met the European Society for Medical Oncology Magnitude of Clinical Benefit Scale criteria for meaningful benefit (8).
Clinical drug development generally relies on a standardized process whereby sequential testing phases follow one another, aiming to first assess safety, pharmacodynamics, pharmacokinetics, and optimal dosing; then activity; and finally comparative efficacy. Such a rigorous process is meant to prioritize investigation of therapeutics that are safe and most likely to produce an ultimate benefit for patients. The high failure rate of phase III cancer trials, however, hints at the possibility that the screening potential of earlier-phase trials might be either limited, with failure to filter out ineffective molecules, or undervalued, with investigators who hold exceedingly positive expectations deciding to pursue the investigation of likely ineffective molecules (3,9). Also, not all drugs under assessment are subject to the same level of scrutiny, and intermediate steps of evaluation are often skipped in an attempt to accelerate drug approval. Although such practice is largely discouraged and thought to foster failure in drug development (10), there are no data to support this contention, and limited information exists regarding the association between the path followed for drug investigation and its ultimate success.
Based on these premises, we conducted a systematic review to analyze the overall characteristics of the drug development process in gastrointestinal oncology and elucidate major determinants of success or failure of late-phase clinical trials in this setting.
Methods
Search strategy and selection criteria
The PubMed and EMBASE databases were searched for phase III clinical trials of gastrointestinal malignancies (including gastroesophageal cancers, hepatocellular carcinoma, biliary tract cancers, pancreatic cancers, small bowel cancers, colorectal cancers, anal cancers, gastrointestinal stromal tumors, and neuroendocrine neoplasms) published between January 2000 and June 2020 (search query for EMBASE available in the Supplementary material, available online). Also, proceedings from the American Society of Clinical Oncology Annual Meeting, European Society for Medical Oncology Congress, American Society of Clinical Oncology Gastrointestinal Cancers Symposium, and European Society for Medical Oncology World Congress on Gastrointestinal Cancer from the same period were searched. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines for systematic review protocols were followed, and the systematic review was prospectively registered in the International Prospective Register of Systematic Reviews database (CRD42022346169) (11).
Trials for advanced disease with a superiority design and using a standard-of-care treatment as control arm were included. Trials of neoadjuvant, adjuvant, or definitive therapies, testing an approved drug as experimental treatment, comparing different schedules of the same drug, evaluating nonpharmacologic interventions, supportive care drugs or prevention strategies, or not reporting complete results in relation to the primary endpoint were excluded.
Two investigators (G.B. and E.T.) independently searched the databases and meeting proceedings and assessed the potential relevance of the retrieved articles by reading titles and abstracts. Selected articles were then assessed for study inclusion by reading the full article. Disagreements regarding eligibility were settled by the senior investigator (F.S.) acting as referee. Only 1 publication for each eligible trial was retained. If more than 1 publication was available for the same trial, the final trial report was included, and potential missing data were retrieved from other prior publications.
Data collection and analysis
Data extracted from the eligible phase III trials and used as variables for the analysis included start/stop date and duration of accrual, year of publication or presentation, type of sponsorship, endorsement by academic groups, recruitment ambit, main geographic area, number of participating sites, tumor type, line of treatment, patient enrichment, type of experimental and standard treatments, randomization design, masking of treatment allocation, central radiologic review, primary endpoint, α error, β error, target hazard ratio (HR), level of statistical significance, multiplicity analysis, change of statistical plan in due course, preplanned interim analysis, planned sample size, enrolled sample size, premature discontinuation, overall response rate (ORR), disease control rate, progression-free survival (PFS), time to progression (TTP), time to treatment failure (TTF), overall survival (OS), hazard ratio with confidence intervals, statistical significance, study results in relation to the primary endpoint, and coprimary endpoint (if any) with relative assumptions and results.
The highest-level study preceding and supporting the conduct of each phase III trial was retrieved from the main report of that trial, and relative data (including the abovementioned variables) were extracted. If this was uninformative, an ad hoc, study-specific literature search was carried out using PubMed. For phase III trials preceded by more than 1 earlier-phase study, the study with the most advanced design was considered for data extraction and analysis according to the following hierarchical criteria: 1) randomized phase II study, 2) single-arm phase II study, and 3) phase I study. For the purpose of the analysis, phase I/II studies were considered as phase II studies.
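The hierarchical selection described above can be illustrated with a minimal Python sketch. The `PriorStudy` record, its field names, and the example labels are hypothetical and introduced here only for illustration; they are not part of the study database.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PriorStudy:
    label: str        # hypothetical identifier, for illustration only
    phase: str        # "I", "I/II", or "II"
    randomized: bool  # True for a randomized design


def evidence_rank(study: PriorStudy) -> int:
    """Return 1 for the most advanced design and 3 for the least advanced."""
    # Phase I/II studies are counted as phase II studies, as stated in the Methods.
    phase = "II" if study.phase == "I/II" else study.phase
    if phase == "II":
        return 1 if study.randomized else 2
    if phase == "I":
        return 3
    raise ValueError(f"Unexpected phase: {study.phase!r}")


def highest_level(studies: Sequence[PriorStudy]) -> Optional[PriorStudy]:
    """Pick the prior study with the most advanced design, or None if there is none."""
    if not studies:
        return None
    return min(studies, key=evidence_rank)


# Hypothetical example: three earlier-phase studies preceding one phase III trial.
candidates = [
    PriorStudy("study-A", "I", False),
    PriorStudy("study-B", "I/II", False),
    PriorStudy("study-C", "II", True),
]
print(highest_level(candidates).label)  # study-C
```

Under this sketch, a phase III trial with no identified prior study yields `None`, which corresponds to the "no preexisting evidence found" category.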
Study objectives and statistical considerations
The primary objective of the study was to assess the predictive value of the level of evidence preceding phase III trials. Secondary objectives included the identification of parameters from preexisting studies that could predict success or failure of subsequent phase III trials and the evaluation of the concordance in terms of key efficacy outcomes between preexisting studies and phase III trials.
The level of evidence preceding phase III trials by tumor type and the association between this and the success of the subsequent phase III trials were tested with the Pearson χ2 and Fisher exact tests. The change in positivity rates between the 2 study periods was assessed with the Pearson χ2 test. The Mann-Whitney U test was used to assess variations in median survival outcomes over the study period. The correlations between expected and actual hazard ratios and between phase III and prior study results were assessed through linear regression. All reported P values are 2-sided. P values less than .05 were considered statistically significant. Analyses were performed using R (version 4.2.0), SPSS (version 26), and GraphPad Prism (version 8).
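As an illustration of the Pearson χ2 comparison of positivity rates, the following is a minimal standard-library Python sketch, not the R, SPSS, or GraphPad code used for the analysis. The function name is hypothetical. The first example uses the period counts reported in the Results (18 of 61 and 51 of 132); the second uses success counts (15, 44, and 10) that are back-calculated here from the reported percentages and should be read as an assumption.

```python
import math
from typing import Sequence, Tuple


def pearson_chi2(groups: Sequence[Tuple[int, int]]) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for success rates.

    groups: one (successes, total) pair per group.
    Returns (chi2, p). Only 2 or 3 groups are supported, because the
    closed-form P value used here is valid for 1 or 2 degrees of freedom.
    """
    total_n = sum(n for _, n in groups)
    total_pos = sum(k for k, _ in groups)
    rate = total_pos / total_n  # pooled success rate under the null hypothesis
    chi2 = 0.0
    for k, n in groups:
        exp_pos = n * rate
        exp_neg = n * (1 - rate)
        chi2 += (k - exp_pos) ** 2 / exp_pos + ((n - k) - exp_neg) ** 2 / exp_neg
    df = len(groups) - 1
    if df == 1:
        p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, df = 1
    elif df == 2:
        p = math.exp(-chi2 / 2)             # chi-square survival function, df = 2
    else:
        raise ValueError("this sketch supports 2 or 3 groups only")
    return chi2, p


# Positivity by study period: 18 of 61 (2000-2010) vs 51 of 132 (2011-2020).
_, p_period = pearson_chi2([(18, 61), (51, 132)])
print(round(p_period, 2))  # approximately 0.22

# Success by prior evidence: none, phase II, phase I (counts back-calculated, assumed).
_, p_evidence = pearson_chi2([(15, 44), (44, 123), (10, 26)])
print(round(p_evidence, 3))  # approximately 0.934
```

With small expected cell counts, the Fisher exact test named above would be the more appropriate choice; it is not implemented in this sketch.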
Results
A total of 11 819 articles and abstracts were retrieved using the database search criteria. Of these, 614 reported on randomized phase III trials for gastrointestinal cancers, and 193 matched the eligibility criteria and were included in the present analysis (Figure 1). Characteristics of the eligible trials are summarized in Table 1. The majority of these were published between 2011 and 2020 (n = 132, 68.4%), with a median difference between the end of accrual and publication of 4 years, were sponsored by a pharmaceutical company (n = 124, 64.2%), recruited more than one-half of the study population from non-Asian countries (n = 117; 60.6% of all trials and 71.8% of the 163 trials reporting the recruitment area), and tested first-line treatments (n = 110, 57.0%). The primary endpoint was OS in 130 cases (67.4%); PFS, TTP, or TTF in 58 (30.0%); and ORR or disease control rate in 5 (2.6%). Regarding tumor types, 54 (28%) were conducted in gastroesophageal cancers, 50 (25.9%) in colorectal cancer, 41 (21.2%) in pancreatic cancer, 30 (15.5%) in hepatocellular carcinoma, 8 (4.1%) in neuroendocrine neoplasms, 5 (2.6%) in biliary tract cancer, and 5 (2.6%) in gastrointestinal stromal tumors.
Figure 1.
CONSORT diagram.
Table 1.
Characteristics of the eligible phase III trials
| | 2000-2010 | 2011-2020 | Total |
|---|---|---|---|
| Study characteristics | No. (%) | No. (%) | No. (%) |
| Cooperative group trial | |||
| Yes | 27 (44.3) | 26 (19.7) | 53 (27.5) |
| No | 34 (55.7) | 106 (80.3) | 140 (72.5) |
| Main recruitment area | |||
| Asia | 6 (9.8) | 40 (30.3) | 46 (23.8) |
| ROW | 43 (70.5) | 74 (56.1) | 117 (60.6) |
| Not reported | 12 (19.7) | 18 (13.6) | 30 (15.6) |
| Sponsorship | |||
| Pharma | 27 (44.3) | 97 (73.5) | 124 (64.2) |
| Academic | 34 (55.7) | 35 (26.5) | 69 (35.8) |
| No. of sites | |||
| Multicenter | 58 (95.1) | 132 (100) | 190 (98.5) |
| Single center | 2 (3.3) | 0 (0) | 2 (1.0) |
| Not reported | 1 (1.6) | 0 (0) | 1 (0.5) |
| Nationality | |||
| International | 36 (59.1) | 91 (68.9) | 127 (65.8) |
| National | 24 (39.3) | 41 (31.1) | 65 (33.7) |
| Not reported | 1 (1.6) | 0 (0) | 1 (0.5) |
| Randomization | |||
| 1:1 | 56 (91.8) | 97 (73.4) | 153 (79.3) |
| 1:1:1 | 2 (3.3) | 0 (0) | 2 (1.0) |
| 2:1 | 3 (4.9) | 34 (25.8) | 37 (19.2) |
| 3:2 | 0 (0) | 1 (0.8) | 1 (0.5) |
| Masking | |||
| Double blind | 13 (21.3) | 66 (50.0) | 79 (41) |
| Open label | 48 (78.7) | 65 (49.2) | 113 (58.5) |
| Partially blinded | 0 (0) | 1 (0.8) | 1 (0.5) |
| Tumor type | |||
| Biliary tract | 2 (3.3) | 3 (2.3) | 5 (2.6) |
| Colorectal | 20 (32.8) | 30 (22.7) | 50 (25.9) |
| Gastrointestinal stromal | 1 (1.6) | 4 (3.0) | 5 (2.6) |
| Gastroesophageal | 10 (16.4) | 44 (33.3) | 54 (28.0) |
| Hepatocellular | 8 (13.1) | 22 (16.7) | 30 (15.5) |
| Neuroendocrine | 0 (0) | 8 (6.1) | 8 (4.1) |
| Pancreatic | 20 (32.8) | 21 (15.9) | 41 (21.2) |
| Line of treatment | |||
| First | 49 (80.3) | 61 (46.2) | 110 (56.9) |
| Maintenance after first | 0 (0) | 3 (2.3) | 3 (1.6) |
| First and later | 0 (0) | 6 (4.6) | 6 (3.1) |
| Second | 5 (8.2) | 36 (27.3) | 41 (21.1) |
| Second and later | 1 (1.6) | 10 (7.6) | 11 (5.7) |
| Third | 0 (0) | 1 (0.8) | 1 (0.5) |
| Third and later | 2 (3.3) | 7 (5.3) | 9 (4.7) |
| Later lines | 1 (1.6) | 8 (6.1) | 9 (4.7) |
| Any | 3 (5.0) | 0 (0) | 3 (1.6) |
| Biomarker selection | |||
| Yes | 4 (6.6) | 27 (20.5) | 31 (16.1) |
| No | 57 (93.4) | 105 (79.5) | 162 (83.9) |
| Treatment type | |||
| Chemo | 32 (52.5) | 29 (21.9) | 61 (31.6) |
| Chemo + Immuno | 4 (6.6) | 4 (3.0) | 8 (4.1) |
| Chemo + Targeted | 14 (22.9) | 46 (34.9) | 60 (31.1) |
| Immuno | 1 (1.6) | 10 (7.6) | 11 (5.7) |
| Hormonal | 4 (6.6) | 1 (0.8) | 5 (2.6) |
| Hormonal + Targeted | 0 (0) | 3 (2.3) | 3 (1.6) |
| Targeted | 6 (9.8) | 37 (28.0) | 43 (22.3) |
| Targeted + Immuno | 0 (0) | 2 (1.5) | 2 (1.0) |
| Primary endpoint | |||
| OS | 41 (67.2) | 89 (67.4) | 130 (67.4) |
| PFS | 10 (16.4) | 41 (31.0) | 51 (26.4) |
| TTF | 0 (0) | 1 (0.8) | 1 (0.5) |
| TTP | 6 (9.8) | 0 (0) | 6 (3.1) |
| ORR | 3 (5.0) | 1 (0.8) | 4 (2.1) |
| DCR | 1 (1.6) | 0 (0) | 1 (0.5) |
| Central radiology review | |||
| Yes | 4 (6.6) | 22 (16.6) | 26 (13.5) |
| No | 57 (93.4) | 109 (82.6) | 166 (86.0) |
| Not reported | 0 (0) | 1 (0.8) | 1 (0.5) |
Chemo = chemotherapy; DCR = disease control rate; Immuno = immunotherapy; ORR = overall response rate; OS = overall survival; PFS = progression-free survival; ROW = rest of the world; TTF = time to treatment failure; TTP = time to progression.
Overall, 69 (35.8%) phase III trials met their primary endpoint, 18 of 61 (29.5%) in 2000-2010 and 51 of 132 (38.6%) in 2011-2020 (P = .22). Success rates were 75.0% for neuroendocrine neoplasms, 60.0% for biliary tract cancers and gastrointestinal stromal tumors, 48.0% for colorectal cancers, 37.0% for gastroesophageal cancers, 20.0% for hepatocellular carcinomas, and 17.1% for pancreatic cancers (Figure 2). Figure 3 shows the correlation between the target and actual hazard ratios for phase III trials that used PFS/TTP/TTF or OS as primary endpoint. Among the 69 positive phase III trials, 30 were conducted in first line only (30 of 110, 27.3%), 19 in second line only (19 of 41, 46.3%), and 9 in third or later lines (9 of 19, 47.4%). Regardless of the primary endpoint, median PFS/TTP/TTF and OS advantage for the experimental arm over the control arm was 2.05 and 1.8, 1.5 and 1.6, and 0.3 and 1.8 months, for first, second, and third or later lines, respectively. For first-line trials, these values did not change over time (P = .904 for PFS/TTP/TTF, P = .516 for OS), whereas second and later line trials were not analyzed because of the small numbers.
Figure 2.
Positivity rate of phase III trials by tumor type and study period. GIST = gastrointestinal stromal tumors; HCC = hepatocellular carcinoma.
Figure 3.

Correlation between target and actual hazard ratio (HR) of phase III trials for A) progression-free survival (PFS), time to progression (TTP) or time to treatment failure (TTF) and B) overall survival (OS).
Information regarding preexisting evidence was found for 149 (77.2%) phase III trials. Preexisting evidence consisted of phase II studies in 123 cases (82.6%, 99 [80.5%] with a single-arm and 24 [19.5%] with a randomized design) and phase I studies in 26 cases (17.4%). No association was found between characteristics of the phase III trial and type of preexisting evidence. Of the 26 phase III trials directly preceded by a phase I trial, 6 were published in 2000-2010 and 20 in 2011-2020 (13.0% and 19.4%, respectively, of the phase III trials with identified preexisting evidence published in each period; P = .28) (Supplementary Figure, available online). The proportion of phase III trials directly preceded by a phase I trial by tumor type was 50% for gastrointestinal stromal tumors, 31.4% for colorectal cancer, 25% for biliary tract cancers, 18.8% for pancreatic cancers, 13.8% for hepatocellular carcinomas, 5.4% for gastroesophageal cancers, and 0% for neuroendocrine neoplasms (P = .066). The median sample size of these phase I studies was 19 (range = 1-178). The 44 (22.8%) trials for which no information regarding preexisting evidence was available accounted for 24.6% and 22.0% of the phase III trials published in 2000-2010 and in 2011-2020, respectively (P = .69). These included 17 (31.5%) of the trials conducted in gastroesophageal cancers, 15 (30.0%) in colorectal cancers, 9 (22.0%) in pancreatic cancers, 1 (20.0%) in biliary tract cancers, 1 (20.0%) in gastrointestinal stromal tumors, 1 (3.3%) in hepatocellular carcinomas, and 0 in neuroendocrine neoplasms (P = .02) (Figure 4).
Figure 4.
Positivity rate of phase III trials by tumor types and prior level of evidence. GIST = gastrointestinal stromal tumors; HCC = hepatocellular carcinoma.
The success rate of phase III trials was 35.8% and 38.5% when directly preceded by phase II and phase I studies, respectively, and 34.1% when no prior evidence was available (P = .934). The lack of an association between the presence of a preceding phase II study and the success of phase III trials was confirmed when the analysis was restricted to trials of chemotherapy (P = .301), targeted therapy with or without chemotherapy (P = .821), immunotherapy with or without chemotherapy (P = .109), and other agents (P = 1) (Figure 5). Among the 26 phase I studies directly followed by phase III trials, the median sample size was 18 (range = 1-60) and 26 (range = 4-178) for those preceding negative and positive phase III trials, respectively. Among the 123 phase II studies that preceded phase III trials, 89 (72.4%) were assessable for outcome against a predefined formal hypothesis. Phase III trials following a positive phase II study met their primary endpoint in 26 of 62 (41.9%) cases compared with 5 of 27 (18.5%) cases for those following a negative phase II study (P = .19). The sensitivity, specificity, positive predictive value, negative predictive value, and overall accuracy of phase II studies in predicting the phase III outcome were 83.9%, 37.9%, 41.9%, 81.5%, and 53.9%, respectively. The success rate was 37.0% for phase III trials preceded by phase II studies without any formal hypothesis. Of the 24 randomized phase II studies that informed the development of subsequent phase III trials, 3 were conducted in 2000-2010 (preceding 4.9% of the phase III studies published in that period) and 21 in 2011-2020 (15.9%) (P = .031) (Supplementary Figure, available online). Their design largely mirrored that of the follow-on phase III trial, with the same experimental and control interventions in all cases and the same treatment line in 20 cases (83%). Overall, 4 of 12 (33.3%) positive randomized phase II studies preceded a successful phase III trial vs 2 of 12 (16.7%) negative randomized phase II studies (P = .346).
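The predictive metrics reported above follow from the 2 × 2 table of phase II outcome against phase III outcome. The Python sketch below, with a hypothetical function name, derives them from the counts given in the text (26 of 62 phase III successes after a positive phase II study and 5 of 27 after a negative one), treating a positive phase II study as a "positive test" for phase III success.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Diagnostic accuracy metrics, in percent, from a 2 x 2 table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),  # positive phase II among phase III successes
        "specificity": 100 * tn / (tn + fp),  # negative phase II among phase III failures
        "ppv": 100 * tp / (tp + fp),          # phase III success after a positive phase II
        "npv": 100 * tn / (tn + fn),          # phase III failure after a negative phase II
        "accuracy": 100 * (tp + tn) / total,
    }


# Counts taken from the text: 62 positive and 27 negative phase II studies.
tp = 26       # positive phase II, positive phase III
fp = 62 - 26  # positive phase II, negative phase III
fn = 5        # negative phase II, positive phase III
tn = 27 - 5   # negative phase II, negative phase III

metrics = screening_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
# sensitivity: 83.9%, specificity: 37.9%, ppv: 41.9%, npv: 81.5%, accuracy: 53.9%
```

Read this way, 1 minus the positive predictive value (about 58%) is the risk of phase III failure after a positive phase II study, and 1 minus the negative predictive value (about 18.5%) is the share of negative phase II studies that were nevertheless followed by a successful phase III trial.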
Figure 5.
Positivity rate of phase III trials by treatment type and prior level of evidence. P3 = phase III trial.
Among all parameters tested from prior studies (including both phase I and II), only a β value of less than .2 (ie, a statistical power >80%) was associated with an increased success rate of phase III trials (53.1% vs 29.7%, P = .048) (Figure 6). No parameter predicted the outcome when the analysis was restricted to phase II studies only (Figure 6). In matched phase III trials and preceding studies conducted in the same line of treatment, information about the antitumor activity of the investigational compound or regimen was available for 108 of 149 (72.5%) prior studies. The ORR reported in these strongly correlated with that subsequently observed in the follow-on phase III trials (n = 66, r = 0.899; P < .01). A strong correlation (r = 0.884; P < .01) was confirmed for the 46 cases (69.7%) in which an independent central review was performed in neither the phase III trial nor the prior study. Also, a strong correlation was observed for median PFS/TTP/TTF (n = 45, r = 0.8; P < .01) and median OS (n = 60, r = 0.793; P < .01). Among the 24 prior randomized phase II studies, hazard ratios for PFS/TTP/TTF and OS were available in 22 and 21 cases, respectively. Compared with the corresponding values from subsequent phase III trials, a moderate correlation was observed for PFS/TTP/TTF (r = 0.503, P = .02), whereas no correlation was found for OS (r = 0.097, P = .68).
Figure 6.
Association between characteristics of A) prior studies or B) prior phase II studies and success of phase III trials.
Discussion
The high failure rate of phase III trials is a major issue in oncology. It prevents the adoption of novel treatments, halts further development of similar but potentially more effective compounds, and ultimately delays therapeutic advances. Also, it results in a waste of resources that contrasts with the basic principles of sustainable cancer research. Although this is likely a multifactorial phenomenon subject to drug- and study-specific caveats, a common denominator underlying most such failures may actually exist. When we launched this systematic review, we hypothesized that a leitmotif of negative gastrointestinal cancer phase III trials could be the lack of robust preexisting evidence. In particular, we thought that drugs or combination treatments that were not thoroughly assessed in phase II studies would have lower chances of meeting their primary endpoint. Our findings, however, do not support this hypothesis.
By systematically reviewing data from the last 2 decades, we have shown that almost 1 out of 5 phase III gastrointestinal cancer trials with identified preexisting evidence was not preceded by a phase II study, but this did not affect the chance of success. Although the frequent decision to continue drug development and enter phase III in spite of negative phase II studies (which preceded 30% of the phase III trials) might have influenced our results, we also could not find any difference between the success rates of phase III trials following phase I studies and those following positive phase II studies. One could argue that our unexpected findings could be biased by the growing number of phase I studies with dose expansion or randomized designs that eventually make up for the lack of independent, follow-on phase II studies. Nevertheless, we found only 1 randomized phase I study, and with the only exception of the studies of atezolizumab and bevacizumab for hepatocellular carcinoma (n = 119) and ripretinib for gastrointestinal stromal tumors (n = 178), no major difference was observed in terms of sample size between phase I studies preceding positive vs negative phase III trials.
Overall, these data are consistent with the results of similar analyses previously conducted across solid tumors (12,13) and challenge the paradigm of the stepwise drug development process, especially questioning the need to build on the preliminary results of phase I studies before launching phase III trials. Historically, phase II studies have been conceived as an intermediate step to confirm the safety findings from phase I studies and provide activity or efficacy data that should guide the decision regarding further investigation of the experimental treatment (14). As shown in our study, however, the decision to accelerate drug development and skip such a checkpoint is sometimes made, especially for tumors like colorectal cancers, where phase I studies moved straight into phase III trials in one-third of cases. This is possibly driven by a variety of factors, including an overenthusiastic interpretation of preliminary activity data, the urgent demand to fill an unmet clinical need, and the time constraints that sponsors may have to meet the regulatory requirements or position their compounds into the market (3,9). In all these circumstances, the risk exists of pushing forward ineffective treatments due to limited information. Our results, however, suggest that such risk is acceptable overall, possibly mitigated by the cautious assessment from investigators and sponsors who are generally able to put preliminary data in context and select those treatments with the highest chance of outperforming standard-of-care therapies in randomized phase III trials. Similar conclusions can be drawn regarding the feasibility of using limited toxicity data to accurately evaluate the overall safety of new treatments because, among the phase III trials directly preceded by phase I studies, no failure was due to excessive toxicity.
It is clear that these results are largely influenced by the poor performance of phase II studies, which had limited capability to select or screen out novel compounds. In line with the data reported by Monzon et al. (13), studies (either single-arm or randomized) that met their primary hypothesis were more likely to be followed by positive phase III trials, but, possibly because of the small sample size, this association was not statistically significant. Furthermore, the overall predictive accuracy of phase II studies was poor, with a high risk (∼60%) of phase III failure in spite of initial positive results and a nonnegligible risk (∼20%) of precluding the development of successful drugs based on initial negative findings. Bearing in mind the caveat of assessing even smaller numbers, these figures did not appear different when the analysis was restricted to randomized studies, and no additional study parameter was found to be associated with the failure or success of subsequent phase III trials.
Regardless of the path of drug testing, the failure rate for phase III research in our study was disappointingly high, with approximately 2 out of 3 trials reporting negative results. As expected, the number of phase III trials increased over time, but this was not accompanied by a similar trend in terms of success rate. Trials were most successful in later treatment lines, possibly because of the increased adoption of either best supportive care or placebo only as control in this setting (68.4% for third or later lines, 31.7% for second line, and 2.7% for first line). Also, some differences were observed between tumor types. Phase III trials in neuroendocrine neoplasms, biliary tract cancers, and gastrointestinal stromal tumors were positive in most cases. This could be a random finding driven by the small numbers, or it may reflect a more conservative attitude from investigators and sponsors, who raise the threshold for moving drugs to the late phase of development in low-incidence tumors. Conversely, figures were especially dismal for pancreatic cancer. In this setting, less than 20% of trials met their primary endpoint, reflecting the marginal survival advances reported for advanced-stage pancreatic tumors over the last decades (15). Overall, our findings are consistent with those previously reported for other cancers and suggest that the historical lag of gastrointestinal cancers in terms of treatment availability and survival improvements might rather be secondary to some biological differences, which hamper the identification of valuable therapeutic targets and limit survival gains from approved therapies. Many examples exist of targeted agents that failed to provide in gastrointestinal cancers the same magnitude of clinical benefit as observed in other tumor types (16-21).
Also, with the exception of immune checkpoint inhibitors for mismatch repair deficient or microsatellite instability-high tumors, survival advances in these tumors are generally incremental rather than revolutionary (22). Not surprisingly, across all treatment lines, the median PFS/TTP/TTF and OS advantages from successfully tested drugs in phase III trials were quite marginal, ranging between 0.3 and 2 months, and 1.6 and 1.8 months, respectively. As shown in our study, however, phase III trials may still suffer from the exceedingly optimistic expectations of investigators and sponsors, with a considerable gap between expected and actual relative survival improvements. In this regard, it is interesting to note that, whereas a strong correlation was found between phase III trials and preceding studies in terms of absolute metrics of drug activity and efficacy, moderate or no correlation was observed with regard to relative survival improvements over standard therapy.
We acknowledge the limitations of our study. First, including only phase III trials and restricting the analysis to these and their most advanced preceding studies did not allow us to assess the entire drug testing process, including the comparison between matched phase I and II studies. Also, we could not put our results into the broader context of drug development for these tumors because phase I and II studies investigating therapies that never reached the latest phase of testing were excluded. Similarly, our study did not take into account publication bias against negative phase III trials. Second, identification of the preexisting evidence preceding phase III trials was not done through a systematic search but relied on the references from the phase III trials themselves and on a nonsystematic search when needed. This might have prevented some relevant studies from being included in the analysis, a hypothesis that would be supported by the unexpectedly high rate of phase III trials for which no preexisting evidence was found. Third, the relatively small sample size, especially for subgroup analyses, reduced the power to detect statistically significant associations, whereas multiple testing without any formal correction might have led to some chance findings. Finally, it should be noted that our analysis was restricted to gastrointestinal cancers, and the generalizability of our results is unknown.
To our knowledge, this is the largest systematic review thoroughly assessing late-phase experimental testing in gastrointestinal oncology. Bearing in mind the above limitations, our findings should prompt stakeholders to evaluate and ultimately implement more efficient methods of drug development, which can rapidly translate into meaningful advances for gastrointestinal cancer patients. In particular, should phase II studies keep a key role as a joining link between phase I and phase III trials, novel designs would need to be explored to increase their efficiency.
Supplementary Material
Acknowledgements
The funder did not play a role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the manuscript; and the decision to submit the manuscript for publication.
Contributor Information
Giacomo Bregni, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium; Université Libre de Bruxelles (ULB), Brussels, Belgium.
Elena Trevisi, Oncology Institute of Southern Switzerland—EOC, Bellinzona, Switzerland.
Rita Saúde Conde, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium.
Michele Vanhooren, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium.
Tugba Akin Telli, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium.
Irene Assaf, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium.
Alain Hendlisz, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium; Université Libre de Bruxelles (ULB), Brussels, Belgium.
Massimo Di Maio, Department of Oncology, University of Turin, at A.O. Ordine Mauriziano Hospital, Turin, Italy.
Francesco Sclafani, Department of Digestive Oncology, Institut Jules Bordet, The Brussels University Hospital (HUB), Brussels, Belgium; Université Libre de Bruxelles (ULB), Brussels, Belgium.
Data availability
The data that support the findings of this study are available from the corresponding author (FS) upon reasonable request.
Author contributions
Giacomo Bregni, MD (conceptualization; data curation; formal analysis; methodology; writing – original draft; writing – review and editing); Elena Trevisi, MD (data curation); Rita Saúde Conde, MD (methodology; writing – review and editing); Michele Vanhooren, MD (data curation; writing – review and editing); Tugba Akin Telli, MD (formal analysis; writing – review and editing); Irene Assaf, MD (data curation; writing – review and editing); Alain Hendlisz, MD, PhD (supervision; writing – review and editing); Massimo Di Maio, MD (data curation; formal analysis; methodology); Francesco Sclafani, MD, PhD (conceptualization; data curation; formal analysis; methodology; supervision; writing – original draft; writing – review and editing).
Funding
This work was supported by Fonds de la Recherche Scientifique—FNRS through a personal fellowship granted to G.B.
Conflicts of interest
G.B.: travel grants (Amgen). A.H.: consultancy, advisory roles, honoraria (Amgen, Bayer, Eli Lilly, Merck, Pierre Fabre, Servier, Sirtex); research funding, institutional (Amgen, AstraZeneca, Ipsen, Leo Pharma, Merck, Roche, Sanofi, Teva Pharma); travel grants (Merck, Roche, Sirtex). M.D.M.: consultancy, advisory roles, honoraria (AstraZeneca, Boehringer Ingelheim, Janssen, Merck Sharp & Dohme, Novartis, Pfizer, Roche, Takeda); research funding, institutional (Beigene, Exelixis, Merck Sharp & Dohme, Pfizer, Roche, Tesaro-GlaxoSmithKline). F.S.: consultancy, advisory roles, honoraria (AMAL Therapeutics, Bayer, BMS, Dragonfly Therapeutics, Merck, Nordic Pharma, Roche, Servier); research funding, institutional (Amgen, AstraZeneca, Bayer, BMS, Roche, Sanofi); travel grants (Amgen, Bayer, Lilly, Servier); leadership roles (Co-Chair, EORTC Task Force Colon, Rectum, Anal Canal). The other authors declare no conflicts of interest.
References
- 1. Smietana K, Siatkowski M, Møller M. Trends in clinical success rates. Nat Rev Drug Discov. 2016;15(6):379-380. doi: 10.1038/nrd.2016.85.
- 2. Wong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2019;20(2):273-286. doi: 10.1093/biostatistics/kxx069.
- 3. Harrison RK. Phase II and phase III failures: 2013–2015. Nat Rev Drug Discov. 2016;15(12):817-818. doi: 10.1038/nrd.2016.184.
- 4. Vivot A, Jacot J, Zeitoun JD, Ravaud P, Crequit P, Porcher R. Clinical benefit, price and approval characteristics of FDA-approved new drugs for treating advanced solid cancer, 2000–2015. Ann Oncol. 2017;28(5):1111-1116. doi: 10.1093/annonc/mdx053.
- 5. Tibau A, Molto C, Ocana A, et al. Magnitude of clinical benefit of cancer drugs approved by the US Food and Drug Administration. J Natl Cancer Inst. 2018;110(5):486-492. doi: 10.1093/jnci/djx232.
- 6. Ferlay J, Colombet M, Soerjomataram I, et al. Cancer statistics for the year 2020: an overview. Int J Cancer. 2021;149(4):778-789. doi: 10.1002/ijc.33588.
- 7. Thomas DW, Burns J, Audette J, et al. Clinical Development Success Rates 2006-2015. BIO Industry Analysis. bio.org/sites/default/files/legacy/bioorg/docs/Clinical%20Development%20Success%20Rates%202006-2015%20-%20BIO,%20Biomedtracker,%20Amplion%202016.pdf/. Accessed June 22, 2022.
- 8. Jiang DM, Chan KKW, Jang RW, et al. Anticancer drugs approved by the Food and Drug Administration for gastrointestinal malignancies: clinical benefit and price considerations. Cancer Med. 2019;8(4):1584-1593. doi: 10.1002/cam4.2058.
- 9. Liang F, Wu Z, Mo M, et al. Comparison of treatment effect from randomised controlled phase II trials and subsequent phase III trials using identical regimens in the same treatment setting. Eur J Cancer. 2019;121:19-28. doi: 10.1016/j.ejca.2019.08.006.
- 10. Sclafani F. MEK and PD-L1 inhibition in colorectal cancer: burning blaze turning into a flash in the pan. Lancet Oncol. 2019;20(6):752-753. doi: 10.1016/S1470-2045(19)30076-2.
- 11. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. doi: 10.1136/bmj.n71.
- 12. Zia MI, Siu LL, Pond GR, Chen EX. Comparison of outcomes of phase II studies and subsequent randomized control studies using identical chemotherapeutic regimens. J Clin Oncol. 2005;23(28):6982-6991. doi: 10.1200/JCO.2005.06.679.
- 13. Monzon JG, Hay AE, McDonald GT, et al. Correlation of single arm versus randomised phase 2 oncology trial characteristics with phase 3 outcome. Eur J Cancer. 2015;51(17):2501-2507. doi: 10.1016/j.ejca.2015.08.004.
- 14. Sharma MR, Stadler WM, Ratain MJ. Randomized phase II trials: a long-term investment with promising returns. J Natl Cancer Inst. 2011;103(14):1093-1100. doi: 10.1093/jnci/djr218.
- 15. Bengtsson A, Andersson R, Ansari D. The actual 5-year survivors of pancreatic ductal adenocarcinoma based on real-world data. Sci Rep. 2020;10(1):16425. doi: 10.1038/s41598-020-73525-y.
- 16. Eiermann W; International Herceptin Study Group. Trastuzumab combined with chemotherapy for the treatment of HER2-positive metastatic breast cancer: pivotal trial data. Ann Oncol. 2001;12(Suppl 1):S57-S62.
- 17. Bang Y-J, Van Cutsem E, Feyereislova A, et al.; ToGA Trial Investigators. Trastuzumab in combination with chemotherapy versus chemotherapy alone for treatment of HER2-positive advanced gastric or gastro-oesophageal junction cancer (ToGA): a phase 3, open-label, randomised controlled trial. Lancet. 2010;376(9742):687-697. doi: 10.1016/S0140
- 18. Chapman PB, Hauschild A, Robert C, et al.; BRIM-3 Study Group. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N Engl J Med. 2011;364(26):2507-2516. doi: 10.1056/NEJMoa1103782.
- 19. Kopetz S, Desai J, Chan E, et al. Phase II pilot study of vemurafenib in patients with metastatic BRAF-mutated colorectal cancer. J Clin Oncol. 2015;33(34):4032-4038. doi: 10.1200/JCO.2015.63.2497.
- 20. Skoulidis F, Li BT, Dy GK, et al. Sotorasib for lung cancers with KRAS p.G12C mutation. N Engl J Med. 2021;384(25):2371-2381. doi: 10.1056/NEJMoa2103695.
- 21. Fakih MG, Kopetz S, Kuboki Y, et al. Sotorasib for previously treated colorectal cancers with KRAS G12C mutation (CodeBreaK100): a prespecified analysis of a single-arm, phase 2 trial. Lancet Oncol. 2022;23(1):115-124. doi: 10.1016/S1470-2045(21)00605-7.
- 22. Sobrero A, Bruzzi P. Incremental advance or seismic shift? The need to raise the bar of efficacy for drug approval. J Clin Oncol. 2009;27(35):5868-5873. doi: 10.1200/JCO.2009.22.4162.