Abstract
Designing and interpreting single-arm phase II trials of combinations of agents is challenging because it can be difficult, based on historical data, to identify levels of activity for which the combination would be worth pursuing. We identified Cancer Therapy Evaluation Program single-arm combination trials that were activated in 2008–2017 and tabulated their design characteristics and results. Positive trials were evaluated as to whether they provided credible evidence that the combination was better than its constituents. A total of 125 trials were identified, and 120 trials had results available. Twelve had designs where eligible patients were required to be resistant or refractory to all but one element of the combination. Only 17.8% of the 45 positive trials were deemed to provide credible evidence that the combination was better than its constituents. Of the 10 positive trials with observed rates 10 percentage points higher than their upper (alternative hypothesis) targets, only five were deemed to provide such credible evidence. Many trials were definitively negative, with observed clinical activity at or below their lower (null hypothesis) targets. Ideally, use of single-arm combination trials should be restricted to settings where each agent is known to have minimal monotherapy activity (and a randomized trial is infeasible). In these settings, an observed signal is attributable to synergy and thus could be used to decide whether the combination is worth pursuing. In other settings, credible evidence can still be obtained if the observed activity is much higher than expected, but experience suggests that this is a rare occurrence.
Phase II trials are designed to provide evidence that an experimental regimen has sufficient activity to proceed with further definitive phase III testing. Randomized designs are the “gold standard” for gathering such evidence, because they make it possible to isolate the benefit of an experimental intervention and can use a wide variety of endpoints (1–4). However, such trials are larger than single-arm trials and may not always be feasible. In particular, it may be difficult to perform a randomized trial when the target population is small. Thus, many phase II trials employ single-arm designs.
Historically, single-arm phase II monotherapy trials have targeted tumor response rates (RRs). In this situation, a single-arm design is reasonable, because any responses can be attributed to the clinical activity of the agent. However, the use of single-arm designs to evaluate a combination of agents can be problematic, because the individual contributions of the agents cannot be easily isolated; it is often difficult to decide what the null hypothesis (lower) target RR should be and how to account for trial-to-trial variability in defining this null RR (5). Uncertainty regarding the null RR may increase the risk of false positives (if the null is underestimated) and false negatives (if the null is overestimated). This difficulty becomes more acute when the outcome of the trial is based on a time-to-event endpoint, like progression-free survival (PFS) or overall survival (OS), where variability in the natural history of the disease makes interpretation of nonrandomized trials particularly challenging (1,6). For these reasons, when evaluating a combination of agents, a randomized design should be used whenever possible.
If a randomized trial is not possible, there are a number of factors that should be considered when selecting a single-arm design to evaluate a combination therapy (Table 1). Single-arm designs are least likely to give ambiguous results in settings where essentially no responses are expected with the current treatment options. For example, to assess the value of adding a new drug to the current standard agent, a single-arm design could be appropriate if eligibility is restricted to patients who are progressing on the standard agent, because any observed activity could be attributed to the addition of the second agent. It may also be reasonable to expand eligibility to patients who previously received the standard agent and did not have a response. In contrast, inclusion of patients who previously had a response to the standard agent and then progressed can be problematic, because these patients may have a nonnegligible chance of responding to the standard agent if given again. It is worth noting that unless there are reliable data indicating that this second agent has no activity by itself, a single-arm design does not inform whether the combination is better than the second agent given alone.
Table 1.
Characteristics influencing the ability of a combination phase II trial design to produce credible evidence for proceeding with further development of the combination
| Characteristic | More likely to be credible | Less likely to be credible |
|---|---|---|
| Patient eligibility | Progressing on one of single agents of the combination (retreatment designs) | Unexposed to the single agents |
| Trial endpoint | Response rate | Overall survival; progression-free survival |
| Historical comparison population | | |
| Institutions | Same as present trial | Different from present trial |
| Population clinical or molecular characteristics | Same as present trial | Different from present trial |
| Time frame when treated | Recent | Not recent |
| Number of trials | Multiple trials | Single trial |
| Agent(s) | Agents in the combination | Other agents |
| Activity of single agents | Inactive | Some have activity |
| Sample size | Larger | Smaller |
In many settings, it may be unreasonable to restrict eligibility to patients who are currently progressing on the standard agent (or previously failed to respond). In such cases, the likelihood that a single-arm trial will provide credible evidence depends on the nature of the historical data (Table 1): if the institutions where the data were collected are different from the present trial or the data are not recent, then differences in patient populations (eg, due to the introduction of molecularly defined treatments), ancillary care, and staging/diagnostic methods can make historical comparisons problematic, especially for PFS and OS. Preferably, the historical data will contain single-agent efficacy information on all agents in the combination and come from multiple trials, allowing the possibility of incorporating between-trial variability to better designate targets (7). Finally, if the historical data sets are small, then target benchmarks for the current trial based on these data will be unreliable. Sometimes estimates from historical data are taken as known quantities, but not accounting for the variability of the historical data is a mistake (8).
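To illustrate why benchmarks drawn from small historical data sets are unreliable, the following minimal sketch computes an exact (Clopper-Pearson) confidence interval for a historical rate. The interval method, the 95% level, and the function names are illustrative assumptions and are not taken from any of the reviewed protocols; the two inputs are historical results quoted later in this commentary (1 of 15 responses, and 10 of 66 patients progression free).

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target: float) -> float:
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(successes: int, n: int, conf: float = 0.95):
    """Two-sided Clopper-Pearson interval (assumed method, for illustration)."""
    a = (1 - conf) / 2
    # Lower limit solves P(X >= successes | p) = a.
    lower = 0.0 if successes == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(successes - 1, n, p), a)
    # Upper limit solves P(X <= successes | p) = a,
    # equivalently P(X > successes | p) = 1 - a.
    upper = 1.0 if successes == n else _solve_increasing(
        lambda p: 1 - binom_cdf(successes, n, p), 1 - a)
    return lower, upper


# Historical results quoted elsewhere in this commentary.
for x, n in [(1, 15), (10, 66)]:
    lo, hi = exact_ci(x, n)
    print(f"{x} of {n}: {lo:.3f} to {hi:.3f}")
```

For 1 of 15, the interval runs from roughly 0.2% to 32%, so a null target taken directly from such a cohort carries considerable uncertainty.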
In this commentary, we review a cohort of single-arm phase II combination trials sponsored by the National Cancer Institute Cancer Therapy Evaluation Program (CTEP) during 2008–2017. We tabulate the design characteristics and results of these trials to see what lessons can be learned. We end with some conclusions derived from this experience.
Methods
The CTEP trials database was searched for phase II trials involving more than one agent (including radiation) that were activated on or after January 1, 2008, and permanently closed to accrual by December 31, 2017. Trials whose primary objective was not efficacy were excluded, as were comparative trials. Arms from noncomparative multi-arm trials were treated as separate trials. The designs and outcomes of the identified trials were summarized using information from the trial protocols, trial publications, and www.clinicaltrials.gov. To find trial publications, internet searches were performed using the CTEP trial ID, National Clinical Trials Network number, trial title, and a combination of the principal investigator’s surname, agents in the trial, and cancer histology. Trials were identified as having “retreatment designs” if patients received the combination therapy after having progressed on a constituent of the combination (or an agent with similar mechanism of action).
Trial results were classified as positive or negative based on the pre-specified protocol criteria. In cases where the attained sample sizes did not match those projected in the protocol, we used the following guidelines for positive/negative determination: if the sample size was more than projected, the null was tested with the same type-I error specified in the protocol. If the sample size was less than projected and the outcome was not already determined by the data (ie, the number of necessary responses was reached or would be mathematically impossible to reach if the trial fully accrued), we compared the observed result with the protocol-defined null and alternative hypotheses using one-sided tests with type-I error set at the protocol-defined type-I and type-II errors, respectively. If the observed result was statistically significantly higher than the null, the trial was classified as positive. If the observed result did not differ statistically significantly from the null but was statistically significantly lower than the alternative, the trial was classified as negative. If the observed result was not found to differ statistically significantly from either the null or alternative, the trial was classified as indeterminate.
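The rule for under-accrued trials can be sketched as follows, assuming a binary endpoint and exact one-sided binomial tests (the specific test is an assumption; it is not stated above). The function names are hypothetical, and the error rates of 0.10 in the usage lines are placeholders, not the protocol-specified values.

```python
from math import comb


def tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def tail_le(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def classify(successes: int, n: int, p_null: float, p_alt: float,
             alpha: float, beta: float) -> str:
    """Classify an under-accrued trial as positive, negative, or indeterminate."""
    # One-sided test of the null at the protocol type-I error.
    if tail_ge(successes, n, p_null) <= alpha:
        return "positive"
    # One-sided test of the alternative at the protocol type-II error.
    if tail_le(successes, n, p_alt) <= beta:
        return "negative"
    return "indeterminate"


# Table 2 examples; alpha = beta = 0.10 are assumed placeholder values.
print(classify(1, 18, 0.03, 0.15, 0.10, 0.10))
print(classify(15, 35, 0.05, 0.20, 0.10, 0.10))
```

With these assumed error rates, the sketch agrees with the Table 2 classifications of 9048(II) (1 of 18, indeterminate) and 8698(II) (15 of 35, positive).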
Combinations were designated as “protocol-defined highly effective” if the observed result was 10 percentage points higher than the target alternative, an admittedly arbitrary definition. For example, if a design targeted an RR of 20% vs 40%, then an observed RR of 50% or higher would be designated as protocol-defined highly effective. Note that in this example, the trial would typically be positive if the observed RR was 30% or higher, so that an RR of 50% is 20 percentage points higher than the cut-off for positivity. Some of these trials are discussed individually, because the hope for such “home runs” is frequently used as a justification for a single-arm design.
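The designation, and the positivity cut-off mentioned in the example, can be sketched as below. The helper names are hypothetical, and the single-stage sample size of 40 with a one-sided type-I error of 0.10 is an assumed design chosen only to illustrate the 20% vs 40% example; an exact binomial test is also an assumption.

```python
from math import comb


def is_highly_effective(observed_pct: float, alt_pct: float,
                        margin_pct: float = 10.0) -> bool:
    """True if the observed rate is at least margin_pct points above the alternative."""
    # Small tolerance guards against floating-point rounding at the boundary.
    return observed_pct >= alt_pct + margin_pct - 1e-9


def tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_responses_to_reject(n: int, p_null: float, alpha: float):
    """Smallest response count that rejects the null in a single-stage design."""
    for k in range(n + 1):
        if tail_ge(k, n, p_null) <= alpha:
            return k
    return None  # the null cannot be rejected at this sample size


print(is_highly_effective(50, 40))              # example from the text
print(is_highly_effective(100 * 15 / 35, 20))   # trial 8698(II), Table 2
print(min_responses_to_reject(40, 0.20, 0.10))  # assumed n and alpha
```

Under these assumed design values the cut-off is 12 of 40 responses (30%), in line with the example above; other sample sizes or error rates would shift it.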
An assessment of whether the positive trials provided credible evidence that the combination treatment was better than all its constituents was performed independently by two of the coauthors (B. Freidlin, E. L. Korn). The assessment was based on the trial protocol, the historical data used to define the null/alternative for the trial, and the trial results. It should be emphasized that designation of a trial as not providing credible evidence does not imply that the combination was inactive but only that there was not credible evidence that the combination was better than all of its constituents. It should also be emphasized that this type of assessment is, by definition, subjective. In case of disagreement between the two reviewers, a consensus was reached after discussion.
The non-retreatment trials were categorized into three groups based on the type of endpoint (some type of RR, OS based, or non-OS time-to-event based). Additional information about the trials (including references) is included in the Supplementary Materials (available online).
Results
Initially, 110 eligible protocols representing 147 trials were identified (Figure 1). Twenty-two trials were excluded because they enrolled less than 50% of their planned accrual (or <50% of the first-stage planned accrual for two-stage designs), including 13 trials with zero accrual, leaving 125 trials analyzed in this review. Of these, 120 had outcomes available, with 45 (37.5%), 67 (55.8%), and 8 (6.7%) being positive, negative, and indeterminate, respectively. Ten (22.2%) of these positive trials were protocol-defined highly effective. Eight (17.8%) of these positive trials were classified as providing credible evidence that the combination was better than its constituents, of which five were also protocol-defined highly effective. The independent assessment of the credibility of trial results for the 45 positive trials resulted in 86.7% agreement between the two assessors. Twelve trials used retreatment designs.
Figure 1.
Flow diagram of trial selection. *Trials classified as “other” include the following: two trials in which the analysis subset was defined by a post-registration event, one biomarker study, one reduced therapy trial, one pilot study, and one trial in which the primary aim was to construct a historical database. CTEP = National Cancer Institute Cancer Therapy Evaluation Program.
Trials With Retreatment Designs
For the 12 trials identified, six had an RR endpoint, and six used an endpoint based on time to progression (Table 2). Trials varied in their entry criteria vis-à-vis the timing of progression of a patient’s disease on their previous treatment with one of the constituents of the combination. For example, trial 9303 (9) required patients to be progressing on erlotinib for entry, whereas trial 8698(II) (10) allowed patients who previously responded to erlotinib but subsequently progressed (and may have received other treatments since progression on erlotinib). There was one positive trial [8698(II)] (10), and it had a protocol-defined highly effective combination but did not provide credible evidence.
Table 2.
Designs and outcomes of single-arm combination phase II trials activated by CTEP for which patients are known to be resistant or refractory to one of the elements of the combination therapy (retreatment designs)
| CTEP ID (NCTN No.) | Histology | Agents | Endpoint | Targets | Maximum sample size* | Outcome of trial |
|---|---|---|---|---|---|---|
| 8698 (I) (NCT01294306) | NSCLC | Erlotinib†, MK-2206 | RR | 5% vs 20% | 41 | 4 of 45 = 9% (negative) |
| 8698 (II) (NCT01294306) | NSCLC | Erlotinib†, MK-2206 | Response or SD at 12 wk | 5% vs 20% | 41 | 15 of 35 = 43% (positive) |
| 9048 (II) (NCT01664182) | Renal cell | Anti-VEGF†, Trebananib | RR | 3% vs 15% | 39 | 1 of 18 = 5.6% (indeterminate) |
| 9303 (NCT01866410) | NSCLC | Erlotinib†, Cabozantinib | RR | 5% vs 20% | 37 | 3 of 37 = 8% (negative) |
| ABTC-1402 (NCT02395692) | Glioblastoma | Temozolomide†, TRC102 | RR | 13% vs 30% | 31 | 0 of 19 = 0.0% (negative) |
| GOG-0126T (NCT00993616) | Ovarian | Carboplatin†, Belinostat | RR | 10% vs 25% | 51 | 2 of 27 = 7.4% (negative) |
| N054C (NCT00826540) | Colorectal | Bevacizumab†, Sorafenib | 3-mo PFS | 50% vs 65% | 72 | 42 of 79 = 53.2% (negative) |
| N093B (NCT01105312) | Breast | Letrozole†, Panobinostat | RR | 5% vs 20% | 27 | 0 of 13 = 0% (negative) |
| RTOG 0929 (I) (NCT01026493) | Glioblastoma | Temozolomide†, Veliparib (5-d schedule) | 6-mo PFS | 15% vs 30% | 53 | 9 of 53 = 17.0% (negative) |
| RTOG 0929 (II) (NCT01026493) | Glioblastoma | Temozolomide†, Veliparib (21-d schedule) | 6-mo PFS | 15% vs 30% | 53 | 9 of 53 = 17.0% (negative) |
| RTOG 0929 (III) (NCT01026493) | Glioblastoma | Temozolomide†, Veliparib (5-d schedule) | 6-mo PFS | 2% vs 15% | 26 | 1 of 19 = 5.3% (indeterminate) |
| RTOG 0929 (IV) (NCT01026493) | Glioblastoma | Temozolomide†, Veliparib (21-d schedule) | 6-mo PFS | 2% vs 15% | 26 | 1 of 26 = 4.4% (negative) |
*For two-stage designs, this is the targeted sample size if the trial proceeds to the second stage. CTEP = National Cancer Institute Cancer Therapy Evaluation Program; NCTN = National Clinical Trials Network; NSCLC = non-small cell lung cancer; PFS = progression-free survival; RR = response rate; SD = stable disease; VEGF = vascular endothelial growth factor.
†Agent for which patients were previously treated.
Trials With Non-retreatment Designs
Sixty-six trials were designed with response-rate type endpoints (Figure 2; Supplementary Table 1, available online); the endpoints were RR, complete response or remission, complete remission possibly with incomplete blood count recovery, complete response or remission possibly with incomplete platelet recovery, second complete remission, and pathologic complete response in 39, 9, 9, 1, 1, and 7 trials, respectively. The projected maximum sample size of the trials ranged from 17 to 80, with a median of 37. Fifty-nine of the trials had determinable trial positivity/negativity, of which 23 (39.0%) were positive, including five with protocol-defined highly effective combinations [8309(III) (11), 8327 (12), N1087 (13), S0910 (14), S0919(I) (15)]. Of the 23 positive trials, 21.7% were deemed to provide credible evidence that the combination was better than its constituents, including two with protocol-defined highly effective combinations [8327 (12), S0910 (14)].
Figure 2.
Trial designs and outcomes for historically controlled trials with response-rate (RR) type endpoints; protocols are listed by their National Cancer Institute Cancer Therapy Evaluation Program (CTEP) protocol ID. Vertical bars are the lower and upper targeted response rates. As noted in the figure key, for each trial, the plotted symbol denotes the way in which the trial results were classified. The right vertical axis displays the endpoints used for the trials. pCR = pathologic complete response; RR = response rate; CR = complete response; CR/CRi = complete response possibly with incomplete blood count recovery; CR/CRp = complete response possibly with incomplete platelet recovery; CR2 = second complete remission.
Ten trials were designed with OS-based endpoints (Supplementary Table 2, Supplementary Figure 1, available online). The projected maximum sample size of the trials ranged from 39 to 98, with a median of 69. The lower targets for the OS rates ranged from 30% to 58%, with a median of 45%. All of the trials had determinable positivity or negativity: four were positive, one of which contained a protocol-defined highly effective combination [CALGB 11001 (16)]. One of the four positive trials was deemed to provide credible evidence that the combination was better than its constituents.
Thirty-one trials were designed with non-OS time-to-event endpoints; the endpoints were based on PFS, recurrence-free survival, event-free survival, distant metastases-free survival, and freedom from cystectomy (Supplementary Table 3, Supplementary Figure 2, available online). The projected maximum sample size of the trials ranged from 32 to 110, with a median of 54. For the 27 trials with determinable positivity or negativity, 14 were positive, one of which was deemed to provide credible evidence that the combination was better than its constituents (S0805) (17).
Six trials were identified with co-primary response and time-to-event-based endpoints (Supplementary Table 4, available online). All of these trials had results available; three were positive, of which two had protocol-defined highly effective combinations [8233(IV) (18), GOG-0229G (19)]. One of these positive trials [8233(IV) (18)] was deemed to provide credible evidence that the combination was better than its constituents.
Discussion
We now discuss some trials in more detail to highlight key design issues with single-arm combination trials.
Trials With Protocol-Defined Highly Effective Combinations
Ten trials were identified: 8698(II) (10) in Table 2; 8309(III) (11), 8327 (12), N1087 (13), S0910 (14), and S0919(I) (15) in Figure 2; CALGB 11001 (16) in Supplementary Figure 1 (available online); S0805 (17) in Supplementary Figure 2 (available online); 8233(IV) (18) and GOG-0229G (19) in Supplementary Table 4 (available online). Five of these trials were deemed to provide credible evidence that the combination was better than its constituents.
Trial 8698(II) had a 43% observed disease control rate at 12 weeks for erlotinib plus MK-2206 for patients with non-small cell lung cancer; the targets were 5% vs 20%. However, it is possible that a nontrivial proportion of patients would have had stable disease at 12 weeks even without an active therapy, particularly those without aggressive disease. Only one patient had an (unconfirmed) partial response on this trial. Therefore, this trial was not deemed to provide credible evidence that erlotinib plus MK-2206 was meaningfully better than erlotinib alone.
Trial 8233(IV) used a combination of temsirolimus and bevacizumab in patients with pancreatic neuroendocrine tumors who had progressed within 7 months of trial entry; the design targeted an improvement in RR from 5% to 20% or an improvement in 6-month PFS from 60% to 80% (18). The lower targets were based on a phase II trial of single-agent temsirolimus where 15 patients with pancreatic neuroendocrine tumors had an RR of 7% (1 of 15) and a 6-month PFS estimated at 60%. The trial was positive for both endpoints (although only one was needed for the trial to be positive), with a 79% 6-month PFS rate and a 41% RR. Because of the high RR, this trial was deemed to provide credible evidence that the combination was worth pursuing.
Other Trials With Positive Outcomes
There were 35 positive trials that were not protocol-defined highly effective, three of which were deemed to provide credible evidence that the combination was better than its constituents. The main barrier to credibility was the relevance of the null chosen for these trials.
N064A targeted 1-year OS of 40% vs 60% for the combination of fluorouracil, radiation, gemcitabine, and panitumumab for patients with unresectable metastatic pancreatic cancer (20). The null was based on data from three small older trials (21–23), suggesting the observed 50.2% 1-year OS (a positive result in this 51-patient trial) does not provide credible evidence that panitumumab improves the efficacy of modern chemoradiotherapy.
Trials 8121 (I, II, III) targeted 3-month PFS of 20% vs 40% for the combination of cixutumumab and temsirolimus in patients with refractory insulin-like growth factor-1R-positive soft-tissue sarcoma (I), IGF-1R-positive bone sarcoma (II), and IGF-1R-negative soft-tissue or bone sarcoma (III) (24). The historical rates came from a pooled analysis of 12 European Organisation for Research and Treatment of Cancer trials of pretreated soft-tissue sarcoma patients treated with two active agents (n = 146, 3-month PFS = 39%) or nine inactive agents (n = 234, 3-month PFS = 21%) (25). The 8121 trials were positive with 3-month PFS rates of 31% (I), 35% (II), and 39% (III). However, given the possible differences between the 8121 population and the European Organisation for Research and Treatment of Cancer historical control population, the evidence from these trials that this combination is worth pursuing is weak; a later trial demonstrated limited activity of the combination (26).
Trials With Negative Outcomes
Negative outcomes were observed for a total of 67 trials, of which 21 had null targets of 10% or less. We consider these 21 trials as providing credible evidence that the combinations were not worth pursuing. Single-arm designs with a very low null can only be used in settings where each agent is known to have minimal monotherapy activity. An example is trial 8603, which targeted 5% vs 25% RR for the combination of imatinib and the mTOR inhibitor everolimus in patients with platelet-derived growth factor receptor alpha-positive synovial sarcoma (27). The 5% null RR was appropriate because both imatinib and everolimus would be expected to yield few or no responses in this setting when given alone; imatinib had 0 of 20 responses in synovial sarcoma (28), and the mTOR inhibitor ridaforolimus had 4 of 212 responses among patients with advanced sarcomas (29). Trial 8603 stopped after its first stage with zero of nine responses, credibly demonstrating that the combination was not worth pursuing.
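As a rough check on the 8603 result, assume (as a simplification) that patients respond independently with a common response probability $p$. The probability of zero responses among nine patients is then $(1-p)^9$:

$$
\Pr(0 \text{ of } 9 \mid p = 0.25) = 0.75^{9} \approx 0.075, \qquad \Pr(0 \text{ of } 9 \mid p = 0.05) = 0.95^{9} \approx 0.63 .
$$

Zero responses would thus be unlikely if the true RR were at the 25% alternative, but unremarkable at the 5% null.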
The remaining 46 negative trials had null targets that were higher than 10%. For these trials, the credibility depends on how negative the results were and how the null was determined. The nine trials where the observed outcome was more than 10 percentage points lower than the null are more convincingly negative than the 20 trials where the observed outcome was between the null and alternative. For example, N0779 targeted 6-month PFS of 15% vs 30% for the combination of vorinostat and bortezomib in recurrent glioblastoma (30). The null was based on 15.2% (10 of 66) patients being progression free at 6 months in a previous North Central Cancer Treatment Group phase II trial of vorinostat alone (31). Trial N0779 stopped at its first stage, with 0 of 37 patients progression free at 6 months (30). Thus, even though the trial design was problematic because it relied on a single small trial for the single-agent activity of vorinostat, the results provided convincing evidence that this combination was not worth pursuing. On the other hand, N054C (32) targeted 50% vs 65% 3-month PFS for the combination of bevacizumab and sorafenib for metastatic colorectal cancer patients who progressed on bevacizumab. The observed 53.2% 3-month PFS makes it difficult to be confident that this combination was not worth pursuing.
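To see why the N0779 result is convincing despite the uncertain null, assume independent patients with a common probability $p$ of being progression free at 6 months (a simplifying assumption). If the combination were only as active as the 15% null, then

$$
\Pr(0 \text{ of } 37 \mid p = 0.15) = 0.85^{37} \approx 0.0024 ,
$$

so the observed outcome would be improbable even at the null rate, and more so at any higher rate.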
Further Development of Credibly Positive Trial Combinations
We consider the further clinical development of the eight combinations identified from credibly positive trials [8233(IV), 8327, AHOD1221, CALGB 11001, GOG-0076GG, GOG-186F, S0805, and S0910].
One of these trials (8327) led directly to an ongoing phase III trial of the combination (NRG-GY006, NCT02466971). For S0910, there are ongoing trials, including two randomized phase III trials (NCT03150693 and NCT03959085) of a different anti-CD22 antibody (inotuzumab) than epratuzumab, which was tested in S0910. For 8233(IV), which evaluated the combination of bevacizumab and temsirolimus, the patent on temsirolimus was expiring, and a similar, but oral agent, everolimus, became available. In addition, although an improvement in RR was observed in the randomized phase II trial of everolimus with and without bevacizumab, there was only a modest, statistically nonsignificant improvement in PFS (33).
Trial S0805, which considered the addition of dasatinib (a second-generation tyrosine kinase inhibitor) to hyper-CVAD (cyclophosphamide, vincristine sulfate, doxorubicin hydrochloride, and dexamethasone) in patients with Philadelphia-positive acute lymphoblastic leukemia who received allogeneic stem cell transplants, provided strong evidence that dasatinib is promising in this setting. Dasatinib is currently US Food and Drug Administration-approved for patients in this population with resistance or intolerance to previous therapy (34). Because of this approval and the promising results of this study (and others), dasatinib is currently used in this setting. Although CALGB 11001, which considered the combination of 7 + 3 chemotherapy and sorafenib in FLT3-mutated acute myeloid leukemia, was classified as credibly positive, an updated review of the clinical setting suggests that this is not the case. In particular, results of two concurrent studies (35,36) suggest that survival of FLT3-mutated acute myeloid leukemia patients on the 7 + 3 backbone was underestimated at the time CALGB 11001 was designed. In addition, remaining interest in sorafenib in this setting seems to have been dramatically reduced as a result of a randomized study evaluating the addition of sorafenib to 7 + 3 in a slightly different population (36), which was negative and showed poor tolerance for the study combination.
Development of the remaining three combinations appears to have ended (at least in part) due to issues relating to toxicity. The combinations considered in Gynecologic Oncology Group (GOG) trials GOG-0076GG and GOG-186F were viewed as too toxic for further development, and in addition, single-agent docetaxel (part of the combination studied in GOG-186F) is not considered a standard therapy for ovarian cancer, so the path of development for this combination was unclear. Though not necessarily too toxic, the combination considered in AHOD1221 has not been further developed in this setting, because the study rationale was to have a less toxic combination (without alkylating agents); successful development of less toxic checkpoint inhibitors in this setting has made the AHOD1221 combination less appealing. In addition, the experimental agent tested in AHOD1221, brentuximab, was moved to first line (in combination with different agents) as a result of the ECHELON-1 trial (37).
Foreseeably, many of the single-arm phase II combination trials included in this review led to ambiguous conclusions. However, three findings from our survey of trials were unexpected. First, there were more definitively negative trials than we would have expected, many with activity less than the null target. This may partly be due to intentionally selecting high targets to account for potential bias and variability of the historical comparisons. However, it is more likely that this reflects the difficulty in estimating an appropriate baseline rate from historical data.
Our second unexpected finding concerned the landmark placement of the time-to-event endpoints: we expected endpoints to be defined at landmark times far enough out that success rates would be expected to be very low if the combination were inactive, for example, less than 5% PFS at a 2-year landmark. Instead, trials used earlier landmarks and targeted higher rates. With higher rates, more historical data are required for the results of the trial to provide credible evidence because the variability of a very low proportion (eg, 0.05) is much smaller than the variability of a higher proportion (eg, 0.5).
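The variability comparison follows from the binomial variance of an estimated proportion $\hat{p}$ based on $n$ historical patients (a standard result; the symbols are introduced here only for illustration):

$$
\operatorname{Var}(\hat{p}) = \frac{p(1-p)}{n}, \qquad \frac{0.5 \times 0.5}{0.05 \times 0.95} = \frac{0.25}{0.0475} \approx 5.3 .
$$

Under this approximation, roughly five times as many historical patients are needed to estimate a rate near 0.5 with the same absolute precision as a rate near 0.05.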
Our final unexpected finding was how few protocol-defined highly effective combinations were seen (10), of which only one-half provided credible evidence that the combination was better than its constituents. Given that combination trials are frequently justified based on predicted synergy of the agents, this finding was disappointing.
Although the major focus of this commentary is efficacy, it should be noted that single-arm trials also provide information on the safety and tolerability of the experimental combination, which may be useful even if the corresponding efficacy data are negative or ambiguous. Such information is valuable but should not be used to justify a design that is likely to produce ambiguous efficacy results. Selecting a design that is more likely to produce credible efficacy information need not decrease the quality of the toxicity data and may even improve it. For example, in addition to providing all the toxicity information of a single-arm trial, a randomized design allows for a direct comparison of toxicity between the experimental combination and the control.
The findings of this review further emphasize the fact that randomized designs should be used whenever possible. When a randomized design is not possible, there are a number of issues that should be considered before proceeding with a single-arm design, especially when evaluating a combination of treatments. Among single-arm combination trial designs, retreatment designs (especially where the other constituents are known to have minimal or no single-agent activity) offer the strongest evidence about the activity of the combination: any activity seen can be attributed to the synergistic activity of the combination. This is especially true if the eligibility is restricted to patients who are currently progressing on one of the constituents. However, even with a retreatment design, it can be problematic to use a non-RR endpoint.
In settings where a retreatment design cannot be used, one must rely on historical control data. Interpretation of these trials is least problematic when sufficient historical data demonstrate that each of the agents in the combination has little or no activity when given alone; our survey of trials suggests that this can be a useful strategy. For new agents with unknown activity, one might consider single-arm monotherapy trials to evaluate single-agent activity before considering a single-arm combination trial. When some of the agents have more than minimal single-agent activity, single-arm designs should be avoided if possible. In such cases, higher nulls must be targeted, and trial interpretation becomes challenging unless the results are strongly negative or are so positive that the uncertainties in using a historical comparison are overcome. Designing a trial when expecting the former is inappropriate, and our survey suggests a small probability of the latter.
Notes
The authors have no conflicts of interest to disclose.
References
- 1. Rubinstein LV, Korn EL, Freidlin B, et al. Design issues of randomized phase II trials and a proposal for phase II screening trials. J Clin Oncol. 2005;23(28):7199–7206.
- 2. Ratain MJ, Sargent DJ. Optimising the design of phase II oncology trials: the importance of randomisation. Eur J Cancer. 2009;45(2):275–280.
- 3. Tang H, Foster NR, Grothey A, et al. Comparison of error rates in single-arm versus randomized phase II cancer clinical trials. J Clin Oncol. 2010;28(11):1936–1941.
- 4. Van Glabbeke M, Steward W, Armand JP. Non-randomised phase II trials of drug combinations: often meaningless, sometimes misleading. Are there alternative strategies? Eur J Cancer. 2002;38(5):635–638.
- 5. Simon R, Wittes RE, Ellenberg SS. Randomized phase II clinical trials. Cancer Treat Rep. 1985;69(12):1375–1381.
- 6. U.S. Department of Health and Human Services Food and Drug Administration, Oncology Center of Excellence, Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER). Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics: Guidance for Industry; 2018. https://www.fda.gov/downloads/Drugs/Guidances/ucm071590.pdf. Accessed April 19, 2019.
- 7. Korn EL, Liu PY, Lee SJ, et al. Meta-analysis of phase II Cooperative Group trials in metastatic stage IV melanoma to determine progression-free and overall survival benchmarks for future phase II trials. J Clin Oncol. 2008;26(4):527–534.
- 8. Korn EL, Freidlin B. Conditional power calculations for clinical trials with historical controls. Stat Med. 2006;25(17):2922–2931.
- 9. Reckamp KL, Mack PC, Ruel N, et al. Biomarker analysis of a phase II trial of cabozantinib and erlotinib in patients (pts) with EGFR-mutant NSCLC with epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor (TKI) resistance: a California Cancer Consortium Phase II Trial (NCI 9303). J Clin Oncol. 2015;33(15_suppl):15s.
- 10. Lara PN, Longmate J, Mack PC, et al. Phase II study of the AKT inhibitor MK-2206 plus erlotinib in patients with advanced non-small cell lung cancer who previously progressed on erlotinib. Clin Cancer Res. 2015;21(19):4321–4326.
- 11. Kline J, Rapaport AP, Petrich AM, et al. Phase II study of temsirolimus and lenalidomide in patients with relapsed and refractory lymphomas: final analysis of NCI 8309. Blood. 2016;128(22):4147.
- 12. Kunos CA, Radivoyevitch T, Waggoner S, et al. Radiochemotherapy plus 3-aminopyridine-2-carboxaldehyde thiosemicarbazone (3-AP, NSC #663249) in advanced-stage cervical and vaginal cancers. Gynecol Oncol. 2013;130(1):75–80.
- 13. Larsen JT, Shanafelt TD, Leis JF, et al. Akt inhibitor MK-2206 in combination with bendamustine and rituximab in relapsed or refractory chronic lymphocytic leukemia: results from the N1087 alliance study. Am J Hematol. 2017;92(8):759–763.
- 14. Advani AS, McDonough S, Coutre S, et al. SWOG S0910: a phase 2 trial of clofarabine/cytarabine/epratuzumab for relapsed/refractory acute lymphocytic leukaemia. Br J Haematol. 2014;165(4):504–509.
- 15. Advani AS, McDonough S, Copelan E, et al. SWOG0919: a phase 2 study of idarubicin and cytarabine in combination with pravastatin for relapsed acute myeloid leukaemia. Br J Haematol. 2014;167(2):233–237.
- 16. Uy GL, Mandrekar SJ, Laumann K, et al. A phase 2 study incorporating sorafenib into the chemotherapy for older adults with FLT3-mutated acute myeloid leukemia: CALGB 11001. Blood Adv. 2017;1(5):331–340.
- 17. Ravandi F, Othus M, O'Brien SM, et al. US intergroup study of chemotherapy plus dasatinib and allogenic stem cell transplant in Philadelphia chromosome positive ALL. Blood Adv. 2016;1(3):250–259.
- 18. Hobday TJ, Qin R, Reidy-Lagunes D, et al. Multicenter phase II trial of temsirolimus and bevacizumab in pancreatic neuroendocrine tumors. J Clin Oncol. 2015;33(14):1551–1556.
- 19. Alvarez EA, Brady WE, Walker JL, et al. Phase II trial of combination bevacizumab and temsirolimus in the treatment of recurrent or persistent endometrial carcinoma: a Gynecologic Oncology Group study. Gynecol Oncol. 2013;129(1):22–27.
- 20. Kim GP, Foster NR, Haddock MG, et al. North Central Cancer Treatment Group phase II study of panitumumab (Pmab), chemotherapy, and external beam radiation (Chemo-RT) in patients with locally advanced (LA) pancreatic cancer. J Clin Oncol. 2012;30(4_suppl):4s.
- 21. Moertel CG, Childs DS, Reitemeier RJ, et al. Combined 5-fluorouracil and supervoltage radiation therapy of locally unresectable gastrointestinal cancer. Lancet. 1969;2(7626):865–867.
- 22. Moertel CG, Frytak S, Hahn RG, et al. Therapy of locally unresectable pancreatic carcinoma: a randomized comparison of high dose (6000 rads) radiation alone, moderate dose radiation (4000 rads + 5-fluorouracil) and high dose radiation + 5-fluorouracil: the Gastrointestinal Tumor Study Group. Cancer. 1981;48(8):1705–1710.
- 23. Gastrointestinal Tumor Study Group. Treatment of locally unresectable carcinoma of the pancreas: comparison of combined-modality therapy (chemotherapy plus radiotherapy) to chemotherapy alone. J Natl Cancer Inst. 1988;80(10):751–755.
- 24. Schwartz GK, Tap WD, Qin LX, et al. Cixutumumab and temsirolimus for patients with bone and soft-tissue sarcoma: a multicentre, open-label, phase 2 trial. Lancet Oncol. 2013;14(4):371–382.
- 25. Van Glabbeke M, Verweij J, Judson I, et al. Progression-free rate as the principal end-point for phase II trials in soft-tissue sarcomas. Eur J Cancer. 2002;38(4):543–549.
- 26. Wagner LM, Fouladi M, Ahmed A, et al. Phase II study of cixutumumab in combination with temsirolimus in pediatric patients and young adults with recurrent or refractory sarcoma: a report from the Children's Oncology Group. Pediatr Blood Cancer. 2015;62(3):440–444.
- 27. Keohan ML, Tap WD, Dickson MA, et al. A phase Ib/II study of imatinib and everolimus in patients with PDGFRA+ synovial sarcoma. J Clin Oncol. 2013;31:15s.
- 28. Chugh R, Wathen JK, Maki RG, et al. Phase II multicenter trial of imatinib in 10 histologic subtypes of sarcoma using a Bayesian hierarchical statistical model. J Clin Oncol. 2009;27(19):3148–3153.
- 29. Chawla SP, Staddon AP, Baker LH, et al. Phase II study of the mammalian target of rapamycin inhibitor ridaforolimus in patients with advanced bone and soft tissue sarcomas. J Clin Oncol. 2012;30(1):78–84.
- 30. Friday BB, Anderson SK, Buckner J, et al. Phase II trial of vorinostat in combination with bortezomib in recurrent glioblastoma: a North Central Cancer Treatment Group study. Neuro Oncol. 2012;14(2):215–221.
- 31. Galanis E, Jaeckle KA, Maurer MJ, et al. Phase II trial of vorinostat in recurrent glioblastoma multiforme: a North Central Cancer Treatment Group study. J Clin Oncol. 2009;27(12):2052–2058.
- 32. Grothey A, Lafky JM, Morlan BW, et al. Dual VEGF inhibition with sorafenib and bevacizumab (BEV) as salvage therapy in metastatic colorectal cancer (mCRC): results of the phase II North Central Cancer Treatment Group study N054C. J Clin Oncol. 2010;28(15_suppl):15s.
- 33. Kulke M, Niedzwiecki D, Foster N, et al. Randomized phase II study of everolimus (E) versus everolimus plus bevacizumab (E+B) in patients (Pts) with locally advanced or metastatic pancreatic neuroendocrine tumors (pNET), CALGB 80701 (Alliance). J Clin Oncol. 2015;33(15_suppl):15s.
- 34. Brave M, Goodman V, Kaminskas E, et al. Sprycel for chronic myeloid leukemia and Philadelphia chromosome-positive acute lymphoblastic leukemia resistant to or intolerant of imatinib mesylate. Clin Cancer Res. 2008;14(2):352–359.
- 35. Stone RM, Mandrekar SJ, Sanford BL, et al. Midostaurin plus chemotherapy for acute myeloid leukemia with a FLT3 mutation. N Engl J Med. 2017;377(5):454–464.
- 36. Serve H, Krug U, Wagner R, et al. Sorafenib in combination with intensive chemotherapy in elderly patients with acute myeloid leukemia: results from a randomized, placebo-controlled trial. J Clin Oncol. 2013;31(25):3110–3118.
- 37. Connors JM, Jurczak W, Straus DJ, et al. Brentuximab vedotin with chemotherapy for stage III or IV Hodgkin’s lymphoma. N Engl J Med. 2018;378(4):331–344.