In the phase II HCC 15-132 trial, investigators asked whether adding concurrent or sequential pembrolizumab to standard-of-care chemoradiation for locally advanced head and neck squamous cell carcinoma should be examined in future, definitive phase III trials.1 To test this, 80 patients were randomized 1:1 between the two experimental arms. However, although randomized trials traditionally compare outcomes between arms, HCC 15-132 was designed as a randomized, non-comparative trial (RNCT), meaning that formal statistical comparisons between arms were precluded in a pre-specified fashion.2 Consequently, the primary analysis of the trial evaluated a trivariate composite endpoint separately within each arm. If both arms met the trivariate composite endpoint, which turned out to be the case, then the protocol specified that the mean estimate of the 1-year progression-free survival (PFS) probability would be numerically compared between arms to “pick the winner.”
RNCTs are a relatively novel concept in clinical trial design, though our recent meta-epidemiological analysis found that RNCTs are being increasingly used, especially in oncology.3 By precluding statistical comparisons between randomized arms, RNCTs promise reduced sample-size requirements and thus provide early-phase trials an expedited, resource-conserving means of selecting therapeutic signals for more definitive late-phase testing. Accordingly, the HCC 15-132 trial protocol noted that “an early-phase trial cannot provide the statistical power required for a formal statistical comparison between arms.”
However, RNCTs face a number of challenges. First, RNCTs are themselves a methodological paradox.2 The purpose of randomization is to facilitate comparisons between arms; thus, randomizing patients without comparing them negates the primary benefit offered by random treatment allocation.4,5 Indeed, there is a strong natural inclination among researchers to compare outcomes between randomized arms in RNCTs precisely because randomization is meant to provide statistical robustness for such comparisons. Formally precluding such comparisons therefore creates a stark dilemma between presenting interesting (but post-hoc) comparative analyses and adhering to the protocol and RNCT design. In HCC 15-132, post-hoc Cox regression comparisons are reported for multiple endpoints, and this is commonplace in RNCTs.3 Caution is always warranted with post-hoc analyses, as the lack of pre-specification can increase the risk of bias.
As emphasized, in RNCTs, the outcomes of randomized arms are separately compared to historical controls or point thresholds. In HCC 15-132, three outcomes within each arm were numerically compared to point thresholds. However, as both arms met the initial primary endpoint criteria, HCC 15-132 took the additional pre-specified step of numerically comparing a mean estimate of 1-year PFS probability between arms, which is unusual for RNCTs. This analysis is furthermore difficult to interpret for multiple other reasons. First, the milestone survival estimates of each arm are themselves challenging to estimate reliably without random sampling techniques or rigorous modeling accounting for selection biases.6 Second, this comparison between arms, even though pre-specified, contradicts the foundational methodology of an RNCT. Third, numerical milestone survival estimates disregard all other data on the survival curve, including the change in survival probability over time prior to the milestone timepoint and all subsequent survival information. As such, milestone estimates such as 1-year PFS probability do not represent the survival function and are typically presented only descriptively.7,8
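The third point can be made concrete with a small numerical sketch. The two step-function PFS curves below are hypothetical values invented for illustration (they are not data from HCC 15-132), and the restricted mean survival time is used here only as one convenient summary of the whole curve up to the milestone.

```python
# Two hypothetical PFS curves, each a sorted list of
# (time_in_months, survival_probability) steps starting at (0, 1.0).
# Illustrative values only; not data from HCC 15-132.
CURVE_EARLY_EVENTS = [(0.0, 1.0), (2.0, 0.7)]   # events cluster at month 2
CURVE_LATE_EVENTS = [(0.0, 1.0), (11.0, 0.7)]   # events cluster at month 11


def survival_at(curve, t):
    """Survival probability at time t for a right-continuous step curve."""
    prob = 1.0
    for time, surv in curve:
        if time <= t:
            prob = surv
        else:
            break
    return prob


def restricted_mean(curve, horizon):
    """Area under the step curve from 0 to horizon (restricted mean survival time)."""
    steps = [(time, surv) for time, surv in curve if time < horizon]
    area = 0.0
    for i, (time, surv) in enumerate(steps):
        next_time = steps[i + 1][0] if i + 1 < len(steps) else horizon
        area += surv * (next_time - time)
    return area


if __name__ == "__main__":
    for name, curve in [("early events", CURVE_EARLY_EVENTS),
                        ("late events", CURVE_LATE_EVENTS)]:
        print(name,
              "1-year PFS:", survival_at(curve, 12.0),
              "restricted mean (months):", round(restricted_mean(curve, 12.0), 2))
```

Both hypothetical curves have an identical 1-year PFS probability of 0.7, yet patients on the second curve spend about 11.7 of the first 12 months progression-free versus about 9.0 on the first. A numerical comparison of the milestone alone cannot distinguish between them.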
HCC 15-132 represents a notable effort to address an important question, but its findings remain difficult to interpret due to the RNCT design and post-hoc contradictions to RNCT design principles.3 The foundational statistical issues inherent to RNCTs merit greater attention by oncological researchers and methodologists, and the increasing use of RNCTs should be revisited.2 Recognizing that financial support for trials is often limited, how then should early-phase signal-finding trials be designed? If the research seeks to estimate the outcomes of a new experimental arm, a well-conducted single-arm trial is entirely appropriate. Alternatively, if the research seeks to “pick the winner”, randomized comparative trials are the gold standard. Multiple opportunities exist to reduce sample sizes and costs of early-phase randomized trials, while still preserving the comparative essence of randomization. Surrogate endpoints with high expected event rates are commonly used for this reason. Adjusting for strongly prognostic covariates in regression models adds power at no additional cost.9 For signal-finding trials, the type I error rate (α) can be relaxed. Beyond α and P values, the probability of superiority in early-phase randomized trials can be readily estimated using Bayesian designs that can reliably compare two treatment arms with sample sizes as low as 90 patients total, even when each arm is divided across three biomarker subgroups.10 Thus, ample, robust statistical mechanisms are available to facilitate signal identification in resource-constrained early-phase trials without falling into the avoidable interpretative traps of RNCTs.
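As a rough illustration of how relaxing α shrinks a comparative trial, the sketch below applies Schoenfeld's standard approximation for the number of events needed in a 1:1 randomized time-to-event comparison. The approximation is our choice for illustration and is not taken from the HCC 15-132 protocol; the hazard ratio of 0.6 and the 80% power are likewise assumed inputs.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha, power=0.80):
    """Approximate events needed for a two-sided log-rank test with 1:1
    allocation, using Schoenfeld's formula:
        d = 4 * (z_{1 - alpha/2} + z_{power})^2 / (ln HR)^2
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha / 2.0)
    z_power = z(power)
    return ceil(4.0 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2)


if __name__ == "__main__":
    # Hypothetical target effect (HR = 0.6); not an estimate from any trial.
    for alpha in (0.05, 0.10, 0.20):
        print(f"two-sided alpha = {alpha:.2f}: "
              f"{required_events(0.6, alpha)} events")
```

Under these assumed inputs, the required number of events falls from about 121 at a two-sided α of 0.05 to about 70 at an α of 0.20. The price is a higher false-positive rate, which may be acceptable when a positive signal will be confirmed in a definitive phase III trial.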
Acknowledgments:
We thank Laura L. Russell of the MD Anderson Research Medical Library’s Editing Services team for editing the manuscript.
Support:
This work was supported in part by the National Institutes of Health/National Cancer Institute through Cancer Center Support Grant P30CA016672 to The University of Texas MD Anderson Cancer Center. Pavlos Msaouel and Ethan Ludmir are recipients of the Andrew Sabin Family Foundation Fellowship.
Authors’ disclosures of potential conflicts of interest:
Alexander Sherry reports honoraria from Sermo. Pavlos Msaouel reports honoraria for scientific advisory board membership for Mirati Therapeutics, Bristol-Myers Squibb, and Exelixis; consulting fees from Axiom Healthcare; non-branded educational programs supported by DAVA Oncology, Exelixis, and Pfizer; leadership or fiduciary roles as a Medical Steering Committee Member for the Kidney Cancer Association and a Kidney Cancer Scientific Advisory Board Member for KCCure; and research funding from Regeneron Pharmaceuticals, Summit Therapeutics, Merck, Takeda, Bristol-Myers Squibb, Mirati Therapeutics, and Gateway for Cancer Research (all unrelated to this manuscript’s content). No other authors report any conflicts of interest.
References
- 1. Zandberg DP, Vujanovic L, Clump DA, et al: Randomized phase II study of concurrent versus sequential pembrolizumab in combination with chemoradiation in locally advanced head and neck cancer. J Clin Oncol:JCO2401580, 2025
- 2. Msaouel P: Bad stats: A regular series exploring slip-ups, snafus and salutary lessons from the world of statistics. Significance 22:40–44, 2025
- 3. Sherry AD, Msaouel P, Ludmir EB: A meta-epidemiological analysis of post-hoc comparisons and primary endpoint interpretability among randomized noncomparative trials in clinical medicine. J Clin Epidemiol 175:111540, 2024
- 4. Senn S: Controversies concerning randomization and additivity in clinical trials. Stat Med 23:3729–53, 2004
- 5. Rosenberger WF, Uschner D, Wang Y: Randomization: The forgotten component of the randomized clinical trial. Stat Med 38:1–12, 2019
- 6. Sherry AD, Passy AH, Abi Jaoude J, et al: Treatment group-specific inferences in phase III randomized oncology trials. Acta Oncol 64:470–474, 2025
- 7. Fojo T, Bates S: Skimming the median and the problem of exaggerated survival gains. JAMA Oncology, 2025
- 8. Das A, Lin TA, Lin C, et al: Assessment of median and mean survival time in cancer clinical trials. JAMA Netw Open 6:e236498, 2023
- 9. Sherry AD, Passy AH, McCaw ZR, et al: Increasing power in phase III oncology trials with multivariable regression: An empirical assessment of 535 primary end point analyses. JCO Clin Cancer Inform 8:e2400102, 2024
- 10. Lee J, Thall PF, Msaouel P: Bayesian treatment screening and selection using subgroup-specific utilities of response and toxicity. Biometrics 79:2458–2473, 2023
