Neuro-Oncology. 2016 Mar 17;18(Suppl 2):ii21–ii25. doi: 10.1093/neuonc/nov254

Creating clinical trial designs that incorporate clinical outcome assessments

Mark R Gilbert 1, Lawrence Rubinstein 1, Glenn Lesser 1
PMCID: PMC4795995  PMID: 26989129

Abstract

Clinical outcome assessments (COAs) are increasingly being used in determining the efficacy of new treatment regimens. This was typified in the recent use of a symptom-based instrument combined with an organ-based measure of response for the approval of ruxolitinib in myelofibrosis. There are challenges in incorporating these COAs into clinical trials, including designating the level of priority, incorporating these measures into a combined or composite endpoint, and dealing with issues related to compliance and to the interpretation of results in the presence of missing data. This article describes the results of a recent panel discussion that attempted to address these issues and to provide guidance on the incorporation of COAs into clinical trials, including novel statistical designs, so that the testing of new treatments in patients with cancers of the central nervous system can incorporate these important clinical endpoints.

Keywords: clinical trials, statistical design, study endpoints


Incorporation of clinical outcome assessments (COAs) into clinical trials remains the exception rather than a standard component of study design. However, the inclusion of COAs in brain tumor trials may enhance our understanding of the true benefit or harm of a tested therapy because conventional endpoints, particularly those involving imaging, may be confounded by treatments and supportive care medications, as was described in the first outcomes workshop.1 This prior workshop focused on defining imaging endpoints and concluded that the treatment under investigation and other factors such as steroid use/dose may affect the ability to interpret diagnostic imaging as a component of the study endpoint determination. Imaging changes frequently do not correspond with clinical status. For example, pseudoprogression has been operationally defined as stable clinical status despite worsening imaging findings in the treatment field that improve over time.2 This underscores the need for COA measures to be incorporated along with the imaging-based response measures (such as Response Assessment in Neuro-Oncology criteria) when evaluating efficacy.3

Examples of Incorporation of Clinical Outcome Assessments into Brain Tumor Clinical Trials

Several examples from brain tumor clinical trials, described below, illustrate that imaging and overall survival (OS) endpoints are important but not sufficient to provide a comprehensive understanding of the effects of treatment. The typical clinical trial outcome measures (objective response, progression-free survival [PFS], and OS) do not completely characterize patient symptoms or functional status following protocol therapy.

RTOG 0525 (reference 4)

A randomized trial comparing dose-dense temozolomide with standard dose during the maintenance treatment phase in newly diagnosed glioblastoma. This trial from the Radiation Therapy Oncology Group (RTOG) piloted the use of a battery of tests, including neurocognitive function, symptom burden measurement using the MD Anderson Symptom Inventory Brain Tumor Module, and health-related quality of life (QoL) using the European Organisation for Research and Treatment of Cancer 30-item Quality of Life/20-item Brain Neoplasm questionnaires (EORTC QLQ-C30/BN20), collectively called the Net Clinical Benefits battery.5 The results confirmed that the dose-dense treatment did not improve survival and was associated with greater symptom burden and worse QoL.

RTOG 0614 (reference 6)

A randomized trial of memantine vs placebo in patients with brain metastases undergoing whole-brain radiotherapy. This trial was designed with neurocognitive function as the primary endpoint. A battery of neurocognitive tests, including Trail Making A and B, the Hopkins Verbal Learning Test, and the Controlled Oral Word Association test, was administered longitudinally to patients who had been treated with whole-brain radiotherapy for brain metastases. The study demonstrated a neuroprotective benefit from the addition of memantine in this patient population without a change in survival or response rate.

RTOG 0825 (reference 7)

A randomized trial comparing bevacizumab with placebo when added to the standard chemoradiation for newly diagnosed glioblastoma. This trial included the Net Clinical Benefits battery described above, including neurocognitive function, symptom burden, and health-related QoL measured longitudinally. These tests demonstrated a decline in all 3 measures late in the treatment course in patients treated with bevacizumab who were deemed to have stable disease, compared with similar patients on the placebo arm.

AVAglio (reference 8)

A randomized trial comparing bevacizumab with placebo when added to the standard chemoradiation for newly diagnosed glioblastoma. This trial included serial measurement of health-related QoL using the EORTC QLQ-C30/BN20, and in a small subset, formal neurocognitive function was measured longitudinally. Select domains, when evaluated as time-to-event phenomena, showed prolongation of maintenance of QoL with bevacizumab compared with placebo. However, a recent in-depth analysis did not show significant differences in QoL measures from this study between the bevacizumab and placebo arms.9

The demonstrated importance of COA measures in understanding the true benefits or harms of the therapies tested in the clinical trials listed above underscores the need to consider these measures as a mandatory component of brain tumor treatment trials. Without mandatory participation, missing data and other compliance problems may hamper the incorporation of these measures into the interpretation of efficacy results. This is discussed in greater detail below.

Novel Clinical Trial Designs

It is increasingly recognized that there is disease heterogeneity even within specific brain tumor types (eg, glioblastoma, anaplastic glioma).10,11 Furthermore, the relative rarity of brain tumors and the well-established inter- and intratumoral heterogeneity mandate that optimal testing of new treatment regimens incorporate a more efficient approach compared with the traditional phase I and single-arm phase II designs.12 Additionally, the shortcomings of using historical data for phase II comparisons have led to increased use of randomized phase II studies to determine if there is an efficacy “signal.”13 However, a randomized phase II study requires more patients than a single-arm, signal-finding trial and often requires an extended time for outcome measures, such as PFS and especially OS, to mature. This apparent dilemma, the need to test new treatments in molecularly defined but smaller subgroups while still requiring accurate comparators (typically a control arm), may offer an opportunity to incorporate COAs into the trial design so that this additional patient-related information can contribute to the determination of treatment effectiveness.

The panel raised the question of whether this approach could be successfully implemented for brain tumor clinical trials. This leads to the following questions about the incorporation and integration of COAs into clinical trials:

  1. Should COAs be a primary, secondary, or exploratory endpoint?

  2. What are the challenges of incorporating COAs into clinical trials?

  3. What are the statistical considerations in planning clinical trials that incorporate COAs? How does the statistical modeling vary by the endpoint designation of the COA (primary, secondary, exploratory)?

  4. How should missing data be handled?

Clinical Outcome Assessments as Primary, Secondary, or Exploratory Endpoints

A major question when considering utilizing COAs in a clinical trial is determining what “level” of objective these measures should be given. COAs can be incorporated into clinical trials as primary efficacy endpoints or components of a primary efficacy endpoint, as secondary endpoints, or as exploratory endpoints. When COAs are incorporated as exploratory endpoints, relatively fewer clinical trial resources are employed in their measurement, and interpretation of results can be difficult for a variety of reasons: most prominently poor compliance resulting in missing data. Such missing data, especially if they were a substantial portion of the data expected to be collected, would present an analysis problem, since they could not be assumed to be missing at random. They would most likely have to be imputed based upon patient clinical covariates and nonmissing COA data. There are several examples outside of neuro-oncology where COA measures were combined with more traditional assessments as primary indicators of treatment efficacy. A major precedent was established by the recent FDA approval of ruxolitinib in myelofibrosis. In addition to an objective clinical measure (decrease in spleen size), a myelofibrosis symptom score was incorporated into 1 of the 2 pivotal trials. The results from the symptom score improvement (46% with treatment vs 5% with placebo, P < .0001) provided important evidence of the clinical relevance of the treatment and were incorporated into product labeling.

  1. Incorporating COAs into the primary outcome has important implications. With this designation, the trial design can make these COA components mandatory, and they would warrant full support and resources. However, with few exceptions (eg, RTOG 0614), COAs will likely be evaluated in conjunction with a more traditional efficacy endpoint, such as PFS, in clinical trials that are seeking labeling claims for treatment of brain cancers. Whether this design, using COA measures as a co-primary endpoint, will compel the allocation of better resources and oversight leading to higher compliance remains unknown.

  2. Conversely, incorporating COAs into a clinical trial as a secondary endpoint avoids some of the complicating issues discussed above but may diminish the allocation of resources for administering and providing oversight to improve compliance and thus diminish the impact of these measures on the evaluation of the outcomes. However, as described above, one of the studies of ruxolitinib in myelofibrosis incorporated a symptom scale as a secondary endpoint.14 Positive symptom scale results contributed to the overall assessment of the drug and provided further evidence that the clinical measure (decrease in spleen size) was associated with improved symptom burden.

  3. To date, the combined primary analysis of COAs with traditional measures has not occurred in brain tumor clinical trials but is planned in 2 NRG Oncology Cooperative Group studies: BN001, comparing proton with photon-based radiation in newly diagnosed glioblastoma, and BN002, evaluating the impact of the addition of immune checkpoint inhibitors on survival, neurocognitive function, and symptom burden. This would require either a composite measure or an algorithm that statistically evaluates the measures and creates a logic tree that contains the various permutations and their interpretation. Both the composite measure and the combination algorithm need to be developed and validated.

  4. In designing a clinical trial that incorporates COAs, should there be a direct relationship between these measures and the primary endpoint? In this context, how should the identification of the optimal endpoints for the clinical trial be determined?

Challenges in Incorporating Clinical Outcome Assessments into Clinical Trials

Patient accrual is often identified as a major concern in planning clinical trials. However, although primary brain tumors are relatively uncommon, the recent robust accrual to several large randomized trials (RTOG 0525, RTOG 0825, AVAglio, CENTRIC) suggests that novel treatments of interest will attract patient participation.4,7,8,15 Several decisions must be made at the time of protocol development: which COAs to incorporate into the clinical trial, how the COA results will be analyzed, and which strategies will be employed to monitor COA use to ensure consistent and appropriate administration (eg, how they will be presented to the patient, consistent training of test administrators across sites, how to maximize compliance, when they will be administered in relation to the therapy delivered). These decisions must involve content experts who can ensure appropriate use of validated tests as specified by the technique used for validation.

  1. There are unique aspects of patients and study characteristics that present challenges to incorporating COAs into clinical trials:
    1. Some characteristics of patients with primary brain tumor may limit or preclude participation in the clinical trial or the COA component of the trial. Examples of these limitations include language barrier, patient burden, loss of ability to self-report or participate in testing over time, and study fatigue.
    2. In order to be useful and allow comparisons across studies, COAs must be validated in the patient population under study. Their validity absolutely depends on compliance with the defined administration schedule with an effort made to minimize “missingness” of data consistent with the effort put forth to ensure collection of primary outcome data like imaging and clinical exam.
    3. As alluded to above, historically, use of COAs has been an “add-on” to clinical trials and not a mandatory trial component. This limits applicability of findings due to lack of rigor in data collection and completion rates. Efforts to include these in trials should include:
      1. Electronic data capture and real-time monitoring of compliance: Incorporation of the COAs as a mandatory component of the clinical trial should improve compliance, but because these measures are not yet perceived with the same level of importance as other clinical trial components (eg, blood counts), a real-time system to monitor compliance and provide reminders to research staff would improve participation.
      2. Funding support: Resources to include COAs into clinical trials are often limited. Decisions regarding the frequency of evaluation and the oversight provided to this component of the clinical trial may be determined by the amount of funding available rather than by the optimal design for the clinical trial. In order to establish the utility of incorporating COA measures into efficacy assessments, the clinical trial sponsor must recognize the importance of these measures.
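
As a rough illustration of the real-time compliance monitoring described in the list above, the following minimal sketch flags sites whose COA completion rate falls below a chosen threshold. The class name, field names, site labels, counts, and the 90% threshold are all hypothetical and are not taken from any of the trials discussed here.

```python
from dataclasses import dataclass


@dataclass
class SiteCompliance:
    site: str
    expected: int   # COA assessments due so far at this site (hypothetical counts)
    completed: int  # COA assessments actually received

    @property
    def rate(self) -> float:
        # A site with nothing due yet is treated as fully compliant.
        return self.completed / self.expected if self.expected else 1.0


def sites_needing_reminder(records, threshold=0.9):
    """Return the names of sites whose completion rate is below the threshold."""
    return [r.site for r in records if r.rate < threshold]


records = [
    SiteCompliance("Site A", expected=40, completed=38),
    SiteCompliance("Site B", expected=25, completed=18),
    SiteCompliance("Site C", expected=0, completed=0),
]
print(sites_needing_reminder(records))
```

In practice such a check would presumably run against the electronic data capture system on a schedule, with the threshold set by the protocol team.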

Statistical Considerations in Planning Clinical Trials with COAs

The statistical considerations of COAs vary considerably based on the endpoint designation planned (primary, secondary, exploratory). Exploratory endpoint designation is likely of limited value, as the anticipated compliance rate and allocated resources will be low and the rate of missing data correspondingly high. However, strong consideration can be given to trials that incorporate COAs as primary or secondary endpoints.

In phase II trials, COAs that relate to QoL or neurocognitive function can be made secondary endpoints. Several may be evaluated separately in an exploratory fashion. Since a substantial effect will usually be targeted, it is possible to utilize a type I error bound of .05 and still maintain 90% power for each endpoint separately. In phase III trials, the COAs that show a treatment effect in phase II may be combined in a single phase III endpoint, to be utilized as a co-primary endpoint with OS or PFS or as a key secondary efficacy endpoint. In this context, algorithms can be developed from these phase II studies that inform the design of the phase III trial that incorporates COA endpoints.
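
To make the phase II sizing statement concrete, the sketch below computes an approximate per-arm sample size for comparing a continuous COA score between two arms, using the standard normal-approximation formula with a two-sided type I error of .05 and 90% power. The standardized effect size of 0.5 is an assumed value chosen for illustration, not a figure from the panel; the optional `n_endpoints` argument shows, as a further assumption, how the number would grow if a Bonferroni correction were applied instead of testing each endpoint separately at .05.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.90, n_endpoints=1):
    """Approximate patients per arm for a two-arm comparison of means.

    effect_size is the standardized difference (difference / SD).
    n_endpoints > 1 applies a Bonferroni split of alpha (an illustrative
    option; the text above tests each endpoint separately at .05).
    """
    z = NormalDist().inv_cdf
    alpha_each = alpha / n_endpoints
    z_alpha = z(1 - alpha_each / 2)  # two-sided critical value
    z_beta = z(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


print(n_per_arm(0.5))                 # each endpoint tested separately at .05
print(n_per_arm(0.5, n_endpoints=3))  # hypothetical Bonferroni split over 3 COAs
```

A smaller targeted effect increases the required sample size quickly, since the effect size enters the formula squared.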

Incorporating COAs into the efficacy algorithms for phase III trials will require defining a promising outcome that is based on clinical considerations, exemplified by the symptom scale successfully utilized with ruxolitinib in myelofibrosis. Furthermore, metrics must be developed to determine which individual COA or combinations of COAs will provide information that is most meaningful. The overriding principles should be: (i) that a promising outcome should be defined as one that is substantially better than what would be expected from standard treatment, and (ii) that the number of COA endpoints assessed should be sufficiently limited to facilitate comprehensive data collection and use of statistical methods to control for multiplicity.

There are additional considerations for deciding how to incorporate COA endpoints into the clinical trial design. As highlighted in a recent FDA Guidance Document (http://www.fda.gov/downloads/Drugs/Guidances/UCM193282.pdf), testing multiple COA measures complicates the statistical evaluation by creating the problem of multiple comparisons. Alternatively, a tiered, sequential approach can be considered in which there is a primary (single or composite) COA endpoint that is evaluated first and has been designed to test clinical outcome. If a “positive” endpoint is reached in the clinical trial, then additional analyses can be performed on individual components of the COAs, some of which may be analyses limited to subsets of patients demonstrating particular symptoms prior to the initiation of treatment. However, if the primary endpoint is not reached with this tiered model, then additional analyses are not undertaken. This approach avoids the multiple comparisons problem but makes the choice of the primary COA endpoint critical.
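
One common way to implement a tiered approach of this kind is a fixed-sequence (gatekeeping) procedure: endpoints are tested in a prespecified order, each at the full significance level, and testing stops at the first endpoint that fails. The sketch below assumes that the follow-on component analyses are themselves ordered in advance, which goes beyond what the panel specified; the endpoint names and P values are hypothetical.

```python
def tiered_test(ordered_p_values, alpha=0.05):
    """Fixed-sequence testing.

    ordered_p_values: list of (endpoint_name, p_value) pairs in the
    prespecified order, primary endpoint first. An endpoint is declared
    positive only if it and every endpoint before it reached significance.
    """
    results = {}
    gate_open = True
    for name, p in ordered_p_values:
        passed = gate_open and p < alpha
        results[name] = passed
        gate_open = passed  # once one test fails, all later ones are not claimed
    return results


# Hypothetical P values, for illustration only.
print(tiered_test([
    ("composite COA (primary)", 0.01),
    ("neurocognitive function", 0.03),
    ("symptom burden", 0.20),
    ("QoL", 0.001),
]))
```

In this example the QoL result is not claimed despite its small P value, because the symptom burden test ahead of it failed; this is the price paid for avoiding a multiplicity adjustment.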

A composite endpoint has the benefit of providing a wider “range” of testing, but the creation of the composite is complicated and requires validation. A single endpoint is simple but limited in scope. Potentially, an “index” symptom/sign/functional measure could be chosen for individual patients from a restricted list that would be used as a measure, but additional work would be required to define response/failure parameters for each of the chosen COA metrics in the restricted list. Also, it is essential that such index symptoms be defined prospectively, for each patient, rather than retrospectively, at the end of the trial; in this way, a multiple comparison problem is avoided. Additional issues, such as how to account for patients with no symptoms/signs/dysfunction to monitor (often referred to as the “ceiling effect”), require consideration. It may be possible to stratify patients in a randomized trial by the presence or absence of a definable COA measure. This approach would require validation but has the benefit of an additional prognostic stratification, as there are reports from several large trials indicating that symptomatic patients have an overall worse prognosis. A final and crucial consideration with respect to evaluation of a COA endpoint in a phase III trial is the potential for bias if treatment is not blinded, both for the patient and for the evaluating investigator.

Combining a Promising COA Outcome with Overall Survival

A promising COA or a combination of COAs could potentially be combined with a traditional phase III outcome measure like OS, but this would require the development of a composite endpoint. It would probably be more interpretable to use a “dual endpoint” composed of the separate survival and COA assessments. The primary question is whether an overall positive treatment effect is to be defined as a promising outcome in both OS and the COA endpoint or defined as a promising outcome in either OS or the COA endpoint. It may be that this question will be answered differently for different agents, based upon expectations of effect or expectations of degree of toxicity. For example, it must be determined during the trial design phase whether extending OS without a positive result for the overall COA measure, or vice versa—achieving a positive overall COA outcome without extending OS—will be considered sufficient to call the treatment promising, or if both an increase in OS and a positive COA outcome should be required.
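
The “both” versus “either” choice has a direct statistical consequence, which the following sketch illustrates. When both endpoints must be positive, each can be tested at the full significance level; when either endpoint alone is sufficient, the type I error must be shared between them. The even Bonferroni split used here is one simple option assumed for illustration, not a recommendation of the panel, and the P values are hypothetical.

```python
def dual_endpoint_positive(p_os, p_coa, rule, alpha=0.05):
    """Decide whether a dual OS/COA endpoint is positive.

    rule="both":   OS and COA must each be significant at alpha.
    rule="either": one significant endpoint suffices, so alpha is split
                   evenly between the two tests (Bonferroni; an assumed choice).
    """
    if rule == "both":
        return p_os < alpha and p_coa < alpha
    if rule == "either":
        return p_os < alpha / 2 or p_coa < alpha / 2
    raise ValueError("rule must be 'both' or 'either'")


# Hypothetical results: modest evidence on each endpoint.
print(dual_endpoint_positive(0.04, 0.03, rule="both"))
print(dual_endpoint_positive(0.04, 0.03, rule="either"))
# Hypothetical results: strong OS evidence, no COA benefit.
print(dual_endpoint_positive(0.01, 0.20, rule="either"))
```

The first pair of calls shows that the same data can be positive under the “both” rule yet negative under the “either” rule, which is why the rule must be fixed during trial design.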

Handling Missing Data

Historically, compliance issues that result in missing data have hampered measurements of COAs in clinical trials. Although the use of novel measures, improved oversight, and designation of the COA as an integral study measure may improve this issue, prespecified methods for handling missing data are critical to the unbiased interpretation of the results. In phase II exploratory trials, it may, in some cases, be acceptable to base analyses on the nonmissing data, assuming that the missing data are missing at random, to avoid inflating the type II error. In phase III trials, on the other hand, the statistical analysis plan must clearly outline the procedures for imputation of missing data. For example, the statistical analysis design may include a plan for imputation of missing data in the experimental arm in accordance with what is seen in the control arm, to avoid inflating the type I error by introducing bias arising from data that are not missing at random. Unfortunately, often there is a substantial amount of missing COA data, and if they are imputed based upon what is seen in the control arm, this is likely to dilute the observed treatment effect and increase the type II error (reduce the power of the trial). Thus, it is crucial to keep the amount of missing COA data to a minimum.
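
The dilution described above can be approximated with simple arithmetic. If a fraction m of experimental-arm COA values is missing and is imputed from the control-arm distribution, the expected between-arm difference shrinks to roughly (1 - m) times the true difference. The sketch below applies that factor and recomputes approximate power; it ignores any change in variance, assumes missingness occurs only in the experimental arm, and uses a hypothetical effect size of 0.5 with 85 patients per arm.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_arm, alpha=0.05):
    """Normal-approximation power for a two-sided, two-arm comparison of means."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(effect_size * sqrt(n_per_arm / 2) - z_alpha)


def diluted_effect(effect_size, missing_fraction):
    """Expected effect after imputing missing experimental-arm data from control."""
    return (1 - missing_fraction) * effect_size


# Hypothetical design values, for illustration only.
for m in (0.0, 0.1, 0.3):
    d = diluted_effect(0.5, m)
    print(f"missing={m:.0%}  effect={d:.2f}  power={approx_power(d, 85):.2f}")
```

Under these assumptions, a trial designed for 90% power loses a substantial share of that power once about a third of the experimental-arm data must be imputed.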

Conclusions

COAs are increasingly being recognized as important components in the evaluation of treatment regimens. Although most of the discussions and considerations for the use of COAs have been in the context of large-scale randomized clinical trials, often with the goal of registration, earlier-stage clinical trials may benefit from incorporation of these measures. Although the earlier-phase trial designs (phases I and II) will require a different statistical model, COA measures may provide valuable information (in addition to the more traditional toxicity [phase I] and response [phase II] information) about efficacy and impact on patient outcome measures. There are challenges to incorporation of COA measures into clinical trials that are either pragmatic or design related. Precedent for successful incorporation of these measures into clinical trials exists, but there needs to be a concerted effort to incorporate validated COAs into clinical trials and consideration for making patient participation in COAs mandatory and routine. Design-related issues remain, mostly related to the statistical modeling, and will require creative solutions in areas such as integrating multiple testing platforms, strategies to minimize and handle missing data, and statistical plans for endpoint evaluation with other measures such as PFS and OS. Additionally, trial designs need to specify whether the COA outcomes analysis will encompass a single measure, a composite measure, or multiple measures. Each option has advantages and pitfalls and will require careful implementation and testing in prospective clinical trials. Although challenging, the incorporation of COAs into clinical trials is an important goal that will enhance future studies by providing insights not generated by traditional endpoints and efficacy measures and will ultimately improve patient care.

Acknowledgments

The authors would like to thank Drs Terri Armstrong, Mark Gilbert, Patrick Wen and Jennifer Helfer, and David Arons for their efforts to coordinate and lead the Jumpstarting Brain Tumor Drug Development Coalition and FDA Clinical Trials Clinical Outcome Assessment Endpoints Workshop on October 15, 2014, and Drs Remi Kaleta, Martha Donoghue, and Rajeshwari Sridhara for their expert advice during the preparation of the Clinical Outcome Assessment Endpoints Workshop.

Conflict of interest statement. The authors have no conflicts of interest to disclose regarding this work.

References

  1. Wen PY, Cloughesy TF, Ellingson BM et al. Report of the Jumpstarting Brain Tumor Drug Development Coalition and FDA clinical trials neuroimaging endpoint workshop (January 30, 2014, Bethesda MD). Neuro Oncol. 2014;16(Suppl 7):vii36–vii47.
  2. de Wit MC, de Bruin HG, Eijkenboom W et al. Immediate post-radiotherapy changes in malignant glioma can mimic tumor progression. Neurology. 2004;63:535–537.
  3. Wen PY, Macdonald DR, Reardon DA et al. Updated response assessment criteria for high-grade gliomas: Response Assessment in Neuro-Oncology working group. J Clin Oncol. 2010;28:1963–1972.
  4. Gilbert MR, Wang M, Aldape K et al. RTOG 0525: a randomized phase III trial comparing standard adjuvant temozolomide (TMZ) with a dose-dense (dd) schedule in newly diagnosed glioblastoma (GBM). J Clin Oncol. 2011;29:141s.
  5. Armstrong TS, Wefel JS, Wang M et al. Net clinical benefit analysis of Radiation Therapy Oncology Group 0525: a phase III trial comparing conventional adjuvant temozolomide with dose-intensive temozolomide in patients with newly diagnosed glioblastoma. J Clin Oncol. 2013;31:4076–4084.
  6. Brown PD, Pugh S, Laack NN et al. Memantine for the prevention of cognitive dysfunction in patients receiving whole-brain radiotherapy: a randomized, double-blind, placebo-controlled trial. Neuro Oncol. 2013;15:1429–1437.
  7. Gilbert MR, Dignam JJ, Armstrong TS et al. A randomized trial of bevacizumab for newly diagnosed glioblastoma. N Engl J Med. 2014;370:699–708.
  8. Chinot OL, Wick W, Mason W et al. Bevacizumab plus radiotherapy-temozolomide for newly diagnosed glioblastoma. N Engl J Med. 2014;370:709–722.
  9. Taphoorn MJ, Henriksson R, Bottomley A et al. Health-related quality of life in a randomized phase III study of bevacizumab, temozolomide, and radiotherapy in newly diagnosed glioblastoma. J Clin Oncol. 2015;33:2166–2175.
  10. Brennan CW, Verhaak RG, McKenna A et al. The somatic genomic landscape of glioblastoma. Cell. 2013;155:462–477.
  11. Cancer Genome Atlas Research Network, Brat DJ, Verhaak RG et al. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med. 2015;372:2481–2498.
  12. Galanis E, Wu W, Cloughesy T et al. Phase 2 trial design in neuro-oncology revisited: a report from the RANO group. Lancet Oncol. 2012;13:e196–e204.
  13. Grossman SA, Ye X, Piantadosi S et al. Survival of patients with newly diagnosed glioblastoma treated with radiation and temozolomide in research studies in the United States. Clin Cancer Res. 2010;16:2443–2449.
  14. Mesa RA, Gotlib J, Gupta V et al. Effect of ruxolitinib therapy on myelofibrosis-related symptoms and other patient-reported outcomes in COMFORT-I: a randomized, double-blind, placebo-controlled trial. J Clin Oncol. 2013;31:1285–1292.
  15. Stupp R, Hegi ME, Gorlia T et al. Cilengitide combined with standard treatment for patients with newly diagnosed glioblastoma with methylated MGMT promoter (CENTRIC EORTC 26071-22072 study): a multicentre, randomised, open-label, phase 3 trial. Lancet Oncol. 2014;15:1100–1108.

Articles from Neuro-Oncology are provided here courtesy of Society for Neuro-Oncology and Oxford University Press
