Abstract
Background
The patient-reported outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) is used to assess symptomatic adverse events in oncology trials. Currently, no standard for PRO-CTCAE analysis exists.
Methods
Key methods of descriptive analysis and longitudinal modeling using PRO-CTCAE data from an oncology clinical trial, DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2), a phase II trial of belantamab mafodotin in multiple myeloma (NCT03525678), were explored. Descriptive methods included maximum postbaseline ratings, mean change over time, ratings above a predefined cutoff, line graphs, and stacked bar charts to illustrate patient-reported adverse events at one timepoint or dynamics over time. Analysis methods involving modeling over time included toxicity over time (ToxT) (repeated measurement model, time-to-event, area under the curve analyses), generalized estimating equations (GEE), and ordinal log-linear models (OLLMs).
Results
Visualizations of PRO-CTCAE data highlighted different aspects of the data. Selection of the appropriate visualization will depend on the audience and message to be conveyed. Consistent results were obtained by all modeling approaches; no difference was found between dose groups of the DREAMM-2 study in any PRO-CTCAE item by the ToxT approach or the more sophisticated GEE and OLLM methods. Interpretation of GEE results was the most challenging. OLLM supported the interval nature of the PRO-CTCAE response scale in the DREAMM-2 study. All modeling approaches account for multiple testing (driven by the number of items).
Conclusions
Descriptive analyses and longitudinal modeling approaches are complementary approaches to presenting PRO-CTCAE data. In modeling, the ToxT approach may be a good compromise compared with more sophisticated analyses.
Core patient-reported outcome (PRO) concepts that should be measured in oncology trials include disease-related symptoms, physical and role functioning, overall impact of side effects, and symptomatic adverse events (AEs) (1,2). The PRO version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) has been recommended as a useful instrument to assess the latter (2,3). The PRO-CTCAE is a library of 124 items characterizing 78 symptomatic toxicities in terms of frequency, severity, interference, and/or presence or absence (4). Symptomatic AEs related to treatment are selected from the library during trial design to provide an unbiased presentation of patient perceptions of the most important toxicities of the therapies being studied (5).
The number of oncology clinical trials using the PRO-CTCAE increased tenfold between its introduction in 2015 and 2020 (4,6). However, consensus guidelines for analyzing and presenting PRO-CTCAE data do not yet exist (5). Current approaches to PRO-CTCAE data analysis are mostly descriptive and do not address the longitudinal trajectory of symptomatic AEs with statistical models (7).
The PRO-CTCAE was collected over the course of the phase II DREAMM-2 study (8,9) to better understand symptomatic AEs incurred by treatment with belantamab mafodotin, a first-in-class antibody-drug conjugate targeting B-cell maturation antigen (10). Treatment with single-agent belantamab mafodotin 2.5 mg/kg resulted in deep and durable responses and had an acceptable safety profile in patients with triple-class relapsed or refractory multiple myeloma (9,11).
The objective of this work was to explore selected statistical methods and data visualization techniques to support analysis and presentation of PRO-CTCAE data to inform a variety of relevant research questions using data collected within the DREAMM-2 study as an illustrative example.
Methods
Overview of DREAMM-2 study and PRO-CTCAE data collection
In the DREAMM-2 study, patients were randomly assigned to receive belantamab mafodotin 2.5 mg/kg or 3.4 mg/kg; the study was not designed to identify statistically significant differences in outcomes between doses (9,11). The DREAMM-2 study evaluated 28 items corresponding to 15 symptomatic toxicities (PRO-CTCAE version 1.0, with a 7-day recall period; Supplementary Table 1, available online). PRO-CTCAE data were collected at screening, on day 1 of the first treatment cycle before drug administration, during each subsequent treatment cycle (starting at week 4 after first dose) or every 3 weeks, and at the end-of-treatment visit (which occurred within 45 days after the last dose). The analyses were conducted using all available PRO-CTCAE data from patients who received at least 1 dose of belantamab mafodotin as part of the study in the 2.5- or 3.4-mg/kg arm. In this exploration, DREAMM-2 study PRO-CTCAE data are used to illustrate the analysis methods.
Statistical analyses
Methods for analyzing and visualizing PRO-CTCAE data were selected to be adaptable to the specific features of longitudinal PRO-CTCAE data for which independent single items with 4-level response scales are frequently collected. The methods selected were not exhaustive or intended as a definitive list of available approaches. Rather, the intention was to demonstrate how a variety of statistical methods could be applied to address a number of relevant research questions associated with the PRO-CTCAE. No statistical comparisons were made between the selected analysis methods. Equally, selection of specific AEs was not intended to compare results from the analyses but to serve only as illustrations of the methods.
The administration of up to 28 items produced complex datasets with the possibility of skipped and missing data. Statistical techniques with various underlying assumptions and levels of sophistication were tested, as described in Table 1. Conditional branching logic was applied to PRO-CTCAE items in the study to reduce patient burden (12). For example, if a patient reported a symptomatic AE, they were asked about the severity and the extent to which it interfered with daily activities. If a patient did not report an AE, the follow-up questions were not posed, resulting in higher skipped data rates for some of the severity and interference items. In addition to these data skipped by design, participant discontinuation from treatment (mostly due to disease progression) was a major cause of the increasing amount of missing data over the duration of the study (Supplementary Figure 1, available online). Descriptive and graphical visualizations and analyses based on modeling methods are shown in Table 1.
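The distinction between data skipped by design and truly missing data can be illustrated with a minimal sketch. It assumes that the frequency item gates the severity item and that a frequency rating of 0 means the symptom was not reported; the function name and labels are hypothetical and are not taken from the study's data handling.

```python
from typing import Optional

def classify_followup(frequency: Optional[int], severity: Optional[int]) -> str:
    """Label the severity response at one visit (hypothetical labels)."""
    if frequency is None:
        return "missing"            # gating item not completed at this visit
    if frequency == 0:
        return "skipped_by_design"  # symptom not reported, severity never asked
    if severity is None:
        return "missing"            # severity was asked but not answered
    return "observed"

# (frequency, severity) pairs for four illustrative visits
visits = [(0, None), (2, 1), (3, None), (None, None)]
labels = [classify_followup(f, s) for f, s in visits]
print(labels)
```

Keeping the two categories separate allows skipped-by-design responses to be excluded from missing data rates, or handled differently at analysis.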
Table 1.
Rationales for method selection for PRO-CTCAE data and means of applying each analysis method to PRO-CTCAE data
Analysis method | Rationale for using analysis method to analyze longitudinal PRO-CTCAE data | How analysis method can be applied to PRO-CTCAE data | Methodological details in analyses |
---|---|---|---|
Maximum baseline-adjusted PRO-CTCAE item ratings | All symptomatic AEs are shown using aggregated baseline-adjusted values in a single image | Single image for all symptomatic AEs | Descriptive analyses for each PRO-CTCAE item included the percentage of patients that show any symptom at baseline (PRO-CTCAE rating ≥1), percentage of patients that show any worsening while being treated (change from baseline in PRO-CTCAE rating at any postbaseline visit ≥1), and percentage of patients that show a worsening to a rating of 3 or 4. Baseline-adjusted PRO-CTCAE values are the observed value at the visit if higher than the baseline value, and 0 otherwise. |
Mean change from baseline in items over time | Changes from baseline levels of symptomatic AEs are shown over time | One graph per symptomatic AE | Mean PRO-CTCAE item scores were also described per visit, including maximum baseline-adjusted scores (7). |
Summary of symptomatic AEs over time per visit | Distribution of symptomatic AE ratings at each timepoint in the study | One graph per symptomatic AE | Line graphs clustered per symptom type, stacked bar charts, and butterfly charts (as included in FDA’s Project Patient Voice) (13) were produced at each scheduled visit to graphically visualize the data. In some instances, the data were aggregated to permit the use of 1 descriptive or visualization method for a whole PRO-CTCAE item. |
Cumulative frequencies of patients with scores ≥2 or ≥3 per treatment arm | Accumulating numbers of patients who have experienced symptomatic AEs and the speed at which this happens | One graph per symptomatic AE or group of symptomatic AEs | Cumulative frequencies of patients reaching at least a certain response category for each PRO-CTCAE item were produced at each scheduled visit to graphically visualize the data. |
ToxT | | | |
ANOVA | Comparison of ratings over time accounting for within-patient variation | One model computed per PRO-CTCAE item | A repeated-measures ANOVA was applied to baseline and postbaseline PRO-CTCAE data and computed for each PRO-CTCAE item. The explained variable was the PRO-CTCAE rating (treated as continuous), and the explanatory variables were dose arm, visit (repeated measure), and the stratification factors (no. of prior lines of treatment and cytogenetic risk category). |
KM | Assessment of first occurrence of a PRO-CTCAE rating greater than or equal to a predefined value | One model per PRO-CTCAE item and per threshold | KM estimates for time to first occurrence of ratings ≥3 and of the maximum rating were obtained for each PRO-CTCAE item. A rating of ≥3 was selected as the predefined threshold to mimic the cutoff commonly used for CTCAE criteria because this represents an escalation beyond mild or moderate symptoms. |
AUC | Comparison of symptomatic toxicities based on a single number | An AUC analysis performed for each PRO-CTCAE item | Estimation of the AUC was based on individual longitudinal trajectories of PRO-CTCAE ratings using the trapezoid method, with missing data imputed using the LOCF approach. |
GEE | Marginal models that handle repeated measurements with ordinal variables (such as PRO-CTCAE ratings) | One model was computed per PRO-CTCAE item | GEE models compared symptomatic AEs over time between both treatment arms (30,31). An alternating linear regression algorithm for ordinal multinomial responses (32) was applied for each PRO-CTCAE item, where the explained variable was PRO-CTCAE rating (treated as ordinal) and explanatory variables were dose arm, visit (the repeated element), and stratification factors (no. of prior lines of treatment and cytogenetic risk category). |
OLLM | Approach to understand the association in complex contingency tables; give estimates of ORs for impact of dose on maximum postbaseline rating, adjusted on baseline rating | Assessed a 3-way contingency table for each PRO-CTCAE item consisting of (1) treatment arm, (2) PRO-CTCAE rating at baseline, and (3) maximum postbaseline PRO-CTCAE rating | OLLMs were used for dose arm comparison, controlling for baseline PRO-CTCAE rating (33-35). For each PRO-CTCAE item, a procedure was conducted to assess the 3-way contingency table cross-tabulating dose arm, PRO-CTCAE rating at baseline, and maximum postbaseline PRO-CTCAE rating. Several nested models were estimated to study the partial association between the 3 variables. Five PRO-CTCAE item scoring systems were explored to reflect various assumptions on how the response categories of PRO-CTCAE items were distributed. For example, in 1 case the PRO-CTCAE responses were assumed to be evenly distributed (ie, items have interval scale properties), whereas in another case, lower ratings were assumed to be closer than higher ratings. All models were compared using AIC. Odds and ORs comparing postbaseline ratings between treatment groups, controlling for baseline rating, were estimated from the OLLMs having the best goodness-of-fit statistics. |
AE = adverse event; AIC = Akaike information criterion; ANOVA = analysis of variance; AUC = area under the curve; FDA = US Food and Drug Administration; GEE = generalized estimating equation; KM = Kaplan–Meier; LOCF = last observation carried forward; OLLM = ordinal log-linear models; OR = odds ratio; PRO-CTCAE = Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events; ToxT = toxicity over time.
Because each method involved multiple statistical tests, given the number of PRO-CTCAE items (28), the Bonferroni adjustment was used to control for multiple comparisons, and a P value threshold of .0018 (.05/28) was used for statistical significance. Missing data were not imputed, except for the area under the curve (AUC) analysis in the toxicity over time (ToxT) approach, which required this imputation to be valid (missing data would have created different follow-up times between participants, making the calculation of AUC invalid). Data analysis was performed using SAS software version 9.4 (SAS Institute Inc., Cary, NC, USA).
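The Bonferroni threshold is simple arithmetic and can be reproduced as follows. This is an illustrative sketch only; the function names are hypothetical, and the analyses themselves were run in SAS.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni adjustment."""
    return alpha / n_tests

def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = 28) -> bool:
    """Compare a single P value with the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)

threshold = bonferroni_threshold(0.05, 28)  # 28 PRO-CTCAE items
print(round(threshold, 4))                  # 0.0018
print(is_significant(0.007))                # False
```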
Results
Descriptive analyses of PRO-CTCAE data from DREAMM-2 study
In the present analyses, 221 patients were included at baseline, and 132 patients remained in the analysis by week 10. Data for 136 patients were available at the end of treatment. An overview of the advantages and disadvantages of each analysis method can be found in Table 2.
Table 2.
Advantages and disadvantages of the descriptive, visualization, and modeling methods identified as appropriate to analyze PRO-CTCAE data
Method | Advantages | Disadvantages |
---|---|---|
Maximum baseline-adjusted PRO-CTCAE item ratings | | |
Mean change from baseline in items over time | | |
Summary of symptomatic AEs over time per visit | | |
Cumulative frequencies of patients with scores ≥2 or ≥3 per treatment arm | | |
ToxT | | |
ANOVA | | |
KM | | |
AUC | Provides a single number to quantify “cumulative” toxicity within each treatment arm | |
GEE | | |
OLLM | | |
AE = adverse event; ANOVA = analysis of variance; AUC = area under the curve; GEE = generalized estimating equations; KM = Kaplan–Meier; OLLM = ordinal log-linear models; OR = odds ratio; PRO-CTCAE = Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events; ToxT = toxicity over time.
Visualization of maximum baseline-adjusted symptomatic AEs
An overview of symptomatic AEs over the course of the DREAMM-2 study using aggregated maximum baseline-adjusted values for all PRO-CTCAE items is shown in Figure 1. This visualization provides an overview of the symptomatic AEs experienced by patients during the study, displaying full distribution and variation in AE ratings, related to symptoms at baseline. This figure is a good summary but does not show how symptoms change over time, and handling of missing items requires careful consideration.
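The baseline-adjusted value defined in Table 1 (the observed rating if it is higher than baseline, and 0 otherwise) and its maximum over postbaseline visits can be sketched as below. The function names are hypothetical, and ignoring missing visits is an assumption made here for illustration; as noted above, the handling of missing items requires careful consideration.

```python
from typing import List, Optional

def baseline_adjusted(baseline: int, observed: int) -> int:
    """Observed rating if it exceeds the baseline rating, 0 otherwise."""
    return observed if observed > baseline else 0

def max_baseline_adjusted(baseline: int,
                          postbaseline: List[Optional[int]]) -> Optional[int]:
    """Maximum baseline-adjusted rating; missing visits (None) are ignored."""
    values = [baseline_adjusted(baseline, r) for r in postbaseline if r is not None]
    return max(values) if values else None

worsened = max_baseline_adjusted(1, [1, 2, None, 3, 0])  # adjusted: 0, 2, 3, 0
stable = max_baseline_adjusted(2, [1, 2])                # never above baseline
no_data = max_baseline_adjusted(0, [None, None])         # no postbaseline data
print(worsened, stable, no_data)
```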
Figure 1.
Maximum baseline-adjusted scores for Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) symptom items. Data from DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2) study PRO-CTCAE collection, full analysis set (N = 221). AE = adverse event; Inj = injection; IV = intravenous.
Visualization of symptomatic AEs over the study follow-up
Line graphs showing the mean change from baseline at each visit, per symptom type, provide a conventional approach to show the change of a symptomatic AE over time. The mean change from baseline in severity of nausea over time in the DREAMM-2 study is shown in Figure 2 as an illustrative example.
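A minimal sketch of the per-visit mean change from baseline that underlies such a line graph is given below. The record layout, visit labels, and function name are hypothetical, and the sketch assumes that only observed ratings contribute at each visit.

```python
from statistics import mean
from typing import Dict, List

def mean_change_by_visit(records: List[dict]) -> Dict[str, float]:
    """Mean of (rating - baseline) per visit, using observed ratings only."""
    changes: Dict[str, List[int]] = {}
    for rec in records:
        if rec["baseline"] is None:
            continue  # change from baseline is undefined without a baseline
        for visit, rating in rec["visits"].items():
            if rating is not None:
                changes.setdefault(visit, []).append(rating - rec["baseline"])
    return {visit: mean(vals) for visit, vals in changes.items()}

patients = [
    {"baseline": 1, "visits": {"W4": 2, "W7": 1}},
    {"baseline": 0, "visits": {"W4": 1, "W7": None}},
    {"baseline": 2, "visits": {"W4": 2, "W7": 3}},
]
result = mean_change_by_visit(patients)
print(result)
```

Because the number of contributing patients changes from visit to visit, reporting the per-visit sample size alongside each mean helps interpretation.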
Figure 2.
Mean change from baseline in nausea over time. Data from DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2) study Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) collection, full analysis set (N = 221); 2.5 mg/kg, n = 97; 3.4 mg/kg, n = 124 (frozen, n = 99; lyophilized, n = 25); end of treatment 2.5 mg/kg, n = 63; 3.4 mg/kg, n = 73. Frequency questions are the first questions patients are asked about a specific symptom. If no symptoms are declared, the severity question is not asked, which can result in reduced sample sizes for severity measures. AE = adverse event; BL = baseline; EOT = end of treatment; PRO-CTCAE = Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events; W = week.
Other approaches to describe PRO-CTCAE ratings over time are butterfly charts and stacked bar charts. Examples of these for severity of nausea in the DREAMM-2 study are presented in Figure 3, A and B, respectively. These charts show the distribution of ratings for a given symptomatic AE at each timepoint per dose arm. Butterfly charts are similar to the visualization method used by the US Food and Drug Administration’s Project Patient Voice, an online platform developed to display patient-reported symptom data in a consistent manner across selected cancer trials (13).
Figure 3.
Percent of ratings for nausea at specific timepoints per dose arm by severity. Data from DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2) study Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) collection, full analysis set (N = 221). AE = adverse event; BL = baseline; EOT = end of treatment; FDA = US Food and Drug Administration; W = week.
Visualization of cumulative frequencies of participants experiencing symptomatic AEs
The number of participants who experience certain symptomatic AEs over the course of the study may be relevant information to present. Cumulative frequencies are useful tools for this purpose; Figure 4 provides example cumulative frequencies for participants with nausea during the DREAMM-2 study. These graphical representations show the proportion of patients experiencing a given level of a symptomatic AE and how quickly it occurs and provide an overall comparison of change in symptomatic AEs over time between treatment groups.
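The cumulative frequency described here can be sketched as the proportion of patients who have reached a given rating at or before each visit. The function name is hypothetical, and using all patients in the arm as a fixed denominator is an assumption made for illustration; other denominators are possible.

```python
from typing import List, Optional

def cumulative_frequency(trajectories: List[List[Optional[int]]],
                         threshold: int, n_visits: int) -> List[float]:
    """Proportion of patients with any rating >= threshold up to each visit."""
    n = len(trajectories)
    reached = [False] * n
    proportions = []
    for v in range(n_visits):
        for i, traj in enumerate(trajectories):
            rating = traj[v] if v < len(traj) else None  # None: no assessment
            if rating is not None and rating >= threshold:
                reached[i] = True  # once reached, the patient stays counted
        proportions.append(sum(reached) / n)
    return proportions

arm = [[0, 2, 1], [1, 1, 3], [0, None, 0], [2, 0]]
at_least_2 = cumulative_frequency(arm, threshold=2, n_visits=3)
at_least_3 = cumulative_frequency(arm, threshold=3, n_visits=3)
print(at_least_2, at_least_3)
```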
Figure 4.
Cumulative frequencies of patients reaching scores of at least 2 and 3 per dose arm for nausea. Data from DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2) study Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) collection, full analysis set (N = 221). AE = adverse event; BL = baseline; EOT = end of treatment; FAS = full analysis set; W = week.
Modeling longitudinal PRO-CTCAE data
Comparing symptomatic AEs over time (ToxT)
Accounting for within-patient variation with repeated-measures analysis of variance (ANOVA)
A first analytical component of the ToxT approach is repeated-measures ANOVA. It is based on a linear model for repeated measures and controls for within-individual correlations. Figure 5, A shows the adjusted mean PRO-CTCAE ratings for severity of constipation in DREAMM-2 participants, predicted from the model at each visit, for each dose arm along with confidence intervals. More details on model fit for the model assessing severity of constipation can be found in Supplementary Table 2 (available online). Repeated-measures ANOVA provides model-based estimates of mean PRO-CTCAE ratings that permit comparison of 2 treatment arms of a trial while considering between-patient variation.
Figure 5.
Toxicity over time analysis for severity of constipation. A) Adjusted mean repeated measures analysis of variance (ANOVA) after Bonferroni correction; B) Kaplan–Meier (KM) estimates of time to first occurrence of maximum postbaseline rating for Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) and log-rank test; and C) Areas under the curve (AUCs) of mean PRO-CTCAE ratings. Data from DRiving Excellence in Approaches to Multiple Myeloma-2 (DREAMM-2) study PRO-CTCAE collection, full analysis set (N = 221). Results shown are for illustrative purposes only; modeling approaches were not compared with each other. CI = confidence interval; SE = standard error.
Time-to-event analysis using Kaplan–Meier (KM) estimates
The second analytical component of the ToxT approach is a KM estimate of occurrence of the symptomatic AE of interest. The KM estimate for the time-to-event analysis of severity of constipation is shown in Figure 5, B. This approach compares the time until participants in 2 treatment groups experience the specific symptomatic AE.
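A minimal sketch of this time-to-event component is shown below: each patient's time to the first rating at or above a threshold is derived, and a product-limit (KM) estimate is computed. The function names and visit times are hypothetical, and censoring patients without an event at their last observed assessment is an assumption made for illustration, not a description of the study's SAS implementation.

```python
from typing import List, Optional, Tuple

def time_to_first_event(times: List[int], ratings: List[Optional[int]],
                        threshold: int = 3) -> Tuple[Optional[int], bool]:
    """(time, event): first time a rating >= threshold is seen, else censored."""
    last_seen = None
    for t, r in zip(times, ratings):
        if r is None:
            continue
        if r >= threshold:
            return t, True
        last_seen = t
    return last_seen, False  # censored at the last observed assessment

def kaplan_meier(samples: List[Tuple[Optional[int], bool]]) -> List[Tuple[int, float]]:
    """Product-limit estimate of remaining event-free at each event time."""
    samples = [s for s in samples if s[0] is not None]
    surv, curve = 1.0, []
    for t in sorted({t for t, event in samples if event}):
        at_risk = sum(1 for ti, _ in samples if ti >= t)
        events = sum(1 for ti, event in samples if ti == t and event)
        surv *= 1 - events / at_risk
        curve.append((t, surv))
    return curve

weeks = [0, 4, 7, 10]
arm = [[0, 3, 1, 0], [1, 1, None, 2], [0, 2, 3, 3], [0, 1]]
curve = kaplan_meier([time_to_first_event(weeks, r) for r in arm])
print(curve)  # [(4, 0.75), (7, 0.375)]
```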
Comparing symptomatic toxicities using AUC analysis with a single number
The third analytical component of the ToxT approach is an AUC analysis. It provides a single number to quantify “cumulative” toxicity within each treatment arm. An example for severity of constipation in the DREAMM-2 study is shown in Figure 5, C. AUC analysis is used to compare overall toxicity between treatment groups.
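The AUC calculation described in Table 1 (trapezoid method on individual trajectories, with missing ratings imputed by LOCF) can be sketched as follows. The function names and visit times are hypothetical, and the sketch assumes that the first (baseline) rating is observed.

```python
from typing import List, Optional

def locf(ratings: List[Optional[int]]) -> List[Optional[int]]:
    """Last observation carried forward; leading gaps are left as None."""
    filled, last = [], None
    for r in ratings:
        if r is not None:
            last = r
        filled.append(last)
    return filled

def trapezoid_auc(times: List[float], values: List[float]) -> float:
    """Area under a rating trajectory using the trapezoid rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (values[i] + values[i - 1]) / 2
    return area

weeks = [0, 4, 7, 10]
ratings = [1, 2, None, 0]      # week 7 assessment is missing
filled = locf(ratings)         # [1, 2, 2, 0]
auc = trapezoid_auc(weeks, filled)
print(filled, auc)             # [1, 2, 2, 0] 15.0
```

Imputation gives every patient the same follow-up grid, which is why the AUC analysis was the one place where missing data were imputed.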
Comparing symptomatic AEs considering the ordinal nature of PRO-CTCAE ratings with generalized estimating equations (GEE) models
Dose arms in a study can be compared using a GEE model, which uses the full data available while accounting for within-patient variation and the ordinal nature of PRO-CTCAE ratings. The GEE model provides a point estimate with an associated P value to compare the PRO-CTCAE ratings over time between the 2 groups. An example GEE model output is shown in Table 3 for frequency of shivering and shaking chills. More details on model fit for this model can be found in Supplementary Table 3 (available online). This approach, which is the most sophisticated tested here, may also be the most appropriate for PRO-CTCAE data given their ordinal nature. The odds ratio for the cumulative logit link function can be derived from model parameter estimates, but its interpretation might be challenging and counterintuitive. In Table 3, the odds ratio of 0.58 refers to the odds of having a lower or equal rating in the 2.5-mg/kg arm of the DREAMM-2 study relative to the 3.4-mg/kg arm. Because this odds ratio is below 1, overall higher ratings are expected for this symptom in the 2.5-mg/kg arm (note that the difference was not statistically significant [P = .007] after the Bonferroni adjustment).
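The link between the GEE parameter estimate and the cumulative odds ratio reported in Table 3 can be checked by exponentiating the estimate and its confidence limits. The numbers below are those reported in Table 3 for frequency of shivering and shaking chills; the variable names are illustrative.

```python
import math

# GEE estimate (log cumulative odds ratio) and 95% CI limits, 2.5 vs 3.4 mg/kg
estimate, ci_lower, ci_upper = -0.545, -0.940, -0.151

odds_ratio = math.exp(estimate)
or_ci = (math.exp(ci_lower), math.exp(ci_upper))

print(round(odds_ratio, 2))                   # 0.58
print(round(or_ci[0], 2), round(or_ci[1], 2)) # 0.39 0.86
```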
Table 3.
Combined example results for GEE and OLLM analyses
Longitudinal modeling method | PRO-CTCAE item | Level | No. | Estimate | SE | 95% CI | OR (cumulative) | Pr > \|Z\| |
---|---|---|---|---|---|---|---|---|
GEE | Shivering and shaking chills (frequency) | 2.5 vs 3.4 mg/kg | 1478 | −0.545 | 0.201 | −0.940 to −0.151 | 0.58 | .007 |
Longitudinal modeling method | PRO-CTCAE item | Variable | No. | Estimate | SE | 95% CI | OR | Pr > ChiSq |
OLLM | Heart palpitations (frequency) | Maximum rating by dose (2.5 vs 3.4 mg/kg) | 177 | −0.33 | 0.20 | −0.71 to 0.06 | 0.72 | .10 |
OLLM | Heart palpitations (frequency) | Maximum baseline rating | 177 | 0.63 | 0.12 | 0.40 to 0.87 | 1.89 | <.001 |
Data from DREAMM-2 study PRO-CTCAE collection, FAS (N = 221). Results shown are for illustrative purposes only; modeling approaches were not compared with each other. CI = confidence interval; DREAMM-2 = DRiving Excellence in Approaches to Multiple Myeloma-2; FAS = full analysis set; GEE = generalized estimating equations; OLLM = ordinal log-linear models; OR = odds ratio; Pr = probability; PRO-CTCAE = Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events; SE = standard error.
Comparing symptomatic AEs adjusting for baseline PRO-CTCAE ratings with ordinal log-linear models (OLLMs)
OLLMs are versatile descriptive approaches permitting comparison of symptomatic AEs using aggregated variables or at specific timepoints, adjusting for baseline ratings. OLLMs provide interpretable estimates (odds ratios) comparing the PRO-CTCAE ratings between groups of interest, with an associated P value. Example OLLM output is shown in Table 3 for frequency of heart palpitations, where the odds ratio of 0.72 indicates that the maximum postbaseline rating for the PRO-CTCAE item is likely to be lower in the 2.5-mg/kg arm of the DREAMM-2 study (though this was not statistically significant [P = .10] after the Bonferroni adjustment). Various models with different specifications of the response scale coding were compared using Akaike information criterion (AIC), with a lower AIC indicating better fit. The model with equal spacing (assuming interval scale) was the one with the lowest AIC, suggesting the possible linearity of PRO-CTCAE ratings in the DREAMM-2 data (Supplementary Table 4, available online).
Discussion
The analyses reported here explored various statistical methods and data visualization techniques to improve the analysis and presentation of PRO-CTCAE data from oncology clinical trials and to address a variety of relevant research questions related to these data. The overall aim was to explore some of the different options and issues relating to data visualization and analyses of PRO-CTCAE data to promote further research as to how best to move beyond descriptive analyses and include examinations of the longitudinal trajectory of symptomatic AEs with statistical models. In some instances, visualization may precede the hypothesis in an exploratory setting when the objective may be to generate the research question. In other circumstances, the visualization is used to illustrate the analysis that has been carried out to address a predefined hypothesis. The appraisal of the analytical tools that we applied in this post hoc analysis of the DREAMM-2 study data will help in the selection of relevant techniques for future oncology studies.
Multiple options were considered for the description and visualization of PRO-CTCAE data, including replication of the visualization of the pilot case of the US Food and Drug Administration’s Project Patient Voice (13). These techniques highlighted the importance of precise definition of the message, objective, and targeted audience in order to select the best visualization and that a fine balance is needed between simplicity and interpretable display of the complexity of the results. Additionally, we investigated the modeling methods that would be appropriate for statistical comparison of longitudinal PRO-CTCAE data between groups in a clinical trial. The most sophisticated methods, such as GEE models or OLLM, did not provide additional insight beyond the more standard techniques included in the ToxT approach. The ToxT approach, which to our knowledge has been applied to PRO-CTCAE for the first time, may be a good compromise for standard use in this context. The methods we tested, particularly the more sophisticated ones, may be used on a supplemental basis to address specific objectives. For instance, OLLM can be used to study the appropriateness of treating PRO-CTCAE item ratings as ordinal or interval variables by comparing the results with different specifications of the response coding. The OLLM analysis result showed that using a “linear” scoring system for the PRO-CTCAE items worked well for the DREAMM-2 study data. This suggests that using raw PRO-CTCAE ratings in models that assume linearity (such as those used in the ToxT approach) can be acceptable and may inform further future development of methods to analyze PRO-CTCAE data.
Although our findings did not identify any statistically significant differences between arms using the more sophisticated modeling methods for longitudinal PRO-CTCAE data, these methods should not be dismissed solely based on their performance in the DREAMM-2 study because the trial compared 2 doses of belantamab mafodotin and was not designed to detect differences between the PRO-CTCAE items. In clinical studies containing cohorts treated with different drugs instead of the same drug at different doses, these statistical modeling methods would be more likely to show greater differences in PRO-CTCAE item ratings, and their utility should be confirmed in such studies. Our work is certainly not a comprehensive list of all of the valid or useful approaches that can be applied for the analysis of PRO-CTCAE data. We identified the application cases corresponding to typical research questions and objectives that could be informed by the analyses of PRO-CTCAE data, and we searched for appropriate analyses in this setting. This led to the series of analyses and visualization techniques applied here, but others could certainly be imagined. This is true for visualization where the graphical representation options are almost infinite [eg, Sankey diagrams could be used (14,15)] but also for modeling approaches where other methods also could have been appropriate (eg, cumulative link mixed models) (16,17). For example, the toxicity index might be useful because this can detect difference in toxicity profile between treatments and does not rely on maximum grades often used for CTCAE data or average summary measures as used by ToxT (18-20). Inferential statistics are typically not relevant for PRO-CTCAE items but may be an option in some cases (16,17). Selection of the most appropriate approach may depend on the research question being asked and the target audience.
Collection of PRO-CTCAE data generates large and complex datasets that present a number of challenges in terms of analysis. The main challenge we identified was the high dimensionality of the data, where many data points needed handling (eg, individual item scores) and a high number of assessments occurred for each individual over the course of the study. In this context, multiplicity of statistical testing and type I error rate inflation should be considered; several approaches are possible for this purpose (eg, Bonferroni adjustment but also Holm or Hochberg procedures). The choice of the best method to deal with multiplicity will have to be informed by the objective of the analyses and the risk of type I error that the analyst is ready to accept. Given that PRO-CTCAE items are increasingly being collected in clinical studies and the need for both clinicians and patients to clearly interpret results, future work should include optimizing the visual presentation of the data to improve clarity for these different audiences. This could potentially include developing dynamic tools or interactive interfaces for PRO-CTCAE item visualization. It is critical that the message conveyed by any visualization is well received by the target audience. Therefore, further research to gain feedback from the targeted audience, including clinicians and patients, on the various graphical visualizations of PRO-CTCAE would identify the most effective graphics. This could be informed by previous research conducted for more general longitudinal PRO data (21). For example, researchers have found that clearly indicating the directionality of PRO scoring was important for accurate interpretation, particularly for patient audiences, while normed data were more likely to be misinterpreted than non-normed data (22,23). Similarly, studies have found line graphs easy to understand and pie charts easier to interpret than bar charts for presenting the proportions of patients who changed.
Recent work explored aggregating PRO-CTCAE items through the use of composite scores (24). In this approach, scores on up to 3 attributes of frequency, severity, and interference with daily activities for each AE are combined into a single metric for each AE, which may be beneficial when visualizing a more succinct version of the data is preferred. These approaches are promising, but they also require further specific research, especially to inform their interpretation, which goes beyond the simple illustration purpose that we had with this research.
The second major challenge we identified was handling the large amount of skipped and missing data of different types, including missing attributes, intermittent missing data, and informative missing data (in other words, missing data that do not randomly occur but relate to the quality of life of the patient; eg, if the patient is too ill to complete the questionnaire). Identifying the precise reasons for skipped and missing data in longitudinal studies is important to best match the analysis and imputation approach (25-28) and may provide additional insight into interpretation of the study. For example, if the missingness is informative, excluding these data could bias the interpretation of the results (26). Similarly, clarifying whether missing data due to study discontinuation is due to disease progression as opposed to toxicity would be revealing, especially if the research question was to summarize tolerability over the course of the study. Indeed, the importance of correctly handling skipped and missing data is recognized in recommendations on using PRO-CTCAE data being developed by the NCI Cancer Moonshot Standardization Working Group (14). In the DREAMM-2 study, some data were skipped due to conditional branching logic, where, due to the format of the questioning algorithm, certain follow-up questions were not posed if a patient did not report experiencing a specific AE. PRO-CTCAE data were electronically collected during clinic visits, so there may have been a lower burden of missing data than might be expected from postal survey methods in which patients may not return every survey (25,26,29). The burden of missing data increased over time as a growing number of patients dropped out of the study due to disease progression. This was expected given the natural history of the disease in question and the longitudinal nature of the study.
More work on the challenges posed by missing PRO-CTCAE data and on their solutions, including how to calculate percentages, how to account for patient discontinuations due to disease progression during interpretation, and how to handle missing data (best imputation strategy), would be useful given the increasing uptake of PRO-CTCAE collection and analysis.
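To make the question of how to calculate percentages concrete, the sketch below computes the share of patients rating an item at or above a cutoff at one visit, using two different denominators. It is a hypothetical example: the function name, the ratings, and the cutoff are invented for illustration, and neither denominator is presented as the recommended choice.

```python
from typing import Dict, List, Optional


def pct_at_or_above_cutoff(
    ratings: List[Optional[int]], cutoff: int, n_enrolled: int
) -> Dict[str, Optional[float]]:
    """Percentage of ratings >= cutoff under two denominator choices.

    `ratings` holds one entry per enrolled patient at a given visit;
    None marks a patient with no response at that visit.
    """
    observed = [r for r in ratings if r is not None]
    n_events = sum(r >= cutoff for r in observed)
    return {
        # Denominator: patients who answered the item at this visit.
        "pct_of_respondents": 100 * n_events / len(observed) if observed else None,
        # Denominator: all enrolled patients, responders or not.
        "pct_of_enrolled": 100 * n_events / n_enrolled,
    }


# Hypothetical visit: 8 enrolled patients, 5 of whom responded.
ratings = [3, 0, None, 2, None, 4, 1, None]
result = pct_at_or_above_cutoff(ratings, cutoff=3, n_enrolled=len(ratings))
print(result)
```

With these invented ratings, the same two events give 40% of respondents but 25% of enrolled patients, which shows how strongly the reported percentage can depend on the denominator once patients discontinue.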
In conclusion, many analytical and visualization techniques are available to process PRO-CTCAE data; the choice depends on the research question of interest and on the target audience. This study highlights a toolbox of potential methods that could be optimized to the specific requirements of oncology trials and that may be complemented by other analytical techniques in the future.
Contributor Information
Antoine Regnault, Modus Outcomes, A Division of THREAD, Lyon, France.
Angély Loubert, Modus Outcomes, A Division of THREAD, Lyon, France.
Boris Gorsh, GSK, Upper Providence, PA, USA.
Randy Davis, GSK, Upper Providence, PA, USA.
Anna Cardellino, GSK, Upper Providence, PA, USA.
Kristin Creel, Modus Outcomes, A Division of THREAD, Lyon, France.
Stéphane Quéré, Modus Outcomes, A Division of THREAD, Lyon, France.
Sandhya Sapra, GSK, Upper Providence, PA, USA.
Linda Nelsen, GSK, Upper Providence, PA, USA.
Laurie Eliason, GSK, Upper Providence, PA, USA.
Funding
This study was funded by GSK (Study 212225).
Notes
Role of the funder: GSK funded the design of this study and the data analysis and interpretation, and provided financial support for writing assistance. The authors are responsible for the decision to submit the article.
Disclosures: BG, SS, RD, LN, and AC are employees of GSK and hold stocks/shares in the company. LE was an employee of GSK at the time of the study and holds stocks/shares in the company. AL, KC, SQ, and AR are employees of Modus Outcomes, a division of THREAD, a patient-centered outcome consultancy commissioned by GSK to conduct this research.
Author contributions: Conceptualization: LE, RD, AL, AR, KC, LN, SQ. Investigation: AL, AC. Formal Analysis: LE, BG, RD, AL, AR, KC, SS, AC, LN, SQ. Writing—original draft: AR, AL, BG, RD, AC, KC, SQ, SS, LN, LE. Writing—review and editing: AR, AL, BG, RD, AC, KC, SQ, SS, LN, LE.
Acknowledgements: Writing assistance was provided by Elisabeth Walsby, PhD and Jiah Pearson-Leary, PhD of Fishawack Indicia Ltd, part of Fishawack Health.
Prior presentations: Parts of these data have previously been presented in the following congress abstracts: Eliason L., Gorsh B., Sapra S., et al. Adoption of the Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) and Arising Descriptive Methods in Cancer Clinical Trials: A Landscape Review and Illustration with a Multiple Myeloma Study. Quality of Life Research 2021; 30(SUPPL 1):S52.
Regnault A., Quere S., Gorsh B., et al. Landscape review of the patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE) in oncology: Adoption and recent learnings. HemaSphere 2021; 5(SUPPL 2):563.
Regnault A., Quéré S., Gorsh B., Sapra S., et al. Landscape review of the patient reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE) in oncology: Adoption and recent learnings. Journal of Clinical Oncology 2021; 39(15)SUPPL: e18587.
Regnault A., Loubert A., Davis R., et al. Statistical analysis of symptomatic adverse events over the course of oncology clinical trials: What are the appropriate methods for longitudinal PRO-CTCAE data? Quality of Life Research 2021;30(SUPPL 1):S123.
Data availability
GSK makes available anonymized individual participant data and associated documents from interventional clinical studies which evaluate medicines, upon approval of proposals submitted to www.clinicalstudydatarequest.com. To access data for other types of GlaxoSmithKline-sponsored research, for study documents without patient/participant-level data, and for clinical studies not listed, please submit an enquiry via the website (www.clinicalstudydatarequest.com).
References
- 1. Trask PC, Dueck AC, Piault E, et al. Patient-reported outcomes version of the common terminology criteria for adverse events: methods for item selection in industry-sponsored oncology clinical trials. Clin Trials. 2018;15(6):616-623.
- 2. U.S. Food and Drug Administration. Core Patient-Reported Outcomes in Cancer Clinical Trials Draft Guidance for Industry June 2021. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/core-patient-reported-outcomes-cancer-clinical-trials. Accessed February 17, 2023.
- 3. European Medicines Agency. Guidelines on the evaluation of anticancer medicinal products in man: the use of patient-reported outcome (PRO) measures in oncology studies, Appendix 2. https://www.ema.europa.eu/en/documents/other/appendix-2-guideline-evaluation-anticancer-medicinal-products-man_en.pdf. Accessed February 17, 2023.
- 4. Basch E, Reeve BB, Mitchell SA, et al. Development of the National Cancer Institute's patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). J Natl Cancer Inst. 2014;106(9):dju244. doi: 10.1093/jnci/dju244.
- 5. Kluetz PG, Slagle A, Papadopoulos EJ, et al. Focusing on core patient-reported outcomes in cancer clinical trials: symptomatic adverse events, physical function, and disease-related symptoms. Clin Cancer Res. 2016;22(7):1553-1558.
- 6. Regnault A, Quéré S, Gorsh B, et al. Landscape review of the patient reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE) in oncology: adoption and recent learnings. J Clin Oncol. 2021;39(suppl 15):e18587.
- 7. Basch E, Rogak LJ, Dueck AC. Methods for implementing and reporting patient-reported outcome (PRO) measures of symptomatic adverse events in cancer clinical trials. Clin Ther. 2016;38(4):821-830.
- 8. ClinicalTrials.gov. A study to investigate the efficacy and safety of two doses of GSK2857916 in participants with multiple myeloma who have failed prior treatment with an anti-CD38 antibody. https://clinicaltrials.gov/ct2/show/NCT03525678?term=DREAMM-2&draw=2&rank=1. Accessed February 17, 2023.
- 9. Lonial S, Lee HC, Badros A, et al. Belantamab mafodotin for relapsed or refractory multiple myeloma (DREAMM-2): a two-arm, randomised, open-label, phase 2 study. Lancet Oncol. 2020;21(2):207-221.
- 10. Tai YT, Mayes PA, Acharya C, et al. Novel anti-B-cell maturation antigen antibody-drug conjugate (GSK2857916) selectively induces killing of multiple myeloma. Blood. 2014;123(20):3128-3138.
- 11. Lonial S, Lee HC, Badros A, et al. Longer term outcomes with single-agent belantamab mafodotin in patients with relapsed or refractory multiple myeloma: 13-month follow-up from the pivotal DREAMM-2 study. Cancer. 2021;127(22):4198-4212.
- 12. National Institute of Health National Cancer Institute. Division of Cancer Control and Population Sciences. PRO-CTCAE frequently asked questions. https://healthcaredelivery.cancer.gov/pro-ctcae/faqs.html#branching. Accessed August 23, 2021.
- 13. U.S. Food and Drug Administration. FDA perspective: patient self-reporting in the evaluation of cancer drug tolerability. https://www.fda.gov/media/148468/download. Accessed February 17, 2023.
- 14. Gresham G, Mazza GL, Langlais B, et al.; Patient Tolerability Consortia Standardization Working Group. Graphical representations of patient tolerability data: recommendations from the National Cancer Institute (NCI) Cancer Moonshot Standardization Working Group. J Clin Oncol. 2021;39(suppl 15):e18612.
- 15. Otto E, Culakova E, Meng S, et al. Overview of Sankey flow diagrams: focusing on symptom trajectories in older adults with advanced cancer. J Geriatr Oncol. 2022;13(5):742-746.
- 16. Hedeker D, Gibbons RD. A random-effects ordinal regression model for multilevel analysis. Biometrics. 1994;50(4):933-944.
- 17. Christensen RHB. Cumulative link models for ordinal regression with the R package ordinal. http://cran.uni-muenster.de/web/packages/ordinal/vignettes/clm_article.pdf. Accessed February 17, 2023.
- 18. Gresham G, Diniz MA, Razaee ZS, et al. Evaluating treatment tolerability in cancer clinical trials using the toxicity index. J Natl Cancer Inst. 2020;112(12):1266-1274.
- 19. Razaee ZS, Amini AA, Diniz MA, et al. On the properties of the toxicity index and its statistical efficiency. Stat Med. 2021;40(6):1535-1552.
- 20. Langlais B, Mazza GL, Thanarajasingam G, et al. Evaluating treatment tolerability using the toxicity index with patient-reported outcomes data. J Pain Symptom Manage. 2022;63(2):311-320.
- 21. Bantug ET, Coles T, Smith KC, et al.; PRO Data Presentation Stakeholder Advisory Board. Graphical displays of patient-reported outcomes (PRO) for use in clinical practice: what makes a pro picture worth a thousand words? Patient Educ Couns. 2016;99(4):483-490.
- 22. Snyder C, Smith K, Holzner B, et al.; PRO Data Presentation Delphi Panel. Making a picture worth a thousand numbers: recommendations for graphically displaying patient-reported outcomes data. Qual Life Res. 2019;28(2):345-356.
- 23. Tolbert E, Brundage M, Bantug E, et al.; PRO Data Presentation Stakeholder Advisory Board. Picture this: presenting longitudinal patient-reported outcome research study results to patients. Med Decis Making. 2018;38(8):994-1005.
- 24. Basch E, Becker C, Rogak LJ, et al. Composite grading algorithm for the National Cancer Institute's Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE). Clin Trials. 2021;18(1):104-114.
- 25. Fielding S, Fayers P, Ramsay CR. Analysing randomised controlled trials with missing data: choice of approach affects conclusions. Contemp Clin Trials. 2012;33(3):461-469.
- 26. Fielding S, Fayers PM, McDonald A, et al. Simple imputation methods were inadequate for missing not at random (MNAR) quality of life data. Health Qual Life Outcomes. 2008;6(1):57.
- 27. Fielding S, Fayers PM, Ramsay CR. Investigating the missing data mechanism in quality of life outcomes: a comparison of approaches. Health Qual Life Outcomes. 2009;7(1):57.
- 28. Preston NJ, Fayers P, Walters SJ, et al.; MORECare. Recommendations for managing missing data, attrition and response shift in palliative and end-of-life care research: part of the MORECare research method guidance on statistical issues. Palliat Med. 2013;27(10):899-907.
- 29. Basch E, Mody GN, Dueck AC. Electronic patient-reported outcomes as digital therapeutics to improve cancer outcomes. JCO Oncol Pract. 2020;16(9):541-542.
- 30. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13-22.
- 31. Wedderburn RWM. Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika. 1974;61(3):439-447.
- 32. Heagerty PJ, Zeger SL. Marginal regression models for clustered ordinal measurements. J Am Stat Assoc. 1996;91(435):1024-1036.
- 33. Gounder MM, Mahoney MR, Van Tine BA, et al. Sorafenib for advanced and refractory desmoid tumors. N Engl J Med. 2018;379(25):2417-2428.
- 34. Ishii-Kuntz M. Ordinal Log-Linear Models. Newbury Park, CA: Sage Publications, Inc.; 1994.
- 35. O'Connell AA. Logistic Regression Models for Ordinal Response Variables. Newbury Park, CA: Sage Publications, Inc.; 2006.
- 36. Brundage MD, Smith KC, Little EA, et al.; PRO Data Presentation Stakeholder Advisory Board. Communicating patient-reported outcome scores using graphic formats: results from a mixed-methods evaluation. Qual Life Res. 2015;24(10):2457-2472.
- 37. Brundage M, Blackford A, Tolbert E, et al.; PRO Data Presentation Stakeholder Advisory Board (various names and locations). Presenting comparative study PRO results to clinicians and researchers: beyond the eye of the beholder. Qual Life Res. 2018;27(1):75-90.