Abstract
BACKGROUND
Given the variability in pulmonary exacerbation (PEx) management within and between Cystic Fibrosis (CF) Care Centers, it is possible that some approaches may be superior to others. A challenge with comparing different PEx management approaches is lack of a community consensus with respect to treatment-response metrics. In this analysis, we assess the feasibility of using different response metrics in prospective randomized studies comparing PEx treatment protocols.
METHODS
Response parameters were compiled from the recent STOP (Standardized Treatment of PEx) feasibility study. Pulmonary function responses (recovery of best prior 6-month and 12-month FEV1% predicted and absolute and relative FEV1% predicted improvement from treatment initiation) and sign and symptom recovery from treatment initiation (measured by the Chronic Respiratory Infection Symptom Score [CRISS]) were studied as categorical and continuous variables. The proportion of patients retreated within 30 days after the end of initial treatment was studied as a categorical variable. Sample sizes required to adequately power prospective 1:1 randomized superiority and non-inferiority studies employing candidate endpoints were explored.
RESULTS
The most sensitive endpoint was mean change in CRISS from treatment initiation, followed by mean absolute FEV1 % predicted change from initiation, with the two responses only modestly correlated (R2 = 0.157; P < 0.0001). Recovery of previous best FEV1 was a problematic endpoint due to missing data and a substantial proportion of patients beginning PEx treatment with FEV1 exceeding their previous best measures (12.1% >12-month best, 19.6% >6-month best). Although mean outcome measures had deteriorated by follow-up approximately 2 weeks post-treatment, the effect was non-uniform: 62.7% of patients experienced an FEV1 worsening versus 49.0% who experienced a CRISS worsening.
CONCLUSIONS
Results from randomized prospective superiority and non-inferiority studies employing mean CRISS and FEV1 change from treatment initiation should prove compelling to the community. They will need to be large, but appear feasible.
Keywords: exacerbation, endpoints, clinical trial, sample size
INTRODUCTION
People with cystic fibrosis (CF) are prone to acute intervals of exaggerated signs and symptoms of airway infection that are frequently coupled with lung function decline, weight loss, and malaise that we collectively identify as ‘pulmonary exacerbations’ (PEx).[1] PEx management commonly includes chest physiotherapy and treatment with antibiotics targeted at bacterial opportunists previously detected in the patient’s airway, as well as nutritional and psychosocial support.[2] It has proven difficult to reach consensus on a prospective objective definition of CF PEx for clinical research purposes,[1] but associations between poor health outcomes and PEx as defined by a clinician’s decision to treat PEx signs and symptoms with antibiotics are indisputable. In 2014, 17,882 PEx were treated with intravenous (IV) antibiotics among 9,318 individuals followed in the US CF Foundation Patient Registry (CFFPR);[3] more than twice as many were likely diagnosed and treated with outpatient antibiotics during the same year.[4] IV antibiotic-treated PEx have been associated with decreased quality of life,[5] increased resource utilization,[6,7] accelerated lung function decline,[8] overall loss of lung function,[9] and increased mortality risk.[10–13]
Unfortunately, objective evidence supporting current PEx management practices is both scant and inconclusive.[1] The few relatively small prospective studies comparing PEx treatments that have been reported have mainly failed to provide actionable clinical guidance with respect to antibiotic choice(s), routes of delivery, or treatment duration.[1] Observations of poor overall PEx outcomes [9] and substantial variability in PEx management both within and between CF care programs [14–16] have precipitated a discussion of PEx management practices,[17] and specifically whether current practices are optimal or whether objective clinical trials might be able to distinguish ‘better’ PEx treatment regimens from those that are either less effective or are similarly effective but with greater associated burden, expense, or toxicities.
The US CF Foundation has sponsored multicenter studies to determine if standardized PEx treatment protocols can be introduced and tested in CF Care Centers, with an aspirational goal of bringing evidence-based medicine to PEx treatment in order to optimize outcomes. A recent multi-center US study of IV antibiotic treatment of PEx (Standardized Treatment of PEx; STOP) probed feasibility of patient/clinician participation in future prospective protocol-driven PEx treatment studies, and systematically collected treatment response data in order to identify/characterize efficacy endpoints to be employed in prospective PEx treatment studies.[18,19] In this communication, we describe endpoint properties derived from STOP study data for assessing PEx treatment protocol efficacy, and evaluate the strengths and weaknesses of potential exacerbation study efficacy endpoints, including change in forced expiratory volume in 1 second (FEV1), change in signs and symptoms of exacerbation, and retreatment with IV antibiotics within 30 days.
METHODS
Data were obtained from the STOP study (NCT02109822), which has been previously described.[18,19] Lung function changes were evaluated using spirometry, and specifically the percentage predicted of FEV1 (FEV1 % predicted) based on a subject’s sex, age, height, and race using the GLI normative equations.[20] Signs and symptoms of pulmonary exacerbation were collected using the Cystic Fibrosis Respiratory Symptom Diary-Chronic Respiratory Infection Symptom Score (CRISS).[21,22] FEV1 % predicted and CRISS data were collected at hospital admission for IV antibiotic treatment (Visit 1), at Day 7 (± 3 days) of treatment, at IV antibiotic treatment termination (Visit 2), and at Day 28 (Visit 3). When available, a patient’s best FEV1% predicted measures recorded in the prior 6 months and the prior 12 months were collected from the CFFPR. Finally, time to next PEx treated with IV antibiotics (or censor) following treatment was collected for each subject from the CFFPR.
Descriptive statistics (mean, standard deviation [SD], median, range, etc.) were calculated for FEV1% predicted and CRISS score change from Visit 1 (admission) to Day 7, Visit 2 and Visit 3. In addition, statistics associated with the proportion of a subject’s recovery of their historic best FEV1% predicted (in the prior 6 months and 1 year as recorded in the CFFPR) were calculated for STOP study visits. Individuals with missing data were excluded from these calculations. As sensitivity analyses, missing Visit 3 data were imputed using the last observation carried forward (LOCF) method to estimate effects of missing data on change from admission outcomes. Time-to-next IV antibiotic treatment for PEx from end of treatment was studied using Kaplan-Meier survival methods to account for censoring (subjects who experienced no subsequent event at the time of analysis) and the proportion of subjects receiving retreatment with IV antibiotics for PEx within 30 days of Visit 2 was studied as a categorical variable.
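The LOCF sensitivity analysis described above can be illustrated with a minimal sketch. The function name `locf` and the visit values are hypothetical and are not taken from the STOP dataset; the sketch only shows how a missing Day 28 value would be replaced by the most recent observed value before a change from admission is calculated.

```python
from typing import Optional, Sequence


def locf(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Carry the last observed value forward over missing (None) entries."""
    filled: list[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical subject (not STOP data): FEV1 % predicted at admission,
# Day 7, end of treatment, and Day 28, with the Day 28 visit missing.
visits = [52.0, 58.5, 61.0, None]
imputed = locf(visits)

# Change from admission uses the imputed Day 28 value.
change_from_admission = imputed[-1] - imputed[0]
print(imputed, change_from_admission)
```

Under LOCF, a subject who misses Day 28 is effectively assigned their end-of-treatment response, which is why the imputed statistics can differ from the complete-case statistics.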
To characterize endpoint utility for future prospective, randomized, comparative superiority and non-inferiority trials, sample size estimates were generated. Sample sizes for PEx protocol-based superiority studies employing continuous variable endpoints were generated for FEV1 % predicted and CRISS score changes from Visit 1 to Visit 3, using two-sided t-tests and assuming 1:1 randomized allocation, 80% or 90% power, and alpha = 0.05. Sample sizes for 1:1 randomized non-inferiority (NI) studies of clinically identical treatments with 80% or 90% power and alpha = 0.025 were determined based on observed standard deviations for FEV1 and CRISS responses as a function of varying NI margins. Sample sizes for NI study designs were determined where NI margins preserved ≥50% of the lower 95% confidence bound [23] of observed STOP means assuming 1:1 randomization, 80% or 90% power, and one-sided alpha = 0.025.
Variables were also categorized as proportions of treated subjects achieving a) ≥100% of their best prior 12-month and 6-month CFFPR FEV1% predicted, b) ≥9% predicted improvement from admission in FEV1, c) ≥17% relative improvement from admission in FEV1% predicted, and d) ≥11-point decrease from admission in CRISS score.[24] Absolute and relative FEV1 improvement thresholds were derived from the previous observation that a 15% relative FEV1 drop is strongly associated with antibiotic treatment for exacerbation:[25] a 9% predicted absolute FEV1 improvement is roughly equal to recovery of a 15% loss of the average best FEV1 % predicted in the prior 6 months for STOP subjects (0.15 × 60.6% predicted = 9.1% predicted; N=200); a 17% relative FEV1 improvement represents recovery of a 15% relative FEV1 loss (1/0.85 = 1.176). Sample sizes for superiority studies employing categorical endpoints assumed 1:1 allocation ratio and two-sided Chi-Square tests with 80% or 90% power and alpha = 0.05.
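The arithmetic behind the two FEV1 thresholds can be written out as a short worked derivation. The symbol B is introduced here only for illustration and denotes a subject's pre-exacerbation FEV1 % predicted; it does not appear in the original analysis.

```latex
\begin{align*}
\text{Absolute threshold:}\quad
  & 0.15 \times 60.6\ \%\ \text{predicted} = 9.09 \approx 9\ \%\ \text{predicted} \\[4pt]
\text{Relative threshold:}\quad
  & \frac{B - 0.85\,B}{0.85\,B} = \frac{1}{0.85} - 1 \approx 0.176
\end{align*}
```

A subject whose FEV1 has fallen to 0.85 B must therefore improve by roughly 17.6% relative to the admission value to return to B, which corresponds to the ≥17% relative threshold used above.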
RESULTS
Continuous Variable Endpoints
Characteristics of continuous outcome measures for STOP study participants are shown in Table 1. Average improvement was observed for all FEV1 and CRISS measures from admission to Day 7 and from Day 7 to end of IV antibiotic treatment, with average increases after Day 7 consistently smaller than increases from admission to Day 7 (Table 1). Absolute and relative FEV1 and CRISS scores improved from admission to Day 28, showing similar change distributions (Figure 1, Panels A-C). Mean FEV1 and CRISS responses from admission to Day 28 were only modestly correlated: simple least squares regression of absolute change in FEV1% predicted versus CRISS score change yielded an R2 of 0.157 (P<.001; Figure 2). On average there was a worsening from end of IV antibiotic treatment to Day 28 (Table 1); about 60% of STOP subjects had worsening FEV1 measures and about 50% had worsening CRISS scores from end of treatment to Day 28 (Figure 1, Panels D-F). Missing Day 28 data were observed to be fairly evenly distributed across the treatment population as a function of age, lung function, or complication, and imputation of missing Day 28 data by LOCF method produced little effect on change from admission FEV1 and CRISS statistics (Table 1). The average proportion of subjects’ best prior 6-month and 12-month FEV1% predicted recovered between admission for IV treatment and Day 28 of the STOP study were similar (11.0% and 10.8%, respectively), with the standard deviation of the 6-month value slightly higher (Table 1). Importantly, 19.6% and 12.1% of patients with historical FEV1 data had admission FEV1 values greater than their best FEV1 value recorded in the 6 and 12 months prior to admission, respectively (Table 2). Of note, 28 patients with admission FEV1 values exceeding their best 6-month FEV1 experienced smaller average FEV1 and CRISS changes from admission to Day 28: 0.93% predicted [95% CI −1.18, 3.05] and −12.25 [−17.99, −6.51], respectively.
Table 1.
Continuous Variable Statistics for FEV1 and CRISS Endpoints
| | N | Mean (SD) | 95% CI | Median | Range |
|---|---|---|---|---|---|
| Percentage of Best Prior 12-Month* FEV1 % Predicted | |||||
| Admission | 198 | 79.2% (17.9%) | 77.8%, 82.3% | 80.7% | 18.8%, 147.2% |
| Day 7 | 138 | 92.0% (19.7%) | 88.7%, 95.3% | 91.9% | 35.1%, 200.4% |
| End of Treatment | 163 | 95.6% (15.0%) | 93.2%, 97.9% | 96.1% | 44.3%, 155.0% |
| Day 28 | 172 | 90.2% (17.6%) | 87.5%, 92.8% | 93.0% | 23.2%, 153.6% |
| Day 28 – Admission | 158 | 10.8% (14.8%) | 8.5%, 13.1% | 9.0% | −27.1%, 52.5% |
| Percentage of Best Prior 6-Month* FEV1 % Predicted | |||||
| Admission | 184 | 85.1% (18.1%) | 82.5%, 87.7% | 85.7% | 37.4%, 160.0% |
| Day 7 | 129 | 99.2% (21.1%) | 95.5%, 102.9% | 95.7% | 35.2%, 200.4% |
| End of Treatment | 152 | 101.8% (16.6%) | 99.1%, 104.5% | 99.6% | 44.3%, 168.1% |
| Day 28 | 159 | 95.7% (31.7%) | 92.9%, 98.5% | 95.7% | 31.9%, 166.6% |
| Day 28 – Admission | 145 | 11.0% (15.7%) | 8.5%, 13.6% | 8.9% | −27.1%, 61.2% |
| Absolute FEV1 Change from Admission, % Predicted | |||||
| Day 7 | 131 | 8.4 (11.3) | 6.5, 10.4 | 6.3 | −18.7, 63.0 |
| End of Treatment | 156 | 9.4 (10.1) | 7.8, 11.0 | 7.6 | −13.9, 47.8 |
| Day 28 | 160 | 7.5 (10.7) | 5.8, 9.1 | 5.2 | −16.8, 57.1 |
| Day 28 LOCF† | 203 | 7.3 (10.6) | 5.8, 8.7 | 5.2 | −16.8, 57.1 |
| Relative FEV1 % Predicted Change from Admission | |||||
| Day 7 | 131 | 12.6% (16.6%) | 9.7%, 15.5% | 10.7% | −43.5%, 67.9% |
| End of Treatment | 156 | 21.4% (24.4%) | 17.5%, 25.2% | 16.6% | −19.0%, 141.6% |
| Day 28 | 160 | 17.2% (25.2%) | 13.3%, 21.2% | 12.4% | −28.8%, 122.1% |
| Day 28 LOCF† | 203 | 16.9% (25.1%) | 13.5%, 20.4% | 10.1% | −28.8%, 122.1% |
| CRISS Score Change from Admission‡ | |||||
| Day 7 | 200 | −18.1 (14.2) | −20.1, −16.2 | −16 | −63, 12 |
| End of Treatment | 186 | −26.1 (15.7) | −28.3, −23.8 | −25 | −65, 5 |
| Day 28 | 158 | −21.2 (16.5) | −23.8, −18.6 | −20 | −63, 24 |
| Day 28 LOCF† | 209 | −20.9 (17.2) | −23.2, −18.6 | −20 | −63, 56 |
*From CFFPR.
†Missing Day 28 data imputed by LOCF method.
‡Negative score change represents symptom improvement.
Figure 1. Changes in FEV1 and CRISS values from admission for IV antibiotic treatment and end of IV treatment to Day 28.
Panel A: Absolute FEV1% predicted change from admission to Day 28. Panel B: Relative FEV1 (L or % predicted) change from admission to Day 28. Panel C: CRISS score change from admission to Day 28. Panel D: Cumulative frequency of absolute FEV1% predicted change from end of IV treatment to Day 28. Panel E: Cumulative frequency of relative (from admission) FEV1 change from end of IV treatment to Day 28. Panel F: Cumulative frequency of CRISS score change from end of IV treatment to Day 28. Vertical dashed lines identify zero change. Smooth gray curves represent the parametric normal distribution.
Figure 2.
Least squares regression of absolute FEV1 and CRISS score changes from admission to Day 28.
Table 2.
Categorical FEV1 and CRISS Endpoint Statistics
| Categorical Response | N (All Subjects) | Categorical Responders, N (%) | Responder Proportion 95% CI |
|---|---|---|---|
| ≥100% of Best FEV1 % Predicted in Prior 12 months* | |||
| Admission | 198 | 24 (12.1%) | 8.3%, 17.4% |
| Day 7 | 138 | 35 (25.4%) | 18.8%, 33.2% |
| End of Treatment | 163 | 53 (32.5%) | 25.8%, 40.0% |
| Day 28 | 172 | 42 (24.4%) | 18.6%, 31.3% |
| ≥100% of Best FEV1 % Predicted in Prior 6 months* | |||
| Admission | 184 | 36 (19.6%) | 14.5%, 25.9% |
| Day 7 | 129 | 50 (38.8%) | 30.8%, 47.4% |
| End of Treatment | 152 | 72 (47.4%) | 39.6%, 55.3% |
| Day 28 | 159 | 62 (39.0%) | 31.8%, 46.7% |
| ≥9% Predicted Absolute FEV1 Increase from Admission | |||
| Day 7 | 131 | 50 (38.2%) | 30.3%, 46.7% |
| End of Treatment | 156 | 66 (42.3%) | 34.8%, 50.1% |
| Day 28 | 160 | 57 (35.6%) | 28.6%, 43.3% |
| ≥17% Relative FEV1 Increase from Admission | |||
| Day 7 | 131 | 47 (35.9%) | 28.2%, 44.4% |
| End of Treatment | 156 | 76 (48.7%) | 41.0%, 56.5% |
| Day 28 | 160 | 67 (41.9%) | 34.5%, 49.6% |
| ≥11 Point CRISS Score Decrease from Admission** | |||
| Day 7 | 200 | 137 (68.5%) | 61.8%, 74.5% |
| End of Treatment | 186 | 154 (82.8%) | 76.7%, 87.5% |
| Day 28 | 158 | 119 (75.3%) | 68.0%, 81.4% |
| IV Retreatment >6 Days and <29 Days after End of IV Treatment | |||
| | 172 | 16 (9.3%) | 5.8%, 14.6% |
*From CFFPR.
**Score decrease represents relative improvement.
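The source does not state how the responder-proportion confidence intervals in Table 2 were calculated. As a hedged illustration, the sketch below computes a Wilson score interval, which appears to be consistent with several of the tabulated rows; the function name `wilson_interval` is introduced here for illustration, and the choice of method is an assumption rather than a documented feature of the original analysis.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Table 2 rows: admission FEV1 >= 100% of 12-month best (24 of 198)
# and IV retreatment after end of treatment (16 of 172).
for responders, total in [(24, 198), (16, 172)]:
    low, high = wilson_interval(responders, total)
    print(f"{responders}/{total}: {100 * low:.1f}%, {100 * high:.1f}%")
```

For these two rows the sketch gives 8.3%, 17.4% and 5.8%, 14.6%, matching the tabulated intervals to one decimal place.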
Sample size estimates for randomized 1:1 allocation, controlled superiority studies of exacerbation treatment showed that change in CRISS Score from admission to Day 28 would require fewer subjects as an efficacy endpoint than either absolute or relative change in FEV1% predicted (Figure 3A–C): 153 subjects per group would provide 80% power to detect a CRISS score response ≥25% of that observed in STOP (Figure 3C). Little difference in sample size requirements was observed between absolute and relative FEV1 change from admission as endpoints; 512 subjects per group would be required for 80% power to detect a ≥25% improvement in absolute FEV1 versus 540 subjects per group to detect the same improvement using relative FEV1 change from admission (Figures 3A, 3B). Sample sizes required for testing change in proportion of best FEV1% predicted recorded in the prior 12 months as an efficacy endpoint were similar to those for absolute and relative FEV1 change.
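The superiority sample sizes quoted above can be approximated with the standard normal-approximation formula for comparing two means. This is a sketch under stated assumptions: the Methods describe two-sided t-tests and do not name the software used, so the function `n_per_group_superiority` is illustrative only. The inputs are the Day 28 means and standard deviations from Table 1, with the treatment effect taken as 25% of the observed STOP mean.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_superiority(sd: float, delta: float,
                            power: float = 0.80, alpha: float = 0.05) -> int:
    """Per-group n for a two-sided, 1:1 comparison of means (normal approximation)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    return ceil(2 * (z_total * sd / delta) ** 2)


# Day 28 change from admission in STOP (Table 1): (mean magnitude, SD).
stop_day28 = {
    "absolute FEV1": (7.5, 10.7),
    "relative FEV1": (17.2, 25.2),
    "CRISS": (21.2, 16.5),  # magnitude of the mean score decrease
}

for endpoint, (mean, sd) in stop_day28.items():
    delta = 0.25 * mean  # difference equal to 25% of the STOP response
    print(endpoint, n_per_group_superiority(sd, delta))
```

With these inputs the approximation returns 512, 540, and 153 subjects per group, in line with the figures reported for 80% power.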
Figure 3. Sample size requirements for superiority and non-inferiority studies of continuous FEV1 and CRISS endpoints.
Panels A-C: Sample size requirements for 1:1 randomized two-sided superiority studies with alpha = 0.05 using change from admission to Day 28 in absolute FEV1 change (Panel A), relative FEV1 change (Panel B), and CRISS Score change (Panel C) as endpoints. Horizontal dashed lines show treatment effects equivalent to 25% and 50% improvements over STOP study outcomes (Table 1). Dotted curves show sample size requirements to attain 80% power and solid curves show 90% power requirements. Panels D-F: Sample size requirements for 1:1 randomized one-sided non-inferiority studies of identically effective treatments with alpha = 0.025 as a function of pre-established non-inferiority (NI) margins for FEV1 change (Panel D), relative FEV1 change (Panel E) and CRISS change (Panel F) from baseline. Dashed lines show NI margins that retain 50% of the lower bound of STOP treatment effects. Dotted curves show sample size requirements to attain 80% power and solid curves show 90% power requirements. Panels G-I: Sample size requirements for 1:1 randomized one-sided non-inferiority studies with alpha = 0.025 using absolute FEV1 change (Panel G), relative FEV1 change (Panel H), and CRISS Score change (Panel I) as endpoints. Non-inferiority (NI) margins were derived as half of the lower bound of the 95% confidence interval from STOP study outcomes (Table 1). Y-axes represent actual differences in treatment effects between groups, with the horizontal line placed at no difference between treatments. Dotted curves show sample size requirements to attain 80% power and solid curves show 90% power requirements.
Non-inferiority (NI) margins for active-comparator studies using continuous endpoints, set at 50% of the lower bound of 95% confidence intervals associated with mean STOP responses (Table 1), were 2.9% predicted for absolute FEV1 response, 6.65% for relative FEV1 response, and 9.3 points for CRISS score response. As with superiority studies, sample sizes required to power one-sided non-inferiority study designs were smallest using CRISS score as an efficacy endpoint when compared with absolute or relative change in FEV1 (Figure 3D–I). Using these estimated NI margins, per group sample sizes required to demonstrate non-inferiority of a treatment with efficacy identical to that of an active comparator with 80% power were 214 subjects for absolute FEV1 change, 226 subjects for relative FEV1 change, versus only 50 subjects for CRISS score change (Figure 3D–F). Sample sizes per group required to assure 90% power in the same studies were 287, 302, and 67 subjects, respectively. Sample sizes per group become much larger if NI margins are reduced (Figure 3D–F) or if a treatment is assumed to be only marginally less effective than an active comparator (Figure 3G–I).
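The non-inferiority margins and sample sizes in the preceding paragraph can be approximated as follows. This is a minimal sketch assuming a normal approximation and a true between-group difference of zero; the function name `n_per_group_noninferiority` is illustrative and the authors' software is not stated. Each margin is half of the 95% confidence bound closest to zero for the Day 28 response in Table 1.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_noninferiority(sd: float, margin: float,
                               power: float = 0.80, alpha: float = 0.025) -> int:
    """Per-group n for a one-sided, 1:1 NI test of means, true difference = 0."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha) + z(power)
    return ceil(2 * (z_total * sd / margin) ** 2)


# Day 28 change from admission (Table 1): (CI bound closest to zero, SD).
endpoints = {
    "absolute FEV1": (5.8, 10.7),
    "relative FEV1": (13.3, 25.2),
    "CRISS": (18.6, 16.5),  # magnitude of the bound nearest zero
}

for name, (bound, sd) in endpoints.items():
    margin = 0.5 * bound  # NI margin preserving 50% of the bound
    n80 = n_per_group_noninferiority(sd, margin, power=0.80)
    n90 = n_per_group_noninferiority(sd, margin, power=0.90)
    print(f"{name}: margin={margin:.2f}, n(80%)={n80}, n(90%)={n90}")
```

Under these assumptions the sketch yields margins of 2.90, 6.65, and 9.30 with per-group sample sizes of 214/287, 226/302, and 50/67 at 80%/90% power, matching the values reported above. Because the required n scales with the inverse square of the margin, halving a margin roughly quadruples the sample size.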
Categorical Variable Endpoints
As noted above, a substantial proportion of subjects with both an admission FEV1 measure and a best FEV1 measure from the CFFPR either 6 months or 12 months prior to admission for IV treatment had an FEV1 value at admission greater than their best prior recorded FEV1 (Table 2). Proportions of patients with an FEV1 measure exceeding their best prior 6-month or 12-month values increased from 19.6% at admission to 39.0% at Day 28 and from 12.1% at admission to 24.4% at Day 28, respectively.
A minority of subjects were responders using FEV1 thresholds of ≥ 9 % predicted absolute improvement or ≥17% relative improvement from admission. At Day 28, 57 of 160 subjects (35.6%) had experienced at least a 9% predicted increase in FEV1 from admission for IV treatment, and 67 of 160 (41.9%) had experienced a relative FEV1 increase from admission of at least 17% (Table 2). A much larger proportion of subjects (119 of 158, 75.3%) experienced an 11 point or greater decrease in CRISS score at Day 28. Similar to continuous outcomes, proportions of subjects meeting categorical outcome thresholds were lower at Day 7 and at Day 28 than at end of treatment. Sixteen of 172 STOP subjects with available CFFPR data (9.3%) met the categorical outcome of receiving additional IV antibiotic treatment for pulmonary exacerbation >6 days and <29 days after the end of their STOP study IV treatment (Table 2).
Sample size requirements for a superiority study of a treatment to a comparator using the categorical endpoints described in Table 2 are shown in Figure 4. In general, use of categorical endpoints increases sample size requirements. Sample sizes for categorical measures are influenced by the underlying proportions themselves (larger sample sizes required for proportions closer to 0.5); these analyses use the observed STOP results (Table 2). A study intended to detect a 10% treatment-associated increase in the proportion of subjects having a ≥9% predicted increase in FEV1 from admission to Day 28 (i.e., 45.6% of subjects versus 35.6%) would require 375 subjects per group for 80% power (Figure 4). In contrast, only 245 subjects per group would be required for 80% power to see a 10% treatment-associated increase in proportion of subjects having a ≥11 point CRISS score decrease (i.e., 85.3% of subjects versus 75.3%) over the same period.
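The two categorical examples above can be approximated with a simple unpooled normal-approximation formula for comparing two proportions. This is a sketch rather than the authors' method: the Methods specify two-sided Chi-Square tests, and the function `n_per_group_proportions` is introduced here only for illustration. The "10%" increases are treated as absolute differences of 10 percentage points, as the worked proportions in the text indicate.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p_control: float, p_treatment: float,
                            power: float = 0.80, alpha: float = 0.05) -> int:
    """Per-group n for a two-sided, 1:1 comparison of two proportions
    (unpooled normal approximation; an assumed stand-in for the Chi-Square design)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil(z_total**2 * variance / (p_treatment - p_control) ** 2)


# >=9% predicted FEV1 increase at Day 28: 35.6% (STOP) versus 45.6%.
print(n_per_group_proportions(0.356, 0.456))

# >=11-point CRISS decrease at Day 28: 75.3% (STOP) versus 85.3%.
print(n_per_group_proportions(0.753, 0.853))
```

With the STOP proportions from Table 2, this approximation returns 375 and 245 subjects per group, in agreement with the values quoted above. The smaller CRISS requirement reflects the lower binomial variance of proportions further from 0.5.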
Figure 4. Sample size requirements for superiority studies of categorical FEV1, CRISS, and IV retreatment endpoints.
Sample size requirements for 1:1 randomized two-sided superiority studies with alpha = 0.05. Panel A: Difference in proportion of subjects achieving a ≥9 % predicted FEV1 increase from admission at Day 28. Panel B: Difference in proportion of subjects achieving a ≥11-point decrease in CRISS score from admission at Day 28. Panel C: Difference in proportion of subjects retreated with IV antibiotics for exacerbation between days 7 and 28 after end of IV antibiotic treatment. Dotted curves show sample size requirements to attain 80% power and solid curves show 90% power sample size requirements.
Larger sample sizes would be required to study a treatment-associated effect on the proportion of subjects retreated with IV antibiotics within 7 to 28 days after the end of initial IV treatment due to the low incidence of retreatment in a comparator arm (9.3% in the STOP study, Table 2), and therefore relatively small treatment effect the study would be designed to detect. Over 500 subjects per group would be required for 80% power to detect a reduction of 4.5% in the proportion of IV-retreated subjects (from 9.3% of subjects to 4.8%; Figure 4). Of note, the 9.3% IV-retreatment rate observed in STOP is higher than the 5.7% rate observed in a much larger analysis of 13,579 CFFPR patients treated with IV antibiotics on or after Jan 1, 2010,[26] and thus sample sizes required to see such a reduction might be substantially higher in a prospective study.
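The sensitivity of the retreatment endpoint to the comparator-arm rate can be explored with a short sketch. It uses an unpooled normal approximation for two proportions (an assumed stand-in for the Chi-Square design in the Methods), and it assumes, purely for illustration, that a treatment would produce the same relative reduction at the 5.7% registry rate as the 9.3% to 4.8% reduction considered for STOP. That assumption and the function name are not taken from the original analysis.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p_control: float, p_treatment: float,
                            power: float = 0.80, alpha: float = 0.05) -> int:
    """Per-group n for a two-sided, 1:1 comparison of two proportions
    (unpooled normal approximation)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil(z_total**2 * variance / (p_treatment - p_control) ** 2)


# STOP scenario from the text: retreatment reduced from 9.3% to 4.8%.
n_stop = n_per_group_proportions(0.093, 0.048)

# Hypothetical registry scenario: 5.7% comparator rate with the same
# relative reduction (ratio 4.8/9.3) applied; this is an assumption.
relative_risk = 0.048 / 0.093
n_registry = n_per_group_proportions(0.057, 0.057 * relative_risk)

print(n_stop, n_registry)
```

Under the STOP rates the approximation gives just over 500 subjects per group, consistent with the text, and the requirement grows further at the lower registry rate when the relative reduction is held constant.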
DISCUSSION
The CF Foundation has funded a multi-center effort to assess the feasibility of introducing protocol-based comparative exacerbation treatment studies, beginning with an initial feasibility study.[18,19] In this report, we analyzed data from the STOP observational study to describe different objective measures of exacerbation treatment response and assessed their potential as categorical or continuous efficacy endpoints in prospective randomized superiority and non-inferiority studies of exacerbation treatments.
Our analyses suggest that there are few differences among FEV1-derived efficacy endpoints (mean proportion of stable pre-exacerbation value recovered, absolute change in value from treatment initiation, and relative change in value from treatment initiation), with mean absolute FEV1 change having slightly lower variability and corresponding sample size requirements. In contrast, the CRISS score measuring sign and symptom reduction was a more consistently sensitive efficacy endpoint, requiring substantially smaller sample sizes in both superiority and non-inferiority designs. However, large CRISS improvements observed across the population raise questions as to the treatment-associated specificity of CRISS changes. Symptom improvement is subjective and could be influenced by any receipt of treatment, whether relatively effective or ineffective. The categorical outcome of retreatment with IV antibiotics within 30 days appears to be a useful measure of treatment relapse based on epidemiologic analyses,[26] but appears least tractable as an efficacy endpoint for prospective studies because of its relatively low incidence and correspondingly high sample size requirements. There was little evidence that missing Day 28 data from STOP substantially biased outcome statistics and subsequent sample size calculations based on LOCF imputation. Based on feasibility, one might conclude that CRISS-based studies are more desirable than FEV1-based studies, but this conclusion ignores questions of the specificity of CRISS response, the weak correlation between CRISS and FEV1 response, and the imperative for efficacy metrics to be compelling to the CF community if study results are expected to influence treatment behavior. 
Although there is no question that signs and symptoms of exacerbation are important drivers of exacerbation diagnosis and treatment,[19,25] the relatively modest clinical experience with symptom or quality of life instruments such as CRISS compares unfavorably with the volumes of clinical experience with FEV1 as an important measure of CF lung disease progression. Interestingly, absolute (or relative) change in FEV1 and change in CRISS score are not redundant measures, but appear to be complementary in capturing different aspects of clinical response.
The presence of two complementary efficacy measures that identify different types of response may suggest the utility of a composite efficacy endpoint in which improvement in either measure above established thresholds constitutes ‘response’, but there are important shortcomings to this approach. Beyond loss of statistical power associated with categorization of continuous variables, there would be the challenge of setting response thresholds for CRISS and FEV1 that would be perceived as ‘compelling’ by community stakeholders. CRISS and FEV1 response thresholds we chose for our sample size analyses, although justifiable, may not meet this requirement. Further, a protocol could show better efficacy for a composite endpoint despite a worse response for one component (e.g., a substantially higher CRISS response combined with a modestly worse FEV1 response would likely be problematic for physicians, as would the inverse be for patients), which highlights a fundamental problem of equally weighting symptom and lung function responses: these endpoints are not perceived as equivalent by CF clinicians and patients.[19] Taken together, our observations suggest that comparative research trials powered to detect a difference in absolute FEV1 responses as a primary endpoint will also be sufficiently powered to identify differences in CRISS responses in the population and thus have the greatest potential to change clinical care and patient outcomes.
An important focus of this analysis is the question of when to measure exacerbation treatment response. Previous exacerbation trial designs have commonly measured response on the last day of treatment, for largely pragmatic reasons. However, there are some logistic as well as clinical advantages to considering an individual’s response at some time point subsequent to treatment cessation. Our data suggest that there is a modest increase in response variance if outcomes are measured ~2 weeks after IV antibiotic treatment cessation, primarily because some patients continue to experience improvement after cessation while others begin to degrade, a phenomenon recently noted by Waters et al.[27] A recent report that patients treated with IV antibiotics for <9 days are at ~2-fold risk of IV antibiotic retreatment within 30 days compared with those treated 13 to 16 days [26] demonstrates why response measurement at treatment cessation may be problematic for determining the overall efficacy of exacerbation management protocols of different durations. Overall consequences for two protocols with apparently identical outcomes at treatment cessation may look quite different weeks later; it should be the ultimate longitudinal treatment response that we are interested in optimizing, being careful, however, not to extend time to measurement to a point where clinical response can be muddled by other confounding treatment and adherence factors.
The STOP dataset and our analyses have some limitations that affect broad applicability to future exacerbation management studies. First, our study was limited to hospital admission for treatment with IV antibiotics, excluding exclusively outpatient treatments of exacerbation with oral, inhaled, or IV antibiotics. If our ultimate goal is to rationalize all antibiotic treatments for exacerbation, similar analyses of much more common outpatient antibiotic treatments [4,28] are needed. In addition, the STOP study was dominated by adult participants;[18] studies in younger patients may reach different conclusions with respect to endpoint suitability. Of note, the CRISS score can currently only be collected from subjects ≥12 years of age [29] and spirometry is consistently reliable in patients ≥6 years of age, but pulmonary exacerbations occur frequently in individuals with CF of all ages, including infants.[21,22] We report sample size requirements for non-inferiority studies in which we arbitrarily chose retention of 50% of the lowest possible expected efficacy for each endpoint as our non-inferiority margins. Although there is mathematical/statistical precedent for this method of establishing NI margins,[23] the method does not assure clinical relevance. Active-comparator non-inferiority studies are dependent upon the critical (untested) assumption that an active comparator is truly active and would be found to be significantly superior to an administered placebo, an assumption that we make for exacerbation treatment almost entirely on faith (a 12-patient randomized placebo-controlled study of exacerbation treatment being the only objective evidence that treatment is associated with response [30]). 
We appreciate that the NI margins we have identified may not resonate within the CF community as clinically relevant measures of “equivalence” and will need to be carefully vetted within the context of the treatment and the study population in question prior to prospective non-inferiority studies. Although greater confidence in active comparator non-inferiority study results might be achieved with reduced NI margins, sample size ramifications can be profound. Finally, we purposely did not consider non-inferiority study designs using categorical outcome variables because the categorical outcomes we considered tended to occur at very high or very low incidence, making it challenging to define what would necessarily be very small NI margins.
Extreme variability in pulmonary exacerbation management practices within and between CF Care Centers today introduces a potential for poorer overall outcomes, as well as making systematic research in this arena very challenging without embarking on very large trials. Although we tend to focus on the proposition that patient response is attributable to some exacerbation management approaches that are superior to others, it may be as important to recognize practices that offer little or no difference in treatment response while significantly reducing treatment burden, resource utilization, and/or potential for patient toxicity.[31] Without prospective protocol-based comparative studies of different treatment regimens using efficacy measures considered both valid and compelling to CF patients, families, and care-providers, it will be difficult to optimize exacerbation treatment paradigms.
Acknowledgements
The authors thank the patients and their families who participated in the STOP clinical study, and Dr. Bruce Marshall and the CF Foundation for their strong commitment to improving the management of pulmonary exacerbations. STOP was supported by grants from Cystic Fibrosis Foundation Therapeutics (SANDERS14A0, HELTSH13A1, GOSS13A0, FLUME13A1, CLANCY09Y0, SORSCH15RO, ORENST14Y0, NICKR0, DAINES14Y0), the National Institutes of Health (KL2 TR000428), and the University of Wisconsin-Madison ICTR (NIH UL1 TR000427). This project was also supported by the South Carolina Clinical & Translational Research (SCTR) Institute, with an academic home at the Medical University of South Carolina, through National Institutes of Health grant UL1TR001450. The study sponsors had no role in the construction of this manuscript or the decision to submit for publication. The STOP study group: Dr. N. West and A. Thaxton (Johns Hopkins University, Baltimore MD); Dr. J. Nick and K. Poch (National Jewish Hospital, Denver CO); Dr. G. Solomon and K. Brand (University of Alabama at Birmingham, Birmingham AL); Dr. C. Goss and E. Wilhelm (University of Washington, Seattle WA); Dr. P. Flume and A. Warden (Medical University of South Carolina, Charleston SC); Dr. J. Spahr and K. Iurlano (Pittsburgh Children’s Hospital, Pittsburgh PA); Dr. D. VanDevanter, Dr. E. Dasenbrook and D. Weaver (Case Western Reserve University); R. Gibson and S. McNamara (Seattle Children’s Hospital, Seattle WA); Dr. R. Jain, A. Keller and A. Hebert (Dallas Southwestern, Dallas TX); Dr. D. Sanders, A. Amessoudji and L. Makholm (University of Wisconsin, Madison WI); Dr. C. Daines and O. Molina de Rodriguez (University of Arizona, Tucson AZ); Dr. S. Heltshe, B. Fogarty, V. Beckett, and J. Kirihara (CF Foundation Therapeutics Development Network Coordinating Center, Seattle WA); and Dr. B. Marshall and A. Elbert (CF Foundation).
REFERENCES
- [1]. Flume PA, VanDevanter DR. Pulmonary exacerbations. In: Bush A, Bilton D, Hodson M, eds. Hodson and Geddes’ Cystic Fibrosis, 4th Edition. CRC Press; 2015.
- [2]. Cystic Fibrosis Foundation. Treatment of pulmonary exacerbation of cystic fibrosis. In: Clinical Practice Guidelines for Cystic Fibrosis. 1997.
- [3]. Cystic Fibrosis Foundation Patient Registry. 2014 Annual data report to the center directors. Bethesda (MD): Cystic Fibrosis Foundation; 2015.
- [4]. Wagener JS, VanDevanter DR, Pasta DJ, Regelmann W, Morgan WJ, Konstan MW. Oral, inhaled, and intravenous antibiotic choice for treating pulmonary exacerbations in cystic fibrosis. Pediatr Pulmonol. 2013;48(7):666–73.
- [5]. Britto MT, Kotagal UR, Hornung RW, Atherton HD, Tsevat J, Wilmott RW. Impact of recent pulmonary exacerbations on quality of life in patients with cystic fibrosis. Chest. 2002;121(1):64–72.
- [6]. Lieu TA, Ray GT, Farmer G, Shay GF. The cost of medical care for patients with cystic fibrosis in a health maintenance organization. Pediatrics. 1999;103(6):e72.
- [7]. Ouyang L, Grosse SD, Amendah DD, Schechter MS. Healthcare expenditures for privately insured people with cystic fibrosis. Pediatr Pulmonol. 2009;44(10):989–96.
- [8]. Konstan MW, Morgan WJ, Butler SM, Pasta DJ, Craib ML, Silva SJ, et al. Risk factors for rate of decline in forced expiratory volume in one second in children and adolescents with cystic fibrosis. J Pediatr. 2007;151(2):134–9, 139.e1.
- [9]. Sanders DB, Bittner RC, Rosenfeld M, Redding GJ, Goss CH. Pulmonary exacerbations are associated with subsequent FEV1 decline in both adults and children with cystic fibrosis. Pediatr Pulmonol. 2011;46(4):393–400.
- [10]. Liou TG, Adler FR, Fitzsimmons SC, Cahill BC, Hibbs JR, Marshall BC. Predictive 5-year survivorship model of cystic fibrosis. Am J Epidemiol. 2001;153(4):345–52.
- [11]. Mayer-Hamblett N, Rosenfeld M, Emerson J, Goss CH, Aitken ML. Developing cystic fibrosis lung transplant referral criteria using predictors of 2-year mortality. Am J Respir Crit Care Med. 2002;166(12 Pt 1):1550–5.
- [12]. Emerson J, Rosenfeld M, McNamara S, Ramsey B, Gibson RL. Pseudomonas aeruginosa and other predictors of mortality and morbidity in young children with cystic fibrosis. Pediatr Pulmonol. 2002;34(2):91–100.
- [13]. Ellaffi M, Vinsonneau C, Coste J, Hubert D, Burgel PR, Dhainaut JF, et al. One-year outcome after severe pulmonary exacerbation in adults with cystic fibrosis. Am J Respir Crit Care Med. 2005;171(2):158–64.
- [14]. Johnson C, Butler SM, Konstan MW, Morgan W, Wohl ME. Factors influencing outcomes in cystic fibrosis: a center-based analysis. Chest. 2003;123(1):20–7.
- [15]. Kraynack NC, Gothard MD, Falletta LM, McBride JT. Approach to treating cystic fibrosis pulmonary exacerbations varies widely across US CF care centers. Pediatr Pulmonol. 2011;46(9):870–81.
- [16]. Heltshe SL, Goss CH, Thompson V, Sagel SD, Sanders DB, Marshall BC, Flume PA. Short-term and long-term response to pulmonary exacerbation treatment in cystic fibrosis. Thorax. 2016 Mar;71(3):223–9.
- [17]. Flume PA, Mogayzel PJ Jr, Robinson KA, Goss CH, Rosenblatt RL, Kuhn RJ, Marshall BC; Clinical Practice Guidelines for Pulmonary Therapies Committee. Cystic fibrosis pulmonary guidelines: treatment of pulmonary exacerbations. Am J Respir Crit Care Med. 2009 Nov 1;180(9):802–8.
- [18]. Sanders DB, Solomon GM, Beckett VV, Daines CL, Heltshe SL, VanDevanter DR, Spahr JE, Gibson RL, Nick JA, Marshall BC, Flume PA, Goss CH, on behalf of the STOP investigators. Standardized treatment of pulmonary exacerbations (STOP) study: Observations at the initiation of intravenous antibiotics for cystic fibrosis pulmonary exacerbations. J Cyst Fibros, submitted.
- [19]. West NW, Beckett VV, Jain R, Sanders DB, Nick JA, Heltshe SL, Dasenbrook EC, VanDevanter DR, Solomon GM, Goss CH, Flume PA, on behalf of the STOP investigators. Standardized treatment of pulmonary exacerbations (STOP) study: Physician treatment practices and outcomes for individuals with cystic fibrosis with pulmonary exacerbations. J Cyst Fibros, submitted.
- [20]. Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, Enright PL, Hankinson JL, Ip MS, Zheng J, Stocks J; ERS Global Lung Function Initiative. Multi-ethnic reference values for spirometry for the 3–95-yr age range: the global lung function 2012 equations. Eur Respir J. 2012 Dec;40(6):1324–43.
- [21]. Quon BS, Patrick DL, Edwards TC, Aitken ML, Gibson RL, Genatossio A, McNamara S, Goss CH. Feasibility of using pedometers to measure daily step counts in cystic fibrosis and an assessment of its responsiveness to changes in health state. J Cyst Fibros. 2012 May;11(3):216–22.
- [22]. Lechtzin N, West N, Allgood S, Wilhelm E, Khan U, Mayer-Hamblett N, Aitken ML, Ramsey BW, Boyle MP, Mogayzel PJ Jr, Goss CH. Rationale and design of a randomized trial of home electronic symptom and lung function monitoring to detect cystic fibrosis pulmonary exacerbations: the early intervention in cystic fibrosis exacerbation (eICE) trial. Contemp Clin Trials. 2013 Nov;36(2):460–9.
- [23]. Holmgren EB. Establishing equivalence by showing that a prespecified percentage of the effect of the active control over placebo is maintained. J Biopharm Stat. 1999 Nov;9(4):651–9.
- [24]. Goss CH, Caldwell E, Gries KS, et al. Validation of a novel patient-reported respiratory symptoms instrument in cystic fibrosis CFRSD-CRISS. Pediatr Pulmonol. 2013;S36:295–96.
- [25]. Rabin HR, Butler SM, Wohl ME, Geller DE, Colin AA, Schidlow DV, Johnson CA, Konstan MW, Regelmann WE; Epidemiologic Study of Cystic Fibrosis. Pulmonary exacerbations in cystic fibrosis. Pediatr Pulmonol. 2004 May;37(5):400–6.
- [26]. VanDevanter DR, Flume PA, Morris N, Konstan MW. Probability of IV Antibiotic Retreatment within Thirty Days is Associated with Duration and Location of IV Antibiotic Treatment for Pulmonary Exacerbation in Cystic Fibrosis. J Cyst Fibros. 2016 Apr 29. pii: S1569-1993(16)30022-4. doi: 10.1016/j.jcf.2016.04.005.
- [27]. Waters V, Stanojevic S, Klingel M, Chiang J, Sonneveld N, Kukkar R, Tullis E, Ratjen F. Prolongation of antibiotic treatment for cystic fibrosis pulmonary exacerbations. J Cyst Fibros. 2015 Nov;14(6):770–6.
- [28]. Stanojevic S, McDonald A, Waters V, MacDonald S, Horton E, Tullis E, Ratjen F. Effect of pulmonary exacerbations treated with oral antibiotics on clinical outcomes in cystic fibrosis. Thorax. 2016 Aug 18. pii: thoraxjnl-2016-208450. doi: 10.1136/thoraxjnl-2016-208450.
- [29]. Brumback LC, Baines A, Ratjen F, Davis SD, Daniel SL, Quittner AL, Rosenfeld M; for the ISIS Study Group. Pulmonary exacerbations and parent-reported outcomes in children <6 years with cystic fibrosis. Pediatr Pulmonol. 2014 Apr 29. doi: 10.1002/ppul.23056.
- [30]. Regelmann WE, Elliott GR, Warwick WJ, Clawson CC. Reduction of sputum Pseudomonas aeruginosa density by antibiotics improves lung function in cystic fibrosis more than do bronchodilators and chest physiotherapy alone. Am Rev Respir Dis. 1990 Apr;141(4 Pt 1):914–21.
- [31]. Tarshish Y, Huang L, Jackson FI, Edwards J, Fligor B, Wilkins A, Uluer A, Sawicki G, Kenna M. Risk factors for hearing loss in patients with cystic fibrosis. J Am Acad Audiol. 2016;27:6–12.




