Abstract
Objective.
Survival from Acute Respiratory Distress Syndrome (ARDS) is improving, and outcomes beyond mortality may be important for testing new treatments. “Ventilator-free days” (the VFD score) is an established composite that equates ventilation on day 28 to death. A hierarchical outcome treating death as worse than prolonged ventilation would enhance face validity, but performance characteristics and reporting of such an outcome are unknown. We therefore evaluated the performance of a novel hierarchical composite endpoint, the Alive and Ventilator Free (AVF) score.
Design.
Using data from four ARDS Network clinical trials, we compared AVF to the VFD score. AVF compares each patient with every other patient in a win-lose-tie for each comparison. Duration of mechanical ventilation is only compared if both patients survived. We evaluated power of AVF vs. VFD score under various circumstances.
Setting.
Intensive care units within the ARDS Network.
Patients.
Individuals enrolled in four ARDS Network trials.
Interventions.
None for this analysis.
Measurements and Main Results.
Within the four trials (N=2410 patients), AVF and VFD score had similar power, with AVF slightly more powerful when a mortality difference was present, and VFD score slightly more powerful with a difference in duration of mechanical ventilation. AVF less often found in favor of treatments that increased mortality and increased days free of ventilation among survivors.
Conclusions.
A hierarchical composite endpoint, AVF, preserves statistical power while improving face validity. AVF is less prone to favor a treatment with discordant effects on survival and days free of ventilation. This general approach can support complex outcome hierarchies with multiple constituent outcomes. Approaches to interpretation of differences in AVF are also presented.
Funding.
National Institutes of Health
Keywords: ARDS, trial endpoints, composite outcomes
Background
Acute Respiratory Distress Syndrome (ARDS) is an often fatal, highly morbid condition for which specific treatments are actively being sought.(1) An important traditional outcome for clinical trials of therapies for ARDS has been hospital mortality at 28 or 60 days,(2, 3) although some recent trials have used, e.g., 90 days or longer.(4) Unfortunately, using early mortality, very few interventions have proved beneficial, an observation that may reflect Type II statistical error, i.e. a “false negative”.(3) Especially in early-phase trials, a mortality outcome requires larger than feasible sample sizes and may ignore important treatment effects on morbidity. Furthermore, assuming similar mortality, improvement in other outcomes may be adequate to endorse the effectiveness of a therapy, particularly when the non-mortality outcomes are patient-centered. Testing endpoints separately is highly inefficient, requiring extremely large sample sizes, which compounds the risk of rejecting efficacious therapies. By contrast, a composite outcome that incorporates mortality and important morbidity to model a “better” outcome for patients would overcome this limitation.
The most widely used composite outcome in the ARDS literature is commonly called “ventilator-free days” (VFD). Described by Schoenfeld and Bernard,(5) this composite endpoint combines mortality with number of days after successful liberation from mechanical ventilation among survivors, truncated at 28 days. Although often erroneously reported as such, the units are not actually days. Crucially, the VFD composite endpoint treats a patient dead on any day before day 28 as identical with a patient alive but dependent on the ventilator on day 28. Equating 28 days of mechanical ventilation with death does not reflect patient, family, clinician, or societal values and beliefs.(6) Nor does the standard reporting of the outcome facilitate interpretation, since it is a merger of the probabilities of death or ventilation on day 28 with days free of ventilation among patients alive and free of the ventilator on day 28. To avoid common misinterpretation of the units of this composite endpoint, we refer to the “VFD score” throughout the remainder of the text.
An alternative approach using a topic-specific hierarchical composite outcome(7) has been employed in Acquired Immune Deficiency Syndrome(8) and cardiovascular clinical trials(9) to avoid the problem of making more and less severe constituent outcomes equivalent.(10) We hypothesized that a similar approach (statistically equivalent to modifications of the parallel “worst-rank ordinal” approach(11, 12)) could be useful in ARDS trials and critical care more broadly (and indeed have implemented it in the EP-Vent2 trial of esophageal-manometry-guided ventilator management(13, 14) with similar application in another trial(15)).
In this work, we therefore evaluate performance characteristics of a novel hierarchical composite endpoint for ARDS trials using data from four published multicenter clinical trials by the NHLBI ARDS Network. This hierarchical endpoint, the Alive and Ventilator-Free (AVF) score, incorporates death and days after successful liberation from mechanical ventilation at 28 days in such a manner that death constitutes a worse outcome than prolonged mechanical ventilation. This technique can be modified or expanded to accommodate multiple hierarchically ranked outcomes to reflect outcomes considered more and less important by patients, family members, clinicians, researchers, and regulators. In this paper, we consider three related questions: (1) a general approach to hierarchical composite endpoint construction, (2) a specific hierarchical composite endpoint, and (3) a useful and interpretable effect measure for reporting rank-based endpoints.
Methods
Patients
We studied patients enrolled in four multicenter clinical trials who met American European Consensus Conference definitions of ARDS (then termed Acute Lung Injury and ARDS).(16) For simplicity, following the subsequent Berlin consensus definition,(1) we refer to patients in these trials as having had ARDS.
Using the NIH/NHLBI BioLINCC data repository, we accessed the deidentified datasets for the NHLBI ARDS Network ARMA,(17) ALVEOLI(18) and FACTT(19, 20) trials. The four data sets (FACTT included two factorialized interventions) are described in Table 1. The ARMA trial was performed among 861 patients with ARDS and compared higher vs. lower tidal volumes. The ALVEOLI trial was performed among 549 patients with ARDS who were randomly assigned to lower versus higher positive end-expiratory pressures (PEEP). The FACTT trial (1000 patients) was a 2×2 design comparing a liberal versus conservative fluid-management strategy (FACTT-Fluid) and, in a factorial randomization, treatment guided by either a pulmonary artery catheter or a central venous catheter (FACTT-Catheter). ARDSNet methods for defining liberation from mechanical ventilation are presented in the Online Supplement.
Table 1.
Trial | Treatment | Control | N (treatment: control) | Primary endpoint |
---|---|---|---|---|
And Respiratory Management in ALI and ARDS (ARMA) | Low tidal volume ventilation | Traditional tidal volume ventilation | 432: 429 | Death before discharge home breathing without assistance, to 180 days; ventilator-free days to day 28 |
Assessment of Low tidal Volume and Elevated end-expiratory volume to Obviate Lung Injury (ALVEOLI) | High PEEP | Low PEEP | 276: 273 | Death before discharge home breathing without assistance, to 60 days |
Fluids and Catheter Treatment Trial (FACTT)-Fluid* | Liberal fluid strategy | Conservative fluid strategy | 494: 501 | 60-day mortality prior to discharge home |
Fluids and Catheter Treatment Trial (FACTT)-Catheter* | PAC | CVC | 486: 509 | 60-day mortality prior to discharge home |
*These two trials were a 2×2 factorial design in the same patient population. Only 995 patients had evaluable VFD within the BioLINCC datasets for these two trials; the reported numbers reflect those with evaluable VFD.
PEEP: positive end expiratory pressure; PAC: pulmonary artery catheter; CVC: central venous catheter; ALI: acute lung injury; ARDS: Acute Respiratory Distress Syndrome; BioLINCC: Biological Specimen and Data Repository Information Coordinating Center
The study protocol was reviewed and approved by the institutional review board of Beth Israel Deaconess Medical Center in Boston, MA.
Construction of the hierarchical composite endpoint, Alive and Ventilator Free (AVF)
With minor modifications, we followed the method of Finkelstein and Schoenfeld(8) by which each patient is compared with every other patient in the trial. For each patient-to-patient comparison, a win, loss, or tie is defined in a hierarchical manner. The comparisons are first performed on the basis of the most important outcome (typically death for critical care trials), and only if neither patient has experienced that outcome will the win-lose-tie comparison be based on a less important outcome (typically a measure of morbidity), for example duration of mechanical ventilation. This technique can accommodate multiple secondary outcomes, arranged hierarchically and tailored to the population and treatment of study.
With this framework, we developed the novel hierarchical outcome Alive and Ventilator Free (AVF). This outcome incorporates vital status and time since successful liberation from mechanical ventilation through day 28. Following standard practice, we defined time since successful liberation as 28 - n, where n is the number of days between the first and last day of mechanical ventilation; patients who are still ventilator-dependent on day 28 are assigned a value of 0 days since successful liberation. To compute AVF, each subject is compared to every other subject in both trial arms and assigned a score (win=+1; lose=−1; tie=0) for each pairwise comparison, based on which fared better (Table 2). If one subject survives and the other does not, scores of +1 and −1, respectively, are assigned for that pairwise comparison. If both subjects in the pairwise comparison survive, their scores are determined by time since successful liberation from mechanical ventilation: the subject with more time since successful liberation from the ventilator is assigned a score of +1, and the subject with less time since liberation is assigned a score of −1. If both subjects die at any time during the 28-day period, or if both survive with equal time since successful liberation, both are assigned a score of 0 for that pairwise comparison. Then, the points from all pairwise comparisons are summed to obtain a cumulative score for each subject. These cumulative scores are ranked and compared between the treatment and control groups using the Mann-Whitney-Wilcoxon rank sum test. (For efficiency of calculation, identical statistical comparisons can be obtained using an ordinal outcome in which worse outcomes are ranked lower than better outcomes.) Fundamentally, the proposed comparisons seek to answer the clinically relevant question: with which treatment strategy would a patient be likeliest to have a better outcome?
Table 2.
Index subject died | Comparison subject died | Days free of ventilator for index subject vs. comparison subject | Points for index subject | Points for comparison subject |
---|---|---|---|---|
Yes | Yes | N/A | 0 (tie) | 0 (tie) |
No | Yes | N/A | +1 (win) | −1 (lose) |
Yes | No | N/A | −1 (lose) | +1 (win) |
No | No | More | +1 (win) | −1 (lose) |
No | No | Less | −1 (lose) | +1 (win) |
No | No | Same | 0 (tie) | 0 (tie) |
The points are summed up to obtain a cumulative score for each subject. Every patient is compared to every other patient in both study arms. The scores are compared between study arms by a Mann-Whitney test.
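The pairwise rules in Table 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation (the analyses were run in R and SAS); the `Patient` tuple layout and the function names `compare` and `avf_scores` are assumptions introduced here.

```python
from typing import List, Tuple

# Illustrative layout (an assumption): (died_by_day_28, days_since_liberation).
# The second value is ignored for patients who died.
Patient = Tuple[bool, int]


def compare(index: Patient, other: Patient) -> int:
    """Return +1 (win), -1 (lose), or 0 (tie) for the index patient."""
    index_died, index_days = index
    other_died, other_days = other
    if index_died and other_died:
        return 0                       # both died: tie
    if index_died:
        return -1                      # only the index patient died
    if other_died:
        return 1                       # only the comparison patient died
    # Both survived: more time since successful liberation wins.
    return (index_days > other_days) - (index_days < other_days)


def avf_scores(patients: List[Patient]) -> List[int]:
    """Sum each patient's results against every other patient, pooled across arms."""
    return [
        sum(compare(p, q) for j, q in enumerate(patients) if j != i)
        for i, p in enumerate(patients)
    ]


# Two deaths, one survivor still ventilated on day 28, two liberated survivors.
cohort = [(True, 0), (False, 0), (False, 12), (False, 20), (True, 0)]
print(avf_scores(cohort))  # [-3, 0, 2, 4, -3]
```

Because every win is matched by a loss, the cumulative scores always sum to zero; the survivor who is still ventilated on day 28 (score 0) ranks above both patients who died (score −3), which is the property that separates AVF from the VFD score.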
Reporting of the hierarchical composite endpoint AVF
Best practice for reporting any composite endpoint should always include separate reporting of each constituent endpoint and, ideally, a measure of the difference between groups for the composite endpoint itself. For AVF, we therefore recommend reporting four aspects of the endpoint:
- the main effect estimate, i.e., the probability of a superior outcome (θ) with 95% confidence interval (CI)
- the p-value obtained via the Mann-Whitney-Wilcoxon test
- mortality, by treatment group
- time since successful liberation from mechanical ventilation through day 28 among survivors only, by treatment group
The probability of superior outcome (θ), also known as the “probabilistic index” or “common language effect size statistic,” is defined as the estimated probability that an individual randomly selected from the study population will have a superior outcome if assigned to a given treatment arm. Details of its calculation are in the Online Supplement.
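As a rough illustration of the point estimate only, θ can be computed as the share of treatment-versus-control pairs in which the treated patient fares better, counting ties as one half. This is the usual Mann-Whitney-type estimator and is assumed here; the exact calculation and the confidence interval used in the paper are in the Online Supplement and are not reproduced. The names `avf_rank_key` and `prob_superior` are hypothetical.

```python
from typing import List, Tuple

Patient = Tuple[bool, int]  # (died_by_day_28, days_since_liberation)


def avf_rank_key(patient: Patient) -> int:
    """Ordinal value: death ranks below every survivor, including 0 days."""
    died, days = patient
    return -1 if died else days


def prob_superior(treatment: List[Patient], control: List[Patient]) -> float:
    """Fraction of treatment-vs-control pairs won by treatment, ties counted half."""
    wins = ties = 0
    for t in treatment:
        for c in control:
            kt, kc = avf_rank_key(t), avf_rank_key(c)
            wins += kt > kc
            ties += kt == kc
    return (wins + 0.5 * ties) / (len(treatment) * len(control))


treated = [(False, 20), (False, 10), (True, 0)]
controls = [(False, 10), (True, 0), (True, 0)]
theta = prob_superior(treated, controls)
print(round(theta, 3))  # 0.722
```

With this definition the two arms' values sum to 1, which matches the complementary percentages reported for each trial in Table 3.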
Traditional composite outcome: the VFD score
In its standard form,(5) the VFD score incorporates 28-day mortality and number of days after successful liberation from mechanical ventilation through day 28 into a single composite score. Two types of patients are assigned 0 ventilator-free days under this schema: patients who die on or before day 28, and patients who are alive and mechanically ventilated at day 28. Similar to the ventilator-free component of AVF, survivors who are no longer ventilated on day 28 are assigned VFD equal to 28 - n, where n is the number of days between the first and last day of mechanical ventilation. We describe VFD score calculations for each trial in the Online Supplement.
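The assignment rule just described can be written as a short function. This is a sketch under the simplifying assumption that the duration of ventilation n is already known as a single number of days; the trial-specific rules for defining successful liberation are in the Online Supplement and are not modeled. The function name `vfd_score` is illustrative.

```python
def vfd_score(died_by_day_28: bool, ventilation_days: int) -> int:
    """VFD score: 0 for death or ventilation through day 28, else 28 - n."""
    if died_by_day_28:
        return 0                          # death by day 28 scores 0
    return max(0, 28 - ventilation_days)  # still ventilated on day 28 also scores 0


print(vfd_score(True, 5))    # 0  (died)
print(vfd_score(False, 28))  # 0  (alive, ventilated through day 28)
print(vfd_score(False, 8))   # 20 (alive, liberated after 8 days)
```

The first two calls return the same value, which is the equivalence between death and prolonged ventilation that the AVF score is designed to remove.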
Simulations of effect sizes, sample sizes, and statistical power
We used simulations to assess the relative statistical power of the hierarchical AVF score versus the VFD score. Simulation parameters were obtained from each of the four clinical trial data sets. Frequencies of the resulting significance levels based on the Mann-Whitney-Wilcoxon statistic were plotted against the range of scenarios for each trial. Power calculations were performed by simulating 5000 independent trials for each specification of parameters. The following parameters were specified: 28-day mortality rates, proportion of patients alive and mechanically ventilated at day 28, and distribution of days of mechanical ventilation among patients alive and not mechanically ventilated at day 28. Mortality rates and proportions of patients alive and mechanically ventilated at day 28 were simulated by a random binary function, whereas the distribution of days of mechanical ventilation among patients alive and ventilator-free at day 28 was simulated by a truncated normal distribution, which fit the distribution of observed values well empirically.
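A simulation of this kind might be sketched as follows. Everything specific here is an assumption made for illustration: the parameter values are invented rather than taken from the trials, the truncation range of 1 to 27 days is a guess, the truncated normal is drawn by simple rejection sampling, and the rank sum test uses a normal approximation with tie correction and no continuity correction. The authors' R and SAS code is not reproduced.

```python
import math
import random
from typing import Dict, List


def simulate_arm(n: int, p_death: float, p_vent_day28: float,
                 mean_days: float, sd_days: float,
                 rng: random.Random) -> List[int]:
    """Return one ordinal value per patient: -1 = died, else days free (0..27)."""
    values = []
    for _ in range(n):
        u = rng.random()
        if u < p_death:
            values.append(-1)                 # died by day 28
        elif u < p_death + p_vent_day28:
            values.append(0)                  # alive but ventilated on day 28
        else:
            while True:                       # truncated normal by rejection
                days = round(rng.gauss(mean_days, sd_days))
                if 1 <= days <= 27:
                    values.append(days)
                    break
    return values


def rank_sum_p_value(x: List[int], y: List[int]) -> float:
    """Two-sided rank sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted(x + y)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    midrank, tie_term, i = {}, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                             # size of this tie group
        midrank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    w = sum(midrank[v] for v in x)            # rank sum of the first arm
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        return 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    return math.erfc(abs(z) / math.sqrt(2))


def power(n_per_arm: int, control: Dict[str, float], treatment: Dict[str, float],
          endpoint: str = "avf", n_sims: int = 5000, alpha: float = 0.05,
          seed: int = 0) -> float:
    """Fraction of simulated trials with p < alpha for the chosen endpoint."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        c = simulate_arm(n_per_arm, rng=rng, **control)
        t = simulate_arm(n_per_arm, rng=rng, **treatment)
        if endpoint == "vfd":                 # VFD score: death collapses to 0
            c, t = [max(v, 0) for v in c], [max(v, 0) for v in t]
        rejections += rank_sum_p_value(t, c) < alpha
    return rejections / n_sims


# Hypothetical parameters, not estimates from any ARDS Network trial.
control = dict(p_death=0.35, p_vent_day28=0.10, mean_days=18.0, sd_days=7.0)
treatment = dict(control, p_death=0.25)
for endpoint in ("avf", "vfd"):
    print(endpoint, power(450, control, treatment, endpoint=endpoint, n_sims=200))
```

The demonstration uses 200 simulated trials to keep the run short; the paper used 5000 per scenario. Coding death as −1 and ranking the resulting ordinal values follows the ordinal shortcut mentioned in the Methods, so the same test function serves both endpoints.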
To further investigate the performance of AVF in the presence of varying effect estimates for the intervention, we simulated multiple treatment arm scenarios, compared to simulations based on the parameters estimated from the ARMA control group. These simulations (5000 trials per simulation for all analyses) evaluated power at a sample size of 1000 patients over a range of deviations (mortality improvement from 2% to 10% and days free of ventilation among survivors of 3 to 8 days) from the ARMA control group. In related simulations, we held the mortality rate constant and varied the days free of ventilation among survivors to evaluate power to detect differences in days free of ventilation for the two composite endpoints. Similarly, in other simulations, we held days free of ventilation constant and evaluated the association between differences in mortality rates and power for the two endpoints. In additional sensitivity analyses (again with 5000 simulated trials of 1000 patients each), we held the proportion of patients either dead or ventilated at day 28 constant, while decreasing the mortality rate, when compared to parameters estimated from the ARMA control group.
To explore tolerance to discordant effects on mortality and non-mortality endpoints within AVF and the VFD score, we performed additional simulations evaluating scenarios in which an increase in mortality was associated with shorter duration of mechanical ventilation among survivors. For this analysis, we used the ARMA control group estimates and compared simulated treatment groups with discordant outcomes across a range of differences in mortality and days free of ventilation. We held the proportion of patients alive and ventilator dependent at 28 days constant.
All analyses were performed in the R Statistical Package 3.5.2(21) and in SAS version 9.3 (SAS Institute, Cary NC).
Results
ARDS Network trial results
Mechanical ventilation with lower tidal volume (ARMA) decreased 28-day mortality as compared to ventilation with traditional tidal volumes (25% vs. 35%, respectively) and increased the VFD score (median 13 vs. 4, respectively) (Table 3). Mechanical ventilation with lower or higher PEEP levels (ALVEOLI) did not significantly affect mortality rates or the VFD score. Conservative vs. liberal fluid management (FACTT-Fluid) did not affect mortality but increased the VFD score (median 18 vs. 14, respectively). Management with a PAC versus CVC (FACTT-Catheter) did not significantly affect mortality or the VFD score.
Table 3.
Outcome | Treatment Group | Control Group | P value |
---|---|---|---|
ARMA | Low tidal volume | Traditional tidal volume | |
Mortality, %† | 25.2 | 35.2 | 0.002 |
Days free of mechanical ventilation among survivors, median (interquartile range) | 20 (9; 24) | 20 (5; 24) | 0.46 |
VFD score, median (interquartile range) | 13 (0; 23) | 4 (0; 22) | 0.003 |
VFD score, probability of superior outcome, % (95% CI) | 56.2 (52.4 to 60.0) | 43.8 (40.0 to 47.6) | 0.003 |
AVF, probability of superior outcome, % (95% CI) | 56.5 (52.7 to 60.3) | 43.5 (39.7 to 47.3) | 0.003 |
ALVEOLI | High PEEP | Low PEEP | |
Mortality, % | 23.2 | 22.3 | 0.84 |
Days free of mechanical ventilation among survivors, median (interquartile range) | 20 (11.5; 23.5) | 20 (13; 24) | 0.47 |
VFD score, median (interquartile range) | 17 (0; 23) | 17 (0; 23) | 0.42 |
VFD score, probability of superior outcome, % (95% CI) | 49.2 (44.4 to 54.0) | 50.8 (46.0 to 55.6) | 0.42 |
AVF, probability of superior outcome, % (95% CI) | 48.4 (43.6 to 53.2) | 51.6 (46.8 to 56.4) | 0.51 |
FACTT-Fluid | Liberal fluid strategy | Conservative fluid strategy | |
Mortality, % | 24.9 | 21.5 | 0.20 |
Days free of mechanical ventilation among survivors, median (interquartile range) | 18 (9; 22) | 21 (15; 24) | < 0.001 |
VFD score, median (interquartile range) | 14 (0; 21) | 18 (0; 23) | < 0.001 |
VFD score, probability of superior outcome, % (95% CI) | 41.2 (37.8 to 44.8) | 58.8 (55.2 to 62.2) | < 0.001 |
AVF, probability of superior outcome, % (95% CI) | 42.5 (39.1 to 46.1) | 57.5 (53.9 to 60.9) | < 0.001 |
FACTT-Catheter | PAC | CVC | |
Mortality, % | 23.0 | 23.4 | 0.88 |
Days free of mechanical ventilation among survivors, median (interquartile range) | 19 (12; 23) | 19 (12.5; 24) | 0.11 |
VFD score, median (interquartile range) | 16 (0; 22) | 16 (0; 23) | 0.32 |
VFD score, probability of superior outcome, % (95% CI) | 46.8 (43.3 to 50.4) | 53.2 (49.6 to 56.7) | 0.25 |
AVF, probability of superior outcome, % (95% CI) | 48.1 (44.5 to 51.6) | 51.9 (48.4 to 55.5) | 0.29 |
†The mortality outcome for the ARMA trial was hospital mortality, as opposed to the 28-day mortality presented here.
VFD score = ventilator-free day score; PEEP: positive end expiratory pressure; PAC: pulmonary artery catheter; CVC: central venous catheter.
Note that the hierarchical endpoint should be presented with the distributions of the constituent outcomes.
The AVF hierarchical endpoint differed between treatment groups in ARMA (probability of superior outcome with lower tidal volume: 56.5%, 95% CI 52.7 to 60.3%, p=0.003) and FACTT-Fluid (probability of superior outcome with conservative fluid management: 57.5%, 95% CI 53.9 to 60.9%, p<0.001) (Table 3). The AVF score did not differ between treatment groups in the ALVEOLI and FACTT-Catheter studies.
Power estimates as a function of sample size and effect sizes
Figure 1 displays a plot of overall statistical power simulated using different sample sizes for the comparison of the AVF and VFD scores. Mortality rates, proportion of patients alive and mechanically ventilated at day 28, and distribution of days free of mechanical ventilation among survivors not ventilated on day 28 were obtained from the respective ARDS Network trials.
As expected, both the AVF and VFD scores had low power (5%−14%) for simulations up to a sample size of 1000 patients based on parameters estimated from the ALVEOLI and FACTT-Catheter trials. In simulations based on FACTT-Fluid (where the efficacy was in the distribution of days free of ventilation among survivors), the VFD score had similar power to AVF (e.g., 88 [87–89] % vs. 87 [86–88] % with 600 total patients). By contrast, in simulations based on ARMA (where the efficacy was in mortality), the AVF score had slightly higher power than VFD score (e.g., 83 [82–84] % vs. 80 [79–81] % with 900 total patients).
Results of other simulations are reported in the Online Supplement; exemplary results are displayed in Figure 2. In general, the AVF score had similar power to the VFD score and was less prone to find in favor of a treatment that increased both mortality and days free of mechanical ventilation.
Discussion
Mortality rates in the ARDS Network trials declined substantially over the course of the network’s existence.(22) Control group in-hospital mortality in ARMA (enrolling 1996–1999) was 40%, while control group in-hospital mortality to 28 days was 22% in EDEN (enrolling 2008–2011).(23) This decrease in mortality in randomized controlled trials has important implications for future trials. Continuing to run trials in broad populations of patients with ARDS may mean that trials designed around a mortality outcome will require ever-increasing sample sizes, which limits feasibility, increases cost, and risks delaying evaluation of promising therapies. One alternative approach would be to restrict enrollment to the most severely ill patients, as was done in OSCILLATE(24) and ACURASYS,(25) with associated control group hospital mortality of 35% and 41% respectively. Another approach would be to test potential interventions in studies powered to include clinically and mechanistically relevant non-mortality endpoints.
Composite outcomes have become standard in cardiovascular trials and see some use (generally as the VFD score) in critical care trials. These composite outcomes improve power and efficiency of trials and allow incorporation of relevant non-mortality outcomes that are likely affected by candidate treatments. However, despite their widespread use, they have occasioned caution and criticism.(26–31)
The NHLBI working group on future research directions in ARDS recently summarized the considerations for possible endpoints for clinical trials in ARDS.(2) They concluded that there are no proven surrogate markers for intermediate or long-term mortality in ARDS. Furthermore, patient-important outcomes beyond survival such as prolonged organ support therapy, physical, cognitive, or vocational recovery may be complicated by variable and difficult-to-measure patient baseline impairments, differences in ICU and end-of-life decision making, and differential follow-up rates. Despite this uncertainty, the VFD score has been commonly applied as a composite outcome of mortality and duration of respiratory failure. The VFD score is problematic, though. Most importantly, the VFD score treats death and ventilation on day 28 as equivalent, a claim with limited face validity because most patients do not consider prolonged ventilation identical to death, even where they would prefer shorter ventilation to longer ventilation.(32, 33) Similarly, the VFD score would not distinguish a treatment that increased survival while increasing by the same amount patients ventilated on day 28 (e.g., by saving very sick patients who would otherwise have died).
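The final scenario above can be made concrete with a toy cohort. The numbers below are invented for illustration and are not trial data: relative to control, the hypothetical treatment converts two deaths into two survivors who are still ventilated on day 28, and leaves all other patients unchanged. The helper `prob_superior` is an assumed Mann-Whitney-type estimate of the probability of superior outcome, with ties counted as one half.

```python
from typing import List, Tuple

Patient = Tuple[bool, int]  # (died_by_day_28, days_free_of_ventilation)


def prob_superior(treatment: List[int], control: List[int]) -> float:
    """Fraction of treatment-vs-control pairs won by treatment, ties counted half."""
    wins = sum(t > c for t in treatment for c in control)
    ties = sum(t == c for t in treatment for c in control)
    return (wins + 0.5 * ties) / (len(treatment) * len(control))


def vfd_value(p: Patient) -> int:
    return 0 if p[0] else p[1]     # VFD score: death and 0 days free coincide


def avf_value(p: Patient) -> int:
    return -1 if p[0] else p[1]    # AVF ordinal: death ranks below 0 days free


liberated = [(False, d) for d in (10, 12, 14, 16, 18, 20)]
control = [(True, 0)] * 4 + liberated
treatment = [(True, 0)] * 2 + [(False, 0)] * 2 + liberated

theta_vfd = prob_superior([vfd_value(p) for p in treatment],
                          [vfd_value(p) for p in control])
theta_avf = prob_superior([avf_value(p) for p in treatment],
                          [avf_value(p) for p in control])
print(theta_vfd, theta_avf)  # 0.5 0.54
```

The VFD score sees two identical distributions and returns exactly 0.5, whereas the hierarchical ranking credits the two additional survivors and returns 0.54 in favor of treatment.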
Hierarchical outcomes address certain failings of simpler composite outcomes through improved face validity and interpretability.(7–11, 34, 35) Hierarchical composite outcomes have better face validity because they explicitly rate death as more important than non-mortality outcomes; while the pairwise approach is more complex to calculate, it has in its favor the intuitive interpretation that patients are compared to each other to determine on balance which treatment arm is better. Such composite outcomes can accommodate multiple, hierarchically ranked outcomes into a single summary. Stakeholder groups could thus together establish outcome hierarchies, which could be implemented precisely within a hierarchical composite endpoint.
In this simulation-based study, the hierarchical composite outcome AVF score had similar power to the VFD score with better face validity. In addition, the AVF score had higher power to detect differences in mortality across a range of plausible increases in days free of ventilation. This basic attribute is manifest in the differences in power between ARMA and FACTT-Fluid: in ARMA, the AVF score had slightly more power because the difference in mortality was greater, while in FACTT-Fluid, the VFD score had slightly more power because mortality was similar but the difference in days free of ventilation was greater.
The AVF score may also have a more clinically intuitive interpretation than the VFD score, which as a trial summary is largely opaque as a merger of probabilities and distributions. The effect estimate for the AVF score is the probability of superior outcome with receipt of the studied intervention. True to its literal meaning, the probability of superior outcome is defined as the probability that a patient randomly selected from the study population would do better if assigned to a given study arm. An alternative, previously described metric for reporting treatment effect with hierarchical outcomes is the win/lose ratio.(10) However, the win/lose ratio may be less easily interpreted if widespread misinterpretation of odds ratios is any indicator, whereas clinicians and the lay public naturally think in terms of probabilities.(36, 37) Our reporting approach preserves face validity and robust statistical power while also prioritizing ease of interpretation (the probability of superior outcome), a crucial design feature of any clinical trial endpoint. As with any composite outcome, we recommend also reporting individual constituent endpoints. While all composite outcomes represent compromises among competing priorities, the hierarchical AVF endpoint appears superior to the traditional VFD score.
Limitations
We acknowledge that improvements in power with AVF are generally small. An increase in power was not our primary motivation, and we are reassured that improvements in power are most marked in situations where, e.g., an intervention increases mortality but decreases duration of ventilation among survivors, or where mortality decreases but the proportion ventilated on day 28 increases by the same amount. We acknowledge that this endpoint has only been carefully evaluated in these four ARDS Network trials, although it has been used in other trials as a primary endpoint.(13–15) We acknowledge that worst-rank ordinal endpoints have the same statistical characteristics as the AVF score and are as easy to understand when there is only one non-mortality outcome. We acknowledge that the VFD score could also be presented in terms of probability of superiority, although this would not solve its face validity problem. We acknowledge that interpretability of endpoints is always complex, that there is no established minimum clinically important difference for AVF, and that a hierarchical composite, while an improvement, does not solve all problems. We also acknowledge that we did not formally engage patient collaborators for this specific project. We believe that this framework provides an infrastructure for building patient-centered composite outcomes and strongly recommend patient collaboration for the development of new outcomes within this proposed hierarchical framework.
In summary, we present a hierarchical composite endpoint for clinical trials in ARDS. This endpoint enhances face validity and ease of clinical interpretation. AVF can facilitate more efficient performance of ARDS clinical trials without appreciable loss of power and may yield higher power as compared to the non-hierarchical composite outcome, the VFD score. A similar hierarchical endpoint, focused on mortality and the duration of non-pulmonary organ dysfunction, may similarly be relevant to clinical trials in other areas of critical care medicine.
Copyright form disclosure:
Dr. Novack received funding from Cardiomed Consultants LLC. Drs. Beitler and Schoenfeld’s institution received funding from the National Institutes of Health (NIH). Drs. Beitler, Thompson, Schoenfeld, and Brown received support for article research from the NIH. Dr. Thompson’s institution received funding from the National Heart, Lung, and Blood Institute and Department of Defense, and reports consulting for Bayer, Boehringer Ingelheim, and GlaxoSmithKline, and authorship for UpToDate, all outside the submitted work. The remaining authors have disclosed that they do not have any potential conflicts of interest.
This work was supported by NHLBI 1UM1HL108724 (PI: Talmor) and K23HL133489 (PI: Beitler).
Footnotes
No author has a relevant conflict of interest.
References
- 1.ARDS Definition Task Force, Ranieri VM, Rubenfeld GD, et al. Acute respiratory distress syndrome: the Berlin Definition. JAMA 2012;307(23):2526–2533. [DOI] [PubMed] [Google Scholar]
- 2.Lieu TA, Au D, Krishnan JA, et al. Comparative effectiveness research in lung diseases and sleep disorders: recommendations from the National Heart, Lung, and Blood Institute workshop. Am J Respir Crit Care Med 2011;184(7):848–856. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Spragg RG, Bernard GR, Checkley W, et al. Beyond mortality: future clinical research in acute lung injury. Am J Respir Crit Care Med 2010;181(10):1121–1127. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.PETAL Network Investigators. Early Neuromuscular Blockade in the Acute Respiratory Distress Syndrome. New England Journal of Medicine 2019;380(21):1997–2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Schoenfeld DA, Bernard GR. Statistical evaluation of ventilator-free days as an efficacy measure in clinical trials of treatments for acute respiratory distress syndrome. Crit Care Med 2002;30(8):1772–1777.
- 6. Mendelsohn AB, Belle SH, Fischhoff B, et al. How patients feel about prolonged mechanical ventilation 1 year later. Crit Care Med 2002;30(7):1439–1445.
- 7. O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics 1984;40(4):1079–1087.
- 8. Finkelstein DM, Schoenfeld DA. Combining mortality and longitudinal measures in clinical trials. Stat Med 1999;18(11):1341–1354.
- 9. Moye LA, Davis BR, Hawkins CM. Analysis of a clinical trial involving a combined mortality and adherence dependent interval censored endpoint. Stat Med 1992;11(13):1705–1717.
- 10. Pocock SJ, Ariti CA, Collier TJ, et al. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur Heart J 2012;33(2):176–182.
- 11. Lachin JM. Worst-rank score analysis with informatively missing observations in clinical trials. Control Clin Trials 1999;20(5):408–422.
- 12. Colantuoni E, Scharfstein DO, Wang C, et al. Statistical methods to compare functional outcomes in randomized controlled trials with high mortality. BMJ 2018;360:j5748.
- 13. Fish E, Novack V, Banner-Goodspeed VM, et al. The Esophageal Pressure-Guided Ventilation 2 (EPVent2) trial protocol: a multicentre, randomised clinical trial of mechanical ventilation guided by transpulmonary pressure. BMJ Open 2014;4(9):e006356.
- 14. Beitler JR, Sarge T, Banner-Goodspeed VM, et al. Effect of Titrating Positive End-Expiratory Pressure (PEEP) With an Esophageal Pressure-Guided Strategy vs an Empirical High PEEP-Fio2 Strategy on Death and Days Free From Mechanical Ventilation Among Patients With Acute Respiratory Distress Syndrome: A Randomized Clinical Trial. JAMA 2019.
- 15. Bellingan G, Brealey D, Mancebo J, et al. Comparison of the efficacy and safety of FP-1201-lyo (intravenously administered recombinant human interferon beta-1a) and placebo in the treatment of patients with moderate or severe acute respiratory distress syndrome: study protocol for a randomized controlled trial. Trials 2017;18(1):536.
- 16. Bernard GR, Artigas A, Brigham KL, et al. The American-European Consensus Conference on ARDS. Definitions, mechanisms, relevant outcomes, and clinical trial coordination. Am J Respir Crit Care Med 1994;149(3 Pt 1):818–824.
- 17. The Acute Respiratory Distress Syndrome Network. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome. N Engl J Med 2000;342(18):1301–1308.
- 18. Brower RG, Lanken PN, MacIntyre N, et al. Higher versus lower positive end-expiratory pressures in patients with the acute respiratory distress syndrome. N Engl J Med 2004;351(4):327–336.
- 19. National Heart Lung and Blood Institute Acute Respiratory Distress Syndrome Clinical Trials Network, Wheeler AP, Bernard GR, et al. Pulmonary-artery versus central venous catheter to guide treatment of acute lung injury. N Engl J Med 2006;354(21):2213–2224.
- 20. National Heart Lung and Blood Institute Acute Respiratory Distress Syndrome Clinical Trials Network, Wiedemann HP, et al. Comparison of two fluid-management strategies in acute lung injury. N Engl J Med 2006;354(24):2564–2575.
- 21. R Core Team. R: A Language and Environment for Statistical Computing. Version 3.2.3. Vienna, Austria: R Foundation for Statistical Computing; 2015.
- 22. Erickson SE, Martin GS, Davis JL, et al. Recent trends in acute lung injury mortality: 1996–2005. Crit Care Med 2009;37(5):1574–1579.
- 23. Rice TW, Wheeler AP, Thompson BT, et al. Initial trophic vs full enteral feeding in patients with acute lung injury: the EDEN randomized trial. JAMA 2012;307(8):795–803.
- 24. Ferguson ND, Cook DJ, Guyatt GH, et al. High-frequency oscillation in early acute respiratory distress syndrome. N Engl J Med 2013;368(9):795–805.
- 25. Papazian L, Forel JM, Gacouin A, et al. Neuromuscular blockers in early acute respiratory distress syndrome. N Engl J Med 2010;363(12):1107–1116.
- 26. Freemantle N, Calvert M, Wood J, et al. Composite outcomes in randomized trials: greater precision but with greater uncertainty? JAMA 2003;289(19):2554–2559.
- 27. Gent M. Some issues in the construction and use of clusters of outcome events. Contemporary Clinical Trials 1997;18(6):546–549.
- 28. Cordoba G, Schwartz L, Woloshin S, et al. Definition, reporting, and interpretation of composite outcomes in clinical trials: systematic review. BMJ 2010;341.
- 29. Ferreira-Gonzalez I, Busse JW, Heels-Ansdell D, et al. Problems with use of composite end points in cardiovascular trials: systematic review of randomised controlled trials. BMJ 2007;334(7597):786.
- 30. Ferreira-Gonzalez I, Permanyer-Miralda G, Busse JW, et al. Methodologic discussions for using and interpreting composite endpoints are limited, but still identify major concerns. J Clin Epidemiol 2007;60(7):651–657; discussion 658–662.
- 31. Montori VM, Permanyer-Miralda G, Ferreira-Gonzalez I, et al. Validity of composite end points in clinical trials. BMJ 2005;330(7491):594–596.
- 32. Fried TR, Bradley EH, Towle VR, et al. Understanding the treatment preferences of seriously ill patients. N Engl J Med 2002;346(14):1061–1066.
- 33. Guentner K, Hoffman LA, Happ MB, et al. Preferences for mechanical ventilation among survivors of prolonged mechanical ventilation and tracheostomy. Am J Crit Care 2006;15(1):65–77.
- 34. Felker GM, Anstrom KJ, Rogers JG. A global ranking approach to end points in trials of mechanical circulatory support devices. J Card Fail 2008;14(5):368–372.
- 35. Ediebah DE, Galindo-Garre F, Uitdehaag BM, et al. Joint modeling of longitudinal health-related quality of life data and survival. Qual Life Res 2015;24(4):795–804.
- 36. Schwartz LM, Woloshin S, Welch HG. Misunderstandings about the effects of race and sex on physicians’ referrals for cardiac catheterization. N Engl J Med 1999;341(4):279–283; discussion 286–287.
- 37. Persoskie A, Ferrer RA. A Most Odd Ratio: Interpreting and Describing Odds Ratios. Am J Prev Med 2017;52(2):224–228.