Author manuscript; available in PMC: 2010 Jun 1.
Published in final edited form as: Crit Care Med. 2009 Jun;37(6):1913–1920. doi: 10.1097/CCM.0b013e3181a009b4

A Simple Clinical Predictive Index for Objective Estimates of Mortality in Acute Lung Injury

Colin R Cooke 1, Chirag V Shah 2,3, Robert Gallop 4, Scarlett Bellamy 3, Marek Ancukiewicz 5, Mark D Eisner 6, Paul N Lanken 2, A Russell Localio 3, Jason D Christie 2,3, for the National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome network
PMCID: PMC2731230  NIHMSID: NIHMS117976  PMID: 19384214

Abstract

Objective

We sought to develop a simple point score that would accurately capture the risk of hospital death for patients with acute lung injury (ALI).

Design

This is a secondary analysis of data from two randomized trials. Baseline clinical variables collected within 24 hours of enrollment were modeled as predictors of hospital mortality using logistic regression and bootstrap resampling to arrive at a parsimonious model. We constructed a point score based on regression coefficients.

Setting

Medical centers participating in the Acute Respiratory Distress Syndrome Clinical Trials network (ARDSnet).

Patients

Model development: 414 patients with non-traumatic ALI participating in the low tidal volume arm of the ARDSnet ARMA study. Model validation: 459 patients participating in the ARDSnet ALVEOLI study.

Interventions

None

Measurements and Main Results

Variables comprising the prognostic model were: hematocrit <26% (1 point), bilirubin ≥ 2 mg/dl (1 point), fluid balance greater than 2.5 liters positive (1 point), and age (1 point for age 40–64, 2 points for age ≥ 65 years). Predicted mortality (95% confidence interval) for 0, 1, 2, 3, and 4+ point totals was 8% (5–14%), 17% (12–23%), 31% (26–37%), 51% (43–58%), and 70% (58–80%), respectively. There was excellent agreement between predicted and observed mortality in the validation cohort. Observed mortality for 0, 1, 2, 3, and 4+ point totals in the validation cohort was 12%, 16%, 28%, 47%, and 67%, respectively. Compared to the APACHE III score, areas under the receiver operating characteristic curve for the point score were greater in the development cohort (0.72 vs. 0.67, p=0.09) and lower in the validation cohort (0.68 vs. 0.75, p=0.03).

Conclusions

Mortality in ALI patients can be predicted using an index of four readily-available clinical variables with good calibration. This index may help inform prognostic discussions, but validation in non-clinical trial populations is necessary before widespread use.

Keywords: Acute respiratory distress syndrome, acute lung injury, Respiratory Distress Syndrome, Adult, Human ARDS, Statistical Model, logistic models, mortality determinants, Mortality, In-Hospital, Acute Physiology and Chronic Health Evaluation, APACHE III, Bayesian Prediction, Prognosis

INTRODUCTION

Acute lung injury (ALI) is a devastating cause of respiratory failure associated with significant morbidity and mortality.1,2 Despite the wealth of existing knowledge about risk factors for death in this syndrome, providers remain unable to determine which patients with ALI will ultimately die during their hospital stay. The vast majority of patients with ALI who die do so in the context of a decision to forgo life sustaining treatment driven in large part by patient preferences.3–5

Prognostication in the intensive care unit (ICU) is an important part of communication with surrogates, and often plays a role in the decision to forgo life sustaining treatment.6,7 Incapacitated patients rely upon surrogates such as their family members to represent their wishes during ICU care, and surrogates often rely upon clinician estimates of the likelihood of survival and functional recovery from acute illness when deciding whether to forgo life sustaining treatment for their loved one.7 Documented cognitive and non-cognitive biases held by physicians may overly influence their prognostic estimates for a given patient and have the potential to misrepresent the true risk of death.8–11 Objective prognostic models, such as the Acute Physiology and Chronic Health Evaluation (APACHE) III score12 and the Simplified Acute Physiology Score (SAPS) 3,13 can provide estimated probabilities of death for an individual patient in the ICU. However, experts recommend against use of these models for predicting outcomes for individual patients, in part because of their inability to convey uncertainty in estimated probabilities of death for an individual patient and the complexity involved in their calculation.14

The goal of this study was to develop a simple, disease-specific multivariable predictive scorecard for mortality to be used at the bedside in patients with early ALI. Given the importance of well calibrated models for individual prognostication15, we sought to maximize the concordance between predicted and actual probabilities of hospital death across point strata for our model, and thus to arrive at a system that might classify patients into groups for planning patient care.

MATERIALS AND METHODS

Study Population

The model derivation population arose from the 861 patients participating in the ARDSnet low tidal volume study (ARMA).16 Briefly, intubated, mechanically ventilated patients meeting the American-European Consensus Conference (AECC)17 definition for ALI were randomized within 36 hours of meeting the last qualifying AECC criterion to receive tidal volumes of 6 mL/kg or 12 mL/kg predicted body weight. Demographics, comorbidities, ALI precipitating cause, physiology, radiographic and ventilator data were recorded within the 24 hours prior to change in ventilator settings for all enrolled patients. Vital status for each patient was determined at hospital discharge. We limited our development cohort to patients randomized into the 6 mL/kg arm of the parent study (n=473) to eliminate tidal volume as a predictive variable in the analysis, since current best practice involves low tidal volume ventilation for this population. Patients with trauma as the primary risk factor for ALI were excluded due to the low mortality rate in this subgroup.18

Model development

Our general strategy to develop a predictive model for death consisted of three steps. First, we identified variables previously reported as associated with mortality or severity of illness in ALI. Baseline values were selected to minimize missing data and to allow for mortality prediction at the beginning of ALI. Next, we constructed a parsimonious multivariable model based on these predictors. Finally, we validated the final predictive model in an independent sample of patients.

When deciding which covariates to retain as candidate predictors for the multivariable model, we considered, in order, the clinical relevance and generalizability of each covariate; the amount of missing data (retaining the measure with the least missing data); and the amount of spread in the covariate’s scale (retaining the measure with the most variability). We assessed collinearity among the predictors using the Pearson correlation coefficient, χ² tests, and ANOVA/t-tests. When highly correlated covariates quantified the same clinical information (e.g. A-a difference and PaO2), we selected the covariate that was more clinically relevant, had less missing data, and had more variability.

Multivariable Modeling

The resulting baseline clinically relevant covariates with minimal collinearity were entered into a multivariable logistic regression model. These variables included demographics (age, gender, race/ethnicity); weight; respiratory physiology (PaO2/FiO2, PaCO2, positive end-expiratory pressure [PEEP], number of opacified quadrants on frontal chest x-ray19, volume/pressure targeted ventilation, assist/control ventilation); primary ALI risk factor as coded by the clinical coordinator and physician investigator within 36 hours of ALI onset (pneumonia, sepsis, aspiration, other/none); timing of ALI onset (hospital days prior to ARDSnet screen, days with ALI prior to randomization); and physiologic and laboratory derangement (number of non-pulmonary organ failures, vasopressor use, net 24-hour fluid balance prior to enrollment, 24-hour urine output prior to enrollment, peak bilirubin, peak creatinine, lowest systolic blood pressure, lowest hematocrit). All peak and nadir values were identified during the 24 hour period prior to enrollment. We included continuous variables in categorical form to simplify point calculation from the final model. We determined cut points for continuous variables by assessing each variable’s functional form using generalized additive models.20 We evaluated two-way multiplicative interactions for each covariate; interactions were excluded from the final model if they were not statistically significant.

Variable selection in the multivariable regression framework utilized a bootstrap algorithm.21 We generated 1000 bootstrap samples from the original dataset. Each bootstrap sample was the same size as the original derivation sample; however, patients in each bootstrap sample were randomly drawn from the original data with replacement.21 Within each bootstrap sample, we performed stepwise logistic regression with thresholds of p=0.10 for selection and p=0.20 for variable elimination. Predictors present in at least 600 runs (i.e., 60% of the 1000 generated bootstrap samples) were entered in a final logistic regression model using the original data.22,23 This method determines the empirical distribution of a variable’s likelihood of being included in the model, thereby quantifying the strength of evidence that a given variable is indeed a true independent predictor of death, and compares favorably to more traditional cross-validation or isolated automated model development methods.23
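The resampling and counting step described above can be sketched as follows. This is a minimal illustration, not the authors' SAS or Stata code: the function names, the toy data, and in particular `rate_difference_selector` are assumptions introduced here. The selector is a deliberately simple stand-in for the stepwise logistic regression used in the study, so only the bootstrap logic (resample with replacement, select, tally, keep variables chosen in at least 60% of samples) mirrors the text.

```python
import random
from collections import Counter

def bootstrap_inclusion(rows, select, n_boot=1000, seed=0):
    """Fraction of bootstrap samples in which each variable is selected.

    rows   : list of patient records
    select : callable returning the set of variable names chosen for a sample
             (the study used stepwise logistic regression in this role)
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(rows)
    for _ in range(n_boot):
        # Each bootstrap sample has the original size, drawn with replacement.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        counts.update(select(sample))
    return {name: counts[name] / n_boot for name in counts}

def retained(freqs, threshold=0.60):
    """Variables selected in at least `threshold` of the bootstrap samples."""
    return sorted(name for name, f in freqs.items() if f >= threshold)

def rate_difference_selector(sample, names=("old", "noise"), min_diff=0.15):
    """Hypothetical stand-in selector: keep a binary variable when the death
    rate differs by at least `min_diff` between its two levels."""
    chosen = set()
    for name in names:
        exposed = [r["died"] for r in sample if r[name] == 1]
        unexposed = [r["died"] for r in sample if r[name] == 0]
        if exposed and unexposed:
            diff = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)
            if abs(diff) >= min_diff:
                chosen.add(name)
    return chosen

# Hypothetical toy data: "old" is associated with death, "noise" is not.
toy = []
for old, deaths in ((0, 10), (1, 25)):
    for noise in (0, 1):
        for k in range(50):
            toy.append({"old": old, "noise": noise, "died": 1 if k < deaths else 0})

freqs = bootstrap_inclusion(toy, rate_difference_selector)
print(freqs)
print(retained(freqs))
```

Passing the selector as a function keeps the bootstrap loop independent of the modeling step, so a real stepwise procedure could be substituted without changing the tallying code.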

Score generation

Point scores were assigned to each covariate by rounding the regression coefficients in the final model to integers.24 We then calculated a point score for each patient in the cohort and plotted the resulting receiver operating characteristic (ROC) curve. The ROC curve graphically describes the overall performance of our point score.25 Discrimination of the model was summarized with the area under the curve (AUC) of the ROC curve.25 In addition, we derived positive likelihood ratio (LR+) estimates for each level of the point score to be able to estimate how much a prior probability of death would be influenced by an observed point score. The LR+ summarizes how many times more likely patients who die are to have that particular point total than patients who survive.26,27 Predicted probabilities of death and their respective confidence intervals for each point stratum were generated from a logistic regression with mortality as the outcome and the point total per patient as the sole predictor. Post-test probabilities of death were generated using hypothetical, provider-determined pre-test probabilities of death and the LR+ for each point category as previously described.27 We calculated confidence intervals for post-test probabilities of death by incorporating the uncertainty in the likelihood ratio. Pre-test probabilities were assumed to have no uncertainty. We assessed calibration using the Hosmer-Lemeshow statistic with P<0.10 indicating that fit was inadequate.28 Given the low power of this test in small samples, we also compared the actual and predicted mortality within each point stratum for the development and validation cohorts.
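As an illustration of the stratum-specific LR+ defined above, the sketch below divides the proportion of non-survivors with a given point total by the proportion of survivors with that total. The function name and the counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not the study data, and the sketch assumes every stratum contains at least one survivor.

```python
def stratum_likelihood_ratios(deaths, survivors):
    """LR+ for each point total: P(total | died) / P(total | survived).

    deaths, survivors : dicts mapping point total -> number of patients
    """
    total_deaths = sum(deaths.values())
    total_survivors = sum(survivors.values())
    ratios = {}
    for points in deaths:
        p_given_died = deaths[points] / total_deaths
        p_given_survived = survivors[points] / total_survivors
        ratios[points] = p_given_died / p_given_survived
    return ratios

# Hypothetical counts by point total (key 4 stands for "4+").
deaths = {0: 5, 1: 15, 2: 40, 3: 30, 4: 10}
survivors = {0: 40, 1: 70, 2: 60, 3: 25, 4: 5}

lr = stratum_likelihood_ratios(deaths, survivors)
for points in sorted(lr):
    print(points, round(lr[points], 2))
```

A ratio below 1 lowers the odds of death relative to the pre-test estimate, and a ratio above 1 raises them.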

Model Validation

We assessed internal validity of our model by comparing the AUC of our point score to that of the predicted mortality estimated from the APACHE III score12 using the method outlined by DeLong et al.29 APACHE probabilities of death were generated by fitting the APACHE III score in a logistic model where hospital death was the outcome. We assessed external validity by applying our model to an independent database which consisted of the same target study population used in constructing the prediction model (participants in the ARDSnet clinical trial ALVEOLI).30 Briefly, ALVEOLI randomized 549 intubated, mechanically ventilated patients meeting the AECC definition for ALI or ARDS within 36 hours to receive higher or lower PEEP. All patients received tidal volumes of 6 mL/kg predicted body weight. Baseline variables collected in ALVEOLI were similar to those captured in ARMA. Patients were followed until discharge. We limited our analysis of ALVEOLI to patients without trauma as the primary ALI risk factor (n=505).

As a sensitivity analysis, we determined the influence of missing data on our model by performing multiple imputation (SAS PROC MI) for each incomplete covariate as described by Rubin.31 The imputed model and mortality estimates derived from the imputed model were identical to those from complete case analysis. We also utilized the same variables and cut points to determine model performance for predicting 28-day mortality.

The institutional review board for each center participating in ARDSnet approved of the parent studies. All statistical analyses were conducted with SAS 9.1 (Statistical Analysis Systems, Cary, NC) and Stata 9.2 (StataCorp, College Station, TX). All tests of significance utilized a two-sided α = 0.05.

RESULTS

Of the 902 patients participating in the ARDSnet low tidal volume study, 429 were randomized to the 12 mL/kg tidal volume arm and excluded. Of the remaining 473 patients, 59 (12%) were excluded due to trauma as the primary risk factor for ALI, leaving 414 patients (88% of patients in the 6 mL/kg arm) available for analysis. Demographics, ALI risk factor, severity of illness, and laboratory and physiology data for the cohort are shown in Table 1. Of the 414 patients in the development cohort, 139 (33%) were dead at hospital discharge, similar to the 31% mortality reported in the 6 mL/kg arm of the parent study.16 In general, patients dead at hospital discharge were older and had a greater severity of physiologic and laboratory derangement.

Table 1.

Baseline characteristics of patients eligible for model development by vital status*

| Variable | Alive at hospital discharge | Dead at hospital discharge | P value |
| --- | --- | --- | --- |
| Cases | 275 | 139 | |
| Age, years, median (IQR) | 48 (37–61) | 60 (45–72) | <0.001 |
| Male (%) | 59 | 61 | 0.61 |
| Race (%) | | | 0.27 |
| White | 75 | 70 | |
| Black | 17 | 19 | |
| Hispanic | 4 | 4 | |
| Other / unknown | 4 | 7 | |
| Timing of ALI | | | |
| Hospital days prior to ALI, median (IQR) | 2 (1–5) | 4 (1–8) | 0.001 |
| ALI days prior to randomization, median (IQR) | 1 (0–1) | 1 (0–1) | 0.97 |
| Primary ALI risk factor (%) | | | 0.59 |
| Pneumonia | 36 | 33 | |
| Sepsis | 28 | 34 | |
| Aspiration | 17 | 19 | |
| Multiple transfusion | 3 | 2 | |
| None/other | 16 | 12 | |
| Severity of illness | | | |
| APACHE III score | 79 (27) | 96 (30) | <0.001 |
| Number of organ failures, median (IQR) | 1 (0–1) | 1 (0–2) | 0.01 |
| Net volume during preceding 24 hours (cc) | 2276 (3616) | 3361 (4546) | 0.01 |
| Vasopressor use (%) | 38 | 49 | 0.03 |
| Respiratory physiology | | | |
| Minute ventilation (L/min) | 13 (4) | 14 (4) | 0.1 |
| Plateau pressure (mmHg) | 29 (7) | 31 (8) | 0.01 |
| PEEP, mmHg, median (IQR) | 8 (5–10) | 10 (5–10) | 0.11 |
| PaO2/FiO2 ratio | 152 (71) | 135 (61) | 0.02 |
| pH | 7.37 (0.1) | 7.36 (0.1) | 0.67 |
| PaCO2 (mmHg) | 36 (8) | 36 (8) | 0.98 |
| Additional physiology and laboratories‖ | | | |
| Systolic blood pressure (mmHg) | 89 (19) | 83 (19) | 0.003 |
| 24 hour urine output (cc) | 2400 (1539) | 2068 (1612) | 0.05 |
| Glucose (mg/dL) | 177 (100) | 184 (90) | 0.48 |
| Creatinine (mg/dL) | 1.6 (1.5) | 1.8 (1.4) | 0.22 |
| Hematocrit (%) | 30 (6) | 29 (5) | 0.04 |
| Bilirubin (mg/dL) | 1.6 (1.9) | 2.4 (3.3) | 0.02 |

* IQR, interquartile range; ALI, acute lung injury; APACHE, Acute Physiology and Chronic Health Evaluation; PaO2, partial pressure of arterial oxygen; FiO2, fraction of inspired oxygen; PaCO2, partial pressure of arterial carbon dioxide.

Data were missing for plateau pressure in 87 (21%) patients; bilirubin in 38 (9%); PaCO2 in 30 (7%); PaO2/FiO2 in 30 (7%); fluid balance in 27 (7%); glucose in 26 (6%); creatinine in 23 (6%); urine output in 22 (5%); minute ventilation in 4 (<1%); pH in 3 (<1%); hematocrit in 3 (<1%); APACHE III in 2; vasopressor use in 2; primary ALI risk factor in 1; systolic blood pressure in 1.

Numbers reflect mean (SD) unless otherwise noted. Percentages may not add to 100 due to rounding.

‖ Numbers represent worst values over the 24 hour period surrounding enrollment day.

Multivariable modeling

During multivariable modeling, 64 additional patients were excluded due to missing data for bilirubin (n=38, 9%), fluid balance (n=24, 6%), and hematocrit (n=2). Variables retained in the final regression of the covariates present in >60% of the bootstrap iterations included age, hematocrit, 24-hour fluid balance, and bilirubin. The model derived from imputed data was identical to that derived by complete case analysis. For simplicity, we report only the results of the complete case analysis. Point values generated from the regression coefficients for each of these covariates are shown in Table 2. The resulting point total for each patient was incorporated in a regression with hospital mortality as the outcome. We refer to this model as the custom model. Predicted mortality by point total for the development cohort and observed mortality in the development and validation cohorts are presented in Table 3. The mean predicted mortality for each point stratum was very close to the observed mortality in both the development and validation cohorts. In all strata, observed mortality in the validation cohort fell within the confidence bounds of the predicted mortality.

Table 2.

Model based points for each cut point in predictive variables in the final multivariable model

| Variable | 0 points | 1 point | 2 points |
| --- | --- | --- | --- |
| Age, years | ≤39 | 40–64 | ≥65 |
| Bilirubin, mg/dL | <2.0 | ≥2.0 | - |
| Net 24-hour volume (in − out), mL | ≤2500 | >2500 | - |
| Hematocrit, % | ≥26 | <26 | - |
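The scoring rule in Table 2 is simple enough to express directly in code. The sketch below is an illustration rather than a validated implementation: the function and variable names are assumptions, age is assumed to be given in completed years, and the predicted mortality lookup copies the point-total estimates reported in Table 3 (with all totals of 4 or more pooled as "4+").

```python
def ali_point_score(age_years, bilirubin_mg_dl, net_fluid_ml, hematocrit_pct):
    """Point total from the four Table 2 variables (range 0 to 5)."""
    points = 0
    if age_years >= 65:
        points += 2
    elif age_years >= 40:
        points += 1
    if bilirubin_mg_dl >= 2.0:
        points += 1
    if net_fluid_ml > 2500:      # net 24-hour volume, intake minus output
        points += 1
    if hematocrit_pct < 26:
        points += 1
    return points

# Predicted hospital mortality (%) by point total, copied from Table 3.
# Key 4 represents the pooled "4+" stratum.
PREDICTED_MORTALITY_PCT = {0: 8.0, 1: 16.5, 2: 31.0, 3: 50.6, 4: 70.0}

def predicted_mortality_pct(points):
    return PREDICTED_MORTALITY_PCT[min(points, 4)]

# Example: a 58-year-old with bilirubin 2.4 mg/dL, +3100 mL, hematocrit 31%.
score = ali_point_score(58, 2.4, 3100, 31)
print(score, predicted_mortality_pct(score))
```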

Table 3.

Predicted and observed hospital mortality, and positive likelihood ratios in the derivation set (ARMA) and the validation set (ALVEOLI)*

| Total points | Predicted mortality, % | Predicted mortality, 95% CI | Observed mortality, ARMA, % | Observed mortality, ALVEOLI, % | Diagnostic likelihood ratio + (95% CI) |
| --- | --- | --- | --- | --- | --- |
| 0 | 8.0 | (4.6, 13.7) | 8.1 | 12.3 | 0.30 (0.16, 0.54) |
| 1 | 16.5 | (11.9, 22.5) | 16.0 | 16.3 | 0.47 (0.34, 0.64) |
| 2 | 31.0 | (26.0, 36.6) | 30.1 | 27.8 | 0.98 (0.79, 1.23) |
| 3 | 50.6 | (42.7, 58.4) | 54.4 | 46.5 | 2.50 (1.94, 3.22) |
| 4+ | 70.0 | (58.1, 79.5) | 60.0 | 66.7 | 4.13 (2.12, 8.07) |

* CI, confidence interval; ARMA, Acute respiratory distress syndrome network Respiratory Management in the Acute lung injury; ALVEOLI, Assessment of Low Tidal Volume and Elevated End-Expiratory Pressure to Obviate Lung Injury.

Pooled likelihood ratios for ARMA and ALVEOLI. LR+ can be multiplied by the pre-test odds of outcome to get the post-test odds of outcome. Pre-test odds can be calculated as p/(1−p), where p = pre-test probability of disease. Post-test probability is calculated as post-test odds / (1 + post-test odds).
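Two properties of Table 3 can be checked with a few lines of arithmetic. The sketch below, whose names and layout are assumptions made for illustration, first tests whether the observed validation (ALVEOLI) mortality in each stratum lies inside the predicted 95% confidence interval, and then converts the predicted percentages to log-odds. Because the predictions come from a logistic regression with the point total as the sole predictor, the log-odds should rise by a roughly constant amount per point.

```python
import math

# (points, predicted %, CI low, CI high, observed ARMA %, observed ALVEOLI %)
# copied from Table 3.
TABLE_3 = [
    ("0", 8.0, 4.6, 13.7, 8.1, 12.3),
    ("1", 16.5, 11.9, 22.5, 16.0, 16.3),
    ("2", 31.0, 26.0, 36.6, 30.1, 27.8),
    ("3", 50.6, 42.7, 58.4, 54.4, 46.5),
    ("4+", 70.0, 58.1, 79.5, 60.0, 66.7),
]

def within_interval(rows):
    """Per stratum: is observed validation mortality inside the predicted CI?"""
    return {pts: lo <= obs_val <= hi for pts, _, lo, hi, _, obs_val in rows}

def log_odds_steps(rows):
    """Increase in predicted log-odds of death between adjacent point totals."""
    logits = [math.log(row[1] / (100.0 - row[1])) for row in rows]
    return [b - a for a, b in zip(logits, logits[1:])]

print(within_interval(TABLE_3))
print([round(step, 2) for step in log_odds_steps(TABLE_3)])
```

With the published, rounded percentages each step is close to 0.82 on the log-odds scale, which corresponds to the odds of death a little more than doubling with each additional point.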

Positive likelihood ratios (LR+) and 95% confidence intervals for each point total in the combined cohorts are also shown in Table 3. Utilizing the LR+s from Table 3, we calculated the hypothetical post-test probability of death as a function of point total from our model over a range of pre-test probabilities of death (Table 4).

Table 4.

Estimated post-test percent hospital mortality (95% confidence interval) for a range of pre-test rates of death*

| Pre-test estimated mortality, % | 0 points | 1 point | 2 points | 3 points | 4+ points |
| --- | --- | --- | --- | --- | --- |
| 5 | 2 (1, 3) | 2 (2, 3) | 5 (4, 6) | 12 (9, 14) | 18 (10, 30) |
| 10 | 3 (2, 6) | 5 (4, 7) | 10 (8, 12) | 22 (18, 26) | 31 (19, 47) |
| 25 | 9 (5, 15) | 14 (10, 18) | 25 (21, 29) | 45 (39, 52) | 58 (41, 73) |
| 50 | 23 (14, 35) | 32 (26, 39) | 50 (44, 55) | 71 (66, 76) | 81 (68, 89) |
| 75 | 47 (33, 62) | 59 (51, 66) | 75 (70, 79) | 88 (85, 91) | 93 (86, 96) |
| 90 | 73 (60, 83) | 81 (76, 85) | 90 (88, 92) | 96 (95, 97) | 97 (95, 99) |
| 95 | 85 (76, 91) | 90 (87, 92) | 95 (94, 96) | 98 (97, 98) | 99 (98, 99) |

* Post-test percents calculated using likelihood ratios reported in Table 3 and Bayes’ Theorem. Confidence intervals incorporate the uncertainty in the estimated likelihood ratios.

Pre-test estimated mortality represents a hypothetical bedside assessment of the chance of dying prior to calculation of the point score.
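The entries of Table 4 follow from the odds form of Bayes' theorem applied to the LR+ values in Table 3. The sketch below is an illustration with assumed function names. It also assumes that the interval is obtained by pushing the lower and upper confidence bounds of the LR+ through the same formula, which reproduces the published intervals for the cells checked here. Results agree with Table 4 to within about one percentage point; small differences presumably reflect rounding of the published LR+ values.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# LR+ (estimate, lower 95% bound, upper 95% bound) by point total, from Table 3.
LR_TABLE_3 = {
    "0": (0.30, 0.16, 0.54),
    "1": (0.47, 0.34, 0.64),
    "2": (0.98, 0.79, 1.23),
    "3": (2.50, 1.94, 3.22),
    "4+": (4.13, 2.12, 8.07),
}

def post_test_percent(pre_test_percent, points):
    """Post-test mortality in percent as (estimate, lower, upper)."""
    p = pre_test_percent / 100.0
    return tuple(round(100 * post_test_probability(p, lr))
                 for lr in LR_TABLE_3[points])

print(post_test_percent(50, "4+"))  # (81, 68, 89), as in Table 4
print(post_test_percent(25, "3"))   # (45, 39, 52), as in Table 4
print(post_test_percent(5, "0"))    # (2, 1, 3), as in Table 4
```

Because the pre-test probability is treated as known exactly, the interval reflects only the uncertainty in the LR+.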

The comparison between predicted mortality estimated from the APACHE III score and the mortality rate predicted by the custom model is illustrated in Figure 1. Overall, there was considerable spread in the predicted mortality estimated from the APACHE III score within each point total. The Hosmer-Lemeshow goodness of fit test for the custom model in the development and validation cohorts showed no evidence of inadequate fit ($\chi^2_{df=3} = 1.5$, p=0.67 and $\chi^2_{df=3} = 1.0$, p=0.79, respectively).

Figure 1. Calibration plot.

For each patient in the validation dataset, the predicted mortality estimated from the Acute Physiology and Chronic Health Evaluation III (APACHE III) score is plotted against the point total from the custom model developed from the ARDSnet low tidal volume study. Patients with an APACHE III predicted mortality overlapping with the custom model predicted mortality in the validation cohort are shown using triangles (within). Patients for whom APACHE III predicts a greater rate of death than predicted by the simple model are shown as closed circles (above). Patients for whom APACHE III predicts a lower rate of death than predicted by the simple model are shown as open circles (below).

ROC curves for the custom model in the development and validation cohorts are compared to APACHE III in Figure 2. The custom model outperformed APACHE III in the development cohort and performed worse than APACHE III in the validation cohort. The AUC for the custom model in the derivation set was 0.72 compared to 0.67 for APACHE III (p=0.09). When applied to the validation cohort the AUC for the custom model was 0.68 while the AUC for APACHE III was 0.75 (p=0.03).

Figure 2. Receiver operating characteristic (ROC) curves for the custom model.

Panel A – comparison of the ROC curves for the custom model and APACHE III score (area 0.72 vs. 0.67, p=0.09) in the development cohort. Panel B – comparison between custom model and APACHE III (area 0.68 vs. 0.75, p= 0.03) in the validation cohort.

Twenty-eight day mortality

At 28 days, 90 (26%) patients in the development cohort were dead. Predicted 28-day mortality, observed 28-day mortality, and LR+ for the development and validation cohorts are presented in Table 5. In general, 28-day mortality was lower than hospital mortality for each point total; however, there was good agreement between predicted and observed mortality for each point total in the validation cohort. Positive likelihood ratios for each point total were similar to those reported for hospital mortality. Discrimination of the custom model in the development cohort was similar to discrimination in the validation cohort (AUC 0.71 vs. 0.71, respectively). The Hosmer-Lemeshow goodness of fit test for the custom model for 28-day mortality in the development and validation cohorts showed no evidence of inadequate fit ($\chi^2_{df=3} = 0.37$, p=0.95 and $\chi^2_{df=3} = 1.04$, p=0.79, respectively).
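For readers who wish to reproduce the goodness of fit p-values, the sketch below computes the Hosmer-Lemeshow statistic from grouped data and the upper-tail probability of a chi-square distribution with 3 degrees of freedom (five point strata minus two). The function names are assumptions, and the closed-form tail expression applies only to 3 degrees of freedom; a general implementation would use a statistics library instead.

```python
import math

def hosmer_lemeshow(groups):
    """Hosmer-Lemeshow statistic.

    groups : iterable of (n_patients, observed_deaths, mean_predicted_probability)
    """
    stat = 0.0
    for n, observed, p in groups:
        expected = n * p
        stat += (observed - expected) ** 2 / (expected * (1.0 - p))
    return stat

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Reported 28-day statistics for the development and validation cohorts.
print(round(chi2_sf_df3(0.37), 2))  # 0.95
print(round(chi2_sf_df3(1.04), 2))  # 0.79
```

Both values match the p-values reported for 28-day mortality. The hospital mortality statistics are reported to fewer decimal places (1.5 and 1.0), so p-values recomputed from them can differ from the published ones in the second decimal.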

Table 5.

Predicted and observed 28-day mortality in the derivation set (ARMA) and the validation set (ALVEOLI)*

| Total points | Predicted mortality, % | Predicted mortality, 95% CI | Observed mortality, ARMA, % | Observed mortality, ALVEOLI, % | Diagnostic likelihood ratio + (95% CI) |
| --- | --- | --- | --- | --- | --- |
| 0 | 6.6 | (3.6, 11.8) | 5.4 | 6.2 | 0.20 (0.09, 0.45) |
| 1 | 13.2 | (9.1, 18.7) | 13.0 | 10.4 | 0.42 (0.29, 0.60) |
| 2 | 24.6 | (20.0, 29.8) | 25.2 | 25.3 | 1.08 (0.87, 1.37) |
| 3 | 41.1 | (33.7, 49.0) | 42.2 | 41.8 | 2.33 (1.81, 3.00) |
| 4+ | 60.0 | (47.3, 71.6) | 55.0 | 53.3 | 3.82 (2.00, 7.27) |

* CI, confidence interval; ARMA, Acute respiratory distress syndrome network Respiratory Management in the Acute lung injury; ALVEOLI, Assessment of Low Tidal Volume and Elevated End-Expiratory Pressure to Obviate Lung Injury.

Pooled likelihood ratios for ARMA and ALVEOLI. LR+ can be multiplied by the pre-test odds of outcome to get the post-test odds of outcome. Pre-test odds can be calculated as p/(1−p), where p = pre-test probability of disease. Post-test probability is then calculated as post-test odds / (1 + post-test odds).

DISCUSSION

We developed and validated a simple, easily-calculable scoring model that accurately predicts hospital mortality for patients with ALI. Our simple point score, incorporating age, 24-hour fluid balance, hematocrit, and bilirubin, is able to discriminate patients with high mortality from those with a lower mortality. Importantly, observed mortality in the validation dataset fell within predicted mortality ranges for the point total strata, indicating good model calibration. Furthermore, the accuracy of the model’s prediction for 28-day mortality was similar to that predicting hospital mortality. These results support the use of this model as a useful clinical tool for prognostication, classification, and counseling.

Our results are notable for the excellent concordance, or calibration, between our custom model’s predicted mortality rate and the observed mortality in each point stratum within the validation cohort. Although the AUC of our model in the validation cohort was worse than in the development cohort, calibration remained intact. Discrimination refers to a model’s ability to distinguish survivors from non-survivors. The AUC represents the probability that a patient who died had a greater predicted probability of dying than a patient who survived. Calibration refers to the agreement between predicted probabilities and the actual, observed probabilities. Ideally, a predictive model should have excellent discrimination (AUC >0.9) and calibration (observed rates = predicted rates). Maximizing calibration is of primary importance when a model is used to counsel patients or their families about prognosis,15 because patients and their families are more interested in an accurate assessment of the probability of death (calibration), not necessarily in how sick the patient is relative to other patients (discrimination).15
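The probabilistic reading of the AUC given above can be computed directly by comparing every non-survivor with every survivor. The sketch below uses hypothetical point totals and an assumed function name. Ties are counted as one half, a common convention that matters here because an integer point score produces many tied pairs.

```python
def concordance_auc(scores_died, scores_survived):
    """Probability that a randomly chosen non-survivor has a higher score
    than a randomly chosen survivor, counting ties as one half."""
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))

# Hypothetical point totals for four non-survivors and five survivors.
died = [3, 4, 2, 2]
survived = [0, 1, 2, 3, 1]
print(concordance_auc(died, survived))  # 0.825
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of survivors from non-survivors. The measure says nothing about whether the predicted probabilities themselves are accurate, which is the separate question of calibration.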

This model can be used to inform prognosis (e.g. in counseling patients or families) but should not be used for decision making (e.g. withdrawal of support). The literature documenting the presence of cognitive biases in physician decision making is extensive.8,10 Confronted with the task of prognosticating in the complex environment of the ICU, physicians must assess the probability of an uncertain event. Physicians often use heuristics or simple rules-of-thumb in place of explicit analysis of probabilities to reduce these complex tasks to simpler judgements.8 While often useful when utilized by experienced ICU attending physicians,32 these heuristics can lead to severe errors in assessing the probability of an event. For example, the availability of recent memories (e.g. “the last patient I cared for…”),8,10 an aversion to changing therapeutic course (status-quo bias),11,33 or the potential to feel more responsible for an adverse outcome due to active treatment compared to inaction (regret/outcome bias)10 can unduly influence a physician’s estimates of prognosis in the ICU. There are often additional factors that ought not play a role in prognostic decision making, such as physician age, experience and religion, patient age and race, and other conscious or unconscious biases that impede rational and compassionate decision making in critically ill patients.9,34–37 These biases may contribute to the discrepancy between an attending physician’s predicted outcome and the patient’s actual outcome.38

For these reasons, there is a great need for objective measures to facilitate prognostication in critically ill patients that are immune to bias and subjectivity. To date, however, experts advocate against using traditional severity-of-illness measures (e.g. APACHE, SAPS) for decision making at the end of life for multiple reasons.32,39,40 There is little evidence to suggest that prognostication systems influence the decisions of physicians caring for patients at the end of life.41 Additional objections stem from the inability of severity scores to convey uncertainty in estimated probabilities of death, the poor concordance between individual predictions among different severity models,39 the poor performance of such models at the extremes of estimated probabilities (e.g. close to zero or to one), and the complexity involved in their calculation.42 Based upon the above limitations, we caution physicians against using our model alone for decision making in individual patients; ICU severity of illness scores, including our point score, will never predict patient outcomes with 100% certainty. Though accurate for populations of patients, such models can never truly account for all uncertainty when applied to individuals. Nonetheless, families value prognostic discussions and utilize mortality estimates to prepare emotionally for the possibility that a patient may not survive, even when they appreciate that prognostic estimates may not be correct.43,44 Providing stratum-specific estimates of mortality, such as those provided by our point score, to patients and their families has been recommended by many risk communication experts.45,46

While the use of scoring systems as a sole guide to making decisions about whether to initiate or continue to provide intensive care is inappropriate,40 they can provide an objective means for providers to inform their own assessment of prognosis. Combining clinician estimates of mortality with model estimates of mortality improves one’s overall ability to discriminate patients who live from those who die compared to either estimate alone.41,47 Given physicians’ pessimistic estimates of mortality, whether combining physician and model estimates improves agreement between the expected and actual mortality is still unclear.47,48

Providers can utilize the likelihood ratios from our model at the bedside similarly to a diagnostic test to estimate the post-test probability of death. Figure 3 illustrates a hypothetical “case study” examining how a prior probability of death of 0.4 (based upon population estimates from the literature) is updated to a probability of 0.74 with knowledge that the patient’s point score is four. It is important to note that population-based data support a pre-test mortality in all comers with ALI of approximately 40%.49 Given this estimate, most ALI patients will have post-test mortalities indicating a significant chance of surviving to hospital discharge. We also stress that, in practice, providers often have uncertainty in their estimated pre-test probability of death. Our analyses do not incorporate this uncertainty and thus confidence intervals around the post-test probabilities are too narrow.

Figure 3. Example calculation of post-test probability of death.

Confidence intervals for the post-test probability integrate uncertainty in the likelihood ratio. LR+, positive likelihood ratio; CI, confidence interval.

There are several strengths to our analysis. We utilized a well-defined cohort of patients with ALI cared for in hospitals throughout the United States. We subsequently validated our model utilizing an independent cohort of patients arising from a similar patient population. Finally, our score, utilizing only four readily available clinical variables is considerably easier to calculate than the APACHE III predicted probability of death or SAPS 3 predicted probability of death, yet maintains excellent discrimination and calibration.

We also recognize several limitations to our analysis. First, our model was derived on data from the ARDSnet low tidal volume study, a study conducted over 10 years ago. The mortality of ALI has decreased over time as implementation of evidence-based therapy in this disease has improved.50 We attempted to address this limitation by validating the model in a more contemporary population of patients (ALVEOLI); nevertheless, our model may perform differently in more current ALI cohorts. Second, our derivation population had a small number of deaths, limiting our ability to evaluate all potential predictors of death without overfitting the model.51 Third, in contrast to development of APACHE III, our model development was limited to variables available in the data set; we were unable to evaluate some potentially important predictors such as pulmonary dead space and PEEP responsiveness as they were not collected routinely in this cohort.52–54 We were also unable to evaluate the predictive ability of other comorbidities, such as chronic liver disease and metastatic cancer,55 as patients with these underlying illnesses were excluded from the parent study. Fourth, in addition to excluding trauma patients, we excluded 15% (64/414) of the cohort due to missing data to maximize the utility of our model in practice. This may have influenced the variables selected for our model and may bias the mortality within each stratum when applied. Validation of our model in populations with complete data is important prior to its routine use. Fifth, our model was derived in a cohort collected from multiple academic tertiary-care hospitals participating in a randomized trial with specific exclusion criteria. Documented differences between academic- and community-based ALI patients, and between patients enrolled versus not enrolled in randomized trials, may prevent generalization to the broader community.49 Moreover, our inclusion of fluid balance, a treatment dependent variable, may influence the performance of our model under different practice patterns. Further validation of this model in a contemporary, large, multicenter study should be performed prior to widespread adoption. Finally, APACHE III was developed to predict mortality utilizing data during the first 24 hrs of ICU stay; therefore, our use of APACHE III scores generated at the time of enrollment may have resulted in underperformance of APACHE III.

Conclusions

We have developed a simple prognostic score that accurately identifies groups of ALI patients at high risk of death. This model can facilitate a provider’s assessment of prognosis when informing patients and their families about the possible outcomes of ALI. Prior to widespread use, this model should be validated in contemporary populations outside of clinical trials.
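To illustrate how little computation the score requires, the following is a minimal sketch in Python that tallies the points and looks up the predicted hospital mortality reported in the abstract for each point total. The function and parameter names are hypothetical, and the sketch assumes that fluid balance is expressed as net liters positive; it is an illustration of the arithmetic only, not a validated clinical tool.

```python
# Predicted hospital mortality by point total, as reported in the abstract.
# Key 4 stands for "4 or more points".
PREDICTED_MORTALITY = {0: 0.08, 1: 0.17, 2: 0.31, 3: 0.51, 4: 0.70}


def ali_point_score(age_years, hematocrit_pct, bilirubin_mg_dl, fluid_balance_l):
    """Return the point total for one patient (hypothetical helper).

    fluid_balance_l is assumed to be the net fluid balance in liters,
    with positive values meaning net fluid gain.
    """
    points = 0
    if hematocrit_pct < 26:        # hematocrit < 26%: 1 point
        points += 1
    if bilirubin_mg_dl >= 2:       # bilirubin >= 2 mg/dl: 1 point
        points += 1
    if fluid_balance_l > 2.5:      # more than 2.5 liters positive: 1 point
        points += 1
    if age_years >= 65:            # age >= 65 years: 2 points
        points += 2
    elif age_years >= 40:          # age 40-64 years: 1 point
        points += 1
    return points


def predicted_mortality(points):
    """Look up predicted mortality; totals of 4 or 5 share the "4+" group."""
    return PREDICTED_MORTALITY[min(points, 4)]


if __name__ == "__main__":
    # Hypothetical patient: 58 years old, hematocrit 24%, bilirubin 1.1 mg/dl,
    # net fluid balance of 3.2 liters positive.
    score = ali_point_score(58, 24, 1.1, 3.2)
    print(score, predicted_mortality(score))
```

The lookup returns the point estimates only; the 95% confidence intervals given in the abstract are not encoded here.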

Acknowledgments

Financial Support: F32 HL090220, N01 HR46055, N01 HR46058

Footnotes

This study was conducted at the University of Pennsylvania and the University of Washington.

Conflict of interest: All authors have no conflicts of interest to disclose.

Contributor Information

Colin R. Cooke, Email: crcooke@u.washington.edu.

Chirag V. Shah, Email: chirag.shah@uphs.upenn.edu.

Robert Gallop, Email: RGallop@wcupa.edu.

Scarlett Bellamy, Email: sbellamy@cceb.upenn.edu.

Marek Ancukiewicz, Email: msa@biostat.mgh.harvard.edu.

Mark D. Eisner, Email: mark.eisner@ucsf.edu.

Paul N. Lanken, Email: lanken@mail.med.upenn.edu.

A. Russell Localio, Email: rlocalio@mail.med.upenn.edu.

Jason D. Christie, Email: jchristi@mail.med.upenn.edu.

References

  • 1. Herridge MS, Cheung AM, Tansey CM, et al. One-year outcomes in survivors of the acute respiratory distress syndrome. N Engl J Med. 2003;348:683–693. doi: 10.1056/NEJMoa022450.
  • 2. Rubenfeld GD, Herridge MS. Epidemiology and outcomes of acute lung injury. Chest. 2007;131:554–562. doi: 10.1378/chest.06-1976.
  • 3. Prendergast TJ, Claessens MT, Luce JM. A national survey of end-of-life care for critically ill patients. Am J Respir Crit Care Med. 1998;158:1163–1167. doi: 10.1164/ajrccm.158.4.9801108.
  • 4. Cook D, Rocker G, Marshall J, et al. Withdrawal of mechanical ventilation in anticipation of death in the intensive care unit. N Engl J Med. 2003;349:1123–1132. doi: 10.1056/NEJMoa030083.
  • 5. Stapleton RD, Wang BM, Hudson LD, et al. Causes and timing of death in patients with ARDS. Chest. 2005;128:525–532. doi: 10.1378/chest.128.2.525.
  • 6. Luce JM, White DB. The pressure to withhold or withdraw life-sustaining therapy from critically ill patients in the United States. Am J Respir Crit Care Med. 2007;175:1104–1108. doi: 10.1164/rccm.200609-1397CP.
  • 7. White DB, Engelberg RA, Wenrich MD, et al. Prognostication during physician-family discussions about limiting life support in intensive care units. Crit Care Med. 2007;35:442–448. doi: 10.1097/01.CCM.0000254723.28270.14.
  • 8. Tversky A, Kahneman D. Judgment under Uncertainty: Heuristics and Biases. Science. 1974;185:1124–1131. doi: 10.1126/science.185.4157.1124.
  • 9. Christakis NA, Asch DA. Biases in how physicians choose to withdraw life support. Lancet. 1993;342:642–646. doi: 10.1016/0140-6736(93)91759-f.
  • 10. Bornstein BH, Emler AC. Rationality in medical decision making: a review of the literature on doctors’ decision-making biases. J Eval Clin Pract. 2001;7:97–107. doi: 10.1046/j.1365-2753.2001.00284.x.
  • 11. Aberegg SK, Haponik EF, Terry PB. Omission bias and decision making in pulmonary and critical care medicine. Chest. 2005;128:1497–1505. doi: 10.1378/chest.128.3.1497.
  • 12. Knaus WA, Wagner DP, Draper EA, et al. The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults. Chest. 1991;100:1619–1636. doi: 10.1378/chest.100.6.1619.
  • 13. Moreno RP, Metnitz PG, Almeida E, et al. SAPS 3--From evaluation of the patient to evaluation of the intensive care unit. Part 2: Development of a prognostic model for hospital mortality at ICU admission. Intensive Care Med. 2005;31:1345–1355. doi: 10.1007/s00134-005-2763-5.
  • 14. Herridge MS. Prognostication and intensive care unit outcome: the evolving role of scoring systems. Clin Chest Med. 2003;24:751–762. doi: 10.1016/s0272-5231(03)00094-7.
  • 15. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med. 1999;130:515–524. doi: 10.7326/0003-4819-130-6-199903160-00016.
  • 16. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome. The Acute Respiratory Distress Syndrome Network. N Engl J Med. 2000;342:1301–1308. doi: 10.1056/NEJM200005043421801.
  • 17. Bernard GR, Artigas A, Brigham KL, et al. The American-European Consensus Conference on ARDS. Definitions, mechanisms, relevant outcomes, and clinical trial coordination. Am J Respir Crit Care Med. 1994;149:818–824. doi: 10.1164/ajrccm.149.3.7509706.
  • 18. Calfee CS, Eisner MD, Ware LB, et al. Trauma-associated lung injury differs clinically and biologically from acute lung injury due to other clinical disorders. Crit Care Med. 2007;35:2243–2250. doi: 10.1097/01.ccm.0000280434.33451.87.
  • 19. Murray JF, Matthay MA, Luce JM, et al. An expanded definition of the adult respiratory distress syndrome. Am Rev Respir Dis. 1988;138:720–723. doi: 10.1164/ajrccm/138.3.720.
  • 20. Hastie T, Tibshirani R. Generalized Additive-Models - Some Applications. Journal of the American Statistical Association. 1987;82:371–386.
  • 21. Efron B, Tibshirani R. An Introduction to the bootstrap. New York: Chapman & Hall; 1993.
  • 22. Altman DG, Andersen PK. Bootstrap investigation of the stability of a Cox regression model. Stat Med. 1989;8:771–783. doi: 10.1002/sim.4780080702.
  • 23. Austin PC, Tu JV. Bootstrap methods for developing predictive models. American Statistician. 2004;58:131–137.
  • 24. Moons KG, Harrell FE, Steyerberg EW. Should scoring rules be based on odds ratios or regression coefficients? J Clin Epidemiol. 2002;55:1054–1055. doi: 10.1016/s0895-4356(02)00453-5.
  • 25. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36. doi: 10.1148/radiology.143.1.7063747.
  • 26. Sackett DL, Haynes RB, Tugwell P. Clinical epidemiology: a basic science for clinical medicine. 1st ed. Boston: Little, Brown; 1985.
  • 27. Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ. 2004;329:168–169. doi: 10.1136/bmj.329.7458.168.
  • 28. Lemeshow S, Hosmer DW Jr. A review of goodness of fit statistics for use in the development of logistic regression models. Am J Epidemiol. 1982;115:92–106. doi: 10.1093/oxfordjournals.aje.a113284.
  • 29. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
  • 30. Brower RG, Lanken PN, MacIntyre N, et al. Higher versus lower positive end-expiratory pressures in patients with the acute respiratory distress syndrome. N Engl J Med. 2004;351:327–336. doi: 10.1056/NEJMoa032193.
  • 31. Rubin DB, Schenker N. Multiple imputation in health-care databases: an overview and some applications. Stat Med. 1991;10:585–598. doi: 10.1002/sim.4780100410.
  • 32. Sinuff T, Adhikari NK, Cook DJ, et al. Mortality predictions in the intensive care unit: comparing physicians with scoring systems. Crit Care Med. 2006;34:878–885. doi: 10.1097/01.CCM.0000201881.58644.41.
  • 33. Redelmeier DA, Shafir E. Medical decision making in situations that offer multiple alternatives. JAMA. 1995;273:302–305. doi: 10.1001/jama.1995.03520280048038.
  • 34. Cook DJ, Guyatt GH, Jaeschke R, et al. Determinants in Canadian health care workers of the decision to withdraw life support from the critically ill. Canadian Critical Care Trials Group. JAMA. 1995;273:703–708.
  • 35. Christakis NA, Asch DA. Medical specialists prefer to withdraw familiar technologies when discontinuing life support. J Gen Intern Med. 1995;10:491–494. doi: 10.1007/BF02602399.
  • 36. Christakis NA, Asch DA. Physician characteristics associated with decisions to withdraw life support. Am J Public Health. 1995;85:367–372. doi: 10.2105/ajph.85.3.367.
  • 37. Hinkka H, Kosunen E, Metsanoja R, et al. Factors affecting physicians’ decisions to forgo life-sustaining treatments in terminal care. J Med Ethics. 2002;28:109–114. doi: 10.1136/jme.28.2.109.
  • 38. Detsky AS, Stricker SC, Mulley AG, et al. Prognosis, survival, and the expenditure of hospital resources for patients in an intensive-care unit. N Engl J Med. 1981;305:667–672. doi: 10.1056/NEJM198109173051204.
  • 39. Lemeshow S, Klar J, Teres D. Outcome prediction for individual intensive care patients: useful, misused, or abused? Intensive Care Med. 1995;21:770–776. doi: 10.1007/BF01704747.
  • 40. Consensus statement of the Society of Critical Care Medicine’s Ethics Committee regarding futile and other possibly inadvisable treatments. Crit Care Med. 1997;25:887–891. doi: 10.1097/00003246-199705000-00028.
  • 41. Knaus WA, Harrell FE Jr, Lynn J, et al. The SUPPORT prognostic model. Objective estimates of survival for seriously ill hospitalized adults. Study to understand prognoses and preferences for outcomes and risks of treatments. Ann Intern Med. 1995;122:191–203. doi: 10.7326/0003-4819-122-3-199502010-00007.
  • 42. Metnitz PG, Moreno RP, Almeida E, et al. SAPS 3--From evaluation of the patient to evaluation of the intensive care unit. Part 1: Objectives, methods and cohort description. Intensive Care Med. 2005;31:1336–1344. doi: 10.1007/s00134-005-2762-6.
  • 43. Zier LS, Burack JH, Micco G, et al. Doubt and belief in physicians’ ability to prognosticate during critical illness: the perspective of surrogate decision makers. Crit Care Med. 2008;36:2341–2347. doi: 10.1097/CCM.0b013e318180ddf9.
  • 44. Evans LR, Boyd EA, Malvar G, et al. Surrogate Decision Makers’ Perspectives on Discussing Prognosis in the Face of Uncertainty. Am J Respir Crit Care Med. 2008:200806–200969OC. doi: 10.1164/rccm.200806-969OC.
  • 45. Gigerenzer G, Edwards A. Simple tools for understanding risks: from innumeracy to insight. BMJ. 2003;327:741–744. doi: 10.1136/bmj.327.7417.741.
  • 46. Thomson R, Edwards A, Grey J. Risk communication in the clinical consultation. Clin Med. 2005;5:465–469. doi: 10.7861/clinmedicine.5-5-465.
  • 47. Rocker G, Cook D, Sjokvist P, et al. Clinician predictions of intensive care unit mortality. Crit Care Med. 2004;32:1149–1154. doi: 10.1097/01.ccm.0000126402.51524.52.
  • 48. Wildman MJ, Sanderson C, Groves J, et al. Implications of prognostic pessimism in patients with chronic obstructive pulmonary disease (COPD) or asthma admitted to intensive care in the UK within the COPD and asthma outcome study (CAOS): multicentre observational cohort study. BMJ. 2007;335:1132. doi: 10.1136/bmj.39371.524271.55.
  • 49. Rubenfeld GD, Caldwell E, Peabody E, et al. Incidence and outcomes of acute lung injury. N Engl J Med. 2005;353:1685–1693. doi: 10.1056/NEJMoa050333.
  • 50. Zambon M, Vincent JL. Mortality rates for patients with ALI/ARDS have decreased over time. Chest. 2008. doi: 10.1378/chest.07-2134.
  • 51. Peduzzi P, Concato J, Kemper E, et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49:1373–1379. doi: 10.1016/s0895-4356(96)00236-3.
  • 52. Nuckton TJ, Alonso JA, Kallet RH, et al. Pulmonary dead-space fraction as a risk factor for death in the acute respiratory distress syndrome. N Engl J Med. 2002;346:1281–1286. doi: 10.1056/NEJMoa012835.
  • 53. Ware LB. Prognostic determinants of acute respiratory distress syndrome in adults: impact on clinical trial design. Crit Care Med. 2005;33:S217–222. doi: 10.1097/01.ccm.0000155788.39101.7e.
  • 54. Gattinoni L, Caironi P, Cressoni M, et al. Lung recruitment in patients with the acute respiratory distress syndrome. N Engl J Med. 2006;354:1775–1786. doi: 10.1056/NEJMoa052052.
  • 55. Cooke CR, Kahn JM, Caldwell E, et al. Predictors of hospital mortality in a population-based cohort of patients with acute lung injury. Crit Care Med. 2008;36:1412–1420. doi: 10.1097/CCM.0b013e318170a375.
