Discriminative Accuracy of Physician and Nurse Predictions for Survival and Functional Outcomes 6 Months After an ICU Admission

Michael E Detsky; Michael O Harhay; Dominique F Bayard; Aaron M Delman; Anna E Buehler; Saida A Kent; Isabella V Ciuffetelli; Elizabeth Cooney; Nicole B Gabler; Sarah J Ratcliffe; Mark E Mikkelsen; Scott D Halpern

doi:10.1001/jama.2017.4078

JAMA. 2017 Jun 6;317(21):2187–2195. doi: 10.1001/jama.2017.4078

Michael E Detsky ¹,²,³,⁴, Michael O Harhay ¹,⁵,⁶, Dominique F Bayard ⁷, Aaron M Delman ⁸, Anna E Buehler ⁹, Saida A Kent ¹⁰, Isabella V Ciuffetelli ¹, Elizabeth Cooney ¹,⁶, Nicole B Gabler ¹, Sarah J Ratcliffe ⁵, Mark E Mikkelsen ⁵,¹¹, Scott D Halpern ¹,⁵,⁶,¹¹,¹²,✉

¹Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia

²Sinai Health System, Toronto, Ontario, Canada

³Department of Medicine, University of Toronto, Toronto, Ontario, Canada

⁴Interdepartmental Division of Critical Care, University of Toronto, Toronto, Ontario, Canada

⁵Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia

⁶Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia

⁷Pulmonary and Critical Care of Atlanta, Atlanta, Georgia

⁸Wayne State University School of Medicine, Detroit, Michigan

⁹University of California, San Diego School of Medicine

¹⁰University of Kentucky College of Medicine, Lexington

¹¹Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia

¹²Department of Medical Ethics and Health Policy, University of Pennsylvania, Philadelphia

✉ Corresponding Author: Scott D. Halpern, MD, PhD, University of Pennsylvania Perelman School of Medicine, 726 Blockley Hall, 423 Guardian Dr, Philadelphia, PA 19104-6021 (shalpern@upenn.edu).

Accepted for Publication: March 22, 2017.

Published Online: May 21, 2017. doi:10.1001/jama.2017.4078

Author Contributions: Dr Detsky had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Detsky, Delman, Cooney, Ratcliffe, Mikkelsen, Halpern.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Detsky, Harhay, Delman, Mikkelsen, Halpern.

Critical revision of the manuscript for important intellectual content: Detsky, Harhay, Bayard, Buehler, Kent, Ciuffetelli, Cooney, Gabler, Ratcliffe, Mikkelsen, Halpern.

Statistical analysis: Detsky, Harhay, Kent, Gabler, Halpern.

Obtained funding: Halpern.

Administrative, technical, or material support: Detsky, Bayard, Delman, Kent, Ciuffetelli, Cooney, Mikkelsen, Halpern.

Supervision: Detsky, Cooney, Ratcliffe, Mikkelsen, Halpern.

Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Mikkelsen reported receiving a grant from the National Institute of Nursing Research. No other authors reported disclosures.

Funding/Support: Dr Detsky was supported by the National Heart, Lung, and Blood Institute (T32-HL098054), and Dr Harhay was supported by the National Heart, Lung, and Blood Institute (F31-HL127947). This work was also supported in part by a grant from the Otto Haas Charitable Trust to Dr Halpern.

Role of the Funders/Sponsors: The funders had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; or decision to submit the manuscript for publication.

Meeting Presentation: This article was presented at the American Thoracic Society International Conference; May 21, 2017; Washington, DC.

Additional Contributions: We wish to thank Steven P. Gale, PhD, for his technical assistance with figure preparation. Dr Gale received no financial compensation for his contribution.

PMCID: PMC5710341 PMID: 28528347

Key Points

Question

What is the discriminative accuracy of physicians and nurses in predicting 6-month mortality and functional outcomes of critically ill patients?

Findings

In this prospective cohort study of 303 critically ill patients, physicians most accurately predicted 6-month mortality (positive likelihood ratio, 5.91; negative likelihood ratio, 0.41) and least accurately predicted cognition (positive likelihood ratio, 2.36; negative likelihood ratio, 0.75). Nurses’ predictions were similar or less accurate.

Meaning

Intensive care unit physicians’ and nurses’ discriminative accuracy in predicting 6-month outcomes of critically ill patients varied depending on the outcome being predicted and confidence of the predictor, but further research is needed to better understand how clinicians derive prognostic estimates of long-term outcomes.

Abstract

Importance

Predictions of long-term survival and functional outcomes influence decision making for critically ill patients, yet little is known regarding their accuracy.

Objective

To determine the discriminative accuracy of intensive care unit (ICU) physicians and nurses in predicting 6-month patient mortality and morbidity, including ambulation, toileting, and cognition.

Design, Setting, and Participants

Prospective cohort study conducted in 5 ICUs in 3 hospitals in Philadelphia, Pennsylvania, and enrolling patients who spent at least 3 days in the ICU from October 2013 until May 2014 and required mechanical ventilation, vasopressors, or both. These patients’ attending physicians and bedside nurses were also enrolled. Follow-up was completed in December 2014.

Main Outcomes and Measures

ICU physicians’ and nurses’ binary predictions of in-hospital mortality and 6-month outcomes, including mortality, return to original residence, ability to toilet independently, ability to ambulate up 10 stairs independently, and ability to remember most things, think clearly, and solve day-to-day problems (ie, normal cognition). For each outcome, physicians and nurses provided a dichotomous prediction and rated their confidence in that prediction on a 5-point Likert scale. Outcomes were assessed via interviews with surviving patients or their surrogates at 6 months. Discriminative accuracy was measured using positive and negative likelihood ratios (LRs), C statistics, and other operating characteristics.

Results

Among 340 patients approached, 303 (89%) consented (median age, 62 years [interquartile range, 53-71]; 57% men; 32% African American); 6-month follow-up was completed for 299 (99%), of whom 169 (57%) were alive. Predictions were made by 47 physicians and 128 nurses. Physicians most accurately predicted 6-month mortality (positive LR, 5.91 [95% CI, 3.74-9.32]; negative LR, 0.41 [95% CI, 0.33-0.52]; C statistic, 0.76 [95% CI, 0.72-0.81]) and least accurately predicted cognition (positive LR, 2.36 [95% CI, 1.36-4.12]; negative LR, 0.75 [95% CI, 0.61-0.92]; C statistic, 0.61 [95% CI, 0.54-0.68]). Nurses most accurately predicted in-hospital mortality (positive LR, 4.71 [95% CI, 2.94-7.56]; negative LR, 0.61 [95% CI, 0.49-0.75]; C statistic, 0.68 [95% CI, 0.62-0.74]) and least accurately predicted cognition (positive LR, 1.50 [95% CI, 0.86-2.60]; negative LR, 0.88 [95% CI, 0.73-1.06]; C statistic, 0.55 [95% CI, 0.48-0.62]). Discriminative accuracy was higher when physicians and nurses were confident about their predictions (eg, for physicians’ confident predictions of 6-month mortality: positive LR, 33.00 [95% CI, 8.34-130.63]; negative LR, 0.18 [95% CI, 0.09-0.35]; C statistic, 0.90 [95% CI, 0.84-0.96]). Compared with a predictive model including objective clinical variables, a model that also included physician and nurse predictions had significantly higher discriminative accuracy for in-hospital mortality, 6-month mortality, and return to original residence (P < .01 for all).

Conclusions and Relevance

ICU physicians’ and nurses’ discriminative accuracy in predicting 6-month outcomes of critically ill patients varied depending on the outcome being predicted and confidence of the predictors. Further research is needed to better understand how clinicians derive prognostic estimates of long-term outcomes.

This study assesses the 6-month discriminative accuracy of patient mortality and functional outcome predictions made by intensive care unit (ICU) physicians and nurses at the time of ICU admission.

Introduction

Providing prognostic guidance has long been a core physician responsibility and is an essential part of shared decision making, which requires integrating prognostic assessments with patients’ values and preferences. Prior studies have shown that intensive care unit (ICU) physicians are moderately accurate in predicting in-hospital mortality, but evaluations of ICU physicians’ abilities to predict longer-term mortality and functional outcomes have been limited to patients who require long-term mechanical ventilation. Furthermore, although nurses are not specifically charged with formulating and conveying prognostic judgments, many do so given the greater time they spend with patients, and there are no data regarding their accuracy.

Clinicians’ abilities to discriminate between patients who will or will not develop unfavorable outcomes other than mortality are important for several reasons. First, knowledge of future function may be very important to ICU patients, their families, and clinicians. This is particularly true as increasing numbers of patients survive the ICU but experience long-term impairments in cognition and overall function. Second, multidisciplinary family meetings commonly focus on function and quality of life. Third, predictions of future function may influence clinician behavior, as physicians are more likely to offer the withdrawal of life support when they believe the patient will experience future dysfunction. Fourth, recent qualitative work has highlighted a central tension for ICU clinicians “between their professional responsibility to discuss likely functional outcomes vs uncertainty about their ability to predict those outcomes for an individual patient.” This study sought to determine how well ICU physicians and nurses can determine which patients recently admitted to an ICU and requiring life support will have died or will experience unfavorable functional outcomes 6 months later.

Methods

Study Population

This was a prospective cohort study in 5 ICUs (3 medical, 2 surgical) in 3 hospitals in the University of Pennsylvania Health System that vary in their levels of specialization and incorporation of trainees into patient care. Recruitment occurred from October 2013 through May 2014, and written or oral informed consent was obtained from physicians, nurses, patients, and surrogates. Consent for patients was sought from surrogates when patients lacked capacity. The University of Pennsylvania institutional review board approved this study.

Patients were eligible to be enrolled from their third through sixth ICU day if they had received mechanical ventilation for more than 48 consecutive hours, vasoactive infusions for more than 24 consecutive hours, or both. Patients were enrolled 3 to 6 days after ICU admission because guidelines recommend that family conferences occur early in an ICU admission. The use of either vasoactive infusions or mechanical ventilation was required to target a population of patients who were sufficiently sick that they would require ICU-level care in essentially all hospitals. Patients receiving ICU-level care for fewer than 3 days were excluded because outcomes among such patients are often readily foreseeable and because such patients may be less likely to benefit from family discussions and prognostic assessments. Other exclusion criteria are listed in eAppendix 1 in the Supplement.

Recruited clinicians were the patient’s attending physician and primary bedside registered nurse on the day the patient was enrolled. Physicians were required to have been the patient’s attending of record for at least 2 calendar days. Nurses were only required to have 1 day of contact, owing to greater variability in their daily schedule and more frequent contact with patients on a given day.

Clinicians’ Predictions

Clinicians were asked to predict 6 outcomes that were easily understood by patients, surrogates, and physicians during pilot testing (eTable 1 in the Supplement). These outcomes were hospital survival plus 5 outcomes at 6 months from the time of study enrollment: survival, return to original residence, ability to toilet independently, ability to ambulate up 10 stairs independently, and ability to “remember most things, think clearly, and solve day-to-day problems” (ie, cognitive function). For toileting, ambulation, and cognition, clinicians were asked to make predictions conditional on the assumption that the patient would be alive at 6 months. Clinicians were required to provide a dichotomous prediction of whether the outcome would be achieved and to then state their confidence in each prediction using a 5-point Likert scale ranging from 1 (not confident at all) to 5 (very confident) (eAppendix 2 in the Supplement). Predictions were made within 24 hours of patient enrollment.

Patient Data Collection

Baseline patient information was collected prospectively from the electronic health record and through in-person or telephone interviews with patients or surrogates. The clinical variables assessed included prior physical and cognitive function, medical comorbidities, and Acute Physiology and Chronic Health Evaluation III (APACHE III) scores. Interviews with patients or surrogates were used to determine patients’ baseline function (ie, toileting, ambulation, and cognition) and place of residence, using scales identical to those ultimately used for assessments during follow-up at 6 months. Contact information for patients and surrogates was obtained during enrollment.

6-Month Patient Data Collection

In-hospital mortality was determined by reviewing the electronic health record. If patients were not confirmed to be dead at 6 months, a trained research assistant unaware of the clinicians’ baseline predictions attempted to contact survivors via telephone or email to determine the patient’s vital status and function. If initial attempts to contact the patient were unsuccessful, attempts to contact both the patient and the surrogate occurred twice weekly, and whoever was contacted first was interviewed using a standardized script. Patients were considered lost to follow-up after 5 failed contact attempts.

Statistical Analysis

The discriminative accuracy of predictions was derived from 2 × 2 tables (eTable 2 in the Supplement). All patients were included in analyses of in-hospital mortality, 6-month mortality, and return to original place of residence; 6-month survivors were included in analyses of ambulation, toileting, and cognition. Patients’ and surrogates’ reports of 6-month outcomes were considered equivalent.

For each outcome, 8 operating characteristics were calculated separately for physicians’ and nurses’ predictions: sensitivity, specificity, C statistic, positive predictive value, negative predictive value, positive likelihood ratio (LR), negative LR, and diagnostic odds ratio. The primary results are presented as LRs and C statistics because these combine both sensitivity and specificity and are minimally influenced by outcome prevalence. A higher positive LR indicates greater discriminative accuracy in predicting adverse outcomes, with positive LR values of 5 to 10 yielding “modest shifts” in the probability of an event and values greater than 10 yielding “conclusive changes.” A lower negative LR indicates greater discriminative accuracy in predicting favorable outcomes, with negative LR values of 0.1 to 0.2 yielding “modest shifts” in the probability of an event and values less than 0.1 yielding “conclusive changes.”
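The operating characteristics listed above can be sketched from a single 2 × 2 table as follows. This is an illustration rather than the authors' Stata code. The function name `operating_characteristics` is hypothetical, the CIs use a standard log-scale large-sample approximation (the article does not state which interval method was used), and the example counts are an assumption: they were back-calculated to be consistent with the reported physician results for 6-month mortality and are not taken from the Supplement.

```python
import math


def operating_characteristics(tp, fn, fp, tn, z=1.96):
    """Operating characteristics of a dichotomous prediction from a 2 x 2 table.

    tp: adverse outcome predicted and observed
    fn: adverse outcome observed but not predicted
    fp: adverse outcome predicted but not observed
    tn: adverse outcome neither predicted nor observed
    Assumes all four cells are nonzero.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec

    # Standard errors of log(LR); assumed method, not confirmed by the article.
    se_pos = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    se_neg = math.sqrt(1 / fn - 1 / (tp + fn) + 1 / tn - 1 / (fp + tn))

    def ci(lr, se):
        return (lr * math.exp(-z * se), lr * math.exp(z * se))

    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": lr_pos,
        "lr_pos_ci": ci(lr_pos, se_pos),
        "lr_neg": lr_neg,
        "lr_neg_ci": ci(lr_neg, se_neg),
        "diagnostic_odds_ratio": lr_pos / lr_neg,
        # A single binary prediction gives one ROC operating point,
        # so the area under the curve reduces to (sens + spec) / 2.
        "c_statistic": (sens + spec) / 2,
    }


# Illustrative counts (assumed, back-calculated from the reported values).
result = operating_characteristics(tp=81, fn=47, fp=18, tn=150)
for name, value in result.items():
    print(name, value)
```

With these assumed counts, the sketch yields a positive LR near 5.91, a negative LR near 0.41, and a C statistic near 0.76, in line with the values reported for physicians' predictions of 6-month mortality.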

Agreement between physicians and nurses was calculated using unweighted κ scores. To evaluate differences in clinicians’ optimism or pessimism, the McNemar test was used to evaluate dyads of predictions made by physicians and nurses. Discriminative accuracy between physicians and nurses was compared using C statistics for each of the 6 outcomes.
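A minimal sketch of the two paired-prediction statistics named above, assuming each patient has one dichotomous prediction from a physician and one from a nurse. The function names and the example counts are hypothetical, and the exact binomial form of the McNemar test is used here as an assumption; the article does not say which variant was run in Stata.

```python
from math import comb


def cohen_kappa(a, b, c, d):
    """Unweighted kappa for a paired 2 x 2 table.

    a: both clinicians predict the adverse outcome
    b: physician predicts it, nurse does not
    c: nurse predicts it, physician does not
    d: neither predicts it
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from each rater's marginal proportions.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar P value based on the discordant pairs only."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical dyads: 100 patients, 30 discordant predictions.
kappa = cohen_kappa(a=40, b=10, c=20, d=30)
p_value = mcnemar_exact_p(b=10, c=20)
print(kappa, p_value)
```

Kappa summarizes how often the two clinicians agree beyond chance, whereas the McNemar test asks whether disagreements lean in one direction, which is how systematic optimism or pessimism of one group would show up.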

Four secondary analyses were specified a priori. First, confident predictions, defined as those rated as a 4 or 5 by clinicians on the Likert scale, were compared with nonconfident predictions using C statistics for correlated data. These analyses were performed separately for physicians and nurses. Second, operating characteristics were measured for predictions for which the physician and nurse were concordant and both clinicians were confident. Third, LRs for physician and nurse predictions of 6-month mortality were compared across quartiles of patients’ APACHE III scores. Fourth, C statistics for discriminating hospital mortality and 6-month outcomes were calculated using available patient demographics and clinical data. These results were then compared with models that included these same variables as well as physician and nurse predictions and confidence. A target sample size of 300 patients was chosen to yield 80% power to detect 95% CIs no wider than 20% around sensitivity and specificity estimates if these operating characteristics were approximately 75%.
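To give a feel for the precision target in the sample size statement, the sketch below computes the full width of a Wald 95% CI around a proportion of 0.75 for several subgroup sizes. It is a simplified illustration only: the function name `wald_ci_width` is hypothetical, the Wald interval is an assumption, and the power component of the authors' calculation is not modeled.

```python
import math


def wald_ci_width(p, n, z=1.96):
    """Full width of the Wald CI for a proportion p estimated from n patients."""
    return 2 * z * math.sqrt(p * (1 - p) / n)


# Sensitivity is estimated only among patients with the outcome and
# specificity only among those without it, so n here is a subgroup size.
for n in (50, 75, 100, 150):
    print(n, round(wald_ci_width(0.75, n), 3))
```

Under these assumptions, a subgroup of about 75 patients is enough for the interval to be just under 20 percentage points wide.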

Results were considered significant at α<.05 (2-sided). Because less than 3% of data points were missing for all analyses, missing observations were excluded from relevant analyses. All analyses were conducted using Stata version 13.1 (StataCorp).

Results

Patient and Clinician Characteristics

Among 340 eligible patients (or their surrogates) approached for enrollment, 303 (89%) consented to participate (eFigure 1 in the Supplement). Patients had a median age of 62 years (interquartile range, 53-71), 173 (57%) were men, and 190 (63%) and 113 (37%) were admitted to medical and surgical ICUs, respectively. Prior to their critical illness, 283 patients (94%) lived at home, 243 (81%) could ambulate up 10 stairs independently, 267 (88%) could toilet independently, and 249 (83%) had normal cognition (Table 1). Outcomes at 6 months were verified for 299 patients (99%). Of the 169 patients (57%) confirmed to be alive at 6 months, functional outcomes were verified by patients for 87 (51%) and by surrogates for 82 (49%).

Table 1. Patient Characteristics.

Characteristic	Patients, No. (%) (n = 303)
Age, median (IQR), y^a	62 (53-71)
Men^a	173 (57.1)
Race
White	191 (63.0)
African American	98 (32.3)
Some college or more	147 (49.8)
Married or living with partner	150 (50.1)
Employed	85 (28.4)
Living at home prior to critical illness^b	283 (94.1)
Insurance status^a
Private	106 (35.6)
Medicare	161 (54.0)
Hospitalized in prior year	213 (70.3)
Able to ambulate up 10 stairs before hospitalization^b	243 (80.7)
Toileting independently before hospitalization^b	267 (88.4)
Normal cognition before hospitalization^b	249 (83.0)
Coronary artery disease^a	103 (34.0)
Peripheral vascular disease^a	61 (20.1)
Chronic obstructive pulmonary disease^a	66 (21.8)
Renal failure requiring dialysis^a	30 (9.9)
Liver disease^a	35 (11.6)
Obesity^a	106 (35.0)
Rheumatologic condition^a	50 (16.5)
Psychiatric condition^a	104 (34.3)
Malignancy (treated for cure)^a	65 (21.5)
Malignancy (metastatic or palliative)^a	40 (13.2)
Transplant history^a,c	21 (6.9)
ICU type^a
Medical	190 (62.7)
Surgical	113 (37.3)
ICU admitting diagnosis^a
Respiratory failure	83 (27.4)
Sepsis	66 (21.8)
Nonemergency surgery	54 (17.8)
Emergency surgery	34 (11.2)
Cardiac (nonsurgical)	18 (5.9)
Hemorrhagic shock	11 (3.6)
Other	37 (12.2)
APACHE III score, median (IQR)^a	96 (75-120)
ICU length of stay, median (IQR), d^a	8 (5-14)
Hospital length of stay, median (IQR), d^a	17 (11-27)
Required ventilation^a	276 (91.1)
Ventilator days, median (IQR)^a	6 (3-10)
Required vasoactive infusions^a	247 (81.5)
Received dialysis^a	64 (21.1)
Goals of care made palliative in the ICU^a	73 (24.1)
Discharge disposition^a
Dead	72 (23.8)
Home	90 (29.7)
Other	141 (46.5)
6-mo disposition^b
Dead	130 (43.5)
Original place of residence	138 (46.2)
Other	31 (10.3)

Abbreviations: APACHE III, Acute Physiology and Chronic Health Evaluation III; ICU, intensive care unit; IQR, interquartile range.

^a Abstracted from chart.

^b Data acquired from patient or surrogate.

^c Transplant includes lung, liver, kidney, bone marrow.

Forty-seven physicians and 128 nurses contributed predictions for at least 1 patient. Almost half of patients (47%) were enrolled on ICU day 3, and the remainder were enrolled on days 4 to 6. At the time of the interview, 84% of physicians and 90% of nurses reported having already thought about patients’ future morbidity, but only 39% and 31%, respectively, reported having discussed this with the patient or surrogate (Table 2).

Table 2. Physician and Nurse Characteristics^a.

Characteristic	Physicians (n = 47), No. (%)	Nurses (n = 128), No. (%)
Age, median (IQR), y	41 (38-52)	29 (27-37)
Sex
Men	37 (80)	20 (16)
Women	9 (20)	108 (84)
Time since graduation from medical or nursing school, y
Physicians
>30	10 (21)
20-29	7 (15)
15-19	8 (17)
10-14	17 (36)
<10	5 (11)
Nurses
>25		6 (5)
15-24		18 (15)
10-14		15 (13)
5-9		64 (53)
<5		17 (14)
Days involved with patient^b
1	0/300 (0)	197/302 (65)
2	93/300 (31)	81/302 (27)
3	120/300 (40)	18/302 (6)
4	58/300 (19)	5/302 (2)
5	23/300 (8)	0/302 (0)
6	6/300 (2)	1/302 (<1)
Considered patient’s future morbidity before making prognostic estimates^b	252/300 (84)	272/302 (90)
Spoke to family about patient’s future morbidity before making prognostic estimates^b	117/300 (39)	92/300 (31)

Abbreviation: IQR, interquartile range.

^a One nurse did not report age; 8 nurses did not report their year of graduation.

^b Numbers after forward slashes indicate numbers of patients who received a prediction from the physician or nurse.

Overall Predictions

Among physicians, 95% CIs around all 6 positive and negative LR statistics excluded the null value of 1.00, indicating better-than-chance prediction (Table 3). Similarly, 95% CIs around all 6 C statistics excluded the chance value of 0.5 (Table 4). For physicians, the best positive LR values were for 6-month mortality (5.91 [95% CI, 3.74-9.32]) and toileting (6.00 [95% CI, 3.18-11.30]), and the best negative LR was for 6-month mortality (0.41 [95% CI, 0.33-0.52]). For nurses, the best positive LR was for in-hospital mortality (4.71 [95% CI, 2.94-7.56]), and the best negative LR values were for toileting (0.48 [95% CI, 0.30-0.78]) and ambulating up 10 stairs (0.48 [95% CI, 0.31-0.74]). The full operating characteristics and crude 2 × 2 tables for physicians and nurses are reported in eTables 3 through 6 in the Supplement.

Table 3. Prevalence and Likelihood Ratios for Physicians’ Predictions, Nurses’ Predictions, and Concordant Predictions.

Outcome	All Predictions: Prevalence, No./Total (%)^b	All Predictions: Positive LR (95% CI)	All Predictions: Negative LR (95% CI)	Confident Predictions^a: Prevalence, No./Total (%)^b	Confident Predictions^a: Positive LR (95% CI)	Confident Predictions^a: Negative LR (95% CI)
Physicians’ Predictions
In-hospital mortality	69/298 (23)	4.81 (2.91-7.95)	0.64 (0.52-0.78)	19/158 (12)	7.32 (3.11-17.20)	0.61 (0.42-0.90)
6-mo mortality	128/296 (43)	5.91 (3.74-9.32)	0.41 (0.33-0.52)	40/120 (33)	33.00 (8.34-130.63)	0.18 (0.09-0.35)
Unable to return to original residence at 6 mo	156/294 (53)	3.20 (2.21-4.62)	0.49 (0.40-0.60)	63/122 (52)	5.62 (2.91-10.86)	0.28 (0.18-0.43)
Unable to toilet independently at 6 mo	30/165 (18)	6.00 (3.18-11.30)	0.51 (0.35-0.75)	17/82 (21)	22.94 (5.67-92.89)	0.30 (0.15-0.64)
Unable to ambulate up 10 stairs at 6 mo	47/163 (29)	2.18 (1.53-3.11)	0.51 (0.34-0.76)	25/72 (35)	3.76 (1.99-7.10)	0.35 (0.18-0.66)
Abnormal cognition at 6 mo	62/164 (38)	2.36 (1.36-4.12)	0.75 (0.61-0.92)	32/90 (39)	4.53 (1.54-13.29)	0.74 (0.58-0.95)
Nurses’ Predictions
In-hospital mortality	71/301 (24)	4.71 (2.94-7.56)	0.61 (0.49-0.75)	26/160 (16)	6.70 (3.30-13.62)	0.54 (0.37-0.80)
6-mo mortality	129/297 (43)	4.23 (2.71-6.61)	0.56 (0.47-0.68)	53/135 (39)	9.90 (4.12-23.80)	0.42 (0.30-0.59)
Unable to return to original residence at 6 mo	157/295 (53)	2.06 (1.57-2.69)	0.51 (0.40-0.65)	84/154 (55)	3.33 (2.18-5.11)	0.25 (0.16-0.40)
Unable to toilet independently at 6 mo	30/166 (18)	2.61 (1.74-3.90)	0.48 (0.30-0.78)	19/91 (21)	3.28 (1.91-5.66)	0.40 (0.20-0.78)
Unable to ambulate up 10 stairs at 6 mo	47/164 (29)	2.04 (1.48-2.82)	0.48 (0.31-0.74)	25/82 (30)	2.28 (1.56-3.34)	0.25 (0.10-0.64)
Abnormal cognition at 6 mo	62/165 (38)	1.50 (0.86-2.60)	0.88 (0.73-1.06)	30/94 (32)	2.44 (0.97-6.10)	0.82 (0.65-1.04)
Concordant Physicians’ and Nurses’ Predictions^c
In-hospital mortality				7/98 (7)	17.33 (4.80-62.62)	0.44 (0.19-1.04)
6-mo mortality				17/66 (26)	40.35 (5.73-284.28)	0.18 (0.06-0.50)
Unable to return to original residence at 6 mo				31/66 (47)	15.24 (3.94-58.94)	0.14 (0.05-0.34)
Unable to toilet independently at 6 mo				10/45 (22)	15.75 (4.04-61.46)	0.11 (0.02-0.68)
Unable to ambulate up 10 stairs at 6 mo				14/39 (36)	5.36 (2.13-13.4)	0.17 (0.05-0.62)
Abnormal cognition at 6 mo				17/52 (33)	12.35 (1.61-94.63)	0.67 (0.47-0.95)

^a Confident predictions are defined as 4 (“considerably confident”) or 5 (“very confident”) on the Likert scale.

^b Represents the No. with adverse outcome/Total No. (%).

^c Represents when physicians and nurses were concordant in their predictions.

Table 4. C Statistics and 95% CIs for All, Confident, and Nonconfident Predictions by Physicians and Nurses.

Outcome	All Predictions: Physicians, C Statistic (95% CI)	All Predictions: Nurses, C Statistic (95% CI)	P Value	Physicians: Confident^a, C Statistic (95% CI)	Physicians: Nonconfident^b, C Statistic (95% CI)	P Value	Nurses: Confident^a, C Statistic (95% CI)	Nurses: Nonconfident^b, C Statistic (95% CI)	P Value
Hospital mortality	0.67 (0.61-0.73)	0.68 (0.62-0.74)	.81	0.68 (0.57-0.80)	0.64 (0.57-0.72)	.59	0.71 (0.61-0.81)	0.65 (0.57-0.73)	.36
No. of patients	299	301		158	140		160	139
6-mo mortality	0.76 (0.72-0.81)	0.69 (0.64-0.74)	.02	0.90 (0.84-0.96)	0.70 (0.63-0.77)	<.001	0.77 (0.70-0.84)	0.63 (0.56-0.70)	.006
No. of patients	294	297		120	174		135	159
Unable to return to original residence at 6 mo	0.70 (0.65-0.75)	0.67 (0.61-0.72)	.24	0.81 (0.74-0.88)	0.62 (0.55-0.69)	<.001	0.78 (0.72-0.85)	0.54 (0.46-0.62)	<.001
No. of patients	294	295		122	169		154	139
Unable to toilet independently at 6 mo	0.72 (0.63-0.82)	0.70 (0.60-0.79)	.57	0.84 (0.72-0.95)	0.58 (0.44-0.72)	.004	0.74 (0.62-0.86)	0.66 (0.49-0.83)	.44
No. of patients	163	166		82	81		91	73
Unable to ambulate up 10 stairs at 6 mo	0.68 (0.60-0.76)	0.67 (0.59-0.75)	.93	0.76 (0.66-0.87)	0.59 (0.46-0.71)	.03	0.74 (0.64-0.83)	0.60 (0.48-0.72)	.08
No. of patients	162	164		72	89		82	81
Abnormal cognition at 6 mo	0.61 (0.54-0.68)	0.55 (0.48-0.62)	.13	0.62 (0.53-0.71)	0.58 (0.47-0.69)	.94	0.58 (0.49-0.67)	0.51 (0.40-0.62)	.48
No. of patients	164	165		90	51		94	49

^a Defined as 4 (“considerably confident”) or 5 (“very confident”) on the Likert scale for confidence.

^b Defined as 1 (“not confident at all”), 2 (“slightly confident”), or 3 (“moderately confident”) on the Likert scale for confidence.

Physicians and nurses had fair to moderate agreement when predicting outcomes for the same patient, with κ scores ranging from 0.32 to 0.51 (eTable 7 in the Supplement). In these paired assessments, physicians were more optimistic than nurses in predicting return to original residence (P = .002) and toileting independently (P < .001) (eTables 7 and 8 in the Supplement). Physicians and nurses were equally accurate in predicting all outcomes except for 6-month mortality, for which physicians’ predictions were significantly better (C statistic, 0.76 [95% CI, 0.72-0.81] for physicians vs 0.69 [95% CI, 0.64-0.74] for nurses; P = .02) (Table 4).

Confident, Concordant, and Stratified Predictions

Physicians were confident (rated a 4 or 5 on the Likert scale) in 41% to 55% of their predictions across the 6 outcomes, while nurses were confident in 44% to 57% of their predictions (Table 4, Figure). When physicians were confident, their predictions of 6-month survival, return to original residence, toileting independently, and ambulating up 10 stairs were significantly more accurate than when they were not confident (Table 4). For example, physicians’ confident predictions of 6-month survival had excellent LRs (positive LR, 33.00 [95% CI, 8.34-130.63]; negative LR, 0.18 [95% CI, 0.09-0.35]) and discrimination (C statistic, 0.90 [95% CI, 0.84-0.96]) (Table 4; eTables 9 and 10 and eFigure 2 in the Supplement). Nurses’ confident predictions of 6-month survival and return to original residence were significantly more accurate than their nonconfident predictions of these outcomes (Table 4; eTables 11 and 12 and eFigure 2 in the Supplement).

Figure. Footnote a: Reported confidence in all predictions. For unable to toilet independently at 6 months, unable to ambulate up 10 stairs at 6 months, and abnormal cognition at 6 months, this includes the confidence in all predictions, not just among patients who survived. For ranking of confidence, 1 indicates “not confident at all”; 2, “slightly confident”; 3, “moderately confident”; 4, “considerably confident”; and 5, “very confident.”

Across outcomes, physicians and nurses were both concordant and confident for 22% to 33% of predictions. Discriminative accuracy of these concordant and confident predictions was typically excellent, especially for 6-month mortality (positive LR, 40.35 [95% CI, 5.73-284.28]; negative LR, 0.18 [95% CI, 0.06-0.50]) and toileting (positive LR, 15.75 [95% CI, 4.04-61.46]; negative LR, 0.11 [95% CI, 0.02-0.68]) (eFigure 3 and eTable 13 in the Supplement).

Physicians’ and nurses’ LRs for predicting 6-month survival were generally similar across all quartiles of illness severity as measured by APACHE III scores and excluded the null value of 1.0 in all cases (eFigure 4 in the Supplement). Relative to prediction models for all 6 outcomes derived using clinical variables, including APACHE III scores and the Functional Comorbidity Index available close to the time of ICU admission, models that included these clinical variables plus physicians’ and nurses’ predictions and levels of confidence generally had significantly better C statistics (eTable 14 in the Supplement).

Discussion

Among a diverse cohort of critically ill patients, ICU physicians’ and nurses’ abilities to discriminate between those who would or would not survive or experience unfavorable functional outcomes 6 months later varied by the outcome being predicted and whether physicians and nurses were confident or concordant in their predictions. However, these clinicians’ predictions were often sufficient to change the probability of future outcomes by more than diagnostic tests commonly used in clinical medicine and added significantly to the discriminative accuracy of prediction models obtained using APACHE scores and other clinical variables.
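How a likelihood ratio changes the probability of an outcome can be illustrated with the usual odds form of Bayes' theorem. The sketch below is an illustration, not part of the study's analysis: the function name is hypothetical, and using the cohort's 6-month mortality prevalence (128/296) as the pretest probability is an assumption made only for the example.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability using an LR."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed pretest probability: prevalence of 6-month mortality in Table 3.
pretest = 128 / 296

# Reported LRs for physicians' predictions of 6-month mortality.
after_predicted_death = post_test_probability(pretest, 5.91)
after_predicted_survival = post_test_probability(pretest, 0.41)
print(round(after_predicted_death, 2), round(after_predicted_survival, 2))
```

Under these assumptions, a physician's prediction of death raises the probability of death at 6 months from about 0.43 to about 0.82, and a prediction of survival lowers it to about 0.24.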

Despite descriptions of functional outcomes for specific cohorts of patients with acute respiratory distress syndrome, prolonged mechanical ventilation, and sepsis, clinicians struggle to apply these population estimates to individual patients. Several specific findings in the present study may help ICU clinicians understand the power and limitations of their prognostic judgments.

First, physicians’ aggregate predictions of in-hospital mortality, 6-month mortality, return to original residence by 6 months, and cognitive and functional outcomes at 6 months were all better than chance but were typically modest for unselected patients. Similar results were found among nurses, except that their predictions of cognition were no better than chance. Thus, when relaying such predictions at the bedside, it is important that clinicians acknowledge their uncertainty. Most family members not only desire prognostic guidance from ICU clinicians but also appreciate that some uncertainty is inevitable. In our study, roughly one-third of physicians and nurses reported giving prognostic estimates to the family prior to a patient’s enrollment in the study.

Second, for the approximately one-half of general ICU patients for whom ICU physicians and nurses may form confident predictions, the discriminative accuracy of these predictions appears to improve considerably. Caution in interpreting this result is required because the CIs surrounding the LRs for confident predictions are often wide owing to the smaller sample sizes. However, it is notable that few clinical diagnostic tests achieve positive LRs as high as those estimated for several confident physician predictions. Furthermore, when ICU physicians and nurses agreed in their predictions and both clinicians were confident, the discrimination was typically maximized. Thus, ICU physicians should consider both their own confidence and their agreement, or lack thereof, with bedside nurses in deciding how to frame their prognostic judgments to patients and surrogates.
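The relationship between sample size and the width of an LR confidence interval can be sketched with a 2x2 table. The example below uses the common log-method approximation for the CI of a positive LR, which is an assumption made for illustration and is not necessarily the method used in this study. All counts and the function name are hypothetical.

```python
import math


def positive_lr_with_ci(tp: int, fp: int, fn: int, tn: int, z: float = 1.96):
    """Positive likelihood ratio with an approximate 95% CI (log method).

    tp, fp, fn, tn are the cells of a 2x2 table comparing a prediction
    with the observed outcome. All four cells must be positive here.
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("this sketch requires all four cells to be positive")
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    lr_positive = sensitivity / false_positive_rate
    # Standard error of ln(LR+) under the log-method approximation.
    se_log_lr = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    lower = math.exp(math.log(lr_positive) - z * se_log_lr)
    upper = math.exp(math.log(lr_positive) + z * se_log_lr)
    return lr_positive, lower, upper


# Hypothetical counts with identical sensitivity and specificity,
# differing only in the number of patients (not study data).
all_predictions = positive_lr_with_ci(tp=60, fp=20, fn=40, tn=180)
confident_only = positive_lr_with_ci(tp=15, fp=5, fn=10, tn=45)

for label, (lr, lower, upper) in (("larger sample", all_predictions),
                                  ("smaller sample", confident_only)):
    print(f"{label}: LR+ = {lr:.1f} (95% CI, {lower:.1f}-{upper:.1f})")
```

Both hypothetical tables give the same point estimate, but the interval for the smaller table is wider, which is the pattern that calls for caution when interpreting LRs restricted to confident predictions.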

Third, clinician discrimination varied among the 6 predicted outcomes. Clinicians were most accurate in predicting 6-month mortality and toileting independence and least accurate in predicting cognition. Prior evaluations of clinicians’ abilities to predict in-hospital mortality have shown that clinicians often do better than quantitative models. However, such results may be attributable to a self-fulfilling prophecy if clinicians more commonly recommend withdrawal of life support for patients they expect to die. The current results regarding 6-month outcomes may be less susceptible to such self-fulfilling prophecies. Furthermore, physicians predicted only 29 of the 69 in-hospital deaths in this cohort. Thus, even if self-fulfilling prophecies influenced the results, the overall effect appears to be small. Regardless, given the importance of longer-term survival to patients and surrogates, as well as clinicians’ aptitude for predicting this outcome, it may be reasonable for clinicians to focus on 6-month mortality and functional outcomes in discussions with patients and families.

The finding that clinicians perform poorly in predicting future cognition among ICU survivors is consistent with prior work showing that even validated measures of cognition assessed at the time of hospital discharge do not predict long-term cognitive sequelae. Although cognitive dysfunction is common among survivors and is important to patients and caregivers, the current data suggest the need for caution when discussing individual patients’ risks for future cognitive dysfunction.

This study has several strengths. Outcome data were obtained at 6 months for 99% of enrolled patients, thereby minimizing the chance of bias due to missing data. Although all patients were recruited from the same health system, they emanated from 3 medical and 2 surgical ICUs using different staffing models and trainee involvement at 3 different hospitals. Also, the diversity of critical illness represented among enrolled patients and the requirement for only brief exposure to life support augment the generalizability of the results, particularly in contrast to studies of ICU survivorship that focus on patients with prolonged mechanical ventilation, sepsis, or other specific syndromes.

This study also has several limitations. First, a fundamental limitation of this study is that it focused on clinicians’ abilities to discriminate among patients who will or will not experience adverse outcomes but did not assess the calibration of these predictions. It is possible that when examining groups of patients, clinicians more frequently and confidently predict unfavorable outcomes for those more likely to experience them, yet still grossly underestimate or overestimate these probabilities. Future work is therefore needed to determine how well calibrated clinicians are in their predictions. Such studies might ask clinicians to rate the probabilities of future outcomes rather than eliciting dichotomous judgments and confidence ratings, as in the present study.
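The distinction between discrimination and calibration can be made concrete with a small example. The sketch below is illustrative only: the outcomes and predicted probabilities are invented, the function names are hypothetical, and the calibration measure shown (mean predicted probability minus observed event rate) is only one simple summary among several.

```python
from itertools import product


def c_statistic(probabilities, outcomes):
    """Share of (event, non-event) pairs in which the event was given the
    higher predicted probability; tied pairs count as one-half."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for event_p, non_event_p in product(events, non_events):
        if event_p > non_event_p:
            score += 1.0
        elif event_p == non_event_p:
            score += 0.5
    return score / (len(events) * len(non_events))


def mean_calibration_gap(probabilities, outcomes):
    """Mean predicted probability minus observed event rate.
    Zero means predictions match the event rate on average."""
    return sum(probabilities) / len(probabilities) - sum(outcomes) / len(outcomes)


# Invented data: 1 = unfavorable outcome occurred, 0 = it did not.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Hypothetical clinician A: average prediction equals the 40% event rate.
clinician_a = [0.9, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.1]

# Hypothetical clinician B: same ranking of patients, but every
# probability is halved, so risk is systematically underestimated.
clinician_b = [p / 2 for p in clinician_a]

for name, preds in (("A", clinician_a), ("B", clinician_b)):
    print(f"Clinician {name}: C statistic = {c_statistic(preds, outcomes):.3f}, "
          f"calibration gap = {mean_calibration_gap(preds, outcomes):+.2f}")
```

Because the C statistic depends only on how patients are ranked, the two hypothetical clinicians discriminate equally well even though one of them underestimates risk across the board. A study that measures only discrimination cannot detect that difference.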

Second, these data apply only to predictions made on ICU days 3 to 6, which may be prior to when some clinicians were comfortable predicting patients’ outcomes. However, guidelines and systematic reviews recommend conducting multidisciplinary meetings with families by either ICU day 3 or 5. Thus, these data apply to information that may be conveyed during recommended family meetings.

A third limitation is that most study outcomes were verified through patient or surrogate reports by telephone. Although these patient- and caregiver-reported outcomes may differ from objectively measured physical or cognitive outcomes, this study intentionally measured perceived cognitive or physical dysfunction because such outcomes may be as important to patients and families as objectively measured outcomes, or more so. Additionally, patients and surrogates may respond differently to these outcome measures. This cannot be ascertained reliably because of potential differences between patients who responded themselves and those for whom surrogates responded.

Conclusions

Supplement.

eAppendix 1. Exclusion Criteria

eTable 1. Outcomes Predicted by Physicians and Nurses

eAppendix 2. Survey for Prognosis Questions for ICU Physicians

eTable 2. Mock 2x2 Table With Definitions

eFigure 1. Flow Diagram for Study Cohort

eTable 3. 2x2 Tables for Comparing Physician Predictions to Outcomes for All Predictions

eTable 4. Operating Characteristics of Physicians for All Predictions

eTable 5. 2x2 Tables for Comparing Nurse Predictions to Outcomes for All Predictions

eTable 6. Operating Characteristics of Nurses for All Predictions

eTable 7. Agreement Between Physicians and Nurses Evaluating the Same Patient

eTable 8. 2x2 Tables of Physician and Nurse Predictions of Outcomes

eTable 9. 2x2 Tables for Comparing Physician Predictions to Outcomes for Confident Predictions

eFigure 2. Likelihood Ratios for Physicians and Nurses, Total Population and Confident Predictions

eTable 10. Operating Characteristics by Physicians When Restricted to Confident Predictions

eTable 11. 2x2 Tables for Comparing Nurse Predictions to Outcomes for Confident Predictions

eTable 12. Operating Characteristics of Nurses When Restricted to Confident Predictions

eFigure 3. Likelihood Ratios for Physicians and Nurses Concordant Predictions

eTable 13. Operating Characteristics of Concordant Responses and at Both Providers Being Confident at Predicting Mortality and 6 Month Physical and Cognitive Function

eFigure 4. Physician and Nurse Predictions of 6-Month Mortality by APACHE III Quartile

eTable 14. Prediction of Outcomes Based on Objective Measures, Physician, Nurse Predictions and Confidence

Section Editor: Derek C. Angus, MD, MPH, Associate Editor, JAMA (angusdc@upmc.edu).

References

1. Christakis NA. The ellipsis of prognosis in modern medical thought. Soc Sci Med. 1997;44(3):301-315.
2. Kon AA, Davidson JE, Morrison W, Danis M, White DB; American College of Critical Care Medicine; American Thoracic Society. Shared decision making in ICUs: an American College of Critical Care Medicine and American Thoracic Society policy statement. Crit Care Med. 2016;44(1):188-201.
3. Sinuff T, Adhikari NK, Cook DJ, et al. Mortality predictions in the intensive care unit: comparing physicians with scoring systems. Crit Care Med. 2006;34(3):878-885.
4. White DB, Ernecoff N, Buddadhumaruk P, et al. Prevalence of and factors related to discordance about prognosis between physicians and surrogate decision makers of critically ill patients. JAMA. 2016;315(19):2086-2094.
5. Cox CE, Martinu T, Sathy SJ, et al. Expectations and outcomes of prolonged mechanical ventilation. Crit Care Med. 2009;37(11):2888-2894.
6. Fried TR, Bradley EH, Towle VR, Allore H. Understanding the treatment preferences of seriously ill patients. N Engl J Med. 2002;346(14):1061-1066.
7. Ehlenbach WJ, Cooke CR. Making ICU prognostication patient centered: is there a role for dynamic information? Crit Care Med. 2013;41(4):1136-1138.
8. Pandharipande PP, Girard TD, Jackson JC, et al.; BRAIN-ICU Study Investigators. Long-term cognitive impairment after critical illness. N Engl J Med. 2013;369(14):1306-1316.
9. Unroe M, Kahn JM, Carson SS, et al. One-year trajectories of care and resource utilization for recipients of prolonged mechanical ventilation: a cohort study. Ann Intern Med. 2010;153(3):167-175.
10. Ferrante LE, Pisani MA, Murphy TE, Gahbauer EA, Leo-Summers LS, Gill TM. Functional trajectories among older persons before and after critical illness. JAMA Intern Med. 2015;175(4):523-529.
11. White DB, Engelberg RA, Wenrich MD, Lo B, Curtis JR. Prognostication during physician-family discussions about limiting life support in intensive care units. Crit Care Med. 2007;35(2):442-448.
12. Cook D, Rocker G, Marshall J, et al.; Level of Care Study Investigators and the Canadian Critical Care Trials Group. Withdrawal of mechanical ventilation in anticipation of death in the intensive care unit. N Engl J Med. 2003;349(12):1123-1132.
13. Turnbull AE, Davis WE, Needham DM, White DB, Eakin MN. Intensivist-reported facilitators and barriers to discussing post-discharge outcomes with intensive care unit surrogates: a qualitative study. Ann Am Thorac Soc. 2016;13(9):1546-1552.
14. Davidson JE, Powers K, Hedayat KM, et al.; American College of Critical Care Medicine Task Force 2004-2005, Society of Critical Care Medicine. Clinical practice guidelines for support of the family in the patient-centered intensive care unit: American College of Critical Care Medicine Task Force 2004-2005. Crit Care Med. 2007;35(2):605-622.
15. Ahluwalia S, Mularski RA, Lendon J, et al. Improving ICU family meetings: do the experts agree with the evidence? Am J Respir Crit Care Med. 2015;191:A1005.
16. Meadow W, Pohlman A, Frain L, et al. Power and limitations of daily prognostications of death in the medical intensive care unit. Crit Care Med. 2011;39(3):474-479.
17. Conti M, Friolet R, Eckert P, Merlani P. Home return 6 months after an intensive care unit admission for elderly patients. Acta Anaesthesiol Scand. 2011;55(4):387-393.
18. Katz S, Downs TD, Cash HR, Grotz RC. Progress in development of the index of ADL. Gerontologist. 1970;10(1):20-30.
19. Brown CJ, Flood KL. Mobility limitation in the older patient: a clinical review. JAMA. 2013;310(11):1168-1177.
20. Horsman J, Furlong W, Feeny D, Torrance G. The Health Utilities Index (HUI): concepts, measurement properties and applications. Health Qual Life Outcomes. 2003;1:54.
21. Knaus WA, Wagner DP, Draper EA, et al. The APACHE III prognostic system: risk prediction of hospital mortality for critically ill hospitalized adults. Chest. 1991;100(6):1619-1636.
22. Fletcher RH, ed. Clinical Epidemiology: The Essentials. Philadelphia, PA: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2014.
23. Jaeschke R, Guyatt GH, Sackett DL; Evidence-Based Medicine Working Group. Users’ Guides to the Medical Literature, III: how to use an article about a diagnostic test, B: what are the results and will they help me in caring for my patients? JAMA. 1994;271(9):703-707.
24. Nofuentes JA, Del Castillo JdeD. Comparison of the likelihood ratios of two binary diagnostic tests in paired designs. Stat Med. 2007;26(22):4179-4201.
25. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
26. Flahault A, Cadilhac M, Thomas G. Sample size calculation should be performed for design accuracy in diagnostic test studies. J Clin Epidemiol. 2005;58(8):859-862.
27. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37(5):360-363.
28. Groll DL, To T, Bombardier C, Wright JG. The development of a comorbidity index with physical function as the outcome. J Clin Epidemiol. 2005;58(6):595-602.
29. McGee S. Simplifying likelihood ratios. J Gen Intern Med. 2002;17(8):646-649.
30. Herridge MS, Tansey CM, Matté A, et al.; Canadian Critical Care Trials Group. Functional disability 5 years after acute respiratory distress syndrome. N Engl J Med. 2011;364(14):1293-1304.
31. Herridge MS, Chu LM, Matte A, et al.; RECOVER Program Investigators (Phase 1: Towards RECOVER); Canadian Critical Care Trials Group. The RECOVER program: disability risk groups and 1-year outcome after 7 or more days of mechanical ventilation. Am J Respir Crit Care Med. 2016;194(7):831-844.
32. Iwashyna TJ, Ely EW, Smith DM, Langa KM. Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA. 2010;304(16):1787-1794.
33. Evans LR, Boyd EA, Malvar G, et al. Surrogate decision-makers’ perspectives on discussing prognosis in the face of uncertainty. Am J Respir Crit Care Med. 2009;179(1):48-53.
34. Woon FL, Dunn CB, Hopkins RO. Predicting cognitive sequelae in survivors of critical illness with cognitive screening tests. Am J Respir Crit Care Med. 2012;186(4):333-340.
35. Nelson EC, Eftimovska E, Lind C, Hager A, Wasson JH, Lindblad S. Patient-reported outcome measures in practice. BMJ. 2015;350:g7818.
