Author manuscript; available in PMC: 2018 Oct 1.
Published in final edited form as: Crit Care Med. 2017 Oct;45(10):1607–1615. doi: 10.1097/CCM.0000000000002548

Identifying Distinct Subgroups of Intensive Care Unit Patients: a Machine Learning Approach

Kelly C Vranas 1,2, Jeffrey K Jopling 1,3, Timothy E Sweeney 4, Meghan C Ramsey 1,5, Arnold S Milstein 1, Christopher G Slatore 6,2, Gabriel J Escobar 7, Vincent X Liu 7
PMCID: PMC5600667  NIHMSID: NIHMS879701  PMID: 28640021

Abstract

Objective

Identifying subgroups of intensive care unit (ICU) patients with similar clinical needs and trajectories may provide a framework for more efficient ICU care through the design of care platforms tailored around patients’ shared needs. However, objective methods for identifying these ICU patient subgroups are lacking. We used a machine learning approach to empirically identify ICU patient subgroups through clustering analysis and evaluate whether these groups might represent appropriate targets for care redesign efforts.

Design

We performed clustering analysis using data from patients’ hospital stays to retrospectively identify patient subgroups from a large, heterogeneous ICU population.

Setting

Kaiser Permanente Northern California (KPNC), a healthcare delivery system serving 3.9 million members.

Patients

ICU patients aged ≥ 18 years with an ICU admission between January 1, 2012 and December 31, 2012 at one of 21 KPNC hospitals.

Interventions

None.

Measurements and Main Results

We used clustering analysis to identify putative clusters among 5,000 patients randomly selected from 24,884 ICU patients. To assess cluster validity, we evaluated the distribution and frequency of patient characteristics, and the need for invasive therapies. We then applied a classifier built from the sample cohort to the remaining 19,884 patients to compare the derivation and validation clusters. Clustering analysis successfully identified six clinically-recognizable subgroups that differed significantly in all baseline characteristics and clinical trajectories, despite sharing common diagnoses. In the validation cohort, the proportion of patients assigned to each cluster was similar and demonstrated significant differences across clusters for all variables.

Conclusions

A machine learning approach revealed important differences between empirically-derived subgroups of ICU patients that are not typically revealed by admitting diagnosis or severity of illness alone. Similar data-driven approaches may provide a framework for future organizational innovations in ICU care tailored around patients’ shared needs.

KEY MeSH TERMS: clustering analysis, unsupervised machine learning, critical care, intensive care units, patient care management

Introduction

The creation of the intensive care unit (ICU) represented an organizational innovation in the way care is provided for the sickest inpatients (1). Physically grouping patients in a single location allowed them to maximally benefit from providers with expertise in caring for the critically ill. Other ICU organizational innovations, such as interprofessional rounding teams, intensivist staffing, and lower nurse-to-patient ratios, are associated with improved outcomes (1, 2). At the same time, considerable growth in ICU utilization, rising therapeutic costs, and advances in medical science and technology have added significant complexity to care processes (3–7). This rapidly changing healthcare landscape challenges the efficiency and sustainability of our current ICU organizational model (1).

Because critical care is itself an organizational innovation, it stands to reason that continued organizational interventions will be necessary for the field to successfully evolve (8). Identifying subgroups of ICU patients with similar clinical trajectories and tailoring care delivery services to meet their shared needs is one organizational intervention that may offer a more efficient approach to critical care delivery. This strategy, in which ‘care platforms’ are organized around patient subgroups, has been proffered as the type of operational redesign necessary for the successful reform of the nation’s healthcare delivery system (7). Such an approach is related to but distinct from the creation of subspecialty ICUs, which provide diagnosis-specific care for select populations, but with mixed effects on patient outcomes (9, 10). In contrast, care platforms are designed around patients’ shared needs rather than shared diagnoses, and can be implemented in care settings other than the ICU.

An example of this approach involves grouping hospitalized non-ICU patients into geographic areas based on similar patient characteristics, including comparable lengths of stay (LOS) and nursing skill requirements. In one pilot hospital within a large health system, such a redesign effort yielded a reduction in the percentage of low-risk patients admitted to their ICU from 42% to 22%, while simultaneously reducing their ICU LOS from 4.6 days to 4.1 days (11). This model was then implemented system-wide with similar results. These findings highlight the potential improvement in the value of ICU care derived through the reorganization of critical care designed around patients with similar needs.

Despite the promise of such an approach, efforts to implement this type of reorganization across health systems have not been widely reported in the literature. This may be in part due to the lack of objective methods available to identify clinically distinct ICU patient subgroups who might benefit from shared care platforms. Therefore, we sought to develop a novel method for the empiric identification of ICU patient subgroups based on their shared needs and similar clinical trajectories. We hypothesized that clustering analysis, a form of unsupervised machine learning, would offer a unique, data-driven approach to objectively identify clinically distinct subgroups of ICU patients. These subgroups, or ‘clusters,’ could then provide a framework for the design of care platforms tailored around patients’ shared needs.

Materials and Methods

Clustering analysis

Clustering analysis is a form of unsupervised machine learning that has been used to explore and characterize disease and patient phenotypes in multiple areas, including cancer and asthma (12–14). Specifically, clustering analysis refers to a group of multivariate mathematical algorithms that quantify the similarity between individuals within a population on the basis of multiple specified variables. This data-driven approach generates novel subgroups that are not based on any a priori hypotheses. In other words, unsupervised machine learning methods such as clustering analysis allow data to be grouped in a way that may be useful, without the user specifying how this grouping should occur (13, 14).
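To make "quantifying similarity on the basis of multiple variables" concrete, the following is a minimal Python sketch of one common choice: z-score standardization followed by Euclidean distance. It is an illustration only, not the similarity measure used in this study (those details are in the Supplemental Digital Content). The three patients and their values are hypothetical, and the helper names `standardize` and `euclidean` are introduced here for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each column to mean 0 and SD 1 so that no single variable dominates."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def euclidean(a, b):
    """Straight-line distance between two patients in standardized feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical patients: (age in years, ICU length of stay in days, comorbidity score)
patients = [(61, 1.7, 44), (64, 2.0, 35), (79, 2.9, 70)]
z = standardize(patients)

d01 = euclidean(z[0], z[1])  # two younger, short-stay patients
d02 = euclidean(z[0], z[2])  # younger patient vs. older patient with more comorbidity
print(d01 < d02)
```

A clustering algorithm then groups patients so that pairs with small distances tend to share a cluster; no outcome label or diagnosis is needed to compute the distances.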

Study sample

Study subjects were drawn from Kaiser Permanente Northern California (KPNC), an integrated healthcare delivery system serving 3.9 million members. Clinical care is managed through a comprehensive electronic health information system. We included patients aged ≥ 18 years who experienced an ICU admission between January 1, 2012 and December 31, 2012 if their hospitalization began at any one of 21 KPNC hospitals and was not for obstetrical care. Most ICUs in the study are general mixed medical-surgical ICUs with a mean bed capacity of 16.9 beds (range, 6 to 34 beds). The hospitals vary from community medical centers to tertiary-care teaching hospitals; two hospitals include specialty cardiovascular surgery ICUs while another two include neurosurgical ICUs. Individual ICU episodes for patients with multiple ICU admissions during the study period were considered independent. The KPNC Institutional Review Board approved this study.

Feature extraction

Three authors with expertise in critical care (KV, MR, and JJ) selected 23 clinical features in four domains that are representative of critical care needs and could be feasibly quantified. These domains included patient characteristics (age, gender, comorbid disease burden); hospital admission characteristics (admission source, admitting diagnosis, need for surgical or procedural intervention, severity of illness at admission, code status, and predicted hospital mortality); ICU admission characteristics (severity of illness at ICU admission, length of ICU stay, duration of mechanical ventilation, total days receiving benzodiazepines, non-benzodiazepine sedatives, opiates, vasopressors, or inotropes, and any use of continuous intravenous vasopressors); and discharge characteristics (hospital length of stay, vital status at discharge, vital status at 30 days after hospital admission, discharge location, and hospital readmission within 30 days).

We extracted retrospective data regarding patient characteristics and utilization from KPNC membership tables and quantified hospitalization characteristics with existing inpatient databases detailed in prior studies (15–20). These databases are composed of data drawn directly from the electronic medical record (EMR) and then subjected to a variety of cleaning and validation algorithms (16–25). Acute and chronic illness severities were based on automated scores that have been validated within the KPNC population. These scores include: the Laboratory and Acute Physiology Score, version 2, which uses 5 vital signs and 18 laboratory values to quantify severity of illness at the time of hospital admission; the Comorbidity Point Score, version 2, which evaluates all patient diagnoses from inpatient and outpatient settings over the past year; and an electronic adaptation of the Simplified Acute Physiology Score, version 3, which uses 20 variables to quantify the risk of hospital mortality within a 2-hour window surrounding ICU admission. Each score has previously exhibited excellent discrimination for predicting mortality (c-statistics ≥ 0.81) in large populations (≥70,000 hospitalizations) (15–20).

We quantified treatment data based on detailed medication administration records and mechanical ventilation flowsheets. We represented the receipt of any continuous vasopressor as a binary variable, and then calculated the total number of vasopressor days by totaling the number of ICU days during which patients received any vasopressors. We used similar procedures to quantify the total days receiving other medications included in the analysis. In order to determine the duration of invasive mechanical ventilation, we used a validated set of algorithms to identify the start and stop times for ventilation based on flowsheet data (26). Documented limits on life-sustaining therapies were based on orders for ‘code status’ in the EMR at the time of hospital admission, rather than specifically during or at the end of patients’ ICU stays (27).
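The vasopressor variables described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the study's extraction code: the record layout `(patient_id, icu_day, drug_class)`, the label `"vasopressor"`, and the function name `vasopressor_exposure` are all hypothetical.

```python
from collections import defaultdict


def vasopressor_exposure(records, patient_ids):
    """Return, per patient, a binary 'any vasopressor' flag and a count of vasopressor days.

    records: iterable of (patient_id, icu_day, drug_class) administration events.
    A day is counted once no matter how many doses were charted on it.
    """
    days = defaultdict(set)
    for pid, icu_day, drug_class in records:
        if drug_class == "vasopressor":
            days[pid].add(icu_day)
    return {
        pid: {
            "any_vasopressor": int(bool(days.get(pid))),
            "vasopressor_days": len(days.get(pid, ())),
        }
        for pid in patient_ids
    }


# Hypothetical administration records for two ICU stays
records = [
    ("A", 1, "vasopressor"),
    ("A", 1, "vasopressor"),  # second dose on the same ICU day
    ("A", 2, "vasopressor"),
    ("A", 2, "opiate"),
    ("B", 1, "opiate"),
]
exposure = vasopressor_exposure(records, ["A", "B"])
print(exposure)
```

The same pattern, with a different `drug_class` filter, would give the day counts for benzodiazepines, other sedatives, opiates, and inotropes.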

Unsupervised learning approach

The goal of our clustering analysis was to algorithmically place similar patients into distinct subgroups so as to minimize within-cluster heterogeneity and maximize between-cluster separation. To manage computational time for the clustering algorithm, we selected a random sample of 5,000 (20.1% of total) ICU admissions from our initial cohort. In this sample, we used consensus clustering via R package ConsensusClusterPlus with 1,000 subsamples (28). We compared patient and hospitalization characteristics between the random sample and the remaining cohort based on Student t-tests and chi-squared values. Sixteen of the 23 clinical features were used in the clustering analyses. Details of these methods are available in the Supplemental Digital Content. We included discharge data in the analyses, since our intent was to identify relevant subgroups based on care needs, rather than to build a prediction model used at point of triage. However, we intentionally withheld the use of ICU-mandated treatments (defined a priori as the use of mechanical ventilation, vasopressors, or inotropes) in order to use this information in the subsequent validation step. We also did not include admitting diagnosis as part of the clustering algorithm so that the derived clusters would reflect patients’ shared needs and clinical trajectories, rather than their diagnoses. Clustering was performed in R, version 3.1.2. All analyses besides clustering were performed using STATA SE/13.1.
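The idea behind consensus clustering can be sketched as follows: cluster many random subsamples, then record how often each pair of patients lands in the same cluster when both are sampled. The sketch below is written in plain Python rather than with the R package used in the study, and it makes several assumptions that are not taken from the paper: k-means as the base algorithm, an 80% subsampling fraction, 50 subsamples instead of 1,000, and a toy two-dimensional data set. The function names are hypothetical.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on a list of numeric tuples; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, n_subsamples=50, frac=0.8, seed=0):
    """Fraction of subsamples in which items i and j share a cluster, given both were drawn."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(frac * n))
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), size)
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Toy data: two well-separated groups of four points each (hypothetical values)
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0)]
M = consensus_matrix(points, k=2)
print(M[0][1], M[0][4])
```

Consensus values near 1 or 0 indicate stable assignments, while values near 0.5 indicate pairs whose grouping depends on which patients happened to be sampled. Repeating the procedure over a range of k and comparing the resulting matrices is one way to choose the number of clusters.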

Surrogate validation of clusters

After selecting the optimal number of clusters, we sought to assess the validity of the putative clusters as clinically meaningful subgroups. We did this surrogate validation in three ways. First, we compared the patient and hospitalization characteristics between clusters and assessed whether these characteristics were clinically distinguishable. Second, we evaluated the distribution of ICU-mandated treatments among the putative clusters, hypothesizing that the clusters would lack validity if patients requiring these treatments appeared uniformly distributed among all clusters.

Third, we validated results by building a classifier for cluster assignment from the sample cohort of 5,000 patients and applying it to the remaining 19,884 patients of the original cohort in order to yield predictions of cluster assignment for this larger group (29). Details of these methods are available in the Supplemental Digital Content. The resulting validation clusters were then compared to the originally-derived clusters. For each variable, we compared the training (random) and test cohorts using Wilcoxon signed-rank, analysis of variance, or chi-squared tests.
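The third validation step, training on the labeled sample and predicting cluster membership for held-out admissions, can be illustrated with a deliberately simple stand-in. The classifier actually used in the study is described in the Supplemental Digital Content; the nearest-centroid rule below is an assumption chosen for brevity, and the feature values, labels, and function names are hypothetical.

```python
def fit_centroids(features, labels):
    """Average the feature vectors of each cluster in the derivation sample."""
    sums, counts = {}, {}
    for x, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(x))
        for d, v in enumerate(x):
            acc[d] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, x):
    """Assign a new admission to the cluster whose centroid is closest."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


# Hypothetical derivation sample: (ICU length of stay in days, age in years)
derivation_x = [(1.7, 61.0), (2.0, 64.0), (13.9, 59.0), (14.5, 57.0)]
derivation_y = [1, 1, 5, 5]  # cluster labels from the clustering step
centroids = fit_centroids(derivation_x, derivation_y)

# Hypothetical held-out admissions receive predicted cluster labels
held_out = [(1.9, 62.0), (22.0, 59.0)]
predicted = [predict(centroids, x) for x in held_out]
print(predicted)
```

In practice the features would be standardized first so that variables on large scales do not dominate the distance. The predicted labels can then be tabulated and compared with the derivation clusters, as in Tables 2 and 3.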

Results

Our initial cohort included 24,884 first ICU admissions occurring in 2012. The sample of 5,000 randomly-selected ICU admissions demonstrated similar characteristics to the remainder of the cohort (Table 1). Hospital mortality differed significantly between the two groups (11.2% vs 10.2%; p = 0.04), whereas mortality at 30 days and all other variables did not. Sepsis was the most common reason for ICU admission in both the random and validation cohorts.

Table 1.

Patient and hospitalization features in the randomly selected sample (n = 5,000; 20.1%) compared with all other intensive care unit (ICU) patients from 2012. Comparisons are based on analysis of variance or chi-squared tests.

| Characteristic | Random cohort (n = 5,000) | Validation cohort (n = 19,884) | p-value |
|---|---|---|---|
| Patient | | | |
| Age, years | 65.4 ± 16.6 | 65.2 ± 16.5 | 0.45 |
| Male gender | 55.3% | 54.3% | 0.18 |
| Comorbidity (COPS2)^a | 49 ± 48 | 49 ± 48 | 0.78 |
| Hospitalization | | | |
| Admitted through ED^b | 74.5% | 74.9% | 0.51 |
| Most common diagnosis | Sepsis (20.0%) | Sepsis (20.1%) | |
| Need for surgery or procedure | 24.8% | 24.0% | 0.28 |
| Severity of illness (LAPS2)^c | 81.2 ± 51.5 | 82.4 ± 51.3 | 0.13 |
| Code status | | | |
|   Do not resuscitate | 6.8% | 7.4% | 0.11 |
|   Partial code | 1.9% | 1.9% | 0.88 |
| Predicted hospital mortality | 6.7 ± 11.6% | 6.8 ± 11.2% | 0.91 |
| Intensive care unit | | | |
| Severity of illness (eSAPS3)^d | 9.6 ± 11.3% | 9.2 ± 11.3% | 0.07 |
| Length of ICU stay, days | 2.7 ± 5.0 | 2.8 ± 5.6 | 0.55 |
| Duration of ventilation, days | 0.9 ± 5.9 | 1.0 ± 4.4 | 0.77 |
| Days receiving benzodiazepines | 0.2 ± 1.0 | 0.2 ± 0.9 | 0.95 |
| Days receiving other sedatives | 0.5 ± 1.7 | 0.5 ± 1.6 | 0.20 |
| Days receiving opiates | 0.3 ± 1.6 | 0.3 ± 1.5 | 0.74 |
| Days receiving inotropes | 0.2 ± 0.9 | 0.2 ± 1.0 | 0.86 |
| Days receiving vasopressors | 0.6 ± 2.1 | 0.6 ± 1.7 | 0.97 |
| Any continuous vasopressor | 20.4% | 20.6% | 0.71 |
| Discharge | | | |
| Total length of stay, days | 7.9 ± 11.7 | 8.3 ± 12.2 | 0.06 |
| Hospital mortality | 11.2% | 10.2% | 0.04 |
| Mortality at 30 days after hospital admission | 13.0% | 12.3% | 0.19 |
| Discharge location | | | |
|   Home | 72.1% | 71.2% | 0.19 |
|   Subacute nursing facility | 15.6% | 15.6% | 0.99 |
|   Hospice | 2.1% | 2.0% | 0.72 |
| Readmission within 30 days | 19.6% | 18.7% | 0.18 |

^a Comorbidity Point Score, version 2
^b Emergency Department
^c Laboratory and Acute Physiology Score, version 2
^d Simplified Acute Physiology Score, version 3

Clustering analysis

Clustering analysis was implemented in our sample over a pre-specified number of clusters, from 2 through 9. Based on the consensus clustering results, we chose to assess our surrogate validation measures on the putative cluster memberships with k=6 clusters. The distribution of cluster membership in the random sample included: 1,933 (38.7%) observations in Cluster 1; 622 (12.4%) in Cluster 2; 1,250 (25.0%) in Cluster 3; 897 (17.9%) in Cluster 4; 207 (4.1%) in Cluster 5; and 91 (1.8%) in Cluster 6 (Table 2). Variability in cluster membership across different facilities within the KPNC healthcare system is shown in Figure 1.

Table 2.

Selected patient and hospitalization characteristics based on putative cluster membership for 5,000 randomly-selected intensive care unit (ICU) admissions. Comparisons are based on analysis of variance or chi-squared tests. All between-cluster comparisons were significant to p < 0.001. Results that are distinct for a particular cluster are bolded for emphasis.

| Characteristic | Cluster 1 (n = 1,933; 38.7%) | Cluster 2 (n = 622; 12.4%) | Cluster 3 (n = 1,250; 25.0%) | Cluster 4 (n = 897; 17.9%) | Cluster 5 (n = 207; 4.1%) | Cluster 6 (n = 91; 1.8%) |
|---|---|---|---|---|---|---|
| Patient subgroup | Relatively healthy, short-stay ICU patients | Older patients suffering catastrophic illness | Post-surgical and post-procedural patients | Older patients discharged with long-term care needs | Prior healthy patients with prolonged stay and good recovery | Patients with severe illness and desire for limits of life-sustaining therapy |
| Patient | | | | | | |
| Age, years | 60.9 ± 17.1 | 72.7 ± 14.1 | 63.8 ± 15.0 | 74.8 ± 12.7 | 58.7 ± 16.3 | 79.4 ± 11.6 |
| Male | 54.6% | 52.1% | 60.0% | 47.5% | 54.1% | 53.9% |
| Comorbidity (COPS2)^a | 44 ± 46 | 65 ± 52 | 35 ± 35 | 63 ± 54 | 48 ± 49 | 70 ± 54 |
| Hospitalization | | | | | | |
| ED^b admission | 100.0% | 86.8% | 21.5% | 82.8% | 79.7% | 100.0% |
| Most common diagnosis | Sepsis (19.8%) | Sepsis (38.9%) | Acute MI^c (10.1%) | Sepsis (27.6%) | Sepsis (24.6%) | Sepsis (28.9%) |
| Need for procedure | 0.2% | 9.7% | 76.9% | 17.2% | 19.8% | 4.4% |
| Code status | | | | | | |
|   Do not resuscitate | 0.0% | 18.0% | 0.0% | 28.2% | 0.0% | 0.0% |
|   Partial code | 0.0% | 0.8% | 0.0% | 0.0% | 0.5% | 100.0% |
| Predicted hospital mortality | 4.8 ± 7.6% | 16.5 ± 19.0% | 1.9 ± 3.0% | 9.4 ± 11.9% | 8.1 ± 11.6% | 22.5 ± 19.7% |
| Intensive care unit | | | | | | |
| ICU length of stay, days | 1.7 ± 1.6 | 3.3 ± 3.9 | 2.0 ± 2.0 | 3.3 ± 4.7 | 13.9 ± 20.7 | 2.9 ± 4.2 |
| Severity of illness^d | 8.0 ± 8.9% | 21.6 ± 16.8% | 3.5 ± 5.9% | 12.5 ± 11.4% | 13.1 ± 12.1% | 16.4 ± 11.9% |
| Discharge | | | | | | |
| Total length of stay, days | 5.1 ± 5.5 | 7.0 ± 6.7 | 6.2 ± 5.4 | 11.1 ± 10.4 | 32.3 ± 36.6 | 7.7 ± 7.4 |
| Hospital mortality | 0.0% | 78.6% | 0.0% | 0.0% | 10.1% | 23.1% |
| Discharge location | | | | | | |
|   Home | 100.0% | 5.6% | 100.0% | 16.5% | 73.9% | 46.2% |
|   Subacute nursing facility | 0.0% | 0.0% | 0.0% | 83.5% | 14.0% | 30.8% |
|   Hospice | 0.0% | 15.8% | 0.0% | 0.0% | 1.9% | 0.0% |
| Validation metrics | | | | | | |
| Patients requiring any vasopressor | 11.6% | 42.4% | 19.7% | 20.5% | 40.1% | 27.5% |
| Patients requiring mechanical ventilation | 12.5% | 46.3% | 7.8% | 22.6% | 65.2% | 35.1% |

^a Comorbidity Point Score, version 2
^b Emergency Department
^c Myocardial infarction
^d Simplified Acute Physiology Score, version 3

Figure 1.

Variability in cluster membership across the 21 different facilities included in the study. Facilities 8, 14, 19, and 21 are those that provide regional specialty care in cardiovascular surgery and neurosurgery.

The six identified clusters differed significantly in all patient and hospitalization characteristics, and exhibited highly distinct features that were clinically recognizable subgroups of ICU patients (Table 2 and Supplemental Table E1). For example, Cluster 1 patients included the relatively healthy ICU patients who presented through the emergency department with low comorbid disease burden and severity of illness. They had the shortest total LOS and infrequently received sedatives or opiates. All patients in Cluster 1 survived hospitalization and were discharged home.

Cluster 2 patients provided a stark contrast. They represented older patients with greater comorbid disease who were admitted with catastrophic critical illness. Nearly 80% of these patients died during their hospital stay. Cluster 3 patients were primarily those admitted for surgery or a procedure, and comprised a greater proportion of those cared for at the four regional specialty hospitals that provide cardiovascular surgery and neurosurgery (Figure 1). This group was characterized by relatively low comorbid disease burden and severity of illness, and all of them survived their admission and were discharged home. Cluster 4 included older patients who generally survived their hospitalization but developed long-term care needs following their ICU stay, as reflected by the large proportion (83.5%) discharged to subacute nursing facilities. Cluster 5 represented younger, previously healthy patients who had prolonged hospitalizations (32.3 ± 36.6 days) but overall had good recovery, with the majority of patients being discharged home. Cluster 6 consisted of elderly patients with the highest predicted in-hospital mortality (22.5 ± 19.7%), all of whom had documented limitations on life-sustaining therapies.

Surrogate validation measures

In addition to the significant differences noted in all patient and hospitalization characteristics across the six identified clusters, each cluster also exhibited variable needs for ICU-mandated treatments (Table 2 and Supplemental Table E1). For example, Cluster 1 had the lowest percentage of patients requiring vasopressors (11.6%) and a low percentage requiring mechanical ventilation (12.5%; only Cluster 3 was lower, at 7.8%), whereas Cluster 5 had the highest percentage of patients requiring mechanical ventilation (65.2%). In the validation cohort of 19,884 patients, the proportion of patients assigned to each cluster was similar to the random cohort and demonstrated comparable, significant differences across clusters for all variables (Table 3 and Supplemental Table E2).

Table 3.

Selected patient and hospitalization characteristics based on putative cluster membership for the 19,884 validation intensive care unit (ICU) admissions. Comparisons are based on analysis of variance or chi-squared tests. All between-cluster comparisons were significant to p < 0.001. Results that are distinct for a particular cluster are bolded for emphasis.

| Characteristic | Cluster 1 (n = 9,771; 49.1%) | Cluster 2 (n = 2,108; 10.6%) | Cluster 3 (n = 4,298; 21.6%) | Cluster 4 (n = 2,910; 14.6%) | Cluster 5 (n = 418; 2.1%) | Cluster 6 (n = 379; 1.9%) |
|---|---|---|---|---|---|---|
| Patient subgroup | Relatively healthy, short-stay ICU patients | Older patients suffering catastrophic illness | Post-surgical and post-procedural patients | Older patients discharged with long-term care needs | Prior healthy patients with prolonged stay and good recovery | Patients with severe illness and desire for limits of life-sustaining therapy |
| Patient | | | | | | |
| Age, years | 61.6 ± 17.4 | 72.3 ± 14.1 | 63.2 ± 14.7 | 73.9 ± 12.5 | 59.4 ± 15.9 | 77.0 ± 11.6 |
| Male | 53.9% | 51.3% | 62.1% | 47.6% | 61.2% | 46.7% |
| Comorbidity (COPS2)^a | 46 ± 47 | 69 ± 56 | 34 ± 33 | 65 ± 56 | 45 ± 50 | 76 ± 54 |
| Hospitalization | | | | | | |
| ED^b admission | 95.8% | 87.6% | 13.8% | 79.1% | 80.6% | 98.7% |
| Most common diagnosis | Sepsis (20.6%) | Sepsis (37.3%) | Acute MI^c (7.9%) | Sepsis (29.1%) | Sepsis (36.1%) | Sepsis (43.1%) |
| Need for procedure | 6.7% | 8.5% | 79.2% | 19.7% | 20.6% | 2.9% |
| Code status | | | | | | |
|   Do not resuscitate | 6.5% | 17.7% | 0.3% | 14.6% | 0.5% | 0.0% |
|   Partial code | 0.0% | 0.8% | 0.0% | 0.0% | 0.5% | 100.0% |
| Predicted hospital mortality | 5.4 ± 8.4% | 18.1 ± 19.4% | 1.3 ± 2.2% | 9.6 ± 12.0% | 9.5 ± 12.7% | 21.9 ± 19.5% |
| Intensive care unit | | | | | | |
| ICU length of stay, days | 1.9 ± 2.1 | 3.5 ± 4.1 | 2.0 ± 2.1 | 3.3 ± 3.8 | 22.0 ± 21.1 | 2.8 ± 4.1 |
| Severity of illness^d | 7.9 ± 8.5% | 21.6 ± 16.8% | 2.8 ± 4.1% | 12.5 ± 11.8% | 14.3 ± 13.4% | 16.1 ± 12.7% |
| Discharge | | | | | | |
| Total length of stay, days | 6.0 ± 6.8 | 7.3 ± 7.7 | 6.7 ± 7.1 | 11.9 ± 11.5 | 44.3 ± 48.4 | 7.5 ± 10.3 |
| Hospital mortality | 0.1% | 89.0% | 0.3% | 0.0% | 18.9% | 22.2% |
| Discharge location | | | | | | |
|   Home | 98.8% | 1.0% | 99.3% | 0.0% | 51.9% | 47.5% |
|   Subacute nursing facility | 0.0% | 0.2% | 0.0% | 98.4% | 27.0% | 23.8% |
|   Hospice | 1.1% | 9.7% | 0.4% | 1.6% | 2.2% | 6.6% |
| Validation metrics | | | | | | |
| Patients requiring any vasopressor | 12.1% | 43.9% | 20.3% | 24.2% | 61.0% | 29.3% |
| Patients requiring mechanical ventilation | 14.8% | 51.2% | 4.9% | 24.5% | 79.4% | 24.8% |

^a Comorbidity Point Score, version 2
^b Emergency Department
^c Myocardial infarction
^d Simplified Acute Physiology Score, version 3

Discussion

As a clinical domain, critical care is a costly and relatively scarce resource (30). Improving the value and efficiency of our critical care delivery system is important in efforts to provide high quality care and contain growing healthcare expenditures (5, 31). However, the heterogeneity and unpredictability of critical illness have hampered easily generalizable solutions (32–34). Fortunately, the ICU’s data-rich environment is particularly suitable for the use of sophisticated machine learning methods that can yield new insight into optimizing the delivery of safe and efficient care (32).

In this study, we demonstrated that clustering analysis enabled the retrospective identification of six clinically distinguishable subgroups from a heterogeneous ICU population. Qualitatively, the clusters are: Cluster 1—the relatively healthy, short-stay ICU patient; Cluster 2—the older patient with catastrophic critical illness; Cluster 3—the post-surgical or procedural ICU patient; Cluster 4—the older ICU patient discharged with long-term care needs; Cluster 5—the previously healthy patient with a prolonged ICU course and good recovery; and Cluster 6—the elderly patient with severe illness and documented limitations on life-sustaining therapies.

Interestingly, sepsis was the most common admitting diagnosis across multiple clusters, yet these subgroups displayed very different clinical needs and trajectories. Our results suggest that, while patients diagnosed with ‘sepsis’ share a similar underlying condition of infection and severe inflammation, grouping them primarily by admitting diagnosis neglects important distinctions between them. In other words, patients’ care needs can differ tremendously despite carrying the same diagnosis. Identifying these distinctions in care needs, and the proportion of ICU patients with similar needs, is an important precursor to future efforts to redesign the organization of ICU around care processes.

Our study also revealed important differences across patient subgroups that had similarly low hospital mortality. Specifically, the three clusters with very low hospital mortality (Clusters 1, 3, and 4) in the sample and validation cohorts comprised 81.6% and 85.4% of all ICU patients, respectively. Despite this, there was striking variability between clusters. While patients in Cluster 1 all went home, for example, over 80% of those in Cluster 4 were discharged to subacute nursing facilities. Here, we see that grouping ICU patients based on illness severity scores or predicted mortality can also neglect important distinctions between them.
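The 81.6% and 85.4% figures follow directly from the cluster counts reported in Tables 2 and 3. The short Python check below reproduces them; the counts are taken from those tables, while the variable and function names are introduced here for illustration.

```python
# Cluster sizes from Table 2 (random sample) and Table 3 (validation cohort)
derivation = {1: 1933, 2: 622, 3: 1250, 4: 897, 5: 207, 6: 91}
validation = {1: 9771, 2: 2108, 3: 4298, 4: 2910, 5: 418, 6: 379}


def share(counts, clusters):
    """Percentage of all admissions that fall in the listed clusters."""
    return 100 * sum(counts[c] for c in clusters) / sum(counts.values())


low_mortality = (1, 3, 4)  # clusters with very low hospital mortality
derivation_share = round(share(derivation, low_mortality), 1)
validation_share = round(share(validation, low_mortality), 1)
print(derivation_share, validation_share)
```

The cluster counts also sum to the cohort sizes reported in the Results (5,000 and 19,884 admissions).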

There are relatively few studies assessing the needs of subgroups of critically ill patients across diagnoses, rather than by disease or illness severity. However, this needs-based framework could help identify opportunities to appropriately reorganize ICU resources given the heterogeneity and expense of critical care. For example, multiple studies have shown that a significant proportion of ICU admissions in the United States are for patients at low risk of dying or needing invasive therapies (35, 36). A better understanding of these patients and their needs is fundamental in efforts to re-design our healthcare system so that we may better meet those needs outside the high fixed-cost environment of the ICU. Our study uses a machine learning approach and highly granular patient data to empirically describe ICU patient subgroups that are not adequately distinguished by traditional methods, such as diagnosis labeling or severity of illness scoring.

In our study, Cluster 1 patients may be a potential target for care redesign efforts, since this cluster contained the greatest proportion of patients who were at low risk of dying or needing invasive therapies, and who survived hospitalization and were discharged home. Similarly, empirically identifying patient subgroups with documented limitations on life-sustaining therapies but who end up in the ICU (e.g., Cluster 6) may lead to further research and insights about deficiencies in our current system that lead to the unwanted receipt of critical care. Once these subgroups are identified and their needs better defined, hospital care delivery processes could be optimized to better meet their needs outside the ICU, improving both the value and efficiency of critical care using an objective, data-driven approach.

Our study has important limitations. First, we used data collected from ICU patients’ hospitalizations, including at the time of discharge, in order to retrospectively identify subgroups of ICU patients through clustering analysis. This method is not meant to serve as a predictive model for patients who may or may not need ICU admission, nor is it intended to support triage decisions at the point of care for individual patients. Instead, we view our study as hypothesis generating, as we attempt to find novel approaches that will allow us to provide optimal, timely care through the design of care platforms tailored around patients’ shared needs.

Second, the study is based on an unsupervised learning approach, which is prone to biases in certain circumstances (e.g., when there are perturbations in the data or when there are outliers that are forced into a specific cluster membership). Furthermore, our validation was performed in a held-out test set, not in a truly independent sample. Thus, further validation will be needed in different populations over time. Finally, the provision and use of ICU care is laden with complex patient, provider, facility, and societal issues that defy easy categorization. While our findings are useful for informing new design models for ICU care, they may not capture other important factors in triage decision-making or operational design.

Conclusions

Clustering analysis successfully identified six distinct, clinically-recognizable subgroups of ICU patients that may represent potential opportunities for care redesign efforts. Our study demonstrates how unsupervised learning methods may offer a novel, data-driven approach to objectively identify ICU patient subgroups with similar clinical trajectories, and provide a framework for organizational innovations in ICU care tailored around patients’ shared needs.

Supplementary Material

Supplemental Methods
Supplemental Table E1
Supplemental Table E2
Supplemental Table E3

Acknowledgments

SOURCES OF FUNDING:

KCV is supported by T32 HL083808 07. JKJ is supported by NIH UL1 TR001085. VXL is supported by NIH K23GM112018. CGS is supported by resources from the VA Portland Health Care System, Portland, Oregon.

Dr. Vranas received support for article research from the National Institutes of Health (NIH). Dr. Jopling received funding and support for article research from NIH UL1 TR001085 grant. Dr. Sweeney received funding from Inflammatix. Dr. Ramsey received funding from the American Thoracic Society. Dr. Slatore disclosed government work. Dr. Escobar’s institution received funding from Merck and the Gordon and Betty Moore Foundation. Dr. Liu received support for article research from the NIH; his institution received funding from NIH K23GM112018; and he received funding from the Permanente Medical Group (employee).

Footnotes

Disclosure: Dr. Milstein has disclosed that he does not have any potential conflicts of interest.

CONFLICTS OF INTEREST:

Authors have disclosed that they do not have any conflicts of interest.

AUTHOR CONTRIBUTIONS:

KCV, JKJ, MCR, and VXL contributed to the conception and design of this study. VXL contributed to data acquisition. VXL and TES contributed to the analysis and interpretation of data. All authors contributed to the preparation and/or revision of this manuscript.

Note: The Department of Veterans Affairs did not have a role in the conduct of the study, in the collection, management, analysis, interpretation of data, or in the preparation of the manuscript. The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs or the U.S. Government.
