Journal of the American Medical Informatics Association: JAMIA. 2022 May 2;29(7):1263–1270. doi: 10.1093/jamia/ocac060

Inclusion of social determinants of health improves sepsis readmission prediction models

Fatemeh Amrollahi 1, Supreeth P Shashikumar 2, Angela Meier 3, Lucila Ohno-Machado 4, Shamim Nemati 5, Gabriel Wardi 6,7
PMCID: PMC9196687  PMID: 35511233

Abstract

Objective

Sepsis has a high rate of 30-day unplanned readmissions. Predictive modeling has been suggested as a tool to identify high-risk patients. However, existing sepsis readmission models have low predictive value and most predictive factors in such models are not actionable.

Materials and Methods

Data from patients enrolled in the AllofUs Research Program cohort from 35 hospitals were used to develop a multicenter validated sepsis-related unplanned readmission model that incorporates clinical and social determinants of health (SDH) to predict 30-day unplanned readmissions. Sepsis cases were identified using concepts represented in the Observational Medical Outcomes Partnership. The dataset included over 60 clinical/laboratory features and over 100 SDH features.

Results

Incorporation of SDH factors into our model of clinical and demographic features improves model area under the receiver operating characteristic curve (AUC) significantly (from 0.75 to 0.80; P < .001). Model-agnostic interpretability techniques revealed demographics, economic stability, and delay in getting medical care as important SDH predictive features of unplanned hospital readmissions.

Discussion

This work represents one of the largest studies of sepsis readmissions using objective clinical data to date (8935 septic index encounters). SDH are important in determining which sepsis patients are more likely to have an unplanned 30-day readmission. The AllofUs dataset provides granular data from a diverse set of individuals, making this model potentially more generalizable than prior models.

Conclusion

Use of SDH improves predictive performance of a model to identify which sepsis patients are at high risk of an unplanned 30-day readmission.

Keywords: readmission, sepsis, machine learning, OMOP, interpretable

INTRODUCTION

Sepsis is one of the most prevalent and deadly conditions in the United States, with 1.7 million cases reported annually resulting in over 270 000 sepsis-related deaths.1 Sepsis is also responsible for a significant economic burden in the United States, resulting in an estimated healthcare cost of $24 billion each year.2 While the economic impact and high mortality of sepsis have long been described, recent data have shown that 30-day readmissions after a hospitalization for sepsis are more common and costly than those for acute myocardial infarction, chronic obstructive pulmonary disease, and congestive heart failure.3 Following hospitalization for sepsis, patients are at risk for immunosuppression, recurrent infection, and significant cognitive, physical, and psychological decline.4–6 Improved systems are needed to identify those at the highest risk for readmission, to develop targeted help for those individuals, and to prevent costly readmissions. A broader understanding of contributing factors is therefore indicated, including an investigation of whether social factors impact readmissions.

Social determinants of health (SDH) are defined as the “conditions in the environments where people are born, live, learn, work, play, worship, and age that affect a wide range of health, functioning, and quality-of-life outcomes and risks”.7 These can be classified further into 5 domains: (1) economic stability, (2) educational access and quality, (3) healthcare access and quality, (4) neighborhood and built environment, and (5) social and community access.7 Prior investigations have shown a significant impact of these factors in readmissions across a variety of conditions, including myocardial infarction, pneumonia, and heart failure.8,9 Additionally, programs focused on overcoming nonmedical obstacles to recovery have shown a significant decrease in 30-, 60-, and 90-day readmission rates, although this was not specific to sepsis patients.10 Importantly, little is known about the association of such SDH and readmissions following hospitalization for sepsis.

Prediction of unplanned readmission within 30 days of hospital discharge is challenging. Many hospitals use scores such as the LACE or LACE+ index (Length of stay, Acuity of admission, Co-morbidities and Emergency department visits) to identify patients at high risk of readmission. However, these scores were not specifically developed for patients with sepsis, and prior work has shown that they have poor ability to identify patients at risk of sepsis readmission within 30 days of discharge. This poor performance limits their utility and may result in inappropriate utilization of resources.11 Other scores have been developed as part of sepsis-specific readmission programs, yet are limited by moderate predictive ability.12 A better understanding of the role SDH have in predicting sepsis readmissions may help healthcare workers predict which patients are at high risk for readmission and align discharge planning and resources accordingly.

Using a large contemporary dataset and patient-level survey information provided by the AllofUs dataset in Observational Medical Outcomes Partnership (OMOP) format, we sought to determine the impact of SDH on 30-day readmissions in patients with sepsis and to establish a model predicting readmissions that includes these variables. We compared this novel deep learning model for predicting unanticipated 30-day readmissions in hospitalized sepsis patients against other models. Our hypothesis was that SDH provide critical data to predict 30-day readmission.

MATERIALS AND METHODS

Study design and setting

This was a retrospective multicenter cohort study consisting of participants in the AllofUs (Registered Tier Dataset v4 CDR data dictionary [R2020Q4R3]) longitudinal study, which contains patient-level data on 265 883 individuals from 35 hospitals across the United States. The AllofUs database was developed to recruit a diverse group of persons, including those previously underrepresented in biomedical research, and captures granular data across various aspects of health, including social and behavioral determinants. Patient data also include demographic information, medical history, and detailed data during hospitalizations, including laboratory results, vital signs, and disposition. Further details regarding this dataset are described elsewhere.13 Institutional Review Board (IRB) approval was obtained prior to enrollment of patients in the AllofUs Research Program. Model development and validation were done according to the TRIPOD reporting framework for predictive modeling (see Supplementary Table S2).14

Patient selection and definitions

Patients 18 years or older who presented to the Emergency Department (ED) and were admitted to the hospital between May 2017 and May 2021 were eligible for inclusion. We identified adult patients (age ≥ 18 years old) who met criteria for sepsis, defined according to the Third International Consensus Definition,15 which required a 2-point change in the patient’s sequential organ failure assessment (SOFA) score and clinical suspicion of infection (including at least 3 days of nonprophylactic antimicrobial coverage and drawing of blood cultures) within 24 h of each other during an admission. We refer to this definition as “Sepsis 3” throughout the text. The time of sepsis was defined as the time at which the first element (either suspicion of infection or change in SOFA score) occurred, as used in our prior work.16 Septic shock was defined as the presence of sepsis plus the initiation of a vasoactive medication (eg, norepinephrine). We included patients who met sepsis criteria at any point during their hospitalization. We excluded patients not admitted to the hospital from the ED (eg, patients who were admitted directly to the hospital and bypassed the ED), patients who did not meet Sepsis 3 criteria during admission, and those less than 18 years old. For patients with multiple hospitalizations for sepsis, we included the first admission (and corresponding 30-day readmission, if present). Thirty-day readmission was defined as a patient readmitted to any of the hospitals in the AllofUs dataset for more than 2 nights within 30 days of the index admission. Index admissions were identified as the first septic hospitalization for each patient admitted through the ED. Patients with a planned procedure (defined according to reference17) within 30 days of their index admission were excluded. We only followed patients for 30 days after their index admission.

Statistical methods

For all continuous variables we report medians ([25th percentile, 75th percentile]) and utilize a 2-sided Wilcoxon Rank-Sum test when comparing 2 populations. For categorical variables we report percentages and utilize a 2-sided Chi-square test to assess differences in proportions between 2 populations. For all statistical tests, a 2-sided alpha of <0.001 was considered significant. The area under the receiver operating characteristic curve (AUC), specificity (Spec), positive predictive value (PPV), and negative predictive value (NPV) at a fixed threshold determined by the training set (80% sensitivity level on the training cohort) were calculated to measure the performance of the models. Statistical comparison of AUCs was performed using the method of DeLong et al.18 Although there is no formal sample size calculation method available to estimate the required sample size for validation of machine learning-based predictive models, some studies have suggested that a minimum events per variable (EPV) ratio of 10 is required to avoid the problem of overfitting.14 Based on the above, the recommended sample size for our study was 1570 (10×157 predictors) or more.
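
The fixed-threshold evaluation described above can be illustrated with a minimal sketch. It assumes risk scores in [0, 1] and binary labels; the function names and the toy scores are hypothetical and are not taken from the study code. The cutoff is chosen on the training scores so that at least 80% of readmitted patients are flagged, and the same cutoff is then applied unchanged to the test scores. The last line repeats the events-per-variable arithmetic using the counts reported in this article.

```python
import math


def threshold_at_sensitivity(scores, labels, target=0.80):
    """Return the highest cutoff whose sensitivity on these data is >= target."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # Number of positive cases that must score at or above the cutoff.
    needed = math.ceil(target * len(positives) - 1e-9)
    return positives[max(needed, 1) - 1]


def _ratio(num, den):
    # Undefined ratios (empty denominator) are reported as NaN.
    return num / den if den else float("nan")


def performance(scores, labels, threshold):
    """Sensitivity, specificity, PPV, and NPV for the rule `score >= threshold`."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (hypothetical): the cutoff is fixed on the training cohort only.
train_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.75, 0.5, 0.4, 0.3, 0.1]
train_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
test_scores = [0.95, 0.65, 0.3, 0.7, 0.5, 0.2]
test_labels = [1, 1, 1, 0, 0, 0]

threshold = threshold_at_sensitivity(train_scores, train_labels)
test_metrics = performance(test_scores, test_labels, threshold)

# Events-per-variable check with the counts reported in the article:
# 1741 readmissions and 157 predictors.
events_per_variable = 1741 / 157
```

Because the cutoff is never re-tuned on the test cohort, test sensitivity can drift away from 80%, which matches the pattern of slightly lower test sensitivities reported in Table 2.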

Model features

A total of 157 predictors of the proposed AllofUs Sepsis-related Unplanned Readmission (ASURe) model were categorized into 3 domains: (1) patient demographics (including age, gender, race, number of prior ED visits, and index admission length of stay; 5 in total), (2) clinical variables (including comorbid conditions, Charlson Comorbidity Index (CCI), surgical procedures, diagnosis, laboratory results, and vital signs; 64 in total), and (3) SDH (88 in total). A total of 64 clinical variables were extracted from the AllofUs dataset and can be found in Supplementary Note A. These clinical variables were selected based on our prior work predicting sepsis readmission and were reviewed by physicians (GW and AM) for clinical relevance and common use by providers.16,19 Laboratory results and vital signs were obtained via automated OMOP queries. (See GitHub Repo: https://github.com/NematiLab/ASURE_Readmission.) The CCI for each patient was calculated using International Classification of Disease (ICD-10) diagnosis codes. SDH were derived from 88 questions from survey results within the AllofUs database that we mapped to NIH-defined SDH categories7,20 of economic stability, education access and quality, healthcare access and quality, neighborhood and built environment, social and community context (see Supplementary Note A for a detailed description of questionnaires that were included). Any SDH feature with more than 80% missing values was excluded. For every dynamic temporal clinical observation and measurement, 5%, 50%, and 95% quantiles over the last 48 h prior to discharge were recorded.
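
The quantile summarization of dynamic measurements can be sketched as follows. This is an illustration under stated assumptions rather than the study's OMOP query code: observations are assumed to be (timestamp, value) pairs, the window is assumed to include its boundaries, and the quantile uses linear interpolation between order statistics. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighboring order statistics."""
    if not sorted_values:
        raise ValueError("quantile of an empty list is undefined")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)  # pos is non-negative, so int() acts as floor
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def summarize_last_48h(observations, discharge_time, window_hours=48):
    """Summarize one measurement over the window ending at discharge.

    Returns None when the window holds no observation, so the feature
    stays missing and can be imputed downstream.
    """
    start = discharge_time - timedelta(hours=window_hours)
    values = sorted(v for t, v in observations if start <= t <= discharge_time)
    if not values:
        return None
    return {
        "q05": quantile(values, 0.05),
        "q50": quantile(values, 0.50),
        "q95": quantile(values, 0.95),
    }


# Toy heart-rate series (hypothetical values); the first reading is older
# than 48 h before discharge and is therefore ignored.
discharge = datetime(2020, 1, 10, 12, 0)
heart_rate = [
    (datetime(2020, 1, 7, 12, 0), 150.0),
    (datetime(2020, 1, 8, 12, 0), 80.0),
    (datetime(2020, 1, 9, 0, 0), 90.0),
    (datetime(2020, 1, 9, 12, 0), 100.0),
    (datetime(2020, 1, 10, 0, 0), 110.0),
    (datetime(2020, 1, 10, 12, 0), 120.0),
]
summary = summarize_last_48h(heart_rate, discharge)
```

Each dynamic measurement therefore contributes three columns to the feature matrix, which summarizes the low end, the center, and the high end of the patient's values shortly before discharge.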

Development of the ASURe model

The ASURe model is a 1-layered feedforward neural network (of size 45) trained to predict 30-day unplanned hospital readmission among septic index admissions. As described previously, patient demographics, clinical variables, and SDH factors constituted inputs to the ASURe model. Model interpretability and feature importance assessment were achieved using relevance score analysis.19,21 The relevance score was obtained by taking the derivative (or gradient) of the output with respect to all input features and multiplying it by the input features. Since a neural network with a hidden layer and sufficient number of nodes is capable of capturing nonlinear and multiplicative risk factors, the relevance score has the additional advantage to reveal nonlinear relationships (eg, U shape) among the features and the risk score. To assess the additional contribution of SDH factors over demographics and clinical variables, first a simplified model based on the latter 2 feature categories was constructed (ASURe-base), followed by an augmented model that also included SDH features (ASURe). Finally, performance of the ASURe model was compared against a logistic regression model (LR) constructed using all the above feature categories.
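
To make the relevance score concrete, the following sketch computes the gradient of the output with respect to each input and multiplies it by that input, for a network with a single hidden layer. The tanh hidden activation, the sigmoid output, the two-unit hidden layer, and all weights are assumptions chosen for illustration; the article does not state the activation functions, and the actual model has a hidden layer of size 45.

```python
import math


def forward(x, W1, b1, w2, b2):
    """One hidden layer (tanh, assumed) followed by a sigmoid output (assumed)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    risk = 1.0 / (1.0 + math.exp(-logit))
    return hidden, risk


def relevance_scores(x, W1, b1, w2, b2):
    """Gradient of the risk with respect to each input, times that input."""
    hidden, risk = forward(x, W1, b1, w2, b2)
    d_risk_d_logit = risk * (1.0 - risk)  # derivative of the sigmoid
    scores = []
    for j, xj in enumerate(x):
        # Chain rule through every hidden unit; d tanh(a)/da = 1 - tanh(a)^2.
        d_logit_d_xj = sum(
            w2[k] * (1.0 - hidden[k] ** 2) * W1[k][j] for k in range(len(hidden))
        )
        scores.append(xj * d_risk_d_logit * d_logit_d_xj)
    return scores


# Hypothetical weights and one standardized input vector.
W1 = [[0.5, -0.3], [0.2, 0.8]]
b1 = [0.0, 0.1]
w2 = [1.0, -0.5]
b2 = -0.2
x = [1.0, 2.0]
scores = relevance_scores(x, W1, b1, w2, b2)
```

The sign of each score indicates whether the feature pushes the predicted risk up or down for that patient. Because the gradient depends on the hidden activations, the same feature can receive scores of opposite sign in different patients, which is how a U-shaped relationship would show up.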

Data processing, training, and hyperparameters

All the models were trained and evaluated using 10-fold cross-validation. Within each fold, the entire cohort was randomly split into training and testing cohorts in the ratio 80%/20%, respectively. Within each fold, the training set was first standardized by applying normalization transformations, followed by subtracting the mean and dividing by the standard deviation. The testing cohort was then standardized using the same transformations. Mean imputation was used to replace missing values of continuous variables, and K-nearest neighbor imputation (K = 10) was used for categorical variables. The latter choice of imputation for categorical variables was based on the fact that we could not rule out the possibility of correlation between missingness of SDH variables and other observed features.
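
A minimal sketch of the fit-on-training, apply-to-testing pattern for one continuous variable is given below. It assumes missing values are encoded as None and that the mean and standard deviation are computed from the observed training values only. The order of imputation and standardization is an assumption, the K-nearest neighbor step for categorical variables is not shown, and the function names are hypothetical.

```python
import math


def fit_column(train_values):
    """Learn the mean and standard deviation from observed training values."""
    observed = [v for v in train_values if v is not None]
    mean = sum(observed) / len(observed)
    variance = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(variance) or 1.0  # guard against constant columns
    return mean, std


def transform_column(values, mean, std):
    """Mean-impute missing entries, then standardize with training statistics."""
    return [((mean if v is None else v) - mean) / std for v in values]


# Toy column (hypothetical values) with one missing entry in each cohort.
train = [1.0, 3.0, None, 5.0]
test = [None, 7.0]

mean, std = fit_column(train)
train_z = transform_column(train, mean, std)
test_z = transform_column(test, mean, std)  # reuses the training mean and std
```

Reusing the training statistics for the testing cohort keeps information from the held-out patients out of the preprocessing step, so the test metrics are not optimistically biased by leakage.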

All hyperparameters of the model (number and size of neural network layers, learning rate, mini-batch size, L1 regularization parameter, and L2 regularization parameter) were selected using 10-fold cross-validation on the training cohort via Bayesian Optimization. The ASURe model was trained using an Adam optimizer with the learning rate set to 1e−3. To minimize overfitting and to improve the generalizability of the model, L1–L2 regularization was used, with the L2 regularization parameter set to 1e−2 and the L1 regularization parameter set to 1e−3. The mini-batch size during training was set to 1000 (50% septic readmission/50% control). All data analysis was performed using Jupyter Notebook 6.4.3 on the AllofUs Research Workbench, which is hosted in a secure cloud environment within the Google Cloud Platform. All preprocessing of data was performed using NumPy.22 The baseline logistic regression models and the ASURe model were implemented using TensorFlow (Keras 2.6.0).
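
Two of the training choices above, class-balanced mini-batches and the combined L1–L2 penalty, can be sketched in a few lines. This is an illustration only: sampling without replacement within each class, the index layout, and the function names are assumptions, and the study's actual Keras training loop is not reproduced here. The cohort counts and penalty strengths are those reported in this article.

```python
import random


def balanced_batch(pos_idx, neg_idx, batch_size, rng):
    """Draw a mini-batch with equal numbers of readmitted and control cases.

    Assumes each class holds at least batch_size // 2 cases; otherwise
    rng.sample raises ValueError.
    """
    half = batch_size // 2
    batch = rng.sample(pos_idx, half) + rng.sample(neg_idx, batch_size - half)
    rng.shuffle(batch)
    return batch


def l1_l2_penalty(weights, l1=1e-3, l2=1e-2):
    """Penalty added to the loss: l1 * sum(|w|) + l2 * sum(w^2)."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)


# Cohort sizes from Table 1: 1741 readmitted and 7194 non-readmitted patients.
# The index layout (readmitted first) is hypothetical.
readmitted = list(range(0, 1741))
controls = list(range(1741, 1741 + 7194))

rng = random.Random(0)
batch = balanced_batch(readmitted, controls, 1000, rng)
penalty = l1_l2_penalty([1.0, -2.0])
```

With roughly 19% of index admissions followed by a readmission, uniform sampling would give about 190 readmitted cases per batch of 1000; balanced sampling raises this to 500, so each gradient step sees both outcomes equally often.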

RESULTS

Baseline characteristics

The AllofUs dataset included 442 783 ED encounters that met our inclusion criteria (see Figure 1), of which 209 690 encounters met our criteria for hospitalization (length of stay of 48 h or more). Among the hospitalized encounters, 21 871 encounters (10.4%) met the criteria for Sepsis 3. We further focused on the prediction of unplanned readmissions only for the septic index admissions, which yielded a total of 8935 septic encounters with 1741 unplanned 30-day readmissions (19%).

Figure 1.

Population study.

A detailed flowchart of the patient selection process is shown in Figure 1. The median time to readmission was 13 days [7 days–20 days]. The most common reasons for readmission were infectious complications (26.2%), with sepsis being the most prevalent cause within this category, followed by miscellaneous causes (13.6%), cardiovascular causes (11.6%), and genitourinary causes (8.4%). Patients who experienced unexpected 30-day readmission were more likely to be male (54.0% vs 45.4%, P < .001), had more comorbidities (CCI of 4.4 vs 3.2, P < .001), were more likely to require care in an intensive care unit (11.4% vs 6.0%), and had a longer median length of stay (150 h vs 137 h, P < .001). Additionally, these patients had more prior hospitalizations in the 6 months preceding the index admission for sepsis (median of 2 vs 0, P < .001) and were more likely to be discharged to a skilled nursing facility (32.4% vs 24.8%). See Table 1 for a detailed description of patient characteristics.

Table 1.

Characteristics of patients who were alive following a sepsis index admission

| Characteristic | Without 30-day readmission (N = 7194) | With 30-day readmission (N = 1741) |
| --- | --- | --- |
| Age (years) [IQR] | 52 [40–61] | 53 [39–63]^a |
| Gender (male) (%) | 45.4 | 54.0^a |
| Race: Asian (%) | 1.8 | 3.1^a |
| Race: Black or African American (%) | 29.4 | 39.9^a |
| Race: White (%) | 38.7 | 36.4^a |
| Race: More than one race (%) | 2.1 | 0.8 |
| Race: Skip or prefer not to answer or not indicated (%) | 28.0 | 19.8 |
| Ethnicity: Hispanic or Latino (%) | 21.4 | 24.2 |
| Ethnicity: Not Hispanic or Latino (%) | 77.5 | 71.2 |
| Ethnicity: Skip or prefer not to answer or not indicated (%) | 1.0 | 4.7 |
| Comorbidities: Malignancy (%) | 5.0 | 2.4 |
| Comorbidities: Heart failure (%) | 13.5 | 23.4^a |
| Comorbidities: Chronic kidney disease (%) | 18.3 | 26.6^a |
| Comorbidities: Diabetes (%) | 25.3 | 27.7^a |
| Comorbidities: Chronic lung disease (%) | 24.9 | 27.8 |
| Charlson comorbidity index (CCI) [IQR] | 3.2 [1.8–5.4] | 4.4 [2.6–6.3]^a |
| Number of prior hospitalizations in last 6 months | 0 [0–8] | 2 [0–21]^a |
| Intensive care unit admission (%) | 6.0 | 11.4 |
| Discharge disposition: Home (%) | 39.6 | 31.9 |
| Discharge disposition: Skilled nursing facility (%) | 24.8 | 32.4 |
| Discharge disposition: Home with home health (%) | 19.7 | 24.6 |
| Discharge disposition: Hospice (%) | 2.0 | 3.0 |
| Discharge disposition: Left against medical advice (%) | 0.9 | 3.5 |
| Discharge disposition: Other^b (%) | 14.0 | 4.7 |

^a Statistically significant difference between septic patients with and without 30-day readmission.

^b Includes transfer to another acute care facility or long-term acute care facility.

Association of social determinants of health with 30-day unexpected readmissions

Table 2 compares our 2 baseline models, namely ASURe-Base and LR, with the ASURe model. On the test set, the ASURe model had a significantly higher AUC (0.80 [0.80–0.82]) than ASURe-Base (0.75 [0.71–0.77]) and LR (0.73 [0.70–0.76]), with P < .001 for both comparisons by the DeLong test. At the threshold corresponding to 80% sensitivity on the training cohort, the median specificity of the ASURe model on the test set was 65 [64–67]%, with median PPV and NPV values of 26 [25–28]% and 90 [90–91]%, respectively. Additionally, the AUC and the area under the precision recall curve of the ASURe model corresponding to the best fold are shown in Figure 2.

Table 2.

Summary of performance across the 2 baseline models (ASURe-Base and LR) and ASURe model

| Metric (median [IQR]) | Cohort | LR | ASURe-Base | ASURe |
| --- | --- | --- | --- | --- |
| AUC | Training | 0.77 [0.75–0.79] | 0.77 [0.74–0.78] | 0.82 [0.80–0.83] |
| AUC | Test | 0.73 [0.70–0.76] | 0.75 [0.71–0.77] | 0.80 [0.80–0.82]** |
| SPC (in %) | Training | 58 [56–60] | 60 [59–65] | 68 [66–69] |
| SPC (in %) | Test | 52 [49–56] | 57 [52–61] | 65 [64–67] |
| SEN (in %) | Training | 80 [80–80] | 80 [80–80] | 81 [80–81] |
| SEN (in %) | Test | 77 [74–78] | 78 [76–79] | 79 [78–80] |
| PPV (in %) | Training | 24 [22–25] | 24 [21–25] | 28 [26–30] |
| PPV (in %) | Test | 20 [18–23] | 23 [19–24] | 26 [25–28] |
| NPV (in %) | Training | 88 [87–88] | 88 [86–88] | 92 [91–93] |
| NPV (in %) | Test | 84 [83–87] | 87 [83–88] | 90 [90–91] |

NPV: negative predictive value; PPV: positive predictive value; SEN: sensitivity; SPC: specificity.

** The DeLong test shows there is a statistically significant difference in AUCs between baseline models and the SDH augmented model (ASURe) across all folds (P < .001).

Figure 2.

The area under the receiver operating characteristic curve and the area under the precision recall curve for the ASURe model are shown in panels (A) and (B), respectively. The model corresponding to the highest AUC on the holdout test set across the 10 folds was chosen.

Top contributing factors to sepsis readmission

We performed a model explanation analysis using the relevance score to describe the relationship between the input features and the risk scores at the population level. Table 3 depicts the predictive SDH features that were identified as statistically significant and their relevance scores. For example, delayed visits due to lack of access to transportation or having completed education only up to high school resulted in increased risk for readmission. On the other hand, annual income in the range of $50 000–$75 000 or having advanced college degrees were protective. Supplementary Figures S1 and S2 show the clinical variables contributing to the increased risk score and their respective direction of influence, which was deduced from the sign of the gradient in the relevance score calculations. For example, in over 60% of patients an elevated temperature (>40°C) was associated with increased odds of readmission (colored in blue) and, in roughly 40% of patients, hypothermia also increased the risk of readmission (colored in red), indicating the nonlinear nature of the risk associated with the input features.

Table 3.

Relevance score of SDH features associated with increased risk of readmission

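| Domain | Predictive feature | Relevance score |
| --- | --- | --- |
| Demographics | Race (Asian) | 0.3 [0.3, 0.4] |
| Demographics | Race (Black or African American) | 0.09 [0.08, 0.12] |
| Demographics | Gender (male) | 0.3 [0.12, 0.35] |
| Demographics | Sexual orientation (LGBTQ) | 0.13 [0.12, 0.15] |
| Demographics | Have you ever used nicotine products?: Yes | 0.009 [0.007, 0.01] |
| Economic stability | Income level | 0.15 [0.13, 0.16] |
| Economic stability | Income level: Less than 10k | 0.09 [0.05, 0.17] |
| Economic stability | Income level: 10–25k | 0.08 [0.001, 0.14] |
| Economic stability | Income level: 50–75k | −0.27 [−0.3, −0.07] |
| Economic stability | Income level: 100–150k | −3e−4 [−0.01, 2e−7] |
| Economic stability | Income level: >200k | 0.16 [0.18, 0.15] |
| Economic stability | Currently unemployed?: Yes | 0.04 [0.03, 0.04] |
| Economic stability | Are you not covered by health insurance or some other kind of health care plan?: Yes | 0.13 [0.11, 0.14] |
| Education access and quality | What is the highest grade or year of school you completed? | 0.02 [0.01, 0.03] |
| Education access and quality | Highest grade completed: Advanced degree | −0.05 [−0.15, −1.54e−4] |
| Education access and quality | Highest grade completed: College degree | −0.04 [−0.08, −1.95e−4] |
| Education access and quality | Highest grade completed: GED | 0.03 [2.25e−5, 0.1] |
| Education access and quality | Highest grade completed: High school or less | 0.09 [1.95e−5, 0.12] |
| Healthcare access and quality^a | Have you delayed visiting a doctor over the past 12 months because you did not have transportation?: Yes | 0.4 [0.3, 0.4] |
| Healthcare access and quality^a | In general, would you say your health is: | 0.2 [0.2, 0.3] |
| Healthcare access and quality^a | General health: Excellent | −0.13 [−0.16, −0.1] |
| Healthcare access and quality^a | General health: Very good | −0.14 [−0.21, −0.12] |
| Healthcare access and quality^a | General health: Good | −0.14 [−0.23, −0.12] |
| Healthcare access and quality^a | General health: Fair | 0.15 [0.12, 0.24] |
| Healthcare access and quality^a | General health: Poor | 0.17 [0.14, 0.19] |
| Healthcare access and quality^a | To what extent are you able to carry out your everyday physical activities? | 0.2 [0.08, 0.3] |
| Healthcare access and quality^a | Everyday physical activities: Not at all | 0.11 [0.07, 0.15] |
| Healthcare access and quality^a | Everyday physical activities: A little | 0.11 [0.01, 0.14] |
| Healthcare access and quality^a | Everyday physical activities: Moderately | −0.08 [−0.1, −0.04] |
| Healthcare access and quality^a | Everyday physical activities: Mostly | −0.1 [−0.12, −0.06] |
| Healthcare access and quality^a | Everyday physical activities: Completely | −0.16 [−0.21, −0.12] |
| Healthcare access and quality^a | How confident are you in filling out medical forms by yourself? | 0.14 [0.11, 0.16] |
| Healthcare access and quality^a | Confidence with medical forms: Extremely | −0.09 [−0.15, −0.019] |
| Healthcare access and quality^a | Confidence with medical forms: Quite a bit | −0.07 [−0.14, −0.02] |
| Healthcare access and quality^a | Confidence with medical forms: Somewhat | 1.82e−6 [−0.03, 0.02] |
| Healthcare access and quality^a | Confidence with medical forms: A little bit | 0.02 [−0.02, 0.13] |
| Healthcare access and quality^a | Confidence with medical forms: Not at all | 0.02 [0.02, 0.14] |
| Neighborhood and built environment | In the past 6 months, have you been worried or concerned about NOT having a place to live?: Yes | 0.08 [0.07, 0.08] |
| Neighborhood and built environment | Do you rent the place where you live?: Yes | 6e−4 [5e−4, 8e−4] |

Note: Only those features whose relevance scores were identified as statistically significant are shown. A positive relevance score corresponds to increased risk of readmission, whereas a negative relevance score corresponds to reduced risk of readmission.

^a Due to space constraints only 5 features under ‘Healthcare Access and Quality’ have been shown. Please see Supplementary Table S3 for the entire list.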

DISCUSSION

The central finding of this multicenter longitudinal cohort study is that certain SDH are strongly associated with unplanned 30-day sepsis readmissions and that the inclusion of such information into a predictive model for readmissions can significantly improve predictive ability and model actionability. Our results highlight the importance of SDH in identifying which patients may benefit from additional resources around the time of discharge, or postdischarge, to prevent 30-day readmissions. Additionally, our manuscript highlights the use of the OMOP common data model as a viable approach to abstract data from many institutions, as afforded in the AllofUs dataset. Importantly, the AllofUs dataset draws from a diverse group of individuals and provides not only robust survey data but also granular data regarding hospitalizations, which enabled us to define sepsis using objective clinical data in electronic health records. This is in contrast to less accurate approaches (eg, administrative coding) that are commonly used for larger observational studies in sepsis.23,24

An explicit goal of the AllofUs program is to enroll a diverse group of individuals, including those historically underrepresented in biomedical research, to better represent patients cared for in the United States. Prior work has highlighted that a lack of diversity and restrictive data access policies impair generalizability and may lead to incorrect conclusions based on less diverse datasets.25,26 Data harmonization through OMOP allows multiple large datasets to be mapped to a standardized vocabulary and combined with minimal information loss, thus potentially improving generalizability of results. In the past few years, there has been significant interest in the use of OMOP for harmonization of multicenter datasets to study a variety of disease processes, including cancer, organ transplant, coronavirus disease 2019, and heart disease.27–31 Granular data, as well as detailed survey information, allow researchers to draw robust and potentially more generalizable conclusions. This can help avoid a common criticism of large administrative database studies that contain demographic and billing data but lack relevant granular data about patients (eg, individual laboratory results or vital signs). Since enrollments are ongoing and the AllofUs dataset is currently growing in size, this work provides a benchmark for future epidemiological and clinical studies involving septic patients.

Although the use of OMOP allows for data harmonization across a large number of medical centers and does provide granular data, it is not without limitations. In particular, we noted some difficulty with obtaining data from flowsheets (eg, timing of administration of intravenous fluids). To minimize any loss of specificity, we avoided use of flowsheet data and instead focused on laboratory results, demographics, and SDH features, which are reliably captured in the AllofUs dataset. Additionally, we did not have access to genomic data, although more recent versions of the AllofUs dataset plan to provide genomic data, which may allow for more precise care of patients with sepsis and augment predictive abilities.

It is well-known that sepsis patients are at risk of unplanned 30-day readmissions and that these patients are more likely to be readmitted than patients with heart failure, myocardial infarctions, and COPD exacerbations.3,32 Multiple reasons have been discussed to explain this phenomenon, including prolonged immunosuppression,33–35 physical weakness and deconditioning, impaired cough and swallowing mechanisms, as well as cognitive impairment.5,36 Importantly, recent reviews describing risk factors for sepsis readmissions focus on “traditional” components of readmission predictors, such as age, comorbidities, and illness severity. These factors make up a large component of popular scoring systems (eg, LACE+ index) to predict readmissions. Yet, prior work has shown that scoring systems have poor to moderate ability to predict unplanned 30-day sepsis readmissions, including our prior work where we showed the AUC of the LACE+ score was 0.64 for predicting unplanned 30-day sepsis readmissions.11,12 The ASURe model provides an alternative approach to the LACE+ score using commonly used clinical variables and demographic data, and augmenting it with SDH variables resulted in a statistically significant improvement of AUC from 0.75 to 0.80. Importantly, the ASURe model had strong negative predictive ability (90%) at the selected threshold, which may help providers determine which patients do not need aggressive therapy and monitoring in the postdischarge phase. Additionally, although the PPV was low (26%), the model might help clinicians adjudicate who should be prioritized to receive aggressive postdischarge care to help prevent readmissions. Importantly, it is estimated that unplanned sepsis readmissions are responsible for at least $3.5 billion of healthcare expenditures within the United States each year32 and a mean cost per patient of $16 852.32 This huge economic impact certainly justifies intensified postdischarge measures, and the ASURe model may better align resources for these patients than the baseline models used for comparison, as readmission programs are demonstrating an ability to keep high-risk patients out of the hospital.12,37 Although the ASURe model is an improvement over classic logistic regression models such as the LACE+ index, the ability to capture higher-order interactions among risk factors is important. More advanced models with multiple layers of neural networks and additional data may improve predictive abilities and should be an area of investigation for future research in this area.

To date, there have been only a handful of investigations into SDH factors that predict unplanned sepsis readmissions. In a large evaluation using administrative data from the University Health System Consortium, Donnelly et al38 showed that payor type was associated with 30-day readmissions following a sepsis hospitalization, although the authors did not find any associations between race and readmissions. A recent publication showed that a greater level of neighborhood deprivation was strongly associated with readmission.39 Our findings are confirmatory of Donnelly et al in that we show insurance status is a strong predictor of readmissions. Our findings also add to the literature by showing that SDH factors such as healthcare access and quality have a major role in sepsis readmissions. We identified a number of potentially actionable factors, such as poor transportation to obtain healthcare, inability to pay for specific aspects of medical care, or lack of insurance, that were strongly associated with a 30-day readmission. Case managers and others can use this information to design specific interventions (eg, arranging transportation to postdischarge clinic appointments) to help these at-risk patients obtain the required services and therefore aim to prevent readmission. In addition to modifiable risk factors for unplanned 30-day readmissions, we identified several not immediately modifiable variables that had strong associations with readmissions. Male gender, Asian or Black race, economic and housing instability, and education level were all identified as increasing risk of readmission. These have been previously described as factors for readmission, although not specifically in sepsis patients, and it is uncertain if hospital readmission programs are effective when targeting these populations.40–42 The AllofUs dataset did not include data on digital literacy and Internet connectivity. These have been highlighted recently as “super” SDH,43 and future investigations should include such information, as greater reliance on digital tools may worsen health disparities, particularly as more health systems rely on patient portals, remote monitoring systems, and health apps.

We acknowledge limitations to this study. Our analysis is based on a retrospective longitudinal cohort study that carries inherent limitations in design and generalizability, although we used a large robust dataset specifically designed to include persons of diverse backgrounds. Data on the SDH were obtained largely from survey responses and self-reported answers, which may not accurately reflect actual conditions or status of a respondent. Certain SDH variables, such as neighborhood deprivation index, digital connectivity, and literacy, were not included in the AllofUs dataset, which may have impacted our model’s predictive abilities. Other SDH variables with more than 80% missingness were excluded, which might bias our findings and discount the importance of the corresponding SDH factors. Nevertheless, the 88 SDH variables that were included in our models significantly improved our predictive performance, highlighting the importance of accounting for such factors in predictive models and the need for additional investigation in this domain. We chose to use an “implicit” diagnosis of sepsis that was based on clinical suspicion of infection, which may not accurately reflect true infection. However, “implicit” diagnosis of sepsis (eg, based on clinical suspicion and evidence of organ failure) has been reported to be superior to administrative diagnosis (eg, coding and billing diagnosis of sepsis), and given the large number of patients and the data source, manual chart review was impossible to perform.44 Moreover, the ASURe model was not prospectively validated, although we aim to implement a similar model at UC San Diego Health in collaboration with its population health and care management team.

CONCLUSION

Using a large dataset composed of subjects from diverse backgrounds, we describe ASURe, a machine learning model developed to predict 30-day readmission following an index sepsis hospitalization based on clinical, demographic, and SDH features. The model’s predictive ability and actionability were significantly augmented by the incorporation of SDH as input features. Future studies are required to prospectively validate these findings and further explore the relationship between SDH, readmissions, and patient-centered outcomes.

FUNDING

Dr. Nemati is funded by the National Institutes of Health (#R56LM013517, #R35GM143121) and the Gordon and Betty Moore Foundation (#GBMF9052). Dr. Wardi has been supported by the National Foundation of Emergency Medicine, the Gordon and Betty Moore Foundation (GBMF9052) and the National Institutes of Health (R35GM143121). Dr. Meier is supported by the National Institutes of Health (1KL2TR001444). Dr. Meier was supported by the NIH NHLBI (PI Dr. Crotty-Alexander, unrelated to this study). Dr. Machado is funded by the National Institutes of Health (OT2OD026552 and RM1HG011558). Dr. Shashikumar and Ms. Amrollahi have no sources of funding to declare. The opinions or assertions contained herein are the private ones of the author and are not to be construed as official or reflecting the views of the Department of Defense, the NIH or any other agency of the US Government.

AUTHOR CONTRIBUTIONS

SN, GW, and SPS were involved in the original conception and design of the work. FA developed the network architectures and conducted the experiments. SPS, SN, GW, and AM reviewed the experiments and contributed to the interpretation of results. LOM provided clinical expertise and contributed to the write-up. FA, SPS, GW, AM, LOM, and SN wrote and edited the manuscript.

ETHICAL APPROVAL

Institutional Review Board (IRB) approval was obtained prior to enrollment of patients in the AllofUs Research Program.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

CONFLICT OF INTEREST STATEMENT

None declared.

DATA AVAILABILITY

The AllofUs cohort is publicly available at www.allofus.nih.gov. The AllofUs cohort version number 4, consisting of data from all participants who enrolled from the beginning of the program on May 30, 2017 to May 6, 2021, was used in this study. The OMOP queries used to extract data from the AllofUs database will be made available at https://github.com/NematiLab/ASURE_Readmission.

Contributor Information

Fatemeh Amrollahi, Division of Biomedical Informatics, University of California San Diego, San Diego, California, USA.

Supreeth P Shashikumar, Division of Biomedical Informatics, University of California San Diego, San Diego, California, USA.

Angela Meier, Division of Pulmonary, Critical Care and Sleep Medicine, University of California San Diego, San Diego, California, USA.

Lucila Ohno-Machado, Division of Biomedical Informatics, University of California San Diego, San Diego, California, USA.

Shamim Nemati, Division of Biomedical Informatics, University of California San Diego, San Diego, California, USA.

Gabriel Wardi, Division of Pulmonary, Critical Care and Sleep Medicine, University of California San Diego, San Diego, California, USA; Department of Emergency Medicine, University of California San Diego, San Diego, California, USA.

REFERENCES

  • 1. Rhee C, Dantes R, Epstein L, et al.; CDC Prevention Epicenter Program. Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009–2014. JAMA 2017; 318 (13): 1241–9.
  • 2. Torio CM, Moore BJ. National inpatient hospital costs: the most expensive conditions by payer, 2013: statistical brief #204. In: Healthcare Cost and Utilization Project (HCUP) Statistical Briefs. Rockville, MD: Agency for Healthcare Research and Quality (US); 2006. http://www.ncbi.nlm.nih.gov/books/NBK368492/ Accessed January 21, 2022.
  • 3. Mayr FB, Talisa VB, Balakumar V, et al. Proportion and cost of unplanned 30-day readmissions after sepsis compared with other medical conditions. JAMA 2017; 317 (5): 530–1.
  • 4. Prescott HC, Angus DC. Enhancing recovery from sepsis: a review. JAMA 2018; 319 (1): 62–75.
  • 5. Iwashyna TJ, Ely EW, Smith DM, et al. Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA 2010; 304 (16): 1787–94.
  • 6. Readmission diagnoses after hospitalization for severe sepsis and other acute medical conditions. JAMA | JAMA Network. https://jamanetwork.com/journals/jama/fullarticle/2190975 Accessed January 21, 2022.
  • 7. Social Determinants of Health – Healthy People 2030. health.gov. https://health.gov/healthypeople/objectives-and-data/social-determinants-health Accessed January 21, 2022.
  • 8. Meddings J, Reichert H, Smith SN, et al. The impact of disability and social determinants of health on condition-specific readmissions beyond Medicare risk adjustments: a cohort study. J Gen Intern Med 2017; 32 (1): 71–80.
  • 9. Chin MH, Goldman L. Correlates of early hospital readmission or death in patients with congestive heart failure. Am J Cardiol 1997; 79 (12): 1640–4.
  • 10. Evans WN, Kroeger S, Munnich EL, et al. Reducing readmissions by addressing the social determinants of health. Am J Health Econ 2021; 7 (1): 1–40.
  • 11. Wardi G, Shashikumar S, Allen T, et al. 1233: development and validation of a novel machine learning algorithm to predict sepsis readmissions. Crit Care Med 2021; 49 (1): 620.
  • 12. Taylor SP, Murphy S, Rios A, et al. Effect of a multicomponent sepsis transition and recovery program on mortality and readmissions after sepsis: the improving morbidity during post-acute care transitions for sepsis randomized clinical trial. Crit Care Med 2022; 50 (3): 469–79.
  • 13. The “All of Us” Research Program. N Engl J Med. https://www.nejm.org/doi/full/10.1056/NEJMsr1809937 Accessed January 21, 2022.
  • 14. Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med 2015; 13: 1.
  • 15. Singer M, Deutschman CS, Seymour CW, et al. The Third International Consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 2016; 315 (8): 801–10.
  • 16. Wardi G, Carlile M, Holder A, et al. Predicting progression to septic shock in the emergency department using an externally generalizable machine-learning algorithm. Ann Emerg Med 2021; 77 (4): 395–406.
  • 17. Horwitz L, Partovian C, Lin Z, et al. Measure updates and specification report: hospital-wide all-cause risk-standardized readmission measure–version 4.0. Prepared for the Centers for Medicare and Medicaid Services. Prepared by Centers for Medicare Medicaid Services New Haven CT Yale New Haven Health Services Corporation Outcomes Research and Evaluation; 2015: 26.
  • 18. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44 (3): 837–45.
  • 19. Nemati S, Holder A, Razmi F, et al. An interpretable machine learning model for accurate prediction of sepsis in the ICU. Crit Care Med 2018; 46 (4): 547–53.
  • 20. About Social Determinants of Health (SDOH); 2021. https://www.cdc.gov/socialdeterminants/about.html Accessed January 24, 2022.
  • 21. Baehrens D, Schroeter T, Harmeling S, Kawanabe M, Hansen K, Müller K-R. How to explain individual classification decisions. J Mach Learn Res 2010; 11: 1803–31.
  • 22. Harris CR, Millman KJ, van der Walt SJ, et al. Array programming with NumPy. Nature 2020; 585 (7825): 357–62.
  • 23. Iwashyna TJ, Odden A, Rohde J, et al. Identifying patients with severe sepsis using administrative claims: patient-level validation of the angus implementation of the international consensus conference definition of severe sepsis. Med Care 2014; 52 (6): e39–43.
  • 24. Rhee C, Li Z, Wang R, et al. Impact of risk adjustment using clinical vs administrative data on hospital sepsis mortality comparisons. Open Forum Infect Dis 2020; 7 (6): ofaa213.
  • 25. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature 2016; 538 (7624): 161–4.
  • 26. Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 2016; 3: 160018.
  • 27. Sun Y, Butler A, Stewart LA, et al. Building an OMOP common data model-compliant annotated corpus for COVID-19 clinical trials. J Biomed Inform 2021; 118: 103790.
  • 28. Belenkaya R, Gurley MJ, Golozar A, et al. Extending the OMOP common data model and standardized vocabularies to support observational cancer research. JCO Clin Cancer Inform 2021; 5: 12–20.
  • 29. Cho S, Sin M, Tsapepas D, et al. Content coverage evaluation of the OMOP vocabulary on the transplant domain focusing on concepts relevant for kidney transplant outcomes analysis. Appl Clin Inform 2020; 11 (4): 650–8.
  • 30. Papez V, Moinat M, Payralbe S, et al. Transforming and evaluating electronic health record disease phenotyping algorithms using the OMOP common data model: a case study in heart failure. JAMIA Open 2021; 4 (3): ooab001.
  • 31. Prats-Uribe A, Sena AG, Lai LYH, et al. Use of repurposed and adjuvant drugs in hospital patients with covid-19: multinational network cohort study. BMJ 2021; 373: n1038.
  • 32. Gadre SK, Shah M, Mireles-Cabodevila E, et al. Epidemiology and predictors of 30-day readmission in patients with sepsis. Chest 2019; 155 (3): 483–90.
  • 33. Hotchkiss RS, Monneret G, Payen D. Sepsis-induced immunosuppression: from cellular dysfunctions to immunotherapy. Nat Rev Immunol 2013; 13 (12): 862–74.
  • 34. Carson WF, Cavassani KA, Dou Y, et al. Epigenetic regulation of immune cell functions during post-septic immunosuppression. Epigenetics 2011; 6 (3): 273–83.
  • 35. Yang L, Xie M, Yang M, et al. PKM2 regulates the Warburg effect and promotes HMGB1 release in sepsis. Nat Commun 2014; 5: 4436.
  • 36. Shah FA, Pike F, Alvarez K, et al. Bidirectional relationship between cognitive function and pneumonia. Am J Respir Crit Care Med 2013; 188 (5): 586–92.
  • 37. Taylor SP, Chou S-H, Sierra MF, et al. Association between adherence to recommended care and outcomes for adult survivors of sepsis. Ann Am Thorac Soc 2020; 17 (1): 89–97.
  • 38. Donnelly JP, Hohmann SF, Wang HE. Unplanned readmissions after hospitalization for severe sepsis at academic medical center-affiliated hospitals. Crit Care Med 2015; 43 (9): 1916–27.
  • 39. Galiatsatos P, Follin A, Alghanim F, et al. The association between neighborhood socioeconomic disadvantage and readmissions for patients hospitalized with sepsis. Crit Care Med 2020; 48 (6): 808–14.
  • 40. Cafagna G, Seghieri C. Educational level and 30-day outcomes after hospitalization for acute myocardial infarction in Italy. BMC Health Serv Res 2017; 17 (1): 18.
  • 41. Pandey A, Keshvani N, Khera R, et al. Temporal trends in racial differences in 30-day readmission and mortality rates after acute myocardial infarction among Medicare beneficiaries. JAMA Cardiol 2020; 5 (2): 136–45.
  • 42. Khatana SAM, Wadhera RK, Choi E, et al. Association of homelessness with hospital readmissions – an analysis of three large states. J Gen Intern Med 2020; 35 (9): 2576–83.
  • 43. Sieck CJ, Sheon A, Ancker JS, et al. Digital inclusion as a social determinant of health. NPJ Digit Med 2021; 4: 1–3.
  • 44. Wardi G, Tainter CR, Ramnath VR, et al. Age-related incidence and outcomes of sepsis in California, 2008–2015. J Crit Care 2021; 62: 212–7.

Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press