Abstract
BACKGROUND
Unplanned hospital readmissions constitute a significant cost burden in healthcare. Identifying factors contributing to readmission risk presents opportunities for actionable change to reduce readmission rates.
OBJECTIVE
To combine machine learning classification and feature importance analysis to identify drivers of readmission in a large cohort of spine patients.
METHODS
Cases involving surgical procedures for degenerative spine conditions between 2008 and 2016 were retrospectively reviewed. Of 11 150 cases, 396 patients (3.6%) experienced an unplanned hospital readmission within 30 d of discharge. Over 75 pre-discharge variables were collected and categorized into demographic, perioperative, and resource utilization feature domains. Random forest classification was used to construct predictive models for readmission from feature domains. An ensemble tree-specific method was used to quantify and rank features by relative importance.
RESULTS
In the demographics domain, age and comorbidity burden were the most important features for readmission prediction. Surgical duration and intraoperative oral morphine equivalents were the most important perioperative features, whereas total direct cost and length of stay were most important in the resource utilization domain. In supervised learning experiments for predicting readmission, the demographic domain model performed the best alone, suggesting that demographic features may contribute more to readmission risk than perioperative variables following spine surgery. A predictive model, created using only enriched features showing substantial importance, demonstrated improved predictive capacity compared to previous models, and approached the performance of state-of-the-art, deep-learning models for readmission.
CONCLUSION
This strategy provides insight into global patterns of feature importance and better understanding of drivers of readmissions following spine surgery.
Keywords: Machine learning, Feature importance, Principal components analysis, Outcomes prediction, Classification, Hospital readmission, Spine surgery
ABBREVIATIONS
- ASA = American Society of Anesthesiologists
- AUROC = area under the ROC curve
- EI = Elixhauser Index
- ICU = intensive care unit
- MDI = mean decrease in impurity
- OME = oral morphine equivalents
With hospital costs reaching over $40 billion for patients readmitted to the hospital within 30 d, unplanned readmissions constitute a significant cost burden in healthcare.1 In 2012, to combat the increasing flow of government and hospital funds diverted towards readmission costs, the Centers for Medicare and Medicaid Services (CMS) implemented the Hospital Readmissions Reduction Program (HRRP),2 a value-based reimbursement program that financially penalizes hospitals with excess readmissions for 6 conditions.
Despite some improvement in the national rate of readmissions, there remains a poor understanding of which patient subpopulations are at the greatest risk of readmission. In addition, recent studies suggest the HRRP may have unintended consequences such as higher mortality rates in select populations.3,4 Identifying patients with heightened risks of adverse events after discharge as well as institutional or systemic factors that may contribute to improper patient management or failed discharge transitions may present opportunities for actionable change that can mitigate readmission risk and direct resources to patients with greater need.5
Although not targeted under the HRRP, spine surgeries can be particularly grueling procedures with complex recoveries, requiring careful management of patient progress. Previous work demonstrated that a significant number of unplanned readmissions occur within 30 d following spine surgery (∼4.2%-7.4%)6-8 and that the most common reasons for readmission include infection and refractory pain.9 Thus, unplanned readmissions following spine surgery have significant adverse effects on patient postoperative quality of life and costs.
The accurate prediction of readmission could enable significant improvements in resource allocation, leading to improved patient outcomes and lower costs.10 Despite advances in machine learning techniques, however, readmissions remain notoriously difficult to classify accurately; indeed, previous studies have produced only modestly successful results.11 There have been few attempts at machine learning prediction of readmission following spine surgery, and nearly all have been limited to commercial databases to achieve adequate patient numbers. Although healthcare databases can be powerful tools for studying certain surgical topics, their utility is complicated by their lack of granularity, restriction to coded procedures and diagnoses, and potential heterogeneity because of different sampling strategies.12
This study takes a unique approach combining machine learning classification and an ensemble-specific technique for quantifying feature importance to reduce feature selection bias and identify candidate drivers of readmission in a large cohort of spine patients treated at an urban, academic hospital in the United States.
METHODS
Data Source, Inclusion Criteria, and Patient Stratification
This study was conducted at a large, urban academic medical center in the United States. Institutional records were retrospectively reviewed for all cases involving a surgical procedure for a degenerative spine condition between 2008 and 2016. Cases were excluded if they involved procedures for traumatic injuries, tumors, or infections. This yielded 11 150 total cases, with 396 patients (3.6%) experiencing an unplanned hospital readmission within 30 d of discharge. The Institutional Review Board approved this study and waived informed consent.
Variable Selection and Primary Outcome
This study focused on reviewing patient and institutional data available before discharge, because these data are the most pertinent for developing predictive models that healthcare entities could use to identify patient and institutional factors contributing to readmission risk before discharge. Over 75 different variables were collected to ensure a rich selection for learning (see Text, Supplemental Digital Content for a more detailed assessment of the added importance of feature granularity in this study). Variables were categorized into 1 of 3 feature domains (demographics, perioperative, and resource utilization) for analyses. Demographic data included age, sex, race, American Society of Anesthesiologists (ASA) status classification, comorbidity burden as described by the Elixhauser Index (EI)13 with van Walraven weighting,14 preoperative diagnosis type, admission type, and primary insurance payer type. Perioperative variables were also obtained, including total hospitalization length, procedure class, total oral morphine equivalents (OME) received intraoperatively, number of spine segments operated, total anesthesia time, time in the operating room, surgery duration, and prolonged intubation, defined as extubation after leaving the operating room. Cases were also reviewed for intraoperative metrics, such as estimated blood loss, urine output, and intraoperative quantities received of crystalloid, colloid, packed red blood cells, platelets, fresh frozen plasma, and cell saver. Measures of in-hospital adverse events and resource utilization were obtained, including discharge disposition, total direct costs, and in-hospital complication, mortality, intensive care unit (ICU) stay, or emergency room visit. The primary outcome was 30-d unplanned readmission.
Data Preprocessing
Prior to analysis, data were preprocessed as follows. A label encoder was applied to categorical variables to create binary representations of each category. Certain continuous variables, including length of stay, surgery duration, and total cost, were binned using a KBinsDiscretizer (scikit-learn) set to 2 bins. Missing values were imputed using multivariate imputation, in which each feature with missing values is modeled as a function of the other features and the estimates are refined iteratively. Quantitative features were standardized (centered and scaled to unit variance) and normalized to unit norm.
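The preprocessing steps described above could be sketched with scikit-learn roughly as follows. This is a minimal illustration on synthetic data, not the study's pipeline. The column names and toy values are hypothetical. The use of `OneHotEncoder` for the binary category indicators, `IterativeImputer` for the multivariate imputation, quantile-based bin edges, and imputing before binning (because `KBinsDiscretizer` does not accept missing values) are all assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (needed to import IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import KBinsDiscretizer, Normalizer, OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 200

# Hypothetical toy columns standing in for study variables (not real patient data).
age = rng.normal(56, 12, n)
ome_mg = rng.normal(125, 40, n)
los_days = rng.gamma(2.0, 1.6, n)
surgery_min = 120 + 20 * los_days + rng.normal(0, 30, n)
payer = rng.choice(["Private", "Medicare", "Medicaid", "Other"], n)

# Knock out some values so the imputer has something to estimate.
surgery_min[rng.choice(n, 20, replace=False)] = np.nan

# 1) Categorical variable -> one binary indicator column per category.
payer_bin = OneHotEncoder().fit_transform(payer.reshape(-1, 1)).toarray()

# 2) Multivariate imputation: each incomplete feature is modeled from the others.
numeric = np.column_stack([age, ome_mg, los_days, surgery_min])
numeric = IterativeImputer(random_state=0).fit_transform(numeric)

# 3) Bin length of stay and surgery duration into 2 bins each (coded 0/1).
binner = KBinsDiscretizer(n_bins=2, encode="ordinal", strategy="quantile")
binned = binner.fit_transform(numeric[:, [2, 3]])

# 4) Standardize the remaining quantitative features, then rescale each row to unit L2 norm.
quant = StandardScaler().fit_transform(numeric[:, [0, 1]])
quant = Normalizer(norm="l2").fit_transform(quant)

X = np.hstack([quant, binned, payer_bin])
print(X.shape)
```

In a real analysis the transformers would be fit on the training split only and then applied to the held-out data, for example by wrapping them in a scikit-learn `Pipeline`.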
Statistical Analysis
Two-sided, two-sample t tests compared the means of continuous variables between patients with and without 30-d readmissions. Chi-square tests assessed contingency tables for categorical variables. Fisher's exact test was used for contingency tables containing an expected count less than one under the null hypothesis of independence. A P value less than .05 was used for determining statistical significance. Prism 7 (GraphPad, La Jolla, California, 2017) was used for statistical analyses.
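The three tests can be illustrated with SciPy, which is an assumption here because the study used Prism. The counts below are taken from Tables 1 and 3. The age comparison uses the rounded means and SEMs from Table 1, so it is not expected to reproduce the reported P value, and the equal-variance setting is also an assumption. With SciPy's default continuity correction for 2 × 2 tables, the sex comparison should land close to the tabulated value.

```python
import numpy as np
from scipy import stats

# Sex by readmission status (rows: male, female; columns: readmitted, not readmitted).
sex = np.array([[203, 5639],
                [193, 5115]])
chi2_sex, p_sex, dof_sex, expected_sex = stats.chi2_contingency(sex)

# In-hospital mortality by readmission status (rows: died, survived).
mortality = np.array([[0, 15],
                      [396, 10739]])
_, p_chi2_mort, _, expected_mort = stats.chi2_contingency(mortality)

# Fall back to Fisher's exact test when any expected count is below one.
if expected_mort.min() < 1:
    _, p_mort = stats.fisher_exact(mortality)
else:
    p_mort = p_chi2_mort

# Two-sided, two-sample t test for age from summary statistics.
# SD is recovered as SEM * sqrt(n); the SEMs are rounded, so this is approximate.
n_re, n_no = 396, 10754
t_age, p_age = stats.ttest_ind_from_stats(
    mean1=56.3, std1=0.8 * np.sqrt(n_re), nobs1=n_re,
    mean2=55.3, std2=0.1 * np.sqrt(n_no), nobs2=n_no,
    equal_var=True,  # assumption; the text does not state which variant was used
)

print(f"sex: chi-square P = {p_sex:.4f}")
print(f"mortality: smallest expected count = {expected_mort.min():.2f}, Fisher P = {p_mort:.4f}")
print(f"age: t = {t_age:.2f}, P = {p_age:.4f}")
```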
Supervised Machine Learning and Cross Validation
A random forest classifier algorithm was implemented for predictive modeling. Random forest classification is an ensemble method based on randomized decision trees, in which the ensemble prediction is the averaged prediction of individual classifiers generated by introducing randomness during classifier construction. The combined estimator often outperforms individual estimators because averaging decreases its variance, yielding more predictive and generalizable models. Random forest classifiers were used because they are fast, flexible, and robust in dealing with high-dimensional data and reduce overfitting.15 Prior to model training, 25% of samples were randomly set aside to comprise a test set for final model evaluation. With the remaining 75% of samples, a k-fold cross validation scheme (k = 5) was employed to ensure robust model performance. Receiver operating characteristic (ROC) curves were created, and the area under each ROC curve (AUROC) was calculated to measure each model's predictive power. An empirical bootstrap method (N = 1000 bootstraps)16 was implemented to resample from the AUROC distributions, and permutation tests (N = 1000 shuffles) were used to compare models' AUROC scores statistically by determining whether the mean difference between model distributions differed from zero.
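The evaluation scheme above can be sketched as follows. This is one possible reading of the description, run on synthetic data. The dataset, the number of trees, the stratified splits, the arbitrary five-column "domain", and the exact form of the bootstrap and permutation procedures are assumptions rather than the study's implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic, imbalanced stand-in for the cohort (hypothetical data).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# Hold out 25% for final evaluation; stratifying the split is an assumption.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

def fit_and_score(cols):
    """Run 5-fold CV on the training split, then predict on the held-out split."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    cv_auc = cross_val_score(clf, X_tr[:, cols], y_tr, cv=cv, scoring="roc_auc")
    clf.fit(X_tr[:, cols], y_tr)
    return cv_auc, clf.predict_proba(X_te[:, cols])[:, 1]

def bootstrap_auc(y_true, proba, n_boot=1000, seed=0):
    """Empirical bootstrap of the test-set AUROC (resampling cases with replacement)."""
    rng = np.random.default_rng(seed)
    scores = []
    while len(scores) < n_boot:
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].min() != y_true[idx].max():  # AUROC needs both classes
            scores.append(roc_auc_score(y_true[idx], proba[idx]))
    return np.array(scores)

def permutation_p(a, b, n_perm=1000, seed=0):
    """Two-sided permutation test on the difference in means of two score samples."""
    rng = np.random.default_rng(seed)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        hits += int(abs(pooled[:len(a)].mean() - pooled[len(a):].mean()) >= observed)
    return (hits + 1) / (n_perm + 1)

all_cols = np.arange(X.shape[1])
domain_cols = np.arange(5)  # hypothetical "feature domain": an arbitrary subset of columns

cv_all, proba_all = fit_and_score(all_cols)
cv_dom, proba_dom = fit_and_score(domain_cols)

boot_all = bootstrap_auc(y_te, proba_all)
boot_dom = bootstrap_auc(y_te, proba_dom)
p_value = permutation_p(boot_all, boot_dom)

print(f"CV AUROC (all features): {cv_all.mean():.2f}")
print(f"Test AUROC (all features): {roc_auc_score(y_te, proba_all):.2f}")
print(f"Test AUROC (domain only): {roc_auc_score(y_te, proba_dom):.2f}")
print(f"Permutation P for the difference: {p_value:.4f}")
```

Because bootstrap distributions of 1000 values each are tightly concentrated around their means, a permutation test on them is very sensitive to small AUROC differences, so the resulting P values are best read alongside the size of the difference.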
Feature Importance Analysis
An ensemble tree-specific method was applied to analyze and rank the relative importance of each model's features. This method is based on the principle that the depth of a feature being used as a node in a decision tree is related to that feature's importance in the overall target prediction. Features comprising decision nodes towards the top of the tree guide target prediction for a greater proportion of samples and are therefore relatively more important. The mean decrease in impurity (MDI) is a widely used estimate of relative feature importance that considers the number of samples the feature splits and the number of times a feature splits a node. This measure provides a standardized estimate of a feature's predictive power and reduces the variance due to averaging over all the trees of the ensemble. Each feature's importance is given as a percentage relative to the importance of the highest ranked feature for that domain. To obtain a better contextual understanding of the “absolute importance” of the highest ranked feature in each domain, a readmission prediction model was constructed for each domain using only the highest ranked feature in that domain and the AUROC was reported. Python 3.7 (Python Software Foundation) was used for feature importance analyses.
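As a rough sketch of this ranking step, scikit-learn's `RandomForestClassifier` exposes impurity-based (MDI) importances through its `feature_importances_` attribute. The example below rescales them to percentages of the top-ranked feature and then fits a model on that single feature to report its AUROC. The synthetic data, feature names, and hyperparameters are assumptions and do not reproduce the study's models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one feature domain (hypothetical data and names).
X, y = make_classification(n_samples=1500, n_features=8, n_informative=3,
                           n_redundant=0, weights=[0.9, 0.1], random_state=1)
names = [f"feature_{i}" for i in range(X.shape[1])]
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1)

forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)

# Impurity-based importances, averaged over trees and normalized to sum to 1.
mdi = forest.feature_importances_

# Express each importance as a percentage of the highest ranked feature.
relative_pct = 100.0 * (mdi / mdi.max())
ranking = sorted(zip(names, relative_pct), key=lambda kv: kv[1], reverse=True)

# "Absolute importance" check: a model built from the top feature alone.
top = int(np.argmax(mdi))
single = RandomForestClassifier(n_estimators=200, random_state=1)
single.fit(X_tr[:, [top]], y_tr)
top_auc = roc_auc_score(y_te, single.predict_proba(X_te[:, [top]])[:, 1])

for name, pct in ranking:
    print(f"{name}: {pct:.1f}%")
print(f"AUROC using only {names[top]}: {top_auc:.2f}")
```

A feature can rank first by MDI and still discriminate poorly on its own, which is why the single-feature AUROC is a useful companion to the relative ranking.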
RESULTS
Study Population Characteristics
Of the 11 150 reviewed cases, 396 (3.6%) involved a 30-d unplanned readmission. Readmitted patients were more likely to be nonwhite, to be insured by Medicare or Medicaid, to have a higher EI comorbidity burden and ASA class, and to be admitted nonelectively (Table 1). Readmitted patients also had slightly different distributions for their preoperative diagnoses and procedure types compared to non-readmitted patients. The mean age and sex distributions of both cohorts were similar.
TABLE 1.
Demographics of the Study Population by Hospital Readmission Status
| | Hospital Readmission (N = 396) | No Readmission (N = 10 754) | P value |
|---|---|---|---|
| Age (Mean ± SEM) | 56.3 ± 0.8 | 55.3 ± 0.1 | .1627 |
| Sex (%) | | | .6832 |
| Male | 203 (51.3) | 5639 (52.4) | |
| Female | 193 (48.7) | 5115 (47.6) | |
| Race (%) | | | <.0001 |
| White | 217 (54.8) | 6472 (60.2) | |
| Black | 70 (17.7) | 1029 (9.6) | |
| Asian | 20 (5.0) | 711 (6.6) | |
| Other | 89 (22.5) | 2542 (23.6) | |
| ASA Status (%) | | | <.0001 |
| I | 17 (4.3) | 917 (8.5) | |
| II | 153 (38.6) | 6128 (57.0) | |
| III | 200 (50.5) | 3491 (32.5) | |
| IV | 26 (6.6) | 218 (2.0) | |
| Elixhauser Comorbidity Index Score (%) | | | <.0001 |
| <0 | 33 (8.3) | 985 (9.2) | |
| 0 | 145 (36.6) | 5871 (54.6) | |
| 1-4 | 55 (13.9) | 1282 (11.9) | |
| >4 | 163 (41.2) | 2616 (24.3) | |
| Preoperative Diagnosis (%) | | | <.0001 |
| Herniation | 54 (13.6) | 1926 (17.9) | |
| Radiculopathy | 16 (4.0) | 589 (5.5) | |
| Myelopathy | 11 (2.8) | 450 (4.2) | |
| Spondylosis | 36 (9.1) | 1170 (10.9) | |
| Spondylolisthesis | 12 (3.0) | 403 (3.7) | |
| Stenosis | 109 (27.5) | 3612 (33.6) | |
| Unspecified Back Pain | 36 (9.1) | 975 (9.1) | |
| Other | 122 (30.8) | 1629 (15.1) | |
| Procedure Type (%) | | | <.0001 |
| Cervical Fusion | 69 (17.4) | 3194 (29.7) | |
| Thoracolumbar Fusion | 103 (26.0) | 3281 (30.5) | |
| Non-Fusion Procedure | 79 (19.9) | 2490 (23.2) | |
| Other | 145 (36.6) | 1789 (16.6) | |
| Admission Type (%) | | | <.0001 |
| Elective | 309 (78.0) | 9812 (91.2) | |
| Emergent | 38 (9.6) | 614 (5.7) | |
| Urgent | 10 (2.5) | 162 (1.5) | |
| Other | 39 (9.9) | 166 (1.6) | |
| Primary Insurance Type (%) | | | <.0001 |
| Private | 169 (42.7) | 5416 (50.4) | |
| Medicare | 139 (35.1) | 3088 (28.7) | |
| Medicaid | 62 (15.7) | 955 (8.9) | |
| Other | 26 (6.6) | 1295 (12.0) | |
APRDRG = all patients refined diagnosis related groups; ASA = American Society of Anesthesiologists physical status classification system; BMI = body mass index; SEM = standard error of the mean. P < .05 was used as a threshold for statistical significance.
The cohorts shared similar perioperative characteristics, including average number of spine segments operated, and amounts of colloid, packed red blood cells, platelets, fresh frozen plasma, and cell saver received intraoperatively (Table 2). Both cohorts experienced similar estimated volumes of intraoperative blood loss; however, readmitted patients experienced longer times in the operating room, under anesthesia, and in surgery. They also received greater quantities of total crystalloid and OME intraoperatively, and a greater proportion also experienced a prolonged intubation during their procedure.
TABLE 2.
Perioperative Characteristics of the Study Population by Hospital Readmission Status
| | Hospital Readmission (N = 396) | No Readmission (N = 10 754) | P value |
|---|---|---|---|
| | (Mean ± SEM) | (Mean ± SEM) | |
| Oral Morphine Equivalents (mg) | 129.6 ± 4.3 | 122.1 ± 0.8 | .0687 |
| Spine Segments Operated | 2.4 ± 0.1 | 2.4 ± 0.0 | .9461 |
| Anesthesia Length (minutes) | 311.9 ± 6.0 | 285.0 ± 1.1 | <.0001 |
| Time in Operating Room (minutes) | 295.5 ± 5.9 | 269.6 ± 1.1 | <.0001 |
| Length of Surgery (minutes) | 202.0 ± 5.1 | 184.8 ± 0.9 | .0005 |
| Received Crystalloid (%) | | | .9128 |
| Yes | 376 (94.9) | 10210 (94.9) | |
| No | 20 (5.1) | 544 (5.1) | |
| Received Colloid (%) | | | .9653 |
| Yes | 7 (1.8) | 173 (1.6) | |
| No | 389 (98.2) | 10581 (98.4) | |
| Received Packed Red Blood Cells (%) | | | .0179 |
| Yes | 38 (9.6) | 695 (6.5) | |
| No | 358 (90.4) | 10059 (93.5) | |
| Received Platelets (%) | | | .5259 |
| Yes | 6 (1.5) | 113 (1.1) | |
| No | 390 (98.5) | 10641 (98.9) | |
| Received Fresh Frozen Plasma (%) | | | .7932 |
| Yes | 3 (0.8) | 110 (1.0) | |
| No | 393 (99.2) | 10644 (99.0) | |
| Received Cell Saver (%) | | | .2132 |
| Yes | 46 (11.6) | 1500 (13.9) | |
| No | 350 (88.4) | 9254 (86.1) | |
| Total Urine Output (mL) | 540.6 ± 29.5 | 470.9 ± 5.6 | .0193 |
| Estimated Blood Loss (mL) | 261.7 ± 20.7 | 234.5 ± 4.3 | .2361 |
| Prolonged Intubation | | | .0083 |
| Yes | 27 (6.8) | 431 (4.0) | |
| No | 369 (93.2) | 10323 (96.0) | |
| Method of Anesthesia | | | .0008 |
| General | 387 (97.7) | 10681 (99.3) | |
| Monitored Anesthesia Care | 9 (2.3) | 73 (0.7) | |
IQR = interquartile range; SEM = standard error of the mean. P < .05 was used as a threshold for statistical significance.
Readmitted patients had longer initial hospitalizations than non-readmitted patients (5.7 vs 3.2 d), were more likely to have a non-home discharge, and incurred higher total direct costs for their visit (Table 3). Both cohorts, however, shared similar rates of in-hospital complications and ICU stays.
TABLE 3.
Resource Utilization and Complications of the Study Population by Hospital Readmission Status
| | Hospital Readmission (N = 396) | No Readmission (N = 10 754) | P value |
|---|---|---|---|
| | (Mean ± SEM) | (Mean ± SEM) | |
| Length of Hospital Stay (Days) | 5.7 ± 0.4 | 3.2 ± 0.1 | <.0001 |
| ICU Stay | | | .7592 |
| Yes | 29 (7.3) | 847 (7.9) | |
| No | 367 (92.7) | 9907 (92.1) | |
| Discharge Disposition (%) | | | <.0001 |
| Home | 289 (73.0) | 9040 (84.1) | |
| Rehabilitation Center | 56 (14.1) | 1086 (10.1) | |
| Skilled Nursing Facility | 21 (5.3) | 285 (2.7) | |
| Care Facility | 23 (5.8) | 294 (2.7) | |
| Hospice or Death | 2 (0.5) | 15 (0.1) | |
| Other | 5 (1.3) | 34 (0.3) | |
| Non-Home Discharge | | | <.0001 |
| Yes | 100 (25.3) | 1664 (15.5) | |
| No | 296 (74.7) | 9090 (84.5) | |
| 30-Day Emergency Room Admission | | | <.0001 |
| Yes | 53 (13.4) | 242 (2.3) | |
| No | 343 (86.6) | 10512 (97.7) | |
| In-Hospital Complication | | | .6310 |
| Yes | 14 (3.5) | 321 (3.0) | |
| No | 382 (96.5) | 10433 (97.0) | |
| In-Hospital Mortality | | | .9636 |
| Yes | 0 (0) | 15 (0.1) | |
| No | 396 (100) | 10739 (99.9) | |
| Total Direct Costs ($) | 34633 ± 1655 | 30784 ± 257 | .0052 |
IQR = interquartile range; SEM = standard error of the mean. P < .05 was used as a threshold for statistical significance.
Analysis of Relative Importance by Feature Domain
When models were constructed using only demographic features, MDI revealed age as the most important factor for prediction in the greatest proportion of samples (Figure 1). A model using age alone achieved an AUROC of 0.48 for predicting readmission. The next highest ranked features included EI comorbidity burden, ASA class, and sex. Features pertaining to race, insurance type, admission type, admission source, and preoperative diagnosis classification type were considered relatively unimportant for prediction.
FIGURE 1.
Relative importance of features categorized in the demographics domain for predicting unplanned 30-d hospital readmissions. Feature importance was determined using the MDI technique and is given as a percentage relative to the importance of the highest ranked feature for that category.
In models utilizing only perioperative variables, surgery duration (AUROC = 0.55) was the most important feature (Figure 2). This was followed by intraoperative OME administered, urine output, intraoperative crystalloid and propofol, and estimated blood loss. These features were estimated to have around 60% importance relative to surgical duration.
FIGURE 2.
Relative importance of features categorized in the perioperative domain for predicting unplanned 30-d hospital readmissions. Feature importance was determined using the MDI technique and is given as a percentage relative to the importance of the highest ranked feature for that category.
In the final ensemble using hospital resource utilization metrics, total direct hospitalization cost (AUROC = 0.49) was the most important feature (Figure 3). Hospitalization length was ranked as the next most important feature, but only carried around 15% of the importance relative to total direct cost. Interestingly, non-home discharge and ICU stay during hospitalization lacked importance in this ensemble.
FIGURE 3.
Relative importance of features categorized in the resource utilization domain for predicting unplanned 30-d hospital readmissions. Feature importance was determined using the MDI technique and is given as a percentage relative to the importance of the highest ranked feature for that category.
Supervised Machine Learning
Supervised learning experiments were performed using the domain-specific ensemble models to better understand which areas surrounding patient care best predicted readmission. Ensemble models utilizing only individual domains of features performed modestly (Figure 4A), with demographic domain models performing best (AUROC = 0.63), compared to those using perioperative (AUROC = 0.60) or resource utilization features (AUROC = 0.59). A significant boost in predictive power was observed when all features were included (AUROC = 0.67) compared to using demographic (P = .005), perioperative (P = .002), or resource utilization (P = .002) features alone (Figure 4B).
FIGURE 4.
ROC curves for model ensembles predicting unplanned hospital readmissions constructed from A, individual feature domains and B, all features in the dataset. Areas under the ROC curves (AUROCs) are provided in the lower right side. Dashed line denotes random chance with an AUROC = 0.50.
Insights from the feature importance analyses were then applied to feature selection to build a model with enhanced predictive capacity using only the most important, enriched features from each ensemble domain. A threshold of 15% relative importance was set for a feature to be included in this enriched model. Using this guide, 15 features were selected across the three domains and all contributed to the final model's predictive power (Figure 5A). The majority of features in the enriched set did not show significant associations with each other (Figure 5B), and the predictive performance of the enriched ensemble (AUROC = 0.72) was significantly better than the combined ensemble using all dataset features (P = .023; Figure 5C).
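The threshold-based selection can be illustrated with a short sketch. The importance values and feature names below are invented for illustration and are not the study's results. Only the 15% threshold comes from the text, and treating it as inclusive (greater than or equal to) is an assumption.

```python
# Hypothetical relative importances (percent of the top feature in each domain).
# These numbers are made up for illustration; they are not the study's values.
domain_importance = {
    "demographics": {"age": 100.0, "elixhauser_index": 62.0, "asa_class": 41.0, "race": 9.0},
    "perioperative": {"surgery_duration": 100.0, "intraop_ome": 61.0, "procedure_type": 8.0},
    "resource_utilization": {"total_direct_cost": 100.0, "length_of_stay": 15.0, "icu_stay": 3.0},
}

THRESHOLD_PCT = 15.0  # relative-importance threshold stated in the text

def select_enriched(importance_by_domain, threshold_pct):
    """Keep, per domain, the features at or above the threshold, highest first."""
    selected = {}
    for domain, scores in importance_by_domain.items():
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        selected[domain] = [name for name, pct in ranked if pct >= threshold_pct]
    return selected

enriched = select_enriched(domain_importance, THRESHOLD_PCT)

# Flatten into the single feature list used to train the enriched model.
enriched_features = [name for names in enriched.values() for name in names]
print(enriched_features)
```

The enriched model would then be trained on only these columns, using the same held-out split and cross validation as the other models so that the AUROC comparison stays like-for-like.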
FIGURE 5.
A, Relative importance of features categorized in the enriched feature set containing the top 15 highest ranking features across all domains. Feature importance was determined using the MDI technique and is given as a percentage relative to the importance of the highest ranked feature. B, Heatmap of Pearson correlation coefficients allows visualization of any colinear relationships between variables in the enriched feature set. C, ROC curves for model ensembles predicting readmissions using the enriched feature set. Areas under the ROC curves (AUROCs) are provided in the lower right side. Dashed line denotes random chance with an AUROC = 0.50.
DISCUSSION
Given the tremendous cost and national attention associated with unplanned hospital readmissions in the United States, health systems have been searching for pathways to curb readmissions. Despite these efforts, there is still an incomplete understanding of which patient care factors may be active drivers of readmission. Additionally, factors specific to the patient population or to the institution may further complicate the picture and could contribute to the incidence of unplanned readmissions. Taken together, these issues make predicting readmissions difficult.11,17
Previous studies have examined 30-d readmissions with various techniques across a range of patient populations. Perhaps the most notable effort involved a team from Google that took a large-scale deep learning-based approach using electronic health records from over 200 000 patients to produce a dataset with 45 billion data points.17 Although accurate predictions were made for several clinical outcomes, readmissions were among the most difficult to predict (AUROC = 0.75). Additional efforts that applied conventional machine learning techniques to nationwide databases to predict 30-d readmission for HRRP-targeted conditions resulted in AUROC scores between 0.69 and 0.70.18 Other studies have examined individual predictors of readmissions, including race, length of stay, ASA class, and insurance payer type, using traditional techniques.8,19,20 Since the introduction of the HRRP, a few studies have even started reporting comparisons of their readmission rates for various procedure groups in the periods before and after implementation of the HRRP. One study found that implementation of the HRRP was associated with decreased readmissions for targeted procedures, but not similar untargeted procedures, including those involving the spine.21
Relatively few studies have used machine learning to predict 30-d readmissions in spine surgery. Despite previous efforts to predict readmission risk,22 there is still a poor understanding of which factors are drivers of readmission in these patients. This study sought to address this by applying machine learning with feature domains to a large, single-center spine patient population.
Feature Importance by Domain
The most important feature in the demographics domain was age, followed more distantly by EI comorbidity burden and ASA class. This was interesting, particularly given how similar the patient cohorts were with respect to age (Table 1). Several studies have suggested that age may serve as a nonmodifiable risk factor for readmission in various populations.23-25 Reasons for this might be that age can produce delays in recovery, wound healing, and pain reduction. Comorbidity burden, health status, and race also have significant associations with unplanned readmissions.19,26,27 Interestingly, admission type and preoperative diagnosis were ranked as relatively unimportant in predicting readmission in this domain. This was somewhat surprising given the large difference in symptom severity that can sometimes exist between conditions such as myelopathy, radiculopathy, stenosis, and other spinal pathologies.
The analysis revealed that surgical duration was the most important feature in the perioperative feature domain, whereas procedure type carried a low relative importance ranking. Interestingly, total intraoperative OME was also considered relatively important in predicting readmission. Intraoperative OMEs showed only a weak positive association with surgical duration, suggesting that their importance is not simply because of an underlying relationship between OME and surgery length (Figure 5B). It is possible that patients requiring higher OMEs intraoperatively may have a history of chronic opioid use and may also require greater doses postoperatively and postdischarge.28-30 This may result in a more complicated recovery process, particularly with respect to pain management.29,30 Other studies also demonstrated that patients with increased opioid use carry increased risk of poor postoperative outcomes and wound complications,31-33 both of which could prompt readmission.34 Previous retrospective studies suggested that infection and pain were among the most common reasons for readmission following spine surgery.9 The link between opioid use and hospital readmission should therefore be closely investigated: it may be a modifiable parameter in an enhanced recovery pathway, and enhanced recovery after surgery protocols may decrease readmission in various spine surgeries.35
In the resource utilization feature domain, total direct cost and hospitalization length were the most important features for readmission prediction. Although several studies suggested a strong link between length of stay and readmission, they are mixed regarding the direction of the relationship. Some studies found a positive relationship between prolonged length of stay and readmission, suggesting that perhaps both measures are indicative of a patient's overall health status and that sicker patients tend to have prolonged hospitalizations and greater readmission risk.36,37 Other studies, however, report a negative relationship, suggesting that increasing length of stay may be increasing the quality and extent of care received, thereby reducing risk of future readmission.38 Both conclusions may be true in certain contexts, as length of stay can be a delicate balance of ensuring that care is not discontinued prematurely and decreasing risk of hospital-acquired conditions and other hazards of hospitalization. Given the correlation between length of stay and hospitalization costs (Figure 5B), similar forces may be driving the observed result with total direct costs. This may imply a trade-off between higher costs in the present during initial hospitalization or later during readmission.39
Supervised Learning With Feature Domains
In utilizing the enriched feature set for classifying readmission, the predictive capacity of this model (AUROC = 0.72) was in line with previous studies attempting to predict readmission.40-43 To highlight the difficulty of predicting readmission, as discussed previously, state-of-the-art, advanced, deep-learning models trained on large patient cohorts have achieved similar results (AUROC = 0.75),17 and a systematic review found only two studies with an AUROC above 0.8.40 National studies of 30-d readmission have fared more poorly, with one NSQIP study reporting AUROCs of 0.63 to 0.66,43 perhaps because of the heterogeneity of a national population. Regardless, the problem of predicting readmission remains challenging and existing “black-box” models do not indicate which factors may be addressable to reduce readmission rates. Although this study attempts to provide some answers using a machine learning approach, further research is needed to better understand drivers of readmission. In addition, surgery-specific machine learning models that integrate procedure-specific complications may provide additional granularity and accuracy in identifying patients at-risk for readmission after various surgical procedures.44
Strengths and Limitations
This study is notable for several reasons. First, it utilizes a rich feature set from a single institution that contains less heterogeneity compared to existing national database studies (see Text, Supplemental Digital Content). Second, in contrast to previous studies that examined the role a specific variable may play in readmission, this study takes a machine learning-based approach to identify patient and institutional factors driving readmission, which avoids some of the bias introduced by traditional techniques that allow expectations to influence study results. Finally, by applying rigorous internal validation and algorithm selection criteria, this study ensures that the most predictive models possible were constructed with the available tools.
It is also important to acknowledge this study's limitations. First, despite use of cross validation to assess generalizability on previously unseen data, external validation with outside data or prospectively acquired data would further validate these findings. Second, the study's retrospective design makes it difficult to evaluate temporal relationships between variables. In addition, despite containing less heterogeneity relative to national databases, this study utilizes a relatively heterogeneous population of spine patients, with many pathologies represented (and combinations therein). As such, a certain degree of granularity is lost in terms of the insights and conclusions that can be drawn about any particular spinal patient population. Finally, despite the advantages of ensemble-based methods for quantifying feature importance, they do not use the ultimate performance measure of interest (AUROC) for ranking feature importance, but instead use the structure of the decision trees in the ensemble. As such, one should exercise good judgment in determining whether the output makes sense given the clinical context.
CONCLUSION
Unplanned readmissions adversely impact patient experience and outcomes and present a significant cost burden for hospitals. This study took a unique approach to readmission prediction by applying feature importance analysis and supervised machine learning to a large, single-institution dataset. This strategy helped identify potential factors that may contribute to readmission following spine surgery and produce a model with improved predictive capacity. The framework presented here may serve as a roadmap for other institutions to enhance clinical outcomes prediction with machine learning and identify candidate drivers of adverse outcomes at the patient and institution levels.
Disclosures
The authors have no personal, financial, or institutional interest in any of the drugs, materials, or devices described in this article.
Supplementary Material
Contributor Information
Michael L Martini, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York.
Sean N Neifert, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York.
Eric K Oermann, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York.
Jonathan Gal, Department of Anesthesiology, Perioperative, and Pain Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.
Kanaka Rajan, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York.
Dominic A Nistal, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York.
John M Caridi, Email: john.caridi@mountsinai.org, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York.
Supplemental Digital Content. Text. Comparison of standard and non-standard features in machine learning-based prediction of 30-d readmission after spine surgery. The Supplemental Digital Content expands on the results and discussion of the importance of feature granularity by comparing standard vs nonstandard feature sets in readmission prediction described in the text. Figure S1, ROC curves.