JAMA Surg. 2019 Aug 14;154(11):1014–1021. doi: 10.1001/jamasurg.2019.2979

Novel Machine Learning Approach to Identify Preoperative Risk Factors Associated With Super-Utilization of Medicare Expenditure Following Surgery

J Madison Hyer 1, Aslam Ejaz 1, Diamantis I Tsilimigras 1, Anghela Z Paredes 1, Rittal Mehta 1, Timothy M Pawlik 1,2
PMCID: PMC6694398  PMID: 31411664

Key Points

Question

What preoperative risk factors are associated with super-utilization of health care resources after surgery?

Findings

In this cohort study of 1 049 160 patients, super-utilizers of health care comprised 4.8% of the overall cohort yet incurred 31.7% of the expenditures. A machine learning approach identified history of hemiplegia, paraplegia, weight loss, and congestive heart failure with chronic kidney disease stages I to IV as the most significant risk factors associated with super-utilization following surgery.

Meaning

By proactively identifying patients who may be at risk for super-utilization of health care following surgery, targeted efforts may decrease the cost burden on the health care system while improving quality of care and outcomes for those patients.

Abstract

Importance

Typically defined as the top 5% of health care users, super-utilizers are responsible for an estimated 40% to 55% of all health care costs. Little is known about which factors may be associated with increased risk of long-term postoperative super-utilization.

Objective

To identify clusters of patients with distinct constellations of clinical and comorbid patterns who may be associated with an elevated risk of super-utilization in the year following elective surgery.

Design, Setting, and Participants

This retrospective longitudinal cohort study identified 1 049 160 patients who underwent abdominal aortic aneurysm repair, coronary artery bypass graft, colectomy, total hip arthroplasty, total knee arthroplasty, or lung resection from the 100% Medicare inpatient and outpatient Standard Analytic Files, covering all inpatient facilities that performed 1 or more of the evaluated surgical procedures from 2013 to 2015. Data from 2012 to 2016 were used to evaluate expenditures in the year preceding and following surgery. Using a machine learning approach known as Logic Forest, comorbidities and interactions of comorbidities that placed patients at increased risk of becoming a super-utilizer were identified. All comorbidities, as defined by the Charlson (range, 0-24) and Elixhauser (range, 0-29) comorbidity indices, were used in the analysis. Higher scores indicated higher comorbidity burden. Data analysis was completed on November 16, 2018.

Main Outcome and Measures

Super-utilization of health care in the year following surgery.

Results

In total, 1 049 160 patients met inclusion criteria and were included in the analytic cohort. Their median (interquartile range) age was 73 (69-78) years, and approximately 40% were male. Super-utilizers comprised 4.8% of the overall cohort (n = 79 746) yet incurred 31.7% of the expenditures. Although the difference in overall expenditures per person between super-utilizers ($4049) and low users ($2148) was relatively modest prior to surgery, the difference in expenditures between super-utilizers ($79 698) vs low users ($2977) was marked in the year following surgery. Risk factors associated with super-utilization of health care included hemiplegia/paraplegia (odds ratio, 5.2; 95% CI, 4.4-6.2), weight loss (odds ratio, 3.5; 95% CI, 2.9-4.2), and congestive heart failure with chronic kidney disease stages I to IV (odds ratio, 3.4; 95% CI, 3.0-3.9).

Conclusions and Relevance

Super-utilizers comprised only a small fraction of the surgical population yet were responsible for a disproportionate amount of Medicare expenditure. Certain subpopulations were associated with super-utilization of health care following surgical intervention despite having lower overall use in the preoperative period.


In this cohort study, health care expenditures among patients in the year preceding and following elective surgery are evaluated with a machine learning approach to identify clinical and comorbid patterns that may be associated with an increased risk of health care super-utilization in the year following surgery.

Introduction

Rising health care costs have increased awareness around super-utilizers—a small subset of the total patient population who consume a disproportionate amount of health care resources.1,2,3,4,5,6,7,8,9,10 Typically defined as the top 5% of health care users, super-utilizers are responsible for an estimated 40% to 55% of all health care costs.7,9,10,11 By proactively identifying those patients, a process known as “hot spotting,” targeted efforts may improve quality of care and outcomes as well as decrease the cost burden on the health care system.1,12,13

Previous studies have identified episode-level characteristics associated with increased expenditure at the time of surgery, including patient age, comorbidities at the time of surgery, surgical approach, and perioperative and postoperative complications.1,3,9,14,15,16,17,18,19 Other studies have identified hospital-level characteristics associated with expenditure variability, such as hospital volume, accountable care organization status, hospital network participation, and the use of bundled payments.2,5,15,20,21,22,23,24 These studies, however, have largely considered health care expenditure only during the perioperative period. To our knowledge, no study has performed a longitudinal analysis of long-term expenditures among surgical patients following discharge from the index hospitalization. As such, patient and hospital characteristics associated with long-term super-utilization following surgery remain largely unknown.

One promising methodologic approach that may be useful for identifying long-term super-utilization among surgical patients is machine learning. Machine learning is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention.25,26 Specifically, machine learning is a method of data analysis that automates analytic model building. As such, machine learning is increasingly being used in health care to identify trends for improving health care delivery as well as to develop prognostic and treatment predictive models.25,27,28,29 Unlike statistical models that require that data meet certain strong underlying assumptions, machine learning probes data to identify an underlying structure using an iterative approach to learn from the data. A specific machine learning technique, called Logic Forest, iteratively investigates the main data effects and the sample space of all interactions without any specific a priori assumptions.30,31 As such, this method represents an optimal approach to identify potentially associated predictors and combinations of associated predictors without any user input bias.

To our knowledge, machine learning has not been applied to investigate super-utilization among surgical patients. Thus, the objective of the current study was to identify and characterize postoperative Medicare super-utilizers with a machine learning–based approach. Specifically, we sought to identify clusters of patients with distinct constellations of clinical and comorbid patterns who were associated with an elevated risk of super-utilization following elective surgery.

Methods

Data Source and Sample Selection

Data were derived from the 100% version of the Medicare inpatient and outpatient Standard Analytic Files between 2012 and 2016. The Standard Analytic Files are maintained by the Centers for Medicare & Medicaid Services and include patient-level data on demographic characteristics, diagnoses, procedures, and expenditures. The study was deemed exempt from review by the institutional review board at The Ohio State University, which also waived the need to obtain informed patient consent because the data were acquired from an existing Medicare database.

Patients aged 65 or older who underwent any of the following operations between 2013 and 2015 were included in the analytic cohort: abdominal aortic aneurysm (AAA) repair, coronary artery bypass graft (CABG), colectomy, total hip arthroplasty (THA), total knee arthroplasty (TKA), or lung resection. The following International Classification of Diseases, Ninth Revision, codes were used to identify patients: AAA repair (38.34, 38.44, 38.64, 39.25, and 39.71-39.78), CABG (36.10-36.17, 36.19), colectomy (17.31-17.36, 17.39, 45.71-45.76, 45.79-45.83), THA (81.51), TKA (81.54), and lung resection (32.9, 32.21, 32.22, 32.29, 32.20, 32.30, 32.41, 32.50, and 32.59). For patients having had more than 1 of these surgical procedures, only the first procedure was evaluated. Patients were excluded when they (1) were not enrolled in Medicare parts A and B in the month of the surgical episode, (2) received additional payments from a health maintenance organization (HMO), (3) died during the surgical visit, or (4) had no recorded use within a year following surgery.

To identify super-utilizers and low users, an established technique was used to categorize patients into 4 expenditure groups using a bisecting k-means clustering method with bin sorting by median to compute the cluster seeds.32 This technique assumed a priori that there would be 4 groups and was performed in 2 stages. The first stage grouped high and low observations, and the second stage then regrouped the high observations into their own relative high and low groupings, followed by regrouping the low observations into their own relative high and low groupings. Medicare payments were price standardized and adjusted by wage index, Disproportionate Share Hospital, and Indirect Medical Education.22 The groups with the highest and lowest median annual postoperative Medicare expenditure were defined as super-utilizers and low users, respectively. Propensity score matching was used to identify subsets of the super-utilizers and low users with minimal differences on baseline, perioperative, and postoperative characteristics, including patient demographics, type of surgery, comorbidity indices, index hospital length of stay, as well as the incidence of 90-day morbidity, readmission, and mortality. Propensity score matching was performed using variable ratio matching with a maximum match ratio of 3:1 (low users to super-utilizers) and greedy nearest neighbor matching strategy sorted in descending order.33,34 The quality of the match was evaluated by standardized mean differences between groups, with postmatch differences less than 0.10 defined as a good match.33,34 Variables included in the propensity score matching are indicated in Table 1 and Table 2.
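The two-stage grouping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain one-dimensional k-means with k = 2, it reads "bin sorting by median" as seeding each cluster with the median of the lower and upper half of the sorted values, and the function names `split_high_low` and `four_expenditure_groups` are hypothetical.

```python
from statistics import mean, median


def split_high_low(values, max_iter=100):
    """1-D k-means with k=2. Returns a list of booleans, True = high cluster.

    Seeds are the medians of the lower and upper halves of the sorted
    values (an assumed reading of "bin sorting by median").
    """
    if len(values) < 2:
        return [False] * len(values)
    ordered = sorted(values)
    half = len(ordered) // 2
    lo, hi = median(ordered[:half]), median(ordered[half:])
    is_high = [False] * len(values)
    for _ in range(max_iter):
        # Assign each value to the nearer of the two centroids.
        is_high = [abs(v - hi) < abs(v - lo) for v in values]
        if all(is_high) or not any(is_high):
            break  # degenerate split; nothing more to refine
        new_lo = mean(v for v, h in zip(values, is_high) if not h)
        new_hi = mean(v for v, h in zip(values, is_high) if h)
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return is_high


def four_expenditure_groups(spend):
    """Two-stage split. Returns labels 0 (lowest group) to 3 (highest group)."""
    stage1 = split_high_low(spend)          # stage 1: high vs low
    labels = [2 if h else 0 for h in stage1]
    for want_high in (True, False):         # stage 2: re-split each half
        idx = [i for i, h in enumerate(stage1) if h == want_high]
        stage2 = split_high_low([spend[i] for i in idx])
        for i, h in zip(idx, stage2):
            labels[i] += int(h)
    return labels
```

In this sketch, label 3 would play the role of the super-utilizer group and label 0 the low-user group. The break points come from the data rather than from fixed quartiles, which is the property the Methods rely on.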

Table 1. Patient Demographics and Preoperative Characteristics for Low Users and Super-Utilizers of Medicare in the Crude Population and the Propensity-Matched Population.

Values are No. (%) of patients unless otherwise indicated.

| Variable | Crude: Total (n = 626 754) | Crude: Low User (n = 547 008) | Crude: Super-Utilizer (n = 79 746) | Crude: Effect Size | Matched: Total (n = 52 986) | Matched: Low User (n = 35 497) | Matched: Super-Utilizer (n = 17 489) | Matched: Effect Size |
|---|---|---|---|---|---|---|---|---|
| Age, median (IQR), y [a] | 73 (69-78) | 73 (68-78) | 74 (70-79) | −0.009 | 74 (69-79) | 74 (69-79) | 74 (69-79) | −0.002 |
| Age category, y | | | | | | | | |
| 65-69 | 172 189 (27.5) | 154 233 (28.2) | 17 956 (22.5) | 0.059 | 12 735 (24.0) | 8799 (24.8) | 3936 (22.5) | 0.027 |
| 70-74 | 181 782 (29.0) | 160 162 (29.3) | 21 620 (27.1) | | 14 358 (27.1) | 9493 (26.7) | 4865 (27.8) | |
| 75-79 | 143 416 (22.9) | 123 613 (22.6) | 19 803 (24.8) | | 12 192 (23.0) | 8014 (22.6) | 4178 (23.9) | |
| 80-84 | 87 383 (13.9) | 73 876 (13.5) | 13 507 (16.9) | | 8331 (15.7) | 5578 (15.7) | 2753 (15.7) | |
| >84 | 41 981 (6.7) | 35 121 (6.4) | 6860 (8.6) | | 5369 (10.1) | 3612 (10.2) | 1757 (10.0) | |
| Male sex [a] | 249 326 (39.8) | 203 581 (37.2) | 45 745 (57.4) | 0.137 | 25 203 (47.6) | 16 758 (47.2) | 8445 (48.3) | 0.010 |
| Race/ethnicity [a] | | | | | | | | |
| White | 582 559 (92.9) | 511 032 (93.4) | 71 527 (89.7) | 0.055 | 48 727 (92.0) | 32 663 (92.0) | 16 064 (91.9) | 0.003 |
| African American | 27 263 (4.3) | 21 549 (3.9) | 5714 (7.2) | | 2926 (5.5) | 1944 (5.5) | 982 (5.6) | |
| Hispanic | 1599 (0.3) | 1287 (0.2) | 312 (0.4) | | 163 (0.3) | 110 (0.3) | 53 (0.3) | |
| Other/unknown | 15 333 (2.4) | 13 140 (2.4) | 2193 (2.7) | | 1170 (2.2) | 780 (2.2) | 390 (2.2) | |
| Procedure [a] | | | | | | | | |
| Colectomy | 73 427 (11.7) | 52 404 (9.6) | 21 023 (26.4) | 0.660 | 12 903 (24.4) | 8375 (23.6) | 4528 (25.9) | 0.048 |
| AAA | 14 173 (2.3) | 7549 (1.4) | 6624 (8.3) | | 2733 (5.2) | 1850 (5.2) | 883 (5.0) | |
| CABG | 33 498 (5.3) | 1983 (0.4) | 31 515 (39.5) | | 2075 (3.9) | 1217 (3.4) | 858 (4.9) | |
| Total hip arthroplasty | 162 005 (25.8) | 155 270 (28.4) | 6735 (8.4) | | 11 727 (22.1) | 7874 (22.2) | 3853 (22.0) | |
| Total knee arthroplasty | 315 697 (50.4) | 307 584 (56.2) | 8113 (10.2) | | 17 417 (32.9) | 11 979 (33.7) | 5438 (31.1) | |
| Lung resection | 27 954 (4.5) | 22 218 (4.1) | 5736 (7.2) | | 6131 (11.6) | 4202 (11.8) | 1929 (11.0) | |
| Comorbidity index, median (IQR) [a] | | | | | | | | |
| Charlson | 1 (0-2) | 0 (0-2) | 4 (3-6) | −0.052 | 3 (1-5) | 2 (1-5) | 3 (2-5) | −0.007 |
| Elixhauser | 2 (1-4) | 2 (1-3) | 5 (4-7) | −0.047 | 3 (2-5) | 3 (2-5) | 3 (2-5) | −0.005 |
| CMS-HCC | 0.70 (0.41-1.25) | 0.65 (0.39-1) | 2.77 (1.66-3.95) | −0.047 | 1.31 (0.74-2.35) | 1.25 (0.71-2.28) | 1.44 (0.84-2.47) | −0.006 |

Abbreviations: AAA, abdominal aortic aneurysm; CABG, coronary artery bypass graft; CMS-HCC, Centers for Medicare & Medicaid Services–Hierarchical Condition Category; IQR, interquartile range.

[a] Used in the propensity-matching analysis.

Table 2. Medicare Expenditure and Perioperative and Postoperative Characteristics for Low Users and Super-Utilizers in the Crude Population and the Propensity-Matched Population.

Values are median (IQR) unless otherwise indicated.

| Variable | Crude: Total (n = 626 754) | Crude: Low User (n = 547 008) | Crude: Super-Utilizer (n = 79 746) | Crude: Effect Size | Matched: Total (n = 52 986) | Matched: Low User (n = 35 497) | Matched: Super-Utilizer (n = 17 489) | Matched: Effect Size |
|---|---|---|---|---|---|---|---|---|
| Preoperative outcome | | | | | | | | |
| Sum expenditure, $ million | 3303 | 1961 | 1342 | NA | 465 | 230 | 235 | NA |
| Expenditure per patient, $ thousand | 1.3 (0.4-4.2) | 1.1 (0.3-3.2) | 6.6 (1.8-21.0) | −2.490 | 2.6 (0.8-8.7) | 2.1 (0.7-6.6) | 4.0 (1.1-15.1) | −2.363 |
| Operative outcomes | | | | | | | | |
| Expenditure per patient, $ thousand | 12.7 (11.4-14.9) | 12.4 (11.2-13.9) | 37.4 (20.9-58.9) | −6.016 | 13.5 (11.7-17.0) | 12.9 (11.3-15.3) | 16.0 (12.8-25.6) | −3.926 |
| Length of stay [a] | 3 (2-4) | 3 (2-3) | 11 (6-19) | −0.104 | 4 (3-6) | 4 (3-6) | 4 (3-7) | −0.012 |
| Postoperative outcome | | | | | | | | |
| Sum expenditure, $ million | 7548 | 1016 | 6532 | NA | 1693 | 134 | 1558 | NA |
| Expenditure per patient, $ thousand | 1.3 (0.4-3.8) | 1.1 (0.4-2.4) | 72.8 (52.2-100.1) | −12.179 | 6.2 (1.6-68.1) | 3.0 (0.9-6.3) | 79.7 (68.5-101.1) | −23.782 |
| Complications within 90 d, No. (IQR) [a] | 0 (0-0) | 0 (0-0) | 2 (1-3) | −0.050 | 0 (0-1) | 0 (0-1) | 0 (0-1) | −0.009 |
| Readmission, No. (%) | | | | | | | | |
| 90 d | 76 376 (12.2) | 16 921 (3.1) | 59 455 (74.6) | 0.728 | 21 383 (40.4) | 12 833 (36.2) | 8550 (48.9) | 0.122 |
| 30 d | 37 447 (6.0) | 10 140 (1.9) | 27 307 (34.2) | 0.455 | 11 464 (21.6) | 7953 (22.4) | 3511 (20.1) | 0.027 |
| Mortality, No. (%) | | | | | | | | |
| 90 d | 9980 (1.6) | 1739 (0.3) | 8241 (10.3) | 0.267 | 1144 (2.2) | 681 (1.9) | 463 (2.6) | 0.024 |
| 30 d | 2942 (0.5) | 805 (0.1) | 2137 (2.7) | 0.123 | 424 (0.8) | 287 (0.8) | 137 (0.8) | 0.001 |

Abbreviations: IQR, interquartile range; NA, not applicable.

[a] Used in the propensity-matching analysis.
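The matching step described in the Methods (greedy nearest neighbor, variable ratio up to 3:1, processed in descending order, balance judged by standardized mean differences below 0.10) can be illustrated with the sketch below. It is a simplified illustration and not the software the authors used: the caliper of 0.05, the dictionary inputs, and the function names are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance


def greedy_variable_ratio_match(treated, controls, max_ratio=3, caliper=0.05):
    """Greedy nearest-neighbor matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    Returns {treated_id: [control_id, ...]} with 1 to max_ratio controls.
    The caliper is an assumed tuning value, not taken from the study.
    """
    available = dict(controls)
    matches = {}
    # Visit treated patients from highest to lowest propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        nearest = sorted(available, key=lambda c: abs(available[c] - t_score))
        picked = [c for c in nearest[:max_ratio]
                  if abs(available[c] - t_score) <= caliper]
        for c in picked:
            del available[c]  # each control can be used only once
        if picked:
            matches[t_id] = picked
    return matches


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd


def is_good_match(a, b, threshold=0.10):
    """Balance rule from the Methods: absolute difference below 0.10."""
    return abs(standardized_mean_difference(a, b)) < threshold
```

Because each super-utilizer can receive between 1 and 3 low users, the matched groups end up with unequal sizes, which is consistent with the 35 497 low users and 17 489 super-utilizers reported in the matched columns.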

Statistical Analysis

The primary outcome was Medicare super-utilization (vs low use) in the year following surgery. All comorbidities, as defined by the Charlson (range, 0-24) and Elixhauser (range, 0-29) comorbidity indices, were used in the analysis; higher scores indicated higher comorbidity burden.35,36 In addition, comorbidity variables were composed for any condition not included in either the Charlson or Elixhauser indices that was present in at least 2% of the study sample. Descriptive statistics are presented as medians with interquartile ranges (IQRs) for continuous variables and as frequencies (%) for categorical variables. Effect sizes are presented as Cohen d and Cohen w for continuous and categorical variables, respectively. To visualize expected expenditure of a health care episode over time, a 4-week moving average excluding the index surgical episode was constructed on the propensity-matched cohort for each procedure and the entire matched cohort. This analytic approach has been previously described and used in time-series and health care cost analyses.37,38,39
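As a worked illustration of the categorical effect size, the sketch below computes Cohen w as the square root of the chi-square statistic divided by the sample size, using the male sex counts for the crude population in Table 1. The function name `cohen_w` is hypothetical, and the standard uncorrected chi-square statistic is assumed.

```python
from math import sqrt


def cohen_w(table):
    """Cohen w for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / n)


# Male sex in the crude population (Table 1): rows are low users and
# super-utilizers, columns are male and not male.
sex_table = [
    [203581, 547008 - 203581],
    [45745, 79746 - 45745],
]
w_sex = cohen_w(sex_table)
print(round(w_sex, 3))
```

Under these assumptions the result rounds to 0.137, in line with the crude effect size reported for male sex in Table 1.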

Logic Forest, a machine learning algorithm, was designed to identify main effects and interactions of binary predictors (ie, preoperative comorbidities) that most accurately classified patients by a dichotomous outcome (ie, super-utilizer vs low user).30,31 This method does not require specification of main effects or interactions a priori; instead, it iteratively evaluates the sample space for all interactions. In the present study, Logic Forest analysis was performed with 200 iterations on a 50% training data set derived from a simple random sample drawn without replacement from the propensity-matched data set.26 To validate the results of the Logic Forest, the associations among the most frequent variable combinations identified by the machine learning algorithm and super-utilization were tested with logistic regression on the validation data set (ie, all observations not in the training data set) and propensity score–matched cohort.31 Logic Forest analysis was performed using R, version 3.2.5, Logic Forest package. All other analyses were performed using SAS, version 9.4 (SAS Institute Inc). Data analysis was completed on November 16, 2018.
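The split-and-validate workflow can be sketched as follows. This does not reimplement Logic Forest, which the authors ran in R; it only illustrates a 50% simple random split without replacement and an unadjusted odds ratio with a Wald 95% CI for one binary predictor. The study itself used logistic regression with adjustment for covariates, so the numbers from this sketch would not reproduce the published estimates. All function names are hypothetical.

```python
import random
from math import exp, log, sqrt


def split_half(ids, seed=0):
    """Simple random sample without replacement: 50% training, rest validation."""
    ids = list(ids)
    rng = random.Random(seed)
    train = set(rng.sample(ids, len(ids) // 2))
    validation = [i for i in ids if i not in train]
    return sorted(train), validation


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table.

    Returns (odds_ratio, lower, upper). All four counts must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases)
    # Standard error of the log odds ratio.
    se = sqrt(1 / exposed_cases + 1 / exposed_noncases
              + 1 / unexposed_cases + 1 / unexposed_noncases)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

Here "cases" would be super-utilizers and "exposed" would mean that a patient carries the comorbidity or comorbidity combination being tested in the validation half.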

Results

In total, 1 049 160 patients met inclusion criteria and were included in the analytic cohort (Table 1). The median (IQR) age of the cohort was 73 (69-78) years, and 249 326 (approximately 40%) were male. The majority of patients underwent TKA (315 697 [50.4%]), THA (162 005 [25.8%]), or colectomy (73 427 [11.7%]); a smaller subset of patients underwent AAA (14 173 [2.3%]), CABG (33 498 [5.3%]), or lung resection (27 954 [4.5%]). Comorbidities were common, as the median (IQR) Elixhauser index score was 2 (1-4). For both low users and super-utilizers, the most prevalent preoperative comorbidities included uncomplicated hypertension (393 581 [62.8%]) and chronic obstructive pulmonary disease (174 313 [27.8%]) (eTable 1 in the Supplement).

Results of the cluster analysis identified that super-utilizers comprised 4.8% of the overall cohort (n = 79 746) yet incurred 31.7% of the expenditures. By contrast, 52.1% (n = 547 008) of patients were categorized as low users (Table 1). Compared with low users, super-utilizers were more often male (45 745 [57.4%] vs 203 581 [37.2%]) and African American (7.2% vs 3.9%) and had a higher comorbidity burden as evidenced by higher median (IQR) Charlson index (4 [3-6] vs 0 [0-2]) and Elixhauser index (5 [4-7] vs 2 [1-3]) comorbidity scores. With regard to procedure type, the largest share of super-utilizers underwent CABG (39.5%), followed by colectomy (26.4%), TKA (10.2%), THA (8.4%), AAA (8.3%), and lung resection (7.2%).

Propensity-score–matched cohorts were derived to account for certain baseline clinical and pathological differences that may have been associated with membership in the low user or super-utilizer group (17 489 super-utilizers [33.0%] vs 35 497 low users [67.0%]) (Table 1 and Table 2). All variables included in the matching met the good match criteria. In the propensity-score–matched cohort, the majority of patients had undergone TKA (32.9%), colectomy (24.4%), or THA (22.1%).

In the variable ratio (ie, unequal sample sizes) propensity-matched cohort, both preoperative and postoperative median annual Medicare expenditures were higher among patients categorized as super-utilizers vs low users (Table 2). Of note, although the difference in overall expenditures per person among super-utilizers ($4049) vs low users ($2148) was relatively modest prior to surgery, the difference in expenditures between super-utilizers ($79 698) and low users ($2977) was marked in the year following surgery (Table 2). Overall, postoperative Medicare expenditure among super-utilizers (approximately $1.6 billion) was more than 11-fold higher than that among low users ($134 million).

Evaluation of the moving expenditure average showed that prior to surgery, super-utilizers were consistently estimated to have higher costs for a health care episode (Figure). The estimated cost of a health care episode increased shortly after surgery and then stabilized at a much higher cost than that before surgery. Furthermore, there was marked variation in the expected cost of a health care episode in the different surgical subgroups within the super-utilizer population. Within the low-user population, the moving average indicated that the estimated cost of a health care episode increased shortly after surgery yet became attenuated after approximately 90 days for each of the surgical subgroups examined.

Figure. Moving Averages for Each Surgical Subpopulation and Overall for Medicare Low Users and Super-Utilizers.

Thin lines represent each surgical subpopulation; thick lines, overall use.
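The smoothing behind the Figure can be illustrated with a short sketch of a 4-week moving average. The source does not state whether the window was trailing or centered, so a trailing window is assumed here, and the input is assumed to be a weekly cost series from which the index surgical episode has already been removed. The function name `moving_average` is hypothetical.

```python
def moving_average(weekly_cost, window=4):
    """Trailing moving average; entries are None until a full window exists."""
    out = []
    for i in range(len(weekly_cost)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = weekly_cost[i + 1 - window: i + 1]
            out.append(sum(chunk) / window)
    return out


# Six weeks of made-up episode costs, for illustration only.
print(moving_average([1, 2, 3, 4, 5, 6]))
# [None, None, None, 2.5, 3.5, 4.5]
```

Computing this series separately for each procedure and for the pooled matched cohort would give the thin and thick lines, respectively.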

Preoperative Risk Factors Associated With Super-Utilization Following Surgery

The machine learning Logic Forest program was used to identify individual preoperative patient characteristics associated with increased Medicare expenditures (Table 3). The Logic Forest analysis was performed on the training cohort and subsequently on the validation cohort, and the results of both tests are described in eTable 2 in the Supplement. After controlling for age, sex, race/ethnicity, and preoperative total annual Medicare expenditure, the factors associated with postoperative super-utilization included the following: unplanned weight loss in the absence of a malignant neoplasm (odds ratio [OR], 3.5; 95% CI, 2.9-4.2), hemiplegia/paraplegia (OR, 5.2; 95% CI, 4.4-6.2), other neurologic disorders combined with congestive heart failure (CHF) (OR, 2.8; 95% CI, 2.2-3.6), CHF combined with chronic kidney disease stages I to IV (OR, 3.4; 95% CI, 3.0-3.9), as well as CHF and chronic kidney disease stages I to V without concurrent complicated hypertension (OR, 2.8; 95% CI, 2.5-3.1). Of note, the additional median (IQR) Medicare expenditure associated with hemiplegia/paraplegia in the year following surgery was $73 998 ($8657-$96 192) compared with an expenditure of $52 529 ($3257-$76 852) for unplanned weight loss and $42 383 ($2924-$81 519) for CHF associated with renal disease.

Table 3. Data for Each Identified Interaction on the PS-Matched Validation Cohort, the Total PS-Matched Cohort, and the Crude Cohort.

Values are OR (95% CI).

| Interaction | Propensity Score Matched Cohort: Validation | Propensity Score Matched Cohort: Total | Crude Cohort: Total |
|---|---|---|---|
| Hemiplegia or paraplegia [a,b] | 5.19 (4.37-6.18) | 5.35 (4.74-6.05) | 18.42 (17.38-19.52) |
| No solid tumor and WL [a,c,d] | 3.49 (2.89-4.22) | 3.50 (3.06-4.01) | 30.23 (28.72-31.82) |
| CHF and stage I-IV RD [a,c,e] | 3.40 (2.99-3.86) | 3.45 (3.15-3.77) | 24.66 (23.66-25.71) |
| Other ND and CHF [a,c] | 2.79 (2.19-3.55) | 2.45 (2.06-2.92) | 38.15 (35.51-40.99) |
| No HPTN C and CHF and stage I-V RD [a,c] | 2.75 (2.45-3.08) | 2.76 (2.54-2.99) | 20.36 (19.62-21.13) |
| Lymphoma [a,f] | 2.04 (1.66-2.52) | 1.93 (1.66-2.23) | 2.77 (2.57-3.00) |

Abbreviations: CHF, congestive heart failure; HPTN C, complicated hypertension; METS, metastatic cancer; ND, neurologic disorders; OR, odds ratio; PS, propensity score; RD, renal disease; WL, weight loss.

[a] Charlson comorbidity index.

[b] Also identified, hemiplegia or paraplegia and no solid tumor with METS, or no blood loss anemia, or no liver disease, or no psychoses.

[c] Elixhauser index.

[d] Also identified, WL and no solid tumor with or without METS.

[e] Also identified, stage I to IV RD and no solid tumor with METS.

[f] Also identified, lymphoma and no solid tumor with METS.
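Each row of Table 3 is a logic expression over binary comorbidity flags, which is the kind of predictor Logic Forest searches for. The sketch below shows how the six listed interactions could be evaluated for one patient. The dictionary keys and the function name are hypothetical labels chosen for this example, not variable names from the study data.

```python
def table3_interactions(p):
    """Evaluate the Table 3 logic expressions for one patient.

    p: dict of booleans holding preoperative comorbidity flags
    (key names are illustrative assumptions).
    """
    return {
        "hemiplegia_or_paraplegia": p["hemi_para"],
        "no_solid_tumor_and_wl": (not p["solid_tumor"]) and p["weight_loss"],
        "chf_and_rd_stage_1_4": p["chf"] and p["rd_stage_1_4"],
        "other_nd_and_chf": p["other_nd"] and p["chf"],
        "no_hptn_c_and_chf_and_rd_stage_1_5": (
            (not p["htn_complicated"]) and p["chf"] and p["rd_stage_1_5"]),
        "lymphoma": p["lymphoma"],
    }


# A made-up patient with CHF, stage I-IV renal disease, weight loss,
# and complicated hypertension.
example_patient = {
    "hemi_para": False, "solid_tumor": False, "weight_loss": True,
    "chf": True, "rd_stage_1_4": True, "rd_stage_1_5": True,
    "other_nd": False, "htn_complicated": True, "lymphoma": False,
}
flags = table3_interactions(example_patient)
```

For this made-up patient, the weight loss and the CHF with stage I-IV renal disease expressions are true, while the expression that requires the absence of complicated hypertension is false.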

Discussion

Owing to the large amount of resources that super-utilizers consume, characterizing and understanding this subpopulation of patients is important to guide targeted interventions aimed at decreasing their financial reliance on the health care system.4,9,10,11,12 Hot spotting has been a term commonly used as a means to identify which surgical patients may be super-utilizers during any given surgical episode. Typically, only costs of care based on the time around the surgical episode itself (eg, 30-day perioperative period) have been examined.1,2,5,13,14,15,16,17,18,19,20,22,23,40 Although previous studies have identified super-utilizers at the time of surgical intervention, data on the longer-term relationship of these patients with the health care system have not been well examined. To our knowledge, no study has longitudinally analyzed super-utilization following surgical intervention. The current study adds to the literature by examining health care use and expenditure for a full year following surgery. Perhaps more importantly, we identified factors associated with increased postoperative super-utilization in the year following elective surgery by using an innovative machine learning technique. Specifically, several preoperative comorbidities associated with super-utilization following surgery were identified, and the actual financial association of these different conditions among surgical patients in the year following surgery was quantified.

Data on super-utilizers in the surgical population have been limited because only a few studies have investigated characteristics associated with increased expenditure at the time of the surgical episode.1,2,13,14,15,16,17,18,19,23,24 In addition, the assessment of overall expenditure has varied among different studies. For example, some investigators have considered expenditure as a continuous measure, aiming to identify factors associated with a higher “average” expenditure.2,14,15,16,17,18,19,23,24 By contrast, other authors have focused more on factors associated with being in the high-expenditure group rather than on comparing average expenditure.1,4,13 In the present study, we specifically sought to identify factors associated with increased overall expenditure in the 12 months following surgical intervention. In addition, we identified certain patient characteristics that were associated with super-utilization defined by both inpatient and outpatient expenditures in the year following elective surgery. Several factors, including preoperative hemiplegia/paraplegia, CHF, chronic renal failure, and weight loss, were associated with a markedly increased risk of much higher long-term expenditure after surgery. Of note, although this patient population may have also been expected to have a higher median expenditure prior to surgery, the relative increase in postoperative expenditure among these super-utilizers was marked.

Previous studies have noted that these comorbidities in particular were strongly associated with adverse surgical outcomes.41,42,43,44,45,46 Specifically, patients with CHF, hemiplegia/paraplegia, or renal disease are known to have higher risks of 90-day morbidity and mortality.41,42 Furthermore, preoperative weight loss has been reported to be a strong surrogate of overall patient well-being and frailty, which in turn can increase the risk of postoperative complications.43 By using propensity score matching, we were able to compare super-utilizers with low users after controlling for age, race/ethnicity, sex, index hospitalization length of stay, and the number of complications within 90 days, all measures thought to be associated with short-term postoperative expenditure. Even after controlling for these factors associated with short-term expenditure, factors such as CHF, renal failure, and weight loss remained associated with super-utilization during the course of the entire year following surgical intervention. Taken together, these data suggested that super-utilization among certain subpopulations of patients was not simply limited to the immediate postoperative period. Rather, a subset of patients who had comparable preoperative use went on to become super-utilizers in the postoperative period, with durable and markedly increased expenditures for at least 1 year following surgery.

The present study used a machine learning technique to identify factors associated with super-utilization. The application of the machine learning technique allowed for data mining and extraction from a large database, such as that of Medicare, in a sophisticated and less biased manner than might be involved with traditional statistical methods.25,26,30 Machine learning is an application of artificial intelligence that provides the ability to automatically learn and improve understanding of data patterns via programmed algorithms.25,26 Given the attributes of machine learning, this statistical technique works best with big data sets because the volume and complexity of such data are high. As such, machine learning is an ideal application for data sources such as Medicare when trying to identify patterns of factors associated with an outcome, such as super-utilization. In turn, by identifying patients associated with a risk for high postoperative expenditures, targeted efforts can be made to improve patient care and outcomes, which may subsequently decrease their burden on the health care system. By identifying expenditures of patients for the full year following the surgical intervention, the data highlighted how super-utilizers partake of resources beyond the immediate inpatient stay, such as outpatient visits and other ancillary services. Future studies should aim to employ machine learning approaches not only to better understand the specific areas of high use but also to develop specific evidence-based algorithms aimed at decreasing or avoiding unnecessary costs. Data from the present study may help identify patients associated with the highest risk of becoming super-utilizers to inform targeted interventions in the preoperative setting to decrease, anticipate, and help manage the massive burden on the health care system in the postoperative period.

Limitations and Strengths

The current study had several limitations. Standard Analytic Files lack detailed clinical information to define certain characteristics that might have been associated with use.24,47,48 Specifically, comorbidities were defined using International Classification of Diseases, Ninth Revision, codes, which may lack precision.47 In addition, given that we used the Medicare database, the cohort largely consisted of surgical patients aged 65 years or older; as such, the data may not be generalizable to younger surgical patients or patients with an insurance status other than Medicare.14,15 Furthermore, given that propensity matching yielded a subset of the entire population, the subsequent Logic Forest analysis was not performed on the entire cohort but on a more balanced cohort of the total population. However, because propensity matching was performed on the summary measures of comorbid burden, specific comorbidities might not be balanced, as was the case with fibromyalgia. The strengths of the present study, however, were the use of an innovative approach and method to identify super-utilizers and low users, creation of a balanced cohort using propensity matching, and examination of postoperative super-utilization for a year following surgery. Rather than identifying use subgroups based on quartiles or quintiles, the cluster analysis fragmented the data at more organic breaking points, resulting in more homogeneous use populations. In addition, by using propensity score matching, we were able to decrease selection bias, which is a common problem with retrospective observational studies. By leveraging the abilities of Logic Forest to identify salient risk factors of postoperative super-utilization from a pool of preoperative comorbidities, the method was able to identify main effects and all possible interactions.31

Conclusions

Super-utilizers comprised only a fraction of the surgical population yet were responsible for a disproportionately large portion of Medicare expenditure. Certain subpopulations became super-utilizers following surgical intervention despite having lower overall use in the preoperative period. An innovative machine learning method identified certain specific patient factors that were associated with increased odds of health care super-utilization in the year following elective surgery. This study found that fewer than 1 in 20 surgical patients accounted for 31.7% of the total health care expenditures in the year following a surgical intervention. These results suggest that future studies may target efforts to decrease super-utilization among groups of patients associated with higher risks following elective surgery as well as use machine learning approaches to identify ways to mitigate overuse of health care resources.

Supplement.

eTable 1. Charlson and Elixhauser Comorbidities in the Crude Population and the Propensity-Matched Population

eTable 2. Patient Demographics and Pre-Operative Characteristics for Low- and Super-Utilizers in the Crude Population and the Propensity-Matched Population

References

  • 1. Shubeck SP, Thumma JR, Dimick JB, Nathan H. Hot spotting as a strategy to identify high-cost surgical populations. Ann Surg. 2019;269(3):453-458. doi: 10.1097/SLA.0000000000002663
  • 2. Birkmeyer JD, Gust C, Dimick JB, Birkmeyer NJ, Skinner JS. Hospital quality and the cost of inpatient surgery in the United States. Ann Surg. 2012;255(1):1-5. doi: 10.1097/SLA.0b013e3182402c17
  • 3. Figueroa JF, Lyon Z, Zhou X, Grabowski DC, Jha AK. Persistence and drivers of high-cost status among dual-eligible Medicare and Medicaid beneficiaries: an observational study. Ann Intern Med. 2018;169(8):528-534. doi: 10.7326/M18-0085
  • 4. Harris LJ, Graetz I, Podila PS, Wan J, Waters TM, Bailey JE. Characteristics of hospital and emergency care super-utilizers with multiple chronic conditions. J Emerg Med. 2016;50(4):e203-e214. doi: 10.1016/j.jemermed.2015.09.002
  • 5. Nathan H, Atoria CL, Bach PB, Elkin EB. Hospital volume, complications, and cost of cancer surgery in the elderly. J Clin Oncol. 2015;33(1):107-114. doi: 10.1200/JCO.2014.57.7155
  • 6. Riley GF. Long-term trends in the concentration of Medicare spending. Health Aff (Millwood). 2007;26(3):808-816. doi: 10.1377/hlthaff.26.3.808
  • 7. Congressional Budget Office. High-Cost Medicare Beneficiaries. https://www.cbo.gov/sites/default/files/109th-congress-2005-2006/reports/05-03-medispending.pdf. Published May 2005. Accessed July 3, 2019.
  • 8. Monheit AC. Persistence in health expenditures in the short run: prevalence and consequences. Med Care. 2003;41(7)(suppl):III53-III64. doi: 10.1097/00005650-200307001-00007
  • 9. Agency for Healthcare Research and Quality. Healthcare Cost and Utilization Project (HCUP). Statistical Brief #190: characteristics of hospital stays for super-utilizers by payer, 2012. https://www.hcup-us.ahrq.gov/reports/statbriefs/sb190-Hospital-Stays-Super-Utilizers-Payer-2012.jsp. Published May 2015. Accessed July 3, 2019.
  • 10. US Department of Health and Human Services; Agency for Healthcare Research and Quality. The high concentration of U.S. health care expenditures. https://meps.ahrq.gov/data_files/publications/ra19/ra19.pdf. Published June 2006. Accessed July 3, 2019.
  • 11. Johnson TL, Rinehart DJ, Durfee J, et al. For many patients who use large amounts of health care services, the need is intense yet temporary. Health Aff (Millwood). 2015;34(8):1312-1319. doi: 10.1377/hlthaff.2014.1186
  • 12. Gawande A. The hot spotters: can we lower medical costs by giving the neediest patients better care? New Yorker. 2011;40-51.
  • 13. Merath K, Chen Q, Johnson M, et al. Hot spotting surgical patients undergoing hepatopancreatic procedures. HPB (Oxford). 2019;21(6):765-772. doi: 10.1016/j.hpb.2018.10.011
  • 14. Sheetz KH, Norton EC, Regenbogen SE, Dimick JB. An instrumental variable analysis comparing Medicare expenditures for laparoscopic vs open colectomy. JAMA Surg. 2017;152(10):921-929. doi: 10.1001/jamasurg.2017.1578
  • 15. Sheetz KH, Ibrahim AM, Regenbogen SE, Dimick JB. Surgeon experience and Medicare expenditures for laparoscopic compared to open colectomy. Ann Surg. 2018;268(6):1036-1042.
  • 16. Chen Q, Bagante F, Merath K, et al. Hospital teaching status and Medicare expenditures for hepato-pancreato-biliary surgery. World J Surg. 2018;42(9):2969-2979. doi: 10.1007/s00268-018-4566-1
  • 17. Chen Q, Beal EW, Kimbrough CW, et al. Perioperative complications and the cost of rescue or failure to rescue in hepato-pancreato-biliary surgery. HPB (Oxford). 2018;20(9):854-864. doi: 10.1016/j.hpb.2018.03.010
  • 18. Merath K, Chen Q, Bagante F, et al. Variation in the cost-of-rescue among Medicare patients with complications following hepatopancreatic surgery. HPB (Oxford). 2019;21(3):310-318. doi: 10.1016/j.hpb.2018.08.005
  • 19. Chen Q, Merath K, Bagante F, et al. A comparison of open and minimally invasive surgery for hepatic and pancreatic resections among the Medicare population. J Gastrointest Surg. 2018;22(12):2088-2096. doi: 10.1007/s11605-018-3883-x
  • 20. Miller DC, Gust C, Dimick JB, Birkmeyer N, Skinner J, Birkmeyer JD. Large variations in Medicare payments for surgery highlight savings potential from bundled payment programs. Health Aff (Millwood). 2011;30(11):2107-2115. doi: 10.1377/hlthaff.2011.0783
  • 21. Borza T, Oerline MK, Skolarus TA, et al. Association between hospital participation in Medicare shared savings program accountable care organizations and readmission following major surgery. Ann Surg. 2019;269(5):873-878.
  • 22. Nathan H, Thumma JR, Ryan AM, Dimick JB. Early impact of Medicare accountable care organizations on inpatient surgical spending. Ann Surg. 2019;269(2):191-196.
  • 23. Sheetz KH, Ryan AM, Ibrahim AM, Dimick JB. Association of hospital network participation with surgical outcomes and Medicare expenditures. Ann Surg. 2018. doi: 10.1097/SLA.0000000000002791
  • 24. Ho V, Aloia T. Hospital volume, surgeon volume, and patient costs for cancer surgery. Med Care. 2008;46(7):718-725. doi: 10.1097/MLR.0b013e3181653d6b
  • 25. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930. doi: 10.1161/CIRCULATIONAHA.115.001593
  • 26. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York, NY: Springer; 2009. doi: 10.1007/978-0-387-84858-7
  • 27. Corey KM, Kashyap S, Lorenzi E, et al. Development and validation of machine learning models to identify high-risk surgical patients using automatically curated electronic health record data (Pythia): a retrospective, single-site study. PLoS Med. 2018;15(11):e1002701. doi: 10.1371/journal.pmed.1002701
  • 28. Bertsimas D, Dunn J, Velmahos GC, Kaafarani HMA. Surgical risk is not linear: derivation and validation of a novel, user-friendly, and machine-learning-based predictive optimal trees in emergency surgery risk (POTTER) calculator. Ann Surg. 2018;268(4):574-583. doi: 10.1097/SLA.0000000000002956
  • 29. Bertsimas D, Kung J, Trichakis N, Wang Y, Hirose R, Vagefi PA. Development and validation of an optimized prediction of mortality for candidates awaiting liver transplantation. Am J Transplant. 2019;19(4):1109-1118.
  • 30. Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics. 2010;26(17):2183-2189. doi: 10.1093/bioinformatics/btq354
  • 31. Wolf BJ, Ramos PS, Hyer JM, et al. An analytic approach using candidate gene selection and logic forest to identify gene by environment interactions (G × E) for systemic lupus erythematosus in African Americans. Genes (Basel). 2018;9(10):E496. doi: 10.3390/genes9100496
  • 32. Tan P-N, Steinbach M, Karpatne A, Kumar V. Introduction to Data Mining. 2nd ed. New York, NY: Pearson Education Inc; 2019.
  • 33. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci. 2010;25(1):1-21. doi: 10.1214/09-STS313
  • 34. Stuart EA, DuGoff E, Abrams M, Salkever D, Steinwachs D. Estimating causal effects in observational studies using electronic health data: challenges and (some) solutions. EGEMS (Wash DC). 2013;1(3).
  • 35. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373-383. doi: 10.1016/0021-9681(87)90171-8
  • 36. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. doi: 10.1097/00005650-199801000-00004
  • 37. Obama B. United States health care reform: progress to date and next steps. JAMA. 2016;316(5):525-532. doi: 10.1001/jama.2016.9797
  • 38. Baicker K, Robbins JA. Medicare payments and system-level health-care use: the spillover effects of Medicare managed care. Am J Health Econ. 2015;1(4):399-431. doi: 10.1162/AJHE_a_00024
  • 39. Kutner MH. Applied Linear Statistical Models. 5th ed. Boston: McGraw-Hill Irwin; 2005.
  • 40. Sheetz KH, Dimick JB, Ghaferi AA. The association between hospital care intensity and surgical outcomes in Medicare patients. JAMA Surg. 2014;149(12):1254-1259. doi: 10.1001/jamasurg.2014.552
  • 41. Bozic KJ, Lau E, Kurtz S, et al. Patient-related risk factors for periprosthetic joint infection and postoperative mortality following total hip arthroplasty in Medicare patients. J Bone Joint Surg Am. 2012;94(9):794-800. doi: 10.2106/JBJS.K.00072
  • 42. Bozic KJ, Ong K, Lau E, et al. Estimating risk in Medicare patients with THA: an electronic risk calculator for periprosthetic joint infection and mortality. Clin Orthop Relat Res. 2013;471(2):574-583. doi: 10.1007/s11999-012-2605-z
  • 43. Moghadamyeghaneh Z, Hanna MH, Hwang G, et al. Outcome of preoperative weight loss in colorectal surgery. Am J Surg. 2015;210(2):291-297. doi: 10.1016/j.amjsurg.2015.01.019
  • 44. Phruetthiphat OA, Gao Y, Anthony CA, Pugely AJ, Warth LC, Callaghan JJ. Incidence of and preoperative risk factors for surgical delay in primary total hip arthroplasty: analysis from the American College of Surgeons National Surgical Quality Improvement Program. J Arthroplasty. 2016;31(11):2432-2436. doi: 10.1016/j.arth.2016.05.054
  • 45. Carson JL, Duff A, Poses RM, et al. Effect of anaemia and cardiovascular disease on surgical mortality and morbidity. Lancet. 1996;348(9034):1055-1060. doi: 10.1016/S0140-6736(96)04330-9
  • 46. Leung JM, Dzankic S. Relative importance of preoperative health status versus intraoperative factors in predicting postoperative adverse outcomes in geriatric surgical patients. J Am Geriatr Soc. 2001;49(8):1080-1085. doi: 10.1046/j.1532-5415.2001.49212.x
  • 47. Ghaferi AA, Dimick JB. Practical guide to surgical data sets: Medicare claims data. JAMA Surg. 2018;153(7):677-678. doi: 10.1001/jamasurg.2018.0489
  • 48. Nathan H, Pawlik TM. Limitations of claims and registry data in surgical oncology research. Ann Surg Oncol. 2008;15(2):415-423. doi: 10.1245/s10434-007-9658-3

Articles from JAMA Surgery are provided here courtesy of American Medical Association