Author manuscript; available in PMC: 2021 Jul 1.
Published in final edited form as: J Am Acad Orthop Surg. 2020 Jul 1;28(13):e580–e585. doi: 10.5435/JAAOS-D-19-00395

A Novel Machine Learning Model Developed to Assist in Patient Selection for Outpatient Total Shoulder Arthroplasty

Dustin R Biron 1,2, Ishan Sinha 1,2, Justin E Kleiner 1, Dilum P Aluthge 1,2, Avi D Goodman 3, I Neil Sarkar 2, Eric Cohen 3, Alan H Daniels 3
PMCID: PMC7180108  NIHMSID: NIHMS1548995  PMID: 31663914

Abstract

Introduction:

Patient selection for outpatient total shoulder arthroplasty is important for optimizing patient outcomes. The goal of this study was to develop a machine learning tool that may aid in patient selection for outpatient total shoulder arthroplasty based on medical comorbidities and demographic factors.

Methods:

Patients undergoing elective total shoulder arthroplasty from 2011–2016 in the National Surgical Quality Improvement Program were queried. A Random Forest machine learning model was used to predict which patients had a length of stay of 1 day or less (short stay). A multivariable logistic regression was then used to identify which variables were significantly correlated with a short or long stay.

Results:

From 2011–2016, 4,500 patients were identified as having undergone elective total shoulder arthroplasty and as having the necessary predictive features and outcomes recorded. The machine learning model successfully identified short stay patients, producing an area under the receiver operating characteristic curve of 0.77. The multivariable logistic regression identified numerous variables associated with a short stay, including age less than 70 and male sex, as well as variables associated with a longer stay, including diabetes, COPD, and ASA class greater than 2.

Discussion:

Machine learning may be used to predict which patients are suitable candidates for short stay or outpatient total shoulder arthroplasty based on their medical comorbidities and demographic profile.

Introduction

Total shoulder arthroplasty (TSA) is a commonly performed orthopaedic procedure that can reliably treat glenohumeral arthritis. Traditionally, TSA has been associated with an inpatient hospital stay. There has been an increased interest in outpatient TSA in order to lower healthcare costs while maintaining quality. Inpatient TSA may cost up to three times more on average than outpatient TSA due to charges including those related to the procedure itself, nursing, medication, and rehabilitation.1 Therefore, there is an incentive for insurers, health systems, and surgeons to shift TSA volume to an outpatient basis; however, appropriate patient selection remains a challenge.

Currently, the number of patients undergoing outpatient TSA is limited because surgeons select only the healthiest patients as candidates, owing to concerns about complications and readmissions.2,3,4 Importantly, within these narrow cohorts, the complication rate of outpatient TSA has not been significantly different from that of inpatient TSA, indicating that it is a safe procedure with proper patient selection.3,5,6 Therefore, developing evidence-based tools to aid in patient selection is an essential step toward expanding the possible patient pool for outpatient TSA.

Artificial intelligence may play an important role in supporting patient selection for TSA. Clinical decision support tools that use machine learning algorithms such as random forests, artificial neural networks, or support vector machines have proved useful in medical research.7,8 In particular, machine learning has been successfully employed to predict length of stay (LOS), including after total hip and total knee arthroplasty.9,10,11 We are unaware of such tools being regularly used in clinical orthopaedic practice, suggesting that there is potential utility in integrating computerized algorithms into electronic health record systems, where they can be used as point-of-care decision support tools with actionable analysis provided to the surgeon.

The purpose of this study was to develop a machine learning algorithm and perform multivariable logistic regression analysis using a national database to identify appropriate candidates for short stay TSA, a proxy for outpatient TSA, as well as elucidate factors correlated with extended hospital LOS.

Methods

The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) is a prospective, national, multi-institution, multi-specialty surgical registry. Data collected include demographics, medical comorbidities, and outcomes. In this study, the following data were extracted as predictive features: age, sex, race, year of procedure, obesity (defined as body mass index [BMI] > 30), history of smoking, diabetes, congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), American Society of Anesthesiologists (ASA) class, dyspnea, steroid use, and hypertension. LOS was chosen as the outcome variable of interest.

Patient records from ACS-NSQIP, years 2011–2016, were queried. Patients undergoing TSA were identified using Current Procedural Terminology (CPT) code 23472 (“arthroplasty, glenohumeral joint”). Patients who underwent the procedure non-electively were excluded, as were patients who did not have all of the predictive features and outcomes recorded. Patients were labeled as “short LOS” if they had LOS less than or equal to one day and as “long LOS” if they had LOS greater than or equal to three days. Patients with LOS equal to two days were excluded.
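The labeling rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `label_los` and the sample stays are hypothetical, and the study itself did not publish this code.

```python
from typing import Optional


def label_los(los_days: int) -> Optional[str]:
    """Return 'short', 'long', or None (excluded) for a length of stay in days."""
    if los_days <= 1:
        return "short"   # LOS of one day or less
    if los_days >= 3:
        return "long"    # LOS of three days or more
    return None          # LOS of exactly two days is excluded


# Hypothetical lengths of stay, in days, for five patients.
stays = [0, 1, 2, 3, 5]
labels = [label_los(d) for d in stays]
print(labels)
```

Dropping the two-day stays leaves a gap between the two classes, which makes the short and long groups easier to separate than a single cut point would.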

The data were randomly partitioned into a training set (70%) and testing set (30%). The training set data were used to develop and train a random forest machine learning model for predicting long LOS, with the predictive features as inputs and the long LOS outcome as output. The testing set data were then used to evaluate the sensitivity and specificity of the random forest model at multiple operating thresholds. These sensitivities and specificities were used to generate a receiver operating characteristic (ROC) curve. The area under the ROC curve was calculated using trapezoidal integration. These analyses were performed in the Julia programming language using the PredictMD machine learning toolkit.12
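The ROC construction and trapezoidal integration described above can be illustrated with a minimal Python sketch. The study performed these steps in Julia with PredictMD; the code below is not that implementation, and the function names, scores, and labels are hypothetical toy values. It assumes a label of 1 means long LOS (the positive class) and that a higher score means a higher predicted probability of long LOS.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    # One threshold per distinct score, plus +inf so the curve starts at (0, 0).
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy test-set predictions (hypothetical): 1 = long LOS, 0 = short LOS.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = trapezoid_auc(roc_points(scores, labels))
print(round(auc, 3))
```

Each threshold contributes one sensitivity and specificity pair (the true positive rate is sensitivity; the false positive rate is one minus specificity), which is why evaluating the model at multiple operating thresholds is enough to trace the curve.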

Multivariable logistic regression was used to determine the relationships between predictive features and LOS. Adjusted odds ratios and 95% confidence intervals were calculated for all predictive features. Statistical significance was set at p<0.05. These analyses were performed using SAS software (SAS Institute, Inc., Cary, NC). Two-sided t tests were performed on continuous features, including age, BMI, and operative time, to elucidate differences between the short and long LOS groups. Two-sided chi-square tests were used to compare readmission and reoperation rates between the two groups.
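For reference, the standard relationship between a logistic regression coefficient and its adjusted odds ratio can be written as follows. The notation is introduced here for illustration and does not appear in the original analysis; the Wald form of the confidence interval is an assumption about how the intervals were computed.

```latex
\log\frac{P(\text{long LOS})}{1 - P(\text{long LOS})} = \beta_0 + \sum_{j} \beta_j x_j,
\qquad
\mathrm{OR}_j = e^{\hat{\beta}_j},
\qquad
95\%\ \mathrm{CI}_j = \exp\!\left(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\right)
```

Here $x_j$ is the $j$-th predictive feature, $\hat{\beta}_j$ its fitted coefficient, and $\mathrm{SE}(\hat{\beta}_j)$ its standard error. An odds ratio above 1 indicates a feature associated with long LOS after adjustment for the other features; an odds ratio below 1 indicates a feature associated with short LOS.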

Results

From 2011–2016, 12,897 patients were identified as undergoing elective TSA in the ACS-NSQIP. The average length of stay in these patients was 1.93 days. A total of 8,397 patients who did not have all of the necessary predictive features and outcomes recorded were excluded. Of the remaining 4,500 patients, 2,122 were labeled as short LOS and 1,006 were labeled as long LOS, with 1,372 patients excluded for having a LOS of 2 days (Table 1).

Table 1.

Patient Demographic and Medical Comorbidity Information

Feature | Short LOS patients | Long LOS patients
Age (years) | 67.8 | 72.8
Sex | 1084 male (51.1%); 1038 female (48.9%) | 321 male (31.9%); 685 female (68.1%)
Race | 2000 white (94.3%); 122 non-white (5.75%) | 928 white (92.2%); 78 non-white (7.76%)
Obesity (BMI >30) | 1063 (50.1%) | 534 (53.1%)
History of Smoking | 254 (12.0%) | 98 (9.74%)
Diabetes | 346 (16.3%) | 260 (25.8%)
CHF | 6 (0.282%) | 10 (0.994%)
COPD | 133 (6.27%) | 127 (12.6%)
ASA class | 2.50 | 2.85
ASA class ≥ 3 (%) | 50.14% | 77.04%
Dyspnea | 119 (5.61%) | 114 (11.3%)
Hypertension | 1374 (64.8%) | 762 (75.7%)
Steroid use | 95 (4.48%) | 58 (5.77%)

LOS = length of stay, BMI = body mass index, CHF = congestive heart failure, COPD = chronic obstructive pulmonary disease, ASA = American Society of Anesthesiologists class
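To show how an odds ratio relates to the counts in Table 1, the sketch below computes a crude (unadjusted) odds ratio for diabetes with a log-based (Woolf) 95% confidence interval. This is not the adjusted estimate from the multivariable model, which accounts for the other features at the same time, and the function name `crude_odds_ratio` is hypothetical.

```python
import math


def crude_odds_ratio(exposed_long, total_long, exposed_short, total_short, z=1.96):
    """Unadjusted odds ratio for long LOS with a Woolf (log-based) confidence interval."""
    a = exposed_long                    # long LOS, comorbidity present
    b = total_long - exposed_long       # long LOS, comorbidity absent
    c = exposed_short                   # short LOS, comorbidity present
    d = total_short - exposed_short     # short LOS, comorbidity absent
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Diabetes counts from Table 1: 260 of 1,006 long LOS and 346 of 2,122 short LOS patients.
or_, lo, hi = crude_odds_ratio(260, 1006, 346, 2122)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

A crude estimate of this kind can differ from the adjusted odds ratio because comorbidities such as diabetes tend to occur together with older age and higher ASA class.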

The data were randomly partitioned into a training set of 2,190 patients and a testing set of 938 patients. The training set was used to develop and train a random forest machine learning model for predicting long LOS. The testing set was used to evaluate the performance of the random forest model and generate a ROC curve (Figure 1). The area under the ROC curve was 0.77.
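The cohort counts reported above can be checked with simple arithmetic. The sketch below is illustrative; the variable names are hypothetical, and rounding the 70% share to the nearest whole patient is an assumption about how the split sizes were obtained.

```python
identified = 12897          # elective TSA patients, 2011-2016
excluded_missing = 8397     # missing predictive features or outcomes
short_los, long_los, two_day = 2122, 1006, 1372

complete = identified - excluded_missing
assert complete == short_los + long_los + two_day   # all complete records accounted for

modeled = short_los + long_los                      # two-day stays are excluded
train_n = round(0.70 * modeled)                     # rounding rule is an assumption
test_n = modeled - train_n
print(complete, modeled, train_n, test_n)
```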

Figure 1.

Receiver operating characteristic curve for the random forest model.

Multivariable regression identified multiple predictive features associated with either short LOS or long LOS (Figure 2). The features associated with long LOS were age greater than or equal to 70 (p<0.001), diabetes (p<0.001), dyspnea (p=0.008), COPD (p<0.001), hypertension (p=0.008), ASA class greater than 2 (p<0.001), and CHF (p=0.008). The features associated with short LOS were male sex (p<0.001), white race (p<0.001), and surgery performed in a more recent year (p=0.001).

Figure 2.

Demographic variables and medical comorbidities associated with extended LOS after total shoulder arthroplasty. Odds ratios and 95% confidence intervals. COPD = chronic obstructive pulmonary disease, ASA = American Society of Anesthesiologists class, CHF = congestive heart failure, OR = odds ratio, LCL = lower confidence limit, UCL = upper confidence limit

Direct statistical comparisons of select variables between groups follow. The average age was 67.8 in short LOS patients and 72.8 in long LOS patients (p < 0.0001). The average BMI was 31.2 in short LOS patients and 31.7 in long LOS patients (p = 0.062). The average operative time was 109.3 minutes in short LOS patients and 123.0 minutes in long LOS patients (p < 0.0001). Readmission and reoperation data were available for 2,990 patients (66.44%). In that subset, the readmission rate was 2.39% and 5.51% (p < 0.0001) and the reoperation rate was 1.07% and 2.26% (p = 0.0175) for short and long LOS patients, respectively.

Discussion

An increasing number of total joint arthroplasties are being moved to the outpatient setting, especially with advances in multimodal pain management protocols and improved patient selection tools.13,14,15 With the demonstrated safety of outpatient TSA in select populations and the recent approval of outpatient total knee arthroplasty by Medicare, it is likely that over time a greater proportion of TSAs will be performed outside of the hospital. Prior studies have examined the safety of outpatient TSA in a narrow cohort of low-risk patients.5,16 Before outpatient TSA becomes more common, it will be important to give surgeons tools to effectively and consistently screen candidates from a broader population. This study provides an example of a novel machine learning tool to help with patient selection for outpatient TSA. The findings of this study also suggest that white, male patients under the age of 70 without comorbidities are the best candidates for outpatient TSA, whereas patients 70 or older with diabetes, hypertension, COPD, and ASA class greater than 2 seem to be poor candidates for outpatient TSA.

Given the potential financial savings associated with outpatient TSA and increased cost-consciousness around healthcare spending, it would be economically prudent to identify as many appropriate candidates for outpatient TSA as possible. This may be even more imperative with the introduction of bundled payments for surgical care. Gregory et al demonstrated that the average cost of an inpatient TSA in Texas from 2010–2015 was $76,109 compared to $22,907 in the outpatient setting, which was driven by costs related to nursing, medication, rehabilitation, and the procedure itself.1 However, despite the cost-saving possibilities of performing outpatient TSA, it is critical to prioritize patient safety and properly identify patients for whom the increased cost of inpatient TSA is justified.

While other studies have demonstrated the efficacy of algorithms for patient selection for outpatient TSA by case series, this is the first study to utilize machine learning in the selection process to assess a patient’s overall risk for TSA.2,5,15 The random forest model presented in this paper can make a risk determination for an individual patient based on their overall health profile, made up of their medical comorbidities and demographic factors. Previous studies have used machine learning to predict length of stay after total knee and total hip arthroplasty with c-statistics of 0.78 and 0.87, respectively (similar to our finding of 0.77); however, we are unaware of any machine learning tools currently used in clinical orthopaedic practice.10,11 Essentially, this tool has the potential to be integrated into the electronic medical record to provide a personalized assessment of a patient’s potential need for a longer stay in the hospital after TSA versus their ability to have an outpatient procedure with a very low likelihood of being readmitted or needing another operation. As these decision support tools become part of regular practice, however, they should not replace the clinical judgement of the surgeon, but rather supplement the informed consent process and contribute to shared decision making. Additionally, surgeon and facility factors should be considered before performing outpatient TSA, and could be added to individualized artificial intelligence models.

One advantage of this algorithm is the ability to adjust the threshold for prediction based on the user’s clinical priorities. This is unique to this study compared to previous work where the algorithms are fixed and have hard stops along their decision trees. When developing a screening tool it is necessary to balance selecting good candidates (sensitivity) and avoiding poor candidates (specificity). For example, at one particular threshold, the random forest model was able to correctly classify 44% of patients as possible candidates for outpatient TSA with an associated specificity of 90%. If an operator wanted to capture a larger pool of possible outpatient candidates, it is possible to adjust the threshold with the understanding that more patients who would likely benefit from a longer stay may be misclassified. This adds to the customizability of the tool and allows individual surgeons to set a threshold at their preferred level of risk.
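The threshold trade-off described above can be sketched as follows. This is a hypothetical illustration, not the study's model or data: the function names and toy scores are invented, a label of 1 is assumed to mean a patient who was in fact a short-stay (outpatient) candidate, and a higher score is assumed to mean a stronger prediction of short stay.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def most_sensitive_threshold(scores, labels, min_specificity):
    """Return (threshold, sensitivity, specificity) with the highest sensitivity
    among observed-score thresholds meeting the specificity floor, or None."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        if spec >= min_specificity and (best is None or sens > best[1]):
            best = (t, sens, spec)
    return best


# Toy predictions (hypothetical): 1 = true short-stay candidate, 0 = needed a long stay.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

strict = most_sensitive_threshold(scores, labels, min_specificity=1.0)
relaxed = most_sensitive_threshold(scores, labels, min_specificity=0.8)
print(strict, relaxed)
```

In this toy example, relaxing the specificity floor lowers the chosen threshold and captures more true candidates, at the cost of admitting some patients who needed a longer stay. A surgeon choosing a preferred level of risk faces the same trade-off.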

A drawback of a machine learning approach for clinical classification is the “black box” nature of the algorithm, where the model can accurately predict an individual patient’s need to stay in the hospital but cannot quantitatively describe how each factor contributes to the overall assessment. To address this concern, we performed a multivariable logistic regression to identify which patient characteristics are associated with a longer length of stay in this dataset. The medical comorbidities corresponding to longer LOS after TSA in this study are similar to what has been found in previous studies and include higher ASA classification, diabetes, hypertension, and COPD.17,18,19 Identifying these features is critical as surgeons may want to avoid patients with these comorbidities when selecting candidates for outpatient TSA; however, these factors must be viewed in the context of a patient’s overall risk assessment rather than using them for automatic exclusion. This study also found a significant correlation between certain demographic variables and increased LOS including age, female sex, and non-white race. It is important to consider that these demographic correlations may reflect existing biases in the healthcare system, particularly regarding race and sex, rather than a causal relationship.20,21

This study has several potential limitations. First, a methodologic limitation is the lack of external validation for the machine learning algorithm. While the training and testing components of the dataset were separated, there may be patterns or characteristics of the dataset itself that aided the algorithm in making proper classifications of short or long stay. In the future, this tool should be validated on a separate dataset using the same set of features. This work is further limited by the small percentage of patients undergoing outpatient TSA, as well as the inherent limitations of any large database such as the ACS-NSQIP (including coding errors, potential lack of relevant data points, and others). As more outpatient total shoulder arthroplasties are performed and these data become more widely available, the utility of the algorithm can be increased by training on actual outpatient cases rather than using a LOS of one day or less as a proxy for outpatient selection.

Conclusions

This study demonstrated the potential of a machine learning algorithm to aid in patient selection for outpatient total shoulder arthroplasty. The promising results suggest that this model can be used to predict which patients will have a LOS of one day or less after TSA, a proxy for outpatient candidacy, based on their medical comorbidities and demographic factors. This further suggests that a more refined model could be integrated into the electronic health record to assist surgeons in patient selection for outpatient TSA. The machine learning model can potentially incorporate new data as well, making the algorithm more accurate and therefore more useful. A prospective application of this model to outpatient TSA would further validate our findings.

Acknowledgments

No funds were received in support of this work.

Dr. Daniels is a paid consultant for Stryker, Spineart, EOS, Southern Spine and Orthofix, has received research support from Orthofix, and receives royalties from Springer.

Footnotes

None of the other authors have any conflicts of interest to report.

References

  • 1. Gregory JM, Wetzig AM, Wayne CD, Bailey L, Warth RJ: Quantification of patient-level costs in outpatient total shoulder arthroplasty. Journal of Shoulder and Elbow Surgery 2019;28:1066–1073. Level IV evidence.
  • 2. Leroux TS, Zuke WA, Saltzman BM, et al.: Safety and patient satisfaction of outpatient shoulder arthroplasty. JSES open access 2018;2:13–17. Level IV evidence.
  • 3. Basques B, Erickson B, Leroux T, et al.: Comparative outcomes of outpatient and inpatient total shoulder arthroplasty: an analysis of the Medicare dataset. The bone & joint journal 2017;99:934–938. Level III evidence.
  • 4. Brolin TJ, Cox RM, Zmistowski BM, Namdari S, Williams GR, Abboud JA: Surgeons’ experience and perceived barriers with outpatient shoulder arthroplasty. Journal of shoulder and elbow surgery 2018;27:S82–S87.
  • 5. Brolin TJ, Mulligan RP, Azar FM, Throckmorton TW: Neer Award 2016: outpatient total shoulder arthroplasty in an ambulatory surgery center is a safe alternative to inpatient total shoulder arthroplasty in a hospital: a matched cohort study. Journal of shoulder and elbow surgery 2017;26:204–208. Level III evidence.
  • 6. Leroux TS, Basques BA, Frank RM, et al.: Outpatient total shoulder arthroplasty: a population-based study comparing adverse event and readmission rates to inpatient total shoulder arthroplasty. Journal of shoulder and elbow surgery 2016;25:1780–1786. Level III evidence.
  • 7. Embi PJ, Kaufman SE, Payne PR: Biomedical informatics and outcomes research: enabling knowledge-driven health care. Circulation 2009;120:2393–2399.
  • 8. Cafri G, Li L, Paxton EW, Fan J: Predicting risk for adverse health events using random forest. Journal of Applied Statistics 2018;45:2279–2294. Level III evidence.
  • 9. Cai X, Perez-Concha O, Coiera E, et al.: Real-time prediction of mortality, readmission, and length of stay using electronic health record data. Journal of the American Medical Informatics Association 2015;23:553–561. Level III evidence.
  • 10. Navarro SM, Wang EY, Haeberle HS, et al.: Machine learning and primary total knee arthroplasty: patient forecasting for a patient-specific payment model. The Journal of arthroplasty 2018;33:3617–3623. Level III evidence.
  • 11. Ramkumar PN, Navarro SM, Haeberle HS, et al.: Development and Validation of a Machine Learning Algorithm After Primary Total Hip Arthroplasty: Applications to Length of Stay and Payment Models. The Journal of Arthroplasty 2019;34:632–637. Level III evidence.
  • 12. Aluthge DP, Sinha I, Stey P, Restrepo MI, Chen ES, Sarkar IN: PredictMD - Uniform interface for machine learning in Julia. Zenodo. 2019.
  • 13. Day JS, Lau E, Ong KL, Williams GR, Ramsey ML, Kurtz SM: Prevalence and projections of total shoulder and elbow arthroplasty in the United States to 2015. Journal of shoulder and elbow surgery 2010;19:1115–1120.
  • 14. Bean BA, Connor PM, Schiffern SC, Hamid N: Outpatient Shoulder Arthroplasty at an Ambulatory Surgery Center Using a Multimodal Pain Management Approach. JAAOS Global Research & Reviews 2018;2(10). Level IV evidence.
  • 15. Fournier MN, Brolin TJ, Azar FM, Stephens R, Throckmorton TW: Identifying appropriate candidates for ambulatory outpatient shoulder arthroplasty: validation of a patient selection algorithm. Journal of shoulder and elbow surgery 2019;28:65–70. Level IV evidence.
  • 16. Cancienne JM, Brockmeier SF, Gulotta LV, Dines DM, Werner BC: Ambulatory total shoulder arthroplasty: a comprehensive analysis of current trends, complications, readmissions, and costs. JBJS 2017;99:629–637. Level III evidence.
  • 17. Dunn JC, Lanzi J, Kusnezov N, Bader J, Waterman BR, Belmont PJ Jr: Predictors of length of stay after elective total shoulder arthroplasty in the United States. Journal of shoulder and elbow surgery 2015;24:754–759. Level III evidence.
  • 18. Menendez ME, Baker DK, Fryberger CT, Ponce BA: Predictors of extended length of stay after elective shoulder arthroplasty. Journal of shoulder and elbow surgery 2015;24:1527–1533. Level III evidence.
  • 19. Matsen III FA, Li N, Gao H, Yuan S, Russ SM, Sampson PD: Factors affecting length of stay, readmission, and revision after shoulder arthroplasty: a population-based study. JBJS 2015;97:1255–1263. Level III evidence.
  • 20. Char DS, Shah NH, Magnus D: Implementing machine learning in health care—addressing ethical challenges. The New England journal of medicine 2018;378:981.
  • 21. Chen JH, Asch SM: Machine learning and prediction in medicine—beyond the peak of inflated expectations. The New England journal of medicine 2017;376:2507.
