Published in final edited form as: Lupus. 2022 Jul 14;31(11):1296–1305. doi: 10.1177/09612033221114805

Exploration of Machine Learning Methods to Predict Systemic Lupus Erythematosus Hospitalizations

April M Jorge 1, Dylan Smith 2, Zhiyao Wu 2, Tashrif Chowdhury 2, Karen Costenbader 3, Yuqing Zhang 1, Hyon K Choi 1, Candace H Feldman 3, Yijun Zhao 2
PMCID: PMC9547899  NIHMSID: NIHMS1821574  PMID: 35835534

Abstract

Objectives:

Systemic lupus erythematosus (SLE) is a heterogeneous disease characterized by disease flares which can require hospitalization. Our objective was to apply machine learning methods to predict hospitalizations for SLE from electronic health record (EHR) data.

Methods:

We identified patients with SLE in a longitudinal EHR-based cohort with ≥ 2 outpatient rheumatology visits between 2012–2019. We applied multiple machine learning methods to predict hospitalizations with a primary diagnosis code for SLE, including decision tree, random forest, naive Bayes, logistic regression, and an ensemble method. Candidate predictors were derived from structured EHR features, including demographics, laboratory tests, medications, ICD-9/10 codes for SLE manifestations, and healthcare utilization. We used two approaches to assess these variables over longitudinal follow-up, including the incorporation of lagged features to capture changes over time of clinical data. The performance of each model was evaluated by overall accuracy, the F statistic, and the area under the receiver operator curve (AUC).

Results:

We identified 1,996 patients with SLE, of whom 4.6% were hospitalized for SLE in their most recent year of follow-up. Random forest models had the highest performance in predicting SLE hospitalizations, with AUCs of 0.751 and 0.772 for the averaging and progressive approaches, respectively. The leading predictors of SLE hospitalizations included dsDNA positivity, C3 level, blood cell counts, and inflammatory markers, as well as age and albumin.

Conclusion:

We have demonstrated that machine learning methods can predict SLE hospitalizations. We identified key predictors of these events including known markers of SLE disease activity; further validation in external cohorts is warranted.

Keywords: systemic lupus erythematosus, epidemiology, machine learning

INTRODUCTION

The clinical course of systemic lupus erythematosus (SLE) is heterogeneous and characterized by disease flares that can range from mild to life-threatening, affecting various organ systems.(1-3) Over time, patients can suffer irreversible organ damage, reduced health-related quality of life, and considerable economic costs.(4) Hospitalizations account for most of the direct costs of SLE care and are needed for the most severe SLE flares and complications.(5, 6) The clinical heterogeneity of SLE contributes to the difficulty in predicting the clinical course. Known predictors of SLE flares include dsDNA antibody positivity, low C3 and C4 values, elevated sedimentation rate, and prior SLE manifestations including renal and hematologic disease.(3) However, predictors of hospitalization for SLE are less well-established.

Data-driven methods can be applied to utilize electronic health record (EHR) data to predict clinical outcomes.(7) Recent studies have applied machine learning algorithms to predict important outcomes such as heart failure hospitalizations among patients with type 2 diabetes(8) and premature mortality in a general population context.(9) These methods have also been recently applied to predict disease activity among patients with chronic diseases including rheumatoid arthritis(10, 11) and multiple sclerosis.(12, 13) We sought to apply machine learning methods and leverage EHR data to predict hospitalizations for SLE and identify predictors of SLE hospitalizations.

MATERIALS AND METHODS

Data Source and Study Population

We utilized the longitudinal EHR-based Mass General Brigham (MGB) lupus cohort, which includes patients with SLE from two large academic medical centers and multiple community hospitals. Patients were identified by a previously validated EHR-based SLE phenotype algorithm (positive predictive value [PPV] 90%, area under the receiver operator curve [AUC] 0.922)(14) and had at least two visits with an MGB rheumatologist during the period of data accrual, January 2012 to December 2019. The SLE phenotype algorithm, previously described, incorporates counts of dsDNA and complement laboratory tests, antimalarial medication use, ICD codes for SLE and chronic renal failure, age, and the number of facts in the EHR to identify a prevalent cohort.(14)

Study Design

We utilized EHR data from this SLE cohort and applied multiple machine learning methods to predict SLE hospitalizations. Our dataset consisted of patient time-series records that were irregular and unevenly distributed, owing to the nature of medical records. As such, we utilized two approaches to transform these EHR data into predictive features, an averaging approach and a progressive approach (Supplemental Figure 1). With the averaging approach, we created models by taking the average or mode of each feature over the complete observation period. The advantage of this method is that it minimizes the amount of missing data and thus decreases the amount of data synthesis needed. The study sample included 1,996 patients (primary cohort) for this experiment. We next investigated a progressive approach by dividing the observation period of up to three years into six-month intervals. The six-month time step was chosen by weighing competing factors: the typical frequency of patient evaluation for SLE, the completeness of the resulting time-series data, and the effectiveness of the associated machine learning methods. Specifically, we collected the features for the progressive model by performing the same per-record aggregation as in the averaging approach, but on six-month intervals within a three-year window. Patients without a minimum of two available time points for all lagged variables were excluded, resulting in a secondary cohort of 925 patients. To account for differences in cohort size when comparing the averaging and progressive approaches, we also repeated the above averaging approach using the secondary cohort. We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.(15) This study was approved by the Mass General Brigham Institutional Review Board.
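As an illustration of the progressive aggregation described above, the following sketch (not the authors' released code; column names such as patient_id, visit_date, esr, albumin, and on_steroids are hypothetical) groups each patient's irregular records into six-month bins, averaging continuous labs and taking the mode of categorical flags:

```python
import pandas as pd

def aggregate_six_month_intervals(records: pd.DataFrame) -> pd.DataFrame:
    """Collapse irregular visit-level records into one row per patient per
    six-month interval, as in the progressive approach (a sketch)."""
    # visit_date is assumed to be a datetime64 column.
    grouped = records.groupby(
        ["patient_id", pd.Grouper(key="visit_date", freq="6MS")]
    )
    numeric = grouped[["esr", "albumin"]].mean()  # average continuous labs
    categorical = grouped["on_steroids"].agg(     # modal value of a binary flag
        lambda s: s.mode().iloc[0] if not s.mode().empty else pd.NA
    )
    return pd.concat([numeric, categorical], axis=1).reset_index()
```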

SLE Hospitalizations

The primary outcome of interest was hospitalization with a primary discharge diagnosis code of SLE (ICD-10 M32, excluding M32.0, or ICD-9 710.0). This outcome was assessed in the most recent calendar year of follow-up for each patient (Supplemental Figure 1). Patients with at least one SLE hospitalization during this period were labeled as “hospitalized for SLE” (i.e., class = 1); otherwise, patients were labeled as “not hospitalized for SLE” (i.e., class = 0).
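A minimal sketch of this outcome definition follows; the dataframe and column names (admissions, patient_id, primary_dx) are hypothetical, and the admissions table is assumed to be restricted to each patient's most recent year of follow-up:

```python
import pandas as pd

def is_sle_discharge_code(code: str) -> bool:
    """True for ICD-10 M32.x excluding M32.0 (drug-induced lupus) or ICD-9 710.0."""
    code = str(code).strip().upper()
    if code.startswith("M32"):
        return not code.startswith("M32.0")
    return code == "710.0"

def label_patients(admissions: pd.DataFrame) -> pd.Series:
    """Return class = 1 for patients with at least one qualifying SLE hospitalization."""
    sle_flag = admissions["primary_dx"].map(is_sle_discharge_code)
    return sle_flag.groupby(admissions["patient_id"]).any().astype(int)
```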

Algorithm Features

Candidate features were extracted from the Research Patient Data Registry, a centralized clinical data warehouse linked with the MGB EHR,(16) for the feature assessment period. These variables included demographics, socioeconomic factors, clinical manifestations, SLE medications, laboratory values, and healthcare utilization (Supplemental Table 1). Clinical manifestations (e.g., lupus nephritis) were assessed by relevant ICD-10 codes. Healthcare utilization included counts of clinical encounters with rheumatologists, dermatologists, nephrologists, and other providers. Categorical features in all groups were preprocessed using one-hot encoding, a technique in which an integer-encoded categorical variable is converted to a set of binary variables, each indicating a unique value of the category.(17) One-hot encoding eliminates the artificial ordering introduced by integer values, which a machine learning algorithm could otherwise exploit erroneously.
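For example, one-hot encoding an insurance payer variable with pandas might look like the following sketch (the column and values are illustrative, not the authors' exact features):

```python
import pandas as pd

df = pd.DataFrame({"insurance": ["Private", "Medicare", "Medicaid", "Private"]})

# Each payer category becomes its own binary indicator column, so no
# artificial ordering (e.g., Medicaid < Medicare < Private) is implied.
encoded = pd.get_dummies(df, columns=["insurance"], prefix="insurance")
print(encoded.columns.tolist())
# ['insurance_Medicaid', 'insurance_Medicare', 'insurance_Private']
```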

To capture changes in the disease course, we introduced lagged variables as additional features in the progressive models. Specifically, temporal features were created for all clinical manifestations, SLE medications, laboratory values, and healthcare utilization variables. These variables were structured as time series and lagged at six-month intervals to capture the difference in each variable's value between the current and previous interval.
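A sketch of constructing these lagged (Δ) features under the same hypothetical column names: within each patient's interval-level series, the lag is the difference between the current and previous six-month value.

```python
import pandas as pd

def add_lagged_features(intervals: pd.DataFrame, cols) -> pd.DataFrame:
    """Add delta_<col> columns: current interval value minus previous interval value."""
    intervals = intervals.sort_values(["patient_id", "visit_date"])
    for col in cols:
        intervals[f"delta_{col}"] = intervals.groupby("patient_id")[col].diff()
    return intervals

# e.g., add_lagged_features(interval_data, ["esr", "albumin", "c3"])
```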

Missing Value Imputation

We employed two types of missing value estimation methods, regressive imputation and longitudinal interpolation, for the averaging and progressive approaches, respectively. Using regressive imputation, a model was trained to predict the observed values of a variable (V) based on other variables in the dataset, and the model was then used to impute the missing values of V. For categorical variables, we used a k nearest neighbor model; for continuous variables, we used a linear regression model. In our progressive approach, each patient’s record forms a longitudinal time series with six observation points. In this case, we performed missing variable estimation for a given variable using linear interpolation/extrapolation fitted to its observed data points.(12)
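The two strategies might be sketched as follows (a simplification, not the authors' code; it assumes the predictor columns used for regressive imputation are themselves complete, and variable names are hypothetical):

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

def regressive_impute(df: pd.DataFrame, target: str, predictors, categorical=False):
    """Fit a model on rows where `target` is observed; predict it where missing."""
    observed = df[target].notna()
    model = KNeighborsClassifier() if categorical else LinearRegression()
    model.fit(df.loc[observed, predictors], df.loc[observed, target])
    filled = df[target].copy()
    filled.loc[~observed] = model.predict(df.loc[~observed, predictors])
    return filled

def longitudinal_impute(values: np.ndarray) -> np.ndarray:
    """Linear interpolation/extrapolation over one patient's six observation points."""
    t = np.arange(len(values))
    observed = ~np.isnan(values)
    if observed.sum() < 2:
        return values  # not enough points to fit a line
    line = interp1d(t[observed], values[observed], kind="linear",
                    fill_value="extrapolate")
    return np.where(observed, values, line(t))
```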

Algorithm Development Using Machine Learning Methods

We selected five machine learning algorithms as our baseline learners: Decision Tree [DT], Logistic Regression [LR], Random Forest [RF], Naïve Bayes [NB], and Neural Network [NN]. In addition, we employed an ensemble learning method that takes a weighted majority vote of the predictions from the above models, with the weights given by the AUC scores of the classifiers. Our models were developed using the Python programming language and the pandas, sklearn, and imblearn packages.
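A sketch of such an AUC-weighted majority vote is shown below (under assumptions; the authors' exact implementation is not reproduced here). Base-learner AUCs are estimated on held-out validation data and used to weight each model's class votes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

base_learners = {
    "DT": DecisionTreeClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(),
    "NB": GaussianNB(),
    "NN": MLPClassifier(max_iter=1000),
}

def auc_weighted_vote(X_train, y_train, X_val, y_val, X_test):
    """Weighted majority vote over binary predictions; weights = validation AUCs."""
    weights, votes = [], []
    for model in base_learners.values():
        model.fit(X_train, y_train)
        weights.append(roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
        votes.append(model.predict(X_test))
    weights, votes = np.asarray(weights), np.asarray(votes)
    score = (weights[:, None] * votes).sum(axis=0) / weights.sum()
    return (score >= 0.5).astype(int)
```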

Model Building and Evaluation Framework

The architecture of our experimental framework for evaluating each of the models is illustrated in Figure 1. All experiments were conducted using 10-fold (outer) cross-validation. We divided the data into 10 disjoint partitions (i.e., folds) and trained/evaluated each classifier 10 times with different training and test data. Specifically, at each iteration i (i = 1, 2, ..., 10), fold i was designated as the test data and the remaining nine folds as the training data. We report the average performance over the 10 test folds.
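A minimal sketch of this outer loop follows (stratified splitting is an assumption here, not stated by the authors; X and y are numpy arrays):

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def outer_cv_auc(model, X, y, n_splits=10, seed=0):
    """Average test-fold AUC over a 10-fold outer cross-validation."""
    aucs = []
    for train_idx, test_idx in StratifiedKFold(
        n_splits=n_splits, shuffle=True, random_state=seed
    ).split(X, y):
        model.fit(X[train_idx], y[train_idx])
        probs = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], probs))
    return float(np.mean(aucs))
```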

Figure 1.

Model architecture

A challenge for our task is that correctly classifying high-risk patients as class 1 is more important than correctly classifying low-risk patients as class 0, because a high-risk classification engenders closer monitoring. However, the lupus hospitalization rate was 4.6%, indicating a significant class imbalance in the data. The breadth of our data describing the high-risk cases is therefore more limited than for the low-risk ones, and applying standard machine learning algorithms to the data would lead to unsatisfactory performance on the minority class. We addressed this using bootstrap aggregating (i.e., bagging) with random under-sampling.(18) Specifically, we generated 100 “bags” of balanced datasets, where each “bag” contained all minority instances and an equal number of majority instances sampled with replacement from the entire majority population. For each of our machine learning models, 100 sub-models were trained using these balanced “bags” of data. Inner nine-fold cross-validation was further applied to select the optimal hyperparameters for each sub-model. Grid search(19) was used to select the parameter combination yielding the highest validation AUC.(20) In particular, for the tree-based models (i.e., DT, RF), we optimized min_samples_split, min_samples_leaf, and max_depth, with search values ranging from 2 to 8. For the NN model, we optimized the number of neurons in each hidden layer, with search values ranging from 50 to 256. For logistic regression, we searched for the optimal C (i.e., inverse regularization strength) in a range of 1 to 10. The remaining hyperparameters adopted the default values provided by the scikit-learn library. The optimal parameter combination was then used to perform a final training on the complete bag of data. The final model predictions and predicted probabilities were obtained by aggregating the results of all sub-models using a majority vote.
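The sketch below illustrates this scheme for one base learner (a random forest) under stated assumptions: 100 balanced bags, each holding every minority instance plus an equal-sized with-replacement sample of majority instances, a grid-searched sub-model per bag, and a majority vote across sub-models. It is a simplification of the pipeline described above, not the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
param_grid = {  # search ranges mirroring the text for tree-based models
    "max_depth": list(range(2, 9)),
    "min_samples_split": list(range(2, 9)),
    "min_samples_leaf": list(range(2, 9)),
}

def fit_balanced_bags(X, y, n_bags=100):
    """Train one grid-searched sub-model per balanced bag (X, y are numpy arrays)."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    sub_models = []
    for _ in range(n_bags):
        sampled = rng.choice(majority, size=len(minority), replace=True)
        idx = np.concatenate([minority, sampled])
        search = GridSearchCV(RandomForestClassifier(), param_grid,
                              scoring="roc_auc", cv=9)  # inner nine-fold CV
        search.fit(X[idx], y[idx])
        sub_models.append(search.best_estimator_)
    return sub_models

def predict_majority(sub_models, X):
    """Aggregate sub-model predictions by majority vote."""
    votes = np.stack([m.predict(X) for m in sub_models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```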

Evaluation at Different Thresholds

Treating the imbalanced training data prevents degenerate models in which the predictions are biased towards the majority class. However, in different circumstances, practitioners may have different priorities, and thus different willingness to accept lower accuracy in one class in exchange for higher accuracy in the other. We traded off the accuracies between the two classes by setting a threshold for classifying an instance as belonging to the high-risk class. Consequently, a lower threshold leads to higher accuracy in the high-risk class at the cost of lower accuracy in the low-risk class. We assessed a threshold favoring class 1 (0.45) in addition to a neutral threshold (0.50) and one favoring class 0 (0.55). Practitioners could choose a threshold based on their tolerance for reduced accuracy in class 0.
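Operationally, the threshold is applied to each model's aggregated predicted probability of class 1; a brief sketch:

```python
import numpy as np

def classify_at_threshold(probs: np.ndarray, threshold: float = 0.50) -> np.ndarray:
    """probs = predicted probability of class 1 (hospitalized for SLE)."""
    return (probs >= threshold).astype(int)

# classify_at_threshold(probs, 0.45) flags more patients as high risk (class 1),
# raising sensitivity at the cost of more false positives in class 0.
```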

Identification of Predictors

We identified the top 10 features associated with SLE hospitalization for three algorithms (DT, RF, and LR) and the ensemble model, separately for the averaging and progressive approaches. For LR, the importance of each feature was determined by the magnitude of its coefficient. For the DT and RF models, the ranking followed the order in which the algorithms used features to split branches, selecting at each node the feature that produced the most homogeneous (i.e., purest) sub-branches per the Gini index.(21) For the ensemble algorithm, attributes were ranked according to their average rank scores across all learners. We generated SHAP (Shapley additive explanations) plots to display the direction of impact of these top features.
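A sketch of this ranking and of the SHAP step, assuming fitted models dt, rf, and lr and a feature matrix X with named columns (the object names are illustrative, not the authors' code):

```python
import numpy as np
import pandas as pd
import shap

def rank_features(model, feature_names) -> pd.Series:
    """Rank features: |coefficients| for LR, Gini importances for DT/RF."""
    if hasattr(model, "coef_"):
        scores = np.abs(model.coef_).ravel()
    else:
        scores = model.feature_importances_
    return pd.Series(scores, index=feature_names).rank(ascending=False)

# ranks = pd.concat([rank_features(m, X.columns) for m in (dt, rf, lr)], axis=1)
# ensemble_top10 = ranks.mean(axis=1).sort_values().head(10)  # average rank across learners

# Direction of impact of the top features (random forest example):
# shap.summary_plot(shap.TreeExplainer(rf).shap_values(X), X)
```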

Statistical Analysis

Patient characteristics of the study population were assessed during the observation period and were presented as means ± standard deviations (SD) for continuous variables and frequencies with percentages (%) for categorical variables.

The performance of each model was evaluated by accuracy, the F statistic, and the AUC.

RESULTS

A total of 1,996 patients with SLE were included in the primary cohort. The majority (91%) were female, with mean age 52 years, and 67% were white (Table 1). 1,402 (70%) had a first encounter for SLE prior to 2012, and 1,971 (99%) had a first encounter for SLE prior to 2016. 20% had lupus nephritis by ICD-10 code, and 33% had a positive dsDNA antibody. Nearly all patients used hydroxychloroquine, and 69.2% used oral corticosteroids at some point during the study period. The 925 patients included in the secondary cohort had similar demographics but a higher volume of clinical encounters, a higher proportion with lupus nephritis (27.5%) and greater use of corticosteroids (78.9%) (Supplemental Table 2). The average feature assessment period for the primary cohort used in the averaging approach was 5.1 years. The progressive approach utilized a feature assessment period of 3.0 years in the secondary cohort. Overall, 4.6% of patients were hospitalized for SLE in the last year of follow-up in the primary cohort and 4.8% in the secondary cohort.

Table 1.

Characteristics of Overall Study Cohort

Characteristics Overall Hospitalized for SLE Not Hospitalized for SLE
Number of patients 1996 92 (4.6%) 1,904 (95.4%)
Age, mean (SD) 52.3 (15.2) 46.6 (14.8) 52.5 (15.1)
Female, n (%) 1818 (91.1%) 86 (93.5%) 1732 (91.0%)
Race/Ethnicity, n (%)
 White 1343 (67.3%) 47 (51.1%) 1296 (68.1%)
 Black 267 (13.4%) 27 (29.3%) 240 (12.6%)
 Asian 101 (5.1%) 3 (3.3%) 98 (5.1%)
 Hispanic 131 (6.6%) 6 (6.5%) 125 (6.6%)
Insurance Payer, n (%)
 Private 1556 (78.0%) 68 (73.9%) 1448 (78.2%)
 Medicare 334 (16.7%) 16 (17.4%) 318 (16.7%)
 Medicaid 93 (4.7%) 6 (6.5%) 87 (4.6%)
 Other 8 (0.4%) 2 (2.2%) 6 (0.3%)
Medication use, n (%)
 Hydroxychloroquine 1960 (98.2%) 89 (96.7%) 1871 (98.3%)
 Oral corticosteroids 1381 (69.2%) 77 (83.7%) 1304 (68.5%)
 Oral immunosuppressant 144 (7.2%) 6 (6.5%) 138 (7.2%)
SLE manifestations, n (%)
 Acute lupus rash 702 (35.2%) 54 (58.7%) 648 (34.0%)
 Lupus nephritis 396 (19.8%) 34 (37.0%) 362 (19.0%)
SLE serologies, % positive
 dsDNA ab 33 42 32
 SS-A ab 36 52 35
 SS-B ab 15 18 15
 Smith ab 13 22 13
 RNP ab 27 41 26
 B2 glycoprotein 1 ab, IgG 2 4 2
 B2 glycoprotein 1 ab, IgM 1 0 1
Healthcare Utilization, mean (SD)
 Rheumatology Visits 23.0 (26.7) 28.6 (35.4) 22.7 (26.1)
 Nephrology Visits 3.0 (12.0) 7.0 (16.2) 2.8 (11.8)
 Dermatology Visits 6.4 (14.9) 8.0 (17.6) 6.3 (14.8)
 Other Specialist Visits 137.2 (158.7) 229.2 (290.9) 132.8 (148.0)

The AUCs for the five individual machine learning models and the ensemble model are shown in Figure 2A-C for the averaging and progressive approaches. RF models had the highest AUCs in the averaging and progressive approaches (0.751 and 0.772, respectively), followed by the DT and ensemble methods. The AUCs for these top-performing models were higher with the progressive approach than with the averaging approach applied to the primary cohort, and this performance gain was larger when comparing the progressive approach with the averaging approach in the secondary cohort (AUC 0.737). Precision-recall curves are shown in Supplemental Figure 2.

Figure 2.

Receiver operator curves for predicting hospitalization for SLE flare. AUC, area under the curve.

A. Averaging approach models, primary cohort (n=1,996)

B. Progressive approach models, secondary cohort (n=925)

C. Averaging approach models, secondary cohort (n=925)

Across various thresholds, the true positive rate was 71% for neutral threshold 0.50 and 79% for the 0.45 threshold for the RF model, with a tradeoff in true negative rate of 69% for the 0.50 threshold and 61% for the 0.45 threshold (Table 2). As expected, given the imbalanced data, the overall model accuracy was higher with the 0.55 threshold (75% for the RF model). For the progressive approach models, the true positive rate was 86% for RF with a 0.45 threshold. The top progressive approach models had similar accuracy as the averaging approach models using the primary cohort, but the progressive approach outperformed the averaging approach models in the secondary cohort (Supplemental Table 3).

Table 2.

Performance Characteristics of Machine Learning Models in Predicting Hospitalization for Systemic Lupus Erythematosus, Averaging and Progressive Approaches

Threshold Model True Pos. Rate True Neg. Rate Overall Accuracy F1 Class 1 F1 Class 0

Averaging Approach
0.45 Decision Tree 0.79 0.59 0.60 0.15 0.74
0.45 Random Forest 0.79 0.61 0.62 0.16 0.75
0.45 Logistic Regression 0.67 0.62 0.62 0.14 0.75
0.45 Naïve Bayes 0.61 0.75 0.74 0.18 0.85
0.45 Neural Network 0.67 0.63 0.63 0.14 0.76
0.45 Ensemble 0.67 0.69 0.69 0.17 0.81

0.50 Decision Tree 0.71 0.65 0.65 0.16 0.78
0.50 Random Forest 0.71 0.69 0.69 0.17 0.81
0.50 Logistic Regression 0.61 0.69 0.68 0.15 0.81
0.50 Naïve Bayes 0.59 0.76 0.75 0.18 0.85
0.50 Neural Network 0.62 0.66 0.66 0.14 0.79
0.50 Ensemble 0.60 0.74 0.73 0.17 0.84

0.55 Decision Tree 0.63 0.71 0.70 0.16 0.82
0.55 Random Forest 0.62 0.76 0.75 0.18 0.85
0.55 Logistic Regression 0.55 0.74 0.73 0.16 0.84
0.55 Naïve Bayes 0.58 0.77 0.76 0.18 0.86
0.55 Neural Network 0.58 0.69 0.69 0.15 0.81
0.55 Ensemble 0.52 0.78 0.77 0.17 0.86

Progressive Approach
0.45 Decision Tree 0.79 0.62 0.63 0.17 0.76
0.45 Random Forest 0.86 0.57 0.58 0.16 0.72
0.45 Logistic Regression 0.67 0.66 0.66 0.15 0.79
0.45 Naïve Bayes 0.56 0.71 0.70 0.15 0.82
0.45 Neural Network 0.61 0.66 0.65 0.14 0.78
0.45 Ensemble 0.63 0.72 0.72 0.17 0.83

0.50 Decision Tree 0.73 0.69 0.69 0.18 0.81
0.50 Random Forest 0.70 0.69 0.69 0.17 0.81
0.50 Logistic Regression 0.63 0.69 0.69 0.16 0.81
0.50 Naïve Bayes 0.49 0.72 0.71 0.13 0.83
0.50 Neural Network 0.58 0.69 0.68 0.14 0.80
0.50 Ensemble 0.51 0.79 0.77 0.17 0.87

0.55 Decision Tree 0.54 0.76 0.75 0.17 0.85
0.55 Random Forest 0.53 0.81 0.80 0.20 0.89
0.55 Logistic Regression 0.59 0.72 0.71 0.16 0.83
0.55 Naïve Bayes 0.47 0.74 0.73 0.13 0.84
0.55 Neural Network 0.52 0.73 0.72 0.14 0.83
0.55 Ensemble 0.41 0.84 0.82 0.16 0.90

Predictors of SLE Hospitalization

Laboratory values including low albumin, elevated inflammatory markers (erythrocyte sedimentation rate [ESR] and C-reactive protein [CRP]), low hemoglobin/hematocrit, and elevated white blood cell count and platelet count were top predictors in at least two of the three top-performing algorithms, as was age (Table 3, Supplemental Figure 3). Positive dsDNA antibody was a top predictor in the DT and ensemble models. The volume of clinical encounters outside of the specialties that typically care for SLE (i.e., rheumatology, nephrology, and dermatology) was also among the top ten predictive features in all models.

Table 3.

Comparison of Top Ten Predictive Features of Machine Learning Models, Averaging Approach

Rank | Decision Tree | Random Forest | Logistic Regression | Ensemble*
1 | Albumin | Albumin | Age | ESR
2 | ESR | HGB/HCT | Other visits | HGB/HCT
3 | dsDNA Ab (binary) | ESR | AST | Age
4 | HGB/HCT | Age | Creatinine | Other visits
5 | Age | Other visits | HGB/HCT | SS-A Ab
6 | Other visits | CRP | ESR | AST
7 | SS-A Ab | WBC | Venous thrombosis | Creatinine
8 | PLT | PLT | SS-A Ab | dsDNA Ab (binary)
9 | RNP Ab | SS-A Ab | Nephrology visits | Venous thrombosis
10 | WBC | RNP Ab | CRP | Albumin

HGB, hemoglobin; HCT, hematocrit; ESR, erythrocyte sedimentation rate; CRP, c-reactive protein; SS-A, Sjogren’s Syndrome A antibody; AST, aspartate transaminase; WBC, white blood cell count; RNP, anti-ribonucleoprotein antibody

Other visits include counts of encounters with providers other than rheumatology, nephrology, and dermatology.

* Ensemble estimated ranking computed based on the average of the rankings of the DT, RF, and LR models.

For the progressive approach, several of the leading predictive features were the same as in the averaging models, including albumin, CRP, and age (Table 4). In addition, several lagged variable features were among the top 10 predictors for these models. Specifically, the changes over time in C3 complement level, platelet count, white blood cell count, and CRP level were all top predictors of SLE hospitalization. Additionally, oral corticosteroid use was a top predictor in the LR model. ESR was excluded from the progressive models due to missing data in 46% of the secondary cohort.

Table 4.

Comparison of Top Ten Predictive Features of Machine Learning Models, Progressive Approach

Rank | Decision Tree | Random Forest | Logistic Regression | Ensemble*
1 | Albumin | Albumin | Age | Albumin
2 | Δ CRP | CRP | Oral corticosteroids | Age
3 | Δ PLT | Δ CRP | Δ WBC | Δ PLT
4 | CRP | HGB/HCT | Albumin | Δ WBC
5 | Age | Age | dsDNA Ab (binary) | CRP
6 | Δ C3 | Δ PLT | Δ Acute lupus rash | Δ C3
7 | HGB/HCT | Δ Creatinine | Δ C3 | Δ CRP
8 | Δ WBC | Δ WBC | Δ Dermatology visits | Δ HGB/HCT
9 | Δ Albumin | Δ HGB/HCT | Δ PLT | Δ Albumin
10 | Δ Creatinine | Δ C3 | PLT | PLT

CRP, c-reactive protein; PLT, platelet count; WBC, white blood cell count; HGB, hemoglobin; HCT, hematocrit

* Ensemble estimated ranking computed based on the average of the rankings of the DT, RF, and LR models.

Lagged variables are indicated by the Δ symbol.

DISCUSSION

Using multiple machine learning methods, we leveraged real-world EHR data to predict future SLE hospitalizations in a large, longitudinal EHR-based SLE cohort. We explored two approaches to ascertain predictive features from the longitudinal but irregular time series of clinical data captured during usual care. In an averaging approach, we gathered predictive features over up to three years of clinical observation to predict an SLE hospitalization in the following year, and in a progressive approach we introduced lagged variables to capture the change over time of clinical features. Using both approaches, we identified RF as the best predictive model for this task. We identified known biomarkers of SLE disease activity, including dsDNA antibody positivity and low C3 level, elevated inflammatory markers, and changes over time in blood counts, among the strongest predictors of SLE hospitalizations.

Our experimental results suggest that, with the same (secondary) dataset, the progressive approach is preferred over the averaging approach because the former demonstrates performance gains over the latter across all thresholds. However, since the progressive approach demands a higher quality of the underlying data that may be impractical to collect, the averaging approach could be an effective alternative. The advantage of the progressive approach is to better capture the changes in the disease course over time. The disadvantage is a higher demand on the quality of the data, as all subjects must have recurring clinical follow-ups. As a result, our overall cohort was reduced to 925 patients (secondary cohort) after excluding those without regular six-month visits. Furthermore, since missing variable imputation was performed in shortened time intervals, our secondary dataset lost the feature ESR, which was a predictive feature identified by the averaging approach in the primary dataset.

Our models incorporated a range of clinical information readily available from the EHR, including demographics, insurance status, ICD-coded disease manifestations, laboratory values, and medications. Multiple laboratory features were included among the top predictors. This likely relates to the relative availability of relevant laboratory predictors captured as structured EHR data. We captured relevant disease manifestations through specific ICD diagnosis codes, such as acute lupus rashes and nephritis. However, a limitation of using structured EHR data is that many SLE manifestations do not have an associated diagnosis code and/or the diagnosis may be missed when relying on ICD diagnosis codes alone.(22) Therefore, the paucity of non-hematologic SLE manifestations among the top predictors identified by our algorithms may reflect the incomplete capture of relevant disease manifestations in the EHR. Future work will focus on incorporating additional features into predictive models through natural language processing of clinical information that is not captured in structured EHR data.

The top predictors of SLE hospitalizations identified in our study fit with prior understanding of SLE flares and of hospitalizations in general, which grants face validity to our findings. Interestingly, dsDNA and C3 are known predictors of SLE flares, and both were found to be predictive features of SLE hospitalizations in multiple machine learning models.(3) Additionally, the inflammatory markers ESR and CRP were both important predictors in the averaging approach, with the change over time in CRP value an additional top predictor in the progressive approach. Changes in white blood cell counts, hemoglobin/hematocrit values, and platelet counts were each important predictive features. This is most likely because hematologic abnormalities reflect the inflammatory response and can also capture changes in SLE disease activity, as discussed above. Glucocorticoid use was a predictor of SLE hospitalizations in the progressive LR model only, and other SLE medications were not top predictors of hospitalizations in this study. Our findings are somewhat different from those of a prior EHR-based US study of SLE hospitalizations at Ohio State University.(23) That study found leukopenia, anemia, thrombocytopenia, elevated creatinine level, and glucocorticoid use at the initial rheumatology visit to predict ever being hospitalized for SLE. Our study used a different approach to incorporate predictors of SLE hospitalizations over the longitudinal clinical course, which likely explains why we additionally identified variables that are known to change over time with SLE disease activity (e.g., C3, dsDNA, and inflammatory markers) as predictors of SLE hospitalizations.

This study has several limitations. We used observational EHR data that were collected as part of clinical care and not for primary research purposes. EHR data contain variable quantities of facts across individuals, captured at irregular intervals. Further, this variation is not randomly distributed, as sicker patients may have a higher volume of facts than healthier patients. We used data reduction techniques through our averaging and progressive approaches to attempt to reduce bias due to variation in available data. Another limitation is that we captured SLE hospitalizations that occurred within our healthcare system, so we may have missed outcomes and therefore misclassified patients who were hospitalized outside our system. However, we utilized a large, EHR-based SLE cohort with a validated case definition to reduce potential bias from misclassification of SLE, which is otherwise a concern in EHR- or claims-based studies of SLE.

In conclusion, our findings suggest that machine learning approaches can be effective in predicting SLE hospitalizations using EHR data. Our models identified predictors of SLE hospitalizations including known biomarkers for SLE flares, markers of illness, and healthcare utilization. Further studies are needed to externally validate these algorithms in other health networks and in diverse settings. If successfully validated in a larger setting, machine learning models could be utilized in health systems to identify patients at high risk of SLE hospitalizations, which could guide the use of resources to proactively support them. The methods used to capture the change over time in variables to predict SLE hospitalizations may also be informative for other clinical conditions and health outcomes.

Supplementary Material


Acknowledgements:

This work was presented as an abstract at the ACR Convergence 2020.

Financial Support:

AJ is supported by the Rheumatology Research Foundation and the National Institutes of Health [K23-AR-079-040]. KC is supported by the National Institutes of Health [K24-AR-066-109]. HKC is supported by the National Institutes of Health [P50-AR-060-772].

Footnotes

Conflicts of interest: The authors have no conflicts of interest to disclose.

Data availability:

The data underlying this article cannot be shared publicly for the protection of patient privacy and per a data use agreement. The data will be shared on reasonable request to the corresponding author and pending approval of Mass General Brigham. To facilitate reproducibility and further methods development, we have provided code associated with this study and model training/test instructions under a Github repository: https://github.com/dsmith167/Lupus.

References:

1. Jorge AM, Lu N, Zhang Y, Rai SK, Choi HK. Unchanging premature mortality trends in systemic lupus erythematosus: a general population-based study (1999–2014). Rheumatology (Oxford) 2017.
2. Yen EY, Singh RR. Lupus - An Unrecognized Leading Cause of Death in Young Women: Population-based Study Using Nationwide Death Certificates, 2000–2015. Arthritis Rheumatol 2018.
3. Petri MA, van Vollenhoven RF, Buyon J, Levy RA, Navarra SV, Cervera R, et al. Baseline predictors of systemic lupus erythematosus flares: data from the combined placebo groups in the phase III belimumab trials. Arthritis Rheum 2013;65(8):2143–53.
4. Carter EE, Barr SG, Clarke AE. The global burden of SLE: prevalence, health disparities and socioeconomic impact. Nat Rev Rheumatol 2016;12(10):605–20.
5. Gu K, Gladman DD, Su J, Urowitz MB. Hospitalizations in Patients with Systemic Lupus Erythematosus in an Academic Health Science Center. J Rheumatol 2017;44(8):1173–8.
6. Lee J, Peschken CA, Muangchan C, Silverman E, Pineau C, Smith CD, et al. The frequency of and associations with hospitalization secondary to lupus flares from the 1000 Faces of Lupus Canadian cohort. Lupus 2013;22(13):1341–8.
7. Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. N Engl J Med 2019;380(14):1347–58.
8. Segar MW, Vaduganathan M, Patel KV, McGuire DK, Butler J, Fonarow GC, et al. Machine Learning to Predict the Risk of Incident Heart Failure Hospitalization Among Patients With Diabetes: The WATCH-DM Risk Score. Diabetes Care 2019;42(12):2298–306.
9. Weng SF, Vaz L, Qureshi N, Kai J. Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches. PLoS One 2019;14(3):e0214365.
10. Lin C, Karlson EW, Canhao H, Miller TA, Dligach D, Chen PJ, et al. Automatic prediction of rheumatoid arthritis disease activity from the electronic medical records. PLoS One 2013;8(8):e69932.
11. Norgeot B, Glicksberg BS, Trupin L, Lituiev D, Gianfrancesco M, Oskotsky B, et al. Assessment of a Deep Learning Model Based on Electronic Health Record Data to Forecast Clinical Outcomes in Patients With Rheumatoid Arthritis. JAMA Netw Open 2019;2(3):e190606.
12. Zhao Y, Healy BC, Rotstein D, Guttmann CR, Bakshi R, Weiner HL, et al. Exploration of machine learning techniques in predicting multiple sclerosis disease course. PLoS One 2017;12(4):e0174866.
13. Zhao Y, Wang T, Bove R, Cree B, Henry R, Lokhande H, et al. Ensemble learning predicts multiple sclerosis disease course in the SUMMIT study. NPJ Digit Med 2020;3:135.
14. Jorge A, Castro VM, Barnado A, Gainer V, Hong C, Cai T, et al. Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum 2019.
15. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ 2015;350:g7594.
16. Research Patient Data Registry: RPDR Data Query Tool. Partners Healthcare. Available from: https://rc.partners.org/research-apps-and-services/identify-subjects-request-data#rpdr-daily-query-tool.
17. Murphy KP. Machine Learning: A Probabilistic Perspective. MIT Press; 2012 (page 35).
18. Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F. A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 2011;42(4):463–84.
19. Claesen M, De Moor B. Hyperparameter search in machine learning. arXiv preprint arXiv:1502.02127 2015.
20. Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters 2006;27(8):861–74.
21. Quinlan JR. Induction of decision trees. Machine Learning 1986;1:81–106.
22. Ramsey-Goldman R, Walanus T, Jackson K, Chung A, Erickson D, Mancera-Cuevas K, et al. 457 Algorithms to identify systemic lupus erythematosus (SLE) from electronic health record (EHR) data. Lupus Science & Medicine 2017;4(Suppl 1):A220.
23. Li D, Madhoun HM, Roberts WN Jr, Jarjour W. Determining risk factors that increase hospitalizations in patients with systemic lupus erythematosus. Lupus 2018;27(8):1321–8.
