PLOS ONE
. 2021 Sep 20;16(9):e0257520. doi: 10.1371/journal.pone.0257520

Replicating prediction algorithms for hospitalization and corticosteroid use in patients with inflammatory bowel disease

Ryan W Gan 1, Diana Sun 1, Amanda R Tatro 2, Shirley Cohen-Mekelburg 3,4, Wyndy L Wiitala 4, Ji Zhu 5, Akbar K Waljee 3,4,*
Editor: Hsu-Heng Yen6
PMCID: PMC8452029  PMID: 34543353

Abstract

Introduction

Previous work showed that machine learning models can predict inflammatory bowel disease (IBD)-related hospitalizations and outpatient corticosteroid use based on patient demographic and laboratory data in a cohort of United States Veterans. This study aimed to replicate this modeling framework in a nationally representative cohort.

Methods

A retrospective cohort design using the Optum Electronic Health Records (EHR) database was used to identify IBD patients with at least 12 months of follow-up between 2007 and 2018. IBD flare was defined as an inpatient/emergency visit with a diagnosis of IBD or an outpatient corticosteroid prescription for IBD. Predictors included demographic and laboratory data. Logistic regression and random forest (RF) models were used to predict IBD flare within 6 months of each visit. A 70% training and 30% validation approach was used.

Results

A total of 95,878 patients across 780,559 visits were identified. Of these, 22,245 (23.2%) patients had at least one IBD flare. Patients were predominantly White (87.7%) and female (57.1%), with a mean age of 48.0 years. The logistic regression model had an area under the receiver operating curve (AuROC) of 0.66 (95% CI: 0.65−0.66), sensitivity of 0.69 (95% CI: 0.68−0.70), and specificity of 0.74 (95% CI: 0.73−0.74) in the validation cohort. The RF model had an AuROC of 0.80 (95% CI: 0.80−0.81), sensitivity of 0.74 (95% CI: 0.73−0.74), and specificity of 0.72 (95% CI: 0.72−0.72) in the validation cohort. Important predictors of IBD flare in the RF model were the number of previous flares, age, potassium, and white blood cell count.

Conclusion

The machine learning modeling framework was replicated with similar predictive accuracy in a nationally representative cohort of IBD patients. This modeling framework could be embedded in routine practice as a tool to identify patients at high risk of disease activity.

Introduction

Inflammatory bowel disease (IBD), encompassing ulcerative colitis (UC) and Crohn’s disease (CD), is a chronic, relapsing, and remitting gastrointestinal disease affecting 1.5 million people in the United States [1, 2]. Although patients may experience periods of remission, both UC and CD can flare up unpredictably, resulting in substantial morbidity, loss of productivity and high medical costs.

Waljee et al. [3] developed a novel machine learning model for predicting hospitalization and corticosteroid use as a surrogate for IBD flares using data from the U.S. Veterans Health Administration (VHA). The primary results from this study found the random forest (RF) model outperformed the logistic regression model’s ability to predict IBD flares ([RF]: 79.7% sensitivity and 80% specificity; logistic regression: 64% sensitivity and 64% specificity) [3].

However, the study was performed using VHA data in a predominantly male population, limiting its generalizability. To implement machine learning models in practice, replication of model-based predictions in more representative cohorts is necessary. Furthermore, recent advances in machine learning can offer transparency into the relationship between model output and input variables [4], allowing for informed judgment. Ensuring reproducibility and interpretability of machine learning models can increase understanding and usage of these algorithms for clinical decision support.

The objective of this study was two-fold: (1) to replicate the models developed by Waljee et al. [3] to predict IBD flares in a community-based cohort; and (2) to understand which demographic and laboratory data were most informative at predicting IBD flares.

Methods

Data source

This was a retrospective observational analysis using data from Optum Electronic Health Records (EHR) Database, which contains de-identified clinical and administrative data from more than 140,000 providers at 700 hospitals and 7,000 clinics [5]. Data are obtained from physician offices, emergency rooms, laboratories, and hospitals and include demographic information, vital signs and other observable measurements, medications prescribed and administered, laboratory test results, administrative data for clinical and inpatient stays, and coded diagnoses and procedures. Data are de-identified in compliance with the Health Insurance Portability and Accountability Act Expert Method and managed according to Optum customer data use agreements.

Study population

Patients were selected for inclusion if they had at least 2 diagnosis codes (International Classification of Diseases, 9th Revision, Clinical Modification [ICD-9-CM] or ICD, 10th Revision, Clinical Modification [ICD-10-CM]) for UC or CD during at least 2 clinical encounters between 1 January 2007 and 31 December 2017, with at least 1 clinical encounter being an outpatient visit [6]. For each patient, the date of the first recorded diagnosis of IBD was defined as the index date. Patients were required to have at least 12 months of follow-up after the index date to the last recorded IBD visit in the EHR, and were observed from the index date until the last patient visit, death, or until the end of study (31 December 2018), whichever occurred first. Patients were classified as having UC if all diagnosis codes were 556.xx for ICD-9-CM or K51.xx for ICD-10-CM, CD if all diagnosis codes were 555.xx for ICD-9-CM or K50.xx for ICD-10-CM, and with indeterminate colitis (IC) otherwise. Only patients aged 18 years and older were included in the analysis.
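The subtype classification rule above (UC if all codes are 556.xx/K51.xx, CD if all are 555.xx/K50.xx, IC otherwise) can be sketched in pandas; the table layout and column names here are hypothetical, not the authors' actual schema.

```python
import pandas as pd

def classify_ibd(codes: pd.Series) -> str:
    """Classify a patient's IBD subtype from all of their diagnosis codes:
    UC if every code is ICD-9 556.xx or ICD-10 K51.xx, CD if every code is
    555.xx or K50.xx, and indeterminate colitis (IC) otherwise."""
    uc = codes.str.match(r"^(556|K51)")
    cd = codes.str.match(r"^(555|K50)")
    if uc.all():
        return "UC"
    if cd.all():
        return "CD"
    return "IC"

# Hypothetical long-format diagnosis table: one row per patient-encounter code
dx = pd.DataFrame({
    "patient_id": [1, 1, 2, 2, 3, 3],
    "dx_code":    ["556.9", "K51.90", "555.1", "K50.00", "556.9", "K50.00"],
})
subtype = dx.groupby("patient_id")["dx_code"].apply(classify_ibd)
print(subtype.to_dict())  # {1: 'UC', 2: 'CD', 3: 'IC'}
```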

Outcome measures

The outcome of interest was IBD flare within 6 months of a visit, defined as a composite measure capturing either an IBD-related hospitalization or corticosteroid use for IBD. An IBD-related hospitalization was an inpatient or emergency room admission associated with any diagnosis of UC or CD. IBD-indicated corticosteroid use was defined as a prescription for an outpatient oral corticosteroid (S1 List). The indication for corticosteroid prescriptions was determined by searching for diagnosis codes for common inflammatory comorbid conditions (S2 List) within 7 days prior to the prescription date; prescriptions associated with a non-IBD diagnosis were excluded. Corticosteroid prescription fills with a day supply of < 7 days were also excluded, following the same approach as Waljee et al. [3] to exclude steroid use for non-IBD indications. A hospitalization or corticosteroid prescription was assumed to be part of a previous course if it occurred within 90 days of the previous hospitalization or corticosteroid prescription, similar to the approach of Waljee et al. [3]
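The 90-day course rule can be sketched as follows (a minimal pandas illustration with hypothetical event dates, not the authors' implementation): sort a patient's flare events chronologically and open a new course only when the gap from the previous event exceeds 90 days.

```python
import pandas as pd

def count_courses(dates: pd.Series, gap_days: int = 90) -> int:
    """Count distinct flare courses: an event starts a new course only if it
    occurs more than `gap_days` after the previous event; otherwise it is
    assumed to be part of the same course."""
    d = dates.sort_values()
    gaps = d.diff().dt.days
    # The first event always opens a course; later events open one only
    # when the gap from the previous event exceeds the threshold
    return int((gaps.isna() | (gaps > gap_days)).sum())

# Hypothetical hospitalization/prescription dates for one patient
events = pd.Series(pd.to_datetime(
    ["2015-01-01", "2015-02-15", "2015-07-01", "2015-07-20"]))
print(count_courses(events))  # 2
```

Here the January and February events fall within 90 days and merge into one course; the July events form a second.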

Predictor variables

Predictor variables included age, sex, race, use of immunosuppressive medication, laboratory data, laboratory summaries (S1 Table), and number of previous IBD flares. Laboratory data included previous laboratory values and laboratory values from the current visit; values that were out of the physiologically possible range were excluded from the analysis. Fecal calprotectin was not included as a predictor variable because it was available in < 1% of the study population. Missing laboratory values were imputed using the population median value. Laboratory data missing in > 70% of the visits were excluded from the analysis.
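The missingness filter and population-median imputation described above can be sketched as follows (a minimal example with hypothetical lab columns; the authors' exact pipeline may differ):

```python
import numpy as np
import pandas as pd

def preprocess_labs(labs: pd.DataFrame, max_missing: float = 0.70) -> pd.DataFrame:
    """Drop lab columns missing in more than `max_missing` of visits, then
    impute remaining missing values with the population median."""
    keep = labs.columns[labs.isna().mean() <= max_missing]
    return labs[keep].fillna(labs[keep].median())

# Hypothetical visit-level lab matrix
labs = pd.DataFrame({
    "wbc":     [6.1, np.nan, 8.3, 7.0],
    "calprot": [np.nan, np.nan, np.nan, 120.0],  # missing in 75% of visits
})
clean = preprocess_labs(labs)
print(list(clean.columns))    # ['wbc'] -- calprot dropped (> 70% missing)
print(clean["wbc"].tolist())  # [6.1, 7.0, 8.3, 7.0] -- NaN -> median 7.0
```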

Statistical analysis

Model development

Three prediction models, each combining the contributions of all predictor variables for a given observation, were developed to predict IBD flares: (1) a logistic regression using demographic data only, (2) a logistic regression using demographic and laboratory data, and (3) an RF using demographic and laboratory data. The models were assessed at the visit level, with each patient contributing data from any inpatient or outpatient visits. At each visit, a prediction was made to evaluate whether an IBD flare (defined as an IBD-related hospitalization or corticosteroid use) occurred within the next 6 months.

Logistic regression using demographic data (LR-Demo)

A logistic regression model using demographic data (S1 Table) was developed. This model aimed to evaluate the predictive ability of demographic data for the risk of hospitalization and corticosteroid use during the 6-month window following every visit.

Logistic regression using demographic and laboratory data (LR-DemoLab)

A logistic regression model using demographic and laboratory data (S1 Table) was developed to evaluate the added utility of laboratory data. To avoid overfitting, an L1 regularization penalty was added [7, 8]. An optimal λ hyperparameter of 1.11 for the L1 penalty was identified using 5-fold cross-validation. Regularized coefficients were plotted as odds ratios to understand their relationship with the odds of a flare in the 6 months following the visit. Laboratory values were standardized after imputation so that coefficients could be interpreted per standard-deviation increase.
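An L1-penalized fit with 5-fold cross-validation can be sketched with scikit-learn, which the authors report using. The synthetic data and penalty grid below are illustrative only; note that scikit-learn parameterizes regularization strength as C = 1/λ, so under that assumed mapping the paper's λ of 1.11 would correspond to C ≈ 0.90.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the demographic + laboratory design matrix
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

model = make_pipeline(
    StandardScaler(),  # standardize so coefficients read as per-SD effects
    # L1 penalty with a grid of 10 C values chosen by 5-fold cross-validation
    LogisticRegressionCV(Cs=10, cv=5, penalty="l1", solver="liblinear",
                         scoring="roc_auc", random_state=0),
)
model.fit(X, y)
coefs = model.named_steps["logisticregressioncv"].coef_
print(coefs.shape)  # (1, 20)
```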

RF using demographic and laboratory data (RF-DemoLab)

An RF model using demographic and laboratory data (S1 Table) was developed. RF is a tree-based technique that combines a large collection of de-correlated trees into a prediction function [9]. Using demographic and laboratory data, an RF model with 500 trees was fit. Variable importance by Gini impurity was plotted to identify the top 10 variables in the trained RF model. Laboratory values were not standardized in this model, for ease of interpretation.
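A 500-tree RF with Gini-impurity variable importance can be sketched with scikit-learn (synthetic data; hyperparameters other than the tree count are assumptions, as the paper does not report them):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the demographic + laboratory design matrix
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0, n_jobs=-1)
rf.fit(X, y)

# Gini-impurity-based importances, largest first; the top 10 would be plotted
top10 = np.argsort(rf.feature_importances_)[::-1][:10]
print(top10.shape)  # (10,)
```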

Shapley Additive exPlanations (SHAP) values were estimated using the TreeSHAP algorithm [4] on the trained RF model. SHAP values quantify the contribution of each variable to the prediction made by the model [4]. The SHAP values were estimated on the withheld validating set, and are the average conditional expected values from a trained tree-based model [4]. Global contributions of the top 10 variables were estimated for the model overall and plotted using a SHAP summary plot. Explanatory plots were created for the top 4 predictor variables.
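The global summary step reduces per-visit SHAP values to one importance score per variable. A minimal sketch of that reduction is below; the SHAP array here is random stand-in data (real values would come from something like `shap.TreeExplainer` applied to the trained RF on the validating set, which requires the raw cohort), and the feature names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a (n_visits, n_features) matrix of per-visit SHAP values
shap_values = rng.normal(size=(1000, 20))
feature_names = [f"feat_{i}" for i in range(20)]  # hypothetical names

# Global importance: mean absolute contribution to the log odds, per feature
global_importance = np.abs(shap_values).mean(axis=0)
top10 = np.argsort(global_importance)[::-1][:10]
print([feature_names[i] for i in top10[:3]])
```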

Training and validating sets

To validate the predictive ability of the models, visits were randomly split into training and validating sets, with 70% of the visits directed into the training set and 30% reserved for the validating set. Because flare is a rare event, the non-events were randomly under-sampled to match the distribution of events to avoid class imbalance; model predicted probabilities were corrected to account for the rare event proportion using Bayes’ rule [10].
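The prior-shift correction via Bayes' rule (in the spirit of Saerens et al. [10]) can be sketched as follows; the balanced sampling proportion of 0.5 is an assumption implied by under-sampling non-events to match events.

```python
import numpy as np

def correct_prior(p_sampled, pi_true, pi_sampled=0.5):
    """Map probabilities predicted by a model trained on an under-sampled
    (balanced) set back to the true event prevalence via Bayes' rule:
    each class likelihood is reweighted by (true prior / sampled prior)."""
    num = p_sampled * pi_true / pi_sampled
    den = num + (1 - p_sampled) * (1 - pi_true) / (1 - pi_sampled)
    return num / den

# With a true prevalence of 10%, a balanced-training probability of 0.5
# maps back to 0.1, and 0.9 shrinks to 0.5
p = correct_prior(np.array([0.5, 0.9]), pi_true=0.1)
print(np.round(p, 3))  # [0.1 0.5]
```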

Model performance

Model performance was evaluated using the following metrics: sensitivity, specificity, positive predictive value, negative predictive value, accuracy, area under the receiver operating curve (AuROC), and decision curve analysis (DCA) [11]. Brier scores were reported as an overall measure of model performance, where lower values indicate better calibration and discrimination [12]. A non-parametric bootstrap of 100 iterations was performed to estimate 95% CIs for model performance.
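The bootstrap CI procedure can be sketched for one metric, the AuROC, as follows (synthetic labels and probabilities; the authors' resampling details are not fully specified, so this is an assumed implementation):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic validation-set labels and predicted probabilities
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(y_true * 0.3 + rng.uniform(size=1000) * 0.7, 0, 1)

# Non-parametric bootstrap: resample visits with replacement, recompute the
# metric, and take the 2.5th/97.5th percentiles as the 95% CI
aucs = []
for _ in range(100):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if y_true[idx].min() == y_true[idx].max():
        continue  # skip degenerate resamples containing a single class
    aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
lo, med, hi = np.percentile(aucs, [2.5, 50, 97.5])
print(round(med, 2))
```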

Sensitivity analyses

Sensitivity analyses to evaluate the robustness of the RF-DemoLab model included: (1) sub-populations of patients with UC, CD and IC, and (2) the RF-DemoLab model with imputation of laboratory values using multiple imputation by chain equation (MICE) [13, 14].
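The MICE-style imputation used in the second sensitivity analysis can be sketched with scikit-learn's `IterativeImputer`, which models each column with missing values from the others in round-robin fashion; the authors' exact MICE implementation may differ, and the tiny lab matrix below is hypothetical.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical visit-level lab matrix (e.g. WBC, sodium) with missing values
X = np.array([[6.1, 140.0],
              [np.nan, 138.0],
              [8.3, np.nan],
              [7.0, 141.0]])

# Each column with missing values is regressed on the others iteratively
imputer = IterativeImputer(random_state=0, max_iter=10)
X_imp = imputer.fit_transform(X)
print(np.isnan(X_imp).sum())  # 0
```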

Software

All data processing, cleaning, and analyses were performed in Python 3. Data processing was conducted using Teradata SQL [15], SQLAlchemy [16], Jinja2 [17], and pandas [18]. Model fitting and preprocessing were conducted using scikit-learn [19] and NumPy [20]. Random under-sampling was performed using imbalanced-learn [21]. TreeSHAP values were calculated and plots created using the TreeSHAP package [4]. Plots were created using Matplotlib [22] and Seaborn [23].

Patient involvement

No patients were involved in this study as we used de-identified Optum EHR data.

Data and code availability

The Optum EHR [5] data used in this study were licensed from Optum and are not publicly available due to data licensing and use agreements; interested researchers can contact Optum to license the data. All interested researchers can access the data in the same manner as the authors. The authors had no special access privileges. Optum EHR contact website: https://www.optum.com/business/solutions/government/federal/data-analytics-federal/clinical-data.html.

The R code (GitHub: https://github.com/CCMRcodes/IBD_Flare) and the paper previously published by Waljee et al. [3] were reviewed, and the code was adapted to Python.

Manuscript code used to produce tables and plots can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model. Summary data used to create plots and tables can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model/tree/master/results. Note, SHAP values require raw data to be calculated and are not included in the repository due to data use agreements.

Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines were implemented [24]; checklist can be found in S3 Table.

Results

Patient characteristics

In total, 454,769 patients with an IBD diagnosis were identified between 2007 and 2017. Patients were excluded if they did not have at least 2 diagnosis codes for UC or CD during at least 2 encounters, with at least 1 encounter being an outpatient visit; did not have at least 365 days between the index date and last recorded visit; were less than 18 years old; or did not have any laboratory data. The final cohort consisted of 95,878 patients and 780,559 visits (Fig 1).

Fig 1. Flow chart of identification of patients with inflammatory bowel disease (IBD) from the Optum electronic health records (EHR) database.

Fig 1

Patients were predominantly White (87.7%) and female (57.1%), with a mean (SD) age of 48.2 (16.8) years at the index date (Table 1). The median (IQR) follow-up time for all patients was 55.0 months (34.0−79.0 months). A majority of patients were seen in the Midwest region of the United States (49.4%), followed by the Southern region (25.2%), which aligned with the geographic coverage of the Optum EHR.

Table 1. Patient characteristics.

Characteristic All Patients No Flare Event a Had Flare Event
Number of patients (%) 95,878 (100) 73,633 (76.8) 22,245 (23.2)
Age in years, mean (SD) 48.2 (16.8) 48.6 (16.6) 47.1 (17.3)
Female, n (%) 54,794 (57.1) 41,645 (56.6) 13,149 (59.1)
Deceased, n (%) 3,304 (3.4) 2,717 (3.7) 587 (2.6)
Immunosuppressive medication use b, n (%) 20,375 (21.3) 13,577 (18.4) 6,798 (30.6)
Race, n (%)
    Black 5,958 (6.2) 4,291 (5.8) 1,667 (7.5)
    Asian 998 (1.0) 824 (1.1) 174 (0.8)
    White 84,055 (87.7) 64,532 (87.6) 19,523 (87.8)
    Other/Unknown 3,876 (4.0) 2,896 (3.9) 980 (4.4)
Region, n (%)
    Midwest 47,371 (49.4) 36,274 (49.3) 11,097 (49.9)
    Northeast 12,518 (13.1) 9,845 (13.4) 2,673 (12.0)
    South 24,198 (25.2) 18,528 (25.2) 5,670 (25.5)
    West 7,915 (8.3) 6,090 (8.3) 1,825 (8.2)
Disease, n (%)
    Crohn’s disease 42,977 (44.8) 31,990 (43.4) 10,987 (49.4)
    Ulcerative colitis 40,167 (41.9) 33,483 (45.5) 6,684 (30.0)
    Indeterminate disease 12,734 (13.3) 8,160 (11.1) 4,574 (20.6)
Number of non-flare visits c
    Mean (SD) 7.2 (10.5) 6.0 (8.3) 11.4 (15.1)
    Median (IQR) 4.0 (2.0, 8.0) 3.0 (2.0, 7.0) 7.0 (3.0, 15.0)
Follow-up in months, mean (SD) 58.5 (30.5) 57.0 (30.2) 63.4 (31.1)
Follow-up in months, median (IQR) 55.0 (34.0, 79.0) 53.0 (32.0, 77.0) 61.0 (38.0, 85.0)

IQR = interquartile range.

a Flare event was classified as an IBD-related hospitalization or corticosteroid prescription within 6 months of any inpatient or outpatient visit between 2007 and 2018.

b Immunomodulators and/or anti-tumor necrosis factor agents.

c Visits without a hospitalization or corticosteroid prescription in the following 6 months.

Model performance

The AuROC and DCA curves for the models are shown in Fig 2. The AuROC was 0.64 (95% CI: 0.64 to 0.64) for the LR-Demo model, 0.65 (95% CI: 0.65 to 0.66) for the LR-DemoLab model, and 0.80 (95% CI: 0.80 to 0.81) for the RF-DemoLab model. The addition of laboratory data increased the AuROC only slightly, suggesting minimal added discriminatory ability over demographic data alone. The RF-DemoLab model had a higher AuROC than the LR-Demo and LR-DemoLab models, suggesting better discrimination than the logistic models. The DCA suggests the RF-DemoLab model has better net benefit than the logistic models and the intervention-for-all and intervention-for-none scenarios across a wide range of thresholds (Fig 2B).

Fig 2.

Fig 2

A) AuROC and B) DCA for prediction of flare in the next 6 months for the logistic regression model using demographic data, logistic regression using demographic and laboratory data, and random forest model using demographic and laboratory data. AUC = area under the curve; auROC = area under the receiver operating curve; DCA = decision curve analysis; RF = random forest model; ROC = receiver operating curve.

Table 2 contains bootstrapped model performance metrics. For the LR-Demo model, sensitivity was 68%, indicating reasonable ability to predict flare events in the 6 months following a visit, but specificity, at 51%, was little better than chance. The LR-DemoLab model had similar performance: sensitivity of 69% and specificity of 54%. The RF-DemoLab model had the highest sensitivity (74%) and specificity (72%) and the lowest Brier score, indicating better diagnostic accuracy than the logistic regressions. The RF-DemoLab model also had the highest positive predictive value, at 27% (Table 2).

Table 2. Bootstrapped estimates of median and 95% confidence interval (CI) model performance metrics.

Diagnostic Metric LR-Demo LR-DemoLab RF-DemoLab
Sensitivity, median (95% CI) 0.68 (0.67−0.69) 0.69 (0.68−0.70) 0.74 (0.73−0.74)
Specificity, median (95% CI) 0.51 (0.51−0.51) 0.54 (0.53−0.54) 0.72 (0.72−0.72)
Positive predictive value, median (95% CI) 0.16 (0.16−0.17) 0.17 (0.17−0.18) 0.27 (0.27−0.28)
Negative predictive value, median (95% CI) 0.92 (0.92−0.92) 0.92 (0.92−0.93) 0.95 (0.95−0.95)
Accuracy, median (95% CI) 0.53 (0.53−0.53) 0.56 (0.55−0.56) 0.72 (0.72−0.73)
Area under the curve, median (95% CI) 0.64 (0.64−0.64) 0.66 (0.65−0.66) 0.80 (0.80−0.81)
Brier score 0.26 0.26 0.19

LR-Demo = a logistic regression model using demographic data; LR-DemoLab = a logistic regression model using demographic and laboratory data; RF-DemoLab = a RF model using demographic and laboratory data.

Interpretability of the RF models

The 10 most important predictors based on TreeSHAP are presented in Fig 3: the number of previous flares, age, mean potassium, white blood cell (WBC) count at the current visit and its past-visit mean and maximum, past-visit mean calcium, mean corpuscular volume (MCV) at the current visit, past-visit mean blood urea nitrogen (BUN), and past-visit maximum platelet count.

Fig 3. TreeSHAP summary plot of the top 10 most important variables for predicting a flare in the next 6 months for patients with IBD.

Fig 3

IBD = inflammatory bowel disease; Max = maximum.

The TreeSHAP summary plot shows how each of these top 10 variables influences the log odds. For example, for the number of previous flares, the log odds increase (positive SHAP value) with higher numbers of previous flares; for age, the log odds increase (positive SHAP value) at lower ages.

For comparison, the previous flare sum, age, past-visit mean potassium, and past-visit mean and maximum WBC count were also identified as important variables in the variable importance plots (S1 Fig). Some differences are expected, as variable importance by Gini impurity is biased when both multi-category and continuous variables are present [25].

TreeSHAP conditional dependency plots were created for the top 4 variables identified in the TreeSHAP summary plots, to describe the conditional non-linear relationship with the likelihood of a flare in the next 6 months; these are presented in S2 Fig. For comparison, odds ratios from the 2 logistic regression models are presented in S3 Fig.

Sensitivity analyses

Sensitivity analyses for the UC, CD, and IC subgroups are presented in S4 and S5 Figs (AuROC and RF variable importance), and results were similar across patients with UC, CD, and IC. This suggests that the trained RF-DemoLab model had similar predictive performance and important variables regardless of diagnosis.

Performance metrics for the sensitivity analysis using the RF model with MICE imputation for laboratory values, rather than simple median imputation, are presented in S2 Table. Sensitivity of the RF-DemoLab model with MICE imputation was lower (0.60) than with simple median imputation (0.74), resulting in lower overall accuracy. This suggests that linear iterative imputers such as MICE may not be ideal for tree-based modeling tasks.

Discussion

The model-building process was replicated to predict flare in patients with IBD using the Optum EHR with similar predictive accuracy as Waljee et al. [3] and following TRIPOD guidelines [24]. The RF model with both clinical and laboratory variables saw improvements in all predictive accuracy metrics and DCA net benefit when compared to the logistic models. This higher performance is likely due to the ability of the RF algorithm to implicitly handle interactions between variables as well as non-linearities, which must be explicitly modeled in logistic regression.

Results from the TreeSHAP algorithm offer insight into and interpretability of RF models. The number of previous flare events a patient experienced was the most important variable for predicting a flare in the next 6 months. This relationship showed a threshold effect: the influence of previous flares on the likelihood of future flares increased linearly up to 3 previous flares and remained elevated at > 3 previous flares (S2A Fig).

While the process outlined in Waljee et al. [3] was generally replicable and reproducible, performance metrics in this study were noticeably lower. In Waljee et al., the AuROC for the RF model with clinical variables (including previous flare) and laboratory variables was 0.87 [3], whereas the comparable model here had an AuROC of 0.80. One explanation for the lower performance could be the difference in patient populations: this study evaluated patients with IBD in the commercial Optum EHR, whereas Waljee et al. [3] evaluated patients with IBD in the VHA. The VHA population was predominantly male and, on average, approximately 10 years older than the Optum EHR cohort. In addition, this study required 12 months of post-baseline follow-up data, which would limit model generalizability to patients with continuous care within a single healthcare system.

This study deviated from Waljee et al. [3] in some preprocessing of laboratory values. Specifically, laboratory variables were required to be measured at at least 70% of visits, and population median imputation was used rather than a per-patient past median. It is possible that these criteria also lowered model performance. However, they simplified the preprocessing of laboratory data, which made the model SHAP values and odds ratios easier to interpret. Furthermore, highly correlated laboratory values, such as the WBC past mean and WBC past maximum, could dilute variable importance in both the SHAP (Fig 3) and variable importance (S1 Fig) results. Future work could use feature selection approaches to reduce the number of correlated variables.

The use of the Optum EHR may have introduced misclassification. Oral corticosteroids received during the course of a hospitalization were not available, and inpatient care received outside the Optum network could not be tracked. Clinical information, such as disease extent and location, was also not captured in the Optum EHR. As a result, the analysis of the Optum EHR reported here is anticipated to have more missing data than that reported by Waljee et al. [3] These limitations may explain the lower proportion of events in this analysis and the subsequent lower model performance metrics.

Another difference between this study and Waljee et al. [3] was that patients with concurrent Clostridium difficile were not excluded. While this may have led to misclassification, a positive C. difficile diagnosis was observed in < 2% of the study population. In addition, objective definitions of flare, such as using fecal calprotectin [26], were not specifically examined. Specifically, fecal calprotectin was measured on < 1% of the study population. Measurements from endoscopy and disease specific activity/severity scores were also not available in the EHR. Future research could work on disease-specific models integrating fecal calprotectin and other UC- and CD-specific disease activity/severity scores with clinical and laboratory variables readily available in EHR.

Reproduction of a study is essential for replication of scientific findings. However, this is a challenge in the biomedical sciences, as the key elements needed to reproduce a study, including detailed methods, data, and code, are often not made available to external researchers [27]. Data sharing is not always feasible in biomedical research due to data use agreements and privacy regulations, as was the case in this study, which prohibited external validation of the trained models on the VHA cohort. However, other essential elements of reproducible research can be provided, including use of open-source software and sharing of code, to increase the likelihood of replicating a study [28]. These elements, in addition to a detailed methods section, helped in replicating the findings of Waljee et al. [3]

Machine learning models could help enable personalized care in IBD patients and could be incorporated into routine practice as clinical decision support tools for monitoring and identifying patients at high-risk of flare. As algorithms such as RF are utilized more in clinical decision-making, it will be important for patients or clinicians to be able to make sense of algorithms beyond high level accuracy metrics. Use of SHAP values can help accomplish this task. However, model interpretability is only one aspect. Ensuring machine learning models are validated in different EHRs and generalizable will be critical to promote use in routine practice.

Supporting information

S1 Fig. Random forest top 10 features estimated using variable importance.

IBD = inflammatory bowel disease; Max = maximum.

(TIFF)

S2 Fig. TreeSHAP dependency plot of the top 4 important variables.

Dependency plots for A) number of previous flares, B) age at visit, C) potassium (mmol/L) past-visit mean, and D) WBC count (×10³/μL) past-visit mean.

(TIFF)

S3 Fig. Odds ratios from logistic regression prediction models for the demographics only model, and the demographics and laboratory variables.

Max = maximum; Labs = laboratory variables. Note: the Python scikit-learn implementation used for the logistic models does not estimate variance parameters for the coefficients, so these are not reported.

(TIFF)

S4 Fig. AuROC for random forest (RF) models (clinical and laboratory variables) subset by ulcerative colitis, Crohn’s disease, and indeterminate colitis.

AUC = area under the curve; auROC = area under the receiver operating curve; RF = random forest model; ROC = receiver operating curve.

(TIFF)

S5 Fig. Random forest top 10 features estimated using variable importance subset by ulcerative colitis, Crohn’s disease, and indeterminate colitis.

Max = maximum.

(TIFF)

S1 Table. Patient variables.

(DOCX)

S2 Table. Bootstrapped estimates of median and 95% Confidence Interval (CI) model performance for random forest (RF) model with clinical and multiple imputation by chain equation laboratory features on laboratory variables measured at >70% of visits.

(DOCX)

S3 Table. TRIPOD checklist for prediction model development and validation.

(DOCX)

S1 List. Generic names of corticosteroids.

(DOCX)

S2 List. ICD-9-CM codes for a variety of common inflammatory comorbid conditions.

(DOCX)


Funding Statement

Funding to license Optum data was provided by Roche/Genentech. The funder (Roche/Genentech) provided support in the form of salaries for authors Ryan Gan, Diana Sun, Amanda Tatro, and licensing fees for Optum EHR, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  • 1.Kappelman MD, Rifas–Shiman SL, Kleinman K, Ollendorf D, Bousvaros A, Grand RJ, et al. The Prevalence and Geographic Distribution of Crohn’s Disease and Ulcerative Colitis in the United States. Clin Gastroenterol Hepatol. 2007;5: 1424–1429. doi: 10.1016/j.cgh.2007.07.012 [DOI] [PubMed] [Google Scholar]
  • 2.Loftus EV. Clinical epidemiology of inflammatory bowel disease: incidence, prevalence, and environmental influences. Gastroenterology. 2004;126: 1504–1517. doi: 10.1053/j.gastro.2004.01.063 [DOI] [PubMed] [Google Scholar]
  • 3.Waljee AK, Lipson R, Wiitala WL, Zhang Y, Liu B, Zhu J, et al. Predicting Hospitalization and Outpatient Corticosteroid Use in Inflammatory Bowel Disease Patients Using Machine Learning. Inflamm Bowel Dis. 2017;24: 45–53. doi: 10.1093/ibd/izx007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. 2020;2: 56–67. doi: 10.1038/s42256-019-0138-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.OPTUM® ⎯ Clinical/EHR Data [database online]. [cited 1 Jun 2019]. Available: https://www.optum.com/business/solutions/government/federal/data-analytics-federal/clinical-data.html
  • 6.Hou J, Tan M, Stidham R, Colozzi J, Adams D, El-Serag H, et al. Accuracy of diagnostic codes for identifying patients with ulcerative colitis and Crohn’s disease in the Veterans Affairs Health Care System. Dig Dis Sci. 2014;59: 2406–2410. doi: 10.1007/s10620-014-3174-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Tibshirani R. Regression Shrinkage and Selection Via the Lasso. J R Stat Soc Ser B Methodol. 1996;58: 267–288. doi: 10.1111/j.2517-6161.1996.tb02080.x [DOI] [Google Scholar]
  • 8.Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33: 1–22. [PMC free article] [PubMed] [Google Scholar]
  • 9.Breiman L. Random Forests. Mach Learn. 2001;45: 5–32. doi: 10.1023/A:1010933404324 [DOI] [Google Scholar]
  • 10.Saerens M, Latinne P, Decaestecker C. Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comput. 2002;14: 21–41. doi: 10.1162/089976602753284446 [DOI] [PubMed] [Google Scholar]
  • 11.Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Mak Int J Soc Med Decis Mak. 2006;26: 565–574. doi: 10.1177/0272989X06295361 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Brier GW. Verification of forecasts expressed in terms of probability. Mon Weather Rev. 1950;78: 1–3. [Google Scholar]
  • 13.Azur MJ, Stuart EA, Frangakis C, Leaf PJ. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res. 2011;20: 40–49. doi: 10.1002/mpr.329 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Waljee AK, Mukherjee A, Singal AG, Zhang Y, Warren J, Balis U, et al. Comparison of imputation methods for missing laboratory data in medicine. BMJ Open. 2013;3: e002847. doi: 10.1136/bmjopen-2013-002847 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Teradata Database SQL Functions, Operators, Expressions, and Predicates. Teradata Online Documentation. [cited 29 Jun 2020]. Available: https://docs.teradata.com/reader/1DcoER_KpnGTfgPinRAFUw/URMgPkXpBvFL~jOlmMLmkQ
  • 16.Bayer M. SQLAlchemy. In: Brown A, Wilson G, editors. The Architecture of Open Source Applications Volume II: Structure, Scale, and a Few More Fearless Hacks. aosabook.org; 2012. Available: http://aosabook.org/en/sqlalchemy.html
  • 17.Jinja—Jinja Documentation (3.0.x). [cited 16 Aug 2021]. Available: https://jinja.palletsprojects.com/en/3.0.x/
  • 18.Reback J, McKinney W, jbrockmendel, Van den Bossche J, Augspurger T, Cloud P, et al. pandas-dev/pandas: Pandas 1.0.3. Zenodo; 2020. doi: 10.5281/zenodo.3715232 [DOI]
  • 19.Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–2830. [Google Scholar]
  • 20.van der Walt S, Colbert SC, Varoquaux G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput Sci Eng. 2011;13: 22–30. doi: 10.1109/MCSE.2011.37 [DOI] [Google Scholar]
  • 21.Lemaître G, Nogueira F, Aridas CK. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning. J Mach Learn Res. 2017;18: 1–5. [Google Scholar]
  • 22.Hunter JD. Matplotlib: A 2D Graphics Environment. Comput Sci Eng. 2007;9: 90–95. doi: 10.1109/MCSE.2007.55 [DOI] [Google Scholar]
  • 23.Waskom ML. seaborn: statistical data visualization. J Open Source Softw. 2021;6: 3021. doi: 10.21105/joss.03021 [DOI] [Google Scholar]
  • 24.Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD). Circulation. 2015;131: 211–219. doi: 10.1161/CIRCULATIONAHA.114.014508 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Strobl C, Boulesteix A-L, Zeileis A, Hothorn T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics. 2007;8: 25. doi: 10.1186/1471-2105-8-25 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Mao R, Xiao Y, Gao X, Chen B, He Y, Yang L, et al. Fecal Calprotectin in Predicting Relapse of Inflammatory Bowel Diseases: A Meta-analysis of Prospective Studies. Inflamm Bowel Dis. 2012;18: 1894–1899. doi: 10.1002/ibd.22861 [DOI] [PubMed] [Google Scholar]
  • 27.Wallach JD, Boyack KW, Ioannidis JPA. Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017. PLOS Biol. 2018;16: e2006930. doi: 10.1371/journal.pbio.2006930 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Peng RD. Reproducible Research in Computational Science. Science. 2011;334: 1226–1227. doi: 10.1126/science.1213847 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Hsu-Heng Yen

26 Jul 2021

PONE-D-21-20616

REPLICATING PREDICTION ALGORITHMS FOR HOSPITALIZATION AND CORTICOSTEROID USE IN PATIENTS WITH INFLAMMATORY BOWEL DISEASE

PLOS ONE

Dear Dr. Waljee,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Two expert reviewers in this field identified some areas that need further revision. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 09 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Hsu-Heng Yen

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that according to our submission guidelines (http://journals.plos.org/plosone/s/submission-guidelines), outmoded terms and potentially stigmatizing labels should be changed to more current, acceptable terminology. For example: “Caucasian” should be changed to “white” or “of [Western] European descent” (as appropriate).

3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

4. Thank you for stating the following in the Competing Interests section: 

[I have read the journal's policy and the authors of this manuscript have the following competing interests:

Ryan Gan and Diana Sun are full time employees of Genentech, Inc., a member of the Roche group, and own shares of Roche stock.

Amanda Tatro is a full time employee of F. Hoffmann La Roche AG and owns shares of Roche stock.].

We note that one or more of the authors are employed by a commercial company: Roche stock. 

a) Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement. 

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement. 

b) Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to  PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and  there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

5. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

6. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. 

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for the opportunity to review this manuscript. The statistical analysis provided accurate data to explain predictors of IBD flare. It will supply clinicians with useful information for decision-making.

Reviewer #2: Previously, this group developed a machine-learning model that more accurately predicts IBD flares. Corticosteroid use and previous hospitalization were key variables of the model (Waljee et al 2018). The present study aims to replicate and validate the previously published model using a distinct and independent IBD cohort, which is clinically relevant. This study could be improved by attention to some concerns that are described below.

General comments

1) Predictor variables: fecal calprotectin was not available; what about the HBI (CD), Mayo score (UC), Ulcerative Colitis Disease Activity Index (UCDAI), or endoscopy scores? Baseline and over-time disease activity/severity is a more accurate tool for determining real flares.

2) Why were diagnosis variables (CD, UC, IC) not included in the prediction models? Since CD and UC are two distinct diseases, with distinct pathologies and characteristics, and given the large sample size used here, independent analyses based on disease type would better support the generalizability of the model prediction for IBD flares.

3) Abstract is lacking sensitivity and specificity results.

4) Cross-validation analysis was used to provide evidence for generalizability. It does provide internal validation, but a more robust approach would be to train the model on the previously published VHA cohort and test it on the present cohort (external validation).

Specific comments

1) Abstract conclusion and key words have a different format.

2) Introduction: “they found a random forest (RF) model improved the ability to predict IBD flares and outperformed logistic regression models”. Grammatically confusing.

3) Introduction: “(1) to replicate the findings of models developed by Waljee et al.[3] using demographic and laboratory data in independent dataset of a commercial EHR to predict hospitalization and corticosteroid use as a surrogate for IBD flares in a community-based cohort;”. The statement is too long and grammatically confusing.

4) Study population: “and were observed from the index date until the last observed observation”?

5) Figures: bad quality/resolution.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Svetlana Ferreira Lima

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Sep 20;16(9):e0257520. doi: 10.1371/journal.pone.0257520.r002

Author response to Decision Letter 0


25 Aug 2021

Response to edits requested on submission PONE-D-21-20616R1:

We've checked your submission and before we can proceed, we need you to address the following issues:

1. Thank you for stating the following in the Competing Interests section:

[I have read the journal's policy and the authors of this manuscript have the following competing interests:

Ryan Gan and Diana Sun are full time employees of Genentech, Inc., a member of the Roche group, and own shares of Roche stock.

Amanda Tatro is a full time employee of F. Hoffmann La Roche AG and owns shares of Roche stock.].

We note that one or more of the authors are employed by a commercial company: Roche stock.

a) Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

-Amended Funding Statement provided in updated Cover Letter. The funding organization only provided financial support in the form of authors’ salaries and/or research materials and is reflected as such in the Author Contributions.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

-This statement has been included in our amended Funding Statement within the updated Cover Letter.

b) Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

-We have updated our Competing Interests statement in our Cover Letter adding the following statement “This does not alter our adherence to PLOS ONE policies on data sharing and materials.”

2. Please remove your Supporting Information figures within your manuscript file, leaving only the individual TIFF/EPS image files. These will be automatically included in the reviewer’s PDF.

-We have removed Supporting Information figures within the manuscript file, only leaving the figure captions.

3. We note your current Data Availability statement is:

"The Optum EHR [5] data used in this study were licensed from Optum and are not publicly available due to data licensing and use agreements; interested researchers can contact Optum to license the data. All interested researchers can access the data in the same manner as the authors. The authors had no special access privileges. Optum EHR contact website: https://www.optum.com/business/solutions/government/federal/data-analytics-federal/clinical-data.html.

R code (GitHub: https://github.com/CCMRcodes/IBD_Flare) and the paper previously published by Waljee et al.[3] were reviewed and adapted to Python code.

Manuscript code and summary data used to produce tables and plots can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model. Note, SHAP values require raw data to be calculated and are not included in the repository due to data use agreements.

Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines were implemented;[24] checklist can be found in S3 Table."

PLOS defines the "minimal data set" as consisting of the data used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. This includes:

- The values behind the means, standard deviations and other measures reported;

- The values used to build graphs;

- The points extracted from images for analysis.

Before we proceed, please address the following prompts:

a.) Please confirm whether your current Data Availability statement includes all of the relevant information and links needed for accessing the minimal data set necessary for replicating the study's results?

-We confirm that our Data Availability statement includes all relevant information and links needed for accessing the minimal data set necessary for replicating the study’s results. In an effort to improve data accessibility, we have put a markdown table of the metadata used to make the plots and tables in the results folder of the public git repository and modified the data availability paragraph slightly to separate out the link for the summary data availability. We have updated our Data Availability statement in the manuscript slightly to read “The Optum EHR [5] data used in this study were licensed from Optum and are not publicly available due to data licensing and use agreements; interested researchers can contact Optum to license the data. All interested researchers can access the data in the same manner as the authors. The authors had no special access privileges. Optum EHR contact website: https://www.optum.com/business/solutions/government/federal/data-analytics-federal/clinical-data.html.

R code (GitHub: https://github.com/CCMRcodes/IBD_Flare) and the paper previously published by Waljee et al.[3] were reviewed and adapted to Python code.

Manuscript code used to produce tables and plots can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model. Summary data used to create plots and tables can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model/tree/master/results. Note, SHAP values require raw data to be calculated and are not included in the repository due to data use agreements.

Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines were implemented;[24] checklist can be found in S3 Table.”

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Hsu-Heng Yen

6 Sep 2021

REPLICATING PREDICTION ALGORITHMS FOR HOSPITALIZATION AND CORTICOSTEROID USE IN PATIENTS WITH INFLAMMATORY BOWEL DISEASE

PONE-D-21-20616R1

Dear Dr. Waljee,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Hsu-Heng Yen

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Hsu-Heng Yen

10 Sep 2021

PONE-D-21-20616R1

Replicating prediction algorithms for hospitalization and corticosteroid use in patients with inflammatory bowel disease

Dear Dr. Waljee:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Hsu-Heng Yen

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Random forest top 10 features estimated using variable importance.

    IBD = inflammatory bowel disease; Max = maximum.

    (TIFF)

    S2 Fig. Treeshap dependency plot of the top 4 important variables.

    Dependency plots for A) Number of Previous Flares, B) Age at Visit, C) Potassium mmol/L Past Visits Mean, and D) WBC Count ×10³/μL Past Visits Mean.

    (TIFF)

    S3 Fig. Odds ratios from logistic regression prediction models for the demographics-only model and for the demographics and laboratory variables model.

    Max = maximum; Labs = laboratory variables. Note: the Python scikit-learn implementation used for the logistic models does not estimate variance parameters for the coefficients, which are therefore not reported.

    (TIFF)
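As context for the note in S3 Fig: because scikit-learn's LogisticRegression reports only point estimates for the coefficients, the figure can show odds ratios but no confidence intervals. Odds ratios are obtained by exponentiating the fitted log-odds coefficients. A minimal sketch with made-up coefficient values (not those from the study):

```python
import math

# Hypothetical log-odds coefficients, standing in for the values a fitted
# scikit-learn LogisticRegression exposes via `model.coef_`. scikit-learn
# returns point estimates only (no standard errors), so only the odds
# ratios themselves can be reported.
coefficients = {"age_at_visit": 0.04, "n_previous_flares": 0.90}

# Odds ratio per predictor: OR = exp(beta).
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, or_value in odds_ratios.items():
    print(f"{name}: OR = {or_value:.2f}")
```

An OR above 1 indicates the predictor is associated with higher odds of flare; for example, a log-odds coefficient of 0.90 corresponds to an odds ratio of roughly 2.46 per unit increase.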

    S4 Fig. AuROC for random forest (RF) models (clinical and laboratory variables) subset by ulcerative colitis, Crohn’s disease, and indeterminate colitis.

    AUC = area under the curve; AuROC = area under the receiver operating curve; RF = random forest; ROC = receiver operating curve.

    (TIFF)

    S5 Fig. Random forest top 10 features estimated using variable importance subset by ulcerative colitis, Crohn’s disease, and indeterminate colitis.

    Max = maximum.

    (TIFF)

    S1 Table. Patient variables.

    (DOCX)

    S2 Table. Bootstrapped estimates of median and 95% confidence interval (CI) model performance for the random forest (RF) model with clinical and multiple imputation by chained equations laboratory features, on laboratory variables measured at >70% of visits.

    (DOCX)

    S3 Table. TRIPOD checklist for prediction model development and validation.

    (DOCX)

    S1 List. Generic names of corticosteroids.

    (DOCX)

    S2 List. ICD-9-CM codes for a variety of common inflammatory comorbid conditions.

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    The Optum EHR [5] data used in this study were licensed from Optum and are not publicly available due to data licensing and use agreements; interested researchers can contact Optum to license the data. All interested researchers can access the data in the same manner as the authors. The authors had no special access privileges. Optum EHR contact website: https://www.optum.com/business/solutions/government/federal/data-analytics-federal/clinical-data.html. R code (GitHub: https://github.com/CCMRcodes/IBD_Flare) and the paper previously published by Waljee et al.[3] were reviewed and adapted to Python code. Manuscript code used to produce tables and plots can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model. Summary data used to create plots and tables can be found in the public GitHub repository: https://github.com/phcanalytics/ibd_flare_model/tree/master/results. Note, SHAP values require raw data to be calculated and are not included in the repository due to data use agreements. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines were implemented;[24] the checklist can be found in S3 Table.


    Articles from PLoS ONE are provided here courtesy of PLOS
