Patient Preference and Adherence. 2020 Jun 3;14:917–926. doi: 10.2147/PPA.S253732

Applying Machine Learning Models to Predict Medication Nonadherence in Crohn’s Disease Maintenance Therapy

Lei Wang 1,*, Rong Fan 1,*, Chen Zhang 1, Liwen Hong 1, Tianyu Zhang 1, Ying Chen 2, Kai Liu 2, Zhengting Wang 1, Jie Zhong 1
PMCID: PMC7280067  PMID: 32581518

Abstract

Objective

Medication adherence is crucial in the management of Crohn’s disease (CD), yet adherence remains low. This study aimed to develop machine learning models that can help predict nonadherence to azathioprine (AZA) among CD patients, and thus help caregivers streamline the intervention process.

Methods

This single-center, cross-sectional study recruited 446 CD patients who had been prescribed AZA between Sep 2005 and Sep 2018. Questionnaires on medication adherence, anxiety and depression, beliefs about medication necessity and concerns, and medication knowledge were administered to patients, while other data were extracted from the electronic medical records. Two machine learning models, a back-propagation neural network (BPNN) and a support vector machine (SVM), were developed and compared with logistic regression (LR). All three were assessed by accuracy, recall, precision, F1 score and the area under the receiver operating characteristic curve (AUC).

Results

The average classification accuracy and AUC of the three models were 81.6% and 0.896 for LR, 85.9% and 0.912 for BPNN, and 87.7% and 0.930 for SVM, respectively. Multivariate analysis identified four risk factors associated with AZA nonadherence: medication concern belief (OR=3.130, p<0.001), education (OR=2.199, p<0.001), anxiety (OR=1.549, p<0.001) and depression (OR=1.190, p<0.001), while medication necessity belief (OR=0.004, p<0.001) and medication knowledge (OR=0.805, p=0.013) were protective factors.

Conclusion

We developed three machine learning models and proposed an SVM model with promising accuracy in the prediction of AZA nonadherence in Chinese CD patients. The study also reconfirmed that education, psychological distress, and medication beliefs and knowledge are correlated with AZA nonadherence.

Keywords: Crohn’s disease, azathioprine, medication adherence, maintenance therapy, machine learning, support vector machine, back-propagation neural network

Introduction

Crohn’s disease (CD) is an inflammatory bowel disease (IBD) that can affect any segment of the gastrointestinal (GI) tract, with an etiology combining both genetic predisposition and environmental factors.1 The disease is characterized by alternating courses of remission and relapse, and its treatment often involves two consecutive phases, induction and maintenance, the latter of which plays a critical role in remission. Various studies have found that low medication adherence is closely linked to more frequent relapses and hospitalization, increased mortality, and higher healthcare expenditure.2,3 Improved medication adherence can, by contrast, lead to a well-maintained condition, lower overall medical cost and better quality of life. It is estimated that at least 30–45% of patients fall into the category of nonadherence to maintenance therapy.4,5

Studies using logistic regression have been performed to identify factors that may influence adherence to medication and/or clinical follow-ups, and the results suggest that age, employment, socioeconomic status, travel time to clinic, C-reactive protein levels, patient perceived stress or anxiety, and duration of the disease are among the predictors for adherence.6–9 Among them, the psychological impacts of the disease, such as attachment insecurity and impaired mentalization ability, have been placed under particular scrutiny as they can be both a risk factor for and an outcome of low medication adherence, establishing a vicious triangle of difficulty in disease management, poor prognosis and nonadherence to therapy.10,11 However, these analyses sometimes generate conflicting results, as predictors such as gender, smoking or education level can show strong correlation in one setup yet be irrelevant in another, depending on the study design, patient profile, and data collection. Together, these aspects complicate the tasks of developing reliable prediction models for adherence and patient interventional programs for adherence enhancement.

With the advance of data processing technologies, machine learning algorithms such as artificial neural network (ANN) and support vector machine (SVM) have shown great potential in constructing predictive models based on electronic health records to support medical decision making, especially in chronic disease management of diabetes, heart failure and kidney disease.12–14 Machine learning models are considered to have the strengths of capturing nonlinear associations, less biased auto-learning and greater flexibility to avoid over-fitting when compared with traditional logistic regression. Three attempts at employing machine learning techniques in CD management have led to the finding of seasonal patterns of disease onset and relapse using ANN, an optimized surgical predictive model using random forest following the comparison of five different algorithms including ANN and SVM, and random forest prediction models for objective remission and nonadherence of IBD/CD patients on thiopurines.15–17

Although the incidence rate of CD in China is lower than that in the western countries, a fast-growing trend is being observed in parallel with the nation’s rapid industrialization process and adoption of the more westernized dietary habits.18 Due to the differences of genetic, environmental and socioeconomic backgrounds between the Chinese population and people in the western countries, it is important to develop further understanding of the mechanism underlying nonadherence of CD treatment in China. Hence, in this paper, we aimed to explore and compare machine learning and logistic regression models that can help predict CD patients who may demonstrate nonadherence to azathioprine (AZA), which is the first-line immunosuppressant recommended for CD maintenance therapy. Such model(s) once externally validated in the real-world study can help guide caregivers to prioritize their time and efforts to CD patients according to their risk profiles, and offer proactive and personalized interventions by addressing the relevant risk factors.

Patients and Methods

Patient Recruitment

This cross-sectional study included CD patients who were either hospitalized at or visited the GI Department of Shanghai Ruijin Hospital, Affiliated to Shanghai Jiaotong University School of Medicine (Shanghai, China). A total of 446 consecutive CD patients on AZA maintenance therapy for at least 6 months were recruited between Sep 2005 and Sep 2018. CD diagnosis was confirmed based on clinical, morphological (radiological and/or endoscopic) and pathological evidence, and remission was considered if the Crohn’s Disease Activity Index (CDAI) was less than 150. The maintenance AZA dosage was adjusted according to the side effects and blood tests in a stepwise manner to reach the maximal tolerated dose (1.0–1.5 mg/kg/day). Exclusion criteria were patients with: 1) concomitant treatment using drugs other than AZA, such as corticosteroids, methotrexate or anti-TNF, for maintenance therapy, 2) other accompanied chronic diseases, 3) disease flares yet interrupted AZA treatment, and 4) difficulty understanding the questionnaire. All patients provided written informed consent to participate in the study which was approved by the Ethics Committee of Shanghai Ruijin Hospital in compliance with the Declaration of Helsinki.

Patient Characteristics and Data Extraction

A single-center database of 128 items for CD management was constructed by manually extracting patients’ health information from the hospital electronic medical records (EMR) that captured patients’ profile including demographic characteristics, socioeconomics (education, occupation and income), clinical presentations, laboratory tests and diagnosis, therapeutic regimen, and follow-up records. Questionnaires of AZA adherence, medication beliefs, medication knowledge, and anxiety and depression were other data sources to update and complete this database.

Assessment of AZA Adherence

AZA adherence was assessed using the Medication Adherence Report Scale (MARS), a 4-item questionnaire given to patients at each hospital visit. Each MARS self-report question had a 5-point scale (where 5 = never, 4 = rarely, 3 = sometimes, 2 = often and 1 = very often) to produce a score between 1 and 5, with the total MARS score between 4 and 20.19,20 High adherence was indicated by a MARS score of 17–20, approximately ≥80% adherence rate, and nonadherence by <17, ie, <80% adherence rate, according to previous reports.21–25
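The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the text; the function and constant names are assumptions introduced here and are not part of the MARS instrument or the study's software.

```python
from typing import Sequence

MARS_ITEMS = 4         # four self-report questions
ADHERENT_CUTOFF = 17   # total score of 17-20 is treated as adherent


def mars_total(responses: Sequence[int]) -> int:
    """Sum the four item scores (5 = never ... 1 = very often)."""
    if len(responses) != MARS_ITEMS:
        raise ValueError("MARS expects exactly 4 item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each item must be scored from 1 to 5")
    return sum(responses)


def is_nonadherent(responses: Sequence[int]) -> bool:
    """Apply the study's threshold: a total below 17 counts as nonadherence."""
    return mars_total(responses) < ADHERENT_CUTOFF
```

For instance, a patient answering "never", "never", "rarely", "rarely" totals 18 and is classed as adherent, whereas four "rarely" answers total 16 and fall below the cutoff.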

Assessment of AZA Medication Beliefs

Patients’ beliefs about AZA medication were evaluated using the Beliefs about Medicines Questionnaire (BMQs).26 The BMQs consisted of two sections in the 5-point Likert scales: belief of medication necessity and concerns about potential adverse effects. Each section included a 5-item questionnaire with scores ranging from 5 to 25, and was calculated independently. Higher scores (15–25) indicated greater belief or concerns. Medication acceptance was considered in the case of high necessity but low concern scores, suggesting improved medication adherence.

Assessment of AZA Medication Knowledge

Patients’ general knowledge of AZA was evaluated using a self-report questionnaire, the AZA Knowledge Report Scale (AKRS), which we designed specifically for Chinese patients (see Supplement) considering the uniqueness of the patient profile and the healthcare system in the country. This new instrument had 10 questions regarding AZA knowledge of treatment indication, dose, cessation, side-effects, surveillance, and pregnancy. The questions took yes/no responses (yes: 1 point, no: 0 point), and the total AKRS score was 0–10, with higher scores indicating better AZA knowledge.

Assessment of Anxiety and Depression

Patients’ anxiety and depression were evaluated using the Hospital Anxiety and Depression Scale (HADS) which was a 14-item questionnaire (7 for anxiety and 7 for depression).27 Each item was assessed with a 4-point scale (0–3), and higher scores suggested higher levels of anxiety or depression: 0–7 (normal), 8–10 (mild), 11–15 (moderate) and 16–21 (severe).
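As an illustration, the HADS subscale scoring and severity bands described above might be coded as follows. The function names are hypothetical, and the sketch assumes each subscale (anxiety or depression) is scored separately from its own seven items.

```python
from typing import Sequence


def hads_subscale(items: Sequence[int]) -> int:
    """Total one HADS subscale: seven items, each scored 0-3."""
    if len(items) != 7:
        raise ValueError("each HADS subscale has exactly 7 items")
    if any(i < 0 or i > 3 for i in items):
        raise ValueError("each item must be scored from 0 to 3")
    return sum(items)


def hads_band(subscale_total: int) -> str:
    """Map a subscale total (0-21) to the severity band used in the text."""
    if not 0 <= subscale_total <= 21:
        raise ValueError("subscale total must lie between 0 and 21")
    if subscale_total <= 7:
        return "normal"
    if subscale_total <= 10:
        return "mild"
    if subscale_total <= 15:
        return "moderate"
    return "severe"
```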

Data Processing and Feature Selection

We divided all 446 patients into two groups of AZA adherence (MARS 17–20) and nonadherence (MARS < 17), and constructed a database by extracting their information from EMR and questionnaires. Univariate analyses included Student’s t-test, Mann–Whitney test or one-way analysis of variance (ANOVA) for continuous variables and Fisher’s exact test or Chi-square test for categorical variables. Statistical analysis was performed using SPSS 22.0 (SPSS, Inc, Chicago, IL) and P-values <0.05 were regarded as significant.
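To make the categorical comparison concrete, the sketch below computes a Pearson Chi-square test for a 2×2 table in plain Python. It assumes the uncorrected Pearson statistic with one degree of freedom; the text does not state which variant SPSS applied to each variable, so this is an illustration rather than a reproduction of the authors' analysis.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson Chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value). With 1 degree of freedom the
    p-value equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Sex by adherence group, using the counts reported in Table 1:
# adherent: 161 male, 98 female; nonadherent: 99 male, 88 female.
statistic, p_value = chi_square_2x2(161, 98, 99, 88)
```

With these counts the p-value comes out at roughly 0.051, in line with the value reported for sex in Table 1.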

For feature selection, we combined two methods of univariate analysis and random forest as random forest variable importance measures have been shown effective in classification tasks such as identifying genetic biomarkers to predict the onset or outcome of certain diseases.28 The importance of each predictor (feature) was evaluated by permutation of out-of-bag (OOB) prediction using the “feature selection” function in MATLAB Statistics and Machine Learning Toolbox.29 As the independent variable data set in this study was heterogeneous, it was not reliable to estimate the variable importance using random forest model developed on classification and regression trees (CART). We hence applied the interaction test instead of CART to grow unbiased trees, and compared the random forest variables with those found in the univariate analysis. The variables in common were then chosen as the feature set for model construction.
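The final step, keeping only the variables found by both methods, amounts to an ordered set intersection. In the sketch below the univariate list follows the variables reported as significant in this study, but the random forest ranking is a HYPOTHETICAL placeholder: the actual order and the two remaining top-10 entries are shown only in Figure 2 and are not reproduced here.

```python
def common_features(rf_ranked: list[str], univariate_significant: list[str]) -> list[str]:
    """Keep random forest features that were also significant in univariate analysis.

    The random forest ranking order is preserved in the result.
    """
    significant = set(univariate_significant)
    return [name for name in rf_ranked if name in significant]


# Variables with P < 0.05 in the univariate analysis.
UNIVARIATE_SIGNIFICANT = [
    "age", "marital_status", "education", "alcoholism", "anxiety",
    "depression", "necessity_belief", "concerns_belief", "knowledge",
]

# HYPOTHETICAL ranking for illustration only; "placeholder_a" and
# "placeholder_b" stand in for top-10 features not named in the text.
RF_TOP10_HYPOTHETICAL = [
    "necessity_belief", "anxiety", "concerns_belief", "education",
    "depression", "knowledge", "age", "alcoholism",
    "placeholder_a", "placeholder_b",
]

selected = common_features(RF_TOP10_HYPOTHETICAL, UNIVARIATE_SIGNIFICANT)
```

Under these assumed inputs, marital status drops out because it is absent from the random forest list, leaving eight features.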

Development of Prediction Models of Logistic Regression, Back-Propagation Neural Network and SVM

We first constructed a new dataset using the identified features and classification labels of AZA adherence for all 446 patients, and then randomly divided this new dataset into training and testing sets in the ratio of 9:1. The modeling consisted of two steps: 1) the learning process, in which the three models of logistic regression, back-propagation neural network and SVM were applied and validated on the training and testing sets, with a stratified 10-fold cross-validation procedure also applied to the predictive models on both datasets to limit overfitting and selection bias; and 2) the evaluation process, in which the five metrics of accuracy, recall, precision, F1 score and AUC were computed and compared (Figure 1).
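One way to read the 9:1 split together with stratified 10-fold cross-validation is that the data are cut into ten folds that each preserve the class balance, and every fold serves once as the 10% testing set. The sketch below implements that reading in plain Python; the function name, the seed and the label coding (1 for adherent, −1 for nonadherent) are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels: list[int], k: int = 10, seed: int = 0) -> list[list[int]]:
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    by_class: dict[int, list[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds: list[list[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class sizes taken from the study: 259 adherent, 187 nonadherent.
labels = [1] * 259 + [-1] * 187
folds = stratified_folds(labels)

# Each fold in turn is the testing set; the other nine form the training set.
test_fold = folds[0]
train_indices = [i for fold in folds[1:] for i in fold]
```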

Figure 1. The flow chart of developing machine learning models.

Note: The processed patient data were randomly divided into the training and testing sets in the ratio of 9:1, followed by the learning and evaluation steps.

Abbreviations: LR, logistic regression; BPNN, back-propagation neural network; SVM, support vector machine; AUC, area under the receiver operating characteristic curve.

The logistic regression model was developed using SPSS 22.0 (SPSS, Inc, Chicago, IL) and variables with P-value <0.05 were included in the model. We tested the linearity in the logit for continuous variables using the Box-Tidwell transformation. If any of the resulting terms were significant (P < 0.05), the associated continuous variables were converted into categorical variables to satisfy the linearity assumption.30 The variance inflation factor (VIF) of each independent variable was used to check for multicollinearity, and variables with VIF > 10 were eliminated. Once these assumption tests supported the use of logistic regression, a stepwise approach was applied to construct the model.

Back-propagation neural network is a machine learning algorithm that learns by adjusting the node connection weights through backward propagation of the output error term, and we used MATLAB R2017a to develop the model. As the prediction of patients’ nonadherence may not be linearly separable, we chose a back-propagation neural network with two hidden layers, which have been found sufficient for creating classification regions of any desired shape.31 The number of nodes in each hidden layer was determined through trial and error. The transfer function between the input/hidden and hidden/output layers was the sigmoid function logsig, and training used trainlm, which implements Levenberg-Marquardt optimization.
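The forward pass of such a network can be sketched in plain Python as below. This is not the MATLAB model used in the study: the hidden layer sizes (6 and 4) and the random weights are hypothetical, since the text reports only that node counts were found by trial and error, and the Levenberg-Marquardt training step is omitted. The sketch only shows how eight inputs flow through two sigmoid hidden layers to a single output between 0 and 1.

```python
import math
import random


def logsig(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_layer(n_in: int, n_out: int, rng: random.Random):
    """Create small random weights and biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def dense(inputs, weights, biases):
    """One fully connected layer followed by the sigmoid transfer function."""
    return [
        logsig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, layers) -> float:
    """Propagate a feature vector through all layers and return the output."""
    activation = features
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation[0]


rng = random.Random(0)
# 8 input features -> 6 hidden -> 4 hidden -> 1 output (sizes are assumptions).
layers = [init_layer(8, 6, rng), init_layer(6, 4, rng), init_layer(4, 1, rng)]
output = forward([0.5] * 8, layers)
```

Training would then adjust the weights by propagating the output error backwards, which is the part handled by trainlm in the study.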

SVM is a supervised machine learning algorithm commonly used for data classification modeling, and we used the open-source SVM software library LIBSVM for model development. For both training and testing datasets, values of input features were normalized into the range of 0 to 1, while the classification labels of AZA adherence and nonadherence were designated as 1 and −1, in order to meet the format requirement of SVM. As the number of features was small, and the relation between the features and the adherence outcome could be nonlinear, we constructed the SVM model using the radial basis function (RBF) kernel, with parameter C as the weight between empirical error and generalization error, and parameter γ to control the shape of the separating hyperplane for the training predictive model. Optimization of both C and γ was achieved by grid-search using cross-validation, and the pair (C, γ) with the highest accuracy was selected for continued training until the final classifier was produced.32 The classification model was then applied to the testing dataset for validation.
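The preprocessing and kernel described here can be illustrated without LIBSVM itself. The sketch below shows min-max scaling to [0, 1], the RBF kernel, and the construction of a (C, γ) candidate grid. The power-of-two grid and its ranges are assumptions (a common convention for RBF grid searches); the ranges actually searched in the study are not stated.

```python
import math


def min_max_scale(column: list[float]) -> list[float]:
    """Rescale one feature column to the range 0..1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(value - lo) / (hi - lo) for value in column]


def rbf_kernel(x: list[float], z: list[float], gamma: float) -> float:
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def parameter_grid(c_exponents, gamma_exponents) -> list[tuple[float, float]]:
    """All (C, gamma) pairs on a power-of-two grid."""
    return [(2.0 ** c, 2.0 ** g) for c in c_exponents for g in gamma_exponents]


scaled_age = min_max_scale([18, 30, 42])
# Assumed search ranges: C = 2^-5 ... 2^15, gamma = 2^-15 ... 2^3.
grid = parameter_grid(range(-5, 16, 2), range(-15, 4, 2))
```

Each pair in the grid would be scored by cross-validated accuracy on the training data and the best pair retained. In practice the scaling bounds should be taken from the training set and reused unchanged on the testing set, so that no information leaks from test to train.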

Performance Evaluation

To assess the performance of each model we included four statistics of accuracy, recall, precision, and F1 score as follows:

  • Accuracy = (TP + TN)/(TP + TN + FP + FN)

  • Recall = TP/(TP + FN)

  • Precision = TP/(TP + FP)

  • F1 Score = (2 × Recall × Precision)/(Recall + Precision)

where TP, FP, TN, and FN refer to the number of true positive, false positive, true negative, and false negative cases, respectively. In this study, a true positive was a correctly classified case of AZA nonadherence; a false positive was a case classified as AZA nonadherence that was in fact adherent; a true negative was a correctly classified case of AZA adherence; and a false negative was a case classified as AZA adherence that was in fact nonadherent. A more comprehensive assessment was achieved using AUC, which has been proposed as an accurate measure of the predictive ability of learning algorithms.33
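The four formulas translate directly into code. The sketch below is a minimal illustration with a hypothetical function name; the zero-denominator handling (returning 0.0) is a convention chosen here, not something specified in the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, recall, precision and F1 from confusion-matrix counts.

    The positive class is AZA nonadherence, as defined in the text.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denominator = recall + precision
    f1 = 2 * recall * precision / denominator if denominator else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Toy counts for illustration only (not study data).
metrics = classification_metrics(tp=8, fp=2, tn=8, fn=2)
```

With these toy counts every metric equals 0.8, which makes the relationships between the formulas easy to check by hand.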

Results

Patient Characteristics Between Groups of AZA Adherence and Nonadherence

During the study period, 553 CD patients attending the GI Department were invited to take part in the present research. Among them, 34 met the exclusion criteria, 50 refused to participate, 21 had missing values in EMR, and 2 were unable to complete the questionnaire. As a result, a total of 446 CD patients on AZA maintenance therapy with an average AZA duration of 34.3 months were recruited and their characteristics are listed in Table 1. Of the study population (male 58.3%, mean age 31.7), 187 (41.9%) patients were nonadherent to AZA (MARS < 17, ie, adherence rate <80%) and differed in univariate analysis from the AZA adherence group (MARS 17–20) in age, marital status, education level, alcoholism, psychological distress, medication beliefs and knowledge (P < 0.05 for each; Table 1). In addition, the incidence rates of moderate-to-severe anxiety and depression were 18.2% (81/446) and 12.1% (54/446), comparable to those in a previous report.34

Table 1. Patient Characteristics Between Groups of AZA Adherence and Nonadherence

| Features | Adherence (n=259) | Nonadherence (n=187) | P value |
| --- | --- | --- | --- |
| Male (n [%]) | 161(62.2) | 99(52.9) | 0.051 |
| Age (Mean ± SD) | 32.8±11.1 | 30.2±12.0 | 0.022 |
| Married (n [%]) | 137(52.9) | 81(43.3) | 0.046 |
| Offspring (n [%]) | 102(39.4) | 76(40.6) | 0.789 |
| Education (n [%]) | | | <0.001 |
|   Primary school | 15(5.8) | 3(1.6) | |
|   Secondary school | 36(13.9) | 13(7.0) | |
|   High school | 95(36.7) | 41(21.9) | |
|   College | 100(38.6) | 97(51.9) | |
|   Postgraduate | 13(5.0) | 33(17.6) | |
| Family income per month (n [%]) | | | 0.986 |
|   >10 thousand USD | 21(8.1) | 16(8.6) | |
|   5–10 thousand USD | 35(13.5) | 22(11.8) | |
|   2–5 thousand USD | 77(29.7) | 56(29.9) | |
|   1–2 thousand USD | 95(36.7) | 69(36.9) | |
|   <1 thousand USD | 31(12.0) | 24(12.8) | |
| Cost of disease per year (n [%]) | | | 0.106 |
|   >10 thousand USD | 67(25.9) | 40(21.4) | |
|   5–10 thousand USD | 100(38.6) | 91(48.7) | |
|   <5 thousand USD | 92(35.5) | 56(29.9) | |
| Smoking (n [%]) | 9(3.5) | 11(5.9) | 0.225 |
| Alcoholism (n [%]) | 3(1.2) | 12(6.4) | 0.002 |
| Disease duration (yrs) (Mean ± SD) | 4.7±2.5 | 4.9±2.4 | 0.420 |
| Age of onset (n [%]) | | | 0.357 |
|   <17 years old | 24(9.3) | 15(8.0) | |
|   17–40 years old | 217(83.8) | 152(81.3) | |
|   >40 years old | 18(6.9) | 20(10.7) | |
| Location of lesions (n [%]) | | | 0.079 |
|   Ileum | 113(43.6) | 91(48.7) | |
|   Colon | 37(14.3) | 14(7.5) | |
|   Ileocolon | 109(42.1) | 82(43.9) | |
| Behaviour (n [%]) | | | 0.382 |
|   Non-stricture non-penetrating | 181(69.9) | 119(63.6) | |
|   Stricture | 48(18.5) | 42(22.5) | |
|   Penetrating | 30(11.6) | 26(13.9) | |
| Perianal disease (n [%]) | 83(32.0) | 59(31.6) | 0.912 |
| CD-related surgery (n [%]) | 44(17.0) | 40(21.4) | 0.241 |
| Anxiety (Mean ± SD) | 4.4±2.1 | 7.2±3.3 | <0.001 |
| Depression (Mean ± SD) | 5.9±2.9 | 7.3±3.1 | <0.001 |
| AZA usage (n [%]) | | | |
| Dosage (mg/d) (Mean ± SD) | 67.3±24.8 | 66.5±22.4 | 0.719 |
| Duration (months) (Mean ± SD) | 34.8±16.2 | 33.5±17.8 | 0.429 |
| Necessity belief (Mean ± SD) | 18.0±1.2 | 15.8±2.6 | <0.001 |
| Concerns belief (Mean ± SD) | 14.8±2.0 | 17.0±1.8 | <0.001 |
| Knowledge (Mean ± SD) | 6.0±1.8 | 5.6±1.8 | 0.016 |
| Side effect (n [%]) | 28(10.8) | 18(9.6) | 0.685 |

Abbreviations: SD, standard deviation; CD, Crohn’s disease; AZA, azathioprine.

Feature Selection by Random Forest and Univariate Analysis

Random forest was used to select SVM model features according to their importance measures. We started by building 100 classification trees based on the original dataset to identify features with the highest predictive accuracy, and the top 10 features are shown in descending order in Figure 2. Cross-examination of the random forest features and those of univariate analysis yielded eight common features to be selected for modeling: age, education, alcoholism, anxiety, depression, AZA necessity belief, AZA knowledge and AZA concerns belief.

Figure 2. Top 10 features with the highest importance identified by random forest.

Note: The importance score on the Y-axis was quantified by computing the OOB error.

Abbreviation: OOB, out-of-bag.

Development and Evaluation of the Logistic Regression, Back-Propagation Neural Network and SVM Models

The contribution of each independent variable to the logistic regression model and its statistical significance are listed in Table 2. This multivariate analysis identified AZA concerns belief (OR: 3.130, 95% CI: 1.673–5.854), education (OR: 2.199, 95% CI: 1.543–3.134), anxiety (OR: 1.549, 95% CI: 1.372–1.749) and depression (OR: 1.190, 95% CI: 1.080–1.312) as risk factors for AZA nonadherence. By contrast, AZA necessity belief (OR: 0.004, 95% CI: 0.0004–0.033) and AZA knowledge (OR: 0.805, 95% CI: 0.679–0.955) were protective factors against nonadherence.

Table 2. Predictive Factors for AZA Nonadherence in Patients with CD on Maintenance Therapy (Multivariate Analysis)

| Variables | B | SE | Wald χ2 | P value | OR | 95% CI for OR (Lower) | 95% CI for OR (Upper) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Concerns belief | 1.141 | 0.319 | 12.759 | <0.001 | 3.130 | 1.673 | 5.854 |
| Education | 0.788 | 0.181 | 19.001 | <0.001 | 2.199 | 1.543 | 3.134 |
| Anxiety | 0.438 | 0.062 | 50.009 | <0.001 | 1.549 | 1.372 | 1.749 |
| Depression | 0.174 | 0.050 | 12.258 | <0.001 | 1.190 | 1.080 | 1.312 |
| Knowledge | −0.217 | 0.087 | 6.187 | 0.013 | 0.805 | 0.679 | 0.955 |
| Necessity belief | −5.614 | 1.129 | 24.715 | <0.001 | 0.004 | 0.0004 | 0.033 |
| Constant | −1.156 | 1.359 | 0.723 | 0.395 | 0.315 | | |

Note: B, regression coefficient or regression constant.

Abbreviations: SE, standard error; CI, confidence interval; OR, odds ratio.
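The link between the B, SE, OR and CI columns of Table 2 is the standard relation OR = exp(B), with a Wald 95% interval of exp(B ± 1.96 × SE). The short sketch below applies it to the concerns belief row. The function name is an assumption, and small differences from the tabulated bounds are expected because B and SE are themselves rounded to three decimals.

```python
import math


def odds_ratio_ci(b: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a logistic coefficient."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Concerns belief row of Table 2: B = 1.141, SE = 0.319.
odds_ratio, lower, upper = odds_ratio_ci(1.141, 0.319)
```

This gives an odds ratio of about 3.13 with bounds close to the reported 1.673 and 5.854.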

The performance of the three models was compared using four evaluation measures (Table 3). The average accuracy was 81.6% for logistic regression, 85.9% for back-propagation neural network, and 87.7% for SVM. The SVM model also scored highest in recall, precision, and F1 score. Moreover, the AUC, which measures how well a model separates nonadherent from adherent patients, was highest for the SVM model at 0.930 (high accuracy), compared with 0.896 for logistic regression (moderate accuracy) and 0.912 for back-propagation neural network (high accuracy) (Figure 3).35 Taken together, these results indicated that the SVM model is the most appropriate of the three for predicting AZA nonadherence in maintenance therapy among Chinese CD patients.

Table 3. Predictive Performance of LR, BPNN and SVM Models

| Model | Accuracy (%) | Recall (%) | Precision (%) | F1 Score |
| --- | --- | --- | --- | --- |
| LR | 81.6 | 73.2 | 82.6 | 0.773 |
| BPNN | 85.9 | 83.0 | 83.7 | 0.832 |
| SVM | 87.7 | 86.2 | 85.6 | 0.855 |

Abbreviations: LR, logistic regression; BPNN, back-propagation neural network; SVM, support vector machine.

Figure 3. AUC of LR, BPNN and SVM models for prediction of AZA nonadherence.

Abbreviations: AUC, area under the receiver operating characteristic curve; LR, logistic regression; BPNN, back-propagation neural network; SVM, support vector machine; AZA, azathioprine.
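For readers unfamiliar with how an AUC value is obtained, one equivalent definition is the probability that a randomly chosen positive case (here, a nonadherent patient) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on toy data; it is an illustration of the metric, not the procedure used to draw Figure 3.

```python
def auc_from_scores(labels: list[int], scores: list[float], positive: int = 1) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == positive]
    negatives = [s for y, s in zip(labels, scores) if y != positive]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only: 1 marks the positive class.
toy_auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

In the toy data three of the four positive/negative pairs are ordered correctly, so the result is 0.75; a value of 0.5 corresponds to chance and 1.0 to perfect separation.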

Discussion

The present research is the first report of developing machine learning models to predict which Chinese CD patients may become nonadherent during AZA maintenance therapy. The connection between nonadherence and poor clinical outcomes has long been established in CD and other chronic conditions.3,7,8,36 A variety of efforts have been poured into regression analyses to identify predictors that can help profile patients’ adherence so that engagement programs can be developed and implemented to target low adherers.37–41 To circumvent the common limitation of potential interactions among independent variables in logistic regression, we applied back-propagation neural network and SVM modeling and compared them with logistic regression on the dataset of CD patients, and found that SVM showed the best prediction performance on AZA nonadherence. These models lay the groundwork for developing a user-friendly digital tool for clinical practitioners to identify nonadherent CD patients at an early stage, allowing appropriate interventions to be provided in time to improve adherence and disease prognosis.

The overall AZA nonadherence rate in the present study was 41.9%, and multivariate analysis revealed results in agreement with previous findings that education, psychological distress, and medication beliefs and knowledge could be predictors of medication nonadherence.7,9 The association between education and nonadherence has been postulated to arise because people with higher educational degrees often live busy professional and social lives, which may lead to forgetting to take medicine.38 Disease-related psychological distress is known to be another factor influencing disease progress and prognosis, as one of the latest studies proposed a potential interplay among IBD-associated stress, impaired mentalization and attachment insecurity. In this scenario, medication nonadherence could be both the cause and the outcome of patients’ emotional disturbance, giving adherence improvement a more critical role in CD management.42 In addition, we confirmed the strong relevance of two further elements to AZA nonadherence: medication beliefs (necessity and concern) and medication knowledge. These findings are not only in line with others’ work but also strongly imply that patient follow-up programs that tackle these factors, such as involving psychiatrists for mental health evaluation and intervention, and healthcare educators for medication-related coaching, should be developed as a comprehensive solution to improve both mental health and knowledge and so achieve higher AZA adherence.

All three models of the present study showed reliable prediction, with a minimum accuracy of 81.6% and AUC of 0.896. We believe this generally good performance may be attributable to the feature selection procedure, in which random forest and univariate analysis were combined to produce a compact feature set of eight variables. This step would not only reduce the time of model development but also help avoid overfitting, leading to enhanced model generalization and better classification. Another reason was the application of stratified 10-fold cross-validation in the modeling, a step that could identify selection bias or overfitting and give further insight into how the model would generalize to an independent dataset. Interestingly, we found that SVM outperformed back-propagation neural network and logistic regression in nearly every aspect, with an accuracy of 87.7%, recall of 86.2%, precision of 85.6%, F1 score of 0.855, and AUC of 0.930. This result suggests that SVM is a well-suited classifier for processing complex clinical data in AZA nonadherence prediction.

An intriguing recent study demonstrated a prediction model for thiopurine nonadherence among IBD/CD patients using random forest modeling, but based mainly on lab test data from complete blood count with differential and comprehensive chemistry panel.17 The goal of this model, which ran on a set of over 20 variables such as red cell distribution width and mean corpuscular volume, was to predict nonadherence using objective data and thereby to adjust thiopurine dose and regimen during therapy. It is possible that all these algorithms, regardless of the data processing technology or the data source, could work in a complementary manner to achieve the best classification performance. Therefore, various real-world studies need to be designed and carried out to evaluate the health outcomes of these models in combination with patient adherence enhancement programs.

There are several limitations. First, AZA was selected as the representative CD maintenance therapy agent because it was uniformly covered by China’s national payer system and thus most commonly used, while other agents such as anti-TNF need to be paid out-of-pocket by patients and account for only a small portion of usage. Hence, predictors and models developed in this study may not apply when medications other than AZA are chosen for maintenance therapy. Second, this was a single-center study in Shanghai, and the patient profile might be biased and not representative of the Chinese population as a whole. Data interpretation thus requires caution when extrapolated to CD patients in general. Last but not least, application of machine learning prediction models in daily clinical practice remains a challenge, which could result from the much higher regulatory standard for model performance and fidelity in medical use, or from the lack of causal interpretation when demanded by the practitioner.

Conclusions

In conclusion, the present research introduced three machine learning models to predict AZA nonadherence in Chinese CD patients and proposed an SVM model with better classifier performance than the back-propagation neural network or logistic regression model. This work also reconfirmed that higher educational degrees, elevated levels of anxiety and depression, weaker belief in medication necessity, greater medication concerns, and poorer medication knowledge were risk factors for nonadherence and thus can serve as targets to be further addressed by tailored engagement programs. We are in the process of developing a cloud-based solution with a built-in SVM model and mobile apps for caregivers and patients in an effort to integrate the adherence enhancement intervention into daily CD management. Future investigation will focus on clinical validation of this SVM model-enabled cloud solution on health and pharmacoeconomic outcomes, and on development of more comprehensive machine learning solutions that aggregate data from a multi-center patient pool and assist in AZA dosage adjustment and lifestyle coaching.

Author Contributions

All authors contributed to data analysis, drafting and revising the article, gave final approval of the version to be published, and agree to be accountable for all aspects of the work.

Disclosure

The authors report no conflicts of interest in this work, financial or otherwise.

Articles from Patient preference and adherence are provided here courtesy of Dove Press
