Table 3.
Ref. | AI classifier vs comparator | IBD type | Study design and sample size | Modality | Outcomes | Study results/validation cohort |
Waljee et al[59], 2018 | Random forest (RF). No comparator | CD/UC | Post-hoc analysis of prospective clinical trial, 594 CD patients | Veterans Health Administration Electronic Health Record (EHR) | Outpatient corticosteroids prescribed for IBD and inpatient hospitalizations associated with a diagnosis of IBD | AUC for the RF longitudinal model was 0.85 [95% confidence interval (CI): 0.84-0.85]. AUC for the RF longitudinal model using previous hospitalization or steroid use was 0.87 (95%CI: 0.87-0.88). Validation cohort included |
Uttam et al[60], 2019 | Support vector machines (SVM) vs nanoscale nuclear architecture mapping (NanoNAM) | CD/UC | Prospective cohort, 103 IBD patients | 3-dimensional NanoNAM of normal-appearing rectal biopsies | Colonic neoplasia | NanoNAM detects colonic neoplasia with an AUC of 0.87 ± 0.04, sensitivity of 0.81 ± 0.09, and specificity of 0.82 ± 0.07 in the independent validation set. Validation cohort included |
Waljee et al[61], 2017 | RF. No comparator | CD/UC | Retrospective cohort, 1080 IBD patients | EHR, lab values | Remission and clinical outcomes with thiopurines | AUC for algorithm-predicted remission in the validation set was 0.79 vs 0.49 for 6-TGN. The mean number of clinical events per year in patients with sustained algorithm-predicted remission (APR) was 1.08 vs 3.95 in those who did not have sustained APR (P < 1 × 10⁻⁵). Validation cohort included |
Popa et al[62], 2020 | Neural network model. No comparator | UC | Prospective cohort, 55 UC patients | Clinical and biological parameters and the endoscopic Mayo score | Disease activity after one year of anti-TNF treatment | The classifier achieved an excellent performance predicting the disease activity at one year with an accuracy of 90% and AUC 0.92 on the test set and an accuracy of 100% and an AUC of 1 on the validation set. Validation cohort included |
Douglas et al[45], 2018 | RF. No comparator | Pediatric CD | Cross-sectional, 20 CD patients, 20 healthy controls | Shotgun metagenomics (MGS), 16S rRNA gene sequencing | Response to induction therapy | 16S genera were the top dataset (accuracy = 77.8%; P = 0.008) for predicting response to therapy. MGS strain (P = 0.029), genus (P = 0.013), and KEGG pathway (P = 0.018) datasets could also classify patients according to therapy response, with accuracy = 72.2% for all three. Validation cohort included |
Waljee et al[63], 2010 | RF vs boosted trees, RuleFit | CD/UC | Cross-sectional, 774 IBD patients | EHR, lab values (thiopurine metabolites) | Response to thiopurine therapy | An RF algorithm using laboratory values and patient age differentiated clinical response from nonresponse in the model validation data set with an AUC of 0.856 (95%CI: 0.793-0.919). Validation cohort included |
Menti et al[64], 2016 | Naïve Bayes vs Bayesian additive regression trees vs Bayesian networks | CD/UC | Retrospective cohort, 152 CD patients | Genomic DNA, genetic polymorphism | Presence of extra-intestinal manifestations in IBD patients | Bayesian networks achieved an accuracy of 82% when considering only clinical factors and 89% when genetic information was also considered, outperforming the other techniques. Validation cohort included |
Waljee et al[65], 2017 | RF vs baseline regression model | CD/UC | Retrospective cohort, 20368 IBD patients | EHR, lab values | Corticosteroid-free biologic remission with vedolizumab | The AUC for corticosteroid-free biologic remission at week 52 using baseline data was only 0.65 (95%CI: 0.53-0.77), but was 0.75 (95%CI: 0.64-0.86) with data through week 6 of vedolizumab. Validation cohort included |
Morilla et al[66], 2019 | Deep neural networks. No comparator | UC | Retrospective cohort, 47 UC patients | Colonic microRNA profiles | Responses to therapy | A deep neural network-based classifier identified 9 microRNAs plus 5 clinical factors, routinely recorded at time of hospital admission, that were associated with responses of patients to treatment. This panel discriminated responders to steroids from non-responders with 93% accuracy (AUC, 0.91). Three algorithms, based on microRNA levels, identified responders to infliximab vs non-responders (84% accuracy, AUC 0.82) and responders to cyclosporine vs non-responders (80% accuracy, AUC 0.79). Validation cohort included |
Wang et al[67], 2020 | Back-propagation neural network (BPNN), SVM vs logistic regression | CD | Cross-sectional, 446 CD patients | EHR | Medication nonadherence to maintenance therapy | The average classification accuracy and AUC were 85.9% and 0.912 for BPNN, and 87.7% and 0.930 for SVM. Validation cohort included |
Bottigliengo et al[68], 2019 | Bayesian machine learning techniques (BMLTs) vs logistic regression | CD/UC | Retrospective cohort, 142 IBD patients | EHR, genetic polymorphisms | Presence of extra-intestinal manifestations in IBD patients | BMLTs had an AUC of 0.50 for classifying the presence of extra-intestinal manifestations. Validation cohort included |
Ghoshal et al[69], 2020 | Nonlinear artificial neural network (ANN) vs multivariate linear PCA | UC | Prospective cohort, 263 UC patients | EHR | Responses to therapy | The multilayer perceptron neural network was trained by back-propagation algorithm (10 networks retained out of 16 tested). The classification accuracy rate was 73% in correctly classifying response to medical treatment in UC patients. No validation cohort included |
Sofo et al[70], 2020 | SVM leave-one-out cross-validation. No comparator | UC | Retrospective cohort, 32 UC patients | EHR | Post-surgical complications after colectomy | Evaluating only preoperative features, machine learning algorithms were able to predict minor postoperative complications with a high strike rate (84.3%), high sensitivity (87.5%) and high specificity (83.3%) during the testing phase. Validation cohort included |
Kang et al[71], 2017 | ANN vs logistic regression | UC | Cross-sectional, 24 UC patients | Gene expression profiles | Response to anti-TNF | Balanced accuracy in cross-validation for predicting response to anti-TNF therapy in ulcerative colitis patients was 82%. Validation cohort included |
Babic et al[72], 1997 | CART vs back propagation neural network (BPNN) | CD/UC | Cross-sectional, 200 IBD patients | EHR | Quality of life | The best classification accuracy achieved did not exceed 80% in any case. Other classifiers, namely K-nearest-neighbor, learning vector quantization, and BPNN, confirmed that outcome. Validation cohort included |
Dong et al[73], 2019 | RF, SVM, ANN vs logistic regression | CD | Retrospective cohort, 239 CD patients | EHR, laboratory tests | Crohn's-related surgery | The RF predictive model performed better than the LR model in terms of accuracy (93.11% vs 91.15%), precision (53.42% vs 44.81%), F1 score (0.6016 vs 0.5763), TN rate (95.08% vs 92.00%), and AUC (0.8926 vs 0.8809). The AUCs were excellent at 0.9864 in RF, 0.9538 in LR, 0.8809 in DT, 0.9497 in SVM, and 0.9059 in ANN, respectively. Validation cohort included |
Lerrigo et al[74], 2019 | Latent Dirichlet allocation, unsupervised machine learning algorithm. No comparator | CD/UC | Retrospective cohort, 28623 IBD patients | Online posts from the Crohn’s & Colitis Foundation community forum | Impact of online community forums on well-being and their emotional content | 10702 (20.8%) posts were identified as expressing gratitude (40%), anxiety/fear (20.8%), empathy (18.2%), anger/frustration (13.4%), hope (13.2%), happiness (10.0%), sadness/depression (5.8%), shame/guilt (2.5%), and/or loneliness (2.5%). A common subtheme was the importance of fostering social support. No validation cohort included |
AI: Artificial intelligence; IBD: Inflammatory bowel disease; CD: Crohn’s disease; UC: Ulcerative colitis; AUC: Area under the curve; TNF: Tumor necrosis factor.
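Most studies in the table report accuracy, sensitivity, specificity, or AUC. As a reading aid, the sketch below shows how these four metrics are conventionally computed for a binary classifier. It is a minimal illustration only: the labels, scores, 0.5 threshold, and function names are hypothetical and are not taken from any of the cited studies, whose own evaluation pipelines may differ.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels coded as 1 (event) and 0 (no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity_specificity_accuracy(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP); accuracy = (TP+TN)/N."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher than a
    random negative case (Mann-Whitney formulation); ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = responder, 0 = non-responder (not from any study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sens, spec, acc = sensitivity_specificity_accuracy(y_true, y_pred)
print(sens, spec, acc, auc(y_true, scores))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking performance across all thresholds, which is one reason the values reported by different studies in the table are not directly comparable.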