. 2024 Jan 1;55(2):2300670. doi: 10.1080/07853890.2023.2300670

Table 1.

Performance of AI models developed for patients with CD.

Study | Objective | AI method used | Performance
Con et al. 2021 [25] | Predicting the response to anti-TNF therapy using conventional vs deep-learning models | Deep learning: feed-forward and recurrent neural network | AuROC and 95% CI:
  • conventional model: 0.659 (0.562–0.756)

  • feed-forward model: 0.710 (0.622–0.799; p = 0.25 vs conventional model)

  • recurrent neural-network model: 0.754 (0.674–0.834; p = 0.036 vs conventional model)

Waljee et al. 2018 [27] | Predicting the response to vedolizumab treatment | Random forest method | AuROC and 95% CI for corticosteroid-free biologic remission at week 52:
  • baseline data: 0.65 (0.53–0.77)

  • data through week 6 of vedolizumab treatment: 0.75 (0.64–0.86)

Waljee et al. 2017 [28] | Predicting the response to thiopurine treatment | Random forest method | AuROC and 95% CI for objective remission:
  • ML model using laboratory values and patient age: 0.79 (0.78–0.81)

  • method using 6-TGN: 0.49 (0.44–0.54)

Park et al. 2022 [29] | Predicting the non-durable response to anti-TNF therapy in CD using transcriptome imputed from genotypes | LASSO regression | AuROC (SD) for training and test datasets:
  • whole blood (for DPY19L3): 0.845 (0.027) and 0.839 (0.070)

  • colon transverse (for TXNDC16): 0.728 (0.060) and 0.711 (0.150)

  • small intestine terminal ileum (for ENSG00000270127): 0.738 (0.050) and 0.720 (0.120)


AuROC (SD) for the training and test datasets, respectively, for the most frequently selected combinations of two or three genes in the whole-blood expression imputation model:
  • DPY19L3: 0.845 (0.027) and 0.839 (0.070)

  • DPY19L3 and GSTT1: 0.918 (0.023) and 0.919 (0.040)

  • DPY19L3, GSTT1, NUCB1: 0.935 (0.024) and 0.935 (0.040)

He et al. 2021 [30] | Predicting response to ustekinumab using gene transcription profiling of patients with CD | Least absolute shrinkage and selection operator regression analysis | AuROC:
  • training dataset: 0.746

  • test dataset: 0.734

Stidham et al. 2021 [7] | Predicting surgical outcomes in US veterans with CD using ML models incorporating routinely collected laboratory studies | LASSO regularized logistic regression | Mean (SD) sensitivity, specificity, AuROC, Brier score, AuROC (random splitting method), and Brier score (random splitting method), respectively, for each of the five models:
  • best model (demographic + medication + last laboratory measurement + historical laboratory summary): 0.735 (0.013), 0.726 (0.013), 0.782 (0.0019), 0.0451 (0.0002), 0.775 (0.0447), 0.0465 (0.0018)

  • demographic + medication + last laboratory measurement: 0.722 (0.011), 0.714 (0.010), 0.761 (0.0014), 0.0455 (0.0002), 0.761 (0.0446), 0.0466 (0.0018)

  • demographic + medication: 0.631 (0.103), 0.702 (0.012), 0.714 (0.0016), 0.0473 (0.0002), 0.715 (0.0473), 0.0482 (0.0012)

  • last laboratory measurement alone: 0.690 (0.009), 0.670 (0.009), 0.691 (0.0021), 0.0477 (0.0002), 0.673 (0.0494), 0.0489 (0.0010)

  • random forest method for all variables: 0.673 (0.017), 0.652 (0.016), 0.686 (0.0049), 0.0488 (0.0002), 0.675 (0.0526), 0.0500 (0.0016)

Dong et al. 2019 [31] | Predicting surgery for therapeutic decision-making in Chinese patients with CD | RF, LR, SVM, DT, ANN | Accuracy, precision, true negative rate, and F1 score of the models, respectively:
  • RF: 96.26%, 72.13%, 97.37%, 0.7706

  • LR: 92.33%, 49.66%, 92.76%, 0.6308

  • DT: 95.05%, 64.05%, 96.24%, 0.7112

  • SVM: 92.36%, 50.21%, 93.00%, 0.6288

  • ANN: 90.89%, 46.83%, 92.24%, 0.5757

Venkatapurapu et al. 2022 [32] | Predicting temporal changes in mucosal health using a computational approach integrated with a mechanistic model of CD | A hybrid mechanistic-statistical platform | Overall sensitivity and specificity:
  • endoscopic remission: 80% and 69%

  • mucosal healing: 75% and 70%


Overall performance of the platform, by proportion of patients:
  • good (at least 70% of data points matched): 71%

  • fair (at least 50%): 23%

  • poor (less than 50%): 6%

6-TGN, 6-thioguanine nucleotide; ANN, artificial neural network; AuROC, area under the receiver operator characteristic curve; CD, Crohn’s disease; CI, confidence interval; DT, decision tree; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; ML, machine learning; RF, random forest; SD, standard deviation; SVM, support vector machine; TNF, tumor necrosis factor.
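
The table reports several metric families (AuROC, Brier score, sensitivity, specificity, accuracy, precision, true negative rate, and F1 score). As a reading aid, the following is a minimal Python sketch of how these quantities are conventionally defined. It is not the code used by any of the cited studies; the function names and the toy labels and probabilities are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-based metrics from confusion-matrix counts (hypothetical helper)."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "true_negative_rate": tn / (tn + fp),  # also called specificity
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AuROC as the probability that a random positive outranks a random negative.

    Ties count as half. This O(n_pos * n_neg) pairwise form is for clarity only.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not taken from any study in the table).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

if __name__ == "__main__":
    print(confusion_metrics(tp=8, fp=2, tn=85, fn=5))
    print("Brier score:", brier_score(toy_labels, toy_probs))
    print("AuROC:", auroc(toy_labels, toy_probs))
```

Accuracy can remain high when the outcome is rare even if precision is modest, which is one reason the table's accuracy and precision columns should be read together.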