2021 Aug 5;17(6):392–397. doi: 10.1002/cld.1071

TABLE 2.

Using ML Algorithms to Identify Patients at High Risk for NAFLD and Determine the Severity of the Disease

| Study | No. of Patients | Purpose | Gold Standard | Algorithms Used | Key Variables | AUROC or Accuracy (%) | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|
| Yip et al.¹ | 922 | Patient identification using EMR | MRI | Logistic regression; Ridge regression*; Decision tree; AdaBoost | ALT; High-density lipoprotein cholesterol; TGs; HbA1c; White blood cells; Hypertension | AUROC: 0.88 | 0.923 | 0.904 |
| Ma et al.² | 10,508 | Patient identification using EMR | Ultrasound | Logistic regression*; k-nearest neighbor; Support vector machines*; Naive Bayes; Bayes network*; Decision tree; AdaBoost; Bootstrap aggregating; Random forest; AODEs* | Age; Sex; BMI; ALT; AST; Alkaline phosphatase; GGT; TGs; Blood urea nitrogen; Bilirubin; Cholesterol; Creatinine; Fasting glucose; Uric acid | Accuracy: 83% | 0.68 | 0.946 |
| Sowa et al.⁵ | 126 | Determining severity of fibrosis | Liver biopsy | Logistic regression; k-nearest neighbor; Support vector machines; Rule based; Decision tree; Random forest* | ALT; AST; M30; M65; HA | AUROC: 0.67; Accuracy: 79% | 0.60 | 0.77 |
| Canbay et al.⁶ | 164 | Determining histological severity | Liver biopsy | Ensemble feature selection*; Logistic regression* | Age; Sex; BMI; ALT; AST; M30; M65; HbA1c; Fasting glucose; Cholesterol; Adiponectin | AUROC: 0.73 | | |
* Identifies the most significant ML algorithms used in the data analyses.
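
For readers less familiar with the performance columns in Table 2, the sketch below shows one common way sensitivity, specificity, accuracy, and AUROC can be computed from a classifier's outputs. It is an illustration only: the labels, scores, and the 0.5 decision threshold are made-up toy values, the function names are hypothetical, and none of it reproduces the data or methods of the studies listed above.

```python
# Toy illustration of the metrics reported in Table 2.
# All data below are invented for demonstration; they do not come from any cited study.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = disease present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical reference labels (e.g. from a gold standard) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)           # share of true cases detected
specificity = tn / (tn + fp)           # share of non-cases correctly ruled out
accuracy = (tp + tn) / len(y_true)     # share of all predictions that are correct
auc = auroc(y_true, scores)            # threshold-independent ranking quality

print(f"sensitivity={sensitivity:.3f}")  # 0.750
print(f"specificity={specificity:.3f}")  # 0.833
print(f"accuracy={accuracy:.3f}")        # 0.800
print(f"AUROC={auc:.3f}")                # 0.875
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUROC summarizes ranking performance across all thresholds, which is one reason a study may report a modest AUROC alongside a higher accuracy.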