Table 2. Details of machine learning for the diagnosis of hepatocellular carcinoma.
Author and year | Data type | Sample size | Machine learning model/algorithm | Results |
---|---|---|---|---|
Phan et al. 2020 [13] | Clinical data | N: 6,052 (training set: 70%; test set: 30%) | Convolutional neural network | AUC: 0.886 |
Nam et al. 2020 [14] | Clinical data | Training set: 424; validation set (independent external cohort): 316 | Deep neural network | c-index: 0.782 |
Sato et al. 2019 [15] | Clinical data | N: 1,580 (training set: 80%; development set and test set: 20%) | SVM, gradient boosting, random forest, neural network, deep learning, and other algorithms | Gradient boosting had the highest accuracy (87.34%); AUC: 0.94 |
Kim et al. 2021 [17] | Clinical data | Training set: 6,051; external validation cohorts: 5,817 patients from Korean centers and 1,640 from Western centers | GBM | c-index: 0.79 |
Wong et al. 2022 [18] | Clinical data | N: 124,006 (training set: 70%; test set: 30%) | AdaBoost, decision tree, and random forest | Random forest performance was stable (AUROC: 0.837) |
Schmauch et al. 2019 [29] | Imaging | Training set: 367; validation set: 177 | Deep learning | Weighted mean ROC-AUC: 0.891 |
Li et al. 2021 [30] | Imaging | N: 226 (training set: 80%; test set: 20%) | SVM | AUC: 0.86 |
Brehar et al. 2020 [31] | Imaging | N: 268 (training set: 66%; test set: 20%; validation set: 14%) | CNN, SVM, random forest, and AdaBoost | CNN performed best (accuracy: 91%; AUC: 0.95) |
Jin et al. 2021 [32] | Imaging | Training set: 262; validation set: 86; testing set: 86 | Deep learning | AUCs: 0.981, 0.942, and 0.900 in the training, validation, and testing cohorts, respectively |
Ren et al. 2021 [37] | Imaging | Training set: 149; test set: 38; validation set: 39 | SVM | AUC: 0.936 |
Yasaka et al. 2018 [38] | Imaging | Training set: 460; test set: 100 | CNN | AUC: 0.92 |
Mokrane et al. 2020 [39] | Imaging | Discovery set: 142; validation set: 36 | KNN, SVM, and random forest | AUCs: 0.70 and 0.66 in the discovery and validation cohorts, respectively |
Hamm et al. 2019 [40] | Imaging | Training set: 434; test set: 60 | CNN | AUC: 0.992 |
Liu et al. 2021 [41] | Imaging | N: 86 | SVM | AUC: 0.77 |
Mao et al. 2020 [42] | Imaging | Training set: 237; test set: 60 | XGBoost | AUC: 0.8014 |
Nebbia et al. 2020 [43] | Imaging | N: 99 | SVM | Highest AUC: 0.8669 (from the multiparametric MRI combination) |
Lin et al. 2019 [44] | Pathology | N: 113 | CNN | Accuracy: >90% |
Chen et al. 2020 [45] | Pathology | Training set: 261; test set: 50; internal validation set: 155; external validation set: 101 | CNN | Accuracy: 96.0% |
Kiani et al. 2020 [46] | Pathology | Training set: 70; test set: 80; validation set: 26 | CNN | Accuracy: 0.885 |
Zhang et al. 2020 [47] | Genes | Training set: 1,333; test set: 336 | SVM | Sensitivity: 91.93%; specificity: 100%; AUC: 0.9597 |
Chen et al. 2021 [48] | Genes | Training set: 361; validation set: 183 | Random forest, SVM, and KNN | Random forest performed best (AUC: 0.96; accuracy: 0.90) |
Tao et al. 2020 [49] | Genes | Training set: 209; validation sets: 76/99 | Random forest | AUC: >0.800 |
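AUC, area under the curve; AUROC (ROC-AUC), area under the receiver operating characteristic curve; c-index, concordance index; CNN, convolutional neural network; GBM, gradient boosting machine; KNN, k-nearest neighbors; MRI, magnetic resonance imaging; SVM, support vector machine.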
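
Most results in the table are reported as an AUC. As a reading aid, the following minimal sketch shows one way to compute the area under the ROC curve: as the proportion of (positive, negative) case pairs in which the positive case receives the higher predicted score, with ties counted as one half. For a binary outcome without censoring, the c-index is the same quantity. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from any of the cited studies.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise concordance.

    labels: iterable of 0/1 outcomes (1 = disease present).
    scores: iterable of model outputs, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # tie counts as half a concordant pair
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data: three positive and three negative cases.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auroc(labels, scores)
print(round(toy_auc, 3))  # 8 of 9 pairs are concordant -> 0.889
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the AUC values in the table can be read.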