2021 Nov 30;9(12):1662. doi: 10.3390/healthcare9121662

Table 2.

Summary of data processing and performance of machine-learning algorithms in the enrolled studies.

| Study | Feature Selection Algorithm | Feature Selection Method | Data Splitting | Machine Learning Algorithm | AUROC |
|---|---|---|---|---|---|
| Kate et al. [60] | NR | NR | ten-fold cross-validation | naïve Bayes | 0.654 |
| | | | | SVM | 0.621 |
| | | | | decision trees | 0.639 |
| | | | | logistic regression | 0.660 |
| Thottakkara et al. [61] | LASSO | embedded method | training data (70%); validation (30%) | naïve Bayes | 0.819 |
| | | | | generalized additive model | 0.858 |
| | | | | logistic regression | 0.853 |
| | | | | support vector machine | 0.857 |
| Davis et al. [62] | according to clinical experience or previous report | NR | five-fold cross-validation | random forest | 0.73 |
| | | | | neural network | 0.72 |
| | | | | naïve Bayes | 0.69 |
| | | | | logistic regression | 0.78 |
| Cheng et al. [63] | according to clinical experience or previous report | NR | ten-fold cross-validation | random forest | 0.765 |
| | | | | AdaBoostM1 | 0.751 |
| | | | | logistic regression | 0.763 |
| Ibrahim et al. [64] | LASSO | embedded method | Monte Carlo cross-validation | logistic regression | 0.79 |
| Koola et al. [65] | LASSO | embedded method | five-fold cross-validation | logistic regression | 0.93 |
| | | | | naïve Bayes | 0.73 |
| | | | | support vector machines | 0.90 |
| | | | | random forest | 0.91 |
| | | | | gradient boosting | 0.88 |
| Koyner et al. [66] | tree-based method | embedded method | ten-fold cross-validation | gradient boosting | 0.9 |
| Huang et al. [67] | XGBoost and LASSO | embedded method | training data (70%); validation (30%) | gradient boosting | 0.728 |
| | | | | logistic regression | 0.717 |
| Lin et al. [68] | according to clinical experience or previous report | NR | five-fold cross-validation | SVM | 0.86 |
| Simonov et al. [69] | according to clinical experience or previous report | NR | training data (67%); validation (33%) | discrete-time logistic regression | 0.74 |
| Huang et al. [70] | stepwise backward selection, LASSO, permutation-based selection | embedded method | training (50%); validation (50%) | generalized additive model | 0.777 |
| Tomašev et al. [71] | L1 regularization | embedded method | training (80%); validation (5%); calibration (5%); test (10%) | recurrent neural network | 0.934 |
| Adhikari et al. [72] | F-test | filter method | five-fold cross-validation | random forest | 0.86 |
| Flechet et al. [73] | according to clinical experience or previous report | NR | NR | random forest | 0.78 |
| Parreco et al. [74] | NR | NR | NR | gradient boosting | 0.834 |
| | | | | logistic regression | 0.827 |
| | | | | deep learning | 0.817 |
| Xu et al. [75] | gradient boosting | embedded method | five-fold cross-validation | gradient boosting | 0.749 |
| Tran et al. [76] | NR | NR | Scikit-learn cross validation | k-nearest neighbor | 0.92 |
| Zhang et al. [77] | XGBoost | embedded method | bootstrap validation | gradient boosting | 0.86 |
| Zimmerman et al. [78] | logistic regression | embedded method | five-fold cross-validation | logistic regression | 0.783 |
| | | | | random forest | 0.779 |
| | | | | neural network | 0.796 |
| Rashidi et al. [79] | according to clinical experience or previous report | NR | Scikit-learn cross validation | recurrent neural network | 0.92 |
| Zhou et al. [80] | NR | NR | five-fold cross-validation | logistic regression | 0.73 |
| | | | | linear kernel SVM | 0.84 |
| | | | | Gaussian kernel SVM | 0.77 |
| | | | | random forest | 0.89 |
| Martinez et al. [81] | LASSO | embedded method | ten-fold cross-validation | random forest | not provided |
| Lei et al. [82] | NR | NR | training data (70%); validation (30%) | gradient boosting | 0.8 |
| Lei et al. [82] | NR | NR | training data (70%); validation (30%) | gradient boosting | 0.772 |
| | | | | light gradient boosted machine | 0.725 |
| | | | | random forest | 0.662 |
| | | | | decision tree | 0.628 |
| Qu et al. [84] | NR | NR | ten-fold cross-validation | random forest | 0.821 |
| | | | | classification and regression tree | 0.8033 |
| | | | | logistic regression | 0.8728 |
| | | | | extreme gradient boosting | 0.9193 |
| Tseng et al. [85] | tree-based method | embedded method | five-fold cross-validation | random forest | 0.839 |
| | | | | random forest with extreme gradient boosting | 0.843 |
| Sun et al. [86] | Boruta algorithm | wrapper method | ten-fold cross-validation | random forest | 0.82 |
| | | | | logistic regression | 0.69 |
| Churpek et al. [87] | gradient boosting | embedded method | ten-fold cross-validation | gradient boosted machine | 0.72 |
| Hsu et al. [88] | XGBoost and LASSO | embedded method | five-fold cross-validation | logistic regression | 0.767 |
| Penny-Dimri et al. [89] | tree-based method | embedded method | five-fold cross-validation | logistic regression | 0.77 |
| | | | | gradient boosted machine | 0.78 |
| | | | | neural networks | 0.77 |
| Li et al. [90] | LASSO | embedded method | ten-fold cross-validation | Bayesian networks | 0.736 |

AUROC: area under the receiver operating characteristic curve; LASSO: least absolute shrinkage and selection operator; NR: not reported; SAPS: simplified acute physiology score; SVM: support vector machine; XGB: eXtreme Gradient Boosting.
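
For readers less familiar with the two quantities that recur in Table 2, the following is a minimal Python sketch of how an AUROC value and a k-fold data split can be computed. It is an illustration only and is not taken from any of the enrolled studies: the function names `auroc` and `k_fold_indices` are hypothetical, the split is unshuffled and unstratified, and the pairwise AUROC formula is chosen for readability over speed.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs; every sample is used
    for validation exactly once. No shuffling or stratification (assumption)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


# Toy example: two negatives and two positives with predicted risks.
example_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
```

In a cross-validated study, a model would be fitted on each training index set and `auroc` evaluated on the matching validation set, with the per-fold values then averaged. A single hold-out split such as "training data (70%); validation (30%)" corresponds to evaluating on one fixed validation set instead.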