2024 Jun 6;14:13049. doi: 10.1038/s41598-024-63916-w

Table 2.

Comparison of discriminative features of 11 machine learning models in the testing set.

| Characteristics | SVM | NN | MLP | GP | GBM | LR | NB | XGB | C5.0 | KNN | RF |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Apparent prevalence | 0.21 (0.08, 0.41) | 0.21 (0.08, 0.41) | 0.25 (0.11, 0.45) | 0.14 (0.06, 0.27) | 0.78 (0.71, 0.83) | 0.14 (0.06, 0.27) | 0.43 (0.24, 0.63) | 0.18 (0.09, 0.31) | 0.18 (0.06, 0.37) | 0.21 (0.08, 0.41) | 0.82 (0.76, 0.87) |
| True prevalence | 0.32 (0.16, 0.52) | 0.32 (0.16, 0.52) | 0.32 (0.16, 0.52) | 0.18 (0.09, 0.31) | 0.79 (0.73, 0.85) | 0.18 (0.09, 0.31) | 0.32 (0.16, 0.52) | 0.18 (0.09, 0.31) | 0.32 (0.16, 0.52) | 0.32 (0.16, 0.52) | 0.79 (0.73, 0.85) |
| Sensitivity | 0.56 (0.21, 0.86) | 0.56 (0.21, 0.86) | 0.67 (0.30, 0.93) | 0.56 (0.21, 0.86) | 0.96 (0.92, 0.98) | 0.56 (0.21, 0.86) | 0.89 (0.52, 1.00) | 0.78 (0.40, 0.97) | 0.56 (0.21, 0.86) | 0.56 (0.21, 0.86) | 1.00 (0.98, 1.00) |
| Specificity | 0.95 (0.74, 1.00) | 0.95 (0.74, 1.00) | 0.95 (0.74, 1.00) | 0.95 (0.83, 0.99) | 0.93 (0.81, 0.99) | 0.95 (0.83, 0.99) | 0.79 (0.54, 0.94) | 0.95 (0.83, 0.99) | 1.00 (0.82, 1.00) | 0.95 (0.74, 1.00) | 0.88 (0.75, 0.96) |
| PPV | 0.83 (0.36, 1.00) | 0.83 (0.36, 1.00) | 0.86 (0.42, 1.00) | 0.71 (0.29, 0.96) | 0.98 (0.95, 1.00) | 0.71 (0.29, 0.96) | 0.67 (0.35, 0.90) | 0.78 (0.40, 0.97) | 1.00 (0.48, 1.00) | 0.83 (0.36, 1.00) | 0.97 (0.93, 0.99) |
| NPV | 0.82 (0.60, 0.95) | 0.82 (0.60, 0.95) | 0.86 (0.64, 0.97) | 0.91 (0.78, 0.97) | 0.85 (0.72, 0.94) | 0.91 (0.78, 0.97) | 0.94 (0.70, 1.00) | 0.95 (0.83, 0.99) | 0.83 (0.61, 0.95) | 0.82 (0.60, 0.95) | 1.00 (0.91, 1.00) |
| PLR | 10.56 (1.44, 77.62) | 10.56 (1.44, 77.62) | 12.67 (1.78, 90.18) | 11.39 (2.61, 49.66) | 13.73 (4.61, 40.91) | 11.39 (2.61, 49.66) | 4.22 (1.72, 10.39) | 15.94 (3.95, 64.40) | Inf (NaN, Inf) | 10.56 (1.44, 77.62) | 8.60 (3.77, 19.60) |
| NLR | 0.47 (0.22, 0.98) | 0.47 (0.22, 0.98) | 0.35 (0.14, 0.89) | 0.47 (0.22, 0.97) | 0.05 (0.02, 0.09) | 0.47 (0.22, 0.97) | 0.14 (0.02, 0.91) | 0.23 (0.07, 0.79) | 0.44 (0.21, 0.92) | 0.47 (0.22, 0.98) | 0.00 (0.00, NaN) |

All predictive models were developed without data augmentation.

SVM support vector machine, NN neural network, MLP multi-layer perceptron, GP Gaussian process, GBM gradient boosting machine, LR logistic regression, NB naive Bayes, XGB XGBoost, C5.0 C5.0 decision tree, KNN k-nearest neighbors, RF random forest, PPV positive predictive value, NPV negative predictive value, PLR positive likelihood ratio, NLR negative likelihood ratio.
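All of the point estimates in the table derive from a single 2×2 confusion matrix per model. A minimal sketch of those formulas (not the authors' code; the counts `tp=5, fp=1, fn=4, tn=18` are a hypothetical confusion matrix chosen here for illustration, which happens to reproduce the SVM column's point estimates; confidence intervals are omitted):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimates of the table's metrics from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    sens = tp / (tp + fn)            # sensitivity (recall)
    spec = tn / (tn + fp)            # specificity
    return {
        "apparent_prevalence": (tp + fp) / total,  # fraction predicted positive
        "true_prevalence": (tp + fn) / total,      # fraction actually positive
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),       # positive predictive value
        "npv": tn / (tn + fn),       # negative predictive value
        # PLR = sens / (1 - spec); infinite when specificity is exactly 1,
        # which is why the C5.0 column reports Inf.
        "plr": sens / (1 - spec) if spec < 1 else float("inf"),
        # NLR = (1 - sens) / spec; zero when sensitivity is exactly 1,
        # which is why the RF column reports 0.00.
        "nlr": (1 - sens) / spec,
    }

# Hypothetical counts for illustration (n = 28, 9 positives).
m = diagnostic_metrics(tp=5, fp=1, fn=4, tn=18)
print({k: round(v, 2) for k, v in m.items()})
```

Note that PLR and NLR depend only on sensitivity and specificity, whereas PPV and NPV also depend on prevalence, so the latter two do not transfer to populations with a different case mix.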