2021 May 7;34(3):605–617. doi: 10.1007/s10278-021-00455-0

Table 1.

Classification metrics: accuracy (ACC), area under the curve (AUC), and f1-score for three data combinations, two classification models (RF and GBRT), and two training/test splitting methods (KF and RS). In both the KF and RS results, using combination 2 led to significantly improved accuracy (from ≤ 0.745 to ≥ 0.950) for both RF and GBRT. Combination 3 further increased the metrics in most cases (for RF under RS, ACC and f1-score stayed at 0.950). A similar pattern was observed for AUC and f1-score: using combination 2 increased the results (AUC: from ≤ 0.813 to ≥ 0.972; f1-score: from ≤ 0.747 to ≥ 0.950). Std. stands for standard deviation.

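| Method | Combination | Training data | Test data | Model | ACC (mean ± std) | AUC (mean ± std) | f1-score (mean ± std) |
|---|---|---|---|---|---|---|---|
| KF | (1) | Ori | Ori | RF | 0.726 ± 0.020 | 0.788 ± 0.007 | 0.726 ± 0.018 |
| KF | (1) | Ori | Ori | GBRT | 0.738 ± 0.009 | 0.797 ± 0.008 | 0.740 ± 0.009 |
| KF | (2) | Aug | Ori | RF | 0.955 ± 0.005 | 0.978 ± 0.005 | 0.955 ± 0.003 |
| KF | (2) | Aug | Ori | GBRT | 0.956 ± 0.004 | 0.982 ± 0.003 | 0.956 ± 0.003 |
| KF | (3) | Aug | Aug | RF | 0.966 ± 0.002 | 0.992 ± 0.001 | 0.966 ± 0.002 |
| KF | (3) | Aug | Aug | GBRT | 0.967 ± 0.002 | 0.993 ± 0.001 | 0.967 ± 0.002 |
| RS | (1) | Ori | Ori | RF | 0.692 ± 0.015 | 0.758 ± 0.019 | 0.671 ± 0.019 |
| RS | (1) | Ori | Ori | GBRT | 0.745 ± 0.012 | 0.813 ± 0.017 | 0.747 ± 0.012 |
| RS | (2) | Aug | Ori | RF | 0.950 ± 0.006 | 0.972 ± 0.007 | 0.950 ± 0.006 |
| RS | (2) | Aug | Ori | GBRT | 0.960 ± 0.005 | 0.984 ± 0.004 | 0.960 ± 0.004 |
| RS | (3) | Aug | Aug | RF | 0.950 ± 0.003 | 0.985 ± 0.002 | 0.950 ± 0.003 |
| RS | (3) | Aug | Aug | GBRT | 0.965 ± 0.003 | 0.992 ± 0.001 | 0.966 ± 0.003 |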

Ori. original, Aug. augmented
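
For readers who want to see how "mean ± std" entries of this kind can be produced, the following is a minimal sketch, not the authors' pipeline. It assumes binary labels (0/1), a sample standard deviation, a 0.5 decision threshold, and a hypothetical `fit_score` callable that trains on the training indices and returns one score per test sample; none of these details are stated in the table. It covers only a k-fold style split (KF), and the names `k_fold_indices`, `cross_validate`, and `summarize` are illustrative.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=None):
    """Split range(n) into k interleaved folds; shuffle first if a seed is given."""
    idx = list(range(n))
    if seed is not None:
        random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def summarize(values):
    """Format per-fold values as 'mean ± std' (sample standard deviation, an assumption)."""
    return f"{statistics.mean(values):.3f} ± {statistics.stdev(values):.3f}"


def cross_validate(y, fit_score, k=5, seed=None, threshold=0.5):
    """Evaluate a hypothetical fit_score(train_idx, test_idx) -> scores on each fold."""
    folds = k_fold_indices(len(y), k, seed)
    per_fold = {"ACC": [], "AUC": [], "f1-score": []}
    for i, test_idx in enumerate(folds):
        # Every sample outside the held-out fold is used for training.
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        scores = fit_score(train_idx, test_idx)
        y_test = [y[j] for j in test_idx]
        y_pred = [int(s >= threshold) for s in scores]
        per_fold["ACC"].append(accuracy(y_test, y_pred))
        per_fold["AUC"].append(auc(y_test, scores))
        per_fold["f1-score"].append(f1_binary(y_test, y_pred))
    return {name: summarize(values) for name, values in per_fold.items()}
```

In a setup like combination 2 (augmented training data, original test data), `fit_score` would be the place to apply augmentation to the training indices only, so that each held-out fold stays untouched; how the authors arranged this is not described in the table itself.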