Table 3. Differences in performance per algorithm, outcome, and dataset
| Metric (95% CI) | Original | RUS 50 | SMOTE 20 | SMOTE 30 | SMOTE 40 | SMOTE 50 | ADASYN 50 | P value vs original: RUS 50 | P value vs original: SMOTE 20 | P value vs original: SMOTE 30 | P value vs original: SMOTE 40 | P value vs original: SMOTE 50 | P value vs original: ADASYN 50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AUC | |||||||||||||
| LR | 0.77 (0.74; 0.8) | 0.77 (0.74; 0.8) | 0.77 (0.75; 0.79) | 0.77 (0.74; 0.79) | 0.77 (0.75; 0.79) | 0.77 (0.76; 0.79) | 0.82 (0.81; 0.83) | .980 | .859 | .847 | .942 | .787 | .005 |
| DT | 0.73 (0.69; 0.76) | 0.74 (0.72; 0.76) | 0.76 (0.74; 0.77) | 0.77 (0.75; 0.78) | 0.77 (0.74; 0.79) | 0.77 (0.75; 0.78) | 0.77 (0.75; 0.78) | .340 | .078 | .054 | .059 | .031 | .040 |
| XGB | 0.82 (0.79; 0.85) | 0.83 (0.80; 0.85) | 0.89 (0.88; 0.90) | 0.88 (0.86; 0.90) | 0.90 (0.89; 0.91) | 0.91 (0.90; 0.92) | 0.95 (0.95; 0.96) | .673 | .001 | .009 | .000 | .000 | .000 |
| RF | 0.82 (0.80; 0.85) | 0.83 (0.80; 0.86) | 0.91 (0.90; 0.92) | 0.90 (0.89; 0.92) | 0.92 (0.91; 0.93) | 0.93 (0.92; 0.94) | 0.96 (0.95; 0.96) | .531 | .000 | .000 | .000 | .000 | .000 |
| NN | 0.74 (0.71; 0.77) | 0.78 (0.75; 0.8) | 0.91 (0.9; 0.92) | 0.88 (0.86; 0.89) | 0.92 (0.91; 0.93) | 0.93 (0.92; 0.94) | 0.93 (0.93; 0.94) | .016 | .000 | .000 | .000 | .000 | .000 |
| SVM | 0.68 (0.64; 0.71) | 0.74 (0.69; 0.79) | 0.76 (0.75; 0.77) | 0.75 (0.74; 0.77) | 0.78 (0.76; 0.8) | 0.78 (0.77; 0.79) | 0.77 (0.76; 0.78) | .024 | .001 | .001 | .000 | .000 | .000 |
| Precision | |||||||||||||
| LR | 0.53 (0.37; 0.69) | 0.70 (0.65; 0.75) | 0.58 (0.52; 0.65) | 0.65 (0.61; 0.69) | 0.69 (0.66; 0.72) | 0.71 (0.68; 0.73) | 0.74 (0.73; 0.75) | .042 | .538 | .132 | .063 | .038 | .015 |
| DT | 0.25 (0.10; 0.40) | 0.68 (0.63; 0.72) | 0.56 (0.50; 0.61) | 0.59 (0.54; 0.63) | 0.65 (0.62; 0.67) | 0.71 (0.69; 0.72) | 0.68 (0.66; 0.71) | .000 | .000 | .000 | .000 | .000 | .000 |
| XGB | 0.67 (0.54; 0.8) | 0.74 (0.70; 0.79) | 0.75 (0.70; 0.79) | 0.74 (0.70; 0.78) | 0.78 (0.76; 0.81) | 0.80 (0.78; 0.82) | 0.86 (0.85; 0.87) | .249 | .296 | .298 | .100 | .049 | .010 |
| RF | 0.67 (0.44; 0.89) | 0.74 (0.69; 0.79) | 0.86 (0.81; 0.92) | 0.81 (0.78; 0.84) | 0.81 (0.79; 0.84) | 0.83 (0.81; 0.86) | 0.88 (0.87; 0.88) | .460 | .088 | .171 | .166 | .126 | .065 |
| NN | 0.45 (0.36; 0.54) | 0.70 (0.67; 0.74) | 0.77 (0.74; 0.8) | 0.75 (0.72; 0.77) | 0.82 (0.8; 0.84) | 0.85 (0.84; 0.87) | 0.86 (0.84; 0.87) | .000 | .000 | .000 | .000 | .000 | .000 |
| SVM | 0.00 (0; 0) | 0.65 (0.57; 0.72) | 0.35 (0.12; 0.58) | 0.62 (0.58; 0.67) | 0.65 (0.63; 0.67) | 0.69 (0.67; 0.71) | 0.67 (0.66; 0.69) | .000 | .007 | .000 | .000 | .000 | .000 |
| Recall | |||||||||||||
| LR | 0.09 (0.07; 0.12) | 0.66 (0.61; 0.7) | 0.20 (0.19; 0.22) | 0.36 (0.34; 0.39) | 0.54 (0.51; 0.57) | 0.72 (0.7; 0.73) | 0.77 (0.75; 0.78) | .000 | .000 | .000 | .000 | .000 | .000 |
| DT | 0.04 (0.01; 0.08) | 0.66 (0.59; 0.73) | 0.20 (0.13; 0.28) | 0.44 (0.28; 0.6) | 0.64 (0.59; 0.7) | 0.73 (0.7; 0.76) | 0.79 (0.74; 0.83) | .000 | .002 | .001 | .000 | .000 | .000 |
| XGB | 0.17 (0.13; 0.21) | 0.73 (0.70; 0.77) | 0.44 (0.40; 0.49) | 0.62 (0.59; 0.66) | 0.77 (0.75; 0.80) | 0.88 (0.86; 0.89) | 0.92 (0.91; 0.93) | .000 | .000 | .000 | .000 | .000 | .000 |
| RF | 0.08 (0.05; 0.11) | 0.73 (0.70; 0.77) | 0.39 (0.36; 0.42) | 0.58 (0.55; 0.61) | 0.77 (0.75; 0.78) | 0.88 (0.86; 0.89) | 0.91 (0.9; 0.92) | .000 | .000 | .000 | .000 | .000 | .000 |
| NN | 0.22 (0.18; 0.27) | 0.68 (0.64; 0.72) | 0.62 (0.6; 0.63) | 0.66 (0.63; 0.69) | 0.83 (0.8; 0.86) | 0.91 (0.89; 0.93) | 0.91 (0.9; 0.92) | .000 | .000 | .000 | .000 | .000 | .000 |
| SVM | 0.00 (0; 0) | 0.65 (0.59; 0.71) | 0.01 (0.0; 0.01) | 0.21 (0.19; 0.23) | 0.59 (0.56; 0.61) | 0.76 (0.74; 0.78) | 0.77 (0.76; 0.78) | .000 | .003 | .000 | .000 | .000 | .000 |
| Brier score | |||||||||||||
| LR | 0.08 (0.05; 0.11) | 0.20 (0.15; 0.24) | 0.13 (0.12; 0.15) | 0.17 (0.14; 0.2) | 0.19 (0.16; 0.22) | 0.19 (0.18; 0.21) | 0.17 (0.16; 0.19) | .000 | .000 | .000 | .000 | .000 | .000 |
| DT | 0.09 (0.05; 0.13) | 0.21 (0.14; 0.28) | 0.14 (0.06; 0.21) | 0.17 (0.01; 0.33) | 0.19 (0.13; 0.24) | 0.19 (0.16; 0.22) | 0.19 (0.15; 0.23) | .000 | .000 | .000 | .000 | .000 | .000 |
| XGB | 0.07 (0.04; 0.11) | 0.17 (0.14; 0.21) | 0.10 (0.06; 0.14) | 0.13 (0.09; 0.16) | 0.13 (0.1; 0.15) | 0.12 (0.11; 0.13) | 0.09 (0.07; 0.1) | .000 | .000 | .000 | .000 | .000 | .013 |
| RF | 0.08 (0.05; 0.11) | 0.17 (0.13; 0.21) | 0.10 (0.07; 0.13) | 0.12 (0.09; 0.15) | 0.12 (0.1; 0.14) | 0.12 (0.1; 0.13) | 0.09 (0.08; 0.11) | .000 | .000 | .000 | .000 | .000 | .001 |
| NN | 0.09 (0.04; 0.14) | 0.20 (0.15; 0.24) | 0.08 (0.07; 0.1) | 0.12 (0.09; 0.16) | 0.11 (0.08; 0.14) | 0.10 (0.08; 0.12) | 0.10 (0.09; 0.11) | .000 | .293 | .001 | .033 | .156 | .286 |
| SVM | 0.09 (0; 0) | 0.21 (0.15; 0.26) | 0.15 (0.15; 0.16) | 0.18 (0.16; 0.2) | 0.19 (0.16; 0.21) | 0.19 (0.18; 0.21) | 0.19 (0.18; 0.21) | .000 | .000 | .000 | .000 | .000 | .000 |
Note: Yellow marking indicates a significant difference compared to the original dataset.
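
As a reading aid for the precision, recall, and Brier score rows, the following is a minimal sketch of the standard definitions of these metrics. It is not the authors' code: the function names `brier_score` and `precision_recall` are illustrative, and returning 0 when a ratio is undefined is an assumption made here. A lower Brier score indicates better-calibrated probabilities, whereas higher precision and recall are better.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    # Assumption: report 0.0 when the denominator is zero
    # (for example, when the model never predicts the positive class).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy imbalanced sample: one positive case among five.
    y_true = [0, 0, 0, 0, 1]
    y_prob = [0.1, 0.1, 0.1, 0.1, 0.1]            # model always predicts low risk
    y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 threshold is illustrative

    print("Brier score:", round(brier_score(y_true, y_prob), 3))
    print("Precision, recall:", precision_recall(y_true, y_pred))
```

In this toy sample the model never predicts the positive class, so precision and recall are both 0 while the Brier score stays low. The same pattern appears in the table for models trained on the original, imbalanced data, such as the SVM row with precision and recall of 0.00.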