Table 3.
Calibration and discrimination tests of six machine learning models by both internal and external validations.
| Validation | Algorithm | Calibration: slope (95% CI) | Calibration: intercept (95% CI) | Discrimination: AUROC (95% CI) | Discrimination: Prec. (95% CI) * |
|---|---|---|---|---|---|
| Internal | LR | 1.08 (1.08, 1.09) | −0.04 (−0.04, −0.03) | 0.70 (0.69, 0.70) | 0.78 (0.78, 0.78) |
| | DT | 0.99 (0.99, 1.00) | 0.01 (0.01, 0.01) | 0.66 (0.66, 0.67) | 0.73 (0.72, 0.74) |
| | ANN | 0.64 (0.63, 0.64) | 0.14 (0.14, 0.15) | 0.65 (0.64, 0.67) | 0.74 (0.73, 0.75) |
| | RF | 1.54 (1.54, 1.54) | −0.27 (−0.27, −0.26) | 0.86 (0.85, 0.86) | 0.86 (0.85, 0.86) |
| | SVM | 2.68 (2.66, 2.70) | −0.89 (−0.90, −0.88) | 0.68 (0.67, 0.68) | 0.78 (0.76, 0.79) |
| | Ens. | 1.21 (1.21, 1.22) | −0.13 (−0.13, −0.12) | 0.70 (0.70, 0.71) | 0.78 (0.77, 0.78) |
| External, geographical split | LR | 1.80 (1.76, 1.83) | −0.34 (−0.35, −0.32) | 0.74 (0.73, 0.76) | 0.68 (0.67, 0.70) |
| | DT | 0.69 (0.67, 0.71) | 0.15 (0.14, 0.16) | 0.60 (0.59, 0.61) | 0.80 (0.79, 0.81) |
| | ANN | 0.75 (0.73, 0.77) | 0.08 (0.07, 0.09) | 0.67 (0.64, 0.70) | 0.55 (0.52, 0.58) |
| | RF | 1.47 (1.45, 1.50) | −0.19 (−0.21, −0.18) | 0.76 (0.76, 0.77) | 0.82 (0.81, 0.83) |
| | SVM | 3.12 (3.02, 3.21) | −1.07 (−1.12, −1.02) | 0.62 (0.61, 0.62) | 0.54 (0.52, 0.57) |
| | Ens. | 1.52 (1.49, 1.55) | −0.28 (−0.30, −0.26) | 0.72 (0.71, 0.73) | 0.70 (0.68, 0.72) |
| External, temporal split | LR | 0.74 (0.72, 0.76) | 0.16 (0.15, 0.17) | 0.62 (0.62, 0.63) | 0.77 (0.76, 0.77) |
| | DT | 0.92 (0.90, 0.93) | 0.08 (0.08, 0.09) | 0.63 (0.62, 0.63) | 0.69 (0.68, 0.70) |
| | ANN | 0.30 (0.29, 0.31) | 0.34 (0.33, 0.35) | 0.58 (0.58, 0.59) | 0.71 (0.70, 0.72) |
| | RF | 1.09 (1.08, 1.11) | 0.02 (0.02, 0.03) | 0.70 (0.70, 0.70) | 0.78 (0.78, 0.79) |
| | SVM | 2.25 (2.20, 2.30) | −0.65 (−0.67, −0.62) | 0.63 (0.63, 0.63) | 0.72 (0.71, 0.73) |
| | Ens. | 0.74 (0.72, 0.76) | 0.15 (0.14, 0.16) | 0.61 (0.61, 0.62) | 0.74 (0.73, 0.74) |
AUROC, area under the receiver operating characteristic curve; LR, machine learning-optimized logistic regression; DT, decision tree; ANN, artificial neural network; RF, random forest; SVM, support vector machine; Ens., ensemble algorithm.
* For a specificity of ∼90%.
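
To illustrate what reporting a metric "for a specificity of ∼90%" can mean in practice, the following minimal sketch fixes the decision threshold at the most lenient score that still keeps specificity at or above the target, then reads off the metric at that threshold. It assumes that "Prec." denotes precision (positive predictive value), that labels are coded 0/1, and that higher scores indicate the positive class. The function name `precision_at_specificity` and the toy data are hypothetical and are not taken from the study; this is not the authors' implementation.

```python
def precision_at_specificity(y_true, scores, target_specificity=0.90):
    """Return (threshold, specificity, precision) at the lowest score
    threshold whose specificity is still >= target_specificity.

    A case is predicted positive when its score is >= threshold.
    Assumption: labels are 0 (negative) or 1 (positive).
    """
    n_neg = sum(1 for y in y_true if y == 0)
    if n_neg == 0 or n_neg == len(y_true):
        raise ValueError("need at least one positive and one negative case")

    best = None
    # Walk candidate thresholds from strictest to most lenient.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        specificity = (n_neg - fp) / n_neg
        if specificity < target_specificity:
            break  # lower thresholds can only add more false positives
        # tp + fp >= 1 because thr is itself one of the observed scores.
        best = (thr, specificity, tp / (tp + fp))

    if best is None:
        raise ValueError("no threshold reaches the target specificity")
    return best


# Hypothetical toy data: 10 negatives followed by 5 positives.
y_true = [0] * 10 + [1] * 5
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
          0.50, 0.60, 0.70, 0.90, 0.20]

threshold, specificity, precision = precision_at_specificity(y_true, scores)
print(threshold, specificity, precision)
```

Choosing the most lenient qualifying threshold keeps sensitivity as high as the specificity constraint allows; other conventions (for example, the threshold whose specificity is closest to 90%) are equally plausible readings of the footnote.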