Table 2. Model performances based on the “real” secondary dataset

| | Trained on dataset A (real data) (95% CI) | Trained on dataset B (synthetic data ×1) (95% CI) | Trained on dataset C (synthetic data ×2) (95% CI) | Trained on dataset D (synthetic data ×5) (95% CI) |
|---|---|---|---|---|
| MILO’s best models | MILO GBM | MILO SVM | MILO DNN | MILO DNN |
| ROC-AUC | 0.95 (0.87–1) | 0.83 (0.63–1) | 0.91 (0.8–1) | 0.55 (0.48–0.62) |
| Accuracy (%) | 90 (84–95) | 91 (85–95) | 71 (63–78) | 54 (46–62) |
| Sensitivity (%) | 89 (83–94) | 93 (87–96) | 67 (59–75) | 49 (40–58) |
| Specificity (%) | 100 (81–100) | 77 (50–93) | 100 (81–100) | 94 (71–99) |
| MILO’s best RF models | MILO RF | MILO RF | MILO RF | MILO RF |
| ROC-AUC | 0.96 (0.82–1) | 0.77 (0.67–0.87) | 0.87 (0.77–0.97) | 0.66 (0.52–0.8) |
| Accuracy (%) | 89 (83–93) | 71 (63–78) | 74 (66–81) | 56 (48–64) |
| Sensitivity (%) | 88 (81–93) | 69 (60–76) | 72 (64–80) | 53 (44–61) |
| Specificity (%) | 100 (81–100) | 88 (64–99) | 88 (64–99) | 82 (57–96) |
| Non-MILO RF models | Non-MILO RF | Non-MILO RF | Non-MILO RF | Non-MILO RF |
| ROC-AUC | 0.97 (0.94–1) | 0.73 (0.60–0.88) | 0.83 (0.71–0.92) | 0.68 (0.57–0.82) |
| Accuracy (%) | 77 (70–84) | 62 (54–69) | 64 (56–72) | 39 (31–47) |
| Sensitivity (%) | 75 (66–82) | 61 (52–69) | 64 (55–72) | 40 (32–49) |
| Specificity (%) | 100 (81–100) | 71 (44–90) | 71 (44–90) | 29 (10–56) |
DNN = deep neural network, GBM = gradient boosting machine, RF = random forest, SVM = support vector machine.
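For readers who want to reproduce metrics of this form from their own predictions, the sketch below shows one common way to compute ROC-AUC, accuracy, sensitivity, and specificity with percentile-bootstrap 95% confidence intervals. This is an illustrative example only, not the authors’ pipeline; the array names `y_true`, `y_pred`, and `y_score` are hypothetical, and scikit-learn/NumPy are assumed.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score, confusion_matrix

def metrics_with_ci(y_true, y_pred, y_score, n_boot=2000, seed=0):
    """Point estimates plus bootstrap 95% CIs for the metrics in Table 2.

    y_true  : binary ground-truth labels (0/1)
    y_pred  : binary predicted labels (0/1)
    y_score : continuous scores/probabilities for the positive class
    (All names are hypothetical; this is a sketch, not the study code.)
    """
    rng = np.random.default_rng(seed)
    y_true, y_pred, y_score = map(np.asarray, (y_true, y_pred, y_score))

    def compute(idx):
        tn, fp, fn, tp = confusion_matrix(
            y_true[idx], y_pred[idx], labels=[0, 1]
        ).ravel()
        return {
            "ROC-AUC": roc_auc_score(y_true[idx], y_score[idx]),
            "Accuracy (%)": accuracy_score(y_true[idx], y_pred[idx]) * 100,
            "Sensitivity (%)": tp / (tp + fn) * 100,
            "Specificity (%)": tn / (tn + fp) * 100,
        }

    point = compute(np.arange(len(y_true)))

    # Resample cases with replacement; skip resamples missing a class.
    boots = []
    while len(boots) < n_boot:
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue
        boots.append(compute(idx))

    # Percentile bootstrap: 2.5th and 97.5th percentiles bound the 95% CI.
    return {
        m: (point[m],
            np.percentile([b[m] for b in boots], 2.5),
            np.percentile([b[m] for b in boots], 97.5))
        for m in point
    }
```

Note that the exact interval method used in the study is not specified here; other CI constructions (e.g., Clopper–Pearson for proportions) would give somewhat different bounds.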