Table 2.
Average area under the receiver operating characteristic curve and its standard deviation for all experiments.
| Setup | Model | Center A (n = 1,160), balanced class weight | Center A (n = 1,160), random oversampling | Center B (n = 631), balanced class weight | Center B (n = 631), random oversampling | Average of centers, balanced class weight | Average of centers, random oversampling |
|---|---|---|---|---|---|---|---|
| Cyclical | XGBoost | 0.58 ± 0.10 | 0.58 ± 0.10 | 0.62 ± 0.16 | 0.54 ± 0.15 | 0.60 ± 0.13 | 0.56 ± 0.13 |
| Cyclical | CatBoost | 0.62 ± 0.15 | 0.60 ± 0.14 | 0.61 ± 0.14 | 0.61 ± 0.16 | 0.62 ± 0.15 | 0.61 ± 0.15 |
| Cyclical | Random forest | 0.62 ± 0.11 | 0.61 ± 0.12 | 0.64 ± 0.13 | 0.64 ± 0.14 | 0.63 ± 0.12 | 0.63 ± 0.13 |
| Cyclical | Neural network (wide) | 0.62 ± 0.14 | 0.63 ± 0.14 | 0.67 ± 0.14 | 0.65 ± 0.17 | 0.65 ± 0.14 | 0.64 ± 0.16 |
| Cyclical | Neural network (narrow) | 0.64 ± 0.12 | 0.62 ± 0.13 | 0.68 ± 0.12 | 0.62 ± 0.15 | 0.66 ± 0.12 | 0.62 ± 0.14 |
| Stacking | XGBoost | 0.67 ± 0.10 | 0.61 ± 0.08 | 0.63 ± 0.17 | 0.60 ± 0.13 | 0.65 ± 0.14 | 0.61 ± 0.11 |
| Stacking | CatBoost | 0.64 ± 0.11 | 0.62 ± 0.10 | 0.65 ± 0.16 | 0.62 ± 0.13 | 0.65 ± 0.14 | 0.62 ± 0.12 |
| Stacking | Random forest | 0.63 ± 0.10 | 0.60 ± 0.09 | 0.64 ± 0.15 | 0.63 ± 0.15 | 0.64 ± 0.13 | 0.62 ± 0.12 |
| Stacking | Neural network (wide) | 0.64 ± 0.13 | 0.62 ± 0.13 | 0.64 ± 0.14 | 0.61 ± 0.11 | 0.64 ± 0.14 | 0.62 ± 0.12 |
| Stacking | Neural network (narrow) | 0.64 ± 0.12 | 0.65 ± 0.13 | 0.66 ± 0.14 | 0.59 ± 0.14 | 0.65 ± 0.13 | 0.62 ± 0.14 |
| Mono-center | XGBoost | 0.65 ± 0.11 | 0.59 ± 0.11 | 0.59 ± 0.17 | 0.56 ± 0.18 | 0.62 ± 0.14 | 0.58 ± 0.15 |
| Mono-center | CatBoost | 0.63 ± 0.11 | 0.59 ± 0.12 | 0.60 ± 0.15 | 0.64 ± 0.17 | 0.62 ± 0.13 | 0.62 ± 0.15 |
| Mono-center | Random forest | 0.65 ± 0.10 | 0.59 ± 0.11 | 0.62 ± 0.14 | 0.62 ± 0.16 | 0.64 ± 0.12 | 0.61 ± 0.14 |
| Mono-center | Neural network (wide) | 0.64 ± 0.11 | 0.62 ± 0.13 | 0.63 ± 0.15 | 0.61 ± 0.15 | 0.64 ± 0.13 | 0.62 ± 0.14 |
| Mono-center | Neural network (narrow) | 0.63 ± 0.12 | 0.58 ± 0.12 | 0.65 ± 0.16 | 0.60 ± 0.16 | 0.64 ± 0.14 | 0.59 ± 0.14 |
Rows list the classifiers under each setup (cyclical, stacking, or mono-center, i.e. internal validation); columns give the class-balancing technique applied for each center. The highest area under the receiver operating characteristic curve per center and on average is highlighted in bold.
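The two balancing techniques compared in the columns can be illustrated with a minimal sketch. The source does not state how the class weights or the oversampling were implemented, so this assumes the common "balanced" heuristic (weight = n_samples / (n_classes × class count)) and plain duplication of minority-class samples. The function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - cnt)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: 8 negatives and 2 positives.
labels = [0] * 8 + [1] * 2
rows = [[i] for i in range(10)]

weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels)
```

Class weighting leaves the training set unchanged and rescales each sample's contribution to the loss, whereas oversampling changes the training set itself by repeating minority-class samples.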