Table 5. Performance of the best-performing models on test sets A and B, selected based on their average AUROC on the validation sets, as shown in Supplementary Section C. Model type indicates the type of the base learners within the final selected ensemble. All metrics were computed using bootstrapping with 1,000 iterations [46]; values in parentheses denote the bootstrap confidence intervals.
| Complication | Result | Test Set A | Test Set B |
|---|---|---|---|
| SBI | Model Type | LR | LR |
| | AUROC | 0.902 (0.862, 0.939) | 0.859 (0.762, 0.932) |
| | AUPRC | 0.436 (0.297, 0.609) | 0.387 (0.188, 0.623) |
| | Calibration Slope | 0.933 (0.321, 1.370) | 1.031 (−0.066, 1.550) |
| | Calibration Intercept | 0.031 (−0.111, 0.213) | 0.010 (−0.164, 0.273) |
| AKI | Model Type | LR | LR |
| | AUROC | 0.906 (0.856, 0.948) | 0.891 (0.804, 0.961) |
| | AUPRC | 0.436 (0.278, 0.631) | 0.387 (0.115, 0.679) |
| | Calibration Slope | 0.655 (0.043, 1.292) | 1.370 (−0.050, 2.232) |
| | Calibration Intercept | 0.059 (−0.136, 0.251) | −0.072 (−0.183, 0.154) |
| ARDS | Model Type | LR | LGBM |
| | AUROC | 0.854 (0.789, 0.909) | 0.827 (0.646, 0.969) |
| | AUPRC | 0.288 (0.172, 0.477) | 0.399 (0.150, 0.760) |
| | Calibration Slope | 0.598 (0.028, 1.149) | 0.742 (−0.029, 1.560) |
| | Calibration Intercept | 0.000 (−0.159, 0.164) | 0.050 (−0.166, 0.243) |
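The sketch below illustrates, under stated assumptions, how metrics of this kind can be bootstrapped: cases are resampled with replacement for each of the 1,000 iterations, each metric is recomputed on the resample, and intervals are taken from the resulting distribution. The function name `bootstrap_metrics`, the percentile interval, and the calibration convention (slope and intercept both taken from one logistic recalibration of the predicted log-odds) are assumptions for illustration, not the paper's confirmed procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score


def bootstrap_metrics(y_true, y_prob, n_iter=1000, seed=0):
    """Bootstrap AUROC, AUPRC, and calibration slope/intercept.

    Returns per-iteration metric arrays from which point estimates and
    intervals can be derived. A minimal sketch, not the paper's exact code.
    """
    rng = np.random.default_rng(seed)
    n = len(y_true)
    out = {"auroc": [], "auprc": [], "cal_slope": [], "cal_intercept": []}
    for _ in range(n_iter):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        yt, yp = y_true[idx], y_prob[idx]
        if yt.min() == yt.max():             # skip single-class resamples
            continue
        out["auroc"].append(roc_auc_score(yt, yp))
        out["auprc"].append(average_precision_score(yt, yp))
        # Calibration slope/intercept via logistic recalibration: regress
        # outcomes on the log-odds of the predicted probabilities (one
        # common convention; slope 1 and intercept 0 indicate calibration).
        yp_clip = np.clip(yp, 1e-6, 1 - 1e-6)
        logit = np.log(yp_clip / (1 - yp_clip)).reshape(-1, 1)
        recal = LogisticRegression(C=1e6).fit(logit, yt)  # large C ~ no penalty
        out["cal_slope"].append(recal.coef_[0, 0])
        out["cal_intercept"].append(recal.intercept_[0])
    return {k: np.asarray(v) for k, v in out.items()}


# Hypothetical usage: 95% percentile interval for AUROC on a test set.
# boot = bootstrap_metrics(y_test, model.predict_proba(X_test)[:, 1])
# lo, hi = np.percentile(boot["auroc"], [2.5, 97.5])
```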