Table 2.
Out-of-distribution (OOD) performance of models trained from scratch versus pre-trained models (vanilla, SSL, and SWSL).
| Pre-training | Weights | Training scenario | Hospital 1 | Hospital 2 | Hospital 3 | Hospital 4 | Hospital 5 | Average |
|---|---|---|---|---|---|---|---|---|
| F | Random | S1 | 93.01 | 89.22 | 84.95 | 91.06 | 81.09 | 87.9 ± 4.22 |
| F | Random | S2 | 92.72 | 90.28 | 82.01 | 90 | 80.71 | 87.1 ± 4.73 |
| T | Vanilla | S1 | 98.75 | 96.03 | 94.42 | 96.65 | 90.54 | 95.3 ± 2.69 |
| T | Vanilla | S2 | 98.62 | 93.6 | 97.06 | 97.19 | 91.67 | 95.6 ± 2.52 |
| T | SSL | S1 | 98.52 | 96.92 | 94.8 | 97.46 | 96.61 | 96.9 ± 1.19 |
| T | SSL | S2 | 99.18 | 94.98 | 95.09 | 97.79 | 97.21 | 96.8 ± 1.59 |
| T | SWSL | S1 | 99.08 | 96.52 | 94.97 | 98.12 | 83.93 | 94.5 ± 5.37 |
| T | SWSL | S2 | 99.31 | 96.19 | 97.44 | 98.09 | 89.71 | 96.1 ± 3.3 |
| Average | | | 97.4 ± 1.95 | 94.2 ± 2.05 | 92.6 ± 4.01 | 95.8 ± 2.29 | 88.9 ± 4.48 | |
Each hospital column reports the OOD top-1 accuracy (%) on the hold-out set.
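As a minimal sketch of how the mean in the Average column can be recomputed, the snippet below averages the five per-hospital accuracies for two rows copied from the table. The helper name `row_average` is hypothetical and not from the source. The table does not define the ± term, so the sketch reproduces only the mean.

```python
from statistics import mean

# Per-hospital OOD top-1 accuracies (%), copied from two rows of Table 2.
rows = {
    ("Random", "S1"): [93.01, 89.22, 84.95, 91.06, 81.09],
    ("Vanilla", "S1"): [98.75, 96.03, 94.42, 96.65, 90.54],
}

def row_average(accuracies):
    """Hypothetical helper: mean over the hospitals, rounded to one decimal."""
    return round(mean(accuracies), 1)

# Recompute the mean part of the Average column for each selected row.
averages = {key: row_average(values) for key, values in rows.items()}

for (weights, scenario), avg in averages.items():
    print(f"{weights:8s} {scenario}: {avg}")
```

Under these assumptions, the recomputed means agree with the tabulated 87.9 and 95.3 for the two rows.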