Table 3. Quantitative results for the classification models evaluated on the external test dataset (Hong Kong hospitals dataset).
| Bone suppression of external test data | None | Gusarev | Rajaraman |
|---|---|---|---|
| COVID-Net CXR2 | | | |
| Sensitivity (%) | 80.93 | 77.81 | 71.25 |
| Specificity (%) | 56.18 | 60.42 | 67.95 |
| NPV (%) | 82.67 | 81.51 | 79.28 |
| Accuracy (%) | 65.63 | 67.06 | 69.21 |
| AUC ± SE95% | 0.686±0.031 | 0.691±0.031 | 0.696±0.032 |
| VGG16-Modified | | | |
| Sensitivity (%) | 52.81 | 61.88 | 61.88 |
| Specificity (%) | 86.87 | 78.76 | 84.56 |
| NPV (%) | 74.88 | 76.98 | 78.21 |
| Accuracy (%) | 73.87 | 72.32 | 75.90 |
| AUC ± SE95% | 0.698±0.031 | 0.703±0.032 | 0.732±0.031* |
For the COVID-Net CXR2 architecture, the same model was tested with non-suppressed, Gusarev-suppressed and Rajaraman-suppressed external test data. For the VGG16-Modified architecture, separate models trained on non-suppressed, Gusarev-suppressed and Rajaraman-suppressed data were each tested with the correspondingly suppressed external test data (e.g., the model trained on non-suppressed data was tested with non-suppressed test data). *, significant (P<0.05) difference from the result for non-suppressed external test data. NPV, negative predictive value; AUC, area under the receiver operating characteristic curve; SE95%, the error associated with a 95% confidence interval.
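For readers unfamiliar with the threshold-based metrics reported in the table, the following is a minimal sketch of their standard definitions in terms of confusion-matrix counts. The function name `classification_metrics` and the counts used are hypothetical and chosen only for illustration; they are not taken from the study, and the source does not state how its values were computed in code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, NPV and accuracy as percentages.

    tp, fp, tn, fn are the true-positive, false-positive, true-negative
    and false-negative counts of a binary classifier.
    """
    return {
        # Fraction of actual positives that were predicted positive.
        "sensitivity": 100.0 * tp / (tp + fn),
        # Fraction of actual negatives that were predicted negative.
        "specificity": 100.0 * tn / (tn + fp),
        # Fraction of negative predictions that were actually negative.
        "npv": 100.0 * tn / (tn + fn),
        # Fraction of all cases that were classified correctly.
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not from the study).
metrics = classification_metrics(tp=80, fp=40, tn=60, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because sensitivity and specificity pull in opposite directions as the decision threshold moves, the threshold-independent AUC is reported alongside them in the table.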