Table 1. Dataset for model development and evaluation.
| | Dataset for model development: training dataset (70%) | Dataset for model development: validation dataset (30%) | Test dataset |
| CXR images by data source (normal vs. pneumonia) | Training and validation combined: n = 157,016 (120,722 vs. 36,294); from NIH (60,361 vs. 36,294); from KNTA (60,361 vs. 0) | | Normal: NIH (n = 106), KNTA (n = 106); pneumonia: GUGMC (n = 212) |
| CXR images per dataset (normal vs. pneumonia) | n = 109,912 (84,506 vs. 25,406) | n = 47,104 (36,216 vs. 10,888) | n = 424 (212 vs. 212) |
| Number of patients (male vs. female) | 62,703 (32,065 vs. 30,638) | 28,463 (14,746 vs. 13,717) | 424 (288 vs. 136) |
| Age, years (mean ± SD) | 47 ± 16 | 47 ± 16 | 54 ± 13 |
Abbreviations: CXR, chest X-ray; NIH, National Institutes of Health; KNTA, Korean National Tuberculosis Association; GUGMC, Gachon University Gil Medical Center.
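
The image counts in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the variable names and the helper `add_pairs` are hypothetical, and the numbers are transcribed from the table as (normal, pneumonia) pairs. It checks that the per-source counts and the per-split counts add up to the same development total, and that the training share is close to the stated 70%.

```python
# Counts transcribed from Table 1 as (normal, pneumonia) pairs.
# Names here are hypothetical and chosen only for illustration.
development_sources = {"NIH": (60_361, 36_294), "KNTA": (60_361, 0)}
splits = {"training": (84_506, 25_406), "validation": (36_216, 10_888)}


def add_pairs(pairs):
    """Sum (normal, pneumonia) pairs element-wise."""
    pairs = list(pairs)
    normal = sum(p[0] for p in pairs)
    pneumonia = sum(p[1] for p in pairs)
    return normal, pneumonia


# Totals computed two ways: by data source and by training/validation split.
by_source = add_pairs(development_sources.values())
by_split = add_pairs(splits.values())

# Overall development-set size and the share assigned to training.
total = sum(by_source)
training_fraction = sum(splits["training"]) / total

print("by source:", by_source)
print("by split: ", by_split)
print("total:", total, "training fraction:", round(training_fraction, 3))
```

Both groupings should give the same (normal, pneumonia) pair, which is a quick way to catch transcription errors when reusing these figures.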