Sci Rep. 2021 Sep 1;11:15523. doi: 10.1038/s41598-021-93967-2

Table 1.

Data and patient characteristics of the 6 test datasets.

| Dataset | DS-1 | CXR-14 (“ChestX-ray14”) | TB-1 | TB-2 | COV-1 | COV-2 |
| --- | --- | --- | --- | --- | --- | --- |
| Scenario | Abnormality detection | Abnormality detection | Unseen disease: TB | Unseen disease: TB | Unseen disease: COVID-19 | Unseen disease: COVID-19 |
| Dataset origin | 5 clusters of hospitals from 5 cities in India | NIH Clinical Center (ref. 7) | A hospital in Shenzhen, China | A hospital in Montgomery, MD, USA | A hospital in Illinois, USA | A hospital in Illinois, USA |
| No. patients | 7747 | 532 | 462 | 133 | 1819 | 605 |
| Median age (IQR) | 48 (38–58) | 49.5 (36–60) | 33 (26–43) | 40 (28–52) | 54 (39–66) | 56 (43–68) |
| No. female (%) | 2805 (36.2%) | 375 (46.3%) | 151 (32.7%) | 70 (54.1%) | 950 (47.8%) | 325 (46.3%) |
| Race/ethnicity | N/A | N/A | N/A | N/A | White/Caucasian: 769 (42%); Hispanic: 336 (18%); Black/African American: 516 (28%); Asian: 67 (4%); Native Hawaiian/Other Pacific Islander: 3 (0.2%); American Indian/Alaskan Native: 2 (0.1%); Other: 65 (4%); Not available: 61 (3%) | White/Caucasian: 369 (61%); Hispanic: 123 (20%); Black/African American: 58 (10%); Asian: 21 (3%); Native Hawaiian/Other Pacific Islander: 1 (0.2%); American Indian/Alaskan Native: 0 (0%); Other: 24 (4%); Not available: 9 (1%) |
| No. images | 7747 | 810 | 462 | 133 | 1819 | 605 |
| PA images | 7747 | 810 | 462 | 133 | 0 | 0 |
| AP images | 0 | 0 | 0 | 0 | 1819 | 605 |
| Reference standard | Normal/abnormal based on majority vote of 3 radiologists | Normal/abnormal based on majority vote of 3 radiologists | Radiologists reading without clinical tests | Radiology reports confirmed by clinical tests | COVID-19 status based on RT-PCR test | COVID-19 status based on RT-PCR test |
| No. abnormal images (%) | 1845 (23.8%) | 578 (71.4%) | N/A (a) | N/A (a) | N/A (a) | N/A (a) |
| No. positive images (%, specific disease/finding) | See Supplementary Table 3 | See Supplementary Table 4 | 241 (52.2%, TB) | 53 (39.8%, TB) | 583 (32.1%, COVID-19) | 290 (47.9%, COVID-19) |
| Image properties | | | | | | |
| Width (pixels) | 512–4400 | 1143–3827 | 1130–3001 | 4020–4892 | 1024–4200 | 1024–4200 |
| Height (pixels) | 512–4784 | 966–4715 | 948–3001 | 4020–4892 | 2014–4200 | 2014–4200 |
| Bit-depth (bits) | 12 | 8 | 8 | 8 | 12 | 12 |

N/A indicates information was not available.

(a) Abnormal images in the disease-specific datasets include both those positive for TB or COVID-19, and those with other findings; the numbers of images that contained other findings were not available.
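The reference standard for DS-1 and CXR-14 is described in the table as a majority vote of 3 radiologists. A minimal Python sketch of such a vote, together with the percentage arithmetic behind the "No. abnormal images (%)" row, might look as follows. The function names `majority_vote` and `percent` and the label strings `"normal"`/`"abnormal"` are assumptions made for illustration; this is not the authors' code.

```python
from collections import Counter

# Hypothetical label strings; the table only says "Normal/abnormal".
VALID_LABELS = {"normal", "abnormal"}


def majority_vote(reads):
    """Return the label given by most readers.

    Assumes binary labels and an odd number of reads (e.g. 3 radiologists),
    so a strict majority always exists.
    """
    if len(reads) % 2 == 0:
        raise ValueError("an odd number of reads is required to avoid ties")
    if not set(reads) <= VALID_LABELS:
        raise ValueError("reads must be 'normal' or 'abnormal'")
    label, _count = Counter(reads).most_common(1)[0]
    return label


def percent(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


# One image read by three radiologists.
print(majority_vote(["normal", "abnormal", "abnormal"]))  # abnormal

# Abnormal-image percentages as reported in the table.
print(percent(1845, 7747))  # 23.8 (DS-1)
print(percent(578, 810))    # 71.4 (CXR-14)
```

Requiring an odd number of binary reads keeps the sketch simple; how ties or missing reads were actually handled is not stated in the table.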