Table 1.
Scenario | Abnormality detection | Abnormality detection | Unseen disease: TB | Unseen disease: TB | Unseen disease: COVID-19 | Unseen disease: COVID-19 |
---|---|---|---|---|---|---|
Dataset | DS-1 | CXR-14 (“ChestX-ray14”) | TB-1 | TB-2 | COV-1 | COV-2 |
Dataset origin | 5 clusters of hospitals from 5 cities in India | NIH Clinical Center (ref. 7) | A hospital in Shenzhen, China | A hospital in Montgomery, MD, USA | A hospital in Illinois, USA | A hospital in Illinois, USA |
No. patients | 7747 | 532 | 462 | 133 | 1819 | 605 |
Median age (IQR) | 48 (38–58) | 49.5 (36–60) | 33 (26–43) | 40 (28–52) | 54 (39–66) | 56 (43–68) |
No. female (%) | 2805 (36.2%) | 375 (46.3%) | 151 (32.7%) | 70 (54.1%) | 950 (47.8%) | 325 (46.3%) |
Race/ethnicity | N/A | N/A | N/A | N/A | White/Caucasian: 769 (42%); Hispanic: 336 (18%); Black/African American: 516 (28%); Asian: 67 (4%); Native Hawaiian/Other Pacific Islander: 3 (0.2%); American Indian/Alaskan Native: 2 (0.1%); Other: 65 (4%); Not available: 61 (3%) | White/Caucasian: 369 (61%); Hispanic: 123 (20%); Black/African American: 58 (10%); Asian: 21 (3%); Native Hawaiian/Other Pacific Islander: 1 (0.2%); American Indian/Alaskan Native: 0 (0%); Other: 24 (4%); Not available: 9 (1%) |
No. images | 7747 | 810 | 462 | 133 | 1819 | 605 |
PA images | 7747 | 810 | 462 | 133 | 0 | 0 |
AP images | 0 | 0 | 0 | 0 | 1819 | 605 |
Reference standard | Normal/abnormal based on majority vote of 3 radiologists | Normal/abnormal based on majority vote of 3 radiologists | Radiologists reading without clinical tests | Radiology reports confirmed by clinical tests | COVID-19 status based on RT-PCR test | COVID-19 status based on RT-PCR test |
No. abnormal images (%) | 1845 (23.8%) | 578 (71.4%) | N/Aᵃ | N/Aᵃ | N/Aᵃ | N/Aᵃ |
No. positive images (%, specific disease/finding) | See Supplementary Table 3 | See Supplementary Table 4 | 241 (52.2%, TB) | 53 (39.8%, TB) | 583 (32.1%, COVID-19) | 290 (47.9%, COVID-19) |
Image properties | ||||||
Width (pixels) | 512–4400 | 1143–3827 | 1130–3001 | 4020–4892 | 1024–4200 | 1024–4200 |
Height (pixels) | 512–4784 | 966–4715 | 948–3001 | 4020–4892 | 2014–4200 | 2014–4200 |
Bit-depth (bits) | 12 | 8 | 8 | 8 | 12 | 12 |
N/A indicates information was not available.
ᵃ Abnormal images in the disease-specific datasets include both those positive for TB or COVID-19 and those with other findings; the numbers of images that contained other findings were not available.
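
The "majority vote of 3 radiologists" reference standard and the percentage columns in Table 1 can be illustrated with a short sketch. This is not the authors' labeling pipeline: the function names `majority_vote` and `prevalence` and the example reads are hypothetical, and the sketch assumes each reader gives a binary normal/abnormal label. Only the counts passed to `prevalence` come from the table.

```python
from collections import Counter


def majority_vote(reads):
    """Return the label given by most readers.

    Assumes binary labels and an odd number of reads, so a tie cannot occur.
    """
    if len(reads) % 2 == 0:
        raise ValueError("an odd number of reads is needed to avoid ties")
    label, _count = Counter(reads).most_common(1)[0]
    return label


def prevalence(n_positive, n_total):
    """Percentage of positive images, rounded to one decimal as in Table 1."""
    return round(100.0 * n_positive / n_total, 1)


# Hypothetical reads from three radiologists for a single image.
example_reads = ["abnormal", "normal", "abnormal"]
print(majority_vote(example_reads))

# Counts taken from the "No. abnormal images" row of Table 1.
print(prevalence(1845, 7747))  # DS-1
print(prevalence(578, 810))    # CXR-14
```

With three readers and two possible labels, at least two readers always agree, so every image receives a definite reference label.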