BMJ. 2021 Sep 2;374:n1872. doi: 10.1136/bmj.n1872

Table 1.

Summary of study characteristics for studies using AI as standalone system

| Study | Study design | Population | Mammography vendor | Index test | Comparator | Reference standard |
|---|---|---|---|---|---|---|
| Lotter 2021 [28] | Enriched test set MRMC laboratory study (accuracy of a read) | 285 women from 1 US health system with 4 centres (46.0% screen detected cancer); age and ethnic origin NR | Hologic 100% | In-house AI system (DeepHealth); threshold NR (set to match readers’ sensitivity and specificity, respectively) | 5 MQSA certified radiologists (US), single reading; threshold of BI-RADS scores 3, 4, and 5 considered recall | Cancer: pathology confirmed cancer within 3 months of screening; confirmed negative: a negative examination followed by an additional BI-RADS score 1 or 2 interpretation at the next screening examination 9-39 months later |
| McKinney 2020 [29] | Retrospective test accuracy study (accuracy of a read) | 3097 women from 1 US centre (22.2% cancer within 27 months of screening); age <40, 181 (5.8%); 40-49, 1259 (40.7%); 50-59, 800 (25.8%); 60-69, 598 (19.3%); ≥70, 259 (8.4%) | Hologic / Lorad branded: >99%; Siemens or General Electric: <1% | In-house AI system (Google Health); threshold: to achieve superiority for both sensitivity and specificity compared with original single reading using validation set | Original single radiologist decision (US); threshold: BI-RADS scores 0, 4, 5 were treated as positive | Cancer: biopsy confirmed cancer within 27 months of imaging; non-cancer: one follow-up non-cancer screen or biopsied negative (benign pathologies) after ≥21 months |
| Rodriguez-Ruiz 2019 [33] | Enriched test set MRMC laboratory study (accuracy of a read) | 199 examinations from a Dutch digital screening pilot project (39.7% cancer); age range 50-74 | Hologic 100% | Transpara version 1.4.0 (Screenpoint Medical BV, Nijmegen, Netherlands); threshold: 8.26/10, corresponding to the average radiologist’s specificity | Nine Dutch radiologists, single reading, as part of a previously completed MRMC study [38]; no threshold | Cancer: histopathology-proven cancer; non-cancer: ≥1 normal follow-up screening examination (2 year screening interval) |
| Salim 2020 [35] | Retrospective test accuracy study (accuracy of a read) | 8805 women from a Swedish cohort study (8.4% cancer within 12 months of screening); median age 54.5 (IQR 47.4-63.5) | Hologic 100% | 3 commercial AI systems (anonymised: AI-1, AI-2, and AI-3); threshold: corresponding to the specificity of the first reader | Original radiologist decision (Sweden); (1) single reader (R1; R2), (2) consensus reading; no threshold | Cancer: pathology confirmed cancer within 12 months of screening; non-cancer: ≥2 years cancer free follow-up |
| Schaffter 2020 [36] | Retrospective test accuracy study (accuracy of a read) | 68 008 consecutive women from 1 Swedish centre (1.1% cancer within 12 months of screening); mean age 53.3 (SD 9.4) | NR | 4 in-house AI systems: 1 top performing model submitted to the DREAM challenge, 1 ensemble method of the eight best performing models (CEM), CEM combined with reader decision (single reader or consensus reading); threshold: corresponding to the sensitivity of single and consensus reading, respectively | Original radiologist decision (Sweden); (1) single reader (R1; R2), (2) consensus reading; no threshold | Cancer: tissue diagnosis within 12 months of screening; non-cancer: no cancer diagnosis ≥12 months after screening |

AI=artificial intelligence; BI-RADS=breast imaging reporting and data system; CEM=challenge ensemble method; DREAM=Dialogue on Reverse Engineering Assessment and Methods; IQR=interquartile range; MQSA=Mammography Quality Standards Act; MRMC=multiple reader multiple case; NR=not reported; R1=first reader; R2=second reader; SD=standard deviation.
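
Several index tests in Table 1 use a threshold chosen to correspond to a reader's specificity (for example, Rodriguez-Ruiz 2019 and Salim 2020). The included studies do not describe their exact procedure here, so the following is only a minimal sketch of one way such matching could be done. It assumes that a higher AI score means greater suspicion and that a score at or above the threshold counts as a recall. The function names and the toy scores are hypothetical and are not taken from any study.

```python
def specificity(non_cancer_scores, threshold):
    """Fraction of non-cancer cases scored below the threshold (not recalled)."""
    return sum(1 for s in non_cancer_scores if s < threshold) / len(non_cancer_scores)


def sensitivity(cancer_scores, threshold):
    """Fraction of cancer cases scored at or above the threshold (recalled)."""
    return sum(1 for s in cancer_scores if s >= threshold) / len(cancer_scores)


def threshold_for_specificity(non_cancer_scores, target_specificity):
    """Smallest observed score t whose specificity reaches the target.

    Assumption: the target is a reader's specificity measured on the same
    cases. Returns infinity if no observed score is high enough.
    """
    if not non_cancer_scores:
        raise ValueError("need at least one non-cancer score")
    if not 0.0 <= target_specificity <= 1.0:
        raise ValueError("target_specificity must be between 0 and 1")
    for t in sorted(set(non_cancer_scores)):
        if specificity(non_cancer_scores, t) >= target_specificity:
            return t
    # Only a cut-off above every observed score recalls no non-cancer case.
    return float("inf")


# Hypothetical toy data: AI scores on a 1-10 scale.
non_cancer = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cancer = [3, 7, 9, 10]

t = threshold_for_specificity(non_cancer, 0.8)  # reader specificity of 80%
ai_sensitivity = sensitivity(cancer, t)
```

With the threshold fixed in this way, the AI system's sensitivity can be compared with the reader's at approximately equal specificity. The studies in Table 1 that matched on sensitivity instead would apply the same idea to the cancer cases.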