Author manuscript; available in PMC: 2022 Aug 30.
Published in final edited form as: Nat Med. 2021 Jan 11;27(2):244–249. doi: 10.1038/s41591-020-01174-9

Figure 2: Reader study results.

a) Index cancer exams & confirmed negatives. i) The proposed deep learning model outperformed all five radiologists on the set of 131 index cancer exams and 154 confirmed negatives. Each data point represents a single reader, and the ROC curve represents the performance of the deep learning model. The cross marks the mean radiologist performance, with the lengths of its arms indicating 95% confidence intervals. ii) Sensitivity of each reader and the corresponding sensitivity of the proposed model at a specificity chosen to match each reader. iii) Specificity of each reader and the corresponding specificity of the proposed model at a sensitivity chosen to match each reader.

b) Pre-index cancer exams & confirmed negatives. i) The proposed deep learning model also outperformed all five radiologists on the early detection task. The dataset consisted of 120 pre-index cancer exams, defined as mammograms interpreted as negative 12–24 months prior to the index exam in which cancer was found, and 154 confirmed negatives. The cross marks the mean radiologist performance, with the lengths of its arms indicating 95% confidence intervals. ii) Sensitivity of each reader and the corresponding sensitivity of the proposed model at a specificity chosen to match each reader. iii) Specificity of each reader and the corresponding specificity of the proposed model at a sensitivity chosen to match each reader.

For the sensitivity and specificity tables, the standard deviation of the model-minus-reader difference was calculated via bootstrapping.
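The matched-operating-point comparison and the bootstrapped standard deviation described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the threshold rule (the most sensitive threshold whose specificity is at least the reader's), and the resampling scheme (exams resampled with replacement, with the model threshold re-matched in every resample) are assumptions made for illustration; the caption does not specify these details.

```python
import numpy as np


def sensitivity_at_specificity(scores, labels, target_specificity):
    """Best model sensitivity among thresholds with specificity >= target.

    An exam is called positive when its score is strictly above the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    neg, pos = scores[~labels], scores[labels]
    # Candidate thresholds: every observed score, plus -inf (call everything positive).
    thresholds = np.unique(np.concatenate([scores, [-np.inf]]))
    best = 0.0
    for t in thresholds:
        specificity = np.mean(neg <= t)
        if specificity >= target_specificity:
            best = max(best, float(np.mean(pos > t)))
    return best


def bootstrap_sensitivity_difference(scores, labels, reader_calls,
                                     n_boot=1000, seed=0):
    """Mean and standard deviation of (model - reader) sensitivity.

    Assumed scheme: resample exams with replacement, then match the model's
    operating point to the reader's specificity within each resample.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    reader_calls = np.asarray(reader_calls, dtype=bool)
    n = len(labels)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        y, s, r = labels[idx], scores[idx], reader_calls[idx]
        if y.all() or not y.any():
            continue  # a resample needs both cancers and negatives
        reader_specificity = np.mean(~r[~y])
        reader_sensitivity = np.mean(r[y])
        model_sensitivity = sensitivity_at_specificity(s, y, reader_specificity)
        diffs.append(model_sensitivity - reader_sensitivity)
    return float(np.mean(diffs)), float(np.std(diffs, ddof=1))
```

The specificity comparison in panels iii) follows the same pattern with the roles of the two metrics swapped: fix the model's sensitivity to the reader's and compare specificities.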