Figure 1:
Overview of extraction and curation of the EMory BrEast imaging Dataset (EMBED) with four core components: images and regions of interest (ROIs), structured imaging descriptors and pathologic outcomes from MagView, free-text pathology reports, and additional clinical data. The “Other” racial category includes Asian, not reported, and mixed. DBT = digital breast tomosynthesis, NLP = natural language processing, 2D = two-dimensional.