Author manuscript; available in PMC: 2022 Oct 1.

Published in final edited form as: Med Image Anal. 2021 Jul 2;73:102138. doi: 10.1016/j.media.2021.102138

Table 1:

General characteristics of the six study datasets. For each dataset, the table lists the institution where the images were collected, the number of images and of individual women, the range of screening dates, the racial distribution, and how the dataset was used in this study. The case-control datasets (ds 3-b and ds 5) include all available cancer cases from the HUP and MC screening cohorts for which a negative FFDM exam acquired prior to breast cancer diagnosis was available for analysis.

	ds 1	ds 2	ds 3-a	ds 3-b	ds 4	ds 5

Institution	HUP	HUP	MC	HUP	HUP	MC
Number of Images	11,200	1,100	3,314	1,147	110	6,368
Number of Women	2,200	1,100	1,662	575	110	1,592

Screening start date	2010	2010	2008	2010	2010	2013
Screening end date	2012	2012	2012	2014	2012	2015

Caucasian/White (%)	45	45	98	47	45	97
African American/Black (%)	45	45	-	53	45	-
Other (%)	10	10	2	-	10	3

Used in development	Yes	Yes	Yes	Yes	No	No
Cross-validation or Bootstrap	No	No	Yes	Yes	No	Yes
Training (%)	90	90	67	67	-	-
Validation (%)	10	10	33	33	-	-
Testing (%)	-	-	-	-	100	100

Accuracy in breast density assessment	No	No	Yes	No	No	Yes
Case-control classification based on breast density	No	No	No	Yes	No	Yes