JNCI Cancer Spectrum. 2021 Jan 11;5(1):pkaa119. doi: 10.1093/jncics/pkaa119

Deep Learning Image Analysis of Benign Breast Disease to Identify Subsequent Risk of Breast Cancer

Adithya D Vellal 1, Korsuk Sirinukunwattan 2,3,4,5, Kevin H Kensler 6, Gabrielle M Baker 7, Andreea L Stancu 8, Michael E Pyle 9, Laura C Collins 10, Stuart J Schnitt 11, James L Connolly 12, Mitko Veta 13, A Heather Eliassen 14,15, Rulla M Tamimi 16,17,18, Yujing J Heng 19,✉,1
PMCID: PMC7898083  PMID: 33644680

Abstract

Background

New biomarkers of risk may improve breast cancer (BC) risk prediction. We developed a computational pathology method to segment benign breast disease (BBD) whole slide images into epithelium, fibrous stroma, and fat. We applied our method to the BBD BC nested case-control study within the Nurses’ Health Studies to assess whether computer-derived tissue composition or a morphometric signature was associated with subsequent risk of BC.

Methods

Tissue segmentation and nuclei detection deep-learning networks were established and applied to 3795 whole slide images from 293 cases who developed BC and 1132 controls who did not. Percentages of each tissue region were calculated, and 615 morphometric features were extracted. Elastic net regression was used to create a BC morphometric signature. Associations between BC risk factors and age-adjusted tissue composition among controls were assessed using analysis of covariance. Unconditional logistic regression, adjusting for the matching factors, BBD histological subtypes, parity, menopausal status, and body mass index evaluated the relationship between tissue composition and BC risk. All statistical tests were 2-sided.

Results

Among controls, direction of associations between BBD subtypes, parity, and number of births with breast composition varied by tissue region; select regions were associated with childhood body size, body mass index, age of menarche, and menopausal status (all P <.05). A higher proportion of epithelial tissue was associated with increased BC risk (odds ratio = 1.39, 95% confidence interval = 0.91 to 2.14, for highest vs lowest quartiles, Ptrend=.047). No morphometric signature was associated with BC.

Conclusions

The amount of epithelial tissue may be incorporated into risk assessment models to improve BC risk prediction.


One in 8 women in the United States will develop breast cancer (BC) in her lifetime (1). Although early detection is imperative, identifying and lowering BC risk may help reduce BC morbidity and mortality. BC risk factors may be nonmodifiable (eg, genetics, dense breast tissue, and benign breast disease [BBD]) or modifiable (eg, adiposity and alcohol consumption). Among women diagnosed with BBD, the subsequent BC risk increases with the subtype of BBD in this order: nonproliferative, proliferative without atypia, and proliferative with atypia (2–4). Researchers continue to identify new biomarkers of risk (2,3,5–8) as well as update risk assessment models (9–13) to improve BC risk prediction. For example, the well-validated Rosner–Colditz model includes age at menarche, age at first birth, age at subsequent births, age at menopause, family history of BC, body mass index (BMI), alcohol intake, and postmenopausal hormone therapy use (14). Recent studies demonstrated that the inclusion of genetic risk variants, mammographic density, and endogenous hormones improves the ability of the Rosner–Colditz model to predict BC risk (11,12).

Technological advances have enabled the engineering of deep-learning algorithms to analyze whole slide images (WSIs) for disease detection and diagnosis (15‐20), including discriminating between BC and benign breast tissue (21‐23). For example, terminal duct lobular unit (TDLU) involution assessed using qualitative and semi-quantitative methods was suggested to be linked to lower BC risk (24‐27). We developed and applied an automated deep-learning method to capture quantitative measures of TDLU involution (28,29) in a large, nested case-control study (30).

Here, we engineered another deep-learning method to segment BBD histopathological images into epithelial, fibrous stroma, and fat regions; calculate the amount of each tissue region expressed as a percentage of total tissue; and extract morphometric features from each region. We applied our method to the BBD BC Nested Case-Control study within the Nurses’ Health Study (NHS) and NHSII to evaluate whether computer-derived tissue composition or a morphometric signature in women diagnosed with BBD was associated with subsequent risk of BC.

Materials and Methods

Study Population

The NHS and NHSII participants completed questionnaires that provided a medical history, diagnoses of BBD or BC, as well as extensive information about demographic, lifestyle, reproductive, and dietary risk factors for BC (3,31‐33). Details about the study design methods for the NHS and NHSII have been published previously (34). Eligible women with biopsy-confirmed BBD were placed into 2 substudies—the BBD Incidence study (35‐38) and/or the BBD BC nested case-control study (2,3,5,24,30,32,33,39‐41). BC diagnosis was confirmed verbally by the participant, via medical record review, or via the cancer registry. WSIs from women in the BBD Incidence study were used in the development phase; the BBD BC nested case-control study was used in the application phase. The study protocol was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required.

Engineering the Networks

The tissue segmentation network was engineered using 48 hematoxylin and eosin WSIs from the BBD Incidence study and a custom 21-layer fully convolutional network (42‐47) to segment WSIs into background, epithelium (normal TDLUs, TDLUs exhibiting proliferative or metaplastic changes, and various BBD lesions), fibrous stroma (inter- and intra-lobular), and fat (Supplementary Table 1, available online). The nuclei detection network was created using a set of 30 previously annotated hematoxylin and eosin BC WSIs from The Cancer Genome Atlas (48) and a fully convolutional U-Net architecture (43) with the sliding window approach (44). An example of an original image, ground truth, and automated segmentation for each network is presented in Figure 1. The majority of the precision, recall, and Dice similarity coefficient values of the tissue segmentation and nuclei detection networks were greater than 0.75 (Supplementary Table 2, available online).
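For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how precision, recall, and the Dice similarity coefficient can be computed for one tissue class from a ground-truth mask and a predicted mask. The function name and the toy masks are illustrative assumptions; this is not the authors' evaluation code.

```python
def segmentation_scores(truth, pred):
    """Return (precision, recall, dice) for two equal-length binary masks.

    Masks are flat sequences of 0/1 pixel labels for a single class
    (for example, 1 = epithelium, 0 = everything else).
    """
    if len(truth) != len(pred):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for t, p in zip(truth, pred) if t and p)        # true positives
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)    # false positives
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Dice = 2|A ∩ B| / (|A| + |B|); defined as 1.0 when both masks are empty.
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice


# Hypothetical 8-pixel masks, for illustration only.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall, dice = segmentation_scores(truth, pred)
print(precision, recall, dice)
```

In a multiclass setting such as the one described here, the same calculation would be repeated per class by binarizing each label in turn.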

Figure 1.


An example of an original image, ground truth, and automated segmentation or detection for each deep-learning network. A) For tissue segmentation, white represents background, green represents fibrous stroma, red is epithelium, and purple is fat. B) For cell nuclei detection, white represents background, red is nucleus, and cyan is nuclei membrane border. The final output produces a binary mask that considers nucleus membrane pixels to be part of the background.

BBD BC Nested Case-Control Study Participants

The BBD BC Nested Case-Control study consisted of 293 cases and 1132 controls (Supplementary Table 3, available online). Cases were women who had previously reported a BBD diagnosis and were diagnosed with BC a median of 7.67 years after BBD diagnosis (interquartile range = 4.33-11.75 years). Tumor estrogen receptor (ER) status was obtained from centralized review of breast tissue microarrays (49). Controls were women diagnosed with BBD who did not develop BC. Cases and controls were matched 1:4 on year of BBD diagnosis, age at BC diagnosis (index date for controls), and years between BBD and BC diagnosis (or index date). A total of 3795 slides were digitized at 20× (n=213) or 40× magnification (n=3582). Each woman contributed between 1 and 4 WSIs (median = 3 WSIs).

Central pathology review classified BBD lesions as nonproliferative, proliferative without atypia, or proliferative with atypia. Participant BMI, age at menarche, parity, age at first birth, breastfeeding history, and menopausal status were obtained from the questionnaire completed closest to, but before, BBD biopsy. The average body sizes at ages 5 and 10 years were reported using a 9-level pictogram (level 1 as leanest) (40). Birth index, a surrogate metric that reflects the timing and spacing of births, was calculated (50). A higher birth index indicates a higher number of births occurring at earlier ages.

Applying Our Networks to the BBD BC Nested Case-Control Study

Figure 2 shows an overview of our image analysis pipeline. Briefly, tissue-containing areas were located for each WSI (Figure 2, B), the WSI was split into patches of 2048 × 2048 pixels, tissue segmentation and nuclei detection were performed (Figure 2, C), and each patch yielded a segmentation map with every pixel classified as epithelium, stroma, fat, or background (see Supplementary Methods, available online).
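The patch-splitting step can be sketched as a simple tiling of a tissue-containing region. The sketch below only generates patch coordinates; the function name is hypothetical, and clipping the edge patches (rather than padding them) is an assumption, because the handling of partial patches is not described in the main text.

```python
def patch_origins(width, height, patch=2048):
    """Yield (x, y, w, h) for non-overlapping patches covering a region.

    Patches at the right and bottom edges are clipped to the region size
    (an assumption; padding would be an equally plausible choice).
    """
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            yield x, y, min(patch, width - x), min(patch, height - y)


# Hypothetical region of 5000 x 4096 pixels.
patches = list(patch_origins(5000, 4096))
print(len(patches))   # number of patches
print(patches[2])     # the clipped patch at the right edge of the first row
```

Each coordinate tuple would then be used to read the corresponding pixels from the WSI and pass them to the segmentation and nuclei detection networks.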

Figure 2.


Overview of our benign breast disease image analysis pipeline. A) A whole slide image (WSI). B) Image processing to extract tissue-containing areas of the WSI. C) Applying our tissue segmentation and nuclei detection networks created in the development phase to a WSI to obtain a segmentation map. D) From the segmentation map, computer-derived morphometric features were extracted. Percentages of tissue regions were also computed from the map. Morphometric data were summarized from all WSIs belonging to the same woman. E) Identifying if morphometric features are associated with breast cancer.

Each tissue region was expressed as a percentage of the total amount of tissue analyzed for each woman. Pixels classified as epithelium, stroma, or fat were individually summed across patches from a single WSI, combined across WSIs pertaining to each woman, and divided by the total number of pixels detected across all tissue regions.
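The calculation described in this paragraph can be written out as a short sketch: pixel counts are summed over patches and WSIs for one woman, and each tissue total is divided by the total number of tissue pixels (background excluded). The data layout, function name, and pixel counts below are illustrative assumptions.

```python
from collections import Counter

TISSUES = ("epithelium", "stroma", "fat")


def tissue_percentages(wsi_patch_counts):
    """Compute per-woman tissue percentages.

    wsi_patch_counts: list of WSIs; each WSI is a list of per-patch dicts
    mapping a tissue label to its pixel count. Background pixels, if present,
    are ignored so that the three percentages sum to 100.
    """
    totals = Counter()
    for wsi in wsi_patch_counts:
        for patch in wsi:
            for tissue in TISSUES:
                totals[tissue] += patch.get(tissue, 0)
    denom = sum(totals[t] for t in TISSUES)
    if denom == 0:
        raise ValueError("no tissue pixels found")
    return {t: 100.0 * totals[t] / denom for t in TISSUES}


# Hypothetical woman with two WSIs (two patches and one patch, respectively).
woman = [
    [
        {"epithelium": 100, "stroma": 700, "fat": 200, "background": 5000},
        {"epithelium": 50, "stroma": 300, "fat": 150},
    ],
    [
        {"epithelium": 50, "stroma": 400, "fat": 50},
    ],
]
composition = tissue_percentages(woman)
print(composition)
```

Because counts are pooled before dividing, a woman's larger WSIs contribute more to her percentages than her smaller ones, which matches the pooling described above.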

Because fat regions were mostly empty white spaces, fat and stroma regions were combined as stroma for feature extraction. Morphology, texture, and graph-based spatial features (ie, computer-derived morphometric features; n=615) were extracted using the WSIs in conjunction with the automated tissue segmentation and nuclei detection results (Figure 2, D) (51‐55). For women with more than 1 WSI, the value for each feature was further summarized using the median calculated across all her WSIs. A morphometric signature associated with BC was constructed using a training set of 855 women (60%) and an elastic net regularized regression model (see Supplementary Methods, available online) (56). A signature score for each woman in the test set was then computed.
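The per-woman summarization by median can be illustrated with a minimal sketch. The feature names and values are hypothetical placeholders rather than any of the 615 features actually extracted.

```python
from statistics import median


def aggregate_features(per_wsi_features):
    """Summarize one woman's features as the median across her WSIs.

    per_wsi_features: list of dicts (one per WSI) sharing the same feature names.
    """
    names = per_wsi_features[0].keys()
    return {name: median(f[name] for f in per_wsi_features) for name in names}


# Hypothetical woman with three WSIs and two made-up features.
woman_features = [
    {"nucleus_area": 40.0, "texture_contrast": 0.30},
    {"nucleus_area": 55.0, "texture_contrast": 0.10},
    {"nucleus_area": 46.0, "texture_contrast": 0.20},
]
summary = aggregate_features(woman_features)
print(summary)
```

With an even number of WSIs, `statistics.median` returns the mean of the two middle values; whether the original analysis did the same is not stated.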

Statistical Analysis

Preliminary assessments using Wilcoxon rank sum and Kruskal-Wallis tests evaluated whether tissue composition differed between cases and controls, overall and when stratified by BBD histological subtype. The associations between risk factors and tissue composition (natural log-transformed) among controls were assessed using analysis of covariance (ANCOVA) adjusting for age at BBD biopsy (emmeans R package version 1.4.4) (57). Each tissue region was categorized into quartiles as defined by the distribution among controls. Unconditional logistic regression models accounting for the matching factors were used to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for the relationship between each tissue region (in quartiles) and BC risk (Figure 2, E). Model 1 adjusted for matching factors (year of BBD biopsy, age at index date, time between BBD biopsy and index date); model 2 adjusted for matching factors and BBD histological subtypes; and model 3 adjusted for matching factors, BBD histological subtypes, parity, menopausal status, and BMI. Analyses were also conducted by stratifying the women according to BBD histological subtype, parity, menopausal status, or BMI. Polytomous logistic regression models assessed the association between each tissue region and risk of BC defined by tumor ER expression. The ratio of epithelium to fibrous stroma was calculated and log-transformed, and its association with BC risk was evaluated using logistic regression models in all women and in women stratified by BBD histological subtype. The level of statistical significance used for all statistical tests was P <.05. All tests were 2-sided. All statistical analyses were performed using R (see Supplementary Methods, available online).
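The quartile categorization step can be sketched as follows: cut points are taken from the controls only and then applied to every woman, cases included. The analysis itself was run in R; this Python sketch uses `statistics.quantiles`, whose default interpolation may differ slightly from R's `quantile`, and the helper names are assumptions. The epithelium cutoffs in the usage example are the ones reported in Table 2.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_cutoffs(control_values):
    """Return the three quartile cut points estimated from controls only."""
    return quantiles(control_values, n=4)


def assign_quartile(value, cutoffs):
    """Return the quartile (1-4) for a value.

    A value equal to a cut point falls in the higher quartile, matching the
    '>= lower bound to < upper bound' convention of Table 2.
    """
    return bisect_right(cutoffs, value) + 1


# Cut points from a small made-up set of control values.
toy_cutoffs = quartile_cutoffs([1, 2, 3, 4, 5, 6, 7])

# Epithelium quartile cutoffs (%) as reported in Table 2.
EPITHELIUM_CUTOFFS = [4.8, 7.5, 11.2]
assigned = [assign_quartile(v, EPITHELIUM_CUTOFFS) for v in (3.0, 4.8, 7.4, 11.2)]
print(toy_cutoffs, assigned)
```

Defining the cut points among controls keeps the reference distribution independent of case status, so each quartile holds the same number of controls (283 in Table 2) while the number of cases varies.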

Results

Preliminary Assessment of Breast Tissue Composition

Cases had statistically significantly more epithelium (P <.001; Wilcoxon test) and suggestively more stroma (P =.07) than controls; controls had statistically significantly more fat (P <.001; Wilcoxon test) than cases (Figure 3, A). When stratified by BBD histological subtypes, there were statistically significant differences among cases or controls, or between cases and controls, for each tissue region (epithelium P <.001, stroma P =.02, fat P <.001; Kruskal-Wallis tests; Figure 3, B-D).

Figure 3.


Boxplots display the amount of each tissue region (%) among cases and controls and when stratified by benign breast disease (BBD) histological subtypes. A) Cases have more epithelium than controls (Wilcoxon test). Controls have statistically significantly more fat than cases (Wilcoxon test). When stratified by BBD histological subtypes, there were statistically significant differences among cases or controls, or between cases and controls within epithelium P <.001 (B), fibrous stroma P =.02 (C), and fat P <.001 (D) (Kruskal-Wallis tests). Statistically significant Kruskal-Wallis tests were further evaluated using Dunn’s post hoc tests with Benjamini-Hochberg multiple testing method to obtain adjusted P values; only meaningful statistically significant comparisons within cases, within controls, and between cases and controls were indicated in B, C, and D. Cases are represented by boxes with slanted lines. Controls are represented by clear boxes. Each box displays the median and the 25th and 75th percentiles (lower and upper hinges). The lower whisker represents the smallest observation greater than or equal to the lower hinge - 1.5 * interquartile range (IQR); the upper whisker represents the largest observation less than or equal to the upper hinge + 1.5 * IQR. The black dots represent outliers. All statistical tests were 2-sided.

Age-Adjusted Tissue Composition and Risk Factors Among Controls

Table 1 displays the age-adjusted means (95% confidence intervals) and the ANCOVA P values of the associations between risk factors and the tissue composition among the controls. Controls with the nonproliferative subtype of BBD had lower percentages of epithelium and stroma but higher percentages of fat than those with proliferative subtypes (all P <.001; ANCOVA). Women with a larger childhood body size (level ≥2.5) had less stroma (P =.048; ANCOVA) compared with women with body sizes of level 1 or 1.5-2 at ages 5-10 years. Breast tissues of women with a BMI of 30 or more at the time of BBD biopsy had a lower amount of stroma (P <.001; ANCOVA) but a higher amount of fat (P <.001) compared with women with lower BMI.

Table 1.

Tissue composition and BC risk factors among 1132 controlsᵃ

| Risk factors | No. | Epithelium, % (95% CI) | Fibrous stroma, % (95% CI) | Fat, % (95% CI) |
| --- | --- | --- | --- | --- |
| Mean age at BBD biopsy, y | | | | |
| <40 | 251 | 9.2 (8.5 to 10.0) | 76.0 (74.3 to 77.7) | 7.8 (6.9 to 8.8) |
| 40-49 | 438 | 7.8 (7.3 to 8.3) | 72.0 (70.8 to 73.2) | 13.2 (12.0 to 14.4) |
| 50-59 | 293 | 6.1 (5.7 to 6.6) | 69.1 (67.7 to 70.5) | 17.5 (15.6 to 19.6) |
| ≥60 | 150 | 5.0 (4.5 to 5.6) | 63.5 (61.7 to 65.4) | 23.9 (20.4 to 28.0) |
| P valueᵇ | | <.001 | <.001 | <.001 |
| BBD histological subtype | | | | |
| Nonproliferative | 331 | 5.7 (5.3 to 6.1) | 68.3 (67.0 to 69.7) | 16.2 (14.6 to 18.0) |
| Proliferative without atypia | 645 | 7.8 (7.5 to 8.2) | 71.8 (70.9 to 72.8) | 12.5 (11.6 to 13.5) |
| Atypical hyperplasia | 156 | 8.0 (7.2 to 8.8) | 72.8 (70.8 to 74.8) | 13.3 (11.4 to 15.6) |
| P valueᵇ | | <.001 | <.001 | <.001 |
| Body size at age 5-10 y | | | | |
| Level 1 | 322 | 7.5 (7.0 to 8.0) | 72.0 (70.6 to 73.4) | 12.5 (11.2 to 14.0) |
| Level 1.5-2 | 290 | 7.1 (6.6 to 7.7) | 71.8 (70.4 to 73.3) | 12.8 (11.4 to 14.3) |
| Level ≥2.5 | 367 | 7.0 (6.5 to 7.5) | 69.9 (68.6 to 71.2) | 14.6 (13.2 to 16.2) |
| P valueᵇ | | .42 | .048 | .09 |
| BMI, kg/m² | | | | |
| <25 | 641 | 7.1 (6.8 to 7.5) | 72.6 (71.6 to 73.6) | 12.3 (11.4 to 13.3) |
| 25 to <30 | 303 | 7.3 (6.7 to 7.8) | 70.7 (69.3 to 72.1) | 13.5 (12.1 to 15.0) |
| ≥30 | 173 | 7.2 (6.5 to 7.9) | 65.5 (63.8 to 67.2) | 19.8 (17.1 to 22.8) |
| P valueᵇ | | .91 | <.001 | <.001 |
| Mean age of menarche, y | | | | |
| ≤12 | 532 | 7.0 (6.6 to 7.4) | 70.0 (68.9 to 71.0) | 14.6 (13.5 to 15.9) |
| 13 | 335 | 7.2 (6.7 to 7.7) | 71.1 (69.7 to 72.5) | 12.5 (11.3 to 13.9) |
| ≥14 | 260 | 7.4 (6.9 to 8.1) | 72.8 (71.2 to 74.4) | 13.0 (11.6 to 14.7) |
| P valueᵇ | | .50 | .01 | .05 |
| Parity | | | | |
| Nulliparous | 107 | 5.2 (4.6 to 5.9) | 73.8 (71.3 to 76.4) | 9.7 (8.1 to 11.7) |
| Parous | 1020 | 7.4 (7.1 to 7.7) | 70.6 (69.8 to 71.4) | 14.2 (13.3 to 15.0) |
| P valueᵇ | | <.001 | .02 | <.001 |
| No. of births | | | | |
| Nulliparous | 107 | 5.8 (5.1 to 6.7) | 75.8 (73.2 to 78.5) | 8.1 (6.7 to 9.9) |
| Primiparous (1 birth) | 97 | 7.0 (6.1 to 8.1) | 73.4 (70.8 to 76.2) | 12.6 (10.3 to 15.5) |
| Multiparous (≥2 births) | 923 | 7.3 (7.0 to 7.7) | 70.1 (69.3 to 71.0) | 14.6 (13.7 to 15.6) |
| P valueᵇ | | .005 | <.001 | <.001 |
| Time between last birth and BBD biopsy, y | | | | |
| 0 (ie, nulliparous) | 107 | 5.2 (4.6 to 5.9) | 73.7 (71.2 to 76.3) | 9.8 (8.1 to 11.8) |
| <20 (among parous women) | 578 | 7.6 (7.1 to 8.0) | 70.4 (69.2 to 71.5) | 15.1 (13.8 to 16.5) |
| ≥20 (among parous women) | 409 | 7.0 (6.5 to 7.5) | 70.3 (68.8 to 71.8) | 14.3 (12.8 to 16.1) |
| P valueᵇ | | <.001 | .04 | <.001 |
| Mean age at first birth among parous women, y | | | | |
| <25 | 563 | 7.1 (6.8 to 7.5) | 70.5 (69.4 to 71.5) | 14.8 (13.8 to 15.8) |
| 25-29 | 359 | 7.7 (7.2 to 8.2) | 69.6 (68.3 to 70.9) | 14.4 (13.2 to 15.7) |
| ≥30 | 101 | 7.1 (6.3 to 8.1) | 72.8 (70.3 to 75.4) | 12.8 (10.9 to 15.0) |
| P valueᵇ | | .19 | .08 | .27 |
| Birth index among parous women | | | | |
| ≤30 | 229 | 7.4 (6.8 to 8.1) | 72.0 (70.3 to 73.8) | 13.3 (11.9 to 14.9) |
| 31-59 | 281 | 7.7 (7.2 to 8.3) | 71.5 (70.0 to 72.9) | 13.8 (12.6 to 15.2) |
| ≥60 | 231 | 7.8 (7.2 to 8.6) | 70.9 (69.2 to 72.6) | 13.5 (12.1 to 15.0) |
| P valueᵇ | | .65 | .67 | .85 |
| Breastfeeding among parous women | | | | |
| Never | 409 | 7.1 (6.7 to 7.5) | 70.2 (69.0 to 71.4) | 15.3 (14.1 to 16.5) |
| <6 mo | 209 | 7.3 (6.7 to 8.0) | 70.9 (69.2 to 72.6) | 15.4 (13.8 to 17.2) |
| ≥6 mo | 305 | 7.5 (7.0 to 8.1) | 70.4 (69.0 to 71.8) | 13.5 (12.3 to 14.8) |
| P valueᵇ | | .47 | .79 | .09 |
| Menopausal status | | | | |
| Pre | 679 | 7.8 (7.3 to 8.2) | 72.4 (71.2 to 73.5) | 13.2 (12.0 to 14.4) |
| Post | 365 | 6.3 (5.7 to 6.9) | 68.8 (67.2 to 70.6) | 13.7 (11.9 to 15.7) |
| P valueᵇ | | .001 | .004 | .71 |
ᵃ Data presented for age are means (95% CI). Data for other variables are presented as age-adjusted means (95% CI); age was adjusted as a continuous variable. ANCOVA = analysis of covariance; BBD = benign breast disease; BC = breast cancer; BMI = body mass index; CI = confidence interval.

ᵇ The P values were obtained using ANCOVA adjusting for age at BBD biopsy.

Parous women had more epithelium and fat and less stroma compared with nulliparous women (all P <.05; ANCOVA). When parous women were further subdivided, women who had 2 or more births (multiparous) had more epithelium and fat but less stroma than women who had 1 birth (primiparous) or nulliparous women (P <.05). Women who had their last birth within 20 years had more epithelium and fat compared with nulliparous women and women who had their last birth 20 or more years before BBD diagnosis (P <.05). Postmenopausal women had less epithelium (P =.001; ANCOVA) and stroma (P =.004) compared with premenopausal women. The age of menarche was positively associated with the amount of stroma (P =.01; ANCOVA). Age at first birth, birth index, and breastfeeding were not associated with breast tissue composition.

Tissue Composition and BC Risk

Higher percentages of epithelium were statistically significantly associated with subsequent BC risk when accounting for matching factors (OR = 1.53, 95% CI = 1.04 to 2.27 comparing highest and lowest quartiles, Ptrend=.02). On additional adjustment for BBD histological subtype, parity, menopausal status, and BMI, the association attenuated modestly: the 95% CI for the highest vs lowest quartile comparison included 1, but the trend across quartiles remained statistically significant (OR = 1.39, 95% CI = 0.91 to 2.14 comparing highest and lowest quartiles, Ptrend=.047; Table 2). Neither the amount of stroma nor fat was associated with BC risk (all Ptrend>.05; Table 2).

Table 2.

The association between tissue composition and BC risk was evaluated using unconditional logistic regression models to estimate odds ratios and 95% confidence intervalsᵃ

| Tissue region | Quartile 1 | Quartile 2 | Quartile 3 | Quartile 4 | Ptrendᵇ |
| --- | --- | --- | --- | --- | --- |
| Epithelium | | | | | |
| Cases/controls, No. | 56/283 | 65/283 | 68/283 | 104/283 | |
| Quartile cutoff, % | <4.8 | ≥4.8 to <7.5 | ≥7.5 to <11.2 | ≥11.2 | |
| Model 1, OR (95% CI) | Ref | 1.12 (0.76 to 1.67) | 1.12 (0.75 to 1.67) | 1.53 (1.04 to 2.27) | .02 |
| Model 2, OR (95% CI) | Ref | 0.95 (0.63 to 1.43) | 0.92 (0.61 to 1.39) | 1.36 (0.91 to 2.03) | .06 |
| Model 3, OR (95% CI) | Ref | 0.95 (0.61 to 1.49) | 0.95 (0.61 to 1.49) | 1.39 (0.91 to 2.14) | .047 |
| Fibrous stroma | | | | | |
| Cases/controls, No. | 62/283 | 67/283 | 78/283 | 86/283 | |
| Quartile cutoff, % | <64.5 | ≥64.5 to <73.5 | ≥73.5 to <81.3 | ≥81.3 | |
| Model 1, OR (95% CI) | Ref | 0.98 (0.66 to 1.45) | 1.07 (0.73 to 1.57) | 1.20 (0.81 to 1.76) | .33 |
| Model 2, OR (95% CI) | Ref | 0.87 (0.58 to 1.30) | 0.96 (0.65 to 1.42) | 1.07 (0.72 to 1.59) | .65 |
| Model 3, OR (95% CI) | Ref | 0.78 (0.51 to 1.20) | 0.86 (0.56 to 1.31) | 0.93 (0.61 to 1.41) | .85 |
| Fat | | | | | |
| Cases/controls, No. | 102/283 | 80/283 | 49/283 | 62/283 | |
| Quartile cutoff, % | <8.7 | ≥8.7 to <16.7 | ≥16.7 to <27.0 | ≥27.0 | |
| Model 1, OR (95% CI) | Ref | 0.81 (0.57 to 1.15) | 0.55 (0.36 to 0.81) | 0.75 (0.50 to 1.12) | .11 |
| Model 2, OR (95% CI) | Ref | 0.81 (0.56 to 1.15) | 0.55 (0.36 to 0.82) | 0.83 (0.55 to 1.25) | .27 |
| Model 3, OR (95% CI) | Ref | 0.83 (0.58 to 1.21) | 0.56 (0.36 to 0.85) | 0.93 (0.59 to 1.45) | .52 |
ᵃ Each tissue region was categorized into quartiles as defined by the distribution among the controls. Model 1 adjusted for matching factors. Model 2 adjusted for matching factors and BBD histological subtypes. Model 3 adjusted for matching factors, BBD histological subtypes, parity, menopausal status, and BMI. BC = breast cancer; BBD = benign breast disease; BMI = body mass index; CI = confidence interval; OR = odds ratio.

ᵇ The median value for each quartile was included as a continuous variable in the unconditional logistic regression for models 1, 2, and 3 to obtain the Ptrend value (Wald test).

Within the proliferative without atypia subtype of BBD, women with a percentage of epithelium in the fourth quartile had a higher BC risk compared with women in the first quartile (adjusted OR = 1.92, 95% CI = 1.11 to 3.40, Ptrend=.01; Supplementary Table 4, available online). In general, the association between tissue regions and BC risk defined by tumor ER expression demonstrated no heterogeneity. Fat was associated with lower risk of ER-positive BC in the crude model 1 (second vs first tertile: OR = 0.62, 95% CI = 0.42 to 0.92; third vs first tertile: OR = 0.62, 95% CI = 0.41 to 0.95, Ptrend=.04; Supplementary Table 5, available online).

Further analyses were conducted to understand the substitution effects by using each tissue region as a continuous variable per 10% change and with 2 of the 3 tissue regions in the model. The association between per 10% change of epithelium and BC risk remained the strongest in fully adjusted models, irrespective of whether it was substituted for stroma (adjusted OR = 1.30, 95% CI = 1.05 to 1.61) or fat tissue (adjusted OR = 1.26, 95% CI = 1.03 to 1.54; Supplementary Table 6, available online). The ratio of epithelium to fibrous stroma was statistically significantly associated with BC risk in the fully adjusted model (OR = 1.29, 95% CI = 1.05 to 1.59). When stratified by BBD histological subtype, the association of this ratio and BC risk only remained statistically significant among women with nonproliferative subtype of BBD (matching factor adjusted model 1 OR = 1.42, 95% CI = 1.02 to 1.99; fully adjusted model 3 OR = 1.44, 95% CI = 1.00 to 2.06; Supplementary Table 7, available online).
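As a reading aid for the odds ratios above, the sketch below shows how an OR and its 95% CI relate to a logistic regression coefficient and its standard error under the usual Wald approximation. It back-calculates the coefficient and standard error from the reported epithelium-for-stroma estimate (OR = 1.30, 95% CI = 1.05 to 1.61) and then rebuilds the interval. The function names are hypothetical, and the Wald interval is an assumption about how the CIs were obtained.

```python
import math


def beta_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from an OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported estimate for a 10% change in epithelium substituted for stroma.
beta, se = beta_and_se(1.30, 1.05, 1.61)
or_per_10, low, high = odds_ratio_with_ci(beta, se)
print(round(or_per_10, 2), round(low, 2), round(high, 2))
```

The rebuilt interval agrees with the reported one to two decimal places, which is what one would expect if the published CI is symmetric on the log-odds scale.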

Morphometric Signature

The morphometric signature built using training data consisted of 4 features in the epithelium (area under the receiver operating characteristic curve [AUC ROC] = 0.61, optimal λ = 0.08). When evaluated on the test set of 570 women, the AUC ROC was 0.51. Because of the poor AUC ROC in the test set, the association of the signature score with BC was not evaluated further.
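To make the AUC ROC values concrete: the AUC equals the probability that a randomly chosen case receives a higher signature score than a randomly chosen control, so 0.5 corresponds to chance. The following is a minimal sketch of that pairwise definition; the scores and labels are made up for illustration and are not study data.

```python
def auc_roc(scores, labels):
    """AUC as the fraction of case-control pairs ranked correctly.

    labels: 1 for cases, 0 for controls. Tied scores count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical signature scores for three cases and three controls.
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
labels = [1, 0, 1, 1, 0, 0]
auc = auc_roc(scores, labels)
print(auc)
```

A test-set AUC of 0.51, as reported above, therefore means the signature ranked cases above controls barely more often than a coin flip would.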

Discussion

The identification of new biomarkers may improve BC risk prediction. We developed a deep-learning–based computational pathology method to segment BBD histopathological images into epithelial, fibrous stroma, and fat regions. Among women who did not develop BC, BBD histological subtypes, parity, and number of births were statistically significantly associated with breast tissue composition; the direction of association varied by tissue region. Select regions were associated with body size, BMI, age of menarche, and menopausal status. Women whose breast tissues had higher percentages of epithelium had a statistically significantly increased risk of BC compared with women with lower percentages, especially among women with proliferative without atypia subtype of BBD. The ratio of epithelium to stroma was also statistically significantly associated with BC risk, particularly among women with nonproliferative subtype of BBD. We were unable to construct a BC morphometric signature. Our study showed that the percentage of epithelium may be used as a potential biomarker of BC risk.

BBD and BC originate from TDLUs. The epithelium captured by our computational method was all-encompassing. This study was the first, to our knowledge, to demonstrate a direct quantitative relationship between the percentage of epithelium and BC risk in women diagnosed with BBD, supporting the long-held hypothesis that elevated cellular mass increases cancer risk (58). Some lesion types within the proliferative without atypia subtype, such as adenosis and radial scar, are highly cellular, which may explain why, when stratified by BBD histological subtype, the association of the percentage of epithelium and BC risk remained statistically significant among those women. Our study also demonstrated that the ratio of epithelium to fibrous stroma may be an important measure to further refine the BC risk among women with the nonproliferative subtype of BBD.

The associations of age-adjusted breast tissue composition and BC risk factors among controls provided histopathological evidence to support epidemiological studies, mainly by demonstrating the link between breast tissue cellularity and cancer risk (58). Our work suggests that risk factors have different influences on epithelium and stroma. Gertig et al. (59) evaluated the proportion of epithelium and stroma in 300 BBD women who did not develop BC. Our findings support Gertig et al. (59) by also demonstrating that breast tissues associated with the nonproliferative subtype of BBD were less cellular (ie, lower epithelium and stroma but higher fat percentages) than proliferative subtypes, thus partly explaining why women with the nonproliferative subtype have lower BC risk (4,39,60‐62).

Adiposity during childhood or in young adults is inversely associated with BC risk (63‐65). Body adiposity is correlated with the amount of breast fat when evaluated using percentage mammographic density (ie, proportion of dense [epithelium and stroma] to nondense tissues [fat]) (66,67). In 153 normal breast tissue samples, Gabrielson et al. (68) observed statistically significant inverse associations of BMI with percentages of epithelium and stroma. Our study and the study by Gertig et al. (59), conducted using more participants, observed a statistically significant inverse association only between BMI and proportion of stroma. Nevertheless, all 3 studies provided histological evidence to partially explain the differential BC risk by adiposity; breast tissues of women with a larger childhood body size or younger women with a BMI of 30 or more have lower overall cellularity and thus are less dense compared with women with a leaner childhood body size or women with lower BMI, respectively.

Parity had the strongest influence on breast tissue composition among the reproductive risk factors investigated in our study. Gertig et al. (59) and Gabrielson et al. (68) observed more epithelium and less stroma in parous women compared with nulliparous women. Our findings in multiparous women who had a live birth within the last 20 years were similar to other studies that observed less TDLU involution in parous vs nulliparous women (30,69); supported epidemiological reports of increased BC risk in parous women who had a live birth within the last 5 to 24 years compared with nulliparous women (70); and highlighted the extensive stroma remodeling in mammary glands during pregnancy to accommodate expanding epithelium (71). The correlation between age of menarche and proportion of stroma reported by us and others (59,68) is in line with a higher percent breast density in young women who had later ages of menarche (72).

The null associations between age of first birth and length of breastfeeding with breast tissue composition agreed with Gertig et al. (59), whereas Gabrielson et al. (68) found an association between percentage of epithelium and length of breastfeeding, but not percentage of stroma. Using our other method that measures normal TDLUs, we also did not find an association between length of breastfeeding and TDLU involution (30). Older women have less dense breasts than younger women, with the greatest change in density occurring during the menopause years (73). Indeed, we and Gertig et al. (59) reported that postmenopausal women had less epithelium and stroma compared with premenopausal women. However, this was not observed by Gabrielson et al. (68), possibly due to low power.

Computer-derived morphometric signatures have shown potential as prognostic or diagnostic biomarkers (17,18,74). We did not identify a BC morphometric signature in women with BBD. Morphometric feature data are typically noisy. In an effort to reduce signal noise, we attempted to create a BC signature within each BBD histopathological subtype but were unsuccessful because of low power. Extracting and combining morphometric features from different types of epithelium may have diluted meaningful signals. Using the median metric, a common method of aggregating morphometric features (17), may not be optimal for this dataset. There is no gold standard method for feature aggregation, and this remains an active area of research. Future work can include improving methods for morphometric feature aggregation or creating specific BC morphometric signatures for each type of BBD lesion.

The strengths of our study include the application of a computational pathology method to assess breast tissue composition in a large study with rich data on risk factors (2,3,24,32,33,40), centralized pathology review of BBD samples, and confirmation of BC cases through review of medical records. Some limitations of our study include being underpowered to evaluate the association of breast composition and ER-negative BC, BC molecular subtypes (75,76), or mammographic density (77,78) because mammogram data were available for only 105 women (7.8%) in this study. Our findings were limited to White women, the predominant race of the NHS and NHSII participants. Dysfunctional epithelial-stroma interactions in the breast have been implicated in breast carcinogenesis (79); however, our study was not designed to investigate epithelium-stroma interactions. Lastly, the majority of our BBD biopsies were surgical biopsies, and the sampling of breast tissue may not be random in nature; pathologists tend to oversample nonfatty tissue for histological processing because firm and fibrous regions are more likely to represent cancer. Although such selection bias may result in misclassification or measurement error, the sampling would have been conducted at the time of BBD biopsy and is unlikely to differ between those who later developed BC and those who did not.

In conclusion, we found that BBD histopathological subtypes, anthropometric risk factors, and select reproductive risk factors were associated with breast tissue composition. Higher percentages of epithelium were associated with increased risk of BC, specifically among women with the proliferative without atypia subtype of BBD. No morphometric signature was associated with subsequent BC. Future work can include incorporating the percentage of epithelium into risk assessment models and exploring end-to-end deep-learning BC prediction models. Studies could also examine how modifiable BC risk factors modulate breast tissue composition.

Funding

This work was supported by the National Cancer Institute of the National Institutes of Health R21CA187642 (RMT), R01CA175080 (RMT), R01CA240341 (RMT, YJH), UM1CA186107 (AHE), and U01CA176726 (AHE); Susan G. Komen for the Cure IIR13264020 (RMT); Breast Cancer Research Foundation 17–174, the Klarman Family Foundation (YJH); and Beth Israel Deaconess Medical Center High School Summer Research Program (ADV).

Footnotes

Role of the funder: The funding sources listed in the Funding section were not involved in the design of the study; the collection, analysis, and interpretation of the data; the writing of the manuscript; and the decision to submit the manuscript for publication.

Disclosures: K. Sirinukunwattan is a co-founder of University of Oxford spinout Ground Truth Labs. Ground Truth Labs has no financial or commercial interest in this work. All other authors have nothing to disclose. All authors have no conflict of interest.

Prior presentation: Part of this work was previously presented at the United States and Canadian Academy of Pathology 2018 annual meeting in Vancouver, BC, Canada.

Author contributions: Conceived and designed the study: RMT YJH KS. Data analysis: YJH ADV KS MV KHK RMT. Epidemiological data collection: RMT AHE KHK. Breast pathology expertise: GMB LCC SJS JLC. Computational method and data acquisition: ADV KS ALS MEP YJH. All authors contributed to the writing and reviewing of the manuscript.

Acknowledgements: We thank the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. The authors assume full responsibility for analyses and interpretation of these data.

Data Availability

The source code for our deep learning networks is available at https://github.com/avellal14/BBD_Pipeline. The data that support the findings of this study are available from the Nurses’ Health Studies. Investigators interested in using the data can request access, and feasibility will be discussed at an investigators meeting. Limits are not placed on scientific questions or methods, and there is no requirement for co-authorship. Data sharing information and policy details are available at http://www.nurseshealthstudy.org/researchers.

Supplementary Material

pkaa119_Supplementary_Data

Contributor Information

Adithya D Vellal, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Korsuk Sirinukunwattan, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Engineering Science, Institute of Biomedical Engineering (IBME), University of Oxford, Oxford, UK; Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University NHS Foundation Trust, Oxford, UK.

Kevin H Kensler, Division of Population Sciences, Dana Farber Cancer Institute, Boston, MA, USA.

Gabrielle M Baker, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Andreea L Stancu, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Michael E Pyle, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Laura C Collins, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Stuart J Schnitt, Dana-Farber/Brigham and Women's Cancer Center, Harvard Medical School, Dana-Farber Cancer Institute-Brigham and Women's Hospital, Boston, MA, USA.

James L Connolly, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Mitko Veta, Medical Image Analysis Group, Eindhoven University of Technology, Eindhoven, the Netherlands.

A Heather Eliassen, Channing Division of Network Medicine, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Rulla M Tamimi, Channing Division of Network Medicine, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.

Yujing J Heng, Department of Pathology, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA.

References

  • 1. DeSantis CE, Ma J, Gaudet MM, et al. Breast cancer statistics, 2019. CA Cancer J Clin. 2019;69(6):438–451.
  • 2. Aroner SA, Collins LC, Connolly JL, et al. Radial scars and subsequent breast cancer risk: results from the Nurses’ Health Studies. Breast Cancer Res Treat. 2013;139(1):277–285.
  • 3. Collins LC, Aroner SA, Connolly JL, Colditz GA, Schnitt SJ, Tamimi RM. Breast cancer risk by extent and type of atypical hyperplasia: an update from the Nurses’ Health Studies. Cancer. 2016;122(4):515–520.
  • 4. Dupont WD, Page DL. Risk factors for breast cancer in women with proliferative breast disease. N Engl J Med. 1985;312(3):146–151.
  • 5. Kensler KH, Beca F, Baker GM, et al. Androgen receptor expression in normal breast tissue and subsequent breast cancer risk. NPJ Breast Cancer. 2018;4(1):33.
  • 6. Zhang H, Ahearn T, Lecarpentier J, et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat Genet. 2020;52(6):572–581.
  • 7. Wang J, Eliassen AH, Spiegelman D, Willett WC, Hankinson SE. Plasma free 25-hydroxyvitamin D, vitamin D binding protein, and risk of breast cancer in the Nurses’ Health Study II. Cancer Causes Control. 2014;25(7):819–827.
  • 8. Kotsopoulos J, McGee EE, Lozano-Esparza S, et al. Premenopausal plasma osteoprotegerin and breast cancer risk: a case-control analysis nested within the Nurses’ Health Study II. Cancer Epidemiol Biomarkers Prev. 2020;29(6):1264–1270.
  • 9. Tice J, Cummings S, Smith-Bindman R, Ichikawa L, Barlow W, Kerlikowske K. Using clinical factors and mammographic breast density to estimate breast cancer risk: development and validation of a new predictive model. Ann Intern Med. 2008;148(5):337–347.
  • 10. Tice JA, Miglioretti DL, Li C-S, Vachon CM, Gard CC, Kerlikowske K. Breast density and benign breast disease: risk assessment to identify women at high risk of breast cancer. J Clin Oncol. 2015;33(28):3137–3143.
  • 11. Rice MS, Tworoger SS, Hankinson SE, et al. Breast cancer risk prediction: an update to the Rosner-Colditz breast cancer incidence model. Breast Cancer Res Treat. 2017;166(1):227–240.
  • 12. Zhang X, Rice M, Tworoger SS, et al. Addition of a polygenic risk score, mammographic density, and endogenous hormones to existing breast cancer risk prediction models: a nested case-control study. PLoS Med. 2018;15(9):e1002644.
  • 13. Pastor-Barriuso R, Ascunce N, Ederra M, et al. Recalibration of the Gail model for predicting invasive breast cancer risk in Spanish women: a population-based cohort study. Breast Cancer Res Treat. 2013;138(1):249–259.
  • 14. Colditz GA, Rosner B. Cumulative risk of breast cancer to age 70 years according to risk factor status: data from the Nurses’ Health Study. Am J Epidemiol. 2000;152(10):950–964.
  • 15. Veta M, Heng YJ, Stathonikos N, et al. Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge. Med Image Anal. 2019;54:111–121.
  • 16. Bejnordi BE, Veta M, van Diest PJ, et al.; the CAMELYON16 Consortium. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199–2210.
  • 17. Tian K, Rubadue CA, Lin DI, et al. Automated clear cell renal carcinoma grade classification with prognostic significance. PLoS One. 2019;14(10):e0222641.
  • 18. Beck AH, Sangoi AR, Leung S, et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci Transl Med. 2011;3(108):108ra113.
  • 19. Tabesh A, Teverovskiy M, Pang HY, et al. Multifeature prostate cancer diagnosis and Gleason grading of histological images. IEEE Trans Med Imaging. 2007;26(10):1366–1378.
  • 20. Yu K-H, Zhang C, Berry GJ, et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun. 2016;7(1):12474.
  • 21. Ehteshami Bejnordi B, Lin J, Glass B, et al. Deep learning-based assessment of tumor-associated stroma for diagnosing breast cancer in histopathology images. Proc IEEE Int Symp Biomed Imaging. 2017;2017:929–932.
  • 22. Xu J, Luo X, Wang G, Gilmore H, Madabhushi A. A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing. 2016;191:214–223.
  • 23. Dong F, Irshad H, Oh EY, et al. Computational pathology to discriminate benign from malignant intraductal proliferations of the breast. PLoS One. 2014;9(12):e114885.
  • 24. Baer HJ, Collins LC, Connolly JL, Colditz GA, Schnitt SJ, Tamimi RM. Lobule type and subsequent breast cancer risk: results from the Nurses’ Health Studies. Cancer. 2009;115(7):1404–1411.
  • 25. Milanese TR, Hartmann LC, Sellers TA, et al. Age-related lobular involution and risk of breast cancer. J Natl Cancer Inst. 2006;98(22):1600–1607.
  • 26. McKian KP, Reynolds CA, Visscher DW, et al. Novel breast tissue feature strongly associated with risk of breast cancer. J Clin Oncol. 2009;27(35):5893–5898.
  • 27. Figueroa JD, Pfeiffer RM, Brinton LA, et al. Standardized measures of lobular involution and subsequent breast cancer risk among women with benign breast disease: a nested case-control study. Breast Cancer Res Treat. 2016;159(1):163–172.
  • 28. Wetstein SC, Onken AM, Baker GM, et al. Detection of acini in histopathology slides: towards automated prediction of breast cancer risk. In: Tomaszewski JE, Ward AD, eds. Medical Imaging 2019: Digital Pathology, vol 10956. San Diego, CA: SPIE; 2019:152–158.
  • 29. Wetstein SC, Onken AM, Luffman C, et al. Deep learning assessment of breast terminal duct lobular unit involution: towards automated prediction of breast cancer risk. PLoS One. 2020;15(4):e0231653.
  • 30. Kensler KH, Liu EZ, Wetstein SC, et al. The association of automated quantitative measures of terminal duct lobular unit involution and breast cancer risk. Cancer Epidemiol Biomarkers Prev. 2020;29(11):2358–2368.
  • 31. Colditz GA, Hankinson SE. The Nurses’ Health Study: lifestyle and health among women. Nat Rev Cancer. 2005;5(5):388–396.
  • 32. Collins LC, Baer HJ, Tamimi RM, Connolly JL, Colditz GA, Schnitt SJ. The influence of family history on breast cancer risk in women with biopsy-confirmed benign breast disease: results from the Nurses’ Health Study. Cancer. 2006;107(6):1240–1247.
  • 33. Tamimi RM, Byrne C, Baer HJ, et al. Benign breast disease, recent alcohol consumption, and risk of breast cancer: a nested case-control study. Breast Cancer Res. 2005;7(4):R555–62.
  • 34. Bao Y, Bertoia ML, Lenart EB, et al. Origin, methods, and evolution of the three Nurses’ Health Studies. Am J Public Health. 2016;106(9):1573–1581.
  • 35. Su X, Colditz GA, Willett WC, et al. Genetic variation and circulating levels of IGF-I and IGFBP-3 in relation to risk of proliferative benign breast disease. Int J Cancer. 2010;126(1):180–190.
  • 36. Baer HJ, Schnitt SJ, Connolly JL, et al. Early life factors and incidence of proliferative benign breast disease. Cancer Epidemiol Biomarkers Prev. 2005;14(12):2889–2897.
  • 37. Farland LV, Tamimi RM, Eliassen AH, et al. A prospective study of endometriosis and risk of benign breast disease. Breast Cancer Res Treat. 2016;159(3):545–552.
  • 38. Jung MM, Colditz GA, Collins LC, Schnitt SJ, Connolly JL, Tamimi RM. Lifetime physical activity and the incidence of proliferative benign breast disease. Cancer Causes Control. 2011;22(9):1297–1305.
  • 39. Collins LC, Baer HJ, Tamimi RM, Connolly JL, Colditz GA, Schnitt SJ. Magnitude and laterality of breast cancer risk according to histologic type of atypical hyperplasia: results from the Nurses’ Health Study. Cancer. 2007;109(2):180–187.
  • 40. Oh H, Eliassen AH, Wang M, et al. Expression of estrogen receptor, progesterone receptor, and Ki67 in normal breast tissue in relation to subsequent risk of breast cancer. NPJ Breast Cancer. 2016;2(1):16032.
  • 41. Beca F, Kensler K, Glass B, Schnitt SJ, Tamimi RM, Beck AH. EZH2 protein expression in normal breast epithelium and risk of breast cancer: results from the Nurses’ Health Studies. Breast Cancer Res. 2017;19(1):R32.
  • 42. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. http://arxiv.org/abs/1409.1556.
  • 43. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. http://arxiv.org/abs/1505.04597. Published May 18, 2015. Accessed May 1, 2018.
  • 44. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE; 2015:3431–3440.
  • 45. Stark F, Hazirbas C, Triebel R, Cremers D. CAPTCHA recognition with active deep learning. In: GCPR Workshop on New Challenges in Neural Computation; 2015.
  • 46. Scheffer T, Decomain C, Wrobel S. Active hidden Markov models for information extraction. In: Hoffmann F, Hand DJ, Adams N, Fisher D, Guimaraes G, eds. Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Berlin, Heidelberg: Springer; 2001.
  • 47. Kingma DP, Ba JL. Adam: a method for stochastic optimization. https://arxiv.org/abs/1412.6980.
  • 48. Kumar N, Verma R, Sharma S, Bhargava S, Vahadane A, Sethi A. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans Med Imaging. 2017;36(7):1550–1560.
  • 49. Tamimi RM, Baer HJ, Marotti J, et al. Comparison of molecular phenotypes of ductal carcinoma in situ and invasive breast cancer. Breast Cancer Res. 2008;10(4):R67.
  • 50. Sisti JS, Collins LC, Beck AH, Tamimi RM, Rosner BA, Eliassen AH. Reproductive risk factors in relation to molecular subtypes of breast cancer: results from the Nurses’ Health Studies. Int J Cancer. 2016;138(10):2346–2356.
  • 51. Van Der Walt S, Schönberger JL, Nunez-Iglesias J, et al. Scikit-image: image processing in Python. PeerJ. 2014;2:e453.
  • 52. Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. IEEE Trans Syst Man Cybern. 1973;SMC-3(6):610–621.
  • 53. Doshi N, Schaefer G. A comparative analysis of local binary pattern texture classification. In: 2012 IEEE Visual Communications and Image Processing, VCIP 2012. San Diego, CA; 2012:1–6.
  • 54. Martínez A, Martínez J, Pérez H, Quirós R. Image processing using Voronoi diagrams. In: Proceedings of the 2007 International Conference on Image Processing, Computer Vision, and Pattern Recognition, IPCV 2007. Las Vegas, NV; 2007:485–491.
  • 55. Kohout J. On digital image representation by the Delaunay triangulation. In: Mery D, Rueda L, eds. Advances in Image and Video Technology. PSIVT 2007. Lecture Notes in Computer Science, vol 4872. Berlin, Heidelberg: Springer; 2007.
  • 56. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22.
  • 57. Lenth R, Singmann H, Love J, Buerkner P, Herve M. Estimated Marginal Means, aka Least-Squares Means. R package. https://cran.r-project.org/package=emmeans. 2020. Accessed May 1, 2020.
  • 58. Preston-Martin S, Pike MC, Ross RK, Henderson BE, Jones PA. Increased cell division as a cause of human cancer. Cancer Res. 1990;50(23):7415–7421.
  • 59. Gertig DM, Stillman IE, Byrne C, et al. Association of age and reproductive factors with benign breast tissue composition. Cancer Epidemiol Biomarkers Prev. 1999;8(10):873–879.
  • 60. Hartmann LC, Sellers TA, Frost MH, et al. Benign breast disease and the risk of breast cancer. N Engl J Med. 2005;353(3):229–237.
  • 61. Dyrstad SW, Yan Y, Fowler AM, Colditz GA. Breast cancer risk associated with benign breast disease: systematic review and meta-analysis. Breast Cancer Res Treat. 2015;149(3):569–575.
  • 62. Menes TS, Kerlikowske K, Lange J, Jaffer S, Rosenberg R, Miglioretti DL. Subsequent breast cancer risk following diagnosis of atypical ductal hyperplasia on needle biopsy. JAMA Oncol. 2017;3(1):36–41.
  • 63. Schoemaker MJ, Nichols HB, Wright LB, et al.; The Premenopausal Breast Cancer Collaborative Group. Association of body mass index and age with subsequent breast cancer risk in premenopausal women. JAMA Oncol. 2018;4(11):e181771.
  • 64. Baer HJ, Colditz GA, Rosner B, et al. Body fatness during childhood and adolescence and incidence of breast cancer in premenopausal women: a prospective cohort study. Breast Cancer Res. 2005;7(3):R314–25.
  • 65. Baer HJ, Tworoger SS, Hankinson SE, Willett WC. Body fatness at young ages and risk of breast cancer throughout life. Am J Epidemiol. 2010;171(11):1183–1194.
  • 66. Oh H, Rice MS, Warner ET, et al. Early-life and adult anthropometrics in relation to mammographic image intensity variation in the Nurses’ Health Studies. Cancer Epidemiol Biomarkers Prev. 2020;29(2):343–351.
  • 67. Byrne C, Schairer C, Wolfe J, et al. Mammographic features and breast cancer risk: effects with time, age, and menopause status. J Natl Cancer Inst. 1995;87(21):1622–1629.
  • 68. Gabrielson M, Chiesa F, Behmer C, Rönnow K, Czene K, Hall P. Association of reproductive history with breast tissue characteristics and receptor status in the normal breast. Breast Cancer Res Treat. 2018;170(3):487–497.
  • 69. Russo J, Hu YF, Yang X, Russo IH. Developmental, cellular, and molecular basis of human breast cancer. J Natl Cancer Inst Monogr. 2000;2000(27):17–37.
  • 70. Nichols HB, Schoemaker MJ, Cai J, et al. Breast cancer risk after recent childbirth: a pooled analysis of 15 prospective studies. Ann Intern Med. 2019;170(1):22–30.
  • 71. McCready J, Arendt LM, Rudnick JA, Kuperwasser C. The contribution of dynamic stromal remodeling during mammary development to breast carcinogenesis. Breast Cancer Res. 2010;12(3):205.
  • 72. Houghton LC, Jung S, Troisi R, et al. Pubertal timing and breast density in young women: a prospective cohort study. Breast Cancer Res. 2019;21(1):122.
  • 73. Ghosh K, Vachon CM, Pankratz VS, et al. Independent association of lobular involution and mammographic breast density with breast cancer risk. J Natl Cancer Inst. 2010;102(22):1716–1723.
  • 74. Rawat RR, Ruderman D, Macklin P, Rimm DL, Agus DB. Correlating nuclear morphometric patterns with estrogen receptor status in breast cancer pathologic specimens. NPJ Breast Cancer. 2018;4(1):32.
  • 75. Guo C, Sung H, Zheng S, et al. Age-related terminal duct lobular unit involution in benign tissues from Chinese breast cancer patients with luminal and triple-negative tumors. Breast Cancer Res. 2017;19(1):61.
  • 76. Yang XR, Figueroa JD, Falk RT, et al. Analysis of terminal duct lobular unit involution in luminal A and basal breast cancers. Breast Cancer Res. 2012;14(2):R64.
  • 77. Sung H, Guo C, Li E, et al. The relationship between terminal duct lobular unit features and mammographic density among Chinese breast cancer patients. Int J Cancer. 2019;145(1):70–77.
  • 78. Gierach GL, Patel DA, Pfeiffer RM, et al. Relationship of terminal duct lobular unit involution of the breast with area and volume mammographic densities. Cancer Prev Res. 2016;9(2):149–158.
  • 79. Shekhar MPV, Werdell J, Santner SJ, Pauley RJ, Tait L. Breast stroma plays a dominant regulatory role in breast epithelial growth and differentiation: implications for tumor development and progression. Cancer Res. 2001;61(4):1320–1326.

Articles from JNCI Cancer Spectrum are provided here courtesy of Oxford University Press
