Abstract
The predictive capability of combining demographic risk factors, germline genetic variants, and mammogram abnormality features for breast cancer risk prediction is poorly understood. We evaluated the predictive performance of combinations of demographic risk factors, high-risk single nucleotide polymorphisms (SNPs), and mammography features for women recommended for breast biopsy in a retrospective case-control study (n = 768) with four logistic regression models. The area under the ROC curve (AUC) of the baseline demographic features model was 0.580. Both genetic variants and mammography abnormality features augmented the performance of the baseline model: demographics + SNP (AUC = 0.668), demographics + mammography (AUC = 0.702). Finally, we found that the demographics + SNP + mammography model (AUC = 0.753) had the greatest predictive power, with a significant performance improvement over the demographics and demographics + SNP models. The combination of demographic risk factors, genetic variants and imaging features improves breast cancer risk prediction over prior methods utilizing only a subset of these features.
Introduction
Accurate assessment of the likelihood that a woman has breast cancer is necessary for effective clinical decision-making. This goal has prompted the development of breast cancer risk prediction models, through analysis of easily measurable factors that are predictive of a breast cancer diagnosis (1–7). Some of these models are based on epidemiological risk factors. The Breast Cancer Risk Assessment Tool (the Gail model) is a widely used prediction model, estimating 5-year and lifetime risk of breast cancer development based on self-reported risk factors including the number of first-degree relatives with breast cancer, age at menarche, age at first live birth and number of previous breast biopsies (8). However, the tool has been shown to have poor discriminative power (5, 9), and thus is mainly used for decisions around long-term projected risk and benefits, such as the use of chemoprevention. Recently, technical and scientific advances have reduced the cost of genome sequencing and genome-wide association studies (GWAS), which has engendered optimism that models can use individual genetic variants to predict breast cancer risk. However, the initial excitement about these models has been tempered by modest improvements in predictive performance (2, 10) and insufficient clinical utility (7, 11, 12). Imaging, often referred to as an “intermediate phenotype” (13), can represent both the environmental and genetic predispositions (and their complex interactions) in a given patient and may improve risk prediction more than expected based on the individual risk factors alone. While mammography has been widely used for breast cancer screening to reduce mortality, the abnormality features codified in the Breast Imaging Reporting and Data System (BI-RADS) lexicon (14) can also be used to augment risk prediction (15–19).
Studies have demonstrated how abnormality features described using the BI-RADS lexicon, including masses and microcalcifications, can contribute to breast cancer risk prediction (20). Further, studies have investigated how global imaging features, including breast density (21–23) and BI-RADS assessment category (24), can be combined with genetic variants to assess risk. However, specific breast imaging abnormalities have only recently been explored for risk prediction. For example, models of mammographic abnormalities can provide superior information to risk determination based on genetic variants (25). Combining information from genetic features with mammography improved the overall predictive capability of the model for younger patients (26). Subsequently, the use of hierarchical BI-RADS features to model the risk of breast cancer has been evaluated and compared to models of other risk factors (27). At this point, a diverse set of information is available for determining breast cancer risk: the patient's epidemiological risk factors, abnormalities seen on mammography, and genetic variants. However, few datasets exist containing all of these elements, and little work has been done investigating how these variables improve risk prediction when combined. The objective of this study is to evaluate whether breast cancer risk prediction can be improved by combining a set of the patient's demographic risk factors, the hierarchical mammography features and a panel of genetic variants.
Methods
Subjects
The study participants were women enrolled in the Marshfield Clinic Personalized Medicine Research Project (PMRP), which has been described previously (28, 29). The PMRP included patients aged 18 years and older residing near Marshfield, Wisconsin who provided a blood sample from which DNA, plasma and serum were extracted. The DNA was genotyped on the MassARRAY system (Sequenom, San Diego, CA, USA). Permission was given to link DNA samples with medical records.
Subjects included in the study were women who had a genotyped DNA sample available, a diagnostic mammogram and a subsequent breast biopsy taken within 12 months. Women with BRCA1 or BRCA2 genetic mutations were excluded, as these mutations confer a high risk of breast cancer which would overshadow other risk factors. Women who were not of Western European heritage were also excluded, as there were insufficient numbers for race matching cases and controls.
This was a retrospective case-control study. Breast cancer cases were subjects who had been identified in the Marshfield cancer registry as having a confirmed breast cancer diagnosis, while controls were subjects who had both a benign breast biopsy and no recorded breast cancer diagnosis. We employed an age matching strategy, where a control whose age was within five years of the age of each case was selected to ensure similar age distributions in the case and control cohorts.
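The text does not specify the matching algorithm, so the following is only a minimal sketch of one way an age-matching step could work, assuming greedy one-to-one matching in which each case takes the closest-aged unused control within a five-year window. The function name `match_controls_by_age`, the subject identifiers and the ages are hypothetical, and the study itself ended up with unequal numbers of cases and controls, so its actual procedure likely differed.

```python
from typing import Dict, Optional


def match_controls_by_age(
    case_ages: Dict[str, float],
    control_ages: Dict[str, float],
    max_gap_years: float = 5.0,
) -> Dict[str, Optional[str]]:
    """Greedy 1:1 matching: each case takes the closest-aged unused control."""
    available = dict(control_ages)  # controls not yet assigned to a case
    matches: Dict[str, Optional[str]] = {}
    # Process cases in a fixed order so the result is reproducible.
    for case_id in sorted(case_ages):
        age = case_ages[case_id]
        eligible = [
            (abs(ctrl_age - age), ctrl_id)
            for ctrl_id, ctrl_age in available.items()
            if abs(ctrl_age - age) <= max_gap_years
        ]
        if not eligible:
            matches[case_id] = None  # no control inside the age window
            continue
        _, best = min(eligible)  # smallest age gap wins
        matches[case_id] = best
        del available[best]
    return matches


# Hypothetical ages in years, for illustration only.
cases = {"case_a": 62.0, "case_b": 45.0, "case_c": 80.0}
controls = {"ctrl_1": 60.0, "ctrl_2": 48.5, "ctrl_3": 30.0}
matched = match_controls_by_age(cases, controls)
print(matched)
```

In this toy input, `case_c` stays unmatched because no control is within five years of age 80. A greedy pass is simple but order-dependent; an optimal matching algorithm could pair more subjects when the age windows overlap heavily.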
The Marshfield Clinic Institutional Review Board reviewed and approved this study.
Demographic Risk Factors
We collected demographic variables that are known risk factors for breast cancer (Table 1). We obtained the age of subjects at the time of enrollment, and calculated age at the time of biopsy. From medical records, we manually extracted whether subjects had a family history of breast cancer and the number of prior breast biopsies that each subject had undergone. A woman's age at the first live birth of a child is a risk factor; however, this was not available in our cohort, so parity (number of pregnancies) was used. Parity is both associated with breast cancer risk and correlated with age at first live birth (30). Risk factors such as BRCA status and race were not included for the aforementioned reasons. Data on age at menarche were not available for a large proportion of the cohort and were thus excluded.
Table 1.
Distribution of demographic risk factors in breast cancer cases and controls
| Variables | Cases (N = 373) | Controls (N = 395) | All Subjects (N = 768) | p |
|---|---|---|---|---|
| Mean age | 62.12 (std=13.3) | 61.80 (std=12.2) | 61.95 (std=12.8) | 0.73 |
| No. of first-degree relatives with breast cancer | | | | 0.002 |
| 0 | 268 (71.8%) | 325 (82.3%) | 593 (77.2%) | |
| 1 | 91 (24.4%) | 57 (14.4%) | 148 (19.3%) | |
| ≥2 | 14 (3.8%) | 13 (3.3%) | 27 (3.5%) | |
| No. of biopsies | | | | 0.25 |
| 0 | 303 (81.2%) | 337 (85.3%) | 640 (83.3%) | |
| 1 | 60 (16.1%) | 52 (13.2%) | 112 (14.6%) | |
| ≥2 | 10 (2.7%) | 6 (1.5%) | 16 (2.1%) | |
| No. of pregnancies | | | | 0.51 |
| 0 | 31 (8.3%) | 42 (11.1%) | 73 (9.7%) | |
| 1-2 | 126 (33.9%) | 125 (33.0%) | 251 (33.4%) | |
| 3-5 | 163 (43.8%) | 168 (44.3%) | 331 (44.1%) | |
| ≥6 | 52 (14.0%) | 44 (11.6%) | 96 (12.8%) | |
| Missing | 1 | 16 | 17 | |
std represents standard deviation of ages
Genetic Variants
Genetic variants assessed in this study were SNPs that were predictive of breast cancer risk, and found to be high-frequency/low-penetrance (minor allele frequency >25%) as opposed to low frequency with high penetrance (BRCA1 and BRCA2) or intermediate penetrance (e.g. CHEK-2). The 10 SNPs included in this study (Table 2) have been found to be associated with increased breast cancer risk (7, 28), and validated by several large-scale genome-wide association studies (31, 32). The homozygote pair of high risk alleles for each gene is noted (*) in the table (7, 25).
Table 2.
Distribution of genotypes in breast cancer cases and controls
| SNPs | Cases (N = 373) | Controls (N = 395) | All Subjects (N = 768) | p |
|---|---|---|---|---|
| RS1045485 | | | | 0.69 |
| CC | 4 (1.1%) | 7 (1.8%) | 11 (1.4%) | |
| CG | 79 (21.2%) | 86 (21.8%) | 165 (21.5%) | |
| GG* | 290 (77.7%) | 302 (76.5%) | 592 (77.1%) | |
| RS13281615 | | | | 0.087 |
| AA | 121 (32.4%) | 154 (39.0%) | 275 (35.8%) | |
| AG | 181 (48.5%) | 184 (46.6%) | 365 (47.5%) | |
| GG* | 71 (19.0%) | 57 (14.4%) | 128 (16.7%) | |
| RS13387042 | | | | 0.001 |
| AA* | 126 (33.8%) | 89 (22.5%) | 215 (28.0%) | |
| AG | 179 (48.0%) | 206 (52.2%) | 385 (50.1%) | |
| GG | 68 (18.2%) | 100 (25.3%) | 168 (21.9%) | |
| RS2981582 | | | | 0.22 |
| CC | 134 (35.9%) | 151 (38.2%) | 285 (37.1%) | |
| CT | 173 (46.4%) | 192 (48.6%) | 365 (47.5%) | |
| TT* | 66 (17.7%) | 52 (13.2%) | 118 (15.4%) | |
| RS3803662 | | | | 0.25 |
| CC | 176 (47.2%) | 209 (52.9%) | 385 (50.1%) | |
| CT | 169 (45.3%) | 156 (39.5%) | 325 (42.3%) | |
| TT* | 28 (7.5%) | 30 (7.6%) | 58 (7.6%) | |
| RS3817198 | | | | 0.41 |
| CC* | 36 (9.7%) | 31 (7.8%) | 67 (8.7%) | |
| CT | 170 (45.6%) | 170 (43.0%) | 340 (44.3%) | |
| TT | 167 (44.8%) | 194 (49.1%) | 361 (47.0%) | |
| RS889312 | | | | 0.027 |
| AA | 175 (46.9%) | 196 (49.6%) | 371 (48.3%) | |
| AC | 160 (42.9%) | 179 (45.3%) | 339 (44.1%) | |
| CC* | 38 (10.2%) | 20 (5.1%) | 58 (7.6%) | |
| RS10941679 | | | | 0.022 |
| AA | 182 (48.8%) | 232 (58.7%) | 414 (53.9%) | |
| AG | 164 (44.0%) | 141 (35.7%) | 305 (39.7%) | |
| GG* | 27 (7.2%) | 22 (5.6%) | 49 (6.4%) | |
| RS999737 | | | | 0.086 |
| CC* | 230 (61.7%) | 243 (61.5%) | 473 (61.6%) | |
| CT | 133 (35.7%) | 129 (32.7%) | 262 (34.1%) | |
| TT | 10 (2.7%) | 23 (5.8%) | 33 (4.3%) | |
| RS11249433 | | | | 0.91 |
| CC* | 62 (16.6%) | 69 (17.5%) | 131 (17.1%) | |
| CT | 182 (48.8%) | 187 (47.3%) | 369 (48.0%) | |
| TT | 129 (34.6%) | 139 (35.2%) | 268 (34.9%) | |
(*) High risk homozygote
Mammography Abnormality Features
One diagnostic mammogram that was performed within the 12 months prior to breast biopsy was selected for each subject. If multiple diagnostic mammograms were available, the one that was closest in time to the biopsy or had the greatest number of suspicious features was selected. The results of the mammograms were recorded as free text reports in the Marshfield Clinic electronic health record. We used a natural language parser to extract mammography features as defined in the Breast Imaging Reporting and Data System (BI-RADS) lexicon, third edition (19, 33). This lexicon includes hierarchical features that are predictive of breast cancer. Four predictive abnormality features are mass margins, microcalcification shape, microcalcification distribution, and architectural distortion (20). We extracted these features and categorized them as either “present” or “not present” (Table 3). Microcalcification shape was recorded as present when a suspicious morphology descriptor (amorphous, pleomorphic, or fine linear) was reported, and microcalcification distribution was recorded as present when a suspicious distribution descriptor (clustered, segmental, or linear) was reported.
Table 3.
Distribution of mammography features in breast cancer cases and controls
| Variables | Cases (N=373) | Controls (N=395) | Total (N=768) | p |
|---|---|---|---|---|
| Mass margins | | | | |
| Microlobulated | 3 (0.8%) | 2 (0.5%) | 5 (0.7%) | 0.95 |
| Circumscribed | 24 (6.4%) | 42 (10.6%) | 66 (8.6%) | 0.052 |
| Obscured | 11 (2.9%) | 15 (3.8%) | 26 (3.4%) | 0.65 |
| Ill-defined | 50 (13.4%) | 49 (12.4%) | 99 (12.9%) | 0.76 |
| Spiculated | 82 (22.0%) | 4 (1.0%) | 86 (11.2%) | <0.001 |
| Microcalcification shape | | | | 0.31 |
| Amorphous, pleomorphic or fine linear | 63 (16.9%) | 79 (20.0%) | 142 (18.5%) | |
| None | 310 (83.1%) | 316 (80.0%) | 626 (81.5%) | |
| Microcalcification distribution | | | | 0.009 |
| Clustered, segmental or linear | 79 (21.2%) | 117 (29.6%) | 196 (25.5%) | |
| None | 294 (78.8%) | 278 (70.4%) | 572 (74.5%) | |
| Architectural distortion | | | | <0.001 |
| Not present | 323 (86.6%) | 374 (94.7%) | 697 (90.8%) | |
| Present | 50 (13.4%) | 21 (5.3%) | 71 (9.2%) | |
Statistical analysis
Differences in demographic risk factors, SNPs, and mammography features between breast cancer case and control groups were analyzed using the t-test for continuous variables or the chi-squared test for categorical variables. Four logistic regression models were built with breast cancer case vs. control as the outcome. The model built on demographic risk factors only was considered the baseline model. The additional models were constructed by sequentially including additional data types: (1) Demographics + SNP, (2) Demographics + mammography, and (3) Demographics + SNP + mammography. Age at menarche and breast density were not included in the models due to high proportions of missing data.
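As a worked illustration of the chi-squared comparison, the sketch below applies a Pearson test of independence to the Table 1 counts of first-degree relatives with breast cancer. The study ran its tests in R; this Python version is an independent re-computation, not the authors' code. The function `chi_squared_test` is a hypothetical helper, and it assumes an even number of degrees of freedom so that the p-value has a simple closed form.

```python
import math
from typing import List, Tuple


def chi_squared_test(table: List[List[int]]) -> Tuple[float, int, float]:
    """Pearson chi-squared test of independence for an r x c count table.

    Returns (statistic, degrees of freedom, p-value). The p-value uses the
    closed form of the chi-squared survival function, which only holds for
    an even number of degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    if dof % 2 != 0:
        raise ValueError("closed-form p-value implemented for even dof only")
    half = stat / 2.0
    # P(X > stat) = exp(-stat/2) * sum over k < dof/2 of (stat/2)^k / k!
    p_value = math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(dof // 2)
    )
    return stat, dof, p_value


# Rows: 0, 1, and 2 or more affected first-degree relatives (Table 1).
# Columns: cases, controls.
family_history = [[268, 325], [91, 57], [14, 13]]
stat, dof, p_value = chi_squared_test(family_history)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p_value:.4f}")
```

With two degrees of freedom the p-value rounds to 0.002, in line with the value reported for this variable in Table 1. A statistics library would be the usual choice in practice, since it handles any number of degrees of freedom.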
Receiver operating characteristic (ROC) curves were generated for each model, and the area under the curve (AUC) with 95% confidence intervals (CI) from 2000 bootstrap replicates was calculated to evaluate model performance. These AUC estimates are inherently biased, however, because the predictions were made for the same dataset from which the model was estimated. Therefore, optimism-corrected estimates of AUC were calculated using 2000 bootstrap samples, in order to provide a nearly unbiased estimate of internal validity (34).
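The two bootstrap procedures described above can be sketched as follows. This is a hedged illustration under several assumptions, not the study's R code: `fit_logistic` is a hypothetical stand-in that runs a fixed number of gradient steps rather than a converged maximum-likelihood fit, the confidence interval is a simple unstratified percentile interval, the data are synthetic, and only 40 replicates are drawn (the study used 2000) to keep the example fast.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Scorer = Callable[[Row], float]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random case outscores a random control."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X: List[Row], y: List[int], steps: int = 100, lr: float = 0.5) -> Scorer:
    """Logistic regression by a fixed number of gradient-ascent steps.

    Illustrative stand-in only; it is not run to convergence.
    """
    w = [0.0] * (len(X[0]) + 1)  # intercept followed by one weight per feature
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))
            err = target - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for k, xi in enumerate(row):
                grad[k + 1] += err * xi
        w = [wi + lr * g / len(X) for wi, g in zip(w, grad)]
    final_w = list(w)
    return lambda row: final_w[0] + sum(wi * xi for wi, xi in zip(final_w[1:], row))


def bootstrap_auc(
    X: List[Row], y: List[int], n_boot: int = 200, seed: int = 0
) -> Tuple[float, Tuple[float, float], float]:
    """Return (apparent AUC, percentile 95% CI, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    model = fit_logistic(X, y)
    apparent = auc(y, [model(r) for r in X])
    resampled: List[float] = []
    optimism: List[float] = []
    while len(resampled) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # a resample needs both cases and controls
        # Confidence interval: original model scored on the resampled subjects.
        resampled.append(auc(yb, [model(r) for r in Xb]))
        # Optimism: refit on the resample, then compare its performance on
        # the resample with its performance on the original data.
        boot_model = fit_logistic(Xb, yb)
        optimism.append(
            auc(yb, [boot_model(r) for r in Xb]) - auc(y, [boot_model(r) for r in X])
        )
    resampled.sort()
    low = resampled[int(0.025 * n_boot)]  # approximate 2.5th percentile
    high = resampled[int(0.975 * n_boot) - 1]  # approximate 97.5th percentile
    corrected = apparent - sum(optimism) / n_boot
    return apparent, (low, high), corrected


# Synthetic demo data (hypothetical): one informative and one noise feature.
demo_rng = random.Random(1)
y_demo = [i % 2 for i in range(120)]
X_demo = [[demo_rng.gauss(0.8 * label, 1.0), demo_rng.gauss(0.0, 1.0)] for label in y_demo]
apparent, (ci_low, ci_high), corrected = bootstrap_auc(X_demo, y_demo, n_boot=40)
print(f"apparent={apparent:.3f} CI=({ci_low:.3f}, {ci_high:.3f}) corrected={corrected:.3f}")
```

The key design point is that the optimism term is estimated by refitting the model inside every bootstrap sample, so the correction reflects the whole fitting procedure rather than one fitted model. Subtracting the average optimism from the apparent AUC gives the corrected estimate.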
All statistical tests were two-sided, and 5% (p<0.05) was set as the level of significance. Statistical analyses were performed in R 3.4.2, including the “ROCR”, “pROC”, and “tableone” packages.
Results
The study included 373 cases and 395 controls who had a diagnostic mammogram concerning for breast cancer and thus subsequently had a breast biopsy between January 29, 1989 and December 15, 2010. The majority of the mammograms were performed between 1993 and 2005 (28). The age range for the subjects in this study was 29 to 90 years of age with a mean age of 61.95 (standard deviation 12.8 years). There was no significant difference between the ages of cases and controls, which is expected given the age-matching strategy (p = 0.73). The subjects included patients with both abnormal screening mammograms and those initially presenting with symptoms, thus allowing the model to determine how a combination of demographic, genetic and imaging risk factors aided risk assessment prior to biopsy.
Four models were developed using the 751 subjects (372 cases and 379 controls) with complete data to assess the performance of demographic risk factors, genetic variants (SNPs) and imaging features (mammography abnormality features) on breast cancer prediction (Table 4). Subjects with incomplete data were excluded from the logistic regression models.
Table 4.
Area under the curve (AUC), 95% confidence interval of the original AUC, and optimism-corrected AUC for the 4 models analyzing combinations of demographic risk factors, mammography features and genetic variants (SNPs)
| Model | AUC | 95% CI | Optimism-corrected AUC |
|---|---|---|---|
| Demographics | 0.580 | 0.539-0.620 | 0.549 |
| Demographics + SNPs | 0.668 | 0.630-0.707 | 0.612 |
| Demographics + Mammography | 0.702 | 0.665-0.739 | 0.668 |
| Demographics + Mammography + SNPs | 0.753 | 0.719-0.787 | 0.698 |
In all cases, models incorporating the risk factors were superior to chance, with improvement shown when more features were added (Figure 1, Table 4). The demographics + SNP model (AUC = 0.668) improves on the risk prediction of the baseline demographics model. The demographics + mammography model (AUC = 0.702) also improves on the demographics model, although it is not statistically superior to the demographics + SNP model (95% CI, Table 4). Finally, the demographics + SNP + mammography model (AUC = 0.753) is an improvement over the other models, and is statistically superior to both the baseline demographic model and the demographics + SNP model. However, the demographics + SNP + mammography model is not statistically superior to a model utilizing only demographic and mammographic features (Table 4).
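The comparisons above can be checked against Table 4 with a short calculation. The sketch below assumes that "statistically superior" was judged by non-overlapping 95% confidence intervals, which is how the text's reference to the intervals reads; the source does not state the rule explicitly. Non-overlap is a conservative criterion, since two models can differ significantly even when their intervals overlap.

```python
from itertools import combinations

# (apparent AUC, 95% CI lower bound, 95% CI upper bound) from Table 4.
models = {
    "Demographics": (0.580, 0.539, 0.620),
    "Demographics + SNPs": (0.668, 0.630, 0.707),
    "Demographics + Mammography": (0.702, 0.665, 0.739),
    "Demographics + Mammography + SNPs": (0.753, 0.719, 0.787),
}


def intervals_overlap(a, b):
    """True when the confidence intervals of two table entries overlap."""
    return a[1] <= b[2] and b[1] <= a[2]


# Pairs are listed weaker model first, because Table 4 is ordered by AUC.
separated = [
    (first, second)
    for first, second in combinations(models, 2)
    if not intervals_overlap(models[first], models[second])
]
for first, second in separated:
    print(f"{second} > {first} (non-overlapping 95% CIs)")
```

Under this rule, four of the six model pairs are separated. The two pairs that are not separated are demographics + SNPs versus demographics + mammography, and demographics + mammography versus the full model, which matches the description in the text.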
Figure 1.
ROC curves of four prediction models.
In our subjects, some demographic risk factors were significantly associated with increased risk of breast cancer in the demographics + SNP + mammography model while others did not meet the significance threshold. Subjects with one first-degree relative with breast cancer had an increased risk of breast cancer compared to subjects with no first-degree relatives with breast cancer (p < 0.001) (Table 1). Additionally, subjects with 1-2 pregnancies had an increased risk of breast cancer compared to subjects with no pregnancies (p = 0.05). However, the number of previous biopsies and age of patients were not significantly associated with an increased risk of breast cancer.
The distribution of high risk alleles for the 10 SNPs in the case and control groups is presented in Table 2. In the combined model, the high risk alleles for the following genetic variants were associated with breast cancer: RS13281615 (p < 0.05), RS13387042 (p < 0.05), RS889312 (p < 0.05), RS10941679 (p < 0.05). The remaining genetic variants were not found to significantly contribute to increased risk of breast cancer: RS1045485, RS2981582, RS3803662, RS3817198, RS999737, RS11249433. In all cases, the percentage of subjects homozygous for the low risk allele is lower in cases than controls (Table 2), although this is not statistically significant for all genetic variants.
In the demographics + SNP + mammography model, the mammography abnormality features representing pathologic architectural distortion and microcalcification distribution were significantly associated (p=0.017 and p=0.012, respectively) with increased risk of breast cancer (Table 3). Microcalcification shape was not found to be a significant predictor of malignancy in this model (p = 0.20). Of the mass margin features analyzed, only spiculated margins were significant predictors of malignancy in the combined model (p < 0.001).
Discussion
Multiple sources of information are available for assessing breast cancer risk. Demographic factors, germline genetic variants and general mammographic features (breast density and BI-RADS assessment category) have previously been associated with increased breast cancer risk. We have quantified the predictive value of combining demographic factors, germline genetic variants and mammography abnormality features (i.e. intermediate phenotypic information) in women recommended for breast biopsy. We found that the predictive value of combining genetic variants and demographic risk factors is significantly higher than that of demographic risk factors alone. Further, we found that incorporating mammography abnormality features improved breast cancer risk prediction. Finally, we found that the combination of all modalities of features (mammography, genetics and demographics) resulted in the highest predictive performance. Clinical decision making requires incorporating the best information about the patient; understanding how a diverse set of factors can contribute to assessing the likelihood that a patient has breast cancer aids decisions about recommending diagnostic tests (including invasive tests like biopsy) and pursuing treatment.
Prior studies have explored the value of genetic variants in breast cancer risk prediction. GWAS have identified a series of genetic variants underlying breast disease, which are available for breast cancer risk prediction. Our work aligns with prior studies showing that adding seven SNPs to the Gail model improved discriminatory accuracy modestly (2, 10). More than 150 SNPs have been identified to date (4, 35, 36), which could potentially facilitate personalized cancer diagnosis. However, it is unlikely that germline genetic analysis alone will be sufficient to fulfill the promise of precision medicine for breast cancer risk prediction (37, 38) since cancer risk is mostly a result of external, environmental risk factors rather than intrinsic genetic mutations (39). We found in our population that while genetic testing overall improves breast cancer risk prediction, the presence of a specific high risk allele was not always significantly associated with an increased risk of breast cancer. Theoretically, the ability of SNPs to predict breast cancer risk has an upper bound (35, 40), which argues for greater use of intermediate phenotype data such as mammography abnormality features in addition to genetic variants to predict breast cancer risk. Given that the development of breast cancer is multi-factorial, this encourages the use of multiple modes of patient information to improve risk prediction.
Our current study differs from prior studies by adding mammography abnormality features to demographic risk factors and genetic variants in risk prediction for the biopsy population (24). Our study improves risk prediction over models that are based on demographic factors and SNPs (41). Mammography abnormality features offer richer intermediate phenotype data directly relevant to breast cancer diagnosis. However, mammography alone does not best determine the risk of breast cancer. Some features are correlated; while architectural distortion and microcalcification distribution were associated with increased risk of breast cancer, microcalcification shape was not found to be a significant predictor in the combined model. This is likely due to a strong correlation between microcalcification shape and distribution (19, 42). Our study shows that combining genetic variants and mammography abnormality features is useful in breast cancer risk prediction.
We recognize that there are several limitations to our study, which provide opportunities for future investigation. Our current study suffers from small sample size due to the inherent difficulty of collecting a rich multi-modality dataset. To increase data size, we used mammograms over a period in which breast cancer care has evolved substantially. Modern diagnostics (tomosynthesis and MRI) and novel treatments (neoadjuvant chemotherapy, immunotherapy, or nanomedicine) impact risk prediction protocols in ways that we are not able to investigate in this project. The development of large biobanks linked with EMRs at multiple universities is promising and will allow for future studies with a larger number of subjects cared for more recently. Further, more sophisticated algorithms may improve prediction performance. For example, group fusion penalties with lasso logistic regression models can incorporate dependency structures inherent within the data to improve breast cancer risk prediction using mammographic features and genetic variants (27). In addition, we know that parsing mammography features from free text reports may introduce noise. Mammography features extracted from structured reporting systems could be used instead (43); however, the parser used on mammography reports was designed with consistency checks and outperformed manual classification (44). Further, there are concerns about inter-reader variability in mammography reports. However, we use all mammographic features according to an established and mandated lexicon (BI-RADS), and improved inter-reader agreement has been found with mammography assessments using BI-RADS (45). In addition, some key predictors, such as breast density, had a high percentage of missing observations and were therefore excluded.
Finally, we used AUC estimates including 95% confidence intervals for assessing the discrimination ability of the four fitted logistic regression models, but the original AUC estimates are inherently biased since predictions are based on the same dataset from which the model was estimated. This limitation, coupled with the fact that the dataset used to fit the models is small, leads to predictive ability estimates that are likely optimistic, i.e., the models will fit the current dataset somewhat better than they will fit new data. The optimism-corrected AUC calculation mitigates some of these limitations by providing a nearly unbiased estimate of internal validity. Statistical testing on the optimism-corrected estimates was not included in the current study because no standardized methodology exists; this should be explored in future work.
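To make the size of the optimism concrete, the short calculation below subtracts the optimism-corrected AUC from the apparent AUC for each model in Table 4. It uses only the published table values; the variable names are illustrative.

```python
# (apparent AUC, optimism-corrected AUC) from Table 4.
table4 = {
    "Demographics": (0.580, 0.549),
    "Demographics + SNPs": (0.668, 0.612),
    "Demographics + Mammography": (0.702, 0.668),
    "Demographics + Mammography + SNPs": (0.753, 0.698),
}

# Estimated optimism is the gap between apparent and corrected performance.
optimism = {
    name: round(apparent - corrected, 3)
    for name, (apparent, corrected) in table4.items()
}
for name, gap in optimism.items():
    print(f"{name}: estimated optimism = {gap:.3f}")

# Model ranking by corrected AUC, lowest to highest.
ranking = sorted(table4, key=lambda name: table4[name][1])
```

The estimated optimism is 0.031 and 0.034 for the two models without SNPs and 0.056 and 0.055 for the two models with SNPs. One plausible reading, offered here as an assumption rather than a finding of the study, is that the SNP panel adds many parameters relative to the sample size. The ordering of the four models is the same before and after correction.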
Conclusions
These results demonstrate improved breast cancer risk prediction by combining demographic risk factors, germline genetic variants, and mammography features. The value of mammography abnormality features provides evidence that this intermediate phenotype not only combines genetic and environmental risk factors but incorporates information to improve disease prediction.
Acknowledgments
The authors acknowledge the support of the Wisconsin Genomics Initiative from the state of Wisconsin and support from the National Institutes of Health (grants: R01CA127379, R01CA127379-03S1, R01GM097618, R01LM011028, R01ES017400). The authors acknowledge support from the eMERGE Network (U01HG004608), the University of Wisconsin Institute for Clinical and Translational Research (UL1TR000427), the University of Wisconsin Carbone Comprehensive Cancer Center Support grant (P30CA014520), the Center for Predictive Computational Phenotyping (CPCP), supported by the National Institutes of Health Big Data to Knowledge (BD2K) Initiative (U54AI117924), the University of Wisconsin Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation, and the University of Wisconsin Departments of Radiology and Medical Physics.
References
- 1.Dai J, Hu Z, Jiang Y, Shen H, Dong J, Ma H, et al. Breast cancer risk assessment with five independent genetic variants and two risk factors in Chinese women. Breast Cancer Res. 2012;14(1)::R17. doi: 10.1186/bcr3101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Gail MH. Value of adding single-nucleotide polymorphism genotypes to a breast cancer risk model. J Natl Cancer Inst. 2009;101(13):959–63. doi: 10.1093/jnci/djp130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Liu J, Page D, Nassif H, Shavlik J, Peissig P, McCarty C, et al. Washington, DC: 2013. Genetic variants improve breast cancer risk prediction on mammograms. American Medical Informatics Association Symposium (AMIA) [PMC free article] [PubMed] [Google Scholar]
- 4.Liu J, Page D, Peissig P, McCarty C, Onitilo AA, Trentham-Dietz A, et al. San Francisco, CA: 2014. New genetic variants improve personalized breast cancer diagnosis. AMIA Summit on Translational Bioinformatics (AMIA-TBI) [PMC free article] [PubMed] [Google Scholar]
- 5.Meads C, Ahmed I, Riley R. A systematic review of breast cancer incidence risk prediction models with meta-analysis of their performance. Breast Cancer Res Treat. 2012;132(2):365–77. doi: 10.1007/s10549-011-1818-2. [DOI] [PubMed] [Google Scholar]
- 6.Quante A, Whittemore A, Shriver T, Strauch K, Terry M. Breast cancer risk assessment across the risk continuum: genetic and nongenetic risk factors contributing to differential model performance. Breast Cancer Res. 2012;14(6):R144. doi: 10.1186/bcr3352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Wacholder S, Hartge P, Prentice R, Garcia-Closas M, Feigelson HS, Diver WR, et al. Performance of common genetic variants in breast-cancer risk models. N Engl J Med. 2010;362(11):986–93. doi: 10.1056/NEJMoa0907727. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Gail M, Brinton L, Byar D, Corle D, Green S, Schairer C, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst. 1989;81(24):1879–86. doi: 10.1093/jnci/81.24.1879. [DOI] [PubMed] [Google Scholar]
- 9.Anothaisintawee T, Teerawattananon Y, Wiratkapun C, Kasamesup V, Thakkinstian A. Risk prediction models of breast cancer: a systematic review of model performances. Breast Cancer Res Treat. 2012;133(1):1. doi: 10.1007/s10549-011-1853-z. [DOI] [PubMed] [Google Scholar]
- 10.Gail MH. Discriminatory accuracy from single-nucleotide polymorphisms in models to predict breast cancer risk. J Natl Cancer Inst. 2008;100(14):1037–41. doi: 10.1093/jnci/djn180. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Devilee P, Rookus MA. A tiny step closer to personalized risk prediction for breast cancer. N Engl J Med. 2010;362(11):1043–5. doi: 10.1056/NEJMe0912474. [DOI] [PubMed] [Google Scholar]
- 12.Pharoah PD, Antoniou AC, Easton DF, Ponder BA. Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med. 2008;358(26):2796–803. doi: 10.1056/NEJMsa0708739. [DOI] [PubMed] [Google Scholar]
- 13.Kraft P, Hunter DJ. Genetic risk prediction--are we there yet? N Engl J Med. 2009;360(17):1701–3. doi: 10.1056/NEJMp0810107. [DOI] [PubMed] [Google Scholar]
- 14.American College of Radiology. 4th ed. Reston VA: American College of Radiology; 2003. Breast Imaging Reporting And Data System (BI-RADS®) [Google Scholar]
- 15.Baker JA, Kornguth PJ, Lo JY, Williford ME, Floyd CE., Jr. Breast cancer: prediction with artificial neural network based on BI-RADS standardized lexicon. Radiology. 1995;196(3):817–22. doi: 10.1148/radiology.196.3.7644649. [DOI] [PubMed] [Google Scholar]
- 16.Burnside ES, Davis J, Chhatwal J, Alagoz O, Lindstrom MJ, Geller BM, et al. Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings. Radiology. 2009;251(3):663–72. doi: 10.1148/radiol.2513081346. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Burnside ES, Rubin DL, Fine JP, Shachter RD, Sisney GA, Leung WK. Bayesian network to predict breast cancer risk of mammographic microcalcifications and reduce number of benign biopsy results: initial experience. Radiology. 2006;240(3):666–73. doi: 10.1148/radiol.2403051096. [DOI] [PubMed] [Google Scholar]
- 18.Chhatwal J, Alagoz O, Lindstrom MJ, Kahn CE, Jr.,, Shaffer KA, Burnside ES. A logistic regression model based on the national mammography database format to aid breast cancer diagnosis. AJR Am J Roentgenol. 2009;192(4):1117–27. doi: 10.2214/AJR.07.3345. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Liberman L, Abramson AF, Squires FB, Glassman JR, Morris EA, Dershaw DD. The breast imaging reporting and data system: positive predictive value of mammographic features and final assessment categories. AJR Am J Roentgenol. 1998;171(1):35–40. doi: 10.2214/ajr.171.1.9648759.
- 20.Wu Y, Alagoz O, Ayvaci MU, Munoz Del Rio A, Vanness DJ, Woods R, et al. A comprehensive methodology for determining the most informative mammographic features. Journal of digital imaging. 2013;26(5):941–7. doi: 10.1007/s10278-013-9588-5.
- 21.Darabi H, Czene K, Zhao W, Liu J, Hall P, Humphreys K. Breast cancer risk prediction and individualised screening based on common genetic variation and breast density measurement. Breast Cancer Res. 2012;14(1):R25. doi: 10.1186/bcr3110.
- 22.Lee CP, Choi H, Soo KC, Tan MH, Chay WY, Chia KS, et al. Mammographic breast density and common genetic variants in breast cancer risk prediction. PloS one. 2015;10(9):e0136650. doi: 10.1371/journal.pone.0136650.
- 23.Tamimi RM, Cox D, Kraft P, Colditz GA, Hankinson SE, Hunter DJ. Breast cancer susceptibility loci and mammographic density. Breast Cancer Res. 2008;10(4):R66. doi: 10.1186/bcr2127.
- 24.Armstrong K, Handorf EA, Chen J, Bristol Demeter MN. Breast cancer risk prediction and mammography biopsy decisions: a model-based study. American journal of preventive medicine. 2013;44(1):15–22. doi: 10.1016/j.amepre.2012.10.002.
- 25.Burnside ES, Liu J, Wu Y, Onitilo AA, McCarty CA, Page CD, et al. Comparing Mammography Abnormality Features to Genetic Variants in the Prediction of Breast Cancer in Women Recommended for Breast Biopsy. Acad Radiol. 2016;23(1):62–9. doi: 10.1016/j.acra.2015.09.007.
- 26.Feld S, Fan J, Yuan M, Wu Y, Woo K, Alexandridis R, et al. Utility of Genetic Testing in Addition to Mammography for Determining Risk of Breast Cancer Depends on Patient Age. American Medical Informatics Association Informatics Conference Proceedings. 2018.
- 27.Fan J, Wu Y, Yuan M, Page D, Liu J, Ong IM, et al. Structure-Leveraged Methods in Breast Cancer Risk Prediction. J Mach Learn Res. 2016;17.
- 28.Burnside ES, Liu J, Wu Y, Onitilo AA, McCarty CA, Page CD, et al. Comparing Mammography Abnormality Features to Genetic Variants in the Prediction of Breast Cancer in Women Recommended for Breast Biopsy. Acad Radiol. 2015.
- 29.McCarty CA, Wilke RA, Giampietro PF, Wesbrook SD, Caldwell MD. Marshfield Clinic Personalized Medicine Research Project (PMRP): design, methods and recruitment for a large population-based biobank. Personalized Med. 2005;2(1):49–79. doi: 10.1517/17410541.2.1.49.
- 30.Kobayashi S, Sugiura H, Ando Y, Shiraki N, Yanagi T, Yamashita H, et al. Reproductive history and breast cancer risk. Breast cancer (Tokyo, Japan). 2012;19(4):302–8. doi: 10.1007/s12282-012-0384-8.
- 31.Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger DG, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447(7148):1087–93. doi: 10.1038/nature05887.
- 32.Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, et al. A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet. 2007;39(7):870–4. doi: 10.1038/ng2075.
- 33.Liberman L, Menell JH. Breast imaging reporting and data system (BI-RADS). Radiol Clin North Am. 2002;40(3):409–30, v. doi: 10.1016/s0033-8389(01)00017-3.
- 34.Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–87. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
- 35.Couch FJ, Nathanson KL, Offit K. Two decades after BRCA: setting paradigms in personalized cancer care and prevention. Science. 2014;343(6178):1466. doi: 10.1126/science.1251827.
- 36.Sawyer E, Roylance R, Petridis C, Brook MN, Nowinski S, Papouli E, et al. Genetic predisposition to in situ and invasive lobular carcinoma of the breast. PLoS genetics. 2014;10(4):e1004285. doi: 10.1371/journal.pgen.1004285.
- 37.Novelli G, Ciccacci C, Borgiani P, Papaluca Amati M, Abadie E. Genetic tests and genomic biomarkers: regulation, qualification and validation. Clinical cases in mineral and bone metabolism : the official journal of the Italian Society of Osteoporosis, Mineral Metabolism, and Skeletal Diseases. 2008;5(2):149–54.
- 38.Wang X, Zhang L, Chen Z, Ma Y, Zhao Y, Rewuti A, et al. Association between 5p12 genomic markers and breast cancer susceptibility: evidence from 19 case-control studies. PloS one. 2013;8(9):e73611. doi: 10.1371/journal.pone.0073611.
- 39.Wu S, Powers S, Zhu W, Hannun YA. Substantial contribution of extrinsic risk factors to cancer development. Nature. 2015.
- 40.Maxwell K, Nathanson K. Common breast cancer risk variants in the post-COGS era: a comprehensive review. Breast Cancer Res. 2013;15(6):212. doi: 10.1186/bcr3591.
- 41.McCarthy AM, Keller B, Kontos D, Boghossian L, McGuire E, Bristol M, et al. The use of the Gail model, body mass index and SNPs to predict breast cancer among women with abnormal (BI-RADS 4) mammograms. Breast Cancer Res. 2015;17(1). doi: 10.1186/s13058-014-0509-4.
- 42.Bent CK, Bassett LW, D’Orsi CJ, Sayre JW. The positive predictive value of BI-RADS microcalcification descriptors and final assessment categories. AJR Am J Roentgenol. 2010;194(5):1378–83. doi: 10.2214/AJR.09.3423.
- 43.Burnside ES, Sickles E, Bassett L, Rubin D, Lee C, Ikeda D, et al. The ACR BI-RADS experience: learning from history. J American College of Radiology. 2009;6(12):851–60. doi: 10.1016/j.jacr.2009.07.023.
- 44.Nassif H, Woods R, Burnside E, Ayvaci M, Shavlik J, Page D. Information extraction for clinical data mining: a mammography case study. IEEE International Conference on Data Mining Workshops; Miami, FL. 2009.
- 45.Redondo A, Comas M, Macià F, Ferrer F, Murta-Nascimento C, Maristany MT, et al. Inter- and intraradiologist variability in the BI-RADS assessment and breast density categories for screening mammograms. Br J Radiol. 2012;85(1019):1465–70. doi: 10.1259/bjr/21256379.

