JCO Precision Oncology. 2021 Jul 21;5:PO.20.00541. doi: 10.1200/PO.20.00541

Polygenic Breast Cancer Risk for Women Veterans in the Million Veteran Program

Jessica Minnier 1,2,3, Nallakkandi Rajeevan 4,5, Lina Gao 1,2,3, Byung Park 1,2,3, Saiju Pyarajan 6,7, Paul Spellman 2,3, Sally G Haskell 8,9, Cynthia A Brandt 5,8, Shiuh-Wen Luoh 2,3, VA Million Veteran Program
PMCID: PMC8345920  PMID: 34381935

PURPOSE

Accurate breast cancer (BC) risk assessment allows personalized screening and prevention. Prospective validation of prediction models is required before clinical application. Here, we evaluate clinical- and genetic-based BC prediction models in a prospective cohort of women from the Million Veteran Program.

MATERIALS AND METHODS

Clinical BC risk prediction models were validated alone and in combination with a polygenic risk score of 313 single-nucleotide polymorphisms (PRS313) in genetic females without prior BC diagnosis (n = 35,130, mean age 49 years), 30% of whom had non-Hispanic African ancestry (AA). The clinical risk models tested were Breast and Prostate Cancer Cohort Consortium, literature review, and Breast Cancer Risk Assessment Tool, each implemented with and without PRS313. Prediction accuracy and association with incident breast cancer were evaluated with area under the receiver operating characteristic curve (AUC), hazard ratios, and proportion with high absolute lifetime risk.

RESULTS

Three hundred thirty-eight participants developed incident breast cancers with a median follow-up of 3.9 years (2.5 cases/1,000 person-years), with 196 incident cases in women of European ancestry and 112 incident cases in AA women. Individualized Coherent Absolute Risk Estimator-literature review in combination with PRS313 had an AUC of 0.708 (95% CI, 0.659 to 0.758) in women with European or non-African ancestries and 0.625 (0.539 to 0.711) in AA women. Breast Cancer Risk Assessment Tool with PRS313 had an AUC of 0.695 (0.62 to 0.729) in European or non-AA and 0.675 (0.626 to 0.723) in AA women. Incorporation of PRS313 with clinical models improved prediction in European but not in AA women. Models estimated up to 9% of European and 18% of AA women with absolute lifetime risk > 20%.

CONCLUSION

Clinical and genetic BC risk models predict incident BC in a large prospective multiracial cohort; however, more work is needed to improve genetic risk estimation in AA women.

INTRODUCTION

Breast cancer (BC) is a leading cancer for American women. Although BC screening with mammography has resulted in a decrease in BC mortality,1,2 the optimal BC screening strategy regarding age of initiation, screening intervals, and imaging modalities remains elusive.3 Understanding individuals' risks for developing BC may allow us to adopt a more appropriate BC screening and prevention strategy.

CONTEXT

  • Key Objective

  • The key objective of this study is to assess the breast cancer (BC) risk prediction instruments in a large multiethnic prospective cohort of Women Veterans in the Million Veteran Program (MVP). Instruments evaluated are composed of polygenic risk scores (PRSs), personal health, demographics, family history, and environmental exposure.

  • Knowledge Generated

  • Clinical parameters can predict the risk of BC reasonably well in multiple ethnic groups. PRSs that are primarily based on genomic analysis in European populations can moderately predict the BC risk in women of European or other non-African ancestry but not in women of African ancestry (AA). Addition of PRSs to clinical parameters provides the best prediction in European and non-African women.

  • Relevance

  • Accurate BC risk prediction is essential for personalized BC screening and risk reduction intervention. Additional work, however, is needed for women of AA to optimize risk prediction based on PRSs.

Women with a history of BC in a first-degree relative are at approximately two-fold higher risk than women without a family history.4 Rare high-risk mutations, particularly in the BRCA1 and BRCA2 genes, explain < 20% of the two-fold familial relative risk and account for a small proportion of BC cases in the general population.5 Low-frequency variants conferring intermediate risk, such as those in CHEK2, ATM, and PALB2, explain 2% to 5% of the familial relative risk. Mutations in BRCA1, BRCA2, CHEK2, ATM, and PALB2 are actionable mutations for BC risk assessment per NCCN guidelines. These are rare BC risk alleles but with moderate to high penetrance.

Genome-wide association studies (GWAS) have led to the discovery of multiple common, low-risk variants (single-nucleotide polymorphisms [SNPs]) associated with BC risk.6 Risks conferred by SNPs are not sufficiently large to be useful in risk prediction individually. However, the combined effect of multiple SNPs as in a polygenic risk score (PRS) may achieve a degree of risk discrimination that is useful for population-based programs of BC prevention and early detection. For PRS313 (313 SNPs; see the Data Supplement for SNP IDs and log-odds-ratio coefficients), the odds ratio for overall disease per 1 standard deviation (SD) in 10 prospective studies was 1.61 (95% CI, 1.57 to 1.65) with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.630 (95% CI, 0.628 to 0.651).6

A study of the potential for combining PRSs with other known risk factors found no evidence that the per-standard deviation PRS313 odds ratio differed across strata defined by individual risk factors.7 Goodness-of-fit tests did not reject the assumption of a multiplicative model between PRS313 and each risk factor. Variation in projected absolute lifetime risk of BC associated with classical risk factors was greater for women with higher genetic risk (PRS313 and family history), and on average 17.5% higher in the highest versus lowest deciles of genetic risk.7

Recently, more comprehensive clinical models have been developed for predicting BC risk. The Individualized Coherent Absolute Risk Estimator (iCARE) tool is implemented as an R package in Bioconductor and can compute absolute risk estimators by synthesizing multiple data sources containing information on relative risks, the distribution of risk factors in the population, and age-specific incidence rates for the disease of interest and rates of competing risks.8

Using the iCARE tool, we have compared the performance of two recently developed models, one based on the Breast and Prostate Cancer Cohort Consortium analysis (iCARE-BPC3) and another based on a literature review (iCARE-Lit), with the best known established model (Breast Cancer Risk Assessment Tool [BCRAT])9,10 alone and in combination with the best performing PRS, PRS313. The iCARE-Lit and -BPC3 models incorporate multiple clinical and demographic risk factors of age at menarche, age at menopause, parity, age at first live birth, height, alcohol intake (quantity per week), BC family history, smoking status (never, current, and former), usage of oral contraceptives, use of hormone-replacement therapy, and type of hormonal-replacement therapy.

Our study cohort is part of the Department of Veterans Affairs (VA) Million Veteran Program (MVP) and is one of the largest biobanks in the world.11 Important attributes of MVP include large sample size (> 830,000 participants recruited, April 2020), national coverage, multiethnic representation, access to biospecimens from which multiomics measurements have been conducted, thousands of biomedical phenotypes derived from our baseline and lifestyle surveys and electronic health records (EHRs), and access to medical data before and after specified clinical events. About 9% of MVP registrants are women, and 41% have non-European ancestry.

MATERIALS AND METHODS

Study Cohort

The cohort used for the BC risk prediction in this study consisted of 35,130 female participants from the MVP program. This cohort has been described elsewhere.11 Briefly, US veterans were recruited from 63 participating Department of Veterans Affairs (VA) medical facilities starting in 2011. Participants provided consent to access their EHRs for research, provided blood samples for genotyping, and were given surveys to obtain demographic and lifestyle information. The MVP protocol was approved by the VA Central Institutional Review Board in accordance with principles outlined in the Declaration of Helsinki.

This study cohort was drawn from the 455,789 genotyped MVP participants by first determining their sex using the genotype data. The 39,077 genetic female subjects were further filtered to give 37,411 participants who did not have a previous diagnosis of BC, and the final cohort of 35,130 female subjects was selected by restricting age at enrollment to between 20 and 80 years. Summaries of the analytical sample size are shown in Figure 1.

FIG 1.

Cohort and model exclusions. Number of MVP women included in model validation, with exclusions specified for each model, and number of incident BC events. BC, breast cancer; BCRA, Breast Cancer Risk Assessment; BPC3, Breast and Prostate Cancer Cohort Consortium analysis; Lit, literature review; MVP, Million Veteran Program; PRS, polygenic risk score.

A total of 338 incident BC cases were diagnosed based on either oncology raw or compatible International Classification of Diseases and Current Procedural Terminology codes (Data Supplement). The phenotyping algorithm and chart review confirmation and the rest of the Materials and Methods are described in the Appendix.

RESULTS

The demographic and clinical characteristics of the full analysis cohort are described in Table 1. The mean age of MVP women studied was 49 years at entry and 53 years at study exit, with 59% European ancestry, 31.7% African ancestry (AA), 8.3% Hispanic, and 0.9% Asian. The median follow-up time was 3.9 years, with a range of 0.03-7.7 years. The mean age at first diagnosis was 59 years. The mean ages for the BCRAT risk variables were 12.8 years at menarche, 23.7 years at first birth, and 44.2 years at menopause. Mean height was 1.6 m and mean body mass index was 30.2 kg/m2.

TABLE 1.

Clinical and Demographic Characteristics of Cohort


The rate of BC incidence was 2.47 per 1,000 person-years, with 338 events total, of which 196 incident cases were in women of European ancestry and 112 were in women of AA. Women who had incident BC over follow-up were older and had higher parity, slightly higher body mass index, and higher rates of oral contraceptive and hormone-replacement therapy use.
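As a rough illustration of how a crude incidence rate of this kind is formed, the sketch below divides events by person-years and scales to 1,000 person-years. Total person-years is not stated in this section, so `implied_person_years` is back-calculated from the reported rate and should be read as an assumption, not a study result; the function name is hypothetical.

```python
def incidence_rate_per_1000(events: int, person_years: float) -> float:
    """Crude incidence rate expressed per 1,000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


# Values reported in the text.
events = 338
reported_rate = 2.47  # per 1,000 person-years

# Assumption: person-years implied by the reported rate (not reported directly).
implied_person_years = 1000.0 * events / reported_rate

print(round(implied_person_years))
print(round(incidence_rate_per_1000(events, implied_person_years), 2))
```

The implied total is in line with what one would expect from roughly 35,000 women followed for a median of about 3.9 years, which is only a plausibility check and not an exact reconstruction.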

The accuracy of the implemented clinical and genetic prediction instruments was estimated with the AUC for 5-year absolute risk estimates, presented in Table 2. Hazard ratios (HRs) for a 1-SD or 1% increase in 5-year absolute risk estimates are presented in the Data Supplement.

TABLE 2.

AUC Estimates With 95% CIs for Each Model Within HARE Ancestry Cohorts, as Well as Sample Size, Number of Incident Cases, and Percent of Cohort With Estimated Lifetime Absolute Risk > 20%


Comparing clinical models alone without genetic risk, we observed similar accuracy across ancestry groups, with BCRAT performing similarly to iCARE-Lit models in women with non-African ancestries and slightly better than iCARE-Lit in AA women. For BCRAT in the cohort of women with European, Hispanic, Asian, or Other (non-African) ancestry, the AUC was 0.668 (95% CI, 0.635 to 0.701), and in AA women, the AUC was 0.663 (0.614 to 0.713). The AUC for iCARE-Lit was 0.663 (0.615 to 0.711) in the non-African cohort and 0.633 (0.563 to 0.704) in the AA cohort. The AUC for iCARE-BPC3 was 0.488 (0.431 to 0.545) in the non-African cohort and 0.518 (0.43 to 0.607) in the AA cohort. When restricting to age > 50 years, the AUC was 0.522 (0.460 to 0.583) in the non-African cohort and 0.513 (0.424 to 0.602) in the AA cohort, a slight improvement in the non-African point estimate only. The reason for the lower-than-expected performance of BPC3 is not known but may lie in part in differences in the specification of risk factors (ie, finer v coarser categories of continuous risk factors).

Prediction accuracy of the PRS313 alone without clinical or demographic predictors was also modest, with the best AUC of 0.622 (0.580 to 0.664) in women of European descent and lowest AUC of 0.579 (0.522 to 0.636) in AA women. This is exhibited also in survival curves of quartiles of PRS313 where risk stratification is more distinct across quartiles for women of European descent than for AA women, although the highest quartile of PRS313 clearly captures women with lowest survival probabilities (Data Supplement). The HR for 1-SD (approximately 0.58) increase of PRS313 in the entire cohort was 1.31 (95% CI, 1.18 to 1.46) and the HR for a 1% increase in 5-year absolute risk estimate was 1.45 (1.34 to 1.58) in women of non-African descent and 1.37 (1.22 to 1.54) in AA women (Data Supplement).

Combining PRS313 with clinical risk scores improved prediction accuracy, but mainly in women of European descent. The best performing model in women of European descent was iCARE-Lit combined with PRS313, with an AUC of 0.708 (0.655 to 0.761) and an HR for a 1% increase in 5-year absolute risk of 1.35 (1.23 to 1.49). The best performing model in AA women was BCRAT combined with PRS313, with an AUC of 0.675 (0.626 to 0.723) and an HR for a 1% increase in 5-year absolute risk of 1.42 (1.25 to 1.61), whereas iCARE-Lit plus PRS313 had moderate performance, with an AUC of 0.625 (0.539 to 0.711). BCRAT plus PRS313 also performed well in women of European descent, with an AUC of 0.697 (0.661 to 0.734). Including women of Hispanic, Asian, and other ancestries with European women did not appreciably change the AUCs or HRs for any model, although the additional sample size was small. The iCARE-BPC3 model combined with PRS313 also had reasonable performance in women with European ancestry, with an AUC of 0.637 (0.570 to 0.688) in women > 50 years old, but had lower prediction accuracy in AA women (AUC 0.526). Comparing combined clinical and genetic risk scores with clinical scores alone showed improvement in European and non-African descent cohorts, with AUC estimates increasing 0.05-0.12, but rarely in AA women, in whom the maximum incremental AUC was 0.013, for BCRAT plus PRS313 versus BCRAT alone (Table 2). This is expected because PRS313 was developed primarily in European populations. The increases in AUC for European women were not statistically significant, however, and a cohort with a larger number of cases is required to formally test the incremental AUCs between these models. Survival curves of quartiles of BCRAT plus PRS313 exhibit clear risk stratification across ancestry groups, and the iCARE plus PRS313 models show clear risk stratification for European and non-African ancestries, although the stratification is less clear in AA women (Fig 2). This may partially be attributed to the fact that the iCARE models were necessarily validated in a smaller cohort with fewer cases than the BCRAT models because of missing survey data.

FIG 2.

Kaplan-Meier curves by quartile of 5-year absolute risk for (A and B) BCRAT plus PRS313 and (C and D) iCARE-Lit plus PRS313 models. Quartiles calculated within each HARE cohort. Panels A and C, EUR/HIS/ASN/Missing; Panels B and D, AFR. P values from log-rank test. AFR, African American of non-Hispanic African HARE ancestry; ASN, Asian; BCRA, Breast Cancer Risk Assessment; BCRAT, Breast Cancer Risk Assessment Tool; EUR, non-Hispanic women of European HARE ancestry; HARE, harmonized ancestry and race/ethnicity; HIS, Hispanic; iCARE-Lit, Individualized Coherent Absolute Risk Estimator-literature review; PRS, polygenic risk score.

Model calibration showing expected versus observed ratios (E/O) over 10 predicted risk deciles and distribution plots of 5-year absolute risk are shown in the Data Supplement. iCARE-BPC3 and BCRAT models combined with PRS313 showed acceptable calibration with E/O ratios close to one and expected versus observed risk plots for deciles of risk aligning well over the diagonal. iCARE-Lit plus PRS313 had evidence of some overestimation of absolute 5-year risk, with an E/O ratio of 1.36 (1.12 to 1.65) in non-African ancestries, and similarly for AA women (E/O 1.32, 0.95 to 1.8). This follows a similar trend seen in the calibration of these iCARE models in a large UK cohort.9
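To make the expected/observed (E/O) calibration ratio concrete, the following minimal sketch computes E/O with a Wald-type interval on the log scale. It assumes the observed case count is approximately Poisson, so that the standard error of log(E/O) is about 1/sqrt(O); this is a common textbook approximation and is not necessarily the variance formula used inside the iCARE package. The function name and the counts are hypothetical and are not study data.

```python
import math
from typing import Tuple


def expected_observed_ratio(expected: float, observed: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (E/O, lower, upper) using a log-scale Wald interval.

    Assumption: observed cases are roughly Poisson, so SE[log(E/O)] ~ 1/sqrt(O).
    """
    if expected <= 0 or observed <= 0:
        raise ValueError("expected and observed must be positive")
    ratio = expected / observed
    se_log = 1.0 / math.sqrt(observed)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical counts, chosen only for illustration.
ratio, lower, upper = expected_observed_ratio(expected=120.0, observed=100)
print(f"E/O = {ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

An E/O ratio above one with an interval that excludes one, as reported for iCARE-Lit plus PRS313 in non-African ancestries, indicates that the model predicts more cases than were observed.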

The distribution of 5-year absolute risk from all models is shifted higher in women with incident cases (Fig 3), and similarly for lifetime risk (Data Supplement). Notably, BCRAT plus PRS313 absolute risk estimates have lower variance and lower mode than iCARE plus PRS313 models. The right tail of iCARE plus PRS313 absolute risk distributions is longer with the distribution more right-skewed, thus estimating a higher proportion of women with higher absolute risk values. We defined high risk as absolute lifetime risk estimate > 20%. The proportion of women determined to be high risk varied from 0.25% (BCRAT in AA women) to 18.2% (iCARE-BPC3 plus PRS313 in African women; Table 2). In general, BCRAT models estimated the lowest proportions of high risk women even when combined with genetic risk, while the iCARE models combined with PRS313 estimated the highest proportion of high risk women, with approximately 8%-9% in European or non-AA cohorts, and 15%-18% in AA cohort.

FIG 3.

Distribution of 5-year absolute risk estimates for each model, stratified by HARE cohort (AFR, African American women with HARE non-Hispanic African ancestry; EUR/HIS/ASN/Missing, all other HARE ancestry groups) and BC incidence (BC, with incident breast cancer diagnosis; no BC, no diagnosis of breast cancer in follow-up). ASN, Asian; BCRA, Breast Cancer Risk Assessment; BPC3, Breast and Prostate Cancer Cohort Consortium analysis; EUR, non-Hispanic women of European HARE ancestry; HARE, harmonized ancestry and race/ethnicity; HIS, Hispanic; iCARE-Lit, Individualized Coherent Absolute Risk Estimator-literature review; PRS, polygenic risk score.

DISCUSSION

This work replicates and demonstrates the utility of BC risk prediction among a large cohort of women prospectively. We have evaluated the performance of several BC risk prediction instruments that are based on PRSs, personal health, lifestyle, demographics, family history, and environmental exposure. The BCRAT models are well established but include minimal clinical predictors: age at menarche, age at first live birth, number of previous biopsies, and number of first-degree relatives with BC. In the MVP cohort, all the clinical information was available from the baseline survey and lifestyle survey that subjects filled out at the time of MVP registration. Mammographic density, diagnosis of atypical ductal hyperplasia, or benign breast biopsy information was not available. The latter two items, however, were specified in the BCRAT models but were not part of the iCARE-Lit or iCARE-BPC3 models.

We showed that iCARE-Lit in combination with PRS313 could predict BC development among European White Women Veterans with reasonable performance (AUC = 0.708), whereas iCARE-BPC3 plus PRS313 gave moderate prediction (AUC = 0.637) in European women older than 50 years (Table 2). iCARE-BPC3 and BCRAT models were well calibrated in this cohort, with some evidence of overestimation of risk for iCARE-Lit plus PRS313 models. Of note, 8% of the European Whites were estimated to have a lifetime risk of BC of > 20% (Table 2). BCRAT plus PRS313 predicted BC risk for both European Whites (AUC = 0.697) and African Americans (AUC = 0.675). BCRAT plus PRS313 predicted that about 3% of African Americans had a lifetime risk of BC of > 20%. Inclusion of the PRS improved the prediction performance of BCRAT for European Whites but not for African Americans. This is not surprising, as PRS313 was derived from largely European White subjects.6

Our study is among the first to examine BC risk prediction instruments in a large AA prospective cohort. Few studies have evaluated the performance of existing BC risk prediction models among women of AA. To date, SNPs from GWAS have been identified almost exclusively in populations of European ancestry and often show different association patterns among African populations.12-14 Even among women of AA, variants identified in one population are often not applicable to another population.15 These conflicting results could have several causes, including differences in allele frequencies and linkage disequilibrium blocks among different ethnicities,16 and differences in population characteristics within one ethnicity. In replication studies of genetic variants, a change in direction of risk association is a common observation. This flip-flop phenomenon, which affects the performance of risk prediction models, was observed in 30%-40% of variants across studies.17 Using the 34 variants with consistent directionality among three prior studies of AA populations, a PRS was constructed that showed an AUC of 0.531 (95% CI, 0.512 to 0.550) among a different AA population. This is similar to a PRS using 93 SNP variants and ORs from European ancestry populations (AUC, 0.525; 95% CI, 0.506 to 0.544).17-19 In our study, PRS313 has an AUC of 0.622 (0.58 to 0.664) in European ancestry women and 0.579 (0.522 to 0.636) in AA women.

Because of sample size, we have a very limited number of Asian or Hispanic participants, so we cannot estimate risk distributions in these cohorts independently. We have grouped Asian, Hispanic, and other harmonized ancestry and race/ethnicity (HARE) ancestries together with European non-Hispanic Whites. Such grouping is justified, as a recent report found that PRSs primarily consisting of SNPs identified in European populations were predictive of BC risk in self-reported Latinas20: a 180-SNP PRS had an adjusted odds ratio per SD increments of 1.58 (95% CI, 1.52 to 1.64) and an AUC of 0.63 (95% CI, 0.62 to 0.64). These results are comparable to those of European studies, which tested PRSs including 77-3,820 SNPs, and reported odds ratios per SD between 1.46 and 1.66 and AUCs between 0.60 and 0.64.6,19 Similarly, approximately one half of GWAS-identified BC risk variants can be directly replicated in East Asian women.21

The strengths of this study include the large sample size and multiethnic nature of the prospective cohort. Our study showed that clinical models, such as iCARE-Lit and BCRAT, performed comparably for women of different ethnic backgrounds but that the PRS performed poorly for AA women. The higher risk prediction accuracy of iCARE-Lit plus PRS313 versus iCARE-BPC3 plus PRS313 in European women must be balanced against some miscalibration, and additional follow-up time with more incident cases is needed to further evaluate these differences. Limitations include a relatively short follow-up and the fact that some patients with ductal carcinoma in situ (DCIS) were captured as invasive cancer cases. A large study provided strong evidence of shared genetic susceptibility for invasive ductal carcinoma and DCIS, as most (67%) of the known BC predisposition loci showed an association with DCIS in the same direction as previously reported for invasive BC.22,23 Moreover, DCIS represented a small percentage of the BCs captured and therefore did not affect the main conclusion of the study.

In conclusion, our study replicates BC risk prediction with clinical models and PRSs in a large prospective cohort of women from the MVP. Much work still needs to be done to improve risk prediction in AA women. Our ability to predict varying BC risk positions us to develop and implement well-designed clinical trials to assess the clinical utility of these risk assessment tools in risk-adapted BC screening and risk-reduction strategies.

ACKNOWLEDGMENT

The authors greatly appreciate the staff and veteran participants who have contributed to the Million Veteran Program.

APPENDIX. ADDITIONAL MATERIALS AND METHODS

Phenotype Data

Oncology phenotypes were based on information in the Veterans Affairs (VA) Corporate Data Warehouse (CDW). For breast cancer (BC) diagnosis of MVP participants, we used relevant International Classification of Diseases (ICD)-9-CM and ICD-10-CM diagnosis and procedure codes as well as the Current Procedural Terminology (CPT) procedure codes from the EHRs. CPT codes for chemotherapy and radiation therapy were included in the diagnosis criteria. Additionally, use of endocrine therapy (the selective estrogen receptor modulator tamoxifen or the aromatase inhibitors anastrozole, letrozole, and exemestane) was used as an indicator for BC diagnosis when present concurrently with ICD codes for invasive BCs. A list of ICD-9, ICD-10, and CPT codes is included in the Data Supplement.

The primary BC phenotypes were taken from the oncology raw table (Oncology_Primary table in CDW). The patients with BC listed in this table had been diagnosed with BC and were being treated. In addition to the oncology raw table, the CPT procedure table (CDW.CPT_Procedures) was queried for any mastectomy or chemotherapy CPT codes along with a diagnosis code for BC to be included in the BC phenotypes. The Medications table (MVP.Medications table in CDW) was also queried for prescription of any of the medications listed above to be included in the BC phenotype.

Chart Review

A manual review of a subset of EHR data from the VA CDW was performed by an author (S-W L) to determine the accuracy of the phenotyping algorithm described above. Using the phenotyping algorithm, the following sets of potential incident diagnoses were detected: 268 patients with BC were identified from oncology raw; 635 patients with BC were identified, excluding in situ, by ICD codes only; 270 patients with BC were identified, excluding in situ, on the basis of ICD and compatible CPT codes; and 338 patients were diagnosed on the basis of either oncology raw or compatible ICD and CPT codes.

A random sample of 20 patients in each phenotyping category was reviewed. Out of 20 patients who were found by both oncology raw and compatible ICD and CPT codes (out of a total of 200), 19 were infiltrating ductal carcinoma (IDC) and one ductal carcinoma in situ (DCIS). Out of 20 patients who were present in ICD and compatible CPT group but not in oncology raw (out of 70), 16 were IDC, three DCIS, and one with mastectomy for unclear reasons. Out of 20 patients who were present in the oncology raw but not in the ICD and compatible CPT group (out of 68), 14 were IDC, five DCIS, and one LCIS. Out of 20 patients who were ICD-positive but in neither of the above (out of 324), one was IDC and one DCIS.
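The category sizes quoted above can be cross-checked with simple inclusion-exclusion arithmetic. The sketch below uses only counts stated in the text; the variable names are illustrative. It also tabulates the share of each 20-patient review sample confirmed as infiltrating ductal carcinoma (IDC).

```python
# Category sizes reported in the chart review.
both = 200               # in oncology raw AND in ICD plus compatible CPT
icd_cpt_only = 70        # in ICD plus compatible CPT, not in oncology raw
oncology_raw_only = 68   # in oncology raw, not in ICD plus compatible CPT

oncology_raw_total = both + oncology_raw_only        # all oncology raw cases
icd_cpt_total = both + icd_cpt_only                  # all ICD plus CPT cases
either = oncology_raw_total + icd_cpt_total - both   # union of the two sources

print(oncology_raw_total, icd_cpt_total, either)

# IDC confirmations out of 20 reviewed charts per category (from the text).
idc_confirmed = {
    "both sources": 19,
    "ICD plus CPT only": 16,
    "oncology raw only": 14,
    "ICD only": 1,
}
idc_fraction = {name: n / 20 for name, n in idc_confirmed.items()}
for name, frac in idc_fraction.items():
    print(f"{name}: {frac:.0%} IDC in review sample")
```

The totals reproduce the 268, 270, and 338 patients reported by the phenotyping algorithm, and the low confirmation fraction in the ICD-only group is the stated reason for excluding that group.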

Based on the results of this validation study, we excluded participants who had ICD diagnoses with no other supporting evidence because of the observed low incidence of BC in this subset. Participants who were identified by either oncology raw or compatible ICD and CPT codes were included.

Demographic and Clinical Data

The clinical and demographic characteristics are shown in Table 1. Clinical risk factors including age at menarche, first birth, and menopause, as well as parity, alcohol use (grams of ethanol per day), oral contraceptive and hormone-replacement therapy use, and family history of BC were extracted from the CDW core demographics and baseline survey tables. Height and body mass index values were pulled from the core vitals table. As discussed below, the race or ethnicity of the subjects was determined from the genotype data and the self-identified race or ethnicity using the HARE24 classification algorithm.

Race and Ancestry

The race or ethnicity data in MVP were derived from information in the VA CDW and the MVP baseline survey. These data were consistent in only about 60% of the participants, and the remaining participants had either missing or inconsistent assignments. To remedy this, we used the HARE24 classification as the race or ethnicity of subjects. Under this classification, each subject is assigned to one of four groups: Hispanic (HIS), non-Hispanic Asian (ASN), non-Hispanic White (EU), and non-Hispanic Black (AA). This assignment was done using a supervised machine learning algorithm applied to the principal components of the genotype data, with the subjects for whom self-identified race and ethnicity were available used as the training set.

Genotype Data

Genotyping was performed using an Affymetrix Axiom biobank array with probes for 723,305 single-nucleotide polymorphisms (SNPs) that was customized for MVP. The MVP data release 3 we used in this study contained genotype data from 455,789 subjects after quality control filtering. Details of the sample- and variant-level quality control are described elsewhere in other MVP publications.25 The final quality-controlled genotype data for analysis contained 455,789 subjects and 668,280 SNPs. The genotype data were used to determine the sex of the subjects using PLINK2. Out of the 39,077 genetic females in the MVP cohort, 37,411 did not have any recorded diagnosis of BC at the time of enrollment, and the 35,130 of these who were between 20 and 80 years old at enrollment were used as the cohort in this study.

The MVP genotype data were imputed with Minimac426 using the 1000 Genomes Project reference panel (v.3), which resulted in imputed genotypes for 49,134,253 SNPs. The imputed data were generated in PLINK2 file format, which stores both the genotype and dosage values for each variant. The imputed data were used to determine genotypes at the variant positions used in the polygenic risk score calculations described below.

Polygenic Risk Score

The polygenic risk score of 313 SNPs was calculated as the sum of the number of risk alleles or allele dosage multiplied by the log-odds-ratio estimated from the Breast Cancer Association Consortium study6 using the formula

$$\mathrm{PRS}_{313} = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_{313} x_{313},$$

where $x_k$ is the number of minor alleles (0, 1, 2) for SNP $k$, and $\beta_k$ are the estimated log-odds-ratios.6 Of the 313 SNPs, 243 SNPs were directly genotyped and 70 SNPs were imputed. When the number of alleles for an SNP for an individual was missing, the value was imputed with the effect allele frequency within the study to represent the mean. A table of SNP IDs with log-odds-ratio coefficients is presented in the Data Supplement.
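The weighted sum above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the three-SNP toy inputs, and the `fill_values` argument are hypothetical. The text says missing allele counts were replaced using the study effect allele frequency but does not specify the scale, so the caller supplies the per-SNP substitute value explicitly.

```python
from typing import Optional, Sequence


def polygenic_risk_score(
    dosages: Sequence[Optional[float]],
    betas: Sequence[float],
    fill_values: Sequence[float],
) -> float:
    """Sum of beta_k * x_k over SNPs, substituting fill_values[k] when x_k is missing."""
    if not (len(dosages) == len(betas) == len(fill_values)):
        raise ValueError("all inputs must have one entry per SNP")
    score = 0.0
    for x, beta, fill in zip(dosages, betas, fill_values):
        if x is None:  # missing genotype: use the supplied substitute value
            x = fill
        score += beta * x
    return score


# Toy example with three SNPs (the real score uses 313); values are made up.
toy_betas = [0.10, -0.05, 0.20]    # log-odds-ratios
toy_dosages = [2, None, 1.0]       # second SNP is missing
toy_fill = [0.6, 0.4, 0.3]         # assumed substitutes for missing values
toy_prs = polygenic_risk_score(toy_dosages, toy_betas, toy_fill)
print(round(toy_prs, 4))
```

Because imputed SNPs are stored as dosages, `x` is allowed to be fractional as well as 0, 1, or 2.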

Statistical Methods

Demographic and clinical characteristics were summarized with means and standard deviations or medians and interquartile ranges where appropriate. Time to primary event of incident BC was calculated as the time from study entry to time of first diagnosis of BC, as described in Phenotype Data above. Women who were not diagnosed with BC were right-censored at their last visit date before the data freeze on January 31, 2019. Women with BC at study entry (n = 1,666) and women with age at enrollment < 20 or ≥ 80 years old (n = 2,281) were excluded.

Validation of prediction instruments was performed on the subset of MVP participants with sufficient data to calculate each model. For the Breast Cancer Risk Assessment Tool (BCRAT) model without genetic component, the prediction score and absolute risk estimates were calculated on all MVP women with the BCRA package in R27 (version 2.1.2). The number of biopsies was unknown for all women and set to the missing value. The HARE algorithm was used to define ancestry groups, and race-specific BCRA models and incidence rates were applied according to the following categorization: White 1983-1987 SEER rates for HARE EU, African-American for HARE AA, Hispanic-American (US born) 1995-2004 for HARE HIS, Other Asian for HARE ASN, and Other (Native American and unknown race) for the remainder.

For the Individualized Coherent Absolute Risk Estimation (iCARE) models, women with more than three predictors missing were excluded (resulting in cohort size of 14,152 for BPC3 and 13,018 for Lit; Fig 1). The Individualized Coherent Absolute Risk Estimator-literature review (iCARE-Lit) model was calculated using the iCARE-Lit-ge50 model for women > 50 years old and the iCARE-Lit-lt50 model for women ≤ 50 years old, and the iCARE-BPC3 model was calculated for the entire cohort of women > 20 years old as well as for the subcohort of women > 50 years old. The iCARE models and absolute risk estimates were fit with the iCARE package8 in R (version 1.10.3). The iCARE package was used to calculate absolute risk, relative risk, and mean expected and observed risk within deciles of the risk score. Calibration was assessed with overall expected/observed risk ratio and corresponding 95% Wald-based CIs, as well as the Hosmer-Lemeshow test for goodness of fit. Model validation was assessed with the area under the curve statistic with 95% Wald-based confidence intervals using the asymptotic variance formula28 as implemented in iCARE. Absolute risk estimates were calculated for a cumulative interval of 5 years as well as lifetime (30-80 years). For patients 76 years of age or older, the interval used was the difference between 80 years and study entry age. The area under the curve presented in the text denotes the prediction accuracy of 5-year absolute risk estimates, unless otherwise noted.
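For readers unfamiliar with the AUC as a discrimination measure, the sketch below computes it as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen non-case, counting ties as one half (the Mann-Whitney form). It is a simplified illustration that ignores censoring and does not reproduce the variance formula or the implementation in iCARE; the function name and the toy risks are hypothetical.

```python
from typing import Sequence


def auc_mann_whitney(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Fraction of case/control pairs in which the case has the higher score (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy 5-year absolute risks (made-up values, not study data).
toy_cases = [0.030, 0.020, 0.015]
toy_controls = [0.010, 0.020, 0.005, 0.012]
toy_auc = auc_mann_whitney(toy_cases, toy_controls)
print(toy_auc)
```

An AUC of 0.5 corresponds to no discrimination, which is why the iCARE-BPC3 estimates near 0.5 reported in the Results indicate little predictive value in this cohort.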

When incorporating the genetic component, risk models combining either BCRAT or iCARE with PRS313 were calculated on the subset of genotyped individuals. For methodological consistency, the iCARE package was used to calculate absolute risk estimates and validation statistics for all models combining PRS313 with clinical risk factors. For the BCRAT model, the linear predictor was input into the iCARE package function with PRS313 added with weight 1. Previous research has indicated that the assumption of a multiplicative model between PRS313 and clinical risk factors is not violated,7 so adding PRS313 on the log-hazard scale is appropriate.
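The equivalence between adding PRS313 on the log-hazard scale and multiplying relative risks can be checked numerically. The Python sketch below uses made-up values and hypothetical function names; it shows only the arithmetic and does not call the iCARE package.

```python
import math


def combined_linear_predictor(clinical_lp, prs, prs_weight=1.0):
    """Add the PRS to the clinical linear predictor on the log-hazard scale."""
    return clinical_lp + prs_weight * prs


def relative_risk(linear_predictor):
    """Convert a log-hazard linear predictor to a relative risk."""
    return math.exp(linear_predictor)


# Made-up example values, for illustration only.
clinical_lp = 0.30  # clinical model linear predictor
prs = 0.50          # PRS313 value on the log-hazard scale

combined_rr = relative_risk(combined_linear_predictor(clinical_lp, prs))
product_rr = relative_risk(clinical_lp) * relative_risk(prs)
```

Here `combined_rr` and `product_rr` agree up to floating-point error, because exp(a + b) = exp(a) × exp(b); a weight of 1 means the PRS enters the model unscaled.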

To further assess the association of prediction tools with BC incidence, absolute risk estimates for a specific model were categorized into quartiles and Kaplan-Meier survival curves were calculated for each quartile. Cox proportional hazard models were also fit with the absolute risk value for each model individually to calculate the hazard ratio of BC for a 1% increase and a 1-standard deviation increase in absolute risk or a 1-standard deviation increase in PRS313. The proportion of individuals with absolute risk > 20% was calculated to estimate the proportion of high-risk women. Models were fit within HARE-defined racial groups as well as for a collapsed HARE racial group (European, Asian, Hispanic, and Other and Missing).
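The quartile grouping, the high-risk proportion, and the rescaling of a hazard ratio to a 1-standard deviation increase can be sketched with the Python standard library. The values below are made up, the function names are hypothetical, and the quartile cut points use the default method of `statistics.quantiles`, which may differ from the method the authors used in R.

```python
import math
from bisect import bisect_left
from statistics import quantiles


def assign_quartiles(risks):
    """Label each absolute risk 1-4 using the sample quartile cut points."""
    cuts = quantiles(risks, n=4)  # three cut points: Q1, median, Q3
    return [bisect_left(cuts, r) + 1 for r in risks]


def proportion_high_risk(risks, threshold=0.20):
    """Share of individuals whose absolute risk exceeds the threshold."""
    return sum(r > threshold for r in risks) / len(risks)


def hazard_ratio_per_sd(log_hr_per_unit, sd):
    """Rescale a Cox coefficient (log HR per unit) to a hazard ratio per 1 SD."""
    return math.exp(log_hr_per_unit * sd)


# Made-up absolute risk estimates on the probability scale.
risks = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.25]
groups = assign_quartiles(risks)
high_risk_share = proportion_high_risk(risks)
```

The per-SD rescaling follows from the Cox model: if the coefficient per unit is β, the hazard ratio for an increase of one standard deviation s is exp(βs).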

R version 3.5.1 was used for all analyses.29 Statistical significance for hypothesis testing was defined as a two-sided P value < .05.

MVP CORE ACKNOWLEDGEMENT

MVP Executive Committee

Co-Chair: J. Michael Gaziano, MD, MPH, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

Co-Chair: Sumitra Muralidhar, PhD, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

Rachel Ramoni, DMD, ScD, Chief VA Research and Development Officer, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

Jean Beckham, PhD, Durham VA Medical Center, 508 Fulton Street, Durham, NC 27705

Kyong-Mi Chang, MD, Philadelphia VA Medical Center, 3900 Woodland Avenue, Philadelphia, PA 19104

Christopher J. O’Donnell, MD, MPH, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

Philip S. Tsao, PhD, VA Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304

James Breeling, MD, Ex-Officio, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

Grant Huang, PhD, Ex-Officio, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

Juan P. Casas, MD, PhD, Ex-Officio, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

MVP Program Office

Sumitra Muralidhar, PhD, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

Jennifer Moser, PhD, US Department of Veterans Affairs, 810 Vermont Avenue NW, Washington, DC 20420

MVP Recruitment/Enrollment

Recruitment/Enrollment Director/Deputy Director, Boston—Stacey B. Whitbourne, PhD; Jessica V. Brewer, MPH, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

MVP Coordinating Centers

Clinical Epidemiology Research Center (CERC), West Haven—Mihaela Aslan, PhD, West Haven VA Medical Center, 950 Campbell Avenue, West Haven, CT 06516

Cooperative Studies Program Clinical Research Pharmacy Coordinating Center, Albuquerque—Todd Connor, PharmD.; Dean P. Argyres, BS, MS, New Mexico VA Health Care System, 1501 San Pedro Drive SE, Albuquerque, NM 87108

Genomics Coordinating Center, Palo Alto—Philip S. Tsao, PhD, VA Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304

MVP Boston Coordinating Center, Boston - J. Michael Gaziano, MD, MPH, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

MVP Information Center, Canandaigua—Brady Stephens, MS, Canandaigua VA Medical Center, 400 Fort Hill Avenue, Canandaigua, NY 14424

VA Central Biorepository, Boston—Mary T. Brophy MD, MPH; Donald E. Humphries, PhD; Luis E. Selva, PhD, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

MVP Informatics, Boston—Nhan Do, MD; Shahpoor (Alex) Shayan, MS, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

MVP Data Operations/Analytics, Boston—Kelly Cho, MPH, PhD, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

Director of Regulatory Affairs—Lori Churby, BS, VA Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304

MVP Science

Science Operations—Christopher J. O’Donnell, MD, MPH, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

Genomics Core—Christopher J. O’Donnell, MD, MPH, Saiju Pyarajan PhD, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130; Philip S. Tsao, PhD, VA Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304

Data Core—Kelly Cho, MPH, PhD, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

VA Informatics and Computing Infrastructure (VINCI)—Scott L. DuVall, PhD, VA Salt Lake City Health Care System, 500 Foothill Drive, Salt Lake City, UT 84148

Data and Computational Sciences—Saiju Pyarajan, PhD, VA Boston Healthcare System, 150 S. Huntington Avenue, Boston, MA 02130

Statistical Genetics—Elizabeth Hauser, PhD, Durham VA Medical Center, 508 Fulton Street, Durham, NC 27705; Yan Sun, PhD, Atlanta VA Medical Center,1670 Clairmont Road, Decatur, GA 30033; Hongyu Zhao, PhD, West Haven VA Medical Center, 950 Campbell Avenue, West Haven, CT 06516

Current MVP Local Site Investigators

Atlanta VA Medical Center (Peter Wilson, MD), 1670 Clairmont Road, Decatur, GA 30033

Bay Pines VA Healthcare System (Rachel McArdle, PhD), 10,000 Bay Pines Blvd, Bay Pines, FL 33744

Birmingham VA Medical Center (Louis Dellitalia, MD), 700 S. 19th Street, Birmingham, AL 35233

Central Western Massachusetts Healthcare System (Kristin Mattocks, PhD, MPH), 421 North Main Street, Leeds, MA 01053

Cincinnati VA Medical Center (John Harley, MD, PhD), 3200 Vine Street, Cincinnati, OH 45220

Clement J. Zablocki VA Medical Center (Jeffrey Whittle, MD, MPH), 5000 West National Avenue, Milwaukee, WI 53295

VA Northeast Ohio Healthcare System (Frank Jacono, MD), 10701 East Boulevard, Cleveland, OH 44106

Durham VA Medical Center (Jean Beckham, PhD), 508 Fulton Street, Durham, NC 27705

Edith Nourse Rogers Memorial Veterans Hospital (John Wells, PhD), 200 Springs Road, Bedford, MA 01730

Edward Hines, Jr. VA Medical Center (Salvador Gutierrez, MD), 5000 South 5th Avenue, Hines, IL 60141

Veterans Health Care System of the Ozarks (Gretchen Gibson, DDS, MPH), 1100 North College Avenue, Fayetteville, AR 72703

Fargo VA Health Care System (Kimberly Hammer, PhD), 2101 N. Elm, Fargo, ND 58102

VA Health Care Upstate New York (Laurence Kaminsky, PhD), 113 Holland Avenue, Albany, NY 12208

New Mexico VA Health Care System (Gerardo Villareal, MD), 1501 San Pedro Drive SE, Albuquerque, NM 87108

VA Boston Healthcare System (Scott Kinlay, MBBS, PhD), 150 S. Huntington Avenue, Boston, MA 02130

VA Western New York Healthcare System (Junzhe Xu, MD), 3495 Bailey Avenue, Buffalo, NY 14215-1199

Ralph H. Johnson VA Medical Center (Mark Hamner, MD), 109 Bee Street, Mental Health Research, Charleston, SC 29401

Columbia VA Health Care System (Roy Mathew, MD), 6439 Garners Ferry Road, Columbia, SC 29209

VA North Texas Health Care System (Sujata Bhushan, MD), 4500 S. Lancaster Road, Dallas, TX 75216

Hampton VA Medical Center (Pran Iruvanti, DO, PhD), 100 Emancipation Drive, Hampton, VA 23667

Richmond VA Medical Center (Michael Godschalk, MD), 1201 Broad Rock Blvd., Richmond, VA 23249

Iowa City VA Health Care System (Zuhair Ballas, MD), 601 Highway 6 West, Iowa City, IA 52246-2208

Eastern Oklahoma VA Health Care System (Douglas Ivins, MD), 1011 Honor Heights Drive, Muskogee, OK 74401

James A. Haley Veterans’ Hospital (Stephen Mastorides, MD), 13000 Bruce B. Downs Blvd, Tampa, FL 33612

James H. Quillen VA Medical Center (Jonathan Moorman, MD, PhD), Corner of Lamont & Veterans Way, Mountain Home, TN 37684

John D. Dingell VA Medical Center (Saib Gappy, MD), 4646 John R Street, Detroit, MI 48201

Louisville VA Medical Center (Jon Klein, MD, PhD), 800 Zorn Avenue, Louisville, KY 40206

Manchester VA Medical Center (Nora Ratcliffe, MD), 718 Smyth Road, Manchester, NH 03104

Miami VA Health Care System (Hermes Florez, MD, PhD), 1201 NW 16th Street, 11 GRC, Miami, FL 33125

Michael E. DeBakey VA Medical Center (Olaoluwa Okusaga, MD), 2002 Holcombe Blvd, Houston, TX 77030

Minneapolis VA Health Care System (Maureen Murdoch, MD, MPH), One Veterans Drive, Minneapolis, MN 55417

N. FL/S. GA Veterans Health System (Peruvemba Sriram, MD), 1601 SW Archer Road, Gainesville, FL 32608

Northport VA Medical Center (Shing Shing Yeh, PhD, MD), 79 Middleville Road, Northport, NY 11768

Overton Brooks VA Medical Center (Neeraj Tandon, MD), 510 East Stoner Ave, Shreveport, LA 71101

Philadelphia VA Medical Center (Darshana Jhala, MD), 3900 Woodland Avenue, Philadelphia, PA 19104

Phoenix VA Health Care System (Samuel Aguayo, MD), 650 E. Indian School Road, Phoenix, AZ 85012

Portland VA Medical Center (David Cohen, MD), 3710 SW U.S. Veterans Hospital Road, Portland, OR 97239

Providence VA Medical Center (Satish Sharma, MD), 830 Chalkstone Avenue, Providence, RI 02908

Richard Roudebush VA Medical Center (Suthat Liangpunsakul, MD, MPH), 1481 West 10th Street, Indianapolis, IN 46202

Salem VA Medical Center (Kris Ann Oursler, MD), 1970 Roanoke Blvd, Salem, VA 24153

San Francisco VA Health Care System (Mary Whooley, MD), 4150 Clement Street, San Francisco, CA 94121

South Texas Veterans Health Care System (Sunil Ahuja, MD), 7400 Merton Minter Boulevard, San Antonio, TX 78229

Southeast Louisiana Veterans Health Care System (Joseph Constans, PhD), 2400 Canal Street, New Orleans, LA 70119

Southern Arizona VA Health Care System (Paul Meyer, MD, PhD), 3601 S 6th Avenue, Tucson, AZ 85723

Sioux Falls VA Health Care System (Jennifer Greco, MD), 2501 W 22nd Street, Sioux Falls, SD 57105

St. Louis VA Health Care System (Michael Rauchman, MD), 915 North Grand Blvd, St. Louis, MO 63106

Syracuse VA Medical Center (Richard Servatius, PhD), 800 Irving Avenue, Syracuse, NY 13210

VA Eastern Kansas Health Care System (Melinda Gaddy, PhD), 4101 S 4th Street Trafficway, Leavenworth, KS 66048

VA Greater Los Angeles Health Care System (Agnes Wallbom, MD, MS), 11301 Wilshire Blvd, Los Angeles, CA 90073

VA Long Beach Healthcare System (Timothy Morgan, MD), 5901 East 7th Street, Long Beach, CA 90822

VA Maine Healthcare System (Todd Stapley, DO), 1 VA Center, Augusta, ME 04330

VA New York Harbor Healthcare System (Scott Sherman, MD, MPH), 423 East 23rd Street, New York, NY 10010

VA Pacific Islands Health Care System (George Ross, MD), 459 Patterson Rd, Honolulu, HI 96819

VA Palo Alto Health Care System (Philip Tsao, PhD), 3801 Miranda Avenue, Palo Alto, CA 94304-1290

VA Pittsburgh Health Care System (Patrick Strollo, Jr., MD), University Drive, Pittsburgh, PA 15240

VA Puget Sound Health Care System (Edward Boyko, MD), 1660 S. Columbian Way, Seattle, WA 98108-1597

VA Salt Lake City Health Care System (Laurence Meyer, MD, PhD), 500 Foothill Drive, Salt Lake City, UT 84148

VA San Diego Healthcare System (Samir Gupta, MD, MSCS), 3350 La Jolla Village Drive, San Diego, CA 92161

VA Sierra Nevada Health Care System (Mostaqul Huq, PharmD, PhD), 975 Kirman Avenue, Reno, NV 89502

VA Southern Nevada Healthcare System (Joseph Fayad, MD), 6900 North Pecos Road, North Las Vegas, NV 89086

VA Tennessee Valley Healthcare System (Adriana Hung, MD, MPH), 1310 24th Avenue South, Nashville, TN 37212

Washington DC VA Medical Center (Jack Lichy, MD, PhD), 50 Irving St, Washington, D. C. 20422

W.G. (Bill) Hefner VA Medical Center (Robin Hurley, MD), 1601 Brenner Ave, Salisbury, NC 28144

White River Junction VA Medical Center (Brooks Robey, MD), 163 Veterans Drive, White River Junction, VT 05009

William S. Middleton Memorial Veterans Hospital (Robert Striker, MD, PhD), 2500 Overlook Terrace, Madison, WI 53705

DISCLAIMER

This publication does not represent the views of the Department of Veterans Affairs or the US Government.

SUPPORT

This research is based on data from the Million Veteran Program, Office of Research and Development, Veterans Health Administration, and was supported by award 1I01BX004188-01.

J.M. and N.R. contributed equally to this work.

S.G.H., C.A.B., and S.-W.L. contributed equally to this work.

AUTHOR CONTRIBUTIONS

Conception and design: Jessica Minnier, Nallakkandi Rajeevan, Saiju Pyarajan, Sally G. Haskell, Cynthia A. Brandt, Shiuh-Wen Luoh

Administrative support: Cynthia A. Brandt

Collection and assembly of data: Nallakkandi Rajeevan, Saiju Pyarajan, Shiuh-Wen Luoh

Data analysis and interpretation: Jessica Minnier, Nallakkandi Rajeevan, Lina Gao, Byung Park, Paul Spellman, Shiuh-Wen Luoh

Manuscript writing: All authors

Final approval of manuscript: All authors

Accountable for all aspects of the work: All authors

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The following represents disclosure information provided by the authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO’s conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/po/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians.

Paul Spellman

Expert Testimony: Natera, Foundation Medicine

No other potential conflicts of interest were reported.

REFERENCES

  • 1. Nyström L, Andersson I, Bjurstam N, et al: Long-term effects of mammography screening: Updated overview of the Swedish randomised trials. Lancet 359:909-919, 2002
  • 2. Nyström L, Bjurstam N, Jonsson H, et al: Reduced breast cancer mortality after 20 years of follow-up in the Swedish randomized controlled mammography trials in Malmö, Stockholm, and Göteborg. J Med Screen 24:34-42, 2017
  • 3. Shieh Y, Eklund M, Sawaya GF, et al: Population-based screening for cancer: Hope and hype. Nat Rev Clin Oncol 13:550-565, 2016
  • 4. Collaborative Group on Hormonal Factors in Breast Cancer: Familial breast cancer: Collaborative reanalysis of individual data from 52 epidemiological studies including 58 209 women with breast cancer and 101 986 women without the disease. Lancet 358:1389-1399, 2001
  • 5. Thompson D, Easton D: The genetic epidemiology of breast cancer genes. J Mammary Gland Biol Neoplasia 9:221-236, 2004
  • 6. Mavaddat N, Michailidou K, Dennis J, et al: Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am J Hum Genet 104:21-34, 2019
  • 7. Kapoor PM, Mavaddat N, Choudhury PP, et al: Combined associations of a polygenic risk score and classical risk factors with breast cancer risk. J Natl Cancer Inst 113:329-337, 2021
  • 8. Choudhury PP, Maas P, Wilcox A, et al: iCARE: An R package to build, validate and apply absolute risk models. PLoS One 15:e0228198, 2020
  • 9. Choudhury PP, Wilcox AN, Brook MN, et al: Comparative validation of breast cancer risk prediction models and projections for future risk stratification. J Natl Cancer Inst 112:278-285, 2020
  • 10. Maas P, Barrdahl M, Joshi AD, et al: Breast cancer risk from modifiable and nonmodifiable risk factors among white women in the United States. JAMA Oncol 2:1295-1302, 2016
  • 11. Gaziano JM, Concato J, Brophy M, et al: Million Veteran Program: A mega-biobank to study genetic influences on health and disease. J Clin Epidemiol 70:214-223, 2016
  • 12. Huo D, Feng Y, Haddad S, et al: Genome-wide association studies in women of African ancestry identified 3q26.21 as a novel susceptibility locus for oestrogen receptor negative breast cancer. Hum Mol Genet 25:4835-4846, 2016
  • 13. Feng Y, Stram DO, Rhie SK, et al: A comprehensive examination of breast cancer risk loci in African American women. Hum Mol Genet 23:5518-5526, 2014
  • 14. Palmer JR, Ruiz-Narvaez EA, Rotimi CN, et al: Genetic susceptibility loci for subtypes of breast cancer in an African American population. Cancer Epidemiol Biomarkers Prev 22:127-134, 2013
  • 15. Chen F, Chen GK, Stram DO, et al: A genome-wide association study of breast cancer in women of African ancestry. Hum Genet 132:39-48, 2013
  • 16. Wall JD, Pritchard JK: Haplotype blocks and linkage disequilibrium in the human genome. Nat Rev Genet 4:587-597, 2003
  • 17. Wang S, Qian F, Zheng Y, et al: Genetic variants demonstrating flip-flop phenomenon and breast cancer risk prediction among women of African ancestry. Breast Cancer Res Treat 168:703-712, 2018
  • 18. Michailidou K, Beesley J, Lindstrom S, et al: Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat Genet 47:373-380, 2015
  • 19. Mavaddat N, Pharoah PDP, Michailidou K, et al: Prediction of breast cancer risk based on profiling with common genetic variants. J Natl Cancer Inst 107:djv036, 2015
  • 20. Shieh Y, Fejerman L, Lott PC, et al: A polygenic risk score for breast cancer in US Latinas and Latin American women. J Natl Cancer Inst 112:590-598, 2020
  • 21. Wen W, Shu X-O, Guo X, et al: Prediction of breast cancer risk based on common genetic variants in women of East Asian ancestry. Breast Cancer Res 18:124, 2016
  • 22. Petridis C, Brook MN, Shah V, et al: Genetic predisposition to ductal carcinoma in situ of the breast. Breast Cancer Res 18:22, 2016
  • 23. Evans DGR, Harkness EF, Brentnall AR, et al: Breast cancer pathology and stage are better predicted by risk stratification models that include mammographic density and common genetic variants. Breast Cancer Res Treat 176:141-148, 2019
  • 24. Fang H, Hui Q, Lynch J, et al: Harmonizing genetic ancestry and self-identified race/ethnicity in genome-wide association studies. Am J Hum Genet 105:763-772, 2019
  • 25. Hunter-Zinck H, Shi Y, Li M, et al: Genotyping array design and data quality control in the Million Veteran Program. Am J Hum Genet 106:535-548, 2020
  • 26. Das S, Forer L, Schönherr S, et al: Next-generation genotype imputation service and methods. Nat Genet 48:1284-1287, 2016
  • 27. Zhang F: BCRA: Breast Cancer Risk Assessment. R Package Version 20, 2018. https://cran.r-project.org/package=BCRA
  • 28. DeLong ER, DeLong DM, Clarke-Pearson DL: Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics 44:837-845, 1988
  • 29. R Development Core Team: R: A Language and Environment for Statistical Computing, 2004. http://www.r-project.org

Articles from JCO Precision Oncology are provided here courtesy of American Society of Clinical Oncology