Genome- and transcriptome-wide association studies of 386,000 Asian and European-ancestry women provide new insights into breast cancer genetics

Guochong Jia; Jie Ping; Xiang Shu; Yaohua Yang; Qiuyin Cai; Sun-Seog Kweon; Ji-Yeob Choi; Michiaki Kubo; Sue K Park; Manjeet K Bolla; Joe Dennis; Qin Wang; Xingyi Guo; Bingshan Li; Ran Tao; Kristan J Aronson; Tsun L Chan; Yu-Tang Gao; Mikael Hartman; Weang Kee Ho; Hidemi Ito; Motoki Iwasaki; Hiroji Iwata; Esther M John; Yoshio Kasuga; Mi-Kyung Kim; Allison W Kurian; Ava Kwong; Jingmei Li; Artitaya Lophatananon; Siew-Kee Low; Shivaani Mariapun; Koichi Matsuda; Keitaro Matsuo; Kenneth Muir; Dong-Young Noh; Boyoung Park; Min-Ho Park; Chen-Yang Shen; Min-Ho Shin; John J Spinelli; Atsushi Takahashi; Chiuchen Tseng; Shoichiro Tsugane; Anna H Wu; Taiki Yamaji; Ying Zheng; Alison M Dunning; Paul DP Pharoah; Soo-Hwang Teo; Daehee Kang; Douglas F Easton; Jacques Simard; Xiao-ou Shu; Jirong Long; Wei Zheng

doi:10.1016/j.ajhg.2022.10.011

2022 Nov 9;109(12):2185–2195. doi: 10.1016/j.ajhg.2022.10.011

Guochong Jia ¹,⁵⁶, Jie Ping ¹,⁵⁶, Xiang Shu ², Yaohua Yang ¹, Qiuyin Cai ¹, Sun-Seog Kweon ³,⁴, Ji-Yeob Choi ⁵,⁶,⁷, Michiaki Kubo ⁸, Sue K Park ⁵,⁶,⁷, Manjeet K Bolla ⁹, Joe Dennis ⁹, Qin Wang ⁹, Xingyi Guo ¹, Bingshan Li ¹⁰, Ran Tao ¹¹,¹², Kristan J Aronson ¹³, Tsun L Chan ¹⁴,¹⁵, Yu-Tang Gao ¹⁶, Mikael Hartman ¹⁷,¹⁸,¹⁹, Weang Kee Ho ²⁰, Hidemi Ito ²¹,²², Motoki Iwasaki ²³, Hiroji Iwata ²⁴, Esther M John ²⁵,²⁶,²⁷, Yoshio Kasuga ²⁸, Mi-Kyung Kim ²⁹, Allison W Kurian ²⁶, Ava Kwong ¹⁴,³⁰,³¹, Jingmei Li ¹⁹,³²,³³, Artitaya Lophatananon ³⁴,³⁵, Siew-Kee Low ⁸, Shivaani Mariapun ³⁶, Koichi Matsuda ³⁷, Keitaro Matsuo ³⁸,³⁹, Kenneth Muir ³⁴,³⁵, Dong-Young Noh ⁷,⁴⁰, Boyoung Park ⁴¹, Min-Ho Park ⁴², Chen-Yang Shen ⁴³,⁴⁴, Min-Ho Shin ³, John J Spinelli ⁴⁵,⁴⁶, Atsushi Takahashi ⁸,⁴⁷, Chiuchen Tseng ⁴⁸, Shoichiro Tsugane ⁴⁹, Anna H Wu ⁴⁸, Taiki Yamaji ²³, Ying Zheng ⁵⁰, Alison M Dunning ⁵¹, Paul DP Pharoah ⁹,⁵¹, Soo-Hwang Teo ³⁶,⁵², Daehee Kang ⁶,⁷,⁵³,⁵⁴, Douglas F Easton ⁹,⁵¹, Jacques Simard ⁵⁵, Xiao-ou Shu ¹, Jirong Long ¹, Wei Zheng ¹,∗

¹Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, 2525 West End Avenue, Suite 800, Nashville, TN, USA

²Department of Epidemiology & Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA

³Department of Preventive Medicine, Chonnam National University Medical School, Hwasun, Korea

⁴Jeonnam Regional Cancer Center, Chonnam National University Hwasun Hospital, Hwasun, Korea

⁵Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea

⁶Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Korea

⁷Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea

⁸RIKEN Center for Integrative Medical Sciences, Yokohama, Japan

⁹Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

¹⁰Department of Molecular Physiology & Biophysics, Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA

¹¹Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA

¹²Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA

¹³Department of Public Health Sciences and Queen’s Cancer Research Institute, Queen’s University, Kingston, ON, Canada

¹⁴Hong Kong Hereditary Breast Cancer Family Registry, Hong Kong SAR, China

¹⁵Department of Molecular Pathology, Hong Kong Sanatorium & Hospital, Hong Kong SAR, China

¹⁶State Key Laboratory of Oncogene and Related Genes & Department of Epidemiology, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China

¹⁷Department of Surgery, National University Hospital, Singapore, Singapore

¹⁸Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore

¹⁹Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore

²⁰Department of Applied Mathematics, Faculty of Engineering, University of Nottingham Malaysia Campus, Semenyih, Selangor, Malaysia

²¹Division of Cancer Information and Control, Aichi Cancer Center Research Institute, Nagoya, Japan

²²Department of Descriptive Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, Japan

²³Division of Epidemiology, National Cancer Center Institute for Cancer Control, Tokyo, Japan

²⁴Department of Breast Oncology, Aichi Cancer Center, Nagoya, Aichi, Japan

²⁵Departments of Epidemiology, Cancer Prevention Institute of California, Fremont, CA, USA

²⁶Departments of Health Research and Policy, School of Medicine, Stanford University, Stanford, CA, USA

²⁷Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA

²⁸Department of Surgery, Nagano Matsushiro General Hospital, Nagano, Japan

²⁹Division of Cancer Epidemiology and Management, National Cancer Center, Goyang, Korea

³⁰Department of Surgery, University of Hong Kong, Hong Kong SAR, China

³¹Department of Surgery, Hong Kong Sanatorium & Hospital, Hong Kong SAR, China

³²Human Genetics, Genome Institute of Singapore, Singapore, Singapore

³³Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

³⁴Division of Health Sciences, Warwick Medical School, Warwick University, Coventry, UK

³⁵Institute of Population Health, University of Manchester, Manchester, UK

³⁶Cancer Research Malaysia, Subang Jaya, Selangor, Malaysia

³⁷Laboratory of Clinical Genome Sequencing, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan

³⁸Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Japan

³⁹Division of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, Japan

⁴⁰Department of Surgery, Seoul National University Hospital, Seoul, South Korea

⁴¹Department of Medicine, Hanyang University College of Medicine, Seoul, Korea

⁴²Department of Surgery, Chonnam National University Medical School, Gwangju, Korea

⁴³College of Public Health, China Medical University, Taichung, Taiwan

⁴⁴Taiwan Biobank, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan

⁴⁵Department of Cancer Control Research, British Columbia Cancer Agency, Vancouver, BC, Canada

⁴⁶School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada

⁴⁷Department of Genomic Medicine, Research Institute, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan

⁴⁸Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA

⁴⁹National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan

⁵⁰Shanghai Municipal Center for Disease Control and Prevention, Shanghai, China

⁵¹Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK

⁵²Department of Surgery, Faculty of Medicine, University Malaya, Kuala Lumpur, Malaysia

⁵³Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Korea

⁵⁴Institute of Environmental Medicine, Seoul National University Medical Research Center, Seoul, Korea

⁵⁵Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval, Research Center, Québec City, QC, Canada

∗Corresponding author: wei.zheng@vanderbilt.edu

⁵⁶These authors contributed equally

PMCID: PMC9748250 PMID: 36356581

Summary

By combining data from 160,500 individuals with breast cancer and 226,196 controls of Asian and European ancestry, we conducted genome- and transcriptome-wide association studies of breast cancer. We identified 222 genetic risk loci and 137 genes that were associated with breast cancer risk at p < 5.0 × 10⁻⁸ and a Bonferroni-corrected p < 4.6 × 10⁻⁶, respectively. Of them, 32 loci and 15 genes showed a significantly different association between ER-positive and ER-negative breast cancer after Bonferroni correction. Significant ancestral differences in risk variant allele frequencies and their association strengths with breast cancer risk were identified. Of the significant associations identified in this study, 17 loci and 14 genes are located at least 1 Mb away from any of the previously reported breast cancer risk variants. Pathway analyses including 221 putative risk genes identified multiple signaling pathways that may play a significant role in the development of breast cancer. Our study provides a comprehensive understanding of and new biological insights into the genetics of this common malignancy.

Keywords: breast cancer, multi-ancestry meta-analysis, transcriptome-wide association study

Using data from 386,000 Asian- and European-ancestry women, we conducted extensive genome- and transcriptome-wide association studies that identified 222 risk loci and 137 genes in association with breast cancer risk. These studies, along with pathway analyses, provide a comprehensive understanding of and new biological insights into the genetics of breast cancer.

Introduction

Breast cancer is the most commonly diagnosed cancer in women worldwide, with an estimated 2.3 million new cases in 2020.¹ Genetic factors play a critical role in the etiology of both familial and sporadic breast cancers. In addition to breast cancer predisposition genes, such as BRCA1 and BRCA2,²,³,⁴ common genetic variants in approximately 200 loci have been identified in genome-wide association studies (GWASs).⁵,⁶,⁷ However, most GWASs of breast cancer have been conducted among women of European ancestry,⁸ and GWASs conducted among women of Asian ancestry have had relatively smaller sample sizes.⁹,¹⁰ Although most susceptibility loci have been shown to be shared across European and Asian populations, the lead variants at some susceptibility loci can be different between these two populations given their differences in genetic architecture.¹¹,¹² To identify additional genetic risk loci and provide a more comprehensive understanding of breast cancer genetics, we conducted cross-ancestry meta-analyses of data from the Asia Breast Cancer Consortium (ABCC) and the Breast Cancer Association Consortium (BCAC), including 386,696 women (139,523 of Asian ancestry and 247,173 of European ancestry). Furthermore, we performed a transcriptome-wide association study (TWAS) to uncover putative breast cancer susceptibility genes and gain biological insights into the genetics of this common malignancy.

Subjects and methods

Study population

In this study, we conducted a cross-ancestry meta-analysis using data from two large breast cancer genetic research consortia: ABCC and BCAC. All studies were approved by the relevant institutional ethical committees. Detailed descriptions of the participating studies are provided in the supplemental information. In brief, the 133,384 individuals with breast cancer and 113,789 controls of European ancestry included in this analysis were from BCAC, which consisted of three datasets: iCOGS (38,349 individuals with breast cancer and 37,818 controls), OncoArray (80,125 individuals with breast cancer and 58,383 controls), and other GWASs (14,910 individuals with breast cancer and 17,588 controls).⁶ For European-ancestry participants, we used summary statistics data generated in BCAC, following the data use agreements. The individuals of Asian ancestry included in this analysis were 27,116 individuals with breast cancer and 112,407 controls recruited by studies in ABCC and BCAC (Table S1). Proper informed consent was obtained from all study participants.

Genotyping and quality control

Genotyping and quality control procedures for the contributing studies have been described previously.⁵,⁶,⁷,⁹,¹⁰,¹¹,¹³,¹⁴,¹⁵,¹⁶,¹⁷,¹⁸,¹⁹ After quality control, we imputed all datasets using the 1000 Genomes Project Phase 3 reference panel and excluded variants with an imputation quality score (R²) <0.3. Variants with a minor allele frequency (MAF) >0.01 in Asian-ancestry datasets or >0.005 in European-ancestry datasets were included in association analyses.

Statistical meta-analyses

Within each ABCC study except the Biobank Japan project (BBJ2), we used logistic regression models in PLINK 2.0²⁰ to estimate the per-allele odds ratio (OR) for each variant, adjusting for age and the top two principal components (PCs). The number of PCs included in the regression was determined by evaluating the scree plot. For BBJ2 and the BCAC-European dataset, we obtained summary statistics. Age and the top five PCs were adjusted as covariates in BBJ,¹³ and the country of the contributing studies and the first ten PCs were adjusted in the BCAC-European dataset.⁶ A fixed-effects model was used for ancestry-specific and cross-ancestry meta-analyses of the risk of overall breast cancer and estrogen receptor (ER) subtypes using METAL.²¹ The heterogeneity of risk estimates was evaluated using Cochran’s Q statistic and I². We estimated the statistical power of our cross-ancestry meta-analyses with $α$ at 5 × 10⁻⁸ (Figure S1). We had 80% power to detect a minimum per-allele OR of 1.07, 1.05, 1.04, and 1.03 for variants with a MAF of 0.05, 0.15, 0.20, and 0.30, respectively. To account for population heterogeneity, we also used the meta-regression approach implemented in MR-MEGA²² in cross-ancestry meta-analyses for overall breast cancer. At each risk locus, we performed fine-mapping analysis using SuSiE²³ and constructed a 95% credible set for the lead variant at the locus (detailed methods in the supplemental information). We investigated the ancestral heterogeneity of the lead variants and all variants in the credible sets.
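As an illustration of the pooling step, the following is a minimal sketch of an inverse-variance fixed-effects meta-analysis with Cochran’s Q and I². It assumes inverse-variance weighting of per-study log-ORs; the text does not state which METAL weighting scheme was used, and the function name and example inputs are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study log-ORs with inverse-variance weights (assumed scheme).

    betas: per-study log odds ratios for one variant
    ses:   their standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"beta": beta, "se": se, "or": math.exp(beta), "z": z,
            "p": p, "q": q, "df": df, "i2": i2}


if __name__ == "__main__":
    # Hypothetical per-study estimates, for illustration only.
    result = fixed_effects_meta([0.03, 0.05, 0.02], [0.010, 0.020, 0.015])
    print(result)
```

Under this scheme, larger studies (smaller standard errors) dominate the pooled estimate, and I² is reported as a percentage, as in the tables of this paper.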

Novel risk loci were defined as loci with sentinel variants located at least 1 Mb away from any of the risk variants identified by previous GWASs included in the NHGRI-EBI GWAS Catalog.²⁴ For each novel locus, we conducted conditional analyses to identify additional independent signals located within ±500 kb of the lead variant. GCTA-COJO was used for the conditional analyses. In each iteration of the stepwise conditional analysis, we conducted ancestry-specific conditional analyses and combined the results by a fixed-effects model using METAL. Asian samples (N = 20,554) genotyped by Multi-Ethnic Genotyping Array (MEGA) chips were used as a reference panel for linkage disequilibrium (LD) estimation among women of Asian ancestry. For women of European ancestry, we used 5,000 samples from the Vanderbilt University Medical Center biobank (BioVU) genotyped by MEGA as a reference panel for LD estimation.²⁵,²⁶ Since the conditional analyses were restricted to local regions of the novel loci identified at genome-wide significance, we used 1 × 10⁻⁴ as the significance level (adjusting for ∼500 comparisons in each locus). If the lowest conditional p at a locus was below 1 × 10⁻⁴, the corresponding variant was considered an independent signal at that locus, and it was adjusted for, along with the lead variant, in the cross-ancestry meta-analyses of later iterations. This process was repeated until there were no variants with a cross-ancestry conditional p < 1 × 10⁻⁴.
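The 1 × 10⁻⁴ level corresponds to a Bonferroni adjustment of 0.05 for roughly 500 comparisons, and the other Bonferroni thresholds quoted in this paper follow the same arithmetic. A small sketch reproduces them; the helper name and dictionary labels are ours, while the comparison counts are taken from the text.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Numbers of comparisons as stated in the text.
comparisons = {
    "conditional analysis per locus (~500 variants)": 500,
    "ER heterogeneity across lead variants": 222,
    "ancestral heterogeneity across common lead variants": 194,
    "previously reported index variants": 245,
    "genes tested in the TWAS": 10820,
    "ER heterogeneity across TWAS-identified genes": 137,
}

for label, n_tests in comparisons.items():
    print(f"{label}: {bonferroni_threshold(n_tests):.2e}")
```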

Genetic variance explained by novel risk variants

We estimated the genetic variance explained by novel risk variants identified in this study using a log-additive model:

$$\sum_{i=1}^{n} 2 p_{i} (1 - p_{i}) (β_{i}^{2} - τ_{i}^{2})$$

where n is the total number of novel risk variants, $p_{i}$ is the MAF of the ith variant, $β_{i}$ is the log-OR for the ith variant, and $τ_{i}$ is the standard error of $β_{i}$. The explained genetic variance was estimated for overall breast cancer and by ER subtypes for Asian- and European-ancestry populations, respectively.
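The sum above can be evaluated directly from per-variant summary statistics. The sketch below is illustrative only: the function names and input values are hypothetical, and recovering the standard error from a reported 95% confidence interval assumes a Wald-type interval on the log-OR scale.

```python
import math


def log_or_and_se(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Recover the log-OR and its standard error from an OR and 95% CI.

    Assumes a symmetric Wald interval on the log scale (an assumption,
    not something stated in the text).
    """
    beta = math.log(odds_ratio)
    tau = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    return beta, tau


def explained_variance(variants):
    """Sum of 2 * p * (1 - p) * (beta^2 - tau^2) over variants.

    variants: iterable of (p, beta, tau) tuples, where p is the allele
    frequency, beta the log-OR, and tau the standard error of beta.
    """
    return sum(2.0 * p * (1.0 - p) * (beta ** 2 - tau ** 2)
               for p, beta, tau in variants)


if __name__ == "__main__":
    # Hypothetical variants: (frequency, OR, CI lower, CI upper).
    raw = [(0.30, 1.05, 1.03, 1.07), (0.12, 0.94, 0.92, 0.96)]
    variants = [(p, *log_or_and_se(o, lo, hi)) for p, o, lo, hi in raw]
    print(explained_variance(variants))
```

Subtracting $τ_{i}^{2}$ removes the part of the squared estimate that is expected from sampling error alone, so imprecisely estimated variants contribute less.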

Transcriptome-wide association analysis

We used RNA sequencing data from 115 samples collected from European-ancestry women in the Genotype-Tissue Expression Project (GTEx, version 8) to build prediction models for each gene expressed in normal breast tissue. Germline genotyping data were obtained using whole-genome sequencing (WGS) of genomic DNA extracted from blood samples. The details of data processing are described in the supplemental information. We used a cross-tissue approach, joint-tissue imputation (JTI), to build prediction models for gene-expression levels in normal breast tissue.²⁷ Besides breast tissue, data from all 31 other tissues were borrowed in the JTI approach to leverage shared genetic regulation and improve prediction performance in a tissue-dependent manner (Table S10). Prediction models were built using genetic variants within ±500 kb of the respective gene boundaries. Five-fold cross-validation was conducted to validate the models internally. Genes with a model prediction R > 0.1 were included in association analyses.

To evaluate the performance of prediction models, we performed an external validation using 86 tumor-adjacent normal breast tissue samples from European-ancestry females with breast cancer in The Cancer Genome Atlas (TCGA). We calculated the Spearman’s correlation between the prediction performance (R²) in GTEx and TCGA.

We conducted association analyses of predicted gene expression with breast cancer risk with the S-PrediXcan tool,²⁸ using the summary statistics from our ancestry-specific and cross-ancestry meta-analyses of GWASs for breast cancer. For genes that remained significant after Bonferroni correction in the association analyses, we also conducted TWAS fine-mapping analyses and colocalization analyses. Pathway analyses were conducted for protein-coding genes. The details of the statistical analyses are described in the supplemental information.
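To make the summary-statistics step concrete, the sketch below computes a gene-level z-score in the form published for S-PrediXcan, as we understand it: each variant's GWAS z-score is weighted by its prediction-model weight and by the ratio of the variant's standard deviation to that of the predicted expression, both taken from a reference panel. This is not the S-PrediXcan software itself, and the function names and inputs are hypothetical.

```python
import math


def gene_z_score(weights, gwas_z, cov):
    """Gene-level z-score from variant-level summary statistics.

    weights: prediction-model weight for each variant in the gene's model
    gwas_z:  GWAS z-score (beta / se) for each variant
    cov:     variant-by-variant genotype covariance matrix (list of lists)
             estimated in a reference panel
    """
    n = len(weights)
    # Variance of predicted expression: w' * Cov * w.
    var_g = sum(weights[i] * cov[i][j] * weights[j]
                for i in range(n) for j in range(n))
    if var_g <= 0:
        raise ValueError("predicted expression has non-positive variance")
    sigma_g = math.sqrt(var_g)
    return sum(weights[l] * math.sqrt(cov[l][l]) / sigma_g * gwas_z[l]
               for l in range(n))


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical two-variant model, for illustration only.
    z = gene_z_score([0.4, -0.2], [3.1, -2.5], [[0.42, 0.05], [0.05, 0.30]])
    print(z, two_sided_p(z))
```

Because only weights, GWAS z-scores, and a reference covariance matrix are needed, the same calculation can be applied to ancestry-specific and cross-ancestry summary statistics without individual-level data.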

Results

By cross-ancestry meta-analyzing GWAS data from 160,500 individuals with breast cancer and 226,196 controls of Asian and European ancestry using fixed-effects models, we identified 23,461 variants in 184 regions that were associated with overall breast cancer risk at genome-wide significance level (p < 5.00 × 10⁻⁸; Table S2). Twenty-seven additional risk loci were uncovered in population-specific analyses, including 25 loci identified in European-specific GWASs and two in Asian-specific GWASs. In total, we identified 211 loci showing a significant association with risk of overall breast cancer. Of them, 16 loci are novel, with the sentinel variants located at least 1 Mb away from any of the risk variants identified by previous GWASs (Table 1).

Table 1.

Results for the lead risk variants at 17 novel loci identified in cross-ancestry meta-analyses of GWAS data

Variants	Loci	Nearest gene	Gene region	Alleles^a	EAF^b	OR (95% CI)	p^c	I², %	p_het
Overall

rs727477	2p22.1	SLC8A1	Intron	G/T	0.36	0.97 (0.96, 0.98)	2.85 × 10⁻⁸	52.1	0.03
rs3010266	5q13.2	LINC02056	8.5 kb from 5′	A/G	0.24	0.96 (0.95, 0.98)	3.56 × 10⁻⁸	0	0.83
rs6890591^d	5q35.2	CPEB4	3.3 kb from 3′	A/T	0.38	0.97 (0.96, 0.98)	3.25 × 10⁻⁸	50.5	0.04
rs3829964	6p21.2	CDKN1A	Intron	T/C	0.47	0.97 (0.96, 0.98)	4.61 × 10⁻⁹	0	0.46
rs74392007	6q22.31	HSF2	5.4 kb from 5′	T/C	0.12	1.05 (1.03, 1.07)	1.55 × 10⁻⁸	0	0.93
rs3778663	6q27	AFDN	Intron	A/G	0.13	1.06 (1.04, 1.07)	8.51 × 10⁻⁹	0	0.69
rs17167576	7p21.2	AC005019.3^e	5.5 kb from 3′	A/T	0.37	1.03 (1.02, 1.04)	6.93 × 10⁻⁹	47.2	0.05
rs3988353	8p22	PCM1	Intron	CT/C	0.42	1.03 (1.02, 1.04)	4.32 × 10⁻⁸	0	0.81
rs1937680	10q21.1	PRKG1	Intron	C/A	0.36	1.03 (1.02, 1.04)	8.18 × 10⁻⁹	1.3	0.42
rs11354045	11q23.1	ALG9	Intron	CT/C	0.35	1.03 (1.02, 1.04)	2.68 × 10⁻⁸	22.3	0.25
rs36028244	11q23.3	PCSK7	Intron	C/CTTA	0.07	1.06 (1.04, 1.08)	1.77 × 10⁻⁸	0	1.00
rs3809114	12q13.3	INHBE	5′ UTR^f	G/A	0.47	0.97 (0.96, 0.98)	2.33 × 10⁻⁸	37.8	0.12
rs956006	15q22.2	TLN2	Intron	T/C	0.32	1.03 (1.02, 1.05)	3.54 × 10⁻⁸	1.7	0.42
rs4797754	18p11.21	LDLRAD4	Intron	G/C	0.31	1.03 (1.02, 1.05)	2.08 × 10⁻⁸	0	0.50
rs112208395	20q11.23	PHF20	Intron	C/CT	0.14	1.05 (1.03, 1.07)	4.11 × 10⁻⁸	0	0.96
rs74157632^g	10q26.11	DENND10	Missense	G/A	0.05	0.86 (0.81, 0.90)	1.41 × 10⁻⁸	0	1.00
ER-negative
rs2123844	17p13.2	ZZEF1	Intron	A/C	0.07	1.13 (1.09, 1.18)	2.81 × 10⁻¹⁰	37.4	0.16

^a Effect allele/reference allele.

^b Effect allele frequency.

^c Unless otherwise specified, p derived from meta-analyses using fixed-effects model.

^d Identified using cross-ancestry meta-regression (Table S6). The p derived from cross-ancestry fixed-effects model is $1.16 \times 10^{-7}$ (Table S2).

^e AC005019.3 (ENSG00000224330) does not have a gene symbol in HUGO yet.

^f UTR, untranslated region.

^g Identified in Asian-specific GWASs. The p for cross-ancestry fixed-effects model is $1.74 \times 10^{-7}$ (Table S2).

Analyses by ER status identified 13,392 variants in 100 loci and 2,425 variants in 34 loci that were associated with ER-positive and ER-negative breast cancer, respectively, at the genome-wide significance level (Tables S3 and S4). Two loci for ER-positive and nine loci for ER-negative breast cancer did not overlap with any of the loci identified for overall breast cancer. Of them, 17p13.2, associated with ER-negative breast cancer risk, has not yet been reported in previous GWASs (Table 1).

Of the 222 lead risk variants identified in our study that were associated with the risk of either overall breast cancer (n = 211) or exclusively ER-positive (n = 2) or ER-negative (n = 9) breast cancer, 68 variants showed a significantly different association by ER status at a false discovery rate (FDR) <0.05 in heterogeneity tests (Table S7). Among them, eight risk loci were not reported previously. Except for rs12335941 at 9p21.3, all other seven variants had a stronger association with ER-positive than ER-negative breast cancer. Of the 32 variants showing a different association at a Bonferroni-corrected p < 2.25 × 10⁻⁴ (0.05/222, Table 2), five lead variants showed an opposite direction of the association by ER status.

Table 2.

Results for breast cancer risk loci showing different associations by estrogen receptor status

Variants	Loci	Allele^a	EAF^b	ER-positive OR (95% CI)	ER-positive p	ER-negative OR (95% CI)	ER-negative p	p for ER heterogeneity
rs2506885	1p36.22	T/A	0.34	0.95 (0.94, 0.97)	5.91 × 10⁻¹⁰	0.88 (0.86, 0.90)	3.68 × 10⁻²⁷	2.63 × 10⁻⁸
rs11249433	1p11.2	G/A	0.39	1.13 (1.11, 1.15)	3.45 × 10⁻⁵⁹	1.01 (0.99, 1.04)	0.29	1.01 × 10⁻¹⁵
rs12129456	1q32.1	G/T	0.38	1.02 (1.00, 1.03)	0.03	0.92 (0.90, 0.94)	1.52 × 10⁻¹³	2.00 × 10⁻¹³
rs2169137	1q32.1	G/C	0.25	1.00 (0.98, 1.02)	0.9	1.13 (1.11, 1.16)	4.03 × 10⁻²⁴	2.30 × 10⁻¹⁷
rs56158184	2p23.2	C/T	0.09	1.03 (1.00, 1.05)	0.02	0.89 (0.86, 0.92)	1.01 × 10⁻⁹	1.60 × 10⁻¹⁰
rs2016394	2q31.1	A/G	0.44	0.94 (0.93, 0.96)	1.05 × 10⁻¹⁶	1.00 (0.98, 1.02)	0.91	2.51 × 10⁻⁶
rs4442975	2q35	G/T	0.46	1.15 (1.14, 1.17)	1.42 × 10⁻⁹²	1.05 (1.03, 1.07)	1.12 × 10⁻⁵	3.72 × 10⁻¹⁴
rs552647	3p24.1	A/C	0.48	1.12 (1.10, 1.14)	6.35 × 10⁻⁶⁰	1.05 (1.03, 1.07)	4.89 × 10⁻⁶	1.06 × 10⁻⁷
rs7697216	4q34.1	T/C	0.15	0.89 (0.87, 0.91)	1.17 × 10⁻³⁰	0.98 (0.96, 1.01)	0.24	1.49 × 10⁻⁸
rs2853669	5p15.33	G/A	0.31	0.96 (0.95, 0.97)	3.29 × 10⁻⁸	0.89 (0.87, 0.91)	3.03 × 10⁻²⁴	4.32 × 10⁻⁸
rs7710996	5p12	A/G	0.25	1.00 (0.98, 1.02)	0.97	1.07 (1.04, 1.09)	1.50 × 10⁻⁸	3.84 × 10⁻⁶
rs10941679	5p12	G/A	0.31	1.16 (1.14, 1.18)	5.38 × 10⁻⁸⁶	1.02 (1.00, 1.05)	0.04	1.45 × 10⁻²⁰
rs59957907	5q11.2	G/A	0.22	1.19 (1.17, 1.21)	2.95 × 10⁻⁹⁰	1.06 (1.04, 1.09)	2.09 × 10⁻⁶	2.46 × 10⁻¹³
rs60954078	6q25.1	G/A	0.17	1.16 (1.14, 1.19)	1.75 × 10⁻⁴¹	1.33 (1.29, 1.37)	6.92 × 10⁻⁷⁶	2.18 × 10⁻¹²
rs910416	6q25.1	C/T	0.46	0.95 (0.94, 0.96)	3.23 × 10⁻¹³	0.91 (0.89, 0.93)	1.08 × 10⁻²¹	1.02 × 10⁻⁴
rs116426014	8p23.3	G/A	0.26	1.03 (1.01, 1.04)	0.01	1.09 (1.06, 1.12)	1.83 × 10⁻¹⁰	1.68 × 10⁻⁴
rs60037937	9q31.2	T/TAA	0.26	1.10 (1.08, 1.11)	7.92 × 10⁻²⁸	1.03 (1.00, 1.05)	0.04	1.57 × 10⁻⁵
rs7862747	9q31.2	C/A	0.36	0.88 (0.87, 0.90)	1.89 × 10⁻⁵⁸	0.98 (0.96, 1.00)	0.05	4.49 × 10⁻¹³
rs7098100	10p12.31	A/G	0.34	1.07 (1.06, 1.09)	9.46 × 10⁻²¹	0.97 (0.95, 1.00)	0.02	1.42 × 10⁻¹²
rs9420318	10q26.12	A/G	0.33	0.94 (0.93, 0.95)	2.55 × 10⁻¹⁷	1.00 (0.98, 1.02)	0.74	6.53 × 10⁻⁶
rs2981579	10q26.13	A/G	0.41	1.32 (1.31, 1.34)	3.72 × 10⁻³⁵⁹	1.06 (1.04, 1.08)	4.23 × 10⁻⁸	5.37 × 10⁻⁷⁴
rs78540526	11q13.3	T/C	0.07	1.39 (1.35, 1.42)	3.11 × 10⁻¹³⁷	1.01 (0.97, 1.05)	0.73	1.67 × 10⁻³⁶
rs199504893	11q22.3	CA/C	0.41	1.02 (1.00, 1.03)	0.01	0.94 (0.92, 0.96)	3.31 × 10⁻⁹	1.56 × 10⁻¹⁰
rs1292011	12q24.21	G/A	0.39	0.90 (0.89, 0.92)	3.34 × 10⁻⁴⁷	0.97 (0.95, 0.99)	0	1.05 × 10⁻⁷
rs1744947	14q24.1	T/C	0.15	1.08 (1.06, 1.10)	8.58 × 10⁻¹⁴	1.00 (0.97, 1.03)	0.82	2.26 × 10⁻⁵
rs4784227	16q12.1	T/C	0.24	1.26 (1.25, 1.28)	1.03 × 10⁻²⁰²	1.15 (1.13, 1.18)	3.57 × 10⁻³⁶	3.21 × 10⁻¹¹
rs2123844	17p13.2	A/C	0.07	1.03 (1.00, 1.06)	0.03	1.13 (1.09, 1.18)	2.81 × 10⁻¹⁰	6.69 × 10⁻⁵
rs745983748	18q11.2	A/AAGTGTT	0.32	0.93 (0.91, 0.94)	6.12 × 10⁻²⁴	1.01 (0.99, 1.03)	0.44	3.07 × 10⁻¹⁰
rs4609972	19p13.11	C/G	0.48	1.00 (0.98, 1.01)	0.80	0.88 (0.86, 0.90)	6.13 × 10⁻³⁵	6.60 × 10⁻²⁴
rs34753522	20q12	C/T	0.35	0.96 (0.94, 0.97)	3.21 × 10⁻⁸	1.02 (1.00, 1.04)	0.1	8.07 × 10⁻⁶
rs2403907	21q21.1	A/C	0.29	0.91 (0.90, 0.93)	1.09 × 10⁻³²	0.97 (0.95, 1.00)	0.02	3.14 × 10⁻⁶
rs4822992	22q12.1	A/G	0.02	1.25 (1.19, 1.31)	7.16 × 10⁻¹⁹	1.00 (0.93, 1.09)	0.91	6.23 × 10⁻⁶

^a Effect allele/reference allele.

^b Effect allele frequency.

Of the 211 lead risk variants for overall breast cancer, 166 variants had a >25% difference in the effect allele frequency between Asian-ancestry and European-ancestry women (Figure S2). Seventeen lead variants, all identified from ancestry-specific GWASs, are rare (a MAF of <0.01) in one population but common in the other population. For nine of these lead variants, all variants included in their 95% credible sets were rare in one population but common in the other population (Table S2). Of the 194 common risk variants in both populations, 36 showed a significant difference in risk estimates between Asian- and European-ancestry populations at p < 0.05, including 31 lead variants with the entire credible sets showing ancestral heterogeneity in risk estimates (p < 0.05). Three variants showed ancestral heterogeneity with a p < 2.58 × 10⁻⁴, the significance level after adjusting for multiple comparisons (0.05/194) (Table S2). In particular, variant rs59957907 showed a highly significant ancestral difference in risk estimate with a p for heterogeneity of 1.27 × 10⁻¹⁰⁴. Overall, risks estimated in European-ancestry populations are larger than those estimated in Asian-ancestry populations with a regression beta coefficient of 0.579 derived from linear regression (Figure 1, Table S2). The ancestral difference observed in our study could be underestimated, as variants with similar risk estimates were more likely to be identified by cross-ancestry meta-analyses.

Figure 1.

Comparison of risk estimates for lead risk variants between Asian- and European-ancestry women

The red regression line shows the trend of risk estimates in both ancestry groups. To be conservative, the regression was performed excluding four variants with risk estimates >0.15 in European-ancestry women, which could be outliers or have high leverage. The black dashed diagonal line shows where risk estimates are the same in both ancestries.

Twenty-three previously reported index variants are not located in the regions identified at genome-wide significance in our meta-analyses. However, 16 of them were associated with breast cancer risk at p < 2.04 × 10⁻⁴, the significance level after Bonferroni correction for comparisons of 245 index variants. Of the remaining seven index risk variants, four were previously identified in a GWAS by breast cancer intrinsic subtypes⁶ (Table S8). Two index variants showed a nominally significant association with breast cancer in cross-ancestry and European-ancestry meta-analyses (p < 0.05). Only variant rs9348512 showed a null association with overall breast cancer risk (p = 0.505). The association with this variant was originally reported in a GWAS conducted among individuals with BRCA2 mutation²⁹ but was not replicated in subsequent studies.⁵,⁶

The sentinel variants at all 17 newly identified risk loci showed the same direction of association in both Asian- and European-ancestry populations (Tables S2 and S4). Except for the Asian-specific risk variant rs74157632, all other lead variants are common, with a MAF >0.01 in both populations. Significant ancestral heterogeneity was observed for rs6890591 (identified by meta-regression) and rs74157632 (identified as Asian specific). The estimated ORs for these 17 lead variants in the BCAC and ABCC studies are shown in Table S5. The proportion of variance explained by the 17 novel loci identified in our study was 1.15% for overall breast cancer, 1.07% for ER-positive breast cancer, and 1.03% for ER-negative breast cancer in Asian-ancestry populations. The corresponding proportions were 0.74%, 0.61%, and 1.03% in European-ancestry populations. The higher percentage of genetic variance explained by these new loci in Asian- than in European-ancestry populations was due to population differences in the risk estimates at the new loci. Of the 17 novel loci, one locus was specific to the Asian populations. For the remaining 16 loci, the effect size, as measured by the OR, was larger in Asian- than in European-ancestry populations for nine loci, including two loci showing a significant difference (p for heterogeneity <0.05). At only two loci was the OR for the lead variant larger in European- than in Asian-ancestry populations, and no significant heterogeneity was found at either locus. The Asian-specific lead variant rs74157632 (GenBank: NM_207009.4; c.658A>G; p.Asn220Asp) is a missense variant of the protein-coding gene DENND10, which has been shown to regulate the progression of epidermal growth factor receptor (EGFR) trafficking.³⁰ Eleven lead variants are located in the intronic regions of genes. Some of these genes have been reported to be involved in breast cancer cell migration and invasion (SLC8A1,³¹ CDKN1A,³² AFDN,³³ TLN2³⁴), resistance to radiotherapy (ALG9³⁵), and TGF-β (LDLRAD4³⁶) or p53 (PHF20³⁷) signaling pathways.

For each of the novel loci identified in this study, we performed conditional analyses of variants located within 500 kb of the lead variant to identify potential secondary association signals. These analyses adjusted for the lead variant and were run separately in Asian- and European-ancestry populations, and the results were then combined by meta-analyses. We found eight independent association signals (conditional p < 1.0 × 10⁻⁴) at six loci: 2p22.1, 6q22.31, 6q27, 8p22, 15q22.2, and 18p11.21 (Table S9). Two of these loci, 8p22 and 18p11.21, each harbored two additional independent association signals.

To identify putative breast cancer susceptibility genes, we conducted a transcriptome-wide association study (TWAS). We used whole-genome sequencing data generated from genomic DNA samples and RNA sequencing data generated from normal tissues obtained from 115 individuals included in the GTEx project (version 8) to build genetic models to predict gene expression across the transcriptome (Subjects and methods, Table S10). Of the 30,362 genes evaluated, models were successfully built for 17,127 genes, of which 10,820 genes could be predicted with R > 0.1. The performance of the models was evaluated using the tumor-adjacent normal breast tissue samples from TCGA. Overall, genes that were predicted with R > 0.1 in GTEx data were also predicted well in TCGA tumor-adjacent normal tissue data (correlation coefficient of 0.69; Figure S3).

Of the 10,820 genes evaluated using GWAS data from 160,500 individuals with breast cancer and 226,196 controls, we identified 137 genes associated with risk of breast cancer at the Bonferroni-corrected threshold of p < 4.62 × 10⁻⁶, including 76 protein-coding genes (Tables S11 and S18). Of them, 14 genes at 13 loci are located at least 1 Mb away from any of the previous GWAS-identified risk variants for breast cancer (Table 3), including 11 genes associated with overall breast cancer risk and three additional genes associated with ER-positive breast cancer. CPNE1 is located at a novel risk locus identified in our cross-ancestry meta-analyses. CPNE1 has been reported to be overexpressed in triple-negative breast cancer and to promote tumorigenesis and radio-resistance through the AKT signaling pathway.³⁸ In addition, we identified 87 genes (including 39 protein-coding genes) that are located in known risk loci but have not yet been reported in previous TWASs³⁹,⁴⁰,⁴¹,⁴² (Table S11).

Table 3.

Genes identified in TWASs in novel loci in association with breast cancer risk

Loci^a	Gene	Gene type	Z score	p	R²^b
Overall

1p11.2	NBPF8	Pseudogene	7.05	1.76 × 10⁻¹²	0.23
1p11.2	PFN1P2	Pseudogene	9.22	2.87 × 10⁻²⁰	0.22
3p21.31	RNF123	Protein coding	4.63	3.62 × 10⁻⁶	0.26
5p15.31	NSUN2	Protein coding	−4.89	1.01 × 10⁻⁶	0.37
10q26.13	EEF1AKMT2	Protein coding	−4.70	2.63 × 10⁻⁶	0.34
15q15.1	SRP14-DT	LincRNA	−4.80	1.55 × 10⁻⁶	0.29
15q15.3	STRCP1	Pseudogene	−4.66	3.18 × 10⁻⁶	0.12
17p12	MAP2K4	Protein coding	4.99	6.06 × 10⁻⁷	0.02
19q13.12	ZNF793-AS1	Antisense RNA	−4.94	7.64 × 10⁻⁷	0.10
20q11.22	CPNE1	Protein coding	−4.68	2.88 × 10⁻⁶	0.38
20q13.33^c	RGS19	Protein coding	4.64	3.47 × 10⁻⁶	0.07

ER-positive

6p22.1	H4C12	Protein coding	5.01	5.54 × 10⁻⁷	0.07
11q13.2	RHOD	Protein coding	4.78	1.73 × 10⁻⁶	0.19
5q13.2^c	GUSBP14	Pseudogene	5.08	3.73 × 10⁻⁷	0.08

^a Unless otherwise specified, results are based on TWAS analyses using cross-ancestry GWAS data.

^b Prediction performance derived using GTEx data.

^c Genes identified from association analysis using European-ancestry GWAS data.
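The Z scores and p values in Table 3 are related through the standard normal distribution. As an illustrative check, assuming two-sided tests, the sketch below recomputes p from Z for three rows of the table; because the published Z scores are rounded to two decimals, the recomputed values agree only approximately.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# (Z score, reported p) copied from Table 3.
table3_rows = {
    "NSUN2": (-4.89, 1.01e-6),
    "MAP2K4": (4.99, 6.06e-7),
    "CPNE1": (-4.68, 2.88e-6),
}

recomputed = {gene: two_sided_p(z) for gene, (z, _) in table3_rows.items()}
for gene, (z, reported) in table3_rows.items():
    print(f"{gene}: Z={z:+.2f} reported p={reported:.2e} "
          f"recomputed p={recomputed[gene]:.2e}")
```

The sign of Z indicates the direction of association (higher predicted expression linked to higher or lower risk) and does not affect the two-sided p value.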

Of the 137 genes identified by TWAS, 15 showed different associations with ER-positive and ER-negative breast cancer, with a p for heterogeneity < 3.65 × 10⁻⁴ (0.05/137; Tables 4 and S12). Of them, the protein-coding genes ABHD8 and ANKLE1 at 19p13.11 were associated exclusively with ER-negative breast cancer, and similar heterogeneity was also found for the lead variant rs4808616 at this risk locus. These findings are supported by a previous study that identified ABHD8 and ANKLE1 as potential target genes at the 19p13.11 risk locus.⁴³

Table 4.

TWAS-identified breast cancer risk genes showing a significantly different association by estrogen receptor status

Loci	Gene	Gene type	ER-positive Z score	ER-positive p	ER-negative Z score	ER-negative p	p for ER heterogeneity
1p11.2	SRGAP2C	Protein coding	−9.45	3.32 × 10⁻²¹	−1.47	0.14	6.99 × 10⁻⁵
1p11.2	H3P4	Pseudogene	8.89	6.05 × 10⁻¹⁹	1.10	0.27	1.72 × 10⁻⁴
1p11.2	RP11-343N15.2^a	LincRNA	−8.74	2.27 × 10⁻¹⁸	−1.00	0.32	3.35 × 10⁻⁵
1p11.2	EMBP1	Pseudogene	−8.38	5.23 × 10⁻¹⁷	−0.27	0.78	9.32 × 10⁻⁶
1p36.13	KLHDC7A	Protein coding	−7.10	1.27 × 10⁻¹²	0.10	0.92	5.79 × 10⁻⁶
1p36.22	DFFA	Protein coding	4.37	1.26 × 10⁻⁵	7.60	2.96 × 10⁻¹⁴	9.54 × 10⁻⁵
1q22	GBAP1	Pseudogene	−6.66	2.73 × 10⁻¹¹	0.59	0.56	2.54 × 10⁻⁵
1q22	THBS3	Protein coding	5.72	1.07 × 10⁻⁸	−0.89	0.38	8.72 × 10⁻⁵
1q32.1	PTPRVP	Pseudogene	−1.50	0.14	6.67	2.52 × 10⁻¹¹	1.36 × 10⁻¹⁰
2q35	TNP1	Protein coding	5.85	5.04 × 10⁻⁹	−0.37	0.71	5.44 × 10⁻⁵
5p12	MRPS30-DT	Antisense RNA	16.38	2.48 × 10⁻⁶⁰	−0.15	0.88	4.20 × 10⁻²¹
5q11.2	CTD-2310F14.1^a	Antisense RNA	14.50	1.17 × 10⁻⁴⁷	3.73	1.90 × 10⁻⁴	4.24 × 10⁻⁷
8p23.3	SEPT14P8	Pseudogene	−2.29	0.02	−6.00	1.98 × 10⁻⁹	2.53 × 10⁻⁴
19p13.11	ABHD8	Protein coding	−0.51	0.61	9.64	5.25 × 10⁻²²	2.39 × 10⁻¹⁵
19p13.11	ANKLE1	Protein coding	−0.24	0.81	6.74	1.62 × 10⁻¹¹	8.17 × 10⁻⁹

^a RP11-343N15.2 (ENSG00000231429) and CTD-2310F14.1 (ENSG00000271828) do not have gene symbols in HUGO yet.
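This excerpt does not restate how the p for ER heterogeneity was computed. One common choice for comparing two subtype-specific effect estimates is Cochran's Q test with one degree of freedom, sketched below as an assumption rather than as the authors' method. It needs effect sizes and standard errors, which Table 4 does not list, so the inputs here are hypothetical; the Z scores alone are not sufficient because the two subtype analyses differ in sample size.

```python
import math

def heterogeneity_test(beta_pos, se_pos, beta_neg, se_neg):
    """Cochran's Q for two estimates (1 degree of freedom).

    Returns (Q statistic, p-value). For two groups, Q reduces to the
    squared difference divided by the sum of the variances.
    """
    q = (beta_pos - beta_neg) ** 2 / (se_pos ** 2 + se_neg ** 2)
    p = math.erfc(math.sqrt(q / 2.0))  # chi-square survival function, 1 df
    return q, p

# Bonferroni threshold for 137 TWAS-identified genes, as in the text.
het_threshold = 0.05 / 137
print(f"heterogeneity threshold: {het_threshold:.2e}")

# Hypothetical effect estimates for one gene (not taken from the paper).
q_stat, het_p = heterogeneity_test(
    beta_pos=0.02, se_pos=0.03,  # ER-positive
    beta_neg=0.30, se_neg=0.05,  # ER-negative
)
print(f"Q={q_stat:.2f} p={het_p:.2e}")
print("heterogeneous" if het_p < het_threshold else "not heterogeneous")
```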

In addition, 16 genes showed a significantly different association between Asian- and European-ancestry women at the Bonferroni-corrected threshold of p for heterogeneity < 3.65 × 10⁻⁴, including seven protein-coding genes (Table S13). Of them, CASP8 and ALS2CR12 at 2q33.1 and HLA-F at 6p22.1 showed a stronger association with breast cancer risk in Asian-ancestry women than in European-ancestry women. The CASP8 gene plays a central role in extrinsic apoptosis⁴⁴ and has been reported to be associated with breast cancer risk in previous TWASs among European-ancestry women.³⁹,⁴⁰,⁴¹,⁴²

To identify the most likely target genes at loci where multiple genes were associated with breast cancer risk in TWASs, we performed fine-mapping analyses using FOCUS.⁴⁵ In total, we identified 69 genes showing significant posterior inclusion probability and included them in the credible target gene sets (Table S14). In addition, colocalization analyses using COLOC⁴⁶ identified 50 genes for which GWAS and eQTL signals colocalized (Table S15), including 28 genes that were also in the credible target gene sets from TWAS fine-mapping analyses.

We performed pathway analyses to identify biological pathways that may play a role in breast cancer etiology. Of the 137 genes identified in our TWASs in association with breast cancer risk, 76 are protein-coding genes, located in 53 genomic regions. In 47 regions, we were able to identify 53 genes as putative target genes with supporting evidence from fine-mapping analyses only (n = 25), colocalization analyses only (n = 10), or both (n = 18). Additionally, for the remaining 152 loci, in which no target genes were identified in TWASs, we selected 89 protein-coding genes previously reported as putative target genes⁴⁷ and 79 protein-coding genes located near the lead variants identified in our GWAS. In total, 221 putative risk genes for breast cancer were included in our pathway analysis (supplemental methods and Table S16). We identified multiple signaling pathways that were significantly associated with breast cancer risk at FDR < 0.05, including the p53, cGMP-PKG, TNF, and MAPK signaling pathways, as well as pathways of DNA-binding transcription activator activity and cell cycle phase transition⁴⁸,⁴⁹,⁵⁰ (Table S17).
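The FDR < 0.05 criterion can be illustrated with the Benjamini-Hochberg procedure, a widely used way to control the false discovery rate. The excerpt does not say which FDR method was applied, so this is a sketch under that assumption; the pathway labels and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical enrichment p-values for five pathways.
pathway_p = {
    "pathway_1": 0.0002,
    "pathway_2": 0.004,
    "pathway_3": 0.04,
    "pathway_4": 0.2,
    "pathway_5": 0.6,
}

names = list(pathway_p)
fdr = dict(zip(names, benjamini_hochberg([pathway_p[n] for n in names])))
significant = [n for n in names if fdr[n] < 0.05]
print(fdr)
print("pathways at FDR < 0.05:", significant)
```

In this toy input, the third pathway has a nominal p below 0.05 but does not pass after adjustment, which is the behavior the FDR threshold is meant to enforce.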

Discussion

We conducted a large GWAS and TWAS of breast cancer, including 386,696 women of Asian and European ancestry. In total, 222 genetic risk loci and 137 genes were identified by GWAS and TWAS, respectively, in association with breast cancer risk after adjusting for multiple comparisons.

Our pathway analyses identified multiple biological pathways that have been implicated in the development of breast and other cancers. For example, CACNA1A, DUSP4, FGFR2, MAP2K4, MAP3K1, MYC, NF1, PLA2G6, TAB2, TGFBR2, and TP53 are involved in the mitogen-activated protein kinase (MAPK) signaling pathway.⁴⁸,⁵¹ ATG10, CDKAL1, KLF4, MAF8, and MAP3K1 are regulated by the activation of KRAS.⁵¹ KRAS is a proto-oncogene from the RAS family and a part of the RAS/MAPK pathway. Although the RAS signaling pathway is commonly activated in breast cancer, somatic mutations of RAS are not common in individuals with breast cancer.⁵² Our findings indicate that germline alterations of genes involved in the RAS signaling pathway could play a role in the development and progression of breast cancer.

Although the p53 pathway is often altered in breast cancer tissues, particularly those from ER-negative and triple-negative cancer, germline mutations of TP53 are detected in fewer than 1% of individuals with breast cancer.⁵³ In this study, we found that 15 genes (CASP8, CCND1, CCNE1, CDKN1A, CHEK2, MDM4, INHBB, KLF4, MXD1, PHLDA3, PIDD1, TNNI1, TP53, ZFP36L1, ZNF365) are involved in the p53 signaling pathway,⁴⁸,⁵¹ providing support that germline alterations of this pathway could play a more significant etiologic role than is appreciated from analyzing TP53 alone. Intriguingly, MDM4 and CCNE1 are located at risk loci with a stronger association with ER-negative than ER-positive breast cancer. Our TWAS also found that the expression of MDM4 was exclusively associated with an increased risk of ER-negative breast cancer. These findings suggest that the p53 signaling pathway plays an important role in the risk of breast cancer, especially ER-negative breast cancer.

By increasing the sample size and incorporating transcriptome data, we were able to identify 30 novel associations in loci and genes that are located >1 Mb away from any of the previously reported breast cancer risk variants. The discovery of these novel associations further expands our understanding of the genetic and biological mechanisms of breast cancer development. For example, the lead variant at the novel risk locus 6p21.2 is located in an intronic region of CDKN1A. CDKN1A regulates cell-cycle progression as a cyclin-dependent kinase inhibitor³² and plays an important role in both the PI3K/AKT signaling pathway and the p53 pathway.⁵¹

MAP2K4 at 17p12 is a novel target gene identified by our TWAS. This gene encodes a member of the mitogen-activated protein kinase kinase family and is involved in multiple signaling pathways, including the MAPK pathway, EGF pathway, FAS signaling pathway,⁵¹ and PI3K/AKT signaling pathway.⁵⁴ In addition, our TWAS identified 39 protein-coding genes that are located in known risk loci but have not been reported in previous TWASs. Of them, MDM4, PLA2G6, and RIT1 are involved in the p53 pathway, RAS/MAPK pathway, and PI3K/AKT pathway, respectively. These newly identified putative breast cancer risk genes could be potential targets for therapies.

Given the much larger sample size for GWASs conducted in European descendants compared to those conducted in East Asians, many of the associations were driven by data from European-ancestry GWASs. Increasing the sample size for GWASs of non-European populations will be valuable to fully uncover the genetic basis for breast cancer. In our TWAS, we built gene prediction models using European-ancestry samples from GTEx. Given the difference in genetic architectures between Asian and European descendants, some of these models may not perform well in TWASs in Asian populations, affecting the detection of significant association signals, particularly in regions where significant ancestral differences exist. Using Asian-specific gene prediction models in future studies should help to identify additional genes associated with breast cancer risk.

In summary, in this large GWAS and TWAS for breast cancer, we uncovered a large number of genetic variants associated with breast cancer risk and identified potential target genes for this common cancer. We discovered significant differences by ER status and ancestry in the associations of many of these variants and genes with breast cancer risk. We identified multiple signaling pathways that may play an etiologic role in breast cancer and propose that germline alterations in the p53, RAS, and MAPK pathways may play a more significant role in the etiology of breast cancer than is currently appreciated. Our study provides substantial insights into the genetics and biology of breast cancer.

Acknowledgments

The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agents. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This research was supported in part by U.S. National Institutes of Health grants R01CA235553, R01CA202981, R01CA124558, R01CA148667, R01CA158473, R01CA064277, R37CA070867, and UM1CA182910 (to W.Z.), R01CA118229 and R01CA092585 (to X.-O.S.), R01CA122756 (to Q.C.), and R01CA137013 (to J. Long); Department of Defense Idea Awards BC011118 (to X.-O.S.) and BC050791 (to Q.C.); and Ingram Professor and Anne Potter Wilson Chair and Research Reward funds (to W.Z.). Sample preparation and genotyping assays at Vanderbilt were conducted at the Survey and Biospecimen Shared Resources and Vanderbilt Microarray Shared Resource, which are supported in part by the Vanderbilt-Ingram Cancer Center (P30CA068485). Data analyses were conducted using the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University. Additional information is provided in the supplemental information.

Declaration of interests

The authors declare no competing interests.

Published: November 9, 2022

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.ajhg.2022.10.011.

Supplemental information

Document S1. Figures S1–S3 and supplemental methods

mmc1.pdf (542.1 KB, pdf)

Data S1. Tables S1–S18

mmc2.xlsx (336.4 KB, xlsx)

Document S2. Article plus supplemental information

mmc3.pdf (987.5 KB, pdf)

Data and code availability

Access to the ABCC data can be requested by submission of an inquiry to Dr. Wei Zheng (wei.zheng@vanderbilt.edu). Request for access to the BCAC data can be submitted directly to BCAC (http://bcac.ccge.medschl.cam.ac.uk/). All GTEx data are publicly available through dbGaP: phs000424.v8.p2. TCGA data are publicly available through National Cancer Institute’s Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Access to the custom code: https://github.com/pingjie/EURASN_GWAS/.

References

1. Sung H., Ferlay J., Siegel R.L., Laversanne M., Soerjomataram I., Jemal A., Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA A Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
2. Hu C., Hart S.N., Gnanaolivu R., Huang H., Lee K.Y., Na J., Gao C., Lilyquist J., Yadav S., Boddicker N.J., et al. A population-based study of genes previously implicated in breast cancer. N. Engl. J. Med. 2021;384:440–451. doi: 10.1056/NEJMoa2005936.
3. Breast Cancer Association Consortium, Dorling L., Carvalho S., Allen J., González-Neira A., Luccarini C., Wahlström C., Pooley K.A., Parsons M.T., Fortuno C., et al. Breast cancer risk genes - association analysis in more than 113,000 women. N. Engl. J. Med. 2021;384:428–439. doi: 10.1056/NEJMoa1913948.
4. Narod S.A. Which genes for hereditary breast cancer? N. Engl. J. Med. 2021;384:471–473. doi: 10.1056/NEJMe2035083.
5. Michailidou K., Lindström S., Dennis J., Beesley J., Hui S., Kar S., Lemaçon A., Soucy P., Glubb D., Rostamianfar A., et al. Association analysis identifies 65 new breast cancer risk loci. Nature. 2017;551:92–94. doi: 10.1038/nature24284.
6. Zhang H., Ahearn T.U., Lecarpentier J., Barnes D., Beesley J., Qi G., Jiang X., O’Mara T.A., Zhao N., Bolla M.K., et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 2020;52:572–581. doi: 10.1038/s41588-020-0609-2.
7. Shu X., Long J., Cai Q., Kweon S.-S., Choi J.-Y., Kubo M., Park S.K., Bolla M.K., Dennis J., Wang Q., et al. Identification of novel breast cancer susceptibility loci in meta-analyses conducted among Asian and European descendants. Nat. Commun. 2020;11:1217. doi: 10.1038/s41467-020-15046-w.
8. Martin A.R., Kanai M., Kamatani Y., Okada Y., Neale B.M., Daly M.J. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x.
9. Zheng W., Long J., Gao Y.-T., Li C., Zheng Y., Xiang Y.-B., Wen W., Levy S., Deming S.L., Haines J.L., et al. Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat. Genet. 2009;41:324–328. doi: 10.1038/ng.318.
10. Cai Q., Zhang B., Sung H., Low S.-K., Kweon S.-S., Lu W., Shi J., Long J., Wen W., Choi J.-Y., et al. Genome-wide association analysis in East Asians identifies breast cancer susceptibility loci at 1q32.1, 5q14.3 and 15q26.1. Nat. Genet. 2014;46:886–890. doi: 10.1038/ng.3041.
11. Zheng W., Zhang B., Cai Q., Sung H., Michailidou K., Shi J., Choi J.-Y., Long J., Dennis J., Humphreys M.K., et al. Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls. Hum. Mol. Genet. 2013;22:2539–2550. doi: 10.1093/hmg/ddt089.
12. Yang Y., Tao R., Shu X., Cai Q., Wen W., Gu K., Gao Y.-T., Zheng Y., Kweon S.-S., Shin M.-H., et al. Incorporating polygenic risk scores and nongenetic risk factors for breast cancer risk prediction among Asian women. JAMA Netw. Open. 2022;5:e2149030. doi: 10.1001/jamanetworkopen.2021.49030.
13. Ishigaki K., Akiyama M., Kanai M., Takahashi A., Kawakami E., Sugishita H., Sakaue S., Matoba N., Low S.-K., Okada Y., et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 2020;52:669–679. doi: 10.1038/s41588-020-0640-3.
14. Michailidou K., Beesley J., Lindstrom S., Canisius S., Dennis J., Lush M.J., Maranian M.J., Bolla M.K., Wang Q., Shah M., et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat. Genet. 2015;47:373–380. doi: 10.1038/ng.3242.
15. Cai Q., Long J., Lu W., Qu S., Wen W., Kang D., Lee J.-Y., Chen K., Shen H., Shen C.-Y., et al. Genome-wide association study identifies breast cancer risk variant at 10q21.2: results from the Asia Breast Cancer Consortium. Hum. Mol. Genet. 2011;20:4991–4999. doi: 10.1093/hmg/ddr405.
16. Long J., Cai Q., Sung H., Shi J., Zhang B., Choi J.-Y., Wen W., Delahanty R.J., Lu W., Gao Y.-T., et al. Genome-wide association study in east Asians identifies novel susceptibility loci for breast cancer. PLoS Genet. 2012;8:e1002532. doi: 10.1371/journal.pgen.1002532.
17. Han M.-R., Long J., Choi J.-Y., Low S.-K., Kweon S.-S., Zheng Y., Cai Q., Shi J., Guo X., Matsuo K., et al. Genome-wide association study in East Asians identifies two novel breast cancer susceptibility loci. Hum. Mol. Genet. 2016;25:3361–3371. doi: 10.1093/hmg/ddw164.
18. Zhang Y., Long J., Lu W., Shu X.-O., Cai Q., Zheng Y., Li C., Li B., Gao Y.-T., Zheng W. Rare coding variants and breast cancer risk: evaluation of susceptibility loci identified in genome-wide association studies. Cancer Epidemiol. Biomarkers Prev. 2014;23:622–628. doi: 10.1158/1055-9965.EPI-13-1043.
19. Kim H.C., Lee J.-Y., Sung H., Choi J.-Y., Park S.K., Lee K.-M., Kim Y.J., Go M.J., Li L., Cho Y.S., et al. A genome-wide association study identifies a breast cancer risk variant in ERBB4 at 2q34: results from the Seoul breast cancer study. Breast Cancer Res. 2012;14:R56. doi: 10.1186/bcr3158.
20. Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
21. Willer C.J., Li Y., Abecasis G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
22. Mägi R., Horikoshi M., Sofer T., Mahajan A., Kitajima H., Franceschini N., McCarthy M.I., COGENT-Kidney Consortium, T2D-GENES Consortium, Morris A.P. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 2017;26:3639–3650. doi: 10.1093/hmg/ddx280.
23. Wang G., Sarkar A., Carbonetto P., Stephens M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. Roy. Stat. Soc. B. 2020;82:1273–1300. doi: 10.1111/rssb.12388.
24. Buniello A., MacArthur J.A.L., Cerezo M., Harris L.W., Hayhurst J., Malangone C., McMahon A., Morales J., Mountjoy E., Sollis E., et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. doi: 10.1093/nar/gky1120.
25. Roden D.M., Pulley J.M., Basford M.A., Bernard G.R., Clayton E.W., Balser J.R., Masys D.R. Development of a large-scale De-identified DNA biobank to enable personalized medicine. Clin. Pharmacol. Ther. 2008;84:362–369. doi: 10.1038/clpt.2008.89.
26. Kasimatis K.R., Abraham A., Ralph P.L., Kern A.D., Capra J.A., Phillips P.C. Evaluating human autosomal loci for sexually antagonistic viability selection in two large biobanks. Genetics. 2021;217:1–10. doi: 10.1093/genetics/iyaa015.
27. Zhou D., Jiang Y., Zhong X., Cox N.J., Liu C., Gamazon E.R. A unified framework for joint-tissue transcriptome-wide association and Mendelian randomization analysis. Nat. Genet. 2020;52:1239–1246. doi: 10.1038/s41588-020-0706-2.
28. Barbeira A.N., Dickinson S.P., Bonazzola R., Zheng J., Wheeler H.E., Torres J.M., Torstenson E.S., Shah K.P., Garcia T., Edwards T.L., et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 2018;9:1825. doi: 10.1038/s41467-018-03621-1.
29. Gaudet M.M., Kuchenbaecker K.B., Vijai J., Klein R.J., Kirchhoff T., McGuffog L., Barrowdale D., Dunning A.M., Lee A., Dennis J., et al. Identification of a BRCA2-specific modifier locus at 6p24 related to breast cancer risk. PLoS Genet. 2013;9:e1003173. doi: 10.1371/journal.pgen.1003173.
30. Zhang J., Zhang K., Qi L., Hu Q., Shen Z., Liu B., Deng J., Zhang C., Zhang Y. DENN domain-containing protein FAM45A regulates the homeostasis of late/multivesicular endosomes. Biochim. Biophys. Acta Mol. Cell Res. 2019;1866:916–929. doi: 10.1016/j.bbamcr.2019.02.006.
31. Zhu Q., Zhang X., Zai H.-Y., Jiang W., Zhang K.-J., He Y.-Q., Hu Y. circSLC8A1 sponges miR-671 to regulate breast cancer tumorigenesis via PTEN/PI3k/Akt pathway. Genomics. 2021;113:398–410. doi: 10.1016/j.ygeno.2020.12.006.
32. Zaremba-Czogalla M., Hryniewicz-Jankowska A., Tabola R., Nienartowicz M., Stach K., Wierzbicki J., Cirocchi R., Ziolkowski P., Tabaczar S., Augoff K. A novel regulatory function of CDKN1A/p21 in TNFα-induced matrix metalloproteinase 9-dependent migration and invasion of triple-negative breast cancer cells. Cell. Signal. 2018;47:27–36. doi: 10.1016/j.cellsig.2018.03.010.
33. Fournier G., Cabaud O., Josselin E., Chaix A., Adélaïde J., Isnardon D., Restouin A., Castellano R., Dubreuil P., Chaffanet M., et al. Loss of AF6/afadin, a marker of poor outcome in breast cancer, induces cell migration, invasiveness and tumor growth. Oncogene. 2011;30:3862–3874. doi: 10.1038/onc.2011.106.
34. Li L., Li X., Qi L., Rychahou P., Jafari N., Huang C. The role of talin2 in breast cancer tumorigenesis and metastasis. Oncotarget. 2017;8:106876–106887. doi: 10.18632/oncotarget.22449.
35. Sun X., He Z., Guo L., Wang C., Lin C., Ye L., Wang X., Li Y., Yang M., Liu S., et al. ALG3 contributes to stemness and radioresistance through regulating glycosylation of TGF-β receptor II in breast cancer. J. Exp. Clin. Cancer Res. 2021;40:149. doi: 10.1186/s13046-021-01932-8.
36. Nakano N., Maeyama K., Sakata N., Itoh F., Akatsu R., Nakata M., Katsu Y., Ikeno S., Togawa Y., Vo Nguyen T.T., et al. C18 ORF1, a novel negative regulator of transforming growth factor-β signaling. J. Biol. Chem. 2014;289:12680–12692. doi: 10.1074/jbc.M114.558981.
37. Cui G., Park S., Badeaux A.I., Kim D., Lee J., Thompson J.R., Yan F., Kaneko S., Yuan Z., Botuyan M.V., et al. PHF20 is an effector protein of p53 double lysine methylation that stabilizes and activates p53. Nat. Struct. Mol. Biol. 2012;19:916–924. doi: 10.1038/nsmb.2353.
38. Shao Z., Ma X., Zhang Y., Sun Y., Lv W., He K., Xia R., Wang P., Gao X. CPNE1 predicts poor prognosis and promotes tumorigenesis and radioresistance via the AKT singling pathway in triple-negative breast cancer. Mol. Carcinog. 2020;59:533–544. doi: 10.1002/mc.23177.
39. Wu L., Shi W., Long J., Guo X., Michailidou K., Beesley J., Bolla M.K., Shu X.-O., Lu Y., Cai Q., et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat. Genet. 2018;50:968–978. doi: 10.1038/s41588-018-0132-x.
40. Ferreira M.A., Gamazon E.R., Al-Ejeh F., Aittomäki K., Andrulis I.L., Anton-Culver H., Arason A., Arndt V., Aronson K.J., Arun B.K., et al. Genome-wide association and transcriptome studies identify target genes and risk loci for breast cancer. Nat. Commun. 2019;10:1741. doi: 10.1038/s41467-018-08053-5.
41. Lawrenson K., Kar S., McCue K., Kuchenbaeker K., Michailidou K., Tyrer J., Beesley J., Ramus S.J., Li Q., Delgado M.K., et al. Functional mechanisms underlying pleiotropic risk alleles at the 19p13.1 breast-ovarian cancer susceptibility locus. Nat. Commun. 2016;7:12675. doi: 10.1038/ncomms12675.
42. Fritsch M., Günther S.D., Schwarzer R., Albert M.-C., Schorn F., Werthenbach J.P., Schiffmann L.M., Stair N., Stocks H., Seeger J.M., et al. Caspase-8 is the molecular switch for apoptosis, necroptosis and pyroptosis. Nature. 2019;575:683–687. doi: 10.1038/s41586-019-1770-6.
43. Wen W., Chen Z., Bao J., Long Q., Shu X.-O., Zheng W., Guo X. Genetic variations of DNA bindings of FOXA1 and co-factors in breast cancer susceptibility. Nat. Commun. 2021;12:5318. doi: 10.1038/s41467-021-25670-9.
44. Feng H., Gusev A., Pasaniuc B., Wu L., Long J., Abu-Full Z., Aittomäki K., Andrulis I.L., Anton-Culver H., Antoniou A.C., et al. Transcriptome-wide association study of breast cancer risk by estrogen-receptor status. Genet. Epidemiol. 2020;44:442–468. doi: 10.1002/gepi.22288.
45. Mancuso N., Freund M.K., Johnson R., Shi H., Kichaev G., Gusev A., Pasaniuc B. Probabilistic fine-mapping of transcriptome-wide association studies. Nat. Genet. 2019;51:675–682. doi: 10.1038/s41588-019-0367-1.
46. Giambartolomei C., Vukcevic D., Schadt E.E., Franke L., Hingorani A.D., Wallace C., Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383.
47. Fachal L., Aschard H., Beesley J., Barnes D.R., Allen J., Kar S., Pooley K.A., Dennis J., Michailidou K., Turman C., et al. Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes. Nat. Genet. 2020;52:56–73. doi: 10.1038/s41588-019-0537-1.
48. Kanehisa M., Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
49. Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325–D334. doi: 10.1093/nar/gkaa1113.
50. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
51. Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004.
52. Galiè M. RAS as supporting actor in breast cancer. Front. Oncol. 2019;9:1199. doi: 10.3389/fonc.2019.01199.
53. Schon K., Tischkowitz M. Clinical implications of germline mutations in breast cancer: TP53. Breast Cancer Res. Treat. 2018;167:417–423. doi: 10.1007/s10549-017-4531-y.
54. Liu S., Huang J., Zhang Y., Liu Y., Zuo S., Li R. MAP2K4 interacts with Vimentin to activate the PI3K/AKT pathway and promotes breast cancer pathogenesis. Aging (Albany NY) 2019;11:10697–10710. doi: 10.18632/aging.102485.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Document S1. Figures S1–S3 and supplemental methods

mmc1.pdf^{(542.1KB, pdf)}

Data S1. Tables S1–S18

mmc2.xlsx^{(336.4KB, xlsx)}

Document S2. Article plus supplemental information

mmc3.pdf^{(987.5KB, pdf)}

Data Availability Statement

[bib1] 1.Sung H., Ferlay J., Siegel R.L., Laversanne M., Soerjomataram I., Jemal A., Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA A Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660. [DOI] [PubMed] [Google Scholar]

[bib2] 2.Hu C., Hart S.N., Gnanaolivu R., Huang H., Lee K.Y., Na J., Gao C., Lilyquist J., Yadav S., Boddicker N.J., et al. A population-based study of genes previously implicated in breast cancer. N. Engl. J. Med. 2021;384:440–451. doi: 10.1056/NEJMoa2005936. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib3] 3.Breast Cancer Association Consortium. Dorling L., Carvalho S., Allen J., González-Neira A., Luccarini C., Wahlström C., Pooley K.A., Parsons M.T., Fortuno C., et al. Breast cancer risk genes - association analysis in more than 113, 000 women. N. Engl. J. Med. 2021;384:428–439. doi: 10.1056/NEJMoa1913948. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib4] 4.Narod S.A. Which genes for hereditary breast cancer? N. Engl. J. Med. 2021;384:471–473. doi: 10.1056/NEJMe2035083. [DOI] [PubMed] [Google Scholar]

[bib5] 5.Michailidou K., Lindström S., Dennis J., Beesley J., Hui S., Kar S., Lemaçon A., Soucy P., Glubb D., Rostamianfar A., et al. Association analysis identifies 65 new breast cancer risk loci. Nature. 2017;551:92–94. doi: 10.1038/nature24284. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib6] 6.Zhang H., Ahearn T.U., Lecarpentier J., Barnes D., Beesley J., Qi G., Jiang X., O’Mara T.A., Zhao N., Bolla M.K., et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 2020;52:572–581. doi: 10.1038/s41588-020-0609-2. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib7] 7.Shu X., Long J., Cai Q., Kweon S.-S., Choi J.-Y., Kubo M., Park S.K., Bolla M.K., Dennis J., Wang Q., et al. Identification of novel breast cancer susceptibility loci in meta-analyses conducted among Asian and European descendants. Nat. Commun. 2020;11:1217. doi: 10.1038/s41467-020-15046-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib8] 8.Martin A.R., Kanai M., Kamatani Y., Okada Y., Neale B.M., Daly M.J. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib9] 9.Zheng W., Long J., Gao Y.-T., Li C., Zheng Y., Xiang Y.-B., Wen W., Levy S., Deming S.L., Haines J.L., et al. Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat. Genet. 2009;41:324–328. doi: 10.1038/ng.318. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib10] 10. Cai Q., Zhang B., Sung H., Low S.-K., Kweon S.-S., Lu W., Shi J., Long J., Wen W., Choi J.-Y., et al. Genome-wide association analysis in East Asians identifies breast cancer susceptibility loci at 1q32.1, 5q14.3 and 15q26.1. Nat. Genet. 2014;46:886–890. doi: 10.1038/ng.3041.

[bib11] 11. Zheng W., Zhang B., Cai Q., Sung H., Michailidou K., Shi J., Choi J.-Y., Long J., Dennis J., Humphreys M.K., et al. Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls. Hum. Mol. Genet. 2013;22:2539–2550. doi: 10.1093/hmg/ddt089.

[bib12] 12. Yang Y., Tao R., Shu X., Cai Q., Wen W., Gu K., Gao Y.-T., Zheng Y., Kweon S.-S., Shin M.-H., et al. Incorporating polygenic risk scores and nongenetic risk factors for breast cancer risk prediction among Asian women. JAMA Netw. Open. 2022;5:e2149030. doi: 10.1001/jamanetworkopen.2021.49030.

[bib13] 13. Ishigaki K., Akiyama M., Kanai M., Takahashi A., Kawakami E., Sugishita H., Sakaue S., Matoba N., Low S.-K., Okada Y., et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 2020;52:669–679. doi: 10.1038/s41588-020-0640-3.

[bib14] 14. Michailidou K., Beesley J., Lindstrom S., Canisius S., Dennis J., Lush M.J., Maranian M.J., Bolla M.K., Wang Q., Shah M., et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat. Genet. 2015;47:373–380. doi: 10.1038/ng.3242.

[bib15] 15. Cai Q., Long J., Lu W., Qu S., Wen W., Kang D., Lee J.-Y., Chen K., Shen H., Shen C.-Y., et al. Genome-wide association study identifies breast cancer risk variant at 10q21.2: results from the Asia Breast Cancer Consortium. Hum. Mol. Genet. 2011;20:4991–4999. doi: 10.1093/hmg/ddr405.

[bib16] 16. Long J., Cai Q., Sung H., Shi J., Zhang B., Choi J.-Y., Wen W., Delahanty R.J., Lu W., Gao Y.-T., et al. Genome-wide association study in East Asians identifies novel susceptibility loci for breast cancer. PLoS Genet. 2012;8:e1002532. doi: 10.1371/journal.pgen.1002532.

[bib17] 17. Han M.-R., Long J., Choi J.-Y., Low S.-K., Kweon S.-S., Zheng Y., Cai Q., Shi J., Guo X., Matsuo K., et al. Genome-wide association study in East Asians identifies two novel breast cancer susceptibility loci. Hum. Mol. Genet. 2016;25:3361–3371. doi: 10.1093/hmg/ddw164.

[bib18] 18. Zhang Y., Long J., Lu W., Shu X.-O., Cai Q., Zheng Y., Li C., Li B., Gao Y.-T., Zheng W. Rare coding variants and breast cancer risk: evaluation of susceptibility loci identified in genome-wide association studies. Cancer Epidemiol. Biomarkers Prev. 2014;23:622–628. doi: 10.1158/1055-9965.EPI-13-1043.

[bib19] 19. Kim H.-C., Lee J.-Y., Sung H., Choi J.-Y., Park S.K., Lee K.-M., Kim Y.J., Go M.J., Li L., Cho Y.S., et al. A genome-wide association study identifies a breast cancer risk variant in ERBB4 at 2q34: results from the Seoul Breast Cancer Study. Breast Cancer Res. 2012;14:R56. doi: 10.1186/bcr3158.

[bib20] 20. Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.

[bib21] 21. Willer C.J., Li Y., Abecasis G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.

[bib22] 22. Mägi R., Horikoshi M., Sofer T., Mahajan A., Kitajima H., Franceschini N., McCarthy M.I., COGENT-Kidney Consortium, T2D-GENES Consortium, Morris A.P. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 2017;26:3639–3650. doi: 10.1093/hmg/ddx280.

[bib23] 23. Wang G., Sarkar A., Carbonetto P., Stephens M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. Roy. Stat. Soc. B. 2020;82:1273–1300. doi: 10.1111/rssb.12388.

[bib24] 24. Buniello A., MacArthur J.A.L., Cerezo M., Harris L.W., Hayhurst J., Malangone C., McMahon A., Morales J., Mountjoy E., Sollis E., et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. doi: 10.1093/nar/gky1120.

[bib25] 25. Roden D.M., Pulley J.M., Basford M.A., Bernard G.R., Clayton E.W., Balser J.R., Masys D.R. Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin. Pharmacol. Ther. 2008;84:362–369. doi: 10.1038/clpt.2008.89.

[bib26] 26. Kasimatis K.R., Abraham A., Ralph P.L., Kern A.D., Capra J.A., Phillips P.C. Evaluating human autosomal loci for sexually antagonistic viability selection in two large biobanks. Genetics. 2021;217:1–10. doi: 10.1093/genetics/iyaa015.

[bib27] 27. Zhou D., Jiang Y., Zhong X., Cox N.J., Liu C., Gamazon E.R. A unified framework for joint-tissue transcriptome-wide association and Mendelian randomization analysis. Nat. Genet. 2020;52:1239–1246. doi: 10.1038/s41588-020-0706-2.

[bib28] 28. Barbeira A.N., Dickinson S.P., Bonazzola R., Zheng J., Wheeler H.E., Torres J.M., Torstenson E.S., Shah K.P., Garcia T., Edwards T.L., et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 2018;9:1825. doi: 10.1038/s41467-018-03621-1.

[bib29] 29. Gaudet M.M., Kuchenbaecker K.B., Vijai J., Klein R.J., Kirchhoff T., McGuffog L., Barrowdale D., Dunning A.M., Lee A., Dennis J., et al. Identification of a BRCA2-specific modifier locus at 6p24 related to breast cancer risk. PLoS Genet. 2013;9:e1003173. doi: 10.1371/journal.pgen.1003173.

[bib30] 30. Zhang J., Zhang K., Qi L., Hu Q., Shen Z., Liu B., Deng J., Zhang C., Zhang Y. DENN domain-containing protein FAM45A regulates the homeostasis of late/multivesicular endosomes. Biochim. Biophys. Acta Mol. Cell Res. 2019;1866:916–929. doi: 10.1016/j.bbamcr.2019.02.006.

[bib31] 31. Zhu Q., Zhang X., Zai H.-Y., Jiang W., Zhang K.-J., He Y.-Q., Hu Y. circSLC8A1 sponges miR-671 to regulate breast cancer tumorigenesis via PTEN/PI3k/Akt pathway. Genomics. 2021;113:398–410. doi: 10.1016/j.ygeno.2020.12.006.

[bib32] 32. Zaremba-Czogalla M., Hryniewicz-Jankowska A., Tabola R., Nienartowicz M., Stach K., Wierzbicki J., Cirocchi R., Ziolkowski P., Tabaczar S., Augoff K. A novel regulatory function of CDKN1A/p21 in TNFα-induced matrix metalloproteinase 9-dependent migration and invasion of triple-negative breast cancer cells. Cell. Signal. 2018;47:27–36. doi: 10.1016/j.cellsig.2018.03.010.

[bib33] 33. Fournier G., Cabaud O., Josselin E., Chaix A., Adélaïde J., Isnardon D., Restouin A., Castellano R., Dubreuil P., Chaffanet M., et al. Loss of AF6/afadin, a marker of poor outcome in breast cancer, induces cell migration, invasiveness and tumor growth. Oncogene. 2011;30:3862–3874. doi: 10.1038/onc.2011.106.

[bib34] 34. Li L., Li X., Qi L., Rychahou P., Jafari N., Huang C. The role of talin2 in breast cancer tumorigenesis and metastasis. Oncotarget. 2017;8:106876–106887. doi: 10.18632/oncotarget.22449.

[bib35] 35. Sun X., He Z., Guo L., Wang C., Lin C., Ye L., Wang X., Li Y., Yang M., Liu S., et al. ALG3 contributes to stemness and radioresistance through regulating glycosylation of TGF-β receptor II in breast cancer. J. Exp. Clin. Cancer Res. 2021;40:149. doi: 10.1186/s13046-021-01932-8.

[bib36] 36. Nakano N., Maeyama K., Sakata N., Itoh F., Akatsu R., Nakata M., Katsu Y., Ikeno S., Togawa Y., Vo Nguyen T.T., et al. C18 ORF1, a novel negative regulator of transforming growth factor-β signaling. J. Biol. Chem. 2014;289:12680–12692. doi: 10.1074/jbc.M114.558981.

[bib37] 37. Cui G., Park S., Badeaux A.I., Kim D., Lee J., Thompson J.R., Yan F., Kaneko S., Yuan Z., Botuyan M.V., et al. PHF20 is an effector protein of p53 double lysine methylation that stabilizes and activates p53. Nat. Struct. Mol. Biol. 2012;19:916–924. doi: 10.1038/nsmb.2353.

[bib38] 38. Shao Z., Ma X., Zhang Y., Sun Y., Lv W., He K., Xia R., Wang P., Gao X. CPNE1 predicts poor prognosis and promotes tumorigenesis and radioresistance via the AKT singling pathway in triple-negative breast cancer. Mol. Carcinog. 2020;59:533–544. doi: 10.1002/mc.23177.

[bib39] 39. Wu L., Shi W., Long J., Guo X., Michailidou K., Beesley J., Bolla M.K., Shu X.-O., Lu Y., Cai Q., et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat. Genet. 2018;50:968–978. doi: 10.1038/s41588-018-0132-x.

[bib42] 40. Ferreira M.A., Gamazon E.R., Al-Ejeh F., Aittomäki K., Andrulis I.L., Anton-Culver H., Arason A., Arndt V., Aronson K.J., Arun B.K., et al. Genome-wide association and transcriptome studies identify target genes and risk loci for breast cancer. Nat. Commun. 2019;10:1741. doi: 10.1038/s41467-018-08053-5.

[bib43] 41. Lawrenson K., Kar S., McCue K., Kuchenbaeker K., Michailidou K., Tyrer J., Beesley J., Ramus S.J., Li Q., Delgado M.K., et al. Functional mechanisms underlying pleiotropic risk alleles at the 19p13.1 breast-ovarian cancer susceptibility locus. Nat. Commun. 2016;7:12675. doi: 10.1038/ncomms12675.

[bib44] 42. Fritsch M., Günther S.D., Schwarzer R., Albert M.-C., Schorn F., Werthenbach J.P., Schiffmann L.M., Stair N., Stocks H., Seeger J.M., et al. Caspase-8 is the molecular switch for apoptosis, necroptosis and pyroptosis. Nature. 2019;575:683–687. doi: 10.1038/s41586-019-1770-6.

[bib40] 43. Wen W., Chen Z., Bao J., Long Q., Shu X.-O., Zheng W., Guo X. Genetic variations of DNA bindings of FOXA1 and co-factors in breast cancer susceptibility. Nat. Commun. 2021;12:5318. doi: 10.1038/s41467-021-25670-9.

[bib41] 44. Feng H., Gusev A., Pasaniuc B., Wu L., Long J., Abu-Full Z., Aittomäki K., Andrulis I.L., Anton-Culver H., Antoniou A.C., et al. Transcriptome-wide association study of breast cancer risk by estrogen-receptor status. Genet. Epidemiol. 2020;44:442–468. doi: 10.1002/gepi.22288.

[bib45] 45. Mancuso N., Freund M.K., Johnson R., Shi H., Kichaev G., Gusev A., Pasaniuc B. Probabilistic fine-mapping of transcriptome-wide association studies. Nat. Genet. 2019;51:675–682. doi: 10.1038/s41588-019-0367-1.

[bib46] 46. Giambartolomei C., Vukcevic D., Schadt E.E., Franke L., Hingorani A.D., Wallace C., Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383.

[bib47] 47. Fachal L., Aschard H., Beesley J., Barnes D.R., Allen J., Kar S., Pooley K.A., Dennis J., Michailidou K., Turman C., et al. Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes. Nat. Genet. 2020;52:56–73. doi: 10.1038/s41588-019-0537-1.

[bib48] 48. Kanehisa M., Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.

[bib49] 49. Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325–D334. doi: 10.1093/nar/gkaa1113.

[bib50] 50. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.

[bib51] 51. Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004.

[bib52] 52. Galiè M. RAS as supporting actor in breast cancer. Front. Oncol. 2019;9:1199. doi: 10.3389/fonc.2019.01199.

[bib53] 53. Schon K., Tischkowitz M. Clinical implications of germline mutations in breast cancer: TP53. Breast Cancer Res. Treat. 2018;167:417–423. doi: 10.1007/s10549-017-4531-y.

[bib54] 54. Liu S., Huang J., Zhang Y., Liu Y., Zuo S., Li R. MAP2K4 interacts with Vimentin to activate the PI3K/AKT pathway and promotes breast cancer pathogenesis. Aging (Albany NY). 2019;11:10697–10710. doi: 10.18632/aging.102485.
