Abstract
Proteome-wide association studies (PWAS) address gaps in post-transcriptional modifications that previous genome- and transcriptome-wide association studies could not capture. Zhao and colleagues conducted the first breast tissue-based PWAS and identified proteins associated with breast cancer risk, providing insights into the role of protein expression in breast cancer development.
Subject terms: Genome-wide association studies, Risk factors
Breast cancer is the most common cancer among females worldwide, and genome-wide association studies (GWAS) have identified more than 200 common risk variants for the disease [1]. However, one limitation of GWAS is its inability to explain how these variants affect cancer risk and pinpoint which genes they act on. To address this gap, researchers have conducted transcriptome-wide association studies (TWAS) to investigate the effect of genetically predicted gene expression levels on breast cancer risk [2, 3]. While TWAS focus on the abundance of mRNA, they cannot account for post-transcriptional and translational modifications, which play a critical role in protein function. Proteins are the final products of the DNA decoding process in cells. At each stage of transcription and translation, complex regulatory processes occur, with various factors influencing mRNA stability, mRNA-to-protein translation, post-translational regulation, and protein stability [4]. Even minor alterations can lead to significant changes in protein expression, contributing to disease development, including breast cancer. Studying protein-level regulation is essential for comprehending the biological mechanisms underlying cancer development. Proteome-wide association studies (PWAS), which investigate genetically predicted protein levels through regulatory common variants [5] or integrate functionality of coding variants [6], fill the gap in post-transcriptional research.
While several plasma protein biomarkers for breast cancer have been identified, the number of PWAS remains limited and none have focused on targeted tissues specific to breast cancer. In this issue of British Journal of Cancer, Zhao et al. [7] conducted the first breast tissue-based PWAS to identify proteins associated with the risk of breast cancer. They used single nucleotide polymorphisms (SNPs) to predict the expression of 2,060 proteins measured in cancer-free breast tissues from 120 women of European ancestry from the Susan G. Komen Tissue Bank. The study identified seven proteins that were significantly associated with the risk of overall breast cancer and its intrinsic subtypes. This research highlights the added value of PWAS to current genetic studies for breast cancer. It also provides insights for future research on how protein expression influences breast cancer risk.
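The two-stage logic behind a PWAS of this kind can be illustrated with a minimal Python sketch. It assumes a ridge regression prediction model and a summary-statistic combination of per-SNP GWAS z-scores, Z = w'z / sqrt(w'Rw), as used in common TWAS-style software; the commentary does not state which prediction model or test Zhao et al. [7] applied, so the function names, the penalty value, and all simulated numbers below are hypothetical.

```python
import numpy as np

def fit_ridge_weights(genotypes, protein, penalty=1.0):
    """Stage 1: learn SNP weights that predict one protein's abundance.

    genotypes: (n_samples, n_snps) allele dosages from the reference panel;
               every SNP is assumed to vary in the panel (non-zero variance)
    protein:   (n_samples,) measured abundance of the protein
    penalty:   ridge penalty (hypothetical default, not taken from the study)
    """
    # Standardise each SNP so the weights match a correlation-scale LD matrix
    X = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    y = protein - protein.mean()
    n_snps = X.shape[1]
    # Closed-form ridge solution: (X'X + penalty * I)^-1 X'y
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_snps), X.T @ y)

def pwas_z_score(weights, gwas_z, ld):
    """Stage 2: combine per-SNP GWAS z-scores into a protein-level z-score.

    Computes Z = w'z / sqrt(w' R w), where R is the SNP correlation (LD) matrix.
    """
    return float(weights @ gwas_z / np.sqrt(weights @ ld @ weights))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_women, n_snps = 120, 5                       # panel size as in the commentary
    G = rng.binomial(2, 0.3, size=(n_women, n_snps)).astype(float)
    simulated_effects = np.array([0.5, 0.0, -0.3, 0.0, 0.2])   # made-up values
    protein = G @ simulated_effects + rng.normal(scale=0.5, size=n_women)

    w = fit_ridge_weights(G, protein)
    ld = np.corrcoef(G, rowvar=False)              # LD estimated from the panel
    gwas_z = np.array([2.1, 0.3, -1.8, 0.1, 1.2])  # made-up GWAS z-scores
    print("protein-level z-score:", pwas_z_score(w, gwas_z, ld))
```

With only 120 samples per prediction model, the fitted weights are noisy, which is one reason the sample size of the reference panel matters for detection power.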
PWAS can identify potential risk variants for breast cancer that were not detected in previous GWAS and provide explanations for the genetic mechanisms underlying cancer development. In Zhao et al. [7], the genes corresponding to three proteins associated with overall breast cancer were located more than 1 megabase away from breast cancer risk variants previously identified in GWAS. This suggests that novel variants may influence breast cancer risk through protein expression. These potential risk variants could be integrated into breast cancer prediction models to improve risk prediction accuracy and help identify high-risk individuals at an earlier stage. Zhao et al. also found that the DNAJA3 protein, encoded by the gene DNAJA3 located within previous GWAS-identified breast cancer risk loci, remained significantly associated with overall breast cancer risk after adjusting for nearby GWAS risk variants [7]. This indicates that the protein’s effect on breast cancer risk may not be fully explained by nearby known GWAS risk loci. Future research could investigate SNPs not identified in previous GWAS that may affect protein regulatory processes. Additionally, these findings suggest that proteins may have an independent, direct effect on breast cancer development, pointing to novel protein biomarkers that could uncover new aetiological pathways and serve as targets for screening and therapy.
PWAS could also help identify downstream biological mechanisms following gene expression, revealing associations not captured by TWAS and expanding our knowledge of the genetic architecture of breast cancer. For example, of the seven proteins identified by Zhao et al. [7], three were encoded by genes not previously reported in TWAS. Conversely, a gene identified in TWAS may show increased expression in breast tissue, and PWAS can further investigate whether this overexpression leads to increased protein expression associated with cancer risk. Zhao et al. [7] compared their PWAS results with previous TWAS findings and found one gene, NCKAP1L, significantly positively associated with breast cancer risk in both approaches. This strengthens the evidence that NCKAP1L is a causal gene. However, the authors did not formally assess the concordance between TWAS and PWAS results. Integrative cross-omics methods could collectively provide a more holistic view of the mechanisms underlying complex human traits [8]. Future studies could further develop multi-omics models incorporating TWAS and PWAS analytically to increase statistical power and elucidate the complex interactions between gene and protein expression.
PWAS in cancer research still has a long way to go. Zhao et al. [7] utilised a parsimonious approach for proteomics quantification that largely ignores multiple protein isoforms without uniquely identified peptides and assigns shared peptides to the protein with the strongest identification evidence. Excluding such isoforms and proteins may mean that only the low-hanging fruit is detected, limiting detection power and hindering understanding of gene function in cancer development, because protein synthesis typically involves complex processes and protein isoforms encoded by the same gene, or by genes in the same family, share high levels of sequence similarity [9]. Identifying and quantifying protein isoforms through mass spectrometry-based shotgun proteomics remains a challenge, but improved experimental and bioinformatic approaches in proteomics can help advance PWAS for gene mapping. In addition, the limited number of proteins identified by Zhao et al. [7], despite a less stringent false discovery rate threshold (<0.1), could be attributed to the relatively small sample size used to build protein prediction models. Moreover, the study was restricted to women of European ancestry. Including more diverse populations in future studies will lead to a more comprehensive understanding of the mechanisms driving cancer progression and may identify protein expression factors unique to specific breast cancer subtypes. Women of African ancestry experience disproportionately higher rates of triple-negative breast cancer (TNBC) and TNBC-specific mortality compared to other populations [10]. Identifying proteins specific to TNBC risk could facilitate early detection and reduce the disparity. However, cancer-free breast tissue samples from non-European ethnic groups are not available so far, underscoring the importance of expanding future participant recruitment and sample collection in diverse populations.
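To make the false discovery rate threshold mentioned above concrete, the following Python sketch applies the Benjamini-Hochberg step-up procedure at a chosen FDR level. This is an illustration only: the commentary does not specify which FDR procedure Zhao et al. [7] used, and the function name and p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.1):
    """Return a boolean mask of the hypotheses rejected at the given FDR level."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices that sort p ascending
    ranked = p[order]
    # The k-th smallest p-value is compared with (k / m) * fdr
    thresholds = fdr * np.arange(1, m + 1) / m
    passed = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])       # largest rank meeting its threshold
        rejected[order[:k + 1]] = True          # reject it and all smaller p-values
    return rejected

if __name__ == "__main__":
    made_up_p = [0.001, 0.04, 0.03, 0.5]        # illustrative protein-level p-values
    print(benjamini_hochberg(made_up_p, fdr=0.1))
    print(benjamini_hochberg(made_up_p, fdr=0.05))
```

In this toy example, relaxing the threshold from 0.05 to 0.1 increases the number of rejected hypotheses from one to three, which shows how a less stringent threshold admits more candidate proteins at the cost of a higher expected proportion of false discoveries.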
Acknowledgements
This work is supported in part by the National Cancer Institute (R01-CA242929 and R01-CA228198).
Author contributions
YS conducted the literature review and wrote the first draft of the paper. DH revised and finalised the paper.
Competing interests
The authors declare no competing interests.
References
- 1. Zhang H, Ahearn TU, Lecarpentier J, Barnes D, Beesley J, Qi G, et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat Genet 2020;52:572–81.
- 2. Li JL, McClellan JC, Zhang H, Gao G, Huo D. Multi-tissue transcriptome-wide association studies identified 235 genes for intrinsic subtypes of breast cancer. JNCI J Natl Cancer Inst 2024;116:1105–15.
- 3. Gao G, Fiorica PN, McClellan J, Barbeira AN, Li JL, Olopade OI, et al. A joint transcriptome-wide association study across multiple tissues identifies candidate breast cancer susceptibility genes. Am J Hum Genet 2023;110:950–62.
- 4. Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet 2012;13:227–32.
- 5. Gregga I, Pharoah PDP, Gayther SA, Manichaikul A, Im HK, Kar SP, et al. Predicted proteome association studies of breast, prostate, ovarian, and endometrial cancers implicate plasma protein regulation in cancer susceptibility. Cancer Epidemiol Biomark Prev 2023;32:1198–207.
- 6. Brandes N, Linial N, Linial M. PWAS: proteome-wide association study—linking genes and phenotypes by functional variation in proteins. Genome Biol 2020;21:173.
- 7. Zhao T, Xu S, Ping J, Jia G, Dou Y, Henry JE, et al. A proteome-wide association study identifies putative causal proteins for breast cancer risk. Br J Cancer 2024 Online ahead of print.
- 8. Lu Y, Oliva M, Pierce BL, Liu J, Chen LS. Integrative cross-omics and cross-context analysis elucidates molecular links underlying genetic effects on complex traits. Nat Commun 2024;15:2383.
- 9. Dou Y, Liu Y, Yi X, Olsen LK, Zhu H, Gao Q, et al. SEPepQuant enhances the detection of possible isoform regulations in shotgun proteomics. Nat Commun 2023;14:5809.
- 10. Huo D, Ikpatt F, Khramtsov A, Dangou JM, Nanda R, Dignam J, et al. Population differences in breast cancer: survey in indigenous African women reveals over-representation of triple-negative breast cancer. J Clin Oncol 2009;27:4515–21.
