Abstract
Background
The immune response against tumors relies on distinguishing between self and non-self, the basis of cancer immunotherapy. Neoantigens from somatic mutations are central to many immunotherapeutic strategies and understanding their landscape in breast cancer is crucial for targeted interventions. We aimed to profile neoantigens in Kenyan breast cancer patients using genomic DNA and total RNA from paired tumor and adjacent non-cancerous tissue samples of 23 patients.
Methods
We performed whole-exome sequencing (WES) and RNA sequencing, from which somatic mutations were identified and their expression quantified, respectively. Neoantigen prediction focused on the human leukocyte antigen (HLA) class I molecules, which are central to cancer immunity. HLA alleles were predicted from WES data of the adjacent non-cancerous tissue samples, identifying four alleles that were present in at least 50% of the patients. Neoantigens were deemed potentially immunogenic if their predicted median IC50 (half-maximal inhibitory concentration) binding scores were ≤500 nM and they were expressed [transcripts per million (TPM) >1] in tumor samples.
Results
An average of 1465 neoantigens covering 10260 genes had a median IC50 binding score ≤500 nM and expression >1 TPM in the 23 patients, and their presence significantly correlated with the somatic mutations (R² = 0.570, P = 0.001). Assessing 58 genes reported in the catalog of somatic mutations in cancer (COSMIC, v99) to be commonly mutated in breast cancer, 44 (76%) produced ≥2 neoantigens among the 23 patients, with a mean of 10.5 and a range of 2 to 93. For the 44 genes, a total of 477 putative neoantigens were identified, derived predominantly from missense mutations (88%), with the remainder from indels (6%) and frameshift mutations (6%). Notably, 78% of the putative breast cancer neoantigens were patient-specific. The HLA-C*06:01 allele was associated with the majority of neoantigens (194), followed by HLA-A*30:01 (131), HLA-A*02:01 (103), and HLA-B*58:01 (49). The genes of interest that produced putative neoantigens included MUC17, TTN, MUC16, AKAP9, NEB, RP1L1, CDH23, PCDHB10, BRCA2, TP53, TG, and RB1.
Conclusions
The unique neoantigen profiles in our patient group highlight the potential of immunotherapy in personalized breast cancer treatment as well as potential biomarkers for prognosis. The unique mutations producing these neoantigens, compared to other populations, provide an opportunity for validation in a much larger sample cohort.
Keywords: neoantigen, breast cancer, exome-seq, RNA-seq, somatic mutations, Kenya
Introduction
Breast cancer is among the most frequent causes of cancer-related mortality in women. Disease heterogeneity and limited immunogenicity contribute to the lethality of breast cancer (1). Immune evasion, an important hallmark of cancer, adds to the complexity of the cancer burden through induction of immunosuppression (2). Immune checkpoint blockade (CKB) therapy has been developed to target and block immune regulatory molecules (PD-1/PD-L1 and CTLA-4) and in the process reactivate T cell immunity (3). This approach has been reported to improve clinical responses and survival, especially in tumors with high mutational burdens, such as lung cancer and melanoma (4). However, CKB therapy is not successful in all patients and is more effective in tumors with a higher mutational burden (5). Another immunotherapy approach that has been tested in clinical studies is the targeting of tumor-associated antigens (TAAs), which are expressed in tumors at abnormally high levels and are rarely detectable in normal tissues (6). One limitation of this approach is that many TAAs represent normal self-antigens and thus can be tolerated by T cells, resulting in a poor immune response (1). Considering the lower mutational burden in breast cancer, both CKB and TAA-based immunotherapy have had limited success (7).
Tumor neoantigens are tumor-specific antigens derived from somatic mutations in expressed genes. They are presented by the major histocompatibility complex (MHC), both by class I human leukocyte antigen (HLA-I) molecules on the surface of cancer cells and by class II HLA molecules on professional antigen-presenting cells (8). This elicits anti-tumor immune responses that have the potential to eliminate the tumor cells with minimal off-target effects (9). Neoantigens are encoded by various mutation types, including single nucleotide substitutions, insertions and deletions (INDELs), splice-site changes, stop-codon gains and silent changes, which can result in translational frameshifts or novel open reading frames (1). As such, neoantigens offer an advantage over TAAs in that they are expressed only by cancer cells and not by normal cells, which enables specific recognition by the immune system (1). Although some neoantigens are shared among patients, most of them are patient-specific and are not subject to immune tolerance mechanisms (10). The specificity of neoantigens could provide an opportunity for future personalized therapy in a cancer with a low tumor mutational burden and high disease heterogeneity, such as breast cancer. Moreover, neoantigens can potentially be used as biomarkers in cancer immunotherapy to assess or predict the response of a patient to treatment (1).
Despite advancements in next generation sequencing and high-performance computing that have improved cancer immunotherapy research and neoantigen-based treatments, information on neoantigens in specific populations from sub-Saharan African countries such as Kenya remains scarce. This lack of data poses a significant challenge in tailoring immunotherapeutic strategies for breast cancer patients in such regions, which have a high cancer burden, especially when compounded by germline ancestral factors and a distinct mutational spectrum that may influence tumor biology and immune response. Thus, it is critical to profile the neoantigen burden in this population to contribute to the global collection of breast cancer immunogenic antigens for future drug development. To this end, we sought to profile neoantigens in Kenyan women diagnosed with breast cancer in silico through analysis of whole-exome and RNA sequencing data from 23 patients. We characterized the mutation burden for each patient using WES, identified gene expression patterns in tumor tissue, and predicted the putative neoantigens by integrating these datasets.
Materials and methods
Patients and samples
Tumor and adjacent normal tissue pairs were obtained from 23 breast cancer patients at the Aga Khan Hospital, Nairobi, Kenya and AIC Kijabe Hospital, Kijabe, Kenya between 2019 and 2021. Samples were collected through surgical excision, after which tissues were snap frozen in liquid nitrogen and temporarily stored at Aga Khan Hospital. Frozen tissue samples were shipped to the National Cancer Institute, Bethesda, MD, USA, for sequencing. Prior to tissue collection, all patients provided written informed consent, and the study was approved by the Research and Ethics Committees at Aga Khan University Hospital, Nairobi (Ref: 2018/REC-80) and AIC Kijabe Hospital (KH IERC-02718/2019).
Whole-exome sequencing and RNA-sequencing
Genomic DNA was extracted from the samples using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), following the manufacturer’s instructions. Total RNA was extracted from the frozen tissues using TRIzol reagent (Invitrogen). WES was performed by Psomagen (https://www.psomagen.com/), a service provider that is Clinical Laboratory Improvement Amendments (CLIA)-certified and College of American Pathologists (CAP)-accredited. A sequencing depth of 250x was achieved for tumor tissues and 150x for adjacent non-cancerous tissues, as previously described by us (11). Total RNA from the 23 sample pairs was processed by an NCI Leidos core facility, where library preparation was performed using the TruSeq Poly A kit (19). Samples were sequenced on a NovaSeq system with 150 bp paired-end reads at a depth of 30 million reads.
Reads mapping and variant calling
For WES, raw reads were quality checked using FastQC (12) and the results summarized using MultiQC (13). The reads were trimmed to remove low-quality bases and adapter sequences using Trimmomatic (14) and quality-checked again using FastQC and MultiQC. All samples passed the QC test after trimming, and the reads were aligned to the hg38 human reference genome using BWA-MEM (15), with >95% of the reads aligning properly. The aligned reads were deduplicated and read groups were added to the deduplicated BAM files using Picard. This was followed by base quality recalibration in GATK (16). Prior to variant calling, a panel of normals (PoN) was built with MuTect2 from the reads of the non-cancerous tissue. This was done to exclude artifacts and potential germline mutations in subsequent steps. Somatic variant calling was performed using MuTect2 (16) in paired tumor-normal mode with the panel of normals option. Variants were normalized using the variant tool set (vt; 17), filtered using GATK, and annotated for function and consequence using the Variant Effect Predictor (VEP; 18). Annotated variants were converted to MAF files using vcf2maf (19) and concatenated into a single file. The MAF files were imported into the R package maftools (20) for further processing.
For RNA-seq, a quality check was performed using FastQC and MultiQC, after which the reads were trimmed and quality checked again. All samples passed the quality check, and the reads were pseudo-aligned to the hg38 reference genome using the Kallisto aligner (21) with default settings to obtain a count matrix. Alignment statistics showed that more than 50% of the reads mapped uniquely to the genome. The raw counts were normalized into estimated transcripts per million (TPM) and scaled using the average transcript length over samples and the library size by tximport (22).
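To make the TPM scaling concrete, the following is a minimal Python sketch of how estimated counts can be converted to TPM within one sample. It is not the tximport implementation used in the study; the function name `counts_to_tpm` and the toy counts and effective lengths are assumptions chosen only for illustration.

```python
def counts_to_tpm(counts, effective_lengths):
    """Convert estimated counts to transcripts per million (TPM) for one sample.

    counts: estimated read counts per transcript
    effective_lengths: effective transcript lengths in bases (same order)
    """
    # Length-normalize first: reads per base of transcript.
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Scale so that the values within the sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Toy example with three transcripts (illustrative numbers, not study data).
toy_tpm = counts_to_tpm([100, 300, 600], [1000, 1500, 3000])
```

Because TPM values sum to one million within each sample, a fixed cutoff such as TPM >1 refers to the same relative abundance in every sample.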
Variant expression annotation
VCF files containing the variants were annotated for expression using the vcf-expression-annotator (https://github.com/griffithlab/VAtools) with default settings, except that gene names were used instead of transcripts, thereby ignoring the Ensembl ID version. The tool takes the output of Kallisto and adds the data contained in the file to the INFO column of the VEP-annotated VCF. Each annotated variant thus has its expression value (TPM) added to the annotation information, and this value is used to determine the level of variant expression during neoantigen filtering.
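The idea of this annotation step can be sketched in a few lines of Python. This is not the VAtools code and does not read or write VCF files; the record layout (dictionaries with a `gene` key), the function name `annotate_with_expression`, and the placeholder gene names are all hypothetical.

```python
def annotate_with_expression(variants, gene_tpm):
    """Attach a gene-level TPM value to each variant record.

    variants: list of dicts with at least a 'gene' key (hypothetical layout)
    gene_tpm: dict mapping gene name -> TPM in the tumor sample
    """
    annotated = []
    for variant in variants:
        record = dict(variant)  # copy so the input records are left untouched
        # Genes missing from the expression table get None rather than 0,
        # so "not quantified" stays distinguishable from "not expressed".
        record["tpm"] = gene_tpm.get(variant["gene"])
        annotated.append(record)
    return annotated


# Placeholder inputs (illustrative only).
toy_gene_tpm = {"GENE_A": 12.5, "GENE_B": 0.3}
toy_variants = [
    {"gene": "GENE_A", "consequence": "missense_variant"},
    {"gene": "GENE_B", "consequence": "frameshift_variant"},
    {"gene": "GENE_C", "consequence": "missense_variant"},
]
annotated_variants = annotate_with_expression(toy_variants, toy_gene_tpm)
```

Matching on gene name, as in this sketch, avoids mismatches caused by differing Ensembl version suffixes, at the cost of assigning the same gene-level value to every transcript of a gene.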
Neoantigen prediction
Human leukocyte antigen (HLA) class I alleles (HLA-A, -B and -C) were predicted from each patient’s normal sample exome-seq data using HLA-HD v.1.2.1 (23). Here, the putative HLA reads are aligned to an imputed library of full-length HLA alleles. Neoantigens were then predicted using pVACseq (24) with the MHCflurry, MHCnuggetsI, SMM, and SMMPMBEC algorithms, keeping the default parameters except for turning off the VAF and coverage filters. Here, the neoepitopes that could bind to the patient-specific HLA alleles were predicted using the Immune Epitope Database (IEDB; 25). This involved matching the patient HLA type to the existing IEDB list and keeping peptides with lengths of 9, 10 and 11 amino acids. Predicted epitopes were filtered to retain only those that had high binding affinity (IC50 ≤500 nM) and were expressed (TPM >1) in tumor samples. The bioinformatic analysis workflow is outlined in Figure 1.
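The two filtering criteria can be illustrated with a short Python sketch. It assumes that the median IC50 is taken across the four prediction algorithms for a given peptide and HLA allele; the function name `passes_filters`, the record layout, and all numeric values are hypothetical and are not pVACseq output.

```python
from statistics import median

IC50_CUTOFF_NM = 500.0  # binding threshold stated in the text
TPM_CUTOFF = 1.0        # expression threshold stated in the text


def passes_filters(ic50_by_algorithm, tpm):
    """Return True if median IC50 <= 500 nM and tumor expression > 1 TPM."""
    median_ic50 = median(ic50_by_algorithm.values())
    return median_ic50 <= IC50_CUTOFF_NM and tpm > TPM_CUTOFF


# Made-up predictions for three placeholder peptides.
toy_epitopes = [
    {"peptide": "PEPTIDE_1", "tpm": 5.2,
     "ic50": {"MHCflurry": 120.0, "MHCnuggetsI": 480.0, "SMM": 510.0, "SMMPMBEC": 300.0}},
    {"peptide": "PEPTIDE_2", "tpm": 8.0,
     "ic50": {"MHCflurry": 600.0, "MHCnuggetsI": 700.0, "SMM": 450.0, "SMMPMBEC": 900.0}},
    {"peptide": "PEPTIDE_3", "tpm": 0.4,
     "ic50": {"MHCflurry": 50.0, "MHCnuggetsI": 60.0, "SMM": 70.0, "SMMPMBEC": 80.0}},
]
kept = [e["peptide"] for e in toy_epitopes if passes_filters(e["ic50"], e["tpm"])]
```

In this toy input only the first peptide is retained: the second binds too weakly (median IC50 of 650 nM) and the third is a strong predicted binder but falls below the expression cutoff.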
Sample summary statistics, pairwise Wilcoxon tests for differences in mutation and neoantigen abundance among the breast cancer (BC) subtypes, and visualization of the results were performed in R (26).
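For readers unfamiliar with the test, the sketch below shows a two-sided Wilcoxon rank-sum (Mann-Whitney U) comparison of two groups in plain Python. It is an illustration only, not the R code used in the study: it relies on a normal approximation without tie or continuity correction, which is an assumption that becomes unreliable for very small groups, and the function name `rank_sum_test` is hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Returns (U statistic for x, approximate p-value). Tied values receive
    mid-ranks; no tie or continuity correction is applied.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Toy counts for two hypothetical groups (not study data).
u_stat, p_approx = rank_sum_test([1, 2, 3], [4, 5, 6])
```

With only a few samples per group, as for the smallest subtype here, an exact test is generally preferable to the approximation used in this sketch.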
Results
Patients and sample characteristics
The demographic and clinical characteristics of the 23 breast cancer patients are summarized in Supplementary Table S1. We grouped the tumors into 3 subtypes based on expression of either the hormone receptors (HR) or human epidermal growth factor receptor 2 (HER2) (7): those that were HER2+ regardless of the HR status, those that were negative for both the hormone receptors and HER2 (triple negative breast cancer; TNBC), and those that were HR+ and HER2-. The majority of the samples were HR+/HER2- (52.2%), followed by HER2+ (34.8%) and TNBC (13.0%). Most of the patients had invasive carcinoma (invasive ductal carcinoma, 78.26%; invasive carcinoma, 4.35%). For tumor grade, 65.22% of the patients had grade 3 tumors, while the rest had grade 2 tumors (34.78%). Clinically, 39.13% of the patients were in stage II, 30.44% in stage III, and 8.7% in stage I (Supplementary Table S1).
Mutation profiles for the 23 patients
Across all genes, the average number of detected mutations in the 23 patients was 2809. Considering the different subtypes, TNBC had the highest average number of mutations at 3202, followed by HR+/HER2- at 2757 and HER2+ at 2740 (Supplementary Figure S1). From the catalog of somatic mutations in cancer (COSMIC, v99), we identified 73 genes reported to be mutated in breast cancer; among those, 62 (84.9%) had at least one mutation in our samples. The mutation characteristics are summarized in Figures 2A–F. In brief, the mutation frequency among the 62 genes ranged from 1 to 55 mutations per individual. The majority of the mutations were of the missense type, most of which were C>T substitutions (Figure 2A). The top 10 mutated genes among the 62 are shown in Figure 3. Four genes (MUC16, MUC17, TTN, RP1L1) were altered in more than 95% of the patients (Figure 3). Moreover, mutations in the gene pairs TP53-ERBB3 and PTEN-CFAP46 significantly co-occurred, while BRCA1 and MUC17 mutations were significantly mutually exclusive (P<0.05) (Figure 4). Furthermore, C>T substitutions accounted for the majority of the single nucleotide mutations (Figure 5A). Transitions occurred more frequently than transversions among these substitutions (Figure 5B), and the proportions of each substitution class varied among the 23 samples (Figure 5C).
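The substitution classes and the transition/transversion split referred to above can be illustrated with a brief Python sketch. It assumes the common convention of collapsing each single nucleotide change onto its pyrimidine (C or T) reference base, giving six classes; the function names and the toy list of changes are hypothetical and do not come from the study data.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
TRANSITION_CLASSES = {"C>T", "T>C"}  # purine<->purine or pyrimidine<->pyrimidine


def substitution_class(ref, alt):
    """Collapse a single nucleotide change to one of six pyrimidine-based classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref in ("A", "G"):
        # Report the change on the complementary strand so the reference is C or T.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def is_transition(ref, alt):
    """True for transitions (C>T, T>C after collapsing), False for transversions."""
    return substitution_class(ref, alt) in TRANSITION_CLASSES


# Toy list of (reference, alternate) bases, for illustration only.
toy_snvs = [("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
class_counts = Counter(substitution_class(r, a) for r, a in toy_snvs)
ti_count = sum(is_transition(r, a) for r, a in toy_snvs)
tv_count = len(toy_snvs) - ti_count
```

Under this convention a G>A change is counted together with C>T, which is why summaries of this kind report six classes rather than twelve.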
Neoantigen burden
In an analysis that included all the genes (10260), an average of 1465 neoantigens had a median IC50 binding score ≤500 nM and an expression level >1 TPM in any of the 23 patients, and their presence significantly correlated with the somatic mutations (R² = 0.570, P = 0.001) (Figure 6). Out of the 62 COSMIC genes that were mutated in the tumor tissue, 58 genes produced at least one neoantigen. After filtering for genes that produced at least two neoantigens, 44 genes remained, with a mean of 10.5 neoantigens and a range of 2 to 93. A total of 477 putative neoantigens were identified in these 44 genes across the 23 patients (Figure 7), derived predominantly from missense mutations (88%), with the remainder from indels (6%) and frameshift mutations (6%) (Figure 8). Most of the neoantigens were produced in the TNBC subtype, with an average of 25 neoantigens, followed by HR+/HER2- with an average of 20 and HER2+ with an average of 19 (Supplementary Figure S1). Notably, 78% of the putative breast cancer neoantigens were patient-specific (Supplementary Table S2). The HLA-C*06:01 allele was associated with the majority of neoantigens (194), followed by HLA-A*30:01 (131), HLA-A*02:01 (103), and HLA-B*58:01 (49). Genes of interest that produced putative neoantigens included MUC17, TTN, MUC16, AKAP9, NEB, RP1L1, CDH23, PCDHB10, BRCA2, TP53, TG, and RB1, among others (Figure 7; Supplementary Table S3).
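Two of the summary quantities in this section can be sketched in Python. The sketch assumes that R² is the squared Pearson correlation between per-patient mutation and neoantigen counts (equivalent to the R² of a simple linear regression), and that a neoantigen is "patient-specific" when its peptide is predicted in exactly one patient; both are assumptions, and the function names and toy inputs are hypothetical.

```python
from collections import Counter


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def patient_specific_fraction(neoantigens_by_patient):
    """Fraction of distinct peptides that are seen in exactly one patient."""
    patients_per_peptide = Counter(
        peptide
        for peptides in neoantigens_by_patient.values()
        for peptide in set(peptides)  # count each patient at most once per peptide
    )
    if not patients_per_peptide:
        return 0.0
    private = sum(1 for count in patients_per_peptide.values() if count == 1)
    return private / len(patients_per_peptide)


# Toy inputs with placeholder values (not study data).
toy_r2 = r_squared([1, 2, 3], [1, 3, 2])
toy_fraction = patient_specific_fraction(
    {"P1": ["AAA", "BBB"], "P2": ["BBB", "CCC"], "P3": ["DDD"]}
)
```

In the toy input, three of the four distinct peptides occur in a single patient, so the patient-specific fraction is 0.75.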
Discussion
We analyzed the mutational burden and predicted the neoantigen repertoire in 23 Kenyan breast cancer patients using WES and RNA sequencing data. Among the different breast cancer subtypes, we found that the TNBC molecular subtype had the highest mutational and neoantigen burden, although there was no significant difference among the subtypes (Supplementary Figure S1, Supplementary Table S4). This is consistent with other studies (24). The origin of TNBC is not well understood, although the disease is reported to be heterogeneous, relying on different signaling pathways such as JAK/STAT, PI3K/AKT/mTOR or NOTCH, cell cycle regulators (TP53) and genome integrity genes (BRCA1/2) (1). This makes it difficult to manage, because the molecular mechanisms driving it are not clearly understood. Yet, the high mutational and neoantigen burden combined with the patient specificity may provide an untapped opportunity to design and optimize personalized immunotherapy for this subtype.
In contrast to most populations (Caucasian American, African American, Asian and European), where TP53, PIK3CA and GATA3 are the most mutated genes (11, 27, 28), in our study population three genes (MUC16, MUC17 and TTN) were mutated in over 50% of the samples and produced the highest number of neoantigens. MUC16 has been reported to take part in breast cancer progression and metastasis when overexpressed, due to its influence on cell cycle and survival through the JAK2/STAT3 pathway (29). It has been reported as one of the highly mutated genes in breast cancer (30). MUC16 has also been described as a marker for disease progression, recurrence, and chemotherapy response (31). A high mutation frequency for MUC17 and TTN has recently been reported as an unexpected finding in a study of early onset breast cancer (EOBC) in Taiwanese women (32). MUC17 may influence chemoresistance and has recently been reported as a driver gene in adult gliomas (33, 34). For TTN, Oh et al. (35) found that mutations in TTN correlate with tumor mutational burden and high microsatellite instability, which is associated with poor breast cancer prognosis. Thus, further investigation is needed into how mutations in MUC17 and TTN may relate to the early onset of breast cancer in Kenyan patients (11).
We found that TP53 mutations significantly co-occurred with ERBB3 mutations, as did mutations in PTEN and CFAP46, whereas BRCA1 and MUC17 mutations were mutually exclusive. TP53 mutations are associated with tumor aggression and are found in about half of HER2-amplified tumors (36). TP53 mutations have been implicated in the poor prognosis of HER2+ subtypes compared to other subtypes (37). PTEN is a tumor suppressor gene whose mutation has been associated with initiation, progression, and metastasis of breast cancer (38). On the other hand, although the role of CFAP46 in breast cancer is not yet clear, gene fusions involving various other genes such as VTI1A (reported to cause the initiation of glioma and other cancers) have been reported to play a role in breast cancer (39).
Breast tumors with either germline or somatic BRCA1 mutations show no difference in their cancer biology, but inherited mutations in this gene confer a very high lifetime risk of developing breast cancer (40–42). This could be the reason such mutations do not necessarily need to co-occur with other gene mutations to initiate or promote breast cancer progression. In our study, BRCA1 was not among the highly mutated genes when all mutations were considered, but it was among the genes with a high number of missense mutations (Figure 4). In contrast, MUC17 mutations were among the most prevalent. Given the role of MUC17 mutations in chemoresistance and in early onset breast cancer (33, 34), their high prevalence and exclusive occurrence in the Kenyan samples, which are prone to early onset of breast cancer, should be investigated further.
Similar to most studies on neoantigen prediction in breast cancer, we found that neoantigen burden is positively correlated with tumor mutational burden and that neoantigens were patient-specific (7, 43). Although most of the top 10 mutated genes (80%) were also in the top 10 for the number of neoantigens generated, genes like TP53 and PIK3CA, which are reported to be highly mutated in most patient cohorts, were not among the top 10 mutated genes in this study, yet were among the genes that generated the highest numbers of neoantigens (Figures 6, 7). The ARID1A gene, which showed a unique mutational profile in the Kenyan population using exome data compared to African American and Asian populations (11), was not among the highly mutated genes but produced neoantigens. We found that most neoantigens were derived from missense mutations (88%), compared to indels and frameshift mutations (12%). This is consistent with other studies, although the majority do not predict neoantigens from indels and frameshift mutations (44). Similar to other studies, the TNBC subtype had more neoantigens than the HR+/HER2- and HER2+ subtypes (7, 44).
In our small sample cohort, we were able to identify putative neoantigens that show patient specificity and thus are important for tailored treatment. Interestingly, the mutations and neoantigens in this population are predominantly derived from a unique set of genes (MUC16, MUC17, TTN) compared to other populations, which provides an opportunity for validation in a much larger sample cohort. We predicted neoantigens based on binding affinity to HLA class I only, as it is the most important class of antigen-binding proteins in cancer immunity. However, HLA class II-based neoantigens may also have a role in tumor immune response (45). Moreover, we did not investigate the expression of the predicted neoantigens on tumor cells alongside the MHC class I molecules or their ability to activate T cells. This being a discovery study, the findings need to be validated in a larger cohort while addressing the highlighted limitations of this study.
Taken together, our findings corroborate the neoantigen profile in breast cancer and highlight the patient specificity of the mutational and neoantigen signatures in Kenyan breast cancer patients. We also describe putative neoantigens that could be used as markers for breast cancer diagnosis, treatment monitoring, and development of novel immunotherapy.
Acknowledgments
We would like to thank the patients for their consent to provide samples and Aga Khan University Hospital (Nairobi) and AIC Kijabe Hospital (Kijabe) for granting access to patient samples.
Funding Statement
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was funded by the National Research Fund -Kenya that supported sample collection, and by the Center for Cancer Research, National Cancer Institute, USA, that supported the sequencing work.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.
Ethics statement
The studies involving humans were approved by the Research and Ethics Committee, The Aga Khan University Hospital, Nairobi (Ref: 2018/REC-80) and the Research and Ethics Committee, AIC Kijabe Hospital, Kijabe (KH IERC-02718/2019). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
GW: Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing. JG: Formal analysis, Writing – original draft, Writing – review & editing. KM: Formal analysis, Writing – original draft, Writing – review & editing. MM: Formal analysis, Writing – original draft, Writing – review & editing. EM: Writing – original draft, Writing – review & editing. AH: Resources, Writing – original draft, Writing – review & editing. SS: Data curation, Investigation, Resources, Writing – original draft, Writing – review & editing. SA: Resources, Writing – original draft, Writing – review & editing. FM: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2024.1444327/full#supplementary-material
References
- 1. Benvenuto M, Focaccetti C, Izzi V, Masuelli L, Modesti A, Bei R. Tumor antigens heterogeneity and immune response-targeting neoantigens in breast cancer. Semin Cancer Biol. (2021) 72:65–75. doi: 10.1016/j.semcancer.2019.10.023
- 2. Bates JP, Derakhshandeh R, Jones L, Webb TJ. Mechanisms of immune evasion in breast cancer. BMC Cancer. (2018) 18:556. doi: 10.1186/s12885-018-4441-3
- 3. Touchaei ZA, Vahidi S. MicroRNAs as regulators of immune checkpoints in cancer immunotherapy: Targeting PD-1/PD-L1 and CTLA-4 pathways. Cancer Cell Int. (2024) 24:102. doi: 10.1186/s12935-024-03293-6
- 4. Shiravand Y, Khodadadi F, Kashani SMA, Hosseini-Fard SR, Hosseini S, Sadeghirad H, et al. Immune checkpoint inhibitors in cancer therapy. Curr Oncol. (2022) 29:3044–60. doi: 10.3390/curroncol29050247
- 5. Brahmer J, Reckamp KL, Baas P, Crinò L, Eberhardt WEE, Poddubskaya E, et al. Nivolumab versus docetaxel in advanced squamous-cell non–small-cell lung cancer. New Engl J Med. (2015) 373:123–35. doi: 10.1056/NEJMoa1504627
- 6. Valilou SF, Rezaei N. Tumor antigens. In: Vaccines for Cancer Immunotherapy. Cambridge, Massachusetts, US: Elsevier; (2019). p. 61–74. doi: 10.1016/B978-0-12-814039-0.00004-7
- 7. Narang P, Chen M, Sharma AA, Anderson KS, Wilson MA. The neoepitope landscape of breast cancer: Implications for immunotherapy. BMC Cancer. (2019) 19:200. doi: 10.1186/s12885-019-5402-1
- 8. Blass E, Ott PA. Advances in the development of personalized neoantigen-based therapeutic cancer vaccines. Nat Rev Clin Oncol. (2021) 18:215–29. doi: 10.1038/s41571-020-00460-2
- 9. Pan R-Y, Chung W-H, Chu M-T, Chen S-J, Chen H-C, Zheng L, et al. Recent development and clinical application of cancer vaccine: targeting neoantigens. J Immunol Res. (2018) 2018:1–9. doi: 10.1155/2018/4325874
- 10. Yarchoan M, Johnson BA, Lutz ER, Laheru DA, Jaffee EM. Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer. (2017) 17:209–22. doi: 10.1038/nrc.2016.154
- 11. Tang W, Zhang F, Byun JS, Dorsey TH, Yfantis HG, Ajao A, et al. Population-specific mutation patterns in breast tumors from African American, European American, and Kenyan patients. Cancer Res Commun. (2023) 3:2244–55. doi: 10.1158/2767-9764.CRC-23-0165
- 12. Andrews S. FastQC: a quality control tool for high throughput sequence data (2010). Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc (accessed April 15, 2024).
- 13. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics. (2016) 32:3047–8. doi: 10.1093/bioinformatics/btw354
- 14. Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. (2014) 30:2114–20. doi: 10.1093/bioinformatics/btu170
- 15. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. (2013). doi: 10.48550/ARXIV.1303.3997
- 16. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. (2010) 20:1297–303. doi: 10.1101/gr.107524.110
- 17. Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants. Bioinformatics. (2015) 31:2202–4. doi: 10.1093/bioinformatics/btv112
- 18. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The ensembl variant effect predictor. Genome Biol. (2016) 17:122. doi: 10.1186/s13059-016-0974-4
- 19. Kandoth C, Gao J, Mattioni M, Struck A, Boursin Y, Penson A, et al. mskcc/vcf2maf: vcf2maf v1.6.16 (v1.6.16). (2018). doi: 10.5281/ZENODO.593251. Computer software.
- 20. Mayakonda A, Lin D-C, Assenov Y, Plass C, Koeffler HP. Maftools: Efficient and comprehensive analysis of somatic variants in cancer. Genome Res. (2018) 28:1747–56. doi: 10.1101/gr.239244.118
- 21. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. (2016) 34:525–7. doi: 10.1038/nbt.3519
- 22. Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: Transcript-level estimates improve gene-level inferences. F1000Research. (2016) 4:1521. doi: 10.12688/f1000research.7563.2
- 23. Kawaguchi S, Higasa K, Shimizu M, Yamada R, Matsuda F. HLA-HD: An accurate HLA typing algorithm for next-generation sequencing data. Hum Mutat. (2017) 38:788–97. doi: 10.1002/humu.23230
- 24. Hundal J, Carreno BM, Petti AA, Linette GP, Griffith OL, Mardis ER, et al. pVAC-Seq: A genome-guided in silico approach to identifying tumor neoantigens. Genome Med. (2016) 8:11. doi: 10.1186/s13073-016-0264-5
- 25. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, et al. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res. (2019) 47:D339–43. doi: 10.1093/nar/gky1006
- 26. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; (2023). Available at: https://www.R-project.org/ (accessed April 15, 2024).
- 27. Pan J-W, Zabidi MMA, Ng P-S, Meng M-Y, Hasan SN, Sandey B, et al. The molecular landscape of Asian breast cancers reveals clinically relevant population-specific differences. Nat Commun. (2020) 11:6433. doi: 10.1038/s41467-020-20173-5
- 28. Pipek O, Alpár D, Rusz O, Bödör C, Udvarnoki Z, Medgyes-Horváth A, et al. Genomic landscape of normal and breast cancer tissues in a Hungarian pilot cohort. Int J Mol Sci. (2023) 24:8553. doi: 10.3390/ijms24108553
- 29. Lakshmanan I, Ponnusamy MP, Das S, Chakraborty S, Haridas D, Mukhopadhyay P, et al. MUC16 induced rapid G2/M transition via interactions with JAK2 for increased proliferation and anti-apoptosis in breast cancer cells. Oncogene. (2012) 31:805–17. doi: 10.1038/onc.2011.297
- 30. Wang X, Guda C. Integrative exploration of genomic profiles for triple negative breast cancer identifies potential drug targets. Medicine. (2016) 95:e4321. doi: 10.1097/MD.0000000000004321
- 31. Felder M, Kapur A, Gonzalez-Bosquet J, Horibata S, Heintz J, Albrecht R, et al. MUC16 (CA125): Tumor biomarker to cancer therapy, a work in progress. Mol Cancer. (2014) 13:129. doi: 10.1186/1476-4598-13-129
- 32. Midha MK, Huang Y-F, Yang H-H, Fan T-C, Chang N-C, Chen T-H, et al. Comprehensive cohort analysis of mutational spectrum in early onset breast cancer patients. Cancers. (2020) 12:2089. doi: 10.3390/cancers12082089
- 33. Al Amri WS, Allinson LM, Baxter DE, Bell SM, Hanby AM, Jones SJ, et al. Genomic and expression analyses define MUC17 and PCNX1 as predictors of chemotherapy response in breast cancer. Mol Cancer Ther. (2020) 19:945–55. doi: 10.1158/1535-7163.MCT-19-0940
- 34. Machado GC, Ferrer VP. MUC17 mutations and methylation are associated with poor prognosis in adult-type diffuse glioma patients. J Neurological Sci. (2023) 452:120762. doi: 10.1016/j.jns.2023.120762
- 35. Oh J-H, Jang SJ, Kim J, Sohn I, Lee J-Y, Cho EJ, et al. Spontaneous mutations in the single TTN gene represent high tumor mutation burden. NPJ Genomic Med. (2020) 5:33. doi: 10.1038/s41525-019-0107-6
- 36. Marvalim C, Datta A, Lee SC. Role of p53 in breast cancer progression: An insight into p53 targeted therapy. Theranostics. (2023) 13:1421–42. doi: 10.7150/thno.81847
- 37. Dumay A, Feugeas J, Wittmer E, Lehmann-Che J, Bertheau P, Espié M, et al. Distinct tumor protein p53 mutants in breast cancer subgroups. Int J Cancer. (2013) 132:1227–31. doi: 10.1002/ijc.27767
- 38. Chen J, Sun J, Wang Q, Du Y, Cheng J, Yi J, et al. Systemic deficiency of PTEN accelerates breast cancer growth and metastasis. Front Oncol. (2022) 12:825484. doi: 10.3389/fonc.2022.825484
- 39. Tsuge S, Saberi B, Cheng Y, Wang Z, Kim A, Luu H, et al. Detection of novel fusion transcript VTI1A-CFAP46 in hepatocellular carcinoma. Gastrointestinal Tumors. (2019) 6:11–27. doi: 10.1159/000496795
- 40. Milne RL, Antoniou AC. Genetic modifiers of cancer risk for BRCA1 and BRCA2 mutation carriers. Ann Oncol. (2011) 22:i11–7. doi: 10.1093/annonc/mdq660
- 41. den Brok WD, Schrader KA, Sun S, Tinker AV, Zhao EY, Aparicio S, et al. Homologous recombination deficiency in breast cancer: A clinical review. JCO Precis. Oncol. (2017) 1:1–13. doi: 10.1200/PO.16.00031
- 42. Bodily WR, Shirts BH, Walsh T, Gulsuner S, King M-C, Parker A, et al. Effects of germline and somatic events in candidate BRCA-like genes on breast-tumor signatures. PLoS One. (2020) 15:e0239197. doi: 10.1371/journal.pone.0239197
- 43. Animesh S, Ren X, An O, Chen K, Lee SC, Yang H, et al. Exploring the neoantigen burden in breast carcinoma patients. (2022). doi: 10.1101/2022.03.03.482669
- 44. Morisaki T, Kubo M, Umebayashi M, Yew PY, Yoshimura S, Park J-H, et al. Neoantigens elicit T cell responses in breast cancer. Sci Rep. (2021) 11:13590. doi: 10.1038/s41598-021-91358-1
- 45. Alspach E, Lussier DM, Miceli AP, Kizhvatov I, DuPage M, Luoma AM, et al. MHC-II neoantigens shape tumour immunity and response to immunotherapy. Nature. (2019) 574:696–701. doi: 10.1038/s41586-019-1671-8