Abstract
In this study, we used whole-exome sequencing of a cohort of 45 advanced-stage, treatment-naïve Nigerian (NG) primary prostate cancer tumors and 11 unmatched nontumor tissues to compare genomic mutations with African American (AA) and European American (EA) The Cancer Genome Atlas (TCGA) prostate cancer. NG samples were collected from six sites in central and southwest Nigeria. After whole-exome sequencing, samples were processed using GATK best practices. BRCA1 (100%), BARD1 (45%), BRCA2 (27%), and PMS2(18%) had germline alterations in at least two NG nontumor samples. Across 111 germline variants, the AA cohort reflected a pattern [BRCA1 (68%), BARD1 (34%), BRCA2 (28%), and PMS2 (16%)] similar to NG samples. Of the most frequently mutated genes, BRCA1 showed a statistically (P ≤ 0.05) higher germline mutation frequency in men of African ancestry (MAA) and increasing variant frequency with increased African ancestry. Disaggregating gene-level mutation frequencies by variants revealed both ancestry-linked and NG-specific germline variant patterns. Driven by rs799917 (T>C), BRCA1 showed an increasing mutation frequency as African ancestry increased. BRCA2_rs11571831 was present only in MAA, and BRCA2_rs766173 was elevated in NG men. A total of 133 somatic variants were present in 26 prostate cancer–associated genes within the NG tumor cohort. BRCA2 (27%), APC (20%), ATM (20%), BRCA1 (13%), DNAJC6 (13%), EGFR (13%), MAD1L1 (13%), MLH1 (11%), and PMS2 (11%) showed mutation frequencies >10%. Compared with TCGA cohorts, NG tumors showed statistically significant elevated frequencies of BRCA2, APC, and BRCA1. The NG cohort variant pattern shared similarities (cosign similarities ≥0.734) with Catalogue of Somatic Mutations in Cancer signatures 5 and 6, and mutated genes showed significant (q < 0.001) gene ontology (GO) and functional enrichment in mismatch repair and non-homologous repair deficiency pathways. Here, we showed that mutations in DNA damage response genes were higher in NG prostate cancer samples and that a portion of those mutations correlate with African ancestry. Moreover, we identified variants of unknown significance that may contribute to population-specific routes of tumorigenesis and treatment. These results present the most comprehensive characterization of the NG prostate cancer exome to date and highlight the need to increase diversity of study populations.
Significance:
MAA have higher rates of prostate cancer incidence and mortality, however, are severely underrepresented in genomic studies. This is the first study utilizing whole-exome sequencing in NG men to identify West African ancestry-linked variant patterns that impact DNA damage repair pathways.
Introduction
For men, adenocarcinoma of the prostate is the most frequently diagnosed cancer, accounting, globally, for an estimated 1,414,259 cases and 375,304 deaths in 2020. The preponderance of this mortality is for men of African ancestry (MAA), including African American (AA), Central American, Caribbean, and Sub-Saharan African men (1). The Globocan 2020-derived mortality-to-incidence ratio, “an indirect description of the general survival experience” of prostate cancer in Africa, is 0.55, relative to 0.32 and 0.17 for Asia and Europe, respectively (2, 3). Furthermore, studies of AA men showed higher incidence, worse prognoses, and higher mortality compared with European American (EA) men (4, 5). Although there is a substantial contribution of social and environmental influence on the disparity, emerging evidence from genomic profiles suggests that this disease is highly heterogenous (6), and its etiology and phenotype are influenced by enrichment of African ancestral genetic markers, with West African ancestry linked with higher Gleason grade at diagnosis (7–9).
The racial disparity in prostate cancer biology is typically characterized by increased genomic mutations, resulting in a more aggressive phenotype. For instance, West African ancestry is associated with distinctive somatic genomic mutations (10, 11). Conversely, understanding these putatively targetable genomic mutations presents opportunities for effective population-relevant and genomics-guided interventions that can improve clinical outcomes. This relies on the availability of genomic data for tumors from MAA. However, despite an upsurge in genomic data for human cancers, the data on prostate cancer from African sources are grossly underrepresented in the literature and genomics databases. For instance, AA samples account for only about 10% of The Cancer Genome Atlas (TCGA) prostate cancer sample cohorts (11, 12). Consequently, this gross underrepresentation impedes deciphering of clinically actionable genomic mutations that could be used to develop precision interventions for MAA. Hence, it is imperative to increase the representation by sequencing the tumor genome of prostate cancer in MAA.
Although Black men in the Americas generally have ancestral roots in the Atlantic coasts of Africa (13–15), the translational impact of genomics data of AA men to indigenous Africans is limited because of the varying proportions of European-related and intra-African admixtures of African American (16, 17). Such admixture and variation in germline mutations influence gene expression and phenotype (18); and are limiting factors in the understanding of the contribution of genetics to health disparities (19). Hence, studying the genomic architecture of prostate cancer in the indigenous African population is essential for advancing understanding of the contribution of African genetics to prostate cancer biology and the phenotype of this disease in the African Diaspora. To date, inadequate attention has been given to generation of genomics studies of prostate cancer among indigenous Africans. Aside from three genome-wide association studies of prostate cancer in Ghanaian (20), Ugandan (21), and South African (22) men, only Jaratlerdsiri and colleagues (23) have reported whole-genome sequence data on tumors from indigenous sub-Saharan Africans. Their analysis of prostate cancer of six South African Black men identified distinctive and elevated oncogenic driver mutations, with a high proportion of these recurrent mutations appearing early in tumorigenesis. They also showed that tumors of the African men they studied had fewer complex genomic rearrangements, loss of PTEN, and absent ERG fusions and PIK3CA mutations relative to AAs. Furthermore, apart from large deletions within the BRCA2, DEFA1B, and MFF genes, they did not report any pathogenic mutations in high-penetrance genes, such as BRCA1, BRCA2, ATM, and CHK2 among the South African cohort (23, 24). Previous studies have identified, for prostate cancer of AAs, a high burden of mutations in these DNA repair genes (25, 26); suggesting that PARP inhibitors could improve clinical outcomes for men of African ancestry with prostate cancer (27). The South African study is limited by the small sample size. Furthermore, the differences observed could be due to the low contribution of South-African Khoe-San ancient ancestry genes to the AA genetic pool, which is a source of bias (16, 17). It is therefore our hypothesis that genomic profiling of prostate cancer in indigenous West African men will identify clinically actionable targets for precision intervention for MAA. The utility of clinical mutational profiling necessitates greater emphasis on identifying variants that can be used to help these underrepresented patients.
The aim of the current study was to analyze whole-exome sequencing (WES) of 45 Nigerian (NG) primary treatment-naïve formalin-fixed, paraffin-embedded (FFPE) prostate cancer and 11 NG nontumor prostate samples collected within the Prostate Cancer Transatlantic Consortium (CaPTC). Study of prostate cancer of NG men allowed us to provide genetic information of the indigenous West Africa population with the highest genetic contribution to AA men (16) and provide opportunities to investigate the shared genetic background of both groups for causal disease variants. As such, our data will be relevant for deriving actionable clinical information for prostate cancer intervention for MAA.
Materials and Methods
Sample Collection and Genomic DNA Extraction
This study utilized 45 FFPE advanced-stage, treatment-naïve primary prostate cancer and 11 NG nontumor prostate samples collected from four participating clinical sites within the CaPTC network (Fig. 1A). In accordance with the U.S. Common Rule, the archived samples used in this study were reviewed and approved by the Institutional Review Boards of their respective clinical institutions (University of Ilorin Teaching Hospital, National Hospital Abuja, Lagos State University Teaching Hospital, Federal Medical Centre, Ahmadu Bello University, and the University of Abuja) and by the Institutional Review Board at Tuskegee University (Tuskegee, AL). Because of the retrospective nature of this study and the usage of deidentified archived samples, the review boards deemed informed written consent to be unnecessary for this study. Five 10-μm-thick curls were obtained from each block with >50% tumor and ≤50% necrosis and shipped to Q2 Solutions for DNA extraction and quality analysis. Following the manufacturer's protocol, genomic DNA and total RNA were purified using Allprep DNA/RNA FFPE kits (Qiagen). DNA quality and quantity were checked with Qubit 2.0 fluorometry (Life Technologies) and with KAPA hgDNA quantification and QC kits [Kappa Biosystems (Roche)]. DNA quality and quantity thresholds were >0.2 μg and a Q129/Q41 ratio >0.00225, respectively.
FIGURE 1.
Sample collection sites and genetic admixture analysis. A, At clinical sites across Nigeria, 45 samples were collected. B, Admixture v1.3.0 was used to estimate ancestry proportions, based on reference populations from the 1000 Genomes Project phase III superpopulations. Prior to analysis, rare variants (i.e., <5% across all phase III 1000 genomes), all indels, and any SNPs that were not biallelic were removed. Samples within the CaPTC cohort had an average African proportion of 99.1%. TCGA Samples (n = 50) with >70% African ancestry were classified as AA; 402 TCGA samples contained >60% European admixture. Those samples were sorted by European proportion, and the top 50 samples were classified as EA and utilized in this study. The average European proportion of this group was 99.996%. C, Germline variants within the NG and TCGA cohorts were compared with phase III 1000 Genomes superpopulations using principal component analysis. NG samples strongly clustered with the African superpopulation. AAs samples clustered with the African superpopulation; European samples clustered with the European superpopulation.
Pathologic Scoring
The FFPE blocks were processed at the Pathology Biorepository Shared Service (PBSS) core at the University of Maryland, Baltimore (Baltimore, MD). For pathology review, an initial hematoxylin and eosin (H&E) slide was prepared from each block. A pathologist assessed the presence and quantity of tumor, and the presence and quantity of normal tissue. Up to two tumor and normal cores per H&E slide were circled, and the total number of cores per slide was recorded in a sample manifest. In addition, the pathologist provided the tumor core(s) Gleason score for each tumor circle. Those FFPE blocks with sufficient tumor and/or normal cores available were submitted to the PBSS research histology lab for core extraction. Using the corresponding H&E slide as a template, cores were extracted from each FFPE block by use of a manual tissue core extraction device with RNAse/DNAse-free conditions. The cores were placed in labeled RNAse/DNAse-free cryovials. A sample manifest for the extracted cores accompanied samples shipped to the NIH Laboratory of Human Carcinogenesis.
WES
As outlined in Supplementary Fig. S1, library preparation was performed using Agilent SureSelectXT Human All Exon V6 r2 Exome Kits (Agilent Technologies). Sequencing (2 × 150 bp) was performed on either an Illumina HiSeq4000 or on an Illumina NextSeq500 (Illumina), to a target of 100 (± 10) million raw data reads per each sample library. Following sequencing, raw fastq files were transferred to the NIH Biowulf supercomputing cluster and analyzed using the Center for Cancer Research Collaborative Bioinformatics Resource (CCBR) whole-exome pipeline (https://github.com/CCBR/Pipeliner). Reads were trimmed using Trimmomatic v0.33 (28) and mapped to the hs37d5 version of the human reference genome (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz) using BWA-MEM v07.17 (29). Binary Alignment Map (BAM) files were processed using Samtools v1.8 (http://www.htslib.org/; ref. 30), and Picard v1 (http://broadinstitute.github.io/picard/) was used to mark duplicates. GATK v3.8 (31) was used to perform indel realignment and base recalibration. Read and alignment-level quality analysis was performed using Qualimap v2.2.1 (32). Alignment quality metrics were analyzed and visualized using RStudio 1.2.5003 (http://www.rstudio.com) [R 3.6.3 (https://www.R-project.org)], the ggstatsplot v0.3.0 (https://indrajeetpatil.github.io/ggstatsplot/) package, and the compareGroups v4.2.0 (33) package.
Variant Calling and Filtration
Germline variant calling was completed using HaplotypeCaller (34), and Annovar v2019Oct24 was used for variant annotation. Variant filtration followed GATK Best practices. Cohort SNPs and indels were separated, filtered, and recombined for downstream analysis. SNP filters were Qual < 30.0, QD < 2.0, FS > 60.0, MQ < 40.0, MQRanksum < −12.5, and ReadPosRankSum < −8.0. Indel filters were Qual < 30.0, QD < 2.0, FS >200.0, and ReadPosRankSum < −20.0. Once the variants were recombined, cohort germline variants were compared with a ClinVar-derived list of known prostate cancer–associated germline variant regions (Supplementary Table S1) to target variants of known clinical importance. Germline variants with a read depth >3 and a variant allele frequency >50% were retained for downstream analysis (Supplementary Table S2). Somatic variant calling was completed using MuTect2 (35), and Annovar v2019Oct24 (36) was used for variant annotation. As described by Jones and colleagues, a single unmatched NG normal FFPE exome, sequenced using the same methods as for the tumor samples, was paired with each tumor exome to filter false-positive variant calls (37). Mutation tables were imported into RStudio for visualization and analysis using the maftools (ref. 38; v2.4.10) package. Variants were (i) screened for strand bias (Supplementary Fig. S2) using GATK FisherStrand Phred Score, (ii) separated into variants within known prostate cancer–associated genes and variants within novel prostate cancer–associated genes (Supplementary Table S3), using ClinVar and (iii) and filtered (Supplementary Fig. S3) using two filtering regimes. After filtering, retained variants within genes identified in ClinVar were considered “Known”; conversely, variants within genes not identified in ClinVar were called “Novel.” Filtering steps included (i) exclusion of silent and non–protein-coding mutations, (ii) variant allele read depth ≥3, (iii) variant allele frequency >5% (10% for variants within novel prostate cancer–associated genes), (iv) dbNSFP (39) Genome Aggregation Database (gnomAD; ref. 40) exome allele frequency <0.01 (<0.001 for variants within novel prostate cancer–associated genes and variants lacking allele frequencies were retained for downstream filtration), (v) identification as Pathogenic or Uncertain in ClinVar v 20200419 (41), (vi) removal of dbSNP-annotated variants identified in NG unmatched normal samples (n = 11; Supplementary Table S4), (vii) present in genes altered in least 5% of tumors, and (viii) manual validation in the Integrative Genomics Viewer (IGV; Supplementary Fig. S4). For genes mutated across at least five prostate cancer samples, Fishers exact test was used to compare cohort mutation frequencies. The test was completed using the maftools clinicalenrichment function. P values < 0.05 were considered significant.
TCGA PRAD Data Acquisition and Analysis
Access to TCGA Prostate adenocarcinoma (PRAD) data (Accession: phs000178.v11.p8) was obtained through the database of Genotypes and Phenotypes (dbGAP). Raw sequencing files in BAM format were downloaded through the Genomic Data Commons (GDC) data transfer tool from the GDC Data Portal (https://portal.gdc.cancer.gov/). After download, the raw files were sorted using Samtools and split into their constituent forward and reverse fastq files using bedtools v2.29 (42). Once separated, the fastq files were processed through the CCBR whole-exome pipeline and analyzed using the same methods as for NG CaPTC samples.
Genetic Admixture Estimation
To ensure accurate ancestral group assignment, HaplotypeCaller (43) and Admixture v1.3.0 (44) were used to estimate ancestry proportions, based on reference populations from the 1000 Genomes Project phase III superpopulations in all TCGA and NG samples (Supplementary Fig. S5). Rare variants [i.e., <5% across all phase III (45) 1000 genomes], all indels, and any SNPs that were not biallelic were removed prior to analysis. TCGA samples (n = 57) with majority African ancestry were classified as AAs. AAs are highly admixed, which in turn increases genetic variation (17, 46). Because our NG population has greater than 90% African Ancestry, we focused on AAs within TCGA that have at least 70% African admixture (n = 50) as a comparison group, which has been used in previous reports. Thus, the AA patients with TCGA that have African admixture at or above 70%, represent 87% of the total AA samples in the prostate cancer TCGA. The EA samples that at or above >60% European admixture represent 402 TCGA samples. Those samples were sorted by European proportion, and the top 50 samples were classified as EAs and utilized in this study. The average European proportion of this group was 99.996%.
COSMIC Signature Enrichment
Using the maftools package, filtered single-nucleotide variants across each cohort were used to estimate the representation of Catalogue of Somatic Mutations in Cancer (COSMIC; cancer.sanger.ac.uk; ref. 47) mutation signatures within each tumor sample. Maftools uses cophenetic correlation and nonnegative matrix factorization to determine the optimal number of SNP signatures (across the cohort), extracts those signatures, and compares them with the known (n = 30) COSMIC signatures.
Variant Functional Gene Ontology and Network Analyses
Filtered variants, present in at least two NG prostate cancer tumor samples, were imported into Cytoscape (48) (v. 3.7.2) to assess functional gene ontology enrichment and to visualize the GO term interaction network. Once separated, functional analysis and network construction were completed using the stringApp (ref. 49; v 1.6.0) and ClueGO (ref. 50; v. 2.5.7) plug-ins. A two-sided (enrichment/depletion) hypergeometric test with Bonferroni step-down was used to determine Reactome Pathways (v. 08.05.2020) enrichment. Analysis thresholds included enrichment significance of P ≤ 0.01, a minimum of 5% gene inclusion and a kappa score threshold of ≥0.4. ClueGO uses kappa scores to determine the likelihood of GO term interactions and groupings.
Data Availability
The data generated in this study are publicly available in dbGaP (Accession phs002547.v1.p1).
Results
To determine the somatic and germline variants associated with prostate cancer in NG men, we collected samples from multiple institutions within the CaPTC. Specifically, we collected 45 intermediate (Gleason scores 4 + 3) and high-grade (Gleason scores ≥4 + 3) tumors and 11 nontumor prostate samples. Samples averaged >68 × coverage, 57.2 mapping quality and 263 million mapped reads per sample (Supplementary Table S5). A total of 31 (20 tumor and 11 nontumor) samples were collected from Northern Nigeria), 17 samples from Central Nigeria, and eight samples from Southwest Nigeria (Supplementary Table S6). PRAD exome data were downloaded from TCGA, using the dbGAP database as a comparison cohort (Supplementary Table S7).
Race is a poor group classifier for linking genetic variation and disease causation (51); moreover, self-reported race can obscure genetic variation due to misunderstandings about family heritage, cultural influences, and/or other societal factors (52). To ensure that our NG and TCGA cohort comparisons were not skewed by bias within self-reported race, we quantified the individual genetic admixture within each patient sample. To accomplish this, germline SNPs were compared with 1000 genomes superpopulations (African, European, South Asian, East Asian, and admixed American), and ancestry proportion estimates were calculated (Fig. 1B). NG patients showed an average genetic ancestry of 99.1% African. The genetic ancestry of TCGA AA patients was predominantly a mixture of African (50.2%–99.99%) and European admixture (1%–43%). To reduce the impact of this variance on our analysis, we selected only TCGA AA patients with ≥70% African ancestry (n = 50). TCGA EA patients showed minimal admixture, with >98.3% European ancestry. Nine patients self-identified as EAs possessed ≤45% European ancestry. Four of the 9 patients were majority (>50%) admixed American, 2 were majority East Asian, 2 were majority African, and 1 was predominantly (45%) European with 35% admixed American and 16% African admixture. To obtain an EA comparison group, we sorted (high to low) the cohort by European ancestry proportion and selected the top 50 TCGA EA patients. After admixture estimation and sample selection, principal component analysis plots were used to visualize the relationships between each cohort and the five 1000 Genomes superpopulations (Fig. 1C). The NG and TCGA EA cohorts clustered with their ancestral 1000 genomes superpopulations, and TCGA AA cohort clustered with the African superpopulation. Thus, data for these patients were used in subsequent analyses.
The NG cohort harbored 31 known, non-benign, germline variants. Four genes [BRCA1 (100%), BARD1 (45%), BRCA2 (27%), and PMS2 (18%)] were altered in at least two samples (Fig. 2A). These genes also showed top mutation frequencies within both TCGA cohorts (Fig. 2B and C). Across 111 germline variants, the AA TCGA cohort reflected a pattern [BRCA1 (68%), BARD1 (34%), BRCA2 (28%), and PMS2 (16%)] similar to that for NG samples. In addition, the rate of BRCA1 mutations increased (P ≤ 0.021) as African admixture increased (Fig. 3A; Supplementary Table S8). A total of 126 germline variants were present in the EA TCGA cohort. Disaggregating mutation frequencies down to specific variants revealed both ancestry-linked and NG-specific germline variant patterns. BRCA1 showed an increasing mutation frequency as African admixture increased (Fig. 3A). That pattern was driven by three variants (rs799917, rs16941, and rs16942; Fig. 3B). The frequency of rs799917 was higher for men of African ancestry; rs16941 and rs16942 were lower. In esophageal squamous cell carcinoma, the BRCA1_rs799917 T>C SNP inhibits mir-638–mediated regulation of BRCA1, thus reducing BRCA1 expression and increasing cancer cell proliferation (53). This variant is also linked to a higher risk of gastric, lung, and triple-negative breast cancer (54–56). BRCA1_rs16941 and BRCA1_rs16942 are variants of unknown significance (VUS). BARD1 germline variant patterns appear to be specific to NG men. Compared with AA and EA cohorts, rs2070096 is lower and rs2070094 is higher. Of note, the BARD1_rs2070094 SNP resides within the BARD1-binding domain of BRCA1 and may provide a protective function that enhances DNA repair by enhancing BARD1/BRCA1 binding stability (57). BARD1_rs2070096 is a VUS. BRCA2 germline variants displayed both ancestry-linked and NG-specific patterns. rs11571831 was present only in men of African ancestry, and rs766173 was high in NG men. Both BRCA2 variants are classified as VUS. Although most variants in NG prostate cancer were identified as VUS, their presence and differing frequencies, compared with TCGA, provide opportunities for future investigations. Characterizing the full mutation spectrum is a first step in understanding how best to diagnose and treat this underrepresented patient population.
FIGURE 2.
Prostate cancer germline variant oncoplot. A, NG germline variants detected across 11 normal samples were filtered against known ClinVar cancer variants. In at least two tumor samples, four genes known to harbor cancer-related variants were mutated. These genes included BRCA1 (BRCA1 DNA repair associated)—100%, BARD1—45%, BRCA2 (DNA repair associated)—27%, and PMS2 (PMS1 homolog 2, mismatch repair system component)—18%. As a comparison with NG prostate cancer exome samples, TCGA prostate cancer samples (n = 50 AA and n = 50 EA) were downloaded through dbGAP and analyzed for genetic variants. B, In the AA cohort. eight genes showed germline mutations in at least two samples. C, In the EA cohort, 10 genes showed germline mutations in at least two samples.
FIGURE 3.
NG prostate cancer cohort germline mutations comparison with TCGA prostate cancer cohorts and lollipop plots. A, Prostate cancer of NG and AA men showed more BRCA1 germline mutations (P ≤ 0.001 and P ≤ 0.01, respectively) compared with European men. In addition, prostate cancer of NG men showed more (P ≤ 0.036) BRCA1 mutations relative to AA men. In prostate cancer of EA men, BARD1 was mutated at a higher rate (P ≤ 0.048). BRCA2 showed no significant difference in cohort mutation rates. AA men with greater than 90% African ancestry (n = 19) show a statistically significant (P ≤ 0.021) increase in BRCA1 germline mutations when compared with those with lower amounts of African ancestry (n = 29). In addition, AA men with 90% African ancestry have a statistically higher (P = 0.012) frequency of the BRCA1 variant rs799917. B, To disaggregate mutation rates down to specific variants, lollipop plots revealed a finer variation in cohort patterns. The ancestry-linked pattern of BRCA1 is driven by rs799917, which was more frequent (P ≤ 0.001) for men of African ancestry. rs16941 and rs16942 were elevated in prostate cancer of European men, but that difference was not statistically significant. BARD1 germline variants showed no significant difference in variant rates; however, the patterns appeared to be specific to NG men. Compared with AA and EA cohorts, rs2070096 was lower and rs2070094 was higher. BRCA2 germline variants displayed no statistically different variant rates, but both ancestry-linked and NG-specific patterns were discernable. rs11571831 was present only in prostate cancer of men of African ancestry, and rs766173 was elevated in prostate cancer of NG Men. P values were produced via two-sided Fisher exact test groupwise comparison.
Somatic variant analysis of tumor-only sequencing data involves multiple nontrivial steps that are distinct from the analysis of matched tumor and normal sequencing. Therefore, we used an established pipeline that incorporated a panel of normal samples. We used an unmatched NG normal sample to filter out NG-specific germline variants (37, 58), reducing the unique NG variants by 70.8% from 2,506,254 to 730,285 variants (Supplementary Fig. S6). Within TCGA cohorts, we used each sample's patient-matched normal, which produced 11,208 unique AA variants and 15,191 unique EA variants. Because the NG cohort contained many somatic variants, we employed two filtering regimes, one for variants within known prostate cancer–associated genes (as identified in ClinVar) and one for variants within genes not associated with prostate cancer. We identified 905 variants across 25 genes known to be associated with prostate cancer, and 156 variants across 51 novel prostate cancer genes. Using the same approach, we identified 15,854 variants in TCGA AA cohort and 21,957 variants in TCGA EA cohort (Supplementary Table S9). Consistent with other sequencing studies (59), our results showed the same racial mutation patterns for SPOP, ATM, TP53, and PIK3CA. TCGA cohorts did not show recurrent mutations in genes not associated with prostate cancer. Our dual filtering approach allowed us to filter, independently, each set of variants across the NG cohort without overfiltering variants within known prostate cancer–associated genes and to identify high-confidence variants in novel prostate cancer–associated genes.
Within the NG cohort, 133 somatic variants were present in 26 prostate cancer–associated genes. Nine genes [BRCA2 (27%), APC (20%), ATM (20%), BRCA1 (13%), DNAJC6 (13%), EGFR (13%), MAD1L1 (13%), MLH1 (11%), and PMS2 (11%)] showed mutation frequencies >10% (Fig. 4A). Of NG prostate cancer, 53% showed mutations in genes (BRCA2, ATM, BRCA1, CHEK2, TP53, and MSH6) associated with genome integrity. TCGA AA and EA cohorts harbored 67 and 73 somatic variants, respectively. Across both cohorts, fifteen genes were mutated in at least two samples (SPOP, ATM, TP53, BRAF, MED12, PIK3CA, CTNNB1, EGFR, FLCN, MYH7, PTEN, and TTN). SPOP and ATM were the most frequently mutated genes in AA tumors and were mutated two times more compared with EA. Comparison of the mutation frequencies between TCGA cohorts did not show any statistically significant differences; however, AA tumors showed a significant increase in SPOP mutations compared with NG (Fig. 4B). BRCA2, APC, and BRCA1 showed statistically significant increases in the NG cohort. Though not statistically significant, ATM had the highest mutation frequency associated with increasing African ancestry; specifically TCGA EAs had an ATM mutation frequency of 4%, but TCGA AA and NG cohorts had rates of 8% and 20%, respectively. Somatic mutations for NGs and AAs were distributed across the amino acid sequence of the most mutated genes. None of the variants were shared across or within cohorts. (Fig. 4C). Comparison of EA and NG cohorts showed no discernible pattern (Supplementary Fig. S7); however, the EA cohort did show a number of variants within a known SPOP hotspot.
FIGURE 4.
Prostate cancer somatic variants within known prostate cancer–associated genes. A, Variant calling within the NG cohort (n = 45) produced 1,168,250 variants. A total of 25 genes were known to be associated with prostate cancer harbored variants in at least two tumor samples. The most frequently mutated of these included BRCA2 (BRCA2 DNA repair associated)—27%, APC (APC regulator of WNT signaling pathway)—20%, ATM (ATM serine/threonine kinase)—20%, BRCA1 (BRCA1 DNA repair associated)—13%, and DNAJC6 [DnaJ heatshock protein family (Hsp40) member C6]—13%. As a comparison with NG prostate cancer exome samples, TCGA prostate cancer samples (n = 50 AA and n = 50 EA) were downloaded through dbGAP and analyzed for genetic variants. B, Prostate cancer of NG men showed a significant (P ≤ 0.01) elevation in BRCA2 somatic mutations compared with African and EA men. A significant increase was also evident for APC (P ≤ 0.05) and BRCA1 (P ≤ 0.05). Compared with NG men, prostate cancer of AAs were elevated (P ≤ 0.05). NG and AA men also showed higher, but not significant, mutation frequencies of ATM, MED12, and BRAF. C, Somatic mutations for NGs and AAs were distributed across the amino acid sequence of the most mutated genes. None of the variants were shared across or within cohorts. D and E, SNPs in the NG prostate cancer cohort were compared with known cancer-related mutation signatures within the COSMIC. A total of 89% of NG prostate cancer mutation patterns were similar (cosign similarity ≥0.796) with COSMIC signatures 6. The remaining 11% were more like COSMIC 5. F, Mutated genes (n = 83) present in at least two NG prostate cancer tumor samples (n = 45) were imported into Cytoscape to assess functional gene ontology enrichment and visualize the GO term interaction network, using Kyoto Encyclopedia of Genes and Genomes pathways. Variants showed significant (q ≤ 0.000538) GO and functional enrichment across multiple GO groups, including mismatch repair, homologous recombination, prostate cancer, and several cancer-related signaling pathways.
In addition to the variants within known prostate cancer driver genes, we identified four novel mutated genes that showed mutation frequencies >10%. CACNA2D2 had the highest mutation rate of 29% (Supplementary Fig. S8A) and showed a recurrent (n = 13) missense SNP of Leu54Phe (rs569543350; Supplementary Fig. S8B). TTN (Titin) had the second highest mutation frequency at 20%. The size of this large protein (>30,000 amino acids) renders it more susceptible to DNA repair errors, making the functional significance of these mutations unreliable, even after rigorous variant filtering (60–63). SYNE1 was the third most frequently mutated gene (16%) in the novel prostate cancer set. This gene showed a recurrent (n = 2) missense SNP of Gln1491Glu. The fourth most mutated gene was ADAMTS2. This gene showed a recurrent (n = 4) in-frame insertion of Leu_Pro29dup and an overall mutation frequency of 13%. Finally, 47 other genes not known to be associated with prostate cancer were mutated in 2 or more patients; however, we did not characterize these due to their low mutation frequencies.
To validate the observed germline variant frequencies, we analyzed NG tumors using the germline variant pipeline (Supplementary Fig. S9). NG tumor samples not only possessed comparable rates of gene-level variation [BRCA1 (100%), BARD1 (41%), and BRCA2 (18%)] but also showed comparable variant frequencies.
We next investigated the overall mutational patterns within each cohort to understand global somatic events. The NG cohort shared similarities (cosign similarities ≥0.734) with COSMIC signatures 5 and 6 (Fig. 4D). Five NG cohort samples had a mutational pattern similar (cosign similarity ≥0.734) to COSMIC 5. Forty cohort samples were similar (cosign similarity ≥0.796) to COSMIC 6 (Fig. 4E). TCGA AA mutational patterns shared similarities with COSMIC 1 and 5 (cosign similarities ≥0.645), and TCGA EA tumors shared similarities with COSMIC 1, 3, and 4 (cosign similarities ≥0.481; Supplementary Fig. S10). Within the COSMIC database, mutational signatures 1, 5, and 6 were the signatures most often observed in prostate cancer.
To determine the mechanism associated with tumorigenesis in NG prostate cancer tumors, mutated genes present in at least 2 NG patients were analyzed for functional gene ontology enrichment. NG tumors showed significant (q < 0.001) GO and functional enrichment in mismatch repair and non-homologous repair deficiency pathways (Fig. 4F; Supplementary Table S10). Additional enriched pathways included PD-1 checkpoint, thyroid hormone signaling, FOXO signaling, ErB2 signaling, adherens junctions, proteoglycans, and sphingolipids.
Discussion
This study is the first to perform WES of advanced-stage, treatment-naïve primary tumors from NG men with prostate cancer. We analyzed the genomes of 45 tumors and 11 normal NG prostate samples. Because most AAs in the United States have majority West African Ancestry, we assessed genetic admixture on comparison datasets from TCGA, which contains data on AA and EA patients with prostate cancer. We identified ancestry-linked germline and somatic mutation frequencies in DNA damage repair genes (BRCA1, BRCA2, APC, and ATM), as well as three novel prostate cancer–associated genes (CACNA2D2, SYNE1, and ADAMTS2). Mutations in DNA damage repair pathways are involved in prostate cancer development and progression and are clinically targetable (64). Because men of African ancestry are severely underrepresented in genomic studies, our findings address a gap in the contribution of genetic variation to the incidence of prostate cancer and aggressiveness of the disease in MAA. Furthermore, these findings encourage us to identify clinically targetable sites to close the gap in health-related disparities.
We observed a high level of BRCA1 germline mutation in prostate cancer of NG and men of African ancestry. The high rate is driven by the variant BRCA1_rs799917 T>C, which enhances disease risk in triple-negative breast (55), gastric (56), esophageal squamous cell (53), and lung (54) cancers. Results of meta-analyses, however, suggest that this variant is nonpathogenic (65–67). Because none of these studies included patients of African ancestry, the impact of this variant on that population remains poorly explored. BRCA1_rs799917 T>C alters the coding sequence of BRCA1, lowering BRCA1 expression by inhibiting its interaction with miR-638 (68). In addition, because BRCA1 is a DNA damage repair gene, it has been reported to upregulate the expression of multiple antioxidant genes and oxidoreductases, balancing cellular redox (69). Lower expression of BRCA1, skews this balance, which results an increase in DNA-damaging reactive oxygen species (ROS). AA men have been shown to possess lower mtDNA content, which can lead to enhanced ROS production and mitochondrial dysfunction (70). Coupled together, lower BRCA1 expression and increased ROS production can lead to an accumulation of mutated DNA, which enhances tumorigenesis. BRCA1_rs799917 T>C has not previously been associated with prostate cancer; however, BRCA1 germline mutations contribute to increased prostate cancer risk (71) and are associated with higher prostate cancer aggression and poorer outcomes (72). We found this trend in AA prostate cancer, which have higher frequencies of germline BRCA1 VUS (73). This pattern is also evident in both NG and AA breast cancer tumors (74–76). Thus, our exome analysis provides evidence of distinctive germline BRCA1 mutations in prostate cancer of patients with African Ancestry.
Our analysis also showed that, in their DNA damage response (DDR) genes, NG prostate cancer tumors have higher somatic variant rates, with 53% of NG tumors having at least one somatic DDR gene mutation. NG tumors demonstrated increases in BRCA2, APC, and BRCA1 mutations and SNP patterns associated with defective DNA mismatch repair. In addition, these tumors contained mutated genes that had significant gene ontology and functional enrichment across multiple GO groups, including mismatch repair and homologous recombination (HR) signaling pathways. Pathogenic mutations in DDR genes are prevalent in advanced-stage, localized prostate cancer, especially affecting genes responsible for repair by HR (77). Of note, BRCA2 facilitates the formation of RAD51 (RAD51 Recombinase) filaments, which are necessary for HR (78). The clinical implications of somatic BRCA2 mutations are poorly understood; however, all BRCA2 mutations (germline and somatic) are understood to destabilize HR, increase tumor aggression, and contribute to poor patient outcomes (79). DNA repair signalling is often impaired in cancer cells (80); however, this impairment is more substantial for MAA. Yadav and colleagues observed increased somatic mutations of BRCA2, BRCA1, and ATM in AA prostate cancer (26). AAs had a 1.24- to 2.16-fold increase in BRCA2, BRCA1, and ATM, as compared with EAs. The innate impairment of DNA repair in cancer cells leads to a dependence on alternative repair pathways that can be therapeutically exploited. Taking our findings, which are in line with other reports for AA men, the recent FDA approval of PARP inhibitors (specifically olaparib) and other therapies such as platinum drugs, ATR inhibitors, CHEK1/2 inhibitors, and radiotherapy may be useful for men of African Ancestry (81, 82).
Our analyses identified, in addition to somatic mutations in DDR genes, somatic mutations in novel prostate cancer–associated genes. Of NG tumors, 27% (n = 13) contained a recurrent CACNA2D2 missense SNP of Leu54Phe (rs569543350). Both ClinVar and dbSNP designate this variant as having “unknown significance.” CACNA2D2 modulates the expression of functional calcium channels (83), which contribute to cancer development (84). Compared with noncancerous prostate tissue, CACNA2D2 is expressed higher in prostate cancer tissue and can increase tumor proliferation and angiogenesis (85). SYNE1 encodes a multi-isomeric protein that participates in connecting the nuclear envelope to the cytoskeleton. This connection is necessary for proper nuclear movement and positioning, and for cellular migration (86). Abnormal nuclear envelope structure, a feature of cancer, is thought to contribute to tumorigenesis (87). Mutations in SYNE1 are linked to several human cancers (88). ADAMTS2 encodes a procollagen N-proteinase that is necessary for collagen fibril assembly (89). The role of ADAMTS2 in cancer remains poorly understood; however, the impact of collagen metabolism is well characterized (90, 91). Collagen is a structural component of the extracellular matrix, and its metabolism can affect tumor development, tumor tissue stiffness, metastasis, and treatment response. The presence of these novel prostate cancer–associated mutations is unsurprising. Non-European populations have higher rates of VUS (92–94). This is a result of the lack of diversity in research and (by extension) genomic databases (95). Non-European populations are underrepresented in research and genomic databases, which skews genomic annotations away from identifying clinically relevant variants in these populations. Further investigation into these novel mutations may expose alternative routes of disease aggression for MAA and mitigate the disparities.
These findings present the most comprehensive characterization of the NG prostate cancer exome to date and highlight the need to increase study population diversity. Although clinical genomics is a powerful tool to guide clinical interventions, the lack of non-European patients limits the capacity of these advancements to benefit men of African ancestry. Furthermore, the high level of genetic diversity within African men necessities the need for larger cohort studies to identify population-specific, recurrent mutations that contribute to prostate carcinogenesis.
Although our results are compelling, the study has some limitations. The high level of genetic diversity in Africans coupled with the lack of matched normal samples increased the number of detected somatic variants, requiring us to aggressively filter variants to limit the possibility of false positives. This conservative approach, although necessary, provides the possibility that we inadvertently excluded some true somatic variants. Moreover, the use of a nonmatched NG normal was employed to further filter out germline variants within somatic calls. These strategies reduced our ability to resolve ethnic and geographical differences within our NG cohort and required that we conduct a supervised analysis of variants within known prostate cancer driver genes. Second, we set a 70% African ancestry threshold for AA samples, (i) to make sure our AA and European TCGA comparison groups were as distinct as possible and (ii) to enrich the AA cohort for African ancestry. Setting this threshold limits the ability to determine the prevalence of the observed variant patterns in more admixed AA samples; however, a review of the available literature shows that the majority of AAs have at least that amount (96, 97). Finally, the limited representation of African patients within genomic databases and genomic research reduces our ability to determine the larger population-specific distribution of these findings. To our knowledge, this is the largest prostate cancer exome study of NG men, and, despite the limitations, provides a robust characterization of the somatic landscape within NG prostate cancer.
Supplementary Material
Supplementary Table 1. Known Germline Variants. Supplementary Table 2. Germline Variants. Supplementary Table 3. Known Somatic Variants. Supplementary Table 4. Somatic Vars in NG Normal. Supplementary Table 5. NG WES QC Metrics. Supplementary Table 6. NG Cohort Description. Supplementary Table 7. TCGA Cohort Description. Supplementary Table 8. AA Germ Var Frequency. Supplementary Table 9. Somatic Variants. Supplementary Table 10. NG Somatic ClueGo Results.
Supplementary Figure 1. WES Analysis Workflow. Supplementary Figure 2. Variant Strand Bias. Supplementary Figure 3. Somatic Variant Filtering. Supplementary Figure 4. Manual Somatic Variant Inspection. Supplementary Figure 5. Genetic Admixture Analysis. Supplementary Figure 6. Comparison of somatic variant count based on normal sample usage. Supplementary Figure 7. TCGA EA Somatic Variant Lollipop Plots. Supplementary Figure 8. NG PCa Novel Somatic Variants. Supplementary Figure 9. Nigerian Germline Variant – Normal/Tumor Comparison Lollipop Plots. Supplementary Figure 10. TCGA PCa Somatic Variant COSMIC Signature Analysis.
Acknowledgments
We thank all members of the Prostate Cancer Translatlantic Consoritium (including our community partners) and members of each respective lab that contributed to this study. We also thank members of the CCR Collaborative Bioinformatics Resource group (Mayank Tandon and Skyler Kuhn), Justin Lack of the NIAID Collaborative Bioinformatics Resource group, and numerous staff within the NIH High-Performance Computing group. This work was supported by funding from Leidos contract YT16-010 (Project ID# 001.050.0010); NCI contract HHSN261201600732P (University of Florida); NCI grants U54-MD007585-26 (NIH/NIMHD) and U54 CA118623 (NIH/NCI); and Department of Defense Grant (PC170315P1, W81XWH-18-1-0589) awarded to C. Yates.
Footnotes
Note: Supplementary data for this article are available at Cancer Research Communications Online (https://aacrjournals.org/cancerrescommun/).
Authors’ Disclosures
F. Mohammed reports other from BIO Ventures for Global Health outside the submitted work. M. Davis reports grants from Weill Cornell Prostate SPORE during the conduct of the study; grants from Genentech and grants and non-financial support from NIH outside the submitted work. P. Polak reports other from C2i Genomcis outside the submitted work. C. Yates reports other from Leidos contract YT16-010 (Project ID# 001.050.0010), NCI contracts HHSN261201600732P (University of Florida); grants from U54-MD007585-26, U54 CA118623 (NIH/NCI), and PC170315P1, W81XWH-18-1-0589) awarded to C. Yates during the conduct of the study; personal fees from Riptide Biosciences, QED Therapeutics, and Amgen and other from Riptide Biosciences outside the submitted work. No disclosures were reported by the other authors.
Authors’ Contributions
J.A. White: Data curation, software, formal analysis, supervision, validation, investigation, visualization, methodology, writing-original draft, project administration, writing-review and editing. E.T. Kaninjing: Data curation, writing-review and editing. K.A. Adeniji: Conceptualization, resources, writing-review and editing. P. Jibrin: Conceptualization, resources, writing-review and editing. J.O. Obafunwa: Conceptualization, resources, writing-review and editing. C.N. Ogo: Conceptualization, resources, writing-review and editing. F. Mohammed: Conceptualization, resources, writing-review and editing. A. Popoola: Conceptualization, resources, writing-review and editing. O.A. Fatiregun: Conceptualization, resources, writing-review and editing. O.P. Oluwole: Conceptualization, resources, writing-review and editing. B. Karanam: Resources, data curation, writing-review and editing. I. Elhussin: Methodology, writing-review and editing. S. Ambs: Resources, software, supervision, methodology, writing-review and editing. W. Tang: Resources, investigation, methodology, writing-review and editing. M. Davis: Investigation, methodology, writing-review and editing. P. Polak: Investigation, methodology, writing-review and editing. M.J. Campbell: Methodology, writing-review and editing. K.R. Brignole: Data curation, supervision, writing-review and editing. S.O. Rotimi: Investigation, methodology, writing-original draft, writing-review and editing. W. Dean-Colomb: Investigation, writing-review and editing. F.T. Odedina: Conceptualization, resources, data curation, funding acquisition, project administration, writing-review and editing. D.N. Martin: Data curation, funding acquisition, writing-review and editing. C. Yates: Conceptualization, resources, supervision, funding acquisition, validation, investigation, visualization, methodology, writing-original draft, project administration, writing-review and editing.
References
- 1. Ferlay J, Colombet M, Soerjomataram I, Mathers C, Parkin DM, Piñeros M, et al. Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int J Cancer 2019;144:1941–53. [DOI] [PubMed] [Google Scholar]
- 2. Jensen OM, Storm HH. Cancer registration: principles and methods. Reporting of results. IARC Sci Publ 1991;108–25. [PubMed] [Google Scholar]
- 3. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209–49. [DOI] [PubMed] [Google Scholar]
- 4. Chornokur G, Dalton K, Borysova ME, Kumar NB. Disparities at presentation, diagnosis, treatment, and survival in African American men, affected by prostate cancer. Prostate 2011;71:985–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Pietro GD, Chornokur G, Kumar NB, Davis C, Park JY. Racial differences in the diagnosis and treatment of prostate cancer. Int Neurourol J 2016;20:S112–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Hayes VM, Jaratlerdsiri W, Bornman MSR. Prostate cancer genomics and racial health disparity. Oncotarget 2018;9:36650–1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Grizzle WE, Kittles RA, Rais-Bahrami S, Shah E, Adams GW, DeGuenther MS, et al. Self-Identified African Americans and prostate cancer risk: West African genetic ancestry is associated with prostate cancer diagnosis and with higher Gleason sum on biopsy. Cancer Med 2019;8:6915–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Irizarry-Ramírez M, Kittles RA, Wang X, Salgado-Montilla J, Nogueras-González GM, Sánchez-Ortiz R, et al. Genetic ancestry and prostate cancer susceptibility SNPs in Puerto Rican and African American men. Prostate 2017;77:1118–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Batai K, Hooker S, Kittles RA. Leveraging genetic ancestry to study health disparities. Am J Phys Anthropol 2021;175:363–75. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Oak N, Cherniack AD, Mashl RJ, TCGA Analysis Network, Hirsch FR, Ding L, et al. Ancestry-specific predisposing germline variants in cancer. Genome Med 2020;12:51. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Yuan J, Hu Z, Mahal BA, Zhao SD, Kensler KH, Pi J, et al. Integrated analysis of genetic ancestry and genomic alterations across cancers. Cancer Cell 2018;34:549–60. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas pan-cancer analysis project. Nat Genet 2013;45:1113–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Patin E, Lopez M, Grollemund R, Verdu P, Harmant C, Quach H, et al. Dispersals and genetic adaptation of Bantu-speaking populations in Africa and North America. Science 2017;356:543–6. [DOI] [PubMed] [Google Scholar]
- 14. Salas A, Carracedo A, Richards M, Macaulay V. Charting the ancestry of African Americans. Am J Hum Genet 2005;77:676–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Zakharia F, Basu A, Absher D, Assimes TL, Go AS, Hlatky MA, et al. Characterizing the admixed African ancestry of African Americans. Genome Biol 2009;10:R141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Micheletti SJ, Bryc K, Ancona Esselmann SG, Freyman WA, Moreno ME, Poznik GD, et al. Genetic consequences of the transatlantic slave trade in the Americas. The Am J Hum Genet 2020;107:265–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Stefflova K, Dulik MC, Barnholtz-Sloan JS, Pai AA, Walker AH, Rebbeck TR. Dissecting the within-Africa ancestry of populations of African descent in the Americas. PLoS One 2011;6:e14495. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Gay NR, Gloudemans M, Antonio ML, Abell NS, Balliu B, Park Y, et al. Impact of admixture and ancestry on eQTL analysis and GWAS colocalization in GTEx. Genome Biol 2020;21:233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Kim MS, Patel KP, Teng AK, Berens AJ, Lachance J. Genetic disease risks can be misestimated across global populations. Genome Biol 2018;19:179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Cook MB, Wang Z, Yeboah ED, Tettey Y, Biritwum RB, Adjei AA, et al. A genome-wide association study of prostate cancer in West African men. Hum Genet 2014;133:509–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Du Z, Lubmawa A, Gundell S, Wan P, Nalukenge C, Muwanga P, et al. Genetic risk of prostate cancer in Ugandan men. Prostate 2018;78:370–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Petersen DC, Jaratlerdsiri W, van Wyk A, Chan EKF, Fernandez P, Lyons RJ, et al. African KhoeSan ancestry linked to high-risk prostate cancer. BMC Med Genet 2019;12:82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Jaratlerdsiri W, Chan EKF, Gong T, Petersen DC, Kalsbeek AMF, Venter PA, et al. Whole-genome sequencing reveals elevated tumor mutational burden and initiating driver mutations in African men with treatment-naïve, high-risk prostate cancer. Cancer Res 2018;78:6736–46. [DOI] [PubMed] [Google Scholar]
- 24. Hayes VM, Bornman MSR. Prostate cancer in Southern Africa: does Africa hold untapped potential to add value to the current understanding of a common disease? J Glob Oncol 2018;4:1–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Beebe-Dimmer JL, Zuhlke KA, Johnson AM, Liesman D, Cooney KA. Rare germline mutations in African American men diagnosed with early-onset prostate cancer. Prostate 2018;78:321–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Yadav S, Anbalagan M, Baddoo M, Chellamuthu VK, Mukhopadhyay S, Woods C, et al. Somatic mutations in the DNA repairome in prostate cancers in African Americans and Caucasians. Oncogene 2020;39:4299–311. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. de Bono J, Mateo J, Fizazi K, Saad F, Shore N, Sandhu S, et al. Olaparib for metastatic castration-resistant prostate cancer. N Engl J Med 2020;382:2091–102. [DOI] [PubMed] [Google Scholar]
- 28. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 2014;30:2114–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v1 [q-bio.GN]; 2013.
- 30. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics 2009;25:2078–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010;20:1297–303. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. García-Alcalde F, Okonechnikov K, Carbonell J, Cruz LM, Götz S, Tarazona S, et al. Qualimap: evaluating next-generation sequencing alignment data. Bioinformatics 2012;28:2678–9. [DOI] [PubMed] [Google Scholar]
- 33. Subirana I, Sanz H, Vila J. Building Bivariate Tables: The compareGroups Package for R. J Stat Softw 2014;57(12): 1–16.25400517 [Google Scholar]
- 34. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 2011;43:491–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 2010;38:e164. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37. Jones S, Anagnostou V, Lytle K, Parpart-Li S, Nesselbush M, Riley DR, et al. Personalized genomic analyses for cancer mutation discovery and interpretation. Sci Transl Med 2015;7:283ra53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. Mayakonda A, Lin D-C, Assenov Y, Plass C, Koeffler HP. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res 2018;28:1747–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39. Liu X, Wu C, Li C, Boerwinkle E. dbNSFP v3.0: a one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Hum Mutat 2016;37:235–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40. Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 2016;536:285–91. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41. Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res 2018;46:D1062–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010;26:841–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics 2013;43:11.10.11–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 2009;19:1655–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature 2015;526:68–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Korunes KL, Goldberg A. Human genetic admixture. PLoS Genet 2021;17:e1009374. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res 2019;47:D941–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498–504. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Doncheva NT, Morris JH, Gorodkin J, Jensen LJ. Cytoscape stringapp: network analysis and visualization of proteomics data. J Proteome Res 2019;18:623–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 2009;25:1091–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51. Collins FS. What we do and don't know about ‘race’, ‘ethnicity’, genetics and health at the dawn of the genome era. Nat Genet 2004;36:S13–5. [DOI] [PubMed] [Google Scholar]
- 52. Shraga R, Yarnall S, Elango S, Manoharan A, Rodriguez SA, Bristow SL, et al. Evaluating genetic ancestry and self-reported ethnicity in the context of carrier screening. BMC Genet 2017;18:99. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53. Zhang X, Wei J, Zhou L, Zhou C, Shi J, Yuan Q, et al. A functional BRCA1 coding sequence genetic variant contributes to risk of esophageal squamous cell carcinoma. Carcinogenesis 2013;34:2309–13. [DOI] [PubMed] [Google Scholar]
- 54. Liu D, Gao Y, Li L, Chen H, Bai L, Qu Y, et al. Single nucleotide polymorphisms in breast cancer susceptibility gene 1 are associated with susceptibility to lung cancer. Oncol Lett 2021;21:424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55. Shi M, Ma F, Liu J, Xing H, Zhu H, Yu J, et al. A functional BRCA1 coding sequence genetic variant contributes to prognosis of triple-negative breast cancer, especially after radiotherapy. Breast Cancer Res Treat 2017;166:109–16. [DOI] [PubMed] [Google Scholar]
- 56. Wang K, Xu L, Pan L, Xu K, Li G. The functional BRCA1 rs799917 genetic polymorphism is associated with gastric cancer risk in a Chinese Han population. Tumour Biol 2015;36:393–7. [DOI] [PubMed] [Google Scholar]
- 57. Liu X, Zhang X, Chen Y, Yang X, Xing Y, Ma L. Association of three common BARD1 variants with cancer susceptibility: a system review and meta-analysis. Int J Clin Exp Med 2015;8:311–21 [PMC free article] [PubMed] [Google Scholar]
- 58. Koboldt DC. Best practices for variant calling in clinical sequencing. Genome Med 2020;12:91. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59. Koga Y, Song H, Chalmers ZR, Newberg J, Kim E, Carrot-Zhang J, et al. Genomic profiling of prostate cancers from men with African and European ancestry. Clin Cancer Res 2020;26:4651–60. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60. Cannataro VL, Gaffney SG, Townsend JP. Effect sizes of somatic mutations in cancer. J Natl Cancer Inst 2018;110:1171–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61. Greenman C, Stephens P, Smith R, Dalgliesh GL, Hunter C, Bignell G, et al. Patterns of somatic mutation in human cancer genomes. Nature 2007;446:153–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62. Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 2013;499:214–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63. Tan H, Bao J, Zhou X. Genome-wide mutational spectra analysis reveals significant cancer-specific heterogeneity. Sci Rep 2015;5:12566. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Brown JS, O'Carrigan B, Jackson SP, Yap TA. Targeting DNA repair in cancer: beyond parp inhibitors. Cancer Discov 2017;7:20–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Qin T-tChen T, Zhang Q, Du H-nShu Y-qLuo K, et al. Association between BRCA1 rs799917 polymorphism and breast cancer risk: A meta-analysis of 19,878 subjects. Biomed Pharmacother 2014;68:905–10. [DOI] [PubMed] [Google Scholar]
- 66. Xu G-P, Zhao Q, Wang D, Xie W-Y, Zhang L-J, Zhou H, et al. The association between BRCA1 gene polymorphism and cancer risk: a meta-analysis. Oncotarget 2018;9:8681–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67. Yang M, Du X, Zhang F, Yuan S. Association between BRCA1 polymorphisms rs799917 and rs1799966 and breast cancer risk: a meta-analysis. J Int Med Res 2019;47:1409–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68. Nicoloso MS, Sun H, Spizzo R, Kim H, Wickramasinghe P, Shimizu M, et al. Single-nucleotide polymorphisms inside microRNA target sites influence tumor susceptibility. Cancer Res 2010;70:2789–98. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69. Bae I, Fan S, Meng Q, Rih JK, Kim HJ, Kang HJ, et al. BRCA1 induces antioxidant gene expression and resistance to oxidative stress. Cancer Res 2004;64:7893–909. [DOI] [PubMed] [Google Scholar]
- 70. Xiao J, Cohen P, Stern MC, Odedina F, Carpten J, Reams R. Mitochondrial biology and prostate cancer ethnic disparity. Carcinogenesis 2018;39:1311–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71. Grindedal EM, Møller P, Eeles R, Stormorken AT, Bowitz-Lothe IM, Landro SM, et al. Germ-line mutations in mismatch repair genes associated with prostate cancer. Cancer Epidemiol Biomarkers Prev 2009;18:2460–7. [DOI] [PubMed] [Google Scholar]
- 72. Castro E, Goh C, Olmos D, Saunders E, Leongamornlert D, Tymrakiewicz M, et al. Germline BRCA mutations are associated with higher risk of nodal involvement, distant metastasis, and poor survival outcomes in prostate cancer. J Clin Oncol 2013;31:1748–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73. Ledet EM, Burgess EF, Sokolova AO, Jaeger EB, Hatton W, Moses M, et al. Comparison of germline mutations in African American and Caucasian men with metastatic prostate cancer. Prostate 2021;81:433–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74. Fackenthal JD, Zhang J, Zhang B, Zheng Y, Hagos F, Burrill DR, et al. High prevalence of BRCA1 and BRCA2 mutations in unselected Nigerian breast cancer patients. Int J Cancer 2012;131:1114–23. [DOI] [PubMed] [Google Scholar]
- 75. Pitt JJ, Riester M, Zheng Y, Yoshimatsu TF, Sanni A, Oluwasola O, et al. Characterization of Nigerian breast cancer reveals prevalent homologous recombination deficiency and aggressive molecular features. Nat Commun 2018;9:4181. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76. Ricks-Santi L, McDonald JT, Gold B, Dean M, Thompson N, Abbas M, et al. Next generation sequencing reveals high prevalence of BRCA1 and BRCA2 variants of unknown significance in early-onset breast cancer in African American women. Ethn Dis 2017;27:169–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77. Marshall CH, Fu W, Wang H, Baras AS, Lotan TL, Antonarakis ES. Prevalence of DNA repair gene mutations in localized prostate cancer according to clinical and pathologic features: association of Gleason score and tumor stage. Prostate Cancer Prostatic Dis 2019;22:59–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78. Prakash R, Zhang Y, Feng W, Jasin M. Homologous recombination and human health: the roles of BRCA1, BRCA2, and associated proteins. Cold Spring Harb Perspect Biol 2015;7:a016600. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79. Nombela P, Lozano R, Aytes A, Mateo J, Olmos D, Castro E. BRCA2 and other DDR genes in prostate cancer. Cancers 2019;11:352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80. Jeggo PA, Löbrich M. How cancer cells hijack DNA double-strand break repair pathways to gain genomic instability. Biochem J 2015;471:1–11. [DOI] [PubMed] [Google Scholar]
- 81. Choi M, Kipps T, Kurzrock R. ATM mutations in cancer: therapeutic implications. Mol Cancer Ther 2016;15:1781–91. [DOI] [PubMed] [Google Scholar]
- 82. Ma J, Setton J, Morris L, Carrillo Albornoz PB, Barker C, Lok BH, et al. Genomic analysis of exceptional responders to radiotherapy reveals somatic mutations in ATM. Oncotarget 2017;8:10312–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83. Gao B, Sekido Y, Maximov A, Saad M, Forgacs E, Latif F, et al. Functional properties of a new voltage-dependent calcium channel α2δ auxiliary subunit gene (CACNA2D2). J Biol Chem 2000;275:12237–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84. Fiske JL, Fomin VP, Brown ML, Duncan RL, Sikes RA. Voltage-sensitive ion channels and cancer. Cancer Metastasis Rev 2006;25:493–500. [DOI] [PubMed] [Google Scholar]
- 85. Warnier M, Roudbaraki M, Derouiche S, Delcourt P, Bokhobza A, Prevarskaya N, et al. CACNA2D2 promotes tumorigenesis by stimulating cell proliferation and angiogenesis. Oncogene 2015;34:5383–94. [DOI] [PubMed] [Google Scholar]
- 86. Mellad JA, Warren DT, Shanahan CM. Nesprins LINC the nucleus and cytoskeleton. Curr Opin Cell Biol 2011;23:47–54. [DOI] [PubMed] [Google Scholar]
- 87. Chow K-H, Factor RE, Ullman KS. The nuclear envelope environment and its cancer connections. Nat Rev Cancer 2012;12:196–209. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88. Sur-Erdem I, Hussain MS, Asif M, Pınarbası N, Aksu AC, Noegel AA. Nesprin-1 impact on tumorigenic cell phenotypes. Mol Biol Rep 2020;47:921–34. [DOI] [PubMed] [Google Scholar]
- 89. Tang BL. ADAMTS: a novel family of extracellular matrix proteases. Int J Biochem Cell Biol 2001;33:33–44. [DOI] [PubMed] [Google Scholar]
- 90. Discher DE, Smith L, Cho S, Colasurdo M, García AJ, Safran S. Matrix mechanosensing: from scaling concepts in ’omics data to mechanisms in the nucleus, regeneration, and cancer. Annu Rev Biophys 2017;46:295–315. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91. Egeblad M, Rasch MG, Weaver VM. Dynamic interplay between the collagen scaffold and tumor evolution. Curr Opin Cell Biol 2010;22:697–706. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92. Caswell-Jin JL, Gupta T, Hall E, Petrovchich IM, Mills MA, Kingham KE, et al. Racial/ethnic differences in multiple-gene sequencing results for hereditary cancer risk. Genet Med 2018;20:234–9. [DOI] [PubMed] [Google Scholar]
- 93. Ndugga-Kabuye MK, Issaka RB. Inequities in multi-gene hereditary cancer testing: lower diagnostic yield and higher VUS rate in individuals who identify as Hispanic, African or Asian and Pacific Islander as compared to European. Fam Cancer 2019;18:465–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94. Saulsberry K, Terry SF. The need to build trust: a perspective on disparities in genetic testing. Genet Test Mol Biomarkers 2013;17:647–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95. Tan SH, Petrovics G, Srivastava S. Prostate cancer genomics: recent advances and the prevailing underrepresentation from racial and ethnic minorities. Int J Mol Sci 2018;19:1255. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96. Bryc K, Durand EY, Macpherson JM, Reich D, Mountain JL. The genetic ancestry of African Americans, Latinos, and European Americans across the United States. Am J Hum Genet 2015;96:37–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97. Baharian S, Barakatt M, Gignoux CR, Shringarpure S, Errington J, Blot WJet al. The great migration and African-American genomic diversity. PLoS Genet 2016;12:e1006059. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Supplementary Table 1. Known Germline Variants. Supplementary Table 2. Germline Variants. Supplementary Table 3. Known Somatic Variants. Supplementary Table 4. Somatic Vars in NG Normal. Supplementary Table 5. NG WES QC Metrics. Supplementary Table 6. NG Cohort Description. Supplementary Table 7. TCGA Cohort Description. Supplementary Table 8. AA Germ Var Frequency. Supplementary Table 9. Somatic Variants. Supplementary Table 10. NG Somatic ClueGo Results.
Supplementary Figure 1. WES Analysis Workflow. Supplementary Figure 2. Variant Strand Bias. Supplementary Figure 3. Somatic Variant Filtering. Supplementary Figure 4. Manual Somatic Variant Inspection. Supplementary Figure 5. Genetic Admixture Analysis. Supplementary Figure 6. Comparison of somatic variant count based on normal sample usage. Supplementary Figure 7. TCGA EA Somatic Variant Lollipop Plots. Supplementary Figure 8. NG PCa Novel Somatic Variants. Supplementary Figure 9. Nigerian Germline Variant – Normal/Tumor Comparison Lollipop Plots. Supplementary Figure 10. TCGA PCa Somatic Variant COSMIC Signature Analysis.
Data Availability Statement
The data generated in this study are publicly available in dbGaP (Accession phs002547.v1.p1).