Genome Medicine. 2020 May 29;12:51. doi: 10.1186/s13073-020-00744-3

Ancestry-specific predisposing germline variants in cancer

Ninad Oak 1,2, Andrew D Cherniack 3,4, R Jay Mashl 5,6; TCGA Analysis Network, Fred R Hirsch 7, Li Ding 5,6,8,9, Rameen Beroukhim 3,4,10, Zeynep H Gümüş 11,12, Sharon E Plon 2,13,14, Kuan-lin Huang 11,12,15
PMCID: PMC7260738  PMID: 32471518

Abstract

Background

Distinct prevalence of inherited genetic predisposition may partially explain the difference of cancer risks across ancestries. Ancestry-specific analyses of germline genomes are required to inform cancer genetic risk and prognosis of diverse populations.

Methods

We conducted analyses using germline and somatic sequencing data generated by The Cancer Genome Atlas. Collapsing pathogenic and likely pathogenic variants to cancer predisposition genes (CPG), we analyzed the association between CPGs and cancer types within ancestral groups. We also identified the predisposition-associated two-hit events and gene expression effects in tumors.

Results

Genetic ancestry analysis classified the cohort of 9899 cancer cases into individuals of primarily European (N = 8184, 82.7%), African (N = 966, 9.8%), East Asian (N = 649, 6.6%), South Asian (N = 48, 0.5%), Native/Latin American (N = 41, 0.4%), and admixed (N = 11, 0.1%) ancestries. In the African ancestry, we discovered a potentially novel association of BRCA2 in lung squamous cell carcinoma (OR = 41.4 [95% CI, 6.1–275.6]; FDR = 0.002) previously identified in Europeans, along with a known association of BRCA2 in ovarian serous cystadenocarcinoma (OR = 8.5 [95% CI, 1.5–47.4]; FDR = 0.045). In the East Asian ancestry, we discovered one previously known association of BRIP1 in stomach adenocarcinoma (OR = 12.8 [95% CI, 1.8–90.8]; FDR = 0.038). Rare variant burden analysis further identified 7 suggestive associations in African ancestry individuals previously described in European ancestry, including SDHB in pheochromocytoma and paraganglioma, ATM in prostate adenocarcinoma, VHL in kidney renal clear cell carcinoma, FH in kidney renal papillary cell carcinoma, and PTEN in uterine corpus endometrial carcinoma. Most predisposing variants were found exclusively in one ancestry in the TCGA and gnomAD datasets. Loss of heterozygosity was identified for 7 out of the 15 African ancestry carriers of predisposing variants. Further, tumors from the SDHB or BRCA2 carriers showed simultaneous allelic-specific expression and low gene expression of their respective affected genes, and FH splice-site variant carriers showed mis-splicing of FH.

Conclusions

While several CPGs are shared across patients, many pathogenic variants are found to be ancestry-specific and trigger somatic effects. Studies using larger cohorts of diverse ancestries are required to pinpoint ancestry-specific genetic predisposition and inform genetic screening strategies.

Background

Cancer risk differs across ancestries. According to the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program, the cancer incidence per 100,000 in the USA between 2011 and 2015 was 449 in the race/ethnicity population self-identified as Whites, 453 in Blacks, 298 in Asian/Pacific Islanders, 315 in American Indian/Alaskan Natives, and 336 in Hispanics [1, 2]. While some of these differences may be attributed to non-genetic factors such as access to health care or diet, much can likely be explained by differences in the genomic architecture of these ancestries and differing frequencies of inherited genetic predisposition. Previous studies revealed different carrier rates of pathogenic variants across ancestries, albeit often in a limited panel of genes or selected cancer types [3–5].

While multiple large-scale genome-wide association studies have investigated the common risk variants contributing to cancer [6–10], fewer studies have interrogated rare pathogenic variants in non-European ancestries [5, 11–15]. A 2019 systematic review of cancer sequencing studies found a total of only 764 reported non-European (minority) cases in 27 published studies with reported race/ethnicity [9]. Consequently, germline genetic testing in non-White patients often results in higher rates of variants of unknown significance (VUSs) [16]. Ongoing efforts are bridging the knowledge gap of cancer genetic predisposition in under-studied populations [17–19]. Meanwhile, systematic cross-ancestry investigations of predisposing variants across cancer types are urgently needed to inform genetic testing for each ancestral group.

Herein, we analyzed germline variant data of 9899 cancer cases across 33 cancer types from the Cancer Genome Atlas Project (TCGA) [20] to identify ancestry-specific cancer-gene associations where the genes show an excess of pathogenic/likely pathogenic germline variants in the TCGA samples. In samples of African ancestry, we identified two associations, BRCA2 in lung squamous cell carcinoma (LUSC) and ovarian serous cystadenocarcinoma (OV). In analyses of individuals with East Asian ancestry, we identified an association for BRIP1 in stomach adenocarcinoma (STAD). Using a rare-variant association analysis, we identified seven additional suggestive cancer-gene associations. Evidence of a somatic second-hit event (i.e., loss of heterozygosity [LOH] or a biallelic mutation) was found in two thirds of the tumors with germline predisposing variants. Many carriers of ancestry-specific predisposition variants showed altered expression of the affected genes, including allelic-specific expression (ASE), mis-splicing, and reduced tumor suppressor gene expression, further supporting these genetic variants’ contribution to cancer predisposition.

Methods

Study cohort and genetic ancestry assignment

We used the clinical data provided by TCGA PanCanAtlas and restricted analyses to cases with pass-QC blood/normal sequencing data. In addition to excluding cases with PanCanAtlas blacklisted germline BAM files, we eliminated cases with less than 60% genotype concordance between sequencing variant calls and SNP-genotype data, leaving 10,389 cases [20]. We further overlapped these with the cases included in the PanCanAtlas Ancestry Informative Markers (AIM) genetic ancestry assignment, resulting in the final set of 9899 samples. Detailed descriptions of the ancestry assignment procedures are available in the marker publication [21].

Briefly, consensus genetic ancestry for each TCGA case was determined as the majority of ancestry assignments that were independently determined by five methods across four institutions. These methods include those based on SNP-array genotypes used by Broad Institute, University of California San Francisco (UCSF), and Washington University (WashU), as well as those based on whole-exome sequencing data used by University of Trento and ExAC/Broad Institute. The five methods conducted variations of principal component analyses (PCA) on TCGA normal samples to infer genetic ancestry. We further provide the PCA plots showing the alignment of the major PCs in the UCSF and WashU analyses with the AIM-group consensus genetic ancestry in Additional file 1: Fig. S1.

For each sample, the percentage of global ancestry of African, European, East Asian, Native/Latin American, and South Asian (k = 5) was further estimated using ADMIXTURE [22] version 1.23 based on the common SNP markers (1000 Genomes allele frequency (AF) > 1%) in the Broad Institute analysis. Samples in which the proportion of the secondary ancestry was greater than 20% were considered admixed (Additional file 2: Table S1). Sensitivity analyses revealed increased power when admixed samples were included in this cohort. Thus, cases with admixed ancestry assignments were grouped with their nearest neighbors (e.g., afr_admix to afr) for downstream analyses.
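The 20% admixture rule can be illustrated with a minimal Python sketch. It assumes the per-sample ADMIXTURE proportions are already available as a dictionary, and it treats the largest component as the group an admixed sample is assigned to; the function name, the ancestry keys, and the example proportions are hypothetical and are not taken from the PanCanAtlas pipeline.

```python
from typing import Dict, Tuple

# Secondary-ancestry proportion above which a sample is considered admixed (from the text).
ADMIX_THRESHOLD = 0.20


def assign_ancestry(proportions: Dict[str, float]) -> Tuple[str, bool]:
    """Return (primary ancestry, is_admixed) for one sample.

    `proportions` maps an ancestry label to its estimated global proportion.
    Assumption: the primary ancestry is the largest component, and an admixed
    sample is grouped with that primary ancestry for downstream analyses.
    """
    ranked = sorted(proportions.items(), key=lambda item: item[1], reverse=True)
    primary = ranked[0][0]
    secondary_fraction = ranked[1][1] if len(ranked) > 1 else 0.0
    return primary, secondary_fraction > ADMIX_THRESHOLD


# Hypothetical sample: mostly African ancestry with a 25% European component.
example = {"afr": 0.70, "eur": 0.25, "eas": 0.03, "amr": 0.01, "sas": 0.01}
label, admixed = assign_ancestry(example)
```

Under these assumptions the hypothetical sample is flagged as admixed but still analyzed with the afr group, mirroring the afr_admix to afr grouping described above.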

Pathogenic and likely pathogenic germline variant calls

We downloaded the overall and predisposing germline variant calls previously reported by the PanCanAtlas Germline Analyses Working Group (https://gdc.cancer.gov/about-data/publications/PanCanAtlas-Germline-AWG) [20]. The detailed description of variant calling and classification procedures are available in the TCGA PanCanAtlas germline publication [20].

Briefly, germline SNVs were identified using the union of variant calls between Varscan [23] and GATK [24]. Germline indels were identified using Varscan, GATK, and Pindel [25], and we only retained variants called by at least two out of the three callers or high-confidence Pindel-unique calls (at least 30× coverage and 20% variant allele fraction [VAF]). We used the GRCh37-lite reference. We further required the variants to have an allelic depth (AD) ≥ 5 for the alternative allele. We then used bam-readcount to quantify the number of reference and alternative alleles in both normal and tumor samples. We required the variants to have at least 5 counts of the alternative allele and an alternative allele frequency of at least 20%. Of these, we included only rare variants with ≤ 0.05% allele frequency in 1000 Genomes and ExAC (release r0.3.1). We subsequently retained only cancer-relevant pathogenic variants, based on whether they were found in the curated cancer variant databases or in a curated list of 152 cancer predisposition genes. Finally, we manually reviewed all variants using the Integrative Genomics Viewer (IGV) and filtered out variants with poor supporting sequence reads.
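The read-count and population-frequency thresholds above can be summarized in a short Python sketch. The record layout (`Call`) and the function name are hypothetical; only the three thresholds (at least 5 alternative reads, at least 20% alternative allele fraction, and at most 0.05% allele frequency in 1000 Genomes and ExAC) come from the text. Caller concordance, database lookups, and manual IGV review are not modeled.

```python
from dataclasses import dataclass

MIN_ALT_READS = 5      # at least 5 reads supporting the alternative allele
MIN_VAF = 0.20         # alternative allele fraction of at least 20%
MAX_POP_AF = 0.0005    # allele frequency of at most 0.05% in each reference panel


@dataclass
class Call:
    """Hypothetical per-variant record; field names are illustrative only."""
    alt_depth: int      # reads supporting the alternative allele
    total_depth: int    # reference plus alternative reads
    af_1000g: float     # allele frequency in 1000 Genomes
    af_exac: float      # allele frequency in ExAC


def passes_filters(call: Call) -> bool:
    """Apply the depth, allele-fraction, and rarity thresholds quoted in the text."""
    if call.total_depth <= 0:
        return False
    vaf = call.alt_depth / call.total_depth
    return (
        call.alt_depth >= MIN_ALT_READS
        and vaf >= MIN_VAF
        and call.af_1000g <= MAX_POP_AF
        and call.af_exac <= MAX_POP_AF
    )
```

For instance, a call with 12 of 40 reads supporting the alternative allele and a population frequency of 0.01% would pass, whereas the same call with a 0.1% population frequency would be dropped as too common.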

The variants defined by the above pipeline were then classified using an automatic pipeline termed CharGer [26] (https://github.com/ding-lab/CharGer) that adopts the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) variant classification guidelines, which are designed for assessment of germline variants in Mendelian disorders [27]. For the CharGer classification pipeline, we defined 12 pathogenic evidence levels and 4 benign evidence levels using a number of datasets, including ExAC and ClinVar. Pathogenic evidence adds points, whereas benign evidence subtracts points, and the total score determines the classification. A pathogenic classification requires the variant to be described as pathogenic by the reviewed clinical significance in ClinVar (not including variants showing “conflicting interpretations of pathogenicity”) or other cancer predisposition gene databases; a likely pathogenic classification requires a CharGer score > 8. To acquire enough CharGer points to be classified as likely pathogenic, a variant typically needs to be predicted to result in truncation of a cancer predisposition gene where loss of function (LOF) is a known disease mechanism, with either a dominant (evidence level PVS1, +8 points) or a recessive (evidence level PSC1, +4 points) mode of inheritance. Additionally, evidence level PS1 (+7 points) is scored if the variant results in the same peptide sequence change as an established pathogenic variant. All other modules each add ≤ 2 points.
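The point arithmetic described above can be sketched in Python as follows. This is a simplified illustration and not the CharGer implementation: only the three point values named in the text (PVS1, PSC1, PS1) and the score cutoff of 8 are taken from the source, while the function names, the argument layout, and the "other" label for variants that reach neither class are assumptions.

```python
from typing import List

# Point values quoted in the text; all other pathogenic modules add at most 2 points each.
NAMED_POINTS = {"PVS1": 8, "PSC1": 4, "PS1": 7}
MAX_OTHER_MODULE_POINTS = 2
LIKELY_PATHOGENIC_CUTOFF = 8  # likely pathogenic requires a score strictly above 8


def charger_like_score(named: List[str], other_pathogenic: List[int], benign: List[int]) -> int:
    """Sum pathogenic points and subtract benign points (simplified sketch)."""
    for points in other_pathogenic:
        if points > MAX_OTHER_MODULE_POINTS:
            raise ValueError("unnamed pathogenic modules add at most 2 points each")
    return sum(NAMED_POINTS[name] for name in named) + sum(other_pathogenic) - sum(benign)


def classify(score: int, reported_pathogenic: bool) -> str:
    """Pathogenic needs a curated database assertion; likely pathogenic needs score > 8."""
    if reported_pathogenic:
        return "pathogenic"
    if score > LIKELY_PATHOGENIC_CUTOFF:
        return "likely pathogenic"
    return "other"  # assumption: label for variants that reach neither class


# A truncating variant in a dominant LOF gene (PVS1) plus one 2-point supporting module.
score = charger_like_score(["PVS1"], [2], [])
label = classify(score, reported_pathogenic=False)
```

Note that under this reading PVS1 alone yields exactly 8 points, which does not exceed the cutoff, so at least one additional supporting module is needed for a likely pathogenic call.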

Principal component analysis (PCA)

Birdseed genotype files were downloaded from the Genomic Data Commons (GDC) legacy (hg19) archive onto the Institute for Systems Biology-Cancer Genome Cloud (ISB-CGC), converted to individual VCF files, and then merged into a combined VCF containing 11,459 samples and 522,606 variants. We conducted PCA as implemented by PLINK (v1.9) [28]. Specifically, we retained 298,004 variants with AF > 0.15 for population structure analysis. The resulting eigenvalues and eigenvectors were then recorded. PC1 and PC2 accounted for 51.6% and 29.2% of the variation across the first 20 PCs, respectively, and none of the trailing PCs accounted for more than 3.2%. Thus, we subsequently controlled for PC1 and PC2 in the ancestry-specific cancer predisposing gene analysis (Additional file 1: Fig. S1).

Multivariate regression to identify the enrichment of pathogenic variants

For each cancer type within each ancestry, we conducted multivariate logistic regression analyses considering the case status of the cancer type as the dependent variable (using all other cancer cohorts as controls) and the carrier status of each predisposing gene as an independent variable. The model corrected for age at the initial pathologic diagnosis, gender, and the first two principal components (which together accounted for 80.8% of the variation across the first 20 PCs). All ancestry cohorts were called using the same variant calling pipeline, thus avoiding the potential danger of comparing this population against other cohorts such as ExAC. We collapsed predisposing (pathogenic and likely pathogenic) germline variants to the gene level. Only ancestry-cancer combinations with at least 20 cases and predisposing genes with at least two individuals with predisposing variants within the cohort were tested. In total, we tested 33 cancers in the European ancestry, 15 cancers in the African ancestry, and 8 cancers in the East Asian ancestry that met these criteria. No cohorts of the Native/Latin American and South Asian ancestries had sufficient sample sizes in TCGA for testing. Among these tested cancers, we tested a total of 114 cancer-gene combinations in the multivariate regression analysis, of which 101 were within the European ancestry, 9 were in the African ancestry, and 4 were in the East Asian ancestry. P values were calculated using the Wald test and adjusted to FDR using the standard Benjamini-Hochberg procedure.
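The Benjamini-Hochberg adjustment applied to the Wald-test P values can be written in a few lines of plain Python. The sketch below is a generic implementation of the standard step-up procedure; the input P values are hypothetical and do not correspond to any of the 114 tested cancer-gene combinations.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for four cancer-gene combinations.
raw = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw)
```

Associations would then be called significant when the adjusted value falls below the chosen FDR threshold (0.05 in this study).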

Burden testing of pathogenic variants

We conducted burden testing of the cohort within each ancestry as defined by the TCGA AIM working group. Specifically, we adopted the Total Frequency Test (TFT) [29] by collapsing predisposing (pathogenic and likely pathogenic) germline variants to the gene level. For each cancer type with at least 20 cases of the tested ancestry with at least one predisposing variant carrier, we tested the burden of predisposing variants for each gene against all other cancer cohorts as controls. Among the cancers that met the sample size criteria described above, we tested a total of 120 cancer-gene combinations using rare variant burden testing, of which 104 were within European ancestry, 11 were in African ancestry, and 5 were in East Asian ancestry. The resulting P values were adjusted to FDR using the standard Benjamini-Hochberg procedure.
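As a rough illustration of a gene-level collapsing test, the sketch below compares carrier counts in one cancer type against all other cohorts. It deliberately swaps in a one-sided Fisher's exact test rather than the Total Frequency Test used in the study, because the TFT statistic is not described here; the function name and the control counts in the example are hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers: int, n_cases: int,
                     control_carriers: int, n_controls: int) -> float:
    """P(at least `case_carriers` carriers among cases) under the hypergeometric null.

    This is a swapped-in stand-in for the burden test, not the TFT itself.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denominator = comb(total, n_cases)
    upper = min(carriers, n_cases)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denominator


# 2 carriers among 29 cases versus a hypothetical 3 carriers among 900 controls.
p_value = fisher_one_sided(2, 29, 3, 900)
```

Unlike the regression model, a collapsing test of this kind does not adjust for age, gender, or principal components, which is one reason the resulting associations are treated as suggestive.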

gnomAD analysis

We analyzed the gene-level and variant-level frequency of the identified genetic predisposition using the non-cancer subset of the Genome Aggregation Database (gnomAD-non-cancer) cohort (118,479 WES and 15,708 WGS samples) [30, 31] (http://gnomad.broadinstitute.org). For the gene-level analysis, we retained rare variants with ancestry-specific minor allele frequency < 0.5%. We further retained pathogenic and likely pathogenic variants per ACMG/AMP criteria as ascertained by InterVar [32] and annotated using ANNOVAR [33]. Allele frequencies were summarized at the gene level within each sub-population in gnomAD using total allele counts and maximum allele numbers within each group.
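The gene-level summary (total allele count divided by the maximum allele number within a sub-population) can be sketched as below. The function name and input layout are hypothetical, the pathogenicity filtering by InterVar is not modeled, and in the example only the first allele count pair (4/23,610) appears in this article; the second pair is invented for illustration.

```python
from typing import Iterable, Tuple

RARE_MAF_CUTOFF = 0.005  # ancestry-specific minor allele frequency below 0.5%


def gene_level_frequency(variants: Iterable[Tuple[int, int]]) -> float:
    """Summarize (allele count, allele number) pairs for one gene in one sub-population.

    Variants at or above the rarity cutoff are dropped; the remaining allele
    counts are summed and divided by the largest allele number observed.
    """
    rare = [(ac, an) for ac, an in variants if an > 0 and ac / an < RARE_MAF_CUTOFF]
    if not rare:
        return 0.0
    total_ac = sum(ac for ac, _ in rare)
    max_an = max(an for _, an in rare)
    return total_ac / max_an


# First pair is reported in the text for one variant; the second pair is hypothetical.
frequency = gene_level_frequency([(4, 23610), (1, 23000)])
```

Using the maximum allele number as the denominator gives a conservative frequency estimate when coverage differs between variant sites in the same gene.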

The lolliplot diagrams in Fig. 2 were constructed and modified using ProteinPaint (https://pecan.stjude.cloud/proteinpaint).

Fig. 2.

Ancestry-specific predisposing germline variants. Predisposing variants in the significant (regression analysis; a) and suggestive (rare variant burden testing; b) cancer-gene associations are shown. The variants are labeled with carrier counts and colored by their respective carriers’ ancestry (European Ancestry: blue, African ancestry: red, East Asian Ancestry: green). a Significant predisposing variants identified in the African and East Asian ancestries are shown across respective cancer types. For BRCA2, predisposing variants across all cancers are shown (top) in comparison with the two cancer types with significant associations in the African ancestry (LUSC and OV, bottom). Similarly, predisposing variants contributing to the significant association of BRIP1 in STAD in the East Asian ancestry are shown. b Suggestive predisposing variants identified in the African ancestry are shown for ATM, FH, and VHL genes within their associated cancer types. Bi-allelic events in each carrier are linked by a grey line bracket where the somatic second-hit mutations are marked with a box. c Borderline-suggestive predisposing variants identified in the East Asian ancestry are shown for RECQL in STAD and POLE in LIHC

Expression analysis

TCGA level-3 normalized RNA expression data were downloaded from Firehose (2016/1/28 analysis archive). The tumor expression percentile of individual genes in each cancer cohort was calculated using the empirical cumulative distribution function (ecdf), as implemented in R. We annotated germline carriers of predisposition variants with extreme mRNA tumor expression (> 80th or < 20th percentile) of the affected gene. For samples within the same ancestry and same cancer cohort, we then used the two-sample Kolmogorov-Smirnov test to compare the expression percentile distribution between variants of oncogenes and tumor suppressors. The resulting P values were adjusted to false discovery rate (FDR) using the standard Benjamini-Hochberg procedure.
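The expression percentile and the extreme-expression annotation can be mimicked in plain Python. The sketch follows the definition of the empirical cumulative distribution function (the fraction of cohort values less than or equal to the query value); the function names and the toy cohort are hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def ecdf_percentile(cohort: Sequence[float], value: float) -> float:
    """Fraction of cohort expression values that are <= `value` (empirical CDF)."""
    ordered = sorted(cohort)
    return bisect_right(ordered, value) / len(ordered)


def extreme_label(percentile: float, low: float = 0.20, high: float = 0.80) -> Optional[str]:
    """Annotate expression below the 20th or above the 80th percentile."""
    if percentile < low:
        return "low"
    if percentile > high:
        return "high"
    return None


# Toy cohort of ten tumors with expression values 1..10.
cohort = [float(v) for v in range(1, 11)]
carrier_percentile = ecdf_percentile(cohort, 1.0)
carrier_label = extreme_label(carrier_percentile)
```

In this toy cohort the carrier with the lowest expression sits at the 10th percentile and is annotated as low, which is the pattern expected for a truncated tumor suppressor.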

For the ancestry-specific variants, we recorded the RNA VAF of the mutant allele in the RNA-Seq bam files. For splice site variants, we assessed the mis-splicing of the transcript and variants using IGV.

Power and downsampling analysis

Post hoc power analyses were performed using the R package SKAT [34] and its Power_Logistic function to calculate the number of samples required for rare variant association with causal percentage = 80%, minor allele frequency < 0.1%, and odds ratios (OR) ranging from 1 to 10. Each calculation was performed using 100 simulations over a target 5 kb region.

Additionally, we performed a downsampling analysis for each tumor type by randomly sampling subsets of samples with incremental sizes from zero to the total number of samples in that tumor type. We identified the number of significantly mutated genes as described above within each subset and plotted a smoothed function (loess method) against the subset size. Each calculation was repeated for ten iterations (Additional file 1: Fig. S2).
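The downsampling loop can be outlined in Python as follows. This is a simplified sketch: the callback that counts significant genes is a hypothetical stand-in for the full association analysis, and the per-size mean replaces the loess smoothing used for the published figure.

```python
import random
from typing import Callable, Dict, List, Sequence


def downsample_curve(samples: Sequence[str],
                     count_hits: Callable[[List[str]], int],
                     step: int,
                     iterations: int = 10,
                     seed: int = 0) -> Dict[int, float]:
    """Mean number of hits at each subset size, from zero up to all samples."""
    rng = random.Random(seed)
    pool = list(samples)
    sizes = list(range(0, len(pool) + 1, step))
    if sizes[-1] != len(pool):
        sizes.append(len(pool))  # always include the full cohort
    curve = {}
    for size in sizes:
        counts = [count_hits(rng.sample(pool, size)) for _ in range(iterations)]
        curve[size] = sum(counts) / iterations
    return curve


# Hypothetical cohort of 20 samples, two of which carry a predisposing variant.
cohort = [f"s{i}" for i in range(20)]
carriers = {"s0", "s1"}


def toy_hits(subset: List[str]) -> int:
    """Stand-in for the association test: one 'hit' once both carriers are sampled."""
    return 1 if len(carriers.intersection(subset)) >= 2 else 0


curve = downsample_curve(cohort, toy_hits, step=5)
```

Plotting the resulting values against subset size gives a saturation-style curve: if it is still rising at the full cohort size, additional samples would be expected to reveal further associations.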

Results

Ancestry demographics of TCGA cohort

We classified the 9899 TCGA cases with pass-QC germline data across 33 cancer types by the genotype-based ancestries defined by the PanCanAtlas Ancestry Informative Markers (AIM) working group (Additional file 1: Fig. S1, the “Methods” section, Table 1). The European ancestry contained 82.68% (n = 8184) of individuals in this cohort. The remainder of the cohort consisted of 9.76% (n = 966) African ancestry, 6.56% (n = 649) East Asian ancestry, 0.48% (n = 48) South Asian ancestry, 0.41% (n = 41) Native/Latin American ancestry, and 0.11% (n = 11) admixed ancestry. The largest ancestry-specific tumor cohorts are breast invasive carcinoma (BRCA) for the European ancestry (n = 811) and African ancestry (n = 180), liver hepatocellular carcinoma (LIHC) for the East Asian ancestry (n = 162), and thyroid carcinoma (THCA) for the Native/Latin American ancestry (n = 11) and the South Asian ancestry (n = 11).
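The cohort percentages above follow directly from the reported counts, as the short Python check below shows. The counts are those stated in the text; the dictionary layout and labels are only illustrative.

```python
# Case counts per consensus genetic ancestry, as reported in the text.
counts = {
    "European": 8184,
    "African": 966,
    "East Asian": 649,
    "South Asian": 48,
    "Native/Latin American": 41,
    "Admixed": 11,
}

total = sum(counts.values())

# Share of the cohort per ancestry, in percent, rounded to two decimals.
shares = {ancestry: round(100 * n / total, 2) for ancestry, n in counts.items()}
```

The counts sum to the 9899 cases analyzed, and non-European ancestries together make up roughly 17% of the cohort, which frames the power limitations discussed later.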

Table 1.

The demographic distribution of TCGA PanCanAtlas cohort


Ancestry-specific cancer predisposing genes

Acknowledging the limited power to assess ancestry-specific associations as shown by the post hoc power analyses (Additional file 1: Fig. S2), we sought to identify cancer predisposing genes within each ancestry. We considered cancer predisposing genes as those statistically enriched for pooled pathogenic and likely pathogenic variants (referred to here as predisposing variants) as previously classified [20]. For each ancestry-cancer type pair, we conducted multivariate regression analyses correcting for onset age, gender, and the first two principal components.

Along with 36 cancer-gene associations (FDR < 0.05, Wald test) found in the European ancestry, we identified two specific cancer-gene associations in the African ancestry: BRCA2 in ovarian cancer (OV) (OR = 8.5 [95% CI, 1.5–47.4]; FDR = 0.045) and LUSC (OR = 41.4 [95% CI, 6.1–275.6]; FDR = 0.002). We also identified one association in the East Asian ancestry, BRIP1 in STAD (OR = 12.8 [95% CI, 1.8–90.8]; FDR = 0.038) (Fig. 1, Additional file 2: Table S2a). While the association of BRCA2 with LUSC is first described in the African ancestry here, BRCA2 was previously found to be associated with non-small cell lung cancer (including LUAD and LUSC) and ovarian cancer (OV) in the European ancestry [35–37]. The association of BRIP1 predisposition with STAD in the East Asian ancestry was also previously reported for the European ancestry [38]. These findings (including novel associations) in a large heterogeneous cancer population build on older studies that evaluated individual cancer predisposition genes and cancer risk across ancestries.

Fig. 1.

Cancer predisposing genes identified in the African, East Asian, and European ancestries among 9899 TCGA cases across cancer types. a Ancestry-specific cancer-gene pairs from the TCGA dataset containing cancer predisposing variants as identified by multivariate logistic regression analyses. Each number represents the carrier frequency of the predisposing gene within that cancer cohort. Genes with significant associations (Wald test FDR < 0.05) are highlighted with blue boxes. b Significant cancer-predisposing gene associations (FDR < 0.05) identified in the African and East Asian ancestries

The top associated predisposing genes and their carrier frequency vary widely across ancestries (Fig. 1a). For genes with a significant association in the African ancestry, we observed a higher carrier frequency compared to other ancestries. For example, in LUSC, BRCA2 predisposing variants were found in 2 of the 29 African ancestry samples (6.9%), whereas we only found 1 BRCA2 carrier out of the 455 European-ancestry samples (0.44%).

We next investigated whether the cross-ancestry differences in predisposing gene frequencies were also observed in other cohorts. Specifically, we examined the gene-level rates of individuals carrying pathogenic and likely pathogenic variants in the gnomAD non-cancer cohort [30, 31] (118,479 WES and 15,708 WGS samples, the “Methods” section, Additional file 2: Table S3). BRCA2 showed a higher frequency in the African ancestry (0.072%) than in all other defined ancestries, including non-Finnish European (0.048%) and East Asian (0.047%). BRIP1 also showed a higher frequency in the East Asian ancestry (0.068%) than in all other ancestries (≤ 0.045%) except for the non-Finnish European ancestry (0.099%).

To generate hypotheses for future targeted studies, we investigated additional ancestry-implicated genes using total frequency testing (TFT) of predisposing variants, fully acknowledging potential confounders of this method (Additional file 2: Table S2b). We identified 7 suggestive (FDR < 0.05 in the TFT analysis) ancestry-specific cancer-gene associations in the African ancestry, 6 of which have been previously described, including SDHB in PCPG [39], ATM in PRAD [40, 41], FH in KIRP [42], VHL in KIRC [43], PTEN in UCEC [44], and BRCA2 in OV [12]. We also re-discovered the BRCA2 association in LUSC described above. In the East Asian ancestry, we identified 3 borderline-suggestive associations (FDR = 0.32): RECQL in STAD, BRIP1 in STAD, and POLE in LIHC. In STAD, RECQL and BRIP1 each affected 2 of the 90 East Asian ancestry cases, but none of the 294 European-ancestry cases. In LIHC, two protein-truncating variants were seen in POLE among 162 East Asian ancestry cases compared to none in 179 European-ancestry cases. These suggestive associations remain to be established and are only used to identify potential predisposing variants with supporting somatic evidence.

Ancestry-specific predisposing variants

We next examined ancestry-specific predisposition at the variant level (Fig. 2, Additional file 2: Table S4) for the 3 significant associations from the multivariate logistic regression analyses and the 7 suggestive associations from the TFT analysis. The cancer-gene pairs included 15 predisposing variants within the African ancestry and another 6 within the East Asian ancestry.

None of the above variants discovered in the African ancestry were observed in any other ancestry within that cancer type (Fig. 2). Across the pan-cancer TCGA cohort, all of the BRCA2 frameshift variants found in LUSC and OV were unique to the African ancestry. For other associated genes in the African ancestry, including ATM (PRAD), FH (KIRP), and VHL (KIRC), the predisposing variants differ between the African and European ancestries (Fig. 2b). The African ancestry-specific predisposing variants include splice site variants ATM c.2921+1G>A and FH c.556-2A>T, protein-truncating variants ATM p.T2333fs and FH p.S187*, and the missense variant ATM p.R3008C. VHL p.C162F is the only recurrent variant, found in two KIRC cases.

In the East Asian ancestry, we assessed predisposing variants in BRIP1 (STAD), POLE (LIHC), and RECQL (STAD) (Fig. 2a and c). These include two BRIP1 variants, p.I525fs and p.E1222fs, and two protein-truncating variants each in POLE and RECQL. None of the six predisposing variants was shared with any other ancestry in the TCGA cohort (Fig. 2c).

We further analyzed the frequency of these variants in the gnomAD non-cancer dataset [30, 31]. Among the African ancestry-specific predisposing variants, splice-site variant ATM c.2921+1G>A (African ancestry allelic count [AC]/total allele number [AN] = 1/14,878; allelic frequency [AF] = 0.0067%) and BRCA2 p.R3128* (African ancestry AC/AN = 4/23,610; AF = 0.016%) were the only variants present in the African and non-Finnish European ancestries in the gnomAD non-cancer dataset. All other variants were absent within African ancestry and most other ancestries in gnomAD except SDHB p.R46* (Finnish European ancestry AC/AN = 2/25,066; AF = 0.007%) and ATM p.R3008C (East Asian ancestry AC/AN = 1/17,688; AF = 0.005%). Of note, 7 of the 15 predisposing variants, including BRCA2 variants in OV (p.Y1710fs, p.K1202fs) and in LUSC (p.V3082fs), were not found in ClinVar [45]. While VHL p.C162F lacks a ClinVar record, the co-localizing p.C162W showed three reports of pathogenicity and one report of uncertain significance.

We also investigated the presence of the six East Asian ancestry predisposing variants in the gnomAD non-cancer dataset. Only POLE p.Y1078fs (AC/AN = 1/17,692, AF = 0.0056%) and BRIP1 p.E1222fs (AC/AN = 11/19,232, AF = 0.057%) were present, and both were found exclusively in the East Asian ancestry. The other four East Asian-ancestry variants were not detected in this dataset. Of note, none of the six variants were previously reported in ClinVar [45].

Germline-somatic two-hit events

We next examined the two-hit hypothesis, whereby a somatic second hit of the same gene is found in carriers of the germline predisposing variants [46, 47]. First, we investigated the extent of loss of heterozygosity (LOH) of the predisposing variants using our previously developed statistical test [38] (the “Methods” section) that compares the variant allele fractions in tumor vs. normal samples. Among the variants observed in the African ancestry, we observed significant LOH (FDR < 0.05) for both truncating variants in SDHB p.R116fs and p.R46* in PCPG (Fig. 3a). Three additional variants exhibited significant LOH, including BRCA2 p.R3128* (LUSC), BRCA2 p.K1202fs (OV), and FH p.S187* (KIRP). We also observed suggestive LOH (FDR < 0.15 or tumor VAF > 0.6) for ATM c.2921+1G>A (PRAD) and BRCA2 p.Y1710fs (OV) (Fig. 3b). Among the six predisposing variants in the East Asian ancestry, only POLE p.E2137* (LIHC) showed significant LOH (Fig. 3a).
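The thresholds used to label LOH events can be expressed as a small decision rule. The sketch below encodes only the cutoffs quoted in the text (FDR < 0.05 for significant; FDR < 0.15 or tumor VAF > 0.6 for suggestive); the statistical test that produces the FDR is not reproduced, and the function name and return labels are hypothetical.

```python
def loh_call(fdr: float, tumor_vaf: float) -> str:
    """Label an LOH event from its test FDR and the tumor variant allele fraction."""
    if fdr < 0.05:
        return "significant"
    if fdr < 0.15 or tumor_vaf > 0.6:
        return "suggestive"
    return "none"


# Hypothetical germline heterozygous variant whose tumor VAF rose to 0.65.
call = loh_call(fdr=0.30, tumor_vaf=0.65)
```

A germline heterozygous variant is expected near a VAF of 0.5 in normal tissue, so a tumor VAF well above that level points to loss of the wild-type allele even when the formal test is not significant.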

Fig. 3.

Loss of heterozygosity (LOH) and transcriptional effects associated with ancestry-specific predisposing germline variants. a LOH in ancestry-specific predisposing variants shown by comparing variant allele frequency in tumor vs. that in normal samples. Each dot denotes a variant, and the affected genes are labeled in cases that showed both significant allelic imbalance and copy number deletion of the wild-type alleles (in purple). Variants showing significant allelic imbalance, yet without conclusive evidence of copy number deletion of the wild-type alleles, are considered as other LOH and marked in yellow. All other variants are shown in grey. b Count distribution of each type of LOH event across genes in the African ancestry, the East Asian ancestry, and the European ancestry. Note that, given the larger number of events, the x-axis for the European ancestry is shown on a different scale

As an alternative mechanism of a somatic second hit, we identified three biallelic mutations where the rare germline predisposing variant was coupled with a second somatic mutation of the same gene, all found in African ancestry carriers (labeled in Fig. 2b, Additional file 2: Table S4b). In a PRAD carrier of ATM, the germline p.L2332fs variant was coupled with a somatic p.E2164K mutation; in a KIRC carrier of VHL, the germline p.C162F variant was coupled with a somatic p.E186* mutation. In a KIRP carrier of FH, whose FH gene expression is low (Fig. 4a), the germline p.S187* variant was coupled with a somatic splice-site mutation c.1390+6T>A. Analysis of RNA from the KIRP tumor revealed that the somatic FH: c.1390+6T>A causes mis-splicing of 27.6% of the transcripts in tumor RNA, as indicated by the number of reads spanning the consensus splice site (n = 68) and the new cryptic splice site (n = 26) (case 2 in Fig. 4b). None of the six carriers of the predisposing variants in the East Asian ancestry harbored a biallelic somatic mutation. Overall, the assessment of LOH and biallelic mutations supports the variants’ contribution to oncogenesis through the two-hit model.
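The mis-splicing fraction quoted above is the share of junction-spanning reads that use the cryptic splice site. A minimal Python sketch of that calculation, using the read counts reported in the text and a hypothetical function name:

```python
def cryptic_splice_fraction(consensus_reads: int, cryptic_reads: int) -> float:
    """Fraction of junction-spanning reads that use the cryptic splice site."""
    total = consensus_reads + cryptic_reads
    if total == 0:
        raise ValueError("no reads span the splice junction")
    return cryptic_reads / total


# Read counts reported for the KIRP tumor carrying somatic FH c.1390+6T>A.
fraction = cryptic_splice_fraction(consensus_reads=68, cryptic_reads=26)
```

With 68 consensus and 26 cryptic reads, a little over a quarter of the spanning reads are mis-spliced, in line with the proportion stated in the text.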

Fig. 4.

Expression changes associated with the predisposing variants. a mRNA gene expression of the affected genes in the carriers of ancestry-specific variants as quantiles in their respective cancer cohort. Each dot denotes the gene expression level of a predisposing variant carrier colored by ancestry. Non-European variants corresponding to the bottom 25% expression in affected tumor suppressor genes and top 25% expression in affected oncogenes are further labeled. b Tumor RNA expression highlighting (red box) mis-spliced exon 5 with germline or somatic splice site variants in two cases with FH splice site variants as visualized using the Integrative Genomics Viewer (IGV). c Tumor RNA expression for the BRCA2 gene. The first two rows correspond to samples with a germline predisposing variant coupled with or without a somatic LOH event, respectively. The third row corresponds to an unrelated sample without any BRCA2 alteration. All three coverage plots are group-scaled to show lower expression in the two samples harboring BRCA2 alterations

Expression changes in predisposing genes

To examine the transcriptional effects of the predisposing variants, we investigated the gene expression in tumor samples of the predisposing variant carriers (Fig. 4a). We observed 154 overall and 27 non-European ancestry-specific predisposing variants co-occurring with an extreme expression (> 80th or < 20th percentile in the same cancer cohort) of the respective gene, although the current sample sizes preclude us from discovering significantly associated genes compared to non-carriers within each ancestry-cancer cohort (Additional file 2: Table S5a).

All of the expression-associated variants were germline heterozygous variants at the DNA level. Their variant allele fraction in the tumor RNA-Seq data (RNA VAF) thus indicates the degree of allelic-specific expression (ASE). The African ancestry carriers of SDHB truncating variants p.R116fs (SDHB expression ranked at the bottom 0.5th percentile among all PCPG cases [0.5%], RNA VAF = 0.25) and p.R46* (9% in PCPG, RNA VAF = 0.80) showed low SDHB expression. The African ancestry carriers of BRCA2 p.Y1710fs (6% in OV, RNA VAF = 0) and p.V3082fs (15% in LUSC, RNA VAF = 0) also exhibited low BRCA2 expression (Fig. 4c). In the OV case, the germline BRCA2 p.Y1710fs is coupled with a somatic LOH event, resulting in nearly complete loss of BRCA2 expression.

Both of the African ancestry carriers of FH predisposing variants, FH p.S187* (2% in KIRP, RNA VAF = 0.13) and FH:c.556-2A>T (2% in KIRP, RNA VAF = 0.50), showed low FH expression. In addition to the biallelic somatic FH:c.1390+6T>A mutation in the carrier of germline FH p.S187* described earlier, we also observed a mis-splicing event in a different case carrying germline FH:c.556-2A>T at the RNA level (case 1 in Fig. 4b).

For other ancestries, the tumor from one predisposing variant carrier of the Native/Latin American ancestry, NF1 p.Y489C, showed low NF1 mRNA expression (2% in BRCA, RNA VAF = 0). Overall, RNA VAF of the majority of protein-truncating variants not accompanied by LOH varied between 0 and 0.25 (Additional file 2: Table S5a), suggesting degradation of the mutant allele.

Many predisposing truncating variants of tumor suppressors are assumed to lead to loss of gene expression through mechanisms such as nonsense-mediated decay (NMD). Using the NMD Classifier [48], we found that all frameshift variants identified in the African and East Asian ancestries were located in the NMD-competent region (Additional file 1: Fig. S3). These results support the view that a fraction of predisposing variants likely result in reduced gene products of tumor suppressors in these ancestral groups.
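The notion of an NMD-competent region can be sketched with the widely used rule of thumb that a premature termination codon (PTC) lying more than roughly 50 nucleotides upstream of the last exon-exon junction is expected to trigger NMD. This is a simplified heuristic and not the algorithm of the NMD Classifier; the function name, the transcript coordinates, and the exact boundary convention are assumptions.

```python
def is_nmd_competent(ptc_pos, exon_lengths, boundary=50):
    """Apply the ~50-nt rule in transcript (mRNA) coordinates.

    ptc_pos: 1-based transcript position of the premature stop codon.
    exon_lengths: exon lengths in transcript order.
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # Transcript position of the last base before the final exon.
    last_junction = sum(exon_lengths[:-1])
    return last_junction - ptc_pos > boundary

# Hypothetical three-exon transcript: last junction after position 350.
exons = [200, 150, 300]
for pos in (120, 330, 400):
    print(pos, is_nmd_competent(pos, exons))
```

Under this rule, PTCs in the final exon or near the end of the penultimate exon are predicted to escape NMD and may instead produce a truncated protein.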

Conversely, for the rare tumors with germline variants in oncogenes, the two predisposing RET variants were coupled with elevated RET expression in their African ancestry carriers, including p.C631Y (84% in KIRC) and p.D634Y (91% in PCPG).

Power consideration for predisposing gene discovery

Given the currently limited sample sizes in most of the minority cohorts, we sought to identify the required numbers of samples to discover novel cancer predisposing genes. We performed post hoc power analyses to detect a rare-variant association in an aggregation test using SKAT [34]. We assumed that a high proportion (80%) of variants are causal when focusing on prioritized predisposing variants in accordance with ACMG/AMP guidelines (Additional file 2: Table S6a, see the “Methods” section) [26, 27, 32]. The detection of rare variants (AF < 0.01) with moderate effect sizes (odds ratio [OR] > 5) with at least 80% power requires more than 1000 samples (n = 1014) per cancer type (Additional file 1: Fig. S2A).

The sample size requirement suggests limited power for ancestry-specific analyses using TCGA, one of the largest cancer sequencing cohorts to date. For the largest ancestry subgroup in the study, European-ancestry BRCA cases (n = 811), there is 67% power to detect genes with smaller effect sizes (OR < 3). For all other ancestries, their respective largest cohorts afford inadequate power to detect genes with large effect sizes (OR = 9), including the African ancestry BRCA cohort (n = 180, power = 36%), the East Asian-ancestry LIHC cohort (n = 162, power = 24.5%), and the Native/Latin American-ancestry THCA cohort (n = 11, power < 1%). As a reference, most known cancer predisposing genes, including ATM, PTEN, STK11, CHEK2, BRIP1, and PALB2, have an estimated OR < 10. BRCA1/BRCA2 are exceptions with an OR > 10 for BRCA, but also show more moderate ORs for other cancer types [49]. Despite limited power, this TCGA study includes threefold more non-European cases (n = 1715) compared to the combined number of samples across 27 published non-TCGA sequencing studies that report race/ethnicity information from cancer cohorts (n = 764 non-Europeans, 10 cancer types) [9]. Moreover, the majority of these studies focused on somatic alterations, and only a handful reported ancestry-specific germline predisposition (Additional file 2: Table S7).
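The dependence of power on cohort size and effect size can be illustrated with a small Monte Carlo sketch. It does not reproduce the SKAT power calculation: it collapses variants into a binary carrier status per gene and applies a one-sided Fisher exact test, and the comparison-group carrier frequency (0.002), the number of comparison samples (2000), and all function names are assumptions, so the resulting numbers are not expected to match those reported above.

```python
import random
from math import comb

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Hypergeometric upper-tail p-value for carrier enrichment among cases."""
    total_carriers = case_carriers + control_carriers
    upper = min(total_carriers, n_cases)
    tail = sum(comb(n_cases, k) * comb(n_controls, total_carriers - k)
               for k in range(case_carriers, upper + 1))
    return tail / comb(n_cases + n_controls, total_carriers)

def carrier_freq_in_cases(control_freq, odds_ratio):
    """Carrier frequency in cases implied by the odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def simulate_power(n_cases, n_controls, control_freq, odds_ratio,
                   alpha=0.05, n_sim=200, seed=0):
    """Fraction of simulated cohorts in which the enrichment test is significant."""
    rng = random.Random(seed)
    case_freq = carrier_freq_in_cases(control_freq, odds_ratio)
    hits = 0
    for _ in range(n_sim):
        # Draw carrier counts as sums of independent Bernoulli trials.
        a = sum(rng.random() < case_freq for _ in range(n_cases))
        c = sum(rng.random() < control_freq for _ in range(n_controls))
        if fisher_one_sided(a, n_cases, c, n_controls) < alpha:
            hits += 1
    return hits / n_sim

# Cohort sizes taken from the text; all other parameters are hypothetical.
for n in (180, 1014):
    print(n, simulate_power(n, 2000, control_freq=0.002, odds_ratio=9))
```

A genome-wide or multi-gene analysis would use a stricter significance threshold than the nominal 0.05 used here, which lowers power further at any given cohort size.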

Standard power analyses have the caveat of assuming various unknown parameters that may be inaccurate. We thus performed a downsampling analysis using two cancer types with at least five significantly associated germline genes in the European ancestry: pheochromocytoma and paraganglioma (PCPG) and sarcoma (SARC) [4] (Additional file 1: Fig. S2B, Additional file 2: Table S6b). We found that the sample size requirements differ for each gene and cancer cohort, likely due to varying penetrance. For example, six predisposing genes each are discovered in the PCPG (n = 146) and SARC (n = 217) samples of the European ancestry at their full cohort sizes. Upon downsampling each cohort to half its size, we found VHL, SDHB, RET, and NF1 to be still associated in 73 PCPG cases, whereas only TP53 remained significantly associated in 108 SARC cases. Even assuming similar penetrance of the predisposing genes across ancestries, this analysis implies that the discovery power is still far from saturation for most ancestry-specific cohorts (N < 100). The different predisposition landscapes across cancer types should also be accounted for in future study designs.
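The downsampling idea can be sketched as follows: repeatedly draw a random subset of the cases, re-test the gene, and record how often it remains significant. The test used here (a one-sided Fisher exact test on carrier counts at a nominal threshold), the carrier counts, and the function names are assumptions for illustration; the study's own association testing and multiple-testing correction are not reproduced.

```python
import random
from math import comb

def enrichment_p(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value for carrier enrichment among cases."""
    total = case_carriers + control_carriers
    tail = sum(comb(n_cases, k) * comb(n_controls, total - k)
               for k in range(case_carriers, min(total, n_cases) + 1))
    return tail / comb(n_cases + n_controls, total)

def downsample_detection_rate(case_status, control_carriers, n_controls,
                              fraction=0.5, alpha=0.05, n_iter=200, seed=0):
    """Fraction of random case subsets in which the gene stays significant."""
    rng = random.Random(seed)
    size = max(1, round(len(case_status) * fraction))
    hits = 0
    for _ in range(n_iter):
        subset = rng.sample(case_status, size)  # sampling without replacement
        if enrichment_p(sum(subset), size, control_carriers, n_controls) < alpha:
            hits += 1
    return hits / n_iter

# Hypothetical cohort: 146 cases with 12 carriers; 20 carriers among 8000 other cases.
cases = [1] * 12 + [0] * 134
print(downsample_detection_rate(cases, control_carriers=20, n_controls=8000))
```

Repeating this over a grid of fractions gives a per-gene detection curve, which is one way to see that genes with fewer carriers drop out at larger cohort sizes than genes with many carriers.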

Discussion

We report one of the most extensive multi-ancestry investigations of rare cancer predisposing genes to date, encompassing 9899 cancer cases across 33 cancer types. In the African ancestry, our results validated six known predisposing genes and nominated BRCA2 as a potential predisposing gene for LUSC (Fig. 1) previously shown only for Europeans. In the East Asian ancestry, we found predisposing variants affecting BRIP1 in STAD that warrant further investigation. Although the number of germline predisposing variants is small, they were associated with LOH (Fig. 3), biallelic mutations (Fig. 2), and gene expression effects in the tumor samples (Fig. 4), supporting their potential contribution to cancer predisposition in carriers.

In this TCGA cohort, we found multiple significant predisposing genes for the European ancestry and seven for the African ancestry, yet lack cancer cohorts with sufficient testing samples for many other ancestries, including Native/Latin American and South Asian that each constitute a considerable fraction of the US population. Even when tested, this study likely contains false negatives in multiple smaller cancer cohorts, especially those of non-Europeans. To achieve 80% power, the post hoc power calculation showed that the detection of rare variants (AF < 0.01) with moderate effect sizes (OR > 5) requires at least 1014 samples (Additional file 1: Fig. S2), a cohort size larger than any of the TCGA non-European cohorts.

It is necessary to use caution when interpreting the ancestry-specific predisposing gene associations identified herein or in previous studies of smaller sample sizes, where a handful of carriers may give rise to the association in a limited cancer cohort. Further, the suggestive associations nominated by the TFT analyses will need to be established by analyses of larger cohorts adjusted for potential confounders. Two of the associations we identified in the African ancestry were also complemented by familial studies [39, 42], providing further validation. To design future cancer genomics studies, one must note that the power considerations differ for discovering somatic driver genes and germline predisposing genes. Current detection powers have potentially reached saturation in detecting somatically mutated genes for sample sizes in multiple cancer types of TCGA [4], although racial disparities of the sequencing data could potentially limit the generalizability of findings [50–52]. We further highlighted that the imbalanced dataset limits power for germline gene discovery in populations under-represented in research studies.

We observed selected predisposing genes shared across ancestries (e.g., BRCA2 in BRCA/OV and SDHB in PCPG for both the African and European ancestries). Predisposing variants, on the other hand, are highly ancestry-specific (Fig. 2). Many of the predisposing variants found in the African or East Asian ancestry were not identified in the much larger European-ancestry population of TCGA (n = 8184) or even the gnomAD non-cancer cohort (n = 134,187) or submitted to ClinVar by clinical laboratories assessing patients for cancer predisposition. Rare variant classification and interpretation remain a challenge given the low frequency of observation precluding statistical associations. The identification of ancestry-specific predisposing variants further highlights this challenge in minority groups, where current germline sequencing often results in higher rates of variants of unknown significance (VUSs) [16].

Personalized medicine provides tailored disease diagnosis and treatment plans based on an individual’s unique genetic profile. The knowledge of different cancer predisposing genes and prevalence across ancestries suggests that we need to provide ancestry-specific interpretations of genetic data. In particular, many of the current guidelines for when genetic testing is recommended rely on the underlying likelihood of identifying a germline variant. Thus, accurate estimates of germline prevalence may alter recommendations for different patient populations. At the current sample sizes for minority cohorts, our study is still limited in power to discover and establish ancestry-specificity of predisposing genes (Additional file 1: Fig. S2). However, we were able to discover many ancestry-specific variants not currently submitted to ClinVar. Further, many of the diverse populations within the USA, not to mention worldwide, still lack representation in existing sequencing cohorts. Ongoing sequencing projects will begin to address this disparity within US populations (e.g., CSER [17], eMERGE III [18], Million Veteran Program [19], and the All of Us Research Program) and multiple countries in East Asia and Europe [53]. Yet, many populations, such as the diverse African ancestry [54], remain underserved, although projects like H3Africa are designed to address this problem. Additional efforts will be required to deliver the promise of genome-based precision medicine for all.

TCGA provides a powerful multi-omic sequencing dataset comprising more than ten thousand adult cancer cases [55, 56]. The dataset is used not only for characterizing somatic mutations and molecular subtypes but also enables studies of rare genetic predisposition and germline-somatic interactions [20, 38, 57–59]. However, in such applications, one needs to note that TCGA is neither a prospective cohort nor designed as a case-control study. Using the matched-ancestry cases of other cancer types as “controls” (the “Methods” section) is not ideal, yet they are the only available samples in the same study. The associations herein, therefore, may show biased effect sizes that require validation in carefully designed epidemiological studies. To enhance the confidence of the reported variants, we focused on identifying their somatic impacts, including LOH, ASE, and extreme gene expression levels that can be uniquely revealed in the multi-omic dataset.
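To make the "other cancer types as controls" design concrete, the sketch below computes an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval from a 2 × 2 table of carrier counts within one ancestry. It is a simplification that omits the covariate adjustment of a regression model; the counts, the function name, and the use of a 0.5 continuity correction for zero cells are assumptions.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: carriers in the cancer type of interest
    b: non-carriers in the cancer type of interest
    c: carriers among all other cancer types (same ancestry)
    d: non-carriers among all other cancer types (same ancestry)
    """
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction keeps the estimate finite with empty cells.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se)
    upper = exp(log(estimate) + z * se)
    return estimate, lower, upper

# Hypothetical counts: 3 carriers among 50 cases vs 4 carriers among 900 other cases.
print(odds_ratio_ci(3, 47, 4, 896))
```

With only a handful of carriers the interval becomes very wide, which is the same pattern seen in the confidence intervals reported for the smaller ancestry cohorts.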

To aid interpretation of low-frequency ancestry-specific variants, evidence of a somatic second hit event (i.e., loss of heterozygosity [LOH] or a biallelic mutation) in tumor samples can support functionality. Our analysis of the two-hit model identified second somatic events in two thirds (10/15) of the African ancestry-specific predisposing variants and in one out of six of the East Asian ancestry-specific predisposing variants (Additional file 2: Table S4b). Additionally, some carriers of ancestry-specific predisposing variants showed simultaneous extreme expression of the affected genes (Fig. 3). Such evidence derived from analysis of the somatic genome or transcriptome can be further utilized to characterize rare germline variants [60], especially when DNA-level analysis still suffers from limited sample sizes.

Our observation of somatic second hits (Figs. 2 and 3) and transcriptional effects (Fig. 4) coupled with germline variants also adds to the current literature on germline-somatic interactions in cancer [61]. While the majority of cancer genomic studies focus exclusively on the germline or somatic genome, pathogenic germline variants are associated with different somatic mutational signatures, allele-specific imbalance, or somatic drivers [20, 38, 58, 62, 63]. The availability of germline DNA analysis and tumor genomic and transcriptomic analyses from the same individual provides critical data for the analyses described here, which are not possible in the many studies that analyze germline DNA samples alone. Collectively, these findings provide roadmaps of how germline variants may trigger and collaborate with specific somatic mutations, eventually leading to cancer development. In this process, genomes across different ancestral populations provide different contexts for developing somatic mutations and genomic instability, even when the individual carries the same germline predisposition variant. We showcased examples of predisposition-associated LOH and gene expression changes in diverse individuals. As sample sizes of sequencing cohorts expand, analyzing germline-somatic interactions across ancestries will be pivotal to reveal potential ancestry-specific effects.

Conclusions

In summary, we identify ancestry-specific predisposing genes and variants contributing to multiple cancer types. The results provide insights into rare genetic predisposition and its somatic impacts in cases of African and East Asian ancestries. While the identified cancer predisposition genes are known, most predisposing variants are found to be exclusive within ancestries, supporting the “clan-genomics” hypothesis [64]. Continued studies using larger ancestry cohorts will be required to enable adequately powered discovery of predisposing genes and improve genetic screening for diverse populations [65].

Supplementary information

13073_2020_744_MOESM1_ESM.pdf (1.1MB, pdf)

Additional file 1: Figure S1. Principal component analyses (PCA) of germline TCGA samples to infer genetic ancestry as performed by PanCanAtlas Ancestry Informative Markers (AIM) working group. Figure S2. Power analysis for ancestry-specific sample sizes to discover predisposing genes. Figure S3. Nonsense-mediated decay prediction for predisposing frameshift variants in African and East Asian ancestries.

13073_2020_744_MOESM2_ESM.xlsx (387.5KB, xlsx)

Additional file 2: Table S1. The demographic information of TCGA PanCanAtlas cohort with separate admixture populations. Table S2a. Ancestry-specific cancer-gene associations discovered from multivariate regression analyses. Table S2b. Ancestry-specific cancer-gene associations discovered from rare variant burden testing (Total Frequency Test, TFT). Table S3. Frequency of predisposing variants in TCGA PanCanAtlas and gnomAD-non-cancer subset across all ancestries. Table S4a. Ancestry-Specific Predisposing Variants as identified from Supp. Table.2. Table S4b. Summary of somatic second hit mutations in carriers of germline predisposing variants. Table S5a. Statistical analysis of gene expression in tumor samples of the variant carriers vs. non-carriers within each ancestry-cancer combination. Table S5b. Tumor RNAseq variant allele fractions and the somatic second hit events in germline predisposing variants with extreme expression within that cancer type. Table S6a. Post hoc power analyses to detect rare-variant associations in an aggregation test using SKAT. Table S6b. Down-sampling analysis for PCPG and SARC (cancers with at least 5 significantly associated germline genes in the European ancestry). Table S7. Prior studies that report ancestry-specific germline predisposition.

Acknowledgements

The authors wish to acknowledge The Cancer Genome Atlas and its participating patients and families, who generously contributed the data. The authors would also like to acknowledge members of the TCGA PanCanAtlas Research network, particularly active members of the Germline Analysis Working Group and the Ancestry Informative Markers Analysis Working Group, for helpful discussions. ZHG acknowledges funds from the LUNGevity Foundation.

TCGA Analysis Network

Jian Carrot-Zhang1,2, Nyasha Chambwe3, Jeffrey S. Damrauer4, Theo A. Knijnenburg3, A. Gordon Robertson5, Christina Yau6,7, Wanding Zhou8, Ashton C. Berger1,2, Kuan-lin Huang9, R. Jay Mashl10, Justin Newberg11, Alessandro Romanel12, Rosalyn W. Sayaman13,14, Francesca Demichelis12, Ina Felau15, Garret Frampton11, Seunghun Han1,2, Katherine A. Hoadley4, Anab Kemal15, Peter W. Laird8, Alexander J. Lazar16, Xiuning Le17, Ninad Oak18, 19, Hui Shen8, Christopher K. Wong20, Jean C. Zenklusen15, Elad Ziv13,14, Francois Aguet1, Li Ding6, John A. Demchok15, Michael K.A. Mensah15, Roy Tarnuzzer15, Zhining Wang15, Liming Yang15, Jessica Alfoldi1, Konrad J. Karczewski1, Daniel G. MacArthur1, Garret M. Frampton11, Christopher Benz6, Joshua M. Stuart20, Andrew D. Cherniack1,2, Rameen Beroukhim1,2,21.

1. The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.

2. Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.

3. Institute for Systems Biology, Seattle, WA 98109, USA.

4. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.

5. British Columbia Cancer Agency, Genome Sciences Centre, Vancouver, Canada V5Z 4S6.

6. Buck Institute for Research on Aging, Novato, CA 94945, USA.

7. Department of Surgery, University of California, San Francisco, San Francisco, CA 94115, USA.

8. Van Andel Research Institute, Grand Rapids, MI 49503, USA.

9. Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 2129, USA.

10. Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA.

11. Cancer Genomics Research, Foundation Medicine, Inc., Cambridge, MA 02141, USA.

12. Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Via Sommarive 9 Povo (TN) 38123 Italy.

13. Department of Laboratory Medicine, Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94143, USA.

14. Department of Population Sciences, Beckman Research Institute, City of Hope, Duarte, CA 9210.

15. National Cancer Institute, Bethesda, MD 20892, USA.

16. Departments of Pathology, Genomic Medicine, and Translational Molecular Pathology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA.

17. Department of Thoracic and Head and Neck Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA.

18. Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA.

19. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

20. Department of Biomolecular Engineering, Center for Biomolecular Sciences and Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA.

21. Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA.

Abbreviations

ACC

Adrenocortical carcinoma

BLCA

Bladder urothelial carcinoma

BRCA

Breast invasive carcinoma

CESC

Cervical squamous cell carcinoma and endocervical adenocarcinoma

CHOL

Cholangiocarcinoma

COAD

Colon adenocarcinoma

DLBC

Lymphoid neoplasm diffuse large B cell lymphoma

ESCA

Esophageal carcinoma

GBM

Glioblastoma multiforme

HNSC

Head and neck squamous cell carcinoma

KICH

Kidney chromophobe

KIRC

Kidney renal clear cell carcinoma

KIRP

Kidney renal papillary cell carcinoma

LAML

Acute myeloid leukemia

LGG

Brain lower grade glioma

LIHC

Liver hepatocellular carcinoma

LUAD

Lung adenocarcinoma

LUSC

Lung squamous cell carcinoma

MESO

Mesothelioma

OV

Ovarian serous cystadenocarcinoma

PAAD

Pancreatic adenocarcinoma

PCPG

Pheochromocytoma and paraganglioma

PRAD

Prostate adenocarcinoma

READ

Rectum adenocarcinoma

SARC

Sarcoma

SKCM

Skin cutaneous melanoma

STAD

Stomach adenocarcinoma

TGCT

Testicular germ cell tumors

THCA

Thyroid carcinoma

THYM

Thymoma

UCEC

Uterine corpus endometrial carcinoma

UCS

Uterine carcinosarcoma

UVM

Uveal melanoma

SEER

Surveillance, Epidemiology, and End Results

TCGA

The Cancer Genome Atlas

VUS

Variants of unknown significance

LOH

Loss of heterozygosity

AIM

Ancestry informative markers

PCA

Principal component analyses

OR

Odds ratio

ExAC

Exome aggregation consortium

gnomAD

Genome Aggregation Database

IGV

Integrative Genomics Viewer

GDC

Genomic Data Commons

TFT

Total Frequency Test

ACMG-AMP

American College of Medical Genetics and Genomics–Association for Molecular Pathology

FDR

False discovery rate

VAF

Variant allele fraction

AC

Allele count

AN

Allele number

AF

Allele frequency

CI

Confidence interval

WES

Whole exome sequencing

WGS

Whole genome sequencing

Authors’ contributions

K.H. and N.O. conceived and designed the research and analyses. N.O., K.H., R.B., S.E.P., and L.D. acquired the data. The PanCanAtlas AIM working group, A.C., J.M., and K.H. conducted the genetic ancestry assignment. N.O. and K.H. conducted the analyses. N.O., K.H., and S.E.P. interpreted the results and drafted the manuscript. K.H. supervised the study. All authors read, edited, and approved the manuscript.

Funding

K.H. was supported by the Icahn School of Medicine at Mount Sinai seed fund to the Huang lab. This work was also funded by the ClinGen grant [1U41HG009649] supported by the National Human Genome Research Institute to SEP.

Availability of data and materials

Ancestry ascertainments for all samples in this study were obtained from the TCGA-AIM analysis working group’s publication and its supplementary information files [21]. Germline variant calls used in this study are available through controlled access data release of the TCGA PanCanAtlas germline working group (https://gdc.cancer.gov/about-data/publications/PanCanAtlas-Germline-AWG) [20]. Somatic mutation and expression data are publicly available at the National Cancer Institute’s GDC data portal (https://portal.gdc.cancer.gov/). All downstream analysis results generated in this study are available in Additional files 1 and 2.

Ethics approval and consent to participate

The need for Institutional Review Board Approval at our institution (Icahn School of Medicine at Mount Sinai, New York) was waived for this study as all data used from this project had previously been generated as part of The Cancer Genome Atlas Project and none of the results reported in this manuscript can be used to identify individual patients.

Consent for publication

Not applicable.

Competing interests

S.E.P. is a member of the Baylor Genetics laboratory scientific advisory panel. A.D.C receives research funding from Bayer. R.B. owns equity in Ampressa Therapeutics and receives research funding from Novartis. The remaining authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Kuan-lin Huang, Email: kuan-lin.huang@mssm.edu.

TCGA Analysis Network:

Jian Carrot-Zhang, Nyasha Chambwe, Jeffrey S. Damrauer, Theo A. Knijnenburg, A. Gordon Robertson, Christina Yau, Wanding Zhou, Ashton C. Berger, Kuan-lin Huang, R. Jay Mashl, Justin Newberg, Alessandro Romanel, Rosalyn W. Sayaman, Francesca Demichelis, Ina Felau, Garret Frampton, Seunghun Han, Katherine A. Hoadley, Anab Kemal, Peter W. Laird, Alexander J. Lazar, Xiuning Le, Ninad Oak, Hui Shen, Christopher K. Wong, Jean C. Zenklusen, Elad Ziv, Francois Aguet, Li Ding, John A. Demchok, Michael K. A. Mensah, Roy Tarnuzzer, Zhining Wang, Liming Yang, Jessica Alfoldi, Konrad J. Karczewski, Daniel G. MacArthur, Garret M. Frampton, Christopher Benz, Joshua M. Stuart, Andrew D. Cherniack, and Rameen Beroukhim

Supplementary information

Supplementary information accompanies this paper at 10.1186/s13073-020-00744-3.

References

  • 1.Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Clin. 2018;68:7–30. doi: 10.3322/caac.21442. [DOI] [PubMed] [Google Scholar]
  • 2.Noone AM, Howlader N, Krapcho M, Miller D, Brest A, Yu M, et al. SEER cancer statistics review. Bethesda: National Cancer Institute; 2018. [Google Scholar]
  • 3.Kurian AW. BRCA1 and BRCA2 mutations across race and ethnicity: distribution and clinical implications. Curr Opin Obstet Gynecol. 2010;22:72–78. doi: 10.1097/GCO.0b013e328332dca3. [DOI] [PubMed] [Google Scholar]
  • 4.Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature. 2014;505:495–501. doi: 10.1038/nature12912. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Huo D, Hu H, Rhie SK, Gamazon ER, Cherniack AD, Liu J, et al. Comparison of breast cancer molecular features and survival by African and European ancestry in The Cancer Genome Atlas. JAMA Oncol. 2017;3:1654–1662. doi: 10.1001/jamaoncol.2017.0595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Easton DF, Pooley KA, Dunning AM, Pharoah PDP, Thompson D, Ballinger DG, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447:1087–1093. doi: 10.1038/nature05887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Zheng W, Zhang B, Cai Q, Sung H, Michailidou K, Shi J, et al. Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls. Hum Mol Genet. 2013;22:2539–2550. doi: 10.1093/hmg/ddt089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Huo D, Feng Y, Haddad S, Zheng Y, Yao S, Han Y-J, et al. Genome-wide association studies in women of African ancestry identified 3q26.21 as a novel susceptibility locus for oestrogen receptor negative breast cancer. Hum Mol Genet. 2016;25:4835–4846. doi: 10.1093/hmg/ddw305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Nugent A, Conatser KR, Turner LL, Nugent JT, Sarino EMB, Ricks-Santi LJ. Reporting of race in genome and exome sequencing studies of cancer: a scoping review of the literature. Genet Med. 2019;21:2676–2680. doi: 10.1038/s41436-019-0558-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Lilyquist J, Ruddy KJ, Vachon CM, Couch FJ. Common genetic variation and breast cancer risk—past, present, and future. Cancer Epidemiol Prev Biomark. 2018;27:380–394. doi: 10.1158/1055-9965.EPI-17-1144. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Couch FJ, Shimelis H, Hu C, Hart SN, Polley EC, Na J, et al. Associations between cancer predisposition testing panel genes and breast cancer. JAMA Oncol. 2017;3:1190–1196. doi: 10.1001/jamaoncol.2017.0424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Churpek JE, Walsh T, Zheng Y, Moton Z, Thornton AM, Lee MK, et al. Inherited predisposition to breast cancer among African American women. Breast Cancer Res Treat. 2015;149:31–39. doi: 10.1007/s10549-014-3195-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Fackenthal JD, Zhang J, Zhang B, Zheng Y, Hagos F, Burrill DR, et al. High prevalence of BRCA1 and BRCA2 mutations in unselected Nigerian breast cancer patients. Int J Cancer. 2012;131:1114–1123. doi: 10.1002/ijc.27326. [DOI] [PubMed] [Google Scholar]
  • 14.Newman B, Mu H, Butler LM, Millikan RC, Moorman PG, King MC. Frequency of breast cancer attributable to BRCA1 in a population-based series of American women. JAMA. 1998;279:915–921. doi: 10.1001/jama.279.12.915. [DOI] [PubMed] [Google Scholar]
  • 15.Schubert EL, Lee MK, Mefford HC, Argonza RH, Morrow JE, Hull J, et al. BRCA2 in American families with four or more cases of breast or ovarian cancer: recurrent and novel mutations, variable expression, penetrance, and the possibility of families whose cancer is not attributable to BRCA1 or BRCA2. Am J Hum Genet. 1997;60:1031–1040. [PMC free article] [PubMed] [Google Scholar]
  • 16.Caswell-Jin JL, Gupta T, Hall E, Petrovchich IM, Mills MA, Kingham KE, et al. Racial/ethnic differences in multiple-gene sequencing results for hereditary cancer risk. Genet Med. 2018;20:234–239. doi: 10.1038/gim.2017.96. [DOI] [PubMed] [Google Scholar]
  • 17.Amendola LM, Berg JS, Horowitz CR, Angelo F, Bensen JT, Biesecker BB, et al. The clinical sequencing evidence-generating research consortium: integrating genomic sequencing in diverse and medically underserved populations. Am J Hum Genet. 2018;103:319–327. doi: 10.1016/j.ajhg.2018.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Harmonizing Clinical Sequencing and Interpretation for the eMERGE III Network. Am J Hum Genet. 2019;105:588–605. [DOI] [PMC free article] [PubMed]
  • 19.Gaziano JM, Concato J, Brophy M, Fiore L, Pyarajan S, Breeling J, et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J Clin Epidemiol. 2016;70:214–223. doi: 10.1016/j.jclinepi.2015.09.016. [DOI] [PubMed] [Google Scholar]
  • 20.Huang Kuan-lin, Mashl R. Jay, Wu Yige, Ritter Deborah I., Wang Jiayin, Oh Clara, Paczkowska Marta, Reynolds Sheila, Wyczalkowski Matthew A., Oak Ninad, Scott Adam D., Krassowski Michal, Cherniack Andrew D., Houlahan Kathleen E., Jayasinghe Reyka, Wang Liang-Bo, Zhou Daniel Cui, Liu Di, Cao Song, Kim Young Won, Koire Amanda, McMichael Joshua F., Hucthagowder Vishwanathan, Kim Tae-Beom, Hahn Abigail, Wang Chen, McLellan Michael D., Al-Mulla Fahd, Johnson Kimberly J., Lichtarge Olivier, Boutros Paul C., Raphael Benjamin, Lazar Alexander J., Zhang Wei, Wendl Michael C., Govindan Ramaswamy, Jain Sanjay, Wheeler David, Kulkarni Shashikant, Dipersio John F., Reimand Jüri, Meric-Bernstam Funda, Chen Ken, Shmulevich Ilya, Plon Sharon E., Chen Feng, Ding Li; Cancer Genome Atlas Research Network. Pathogenic Germline Variants in 10,389 Adult Cancers. Cell. 2018;173(2):355-370.e14. doi: 10.1016/j.cell.2018.03.039.
  • 21.Carrot-Zhang Jian, Chambwe Nyasha, Damrauer Jeffrey S., Knijnenburg Theo A., Robertson A. Gordon, Yau Christina, Zhou Wanding, Berger Ashton C., Huang Kuan-lin, Newberg Justin Y., Mashl R. Jay, Romanel Alessandro, Sayaman Rosalyn W., Demichelis Francesca, Felau Ina, Frampton Garrett M., Han Seunghun, Hoadley Katherine A., Kemal Anab, Laird Peter W., Lazar Alexander J., Le Xiuning, Oak Ninad, Shen Hui, Wong Christopher K., Zenklusen Jean C., Ziv Elad, Cherniack Andrew D., Beroukhim Rameen, Aguet Francois, Ding Li, Demchok John A., Mensah Michael K.A., Caesar-Johnson Samantha, Tarnuzzer Roy, Wang Zhining, Yang Liming, Alfoldi Jessica, Karczewski Konrad J., MacArthur Daniel G., Meyerson Matthew, Benz Christopher, Stuart Joshua M. Comprehensive Analysis of Genetic Ancestry and Its Molecular Correlates in Cancer. Cancer Cell. 2020;37(5):639-654.e6. doi: 10.1016/j.ccell.2020.04.012.
  • 22.Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–1664. doi: 10.1101/gr.094052.109.
  • 23.Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 2012;22:568–576. doi: 10.1101/gr.129684.111.
  • 24.McKenna AH, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
  • 25.Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics. 2009;25:2865–2871. doi: 10.1093/bioinformatics/btp394.
  • 26.Scott Adam D, Huang Kuan-Lin, Weerasinghe Amila, Mashl R Jay, Gao Qingsong, Martins Rodrigues Fernanda, Wyczalkowski Matthew A, Ding Li. CharGer: clinical Characterization of Germline variants. Bioinformatics. 2018;35(5):865–867. doi: 10.1093/bioinformatics/bty649.
  • 27.Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–423. doi: 10.1038/gim.2015.30.
  • 28.Purcell Shaun, Neale Benjamin, Todd-Brown Kathe, Thomas Lori, Ferreira Manuel A.R., Bender David, Maller Julian, Sklar Pamela, de Bakker Paul I.W., Daly Mark J., Sham Pak C. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. The American Journal of Human Genetics. 2007;81(3):559–575. doi: 10.1086/519795.
  • 29.Basu S, Pan W. Comparison of statistical tests for disease association with rare variants. Genet Epidemiol. 2011;35:606–619. doi: 10.1002/gepi.20609.
  • 30.Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv. 2019;531210. doi: 10.1101/531210.
  • 31.Lek M, Karczewski KJ, Samocha KE, Banks E, Fennell T, O’Donnell-Luria AH, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285–291.
  • 32.Li Quan, Wang Kai. InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines. The American Journal of Human Genetics. 2017;100(2):267–280. doi: 10.1016/j.ajhg.2017.01.004.
  • 33.Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:1–7. doi: 10.1093/nar/gkq603.
  • 34.Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 2011;89:82–93. doi: 10.1016/j.ajhg.2011.05.029.
  • 35.Esai Selvan Myvizhi, Klein Robert J., Gümüş Zeynep H. Rare, Pathogenic Germline Variants in Fanconi Anemia Genes Increase Risk for Squamous Lung Cancer. Clinical Cancer Research. 2018;25(5):1517–1525. doi: 10.1158/1078-0432.CCR-18-2660.
  • 36.Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nat Genet. 2014;46:736–741. doi: 10.1038/ng.3002.
  • 37.Gayther SA, Mangion J, Russell P, Seal S, Barfoot R, Ponder BAJ, et al. Variation of risks of breast and ovarian cancer associated with different germline mutations of the BRCA2 gene. Nat Genet. 1997;15:103–105. doi: 10.1038/ng0197-103.
  • 38.Lu C, Xie M, Wendl MC, Wang J, McLellan MD, Leiserson MDM, et al. Patterns and functional implications of rare germline variants across 12 cancer types. Nat Commun. 2015;6:10086. doi: 10.1038/ncomms10086.
  • 39.Majumdar S, Friedrich CA, Koch CA, Megason GC, Fratkin JD, Moll GW. Compound heterozygous mutation with a novel splice donor region DNA sequence variant in the succinate dehydrogenase subunit B gene in malignant paraganglioma. Pediatr Blood Cancer. 2010;54:473–475. doi: 10.1002/pbc.22338.
  • 40.Beebe-Dimmer JL, Zuhlke KA, Johnson AM, Liesman D, Cooney KA. Rare germline mutations in African American men diagnosed with early-onset prostate cancer. Prostate. 2018;78:321–326. doi: 10.1002/pros.23464.
  • 41.Na R, Zheng SL, Han M, Yu H, Jiang D, Shah S, et al. Germline mutations in ATM and BRCA1/2 distinguish risk for lethal and indolent prostate cancer and are associated with early age at death. Eur Urol. 2017;71:740–747. doi: 10.1016/j.eururo.2016.11.033.
  • 42.Wei M-H, Toure O, Glenn GM, Pithukpakorn M, Neckers L, Stolle C, et al. Novel mutations in FH and expansion of the spectrum of phenotypes expressed in families with hereditary leiomyomatosis and renal cell cancer. J Med Genet. 2006;43:18–27. doi: 10.1136/jmg.2005.033506.
  • 43.Krishnan B, Rose TL, Kardos J, Milowsky MI, Kim WY. Intrinsic genomic differences between African American and white patients with clear cell renal cell carcinoma. JAMA Oncol. 2016;2:664–667. doi: 10.1001/jamaoncol.2016.0005.
  • 44.Sutton J, Orloff MS, Michener C, Chiesa-Vottero A, Prayson R, Nowacki AS, et al. Association of specific PTEN/10q haplotypes with endometrial cancer phenotypes in African-American and European American women. Gynecol Oncol. 2015;138:434–440. doi: 10.1016/j.ygyno.2015.05.024.
  • 45.Landrum Melissa J, Lee Jennifer M, Benson Mark, Brown Garth R, Chao Chen, Chitipiralla Shanmuga, Gu Baoshan, Hart Jennifer, Hoffman Douglas, Jang Wonhee, Karapetyan Karen, Katz Kenneth, Liu Chunlei, Maddipatla Zenith, Malheiro Adriana, McDaniel Kurt, Ovetsky Michael, Riley George, Zhou George, Holmes J Bradley, Kattman Brandi L, Maglott Donna R. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Research. 2017;46(D1):D1062–D1067. doi: 10.1093/nar/gkx1153.
  • 46.Knudson A. G. Mutation and Cancer: Statistical Study of Retinoblastoma. Proceedings of the National Academy of Sciences. 1971;68(4):820–823. doi: 10.1073/pnas.68.4.820.
  • 47.Knudson Alfred G. Two genetic hits (more or less) to cancer. Nature Reviews Cancer. 2001;1(2):157–162. doi: 10.1038/35101031.
  • 48.Hsu Min-Kung, Lin Hsuan-Yu, Chen Feng-Chi. NMD Classifier: A reliable and systematic classification tool for nonsense-mediated decay events. PLOS ONE. 2017;12(4):e0174798. doi: 10.1371/journal.pone.0174798.
  • 49.Turnbull Clare, Rahman Nazneen. Genetic Predisposition to Breast Cancer: Past, Present, and Future. Annual Review of Genomics and Human Genetics. 2008;9(1):321–345. doi: 10.1146/annurev.genom.9.081307.164339.
  • 50.Spratt Daniel E., Chan Tiffany, Waldron Levi, Speers Corey, Feng Felix Y., Ogunwobi Olorunseun O., Osborne Joseph R. Racial/Ethnic Disparities in Genomic Sequencing. JAMA Oncology. 2016;2(8):1070. doi: 10.1001/jamaoncol.2016.1854.
  • 51.Berger Ashton C., Korkut Anil, Kanchi Rupa S., Hegde Apurva M., Lenoir Walter, Liu Wenbin, Liu Yuexin, Fan Huihui, Shen Hui, Ravikumar Visweswaran, Rao Arvind, Schultz Andre, Li Xubin, Sumazin Pavel, Williams Cecilia, Mestdagh Pieter, Gunaratne Preethi H., Yau Christina, Bowlby Reanne, Robertson A. Gordon, Tiezzi Daniel G., Wang Chen, Cherniack Andrew D., Godwin Andrew K., Kuderer Nicole M., Rader Janet S., Zuna Rosemary E., Sood Anil K., Lazar Alexander J., Ojesina Akinyemi I., Adebamowo Clement, Adebamowo Sally N., Baggerly Keith A., Chen Ting-Wen, Chiu Hua-Sheng, Lefever Steve, Liu Liang, MacKenzie Karen, Orsulic Sandra, Roszik Jason, Shelley Carl Simon, Song Qianqian, Vellano Christopher P., Wentzensen Nicolas, Weinstein John N., Mills Gordon B., Levine Douglas A., Akbani Rehan; Cancer Genome Atlas Research Network. A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers. Cancer Cell. 2018;33(4):690-705.e9. doi: 10.1016/j.ccell.2018.03.014.
  • 52.Yuan Jiao, Hu Zhongyi, Mahal Brandon A., Zhao Sihai D., Kensler Kevin H., Pi Jingjiang, Hu Xiaowen, Zhang Youyou, Wang Yueying, Jiang Junjie, Li Chunsheng, Zhong Xiaomin, Montone Kathleen T., Guan Guoqiang, Tanyi Janos L., Fan Yi, Xu Xiaowei, Morgan Mark A., Long Meixiao, Zhang Yuzhen, Zhang Rugang, Sood Anil K., Rebbeck Timothy R., Dang Chi V., Zhang Lin. Integrated Analysis of Genetic Ancestry and Genomic Alterations across Cancers. Cancer Cell. 2018;34(4):549-560.e9. doi: 10.1016/j.ccell.2018.08.019.
  • 53.Abul-Husn NS, Kenny EE. Personalized medicine and the power of electronic health records. Cell. 2019;177:58–69. doi: 10.1016/j.cell.2019.02.039.
  • 54.Tishkoff SA, Reed FA, Friedlaender FR, Ehret C, Ranciaro A, Froment A, et al. The genetic structure and history of Africans and African Americans. Science. 2009;324:1035–1044. doi: 10.1126/science.1172257.
  • 55.Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113–1120. doi: 10.1038/ng.2764. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Ding L, Bailey MH, Porta-Pardo E, Thorsson V, Colaprico A, Bertrand D, et al. Perspective on oncogenic processes at the end of the beginning of cancer genomics. Cell. 2018;173:305–320.e10. doi: 10.1016/j.cell.2018.03.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Park S, Supek F, Lehner B. Systematic discovery of germline cancer predisposition genes through the identification of somatic second hits. Nat Commun. 2018;9:2601. doi: 10.1038/s41467-018-04900-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Carter H, Marty R, Hofree M, Gross AM, Jensen J, Fisch KM, et al. Interaction landscape of inherited polymorphisms with somatic events in cancer. Cancer Discov. 2017;7:410–423. doi: 10.1158/2159-8290.CD-16-1045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Shahamatdar S, He MX, Reyna MA, Gusev A, AlDubayan SH, Van Allen EM, et al. Germline features associated with immune infiltration in solid tumors. Cell Rep. 2020;30:2900–2908.e4. doi: 10.1016/j.celrep.2020.02.039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Walsh Michael F., Ritter Deborah I., Kesserwan Chimene, Sonkin Dmitriy, Chakravarty Debyani, Chao Elizabeth, Ghosh Rajarshi, Kemel Yelena, Wu Gang, Lee Kristy, Kulkarni Shashikant, Hedges Dale, Mandelker Diana, Ceyhan‐Birsoy Ozge, Luo Minjie, Drazer Michael, Zhang Liying, Offit Kenneth, Plon Sharon E. Integrating somatic variant data and biomarkers for germline variant classification in cancer predisposition genes. Human Mutation. 2018;39(11):1542–1552. doi: 10.1002/humu.23640. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Ramroop JR, Gerber MM, Toland AE. Germline variants impact somatic events during tumorigenesis. Trends Genet. 2019;35:515–526. doi: 10.1016/j.tig.2019.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Campbell PJ, Getz G, Korbel JO, et al. Pan-cancer analysis of whole genomes. Nature. 2020;578:82–93. doi: 10.1038/s41586-020-1969-6. [DOI] [PMC free article] [PubMed]
  • 63.Waszak SM, Tiao G, Zhu B, Rausch T, Muyas F, Rodríguez-Martín B, et al. Germline determinants of the somatic mutation landscape in 2,642 cancer genomes. bioRxiv. 2017;208330.
  • 64.Lupski JR, Belmont JW, Boerwinkle E, Gibbs RA. Clan genomics and the complex architecture of human disease. Cell. 2011;147:32–43. doi: 10.1016/j.cell.2011.09.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Hindorff LA, Bonham VL, Brody LC, Ginoza MEC, Hutter CM, Manolio TA, et al. Prioritizing diversity in human genomics research. Nat Rev Genet. 2018;19:175–185. doi: 10.1038/nrg.2017.89. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

13073_2020_744_MOESM1_ESM.pdf (1.1MB, pdf)

Additional file 1: Figure S1. Principal component analyses (PCA) of germline TCGA samples to infer genetic ancestry as performed by PanCanAtlas Ancestry Informative Markers (AIM) working group. Figure S2. Power analysis for ancestry-specific sample sizes to discover predisposing genes. Figure S3. Nonsense-mediated decay prediction for predisposing frameshift variants in African and East Asian ancestries.

13073_2020_744_MOESM2_ESM.xlsx (387.5KB, xlsx)

Additional file 2: Table S1. Demographic information of the TCGA PanCanAtlas cohort with separate admixture populations. Table S2a. Ancestry-specific cancer-gene associations discovered from multivariate regression analyses. Table S2b. Ancestry-specific cancer-gene associations discovered from rare variant burden testing (Total Frequency Test, TFT). Table S3. Frequency of predisposing variants in TCGA PanCanAtlas and the gnomAD non-cancer subset across all ancestries. Table S4a. Ancestry-specific predisposing variants as identified from Table S2. Table S4b. Summary of somatic second-hit mutations in carriers of germline predisposing variants. Table S5a. Statistical analysis of gene expression in tumor samples of the variant carriers vs. non-carriers within each ancestry-cancer combination. Table S5b. Tumor RNA-seq variant allele fractions and the somatic second-hit events in germline predisposing variants with extreme expression within that cancer type. Table S6a. Post hoc power analyses to detect rare-variant associations in an aggregation test using SKAT. Table S6b. Down-sampling analysis for PCPG and SARC (cancers with at least 5 significantly associated germline genes in the European ancestry). Table S7. Prior studies that report ancestry-specific germline predisposition.

Data Availability Statement

Ancestry assignments for all samples in this study were obtained from the TCGA-AIM analysis working group’s publication and its supplementary information files [21]. Germline variant calls used in this study are available through the controlled-access data release of the TCGA PanCanAtlas germline working group (https://gdc.cancer.gov/about-data/publications/PanCanAtlas-Germline-AWG) [20]. Somatic mutation and expression data are publicly available at the National Cancer Institute’s GDC data portal (https://portal.gdc.cancer.gov/). All downstream analysis results generated in this study are available in Additional files 1 and 2.


Articles from Genome Medicine are provided here courtesy of BMC
