Phenotypic evaluation of deep learning models for classifying germline variant pathogenicity

Ryan D Chow; Katherine L Nathanson; Ravi B Parikh

doi:10.1038/s41698-024-00710-x

2024 Oct 19;8:235. doi: 10.1038/s41698-024-00710-x


PMCID: PMC11490490 PMID: 39427061

Abstract

Deep learning models for predicting variant pathogenicity have not been thoroughly evaluated on real-world clinical phenotypes. Here, we apply state-of-the-art pathogenicity prediction models to hereditary breast cancer gene variants in UK Biobank participants. Model predictions for missense variants in BRCA1, BRCA2 and PALB2, but not ATM and CHEK2, were associated with breast cancer risk. However, deep learning models had limited clinical utility when specifically applied to variants of uncertain significance.

Subject terms: Cancer genetics, Breast cancer, Computational biology and bioinformatics, Molecular medicine

A core challenge in the interpretation of germline genetic testing is distinguishing pathogenic vs. benign variants¹. This decision point has clinical ramifications, as individuals carrying pathogenic variants may benefit from more intensive screening and treatment^2,3. Many identified alterations have not been well-characterized and are thus deemed variants of uncertain significance (VUSs)^4–6. Based on conserved patterns in protein sequences and structures, deep learning models have been developed that can independently recapitulate variant pathogenicity annotations from gold-standard clinical references such as the expert-curated ClinVar database^7–10. Furthermore, deep learning models can be used to predict the pathogenicity of unannotated missense variants, potentially offering a scalable approach to help close the interpretation gap for VUSs by enhancing current variant classification schema. However, these models have yet to be rigorously evaluated in relation to clinical disease phenotypes—for instance, whether VUSs predicted to be pathogenic are functionally associated with increased risk of disease within a population. As clinical guidelines currently discourage the use of computational pathogenicity predictions alone to guide decision-making¹¹, a key step towards clinical application of deep learning models is to assess whether they can meaningfully distinguish variants that confer increased risk of disease in a clinical setting.

Here, we applied three state-of-the-art deep learning models for pathogenicity prediction (AlphaMissense⁷, EVE⁸, and ESM1b^9,10; Table S1) to classify germline missense variants in UK Biobank participants^12,13. We benchmarked model-based pathogenicity predictions for five hereditary breast cancer genes^14,15 (BRCA1, BRCA2, ATM, CHEK2, and PALB2), evaluating whether current deep learning models can classify pathogenic germline variants that are functionally associated with increased breast cancer risk in a real-world setting. To expand our analysis to another cancer with well-established hereditary risk genes, we further evaluated pathogenicity predictions for BRCA1 and BRCA2 in relation to ovarian/fallopian cancers.

We identified 469,623 UK Biobank participants with matched exome sequencing and health record data (Fig. 1a). We compiled all missense variants in BRCA1, BRCA2, ATM, CHEK2, and PALB2 among UK Biobank participants, including both males and females (Table S2). Classifying each variant based on ClinVar annotations, we observed that the majority were VUSs (including conflicting interpretations) (Fig. 1b). At the participant-level, most participants with missense variants in a hereditary breast cancer gene were classified as benign variant carriers (Fig. 1c), with the exception of CHEK2.

We then compared the ability of each deep learning model to recapitulate ClinVar variant annotations (Tables S3-4). Whereas AlphaMissense and EVE each employ two thresholds to define pathogenic, benign, or ambiguous variants, the ESM1b model instead uses a single threshold to binarize all variants into pathogenic or benign categories. EVE model predictions were unavailable for CHEK2, and none of the identified PALB2 variants were classified as pathogenic in ClinVar; thus, these comparisons were excluded from analysis. AlphaMissense and ESM1b correctly labeled the majority of ClinVar benign variants, in contrast with EVE (Fig. 1d). Similarly, the deep learning models recapitulated pathogenic ClinVar variant annotations, though performance varied across genes. The majority of ClinVar VUSs were predicted to be benign by AlphaMissense and ESM1b, whereas EVE classifications were evenly split between benign and ambiguous designations.

Prior studies on variant pathogenicity prediction have largely relied on ClinVar variant annotations as the gold-standard reference for model validation. Thus, we sought to conduct a real-world assessment of whether deep learning models could identify pathogenic variants that confer increased cancer risk. This approach would enable us to evaluate the utility of deep learning models when applied to ClinVar VUSs—a clinically relevant task that could not be assessed in prior studies that validated model predictions on ClinVar annotations.

Of the 254,523 female UK Biobank participants meeting inclusion criteria, 11,496 (4.52%) had been diagnosed with breast cancer at the time of analysis. As a positive control, we first constructed Firth’s penalized multivariable logistic regression models including BRCA1 variant pathogenicity (as defined by ClinVar, comparing pathogenic vs benign variants) and age at enrollment in the UK Biobank as covariates. We found that ClinVar-defined pathogenic BRCA1 variant status and older age were both associated with increased risk of breast cancer in the multivariable model (Fig. 2a; Table S5). ClinVar-defined pathogenic variants in BRCA2, ATM and CHEK2 were similarly associated with increased breast cancer risk. To validate our approach in a distinct cancer type, we confirmed that ClinVar pathogenic variants in BRCA1 and BRCA2 also conferred increased risk of ovarian/fallopian cancers (Fig. 2b; Table S6).

Fig. 2. a, b Summary statistics from Firth’s penalized logistic regression models, evaluating the association between risk of breast cancer (a) or ovarian/fallopian cancer (b) with variant pathogenicity, as determined by ClinVar or each deep learning classifier (left column). Composite classifiers were created by augmenting ClinVar annotations with model-based predictions for VUSs (middle column). Each deep learning model was also evaluated on ClinVar VUS carriers (right column). Points are color-coded by the magnitude of the log odds ratio (OR), with positive values indicating increased cancer risk. Points are also size-scaled by statistical significance, expressed as -log₁₀ p-values. All models included participant age at the time of UK Biobank enrollment as a covariate. EVE model predictions were unavailable for *CHEK2*, and none of the identified *PALB2* variants were classified as pathogenic in ClinVar; thus, these comparisons were excluded from analysis. c The percentage of participants annotated as VUS carriers when applying each of the different classifiers. Data in the right-hand panel indicate the percentage of participants classified as ClinVar VUS carriers who are then further classified by the deep learning model as carriers of an ambiguous variant.

We then assessed the utility of the deep learning models for discriminating pathogenic variants that functionally confer increased cancer risk. AlphaMissense (but not ESM1b or EVE) distinguished pathogenic BRCA1 variant carriers with higher risk of breast cancer (Fig. 2a), but not ovarian/fallopian cancer (Fig. 2b). For BRCA2, both AlphaMissense and ESM1b identified pathogenic variants that conferred increased risk of breast cancer, while AlphaMissense alone identified pathogenic variants associated with ovarian/fallopian cancer risk. Whereas ClinVar-defined pathogenic variants in ATM and CHEK2 were associated with increased breast cancer risk, all three deep learning models failed to predict functionally pathogenic variants in ATM and CHEK2. Conversely, even though none of the PALB2 missense variants were annotated as pathogenic by ClinVar, both AlphaMissense and ESM1b identified variants in PALB2 as potentially pathogenic that were associated with increased breast cancer risk.

To more closely simulate how deep learning models might be applied in clinical practice, we constructed composite classifiers by augmenting ClinVar classifications with model-based pathogenicity predictions for ClinVar VUSs. The resulting composite classifiers using either AlphaMissense or ESM1b successfully distinguished pathogenic variants in BRCA1, BRCA2, and PALB2 associated with higher risk of cancer (Fig. 2a, b), while reducing the proportion of participants annotated as VUS carriers (Fig. 2c). For instance, 0.888% of all BRCA1 missense variant carriers were originally annotated by ClinVar as VUS carriers, decreasing to 0.0436% when using the AlphaMissense composite classifier and to 0% with the ESM1b classifier. However, we noted that the composite classifiers had reduced discriminative power compared to ClinVar annotations alone. As an example, ClinVar-defined BRCA1 pathogenic missense variants had a log odds ratio (OR) for breast cancer of 1.89 [1.32–2.4] (p = 1.05 × 10⁻⁸) in this setting, compared to 0.45 [0.38–1.09] (p = 1.51 × 10⁻⁴) for the AlphaMissense composite classifier. For ATM and CHEK2, ClinVar annotations in isolation could distinguish functionally pathogenic variants, whereas the composite classifiers could not. Thus, we next evaluated the deep learning models exclusively on ClinVar VUS carriers. When applied to VUS carriers only, none of the deep learning models effectively distinguished pathogenic variants in BRCA1, BRCA2, ATM, or CHEK2 that conferred increased cancer risk (Fig. 2a, b). In contrast, AlphaMissense and ESM1b both maintained discriminative power for distinguishing PALB2 VUSs associated with increased breast cancer risk.

All three deep learning models generate continuous pathogenicity scores that are subsequently partitioned into pathogenicity predictions using a shared threshold across genes. Given the observed variation in model performance across genes, we tested a range of AlphaMissense score thresholds to ascertain whether gene-specific cutoffs for defining pathogenic variants might improve model performance (Tables S7-S11). For ATM and CHEK2, although the default AlphaMissense score threshold failed to enrich for variants associated with increased breast cancer risk, higher thresholds could achieve statistical significance with improved effect sizes (Fig. 3a, b). For PALB2, BRCA1 and BRCA2, higher AlphaMissense score thresholds further improved discriminative power (Fig. 3c–g), at times achieving effect sizes comparable to ClinVar annotations.

Fig. 3. a–g Log ORs from Firth’s penalized logistic regression models with 95% confidence intervals (CIs) shown, evaluating the association between risk of breast cancer (a–e) or ovarian/fallopian cancer (f, g) with variant pathogenicity, defined by applying a range of AlphaMissense score thresholds to *ATM* (a), *CHEK2* (b), *PALB2* (c), *BRCA1* (d, f), and *BRCA2* (e, g). For comparison, the regression model using ClinVar variant annotations is shown (95% CIs shaded in). Points are colored by the positive predictive value (PPV) for carrying a predicted pathogenic variant in relation to cancer diagnoses and size-scaled by statistical significance. Asterisks denote p < 0.05. All models included age at the time of UK Biobank enrollment as a covariate.

In summary, we investigated the real-world utility of three state-of-the-art deep learning models (AlphaMissense, EVE, and ESM1b) for classifying missense variants in hereditary breast cancer genes, mimicking the clinical scenario of adjudicating cancer risk based on germline genetic profiling. We demonstrate that model-based pathogenicity predictions were associated with human disease phenotypes in a real-world context, underscoring their potential for clinical application. In particular, pathogenicity prediction models could fill key knowledge gaps by informing variant classification for genes that are less well-studied. Although none of the identified PALB2 variants were annotated as pathogenic by ClinVar, AlphaMissense and ESM1b both successfully identified PALB2 variants that conferred increased breast cancer risk, representing a concrete example in which deep learning models could inform variant classification in the context of ACMG/AMP guidelines and thus impact clinical decision-making¹¹. Additionally, composite classifiers reduced the proportion of individuals who were classified as VUS carriers. While this reduction in indeterminate classifications generally came at the expense of discriminative power, such a tradeoff may be warranted in circumstances where minimizing the proportion of VUS carriers is the priority.

Prior studies have often assessed model performance in terms of recapitulating ClinVar pathogenic vs benign classifications. By design, this approach precludes critical evaluation of model predictions for VUSs—arguably the primary task that deep learning models were designed to address. Our study instead directly benchmarked model performance on the disease phenotype of interest—e.g., breast cancer diagnosis—thereby enabling a dedicated assessment of whether the models could meaningfully classify VUSs. To that end, we find that current deep learning models have limited clinical utility for predicting VUS pathogenicity in most of the hereditary cancer genes studied here, and thus would not be informative for variant classification in the context of ACMG/AMP guidelines¹¹. Aside from PALB2, VUSs predicted to be pathogenic generally did not confer increased cancer risk relative to predicted-benign VUSs.

Our results further indicate that variation in model performance across genes remains an important challenge precluding broad clinical application. In seeking to address the issue of gene-wise variation in model performance, we demonstrate that a scanning approach to optimize gene-specific score thresholds could increase the specificity of AlphaMissense for identifying functional pathogenic variants. This variation in model performance may stem in part from underlying biases in the training data, such that certain archetypes of proteins and pathogenic variants may be comparatively underrepresented. Additionally, a large proportion of missense variants within genes may confer widely varying degrees of associated cancer risk, and functionally deleterious mutations may also have low disease penetrance, posing challenges for pathogenicity classification. These phenomena have been well-documented for CHEK2^16–18 and may help explain our findings regarding the limited utility of deep learning models for CHEK2 missense variant classification.

Our study highlights the value of clinicogenomic data for evaluating the real-world clinical utility of pathogenicity prediction algorithms. A limitation with the use of ClinVar and similar variant databases as the gold-standard reference for model evaluation is that such databases can perpetuate racial and ethnic disparities in the existing literature. For instance, VUSs are more common in individuals of African and Asian descent compared to those of European descent^19–21. Thus, pathogenicity prediction models trained on existing variant databases may exhibit reduced predictive performance when applied to other populations. Moving forward, we anticipate that the upfront inclusion of more diverse clinicogenomic cohorts^22–25 will be critical for empowering the next generation of pathogenicity prediction models.

Methods

Study Design and Participants

We analyzed data from the UK Biobank, a prospective cohort study comprising approximately 500,000 participants aged 37–73 years living in the UK who were recruited between 2006 and 2010. UK Biobank participants have matched genomic profiling and longitudinal health record data, including cancer diagnosis history. For this study, all participants with matched exome sequencing profiles and linked health record data were included (total n = 469,623 participants). All participants provided written informed consent, which was approved by the North West Multicenter Research Ethics Committee. As the present study involved reanalysis of fully de-identified preexisting data, no additional approval was required. This study was performed in accordance with the Declaration of Helsinki.

Defining cancer diagnoses

Participants in the UK Biobank were linked to national cancer registries, included as data fields 40006 (ICD10) and 40013 (ICD9). Using the UK Biobank Research Analysis Platform (RAP), we queried the entire cohort for all instances of the “Type of cancer” entry within the cancer registry data, annotated by ICD9 and ICD10 codes. Breast cancer diagnoses were defined by ICD9 (174*) and/or ICD10 (C50.* or D48.6). To assess whether our findings were consistent beyond the context of breast cancer, we also considered diagnoses of ovarian or fallopian tube cancers for BRCA1 and BRCA2 analyses: ICD9 (1830, 1832) and/or ICD10 (C56, C57.0, C57.4, D39.1). All participants with matched genomic and clinical data were included for the analyses comparing variant pathogenicity annotations between ClinVar and each deep learning model. For analyses of breast or ovarian cancer risk in relation to pathogenic vs benign variant carrier status, only female participants were selected.
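The diagnosis definitions above can be sketched as a small code-matching helper. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the dot-stripping step is an assumption about how registry codes may be stored (for example "C50.9" versus "C509").

```python
# Code lists come from the Methods text; everything else is illustrative.
BREAST_ICD9_PREFIXES = ("174",)            # ICD9 174*
BREAST_ICD10_PREFIXES = ("C50",)           # ICD10 C50.*
BREAST_ICD10_EXACT = {"D486"}              # ICD10 D48.6
OVARIAN_ICD9_EXACT = {"1830", "1832"}      # ICD9 1830, 1832
OVARIAN_ICD10_EXACT = {"C56", "C570", "C574", "D391"}


def normalize(code: str) -> str:
    """Strip dots and whitespace so 'C50.9' and 'C509' compare equal (assumption)."""
    return code.replace(".", "").strip().upper()


def is_breast_cancer(code: str, icd_version: int) -> bool:
    """Return True if the code falls under the breast cancer definition."""
    c = normalize(code)
    if icd_version == 9:
        return c.startswith(BREAST_ICD9_PREFIXES)
    if icd_version == 10:
        return c.startswith(BREAST_ICD10_PREFIXES) or c in BREAST_ICD10_EXACT
    raise ValueError("icd_version must be 9 or 10")


def is_ovarian_fallopian_cancer(code: str, icd_version: int) -> bool:
    """Return True if the code falls under the ovarian/fallopian definition."""
    c = normalize(code)
    if icd_version == 9:
        return c in OVARIAN_ICD9_EXACT
    if icd_version == 10:
        return c in OVARIAN_ICD10_EXACT
    raise ValueError("icd_version must be 9 or 10")
```

Treating the ovarian/fallopian codes as exact matches (rather than prefixes) is a conservative reading of the listed codes; a prefix match could be substituted if subcodes are present in the data.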

Analysis of exome sequencing data

Within the UK Biobank RAP, we used the Swiss Army Knife tool to filter and annotate variants in BRCA1, BRCA2, ATM, CHEK2, and PALB2. Specifically, we used PLINK2 to extract variants in the chromosome regions corresponding to each of the 5 target genes, with a minimum minor allele frequency of 0, a minimum minor allele count of 4, a Hardy-Weinberg equilibrium filter of p < 1 × 10⁻¹⁵ with the “keep-fewhet” flag activated, as well as less than 10% of missing genotypes and variant calls²⁶. From there, we used SNPEFF²⁷ to annotate each variant by their functional type and impact on protein sequence, with further annotation by SNPSIFT²⁸ to incorporate annotations from the February 15, 2024 ClinVar data release.

ClinVar²⁹ labels were consolidated into benign, pathogenic or VUS categories, with conflicting or missing annotations consolidated as VUSs. For all identified missense coding variants, we used precomputed scores from AlphaMissense⁷, EVE⁸ and ESM1b⁹ to predict variant pathogenicity. For AlphaMissense, we used the default optimized score of 0.34 to distinguish benign vs ambiguous variants, and 0.564 to distinguish ambiguous vs pathogenic variants. For EVE, we used the “EVE_classes_75_pct_retained_ASM” classifications as optimized in the original study. As the EVE database (https://evemodel.org/) did not have predictions available for CHEK2, EVE was excluded from the CHEK2 analyses. For ESM1b, we used the default optimized threshold of −7.5 to distinguish pathogenic vs benign variants; unlike AlphaMissense and EVE, ESM1b was designed to use a single binary threshold such that no variants are annotated as ambiguous.
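As a minimal sketch of the score thresholds described above, the following helpers turn continuous scores into categorical labels. The function names are hypothetical. Two details are assumptions not stated in the text: how scores exactly equal to a threshold are handled, and the direction of the ESM1b score (here, more negative scores are treated as more likely pathogenic, following the usual log-likelihood-ratio convention).

```python
def classify_alphamissense(score: float,
                           benign_max: float = 0.34,
                           pathogenic_min: float = 0.564) -> str:
    """Three-way AlphaMissense label using the two default thresholds.

    Boundary handling (strict inequalities) is an assumption.
    """
    if score < benign_max:
        return "benign"
    if score > pathogenic_min:
        return "pathogenic"
    return "ambiguous"


def classify_esm1b(score: float, threshold: float = -7.5) -> str:
    """Binary ESM1b label with a single threshold; no ambiguous class.

    Assumes lower (more negative) scores indicate pathogenicity.
    """
    return "pathogenic" if score < threshold else "benign"


if __name__ == "__main__":
    for s in (0.10, 0.45, 0.90):
        print(s, classify_alphamissense(s))
    for s in (-10.0, -3.0):
        print(s, classify_esm1b(s))
```

EVE is omitted here because the study used its precomputed class labels directly rather than thresholding a score.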

We examined UK Biobank participants (male or female) with at least one missense variant in each of the five genes and classified the participants as benign, VUS, or pathogenic variant carriers. Given that individual participants may have multiple variants simultaneously, pathogenic variants were prioritized over VUSs, and in turn VUSs were prioritized over benign variants. For instance, if a participant had both pathogenic and benign variants in a given gene, the participant was annotated as a pathogenic variant carrier and not as a benign variant carrier.
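The prioritization rule above (pathogenic over VUS, VUS over benign) can be sketched as follows. The function name and label strings are illustrative choices, not taken from the authors' code.

```python
from typing import Iterable, Optional

# Higher number wins when a participant carries several variants in one gene.
PRIORITY = {"benign": 0, "VUS": 1, "pathogenic": 2}


def carrier_status(variant_labels: Iterable[str]) -> Optional[str]:
    """Collapse per-variant labels for one gene into one participant-level label.

    Returns None when the participant carries no missense variant in the gene.
    """
    labels = list(variant_labels)
    if not labels:
        return None
    return max(labels, key=PRIORITY.__getitem__)


if __name__ == "__main__":
    print(carrier_status(["benign", "pathogenic"]))  # pathogenic variant carrier
    print(carrier_status(["benign", "VUS", "benign"]))
```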

We then compared ClinVar and the deep learning model annotations, first on the level of unique variants and subsequently on the participant level to evaluate the accuracy of the deep learning models in recapitulating ClinVar pathogenicity labels. To model a clinical scenario in which deep learning models might be applied, we generated composite classifiers by starting with ClinVar annotations as a foundation and augmenting them with model-based pathogenicity predictions to specifically classify ClinVar VUSs; in other words, all benign and pathogenic ClinVar annotations were retained for the composite classifiers, and the pathogenicity predictions were only applied to ClinVar VUSs.
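One way to express the composite classifier described above is the sketch below. The function name is hypothetical, and keeping a variant as a VUS when the model calls it ambiguous (or has no prediction) is our reading of the text, not a detail confirmed by the source.

```python
from typing import Optional


def composite_label(clinvar_label: str, model_label: Optional[str]) -> str:
    """Combine a ClinVar label with a model prediction.

    ClinVar benign/pathogenic calls are always retained; the model
    prediction is consulted only for ClinVar VUSs.
    """
    if clinvar_label in ("benign", "pathogenic"):
        return clinvar_label
    # ClinVar VUS: defer to the model where it makes a definite call.
    if model_label in ("benign", "pathogenic"):
        return model_label
    # Model is ambiguous or missing (assumption: variant stays a VUS).
    return "VUS"


if __name__ == "__main__":
    print(composite_label("pathogenic", "benign"))   # ClinVar call retained
    print(composite_label("VUS", "pathogenic"))      # model reclassifies the VUS
    print(composite_label("VUS", "ambiguous"))       # remains a VUS
```

Under this reading, a single-threshold model such as ESM1b leaves no VUSs in the composite labels whenever a prediction exists, whereas a two-threshold model can leave some variants unresolved.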

Association of variant pathogenicity with cancer risk

To assess whether the pathogenicity classifications were functionally associated with breast cancer risk, we analyzed female participants with a benign vs pathogenic missense variant for the gene of interest, while excluding participants that carried a frameshift, stop gain, stop loss, or start loss variant in that particular gene. We then used Firth’s penalized logistic regression to determine the association of pathogenic vs benign variants with diagnoses of breast or ovarian/fallopian cancer (as noted, only females were included in these analyses). In all regression models, we included age at time of enrollment in the UK Biobank as a covariate, binarizing by the median age. Participants carrying a VUS (as defined by ClinVar or each deep learning model), but no pathogenic variants, were excluded from the regression analysis, regardless of the presence of co-occurring benign variants (see discussion above). We further calculated regression models using the composite pathogenicity labels (see above) to simulate the clinical scenario of first relying on ClinVar annotations where available and subsequently employing deep learning models to classify VUSs. To directly assess the predictive utility of deep learning models on classifying VUSs, we then analyzed VUS carriers only (as defined by ClinVar) and assessed whether the predicted pathogenicity labels were associated with cancer risk.
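To give some intuition for why a penalized estimator is useful when pathogenic carriers are rare, the sketch below covers only a simplified special case: one binary predictor (pathogenic vs benign carrier) and no covariates. In that saturated setting, Firth's penalty corresponds to a Jeffreys prior and the penalized estimate of each group's event probability is (events + 0.5) / (n + 1), so the log odds ratio equals the value obtained by adding 0.5 to every cell of the 2×2 table. The models in this study also included age, which requires iterative fitting; this is therefore not the authors' analysis, and the function name is hypothetical.

```python
import math


def firth_log_or_2x2(cases_pathogenic: int, controls_pathogenic: int,
                     cases_benign: int, controls_benign: int) -> float:
    """Firth-penalized log odds ratio for a 2x2 table with no covariates.

    Equivalent to adding 0.5 to each cell, which keeps the estimate finite
    even when a cell count is zero.
    """
    a = cases_pathogenic + 0.5
    b = controls_pathogenic + 0.5
    c = cases_benign + 0.5
    d = controls_benign + 0.5
    return math.log((a * d) / (b * c))


if __name__ == "__main__":
    # Made-up counts for illustration only; a zero cell would make the
    # ordinary maximum-likelihood log OR infinite.
    print(firth_log_or_2x2(4, 0, 50, 1000))
```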

Regression results were reported as log odds ratios (ORs) and p-values, with a significance threshold of p < 0.05. We did not adjust for multiple comparisons. The 95% confidence intervals (CIs) for all log ORs are detailed in the supplementary data.
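Because results are reported on the log scale, readers may want the corresponding odds ratios. The sketch below exponentiates a log OR and its confidence bounds, assuming natural logarithms (the usual convention for logistic regression coefficients, though not stated explicitly in the text). The example values are the ClinVar-defined BRCA1 estimate quoted in the main text.

```python
import math
from typing import Tuple


def to_odds_ratio(log_or: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Convert a natural-log odds ratio and its CI bounds to the OR scale."""
    return math.exp(log_or), math.exp(ci_low), math.exp(ci_high)


if __name__ == "__main__":
    # Log OR 1.89 [1.32-2.4] for ClinVar-defined pathogenic BRCA1 missense variants.
    odds_ratio, low, high = to_odds_ratio(1.89, 1.32, 2.4)
    print(f"OR = {odds_ratio:.2f} [{low:.2f}-{high:.2f}]")
```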

Assessment of gene-specific thresholds for defining pathogenic variants

For evaluating whether gene-specific thresholds could improve model performance, we focused on AlphaMissense pathogenicity scores. We retained the default threshold of 0.34 to distinguish benign vs ambiguous variants, while varying the threshold to distinguish ambiguous vs pathogenic variants, ranging from 0.564 (the default) to 1, in increments of 0.01. Regression results were reported as described above.
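The threshold scan described above can be sketched as follows. The names are hypothetical, and integer thousandths are used to avoid floating-point drift. Note that stepping from 0.564 in increments of 0.01 never lands exactly on 1, so this grid ends at 0.994; how the upper endpoint was handled in the original analysis is not stated, and boundary handling at each threshold is likewise an assumption.

```python
from typing import List


def threshold_grid(start_thousandths: int = 564,
                   stop_thousandths: int = 1000,
                   step_thousandths: int = 10) -> List[float]:
    """Candidate ambiguous-vs-pathogenic thresholds from 0.564 up to at most 1."""
    return [t / 1000 for t in
            range(start_thousandths, stop_thousandths + 1, step_thousandths)]


def classify(score: float, pathogenic_min: float, benign_max: float = 0.34) -> str:
    """Relabel a score with a candidate pathogenic threshold.

    The benign threshold stays fixed at the default of 0.34.
    """
    if score < benign_max:
        return "benign"
    if score > pathogenic_min:
        return "pathogenic"
    return "ambiguous"


if __name__ == "__main__":
    grid = threshold_grid()
    print(len(grid), grid[0], grid[-1])
    # A variant scoring 0.60 is pathogenic at the default cutoff but
    # becomes ambiguous once the cutoff is raised above its score.
    print([classify(0.60, t) for t in grid[:5]])
```

Each candidate threshold would then be followed by refitting the regression on the relabeled carriers, which is outside the scope of this sketch.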

Supplementary information

Supplementary Information (13KB, docx)

Supplementary Data 1-11 (670.4KB, xlsx)

Acknowledgements

This study was supported by the NIH/NCI (5K08CA263541), awarded to RBP. KLN is supported by the Basser Center for BRCA, Gray Foundation and Breast Cancer Research Foundation.

Author contributions

R.D.C. conceived and designed the study. R.D.C. analyzed the data. R.D.C., K.L.N., and R.B.P. prepared the manuscript. R.B.P. secured funding. K.L.N. and R.B.P. jointly supervised the study.

Data availability

All participant-level genomic and phenotypic data are available from the UK Biobank through its standardized data-access protocol. Details for registering are available at https://www.ukbiobank.ac.uk/enable-your-research/register. Summary statistics and variant annotations generated in this study are available in the supplementary data.

Code availability

Initial data pre-processing and variant annotation were performed on the UK Biobank RAP: https://ukbiobank.dnanexus.com/landing. Custom analysis code has been deposited to Github: https://github.com/rdchow/UKB_pathogenicityPrediction.

Competing interests

R.D.C. reports no competing interests. R.B.P. has received grants from the National Institutes of Health, Department of Defense, Prostate Cancer Foundation, National Palliative Care Research Center, NCCN Foundation, Conquer Cancer Foundation, Humana, Emerson Collective, Schmidt Futures, Arnold Ventures, Mendel.ai, and Veterans Health Administration; personal fees and equity from GNS Healthcare, Thyme Care, and Onc.AI; personal fees from ConcertAI, Cancer Study Group, Biofourmis, Genetic Chemistry Therapeutics, CreditSuisse, G1 Therapeutics, Humana, and Nanology; honoraria from Flatiron and Medscape; has board membership (unpaid) at the Coalition to Transform Advanced Care and American Cancer Society; and serves on a leadership consortium (unpaid) at the National Quality Forum, all outside the submitted work. K.L.N. reports serving on a Scientific Advisory Board for Merck, unrelated to the current study.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Katherine L. Nathanson, Ravi B. Parikh.

Supplementary information

The online version contains supplementary material available at 10.1038/s41698-024-00710-x.

References

1. Couch, F. J., Nathanson, K. L. & Offit, K. Two Decades After BRCA: Setting Paradigms in Personalized Cancer Care and Prevention. Science 343, 1466–1470 (2014).
2. Domchek, S. M. et al. Association of risk-reducing surgery in BRCA1 or BRCA2 mutation carriers with cancer risk and mortality. JAMA 304, 967–975 (2010).
3. U. S. Preventive Services Task Force. Risk Assessment, Genetic Counseling, and Genetic Testing for BRCA-Related Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 322, 652–665 (2019).
4. Makhnoon, S., Bednar, E. M., Krause, K. J., Peterson, S. K. & Lopez-Olivo, M. A. Clinical management among individuals with variant of uncertain significance in hereditary cancer: A systematic review and meta-analysis. Clin. Genet. 100, 119–131 (2021).
5. Daly, M. B. et al. Genetic/Familial High-Risk Assessment: Breast, Ovarian, and Pancreatic, Version 2.2021, NCCN Clinical Practice Guidelines in Oncology. J. Natl Compr. Canc Netw. 19, 77–102 (2021).
6. Lindor, N. M., Goldgar, D. E., Tavtigian, S. V., Plon, S. E. & Couch, F. J. BRCA1/2 Sequence Variants of Uncertain Significance: A Primer for Providers to Assist in Discussions and in Medical Management. Oncologist 18, 518–524 (2013).
7. Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023).
8. Frazer, J. et al. Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).
9. Brandes, N., Goldman, G., Wang, C. H., Ye, C. J. & Ntranos, V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat. Genet. 55, 1512–1522 (2023).
10. Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl. Acad. Sci. USA 118, e2016239118 (2021).
11. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
12. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
13. Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
14. Breast Cancer Association Consortium et al. Breast Cancer Risk Genes - Association Analysis in More than 113,000 Women. N. Engl. J. Med. 384, 428–439 (2021).
15. Hu, C. et al. A Population-Based Study of Genes Previously Implicated in Breast Cancer. N. Engl. J. Med. 384, 440–451 (2021).
16. Dorling, L. et al. Breast cancer risks associated with missense variants in breast cancer susceptibility genes. Genome Med. 14, 51 (2022).
17. Boonen, R. A. C. M., Vreeswijk, M. P. G. & van Attikum, H. CHEK2 variants: linking functional impact to cancer risk. Trends Cancer 8, 759–770 (2022).
18. Hanson, H. et al. Management of individuals with germline pathogenic/likely pathogenic variants in CHEK2: A clinical practice resource of the American College of Medical Genetics and Genomics (ACMG). Genet. Med. 25, 100870 (2023).
19. Chen, E. et al. Rates and Classification of Variants of Uncertain Significance in Hereditary Disease Genetic Testing. JAMA Netw. Open 6, e2339571 (2023).
20. Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. 7, 1–11 (2024).
21. Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
22. Sohail, M. et al. Mexican Biobank advances population and medical genomics of diverse ancestries. Nature 622, 775–783 (2023).
23. Johnson, R. et al. The UCLA ATLAS Community Health Initiative: Promoting precision health research in a diverse biobank. Cell Genom. 3, 100243 (2023).
24. Chen, Z. et al. China Kadoorie Biobank of 0.5 million people: survey methods, baseline characteristics and long-term follow-up. Int J. Epidemiol. 40, 1652–1666 (2011).
25. Bick, A. G. et al. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).
26. Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).
27. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly. (Austin) 6, 80–92 (2012).
28. Cingolani, P. et al. Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift. Front Genet. 3, 35 (2012).
29. Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information^{(13KB, docx)}

Supplementary Data 1-11^{(670.4KB, xlsx)}
