Journal of Translational Medicine. 2024 Jul 31;22:713. doi: 10.1186/s12967-024-05508-w

Accuracy of RENOVO predictions on variants reclassified over time

Emanuele Bonetti 1, Giulia Tini 1,#, Luca Mazzarella 1,✉,#
PMCID: PMC11293099  PMID: 39085881

Abstract

Background

Interpreting the clinical consequences of genetic variants is the central problem in modern clinical genomics, for both hereditary diseases and oncology. However, clinical validation lags behind the pace of discovery, leading to distressing uncertainty for patients, physicians and researchers. This “interpretation gap” changes over time as evidence accumulates, and variants initially deemed of uncertain significance (VUS) may subsequently be reclassified as pathogenic or benign. We previously developed RENOVO, a random forest-based tool that predicts variant pathogenicity from publicly available information in gnomAD and dbNSFP, and tested it on variants that had changed their classification status over time. Here, we comprehensively evaluated the accuracy of RENOVO predictions on variants that have been reclassified over the last four years.

Methods

We retrieved 16 retrospective instances of the ClinVar database, every 3 months from March 2020 to March 2024, and analyzed time trends of variant classifications. We identified variants that changed their status over time and compared RENOVO predictions generated in 2020 with the actual reclassifications.

Results

VUS have become the most represented class in ClinVar (44.97% vs. 9.75% (likely) pathogenic and 40.33% (likely) benign). The rate of VUS reclassification is linear and slow compared with the rate of VUS reporting, which is exponential and currently ~30x faster, creating a growing divide between what can be sequenced and what can be interpreted. Out of 10,196 VUS in January 2020 that underwent a clinically meaningful reclassification by March 2024, RENOVO correctly classified 82.6% in 2020. In addition, RENOVO correctly identified the majority of the few variants that switched between clinically meaningful classes (e.g., from benign to pathogenic and vice versa). We highlight variant classes and clinically relevant genes for which RENOVO provides particularly accurate estimates, in particular genes characterized by a high prevalence of high- or low-impact variants (e.g., POLE, NOTCH1, FANCM). Suboptimal RENOVO predictions mostly concern genes validated through dedicated consortia (e.g., BRCA1/2), in which RENOVO would in any case have a limited impact.

Conclusions

Time trend analysis demonstrates that the current model of variant interpretation cannot keep up with variant discovery. Machine learning-based tools like RENOVO show confirmed high accuracy and can aid clinical practice and research.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12967-024-05508-w.

Keywords: Variant classification, Machine learning, Precision medicine, Genomics, Random forest

Introduction

The decreasing cost of sequencing over the years has facilitated the analysis of an ever-expanding number of genomes, resulting in the creation of the collection of genetic variations affecting human health, commonly referred to as the “human variome” [1, 2]. The establishment of the ClinVar database in 2014 marked a significant milestone in clinical variant classification, providing a centralized repository of variants of clinical significance along with associated phenotypes and supporting evidence [3].

Numerous algorithms have been developed to support the time-consuming process of discriminating benign from pathogenic variants. Early computational tools (e.g. SIFT [4], Polyphen2 [5], MutPred2 [6]) evaluated only the impact of missense variants on protein structure, assessing whether the mutation introduced amino acids with impactful physicochemical properties [7]. As genetic sequences of other organisms became available, conservation-based tools as well as rule-based or machine-learning approaches emerged, often yielding conflicting predictions. In 2015, both functional and conservation-based tools were included among the standardized criteria for clinical classification of genetic variants by the American College of Medical Genetics and the Association for Molecular Pathology (ACMG-AMP) [8]. More recently, new machine learning algorithms, called meta-learners, have been proposed to integrate scores from multiple prediction methods [9] and align their predictions with the ACMG-AMP guidelines, as in InterVar [10]. Such classifiers consider not only protein structure impact and conservation but also protein functional domains, variant rarity in population databases, protein tissue expression, and other parameters specified by the ACMG-AMP guidelines.

Despite the proliferation of tools, the rate at which novel variants are reported in ClinVar has escalated significantly in recent years, with a notable increase in Variants of Unknown Significance (VUS) and in those with Conflicting Interpretation of Pathogenicity (CIP) [11].

In response to this challenge, in 2020 we developed RENOVO, a random forest-based classifier [11] trained on well-established pathogenic and benign variants in ClinVar. RENOVO uses exclusively publicly available information retrieved from different sources (e.g., gnomAD [12] for population allele frequencies, dbNSFP [13, 14] for functional and conservation scores) and computes a Pathogenicity Likelihood Score (PLS) for each variant. The PLS is used to classify variants as pathogenic or benign in 6 classes with varying levels of estimated precision: over 99% precision (High Precision Pathogenic HP-P and High Precision Benign HP-B), between 90% and 99% precision (Intermediate Precision Pathogenic IP-P and Intermediate Precision Benign IP-B), and below 90% precision (Low Precision Pathogenic LP-P and Low Precision Benign LP-B). The primary objective of RENOVO is to assist in the reclassification of variants for which existing evidence is insufficient for assignment to the ACMG (likely) pathogenic/benign class and which are therefore classified as of uncertain significance (VUS). The algorithm demonstrated superior performance over existing tools and was externally validated on independent databases such as ENIGMA for BRCA1/2 variants [15] and on in vitro validated variants of SCN5A [16].

A key idea in RENOVO was the strategy for constructing the training and test sets, which were not derived from a simple random split of pathogenic and benign variants. Variants included in ClinVar at the time of algorithm development were analysed for changes over time and considered either “stable” (variants that have maintained their classification over time) or “unstable” (variants that have changed their classification over time). The algorithm was trained on stable variants and tested on unstable variants, maintaining an accuracy higher than 98%. This allowed us to speculate that RENOVO might be able to predict future reclassification of variants currently classified as VUS.

Here, we set out to validate the correctness of our 2020 predictions on variants that have been reclassified over the last 4 years. We show that RENOVO provides reliable reclassification of uncertain/conflicting variants and we identify variant classes in which its performance appears of particular utility.

Materials and methods

Data retrieval and preprocessing

Successive releases of the ClinVar database were downloaded from the ClinVar FTP site (see Web Resources), choosing the first available Variant Call Format (VCF) file of March, June, September and December for the years 2020–2023, for a total of 16 VCFs. A comprehensive list is available in Table S1.
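
As an illustration of the retrieval step, the following minimal Python sketch extracts the CLNSIG value of each record from VCF text. It is not the pipeline used in this study (which was implemented in R); the function names `parse_info` and `clnsig_by_variant` and the two example records are hypothetical, and only the generic VCF layout (eight tab-separated fixed columns, semicolon-separated INFO entries) is assumed.

```python
def parse_info(info_field: str) -> dict[str, str]:
    """Split a VCF INFO column ("KEY=VALUE;KEY=VALUE;FLAG") into a dict."""
    parsed = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, _, value = entry.partition("=")
        parsed[key] = value  # flags without "=" map to an empty string
    return parsed


def clnsig_by_variant(vcf_lines):
    """Map (chrom, pos, ref, alt) to the CLNSIG value of each data line."""
    result = {}
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
        result[(chrom, int(pos), ref, alt)] = parse_info(info).get("CLNSIG", "")
    return result


# Hypothetical two-record excerpt (tab-separated), not real ClinVar data.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t111\tA\tG\t.\t.\tALLELEID=1;CLNSIG=Uncertain_significance",
    "2\t2000\t222\tC\tT\t.\t.\tALLELEID=2;CLNSIG=Likely_benign",
]
snapshot = clnsig_by_variant(example)
```

Keying each record on chromosome, position, reference and alternate allele is one possible choice for matching variants across releases; a stable ClinVar identifier could serve the same purpose.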

Some classification terms under the variable CLNSIG (Clinical Significance) in the ClinVar VCFs were collapsed to simplify interpretation, in particular:

  • “ClinVar-benign” collapses the terms Benign, Likely benign, or Benign/Likely benign;

  • “ClinVar-pathogenic” collapses the terms Pathogenic/Likely pathogenic, Likely pathogenic, or Pathogenic.

Variants with either a ClinVar-benign or a ClinVar-pathogenic classification were considered to belong to a “clinically meaningful class”.
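
The collapsing rule above can be sketched as a simple lookup. This is an illustrative Python sketch rather than the code used in the study; the underscore spellings (e.g. `Likely_benign`) are an assumption about how the terms are encoded in the VCF files, and the labels “ClinVar-VUS” and “other” for the remaining terms are introduced here only for convenience.

```python
# Assumed VCF spellings of the CLNSIG terms listed above.
CLINVAR_BENIGN = {"Benign", "Likely_benign", "Benign/Likely_benign"}
CLINVAR_PATHOGENIC = {
    "Pathogenic",
    "Likely_pathogenic",
    "Pathogenic/Likely_pathogenic",
}


def collapse_clnsig(clnsig: str) -> str:
    """Collapse a raw CLNSIG term into the simplified classes."""
    if clnsig in CLINVAR_BENIGN:
        return "ClinVar-benign"
    if clnsig in CLINVAR_PATHOGENIC:
        return "ClinVar-pathogenic"
    if clnsig == "Uncertain_significance":
        return "ClinVar-VUS"
    return "other"  # any term outside the classes considered here


def is_clinically_meaningful(clnsig: str) -> bool:
    """True when the term collapses to ClinVar-benign or ClinVar-pathogenic."""
    return collapse_clnsig(clnsig) in {"ClinVar-benign", "ClinVar-pathogenic"}
```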

RENOVO predictions were retrieved from the version published in the public repository (https://github.com/mazzalab-ieo/renovo) on January 6th 2020. RENOVO classes were collapsed into:

  • “RENOVO-P” for HP-P and IP-P (i.e. variants predicted to be pathogenic with > 90% precision);

  • “RENOVO-B” for HP-B and IP-B (i.e. variants predicted to be benign with > 90% precision);

  • “RENOVO-Low Precision” for LP-P and LP-B (i.e. variants with estimated precision below 90%).
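
For illustration, the collapsing of the six RENOVO classes into three can be written as a lookup table. The dictionary and function names below are hypothetical, and the exact class strings in the RENOVO output files may differ from the abbreviations used in the text.

```python
# Six RENOVO precision classes mapped to the three collapsed classes.
RENOVO_COLLAPSED = {
    "HP-P": "RENOVO-P",
    "IP-P": "RENOVO-P",
    "HP-B": "RENOVO-B",
    "IP-B": "RENOVO-B",
    "LP-P": "RENOVO-Low Precision",
    "LP-B": "RENOVO-Low Precision",
}


def collapse_renovo(renovo_class: str) -> str:
    """Return the collapsed class; raises KeyError for an unknown label."""
    return RENOVO_COLLAPSED[renovo_class]
```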

Quarterly VUSs time analysis

A ClinVar-VUS was considered reclassified when its value under “CLNSIG” changed over time to a clinically meaningful class. Variants with a non-clinically meaningful classification (Table S2) were excluded from the analysis.
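
The reclassification criterion can be sketched as a comparison of two snapshots. In this hypothetical Python example each snapshot is a dictionary from a variant identifier to its collapsed class; the identifiers, labels and function name are illustrative only.

```python
MEANINGFUL = {"ClinVar-benign", "ClinVar-pathogenic"}


def reclassified_vus(earlier: dict, later: dict) -> dict:
    """Return variants that were VUS in `earlier` and are in a clinically
    meaningful class in `later`, mapped to their new class."""
    changed = {}
    for variant, label in earlier.items():
        if label != "ClinVar-VUS":
            continue  # only variants starting as VUS are of interest
        new_label = later.get(variant)  # None if absent from the later release
        if new_label in MEANINGFUL:
            changed[variant] = new_label
    return changed


# Hypothetical snapshots.
earlier = {
    "var1": "ClinVar-VUS",
    "var2": "ClinVar-VUS",
    "var3": "ClinVar-benign",
    "var4": "ClinVar-VUS",
}
later = {
    "var1": "ClinVar-pathogenic",  # VUS -> meaningful: counted
    "var2": "ClinVar-VUS",         # unchanged: not counted
    "var3": "ClinVar-pathogenic",  # did not start as VUS: not counted
    "var4": "other",               # non-meaningful class: excluded
}
changed = reclassified_vus(earlier, later)
```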

For each retrieved VCF at time point i, we defined and computed the quarterly VUS reclassification rate, the quarterly VUS discovery rate, and the discovery/reclassification ratio.

Analysis of reclassification and accuracy definition

We selected the subset of ClinVar-VUS in 2020 (release 01/06/2020) that were reclassified in ClinVar release 03/07/2024 (a total of 10,196 variants), and compared their RENOVO classification of 2020 with their final ClinVar classification. For accuracy quantification, we did not use the classical categories of true/false positive/negative, as they can be ambiguous in the present context, since RENOVO also generates low-confidence predictions. Instead, we defined:

  • True Pathogenic (TP): a variant defined as ClinVar-pathogenic in ClinVar release 03/07/2024 and as RENOVO-P;

  • True Benign (TB): a variant defined as ClinVar-benign in ClinVar release 03/07/2024 and as RENOVO-B;

  • False Benign (FB): a variant reclassified as ClinVar-pathogenic in ClinVar release 03/07/2024 and either RENOVO-B or RENOVO-Low Precision;

  • False Pathogenic (FP): a variant reclassified as ClinVar-benign in ClinVar release 03/07/2024 and either RENOVO-P or RENOVO-Low Precision.

Accuracy metrics were defined as follows:

  • sensitivity for pathogenic as TP/(TP + FB);

  • sensitivity for benign as TB/(TB + FP);

  • positive predictive value for pathogenic (PPV-P) as TP/(TP + FP);

  • positive predictive value for benign (PPV-B) as TB/(TB + FB);

  • overall accuracy as (TP + TB)/(TP + FP + TB + FB).
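
The definitions above translate directly into code. The following Python sketch assigns each variant to TP, TB, FB or FP and computes the five metrics; the function names and the ten example variants are hypothetical and do not correspond to the study data.

```python
from collections import Counter


def outcome(clinvar_class: str, renovo_class: str) -> str:
    """Assign TP/TB/FB/FP following the definitions given in the text."""
    if clinvar_class == "ClinVar-pathogenic":
        return "TP" if renovo_class == "RENOVO-P" else "FB"
    if clinvar_class == "ClinVar-benign":
        return "TB" if renovo_class == "RENOVO-B" else "FP"
    raise ValueError(f"not a clinically meaningful class: {clinvar_class}")


def accuracy_metrics(tp: int, tb: int, fb: int, fp: int) -> dict:
    """Compute the five metrics from the four counts."""
    return {
        "sensitivity_pathogenic": tp / (tp + fb),
        "sensitivity_benign": tb / (tb + fp),
        "ppv_pathogenic": tp / (tp + fp),
        "ppv_benign": tb / (tb + fb),
        "overall_accuracy": (tp + tb) / (tp + fp + tb + fb),
    }


# Hypothetical (final ClinVar class, RENOVO class) pairs for ten variants.
pairs = (
    [("ClinVar-pathogenic", "RENOVO-P")] * 3
    + [("ClinVar-pathogenic", "RENOVO-Low Precision")] * 1
    + [("ClinVar-benign", "RENOVO-B")] * 4
    + [("ClinVar-benign", "RENOVO-P")] * 1
    + [("ClinVar-benign", "RENOVO-Low Precision")] * 1
)
counts = Counter(outcome(c, r) for c, r in pairs)
metrics = accuracy_metrics(counts["TP"], counts["TB"], counts["FB"], counts["FP"])
```

Note that under these definitions a low-precision prediction always counts as an error, whichever class the variant ends up in.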

We then calculated Receiver Operating Characteristics (ROC) and the Area Under the ROC Curve (AUC).
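
As a reminder of what these quantities measure, the sketch below computes ROC points and the AUC from a list of scores and binary labels in plain Python. It is a generic illustration, not the R code used for the analysis; the scores stand in for a pathogenicity score such as the PLS, and the six example values are hypothetical.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs obtained by
    thresholding at each distinct score, from strictest to loosest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive scores higher
    than a randomly chosen negative (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores (1 = pathogenic, 0 = benign).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
curve = roc_points(scores, labels)
```

The pairwise formulation is quadratic in the number of variants and is meant only for clarity; for large datasets a rank-based implementation from an established statistics library would be preferable.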

For more granular analyses, we categorized variants based on their potential functional impact, as follows:

  • high-impact variants: frameshift deletion, frameshift insertion, frameshift substitution, start-loss, stop-gain, stop-loss;

  • medium-impact variants: non-frameshift deletion, non-frameshift insertion, non-frameshift substitution, nonsynonymous SNV;

  • low-impact variants: synonymous SNV.

The class-wise average percentage of reclassification was weighted by the size of each mutational class.
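
The impact categories and the size-weighted average can be sketched as follows. The mapping reproduces the list above; the function names and the example percentages and class sizes are hypothetical.

```python
# Mutational class -> functional impact category, as listed in the text.
IMPACT_BY_CLASS = {
    "frameshift deletion": "high",
    "frameshift insertion": "high",
    "frameshift substitution": "high",
    "start-loss": "high",
    "stop-gain": "high",
    "stop-loss": "high",
    "non-frameshift deletion": "medium",
    "non-frameshift insertion": "medium",
    "non-frameshift substitution": "medium",
    "nonsynonymous SNV": "medium",
    "synonymous SNV": "low",
}


def weighted_percentage(percent_by_class: dict, size_by_class: dict) -> float:
    """Average of per-class percentages, weighted by class size."""
    total = sum(size_by_class[c] for c in percent_by_class)
    weighted_sum = sum(percent_by_class[c] * size_by_class[c] for c in percent_by_class)
    return weighted_sum / total


# Hypothetical percentages and class sizes.
percent = {"stop-gain": 100.0, "start-loss": 50.0}
size = {"stop-gain": 10, "start-loss": 30}
average = weighted_percentage(percent, size)
```

Weighting by class size prevents very small classes from dominating the average.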

For each gene and for each functional impact category we calculated the number of correct, incorrect or low precision predictions.

For the analysis of feature distribution in FP vs. TP and TB variants, we selected the features with the highest weight in the RENOVO model as measured by the SHAP method. These features were already defined in the original RENOVO publication, where we showed that a model built on only these 12 features suffered no significant loss of performance. We analysed the distribution of these features in FP, TP and TB variants and tested differences with a two-tailed t-test.

All analyses were performed using R version 4.3.1.

The analysis strategy is shown in Fig. 2A.

Results

We first updated our analysis of the time trend of the ClinVar database. As shown in Fig. 1A, the growing number of total reported variants is increasingly sustained by those classified as “Uncertain_significance”, which are currently the most represented class (44.97% of the total variants in March 2024). This dynamic appears more clearly when one compares the average quarterly VUS reclassification rate (0.42%/quarter) with the average quarterly rate of new VUS entering the database (12.40%/quarter, Fig. 1B), a rate 29.76 times higher. This divide is bound to increase, since reclassification follows a linear trend (y = 0.0033x + 58.0125) while discovery grows exponentially (y = e^(−22.7883 + 0.0014x)) (Fig. 1B).

Fig. 1

(A) Overview of ClinVar release trends over time. The classes are shown in five colors representing the degree of pathogenicity, from blue (benign) to red (pathogenic), with gray representing variants that are of uncertain clinical significance or have no clinical significance. (B) Quarterly cumulative percentage trend of variants of uncertain significance (grey dots) and variants of uncertain significance that have been reclassified by RENOVO compared to the previous ClinVar version (light blue dots). The gray and light blue lines represent the exponential and linear regressions of the two distributions, respectively; the red line shows their ratio

Out of the 216,716 variants categorized as “Uncertain_significance” in ClinVar (“ClinVar-VUS”) (Fig. 2A) as of January 2020, we identified a dataset of 10,196 variants (4.7%) which have undergone reclassification as Benign, Likely benign, Benign/Likely benign (‘ClinVar-benign’), or Pathogenic/Likely pathogenic, Likely pathogenic or Pathogenic (‘ClinVar-pathogenic’). We compared the RENOVO predictions in 2020 with the final reclassifications. As shown in Fig. 2B, overall accuracy was 79.36% (n = 8092/10,196). In particular, sensitivity for benign variants was 77.16% (n = 6088/7890), whereas sensitivity for pathogenic was 86.90% (n = 2004/2306). In terms of predictive value, RENOVO was more precise in identifying benign variants (PPV-B 98.16%, n = 6088/6202) than pathogenic (PPV-P 84.37%, n = 2004/2375).

Fig. 2

(A) Strategy to define RENOVO accuracy. (B) Confusion matrix showing the distribution of reclassified VUS across the RENOVO classes. (C) Sankey diagram showing in the first column the variants of uncertain significance from the 2020 version of ClinVar that were reclassified by March 2024, in the second the distribution of these variants in the RENOVO PLS classes, and in the third the current classification in ClinVar (March 2024). (D) ROC curves comparing the original training set (blue line), the original test set (light blue) and the reclassified VUS in this study (red line)

Figure 2C provides a granular view of how each subclass was reclassified and predicted. A small set of 1431 variants (15.70%) were predicted by RENOVO with low precision; in most cases these were reclassified as ClinVar-benign.

We compared the ROC and AUC of the subset of reclassified VUS with the ROC and AUC from the original paper (Fig. 2D). On reclassified VUS, RENOVO achieved an AUC of 0.967, which is only slightly lower than the AUC on the original test set (0.986) and training set (0.997).

Then, we examined the distribution of correct/incorrect classifications in specific mutational classes (Fig. 3A). RENOVO achieves the highest accuracy (average = 98.74%) on variants classified as “frameshift insertion” (n = 71), “frameshift deletion” (n = 160), “frameshift substitution” (n = 13), “stop gain” (n = 204) and “synonymous SNV” (n = 1666). “Nonframeshift insertion” (n = 38), “nonframeshift deletion” (n = 101), “nonframeshift substitution” (n = 31) and “nonsynonymous SNV” (n = 4240) are the categories with the highest percentage of low precision predictions (average = 35.74%). The category with the highest error percentage (25%) is “start loss”, which, however, accounts for only 12 variants.

Fig. 3

(A) Percentages of RENOVO predictions on functional mutational classes. In red the percentages of incorrect predictions, in gray the percentages of Low Precision predictions and in blue the percentages of correct predictions. (B) Percentage of correct (blue), incorrect (red) or low-precision (grey) predictions by functional impact class for each gene. Within each class, genes are ranked based on the total number of predictions of variants belonging to that class. NB some genes (e.g., BRCA2, NF1) appear in more than one class because their mutational spectrum includes variants belonging to more than one class

We next assessed RENOVO accuracy by gene and by variant class. Variants were categorized as high-impact, medium-impact or low-impact based on their predicted functional impact (see Materials and methods); each gene was analysed separately within each class. RENOVO demonstrates almost perfect performance on genes with a high prevalence of either high-impact or low-impact variants, including some with high mutational frequency such as POLE or TTN (Fig. 3B). For genes with a high prevalence of medium-impact variants, including some of high clinical relevance such as BRCA1, BRCA2, and TP53, RENOVO accuracy was lower. It must be stressed that, for these problematic genes, low accuracy was restricted to medium-impact variants, whereas no difference was observed for high/low-impact variants in the same genes, such as NF1 and BRCA2.

We further assessed the reasons for misclassification in “false pathogenic” (FP) variants, which, despite being a small minority (371/10,196 variants, 3.64%), may constitute a more relevant clinical problem. Most FPs (325/371) are nonsynonymous variants associated with medium-high functional impact. To investigate the reason for misclassification, we assessed the distribution of the 12 most important features in our model, as per SHapley Additive exPlanations (SHAP) analysis [11], in FP vs. TP and TB (Fig. S1A). The selected features were previously demonstrated to yield a model (RENOVO-M) with virtually identical performance to the model including all available features (RENOVO-F). Ten out of these 12 features exhibited some differential distribution among the three classes (SIFT, MutPred, M-CAP, MetaLR, FATHMM, PROVEAN, phyloP100way_vertebrate, MutationAssessor, CLNDN, AF, fathmm.MKL_coding). The most significant differences between TB and FP were associated with features summarizing functional and conservation scores (SIFT, MutPred, M-CAP, MetaLR, FATHMM, PROVEAN, phyloP100way_vertebrate, MutationAssessor) (Fig. S1B). We conclude that misclassification in FP is mostly due to a systematic bias in prediction scores, which tend to classify as “damaging” variants that are in fact neutral. This bias has been previously recognized [17]. Interestingly, the values of these features in FP variants were also slightly but significantly lower than in TP variants, suggesting that such bias may be mitigated by fine-tuning the model.

Finally, to provide a more comprehensive analysis, we also examined variants categorized as pathogenic or benign by ClinVar in 2020 but subsequently reassigned to the opposite category (Fig. 4A, B). Compared with VUS, the incidence of such reclassifications was, as expected, much lower (n = 184 pathogenic to benign, n = 14 benign to pathogenic), constituting a small fraction of the total variants listed in ClinVar as of January 2020 (0.026%), lower than the average reclassification rate (4.7%). RENOVO shows relatively high accuracy on variants transitioning from pathogenic to benign, correctly identifying 51.09% of them as HP or IP benign (33.70% of the 184 were instead classified as LP, and 15.21% as HP or IP pathogenic). For benign variants reclassified as pathogenic in 2024, RENOVO displayed a different pattern: it correctly identified 21.43% of them as HP or IP pathogenic, assigned 57.14% to LP, and misclassified 21.43% as HP or IP benign. This contrasting trend suggests that RENOVO is more effective at identifying pathogenic-to-benign transitions. Yet, given the few benign variants reclassified as pathogenic (n = 14), this represents a minor limitation.

Fig. 4

(A) Sankey diagram showing RENOVO predictions and past and present ClinVar pathogenic variants; (B) Sankey diagram showing RENOVO predictions and past and present ClinVar benign variants

Discussion

Our study provides two key findings that can inform current practice in clinical genetics. First, a simple time trend analysis demonstrates that the current model of variant interpretation is unable to keep pace with variant discovery, which is fueled by the growing volume of sequencing. Second, we show that RENOVO is able to provide correct predictions of variant classification when existing evidence is insufficient to immediately classify the variant according to standard ACMG guidelines.

Estimates of the reliability of predictive tools are based on different criteria, leading to variability in identifying the optimal tools depending on the classification criterion applied [18]. Consequently, determining which tools are best in a specific case study becomes complex [19–22]. As RENOVO relies on publicly available information, it is possible to monitor the tool's performance over time, as new evidence accumulates, despite ClinVar’s inherent limitations as a sole source for determining genetic variant pathogenicity [23]. RENOVO accuracy on the limited set of ~10,000 variants reclassified over 2020–2024 is in line with that exhibited on pathogenic/benign ClinVar variants in 2020, with an AUC > 0.96. We project that 58.97% of VUS from 2020, totaling 121,801 variants, could be correctly reclassified. Extrapolating to 2024, this impact escalates significantly to 712,289 variants, a sixfold increase. This underscores its enduring utility as a valuable analytical tool. RENOVO exhibited a striking accuracy in identifying strongly misclassified variants, especially those that switched status from pathogenic to benign. Thus, variants showing extreme disagreement between RENOVO prediction and ClinVar classification should be prioritized for reconsideration. RENOVO exhibits almost perfect accuracy in classifying variants with high or low functional impact. Thus, we propose RENOVO as an easily implemented solution for filtering variants in large panels or whole exome/genome sequencing [24], in which high/low-impact variants can be rapidly and efficiently filtered even in the absence of sufficient criteria as per ACMG guidelines. RENOVO cannot replace ACMG guidelines for diagnostic classification, as it is blind to some criteria (e.g., segregation analysis, in vitro assays) that can only be assessed on a case-by-case basis, and just like any other predictor tool it is not supported by the complex evidence-gathering underlying ACMG guidelines. However, RENOVO can be optimally used to prioritize variants that should be subjected to formal assessment through the ACMG algorithm.

We highlight comparatively lower accuracy for medium-impact VUS in clinically relevant genes such as BRCA1, BRCA2, or TP53, for which RENOVO predictions should be interpreted cautiously. Owing to their clinical relevance, reclassification of variants in these genes is faster and supported by dedicated expert panel consortia (e.g. ENIGMA BRCA1 and BRCA2 Variant Curation Expert Panel, TP53 Variant Curation Expert Panel) [25], suggesting that automated interpretation tools may be less important. In these cases, RENOVO still provides useful information in the form of a quantitative score that can be integrated with additional evidence to define pathogenicity. It must also be stressed that, for these critical genes, the performance drop concerns only medium-impact variants; for high/low-impact variants, performance remains high.

The reassessment of RENOVO predictions described here allowed us to identify the source of specific biases that systematically cause misclassification. When RENOVO was implemented, feature values often had to be imputed, with important biases for some variant classes, e.g., insertions/deletions (indels). Also, functional and evolutionary prediction tools suffer from now well-recognized biases [17]. Many of these shortcomings can be addressed by leveraging the wealth of information now accessible, which includes novel and more sophisticated functional and evolutionary prediction tools such as AlphaMissense and EVE [26]. These newer sources of information, together with updated data from previously used databases, are being included in a new version of RENOVO currently in development. We are confident that RENOVO can represent a valuable tool to optimize the process of variant interpretation, especially as multi-gene sequencing is expanding its use in research and clinics.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (159.6KB, docx)
Supplementary Material 2 (13.5KB, docx)

Acknowledgements

E.B. is a PhD student at the European School of Molecular Medicine (SEMM), Milan, Italy.

Abbreviations

VUS: Variant(s) with Unknown Significance

CIP: Variant(s) with Conflicting Interpretation of Pathogenicity

PLS: Pathogenicity Likelihood Score

AUC: Area Under the Curve

ROC: Receiver Operating Characteristic

HP: High Precision

IP: Intermediate Precision

LP: Low Precision

P: Pathogenic

B: Benign

Author contributions

E.B. Formal Analysis, Writing, Revision, G.T. Writing, Revision, L.M. Writing, Revision.

Funding

This work was supported by the Italian Ministry of Health with Ricerca Corrente and Ricerca Corrente di Rete (ACCORD) 2020–2022 and by a My First AIRC grant to LM n 25791 and funding from the Italian ministry of Health (Finanziamento a valere sul PSC Salute, Traiettoria 4 - Biotecnologie, bioinformatica e sviluppo farmaceutico, progetto “CAL.HUB.RIA”, codice locale progetto T4-AN-09).

Data availability

ClinVar FTP URL: https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/archive_2.0/.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Giulia Tini and Luca Mazzarella contributed equally to this work.

References

  • 1. Ring HZ, Kwok P-Y, Cotton RG. Human Variome Project: an international collaboration to Catalogue Human Genetic Variation. Pharmacogenomics. 2006;7:969–72. 10.2217/14622416.7.7.969
  • 2. Chiara M, Pavesi G. Evaluation of Quality Assessment Protocols for High Throughput Genome Resequencing Data. Front Genet. 2017;8:94. 10.3389/fgene.2017.00094
  • 3. Landrum MJ, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–5. 10.1093/nar/gkt1113
  • 4. Ng PC. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31:3812–4. 10.1093/nar/gkg509
  • 5. Adzhubei IA, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–9. 10.1038/nmeth0410-248
  • 6. Pejaver V, et al. Inferring the molecular and phenotypic impact of amino acid variants with MutPred2. Nat Commun. 2020;11:5918. 10.1038/s41467-020-19669-x
  • 7. Lucci-Cordisco E, et al. Variants of uncertain significance (VUS) in cancer predisposing genes: what are we learning from multigene panels? Eur J Med Genet. 2022;65:104400. 10.1016/j.ejmg.2021.104400
  • 8. Richards S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24. 10.1038/gim.2015.30
  • 9. Tang H, Thomas PD. Tools for Predicting the functional impact of Nonsynonymous Genetic Variation. Genetics. 2016;203:635–47. 10.1534/genetics.116.190033
  • 10. Li Q, Wang K. InterVar: clinical interpretation of genetic variants by the 2015 ACMG-AMP guidelines. Am J Hum Genet. 2017;100:267–80. 10.1016/j.ajhg.2017.01.004
  • 11. Favalli V, et al. Machine learning-based reclassification of germline variants of unknown significance: the RENOVO algorithm. Am J Hum Genet. 2021;108:682–95. 10.1016/j.ajhg.2021.03.010
  • 12. Karczewski KJ, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–43. 10.1038/s41586-020-2308-7
  • 13. Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat. 2011;32:894–9. 10.1002/humu.21517
  • 14. Liu X, Li C, Mou C, Dong Y, Tu Y. dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med. 2020;12:103. 10.1186/s13073-020-00803-9
  • 15. Parsons MT, et al. Large scale multifactorial likelihood quantitative analysis of BRCA1 and BRCA2 variants: an ENIGMA resource to support clinical variant classification. Hum Mutat. 2019;40:1557–78. 10.1002/humu.23818
  • 16. Glazer AM, et al. High-throughput reclassification of SCN5A variants. Am J Hum Genet. 2020;107:111–23. 10.1016/j.ajhg.2020.05.015
  • 17. Wang D, Li J, Wang Y, Wang E. A comparison on predicting functional impact of genomic variants. NAR Genom Bioinform. 2022;4.
  • 18. Sefid Dashti MJ, Gamieldien J. A practical guide to filtering and prioritizing genetic variants. Biotechniques. 2017;62:18–30. 10.2144/000114492
  • 19. Garcia FAO, de Andrade ES, Palmero EI. Insights on variant analysis in silico tools for pathogenicity prediction. Front Genet. 2022;13.
  • 20. Hecht M, Bromberg Y, Rost B. Better prediction of functional effects for sequence variants. BMC Genomics. 2015;16:S1. 10.1186/1471-2164-16-S8-S1
  • 21. Andersen LL, et al. Frequently used bioinformatics tools overestimate the damaging effect of allelic variants. Genes Immun. 2019;20:10–22. 10.1038/s41435-017-0002-z
  • 22. Pabinger S, et al. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform. 2014;15:256–78. 10.1093/bib/bbs086
  • 23. Yang S, et al. Sources of discordance among germ-line variant classifications in ClinVar. Genet Med. 2017;19:1118–26. 10.1038/gim.2017.60
  • 24. Chen E, et al. Rates and classification of variants of Uncertain significance in Hereditary Disease Genetic Testing. JAMA Netw Open. 2023;6:e2339571. 10.1001/jamanetworkopen.2023.39571
  • 25. Preston CG, et al. ClinGen variant curation interface: a variant classification platform for the application of evidence criteria from ACMG/AMP guidelines. Genome Med. 2022;14:6. 10.1186/s13073-021-01004-8
  • 26. Frazer J, et al. Disease variant prediction with deep generative models of evolutionary data. Nature. 2021;599:91–5. 10.1038/s41586-021-04043-8


Articles from Journal of Translational Medicine are provided here courtesy of BMC
