2022 Sep 29;13:981005. doi: 10.3389/fgene.2022.981005

TABLE 2.

Representative disease-, phenotype-, and gene-specific variant impact predictors.

| Characteristic category | Name | Type of variants | Targeted disease/phenotype/gene | # of genes | Website | Distribution (web-server/stand-alone) | First publication | Programming language | Algorithm/model | Features | Dataset for modeling | Classification index | Classification | Additional data | Publication |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Meta-predictor | VIPPID (Variant Impact Predictor for PIDs) | Missense | Primary immunodeficiency (PID) diseases | 146 | https://mylab.shinyapps.io/VIPPID/ | Web and stand-alone | April 2022 | Perl, R | Conditional Inference Forest | 85 features including AA, exonic, protein structural, conservation, and 20 pre-existing prediction tools | 4,865 disease-associated variants from the Asian Primary Immunodeficiency Diseases (RAPID) database, HGMD and ClinVar; 4,237 neutral variants from gnomAD | Classifier | Pathogenic/non-pathogenic | 26 reviewed P/LP variants in known PID pathogenic genes from a cohort of 1,318 patients, and 39 validated in-house variants | Fang et al. (2022) |
| Meta-predictor | CanPredict | Missense | Cancer | | http://www.canpredict.org/ or http://www.cgl.ucsf.edu/Research/genentech/canpredict/ (neither is accessible) | | May 2007 | R | RF | SIFT, Pfam-based LogR.E-value and GO Similarity Score (GOSS) metrics | | Classifier | Likely cancer/likely non-cancer/not determined | | Kaminker et al. (2007) |
| Meta-predictor | PolyPhen-HCM | Missense | Hypertrophic cardiomyopathy | 6 | http://genetics.bwh.harvard.edu/hcm/ | Pre-computed results | February 2011 | | Naïve Bayes classifier | Prediction scores, protein structure comparison score | 74 variants curated from the literature and manually classified by the Laboratory for Molecular Medicine standard variant-assessment pipeline (41 pathogenic, 26 benign) | Classifier | Pathogenic/benign/no call | | Jordan et al. (2011) |
| Meta-predictor | CardioBoost | Missense | Cardiomyopathies and arrhythmias | 22 | https://www.cardiodb.org/cardioboost/ | Pre-computed results | October 2020 | R | 2 Adaptive Boosting (AdaBoost) classifiers | 76 functional features | CM dataset: 356 rare P/LP variants from 9,007 clinical CM patients, 302 rare missense variants in CM genes from 2,090 healthy controls. Inherited arrhythmia dataset: 252 P/LP variants in arrhythmia-associated genes from ClinVar, 237 rare missense variants in arrhythmia genes from 2,090 healthy controls | Pathogenicity score | Disease-causing/VUS/benign | 4 datasets from ClinVar, HGMD, Oxford Medical Genetics Laboratory (OMGL), and a large registry of HCM patients (SHaRe) | Zhang et al. (2021) |
| Multiple features | GENESIS (GENe-specific EnSemble grId Search) | Variants of uncertain clinical significance | Catecholaminergic polymorphic ventricular tachycardia and long QT syndrome (LQTS) | 4 | https://github.com/rachellea/medgenetics | Stand-alone and pre-computed results | March 2022 | Python | Logistic regression and multilayer perceptron model | 8 kinds of features including AA features, domain, conservation, rate of evolution, signal-to-noise ratio, and a position-specific scoring matrix (PSSM) score | 717 pathogenic variants and 3,164 benign variants curated from the literature | Probabilities of pathogenicity | Pathogenic/VUS/benign | 925 VUS classified according to ACMG | Draelos et al. (2022) |
| Multiple features | CACNA1F-vp | Missense | X-linked incomplete Congenital Stationary Night Blindness (iCSNB) | 1 | https://github.com/shalawsallah/CACNA1F-variants-analysis | Stand-alone | April 2020 | Python | Logistic regression model | Variant-level features and structural features | 72 disease-implicated variants from the HGMD or MGDL database, 322 benign variants from gnomAD | Probabilities of pathogenicity | Pathogenic/benign | - | Sallah et al. (2020) |
| Optimized PON-P2 | PON-MMR2 | AA substitution | Mismatch repair (MMR) | 4 | http://structure.bmc.lu.se/PON-MMR2/ | Web and stand-alone | September 2015 | R | RF | 5 features: sequence conservation, physical and biochemical properties of AA | 109 pathogenic, 99 neutral, 354 VUS from the InSiGHT database and VariBench | Probabilities of pathogenicity | Pathogenic/VUS/benign | 354 VUS dataset | Niroula and Vihinen (2015) |
| Optimized MAPP | CoDP (Combination of Different Properties of MSH6 protein) | Missense | Lynch syndrome (LS) | 1 | http://cib.cf.ocha.ac.jp/CoDP/ | Web | April 2013 | | Logistic regression model | MSA, phylogenetic tree, structural properties, MAPP, SIFT, PolyPhen2 | 294 missense variants from InSiGHT, MMRUV, UniProt, dbSNP, ESP, HapMap Project, 1KGP and the literature | Probabilities of pathogenicity | Likely LS/unlikely LS | 260 unclassified variants dataset | Terui et al. (2013) |
| Meta-predictor with MAF as features | DVPred | nsSNVs | Genetic hearing loss (HL) | 157 | https://github.com/WCH-IRD/DVPred/tree/main/DVPred_score | Stand-alone and pre-computed results | February 2022 | Python | Gradient boosting decision tree (GBDT) | 65 features including conservation scores, prediction scores, MAF, gene intolerance scores and other features | 1,318 P/LP and 4,628 B/LB variants from the China Deafness Genetics Consortium (CDGC), Deafness Variation Database (DVD), ClinVar, HGMD | DVPred score | Deleterious/neutral | 463 pathogenic and 454 benign variants from the new versions of CDGC and ClinVar | Bu et al. (2022) |
| Meta-predictor | NBDriver | Missense | Cancer | 58 | https://github.com/RamanLab/NBDriver | Stand-alone | May 2021 | Python | RF, extra trees (ET) classifier, generative KDE classifier | 3 types of features: one-hot encoding, overlapping k-mers, 27 genomic features | 5,265 disease-associated variants from five literature sources | Classifier | | | Banerjee et al. (2021) |
| Combination of rule-based and meta-predictor | CancerVar | Exon variants, CNVs, indels | Cancer | 1,911 | https://cancervar.wglab.org/index.php | Web, stand-alone and pre-computed results | May 2022 | Python | Semi-supervised generative adversarial network used in scoring method OPAI | 12 clinical evidence prediction scores and 23 precomputed scores from other computational tools | 13 million variants from 7 cancer knowledgebases | OPAI score | Oncogenic/benign | 4 datasets from OncoKB, CIViC, IARC and the literature | Li et al. (2022) |

*VUS, variant of uncertain significance.