TABLE 2.
Characteristic category | Name | Type of variants | Targeted disease/phenotype/gene | # of genes | Website | Distribution (web-server/stand-alone) | First publication | Programming language | Algorithm/model | Features | Dataset for modeling | Classification index | Classification | Additional data | Publication |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Meta-predictor | VIPPID (Variant Impact Predictor for PIDs) | Missense | Primary immunodeficiency (PID) diseases | 146 | https://mylab.shinyapps.io/VIPPID/ | Web and stand-alone | April 2022 | Perl, R | Conditional Inference Forest | 85 features including AA, exonic, protein structural, and conservation features, and scores from 20 pre-existing prediction tools | 4,865 disease-associated variants from the Resource of Asian Primary Immunodeficiency Diseases (RAPID) database, HGMD and ClinVar; 4,237 neutral variants from gnomAD | Classifier | Pathogenic/non-pathogenic | 26 reviewed P/LP variants of known PID pathogenic genes from a cohort of 1,318 patients and 39 validated in-house variants | Fang et al. (2022) |
Meta-predictor | CanPredict | Missense | Cancer | — | http://www.canpredict.org/ or http://www.cgl.ucsf.edu/Research/genentech/canpredict/ (neither is accessible) | — | May 2007 | R | RF | SIFT, Pfam-based LogR.E-value and GO Similarity Score (GOSS) metrics | — | Classifier | Likely cancer/likely non-cancer/not determined | — | Kaminker et al. (2007) |
Meta-predictor | PolyPhen-HCM | Missense | Hypertrophic cardiomyopathy | 6 | http://genetics.bwh.harvard.edu/hcm/ | Pre-computed results | February 2011 | — | Naïve Bayes classifier | Prediction scores, protein structure comparison score | 74 variants curated from the literature and manually classified with the Laboratory for Molecular Medicine standard variant-assessment pipeline (41 pathogenic, 26 benign) | Classifier | Pathogenic/benign/no call | — | Jordan et al. (2011) |
Meta-predictor | CardioBoost | Missense | Cardiomyopathies and arrhythmias | 22 | https://www.cardiodb.org/cardioboost/ | Pre-computed results | October 2020 | R | 2 Adaptive Boosting (AdaBoost) classifiers | 76 functional features | Cardiomyopathy (CM) dataset: 356 rare P/LP variants from 9,007 clinical CM patients, 302 rare missense variants in CM genes from 2,090 healthy controls. Inherited arrhythmia dataset: 252 P/LP variants in arrhythmia-associated genes from ClinVar, 237 rare missense variants in arrhythmia genes from 2,090 healthy controls | Pathogenicity score | Disease-causing/VUS/benign | 4 datasets from ClinVar, HGMD, Oxford Medical Genetics Laboratory (OMGL), and SHaRe, a large registry of HCM patients | Zhang et al. (2021) |
Multiple features | GENESIS (GENe-specific EnSemble grId Search) | Variants of uncertain clinical significance | Catecholaminergic polymorphic ventricular tachycardia and long QT syndrome (LQTS) | 4 | https://github.com/rachellea/medgenetics | Stand-alone and pre-computed results | March 2022 | Python | Logistic regression and multilayer perceptron model | 8 types of features including AA features, domain, conservation, rate of evolution, signal-to-noise ratio, and a position-specific scoring matrix (PSSM) score | 717 pathogenic variants and 3,164 benign variants curated from the literature | Probabilities of pathogenicity | Pathogenic/VUS/benign | 925 VUS classified according to ACMG | Draelos et al. (2022) |
Multiple features | CACNA1F-vp | Missense | X-linked incomplete congenital stationary night blindness (iCSNB) | 1 | https://github.com/shalawsallah/CACNA1F-variants-analysis | Stand-alone | April 2020 | Python | Logistic regression model | Variant-level features and structural features | 72 disease-implicated variants from the HGMD or MGDL databases, 322 benign variants from gnomAD | Probabilities of pathogenicity | Pathogenic/benign | - | Sallah et al. (2020) |
Optimized PON-P2 | PON-MMR2 | AA substitution | Mismatch repair (MMR) | 4 | http://structure.bmc.lu.se/PON-MMR2/ | Web and stand-alone | September 2015 | R | RF | 5 features: sequence conservation, physical and biochemical properties of AA | 109 pathogenic, 99 neutral, 354 VUS from the InSiGHT database and VariBench | Probabilities of pathogenicity | Pathogenic/VUS/benign | 354 VUS dataset | Niroula and Vihinen (2015) |
Optimized MAPP | CoDP (Combination of Different Properties of MSH6 protein) | Missense | Lynch syndrome (LS) | 1 | http://cib.cf.ocha.ac.jp/CoDP/ | Web | April 2013 | — | Logistic regression model | MSA, phylogenetic tree, structural properties, MAPP, SIFT, PolyPhen2 | 294 missense variants from InSiGHT, MMRUV, UniProt, dbSNP, ESP, HapMap Project, 1KGP and the literature | Probabilities of pathogenicity | Likely LS/unlikely LS | 260 unclassified variants dataset | Terui et al. (2013) |
Meta-predictor with MAF as features | DVPred | nsSNVs | Genetic hearing loss (HL) | 157 | https://github.com/WCH-IRD/DVPred/tree/main/DVPred_score | Stand-alone and pre-computed results | February 2022 | Python | Gradient boosting decision tree (GBDT) | 65 features including conservation scores, prediction scores, MAF, gene intolerance scores and other features | 1,318 P/LP and 4,628 B/LB variants from China Deafness Genetics Consortium (CDGC), Deafness Variation Database (DVD), ClinVar, HGMD | DVPred score | Deleterious/neutral | 463 pathogenic and 454 benign variants from newer versions of CDGC and ClinVar | Bu et al. (2022) |
Meta-predictor | NBDriver | Missense | Cancer | 58 | https://github.com/RamanLab/NBDriver | Stand-alone | May 2021 | Python | RF, extra trees (ET) classifier, generative KDE classifier | 3 types of features: one-hot encoding, overlapping k-mers, 27 genomic features | 5,265 disease-associated variants from five published studies | Classifier | — | — | Banerjee et al. (2021) |
Combination of rule-based and meta-predictor | CancerVar | Exon variants, CNVs, indels | Cancer | 1,911 | https://cancervar.wglab.org/index.php | Web, stand-alone and pre-computed results | May 2022 | Python | Semi-supervised generative adversarial network used in the OPAI scoring method | 12 clinical evidence prediction scores and 23 scores precomputed by other computational tools | 13 million variants from 7 cancer knowledgebases | OPAI score | Oncogenic/benign | 4 datasets from OncoKB, CIViC, IARC and the literature | Li et al. (2022) |
*VUS, variant of uncertain significance.
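
Several tools in the table report a probability of pathogenicity as the classification index and then assign a three-way label (pathogenic/VUS/benign). The sketch below illustrates that mapping only. The function name `classify_variant` and the cutoffs 0.1 and 0.9 are hypothetical placeholders chosen for illustration; they are not taken from any tool listed above, each of which defines its own thresholds.

```python
from typing import Literal

Label = Literal["pathogenic", "VUS", "benign"]


def classify_variant(
    probability: float,
    benign_max: float = 0.1,      # hypothetical cutoff, not from any listed tool
    pathogenic_min: float = 0.9,  # hypothetical cutoff, not from any listed tool
) -> Label:
    """Map a probability of pathogenicity to a three-way label.

    Scores at or above `pathogenic_min` are called pathogenic, scores at or
    below `benign_max` are called benign, and everything in between is
    reported as a variant of uncertain significance (VUS).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if not benign_max < pathogenic_min:
        raise ValueError("benign_max must be smaller than pathogenic_min")

    if probability >= pathogenic_min:
        return "pathogenic"
    if probability <= benign_max:
        return "benign"
    return "VUS"


if __name__ == "__main__":
    for score in (0.02, 0.50, 0.97):
        print(score, classify_variant(score))
```

Widening the gap between the two cutoffs sends more variants to the VUS class, so fewer calls are made but those made rest on more extreme scores.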