Table 2.
Tools | Methodology | Score ranges | Prediction | References |
---|---|---|---|---|
Sorting Intolerant from Tolerant (SIFT) | Position-specific scoring matrix (PSSM) with Dirichlet priors. Sequence based; uses PSI-BLAST | 0 to 1* | D – Damaging (<0.05); T – Tolerated (>0.05) | 64 |
Polymorphism Phenotyping version-2 (PolyPhen-v2) | Naïve Bayes classifier trained using supervised machine learning. Sequence and structure based | 0 to 1** | D – probably damaging (0.957–1); P – possibly damaging (0.453–0.956); B – benign (0.00–0.452) | 65 |
PolyPhen2_HDIV (HumDiv$) | | | | |
PolyPhen2_HVAR (HumVar%) | | | | |
Log ratio test (LRT) | Uses log ratio test. Sequence based | 0 to 1*** | D – Deleterious; N – Neutral; U – Unknown | 69 |
MutationTaster | Naïve Bayes model operated on the integrated data source. Based on sequence and annotation | 0 to 1** | A – disease_causing_automatic; D – disease_causing (>0.5); N – polymorphism (<0.5); P – polymorphism_automatic | 70 |
MutationAssessor | Multiple sequence alignment (MSA) and conservation scores | −5.135 to 6.49** | H – High; L – Low; M – Medium; N – Neutral | 71 |
Functional Analysis Through Hidden Markov Models (FATHMM) | Hidden Markov models (HMM). Based on sequences and protein domains | −18.09 to 11.0* | D – Damaging (≤ −1.5); T – Tolerated (> −1.5) | 72 |
MetaSVM | Support vector machine (SVM) based score, derived by incorporating different scores# | −2 to 3** | D – Damaging (>0); T – Tolerated (<0) | 18 |
MetaLR | Logistic regression (LR) based score, derived by combining different scores# | 0 to 1** | D – Damaging (>0.5); T – Tolerated (<0.5) | 18 |
Variant Effect Scoring Tool version 3 (VEST3) | Supervised machine learning-based method. Combines conservational and structural features | 0 to 1** | NA | 73 |
Protein Variation Effect Analyzer (PROVEAN) | Pair-wise alignment-based scoring method | −14 to 14* | D – Damaging (≤ −2.5); N – Neutral (> −2.5) | 74 |
Reliability index (RI) | SVM based. Combines protein sequence and structural features | 0 to 10** | D – Damaging (≥5); N – Neutral (<5) | 75 |
*Lower scores indicate deleterious nature.
**Higher scores indicate deleterious nature.
***The score by itself does not determine deleterious nature.
$HumDiv: a collection of Mendelian disease variants (5564 deleterious + 7539 neutral in 978 human proteins) against divergence from close mammalian homologs of human proteins (≥ 95% sequence identity).
%HumVar: a compilation of all human variants (22196 deleterious + 21119 neutral) associated with some disease (non-cancer mutations) or loss of activity/function vs. common (MAF > 1%) human polymorphisms with no reported association with a disease.
#Ten scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, MutationAssessor, FATHMM, LRT, SiPhy and PhyloP) and the maximum frequency observed in the 1000 Genomes data.
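
The two-class cutoffs in Table 2 can be applied programmatically. The Python sketch below is illustrative only and is not taken from any of the tools listed. The names `Cutoff`, `CUTOFFS` and `classify` are hypothetical. The thresholds and labels are copied from the table. Where the table gives only strict inequalities (SIFT, MetaSVM, MetaLR), a score exactly at the cutoff is treated as undefined and `None` is returned, which is an assumption of this sketch.

```python
from typing import NamedTuple, Optional


class Cutoff(NamedTuple):
    threshold: float
    damaging_when: str           # "low": damaging below cutoff; "high": damaging above
    at_threshold: Optional[str]  # label at the exact cutoff; None if Table 2 does not say
    other_label: str             # label used by the tool for the non-damaging class


# Two-class cutoffs copied from Table 2 (multi-class tools are omitted).
CUTOFFS = {
    "SIFT":    Cutoff(0.05, "low",  None, "T"),
    "FATHMM":  Cutoff(-1.5, "low",  "D",  "T"),
    "PROVEAN": Cutoff(-2.5, "low",  "D",  "N"),
    "MetaSVM": Cutoff(0.0,  "high", None, "T"),
    "MetaLR":  Cutoff(0.5,  "high", None, "T"),
    "RI":      Cutoff(5.0,  "high", "D",  "N"),
}


def classify(tool: str, score: float) -> Optional[str]:
    """Return 'D' (damaging), the tool's non-damaging label, or None
    when the score sits on a cutoff that Table 2 leaves undefined."""
    cutoff = CUTOFFS[tool]
    if score == cutoff.threshold:
        return cutoff.at_threshold
    if cutoff.damaging_when == "low":
        return "D" if score < cutoff.threshold else cutoff.other_label
    return "D" if score > cutoff.threshold else cutoff.other_label


if __name__ == "__main__":
    for tool, score in [("SIFT", 0.01), ("PROVEAN", -1.0), ("MetaSVM", 0.2)]:
        print(tool, score, classify(tool, score))
```

Keeping the direction of each score explicit (the `*` and `**` footnotes) avoids the common mistake of treating a low SIFT or PROVEAN score as benign.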