2018 Aug 2;8:11635. doi: 10.1038/s41598-018-29952-z

Table 2.

Summary of the tools used to predict the deleteriousness of missense variants.

| Tools | Methodology | Score ranges | Prediction | References |
|---|---|---|---|---|
| Sorting Intolerant from Tolerant (SIFT) | Position-specific scoring matrix (PSSM) with Dirichlet priors; sequence based; uses PSI-BLAST | 0 to 1* | D – Damaging (<0.05); T – Tolerated (>0.05) | 64 |
| Polymorphism Phenotyping version-2 (PolyPhen-v2): PolyPhen2_HDIV (HumDiv$), Polyphen2_HVAR (HumVar%) | Naïve Bayes classifier trained using supervised machine learning; sequence and structure based | 0 to 1** | D – probably damaging (0.957–1); P – possibly damaging (0.453–0.956); B – benign (0.00–0.452) | 65 |
| Likelihood ratio test (LRT) | Uses a likelihood ratio test; sequence based | 0 to 1*** | D – Deleterious; N – Neutral; U – Unknown | 69 |
| MutationTaster | Naïve Bayes model operated on the integrated data source; based on sequence and annotation | 0 to 1** | A – disease_causing_automatic; D – disease_causing (>0.5); N – polymorphism (<0.5); P – polymorphism_automatic | 70 |
| MutationAssessor | Multiple sequence alignment (MSA) and conservation scores | −5.135 to 6.49** | H – High; M – Medium; L – Low; N – Neutral | 71 |
| Functional Analysis Through Hidden Markov Models (FATHMM) | Hidden Markov models (HMM); based on sequences and protein domains | −18.09 to 11.0* | D – Damaging (≤ −1.5); T – Tolerated (> −1.5) | 72 |
| MetaSVM | Support vector machine (SVM) based score, derived by incorporating different scores# | −2 to 3** | D – Damaging (>0); T – Tolerated (<0) | 18 |
| MetaLR | Logistic regression (LR) based score, derived by combining different scores# | 0 to 1** | D – Damaging (>0.5); T – Tolerated (<0.5) | 18 |
| Variant Effect Scoring Tool version 3 (VEST3) | Supervised machine learning-based method; combines conservation and structural features | 0 to 1** | NA | 73 |
| Protein Variation Effect Analyzer (PROVEAN) | Pairwise alignment-based scoring method | −14 to 14* | D – Damaging (≤ −2.5); N – Neutral (> −2.5) | 74 |
| Reliability index (RI) | SVM based; combines protein sequence and structural features | 0 to 10** | D – Damaging (≥5); N – Neutral (<5) | 75 |

*Lower scores indicate deleterious nature.

**Higher scores indicate deleterious nature.

***The score alone does not determine deleterious nature.

$HumDiv: a collection of Mendelian disease variants (5564 deleterious + 7539 neutral in 978 human proteins) contrasted against divergence from close mammalian homologs of human proteins (≥95% sequence identity).

%HumVar: a compilation of all human variants (22196 deleterious + 21119 neutral) associated with some disease (non-cancer mutations) or loss of activity/function vs. common (MAF > 1%) human polymorphisms with no reported association with a disease.

#Ten scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, MutationAssessor, FATHMM, LRT, SiPhy and PhyloP) plus the maximum frequency observed in the 1000 Genomes data.
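
The score cutoffs in Table 2 can be applied programmatically. The following is a minimal Python sketch, not code from the study: the names `RULES` and `classify` are hypothetical, only tools with purely score-based cutoffs in the table are included, and a score that the table assigns to no category (for example, a SIFT score of exactly 0.05, or a PolyPhen-2 score between 0.956 and 0.957) is reported as "unassigned" rather than guessed.

```python
# Cutoffs transcribed from Table 2; each rule is (label, predicate on the score).
# RULES and classify are illustrative names, not part of any published tool.
RULES = {
    "SIFT": [("D", lambda s: s < 0.05), ("T", lambda s: s > 0.05)],
    "PolyPhen-2": [
        ("D", lambda s: 0.957 <= s <= 1.0),    # probably damaging
        ("P", lambda s: 0.453 <= s <= 0.956),  # possibly damaging
        ("B", lambda s: 0.0 <= s <= 0.452),    # benign
    ],
    "FATHMM": [("D", lambda s: s <= -1.5), ("T", lambda s: s > -1.5)],
    "MetaSVM": [("D", lambda s: s > 0), ("T", lambda s: s < 0)],
    "MetaLR": [("D", lambda s: s > 0.5), ("T", lambda s: s < 0.5)],
    "PROVEAN": [("D", lambda s: s <= -2.5), ("N", lambda s: s > -2.5)],
    "RI": [("D", lambda s: s >= 5), ("N", lambda s: s < 5)],
}


def classify(tool: str, score: float) -> str:
    """Return the Table 2 label for `score`, or "unassigned" if no rule matches."""
    for label, matches in RULES[tool]:
        if matches(score):
            return label
    return "unassigned"


if __name__ == "__main__":
    # Made-up example scores, for illustration only.
    for tool, score in [("SIFT", 0.01), ("PolyPhen-2", 0.5), ("PROVEAN", -2.0)]:
        print(tool, score, classify(tool, score))
```

Keeping the cutoffs in one lookup structure makes the direction of each score explicit: for SIFT, FATHMM and PROVEAN, lower scores are damaging, while for the other tools listed here higher scores are damaging.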