2020 Oct 22;22(4):bbaa250. doi: 10.1093/bib/bbaa250

Table 1.

Some commonly used tools for predicting the pathogenic impact of variants in the human genome. Most methods use supervised learning; Eigen and Eigen-PC are unsupervised. Most methods also integrate data from several feature groups, such as conservation measures and functional annotations, to optimize prediction accuracy.

| Name | Method and features used | Reference |
| --- | --- | --- |
| CADD | Logistic regression model trained with a wide variety of genomic features. Uses proxy neutrals estimated from the last human-ape genome divide and simulated de novo variants for proxy deleterious. | Kircher et al. [21]; Rentzsch et al. [13] |
| DANN | Deep neural network using conservation measures, epigenomics and genomic data. | Quang et al. [10] |
| Eigen, Eigen-PC | Unsupervised learning methods using genomics, functional annotations and epigenomics. | Ionita-Laza et al. [22] |
| FATHMM-MKL, FATHMM-XF | Multiple kernel learning and a later gradient boosting method using conservation measures, genomic and epigenomic features. | Shihab et al. [8]; Rogers et al. [4] |
| Mutation Taster 2 | Naive Bayes classifier using conservation measures, regulatory and genomic features. | Schwarz et al. [39] |
| Polyphen2 | Naive Bayes classifier using sequence and structure-based features. | Adzhubei et al. [24] |
| PON-P2 | Random Forest classifier, scoring amino acid substitutions as pathogenic, neutral or unknown, using conservation, functional and structural annotations. | Niroula et al. [25] |
| PROVEAN | Alignment scores based on sequence homology. | Choi et al. [26] |
| SIFT, SIFT4G | Position-specific scoring matrix derived from sequence homology. | Ng et al. [27]; Vaser et al. [28] |
| VEST | Random Forest method using conservation measures, protein structural measures, genomic and amino acid features. | Carter et al. [29] |
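
To make the idea of data integration in a supervised predictor concrete, the following is a minimal sketch in plain Python. It is not the pipeline of CADD or of any other tool in Table 1. The three features (`conservation`, `in_functional_element`, `epigenomic_signal`), the synthetic data generator and all numeric settings are assumptions chosen purely for illustration. The sketch fits a logistic regression by gradient descent on toy "proxy deleterious" and "proxy neutral" variants, so that the score of a variant combines evidence from several feature groups.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=500):
    # Full-batch gradient descent on the mean log-loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    # Probability that a variant belongs to the proxy-deleterious class.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def make_toy_variants(n, seed=0):
    # Synthetic data only: every distribution below is an illustrative assumption.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        deleterious = rng.random() < 0.5
        conservation = rng.gauss(1.0 if deleterious else -1.0, 1.0)
        in_functional_element = 1.0 if rng.random() < (0.7 if deleterious else 0.3) else 0.0
        epigenomic_signal = rng.gauss(0.5 if deleterious else -0.5, 1.0)
        X.append([conservation, in_functional_element, epigenomic_signal])
        y.append(1 if deleterious else 0)
    return X, y


X_train, y_train = make_toy_variants(400, seed=0)
X_test, y_test = make_toy_variants(200, seed=1)
w, b = train_logistic(X_train, y_train)

correct = sum((predict_proba(w, b, x) >= 0.5) == bool(t) for x, t in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
print("learned weights (conservation, functional, epigenomic):", [round(v, 2) for v in w])
```

Each learned weight indicates how strongly one feature group contributes to the combined score in this toy setting. Real predictors use far more features and carefully curated training sets, so the sketch only illustrates the principle.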