Table 1.
Name | Method and features used | Reference |
---|---|---|
CADD | Logistic regression model trained on a wide variety of genomic features. Proxy-neutral variants are derived from the last human-ape genome divergence; proxy-deleterious variants are simulated de novo variants. | Kircher et al. [21]; Rentzsch et al. [13] |
DANN | Deep neural network using conservation measures, epigenomics and genomic data. | Quang et al. [10] |
Eigen, Eigen-PC | Unsupervised learning methods using genomics, functional annotations and epigenomics. | Ionita-Laza et al. [22] |
FATHMM-MKL, FATHMM-XF | Multiple kernel learning and a later gradient boosting method using conservation measures, genomic and epigenomic features. | Shihab et al. [8]; Rogers et al. [4] |
MutationTaster2 | Naive Bayes classifier using conservation measures, regulatory and genomic features. | Schwarz et al. [39] |
PolyPhen-2 | Naive Bayes classifier using sequence and structure-based features. | Adzhubei et al. [24] |
PON-P2 | Random Forest classifier, scoring amino acid substitutions as pathogenic, neutral or unknown, using conservation, functional and structural annotations. | Niroula et al. [25] |
PROVEAN | Alignment scores based on sequence homology. | Choi et al. [26] |
SIFT, SIFT4G | Position-specific scoring matrix derived from sequence homology. | Ng et al. [27]; Vaser et al. [28] |
VEST | Random Forest method using conservation measures, protein structural measures, genomic and amino acid features. | Carter et al. [29] |