Table 2. A summary of functional prediction scores and conservation scores.
Score | Training data | Information used | Prediction model |
---|---|---|---|
PolyPhen-2 | UniProtKB/UniRef100; PDB/DSSP; UCSC alignments of 45 vertebrate genomes | eight sequence-based and three structure-based predictive features | naive Bayes classifier |
SIFT | SWISS-PROT/TrEMBL | sequence homology based on PSI-BLAST | position specific scoring matrix |
Mutation Taster | UniProt; homologous genes in humans and 10 other species; dbSNP; HapMap | conservation, splice site, mRNA features, protein features | naive Bayes classifier |
LRT | coding sequences of 32 vertebrate species | sequence homology | likelihood ratio test of codon neutrality |
Mutation Assessor | homologous sequences from UniProt identified by BLAST | sequence homology of protein families and sub-families within and between species | combinatorial entropy formalism |
FATHMM | homologous sequences from UniRef90, SUPERFAMILY and Pfam | sequence homology | hidden Markov models |
SiPhy | genomes of 29 mammals | multiple alignments | inferring nucleotide substitution pattern per site |
GERP++ | genomes of 34 mammals | multiple alignments and phylogenetic tree | maximum likelihood evolutionary rate estimation |
PhyloP | genomes of 33 placental mammals | multiple alignments and phylogenetic tree | distributions of the number of substitutions based on a phylogenetic hidden Markov model |
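
Several scores in the table derive a per-site signal from multiple alignments. As a rough illustration only, and not the method of any tool listed above, the sketch below computes the Shannon entropy of each column in a toy alignment, where lower entropy means a more conserved site. The alignment and the function name `column_entropy` are hypothetical and chosen for this example.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # Sum p * log2(1/p) over the residues observed in this column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical toy alignment: one row per species, one column per site.
alignment = [
    "MKVA",
    "MKIA",
    "MRLA",
    "MKVS",
]

# Transpose the rows into columns and score each site.
entropies = [column_entropy(col) for col in zip(*alignment)]

for site, h in enumerate(entropies, start=1):
    print(f"site {site}: entropy = {h:.3f} bits")
```

A fully conserved column scores 0 bits. The tools in the table go well beyond this, for example by weighting sequences and modelling the phylogenetic tree, which this sketch ignores.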