Table 2:
Computational biology SNP prediction webservers fall into three categories
| Name | SIFT | PolyPhen | SNAP | PMUT | PANTHER | nsSNP Analyzer | PhD-SNP | Auto-Mute |
|---|---|---|---|---|---|---|---|---|
| Table 2A: Methods servers | ||||||||
| Computational Methods | Conservation among protein homologs | Decision tree | Neural network | Neural network | Hidden Markov model of protein family | Random forest | Decision tree coupled to two support vector machines | Random forest, Delaunay tessellation [79] of protein structure |
| Website URL | http://blocks.fhcrc.org/sift/SIFT.html | http://genetics.bwh.harvard.edu/pph | http://cubic.bioc.columbia.edu/services/SNAP/ | http://mmb2.pcb.ub.es:8080/PMut/ | http://www.pantherdb.org/tools/csnpScoreForm.jsp | http://snpanalyzer.utmem.edu | http://gpcr2.biocomp.unibo.it/~emidio/PhD-SNP/ | http://proteins.gmu.edu/automute/AUTO-MUTE_nsSNPs.html |
| Data types | Protein sequences and multiple sequence alignments | Protein sequences, multiple sequence alignments, protein structures | Protein sequences, multiple sequence alignments, predicted protein secondary structures | Protein sequences, multiple sequence alignments, predicted protein secondary structures, protein structure | Protein sequences, hidden Markov models | Protein sequences, multiple sequence alignments, homologous protein structures, protein secondary structure | Protein sequences, sequence profiles | Protein structures |
| Benchmark or training data | Saturation mutagenesis data sets of two bacterial and one retroviral protein [80–82] | Disease variants and mutagens from SwissProt Variant Pages [83] and presumed neutral between-species replacements in multiple sequence alignments | The 80 000+ mutations from Protein Mutant Database [84], two bacterial and one retroviral protein [80–82], enzymes with experimentally annotated function in SwissProt [35] | Disease variants from SwissProt Variant Pages [83] and presumed neutral between-species replacements in multiple sequence alignments | Human Gene Mutation Database [85] for disease mutations and dbSNP [53] for presumably neutral variants | Selected mutants from SwissProt Variant Pages [83] that are mapped onto homologous protein structures | SwissProt Variant Pages [83] | The 1790 disease and neutral variants from SwissProt Variant Pages [83] that can be mapped onto PDB [36] structures |
| Batch input of SNPs? | Yes | No | Yes | Yes | Yes | Yes | No | Up to five SNPs at one time |
| Name | SNP@Domain | PolyDoms | MutDB | Snap | StSNP |
|---|---|---|---|---|---|
| Table 2B: Meta servers | |||||
| Computational Methods | Integration of data from multiple sources | Integration of data from multiple sources | Integration of data from multiple sources | Integration of data from multiple sources | Integration of data from multiple sources |
| Website URL | http://snpnavigator.net | http://polydoms.cchmc.org | http://mutdb.org | http://snap.humgen.au.dk | http://glinka.bio.neu.edu/StSNP |
| Data types | Protein multiple sequence alignments, protein structures, predicted functional effects, disease annotations | Protein multiple sequence alignments, protein structures, GO categories [86], disease annotations, pathways, interacting protein networks, mammalian phenotypes, predicted functional effects. Includes synonymous SNPs | Genomic DNA, mRNA transcripts, protein sequence, protein multiple sequence alignments, protein structures, pathways, disease annotations. Includes intronic, untranslated region, and synonymous SNPs | Genomic DNA, mRNA transcripts, protein sequence, phylogenetic trees, interacting protein networks, diseases, post-translational modifications, splice sites | Genomic DNA, protein sequence, pathways, protein structures, protein homology models |
| Benchmark or training data | N/A | A total of 1338 SNPs from 611 candidate genes with known disease mutations (ftp://ftp.ncbi.nih.gov/snp/Entrez/snp_omimvar.txt) | N/A | N/A | N/A |
| Batch input of SNPs? | Yes | No | Yes | Yes, if in the same gene | Yes |
| Name | PupaSuite | SNP function portal | SNPselect | F-SNP |
|---|---|---|---|---|
| Computational Methods | Integration of data from multiple sources | Integration of data from multiple sources | Integration of data from multiple sources | Integration of data from multiple sources |
| Website URL | http://pupasuite.bioinfo.cipf.es | http://brainarray.mbni.med.umich.edu/Brainarray/Database/SearchSNP/snpfunc.aspx | http://snpselector.duhs.duke.edu/hqsnp36.html | http://compbio.cs.queensu.ca/F-SNP/ |
| Data types | Genomic DNA, mRNA transcripts, protein sequence, haplotypes. Regulatory SNPs, synonymous SNPs, intronic SNPs, untranslated region SNPs, intergenic SNPs, nonsense and frameshift mutations, protein structure, cellular processes, functional sites, evolutionary selection strength dN/dS, and epigenetic effects (triplex DNA regions). Human, mouse and rat included | Genomic DNA, mRNA transcripts, protein sequences, protein structures and homology models, pathways, diseases, and haplotypes (gene expression is under construction) | Applied Biosystems and Illumina SNP data, genomic DNA and haplotypes | Protein sequences, protein structures, protein homology models, mRNA transcripts, predicted functional impact on protein structure, splicing regulation, post-translational modifications and evolutionary conservation |
| Benchmark or training data | N/A | N/A | A total of 700 SNPs from 140 genes associated with cardiovascular disease in [87] | N/A |
| Batch input of SNPs? | Yes | Yes | Yes | Yes, if in same gene or genomic region |
| Name | LS-SNP | SNPeffect | SNPs3D | FastSNP | MutaGeneSys |
|---|---|---|---|---|---|
| Table 2C: Hybrids | |||||
| Computational Methods | Support vector machine | Integration of data from multiple sources | Support vector machine | Decision tree | Identifies indirect correlations between SNPs and mutations from OMIM [13] |
| Website URL | http://karchinlab.org/LS-SNP | http://snpeffect.vib.be/ | http://www.snps3d.org/ | http://fastsnp.ibms.sinica.edu.tw | http://magnet.c2b2.columbia.edu/mutagenesys |
| Data types | Protein sequences, multiple sequence alignments, protein homology models, predicted domain interfaces, ligand binding sites, hidden Markov models, genes, pathways, genomic DNA | Predicted changes in protein stability and folding, aggregation and amyloidosis, catalytic sites and binding sites, phosphorylation and glycosylation sites, cellular localization and protein turnover | Protein sequences, multiple sequence alignments, profiles, protein structures, genes, gene networks, disease candidate genes, GO categories [86], mouse knockout data | Genes, genomic DNA, mRNA transcripts, protein sequences, protein domains | Genomic sequence, haplotypes, linkage disequilibrium data |
| Benchmark or training data | A total of 1457 disease-associated variants from SwissProt [35] which could be mapped to the OMIM database [59] and 2504 putatively neutral nsSNPs from dbSNP [53] | N/A | A total of 10 263 deleterious mutants in 731 proteins from Human Gene Mutation Database [85] and 16 682 control substitutions in 348 proteins from aligned positions of close orthologs | A total of 1569 SNPs from the SNP500 Cancer database [88] | N/A |
| Batch input of SNPs? | Yes | Yes | Yes | Yes | Yes |
(A) Methods servers primarily disseminate an original computational method for SNP function prediction. (B) Meta-servers pull information from many servers, including general-purpose protein and genomic annotation bioinformatics servers and servers from category A. (C) Hybrids both disseminate original method(s) and pull information from other servers. Technical terms have been italicized and can be looked up in the Glossary (Table 1).