Table 1.
Tool | Description | URL | Refs |
---|---|---|---|
VEP | VEP draws on annotations from the Ensembl database to make genome-wide predictions and to predict tissue-specific activity in 13 human cell lines. | http://www.ensembl.org/info/docs/tools/vep/script/index.html | McLaren et al. [17] |
RegulomeDB | RegulomeDB uses a heuristic scoring system to rate the likelihood that a given SNP or insertion/deletion (indel) lies in a functional region, drawing on functional data from over 100 cell types. | http://regulomedb.org | Boyle et al. [18] |
FunciSNP | FunciSNP is an R/Bioconductor package that prioritizes SNPs using user-supplied annotations, so users can tailor the analysis to a cell type of interest. | http://www.bioconductor.org/packages/release/bioc/html/FunciSNP.html | Coetzee et al. [19] |
ANNOVAR | ANNOVAR is a command-line tool that uses region-based annotations to annotate noncoding variants and indels, and also compares them against known variation databases. | http://annovar.openbioinformatics.org | Wang et al. [21] |
HaploReg | HaploReg is a searchable repository of SNPs and indels from the 1000 Genomes Project that summarizes known annotations for variants within an LD block. | http://www.broadinstitute.org/mammals/haploreg/haploreg.php | Ward and Kellis [22] |
GWAS3D | GWAS3D evaluates SNPs and indels by analyzing their 3D chromosomal interactions and their disruption of TF binding affinity. It outputs scores as well as a circle plot mapping local 3D interactions. | http://jjwanglab.org/gwas3d | Li et al. [23] |
fitCons | fitCons uses the INSIGHT method to predict the probability that SNPs will influence fitness by screening for signatures of positive and negative selection, using data from three cell types. | http://compgen.bscb.cornell.edu/fitCons/ | Gulko et al. [24] |
GWAVA | GWAVA trains a random forest classifier on disease mutations from HGMD and control variants from the 1000 Genomes Project to predict whether queried variants are functional. | ftp://ftp.sanger.ac.uk/pub/resources/software/gwava/ | Ritchie et al. [28] |
CADD | CADD trains a linear-kernel support vector machine, using simulated variants as deleterious variants and alleles fixed between human and chimpanzee as control variants. | http://cadd.gs.washington.edu | Kircher et al. [26] |
DANN | DANN trains a nonlinear neural network on the same training data (fixed alleles vs. simulated variants) as CADD. | https://cbcl.ics.uci.edu/public_data/DANN/ | Quang et al. [29] |
FATHMM-MKL | FATHMM-MKL implements a kernel-based classifier to capture complex nonlinear patterns, trained on HGMD pathogenic variants and 1000 Genomes Project control variants. | http://fathmm.biocompute.org.uk | Shihab et al. [30] |
deltaSVM | deltaSVM uses a gapped k-mer support vector machine to estimate the effect of a variant in a cell-type-specific manner. | http://www.beerlab.org/deltasvm/ | Lee et al. [31] |
DeepSEA | DeepSEA uses a multilayer, hierarchically structured deep learning sequence model to predict functional SNPs with single-nucleotide sensitivity, using ENCODE and Roadmap Epigenomics data. | http://deepsea.princeton.edu/job/analysis/create/ | Zhou and Troyanskaya [32] |
Abbreviation: HGMD, Human Gene Mutation Database.
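
To make the sequence-based scoring idea behind a tool such as deltaSVM more concrete, the following is a minimal sketch, assuming that a trained model has already been reduced to a table of per-k-mer weights. It scores a single-nucleotide change as the summed weight of the k-mers covering the variant in the alternate sequence minus the same sum for the reference sequence. The function names, the toy weights, and the choice to treat unlisted k-mers as weight 0 are illustrative assumptions; this is not the deltaSVM software or its interface.

```python
from typing import Dict, Iterator


def kmers_overlapping(seq: str, pos: int, k: int) -> Iterator[str]:
    """Yield every k-mer of seq that covers the 0-based index pos."""
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(seq) - k)
    for start in range(first_start, last_start + 1):
        yield seq[start:start + k]


def delta_score(ref_seq: str, pos: int, alt_base: str,
                weights: Dict[str, float], k: int) -> float:
    """Summed k-mer weight of the alternate allele minus the reference allele.

    weights is a hypothetical k-mer -> weight table; k-mers missing from
    the table are assumed to contribute 0.
    """
    if not 0 <= pos < len(ref_seq):
        raise ValueError("pos must index into ref_seq")
    # Build the alternate sequence by substituting one base.
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_total = sum(weights.get(km, 0.0)
                    for km in kmers_overlapping(ref_seq, pos, k))
    alt_total = sum(weights.get(km, 0.0)
                    for km in kmers_overlapping(alt_seq, pos, k))
    return alt_total - ref_total


# Toy example with invented 3-mer weights (real models use longer k-mers).
toy_weights = {"ACG": 0.25, "CGT": 1.5, "CTT": -0.5}
score = delta_score("ACGTA", pos=2, alt_base="T", weights=toy_weights, k=3)
print(score)
```

A negative score here means the alternate allele removes k-mers that the toy model weights positively; with weights trained on data from a particular cell type, the same calculation would yield a cell-type-specific estimate.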