Table 2.
Summary of the main variant annotation tools for non-coding DNA regions
Name | Uses | Main data sources | Advantages | Limitations | Reference |
---|---|---|---|---|---|
RegulomeDB | Prioritization of functional variants, using a score based on the number of elements with which the variant overlaps | ENCODE, Roadmap Epigenomics Project | Includes information from numerous functional annotation sources | The scoring system can be difficult to interpret | [103] |
HaploReg | Annotation of variants in linkage disequilibrium (LD) that are located within or next to regulatory elements | ENCODE, GTEx, Roadmap Epigenomics Project | Allows the identification and mining of causal variants in LD that affect regulatory sites | Functional annotations are not updated periodically | [106] |
FunciSNP | Identification and prioritization of putative regulatory SNPs | ENCODE, Roadmap Epigenomics Project | Large data queries are fast to perform | Requires basic knowledge of R | [107] |
rVarBase | Annotation of regulatory variants that are involved in transcriptional and post-transcriptional regulation | ENCODE, Roadmap Epigenomics Project | Uses annotations of numerous regulatory features; easy to use, with an intuitive website | The results summary can be confusing at first, e.g. a SNP can appear annotated with both strong and weak transcription | [108] |
FunSeq2 | Prioritization of cancer-associated SNVs in non-coding DNA | ENCODE | Can annotate and prioritize variants directly from BED or VCF files, and the analysis can be customized | Designed specifically for cancer-associated variants; not intended for variants associated with other diseases | [109] |
ENlight | Annotation of GWAS variants and analysis of their putative effects through plot visualization | GWAS, ENCODE, GTEx | The plot system helps to identify causal variants visually, and the analysis can be customized | Functional annotations are not updated periodically | [110] |
INFERNO | Characterization and prioritization of regulatory variants in different tissues | GTEx, FANTOM5, Roadmap Epigenomics Project | Prioritizes variants by calculating an empirical p-value | Large Web queries take a long time to complete | [111] |
Cepip | Prioritization of gene regulatory variants using tissue-expression data and predicted scores | GTEx, ENCODE, scores from different prediction tools | Integrates the effect of multiple chromatin states to identify and prioritize functional regulatory variants | Requires basic command-line knowledge for installation and use | [112] |
GEMINI | Annotation of non-coding variants by integrating chromatin information for different cell types | ENCODE | Incorporates a workflow that automatically annotates variants from VCF or pedigree files | Requires command line use and lacks regulatory features in comparison with some other annotation tools | [113] |
OncoCis | Prioritization and annotation of cis-regulatory somatic variants in cancer samples | ENCODE, Human Epigenome Atlas, JASPAR, FANTOM5 | The annotation procedure for identifying cis-regulatory mutations is more rigorous than in other tools, and it can be applied to identify cell type-specific variants | Designed specifically for cancer-associated variants; not intended for variants associated with other diseases | [114] |
SuRFR | R package that integrates annotations from different resources to prioritize functional regulatory SNPs | ENCODE, FANTOM5 | Short execution times and higher data confidentiality in comparison with Web-based tools | The user must be familiar with the R programming language | [115] |
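
Two ideas recur in the table: scoring a variant by how many regulatory annotations it overlaps, and ranking variants by an empirical p-value. The following is a minimal Python sketch of both, under stated assumptions. It is not the scoring scheme of RegulomeDB, INFERNO, or any other tool listed above. The annotation names, coordinates, and background positions are hypothetical toy values, the intervals are assumed to be half-open and 0-based (BED-style), and the p-value uses the common "+1" correction so that it is never exactly zero.

```python
def overlaps_any(position, intervals):
    """Return True if a 0-based position falls inside any half-open [start, end) interval."""
    return any(start <= position < end for start, end in intervals)


def overlap_score(position, annotations):
    """Count how many annotation tracks contain the position (illustrative score only)."""
    return sum(1 for intervals in annotations.values() if overlaps_any(position, intervals))


def empirical_p_value(observed, null_scores):
    """One-sided empirical p-value: fraction of null scores >= observed, with +1 correction."""
    at_least = sum(1 for score in null_scores if score >= observed)
    return (at_least + 1) / (len(null_scores) + 1)


# Hypothetical toy annotation tracks on a single chromosome (not real data).
annotations = {
    "dnase_peak": [(100, 200), (500, 650)],
    "tf_binding": [(150, 180)],
    "enhancer_state": [(120, 400)],
}

# Hypothetical candidate variant and background (null) variant positions.
candidate = 160
background = [10, 110, 300, 450, 600, 700, 900, 50, 175]

observed = overlap_score(candidate, annotations)
null_scores = [overlap_score(pos, annotations) for pos in background]
p_value = empirical_p_value(observed, null_scores)

print(f"overlap score = {observed}, empirical p = {p_value:.2f}")
```

In practice, the choice of background set (for example, variants matched on allele frequency or LD structure) largely determines how meaningful such a p-value is; the flat list used here is only a placeholder.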