2016 Dec 4;10:267–289. doi: 10.4137/BBI.S38427

Table 2.

Software and tools for epigenomic data analysis.

SOFTWARE/TOOL	DESCRIPTION	URL	REFS
1. DNA methylation
1.1. Mapping BS-seq reads
1.1.1. General aligners with a BS-Seq module
GSNAP	A wild-card bisulfite aligner included in a general-purpose alignment tool (Genomic Short-read Nucleotide Alignment Program)	http://share.gene.com/gmap	323
LAST	A wild-card bisulfite aligner included in a general-purpose alignment tool	http://last.cbrc.jp	161
RMAP	A wild-card bisulfite aligner included in a general-purpose alignment tool	http://rulai.cshl.edu/rmap/	6
segemehl	A wild-card bisulfite aligner included in a general-purpose alignment tool	http://www.bioinf.uni-leipzig.de/Software/segemehl	304
1.1.2 Specific BS-Seq aligners that use a three-letter approach
Bismark	A widely used three-letter bisulfite aligner based on Bowtie/Bowtie2	http://www.bioinformatics.babraham.ac.uk/projects/bismark	165
BRAT	A tool for aligning bisulfite-treated reads using the three-letter approach	http://compbio.cs.ucr.edu/brat	166
BS-Seeker	A three-letter bisulfite aligner based on Bowtie	https://github.com/BSSeeker/Bsseeker2	324
MethylCoder	A three-letter bisulfite aligner based on Bowtie/GSNAP	https://github.com/brentp/methylcode	168
1.1.3 Specific BS-Seq aligners that use a wild-card approach
BSMAP	A widely used wild-card aligner for bisulfite sequencing reads	http://code.google.com/p/bsmap	325
Pash	A wild-card bisulfite aligner using gapped k-mer and multi-positional hash table	http://brl.bcm.tmc.edu/pash	170–172
1.1.4 Other BS-seq aligners
BISMA	Mapping and clustering of bisulfite sequencing data for individual clones from unique and repetitive sequences	http://biochem.jacobs-university.de/BDPC/BISMA/	326
BRAT-BW	A fast, accurate and memory-efficient BS aligner using the FM-index (Burrows-Wheeler transform)	http://compbio.cs.ucr.edu/brat/	304
B-SOLANA	An aligner for bisulfite sequencing data from ABI SOLiD sequencers	http://code.google.com/p/bsolana	327
RRBSMAP	A wild-card aligner for RRBS reads	http://rrbsmap.computational-epigenetics.org	328
1.2. Detecting differential methylated regions (DMRs)
1.2.1 Software for DMR calling only
BiSeq	An R package for detecting differentially methylated regions (DMRs) in BS data	https://www.bioconductor.org/packages/release/bioc/html/BiSeq.html	175
bumphunter	Bump hunting to identify differentially methylated regions	http://bioconductor.org/packages/release/bioc/html/bumphunter.html	177
DMRcate	An R package for detecting differentially methylated regions (DMRs) based on tunable kernel smoothing	www.bioconductor.org/packages/release/bioc/html/DMRcate.html	178
IMA	An R package for high-throughput analysis of Illumina’s 450K Infinium methylation data	http://www.rforge.net/IMA	329
M3D	An R package for detecting differentially methylated regions (DMRs) using a non-parametric, kernel-based method	https://www.bioconductor.org/packages/release/bioc/html/M3D.html	330
methylSig	An R package for detecting differentially methylated cytosines (DMCs) or regions (DMRs) using a beta-binomial model	https://github.com/sartorlab/methylSig	331
metilene	A fast and sensitive tool for detecting DMRs using a binary segmentation algorithm combined with a two-dimensional statistical test	http://www.bioinf.uni-leipzig.de/Software/metilene/	185
MOABS	A tool for detecting differentially methylated cytosines (DMCs) or regions (DMRs) based on a beta-binomial hierarchical model that works with relatively low CpG coverage (~10X)	https://code.google.com/archive/p/moabs/	332
NHMMfdr	An R package for detecting differential DNA methylation based on a non-homogeneous hidden Markov model (NHMM) by estimating false discovery rates (FDRs)	http://www.ams.sunysb.edu/~pfkuan/NHMMfdr/	182
QDMR	A tool for detecting DMRs based on Shannon entropy	http://bioinfo.hrbmu.edu.cn/qdmr	333
1.2.2 Pipeline for both BS-seq mapping and DMR calling
BSmooth	A pipeline for analyzing whole-genome bisulfite sequencing (WGBS) data. It includes tools for alignment, quality control, and identification of differentially methylated regions (DMRs).	http://rafalab.jhsph.edu/bsmooth/	304
MethPipe	A computational pipeline for analyzing bisulfite sequencing data (WGBS and RRBS), including BS mapping (wild-card aligner) and DMR calling	http://smithlabresearch.org/software/methpipe/	334
RefFreeDMA	Mapping for RRBS reads and DMR calling without a reference genome	https://github.com/jklughammer/RefFreeDMA	335
2. Histone Modifications and DNA-binding Proteins
2.1 Short-read Alignment
BWA	A fast, efficient, lightweight tool that aligns short sequences to a sequence database; based on the Burrows–Wheeler transform	http://bio-bwa.sourceforge.net	233
Bowtie	Ultrafast, memory-efficient short read aligner. Uses a Burrows-Wheeler-Transformed (BWT) index	http://bowtie-bio.sourceforge.net	232
ELAND	Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome	http://support.illumina.com/help/SequencingAnalysisWorkflow/Content/Vault/Informatics/Sequencing_Analysis/CASAVA/swSEQ_mCA_ReferenceFiles.htm	Illumina
GenomeMapper	GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads with either ungapped or gapped alignments	http://1001genomes.org/software/genomemapper.html	336
GNUMAP	Genomic Next-generation Universal MAPper is a program designed to accurately map sequence data obtained from next-generation sequencing machines back to a genome of any size. It seeks to align reads from nonunique repeats using statistics	http://dna.cs.byu.edu/gnumap/	323
HiCUP	A tool for mapping and performing quality control on Hi-C data	http://www.bioinformatics.babraham.ac.uk/projects/hicup/	337
GSNAP	Considers a set of variant allele inputs to better align to heterozygous sites	http://research-pub.gene.com/gmap	160
MAQ	Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data	http://maq.sourceforge.net/	230
SOAP	SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences	http://soap.genomics.org.cn/	229
SOAP2	SOAP2 uses a Burrows-Wheeler transform (BWT) compression index, in place of the seed strategy, to index the reference sequence in main memory	http://soap.genomics.org.cn/soapaligner.html	234
ZOOM	ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads produced by next-generation sequencing technology back to reference genomes and to carry out post-analysis	http://omictools.com/zoom-tool	231
2.2 Peak Detection
2.2.1 Peak Caller
BroadPeak	A novel algorithm for identifying broad peaks in diffuse ChIP-seq datasets	http://jordan.biology.gatech.edu/page/software/broadpeak/	237
MACS	MACS fits data to a dynamic Poisson distribution; works with and without control data	http://liulab.dfci.harvard.edu/MACS	238
PeakSeq	PeakSeq takes into account differences in mappability of genomic regions; enrichment based on FDR calculation	http://info.gersteinlab.org/PeakSeq	338
SICER	A clustering approach for identification of enriched domains from histone modification ChIP-Seq data	http://home.gwu.edu/~wpeng/Software.htm	236
SISSRS	A novel algorithm for precise identification of binding sites from short reads generated from ChIP-Seq experiments	http://sissrs.rajajothi.com/	239
ZINBA	ZINBA can incorporate multiple genomic factors, such as mappability and GC content; can work with point-source and broad-source peak data	http://code.google.com/p/zinba	339
2.2.2 Differential Peak Caller
baySeq	An R package that uses an empirical Bayes approach to identify significant differences; assumes a negative binomial distribution of the data	http://www.bioconductor.org/packages/release/bioc/html/baySeq.html	340
ChIPDiff	A toolkit for genome-wide comparison of histone modification sites identified by ChIP-seq; identifies differential histone modification sites (DHMS) using a binomial distribution, the Baum-Welch expectation-maximization (EM) algorithm, and the forward-backward algorithm	http://cmb.gis.a-star.edu.sg/ChIPSeq/paperChIP-Diff.htm	341
edgeR	An R package that uses a negative binomial distribution to model differences in tag counts; uses replicates to better estimate significant differences	http://www.bioconductor.org/packages/2.9/bioc/html/edgeR.html	257
DESeq	DESeq also uses a negative binomial distribution, but differs in how the mean and variance of the distribution are calculated	http://www-huber.embl.de/users/anders/DESeq	253
SAMSeq	SAMSeq is based on the popular SAM software; a non-parametric method that uses resampling to normalize for differences in sequencing depth	http://www.stanford.edu/~junli07/research.html#SAM	342
3. ncRNAs
3.1 ncRNA detection and quantification
miRDeep	miRDeep was developed to discover active known or novel miRNAs from deep sequencing data after adapter removal; it provides a number of scripts to preprocess and score the mapped data	https://www.mdc-berlin.de/8551903/en/	248
miRDeep2	miRDeep2 identifies known and novel miRNAs more sensitively and robustly by evaluating the structure and signature of each precursor; it quantifies known miRNAs based on the annotation in miRBase and predicts secondary structure with the RNAfold tool	https://www.mdc-berlin.de/8551903/en/	252
miRDeep*	miRDeep* is an integrated standalone miRNA identification application with a user-friendly graphical interface for sequence alignment, pre-miRNA secondary structure calculation, and graphical display, with low memory requirements	http://www.australianprostatecentre.org/research/software/mirdeep-star	249
DARIO	DARIO is a web service for studying short read data from small RNA-seq experiments. It provides a wide range of analysis features, including quality control, read normalization, ncRNA quantification and prediction of putative ncRNA candidates	http://dario.bioinf.uni-leipzig.de/index.py	343
ncPRO-seq	ncPRO-seq is a tool for annotation and profiling of ncRNAs from small-RNA sequencing data. It aims to interrogate and perform detailed analysis of small RNAs derived from annotated non-coding regions in miRBase, piRBase, Rfam and RepeatMasker, and from user-defined regions. The ncPRO pipeline also has a module to identify regions significantly enriched with short reads that cannot be classified as known ncRNA families	https://sourceforge.net/projects/ncproseq/	344
CoRAL	CoRAL is a machine-learning package that can predict the precursor class of small RNAs present in a high-throughput RNA-sequencing dataset and produces information about the features that are most important for discriminating different populations of small non-coding RNAs	http://wanglab.pcbi.upenn.edu/coral/	345
RNA-CODE	RNA-CODE is designed for ncRNA identification in NGS data that lack quality reference genomes. Given a set of short reads, it classifies the reads into different types of ncRNA families. The classification results can be used to quantify the expression levels of different types of ncRNAs in RNA-seq data and ncRNA composition profiles in metagenomic data, respectively	http://www.cse.msu.edu/~chengy/RNA_CODE/	346
CAP-miRSeq	A comprehensive analysis pipeline for deep microRNA sequencing that integrates read preprocessing, alignment, mature/precursor/novel miRNA quantification, variant detection in miRNA coding regions, and flexible differential expression analysis between experimental conditions	http://bioinformaticstools.mayo.edu/research/capmirseq/	256
iMir	A modular pipeline for comprehensive analysis of smallRNA-Seq data, comprising specific tools for adapter trimming, quality filtering, differential expression analysis, biological target prediction and other useful options by integrating multiple open source modules and resources in an automated workflow	http://www.labmedmolge.unisa.it/inglese/research/imir	250
UEA sRNA workbench	UEA sRNA workbench performs complete analysis of single- or multiple-sample small RNA datasets to identify novel microRNA sequences and profile small RNA expression patterns in genetic data	http://srna-workbench.cmp.uea.ac.uk/	260
omiRas	omiRas is a web server for annotation, comparison and visualization of interaction networks of non-coding RNAs derived from small RNA-Sequencing	http://tools.genxpro.net/omiras/	259
sRNAtoolbox	sRNAtoolbox provides several tools, including sRNAbench for sRNA expression profiling and prediction of novel microRNAs, sRNAde for differential expression analysis, miRNA-consTarget for prediction of miRNA targets, sRNAjBrowserDE for visualizing differential expression as a function of read length, and sRNAfuncTerms for determining over-represented functional annotations in target gene sets	http://bioinfo5.ugr.es/srnatoolbox	347
iSeeRNA	iSeeRNA is a support vector machine (SVM)-based classifier for the identification of lincRNAs	http://137.189.133.71/software.html	261
Sebnif	Sebnif is an integrated bioinformatics pipeline for the identification of novel large intergenic noncoding RNAs (lincRNAs), based on iSeeRNA	http://137.189.133.71/sebnif/	262
LncRNA2Function	LncRNA2Function is a comprehensive resource for functional investigation of human lncRNAs based on RNA-seq data	http://mlg.hit.edu.cn/lncrna2function/	264
3.2 RIP-seq and CLIP-seq
3.2.1 Differential Peak Caller and Binding Site Detector from CLIP-seq
Novoalign	An accurate NGS short-read aligner for aligning to a reference genome	http://www.novocraft.com/products/novoalign/	267
PIPE-CLIP	A Galaxy framework-based comprehensive online pipeline for reliable analysis of data generated by three types of CLIP-seq protocol	http://pipeclip.qbrc.org/	270
PARalyzer	Uses the nucleotide substitutions characteristic of PAR-CLIP reads in a kernel density estimate classifier to generate a high-resolution set of protein-RNA interaction sites	https://ohlerlab.mdc-berlin.de/software/PARalyzer_85/	271
Piranha	Piranha is a peak finding and differential binding detection algorithm	http://smithlabresearch.org/software/piranha/	266
wavClusteR	An integrated pipeline for the analysis of PAR-CLIP data	https://bioconductor.org/packages/release/bioc/html/wavClusteR.html	272
dCLIP	dCLIP is designed for quantitative comparative analysis of CLIP-seq data and effectively identified differential binding regions of RBPs in four CLIP-seq datasets	http://qbrc.swmed.edu/software/	273
3.2.2 Motif Discovery
GraphProt	GraphProt is a machine learning framework for learning sequence- and structure-binding preferences of RNA-binding proteins (RBPs) from high-throughput experimental data	http://www.bioinf.uni-freiburg.de/Software/GraphProt/	280
MEME	Performs motif discovery on DNA, RNA or protein datasets	http://meme-suite.org/	348
cERMIT	cERMIT is a computationally efficient motif discovery tool based on analyzing genome-wide quantitative regulatory evidence	https://ohlerlab.mdc-berlin.de/software/cERMIT_82/	276
GLAM2 (Gapped Local Alignment of Motifs)	GLAM2 is a motif detection tool for discovering motifs allowing indels in a fully general manner from DNA, RNA and protein datasets	http://bioinformatics.org.au/glam2	277
MatrixREDUCE	A motif discovery tool for genome-wide ChIP-seq and CLIP-seq data analysis	http://www.bussemakerlab.org/	278
RNA Bind-n-Seq	A quantitative assessment of the sequence and structural binding specificity		349
CapR	An efficient algorithm that calculates the probability that each RNA base position is located within each secondary structural context	https://sites.google.com/site/fukunagatsu/software/capr	281
RNAcontext	An efficient motif finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures	http://www.cs.toronto.edu/~hilal/rnacontext/	279
ViennaRNA Package 2.0	A widely used collection of tools for RNA secondary structure prediction and comparison	http://www.tbi.univie.ac.at/RNA/	279
4. Storing, retrieving and visualizing epigenomics data
4.1 Genome browser for visualizing DNA methylation
Ensembl	A widely used Web-based genome browser with various epigenome data sets	http://www.ensembl.org	283
IGV	A widely used graphical genome browser that is run locally on the user’s computer	http://www.broadinstitute.org/igv	286
UCSC Genome Browser	Widely used Web-based genome browser hosting all ENCODE data	http://genome.ucsc.edu	282
BDPC	Web-based tool for bisulfite sequencing data presentation and compilation	http://biochem.jacobs-university.de/BDPC	350
DaVIE	A database with an intuitive user interface for performing visual comparisons across large DNA methylation data sets	https://github.com/apfejes/epigenetics-software	285
EpiExplorer	A web server that provides an interactive gateway for exploring large-scale epigenetic datasets of the human and mouse genomes	http://epiexplorer.mpi-inf.mpg.de	351
EpiGRAPH	User-friendly software for advanced (epi-)genome analysis and prediction using powerful machine learning algorithms	http://epigraph.mpi-inf.mpg.de	352
WashU Epigenome Browser	Web-based genome browser focusing on the human epigenome	http://epigenomegateway.wustl.edu	353
4.2 Specialized-DNA methylation databases
MethBase	A central reference methylome database created from public BS-seq datasets	http://smithlabresearch.org/software/methbase/	334
MethDB	A database for DNA methylation and environmental epigenetic effects	http://www.methdb.de	288
MethyCancer	Database of cancer DNA methylation data	http://methycancer.psych.ac.cn	354
PubMeth	Database of DNA methylation literature	http://www.pubmeth.org	290
4.3 Specialized histone modification databases
ChromatinDB	A database of genome-wide histone modification patterns for Saccharomyces cerevisiae	http://integbio.jp/dbcatalog/en/record/nbdc00939?jtpl=56	294
CR Cistrome	A ChIP-Seq database for chromatin regulators and histone modification linkages in human and mouse	http://cistrome.org/cr/	293
Histome	A relational knowledgebase of human histone proteins and histone modifying enzymes	http://www.actrec.gov.in/histome/	292
HHMD	The human histone modification database	http://202.97.205.78/hhmd/	291
4.4 Specialized ncRNA and RBP interaction databases
starBase V2.0	starBase is designed for decoding ncRNA and RNA-protein interaction networks and predicting functions, especially in cancer samples	http://starbase.sysu.edu.cn/	296,297
CLIPZ	CLIPZ supports the automatic functional annotation and visualization of CLIP-seq identified binding sites	http://www.clipz.unibas.ch/	298
doRiNA	A database of RNA interactions in post-transcriptional regulation	http://dorina.mdc-berlin.de/	300
CLIPdb	An integrated resource for characterizing the regulatory networks between RBPs and various RNA transcript classes	http://lulab.life.tsinghua.edu.cn/clipdb/	301

Note:

The descriptions are adapted from the websites of the respective software and tools.
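
Sections 1.1.2 and 1.1.3 of the table distinguish "three-letter" from "wild-card" bisulfite aligners. The following is a minimal sketch of how the two matching rules differ, not the implementation of any listed tool. It assumes forward-strand reads only (real aligners also handle the G-to-A conversion on the reverse strand), equal-length ungapped comparison, and hypothetical function names chosen for illustration.

```python
def three_letter(seq: str) -> str:
    """In-silico C->T conversion: collapse the alphabet to A, G, T."""
    return seq.upper().replace("C", "T")


def three_letter_match(read: str, ref: str) -> bool:
    """Three-letter rule: convert both read and reference, then compare."""
    return three_letter(read) == three_letter(ref)


def wildcard_match(read: str, ref: str) -> bool:
    """Wild-card rule: a reference C accepts either C or T in the read;
    every other reference base must match exactly."""
    if len(read) != len(ref):
        return False
    for r, g in zip(read.upper(), ref.upper()):
        if r == g:
            continue
        if g == "C" and r == "T":  # unmethylated C read as T after bisulfite treatment
            continue
        return False
    return True


# Illustrative sequences (made up for this sketch).
ref = "ACGTCCGA"
read = "ATGTTCGA"  # two reference Cs converted to T, one C retained

print(three_letter_match(read, ref))  # both rules accept this read
print(wildcard_match(read, ref))

# A read C opposite a reference T cannot arise from bisulfite conversion.
# The three-letter rule cannot see this; the wild-card rule rejects it.
print(three_letter_match("ACGA", "ATGA"))
print(wildcard_match("ACGA", "ATGA"))
```

The last two calls show the trade-off in this simplified setting: collapsing the alphabet makes matching simple but discards the information needed to tell a C-over-T mismatch from a genuine conversion, whereas the wild-card rule keeps that distinction.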