Table 2.
SOFTWARE/TOOL | DESCRIPTION | URL | REFS |
---|---|---|---|
1. DNA methylation | |||
1.1. Mapping BS-seq reads | |||
1.1.1. General aligners with a BS-Seq module | |||
GSNAP | A wild-card bisulfite aligner included in a general-purpose alignment tool (Genomic Short-read Nucleotide Alignment Program) | http://share.gene.com/gmap | 323 |
LAST | A wild-card bisulfite aligner included in a general-purpose alignment tool | http://last.cbrc.jp | 161 |
RMAP | A wild-card bisulfite aligner included in a general-purpose alignment tool | http://rulai.cshl.edu/rmap/ | 6 |
segemehl | A wild-card bisulfite aligner included in a general-purpose alignment tool | http://www.bioinf.uni-leipzig.de/Software/segemehl | 304 |
1.1.2 Specific BS-Seq aligners that use a three-letter approach | |||
Bismark | A widely used three-letter bisulfite aligner based on Bowtie/Bowtie2 | http://www.bioinformatics.babraham.ac.uk/projects/bismark | 165 |
BRAT | An analysis tool for bisulfite-treated reads that uses three-letter alignment | http://compbio.cs.ucr.edu/brat | 166 |
BS-Seeker | A three-letter bisulfite aligner based on Bowtie | https://github.com/BSSeeker/Bsseeker2 | 324 |
MethylCoder | A three-letter bisulfite aligner based on Bowtie/GSNAP | https://github.com/brentp/methylcode | 168 |
1.1.3 Specific BS-Seq aligners that use a wild-card approach | |||
BSMAP | A widely used wild-card aligner for bisulfite sequencing reads | http://code.google.com/p/bsmap | 325 |
Pash | A wild-card bisulfite aligner using gapped k-mers and a multi-positional hash table | http://brl.bcm.tmc.edu/pash | 170–172 |
1.1.4 Other BS-seq aligners | |||
BISMA | Mapping and clustering of bisulfite sequencing data for individual clones from unique and repetitive sequences | http://biochem.jacobs-university.de/BDPC/BISMA/ | 326 |
BRAT-BW | A fast, accurate and memory-efficient BS aligner using the FM-index (Burrows-Wheeler transform) | http://compbio.cs.ucr.edu/brat/ | 304 |
B-SOLANA | An aligner for bisulfite-sequencing data from ABI SOLiD sequencers | http://code.google.com/p/bsolana | 327 |
RRBSMAP | A wild-card aligner for RRBS reads | http://rrbsmap.computational-epigenetics.org | 328 |
1.2. Detecting differentially methylated regions (DMRs) | |||
1.2.1 Software for DMR calling only | |||
BiSeq | An R package for detecting differentially methylated regions (DMRs) in BS data | https://www.bioconductor.org/packages/release/bioc/html/BiSeq.html | 175 |
bumphunter | Bump hunting to identify differentially methylated regions | http://bioconductor.org/packages/release/bioc/html/bumphunter.html | 177 |
DMRcate | An R package for detecting differentially methylated regions (DMRs) based on tunable kernel smoothing | www.bioconductor.org/packages/release/bioc/html/DMRcate.html | 178 |
IMA | An R package for high-throughput analysis of Illumina’s 450K Infinium methylation data | http://www.rforge.net/IMA | 329 |
M3D | An R package for detecting differentially methylated regions (DMRs) using a non-parametric, kernel-based method | https://www.bioconductor.org/packages/release/bioc/html/M3D.html | 330 |
methylSig | An R package for detecting differentially methylated sites (DMCs) or regions (DMRs) using a beta-binomial model | https://github.com/sartorlab/methylSig | 331 |
metilene | A fast and sensitive tool for detecting DMRs using a binary segmentation algorithm combined with a two-dimensional statistical test | http://www.bioinf.uni-leipzig.de/Software/metilene/ | 185 |
MOABS | A tool for detecting differentially methylated sites (DMCs) or regions (DMRs) based on a beta-binomial hierarchical model; works with relatively low CpG coverage (~10X) | https://code.google.com/archive/p/moabs/ | 332 |
NHMMfdr | An R package for detecting differential DNA methylation based on a non-homogeneous hidden Markov model (NHMM) with estimation of false discovery rates (FDRs) | http://www.ams.sunysb.edu/~pfkuan/NHMMfdr/ | 182 |
QDMR | A tool for detecting DMRs based on Shannon entropy | http://bioinfo.hrbmu.edu.cn/qdmr | 333 |
1.2.2 Pipeline for both BS-seq mapping and DMR calling | |||
Bsmooth | Bsmooth is a pipeline for analyzing whole genome bisulfite sequencing (WGBS) data. It includes tools for aligning the data, quality control, and identifying differentially methylated regions (DMRs). | http://rafalab.jhsph.edu/bsmooth/ | 304 |
MethPipe | A computational pipeline for analyzing bisulfite sequencing data (WGBS and RRBS), including BS mapping (Wild-Card aligner) and DMR calling | http://smithlabresearch.org/software/methpipe/ | 334 |
RefFreeDMA | Mapping for RRBS reads and DMR calling without a reference genome | https://github.com/jklughammer/RefFreeDMA | 335 |
2. Histone Modifications and DNA-binding Proteins | |||
2.1 Short-read Alignment | |||
BWA | A fast, efficient, lightweight tool that aligns short sequences to a sequence database; based on the Burrows–Wheeler transform | http://bio-bwa.sourceforge.net | 233 |
Bowtie | Ultrafast, memory-efficient short read aligner. Uses a Burrows-Wheeler-Transformed (BWT) index | http://bowtie-bio.sourceforge.net | 232 |
ELAND | Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome | http://support.illumina.com/help/SequencingAnalysisWorkflow/Content/Vault/Informatics/Sequencing_Analysis/CASAVA/swSEQ_mCA_ReferenceFiles.htm | Illumina |
GenomeMapper | GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments | http://1001genomes.org/software/genomemapper.html | 336 |
GNUMAP | Genomic Next-generation Universal MAPper is a program designed to accurately map sequence data obtained from next-generation sequencing machines back to a genome of any size. It seeks to align reads from nonunique repeats using statistics | http://dna.cs.byu.edu/gnumap/ | 323 |
HiCUP | A tool for mapping and performing quality control on Hi-C data | http://www.bioinformatics.babraham.ac.uk/projects/hicup/ | 337 |
GSNAP | Considers a set of variant allele inputs to better align to heterozygous sites | http://research-pub.gene.com/gmap | 160 |
MAQ | Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data | http://maq.sourceforge.net/ | 230 |
SOAP | SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences | http://soap.genomics.org.cn/ | 229 |
SOAP2 | SOAP2 uses a Burrows-Wheeler transform (BWT) compression index in place of the seed strategy for indexing the reference sequence in main memory | http://soap.genomics.org.cn/soapaligner.html | 234 |
ZOOM | ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads produced by next-generation sequencing technology back to reference genomes and to carry out post-analysis | http://omictools.com/zoom-tool | 231 |
2.2 Peak Detection | |||
2.2.1 Peak Caller | |||
BroadPeak | A novel algorithm for identifying broad peaks in diffuse ChIP-seq datasets | http://jordan.biology.gatech.edu/page/software/broadpeak/ | 237 |
MACS | MACS fits data to a dynamic Poisson distribution; works with and without control data | http://liulab.dfci.harvard.edu/MACS | 238 |
PeakSeq | PeakSeq takes into account differences in mappability of genomic regions; enrichment based on FDR calculation | http://info.gersteinlab.org/PeakSeq | 338 |
SICER | A clustering approach for identification of enriched domains from histone modification ChIP-Seq data | http://home.gwu.edu/~wpeng/Software.htm | 236 |
SISSRS | A novel algorithm for precise identification of binding sites from short reads generated from ChIP-Seq experiments | http://sissrs.rajajothi.com/ | 239 |
ZINBA | ZINBA can incorporate multiple genomic factors, such as mappability and GC content; can work with point-source and broad-source peak data | http://code.google.com/p/zinba | 339 |
2.2.2 Differential Peak Caller | |||
baySeq | An R package that uses an empirical Bayes approach to identify significant differences; assumes a negative binomial distribution of the data | http://www.bioconductor.org/packages/release/bioc/html/baySeq.html | 340 |
ChIPDiff | A toolkit for the genome-wide comparison of histone modification sites identified by ChIP-seq and for the identification of differential histone modification sites (DHMS); uses a binomial distribution, the Baum-Welch expectation-maximization (EM) algorithm, and the forward-backward algorithm | http://cmb.gis.a-star.edu.sg/ChIPSeq/paperChIP-Diff.htm | 341 |
edgeR | An R package that uses a negative binomial distribution to model differences in tag counts; uses replicates to better estimate significant differences | http://www.bioconductor.org/packages/2.9/bioc/html/edgeR.html | 257 |
DESeq | DESeq uses a negative binomial distribution but differs in how the mean and variance of the distribution are calculated | http://www-huber.embl.de/users/anders/DESeq | 253 |
SAMSeq | SAMSeq is based on the popular SAM software; a non-parametric method that uses resampling to normalize for differences in sequencing depth | http://www.stanford.edu/~junli07/research.html#SAM | 342 |
3. ncRNAs | |||
3.1 ncRNA detection and quantification | |||
miRDeep | miRDeep was developed to discover active known or novel miRNAs from deep sequencing data after the removal of adapters with a number of scripts to preprocess and score the mapped data | https://www.mdc-berlin.de/8551903/en/ | 248 |
miRDeep2 | miRDeep2 identifies known and novel miRNAs more sensitively and robustly by evaluating the structure and signature of each precursor, quantifying known miRNAs based on the annotation in miRBase, and predicting secondary structure with the RNAfold tool | https://www.mdc-berlin.de/8551903/en/ | 252 |
miRDeep* | miRDeep* is an integrated standalone miRNA identification application with a user-friendly graphical interface for sequence alignment, pre-miRNA secondary structure calculation, and graphical display, with a low memory requirement | http://www.australianprostatecentre.org/research/software/mirdeep-star | 249 |
DARIO | DARIO is a web service for studying short read data from small RNA-seq experiments. It provides a wide range of analysis features, including quality control, read normalization, ncRNA quantification and prediction of putative ncRNA candidates | http://dario.bioinf.uni-leipzig.de/index.py | 343 |
ncPRO-seq | ncPRO-seq is a tool for annotation and profiling of ncRNAs from small-RNA sequencing data. It aims to interrogate and perform detailed analysis on small RNAs derived from annotated non-coding regions in miRBase, piRBase, Rfam and repeatMasker, and regions defined by users. The ncPRO pipeline also has a module to identify regions significantly enriched with short reads that cannot be classified as known ncRNA families | https://sourceforge.net/projects/ncproseq/ | 344 |
CoRAL | CoRAL is a machine-learning package that can predict the precursor class of small RNAs present in a high-throughput RNA-sequencing dataset and produces information about the features that are most important for discriminating different populations of small non-coding RNAs | http://wanglab.pcbi.upenn.edu/coral/ | 345 |
RNA-CODE | RNA-CODE is designed for ncRNA identification in NGS data that lack quality reference genomes. Given a set of short reads, it classifies the reads into different types of ncRNA families. The classification results can be used to quantify the expression levels of different types of ncRNAs in RNA-seq data and ncRNA composition profiles in metagenomic data, respectively | http://www.cse.msu.edu/~chengy/RNA_CODE/ | 346 |
CAP-miRSeq | A comprehensive analysis pipeline for deep microRNA sequencing that integrates read preprocessing, alignment, mature/precursor/novel miRNA quantification, variant detection in miRNA coding regions, and flexible differential expression analysis between experimental conditions | http://bioinformaticstools.mayo.edu/research/capmirseq/ | 256 |
iMir | A modular pipeline for comprehensive analysis of smallRNA-Seq data, comprising specific tools for adapter trimming, quality filtering, differential expression analysis, biological target prediction and other useful options by integrating multiple open source modules and resources in an automated workflow | http://www.labmedmolge.unisa.it/inglese/research/imir | 250 |
UEA sRNA workbench | UEA sRNA workbench performs complete analysis of single- or multiple-sample small RNA datasets to identify novel microRNA sequences and profile small RNA expression patterns in genetic data | http://srna-workbench.cmp.uea.ac.uk/ | 260 |
omiRas | omiRas is a web server for annotation, comparison and visualization of interaction networks of non-coding RNAs derived from small RNA-Sequencing | http://tools.genxpro.net/omiras/ | 259 |
sRNAtoolbox | sRNAtoolbox provides several tools, including sRNAbench for sRNA expression profiling and prediction of novel microRNAs, sRNAde for differential expression analysis, miRNA-consTarget for prediction of miRNA targets, sRNAjBrowserDE for visualization of differential expression as a function of read length, and sRNAfuncTerms for determination of over-represented functional annotations in a target gene set | http://bioinfo5.ugr.es/srnatoolbox | 347 |
iSeeRNA | iSeeRNA is a support vector machine (SVM)-based classifier for the identification of lincRNAs | http://137.189.133.71/software.html | 261 |
Sebnif | Sebnif is an integrated bioinformatics pipeline for the identification of novel large intergenic noncoding RNAs (lincRNAs) based on iSeeRNA | http://137.189.133.71/sebnif/ | 262 |
LncRNA2Function | LncRNA2Function is a comprehensive resource for functional investigation of human lncRNAs based on RNA-seq data | http://mlg.hit.edu.cn/lncrna2function/ | 264 |
3.2 RIP-seq and CLIP-seq | |||
3.2.1 Differential Peak Caller and Binding Site Detector from CLIP-seq | |||
Novoalign | An accurate NGS short-read aligner for mapping reads to a reference genome | http://www.novocraft.com/products/novoalign/ | 267 |
PIPE-CLIP | A Galaxy framework-based comprehensive online pipeline for reliable analysis of data generated by three types of CLIP-seq protocol | http://pipeclip.qbrc.org/ | 270 |
PARalyzer | Uses the nucleotide substitutions characteristic of PAR-CLIP data in a kernel density estimate classifier to generate a high-resolution set of protein-RNA interaction sites | https://ohlerlab.mdc-berlin.de/software/PARalyzer_85/ | 271 |
Piranha | Piranha is a peak finding and differential binding detection algorithm | http://smithlabresearch.org/software/piranha/ | 266 |
wavClusteR | An integrated pipeline for the analysis of PAR-CLIP data | https://bioconductor.org/packages/release/bioc/html/wavClusteR.html | 272 |
dCLIP | dCLIP is designed for quantitative comparative analysis of CLIP-seq data and can effectively identify differential binding regions of RBPs in four CLIP-seq datasets | http://qbrc.swmed.edu/software/ | 273 |
3.2.2 Motif Discovery | |||
GraphProt | GraphProt is a machine learning computational framework for learning sequence- and structure-binding preferences of RNA-binding proteins (RBPs) from high-throughput experimental data | http://www.bioinf.uni-freiburg.de/Software/GraphProt/ | 280 |
MEME | Performs motif discovery on DNA, RNA or protein datasets | http://meme-suite.org/ | 348 |
cERMIT | cERMIT is a computationally efficient motif discovery tool based on analyzing genome-wide quantitative regulatory evidence | https://ohlerlab.mdc-berlin.de/software/cERMIT_82/ | 276 |
GLAM2 (Gapped Local Alignment of Motifs) | GLAM2 is a motif detection tool for discovering motifs allowing indels in a fully general manner from DNA, RNA and protein datasets | http://bioinformatics.org.au/glam2 | 277 |
MatrixREDUCE | A motif discovery tool for genome-wide ChIP-seq and CLIP-seq data analysis | http://www.bussemakerlab.org/ | 278 |
RNA Bind-n-Seq | A quantitative assessment of the sequence and structural binding specificity | | 349 |
CapR | An efficient algorithm that calculates the probability that each RNA base position is located within each secondary structural context | https://sites.google.com/site/fukunagatsu/software/capr | 281 |
RNAcontext | An efficient motif finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures | http://www.cs.toronto.edu/~hilal/rnacontext/ | 279 |
ViennaRNA Package 2.0 | A widely used collection of tools for RNA secondary structure prediction and analysis | http://www.tbi.univie.ac.at/RNA/ | 279 |
4. Storing, retrieving and visualizing epigenomics data | |||
4.1 Genome browser for visualizing DNA methylation | |||
Ensembl | A widely used Web-based genome browser with various epigenome data sets | http://www.ensembl.org | 283 |
IGV | A widely used graphical genome browser that is run locally on the user’s computer | http://www.broadinstitute.org/igv | 286 |
UCSC Genome Browser | Widely used Web-based genome browser hosting all ENCODE data | http://genome.ucsc.edu | 282 |
BDPC | Web-based tool for bisulfite sequencing data presentation and compilation | http://biochem.jacobs-university.de/BDPC | 350 |
DaVIE | A database with an intuitive user interface for performing visual comparisons across large DNA methylation data sets | https://github.com/apfejes/epigenetics-software | 285 |
EpiExplorer | A web server that provides an interactive gateway for exploring large-scale epigenetic datasets of the human and mouse genomes | http://epiexplorer.mpi-inf.mpg.de | 351 |
EpiGRAPH | User-friendly software for advanced (epi-)genome analysis and prediction using powerful machine learning algorithms | http://epigraph.mpi-inf.mpg.de | 352 |
WashU Epigenome Browser | Web-based genome browser focusing on the human epigenome | http://epigenomegateway.wustl.edu | 353 |
4.2 Specialized DNA methylation databases | |||
MethBase | A central reference methylome database created from public BS-seq datasets | http://smithlabresearch.org/software/methbase/ | 334 |
MethDB | A database for DNA methylation and environmental epigenetic effects | http://www.methdb.de | 288 |
MethyCancer | Database of cancer DNA methylation data | http://methycancer.psych.ac.cn | 354 |
PubMeth | Database of DNA methylation literature | http://www.pubmeth.org | 290 |
4.3 Specialized histone modification databases | |||
ChromatinDB | A database of genome-wide histone modification patterns for Saccharomyces cerevisiae | http://integbio.jp/dbcatalog/en/record/nbdc00939?jtpl=56 | 294 |
CR Cistrome | A ChIP-Seq database for chromatin regulators and histone modification linkages in human and mouse | http://cistrome.org/cr/ | 293 |
Histome | A relational knowledgebase of human histone proteins and histone modifying enzymes | http://www.actrec.gov.in/histome/ | 292 |
HHMD | The human histone modification database | http://202.97.205.78/hhmd/ | 291 |
4.4 Specialized ncRNA and RBP interaction databases | |||
starBase V2.0 | starBase is designed for decoding ncRNA and RNA-protein interaction networks and for predicting functions, especially in cancer samples | http://starbase.sysu.edu.cn/ | 296,297 |
CLIPZ | CLIPZ supports the automatic functional annotation and visualization of CLIP-seq identified binding sites | http://www.clipz.unibas.ch/ | 298 |
doRiNA | A database of RNA interactions in post-transcriptional regulation | http://dorina.mdc-berlin.de/ | 300 |
CLIPdb | An integrated resource for characterizing the regulatory networks between RBPs and various RNA transcript classes | http://lulab.life.tsinghua.edu.cn/clipdb/ | 301 |
Note:
The descriptions are adapted from the descriptions on the websites of the respective software/tools.