Abstract
Enhancers are distal cis-regulatory elements that activate the transcription of their target genes. They regulate a wide range of important biological functions and processes, including embryogenesis, development, and homeostasis. As more and more large-scale technologies are developed for enhancer identification, a comprehensive database that annotates enhancers from various genome-wide profiling datasets across different species is highly desirable. Here, we present an updated database, EnhancerAtlas 2.0 (http://www.enhanceratlas.org/indexv2.php), covering 586 tissue/cell types that include a large number of normal tissues, cancer cell lines, and cells at different developmental stages across nine species. Overall, the database contains 13 494 603 enhancers, which were obtained from 16 055 datasets using 12 high-throughput experimental methods (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE, ChIA-PET, GRO-seq, STARR-seq and MPRA). The updated version is a substantial expansion of the first version, which only contained enhancers in human cells. In addition, we predicted enhancer–target gene relationships in human, mouse and fly. Finally, users can search enhancers and enhancer–target gene relationships through five user-friendly, interactive modules. We believe the new annotation of enhancers in EnhancerAtlas 2.0 will help users perform functional analyses of enhancers in various genomes.
INTRODUCTION
As distal regulatory DNA elements, enhancers regulate the gene expression in a cell type-specific manner and function in a wide range of biological processes, including embryogenesis, development, homeostasis and diseases (1–3). With the development of Next-Generation Sequencing (NGS), multiple high-throughput experimental methods were designed to detect thousands of enhancers in different cell types (2–9). These methods can be classified into five categories: (i) chromatin immunoprecipitation and sequencing (ChIP-seq) of various transcription factors (TFs), specific mediator or cofactors, and specific histone modifications (10). The method was adopted to identify enhancer-related binding sites. The TFs often regulate gene expression by binding to the DNA regulatory elements (11). A specific TF, EP300, was a well-known enhancer marker. It was shown that over 75% of P300 binding sites were associated with enhancers and located far away from transcription start sites (TSSs) (12). Recent studies also revealed that the RNA polymerase II with the largest subunit POLR2A could move away from gene coding regions and bind to thousands of enhancers (4,12,13). (ii) The open chromatin regions identified by chromatin accessibility assays. Specifically, DNase I digestion coupled to sequencing (DNase-seq), transposase-accessible chromatin followed by sequencing (ATAC-seq), formaldehyde-assisted isolation and sequencing (FAIRE-seq) and micrococcal nuclease sequencing (MNase-seq) have been used to define transcriptional enhancers (8,9,14–16). (iii) Bi-directionally transcribed nascent enhancer RNAs (eRNAs). A large number of eRNAs detected by global run-on sequencing (GRO-seq) and cap-analysis gene expression (CAGE) often indicated a direct enhancer activity. (iv) High-throughput reporter assays, which were employed to quantitatively and directly detect the enhancer activities of thousands of DNA regulatory elements. 
Two representatives, STARR-seq (self-transcribing active regulatory region sequencing) and MPRA (massively parallel reporter assay), produced a library of reporter DNA sequence constructs as well as the unique tags or barcodes to assess the enhancer activities of tested regulatory regions (5,17). (v) Methods based on chromatin interactions, including Hi-C (18,19) and ChIA-PET (20). These approaches could identify enhancers from enhancer–enhancer or enhancer–promoter interactions. It was reported that ∼53% of chromatin interactions measured by the RNA polymerase II based ChIA-PET are enhancer-related (20).
While these methods are powerful to identify enhancers on a genome-wide scale, none of them are perfect in terms of sensitivity and specificity. For example, eRNAs only identify ∼25% of all 12 000 neuronal enhancers in the mouse genome (4). Furthermore, some methods, such as STARR-seq and GRO-seq, have only been successfully applied to certain species (e.g. Drosophila and C. elegans) (5,8,21). While many enhancer databases exist, such as SEdb, HACER, RAEdb, HEDD, DiseaseEnhancer, TiED, GeneHancer, SEA, DENdb and dbSUPER (22–31), none of them combined the datasets obtained from all different high-throughput approaches for enhancer annotation (Supplementary Figure S1). GeneHancer (25) integrated the enhancers from four different enhancer resources, including Ensembl, FANTOM, VISTA and ENCODE (2,32–34). The comparison of the enhancer databases showed that most databases utilized one or a few approaches for enhancer analysis (Supplementary Figure S1).
A comprehensive database that integrates enhancers from these genome-wide approaches is highly desirable for better-quality annotation. We previously developed a database, EnhancerAtlas (35), in which we combined enhancer annotations from multiple pieces of experimental evidence and provided a set of analytic tools. Here, we present an updated version, EnhancerAtlas 2.0, which is a substantial improvement over the previous version. It has three main improvements: (i) The new version expands the enhancer annotation to nine species. In total, EnhancerAtlas 2.0 contains 13 494 603 enhancers based on 16 055 genome-wide profiling datasets (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE-seq, ChIA-PET, GRO-seq, STARR-seq and MPRA) in nine species. (ii) We improved the methods for combining multiple experimental datasets. (iii) New browsing and analytic tools were introduced to improve the web server.
MATERIALS AND METHODS
Data sources
The consensus enhancers in EnhancerAtlas 2.0 were identified based on twelve high-throughput experimental approaches, including P300 (12), Histone (10), POLR2A (13,21), TF-binding (11), DHS (or ATAC) (8,9), FAIRE (16), MNase-seq (14,15), GRO-seq (6), STARR-seq (5), CAGE (2), ChIA-PET (20) and MPRA (17). We manually downloaded 16 055 datasets, including processed or the raw sequencing data, from NCBI GEO datasets (36), ENCODE project portal at UCSC (32), Epigenome Roadmap (7) and FANTOM5 (2). The datasets in Homo sapiens, Sus scrofa, Rattus norvegicus, Mus musculus, Gallus gallus, Danio rerio, Drosophila melanogaster, Caenorhabditis elegans and Saccharomyces cerevisiae were mapped to hg19, susScr3, rn5, mm9, galGal4, danRer10, dm3, ce10 and sacCer3, respectively.
Data processing for individual dataset/track
To build EnhancerAtlas 2.0, we collected 16 055 datasets. We first converted the processed data into standard BED files according to their formats. Datasets in the original bed, gff or narrowPeak format were directly converted into standard BED using bedtools (37). We called peaks from datasets in bedGraph format using the macs2 module ‘bdgpeakcall’ (38). Datasets in bigWig format were first converted into bedGraph and then used for peak calling. Datasets in other genome builds were converted to the appropriate build with liftOver (39). We also removed irregular datasets with a size <5 kb or >10 Mb, which may contain too few or too many peaks. In addition, peaks overlapping with promoter, exon or CTCF-defined insulator regions were removed. The tools liftOver and bigWigToBedGraph (39) were downloaded from http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/.
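The promoter/exon/insulator filter described above can be sketched as follows. This is a minimal Python illustration, not the code used to build the database: the `Interval` tuple, the function names and the example coordinates are assumptions made for the sketch. On real BED files the same effect is usually obtained with `bedtools intersect -v`.

```python
from typing import List, Tuple

# (chromosome, start, end) in BED-style half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_excluded(peaks: List[Interval], excluded: List[Interval]) -> List[Interval]:
    """Keep only peaks that overlap none of the excluded regions
    (e.g. promoters, exons or CTCF-defined insulators)."""
    return [p for p in peaks if not any(overlaps(p, e) for e in excluded)]


# Made-up coordinates, for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
excluded = [("chr1", 150, 300)]  # hypothetical promoter region
kept = remove_excluded(peaks, excluded)
print(kept)
```

The quadratic scan is adequate for a sketch; a sorted sweep or an interval tree would be the usual choice for genome-scale peak lists.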
We also called the peaks from raw sequencing data (Supplementary Figure S2). We summarized the parameters used for different types of datasets (Supplementary Table S1). We chose the parameters based on the dataset characterization and our experience. GRO-seq could detect 5-prime-capped RNAs that accurately mark active transcriptional regulatory elements (TREs) including enhancers (6,40). We used a GRO-seq specific calling tool, dREG, to get the candidate active enhancers with a bi-directional transcription (6). Note that the input plus and minus bigwig files for dREG were processed with no normalization by ‘RunOnBamToBigWig’ (41). The CAGE method had been successfully used to identify tens of thousands of eRNAs (2). We obtained the CAGE candidate enhancers in both human and mouse from the FANTOM5 project (http://fantom.gsc.riken.jp/5/datafiles/latest/extra/Enhancers/) (2).
RNA polymerase II based ChIA-PET detected the interactions between promoters and other regulatory regions including enhancers (20). We obtained the enhancer regions from the ChIA–PET interactions after filtering out the promoter and gene regions (Supplementary Tables S2 and S3). The MPRA data was collected from the RAEdb resource (29). In addition, we obtained STARR-seq datasets from the GEO Datasets (36).
Generation of consensus track
We developed an unsupervised learning approach to weigh each track and combine them to determine the consensus enhancers (35). In this version, we made a few adjustments to the method. To get the consensus track, we first normalized each individual track. An individual track usually includes several datasets; the ‘TF-binding’ track in particular can contain dozens of datasets for different TFs. We normalized each dataset to make the datasets comparable for combination. The normalization of each dataset or track was computed from the fold enrichment and the length of each peak.
We filtered out peaks with a length over 2500 bp in each dataset. If one track contained multiple datasets (e.g. multiple TFs), we merged the datasets by taking the area centered on the average summit, with a size equal to the average peak width (ASW) (42).
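The length filter and the ASW-style merge can be illustrated with a short Python sketch. It assumes that the peaks to be merged have already been grouped (one group per set of mutually overlapping peaks from different datasets) and that each peak carries a summit position. The `Peak` tuple and both function names are hypothetical, and the rounding rule is a choice made for this sketch rather than something stated in the text.

```python
from statistics import mean
from typing import List, Tuple

# (start, end, summit) on a single chromosome; hypothetical layout.
Peak = Tuple[int, int, int]

MAX_PEAK_LENGTH = 2500  # bp; peaks longer than this are discarded


def drop_long_peaks(peaks: List[Peak]) -> List[Peak]:
    """Discard peaks whose length exceeds MAX_PEAK_LENGTH."""
    return [p for p in peaks if p[1] - p[0] <= MAX_PEAK_LENGTH]


def asw_merge(group: List[Peak]) -> Tuple[int, int]:
    """Merge one group of overlapping peaks into a single region that is
    centered on the average summit and as wide as the average peak."""
    avg_summit = mean(p[2] for p in group)
    avg_width = mean(p[1] - p[0] for p in group)
    half = avg_width / 2
    start = max(0, round(avg_summit - half))
    end = round(avg_summit + half)
    return (start, end)


# Two overlapping peaks from two hypothetical TF datasets.
group = drop_long_peaks([(100, 300, 200), (150, 350, 260)])
merged = asw_merge(group)
print(merged)
```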
In our previous version, we used the Pearson Correlation Coefficient (PCC) to evaluate the correlations between two tracks across the whole genome (35). To put more weight on the enhancer regions, rather than on the largely non-enhancer regions of the genome, the new version uses the Jaccard index (intersection over union) to assess the similarity between two tracks based on their degree of overlap:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
where $|A \cap B|$ represents the number of overlapped regions between tracks $A$ and $B$, while $|A \cup B|$ is the number of union regions. Given a tissue/cell type with $n$ tracks, we calculated the similarities of all pairwise combinations of tracks and put them into a matrix, where $J_{ij}$ is the Jaccard index between tracks $i$ and $j$:
$$M = \begin{pmatrix} J_{11} & \cdots & J_{1n} \\ \vdots & \ddots & \vdots \\ J_{n1} & \cdots & J_{nn} \end{pmatrix}$$
For each track, we then calculated a weight.
In addition, we required that each peak in the merged profile be supported by at least 50% of tracks. The signal value of each combined peak was determined from the lengths of the corresponding peaks in the individual tracks and the length of the combined peak in the merged consensus track.
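A region-count Jaccard similarity between tracks and the 50% support rule can be sketched in Python as below. This is an illustrative reading of the description, not the database's implementation. It assumes all peaks lie on one chromosome, counts a union region as "overlapped" when both tracks contribute a peak to it, and leaves out the track weights and signal values. The track names and coordinates are made up.

```python
from typing import Dict, List, Set, Tuple

Region = Tuple[int, int]  # (start, end), half-open, single chromosome


def merge_with_support(tracks: Dict[str, List[Region]]) -> List[Tuple[int, int, Set[str]]]:
    """Merge all peaks into union regions and record the supporting tracks."""
    tagged = sorted((s, e, name) for name, regs in tracks.items() for s, e in regs)
    merged: List[Tuple[int, int, Set[str]]] = []
    for s, e, name in tagged:
        if merged and s < merged[-1][1]:
            last = merged[-1]
            merged[-1] = (last[0], max(last[1], e), last[2] | {name})
        else:
            merged.append((s, e, {name}))
    return merged


def jaccard(a: List[Region], b: List[Region]) -> float:
    """Number of union regions supported by both tracks / number of union regions."""
    merged = merge_with_support({"a": a, "b": b})
    if not merged:
        return 0.0
    shared = sum(1 for _, _, names in merged if len(names) == 2)
    return shared / len(merged)


def jaccard_matrix(tracks: Dict[str, List[Region]]) -> List[List[float]]:
    """Pairwise similarity matrix, rows and columns in dict order."""
    names = list(tracks)
    return [[jaccard(tracks[x], tracks[y]) for y in names] for x in names]


def consensus(tracks: Dict[str, List[Region]], min_fraction: float = 0.5) -> List[Region]:
    """Keep merged regions supported by at least min_fraction of the tracks."""
    needed = min_fraction * len(tracks)
    return [(s, e) for s, e, names in merge_with_support(tracks) if len(names) >= needed]


tracks = {
    "Histone": [(100, 200), (400, 500), (900, 1000)],
    "DHS": [(150, 250), (420, 480)],
    "P300": [(180, 220), (2000, 2100)],
}
matrix = jaccard_matrix(tracks)
regions = consensus(tracks)
print(matrix[0][1], regions)
```

With only two tracks, a 50% threshold would accept regions seen in a single track; passing `min_fraction=1.0` requires support from both.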
Enhancer–gene interactions
We developed an algorithm, the Enhancer And Gene based Learning Ensemble method (EAGLE), to identify Enhancer–Gene (EG) interactions (43). The method is based on six features: the correlation between enhancer activity and gene expression across cell types, the gene expression level of target genes, the genomic distance between an enhancer and its target gene, the enhancer signal, the average gene activity in the region between the enhancer and the target gene, and the enhancer–enhancer correlation. These genomic features were derived from enhancer and gene expression datasets from the same cell type. Therefore, the method can be widely used in different tissue/cell types. We used ChIA-PET (20) and/or Hi-C (18,19) as the gold standards to define the training datasets and built three prediction models for human, mouse and fly, respectively (Supplementary Figure S3). Applying EAGLE to these three species, we identified 7 680 203, 7 437 255 and 317 588 EG interactions involving 31 375, 43 724 and 12 766 genes and 138 547, 177 062 and 40 321 enhancers across 89, 110 and 7 tissue/cell types in human, mouse and fly, respectively (43). We will provide the enhancer–gene relationships for the other species when genomic interaction datasets (e.g. ChIA-PET and Hi-C) become available for these species.
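Two of the six features, the activity–expression correlation and the genomic distance, can be illustrated with a small Python sketch. This is not the EAGLE implementation. The function names and input values are hypothetical, Pearson correlation is assumed as the correlation measure, and the distance is assumed to run from the enhancer midpoint to the gene's transcription start site.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def distance_to_tss(enh_start: int, enh_end: int, tss: int) -> int:
    """Distance in bp from the enhancer midpoint to the TSS (assumed definition)."""
    return abs((enh_start + enh_end) // 2 - tss)


# Hypothetical enhancer signal and gene expression across three cell types.
enhancer_activity = [1.0, 2.0, 3.0]
gene_expression = [2.0, 4.0, 6.0]
corr = pearson(enhancer_activity, gene_expression)
dist = distance_to_tss(1000, 2000, 5000)
print(corr, dist)
```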
Implementation of database
EnhancerAtlas 2.0 runs on a Linux platform based on Apache-Tomcat-MySQL-PHP-HTML5-JavaScript-Perl and can be used on Windows, Mac and Linux. Specifically, we designed a genome browser to display the coordinates and signals of individual datasets as well as the consensus track in specific cells. If enhancer–gene interactions are available for a particular cell type, they are also displayed. The visualization was implemented using the HTML5 <canvas> element and a drawing module in JavaScript. A two-handle slider widget in the genome browser lets users zoom in or out of a genomic region. We provide several analytic tools so that users can compare enhancers across species or predict enhancers in their own datasets.
RESULTS
Statistics
EnhancerAtlas 2.0 included 13 494 603 annotated consensus enhancers based on 16 055 datasets in 586 tissue/cell types across nine species. The datasets have 12 major data types (tracks) (Table 1). The number of datasets, tracks and enhancers are also summarized in each species (Supplementary Tables S2–S10). For some species (e.g. H. sapiens, M. musculus, D. melanogaster and C. elegans), we determined the consensus enhancers with at least three tracks for each tissue/cell type (Supplementary Tables S2–S5). For the remaining species, we have at least two tracks in each cell type (Supplementary Tables S6–S10). If only two tracks are available for a particular cell type, we require that the consensus enhancer must be supported by both tracks. We also predicted 7 680 203, 7 437 255 and 317 588 enhancer–target gene interactions in human, mouse and fly, respectively. We plan to predict the interactions for the other species when the Hi-C (18,19) and/or ChIA-PET (20) datasets become available.
Table 1.
Species | Tissue/cells | Enhancers | Datasets | P300 | POLR2A | Histone | TF-binding | DHS | FAIRE | MNase | GRO | CAGE | MPRA | STARR | ChIA-PET |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Homo sapiens | 277 | 6 031 402 | 8005 | 132 | 696 | 1580 | 4159 | 1113 | 116 | 56 | 31 | 83 | 5 | 10 | 24 |
Mus musculus | 241 | 6 198 364 | 5838 | 102 | 451 | 1533 | 2930 | 592 | 60 | 91 | 24 | 47 | 0 | 0 | 8 |
Drosophila melanogaster | 21 | 294 158 | 801 | 0 | 101 | 85 | 396 | 96 | 30 | 53 | 22 | 0 | 0 | 17 | 0 |
Caenorhabditis elegans | 9 | 53 060 | 954 | 0 | 69 | 36 | 677 | 150 | 0 | 16 | 6 | 0 | 0 | 0 | 0 |
Danio rerio | 15 | 324 595 | 117 | 0 | 2 | 30 | 37 | 42 | 4 | 2 | 0 | 0 | 0 | 0 | 0 |
Rattus norvegicus | 11 | 267 542 | 101 | 1 | 6 | 37 | 48 | 8 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
Gallus gallus | 3 | 248 792 | 35 | 0 | 6 | 8 | 15 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Sus scrofa | 2 | 71 851 | 11 | 1 | 0 | 2 | 4 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 0 |
Saccharomyces cerevisiae | 7 | 4839 | 197 | 0 | 24 | 17 | 115 | 7 | 6 | 27 | 0 | 0 | 0 | 0 | 0 |
Total | 586 | 13 494 603 | 16 055 | 236 | 1355 | 3324 | 8382 | 2014 | 216 | 246 | 87 | 130 | 5 | 27 | 32 |
Database search
EnhancerAtlas 2.0 was constructed in a user-friendly way. It provided easy-to-use web interfaces for users to search, browse and download different types of enhancers and enhancer-gene interactions in different species. We provided five web-based analytical tools to query and visualize the enhancers and enhancer–gene interactions: (i) search enhancers by region, (ii) search enhancers by gene, (iii) compare enhancers across cells, (iv) compare enhancers of gene across cells, (v) predict enhancers and target genes for custom datasets. Users can search the enhancers by region in any tissue/cell of any species (Figure 1A).
Users can search for the enhancers that regulate a particular gene of interest using the second search option (Figure 1B). The gene name or ID can be entered in many formats, such as Ensembl, EMBL, UCSC, PDB, FlyBase, RefSeq and UniProt (33,44–49). A genome browser is provided for users to visualize the enhancer–gene interactions in the genome. Users can also compare enhancers across different tissue/cell types to identify conserved or cell type-specific enhancers using the third search option (Figure 1C). A gene can be regulated by different enhancers in different tissues/cells. The fourth search tool lets users visualize the different enhancers that regulate the input gene in different tissues/cells (Figure 1D). Users can click the cell names to access the detailed track information for each individual cell. Users can also click ‘show the details’ or ‘download enhancers associated with the gene’ to get the list of enhancer–gene interactions in all selected cells and obtain the relevant enhancers (Supplementary Figure S4). We also designed a module to help users identify the promoters, potential enhancers and the target genes of enhancers from a set of peaks (e.g. obtained from a ChIP-seq or ATAC-seq dataset) (Figure 1E).
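The idea behind comparing enhancers across cells can be sketched in a few lines of Python: for a query region, list the cell types that have an overlapping enhancer. A region found in many cell types is a candidate shared enhancer, and one found in a single cell type is a candidate cell type-specific enhancer. The function name, the data layout and the coordinates are assumptions for illustration and do not describe the web server's code.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def cells_with_overlap(region: Interval,
                       enhancers_by_cell: Dict[str, List[Interval]]) -> List[str]:
    """Return the sorted names of cell types with an enhancer overlapping region."""
    chrom, start, end = region
    return sorted(
        cell
        for cell, enhancers in enhancers_by_cell.items()
        if any(c == chrom and s < end and start < e for c, s, e in enhancers)
    )


# Made-up enhancer coordinates for three cell lines.
enhancers_by_cell = {
    "HepG2": [("chr1", 1000, 1500)],
    "K562": [("chr1", 1400, 1800), ("chr2", 50, 90)],
    "GM12878": [("chr1", 5000, 5500)],
}
shared = cells_with_overlap(("chr1", 1200, 1450), enhancers_by_cell)
specific = cells_with_overlap(("chr1", 5100, 5200), enhancers_by_cell)
print(shared, specific)
```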
Enhancer browser
We also provide a browser page for each enhancer. Users can select the species, cell type, chromosome and a particular enhancer, and the database will generate a summary table, which includes coordinate of the enhancer, GWAS SNPs (50) within the enhancer, TF binding motifs from JASPAR (51), associated super-enhancer, related disease and enhancer sequence (Figure 2).
CONCLUSIONS
EnhancerAtlas 2.0 is a substantial improvement over version 1.0. It annotates 13 494 603 consensus enhancers in 586 tissue/cell types from 12 high-throughput technologies across nine species. We believe this is the most comprehensive enhancer database, including the largest number of enhancer-related datasets. The database has the following advantages. First, it provides consensus enhancer annotation for ∼600 tissue/cell types, which represents a reliable enhancer annotation. Second, it provides analytic tools with which users can search, compare and download the enhancers of interest. Third, it provides potential enhancer–target gene interactions predicted with a newly developed method, EAGLE (43). The method outperformed IM-PET (52), which we used to predict enhancer–target gene interactions in the previous version of the database. Finally, we optimized the search functions of the website, making it more convenient for users to search, query and browse the database. For future development, we plan to provide more relevant information on the enhancers, such as evolutionary conservation across species and associated diseases (24,27,28,30,31).
DATA AVAILABILITY
All the data can be downloaded from http://www.enhanceratlas.org/downloadv2.php.
ACKNOWLEDGEMENTS
We thank the members of the Qian lab for discussion.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
National Institutes of Health [EY024580, GM111514, EY029548, EY001765 to J.Q.]. Funding for open access charge: National Institutes of Health [EY024580, GM111514, EY029548, EY001765 to J.Q.].
Conflict of interest statement. None declared.
REFERENCES
- 1. Ong C.T., Corces V.G. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nat. Rev. Genet. 2011; 12:283–293.
- 2. Andersson R., Gebhard C., Miguel-Escalada I., Hoof I., Bornholdt J., Boyd M., Chen Y., Zhao X., Schmidl C., Suzuki T. et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014; 507:455–461.
- 3. Kvon E.Z., Kazmar T., Stampfel G., Yanez-Cuna J.O., Pagani M., Schernhuber K., Dickson B.J., Stark A. Genome-scale functional characterization of Drosophila developmental enhancers in vivo. Nature. 2014; 512:91–95.
- 4. Kim T.K., Hemberg M., Gray J.M., Costa A.M., Bear D.M., Wu J., Harmin D.A., Laptewicz M., Barbara-Haley K., Kuersten S. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010; 465:182–187.
- 5. Arnold C.D., Gerlach D., Stelzer C., Boryn L.M., Rath M., Stark A. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science. 2013; 339:1074–1077.
- 6. Danko C.G., Hyland S.L., Core L.J., Martins A.L., Waters C.T., Lee H.W., Cheung V.G., Kraus W.L., Lis J.T., Siepel A. Identification of active transcriptional regulatory elements from GRO-seq data. Nat. Methods. 2015; 12:433–438.
- 7. Roadmap Epigenomics Consortium, Kundaje A., Meuleman W., Ernst J., Bilenky M., Yen A., Heravi-Moussavi A., Kheradpour P., Zhang Z., Wang J. et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015; 518:317–330.
- 8. Daugherty A.C., Yeo R.W., Buenrostro J.D., Greenleaf W.J., Kundaje A., Brunet A. Chromatin accessibility dynamics reveal novel functional enhancers in C. elegans. Genome Res. 2017; 27:2096–2107.
- 9. Fullard J.F., Hauberg M.E., Bendl J., Egervari G., Cirnaru M.D., Reach S.M., Motl J., Ehrlich M.E., Hurd Y.L., Roussos P. An atlas of chromatin accessibility in the adult human brain. Genome Res. 2018; 28:1243–1252.
- 10. Creyghton M.P., Cheng A.W., Welstead G.G., Kooistra T., Carey B.W., Steine E.J., Hanna J., Lodato M.A., Frampton G.M., Sharp P.A. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. PNAS. 2010; 107:21931–21936.
- 11. Inukai S., Kock K.H., Bulyk M.L. Transcription factor-DNA binding: beyond binding site motifs. Curr. Opin. Genet. Dev. 2017; 43:110–119.
- 12. Heintzman N.D., Stuart R.K., Hon G., Fu Y., Ching C.W., Hawkins R.D., Barrera L.O., Van Calcar S., Qu C., Ching K.A. et al. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 2007; 39:311–318.
- 13. Koch F., Jourquin F., Ferrier P., Andrau J.C. Genome-wide RNA polymerase II: not genes only!. Trends Biochem. Sci. 2008; 33:265–273.
- 14. He H.H., Meyer C.A., Shin H., Bailey S.T., Wei G., Wang Q., Zhang Y., Xu K., Ni M., Lupien M. et al. Nucleosome dynamics define transcriptional enhancers. Nat. Genet. 2010; 42:343–347.
- 15. West J.A., Cook A., Alver B.H., Stadtfeld M., Deaton A.M., Hochedlinger K., Park P.J., Tolstorukov M.Y., Kingston R.E. Nucleosomal occupancy changes locally over key regulatory regions during cell differentiation and reprogramming. Nat. Commun. 2014; 5:4719.
- 16. Simon J.M., Giresi P.G., Davis I.J., Lieb J.D. Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA. Nat. Protoc. 2012; 7:256–267.
- 17. Melnikov A., Murugan A., Zhang X., Tesileanu T., Wang L., Rogov P., Feizi S., Gnirke A., Callan C.G. Jr, Kinney J.B. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol. 2012; 30:271–277.
- 18. Rao S.S., Huntley M.H., Durand N.C., Stamenova E.K., Bochkov I.D., Robinson J.T., Sanborn A.L., Machol I., Omer A.D., Lander E.S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014; 159:1665–1680.
- 19. Teng L., He B., Wang J., Tan K. 4DGenome: a comprehensive database of chromatin interactions. Bioinformatics. 2015; 31:2560–2564.
- 20. Li G., Ruan X., Auerbach R.K., Sandhu K.S., Zheng M., Wang P., Poh H.M., Goh Y., Lim J., Zhang J. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell. 2012; 148:84–98.
- 21. Kruesi W.S., Core L.J., Waters C.T., Lis J.T., Meyer B.J. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife. 2013; 2:e00808.
- 22. Khan A., Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 2016; 44:D164–D171.
- 23. Ashoor H., Kleftogiannis D., Radovanovic A., Bajic V.B. DENdb: database of integrated human enhancers. Database. 2015; 2015:bav085.
- 24. Zhang G., Shi J., Zhu S., Lan Y., Xu L., Yuan H., Liao G., Liu X., Zhang Y., Xiao Y. et al. DiseaseEnhancer: a resource of human disease-associated enhancer catalog. Nucleic Acids Res. 2018; 46:D78–D84.
- 25. Fishilevich S., Nudel R., Rappaport N., Hadar R., Plaschkes I., Iny Stein T., Rosen N., Kohn A., Twik M., Safran M. et al. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database. 2017; 2017:bax028.
- 26. Xiong L., Kang R., Ding R., Kang W., Zhang Y., Liu W., Huang Q., Meng J., Guo Z. Genome-wide identification and characterization of enhancers across 10 human tissues. Int. J. Biol. Sci. 2018; 14:1321–1332.
- 27. Wang J., Dai X., Berry L.D., Cogan J.D., Liu Q., Shyr Y. HACER: an atlas of human active enhancers to interpret regulatory variants. Nucleic Acids Res. 2019; 47:D106–D112.
- 28. Wang Z., Zhang Q., Zhang W., Lin J.R., Cai Y., Mitra J., Zhang Z.D. HEDD: human enhancer disease database. Nucleic Acids Res. 2018; 46:D113–D120.
- 29. Cai Z., Cui Y., Tan Z., Zhang G., Tan Z., Zhang X., Peng Y. RAEdb: a database of enhancers identified by high-throughput reporter assays. Database. 2019; 2019:bay140.
- 30. Wei Y., Zhang S., Shang S., Zhang B., Li S., Wang X., Wang F., Su J., Wu Q., Liu H. et al. SEA: a super-enhancer archive. Nucleic Acids Res. 2016; 44:D172–D179.
- 31. Jiang Y., Qian F., Bai X., Liu Y., Wang Q., Ai B., Han X., Shi S., Zhang J., Li X. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 2019; 47:D235–D243.
- 32. Davis C.A., Hitz B.C., Sloan C.A., Chan E.T., Davidson J.M., Gabdank I., Hilton J.A., Jain K., Baymuradov U.K., Narayanan A.K. et al. The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 2018; 46:D794–D801.
- 33. Flicek P., Amode M.R., Barrell D., Beal K., Brent S., Carvalho-Silva D., Clapham P., Coates G., Fairley S., Fitzgerald S. et al. Ensembl 2012. Nucleic Acids Res. 2012; 40:D84–D90.
- 34. Visel A., Minovitsky S., Dubchak I., Pennacchio L.A. VISTA Enhancer Browser–a database of tissue-specific human enhancers. Nucleic Acids Res. 2007; 35:D88–D92.
- 35. Gao T., He B., Liu S., Zhu H., Tan K., Qian J. EnhancerAtlas: a resource for enhancer annotation and analysis in 105 human cell/tissue types. Bioinformatics. 2016; 32:3543–3551.
- 36. Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Holko M. et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013; 41:D991–D995.
- 37. Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010; 26:841–842.
- 38. Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008; 9:R137.
- 39. Hinrichs A.S., Karolchik D., Baertsch R., Barber G.P., Bejerano G., Clawson H., Diekhans M., Furey T.S., Harte R.A., Hsu F. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006; 34:D590–D598.
- 40. Nagari A., Murakami S., Malladi V.S., Kraus W.L. Computational approaches for mining GRO-Seq data to identify and characterize active enhancers. Methods Mol. Biol. 2017; 1468:121–138.
- 41. Chu T., Wang Z., Chou S.P., Danko C.G. Discovering transcriptional regulatory elements from run-on and sequencing data using the web-based dREG gateway. Curr. Protoc. Bioinformatics. 2019; 66:e70.
- 42. Yang Y., Fear J., Hu J., Haecker I., Zhou L., Renne R., Bloom D., McIntyre L.M. Leveraging biological replicates to improve analysis in ChIP-seq experiments. Comput. Struct. Biotechnol. J. 2014; 9:e201401002.
- 43. Gao T., Qian J. EAGLE: an algorithm that utilizes a small number of genomic features to predict tissue/cell type-specific enhancer-gene interactions. PLoS Comput. Biol. 2019; 15:e1007436.
- 44. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019; 47:D506–D515.
- 45. Haft D.H., DiCuccio M., Badretdin A., Brover V., Chetvernin V., O’Neill K., Li W., Chitsaz F., Derbyshire M.K., Gonzales N.R. et al. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res. 2018; 46:D851–D860.
- 46. Marygold S.J., Crosby M.A., Goodman J.L., FlyBase Consortium. Using FlyBase, a database of Drosophila genes and genomes. Methods Mol. Biol. 2016; 1478:1–31.
- 47. Velankar S., Alhroub Y., Best C., Caboche S., Conroy M.J., Dana J.M., Fernandez Montecelo M.A., van Ginkel G., Golovin A., Gore S.P. et al. PDBe: Protein Data Bank in Europe. Nucleic Acids Res. 2012; 40:D445–D452.
- 48. Haeussler M., Zweig A.S., Tyner C., Speir M.L., Rosenbloom K.R., Raney B.J., Lee C.M., Lee B.T., Hinrichs A.S., Gonzalez J.N. et al. The UCSC Genome Browser database: 2019 update. Nucleic Acids Res. 2019; 47:D853–D858.
- 49. McWilliam H., Li W., Uludag M., Squizzato S., Park Y.M., Buso N., Cowley A.P., Lopez R. Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res. 2013; 41:W597–W600.
- 50. MacArthur J., Bowler E., Cerezo M., Gil L., Hall P., Hastings E., Junkins H., McMahon A., Milano A., Morales J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 2017; 45:D896–D901.
- 51. Khan A., Fornes O., Stigliani A., Gheorghe M., Castro-Mondragon J.A., van der Lee R., Bessy A., Chèneby J., Kulkarni S.R., Tan G. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 2018; 46:D260–D266.
- 52. He B., Chen C., Teng L., Tan K. Global view of enhancer-promoter interactome in human cells. PNAS. 2014; 111:E2191–E2199.