Abstract
DNA methylation plays crucial roles during embryonic development. Here we present MethBank (http://dnamethylome.org), a DNA methylome programming database that integrates the genome-wide single-base-resolution methylomes of gametes and early embryos in different model organisms. Unlike existing related databases, MethBank incorporates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple developmental stages in zebrafish and mouse. MethBank allows users to retrieve methylation levels, differentially methylated regions, CpG islands, gene expression profiles and genetic polymorphisms for a specific gene or genomic region. Moreover, it offers a methylome browser that visualizes high-resolution DNA methylation profiles and other related data interactively, helping users to investigate methylation patterns and changes in gametes and early embryos at different developmental stages. Ongoing efforts are focused on incorporating methylomes and related data from other organisms. Together, MethBank features integration and visualization of high-resolution DNA methylation data as well as other related data, enabling identification of potential DNA methylation signatures at different developmental stages and accordingly providing an important resource for epigenetic and developmental studies.
INTRODUCTION
DNA methylation is a major epigenetic mark that is crucial for embryogenesis and highly dynamic during embryonic development (1,2). Previous studies, including our own (3,4), have shown that in Danio rerio (zebrafish) the paternal methylome is stably inherited, whereas the maternal methylome is reprogrammed to the sperm pattern (3,5). In Mus musculus (mouse), the paternal methylome and at least a significant proportion of the maternal methylome undergo active demethylation during embryonic development (4). The strategies for reprogramming parental methylomes are fundamentally different between vertebrates (6), suggesting that the underlying developmental programs may be distinct in different species (3,4,7–10). Therefore, studying DNA methylation at multiple developmental stages in different species may extend our knowledge of the inheritance and reprogramming of methylomes.
In contrast to region-wide methods for DNA methylation profiling and reduced representation bisulfite sequencing (RRBS), whole-genome bisulfite sequencing (WGBS) powered by high-throughput sequencing technologies measures DNA methylation at single-nucleotide resolution and thus generates genome-wide high-resolution DNA methylomes. The comprehensive integration of DNA methylomes from different species therefore holds promise for the systematic study of DNA methylation reprogramming in early embryos (10–14). Over the past years, several methylation-related databases have been developed for managing methylation data (15–22). However, none of them is designed to support developmental studies by integrating high-resolution whole-genome DNA methylomes from multiple developmental stages in different organisms. Specifically, MethylomeDB (15) is focused only on a single tissue, the brain. DiseaseMeth (16), MethyCancer (17) and PubMeth (18) are dedicated to linking methylation with human cancer or disease. MethDB (19) and PCMdb (20) do not store whole-genome single-base-resolution methylome data. NGSmethDB does not contain methylome data or other related omics data for gametes and early embryos (21). Cyclonet was under construction as of 28 June 2014 (http://cyclonet.biouml.org); from what is available, it is mainly centered on cell cycles and does not include methylome data of gametes and early embryos. MethBase does not focus on methylomes of gametes and early embryos for developmental studies (22). NCBI Epigenomics Resources is designed for general research purposes, mainly providing pre-computed methylation results at individual cytosines. Thus, there is a lack of a specialized database that contains high-resolution genome-wide developmental methylomes and aims to exploit the full potential of methylomes for systematic analysis of methylation dynamics during development.
Here we present MethBank (http://dnamethylome.org), a database of DNA methylome programming that integrates the genome-wide single-base-resolution methylomes of gametes and early embryos, covering multiple developmental stages and spanning different model organisms. Unlike existing databases, MethBank includes multiple whole-genome methylomes of gametes and early embryos at different developmental stages, incorporates related data (e.g. expression, genetic polymorphism) to facilitate systematic integrative investigation of DNA methylome reprogramming in embryonic development, and provides a configurable and interactive methylome browser to visualize high-resolution methylation data as well as other related data.
IMPLEMENTATION
MethBank has been implemented using MySQL (http://www.mysql.org; a free and popular relational database management system), JSP (JavaServer Pages; a technology facilitating rapid development of dynamic web pages based on the Java programming language) and Apache Tomcat (http://tomcat.apache.org; an open source web server for running Java code) on a Red Hat Enterprise Linux Server. The web pages were developed using Eclipse (http://www.eclipse.org), an integrated development environment (IDE) that supports rapid development of Java-based web applications and simplifies connection to database management systems. To provide friendly and interactive web pages, browser-based interfaces were coded in JSP and AJAX (Asynchronous JavaScript and XML, a collection of web development technologies for creating highly interactive web applications), enabling asynchronous data transfer between server and browser without interfering with the display of the current web page. MethBank is freely available at http://dnamethylome.org.
DATABASE CONTENT AND USAGE
MethBank is a database of DNA methylome programming dedicated to storing, browsing and visualizing single-base-resolution whole-genome DNA methylation data of gametes and early embryos in different animals. Based on our previous studies (3,4), MethBank incorporates large cohorts of gamete and early embryo methylomes at single-base resolution for two well-studied species, D. rerio and M. musculus. For each species, there are nine different developmental stages involving two gametes and seven embryos (for details see http://dnamethylome.org/about). For each developmental stage, >17 and ∼19 million methylated CG sites are included for zebrafish and mouse, respectively, covering about 90% of all CpGs in the whole genome. To enable in-depth investigation of DNA methylation data, MethBank profiles the methylation level of each methylated CG site and identifies differentially methylated regions (DMRs) between oocyte and sperm, helping users to explore the dynamic methylation changes between different developmental stages.
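To make the notion of a per-site methylation level concrete: in WGBS data it is commonly estimated as the fraction of reads covering a cytosine that report it as methylated. The sketch below illustrates that calculation only; the function name, the minimum-coverage cutoff and the read counts are illustrative assumptions and are not taken from the MethBank pipeline.

```python
from typing import Optional


def methylation_level(methylated_reads: int, total_reads: int,
                      min_coverage: int = 5) -> Optional[float]:
    """Estimate the methylation level of one CG site from bisulfite read counts.

    The level is the fraction of covering reads that report a methylated
    cytosine. Sites covered by fewer than `min_coverage` reads (a
    hypothetical cutoff) return None instead of an unreliable estimate.
    """
    if methylated_reads < 0 or total_reads < methylated_reads:
        raise ValueError("read counts are inconsistent")
    if total_reads < min_coverage:
        return None
    return methylated_reads / total_reads


# Hypothetical counts: 8 of 10 reads support methylation at this site.
level = methylation_level(8, 10)
```

Returning None for low-coverage sites keeps unreliable estimates out of downstream averages; a real pipeline might filter such sites differently.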
Because genetic polymorphisms may disrupt methylation status and gene expression correlates closely with DNA methylation, a large number of single nucleotide polymorphisms (SNPs) and expression profiles are also included in MethBank. Consequently, interconnections among different omics data are established and presented in MethBank. The detailed data statistics in MethBank are summarized in Table 1 and maintained online at http://dnamethylome.org/about. For a given region or gene, MethBank can profile methylation levels, locate the DMRs within it, and report the corresponding expression and SNP information. Because advanced users may wish to download raw data for their own analysis, the ‘Download’ page provides links to whole-genome single-base-resolution methylomes at different developmental stages.
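Locating the DMRs and SNPs that fall within a queried region amounts to an interval-overlap test. The following is a minimal sketch of that idea, not MethBank's implementation: the `Feature` class, the `features_in_region` function, the 1-based inclusive coordinate convention and the example coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Feature:
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive
    kind: str   # e.g. "DMR" or "SNP"


def features_in_region(features: Iterable[Feature], chrom: str,
                       start: int, end: int) -> List[Feature]:
    """Return the features that overlap chrom:start-end by at least one base."""
    if start > end:
        raise ValueError("start must not exceed end")
    return [
        f for f in features
        if f.chrom == chrom and f.start <= end and f.end >= start
    ]


# Made-up annotations, for illustration only.
annotations = [
    Feature("chr1", 100, 200, "DMR"),
    Feature("chr1", 250, 250, "SNP"),
    Feature("chr2", 150, 160, "DMR"),
    Feature("chr1", 500, 600, "DMR"),
]
hits = features_in_region(annotations, "chr1", 180, 300)
```

A linear scan is adequate for a sketch; for genome-scale annotation sets, an indexed database query or an interval tree would be the more usual choice.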
Table 1. MethBank data content and statistics as of 1 July 2014.
| Data content | Data statistics | 
|---|---|
| Methylation data (WGBSa) | |
| Sperm (zebrafish) | 20 606 757 CpG sites | 
| Oocyte (zebrafish) | 20 987 274 CpG sites | 
| 16-Cell (zebrafish) | 18 948 935 CpG sites | 
| 32-Cell (zebrafish) | 20 825 693 CpG sites | 
| 64-Cell (zebrafish) | 17 836 722 CpG sites | 
| 128-Cell (zebrafish) | 19 889 625 CpG sites | 
| 1k-Cell (zebrafish) | 20 815 960 CpG sites | 
| Germ-ring (zebrafish) | 19 258 632 CpG sites | 
| Testis (zebrafish) | 20 095 377 CpG sites | 
| Sperm (mouse) | 19 455 209 CpG sites | 
| Oocyte (mouse) | 19 100 312 CpG sites | 
| 2-Cell (mouse) | 19 844 915 CpG sites | 
| 4-Cell (mouse) | 19 637 338 CpG sites | 
| E7.5 (mouse) | 19 644 831 CpG sites | 
| ICM (mouse) | 19 523 556 CpG sites | 
| E13.5 female (mouse) | 19 253 566 CpG sites | 
| E13.5 male (mouse) | 19 352 586 CpG sites | 
| SNPb data | |
| TU female (zebrafish) | 3 653 795 SNPs | 
| TL male (zebrafish) | 3 423 892 SNPs | 
| Mouse | 2 561 574 SNPs | 
| Gene expression data | |
| Sperm (zebrafish) | 13 529 genes | 
| Oocyte (zebrafish) | 11 191 genes | 
| 1k-Cell (zebrafish) | 12 168 genes | 
| Germ-ring (zebrafish) | 12 129 genes | 
| Oocyte (mouse) | 18 259 genes | 
| CpG island data | |
| Zebrafish | 8768 CpG islands | 
| Mouse | 37 730 CpG islands | 
| DMRc data | |
| Sperm versus oocyte (zebrafish) | 53 680 DMRs | 
| 1k-cell versus oocyte (zebrafish) | 51 886 DMRs | 
| 1k-cell versus germ-ring (zebrafish) | 95 DMRs | 
| Germ-ring versus oocyte (zebrafish) | 36 004 DMRs | 
| Sperm versus oocyte (mouse) | 2000 DMRs | 
aWGBS: whole-genome bisulfite sequencing.
bSNP: single nucleotide polymorphism.
cDMR: differentially methylated region.
To visualize single-base-resolution DNA methylation data, an interactive and user-friendly methylome browser built on JBrowse (23) is deployed in MethBank (Figure 1). For each species, the methylome browser includes a variety of data tracks (namely, CpG island, DMR, gene expression, methylation level, reference gene, reference sequence and SNP) and allows users to choose tracks of interest and to zoom and scroll to any region along the genome. The methylome browser is therefore useful for investigating the methylation status of specific genes or regions across different developmental stages while taking multiple relevant data tracks into account (Figure 1A). Moreover, when a gene or region on a specific track is clicked, the corresponding details are displayed and available for download (Figure 1B–D). Equipped with the methylome browser, MethBank can interactively visualize high-resolution DNA methylation profiles together with DMRs, gene expression levels, SNPs and CpG islands, allowing users to investigate methylation patterns of gametes and early embryos at different developmental stages within specific genes or regions.
Figure 1. Screenshots of the methylome browser in MethBank. (A) Overview of the methylome browser for a region containing two zebrafish development-associated hoxa genes, showing methylation levels, gene expression profiles, DMRs and SNPs. (B) Detailed gene expression information for hoxa13b. (C) Detailed DMR information for hoxa13b. (D) Detailed SNP information for hoxa13b.
To support information search and exploration, MethBank provides user-friendly web interfaces for retrieving diverse information on a specific gene or region (Figure 2). By specifying a gene symbol, users can obtain its methylation states at the promoter and gene body across multiple developmental stages, as well as its basic information and gene expression (Figure 2A). For a given gene symbol (Figure 2B) or a specified genomic region (Figure 2C), MethBank can also provide all relevant DMRs between two developmental stages. Detailed SNP information, including allele information and genomic locus, is available for any genomic region (Figure 2D). Moreover, MethBank provides not only the detailed methylation levels for all CG sites in a specific genomic region, but also the averaged methylation level for this region (Figure 2E). Additionally, all of this information can be displayed interactively and together in the methylome browser by clicking the ‘View’ link.
Figure 2. Screenshots of information queries in MethBank. (A) Methylation states at the promoter and gene body, as well as basic information and gene expression, obtained by searching the mouse gene DNMT3a. (B) DMR information for the zebrafish gene EEF1A1. (C) DMR information obtained by searching a genomic region from 30171452 to 30176667 on zebrafish chromosome 1. (D) SNP information obtained by searching a genomic region from 55442889 to 55458838 on zebrafish chromosome 3. (E) Methylation levels for all CG sites in a genomic region from 30171452 to 30172100 on mouse chromosome 1 and its average methylation level (0.62115).
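The region query illustrated in Figure 2E can be approximated as follows. This is a sketch under stated assumptions rather than MethBank's own code: it takes the region average to be the unweighted arithmetic mean of the per-site levels, which may differ from the formula the database uses, and the function name and example values are hypothetical.

```python
from typing import Dict, List, Tuple


def region_methylation(site_levels: Dict[int, float], start: int,
                       end: int) -> Tuple[List[Tuple[int, float]], float]:
    """Return the CG sites within [start, end] and their mean methylation level.

    `site_levels` maps a genomic position on one chromosome to the
    methylation level (0.0 to 1.0) measured at that CG site.
    """
    selected = sorted(
        (pos, level) for pos, level in site_levels.items()
        if start <= pos <= end
    )
    if not selected:
        raise ValueError("no CG sites in the requested region")
    mean_level = sum(level for _, level in selected) / len(selected)
    return selected, mean_level


# Made-up per-site levels, for illustration only.
levels = {10: 0.5, 20: 1.0, 30: 0.0, 99: 0.75}
sites, average = region_methylation(levels, 10, 30)
```

An unweighted mean treats every CG site equally; weighting each site by its read coverage is a common alternative when coverage varies widely across the region.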
DISCUSSION AND FUTURE DEVELOPMENTS
Unlike existing databases, MethBank (1) integrates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple developmental stages; (2) stores vast numbers of methylated CG sites in zebrafish and mouse, with >17 million and ∼19 million sites for each developmental stage, respectively; (3) incorporates other related omics data, interconnects them with methylomes and provides a methylome browser for visualizing all types of data in a genomic context; and (4) allows online queries of methylation levels, DMRs, CpG islands, expression profiles and SNP information for a specific region or gene. Taken together, MethBank integrates and visualizes high-resolution genome-wide DNA methylomes as well as gene expression profiles and genetic polymorphisms, enabling identification of potential DNA methylation signatures at different developmental stages and accordingly providing an important resource for epigenetic and developmental studies.
MethBank is committed to integrating the genome-wide single-base-resolution methylomes of gametes and early embryos in different animals. With the rapid advancement of high-throughput sequencing technologies, increasing amounts of single-base-resolution development-related methylation data will become available in the coming years, holding great promise for unveiling the fundamentals of DNA methylation in the development and differentiation of various cell types in different organisms. Therefore, future developments for MethBank include the incorporation of whole-genome methylomes at multiple developmental stages from other species. MethBank will also continue to integrate related types of data (e.g. expression, SNP) from different resources and add more methylation analysis tools. Given the increasing volume of methylation data, it is also important to develop web pages and tools that allow new data to be incorporated easily. Furthermore, MethBank will provide orthologous genes in different species and develop web interfaces to facilitate cross-species comparison of DNA methylation at different developmental stages. The methylome browser will be further improved to support interactive visualization of large-scale methylation data as well as other related data. In addition to our DNA methylation data generated in-house, we invite the scientific community to submit their methylation data to MethBank and to collaborate in improving its functionality.
Acknowledgments
We thank Dr Jun Yu for valuable discussions on this work and members of the Zhang Lab for reporting bugs and sending comments.
Footnotes
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
FUNDING
Strategic Priority Research Program of the Chinese Academy of Sciences [XDB13040000 to Z.Z. and J.L.]; National Natural Science Foundation of China [31200958 to J.Z., 91219104 to J.L. and 31000584 to R.L.]; Youth Innovation Promotion Association of the Chinese Academy of Sciences [to J.Z.]; the ‘100-Talent Program’ of the Chinese Academy of Sciences [Y1SLXb1365 to Z.Z.]; National Programs for High Technology Research and Development [863 Program; 2012AA020409 to Z.Z.]; the Ministry of Science and Technology of the People's Republic of China. Funding for open access charge: National Programs for High Technology Research and Development [2012AA020409].
Conflict of interest statement. None declared.
REFERENCES
- 1. Li E., Bestor T.H., Jaenisch R. Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell. 1992;69:915–926. doi: 10.1016/0092-8674(92)90611-f.
- 2. Okano M., Bell D.W., Haber D.A., Li E. DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell. 1999;99:247–257. doi: 10.1016/s0092-8674(00)81656-6.
- 3. Jiang L., Zhang J., Wang J., Wang L., Zhang L., Li G., Yang X., Ma X., Sun X., Cai J., et al. Sperm, but not oocyte, DNA methylome is inherited by zebrafish early embryos. Cell. 2013;153:773–784. doi: 10.1016/j.cell.2013.04.041.
- 4. Wang L., Zhang J., Duan J., Gao X., Zhu W., Lu X., Yang L., Zhang J., Li G., Ci W., et al. Programming and inheritance of parental DNA methylomes in mammals. Cell. 2014;157:979–991. doi: 10.1016/j.cell.2014.04.017.
- 5. Potok M.E., Nix D.A., Parnell T.J., Cairns B.R. Reprogramming the maternal zebrafish genome after fertilization to match the paternal methylation pattern. Cell. 2013;153:759–772. doi: 10.1016/j.cell.2013.04.030.
- 6. Hackett J.A., Surani M.A. Beyond DNA: programming and inheritance of parental methylomes. Cell. 2013;153:737–739. doi: 10.1016/j.cell.2013.04.044.
- 7. Wossidlo M., Nakamura T., Lepikhov K., Marques C.J., Zakhartchenko V., Boiani M., Arand J., Nakano T., Reik W., Walter J. 5-Hydroxymethylcytosine in the mammalian zygote is linked with epigenetic reprogramming. Nat. Commun. 2011;2:241. doi: 10.1038/ncomms1240.
- 8. Gu T.P., Guo F., Yang H., Wu H.P., Xu G.F., Liu W., Xie Z.G., Shi L., He X., Jin S.G., et al. The role of Tet3 DNA dioxygenase in epigenetic reprogramming by oocytes. Nature. 2011;477:606–610. doi: 10.1038/nature10443.
- 9. Inoue A., Zhang Y. Replication-dependent loss of 5-hydroxymethylcytosine in mouse preimplantation embryos. Science. 2011;334:194. doi: 10.1126/science.1212483.
- 10. Smith Z.D., Chan M.M., Mikkelsen T.S., Gu H., Gnirke A., Regev A., Meissner A. A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature. 2012;484:339–344. doi: 10.1038/nature10960.
- 11. Beck S., Rakyan V.K. The methylome: approaches for global DNA methylation profiling. Trends Genet. 2008;24:231–237. doi: 10.1016/j.tig.2008.01.006.
- 12. Laird P.W. Principles and challenges of genomewide DNA methylation analysis. Nat. Rev. Genet. 2010;11:191–203. doi: 10.1038/nrg2732.
- 13. Ball M.P., Li J.B., Gao Y., Lee J.H., LeProust E.M., Park I.H., Xie B., Daley G.Q., Church G.M. Targeted and genome-scale strategies reveal gene-body methylation signatures in human cells. Nat. Biotechnol. 2009;27:361–368. doi: 10.1038/nbt.1533.
- 14. Harris R.A., Wang T., Coarfa C., Nagarajan R.P., Hong C., Downey S.L., Johnson B.E., Fouse S.D., Delaney A., Zhao Y., et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat. Biotechnol. 2010;28:1097–1105. doi: 10.1038/nbt.1682.
- 15. Xin Y., Chanrion B., O'Donnell A.H., Milekic M., Costa R., Ge Y., Haghighi F.G. MethylomeDB: a database of DNA methylation profiles of the brain. Nucleic Acids Res. 2012;40:D1245–D1249. doi: 10.1093/nar/gkr1193.
- 16. Lv J., Liu H., Su J., Wu X., Liu H., Li B., Xiao X., Wang F., Wu Q., Zhang Y. DiseaseMeth: a human disease methylation database. Nucleic Acids Res. 2012;40:D1030–D1035. doi: 10.1093/nar/gkr1169.
- 17. He X., Chang S., Zhang J., Zhao Q., Xiang H., Kusonmano K., Yang L., Sun Z.S., Yang H., Wang J. MethyCancer: the database of human DNA methylation and cancer. Nucleic Acids Res. 2008;36:D836–D841. doi: 10.1093/nar/gkm730.
- 18. Ongenaert M., Van Neste L., De Meyer T., Menschaert G., Bekaert S., Van Criekinge W. PubMeth: a cancer methylation database combining text-mining and expert annotation. Nucleic Acids Res. 2008;36:D842–D846. doi: 10.1093/nar/gkm788.
- 19. Grunau C., Renault E., Rosenthal A., Roizes G. MethDB—a public database for DNA methylation data. Nucleic Acids Res. 2001;29:270–274. doi: 10.1093/nar/29.1.270.
- 20. Nagpal G., Sharma M., Kumar S., Chaudhary K., Gupta S., Gautam A., Raghava G.P. PCMdb: pancreatic cancer methylation database. Sci. Rep. 2014;4:4197. doi: 10.1038/srep04197.
- 21. Hackenberg M., Barturen G., Oliver J.L. NGSmethDB: a database for next-generation sequencing single-cytosine-resolution DNA methylation data. Nucleic Acids Res. 2011;39:D75–D79. doi: 10.1093/nar/gkq942.
- 22. Song Q., Decato B., Hong E.E., Zhou M., Fang F., Qu J., Garvin T., Kessler M., Zhou J., Smith A.D. A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics. PLoS One. 2013;8:e81148. doi: 10.1371/journal.pone.0081148.
- 23. Skinner M.E., Uzilov A.V., Stein L.D., Mungall C.J., Holmes I.H. JBrowse: a next-generation genome browser. Genome Res. 2009;19:1630–1638. doi: 10.1101/gr.094607.109.
