Nucleic Acids Research. 2015 Apr 16;43(Web Server issue):W109–W116. doi: 10.1093/nar/gkv345

The ReproGenomics Viewer: an integrative cross-species toolbox for the reproductive science community

Thomas A Darde 1,2, Olivier Sallou 2, Emmanuelle Becker 1, Bertrand Evrard 1, Cyril Monjeaud 2, Yvan Le Bras 2, Bernard Jégou 1,3, Olivier Collin 2, Antoine D Rolland 1, Frédéric Chalmel 1,*
PMCID: PMC4489245  PMID: 25883147

Abstract

We report the development of the ReproGenomics Viewer (RGV), a multi- and cross-species working environment for the visualization, mining and comparison of published omics data sets for the reproductive science community. The system currently embeds 15 published data sets related to gametogenesis from nine model organisms. Data sets have been curated and conveniently organized into broad categories including biological topics, technologies, species and publications. RGV's modular design for both organisms and genomic tools enables users to upload and compare their data with that from the data sets embedded in the system in a cross-species manner. The RGV is freely available at http://rgv.genouest.org.

INTRODUCTION

Sexual reproduction in eukaryotes involves a wide spectrum of biological processes by which species give rise to new individuals and thus perpetuate themselves. These include the formation of haploid gametes after meiosis, a specific type of cell division that takes place only in the germ line. In the male, the differentiation of germ cells into highly specialized spermatozoa is a complex and tightly regulated process called spermatogenesis. This developmental process involves the sequential and coordinated expression of thousands of genes, many of them testis-specific. Spermatogenesis has thus been widely explored in microarray-based expression studies over the last two decades (1,2), and several databases devoted to spermatogenesis and gametogenesis (3–5) or to reproduction in general (6–8) have been developed to organize and provide access to this massive quantity of data.

More recently, ultra-high-throughput next-generation sequencing (NGS) projects have imposed new challenges on the life science research community: the complex tasks of processing, hosting and interpreting these data (9). The repositories or databases referred to above, however, cannot cope with several intrinsic features of NGS data. For instance, although microarrays provide an average measurement of gene or transcript expression that can be easily displayed, NGS offers quantification at single-base resolution, a feature that can only be exploited by dedicated visualization tools that take into account both the genome coordinates of sequenced nucleotides and the coverage information along every genomic locus. Additionally, microarray-based expression databases are typically organized around annotated entities, i.e. probes, transcripts, genes or, perhaps, corresponding proteins. Their structure is therefore incompatible with the ability of RNA-sequencing to lead to new discoveries (e.g. when new transcript isoforms are assembled and/or new loci identified) and not adapted to ChIP- or Methyl-seq analyses of specific chromatin regions, the boundaries of which cannot be strictly defined. The so-called genome browsers, a new type of database, have emerged to meet these requirements (10). The UCSC Genome Browser (11) is a pioneer in this regard. The implementation of new modules (12,13) makes it possible to create even more flexible and intuitive browsers. These allow the hosting, visualization, customization, retrieval and analysis of various types of genomics data in a single environment, thus enabling researchers to extract and share data easily and construct new hypotheses from them. Most of these browsers, however, focus on a single species (14–17) or a single type of genomic data (18,19). To our knowledge, there is no tool directed toward a specific research field and scientific community that can bring together the major relevant studies, regardless of species and technology type.

Here we present the ReproGenomics Viewer (RGV), a cross-species genomic toolbox for the reproductive community. The system is based on the implementation of a ‘JBrowse genome browser’ (20) and a ‘Galaxy bioinformatics workflow environment’ (21–23). It was developed to provide a one-stop genomic working environment and aims to assist scientists in the analysis and the mining of a wide range of high-throughput repro-genomics data, including sequencing data. RGV allows hosting, visualization and direct comparison of users’ data to published genomics studies as well as to relevant genetic variations linked to reproduction. One way it does this is by enabling various genomic file format conversions. In addition, genomic coordinates can be converted not only between genome releases of a given species but also, and more importantly, between different species. This key feature allows the direct comparison of data sets acquired in different organisms and thus makes RGV not only a multispecies genome browser but also a true cross-species tool for comparing reproductive genomics data. The RGV currently hosts data sets that are oriented mainly toward testis biology and spermatogenesis. In the near future, these will extend to other areas of reproduction, including gonad development, urogenital cancers and reproductive toxicology.

DESCRIPTION OF DATA SETS

As mentioned above, the RGV currently embeds 15 published studies related to male gamete development or gametogenesis in general (24–36) (Table 1). These data sets are publicly available through the NCBI Gene Expression Omnibus Repository (37). They describe the extensive re-exploration of the spermatogenesis process over the past few years by the emerging ultra-high-throughput sequencing technologies. Specifically, the studies investigated the dynamic omics landscape of developing male germ cells, including: (i) chromatin remodeling and epigenetic features such as active and repressive marks (24–25,27–30); (ii) cistromes of transcription factors important for spermatogenesis (26,29); (iii) transcriptional landscapes, defined mainly by RNA sequencing technologies (24,28,31–36); and (iv) proteomic profiles generated with the recent Proteomics Informed by Transcriptomics (PIT) approach (34). All these experiments took place in a wide spectrum of model organisms, including Homo sapiens (25,30,36), Gorilla gorilla (36), Macaca mulatta (36), Mus musculus (24–29,31–32,35), Rattus norvegicus (33,34), Monodelphis domestica (36), Ornithorhynchus anatinus (36), Gallus gallus (29,36) and Saccharomyces cerevisiae (as sporulation in yeast is the developmental process analogous to spermatogenesis in higher eukaryotes (38–41)). Taken together, these published data sets currently represent 342 samples, 168 of which are from vertebrates.

Table 1. Published data sets relevant to gamete development currently included in the RGV system and some relevant characteristics.

| Publication | PubMed ID | Species (release) | Technologies | Biological topics |
| --- | --- | --- | --- | --- |
| Chocu et al., 2014 (34) | 25210130 | Rat (rn4) | RNA-seq | Spermatogenesis |
| Hammoud et al., 2014 (25) | 24835570 | Multi (2 species) | ChIP-seq, Bisulfite-seq | Spermatogenesis |
| Chalmel et al., 2014 (33) | 24740603 | Rat (rn4) | RNA-seq | Spermatogenesis |
| Meikar et al., 2014 (35) | 24554440 | Mouse (mm9) | RNA-seq, smallRNA-seq | Spermatogenesis |
| Necsulea et al., 2014 (36) | 24463510 | Multi (7 species) | RNA-seq | Tissue profiling |
| Soumillon et al., 2013 (32) | 23791531 | Mouse (mm9) | RNA-seq | Spermatogenesis |
| Erkek et al., 2013 (28) | 23770822 | Mouse (mm9) | RNA-seq | Spermatogenesis |
| Gan et al., 2013 (24) | 23759713 | Mouse (mm9) | RNA-seq, 5hMeDIP-seq | Spermatogenesis |
| Laiho et al., 2013 (31) | 23613874 | Mouse (mm9) | RNA-seq | Spermatogenesis |
| Li et al., 2013 (29) | 23523368 | Multi (2 species) | ChIP-seq | Spermatogenesis |
| Gaucher et al., 2012 (26) | 22922464 | Mouse (mm9) | RNA-seq | Spermatogenesis |
| Brick et al., 2012 (27) | 22660327 | Mouse (mm9) | ChIP-seq | Spermatogenesis |
| Lardenois et al., 2011 (38) | 21149693 | Yeast (sacCer3) | Tiling Array | Sporulation (SK1, MATa-alpha) |
| Brykczynska et al., 2010 (30) | 20473313 | Human (hg18) | MNase-seq | Spermatogenesis |
| Granovskaia et al., 2010 (41) | 20193063 | Yeast (sacCer3) | Tiling Array | Mitosis (W101, MATa) |

In a critical step, we also gathered allele and genotype frequency data and significant genetic association findings from public databases such as GWAS and ClinVar (42,43). The controlled vocabulary provided by both projects enabled us to split genetic association studies into two categories: reproductive and non-reproductive symptoms. Direct links to PubMed and variant databases are provided.

THE RGV BACKBONE: DATA PROCESSING AND ORGANIZATION

The backbone of the RGV is the series of tools for processing and organizing data within the system (Figure 1A). Four types of information were manually extracted and curated for each study: the scientific name of each species and the genome release with which the experiments were performed and analyzed; the associated scientific publication; each biological topic investigated in the study; and the high-throughput technologies used. Then each sample of a given data set underwent a series of automatic conversions to make it fully compatible with the RGV system (Figure 1B). Briefly, for a given sample X analyzed under a genome release r-1 of species Y, five processing steps were sequentially performed: (i) each of the various input data formats (bedGraph/BED, WIG, bigWig) was converted into a simple tab-delimited text file (BED); (ii) as some differences can occur even within the same genome release of a given species, the resulting BED file was modified where needed to standardize, for example, the chromosome names, which may differ between the Ensembl, UCSC and NCBI databases; (iii) the standardized BED file was then converted into an indexed binary format (bigWig or bw) to enable fast remote access to the data. Finally, using the pairwise alignments between genome assemblies and between species provided by UCSC, the genome coordinates in the resulting bigWig file were converted from genome release r-1 of species Y into (iv) the current assembly r of the same species Y and then (v) the current assembly r of another species Z.
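The standardization in step (ii) can be illustrated with a minimal Python sketch. It is not the RGV implementation: the function names are hypothetical, and the renaming rule (prefixing Ensembl/NCBI-style names with `chr` and mapping `MT` to `chrM`, as in UCSC-style naming) is an assumption that covers only the simplest case.

```python
def to_ucsc_name(chrom: str) -> str:
    """Map an Ensembl/NCBI-style sequence name to a UCSC-style one.

    Assumption: only the 'chr' prefix and the mitochondrial name differ.
    Unplaced scaffolds would need a proper lookup table.
    """
    if chrom.startswith("chr"):
        return chrom
    if chrom in ("MT", "M"):
        return "chrM"
    return "chr" + chrom


def standardize_bed_lines(lines):
    """Yield BED/bedGraph lines with the chromosome column rewritten."""
    for line in lines:
        stripped = line.rstrip("\n")
        # Header and comment lines are passed through unchanged.
        if not stripped.strip() or stripped.startswith(("track", "browser", "#")):
            yield stripped
            continue
        fields = stripped.split("\t")
        fields[0] = to_ucsc_name(fields[0])
        yield "\t".join(fields)
```

Keeping the renaming rule in a single small function makes it easy to swap in a lookup table when two sources differ by more than a prefix.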

Figure 1.

The RGV backbone. (A) A schematic diagram of the strategy used to process and organize each individual sample from the published data sets embedded in the RGV system. The publication by Chalmel et al. is taken as an example (33). The organization of the data is based on the information manually extracted from the publication (species name, genome release, biological topic and technology). (B) The ‘RGV data processing’ workflow used to convert data file formats, standardize data files and then to convert genome coordinates between assemblies (r−1 → r) and between species (species Y → Z). (C) Screenshot of the JBrowse ‘Available tracks’ menu illustrating the ‘in-house’ organization of the published data sets embedded in the RGV system in several categories, such as ‘Biological topics’, ‘Technologies’, ‘Publications’ and ‘Species’.

Finally, we used manually extracted information to organize the processed data into four broad categories, i.e. biological topics, technologies, publications and species (Figure 1A). This organization is mirrored in the ‘Available Tracks’ option of the ‘JBrowse genome browser’ implemented in RGV (see the next section) to facilitate access to curated and relevant experimental data (Figure 1C).

BIOINFORMATICS TOOLS DEPLOYED IN THE RGV WORKING ENVIRONMENT

The system also integrates an implementation of the ‘JBrowse genome browser’ (20) and of the ‘Galaxy bioinformatics workflow environment’ (21–23), grafted to the RGV backbone.

The RGV working environment

To host genomics tools essential for data comparisons between genome releases and above all between species, we implemented a ‘Galaxy bioinformatics workflow environment’. Briefly, Galaxy is an open web-based platform for genomic research that provides users with an easy-to-use web interface for building complex biological workflows by dragging and dropping tools. It is worth mentioning that the ‘RGV Galaxy session’ is available without creating an account. Users are, however, strongly invited to create an account to have access to their history, saved analyses, data sets and workflows. By default, this environment contains a myriad of tools designed mainly to assist users in handling files; these are largely simple file manipulation tools to convert, filter, sort, select, extract features or combine files. The current release already uses this versatile Galaxy working environment to deploy two workflows.

The ‘RGV data processing’ workflow described in the RGV backbone section (Figure 1B) was conveniently implemented as a Galaxy module. This pipeline is based on the implementation of three tool suites: UCSC tools (44), bedtools (45) and CrossMap (46). The first is used for all data file format conversions to either bedGraph or bigWig format. The second is employed for the data standardization step. The third is used in both cross-assembly and cross-species conversions of genome coordinates and makes use of pair-wise alignment files (chain format) provided by UCSC. The entire process takes roughly 30 min for an input file (bam format) of 200 Mb. Once the conversion is completed, the user can easily upload the resulting bigWig file to the ‘RGV JBrowse session’.
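The principle behind the coordinate conversion step can be sketched in a few lines of Python. This is a deliberately simplified illustration and not CrossMap's implementation: it assumes the pairwise alignment has already been reduced to ungapped, same-strand blocks, and the `Block` type and `lift_interval` function are hypothetical names introduced here.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """One ungapped aligned block between a source and a target assembly."""
    src_chrom: str
    src_start: int  # 0-based, inclusive
    src_end: int    # 0-based, exclusive
    dst_chrom: str
    dst_start: int  # 0-based position matching src_start


def lift_interval(chrom: str, start: int, end: int,
                  blocks: List[Block]) -> List[Tuple[str, int, int]]:
    """Map a half-open source interval onto the target assembly.

    Parts of the interval that fall outside every block are dropped,
    so one input interval may yield zero, one or several outputs.
    Reverse-strand alignments are ignored in this sketch.
    """
    lifted = []
    for block in blocks:
        if block.src_chrom != chrom:
            continue
        # Clip the interval to the part covered by this block.
        lo = max(start, block.src_start)
        hi = min(end, block.src_end)
        if lo < hi:
            offset = block.dst_start - block.src_start
            lifted.append((block.dst_chrom, lo + offset, hi + offset))
    return lifted
```

Because an interval can be split across blocks or lost entirely, converted tracks may contain gaps or fragmented features relative to the original data.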

‘A genome alignment workflow’ based on the Blast-Like Alignment Tool (BLAT) (47) was implemented as a tool in the Galaxy working environment. Briefly, it allows users to use a one-step procedure to automatically align their DNA/RNA or protein sequences (fasta format) onto the 13 reference genome sequences available in the RGV system. The resulting alignments are post-processed and made available in two forms: a table including direct links to the ‘JBrowse session’ and a General Feature Format (gff) file that can be uploaded to the genome browser.
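As an illustration of the post-processing step, the sketch below converts one BLAT alignment in PSL format into a GFF3 `match` feature. It is an assumption-laden example rather than the RGV code: the function name, the `BLAT` source label and the choice of attributes are hypothetical, and only single-character strands (nucleotide queries) are handled.

```python
def psl_to_gff3(psl_line: str, hit_id: str) -> str:
    """Convert one 21-column PSL alignment line into a GFF3 feature line."""
    f = psl_line.rstrip("\n").split("\t")
    if len(f) < 21:
        raise ValueError("expected 21 tab-separated PSL columns")
    matches, strand = int(f[0]), f[8]
    q_name, q_start, q_end = f[9], int(f[11]), int(f[12])
    t_name, t_start, t_end = f[13], int(f[15]), int(f[16])
    # Translated alignments carry a two-character strand; left unresolved here.
    gff_strand = strand if strand in ("+", "-") else "."
    # PSL coordinates are 0-based half-open; GFF3 is 1-based inclusive.
    attributes = (
        f"ID={hit_id};Name={q_name};"
        f"Target={q_name} {q_start + 1} {q_end}"
    )
    return "\t".join([
        t_name, "BLAT", "match",
        str(t_start + 1), str(t_end),
        str(matches), gff_strand, ".", attributes,
    ])
```

The same target name and coordinates can also be formatted as a `chrom:start..end` string, which is the region notation accepted by the JBrowse location box.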

The RGV JBrowse session

As many more genomes, transcriptomes and epigenomes will be sequenced in the decade to come, a user-friendly genome browser has become essential for work in reproductive biology.

JBrowse advantages

The client-server architecture of JBrowse offers several advantages over other genome browser solutions, such as GBrowse (13): (i) the system is fully compatible with a wide spectrum of data types, including sequence files (fasta format), genomic feature files (gff), alignment files (bam) and quantitative data files (bedGraph, wig, bigWig); (ii) genome browsing is rapid even when multiple users are processing data simultaneously; (iii) JBrowse provides a user-friendly and highly flexible graphical interface in which users can efficiently pan and zoom over a genomic sequence region and turn genomics tracks on and off by simply clicking buttons.

Track organization

As mentioned above, RGV currently includes 15 published data sets. Each has been standardized and converted via the ‘RGV data processing’ pipeline. Data were then organized into four broad categories in the JBrowse track selector, from the information manually extracted from the original publications. These categories (Figure 1C) currently include: biological topics (spermatogenesis and tissue profiling), technologies (epigenomics, regulomics or transcriptomics), publications and species (nine species).
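The four-category organization can be pictured as a simple index built from the curated metadata. The following Python sketch is only an illustration: the field names, the track labels and the `build_track_index` function are hypothetical and do not reflect the actual JBrowse configuration used by RGV.

```python
from collections import defaultdict

# The four curated categories described in the text.
CATEGORIES = ("topic", "technology", "publication", "species")


def build_track_index(tracks):
    """Group track labels by each curated category and its value."""
    index = defaultdict(lambda: defaultdict(list))
    for track in tracks:
        for category in CATEGORIES:
            index[category][track[category]].append(track["label"])
    return index


# Hypothetical track records; category values follow Table 1.
tracks = [
    {"label": "rat_germ_cells", "topic": "Spermatogenesis",
     "technology": "RNA-seq", "publication": "Chalmel et al., 2014",
     "species": "Rat"},
    {"label": "mouse_germ_cells", "topic": "Spermatogenesis",
     "technology": "RNA-seq", "publication": "Soumillon et al., 2013",
     "species": "Mouse"},
]
index = build_track_index(tracks)
```

With such an index, the same track is reachable through any of the four entry points, which mirrors how a user can find a data set by topic, technology, publication or species.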

User interaction

The implementation of JBrowse allows users to download data sets embedded into the RGV genome browser by choosing a track of interest and then by clicking on ‘Save track data’. Users can also upload their own data sets (several file formats are allowed: gff3, gtf, bigWig, bam and vcf) in the ‘JBrowse session’ to compare them to the existing tracks by using the option ‘Open’ in the ‘File’ tab. If necessary the user can first run the ‘RGV data processing’ pipeline, implemented in the ‘Galaxy session’ (see the previous section), and then upload their own tracks into the system.

Example

During spermiogenesis, sperm chromatin is remodeled into a condensed, inactive state through the replacement of histones by protamines (48,49). Protamines are small, arginine-rich DNA-binding proteins expressed in the late-stage spermatids of many animals and plants. We used the ‘RGV JBrowse session’ to illustrate the conserved mammalian expression pattern of the genes encoding PRM1, PRM2 and PRM3, which are clustered on human chromosome 16 (Supplementary Figure S1). Once the genes had been selected with the search bar and the genome set to Human (hg19), three expression data sets from human, mouse and rat were compared (32–33,36). The corresponding tracks were accessed by (i) the ‘Publications’ tab, by selecting ‘2013>Soumillon et al.’, ‘2014>Necsulea et al.’ and ‘2014>Chalmel et al.’. Note that the ‘Available Tracks’ menu is organized so that the same tracks could have been identified by (ii) the ‘Technologies’ tab or by (iii) the ‘Biological topics’ tab. The examination of the displayed tracks highlights the specific post-meiotic expression of the genes encoding protamines, as well as its strong conservation across mammals.

DISCOVERING NOVEL GENES ACTIVE IN SPERMATOGENESIS

The large variety of ultra-high-throughput data across many eukaryotic organisms encourages the use of the RGV as a testing ground for building novel scientific hypotheses on the basis of relevant, curated experimental data on reproduction. The possibilities are numerous, and the applications of RGV diverse. For example, the integration and visualization of pertinent transcriptome data and genome-wide association studies related to reproductive symptoms in the ‘JBrowse session’ may help to elucidate the mechanisms through which genetic mutations lead to reproductive disorders. Another example concerns the integration of active/repressive epigenetic marks and transcriptomic data, which may help to identify the role of specific epigenetic modifications in modulating the expression of genes involved in spermatogenesis.

To corroborate RGV's usefulness, we decided to test its ability to identify novel human loci dynamically expressed during male gamete development and conserved across species. We first integrated three RNA-sequencing studies in the ‘JBrowse session’: a tissue profiling project including samples from human testis and three other tissues (ovary, brain and placenta) published by Necsulea et al. (36), and two high-resolution expression profiles of male germ cells, one in rats (33) and the other in mice (32). Next, we analyzed the human testis sample provided by Necsulea et al. and assembled the transcripts with the cufflinks tool suite (50). We then sought to identify novel intergenic and multi-exonic loci that are expressed in human testes and have a meiotic and/or postmeiotic expression pattern in rodents (data not shown). This allowed us to select one promising candidate, designated TCONS_00962903, for further experimental validation to illustrate the relevance of our strategy (Figure 2A). This novel locus maps to chromosome 6 (positions 41 349 211–41 350 871) and is composed of three exons with a cumulative exon size of 659 bp. It shows preferential expression in testes compared with the other three tissue types in the study by Necsulea et al. (36). A simple examination of the ‘JBrowse session’, using the cross-species feature of RGV, showed very strong conservation in rodents, in which expression of this locus unambiguously peaked in spermatocytes and spermatids. This finding suggests that its expression pattern in humans and rodents is similar (Figure 2A). Reverse transcriptase-polymerase chain reaction (RT-PCR) detected substantial amounts of TCONS_00962903 RNA in human, mouse and rat testis samples, compared with the other tissue samples analyzed (brain, kidney, liver and lung for rodents; epididymis, seminal vesicle and prostate for humans) and thus confirmed its ‘testis-restricted’ expression pattern (Figure 2B–D) (Supplementary file S1). Finally, as suggested by the rodent RNA-seq data, we confirmed that the expression of this novel gene in the human testis is restricted to germ cells at the spermatid stage (Figure 2E).
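The structural part of this screen (multi-exonic transcripts that do not overlap any annotated gene) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis actually performed: the record layout and function names are hypothetical, coordinates are taken as 1-based inclusive, and the expression-based criteria (testis expression, meiotic or post-meiotic pattern in rodents) are not modeled.

```python
def cumulative_exon_size(exons):
    """Sum exon lengths for 1-based, inclusive (start, end) pairs."""
    return sum(end - start + 1 for start, end in exons)


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def select_novel_loci(transcripts, known_genes):
    """Return IDs of multi-exonic transcripts overlapping no known gene."""
    selected = []
    for t in transcripts:
        if len(t["exons"]) < 2:
            continue  # mono-exonic transcripts are discarded
        t_start = min(start for start, _ in t["exons"])
        t_end = max(end for _, end in t["exons"])
        is_genic = any(
            g["chrom"] == t["chrom"]
            and overlaps(t_start, t_end, g["start"], g["end"])
            for g in known_genes
        )
        if not is_genic:
            selected.append(t["id"])
    return selected
```

Note that the cumulative exon size counts only exonic bases, which is why a locus spanning more than 1.6 kb on the genome can have a cumulative exon size of 659 bp.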

Figure 2.

Tissue- and cell-specific expression patterns of one novel intergenic locus are shown. (A) Structure of the novel intergenic locus (blue boxes correspond to introns), TCONS_00962903, in the human genome (release hg19), is displayed in the ‘RGV JBrowse session’. Four RNA-seq data sets were selected to illustrate the transcript abundance of this promising candidate in human testes (Chalmel, F. and Rolland, A.D., in preparation) (36) as well as in rodent meiotic and post-meiotic germ cells (32,33). The amount of transcript determined in each tissue/cell and in each study is displayed as color-coded red heat maps. Red histogram bars represent the sequence conservation score distributions across 100 species as provided by the UCSC genome browser (phastCons scores, y-axis ranges from 0 to 1). TCONS_00962903 detection at the RNA level was further confirmed by RT-PCR in five rat (B) and mouse (C) tissue samples, including total testis (TT), brain (BR), kidney (KI), liver (LI) and lung (LU). RT-PCR analysis was also performed in four human tissue samples (D), including total testis (TT), epididymis (EP), seminal vesicle (SV) and prostate (PR), as well as five isolated testicular cell populations (E) including Leydig cells (LC), peritubular myoid cells (PC), Sertoli cells (SC), spermatocytes (Spc) and round spermatids (rSpt), with total testis (TT) as positive control.

FUTURE DEVELOPMENTS

In the near future we intend to extend the scope of the RGV to keep pace with rapid technological, bioinformatics/genomic and biological/clinical advances in the reproductive sciences. In concrete terms, we are currently planning four separate actions. First, we will gather other relevant data sets from a wide range of species in RGV to cover other reproductive biological topics (e.g. gonad development, oogenesis, reproductive cancers and reproductive toxicology). We have already selected 18 studies to integrate into the system (Supplementary Table S1), and we encourage data submission from colleagues. Second, we will add other genetic information related to reproductive disorders (such as GWAS and quantitative trait loci information from diverse sources and diverse model organisms). Third, we plan to develop community tools that will facilitate collaborative work and stimulate the emergence of novel forms of collaboration in our research field.

Finally, we will be enhancing the features and functionalities of the ‘RGV-Galaxy working environment’. In particular, we intend to embed the ‘JBrowse genome browser’ directly into the Galaxy environment. Users will thus be able to entirely customize, and eventually share, their own personal genome browser session with their ultra-high-throughput data sets. This integration of JBrowse within the RGV-Galaxy working environment will also facilitate communications and data export between the two sessions. Another crucial point involves the direct implementation of several workflows for analysis of NGS data (e.g. RNA-seq, ChIP-seq) within the Galaxy environment. This will have several user benefits, for it will enable reproductive biologists/clinicians to perform their own analyses independently. Above all, it will help to standardize data analysis procedures within the reproductive science community to facilitate comparisons of data sets.

CONCLUSIONS

We report the development of the RGV, a webserver-based toolbox for reproductive scientists. The system combines specific solutions for ultra-high-throughput data management, curation and organization, with data conversion across releases and species (CrossMap), genome browsing (JBrowse session) and a bioinformatics workflow environment to deploy analysis pipelines (Galaxy session). RGV currently embeds 15 published data sets related to germ cell development from nine eukaryotic species. We intend to complete RGV's repertoire with other related biological processes, other model organisms and other technologies of interest related to reproductive biology in the near future. This may help scientists and clinicians who work on reproduction to compare their own data sets to relevant published studies in their specific field by overcoming the standard technical problems we face daily regarding data format, genome release and species issues. To the best of our knowledge, the RGV is the first cross-species working environment dedicated to a single biological field of interest. This community-based system could thus be applicable to other conserved biological processes studied in several model organisms.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGMENTS

We thank Céline Le Béguec, Yoanne Ando Randriamanantena, Laetitia Guillot, François Moreews, Raphaël Charles, Aurélie Lardenois and Michael Primig for stimulating discussions and/or beta-testing RGV. We acknowledge the GenOuest bioinformatics facility for hosting the software. We also thank Dominique Mahe Poiron, Nathalie Dejucq-Rainsford and Nathalie Rioux-Leclerq for providing the human samples.

FUNDING

The Agence nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail [ANSES No. EST-13-081 to F.C.]; the Fondation pour la recherche médicale [FRM No. DBI20131228558 to F.C.]; the European Union [FEDER to F.C.]. Funding for open access charge: the Fondation pour la recherche médicale [FRM No. DBI20131228558 to F.C.].

Conflict of interest statement. None declared.

REFERENCES

  • 1.Calvel P., Rolland A.D., Jegou B., Pineau C. Testicular postgenomics: targeting the regulation of spermatogenesis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2010;365:1481–1500. doi: 10.1098/rstb.2009.0294. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Rolland A.D., Jegou B., Pineau C. Testicular development and spermatogenesis: harvesting the postgenomics bounty. Adv. Exp. Med. Biol. 2008;636:16–41. doi: 10.1007/978-0-387-09597-4_2. [DOI] [PubMed] [Google Scholar]
  • 3.Lardenois A., Gattiker A., Collin O., Chalmel F., Primig M. GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle. Database. 2010;2010:baq030. doi: 10.1093/database/baq030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Zhang Y., Zhong L., Xu B., Yang Y., Ban R., Zhu J., Cooke H.J., Hao Q., Shi Q. SpermatogenesisOnline 1.0: a resource for spermatogenesis based on manual literature curation and genome-wide data mining. Nucleic Acids Res. 2013;41:D1055–D1062. doi: 10.1093/nar/gks1186. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Lee T.L., Cheung H.H., Claus J., Sastry C., Singh S., Vu L., Rennert O., Chan W.Y. GermSAGE: a comprehensive SAGE database for transcript discovery on male germ cell development. Nucleic Acids Res. 2009;37:D891–D897. doi: 10.1093/nar/gkn644. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Lee T.L., Li Y., Cheung H.H., Claus J., Singh S., Sastry C., Rennert O.M., Lau Y.F., Chan W.Y. GonadSAGE: a comprehensive SAGE database for transcript discovery on male embryonic gonad development. Bioinformatics. 2010;26:585–586. doi: 10.1093/bioinformatics/btp695. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Hsueh A.J., Rauch R. Ovarian Kaleidoscope database: ten years and beyond. Biol. Reprod. 2012;86:192. doi: 10.1095/biolreprod.112.099127. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Davies J.A., Little M.H., Aronow B., Armstrong J., Brennan J., Lloyd-MacGilp S., Armit C., Harding S., Piu X., Roochun Y., et al. Access and use of the GUDMAP database of genitourinary development. Methods Mol. Biol. 2012;886:185–201. doi: 10.1007/978-1-61779-851-1_17. [DOI] [PubMed] [Google Scholar]
  • 9.Merelli I., Perez-Sanchez H., Gesing S., D'Agostino D. High-performance computing and big data in omics-based medicine. BioMed Res. Int. 2014;2014:825649. doi: 10.1155/2014/825649. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Wang J., Kong L., Gao G., Luo J. A brief introduction to web-based genome browsers. Brief. Bioinform. 2013;14:131–143. doi: 10.1093/bib/bbs029. [DOI] [PubMed] [Google Scholar]
  • 11.Rosenbloom K.R., Armstrong J., Barber G.P., Casper J., Clawson H., Diekhans M., Dreszer T.R., Fujita P.A., Guruvadoo L., Haeussler M., et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res. 2015;43:D670–D681. doi: 10.1093/nar/gku1177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Goecks J., Coraor N., Nekrutenko A., Taylor J. NGS analyses by visualization with Trackster. Nat. Biotechnol. 2012;30:1036–1039. doi: 10.1038/nbt.2404. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Donlin M.J. Using the Generic Genome Browser (GBrowse) Curr. Protoc. Bioinform. 2009 doi: 10.1002/0471250953.bi0909s28. Chapter 9, Unit 9.9. [DOI] [PubMed] [Google Scholar]
  • 14.Choo S.W., Heydari H., Tan T.K., Siow C.C., Beh C.Y., Wee W.Y., Mutha N.V., Wong G.J., Ang M.Y., Yazdi A.H. VibrioBase: a model for next-generation genome and annotation database development. ScientificWorldJournal. 2014;2014:569324. doi: 10.1155/2014/569324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Heydari H., Wee W.Y., Lokanathan N., Hari R., Mohamed Yusoff A., Beh C.Y., Yazdi A.H., Wong G.J., Ngeow Y.F., Choo S.W. MabsBase: a Mycobacterium abscessus genome and annotation database. PloS one. 2013;8:e62443. doi: 10.1371/journal.pone.0062443. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Heydari H., Mutha N.V., Mahmud M.I., Siow C.C., Wee W.Y., Wong G.J., Yazdi A.H., Ang M.Y., Choo S.W. StaphyloBase: a specialized genomic resource for the staphylococcal research community. Database. 2014;2014:bau010. doi: 10.1093/database/bau010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Choo S.W., Ang M.Y., Fouladi H., Tan S.Y., Siow C.C., Mutha N.V., Heydari H., Wee W.Y., Vadivelu J., Loke M.F., et al. HelicoBase: a Helicobacter genomic resource and analysis platform. BMC Genomics. 2014;15:600. doi: 10.1186/1471-2164-15-600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Geisen S., Barturen G., Alganza A.M., Hackenberg M., Oliver J.L. NGSmethDB: an updated genome resource for high quality, single-cytosine resolution methylomes. Nucleic Acids Res. 2014;42:D53–D59. doi: 10.1093/nar/gkt1202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Hackenberg M., Barturen G., Oliver J.L. NGSmethDB: a database for next-generation sequencing single-cytosine-resolution DNA methylation data. Nucleic Acids Res. 2011;39:D75–D79. doi: 10.1093/nar/gkq942. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Skinner M.E., Uzilov A.V., Stein L.D., Mungall C.J., Holmes I.H. JBrowse: a next-generation genome browser. Genome Res. 2009;19:1630–1638. doi: 10.1101/gr.094607.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Blankenberg D., Von Kuster G., Coraor N., Ananda G., Lazarus R., Mangan M., Nekrutenko A., Taylor J. Galaxy: a web-based genome analysis tool for experimentalists. Curr. Protoc. Mol. Biol. 2010 doi: 10.1002/0471142727.mb1910s89. Chapter 19, Unit 19.10.1–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Giardine B., Riemer C., Hardison R.C., Burhans R., Elnitski L., Shah P., Zhang Y., Blankenberg D., Albert I., Taylor J., et al. Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 2005;15:1451–1455. doi: 10.1101/gr.4086505.
  • 23. Goecks J., Nekrutenko A., Taylor J. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 2010;11:R86. doi: 10.1186/gb-2010-11-8-r86.
  • 24. Gan H., Wen L., Liao S., Lin X., Ma T., Liu J., Song C.X., Wang M., He C., Han C., et al. Dynamics of 5-hydroxymethylcytosine during mouse spermatogenesis. Nat. Commun. 2013;4:1995. doi: 10.1038/ncomms2995.
  • 25. Hammoud S.S., Low D.H., Yi C., Carrell D.T., Guccione E., Cairns B.R. Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis. Cell Stem Cell. 2014;15:239–253. doi: 10.1016/j.stem.2014.04.006.
  • 26. Gaucher J., Boussouar F., Montellier E., Curtet S., Buchou T., Bertrand S., Hery P., Jounier S., Depaux A., Vitte A.L., et al. Bromodomain-dependent stage-specific male genome programming by Brdt. EMBO J. 2012;31:3809–3820. doi: 10.1038/emboj.2012.233.
  • 27. Brick K., Smagulova F., Khil P., Camerini-Otero R.D., Petukhova G.V. Genetic recombination is directed away from functional genomic elements in mice. Nature. 2012;485:642–645. doi: 10.1038/nature11089.
  • 28. Erkek S., Hisano M., Liang C.Y., Gill M., Murr R., Dieker J., Schubeler D., van der Vlag J., Stadler M.B., Peters A.H. Molecular determinants of nucleosome retention at CpG-rich sequences in mouse spermatozoa. Nat. Struct. Mol. Biol. 2013;20:868–875. doi: 10.1038/nsmb.2599.
  • 29. Li X.Z., Roy C.K., Dong X., Bolcun-Filas E., Wang J., Han B.W., Xu J., Moore M.J., Schimenti J.C., Weng Z., et al. An ancient transcription factor initiates the burst of piRNA production during early meiosis in mouse testes. Mol. Cell. 2013;50:67–81. doi: 10.1016/j.molcel.2013.02.016.
  • 30. Brykczynska U., Hisano M., Erkek S., Ramos L., Oakeley E.J., Roloff T.C., Beisel C., Schubeler D., Stadler M.B., Peters A.H. Repressive and active histone methylation mark distinct promoters in human and mouse spermatozoa. Nat. Struct. Mol. Biol. 2010;17:679–687. doi: 10.1038/nsmb.1821.
  • 31. Laiho A., Kotaja N., Gyenesei A., Sironen A. Transcriptome profiling of the murine testis during the first wave of spermatogenesis. PLoS One. 2013;8:e61558. doi: 10.1371/journal.pone.0061558.
  • 32. Soumillon M., Necsulea A., Weier M., Brawand D., Zhang X., Gu H., Barthes P., Kokkinaki M., Nef S., Gnirke A., et al. Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep. 2013;3:2179–2190. doi: 10.1016/j.celrep.2013.05.031.
  • 33. Chalmel F., Lardenois A., Evrard B., Rolland A.D., Sallou O., Dumargne M.C., Coiffec I., Collin O., Primig M., Jegou B. High-resolution profiling of novel transcribed regions during rat spermatogenesis. Biol. Reprod. 2014;91:5. doi: 10.1095/biolreprod.114.118166.
  • 34. Chocu S., Evrard B., Lavigne R., Rolland A.D., Aubry F., Jegou B., Chalmel F., Pineau C. Forty-four novel protein-coding loci discovered using a proteomics informed by transcriptomics (PIT) approach in rat male germ cells. Biol. Reprod. 2014;91:123. doi: 10.1095/biolreprod.114.122416.
  • 35. Meikar O., Vagin V.V., Chalmel F., Sostar K., Lardenois A., Hammell M., Jin Y., Da Ros M., Wasik K.A., Toppari J., et al. An atlas of chromatoid body components. RNA. 2014;20:483–495. doi: 10.1261/rna.043729.113.
  • 36. Necsulea A., Soumillon M., Warnefors M., Liechti A., Daish T., Zeller U., Baker J.C., Grutzner F., Kaessmann H. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature. 2014;505:635–640. doi: 10.1038/nature12943.
  • 37. Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Holko M., et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013;41:D991–D995. doi: 10.1093/nar/gks1193.
  • 38. Lardenois A., Liu Y., Walther T., Chalmel F., Evrard B., Granovskaia M., Chu A., Davis R.W., Steinmetz L.M., Primig M. Execution of the meiotic noncoding RNA expression program and the onset of gametogenesis in yeast require the conserved exosome subunit Rrp6. Proc. Natl. Acad. Sci. U.S.A. 2011;108:1058–1063. doi: 10.1073/pnas.1016459108.
  • 39. Lavigne R., Becker E., Liu Y., Evrard B., Lardenois A., Primig M., Pineau C. Direct iterative protein profiling (DIPP) - an innovative method for large-scale protein detection applied to budding yeast mitosis. Mol. Cell. Proteomics. 2012;11:M111.012682. doi: 10.1074/mcp.M111.012682.
  • 40. Xu Z., Wei W., Gagneur J., Perocchi F., Clauder-Munster S., Camblong J., Guffanti E., Stutz F., Huber W., Steinmetz L.M. Bidirectional promoters generate pervasive transcription in yeast. Nature. 2009;457:1033–1037. doi: 10.1038/nature07728.
  • 41. Granovskaia M.V., Jensen L.J., Ritchie M.E., Toedling J., Ning Y., Bork P., Huber W., Steinmetz L.M. High-resolution transcription atlas of the mitotic cell cycle in budding yeast. Genome Biol. 2010;11:R24. doi: 10.1186/gb-2010-11-3-r24.
  • 42. Johnston J.J., Rubinstein W.S., Facio F.M., Ng D., Singh L.N., Teer J.K., Mullikin J.C., Biesecker L.G. Secondary variants in individuals undergoing exome sequencing: screening of 572 individuals identifies high-penetrance mutations in cancer-susceptibility genes. Am. J. Hum. Genet. 2012;91:97–108. doi: 10.1016/j.ajhg.2012.05.021.
  • 43. Welter D., MacArthur J., Morales J., Burdett T., Hall P., Junkins H., Klemm A., Flicek P., Manolio T., Hindorff L., et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–D1006. doi: 10.1093/nar/gkt1229.
  • 44. Kuhn R.M., Haussler D., Kent W.J. The UCSC genome browser and associated tools. Brief. Bioinform. 2013;14:144–161. doi: 10.1093/bib/bbs038.
  • 45. Quinlan A.R. BEDTools: the Swiss-Army Tool for Genome Feature Analysis. Curr. Protoc. Bioinform. 2014;47:11.12.11–11.12.34. doi: 10.1002/0471250953.bi1112s47.
  • 46. Zhao H., Sun Z., Wang J., Huang H., Kocher J.P., Wang L. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics. 2014;30:1006–1007. doi: 10.1093/bioinformatics/btt730.
  • 47. Kent W.J. BLAT—the BLAST-like alignment tool. Genome Res. 2002;12:656–664. doi: 10.1101/gr.229202.
  • 48. Dadoune J.P. Expression of mammalian spermatozoal nucleoproteins. Microsc. Res. Tech. 2003;61:56–75. doi: 10.1002/jemt.10317.
  • 49. Balhorn R. The protamine family of sperm nuclear proteins. Genome Biol. 2007;8:227. doi: 10.1186/gb-2007-8-9-227.
  • 50. Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L., Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012;7:562–578. doi: 10.1038/nprot.2012.016.


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
