eLife. 2018 Oct 26;7:e37344. doi: 10.7554/eLife.37344

Chromatin accessibility dynamics across C. elegans development and ageing

Jürgen Jänes 1,2, Yan Dong 1,2, Michael Schoof 1,2, Jacques Serizay 1,2, Alex Appert 1,2, Chiara Cerrato 1,2, Carson Woodbury 1,2, Ron Chen 1,2,§, Carolina Gemma 1,2,#, Ni Huang 1,2, Djem Kissiov 1,2, Przemyslaw Stempor 1,2, Annette Steward 1,2, Eva Zeiser 1,2, Sascha Sauer 3,4, Julie Ahringer 1,2
Editors: Siu Sylvia Lee 5, Jessica K Tyler 6
PMCID: PMC6231769  PMID: 30362940

Abstract

An essential step for understanding the transcriptional circuits that control development and physiology is the global identification and characterization of regulatory elements. Here, we present the first map of regulatory elements across the development and ageing of an animal, identifying 42,245 elements accessible in at least one Caenorhabditis elegans stage. Based on nuclear transcription profiles, we define 15,714 protein-coding promoters and 19,231 putative enhancers, and find that both types of element can drive orientation-independent transcription. Additionally, more than 1000 promoters produce transcripts antisense to protein coding genes, suggesting involvement in a widespread regulatory mechanism. We find that the accessibility of most elements changes during development and/or ageing and that patterns of accessibility change are linked to specific developmental or physiological processes. The map and characterization of regulatory elements across C. elegans life provides a platform for understanding how transcription controls development and ageing.

Research organism: C. elegans

Introduction

The genome encodes the information for organismal life. Because the deployment of genomic information depends in large part on regulatory elements such as promoters and enhancers, their identification and characterization are essential for understanding genome function and its regulation.

Regulatory elements are typically depleted for nucleosomes, which facilitates their identification using sensitivity to digestion by nucleases such as DNase I or Tn5 transposase, termed DNA accessibility (Sabo et al., 2006; Crawford et al., 2006; Buenrostro et al., 2013). In different organisms, large repertoires of regulatory elements have been determined by profiling DNA accessibility genome-wide in different cell types and developmental stages (Thomas et al., 2011; Kharchenko et al., 2011; Thurman et al., 2012; Yue et al., 2014; Kundaje et al., 2015; Daugherty et al., 2017; Ho et al., 2017). However, no study has yet investigated regulatory element usage across the life of an animal, from the embryo to the end of life. Such information is important, because different transcriptional programs operate in different periods of life and ageing. Caenorhabditis elegans is ideal for addressing this question, as it has a simple anatomy, well-defined cell types, and short development and lifespan. A map of regulatory elements and their temporal dynamics would facilitate understanding of the genetic control of organismal life.

Active regulatory elements have previously been shown to have different transcriptional outputs and chromatin modifications (Andersson, 2015; Kim and Shiekhattar, 2015). Transcription is initiated at both promoters and enhancers, with most elements having divergent initiation events from two independent sites (Core et al., 2008; Kim et al., 2010; De Santa et al., 2010; Koch et al., 2011; Chen et al., 2013). However, promoters and enhancers differ in the production of stable transcripts. At protein-coding promoters, productive transcription elongation produces a stable transcript, whereas enhancers and the upstream divergent initiation from promoters generally produce short, aborted, unstable transcripts (Core et al., 2014; Andersson et al., 2014; Rennie et al., 2018).

Promoters and enhancers have also been shown to be differently enriched for specific patterns of histone modifications. In particular, promoters often have high levels of H3K4me3 and low levels of H3K4me1, whereas enhancers tend to have the opposite pattern of higher H3K4me1 and lower H3K4me3 (Heintzman et al., 2007; Heintzman et al., 2009). However, in human and Drosophila cell lines, it was observed that H3K4me3 and H3K4me1 levels correlate with levels of transcription at regulatory elements, rather than whether the element is a promoter or an enhancer (Core et al., 2014; Henriques et al., 2018; Rennie et al., 2018). Further, analyses of genes that are highly regulated in development showed that their promoters lacked chromatin marks associated with activity (including H3K4me3), even when the associated genes are actively transcribed (Zhang et al., 2014; Pérez-Lluch et al., 2015). Therefore, stable elongating transcription, rather than histone modification patterns, appears to be the defining feature that distinguishes active promoters from active enhancers (reviewed in Andersson, 2015; Andersson et al., 2015; Kim and Shiekhattar, 2015; Henriques et al., 2018; Rennie et al., 2018).

Regulatory elements have not been systematically mapped and annotated in C. elegans. Promoter identification has been hampered because the 5’ ends of ~70% of protein-coding transcripts are trans-spliced to a 22 nt leader sequence (Allen et al., 2011). Because the region from the transcription initiation site to the trans-splice site (the ‘outron’) is removed and degraded, the 5’ end of the mature mRNA does not mark the transcription start site. To overcome this difficulty, previous studies identified transcription start sites for some genes through profiling transcription initiation and elongation in nuclear RNA or by inhibiting trans-splicing at a subset of stages (Gu et al., 2012; Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013). In addition, two recent studies used ATAC-seq or DNase I hypersensitivity to map regions of accessible chromatin in some developmental stages, and predicted element function by proximity to first exons or chromatin state (Daugherty et al., 2017; Ho et al., 2017).

Toward building a comprehensive map of regulatory elements and their use during the life of an animal, here we used multiple assays to systematically identify and annotate accessible chromatin in the six C. elegans developmental stages and at five time points of adult ageing. Strikingly, most elements undergo a significant change in accessibility during development and/or ageing. Clustering the patterns of accessibility changes in promoters reveals groups that act in shared processes. This map makes a major step toward defining regulatory element use during C. elegans life.

Results and discussion

Defining and annotating regions of accessible DNA

To define and characterize regulatory elements across C. elegans life, we collected biological replicate samples from a developmental time course and an ageing time course (Figure 1A). The developmental time course consisted of wild-type samples from six developmental stages (embryos, four larval stages, and young adults). For the ageing time course, we used glp-1(e2144ts) mutants to prevent progeny production, since they lack germ cells at the restrictive temperature. Five adult ageing time points were collected, starting from the young adult stage (day 1) and ending at day 13, just before the major wave of death.

Figure 1. Overview of the project.

(A) Overview of genome-wide assays and time points of developmental and ageing samples. For development samples, chromatin accessibility, transcription initiation, productive elongation, and chromatin state were profiled in six stages of wild-type animals (embryos, four larval stages, young adults). For ageing samples, chromatin accessibility and productive transcription elongation were profiled in five time points of sterile adult glp-1 mutants (Day 1/Young adult, Day 2, Day 6, Day 9, Day 13). (B) Representative screen shot of normalized genome-wide accessibility profiles in the eleven samples (chrIII:9,041,700–9,196,700, 155 kb).

Figure 1—source data 1. Accessible sites identified using ATAC-seq.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● atac_%stage_height: maximum SPMR-normalized ATAC-seq signal at the peak in %stage (one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
● atac_source: source of the ATAC-seq peak call (see Materials and methods).
  ○ atac_wt_pe: wt (developmental) ATAC-seq treated as paired-end.
  ○ atac_wt_se: wt (developmental) ATAC-seq treated as single-end.
  ○ atac_glp1_se: glp-1 (ageing) ATAC-seq, single-end only.
DOI: 10.7554/eLife.37344.006

Figure 1—figure supplement 1. Comparison of ATAC-seq to concentration courses of DNase I-seq and MNase-seq.

(A) Genomic DNA digested using different concentrations of DNase I (top) or MNase (bottom). Red rectangles highlight approximate size ranges subjected to paired-end Illumina sequencing. (B) SPMR-normalized coverage of a DNase I concentration series (blue tracks), MNase concentration series (green tracks), and ATAC-seq (red track) at the lin-23 locus (chrII:6,369,650–6,373,750, 4.1 kb). The modENCODE/modERN ChIP-seq peak call pileup (grey track) shows a TF binding region upstream of the gene. Different concentrations of nuclease show different types of signal. Low concentrations of DNase I and MNase produce a peak in the middle of the TF-binding region, at the expected NDR (middle vertical bar). At higher concentrations, both enzymes show a peak at the −1 and +1 nucleosomes (left and right vertical bars). ATAC-seq has a single large peak centered in the middle of the TF-binding region. (C) Mean normalized coverage at transcription factor binding sites defined by clustering modENCODE/modERN peak calls (n = 36,389; Materials and methods) in ATAC-seq, DNase-seq, and MNase-seq (the latter two are shown at concentrations with the highest accessibility enrichment). ATAC-seq shows higher signal than DNase-seq or MNase-seq. Shaded rectangles show the range of signal between assay replicates at the midpoint of the TFBS cluster. (D) Normalized read coverage of ATAC-seq prepared from nuclei harvested from live (red), or frozen (blue) embryos. Shaded rectangles are used as in (C).
Figure 1—figure supplement 2. Reproducibility and broad relatedness of ATAC-seq and RNA-seq data.

Reproducibility and broad relatedness of the different samples and time points in ATAC-seq (top), long cap RNA-seq (middle), and short cap RNA-seq (bottom). We applied PCA to peak accessibility at promoters (ATAC-seq), read counts at annotated genes (long cap RNA-seq), and 5' end read counts at promoters (short cap RNA-seq). Black markers show different biological samples (two per time point) whereas gray markers show centroids, calculated by averaging the two samples collected at the same time point. Gray dotted lines show the time progression across development or ageing.
Figure 1—figure supplement 3. Reproducibility and broad relatedness of the histone modification data.

Reproducibility and broad relatedness of the different samples and time points in H3K4me3 (top left), H3K4me1 (top right), H3K36me3 (bottom left), and H3K27me3 ChIP-seq (bottom right). We applied PCA to genic regions, from the most upstream promoter to the annotated 3' end, considering genes with at least one promoter. Black markers show different biological samples (two per time point), whereas gray markers show centroids, calculated by averaging the two samples collected at the same time point. Gray dotted lines show the time progression across development.

Figure 1A outlines the datasets generated. For all developmental and ageing time points, we used ATAC-seq to identify accessible regions of DNA. We also sequenced strand-specific nuclear RNA (>200 nt long) to determine regions of transcriptional elongation, because previous work demonstrated that this approach could capture outron signal linking promoters to annotated exons (Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013). For the development time course, we additionally sequenced short (<100 nt) capped nuclear RNA to profile transcription initiation, profiled four histone modifications to characterize chromatin state (H3K4me3, H3K4me1, H3K36me3, and H3K27me3), and performed a DNase I concentration course to investigate the relative accessibility of elements. Micrococcal nuclease (MNase) data were also collected for the embryo stage. As previously noted by others, we found that the ATAC-seq accessibility signal is similar to that observed using a low concentration of DNase I or MNase, and that the ATAC-seq data have the highest signal-to-noise ratio (Buenrostro et al., 2013; Figure 1—figure supplement 1C).

To define sites that are accessible in at least one developmental or ageing stage, focal peaks of significant ATAC-seq enrichment were identified across all developmental and ageing samples, yielding 42,245 individual elements (Figure 1B, Figure 1—source data 1; see Materials and methods for details). Of these, 72.8% overlap a transcription factor binding site (TFBS) mapped by the modENCODE or modERN projects (Araya et al., 2014; Kudron et al., 2018), supporting their potential regulatory functions (Figure 2—figure supplement 1A).

Two recent studies reported accessible regions in C. elegans identified using DNase I hypersensitivity or ATAC-seq (Ho et al., 2017; Daugherty et al., 2017). The 42,245 accessible elements defined here overlap 33.7% of the DNase I hypersensitive sites from Ho et al. (2017) and 47.9% of the ATAC-seq peaks from Daugherty et al. (2017) (Figure 2—figure supplement 1B,C). Examination of the non-overlapping sites from pairwise comparisons suggests that differences in peak calling methods account for some of the discrepancies. Accessible regions determined here required a focal peak of enrichment, whereas the other studies found both focal sites and broad regions with increased signal. Consistent with these differences in methods, sites unique to the two studies are enriched for exonic chromatin, depleted for both TFBS and transcription initiation sites, and often found in broad regions of increased accessibility across transcriptionally active gene bodies (Figure 2—figure supplement 1B–E). Similarly, using MACS2 to call peaks on the ATAC-seq data reported here, as done by Daugherty et al. (2017), identified a group of exon-enriched sites not found using our peak calling method (Figure 2—figure supplement 2A). However, the fraction of such sites is relatively small, indicating that other differences also contribute, such as signal-to-noise or nematode growth methods.

To functionally classify elements, we annotated each of the 42,245 elements for transcription initiation and transcription elongation signals on both strands (Figure 2A,B; Figure 2—source data 1; see Materials and methods for details). Overall, 37.1% of elements had promoter activity, defined by a significant increase in transcription elongation signal originating at the element in at least one stage and one direction. Promoters were assigned to protein-coding genes or pseudogenes if continuous transcription elongation signal extended from the element to an annotated first exon (covering the outron). Promoters were unassigned if transcription elongation signal was not linked to an annotated gene. We observed detectable transcription initiation signal at 82.3% of elements (Figure 2—source data 1); those with no significant transcription elongation signal in either direction were annotated as putative enhancers (hereafter referred to as ‘enhancers’). The remaining elements had no detectable transcriptional activity or overlapped ncRNAs (tRNA, snRNA, snoRNA, rRNA, or miRNA) (Figure 2B; Figure 2—source data 1). We found that accessible sites are enriched for being located within outrons or intergenic regions (Figure 2—figure supplement 3).
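
The classification described above can be summarized as a simple decision rule. The following Python sketch is illustrative only and is not the pipeline used in this study: the `StrandSignal` class and both functions are hypothetical, the boolean inputs stand in for the statistical tests described in Materials and methods, the precedence among promoter labels is an assumption, and the handling of elements overlapping ncRNAs is omitted. The output labels mirror those listed in Figure 2—source data 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrandSignal:
    """Hypothetical per-strand summary of the signals at one accessible element."""
    initiation: bool                      # reproducible transcription initiation
    elongation: bool                      # significant elongation signal starting at the element
    linked_biotype: Optional[str] = None  # "protein_coding", "pseudogene", or None if unlinked


def annotate_strand(sig):
    """Classify one strand of an element (simplified)."""
    if sig.elongation:
        if sig.linked_biotype == "protein_coding":
            return "coding_promoter"
        if sig.linked_biotype == "pseudogene":
            return "pseudogene_promoter"
        # Elongation signal that is not linked to an annotated gene.
        return "unassigned_promoter"
    if sig.initiation:
        return "initiation_only"
    return "no_signal"


def annotate_element(fwd, rev):
    """Combine both strands into one element-level label (simplified)."""
    strands = [annotate_strand(fwd), annotate_strand(rev)]
    # Promoter activity on either strand makes the element a promoter.
    # Preferring an assigned label over "unassigned" is an assumption.
    for label in ("coding_promoter", "pseudogene_promoter", "unassigned_promoter"):
        if label in strands:
            return label
    # Initiation without elongation in either direction: putative enhancer.
    if "initiation_only" in strands:
        return "putative_enhancer"
    return "other_element"
```

For example, an element with initiation on one strand and no elongation on either strand is labeled `putative_enhancer`, whereas the same element with elongation signal reaching a protein-coding first exon is labeled `coding_promoter`.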

Figure 2. Annotation of accessible elements.

(A) Top, strand-specific nuclear RNA in each developmental stage monitors transcription elongation; plus strand, blue; minus strand, red. Below is transcription initiation signal, accessible elements (colored by annotation), and gene models (chrI:12,675,000–12,683,400, 8.4 kb). The left side of each element is colored by the reverse strand annotation whereas the right side of an element is colored by the forward strand annotation (color key at bottom). (B) Left, distribution of accessible sites in four categories: promoters (one or both strands), putative enhancers, no activity, or overlapping a tRNA, snRNA, snoRNA, rRNA, or miRNA. Right, distribution of different types of promoter annotations. (C) Left, distribution of the number of promoters and enhancers per gene; right, boxplot shows that genes with more promoters also have more enhancers.

Figure 2—source data 1. Regulatory annotation of accessible sites.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● chrom_ce11, start_ce11, end_ce11: as above, but lifted over to ce11.
● annot: final regulatory element type, obtained by combining strand-specific transcription patterns (see Materials and methods).
● annot_%strand: annotation of the strand-specific transcription patterns at the site (%strand is either fwd or rev).
● promoter_gene_id_%strand, promoter_locus_id_%strand, promoter_gene_biotype_%strand: WormBase gene id, locus id, and biotype for sites annotated as coding_promoter, pseudogene_promoter or non-coding_RNA on %strand.
● associated_gene_id, associated_locus_id: WormBase gene id and locus id of genes whose gene body or outron region overlaps the site. These are defined for sites annotated as unassigned_promoter, putative_enhancer or other_element. If a site overlaps multiple genes, all overlaps are reported, separated by commas.
● tss_%strand_ce10: representative transcription initiation mode (Materials and methods) on %strand, ce10 coordinates.
● tss_%strand_ce11: as above, but lifted over to ce11.
● scap_%strand_passed: True or False based on whether the site has reproducible transcription initiation (Materials and methods).
● lcap_%stage_%strand_passed_jump: True or False based on whether the site passed the jump test for elongating transcription (Materials and methods; %stage is one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
● lcap_%stage_%strand_passed_incr: True or False based on whether the site passed the incr test for elongating transcription (Materials and methods).
DOI: 10.7554/eLife.37344.014

Figure 2—figure supplement 1. Comparisons to previous accessibility maps.

(A) Venn diagrams showing the overlap of transcription factor binding sites defined by clustering modENCODE/modERN peak calls (n = 36,389; Materials and methods) with accessible sites from this study and two previous studies (Daugherty et al., 2017; Ho et al., 2017). (B) Comparison of accessible sites defined in this study to accessible sites defined in Daugherty et al. (2017). (C) Comparison of accessible sites defined in this study to accessible sites defined in Ho et al. (2017). (B,C) Leftmost plot shows overlaps between accessible sites. Rightmost three plots compare regions found in both studies or unique to only one study, for mean profile of modENCODE/modERN peak call pileup, fraction of sites with transcription initiation signal (negative values are reverse strand signals), and fraction overlapping an exon. (D,E) IGV screenshots (D: nhr-25, chrX:13,007,000–13,015,000, 8 kb; E: kin-18, chrIII:6,117,000–6,125,500, 8.5 kb) of stage-specific accessibility profiles and peak calls from Daugherty et al. (2017) (top, red), Ho et al. (2017) (middle, green), and this study (bottom, blue).
Figure 2—figure supplement 2. Effect of differences in peak calling methods on the types of identified accessible sites.

As done by Daugherty et al. (2017), MACS2 was used to call peaks on the ATAC-seq data reported here, with parameters --shift -75 --gsize ce -q 5e-2 --nomodel --extsize 150 --bdg --keep-dup all --call-summits --SPMR. Peak calls from the two biological replicates of each stage were intersected, and the resulting per-stage peaks were then combined by taking their union. This identified 27,998 peaks used for this analysis. (A) Comparison of accessible regions identified by the focal enrichment peak calling method used in this study (n = 42,245) to those defined using MACS2 (n = 27,998). (B) Comparison of ATAC-seq MACS2 peak calls from our data to ATAC-seq MACS2 peak calls from Daugherty et al. (2017). (C) Comparison of stage-matched ATAC-seq MACS2 peak calls from our data to ATAC-seq MACS2 peak calls from Daugherty et al. (2017). (A–C) Leftmost plots show overlaps between accessible sites. Rightmost three plots compare regions found in both sets of peak calls or unique to only one set, for mean profile of modENCODE/modERN peak call pileup, fraction of sites with transcription initiation signal (negative values are reverse strand signals), and fraction overlapping an exon.
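
The two-step combination of peak calls (intersection across replicates, then union across stages) can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function names and toy coordinates are hypothetical, intervals are assumed to be half-open (start, end) pairs on a single chromosome, each replicate list is assumed to be sorted and non-overlapping, and it is an assumption that "intersecting" keeps only the shared portion of overlapping peaks.

```python
def intersect_replicates(a, b):
    """Return the portions shared by two sorted, non-overlapping interval lists."""
    shared = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def union_stages(interval_lists):
    """Merge intervals from all stages; overlapping or touching intervals are joined."""
    merged = []
    for start, end in sorted(iv for lst in interval_lists for iv in lst):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy peak calls for two stages with two replicates each (hypothetical values).
emb = intersect_replicates([(100, 300), (500, 700)], [(200, 400), (900, 1000)])
l1 = intersect_replicates([(250, 350), (600, 650)], [(240, 360), (590, 640)])
combined = union_stages([emb, l1])
print(combined)  # [(200, 350), (600, 640)]
```

In this toy example the embryo peak at 500–700 is dropped because it is present in only one replicate, while the reproducible embryo and L1 peaks near position 250 are merged into a single region.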
Figure 2—figure supplement 3. Genomic locations of accessible sites.

(A) Left: distribution of bases in the C. elegans genome, partitioned into outronic, exonic, intronic, intergenic or mixed, based on the regulatory annotation. Right: distribution of genomic region type at accessible sites. (B) Distribution of genomic region at specific types of accessible sites.
Figure 2—figure supplement 4. Comparison to published TSS maps.

(A–D) Left: overlap between accessible sites and TSS annotations from (A) Chen et al. (2013); (B) Kruesi et al. (2013); (C) Saito et al. (2013); (D) Gu et al. (2012). Right: accessible site annotations of elements that overlap a TSS in the indicated study. TSSs were considered to overlap an accessible site if they were located within 150 bp of peak accessibility. For Gu et al. (2012), TSSs were clustered using a single-linkage approach with a distance threshold of 50 bp, and the overlaps are based on those clusters.
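
The overlap criterion and the single-linkage clustering can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the code used here: the function names and positions are hypothetical, both thresholds are treated as inclusive, and a cluster is assumed to overlap a site if any part of its span lies within 150 bp of the accessibility peak.

```python
def single_linkage_clusters(positions, max_gap=50):
    """Cluster 1D positions: a position joins a cluster if it lies within
    max_gap of the nearest member, which for sorted positions is the previous one."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def tss_overlaps_site(tss, peak_position, window=150):
    """True if a single TSS lies within `window` bp of peak accessibility."""
    return abs(tss - peak_position) <= window


def cluster_overlaps_site(cluster, peak_position, window=150):
    """True if the span of a TSS cluster comes within `window` bp of the peak."""
    return min(cluster) <= peak_position + window and max(cluster) >= peak_position - window


clusters = single_linkage_clusters([175, 100, 130, 1000, 400, 430])
print(clusters)  # [[100, 130, 175], [400, 430], [1000]]
```

Positions 100 and 175 are 75 bp apart but fall in the same cluster because each is within 50 bp of position 130; this chaining is characteristic of single-linkage clustering.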
Figure 2—figure supplement 5. Types of unassigned promoters.

(A) Types and numbers of unassigned promoters. (B–D) Examples of transcription patterns at unassigned promoters. Shown are forward and reverse strand nuclear RNA-seq signals to indicate genomic regions with transcription elongation, forward and reverse strand transcription initiation signal (pooled across stages), and accessible elements colored with left halves indicating reverse strand annotation and right halves indicating forward strand annotation. Vertical dotted lines highlight unassigned promoters. (B) uaRNA/PROMPT (chrIII:1,020,500–1,021,700, 1.2 kb), (C) antisense to coding gene (chrI:11,590,000–11,596,000, 6 kb), (D) intergenic (chrV:2,296,000–2,300,500, 4.5 kb).
Figure 2—figure supplement 6. Transgenic tests of annotated promoters and enhancers for promoter activity.

(A) Comparison of annotations to 23 elements previously shown to function as promoters in transgenic assays (Merritt et al., 2008; Hunt-Newbury et al., 2007; Chen et al., 2014). (B) Indicated elements were fused to his-58::gfp (see Materials and methods) and the resulting transgenic strains tested for GFP expression in embryos. Elements were cloned in the endogenous orientation relative to their associated gene or in inverted orientation, as indicated. In expression strength column, ‘strong’ and ‘medium’ indicate high and low level of GFP visible in live embryos; ‘weak’ indicates expression only visible by immunofluorescence. (C) Examples of transgene expression. Shown is expression driven by the ztf-11 promoter and the bro-1 enhancer in both orientations; DIC image on left, HIS-58-GFP on right.

Within the promoter class, we defined 15,714 protein-coding promoters: 11,478 elements are unidirectional promoters and 2118 are divergent promoters that drive expression of two oppositely oriented protein-coding genes (Figure 2—source data 1); counting each divergent promoter once for each of its two genes gives 11,478 + 2 × 2118 = 15,714. In total, promoters were defined for 11,196 protein-coding genes, with 3000 genes having >1 promoter (Figure 2C). The protein-coding promoter annotations show good overlap (76.8–85.1%) with four sets of TSSs previously defined based on mapping transcription (Chen et al., 2013; Kruesi et al., 2013; Saito et al., 2013; Gu et al., 2012; Figure 2—figure supplement 4). Enhancers (n = 19,231) were assigned to a gene if they are located within the region from its most upstream promoter to its gene end; 6668 genes have at least one associated enhancer, and 3240 genes have >1 enhancer (Figure 2C).
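
The rule for assigning enhancers to genes can be written compactly. The following Python sketch is illustrative only: the function names, gene names, and coordinates are hypothetical, positions are single coordinates on one chromosome, and the region boundaries are assumed to be inclusive.

```python
def gene_region(promoter_positions, gene_end, strand):
    """Region from the most upstream promoter to the gene end, as (start, end).

    On the forward strand the most upstream promoter has the lowest coordinate;
    on the reverse strand it has the highest coordinate.
    """
    if strand == "+":
        return (min(promoter_positions), gene_end)
    return (gene_end, max(promoter_positions))


def assign_enhancer(enhancer_position, genes):
    """Return the names of all genes whose region contains the enhancer."""
    hits = []
    for name, (promoters, gene_end, strand) in genes.items():
        start, end = gene_region(promoters, gene_end, strand)
        if start <= enhancer_position <= end:
            hits.append(name)
    return hits


# Hypothetical genes: (promoter positions, gene end, strand).
genes = {
    "geneA": ([1000, 1500], 5000, "+"),
    "geneB": ([9000], 6000, "-"),
}
print(assign_enhancer(1200, genes))  # ['geneA']
print(assign_enhancer(8000, genes))  # ['geneB']
print(assign_enhancer(500, genes))   # []
```

Under this rule an enhancer located upstream of a gene's most upstream promoter is left unassigned, which is why the last call returns an empty list.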

The locations of unassigned promoters (n = 3106) suggest different potential functions. A large fraction (35.1%) generate antisense transcripts within the body of a protein coding gene, suggesting a possible role in regulating expression of the associated gene (Figure 2—figure supplement 5). Another large group (38.4%) produce antisense transcripts from an element that is a protein coding promoter in the sense direction, a pattern seen in many mammalian promoters, termed upstream antisense (uaRNA) or promoter upstream (PROMPT) transcripts (Figure 2—figure supplement 5; Preker et al., 2008; Flynn et al., 2011; Sigova et al., 2013). Most of the rest (21.7%) are intergenic and may define promoters for unannotated transcripts.

Patterns of histone marks at promoters and enhancers

Promoters and enhancers show general differences in patterns of histone modifications, such as higher levels of H3K4me3 at promoters or H3K4me1 at enhancers, and chromatin states are frequently used to define elements as promoters or enhancers (Heintzman et al., 2007; Ernst and Kellis, 2010; Ernst et al., 2011; Kharchenko et al., 2011; Hoffman et al., 2013; Daugherty et al., 2017). However, it has been shown that H3K4me3 levels correlate with transcriptional activity rather than with function (Pekowska et al., 2011; Core et al., 2014; Andersson et al., 2014; Henriques et al., 2018; Rennie et al., 2018), suggesting that defining regulatory elements solely based on chromatin state is likely to lead to incorrect annotations.

To further investigate the relationship between chromatin marking and element function, we mapped four histone modifications at each developmental stage (H3K4me3, H3K4me1, H3K27me3, H3K36me3) and examined their patterns around coding promoters and enhancers. As expected, many coding promoters had high levels of H3K4me3 and were depleted for H3K4me1 (Figure 3A). Moreover, enhancers had generally low levels of H3K4me3 and higher levels of H3K4me1 than promoters (Figure 3A). However, many elements did not have these patterns. For example, about 50% of coding promoters have a high level of H3K4me1 and no or low H3K4me3 marking (Figure 3A).

Figure 3. Chromatin state and sequence features of promoters and enhancers.

(A) Heatmaps of indicated histone modifications and CV values at coding promoters (top), and enhancers (bottom), aligned at element midpoints. Elements are ranked by mean H3K4me3 levels. Low CV values indicate broad expression across development and cell types and high CV values indicate regulated expression. Promoters of genes with low CV values have high H3K4me3 levels. (B) Distribution of initiator Inr motif, TATA motif, and CpG content at coding promoters and enhancers, separated by H3K4me3 level (top, middle, and bottom thirds). Grey-shaded regions represent 95% confidence intervals of the sample mean at the genomic position with the highest signal.

Figure 3—figure supplement 1. Chromatin state and sequence features of promoters and enhancers sorted by CV value.

(A) Heatmaps of indicated histone modifications and CV values at coding promoters (top), and enhancers (bottom). Elements are ranked by CV value. Low CV values indicate broad expression across development and cell types and high CV values indicate regulated expression. H3K4me3, H3K4me1, and H3K27me3 signals are aligned at element midpoints; H3K36me3 is aligned at the start of the associated gene annotation. (B) Distribution of initiator Inr motif, TATA motif, and CpG content at coding promoters and enhancers, separated by CV value (top (high CV), middle, and bottom (low CV)). Grey-shaded regions represent 95% confidence intervals of the sample mean at the genomic position with the highest signal.

To investigate the nature of these patterns, we examined coefficients of variation of gene expression (CV; Gerstein et al., 2014) of the associated genes. Genes with broad stable expression across cell types and development, such as housekeeping genes, have low variation of gene expression levels and hence a low CV value. In contrast, genes with regulated expression, such as those expressed only in particular stages or cell types, have a high CV value. We found a strong inverse correlation between a gene’s CV value and its promoter H3K4me3 level (−0.64, p < 10^−15, Spearman's rank correlation; Figure 3; Figure 3—figure supplement 1A). Furthermore, promoters with low or no H3K4me3 marking are enriched for H3K27me3 (Figure 3; Figure 3—figure supplement 1A), which is associated with regulated gene expression (Tittel-Elmer et al., 2010; Pérez-Lluch et al., 2015; Evans et al., 2016). These results support the view that H3K4me3 marking may be a specific feature of promoters with broad stable activity, consistent with the finding that active promoters of regulated genes lack H3K4me3 (Pérez-Lluch et al., 2015). The profiling here was done in whole animals, which may have precluded detecting modifications occurring in a small number of nuclei. Nevertheless, the results indicate that chromatin state alone is not a reliable metric for element annotation. Histone modification patterns at many promoters resemble those at enhancers, and vice versa.
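
For readers unfamiliar with these statistics, the Python sketch below shows how a coefficient of variation and a Spearman rank correlation can be computed. It is an illustration under assumptions, not the analysis code: the CV is taken here as the population standard deviation divided by the mean, which may differ in detail from the definition in Gerstein et al. (2014), and the function names and expression values are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population SD is an assumption)."""
    return pstdev(values) / mean(values)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical expression levels across four samples.
broad = [10, 11, 9, 10]          # stable expression, low CV
regulated = [0.1, 20, 0.2, 0.1]  # expression in one sample only, high CV
print(coefficient_of_variation(broad) < coefficient_of_variation(regulated))  # True
```

A perfectly inverse relationship between two variables gives a Spearman correlation of −1; the value of −0.64 reported above therefore indicates a strong but not perfect tendency for promoters of high-CV genes to have lower H3K4me3.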

Promoters and enhancers also share sequence features. Both are enriched for initiator (Inr) elements, although enhancers have a slightly lower Inr frequency (Figure 3B and Figure 3—figure supplement 1B). Promoters and enhancers are also both enriched for CpG dinucleotides (Figure 3B and Figure 3—figure supplement 1B). Promoters with high H3K4me3 and low CV values (broadly expressed genes) have the highest CpG content, whereas those with low H3K4me3 and high CV values have the lowest CpG content (Figure 3B and Figure 3—figure supplement 1B). Promoters also differ from enhancers by the presence of TATA motifs, which occur predominantly at genes with low H3K4me3 and high CV values (i.e. with regulated expression; Figure 3B and Figure 3—figure supplement 1B).

Promoters and enhancers can drive gene expression in an orientation independent manner

To validate the promoter annotations, we compared them with studies where small regions of DNA had been defined as promoters using transgenic assays. These comprised 10 regions defined based on transcription initiation signal (Chen et al., 2014), nine regions defined based on proximity to a germ line gene (Merritt et al., 2008), and four defined by proximity to the first exon of a muscle-expressed gene (Hunt-Newbury et al., 2007). Of these 23 regions, 21 overlap an element in our set of accessible sites, 19 of which are annotated as protein-coding promoters (Figure 2—figure supplement 6A). One of the remaining two is annotated as an enhancer and the other overlaps an accessible element for which no transcriptional signal was detected. We further directly tested three elements annotated as promoters (for the hlh-2, ztf-11 and bed-3 genes), and found that all three drove robust expression of a histone-GFP reporter (Figure 2—figure supplement 6B). Overall, there is good concordance between promoter annotation and promoter activity.

Most of the elements annotated as protein-coding promoters are flanked by bidirectional transcription initiation signal (74.0%), similar to the pattern seen in mammals. Most (82.6%) are unidirectional promoters, producing a protein-coding transcript in one direction, but no stable transcript from the upstream initiation site. To test whether such upstream antisense initiation sites could function as promoters, we inverted the orientation of two active unidirectional promoters (ztf-11 and F58D5.5). If the lack of in vivo transcription elongation was a property of the element or initiation site itself, the GFP fusion should not be expressed. However, we observed that the two inverted unidirectional promoters both drove GFP expression. The expression patterns generated were similar in both orientations, although the ztf-11 promoter was weaker when inverted (Figure 2—figure supplement 6B,C). These results suggest that signals for productive elongation occur downstream of the transcription initiation site.

Similar to the upstream antisense transcription initiation observed at promoters, enhancers also show transcription initiation signals but generally do not produce stable transcripts (Core et al., 2014; Andersson et al., 2014). Previous studies have reported that some enhancers can function as promoters in transgenic assays and also at endogenous loci (Kowalczyk et al., 2012; Leung et al., 2015; Nguyen et al., 2016; van Arensbergen et al., 2017; Mikhaylichenko et al., 2018). To assess the potential promoter activities of C. elegans enhancers, we directly fused 12 putative enhancers that had transcription initiation signal in embryos to a histone-GFP reporter gene and assessed transgenic strains for embryo expression. Two of the tested enhancers are located in introns, and one of these, from the bro-1 gene, has been previously validated as an enhancer (Brabin et al., 2011); most of the others are associated with the hlh-2 or ztf-11 genes. We found that 10 of 12 tested regions drove reporter expression in embryos, including the two intronic enhancers (Figure 2—figure supplement 6B,C). Whereas the hlh-2 and ztf-11 promoters drove strong, broad expression, the associated enhancers were active in a smaller number of cells and expression levels were overall lower (Figure 2—figure supplement 6B,C). We also tested two enhancers in inverted orientation and found that both showed similar activity in both orientations, as observed for the two tested promoters (Figure 2—figure supplement 6B,C). The percentage of enhancers that functioned as active promoters is higher than that observed in a cell-based assay (Nguyen et al., 2016), possibly because all cell types are tested in an intact animal. Episomal-based assays have also been reported to underestimate activity (Inoue et al., 2017).

Extensive regulation of chromatin accessibility in development

We observed extensive changes in chromatin accessibility across development, with most elements showing a significant difference within the developmental time course (71%; ≥2-fold change, FDR < 0.01; Figure 4—source data 1; see Materials and methods). To investigate how accessibility relates to gene expression, we focused on the 13,596 elements annotated as protein-coding promoters. Of these, 10,199 displayed significant changes in accessibility in development, with the remaining 3397 promoters classified as having stable accessibility. We note that the detected changes could be due to regulation of accessibility, or alternatively to changes in cell number during development (e.g. the number of germ line nuclei increases from two in L1 larvae to ~2000 in young adults).

We reasoned that promoters having similar patterns of accessibility changes over development may regulate genes that function in shared processes and be regulated by shared sets of transcription factors. To investigate this, we applied k-medoid clustering to the 10,199 promoters with developmental changes in accessibility, defining 16 clusters (Figure 4A, Figure 4—figure supplement 1, Figure 4—figure supplement 2, and Figure 4—source data 1; see Materials and methods). Within clusters, we observed that promoter accessibility and nuclear RNA levels are usually correlated (mean r = 0.47 (sd = 0.11) across all clusters), indicating that accessibility is a good metric of promoter activity and overall gene expression (Figure 4—figure supplement 1 and Figure 4—figure supplement 2).
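The clustering idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used here: relative accessibility follows the definition in the Figure 4 legend (log2 of coverage divided by the mean across the time series), whereas the Euclidean distance, the choice of the first k profiles as initial medoids, and the toy coverage values are all hypothetical.

```python
import math

def relative_accessibility(coverage):
    """log2(coverage at each time point / mean coverage across the series).
    Assumes strictly positive coverage (no pseudocount is applied)."""
    m = sum(coverage) / len(coverage)
    return [math.log2(c / m) for c in coverage]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda c: euclidean(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, max_iter=100):
    """Simple alternating k-medoids; initial medoids are the first k points."""
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total distance to its cluster members.
            new_medoids.append(min(
                members,
                key=lambda i: sum(euclidean(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(points, medoids), medoids

# Hypothetical depth-normalized coverage for four promoters over four stages.
coverage = [
    [1, 2, 4, 8],    # rising
    [8, 4, 2, 1],    # falling
    [2, 4, 8, 16],   # rising, higher absolute signal
    [16, 8, 4, 2],   # falling, higher absolute signal
]
profiles = [relative_accessibility(c) for c in coverage]
labels, medoids = k_medoids(profiles, k=2)
```

Because each profile is expressed relative to its own mean, promoters with the same temporal shape but different absolute signal fall into the same cluster.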

Figure 4. Shared dynamics of promoter accessibility in development and ageing.

Clusters of promoters with shared relative accessibility patterns across (A) development or (B) ageing. Relative promoter accessibility is log2 of the depth-normalized ATAC-seq coverage at a given time point divided by the mean ATAC-seq coverage across the time series (see Materials and methods). The percentage of associated genes that have enriched expression in the indicated tissues was determined from single-cell L2 larval RNA-seq data (Cao et al., 2017; see Materials and methods). Right hand panels show examples of GO terms enriched in genes associated with development or ageing clusters.

Figure 4—source data 1. Element accessibility dynamics and promoter accessibility clusters in development and ageing.
● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
● devel_is_dynamic: True or False based on whether the site shows differential accessibility between any two developmental stages.
● ageing_is_dynamic: True or False based on whether the site shows differential accessibility between any two ageing time points.
● devel_prom_cluster_label: assigned developmental accessibility promoter cluster.
● ageing_prom_cluster_label: assigned ageing accessibility promoter cluster.
● HOTness: based on the number of transcription factors overlapping the accessible site, either HOT (19 or more factors), cold (between 1 and 18 factors) or none (zero factors).
● factor_count: number of transcription factors with binding sites overlapping the accessible site.
● factor_names: comma-separated list of the names of transcription factors with binding sites overlapping the accessible site.
DOI: 10.7554/eLife.37344.021


Figure 4—figure supplement 1. Characteristics of developmental promoter clusters (continued in Figure 4—figure supplement 2).


Characteristics of promoter clusters with shared accessibility patterns across development. Relative promoter ATAC-seq coverage is shown across the time series as a graph (each line representing a promoter) and a heatmap (each row representing a promoter). Values are scaled from −2 (dark blue) to 0 (white) to +2 (dark red). Heatmap showing relative expression of the associated genes across the time series, using the same color scale. Tissue expression box plots show TPMs of clustered genes in individual tissues (data from Cao et al., 2017). Percentage of genes with tissue-enriched expression shows the percentage of genes within the cluster with enriched expression in the indicated tissues. Enriched GO terms show the top five enriched GO terms obtained for each cluster from the corresponding list of genes using gProfiler. MF = Molecular Function, CC = Cellular Component, BP = Biological Process.
Figure 4—figure supplement 2. Characteristics of developmental promoter clusters (continued from Figure 4—figure supplement 1).


Characteristics of promoter clusters with shared accessibility patterns across development. Relative promoter ATAC-seq coverage is shown across the time series as a graph (each line representing a promoter) and a heatmap (each row representing a promoter). Values are scaled from −2 (dark blue) to 0 (white) to +2 (dark red). Heatmap showing relative expression of the associated genes across the time series, using the same color scale. Tissue expression box plots show TPMs of clustered genes in individual tissues (data from Cao et al., 2017). Percentage of genes with tissue-enriched expression shows the percentage of genes within the cluster with enriched expression in the indicated tissues. Enriched GO terms show the top five enriched GO terms obtained for each cluster from the corresponding list of genes using gProfiler. MF = Molecular Function, CC = Cellular Component, BP = Biological Process.
Figure 4—figure supplement 3. Characteristics of ageing promoter clusters.


Characteristics of promoter clusters with shared accessibility patterns across ageing. Relative promoter ATAC-seq coverage is shown across the time series as a graph (each line representing a promoter) and a heatmap (each row representing a promoter). Values are scaled from −2 (dark blue) to 0 (white) to +2 (dark red). Heatmap showing relative expression of the associated genes across the time series, using the same color scale. Tissue expression box plots show TPMs of clustered genes in individual tissues (data from Cao et al., 2017). Percentage of genes with tissue-enriched expression shows the percentage of genes within the cluster with enriched expression in the indicated tissues. Enriched GO terms show the top five enriched GO terms obtained for each cluster from the corresponding list of genes using gProfiler. MF = Molecular Function, CC = Cellular Component, BP = Biological Process. Note that Cluster Mix5 does not have any enriched GO terms.

To investigate whether the shared patterns of accessibility changes over development identify promoters of genes involved in common processes, we took advantage of recent single-cell profiling data obtained from L2 larvae, which provides gene expression measurements in different tissues (Cao et al., 2017). We find that half of the developmental promoter clusters are enriched for genes with tissue biased expression (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Based on these patterns of enrichment, we defined four gonad promoter clusters (G1-G4), two intestine clusters (I1, I2), one hypodermal cluster (H) and one cluster enriched for neural and muscle expression (N + M) (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Genes associated with the remaining eight promoter clusters (Mix1–8) are generally expressed in multiple tissues, but predominantly in the soma (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). As expected, genes linked to the stable promoters are widely expressed. Interestingly, within a tissue, promoter clusters can exhibit similar variations in accessibility but with different amplitude. For instance, gonad clusters G1 and G2 both show a sharp increase in accessibility at the L3 stage; however, the increase is 1.5-fold larger in G2 than in G1. The gonad clusters are generally characterized by an increase of promoter accessibility starting in L3 when germ cell number strongly increases.

To further investigate promoter clusters sharing accessibility dynamics, we performed Gene Ontology analyses on the associated genes. As expected, we found that clusters containing genes enriched for expression in a particular tissue are also associated with GO terms related to that tissue (Figure 4A, Figure 4—figure supplement 1 and Figure 4—figure supplement 2). For instance, cluster H contains genes highly expressed in hypodermis and GO terms linked to cuticle development. Of note, the four accessibility clusters enriched for expression in germ line are associated with GO terms for different sets of germ line functions (Figure 4—figure supplement 1 and Figure 4—figure supplement 2). Similarly, the two intestinal clusters also identify genes with different types of intestinal function. Furthermore, accessibility dynamics can reflect the temporal function of the associated promoters. For instance, cluster Mix4 has GO terms indicative of neuronal development and highest accessibility in the embryo, when many neurons develop. These results suggest that promoter clusters contain genes acting in a shared process and having a similar mode of regulation.

To identify potential transcriptional regulators, we asked whether the binding of particular transcription factors is enriched in any promoter clusters, using TF binding data from the modENCODE and modERN projects (Boyle et al., 2014; Kudron et al., 2018). TFs with enriched binding were found for each cluster (Figure 5A), and the expression of such TFs was generally enriched in the expected tissue. For example, we found that ELT-2, an intestine-specific GATA protein (Fukushige et al., 1998), has enriched binding at promoters in intestinal clusters 1 and 2. Similarly, hypodermal transcription factors BLMP-1 (Horn et al., 2014), NHR-25 (Gissendanner and Sluder, 2000) and ELT-3 (Gilleard et al., 1999) are enriched in the hypodermal promoter cluster, and binding of the germ line XND-1 factor (Wagner et al., 2010) is enriched in the germ line clusters of promoters. We also identified novel tissue-specific associations for uncharacterized transcription factors, such as ZTF-18 and ATHP-1 with germ line promoter clusters and CRH-2 with the intestinal clusters (Figure 5A). These results agree with and extend those of Cao et al. (2017), who identified TFs for which binding was correlated with cell-type-specific expression levels.

Figure 5. Transcription factor binding enrichment in developmental and ageing promoter clusters.


Transcription factor (TF) binding enrichments in developmental (A) or ageing (B) promoter clusters from Figure 4. TF-binding data are from modENCODE/modERN (Araya et al., 2014; Kudron et al., 2018); peaks in HOT regions were excluded (see Materials and methods). Only TFs enriched more than twofold in at least one cluster are shown, and only enrichments with p < 0.01 (Fisher’s exact test) are shown. Plots show TF binding enrichment odds ratio (left), expression of the TF in each tissue relative to its expression across all tissues (log2(TF tissue TPM/mean of the TF’s TPMs across all tissues), middle), and the decile of expression of the TF in each tissue (right; TPMs < 1 are not taken into account when calculating TPM deciles). Expression data are from Cao et al. (2017).

Figure 5—source data 1. TF datasets used for analyses.
● factor: transcription factor name.
● dataset_name: modENCODE/modERN DCC dataset name(s), separated by commas if multiple datasets from the same transcription factor were used.
● dataset_id: modENCODE/modERN DCC dataset ID(s), comma-separated as above.
DOI: 10.7554/eLife.37344.023

We also observed differences in TF-binding enrichments between promoter clusters associated with the same tissue. For example, Clusters G1-G4 all contain promoters associated with germline-enriched genes (Figure 4A). However, distinct binding enrichments are observed in promoters in G1-G2 compared to those in G3-G4, with the latter showing enrichment for LIN-35 and DPL-1, two members of the DREAM complex, which controls cell cycle progression (Figure 5A). Taken together, the results suggest that promoters with shared accessibility patterns have shared cell- and process-specific activity, and they highlight potential regulators that are candidates for future studies.
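For readers wishing to reproduce this type of enrichment test, a minimal sketch of Fisher's exact test on a 2×2 table (promoters in or out of a cluster, bound or not bound by a TF) is given below. It is illustrative only: the one-sided alternative, the function name and the counts are assumptions, and the thresholds are those stated in the Figure 5 legend (more than twofold enrichment, p < 0.01).

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table
        in cluster:      a bound, b unbound
        not in cluster:  c bound, d unbound
    Returns (odds_ratio, p) with p = P(X >= a) under the hypergeometric null."""
    in_cluster, bound, total = a + b, a + c, a + b + c + d
    upper = min(in_cluster, bound)
    tail = sum(comb(in_cluster, x) * comb(total - in_cluster, bound - x)
               for x in range(a, upper + 1))
    p = tail / comb(total, bound)
    odds = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds, p

# Hypothetical counts: 30 of 100 cluster promoters are bound by the TF,
# compared with 90 of 900 promoters outside the cluster.
odds_ratio, p_value = fisher_exact_greater(30, 70, 90, 810)
is_enriched = odds_ratio > 2 and p_value < 0.01
```

Exact integer binomial coefficients are used, so the p-value does not suffer from floating-point underflow for promoter sets of this size.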

Analysis of ageing clusters

We next focused on chromatin accessibility changes during ageing. In contrast to the development time course, the accessibility of most promoters is stable during ageing, with only 13% (n = 1,800) of promoters showing changes (Figure 4—source data 1). Interestingly, 75% of these also had regulated accessibility in development.

As for the development time course, we clustered accessibility changes in ageing. We identified eight clusters of promoters with similar accessibility changes across ageing and annotated them based on tissue biases in gene expression (Figure 4B; Figure 4—source data 1). This defined one intestinal cluster (I), two clusters enriched for intestine or hypodermal biased expression (I + H) and five mixed clusters. Several mixed clusters show weak gene expression enrichments, such as intestine expression in Mix1-2 and neural expression in Mix3 (Figure 4B). As observed for the development clusters, enriched GO terms were consistent with gene expression biases (Figure 4B, Figure 4—figure supplement 3).

We then evaluated the enrichment of transcription factors at each ageing promoter cluster. The binding of DAF-16/FoxO, a master regulator of ageing (Lin et al., 2001), is associated with five ageing promoter clusters (Figure 5B). Consistent with a prominent role in the intestine (Figure 4B; Kaplan and Baugh, 2016), promoter clusters enriched for DAF-16 binding are also enriched for intestinal genes (Figure 4B). The binding enrichment patterns of five other TFs implicated in ageing (DVE-1, NHR-80, ELT-2, FOS-1 and PQM-1; Uno et al., 2013; Folick et al., 2015; Goudeau et al., 2011; Mann et al., 2016; Tian et al., 2016; Mao et al., 2016; Tepper et al., 2013) are similar to that of DAF-16 (Figure 5B). These TFs and DAF-16 are also enriched in developmental intestine promoter clusters (Figure 5A), supporting cooperation between them in development and ageing. A group of hypodermal TFs including BLMP-1, ELT-1 and ELT-3 are found enriched at promoters in one of the two I + H ageing clusters (Figure 5B). Finally, CEBP-1 binding is enriched in clusters Mix3 and Mix4, which are characterized by a continuous increase of promoter accessibility across ageing. This suggests a potential role of CEBP-1 in activating a subset of genes during ageing, as is the case for its homologue CEBP-β in mouse (Sandhir and Berman, 2010).

Conclusion

For the first time, we systematically map regulatory elements across the lifespan of an animal. We identified 42,245 accessible sites in C. elegans chromatin and functionally annotated them based on transcription patterns at the accessible site. This avoided the problems of histone-mark-based approaches for defining element function (Core et al., 2014; Henriques et al., 2018; Rennie et al., 2018). Our map identified promoters active across development and ageing, but we did not find promoters for every gene. Classes that would have been missed are those for genes expressed only in males or dauer larvae (which we did not profile) and genes not active under laboratory conditions. In addition, whole-animal profiling would miss promoters active in only a small number of cells. In the future, assaying accessible chromatin and nuclear transcription in specific cell types should identify many of these missed elements.

We found that accessibility of most elements changes during the life of the worm, supporting a key role played by chromatin structure. Despite the map being based on bulk profiling in whole animals, we find that regulatory elements with shared accessibility dynamics often share patterns of tissue-specific expression, GO annotation, and TF binding. The promoters with shared accessibility changes are therefore excellent starting points for studies of cell- and process-specific gene expression. In summary, our identification of regulatory elements across C. elegans life together with an initial characterization of their properties provides a key resource that will enable future studies of transcriptional regulation in development and ageing.

Materials and methods

Collection of developmental time series samples

Wild-type N2 were grown at 20°C in liquid culture to the adult stage using standard S-basal medium with HB101 bacteria, animals bleached to obtain embryos, and the embryos hatched without food in M9 buffer for 24 hr at 20°C to obtain synchronized starved L1 larvae. L1 larvae were grown in a further liquid culture at 20°C to the desired stage, then collected, washed in M9, floated on sucrose, washed again in M9, then frozen into ‘popcorn’ by dripping embryo or worm slurry into liquid nitrogen. Popcorn were stored at −80°C until use. Times of growth were L1 (4 hr), L2 (20 hr), L3 (30 hr), L4 (45 hr), young adults (60 hr). Mixed populations of embryos were collected by bleaching cultures of synchronized 1-day-old adults.

Collection of ageing time series samples

glp-1(e2144) were raised at 15°C on standard NGM plates seeded with OP50 bacteria. Embryos were obtained by bleaching gravid adults and then approximately 6000 placed at 25°C on 150 mm 2% NGM plates seeded with a 30X concentrated overnight culture of OP50. For harvest, worms were washed 3X in M9 and then worm slurry was frozen into popcorn by dripping into liquid nitrogen and stored at −80°C. Harvest times after embryo plating were D1/YA (53 hr), D2 (71 hr), D6 (167 hr), D9 (239 hr), D13 (335 hr).

Nuclear isolation and ATAC-seq

Frozen embryos or worms (1–3 frozen popcorns) were broken by grinding in a mortar and pestle or smashing using a Biopulverizer, then the frozen powder was thawed in 10 ml Egg buffer (25 mM HEPES pH 7.3, 118 mM NaCl, 48 mM KCl, 2 mM CaCl2, 2 mM MgCl2). Ground worms were pelleted by spinning at 1500 g for 2 min, then resuspended in 10 ml working Buffer A (0.3M sucrose, 10 mM Tris pH 7.5, 10 mM MgCl2, 1 mM DTT, 0.5 mM spermidine, 0.15 mM spermine, protease inhibitors (Roche complete, EDTA free)) containing 0.025% IGEPAL CA-630. The sample was dounced 10X in a 14-ml stainless steel tissue grinder (VWR), then spun at 100 g for 6 min to pellet large fragments. The supernatant was kept and the pellet resuspended in a further 10 ml Buffer A, then dounced for 25 strokes. This was spun at 100 g for 6 min to pellet debris and the supernatants, which contain the nuclei, were pooled, spun again at 100 g for 6 min to pellet debris, and transferred to a new tube. Nuclei were counted using a hemocytometer. One million nuclei were transferred to a 1.5-ml tube and spun at 2000 g for 10 min to pellet. ATAC-seq was performed essentially as in Buenrostro et al. (2013). The supernatant was removed, the nuclei resuspended in 47.5 µl of tagmentation buffer, incubated for 30 min at 37°C with 2.5 µl Tn5 enzyme (Illumina Nextera kit), and then tagmented DNA purified using a MinElute column (Qiagen) and converted into a library using the Nextera kit protocol. Typically, libraries were amplified using 12–16 PCR cycles. ATAC-seq was performed on two biological replicates for each developmental stage and each ageing time point.

DNase I and MNase mapping

Replicate concentration courses of DNase I were performed for each stage as follows. Twenty million nuclei were digested in Roche DNase I buffer for 10 min at 25°C using 2.5, 5, 10, 25, 50, 100, 200, and 800 units/ml DNase I (Roche), then EDTA was added to stop the reactions. Micrococcal nuclease (MNase) digestion concentration courses for embryos were made by digesting nuclei with 0.025, 0.05, 0.1, 0.25, 0.5, 1, 4, 8, or 16 units/ml MNase in 10 mM Tris pH 7.5, 10 mM MgCl2, 4 mM CaCl2 for 10 min at 37°C. Reactions were stopped by the addition of EDTA. Following digestions, total DNA was isolated from the nuclei following proteinase K and RNase A digestion, then large fragments removed by binding to Agencourt AMPure XP beads (0.5 volumes). Small double cut fragments < 300 bp were isolated either using a Pippin Prep gel (protocol 1) or using Agencourt AMPure XP beads (protocol 2). Libraries were prepared as described in the Sequencing library preparation section below.

Transcription initiation and nuclear RNA profiling

Nuclei were isolated and then chromatin associated RNA (development series) or nuclear RNA (ageing series) was isolated. Chromatin associated RNA was isolated as in Pandya-Jones and Black (2009), resuspending washed nuclei in Trizol for RNA extraction. To isolate nuclear RNA, nuclei were directly mixed with Trizol. Following purification, RNA was separated into fractions of 17–200 nt and >200 nt using Zymo clean and concentrate columns. To profile transcription elongation (‘long cap RNA-seq’) in the nucleus, stranded libraries were prepared from the >200 nt RNA fraction using the NEB Next Ultra Directional RNA Library Prep Kit (#E7420S). Libraries were made from two biological replicates for each developmental stage and each ageing time point. To profile transcription initiation (‘short cap RNA-seq’), stranded libraries were prepared from the 17–200 nt RNA fraction. Non-capped RNA was degraded by first converting uncapped RNAs into 5’-monophosphorylated RNAs using RNA polyphosphatase (Epibio), then treating with 5' Terminator nuclease (Epibio). The RNA was treated with calf intestinal phosphatase to remove 5’ phosphates from undegraded RNA, and decapped using Tobacco Acid Pyrophosphatase (Epicentre), Cap-Clip Acid Pyrophosphatase (CellScript, for one L2 and one L3 replicate) or Decapping Pyrophosphohydrolase (Dpph tebu-bio, for one L3 replicate) and then converted into sequencing libraries using the Illumina TruSeq Small RNA Preparation Kit. Libraries were size selected to be 145–225 bp long on a 6% acrylamide gel, giving inserts of 20–100 bp long. Libraries were made from two biological replicates for each developmental stage. During the course of this work, the TAP enzyme stopped being available; the Cap-Clip and Dpph enzymes perform less well than TAP. One L3 and one YA replicate were made using a slightly different protocol. Embryo short cap RNA-seq data from Chen et al. (2013) were also included in the analyses (GSE42819).

ChIP-seq

Balls of frozen embryos or worms were ground to a powder using a mortar and pestle or a Retsch Mixer Mill to break animals into pieces. Frozen powder was thawed into 1% formaldehyde in PBS, incubated 10 min, then quenched with 0.125M glycine. Fixed tissue was washed 2X with PBS with protease inhibitors (Roche EDTA-free protease inhibitor cocktail tablets 05056489001), once in FA buffer (50 mM Hepes pH7.5, 1 mM EDTA, 1% TritonX-100, 0.1% sodium deoxycholate, and 150 mM NaCl) with protease inhibitors (FA+), then resuspended in 1 ml FA+ buffer per 1 ml of ground worm powder and the extract sonicated to an average size of 200 base pairs with a Diagenode Bioruptor or Bioruptor Pico for 25 pulses of 30 s followed by 30 s pause. For ChIP, 500 µg protein extract was incubated with 2 µg antibody in FA+ buffer with protease inhibitors overnight at 4°C, then incubated with magnetic beads conjugated to secondary antibodies for 2 hr at 4°C. Magnetic beads bound to immunoprecipitate were washed at room temperature twice in FA+, then once each in FA with 0.5M NaCl, FA with 1M NaCl, 0.25M LiCl (containing 1% NP-40, 1% sodium deoxycholate, 1 mM EDTA, 10 mM Tris pH8) and finally twice with TE pH8. Immunoprecipitated DNA was then eluted twice with 1% SDS, 250 mM NaCl, 10 mM Tris pH8, 1 mM EDTA at 65°C. Eluted DNA was treated with RNase for 1 hr at 37°C and crosslinks reversed by overnight incubation at 65°C with 200 µg/ml proteinase K, and the DNA purified using a Qiagen column. Libraries were prepared as described in the Sequencing library preparation section below. Two biological replicate ChIPs were conducted for each histone modification at each developmental time point (Embryo, L1, L2, L3, L4, YA). Antibodies used were: anti-H3K4me3 (Abcam ab8580), anti-H3K4me1 (Abcam ab8895), anti-H3K36me3 (Abcam ab9050), and anti-H3K27me3 (Wako 309–95259).

Sequencing library preparation

DNA was converted into sequencing libraries using a modified Illumina Truseq protocol based on https://ethanomics.files.wordpress.com/2012/09/chip_truseq.pdf. Briefly, DNA fragments were first repaired with an End repair enzyme mix (New England Biolabs, cat E5060) for 30 min at 20°C in 50 µl, then all DNA fragments were recovered using 1 vol of AMPure XP beads and 1 vol of 30% PEG8000 in 1.25M NaCl, and eluted in 16.5 µl of H2O. The DNA was 3’ A-tailed in 1X NEB buffer 2 using 2.5 units of Klenow 3’ to 5’ exo(minus) (New England Biolabs, cat M0212) and 0.2 mM ATP for 30 min at 37°C in 20 µl. Illumina Truseq adaptors were then directly ligated to the DNA fragments by adding 25 µl 2X buffer, 1 µl of 0.06 nM adaptors (1 µl of 1:250 dilution of Illumina stock solution), 2.5 µl water and 1.5 µl of NEB Quick ligase (cat M2200). After 20 min at room temperature, 5 µl of 0.5M EDTA pH8 was added to inactivate the enzyme and DNA was purified using AMPure XP beads. For DNase and MNase libraries, 1.3 volumes of beads were used; for ChIP libraries, 0.9 volumes of beads were used. DNA fragments were eluted in 20 µl of H2O. We used 1 µl to determine the number of cycles needed to get amplification to 50% of the plateau as in https://ethanomics.wordpress.com/ngs-pcr-cycle-quantitation-protocol/. Libraries were amplified by PCR by adding 20 µl of the KAPA Hifi Hotstart Ready Mix (Kapa Biosystems cat KK2601) and 1 µl of 25 µM Illumina Universal primers. Libraries were then size selected. DNase and MNase libraries were purified using 1.3 volumes of beads. For ChIP libraries, 0.7 volumes of beads were added to bind large DNA. Beads were discarded and DNA recovered from the supernatant by adding 0.75 volumes of beads and 0.75 volumes of 30% PEG8000 in 1.25M NaCl. DNA was eluted in 40 µl water and 0.8 volumes of beads used to bind the library, leaving adaptor dimers in the supernatant. DNA was eluted in 10–15 µl water, quantified using a Qubit, and analyzed using an Agilent TapeStation.

Data processing

Reads were aligned using bwa-backtrack (Li and Durbin, 2009) in single-end (ATAC-seq, short cap RNA-seq, ChIP-seq) or paired-end mode (ATAC-seq - developmental only, DNase-seq, MNase-seq, long cap RNA-seq). Low-quality (q < 10), mitochondrial and modENCODE-blacklisted (Boyle et al., 2014) reads were discarded at this point.

For ATAC-seq, normalized genome-wide accessibility profiles from single-end reads were then calculated with MACS2 (Zhang et al., 2008) using the parameters --format BAM --bdg --SPMR --gsize ce --nolambda --nomodel --extsize 150 --shift -75 --keep-dup all. Developmental ATAC-seq was also processed in paired-end mode (ATAC-seq libraries of ageing samples were single-end). We did not observe major differences between accessible sites identified from paired-end and single-end profiles, and therefore use single-end profiles throughout the study for consistency.
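The effect of the shift and extension parameters can be illustrated with a short sketch. It reflects our reading of the MACS2 documentation (the 5' end is first shifted along the read's strand, then extended in the 5' to 3' direction), is not MACS2 code, and ignores possible off-by-one conventions; the function name and coordinates are hypothetical.

```python
def extended_fragment(five_prime, strand, shift=-75, extsize=150):
    """Return the (start, end) half-open interval that one single-end read
    contributes to the coverage profile.

    five_prime: genomic coordinate of the read's 5' end (the Tn5 cut site)
    strand:     "+" or "-"
    A negative shift moves the 5' end upstream relative to the read's strand;
    the interval then extends `extsize` bp in the read's 5'->3' direction.
    """
    if strand == "+":
        start = five_prime + shift
        return start, start + extsize
    # On the minus strand, "upstream" means higher coordinates.
    end = five_prime - shift
    return end - extsize, end

plus_interval = extended_fragment(1000, "+")
minus_interval = extended_fragment(1000, "-")
```

With a shift of −75 and an extension of 150, reads on either strand contribute a 150 bp window centred on the cut site, so the signal marks the accessible position rather than the body of the sequenced fragment.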

Short-cap and long-cap data were processed essentially as in Chen et al. (2013). Following alignment and filtering, transcription initiation was represented using strand-specific coverage of 5’ ends of short-cap reads. Transcription elongation was represented as strand-specific coverage of long-cap reads, with regions between read pairs filled in. For browsing, transcription elongation signal was normalized between samples by sizeFactors calculated from gene-level read counts using DESeq2 (Love et al., 2014). Normalized (linear) coverage signal was then further log-transformed with log2(normalised_coverage+1).
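The normalization and transform can be sketched as follows. This is a plain-Python re-implementation of the median-of-ratios idea behind DESeq2 size factors, given for illustration only; it is not the DESeq2 code, and the sample names and counts are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.
    counts: dict mapping sample name -> list of gene-level read counts
            (same gene order in every sample).
    Genes with a zero count in any sample are skipped."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    log_geo_means = []
    for g in range(n_genes):
        vals = [counts[s][g] for s in samples]
        if all(v > 0 for v in vals):
            log_geo_means.append(sum(math.log(v) for v in vals) / len(vals))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - lg
                      for g, lg in enumerate(log_geo_means) if lg is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors

def log_normalise(coverage, size_factor):
    """log2(normalised_coverage + 1), with normalised = raw / size factor."""
    return [math.log2(c / size_factor + 1) for c in coverage]

# Hypothetical counts: rep2 was sequenced about twice as deeply as rep1.
counts = {"rep1": [10, 20, 30, 0], "rep2": [20, 40, 60, 5]}
factors = size_factors(counts)
track = log_normalise([0, 3], 1.0)
```

The added pseudocount of 1 keeps positions with zero coverage at a value of 0 after the log transform.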

ChIP-seq data was processed as in Chen et al. (2014). After alignment and filtering, the BEADS algorithm was used to generate normalized ChIP-seq coverage tracks (Cheung et al., 2011).

Stage-specific tracks used in downstream analyses were obtained by averaging normalized signal across two biological replicates. Manipulations of genome-wide signal were performed using bedtools (Quinlan and Hall, 2010), UCSC utilities (Kent et al., 2010), and wiggleTools (Zerbino et al., 2014). Computationally intensive steps were managed and parallelized using snakemake (Köster and Rahmann, 2012). Genome-wide data was visualized using the Integrative Genomics Viewer (Robinson et al., 2011; Thorvaldsdóttir et al., 2013).

To assess the reproducibility of replicate datasets, we performed PCA using the plotPCA() function in DESeq2 (Love et al., 2014) on peak accessibility at promoters (ATAC-seq), read counts at annotated genes (long cap RNA-seq), 5' end read counts at promoters (short cap RNA-seq), and genic regions, from the most upstream promoter to the annotated 3’ end, excluding genes with no annotated promoter (histone modifications). Replicates agreed well as shown in Figure 1—figure supplements 2 and 3.

Identification of accessible sites

Accessible sites were identified as follows. We first identified concave regions (regions with negative smoothed second derivative) from ATAC-seq coverage averaged across all stages and replicates. This approach is extremely sensitive, identifying a large number (>200,000) of peak-like regions. We then scored all peaks in each sample using the magnitude of the sample-specific smoothed second derivative. We used IDR (Li et al., 2011) on the scores to assess stage-specific signal levels and biological reproducibility, setting a conservative cutoff at 0.001. Final peak boundaries were set to peak accessibility extended by 75 bp on both sides. We found that peak calls from paired-end and single-end data were highly similar, but some regions were captured better by one or the other. Developmental ATAC-seq datasets were sequenced paired-end and ageing datasets single-end. Peaks were therefore called separately using developmental paired-end data, developmental single-end data extended to 150 bp and shifted 75 bp upstream, and ageing (single-end only) data, and then merged. This was achieved by successively including peaks from the three sets if they did not overlap a peak already identified in an earlier set. Figure 1—source data 1 gives peak calls and ATAC peak heights at each stage.
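
Two steps of this procedure lend themselves to a short illustration: finding concave regions from a smoothed second derivative, and merging peak sets in order of priority. The Python sketch below is a simplified stand-in, not the code used in the study. The moving-average smoother, the finite-difference derivative, and all function names are assumptions, since the smoothing kernel is not specified here.

```python
def smooth(signal, half_width):
    """Moving-average smoothing (assumed kernel; half_width=0 leaves the signal unchanged)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def second_derivative(signal):
    """Central finite-difference second derivative; the two end positions are set to 0."""
    d2 = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        d2[i] = signal[i - 1] - 2 * signal[i] + signal[i + 1]
    return d2


def concave_regions(signal, half_width=1):
    """Half-open (start, end) runs where the smoothed second derivative is negative."""
    d2 = second_derivative(smooth(signal, half_width))
    regions, start = [], None
    for i, value in enumerate(d2):
        if value < 0 and start is None:
            start = i
        elif value >= 0 and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(d2)))
    return regions


def merge_peak_sets(peak_sets):
    """Add peaks set by set, skipping any peak that overlaps one from an earlier set."""
    merged = []
    for peaks in peak_sets:
        earlier = list(merged)  # only peaks from previous sets can exclude a peak
        for start, end in peaks:
            if not any(start < e_end and e_start < end for e_start, e_end in earlier):
                merged.append((start, end))
    return sorted(merged)
```

In this sketch the order of `peak_sets` encodes priority, mirroring the order given above: developmental paired-end, developmental single-end, then ageing.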

Datasets and genome versions

Throughout this study, we used the WBcel215/ce10 (WS220) version of the C. elegans genome, and WormBase WS260 genome annotations, with coordinates backlifted to WBcel215/ce10 (WS220). For convenience, Figure 2—source data 1 also contains WBcel235/ce11 coordinates of accessible sites and representative transcription initiation modes.

For motif analyses, Inr and TATA consensus sequences were obtained from Sloutskin et al. (2015), and mapped with zero mismatches using homer (Heinz et al., 2010). CpG density was defined as in Chen et al. (2014).

modENCODE (Araya et al., 2014) and modERN (Kudron et al., 2018) transcription factor binding datasets used in this paper were obtained from http://www.encodeproject.org or http://data.modencode.org (EOR-1). ChIP-seq profiles were manually inspected and 227 high quality datasets selected, covering 176 transcription factors (given in Figure 5—source data 1). To define TFBS clusters (Figure 1—figure supplement 1C,D; Figure 2—figure supplement 1), TF peak calls were extended to 200 bp on either side of the summit, and clustered using a single-linkage approach. To analyze enrichment of individual factors (Figure 5), TF peaks were assigned to a regulatory element if their summits overlapped with the 400 bp region centered at the element midpoint. Factors associated with each regulatory element via this approach are given in Figure 4—source data 1. We excluded binding at so-called ‘HOT’ (highly occupied target) regions from enrichment analyses in Figure 5, as these are thought to represent non-sequence-specific TF binding or ChIP artifacts (Gerstein et al., 2010; Kudron et al., 2018). HOT regions were defined here as accessible sites with binding of 19 or more of the analyzed 176 TFs (sites in the top 20% of binding, excluding sites with no binding).
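
The interval logic in this paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the study: single-linkage clustering is implemented as merging of overlapping extended peaks, intervals that merely touch are treated as linked, and the function names and example factor labels are hypothetical.

```python
def cluster_summits(summits, flank=200):
    """Extend each summit by `flank` bp on either side and merge overlapping intervals."""
    intervals = sorted((s - flank, s + flank) for s in summits)
    clusters = []
    for start, end in intervals:
        if clusters and start <= clusters[-1][1]:
            clusters[-1][1] = max(clusters[-1][1], end)
        else:
            clusters.append([start, end])
    return [tuple(c) for c in clusters]


def factors_at_element(midpoint, tf_summits, half_window=200):
    """Factors with a summit inside the 400 bp region centred on the element midpoint.

    tf_summits: iterable of (factor_name, summit_position) pairs.
    """
    return sorted({tf for tf, summit in tf_summits if abs(summit - midpoint) <= half_window})


def is_hot(n_factors, threshold=19):
    """HOT sites are bound by `threshold` or more of the analysed factors."""
    return n_factors >= threshold
```

For example, summits at 1000 and 1300 fall into one cluster because their extended intervals overlap, whereas a summit at 2000 forms its own cluster.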

Coefficients of variation of gene expression (CV) are from Gerstein et al. (2014); the processed table was kindly provided by Burak Alver.

Annotation of regulatory elements

Patterns of nuclear transcription were used to annotate elements. At each stage, separately on both strands, we assessed 1) initiating and elongating transcription at the site, 2) continuity of transcription from the site to the closest downstream gene, and 3) positioning of nearby exons (on the matching strand).

To assess for transcription elongation at an accessible site, we counted 5' ends of long cap reads upstream (−250:−75), and downstream (+75:+250) of peak accessibility. We then used two approaches to identify sites with a local increase in transcription elongation. First, we used DESeq2 to test for an increase in downstream vs upstream counts (‘jump’ method). Statistical significance was called at log2FoldChange > 1.5, and adjusted p-value<0.1 (one-sided test). To capture additional regions with weak signal (‘incr’ method), we accepted sites with 0 reads upstream, at least one read in both biological replicates downstream, and three reads total when summed across both biological replicates.
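
The counting windows and the ‘incr’ rule can be written down compactly. The Python sketch below is illustrative only; the function names are assumptions, "three reads total" is read as at least three reads, and the DESeq2-based ‘jump’ test is not reproduced.

```python
def elongation_windows(peak, strand, near=75, far=250):
    """Return (upstream, downstream) counting windows relative to peak accessibility.

    Windows are (start, end) genomic coordinates; on the minus strand the
    upstream window lies at higher coordinates than the peak.
    """
    upstream = (peak - far, peak - near)
    downstream = (peak + near, peak + far)
    if strand == "-":
        upstream, downstream = downstream, upstream
    return upstream, downstream


def passes_incr(upstream_counts, downstream_counts, min_total=3):
    """'incr' rule applied to per-replicate 5' end counts of long cap reads.

    Requires no upstream reads, at least one downstream read in every
    replicate, and at least `min_total` downstream reads in total.
    """
    return (
        sum(upstream_counts) == 0
        and all(c >= 1 for c in downstream_counts)
        and sum(downstream_counts) >= min_total
    )
```

A site with downstream counts of 1 and 2 in the two replicates and no upstream reads would pass, whereas counts of 3 and 0 would not, because one replicate lacks signal.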

To assess transcription initiation, we pooled short cap across all six wild-type stages, and included two additional embryo replicates from Chen et al. (2013). The pooled signal was filtered for reproducibility by only keeping signal at base pairs with non-zero transcription initiation in at least two replicates. We then required the presence of at least one base pair with reproducible signal within 125 bp of peak accessibility to designate an accessible site as having transcription initiation. For every site, we also defined a representative transcription initiation mode as the position with maximum short-cap signal within 125 bp of peak accessibility. For sites without reproducible short-cap signal, we used an extrapolated, ‘best-guess’ position at 60 bp downstream of peak accessibility.
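
The choice of a representative transcription initiation mode can be sketched as follows. This Python example is an assumption-laden illustration: the input is taken to be a mapping from position to reproducible short-cap signal on one strand, ties are broken in favour of the position closest to peak accessibility, and the function name is hypothetical.

```python
def initiation_mode(peak, strand, short_cap, window=125, fallback_offset=60):
    """Return (position, has_initiation) for one accessible site on one strand.

    short_cap: dict mapping genomic position -> reproducible short-cap signal.
    If no reproducible signal lies within `window` bp of peak accessibility,
    a best-guess position `fallback_offset` bp downstream is returned.
    """
    candidates = {
        pos: signal
        for pos, signal in short_cap.items()
        if abs(pos - peak) <= window and signal > 0
    }
    if candidates:
        # Highest signal wins; ties go to the position nearest the peak (assumption).
        best = max(candidates, key=lambda p: (candidates[p], -abs(p - peak)))
        return best, True
    # Downstream means higher coordinates on '+', lower coordinates on '-'.
    offset = fallback_offset if strand == "+" else -fallback_offset
    return peak + offset, False
```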

We annotated accessible sites as coding_promoter or pseudogene_promoter if they fulfilled the following four criteria. (1) The accessible site had transcription initiation, and passed at least one of the elongation tests (jump or incr), or passed both elongation tests (jump and incr). (2) Transcription initiation mode at the accessible site was either upstream of the closest first exon, or, in the presence of a UTR, up to 250 bp downstream within the UTR. (The closest first exon was chosen based on the distance between the 5' end of the first exon and peak accessibility at the accessible site, allowing the 5' end of the exon to be up to 250 bp upstream or anywhere downstream of peak accessibility). (3) The region from peak accessibility to the closest first exon did not contain the 5' end of a non-first exon. (4) Distal sites (peak accessibility >250 bp from the closest first exon) were additionally required to (a) have continuous long-cap coverage from 250 bp downstream of peak accessibility to the closest first exon, and (b) be further than 250 bp away from any non-first exon.

We then further attempted to assign a single, lower-confidence promoter to genes that were not assigned a promoter so far. For every gene without promoter assignments, we re-examined sites that fulfilled criteria (2-4), and were either intergenic, or within 250 bp of the closest first exon. We then annotated the site with the largest jump test log2FoldChange as the promoter, if it was also larger than 1.

Next, sites within 250 bp of the 5' end of an annotated tRNA, snRNA, snoRNA, miRNA, or rRNA were annotated as non-coding_RNA. Intergenic sites more than 250 bp away from annotated exons that had initiating transcription, and passed the jump test were annotated as unassigned_promoter. All remaining sites were annotated as transcription_initiation or no_transcription based on whether they had transcription initiation.

Elements were then annotated on each strand based on aggregating transcription patterns across stages by determining the ‘highest’ annotation using the ranking of: coding_promoter, pseudogene_promoter, non-coding_RNA, unassigned_promoter, transcription_initiation, no_transcription. Element type and coloring was then defined using the following ranking: coding_promoter on either strand => coding_promoter (red); pseudogene_promoter on either strand => pseudogene_promoter (orange); non-coding_RNA on either strand => non-coding_RNA (black); unassigned_promoter on either strand => unassigned_promoter (yellow); transcription_initiation on either strand => putative_enhancer (green); all remaining sites => other_element (blue). Figure 2—source data 1 gives annotation information.
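
The two rankings above amount to a pair of ordered lookups. A minimal Python sketch, with function names chosen for this example, is:

```python
# Per-strand annotations, from highest to lowest rank.
RANKING = [
    "coding_promoter",
    "pseudogene_promoter",
    "non-coding_RNA",
    "unassigned_promoter",
    "transcription_initiation",
    "no_transcription",
]

# (per-strand annotation, element type, colour), checked in order.
ELEMENT_TYPES = [
    ("coding_promoter", "coding_promoter", "red"),
    ("pseudogene_promoter", "pseudogene_promoter", "orange"),
    ("non-coding_RNA", "non-coding_RNA", "black"),
    ("unassigned_promoter", "unassigned_promoter", "yellow"),
    ("transcription_initiation", "putative_enhancer", "green"),
]


def highest_annotation(stage_annotations):
    """Aggregate one strand's annotations across stages to the highest-ranked one."""
    return min(stage_annotations, key=RANKING.index)


def element_type(forward, reverse):
    """Combine the two per-strand annotations into an element type and colour."""
    for annotation, etype, colour in ELEMENT_TYPES:
        if annotation in (forward, reverse):
            return etype, colour
    return "other_element", "blue"
```

For instance, a site with transcription initiation on one strand and no transcription on the other is labelled a putative enhancer, whereas a site that is a coding promoter on either strand is labelled a coding promoter regardless of the other strand.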

Clustering of promoter accessibility

Accessible elements with regulated accessibility were determined as follows. All elements (n = 42,245) were tested for a difference in ATAC-seq coverage between any two developmental time points or between any two ageing time points using DESeq2 (Love et al., 2014). Sites with >= 2 absolute fold change and adjusted p-value<0.01 were defined as ‘regulated’ (n = 30,032 in development and n = 6590 in ageing; Figure 4—source data 1); regulated promoters (n = 10,199 in development and n = 1800 in ageing) were used in clustering analyses.
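
The thresholding step can be illustrated with a short Python sketch that operates on precomputed pairwise test results. It assumes that the fold-change and adjusted p-value criteria must be met within the same pairwise comparison, which is one reading of the description above; the function name and input layout are hypothetical, and the DESeq2 testing itself is not reproduced.

```python
import math


def is_regulated(pairwise_results, min_fold=2.0, max_padj=0.01):
    """Decide whether an element counts as 'regulated' within one time course.

    pairwise_results: iterable of (log2_fold_change, adjusted_p_value) pairs,
    one per comparison between two time points.
    """
    min_log2 = math.log2(min_fold)  # a 2-fold change equals |log2FC| of 1
    return any(
        abs(log2fc) >= min_log2 and padj < max_padj
        for log2fc, padj in pairwise_results
    )
```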

For clustering analyses, depth-normalized ATAC-seq coverage of each promoter was calculated at each time point in development or ageing. Relative accessibility was calculated at each time point in development or ageing by applying the following formula: $\log_2(\text{ATAC-seq coverage}_{\text{time point } i} + 1) - \log_2(\text{mean ATAC-seq coverage across time points} + 1)$. Mean ATAC-seq coverage across time points was calculated separately for the developmental and ageing time courses. Clustering was performed using k-medoids, as implemented in the pam() method of the cluster R package (Maechler et al., 2017). Different numbers of clusters were tested for clustering of regulatory elements in developmental and ageing datasets; 16 was chosen for developmental data and 10 for ageing data as the normalized changes in promoter ATAC-seq signals within each cluster were relatively homogeneous. We manually merged two ageing clusters showing comparable accessibility and tissue-specific gene enrichment (resulting in the cluster I + H [2]). Cluster labels were determined based on enrichment for tissue-biased gene expression within each cluster (see below).
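
The relative accessibility calculation described above can be expressed directly in code. The Python sketch below is a minimal illustration with a hypothetical function name; it takes the depth-normalized coverage of one promoter across the time points of a single time course.

```python
import math


def relative_accessibility(coverage):
    """log2(coverage_i + 1) - log2(mean coverage + 1) for each time point i.

    coverage: depth-normalized ATAC-seq coverage of one promoter, one value
    per time point of a single time course (development or ageing).
    """
    baseline = math.log2(sum(coverage) / len(coverage) + 1)
    return [math.log2(c + 1) - baseline for c in coverage]
```

With coverage values of 1, 3 and 5, the mean is 3, so the middle time point has a relative accessibility of 0 and the first time point a value of -1. The resulting vectors, one per promoter, are the input to clustering.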

To compare accessibility and gene expression, FPM-normalized gene-level read counts were calculated using DESeq2, and then averaged across biological replicates. For visualisation, relative expression levels were calculated using the approach described above for relative promoter accessibility (see formula above), with FPM values instead of ATAC-seq coverage values.

Using single-cell RNA-seq data from Cao et al. (2017), we defined tissue-biased gene expression as follows: gene expression was considered enriched in a given tissue if it had a fold-change >= 3 between expression in the tissues with highest and second highest levels and an adjusted p-value<0.01. This defined 5315 genes with tissue-biased expression (1432 in Gonad, 553 in Hypodermis, 799 in Intestine, 352 in Muscle, 1218 in Neurons, 447 in Glia, 514 in Pharynx). For each developmental or ageing cluster of promoters, we calculated the percentage of genes with biased expression in a given tissue relative to the total number of genes in the cluster. These values were plotted in Figure 4A and B (bar plots).
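
The tissue-bias rule and the per-cluster percentages can be sketched in Python as follows. This is an illustration only: the function names, the input layout (one expression value per tissue and a single adjusted p-value per gene), and the handling of a zero second-highest value are assumptions made for this example.

```python
def biased_tissue(expression, padj, min_fold=3.0, max_padj=0.01):
    """Return the tissue a gene is biased towards, or None.

    expression: dict mapping tissue -> expression level (at least two tissues).
    padj: adjusted p-value for the comparison of the top two tissues.
    """
    ranked = sorted(expression.items(), key=lambda item: item[1], reverse=True)
    (top_tissue, top), (_, second) = ranked[0], ranked[1]
    if padj >= max_padj or top <= 0:
        return None
    if second == 0:
        return top_tissue  # fold change is unbounded (assumption: counts as biased)
    return top_tissue if top / second >= min_fold else None


def tissue_percentages(cluster_genes, gene_to_tissue):
    """Percentage of a cluster's genes biased towards each tissue.

    The denominator is the total number of genes in the cluster, including
    genes without tissue-biased expression.
    """
    counts = {}
    for gene in cluster_genes:
        tissue = gene_to_tissue.get(gene)
        if tissue is not None:
            counts[tissue] = counts.get(tissue, 0) + 1
    return {t: 100.0 * n / len(cluster_genes) for t, n in counts.items()}
```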

GO enrichments were evaluated using the R package gProfileR (Reimand et al., 2016) against the C. elegans GO database. Significant enrichment was set at an adjusted p-value of 0.05, and hierarchically redundant terms were automatically removed by gProfileR.

Enrichment for transcription factor binding in promoter clusters

Prior to analysis of TF peak enrichment at annotated promoters, accessible elements considered ‘HOT’ (see above) were removed, leaving 10,086 to be assessed by enrichment analysis. Only transcription factors with more than 200 peaks overlapping ‘non-hot’ regulatory elements were kept, to ensure sufficient data for analysis. Following this stringent filtering, 89 transcription factors could be assayed for binding enrichment. Transcription factor binding enrichment in each cluster was estimated using the odds ratio, and enrichments with an associated p-value<0.01 (Fisher’s exact test) were kept. Transcription factors that did not show enrichment higher than two in any cluster were discarded. Figure 5 summarizes the transcription factor binding enrichment in each cluster during development or ageing. Relative tissue expression profiles of each transcription factor at the L2 stage (data from Cao et al., 2017) were calculated in each tissue by taking the log2 of its expression (TPM) in the tissue divided by its mean expression across all tissues. A pseudo-value of 0.1 was first added to all the TPM values before calculation of the relative levels of expression.
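
The enrichment statistics and the relative tissue expression calculation can be illustrated with the Python sketch below. It makes several assumptions that are not stated above: the 2x2 table layout, the use of a one-sided (enrichment) Fisher's exact test, and the function names. It is a sketch of the general technique, not the code used in the study.

```python
from math import comb, log2


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a: elements in the cluster bound by the factor
    b: elements outside the cluster bound by the factor
    c: elements in the cluster not bound
    d: elements outside the cluster not bound
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value: P(X >= a) under the hypergeometric null."""
    bound, in_cluster, total = a + b, a + c, a + b + c + d
    upper = min(bound, in_cluster)
    tail = sum(
        comb(bound, k) * comb(total - bound, in_cluster - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, in_cluster)


def relative_tissue_expression(tpm_by_tissue, pseudo=0.1):
    """log2 of (TPM + pseudo) in each tissue over the mean of (TPM + pseudo) across tissues."""
    shifted = {tissue: tpm + pseudo for tissue, tpm in tpm_by_tissue.items()}
    mean = sum(shifted.values()) / len(shifted)
    return {tissue: log2(value / mean) for tissue, value in shifted.items()}
```

Under the thresholds described above, a factor would be retained for a cluster when the odds ratio exceeds two and the p-value is below 0.01.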

Construction of transgenic lines

Transgene constructs were made using three-site Gateway cloning (Invitrogen) as in Chen et al. (2014). Site one has the regulatory element sequence to be tested, site two has a synthetic outron (OU141; Conrad et al., 1995) fused to his-58 (plasmid pJA357), and site three has gfp-tbb-2 3’UTR (pJA256; Zeiser et al., 2011) in the MosSCI compatible vector pCFJ150, which targets Mos site Mos1(ttTi5605); MosSCI lines were generated as described (Frøkjaer-Jensen et al., 2008).

Data access

ATAC-seq, ChIP-seq, DNase/MNase-seq, long/short cap RNA-seq data from this study, including processed tracks are available at the NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE114494.

Acknowledgements

We thank C Bradshaw for bioinformatics support, K Harnish for sequencing, B Alver for providing processed data, and F Carelli, C Gal, and A Frapporti for comments on the manuscript. The work was supported by Wellcome Trust Senior Research Fellowships to JA (054523 and 101863), a Wellcome Trust PhD fellowship to JJ (097679), a Sir Robert Edwards Scholarship from Churchill College, an English Speaking Union Graduate Scholarship, and funding from the Cambridge Trust to MS, a Medical Research Council DTP studentship to JS, and a Thouron award to CW. This study was also supported by the European Sequencing and Genotyping Infrastructure (funded by the EC, FP7/2007-2013) under Grant Agreement 26205 (ESGI) as part of the transnational access program. We thank Drs. Hans Lehrach and Marie-Laure Yaspo for generous support of the ESGI project, Dr. Marc Sultan for setting up sequencing technology platforms, and Mathias Linser and the rest of the sequencing team of the Department of Vertebrate Genomics at the Max Planck Institute for Molecular Genetics for technical assistance. We also acknowledge core support from the Wellcome Trust (092096) and Cancer Research UK (C6946/A14492).

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Julie Ahringer, Email: ja219@cam.ac.uk.

Siu Sylvia Lee, Cornell University, United States.

Jessica K Tyler, Weill Cornell Medicine, United States.

Funding Information

This paper was supported by the following grants:

  • Wellcome 101863 to Jürgen Jänes, Yan Dong, Alex Appert, Chiara Cerrato, Ron Chen, Carolina Gemma, Ni Huang, Przemyslaw Stempor, Annette Steward, Eva Zeiser, Julie Ahringer.

  • Medical Research Council to Jacques Serizay.

  • European Commission FP7/2007-2013 to Sascha Sauer, Julie Ahringer.

  • Wellcome 097679 to Jürgen Jänes.

Additional information

Competing interests

Reviewing editor, eLife.

No competing interests declared.

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

Investigation, Methodology, Writing—review and editing.

Formal analysis, Investigation, Methodology.

Software, Formal analysis, Visualization, Writing—original draft, Writing—review and editing.

Formal analysis, Investigation, Methodology.

Formal analysis, Investigation.

Formal analysis, Investigation.

Formal analysis, Investigation.

Formal analysis, Investigation.

Software, Formal analysis.

Formal analysis, Investigation.

Data curation, Software, Formal analysis.

Formal analysis, Investigation.

Formal analysis, Investigation.

Funding acquisition, Project administration.

Conceptualization, Formal analysis, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Transparent reporting form
DOI: 10.7554/eLife.37344.024

Data availability

Sequencing data have been deposited as a SuperSeries in GEO under accession code GSE114494.

The following datasets were generated:

Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [DNase, MNase] Gene Expression Omnibus. GSE114481

Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [scap] Gene Expression Omnibus. GSE114490

Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [lcap] Gene Expression Omnibus. GSE114483

Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ChIP-seq] Gene Expression Omnibus. GSE114440

Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ATAC-seq] Gene Expression Omnibus. GSE114439

The following previously published datasets were used:

Down TA. 2013. The landscape of RNA polymerase II transcription initiation in C. elegans reveals a novel regulatory architecture. NCBI Gene Expression Omnibus. GSE42819

References

  1. Allen MA, Hillier LW, Waterston RH, Blumenthal T. A global analysis of C. elegans trans-splicing. Genome Research. 2011;21:255–264. doi: 10.1101/gr.113811.110.
  2. Andersson R, Refsing Andersen P, Valen E, Core LJ, Bornholdt J, Boyd M, Heick Jensen T, Sandelin A. Nuclear stability and transcriptional directionality separate functionally distinct RNA species. Nature Communications. 2014;5:5336. doi: 10.1038/ncomms6336.
  3. Andersson R, Sandelin A, Danko CG. A unified architecture of transcriptional regulatory elements. Trends in Genetics. 2015;31:426–433. doi: 10.1016/j.tig.2015.05.007.
  4. Andersson R. Promoter or enhancer, what's the difference? deconstruction of established distinctions and presentation of a unifying model. BioEssays. 2015;37:314–323. doi: 10.1002/bies.201400162.
  5. Araya CL, Kawli T, Kundaje A, Jiang L, Wu B, Vafeados D, Terrell R, Weissdepp P, Gevirtzman L, Mace D, Niu W, Boyle AP, Xie D, Ma L, Murray JI, Reinke V, Waterston RH, Snyder M. Regulatory analysis of the C. elegans genome with spatiotemporal resolution. Nature. 2014;512:400–405. doi: 10.1038/nature13497.
  6. Boyle AP, Araya CL, Brdlik C, Cayting P, Cheng C, Cheng Y, Gardner K, Hillier LW, Janette J, Jiang L, Kasper D, Kawli T, Kheradpour P, Kundaje A, Li JJ, Ma L, Niu W, Rehm EJ, Rozowsky J, Slattery M, Spokony R, Terrell R, Vafeados D, Wang D, Weisdepp P, Wu YC, Xie D, Yan KK, Feingold EA, Good PJ, Pazin MJ, Huang H, Bickel PJ, Brenner SE, Reinke V, Waterston RH, Gerstein M, White KP, Kellis M, Snyder M. Comparative analysis of regulatory information and circuits across distant species. Nature. 2014;512:453–456. doi: 10.1038/nature13668.
  7. Brabin C, Appleford PJ, Woollard A. The Caenorhabditis elegans GATA factor ELT-1 works through the cell proliferation regulator BRO-1 and the fusogen EFF-1 to maintain the seam stem-like fate. PLoS Genetics. 2011;7:e1002200. doi: 10.1371/journal.pgen.1002200.
  8. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature Methods. 2013;10:1213–1218. doi: 10.1038/nmeth.2688.
  9. Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C, Furlan SN, Steemers FJ, Adey A, Waterston RH, Trapnell C, Shendure J. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science. 2017;357:661–667. doi: 10.1126/science.aam8940.
  10. Chen RA, Down TA, Stempor P, Chen QB, Egelhofer TA, Hillier LW, Jeffers TE, Ahringer J. The landscape of RNA polymerase II transcription initiation in C. elegans reveals promoter and enhancer architectures. Genome Research. 2013;23:1339–1347. doi: 10.1101/gr.153668.112.
  11. Chen RA, Stempor P, Down TA, Zeiser E, Feuer SK, Ahringer J. Extreme HOT regions are CpG-dense promoters in C. elegans and humans. Genome Research. 2014;24:1138–1146. doi: 10.1101/gr.161992.113.
  12. Cheung MS, Down TA, Latorre I, Ahringer J. Systematic bias in high-throughput sequencing data and its correction by BEADS. Nucleic Acids Research. 2011;39:e103. doi: 10.1093/nar/gkr425.
  13. Conrad R, Lea K, Blumenthal T. SL1 trans-splicing specified by AU-rich synthetic RNA inserted at the 5' end of Caenorhabditis elegans pre-mRNA. RNA. 1995;1:164–170.
  14. Core LJ, Waterfall JJ, Lis JT. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science. 2008;322:1845–1848. doi: 10.1126/science.1162228.
  15. Core LJ, Martins AL, Danko CG, Waters CT, Siepel A, Lis JT. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nature Genetics. 2014;46:1311–1320. doi: 10.1038/ng.3142.
  16. Crawford GE, Davis S, Scacheri PC, Renaud G, Halawi MJ, Erdos MR, Green R, Meltzer PS, Wolfsberg TG, Collins FS. DNase-chip: a high-resolution method to identify DNase I hypersensitive sites using tiled microarrays. Nature Methods. 2006;3:503–509. doi: 10.1038/nmeth888.
  17. Daugherty AC, Yeo RW, Buenrostro JD, Greenleaf WJ, Kundaje A, Brunet A. Chromatin accessibility dynamics reveal novel functional enhancers in C. elegans. Genome Research. 2017;27:2096–2107. doi: 10.1101/gr.226233.117.
  18. De Santa F, Barozzi I, Mietton F, Ghisletti S, Polletti S, Tusi BK, Muller H, Ragoussis J, Wei CL, Natoli G. A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biology. 2010;8:e1000384. doi: 10.1371/journal.pbio.1000384.
  19. Ernst J, Kellis M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nature Biotechnology. 2010;28:817–825. doi: 10.1038/nbt.1662.
  20. Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473:43–49. doi: 10.1038/nature09906.
  21. Evans KJ, Huang N, Stempor P, Chesney MA, Down TA, Ahringer J. Stable Caenorhabditis elegans chromatin domains separate broadly expressed and developmentally regulated genes. PNAS. 2016;113 doi: 10.1073/pnas.1608162113.
  22. Flynn RA, Almada AE, Zamudio JR, Sharp PA. Antisense RNA polymerase II divergent transcripts are P-TEFb dependent and substrates for the RNA exosome. PNAS. 2011;108:10460–10465. doi: 10.1073/pnas.1106630108.
  23. Folick A, Oakley HD, Yu Y, Armstrong EH, Kumari M, Sanor L, Moore DD, Ortlund EA, Zechner R, Wang MC. Aging. Lysosomal signaling molecules regulate longevity in Caenorhabditis elegans. Science. 2015;347:83–86. doi: 10.1126/science.1258857.
  24. Frøkjaer-Jensen C, Davis MW, Hopkins CE, Newman BJ, Thummel JM, Olesen SP, Grunnet M, Jorgensen EM. Single-copy insertion of transgenes in Caenorhabditis elegans. Nature Genetics. 2008;40:1375–1383. doi: 10.1038/ng.248.
  25. Fukushige T, Hawkins MG, McGhee JD. The GATA-factor elt-2 is essential for formation of the Caenorhabditis elegans intestine. Developmental Biology. 1998;198:286–302. doi: 10.1016/S0012-1606(98)80006-7.
  26. Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, Yip KY, Robilotto R, Rechtsteiner A, Ikegami K, Alves P, Chateigner A, Perry M, Morris M, Auerbach RK, Feng X, Leng J, Vielle A, Niu W, Rhrissorrakrai K, Agarwal A, Alexander RP, Barber G, Brdlik CM, Brennan J, Brouillet JJ, Carr A, Cheung MS, Clawson H, Contrino S, Dannenberg LO, Dernburg AF, Desai A, Dick L, Dosé AC, Du J, Egelhofer T, Ercan S, Euskirchen G, Ewing B, Feingold EA, Gassmann R, Good PJ, Green P, Gullier F, Gutwein M, Guyer MS, Habegger L, Han T, Henikoff JG, Henz SR, Hinrichs A, Holster H, Hyman T, Iniguez AL, Janette J, Jensen M, Kato M, Kent WJ, Kephart E, Khivansara V, Khurana E, Kim JK, Kolasinska-Zwierz P, Lai EC, Latorre I, Leahey A, Lewis S, Lloyd P, Lochovsky L, Lowdon RF, Lubling Y, Lyne R, MacCoss M, Mackowiak SD, Mangone M, McKay S, Mecenas D, Merrihew G, Miller DM, Muroyama A, Murray JI, Ooi SL, Pham H, Phippen T, Preston EA, Rajewsky N, Rätsch G, Rosenbaum H, Rozowsky J, Rutherford K, Ruzanov P, Sarov M, Sasidharan R, Sboner A, Scheid P, Segal E, Shin H, Shou C, Slack FJ, Slightam C, Smith R, Spencer WC, Stinson EO, Taing S, Takasaki T, Vafeados D, Voronina K, Wang G, Washington NL, Whittle CM, Wu B, Yan KK, Zeller G, Zha Z, Zhong M, Zhou X, Ahringer J, Strome S, Gunsalus KC, Micklem G, Liu XS, Reinke V, Kim SK, Hillier LW, Henikoff S, Piano F, Snyder M, Stein L, Lieb JD, Waterston RH, modENCODE Consortium. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science. 2010;330:1775–1787. doi: 10.1126/science.1196914.
  27. Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJ, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R, Jj L. Comparative analysis of the transcriptome across distant species. Nature. 2014;512:445–448. doi: 10.1038/nature13424.
  28. Gilleard JS, Shafi Y, Barry JD, McGhee JD. ELT-3: a Caenorhabditis elegans GATA factor expressed in the embryonic epidermis during morphogenesis. Developmental Biology. 1999;208:265–280. doi: 10.1006/dbio.1999.9202. [DOI] [PubMed] [Google Scholar]
  29. Gissendanner CR, Sluder AE. nhr-25, the Caenorhabditis elegans ortholog of ftz-f1, is required for epidermal and somatic gonad development. Developmental Biology. 2000;221:259–272. doi: 10.1006/dbio.2000.9679. [DOI] [PubMed] [Google Scholar]
  30. Goudeau J, Bellemin S, Toselli-Mollereau E, Shamalnasab M, Chen Y, Aguilaniu H. Fatty acid desaturation links germ cell loss to longevity through NHR-80/HNF4 in C. elegans. PLoS Biology. 2011;9:e1000599. doi: 10.1371/journal.pbio.1000599. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Gu W, Lee HC, Chaves D, Youngman EM, Pazour GJ, Conte D, Mello CC. CapSeq and CIP-TAP identify Pol II start sites and reveal capped small RNAs as C. elegans piRNA precursors. Cell. 2012;151:1488–1500. doi: 10.1016/j.cell.2012.11.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nature Genetics. 2007;39:311–318. doi: 10.1038/ng1966. [DOI] [PubMed] [Google Scholar]
  33. Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, Harp LF, Ye Z, Lee LK, Stuart RK, Ching CW, Ching KA, Antosiewicz-Bourget JE, Liu H, Zhang X, Green RD, Lobanenkov VV, Stewart R, Thomson JA, Crawford GE, Kellis M, Ren B. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature. 2009;459:108–112. doi: 10.1038/nature07829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Molecular Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Henriques T, Scruggs BS, Inouye MO, Muse GW, Williams LH, Burkholder AB, Lavender CA, Fargo DC, Adelman K. Widespread transcriptional pausing and elongation control at enhancers. Genes & Development. 2018;32:26–41. doi: 10.1101/gad.309351.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Ho MCW, Quintero-Cadena P, Sternberg PW. Genome-wide discovery of active regulatory elements and transcription factor footprints in Caenorhabditis elegans using DNase-seq. Genome Research. 2017;27 doi: 10.1101/gr.223735.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hoffman MM, Ernst J, Wilder SP, Kundaje A, Harris RS, Libbrecht M, Giardine B, Ellenbogen PM, Bilmes JA, Birney E, Hardison RC, Dunham I, Kellis M, Noble WS. Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Research. 2013;41:827–841. doi: 10.1093/nar/gks1284. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Horn M, Geisen C, Cermak L, Becker B, Nakamura S, Klein C, Pagano M, Antebi A. DRE-1/FBXO11-dependent degradation of BLMP-1/BLIMP-1 governs C. elegans developmental timing and maturation. Developmental Cell. 2014;28:697–710. doi: 10.1016/j.devcel.2014.01.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Hunt-Newbury R, Viveiros R, Johnsen R, Mah A, Anastas D, Fang L, Halfnight E, Lee D, Lin J, Lorch A, McKay S, Okada HM, Pan J, Schulz AK, Tu D, Wong K, Zhao Z, Alexeyenko A, Burglin T, Sonnhammer E, Schnabel R, Jones SJ, Marra MA, Baillie DL, Moerman DG. High-throughput in vivo analysis of gene expression in Caenorhabditis elegans. PLoS Biology. 2007;5:e237. doi: 10.1371/journal.pbio.0050237. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Inoue F, Kircher M, Martin B, Cooper GM, Witten DM, McManus MT, Ahituv N, Shendure J. A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Research. 2017;27:38–52. doi: 10.1101/gr.212092.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Kaplan RE, Baugh LR. L1 arrest, daf-16/FoxO and nonautonomous control of post-embryonic development. Worm. 2016;5:e1175196. doi: 10.1080/21624054.2016.1175196. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Kent WJ, Zweig AS, Barber G, Hinrichs AS, Karolchik D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics. 2010;26:2204–2207. doi: 10.1093/bioinformatics/btq351. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kharchenko PV, Alekseyenko AA, Schwartz YB, Minoda A, Riddle NC, Ernst J, Sabo PJ, Larschan E, Gorchakov AA, Gu T, Linder-Basso D, Plachetka A, Shanower G, Tolstorukov MY, Luquette LJ, Xi R, Jung YL, Park RW, Bishop EP, Canfield TK, Sandstrom R, Thurman RE, MacAlpine DM, Stamatoyannopoulos JA, Kellis M, Elgin SC, Kuroda MI, Pirrotta V, Karpen GH, Park PJ. Comprehensive analysis of the chromatin landscape in Drosophila melanogaster. Nature. 2011;471:480–485. doi: 10.1038/nature09725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Kim TK, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, Harmin DA, Laptewicz M, Barbara-Haley K, Kuersten S, Markenscoff-Papadimitriou E, Kuhl D, Bito H, Worley PF, Kreiman G, Greenberg ME. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010;465:182–187. doi: 10.1038/nature09033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Kim TK, Shiekhattar R. Architectural and functional commonalities between enhancers and promoters. Cell. 2015;162:948–959. doi: 10.1016/j.cell.2015.08.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Koch F, Fenouil R, Gut M, Cauchy P, Albert TK, Zacarias-Cabeza J, Spicuglia S, de la Chapelle AL, Heidemann M, Hintermair C, Eick D, Gut I, Ferrier P, Andrau JC. Transcription initiation platforms and GTF recruitment at tissue-specific enhancers and promoters. Nature Structural & Molecular Biology. 2011;18:956–963. doi: 10.1038/nsmb.2085. [DOI] [PubMed] [Google Scholar]
  47. Köster J, Rahmann S. Snakemake--a scalable bioinformatics workflow engine. Bioinformatics. 2012;28:2520–2522. doi: 10.1093/bioinformatics/bts480. [DOI] [PubMed] [Google Scholar]
  48. Kowalczyk MS, Hughes JR, Garrick D, Lynch MD, Sharpe JA, Sloane-Stanley JA, McGowan SJ, De Gobbi M, Hosseini M, Vernimmen D, Brown JM, Gray NE, Collavin L, Gibbons RJ, Flint J, Taylor S, Buckle VJ, Milne TA, Wood WG, Higgs DR. Intragenic enhancers act as alternative promoters. Molecular Cell. 2012;45:447–458. doi: 10.1016/j.molcel.2011.12.021. [DOI] [PubMed] [Google Scholar]
  49. Kruesi WS, Core LJ, Waters CT, Lis JT, Meyer BJ. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife. 2013;2:e00808. doi: 10.7554/eLife.00808. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Kudron MM, Victorsen A, Gevirtzman L, Hillier LW, Fisher WW, Vafeados D, Kirkey M, Hammonds AS, Gersch J, Ammouri H, Wall ML, Moran J, Steffen D, Szynkarek M, Seabrook-Sturgis S, Jameel N, Kadaba M, Patton J, Terrell R, Corson M, Durham TJ, Park S, Samanta S, Han M, Xu J, Yan KK, Celniker SE, White KP, Ma L, Gerstein M, Reinke V, Waterston RH. The ModERN resource: genome-wide binding profiles for hundreds of Drosophila and Caenorhabditis elegans Transcription Factors. Genetics. 2018;208:937–949. doi: 10.1534/genetics.117.300657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, Heravi-Moussavi A, Kheradpour P, Zhang Z, Wang J, Ziller MJ, Amin V, Whitaker JW, Schultz MD, Ward LD, Sarkar A, Quon G, Sandstrom RS, Eaton ML, Wu YC, Pfenning AR, Wang X, Claussnitzer M, Liu Y, Coarfa C, Harris RA, Shoresh N, Epstein CB, Gjoneska E, Leung D, Xie W, Hawkins RD, Lister R, Hong C, Gascard P, Mungall AJ, Moore R, Chuah E, Tam A, Canfield TK, Hansen RS, Kaul R, Sabo PJ, Bansal MS, Carles A, Dixon JR, Farh KH, Feizi S, Karlic R, Kim AR, Kulkarni A, Li D, Lowdon R, Elliott G, Mercer TR, Neph SJ, Onuchic V, Polak P, Rajagopal N, Ray P, Sallari RC, Siebenthall KT, Sinnott-Armstrong NA, Stevens M, Thurman RE, Wu J, Zhang B, Zhou X, Beaudet AE, Boyer LA, De Jager PL, Farnham PJ, Fisher SJ, Haussler D, Jones SJ, Li W, Marra MA, McManus MT, Sunyaev S, Thomson JA, Tlsty TD, Tsai LH, Wang W, Waterland RA, Zhang MQ, Chadwick LH, Bernstein BE, Costello JF, Ecker JR, Hirst M, Meissner A, Milosavljevic A, Ren B, Stamatoyannopoulos JA, Wang T, Kellis M, Roadmap Epigenomics Consortium Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Leung D, Jung I, Rajagopal N, Schmitt A, Selvaraj S, Lee AY, Yen CA, Lin S, Lin Y, Qiu Y, Xie W, Yue F, Hariharan M, Ray P, Kuan S, Edsall L, Yang H, Chi NC, Zhang MQ, Ecker JR, Ren B. Integrative analysis of haplotype-resolved epigenomes across human tissues. Nature. 2015;518:350–354. doi: 10.1038/nature14217. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Li Q, Brown JB, Huang H, Bickel PJ. Measuring reproducibility of high-throughput experiments. The Annals of Applied Statistics. 2011;5:1752–1779. doi: 10.1214/11-AOAS466. [DOI] [Google Scholar]
  55. Lin K, Hsin H, Libina N, Kenyon C. Regulation of the Caenorhabditis elegans longevity protein DAF-16 by insulin/IGF-1 and germline signaling. Nature Genetics. 2001;28:139–145. doi: 10.1038/88850. [DOI] [PubMed] [Google Scholar]
  56. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K. Cluster: Cluster Analysis Basics and Extensions. Scientific Research Publisher; 2017. [Google Scholar]
  58. Mann FG, Van Nostrand EL, Friedland AE, Liu X, Kim SK. Deactivation of the GATA Transcription Factor ELT-2 Is a Major Driver of Normal Aging in C. elegans. PLoS Genetics. 2016;12:e1005956. doi: 10.1371/journal.pgen.1005956. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Mao XR, Kaufman DM, Crowder CM. Nicotinamide mononucleotide adenylyltransferase promotes hypoxic survival by activating the mitochondrial unfolded protein response. Cell Death & Disease. 2016;7:e2113. doi: 10.1038/cddis.2016.5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Merritt C, Rasoloson D, Ko D, Seydoux G. 3' UTRs are the primary regulators of gene expression in the C. elegans germline. Current Biology. 2008;18:1476–1482. doi: 10.1016/j.cub.2008.08.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Mikhaylichenko O, Bondarenko V, Harnett D, Schor IE, Males M, Viales RR, Furlong EEM. The degree of enhancer or promoter activity is reflected by the levels and directionality of eRNA transcription. Genes & Development. 2018;32:42–57. doi: 10.1101/gad.308619.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Nguyen TA, Jones RD, Snavely AR, Pfenning AR, Kirchner R, Hemberg M, Gray JM. High-throughput functional comparison of promoter and enhancer activities. Genome Research. 2016;26:1023–1033. doi: 10.1101/gr.204834.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Pandya-Jones A, Black DL. Co-transcriptional splicing of constitutive and alternative exons. RNA. 2009;15:1896–1908. doi: 10.1261/rna.1714509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Pekowska A, Benoukraf T, Zacarias-Cabeza J, Belhocine M, Koch F, Holota H, Imbert J, Andrau JC, Ferrier P, Spicuglia S. H3K4 tri-methylation provides an epigenetic signature of active enhancers. The EMBO Journal. 2011;30:4198–4210. doi: 10.1038/emboj.2011.295. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Pérez-Lluch S, Blanco E, Tilgner H, Curado J, Ruiz-Romero M, Corominas M, Guigó R. Absence of canonical marks of active chromatin in developmentally regulated genes. Nature Genetics. 2015;47:1158–1167. doi: 10.1038/ng.3381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Preker P, Nielsen J, Kammler S, Lykke-Andersen S, Christensen MS, Mapendano CK, Schierup MH, Jensen TH. RNA exosome depletion reveals transcription upstream of active human promoters. Science. 2008;322:1851–1854. doi: 10.1126/science.1164096. [DOI] [PubMed] [Google Scholar]
  67. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, Vilo J. G:profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Research. 2016;44:W83–W89. doi: 10.1093/nar/gkw199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Rennie S, Dalby M, Lloret-Llinares M, Bakoulis S, Vaagenso CD, Jensen TH, Andersson R. Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter encoded enhancer activities. bioRxiv. 2017 doi: 10.1093/nar/gky244. https://www.biorxiv.org/content/early/2017/11/18/221952 [DOI] [PMC free article] [PubMed]
  70. Rennie S, Dalby M, Lloret-Llinares M, Bakoulis S, Dalager Vaagensø C, Heick Jensen T, Andersson R. Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter-encoded enhancer activities. Nucleic Acids Research. 2018;46:5455–5469. doi: 10.1093/nar/gky244. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nature Biotechnology. 2011;29:24–26. doi: 10.1038/nbt.1754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Sabo PJ, Kuehn MS, Thurman R, Johnson BE, Johnson EM, Cao H, Yu M, Rosenzweig E, Goldy J, Haydock A, Weaver M, Shafer A, Lee K, Neri F, Humbert R, Singer MA, Richmond TA, Dorschner MO, McArthur M, Hawrylycz M, Green RD, Navas PA, Noble WS, Stamatoyannopoulos JA. Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays. Nature Methods. 2006;3:511–518. doi: 10.1038/nmeth890. [DOI] [PubMed] [Google Scholar]
  73. Saito TL, Hashimoto S, Gu SG, Morton JJ, Stadler M, Blumenthal T, Fire A, Morishita S. The transcription start site landscape of C. elegans. Genome Research. 2013;23:1348–1361. doi: 10.1101/gr.151571.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Sandhir R, Berman NE. Age-dependent response of CCAAT/enhancer binding proteins following traumatic brain injury in mice. Neurochemistry International. 2010;56:188–193. doi: 10.1016/j.neuint.2009.10.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Sigova AA, Mullen AC, Molinie B, Gupta S, Orlando DA, Guenther MG, Almada AE, Lin C, Sharp PA, Giallourakis CC, Young RA. Divergent transcription of long noncoding RNA/mRNA gene pairs in embryonic stem cells. PNAS. 2013;110:2876–2881. doi: 10.1073/pnas.1221904110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Sloutskin A, Danino YM, Orenstein Y, Zehavi Y, Doniger T, Shamir R, Juven-Gershon T. ElemeNT: a computational tool for detecting core promoter elements. Transcription. 2015;6:41–50. doi: 10.1080/21541264.2015.1067286. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Tepper RG, Ashraf J, Kaletsky R, Kleemann G, Murphy CT, Bussemaker HJ. PQM-1 complements DAF-16 as a key transcriptional regulator of DAF-2-mediated development and longevity. Cell. 2013;154:676–690. doi: 10.1016/j.cell.2013.07.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Thomas S, Li XY, Sabo PJ, Sandstrom R, Thurman RE, Canfield TK, Giste E, Fisher W, Hammonds A, Celniker SE, Biggin MD, Stamatoyannopoulos JA. Dynamic reprogramming of chromatin accessibility during Drosophila embryo development. Genome Biology. 2011;12:R43. doi: 10.1186/gb-2011-12-5-r43. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics. 2013;14:178–192. doi: 10.1093/bib/bbs017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E, Sheffield NC, Stergachis AB, Wang H, Vernot B, Garg K, John S, Sandstrom R, Bates D, Boatman L, Canfield TK, Diegel M, Dunn D, Ebersol AK, Frum T, Giste E, Johnson AK, Johnson EM, Kutyavin T, Lajoie B, Lee BK, Lee K, London D, Lotakis D, Neph S, Neri F, Nguyen ED, Qu H, Reynolds AP, Roach V, Safi A, Sanchez ME, Sanyal A, Shafer A, Simon JM, Song L, Vong S, Weaver M, Yan Y, Zhang Z, Zhang Z, Lenhard B, Tewari M, Dorschner MO, Hansen RS, Navas PA, Stamatoyannopoulos G, Iyer VR, Lieb JD, Sunyaev SR, Akey JM, Sabo PJ, Kaul R, Furey TS, Dekker J, Crawford GE, Stamatoyannopoulos JA. The accessible chromatin landscape of the human genome. Nature. 2012;489:75–82. doi: 10.1038/nature11232. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Tian Y, Garcia G, Bian Q, Steffen KK, Joe L, Wolff S, Meyer BJ, Dillin A. Mitochondrial Stress Induces Chromatin Reorganization to Promote Longevity and UPR(mt). Cell. 2016;165:1197–1208. doi: 10.1016/j.cell.2016.04.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Tittel-Elmer M, Bucher E, Broger L, Mathieu O, Paszkowski J, Vaillant I. Stress-induced activation of heterochromatic transcription. PLoS Genetics. 2010;6:e1001175. doi: 10.1371/journal.pgen.1001175. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Uno M, Honjoh S, Matsuda M, Hoshikawa H, Kishimoto S, Yamamoto T, Ebisuya M, Yamamoto T, Matsumoto K, Nishida E. A fasting-responsive signaling pathway that extends life span in C. elegans. Cell Reports. 2013;3:79–91. doi: 10.1016/j.celrep.2012.12.018. [DOI] [PubMed] [Google Scholar]
  84. van Arensbergen J, FitzPatrick VD, de Haas M, Pagie L, Sluimer J, Bussemaker HJ, van Steensel B. Genome-wide mapping of autonomous promoter activity in human cells. Nature Biotechnology. 2017;35:145–153. doi: 10.1038/nbt.3754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Wagner CR, Kuervers L, Baillie DL, Yanowitz JL. xnd-1 regulates the global recombination landscape in Caenorhabditis elegans. Nature. 2010;467:839–843. doi: 10.1038/nature09429. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Yue F, Cheng Y, Breschi A, Vierstra J, Wu W, Ryba T, Sandstrom R, Ma Z, Davis C, Pope BD, Shen Y, Pervouchine DD, Djebali S, Thurman RE, Kaul R, Rynes E, Kirilusha A, Marinov GK, Williams BA, Trout D, Amrhein H, Fisher-Aylor K, Antoshechkin I, DeSalvo G, See LH, Fastuca M, Drenkow J, Zaleski C, Dobin A, Prieto P, Lagarde J, Bussotti G, Tanzer A, Denas O, Li K, Bender MA, Zhang M, Byron R, Groudine MT, McCleary D, Pham L, Ye Z, Kuan S, Edsall L, Wu YC, Rasmussen MD, Bansal MS, Kellis M, Keller CA, Morrissey CS, Mishra T, Jain D, Dogan N, Harris RS, Cayting P, Kawli T, Boyle AP, Euskirchen G, Kundaje A, Lin S, Lin Y, Jansen C, Malladi VS, Cline MS, Erickson DT, Kirkup VM, Learned K, Sloan CA, Rosenbloom KR, Lacerda de Sousa B, Beal K, Pignatelli M, Flicek P, Lian J, Kahveci T, Lee D, Kent WJ, Ramalho Santos M, Herrero J, Notredame C, Johnson A, Vong S, Lee K, Bates D, Neri F, Diegel M, Canfield T, Sabo PJ, Wilken MS, Reh TA, Giste E, Shafer A, Kutyavin T, Haugen E, Dunn D, Reynolds AP, Neph S, Humbert R, Hansen RS, De Bruijn M, Selleri L, Rudensky A, Josefowicz S, Samstein R, Eichler EE, Orkin SH, Levasseur D, Papayannopoulou T, Chang KH, Skoultchi A, Gosh S, Disteche C, Treuting P, Wang Y, Weiss MJ, Blobel GA, Cao X, Zhong S, Wang T, Good PJ, Lowdon RF, Adams LB, Zhou XQ, Pazin MJ, Feingold EA, Wold B, Taylor J, Mortazavi A, Weissman SM, Stamatoyannopoulos JA, Snyder MP, Guigo R, Gingeras TR, Gilbert DM, Hardison RC, Beer MA, Ren B, Mouse ENCODE Consortium A comparative encyclopedia of DNA elements in the mouse genome. Nature. 2014;515:355–364. doi: 10.1038/nature13992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Zeiser E, Frøkjær-Jensen C, Jorgensen E, Ahringer J. MosSCI and gateway compatible plasmid toolkit for constitutive and inducible expression of transgenes in the C. elegans germline. PLoS One. 2011;6:e20082. doi: 10.1371/journal.pone.0020082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Zerbino DR, Johnson N, Juettemann T, Wilder SP, Flicek P. WiggleTools: parallel processing of large collections of genome-wide datasets for visualization and statistical analysis. Bioinformatics. 2014;30:1008–1009. doi: 10.1093/bioinformatics/btt737. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biology. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Zhang H, Gao L, Anandhakumar J, Gross DS. Uncoupling transcription from covalent histone modification. PLoS Genetics. 2014;10:e1004202. doi: 10.1371/journal.pgen.1004202. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: Siu Sylvia Lee

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Chromatin accessibility dynamics across C. elegans development and ageing" for consideration by eLife. Your article has been reviewed by Jessica Tyler as the Senior Editor, a Reviewing Editor, and three reviewers. The following individual involved in review of your submission has agreed to reveal her identity: Bérénice A Benayoun (Reviewer #1).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The reviewers appreciated the high impact of the work, particularly as an important resource for the community. The reviewers generally agreed that the experimental designs and data analyses were rigorous and the results were well described.

Essential revisions:

1) A major point that needs to be addressed is the potential caveats related to changes in cell numbers, as pointed out by reviewer 2. This can be addressed textually; however, the corresponding conclusions will need to be modified accordingly.

2) The reviewers suggested a number of clarifications, including additional quality control analyses, and clear discussion of differences in methods. These include all specific comments from reviewers 1 and 3 below.

Reviewer #1:

This manuscript by Janes et al. describes a novel dataset resource mapping chromatin elements throughout C. elegans lifespan, from development to adulthood, and, for the first time in worms, throughout several stages of adulthood. The authors identify >40K elements with accessible chromatin in at least one of the assayed time points, with a highly dynamic landscape of accessibility throughout worm lifespan. Using nuclear transcription profiling, to bypass the C. elegans-specific trans-splicing phenomenon, the authors refine annotations of >15K promoters and >19K enhancers. Both classes of elements seem to be able to drive bidirectional transcription based on follow-up reporter experiments in 2 lines.

The article does describe the resource dataset well, and the raw data is already readily available. This resource will be of broad interest to the aging and chromatin fields. A couple of points need to be further clarified, and some complementary analyses need to be performed for reproducibility and statistical soundness.

1) The young adult stage profiling is performed in 2 distinct contexts: WT N2 and the glp-1 mutant, serving as an anchor point between the developmental and adult datasets. The rationale for the switch is reasonable, but it would be important to discuss how similar/dissimilar the YA ATAC-seq data is between the two strains (N2 and glp-1 mutant).

2) In the Discussion section of similarity vs. dissimilarity with previous developmental datasets, it would also be important to include the state of culture: liquid vs. solid, as this may have broad consequences on gene activation patterns, notably in the muscle tissue. The Daugherty et al., paper used solid cultures, whereas this study had liquid cultures (according to subsection “Collection of developmental time series samples”). It would be important to include a significance of correlation test.

3) In the CV analysis (subsection “Patterns of histone marks at promoters and enhancers”/Figure 3), it may be useful to include a statistical analysis, for instance in the form of a significance of correlation test, and maybe a scatter plot.

4) In the Materials and methods section, the authors mention the use of "the Illumina TruSeq kit or a homemade equivalent". For the sake of reproducibility, it is important to include (i) a table listing which dataset was generated with each kit/method and (ii) a description of the "homemade equivalent" (steps, enzymes, suppliers used).

5) For the tissue-specific enrichment analysis (Materials and methods section), the use of a background and how it was selected needs to be included. This choice can greatly affect observed enrichment results.

6) Metagene analyses are convenient representation tools but (i) they smooth away variations in the data, potentially masking heterogeneity, and (ii) have no statistical support for changes described. To fully support conclusions, the authors need to include a complementary statistical analysis for the data reported in Figure 3B, Figure 1—figure supplement 1C/D, Figure 2—figure supplement 1B/C, Figure 3—figure supplement 1B. This can be done by either including the 95% confidence interval on the metagene plots or performing a quantitative analysis using a boxplot and a non-parametric Wilcoxon rank-sum test.

Reviewer #2:

The authors present whole animal ATAC-seq data for six developmental stages of C. elegans along with five stages of aging adults to define accessible sites. They also collect long nuclear RNA-seq data for each time point in both time series to help classify the accessible sites into promoter and enhancer types. To further augment the developmental series in particular, the authors collected chromatin ChIP-seq data and short, capped nuclear RNA to assess chromatin state and transcription initiation, respectively. After assigning a type to the accessible regions, they cluster the promoter sites based on their accessibility over the developmental time course and the aging time course. Using single cell data from Cao et al., (2017) they find that some site clusters are associated with target genes with tissue-specific expression. They also use the single cell data in conjunction with ChIP-seq TF data to show that some TFs are preferentially associated with genes expressed in certain tissues.

Overall, the paper is well-written and the data sets seem of high quality. The chromatin accessibility data should be of considerable interest to the C. elegans community, and their evaluation of chromatin marks relative to promoters, enhancers, and the patterns of expression of the target genes should be of interest to the wider chromatin community. The fact that most sites change over time is not surprising, given that the expression of most genes changes considerably over time. This conclusion and the cluster analysis is complicated by the fact that the fraction of cells from any given tissue changes over time, and thus the fraction of reads from a peak associated with that tissue will change, even if accessibility at that site within that cell does not change over time. The clearest example is the gonad, where cells from this tissue go from being less than 1% of the total cell (genome) number to more than half over the course of larval development. An accessible site in gonad cells that was constant throughout this expansion would be expected to show a large increase over time, simply from the increasing cell number. Similarly, intestinal cells double their genomes with each larval molt. Neurons and muscle increase in L1 but not thereafter, so proportionally the signal from their accessible sites would be expected to diminish over time, even if accessibility remained constant. Indeed, the patterns of most of the assigned clusters for the developmental time course follow the expectation from cell numbers. Finally, the analysis of TF ChIP-seq data seems quite similar to that of Cao et al., except that the authors condition their analysis on overlap of ChIP-seq sites with accessible sites. However, since the overlap of accessible sites with ChIP-seq sites is very high, it is not clear what is new here.
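The cell-number caveat raised in this comment can be illustrated with a small arithmetic sketch. It assumes a deliberately simple model in which the bulk signal at a site is the sum, over tissues, of the fraction of genomes contributed by a tissue multiplied by a constant per-cell accessibility. The function name, the two-tissue split and the fractions 0.01 and 0.50 are illustrative assumptions chosen to match the "less than 1%" and "more than half" figures mentioned for the gonad; they are not measurements from the paper.

```python
def bulk_signal(genome_fraction, per_cell_accessibility):
    """Whole-animal signal under a simple mixture model (an assumption):
    sum over tissues of (fraction of genomes) x (per-cell accessibility)."""
    return sum(
        fraction * per_cell_accessibility.get(tissue, 0.0)
        for tissue, fraction in genome_fraction.items()
    )

# Illustrative genome fractions early and late in larval development.
early = {"gonad": 0.01, "soma": 0.99}
late = {"gonad": 0.50, "soma": 0.50}

# Sites whose per-cell accessibility never changes over time.
gonad_site = {"gonad": 1.0}  # accessible only in gonad cells
soma_site = {"soma": 1.0}    # accessible only in somatic cells

fold_gonad = bulk_signal(late, gonad_site) / bulk_signal(early, gonad_site)
fold_soma = bulk_signal(late, soma_site) / bulk_signal(early, soma_site)

print(f"gonad-only site: {fold_gonad:.1f}-fold change in bulk signal")
print(f"soma-only site:  {fold_soma:.2f}-fold change in bulk signal")
```

Under these assumed fractions the gonad-only site appears to gain signal about 50-fold and the soma-only site appears to lose about half of its signal, although per-cell accessibility is constant in both cases.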

In sum, the paper presents data sets that will provide an important resource for the worm community. The biological insights are modest in the present paper, but its utility to the community should produce valuable insights over time.

Reviewer #3:

The Janes et al., manuscript reports ATAC seq profiling along a developmental and aging time course in whole worm C. elegans. In general, the experimental design and the data processing all appear rigorous and well done. By comparing the ATAC data with RNA-seq of nuclear RNAs, the authors define many promoters and putative enhancers. Moreover, the authors found that many of the accessible sites show dynamic developmental regulation, and the sites are often associated with tissue-specific gene expression. Overall, the data reported represent an important resource for the community.

1) It will be important to show quality control analyses to assess the reproducibility of the data / between replicates.

2) Similarly, MDS or PCA to show the overall relatedness of the data from the different time points will be helpful.

3) For the ATAC data comparison with previously published data, the authors indicated that different peak calling parameters likely account for the differences. It will be helpful for the authors to examine their data using the previously published analysis pipelines (or vice versa) and provide some quantification of the differences / similarities between the datasets.

eLife. 2018 Oct 26;7:e37344. doi: 10.7554/eLife.37344.039

Author response


Essential revisions:

Reviewer #1:

This manuscript by Janes et al. describes a novel dataset resource mapping chromatin elements throughout C. elegans lifespan, from development to adulthood, and, for the first time in worms, throughout several stages of adulthood. The authors identify >40K elements with accessible chromatin in at least one of the assayed time points, with a highly dynamic landscape of accessibility throughout worm lifespan. Using nuclear transcription profiling, to bypass the C. elegans-specific trans-splicing phenomenon, the authors refine annotations of >15K promoters and >19K enhancers. Both classes of elements seem to be able to drive bidirectional transcription based on follow-up reporter experiments in 2 lines.

The article does describe the resource dataset well, and the raw data is already readily available. This resource will be of broad interest to the aging and chromatin fields. A couple of points need to be further clarified, and some complementary analyses need to be performed for reproducibility and statistical soundness.

1) The young adult stage profiling is performed in 2 distinct contexts: WT N2 and the glp-1 mutant, serving as an anchor point between the developmental and adult datasets. The rationale for the switch is reasonable, but it would be important to discuss how similar/dissimilar the YA ATAC-seq data is between the two strains (N2 and glp-1 mutant).

We make no comparisons between the developmental and ageing datasets in this paper, hence we do not discuss similarities and differences. The primary difference between wild-type N2 and glp-1 young adults is the absence of germ line tissue in glp-1 young adults. To validate this difference in the ATAC-seq data and address the reviewer’s question about similarities and differences, we compared signal at protein-coding promoters based on the tissue with maximum expression according to (Cao et al., 2017). Promoters of genes with the highest expression in somatic tissues show correlations between 0.87 and 0.90 (Spearman rank correlation), validating similarity between wild-type and glp-1 somatic tissue. As expected, promoters of genes with the highest expression in the gonad show a lower correlation of 0.71. Many of these latter promoters are ubiquitously active, and so also have ATAC-seq signal in somatic tissues, leading to a relatively high correlation.

Author response image 1. Correlation of peak chromatin accessibility between wild-type young adult and glp-1 day 1 time points at protein-coding promoters, grouped by tissue with the highest gene expression.

Spearman's rank correlation was calculated between peak accessibility at protein-coding promoters in the wild-type young adult and glp-1 day 1 time points. Promoters were grouped by the tissue with highest gene expression of the target gene, according to L2 single-cell gene-expression data from (Cao et al. 2017).
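The per-tissue comparison described above can be sketched as follows. This is a minimal standard-library illustration, not the code used for the analysis: the input layout (tuples of tissue, wild-type signal, glp-1 signal), the function names and the toy values are all assumptions made for the example.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_by_tissue(promoters):
    """promoters: iterable of (tissue, wild_type_signal, glp1_signal)."""
    groups = {}
    for tissue, wt, glp1 in promoters:
        xs, ys = groups.setdefault(tissue, ([], []))
        xs.append(wt)
        ys.append(glp1)
    return {tissue: spearman(xs, ys) for tissue, (xs, ys) in groups.items()}

# Toy values only, not data from the study.
toy = [
    ("Muscle", 1.0, 2.0), ("Muscle", 2.0, 3.0), ("Muscle", 3.0, 9.0),
    ("Gonad", 1.0, 3.0), ("Gonad", 2.0, 1.0), ("Gonad", 3.0, 2.0),
]
result = spearman_by_tissue(toy)
print(result)
```

Because the statistic depends only on ranks, it is insensitive to monotonic differences in signal scale between the two samples, which is one plausible reason to prefer it for comparing libraries prepared under different conditions.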

2) In the Discussion section of similarity vs. dissimilarity with previous developmental datasets, it would also be important to include the state of culture: liquid vs. solid, as this may have broad consequences on gene activation patterns, notably in the muscle tissue. The Daugherty et al., paper used solid cultures, whereas this study had liquid cultures (according to subsection “Collection of developmental time series samples”). It would be important to include a significance of correlation test.

In the response to point 1 above, we compared accessibility of elements in wild-type young adults grown in liquid culture and glp-1 young adults grown on solid media. Notably, ATAC-seq signals at regulatory elements of genes with muscle biased expression show equally high correlation compared to those of other somatic tissues, suggesting that media type is unlikely to play a major role in accessibility differences. Nevertheless, we added to the paper that such a difference could have some contribution. Further, in response to this point and reviewer 3 point 3, we have expanded the analyses (adding new Figure 2—figure supplement 2. Effect of differences in peak calling methods on the types of identified accessible sites) and discussion of similarities and differences between previous datasets (see below).

3) In the CV analysis (subsection “Patterns of histone marks at promoters and enhancers”/Figure 3), it may be useful to include a statistical analysis, for instance in the form of a significance of correlation test, and maybe a scatter plot.

We have added a p-value confirming the statistical significance of the correlation.

4) In the Materials and methods section, the authors mention the use of "the Illumina TruSeq kit or a homemade equivalent". For the sake of reproducibility, it is important to include (i) a table listing which dataset was generated with each kit/method and (ii) a description of the "homemade equivalent" (steps, enzymes, suppliers used).

We have added a section to the Materials and methods section describing the protocol used to prepare the sequenced libraries.

5) For the tissue-specific enrichment analysis (Materials and methods section), the use of a background and how it was selected needs to be included. This choice can greatly affect observed enrichment results.

We have clarified this in the Materials and methods section as follows:

“Using single cell RNA-seq data from (Cao et al., 2017), we defined tissue-biased gene expression as follows: Gene expression was considered enriched in a given tissue if it had a fold-change >= 3 between expression in the tissues with highest and second highest levels and an adjusted p-value < 0.01. This defined 5,315 genes with tissue-biased expression (1432 in Gonad, 553 in Hypodermis, 799 in Intestine, 352 in Muscle, 1218 in Neurons, 447 in Glia, 514 in Pharynx). For each developmental or ageing cluster of promoters, we calculated the percentage of genes with biased expression in a given tissue relative to the total number of genes in the cluster. These values were plotted in Figure 4A and B (bar plots).”
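The definition quoted above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the adjusted p-value is taken as a precomputed input (the quoted text does not say how it was derived), the handling of a zero second-highest expression value is an assumption, and the gene names and numbers are hypothetical.

```python
def biased_tissue(expression, adj_p, min_fold=3.0, max_adj_p=0.01):
    """Return the tissue a gene is biased towards, or None.

    expression: dict mapping tissue name -> expression level.
    adj_p: precomputed adjusted p-value for the gene (assumed input).
    """
    ranked = sorted(expression.items(), key=lambda kv: kv[1], reverse=True)
    (top_tissue, top), (_, second) = ranked[0], ranked[1]
    if adj_p >= max_adj_p:
        return None
    if second == 0:
        # Assumption: expression confined to one tissue counts as biased.
        return top_tissue if top > 0 else None
    return top_tissue if top / second >= min_fold else None

def percent_biased(cluster_genes, gene_to_tissue):
    """Percentage of genes in a cluster biased to each tissue,
    relative to the total number of genes in the cluster."""
    counts = {}
    for gene in cluster_genes:
        tissue = gene_to_tissue.get(gene)
        if tissue is not None:
            counts[tissue] = counts.get(tissue, 0) + 1
    total = len(cluster_genes)
    return {tissue: 100.0 * n / total for tissue, n in counts.items()}

# Hypothetical genes: (expression by tissue, adjusted p-value).
genes = {
    "geneA": ({"Muscle": 30.0, "Neurons": 5.0, "Gonad": 1.0}, 0.001),
    "geneB": ({"Muscle": 10.0, "Neurons": 6.0, "Gonad": 1.0}, 0.001),
    "geneC": ({"Gonad": 12.0, "Muscle": 2.0, "Neurons": 1.0}, 0.2),
    "geneD": ({"Gonad": 9.0, "Muscle": 3.0, "Neurons": 1.0}, 0.005),
}
gene_to_tissue = {g: biased_tissue(expr, p) for g, (expr, p) in genes.items()}
percentages = percent_biased(list(genes), gene_to_tissue)
print(gene_to_tissue)
print(percentages)
```

In this toy cluster, geneB fails the fold-change threshold and geneC fails the p-value threshold, so only two of the four genes contribute to the percentages.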

6) Metagene analyses are convenient representation tools but (i) they smooth away variations in the data, potentially masking heterogeneity, and (ii) have no statistical support for changes described. To fully support conclusions, the authors need to include a complementary statistical analysis for the data reported in Figure 3B, Figure 1—figure supplement 1C/D, Figure 2—figure supplement 1B/C, Figure 3—figure supplement 1B. This can be done by either including the 95% confidence interval on the metagene plots or performing a quantitative analysis using a boxplot and a non-parametric Wilcoxon rank-sum test.

As requested, we have added 95% confidence intervals to Figure 2—figure supplement 1B,C; Figure 3B; Figure 3—figure supplement 1B. For Figure 1—figure supplement 1C and D only two data points (replicates) were available for each comparison group (assay/condition). We therefore highlighted the signal range for every group.
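For readers who want to reproduce this kind of summary, a per-position 95% confidence interval for a metagene profile can be sketched as below. This is a minimal Python sketch and not the code used for the revised figures: the function name `metagene_ci` is hypothetical, and the interval uses a normal approximation (mean plus or minus 1.96 standard errors), which is an assumption; the text does not state how the intervals were computed.

```python
import math
import statistics


def metagene_ci(profiles, z=1.96):
    """Return a list of (mean, lower, upper) tuples, one per position.

    profiles -- list of equal-length signal profiles, one per site
    z        -- normal quantile; 1.96 gives an approximate 95% interval
    """
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two profiles to estimate a standard error")
    summary = []
    # zip(*profiles) walks the profiles position by position.
    for values in zip(*profiles):
        mean = statistics.mean(values)
        sem = statistics.stdev(values) / math.sqrt(n)
        summary.append((mean, mean - z * sem, mean + z * sem))
    return summary
```

With only two data points per group, as in Figure 1—figure supplement 1C and D, an interval of this kind is poorly determined, which is consistent with showing the signal range instead.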

Reviewer #2:

The authors present whole animal ATAC-seq data for six developmental stages of C. elegans along with five stages of aging adults to define accessible sites. They also collect long nuclear RNA-seq data for each time point in both time series to help classify the accessible sites into promoter and enhancer types. To further augment the developmental series in particular, the authors collected chromatin ChIP-seq data and short, capped nuclear RNA to assess chromatin state and transcription initiation, respectively. After assigning a type to the accessible regions, they cluster the promoter sites based on their accessibility over the developmental time course and the aging time course. Using single cell data from Cao et al., (2017) they find that some site clusters are associated with target genes with tissue-specific expression. They also use the single cell data in conjunction with ChIP-seq TF data to show that some TFs are preferentially associated with genes expressed in certain tissues.

Overall, the paper is well-written and the data sets seem of high quality. The chromatin accessibility data should be of considerable interest to the C. elegans community, and their evaluation of chromatin marks relative to promoters, enhancers, and the patterns of expression of the target genes should be of interest to the wider chromatin community. The fact that most sites change over time is not surprising, given that the expression of most genes changes considerably over time. This conclusion and the cluster analysis is complicated by the fact that the fraction of cells from any given tissue changes over time, and thus the fraction of reads from a peak associated with that tissue will change, even if accessibility at that site within that cell does not change over time. The clearest example is the gonad, where cells from this tissue go from being less than 1% of the total cell (genome) number to more than half over the course of larval development. An accessible site in gonad cells that was constant throughout this expansion would be expected to show a large increase over time, simply from the increasing cell number. Similarly, intestinal cells double their genomes with each larval molt. Neurons and muscle increase in L1 but not thereafter, so proportionally the signal from their accessible sites would be expected to diminish over time, even if accessibility remained constant. Indeed, the patterns of most of the assigned clusters for the developmental time course follow the expectation from cell numbers.

When clustering promoter accessibility changes across development, clear patterns are visible. As noted by the reviewer, in some cases these patterns follow changes in cell number during development. However, this does not invalidate the clustering analysis, which aimed to identify groups of promoters with similar patterns of accessibility across development or ageing. The patterns could be due to changes in accessibility or changes in cell number. For example, through this method we identified different groups of promoters active in the germ line whose accessibility signals increase greatly as germ cell number increases. Other patterns, such as oscillating accessibility (observed in Cluster H) or decreased accessibility after embryogenesis (Clusters Mix6 and Mix7), are likely to be due to accessibility regulation rather than changes in cell number. We clarified in the manuscript that the accessibility changes may sometimes reflect changes in cell number rather than accessibility dynamics across development.
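The cell-number effect can be made concrete with a small calculation. The Python sketch below models the whole-animal signal at a site as the sum, over tissues, of the tissue's share of genomes multiplied by the per-genome accessibility in that tissue. The function name `bulk_signal` and the numeric fractions are hypothetical; the values 0.01 and 0.50 are illustrative stand-ins for the reviewer's "less than 1%" and "more than half", not measurements.

```python
def bulk_signal(genome_fraction, accessibility):
    """Whole-animal signal as the genome-fraction-weighted sum of
    per-tissue accessibility (tissues missing from `accessibility` count as 0)."""
    return sum(frac * accessibility.get(tissue, 0.0)
               for tissue, frac in genome_fraction.items())


# Hypothetical genome fractions early and late in larval development.
early = {"gonad": 0.01, "other": 0.99}
late = {"gonad": 0.50, "other": 0.50}

# A gonad-specific site whose per-genome accessibility never changes.
gonad_site = {"gonad": 1.0}

fold_change = bulk_signal(late, gonad_site) / bulk_signal(early, gonad_site)
print(f"apparent fold-change in bulk signal: {fold_change:.0f}")
```

Under these assumed fractions the bulk signal rises about 50-fold even though accessibility within gonad cells is constant, which is why a cluster pattern alone cannot distinguish regulation from a change in tissue proportions.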

Reviewer #3:

The Jänes et al. manuscript reports ATAC-seq profiling along a developmental and ageing time course in whole-worm C. elegans. In general, the experimental design and the data processing all appear rigorous and well done. By comparing the ATAC data with RNA-seq of nuclear RNAs, the authors define many promoters and putative enhancers. Moreover, the authors found that many of the accessible sites show dynamic developmental regulation, and the sites are often associated with tissue-specific gene expression. Overall, the data reported represent an important resource for the community.

1) It will be important to show quality control analyses to assess the reproducibility of the data / between replicates.

2) Similarly, MDA or PCA to show the overall relatedness of the data from the different time points will be helpful.

We have added PCA plots showing reproducibility and broad relatedness of the samples and assays in Figure 1—figure supplement 2.

3) For the ATAC data comparison with previously published data, the authors indicated that different peak calling parameters likely account for the differences. It will be helpful for the authors to examine their data using the previously published analysis pipelines (or vice versa) and provide some quantification of the differences / similarities between the datasets.

As suggested, we used MACS2 to call peaks on our ATAC-seq data as done by Daugherty et al., 2017 and then compared this set to the accessible sites presented in this paper, which were identified using a focal enrichment peak calling method. Overall, the MACS2 peak calls are very similar to the accessible sites reported in this study (91.5% overlap; Figure 2—figure supplement 2A). The overlapping sites are enriched for ChIP-seq peaks and transcription initiation signal, and they are depleted at exons. The accessible sites that do not overlap MACS2 peak calls are similar, showing transcription initiation signal and depletion at exons, but this set is depleted for ChIP-seq peaks, suggesting that they may be sites active in a small number of cells. The small fraction of MACS2 peak calls that do not overlap an accessible site (8.5%, n=2392) resemble the peak calls reported by Daugherty et al., 2017 or Ho et al., 2017 that are absent from our accessible sites. They are depleted for ChIP-seq peaks and transcription initiation signal, and they are enriched in annotated exons. The difference in peak calling methods can therefore account for some of the differences with previously published data. However, the fraction of such sites is relatively small, indicating that data-specific differences also contribute. Indeed, the peaks defined using MACS2 on the ATAC-seq data reported here still show substantial differences from those reported by Daugherty et al., 2017 (Figure 2—figure supplement 2B). We include this analysis as a new figure (Figure 2—figure supplement 2B) and discuss the results in the text (subsection “Defining and annotating regions of accessible DNA”). Contributing factors may be differences in signal-to-noise ratio or differences in growth methods (liquid vs solid media).
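An overlap percentage of this kind can be computed from two sets of bed-style intervals. The sketch below is a minimal Python illustration, not the pipeline used for the comparison: the function name `overlap_fraction` is hypothetical, intervals are assumed to be half-open `(chrom, start, end)` tuples, any shared base counts as an overlap, and the scan is a simple quadratic one rather than an indexed search.

```python
from collections import defaultdict


def overlap_fraction(query, reference):
    """Fraction of query intervals that overlap at least one reference interval.

    Both arguments are lists of (chrom, start, end) tuples using half-open,
    bed-style coordinates, so intervals that merely touch do not overlap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom.get(chrom, ())):
            hits += 1
    return hits / len(query) if query else 0.0
```

The result depends on which set is used as the query, so the fraction of MACS2 peaks overlapping accessible sites and the fraction of accessible sites overlapping MACS2 peaks are in general different numbers.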

Associated Data

    Data Citations

    1. Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [DNase, MNase] Gene Expression Omnibus. GSE114481
    2. Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [scap] NCBI Gene Expression Omnibus. GSE114490
    3. Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [lcap] Gene Expression Omnibus. GSE114483
    4. Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ChIP-seq] Gene Expression Omnibus. GSE114440
    5. Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ATAC-seq] Gene Expression Omnibus. GSE114439
    6. Down TA. 2013. The landscape of RNA polymerase II transcription initiation in C. elegans reveals a novel regulatory architecture. NCBI Gene Expression Omnibus. GSE42819

    Supplementary Materials

    Figure 1—source data 1. Accessible sites identified using ATAC-seq.

    ● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
    ● atac_%stage_height: maximum SPMR-normalized ATAC-seq signal at the peak in %stage (one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
    ● atac_source: source of the ATAC-seq peak call (see Materials and methods).
        ○ atac_wt_pe: wt (developmental) ATAC-seq treated as paired-end.
        ○ atac_wt_se: wt (developmental) ATAC-seq treated as single-end.
        ○ atac_glp1_se: glp-1 (ageing) ATAC-seq, single-end only.

    DOI: 10.7554/eLife.37344.006
    Figure 2—source data 1. Regulatory annotation of accessible sites.

    ● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
    ● chrom_ce11, start_ce11, end_ce11: as above, but lifted over to ce11.
    ● annot: final regulatory element type, obtained by combining strand-specific transcription patterns (see Materials and methods).
    ● annot_%strand: annotation of the strand-specific transcription patterns at the site (%strand is either fwd or rev).
    ● promoter_gene_id_%strand, promoter_locus_id_%strand, promoter_gene_biotype_%strand: WormBase gene id, locus id, biotype for sites annotated as coding_promoter, pseudogene_promoter or non-coding_RNA on %strand.
    ● associated_gene_id, associated_locus_id: WormBase gene id, locus id of genes whose gene body or outron region overlaps the site. These are defined for sites annotated as unassigned_promoter, putative_enhancer or other_element. If a site overlaps multiple genes, all overlaps are reported, separated by commas.
    ● tss_%strand_ce10: representative transcription initiation mode (Materials and methods) on %strand, ce10 coordinates.
    ● tss_%strand_ce11: as above, but lifted over to ce11.
    ● scap_%strand_passed: True or False based on whether the site has reproducible transcription initiation (Materials and methods).
    ● lcap_%stage_%strand_passed_jump: True or False based on whether the site passed the jump test for elongating transcription (Materials and methods; %stage is one of wt_emb, wt_l1, wt_l2, wt_l3, wt_l4, wt_ya, glp1_d1, glp1_d2, glp1_d6, glp1_d9, glp1_d13).
    ● lcap_%stage_%strand_passed_incr: True or False based on whether the site passed the incr test for elongating transcription (Materials and methods).

    DOI: 10.7554/eLife.37344.014
    Figure 4—source data 1. Element accessibility dynamics and promoter accessibility clusters in development and ageing.

    ● chrom_ce10, start_ce10, end_ce10: location of the accessible site (bed-style coordinates, ce10).
    ● devel_is_dynamic: True or False based on whether the site shows differential accessibility between any two developmental stages.
    ● ageing_is_dynamic: True or False based on whether the site shows differential accessibility between any two ageing time points.
    ● devel_prom_cluster_label: assigned developmental accessibility promoter cluster.
    ● ageing_prom_cluster_label: assigned ageing accessibility promoter cluster.
    ● HOTness: based on the number of transcription factors overlapping the accessible site, either HOT (19 or more factors), cold (between 1 and 18 factors) or none (zero factors).
    ● factor_count: number of transcription factors with binding sites overlapping the accessible site.
    ● factor_names: comma-separated list of the names of transcription factors with binding sites overlapping the accessible site.

    DOI: 10.7554/eLife.37344.021
    Figure 5—source data 1. TF datasets used for analyses.

    ● factor: transcription factor name.
    ● dataset_name: modENCODE/modERN DCC dataset name(s), separated by commas if multiple datasets from the same transcription factor were used.
    ● dataset_id: modENCODE/modERN DCC dataset ID(s), comma-separated as above.

    DOI: 10.7554/eLife.37344.023
    Transparent reporting form
    DOI: 10.7554/eLife.37344.024

    Data Availability Statement

    Sequencing data have been deposited as a SuperSeries in GEO under accession code GSE114494.

    The following datasets were generated:

    Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [DNase, MNase] Gene Expression Omnibus. GSE114481

    Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [scap] NCBI Gene Expression Omnibus. GSE114490

    Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [lcap] Gene Expression Omnibus. GSE114483

    Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ChIP-seq] Gene Expression Omnibus. GSE114440

    Julie Ahringer, Jürgen Jänes. 2018. Chromatin accessibility dynamics across C. elegans development and ageing [ATAC-seq] Gene Expression Omnibus. GSE114439

    The following previously published datasets were used:

    Down TA. 2013. The landscape of RNA polymerase II transcription initiation in C. elegans reveals a novel regulatory architecture. NCBI Gene Expression Omnibus. GSE42819


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
