Author manuscript; available in PMC: 2017 Jul 7.
Published in final edited form as: Mol Cell. 2016 Jun 30;63(1):167–178. doi: 10.1016/j.molcel.2016.05.032

Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals

Lionel A Sanz 1,*, Stella R Hartono 1,*, Yoong Wearn Lim 1,*, Sandra Steyaert 1, Aparna Rajpurkar 1, Paul A Ginno 1,2, Xiaoqin Xu 1, Frédéric Chédin 1,§
PMCID: PMC4955522  NIHMSID: NIHMS800865  PMID: 27373332

Abstract

R-loops are three-stranded nucleic acid structures formed upon annealing of an RNA strand to one strand of duplex DNA. We profiled R-loops using a high-resolution, strand-specific methodology in human and mouse cell types. R-loops are prevalent, collectively occupying up to 5% of mammalian genomes. R-loop formation occurs over conserved genic hotspots such as promoter and terminator regions of poly(A)-dependent genes. In most cases, R-loops occur co-transcriptionally and undergo dynamic turnover. Detailed epigenomic profiling revealed that R-loops associate with specific chromatin signatures. At promoters, R-loops associate with a hyper-accessible state characteristic of unmethylated CpG island promoters. By contrast, terminal R-loops associate with an enhancer- and insulator-like state and define a broad class of transcription terminators. Altogether, this suggests that the retention of nascent RNA transcripts at their site of expression represents an abundant, dynamic, and programmed component of the mammalian chromatin that impacts chromatin patterning and the control of gene expression.

ETOC BLURB

Sanz et al. (2016) develop a high-resolution methodology to map R-loop structures and report that in mammalian genomes co-transcriptional R-loop formation is prevalent, conserved and dynamic. The study highlights that promoters and terminators are hotspots of R-loop formation and associate with specific chromatin signatures.

INTRODUCTION

R-loops are three-stranded nucleic acid structures composed of an RNA:DNA hybrid and a displaced single-stranded DNA (ssDNA) loop. These non-B DNA structures have long been considered to result from rare and accidental entanglements of RNA with DNA during transcription. Aberrant R-loop formation has been linked to increased mutagenesis, hyper-recombination, rearrangements, and transcription-replication collisions as a result of topological, replication, or transcriptional stress (Aguilera and Garcia-Muse, 2012; Hamperl and Cimprich, 2014). R-loop dysfunction is also thought to underlie human diseases including Fragile X syndrome, Frontotemporal Dementia, Amyotrophic Lateral Sclerosis, and Ataxia with Ocular Apraxia type 2 (Skourti-Stathaki and Proudfoot, 2014).

The observation that R-loops form at class switch sequences in activated murine B cells provided the first evidence that R-loops form under physiological conditions (Yu et al., 2003). Our recent work showed that sites of stable DNA:RNA hybrid formation can be profiled genome-wide in human cells using DRIP-seq (DNA:RNA Immunoprecipitation coupled to high-throughput sequencing), a method that relies on the unique specificity of the S9.6 monoclonal antibody (Boguslawski et al., 1986) for RNA:DNA hybrids and R-loops independently of DNA sequence. This revealed that R-loops form normally over unmethylated CpG island (CGI) promoters (Ginno et al., 2012). The observation that R-loops also form at the 3’-end of a number of human genes (Ginno et al., 2013) together with evidence that terminal R-loop formation is a key step in transcription termination (Skourti-Stathaki et al., 2011), suggests that R-loops may possess more general physiological roles (Skourti-Stathaki and Proudfoot, 2014).

Despite the rising importance of R-loops in pathological and physiological processes, little remains known about their prevalence and distribution in the human genome, their conservation across cell types and species, their mechanisms of formation and turnover, and their possible functions. Here we used a novel iteration of our original method called DRIPc-seq to profile R-loops genome-wide at near base-pair resolution and in a strand-specific manner. Our data support the notion that the hybridization of nascent RNA transcripts to template DNA during transcription is a programmed event that associates with a variety of functional biological outputs underlying transcription regulation and chromatin patterning.

RESULTS

High-resolution, strand-specific profiling of RNA:DNA hybrids

We developed a high-resolution, strand-specific, R-loop profiling method termed DRIPc-seq (DNA:RNA Immunoprecipitation followed by cDNA conversion coupled to high-throughput sequencing). This method builds on DRIP, except that S9.6 immunoprecipitated fragments were further digested by DNase I to remove any DNA and the RNA strands were recovered, reverse-transcribed (RT) to cDNA, and subjected to a strand-specific RNA-seq protocol (see Experimental Procedures for details; Figure S1A). This technique allows the capture of a highly enriched population of RNA molecules involved in RNA:DNA hybrid interactions.

DRIPc-seq was performed on the human embryonic carcinoma Ntera2 cell line; DRIP-seq was performed in parallel for comparison. As expected, DRIP-seq resulted in robust signal but suffered from limited (~kilobase) resolution, higher background, and a lack of strand-specificity. Importantly, treatment of genomic DNA with Ribonuclease A (RNase A) prior to DRIP resulted in no significant difference in signal (Figure 1A and S1B). By contrast, pre-treatment of genomic DNA with purified human Ribonuclease H2 (RNase H) abolished R-loop signal. This shows that DRIP-seq allows the specific recovery of RNA:DNA hybrids. DRIPc-seq was strand-specific and showed lower background and greater resolution than DRIP-seq (Figure 1A). DRIP-seq and DRIPc-seq (referred to hereafter as DRIP and DRIPc, respectively) were reproducible between biological replicates and in strong agreement with each other (Figure S1B and S1C). The observation that DRIP signal (which is overwhelmingly RNase H-sensitive and results from direct sequencing of DNA fragments) and RNA-based DRIPc signal are in strong agreement argues that DRIPc specifically queries RNA strands involved in RNA:DNA interactions rather than residual free RNA. To further establish this point, we showed that RNase H treatment post DRIP abrogated our ability to detect signal after RT at all tested loci, including highly transcribed genes (Figure S1D). Likewise, DRIPc sequencing libraries could not be built from RNase H-treated material because not enough material could be recovered (data not shown). By contrast, pre-treatment with RNase A did not affect DRIPc (see below). This, together with the fact that DRIPc signal showed no enrichment over exons as would be expected from residual mRNA (Figure 1A), established that the method is specific. Finally, omission of the RT step led to complete lack of signal over tested loci, showing that DNA strands were fully digested by DNase I post DRIP (Figure S1D). Altogether, DRIPc-seq enables RNA:DNA hybrid profiling at high resolution and with strand-specificity in any genome.

Figure 1. Distribution, frequency, and dynamics of RNA:DNA hybrids.

(A) Screenshot of a representative genomic region. DRIPc data is shown in red (+ strand) and blue (− strand) (two independent replicates). DRIP-seq data is in green; DRIP-seq data after RNase A and RNase H pre-treatment is shown below (in khaki and teal, respectively). (B) Distribution of DRIPc peak numbers as a function of peak size. (C) Bar chart of DRIP-qPCR (as % input) for various loci. Each bar is the average of two independent experiments (shown with standard error). The effect of RNase A and RNase H pre-treatment on DRIP-qPCR is shown. (D) Location analysis of DRIPc peaks (right) compared to expected genomic distribution (left). The various regions are color-coded as per cartoon below. ‘X’ refers to an extended 3 kb region, as shown. (E) Number of DRIPc peaks in the sense and antisense orientations relative to gene transcription. (F) R-loops were measured by DRIP-qPCR at a number of loci through a time course post DRB treatment and wash (5’ indicates promoters; 3’ indicates terminators). The initial DRIP-qPCR value for each locus prior to DRB treatment (time zero) was normalized to 100%. Bars represent the average of two independent experiments with standard error.

Prevalent R-loop formation over genic regions

Based on DRIPc datasets, up to ~150 megabases of DNA distributed among ~70,000 peaks can exist under an R-loop form, representing 5% of the human genome. This is in close agreement with DRIP-based data generated from primary fibroblasts (Lim et al., 2015). While individual R-loop footprints cannot be measured, population average DRIPc peaks showed a median size of 1.5 kb (Figure 1B). We used DRIP-qPCR to estimate steady-state R-loop formation frequencies. At positive loci, frequencies ranged from ~2–12% of input depending on the locus, while negative loci (intergenic regions and untranscribed genes) ranged from 0.01 to 0.1% (Figure 1C). As expected, DRIP-qPCR signal was insensitive to RNase A and abrogated by RNase H pre-treatment. Thus, DRIP allows assessment of RNA:DNA hybrid formation with a 100-fold dynamic range.
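As a rough check on the figures quoted above, the arithmetic can be sketched as follows. The genome size of ~3,100 Mb is an assumption introduced here for illustration (it is not stated in the text), and the function names are hypothetical.

```python
# Back-of-the-envelope check of the genome fraction quoted in the text.
R_LOOP_MB = 150      # upper estimate of R-loop-forming DNA (megabases), from the text
N_PEAKS = 70_000     # approximate DRIPc peak count, from the text
GENOME_MB = 3100     # ASSUMPTION: approximate haploid human genome size in megabases


def genome_fraction_percent(covered_mb, genome_mb):
    """Percentage of the genome covered by peaks."""
    return 100.0 * covered_mb / genome_mb


def mean_peak_kb(covered_mb, n_peaks):
    """Mean peak size in kilobases implied by total coverage and peak count."""
    return 1000.0 * covered_mb / n_peaks


if __name__ == "__main__":
    print(round(genome_fraction_percent(R_LOOP_MB, GENOME_MB), 1))  # about 5%
    print(round(mean_peak_kb(R_LOOP_MB, N_PEAKS), 2))
```

Under these assumptions the implied mean peak size is somewhat larger than the reported 1.5 kb median, which is what one would expect if the peak-size distribution has a tail of long peaks.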

The vast majority of DRIPc signal mapped onto RNA polymerase II-dependent genes, with 2- and 3-fold over-representation at promoter and terminal regions, respectively (Figure 1D). The largest fraction of R-loops was observed over gene bodies. Intergenic signal was depleted over 5-fold and a significant fraction of this signal corresponded to unannotated transcripts and extensions of putative long non-coding RNA genes (Figure S1E). In contrast to yeast (El Hage et al., 2014), little signal could be detected over tRNA genes in human (Figure S1F). Genic R-loop formation was highly strand-specific, with over 90% of the R-loop signal resulting from RNAs transcribed co-linearly with transcription (Figure 1E). This, together with the strong agreement of DRIP and DRIPc signals (Figure 1A, S1C), argues for co-transcriptional structure formation in cis. In agreement, R-loop signal (measured by DRIPc-seq) was positively correlated with gene expression (measured by RNA-seq) (Figure S1G).

Co-transcriptional R-loop formation is dynamic

To further test the dependence of R-loop formation on active transcription, we treated cells with 5,6-Dichloro-1-β-D-ribofuranosyl-benzimidazole (DRB), a Cdk9 inhibitor that rapidly arrests transcription elongation. R-loops were then measured at time points post treatment by DRIP-qPCR for a range of promoter and terminal loci. At promoters, a rapid drop in R-loop levels followed DRB treatment (Figure 1F, left). Most structures disappeared within 30 minutes post transcription block, with loci showing an average 10-minute half-life. At terminal regions, a clear albeit slower kinetic response was also observed. Such delayed response is expected given that RNA polymerase II complexes released from the promoter prior to DRB addition must travel the distance of the gene before R-loops can be resolved. For instance, while R-loops dropped rapidly at the TRIM33 promoter, TRIM33 terminator R-loops held steady for 60 min and only began to drop at 120 min. This response is consistent with elongation rates of ~2 kb/min (Jonkers and Lis, 2015) and the fact that TRIM33 is 118 kb long. R-loops at the 3’-ends of SRRT, REXO4, and CALM3, three short genes (9.5 – 13.6 kb), showed a half-life of about 30 min, which can roughly be accounted for by assuming a ~10 min travel time through the gene body and a 10–20 min half-life. Washing out DRB two hours after treatment led to the progressive reappearance of R-loop structures. At promoters, R-loops started appearing within 10 minutes post-wash but steady-state pre-DRB levels were only reached after two hours (Figure 1F, right). The kinetics of R-loop re-emergence post DRB was similar at terminator regions of short genes but markedly slower for the longer TRIM33 gene, as expected. Altogether, our data support a co-transcriptional origin for most instances of R-loop formation and show that R-loops are dynamically created and resolved.
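The reasoning behind the delayed terminal response can be sketched as a toy calculation. This is an illustrative simplification, not the analysis used in the study: it assumes a constant elongation rate, a lag equal to the polymerase travel time through the gene, and first-order decay afterwards. The function names are hypothetical.

```python
def travel_time_min(gene_length_kb, rate_kb_per_min=2.0):
    """Time for a polymerase released before the block to reach the gene end."""
    return gene_length_kb / rate_kb_per_min


def remaining_fraction(t_min, lag_min, half_life_min):
    """Fraction of R-loop signal left at time t under a lag-then-decay model.

    ASSUMPTION: signal is unchanged during the lag, then decays exponentially.
    """
    if t_min <= lag_min:
        return 1.0
    return 0.5 ** ((t_min - lag_min) / half_life_min)


if __name__ == "__main__":
    # TRIM33 (118 kb) at ~2 kb/min: terminal R-loops should hold for about an hour.
    print(travel_time_min(118))               # 59.0
    # Promoter locus (no lag) with a 10-minute half-life, 30 minutes after the block.
    print(remaining_fraction(30, 0, 10))      # 0.125
    # Short genes from the text (9.5 to 13.6 kb) have lags of only a few minutes.
    print(travel_time_min(9.5), travel_time_min(13.6))
```

With a 10-minute half-life and no lag, about one eighth of the signal remains after 30 minutes, in line with the observation that most promoter structures had disappeared by then.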

R-loop formation over promoter and poly(A)-dependent termination regions is a prevalent feature of expressed human genes

Promoter R-loop formation was observed for 8,112 genes. R-loop (+) promoters were enriched for CpG island promoters (Figure S2A–D), as expected (Ginno et al., 2013). Promoter R-loop signal rose downstream of the transcription start site (TSS) and reached a maximum about 1.5 kb downstream (Figure 2A, 2B), suggesting that RNA:DNA hybrid formation often spans the promoter-proximal part of the first intron. GC skew, which measures strand asymmetry in the distribution of guanines and cytosines, was associated with promoter R-loop formation (Figure 2A, 2B), consistent with the higher thermodynamic stability of RNA:DNA hybrids carrying G-rich RNA strands (Ratmeyer et al., 1994).
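GC skew is conventionally computed as (G − C)/(G + C) over a window of sequence. A minimal sketch follows; the window and step sizes are arbitrary choices made here for illustration, and this simple sliding window is not the SkewR algorithm cited in the text.

```python
def gc_skew(seq):
    """(G - C) / (G + C) for one sequence; 0.0 when the window has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def sliding_gc_skew(seq, window=100, step=50):
    """List of (start, skew) pairs for windows tiled along the sequence.

    ASSUMPTION: window and step are illustrative defaults, not values from the study.
    """
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a G-rich block followed by a C-rich block.
    toy = "G" * 100 + "C" * 100
    for start, skew in sliding_gc_skew(toy, window=100, step=100):
        print(start, skew)
```

When the sequence is read on the non-template strand, positive skew corresponds to a G-rich nascent RNA, which is the configuration the text associates with stable hybrids.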

Figure 2. Prevalent R-loop formation.

(A, B, C, D) Left panel: Metaplot of DRIPc signal centered on TSS (A, B) or PAS (C, D). Red: + strand signal; blue: - strand signal. All genes oriented to the right. Green: average GC skew for corresponding genes. Right panel: Representative screenshots for each gene class. (A) All R-loop forming TSSs. (B) Bidirectional R-loop forming TSSs. (C) All R-loop forming terminators. (D) R-loop forming colliding terminators. (E) Pie chart indicating the fraction of genes (in %) carrying gene body R-loop signal (in % gene body covered by signal grouped by decile). (F) Length and expression characteristics of “sticky” genes versus “normal” R-loop forming genes. (G) Representative screenshot of a “sticky” gene, BRD3. GC skew is indicated as red (positive skew) and blue (negative skew) blocks.

Terminal R-loop formation was observed for 9,320 genes. Terminal R-loop signal was broad and peaked just before the polyadenylation site (PAS) (Figure 2C). Colliding genes showed strong terminal signal emanating from each gene (Figure 2D). Terminal R-loops were accompanied by only modest GC skew transitions around PAS; most genes with 3’ R-loops did not associate with significant GC skew as measured by the SkewR algorithm (Ginno et al., 2013). Interestingly, terminal R-loop formation was specifically observed for poly(A)-dependent genes. Highly expressed genes that do not undergo poly(A)-dependent processing, such as histone genes, certain long non-coding RNAs, and many RNA polymerase III transcripts, did not show evidence of terminal R-loops.

Gene body R-loop formation is prevalent

RNA-DNA interactions in gene bodies were frequent (Figure 2E). In general, the cumulative length of R-loop peaks in a gene was proportional to that gene’s length (Figure S2E), indicating that transcription elongation through longer genes may be more R-loop prone. For nearly two-thirds of genes, R-loop signal occupied less than 30% of the gene body sequence space (Figure 2E). However, R-loop signal extended over 50–100% of the gene body for a quarter of genes. These “sticky” genes tended to be shorter and/or highly transcribed (Figure 2F, S2F) and some were accompanied by extensive gene body GC skew (Figure 2G). Gene body signal was highly reproducible (Figure 2G and below), arguing that RNA-DNA interactions during elongation occur at specific positions along genes.

R-loop formation is conserved within human cell types and across species

To address R-loop conservation, we performed DRIP-seq in human K562 (erythroleukemia) cells and in mouse E14 embryonic stem cells (ESC) and NIH3T3 cells (embryonic fibroblast). Our recent R-loop map from human primary fibroblasts (Lim et al., 2015) was also analyzed and DRIPc-seq was performed on NIH3T3 after RNase A pre-treatment. R-loop formation was prevalent, biased over genes, and enriched over promoters and terminators in both human and mouse cells (Figure S3A). Importantly, R-loop formation was conserved across cell types and species. For instance, regions in human chromosome 14 that are syntenic to mouse chromosome 12 and 14 displayed high conservation of DRIP signal (Figure 3A). This conservation was reflected by high Pearson correlation when comparing DRIP signals over orthologous genes (Figure 3B, left) or over 1 kb syntenic genomic tiles (Figure 3B, right). In most cases, differences in R-loop formation over orthologous genes could be explained by differences in gene expression (Figure S3B, S3C). We also analyzed how conservation of signal varied over genic sub-regions and determined that promoter and terminal regions showed the highest agreement, ranging from 40–60% when comparing within species, to 25–40% when comparing between species (Figure 3C). A slightly lower but significant conservation was observed throughout gene body regions. At higher resolution using DRIPc datasets in human and mouse, conservation was also strong, in particular over promoters and terminators (Figure S3D, E). To test whether DNA sequence conservation underlies R-loop conservation, we compared DNA sequence between orthologous R-loop (+) and (−) regions. Conservation was significantly higher for terminal and promoter regions but not for gene bodies (Figure 3D). Altogether, this establishes that R-loops often form at specific conserved loci in the human and mouse genomes.
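The gene-level comparison described above amounts to a Pearson correlation between paired signal values. A minimal sketch is given below; the log transform with a pseudocount and the example values are assumptions made for illustration and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def log_signal(values, pseudocount=1.0):
    """ASSUMPTION: log2(x + pseudocount) to tame the wide range of coverage values."""
    return [math.log2(v + pseudocount) for v in values]


if __name__ == "__main__":
    # Hypothetical DRIP signal over four orthologous gene pairs.
    human = [10.0, 50.0, 200.0, 5.0]
    mouse = [12.0, 40.0, 150.0, 8.0]
    print(pearson(log_signal(human), log_signal(mouse)))
```

The same function applies unchanged to 1 kb syntenic tiles once each human tile has been paired with its mouse counterpart.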

Figure 3. R-loop conservation.

(A) Top: Circos plot depicting DRIP signal (red = high, green = low) along human chromosome 14 in 1 Mb bins for human Ntera2 (NT2), K562, and fibroblasts (Fibro) as well as murine E14 and NIH3T3 (3T3). Regions from mouse chromosome 14 and 12 syntenic to human chromosome 14 are connected by ribbons. Bottom: zoomed-in region; genes are indicated below. (B) Human v. mouse Pearson correlation of DRIP signal over orthologous genes (left) and 1 kb genomic tiles (right). (C) Conservation of R-loop signal between mouse and human cell lines broken down by gene parts. (D) Sequence conservation over genic regions for R-loop (+) (orange) and (−) (blue) loci matched for expression. P-values (Wilcoxon Mann-Whitney one-tailed) indicate significance.

R-loop regions exhibit increased overall chromatin accessibility

Our DRIPc maps enabled us to determine with unprecedented resolution if R-loop formation associates with particular chromatin states. For this we obtained ENCODE and other datasets representing a range of chromatin marks including DNase-hypersensitivity, micrococcal nuclease cleavage, a number of histone modifications, and various chromatin binding factors (Supplementary Table 1). To determine if a chromatin trait was enriched or depleted over R-loop loci, we measured the intensity of that trait over DRIPc peaks and compared it to the intensity for a shuffled set of non-R-loop-forming peaks. To ensure adequate comparisons, we treated promoter, gene body, and terminal loci separately. Shuffling was conducted within each genic sub-region while strictly matching peak size and location relative to TSS and PAS and matching gene expression levels and gene lengths (Supplementary Experimental Procedures). This analysis allowed us to identify chromatin signals that distinguish R-loop positive (+) and negative (−) loci without interference from location or expression.
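The comparison against shuffled controls can be sketched as a Monte Carlo test. This is a generic illustration, not the procedure from the Supplementary Experimental Procedures: it assumes the pool of control loci has already been matched for size, position, expression, and gene length, and all names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def monte_carlo_enrichment(observed, control_pool, n_iter=1000, seed=0):
    """Compare mean signal over R-loop (+) loci with matched random draws.

    observed     -- trait intensity at each R-loop (+) locus
    control_pool -- trait intensity at matched R-loop (-) loci (ASSUMED pre-matched)
    Returns (observed mean, fold over null mean, one-sided empirical p-value).
    """
    rng = random.Random(seed)
    obs = mean(observed)
    k = len(observed)
    null_means = [mean(rng.sample(control_pool, k)) for _ in range(n_iter)]
    n_ge = sum(1 for m in null_means if m >= obs)
    p_value = (n_ge + 1) / (n_iter + 1)  # add-one correction avoids p = 0
    fold = obs / mean(null_means)
    return obs, fold, p_value


if __name__ == "__main__":
    # Hypothetical intensities: five R-loop (+) loci and eight matched controls.
    rloop_pos = [10.0, 12.0, 9.0, 11.0, 10.5]
    controls = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(monte_carlo_enrichment(rloop_pos, controls))
```

Testing for depletion works the same way with the inequality reversed; in this sketch each genic sub-region would be run separately, mirroring the separate treatment of promoter, gene body, and terminal loci.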

All R-loop (+) regions showed significantly higher DNase accessibility compared to R-loop (−) loci (Figure 4). For promoters, increased accessibility was most marked 1–2 kb upstream of the R-loop peak, corresponding to the TSS. DNase accessibility was maximal at the center of the R-loop peak for gene bodies and was broadly distributed around R-loop (+) terminators. Patterns of MNase cleavage confirmed the presence of a hyper-accessible region over the TSS of R-loop (+) promoters (Figure 4). However, R-loop (+) regions in gene bodies and terminators were less prone to MNase cleavage despite higher DNase accessibility. This pattern could reflect a reduced MNase cleavage efficiency at R-loop structures or the possibility that R-loops favor the deposition of labile nucleosomes and/or are bound by non-histone proteins. FAIRE-seq is a third method by which the accessibility of chromatin can be measured (Giresi et al., 2007). R-loop (+) promoters were characterized by a clear second peak of FAIRE-seq signal overlapping with the R-loop peak. This second TSS-distal peak was conspicuously absent in R-loop (−) promoters (Figure 4), suggesting that R-loop formation associates with increased chromatin accessibility. Gene body R-loop peaks also coincided with a clear FAIRE-seq peak compared to matched R-loop (−) controls. At terminal regions, however, R-loop (+) regions showed lower FAIRE accessibility than matched R-loop (−) terminators, suggesting a mixed chromatin architecture characterized by DNase hyper-accessibility and lower MNase and FAIRE sensitivity.

Figure 4. R-loop forming regions are associated with open chromatin and high RNA polymerase occupancy.

All metaplots are centered over R-loop peaks. R-loop (+) and R-loop (−) represent loci with and without R-loops, respectively. R-loop (+) and (−) genes were matched for expression and grouped into four expression quartiles (see inset in DRIPc metaplot). Metaplots of DNase, MNase, FAIRE, and DRIPc signal are arranged top to bottom and further broken down by promoter, gene body, and terminal regions from left to right. The horizontal boxplots at the bottom of the promoter and terminal columns indicate the positions of TSS and TTS, respectively. Lines indicate the median value of signal at each position; the 95% confidence interval is shaded.

R-loop (+) promoters show enhanced epigenomic signatures characteristic of open, active promoters

H3K4 di- and tri-methylation, along with H3K9 and H3K27 acetylation, were significantly enriched over promoter R-loops (Figure 5A, 5B). Levels of these four histone marks were ~2-fold higher for R-loop (+) promoters compared to matched R-loop (−) controls. As noted for DNase hypersensitivity, increased deposition was observed over TSSs (Figure 5B). H3K4 mono-methylation was also significantly enriched but enrichment was directly over the R-loop peak as well as upstream of the TSS (Figure 5B, 5C). A modest but significant enrichment of the histone variant H3.3 was observed around TSSs of R-loop (+) promoters, along with significantly enhanced recruitment of the transcription elongation mark H3K36me3 into gene bodies (Figure 5B, 5C). Increased levels of H3K36me3 coincided with an elevated density of active RNA polymerase (RNAP) complexes as measured by PRO-seq (Kwak et al., 2013) (Figure 5B, S4A). Overall, R-loop (+) promoters displayed chromatin hyper-accessibility, hyper-acetylation, hyper-methylation of H3K4, and enhanced H3K36me3 deposition compared to expression-matched R-loop (−) promoters. In addition, we observed depletion of the heterochromatic mark H3K9me3 (Figure 5B) and enhanced protection against DNA methylation at R-loop (+) promoters as well as promoter-proximal R-loop peaks (Figure 5D, S4B, S4C), indicative of a hyper-protected state against epigenetic silencing. Analysis of chromHMM states (Ernst and Kellis, 2012) confirmed significant associations with strong promoters, enhancers, and transcriptional transition states accompanied by a depletion of the heterochromatic state (Figure S4D).

Figure 5. R-loop forming promoters associate with specific histone marks, chromatin binding factors, and DNA hypomethylation.

(A) Heatmap showing chromatin marks and chromatin binding factors enrichment or depletion over R-loop regions (promoter, gene body and terminal) relative to shuffled regions. Grey indicates effect size less than 5%. NS: not significant; other differences show p < 0.05 (Monte Carlo). (B) Metaplots of ChIP-seq signal for H3K4me1/2/3, H3K9ac, H3K27ac, H3K36me3, H3K9me3, H3.3 and PRO-seq with DRIPc signal at promoters. Colors and organization are as described for Figure 4. (C) Overlay of H3K4me1 (left) and H3K36me3 (right) ChIP-seq over DRIPc signal. ChIP-seq signal is broken down between R-loop (+) genes (red) and expression-matched R-loop (−) genes (blue); highly expressed genes (Q4) are shown. Green lines represent DRIPc signal for R-loop (+) (solid line) and R-loop (−) (dotted line) Q4 genes; 95% confidence intervals are shaded. (D) Boxplot of DNA methylation over a TSS-centered 4 kb region (low expression quartile, Q1, is shown). Each boxplot pair corresponds to a 200 bp window over R-loop (+) (orange) and expression-matched R-loop (−) (blue) promoters; median is shown by a black line. For each window, p-values indicate the significance of any difference between R-loop (+) and (−). Dotted lines indicate average DNA methylation values over the entire region. (E) Metaplots of ChIP-seq signal for RBBP5, PAF1, HDAC2, SIN3A, SAP30, PHF8, KDM4A, and ZNF274 with DRIPc signal.

The levels of multiple chromatin binding factors also responded to R-loop formation. RBBP5, a core component of the COMPASS H3K4 methyltransferase complexes (Shilatifard, 2012) was significantly enriched at R-loop (+) promoters (Figure 5A, 5E). PAF1, a subunit of the RNAP-interacting PAF1 complex (PAF1C), was also enriched (Figure 5A, 5E), together with the p300 acetyltransferase (Figure 5A, S4E). SIN3A, SAP30 and HDAC2, components of the SIN3 complex, were similarly enriched around the TSS of R-loop (+) genes (Figure 5A, 5E). KDM4A and PHF8, two histone demethylases targeting H3K9/H3K36 and H3K9/H4K20, respectively (Klose et al., 2006; Liu et al., 2010) were also highly enriched, consistent with the observed reduction of H3K9me3 levels. ZNF274, a KRAB zinc finger protein recruited to H3K9me3-marked regions (Frietze et al., 2010) was conversely depleted (Figure 5A, 5E). Finally, we observed increased recruitment of the EZH2 methyltransferase and increased deposition of the polycomb mark H3K27me3 over R-loop (+) promoters in a manner inversely proportional to gene expression (Figure S4E). Overall, our data suggest that promoter R-loop formation is accompanied by the enhanced recruitment or depletion of specific chromatin binding factors, resulting in unique epigenomic signatures.

Terminal R-loops show an enhancer and insulator chromatin state

R-loop (+) terminator regions showed the strongest enrichment for enhancer marks H3K4me1 and p300 (Heintzman et al., 2007) (Figure 5A, 6A). In agreement, R-loop (+) terminators were 2–3 times more likely to overlap with annotated enhancers (Figure S5A) and the “strong enhancer” state was the most significantly enriched chromHMM state over terminal R-loops (Figure S4D). Whether such enhancers are active is unclear given the low H3K27ac levels and the slight enrichment of H3K27me3 (Figure S5B). Overall, as judged by increased DNase accessibility (Figure 4), H3K4me1 deposition, p300 recruitment, and enhancer annotations, terminal R-loops associate with an enhancer state. Terminal R-loop formation was also associated with increased levels of CTCF, a zinc finger DNA binding protein with insulator function that serves as a major organizer of chromatin loops (Bell et al., 1999) (Figure 6B). A significant enrichment for RAD21, a component of the cohesin complex that often co-localizes with CTCF (Parelho et al., 2008), and for ZNF143 was also observed. ChromHMM confirmed that the “insulator” state was specifically enriched at terminal R-loop regions, not promoters (Figure S4D). Recent evidence suggests that terminal R-loops associate with a condensed, H3K9me2/3-marked, state (Skourti-Stathaki et al., 2014). Epigenomic profiling of human R-loop (+) termini, however, indicated that these regions are DNase hyper-accessible (Figure 4) and depleted for H3K9me3 (Figure 5A, S5B) and heterochromatic traits (Figure S4D). Analysis of H3K9me2 and other datasets available in mouse broadly confirmed that terminal R-loops are open and enriched for H3K4me1 and H3K36me3 (Figure S5C) compared to matched R-loop (−) termini. However, no global enrichment for H3K9me2/3 could be observed, although this association may hold at specific genes.

Figure 6. R-loop forming terminators are associated with an enhancer- and insulator-like state and show characteristics of transcription terminators.

(A) Metaplots of ChIP-seq signal for H3K4me1, p300, with DRIPc signal over terminal R-loop regions. The plots are centered on terminal R-loop peaks. Color code is as described for Figure 4. (B) Same as (A) except CTCF, RAD21 and ZNF143 were analyzed. (C) Same as (A) except PAF1, PRO-seq and H3K36me3 were analyzed. (D) Overlay of PRO-seq (left) and H3K36me3 ChIP-seq (right) over highly expressed (Q4) terminal DRIPc signal for R-loop (+) (red) and R-loop (−) genes. See Figure 5C for color codes. (E) Nearest neighbor gene distances for expressed (>Q1) colliding gene neighbors where none, one, or two of the neighbors form terminal R-loops. P-values were determined by a Wilcoxon Mann-Whitney test.

Terminal R-loop regions show characteristics of transcription terminators

PAF1 was also highly enriched at the 3’-ends of R-loop forming genes (Figure 6C). This observation is significant given that PAF1C interacts with 3’-end processing factors such as the Cleavage and Polyadenylation Specificity Factors (CPSF) (Tomson and Arndt, 2013) and that R-loops have been linked to transcription termination (Skourti-Stathaki et al., 2011). To analyze the association between terminal R-loops and termination pathways further, we measured the density of active RNAP taking advantage of PRO-seq datasets. R-loop (+) termini showed a significantly greater RNAP density compared to expression-matched R-loop (−) termini (Figure 6C). RNAP density for R-loop (+) loci started rising as early as 4 kb before PAS, coinciding with increased R-loop signal (Figure 6D). In contrast, RNAP density for R-loop (−) loci remained flat before PAS. At PAS, a sharp rise of RNAP density was observed, most likely reflecting termination-associated stalling of the RNAP complex. RNAP density post-PAS peaked higher, earlier, and started to return back down sooner and at a faster rate for R-loop (+) than for R-loop (−) termini. As a result, RNAP density returned towards background levels earlier for R-loop forming termini, consistent with the notion that terminal R-loops favor transcription termination. In agreement, H3K36me3 levels rose up to the R-loop peak and rapidly decreased past that point for R-loop (+) genes (Figure 6C). By contrast, R-loop (−) genes showed lower H3K36me3 levels and a decreased rate of signal decay (Figure 6C, D). Thus, the transcription complex rapidly dissociates over terminal R-loops, a characteristic of transcription terminators. In general, genes with R-loop (+) terminators tended to be closer to their nearest 3’ neighbor compared to expression-matched terminal R-loop (−) genes (median distances 12 kb versus 17 kb, respectively, p-value < 2e−5). This trend was true for co-directional gene neighbors and particularly noticeable for colliding genes where the presence of one or two R-loop (+) terminator(s) was correlated with shorter and shorter inter-gene distances (Figure 6E). Conversely, colliding genes with short inter-gene distances were enriched for R-loop (+) terminators compared to colliding genes with long inter-gene distances for which they were depleted (Figure S6A). This suggests that terminal R-loop formation defines a class of transcription terminators that are enriched for genes separated by shorter intergenic distances.

The position of terminal R-loops influences the position of transcription termination

The broadness of the R-loop signal observed around PAS in metaplots suggests that early, middle, and late R-loop-forming loci may exist. To test this, we systematically determined the center of gravity of the R-loop signal for each annotated terminal peak and assigned the position of that point to 1 kb bins distributed +/− 5 kb around PAS. Bins ranging from −5 to −2 kb were called “early”, bins from −2 to +2 kb were called “middle” and bins from +2 to +5 kb were deemed “late” (Figure 7A). This allowed us to determine if the position of R-loops relative to PAS influences termination as measured through the patterns of accumulation and clearance of RNAP, H3K36me3, and PAF1. RNAP density clearly responded to R-loop position, as measured by PRO-seq, NET-seq, and RNAP ChIP-seq data (Figure 7A, S6A). Genes with early R-loops showed the earliest accumulation of RNAP post PAS followed by the earliest return of RNAP density back to background levels. By contrast, genes with late R-loops showed a profound delay in RNAP accumulation and a much reduced rate of RNAP clearance from the chromatin template (Figure 7A, S6A). Similar patterns were observed for H3K36me3. Finally, early and late R-loop-forming terminators showed predominant PAF1 levels over early and late regions, respectively, with middle R-loop-forming genes showing highest PAF1 recruitment around PAS. This suggests that the position of the R-loop signal relative to PAS influences the transcription termination process.
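The center-of-gravity classification described above can be sketched as a signal-weighted mean position followed by a lookup into the early, middle, and late ranges. The handling of values falling exactly on a boundary (−2 or +2 kb) is an assumption made here, and the function names are hypothetical.

```python
def center_of_gravity(positions_kb, signal):
    """Signal-weighted mean position (kb relative to PAS) of a terminal peak."""
    total = sum(signal)
    if total <= 0:
        raise ValueError("peak has no signal")
    return sum(p * s for p, s in zip(positions_kb, signal)) / total


def classify_terminal_peak(cog_kb):
    """Assign a peak to 'early', 'middle', or 'late' from its center of gravity.

    ASSUMPTION: boundary values at -2 and +2 kb are counted as 'middle';
    peaks outside the +/- 5 kb window are left unclassified (None).
    """
    if cog_kb < -5 or cog_kb > 5:
        return None
    if cog_kb < -2:
        return "early"
    if cog_kb <= 2:
        return "middle"
    return "late"


if __name__ == "__main__":
    # Hypothetical peak: coverage sampled at 2 and 4 kb downstream of PAS.
    cog = center_of_gravity([2.0, 4.0], [1.0, 3.0])
    print(cog, classify_terminal_peak(cog))   # 3.5 late
```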

Figure 7. The position of the R-loop signal correlates with the position of transcription termination.

(A) Metaplots of DRIPc, PRO-seq, H3K36me3, and PAF1 signals over terminal R-loop (+) genes (Q4). In each case, the signal is centered on PAS and broken down between early (n = 2,095), middle (n = 4,280), and late-forming (n = 1,126) R-loop (+) genes. Lines represent median values; 95% confidence intervals are shown. For PAF1, signal intensities were normalized and reported as percent maximal signal to better define the positional features. (B) Distribution of distances between the PAS and the transcription termination points of individual genes as a function of R-loop signal position arranged in 1 kb bins around the PAS. The boxplots include all genes and a median trend line is shown. Distances were calculated from PRO-seq data (K562). Genes with no terminal R-loops are shown at right.

To further investigate the relationship between terminal R-loop position and transcription termination, we annotated the distal positions at which transcription terminates for individual genes using available PRO-seq (Kwak et al., 2013) and NET-seq (Mayer et al., 2015) datasets. Annotation was performed using custom Hidden Markov Models (see Experimental Procedures for details; Figure S6B). The resulting data revealed a strong positive relationship between the position of the R-loop signal and distal termination points (Figure 7B). Genes with early R-loops terminated earliest (~7 kb downstream of PAS as per PRO-seq data), while genes with late R-loops terminated latest (~17 kb downstream). By contrast, genes without terminal R-loops terminated at a median value of 9 kb downstream of PAS. Slightly shorter distances were obtained from NET-seq data but identical trends prevailed (Figure S6C). The effect of R-loop position on termination was observed regardless of gene expression level, with highly expressed genes terminating further away from the PAS than less expressed genes (Figure S6D). Altogether, this suggests that the position at which the RNAP is released from chromatin is influenced by the presence and the position of terminal R-loops.
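
To illustrate how a Hidden Markov Model can delimit a termination point from nascent transcription data, the following Python sketch decodes a generic two-state model (transcribed versus background) with the Viterbi algorithm. This is not the custom model used in this study, which is described in the Supplementary Experimental Procedures; the Poisson emissions, the state means, the `p_stay` parameter, the bin size, and the read counts are all assumptions chosen for illustration.

```python
import math


def viterbi_two_state(counts, mean_background, mean_transcribed, p_stay=0.95):
    """Most likely state path (1 = transcribed, 0 = background).

    counts: read counts in consecutive bins downstream of PAS.
    Emissions are Poisson with the given (strictly positive) means.
    """
    means = (mean_background, mean_transcribed)

    def log_poisson(k, lam):
        return k * math.log(lam) - lam - math.lgamma(k + 1)

    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    # Uniform prior over the two states for the first bin.
    score = [math.log(0.5) + log_poisson(counts[0], m) for m in means]
    backpointers = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for state in (0, 1):
            candidates = [
                score[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best_prev)
            new_score.append(candidates[best_prev] + log_poisson(k, means[state]))
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def termination_distance(path, bin_size_bp):
    """Distance from PAS to the end of the last transcribed bin, or None."""
    last = max((i for i, s in enumerate(path) if s == 1), default=None)
    return None if last is None else (last + 1) * bin_size_bp


# Hypothetical read counts in 1 kb bins starting at PAS.
counts = [20, 18, 22, 19, 21, 1, 0, 2, 1, 0]
path = viterbi_two_state(counts, mean_background=1.0, mean_transcribed=20.0)
print(path, termination_distance(path, bin_size_bp=1000))
```

In this toy example the decoded path switches from the transcribed to the background state after the fifth bin, so the termination point is placed 5 kb downstream of PAS.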

DISCUSSION

Prevalent and dynamic R-loop formation in mammalian genomes

Our results establish co-transcriptional R-loop formation as an important feature of RNA polymerase II-dependent transcription in mammalian genomes. Contrary to what was previously thought, RNA-DNA interactions are not rare under physiological conditions (Figures 1, 3). Furthermore, R-loop formation does not result from accidental entanglements of the nascent RNA transcript (Figure 3). CGI promoters and terminator regions of poly(A)-dependent genes represent universal hotspots of R-loop formation. At CGI regions, positive GC skew supports structure formation by ensuring the synthesis of G-rich nascent RNAs that can stably associate with the template DNA strand (Figure 2). Terminal R-loop formation, by contrast, was not as highly associated with GC skew, suggesting that R-loop formation mechanisms may differ between gene ends. Importantly, R-loops formed during transcription are dynamically resolved with a half-life of ~10–20 min (Figure 1). Such turnover may enable cells to mitigate the negative effects of R-loop formation on genomic stability (Aguilera and Garcia-Muse, 2012; Hamperl and Cimprich, 2014). The ~10 min half-life measured at promoters is also in close agreement with estimates of the paused promoter-proximal RNAP half-life (~7 min) (Jonkers et al., 2014). This suggests that R-loop formation may be compatible with the normal dynamics of the transcription cycle. Efficient R-loop resolution may also allow cells to limit the steady-state frequency of R-loop formation. In agreement, R-loop frequency estimates range from ~2–15% of input (Figure 1B). Thus, while a surprisingly large portion of mammalian genomes is R-loop prone, R-loop formation is regulated in terms of frequency and residence time. Overall, our data suggest that the retention of nascent RNAs at the transcription site is a conserved, prevalent, and dynamic feature of the mammalian chromatin.

R-loops favor an open chromatin state in physiological conditions

Regardless of location, R-loops associate with DNase I hyper-accessibility (Figure 4). This is consistent with reports that RNA:DNA hybrids prevent nucleosome deposition in vitro (Dunn and Griffith, 1980) and adopt a more rigid A-form-like conformation (Noy et al., 2005). Thus, R-loops may interfere with nucleosome re-deposition behind the advancing RNAP, thereby favoring an open state. Evidence that R-loops promote chromatin decondensation and lower nucleosome occupancy (Powell et al., 2013) and that, conversely, R-loop destabilization causes chromatin compaction (Boque-Sastre et al., 2015) further supports this view. The association of R-loops with H3S10P and with chromosome condensation reported in R-loop accumulating mutants (Castellano-Pozo et al., 2013) may denote possible pathological consequences of R-loop formation. Careful measurements of R-loop distribution, frequencies, and turnover rates in various conditions are now possible and should elucidate how R-loops differentially associate with chromatin states.

R-loop formation favors specific epigenomic signatures

At promoters, R-loops associate with an open, H3K4 hyper-methylated, hyper-acetylated and hyper-protected state characteristic of strong CGI promoters. In agreement, a recent report showed that R-loops favor a hyper-acetylated state (Chen et al., 2015). Whether R-loops play a causal role in setting this active chromatin state or follow its establishment is a key question. Our work shows that the presence or absence of R-loops affects the levels of multiple chromatin marks and factors anchored around the TSS region of R-loop (+) promoters located 1–2 kb upstream of the R-loop peak itself (Figures 4 and 5). This is most easily interpreted to suggest that the architecture of these promoters is favorable for R-loop formation. However, some R-loop-responsive chromatin features, including H3K4me1, H3K36me3, RNAP density, and promoter FAIRE-seq, coincide with the location of R-loop peaks. This raises the possibility that R-loops may impact the chromatin state in specific instances. Recent studies support the possibility that R-loops may play a causal role in chromatin patterning. R-loop destabilization at the human VIM promoter causes a shift from an active, open, DNA hypomethylated state to a silent, closed, and methylated state (Boque-Sastre et al., 2015). In mouse ESCs, R-loops were suggested to recruit the Tip60-p400 acetylase complex to promoters (Chen et al., 2015). It is possible that R-loops are directly recognized by proteins with a role in epigenetic patterning, thereby connecting the formation of nucleic acid structures to the establishment of specific chromatin states. Proteins with a capacity for ssDNA and/or RNA:DNA hybrid binding are attractive candidates as possible readers and effectors of R-loop formation. H3K4 methyltransferases of the SET1/COMPASS family were shown to bind to ssDNA and negatively supercoiled DNA in vitro (Krajewski et al., 2005). Here, we observed that H3K4me1 was enriched over the R-loop peak at promoters and terminators (Figures 5, 6), and that H3K4me2/3 was enriched around the TSSs of R-loop (+) promoters (Figure 5). Processive H3K4 methylation by COMPASS requires PAF1C (Shilatifard, 2012). We show here that PAF1 is enriched at R-loop (+) promoters and 3’-ends (Figures 5–7). Interestingly, the RTF1 subunit of PAF1C was reported to bind to ssDNA (de Jong et al., 2008). Furthermore, deletion of the Leo1 and Cdc73 PAF1C subunits leads to genomic instability in yeast, and this instability can be corrected by RNase H1 over-expression (Wahba et al., 2011). PAF1C is also required for H3K36 trimethylation by SETD2 (Chu et al., 2007) and we show here that R-loop (+) regions show significantly increased H3K36me3 deposition at both gene ends. Interestingly, while R-loops tend to generally associate with an open, H3K4/H3K36 methylated state, R-loop (+) promoters and terminators also take on distinct states. In particular, the enhancer and insulator states are dominant or specific for terminal regions only (Figures 5, 6). This suggests that R-loop formation may influence chromatin states in a context-dependent manner.

R-loop formation defines a class of transcription terminators

Our data support the view that terminal R-loops are implicated in transcription termination (Figure 6). Furthermore, the position at which transcription terminated was correlated with the position at which R-loops formed (Figure 7). R-loop formation therefore appears to be a conserved hallmark of a broad class of transcription terminators. The mechanistic roles of R-loops in the termination process remain to be fully established. It is possible that the intrinsic ability of R-loops to stall transcription (Belotserkovskii and Hanawalt, 2011) represents an initial pause signal, as suggested (Skourti-Stathaki et al., 2011). The formation of an enhancer-like chromatin state (Figure 6) may account for the observation of non-coding RNA transcripts initiating from the distal ends of mammalian genes, particularly those with short intergenic distances (Carninci et al., 2005), through the production of enhancer RNAs. R-loop-induced antisense transcription over the 3’-ends of genes has been linked to transcription termination (Skourti-Stathaki et al., 2014). The increased formation of insulator regions and/or gene loops by R-loop (+) terminators via CTCF and cohesin complex recruitment may also contribute to termination (Grzechnik et al., 2014). Finally, it is possible that specific termination proteins are recruited to R-loop terminators. PAF1C is an interesting candidate given its role in termination and 3’-end processing and its strong enrichment at R-loop (+) terminators (Figure 6). Consistent reports of R-loop-mediated instability in CPSF and other 3’-end processing mutants (Gomez-Gonzalez et al., 2011; Stirling et al., 2012; Wahba et al., 2011) further consolidate the links between R-loop formation and 3’-end processing.

EXPERIMENTAL PROCEDURES

DRIP-seq and DRIPc-seq

DRIP was performed essentially as described (Ginno et al., 2012) (Supplementary Experimental Procedures). After DRIP, the eluted DNA was treated with 4 U of DNase I (New England Biolabs) for 35 min at 37°C to degrade all DNA. The RNA strands that had been engaged in R-loops were then ethanol precipitated and reverse transcribed into cDNA with the iScript Reverse Transcription Supermix (BioRad) using a mix of poly(T) and random hexamers. Second strand synthesis was performed using dUTP instead of dTTP. After verifying the enrichment of R-loop (+) versus R-loop (−) loci by qPCR on the newly synthesized cDNA, sequencing libraries were built following sonication to reduce the size of cDNA fragments to ~200 bp. A UDG DNA glycosylase step was added before the PCR amplification step to ensure strand specificity. Sequencing libraries were checked on an Agilent BioAnalyzer prior to sequencing on an Illumina HiSeq 2000. Due to size selection steps during library construction, the hybrids must be >50 bp in length to be detected. Read mapping after quality filters was performed using standard pipelines. Peak calling was performed using a custom-built Hidden Markov Model (Supplementary Experimental Procedures).

Measurement of R-loop dynamics

Log-phase Ntera 2 cells were treated with 80 mM DRB and samples were withdrawn at time zero (pre-treatment) and at 5, 10, 20, 30, 45, 60, and 120 min after treatment, then processed for DRIP followed by qPCR to interrogate specific promoter or terminal regions. All values were expressed relative to input and normalized to time zero. After 120 min of DRB treatment, cells were washed twice in pre-warmed normal growth medium and R-loop formation was again analyzed as a function of time.
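
As an illustration of how a half-life can be derived from such a time course, the Python sketch below normalizes percent-input values to time zero and fits a single exponential decay by least squares on the log-transformed signal. The single-exponential model, the function names, and the example values (synthetic data generated with a 10 min half-life) are assumptions of this sketch and do not reproduce measurements or fitting procedures from this study.

```python
import math


def normalize_to_time_zero(percent_input):
    """Express each DRIP-qPCR value relative to the pre-treatment sample."""
    baseline = percent_input[0]
    return [value / baseline for value in percent_input]


def half_life_min(times_min, relative_signal):
    """Half-life from a least-squares fit of ln(signal) = a - k * t."""
    logs = [math.log(v) for v in relative_signal]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_log) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Synthetic percent-input values decaying with a 10 min half-life.
times = [0, 5, 10, 20, 30]
percent_input = [8.0 * 0.5 ** (t / 10) for t in times]
print(half_life_min(times, normalize_to_time_zero(percent_input)))
```

With real data, late time points that have reached background would flatten the fitted slope, so restricting the fit to the decaying portion of the curve (or subtracting a background term) may be necessary.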

Chromatin signal mapping, enrichment, and conservation

ENCODE and other datasets for histone modifications and chromatin binding factors are described in Supplementary Table 1. Conservation of R-loop formation was analyzed over regions of human-mouse synteny identified by LiftOver. Detailed methods can be found in Supplementary Experimental Procedures.

HIGHLIGHTS

  • Prevalent, conserved, and dynamic patterns of co-transcriptional R-loop formation

  • R-loops associate with specific epigenomic signatures at promoters and terminators

  • Terminal R-loops associate with an insulator and enhancer-like state

  • Terminal R-loops define a new class of transcription terminators

Acknowledgments

We thank members of the Chedin lab for critical reading of the manuscript. This work was supported by the National Institutes of Health (GM094299 to FC). LAS was supported in part by a Postdoctoral Grant from the Philippe Foundation. SRH is a Howard Hughes Medical Institute International Student Research fellow. YWL was supported in part by a pre-doctoral training grant (5T32GM007377) and a Dissertation Year fellowship from UC Davis. This work used the Vincent J Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 Instrumentation Grants S10RR029668 and S10RR027303.

Footnotes

AUTHOR CONTRIBUTIONS

L.A.S., X.X., and P.A.G. performed the experimental work. S.R.H., Y.W.L., S.S., and A.R. performed computational analysis. F.C. conceived the study, oversaw the work, and wrote the manuscript along with all other authors.

ACCESSION NUMBERS

DRIP-seq, DRIPc-seq, RNA-seq, and MethylC-seq datasets have been deposited on the NCBI Gene Expression Omnibus under the accession number GSE70189.

REFERENCES

  1. Aguilera A, Garcia-Muse T. R loops: from transcription byproducts to threats to genome stability. Molecular cell. 2012;46:115–124. doi: 10.1016/j.molcel.2012.04.009.
  2. Bell AC, West AG, Felsenfeld G. The protein CTCF is required for the enhancer blocking activity of vertebrate insulators. Cell. 1999;98:387–396. doi: 10.1016/s0092-8674(00)81967-4.
  3. Belotserkovskii BP, Hanawalt PC. Anchoring nascent RNA to the DNA template could interfere with transcription. Biophys J. 2011;100:675–684. doi: 10.1016/j.bpj.2010.12.3709.
  4. Boguslawski SJ, Smith DE, Michalak MA, Mickelson KE, Yehle CO, Patterson WL, Carrico RJ. Characterization of monoclonal antibody to DNA.RNA and its application to immunodetection of hybrids. J Immunol Methods. 1986;89:123–130. doi: 10.1016/0022-1759(86)90040-2.
  5. Boque-Sastre R, Soler M, Oliveira-Mateos C, Portela A, Moutinho C, Sayols S, Villanueva A, Esteller M, Guil S. Head-to-head antisense transcription and R-loop formation promotes transcriptional activation. Proc Natl Acad Sci U S A. 2015;112:5785–5790. doi: 10.1073/pnas.1421197112.
  6. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309:1559–1563. doi: 10.1126/science.1112014.
  7. Castellano-Pozo M, Santos-Pereira JM, Rondon AG, Barroso S, Andujar E, Perez-Alegre M, Garcia-Muse T, Aguilera A. R loops are linked to histone H3 S10 phosphorylation and chromatin condensation. Molecular cell. 2013;52:583–590. doi: 10.1016/j.molcel.2013.10.006.
  8. Chen PB, Chen HV, Acharya D, Rando OJ, Fazzio TG. R loops regulate promoter-proximal chromatin architecture and cellular differentiation. Nat Struct Mol Biol. 2015;22:999–1007. doi: 10.1038/nsmb.3122.
  9. Chu Y, Simic R, Warner MH, Arndt KM, Prelich G. Regulation of histone modification and cryptic transcription by the Bur1 and Paf1 complexes. EMBO J. 2007;26:4646–4656. doi: 10.1038/sj.emboj.7601887.
  10. de Jong RN, Truffault V, Diercks T, Ab E, Daniels MA, Kaptein R, Folkers GE. Structure and DNA binding of the human Rtf1 Plus3 domain. Structure. 2008;16:149–159. doi: 10.1016/j.str.2007.10.018.
  11. Dunn K, Griffith JD. The presence of RNA in a double helix inhibits its interaction with histone protein. Nucleic acids research. 1980;8:555–566. doi: 10.1093/nar/8.3.555.
  12. El Hage A, Webb S, Kerr A, Tollervey D. Genome-wide distribution of RNA-DNA hybrids identifies RNase H targets in tRNA genes, retrotransposons and mitochondria. PLoS genetics. 2014;10:e1004716. doi: 10.1371/journal.pgen.1004716.
  13. Ernst J, Kellis M. ChromHMM: automating chromatin-state discovery and characterization. Nature methods. 2012;9:215–216. doi: 10.1038/nmeth.1906.
  14. Frietze S, O'Geen H, Blahnik KR, Jin VX, Farnham PJ. ZNF274 recruits the histone methyltransferase SETDB1 to the 3' ends of ZNF genes. PloS one. 2010;5:e15082. doi: 10.1371/journal.pone.0015082.
  15. Ginno PA, Lim YW, Lott PL, Korf I, Chedin F. GC skew at the 5' and 3' ends of human genes links R-loop formation to epigenetic regulation and transcription termination. Genome Res. 2013;23:1590–1600. doi: 10.1101/gr.158436.113.
  16. Ginno PA, Lott PL, Christensen HC, Korf I, Chedin F. R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Molecular cell. 2012;45:814–825. doi: 10.1016/j.molcel.2012.01.017.
  17. Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD. FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin. Genome Res. 2007;17:877–885. doi: 10.1101/gr.5533506.
  18. Gomez-Gonzalez B, Garcia-Rubio M, Bermejo R, Gaillard H, Shirahige K, Marin A, Foiani M, Aguilera A. Genome-wide function of THO/TREX in active genes prevents R-loop-dependent replication obstacles. EMBO J. 2011;30:3106–3119. doi: 10.1038/emboj.2011.206.
  19. Grzechnik P, Tan-Wong SM, Proudfoot NJ. Terminate and make a loop: regulation of transcriptional directionality. Trends Biochem Sci. 2014;39:319–327. doi: 10.1016/j.tibs.2014.05.001.
  20. Hamperl S, Cimprich KA. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability. DNA repair. 2014;19:84–94. doi: 10.1016/j.dnarep.2014.03.023.
  21. Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, et al. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nature genetics. 2007;39:311–318. doi: 10.1038/ng1966.
  22. Jonkers I, Kwak H, Lis JT. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. Elife. 2014;3:e02407. doi: 10.7554/eLife.02407.
  23. Jonkers I, Lis JT. Getting up to speed with transcription elongation by RNA polymerase II. Nature reviews Molecular cell biology. 2015;16:167–177. doi: 10.1038/nrm3953.
  24. Klose RJ, Yamane K, Bae Y, Zhang D, Erdjument-Bromage H, Tempst P, Wong J, Zhang Y. The transcriptional repressor JHDM3A demethylates trimethyl histone H3 lysine 9 and lysine 36. Nature. 2006;442:312–316. doi: 10.1038/nature04853.
  25. Krajewski WA, Nakamura T, Mazo A, Canaani E. A motif within SET-domain proteins binds single-stranded nucleic acids and transcribed and supercoiled DNAs and can interfere with assembly of nucleosomes. Molecular and cellular biology. 2005;25:1891–1899. doi: 10.1128/MCB.25.5.1891-1899.2005.
  26. Kwak H, Fuda NJ, Core LJ, Lis JT. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science. 2013;339:950–953. doi: 10.1126/science.1229386.
  27. Lim YW, Sanz LA, Xu X, Hartono SR, Chedin F. Genome-wide DNA hypomethylation and RNA:DNA hybrid accumulation in Aicardi-Goutieres syndrome. Elife. 2015;4:e08007. doi: 10.7554/eLife.08007.
  28. Liu W, Tanasa B, Tyurina OV, Zhou TY, Gassmann R, Liu WT, Ohgi KA, Benner C, Garcia-Bassets I, Aggarwal AK, et al. PHF8 mediates histone H4 lysine 20 demethylation events involved in cell cycle progression. Nature. 2010;466:508–512. doi: 10.1038/nature09272.
  29. Mayer A, di Iulio J, Maleri S, Eser U, Vierstra J, Reynolds A, Sandstrom R, Stamatoyannopoulos JA, Churchman LS. Native elongating transcript sequencing reveals human transcriptional activity at nucleotide resolution. Cell. 2015;161:541–554. doi: 10.1016/j.cell.2015.03.010.
  30. Noy A, Perez A, Marquez M, Luque FJ, Orozco M. Structure, recognition properties, and flexibility of the DNA.RNA hybrid. J Am Chem Soc. 2005;127:4910–4920. doi: 10.1021/ja043293v.
  31. Parelho V, Hadjur S, Spivakov M, Leleu M, Sauer S, Gregson HC, Jarmuz A, Canzonetta C, Webster Z, Nesterova T, et al. Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell. 2008;132:422–433. doi: 10.1016/j.cell.2008.01.011.
  32. Powell WT, Coulson RL, Gonzales ML, Crary FK, Wong SS, Adams S, Ach RA, Tsang P, Yamada NA, Yasui DH, et al. R-loop formation at Snord116 mediates topotecan inhibition of Ube3a–antisense and allele-specific chromatin decondensation. Proc Natl Acad Sci U S A. 2013;110:13938–13943. doi: 10.1073/pnas.1305426110.
  33. Ratmeyer L, Vinayak R, Zhong YY, Zon G, Wilson WD. Sequence specific thermodynamic and structural properties for DNA.RNA duplexes. Biochemistry. 1994;33:5298–5304. doi: 10.1021/bi00183a037.
  34. Shilatifard A. The COMPASS family of histone H3K4 methylases: mechanisms of regulation in development and disease pathogenesis. Annual review of biochemistry. 2012;81:65–95. doi: 10.1146/annurev-biochem-051710-134100.
  35. Skourti-Stathaki K, Kamieniarz-Gdula K, Proudfoot NJ. R-loops induce repressive chromatin marks over mammalian gene terminators. Nature. 2014 doi: 10.1038/nature13787.
  36. Skourti-Stathaki K, Proudfoot NJ. A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression. Genes Dev. 2014;28:1384–1396. doi: 10.1101/gad.242990.114.
  37. Skourti-Stathaki K, Proudfoot NJ, Gromak N. Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination. Molecular cell. 2011;42:794–805. doi: 10.1016/j.molcel.2011.04.026.
  38. Stirling PC, Chan YA, Minaker SW, Aristizabal MJ, Barrett I, Sipahimalani P, Kobor MS, Hieter P. R-loop-mediated genome instability in mRNA cleavage and polyadenylation mutants. Genes Dev. 2012;26:163–175. doi: 10.1101/gad.179721.111.
  39. Tomson BN, Arndt KM. The many roles of the conserved eukaryotic Paf1 complex in regulating transcription, histone modifications, and disease states. Biochimica et biophysica acta. 2013;1829:116–126. doi: 10.1016/j.bbagrm.2012.08.011.
  40. Wahba L, Amon JD, Koshland D, Vuica-Ross M. RNase H and multiple RNA biogenesis factors cooperate to prevent RNA:DNA hybrids from generating genome instability. Molecular cell. 2011;44:978–988. doi: 10.1016/j.molcel.2011.10.017.
  41. Yu K, Chedin F, Hsieh CL, Wilson TE, Lieber MR. R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat Immunol. 2003;4:442–451. doi: 10.1038/ni919.
