eLife. 2018 Apr 12;7:e27024. doi: 10.7554/eLife.27024

Figure 1. ChAR-seq uses proximity ligation of chromatin-associated RNA and deep sequencing to map RNA-DNA contacts in situ.

(A) Overview of the ChAR-seq method, in which RNA-DNA contacts are preserved by crosslinking, followed by in situ ligation of the 3’ end of RNAs to the adenylated 5’ end of the ssDNA tail of an oligonucleotide ‘bridge’ containing a biotin modification and a DpnII-complementary overhang on the opposite end. After the bridge is extended by reverse transcription to generate a strand of cDNA complementary to the RNA, the genomic DNA is digested with DpnII and re-ligated, capturing proximally-associated bridge molecules and RNA. The chimeric molecules are reverse-transcribed, purified and sequenced. (B) Chimeric molecules are sequenced and the RNA and DNA ends are distinguished owing to the polarity of the bridge, which preferentially ligates to RNA via the 5'-adenylated tail and to DNA via the DpnII overhang. The RNA and DNA reads are then computationally recombined to produce contact maps for each annotated RNA in the genome. (C) Representative examples of genome-wide RNA coverage plots generated for Total RNA (black), mRNA (red), Hsromega (green), chinmo (green), ten-m (green), snRNA:U2 (cyan), snRNA:7SK (cyan), roX1 (blue) and roX2 (purple). Arrows show the transcription start site for each gene. In chromosome cartoons throughout the paper, light gray represents the primary chromosome scaffolds, darker gray regions are heterochromatic scaffolds, and black circles are centromeres. (D) Zoomed-in view of an 850 kilobase region of chromosome 3L (chr3L). ChAR-seq tracks for Total RNA, ten-m, snRNA:U2, and snRNA:7SK are shown in comparison with PRO-seq tracks (Drosophila S2 [Kwak et al., 2013]) and ATAC-seq (this study, CME-W1-cl8+). (E) ChAR-seq contact matrix (RNA-to-DNA, top) plotted and aligned with the same 850 kb region as panel D. ChAR-seq was performed without bridge addition (Hi-C/Mock-ChAR), resulting in DNA-DNA proximity ligation as in Hi-C (‘Hi-C, DNA-to-DNA’, bottom).
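
The bridge polarity described in panel B can be illustrated with a minimal sketch of how a chimeric read might be split into its RNA-derived and DNA-derived parts. This is not the authors' pipeline (see the flypipe repository for that). The bridge sequence used here is a made-up placeholder, and the sketch assumes that a read in the "forward" orientation reports the bridge strand carrying the 5'-adenylated tail, so that the RNA-derived sequence lies 5' of the bridge and the DNA-derived sequence lies 3' of it.

```python
# Hypothetical placeholder, NOT the real ChAR-seq bridge sequence.
PLACEHOLDER_BRIDGE = "AACCGGTTACGA"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_at_bridge(read: str, bridge: str = PLACEHOLDER_BRIDGE):
    """Split a read into (rna_part, dna_part) using the bridge orientation.

    Returns None when the bridge is absent or is found in both orientations.
    Parts are returned exactly as they appear in the read (not re-oriented).
    Only the first exact match of the bridge is considered.
    """
    bridge_rc = reverse_complement(bridge)
    fwd = read.find(bridge)
    rev = read.find(bridge_rc)
    if (fwd >= 0) == (rev >= 0):
        return None  # no bridge, or ambiguous orientation
    if fwd >= 0:
        # Forward orientation (assumed): 5'-RNA-bridge-DNA-3'
        return read[:fwd], read[fwd + len(bridge):]
    # Reverse orientation: 5'-DNA-bridge_rc-RNA-3'
    return read[rev + len(bridge_rc):], read[:rev]
```

A real implementation would also need to tolerate sequencing errors in the bridge and to discard parts too short to align; both are omitted here.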

Figure 1—figure supplement 1. Diagram of the oligonucleotide bridge and efficiency of bridge ligation and capture.

The oligonucleotide bridge contains a 5'-adenylated (5'-App) six-nucleotide ssDNA tail (green), a single biotin modification (purple), a DpnII site (red) and a 3' three-carbon spacer (Sp3). The 3'-Sp3 is removed from the bridge during genomic DpnII digestion, permitting subsequent ligation to genomic DNA. Lower panel: bar plot of the fraction of reads at each step of the data processing pipeline from a representative library preparation.

Figure 1—figure supplement 2. In vitro optimization of RNA-to-DNA ligation conditions.

Upper panel: ten pmol of a 17-nt adenylated ssDNA oligonucleotide (Universal App DNA, CTGTAGGCACCATCAAT) was incubated with 5 pmol of a 17-nt ssRNA test probe (TTTCGTTGGAAGCGGGA) in 1x NEB T4 RNA Ligase Buffer with the indicated ligase (NEB Thermostable 5’ AppDNA/RNA ligase (Therm 5' Ligase), NEB T4 Rnl2tr K227Q Ligase (trT4K) or NEB T4 Rnl2tr R55K, K227Q ligase (trT4KQ)) and/or supplements (PEG, BSA, ATP, RNaseOUT). Products were then analyzed by denaturing polyacrylamide gel electrophoresis, with a combination of NEB microRNA and low range ssRNA ladders as size markers, and stained with SYBR Gold. Bands were quantified and the percent product was calculated using (shifted / (total * 0.66)) to account for the molar excess of DNA over RNA. No adjustment was made to account for preferential staining of ssDNA over ssRNA. Residual signal is expected in the lower band owing to the molar excess of DNA over RNA. A high molecular weight band is visible in the Therm 5’ Ligase lane, which most likely consists of high molecular weight concatemers of the AppDNA substrate caused by incomplete 3’ blocking of these oligos or removal of the 3’ block by the Therm 5’ Ligase. This experiment was performed once.
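
The normalization above can be worked through in a short sketch. One plausible reading of the 0.66 factor, offered here as an assumption rather than the authors' stated derivation, is that with 10 pmol of DNA and 5 pmol of RNA of equal length, complete ligation would place 5 pmol of each in the shifted band, i.e. 10 of 15 pmol (about 0.66) of the total signal. The function names are illustrative only.

```python
def max_shifted_fraction(dna_pmol: float, rna_pmol: float) -> float:
    """Largest fraction of total nucleic acid that can reach the shifted band.

    Assumes equal-length oligos and signal proportional to moles
    (no correction for ssDNA vs ssRNA staining, as in the caption).
    """
    limiting = min(dna_pmol, rna_pmol)
    return 2 * limiting / (dna_pmol + rna_pmol)


def percent_product(shifted: float, total: float,
                    max_fraction: float = 0.66) -> float:
    """Percent product as in the caption: shifted / (total * 0.66), times 100."""
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * shifted / (total * max_fraction)
```

For example, a lane in which the shifted band carries 33 of 100 arbitrary signal units would score as roughly 50% product under this normalization.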

Figure 1—figure supplement 3. Diagram of the ChAR-seq data processing pipeline and bar plot of RNA alignment.

(a) Data were processed using a custom pipeline, which can be accessed and is fully documented at: https://github.com/straightlab/flypipe (Bell, 2017; copy archived at https://github.com/elifesciences-publications/flypipe). Red lines indicate reads that do not align to any transcriptome in the sense orientation and are then tested for alignment in the antisense orientation.

Figure 1—figure supplement 4. ChAR-seq RNA-to-bridge ligation is sensitive to RNase treatment.

Bar plot of the relative number of reads after PCR duplicate removal and the fraction of those reads that contained the bridge. The RNase-treated cross-linked sample was incubated with 0.25 mg/mL RNase A and 12.5 Units of RNase H for 1 hr at 37°C between steps 4 and 5 of the extended protocol, followed by an additional wash step identical to step 4.

Figure 1—figure supplement 5. Comparison of RNA-to-DNA contacts between replicates.

Scatter plots of the number of contacts for chromatin-associated RNAs identified in CME-W1-cl8+ rep1 vs rep2 (top), rep1 vs rep3 (middle), and rep2 vs rep3 (bottom). The Pearson correlation coefficient for each replicate pair is reported.

Figure 1—figure supplement 6. False positive contacts are proportional to RNA spike-in level.

(A) Aligned scatter plot of spike-in level, as a percentage of total soluble RNA (x-axis), vs percentage of false contacts (y-axis). MBP (red), Halo (blue), and GFP (green) purified RNA was added at 0.1%, 1%, and 10% of total soluble RNA. Bars indicate the mean number of false contacts across all three spike-in experiments at each concentration. (B) Heatmap of Pearson correlation coefficients at 100 kb bins for example class I (Hsromega), class II (7SK, 5SrRNA, snRNA:U2), and class III (roX2) RNAs, aggregated mRNAs and snoRNAs, ATAC-seq signal (which is indicative of open chromatin), and spike-ins. The genomic associations of spike-ins do not correlate well with each other, with transcriptionally associated RNAs, or with open chromatin.
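
The comparison in panel B can be sketched as two steps: count contacts in 100 kb bins, then compute a Pearson correlation between two binned tracks. The sketch below assumes 0-based base-pair coordinates on a single chromosome and uses illustrative function names; how the authors normalized or aggregated tracks before correlating them is not stated in the caption and is not modeled here.

```python
import math
from collections import Counter

BIN_SIZE = 100_000  # 100 kb bins, as in panel B


def bin_counts(positions, chrom_length, bin_size=BIN_SIZE):
    """Count contact positions per fixed-width bin along one chromosome.

    Positions at or beyond chrom_length fall outside the bins and are ignored.
    """
    n_bins = math.ceil(chrom_length / bin_size)
    counts = Counter(pos // bin_size for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant track")
    return cov / math.sqrt(var_x * var_y)
```

Applying `pearson` to every pair of binned tracks yields the symmetric matrix that a heatmap like panel B displays.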

Figure 1—figure supplement 7. Chromatin-associated RNA alignment by class.

Relative abundance of chromatin-associated RNA by transcriptome classification and strand orientation in Drosophila melanogaster CME-W1-cl8+ (male) wing disc cells.

Figure 1—figure supplement 8. Abundance of cis contacts.

For each gene in our dataset, we functionally defined cis contacts as RNA-to-DNA contacts that lie within the gene body (±2 kb) for a given RNA (i.e., contacts that arise from nascent transcription). We then calculated a cis score, which is equivalent to the percentage of contacts that arise from this region. Upper plot is the per gene rank order analysis based on the cis score for each RNA in our dataset. Lower plot is a histogram of the frequency distribution for each cis score (percentage).
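
The cis score defined above can be sketched as follows. Contacts are represented as (chromosome, position) pairs for the DNA side of each RNA-to-DNA contact; this representation and the function name are illustrative assumptions, not taken from the authors' pipeline.

```python
def cis_score(contacts, gene_chrom, gene_start, gene_end, flank=2_000):
    """Percentage of an RNA's contacts that fall within its gene body +/- flank.

    contacts: iterable of (chrom, position) for the DNA side of each contact.
    gene_start and gene_end are inclusive coordinates with gene_start <= gene_end.
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("cis score is undefined for an RNA with no contacts")
    lo = gene_start - flank
    hi = gene_end + flank
    n_cis = sum(
        1 for chrom, pos in contacts
        if chrom == gene_chrom and lo <= pos <= hi
    )
    return 100.0 * n_cis / len(contacts)


# Example: two of four contacts lie within chr3L:2,000-5,000 +/- 2 kb.
example = [("chr3L", 1_000), ("chr3L", 6_500), ("chr3L", 9_000), ("chrX", 3_000)]
print(cis_score(example, "chr3L", 2_000, 5_000))
```

Sorting genes by this score gives the rank-order plot, and tallying the scores gives the histogram.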

Figure 1—figure supplement 9. ChAR-seq RNA-DNA contacts are dissimilar to DNA-DNA contacts.

Representative contact matrix plots for chr2L, chr2R and chrX showing RNA-to-DNA contacts (ChAR-seq, top half), relative to DNA-to-DNA contacts from Hi-C/Mock-ChAR (Hi-C, bottom half).

Figure 1—figure supplement 10. ChAR-seq protocol preserves genome organization.

Bar plot of Pearson correlation coefficients by chromosome, calculated by comparing Hi-C data from CME-W1-cl8+ (Ramírez et al., 2015) to our Hi-C/Mock-ChAR library, in which no bridge was added and biotin fill-in was performed following DpnII digestion but prior to ligation.