Abstract
RNA is a fundamental component of chromatin. Noncoding RNAs (ncRNAs) can associate with chromatin to influence gene expression and chromatin state; many also act at long distances from their transcriptional origin. Yet we know almost nothing about the functions or sites of action for most ncRNAs. Current methods to identify sites of RNA interaction with the genome are limited to the study of a single RNA at a time. Here we describe a protocol for ChAR-seq, a strategy to identify all chromatin-associated RNAs and map their DNA contacts genome-wide. In ChAR-seq, proximity ligation of RNA and DNA to a linker molecule is used to construct a chimeric RNA-DNA molecule that is converted to DNA for sequencing. In a single assay, ChAR-seq can discover de novo chromatin interactions of distinct RNAs, including nascent transcripts, splicing RNAs, and long noncoding RNAs (lncRNAs). Resulting “maps” of genome-bound RNAs should provide new insights into RNA biology.
INTRODUCTION
Chromatin-associated RNAs play fundamental roles in nuclear biology, functioning to regulate diverse processes including gene regulation, chromatin structure, cell specification, and cell division (Cech & Steitz, 2014; Lopez-Parajes, 2016; Kopp & Mendell, 2018; Nozawa & Gilbert, 2019). Well-studied examples include many noncoding RNAs (ncRNAs) that act at specific regions of the genome. For example, the long noncoding RNA (lncRNA) Xist binds to one X chromosome in females to silence gene expression during mammalian dosage compensation (Galupa & Heard, 2018), the lncRNA TERRA acts at telomeres to maintain their integrity (Cusanelli & Chartrand, 2015), and alpha-satellite RNAs at pericentromeric regions can stabilize histone methyltransferase binding at heterochromatin (Johnson et al., 2017; Velazquez Camacho et al., 2017; Shirai et al., 2017). However, the broad mechanisms by which most lncRNAs act remain unknown. Identifying where chromatin-interacting RNAs localize on the genome is a critical first step towards understanding ncRNA function.
With the advent of high-throughput sequencing and associated methods, the development of RNA technologies has proceeded at a rapid pace in recent years. Several methods now exist to map the genome-wide DNA binding sites of a single RNA. These include chromatin isolation by RNA purification (ChIRP-seq) (Chu et al., 2011), capture hybridization analysis of RNA targets (CHART-seq) (Simon et al., 2011), and RNA antisense purification (RAP) sequencing (Engreitz et al., 2013). These methods preserve RNA, DNA, and protein contacts and therefore allow biochemical isolation of the specific target RNA and its interacting DNA loci or protein(s). Yet thousands of uncharacterized lncRNAs exist in the human genome, and despite their power, the above approaches become labor intensive when performed on more than a few RNAs at a time.
To fill this gap, we developed ChAR-seq to identify chromatin-associated RNAs transcriptome- and genome-wide (Bell et al., 2018). This method relies on in situ proximity ligation of RNA and DNA ends that either directly interact or are in close proximity in nuclear space. ChAR-seq preserves the three-dimensional organization of the genome, and thus avoids the spurious ligation events that can occur when proximity ligation is performed in dilute solution (Nagano et al., 2015).
A key feature of the method is a double-stranded DNA linker of defined sequence (the “bridge”) that ligates to both RNA and DNA: one end contains a 5′-adenylated single-stranded DNA (ssDNA) overhang for specific ligation to 3′ ends of RNA molecules, catalyzed by a mutated T4 RNA ligase (T4 RNA ligase 2, truncated, R55K K227Q); the other end includes a DpnII restriction endonuclease (RE) recognition site for digestion and subsequent ligation to DpnII-digested genomic DNA fragments. After ligation, the chimeric RNA-bridge-DNA molecule is converted to DNA and isolated by biotin pulldown before being sequenced using next-generation platforms. The bridge sequence asymmetry allows one to clearly identify the original RNA and DNA after sequencing.
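As an illustration of how the bridge asymmetry can be exploited after sequencing, the following minimal Python sketch splits a read into its RNA-derived and DNA-derived sides. It assumes that the digested bridge appears in the read as the top-strand sequence from Basic Protocol 1 up to and including the DpnII site (an inference from the oligo sequences, not a description of the authors' analysis pipeline). The names `split_read` and `revcomp` are hypothetical.

```python
import re

# Top-strand bridge up to and including the DpnII site (GATC), as assumed to
# remain after DpnII digestion and ligation to genomic DNA. The three
# captured bases are the random barcode (UMI).
BRIDGE = re.compile(r"AA([ACGTN]{3})AAACCGGCGTCCAAGGATC")

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_read(read):
    """Return (rna_side, umi, dna_side) in top-strand orientation, or None.

    The read is searched as given and as its reverse complement, because the
    sequenced strand is not known in advance.
    """
    for oriented in (read, revcomp(read)):
        match = BRIDGE.search(oriented)
        if match:
            rna_side = oriented[:match.start()]   # upstream of the 5' end of the bridge
            dna_side = oriented[match.end():]     # downstream of the DpnII site
            return rna_side, match.group(1), dna_side
    return None


example = "ACGTACGTTT" + "AACTGAAACCGGCGTCCAAGGATC" + "TTGGCCAA"
print(split_read(example))
```

Because the bridge is not palindromic, the side of the bridge on which a sequence lies identifies it as RNA-derived or DNA-derived regardless of which strand was read.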
In principle, ChAR-seq can map the chromosomal contact sites of all chromatin-associated RNAs. ChAR-seq can thus be thought of as a multiplexed de novo RNA-DNA mapping assay capable of generating hundreds, or even thousands, of individual RNA-binding maps (Bell et al., 2018). With such information in hand, experimenters can answer simple, but important, questions such as: Where does my RNA(s) of interest associate with DNA across the genome? What RNAs localize near my DNA region(s) of interest? Results from ChAR-seq and other recently published methods that identify RNA-DNA interactions (Li et al., 2017; Sridhar et al., 2017; Quinodoz et al., 2018) can also be used to ask more sophisticated questions about chromatin-RNA biology relating to enhancer-promoter contacts, splicing regulation, nascent transcription, and the three-dimensional organization of nucleic acids in the nucleus.
Any cell type or tissue can likely be used in ChAR-seq with only a minimal amount of optimization, and no genetic modification is required. We have successfully used the presented protocol with human embryonic stem cells (hESCs), human retinal pigmented epithelial cells (hTERT RPE-1), Drosophila melanogaster cells (cl.8+ and Kc167), and nuclei isolated from Xenopus laevis embryos.
Basic Protocol 1 describes the design and preparation of the DNA bridge molecule used to covalently link RNA and DNA ends found in close proximity in the nucleus. Basic Protocol 2 describes the harvesting and cross-linking of human embryonic stem cells (hESCs) for use in ChAR-seq. Basic Protocol 3 describes the main ChAR-seq procedure and library preparation of the chimeric RNA-DNA molecules for high-throughput sequencing. The general outline of the ChAR-seq molecular biology steps includes (Figure 1): (1) lysis and nuclear isolation, (2) RNA-to-bridge ligation, (3) reverse transcription of RNA into cDNA, (4) genomic DNA digestion, (5) ligation of genomic DNA to RNA-bridge fragment, (6) reverse cross-linking, (7) sonication, (8) isolation of biotinylated RNA-bridge-DNA molecules, (9) sequencing adapter ligation, (10) indexing and amplification by PCR, and (11) size selection.
Figure 1.
Outline of ChAR-seq. (A) Overview of ChAR-seq protocol. Dashed gray circle represents the nucleus; blue solid lines, genomic DNA; and red lines, RNA transcripts. The final chimeric molecule contains both cDNA (RNA) and genomic DNA ligated to a dsDNA linker, or bridge. See main text for details. (B) ChAR-seq provides information on DNA interaction sites across the genome for many RNAs. Cartoon shows three different RNAs binding the genome at distinct sites. More complex, genome-wide interaction maps can be generated for the entire transcriptome and genome (Bell et al., 2018).
Basic Protocol 1
DESIGN OF BRIDGE MOLECULE TO LINK RNA AND DNA
The linker molecule, or bridge, is an essential feature of ChAR-seq. The bridge is a defined double-stranded DNA (dsDNA) sequence that contains (Figure 2): (1) a 5′-adenylation modification (5′-App) on the top strand that allows ATP-independent and specific ligation to 3′ ssRNA ends when ligated with the T4 RNA truncated KQ ligase (Viollet et al., 2011); (2) a 3-nt random Unique Molecular Identifier (UMI) barcode to differentiate pseudo-duplicates (molecules with the same exact sequence in the read that were independently formed) from PCR duplicates for improved estimation of library complexity; (3) a DpnII RE site for digestion and subsequent ligation to DpnII-digested genomic DNA fragments (DpnII sites occur every 256 bp, on average); (4) a biotinylated bottom strand for isolation of bridge-containing RNA-DNA hybrid molecules from background DNA; and (5) a PacI RE site to remove undigested bridge molecules that failed to ligate genomic DNA, but could still ligate to adapters via PCR of the biotinylated strand when present as ssDNA after melting. The bridge is small enough to enter cross-linked nuclei after mild lysis with detergent. Importantly, the bridge sequence is not present in the human or Drosophila reference genomes.
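The role of the random barcode in feature (2) can be sketched as follows. This is a minimal illustration, assuming each read has already been reduced to an RNA alignment position, a DNA alignment position, and the 3-nt barcode; the function name and record layout are hypothetical. Reads that agree in all three fields are counted as PCR duplicates, whereas reads with identical positions but different barcodes are retained as independent molecules.

```python
from collections import Counter


def count_unique_molecules(records):
    """Collapse PCR duplicates.

    records: iterable of (rna_position, dna_position, umi) tuples.
    Returns (number of unique molecules, number of reads removed as duplicates).
    """
    counts = Counter(records)
    n_unique = len(counts)
    n_duplicates = sum(counts.values()) - n_unique
    return n_unique, n_duplicates


reads = [
    ("chr1:100", "chr2:500", "ACG"),
    ("chr1:100", "chr2:500", "ACG"),  # same positions and barcode: PCR duplicate
    ("chr1:100", "chr2:500", "TTG"),  # same positions, new barcode: pseudo-duplicate, kept
    ("chr3:5", "chr2:9", "ACG"),
]
print(count_unique_molecules(reads))
```

A 3-nt barcode provides only 4^3 = 64 possible sequences, so independent molecules with identical positions can still share a barcode by chance; the counts should be read as an estimate.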
Figure 2.
The bridge linker molecule. (A) Schematic of the bridge linker molecule, which possesses polarity enabling it to specifically ligate RNA or DNA at the respective ends. Numbers correspond to features described in Basic Protocol 1: briefly, (1) 5′ adenylation (5′-App) that allows specific ligation of ssDNA to ssRNA with a mutated T4 RNA ligase; (2) random barcode (blue); (3) DpnII recognition site (pink); (4) biotin for isolation (purple); (5) PacI recognition site (orange). Gray lines represent restriction enzyme cleavage patterns. (B) Schematic of the final dsDNA molecule that results from the ChAR-seq procedure. RNA and DNA in close proximity will ligate to the bridge linker molecule as shown. Red represents cDNA derived from the ssRNA and green represents genomic DNA. Dashed lines on either side represent extended sequence of varying length.
While the sequence described here has been used successfully, one may wish to redesign the bridge linker to suit specific needs. Alternative bridge designs include cleavage recognition sites for type IIS REs, such as MmeI, which cuts 18 to 20 nt outside the recognition sequence. A related method, GRID-seq, uses this enzyme and a bridge with two MmeI sites, one near the “RNA end” and one near the “DNA end” of the bridge. The advantage of this approach is that MmeI digestion produces a population of molecules of uniform length, with the bridge sequence plus 18 to 20 nt cDNA and DNA fragments in the final molecule (Li et al., 2017). These molecules can be size selected to eliminate bridge-containing molecules that failed to ligate to either RNA or DNA, thus improving the yield of useful sequencing reads. In addition, we have successfully tested a bridge containing two sites for EcoP15I, a type III RE that generates 25- to 27-nt DNA fragments outside of the recognition sequence (Möncke-Buchner et al., 2009; Fullwood et al., 2009). These reads contain slightly longer RNA and DNA than reads generated by MmeI digestion, which improves unique alignment to the genome. Our current approach, without MmeI or EcoP15I, allows the attainment of even longer RNA and DNA molecules, however, which may be especially useful for the analysis of repetitive regions.
One other area for bridge design flexibility includes the use of alternative RE sites for other 4-base-cutter enzymes instead of DpnII. A DpnII-based bridge can be combined with a bridge containing a different enzyme site (e.g., in different samples pooled after library preparation) to increase the diversity of available genomic DNA junctions for ligation by ~2-fold. Regardless, we recommend that users start with the bridge sequence described here as a baseline.
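When redesigning the bridge, it can be helpful to confirm in silico that the required features are still present. The sketch below is one possible check, written against the two oligo sequences listed in the Materials of this protocol; the modifications (/5rApp/, /3SpC3/, /5Phos/) are omitted and the internal biotin-dT is written as T. The function names are hypothetical, and the pairing test simply looks for the longest 3′ portion of the bottom strand that is complementary to the 3′ end of the top strand.

```python
TOP = "AANNNAAACCGGCGTCCAAGGATCTTTAATTAAGTCGCAG"
BOTTOM = "GATCTGCGACTTAATTAAAGATCCTTGGACGCCGG" + "TT"  # biotin-dT written as T, plus final T

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def describe_bridge(top, bottom):
    """Report 0-based feature positions on the top strand and the duplex length."""
    rc_top = revcomp(top)
    paired = 0
    # Find the longest 3' portion of the bottom strand that pairs with the
    # 3' end of the top strand.
    for offset in range(len(bottom)):
        if rc_top.startswith(bottom[offset:]):
            paired = len(bottom) - offset
            break
    return {
        "umi_start": top.find("NNN"),
        "dpnii_start": top.find("GATC"),
        "paci_start": top.find("TTAATTAA"),
        "paired_bp": paired,
        "top_5prime_overhang_nt": len(top) - paired,
    }


print(describe_bridge(TOP, BOTTOM))
```

A redesigned bridge (for example, one with a different 4-base-cutter site) could be passed through the same function after changing the site strings.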
Materials
5′-adenylated ssDNA bridge “top strand” (HPLC purified, lyophilized): /5rApp/AANNNAAACCGGCGTCCAAGGATCTTTAATTAAGTCGCAG/3SpC3/
Biotinylated ssDNA bridge “bottom strand” (reverse orientation; HPLC purified, lyophilized): /5Phos/GATCTGCGACTTAATTAAAGATCCTTGGACGCCGG/iBiodT/T
TE buffer, pH 8.0 (see Current Protocols article: Moore, 1996)
2× bridge annealing buffer (BAB; see recipe)
Benchtop minicentrifuge (Stratagene PicoFuge or equivalent)
ThermoMixer (Eppendorf ThermoMixer or equivalent) or 94°C heat block
-
Resuspend each bridge strand in sufficient TE buffer to produce a final stock concentration of 200 μM. Vortex the tube for 10 sec and spin for several seconds in a benchtop minicentrifuge. Store at −20°C or proceed to step 2.
Top and bottom strands are synthesized as separate oligos. As with any DNA oligonucleotide that arrives as a solid, spin the dry tubes in a PicoFuge before resuspending to ensure that oligo is not present in the tube lid. Depending on the manufacturer, the preadenylated strand will likely cost substantially more than the biotinylated strand. Alternatively, unadenylated oligos can be obtained and then adenylated in house with ATP and the appropriate reagents (e.g., the NEB 5′ DNA Adenylation Kit, NEB cat. no. E2610S). If choosing to adenylate in house, be sure to verify that the reaction has sufficient yield of adenylated product (as close to 100% as possible).
-
Anneal the top and bottom strands of the bridge: Mix 2× BAB, 200 μM bridge top strand, and 200 μM bridge bottom strand in a volume ratio of 2:1:1, to make 50 μM of dsDNA bridge (with a 1:1 ratio of top to bottom bridge strands) in 1× BAB. Pipet to mix well.
The bridge strands must be annealed in a 1:1 ratio. The volume will depend on the number of samples being prepared. NaCl in the BAB improves strand annealing. Although this salt remains in the final 50 μM annealed dsDNA bridge, it will be diluted to a negligible concentration during the RNA-bridge ligation step.
-
Incubate the tube 3 min at 94°C. If using a ThermoMixer or digital heat block, keep the tube in the same slot after incubation, but reduce block temperature to 23°C, and allow tube to slowly cool while the block temperature lowers (preferred). If using a single-temperature heat block, place the tube at room temperature. The bridge anneals during cooling.
Optional: An aliquot of the annealed bridge can be run on a 12% to 15% polyacrylamide gel to verify that the bridge strands have fully annealed. A similar amount of each single-strand bridge can be run as a control.
Store the dsDNA bridge at −20°C until use in Basic Protocol 3. The final concentration should be 50 μM if 100% of single strands have annealed.
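The annealing volumes scale with the number of samples. As a minimal sketch of the arithmetic, the hypothetical helper below splits a chosen total volume in the 2:1:1 ratio given above and reports the resulting duplex concentration, assuming 200 μM stocks and complete annealing.

```python
def annealing_mix(total_ul, stock_um=200.0):
    """Return component volumes (ul) for a 2:1:1 mix and the final bridge concentration (uM)."""
    part = total_ul / 4.0
    volumes = {
        "2x BAB": 2 * part,
        "top strand": part,
        "bottom strand": part,
    }
    # Each strand is diluted 4-fold, so the duplex concentration equals the
    # final concentration of either strand.
    final_um = stock_um * part / total_ul
    return volumes, final_um


volumes, final_um = annealing_mix(40.0)
print(volumes, final_um)
```

For a 40-μl reaction this gives 20 μl of 2× BAB and 10 μl of each strand, for 50 μM bridge in 1× BAB.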
Basic Protocol 2
HARVESTING AND CROSS-LINKING OF CELLS
The first step in ChAR-seq is to formaldehyde cross-link nuclear RNA, DNA, and protein in the otherwise intact cell. RNA in direct contact with or close proximity to chromatin at the time of formaldehyde addition will be covalently linked. Cross-linking preserves chromatin-RNA contacts for the duration of the protocol until the de-cross-linking step.
The precise culture conditions and cell preparation protocol required will depend upon the specific cell line or tissue sample. Following is a protocol we have used for H9 human embryonic stem cells. In theory, any cell line should be amenable to ChAR-seq. Whole tissues or embryos may require further optimization, with a focus on obtaining isolated but structurally intact nuclei.
For each ChAR-seq sample (~10 to 15 million human cells), we recommend also collecting an additional ~1 million cells for preparation and analysis by RNA-seq. The expression levels of individual transcripts determined by RNA-seq—in transcripts per million (TPM) or reads per kilobase of transcript per million mapped reads (RPKM)—can be compared with their abundances in the ChAR-seq dataset to determine the enrichment of RNA on chromatin versus RNA levels in the entire cell.
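One simple way to make this comparison is sketched below. It is an illustration only, not the analysis used by the authors: ChAR-seq RNA-side read counts per gene are converted to counts per million and divided by the RNA-seq TPM for the same gene, with a pseudocount (an assumed value of 1) to avoid division by zero. The function name and input dictionaries are hypothetical.

```python
def chromatin_enrichment(char_counts, rnaseq_tpm, pseudocount=1.0):
    """Return {gene: ratio of ChAR-seq counts per million to RNA-seq TPM}.

    char_counts: {gene: number of ChAR-seq reads whose RNA side maps to the gene}
    rnaseq_tpm:  {gene: TPM from total-cell RNA-seq}
    """
    total = sum(char_counts.values())
    enrichment = {}
    for gene, count in char_counts.items():
        char_cpm = count / total * 1e6
        tpm = rnaseq_tpm.get(gene, 0.0)
        enrichment[gene] = (char_cpm + pseudocount) / (tpm + pseudocount)
    return enrichment


char_counts = {"geneA": 900, "geneB": 100}
rnaseq_tpm = {"geneA": 100.0, "geneB": 100000.0}
print(chromatin_enrichment(char_counts, rnaseq_tpm))
```

In this toy example geneA is strongly enriched on chromatin relative to its cellular abundance, whereas geneB is not. Whether counts per million and TPM are directly comparable depends on how the ChAR-seq reads are distributed along transcripts, so the ratio is best used for ranking rather than as an absolute value.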
Materials
Cultured cells of cell type of interest (here, hESCs, H9/WA09)
Serum-free medium for hESCs: mTeSR1 (StemCell Technologies, cat. no. 85850) or as appropriate for cell type
37% (w/w) formaldehyde (formalin) stock solution
2.5 M glycine
PBS, pH 7.4 (see Current Protocols article: Moore, 1996), prepared with DEPC-water, room temperature and ice cold
15-cm round petri dishes or 6-well plates (as appropriate for the cell type)
50-ml conical tubes
Hemocytometer
Centrifuge with swinging-bucket rotor and adapters for 50-ml tubes
1.5- to 2.0-ml nuclease-free plastic tubes (Eppendorf or similar)
Benchtop centrifuge (Eppendorf 5417C or similar)
Liquid nitrogen
-
Grow cells; perform final split into 15-cm plates and grow cells to desired confluency. Remove medium and add 34 ml fresh serum-free medium to the dish.
For hESCs, 10 to 15 million cells can easily be obtained from a single 15-cm round dish. If smaller-sized plates or dishes are used (e.g., 6-well plates), adjust the volumes accordingly in steps below. Cells can be pooled when added to a 50-ml Falcon tube (step 4 below) to obtain the desired number of cells per frozen pellet (e.g., per sample).
-
Add 3 ml 37% formaldehyde to the dish and swirl gently to mix. Rotate slowly on a shaker for 10 min at room temperature.
The 3% formaldehyde (final concentration) will cross-link nucleic acids and proteins. The final formaldehyde concentration can be adjusted depending on the cell type. We have successfully used between 1% and 4% formaldehyde with various cell types.
-
Add 12 ml 2.5 M glycine stock (to 0.6 M final) to quench the formaldehyde. Rotate slowly on a shaker for 5 min at room temperature.
The amount of glycine added will depend on the initial formaldehyde concentration.
Remove medium and add 10 ml PBS to each petri dish of cells. Manually scrape the cells with a plastic cell scraper into 50-ml conical tubes.
Remove a 10-μl aliquot of cells for counting using a hemocytometer. Cells may need to be diluted 1:2 or 1:10 in PBS for accurate counts if the density is high. Count at least two aliquots per sample for proper estimates.
Centrifuge the 50-ml conical tube of cells in a swinging-bucket rotor for 5 min at 500 × g. Gently resuspend the cells in 5 ml sterile-filtered ice-cold PBS (prepared with DEPC-treated or other RNase-free water). Keep cells on ice.
Count the cells again as in step 5. Split or pool cells into the appropriate number of 1.5-ml tubes to obtain ~10 to 15 million cells per tube. If replicate cell counts are variable, use the lower estimate to ensure that a sufficient cell number is obtained.
Centrifuge the 1.5-ml tubes with cells in a benchtop centrifuge for 5 min at 500 × g, 4°C. Remove supernatant, close tube lids, and flash freeze the cells in liquid nitrogen. Store cells at −80°C until use in Basic Protocol 3.
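Because the formaldehyde concentration and dish size may be changed, the cross-linking and quenching volumes may need to be recalculated. The following sketch shows the dilution arithmetic under the simplifying assumption that the stock percentages and molarities mix by volume; the function name and default values (taken from the steps above) are for illustration only.

```python
def crosslink_volumes(medium_ml, fa_final_pct=3.0, gly_final_m=0.6,
                      fa_stock_pct=37.0, gly_stock_m=2.5):
    """Return (formaldehyde_ml, glycine_ml) to add to a dish.

    Formaldehyde is added to the medium first; glycine is then added to the
    combined volume of medium plus formaldehyde.
    """
    fa_ml = medium_ml * fa_final_pct / (fa_stock_pct - fa_final_pct)
    volume_after_fa = medium_ml + fa_ml
    gly_ml = volume_after_fa * gly_final_m / (gly_stock_m - gly_final_m)
    return fa_ml, gly_ml


fa_ml, gly_ml = crosslink_volumes(34.0)
print(round(fa_ml, 2), round(gly_ml, 2))
```

With 34 ml of medium this returns 3 ml of formaldehyde and about 11.7 ml of glycine, which the protocol rounds up to 12 ml (approximately 0.6 M final).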
Basic Protocol 3
GENERATION OF HYBRID RNA-LINKER-DNA MOLECULES FOR DEEP SEQUENCING
This protocol comprises the main ChAR-seq procedure beginning with frozen cross-linked cell pellets and finishing with, and including, sequencing library preparation. Because the protocol involves RNA, we recommend working in an RNase-free environment until the second-strand synthesis step is complete. After that step, one can work in otherwise clean conditions with nuclease-free solutions, as one would for any protocol that involves small quantities of DNA to be sequenced.
Clean the bench, pipets, and pens or marker bodies with RNaseZap or similar solution each day before starting. When not in use, we store pipets, tips, and reagents used for ChAR-seq in covered plastic bins or containers on the bench shelves—and also when storing reagents in shared −20°C freezers or 4°C refrigerators—to reduce the potential for mishandling or RNase contamination.
Materials
Cell lysis buffer, prepared immediately before use (see recipe)
cOmplete Mini EDTA-free Protease Inhibitor Cocktail (Roche, cat. no. 11836170001) or equivalent (for 10× protease inhibitor stock, dilute 1 tablet in 1050 μl DEPC-water and store up to 12 weeks at −20°C)
RNaseOUT or equivalent, at 40 U/μl (Thermo Fisher, cat. no. 10777019)
Frozen formaldehyde-cross-linked cultured cells (Basic Protocol 2)
10% and 0.5% (w/v) SDS (see Current Protocols article: Moore, 1996)
1.5% (v/v) Triton X-100 in DEPC-treated water
Optional: exogenous RNA to spike in (see step 9)
10× T4 RNA ligase buffer (NEB, cat. no. B0216L)
PBS, pH 7.4 (Moore, 1996), prepared with DEPC-treated water
DEPC-treated or nuclease-free water
10× polynucleotide kinase buffer (NEB, cat. no. B0201S)
T4 polynucleotide kinase (PNK) (NEB, cat. no. M0201S)
Polyethylene glycol (PEG 8000) (NEB, product no. B1004, packaged with M0373L)
50 μM annealed bridge (prepared in Basic Protocol 1)
T4 RNA ligase 2, truncated KQ (T4 Rnl2tr R55K, K227Q) (NEB, cat. no. M0373L)
1 M dithiothreitol (DTT)
10 mM (each) dNTP mix
Bst3.0 polymerase (120 U/μl; NEB, cat. no. M0374M)
0.5% (w/v) SDS in 0.1 mM EDTA, prepared by adding 1/5000 (v/v) dilution of 0.5 M EDTA to 0.5% SDS
DpnII restriction enzyme (50 U/μl; NEB, cat. no. R0543M)
0.5 M EDTA (RNase-free)
T4 DNA ligase buffer (NEB, cat. no. B0202S)
T4 DNA ligase (400 U/μl; NEB, cat. no. M0202L)
cDNA synthesis support buffer (see recipe)
1 M MgCl2 (RNase-free)
Escherichia coli DNA polymerase I (10 U/μl stock; NEB, cat. no. M0209L)
RNase H (5 U/μl; NEB, cat. no. M0297S)
5 M NaCl (RNase-free)
Proteinase K (20 mg/ml)
3 M sodium acetate, pH 5.2
Glycogen (5 mg/ml)
100% (molecular biology grade) and 70% (v/v) ethanol
TE buffer, pH 8.0 (Moore, 1996)
Dynabeads MyOne Streptavidin T1 (ThermoFisher, cat. no. 65601)
Tween wash buffer (TWB) (see recipe)
2× bead binding buffer (BBB) (see recipe)
NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB, cat. no. E7645S) including:
NEBNext Ultra II End Prep Reaction Buffer (kit: NEB, cat. no. E7645S)
NEBNext Ultra II End Prep Enzyme Mix (kit: NEB, cat. no. E7645S)
NEBNext Ligation Enhancer (kit: NEB, cat. no. E7645S)
NEBNext Ultra II Ligation Master Mix (kit: NEB, cat. no. E7645S)
NEBNext Ultra II Q5 Master Mix (kit: NEB, cat. no. E7645S)
10 μM Universal Primer NEBNext Multiplex Oligos for Illumina (Set 1) (kit: NEB, cat. no. E7335S)
10 μM Indexing Primer; NEBNext Multiplex Oligos for Illumina (Set 1) (kit: NEB, cat. no. E7335S; note that there are multiple indexing primers available in sets of 12) including:
NEBNext Adaptor (kit: NEB, cat. no. E7335S)
NEB USER enzyme (kit: NEB, cat. no. E7335S)
CutSmart Buffer (NEB, cat. no. B7204S)
PacI restriction enzyme (10 U/μl; NEB, cat. no. R0547S)
NEBNext Ultra II Q5 Master Mix (NEB, cat. no. M0544S; this is required in addition to amount included in kit E7645S)
AMPure XP SPRI (solid-phase reversible immobilization) beads (5 ml; Beckman-Coulter, cat. no. A63880)
10 mM Tris·Cl, pH 8.0 (Moore, 1996)
100× Sybr Green (10,000× concentrate, diluted 1:100; ThermoFisher, cat. no. S7563)
1.5- to 2.0-ml nuclease-free plastic tubes (Eppendorf or similar)
Centrifuge (Eppendorf 5417C or similar), room temperature and 4°C
Agilent Bioanalyzer automated electrophoresis system or equivalent
Covaris S220 focused ultrasonicator or equivalent
Eppendorf ThermoMixer or equivalent heating block with shaking ability
Magnetic rack for multiple (6 to 12) 1.5- to 2.0-ml tubes
PCR thermocycler
200-μl PCR tubes or strips
Cell lysis, RNA fragmentation, and ligation of RNA to bridge (day 1)
Cell lysis
-
1
Prepare cell lysis buffer (see recipe) in advance. Immediately before use, mix 700 μl lysis buffer with 80 μl 10× protease inhibitor stock (to 1× final) and 20 μl RNaseOUT (to 1 U/μl final), mixing gently to avoid foaming.
-
2
Remove the tubes of frozen cross-linked cells from −80°C freezer and immediately add 400 μl of pre-mixed cell lysis buffer + protease inhibitors to the still frozen cross-linked cells. Thaw cells at room temperature and mix gently with a P1000 micropipettor and pipet tip.
Do not let cells thaw on ice before adding lysis buffer.
-
3
After thawing is complete, incubate the cells 2 min at 4°C (on ice).
-
4
Centrifuge 4 min at 2,500 × g, room temperature.
Centrifugation steps are performed at room temperature, unless otherwise noted.
-
5
Pipet out the supernatant and discard. Wash the pellet with 400 μl of lysis buffer + inhibitors.
Pipet slowly to avoid excessive foaming.
-
6
Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard, taking care to avoid removing any portion of the pellet.
Different cell types or numbers may require longer spins to pellet fully. If the pellet is still fluffy or pellet fragments risk being pipetted during removal of supernatant, centrifuge the sample again at 2,500 × g for an additional 2 to 4 min. Although especially critical here, this general precaution applies to all later centrifugation steps as well. In our experience, with ~10 million input cells, a pellet will be clearly visible after each centrifugation throughout the protocol. Always monitor pellet integrity throughout when pipetting supernatant to discard. If the pellet appears unstable during such a step, centrifuge for an additional 1 min at 2,500 × g, or spin the tube for ≥10 seconds in a minicentrifuge immediately before pipetting.
-
7
Gently resuspend the pellet in 100 μl 0.5% SDS and incubate 5 min at 37°C.
-
8
Add 340 μl 1.5% Triton X-100 to neutralize the SDS. Mix well by pipetting gently to avoid excessive foaming. Incubate 15 min at 37°C.
The 1.5% Triton X-100 can be prepared at the beginning of the day or during the 5-min incubation above (prepare a master mix of 290 μl water and 50 μl 10% Triton X-100 per sample).
RNA fragmentation
-
9
Optional: Add spike-in RNA (<50 μl total volume) to the sample and incubate 5 min on ice.
To estimate the background contamination level of random ligation of floating RNA (i.e., fragmented or uncross-linked RNA), we recommend adding exogenous RNA to the sample immediately before RNA fragmentation. This could be total RNA or mRNA from a different cell type (e.g., a species with sufficient sequence divergence from the sample such that reads align uniquely, as for Drosophila and humans); or in vitro-transcribed RNA(s) not present in the organism under study. These RNAs should be added to result in ~1% to 5% of the final reads, a number that may need to be determined empirically. If using cells instead of purified RNA, one can add the spike-in cells during initial cross-linking (Basic Protocol 2) or when thawing the endogenous cells in step 2. This optional step is not necessary for the ChAR-seq protocol, but may provide a useful estimate of false positive ligations.
-
10
Fragment the RNA by adding 11 μl 10× T4 RNA ligase buffer for a final concentration of 0.25×. Incubate exactly 4 min at 70°C. Immediately place on ice when finished.
T4 RNA ligase buffer contains Mg2+, which at higher temperatures will induce RNA fragmentation. The fragmentation time, and thus the number of available 3′ ligation sites, will depend on the cell number and cell type. We therefore recommend optimizing the RNA fragmentation time before beginning the protocol. It is best to optimize with a cell number similar to what will be used in the actual protocol: e.g., perform the protocol through the completed RNA fragmentation step 10, wash with PBS (i.e., add 1000 μl PBS, centrifuge at 2,500 × g for 2 min, discard supernatant) twice, de-cross-link the sample as in steps 43 to 45, and extract the RNA using standard procedures. Examine RNA size distribution on a gel or Bioanalyzer. Partial, but not complete, fragmentation is desired. We have used fragmentation times ranging from 2 to 5 min.
-
11
Add 1000 μl DEPC-treated PBS and incubate 2 min on ice. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
-
12
Add 1000 μl DEPC-treated PBS and resuspend the pellet by gentle pipetting. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard, and then proceed immediately to the next step.
The above two steps remove the ligase buffer and wash the pellet. Immediately proceed to step 13, with pre-mixed PNK reaction buffer already prepared.
-
13
Add 200 μl polynucleotide kinase (PNK) reaction mix, pre-mixed beforehand from the following:
- 170 μl DEPC-treated water
- 20 μl 10× PNK buffer
- 10 μl T4 polynucleotide kinase (PNK) enzyme.
Pipet gently to resuspend pellet. Incubate 30 min at 37°C with light mixing (e.g., ~450 rpm on a shaker).
Chemical fragmentation of RNA with heat and Mg2+ favors 3′-phosphate formation at breakages. PNK will remove RNA 3′-phosphoryl groups and ensure 3′-hydroxyl groups are available for efficient ligation. A 3′-hydroxyl group is required for ligation to the 5′-adenylated ssDNA using the 5′-Thermostable App-DNA/RNA ligase. Prepare the PNK reaction mix ahead of time to immediately add after step 12 above.
-
14
Add 1000 μl DEPC-treated PBS to the reaction to wash out PNK, and pipet gently to resuspend. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
-
15
Add 200 μl 1× T4 RNA ligase buffer and pipet gently. The pellet does not need to be dislodged or fully resuspended in this step. Centrifuge 1 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
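If the optional spike-in of step 9 is used, its contribution to the final library can be checked after sequencing. The sketch below is a minimal illustration, assuming reads have already been assigned to either the sample or the spike-in reference by their RNA side; the function names are hypothetical, and the 1% to 5% window is the target range given in the note to step 9.

```python
def spike_in_fraction(n_sample_reads, n_spike_reads):
    """Fraction of all RNA-side reads that derive from the spike-in RNA."""
    total = n_sample_reads + n_spike_reads
    return n_spike_reads / total if total else 0.0


def in_target_range(fraction, low=0.01, high=0.05):
    """True if the spike-in fraction falls within the ~1% to 5% target window."""
    return low <= fraction <= high


fraction = spike_in_fraction(980_000, 20_000)
print(fraction, in_target_range(fraction))
```

A fraction well outside the window suggests that the amount of spike-in RNA should be adjusted in a subsequent experiment.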
RNA-to-bridge ligation
-
16
Add 200 μl RNA ligation solution, pre-mixed beforehand from the following:
- 42.5 μl water
- 20 μl 10× T4 RNA ligase buffer
- 100 μl 50% PEG 8000 (25% final)
- 7.5 μl RNaseOUT (1.5 U/μl final)
- 20 μl 50 μM annealed bridge (1 nmol bridge per 10 million cells, or 100 pmol per 1 million cells)
- 10 μl 200 U/μl T4 RNA ligase 2, truncated KQ (10 U/μl final).
Pipet to fully resuspend the pellet.
The RNA ligation solution should be prepared in advance (e.g., during the 30-min PNK incubation) and fully mixed by repeated pipetting. PEG increases molecular crowding to stimulate the ligation activity of the enzyme. PEG is viscous, so during the preparation of RNA ligation solution it may be helpful to add 50 μl of the PEG first to mix with the remaining components, and then add the final 50 μl PEG and pipet to ensure complete mixing of all components. For optimum ChAR-seq libraries, the amount of bridge required can be estimated based on a titration with the cell number. We use 100 pmol bridge per 1 million cells for human cells as a starting point.
-
17
Incubate at 23°C overnight (or 12 to 18 hr), shaking at ~900 rpm in a ThermoMixer.
Although shorter ligation times (4 to 8 hr) may also effectively ligate RNA to the bridge, an overnight ligation provides a convenient end point for day 1 of the protocol.
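When several samples are processed in parallel, the RNA ligation solution is conveniently prepared as a master mix. The sketch below scales the per-sample volumes listed in the RNA-to-bridge ligation step and computes the bridge volume for a given cell number from the 100 pmol per million cells starting point. The 10% overage and the function names are assumptions for illustration.

```python
# Per-sample volumes (ul) from the RNA-to-bridge ligation step.
RNA_LIGATION_MIX_UL = {
    "water": 42.5,
    "10x T4 RNA ligase buffer": 20.0,
    "50% PEG 8000": 100.0,
    "RNaseOUT (40 U/ul)": 7.5,
    "50 uM annealed bridge": 20.0,
    "T4 RNA ligase 2, truncated KQ (200 U/ul)": 10.0,
}


def scale_mix(per_sample_ul, n_samples, overage=0.1):
    """Scale per-sample volumes to n_samples plus a fractional overage (assumed 10%)."""
    factor = n_samples * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_sample_ul.items()}


def bridge_volume_ul(million_cells, pmol_per_million=100.0, stock_um=50.0):
    """Volume of annealed bridge stock; 1 uM equals 1 pmol/ul."""
    return million_cells * pmol_per_million / stock_um


print(sum(RNA_LIGATION_MIX_UL.values()))   # total per-sample volume in ul
print(scale_mix(RNA_LIGATION_MIX_UL, 4))
print(bridge_volume_ul(10))
```

If the bridge volume is changed after a titration, the water volume should be adjusted so that the total remains 200 μl per sample.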
First-strand synthesis, genomic DNA digestion, and DNA-bridge ligation (day 2)
First-strand synthesis
-
18
Add 800 μl DEPC-treated PBS to wash out RNA ligation solution and associated PEG. Centrifuge 6 min at 2,500 × g, room temperature. Pipet out the supernatant and discard.
This centrifugation is longer than other wash spins due to the high viscosity of PEG. During the spin, prepare first-strand synthesis reaction mix, minus the enzyme.
-
19
Add 500 μl 1× T4 RNA ligase buffer and pipet gently to resuspend. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
-
20
Add 233.3 μl first-strand synthesis reaction mix, pre-mixed beforehand from the following:
- 25 μl 10× T4 RNA ligase buffer
- 176.8 μl DEPC-treated water
- 0.25 μl 1 M DTT
- 6.25 μl RNaseOUT (1 U/μl final)
- 25 μl 10 mM (each) dNTP mix.
Pipet gently to mix well. Immediately proceed to the next step.
-
21
Add 16.7 μl 120 U/μl Bst3.0 (~2000 U total, 8 U/μl final). Pipet to mix well. Incubate for 15 min at 23°C while mixing gently (~450 rpm on ThermoMixer). Increase temperature to 37°C and incubate an additional 10 min. Increase temperature to 50°C and incubate an additional 20 min.
-
22
Remove samples from ThermoMixer. Add 1000 μl DEPC-treated PBS, pipet gently to mix, and cool on ice for 1 min. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
-
23
Resuspend the pellet in 100 μl 0.5% SDS (in 0.1 mM EDTA) and immediately place samples at 55°C for 10 min.
SDS and heat will inactivate the Bst3.0 enzyme.
-
24
Add 340 μl of 1.5% Triton X-100 to each sample to quench the SDS. Pipet gently to mix well, and incubate 15 min at 37°C.
-
25
Add 1000 μl DEPC-treated PBS and incubate briefly on ice for ~1 min to cool the samples down to room temperature from 37°C. Centrifuge 2 min at 2,500 × g. Pipet out supernatant and discard.
-
26
Add 500 μl 1× T4 RNA ligase buffer to wash the pellet (the pellet does not need to be fully resuspended). Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard. Proceed immediately to the next step.
Genomic DNA digestion
-
Add 235 μl genomic digestion solution, pre-mixed beforehand from the following:
- 25 μl 10× T4 RNA ligase buffer
- 203.5 μl DEPC-treated or RNase-free water
- 0.25 μl 1 M DTT
- 6.25 μl RNaseOUT (1 U/μl final).
Pipet gently to fully resuspend the pellet.
-
Add 15 μl 50 U/μl DpnII stock (to 3 U/μl final) and pipet to mix well. Incubate at 37°C for a minimum of 3 hr and up to overnight (or 12 to 18 hr) if desired.
DpnII will digest both genomic DNA and the “spacer fragment” in the bridge to create compatible ligation ends. The approximate cutting frequency of DpnII is once every ~256 bp in a random DNA sequence. Different enzymes have not been tested, but could be used (along with requisite changes to the restriction cut site in the bridge) to increase the diversity of ligation junctions across samples. We typically digest for 3 hr, and have not found a significant difference in results if the digestion proceeds overnight.
-
Terminate the reaction by adding 7.5 μl 0.5 M EDTA (final concentration ~15 mM) and incubate 5 min at room temperature.
T4 RNA ligase buffer contains 10 mM MgCl2. EDTA will chelate magnesium ions, which are required for DpnII to digest DNA, thus stopping enzyme activity.
-
Wash out DpnII: Add 1000 μl DEPC-treated PBS, and centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
Use a pipettor with a 200-μl tip (P200) to ensure all supernatant is removed.
-
Resuspend the pellet in 100 μl 0.5% SDS (in 0.1 mM EDTA). Incubate 10 min at 55°C.
The SDS will inactivate residual DpnII.
Add 340 μl 1.5% Triton X-100 to each sample to quench the SDS. Pipet gently to mix well and incubate 5 min at 37°C.
Add 1000 μl DEPC-treated PBS and incubate 2 min on ice. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
Add 500 μl T4 DNA ligase buffer and pipet gently to resuspend. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
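The ~256-bp cutting frequency quoted for DpnII in the digestion step follows from its 4-bp recognition site (GATC): in a random sequence with equal base frequencies, a given 4-mer is expected about once every 4^4 = 256 positions. The short sketch below reproduces that estimate. The function names are illustrative, and real genomes deviate from the equal-frequency assumption, so the result is only a rough guide.

```python
def expected_site_spacing(site: str, base_freq: float = 0.25) -> float:
    """Expected mean distance (bp) between occurrences of a recognition site.

    Assumes a random sequence in which every base of the site occurs
    independently with probability `base_freq` (0.25 = equal base usage).
    """
    return 1.0 / (base_freq ** len(site))


def expected_site_count(genome_bp: int, site: str) -> float:
    """Rough number of sites expected in a genome of `genome_bp` base pairs."""
    return genome_bp / expected_site_spacing(site)


DPNII_SITE = "GATC"  # DpnII recognition sequence

if __name__ == "__main__":
    print(expected_site_spacing(DPNII_SITE))          # 256.0
    print(expected_site_count(2_560_000, DPNII_SITE)) # 10000.0
```

The same function can be used to compare candidate enzymes: a 6-bp cutter, for example, would be expected to cut roughly once every 4096 bp under the same assumption.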
DNA-to-bridge ligation
-
5
Add 500 μl genomic DNA ligation solution, pre-mixed beforehand from the following:
- 427.5 μl DEPC-treated water
- 12.5 μl RNaseOUT
- 50 μl T4 DNA ligase buffer
-
10 μl 400 U/μl T4 DNA ligase (4000 U; 8 U/μl final).
Pipet to resuspend the pellet.
This reaction requires ATP, which is included in the T4 DNA ligase buffer.
-
6
Incubate at 16°C overnight (or 12 to 18 hr), shaking at ~900 rpm in a ThermoMixer.
This ligation step provides a natural stopping point for day 2.
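The final enzyme concentrations quoted in the digestion and ligation steps (3 U/μl DpnII, 8 U/μl T4 DNA ligase) can be checked with a simple dilution calculation. The sketch below is one way to do that when scaling a reaction; the helper name is hypothetical, and it assumes that the pellet contributes no volume.

```python
def final_concentration(stock_conc: float, stock_vol_ul: float,
                        other_vol_ul: float) -> float:
    """Concentration after adding `stock_vol_ul` of stock to `other_vol_ul`.

    Units of the result match the units of `stock_conc` (e.g., U/ul or mM).
    Assumption: volumes are additive and the pellet volume is negligible.
    """
    total_ul = stock_vol_ul + other_vol_ul
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_vol_ul / total_ul


# Components of the 500-ul ligation solution, as listed in the protocol (ul).
LIGATION_MIX_UL = {
    "DEPC-treated water": 427.5,
    "RNaseOUT": 12.5,
    "10x T4 DNA ligase buffer": 50.0,
    "T4 DNA ligase (400 U/ul)": 10.0,
}

if __name__ == "__main__":
    # 15 ul of 50 U/ul DpnII added to 235 ul digestion solution
    print(final_concentration(50, 15, 235))    # 3.0 U/ul
    # 10 ul of 400 U/ul ligase in a 500-ul mix (490 ul of other components)
    print(final_concentration(400, 10, 490))   # 8.0 U/ul
    print(sum(LIGATION_MIX_UL.values()))       # 500.0 ul
```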
Second-strand synthesis, cross-link reversal, and shearing (day 3)
Terminate the ligation reaction
-
1
Add 15 μl 0.5 M EDTA (~15 mM final) to the tube to stop the ligation reaction. Incubate 2 min at 23°C.
EDTA will chelate the magnesium in the T4 ligase buffer (10 mM), which is required for T4 ligase activity. The wash step below removes the inactivated enzyme.
-
2
Add 1 ml PBS and centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
-
3
Add 250 μl 1× cDNA synthesis support buffer to wash and resuspend pellet. Centrifuge 2 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
Second-strand synthesis
-
4
Resuspend the pellet in 200 μl 1× cDNA buffer. Add to each tube:
-
1 μl 1 M DTT
-
2.4 μl 1 M MgCl2 (~10 mM final)
-
5 μl RNaseOUT (1 U/μl final)
-
20 μl 10 mM (each) dNTP mix
-
10 μl 10 U/μl E. coli DNA polymerase I (100 U total)
-
1 μl 5 U/μl RNase H (5 U total).
-
-
5
Pipet to mix well and resuspend the pellet. Incubate 2 hr at 16°C.
-
6
Centrifuge 3 min at 2,500 × g, room temperature. Pipet out supernatant and discard.
Cross-link reversal
-
6
Resuspend pellet in 400 μl DEPC-treated PBS. Add to each tube:
-
55 μl 10% SDS (~1% SDS final)
-
55 μl 5 M NaCl (0.5 M final)
-
3 μl 20 mg/ml proteinase K.
-
-
7
Incubate in a thermomixer or heat block overnight (or 12 to 18 hr) at 70°C.
This step provides a natural stopping point for the end of day 3.
DNA precipitation
-
4
Remove sample from thermomixer and cool sample to room temperature.
Do not cool the sample on ice, or the SDS may precipitate.
-
5
Add 50 μl 3 M sodium acetate and 2 μl 5 mg/ml glycogen to each tube. Then add 1000 μl ice-cold 100% ethanol. If there is room in the tube, add an additional 100 μl ice-cold 100% ethanol.
Flocculent DNA should be visible after inverting the tube three or four times.
-
6
Incubate 30 min on ice.
-
7
Centrifuge 10 min at ~21,000 × g, 4°C. Pipet out supernatant and discard.
-
8
Add 1000 μl ice-cold 70% ethanol to wash the pellet. Centrifuge 5 min at ~21,000 × g, 4°C. Pipet out supernatant and discard.
Use a P200 to remove as much supernatant as possible without disturbing the pellet. This will help to remove residual ethanol.
Shearing
-
1
Gently dry the pellet by leaving the tube open for ~5 min. Add 130 μl TE buffer and fully resuspend the pellet.
-
2
Shear the DNA using a Covaris S220 focused-ultrasonicator or equivalent, set to obtain a peak fragment size of ~200 bp.
The shearing settings can be determined empirically beforehand using one sample. On a Covaris S220 machine, the following settings are a useful starting point: 10% duty factor, 175 peak incident power, 200 cycles per burst, for 180 sec.
-
3
Verify the precise size distribution of the sheared sample by analysis on a Bioanalyzer or equivalent device.
After step 52, one can proceed immediately to step 53. Alternatively, store DNA at −20°C overnight (or up to several weeks, if needed). This step provides a convenient multiday stopping point, if necessary.
Isolation of biotinylated DNA fragments and sequencing library preparation
Isolation of biotinylated DNA fragments
-
4
Pipet 150 μl Dynabeads (MyOne Streptavidin T1) into a clean 1.5-ml tube. Place the tube on a magnetic rack. Remove supernatant after the beads have aggregated on the magnet (after ~2 min).
-
5
Wash beads by adding 400 μl TWB. Remove tube from magnetic stand and mix well by gently pipetting until the beads are resuspended. Place tube back on magnetic rack and wait 2 min for beads to aggregate. Remove and discard supernatant.
-
6
Remove tube from magnetic rack and resuspend beads in 130 μl 2× BBB. Add all sheared DNA (~130 μl, for a final volume of ~260 μl) to the bead slurry and mix well by pipetting.
-
7
Incubate for 15 min at room temperature, with rotation or gentle mixing on a thermomixer, to bind biotinylated DNA to the beads.
-
8
Place the tube on the magnetic rack, wait at least 2 min for the beads to aggregate, and remove supernatant.
Do not allow the beads to dry out. Residual supernatant will be removed by the washes in the following steps. The supernatant should not contain any biotinylated DNA but can be saved out of precaution or for troubleshooting.
-
9
Add 750 μl 1× TWB to wash the beads and mix with a pipet after removing the tube from the magnetic rack. Warm the tubes at 50°C for 2 min in a heat block or thermomixer (with no shaking).
-
10
Place tube on magnetic rack, wait at least 2 min, and remove and discard supernatant.
-
11
Add 750 μl 1× TWB to wash the beads a second time. Mix with a pipet after removing tube from magnetic rack. Place tube back on rack, remove and discard supernatant, and immediately move on to the next step.
On-bead sequencing adaptor ligation and optional elimination of undigested bridge
Steps 61 to 73 below are part of most standard DNA sequencing library preparations. The main steps are (1) end repair; (2) addition of dA to each end of the molecule; (3) adaptor ligation; and (4) PCR amplification and the optional addition of barcodes. We use NEB reagents, but any similar library prep kit or reagents could be substituted. These steps are performed with the hybrid cDNA-bridge-DNA molecule still attached to beads.
-
12
End repair and dA tailing: Remove tubes from rack and resuspend the beads in 40 μl TE. Add 7 μl NEBNext Ultra II End Prep Reaction Buffer and 3 μl NEBNext Ultra II End Prep Enzyme Mix, and incubate 20 min at room temperature with agitation on a thermomixer to keep beads from settling. Increase temperature to 65°C and incubate for 30 min. Remove tube from heat and cool to room temperature.
-
13
Adaptor ligation: Add the following to each tube, in the order shown:
- 2.5 μl NEBNext Adaptor (be sure to add first)
- 1 μl NEBNext Ligation Enhancer
-
30 μl NEBNext Ultra II Ligation Master Mix (be sure to add last).
Mix vigorously by pipetting, and incubate 15 to 20 min at room temperature with agitation on a thermomixer to keep beads suspended.
-
14
Add 3 μl NEB USER enzyme to each tube. Mix with gentle pipetting. Incubate 15 min at 37°C.
-
15
Add 750 μl 1× TWB to wash the beads and mix with a pipet after removing tube from magnetic rack. Warm tubes for 2 min at 50°C in a heat block or thermomixer (with no shaking).
-
16
Place the tube on the magnetic rack, wait ~2 min, and remove and discard the supernatant.
-
17
Add 750 μl 1× TWB to wash the beads a second time. Mix with a pipet after removing tube from the magnetic rack. Place tube back on rack, remove and discard supernatant.
-
18
Optional: Eliminate undigested bridge, which will remain unligated to DNA, by performing a PacI digestion of adapter-ligated molecules. Pre-mix the following:
-
86 μl nuclease-free water
-
10 μl NEB CutSmart Buffer
-
4 μl PacI restriction enzyme (10 U/μl).
Add to the beads and incubate for 1 hr at room temperature with gentle mixing.
Steps 67 to 70 are optional. The adenylated strand (top strand) of the double-stranded bridge molecule contains a three-carbon (3C) phosphoramidite spacer on the 3′ end that is cleaved off during the DpnII digestion step to reveal free DpnII overhangs. Although the 3C spacer prevents adapter ligation to the 3′ end of undigested bridge, the biotinylated bottom strand can sometimes ligate to adapters, amplify during PCR, and occasionally (<5% of sequenced reads) result in final molecules with cDNA and bridge, but no DNA. These can be eliminated by digesting all adapter-ligated molecules with PacI, as there is a PacI site present between the DpnII site and spacer base in the adenylated strand of the bridge. This will remove the adapter from the 3′ end and prevent amplification by PCR in step 71.
-
-
19
Optional: Add 750 μl 1× TWB to wash the beads and mix with a pipet after removing tube from magnetic rack.
-
20
Optional: Place tube back on magnetic rack, wait at least 2 min, and remove and discard supernatant.
-
21
Optional: Add 750 μl 1× TWB to wash the beads a second time. Mix with a pipet after removing tube from the magnetic rack. Place tube back on rack and remove and discard supernatant.
Library amplification by on-bead PCR
-
4
Prepare the following amplification mix for each sample:
-
25 μl 2× NEBNext Ultra II Q5 Master Mix
-
2.5 μl 10 μM Universal Primer (NEBNext Multiplex Oligos for Illumina)
-
2.5 μl 10 μM Indexing Primer (NEBNext Multiplex Oligos for Illumina)
-
20 μl RNase-free water.
Resuspend beads in 50 μl amplification mix and transfer to a fresh 200-μl PCR tube.
If multiple samples will be sequenced together, assign a unique barcode to each sample using NEB index primers. For more than one sample, prepare a PCR master mix with all components except the Indexing Primer, which is sample specific.
-
-
5
Perform PCR with the following cycles:
1 cycle: 3 min 98°C (initial denaturation)
7 cycles: 30 sec 98°C (denaturation), 30 sec 63°C (annealing), 40 sec 72°C (extension).
-
6
When the PCR is finished, transfer the reaction to new 1.5-ml tubes and place tubes on magnetic rack. Allow 5 min for the beads to aggregate. Transfer the supernatant (which contains the amplified library) to a clean 1.5-ml tube.
The complete original library will still be bound to the beads as the single biotinylated (bottom) strand of the hybrid molecule. Resuspend the beads in 200 μl TE buffer and store at 4°C for later troubleshooting, if necessary. It should be possible to re-PCR off the beads if, following amplification and size selection, a greater concentration of library is required or if an error is made in downstream steps.
Removal of adapter dimers by SPRI-bead-based size selection
Amplification of self-ligated adapters (“adapter dimers”) using NEB adapters and index primers results in a ~130-bp fragment. Adaptor dimers are undesirable because (a) owing to their short length, adapter dimers may preferentially amplify during qPCR-based library quantification and adversely affect determination of the library concentration; and (b) adapter dimers will cluster and sequence if present in the final sequencing library, which wastes reads. Therefore, adapter dimers must be removed.
-
4
Allow AMPure XP bead slurry to warm to room temperature, then vortex thoroughly to resuspend. Measure the volume of each sample (the PCR supernatant). Add an equal volume of beads to each sample, and mix by pipetting until homogeneously distributed. Incubate for 5 min at room temperature.
The parameter that determines what sized molecules bind to the beads is the concentration of PEG in the final bead-sample solution. Thus, the ratio of beads to sample is critical. Here, a 1:1 bead:sample ratio is desired to remove DNA shorter than ~150 bp (i.e., DNA <150 bp will not bind the beads at the resulting PEG concentration). See the AMPure XP manual for details. Note that size selection occurs in a gradient over a bp range, rather than a sharp cut-off as with gel purification.
-
5
Place tube on magnetic stand and allow at least 2 min for beads to aggregate. Remove the supernatant and discard or save for potential troubleshooting. Do not disturb the beads.
The supernatant can be discarded, but it may be prudent to save and store at 4°C for later troubleshooting until the size selection and completed library prep is validated by qPCR and Bioanalyzer.
-
6
Add 250 μl 70% ethanol to wash the beads. The beads do not need to be resuspended. Let tubes sit ~2 min on the magnetic rack to allow beads to aggregate. Remove supernatant and discard. Remove tubes from magnetic rack and allow beads to dry for ~5 min. Do not over-dry the beads.
-
7
Add 33 μl 10 mM Tris·Cl, pH 8.0, to elute. Mix well and incubate 5 min at room temperature.
Adding 33 μl of elution solution will result in a final sample volume of ~30 μl if one is careful to not pipet trace beads. Tris buffer is used here instead of TE, since the EDTA in TE may inhibit later PCR steps.
Side qPCR to determine number of further amplification cycles needed
-
8
Mix 5 μl of library DNA (eluate from step 77) in a fresh PCR tube with the following reagents (per sample):
-
6 μl 2× NEBNext Ultra II Q5 Master Mix
-
0.5 μl 10 μM Universal Primer (NEBNext Multiplex Oligos for Illumina)
-
0.5 μl 10 μM Indexing Primer (NEBNext Multiplex Oligos for Illumina)
-
0.15 μl 100× SYBR Green.
Be sure to match the correct unique indexed primer to the appropriate sample.
Perform the following PCR cycle on a qPCR machine:
1 cycle: 3 min 98°C (initial denaturation)
25 cycles: 30 sec 98°C (denaturation), 30 sec 63°C (annealing), 40 sec 72°C (extension).
The side qPCR empirically determines the appropriate number of PCR cycles for the off-bead library amplification step that follows. This helps to minimize overamplification artifacts, such as excessive duplication or GC enrichment, during the final library amplification. To calculate the number of required additional cycles, plot the linear Rn (normalized signal) versus cycle for each sample. The cycle number at which the fluorescence reaches approximately one-quarter of its plateau value (often the maximum) is the number of additional PCR cycles to use for that sample. Avoid choosing a cycle number close to the fluorescence maximum. The number of cycles may vary between sample libraries.
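The one-quarter-of-plateau rule described for the side qPCR can be applied programmatically to exported Rn values. The sketch below is a minimal illustration, not part of the published workflow: the function name is hypothetical, the plateau is taken to be the maximum Rn, no baseline subtraction is applied, and the first cycle that reaches the threshold is reported. The example curve is synthetic.

```python
from typing import Sequence


def additional_cycles(rn: Sequence[float], fraction: float = 0.25) -> int:
    """Return the first cycle (1-based) at which Rn reaches fraction * plateau.

    Assumptions: `rn` holds linear Rn values for cycles 1..len(rn), and the
    plateau is approximated by the maximum observed value.
    """
    if not rn:
        raise ValueError("rn must contain at least one value")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    plateau = max(rn)
    if plateau <= 0:
        raise ValueError("no positive fluorescence signal")
    threshold = fraction * plateau
    for cycle, value in enumerate(rn, start=1):
        if value >= threshold:
            return cycle
    raise RuntimeError("unreachable: the maximum always meets the threshold")


if __name__ == "__main__":
    # Synthetic curve for illustration only (plateau of 4.0, threshold 1.0).
    curve = [0.1, 0.1, 0.2, 0.4, 0.9, 1.8, 3.0, 3.8, 4.0, 4.0]
    print(additional_cycles(curve))  # 6
```

Because the rule is approximate, a user who prefers to err on the side of underamplification could subtract one cycle from the returned value.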
-
Off-bead library amplification
-
5
For each indexed sample, mix the following in a clean PCR tube:
-
30 μl 2× NEBNext Ultra II Q5 Master Mix
-
25 μl eluted library from step 77
-
2.5 μl 10 μM Universal Primer (NEBNext Multiplex Oligos for Illumina)
-
2.5 μl 10 μM Indexing Primer (NEBNext Multiplex Oligos for Illumina).
Run the following PCR cycle:
N cycles: 98°C for 30 sec (denaturation), 63°C for 30 sec (annealing), 72°C for 40 sec (extension)
Hold: 4°C for 5 min or more.
The goal of library amplification is to amplify the minimum amount of material sufficient for a sequencing library. When amplifying multiple samples with different cycle numbers, we find it efficient to program the PCR machine using the largest cycle number (N), with the experimenter standing near the PCR machine to remove samples with lower cycle numbers by hand at the appropriate time. These can be placed on ice until the final cycle has completed. Typical additional cycle numbers for samples of 10 million hESCs range from 1 to 8.
-
Final high and low size selection
The target length for the final DNA molecule depends on the sequencing modality. For 2 × 150-bp paired-end sequencing, the target length including adapters is ~280 to 330 bp (130 bp adapter + ideal fragment length of 150 to 200 bp), so a good distribution of sizes will peak around 280 to 330 and often include sequences from 200 to 500 bp in the distribution tails.
-
5
Allow the AMPure XP bead slurry to warm to room temperature, then vortex thoroughly to resuspend. Measure the volume of the PCR reaction mix. If this is <60 μl, add 1 to 3 μl nuclease-free water to reach 60 μl. Add 0.5 volumes of AMPure bead slurry (30 μl) to the entire PCR reaction mix (60 μl) for a final bead:sample ratio of 0.5. Mix by pipetting until homogeneously distributed. Incubate for 5 min at room temperature.
The 0.5 bead:sample ratio should result in molecules >550 bp bound to beads, and molecules of 0 to 500 bp present in solution. If more stringent upper size selection is required, use a 0.6 bead:sample ratio in this step. There may be slight variations among batches of beads, so we suggest that the precise ratio for a desired upper size threshold be empirically determined.
-
6
Place the tube on a magnetic stand. Allow >2 min for beads to aggregate. Recover the supernatant into a new tube.
If the larger fragments (>500 bp) are required for troubleshooting or desired for analysis, add 70% ethanol to the beads and store at 4°C until needed (from several days to a week).
-
7
Add 24 μl of fresh AMPure bead slurry to the supernatant (~90 μl), based on the original 0.5 bead:sample ratio in step 80, for a final bead/sample ratio of 0.9. If a ratio of 0.6 was used in the first size selection (step 80), the supernatant will be ~96 μl; add 18 μl beads, as determined by the equations below. Mix and incubate for 5 min at room temperature (without shaking).
-
If a 0.5 bead/sample ratio was used in step 80, add (X) amount of AMPure beads to obtain a final 0.9 bead/sample ratio for this second size selection.
-
X μl = 0.2666 × volume of supernatant (μl)
-
If a 0.6 bead/sample ratio was used in step 80, add (X) amount of AMPure beads to obtain a final 0.9 bead/sample ratio for this second size selection.
-
X μl = 0.1875 × volume of supernatant (μl)
To completely remove adapter dimers, use a 0.9 bead:sample final ratio in the second size-selection step. If smaller molecules (150 to 200 bp) are desired, a 1:1 final ratio in the second selection can be used to remove most adapter dimers, although residual dimers may still be present. The precise ratio required may depend on the AMPure bead batch.
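The two coefficients above (0.2666 and 0.1875) are special cases of one relationship. If the first selection used a bead:sample ratio r1, the supernatant volume is sample × (1 + r1), so the beads needed to reach a final ratio r2 are X = (r2 − r1) / (1 + r1) × supernatant volume. The sketch below implements this; the function name is hypothetical, and it assumes that the entire supernatant (sample plus first-round bead buffer) was recovered.

```python
def second_bead_addition_ul(supernatant_ul: float, first_ratio: float,
                            final_ratio: float = 0.9) -> float:
    """Volume of fresh bead slurry (ul) to add for the second size selection.

    Assumptions: the supernatant contains the original sample plus all of the
    bead buffer from the first selection, i.e.
    supernatant = sample * (1 + first_ratio).
    """
    if final_ratio <= first_ratio:
        raise ValueError("final ratio must exceed the first ratio")
    sample_ul = supernatant_ul / (1.0 + first_ratio)
    return (final_ratio - first_ratio) * sample_ul


if __name__ == "__main__":
    # 60-ul sample, first selection at 0.5x -> ~90 ul supernatant
    print(round(second_bead_addition_ul(90, 0.5), 3))  # 24.0
    # 60-ul sample, first selection at 0.6x -> ~96 ul supernatant
    print(round(second_bead_addition_ul(96, 0.6), 3))  # 18.0
```

Measuring the recovered supernatant and passing that volume to the function, rather than assuming 90 or 96 μl, accounts for small pipetting losses.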
-
-
8
Place the beads on the magnet, allow >2 min for beads to aggregate, and remove and discard the supernatant.
The supernatant contains small molecules <150 to 200 bp (e.g., adapter dimers) and can be discarded or saved for trouble-shooting.
-
9
Add 200 μl 70% ethanol to wash the beads. Place tube back on magnetic rack for >2 min, and discard supernatant. Repeat this step once for a total of two washes. Avoid disrupting the beads during the second wash.
-
10
Remove all traces of ethanol without disturbing the beads. Remove tube from rack, and let the beads dry for ~5 min. Add 33 μl elution buffer (10 mM Tris·Cl, pH 8), mix well, and incubate at least 5 min at room temperature.
Use a P200 pipet followed by a P20 pipet or similar to remove all ethanol.
-
11
Place the tube back on the magnetic rack, and allow 2 to 4 min for the beads to aggregate. Remove the supernatant (DNA library) and transfer to a clean 1.5-ml tube.
The final DNA library (i.e., the supernatant) should contain molecules ~150 to 500 bp in length. The actual distribution within this range may vary from sample to sample, and is largely influenced by the prior distribution of sheared molecules and the specific RNA or DNA lengths within each molecule after biotin isolation. It may also be possible to isolate the final DNA library by gel purification; we have not tested the effects of this approach on yield. The final library can be stored at −20°C.
Library quality assessment
-
5
Perform quality control on the library using a Bioanalyzer (High Sensitivity). Quantify library by qPCR with a phiX standard curve or KAPA Complete Kit-Universal KK4824 (Roche, cat. no. 07960140001).
The size distribution of the final sequencing library should be assessed with a Bioanalyzer or similar assay. We use the qPCR-based concentration as the final library concentration.
Sequencing
We typically sequence ChAR-seq libraries with single-end 152 bp reads or paired-end 2×150 reads on an Illumina MiSeq, HiSeq, or NextSeq according to manufacturer’s instructions. If longer or shorter molecules are desired, the size-selection parameters in steps 80 to 82 may need to be adjusted accordingly.
Often a MiSeq run is used to assess basic quality metrics (e.g., the presence of bridge in the majority of molecules, the length of RNA and DNA ends, and the duplicate rate), and is followed by HiSeq runs to achieve the desired number of raw reads.
For mammalian genomes, obtaining at least 200 to 300 million raw reads is desired, with more being preferred. If one wishes to sequence several hundred million reads or more for a single sample, it may be beneficial to increase the library’s complexity—the number of uniquely generated molecules—and minimize the duplication rate by generating libraries from several technical replicates. These can be barcoded and multiplexed during sequencing, and then pooled after duplicates have been removed from the individual sample raw reads.
For details of the computational analysis, please see the methods detailed by Bell et al. (2018). Briefly, reads containing the bridge sequence are identified, molecule orientation is established, the RNA (cDNA) and DNA portions are split from each read and aligned to the genome (or transcriptome and genome), and RNA and DNA that both uniquely align are reassociated. These raw RNA-DNA “contacts” can be further processed by normalizing to DpnII cut-site frequency in the genome, or scored for significance of RNA binding to genome regions after generation of an appropriate background model and the performance of associated statistical tests.
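The first computational steps summarized above (find the bridge, establish orientation, split the read into RNA and DNA portions) can be illustrated with a toy example. The sketch below is not the published pipeline of Bell et al. (2018). The bridge sequence is a made-up placeholder, matching is exact (real pipelines typically tolerate mismatches), and the convention that the RNA-derived side lies upstream of the forward-oriented bridge is an assumption made only for illustration. The length filter loosely mirrors the >18-nt criterion mentioned under Anticipated Results.

```python
from typing import Optional, Tuple

# Hypothetical placeholder, NOT the real ChAR-seq bridge sequence.
HYPOTHETICAL_BRIDGE = "ACCGGCGTCCAAG"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_read(read: str, bridge: str = HYPOTHETICAL_BRIDGE,
               min_len: int = 18) -> Optional[Tuple[str, str]]:
    """Split a read at the bridge and return (rna_side, dna_side), or None.

    Assumed convention: with the bridge in forward orientation, the RNA
    (cDNA) side is upstream and the DNA side is downstream. A read carrying
    the reverse-complemented bridge is flipped before assignment.
    """
    pos = read.find(bridge)
    if pos != -1:
        rna, dna = read[:pos], read[pos + len(bridge):]
    else:
        rc_bridge = reverse_complement(bridge)
        pos = read.find(rc_bridge)
        if pos == -1:
            return None  # no bridge found in either orientation
        left, right = read[:pos], read[pos + len(rc_bridge):]
        rna, dna = reverse_complement(right), reverse_complement(left)
    # Keep only reads whose two sides are both longer than min_len.
    if len(rna) <= min_len or len(dna) <= min_len:
        return None
    return rna, dna


if __name__ == "__main__":
    forward = "A" * 20 + HYPOTHETICAL_BRIDGE + "C" * 20
    print(split_read(forward) == ("A" * 20, "C" * 20))                      # True
    print(split_read(reverse_complement(forward)) == ("A" * 20, "C" * 20))  # True
```

The two returned sequences would then be aligned separately (RNA side to the transcriptome or genome, DNA side to the genome) and reassociated, as described in the text.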
REAGENTS AND SOLUTIONS
Bead binding buffer (BBB), 2×
10 mM Tris·Cl, pH 7.6–8.0
1 mM EDTA
2 M NaCl
Prepare with DEPC-treated or nuclease-free water. Store up to 1 month at room temperature or up to 6 months at 4°C.
Bridge annealing buffer (BAB), 2×
20 mM Tris·Cl, pH 7.6–8.0
1 mM EDTA
100 mM NaCl
Prepare with DEPC-treated or nuclease-free water. Store up to 1 month at room temperature or up to 6 months at 4°C.
cDNA synthesis support buffer, 1×
10 mM Tris·Cl, pH 7.6–8.0
90 mM KCl
50 mM (NH4)2SO4
Prepare with DEPC-treated or nuclease-free water. Store up to 1 month at room temperature or up to 6 months at 4°C.
Cell lysis buffer
10 mM Tris·Cl, pH 7.6–8.0
10 mM NaCl
0.2% Igepal-CA630
1 mM dithiothreitol (DTT)
0.2 mM EDTA
Prepare with DEPC-treated or nuclease-free water, mixing gently to avoid foaming. Prepare immediately before use—do not store.
Tween wash buffer (TWB), 1×
5 mM Tris·Cl, pH 7.6–8.0
0.5 mM EDTA
1 M NaCl
0.05% Tween 20
Prepare with nuclease-free water. Prepare fresh—do not store.
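The recipes above list final concentrations only. When preparing a buffer from concentrated stocks, the volume of each stock follows from C1 × V1 = C2 × V2. The sketch below works through 1× TWB as an example. The stock concentrations (1 M Tris·Cl, 0.5 M EDTA, 5 M NaCl, 10% Tween 20) are assumptions chosen for illustration and should be replaced with the stocks actually on hand.

```python
from typing import Dict, Tuple


def stock_volume(final_conc: float, stock_conc: float,
                 final_volume: float) -> float:
    """Volume of stock needed (same units as final_volume): C1*V1 = C2*V2."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc * final_volume / stock_conc


# component: (final concentration, ASSUMED stock concentration), same units
TWB_1X: Dict[str, Tuple[float, float]] = {
    "Tris-Cl, pH 7.6-8.0 (mM)": (5, 1000),
    "EDTA (mM)": (0.5, 500),
    "NaCl (M)": (1, 5),
    "Tween 20 (%)": (0.05, 10),
}


def recipe(components: Dict[str, Tuple[float, float]],
           final_volume_ml: float) -> Dict[str, float]:
    """Return ml of each stock plus the water needed to reach final volume."""
    volumes = {name: stock_volume(final, stock, final_volume_ml)
               for name, (final, stock) in components.items()}
    volumes["nuclease-free water"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in recipe(TWB_1X, 50).items():
        print(f"{name}: {ml:.2f} ml")
```

For 50 ml of 1× TWB with these assumed stocks, the NaCl stock dominates the recipe (10 ml), with water making up most of the remainder.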
COMMENTARY
Background Information
Chromatin-associated RNA regulates both genome organization and function. It is of great interest to identify where RNAs bind across the genome. ChAR-seq can potentially detect RNA binding targets on DNA for each RNA in the transcriptome in a single assay, without the need for genetic modifications or multiple RNA-specific probes. Whereas other methods require knowledge regarding candidate RNAs, the protocol here allows the de novo detection of RNAs that associate with defined regions of chromatin. Other applications of ChAR-seq include examining chromatin-associated RNA dynamics during differentiation or under states of perturbation. Because of its multiplex nature, ChAR-seq requires a large number of sequencing reads to infer true RNA binding over noise for all RNAs. If one is interested in the binding targets for only one RNA, it may therefore be more cost effective to use probe-based targeted RNA-DNA association methods (Chu et al., 2011; Simon et al., 2011; Engreitz et al., 2013).
Critical Parameters
Several aspects of the protocol are critical for success. A minimum number of cells may be required to sustain a critical nuclear pellet mass during the multiday protocol. We have observed that a larger pellet size during day 2 and day 3 of the protocol correlates with a larger proportion of final usable molecules. Therefore, it is critical not to remove portions of the nuclear pellet during washes. We have not systematically tested the protocol with amounts of starting material less than 10 million hESCs or 25 million Drosophila cl.8+ cells. If it is necessary to deviate substantially from the above number of cells, we recommend performing a titration of bridge amount and cell number. Additional parameters of importance include the duration of RNA fragmentation, the post-shearing DNA length distribution, and the sequencing depth.
Troubleshooting
With ~10 million input cells, a pellet should be visible after each centrifugation step throughout the protocol. If the nuclear pellet is present initially, but lost or not visible during day 2 or day 3, proceed to the de-cross-linking step, precipitate DNA, and assay the concentration on a Bioanalyzer or Qubit fluorometer. Note the pellet size after each centrifugation, and take care not to remove portions when discarding the supernatant. Always monitor pellet integrity while pipetting supernatant. If the pellet appears unstable, centrifuge for an additional 1 to 2 min before removing the supernatant. If, after sequencing, the RNA-bridge ligation worked but few DNA-containing molecules were recovered, ensure that the bridge-to-cell number is optimized. Too much bridge could result in excess ligations to free RNA tethered to proteins but not DNA. It is expected that one-quarter to one-half of the RNA reads will be ribosomal RNA (rRNA). This rRNA should be of diverse length in the final hybrid molecule. If short rRNA fragments (<15 nt) dominate the final RNA molecules, ensure that RNase-free conditions are present at all stages. We recommend ordering Molecular-Biology Grade (PCR) or nuclease-free stock solutions of common solutions (e.g., Tris, MgCl2, EDTA, NaCl), as these are relatively inexpensive when compared to the time required to troubleshoot whether various homemade solutions are RNase-free.
If the size selection is incorrect, recheck all AMPure bead and sample ratio calculations. Different batches of AMPure beads may perform differently, so we recommend mixing a DNA ladder with a range of AMPure bead/sample ratios, and then running samples on a gel to determine the exact size-selection efficiency for a given lot of AMPure beads.
We typically find that 75% to 90% of sequenced reads contain the bridge sequence. If the median sequencing-library length is not well matched to the sequencing-read length, the bridge will be found in fewer reads. If this occurs, prepare a new library and adjust the shearing parameters to obtain shorter fragments, or sequence with longer reads. In molecules (insert + adapters) >450 bp, the bridge may be present between the paired reads (with paired-end 150-bp reads), resulting in an inability to define RNA and DNA ends.
Anticipated Results
The proportion of sequenced reads corresponding to complete RNA-bridge-DNA molecules (i.e., reads containing the entire bridge with both cDNA and DNA of length >18 nt) typically ranges from 40% to 80%. When rRNAs and non-uniquely aligning cDNA or DNA are removed, about 15% to 30% of the sequenced reads remain as “high-quality” reads for analysis. Thus, sequencing at read depths of at least threefold greater than the number of desired molecules is recommended.
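The threefold recommendation follows directly from the usable fraction: if only 15% to 30% of raw reads survive filtering, the raw depth needed is the desired number of high-quality reads divided by that fraction. The sketch below makes the arithmetic explicit; the function name and the example target of 50 million reads are illustrative assumptions, not values from the protocol.

```python
import math


def raw_reads_needed(desired_high_quality: int, usable_fraction: float) -> int:
    """Raw reads required to expect `desired_high_quality` usable reads."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(desired_high_quality / usable_fraction)


if __name__ == "__main__":
    target = 50_000_000  # hypothetical number of high-quality reads wanted
    print(raw_reads_needed(target, 0.30))  # 166666667 (best case, ~3.3x)
    print(raw_reads_needed(target, 0.15))  # 333333334 (worst case, ~6.7x)
```

At the lower end of the anticipated range the required depth is closer to sevenfold than threefold, which is worth considering when budgeting sequencing runs.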
Time Considerations
The ChAR-seq protocol can be completed in 5 days. The first 3 days (up through Covaris shearing) should be performed consecutively without interruption. Day 1 takes ~3 hr starting from frozen cells, day 2 takes ~6 to 8 hr, and day 3 takes ~3 hr. The last 2 days, however, contain multiple convenient stopping points, and we routinely spread these steps over 3 to 7 days to allow for submission of samples to core facilities for Bioanalyzer analysis. For example, we verify each SPRI-based size selection on a Bioanalyzer before final library quantification, in case adjustments are required. When performing ChAR-seq with more than ten samples simultaneously, we suggest including two experimenters to help make the first 2 days more efficient (e.g., one person can prepare solutions before each step while the other pipets), as the pipetting and resuspension steps make up the majority of “hands-on” time.
Acknowledgements
We thank Nikki Teran and Whitney Johnson for contributions during development of the original version of ChAR-seq as described in Bell et al. (2018), and thank Ali Shariati for assistance with hESC culture. This work was supported by a Stanford Center for Systems Biology (NIH P50 GM107615 to James E. Ferrell) Seed Grant Award to DJ, VIR, and JCB, by the Katharine McCormick Advanced Postdoctoral Fellowship to VIR, by an NSF-GRFP to OKS, and by NIH R01 HG009909 to AFS.
Literature Cited
- Bell JC, Jukam D, Teran NA, Risca VI, Smith OK, Johnson WL, et al. (2018). Chromatin-associated RNA sequencing (ChAR-seq) maps genome-wide RNA-to-DNA contacts. eLife, 7 10.7554/eLife.27024 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cech TR, & Steitz JA (2014). The noncoding RNA revolution - Trashing old rules to forge new ones. Cell, 157(1), 77–94. 10.1016/j.cell.2014.03.008 [DOI] [PubMed] [Google Scholar]
- Chu C, Qu K, Zhong FL, Artandi SE, & Chang HY (2011). Genomic Maps of Long Noncoding RNA Occupancy Reveal Principles of RNA-Chromatin Interactions. Molecular Cell, 44(4), 667–678. 10.1016/j.molcel.2011.08.027 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cusanelli E, & Chartrand P (2015). Telomeric repeat-containing RNA TERRA: A noncoding RNA connecting telomere biology to genome integrity. Frontiers in Genetics, 6(MAR), 143 10.3389/fgene.2015.00143 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Engreitz JM, Pandya-Jones A, McDonel P, Shishkin A, Sirokman K, Surka C, et al. (2013). The Xist lncRNA exploits three-dimensional genome architecture to spread across the X chromosome. Science (New York, N.Y.), 341(6147), 1237973 10.1126/science.1237973 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fullwood MJ, Wei C-L, Liu ET, and Ruan Y 2009. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Research, 19, 521–532. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Galupa R, & Heard E (2018). X-chromosome inactivation: A crossroads between chromosome architecture and gene regulation. Annual Review of Genetics (Vol. 52, pp. 535–566). 10.1146/annurev-genet-120116-024611 [DOI] [PubMed] [Google Scholar]
- Johnson WL, Yewdell WT, Bell JC, McNulty SM, Duda Z, O’Neill RJ, et al. (2017). RNA-dependent stabilization of SUV39H1 at constitutive heterochromatin. eLife, 6 10.7554/eLife.25299 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kopp F, & Mendell JT (2018). Functional Classification and Experimental Dissection of Long Noncoding RNAs. Cell, 172(3), 393–407. 10.1016/j.cell.2018.01.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li X, Zhou B, Chen L, Gou L-T, Li H, & Fu X-D (2017). GRID-seq reveals the global RNA-chromatin interactome. Nature Biotechnology, 35(10), 940–950. 10.1038/nbt.3968 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lopez-Pajares V (2016). Long non-coding RNA regulation of gene expression during differentiation. Pflugers Archiv European Journal of Physiology, 468(6), 971–981. 10.1007/s00424-016-1809-6 [DOI] [PubMed] [Google Scholar]
- Moncke-Buchner E, Rothenberg M, Reich S, Wagenfuhr K, Matsumura H, … Reuter M (2009). Functional characterization and modulation of the DNA cleavage efficiency of type III restriction Eendonuclease EcoP15I in its interaction with two sites in the DNA target. Journal of Molecular Biology, 387, 1309–1319. [DOI] [PubMed] [Google Scholar]
- Moore D (1996). Commonly used reagents andequipment. Current Protocols in Molecular Biology, 35(1), A.2.1–A.2.8. doi: 10.1002/0471142727.mba02s35. [DOI] [PubMed] [Google Scholar]
- Nagano T, Várnai C, Schoenfelder S, Javierre B-M, Wingett SW, & Fraser P (2015). Comparison of Hi-C results using in-solution versus in-nucleus ligation. Genome Biology, 16(1), 175. 10.1186/s13059-015-0753-7
- Nozawa R-S, & Gilbert N (2019). RNA: Nuclear Glue for Folding the Genome. Trends in Cell Biology. 10.1016/j.tcb.2018.12.003
- Quinodoz SA, Ollikainen N, Tabak B, Palla A, Schmidt JM, Detmar E, et al. (2018). Higher-Order Inter-chromosomal Hubs Shape 3D Genome Organization in the Nucleus. Cell, 174(3), 744–757.e24. 10.1016/j.cell.2018.05.024
- Shirai A, Kawaguchi T, Shimojo H, Muramatsu D, Ishida-Yonetani M, Nishimura Y, et al. (2017). Impact of nucleic acid and methylated H3K9 binding activities of Suv39h1 on its heterochromatin assembly. eLife, 6. 10.7554/eLife.25317
- Simon MD, Wang CI, Kharchenko PV, West JA, Chapman BA, Alekseyenko AA, et al. (2011). The genomic binding sites of a noncoding RNA. Proceedings of the National Academy of Sciences of the United States of America, 108(51), 20497–20502. 10.1073/pnas.1113536108
- Sridhar B, Rivas-Astroza M, Nguyen TC, Chen W, Yan Z, Cao X, et al. (2017). Systematic Mapping of RNA-Chromatin Interactions In Vivo. Current Biology: CB, 27(4), 602–609. 10.1016/j.cub.2017.01.011
- Velazquez Camacho O, Galan C, Swist-Rosowska K, Ching R, Gamalinda M, Karabiber F, et al. (2017). Major satellite repeat RNA stabilize heterochromatin retention of Suv39h enzymes by RNA-nucleosome association and RNA:DNA hybrid formation. eLife, 6. 10.7554/eLife.25293
- Viollet S, Fuchs RT, Munafo DB, Zhuang F, & Robb GB (2011). T4 RNA Ligase 2 truncated active site mutants: Improved tools for RNA analysis. BMC Biotechnology, 11. 10.1186/1472-6750-11-72