Profiling active RNA polymerase II transcription start sites from total RNA by capped small RNA sequencing (csRNA-seq)

Mackenzie K Meyer; Oluwadamilola J Olanrewaju; Patricia Montilla-Perez; Anna L McDonald; Eva M Rickard; Francesca Telese; Christopher Benner; Marina I Savenkova; Sascha H Duttke

doi:10.1038/s41596-025-01285-y

Author manuscript; available in PMC: 2026 Mar 18.

Published before final editing as: Nat Protoc. 2026 Jan 16:10.1038/s41596-025-01285-y. doi: 10.1038/s41596-025-01285-y

PMCID: PMC12994261 NIHMSID: NIHMS2142302 PMID: 41540114

Editorial summary:

A protocol for high resolution genome-wide mapping of nascent RNA polymerase II transcription initiation of both stable and transiently expressed RNAs using total RNA from diverse sample types.

Tweet:

#NewNProt for selectively profiling active transcription start sites in diverse eukaryotic total RNA samples by 5′-capped small RNA enrichment.

Cover teaser: Profiling active transcription start sites from total RNA

High-resolution mapping of active RNA polymerase II transcription initiation provides a dynamic view of gene expression and reveals the entire spectrum of RNA transcripts – from stable mRNAs to transient enhancer RNAs – which is critical for understanding gene regulation, deciphering transcriptional programs, and defining regulatory element function. Here, we present a detailed protocol for capped small RNA-sequencing (csRNA-seq). Starting with total RNA, which can be readily isolated from fresh, frozen, or fixed cells or tissues as well as inactivated infectious samples, csRNA-seq selectively enriches for actively initiating 5’-capped RNA polymerase II transcripts. This approach captures both initiating stable protein-coding and non-coding RNAs as well as rapidly degraded, transient transcripts like enhancer or promoter divergent RNAs, providing a comprehensive snapshot of active cis-regulatory elements and facilitating study of underlying regulatory mechanisms with high sensitivity. The protocol involves small RNA isolation, 5′-capped RNA enrichment, and library generation, followed by sequencing. Key advantages of csRNA-seq over other nascent RNA-seq methods include: (1) decoupling of sample collection and processing; (2) broad compatibility with diverse eukaryotic sample types and organisms; (3) high-resolution data defining active regulatory elements and their properties; and (4) scalability. Importantly, purified RNA is non-infectious and can be isolated from inactivated samples - including clinical or pathogenic specimens - allowing for safe transport and analysis under standard laboratory conditions. This protocol empowers researchers with minimal experience in nascent transcriptomics to study gene regulation, cis-regulatory elements, and transcription dynamics.

Introduction

Capturing snapshots of active or “nascent” transcription initiation has transformed our understanding of gene regulation and revealed a plethora of transient, rapidly degraded transcripts, including those arising from enhancer regions, that are often missed by traditional steady-state RNA sequencing methods (Fig. 1, Extended Data Fig. 1)^1–4. Capturing the initiation of these rapidly degraded RNAs is critical for three reasons: i) they are highly abundant, accounting for 65–75% of transcription initiation events in vertebrates; ii) they serve as valuable biological markers and are increasingly recognized for their functional importance^5–10 – for instance, enhancer RNAs have emerged as the most reliable marker of enhancer activity^4,11 and unstable promoter divergent transcripts likely impact chromatin architecture and transcriptional directionality^12,13; and iii) they provide a dynamic view of all active cis-regulatory elements, including promoters and enhancers, with minute-scale temporal resolution, enabling real-time tracking of gene regulation as it unfolds and identification of underlying regulatory mechanisms^14,15. As a result, methods that capture active transcription have moved to the forefront of efforts to investigate the mechanisms and dynamics of gene regulation, including the roles of transcription factors, co-regulators, and epigenetic features in development, disease, and drug responses, as well as which cis-regulatory elements encode gene expression patterns and how they do so^1,11,16–25.

Fig. 1 | Unlike conventional RNA-seq and single-cell RNA-seq, which primarily capture stable steady-state transcripts, csRNA-seq captures short, initiated RNAs, including unstable transcripts like enhancer RNAs (e.g., from the GATA super enhancer) and pri-miRNAs. This allows for the precise mapping of transcription start sites (TSS) and provides a dynamic view of transcription, enabling the study of gene regulatory mechanisms across a wide range of samples and RNA species. Adapted with permission from Duttke et al. 2019²⁷.

Several methods that capture transcriptionally active regions have been developed, each offering distinct advantages and sensitivities²⁶. Here, we present a detailed protocol for capped small RNA-sequencing (csRNA-seq), which uniformly maps initiating RNA polymerase II start sites and levels of both stable and transient transcripts like enhancer RNAs (eRNAs) with efficiency comparable to that of GRO-cap²⁶. However, rather than relying on immunoprecipitation to isolate nascent RNA from engaged RNA polymerase complexes or modified run-on RNAs, csRNA-seq employs a direct enrichment approach²⁷.

Development of the protocol

csRNA-seq builds on foundational studies demonstrating that sequencing capped small RNAs can accurately map active transcription start sites (TSSs) and levels for both stable and rapidly degraded RNAs^28–31. Further, approaches such as START-seq, 5′GRO-seq, GRO-cap or PRO-cap^5,31–33 showed that focusing reads to 5′ capped transcripts improves the sensitivity of TSS identification and quantification of initiation levels, particularly for less abundant, short, and rapidly degraded RNA species (eRNAs, divergent transcripts)²⁶, including those arising from intronic regions. Inspired by and building on these advances, csRNA-seq identifies actively transcribed TSSs from total RNA by enriching and sequencing 5′-capped RNA fragments smaller than those typically found in the steady-state transcriptome. By overcoming the need for cell culture or purified nuclei, csRNA-seq enables high-throughput profiling of any fresh, frozen, or fixed sample^20,24,27,34. This enables the study of gene regulatory dynamics and underlying mechanisms even in challenging contexts, such as tissues, where traditional methods fall short because representative pure nuclei or single-cell suspensions are difficult to obtain without bias^35–37. Additionally, using RNA facilitates the study of more diverse sample types, including plants, fungi, or organs^19,38–41, as well as infectious or potentially infectious samples. RNA poses no pathogenic risk and can be isolated from inactivated samples^19,20,42. This greatly expands the range of samples and organisms in which the entirety of transcriptionally active regions - including those producing transient RNAs - can be analyzed. Finally, the csRNA-seq protocol consists of simple and user-friendly steps, enabling even early-stage researchers, including undergraduates, to engage in research and study the mechanisms of gene regulation^39,40.

Overview of the procedure

An overview of the csRNA-seq workflow is illustrated in Fig. 2. Total RNA is first extracted using sample-specific protocols that retain microRNAs (miRNAs) and other small RNAs. RNAs smaller than the smallest steady-state RNA polymerase II transcript (usually <60 nts) are then isolated, typically by denaturing gel electrophoresis (Fig. 2, Steps 1–35). While not essential, this selection step significantly enriches the desired RNAs, thereby increasing sensitivity and reducing the needed sequencing depth.

A 10% input is taken (Fig.2, Steps 36–37) and later used to computationally reduce false positives. This step can be omitted if RNA quality is very high (RIN > 8). The remainder is subjected to 5′ cap enrichment (Fig.2, Steps 38–48). First, 5′-monophosphate RNAs, such as miRNAs, are selectively degraded using a 5′-phosphate-dependent exonuclease, and residual RNA phosphorylations are removed with alkaline phosphatase. Next, the 5′ cap-enriched RNA mixture is purified using TRIzol LS to ensure removal of these enzymes prior to library preparation (Fig.2, Steps 49–66). Libraries for both the small RNA-seq (input) and csRNA-seq are generated using conventional small RNA library preparation kits or protocols⁴³ (Fig.2, Steps 67–98). Library preparation begins with removal of the 5′ caps, leaving 5′ monophosphates (Fig.2, Steps 67–74). Following 3′ OH-dependent adapter ligation by T4 RNA Ligase II, the reverse transcription primer is annealed (Fig.2, Steps 75–82). This primer also hybridizes to the excess 3′ adapter, reducing adapter dimers. Subsequent steps include 5′ adapter ligation, cDNA synthesis, and library amplification using PCR (Fig.2, Steps 83–98). While not mandatory, we recommend another size selection step (via a DNA gel) to remove potential adapter dimers and steady-state small nuclear and spliceosome RNA-derived cDNAs (~60–63 nts), which, given their high abundance (orders of magnitude more abundant than actively initiating transcripts), represent a prominent source of potential contamination (Fig.2, Steps 99–129).

Typical sequencing depth for mammalian genomes is 5–15 million reads for csRNA-seq and 3–8 million for the small RNA-seq (input), making the method compatible even with enormous genomes, such as barley and maize^27,39 (Fig.2, Steps 130–150). However, sequencing depth also hinges on the transcriptional activity of the sample studied and on how comprehensively weakly transcribed regions need to be confidently captured. Following sequencing, reads are trimmed, aligned, and active TSSs and transcription start regions, like promoters and enhancers, are identified by comparing csRNA-seq enrichment over small RNA-seq reads.

Applications

csRNA-seq provides a strand-specific and genome-wide snapshot of active RNA polymerase II transcription initiation at a defined point in time. Performing csRNA-seq across multiple time points details gene regulatory changes and dynamics, if needed with minute-scale resolution. This can be leveraged to characterize the molecular basis underlying developmental transitions, disease progression, or response to internal or exogenous stimuli like therapeutics^18,20,34,38,39,41. Furthermore, csRNA-seq captures RNA polymerase II transcription initiation regardless of the final RNA stability (Fig. 1), a critical feature as 65–75% of human RNAs, including eRNAs, are rapidly degraded post-initiation^6,7. Although these unstable RNAs do not encode proteins, they have been shown to participate in an ever-increasing array of cellular processes and medical disorders⁶. Capturing the dynamics of these transcripts is also critical for identifying the factors and regulatory programs that mediate signaling responses and define cell states. Notably, most cell type-specific and signal-dependent transcription factors bind enhancers rather than promoters^44–47. Therefore, csRNA-seq enables the identification of both cis-regulatory regions and the regulatory programs underlying diverse gene regulatory stages, mapping their dynamics, and identifying the transcription factors orchestrating these processes.

By selectively capturing TSSs, csRNA-seq achieves uniformly high sensitivity for both intergenic and intragenic TSSs. This also enables the rigorous study of intronic enhancers, which are sometimes missed by conventional nascent methods. By providing quantitative insights into transcription initiation rates, csRNA-seq can reveal co-regulated domains, coordinated control of enhancers and promoters, and changes in gene expression under various biological or biomedical conditions. The resulting data facilitate the exploration of gene regulatory networks using TSSs as spatial anchors to study transcription factor binding site syntax and potentially identify novel regulatory elements or mechanisms involved in disease (e.g., viral infections or cancer)^18,20,24,25,42. Additionally, leveraging csRNA-seq TSSs as anchors allows for the comparison of transcription factor binding site compositions or other features of promoters driving expression of homologous genes across species, independent of direct sequence conservation^25,48. TSSs themselves are also highly dynamic regulatory landmarks that often shift in response to stimuli, development, or differentiation^49–51. While still underutilized, csRNA-seq can also be used for TSS fingerprinting, enabling deconvolution of distinct cell types or stages within bulk data⁵².

csRNA-seq is also valuable for studying active transcription initiation in samples where other nascent RNA profiling methods that require live material or purified nuclei are impractical. This includes patient samples, complex tissues where isolating nuclei or single cells may result in a bias³⁵, and organisms with complex metabolites, plastid organelles, or cell walls³⁹. Because csRNA-seq is compatible with sample inactivation, including crosslinking²⁰, heat, or detergent treatment, it is also well-suited for studying potentially infectious or infectious samples. Inactivated samples or purified RNA can also be shipped, facilitating world-wide sample sourcing. As a result, csRNA-seq has significantly broadened the scope of nascent transcriptomics, extending its applications to a wide range of systems, including infectious agents. For instance, analysis of RNA from Valley Fever BSL3 samples revealed the transcription factor WOPR1 as essential for pathogenesis¹⁹, study of blood from COVID-19 patients revealed determinants of disease pathogenesis and outcome⁴², and analysis of RNA from Zika-infected cells pinpointed the sterol regulatory element binding protein 2 (SREBP2) as critical for viral infection²⁰. Collectively, these findings highlight the utility of csRNA-seq and profiling of nascent RNAs as strategies for detailing gene regulatory changes, identifying potential drug targets, and studying pathogenesis.

Comparison to other methods

csRNA-seq offers a distinct approach to mapping active TSSs by capturing small 5′-capped transcripts associated with initiating RNA polymerase II, enabling sensitive detection of TSSs, including those from transient RNAs. A limitation of this method, however, is its inability to distinguish transcripts derived from actively elongating RNA polymerase II from those released during RNA polymerase II backtracking or abortive initiation if captured before their rapid degradation. As such, csRNA-seq is not a bona fide nascent method. Nevertheless, in human K562 cells, csRNA-seq captures transcriptional activity and enhancer transcription with similar sensitivity to established nascent methods like 5′GRO-seq, GRO-cap, or PRO-cap²⁶. Unlike these run-on methods, csRNA-seq does not require nuclear purification, ribonucleoside analog incorporation, or cell permeabilization. It is RNA polymerase II-specific, technically simple, and exhibits an improved signal-to-noise ratio³⁹, making it broadly applicable to a wide range of samples, including tissues, organs, or pathogenic samples, where traditional run-on or NET-seq⁵³ is challenging. Chromatin run-on and sequencing (ChRO-seq)⁵⁴ addresses some of these limitations⁵⁵ by using fractionated insoluble chromatin as input, but remains technically complex and is not RNA polymerase II-specific, also capturing transcription by mitochondrial and other RNA polymerases.

Nascent TSSs can also be faithfully detected by GRO-seq, PRO-seq, or NET-seq, albeit with reduced sensitivity²⁶. By capturing nascent transcription throughout gene bodies and transcribed regions⁵⁶, these methods provide insights into transcription elongation or termination not captured by csRNA-seq. However, this broader capture complicates the identification of TSSs, particularly for intragenic initiation events like those at intronic enhancers. GRO-seq, PRO-seq and ChRO-seq also capture all RNA-polymerases indiscriminately, including bacterial and phage-like RNA polymerases, which can complicate analysis^43,57. NET-seq maps the location of RNA polymerase II on DNA by sequencing the 3′ end of nascent transcripts extracted from elongating RNA polymerase II complexes. Thereby, NET-seq provides precise nucleotide-resolution mapping of transcriptionally active RNA polymerase II and allows differential analysis of differently modified RNA polymerases⁵³; nonetheless, it relies on successful immunoprecipitation for effectiveness.

Other TSS-focused methods include START-seq³¹ and CIP-TAP³⁰. START-seq was developed using nuclear RNA as input, whereas CIP-TAP, like csRNA-seq, uses total RNA. The streamlined csRNA-seq protocol provided here offers several improvements enabling lower total RNA input and enhanced sensitivity. Nonetheless, both methods offer attractive alternative approaches.

It is further important to distinguish csRNA-seq and related approaches from methods mapping the 5’ ends of steady-state RNAs, such as CapSeq³⁰, CAGE⁵⁸ or other 5’RNA-seq methods⁵⁹. These methods profile the 5’ ends of accumulated pools of transcripts, whose detection is heavily influenced by RNA stability. Consequently, transcripts from recently activated genes or transient RNAs like eRNAs are often below detection limits, while transcripts detected may not be actively synthesized. The 5’ ends of many steady-state RNAs are further post-transcriptionally processed, meaning mapped 5’ends may not necessarily reflect active or true TSSs ^5,29.

While not directly capturing transcription initiation, other methods, including ChIP-seq and those detecting open chromatin (ATAC-seq⁶⁰, DNase-seq⁶¹, MPE-seq⁶² and others⁶³), map putative regulatory regions but generate broad peaks of enrichment, hindering the precise identification of gene regulatory mechanisms^25,48. Moreover, 40–65% of open chromatin regions are not associated with active transcription. Integrating H3K27ac ChIP-seq data can reduce this false discovery rate, though data interpretation remains challenging⁶⁴.

Advantages and limitations of csRNA-seq

Captures momentarily active cis-regulatory elements and TSSs.
Strand-specific information: Provides strand-specific information of currently active transcription start sites genome-wide at single base-pair resolution.
Versatility: Uses total RNA as input making capturing active transcription compatible with any fresh, fixed ²⁰, potentially infectious or banked sample or tissue from which RNA can be isolated.
Broad applicability: csRNA-seq facilitates the study of plants, fungi, protozoa, and other cells protected by cell walls as well as clinical or other potentially biohazardous samples.
Improved signal-to-noise ratio and sensitivity: Allowing detection of rare or common transcription initiation events, including those resulting in highly unstable RNAs.
Specificity: Enriches for RNA polymerase II transcripts based on cap-selection.
Minimal manipulation: Does not require immunoprecipitation or incorporation of synthetic ribonucleotide analogs.

We would also like to acknowledge several limitations of csRNA-seq that users should be aware of. First, the method has a limited window of capture, i.e. ~20–58 nt from the TSS, which means that active transcription closer or further away from the TSS is not detected. It does not capture gene body transcription, termination, nor mature steady-state RNAs. csRNA-seq also cannot distinguish initiating transcripts by actively elongating RNA polymerase II from those released by backtracked or pausing-terminated polymerases if captured prior to their rapid degradation. Additionally, csRNA-seq data represent the averages of potentially heterogeneous cell populations. Due to the relative rarity of initiating RNAs, especially eRNAs, single-cell applications are currently not practical. The short read length of csRNA-seq data also hinders allele-specific analysis or unambiguous mapping of repetitive sequences. The method is further limited to capturing all transcription initiation and cannot distinguish states of specific RNA polymerase II C-terminal domain modifications. Finally, input RNA quality can affect peak calling thresholds, reducing sensitivity.

Experimental Design

Replicates and sample size

The appropriate number of replicates and samples depends on the biological system and research question. For cell lines, two biological replicates per condition are typically sufficient. Three biological replicates per group may suffice for genetically similar animal models (e.g., mice). For primary human samples, the required sample size can vary considerably depending on genetic and phenotypic variability within the cohort and the specific research question. For example, decoding gene regulatory mechanisms underlying COVID-19 disease progression and outcomes required >25 patient samples⁴². Technical replicates are generally not necessary, as assay variance is typically small compared to biological variance.

RNA extraction and input considerations

csRNA-seq can be applied to any sample or tissue from which total RNA can be extracted. An example protocol for RNA extraction is provided as Supplementary Protocol 1. While the protocol accommodates moderately degraded RNA (RIN ≥3)²⁰, higher-quality RNA (RIN ≥6) improves data quality. We therefore recommend that users optimize RNA isolation, as this will likely save time and effort in the long run.

When isolating total RNA, it is important to remember that csRNA-seq profiles momentarily active transcription initiation. It is thus critical to avoid treatments that may alter transcription states (e.g., tissue digestion, temperature or other perturbations)^35–37; instead, samples should be rapidly cooled and kept below 10 °C to halt RNA polymerase activity. Consequently, scraping cells from culture should be performed in cold PBS. RNA extraction is best performed using TRIzol LS. Most standard columns inefficiently bind small RNAs. If column-based extraction cannot be avoided, it is important to use kits designed for miRNA isolation to preserve as much of the small RNAs as possible. DNase treatment is only needed if complementary ribosomal RNA-depleted (Ribo0) RNA-seq libraries are generated in parallel.

The starting material can be total, nuclear, or chromatin-associated RNA⁶⁵, for which we recommend starting with ≥3 μg, 350 ng, or 200 ng, respectively. For total RNA, 25 μg is the upper limit when gels are used for size selection. Although we routinely generate libraries from as little as 1 μg of total RNA and can go as low as 200 ng, starting with sufficient RNA improves data quality and the capture of rare initiation events.
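
The input recommendations above can be captured in a small planning helper. The following is a minimal sketch, not part of the published protocol: the function name `assess_input` and its verdict strings are hypothetical, and applying the 25 μg gel limit only to total RNA is an assumption based on how the text states it.

```python
# Recommended starting amounts in micrograms, as stated in the text above.
RECOMMENDED_UG = {"total": 3.0, "nuclear": 0.35, "chromatin": 0.2}
TOTAL_RNA_MIN_UG = 0.2  # lowest total RNA input mentioned in the text
GEL_MAX_UG = 25.0       # upper limit for total RNA with gel size selection


def assess_input(rna_type: str, amount_ug: float, gel_size_selection: bool = True) -> str:
    """Return a short verdict on a planned RNA input (hypothetical helper)."""
    if rna_type not in RECOMMENDED_UG:
        raise ValueError(f"unknown RNA type: {rna_type!r}")
    if amount_ug <= 0:
        raise ValueError("amount_ug must be positive")
    # Assumption: the gel limit is applied to total RNA only.
    if rna_type == "total" and gel_size_selection and amount_ug > GEL_MAX_UG:
        return "exceeds gel limit"
    if amount_ug >= RECOMMENDED_UG[rna_type]:
        return "recommended"
    if rna_type == "total" and amount_ug >= TOTAL_RNA_MIN_UG:
        return "feasible, expect reduced capture of rare initiation events"
    return "below recommended amount"


if __name__ == "__main__":
    for rna_type, amount in [("total", 3.0), ("total", 1.0), ("nuclear", 0.35)]:
        print(rna_type, amount, "->", assess_input(rna_type, amount))
```

A helper like this is mainly useful when planning many samples at once, for example to flag which samples in a cohort fall below the recommended amount before size selection begins.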

Small RNA input library considerations

Although not strictly required, especially when using high-quality RNA (RIN > 8), we recommend reserving a 10% aliquot of the small RNA fraction before cap enrichment to generate an input small RNA-seq library that also includes non-capped RNAs. This library improves downstream analysis by facilitating bioinformatic removal of non-nascent small RNA contaminants (e.g., mature miRNAs, snoRNAs, and snRNAs). Although these small steady-state transcripts are depleted in the csRNA-seq protocol, their disproportionately high abundance relative to nascent RNAs makes them a prominent source of false positives. For well-annotated species, existing genome annotations (.gtf files) can additionally or alternatively be used for filtering these RNA species.
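
To illustrate how an input library can help remove such contaminants, the sketch below computes a depth-normalized csRNA-seq over input ratio per locus and keeps only enriched loci. This is a simplified illustration and not the procedure implemented in HOMER2: the function names, the pseudocount, and the 2-fold threshold are all assumptions chosen for the example.

```python
def fold_enrichment(cs_count, input_count, cs_total, input_total, pseudocount=1.0):
    """Depth-normalized csRNA-seq / input ratio for one locus.

    Counts are converted to counts per million (CPM) so that libraries of
    different depth are comparable; the pseudocount avoids division by zero.
    """
    cs_cpm = (cs_count + pseudocount) * 1e6 / cs_total
    input_cpm = (input_count + pseudocount) * 1e6 / input_total
    return cs_cpm / input_cpm


def keep_enriched(loci, cs_total, input_total, min_fold=2.0):
    """Return names of loci whose csRNA-seq signal exceeds input by min_fold.

    `loci` maps a locus name to (csRNA-seq count, input count).
    The min_fold default is illustrative, not a recommended setting.
    """
    return [
        name
        for name, (cs_count, input_count) in loci.items()
        if fold_enrichment(cs_count, input_count, cs_total, input_total) >= min_fold
    ]


if __name__ == "__main__":
    # Made-up counts: a capped promoter signal and an abundant mature miRNA.
    example = {"promoter_A": (99, 9), "miRNA_B": (49, 499)}
    print(keep_enriched(example, cs_total=10_000_000, input_total=5_000_000))
```

In this toy example the mature miRNA locus is abundant in the input library but not enriched after cap selection, so it is filtered out, while the promoter locus is retained.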

Of note: while small RNA input libraries in csRNA-seq are primarily used to remove contaminants, enabling the use of partially degraded RNA as input (e.g., from fixed tissues), input libraries also offer the potential for discovery. These libraries contain various small RNAs, including miRNAs, which play crucial regulatory roles in gene expression. Therefore, in addition to their utility as a background for identifying transcribed regions from csRNA-seq (i.e. ‘peak calling’), these libraries can be valuable for studying changes in steady-state small RNAs, providing insights into post-transcriptional gene regulation.

Adaptations for Automation and High-Throughput Processing

Several steps of the csRNA-seq protocol can be adapted to accommodate automation and high-throughput processing, typically at the cost of library quality and increased sequencing requirements. For small RNA size selection, carboxyl magnetic beads or, for more precise experimental size selection, the High Pure RNA Isolation Kit (Roche), combined with bioinformatic size selection, offer viable alternatives to gel isolation. Similarly, carboxyl magnetic bead-based RNA clean up can replace TRIzol LS RNA purification after 5’cap enrichment. However, careful enzyme heat inactivation is crucial in this case, as bead-based protein carryover otherwise introduces bias, requiring substitution of Calf Intestinal alkaline Phosphatase (CIP) or QuickCIP with a more thermolabile phosphatase like Shrimp Alkaline Phosphatase (rSAP). Finally, gel purification of the final libraries can be omitted. While gel purification provides visual assessment of library quality and refines size selection (reducing sequencing costs), the desired fragment lengths (20–58 nt) can be ultimately ensured bioinformatically.

Data analysis

csRNA-seq libraries are typically sequenced single-end for 50 cycles or more. Once sequencing data are generated, reads undergo trimming and size selection (typically 20–58 nt insert length) before being aligned to a reference genome. This alignment allows for identification of regulatory elements, like promoters and enhancers, and TSSs across the genome, enabling a detailed view of active transcription.
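
The trimming and size selection described above can be sketched in a few lines. The example below is illustrative only and is not a replacement for dedicated trimming tools: the adapter sequence is an assumption (substitute the 3′ adapter of the library kit actually used), the function names are hypothetical, and only exact adapter matches are handled, without mismatch tolerance.

```python
# Assumed 3' adapter sequence for illustration; replace with your kit's adapter.
ADAPTER = "AGATCGGAAGAGCACACGTCT"


def trim_adapter(read, adapter=ADAPTER, min_match=4):
    """Remove the 3' adapter from a read and return the insert.

    Looks first for a full adapter match, then for a partial adapter
    (at least min_match bases) at the very end of the read.
    Reads without any adapter match are returned unchanged.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Longest read suffix that equals an adapter prefix.
    for k in range(min(len(adapter) - 1, len(read)), min_match - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def select_inserts(reads, min_len=20, max_len=58):
    """Yield trimmed inserts whose length lies within [min_len, max_len]."""
    for read in reads:
        insert = trim_adapter(read)
        if min_len <= len(insert) <= max_len:
            yield insert


if __name__ == "__main__":
    insert = "ACGT" * 6  # 24-nt insert
    reads = [insert + ADAPTER, "ACGTACGTAC" + ADAPTER]
    print(list(select_inserts(reads)))  # the 10-nt insert is discarded
```

Note that a short partial match can occasionally remove genuine insert bases that happen to resemble the start of the adapter; production trimmers weigh this trade-off with configurable minimum match lengths and mismatch allowances.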

Bioinformatics tools within the HOMER2 software suite^25,44 streamline this process. These tools provide advanced functionalities for:

Regulatory element identification, revealing which promoters and putative enhancer regions actively initiate transcription;
Individual TSS identification, pinpointing exact start sites of transcription;
DNA motif analysis, identifying transcription factor binding sites enriched near TSSs of interest;
Differential activity analysis, determining changes in TSS position and frequency (strength) under different conditions to define activated or repressed gene promoters or distal transcribed loci like enhancers;
Spatial regulatory syntax, analyzing the spatial relationships among transcription factors or transcription factors and TSSs within regulatory elements; and
Differential analysis of TSSs generating stable or unstable transcripts, distinguishing distinct regulatory element types and dissecting the dynamics between transcription initiation and RNA turnover.
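
As a conceptual illustration of individual TSS identification from the list above, the sketch below counts the 5′ ends of aligned reads per strand and reports the most frequently used position within a region. It is a simplified stand-in, not the HOMER2 implementation: the function names are hypothetical, and alignments are assumed to be given as 0-based, half-open (BED-style) coordinates.

```python
from collections import Counter


def five_prime_position(start, end, strand):
    """Return the 5' end of an alignment with 0-based, half-open [start, end)."""
    if strand == "+":
        return start
    if strand == "-":
        return end - 1
    raise ValueError(f"unknown strand: {strand!r}")


def count_tss(alignments):
    """Count read 5' ends per (chromosome, position, strand)."""
    counts = Counter()
    for chrom, start, end, strand in alignments:
        counts[(chrom, five_prime_position(start, end, strand), strand)] += 1
    return counts


def dominant_tss(counts, chrom, strand, region_start, region_end):
    """Return the most used 5' position in a region, or None if there is none.

    Ties are broken in favor of the smallest coordinate.
    """
    candidates = {
        pos: n
        for (c, pos, s), n in counts.items()
        if c == chrom and s == strand and region_start <= pos < region_end
    }
    if not candidates:
        return None
    return max(candidates, key=lambda pos: (candidates[pos], -pos))


if __name__ == "__main__":
    alignments = [
        ("chr1", 100, 130, "+"),
        ("chr1", 100, 125, "+"),
        ("chr1", 102, 140, "+"),
        ("chr1", 60, 90, "-"),
    ]
    counts = count_tss(alignments)
    print(dominant_tss(counts, "chr1", "+", 90, 110))
```

Because each read begins at the capped 5′ end of the transcript, the read start on the plus strand, or the read end on the minus strand, marks the initiating nucleotide, which is what makes strand-specific, single-nucleotide TSS maps possible.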

A detailed step-by-step guide to analyze csRNA-seq data is provided in Supplementary Protocol 2 and is also available at https://github.com/DuttkeLab/csRNA_walkthru. All raw and processed data utilized in this tutorial can be downloaded from GEO (GSE287021).

Expertise needed to implement the protocol

This cost-effective and relatively straightforward method enables researchers at all levels, including motivated undergraduates³⁹, to engage in nascent transcriptomics. The csRNA-seq protocol has a similar level of complexity to conventional RNA-seq, making it accessible to researchers who are familiar with RNA handling and sequencing workflows but have minimal experience in nascent transcriptomics. Basic molecular biology skills, including maintaining an RNase-free environment and safe handling of TRIzol LS, are required. While many pause points are included, familiarity with multi-step protocols is essential. Deep sequencing, typically performed by core facilities, and bioinformatics expertise for data processing and interpretation are also necessary. Appropriate personal protective equipment (PPE) should be used at all times. Prior to RNA isolation, input material should be handled according to its potential biological hazards.

Materials

Biological materials

The starting material for csRNA-seq is 0.2–25 μg of total RNA isolated from any fresh, frozen, fixed, or inactivated sample or tissue^20,27,38–41. Ideally, ≥3 μg of total RNA with RIN ≥ 8 is used.

▲CRITICAL: Working with biological samples and human cell lines requires strict adherence to respective safety protocols due to potential biohazards.

Reagents

▲CRITICAL: All reagents, solutions, and buffers should be made with UltraPure™ DNase/RNase-Free Distilled Water or DNase/RNase-Free ddH₂O.

▲CRITICAL: Extreme care should be taken to avoid RNase contamination. Use RNase-free reagents and change gloves routinely.

▲CRITICAL: Handle laboratory chemicals as detailed in the safety data sheet (SDS). Wear appropriate personal protective equipment (PPE). Handle concentrated solutions with care.

UltraPure^™ DNase/RNase-Free Distilled Water (Thermo Fisher Scientific, cat. no. 10977015)
Formamide (Millipore Sigma, cat. no. F9037–100ML)

! CAUTION: Formamide is toxic if swallowed, in contact with skin, or if inhaled. Use appropriate PPE. Buffers should be prepared in a fume hood to avoid inhalation of mist or vapors.

Bromophenol Blue (Millipore Sigma, cat. no. B0126–25G)
Tris (1M), pH 8.0, RNase-free, (Thermo Fisher Scientific, cat. no. AM9855G)
UltraPure^™ Tris-HCl Buffer (1M), pH 7.5, RNase-free (Invitrogen, cat. no. 15567027)
Xylene cyanol FF (Fisher Scientific, cat. no. BP565–10)

! CAUTION: May be harmful if swallowed, inhaled, or absorbed through skin and causes eye irritation. Use appropriate PPE.

10x TBE 4L (BioPioneer, cat. no. MB1044–4, Fisher Scientific, cat. no. BP133320 or similar)

! CAUTION: Handle concentrated solutions with care.

Sodium Dodecyl Sulfate (SDS, Fisher Scientific, cat. no. BP166–500)

! CAUTION: SDS causes skin, respiratory, and serious eye irritation. Use appropriate PPE.

Sodium Acetate (3 M), pH 5.5, RNase-free (Thermo Fisher Scientific, cat. no. AM9740)
Tween-20 (Millipore Sigma, cat. no. P1379–100ML)
NaCl (5 M), RNase-free (Thermo Fisher Scientific, cat. no. AM9759)
UltraPure^™ 0.5M EDTA, pH 8.0 (Thermo Fisher Scientific, cat. no. 15575020)
UltraPure^™ 1 M Tris-HCl Buffer, pH 7.5 (Thermo Fisher Scientific, cat. no. 15567027)
TRIzol^™ Reagent (Invitrogen, cat. no. 15596026) – if cells are used.

! CAUTION: Toxic if inhaled, swallowed or on contact. Use only inside a fume hood. Wear appropriate PPE, including gloves, eye protection, and protective clothing. Avoid breathing vapor.

TRIzol^™ LS (Invitrogen, cat. no. 10296010)

! CAUTION: Toxic if inhaled, swallowed or on contact. Use only inside a fume hood. Wear appropriate PPE, including gloves, eye protection, and protective clothing. Avoid breathing vapor.

Chloroform:Isoamyl alcohol 24:1 (Millipore Sigma, cat. no. C0549–1QT)

! CAUTION: Chloroform:Isoamyl alcohol 24:1 is harmful on contact with skin or eyes or inhalation. Wear appropriate PPE, use it inside a fume hood, and avoid inhalation.

GlycoBlue^™ Coprecipitant (15 mg/mL, Thermo Fisher Scientific, cat. no. AM9515)
Novex^™ TBE-Urea Gels, 15% (wt/vol), 12 well (Invitrogen, cat. no. EC68852BOX)
Low Range ssRNA Ladder (New England Biolabs, cat. no. N0364S) or a 19 and 59 nts ssDNA oligo.
GelGreen^® Nucleic Acid Gel Stain (Biotium, cat. no. 41005)
Ethanol absolute, KOPTEC (VWR, cat. no. 89125–186)

! CAUTION: Highly flammable liquid and vapor. Causes serious eye irritation. Keep away from heat, sparks, open flames, and hot surfaces. Use appropriate PPE and wear eye protection. Use in a well-ventilated area.

Terminator 5’-Phosphate-Dependent Exonuclease (LGC Biosearch Technologies, cat. no. TER51020)
SUPERase•In^™ RNase Inhibitor (20 U/μL) (Thermo Fisher Scientific, cat. no. AM2696)
QuickCIP (New England Biolabs, cat. no. M0525L)
RNA 5’ Pyrophosphohydrolase (RppH, New England Biolabs, cat. no. M0356S)
2-Propanol (Sigma-Aldrich, cat. no. I9516–25ML)

NEBNext^® Small RNA Library Prep Set for Illumina® (Multiplex Compatible), 96Rx (New England Biolabs, cat. no. E7330L)
Barcodes: from NEBNext^® Multiplex Small RNA Library Prep Kit for Illumina^® Index Primers 1–48 (NEB, cat. no. E7560S), BIOO, UDI, or similar.
T4 RNA Ligase Reaction Buffer (New England Biolabs, cat. no. B021SVIAL or B0216L)
PEG 8000 (New England Biolabs, cat. no. B1004SVIAL, sold only as a part of NEB “T4 RNA Ligase Reaction Buffer”, cat. no. B0216L)
Betaine Solution (Sigma Millipore, cat. no. B0300–1VL)
SpeedBead Magnetic Carboxylate Modified Particles (Cytiva, cat. no. 65152105050250)
Poly(ethylene glycol) (Sigma-Aldrich, cat. no. P2139–500G)
Novex^™ Hi-Density TBE Sample Buffer (5x, Thermo Fisher Scientific, cat. no. LC6678)
Novex^™ TBE gel 10% (wt/vol; 12 well; Invitrogen, cat. no. EC62752BOX)
HyperLadder^™ 25 bp ladder (Meridian Bioscience, cat. no. BIO-33057)
SYBR^™ Gold Nucleic Acid Gel Stain (Thermo Fisher Scientific, cat. no. S11494)
ChIP-DNA clean and concentrator (Zymo Research, cat. no. D5205)

Software for data analysis

HOMER2: A suite of tools for trimming, QC, creating tag directories, genome browser visualization files, peak finding, and motif discovery^25,44 (http://homer.ucsd.edu/homer/download.html).
Alignment software, such as STAR⁶⁶ (https://github.com/alexdobin/STAR/tree/master) or HISAT2⁶⁷ (https://daehwankimlab.github.io/hisat2/download/)

Equipment

1.7 ml tubes SafeSeal LowBind Microcentrifuge Tubes, low binding (Sorenson, cat. no. 39640T)
1.5 ml Safe-Lock LowBind Microcentrifuge Tubes, low binding (Eppendorf, cat. no. 0030123611)
Tempassure 0.2ml PCR Pull-Apart Tube Strips with Dome Cap Strips (USA scientific, cat. no. 1402–2700)
Additional 8 cap strip for 0.2mL tubes with dome top (USA scientific, cat. no. 1400–1800)
Petri Dish square with Grid 100mm x 100mm Fisherbrand^™ (Fisher Scientific, cat. no. FB0875711A)
Gel Breaker Tubes (IST Engineering Inc, cat. no. 3388–100)
SureOne^™ Large Orifice Filtered Pipette Tips P200 pipet tip (Fisher Scientific, cat. no. 02–707-465)
UltraFree^® Centrifugal Filter, 0.45 μm pore size, hydrophilic PVDF, 0.5ml volume (Millipore Sigma, cat. no. UFC30HVNB)
Yellow Blunt Tip Dispensing Fill Needles, 20ga x 2.0” (CML Supply, cat. no. 901–20-200)

! CAUTION: While blunt, handle with care to avoid needle stick injuries.

10 mL BD Luer-Lok^™ Syringe (BD, cat. no. 302995)
Filtered gel loading tips (USA Scientific, cat. no. 1022–0810)
Low binding filter tips
DynaMag 96 Side skirted (Thermo Fisher Scientific, cat. no. 12027)
Invitrogen^™ XCell SureLock^™ Mini-Cell (Invitrogen, cat. no. EI0001)
Thermomixer C (Eppendorf, cat. no. EP5382000023)
Thermocycler T100 (Bio-Rad, cat. no. 1861096)
Centrifuge with fixed rotor angle (Eppendorf, cat. no. 5424)
Refrigerated centrifuge (Eppendorf, cat. no. 5417R)
Qubit fluorometer (Thermo Fisher, cat. no. Q33238)
Dark Reader Transilluminator for gel visualization (Clare Chemical Research, cat. no. DR22A)
Dark Reader Glasses (classic, Clare Chemical Research, cat. no. AG15)
Razor Blades (Fisher Scientific, cat. no. 12–640)
−20 °C and −80 °C Freezers
4 °C Refrigerator
Minicentrifuge (Benchmark Scientific^™, cat. no. RS7183)
Multichannel pipettes
Single channel pipettes
Two heat blocks, one set at 37 °C and the other at 75 °C, each filled with water equilibrated to the set temperature
Vortex

Reagent Setup

▲CRITICAL: All reagents, solutions and buffers should be prepared with RNase-free water and tested for RNase activity. To prevent sample loss, test buffers and homemade reagents for RNase contamination (e.g., using IDT’s RNase Alert^®). Adhering to standard procedures for RNA including working on ice when possible, and using filter tips, a physical barrier like a mask or protector shield, and RNase inhibitors is recommended. Be mindful of what you touch with your gloves and change them after touching skin, your phone, or anything not designated RNase free.

2xFLB

The 2xFLB is 5mM EDTA, 0.025% (wt/vol) Bromophenol blue, 0.025% (wt/vol) Xylene cyanol and 99% Formamide (vol/vol; Supplementary Table 1). Store at −20 °C for up to a year or until 2xFLB does not freeze anymore, which may indicate formamide hydrolysis.

ddH₂O+0.05% Tween 20

The ddH₂O+0.05% (vol/vol) Tween 20 is 0.05% (vol/vol) Tween 20 in RNase-DNase Free water (Supplementary Table 1). Store at room temperature (RT, ~21 °C) in the dark for up to a year.

csElution Buffer (for RNA gels)

The csElution buffer (csEB) is 400 mM Sodium Acetate (NaOAc), 10 mM Tris (pH 7.5), 1 mM EDTA, and 0.05% (vol/vol) Tween 20 in RNase-DNase Free water (Supplementary Table 1). Store solution at room temperature for 3–4 months, preferably in the dark. Store long term in the dark at 4 °C for up to a year.

TET Buffer

The TET buffer is 10 mM Tris (pH 7.5), 1 mM EDTA, and 0.05% (vol/vol) Tween 20 in RNase-DNase Free water (Supplementary Table 1). Store working solution at room temperature for a year, preferably in the dark. If making a larger stock, store aliquots in the dark at 4 °C for up to a year.

TE’T Buffer

The TE’T buffer is 10 mM Tris (pH 7.5), 0.1 mM EDTA, and 0.05% (vol/vol) Tween 20 in RNase-DNase Free water (Supplementary Table 1). Store working solution at room temperature for a year, preferably in the dark. If making a larger stock, store aliquots in the dark at 4 °C for up to a year.

Sequencing TET Buffer

The sequencing TET (sTET) buffer is 10 mM Tris (pH 8), 0.1 mM EDTA, and 0.05% (vol/vol) Tween 20 in RNase-DNase Free water (Supplementary Table 1). Another option is to add 0.05% (vol/vol) Tween 20 to the Zymo Research DNA Elution Buffer (cat. no. D3004–4-10). Store working solution at room temperature for a year, preferably in the dark. If making a larger stock, store aliquots in the dark at 4 °C for up to a year.

DNA Gel Elution Buffer

The DNA Gel Elution buffer is 10 mM Tris (pH 8.5) and 0.1 mM EDTA in RNase-DNase Free water. Store working solution at room temperature for a year (Supplementary Table 1). Another option is to use DNA Gel Elution Buffer (NEB, cat. no. E7324AA) from NEBNext^® Small RNA Library Prep Set for Illumina^® (Multiplex Compatible), 96Rx (NEB, cat. no. E7330L).
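The buffer compositions above can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch for csEB, not the authors' recipe (Supplementary Table 1 remains the reference). The stock concentrations are taken from the reagent list, while the 50 ml batch size, the use of undiluted Tween 20 as the stock, and the function name `stock_volumes_ml` are assumptions made for illustration.

```python
def stock_volumes_ml(final_volume_ml, components):
    """Apply C1*V1 = C2*V2 to each component and fill up with water.

    components maps a label to (stock_concentration, final_concentration),
    both given in the same unit for that component.
    """
    volumes = {}
    for name, (stock_conc, final_conc) in components.items():
        volumes[name] = final_volume_ml * final_conc / stock_conc
    # Whatever volume is left is made up with RNase-free water.
    volumes["RNase-free water"] = final_volume_ml - sum(volumes.values())
    return volumes


# csEB targets from Reagent Setup; stock strengths from the reagent list.
# Assumption: Tween 20 is pipetted from the undiluted (100%) reagent.
CSEB = {
    "3 M sodium acetate": (3000.0, 400.0),   # mM stock, mM final
    "1 M Tris-HCl pH 7.5": (1000.0, 10.0),   # mM stock, mM final
    "0.5 M EDTA": (500.0, 1.0),              # mM stock, mM final
    "Tween 20": (100.0, 0.05),               # % stock, % final
}

recipe = stock_volumes_ml(50.0, CSEB)  # hypothetical 50 ml batch
for name, volume in recipe.items():
    print(f"{name}: {volume:.3f} ml")
```

Because undiluted Tween 20 is viscous and the required volume is small, preparing an intermediate dilution first may be more accurate; in that case only the stock concentration in the dictionary changes.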

Procedure

CRITICAL: A video of the procedure is available as Supplemental Video 1.

▲CRITICAL: Keep all enzymes in a benchtop cooler at −20 °C; do not place them on ice. Make all temperature changes rapidly. For example, if your samples are on ice and need to be incubated at 37 °C, transfer them to the heat block or thermocycler only once it has already reached 37 °C.

RNA Size Selection

● TIMING: Approximately 7.5 hours (including gel running and elution) for 24 samples; steps 1–4 take 1 h 30 min for 6 gels, steps 5–9 take 1 h 20 min, steps 10–16 take 1 h 20 min for 6 gels, step 17 takes 2 h, steps 18–24 take 1 h 20 min for 24 samples

1
Pre-run a 15% (wt/vol) acrylamide TBE-Urea gel in 1x TBE buffer at room temperature at 200 V for 20 minutes.

! CAUTION: To track multiple gels, label the plastic cassette and add 3–5 μl of FLB buffer to different lanes (e.g., first lane of gel 1, second lane of gel 2, etc.). This will also confirm the gel is pre-running correctly and will not interfere later.
2
Prepare the RNA samples by mixing 1:1 (vol/vol) of RNA with 2x FLB buffer, up to a maximum volume of 24 μl.
3
Mix samples by flicking and denature the RNA by heating at 75 °C for 2 minutes in 1.7 ml microcentrifuge tubes or for 90 seconds in 0.2 ml PCR tubes.
4
Immediately chill the samples on wet ice.

▲CRITICAL STEP: Rapid cooling is essential to prevent the formation of RNA secondary structures. Keep samples on ice until loading.
5
Thoroughly wash the wells of the gel with 1x TBE buffer using a blunt needle and syringe or 200 μl pipette to remove any excess salt from the wells before loading samples.

▲CRITICAL STEP: It is imperative to thoroughly wash the wells of the gel with 1x TBE buffer before loading samples to ensure proper size selection.

CRITICAL STEP: Load 5 μl of ssRNA ladder or any available 19-nt and 59-nt ssDNA oligos to aid with size selection. Alternatively, you can orient yourself using the FLB dyes (Extended Data Fig.2).
6
Load the RNA samples onto the gel with generous spacing (at least two empty lanes between samples).
7
Run the gel at 200 V for approximately 45 minutes, or until the bromophenol blue dye has migrated approximately ¾ of the gel length.
8
While the gel is running, prepare Eppendorf 1.5 ml Safe-Lock LowBind tubes containing Gel Breaker tubes inside.

CRITICAL STEP: Instead of Gel Breaker Tubes you can use 0.5 ml PCR tubes (e.g. Axygen^™, PCR05C) perforated 3–4 times with a 22G needle.

CRITICAL STEP: We recommend using Eppendorf Safe-Lock LowBind tubes here specifically as their lids are more stable.
9
Prepare the gel stain by adding 10 ml of 1x TBE buffer containing 0.5 μg/ml GelGreen^® Nucleic Acid Gel Stain to a square petri dish for each gel.
10
After electrophoresis, remove the gel from the tank, crack open the cassette, and carefully clip off the well region.
11
Place the gel into the petri dish containing the GelGreen^® stain and incubate for 2–5 minutes.
12
Transfer the stained gels without the staining solution into the lid of the petri dish, then place the gel-containing lids onto the Dark Reader Transilluminator.

! CAUTION: Wear appropriate skin and eye protection (Dark Reader protective glasses) when using the transilluminator.

? TROUBLESHOOTING

13
Use fresh gloves and razor blades and cut the desired small RNA fraction, usually ~19–60 nt, as follows. TOP: Excise the gel immediately below any prominent band, typically migrating around 61-nt. However, the size of this first prominent band can vary depending on the sample type and species. BOTTOM: Cut at the 19-nt ladder band or on top of the Bromophenol blue (~10-nt). As long as you cut 20-nt or below, lower end size selection is robust but should be kept consistent across samples (Extended Data Fig.2).

CRITICAL STEP: We recommend taking a picture of the gel before and/or after cutting it for your records.

▲CRITICAL STEP: It is imperative to cut at the 19-nt ladder band and 59-nt ladder band to ensure the correct size selection. It is likely that you will not see the small RNAs in this region or just a faint hint of the miRNAs.
14
Transfer the gel slices into the prepared Gel Breaker tubes standing inside the 1.5 ml Safe-Lock LowBind tubes, with the Xylene cyanol dye facing down.
15
Shred the gel slices by centrifuging the Gel Breaker tubes inside 1.5 ml Safe-Lock LowBind tubes at maximum speed (20,000 g) for 3 minutes at room temperature. Use slow acceleration to prevent lids from breaking off.
16
If any gel pieces remain in the Gel Breaker tubes, repeat the centrifugation or invert and flick the tubes to dislodge the gel into the Safe-Lock tubes. Discard the Gel Breaker tubes.
17
Add 350 μl of csEB buffer to each tube with shredded gel and elute the small RNA by agitating in the dark at room temperature (vortex setting 5–6) for 90 minutes to 2 hours.

CRITICAL STEP: Most of the elution occurs within 20 minutes, but longer incubation increases the yield.
18
Meanwhile, prepare a set of 1.7 ml SafeSeal LowBind tubes containing 1.5 μl of GlycoBlue coprecipitant each and a set of UltraFree^® centrifugal filter columns labeled on their lids.

▲CRITICAL STEP: Without GlycoBlue, the RNA pellet is invisible. However, do not add more than 2.5 μl of GlycoBlue, as this can inhibit downstream enzymatic reactions.
19
After gel elution, briefly centrifuge the tubes.
20
To filter out gel slurry, process a maximum of six tubes at a time to prevent the samples from flowing through the columns prior to centrifugation: Using a large orifice P200 pipette tip, pipette the elution slurry up and down 3–4 times and transfer it into the centrifugal filter column. Close the lid. Do NOT centrifuge the columns at this stage.
21
Cut the lid using a pair of scissors, leaving the centrifugal filter column closed, and transfer the closed column containing the slurry by the lid from the original collection tube into the prepared 1.7 ml SafeSeal LowBind tube containing 1.5 μl of GlycoBlue.
22
Centrifuge at 1,000 g for 1 minute at room temperature.
23
Discard the centrifugal filter columns containing the gel.
24
Add at least three volumes (approximately 1 ml) of 100% (vol/vol) ethanol to the small RNA-containing flow-through in each 1.7 ml SafeSeal LowBind tube. Mix thoroughly and precipitate the small RNA at −20 °C overnight or over the weekend.

■ PAUSE POINT: Samples can be stored at −20 °C for weeks.
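The lane spacing in Step 6 determines how many samples fit on one 12-well gel and therefore how many gels a batch needs. The sketch below illustrates that arithmetic under stated assumptions: samples start in lane 1, exactly two empty lanes separate neighbours, and the ladder and FLB tracking lanes are not modelled. The function names are hypothetical.

```python
import math


def sample_lanes(n_wells=12, empty_between=2):
    """Return 1-based lane numbers for samples separated by `empty_between` empty wells."""
    return list(range(1, n_wells + 1, empty_between + 1))


def gels_needed(n_samples, n_wells=12, empty_between=2):
    """Number of gels required for `n_samples` at the given spacing."""
    per_gel = len(sample_lanes(n_wells, empty_between))
    return math.ceil(n_samples / per_gel)


print(sample_lanes())    # [1, 4, 7, 10]
print(gels_needed(24))   # 6
```

With these assumptions, four samples fit per gel, which agrees with the timing note of six gels for 24 samples.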

sRNA Precipitation

● TIMING: Approximately 2 hours for 24 samples

25
Retrieve the small RNA samples from −20 °C. Invert the tubes several times then pellet the RNA by centrifuging at ≥20,000 g (maximum speed) at 4 °C for 30 minutes but without pre-chilling the rotor.

CRITICAL STEP: Starting the centrifugation with the rotor still at room temperature and allowing it to cool down during the spin can improve pellet formation.

? TROUBLESHOOTING

26
Carefully remove all supernatant by aspiration without disturbing the RNA pellet.
27
Briefly centrifuge samples and use filtered gel loading tips to remove any remaining supernatant.
28
Wash the RNA pellet with 400 μl of 75% (vol/vol) ethanol. Gently pipette up and down or briefly shake the tube to dislodge the pellet.

! CAUTION: Do not vortex, as this can disrupt the pellet.
29
Briefly centrifuge and remove most of the supernatant.

? TROUBLESHOOTING

30
Briefly centrifuge once more and remove all remaining supernatant using a filtered gel loading tip.
31
Air-dry the RNA pellets for approximately 5 minutes, or until completely dry.

■ PAUSE POINT: Pellets can be stored at −20 °C for up to one week or at –80 °C for months.
32
Resuspend the sRNA pellet in 6 μl of TE’T buffer by flicking and incubate at room temperature for a few minutes until the pellet dissolves.
33
Vortex briefly and briefly centrifuge.
34
Heat the resuspended RNA at 75 °C for 2 minutes to denature secondary structures.
35
Immediately chill the samples on wet ice.

▲CRITICAL STEP: Rapid cooling is essential to prevent the reformation of secondary structures.
36
If generating input controls (recommended), label a new set of PCR tubes and add 1 μl of TE’T buffer.
37
Transfer 0.5 μl of the resuspended small RNA (from Step 35) into each PCR tube. This will now be referred to as “input”. This sRNA will not be 5’ Cap enriched. Store it at −20 °C until Step 68. Continue with the remaining resuspended sRNA for 5’Cap Enrichment.

▲CRITICAL STEP: This step is essential for generating input libraries.

5’ Cap Enrichment

● TIMING: Approximately 4.5–5.0 hours for 24 samples; steps 38–43 take 1 h 15 min, steps 44–49 take 1 h 30 min, steps 50–58 take 1.5–2.0 h, step 59 takes 20 min or overnight

38
Prepare the 1x Terminator Master Mix (MM) per csRNA sample as detailed below. Scale up according to the number of samples.

Order	Terminator MM (for 20μL final)	1X for csRNA
1	ddH2O + 0.05% (vol/vol) Tween 20	11.25 μl
2	10x TermA Buffer	2 μl
3	SUPERase•In	0.25 μl
Vortex the master mix well and spin down
4	TER51020 Terminator Exonuclease	1 μl
Invert vigorously/flick and spin down
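Scaling a master mix to the number of samples is simple multiplication, but a small overage is commonly added to cover pipetting losses. The sketch below scales the per-sample Terminator MM volumes from the table above. The 10% overage and the function name `scale_master_mix` are assumptions for illustration and are not specified by the protocol.

```python
# Per-sample volumes (μl) from the Terminator MM table, in order of addition.
TERMINATOR_MM_UL = {
    "ddH2O + 0.05% Tween 20": 11.25,
    "10x TermA Buffer": 2.0,
    "SUPERase-In": 0.25,
    "Terminator exonuclease": 1.0,
}


def scale_master_mix(per_sample_ul, n_samples, overage=0.10):
    """Scale per-sample volumes to n_samples plus a fractional overage (assumed 10%)."""
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_sample_ul.items()}


scaled = scale_master_mix(TERMINATOR_MM_UL, n_samples=24)
for name, vol in scaled.items():
    print(f"{name}: {vol} μl")
```

The same helper applies to the other master mixes in the procedure by swapping in their per-sample volumes. The enzyme should still be added last, after vortexing the other components, as indicated in the table.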


39
Add 14.5 μl of the Terminator Master Mix into each 1.7 ml SafeSeal LowBind tube with 5.5 μl of denatured sRNA sample.
40
Mix well by flicking or inverting the tubes several times. Do NOT pipette up and down or vortex.
41
Briefly centrifuge and incubate at 30 °C for 1 hour in a heat block.

42
With about 20 minutes left, prepare the CIP Master Mix (MM) for each sample as detailed below. Scale up according to the number of samples.

Order	CIP MM (for 50 μl final)	1X for csRNA
1	ddH2O+0.05% (vol/vol) Tween 20	24 μl
2	10x CutSmart Buffer (NEB)	5 μl
Vortex the master mix well and spin down
3	Quick CIP (NEB M0525L)	0.75 μl
Invert vigorously/flick and spin down


►CRITICAL STEP: The CIP Master Mix can be prepared at room temperature.

43
Briefly centrifuge the samples and add 30 μl of CIP Master Mix to each.
44
Mix well by vigorously inverting the tubes ten times. Do NOT pipette up and down and do not vortex.
45
Briefly centrifuge the samples and incubate at 37 °C for 45 minutes in a heat block.
46
Meanwhile: turn on the second heat block to 75 °C.
47
Heat the samples at 75 °C for 90 seconds, then immediately chill on wet ice or in a cold block.

▲CRITICAL STEP: Rapid cooling is essential to prevent the formation of secondary structures.
48
Incubate at 37 °C for an additional 15 minutes.

CRITICAL STEP: QuickCIP is thermolabile but unlike rSAP, residual activity is observed after the quick heating. If using rSAP, it would need to be re-added at Step 48.
49
Stop the reaction by adding 500 μl of TRIzol LS.

! CAUTION: TRIzol LS Reagent is toxic. Wear appropriate PPE and work in a fume hood.

▲CRITICAL STEP: Adding TRIzol LS inactivates QuickCIP and Terminator Exonuclease.
50
Vortex thoroughly for 5 seconds and spin briefly.
51
Add 150 μl of TE’T buffer and 150 μl of chloroform:isoamyl alcohol (24:1).

! CAUTION: Chloroform is toxic. Use this reagent in a fume hood.
52
Vortex thoroughly for 15 seconds.
53
Centrifuge at 14,000 g for 10 minutes at room temperature.
54
While centrifuging, prepare a new set of 1.7 ml SafeSeal LowBind tubes and add 50 μl of 3 M Sodium Acetate to each.

CRITICAL STEP: If the RNA pellet was not clearly visible after the previous precipitation (Step 31), you can vortex your GlycoBlue stock vial and add an additional 0.5 μl of GlycoBlue coprecipitant here.
55
Carefully transfer approximately 440 μl of the aqueous upper phase (without disturbing the interphase or lower pink layer) to the new tubes containing 50 μl of 3M Sodium Acetate.
56
Vortex or pipette up and down to mix and briefly centrifuge.
57
Add one total volume of isopropanol (usually 500 μl to achieve a final concentration >50% vol/vol).
58
Vortex to mix. Do NOT centrifuge.
59
Incubate the samples on ice for 20 minutes or at −20 °C overnight.

▲CRITICAL STEP: Do NOT incubate at –80 °C, as this can cause complications if residual phenol is present in the sample.

■ PAUSE POINT: Samples can be stored at −20 °C overnight or for weeks at this stage.
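The alcohol precipitations in this procedure depend on reaching a sufficient final alcohol fraction. The short sketch below checks the volumes given in the text: roughly 440 μl of aqueous phase plus 50 μl of 3 M sodium acetate with 500 μl of isopropanol, and 350 μl of csEB eluate with about 1 ml of ethanol. It is a simplified estimate that assumes volumes are additive and that the full 350 μl of eluate is recovered; the function name is hypothetical.

```python
def alcohol_fraction(aqueous_ul, alcohol_ul):
    """Final alcohol volume fraction, assuming additive volumes."""
    return alcohol_ul / (aqueous_ul + alcohol_ul)


# Isopropanol precipitation after 5' cap enrichment:
# ~440 μl aqueous phase + 50 μl 3 M sodium acetate, then 500 μl isopropanol.
isopropanol = alcohol_fraction(440 + 50, 500)

# Ethanol precipitation of the gel eluate:
# 350 μl csEB (assumed fully recovered) + ~1 ml of 100% ethanol.
ethanol = alcohol_fraction(350, 1000)

# Sodium acetate concentration before isopropanol is added (M).
sodium_acetate_m = 50 * 3.0 / (440 + 50)

print(f"isopropanol: {isopropanol:.1%}")
print(f"ethanol: {ethanol:.1%}")
print(f"sodium acetate: {sodium_acetate_m:.2f} M")
```

If less aqueous phase is transferred, the isopropanol fraction rises further above 50%; if more is transferred, the isopropanol volume should be increased to match, as the step's "one total volume" wording implies.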

Capped small RNA precipitation

● TIMING: Approximately 1 h 30 min for 24 samples.

60
Centrifuge the samples at ≥20,000 g for 30 minutes at 4 °C.
61
As done for the sRNA precipitation (Steps 26–31), carefully remove all supernatant.
62
Briefly centrifuge once more and use a filtered gel loading tip to remove any remaining supernatant.
63
Wash the pellet with 400 μl of 75% (vol/vol) ethanol.
64
Using a P1000 pipette, gently pipette up and down to loosen the pellet and transfer the 5’cap enriched sRNA pellet along with some of the 75% (vol/vol) ethanol to a new PCR strip tube or 96-well plate for library preparation.
65
Briefly centrifuge again and remove all remaining ethanol using a gel loading tip.
66
Air-dry the pellet for 5 minutes at room temperature, or until completely dry.

■ PAUSE POINT: Pellets can be stored at –80 °C for months.

Library Preparation

● TIMING: Approximately 8–8.5 hours for 24 csRNA-seq and 24 input samples; steps 67–73 take 30 min, steps 74–77 take 1 h 50 min, steps 78–79 take 1 h 10 min, steps 80–86 take 45 min, steps 87–91 take 1h 40 min, steps 92–96 take 1 h 20 min, steps 97–98 take 1h

CRITICAL: This procedure describes the preparation of csRNA-seq and sRNA-seq (input) libraries using a modified NEB sRNA kit procedure (i.e., NEB E7330). Other kits can be used but may require optimization. The procedure uses half the recommended volumes of the NEB protocol for csRNA and one-quarter for input samples. Adjust the amount of adapter used depending on input (detailed below).

CRITICAL: To speed up the procedure, prepare each master mix during the preceding incubation and aliquot it into the lids of 8-strip PCR tubes on ice. When the incubation ends, exchange the lids, mix by flicking, and briefly centrifuge to add the master mix, rather than pipetting it into each well.

5’ Decapping

67
Add 3 μl of TE’T buffer to new PCR tube lids. Place the lids on the csRNA tubes and flick to bring the solution to the bottom. Vortex to resuspend the samples (do not use pipette tips).
68
Thaw the input control tubes (from Step 37) at RT and include them in the following steps.
69
Incubate the csRNA and input samples at 75 °C for 2 minutes.
70
Immediately place the tubes on wet ice and keep them on ice.

▲CRITICAL STEP: It is important to rapidly cool the samples to prevent reformation of secondary structures.

71
Prepare Decapping Master Mix for each sample as detailed below. Scale up according to the number of samples. For adding PEG8000, use Large Orifice Filtered Pipette Tips P200.

Order	Decapping MM	1X csRNA (for 5 μl final)	1X input (for 2.5 μl final)	1X csRNA and input (for 7.5 μl final)
1	10x T4 RNA Ligase buffer	0.8 μl	0.4 μl	1.2 μl
2	SUPERase-In RNase inhibitor (20 U/μl)	0.3 μl	0.15 μl	0.45 μl
Use 200 μl Large Orifice Filtered Pipette Tips P200 for adding PEG8000 or cut off a 20 μl tip
3	50% (wt/vol) PEG8000	3 μl	1.5 μl	4.5 μl
Vortex the master mix well and spin down
4	RppH	1 μl	0.5 μl	1.5 μl
Invert vigorously and spin down


72
Add 5 μl of Decapping MM to new PCR tube lids for csRNA samples and 2.5 μl for input samples. Mix vigorously by inverting tubes.

►CRITICAL STEP: If using a P20 or P10 pipette with small-orifice tips, you may have to adjust the setting to account for viscosity (e.g., “5.4 μl” and “2.9 μl”) to ensure accurate dispensing of the MM. Use tip markings as a reference point for the correct volume.
73
Quick spin samples and exchange old lids with new ones containing the Decapping MM.

▲CRITICAL STEP: Flick vigorously so that sample flies from lid to bottom and vice versa at least ten times to fully mix sample and the PEG-containing MM. Do not vortex.

▲CRITICAL STEP: Thorough mixing is essential due to high PEG concentration.

►CRITICAL STEP: If handling a large number of samples, consider placing the strips in a 96 well PCR rack for mixing. Invert the entire rack vigorously to mix.
74
Briefly centrifuge and incubate at 37 °C (set the thermocycler lid to 42 °C) for 1–2 hours.

3’ Adapter Ligation

75
Approximately 20 minutes before the decapping reaction (Step 73) is complete, prepare the 3’ Adapter MM for each sample as shown below. Scale up accordingly.

Order	3’ Adapter MM	1X csRNA (for 4 μl final)	1X input (for 2 μl final)	1X csRNA and input (for 6 μl final)
1	2x NEB 3’ Ligation buffer	2 μl	1 μl	3 μl
2	10 μM 3’ SR Adapter	0.3 μl	0.15 μl	0.45 μl
Vortex the master mix well and spin down
3	NEB 3’ Ligation Enzyme Mix	1.7 μl	0.85 μl	2.55 μl
Invert vigorously and spin down


►CRITICAL STEP: When starting with >8 μg RNA, use 0.3 μl of 3’ adapter; with 1–8 μg RNA, use 0.15 μl of 3’ adapter and supplement the volume with 0.15 μl of 0.05% (vol/vol) Tween 20. If using <1 μg RNA, use 0.06 μl of adapter.

76
Add 4 μl (for csRNA) or 2 μl (for input) of 3’ Adapter MM to new PCR lids placed on ice or cold PCR tube rack.
77
Once incubation is completed, exchange the old lids with those containing the 3’Adapter MM and mix thoroughly by flicking vigorously and inverting the tubes completely (ten times).

▲CRITICAL STEP: Thorough mixing is essential due to high PEG concentration.
78
Briefly centrifuge and incubate at 22 °C for 1–2 hours (set the thermocycler lid to 40 °C) or at 16 °C for a longer ligation.

CRITICAL STEP: RppH is inactive at 20 °C and will not degrade the adapter³⁶.
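The adapter guidance in this section (0.3 μl above 8 μg of starting RNA, 0.15 μl for 1–8 μg, 0.06 μl below 1 μg) can be written as a small lookup. This is an illustrative sketch for csRNA reactions only. The protocol states the 0.05% Tween 20 top-up explicitly only for the 1–8 μg case; extending it to the <1 μg case so that the master mix volume stays constant is an assumption, as is the function name.

```python
def three_prime_adapter_volumes(total_rna_ug):
    """Return (adapter_ul, tween_topup_ul) per csRNA reaction for the 3' Adapter MM."""
    if total_rna_ug <= 0:
        raise ValueError("total RNA must be a positive amount in μg")
    if total_rna_ug > 8:
        adapter_ul = 0.30
    elif total_rna_ug >= 1:
        adapter_ul = 0.15
    else:
        adapter_ul = 0.06
    # Top-up with 0.05% Tween 20 keeps the adapter slot at 0.3 μl.
    # Stated in the protocol for 1-8 μg; assumed here for <1 μg.
    tween_topup_ul = round(0.30 - adapter_ul, 2)
    return adapter_ul, tween_topup_ul


for amount in (10, 4, 0.5):
    print(amount, three_prime_adapter_volumes(amount))
```

Volumes this small are practical only when pipetted as part of a scaled master mix or from a pre-diluted adapter stock.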

Small RNA Reverse Transcription Primer Hybridization

79
Dilute the 10 μM SR RT primer 1:1 with 0.05% (vol/vol) Tween 20 and aliquot into 8-strip tubes.

►CRITICAL STEP: The 5μM SR RT primer aliquots can be stored at −20 °C for months and thawed before use.
80
On ice, directly add 1 μl of the 5 μM SR RT primer to both samples and inputs.
81
Mix well by inverting and flicking vigorously.
82
Briefly centrifuge and incubate using the following thermocycler program (total duration 33 min):

Step Number	Temperature	Duration
1	75 °C	2 minutes
2	37 °C	15 minutes
3	25 °C	15 minutes
4	4 °C	∞
Set the lid temperature to 80–85 °C


5’ SR Adapter Ligation

83
If needed, reconstitute the 5’ SR adapter (e.g., NEB E7328A) in 120 μl ddH₂O + 0.05% (vol/vol) Tween 20 and aliquot. Minimize freeze-thaw cycles by storing aliquots at –80 °C.
84
Before use, heat the SR 5’ adapter at 70 °C for 2 minutes and immediately place on ice. Use the denatured adapter within 30 minutes.

85
Approximately 20 minutes before the hybridization PCR (Step 82) is complete, prepare 5’Adapter MM for each sample as shown below. Scale up accordingly.

Order	5’Adapter MM	1X csRNA (for 4 μl final)	1X input (for 2 μl final)	1X csRNA and input (for 6 μl final)
1	ddH2O+0.1% (vol/vol) Tween	1.95 μl	0.98 μl	2.93 μl
2	NEB 5’ Ligation Reaction Buffer	0.5 μl	0.25 μl	0.75 μl
3	SR 5’ adapter, 10 μM	0.3 μl	0.15 μl	0.45 μl
Vortex the master mix well and spin down
4	NEB 5’ Ligation Enzyme Mix	1.25 μl	0.63 μl	1.88 μl
Invert vigorously and spin down


86
To new PCR lids on ice, add 4 μl of 5’ Adapter MM to csRNA samples or 2 μl to inputs.
87
Once the incubation is complete, exchange lids and mix well by flicking (five times).
88
Briefly centrifuge and incubate at 25 °C for 1.5 hours (set the thermocycler lid to 40 °C).

Reverse Transcription (RT)

89
Approximately 20 minutes before the 5’ adapter ligation (Step 88) is complete, prepare RT MM for each sample as shown below.

Order	RT MM	1X csRNA (for 5.25 μl final)	1X input (for 2.6 μl final)	1X csRNA and input (for 7.85 μl final)
1	NEB First Strand Synthesis Rx Buffer	4.5 μl	2.25 μl	6.75 μl
2	ProtoScript II	0.75 μl	0.38 μl	1.13 μl
Invert vigorously and spin down


90
Add 5.25 μl (samples) or 2.6 μl (inputs) of RT MM to new lids 5–10 minutes before the 5’ adapter ligation ends.
91
Once the incubation is complete, exchange lids and mix well by flicking (five times).
92
Briefly centrifuge and incubate at 50 °C for 1 hour (set the thermocycler lid to 60 °C). Alternatively, incubate overnight with the following thermocycler program (approximately 2 hours and 6 minutes):

Step Number	Temperature	Duration
1	50 °C	1 hour
2	45 °C	5 minutes
3	50 °C	55 minutes
4	80 °C	5 minutes
5	12 °C	∞
Set the lid temperature to 80–85 °C


■ PAUSE POINT: Samples can be frozen at −20 °C or kept at 4 °C overnight.
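When planning the day, it can help to total the programmed hold times of each thermocycler program. The sketch below sums the timed steps of the primer hybridization and overnight RT programs given in the text. It counts hold times only; the slightly longer totals quoted in the protocol presumably also include ramping between temperatures, which is an assumption here. The names are hypothetical.

```python
# (temperature in °C, hold time in minutes); final infinite holds are omitted.
HYBRIDIZATION_PROGRAM = [(75, 2), (37, 15), (25, 15)]
RT_OVERNIGHT_PROGRAM = [(50, 60), (45, 5), (50, 55), (80, 5)]


def hold_minutes(program):
    """Sum of programmed hold times, excluding ramp time between steps."""
    return sum(minutes for _temperature, minutes in program)


hybridization_total = hold_minutes(HYBRIDIZATION_PROGRAM)
rt_total = hold_minutes(RT_OVERNIGHT_PROGRAM)

print(f"hybridization: {hybridization_total} min of holds")
print(f"overnight RT: {rt_total // 60} h {rt_total % 60} min of holds")
```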

PCR with Barcodes

93
Prepare a sample/barcode list using preferred barcodes (e.g., UDI or BIOO).
94
Thaw the barcodes, vortex, and briefly centrifuge.

►CRITICAL STEP: Preparing 8-strip aliquots of the barcodes for multichannel pipetting may save time in the long run.
95
Add 2 μl of the corresponding barcode directly to each csRNA and sRNA input sample.

▲CRITICAL STEP: Carefully record the barcodes used for each sample as these will be needed for later analysis.

96
Prepare either LongAmp Taq or Q5 High-Fidelity/Q5 Ultra II PCR Master Mix for each sample as shown below. Scale up accordingly.

LongAmp Taq PCR Master Mix:

Order	LongAmp Taq PCR	1X csRNA (for 27.3 μl final)	1X input (for 13.65 μl final)	1X csRNA and input (for 40.95 μl final)
1	NEB SR FW Primer, 10μM	2 μl	1 μl	3 μl
2	Betaine	0.3 μl	0.15 μl	0.45 μl
3	2X LongAmp Taq PCR Master Mix	25 μl	12.5 μl	37.5 μl
Invert vigorously and spin down


Q5 High-Fidelity or Q5 Ultra II Master Mix:

Order	Q5 High-Fidelity or Q5 Ultra II PCR MM	1X csRNA (for 27.3 μl final)	1X input (for 13.65 μl final)	1X csRNA and input (for 40.95 μl final)
1	NEB SR FW Primer, 10μM	2 μl	1 μl	3 μl
2	Betaine	0.3 μl	0.15 μl	0.45 μl
3	2X Q5 High-Fidelity Master Mix or 2X Q5 Ultra II Master Mix	25 μl	12.5 μl	37.5 μl
Invert vigorously and spin down


CRITICAL STEP: DNA polymerase choice may impact downstream sequencing as different sequencing platforms may have different requirements: LongAmp Taq has terminal transferase activity adding A overhangs, Q5 leaves ends blunt. Please check with your sequencing facility for their preference.

97
Add 27.3 μl of the PCR MM to new lids for samples and 13.65 μl for inputs. Switch the lids, mix well, briefly spin down.
98
Amplify the libraries using one of the following PCR programs, depending on the polymerase used. Use 11–15 cycles, typically 12.

LongAmp Taq PCR Master Mix Thermocycler Program:

Number of Cycles	Denature	Anneal	Extend	Hold
1	94 °C for 5 min
12–15	94 °C for 45 sec	63 °C for 30 sec	70 °C for 15 sec
1			70 °C for 5 min
1				4–10 °C for ∞


Q5 High-Fidelity or Q5 Ultra II Master Mix Thermocycler Program:

Number of Cycles	Denature	Anneal	Extend	Hold
1	98 °C for 30 sec
12–15	98 °C for 10 sec	63 °C for 30 sec	72 °C for 20 sec
1			72 °C for 5 min
1				4–10 °C for ∞


►CRITICAL STEP: The optimal number of PCR cycles depends on the starting material and transcriptional activity of the tissue studied. With 1 μg of input RNA, 14–15 cycles are usually sufficient. Fewer cycles can be used with more input (e.g., 13 cycles for 2 μg, 12 for 4 μg, 11 for >8 μg). It is recommended to use 3’ adapters with unique molecular identifiers (UMIs, e.g., NEB #E7395) to remove PCR duplicates.

■ PAUSE POINT: Samples can be frozen at −20 °C for at least two years.
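The cycle-number guidance above (14–15 cycles for 1 μg, 13 for 2 μg, 12 for 4 μg, 11 for more than 8 μg) can be expressed as a lookup for planning. The sketch below is illustrative only: the protocol lists discrete reference amounts, so treating amounts between them as belonging to the next lower reference point is an assumption, and the function name is hypothetical. The best cycle number still depends on the transcriptional activity of the sample.

```python
def suggested_pcr_cycles(total_rna_ug):
    """Return (min_cycles, max_cycles) for a given amount of starting RNA in μg."""
    if total_rna_ug < 1:
        raise ValueError("the protocol gives no cycle guidance below 1 μg of input RNA")
    if total_rna_ug > 8:
        return (11, 11)
    if total_rna_ug >= 4:   # 4 μg reference point; 4-8 μg is an assumed range
        return (12, 12)
    if total_rna_ug >= 2:   # 2 μg reference point; 2-4 μg is an assumed range
        return (13, 13)
    return (14, 15)         # 1 μg reference point


for amount in (1, 2, 4, 10):
    print(amount, suggested_pcr_cycles(amount))
```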

dsDNA Size Selection and Gel Purification

● TIMING: Approximately 6–7 hours plus an overnight elution for 24 csRNA and 24 sRNA input samples; steps 99–115 take 1.5 h, steps 116–119 take 1 h for 6 gels, step 120 takes 2 h 20 min, steps 121–129 take 1.5–2 h

CRITICAL: All steps can be done at room temperature.

SpeedBead Magnetic Carboxylate Modified Particles Preparation

99
Pre-wash the carboxyl magnetic beads: resuspend the beads thoroughly, then add 400 μl of beads to a 2 ml tube containing 32 μl of 0.5 M EDTA.
100
Vortex and place the tube on a strong magnet to collect the beads.
101
Remove the supernatant.
102
Wash the beads twice with 2 ml of TE’T buffer. Collect beads on a magnet to remove the supernatant.
103
Resuspend the beads in the original volume (400 μl) of TE’T buffer.

■ PAUSE POINT: Washed beads can be stored at 4 °C for up to 6 months.

DNA Bead Purification

! CAUTION: When handling amplified libraries, take precautions to prevent contamination of future libraries (physical separation, minimizing aerosol formation, and decontamination).

104
Prepare the Beads Master Mix for each sample as shown below. Scale up accordingly.

Order	Beads Master Mix	1X csRNA (for 75 μl final)	1X input (for 37.5 μl final)	1X csRNA and input (for 112.5 μl final)
1	Prewashed Beads (Step 103)	2 μl	1 μl	3 μl
2	5M NaCl	35.5 μl	17.75 μl	53.25 μl
3	40% (wt/vol) PEG8000	37.5 μl	18.75 μl	56.25 μl


CRITICAL STEP: The high PEG concentration is necessary for efficient recovery of smaller DNA fragments.

CRITICAL STEP: A larger stock of Beads Master Mix can be made and stored at 4 °C for months.
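To make the "high PEG concentration" concrete, the sketch below estimates the PEG 8000 and NaCl concentrations of the Beads Master Mix itself and after it is combined with a library at the 1:1.5 sample-to-mix ratio used in this procedure. The 50 μl csRNA sample volume is inferred from that ratio and the 75 μl mix volume; additive volumes are assumed, and the names are hypothetical.

```python
# Per-csRNA-sample volumes (μl) from the Beads Master Mix table.
BEADS_MM_UL = {"prewashed beads": 2.0, "5 M NaCl": 35.5, "40% (wt/vol) PEG 8000": 37.5}


def bead_mix_concentrations(sample_ul, recipes=1.0):
    """Return (PEG % wt/vol, NaCl M) after mixing `recipes` x the 1X mix with a sample."""
    mm_ul = recipes * sum(BEADS_MM_UL.values())
    total_ul = sample_ul + mm_ul
    peg_pct = recipes * BEADS_MM_UL["40% (wt/vol) PEG 8000"] * 40.0 / total_ul
    nacl_m = recipes * BEADS_MM_UL["5 M NaCl"] * 5.0 / total_ul
    return peg_pct, nacl_m


mix_only = bead_mix_concentrations(sample_ul=0.0)        # the master mix on its own
with_sample = bead_mix_concentrations(sample_ul=50.0)    # csRNA library, 1:1.5 ratio
with_input = bead_mix_concentrations(sample_ul=25.0, recipes=0.5)  # input library

print(f"master mix: {mix_only[0]:.1f}% PEG, {mix_only[1]:.2f} M NaCl")
print(f"with sample: {with_sample[0]:.1f}% PEG, {with_sample[1]:.2f} M NaCl")
```

Because csRNA and input libraries use the same 1:1.5 ratio, both end up at the same final PEG and NaCl concentrations.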

105
Add the appropriate volume of Beads Master Mix to new PCR tube lids at room temperature, i.e. 75 μl to csRNA samples and 37.5 μl to inputs (ratio of 1:1.5 Sample:Beads Master Mix).
106
Exchange lids, vortex, and briefly centrifuge.
107
Incubate for 10 minutes at room temperature.
108
Place the tubes on a magnetic rack and incubate until the solution is clear (3–4 minutes). The beads should form a compact spot on the side of the tube.
109
Carefully remove the supernatant without disturbing the beads.

! CAUTION: Carefully dispose of the supernatant as it contains amplified libraries. Consider bleaching the waste.
110
Wash the beads twice with 200 μl of 80% (vol/vol) ethanol. For each wash, move the tubes back and forth on the magnet (5–6 times) to ensure thorough washing.
20
Aspirate the supernatant while tubes are on the magnet.
21
Briefly centrifuge and remove all remaining ethanol.
22
Air-dry the beads for 5–10 minutes or use a 30–37 °C heating block for 3–5 minutes. Pellet cracks should be visible.

▲CRITICAL STEP: It is essential to fully dry pellets, as residual ethanol can cause the sample to float out of the gel wells during electrophoresis.
23
Add 15 μl of 1X Novex Hi-Density TBE Sample Buffer to new PCR tube lids.

CRITICAL STEP: Dilute 5X Novex Hi-Density TBE Sample Buffer with RNase/DNase-Free water.
24
Exchange lids, vortex to mix, and briefly centrifuge.

■ PAUSE POINT: Samples can be frozen at −20 °C for months.
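
The Beads Master Mix table scales linearly with the number of reactions, so batch volumes and the resulting binding conditions can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 50 μl sample volume is inferred from the stated 1:1.5 sample-to-mix ratio, the `overage` parameter is a hypothetical pipetting excess that is not part of the protocol, and the small volume contributed by the bead suspension is treated as buffer.

```python
# Per-reaction volumes (μl) from the Beads Master Mix table for one csRNA sample.
CSRNA_MIX_UL = {"prewashed_beads": 2.0, "NaCl_5M": 35.5, "PEG8000_40pct": 37.5}


def scale_master_mix(n_csrna, n_input, overage=0.10):
    """Return total component volumes (μl) for a batch of samples.

    One input reaction uses half the csRNA volumes (37.5 μl instead of 75 μl).
    `overage` is a hypothetical pipetting excess, not a value from the protocol.
    """
    reactions = n_csrna + 0.5 * n_input
    factor = reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in CSRNA_MIX_UL.items()}


def final_concentrations(sample_ul=50.0, mix_ul=75.0):
    """Final PEG (% wt/vol) and NaCl (M) after combining sample and mix.

    The 50 μl default sample volume is inferred from the 1:1.5 ratio.
    """
    total_ul = sample_ul + mix_ul
    scale = mix_ul / sum(CSRNA_MIX_UL.values())
    peg_pct = 40.0 * CSRNA_MIX_UL["PEG8000_40pct"] * scale / total_ul
    nacl_m = 5.0 * CSRNA_MIX_UL["NaCl_5M"] * scale / total_ul
    return peg_pct, nacl_m


if __name__ == "__main__":
    print(scale_master_mix(n_csrna=24, n_input=24))
    peg, nacl = final_concentrations()
    print(f"final PEG: {peg:.1f}% (wt/vol), final NaCl: {nacl:.2f} M")
```

Under these assumptions the binding reaction ends up at about 12% (wt/vol) PEG and 1.42 M NaCl, which is consistent with the note that a high PEG concentration is needed to recover small DNA fragments.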

dsDNA TBE Gel Purification/Size Selection

116
Prepare a 10% (wt/vol), 12-well TBE gel cassette. Mark the plastic frame with numbers to keep track of samples.
117
Load 8 μl (half) of the samples and corresponding inputs adjacent to each other in lanes 3–10 of the gel. Use a magnet to hold the tubes during loading to prevent beads from entering the wells.

CRITICAL STEP: Avoid using the first and last lanes of the gel, as size selection can be challenging in these lanes.

CRITICAL STEP: Loading csRNA samples next to their corresponding inputs simplifies size selection and reduces bias.
118
Vortex and briefly centrifuge the DNA ladder (e.g., Hyperladder 25 bp (Meridian), diluted 1:2 with ddH₂O).
119
Load 3 μl of the diluted ladder into lanes 2 and 11.
120
Run the gel in 0.5X TBE buffer at 80 V for 20 minutes, then at 180 V for up to 120 minutes, i.e., until the xylene cyanol dye is 1–1.2 cm from the bottom edge.

▲CRITICAL STEP: Running the gel for a longer time improves size separation. The adapter migrates with the lower edge of the xylene cyanol dye. The bromophenol blue dye will run off the gel, which is expected.
121
While the gel is running, label a new set of 1.7 ml SafeSeal LowBind microcentrifuge tubes, and add 150 μl of DNA Gel Elution Buffer (e.g., from NEB Library Kit E7324AA) to each tube.

CRITICAL STEP: 10 mM Tris-HCl (pH 8.0) with 0.05% (vol/vol) Tween 20 can be used as an alternative elution buffer.
122
Prepare the gel stain by adding 1 μl of SYBR Gold to 10 ml of 1X TBE in a square petri dish. Gently swirl to mix.
123
Once the xylene cyanol dye has migrated the desired distance, remove the gel from the electrophoresis tank, open the cassette, and carefully remove the well region.
124
Using a wet spatula, carefully lift the gel and transfer it to the square petri dish containing the SYBR Gold stain. Incubate for 3–5 minutes. Image the gel using a Safe Imager transilluminator.

! CAUTION: Wear appropriate skin and eye protection (Dark Reader protective glasses) when using the transilluminator.

CRITICAL STEP: We recommend taking a picture of the gel for your records.
125
Transfer the gel to the dry lid of the petri dish.
126
Excise the desired DNA fragments. For ~20–60 nt inserts, which with the 118-bp NEB adapters run from just above the 125 bp band to the 175 bp band of the ladder, cut between Hyperladder bands #6 and #8 from the top (or slightly higher, but avoid the abundant steady-state RNA seen on the RNA gel; Extended Data Fig. 3). We recommend taking another picture of the gel for your records after cutting out the desired region.
127
Transfer the gel slices to the prepared 1.7 ml SafeSeal LowBind microcentrifuge tubes containing 150 μl of DNA elution buffer.
128
Using a 200 μl pipette tip, break each gel slice into two pieces to ensure complete submersion in the 150 μl of elution buffer.
129
Elute the DNA overnight at room temperature in the dark or over the weekend at 4 °C without shaking.

CRITICAL STEP: For a faster elution, shred the gel pieces using Gel-Breaker tubes and elute for 20 minutes with shaking at room temperature.
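
Where to cut on the gel follows from adding the adapter length to the insert length. The short sketch below does that arithmetic for the insert ranges mentioned in this protocol, assuming the 118 bp combined adapter length stated for the NEB kit; the function name is illustrative, and other adapter sets would need a different `adapter_bp` value.

```python
ADAPTER_BP = 118  # combined adapter length stated for the NEB kit in this protocol


def library_size_window(insert_min_nt, insert_max_nt, adapter_bp=ADAPTER_BP):
    """Return the (min, max) library fragment size in bp for an insert-size range."""
    if insert_min_nt > insert_max_nt:
        raise ValueError("insert_min_nt must not exceed insert_max_nt")
    return insert_min_nt + adapter_bp, insert_max_nt + adapter_bp


if __name__ == "__main__":
    target = library_size_window(20, 60)  # csRNA-seq inserts to excise
    mirna = library_size_window(21, 24)   # dominant miRNA band in input libraries
    print(f"excise: {target[0]}-{target[1]} bp; miRNA band: {mirna[0]}-{mirna[1]} bp")
```

With these values the target window is 138–178 bp and the miRNA band sits at 139–142 bp, which matches the instruction to cut from just above the 125 bp ladder band to around the 175 bp band.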

DNA Clean up

● TIMING: Approximately 4 hours for 24 csRNA-seq and 24 sRNA input samples

CRITICAL This section describes the purification and concentration of the size-selected DNA.

130
Place a tube of sequencing TET (sTET) buffer at 75 °C to warm.
131
Label a set of Zymo-Spin columns (e.g., from Zymo Research ChIP DNA Clean & Concentrator kit).
132
Add 750 μl of Zymo DNA Binding Buffer to each tube containing the gel pieces and 150 μl of DNA elution buffer.
133
Pipette up and down to mix and transfer the entire mixture (but leaving the gel pieces behind) to the Zymo-Spin columns.

CRITICAL STEP: The columns have a maximum capacity of 800 μl. If the total volume exceeds this, divide the sample and process it in two separate spins.
134
Centrifuge for 30 seconds at 10,000 g at room temperature.
135
Discard the flow through.
136
Add 200 μl of DNA Wash Buffer to each column.
137
Centrifuge for 20–30 seconds at 12,000 g at room temperature.
138
Add 200 μl of DNA Wash Buffer to each column.
139
Discard the flow through.
140
Dry columns by centrifugation for 1 minute at 12,000 g at room temperature.
141
Transfer the dry columns to new labeled 1.7 ml SafeSeal LowBind microcentrifuge tubes.
142
Add pre-warmed (75 °C) sTET buffer directly to the membrane of each column to elute the libraries: 12–20 μl for csRNA-seq libraries and 40 μl for inputs.

CRITICAL STEP: Elution volumes can be adjusted based on intensity of DNA band on the gel.
143
Incubate the columns for 2–5 minutes at room temperature.
144
Centrifuge with low acceleration for 1 minute at 12,000 g at room temperature.

? TROUBLESHOOTING

145
Remove and discard the columns.
146
Mix the samples well by vortexing and briefly centrifuge.

■ PAUSE POINT: Samples can be frozen at −20 °C for at least two years.

Quantification and Pooling

● TIMING: 2 h for 24 csRNA-seq and 24 sRNA input samples

147
Measure the DNA concentration using a Qubit fluorometer and the Qubit dsDNA HS Assay Kit. Perform a new calibration and use 2 μl of each sample for measurement.
148
Label the dsDNA library tubes with unique sample IDs (e.g., your initials plus a number).

■ PAUSE POINT: Samples can be stored at −20 °C.
149
Pool samples for sequencing according to your experimental design and sequencing platform requirements.
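
Pooling is usually planned in molar terms, so the Qubit readings (ng/μl) have to be converted using the fragment length. The following is a minimal sketch of an equimolar pool, assuming an average mass of about 660 g/mol per base pair of dsDNA; the sample IDs, concentrations, mean fragment lengths, and the per-library amount are hypothetical placeholders, and the appropriate csRNA-to-input ratio depends on the experimental design.

```python
def ng_per_ul_to_nM(conc_ng_ul, mean_length_bp):
    """Convert a dsDNA concentration to nM, assuming ~660 g/mol per base pair."""
    return conc_ng_ul * 1e6 / (660.0 * mean_length_bp)


def equimolar_pool(libraries, fmol_each):
    """Return the volume (μl) of each library that contributes `fmol_each` fmol.

    `libraries` maps sample ID -> (Qubit ng/μl, mean fragment length in bp).
    """
    volumes = {}
    for sample_id, (conc, length_bp) in libraries.items():
        nM = ng_per_ul_to_nM(conc, length_bp)  # 1 nM equals 1 fmol per μl
        volumes[sample_id] = round(fmol_each / nM, 2)
    return volumes


if __name__ == "__main__":
    # Hypothetical example values; replace with measured concentrations and sizes.
    example = {"SD1_cs": (0.8, 158), "SD1_input": (5.0, 141)}
    for sample_id, volume in equimolar_pool(example, fmol_each=20).items():
        print(f"{sample_id}: {volume} μl")
```

If a calculated volume exceeds what is available (csRNA-seq libraries are often dilute), lower `fmol_each` for the whole pool rather than for a single library, so the molar balance is preserved.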

Sequencing

● TIMING: variable

150
Sequence pooled libraries single-end for 50 or more cycles using any sequencing platform compatible with Illumina TruSeq⁴⁰. Sequencing to a depth of 8–15 million reads is adequate for most samples.

Data Analysis

● TIMING: variable

A schematic of data analysis is shown in Fig. 3 and Extended Data Fig. 4, and a detailed step-by-step tutorial is available as Supplementary Protocol 2 and at https://github.com/DuttkeLab/csRNA_walkthru.

151
FASTQ trimming: Trim 3’ adapter sequences (AGATCGGAAGAGCACACGTCT) from csRNA and sRNA reads using homerTools trim; Skewer⁶⁸ can be used to trim paired-end total RNA reads.
152
Align sequence reads to the genome: csRNA-seq captures nascent RNA (newly initiating transcripts) before processing and splicing. Reads can be aligned using either a DNA or an RNA aligner; however, we recommend using a splice-aware aligner (e.g., STAR⁶⁶, HISAT2⁶⁷) to account for potential contamination from spliced RNA. Only reads with a single alignment are considered for downstream analysis. Overall alignment rates should be high (>70%), depending on the organism and completeness of the genome assembly. Aligned reads are stored in .sam or .bam format.
153
Create HOMER Tag Directories: Use HOMER2²⁵ to create tag directories, which contain read information and several QC files. At this step, perform QC checks as detailed in Supplementary file 1, including assessing read length distribution and nucleotide preferences at the 5’ end of the reads (TSSs).
154
Create Genome Browser Visualization Files: Reads can be visualized as transcription rates at single-nucleotide resolution at the 5’ ends. Separate files are needed for the + and – strand alignments since csRNA-seq is strand specific. These files can be created using HOMER’s makeUCSCfile command and visualized in genome browsers such as the UCSC Genome Browser⁶⁹ or IGV⁷⁰.
155
Identify TSS Clusters: TSS clusters or Transcription Start Regions (TSRs) represent sites with significant transcription initiation activity from active regulatory elements. Use HOMER2’s findcsRNATSS.pl script to identify TSRs and obtain basic TSS cluster annotation information. Data from mammalian cell lines typically have at least 30,000–100,000 TSRs; numbers may vary for other species.
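
To make the logic of the trimming, unique-alignment, and strand-specific 5’ end steps concrete, here is a minimal pure-Python sketch. It is not how homerTools, the aligners, or HOMER implement these steps: the trimming uses exact matching only, uniqueness is judged from the SAM `NH` tag (assuming the aligner writes it), and all function names are illustrative.

```python
import re
from collections import Counter

ADAPTER = "AGATCGGAAGAGCACACGTCT"  # 3' adapter sequence given in the protocol
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=5):
    """Remove an exact-match 3' adapter (full, or a partial copy at the read end)."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Partial adapter: the tail of the read matches the head of the adapter.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    return read


def five_prime_end(sam_line):
    """Return (chrom, position, strand) of a read's 5' end, or None if filtered."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 4 or "NH:i:1" not in fields[11:]:
        return None  # unmapped, or not a single (unique) alignment
    if flag & 16:  # reverse strand: the 5' end is the rightmost aligned base
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
        return chrom, pos + ref_len - 1, "-"
    return chrom, pos, "+"


def tally_five_prime_ends(sam_lines):
    """Count reads per strand-specific 5' end position, skipping header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        site = five_prime_end(line)
        if site is not None:
            counts[site] += 1
    return counts
```

Keeping the strand in the key is what allows separate + and – strand tracks to be written afterwards; clustering these positions into TSRs and testing them against the input library is left to findcsRNATSS.pl.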

Troubleshooting

Troubleshooting advice can be found in Table 1.

Table 1 | Troubleshooting table.

Step	Problem	Possible reason	Solution
12	Degradation of RNA	RNase activity in reagents	Test reagents for RNase contamination using an appropriate assay, such as IDT’s RNaseAlert (IDT, 11-04-03-03).
24	No visible pellet	Pre-chilled centrifuge was used. Pellets may have formed along the side of the tubes rather than at the bottom.	Vortex the tubes, incubate at room temperature for 5 minutes, vortex again, and centrifuge at ≥20,000 g (maximum speed) at 4°C for 15 minutes.
24	No visible pellet	Incomplete RNA precipitation	Re-centrifuge if necessary.
28	Pellet is invisible	Pellet may be dislodged or stuck on the lid.	Spin tubes for 5 minutes and check if the pellet is visible. Confirm pellet is not stuck to the lid.
138	Tube lids break during centrifugation	Spinning tubes at high speed	Crisscross pairs of tube lids in the rotor during centrifugation to prevent breaking.

TIMING

Steps 1–24, RNA size selection: 7.5 h (suggested end of the first day)

Steps 25–37, sRNA precipitation: 2 h

Steps 38–59, 5’ cap enrichment: 4.5–5 h

Steps 60–66, capped small RNA precipitation: 1 h 30 min (suggested end of the second day)

Steps 67–98, library preparation: 8–8.5 h (suggested end of the third day)

Steps 99–129, dsDNA size selection and gel purification: 6–7 h (suggested end of the fourth day)

Steps 130–146, DNA clean up: 4 h

Steps 147–149, quantification and pooling: 2 h

Step 150, sequencing: variable

Steps 151–155, data analysis: variable

Anticipated Results

csRNA-seq provides a comprehensive snapshot of active cis-regulatory elements and their TSSs (Fig. 1, Extended Data Fig. 1). Successful csRNA-seq libraries should show reads mapping predominantly to TSSs, with minimal reads mapping to gene bodies. Extensive gene body coverage suggests fragmented input RNA and poor cap enrichment. csRNA-seq libraries should exhibit the characteristic nucleotide bias near the TSSs of captured stable and rapidly degraded transcripts (Supplementary Protocol 2) and be significantly more diverse than input small RNA-seq libraries, in which reads are typically concentrated in high-abundance ncRNAs (e.g., miRNAs and snRNAs). An indication of whether library preparation has been successful can be obtained from the gel run after PCR amplification (Extended Data Fig. 3). Input small RNA-seq libraries typically show a prominent miRNA band (21–24 bp insert plus adapters, which total 118 bp for NEB E7330). This band is depleted in csRNA-seq libraries, which instead show more prominent inserts of 35–45 bp. Libraries requiring fewer than 15 PCR cycles generally yield better results, although meaningful data can be obtained with higher cycle numbers, especially when using UMIs to control for amplification bias. Typical csRNA-seq dsDNA libraries yield 18 μl at 0.1–3 ng/μl, whereas input small RNA-seq libraries yield 5–10 times more. However, csRNA-seq library concentrations are highly dependent on the transcriptional activity of the samples studied.
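
The insert-size and 5’ nucleotide expectations described above can be checked quickly on adapter-trimmed reads before running the full pipeline. The sketch below is a rough stand-in for the QC files that HOMER tag directories provide, not a replacement for them; the window boundaries are taken from the insert sizes quoted in this section, and the function names are illustrative.

```python
from collections import Counter

# Insert-size windows quoted in the text: miRNA band and typical csRNA inserts.
DEFAULT_WINDOWS = {"miRNA_21_24": (21, 24), "csRNA_35_45": (35, 45)}


def insert_length_fractions(trimmed_reads, windows=None):
    """Fraction of reads whose length falls within each named (min, max) window."""
    windows = windows or DEFAULT_WINDOWS
    lengths = Counter(len(read) for read in trimmed_reads)
    total = sum(lengths.values())
    fractions = {}
    for name, (low, high) in windows.items():
        in_window = sum(n for length, n in lengths.items() if low <= length <= high)
        fractions[name] = in_window / total if total else 0.0
    return fractions


def position_base_frequencies(trimmed_reads, n_positions=3):
    """Per-position base frequencies for the first `n_positions` bases of each read."""
    counts = [Counter() for _ in range(n_positions)]
    for read in trimmed_reads:
        for i, base in enumerate(read[:n_positions].upper()):
            counts[i][base] += 1
    frequencies = []
    for counter in counts:
        total = sum(counter.values())
        frequencies.append({b: counter[b] / total for b in sorted(counter)} if total else {})
    return frequencies
```

Comparing the two window fractions between a csRNA-seq library and its matched input gives a quick numerical counterpart to the gel: the miRNA window should dominate the input and shrink in the csRNA-seq library.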

By enriching for initiating transcripts, csRNA-seq captures TSSs of both stable and unstable transcripts (Fig. 3; Extended Data Fig. 5). Final transcript stability can be estimated using genome annotations or, for a more unbiased approach, total RNA-seq³⁹,⁷¹.

Extended Data

Extended Data Fig. 3 | — Image of DNA gel purification. The left image shows the gel before cutting, the right shows it after cutting. Ensure the cut is made above band 6 (125 bp) and on top of band 8 (175 bp), while being careful to not collect any strong steady state RNA bands that may be present. csRNA-seq libraries (cs) have a more prominent smear (triangle) above the major ~22 nt sRNA band (*) which should be depleted, compared to input small RNA-seq (I).

Extended Data Fig. 4 | — The analysis pipeline includes 3’ adapter trimming (e.g., using homerTools), read mapping to the genome, creation of tag directories, and ‘peak calling’ of TSS clusters termed transcription start regions (TSRs) enriched over input sRNA-seq. Downstream analyses can include transcription factor binding site (TFBS) enrichment analysis, peak annotation, and differential expression analysis. Quality control steps are integrated throughout the workflow.

Extended Data Fig. 5 | — csRNA-seq provides insights into several features of TSRs, including their association with stable or unstable transcripts (transcript stability), the direction(s) of transcription initiation (directionality), the ratio of initiation from the dominant TSS relative to other TSS within the TSR (focused ratio), and the frequency of transcription initiation at the TSS (TSS strength). Furthermore, csRNA-seq data can be integrated with diverse analyses of DNA features, such as enriched TFBS, DNA topology, and GC content.

Supplementary Material

Supplementary Code

NIHMS2142302-supplement-Supplementary_Code.ipynb (31.1 KB, ipynb)

Supplement NIH

NIHMS2142302-supplement-Supplement_NIH.pdf (185.7 KB, pdf)

Supplementary Video

Video file (976.2 MB, mp4)

Key points:

Capped small (cs)RNA-seq is a simple, scalable method that uniformly captures active RNA polymerase II initiation events from total RNA, enabling high-resolution mapping of transcription start sites and initiation levels of both stable (mRNAs, lincRNAs) and transient (enhancer-, promoter-antisense, or pri-micro) RNAs.
Unlike other current nascent RNA profiling approaches, csRNA-seq is directly applicable to RNA isolated from fresh, frozen, or fixed cells or tissues, and even from inactivated infectious samples.

Acknowledgments

This work was supported in part by the National Institutes of Health (NIH) grant R00GM135515 to S.H.D. and U01DA051972 to F.T. and C.B. M.K.M. is an NIGMS T32GM152310 Biotech Trainee, and E.M.R. is a Barry Goldwater and WSU Carson Fellow.

Footnotes

Competing interests

S.H.D. leads the Nascent Transcriptomics Service Center at WSU.

Author Information

1. School of Molecular Biosciences, College of Veterinary Medicine, Washington State University, Pullman, WA, USA

Mackenzie K. Meyer, Oluwadamilola J. Olanrewaju, Anna L. McDonald, Eva M. Rickard, Marina I. Savenkova & Sascha H. Duttke

2. Department of Psychiatry, University of California San Diego, La Jolla, CA, USA

Patricia Montilla-Perez & Francesca Telese

3. Department of Medicine, Division of Endocrinology, University of California San Diego School of Medicine, La Jolla, CA, USA

Christopher Benner

Data availability

Processed and down-sampled raw csRNA-seq data for the fast example analysis of K562 csRNA-seq data²⁷ (GEO GSE135498) are available at GSE287021.

References

1. Wissink EM, Vihervaara A, Tippens ND & Lis JT Nascent RNA analyses: tracking transcription and its regulation. Nat Rev Genet 20, 705–723, doi: 10.1038/s41576-019-0159-6 (2019).
2. Core L & Adelman K Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation. Genes Dev 33, 960–982, doi: 10.1101/gad.325142.119 (2019).
3. Arnold PR, Wells AD & Li XC Diversity and Emerging Roles of Enhancer RNA in Regulation of Gene Expression and Cell Fate. Front Cell Dev Biol 7, 377, doi: 10.3389/fcell.2019.00377 (2019).
4. Cardiello JF, Sanchez GJ, Allen MA & Dowell RD Lessons from eRNAs: understanding transcriptional regulation through the lens of nascent RNAs. Transcription 11, 3–18, doi: 10.1080/21541264.2019.1704128 (2020).
5. Core LJ et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nature Genetics 46, 1311–1320, doi: 10.1038/ng.3142 (2014).
6. Palazzo AF & Lee ES Non-coding RNA: what is functional and what is junk? Front Genet 6, 2, doi: 10.3389/fgene.2015.00002 (2015).
7. Zhang Z et al. Transcriptional landscape and clinical utility of enhancer RNAs for eRNA-targeted therapy in cancer. Nature Communications 10, 4562, doi: 10.1038/s41467-019-12543-5 (2019).
8. Kim T-K et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182, doi: 10.1038/nature09033 (2010).
9. De Santa F et al. A Large Fraction of Extragenic RNA Pol II Transcription Sites Overlap Enhancers. PLOS Biology 8, e1000384, doi: 10.1371/journal.pbio.1000384 (2010).
10. Schaukowitch K et al. Enhancer RNA facilitates NELF release from immediate early genes. Mol Cell 56, 29–42, doi: 10.1016/j.molcel.2014.08.023 (2014).
11. Azofeifa JG et al. Enhancer RNA profiling predicts transcription factor activity. Genome Res 28, 334–344, doi: 10.1101/gr.225755.117 (2018).
12. Ntini E et al. Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality. Nat Struct Mol Biol 20, 923–928, doi: 10.1038/nsmb.2640 (2013).
13. Flynn RA, Almada AE, Zamudio JR & Sharp PA Antisense RNA polymerase II divergent transcripts are P-TEFb dependent and substrates for the RNA exosome. Proc Natl Acad Sci U S A 108, 10460–10465, doi: 10.1073/pnas.1106630108 (2011).
14. Lu XM et al. Nascent RNA sequencing reveals mechanisms of gene regulation in the human malaria parasite Plasmodium falciparum. Nucleic Acids Res 45, 7825–7840, doi: 10.1093/nar/gkx464 (2017).
15. Wang J et al. Nascent RNA sequencing analysis provides insights into enhancer-mediated gene regulation. BMC Genomics 19, 633, doi: 10.1186/s12864-018-5016-z (2018).
16. Danko CG et al. Identification of active transcriptional regulatory elements from GRO-seq data. Nature Methods 12, 433, doi: 10.1038/nmeth.3329 (2015).
17. Dukler N et al. Nascent RNA sequencing reveals a dynamic global transcriptional response at genes and enhancers to the natural medicinal compound celastrol. Genome Res 27, 1816–1829, doi: 10.1101/gr.222935.117 (2017).
18. Duttke SH et al. Glucocorticoid Receptor-Regulated Enhancers Play a Central Role in the Gene Regulatory Networks Underlying Drug Addiction. Frontiers in Neuroscience 16, doi: 10.3389/fnins.2022.858427 (2022).
19. Duttke SH et al. Decoding Transcription Regulatory Mechanisms Associated with Coccidioides immitis Phase Transition Using Total RNA. mSystems 7, e0140421, doi: 10.1128/msystems.01404-21 (2022).
20. Branche E et al. SREBP2-dependent lipid gene transcription enhances the infection of human dendritic cells by Zika virus. Nature Communications 13, 5341, doi: 10.1038/s41467-022-33041-1 (2022).
21. Hah N, Murakami S, Nagari A, Danko CG & Kraus WL Enhancer transcripts mark active estrogen receptor binding sites. Genome Res 23, 1210–1223, doi: 10.1101/gr.152306.112 (2013).
22. Danko CG et al. Dynamic evolution of regulatory element ensembles in primate CD4(+) T cells. Nat Ecol Evol 2, 537–548, doi: 10.1038/s41559-017-0447-5 (2018).
23. Oldfield AJ et al. NF-Y Controls Fidelity of Transcription Initiation at Gene Promoters Through Maintenance of the Nucleosome-Depleted Region. bioRxiv, doi: 10.1101/369389 (2018).
24. Lim JY et al. DNMT3A haploinsufficiency causes dichotomous DNA methylation defects at enhancers in mature human immune cells. J Exp Med 218, doi: 10.1084/jem.20202733 (2021).
25. Duttke SH et al. Position-dependent function of human sequence-specific transcription factors. Nature 631, 891–898, doi: 10.1038/s41586-024-07662-z (2024).
26. Yao L et al. A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nature Biotechnology 40, 1056–1065, doi: 10.1038/s41587-022-01211-7 (2022).
27. Duttke SH, Chang MW, Heinz S & Benner C Identification and dynamic quantification of regulatory elements using total RNA. Genome Research, doi: 10.1101/gr.253492.119 (2019).
28. Seila AC et al. Divergent transcription from active promoters. Science 322, 1849–1851, doi: 10.1126/science.1162253 (2008).
29. Affymetrix CSHL, Encode Transcriptome Project. Post-transcriptional processing generates a diversity of 5′-modified long and short RNAs. Nature 457, 1028, doi: 10.1038/nature07759 (2009).
30. Gu W et al. CapSeq and CIP-TAP Identify Pol II Start Sites and Reveal Capped Small RNAs as C. elegans piRNA Precursors. (2012).
31. Nechaev S et al. Global Analysis of Short RNAs Reveals Widespread Promoter-Proximal Stalling and Arrest of Pol II in Drosophila. Science 327, 335–338, doi: 10.1126/science.1181421 (2010).
32. Kwak H, Fuda NJ, Core LJ & Lis JT Precise Maps of RNA Polymerase Reveal How Promoters Direct Initiation and Pausing. Science 339, 950–953, doi: 10.1126/science.1229386 (2013).
33. Lam MTY et al. Rev-Erbs repress macrophage gene expression by inhibiting enhancer-directed transcription. Nature 498, 511–515, doi: 10.1038/nature12209 (2013).
34. Sloutskin A et al. From promoter motif to cardiac function: A single DPE motif affects transcription regulation and organ function in vivo. Development, doi: 10.1242/dev.202355 (2024).
35. Lepage SIM et al. Gene Expression Profile Is Different between Intact and Enzymatically Digested Equine Articular Cartilage. Cartilage 12, 222–225, doi: 10.1177/1947603519833148 (2021).
36. Herron S, Delpech JC, Madore C & Ikezu T Using mechanical homogenization to isolate microglia from mouse brain tissue to preserve transcriptomic integrity. STAR Protoc 3, 101670, doi: 10.1016/j.xpro.2022.101670 (2022).
37. Mattei D et al. Enzymatic Dissociation Induces Transcriptional and Proteotype Bias in Brain Cell Populations. Int J Mol Sci 21, doi: 10.3390/ijms21217944 (2020).
38. Tremblay BJM et al. Interplay between coding and non-coding regulation drives the Arabidopsis seed-to-seedling transition. Nat Commun 15, 1724, doi: 10.1038/s41467-024-46082-5 (2024).
39. McDonald BR et al. Enhancers associated with unstable RNAs are rare in plants. Nature Plants (2024).
40. McDonald AL et al. Efficient small fragment sequencing of human, cow, and bison miRNA, small RNA or csRNA-seq libraries using AVITI. BMC Genomics 25, doi: 10.1101/2024.05.28.596343 (2024).
41. Perry BW et al. Nascent transcription reveals regulatory changes in extremophile fishes inhabiting hydrogen sulfide-rich environments. Proc Biol Sci 291, 20240412, doi: 10.1098/rspb.2024.0412 (2024).
42. Lam MTY et al. Dynamic activity in cis-regulatory elements of leukocytes identifies transcription factor activation and stratifies COVID-19 severity in ICU patients. Cell Rep Med 4, 100935, doi: 10.1016/j.xcrm.2023.100935 (2023).
43. Hetzel J, Duttke SH, Benner C & Chory J Nascent RNA sequencing reveals distinct features in plant transcription. Proceedings of the National Academy of Sciences of the United States of America 113, 12316–12321, doi: 10.1073/pnas.1603217113 (2016).
44. Heinz S et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Molecular Cell 38, 576–589, doi: 10.1016/j.molcel.2010.05.004 (2010).
45. Fagnocchi L, Poli V & Zippo A Enhancer reprogramming in tumor progression: a new route towards cancer cell plasticity. Cell Mol Life Sci 75, 2537–2555, doi: 10.1007/s00018-018-2820-1 (2018).
46. Ko JY, Oh S & Yoo KH Functional Enhancers As Master Regulators of Tissue-Specific Gene Regulation and Cancer Development. Mol Cells 40, 169–177, doi: 10.14348/molcells.2017.0033 (2017).
47. Panigrahi A & O’Malley BW Mechanisms of enhancer action: the known and the unknown. Genome Biology 22, 108, doi: 10.1186/s13059-021-02322-1 (2021).
48. Dudnyk K, Cai D, Shi C, Xu J & Zhou J Sequence basis of transcription initiation in the human genome. Science 384, eadj0116, doi: 10.1126/science.adj0116 (2024).
49. Link VM et al. Analysis of Genetically Diverse Macrophages Reveals Local and Domain-wide Mechanisms that Control Transcription Factor Binding and Function. Cell 173, 1796–1809.e1717, doi: 10.1016/j.cell.2018.04.018 (2018).
50. Zhang P et al. Relatively frequent switching of transcription start sites during cerebellar development. BMC Genomics 18, 461, doi: 10.1186/s12864-017-3834-z (2017).
51. Mahat DB et al. Single-cell nascent RNA sequencing unveils coordinated global transcription. Nature 631, 216–223, doi: 10.1038/s41586-024-07517-7 (2024).
52. Policastro RA & Zentner GE Global approaches for profiling transcription initiation. Cell Rep Methods 1, doi: 10.1016/j.crmeth.2021.100081 (2021).
53. Churchman LS & Weissman JS Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature 469, 368–373, doi: 10.1038/nature09652 (2011).
54. Chu T et al. Chromatin run-on and sequencing maps the transcriptional regulatory landscape of glioblastoma multiforme. Nature Genetics 50, 1553–1564, doi: 10.1038/s41588-018-0244-3 (2018).
55. Chou SP, Alexander AK, Rice EJ, Choate LA & Danko CG Genetic dissection of the RNA polymerase II transcription cycle. Elife 11, doi: 10.7554/eLife.78458 (2022).
56. Core LJ, Waterfall JJ & Lis JT Nascent RNA Sequencing Reveals Widespread Pausing and Divergent Initiation at Human Promoters. Science 322, 1845–1848, doi: 10.1126/science.1162228 (2008).
57. Liu W et al. RNA-directed DNA methylation involves co-transcriptional small-RNA-guided slicing of polymerase V transcripts in Arabidopsis. Nature Plants 4, 181–188, doi: 10.1038/s41477-017-0100-y (2018).
58. Shiraki T et al. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proceedings of the National Academy of Sciences 100, 15776–15781, doi: 10.1073/pnas.2136655100 (2003).
59. Adiconis X et al. Comprehensive comparative analysis of 5’-end RNA-sequencing methods. Nat Methods 15, 505–511, doi: 10.1038/s41592-018-0014-2 (2018).
60. Buenrostro JD, Wu B, Chang HY & Greenleaf WJ ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide. Current Protocols in Molecular Biology 109, 21.29.21–21.29.29, doi: 10.1002/0471142727.mb2129s109 (2015).
61. Boyle AP et al. High-resolution mapping and characterization of open chromatin across the genome. Cell 132, 311–322, doi: 10.1016/j.cell.2007.12.014 (2008).
62. Ishii H, Kadonaga JT & Ren B MPE-seq, a new method for the genome-wide analysis of chromatin structure. Proc Natl Acad Sci U S A 112, E3457–3465, doi: 10.1073/pnas.1424804112 (2015).
63. Li J et al. Kinetic competition between elongation rate and binding of NELF controls promoter-proximal pausing. Mol Cell 50, 711–722, doi: 10.1016/j.molcel.2013.05.016 (2013).
64. Wang Z et al. Prediction of histone post-translational modification patterns based on nascent transcription data. Nature Genetics 54, 295–305, doi: 10.1038/s41588-022-01026-x (2022).
65. Mayer A & Churchman LS A Detailed Protocol for Subcellular RNA Sequencing (subRNA-seq). Curr Protoc Mol Biol 120, 4.29.21–24.29.18, doi: 10.1002/cpmb.44 (2017).
66. Dobin A et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21, doi: 10.1093/bioinformatics/bts635 (2013).
67. Kim D, Paggi JM, Park C, Bennett C & Salzberg SL Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915, doi: 10.1038/s41587-019-0201-4 (2019).
68. Jiang H, Lei R, Ding SW & Zhu S Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15, 182, doi: 10.1186/1471-2105-15-182 (2014).
69. Fujita PA et al. The UCSC Genome Browser database: update 2011. Nucleic Acids Research 39, D876–D882, doi: 10.1093/nar/gkq963 (2011).
70. Robinson JT et al. Integrative genomics viewer. Nat Biotechnol 29, 24–26, doi: 10.1038/nbt.1754 (2011).
71. Blumberg A et al. Characterizing RNA stability genome-wide through combined analysis of PRO-seq and RNA-seq data. BMC Biol 19, 30, doi: 10.1186/s12915-021-00949-x (2021).

Key references:

1. Duttke SH, Guzman C, Chang M et al. Nature 631, 891–898 (2024): 10.1038/s41586-024-07662-z
2. Lam MTY et al. Cell Rep Med 4, 100935 (2023): 10.1016/j.xcrm.2023.100935
3. McDonald BR, Picard CL, Brabb IM et al. Nat. Plants 10, 1246–1257 (2024): 10.1038/s41477-024-01741-9

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Code

NIHMS2142302-supplement-Supplementary_Code.ipynb^{(31.1KB, ipynb)}

Supplement NIH

NIHMS2142302-supplement-Supplement_NIH.pdf^{(185.7KB, pdf)}

Supplementary Video

Download video file^{(976.2MB, mp4)}

Data Availability Statement

Processed and down-sampled raw csRNA-seq data for the fast example analysis of K562 csRNA-seq data²⁷ (GEO GSE135498) are available at GSE287021.

Step Number	Temperature	Duration
1	75 °C	2 minutes
2	37 °C	15 minutes
3	25 °C	15 minutes
4	4 °C	∞
Set the lid temperature to 80–85 °C

Step Number	Temperature	Duration
1	50 °C	1 hour
2	45 °C	5 minutes
3	50 °C	55 minutes
4	80 °C	5 minutes
5	12 °C	∞
Set the lid temperature to 80–85 °C
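
As a rough planning aid, the two thermocycler programs tabulated above can be written down as plain data and their programmed hold times summed. This is only a sketch: the labels `PROGRAM_A` and `PROGRAM_B` and both helper functions are hypothetical names introduced here, the tables as shown do not state which procedure step each program belongs to, and ramp times between temperatures are instrument-dependent and not included.

```python
from typing import List, Optional, Tuple

# One step = (temperature in °C, duration in minutes).
# A duration of None stands for the indefinite hold ("∞") in the tables.
Step = Tuple[float, Optional[float]]

# Hypothetical labels: the first and second tables, in the order shown.
PROGRAM_A: List[Step] = [(75, 2), (37, 15), (25, 15), (4, None)]
PROGRAM_B: List[Step] = [(50, 60), (45, 5), (50, 55), (80, 5), (12, None)]

# Lid temperature range given beneath both tables, in °C.
LID_TEMP_RANGE_C = (80, 85)


def timed_minutes(program: List[Step]) -> float:
    """Sum the finite step durations, ignoring the final hold."""
    return sum(minutes for _, minutes in program if minutes is not None)


def format_minutes(total: float) -> str:
    """Render a duration in minutes as 'H h MM min'."""
    hours, minutes = divmod(int(total), 60)
    return f"{hours} h {minutes:02d} min"


if __name__ == "__main__":
    for name, program in (("A", PROGRAM_A), ("B", PROGRAM_B)):
        print(f"Program {name}: {format_minutes(timed_minutes(program))} before hold")
```

Excluding ramp times, the first program runs for 32 minutes and the second for 2 hours 5 minutes before reaching the final hold.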
