Abstract
We present a combinatorial indexing method, PerturbSci-Kinetics, for capturing whole transcriptomes, nascent transcriptomes and single guide RNA (sgRNA) identities across hundreds of genetic perturbations at the single-cell level. Profiling a pooled CRISPR screen targeting various biological processes, we show the gene expression regulation during RNA synthesis, processing and degradation, miRNA biogenesis and mitochondrial mRNA processing, systematically decoding the genome-wide regulatory network that underlies RNA temporal dynamics at scale.
Subject terms: Transcriptomics, Gene regulation
mRNA kinetics are described by combining single-cell combinatorial indexing with metabolic labeling and pooled CRISPR screens.
Main
Cellular functions are determined by the expression of millions of RNA molecules, which are tightly regulated by their synthesis, splicing and degradation. However, understanding how key regulators impact genome-wide RNA kinetics is constrained by existing tools, which provide only snapshots of the transcriptome1–8. To resolve this challenge, we developed PerturbSci-Kinetics, combining CRISPR-based pooled genetic screen, single-cell RNA sequencing (RNA-seq) by combinatorial indexing and RNA metabolic labeling to uncover single-cell transcriptome dynamics across extensive genetic perturbations.
PerturbSci-Kinetics features a combinatorial indexing strategy (‘PerturbSci’) for targeted capture of single guide RNA (sgRNA) transcripts that carry the same cellular barcode as the whole transcriptome (Fig. 1a). In brief, we adopted the modified CRISPR droplet sequencing (CROP-seq) vector5 and developed a strategy for capturing sgRNA sequences6,7 through reverse transcription using an sgRNA-specific primer followed by targeted enrichment of sgRNA sequences via polymerase chain reaction (PCR) (Extended Data Fig. 1, Supplementary Notes 1 and 2 and Supplementary Table 1). With extensive optimizations (Extended Data Fig. 2), PerturbSci achieves a high knockdown efficacy with a potent dual-repressor dCas9 (that is, dCas9-KRAB-MeCP2; ref. 9) and a high capture rate of sgRNA (that is, up to 99.7% of cells) and can readily scale up for profiling a large number of cells using the three-level combinatorial indexing approach10 (Fig. 1b and Supplementary Note 3).
By incorporating 4-thiouridine (4sU) labeling11–17, PerturbSci-Kinetics retrieves time-resolved nascent transcriptomes at single-cell resolution, distinguishing newly synthesized transcripts from whole transcriptomes. The kinetic rates of mRNA such as RNA synthesis and degradation in each genetically perturbed cell population were then inferred (Fig. 1a and Methods). Our method incorporates several optimizations to reduce the cell loss (Extended Data Fig. 2) and enhance the accuracy of nascent reads calling (Extended Data Fig. 3). With three levels of combinatorial indexing, PerturbSci-Kinetics demonstrates orders of magnitude higher throughput than previous approaches coupling metabolic labeling and single-cell RNA-seq (for example, scEU-seq, sci-fate and scNT-seq)18–22 (Fig. 1b).
As a proof of concept, we established a human HEK293 cell line with inducible dCas9-KRAB-MeCP2 (ref. 9) expression (HEK293-idCas9). We thoroughly validated the potent knockdown of target gene expression after doxycycline (dox) treatment (Fig. 1c and Extended Data Fig. 4a–c). Furthermore, we demonstrated the purity of the single-cell transcriptome and sgRNA capture of PerturbSci by profiling mixed human and mouse cells transduced with human and mouse-specific sgRNAs, respectively (Fig. 1d).
We proceeded to validate the capability of PerturbSci-Kinetics in capturing the three-layer readout at the single-cell level. After 4sU labeling and chemical conversion, we observed a significant enrichment of T-to-C mismatches in the mapped reads, which is consistent with findings from our previous study20 (Fig. 1e). A median of 22.1% of newly synthesized reads were recovered, in contrast to only 0.8% in control cells (Fig. 1f). The proportion of reads mapped to exonic regions was also significantly lower in nascent reads compared to pre-existing reads (P < 1 × 10⁻²⁰, Tukey’s test after ANOVA) (Fig. 1g). Moreover, genes with a higher fraction of nascent reads were significantly enriched in highly dynamic biological processes23, whereas housekeeping genes were strongly enriched in genes with a lower fraction of nascent reads (Fig. 1h–i). Notably, the chemical conversion step is fully compatible with sgRNA detection. We recovered sgRNAs from 97% of chemically converted cells (a median of 62 sgRNA unique molecular identifiers (UMIs) per cell), in which 92.6% were annotated as sgRNA singlets (Fig. 1j–k).
To dissect the impact of genetic perturbations on transcriptome kinetics, we performed a PerturbSci-Kinetics screening on HEK293-idCas9 cells. These cells were transduced with a library of 699 sgRNAs, which included 15 no-target controls (NTCs), targeting a total of 228 genes involved in diverse biological processes (Fig. 2a and Supplementary Table 2). After a 5-d puromycin selection, we harvested a proportion of cells for bulk library preparation (referred to as ‘day 0’ samples) and induced dCas9-KRAB-MeCP2 expression with dox for seven more days. The screening window was carefully chosen to maximize gene knockdown efficiency, minimize population dropout8 and allow cells to attain transcriptomic steady states24 (Extended Data Fig. 4d). We performed 200 µM 4sU labeling for 2 h at the end of the screening and harvested samples for both bulk and PerturbSci-Kinetics library preparation. As a quality control, the activation of CRISPR interference (CRISPRi) significantly altered the abundance of sgRNAs in the pool, which was consistent across replicates and aligned with previous studies25. For example, genes involved in essential functions (for example, DNA replication and ribosome assembly) were strongly depleted after the screening (Extended Data Fig. 4e,g). Reassuringly, the number of sgRNA singlets recovered by PerturbSci-Kinetics correlated well with read counts of bulk screen libraries (Pearson correlation r = 0.988, P < 2.2 × 10⁻¹⁶) (Fig. 2b).
We recovered 161,966 labeled cells with matched sgRNAs (88% of cells recovered in total), and 126,271 cells were annotated as sgRNA singlets (Extended Data Fig. 4j). Despite the shallow sequencing depth (~8,000 reads per cell), we achieved a median of 2,155 UMIs per cell. Of 699 sgRNAs, 698 were successfully recovered, with a median of 28 sgRNA UMIs per cell. Subsequently, we excluded cells containing sgRNAs that demonstrated low knockdown efficiencies (≤40% gene expression reduction compared to NTC). The RT–qPCR validation on several individual sgRNAs corroborated the accuracy of our knockdown efficiency estimates (Extended Data Fig. 4h–l). Ultimately, 98,315 cells were retained for downstream analysis, corresponding to a median of 484 cells per gene perturbation and a median knockdown efficiency of target genes at 67.7% (Fig. 2c).
We next quantified gene-specific synthesis and degradation rates in each perturbation using an ordinary differential equation approach26 (Methods). As expected, genes targeted by the CRISPRi demonstrated substantially reduced synthesis rates, whereas their degradation rates exhibited only mild alterations (Fig. 2c). As another validation, we observed significantly higher correlations of transcriptomes among sgRNAs targeting the same genes across multiple layers (for example, whole/nascent transcriptome and synthesis/degradation rates; Extended Data Fig. 5a). We then performed dimension reduction and uniform manifold approximation and projection (UMAP) visualization27 on aggregated whole transcriptomes of each perturbation. Perturbations targeting paralogous genes (for example, EXOSC5 and EXOSC6) or related biological processes (for example, RNA degradation and energy metabolism) were readily clustered together (Fig. 2d). Similar analyses on gene-specific synthesis/degradation rates managed to group perturbations by their functions (Extended Data Fig. 5b,c). Furthermore, by aggregating profiles of single cells carrying sgRNAs that target the same gene, we achieved robust estimations for both whole/nascent transcriptomes as well as transcriptome kinetic rates (Extended Data Fig. 5d).
We then investigated how genetic perturbations influence global transcriptome dynamics (Fig. 2e–g, Extended Data Fig. 6a–c,e–g and Supplementary Tables 5–7). As expected, the knockdown of genes encoding proteins involved in transcription initiation (for example, GTF2E1 and TAF2), mRNA synthesis (for example, POLR2B and POLR2K) and chromatin remodeling (for example, SMC3 and RAD21) significantly downregulated the global synthesis rates but not the degradation rates. Conversely, perturbations targeting critical biological processes, such as DNA replication (for example, POLA2 and POLD1), ribosome synthesis and rRNA processing (for example, POLR1A, POLR1B, RPL11 and RPS15A) and mRNA and protein processing (for example, CNOT2, CNOT3, CCT3 and CCT4), reduced both global RNA synthesis and degradation, indicating a compensatory mechanism for maintaining transcriptome homeostasis28 (Fig. 2e,f). Moreover, we noted significant reductions in exonic read fractions in nascent transcriptomes after perturbations related to RNA processing (for example, NCBP1, LSM2, LSM4, CPSF2 and CPSF6) and energy metabolism (for example, GAPDH and NDUFS2), signifying dysregulated splicing dynamics (Fig. 2g).
Interestingly, the knockdown of AGO2, a recognized post-transcriptional regulator29, led to an increase in global synthesis, suggesting its potential role in transcriptional repression (Fig. 2e). The re-analysis of public datasets30,31 corroborated our observation. Specifically, genes exhibiting enriched AGO2 binding at transcription start sites (TSSs) were markedly upregulated after AGO2 silencing (Extended Data Fig. 7a,b). Additionally, the enrichment of AGO2 binding was observed immediately downstream of the TSS and was positively correlated with transcriptional pausing (Extended Data Fig. 7a–d). For validation, we employed SLAM-seq32 to examine the transcriptomic response after AGO2 knockdown, identifying 78 highly paused genes significantly upregulated (FDR of 0.05). Notably, the nascent RNA of these genes showed increased 3′ end coverages compared to NTC, indicative of more efficient transcriptional elongation (Extended Data Fig. 7e,f). Collectively, our integrated analyses support the unconventional function of AGO2 in transcriptional repression.
We next investigated regulators of mitochondrial RNA dynamics by quantifying the fraction of nascent reads in single-cell mitochondrial transcriptomes. A significant reduction in mitochondrial transcriptome turnover was observed after perturbing metabolism-associated genes, including those encoding proteins involved in glycolysis (for example, GAPDH, FH and PKM), the tricarboxylic acid (TCA) cycle (for example, ACO2 and IDH3A) and oxidative phosphorylation (for example, NDUFS2 and COX6B1) (Fig. 2h, Extended Data Fig. 6d,h and Supplementary Table 8). Notably, LRPPRC emerged as a key mitochondrial RNA dynamics regulator, as its knockdown led to a substantial reduction in both turnover rates and expression levels across most mitochondrial protein-coding genes, as well as mitochondrial functional defects (Extended Data Fig. 8a–c and Supplementary Table 9). In contrast, nuclear-encoded genes were primarily regulated at the transcriptional level upon LRPPRC knockdown (Extended Data Fig. 8d–f). These kinetic changes in mitochondrial mRNA were validated through an independent PerturbSci-Kinetics experiment that profiled cells with LRPPRC knockdown (Extended Data Fig. 8g–i). Recent studies reported similar findings, observing impaired mitochondrial gene expression and mitochondrial functional defects in the hearts of LRPPRC knockout mice33 and in brown adipocyte-specific LRPPRC knockout mice34. This further corroborates the essential role of LRPPRC in maintaining mitochondrial mRNA homeostasis.
To further demonstrate the unique capacity of PerturbSci-Kinetics in unraveling the regulatory mechanisms that govern gene expression control, we identified 14,618 differentially expressed genes (DEGs) across perturbations, with 22.9% of them exhibiting significant changes in their synthesis or degradation rates (Supplementary Tables 10 and 11 and Methods). Among these, DEGs regulated by RNA degradation were associated with perturbations in mRNA surveillance/processing genes (Fig. 2i). For instance, our study revealed a set of significantly overlapped DEGs upon knockdown of DROSHA and DICER1 (refs. 35,36), genes encoding two crucial RNases in the miRNA biogenesis pathway37 (Extended Data Fig. 9a–c). These DEGs were regulated through distinct mechanisms: some genes were regulated by decreased degradation (for example, genes encoding miRNA-mediated silencing complex (RISC) components: TNRC6A and TNRC6B), whereas others are regulated through increased transcription (for example, miRNA host genes: MIR181A1HG and FTX; genes encoding protein involved in miRNA biogenesis: DDX3X) (Fig. 2j–l and Supplementary Table 12). The RNA-binding pattern of AGO2, a core component of RISC for miRNA-mediated mRNA degradation38, further validated our findings, exhibiting a strong enrichment in the untranslated regions (UTRs) of transcripts from degradation-regulated genes but not in synthesis-regulated genes (Fig. 2m). This finding was further substantiated through PerturbSci-Kinetics profiling on individual sgRNA knockdown clones and SLAM-seq after 4sU chase labeling32 (Fig. 2n and Extended Data Fig. 9d–g).
Finally, we delved into the effects of genetic perturbations on RNA dynamics during cell cycle progression. Using our validation dataset, we separated cells into five clusters representing different cell cycle stages using cell-cycle-related genes39 (Extended Data Fig. 10a–c), and we then calculated stage-specific kinetic rates of genes. Employing mfuzz clustering40, we identified four gene clusters displaying discrepant cell cycle timecourse synthesis dynamics patterns. Among these, only genes in cluster 1 exhibited evident steady-state expression fluctuations (Extended Data Fig. 10d). Although their synthesis and degradation rates both increased in early cell cycle phase, the synthesis rates outpaced the degradation rates, leading to an increase in steady-state mRNA levels from the S to the G2M stage. Gene Ontology (GO) term analysis further supported the crucial roles of proteins encoded by these genes in cell cycle (Extended Data Fig. 10e). Interestingly, in cells with DROSHA and DICER1 knockdown, we observed a similar steady-state expression pattern for genes in cluster 1 but with unresponsive degradation and compensated synthesis during cell cycle progression (Extended Data Fig. 10f), suggesting the existence of synthesis/degradation feedback loops for gene regulation. In contrast, LRPPRC knockdown did not impact cell-cycle-dependent RNA degradation dynamics (Extended Data Fig. 10g), aligning with our results that it specifically affects mitochondrial mRNA stability. Together, our study emphasizes the coordinated regulation of gene expression throughout the cell cycle progression and highlights the presence of intricate feedback loops between RNA synthesis and degradation.
In summary, PerturbSci-Kinetics allows for the quantitative analysis of the genome-wide mRNA kinetics across genetic perturbations in a massively parallel manner. Of note, there are several potential limitations to consider. First, extended 4sU labeling might impact cell states and potentially hinder the identification of sgRNA sequences. To mitigate this, we opted for a relatively short-term (2 h) treatment to minimize such effects. Second, RNA dynamics identified by PerturbSci-Kinetics may not directly reflect causality in gene regulation, partly due to the gradual nature of CRISPRi-based gene knockdown. This limitation could be mitigated by coupling the technique with large-scale chemical perturbations. Third, the perturbation of essential genes might lead to significant dropout, affecting dynamic rate estimations due to limited cells and reads. Moreover, apoptosis-triggered mRNA decay might further complicate the analysis41. Therefore, we recommend excluding genetic perturbations that lead to either strong dropout effects or substantial disruption of cell cycle distribution during RNA dynamics analysis.
Despite these limitations, our findings illuminate the distinct advantages of PerturbSci-Kinetics over conventional assays. Its multi-layer readout provides a comprehensive perspective on gene expression and RNA dynamics in response to genetic perturbations, facilitating high-throughput and parallel characterization of elements that govern gene-specific RNA dynamics. Moreover, given the low cost and high sensitivity of PerturbSci, we envision the potential to systematically dissect cell-type-specific gene regulatory networks across various biological contexts with an unparalleled scale and resolution.
Methods
Cell culture
The 3T3-L1-CRISPRi cell line was obtained from the Tissue Culture facility at the University of California, Berkeley. The HEK293 cell line was a gift from the Scott Keeney laboratory at Memorial Sloan Kettering Cancer Center. The HEK293T cell line and the NIH/3T3 cell line were obtained from the American Type Culture Collection. All cells were maintained at 37 °C and 5% CO2 in high glucose DMEM medium supplemented with l-glutamine and sodium pyruvate (Gibco, 11995065) and 10% FBS (Sigma-Aldrich, F4135).
Cell lines generation
To generate HEK293 cells with dox-inducible dCas9-KRAB-MeCP2 expression, the lentiviral plasmid Lenti-idCas9-KRAB-MeCP2-T2A-mCherry-Neo was constructed. After sequencing validation, the lentivirus was produced by co-transfecting Lenti-idCas9-KRAB-MeCP2-T2A-mCherry-Neo with psPAX2 (Addgene, 12260) and pMD2.G (Addgene, 12259) into low-passage HEK293T cells in a 10-cm dish using Polyjet (SignaGen, SL100688). After lentiviral titration, HEK293 cells were transduced at a multiplicity of infection (MOI) of 0.2 for 48 h. Cells were treated with 1 µg ml⁻¹ dox (Sigma-Aldrich, D5207) for 48 h, and single cells with strong mCherry fluorescence were sorted to generate monoclonal lines.
The polyclonal 3T3-CRISPRi cell line was generated in a similar way. pHR-SFFV-dCas9-BFP-KRAB (Addgene, 46911) was co-transfected with psPAX2 and pMD2.G to generate dCas9-expressing lentivirus, and 3T3 cells were transduced at MOI = 0.2. BFP-high cells (top 35% of the BFP-positive population) were sorted, and the sorting was repeated twice more after cell expansion to enrich cells with strong dCas9 expression.
Single-gene knockdown and efficacy examination
CROP-seq-opti-Puro-T2A-GFP was assembled by adding a T2A-GFP downstream of puromycin-resistant protein coding sequence on the CROP-seq-opti plasmid (Addgene, 106280). Oligos for individual guides cloning were ordered from Integrated DNA Technologies (IDT) with the following design:
Plus strand: 5′-CACCG[20 bp sgRNA plus strand sequence]-3′
Minus strand: 5′-AAAC[20 bp sgRNA minus strand sequence]C-3′
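The oligo design above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the "minus strand sequence" is the reverse complement of the 20-nt protospacer; the function names are hypothetical and are not part of the original protocol.

```python
from typing import Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_cloning_oligos(spacer: str) -> Tuple[str, str]:
    """Build the plus- and minus-strand oligos for one 20-nt sgRNA spacer.

    Plus strand:  5'-CACCG[spacer]-3'
    Minus strand: 5'-AAAC[reverse complement of spacer]C-3'
    (assumption: the minus-strand insert is the spacer's reverse complement)
    """
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("expected a 20-nt spacer made of A, C, G and T")
    plus = "CACCG" + spacer
    minus = "AAAC" + reverse_complement(spacer) + "C"
    return plus, minus


if __name__ == "__main__":
    plus_oligo, minus_oligo = design_cloning_oligos("AAAAAAAAAACCCCCCCCCC")
    print(plus_oligo)
    print(minus_oligo)
```

After annealing, the two oligos leave 4-nt overhangs (CACC and AAAC) that match the ends of an Esp3I-digested backbone in this style of design.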
Oligos were phosphorylated using T4 PNK (New England Biolabs (NEB), M0201S) and were annealed. The CROP-seq-opti-Puro-T2A-GFP was digested by Esp3I (NEB, R0734L), and then the linearized backbone and the annealed duplex were ligated using the Blunt/TA Ligase Master Mix (NEB, M0367S). Transformation, clone amplification, sequencing validation, lentivirus generation and titer measurement were done as stated above.
Mouse 3T3-L1-CRISPRi cells and 3T3-CRISPRi cells were transduced with the lentivirus expressing NTC sgRNA or sgRNA targeting Fto. Human HEK293-idCas9 cells were transduced with lentivirus expressing NTC sgRNA or sgRNA targeting IGF1R during technique development, and HEK293-idCas9-sgXPO5, sgAGO2, sgDROSHA, sgDICER1 and sgLRPPRC cell lines were later established for validating significant hits from the screen. Transduction was carried out at MOI = 0.2 with 8 µg ml⁻¹ of polybrene for 48 h. Transduced cells were then selected by either fluorescence-activated cell sorting (FACS) or puromycin treatment.
For RT–qPCR validation, primer pairs were selected from PrimerBank (https://pga.mgh.harvard.edu/primerbank/) and were synthesized by IDT. Total RNA of each sample was extracted using the RNeasy Mini Kit (Qiagen, 74104). Then, 1 µg of total RNA was reverse transcribed, and PowerUp SYBR Green Master Mix (Thermo Fisher Scientific, A25742) was used for RT–qPCR following the manufacturer’s instructions. The data were analyzed and visualized by GraphPad Prism (9.2.0) software.
For flow cytometry validation, 1 × 10⁶ cells of each sample were harvested and resuspended in 100 µl of PBS/0.1% sodium azide/2% FBS. BV421 Mouse Anti-Human CD221 (BD Biosciences, 565966) and BV421 Mouse IgG1 κ Isotype Control (BD Biosciences, 562438) at the final concentration of 10 µg ml⁻¹ were added, and reactions were incubated at 4 °C in the dark with rotation for 30 min. Cells were then washed twice using PBS/0.1% sodium azide/2% FBS, and fluorescence signals were recorded. The data were analyzed and visualized by FlowJo (10.8.1) software.
Construction of the pooled sgRNA library
Genes to be included in our sgRNA library were selected based on the following considerations. (1) Essential and non-essential genes were identified using the bulk CRISPR screen data from a previous report25 and DepMap43, and both were included in the gene set. (2) To validate the ability of PerturbSci-Kinetics to characterize gene-specific RNA dynamics, we selected genes involved in transcription, chromatin remodeling, RNA processing and mRNA decay based on GO terms44 and KEGG pathways45. (3) We ensured that all selected genes were expressed in the cell line to be used in our study. An in-house HEK293 EasySci-RNA dataset was used to select expressed genes that met criteria 1 and 2.
sgRNA sequences targeting genes of interest were obtained from an established optimized CRISPRi sgRNA library (set A)25. Finally, 684 sgRNAs targeting 228 genes (three sgRNAs per gene) and 15 NTCs were included in the present study.
The single-stranded sgRNA library was synthesized in a pooled manner by IDT in the following format:
5′-GGCTTTATATATCTTGTGGAAAGGACGAAACACCG[20 bp sgRNA plus strand sequence]GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTT-3′
Next, 100 ng of oligo pool was amplified by PCR using primers targeting the 5′ homology arm (HA) and the 3′ HA. The PCR product was purified, and the insert was cloned into Esp3I-digested CROP-seq-opti-Puro-T2A-GFP by Gibson Assembly. In parallel, a control Gibson Assembly reaction containing only the backbone was set up. Both reactions were cleaned up by 0.75× AMPure beads (Beckman Coulter, A63882), eluted in 5 µl of EB buffer (Qiagen, 19086) and then transformed into Endura electrocompetent cells (Lucigen, 602422) by electroporation (Gene Pulser Xcell Electroporation System; Bio-Rad, 1652662). After recovery, cells of each reaction were spread onto a 245-mm square agarose plate (Corning, 431111) with 100 µg ml⁻¹ carbenicillin (Thermo Fisher Scientific, 10177012) and were then grown at 32 °C for 13 h. All colonies from each reaction were scraped from the plates, and the CROP-seq-opti-Puro-T2A-GFP-sgRNA plasmid library was extracted using ZymoPURE II Plasmid Midiprep Kit (Zymo Research, D4200). The lentiviral library was generated as stated above.
The pooled PerturbSci-Kinetics screen experiment
For each replicate, 7 × 10⁶ uninduced HEK293-idCas9 cells were seeded. Two replicates were transduced at MOI = 0.1, and another two replicates were transduced at MOI = 0.2. At least 1,000× coverage was kept throughout the cell culture. At the end of the puromycin selection, we harvested 1.4 × 10⁶ cells in each replicate (2,000× coverage per sgRNA) as day 0 samples of the bulk screen and pelleted them at 500g and 4 °C for 5 min for genomic DNA extraction. For the rest of the cells, the dCas9-KRAB-MeCP2 expression was induced by adding dox at the final concentration of 1 µg ml⁻¹, and high-glucose DMEM containing l-glutamine but lacking sodium pyruvate was used to sensitize cells to perturbations on energy metabolism genes. Cells were cultured for an additional 7 d. On day 7, 6 ml of the original media from each plate was mixed with 6 µl of 200 mM 4sU (Sigma-Aldrich, T4509-25MG) dissolved in DMSO (VWR, 97063-136) and was put back for nascent RNA metabolic labeling. After 2 h of treatment, 1.4 × 10⁶ cells in each replicate were harvested as day 7 samples of the bulk screen, and the rest of the cells were fixed for PerturbSci-Kinetics profiling (see the next subsection).
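The coverage and concentration figures in this paragraph follow from simple arithmetic, sketched below. The helper names are illustrative, and the number of transduced cells is approximated as seeded cells × MOI, which is a simplifying assumption that holds reasonably well only at low MOI.

```python
N_SGRNAS = 699  # library size stated in the text (684 targeting + 15 NTC)


def coverage_per_sgrna(n_cells: float, n_sgrnas: int = N_SGRNAS) -> float:
    """Average number of cells representing each sgRNA."""
    return n_cells / n_sgrnas


def final_concentration_um(stock_mm: float, stock_ul: float, medium_ml: float) -> float:
    """Final concentration (µM) after adding a small stock volume to medium."""
    total_ul = medium_ml * 1000.0 + stock_ul
    return stock_mm * 1000.0 * stock_ul / total_ul


if __name__ == "__main__":
    # Assumption: roughly MOI x seeded cells are transduced at MOI = 0.1.
    transduced = 7e6 * 0.1
    print(round(coverage_per_sgrna(transduced)))           # about 1,000x
    print(round(coverage_per_sgrna(1.4e6)))                # about 2,000x
    print(round(final_concentration_um(200, 6, 6), 1))     # close to 200 µM
```

Under these assumptions, 7 × 10⁶ seeded cells at MOI = 0.1 give roughly 1,000 transduced cells per sgRNA, 1.4 × 10⁶ harvested cells give roughly 2,000× coverage, and 6 µl of 200 mM 4sU in 6 ml of medium gives the 200 µM labeling concentration.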
Genomic DNA of bulk screen samples was extracted using Quick-DNA Miniprep Plus Kit (Zymo Research, D4068T) following the manufacturer’s instructions. The bulk screen libraries were amplified from the extracted genomic DNA using custom primers (Supplementary Note 2) for sequencing.
Step-by-step protocols for PerturbSci-Kinetics library preparation are included in Supplementary Note 1.
4sU pulse and chase labeling and SLAM-seq
HEK293-idCas9-sgAGO2 and sgNTC cells were induced with dox for 7 d in 10-cm dishes, and cells were labeled with 600 µM 4sU for 20 min before total RNA extraction. HEK293-idCas9-sgDROSHA, sgDICER1 and sgNTC cells were induced with dox for 7 d and were treated with dox+ medium containing 100 µM 4sU for 18 h. The medium was refreshed every 6 h. Then, chase labeling was performed by using medium with 10 mM uridine (Sigma-Aldrich, U3750-1G). After 2-h and 4-h incubation, total RNA was extracted.
Next, 2–5 µg of total RNA from each sample was used for chemical conversion. RNA was diluted into 15 µl and mixed with 5 µl of 100 mM iodoacetamide (IAA), 5 µl of NaPO4 (pH 8.0, 500 mM) buffer and 25 µl of DMSO. The reaction was incubated at 50 °C for 15 min and was then quenched with 1 µl of 1 M DTT. After RNA purification using the Monarch RNA Cleanup Kit (NEB, T2030L), samples were immediately used for library construction.
Full-length and 3′ end bulk SLAM-seq were used for different experimental purposes. For full-length bulk SLAM-seq library construction, the CRISPRclean Stranded Total RNA Prep with rRNA Depletion Kit (Jumpcode Genomics, KIT1014) was used. For 3′ end bulk SLAM-seq library construction, an in-house 3′ end library preparation workflow was used. In brief, 250–500 ng of total mRNA was mixed with 1 µl of 100 µM oligodT primer (ACGACGCTCTTCCGATCTNNNNNNNNNNTTTTTTTTTTTTTTT), 1 µl of 10 mM each dNTP mix and 0.5 µl of SUPERase In, and the volume was adjusted to 15 µl with water. After RNA priming at 55 °C for 5 min, 4 µl of 5× RT buffer and 1 µl of Maxima H Minus Reverse Transcriptase (Thermo Fisher Scientific, EP0753) were added to the reaction, and reverse transcription was performed as recommended by the manufacturer. After 0.6× AMPure beads purification, second strand synthesis (NEB, E6111L) was carried out by 1-h incubation at 16 °C, and then cDNA was purified by 0.6× AMPure beads. After Read2 tagmentation on 10 ng of cDNA using 1:20 v/v Nextera Read2-Tn5, the reaction was quenched, and the final library was prepared as described for EasySci-RNA10.
Reads processing
For bulk CRISPR screen libraries, BCL files were demultiplexed into FASTQ files based on index 7 barcodes. Reads for each sample were further extracted by index 5 barcode matching. Every read pair was matched against two constant sequences (Read1: 11–25 bp; Read2: 11–25 bp) to remove artifacts. For all matching steps, a maximum of one mismatch was allowed. Finally, sgRNA sequences were extracted from filtered read pairs (at 26–45 bp of Read1) and assigned to sgRNA identities with no mismatch allowed, and read counts matrices at sgRNA and gene levels were quantified using Python (2.7).
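The filtering and counting logic for the bulk screen reads can be sketched as follows. This is a Python 3 illustration, not the authors' script (the text reports Python 2.7); the constant sequences are passed in as parameters because they are not given here, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatches between two strings (length difference counts as mismatch)."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def count_sgrnas(
    read_pairs: Iterable[Tuple[str, str]],
    const_r1: str,
    const_r2: str,
    library: Dict[str, str],
) -> Counter:
    """Count reads per sgRNA.

    read_pairs: (Read1, Read2) sequences for one demultiplexed sample.
    const_r1/const_r2: 15-nt constant sequences expected at bases 11-25
        of Read1 and Read2 (placeholders; the real sequences are not shown here).
    library: maps each 20-nt spacer to its sgRNA name.
    """
    counts: Counter = Counter()
    for r1, r2 in read_pairs:
        if len(r1) < 45 or len(r2) < 25:
            continue
        # Bases 11-25 (1-based) correspond to the slice [10:25]; allow one mismatch.
        if hamming(r1[10:25], const_r1) > 1 or hamming(r2[10:25], const_r2) > 1:
            continue
        # Bases 26-45 of Read1 hold the spacer; require an exact library match.
        name = library.get(r1[25:45])
        if name is not None:
            counts[name] += 1
    return counts
```

Gene-level counts can then be obtained by summing the sgRNA-level counts of all guides that target the same gene.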
For PerturbSci-Kinetics, after demultiplexing on index 7, Read1 was matched against a constant sequence on the sgRNA capture primer to remove nonspecific priming, and cell barcodes and UMI sequences sequenced in Read1 were added to the headers of the FASTQ files of Read2, which were retained for further processing. After trimming poly(A) sequences and low-quality bases from Read2 by Trim_Galore (0.6.7)46, reads were aligned to a customized reference genome consisting of a complete hg38 reference genome (GRCh38.p13 from GENCODE) and the dCas9-KRAB-MeCP2 sequence using STAR (2.7.9a)47. Reads with mapping score ≥30 were selected by SAMtools (1.13)48. Then, de-duplication at the single-cell level was performed based on the UMI sequences and the alignment location, and retained reads were split into SAM files per cell. These single-cell SAM files were converted into alignment TSV files using the sam2tsv function in jvarkit (d29b24f)49. After background single-nucleotide polymorphism (SNP) removal, we considered T > C mismatches with the CIGAR string ‘M’ and quality scores >45 as 4sU sites. Only reads with >30% of T > C mutations among all mismatches were identified as nascent reads, and the list of reads was extracted from single-cell whole-transcriptome SAM files by Picard (2.27.4)50. Finally, single-cell whole/nascent transcriptome gene × cell count matrices were constructed by assigning reads to genes51.
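The nascent-read rule described above can be illustrated with a small classifier. This is a simplified sketch under several assumptions: mismatches are already oriented to the transcribed strand (so a 4sU-derived conversion always appears as T to C), background SNPs are supplied as a set of positions, and the denominator is the number of non-SNP mismatches in the read. The function and parameter names are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

# One mismatch record: (reference position, reference base, read base, base quality)
Mismatch = Tuple[int, str, str, int]


def is_nascent_read(
    mismatches: Iterable[Mismatch],
    snp_positions: FrozenSet[int] = frozenset(),
    min_quality: int = 45,
    min_tc_fraction: float = 0.30,
) -> bool:
    """Return True if more than 30% of a read's mismatches are confident T>C sites.

    Thresholds follow the text (quality > 45, fraction > 30%); how the
    fraction is computed in the original pipeline is an assumption here.
    """
    # Drop mismatches that coincide with known background SNPs.
    kept = [m for m in mismatches if m[0] not in snp_positions]
    if not kept:
        return False
    tc_sites = sum(
        1
        for _pos, ref, alt, qual in kept
        if ref == "T" and alt == "C" and qual > min_quality
    )
    return tc_sites / len(kept) > min_tc_fraction
```

Reads without any mismatch are treated as pre-existing in this sketch, since they carry no evidence of 4sU incorporation.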
Read1 and Read2 of PerturbSci-Kinetics sgRNA libraries were matched against constant sequences, respectively, allowing a maximum of one mismatch. For each filtered read pair, cell barcode, sgRNA sequence and UMI were extracted from designed positions. Extracted sgRNA sequences with a maximum of one mismatch from the sgRNA library were accepted and corrected, and the corresponding UMI was used for de-duplication. De-duplication was performed by collapsing identical UMI sequences of each individual corrected sgRNA under a unique cell barcode. Cells with overall sgRNA UMI counts higher than 10 were maintained, and the sgRNA × cell count matrix was constructed.
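A minimal sketch of the sgRNA correction and UMI de-duplication steps is shown below. It assumes that read pairs have already passed the constant-sequence filter and been parsed into (cell barcode, sgRNA sequence, UMI) tuples; rejecting sequences that are within one mismatch of more than one library entry is an added assumption, and the names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def correct_sgrna(seq: str, library: Set[str]) -> Optional[str]:
    """Map a sequenced spacer to a library spacer, allowing one mismatch."""
    if seq in library:
        return seq
    hits = [
        ref
        for ref in library
        if len(ref) == len(seq) and sum(a != b for a, b in zip(ref, seq)) <= 1
    ]
    # Assumption: ambiguous corrections (several candidates) are discarded.
    return hits[0] if len(hits) == 1 else None


def sgrna_umi_counts(
    records: Iterable[Tuple[str, str, str]],
    library: Set[str],
    min_total_umis: int = 10,
) -> Dict[str, Dict[str, int]]:
    """Build a cell -> {sgRNA: UMI count} table.

    Identical UMIs are collapsed per (cell, corrected sgRNA); cells are kept
    only if their total sgRNA UMI count is higher than min_total_umis.
    """
    umis = defaultdict(set)
    for cell, seq, umi in records:
        ref = correct_sgrna(seq, library)
        if ref is not None:
            umis[(cell, ref)].add(umi)

    per_cell: Dict[str, Dict[str, int]] = defaultdict(dict)
    for (cell, ref), umi_set in umis.items():
        per_cell[cell][ref] = len(umi_set)

    return {
        cell: guides
        for cell, guides in per_cell.items()
        if sum(guides.values()) > min_total_umis
    }
```

The resulting nested dictionary is a sparse form of the sgRNA × cell count matrix.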
SLAM-seq reads were processed similarly. In brief, for 3′ end SLAM-seq, UMI sequences in Read1 were extracted and attached to the headers of Read2 by UMI-tools (1.1.2)52, and only Read2 was further processed. After poly(A) and low-quality base trimming by Trim_Galore, reads were aligned to the hg38 reference genome by STAR. In the scenario of high-concentration 4sU labeling, looser alignment parameters were used (--outFilterMatchNminOverLread 0.2 --outFilterScoreMinOverLread 0.2). Reads were filtered by SAMtools, and PCR duplicates in passed reads were further removed by UMI-tools. Nascent reads were identified and extracted, and gene counting on both whole transcriptome and nascent transcriptome was performed as mentioned above but at the sample level. For full-length SLAM-seq, reads were processed similarly, but paired-end reads were retained.
sgRNA singlets identification and off-target sgRNA removal
Cells with at least 300 whole-transcriptome UMIs, 200 genes, 10 sgRNA UMIs and unannotated reads ratio <40% were kept. sgRNA singlets were assigned based on the following criteria: the most abundant sgRNA in the cell took ≥60% of total sgRNA counts and was at least three-fold of the second most abundant sgRNA.
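The singlet rule can be written as a short function. This sketch applies only the two sgRNA criteria stated above (dominant guide ≥60% of sgRNA counts and at least three-fold over the runner-up); the transcriptome-level filters are assumed to have been applied beforehand, and the function name is hypothetical.

```python
from typing import Dict, Optional


def assign_singlet(
    sgrna_counts: Dict[str, int],
    min_fraction: float = 0.60,
    min_fold: float = 3.0,
) -> Optional[str]:
    """Return the dominant sgRNA of a cell if it qualifies as a singlet, else None."""
    total = sum(sgrna_counts.values())
    if total == 0:
        return None
    ranked = sorted(sgrna_counts.items(), key=lambda kv: kv[1], reverse=True)
    top_name, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    if top_count / total >= min_fraction and top_count >= min_fold * second_count:
        return top_name
    return None


if __name__ == "__main__":
    print(assign_singlet({"sgA_1": 30, "sgB_2": 5}))   # dominant guide passes both checks
    print(assign_singlet({"sgA_1": 12, "sgB_2": 8}))   # fails the three-fold check
```

A cell with 12 and 8 UMIs for two guides meets the 60% threshold but not the three-fold margin, so it is not called a singlet.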
Target genes with the number of cells perturbed ≥50 were kept. The knockdown efficiency was calculated at the individual sgRNA level to remove potential off-target or inefficient sgRNAs: whole transcriptomes of cells receiving the same sgRNA were merged and normalized by counts per million (CPM), and then the fold changes (FCs) of the target gene expressions were calculated by comparing the normalized expression levels between corresponding perturbations and NTC. sgRNAs with ≥40% of target gene expression reduction relative to NTC were regarded as ‘effective sgRNAs’, and singlets receiving these sgRNAs were kept as ‘on-target cells’. Downstream analyses were done at the target gene level by analyzing all cells receiving different sgRNAs targeting the same gene.
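The per-sgRNA knockdown estimate can be sketched as below, assuming pseudobulk count dictionaries (gene to summed UMI count) for the cells carrying one sgRNA and for NTC cells. The function names are illustrative, and the handling of a target gene that is undetected in NTC is an assumption.

```python
from typing import Dict


def cpm(counts: Dict[str, float]) -> Dict[str, float]:
    """Normalize merged counts to counts per million."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("empty count profile")
    return {gene: value * 1e6 / total for gene, value in counts.items()}


def knockdown_efficiency(
    perturbed_counts: Dict[str, float],
    ntc_counts: Dict[str, float],
    target_gene: str,
) -> float:
    """Fractional reduction of target gene expression relative to NTC."""
    ntc_level = cpm(ntc_counts).get(target_gene, 0.0)
    if ntc_level == 0:
        # Assumption: efficiency is undefined when the target is not detected in NTC.
        raise ValueError("target gene not detected in NTC cells")
    perturbed_level = cpm(perturbed_counts).get(target_gene, 0.0)
    return 1.0 - perturbed_level / ntc_level


def is_effective_sgrna(efficiency: float, min_reduction: float = 0.40) -> bool:
    """An sgRNA is kept when target expression drops by at least 40%."""
    return efficiency >= min_reduction
```

For example, a target at 30,000 CPM in perturbed cells versus 100,000 CPM in NTC corresponds to a 70% reduction, which passes the 40% threshold.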
UMAP embedding on pseudo-cells
The count matrix of the ‘on-target’ cells described above was loaded into Seurat27, and DEGs of each perturbation (compared to NTC) were retrieved. Cells from perturbations with ≥1 DEG and cells from genetic perturbations involved in pathways similar to those of the top perturbations were kept. The FCs of the normalized gene expression between perturbations and NTC were calculated and binned based on the gene-specific expression levels in NTC. The top 3% of genes showing the highest FCs within each bin were selected and merged as features for principal component analysis (PCA). The top nine principal components (PCs) were used as input for UMAP embedding.
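The binned feature selection can be sketched as follows. Several details are assumptions made only for illustration, because the text does not specify them: genes are split into equal-sized bins by NTC expression rank, the number of bins (`n_bins`) is arbitrary, and genes are ranked by the absolute log2 FC so that both up- and downregulated genes can be selected. The function name `select_features` is hypothetical.

```python
import math
from typing import Dict, List


def select_features(ntc_expr: Dict[str, float],
                    fold_change: Dict[str, float],
                    n_bins: int = 10,
                    top_fraction: float = 0.03) -> List[str]:
    """Pick the top fraction of genes by |log2 FC| within each NTC-expression bin.

    fold_change values must be positive (perturbation / NTC).
    """
    # Order genes by NTC expression, then cut into equal-sized bins (assumption).
    genes = sorted(ntc_expr, key=lambda g: ntc_expr[g])
    bin_size = max(1, math.ceil(len(genes) / n_bins))
    selected: List[str] = []
    for start in range(0, len(genes), bin_size):
        bin_genes = genes[start:start + bin_size]
        n_top = max(1, math.ceil(top_fraction * len(bin_genes)))
        ranked = sorted(bin_genes,
                        key=lambda g: abs(math.log2(fold_change[g])),
                        reverse=True)
        selected.extend(ranked[:n_top])
    return selected
```

Selecting within expression bins keeps lowly expressed genes, whose FCs tend to be noisier and larger, from dominating the feature set.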
Differential expression analysis
Pairwise differential expression analyses between each perturbation and NTC cells were performed by Monocle 2 (ref. 53). We selected significant hits (false discovery rate (FDR) < 0.05) with a ≥1.5-fold expression difference and CPM ≥5 in at least one of the tested cell pairs. More stringent criteria were used to obtain DEGs with high confidence: significant hits (FDR < 0.05) with a ≥1.5-fold expression difference and CPM ≥50 in at least one of the tested cell pairs were kept. For bulk RNA-seq libraries, genes with a minimum of ten raw counts in at least one sample and expressed in at least half of the samples were kept, and edgeR54 was used for bulk RNA-seq DEG analysis. Significant hits were selected at an FDR < 0.05 level.
Synthesis and degradation rates calculation
After the induction of CRISPRi for 7 d, we assumed that new transcriptomic steady states had been established at the perturbation level before the 4sU labeling and that the labeling did not disturb these new transcriptomic steady states. The following RNA dynamics differential equation was used for synthesis and degradation rate calculation, similar to a previous study26:
$$\frac{dR}{dt} = \alpha - \beta R \tag{1}$$
in which R is the mRNA abundance of each gene; α is the synthesis rate of this gene; and β is the degradation rate of this gene. Because RNA synthesis follows zero-order kinetics and RNA degradation follows first-order kinetics in cells, dR/dt is determined by α and βR.
As steady states had been established, the mRNA level of each gene did not change. We can get:
$$\frac{dR}{dt} = 0 \tag{2}$$
$$R = \frac{\alpha}{\beta} \tag{3}$$
Under the assumption that the labeling efficiency was 100%, all nascent RNA was labeled during the 4sU incubation, and pre-existing RNA would only degrade. So, for nascent RNA (Rn), Rn (t = 0) = 0 and αn = α. For pre-existing RNA (Rp), Rp (t = 0) = R = α/β and αp = 0. Based on these boundary conditions, we could further solve the differential equation above on nascent RNA and pre-existing RNA of each gene.
$$R_n = \frac{\alpha}{\beta}\left(1 - e^{-\beta t}\right) \tag{4}$$
$$R_p = \frac{\alpha}{\beta}\, e^{-\beta t} \tag{5}$$
As both Rn and R were directly measured in PerturbSci-Kinetics, and cells were labeled by 4sU for 2 h (t = 2), β can be calculated from equations 3 and 4. Then, α can be solved by equation 3.
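To make the rate calculation concrete: under the steady-state model, the nascent fraction after labeling for time t is Rn/R = 1 − e^(−βt), so β = −ln(1 − Rn/R)/t and α = βR. The sketch below applies these two formulas to pseudo-cell counts. The function name `estimate_rates` is hypothetical, and the units (counts for R, per hour for β, counts per hour for α) follow from using raw counts and t in hours, which is an assumption about how the inputs are expressed.

```python
import math
from typing import Tuple


def estimate_rates(total: float, nascent: float,
                   t_hours: float = 2.0) -> Tuple[float, float]:
    """Return (alpha, beta) from total and nascent abundance of one gene.

    Assumes steady state and 100% labeling efficiency, as stated in the text.
    """
    if total <= 0 or nascent <= 0 or nascent >= total:
        # Genes with only nascent or only pre-existing counts cannot be estimated.
        raise ValueError("rates are undefined unless 0 < nascent < total")
    beta = -math.log(1.0 - nascent / total) / t_hours  # degradation rate
    alpha = beta * total                               # synthesis rate, alpha = beta * R
    return alpha, beta


if __name__ == "__main__":
    alpha, beta = estimate_rates(total=100.0, nascent=50.0)
    # Half of the transcripts are new after 2 h, so the half-life is 2 h.
    print(f"alpha={alpha:.3f} counts/h, beta={beta:.3f} /h")
```

When half of the transcripts are nascent after a 2-h pulse, β equals ln(2)/2 per hour, which corresponds to an mRNA half-life of 2 h.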
Due to the shallow sequencing and the sparsity of the single-cell expression data, synthesis and degradation rates of DEGs were calculated at the target gene pseudo-cell level. DEGs with only nascent counts or degradation counts were excluded from further examination because their rates could not be estimated.
Because the number of cells differed across perturbations and NTC, which could affect the robustness of rate calculation, randomization tests were adopted to examine the significance of synthesis and degradation rate changes upon perturbation. Only perturbations with ≥50 cells were examined. For each DEG belonging to each perturbation, background distributions of the synthesis and degradation rates were generated: a subset of cells with the same size as the corresponding perturbed cells was randomly sampled from a mixed pool consisting of the corresponding perturbed cells and NTC cells. Then, these cells were aggregated into a background pseudo-cell; synthesis and degradation rates of the gene being tested were calculated as stated above; and the process was repeated 500 times. Rates of 0 were assigned if only nascent counts or degradation counts were sampled during the process (referred to as ‘invalid samplings’), and only genes with fewer than 50 (10%) invalid samplings were kept for P value calculation. The two-sided empirical P values for the synthesis and degradation rate changes were calculated, respectively, by examining the occurrence of extreme values in background distributions compared to the rates from the perturbed pseudo-cell. Rate changes with P < 0.05 were regarded as significant, and the directions of the rate changes were determined by comparing the rates from the perturbed pseudo-cell with the background mean values.
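A minimal sketch of this randomization test for the degradation rate of a single gene is given below. It is not the authors' implementation: all function names are hypothetical, each cell is represented as a (total, nascent) count pair for the gene under test, and the exact definition of the two-sided empirical P value (here, the fraction of background rates at least as far from the background mean as the observed rate) is an assumption, because the text does not spell it out.

```python
import math
import random
from typing import List, Sequence, Tuple

Cell = Tuple[float, float]  # (total counts, nascent counts) of one gene in one cell


def degradation_rate(total: float, nascent: float, t_hours: float = 2.0) -> float:
    """Degradation rate of an aggregated pseudo-cell; 0 marks an invalid sampling."""
    if nascent <= 0 or nascent >= total:
        return 0.0
    return -math.log(1.0 - nascent / total) / t_hours


def background_rates(perturbed: Sequence[Cell], ntc: Sequence[Cell],
                     n_iter: int = 500, seed: int = 0) -> List[float]:
    """Rates of pseudo-cells sampled from the pooled perturbed + NTC cells."""
    rng = random.Random(seed)
    pool = list(perturbed) + list(ntc)
    k = len(perturbed)  # background pseudo-cells match the perturbed group size
    rates = []
    for _ in range(n_iter):
        sample = rng.sample(pool, k)  # sampling without replacement
        total = sum(cell[0] for cell in sample)
        nascent = sum(cell[1] for cell in sample)
        rates.append(degradation_rate(total, nascent))
    return rates


def empirical_two_sided_p(observed: float, background: Sequence[float]) -> float:
    """Fraction of background values at least as extreme as the observed value."""
    mean_bg = sum(background) / len(background)
    observed_dev = abs(observed - mean_bg)
    extreme = sum(1 for b in background if abs(b - mean_bg) >= observed_dev)
    return extreme / len(background)
```

In use, genes for which 50 or more of the 500 background rates are 0 would be dropped before calling `empirical_two_sided_p`, mirroring the 10% invalid-sampling cutoff described above.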
Global changes of key statistics upon perturbations
For global synthesis and degradation rate changes, considering the noise from lowly expressed genes, we selected the top 1,000 highly expressed genes from NTC cells and then calculated their synthesis rates and degradation rates in NTC cells and all perturbations with ≥50 cells. Kolmogorov–Smirnov tests were performed to compare rate distributions between each perturbation and NTC cells. The distributions of exonic reads percentage in nascent reads from cells with the same target gene knockdown and NTC cells were compared using Kolmogorov–Smirnov tests to identify genes affecting RNA processing. The proportion of nascent mitochondrial read counts to total mitochondrial read counts was calculated in each single cell, and its distributions between cells with knockdown and NTC cells were compared by Kolmogorov–Smirnov tests to identify the master regulator of mitochondrial mRNA dynamics. In all global statistics examinations, Benjamini–Hochberg multiple hypothesis correction was performed, and comparisons with FDR ≤ 0.05 were considered significant. The median values from each perturbation and NTC cells were compared to determine the direction of significant changes.
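The Benjamini–Hochberg step applied to the per-perturbation P values can be sketched as below. The function name `benjamini_hochberg` is hypothetical, and the Kolmogorov–Smirnov P values are assumed to have been computed beforehand with a statistics package of choice.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted P values (FDR) in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
    significant = [q <= 0.05 for q in fdr]
    print(fdr, significant)
```

Perturbations whose adjusted value is ≤ 0.05 would then be called significant, and the sign of the median difference from NTC gives the direction of the change.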
Coverage analysis
We reprocessed the raw data of AGO2 eCLIP obtained from HeLa cells from Zhang et al.42. After adapter trimming, UMI extraction, mapping and UMI-based de-duplication, BAM files were transformed to the single-base coverage by BEDTools55. The transcript regions of genes-of-interest were assembled based on the hg38 genome annotation GTF file from GENCODE. In brief, for each gene, the exonic regions were extracted and redivided into 5′ UTR, coding sequence (CDS) and 3′ UTR by the 5′-most start codon and the 3′-most stop codon annotated in the GTF. The AGO2 binding coverages of these designated regions were obtained by intersection and were binned. The gene-specific signal in each bin was normalized by the number of bases in each bin, and the binned coverage of each gene was scaled to be within 0–1. After aggregating scaled coverages of synthesis/degradation-regulated genes, respectively, the lowest point within the CDS was used as the second scaling factor.
Meta-gene coverage analysis was conducted to visualize the gene body distribution of newly transcribed RNA in NTC and AGO2-knockdown samples. Genomic coordinates of protein-coding genes on chromosomes 1–22 and chromosome X were retrieved from the hg38 genome annotation GTF file from GENCODE. Gene bodies were binned into 50 bins, and ordered bins were exported as BED files. For input reads, two nascent read BAM files per group from the pulse-labeling full-coverage SLAM-seq were merged using SAMtools, and then reads with FLAG = 83/163 were assigned to genes on the plus strand, and reads with FLAG = 99/147 were assigned to genes on the minus strand. The gene-specific binned coverages were counted using the BEDTools intersect command. Binned counts of each gene were normalized by total counts in the gene body, and the coverage of any group of genes was finally drawn by averaging the normalized signals across genes.
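The strand assignment and per-gene normalization in this meta-gene analysis can be illustrated with a short sketch. The FLAG-to-strand mapping is taken from the text; the function names and the list-based input format are hypothetical, and skipping genes with zero coverage before averaging is an assumption.

```python
from typing import List, Optional, Sequence

# SAM FLAG values for properly paired reads, mapped to gene strand as in the text.
PLUS_GENE_FLAGS = {83, 163}
MINUS_GENE_FLAGS = {99, 147}


def gene_strand_for_flag(flag: int) -> Optional[str]:
    """Return '+' or '-' for the gene strand a read is assigned to, else None."""
    if flag in PLUS_GENE_FLAGS:
        return "+"
    if flag in MINUS_GENE_FLAGS:
        return "-"
    return None  # other FLAG values are not assigned


def normalize_bins(bin_counts: Sequence[float]) -> Optional[List[float]]:
    """Scale the binned counts of one gene by its total gene-body counts."""
    total = sum(bin_counts)
    if total == 0:
        return None  # assumption: genes without coverage are skipped
    return [c / total for c in bin_counts]


def metagene_profile(per_gene_bins: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalized binned signal across genes, bin by bin."""
    profiles = [p for p in (normalize_bins(b) for b in per_gene_bins) if p is not None]
    n_bins = len(profiles[0])
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(n_bins)]
```

Because each gene is normalized by its own total before averaging, highly expressed genes do not dominate the shape of the averaged profile.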
Public ChIP-seq, shRNA RNA-seq and GRO-seq data analysis
Genes with detectable expression were identified from shControl/shAGO2 bulk RNA-seq in ENCODE. Processed gene count quantification tables were downloaded from the ENCODE portal. Only genes with mean transcripts per million (TPM) > 1 across four samples and with detected expression in at least three of four samples were included. The log2 FC of each gene upon AGO2 silencing was calculated as the log2 ratio of the mean TPM in the shAGO2 group to the mean TPM in the shControl group.
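As a small worked example of the expression filter and FC calculation, the sketch below uses hypothetical function names and treats ‘detected expression’ as TPM > 0, which is an assumption not stated in the text.

```python
import math
from typing import Sequence


def is_detectably_expressed(tpms: Sequence[float],
                            min_mean_tpm: float = 1.0,
                            min_detected: int = 3) -> bool:
    """Mean TPM > 1 across samples and detected (TPM > 0) in at least 3 samples."""
    mean_tpm = sum(tpms) / len(tpms)
    n_detected = sum(1 for t in tpms if t > 0)
    return mean_tpm > min_mean_tpm and n_detected >= min_detected


def log2_fold_change(knockdown_tpms: Sequence[float],
                     control_tpms: Sequence[float]) -> float:
    """log2 of mean TPM (shAGO2) over mean TPM (shControl)."""
    mean_kd = sum(knockdown_tpms) / len(knockdown_tpms)
    mean_ctrl = sum(control_tpms) / len(control_tpms)
    return math.log2(mean_kd / mean_ctrl)
```

A gene with TPMs of 2 and 2 in shAGO2 and 4 and 4 in shControl passes the filter and has a log2 FC of −1, meaning that its expression is halved upon AGO2 silencing.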
AGO2 ChIP-seq BAM and narrow peak files from ENCODE were merged for identifying TSS binding of AGO2. TSS regions of genes with detectable expression (defined as 4 kb around the TSS) were retrieved, and genes were classified into AGO2 TSS peak+/− genes based on the overlap between their TSS regions with merged AGO2 ChIP-seq narrow peaks. The binding patterns were then visualized using the computeMatrix function in deepTools (3.5.1)56.
GRO-seq data were downloaded from the Gene Expression Omnibus (GEO) and reprocessed to depict the transcriptional pausing status of genes. The 3′ end of reads was trimmed against poly(A) by Cutadapt (3.4)57, and reads were then aligned to the hg38 reference genome using Bowtie2 (2.3.0)58. After filtering out unmapped reads using SAMtools, BAM files were imported to R. TSS proximal regions and transcriptional elongation regions of protein-coding genes with gene lengths ≥1 kb were extracted, and the getPausingIndices() function from the BRGenomics package (3.17)59 was used to calculate the pausing indices of genes. Genes detected in both replicates were ranked by the pausing index within the replicate, and an averaged rank was used to study the association with AGO2 TSS binding.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41587-023-01948-9.
Acknowledgements
We would like to express our gratitude to all members of the Cao laboratory for their helpful discussions and feedback. We thank R. Satija at the New York Genome Center for insightful feedback related to this work. We thank the Tissue Culture facility of the University of California, Berkeley for providing the 3T3L1 cell line; Z. Zheng at Memorial Sloan Kettering Cancer Center for providing the HEK293 cell line; and S. Cheng at Memorial Sloan Kettering Cancer Center for assisting with the supply of specific reagents. We thank members of The Rockefeller University Flow Cytometry Resource Center for their extensive help with FACS sorting. We also thank members of the Information Technology and HPC team at The Rockefeller University, especially J. Banfelder and B. Jayaraman, for their great support. The graphic illustrations in this study were generated using BioRender. We acknowledge that the research leading to this publication was partly supported by the G. Harold and Leila Y. Mathers Charitable Foundation. Additionally, the work received funding from grants provided by the National Institutes of Health (1DP2HG012522, 1R01AG076932 and RM1HG011014) and the Mathers Foundation, awarded to J.C.
Author contributions
J.C. and W.Z. conceptualized and supervised the project. Z.X. performed experiments, including technique development and optimization, with input from J.L. Z.X. performed computational analyses, with input from A.S. J.C., W.Z. and Z.X. wrote the manuscript, with input and biological insight from all co-authors.
Peer review
Peer review information
Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.
Data availability
The data generated by this study can be downloaded in raw and processed forms from the National Center for Biotechnology Information Gene Expression Omnibus (GEO)60 (GSE218566). The AGO2 eCLIP data were obtained from the GEO database (GSE115146), and raw data from samples SRR7240709 and SRR7240710 were downloaded. Processed gene counts tables of RNA-seq on shControl/shAGO2 samples were downloaded from the ENCODE portal (ENCSR495YSS and ENCSR898NWE). The AGO2 ChIP-seq BAM and narrow peak files were downloaded from the ENCODE portal (ENCSR151NQL). The GRO-seq data were obtained from the GEO database (GSE97072), and raw data from samples SRR5379790 and SRR5379791 were downloaded. The reference genome hg38 and corresponding genomic annotation GTF file were downloaded from the GENCODE database (release 38, GRCh38.p13). Source data are provided with this paper.
Code availability
The computation scripts for processing PerturbSci-Kinetics are included as supplementary files. Scripts and the user manual are available for open access in GitHub: https://github.com/JunyueCaoLab/PerturbSci_Kinetics (ref. 61). Source data are provided with this paper.
Competing interests
J.C., W.Z. and Z.X. are listed as inventors on a patent related to PerturbSci-Kinetics (US provisional patent application 63/385,479). Other authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Wei Zhou, Junyue Cao.
Extended data is available for this paper at 10.1038/s41587-023-01948-9.
Supplementary information
The online version contains supplementary material available at 10.1038/s41587-023-01948-9.
References
- 1. Jaitin, D. A. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896 (2016). 10.1016/j.cell.2016.11.039
- 2. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882 (2016). 10.1016/j.cell.2016.11.048
- 3. Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016). 10.1016/j.cell.2016.11.038
- 4. Xie, S., Duan, J., Li, B., Zhou, P. & Hon, G. C. Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol. Cell 66, 285–299 (2017). 10.1016/j.molcel.2017.03.007
- 5. Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017). 10.1038/nmeth.4177
- 6. Hill, A. J. et al. On the design of CRISPR-based single-cell molecular screens. Nat. Methods 15, 271–274 (2018). 10.1038/nmeth.4604
- 7. Replogle, J. M. et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat. Biotechnol. 38, 954–961 (2020). 10.1038/s41587-020-0470-y
- 8. Replogle, J. M. et al. Mapping information-rich genotype–phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575 (2022). 10.1016/j.cell.2022.05.013
- 9. Yeo, N. C. et al. An enhanced CRISPR repressor for targeted mammalian gene regulation. Nat. Methods 15, 611–616 (2018). 10.1038/s41592-018-0048-5
- 10. Sziraki, A. et al. A global view of aging and Alzheimer’s pathogenesis-associated cell population dynamics and molecular signatures in the human and mouse brains. Preprint at bioRxiv 10.1101/2022.09.28.509825 (2023).
- 11. Cleary, M. D., Meiering, C. D., Jan, E., Guymon, R. & Boothroyd, J. C. Biosynthetic labeling of RNA with uracil phosphoribosyltransferase allows cell-specific microarray analysis of mRNA synthesis and decay. Nat. Biotechnol. 23, 232–237 (2005). 10.1038/nbt1061
- 12. Dolken, L. et al. High-resolution gene expression profiling for simultaneous kinetic parameter analysis of RNA synthesis and decay. RNA 14, 1959–1972 (2008). 10.1261/rna.1136108
- 13. Miller, C. et al. Dynamic transcriptome analysis measures rates of mRNA synthesis and decay in yeast. Mol. Syst. Biol. 7, 458 (2014). 10.1038/msb.2010.112
- 14. Duffy, E. E. et al. Tracking distinct RNA populations using efficient and reversible covalent chemistry. Mol. Cell 59, 858–866 (2015). 10.1016/j.molcel.2015.07.023
- 15. Schwalb, B. et al. TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016). 10.1126/science.aad9841
- 16. Rabani, M. et al. Metabolic labeling of RNA uncovers principles of RNA production and degradation dynamics in mammalian cells. Nat. Biotechnol. 29, 436–442 (2011). 10.1038/nbt.1861
- 17. Miller, M. R., Robinson, K. J., Cleary, M. D. & Doe, C. Q. TU-tagging: cell type–specific RNA isolation from intact complex tissues. Nat. Methods 6, 439–441 (2009). 10.1038/nmeth.1329
- 18. Erhard, F. et al. scSLAM-seq reveals core features of transcription dynamics in single cells. Nature 571, 419–423 (2019). 10.1038/s41586-019-1369-y
- 19. Hendriks, G.-J. et al. NASC-seq monitors RNA synthesis in single cells. Nat. Commun. 10, 3138 (2019). 10.1038/s41467-019-11028-9
- 20. Cao, J., Zhou, W., Steemers, F., Trapnell, C. & Shendure, J. Sci-fate characterizes the dynamics of gene expression in single cells. Nat. Biotechnol. 38, 980–988 (2020). 10.1038/s41587-020-0480-9
- 21. Qiu, Q. et al. Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat. Methods 17, 991–1001 (2020). 10.1038/s41592-020-0935-4
- 22. Battich, N. et al. Sequencing metabolically labeled transcripts in single cells reveals mRNA turnover strategies. Science 367, 1151–1156 (2020). 10.1126/science.aax3072
- 23. Kawata, K. et al. Metabolic labeling of RNA using multiple ribonucleoside analogs enables the simultaneous evaluation of RNA synthesis and degradation rates. Genome Res. 30, 1481–1491 (2020). 10.1101/gr.264408.120
- 24. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011). 10.1038/nature10098
- 25. Sanson, K. R. et al. Optimized libraries for CRISPR–Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018). 10.1038/s41467-018-07901-8
- 26. Qiu, X. et al. Mapping transcriptomic vector fields of single cells. Cell 185, 690–711 (2022). 10.1016/j.cell.2021.12.045
- 27. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019). 10.1016/j.cell.2019.05.031
- 28. Sun, M. et al. Global analysis of eukaryotic mRNA degradation reveals Xrn1-dependent buffering of transcript levels. Mol. Cell 52, 52–62 (2013). 10.1016/j.molcel.2013.09.010
- 29. Iwakawa, H.-O. & Tomari, Y. Life of RISC: formation, action, and degradation of RNA-induced silencing complex. Mol. Cell 82, 30–43 (2022). 10.1016/j.molcel.2021.11.026
- 30. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020). 10.1038/s41586-020-2077-3
- 31. ENCODE Project Consortium et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
- 32. Herzog, V. A. et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat. Methods 14, 1198–1204 (2017). 10.1038/nmeth.4435
- 33. Siira, S. J. et al. LRPPRC-mediated folding of the mitochondrial transcriptome. Nat. Commun. 8, 1532 (2017). 10.1038/s41467-017-01221-z
- 34. Paulo, E. et al. Brown adipocyte ATF4 activation improves thermoregulation and systemic metabolism. Cell Rep. 36, 109742 (2021). 10.1016/j.celrep.2021.109742
- 35. Treiber, T., Treiber, N. & Meister, G. Regulation of microRNA biogenesis and its crosstalk with other cellular pathways. Nat. Rev. Mol. Cell Biol. 20, 5–20 (2019). 10.1038/s41580-018-0059-1
- 36. Kim, Y.-K., Kim, B. & Kim, V. N. Re-evaluation of the roles of DROSHA, Exportin 5, and DICER in microRNA biogenesis. Proc. Natl Acad. Sci. USA 113, E1881–E1889 (2016). 10.1073/pnas.1602532113
- 37. Chipman, L. B. & Pasquinelli, A. E. miRNA targeting: growing beyond the seed. Trends Genet. 35, 215–222 (2019). 10.1016/j.tig.2018.12.005
- 38. Heinrichs, A. A slice of the action. Nat. Rev. Mol. Cell Biol. 5, 677–677 (2004). 10.1038/nrm1483
- 39. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018). 10.1038/nbt.4096
- 40. Futschik, M. E. & Carlisle, B. Noise-robust soft clustering of gene expression time-course data. J. Bioinform. Comput. Biol. 3, 965–988 (2005). 10.1142/S0219720005001375
- 41. Thomas, M. P. et al. Apoptosis triggers specific, rapid, and global mRNA decay with 3′ uridylated intermediates degraded by DIS3L2. Cell Rep. 11, 1079–1089 (2015). 10.1016/j.celrep.2015.04.026
- 42. Zhang, K. et al. A novel class of microRNA-recognition elements that function only within open reading frames. Nat. Struct. Mol. Biol. 25, 1019–1027 (2018). 10.1038/s41594-018-0136-3
- 43. Pacini, C. et al. Integrated cross-study datasets of genetic dependencies in cancer. Nat. Commun. 12, 1661 (2021). 10.1038/s41467-021-21898-7
- 44. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000). 10.1038/75556
- 45. Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000). 10.1093/nar/28.1.27
- 46. Krueger, F. Trim Galore. A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data. GitHub https://github.com/FelixKrueger/TrimGalore (2013).
- 47. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013). 10.1093/bioinformatics/bts635
- 48. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021). 10.1093/gigascience/giab008
- 49. Lindenbaum, P. JVarkit: java-based utilities for bioinformatics. figshare 10.6084/m9.figshare.1425030.v1 (2015).
- 50. Picard. Broad Institute https://broadinstitute.github.io/picard/ (2014).
- 51. Putri, G. H., Anders, S., Pyl, P. T., Pimanda, J. E. & Zanini, F. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics 38, 2943–2945 (2022). 10.1093/bioinformatics/btac166
- 52. Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
- 53. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017). 10.1038/nmeth.4402
- 54. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010). 10.1093/bioinformatics/btp616
- 55. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010). 10.1093/bioinformatics/btq033
- 56. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016). 10.1093/nar/gkw257
- 57. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011). 10.14806/ej.17.1.200
- 58. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012). 10.1038/nmeth.1923
- 59. DeBerardine, M. BRGenomics: tools for the efficient analysis of high-resolution genomics data. GitHub https://mdeber.github.io/ (2023).
- 60. Xu, Z. et al. PerturbSci-Kinetics: dissecting key regulators of transcriptome kinetics through scalable single-cell RNA profiling of pooled CRISPR screens. Gene Expression Omnibus. NCBI https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE218566 (2023).
- 61. Xu, Z. et al. PerturbSci_Kinetics. GitHub https://github.com/JunyueCaoLab/PerturbSci_Kinetics (2023).
- 62. Miyoshi, H., Blömer, U., Takahashi, M., Gage, F. H. & Verma, I. M. Development of a self-inactivating lentivirus vector. J. Virol. 72, 8150–8157 (1998). 10.1128/JVI.72.10.8150-8157.1998
- 63. Scarpulla, R. C., Vega, R. B. & Kelly, D. P. Transcriptional integration of mitochondrial biogenesis. Trends Endocrinol. Metab. 23, 459–466 (2012). 10.1016/j.tem.2012.06.006
- 64. Pakos-Zebrucka, K. et al. The integrated stress response. EMBO Rep. 17, 1374–1395 (2016). 10.15252/embr.201642195
- 65. Janowski, B. A. et al. Involvement of AGO1 and AGO2 in mammalian transcriptional silencing. Nat. Struct. Mol. Biol. 13, 787–792 (2006). 10.1038/nsmb1140
- 66. Griffin, K. N. et al. Widespread association of the Argonaute protein AGO2 with meiotic chromatin suggests a distinct nuclear function in mammalian male reproduction. Genome Res. 32, 1655–1668 (2022). 10.1101/gr.276578.122
- 67. Moshkovich, N. et al. RNAi-independent role for Argonaute2 in CTCF/CP190 chromatin insulator function. Genes Dev. 25, 1686–1701 (2011). 10.1101/gad.16651211