Abstract
Pooled CRISPR screens coupled with single-cell RNA-sequencing have enabled systematic interrogation of gene function and regulatory networks. Here, we introduce Cas13 RNA Perturb-seq (CaRPool-seq), which leverages the RNA-targeting CRISPR–Cas13d system and enables efficient combinatorial perturbations alongside multimodal single-cell profiling. CaRPool-seq encodes multiple perturbations on a cleavable CRISPR array that is associated with a detectable barcode sequence, allowing for the simultaneous targeting of multiple genes. We compared CaRPool-seq to existing Cas9-based methods, highlighting its unique strength to efficiently profile combinatorially perturbed cells. Finally, we apply CaRPool-seq to perform multiplexed combinatorial perturbations of myeloid differentiation regulators in an acute myeloid leukemia (AML) model system and identify extensive interactions between different chromatin regulators that can enhance or suppress AML differentiation phenotypes.
Recent technological advances that couple pooled genetic perturbations with single-cell RNA-sequencing (scRNA-seq) or multimodal characterization (that is, Perturb-seq, CRISPR droplet sequencing, CRISP–seq and expanded CRISPR-compatible cellular indexing of transcriptomes and epitopes by sequencing (ECCITE-seq)1–4), promise to transform our understanding of gene function. In particular, the ability to perform combinatorial perturbations represents an opportunity to decode complex regulatory networks, with pioneering work demonstrating the ability to identify epistasis and other genetic interactions5–7. However, there are specific technical and analytical challenges associated with pooled single-cell screens that are exacerbated when considering combinatorial perturbations. For example, undetected or incorrectly assigned single-guide RNAs can affect up to 20% of cells6; this is compounded when multiple independent sgRNAs are introduced and independently detected in each cell. Moreover, perturbations introduced by Cas9 are not uniformly efficient, and a considerable fraction of targeted cells may exhibit no phenotypic effects of perturbation8,9. Therefore, when performing two or more simultaneous perturbations, the fraction of cells where all perturbations are both successfully introduced and successfully detected can decrease dramatically.
Type VI CRISPR–Cas proteins, such as the VI-D family member RfxCas13d, are programmable RNA-guided and RNA-targeting nucleases that enable targeted RNA knockdown. Notably, RfxCas13d is also capable of processing a CRISPR array into multiple mature CRISPR RNAs (crRNAs)10, presenting an attractive option for combinatorial perturbations at the RNA level. Recently, we confirmed that RfxCas13d can lead to striking target-RNA knockdown, and learned a set of optimal targeting rules from thousands of guide RNAs tiling multiple transcripts11. We therefore sought to combine pooled CRISPR–Cas13 screens with single-cell readouts to perform combinatorial and multimodal pooled genetic screens.
Results
Engineered CRISPR arrays enable Cas13 gRNA capture in single cells
Our method for Cas13 RNA Perturb-seq (CaRPool-seq) is enabled via an optimized molecular strategy to deliver individual or multiple gRNA perturbations in each cell and detect their identity during a single-cell sequencing experiment. Type VI-A, C and D Cas13 crRNAs consist of a short 5′ direct repeat and a variable spacer (also called gRNA) at the 3′ end, and therefore lack a common priming site for reverse transcription (RT). We developed two alternative approaches for Cas13 gRNA detection: (1) ‘direct’ capture, which adds a ‘capture sequence’ to the 3′ end of the 23-nt target spacer (Fig. 1a); and (2) an ‘indirect’ capture strategy, where a dedicated crRNA of the CRISPR array contains an array-specific barcode (barcode gRNA, bcgRNA), tested with different positional configurations of the bcgRNA within the CRISPR array (Fig. 1a). We evaluated the performance of each method by targeting cell surface proteins and measuring knockdown via flow cytometry (Fig. 1b and Extended Data Fig. 1), and by quantifying crRNA detection via PCR with reverse transcription (Fig. 1c). While all methods successfully induced robust knockdown (Fig. 1b), we found that indirect guide capture with an optimized configuration resulted in the strongest crRNA transcript detection (configuration X, Fig. 1c), for both catalytically active and inactive Cas13 proteins.
These results demonstrate that RfxCas13d crRNAs can be modified by adding a common reverse transcription handle either directly to the gRNA or as a separate bcgRNA as part of a CRISPR array, allowing for reverse transcription and amplification. Notably, our strategy for indirect detection is well-suited for delivering multiple gRNAs into a single cell alongside a detectable bcgRNA that encodes the collective identity of these perturbations. In addition, using a unique set of reverse transcription handle and Illumina PCR priming sequence in our modified crRNA (Extended Data Fig. 2) ensures that these perturbations can be detected not only alongside scRNA-seq, but also when profiling additional molecular modalities (for example, CITE-seq12 for simultaneous transcriptome and surface protein profiling).
As proof of principle, we first tested the ability of CaRPool-seq to detect and assign bcgRNAs in a single-cell species-mixing experiment. We separately transduced RfxCas13d-expressing human embryonic kidney 293FT (HEK293FT) and mouse NIH/3T3 cells with a viral pool of three CRISPR arrays containing a nontargeting (NT) gRNA and a species-specific bcgRNA. We profiled a mixture of human and mouse cells with the 10X Genomics Chromium system (v.3), aiming to detect both cellular transcriptomes and the bcgRNAs. Of 2,387 cells, we found that 78.5% expressed a single bcgRNA (1.1% >1 bcgRNAs; 20.4% no detected bcgRNA). Moreover, we observed extremely high concordance between RNA and bcgRNA labels in singlet cells (99.2%) (Fig. 1d,e). These numbers demonstrate that CaRPool-seq enables pooled perturbation screens that can be efficiently and accurately demultiplexed into a single-cell readout.
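The classification of cells into those with no, one or more than one detected bcgRNA can be illustrated with a minimal sketch. The function name `assign_bcgrna`, the thresholds `min_umis` and `min_ratio`, and the toy counts are assumptions made for illustration only; they are not the criteria or data used in this study.

```python
from collections import Counter

def assign_bcgrna(umi_counts, min_umis=3, min_ratio=3.0):
    """Classify one cell from its bcgRNA UMI counts (dict: barcode -> UMIs).

    Returns ("none", None), ("single", barcode) or ("multiple", None).
    min_umis and min_ratio are illustrative thresholds, not values from the paper.
    """
    ranked = sorted(umi_counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] < min_umis:
        return ("none", None)
    top_barcode, top = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else 0
    # A second well-supported barcode that is not clearly outnumbered
    # by the top barcode marks the cell as carrying multiple bcgRNAs.
    if second >= min_umis and top < min_ratio * second:
        return ("multiple", None)
    return ("single", top_barcode)

# Toy cells (hypothetical counts)
cells = {
    "cell_1": {"bc_human_1": 25, "bc_mouse_2": 1},
    "cell_2": {},
    "cell_3": {"bc_human_1": 10, "bc_mouse_1": 8},
}
calls = {cell: assign_bcgrna(counts) for cell, counts in cells.items()}
summary = Counter(kind for kind, _ in calls.values())
print(calls)
print(summary)
```

Cells called "single" can then be compared with their transcriptome-based species label to estimate concordance.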
CaRPool-seq enables combinatorial gene targeting with multimodal single-cell readout
Next, we tested the ability of CaRPool-seq to distinguish combinatorial perturbations on multiple molecular modalities at single-cell resolution. We designed gRNAs targeting three cell surface proteins, CD46, CD55 and CD71, as well as NT gRNAs. We created 29 crRNA arrays (Supplementary Tables 1 and 2), each of which contains up to three gRNAs and a bcgRNA, allowing for the perturbation of these genes individually or in combination. We transduced HEK293FT cells with a viral pool of all crRNAs and performed CaRPool-seq with CITE-seq12 readout (Fig. 2a and Extended Data Fig. 3a), allowing the assessment of each perturbation on both the cellular transcriptome and antibody-derived tags (ADTs) associated with CD46, CD55 and CD71 surface protein levels.
We obtained 9,355 single-cell profiles and demultiplexed them into groups based on the detected bcgRNA (Extended Data Fig. 3b; 74.7% expressed a single bcgRNA, 80.8% expressed at least one bcgRNA). We observed, on average, a 76.5% (±5.7%) reduction in protein levels for each targeted gene after perturbation with Cas13, demonstrating clear evidence of robust molecular perturbation (Fig. 2b–d). Moreover, the strength of knockdown was similar for multi-gRNA crRNA arrays relative to single gRNA perturbations (Fig. 2b and Extended Data Fig. 3c–e). When examining transcriptomic pseudobulk profiles for all 26 targeting gRNA groups, we observed decreased messenger RNA expression for each targeted transcript, even when perturbing transcripts of three genes simultaneously (15 examples in Extended Data Fig. 3f,g). The average strength of transcriptomic knockdown (mean 65%, s.d. 8.7%) was consistently lower than the observed protein reduction. One possible explanation is that target RNAs are continuously being produced and degraded before the target can be translated into protein. Further, analogous to how Cas9-nuclease targeting often produces RNAs degraded by nonsense-mediated decay13, it is possible that Cas13 cleavage produces RNA molecules that can be detected by scRNA-seq but cannot be translated into functional protein14, suggesting that the level of measured RNA knockdown underestimates the phenotypic effect of Cas13 perturbation.
We performed several analyses to demonstrate that Cas13d-mediated gene knockdown does not introduce notable off-target effects or deleteriously alter cellular fitness. We identified eight genes whose sequence contains potential off-target binding sites for any of our three perturbations, but found that the expression of these genes was not significantly changed in our CaRPool-seq data (Extended Data Fig. 4a). We also performed CD55 target knockdown followed by more sensitive bulk RNA-seq and found no evidence for elevated levels of Cas13d off-targets (Extended Data Fig. 4b and Supplementary Table 3). In addition, previous reports have suggested that expression and target-dependent activation of Cas13d can induce broad nonspecific degradation of cytoplasmic and nuclear mRNA, and a reduction in cellular proliferation or fitness15. Notably, transcripts of mitochondrial genes have been suggested to be at least partially protected from Cas13d collateral activity15. Hence, in the presence of collateral activity, mitochondrial genes may appear relatively upregulated compared to nuclear and cytoplasmic genes. Therefore, we compared mitochondrial gene expression levels in cells that received one or more targeting gRNAs with those in NT control cells (Extended Data Fig. 4c), and did not observe upregulation in our CaRPool-seq data. Last, we classified the cell cycle stage of each cell, and found no difference in cell cycle distributions (indicative of their proliferation state) compared to NT control cells (Extended Data Fig. 4d–f). Our findings likely reflect CaRPool-seq’s controlled delivery of Cas13d and gRNAs using single integration lentiviral systems instead of transient overexpression, which is consistent with the suggestion that lower and more controlled Cas13 expression can mitigate collateral activity16.
CaRPool-seq efficiently recovers cells with multiple perturbations
We next benchmarked the performance of CaRPool-seq against direct capture Perturb-seq6 using three different Cas9 effectors: Cas9-nuclease, a first-generation CRISPR inhibition (CRISPRi) system, Krüppel associated box (KRAB) fused to dCas9 (refs. 17,18) and a second-generation, dual-effector CRISPRi system, KRAB–dCas9–MeCP2 (refs. 18,19). In CaRPool-seq one bcgRNA encodes the combined gRNA identities, while direct capture Perturb-seq requires independent detection of one sgRNA feature per perturbation (Fig. 3a). We replicated our previously described experimental system, targeting the same three cell surface markers (CD46, CD55 and CD71) alone or in combination. For each target, we evaluated three sgRNAs from established CRISPR-KO20 and CRISPRi21 sgRNA libraries (Extended Data Fig. 5a) and selected the best sgRNA for Perturb-seq (Extended Data Fig. 5b). In addition, we used Cell Hashing22 to label cells targeted with vectors encoding single, double or triple perturbations. As in CaRPool-seq, we quantified gRNA, RNA and ADT levels in each cell.
Our benchmarking analysis found that, in contrast to CaRPool-seq, alternative Cas9-based approaches struggled to efficiently identify and detect combinatorial perturbations (Fig. 3b). For example, in the KRAB–dCas9–MeCP2 experiment, we recovered 1,570 cells that received vectors targeting three genes. Among these cells, only 779 (49.6%) were associated with the correct three sgRNAs after sequencing. In the remaining cells, we detected too few perturbations (zero, one or two gRNAs, 31.2%), too many (four or more gRNAs, 10.0%), or an improper combination of three gRNAs (9.2%). This observed drop-off is fully consistent with the theoretical expectation of recovery for multiple independently detected gRNAs and highlights the challenge of efficiently profiling multiple perturbations with existing approaches. Since CaRPool-seq associates combinatorial perturbations with a single bcgRNA, the efficiency of detection does not vary between single and multiple perturbations.
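The theoretical expectation for independently detected guides can be made concrete with a short calculation. The sketch below assumes that each sgRNA is detected independently with the same probability, a simplification that ignores spurious extra guides and recombination; the function names are illustrative.

```python
def expected_full_recovery(p_detect, n_guides):
    """Fraction of cells in which all n independently detected guides are found."""
    return p_detect ** n_guides

def implied_per_guide_rate(observed_fraction, n_guides):
    """Per-guide detection rate that would produce the observed fraction."""
    return observed_fraction ** (1.0 / n_guides)

# Under independence, an 80% per-guide detection rate gives about 51%
# recovery for three guides.
print(round(expected_full_recovery(0.8, 3), 3))

# Values reported above: 779 of 1,570 triple-targeted cells carried
# the correct three sgRNAs.
observed = 779 / 1570
per_guide = implied_per_guide_rate(observed, 3)
print(round(observed, 3), round(per_guide, 3))
```

Under this simplified model, the observed triple-guide recovery corresponds to a per-guide detection rate of roughly 79%, whereas a single barcode per array is detected once regardless of how many guides it encodes.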
We next compared the strength of perturbation across methods. We first considered cells where three perturbations were successfully detected based on either the bcgRNA (CaRPool-seq) or independently detected gRNA (Perturb-seq). When considering these cells, all methods successfully induced a similarly strong depletion of all three surface proteins (Cas13d 74.5%, Cas9 75.5%, KRAB–dCas9 75.2%, KRAB–dCas9–MeCP2 77.3%) (Fig. 3c and Extended Data Fig. 5c,d). We next analyzed all cells based on their ADT levels. CaRPool-seq and Perturb-seq cells clustered together (Fig. 3d), and grouped by gRNA identity (Fig. 3e and Extended Data Fig. 5e), again demonstrating that the strength of phenotypic protein perturbation was similar across all methods. We conclude that CaRPool-seq and Perturb-seq can both effectively introduce combinatorial perturbations into single cells. However, CaRPool-seq exhibits clear advantages in the ability to successfully identify and detect these perturbations and therefore represents an attractive approach for performing combinatorial single-cell CRISPR screens.
A pooled RNA-targeting CRISPR screen identifies genes involved in acute myeloid leukemia (AML) differentiation
To demonstrate the throughput and potential of CaRPool-seq to characterize genetic interactions, we performed a multiplexed screen of 158 combinatorial gene pairs. Motivated by recent work23, we aimed to characterize potential interactions between previously identified regulators of leukemic differentiation, which can influence the response to chemotherapy and small-molecule drugs. We generated a human MLL-AF9 NRASG12D AML cell line (THP1 cells), with a stably integrated doxycycline-inducible Cas13d cassette, as a model system. We first performed a bulk Cas13d CRISPR screen using a targeted library of 439 genes with ten gRNAs per gene. On days 13–16 post-Cas13d induction, cells were sorted into bins based on their surface expression of CD14 and CD11b, immunophenotypic markers of monocyte differentiation. By comparing gRNA representation between low- and high-expressing bins, we selected 26 genes that influenced differentiation consistently across multiple independent gRNAs and that have also been identified previously in orthogonal pooled Cas9 screens23 (Extended Data Fig. 6a–h and Supplementary Table 4). Through individual perturbations with a flow-cytometry readout, we found that target gene perturbations led to detectable CD11b expression changes after 3 days (Extended Data Fig. 6i). Consistent with previous work23, these genes were largely associated with DNA-binding and chromatin remodeling functions, and include a subset of previously identified regulators of AML differentiation.
CaRPool-seq identifies genetic interactions in AML differentiation
We next applied CaRPool-seq to test the effects of combinatorially perturbing these regulators (Fig. 4a and Extended Data Fig. 7a). We infected cells with a pooled library of 385 crRNA arrays. This library encoded 28 single perturbations (26 regulators and two negative control genes) and 158 paired perturbations. It also encompassed technical replicates for each perturbation using two independent gRNAs, as well as NT controls. We profiled the transcriptome, cell surface protein levels and gRNA expression for 31,308 demultiplexed single cells.
We first compared the level of surface protein expression for each perturbation to NT controls (Extended Data Fig. 7b). As expected, we found that each single-gene perturbation affected CD11b expression, with observed log2 fold changes that were in strong agreement with the level of gRNA enrichment from bulk CRISPR screens (R = 0.86; Fig. 4b, Extended Data Fig. 7c–e and Supplementary Table 5). Observed log2 fold changes for all perturbations were also reproducible (R = 0.82, Extended Data Fig. 7f) across technical replicate perturbations when comparing effects measured for independent gRNAs. We next compared the observed effects of the 158 dual gene perturbations to the effects resulting from the two corresponding single perturbations. We observed a strong correlation and found that the dual perturbation was typically stronger than the average of individual knockdowns, but weaker than the sum (Fig. 4c). We also observed both synergistic and dampening effects. For example, individual knockdown of the histone demethylase KDM1A (log2 fold change (FC) 2.41) and the histone deacetylase HDAC3 (log2FC 0.53) led to strong and weak CD11b upregulation, respectively, but dual perturbation led to a synergistic effect (log2FC 2.85). In contrast, while individual knockdown of EP300 also led to CD11b upregulation (log2FC 1.57), dual perturbation with KDM1A (log2FC 2.05) was weaker than the individual KDM1A knockdown. We observed similar findings using data from our Cas13d CD11b pooled screen (Extended Data Fig. 7g and Supplementary Table 5). To further validate the synergistic relationship between KDM1A and HDAC3, we infected KRAB–dCas9–MeCP2-expressing THP1 cells with multiple independent sgRNAs and measured CD11b and CD14 cell surface protein expression 7 days postinfection using flow cytometry. Again, we observed synergistic upregulation of both surface markers (Fig. 4d) after dual perturbation.
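As a worked illustration of these comparisons, the sketch below takes the CD11b log2 fold changes quoted above and compares each dual perturbation with the average and the sum of the corresponding single-gene values (a sum of log2 fold changes corresponds to a product of fold changes). The dictionary layout and the helper `compare_dual` are illustrative choices, not part of the published analysis, and the use of `max` assumes both single effects are upregulation.

```python
# CD11b log2 fold changes quoted in the text
single_lfc = {"KDM1A": 2.41, "HDAC3": 0.53, "EP300": 1.57}
dual_lfc = {("KDM1A", "HDAC3"): 2.85, ("KDM1A", "EP300"): 2.05}

def compare_dual(gene_a, gene_b):
    """Place a dual-perturbation effect relative to its two single effects."""
    a, b = single_lfc[gene_a], single_lfc[gene_b]
    dual = dual_lfc[(gene_a, gene_b)]
    return {
        "dual": dual,
        "average": (a + b) / 2,
        "sum": a + b,
        # Assumes positive log2FC (upregulation) for both single perturbations.
        "strongest_single": max(a, b),
        "exceeds_strongest_single": dual > max(a, b),
    }

for pair in dual_lfc:
    print(pair, compare_dual(*pair))
```

For KDM1A/HDAC3 the dual effect exceeds the stronger single effect, whereas for KDM1A/EP300 it falls below the KDM1A effect alone; both dual values lie between the average and the sum of the single effects.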
We next explored the transcriptional profiles in our CaRPool-seq dataset. We first sought to orthogonally validate the transcriptomic signatures we observed on perturbation of single genes by comparing with alternative technologies and datasets. We found that the differential gene expression signatures for single-gene perturbations obtained using ECCITE-seq (5′ scRNA-seq, Cas9 perturbation)23 can be readily reproduced in our CaRPool-seq (3′ scRNA-seq, Cas13d perturbation) data (Extended Data Fig. 8a,b). These differentially regulated gene modules were typically associated with genetic programs associated with the differentiation and function of myeloid cells (Extended Data Fig. 8c). To further explore comparisons between human hematopoiesis and our in vitro model system, we integrated our CaRPool-seq dataset with an scRNA-seq reference of hematopoietic progenitors and mature myeloid cells from the Human Cell Atlas and Human Biomolecular Atlas Project24–27, aligning CaRPool-seq THP1 cells to their closest neighbors in the reference dataset, and constructing a joint differentiation trajectory (Fig. 4e). We found that NT control cells localized to early points, but that single perturbations (that is, KDM1A, GFI1, GSE1) pushed cells further down the differentiation trajectory (Fig. 4f), consistent with their role in enhancing leukemic differentiation.
We applied a recent pioneering framework7 that fits a regression model to decompose the observed perturbation responses in doubly perturbed cells as a linear combination of single-gene perturbation responses. The fit and coefficients of this model describe multiple types of genetic interaction, including epistasis, genetic suppression and synergistic relationships. Fitting these models to each of our pairwise perturbations revealed a diversity of genetic interactions, which we broadly clustered into four groups (Fig. 4g). For 33 gene pairs in cluster 1, we saw that each individual gene’s profile contributed equally to the dual perturbation response and the linear model exhibited a strong fit. As a positive control, many of the pairs in this cluster represented perturbations of two proteins in the same complex (that is, MED14/MED24, SUPT16H/SUPT6H), which show similar gene expression responses for singly and doubly perturbed cells (Extended Data Fig. 8d–f). This cluster also represented pairs of proteins residing in separate complexes (MED24/SMARCD1 of mediator and SWI/SNF complexes), which share similar perturbation signatures (Extended Data Fig. 8d). Dual perturbation of KDM1A and the transcriptional repressor GSE1 also fell in this cluster (Extended Data Fig. 8f), consistent with previous work that suggests a cooperative interaction via colocalization at repressed promoters to inhibit myeloid differentiation28.
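The idea of decomposing a dual-perturbation response into single-gene responses can be sketched as a two-coefficient least-squares fit. This is a simplified stand-in under stated assumptions: ordinary least squares without an intercept is used here, which may differ from the estimator in the cited framework, and the function name and toy expression changes are hypothetical.

```python
def fit_two_gene_model(delta_a, delta_b, delta_ab):
    """Fit delta_ab ~ c_a * delta_a + c_b * delta_b by ordinary least squares.

    Each argument is a list of per-gene expression changes relative to
    NT controls. Returns (c_a, c_b, r_squared), where r_squared is the
    uncentered fraction of the dual response explained by the fit.
    """
    saa = sum(x * x for x in delta_a)
    sbb = sum(y * y for y in delta_b)
    sab = sum(x * y for x, y in zip(delta_a, delta_b))
    say = sum(x * z for x, z in zip(delta_a, delta_ab))
    sby = sum(y * z for y, z in zip(delta_b, delta_ab))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("single-gene profiles are collinear")
    # Closed-form solution of the 2x2 normal equations
    c_a = (say * sbb - sab * sby) / det
    c_b = (saa * sby - sab * say) / det
    residual = sum((z - c_a * x - c_b * y) ** 2
                   for x, y, z in zip(delta_a, delta_b, delta_ab))
    total = sum(z * z for z in delta_ab)
    r_squared = 1 - residual / total if total else float("nan")
    return c_a, c_b, r_squared

# Toy profiles (not data from the study): gene A dominates the dual response.
delta_a = [1.0, 0.0, 2.0, -1.0]
delta_b = [0.0, 1.0, 1.0, 0.5]
delta_ab = [0.9 * x + 0.2 * y for x, y in zip(delta_a, delta_b)]
c_a, c_b, r_squared = fit_two_gene_model(delta_a, delta_b, delta_ab)
print(round(c_a, 3), round(c_b, 3), round(r_squared, 3))
```

In this reading, similar coefficients with a good fit resemble the balanced pairs of cluster 1, while one coefficient much larger than the other resembles a dominant interaction.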
In cluster 4, we identified genetic interactions where one gene’s effect appeared to dominate over the other. We generally observed that transcriptional responses varied widely when pairing KDM1A knockdown with different chromatin regulators. For example, we found that the EP300 signature appeared more strongly than the KDM1A signature when combinatorially perturbing both genes (Extended Data Fig. 8g). Dually perturbed cells exhibited higher expression of progenitor genes (that is, the progenitor marker AZU1), and reduced expression of differentiated marker genes (that is, myeloid marker S100A4) compared to individual KDM1A perturbation. In contrast, the KDM1A response signature dominated when paired with perturbation of the polycomb repressive complex member RING1 (Extended Data Fig. 8h). Dual perturbation of HDAC3 enhanced the KDM1A transcriptional response signature (Fig. 4h), consistent with our previously described immunophenotypic results for these cells (Fig. 4c). This transcriptional response led dually perturbed cells to be distributed at later segments of our integrated myeloid differentiation trajectory, exhibiting enhanced differentiation compared to either single perturbation (Fig. 4i,j). These findings support and provide a molecular explanation for recent observations that combination therapies of KDM1A antagonists and HDAC inhibitors exhibit an enhanced response29. Moreover, we identified additional synergistic combinations between HDAC3/GFI1 and HDAC3/GSE1, both of which enhanced the expression of immunophenotypic markers (Fig. 4c), as well as transcriptional differentiation state (Fig. 4j).
Stable RNA structures improve bcgRNA detection
While our paper was in review, Nelson et al.30 reported that the inclusion of structured RNA motifs on prime editing gRNAs (pegRNAs) led to protection from exonuclease degradation, increased stability, and enhanced efficiency. While we expect that crRNAs are typically protected from degradation while complexed with Cas13 (refs. 31,32), the ends of longer bcgRNA molecules may still be accessible and susceptible to degradation. These ends include reverse transcription priming sites that are essential for CaRPool-seq bcgRNA detection. Therefore, we tested the addition of stably structured RNA elements at the 3′ end of the bcgRNA to antagonize nucleolytic decay (Fig. 5a). We tested six different structures (Extended Data Fig. 9a), and repeated our benchmarking experiment targeting CD46, CD55 and CD71 in HEK293FT cells. While protein knockdown was indistinguishable for all six structures (Fig. 5b,c and Extended Data Fig. 9b), we found that two elements (Zika virus-derived xrRNA1 dumbbell33; evopreQ1 pseudoknot34) led to a robust increase in bcgRNA unique molecular identifier (UMI) counts (Fig. 5d). In particular, the evopreQ1 pseudoknot element led to sixfold higher bcgRNA UMI counts compared to our initial design (Extended Data Fig. 9c). The increased sensitivity also systematically improved the signal-to-noise ratio to distinguish true bcgRNA UMI counts from spurious secondary bcgRNA UMI counts (Extended Data Fig. 9d). This modified bcgRNA structure therefore further improves the performance of CaRPool-seq, and is recommended for future experiments.
Discussion
Here, we present CaRPool-seq, a flexible method for performing CRISPR–Cas13 RNA-targeting screens with a single-cell sequencing-based readout. We introduced an optimized strategy to deliver multiple gRNAs as part of a single CRISPR array, which is subsequently cleaved into individual crRNAs. We demonstrate that this strategy is well-suited for performing combinatorial perturbations, whose identity is encoded in a single barcode that can be reliably detected alongside multiple molecular modalities including scRNA-seq and CITE-seq.
Through benchmarking, we show that CaRPool-seq is more efficient and accurate than Cas9-based technologies when assigning multiple perturbations in single cells. Even with individual perturbations, the user can still benefit from CaRPool-seq. In particular, as an RNA-targeting enzyme, Cas13d can be uniquely applied to target specific RNA isoforms, or even circular, enhancer or antisense RNA molecules. RNA-directed approaches may also be optimal when targeting a single member of a local gene cluster, where alternative KRAB-mediated repressive strategies may ‘spread’ and introduce off-target effects35. CaRPool-seq can profile additional cellular modalities such as cell surface protein levels and, in the future, can be extended to additional molecular modalities including intracellular protein levels and chromatin accessibility. Moreover, the strategy of introducing multiple perturbations through cleavable arrays is extendable to other CRISPR systems, including Cas12 (ref. 36) and Cas7-11 (ref. 37), which represent promising extensions of this work. As combinatorial screens scale rapidly, we note that CaRPool-seq is compatible with pioneering approaches for targeted scRNA-seq, including hybridization-based 10X Targeted Gene Expression Panels6 or multiplexing PCR-based approaches38.
There are current limitations with CaRPool-seq that may be overcome by future advances. For example, RNA-targeting CRISPR proteins cannot currently activate gene expression through transcription or translation in mammalian cells39. Additionally, our work suggests a pooled cloning strategy for medium-sized CaRPool-seq CRISPR arrays; however, further optimizations may be required for long arrays with very large numbers of targeting gRNAs. Lastly, while we did not observe direct or indirect evidence for RfxCas13d’s promiscuous collateral activity in our CaRPool-seq experiments, other RNA-targeting CRISPR effectors37,40,41 may represent an alternative for future experiments.
Combinatorial screens have the potential to shed substantial new light on the structure of genetic regulatory networks, and also to identify combinatorial perturbations that achieve desirable cellular phenotypes. Our CaRPool-seq analysis of AML differentiation regulators benefited from recently developed computational frameworks to identify genetic interactions from multiplexed perturbation screens, and these types of data will be valuable resources for systematic reconstruction of complex pathways and cell circuits. Moreover, our identification of combinatorial perturbations that enhanced AML differentiation phenotypes was consistent with previous identification of efficacious multi-drug therapies, suggesting that future experiments may help to nominate candidates for combined drug treatments. We conclude that CaRPool-seq represents a powerful addition to the growing toolbox of methods for multiplexed single-cell perturbations.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41592-022-01705-x.
Methods
Pooled Cas13d library design and cloning
We designed two libraries for pooled cloning, one to identify genes that lead to THP1 cell differentiation (Extended Data Fig. 6) and one for combinatorial targeting with CaRPool-seq (Fig. 4).
First, we designed an RfxCas13d gRNA library for single gRNA expression targeting 439 individual genes. We selected 240 genes that led to CD11b or CD14 upregulation in Cas9 screens23, in addition to 199 control genes in TLR4-signaling. We selected the transcript with the highest isoform expression (CCLE, https://sites.broadinstitute.org/ccle/datasets) per gene and designed gRNAs using our Cas13design algorithm11. For each gene, we selected ten gRNAs from efficacy quartile Q4 (or Q3), spread along the coding region. Selected gRNAs had no secondary target sites with 0–2 mismatches to the cognate site42. In total, we designed 4,390 gRNAs and 410 NT control gRNAs (>3 mismatches to hg19-annotated transcripts). Library cloning has been described before11. In brief, pooled oligonucleotides (Twist) were amplified using 8× PCR reactions with eight amplification cycles using a direct repeat-specific forward primer (Supplementary Table 2). The amplicon was Gibson-cloned into pLentiRNAGuide_001 and pLentiRNAGuide_002 (Addgene nos. 138150 and 138151). Complete library representation with minimal bias (90th percentile/10th percentile crRNA read ratio of 1.8 for both libraries) was verified by Illumina sequencing (MiSeq).
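The library bias metric quoted above, the ratio of the 90th to the 10th percentile of per-crRNA read counts, can be computed as in the following sketch. The function name `skew_ratio`, the percentile interpolation method and the toy counts are assumptions for illustration; the exact percentile definition used for the reported values is not stated here.

```python
import statistics

def skew_ratio(read_counts):
    """Return the 90th/10th percentile ratio of per-guide read counts."""
    # quantiles(n=10) returns the nine decile cut points; the first and
    # last are the 10th and 90th percentiles.
    deciles = statistics.quantiles(read_counts, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    if p10 <= 0:
        raise ValueError("10th percentile is zero; library has heavy dropout")
    return p90 / p10

# Toy read counts per crRNA (not data from the study)
toy_counts = [80, 90, 95, 100, 100, 105, 110, 120, 130, 150]
print(round(skew_ratio(toy_counts), 2))
```

A ratio close to 1 indicates an evenly represented library, and larger values indicate that some guides are over- or under-represented.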
For the CaRPool-seq library, we manually inspected all gRNA enrichments from the pooled screen library described above. For each target gene, we picked the two most enriched (depleted for CD14/ATXN7L3) gRNAs, avoiding overlapping gRNAs. For each gene, we paired the two gRNAs with an NT gRNA (n = 28 single perturbations, n = 56 arrays). For 17 genes, we designed all pairwise combinations (n = 136 gene pairs, n = 272 arrays). For nine genes, we designed a subset of possible gene pairs within the same complex (n = 22 gene pairs, n = 44 arrays). We added 13 NT control arrays. In total, we designed 385 arrays with 186 single or double perturbations, each represented by two independent technical replicate gRNA combinations. For the bcgRNAs, we designed random 15mer sequences with a Hamming distance greater than four to one another. We balanced the relative CRISPR array abundance by the negative effect on cell proliferation of the targeted genes and increased the number of array copies in the pool to minimize dropout in the CaRPool-seq experiment. The oligos for synthesis were designed in the following way:
PCR-handle::BsmBI-site::gRNA1::DR::gRNA2::LguI-bridge::barcode::BsmBI-site::PCR-handle
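The barcode design criterion (random 15mers with a pairwise Hamming distance greater than four) can be sketched with simple rejection sampling. This is a minimal illustration, not the procedure used to build the library: the function names, the seed and the retry limit are assumptions, and filters that a real design would presumably need (for example, excluding the BsmBI and LguI recognition sites or long homopolymers) are omitted.

```python
import random

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def design_barcodes(n, length=15, min_distance=5, seed=0, max_tries=100_000):
    """Draw n random barcodes whose pairwise Hamming distance is >= min_distance.

    min_distance=5 encodes "greater than four". seed and max_tries are
    illustrative parameters.
    """
    rng = random.Random(seed)
    chosen = []
    for _ in range(max_tries):
        if len(chosen) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        # Keep the candidate only if it is far enough from every accepted barcode.
        if all(hamming(candidate, bc) >= min_distance for bc in chosen):
            chosen.append(candidate)
    if len(chosen) < n:
        raise RuntimeError("could not find enough barcodes; relax the constraints")
    return chosen

# One barcode per array in the 385-array library described above
barcodes = design_barcodes(385)
print(len(barcodes), barcodes[:3])
```

A minimum distance of five means that up to two sequencing errors in a barcode can still be assigned unambiguously to the nearest designed barcode.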
Pooled oligonucleotides (Twist, Supplementary Tables 1 and 2) were amplified using Pfu-Ultra-II following the manufacturer’s recommendation, using 1 μl of enzyme and 20 ng (1 ng μl−1) of the oligo pool in a 50 μl reaction (95 °C for 2 min; 5× (95 °C for 20 s, 58 °C for 20 s, 72 °C for 15 s); 72 °C for 3 min). The amplicon was 2× solid phase reversible immobilization (SPRI) purified, followed by BsmBI digestion and an additional 2× SPRI cleanup. All of the product was ligated into BsmBI-digested pLentiRNAGuide_003 (without evopreQ1 sequence element) using T7 DNA ligase and cloned as described in ref. 11 with >1,000 colonies per construct. The resulting plasmid pool was digested with LguI to enable ligation of the third direct repeat and small RNA-handle to complete the bcgRNA and CRISPR array. The LguI insert (Supplementary Table 2) was cloned into pLentiRNAGuide_001, digested with LguI and gel-purified (2% eGel). Complete library representation with minimal bias (90th percentile/10th percentile crRNA read ratio 2.6/4.8), and correct gene pair to bcgRNA linkage (>94%) was verified by Illumina sequencing (MiSeq). During library cloning, we noticed two critical details. First, the alternative polymerases KAPA and Q5 can lead to a stronger bias in relative array abundance. Second, reducing the number of PCR cycles with increased oligo pool input amounts can decrease bcgRNA reassortment. Last, while we chose a two-step cloning strategy, a single-step strategy may yield similar results. pLentiRNAGuide_003 has been deposited to Addgene (no. 192505).
Pooled CRISPR screening
Pooled Cas13d screens were performed as described previously (ref. 11), with minor modifications. Cas13d expression was induced after THP1 cells were fully selected (1 μg ml−1 doxycycline). Growth medium with fresh puromycin, blasticidin and doxycycline was replenished every 2–4 days, and cells were split as needed, always maintaining a guide representation of >1,000×.
For the single gRNA pooled screen, we collected a 1,000× representation at 7 and roughly 14 days post-Cas13d induction and before sorting. After 2 weeks (13–16 days) we stained 15 million cells (roughly 3,000× representation), using FcX-blocking buffer (BioLegend no. 422302; 10 min at room temperature) and followed by either CD11b (BioLegend clone ICRF44 no. 301322, 4 μl per 1 × 106 cells per 100 μl or CD14 (BioLegend clone HCD14 no. 325608, 4 μl per 1 × 106 cells per 100 μl staining (30 min at 4 °C), and finally resuspending cells on PBS with DAPI (4,6-diamidino-2-phenylindole) (Sigma no. D9542, 0.4 μg ml−1) to detect any apoptotic or dead cells. We sorted the cells (Sony SH800) based on their signal intensities (CD11b or CD14: lowest 10–15% and highest 10–15%). Cells were PBS-washed and frozen at −80 °C until sequencing library preparation. In total, we prepared four independent transductions (two multiplicities of infection (MOI) and two alternative direct repeats), performed CD14 sorts for all four transduction replicates, and CD11b sorts for three transduction replicates collecting 1 × 106 to 1.5 × 106 cells per bin.
For the combinatorial perturbation pooled screen, we performed three transduction replicates (MOI 0.13–0.20). Eight days post-Cas13d induction, we collected an input representation (>1,000× coverage) and stained 20–30 million cells with FcX-blocking, CD11b and DAPI as described above. Cells were CD11b-sorted (lowest 15% and highest 15% signal intensity). Cells were PBS-washed and frozen at −80 °C until sequencing library preparation. Library preparations for the single gRNA pooled screen were done as described before11. For the combinatorial targeting pooled screen, we adopted a PCR strategy similar to the CaRPool-seq bcgRNA readout. Pooled screen readout PCR1 remained unchanged. In PCR2, we amplified the 15 basepair (bp) barcode sequence using a soluble Nextera-Read1-CS1 feature capture primer including an optional 28 randomized bases mirroring UMI and cell barcode, and RPIx Read2 i7 index primer. The amplicon was completed in PCR3 using Feature SI primer 2 (10X Genomics) and P7 primer.
Pooled CRISPR screen analysis
Raw reads were demultiplexed based on Illumina i7 barcodes using bcl2fastq and, if applicable, by their custom in-read barcode using a custom Python script. For the single gRNA pooled screen, read1 sequencing reads were trimmed to the expected gRNA length by searching for known anchor sequences relative to the guide sequence using a custom Python script (https://github.com/hwessels/Cas13). For the combinatorial pooled screen, we extracted the first 15 bases in read2. For the single gRNA pooled screen, we collapsed (FASTX-Toolkit v.0.0.14) processed reads to count duplicates followed by string-match intersection with the reference to retain only perfectly matching alignments (average mapping rate 82.3%, median gRNA count 167). For the combinatorial pooled screen, preprocessed reads were aligned to the barcode reference using bowtie43 (v.1.1.2) with parameters -v 1 -m 1 --best --strata (average mapping rate 97%; median barcode read count 635; one barcode was not detected in input samples). For each dataset, raw counts were normalized using a median of ratios method as in DESeq2 (ref. 44) and batch corrected using ComBat implemented in sva (v.3.34.0)45. gRNA and bcgRNA enrichments were calculated by building the count ratios between a sorting bin or timepoint and the indicated reference sample followed by log2-transformation (log2FC). For every gRNA or bcgRNA, we considered the mean log2FC across replicates. For the single gRNA pooled screen, we used the four best performing gRNAs per target gene to calculate the mean log2FC, where we determined best as either highest or lowest depending on the sign of the mean enrichment across all ten gRNAs. As we have previously described11, we noticed that log2FC enrichments were generally more pronounced in samples using the enhanced direct repeat. Consistency between replicates and selected gRNAs was estimated using Robust Rank Aggregation (v.1.1)46.
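The normalization and enrichment steps described here can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the dictionary-based input and the pseudocount are assumptions made for illustration, and batch correction is omitted. It shows a median-of-ratios size factor in the spirit of DESeq2, the log2FC between a sorted bin and a reference sample, and the selection of the four best-performing gRNAs according to the sign of the mean enrichment.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts (same guide order).
    Guides with a zero count in any sample are skipped for the estimate.
    """
    samples = list(counts)
    n_guides = len(counts[samples[0]])
    usable = [i for i in range(n_guides)
              if all(counts[s][i] > 0 for s in samples)]
    # Log of the per-guide geometric mean across samples.
    log_geo_mean = {
        i: sum(math.log(counts[s][i]) for s in samples) / len(samples)
        for i in usable
    }
    return {
        s: math.exp(median(math.log(counts[s][i]) - log_geo_mean[i]
                           for i in usable))
        for s in samples
    }


def log2fc(counts, sample, reference, pseudocount=1.0):
    """log2 ratio of normalized counts (the pseudocount is an assumption)."""
    sf = size_factors(counts)
    return [
        math.log2((a / sf[sample] + pseudocount) /
                  (b / sf[reference] + pseudocount))
        for a, b in zip(counts[sample], counts[reference])
    ]


def best_guides_mean(guide_log2fcs, n_best=4):
    """Mean of the n_best guides, taking the highest values when the mean
    across all guides is positive and the lowest values otherwise."""
    overall = sum(guide_log2fcs) / len(guide_log2fcs)
    ranked = sorted(guide_log2fcs, reverse=overall > 0)
    top = ranked[:n_best]
    return sum(top) / len(top)
```

For example, `log2fc({"input": [...], "high_bin": [...]}, "high_bin", "input")` returns one enrichment value per guide, which can then be averaged across replicates before calling `best_guides_mean` on the ten values of a target gene.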
For the combinatorial pooled screen, we calculated the mean of both replicate arrays per gene pair. We noticed GFI1 g2 did not lead to strong effects in the pooled screen and in the CaRPool-seq experiment. The technical replicate arrays including GFI1 g2 were removed in all analyses. Enrichments are available in Supplementary Tables 4 and 5 (P values derived from Robust Rank Aggregation).
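The barcode assignment used for the combinatorial pooled screen (first 15 bases of read2, at most one mismatch, unique best hit) can be approximated without an aligner. The sketch below is an assumption-laden stand-in for the bowtie call, not a reimplementation of it: it compares only the forward strand, uses Hamming distance, and the reference dictionary and its barcode names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read2, reference, length=15, max_mismatch=1):
    """Return the name of the uniquely best-matching barcode, or None.

    reference: dict mapping barcode name -> barcode sequence (hypothetical).
    Reads are rejected when no barcode is within max_mismatch, or when
    several barcodes tie at the best mismatch level (loosely mirroring
    -v 1 -m 1 --best --strata).
    """
    query = read2[:length]
    if len(query) < length:
        return None
    best_dist, best_names = None, []
    for name, seq in reference.items():
        dist = hamming(query, seq)
        if dist > max_mismatch:
            continue
        if best_dist is None or dist < best_dist:
            best_dist, best_names = dist, [name]
        elif dist == best_dist:
            best_names.append(name)
    return best_names[0] if len(best_names) == 1 else None
```

Counting the returned names over all reads of a sample gives the per-barcode read counts that enter normalization.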
Direct capture Perturb-seq
Monoclonal CRISPR–Cas effector protein-expressing cell lines (Cas9-nuclease, KRAB–dCas9, KRAB–dCas9–MeCP2) were infected with one of six sgRNA pools (KO-1, KO-2, KO-3 or CRISPRi-1, CRISPRi-2 and CRISPRi-3) (Supplementary Table 1), providing 1–3 sgRNAs in a single vector and a total of nine cell line pool combinations. Cell survival after selection ranged between 1.7 and 5.5% (MOI < 0.1), assuring a high single integration probability. Viral titers were confirmed by measuring the fraction of BFP-positive cells for pools that received vectors carrying 2+ sgRNAs using flow cytometry. Cells were passaged every 2–3 days (replenishing puromycin and blasticidin at each split) maintaining high sgRNA representation (>1,000× coverage). We confirmed that >98% of cells were BFP-positive before the 10X experiment. We performed 10X (Chromium Single Cell 3′ Gene Expression v.3 with Feature Barcoding technology for CRISPR screening, nos. 1000074, 1000075 and 1000079) 12 days post-transduction. Cells were stained with a pool of five TotalSeq-A antibodies (0.75 μg per antibody per 2 × 10^6 cells) (Supplementary Table 6) following the CITE-seq protocol12. In addition, we used Cell hashing22 (Supplementary Table 7) to track the nine cell line pool combinations. Before the run, cell viability was determined (≥96%). We ran one 10X lane, leveraging our hashed experimental design loading 38,600 cells. mRNA, sgRNA feature, hashtags (hashtag-derived oligos, HTOs), protein (antibody-derived oligos, ADTs) libraries were constructed by following 10X Genomics Cell hashing and CITE-seq protocols12,22. All libraries were sequenced together on one NextSeq 75 cycle high-output run.
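The link between low survival after selection and a high single-integration probability can be made explicit with a Poisson model. The following sketch assumes that integrations per cell are Poisson-distributed and that the surviving fraction equals the fraction of cells with at least one integration; both are simplifying assumptions not stated in the text, and the function names are illustrative.

```python
import math


def moi_from_survival(surviving_fraction):
    """Poisson MOI implied by the fraction of cells surviving selection,
    using P(at least one integration) = 1 - exp(-MOI)."""
    return -math.log(1.0 - surviving_fraction)


def single_integration_fraction(moi):
    """P(exactly one integration | at least one integration)."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


for survival in (0.017, 0.055):
    moi = moi_from_survival(survival)
    print(survival, round(moi, 4), round(single_integration_fraction(moi), 3))
```

Under these assumptions, the reported survival range of 1.7 to 5.5% corresponds to roughly 97 to 99% of surviving cells carrying a single integration.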
Direct capture Perturb-seq analysis
Gene expression data were mapped to the hg38 (Ensembl v.97) genome reference using Cellranger (v.3.0.1). Guide RNA reads were mapped simultaneously to a sgRNA feature reference (Supplementary Table 1). Before feature mapping, we performed 5′ adapter trimming using cutadapt to account for varying lengths of poly-G tracts 5′ to the sgRNA feature (first -g AAGCAGTGGTATCAACGCAGAGTACAT -O 5; then -O 1 -e 0 -g XGGGGGGGGGG) and trimmed the resulting reads to a length of 18 bases. We used the CITE-seq-count package (v.1.4.2) for HTO and ADT quantification. Count matrices were then used as input into the Seurat R package (v.4.0)47 to perform downstream analyses. We detected 16,842 cells. HTO and sgRNA counts were normalized using the centered log-ratio transformation approach (margin of 2). To assign experimental conditions and remove cell doublets, we used the HTODemux function in Seurat, with default parameters.
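As an illustration of the centered log-ratio step, the sketch below applies a log1p-based CLR to the feature counts of one cell. It assumes that a margin of 2 means the transform is computed within each cell across features, and it uses a log1p-based variant of the geometric mean; both points should be checked against the Seurat version in use, and the function name is hypothetical.

```python
import math


def clr_per_cell(counts):
    """log1p-based centered log-ratio transform for one cell.

    counts: raw counts of all HTO (or sgRNA) features in a single cell.
    The denominator is exp(sum(log1p(x)) / n_features), computed over the
    nonzero counts but divided by the total number of features.
    """
    positive = [x for x in counts if x > 0]
    log_mean = sum(math.log1p(x) for x in positive) / len(counts)
    denominator = math.exp(log_mean)
    return [math.log1p(x / denominator) for x in counts]


# Example: a cell dominated by the third hashtag.
print(clr_per_cell([10, 0, 100]))
```

Features with zero counts stay at zero, and the dominant feature receives the largest transformed value, which is the property the demultiplexing step relies on.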
For sgRNA assignment, we customized HTODemux to return identities of second and third sgRNA without changing the underlying modeling approach. We flagged cells with an incorrect number of expected sgRNAs based on the HTO pool assignment. Furthermore, we flagged cells with an unexpected combination of sgRNAs not present in the sgRNA pool used to transduce the cells.
For the analysis shown in Fig. 3, we only retained cells with the correct sgRNA numbers and identities. ADT counts were log-normalized, before running ScaleData (do.scale=FALSE, vars.to.regress=Perturb-Seq.approach). PCA was performed on normalized ADT counts using all five features, followed by uniform manifold approximation and projection (UMAP) dimensional reduction using four dimensions. To compare target knockdown across Perturb-seq approaches for NT cells and cells that received all three (s)gRNAs (CD46, CD55, CD71), we normalized cellular ADT counts using median of ratios across ADT features that were not targeted (CD29, CD56) to derive a scaling factor per cell, and divided the normalized ADT counts by the mean ADT counts in NT cells for each Perturb-seq approach.
CaRPool-seq experiments
We transduced and treated Cas13d-NLS expressing HEK293FT, NIH/3T3 or THP1 cells as described in the Supplementary Information. In the species mixing, we used a pool of three bcgRNAs per species together with NT gRNAs. The HEK293FT CaRPool-seq experiment included 29 CRISPR arrays (Supplementary Table 1) barcoding a diverse set of array configurations around four gRNAs that allowed us to assess gRNA positioning within the CRISPR array, effects of the relative gRNA amount per cell and combinatorial targeting of multiple RNA transcripts. CaRPool-seq species mixing and CaRPool-seq were conducted simultaneously in one lane of 10X Genomics 3′ kit. CaRPool-seq was performed on THP1 cells 5 days post-Cas13d induction (1 μg ml−1 doxycycline) using four lanes of a 10X Genomics 3′ kit. THP1 CaRPool-seq library design and cloning were described above. Before the runs, cell viability was determined to be ≥95% for each experiment.
The HEK293FT CaRPool-seq experiment was stained with a pool of five TotalSeq-A antibodies (0.75 μg per antibody per 2 × 10^6 cells) (Supplementary Table 6) following the CITE-seq protocol12. Similarly, THP1 cells were first treated with FcX-blocking buffer (BioLegend no. 422302, 10 min at room temperature), before staining cells with a pool of 22 TotalSeq-A antibodies (Supplementary Table 6). To keep track of the experiment identity and identify multiplets, samples were hashed (subsequent to CITE-seq antibody staining) (Supplementary Table 7) following the Cell Hashing protocol22. mRNA, hashtags (HTOs), protein (antibody-derived oligos, ADTs) libraries were constructed by following 10X Genomics Cell hashing and CITE-seq protocols12,22.
Species mixing and HEK293FT CaRPool-seq experiment libraries were sequenced together on one NextSeq 75 cycle high-output run. THP1 CaRPool-seq libraries were sequenced on NovaSeq6000 using the XP S4 2 × 100 v.1.5 workflow. Sequencing reads coming from the mRNA library were mapped to a joint genome reference of hg38 (Ensembl v.97) and mm10 using the Cellranger software (v.3.0.1), or to hg38 using Cellranger v.6.0.0 for the THP1 experiment. bcgRNA library reads were mapped simultaneously to a barcode reference (Supplementary Table 1) using Cellranger. To generate count matrices for HTO and ADT libraries, the CITE-seq-count package (v.1.4.2) was used (https://github.com/Hoohm/CITE-seq-Count). Count matrices were then used as input into the Seurat R package (v.4.0)47 to perform all downstream analyses.
CaRPool-seq library preparation
We used Cas13 CRISPR array configurations of type X (Fig. 1a and Extended Data Fig. 2). Specifically, the bcgRNA was placed in the last array position and entailed a spacer sequence composed of a five-prime Illumina small RNA PCR handle, a 15mer barcode and a three-prime capture sequence 1 (CS1) compatible with 10X Genomics feature barcoding. This composition allowed the specific amplification of a bcgRNA amplicon with a unique combination of forward and reverse primers. Moreover, usage of the Illumina 5′ PCR handle allows for efficient sequencing of the bcgRNA amplicon with the first base of read2 being the first barcode base. In our last experiment (Fig. 5), we added structured RNA elements 3′ to the CS1 sequence.
CaRPool-seq experiments were conducted using the 10X Genomics 3′ kit (Chromium Single Cell 3′ Gene Expression v3 with feature barcoding technology for CRISPR screening, nos. 1000074, 1000075 and 1000079). Library construction for bcgRNA-derived oligos is outlined in Extended Data Fig. 2 and largely followed 10X Genomics user guide CG000184 Rev C with some modifications. Specifically, we eluted the GEM-RT in 33 μl and added 2 μl containing 0.4 μM ADT additive primer (for bcgRNAs and ADTs) and 0.2 μM HTO additive primer before complementary DNA amplification. The cDNA was purified using 0.6× SPRI cleanup for the mRNA fraction. The supernatant containing ADT, HTO and bcgRNA cDNA was purified by adding another 1.4× SPRI (0.6 + 1.4 = 2× SPRI) followed by a second 2× SPRI cleanup. The purified short fragments were split into three pools (for example, 3 × 20 μl). One pool each was used for HTO and ADT library construction as described before12,22. Half of the remaining pool (10 μl) was used to construct the bcgRNA library using two PCR recipes. PCR1 adds Illumina P5 and P7 handles to the bcgRNA amplicon (100 μl of PCR1: 50 μl of 2× KAPA Hifi PCR Mastermix, up to 45 μl of bcgRNA PCR template, 2.5 μl of Feature SI Primers 2 (10 μM), 2.5 μl of TruSeq Small-RNA RPIx primer (containing i7 index, 10 μM); 95 °C 3 min, 12× (95 °C 20 s, 60 °C 8 s, 72 °C 8 s), 72 °C 1 min). The 1.6× SPRI-purified PCR1 product was amplified in PCR2 (100 μl: 50 μl of 2× KAPA Hifi PCR Mastermix, up to 45 μl of PCR1 product, 2.5 μl of P5 primer (10 μM), 2.5 μl of P7 primer (10 μM); 95 °C 3 min, 4× (95 °C 20 s, 60 °C 8 s, 72 °C 8 s), 72 °C 1 min). The final bcgRNA amplicon (203 bp) can be sequenced with standard Illumina sequencing primers (≥28 cycles read1 and ≥15 cycles read2) (Extended Data Figs. 2 and 3a).
CaRPool-seq data analysis
Cells from species-mixing and HEK293FT CaRPool-seq experiments were processed together. Cells with <2,500 UMI were removed. HTO and bcgRNA counts were normalized using the centered log-ratio transformation approach (margin of 2). We used HTODemux to identify cell doublets and assign experimental conditions. Only human cells were hashed, with mouse NIH/3T3 cells being the only cell population without a hashtag. We removed all hashing doublets within the CaRPool-CITE-seq experiment (HTO-01 to HTO-08) and between these cells and human cells in the species-mixing experiment (HTO-10). In addition, we removed all cells labeled with a single HTO-01 to HTO-08 if the fraction of mouse reads was >10%, and cells without any HTO if fewer than 10% of their reads were mouse reads. In this way, we removed all doublets between CaRPool-seq species mixing and CaRPool-CITE-seq experiments while retaining potential collisions/doublets within the species-mixing experiment. At this point, the experiment was split into two separate objects. For the species-mixing experiment, we determined species identity by quantifying the fraction of human reads for RNA and for the species-specific bcgRNAs (human >0.9, mouse <0.1, collision 0.9 to 0.1). For the HEK293FT CaRPool-CITE-seq experiment, RNA counts were log-normalized using the standard Seurat workflow after removing all mouse features and RNA counts. bcgRNA identity was determined using MultiSeqDemux (autoThresh=T). Cells without a bcgRNA assigned and cells with multiple bcgRNA assignments were removed. Differential expression analyses were done using FindMarkers (Wilcoxon’s rank-sum test, pseudocount.use of 1 × 10^−4). We converted log2 fold changes to percent knockdown for each target gene in each of the 26 targeting conditions and took the mean to calculate the average target knockdown.
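Two of the small computations in this paragraph can be written out directly. The sketch below classifies a cell by its fraction of human reads using the stated thresholds, and converts a log2 fold change into percent knockdown. The conversion formula, (1 − 2^log2FC) × 100, is the natural reading of the text but is an assumption here, as are the function names.

```python
def classify_species(human_fraction, upper=0.9, lower=0.1):
    """Assign species from the fraction of human reads
    (human > 0.9, mouse < 0.1, otherwise collision)."""
    if human_fraction > upper:
        return "human"
    if human_fraction < lower:
        return "mouse"
    return "collision"


def percent_knockdown(log2_fold_change):
    """Percent reduction relative to control implied by a log2 fold change
    (assumed conversion: (1 - 2**log2FC) * 100)."""
    return (1.0 - 2.0 ** log2_fold_change) * 100.0


def mean_knockdown(log2_fold_changes):
    """Average percent knockdown over a set of targeting conditions."""
    values = [percent_knockdown(x) for x in log2_fold_changes]
    return sum(values) / len(values)
```

With this conversion, a log2FC of −1 corresponds to 50% knockdown and a log2FC of −2 to 75% knockdown; averaging is done on the percent scale, after conversion.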
For the THP1 experiment we detected 52,496 single cells (nFeature_RNA > 1,000, nFeature_RNA < 8,000, percent.mt < 20) after HTO demultiplexing using HTODemux as described above. Model-based bcgRNA assignments (HTODemux or MultiSeqDemux) did not yield satisfactory results supported by the observed phenotypic changes, likely due to model limitations imposed by the high number of bcgRNA features. Instead, we assigned bcgRNAs to single cells by applying the following rules. We compared UMI counts for the bcgRNA with the highest UMI count (g1) to, if present, the second detected bcgRNA (g2). bcgRNA counts for g2 may derive from spurious counts arising from library preparation, or from integration of more than one viral element (bcgRNA multiplet). We considered cells with g1 < 5 as negative. We assigned g1 if: (1) g1 = (5–9) and g2 = (0–1) or (2) g1 > 9 and g1/(g1 + g2) > 0.8 and g2 < 11. All other cells were considered bcgRNA multiplets. We assigned 31,308 cells with a single bcgRNA. Comparing differential gene expression results for technical replicates embedded in the CaRPool-seq library, we noticed GFI1 g2 did not lead to upregulation of CD11b ADT or upregulation of the expected gene expression signature. We removed all cells with GFI1 g2 (n = 601).
Changes in cell surface protein ADT levels for each gene pair or individual CRISPR array were calculated using Wilcoxon’s rank-sum test in FindMarkers relative to NT control cells. Changes were determined by repeating the differential expression analysis ten times with ≤30 randomly sampled cells per cell group to account for differing numbers of cells, followed by averaging. We compared differential CD11b expression between single and dual perturbations by comparing the log2-transformed fold changes of dually perturbed cells to the log2-transformed mean fold changes of the two single-gene perturbations.
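The rule-based bcgRNA assignment can be expressed as a short function. This is a sketch of the stated thresholds only; the function name, the dictionary input and the returned labels are illustrative choices, not the authors' code.

```python
def assign_bcgrna(umi_counts):
    """Classify one cell from its bcgRNA UMI counts.

    umi_counts: dict mapping bcgRNA name -> UMI count in this cell.
    Returns (label, bcgRNA name or None), where label is one of
    "negative", "singlet" or "multiplet".
    """
    ranked = sorted(umi_counts.items(), key=lambda kv: kv[1], reverse=True)
    g1_name, g1 = ranked[0] if ranked else (None, 0)
    g2 = ranked[1][1] if len(ranked) > 1 else 0

    if g1 < 5:
        return ("negative", None)
    # Rule 1: modest top count with at most one stray UMI for the runner-up.
    if 5 <= g1 <= 9 and g2 <= 1:
        return ("singlet", g1_name)
    # Rule 2: high top count that dominates a low-count runner-up.
    if g1 > 9 and g1 / (g1 + g2) > 0.8 and g2 < 11:
        return ("singlet", g1_name)
    return ("multiplet", None)
```

For instance, a cell with counts of 50 and 10 for its two most abundant bcgRNAs is called a singlet (50/60 ≈ 0.83), whereas counts of 20 and 8 yield a multiplet (20/28 ≈ 0.71).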
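The repeated subsampling and the single-versus-dual comparison can be sketched as follows. This is a simplified illustration under stated assumptions: a ratio of group means with a pseudocount stands in for the FindMarkers fold change, the "log2-transformed mean fold change" is read as the log2 of the mean of the two linear fold changes, and all function names and defaults other than the ten repeats and 30 cells are hypothetical.

```python
import math
import random


def subsampled_log2fc(group, control, n_repeats=10, max_cells=30,
                      pseudocount=1.0, seed=0):
    """Average log2 fold change over repeated subsamples of at most
    max_cells cells per group (values are per-cell normalized ADT levels)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        g = rng.sample(group, min(max_cells, len(group)))
        c = rng.sample(control, min(max_cells, len(control)))
        ratio = (sum(g) / len(g) + pseudocount) / (sum(c) / len(c) + pseudocount)
        estimates.append(math.log2(ratio))
    return sum(estimates) / len(estimates)


def expected_from_singles(log2fc_a, log2fc_b):
    """log2 of the mean linear fold change of the two single perturbations."""
    return math.log2((2.0 ** log2fc_a + 2.0 ** log2fc_b) / 2.0)


def dual_excess(log2fc_ab, log2fc_a, log2fc_b):
    """Positive values: the dual perturbation exceeds the single-gene mean."""
    return log2fc_ab - expected_from_singles(log2fc_a, log2fc_b)
```

A `dual_excess` near zero indicates that the double perturbation behaves like the average of its single perturbations, while clearly positive or negative values point to enhancement or suppression of the CD11b phenotype.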
Cas13d gRNA off-target evaluation
To identify potential Cas13d gRNA off-target binding sites, we first aligned gRNAs to the human transcriptome (GRCh38 cdna.all and noncoding RNA from Ensembl release 97) using blastn (v.2.6.0) (megablast) with the following parameters (-strand minus -max_target_seqs 10000 -evalue 10000 -word_size 5 -perc_identity 0.7). Second, candidates were further filtered to match with at least 17 bases (as shorter matches do not lead to target knockdown) and to show a blastn e-value of <100. In Extended Data Fig. 4a, we demonstrate that despite the potential for off-target binding, we do not observe transcriptomic perturbation for these genes.
Cas13d collateral activity was evaluated by comparing expression levels of mitochondrial genes15 in cells expressing targeting gRNAs versus NT gRNAs using FindMarkers (Wilcoxon’s rank-sum test, pseudocount.use of 1 × 10^−4). To assess differences in cell fitness, we classified single-cell transcriptomes into gene expression programs usually observed at different cell cycle stages (Seurat’s CellCycleScoring), and compared the distribution of cells per cell cycle stage between groups of cells.
Modeling of genetic interactions in single-cell data
To decompose transcriptomic profiles of double perturbation, we used a linear regression model as previously introduced7 and implemented it in R. First, we z-scaled the log-normalized gene expression counts for all cells with respect to the mean and standard deviation of the control group (NT cells). In this way, we have subtracted the baseline expression profiles from each cell and can directly compare the deviation from each perturbation to NT conditions. Next, we grouped cells by gene pair and calculated pseudobulk z-scaled profiles (single perturbations (a, b), and double perturbation (ab)) by calculating the mean across cells for each feature. The average NT-cell profile returns a vector of all zeros. We generated average profiles for 1,530 genes with an average UMI count >0.5. We included gene pairs when all cell groups were represented by at least 25 cells (Examples in Fig. 4g and Extended Data Fig. 8d–h).
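The scaling and pseudobulk steps can be illustrated with a small pure-Python sketch. It assumes per-cell expression vectors in a shared gene order, uses the sample standard deviation of the NT group (whether the sample or population estimate was used is not stated), and sets genes with zero variance in NT cells to zero; the function names are hypothetical.

```python
from statistics import mean, stdev


def z_scale_to_control(cells, control_cells):
    """z-scale each gene using the mean and standard deviation of NT cells.

    cells, control_cells: lists of per-cell log-normalized expression
    vectors with identical gene order.
    """
    n_genes = len(control_cells[0])
    mu = [mean([c[g] for c in control_cells]) for g in range(n_genes)]
    sd = [stdev([c[g] for c in control_cells]) for g in range(n_genes)]
    return [
        [(cell[g] - mu[g]) / sd[g] if sd[g] > 0 else 0.0
         for g in range(n_genes)]
        for cell in cells
    ]


def pseudobulk(z_cells):
    """Mean z-scaled profile across the cells of one perturbation group."""
    n_genes = len(z_cells[0])
    return [mean([cell[g] for cell in z_cells]) for g in range(n_genes)]
```

Applying `pseudobulk(z_scale_to_control(nt_cells, nt_cells))` returns a vector of zeros, which matches the statement that the average NT-cell profile is the zero vector.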
As previously introduced7, we model the average z-scaled profiles using:
$$\delta_{ab} = c_1 \delta_a + c_2 \delta_b + \epsilon$$
with δa and δb being the pseudobulk z-scaled profiles for cells assigned to single perturbations a and b, respectively, while δab is the pseudobulk z-scaled profile for cells assigned to double perturbation ab. c1 and c2 are constants fitted to the data indicating the relative weight of δa and δb profiles. The vector ϵ collects the residuals to the model fit. In our plots, a is the first gene in the gene pair and b is the second gene. c1 corresponds to a, and c2 to b.
We implemented the previously introduced model-fitting procedure7, using the rlm function from the MASS package (v.7.3-58.1), and extracted the mean coefficients (c1 and c2) and residual error ϵ. We collected six measures to evaluate the fit as described before7 (dcor function in energy package v.1.7-10):
Model fit: dcor (c1a + c2b, ab)
Dominance: |log10 (c1/c2)|
Magnitude: (c1^2 + c2^2)^(1/2)
Similarity of single to double profiles: dcor ((a,b), ab)
Similarity of single profiles: dcor (a,b)
Equality of contribution: min (dcor (a,ab), dcor (b,ab))/max (dcor (a,ab), dcor (b,ab))
Each feature and its interpretation are described in detail in ref. 7. Features were scaled (margin of 2) before hierarchical clustering (dist, euclidean; method, ward) to generate a dendrogram, shown in Fig. 4g. For clarity, the heatmap in Fig. 4g shows unscaled values.
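To make the two-coefficient model and the coefficient-based measures concrete, the sketch below fits c1 and c2 to pseudobulk profiles and derives magnitude and dominance. It deliberately swaps in ordinary least squares without an intercept for the robust rlm fit used in the study, so its coefficients can differ in the presence of outlier genes; the distance-correlation measures are omitted, and all function names are illustrative.

```python
import math


def fit_two_coefficients(a, b, ab):
    """Least-squares fit of ab ≈ c1 * a + c2 * b (no intercept).

    a, b, ab: pseudobulk z-scaled profiles (equal-length lists).
    Returns (c1, c2, residuals). Ordinary least squares is used here as a
    simple stand-in for a robust regression.
    """
    saa = sum(x * x for x in a)
    sbb = sum(x * x for x in b)
    sab = sum(x * y for x, y in zip(a, b))
    say = sum(x * y for x, y in zip(a, ab))
    sby = sum(x * y for x, y in zip(b, ab))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("single-perturbation profiles are collinear")
    c1 = (say * sbb - sby * sab) / det
    c2 = (sby * saa - say * sab) / det
    residuals = [y - (c1 * x + c2 * z) for x, z, y in zip(a, b, ab)]
    return c1, c2, residuals


def magnitude(c1, c2):
    """(c1^2 + c2^2)^(1/2)"""
    return math.hypot(c1, c2)


def dominance(c1, c2):
    """|log10(c1 / c2)|; undefined (NaN) unless c1 and c2 share a sign."""
    if c2 == 0 or c1 / c2 <= 0:
        return math.nan
    return abs(math.log10(c1 / c2))
```

A dominance close to zero indicates that both single perturbations contribute similarly to the double-perturbation profile, whereas large values indicate that one gene dominates; large residuals flag genes whose response is not captured by the additive model.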
The example interactions shown in Fig. 4h and Extended Data Fig. 8d–h depict the union of top 20 differentially expressed genes for each cell group (a, ab) relative to NT cells (selected by P value) derived using the Wilcoxon’s rank-sum test in FindMarkers. Model prediction and residuals are derived from the modeling approach described above. The color scale represents the average z-score normalized expression per gene pair.
Acknowledgements
We thank the Technology Innovation laboratory as well as all members of the Sanjana and Satija laboratories for helpful discussions. We are grateful to Z. Daniloski for cloning the Cas9-effector protein plasmids, to I. Aifantis for advice and helpful discussion related to THP1 experiments, and the Technology Innovation laboratory for generously sharing CITE-seq reagents. N.E.S. and R.S. are supported by New York University and New York Genome Center startup funds. N.E.S. is further supported by DARPA (grant no. D18AP00053), the Brain and Behavior Foundation, the Cancer Research Institute, the National Institutes of Health (NIH)/National Human Genome Research Institute (grant no. DP2HG010099) and the NIH/National Cancer Institute (grant no. R01CA218668). R.S. is supported by the Chan Zuckerberg Initiative (grant nos. EOSS-0000000082 to R.S., HCA-A-1704-01895 to P.S. and R.S.), and the National Institutes of Health (grant nos. DP2HG009623-01 to R.S. and RM1HG011014-01 to P.S. and R.S.).
Footnotes
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Competing interests
In the past 3 years, R.S. has worked as a consultant for Bristol-Myers Squibb, Regeneron and Kallyope, and served as a SAB member for ImmunAI, Apollo Life Sciences GmbH, Nanostring and the New York City Pandemic Response Laboratory. N.E.S. is an advisor to Vertex and QIAGEN and is a cofounder of OverT Bio. P.S. is a coinventor on a patent related to protein detection by sequencing as described in this work. The New York Genome Center and New York University have applied for patents relating to the work in this article. The remaining authors declare no competing interests.
Extended data is available for this paper at https://doi.org/10.1038/s41592-022-01705-x.
Supplementary information: The online version contains supplementary material available at https://doi.org/10.1038/s41592-022-01705-x.
Data availability
Raw and processed sequencing data have been made available on the National Center for Biotechnology Information Gene Expression Omnibus under the accession number GSE213957. ECCITE-seq data used in this study are available at the Gene Expression Omnibus (GSE146469, ref. 23). Source data are provided with this paper.
References
1. Dixit A et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).
2. Datlinger P et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).
3. Jaitin DA et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896 (2016).
4. Mimitou EP et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).
5. Adamson B et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882 (2016).
6. Replogle JM et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat. Biotechnol. 38, 954–961 (2020).
7. Norman TM et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786–793 (2019).
8. Michlits G et al. Multilayered VBC score predicts sgRNAs that efficiently generate loss-of-function alleles. Nat. Methods 17, 708–716 (2020).
9. Papalexi E et al. Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens. Nat. Genet. 53, 322–331 (2021).
10. Konermann S et al. Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors. Cell 173, 665–676 (2018).
11. Wessels HH et al. Massively parallel Cas13 screens reveal principles for guide RNA design. Nat. Biotechnol. 38, 722–727 (2020).
12. Stoeckius M et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
13. Tuladhar R et al. CRISPR-Cas9-based mutagenesis frequently provokes on-target mRNA misregulation. Nat. Commun. 10, 1–10 (2019).
14. Burris BJD, Molina Vargas AM, Park BJ & O’Connell MR. Optimization of specific RNA knockdown in mammalian cells with CRISPR-Cas13. Methods 206, 58–68 (2022).
15. Shi P et al. RNA-guided cell targeting with CRISPR/RfxCas13d collateral activity in human cells. Preprint at bioRxiv 10.1101/2021.11.30.470032 (2021).
16. Kelley CP, Haerle MC & Wang ET. Negative autoregulation mitigates collateral RNase activity of repeat-targeting CRISPR-Cas13d in mammalian cells. Cell Rep. 40, 111226 (2022).
17. Gilbert LA et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442 (2013).
18. Morris JA et al. Discovery of target genes and pathways of blood trait loci using pooled CRISPR screens and single cell RNA sequencing. Preprint at bioRxiv 10.1101/2021.04.07.438882 (2021).
19. Yeo NC et al. An enhanced CRISPR repressor for targeted mammalian gene regulation. Nat. Methods 15, 611–616 (2018).
20. Doench JG et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184–191 (2016).
21. Sanson KR et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018).
22. Stoeckius M et al. Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol. 19, 224 (2018).
23. Wang E et al. Surface antigen-guided CRISPR screens identify regulators of myeloid leukemia differentiation. Cell Stem Cell 28, 718–731 (2021).
24. HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019).
25. Regev A et al. The human cell atlas. eLife 6, 1–30 (2017).
26. Granja JM et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat. Biotechnol. 37, 1458–1465 (2019).
27. Oetjen KA et al. Human bone marrow assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry. JCI Insight 3, e124928 (2018).
28. Nicosia L et al. Pharmacological inhibition of LSD1 triggers myeloid differentiation by targeting GSE1 oncogenic functions in AML. Oncogene 10.1038/s41388-021-02123-7 (2021).
29. Fiskus W et al. Highly effective combination of LSD1 (KDM1A) antagonist and pan-histone deacetylase inhibitor against human AML cells. Leukemia 28, 2155–2164 (2014).
30. Nelson JW et al. Engineered pegRNAs improve prime editing efficiency. Nat. Biotechnol. 10.1038/s41587-021-01039-7 (2021).
31. Méndez-Mancilla A et al. Chemically modified guide RNAs enhance CRISPR-Cas13 knockdown in human cells. Cell Chem. Biol. 10.1016/j.chembiol.2021.07.011 (2021).
32. Zhang C et al. Structural basis for the RNA-guided ribonuclease activity of CRISPR-Cas13d. Cell 175, 212–223 (2018).
33. Akiyama BM et al. Zika virus produces noncoding RNAs using a multi-pseudoknot structure that confounds a cellular exonuclease. Science 354, 1148–1152 (2016).
34. Anzalone AV, Lin AJ, Zairis S, Rabadan R & Cornish VW. Reprogramming eukaryotic translation with ligand-responsive synthetic RNA switches. Nat. Methods 13, 453–458 (2016).
35. Lensch S et al. Dynamic spreading of chromatin-mediated gene silencing and reactivation between neighboring genes in single cells. eLife 11, e75115 (2022).
36. Campa CC, Weisbach NR, Santinha AJ, Incarnato D & Platt RJ. Multiplexed genome engineering by Cas12a and CRISPR arrays encoded on single transcripts. Nat. Methods 16, 887–893 (2019).
37. Özcan A et al. Programmable RNA targeting with the single-protein CRISPR effector Cas7-11. Nature 597, 720–725 (2021).
38. Schraivogel D et al. Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat. Methods 17, 629–635 (2020).
39. Otoupal PB, Cress BF, Doudna JA & Schoeniger JS. CRISPR-RNAa: targeted activation of translation using dCas13 fusions to translation initiation factors. Nucleic Acids Res. 50, 8986–8998 (2022).
40. Tong H et al. High-fidelity Cas13 variants for targeted RNA degradation with minimal collateral effects. Nat. Biotechnol. 10.1038/s41587-022-01419-7 (2022).
41. Wei J et al. Deep learning and CRISPR-Cas13d ortholog discovery for optimized RNA targeting. Preprint at bioRxiv 10.1101/2021.09.14.460134 (2022).
42. Guo X et al. Transcriptome-wide Cas13 guide RNA design for model organisms and viral RNA pathogens. Cell Genomics 1, 100001 (2021).
43. Langmead B, Trapnell C, Pop M & Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
44. Love MI, Huber W & Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
45. Leek JT, Johnson WE, Parker HS, Jaffe AE & Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
46. Kolde R, Laur S, Adler P & Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics 28, 573–580 (2012).
47. Hao Y et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).