Abstract
The systematic perturbation of genomes using CRISPR/Cas9 deciphers gene function at an unprecedented rate, depth and ease. Commercially available sgRNA libraries typically contain tens of thousands of pre-defined constructs, resulting in a complexity challenging to handle. In contrast, custom sgRNA libraries comprise gene sets of self-defined content and size, facilitating experiments under complex conditions such as in vivo systems. To streamline and upscale cloning of custom libraries, we present CLUE, a bioinformatic and wet-lab pipeline for the multiplexed generation of pooled sgRNA libraries. CLUE starts from lists of genes or pasted sequences provided by the user and designs a single synthetic oligonucleotide pool containing various libraries. At the core of the approach, a barcoding strategy for unique primer binding sites allows amplifying different user-defined libraries from one single oligonucleotide pool. We prove the approach to be straightforward, versatile and specific, yielding uniform sgRNA distributions in all resulting libraries, virtually devoid of cross-contaminations. For in silico library multiplexing and design, we established an easy-to-use online platform at www.crispr-clue.de. All in all, CLUE represents a resource-saving approach to produce numerous high quality custom sgRNA libraries in parallel, which will foster their broad use across molecular biosciences.
INTRODUCTION
Technical advances in functional genetics contribute to a growing understanding of key processes in physiology and disease. The ability to systematically perturb genomes via CRISPR/Cas9 allows assigning quantifiable phenotypes to genomic manipulations in a high-throughput format, complementing descriptive sequencing data by a functional dimension (1,2). Pooled CRISPR screens utilize the massive parallel perturbation of genes across a cell population in order to identify functional regulators of biological processes (3–5). Such CRISPR screens have mostly been conducted in tissue culture, but also in vivo (6,7).
A major limitation for performing pooled CRISPR screens is the availability of suitable sgRNA libraries (8). For example, most in vivo model systems do not allow studying large genome-wide libraries, as insufficient coverage induces severe library bottlenecking; they therefore require small libraries tailored to the biology of interest (9). Even for cell-line based in vitro screens, custom sgRNA libraries are in high demand, since they ease technical challenges, reduce costs and zoom in on a particular biological aspect of interest (10). The production of such custom sgRNA libraries, however, is technically demanding, resource-intensive and dependent on bioinformatics expertise, which can act as an additional obstacle for many molecular biology laboratories (8).
Here, we present CLUE (custom library multiplexed cloning) – a versatile bioinformatics and wet-lab pipeline for the streamlined and multiplexed production of custom sgRNA libraries. CLUE is amenable to all major CRISPR variants including CRISPR interference (CRISPRi), CRISPR activation (CRISPRa) (11) and CRISPR knockout (CRISPRko), covers both murine and human genomes and supports the design and production of multiplexed libraries from a single oligonucleotide pool (12), thus providing an easy, rapid and cost-efficient work-flow. The easy-to-access web interface combines gene lists of several libraries to generate a single oligonucleotide pool, deriving sgRNAs from well-established genome-wide libraries (13–15). A barcoding design of unique primer binding sites enables the discrimination of individual libraries within the pool from one another and provides the foundation for the parallel generation of high quality custom sgRNA libraries. CLUE provides a straightforward, ready to use approach for cloning numerous custom libraries, suitable for the vast majority of molecular biology laboratories, significantly reducing costs and turn-around time.
MATERIALS AND METHODS
In silico oligo pool generation
Oligo pools are generated from input spreadsheets by text parsing implemented in a Python script. Starting from a spreadsheet in the gene name format, each column is split into a descriptive part (first three cells) and a gene name part. The descriptive part determines which reference sgRNA library is used, as listed in Supplementary Table S1. These reference sgRNA libraries reflect well-established genome-wide libraries (13–15). Next, the given number of sgRNAs/gene is selected from the reference library and a ‘G’ is prepended if the first base is not a ‘G’. This step ensures efficient sgRNA transcription from the RNA Pol III promoter. sgRNAs from the reference libraries are chosen in order of their listing, i.e. if three sgRNAs/gene are selected, the first three sgRNAs listed for that gene are chosen. Within the reference libraries, sgRNAs are ranked according to their predicted efficacy. For each library, the sgRNA sequences are then concatenated with the adapter sequences for oligo pool amplification, the library-specific adapters obtained from Supplementary Table S2, and the H1 promoter and sgRNA scaffold sequences. The latter two sequences may also be chosen to correspond to the U6 promoter and sgRNA scaffold as found in the pLenti-GUIDE family of vectors (16), extending the list of destination vectors compatible with CLUE. The process is repeated for every library, using the next library-specific adapter pair from Supplementary Table S2, and all oligos are finally written to the output file. Alternatively, if the sequence format is provided, sgRNA sequences are taken directly from the spreadsheet and concatenated with the adapter sequences as described above. Primer sequences are generated directly from the adapter sequences of Supplementary Table S2, using the first adapter oligo as the forward primer and the reverse complement of the second adapter oligo as the reverse primer.
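The assembly steps described above can be illustrated with a minimal Python sketch. It is not the published CLUE script: all adapter and homology sequences are hypothetical placeholders (the real ones are listed in Supplementary Tables S2 and S3), the function names are ours, and the element order within an oligo is inferred from the staggered adapter design shown in Figure 1.

```python
# Hypothetical placeholder sequences; the real adapters are listed in
# Supplementary Tables S2/S3 and are NOT reproduced here.
POOL_ADAPTER_5P = "AAAACCCCGGGGTTTT"   # outer adapter, 5' side (PCR1)
POOL_ADAPTER_3P = "TTTTGGGGCCCCAAAA"   # outer adapter, 3' side (PCR1)
H1_HOMOLOGY = "GATCCCACGT"             # promoter-side vector homology (PCR3)
SCAFFOLD_HOMOLOGY = "GTTTAAGAGC"       # scaffold-side vector homology (PCR3)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def prepend_g(sgrna):
    """Prepend a 'G' unless the sgRNA already starts with one."""
    sgrna = sgrna.upper()
    return sgrna if sgrna.startswith("G") else "G" + sgrna


def select_sgrnas(reference, gene, n_per_gene):
    """Take the first n sgRNAs listed for a gene in the reference library."""
    return reference[gene][:n_per_gene]


def build_oligo(sgrna, lib_adapter_5p, lib_adapter_3p):
    """Nest the sgRNA inside homology arms, library adapters and pool adapters."""
    return "".join([
        POOL_ADAPTER_5P, lib_adapter_5p, H1_HOMOLOGY,
        prepend_g(sgrna),
        SCAFFOLD_HOMOLOGY, lib_adapter_3p, POOL_ADAPTER_3P,
    ])


def library_primers(lib_adapter_5p, lib_adapter_3p):
    """Forward primer = first adapter; reverse primer = revcomp of the second."""
    return lib_adapter_5p, reverse_complement(lib_adapter_3p)
```

Keeping the library-specific adapters as function arguments mirrors the iteration over adapter pairs: each library is assembled with the next pair, while the outer adapters and homology arms stay constant across the pool.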
Oligo pool TOPO-cloning
The lyophilized oligo pool was reconstituted in TE buffer at 20 ng/μl. 10 ng array synthesized oligos were used for PCR amplification with Kapa Hifi Polymerase (Roche) and primers Pool_ampl_f and Pool_ampl_r (Supplementary Table S3), according to the manufacturer's instructions. PCR setup was as follows: 98°C 3 min, 98°C 30 s, 62°C 15 s, 72°C 10 s, 72°C 2 min with a total of 15 PCR cycles (PCR1). 2 μl PCR reaction were directly taken for TOPO cloning with Zero Blunt™ TOPO™ PCR Cloning Kit (Invitrogen). In brief, 2 μl PCR reaction were mixed with 2 μl salt solution, 2 μl ultra-pure water and 1 μl linearized TOPO vector (all components from the Kit). One TOPO reaction was performed per 5000 oligos present in the original oligonucleotide pool. TOPO reactions were incubated at room temperature for 30 min. Next, the reaction volume was brought up to 100 μl with water and DNA was precipitated by adding 100 μl isopropanol, 2 μl 5 M NaCl and 1 μl GlycoBlue co-precipitant (Invitrogen). Samples were vortexed and incubated at room temperature for 15 min, followed by centrifugation >16 000 × g for 15 min. Supernatants were discarded and pellets washed twice with 70% ethanol. Pellets were subsequently air-dried and reconstituted in 2 μl water. TOPO plasmids were electroporated into Endura competent cells (Lucigen), according to the manufacturer's instructions. Bacteria were plated on LB agar containing 50 μg/ml kanamycin on 245 mm squared dishes and incubated at 37°C overnight. The next day, bacteria were floated off the plates with liquid LB medium and plasmid DNA was isolated using the NucleoBond Xtra Maxi Kit (Macherey-Nagel), according to the manufacturer's instructions. Obtained plasmid pool was quality controlled by PCR with primers M13_f and M13_r (Supplementary Table S3), expecting bands of 400 bp for successfully cloned plasmids. 
TOPO plasmid pools were further prepared for NGS by running PCRs with P5-H1_f primers binding to the H1 portion of the cloned oligos together with P7-TOPO-5p or P7-TOPO-3p primers (Supplementary Table S3), binding in the vector backbone. Two PCR reactions per sample with either of the P7-TOPO primers are required due to the random orientation of the fragments in the TOPO cloning procedure. PCRs were set up with Kapa Hifi Pol, 50 ng template and 300 nM primer each, using the following protocol: 98°C 2 min, 98°C 30 s, 62°C 15 s, 72°C 20 s, 72°C 2 min with a total of 25 PCR cycles. PCR fragments were purified and submitted for NGS on an Illumina HiSeq 2000, 50 bp single-end reads aiming for >500 reads per individual oligo.
Cloning of sgRNA libraries
To clone a specific sgRNA library included in the initial oligo pool, the TOPO oligo pool was used as template for PCRs employing primers binding to the adapters of the library of interest (PCR2). In brief, 50 pg TOPO pool was used as template together with 300 nM of each primer and Kapa Hifi Pol. PCR reactions were performed as follows: 98°C 2 min, 98°C 20 s, 57 – 62°C 15 s, 72°C 1 s, 72°C 1 min with a total of 30 PCR cycles. Annealing temperatures were chosen depending on the adapter pair used (Supplementary Table S2). One PCR reaction of 25 μl volume was performed per 5000 sgRNAs in the sgRNA library to be cloned. PCR products were purified using the NucleoSpin Gel and PCR Clean-up Kit (Macherey-Nagel) and eluted DNA was quality controlled on a 2% agarose gel (expected fragment size 105 bp). 50 pg of library specific fragment were used for PCR with primers H1_f and scaff_r to generate DNA fragments ready for Gibson cloning (PCR3). Conditions for PCR3 are identical to PCR2 and use an annealing temperature of 62°C. As for PCR2, one PCR reaction of volume 25 μl was performed per 5000 sgRNAs in the sgRNA library to be cloned. DNA fragments from PCR3 are purified by isopropanol precipitation as described in oligo pool cloning. Purified fragments from PCR3 are quality controlled on a 2% agarose gel (expected fragment size 65 bp, Supplementary Figure S1B). sgRNA expression vector was linearized using FastDigest BpiI (isoschizomer of BbsI, Thermo Scientific). In brief, 5 μg vector were mixed with 3 μl 10X FastDigest Buffer and 1 μl FD BpiI (10 U/μl) and brought to a total volume of 30 μl with water. The reaction was incubated at 37°C for 30 min and linearized vector was purified on a 1% agarose gel (Supplementary Figure S3B). Gibson assembly was performed with 100 ng of the PCR3 product and with 100 ng of the linearized sgRNA expression vector, using NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs), according to the manufacturer's instructions. 
After completion of the reaction, volumes were brought up to 100 μl with water and DNA was precipitated with isopropanol as described above. Pellets were resuspended in 2 μl water and subjected to electroporation into Endura competent cells (Lucigen) according to the manufacturer's instructions. Bacteria were plated on LB agar containing 100 μg/ml ampicillin on 245 mm squared dishes and incubated at 37°C overnight. The next day bacteria were floated off the plates with liquid LB medium and plasmid DNA was isolated using the NucleoBond Xtra Maxi Kit (Macherey-Nagel), according to the manufacturer's instructions. A detailed pipetting protocol can be found in the supplement section of this manuscript (Supplementary Material 1).
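The protocol scales the number of TOPO reactions with the pool size and the number of PCR2/PCR3 reactions with the library size, in both cases one reaction per 5000 sequences. The short Python helper below makes this bookkeeping explicit. Rounding up to the next whole reaction is our assumption, and the function name is hypothetical.

```python
import math


def reactions_needed(n_sequences, per_reaction=5000):
    """Number of reactions when one reaction covers up to `per_reaction` sequences.

    Rounding up is an assumption; the protocol only states the ratio.
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    return math.ceil(n_sequences / per_reaction)


if __name__ == "__main__":
    # Example sizes: a pool of 5585 oligos and a library of 1025 sgRNAs.
    print("TOPO reactions for the pool:", reactions_needed(5585))
    print("PCR2/PCR3 reactions for the library:", reactions_needed(1025))
```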
Preparation of sgRNA libraries for next generation sequencing (NGS)
sgRNA libraries were prepared for NGS by running PCRs with P5-H1_f and P7-EF1a_r primers (Supplementary Table S3). These long primers comprise Illumina adapters and barcodes as well as sequences complementary to the sgRNA vectors, so that a single PCR on the sgRNA libraries was sufficient to prepare them for NGS. PCRs were set up with Kapa Hifi Pol, 50 ng template and 300 nM of each primer, using the following protocol: 98°C 2 min, 98°C 30 s, 62°C 15 s, 72°C 20 s, 72°C 2 min with a total of 25 PCR cycles. One PCR reaction was performed per 1 × 10^6 sgRNAs in the given library, ensuring a mean coverage for each sgRNA of >5000-fold. PCR fragments were purified via agarose gel electrophoresis using the NucleoSpin Gel and PCR Clean-Up Kit (Macherey-Nagel) and submitted for NGS on an Illumina HiSeq 2000, 50 bp single-end reads aiming for >500 reads per individual sgRNA. A detailed pipetting protocol can be found in the supplement section of this manuscript (Supplementary Material 1).
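The template coverage and sequencing depth quoted above follow from simple arithmetic, sketched here in Python. The plasmid length of 8 kb is a hypothetical value chosen for illustration (the size of the expression vector is not stated in this section), and 650 g/mol per base pair is the usual approximation for double-stranded DNA.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MOLAR_MASS = 650.0        # approx. g/mol per base pair of dsDNA


def plasmid_copies(mass_ng, plasmid_bp):
    """Approximate number of plasmid molecules in a given mass of DNA."""
    grams = mass_ng * 1e-9
    return grams / (plasmid_bp * BP_MOLAR_MASS) * AVOGADRO


def template_coverage(mass_ng, plasmid_bp, n_sgrnas):
    """Mean number of template molecules per sgRNA in one PCR reaction."""
    return plasmid_copies(mass_ng, plasmid_bp) / n_sgrnas


def reads_required(n_sgrnas, reads_per_sgrna=500):
    """Minimum number of mapped reads for the targeted depth per sgRNA."""
    return n_sgrnas * reads_per_sgrna


if __name__ == "__main__":
    # 50 ng template, ASSUMED 8 kb plasmid, 1e6 sgRNAs per reaction.
    coverage = template_coverage(50, 8000, 1_000_000)
    print(f"template coverage: about {coverage:.0f}-fold")
    print("reads for a 505-sgRNA library:", reads_required(505))
```

Under these assumptions, 50 ng of template spread over 10^6 sgRNAs yields a coverage in the range of several thousand-fold, in line with the >5000-fold stated above. Smaller libraries are covered correspondingly more deeply.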
NGS data analysis for sgRNA distributions
NGS data of cloned oligo pools or sgRNA libraries were analyzed with custom Python scripts to map reads either to the entire oligo pool or a library of choice. Scripts are available in the Supplementary Material or at www.crispr-clue.de. In brief, reads from fastq files were extracted, constant parts of the sequence identified and sgRNA sequences derived thereof. Mapping required perfect matching, not allowing any mismatches. In case of mismatches, sequences were scored as unmapped.
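The mapping logic described above can be sketched in a few lines of Python. This is not the script distributed with the paper. It assumes that the sgRNA lies between two constant sequences (here called `upstream` and `downstream`), that the reference library is a dictionary from sgRNA sequence to sgRNA ID, and that FASTQ records span exactly four lines. All names are hypothetical.

```python
from collections import Counter


def read_fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def extract_sgrna(read, upstream, downstream):
    """Return the sequence between the two constant parts, or None."""
    start = read.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    # Real constant sequences should be long enough not to occur inside sgRNAs.
    end = read.find(downstream, start)
    if end == -1:
        return None
    return read[start:end]


def count_reads(reads, library, upstream, downstream):
    """Count perfectly matching reads per sgRNA ID; everything else is unmapped."""
    counts = Counter()
    unmapped = 0
    for read in reads:
        sgrna = extract_sgrna(read, upstream, downstream)
        if sgrna in library:          # exact match only, no mismatches allowed
            counts[library[sgrna]] += 1
        else:
            unmapped += 1
    return counts, unmapped
```

Running `count_reads` once against the sgRNAs of the amplified library and once against all other libraries in the pool would give the "match to library" and "cross-contamination" fractions, respectively.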
Pooled CRISPRi screen
U-87 MG cells (ATCC HTB-14) were a kind gift of Michael Hemann (Massachusetts Institute of Technology) and were cultured in DMEM complete medium (90% DMEM / 10% FBS). They were transduced with a lentiviral vector encoding a catalytically inactive dCas9 variant tagged with EGFP and enriched for dCas9-EGFP expressing cells by fluorescence-activated cell sorting (FACS). U-87 MG dCas9-EGFP cells were then transduced with a CLUE sgRNA library at low infection rates to minimize multiple infections per cell. Successfully transduced cells were again enriched by FACS. After a brief in vitro expansion, cells were either subjected to genomic DNA extraction for the t = 0 days control or further passaged in vitro. In vitro culture conditions were chosen so that at least a 250-fold sgRNA representation was maintained at all times. In vitro samples were harvested after t = 16 days. Genomic DNA was extracted using the Wizard Genomic DNA Purification Kit (Promega). sgRNA insertions were amplified from the genomic DNA using Kapa Hifi Polymerase (Roche), primers P5-H1_f and P7-EF1a_r (Supplementary Table S3) and 400 ng genomic DNA per 20 μl reaction. PCR reactions were performed as follows: 98°C 2 min, 98°C 20 s, 62°C 15 s, 72°C 1 s, 72°C 1 min with a total of 30 PCR cycles. PCR products were purified and sequenced on an Illumina HiSeq 2000 with 50 bp single-end reads. The NGS data were analyzed with custom Python scripts to map and count reads. Read tables were then subjected to MAGeCK analysis (17).
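Two quantities from this section lend themselves to a short worked example in Python: the minimum number of cells needed to maintain a given sgRNA representation, and a per-sgRNA log2 fold change between the t = 0 and t = 16 day read tables. The normalization (reads per million plus a pseudocount of 1) is our assumption for quick inspection only. MAGeCK applies its own normalization and statistics.

```python
import math


def min_cells(n_sgrnas, fold_representation=250):
    """Minimum cell number to keep the stated representation per sgRNA."""
    return n_sgrnas * fold_representation


def reads_per_million(counts):
    """Scale raw read counts to reads per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads in sample")
    return {sg: c / total * 1e6 for sg, c in counts.items()}


def log2_fold_changes(counts_t0, counts_t16, pseudocount=1.0):
    """log2(t16 / t0) per sgRNA after depth normalization (assumed scheme)."""
    rpm0 = reads_per_million(counts_t0)
    rpm16 = reads_per_million(counts_t16)
    return {
        sg: math.log2((rpm16.get(sg, 0.0) + pseudocount) / (rpm0[sg] + pseudocount))
        for sg in rpm0
    }


if __name__ == "__main__":
    # A 505-sgRNA library at 250-fold representation.
    print("cells to maintain:", min_cells(505))
```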
RESULTS
The application of CRISPR/Cas9 for genome perturbation screens is limited by the availability of pooled sgRNA libraries. With CLUE, we set out to develop a streamlined wet-lab and easy-to-use computational pipeline for the cloning of multiple, custom sgRNA libraries from a single synthetic oligonucleotide pool.
The concept behind the CLUE pipeline
Key to the CLUE concept is the staggered combination of three DNA adapter pairs surrounding the sgRNA sequences (Figure 1). The outermost adapters enable PCR-based amplification of an entire synthesized oligonucleotide pool to generate double-stranded DNA, amenable for amplification and storage of the entire pool. The second adapter pair is library specific and serves as primer binding site for the specific amplification of the library of interest. Using different primer pairs allows the construction of several libraries from a single shared oligonucleotide pool. This strategy significantly lowers the financial strain on laboratories since for most commercial vendors the costs per synthesized oligomer within a pool drop significantly when increasing total pool size. The innermost adapter pair is comprised of DNA sequences homologous to sgRNA expression vectors and amenable to Gibson cloning—therefore abolishing the need for restriction digestion of the amplified pool and enabling the inclusion of any sgRNA without any sequence restrictions.
Figure 1.
Schematic representation of the CLUE cloning pipeline. The basic structure of a CLUE oligo is the staggered combination of three DNA adapter pairs flanking a sgRNA. The outermost adapters (large white boxes, PCR 1) enable amplification of the entire oligo pool (optional), the second adapter pair (dark colors, PCR 2) allows for specific amplification of a given sgRNA library and the third adapter pair (small white boxes, PCR 3) is comprised of sequences homologous to the sgRNA expression vector, amenable to Gibson assembly. An oligo pool comprising several sgRNA libraries is initially amplified and cloned into a TOPO vector (PCR 1). From this TOPO pool, specific libraries can be PCR amplified (PCR 2) and in a second step be prepared for Gibson cloning into a sgRNA expression vector (PCR 3). The final product is a specific sgRNA expression library.
With CLUE, library cloning requires three consecutive PCR steps (Figure 1). After an oligonucleotide pool is obtained from a commercial vendor, the whole pool is amplified in a low-cycle-number PCR (Figure 1, PCR1). The resulting double-stranded DNA pool is then cloned into a linearized TOPO vector, which can be amplified through transformation into competent bacteria and quality-controlled by next generation sequencing (NGS) to ensure sufficient library coverage, uniform sgRNA distribution and correct sgRNA sequences (Supplementary Figure S1A). Single distinct libraries are then amplified off the TOPO-pool through the utilization of adapter-specific PCR primers (Figure 1, PCR2). Lastly, the library-specific PCR products are prepared for the Gibson cloning reaction into sgRNA expression vectors in a final PCR step (Figure 1, PCR3 and Supplementary Figure S1B). The quality of the cloned sgRNA libraries can at this point be assessed by NGS.
CLUE sgRNA libraries achieve screening-grade quality
Moving the CLUE concept to practice, we designed an oligo pool comprised of a total of 5585 sgRNAs distributed among 10 different libraries. We performed low-cycle-number amplification of the initial material (PCR1) and TOPO cloned the resulting PCR product. This step amplifies the oligo pool, which is especially important if low DNA amounts per sgRNA were purchased, e.g. in large pools, but may be omitted otherwise. The re-isolated TOPO plasmid pool was subjected to NGS for quality control, showing full sgRNA coverage (100%). Furthermore, >90% of sgRNAs were distributed within one log of the mean of the distribution (Supplementary Figure S1A). Next, we conducted library-specific PCRs and the subsequent cloning steps in order to produce all 10 individual libraries (Supplementary Figure S1B), which were then subjected to NGS-based quality control. We again achieved full or near-full coverage for all libraries, never missing more than one sgRNA per library (Table 1). Furthermore, for all libraries, sgRNAs were normally distributed with >90% of sgRNAs falling within one log of the mean of the distribution (Figure 2 and Supplementary Figure S2, first panels). Next, we analyzed cloning and PCR accuracy and found that up to 94% of all reads matched our initial oligonucleotide pool (Figure 2 and Supplementary Figure S2, second panels). Lastly, we analyzed whether our cloning pipeline is indeed able to specifically amplify individual libraries without library-to-library cross-contamination. For all 10 libraries cloned, we never detected >0.5% of reads mapping to libraries other than the amplified one, demonstrating high library specificity (Figure 2 and Supplementary Figure S2, third panels). We therefore concluded that sgRNA libraries produced with the CLUE pipeline satisfy the basic quality criteria required for high-throughput pooled CRISPR screens (18).
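The two quality metrics used above, library coverage and the fraction of sgRNAs within one log of the mean, can be computed from a list of per-sgRNA read counts. The Python sketch below is one possible reading: it works on log10 counts and leaves the window as a parameter, since the text does not specify whether "one log" denotes the total window width or the distance on either side of the mean. Treating zero-count sgRNAs as outside the window is likewise our assumption.

```python
import math


def coverage_percent(counts):
    """Percentage of intended sgRNAs detected with at least one read."""
    if not counts:
        raise ValueError("counts must not be empty")
    detected = sum(1 for c in counts if c > 0)
    return 100.0 * detected / len(counts)


def fraction_within_window(counts, max_log10_distance=0.5):
    """Fraction of sgRNAs whose log10 count lies near the mean log10 count.

    max_log10_distance=0.5 gives a one-log-wide window centered on the mean;
    use 1.0 for one log on either side. Zero counts are scored as outside.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    logs = [math.log10(c) for c in counts if c > 0]
    if not logs:
        return 0.0
    mean_log = sum(logs) / len(logs)
    inside = sum(1 for v in logs if abs(v - mean_log) <= max_log10_distance)
    return inside / len(counts)
```

For a library in which 444 of 445 intended sgRNAs are detected, `coverage_percent` returns a value that rounds to 99.78, matching the cloning efficiency reported for such a library in Table 1.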
Table 1.
NGS-based quality assessment of 10 pooled sgRNA libraries produced with the CLUE-pipeline
| Library number | Intended library size (sgRNAs) | Number of sgRNAs identified by NGS | sgRNA cloning efficiency (%) | Perfectly matching NGS-reads (%) | Reads mapping to other sub-library (%) | Empty vector reads (%) | Presence of stuffer |
|---|---|---|---|---|---|---|---|
| 1 | 265 | 265 | 100.00 | 92.53 | 0.100 | 0.02 | yes |
| 2 | 705 | 705 | 100.00 | 92.64 | 0.150 | 0.01 | yes |
| 3 | 445 | 444 | 99.78 | 92.71 | 0.020 | 0.01 | yes |
| 4 | 170 | 170 | 100.00 | 93.19 | 0.010 | 0.00 | yes |
| 5 | 520 | 520 | 100.00 | 94.40 | 0.005 | 0.00 | yes |
| 6 | 530 | 530 | 100.00 | 89.68 | 0.010 | 0.00 | yes |
| 7 | 505 | 504 | 99.80 | 78.90 | 0.040 | 8.64 | no |
| 8 | 450 | 449 | 99.78 | 77.32 | 0.370 | 10.39 | no |
| 9 | 1025 | 1024 | 99.90 | 63.93 | 0.002 | 14.95 | no |
| 10 | 970 | 970 | 100.00 | 62.89 | 0.030 | 15.95 | no |
Figure 2.
Distribution and quality assessment of three sgRNA libraries cloned with CLUE. (A–C) First panel: Density-rug plots showing the distribution of all sgRNAs of the respective library. Each rug represents one sgRNA. Second panel: Percentage of reads from a NGS run which could be (gray) or could not (black) be mapped to sgRNA sequences within the oligo pool. Only perfect consensus to the expected sgRNA sequences was counted as successful mapping. Third panel: Distribution of mapped reads from a NGS run to the library that was amplified in the given experiment (match to library) versus reads mapped to any other sgRNA library present in the original oligo pool (cross-contamination).
A 1.2 kb stuffer reduces constructs without sgRNAs
One drawback of Gibson-based library assembly over Golden-Gate cloning is the higher prevalence of vector-only background colonies (19). Indeed, for some of the cloned sgRNA libraries we observed up to 16% of all reads mapping to expression vectors without incorporated sgRNAs (Figure 2, Supplementary Figure S2 and Table 1). One way to enhance Gibson cloning efficacy is to improve target vector linearization. We therefore introduced a 1.2 kb stuffer sequence between the two BbsI type IIS restriction enzyme cleavage sites used for sgRNA cloning, reasoning that this would help select for perfectly linearized vector bands on agarose gels (Supplementary Figures S2, S3A, B). Indeed, this approach reduced the prevalence of empty sgRNA backbones within libraries by several hundred-fold, as demonstrated by both colony PCR and NGS (Figure 2, Supplementary Figures S2, S3 and Table 1).
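The size of the stuffer effect can be checked against the empty-vector read percentages in Table 1. Because several libraries with stuffer show 0.00% empty-vector reads (after rounding), per-library ratios are undefined. The Python sketch below therefore computes a conservative lower bound by comparing the best library without stuffer to the worst library with stuffer. This choice of comparison is ours and not necessarily the one underlying the statement above.

```python
# Empty-vector reads (%) per library, copied from Table 1.
WITH_STUFFER = [0.02, 0.01, 0.01, 0.00, 0.00, 0.00]   # libraries 1-6
WITHOUT_STUFFER = [8.64, 10.39, 14.95, 15.95]         # libraries 7-10


def conservative_fold_reduction(without, with_stuffer):
    """Lowest 'without' value divided by the highest 'with' value."""
    worst_with = max(with_stuffer)
    if worst_with == 0:
        raise ValueError("fold change undefined: no empty-vector reads with stuffer")
    return min(without) / worst_with


fold = conservative_fold_reduction(WITHOUT_STUFFER, WITH_STUFFER)
print(f"at least {fold:.0f}-fold fewer empty-vector reads with stuffer")
```

Even this conservative comparison gives a reduction of more than 400-fold, consistent with the several hundred-fold reduction described above.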
A pilot CLUE library screen detects transcription factors essential for glioma
We next tested the capability of CLUE sgRNA libraries to produce biologically meaningful hits when employed for pooled genomic perturbation screens. To do so, we transduced the well-established human glioma cell line U87-MG with a catalytically inactive dCas9 construct in order to elicit CRISPR interference (CRISPRi). We next infected these dCas9+ U87-MG cells with a CLUE sgRNA library of 505 individual sgRNAs targeting 88 different transcription regulators (Figure 3A). A fraction of the infected cells was kept as the input control, while the remainder was cultured in vitro for an additional 16 days. We extracted genomic DNA from both cohorts, amplified the integrated sgRNA loci and identified sgRNA distributions by NGS. We then scored the total library sgRNA changes and found that the majority of sgRNAs targeting genes such as the transcription factor E2F1 had depleted during glioma cell in vitro culture (Supplementary Table S4). Intriguingly, several E2F family members including E2F1 are known to be crucial for cell cycle progression and the target of tumor suppressors such as RB (Figure 3C) (20). We next employed the MAGeCK bioinformatics pipeline to mathematically combine the behavior of sgRNAs into gene level scores (Figure 3B, Supplementary Table S5) (17). In brief, MAGeCK ranks genes by testing how much the distribution of sgRNAs targeting a particular gene is skewed away from all sgRNAs within a screen. Besides E2F1, we found that sgRNAs targeting the Zinc finger protein 217 (ZNF217) had strongly depleted (Figure 3B, D). ZNF217 is frequently amplified in human cancer and orchestrates a wide spectrum of pro-oncogenic cellular signaling cascades (21). In summary, our pooled CRISPRi perturbation screen underlines that sgRNA libraries produced with the CLUE pipeline can highlight biologically relevant gene sets such as oncogenes.
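To illustrate the idea of combining sgRNA-level changes into gene-level scores, the following Python sketch ranks all sgRNAs by their log2 fold change and summarizes each gene by the median rank percentile of its sgRNAs. This is a deliberately simplified stand-in and not the MAGeCK algorithm, which uses a robust rank aggregation procedure with permutation-based significance testing. Function and gene names are hypothetical.

```python
from statistics import median


def gene_depletion_scores(sgrna_lfc, sgrna_to_gene):
    """Rank genes by the median rank percentile of their sgRNAs.

    sgrna_lfc: dict sgRNA ID -> log2 fold change (negative = depleted)
    sgrna_to_gene: dict sgRNA ID -> gene name
    Returns (gene, score) tuples, most depleted gene first; scores lie in (0, 1].
    """
    ordered = sorted(sgrna_lfc, key=sgrna_lfc.get)      # most depleted first
    n = len(ordered)
    per_gene = {}
    for rank, sgrna in enumerate(ordered, start=1):
        per_gene.setdefault(sgrna_to_gene[sgrna], []).append(rank / n)
    scores = ((gene, median(p)) for gene, p in per_gene.items())
    return sorted(scores, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy values for illustration only; not data from the screen.
    lfc = {"a1": -3.0, "a2": -2.5, "b1": -0.2, "b2": 0.1, "c1": 0.4, "c2": 0.5}
    genes = {"a1": "GENE_A", "a2": "GENE_A", "b1": "GENE_B",
             "b2": "GENE_B", "c1": "GENE_C", "c2": "GENE_C"}
    for gene, score in gene_depletion_scores(lfc, genes):
        print(gene, round(score, 3))
```

A gene whose sgRNAs consistently occupy the lowest ranks receives a score close to zero, which captures the notion of a gene-specific sgRNA distribution skewed away from the bulk of the library.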
Figure 3.
A targeted CRISPRi screen highlights transcription factors required for the proliferative fitness of malignant glioma cells. (A) Pooled screen schematic. U87-MG cells stably expressing a catalytically inactive dCas9 variant were lentivirally transduced with a CLUE sgRNA library of 505 individual sgRNAs targeting 88 different transcription-associated genes. (B) CRISPRi screen analysis. Significance of depletion after 16 days is plotted for all targeted genes. Dashed line: P = 0.05. Genes depleting with P < 0.05 are highlighted in red. (C) Schematic depicting the role of E2F1 in proliferation control. (D) Schematic depicting the pro-oncogenic role of ZNF217-mediated regulation of transcription.
The CLUE web-interface streamlines multiplexed sgRNA library design
Having established that the CLUE pipeline streamlines multiplexed sgRNA library production, we next wanted to make the system broadly available to the scientific community. We felt that one major hurdle for researchers with a limited bioinformatics background could be the process of library and oligonucleotide pool design. To overcome this limitation, we generated www.crispr-clue.de, an intuitive and interactive website for fast and easy construction of sgRNA libraries and corresponding oligo pools (Figure 4A). It allows users to upload various gene lists of interest, specify CRISPR variants (i.e. CRISPRko, CRISPRi, CRISPRa), choose between mouse and human as available model organisms, define libraries and select the number of sgRNAs per target gene. Behind the scenes, the CLUE webtool generates a multiplexed oligonucleotide pool ready to be uploaded to commercial synthesis providers, a list of adapter-binding primers for the amplification of each specific library and lists of sgRNAs distributed across the libraries. Importantly, all sgRNAs provided by CLUE reflect well-established genome-wide libraries (13–15). Alternatively, it is also possible to upload lists of sgRNA sequences directly and use the CLUE webtool for multiplexed oligonucleotide pool design. This additional layer of customization enables users to produce sgRNA libraries for any species and emerging new CRISPR applications. The upload format is a simple spreadsheet (.csv file, Figure 4B). Additionally, we provide a detailed description and corresponding Python scripts for the quality control analysis of sgRNA libraries after NGS on the CLUE website. All in all, our easy-to-use web application aims at enabling numerous molecular biology laboratories to conduct personalized CRISPR screens, matching their unique area of interest.
Figure 4.
Overview of the CLUE web-interface and formats for user submission of sgRNA libraries. (A) Screenshot of the CLUE web-interface for the generation of oligo pools from sgRNA lists. (B) Example of the gene format for sgRNA libraries. Every column contains one sgRNA library, with the first cell holding the library name, the second cell holding the species (human or murine), the third cell holding the type of CRISPR application (ko, kd, act) and every following cell holding a gene name/symbol. (C) Example of the sequence format for sgRNA libraries. Every first column contains the string ‘sgRNA ID’ in its first cell, followed by sgRNA IDs in subsequent cells. Every second column contains the library name in its first cell, followed by sgRNA sequences in every subsequent cell, corresponding to the given sgRNA IDs.
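The gene format described in panel (B) is column-oriented: each column is one library, with three descriptive cells followed by gene names. A minimal Python parser for such a file could look as follows. It assumes a comma-delimited .csv and tolerates columns of unequal length. The function name and the returned dictionary keys are our own and do not reflect the webtool's internals.

```python
import csv
import io


def parse_gene_format(csv_text):
    """Parse a column-wise gene-format spreadsheet into library definitions."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 3:
        raise ValueError("expected library name, species and CRISPR type rows")
    n_columns = max(len(row) for row in rows)
    libraries = []
    for col in range(n_columns):
        # Pad short rows so that columns of unequal length are handled.
        cells = [row[col].strip() if col < len(row) else "" for row in rows]
        name, species, crispr_type = cells[:3]
        if not name:
            continue                      # skip empty columns
        libraries.append({
            "name": name,
            "species": species,           # human or murine
            "type": crispr_type,          # ko, kd or act
            "genes": [cell for cell in cells[3:] if cell],
        })
    return libraries


if __name__ == "__main__":
    example = "libA,libB\nhuman,murine\nkd,ko\nE2F1,Trp53\nZNF217,\n"
    for library in parse_gene_format(example):
        print(library)
```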
DISCUSSION
The CLUE pipeline covers the entire workflow for multiplexed generation of custom sgRNA libraries, starting from a web-based framework for de-novo library design and multiplexing, up to a wet-lab toolset for cloning each custom sgRNA library. Libraries produced with CLUE proved to be highly accurate, with up to 94% of all NGS reads perfectly matching the intended sequences. CLUE libraries typically contain full library representation, almost no library-to-library cross-contamination and uniform sgRNA distribution within one log of the mean of the corresponding distribution—satisfying state of the art sgRNA library quality-criteria (18,22). High-quality sgRNA libraries are of particular importance for in vivo screens, which are prone to bottlenecking and to a random loss of underrepresented sgRNAs within unevenly cloned libraries.
The CLUE pipeline is strictly optimized for efficiency. First, we incorporated an initial PCR and TOPO cloning step. This relieves constraints imposed by the limited amount of oligonucleotide pool DNA supplied by commercial vendors, which becomes particularly important for large oligonucleotide pools containing many libraries. In addition, pool subcloning is useful for long-term storage as well as for a rapid assessment of synthesis quality prior to library cloning. Second, we utilized a barcoding protocol for library multiplexing, therefore enabling researchers to clone dozens of individual sgRNA libraries from a single oligonucleotide pool, thus reducing costs and turnaround times. Finally, the use of Gibson assembly entirely abolishes the need to avoid certain DNA sequence motifs during sgRNA library design.
We designed CLUE to be easily accessible without major skills in bioinformatics, which makes it applicable to a broad audience of biological researchers. Users can either provide lists of genes or pre-designed sgRNAs. The protocol is highly scalable, allowing individual sgRNA libraries of virtually any size.
CLUE is amenable to all major CRISPR variants including CRISPRi, CRISPRa and CRISPRko. The pipeline's flexibility enables constructing libraries for murine and human genomes when using gene lists, and can be extended to any species of interest when sgRNA sequences are provided. Along the same lines, the system can be extended to any sgRNA expression vector of interest (16), while the webtool is currently designed to incorporate our own vectors as well as the widely used pLenti-GUIDE family of sgRNA vectors (16). Beyond the coding genome, CLUE can be utilized for emerging CRISPR applications such as perturbation of the non-coding genome, modification of the epigenome (23,24) and Cas9-mediated editing of DNA bases (25,26). By adjusting vector-homology sequences, the pipeline holds the potential to be extended to other Cas proteins and their applications, such as Cas13-based transcriptome editing (27,28). This modularity of the CLUE design makes it possible to readily extend the pipeline to other emerging CRISPR/Cas systems and to integrate novel sgRNA libraries into the database. It thereby allows the pipeline to keep up to date with improved sgRNA design algorithms and libraries as well as novel CRISPR applications.
All in all, CLUE provides a streamlined process for generating multiple custom sgRNA libraries for all molecular biology laboratories interested in custom genome perturbation. Its modular design, combined with its multiplexing capacity, allows for fast and very cost-efficient sgRNA library design and construction together with high flexibility, for example when combining sgRNA libraries for different CRISPR applications in the same oligonucleotide pool.
Contributor Information
Martin Becker, Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Center for Environmental Health (HMGU), 81377 Munich, Germany.
Heidi Noll-Puchta, Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig Maximilians University of Munich (LMU), 80337 Munich, Germany.
Diana Amend, Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Center for Environmental Health (HMGU), 81377 Munich, Germany.
Florian Nolte, Faculty of Business Administration and Economics, Bielefeld University, Bielefeld, Germany.
Christiane Fuchs, Faculty of Business Administration and Economics, Bielefeld University, Bielefeld, Germany; Department of Mathematics, Technische Universität München, Munich, Germany; Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Munich, Neuherberg, Germany.
Irmela Jeremias, Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Center for Environmental Health (HMGU), 81377 Munich, Germany; Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig Maximilians University of Munich (LMU), 80337 Munich, Germany; German Consortium for Translational Cancer Research (DKTK), Partnering Site Munich, 80336 Munich, Germany.
Christian J Braun, Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig Maximilians University of Munich (LMU), 80337 Munich, Germany; Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, Germany; Hopp Children's Cancer Center Heidelberg (KiTZ), German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
Max-Eder Program of Deutsche Krebshilfe [70113377 to C.J.B.]; Care for Rare Foundation; Beug Foundation for Metastasis Research; Society for the Advancement of Science and Research of the LMU Medical Faculty (WiFoMed); I.J. is supported by the European Research Council Consolidator Grant [681524]; Mildred Scheel Professorship by German Cancer Aid; German Research Foundation (DFG) Collaborative Research Center 1243 ‘Genetic and Epigenetic Evolution of Hematopoietic Neoplasms’, DFG proposal [MA 1876/13-1]; Bettina Bräu Stiftung and Dr Helmut Legerlotz Stiftung. Funding for open access charge: Beug Foundation for Metastasis Research (to C.J.B.).
Conflict of interest statement. None declared.
REFERENCES
- 1. Wang T., Wei J.J., Sabatini D.M., Lander E.S. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014; 343:80–84.
- 2. Shalem O., Sanjana N.E., Hartenian E., Shi X., Scott D.A., Mikkelson T., Heckl D., Ebert B.L., Root D.E., Doench J.G. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014; 343:84–87.
- 3. Hsu P.D., Lander E.S., Zhang F. Development and applications of CRISPR-Cas9 for genome engineering. Cell. 2014; 157:1262–1278.
- 4. Tzelepis K., Koike-Yusa H., De Braekeleer E., Li Y., Metzakopian E., Dovey O.M., Mupo A., Grinkevich V., Li M., Mazan M. et al. A CRISPR dropout screen identifies genetic vulnerabilities and therapeutic targets in acute myeloid leukemia. Cell Rep. 2016; 17:1193–1205.
- 5. Behan F.M., Iorio F., Picco G., Goncalves E., Beaver C.M., Migliardi G., Santos R., Rao Y., Sassi F., Pinnelli M. et al. Prioritization of cancer therapeutic targets using CRISPR-Cas9 screens. Nature. 2019; 568:511–516.
- 6. Braun C.J., Bruno P.M., Horlbeck M.A., Gilbert L.A., Weissman J.S., Hemann M.T. Versatile in vivo regulation of tumor phenotypes by dCas9-mediated transcriptional perturbation. Proc. Natl Acad. Sci. U.S.A. 2016; 113:E3892–3900.
- 7. Xu C., Qi X., Du X., Zou H., Gao F., Feng T., Lu H., Li S., An X., Zhang L. et al. piggyBac mediates efficient in vivo CRISPR library screening for tumorigenesis in mice. Proc. Natl Acad. Sci. U.S.A. 2017; 114:722–727.
- 8. Doench J.G. Am I ready for CRISPR? A user's guide to genetic screens. Nat. Rev. Genet. 2018; 19:67–80.
- 9. Read A., Gao S., Batchelor E., Luo J. Flexible CRISPR library construction using parallel oligonucleotide retrieval. Nucleic Acids Res. 2017; 45:e101.
- 10. Shi J., Wang E., Milazzo J.P., Wang Z., Kinney J.B., Vakoc C.R. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat. Biotechnol. 2015; 33:661–667.
- 11. Gilbert L.A., Horlbeck M.A., Adamson B., Villalta J.E., Chen Y., Whitehead E.H., Guimaraes C., Panning B., Ploegh H.L., Bassik M.C. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell. 2014; 159:647–661.
- 12. Kosuri S., Church G.M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods. 2014; 11:499–507.
- 13. Wang T., Birsoy K., Hughes N.W., Krupczak K.M., Post Y., Wei J.J., Lander E.S., Sabatini D.M. Identification and characterization of essential genes in the human genome. Science. 2015; 350:1096–1101.
- 14. Doench J.G., Fusi N., Sullender M., Hegde M., Vaimberg E.W., Donovan K.F., Smith I., Tothova Z., Wilen C., Orchard R. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 2016; 34:184–191.
- 15. Horlbeck M.A., Gilbert L.A., Villalta J.E., Adamson B., Pak R.A., Chen Y., Fields A.P., Park C.Y., Corn J.E., Kampmann M. et al. Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. Elife. 2016; 5:e19760.
- 16. Sanjana N.E., Shalem O., Zhang F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods. 2014; 11:783–784.
- 17. Li W., Xu H., Xiao T., Cong L., Love M.I., Zhang F., Irizarry R.A., Liu J.S., Brown M., Liu X.S. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 2014; 15:554.
- 18. Joung J., Konermann S., Gootenberg J.S., Abudayyeh O.O., Platt R.J., Brigham M.D., Sanjana N.E., Zhang F. Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening. Nat. Protoc. 2017; 12:828–863.
- 19. Engler C., Kandzia R., Marillonnet S. A one pot, one step, precision cloning method with high throughput capability. PLoS One. 2008; 3:e3647.
- 20. Meng P., Ghosh R. Transcription addiction: can we garner the Yin and Yang functions of E2F1 for cancer therapy. Cell Death Dis. 2014; 5:e1360.
- 21. Cohen P.A., Donini C.F., Nguyen N.T., Lincet H., Vendrell J.A. The dark side of ZNF217, a key regulator of tumorigenesis with powerful biomarker value. Oncotarget. 2015; 6:41566–41581.
- 22. Li W., Koster J., Xu H., Chen C.H., Xiao T., Liu J.S., Brown M., Liu X.S. Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR. Genome Biol. 2015; 16:281.
- 23. Hilton I.B., D’Ippolito A.M., Vockley C.M., Thakore P.I., Crawford G.E., Reddy T.E., Gersbach C.A. Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nat. Biotechnol. 2015; 33:510–517.
- 24. Klann T.S., Black J.B., Chellappan M., Safi A., Song L., Hilton I.B., Crawford G.E., Reddy T.E., Gersbach C.A. CRISPR-Cas9 epigenome editing enables high-throughput screening for functional regulatory elements in the human genome. Nat. Biotechnol. 2017; 35:561–568.
- 25. Komor A.C., Kim Y.B., Packer M.S., Zuris J.A., Liu D.R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016; 533:420–424.
- 26. Gaudelli N.M., Komor A.C., Rees H.A., Packer M.S., Badran A.H., Bryson D.I., Liu D.R. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature. 2017; 551:464–471.
- 27. Abudayyeh O.O., Gootenberg J.S., Essletzbichler P., Han S., Joung J., Belanto J.J., Verdine V., Cox D.B.T., Kellner M.J., Regev A. et al. RNA targeting with CRISPR-Cas13. Nature. 2017; 550:280–284.
- 28. Cox D.B.T., Gootenberg J.S., Abudayyeh O.O., Franklin B., Kellner M.J., Joung J., Zhang F. RNA editing with CRISPR-Cas13. Science. 2017; 358:1019–1027.