Skip to main content
Nucleic Acids Research logoLink to Nucleic Acids Research
. 2020 Apr 4;48(9):5016–5023. doi: 10.1093/nar/gkaa208

PAM recognition by miniature CRISPR–Cas12f nucleases triggers programmable double-stranded DNA target cleavage

Tautvydas Karvelis 1,4,, Greta Bigelyte 1,4, Joshua K Young 2,4,, Zhenglin Hou 2, Rimante Zedaveinyte 1, Karolina Budre 1, Sushmitha Paulraj 2, Vesna Djukanovic 2, Stephen Gasior 2, Arunas Silanskas 1, Česlovas Venclovas 1, Virginijus Siksnys 1,
PMCID: PMC7229846  PMID: 32246713

Abstract

In recent years, CRISPR-associated (Cas) nucleases have revolutionized the genome editing field. Being guided by an RNA to cleave double-stranded (ds) DNA targets near a short sequence termed a protospacer adjacent motif (PAM), Cas9 and Cas12 offer unprecedented flexibility, however, more compact versions would simplify delivery and extend application. Here, we present a collection of 10 exceptionally compact (422–603 amino acids) CRISPR–Cas12f nucleases that recognize and cleave dsDNA in a PAM dependent manner. Categorized as class 2 type V-F, they originate from the previously identified Cas14 family and distantly related type V-U3 Cas proteins found in bacteria. Using biochemical methods, we demonstrate that a 5′ T- or C-rich PAM sequence triggers dsDNA target cleavage. Based on this discovery, we evaluated whether they can protect against invading dsDNA in Escherichia coli and find that some but not all can. Altogether, our findings show that miniature Cas12f nucleases can protect against invading dsDNA like much larger class 2 CRISPR effectors and have the potential to be harnessed as programmable nucleases for genome editing.

INTRODUCTION

Clustered regularly interspaced short palindromic repeat (CRISPR) associated (Cas) microbial defense systems protect their hosts against foreign nucleic acid invasion (1–3). Utilizing small guide RNAs (gRNAs) transcribed from a CRISPR locus, accessory Cas proteins are directed to silence invading foreign RNA and DNA (2,4,5). Based on the number and composition of proteins involved in nucleic acid interference, CRISPR-Cas systems are categorized into distinct classes 1–2 and types I–VI (2,6). Class 2 systems encode a single effector protein for interference and are further subdivided into types, II, V and VI (7). Cas9 (type II) and Cas12 (type V) proteins have been shown to cleave invading double-stranded (ds) DNA, single-stranded (ss) DNA and ssRNA (8–13). To recognize and cleave a dsDNA target, both Cas9 and Cas12 require a short sequence, termed the protospacer adjacent motif (PAM), in the vicinity of a DNA sequence targeted by the gRNA (8,9,14). Over the past several years, these endonucleases have been adopted as robust genome editing and transcriptome manipulation tools (15–18). Although both nucleases have been widely used, the size of Cas9 and Cas12 provides constraints on cellular delivery that may limit certain applications, including therapeutics (19,20).

Recently, small CRISPR-associated effector proteins (Cas12f) belonging to the type V-F subtype have been identified through the mining of sequence databases (6,7,13,21). The locus architecture for some of these systems includes genes (cas1, cas2 and cas4) that encode an adaptation module (Figure 1). The presence of these in combination with functional assessments for a single nuclease were previously used to denote a subset of Cas12f proteins as Cas14 (21). More recent classifications have grouped this family into Cas12f1 (Cas14a and type V-U3), Cas12f2 (Cas14b) and Cas12f3 (Cas14c, type V-U2 and U4) (6). In this study, we use the updated nomenclature and also keep the Cas14 designation to bridge the different naming conventions.

Figure 1.

Figure 1.

Schematic representation of CRISPR-Cas loci and effector proteins for type II and V systems as exemplified by Streptococcus pyogenes SF370 (NC_002737.2) for Cas9, Acidaminococcus sp. BV3L6 (NZ_AWUR01000016.1) for Cas12a, uncultured archaeon (KU_516197.1) for Cas14 and Syntrophomonas palmitatica (NZ_BBCE01000017.1) for type V-U3, respectively. Tri-split RuvC domains of effector proteins are shown in blue, HNH domain of Cas9 in orange. Grey rectangles and purple diamonds represent CRISPR repeats and spacers, respectively.

The majority of Cas12f proteins (from V-U3 and Cas14 families) are nearly half the size of the smallest Cas9 or Cas12 nucleases (Figure 1) (2,6,7,21). Compared to Cas12 orthologs, the N-terminal half of Cas12f proteins differs significantly in length accounting for most of the size difference between the two groups (Figure 1 and Supplementary Figure S1). Due to this and their similarity to transposase-associated TnpB proteins, it is hypothesized that they are remnants or intermediates of type V CRISPR–Cas system evolution and are incapable of forming the protein architecture required for dsDNA target recognition and cleavage (7,21). To our knowledge, functional characterization of Cas12f proteins are limited to a single protein identified from an uncultured archaeon (Un1), initially named Cas14a1 and renamed here to Un1Cas12f1, which was shown to exclusively target and cleave ssDNA in a PAM-independent manner (21). Additionally, target binding and cleavage by Un1Cas12f1 triggered collateral nuclease activity that manifested as trans-acting non-specific ssDNA degradation (21), a feature that seems to be largely shared across the type V family, although Cas12g was reported to initially target ssRNA and then indiscriminately degrade both ssDNA and ssRNA (12,13).

Here, using biochemical assays, we report that miniature Cas12f effectors (from Cas14 and type V-U3 families), like most other Cas12 proteins, are able to cleave dsDNA targets if a 5′ PAM sequence is present in the vicinity of the guide RNA target. For the Cas12f proteins from the Cas14 family, this primarily consists of 5′ T-rich sequences while type V-U3 PAM recognition includes both C- and T-rich motifs. Additionally, we demonstrate that some type V-U3 bacterial nucleases (but none of the Cas14 proteins tested) function as bona fide CRISPR-Cas systems in E. coli cells to protect against invading dsDNA despite their small size. These findings provide the first experimental evidence that miniature Cas12f effectors are programmable dsDNA nucleases and lay the foundation for the adoption of these proteins as genome editing tools.

MATERIALS AND METHODS

Engineering CRISPR-Cas12f systems to target a PAM library

CRISPR-Cas12f systems were modified to target the 7 bp randomized PAM library described previously (22). This was accomplished by replacing the native CRISPR array with three repeat:spacer:repeat units that encoded a spacer (33–39 nt depending on the average spacer length observed in the respective Cas12f system) capable of complementing to a sequence (anti-sense strand) immediately 3′ of the region of PAM randomization. The engineered CRISPR-Cas12f systems were then synthesized (GenScript) and cloned into a modified pET-duet1 (MilliporeSigma) or pACYC184 (NEB) plasmid. For the CRISPR-Cas12f1 system initially named Cas14a1 (21) and renamed here as Un1Cas12f1, the pLBH531_MBP-Cas14a1 plasmid (gift from Jennifer Doudna, Addgene plasmid #112500) was used. The sequences of the Cas12f proteins are listed in Supplementary Data S1 file and links to the plasmid sequences encoding the Cas12f–CRISPR systems engineered to target the PAM library are provided in Supplementary Table S2.

RNA synthesis

Un1Cas12f1 (Cas14a1) single guide RNAs (sgRNA) were produced by in vitro transcription using TranscriptAid T7 High Yield Transcription Kit (Thermo Fisher Scientific) and purified using GeneJET RNA Purification Kit (Thermo Fisher Scientific). Templates for T7 transcription were generated by PCR using overlapping oligonucleotides, altogether, containing a T7 promoter at the proximal end followed by the sgRNA sequence. Sequences of the sgRNAs used in our study are available in Supplementary Data S1 file.

Detecting Cas12f dsDNA cleavage and PAM recognition

Plasmid DNA targets were cleaved with Cas12f ribonucleoprotein (RNP) complexes produced from the modified locus or by combining Escherichia coli lysate containing Un1Cas12f1 (Cas14a1) protein with T7 transcribed sgRNA (20 nt spacer). First, E. coli DH5α or ArcticExpress (DE3) cells were transformed with CRISPR-Cas12f encoding plasmids (pACYC or pET-duet1 and pLBH531, respectively) and cultures grown in LB broth (30 ml) supplemented with either chloramphenicol (25 μg/ml) (pACYC plasmids) or ampicillin (100 μg/ml) (pET-duet1 and pLBH531 plasmids). Next, for plasmids with a T7 promoter (pET-duet1 and pLBH531 plasmids), expression was induced with 0.5 mM IPTG when cultures reached OD600 of 0.5 and incubated overnight at 16°C. Cells (from 10 ml) were collected by centrifugation and re-suspended in 1 ml of lysis buffer (20 mM phosphate, pH 7.0, 0.5 M NaCl, 5% (v/v) glycerol) supplemented with 10 μl PMSF (final conc. 2 mM) and lysed by sonication. Cell debris was removed by centrifugation. 10 μl of the obtained supernatant containing RNPs was used directly in the digestion experiments. For Un1Cas12f1, 20 μl of clarified supernatant was combined with 1 μl of RiboLock RNase Inhibitor (Thermo Fisher Scientific) and 2 μg of sgRNA and allowed to complex with the clarified lysate as described below.

Cas12f RNP complexes were used to cleave either the 7 bp randomized PAM library or a plasmid containing a fixed PAM and gRNA target. Briefly, 10 μl of Cas12f-gRNA RNP containing lysate was mixed with 1 μg of PAM library or 1 μg of plasmid containing a single PAM and gRNA target in 100 μl of reaction buffer (10 mM Tris–HCl, pH 7.5 at 37°C, 100 mM NaCl, 1 mM DTT and 10 mM MgCl2). After 1 h incubation at 37°C, DNA ends were repaired by adding 1 μl of T4 DNA polymerase (Thermo Fisher Scientific) and 1 μl of 10 mM dNTP mix (Thermo Fisher Scientific) and incubating the reaction for 20 min at 11°C. The reaction was then inactivated by heating it up to 75°C for 10 min and 3′-dA overhangs added by incubating the reaction mixture with 1 μl of DreamTaq polymerase (Thermo Fisher Scientific) and 1 μl of 10 mM dATP (Thermo Fisher Scientific) for 30 min at 72°C. Additionally, RNA was removed by incubation for 15 min at 37°C with 1 μl of RNase A (Thermo Fisher Scientific). Following purification with a GeneJet PCR Purification column (Thermo Fisher Scientific), the end repaired cleavage products (100 ng) were ligated with a double-stranded DNA adapter containing a 3′-dT overhang (100 ng) for 1 h at 22°C using T4 DNA ligase (Thermo Fisher Scientific). After ligation, cleavage products were PCR amplified appending sequences required for deep sequencing and subjected to Illumina sequencing (22,23).

Double-stranded DNA target cleavage was evaluated by examining the unique junction generated by target cleavage and adapter ligation in deep sequence reads. This was accomplished by first generating a collection of sequences representing all possible outcomes of dsDNA cleavage and adapter ligation within the target region. For example, cleavage and adapter ligation at just after the 21st position of the target would produce the following sequence (5′-CCGCTCTTCCGATCTGCCGGCGACGTTGGGTCAACT-3′) where the adapter and target sequences comprise 5′-CCGCTCTTCCGATCT-3′ and 5′-GCCGGCGACGTTGGGTCAACT-3′, respectively. The frequency of the resulting sequences was then tabulated using a custom Perl script (provided at https://github.com/cortevaCRISPR/Cas12f-InformaticsTools.git) and compared to negative controls (experiments setup without functional Cas12f complexes) to identify target cleavage.

Evidence of PAM recognition was evaluated as described previously (22,23). Briefly, the sequence of the protospacer adapter ligation exhibiting an elevated frequency in the previous step was used in combination with a 10 bp sequence 5′ of the 7 bp PAM region to identify reads that supported dsDNA cleavage. Once identified, the intervening 7 bp PAM sequence was isolated by trimming away the 5′ and 3′ flanking sequences using a custom Perl script (provided at https://github.com/cortevaCRISPR/Cas12f-InformaticsTools.git) and the frequency of the extracted PAM sequences normalized to the original PAM library to account for inherent biases using the following formula.

graphic file with name M1.gif

Following normalization, a position frequency matrix (PFM) (24) was calculated and compared to negative controls (experiments setup without functional Cas12f complexes) to look for biases in nucleotide composition as a function of PAM position. Biases were considered significant and indicative of PAM recognition if they deviated by >2.5-fold from the negative control. Analyses were limited to the top 10% most frequent PAMs to reduce the impact of background noise resulting from non-specific cleavage coming from other components in the E. coli cell lysate mixtures.

Expression and purification of an Un1Cas12f1 (Cas14a1) protein

Un1Cas12f1 (Cas14a1) protein was expressed in E. coli BL21(DE3) strain from the pLBH531_MBP-Cas14a1 plasmid (gift from Jennifer Doudna, Addgene plasmid #112500). Un1Cas12f1D326A and Un1Cas12f1D510A expression plasmids were engineered from pLBH531 using Phusion Site-Directed Mutagenesis Kit (Thermo Fisher Scientific). Escherichia coli cells were grown in LB broth supplemented with ampicillin (100 μg/ml) at 37°C. After culturing to an OD600 of 0.5, temperature was decreased to 16°C and expression induced with 0.5 mM IPTG. After 16 h cells were pelleted, re-suspended in loading buffer (20 mM Tris–HCl, pH 8.0 at 25°C, 1.5 M NaCl, 5 mM 2-mercaptoethanol, 10 mM imidazole, 2 mM PMSF, 5% (v/v) glycerol) and disrupted by sonication. Cell debris was removed by centrifugation. The supernatant was loaded on the Ni2+-charged HiTrap chelating HP column (GE Healthcare) and eluted with a linear gradient of increasing imidazole concentration (from 10 to 500 mM) in 20 mM Tris–HCl, pH 8.0 at 25°C, 0.5 M NaCl, 5 mM 2-mercaptoethanol buffer. The fractions containing Un1Cas12f1 variants were pooled and subsequently loaded on HiTrap heparin HP column (GE Healthcare) for elution using a linear gradient of increasing NaCl concentration (from 0.1 to 1.5 M). The fractions containing the protein of interest were pooled and the 10×His-MBP-tag was cleaved by overnight incubation with TEV protease at 4°C. To remove the cleaved 10×His-MBP-tag and TEV protease, reaction mixtures were loaded onto a HiTrap heparin HP 5 column (GE Healthcare) for elution using a linear gradient of increasing NaCl concentration (from 0.1 to 1.5 M). Next, the elution from the HiTrap heparin column was loaded on a MBPTrap column (GE Healthcare) and the Un1Cas12f1 proteins were collected in the flow-through. The collected fractions with Un1Cas12f1 were then dialyzed against 20 mM Tris–HCl, pH 8.0 at 25°C, 500 mM NaCl, 2 mM DTT and 50% (v/v) glycerol and stored at –20°C. The sequences of the Un1Cas12f1 proteins are listed in Supplementary Data S1 file.

Un1Cas12f1 (Cas14a1)–sgRNA complex assembly for in vitro DNA cleavage

Un1Cas12f1 (Cas14a1) ribonucleoprotein (RNP) complexes (1 μM) were assembled by mixing Un1Cas12f1 protein with sgRNA at 1:1 molar ratio followed by incubation in a complex assembly buffer (10 mM Tris–HCl, pH 7.5 at 37°C, 100 mM NaCl, 1 mM EDTA, 1 mM DTT) at 37°C for 30 min.

DNA substrate generation

Plasmid DNA substrates were generated by cloning oligoduplexes assembled after annealing complementary oligonucleotides (Metabion) into pUC18 plasmid over HindIII (Thermo Fisher Scientific) and EcoRI (Thermo Fisher Scientific) restriction sites. The sequences of the inserts are listed in Supplementary Data S1 file and the links to the plasmid sequences are provided in Supplementary Table S2.

To generate radiolabeled DNA substrates, the 5′-ends of oligonucleotides were radiolabeled using T4 PNK (Thermo Fisher Scientific) and [γ-33P]ATP (PerkinElmer). Duplexes were made by annealing two oligonucleotides with complementary sequences at 95°C following slow cooling to room temperature. A radioactive label was introduced at the 5′-end of individual DNA strands before annealing with the unlabeled strands. The sequences of the oligoduplexes are provided in Supplementary Data S1 file.

DNA substrate cleavage assay

Plasmid DNA cleavage reactions were initiated by mixing plasmid DNA with Un1Cas12f1 (Cas14a1) RNP complex at 46°C. The final reaction mixture typically contained 3 nM plasmid DNA, 100 nM Un1Cas12f1 RNP complex in 2.5 mM Tris–HCl, pH 7.5 at 37°C, 25 mM NaCl, 0.25 mM DTT and 10 mM MgCl2 reaction buffer. Aliquots were removed at timed intervals (30 min if not indicated differently) and mixed with 3× loading dye solution (0.01% Bromophenol Blue and 75 mM EDTA in 50% (v/v) glycerol) and reaction products were analyzed by agarose gel electrophoresis and ethidium bromide staining.

Reactions with oligoduplexes were typically carried-out by mixing labeled oligoduplex with Un1Cas12f1 RNP complex and incubating at 46°C. The final reaction mixture contained 1 nM labeled duplex, 100 nM Un1Cas12f1 RNP complex, 5 mM Tris–HCl, pH 7.5 at 37°C, 50 mM NaCl, 0.5 mM DTT and 5 mM MgCl2 in a 100 μl reaction volume. Aliquots of 6 μl were removed from the reaction mixture at timed intervals (0, 1, 2, 5, 10, 15 and 30 min), quenched with 10 μl of loading dye (95% (v/v) formamide, 0.01% Bromophenol Blue and 25 mM EDTA) and subjected to denaturing gel electrophoresis (20% polyacrylamide containing 8.5 M urea in 0.5× TBE buffer). Gels were dried and visualized by phosphorimaging.

M13 cleavage assay

M13 ssDNA cleavage reactions were initiated by mixing M13 ssDNA (New England Biolabs) and DNA activator with Un1Cas12f1 (Cas14a1) RNP complex at 46°C. Cleavage assays were conducted in 2.5 mM Tris–HCl, pH 7.5 at 37°C, 25 mM NaCl, 0.25 mM DTT and 10 mM MgCl2. The final reaction mixture contained 5 nM M13 ssDNA, 100 nM ssDNA or dsDNA activator and 100 nM Un1Cas12f1 RNP. The reaction was initiated by addition of Un1Cas12f1 RNP complex and was quenched at timed intervals (0, 5, 15, 30, 60 and 90 min) by mixing with 3× loading dye solution (0.01% Bromophenol Blue and 75 mM EDTA in 50% (v/v) glycerol). Products were separated on an agarose gel and stained with SYBR Gold (Thermo Fisher Scientific). The sequences of the activators are listed in Supplementary Data S1 file.

Plasmid interference assay

Plasmid interference assays were performed in E. coli Arctic Express (DE3) strain bearing Cas12f systems (plasmids encoding CRISPR-Cas12f systems are listed in Supplementary Table S2). For Un1Cas12f1 (Cas14a1), E. coli BL21 (DE3) strain was transformed with pGB53 plasmid, which was engineered from the pLBH545_Tet-Cas14a1_Locus plasmid (gift from Jennifer Doudna, Addgene plasmid #112501) by removing tracrRNA and CRISPR array with Bsp1407I and AvrII and adding sgRNA encoding sequence with T7 promoter, HDV ribozyme and terminator sequences. The cells were grown at 37°C to an OD600 of ∼0.5 and electroporated with 100 ng of low copy number pSC101 target plasmids obtained by cloning oligoduplexes over EcoRI and XhoI or EcoRI and NheI restriction sites into pTHSSe_1 (gift from Christopher Voigt, Addgene plasmid #109233) or pSG4K5 (gift from Xiao Wang, Addgene plasmid #74492) plasmids, respectively (the sequences of the inserts are listed in Supplementary Data S1 file and the links to the plasmid sequences are provided in Supplementary Table S2). The co-transformed cells were further diluted by serial 10× fold dilutions and grown at 37°C for 16–20 h on plates containing inductor and antibiotics. For Un1Cas12f1 – AHT (50 ng/ml), IPTG (0.5 mM) and chloramphenicol (30 μg/ml); Cas12f2 from Micrarchaeota archaeon (Mi1) previously named Cas14b4 (21) – gentamycin (10 μg/ml) and carbenicillin (100 μg/ml); for all other Cas12f proteins – IPTG (0.5 mM), gentamycin (10 μg/ml) and carbenicillin (100 μg/ml) were used.

RESULTS

DNA targeting requirements by Cas12f proteins

To our knowledge, just a single Cas12f protein, Un1Cas12f1 (Cas14a1), has been experimentally characterized where it was shown to only cleave ssDNA targets in a PAM-independent manner (21). In our experimentation, shared structural features with other type V effectors capable of dsDNA target cleavage (Supplementary Figure S1) motivated us to evaluate Cas12f family proteins for dsDNA cleavage activity. To establish DNA cleavage requirements, we initially tested a Cas12f2 protein from Micrarchaeota archaeon (Mi1) previously named Cas14b4 (Supplementary Table S1) (21). First, spacers capable of targeting a randomized PAM plasmid library (22) were incorporated into its CRISPR array. The resulting modified CRISPR-Mi1Cas12f2 locus was synthesized, cloned into a low copy plasmid, and transformed into E. coli. A PAM determination assay (22,23) was adapted to test the ability of the Mi1Cas12f2 to recognize and cleave a dsDNA target in vitro (Figure 2A). This was accomplished by combining clarified lysate containing Mi1Cas12f2 protein and gRNAs expressed from the reengineered locus with the PAM library. Next, DNA breaks were captured by double-stranded adapter ligation, enriched by PCR, and deep sequenced as described previously (22,23). DNA cleavage occurring in the target sequence was evaluated by scanning regions in the protospacer for elevated frequencies of adapter ligation relative to negative controls (experiments using lysate from E. coli not transformed with the Mi1Cas12f2 locus). A slight increase (2.5-fold) in the number of adapter ligated sequences was recovered after the 21st protospacer position 3′ of the randomized PAM (Supplementary Figure S2A). Analysis of these fragments showed the recovery of a T-rich sequence (5′-TTAT-3′) immediately 5′ of the gRNA target only in the Mi1Cas12f2 treated sample (Figure 2B and Supplementary Figure S2B). To confirm the PAM sequence and the dsDNA cleavage position, a plasmid was constructed containing a target adjacent to the identified 5′-TTAT-3′ PAM sequence and subjected to cell lysate cleavage experiments. To increase Mi1Cas12f2 concentration in the cell lysate, a higher copy number DNA expression plasmid equipped with an inducible T7 promoter was also utilized. Sequencing of the target plasmid cleavage products confirmed cleavage at the 21st position (Supplementary Figure S2C). Reactions using deletion variants further confirmed that Mi1Cas12f2 was the sole endonuclease required for the observed dsDNA target recognition and cleavage activity (Supplementary Figure S2C).

Figure 2.

Figure 2.

dsDNA cleavage and PAM requirements for Cas12f effector proteins. (A) Workflow of the biochemical approach used to detect dsDNA cleavage and examine PAM recognition. E. coli cells were transformed with a plasmid carrying CRISPR-Cas12f loci engineered to target the PAM library, allowed to express, and then lysed. The resulting lysate containing Cas12f RNP complexes was used to assay for target cleavage and PAM recognition. For Un1Cas12f1 (Cas14a1), E. coli lysate expressing the nuclease was mixed with in vitro transcribed sgRNA. WebLogos of the PAM sequences recovered for Cas12f1 and Cas12f2 proteins from Cas14 (B) and type V-U3 (C) families.

Four additional Cas12f proteins from the Cas14 family (21) were also evaluated. These consisted of two Cas12f1 proteins from uncultured archaeons (Un1 and Un2 which were previously named Cas14a1 and Cas14a3, respectively) and two Cas12f2 proteins from Micrarchaeota archaeon (Mi2) and Aureabacteria bacterium (Au) (Supplementary Table S1). First, for Un2Cas12f1, Mi2Cas12f2, and AuCas12f2, T7 inducible expression plasmids were synthesized (Supplementary Table S2). They contained a minimal locus: the cas12f gene, sequence encoding a putative tracrRNA between the nuclease gene and CRISPR repeats, and a CRISPR array modified to target the PAM library. Then, similar to Mi1Cas12f2 experimentation, E. coli lysate from cells expressing the Cas12f nuclease and guide RNAs was combined with a randomized PAM library. Cleavage products were then captured and analyzed as described for Mi1Cas12f2 (Figure 2A). For Un1Cas12f1, a modified approach was used. Here, an in vitro transcribed single guide RNA (sgRNA) capable of targeting the PAM library was combined with E. coli lysate containing Un1Cas12f1 protein and then assayed for dsDNA target recognition and cleavage as described above. Similarly to Mi1Cas12f2, a cleavage signal distal to the PAM region was detected for all proteins. For Cas12f2 nucleases, this occurred just after the 21st protospacer position (for AuCas12f2, a second cleavage site was also observed after the 23rd position) while for Cas12f1 proteins the signal was found after the 24th position (Supplementary Figures S3A, S4A, S5A and S6A). Analysis of the sequences supporting cleavage yielded 5′ T-rich PAM recognition like that recovered for Mi1Cas12f2 (Figure 2B and Supplementary Figures S2B, S3B, S4B, S5B and S6B).

Next, to further explore the DNA cleavage requirements of miniature CRISPR-Cas effectors, we sought to evaluate the dsDNA cleavage activity of proteins from an uncharacterized putative group of class 2 CRISPR–Cas systems, type V-U3, which lack the proteins involved in adaptation (2,7). First, PSI-BLAST searches were performed to identify a group of CRISPR-associated proteins primarily from bacteria, in particular, lineages of Clostridia and Bacilli, that belonged to the type V-U3 family. They contained a conserved C-terminal tri-split RuvC domain similar to other Cas12 nucleases and a short variable N-terminal sequence as observed for the Cas12f proteins from Cas14 family. Next, five members were selected for functional characterization (Supplementary Table S1). In general, they were chosen to represent the uncovered diversity and ranged in size between 422–497 amino acids. As described for Cas12f proteins above, minimal CRISPR loci containing the nuclease gene, putative tracrRNA encoding region and CRISPR array reprogrammed to target the PAM library were then synthesized, cloned into an inducible expression plasmid (Supplementary Table S2) and examined for dsDNA target recognition and cleavage. As shown in Supplementary Figures S7A, S8A, S9A, S10A and S11A, all produced cleavage around the 24th position 3′ of the PAM region. For nucleases from Parageobacillus thermoglucosidasius (Pt) and Acidibacillus sulfuroxidans (As), secondary cleavage signals (>5% of all reads) were also recovered either before or after the 24th position. Like Cas12f proteins from Cas14 family, type V-U3 nucleases cleaved the dsDNA library in a 5′ PAM-dependent manner and altogether expanded miniature Cas nucleases PAM diversity to encompass not only T-rich but also C-rich motifs (Figure 2C and Supplementary Figures S7B, S8B, S9B, S10B and S11B).

Biochemical characterization of Un1Cas12f1 (Cas14a1) mediated dsDNA cleavage

Since a sgRNA solution was available for Un1Cas12f1 (Cas14a1) (21) we further probed the Un1Cas12f1 protein for programmable dsDNA cleavage using purified components. First, to decipher optimal reaction conditions, the effect of sgRNA spacer length, temperature, salt concentration and divalent metal ions were evaluated on Un1Cas12f1 dsDNA cleavage activity (Supplementary Figure S12). Experiments revealed that Un1Cas12f1 ribonucleoprotein (RNP) complex is a Mg2+-dependent endonuclease that functions best in low salt concentrations (5–25 mM) and is active in a wide temperature range ∼37–50°C, with a temperature optimum of ∼46°C (Supplementary Figure S12). Furthermore, sgRNA spacers of around 20 nt supported the most robust dsDNA cleavage activity (Supplementary Figure S12). Under optimized reaction conditions, supercoiled (SC) plasmid DNA containing a target sequence flanked by an Un1Cas12f1 PAM (5′-TTTA-3′) was completely converted to a linear form (FLL) indicating the formation of a dsDNA break (Figure 3A). Additionally, cleavage of linear DNA yielded DNA fragments of expected size, further validating Un1Cas12f1 mediated dsDNA break formation (Figure 3A). Next, we confirmed that Un1Cas12f1 requires both PAM and sgRNA recognition to cleave a dsDNA target (Supplementary Figure S13). Finally, alanine substitution of conserved RuvC active site residues (21) abolished cutting activity, confirming that the RuvC domain is essential for the observed dsDNA cleavage activity (Figure 3B).

Figure 3.

Figure 3.

Un1Cas12f1 (Cas14a1) RNP complex is a PAM-dependent dsDNA endonuclease. (A) Un1Cas12f1 RNP complex cleaves plasmid DNA targets in vitro in a PAM-dependent manner. (B) Alanine substitution of two conserved RuvC active site residues completely abolishes Un1Cas12f1 DNA cleavage activity. (C) Run-off sequencing of Un1Cas12f1 pre-cleaved plasmid DNA indicates that cleavage is centered around positions 20–24 bp 3′ of the PAM resulting in 5′ overhangs. (D) Oligoduplex cleavage patterns are consistent with staggered cleavage but differ from experiments assembled with plasmid DNA (shown in C). (E) Collateral non-specific M13 ssDNA degradation activity by Un1Cas12f1 triggered by ssDNA and PAM-containing dsDNA. Un1Cas12f1 RNP complexes were assembled using a sgRNA (20 nt spacer). TS – target strand, NTS – non-target strand, SC – supercoiled, FLL – full length linear, OC – open circular.

The type of dsDNA break generated by Un1Cas12f1 was examined next. Using run-off sequencing, we observed that Un1Cas12f1 makes 5′ staggered overhanging DNA cut-sites. Cleavage predominantly occurred centered around positions 20–24 bp in respect to PAM sequence (Figure 3C) and was independent of spacer length or plasmid topology (Supplementary Figure S14). The cleavage pattern of Un1Cas12f1 was also assessed on synthetic double-stranded oligodeoxynucleotides. As illustrated in Figure 3D and Supplementary Figure S15, a 5′ staggered cut pattern, albeit with less strictly defined cleavage positions than observed with larger DNA fragments was seen.

Next we investigated if the non-specific ssDNA degradation activity of Un1Cas12f1 could be induced not just by ssDNA targets (21) but also by dsDNA targets. First, the ability of Un1Cas12f1 to indiscriminately degrade single-stranded M13 DNA in the presence of a ssDNA target without a PAM was confirmed (Figure 3E). Then, a dsDNA target containing a 5′ PAM and sgRNA target for Un1Cas12f1 was also tested for its ability to trigger non-selective ssDNA degradation. As shown in Figure 3E, the trans-acting ssDNase activity of Un1Cas12f1 was activated by both ssDNA and dsDNA targets, similar to observations made for Cas12a (12). Additionally, in the absence of a target, the Un1Cas12f1 RNP complex non-selectively degraded single-stranded M13 DNA (Figure 3E) as well as synthetic single-stranded DNA oligodeoxynucleotides (Supplementary Figure S15) albeit at much slower rates suggesting that it also possesses a non-specific single-stranded DNA nuclease activity.

Cas12f mediated plasmid DNA interference in E. coli

We next tested if CRISPR-Cas12f systems (from Cas14 and type V-U3 protein effector families) can be programmed to target and cleave invading dsDNA in a heterologous E. coli host. First, an E. coli plasmid DNA interference assay was adopted (25,26) using a minimal Cas12f CRISPR locus modified to target the incoming low copy number plasmid DNA. To assess transformation efficiency, each experiment was serially diluted by 10× and compared with controls (experiments performed with a plasmid that does not contain a target site). Surprisingly, none of the Cas12f nucleases from Cas14 protein family interfered with plasmid DNA transformation as evidenced by similar recovery of resistant colonies compared to controls (Figure 4A). These findings, however, are in agreement with a previous study which showed that Un1Cas12f1 (Cas14a1) was incapable of depleting PAM plasmid libraries in a heterologous E. coli host and presumably due to this reason failed to detect PAM-dependent dsDNA cleavage (21). In contrast, type V-U3 effectors from A. sulfuroxidans (As) and Syntrophomonas palmitatica (Sp) both produced notable levels of plasmid interference and slight interference was also observable for nucleases from P. thermoglucosidasius (Pt) and Ruminococcus sp. (Ru) (Figure 4B).

Figure 4.

Figure 4.

Cas12f mediated plasmid DNA interference in E. coli. Plasmid interference assay for Cas12f effectors from Cas14 (A) and type V-U3 (B) protein families. Cells bearing a minimal CRISPR-Cas12f locus were transformed with a low copy number plasmid DNA containing the target sequence. The engineered CRISPR loci contained 33–39 nt spacers except for CRISPR-Un1Cas12f1 (Cas14a1) where the CRISPR locus was replaced with a T7 expressed sgRNA (20 nt spacer). To assess transformation efficiency, each experiment was serially diluted by 10× and compared with controls (experiments performed with a plasmid that does not contain a target site). Red boxes indicate Cas12f variants that showed visually detectable levels of DNA interference activity.

DISCUSSION

Contrary to previous hypotheses suggesting that Cas12f proteins from Cas14 and type V-U3 CRISPR families lack the ability to defend against invading dsDNA (7,21), we illustrate that despite their miniature size some of these nucleases have this capacity. Here, using biochemical approaches, we provide evidence demonstrating that these compact proteins are programmable enzymes capable of introducing targeted dsDNA breaks like larger effectors, Cas9 and Cas12. First, we uncovered PAM-dependent dsDNA cleavage activity similar to other type V interference proteins (13,14,27) for 10 Cas12f family members. Then, we purified the nuclease and sgRNA for Un1Cas12f1 (Cas14a1) and reconstituted its nuclease activity in vitro confirming that both PAM and sgRNA recognition are required for dsDNA target cleavage. Additionally, we observed that ssDNA collateral nuclease activity was triggered not only by ssDNA targets (21) but also by dsDNA targets, a feature shared by most other type V family members (12,13). Finally, using plasmid DNA interference assays, we report that like shown previously (21), Cas12f effectors from the Cas14 family failed to protect against invading dsDNA in E. coli, while even more compact CRISPR-associated type V-U3 nucleases, AsCas12f1 (422 aa) and SpCas12f1 (497 aa), efficiently interfered with plasmid transformation. Taken together, this confirms that at least some Cas12f effectors function like Cas9 and Cas12 nucleases to recognize, cleave, and protect against dsDNA invaders in a heterologous host. In all, our results significantly improve our understanding of novel CRISPR-Cas systems and pave the way for the adoption of programmable miniature nucleases for genome editing applications.

DATA AVAILABILITY

Sequencing data are available on the NCBI Sequence Read Archive under BioProject ID PRJNA607069 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA607069).

Supplementary Material

gkaa208_Supplemental_Files

ACKNOWLEDGEMENTS

We would like to thank S. Dooley, H. Lin and M. King for informatics support, and C. Skrdlant and G. Zastrow-Hayes for Illumina sequencing support.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

A joint collaboration between Vilnius University and Corteva Agriscience.

Conflict of interest statement. T.K., J.K.Y., Z.H. and V.S. have filed patent applications related to the manuscript. J.K.Y., Z.H., S.P., V.D. and S.G. are employees of Corteva Agriscience. V.S. is a Chairman of and has financial interest in CasZyme.

REFERENCES

  • 1. Koonin E. V., Makarova K.S., Wolf Y.I.. Evolutionary genomics of defense systems in Archaea and Bacteria. Annu. Rev. Microbiol. 2017; 71:233–261. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Koonin E. V., Makarova K.S., Zhang F.. Diversity, classification and evolution of CRISPR-Cas systems. Curr. Opin. Microbiol. 2017; 37:67–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Mohanraju P., Makarova K.S., Zetsche B., Zhang F., Koonin E.V., Van der Oost J.. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science. 2016; 353:aad5147. [DOI] [PubMed] [Google Scholar]
  • 4. Jiang F., Doudna J.A.. CRISPR–Cas9 structures and mechanisms. Annu. Rev. Biophys. 2017; 46:505–529. [DOI] [PubMed] [Google Scholar]
  • 5. Jackson S.A., McKenzie R.E., Fagerlund R.D., Kieper S.N., Fineran P.C., Brouns S.J.J.. CRISPR-Cas: adapting to change. Science. 2017; 356:eaal5056. [DOI] [PubMed] [Google Scholar]
  • 6. Makarova K.S., Wolf Y.I., Iranzo J., Shmakov S.A., Alkhnbashi O.S., Brouns S.J.J., Charpentier E., Cheng D., Haft D.H., Horvath P. et al.. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 2020; 18:67–83. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Shmakov S., Smargon A., Scott D., Cox D., Pyzocha N., Yan W., Abudayyeh O.O., Gootenberg J.S., Makarova K.S., Wolf Y.I. et al.. Diversity and evolution of class 2 CRISPR–Cas systems. Nat. Rev. Microbiol. 2017; 15:169–182. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Gasiunas G., Barrangou R., Horvath P., Siksnys V.. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:2579–2586. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J.A., Charpentier E.. A programmable dual-RNA-Guided DNA endonuclease in adaptive bacterial immunity. Science. 2012; 337:816–821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Ma E., Harrington L.B., O’Connell M.R., Zhou K., Doudna J.A.A., O’Connell M.R., Zhou K., Doudna J.A.A.. Single-stranded DNA cleavage by divergent CRISPR-Cas9 enzymes. Mol. Cell. 2015; 60:398–407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Zhang Y., Rajan R., Seifert H.S., Mondragón A., Sontheimer E.J.. DNase H activity of Neisseria meningitidis Cas9. Mol. Cell. 2015; 60:242–255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Chen J.S., Ma E., Harrington L.B., Da Costa M., Tian X., Palefsky J.M., Doudna J.A.. CRISPR-Cas12a target binding unleashes indiscriminate single-stranded DNase activity. Science. 2018; 360:436–439. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Yan W.X., Hunnewell P., Alfonse L.E., Carte J.M., Keston-Smith E., Sothiselvam S., Garrity A.J., Chong S., Makarova K.S., Koonin E. V. et al.. Functionally diverse type V CRISPR-Cas systems. Science. 2019; 363:88–91. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Zetsche B., Gootenberg J.S., Abudayyeh O.O., Slaymaker I.M., Makarova K.S., Essletzbichler P., Volz S.E., Joung J., van der Oost J., Regev A. et al.. Cpf1 is a single RNA-Guided endonuclease of a class 2 CRISPR-Cas system. Cell. 2015; 163:759–771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Knott G.J., Doudna J.A.. CRISPR-Cas guides the future of genetic engineering. Science. 2018; 361:866–869. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Klompe S.E., Sternberg S.H.. Harnessing ‘A billion years of Experimentation’: the ongoing exploration and exploitation of CRISPR–Cas immune systems. Cris. J. 2018; 1:141–158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Komor A.C., Badran A.H., Liu D.R.. CRISPR-based technologies for the manipulation of eukaryotic genomes. Cell. 2017; 168:20–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Pickar-Oliver A., Gersbach C.A.. The next generation of CRISPR–Cas technologies and applications. Nat. Rev. Mol. Cell Biol. 2019; 20:490–507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Wu Z., Yang H., Colosi P.. Effect of genome size on AAV vector packaging. Mol. Ther. 2010; 18:80–86. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Lino C.A., Harper J.C., Carney J.P., Timlin J.A.. Delivering CRISPR: a review of the challenges and approaches. Drug Deliv. 2018; 25:1234–1257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Harrington L.B., Burstein D., Chen J.S., Paez-Espino D., Ma E., Witte I.P., Cofsky J.C., Kyrpides N.C., Banfield J.F., Doudna J.A.. Programmed DNA destruction by miniature CRISPR-Cas14 enzymes. Science. 2018; 362:839–842. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Karvelis T., Gasiunas G., Young J., Bigelyte G., Silanskas A., Cigan M., Siksnys V.. Rapid characterization of CRISPR-Cas9 protospacer adjacent motif sequence elements. Genome Biol. 2015; 16:253. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Karvelis T., Young J.K., Siksnys V.. A pipeline for characterization of novel Cas9 orthologs. Methods Enzymol. 2019; 616:219–240. [DOI] [PubMed] [Google Scholar]
  • 24. Stormo G.D. Modeling the specificity of protein-DNA interactions. Quant. Biol. 2013; 1:115–130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Burstein D., Harrington L.B., Strutt S.C., Probst A.J., Anantharaman K., Thomas B.C., Doudna J.A., Banfield J.F.. New CRISPR-Cas systems from uncultivated microbes. Nature. 2017; 542:237–241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Sapranauskas R., Gasiunas G., Fremaux C., Barrangou R., Horvath P., Siksnys V.. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 2011; 39:9275–9282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Liu J.-J., Orlova N., Oakes B.L., Ma E., Spinner H.B., Baney K.L.M., Chuck J., Tan D., Knott G.J., Harrington L.B. et al.. CasX enzymes comprise a distinct family of RNA-guided genome editors. Nature. 2019; 566:218–223. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

gkaa208_Supplemental_Files

Data Availability Statement

Sequencing data are available on the NCBI Sequence Read Archive under BioProject ID PRJNA607069 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA607069).


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press

RESOURCES