Significance
CRISPR-Cas systems provide prokaryotic adaptive immunity against invading genetic elements. For immunity, fragments of invader DNA are integrated into CRISPR arrays by Cas1 and Cas2 proteins. Type I-F systems contain a unique fusion of Cas2 to Cas3, the enzyme responsible for destruction of invading DNA. Structural, biophysical, and biochemical analyses of Cas1 and Cas2-3 from Pectobacterium atrosepticum demonstrated that they form a 400-kDa complex with a Cas1₄:Cas2-3₂ stoichiometry. Cas1–Cas2-3 binds, processes, and catalyzes the integration of DNA into CRISPR arrays independent of Cas3 activity. The arrangement of Cas3 in the complex, together with its redundant role in processing and integration, supports a scenario where Cas3 couples invader destruction with immunization, a process recently demonstrated in vivo.
Keywords: CRISPR-Cas, phage resistance, horizontal gene transfer, spacer acquisition, mass spectrometry
Abstract
CRISPR-Cas adaptive immune systems capture DNA fragments from invading bacteriophages and plasmids and integrate them as spacers into bacterial CRISPR arrays. In type I-E and II-A CRISPR-Cas systems, this adaptation process is driven by Cas1–Cas2 complexes. Type I-F systems, however, contain a unique fusion of Cas2 with Cas3, the type I effector helicase and nuclease responsible for invader destruction. By using biochemical, structural, and biophysical methods, we present a structural model of the 400-kDa Cas1₄–Cas2-3₂ complex from Pectobacterium atrosepticum with bound protospacer substrate DNA. Two Cas1 dimers assemble on a Cas2 domain dimeric core, which is flanked by two Cas3 domains forming a groove where the protospacer binds to Cas1–Cas2. We developed a sensitive in vitro assay and demonstrated that Cas1–Cas2-3 catalyzed spacer integration into CRISPR arrays. The integrase domain of Cas1 was necessary, whereas integration was independent of the helicase or nuclease activities of Cas3. Integration required at least partially duplex protospacers with free 3′-OH groups, and leader-proximal integration was stimulated by integration host factor. In a coupled capture and integration assay, Cas1–Cas2-3 processed and integrated protospacers independent of Cas3 activity. These results provide insight into the structure of protospacer-bound type I Cas1–Cas2-3 adaptation complexes and their integration mechanism.
Clustered regularly interspaced short palindromic repeats (CRISPRs) and their Cas proteins are prokaryote adaptive immune systems that provide defense against invading elements, typically phages and plasmids (1, 2). The systems are evolutionarily diverse, organized into two major classes and multiple types and subtypes (3). CRISPR arrays consist of repeats separated by spacers that are usually derived from invaders (4). Arrays are transcribed and processed into CRISPR RNAs (crRNAs) containing a single spacer sequence (5, 6). In type I CRISPR-Cas systems, crRNAs and Cas proteins assemble into Cascade complexes (5, 7, 8) that bind complementary sequences (protospacers) to elicit invader DNA destruction by the nuclease-helicase protein Cas3 (9–11).
Upon encountering an invader with no matches to existing spacers, new invader-derived spacers can be selected and processed (captured) and integrated into the CRISPR array, which is called naïve adaptation (12–14). Spacer capture is biased to occur beside stalled replication forks and other DNA breaks, and the RecBCD complex is proposed to have a role in generating spacer precursors (15, 16). However, adaptation still occurs in the absence of RecBCD, albeit less efficiently, indicating that other pathways exist. Additionally, in type I systems, spacers are acquired next to protospacer adjacent motifs (PAMs), which are also required for interference (17). Spacer integration relies on Cas1 and Cas2, which are almost universal in CRISPR-Cas systems (3, 18). Cas1 is a nuclease/integrase and Cas2 is a small, apparently structural protein (19, 20). Cas1 and Cas2 domain proteins were first shown to interact in vivo in the type I-F system of Pectobacterium atrosepticum (21). More recently, structures of type I-E Cas1–Cas2 complexes have illuminated aspects of protospacer binding (19, 22, 23). Integration has been studied in vitro for the type I-E and II-A Cas1–Cas2 complexes (24, 25), and the reverse reaction (disintegration) only requires Cas1 (26). These studies provided valuable insight into integration in these specific systems. However, there is considerable CRISPR-Cas diversity, and adaptation mechanisms in other systems are unexplored.
Type I-F systems encode a Cascade (also known as Csy) complex, a Cas1 protein, and a unique fusion of Cas2 and Cas3 (Cas2-3), meaning this protein engages in both adaptation and interference (21, 27). Formation of type I-F Cas1–Cas2-3 complexes is likely to be important for rapid primed adaptation (28, 29). Priming can occur either during interference or in response to invaders that have escaped interference through protospacer or PAM mutations, triggering rapid acquisition from the foreign element to restore immunity (29–32). Multiple type I systems undergo priming, and for type I-E it requires Cascade, crRNA, Cas1, Cas2, and Cas3 (28, 30, 33, 34). During type I-F priming, new spacers are captured from regions adjacent to the escaped protospacer in a process that involves the 3′-to-5′ translocation of a Cas1–Cas2-3 complex (28, 29). The Cas2-3 fusion in I-F systems means that when Cas3 is recruited to targets identified by a Cascade–crRNA complex, Cas2 will always be in tow, which can result in Cas1 corecruitment. This may directly couple Cas3 helicase and nuclease functions with Cas1 and Cas2 adaptation activities. It is plausible that related Cas1–Cas2–Cas3 complexes occur in all type I systems to promote priming.
The structure, stoichiometry, and mechanism of adaptation complexes containing Cas1, Cas2, and Cas3 domains are unknown. Therefore, we characterized the type I-F Cas1–Cas2-3 complex biophysically and, by reconstituting adaptation in vitro, determined its role in spacer capture and integration. We propose that Cas1 and Cas2 alone are sufficient for PAM recognition, spacer capture, and integration, whereas Cas3 likely acts earlier to increase adaptation during priming and interference by acting as a helicase and nuclease in the generation of spacer precursors.
Results
Cas1 and Cas2-3 Form a 400-kDa Cas1₄–Cas2-3₂ Complex.
To investigate the Cas1–Cas2-3 complex, we coexpressed StrepII-tagged Cas1 (37.6 kDa) and untagged Cas2-3 (124.9 kDa) from P. atrosepticum and purified the complex by affinity and size-exclusion chromatography (SEC) (Fig. 1A and Fig. S1A). The SEC elution profile of Cas1–Cas2-3 indicated a mass of ∼390 kDa (Fig. S1A), and SEC coupled to right-angle light scattering (RALS) gave a similar estimate of 397 kDa (Fig. 1B). Sedimentation velocity by analytical ultracentrifugation (AUC) yielded a major species at 13.5S, which corresponded to ∼380 kDa, whereas a minor species at 4.5S (∼75 kDa) was consistent with some Cas1 dimer dissociating from Cas1–Cas2-3 (Fig. 1C).
Fig. 1.
Cas1 and Cas2-3 form a 400-kDa Cas1₄–Cas2-3₂ complex. (A) Expression and purification of StrepII-Cas1–Cas2-3. (A, Left) The coexpression plasmid is depicted. (A, Right) SDS/PAGE of whole-cell (WC), soluble (Sol), Strep-Tactin elution (Eluate), and SEC fractions. A SEC chromatogram with molecular weight (MW) standards is shown in Fig. S1A. (B) SEC-RALS. MW (blue) was calculated from the refractive index (RI; red) and RALS (black). (C) Sedimentation velocity. (D) Native MS. Predicted stoichiometries are shown, and iBAQ is shown in Table S1.
Fig. S1.
SEC and negative-stained transmission electron microscopy of the Cas1–Cas2-3 complex. (A) Chromatogram displaying size-exclusion chromatography of the Cas1–Cas2-3 complex purified after an initial Strep-Tactin affinity step, overlaid with molecular mass size standards (dotted line). mAU, milli absorbance units. (B) Representative transmission electron micrograph of Cas1–Cas2-3 complexes in negative stain. (Scale bar, 500 nm.) (C) Representative 2D class averages. (D) Volumes (3D) obtained by electron tomography corresponding to the class averages shown in C. (E) Three-dimensional classification. The number of particles used for each class is indicated above the arrows, and the last arrow represents the final reconstruction after refinement. (F) Different views of the final map with the Cas1–Cas2-3–protospacer model docked. (Scale bar, 10 nm.)
To clarify the Cas1:Cas2-3 ratio, we estimated the absolute protein quantities via shotgun mass spectrometry (MS) using intensity-based absolute quantification (iBAQ) (35), which revealed a 2:1 ratio (Table S1). Top-down MS on a denatured Cas1–Cas2-3 complex enabled an accurate mass measurement of Cas1 as 37,557.62 ± 0.84 Da, whereas free Cas2-3 was not observed. To define the stoichiometry, we determined the accurate mass by native Orbitrap MS. The spectra showed four distinct charge distributions (Fig. 1D) originating from species with molecular masses (and relative abundance) of 363,983 ± 63 Da (7%), 400,471 ± 31 Da (16%), 409,413 ± 54 Da (47%), and 418,403 ± 42 Da (30%). The 400-kDa species is in complete agreement with the predicted stoichiometry of Cas1₄–Cas2-3₂. The larger species containing 9 and 18 kDa of extra mass are likely to be complexes copurified with captured DNA. This mass could correspond to 29 to 30 nt/bp of ssDNA (9 kDa), dsDNA (18 kDa), or variations thereof. The less abundant 364-kDa species was assigned as Cas1₃–Cas2-3₂. In conclusion, Cas1 and Cas2-3 form a 400-kDa complex with a stoichiometry of Cas1₄:Cas2-3₂, with a significant proportion bound to nucleic acids (see below).
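The stoichiometry assignment can be checked with simple mass arithmetic. The sketch below is illustrative only and is not the authors' analysis. It uses the Cas1 mass from top-down MS and approximates Cas2-3 by its nominal 124.9 kDa, because free Cas2-3 was not measured. The function names are hypothetical, and the average mass of about 310 Da per nucleotide is an assumed round figure.

```python
# Masses in Da. CAS1 is the top-down MS value from the text; CAS2_3 is the
# nominal 124.9 kDa (an approximation, since free Cas2-3 was not observed).
CAS1 = 37_557.62
CAS2_3 = 124_900.0
DA_PER_NT = 310.0  # assumed average mass per nucleotide, not from the source


def predicted_mass(n_cas1, n_cas2_3):
    """Mass of a protein-only complex with the given subunit counts."""
    return n_cas1 * CAS1 + n_cas2_3 * CAS2_3


def best_stoichiometry(mass, max_copies=6):
    """Subunit counts (Cas1, Cas2-3) whose predicted mass is closest to `mass`."""
    candidates = [(a, b)
                  for a in range(1, max_copies + 1)
                  for b in range(1, max_copies + 1)]
    return min(candidates, key=lambda ab: abs(predicted_mass(*ab) - mass))


# Protein-only species from native MS: match them to a stoichiometry.
for observed in (363_983, 400_471):
    print(observed, "Da ->", best_stoichiometry(observed))

# Heavier species: express the extra mass over the 400-kDa species as nucleotides.
apo = 400_471
for observed in (409_413, 418_403):
    extra = observed - apo
    print(f"{observed} Da: +{extra} Da, roughly {extra / DA_PER_NT:.0f} nt")
```

The nearest-mass search is only meaningful for the protein-only species. The two heavier species are instead compared with the 400-kDa species, because their extra mass is attributed to bound DNA rather than to additional subunits.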
Table S1.
Shotgun MS of the Cas1–Cas2-3 complex
| ID | Peptide counts (unique) | Protein name | Peptides | Unique peptides | Molecular mass, kDa | Length, amino acids | PEP | Coverage, % | Intensity | iBAQ |
| Q6D0W9 | 100 | Cas2:3 | 91 | 91 | 124.89 | 1098 | 0 | 55.7 | 1.58E+11 | 2.14E+09 |
| Q6D0X0 | 24 | Cas1 | 23 | 23 | 36.258 | 326 | 0 | 39.6 | 7.26E+10 | 4.27E+09 |
ID: identifier(s) of the protein(s) contained in the protein group. Peptide counts (unique): number of peptides associated with each protein in the protein group. Here, distinct peptide sequences are counted. Modified forms or different charges are counted as one peptide. Protein name: generic protein name(s) contained within the group. Peptides: total number of peptide sequences associated with the protein group (i.e., for all proteins in the group). Unique peptides: total number of unique peptides associated with the protein group (i.e., these peptides are not shared with another protein group). Molecular mass: molecular mass of the leading protein sequence contained in the protein group. Length: length of the leading protein sequence contained in the group. PEP: posterior error probability of the identification. This value essentially operates as a P value, where smaller is more significant. Coverage: percentage of the sequence that is covered by the identified peptides in this sample of the longest protein sequence contained within the group. Intensity: summed extracted ion current of all isotopic clusters associated with the identified amino acid sequence. iBAQ: values calculated by MaxQuant by dividing the (raw) intensities by the number of theoretical peptides. Thus, iBAQ values are approximately proportional to the molar quantities of the proteins and can be used to roughly estimate the relative abundance of the proteins within each sample.
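As a rough check of the 2:1 ratio, the iBAQ values in Table S1 can be compared directly, because iBAQ is approximately proportional to molar amount. The snippet is only a sketch, and the dictionary and variable names are illustrative.

```python
# iBAQ values copied from Table S1.
ibaq = {"Cas1": 4.27e9, "Cas2-3": 2.14e9}

# iBAQ scales with molar quantity, so the quotient approximates the
# Cas1 : Cas2-3 molar ratio in the purified complex.
ratio = ibaq["Cas1"] / ibaq["Cas2-3"]
print(f"Cas1 : Cas2-3 = {ratio:.2f} : 1")
```

A quotient close to 2 is what four Cas1 and two Cas2-3 subunits per complex would give, although iBAQ alone cannot distinguish that from other assemblies with the same ratio.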
Molecular Architecture of Cas1–Cas2-3.
Type I-F Cas2-3 has multiple domains: an N-terminal Cas2, followed by a Cas3 region containing an HD endonuclease and an SF2 helicase (two RecA domains), and an accessory C-terminal domain (36) (Fig. 2A). To gain structural insight into the full complex, we used modeling, cross-linking, and electron microscopy (EM). To create an initial model of P. atrosepticum Cas1–Cas2-3, we made a homology model from Pseudomonas aeruginosa Cas2-3 (37) and used the Escherichia coli type I-E Cas1–Cas2–protospacer structure (22, 23) and P. atrosepticum Cas1 dimer (38). To optimize the model, we performed cross-linking with MS to identify proximal regions within proteins and obtain spatial restraints. Due to lysine side-chain flexibility, we assumed a 40-Å maximum distance between Cα and Cα. There were 19 unique cross-links for Cas1–Cas2-3 (Fig. 2A and Table S2), including 13 intralinks (within a single protein) and 6 interlinks (between two proteins). Of the 13 identified intralinks, 2 belong to Cas1 and 11 to the Cas3 region of Cas2-3. Mapping the cross-links on our model placed two Cas1 intralinks and eight Cas2-3 intralinks within 40 Å, validating the Cas2-3 homology model. The three Cas2-3 outliers suggest that residues 500 and 823, in the RecA1 and RecA2 domains, reside in flexible regions.
Fig. 2.
Molecular architecture of Cas1₄–Cas2-3₂. (A) Schematic of intralinks (depicted as gray lines; thin lines are outliers) and interlinks (black) (Table S2). Cas1 is in green, the Cas2-3 domain organization is shown with Cas2 in yellow, the HD nuclease of Cas3 in red, and the remainder of Cas3 in blue (SF2 helicase, composed of RecA1 and RecA2, and a C-terminal domain). (B and C) Modeling of Cas1–Cas2-3 into EM density. Interprotein cross-links are shown (black lines) and EM data are shown in Fig. S1 B–F. (D) DNA-interacting peptides were mapped on Cas1 (positions 201 to 213 and 273 to 294; green) and the Cas2 domain of Cas2-3 (82 to 95; yellow). Details are in Dataset S1. (E and F) The final Cas1–Cas2-3 model with features labeled and protospacer DNA shown in orange.
Table S2.
Chemical cross-linking mass spectrometry of the Cas1–Cas2-3 complex
| No. | Peptide A | Peptide B | Position A | Position B | No. of CSMs* | Length, Å |
| Cas1 intralinks | ||||||
| 1 | TSNDVLVQEAMMT[K]ALYR | FLDHGNYLAYGLAAVSTWVLGLPHGLAVLHG[K]TR | 196 | 257 | 1 | 23.6 |
| 2 | TSNDVLVQEAMMT[K]ALYR | A[K]RGGGTDLANR | 196 | 215 | 24 | 13.0 |
| Cas2-3 intralinks | ||||||
| 1 | H[K]GFRQR | L[K]LDEDDLAVLIGSQAVQDLHEMRK | 390 | 477 | 24 | 8.9 |
| 2 | AR[K]FAGR | AL[K]LPSFMHFSQLDQRLTVHLAR | 290 | 297 | 276 | 10.6 |
| 3 | QAIE[K]SSQR | MFDLHSQHHQQHENG[K]TVSLGLVR | 745 | 823 | 36 | 12.9 |
| 4 | [K]TGEYK | L[K]AWLER | 539 | 350 | 3 | 22.6 |
| 5 | [K]FAGR | L[K]AWLER | 539 | 290 | 2 | 28.1 |
| 6 | [K]TGEYK | [K]ENQQR | 350 | 500 | 1 | 29.7 |
| 7 | [K]ENQQR | KLDDL[K]RPPAAFWWR | 500 | 971 | 2 | 33.1 |
| 8 | [K]ENQQR | L[K]AWLER | 539 | 500 | 4 | 37.5 |
| 9 | H[K]GFR | QAIE[K]SSQR | 823 | 390 | 1 | 45.2 |
| 10 | [K]ENQQR | QAIE[K]SSQR | 823 | 500 | 2 | 54.8 |
| 11 | L[K]AWLER | QAIE[K]SSQR | 823 | 539 | 1 | 56.9 |
| Cas1 to Cas2-3 interlinks | ||||||
| 1 | FTF[K]SEHLQALLDR | QAIE[K]SSQR | 163 | 823 | 4 | 20.6 and 20.7 |
| 2 | TSNDVLVQEAMMT[K]ALYR | [K]ENQQR | 196 | 500 | 2 | 23.7 and 25.4 |
| 3 | YQ[K]GLTDCR | [K]ENQQR | 176 | 500 | 2 | 34.9 and 36.0 |
| 4 | YQ[K]GLTDCR | [K]FAGR | 176 | 290 | 1 | 64.4 and 65.5 |
| 5 | TSNDVLVQEAMMT[K]ALYR | [K]TGEYK | 196 | 350 | 2 | 25.9 and 26.3 |
| 6 | YQ[K]GLTDCR | QAIE[K]SSQR | 176 | 823 | 2 | 30.7 and 30.6 |
*CSM, cross-link spectrum match (by analogy to PSM, peptide spectrum match): total number of identified cross-link peptide spectra matched for the protein.
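The cross-link counts quoted in the text can be reproduced from the distances in Table S2 by applying the 40-Å Cα–Cα cutoff. The following is a minimal sketch with hypothetical variable and function names. Each interlink contributes the two distances listed for it in the table, which gives the 12 intermolecular connections mentioned in the text.

```python
MAX_CA_CA = 40.0  # Å, the Cα–Cα cutoff assumed in the text

# Modeled distances (Å) copied from Table S2.
cas1_intra = [23.6, 13.0]
cas2_3_intra = [8.9, 10.6, 12.9, 22.6, 28.1, 29.7, 33.1, 37.5, 45.2, 54.8, 56.9]
# Two distances are listed per interlink, so 6 interlinks give 12 values.
inter = [20.6, 20.7, 23.7, 25.4, 34.9, 36.0, 64.4, 65.5, 25.9, 26.3, 30.7, 30.6]


def satisfied(distances, cutoff=MAX_CA_CA):
    """Count distances that fall within the cross-linking cutoff."""
    return sum(d <= cutoff for d in distances)


for name, dists in [("Cas1 intralinks", cas1_intra),
                    ("Cas2-3 intralinks", cas2_3_intra),
                    ("Cas1 to Cas2-3 interlinks", inter)]:
    print(f"{name}: {satisfied(dists)} of {len(dists)} within {MAX_CA_CA:.0f} Å")
```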
Next, we examined the complex by EM and obtained a map of ∼25-Å resolution (Fig. 2 B and C and Fig. S1 B–F). To find the best fit, Cas1 and Cas2-3 were modeled into the density map by taking into account the cross-linking data and a flexible linker between the Cas2 and Cas3 domains (residues 85 to 100). Supporting this flexibility, the P. aeruginosa Cas2-3 cryo-EM structure had no Cas2 density (39). Of the 12 intermolecular connections between Cas1 and Cas3 (Fig. 2 A–C and Table S2), 10 were within cross-linking range. The Cas3 helicase domain has conformational flexibility (9), which might contribute to the cross-link outliers. Overall, the EM map accommodated our model based on the stoichiometry and cross-linking analyses, but also revealed a region of density not accounted for by Cas1–Cas2-3 alone (Fig. 2B). Combined with the extra mass detected by native MS, we hypothesized this density was due to protospacer DNA.
To test whether Cas1–Cas2-3 contained DNA and to analyze DNA placement, we designed an assay involving formaldehyde cross-linking of proteins to DNA and LC-MS/MS detection of enriched DNA-bound peptides. Three peptides were highly enriched in the DNA–protein cross-linked samples, indicating their interaction with DNA (Dataset S1). These were Cas1 residues 201 to 213 and 273 to 294, and Cas2-3 peptide 82 to 95 within the Cas2 domain (Fig. 2D). These positions in the type I-F complex are consistent with the protospacer-binding regions in type I-E Cas1–Cas2 (22, 23), which enabled modeling of a captured protospacer on Cas1–Cas2-3 that correlated with the location of additional EM density (Fig. 2 D–F). We also directly assayed DNA binding by Cas1–Cas2-3 using electrophoretic mobility-shift assays and observed that Cas1–Cas2-3 can bind short linear ssDNA and dsDNA substrates of the same sequence with similar affinities, consistent with the masses observed in native MS (Fig. S2A). Interestingly, the type I-E Cas1 protein was recently shown to associate with non-dsDNA in vivo (40).
Fig. S2.
ssDNA/dsDNA binding by Cas1–Cas2-3, leader end integrations, and metal cofactors. (A) Electrophoretic mobility-shift assay with Cas1–Cas2-3 (4, 6, 9, 13.3, 20, 30, 45, and 67.5 nM) bound to 1 nM fluorescently labeled 32-nt ssDNA (PF1794; Upper) or 32-bp dsDNA (PF1794+PF1648; Lower). Reactions were performed in triplicate and used to calculate the binding affinities (Right). Data shown are the mean ± SEM. (B) Schematic of the pPF727 (pCRISPR1) plasmid used in integration reactions. A portion of the upstream leader is depicted with the putative IHF-binding sites indicated (yellow). The first three repeats (blue) and two spacers (red) are shown. (C) A 32-bp protospacer (PF1647+PF1648) was incubated with Cas1–Cas2-3 and pCRISPR1 for 60 min before integration was detected by PCR using primer PF1649 (FW) or PF1650 (RV) in combination with a leader-specific primer, PF1792, as schematically depicted (Left). (D) Similar assay as in Fig. 3B, but using different divalent metal ions (0.2 mM) to stimulate Cas1–Cas2-3 integrase activity. The marker shown is in bp.
The resulting model (Fig. 2 E and F) shows two Cas1 dimers assembled on either end of a Cas2 domain dimeric core, similar to the E. coli I-E complex (19, 22, 23). The two Cas3 domains flank the Cas2 core and span the region between the Cas1 dimers, with the Cas3 HD region in proximity to the catalytic integrase lobe of one Cas1 dimer and the Cas3 C-terminal region near the inactive lobe of the other Cas1 dimer. The protospacer-binding surface is augmented by the HD nuclease subdomains of the Cas3 domains, forming a long groove. In Thermobifida fusca Cas3, ssDNA can be guided by the helicase domain toward the HD nuclease domain for fragmentation (9). Remarkably, in Cas1–Cas2-3 the DNA-binding site of each HD domain is adjacent to the catalytic integrase lobe of Cas1, suggesting a pipeline of DNA-processing active sites that might deliver DNA to Cas1.
Cas1–Cas2-3 Catalyzes Spacer Integration in Vitro.
To reconstitute in vitro spacer integration by Cas1–Cas2-3, we established a novel assay. P. atrosepticum has three CRISPR arrays and all are active in vivo (28, 29). Because CRISPR1 is the most active (∼70% of acquisitions), Cas1–Cas2-3 was incubated with a 32-bp dsDNA protospacer and a plasmid containing CRISPR1 with its leader (Fig. 3A and Fig. S2B). The protospacer was chosen due to its frequent naïve acquisition in vivo, and because ∼90% of spacers are 32 nt long (29). Integration products could be half-sites, resulting from attack by one end of the protospacer, or full-sites, resulting from two half-site integrations, one by each end of the protospacer (25). Integration was detected by PCR using a forward protospacer primer and a reverse primer in spacer 6 (Fig. 3 A and B). Integration was detected as early as 5 min, was optimal in 1 to 2 h (Fig. 3B), and occurred in either orientation with a similar efficiency and into each repeat–spacer junction (Fig. 3 B and C). By sequencing, we confirmed that spacers integrated at repeat–spacer boundaries, showing that Cas1–Cas2-3 alone recognizes CRISPR repeats. Integrations occurred into the top DNA strand (i.e., the 5′ end of repeats; Fig. 3 B and C) or the bottom DNA strand (i.e., the 3′ end of repeats; Fig. S2C). Although integrations occurred into multiple repeat–spacer junctions, acquisition was slightly favored at the leader-proximal repeat (Figs. 3 and 4). Acquisition was metal-dependent, because EDTA inhibited the reactions (Fig. S2D), presumably by chelating metal cofactors that were copurified with Cas1–Cas2-3. Mg²⁺ and Ca²⁺ enhanced the reaction, whereas Mn²⁺, Fe²⁺, Ni²⁺, Co²⁺, and Zn²⁺ decreased specificity. In summary, Cas1–Cas2-3 catalyzes spacer integration in either orientation into repeat–spacer junctions, with a slight preference for the leader-proximal repeat.
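Because integration can occur at every repeat–spacer junction, the PCR in this assay is expected to give a ladder of products. The sketch below estimates only the spacing of that ladder, relative to the junction nearest the reverse primer. It rests on several assumptions: a 28-bp repeat, which is typical of type I-F arrays but is not stated in this article; a uniform 32-bp spacer; and integration at the leader-side (5′) end of each repeat on the top strand. Primer offsets are ignored and all names are hypothetical.

```python
REPEAT_BP = 28     # assumption: typical type I-F repeat length, not from the text
SPACER_BP = 32     # most common spacer length reported in the text
PRIMER_SPACER = 6  # the reverse primer anneals in spacer 6


def extra_length(repeat_index):
    """Extra product length (bp) for integration at the 5' end of the given
    repeat, relative to integration at the repeat immediately upstream of
    the primer-containing spacer."""
    if not 1 <= repeat_index <= PRIMER_SPACER:
        raise ValueError("repeat must lie upstream of the reverse primer")
    units_spanned = PRIMER_SPACER - repeat_index
    return units_spanned * (REPEAT_BP + SPACER_BP)


# Relative product size for integration at repeats 1 (leader-proximal) to 6.
ladder = {r: extra_length(r) for r in range(1, PRIMER_SPACER + 1)}
print(ladder)
```

Under these assumptions, neighboring products differ by one repeat–spacer unit, so the leader-proximal integration gives the longest product.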
Fig. 3.
Cas1–Cas2-3 catalyzes spacer integration into CRISPR arrays. (A) Schematic of the integration assay. Purified Cas1–Cas2-3 was incubated with a protospacer (32-bp dsDNA) and the CRISPR1 array on a plasmid (pCRISPR1), and integration was detected by PCR and gel electrophoresis. (B) Time course of protospacer integration into CRISPR1. A 32-bp protospacer (PF1647+PF1648) was incubated with Cas1–Cas2-3 and pCRISPR1 for 1 to 360 min and integration was detected by PCR [primers PF1649 (forward primer; FW) and PF1822]. (C) As in B, but reactions were performed with the reverse primer (PF1650; RV) in combination with PF1822. Similar reactions were performed with a leader-specific primer (PF1792) and PF1649 or PF1650 (Fig. S2C). Metal cofactor requirements are shown in Fig. S2D. The marker shown is in bp.
Fig. 4.
In vitro integration and capture activity of Cas1–Cas2-3. (A–G) Integration assays using pCRISPR1 (A) with no protein, Cas1, Cas2-3, or Cas1–Cas2-3 and a 32-bp protospacer (PF1647+PF1648); (B) with no protein, WT Cas1–Cas2-3, or Cas1 integrase (D269A; INT), Cas3 nuclease (D124A; HD), or Cas3 helicase (D591A; HEL) mutant complexes. The protospacer was as per A. Cas1(D269A)–Cas2-3 was purified and assembled into a stable complex (Fig. S3); (C) with ssDNA or dsDNA 32-nt/bp substrates. Oligonucleotides were PF1647 and PF1648; (D) in the presence and absence of 3′-P–blocked oligonucleotides. Lanes show a 32-bp protospacer with free 3′-OH groups (PF1647+PF1648), an F 3′-P group (PF1882+PF1648), an R 3′-P group (PF1647+PF1883), or both 3′-P groups (PF1882+PF1883); (E) with dsDNA protospacers of decreasing length including 32 bp (PF1647+PF1648), 27 bp (PF1895+PF1896), 26 bp (PF2066+PF2067), 24 bp (PF1897+PF1898), 21 bp (PF1899+PF1900), and 18 bp (PF1901+PF1902); (F) using supercoiled pCRISPR1 or pCRISPR1 linearized by HindIII. The protospacer was as per A; and (G) with the 32-bp protospacer as per A incubated ± IHF. (H) Capture assay involving a plasmid with the CRISPR1 leader and one repeat (pPF1042). The bar graph depicts the proportion of integrated spacers found either directly adjacent to the repeat or elsewhere in the array (other) (Dataset S2). (I) Capture processing-site distribution following sequencing of captured substrates from H. (J) Capture assay as in H with WT Cas1–Cas2-3 or Cas1 integrase (D269A; INT), Cas3 nuclease (D124A; HD), or Cas3 helicase (D591A; HEL) mutant complexes. Integration was detected by PCR using primers PF1649+PF1822, except in E using primers PF1901+PF1822 and in J using primers PF1792+PF1997. The marker shown is in bp.
In Vitro Integration Requires Cas1–Cas2-3 but Not Cas3.
To test whether the entire Cas1–Cas2-3 complex was required for in vitro integration, Cas1 and Cas2-3 were purified separately. Integration was not supported by Cas1 or Cas2-3 alone, but was robust with the Cas1–Cas2-3 complex (Fig. 4A). The role of the Cas3 part of the Cas2-3 fusion in adaptation is unknown, but might assist in primed spacer acquisition (28, 29, 41) and naïve adaptation (41). The Cas1–Cas2-3 architecture suggested that the Cas3 nuclease and helicase active sites are not directly involved in spacer integration, as they would not contact the protospacer (Fig. 2). To test this, the nuclease and helicase domains were inactivated with well-characterized site-directed mutations (D124A and D591A), which disrupt the HD and DExx helicase motif II, respectively (9, 10). As we predicted, spacer integration by Cas1–Cas2-3(D124A) and Cas1–Cas2-3(D591A) mutant complexes was unaffected (Fig. 4B). Because we showed that the active sites of the Cas3 domain did not influence spacer integration, we tested whether the entire Cas3 domain was necessary for integration. Following deletion of the Cas3 domain, we performed an integration assay with Cas1–Cas2(ΔCas3), which showed that integration still occurred but was less specific (Fig. S3A). To examine the role of Cas1, we mutated a metal-coordinating active-site aspartate (D269A), which abolished primed acquisition in vivo (38). Assembly of the resulting Cas1(D269A)–Cas2-3 complex was unaffected (Fig. S3 B and C) yet integration was abrogated (Fig. 4B), demonstrating the key role of the integrase activity of Cas1 in adaptation. Some Cas2 proteins have nuclease activity (42, 43), and the P. atrosepticum Cas2 domain contains some conserved residues at potential catalytic sites (Fig. S3D). However, the purified I-F Cas2 domain lacked detectable nuclease activity against a range of single- and double-stranded DNA substrates (Fig. S3E). 
Therefore, Cas2 appears to play a structural and DNA-binding role that is consistent with an in vivo mutagenesis and adaptation study in the E. coli I-E Cas2 (19). In summary, the entire Cas1–Cas2-3 complex, but not the helicase or nuclease activities of the Cas3 domain, is required for spacer integration in vitro.
Fig. S3.
Purification of the Cas1(D269A)–Cas2-3 complex, Cas2 nuclease assays, DNA-binding assays, and integration with different protospacers. (A) Integration assay with 32-bp dsDNA (PF1647+PF1648) with Cas1, Cas2(ΔCas3), reconstituted Cas1–Cas2(ΔCas3) complex, or Cas1–Cas2-3 in the presence or absence of IHF. Sequence analysis of the two major integration bands formed with Cas1–Cas2(ΔCas3) and IHF revealed protospacer integration into the leader–repeat junction (*, upper band) and into spacer 5 (**, lower band). Interestingly, integration into spacer 5 occurred at the sequence 5′-TACA^GT-3′ (where ^ denotes the site of integration), which matches the integration site at the leader–repeat junction. (B) SDS/PAGE showing the Cas1(D269A)–Cas2-3 complex fraction after Strep-Tactin and SEC. (C) Corresponding SEC chromatogram overlaid with MW size standards. (D) Sequence alignment of Cas2 domains from Sulfolobus solfataricus (SsoCas2), Bacillus halodurans (BhaCas2), Desulfovibrio vulgaris (DvuCas2), E. coli (EcoCas2), and P. atrosepticum (PatCas2-3). The alignment was made using T-COFFEE (61) and adjusted manually using the corresponding crystal structures and our homology model of PatCas2-3. Alpha helices (blue), beta strands (red), and residues possibly involved in protospacer binding (blue) are shown. Residues implicated in nuclease activity (red) or with a weaker phenotype (orange) in SsoCas2 are shown. (E) Nuclease assays with Cas2(ΔCas3) on plasmid DNA (pPF727), a PCR product (mCherry gene amplified from pPF571) (Upper), ssDNA (PF1647), or dsDNA (PF1647+PF1648) (Lower) in the presence or absence of EDTA. (F) Electrophoretic mobility-shift assay with 20 nM Cas1–Cas2-3 bound with 1 nM fluorescently labeled 32-bp protospacer (PS) (PF1794+PF1648) and competed with 100-fold excess unlabeled protospacers of 32 bp (PF1647+PF1648), 27 bp (PF1895+PF1896), 26 bp (PF2066+PF2067), and 18 bp (PF1901+PF1902). 
(G) Integration assays with dsDNA protospacers of 60 bp (PF1683+PF1684), 32 bp (PF1647+PF1648), and 18 bp (PF1649+PF1650). (H) Integration assay with 32-bp dsDNA (PF1647+PF1648) and a substrate with 5-nt splayed ends (PF1647+PF1838). Unless stated otherwise, the marker shown is in bp.
Substrate Requirements for Integration.
The natural substrates used during integration by Cas1–Cas2-3 are unknown. Because multiple methods showed that Cas1–Cas2-3 bound dsDNA and ssDNA, we tested integration using ssDNA protospacers. Integration of 32-nt ssDNA protospacers was barely detectable, whereas the dsDNA protospacer integrated efficiently (Fig. 4C). Next, we tested the 3′ end-group requirement for the nucleophile to attack the CRISPR array and showed that phosphorylated 3′ ends (3′-Ps) blocked integration, demonstrating that 3′-OH groups were necessary (Fig. 4D), similar to I-E Cas1–Cas2 (24, 26). Based on our high-throughput in vivo acquisition data (29), we anticipated that Cas1–Cas2-3 would typically use 32-bp protospacers but tolerate other lengths. Therefore, we tested integration of different-length protospacers (Fig. 4E). The 32- and 27-bp protospacers were integrated but integration was severely reduced for substrates <27 bp, which correlated with their inability to outcompete binding of the 32-bp protospacer to Cas1–Cas2-3 (Fig. S3F). In addition, a 60-bp substrate was integrated, albeit less efficiently (Fig. S3G). Because the integration assay cannot discriminate full-site from half-site intermediates, we consider that integration of longer substrates represents half-site events, where Cas1–Cas2-3 binds to the ends of the dsDNA. Therefore, these would not form new spacers in vivo. Indeed, in vivo spacers 34 bp or longer constitute <0.5% of events, and very rarely are 40- to 50-bp spacers detected (29). In the protospacer-bound type I-E Cas1–Cas2 structures, the 23-bp duplex is “bracketed” by two Cas1 tyrosines that splay the DNA and position 5 nt of the 3′ end of the ssDNA in the active site (22, 23). P. atrosepticum Cas1 has a histidine at the equivalent position (His-26) that might mediate splaying of the protospacer and, indeed, Cas1–Cas2-3 can efficiently integrate splayed substrates (Fig. S3H). 
The I-F consensus spacer is 32 bp, suggesting that Cas1–Cas2-3 typically binds a 22-bp duplex, explaining the lack of integration with protospacers <27 bp that lack a minimum 22 bp of duplex DNA and 5 nt of the 3′ end required to reach the Cas1 active site.
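The length argument above reduces to simple arithmetic. The following sketch is a simplified model rather than the authors' analysis. It assumes a 22-bp duplex core with 5 nt of 3′ end needed on each side to reach a Cas1 active site, and it classifies a protospacer by whether it is long enough for one or both ends to do so. The function name and category labels are hypothetical.

```python
DUPLEX_BP = 22    # duplex length proposed in the text for type I-F
OVERHANG_NT = 5   # 3' end length needed to reach a Cas1 active site


def reach(length_bp):
    """Classify a protospacer length under the simplified bracket model."""
    if length_bp >= DUPLEX_BP + 2 * OVERHANG_NT:
        return "both ends"
    if length_bp >= DUPLEX_BP + OVERHANG_NT:
        return "one end"
    return "neither end"


for bp in (32, 27, 26, 24, 21, 18):  # lengths tested in Fig. 4E
    print(bp, reach(bp))
```

In this model the 32-bp consensus protospacer is the shortest substrate that satisfies both ends, and 27 bp is the shortest that satisfies one end, in line with the observed drop in integration below 27 bp.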
Next, we tested the effects of CRISPR topology on Cas1–Cas2-3–mediated integration. Integration occurred into a supercoiled plasmid containing CRISPR1, but not when it was linearized (Fig. 4F). Our results, together with a similar supercoiled DNA preference exhibited by type I-E Cas1–Cas2 (24), suggest that DNA topology is important for integration by type I CRISPR-Cas systems. Furthermore, analysis of the CRISPR1 leader revealed three putative integration host factor (IHF)-binding sites (Fig. S2B). Therefore, we purified the DNA-binding protein, IHF, and tested its effect on integration. In support of recent work in the type I-E system (44), IHF promoted integration by type I-F Cas1–Cas2-3 and dramatically enhanced the specificity of leader-proximal integrations (Fig. 4G). IHF was also able to increase leader-proximal integrations for the Cas1–Cas2(ΔCas3) complex (Fig. S3A).
Spacer Capture and Integration by Cas1–Cas2-3.
A critical question is how substrates are generated for integration by Cas1–Cas2-3. For type I-E systems, RecBCD may provide one route for the generation of precursor substrates for Cas1–Cas2 during naïve adaptation (16). Similarly, during priming, Cas3 has a role in substrate generation (45) and/or adaptation complex translocation (29). Therefore, we tested whether Cas1–Cas2-3 bound and processed (captured) substrates to generate protospacers proficient for integration. To identify processing sites, we blocked the ends with 3′-P and used splayed substrates to uniquely “barcode” both 5′ and 3′ ends of each DNA strand. We reconstituted a coupled capture–integration assay in vitro, showing that Cas1–Cas2-3 enabled processing and integration into a plasmid with the CRISPR1 leader and a single repeat. We cloned and sequenced these integration products (Dataset S2) and they were predominantly located precisely at the repeat–spacer junction (Fig. 4H, “repeat”), although some were inserted elsewhere in the array (Fig. 4H, “other”). Protospacer processing typically occurred adjacent to the GG PAM, although there was some off-site activity (Fig. 4I). Substrate capture and integration were independent of Cas3 activity, because complexes with helicase or nuclease mutations were proficient in this coupled assay (Fig. 4J). In contrast, the Cas1 integrase mutant complex was unable to acquire spacers, showing that integration, and perhaps capture, required the Cas1 and Cas2 parts of the complex and that Cas3 does not directly participate in these final steps of CRISPR adaptation.
Discussion
Adaptation in type I CRISPR-Cas systems involves three interrelated processes: (i) adaptation to threats not previously encountered (naïve); (ii) adaptation to those that have escaped direct interference (primed); and (iii) interference-driven adaptation that provides a positive feedback loop (14). Primed and interference-driven CRISPR adaptation requires coupling of both the interference (Cascade-crRNA and Cas3) and adaptation (Cas1 and Cas2) machinery, but how this is coordinated is unclear. Here we demonstrated the formation of a type I-F Cas1–Cas2-3 adaptation complex and characterized its integration activity. The complex contains two Cas1 dimers assembled onto the dimeric-scaffolding Cas2 domains of two Cas2-3 proteins in a Cas1₄:Cas2-3₂ stoichiometry. Modeling, cross-linking, and EM showed that the Cas3 domain occupies the region between the Cas1 dimers along the side of the Cas2 core, which positions the HD domain of Cas3 in proximity to the catalytic lobe of one Cas1 dimer and the C-terminal region of Cas3 next to the inactive lobe of the opposite Cas1 dimer. This Cas1 and Cas2 core is similar to the crystal structures of the I-E Cas1–Cas2 complex (19, 23, 24).
To study adaptation by Cas1–Cas2-3, we developed a sensitive PCR-based assay that detects protospacer integration without requiring radioactivity or high-throughput sequencing. Using this assay, we demonstrated that both Cas1 and Cas2-3 were needed for protospacer integration, without a requirement for the helicase and nuclease functions of the Cas3 domain, although we found a nonenzymatic role of the Cas3 domain in integration specificity (Fig. S3A). We observed no apparent orientation bias for which end of the protospacer integrated into the CRISPR repeat, whereas in vivo protospacer flipping is rare (29). In vivo, capture and integration might be coupled to ensure correct protospacer orientation for nucleophilic attack. We tested this in vitro using a coupled capture–integration assay but observed no detectable bias in integration orientation, even in the presence of IHF (Fig. S4), which suggests that additional factors are involved in vivo. Interestingly, although the additional Cas3 domain in the Cas1–Cas2-3 complex is a clear structural difference from the I-E Cas1–Cas2 complex, the integration requirements were similar between the systems and Cas3 was not essential for these final steps. The Cas1 and Cas2 proteins of type I-E and I-F systems are structurally quite different. The I-F Cas1 has unique asymmetry and the sequence of the Cas2 domain of type I-F Cas2-3 is divergent (and directly fused to Cas3) (Fig. S2C). Because type I-F and I-E systems also differ in their PAM recognition and primed spacer acquisition mechanisms, their similarities in the integration mechanism are of considerable interest.
Fig. S4.
IHF does not detectably affect integration orientation. Semiquantitative coupled capture and integration assays using pCRISPR1 (pPF727), Cas1–Cas2-3, and a capture protospacer (PF1994+PF1995) were performed in the presence or absence of IHF (2 µM). (A) Leader direction PCR products were analyzed after 18, 22, 26, and 30 cycles with PF1792 and either the FW (PF1996; Upper) or RV (PF1997; Lower) primers to detect different protospacer orientations (Right). (B) Leader distal direction PCR products were analyzed after 18, 22, 26, and 30 cycles with PF1822 and either the FW (PF1996; Upper) or RV (PF1997; Lower) primers to detect different protospacer orientations (Right). The marker shown is in bp.
This type I-F Cas1–Cas2-3 complex raises the possibility of similar complexes in all type I systems. Priming occurs in type I-B, I-C, I-E, and I-F systems in multiple genera (28, 30, 33, 34, 46), suggesting that this efficient adaptation route is ubiquitous in type I systems. Therefore, the type I-F Cas1–Cas2-3 complex provides a view of adaptation complexes that are recruited to targets by the type I Cascade surveillance machineries. There are several nonmutually exclusive models for the role of Cas3 during priming. In type I-F systems, primed acquisition data are consistent with Cas2-3 helicase-dependent translocation being involved in new spacer selection, either via delivery of Cas1 (and the Cas2 domain) to the invader DNA and/or through substrate precursor generation via the HD nuclease (28, 29). A helicase-dependent model is supported by a type I-E single-molecule study, where nonnucleolytic Cas3 translocation from a primed target was stimulated by Cas1–Cas2 addition (47). However, during type I-E interference, Cas3 is recruited to Cascade-generated R-loops, unwinds DNA via its helicase activity, and feeds ssDNA to the HD nuclease for degradation (9, 10). Interference can also stimulate priming in both type I-E and I-F systems (29, 32) and, in type I-E systems, DNA degradation from Cas3 fuels priming by providing substrates to the adaptation complex (45). The Cas1–Cas2-3 structure reveals that the Cas3 HD nuclease active site is adjacent to the catalytic integrase lobe of Cas1. During priming, it is possible that ssDNA fragments liberated from the HD domain provide precursors to the Cas1–Cas2 portion of the complex either (i) directly or (ii) due to their high local concentration. Because Cas1–Cas2 can efficiently bind ssDNA, but ssDNA is a poor substrate for integration, it is likely that at least partially duplex DNA is required for integration.
Cas3 HD activity on the opposite DNA strand might provide complementary fragments that reanneal and are then processed and integrated as new spacers. Interestingly, we detected DNA bound by Cas1–Cas2-3 by various techniques, and native MS indicated both ssDNA- and dsDNA-bound forms of the complex in vivo, suggesting that reannealing could occur on the complex. Furthermore, cross-links suggested that Cas1–Cas2-3 binds protospacer DNA using a similar interface to the E. coli Cas1–Cas2–protospacer complex, although we cannot rule out the possibility that some of the ssDNA binds the Cas3 domain, as was observed in T. fusca Cas3 (9).
Importantly, the nuclease and helicase domains of Cas3 were not involved in the final capture and integration reactions catalyzed by Cas1–Cas2-3, whereas Cas1 activity was required. Therefore, we predict that naïve acquisition in type I-F systems can occur in the absence of the interference machinery when precursors are generated by other processes, such as via RecBCD (16). In contrast, naïve adaptation by I-F systems was proposed to involve all components of the interference machinery (41). Assuming the conservation of type I complexes composed of Cas1, Cas2, and Cas3, the proposed pipeline of DNA-processing active sites, starting from the helicase, to the nuclease, and then to the integrase, may account for the efficiency of primed adaptation and apply generally to type I systems. However, because the distribution of spacer selection differs between type I-E and type I-F systems, there are likely to be mechanistic distinctions.
Materials and Methods
Details of the materials and methods used in this study, including cloning and protein purification, SEC-RALS, AUC, mass spectrometry, cross-linking analyses, electron microscopy, and structural modeling, are provided in SI Materials and Methods. Strains and plasmids are listed in Table S3, and oligonucleotides are listed in Table S4.
Table S3.
Bacterial strains and plasmids used in this study
| Strain/plasmid | Genotype/phenotype | Source |
| E. coli | ||
| BL21(DE3) | E. coli B F– ompT, gal, dcm, lon, hsdSB(rB–mB–), λ(DE3 [lacI lacUV5-T7p07 ind1 sam7 nin5]) [malB+]K-12(λS) | Promega |
| DH5α | F−, ϕ80dlacZΔM15, Δ(lacZYA–argF)U169, endA1, recA1, hsdR17 (rK−mK+), deoR, thi-1, supE44, λ−, gyrA96, relA1 | Gibco/BRL |
| P. atrosepticum | ||
| PCF80 | Δcas::cat derivative of SCRI1043 | (62) |
| Plasmids | ||
| pCDF-1b | T7lac promoter with CloDF13 replicon, SmR | Novagen |
| pPF171 (aka pJSC2) | N-His6–tagged Cas2-3, in a KmR derivative of pQE-80L, KmR | (62) |
| pPF700 | N-StrepII–tagged Cas1, Cas2-3, pQE-80LoriT, ApR | This study |
| pPF571 | pQE-80LoriT-mCherry derivative | (28) |
| pPF727 (aka pCRISPR1) | CRISPR1 and 781-bp leader, pUC19, ApR | This study |
| pPF732 | N-StrepII–tagged Cas1, Cas2-3 D124A, pQE-80LoriT, ApR | This study |
| pPF737 | N-StrepII–tagged Cas1, Cas2-3 D591A, pQE-80LoriT, ApR | This study |
| pPF739 | N-StrepII–tagged Cas1 D269A, Cas2-3, pQE-80LoriT, ApR | This study |
| pPF1042 | CRISPR1 single repeat and 781-bp leader, pUC19, ApR | This study |
| pPF1051 | N-His6–tagged IHF-α in pRSF-1b, KmR | This study |
| pPF1052 | IHF-β in pCDF-1b, SmR | This study |
| pPF1211 | N-His6–tagged Cas2(1–94) in pRSF-1b, KmR | This study |
| pPF1227 | N-His6–tagged Cas2(1–88) in pRSF-1b, KmR | This study |
| pQE-80LoriT | pQE-80L-based expression vector with RP4 oriT, ColE1 replicon, ApR | (28) |
| pRARE | Encodes rare tRNA genes with p15a replicon, CmR | Novagen |
| pRSF-1b | T7lac promoter with RSF1030-derived replicon, KmR | Novagen |
| pUC19 | High-copy cloning vector, ColE1 replicon, ApR | Addgene |
Table S4.
Oligonucleotides used in this study
| Name | Sequence, 5′-3′ | Notes | Restriction site* |
| PF192 | TGAGCGGATAACAATTTCAC | F for pQE-80L (and derivatives) MCS | |
| PF209 | TCGTCTTCACCTCGAGAAATC | F for pQE-80L (and derivatives) MCS | |
| PF210 | GTCATTACTGGATCTATCAACAGG | R for pQE-80L (and derivatives) MCS | |
| PF391 | AGGTCTGCAGCAGAATGTTCATCGCACTAC | R Cas1 | PstI |
| PF442 | TTTAAGCTTTCAACTGAGTGCGCCAAACAC | R Cas2-3 | HindIII |
| PF669 | TTTCCCGGGAAAGGTAAAGCGCGATTCAC | F CRISPR1 cloning | XmaI |
| PF861 | TAATACGACTCACTATAGGG | T7 promoter sequencing | |
| PF1435 | GCTGGTGTTTGATGTCGCCGCTTTAATTAAAGATGCGCTCGTGC | F Cas1 D269A mutagenesis | |
| PF1436 | GGCGACATCAAACACCAGC | R Cas1 D269A mutagenesis | |
| PF1437 | CCATTGCTGGGCTATTTCACGCTGTTGGCAAAGCCAATGC | F for Cas2-3 D124A mutagenesis | |
| PF1438 | GTGAAATAGCCCAGCAATGG | R for Cas2-3 D124A mutagenesis | |
| PF1439 | TGACCGCCGATCTGGTACTCGCTGAACCGGATGATTTTGGTC | F for Cas2-3 D591A mutagenesis | |
| PF1440 | AGTACCAGATCGGCGGTCA | R for Cas2-3 D591A mutagenesis | |
| PF1578 | AGGTGAATTCATTAAAGAGGAGAAATTAACTATGTGGAGCCACCCGCAGTTCGAAAAAGGCGCGATGGATAACGCCTTTAGCC | F for StrepII-tagged Cas1 | EcoRI |
| PF1647 | AAGCTGAAGGTGACCAAGGGTGGCCCCCTGCC | F 32-bp protospacer | |
| PF1648 | GGCAGGGGGCCACCCTTGGTCACCTTCAGCTT | R 32-bp protospacer | |
| PF1649 | GAAGGTGACCAAGGGTGG | F protospacer integration screening | |
| PF1650 | CCACCCTTGGTCACCTTC | R protospacer integration screening | |
| PF1653 | TTTGAATTCAGTGCCTGAGTGCTGTGAAT | R for CRISPR1 cloning | EcoRI |
| PF1683 | GTTGTGTACGCTGTCTGACGCTCAGTGCAACGAAAACTGACGTTAAGGCATTTTCGTGAT | F 60-bp protospacer | |
| PF1684 | ATCACGAAAATGCCTTAACGTCAGTTTTCGTTGCACTGAGCGTCAGACAGCGTACACAAC | R 60-bp protospacer | |
| PF1792 | GGATTAAAAATCAATGAGTTACAGATG | F CRISPR1 leader integration screening | |
| PF1794 | (IRD700)AAGCTGAAGGTGACCAAGGGTGGCCCCCTGCC | PF1647 with 5′ IRD700 fluorescent label | |
| PF1795 | ATGCGAATTCACCGTATCGATAGTCTCTGC | R for CRISPR1 single-repeat cloning | EcoRI |
| PF1822 | GAACAGGCGATGGTGTCGTG | R CRISPR1 spacer 6 integration screening | |
| PF1838 | TTACTGGGGCCACCCTTGGTCACCTTCCTAGG | R protospacer with 5-nt splayed ends | |
| PF1882 | AAGCTGAAGGTGACCAAGGGTGGCCCCCTGCC-(P) | PF1647 with 3′-P | |
| PF1883 | GGCAGGGGGCCACCCTTGGTCACCTTCAGCTT-(P) | PF1648 with 3′-P | |
| PF1895 | GAAGGTGACCAAGGGTGGCCCCCTGCC | F 27-bp protospacer | |
| PF1896 | GGCAGGGGGCCACCCTTGGTCACCTTC | R 27-bp protospacer | |
| PF1897 | GGTGACCAAGGGTGGCCCCCTGCC | F 24-bp protospacer | |
| PF1898 | GGCAGGGGGCCACCCTTGGTCACC | R 24-bp protospacer | |
| PF1899 | GACCAAGGGTGGCCCCCTGCC | F 21-bp protospacer | |
| PF1900 | GGCAGGGGGCCACCCTTGGTC | R 21-bp protospacer | |
| PF1901 | CAAGGGTGGCCCCCTGCC | F 18-bp protospacer | |
| PF1902 | GGCAGGGGGCCACCCTTG | R 18-bp protospacer | |
| PF1994 | GGATCGAAGGCTAGCAAGGGTGGCCCCCTGCCCTTCG-(P) | F capture protospacer | |
| PF1995 | AATGAGGGGCCACCCTTGCTAGCCTTCAGCTTGGCAT-(P) | R capture protospacer | |
| PF1996 | GAAGGCTAGCAAGGGTGG | F protospacer capture and integration screening | |
| PF1997 | CCACCCTTGCTAGCCTTC | R protospacer capture and integration screening | |
| PF2066 | AAGGTGACCAAGGGTGGCCCCCTGCC | F 26-bp protospacer | |
| PF2067 | GGCAGGGGGCCACCCTTGGTCACCTT | R 26-bp protospacer | |
| PF2077 | AGTCGGATCCGGAAAACCTGTATTTTCAGGGCATGGCGCTTACTAAAGCTGAAATG | F IHF-α encodes TEV protease site | BamHI |
| PF2078 | AGTCAAGCTTTTACTCTTTGGGAGAGGCGTTC | R IHF-α | HindIII |
| PF2093 | AGTCCCATGGCGACCAAGTCTGAACTTATTGAAAG | F IHF-β | NcoI |
| PF2094 | AGTCAAGCTTTTAGCCATAAATGTTAGCGCGG | R IHF-β | HindIII |
| PF2339 | AGCTAAGCTTAAACACTTCCCTGCGCATTAAAAC | R Cas2 (from residue 88) | HindIII |
| PF2340 | AGCTAAGCTTAGCTGGTGGTATTGGTAGGAAC | R Cas2 (from residue 94) | HindIII |
| PF2341 | AGTCGGATCCGGAAAACCTGTATTTTCAGGGCATGAACATTCTGCTGATTTCAG | F Cas2-3 encodes TEV | BamHI |
*The restriction site is underlined.
Integration and capture assays are described in detail in SI Materials and Methods. Briefly, reactions typically contained 10 nM protospacer and 70 nM Cas1–Cas2-3 and were incubated on ice for 15 min; then, 7.5 nM CRISPR plasmid was added and incubated at 25 °C for 1 h. Reactions were stopped at 65 °C for 20 min and integration was detected by PCR. Unless stated otherwise, primers were PF1649 and PF1822. Capture assays and PCR (primers PF1792+PF1997) were performed as described above but with plasmid pPF1042, and the integration product was cloned into pGEM-T (Promega); plasmids were isolated from individual colonies and sequenced with primer PF861.
SI Materials and Methods
Construction of Protein Expression Plasmids.
Strains and plasmids used in this study are listed in Table S3. All plasmid cloning was performed in E. coli DH5α, and plasmids were purified using the Zyppy Miniprep Kit (Zymo) and confirmed by sequencing (primers were obtained from IDT; Table S4).
A plasmid (pPF700) for expression of the Cas1–Cas2-3 complex was constructed by PCR-amplifying cas1 and cas2-3 (primers PF1578+PF442) using P. atrosepticum genomic DNA as template and cloning the product into pQE-80LoriT via EcoRI and HindIII restriction sites. The cas1 gene was cloned to incorporate an N-terminal StrepII tag.
Overlap extension PCR was subsequently used to generate single site-directed mutations within either Cas1 or Cas2-3 in the complex expression construct. Left-hand fragments were generated using PF1578 with either PF1436 (Cas1 D269A), PF1438 (Cas2-3 D124A), or PF1440 (Cas2-3 D591A). Right-hand fragments were generated using PF442 with either PF1435 (Cas1 D269A), PF1437 (Cas2-3 D124A), or PF1439 (Cas2-3 D591A). Both fragments were purified and used as template in an overlap extension PCR with primers PF1578 and PF442, and the resulting product was digested with EcoRI and HindIII and ligated into pQE-80LoriT, previously digested with the same enzymes. The resulting plasmids were pPF739 (StrepII-Cas1 D269A, Cas2-3), pPF732 (StrepII-Cas1, Cas2-3 D124A), and pPF737 (StrepII-Cas1, Cas2-3 D591A). A plasmid for expression of N-His6–tagged Cas2-3 (pPF171) was constructed previously (Table S3).
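The overlap extension design described above can be checked computationally: for the left- and right-hand fragments to anneal, each reverse mutagenesis primer should be the reverse complement of the 5′ end of its forward partner. The following Python sketch tests this for the three primer pairs, with sequences copied from Table S4. The helper names (`reverse_complement`, `overlap_length`) are illustrative and are not part of the original methods.

```python
# Mutagenic primer pairs (forward, reverse), copied from Table S4.
PRIMER_PAIRS = {
    "Cas1 D269A": (
        "GCTGGTGTTTGATGTCGCCGCTTTAATTAAAGATGCGCTCGTGC",  # PF1435
        "GGCGACATCAAACACCAGC",                           # PF1436
    ),
    "Cas2-3 D124A": (
        "CCATTGCTGGGCTATTTCACGCTGTTGGCAAAGCCAATGC",      # PF1437
        "GTGAAATAGCCCAGCAATGG",                          # PF1438
    ),
    "Cas2-3 D591A": (
        "TGACCGCCGATCTGGTACTCGCTGAACCGGATGATTTTGGTC",    # PF1439
        "AGTACCAGATCGGCGGTCA",                           # PF1440
    ),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_length(forward, reverse):
    """Length of the shared overlap if the reverse primer pairs with the
    5' end of the forward primer; 0 if it does not."""
    partner = reverse_complement(reverse)
    return len(partner) if forward.startswith(partner) else 0


OVERLAPS = {name: overlap_length(f, r) for name, (f, r) in PRIMER_PAIRS.items()}

if __name__ == "__main__":
    for name, n in OVERLAPS.items():
        print(f"{name}: {n}-nt overlap")
```

For these pairs the overlaps are 19, 20, and 19 nt, and the mutated codon lies in the forward primer immediately 3′ of the overlap.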
Plasmids (pPF1227 and pPF1211) for expression of Cas2 domain residues 1 to 88 and 1 to 94 of Cas2-3 were constructed by PCR-amplifying (primers PF2341+PF2339 and PF2341+PF2340, respectively) from cas2-3 using pPF700 as template and cloning the products into pRSF-1b via restriction sites BamHI and HindIII. These vectors enable expression of Cas2 with an N-terminal His6 tag followed by TEV protease recognition sequence.
Two plasmids (pPF1051 and pPF1052) for expression of IHF-α and IHF-β were constructed by PCR-amplifying each gene (ihf-α with primers PF2077+PF2078; ihf-β with primers PF2093+PF2094) using P. atrosepticum genomic DNA as template and cloning the products into pRSF-1b (via restriction sites BamHI and HindIII) and pCDF-1b (via restriction sites NcoI and HindIII), respectively. The ihf-α gene was cloned to incorporate an N-terminal His6 tag followed by TEV protease recognition sequence.
Protein Purification.
The StrepII-tagged Cas1–Cas2-3 complex was expressed in E. coli BL21(DE3) containing pPF700 and in some cases with the addition of pRARE. Cultures were grown in LB with ampicillin (100 μg⋅mL−1) (+25 μg⋅mL−1 chloramphenicol for pRARE) at 37 °C and 180 rpm to an OD600 of 0.4 to 0.5. After incubation on ice for 2 h, 0.2 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) was added and incubation was continued for 18 h at 16 °C. Cells were harvested by centrifugation at 10,000 × g and 4 °C for 10 min, and every 1 g of cell paste was resuspended in 10 mL of Strep buffer (50 mM Hepes⋅NaOH, 500 mM KCl, 10% glycerol, pH 7.5) supplemented with cOmplete EDTA-free protease inhibitor (Roche), 0.02 mg⋅mL−1 DNase I, and 0.1 mM PMSF (phenylmethanesulfonyl fluoride). Cell suspensions were lysed by French press and the lysate was clarified by centrifugation twice at 15,000 × g and 4 °C for 15 min. Up to 10 mL of clarified lysate was added per 1-mL column of Strep-Tactin resin (IBA), the column was washed with Strep buffer, and StrepII-Cas1–Cas2-3 was eluted by addition of Strep buffer supplemented with 3 mM desthiobiotin. Free StrepII-Cas1 was separated from the complex by size-exclusion chromatography (SEC) on a Superdex 200 column in SEC buffer [10 mM Hepes⋅NaOH, 200 mM (for negative-stained transmission electron microscopy) or 500 mM KCl (all other experiments), 10% glycerol, pH 7.5]. Cas1–Cas2-3 fractions were pooled and concentrated using a centrifugal concentrator (100,000 MWCO; molecular weight cutoff). Aliquots of the complex were snap-frozen in liquid nitrogen and stored at −80 °C. The typical yield of Cas1–Cas2-3 was 0.7 mg from 1 L of culture. Superdex 200 fractions containing only the Cas1 dimer were pooled, aliquoted, and stored at −80 °C for experiments with Cas1 alone. The same approach was used for pPF739 (StrepII-Cas1 D269A, Cas2-3), pPF732 (StrepII-Cas1, Cas2-3 D124A), and pPF737 (StrepII-Cas1, Cas2-3 D591A).
The N-terminal His6-tagged Cas2-3 was expressed in a P. atrosepticum cas operon deletion strain (PCF80) containing pPF171 at 25 °C and 180 rpm to an OD600 of 0.5 in LB media with kanamycin (50 μg⋅mL−1), followed by induction with 1 mM IPTG and incubation overnight at 18 °C and 180 rpm. Cells were harvested by centrifugation at 10,000 × g and 4 °C for 10 min, resuspended in Cas2-3 lysis buffer (50 mM Hepes⋅NaOH, pH 7.5, 300 mM NaCl, 1 mM DTT, 40 mM imidazole) containing cOmplete EDTA-free protease inhibitor (Roche), 0.1 mM PMSF, 0.02 mg⋅mL−1 DNase I, and 1 mg⋅mL−1 lysozyme. The cell suspension was incubated on ice for 30 min and lysed by sonication, and the lysate was clarified by centrifugation twice at 15,000 × g and 4 °C for 15 min. The clarified lysate was applied to Ni2+-NTA resin (Protino; Macherey-Nagel) and washed, and bound protein was eluted with Cas2-3 lysis buffer supplemented with 250 mM imidazole. Cas2-3 fractions were pooled, buffer-exchanged into SEC buffer using a PD-10 desalting column (GE Healthcare), and concentrated with a centrifugal concentrator (10-kDa MWCO), and aliquots were stored at −80 °C.
N-terminal His6-tagged Cas1–Cas2(ΔCas3) was expressed in BL21(DE3) containing either pPF1227 or pPF1211 in LB supplemented with kanamycin (50 μg⋅mL−1) at 37 °C to an OD600 of 0.6, followed by induction with 1 mM IPTG at 16 °C overnight. Cells were harvested, resuspended in Cas2 lysis buffer (50 mM Hepes⋅NaOH, pH 7.5, 150 mM NaCl, 1 mM DTT, 5% glycerol, 20 mM imidazole) containing cOmplete EDTA-free protease inhibitor (Roche), 0.1 mM PMSF, and 0.02 mg⋅mL−1 DNase I, and lysed by French press. The lysate was clarified by centrifugation twice at 15,000 × g and 4 °C for 15 min and the soluble fraction was applied to an Ni-NTA affinity-resin column. The column was washed with Cas2 lysis buffer, and bound protein was eluted in lysis buffer supplemented with 250 mM imidazole. To cleave the His6 tag, the eluted samples were digested overnight with TEV protease at 4 °C while dialyzing against Cas2 storage buffer (10 mM Hepes⋅NaOH, pH 7.5, 150 mM NaCl, 1 mM DTT, 5% glycerol). The cleaved sample was reapplied to an Ni-NTA column and the liberated protein was collected in the unbound fraction.
Reconstituted StrepII-Cas1–His6-Cas2(ΔCas3) complex was formed by binding His6-tagged Cas2(ΔCas3) (pPF1227) to Ni-NTA resin, as described above, followed by addition of Strep-tagged Cas1 dimer in Cas2 storage buffer and mixing by rotation for 1 h at 4 °C. The resin was poured back on the column, unbound Cas1 was washed off with 20 column volumes of Cas2 lysis buffer, and Cas2 (including any bound StrepII-Cas1) was eluted with Cas2 lysis buffer supplemented with 250 mM imidazole. The protein was loaded onto a Strep-Tactin resin column, free His6-Cas2 was washed off, and the StrepII-Cas1–His6-Cas2(1–88) complex was eluted with Cas2 storage buffer supplemented with 3 mM desthiobiotin. The protein was aliquoted and stored at −80 °C.
N-terminal His6-tagged IHF-α and IHF-β were coexpressed in BL21(DE3) containing pPF1051 and pPF1052 in LB [supplemented with kanamycin (50 μg⋅mL−1) and streptomycin (50 μg⋅mL−1)] at 37 °C to an OD600 of 0.6, followed by induction with 0.5 mM IPTG for 3 h. Cells were harvested, resuspended in IHF lysis buffer (20 mM Hepes⋅NaOH, pH 7.5, 500 mM NaCl, 1 mM DTT, 10% glycerol, 10 mM imidazole) containing cOmplete EDTA-free protease inhibitor (Roche), 0.1 mM PMSF, and 0.02 mg⋅mL−1 DNase I, and lysed by sonication. The lysate was clarified by centrifugation and the soluble fraction was applied to an Ni-NTA affinity-resin column. The column was washed with IHF lysis buffer, and bound protein was eluted in lysis buffer supplemented with 300 mM imidazole. To cleave the His6 tag, the eluted samples were digested overnight with TEV protease at 4 °C while dialyzing against IHF lysis buffer. The cleaved sample was reapplied to an Ni-NTA column, and the liberated protein was collected in the unbound fraction. The IHF-α–IHF-β heterodimer was purified from free IHF-α by SEC on a Superose 12 10/300 GL column (GE Healthcare) in 10 mM Hepes⋅NaOH (pH 7.5), 100 mM KCl, 5% glycerol, 1 mM DTT. Protein was aliquoted and stored at −80 °C.
SEC-RALS.
StrepII-tagged Cas1–Cas2-3 complex eluted from Strep-Tactin resin (IBA) was fractionated by gel filtration at 28 °C using a Superdex 200 Increase 10/300 column (GE Healthcare; ∼24-mL column volume). A 100-μL sample of protein was loaded onto the column and eluted with 50 mM Hepes⋅NaOH, 200 mM KCl, 10% glycerol (pH 7.5) at 0.5 mL⋅min−1. A Viscotek Triple Detector Array unit (Malvern Instruments) was used to measure the refractive index (RI) and right-angle light scattering (RALS). BSA (2 mg⋅mL−1) was used as a standard to calibrate the instrument. Cas1–Cas2-3 and StrepII-tagged Cas1 peaks had Mw/Mn [polydispersity index, mass average molar mass (Mw)/number average molar mass (Mn)] ratios of 1.005 and 1.002, respectively, indicating low polydispersity.
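The polydispersity index quoted above is the ratio of two averages over the molar-mass distribution of a peak: the number average Mn = Σ(n·M)/Σn and the mass average Mw = Σ(n·M²)/Σ(n·M). A minimal Python sketch of that calculation follows. The input distributions are hypothetical and chosen only to show how the ratio behaves; they are not measurements from this study, and the function name is illustrative.

```python
def molar_mass_averages(masses, counts):
    """Return (Mn, Mw, Mw/Mn) for species with the given molar masses
    and relative molecule counts."""
    total_n = sum(counts)
    total_nm = sum(n * m for m, n in zip(masses, counts))
    total_nm2 = sum(n * m * m for m, n in zip(masses, counts))
    mn = total_nm / total_n    # number average molar mass
    mw = total_nm2 / total_nm  # mass average molar mass
    return mn, mw, mw / mn


if __name__ == "__main__":
    # Hypothetical single species: the index is exactly 1.
    print(molar_mass_averages([400.0], [100]))
    # Hypothetical mixture of a 400-kDa species with 5% (by number)
    # of a 75-kDa species: the index rises above 1.
    print(molar_mass_averages([400.0, 75.0], [95, 5]))
```

An index close to 1, as reported for both peaks, is what a single dominant species in each peak would give.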
Analytical Ultracentrifugation.
Sedimentation experiments were performed on a Beckman Coulter Model XL-A analytical ultracentrifuge equipped with UV-vis scanning optics and an An-50 Ti eight-hole rotor. SEC-fractionated Cas1–Cas2-3 was buffer-exchanged into 50 mM Hepes⋅NaOH, 200 mM KCl, 5% glycerol (pH 7.5). Protein sample and reference (50 mM Hepes⋅NaOH, 200 mM KCl, 5% glycerol, pH 7.5) solutions were loaded into 12-mm double-sector cells with quartz windows. Sample (380 μL) and reference (400 μL) solutions were centrifuged at 40,000 rpm at 12 °C and data were collected in continuous mode at 280 nm every 8 min without averaging. Data were fitted to a continuous size-distribution model, and all values were corrected to S20,w (sedimentation coefficient corrected to 20 °C and the density of water) values using the program SEDFIT (48).
Native MS.
A total of 25 µg of the purified complex was buffer-exchanged into 500 mM ammonium acetate (pH 7.6) and 1 mM DTT at a protein concentration of 2.5 µM on a preequilibrated 6-kDa Micro Bio-Spin column (Bio-Rad) by centrifugation at 1,000 × g at 4 °C. Proteins were sprayed from gold-coated, in-house–constructed borosilicate glass capillaries and analyzed on a modified Exactive Plus (EMR; Thermo Fisher Scientific) adjusted for optimal performance in high mass detection (49). Although we attempted to measure the exact mass of all Cas1–Cas2-3 proteins, only the Cas1 subunit was observed under denaturing conditions (5% formic acid). Cas1 was measured as 37,557.62 ± 0.84 Da, a deviation of 4 ppm from the theoretical mass, which is well within the limits of the instrument. The instrument settings on the modified Exactive Plus instrument were as follows: The voltages on the flatapoles and transport octapole were manually tuned to enhance the transmission of protein ions; capillary voltages were varied between 1,200 and 1,300 V to create highly charged protein ions; nitrogen was used in the HCD (higher-energy collisional dissociation) cell at a pressure of 1 × 10−9 mbar, with collisional voltages between 80 and 200 V to increase sensitivity, desolvation, and dissociation. All data were collected at a resolution of 8,000 and interpreted manually with MassLynx version 4.0 (Waters).
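To put the reported mass accuracy in absolute terms, a deviation in parts per million can be converted to daltons with Δm = m × ppm × 10⁻⁶. The Python sketch below does this for the measured Cas1 mass. The theoretical mass is not stated in the text, so only the magnitude implied by "4 ppm" is computed; the function names are illustrative.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def ppm_to_da(mass, ppm):
    """Absolute mass difference (Da) corresponding to a ppm deviation."""
    return mass * ppm * 1e-6


if __name__ == "__main__":
    cas1_measured = 37557.62  # Da, from the denaturing measurement above
    delta = ppm_to_da(cas1_measured, 4)
    print(f"4 ppm at {cas1_measured} Da is about {delta:.2f} Da")
```

A 4 ppm deviation at this mass corresponds to roughly 0.15 Da, which is smaller than the stated measurement uncertainty of ± 0.84 Da.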
LC-MS/MS.
Peptide mixtures were reconstituted in 10% formic acid and analyzed on an Orbitrap Fusion (Thermo Fisher Scientific), Orbitrap Q Exactive Plus (Thermo Fisher Scientific) coupled online to an HPLC (Agilent Technologies), or Orbitrap Elite (Thermo Fisher Scientific). A total of 1 µg of the appropriate digest was trapped on a precolumn (ReproSil-Pur C18; Dr. Maisch; 100 µm × 2 cm, 3 µm; constructed in-house) for 10 min with buffer A (0.1% formic acid) and separated on an analytical column (Poroshell 120 EC C18; Agilent Technologies; 50 µm × 50 cm, 2.7 µm) over the indicated time with a linear gradient from 10 to 40% B (0.1% formic acid, 80% acetonitrile). The mass spectrometer was run in shotgun mode, where first a full scan is acquired (resolution was set to 70,000 and maximum ion injection time was 45 ms for the Orbitrap Fusion, and 60,000 and 50 ms for the Orbitrap Q Exactive Plus and Orbitrap Elite). During each cycle the indicated top N most abundant precursors are selected for sequencing with the indicated fragmentation method(s). Before analysis, the peptide mixture was desalted on a C18 reversed-phase stage tip, vacuum-dried, and stored at −80 °C until further analysis, before which the sample was reconstituted in 10% formic acid.
Disuccinimidyl Sulfoxide Protein–Protein Cross-Linking.
A total of 10 µg of purified complex in 50 mM Hepes⋅NaOH, 200 mM KCl, 10% glycerol was incubated with 2 mM disuccinimidyl sulfoxide (DSSO) cross-linker (50) for 1 h at room temperature. This linker is a bifunctional N-hydroxysuccinimide–ester reagent with an established spacer length of 10.4 Å, and has the advantage of fragmenting well in the gas phase. DSSO selectively reacts in solution with peptide amino groups, thus capturing lysine residues in close proximity. The cross-linking reaction was quenched by addition of Tris⋅HCl (pH 8) to a final concentration of 100 mM. Proteins were denatured using 8 M urea solution and subjected to reduction with DTT for 30 min at 37 °C followed by alkylation for 30 min at room temperature with iodoacetamide. The final protein sample was digested in solution using LysC (Wako) for 4 h at 37 °C and then trypsin (Promega) overnight at 37 °C. Analysis was performed as described (LC-MS/MS) with the Orbitrap Fusion (Thermo Fisher Scientific) over a 65-min gradient. As the fragmentation step for the top 10 selected precursors (charge states between 3 and 8; maximum ion injection time was set to 120 ms; intensity threshold was set to 5e4), a double-play mode was set up with CID (collision-induced dissociation) at an NCE (normalized collision energy) of 30 (resolution 30,000) followed by ETD (electron transfer dissociation) set with dynamic reaction time and no supplemental activation (resolution 15,000) (51). Raw data files were processed in Proteome Discoverer 2.2 with in-house–developed nodes for cross-link analysis (51). The fragmentation scans were searched against the Swiss-Prot database for Pectobacterium atrosepticum (April 2016, 792 entries). The false discovery rate was set to 1% at protein and peptide levels.
DNA–Protein Cross-Linking.
Two samples of 10 µg of purified complex, dissolved in 50 mM Hepes⋅NaOH, 200 mM KCl, 10% glycerol were incubated for 30 min at room temperature in the presence of 100 mM EDTA and without (as control) or with formaldehyde (as sample) at a concentration of 25 mM (Thermo Scientific Pierce) resulting in DNA–peptide heteroconjugates. The cross-linking reaction was quenched with Tris⋅HCl (pH 8) to a final concentration of 200 mM, and both samples were subjected to an in-solution digestion with LysC (Wako) and trypsin (Promega). Next, an enrichment step for DNA–peptide heteroconjugates with spin filter columns (Qiagen) was performed. The formaldehyde cross-links were reversed overnight at 65 °C. Analysis was performed as described (LC-MS/MS) with the Orbitrap Elite over 30 min. As the fragmentation step for the top 10 selected precursors (minimum charge state 2; maximum ion injection time was set to 100 ms; minimum intensity threshold required for activation was set to 500 counts), HCD was used at an NCE of 32 (resolution 15,000). Raw data files were processed with MaxQuant version 1.5.3.30 (52) with “match between runs” with the standard settings enabled, and fragmentation scans were searched against the Swiss-Prot database for Pectobacterium atrosepticum (April 2016, 792 entries). A false discovery rate was set to 1% at protein, peptide, and modification levels. In total, 93 peptides were detected from the Cas1 and Cas2-3 subunits (Dataset S1).
Shotgun MS Experiments.
A total of 10 μg of protein complex was denatured in 8 M urea solution and subjected to reduction with DTT for 30 min at 37 °C, followed by alkylation for 30 min at room temperature with iodoacetamide. The mixture was digested in solution using LysC (Wako) and trypsin (Promega). Analysis was performed as described (LC-MS/MS) with the Orbitrap Q-Exactive Plus over 65 min. As the fragmentation step for the top 8 selected precursors (unassigned charge states were excluded; maximum ion injection time was 120 ms; underfill ratio was set to 10%), HCD was used at an NCE of 27 and the resolution was set to 17,500. Raw data files were processed with MaxQuant version 1.5.3.30 (52), and fragmentation scans were searched against the Swiss-Prot database for Pectobacterium atrosepticum (April 2016, 792 entries). A false discovery rate was set to 1% at protein, peptide, and modification levels. For the resulting proteins, intensity-based absolute quantification (iBAQ) values were calculated (35), providing an indication of the absolute quantities of the proteins in the sample.
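The iBAQ value of a protein is its summed peptide intensity divided by the number of peptides the protein could theoretically yield on digestion, which makes intensities comparable between proteins of different length. The Python sketch below illustrates that normalization only; it is not the MaxQuant implementation. The cleavage rule (after K or R, but not before P), the 6 to 30 residue window for observable peptides, the toy sequence, and the intensities are all assumptions made for illustration.

```python
import re


def tryptic_peptides(sequence):
    """In silico digest: cleave after K or R unless the next residue is P.
    This is a simplified rule with no missed cleavages."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def ibaq(sequence, peptide_intensities, min_len=6, max_len=30):
    """Summed intensity divided by the number of theoretically observable
    peptides (length window is an assumed default)."""
    observable = [p for p in tryptic_peptides(sequence)
                  if min_len <= len(p) <= max_len]
    if not observable:
        raise ValueError("no theoretically observable peptides")
    return sum(peptide_intensities) / len(observable)


if __name__ == "__main__":
    toy_protein = "MAAAAAAKGGGGGGRLK"  # hypothetical sequence
    toy_intensities = [100.0, 300.0]   # hypothetical peptide intensities
    print(tryptic_peptides(toy_protein))
    print(ibaq(toy_protein, toy_intensities))
```

Because the denominator scales with protein length, comparing iBAQ values of Cas1 and Cas2-3 gives an indication of their relative molar amounts in the complex rather than their relative mass.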
Structural Modeling.
Homology modeling of P. atrosepticum Cas2-3 was performed with SWISS-MODEL (53), using P. aeruginosa Cas2-3 in complex with AcrF3 [Protein Data Bank (PDB) ID code 5B7I] as a template (37), which yielded a reliable model for P. atrosepticum Cas2-3 owing to 64% amino acid similarity (48% identity). Modeling also involved P. atrosepticum Cas1 (PDB ID code 5FCL) (38) and E. coli Cas1–Cas2–protospacer (PDB ID code 5DQZ) (23). Structural modeling, alignment, and figure preparation used PyMOL (54) and UCSF Chimera (55).
Electron Microscopy.
Freshly prepared Cas1–Cas2-3 complex was bound with twofold molar excess of protospacer DNA with 5-nt splayed ends (annealed oligonucleotides PF1647 and PF1838) and diluted to 0.02 mg⋅mL−1 and then inspected by negative staining as described previously (56). Briefly, 4 μL of the diluted sample was adsorbed for 25 s to freshly glow-discharged carbon-coated 400-mesh copper grids. The excess solution was blotted off and specimens were washed three times with water and twice with freshly prepared 1% uranyl formate before being stained with uranyl formate solution for 60 s. The excess staining solution was removed by blotting and the specimen was air-dried. Grids were imaged with a JEOL 2200FS microscope equipped with an FEG (field emission gun) and operated at an acceleration voltage of 200 kV. An in-column omega energy filter with a slit width of 25 eV was used to collect zero-loss images. A total of 150 micrographs were recorded at a calibrated magnification of 50,000 corresponding to a pixel size of 3.12 Å and a defocus between 1 and 3 µm using a 4k × 4k CMOS camera F416 (Tietz Video and Image Processing Systems) under minimal-dose conditions.
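The imaging parameters above are linked by simple geometry: the pixel size at the specimen equals the physical detector pixel divided by the magnification, the field of view is the pixel size times the number of pixels per side, and the Nyquist limit is twice the pixel size. The Python sketch below works through these relations. It assumes that "4k" means 4,096 pixels per side, and the function names are illustrative.

```python
ANGSTROM_PER_MICROMETRE = 1e4


def detector_pixel_um(specimen_pixel_angstrom, magnification):
    """Physical detector pixel size (µm) implied by the specimen-level
    pixel size and the calibrated magnification."""
    return specimen_pixel_angstrom * magnification / ANGSTROM_PER_MICROMETRE


def field_of_view_um(specimen_pixel_angstrom, pixels_per_side=4096):
    """Edge length of the imaged area (µm); 4096 pixels is an assumption."""
    return specimen_pixel_angstrom * pixels_per_side / ANGSTROM_PER_MICROMETRE


def nyquist_limit_angstrom(specimen_pixel_angstrom):
    """Finest resolvable spacing for a given sampling (2 x pixel size)."""
    return 2 * specimen_pixel_angstrom


if __name__ == "__main__":
    pixel = 3.12  # Å per pixel, from the text
    print(f"detector pixel: {detector_pixel_um(pixel, 50_000):.1f} µm")
    print(f"field of view:  {field_of_view_um(pixel):.2f} µm")
    print(f"Nyquist limit:  {nyquist_limit_angstrom(pixel):.2f} Å")
```

Under these assumptions each micrograph covers about 1.28 µm per side, and the sampling limit of 6.24 Å is far finer than the 25 Å to which the 3D alignment and final map were restricted.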
For each micrograph, the contrast transfer function parameters were estimated using CTFFIND3 (57), and a total of 14,880 individual particle images were interactively selected using Boxer (58). Image processing was performed using RELION (59). After 2D alignment and classification of the images, incomplete particles and small aggregates were excluded, resulting in a total of 7,565 particles.
We used electron tomography to calculate 3D reconstructions of particles belonging to the most representative class averages. Tilt series were collected in 2° increments, and tomographic reconstructions were calculated using IMOD (60). The zero-tilt images were included as micrographs in the 2D classification, and the alignment parameters were used to sum individual subtomograms. The volumes corresponding to the three most populated classes (Fig. S1D) were used as initial references for several rounds of 3D classification. The final reconstruction showed similar features regardless of the starting reference. Because the dataset was small and negative staining introduces artifacts, we limited overfitting to noise by restricting the resolution of the 3D alignment to 25 Å. Based on the symmetrical features present in the 2D averages, we imposed C2 symmetry during all 3D alignment and classification procedures. A 3D classification into four final classes yielded one map with clearer features than the others, and this map was used for further refinement. The final reconstruction, based on 1,860 particles, was low-pass–filtered to 25 Å (Fig. S1F).
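The sampling figures quoted in this section (3.12 Å pixels at 50,000× magnification and a 25 Å resolution limit) can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check under the standard assumption that the Nyquist limit is twice the pixel size; the function names are hypothetical, and the detector pixel size is back-calculated here rather than taken from the text.

```python
def nyquist_limit_angstrom(pixel_size_angstrom):
    """Finest spacing that can be sampled: two pixels per period."""
    return 2.0 * pixel_size_angstrom


def detector_pixel_micron(pixel_size_angstrom, magnification):
    """Physical detector pixel size implied by the specimen-level pixel size."""
    # 1 angstrom = 1e-4 micrometre
    return pixel_size_angstrom * magnification * 1e-4


PIXEL_SIZE_ANGSTROM = 3.12     # stated pixel size at the specimen
MAGNIFICATION = 50_000         # stated calibrated magnification
RESOLUTION_CAP_ANGSTROM = 25.0 # stated limit used for 3D alignment

nyquist = nyquist_limit_angstrom(PIXEL_SIZE_ANGSTROM)
print(f"Nyquist limit: {nyquist:.2f} angstrom")
print(f"Implied detector pixel: {detector_pixel_micron(PIXEL_SIZE_ANGSTROM, MAGNIFICATION):.1f} um")
print(f"Resolution cap relative to Nyquist: {RESOLUTION_CAP_ANGSTROM / nyquist:.1f}x")
```

The 25 Å limit is roughly four times coarser than the sampling limit, so the restriction reflects the conservative treatment of negative-stain data described above rather than a constraint of the detector.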
In Vitro Spacer Acquisition Assays.
A plasmid with the CRISPR1 array (29 repeats, 28 spacers) and 781 bp of upstream sequence containing the leader (pPF727) was constructed by PCR amplification of P. atrosepticum genomic DNA (primers PF669+PF1653) and cloned into pUC19 via EcoRI and XmaI restriction sites. Double-stranded protospacers were annealed by combining each oligonucleotide in hybridization buffer (10 mM Hepes⋅NaOH, pH 7.5, 50 mM KCl) to a final concentration of 10 µM, heating to 95 °C for 5 min, and cooling gradually to room temperature. The oligonucleotides used for protospacers are listed in Table S4. A plasmid containing the leader and first repeat of the CRISPR1 array (pPF1042) was similarly constructed but with PCR primers PF669 and PF1795.
For the integration reactions, the following conditions were used with variations described in the text. First, 0.2 pmol double-stranded protospacer (final 10 nM) and 1.4 pmol Cas1–Cas2-3 (final 70 nM) were incubated on ice for 15 min, and then 0.15 pmol plasmid (final 7.5 nM; 500 ng pPF727 or 340 ng pPF1042) was added and the reaction was continued at 25 °C for 60 min. Reactions were performed in 20 µL of 10 mM Hepes⋅NaOH (pH 7.5), 50 mM KCl, 0.2 mM MgCl2, 1% glycerol. Reactions were stopped by heating to 65 °C for 20 min. Working samples were stored at 4 °C and stocks were stored at −20 °C. We found similar results when reactions were ended by phenol/chloroform extraction followed by ethanol precipitation.
For integration reactions that involved IHF, first Cas1–Cas2-3 was incubated with the protospacer for 15 min on ice. In a separate tube, IHF (final 2 µM) was incubated with the CRISPR plasmids for 30 min at 25 °C; these were then mixed and incubated at 25 °C for 60 min. Final reactions involved 20 µL of 10 mM Hepes⋅NaOH (pH 7.5), 50 mM KCl, 5 mM MgCl2, 5 mM DTT, 10% DMSO, 1% glycerol. Reactions were stopped and stored as described above.
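The amounts and final concentrations quoted for the standard 20-µL integration reaction can be checked with simple unit conversions. The sketch below is illustrative only: the function names are hypothetical, and the figure of 650 g/mol per base pair is a common approximation for double-stranded DNA, not a value given in the text.

```python
AVG_BP_MOLAR_MASS = 650.0  # g/mol per base pair; approximation, not from the paper


def final_nanomolar(amount_pmol, volume_ul):
    """Convert an amount in pmol and a volume in microlitres to nM."""
    # pmol per microlitre equals micromolar; multiply by 1000 for nanomolar.
    return amount_pmol / volume_ul * 1000.0


def implied_length_bp(mass_ng, amount_pmol, bp_molar_mass=AVG_BP_MOLAR_MASS):
    """Plasmid length implied by a mass (ng) and a molar amount (pmol) of dsDNA."""
    # ng per pmol equals 1000 g/mol.
    molar_mass = mass_ng / amount_pmol * 1000.0
    return molar_mass / bp_molar_mass


REACTION_VOLUME_UL = 20.0

for name, pmol in [("protospacer", 0.2), ("Cas1-Cas2-3", 1.4), ("plasmid", 0.15)]:
    print(f"{name}: {final_nanomolar(pmol, REACTION_VOLUME_UL):.1f} nM")

for name, ng in [("pPF727", 500.0), ("pPF1042", 340.0)]:
    print(f"{name}: about {implied_length_bp(ng, 0.15):.0f} bp")
```

The three concentrations reproduce the stated 10, 70, and 7.5 nM. Under the 650 g/mol assumption, the two plasmid masses imply sizes of roughly 5.1 kb and 3.5 kb.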
The integration products were detected by PCR. Each 20-μL PCR contained the equivalent of 0.35 μL of inactivated integration reaction, corresponding to approximately 2.6 fmol of plasmid DNA (0.35 μL of a reaction containing 7.5 nM plasmid), together with DreamTaq polymerase (Thermo Fisher Scientific) and 0.25 μM of each primer. Primers are stated where relevant; if not indicated, they were PF1649+PF1822. Cycling conditions were as follows: initial denaturation at 95 °C for 1 min; 30 cycles of denaturation at 95 °C for 30 s, annealing at 55 to 65 °C for 30 s, and extension at 72 °C for 30 s; and a final extension at 72 °C for 7 min. Samples were analyzed on a 3% agarose minigel in 1× TAE (Tris-acetate-EDTA) and visualized by ethidium bromide staining.
Electrophoretic Mobility-Shift Assay.
Binding assays were performed in 10 mM Hepes⋅NaOH (pH 7.5), 100 mM KCl, 5% glycerol, 5 mM DTT, 0.1 mg⋅mL−1 BSA in a total volume of 10 µL. For analysis of ssDNA/dsDNA binding and affinities, Cas1–Cas2-3 (4 to 67.5 nM in 1.5-fold increments) was incubated on ice for 15 min with 1 nM of either a fluorescently labeled 32-bp probe (PF1794+PF1648) or a 32-nt ssDNA probe (PF1794). For competition reactions, 20 nM Cas1–Cas2-3 was first incubated with or without 100 nM competitor DNA [32 bp (PF1647+PF1648); 27 bp (PF1895+PF1896); 26 bp (PF2066+PF2067); 18 bp (PF1901+PF1902)] for 15 min on ice; 1 nM fluorescently labeled 32-bp probe (PF1794+PF1648) was then added and incubation continued on ice for a further 15 min. The reactions were resolved at 4 °C on a 4% native polyacrylamide gel containing 0.5× TGE (50 mM Tris, 40 mM glycine, 0.1 mM EDTA, pH 9.4). DNA was visualized using the 700-nm channel of the Odyssey Imaging System (LI-COR). Apparent dissociation constants (Kd values) were determined in GraphPad Prism version 6.0 by nonlinear regression fits of the data to a single-site binding model. The mean and SEM were derived from at least three independent experiments.
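To make the titration and the single-site binding model concrete, the sketch below generates a 1.5-fold dilution series from 67.5 nM and fits fraction bound = Bmax·[P]/(Kd + [P]) to synthetic data. It is not the procedure used here: the fits in this study were done in GraphPad Prism, whereas this example substitutes a simple least-squares grid search using only the Python standard library. The Kd of 20 nM used to generate the data is invented, the function names are hypothetical, and the model assumes that free protein is approximately equal to total protein.

```python
def dilution_series(top_nM, fold, n_points):
    """Concentrations from top_nM downward in constant fold steps."""
    return [top_nM / fold ** i for i in range(n_points)]


def fraction_bound(protein_nM, kd_nM, bmax=1.0):
    """Single-site binding isotherm."""
    return bmax * protein_nM / (kd_nM + protein_nM)


def fit_single_site(concs, fractions, kd_grid):
    """Grid search over Kd; for each Kd the least-squares Bmax has a closed form."""
    best = None
    for kd in kd_grid:
        x = [c / (kd + c) for c in concs]
        bmax = sum(xi * yi for xi, yi in zip(x, fractions)) / sum(xi * xi for xi in x)
        sse = sum((yi - bmax * xi) ** 2 for xi, yi in zip(x, fractions))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    return best[1], best[2]


# Eight points from 67.5 nM down to about 4 nM.
concs = dilution_series(67.5, 1.5, 8)

# Synthetic, noise-free data generated with an invented Kd of 20 nM.
synthetic = [fraction_bound(c, kd_nM=20.0) for c in concs]

kd_grid = [k / 10 for k in range(10, 1001)]  # 1.0 to 100.0 nM in 0.1 nM steps
kd_fit, bmax_fit = fit_single_site(concs, synthetic, kd_grid)
print(f"lowest concentration: {concs[-1]:.2f} nM")
print(f"fitted Kd: {kd_fit:.1f} nM, fitted Bmax: {bmax_fit:.2f}")
```

With real gel quantifications, the fractions would carry noise and the fit would be repeated for each independent experiment before averaging, as described for the mean and SEM above.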
Cas2 Nuclease Assay.
Plasmid pPF727 DNA (7.5 nM), an 841-bp PCR product of the mCherry gene (24 nM; PCR-amplified with PF192+PF210 from pPF571), 32-nt ssDNA (10 nM; PF1647), or 32-bp dsDNA (10 nM; PF1647+PF1648) substrates were incubated with the Cas2 domain of Cas2-3 (70 nM) under the same conditions as the integration assays (10 mM Hepes⋅NaOH, pH 7.5, 50 mM KCl, 0.2 mM MgCl2, 1% glycerol); 5 mM EDTA was included where described. All nuclease activity assays were performed at 25 °C for 60 min. The assays with plasmid and PCR-amplified DNA were terminated with loading buffer containing a final concentration of 10 mM EDTA and were analyzed on 1.0% TAE agarose gels prestained with ethidium bromide. Products of the assays with short ssDNA and dsDNA substrates were purified by phenol/chloroform/isoamyl alcohol extraction followed by ethanol precipitation. After resuspension in water, an equal volume of formamide loading buffer was added and the samples were incubated for 5 min at 95 °C. Samples were analyzed by denaturing PAGE (15% acrylamide, 7 M urea) and visualized using SYBR Gold staining.
Acknowledgments
We thank L. Burga for EM data collection, S. Jackson for reading the manuscript, and C. Richter and S. Luckner for purification trials. This research was funded by the Marsden Fund, Royal Society of New Zealand, a Rutherford Discovery Fellowship (to P.C.F.), and an Otago Research Grant (to M.B., P.C.F., and K.L.K.). R.H.J.S. was supported by an Otago Health Sciences Career Development Award. A.J.R.H. was financed by The Netherlands Organisation for Scientific Research (NWO) funded by the NWO Roadmap Initiative Proteins@Work (Project 184.032.201).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The data reported in this paper have been deposited in the Electron Microscopy Data Bank (accession no. EMD-8660).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1618421114/-/DCSupplemental.
References
- 1. Barrangou R, Marraffini LA. CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity. Mol Cell. 2014;54:234–244. doi: 10.1016/j.molcel.2014.03.011.
- 2. Mohanraju P, et al. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science. 2016;353:aad5147. doi: 10.1126/science.aad5147.
- 3. Makarova KS, et al. An updated evolutionary classification of CRISPR-Cas systems. Nat Rev Microbiol. 2015;13:722–736. doi: 10.1038/nrmicro3569.
- 4. Barrangou R, et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science. 2007;315:1709–1712. doi: 10.1126/science.1138140.
- 5. Brouns SJ, et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 2008;321:960–964. doi: 10.1126/science.1159689.
- 6. Hale C, Kleppe K, Terns RM, Terns MP. Prokaryotic silencing (psi)RNAs in Pyrococcus furiosus. RNA. 2008;14:2572–2579. doi: 10.1261/rna.1246808.
- 7. Jackson RN, et al. Crystal structure of the CRISPR RNA-guided surveillance complex from Escherichia coli. Science. 2014;345:1473–1479. doi: 10.1126/science.1256328.
- 8. Mulepati S, Héroux A, Bailey S. Crystal structure of a CRISPR RNA-guided surveillance complex bound to a ssDNA target. Science. 2014;345:1479–1484. doi: 10.1126/science.1256996.
- 9. Huo Y, et al. Structures of CRISPR Cas3 offer mechanistic insights into Cascade-activated DNA unwinding and degradation. Nat Struct Mol Biol. 2014;21:771–777. doi: 10.1038/nsmb.2875.
- 10. Sinkunas T, et al. Cas3 is a single-stranded DNA nuclease and ATP-dependent helicase in the CRISPR/Cas immune system. EMBO J. 2011;30:1335–1342. doi: 10.1038/emboj.2011.41.
- 11. Westra ER, et al. CRISPR immunity relies on the consecutive binding and degradation of negatively supercoiled invader DNA by Cascade and Cas3. Mol Cell. 2012;46:595–605. doi: 10.1016/j.molcel.2012.03.018.
- 12. Amitai G, Sorek R. CRISPR-Cas adaptation: Insights into the mechanism of action. Nat Rev Microbiol. 2016;14:67–76. doi: 10.1038/nrmicro.2015.14.
- 13. Sternberg SH, Richter H, Charpentier E, Qimron U. Adaptation in CRISPR-Cas systems. Mol Cell. 2016;61:797–808. doi: 10.1016/j.molcel.2016.01.030.
- 14. Jackson SA, et al. CRISPR-Cas: Adapting to change. Science. 2017;356:eaal5056. doi: 10.1126/science.aal5056.
- 15. Ivančić-Baće I, Cass SD, Wearne SJ, Bolt EL. Different genome stability proteins underpin primed and naïve adaptation in E. coli CRISPR-Cas immunity. Nucleic Acids Res. 2015;43:10821–10830. doi: 10.1093/nar/gkv1213.
- 16. Levy A, et al. CRISPR adaptation biases explain preference for acquisition of foreign DNA. Nature. 2015;520:505–510. doi: 10.1038/nature14302.
- 17. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Almendros C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology. 2009;155:733–740. doi: 10.1099/mic.0.023960-0.
- 18. Yosef I, Goren MG, Qimron U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 2012;40:5569–5576. doi: 10.1093/nar/gks216.
- 19. Nuñez JK, et al. Cas1-Cas2 complex formation mediates spacer acquisition during CRISPR-Cas adaptive immunity. Nat Struct Mol Biol. 2014;21:528–534. doi: 10.1038/nsmb.2820.
- 20. Wiedenheft B, et al. Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense. Structure. 2009;17:904–912. doi: 10.1016/j.str.2009.03.019.
- 21. Richter C, Gristwood T, Clulow JS, Fineran PC. In vivo protein interactions and complex formation in the Pectobacterium atrosepticum subtype I-F CRISPR/Cas system. PLoS One. 2012;7:e49549. doi: 10.1371/journal.pone.0049549.
- 22. Nuñez JK, Harrington LB, Kranzusch PJ, Engelman AN, Doudna JA. Foreign DNA capture during CRISPR-Cas adaptive immunity. Nature. 2015;527:535–538. doi: 10.1038/nature15760.
- 23. Wang J, et al. Structural and mechanistic basis of PAM-dependent spacer acquisition in CRISPR-Cas systems. Cell. 2015;163:840–853. doi: 10.1016/j.cell.2015.10.008.
- 24. Nuñez JK, Lee AS, Engelman A, Doudna JA. Integrase-mediated spacer acquisition during CRISPR-Cas adaptive immunity. Nature. 2015;519:193–198. doi: 10.1038/nature14237.
- 25. Wright AV, Doudna JA. Protecting genome integrity during CRISPR immune adaptation. Nat Struct Mol Biol. 2016;23:876–883. doi: 10.1038/nsmb.3289.
- 26. Rollie C, Schneider S, Brinkmann AS, Bolt EL, White MF. Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition. eLife. 2015;4:e08716. doi: 10.7554/eLife.08716.
- 27. Cady KC, O’Toole GA. Non-identity-mediated CRISPR-bacteriophage interaction mediated via the Csy and Cas3 proteins. J Bacteriol. 2011;193:3433–3445. doi: 10.1128/JB.01411-10.
- 28. Richter C, et al. Priming in the type I-F CRISPR-Cas system triggers strand-independent spacer acquisition, bi-directionally from the primed protospacer. Nucleic Acids Res. 2014;42:8516–8526. doi: 10.1093/nar/gku527.
- 29. Staals RH, et al. Interference-driven spacer acquisition is dominant over naive and primed adaptation in a native CRISPR-Cas system. Nat Commun. 2016;7:12853. doi: 10.1038/ncomms12853.
- 30. Datsenko KA, et al. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nat Commun. 2012;3:945. doi: 10.1038/ncomms1937.
- 31. Fineran PC, et al. Degenerate target sites mediate rapid primed CRISPR adaptation. Proc Natl Acad Sci USA. 2014;111:E1629–E1638. doi: 10.1073/pnas.1400071111.
- 32. Semenova E, et al. Highly efficient primed spacer acquisition from targets destroyed by the Escherichia coli type I-E CRISPR-Cas interfering complex. Proc Natl Acad Sci USA. 2016;113:7626–7631. doi: 10.1073/pnas.1602639113.
- 33. Li M, Wang R, Zhao D, Xiang H. Adaptation of the Haloarcula hispanica CRISPR-Cas system to a purified virus strictly requires a priming process. Nucleic Acids Res. 2014;42:2483–2492. doi: 10.1093/nar/gkt1154.
- 34. Rao C, et al. Active and adaptive Legionella CRISPR-Cas reveals a recurrent challenge to the pathogen. Cell Microbiol. 2016;18:1319–1338. doi: 10.1111/cmi.12586.
- 35. Schwanhäusser B, et al. Global quantification of mammalian gene expression control. Nature. 2011;473:337–342. doi: 10.1038/nature10098.
- 36. Jackson RN, Lavin M, Carter J, Wiedenheft B. Fitting CRISPR-associated Cas3 into the helicase family tree. Curr Opin Struct Biol. 2014;24:106–114. doi: 10.1016/j.sbi.2014.01.001.
- 37. Wang X, et al. Structural basis of Cas3 inhibition by the bacteriophage protein AcrF3. Nat Struct Mol Biol. 2016;23:868–870. doi: 10.1038/nsmb.3269.
- 38. Wilkinson ME, et al. Structural plasticity and in vivo activity of Cas1 from the type I-F CRISPR-Cas system. Biochem J. 2016;473:1063–1072. doi: 10.1042/BCJ20160078.
- 39. Wang J, et al. A CRISPR evolutionary arms race: Structural insights into viral anti-CRISPR/Cas responses. Cell Res. 2016;26:1165–1168. doi: 10.1038/cr.2016.103.
- 40. Musharova O, et al. Spacer-length DNA intermediates are associated with Cas1 in cells undergoing primed CRISPR adaptation. Nucleic Acids Res. 2017;45:3297–3307. doi: 10.1093/nar/gkx097.
- 41. Vorontsova D, et al. Foreign DNA acquisition by the I-F CRISPR-Cas system requires all components of the interference machinery. Nucleic Acids Res. 2015;43:10848–10860. doi: 10.1093/nar/gkv1261.
- 42. Beloglazova N, et al. A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J Biol Chem. 2008;283:20361–20371. doi: 10.1074/jbc.M803225200.
- 43. Nam KH, et al. Double-stranded endonuclease activity in Bacillus halodurans clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas2 protein. J Biol Chem. 2012;287:35943–35952. doi: 10.1074/jbc.M112.382598.
- 44. Nuñez JK, Bai L, Harrington LB, Hinder TL, Doudna JA. CRISPR immunological memory requires a host factor for specificity. Mol Cell. 2016;62:824–833. doi: 10.1016/j.molcel.2016.04.027.
- 45. Künne T, et al. Cas3-derived target DNA degradation fragments fuel primed CRISPR adaptation. Mol Cell. 2016;63:852–864. doi: 10.1016/j.molcel.2016.07.011.
- 46. Almendros C, Guzmán NM, García-Martínez J, Mojica FJ. Anti-cas spacers in orphan CRISPR4 arrays prevent uptake of active CRISPR-Cas I-F systems. Nat Microbiol. 2016;1:16081. doi: 10.1038/nmicrobiol.2016.81.
- 47. Redding S, et al. Surveillance and processing of foreign DNA by the Escherichia coli CRISPR-Cas system. Cell. 2015;163:854–865. doi: 10.1016/j.cell.2015.10.003.
- 48. Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys J. 2000;78:1606–1619. doi: 10.1016/S0006-3495(00)76713-0.
- 49. Rose RJ, Damoc E, Denisov E, Makarov A, Heck AJ. High-sensitivity Orbitrap mass analysis of intact macromolecular assemblies. Nat Methods. 2012;9:1084–1086. doi: 10.1038/nmeth.2208.
- 50. Kao A, et al. Development of a novel cross-linking strategy for fast and accurate identification of cross-linked peptides of protein complexes. Mol Cell Proteomics. 2011;10:M110.002212. doi: 10.1074/mcp.M110.002212.
- 51. Liu F, Rijkers DT, Post H, Heck AJ. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat Methods. 2015;12:1179–1184. doi: 10.1038/nmeth.3603.
- 52. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26:1367–1372. doi: 10.1038/nbt.1511.
- 53. Biasini M, et al. SWISS-MODEL: Modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014;42:W252–W258. doi: 10.1093/nar/gku340.
- 54. Schrödinger L. The PyMOL Molecular Graphics System. Version 1.7.6.2. 2010. Available at www.pymol.org.
- 55. Pettersen EF, et al. UCSF Chimera—A visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
- 56. Rames M, Yu Y, Ren G. Optimized negative staining: A high-throughput protocol for examining small and asymmetric protein structure by electron microscopy. J Vis Exp. 2014;(90):e51087. doi: 10.3791/51087.
- 57. Mindell JA, Grigorieff N. Accurate determination of local defocus and specimen tilt in electron microscopy. J Struct Biol. 2003;142:334–347. doi: 10.1016/s1047-8477(03)00069-8.
- 58. Ludtke SJ, Baldwin PR, Chiu W. EMAN: Semiautomated software for high-resolution single-particle reconstructions. J Struct Biol. 1999;128:82–97. doi: 10.1006/jsbi.1999.4174.
- 59. Scheres SH. RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol. 2012;180:519–530. doi: 10.1016/j.jsb.2012.09.006.
- 60. Mastronarde DN. Dual-axis tomography: An approach with alignment methods that preserve resolution. J Struct Biol. 1997;120:343–352. doi: 10.1006/jsbi.1997.3919.
- 61. Di Tommaso P, et al. T-Coffee: A web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Res. 2011;39:W13–W17. doi: 10.1093/nar/gkr245.
- 62. Przybilski R, et al. Csy4 is responsible for CRISPR RNA processing in Pectobacterium atrosepticum. RNA Biol. 2011;8:517–528. doi: 10.4161/rna.8.3.15190.