Author manuscript; available in PMC: 2018 Mar 7.
Published in final edited form as: Cell. 2017 Jun 29;170(1):48–60.e11. doi: 10.1016/j.cell.2017.06.012

Structure basis for directional R-loop formation and substrate handover mechanisms in Type I CRISPR-Cas system

Yibei Xiao 1,#, Min Luo 2,#, Robert P Hayes 1, Jonathan Kim 1, Sherwin Ng 1, Fang Ding 1, Maofu Liao 2,*, Ailong Ke 1,*
PMCID: PMC5841471  NIHMSID: NIHMS945142  PMID: 28666122

Summary

Type I CRISPR systems feature a sequential dsDNA target searching and degradation process, by crRNA-displaying Cascade and nuclease-helicase fusion enzyme Cas3, respectively. Here we present two cryo-EM snapshots of the Thermobifida fusca Type I-E Cascade: 1) unwinding 11-bp of dsDNA at the seed-sequence region to scout for sequence complementarity, and 2) further unwinding of the entire protospacer to form a full R-loop. These structures provide the much-needed temporal and spatial resolution to resolve key mechanistic steps leading to Cas3 recruitment. In the early steps, PAM recognition causes severe DNA bending, leading to spontaneous DNA unwinding to form a seed-bubble. The full R-loop formation triggers conformational changes in Cascade, licensing Cas3 to bind. The same process also generates a bulge in the non-target DNA strand, enabling its handover to Cas3 for cleavage. The combination of both negative and positive checkpoints ensures stringent yet efficient target degradation in Type I CRISPR-Cas systems.

Keywords: CRISPR, R-loop, PAM, crRNA, spacer, protospacer, Cascade, Cas3, nuclease, helicase

Introduction

At least six major types of Clustered regularly-interspaced short palindromic repeats and the associated operon (CRISPR-Cas) systems have been defined in bacteria and archaea. They provide an RNA-based adaptive immune system in prokaryotes against foreign genetic elements such as phages and plasmids (Barrangou et al., 2007; Bolotin et al., 2005; Brouns et al., 2008; Hale et al., 2009; Makarova et al., 2006; Marraffini and Sontheimer, 2008; Mojica et al., 2005; Pourcel et al., 2005). Proteins in the cas operon assist the integration of short foreign DNA-derived spacers into the CRISPR locus to establish immunity (adaptation), the processing of the CRISPR array transcript into RNA guides (crRNA biogenesis), and the crRNA-guided detection and degradation of the target DNA or RNA (CRISPR Interference) (Jiang and Doudna, 2015; Mohanraju et al., 2016). CRISPR-Cas systems are further divided into two classes and multiple types therein (Makarova et al., 2015; Shmakov et al., 2015). Class 1 systems (Type I, III, and IV) employ a multi-subunit effector complex to achieve interference, whereas Class 2 systems (Type II, V, and VI) employ a single protein effector (Makarova and Koonin, 2015).

Type I systems are the most widely distributed CRISPR system, and have been utilized to control gene expression and cell fate (Caliando and Voigt, 2015; Luo et al., 2015). They are further divided into seven subtypes (I-A to I-G) based on the cas operon features, with Type I-E being the most extensively studied subtype (Makarova and Koonin, 2015). It has been established that during CRISPR interference, the CRISPR-Associated Complex for Antiviral Defense (Cascade) uses crRNA as a guide to identify 32–35 bp of matching dsDNA (protospacer) flanked by an optimal Protospacer Adjacent Motif (PAM) (Mojica et al., 2009). The target-searching process involves DNA unwinding, duplex formation between the crRNA spacer and target DNA strand, and displacement of the non-target DNA strand to form an R-loop structure, all in the absence of external energy input (Brouns et al., 2008; Westra et al., 2012; Wiedenheft et al., 2011a). In a subsequent step, the nuclease-helicase enzyme Cas3 is recruited to processively degrade the non-target and target DNA strands sequentially (Hochstrasser et al., 2014; Mulepati and Bailey, 2013; Sinkunas et al., 2013; Westra et al., 2012).

Considerable efforts have been devoted to understanding the target-searching mechanism by Cascade. Cryo-electron microscopy (cryo-EM) (Hochstrasser et al., 2014; Wiedenheft et al., 2011a) and crystal structures (Jackson et al., 2014; Mulepati et al., 2014; Zhao et al., 2014) of E. coli Cascade in free and ssDNA-bound states defined a seahorse-shaped architecture. An intriguing homology-searching mechanism was revealed, by which Cascade promotes segmented 5-bp pseudo-A-form heteroduplex formation between the crRNA spacer and the target strand protospacer; base-pairing at every 6th position is disrupted by a thumb element from Cas7 subunits. PAM recognition precedes DNA unwinding, as a mechanism to avoid autoimmunity by mistargeting the spacers in the CRISPR locus (Marraffini and Sontheimer, 2010). Unlike Cas9 and Cpf1, Cascade recognizes a promiscuous set of PAM sequences. The 2.45 Å structure of E. coli Cascade bound to a bifurcated R-loop mimic rationalized this promiscuity based on the observed DNA minor groove contacts (Hayes et al., 2016). Bulk and single molecule biochemistry data suggest that R-loop formation likely initiates from the PAM-proximal side and propagates in a directional fashion to the PAM-distal side (Rutkauskas et al., 2015; Semenova et al., 2011; Szczelkun et al., 2014). The first 7–8 nucleotides in the protospacer are more stringently specified than the rest, hence defined as the seed sequence region (Semenova et al., 2011; Wiedenheft et al., 2011b). Full R-loop formation was thought to trigger a “locking” mechanism to stabilize the R-loop (Rutkauskas et al., 2015). Conformational changes in Cascade accompany the R-loop formation process, which was thought to prepare Cascade for Cas3 binding (Hayes et al., 2016; Mulepati et al., 2014; Wiedenheft et al., 2011a). Non-canonical or PAM-independent R-loop formation was also observed in single molecule experiments (Blosser et al., 2015; Redding et al., 2015). 
Although the nature of such R-loops is poorly understood, it is thought that they play a role in triggering a process called “primed adaptation”, in which Cascade and Cas3 drive the preferential acquisition of new spacers from dsDNA containing a near-cognate target (Blosser et al., 2015; Datsenko et al., 2012; Fineran et al., 2014).

R-loop formation is an essential event in many RNA-guided processes. In CRISPR-Cas systems, this involves a complicated set of conformational changes in both the DNA substrate and the surveillance complex (Jiang and Doudna, 2015; Nishimasu and Nureki, 2016). The available structure snapshots only captured the apo and the post-R-loop states, but did not resolve the sequence of events in between. In this study, we present two cryo-EM structures of the Type I-E Cascade from Thermobifida fusca (TfuCascade) that forms a seed sequence bubble and a full R-loop structure, at 3.8 Å and 3.3 Å resolution, respectively. The former snapshot provides important insights about the early events leading to R-loop formation, by revealing that PAM-recognition is coupled with DNA bending and spontaneous DNA unwinding, leading to 11-bp DNA-crRNA heteroduplex formation. The latter snapshot reveals that: i) only upon complete R-loop formation does TfuCascade undergo a conformational change, enabling Cas3 binding; ii) the target DNA strand is physically locked by TfuCascade, which is a unique mechanism in TfuCascade to confer exceptional thermostability; and most importantly, iii) TfuCascade creates a bulge in the non-target DNA strand, enabling Cas3 to nick DNA efficiently. Collectively, these structural snapshots solidify the set of mechanisms governing the PAM-dependent R-loop formation, Cas3 recruitment, and substrate handover processes.

Results

In vitro reconstitution of the Cascade/dsDNA complex

Previously, we reported a 2.45 Å crystal structure of E. coli Cascade forming an R-loop mimic upon binding to a partially duplexed DNA, and hypothesized a set of key mechanistic steps that occur during R-loop formation (Hayes et al., 2016). Efforts to capture additional structure snapshots before, during, and after the R-loop formation were unsuccessful in the E. coli system. We therefore redirected efforts to a thermophilic Type I-E system from Thermobifida fusca (Huo et al., 2014) (Figure 1A). The intact TfuCascade was recombinantly expressed in E. coli, and the PAM recognition, Cas3 recruitment, and target DNA cleavage steps were biochemically reconstituted (Huo et al., 2014) (Figure S1A–G). The T. fusca Cascade is advantageous for mechanistic study due to its steep temperature-dependent R-loop formation behavior. In the KMnO4 footprinting assay, which cleaves single-stranded thymidines as DNA unwinds, R-loop formation was found to be robust at 58–62 °C, less efficient at 37 °C, and undetectable at lower temperatures (Figure S1H). Electrophoretic mobility shift assay (EMSA) mirrored the footprinting finding; the TfuCascade/dsDNA complex migrated as a slow and fuzzy band at lower temperatures, but as a sharp and faster band at 58–62 °C (Figure S1I). This electrophoretic difference was interpreted to reflect the DNA curvature differences before and after R-loop formation (Figure S1J). Furthermore, we observed that the slow-migrating band could be chased to the faster one by a 30-second high-temperature incubation prior to EMSA, suggesting that the low-temperature incubation introduced a kinetic barrier in the R-loop formation process (Figure 1B). Interestingly, additional KMnO4 footprinting experiments revealed that a prolonged low-temperature incubation (over 12 hrs at 10 °C) led to the formation of an intermediate, in which a small R-loop was opened at the seed sequence region (seed-bubble; Figure 1C). 
This is a hypothetical R-loop intermediate that was speculated to form transiently and stochastically (Semenova et al., 2011; Wiedenheft et al., 2011b); here we were able to synchronize the TfuCascade into this functional state for structure-function interrogation.

Figure 1. Temperature-dependent R-loop formation behavior in T. fusca Type I-E Cascade.

A. Organization of the T. fusca Type I-E CRISPR locus and cas operon. Constant repeats and variable spacers are represented in black diamonds and colored blocks, respectively. B. TfuCascade in PAM-searching state can be chased to the full R-loop state in a temperature jump before EMSA. C. TfuCascade can be programmed to PAM-searching, seed-bubble, and full R-loop states by incubating with dsDNA substrate at different temperatures, for different durations, as revealed by KMnO4 footprinting analysis targeting the looped-out non-target DNA strand. Bottom: schematics of different extents of DNA unwinding, 5’-NTS fluorescent label in red asterisk, and the probing sites in blue arrows.

3.3 Å cryo-EM reconstruction of T. fusca Cascade displaying a full R-loop

To capture the post R-loop formation conformation, we incubated TfuCascade with a dsDNA target at a 1:1.5 molar ratio at 60 °C for 30 min, followed by size-exclusion chromatography (SEC) purification to remove the unbound DNA. The 58-bp dsDNA substrate contains a 10-bp upstream region, a 5’-AAG PAM, a 32-bp protospacer region, and a 12-bp downstream region. Cryo-EM images of this TfuCascade-dsDNA complex demonstrate particles with homogenous size and excellent image contrast (Figure S2A). Two-dimensional (2D) averages of cryo-EM particle images revealed structural details such as the segmented Cas7.1-6 backbone (Figure S2B). The final three-dimensional (3D) reconstruction reached an overall resolution of 3.3 Å, allowing unambiguous main chain tracing and side chain assignment in most regions (Figure 2A–C, S2C–H, S3). Some periphery regions are less well resolved, including the Cas6e/crRNA 3’-handle, which was modeled from the E. coli structure, and the finger domain of Cas7.6, whose density was too degenerate to model (Figure 2, Movie S1).

Figure 2. 3.3 Å cryo-EM structure of TfuCascade-full R-loop complex.

A. crRNA and dsDNA sequences used to program TfuCascade for structural studies. Residue numbers and color schemes are followed throughout the text. PAM and disordered NTS region are highlighted in yellow and grey boxes, respectively. B. The cryo-EM map of the TfuCascade/R-loop complex filtered to 3.3 Å without applying a B-factor, allowing the tracing of the entire R-loop. C. A representative region in the cryo-EM map of the TfuCascade/R-loop complex. D. Surface (left) and cartoon (right) representations revealing the subunit organization in TfuCascade, the elongated R-loop inside, and the orientation between PAM proximal and PAM distal dsDNA. E. Location of the PAM-distal dsDNA. F. Positively charged patches near the gapped NT region (NT strand binding site), the dsDNA entry site (K-vise), and above the target strand-binding groove (K-rim). G. Disrupting TfuCse2-specific structure features reduced R-loop formation efficiency. H. Alignment of TfuCascade/R-loop and Cas9/R-loop (PDB: 5F9R) structures along the PAM-proximal dsDNA revealing differences in the R-loop.

The T. fusca Cascade adopts a seahorse-shaped architecture similar to that of the E. coli Cascade, featuring a crRNA-displaying helical backbone, comprised of Cas5e, Cas7.1-6, and Cas6e, and an inner belly, comprised of Cse1 and the Cse2.1-2 dimer (Figure 2, S4). While each subunit aligns reasonably well between the two Cascades, significant conformational differences exist (Figure S4). For example, the finger domain of TfuCas7s adopts a different conformation and contains different structure features (Figure S4D). The Lys-rich helix in EcoCas7 (also named K-vise), previously shown to be important for accommodating the PAM-proximal dsDNA, is replaced by a Lys-rich loop (aa144-148, KKTPK) at a shifted location in TfuCas7 (Figure S4D). TfuCas7 further displays a separate Lys-rich loop (aa101-106, KLEKEK, here named K-Rim) in the finger domain, which upon Cas7.1-6 oligomerization forms a continuous positive patch along the target DNA strand inside the groove, assisting nucleic acid recruitment (Figure 2F). Additionally, the TfuCse2 subunit is 30% larger than its E. coli counterpart; the T. fusca-specific features include a flexible internal loop (aa 91-136) and an α-helical C-terminal spike (aa 222-244) (Figure S4C). Both features contribute to the thermophilic adaptation of TfuCascade, as deletion of either region led to impaired R-loop formation efficiency, even at the permissive temperature (Figure 2G).

R-loop formation introduces severe DNA bending in the adjacent region

The entire R-loop is resolved in the cryo-EM map of the TfuCascade/R-loop complex (Figure 2B–D). The PAM-proximal side of dsDNA is bent ~90° at the 3-bp PAM region by the N-terminal domain of Cse1 (Cse1_NTD). The two strands of DNA underneath the PAM bifurcate to form an elongated R-loop. The entire target DNA strand (TS) is well resolved, revealing its involvement in the formation of five 5-bp segments of heteroduplexes with the crRNA spacer, interspaced by a disrupted base-pair at every 6th position (Figure 2C, S3C). The non-target DNA strand (NTS) is resolved at lower resolution due to the lack of sequence-specific contacts with TfuCascade. Tracing along the NTS, the first three nucleotides following the PAM are clearly resolved and travel towards the tip of the L3 loop in Cse1_NTD. The density then significantly weakens and becomes untraceable for the next 25 Å, only to reappear at the rim of the C-terminal domain of Cse1 (Cse1_CTD) as a corridor of density traveling along the backside of the two Cse2 subunits (Figure 2B, Movie S1). A total of 14 nucleotides were modeled into this density; however, the nucleotide registers cannot be reliably assigned due to the limited resolution. Towards the end of the R-loop, the target DNA strand travels extra distances in a channel between Cse2.2 and Cas6e, and re-anneals with the non-target strand on the backside of Cse2.2, leaning towards Cas6e (Figure 2E). The post-R-loop dsDNA orients at a ~90° angle from the axis of the R-loop, and was modeled as an ideal B-form helix due to the reduced local resolution. Overall, the PAM-proximal and PAM-distal DNA duplexes are related by a 90° rotation and separated by a distance of 110 Å (Figure 2D). This structural observation agrees with the EMSA migration behavior of the Cascade/R-loop, and the previous atomic force microscopy observation that Cascade sharply bends the dsDNA surrounding the R-loop (Westra et al., 2012).
In comparison, the R-loop formed by Cascade is different from that formed by Cas9 in many aspects, underlining their distinct evolutionary root (Jiang et al., 2016). The dsDNA is bent to a much lesser extent before and after the R-loop in Cas9 (PDB: 5F9R), the R-loop is totally different in shape and directionality, and the target DNA strand forms a continuous 20-bp A-form heteroduplex with the crRNA instead of a distorted 32-bp heteroduplex involving multiple near A-form segments (Figure 2H).
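The segmented pairing pattern described above can be illustrated with a short sketch. It assumes, purely for illustration, that protospacer positions are numbered from 1 at the PAM-proximal end and that every 6th position (6, 12, ...) is the disrupted one; the function name and its parameters are hypothetical and are not taken from the study.

```python
def heteroduplex_segments(spacer_len=32, period=6):
    """Split positions 1..spacer_len into paired runs and disrupted positions.

    Illustrative assumption: every `period`-th position is unpaired and all
    remaining positions pair with the crRNA spacer.
    """
    # Positions 6, 12, 18, ... are treated as the disrupted base pairs.
    flipped = [p for p in range(1, spacer_len + 1) if p % period == 0]

    segments = []
    start = 1
    for f in flipped:
        if f > start:
            segments.append((start, f - 1))  # inclusive paired run
        start = f + 1
    if start <= spacer_len:
        segments.append((start, spacer_len))  # trailing run, if any
    return segments, flipped


if __name__ == "__main__":
    for length in (11, 32):
        segs, flipped = heteroduplex_segments(length)
        print(length, "nt:", segs, "disrupted at", flipped)
```

Under this numbering, an 11-nt stretch yields two 5-bp runs around one disrupted position, and a 32-nt protospacer yields five 5-bp runs plus a short two-position run at the PAM-distal end, which this simple model does not attempt to interpret.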

An asymmetric R-loop structure, bulge formation in the non-target DNA strand

A surprising finding in our biochemical reconstitution was that while Cascade presenting a bifurcated R-loop mimic could recruit Cas3 just as efficiently (Huo et al., 2014), only the full R-loop presenting Cascade, where both ends of the R-loop remained duplexed, further triggered the non-target DNA strand cleavage by Cas3 (Figure 3A). This suggests that a Cascade/R-loop mimic structure is inadequate in explaining the Cas3-mediated DNA cleavage mechanism. Upon inspecting the TfuCascade/full R-loop structure, it became clear that the two strands of ssDNA travel unequal distances inside the R-loop. Whereas the 32-nt target DNA strand travels ~175 Å, with an average phosphate-to-phosphate distance of ~5.5 Å, the non-target DNA strand travels only ~130 Å along the Cascade surface, assuming that the flexible region of this strand travels in a similar fashion as seen in the EcoCascade/R-loop mimic structure (Figure 3B). Because ssDNA typically has a phosphate-to-phosphate distance of 5–6 Å, the non-target DNA path can accommodate 22–26 nucleotides at most. At least five, most likely eight or more, nucleotides are unaccounted for if we assume the non-target DNA follows the path seen in the EcoCascade/R-loop mimic structure. We reason that the unaccounted nucleotides exist as a disordered bulge on top of Cse1-CTD (Figure 3B), which would rationalize why the EM density disappears in this region (Figure 2B–G). Because the bulge is close to the putative Cas3 binding site (Hochstrasser et al., 2014), this may serve as a convenient mechanism to hand over the non-target DNA to the HD nuclease center of Cas3 for strand-nicking (Figure 3B). This would explain why Cas3 only cleaves the non-target DNA strand in a full R-loop, but not in a bifurcated R-loop (Figure 3A), as the EcoCascade/bifurcated R-loop structure clearly showed that the bulge does not exist in the equivalent region due to the lack of dsDNA constraint at the PAM-distal region (Hayes et al., 2016).
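The distance argument above is simple arithmetic and can be checked with a minimal sketch. The function names are hypothetical; the path lengths (~175 Å and ~130 Å), the 5–6 Å spacing range, and the 32-nt strand length are the values quoted in the text, and rounding to whole nucleotides is an assumption of this illustration.

```python
def mean_spacing(path_length_A, n_nucleotides):
    """Average phosphate-to-phosphate distance along a strand, in angstroms."""
    return path_length_A / n_nucleotides


def nucleotides_on_path(path_length_A, spacing_min_A=5.0, spacing_max_A=6.0):
    """Range of nucleotide counts a path can hold for a given spacing range.

    The widest spacing gives the fewest nucleotides, the tightest spacing
    the most. Counts are rounded to whole nucleotides (an assumption).
    """
    fewest = round(path_length_A / spacing_max_A)
    most = round(path_length_A / spacing_min_A)
    return fewest, most


def unaccounted(strand_nt, capacity_range):
    """Nucleotides that do not fit on the path, as a (min, max) pair."""
    fewest, most = capacity_range
    return strand_nt - most, strand_nt - fewest


if __name__ == "__main__":
    print(round(mean_spacing(175, 32), 1))   # target strand spacing
    capacity = nucleotides_on_path(130)      # non-target strand path
    print(capacity, unaccounted(32, capacity))
```

With these inputs the division reproduces the ~5.5 Å spacing of the target strand and the 22–26 nucleotide capacity of the non-target path. For a 32-nt strand it leaves roughly 6 to 10 nucleotides without a place on the path, the same order as the estimate given in the text.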

Figure 3. TfuCascade generates a flexible bulge in the non-target DNA strand, enabling Cas3-mediated DNA degradation.

A. An authentic R-loop generated by TfuCascade was efficiently cleaved by Cas3 at the non-target strand (right lanes), whereas a bifurcated R-loop mimic (left lanes) was not. B. TfuCascade/full R-loop structure reveals that the two DNA strands travel unequal distances in the R-loop. The non-target DNA is unlikely to follow the path seen in the EcoCascade/R-loop mimic structure. Rather, the full R-loop traps a flexible bulge near the putative Cas3-binding site. C. Diminishing the NT bulge negatively affected Cas3-mediated DNA nicking (left panel) and ATP-dependent processive degradation (right panel). D. Enzymatic probing by ssDNA-specific nucleases directly revealed a hypersensitive region coincident with the NT strand bulge.

Two sets of biochemical experiments were designed to probe whether a flexible bulge is present in the Cascade/full R-loop complex. First, nucleotide insertion/deletions were introduced into the middle of the R-loop in the non-target strand to examine whether slack is present in the non-target DNA strand, and whether perturbing it may affect Cas3-mediated cleavage. On a perfectly base-paired dsDNA target marked by TfuCascade, Cas3 nicked the non-target DNA strand at three locations (PAM+7, +9, +11 nt) in the absence of ATP, and proceeded to processive degradation in the presence of ATP (Figure S1E–G). Up to five nucleotides could be deleted from the non-target DNA strand without appreciable perturbation on the efficiency of Cas3-mediated DNA nicking. Deleting more nucleotides reduced non-target strand nicking eventually to the background level, whereas insertions, which introduced more slack to the non-target strand, led to more efficient DNA nicking (Figure 3C). Interestingly, insertion of ten and deletion of five nucleotides only perturbed the Cas3 cleavage site preference by two nucleotides. This points to the existence of an intriguing ruler mechanism to measure and cleave the non-target strand DNA at a pre-determined distance from the PAM site. The extent of ATP-triggered processive degradation was perturbed to a similar extent by the insertions/deletions, which is consistent with the notion that the non-target strand nicking takes place before the processive degradation process (Figure 3C). Taken together, these observations support the notion that there is significant slack in the non-target strand of the Cascade-bound R-loop, and that generating this slack is a prerequisite for Cas3-mediated DNA degradation. Even stronger evidence came from the enzymatic probing assays, in which the ssDNA-specific endonucleases P1 and S1 both preferentially cleaved a region in the non-target DNA strand inside the TfuCascade/R-loop complex (Figure 3D). 
These cleavage sites overlapped with the predicted location of the bulge (nt 8–16), hence directly proving that a flexible bulge is present in the TfuCascade/R-loop complex.

PAM recognition

Common themes and individual differences in PAM recognition can be derived by comparing T. fusca and E. coli Cascade/R-loop structures. PAM preference by TfuCascade was previously defined using in vivo interference assays (Huo et al., 2014). These findings were consistent with the PAM readout from in vitro EMSA (Figure S1D). TfuCascade strongly specified dANT-dTT in PAM-2, but tolerated promiscuity at PAM-1 and PAM-3 positions. Although the most preferred PAM sequence by TfuCascade, 5’-AAG reading from the non-target strand, also happens to be the most preferred PAM by EcoCascade, the latter recognizes its five interference PAMs in quite a different way. Comparison of the two Cascade/R-loop structures (Hayes et al., 2016) rationalizes their specificity differences. In both cases, PAM recognition features DNA minor groove recognition by Cse1-NTD, involving a Gly-rich loop (Figure 4A, S3). However, TfuCascade preferentially contacts the non-target strand, whereas EcoCascade interacts with the target strand. The glycine-rich loop in TfuCascade is significantly longer, and an SGM motif in this loop is responsible for PAM readout at all three base-pair levels (Figure 4B, C). TfuCascade specifies PAM-1 from the non-target strand side only, with two H-bonds from S194 to N3 and N2 amines of dGNT-1 (Figure 4C, E). In contrast, recognition of PAM-1 in EcoCascade is more stringent because it contacts nucleotides in both strands. Moving up, EcoCascade tolerates three base-pair combinations at PAM-2 and only rejects the dCNT-dGT combination due to a steric clash. In contrast, TfuCascade exclusively specifies ANT-2 at PAM-2. This is achieved by the partial intercalation of the hydrophobic side chain of M196 between ANT-2 and ANT-3, which severely bends DNA from the minor groove side (Figure 4C–F; Figure S3A, right panel). The intercalation is only possible because the featureless G195 residue enables a closer-than-normal minor groove contact.
Presumably, guanosine is rejected in place of ANT-2 because the extra N2 amine in the minor groove side causes steric clashes, and pyrimidines are disfavored either because their tilted stacking with M196 is not as energetically favorable, or because the base-paired purines nearby cause steric clashes with M196. The PAM-2 contact is reminiscent of a similar methionine intercalation-mediated di-adenosine readout by the minor groove-recognizing HMG-box protein Lef-1 (Love et al., 1995). The guanidinium group of R208 points towards PAM-3 from ~4 Å away (Figure 4C; Figure S3A, left panel). It likely plays the equivalent role of the lysine finger in EcoCascade to recognize PAM-3 with favorable electrostatic contacts. These important structure features were evaluated by mutagenesis and EMSA (Figure 4G). Both M196A and G195A mutations reduce the target-binding affinity of TfuCascade by 25-fold, consistent with their essential roles in PAM recognition. R208A weakened DNA-binding by 5-fold. Unlike EcoCascade, TfuCascade further grips PAM from the target strand side using the L1 loop side of Cse1_NTD. The main chain amides of G158 and E159 make favorable electrostatic contacts to the phosphate of dTT-2 (Figure 4D). A G158Y mutation intended to disrupt the phosphate contact weakened DNA-binding by 5-fold, whereas E159A had a negligible effect (Figure 4G). Importantly, both EcoCascade and TfuCascade insert a conserved Gln-wedge into the path of dsDNA underneath PAM. The tip of this β-hairpin is wider in TfuCascade, allowing it to flip out the first four nucleotides in both strands of DNA (Figure 4A, 4C, S3B).

Figure 4. PAM recognition by TfuCascade.

A. PAM recognition coupled DNA unwinding at PAM-proximal region of the R-loop. B. Schematics of PAM recognition. C, D. Roles of Gly-rich loop, Gln-wedge, and L1 loop in PAM recognition and DNA unwinding. E. S194 specifies a non-target strand G at PAM-1 with two H-bonds. F. M196 mediated a di-adenosine readout at PAM-2 and PAM-3 through partial intercalation, enabled by G195. G. Mutagenesis EMSA verified the involvement of G195, M196, R208, and G158 in PAM recognition.

Structure snapshot of T. fusca Cascade opening a seed-bubble

Since no structural information is available to explain the early events in the R-loop formation process, we set out to capture a structure snapshot of TfuCascade examining the seed sequence region of the protospacer. This is a hypothetical state that takes place after PAM recognition but before full R-loop formation (Rutkauskas et al., 2015; Semenova et al., 2011; Wiedenheft et al., 2011b). Despite previous efforts, it remains unclear whether a seed sequence bubble intermediate stably exists, and if so, how many base pairs of DNA are unwound. Our KMnO4 foot-printing data suggest a seed-bubble complex can be programmed by extended low-temperature incubation (over 24 hrs at 12 °C) of TfuCascade with an excess amount of dsDNA (Figure 1C). The cryo-EM 3D reconstruction of this complex was generated at an overall resolution of 3.8 Å (Figure S5), which allowed the unambiguous tracing of backbones and most side chains. Because the sample was more heterogeneous, only ~25% of the total particles were used for high-resolution 3D reconstruction. The rest of the 3D classes may represent TfuCascade in PAM-searching or other intermediate modes, and the heterogeneous DNA binding modes may have hampered high-resolution reconstruction. Tracing DNA inside the seed-bubble structure, the density features suggest that the dsDNA entry and PAM recognition mechanisms are identical between the seed-bubble and full R-loop states. However, only 11-nt of the target DNA strand could be traced in the R-loop region, which forms two 5-bp segments with the crRNA spacer (Figure 5A–B). Meanwhile, a ~60 Å long rod-like density corresponding to the displaced non-target strand travels over the target strand binding canyon towards the finger domain of Cas7.4 (Figure 5A–B). The density suggests that this strand is deflected by the tip of the L3 loop in Cse1_NTD rather than traveling underneath it, as seen in the full R-loop structure (Figure 5C, 2B). 
The density features suggest that the post-seed-bubble dsDNA exits TfuCascade from the finger domain of Cas7.4 (Figure 5C, Movie S2). The overall curvature of dsDNA in the seed-bubble state is much less than that in the full R-loop state, which explains their migration differences in EMSA (Figure 5D–E, S1I–J). Three structural elements in TfuCascade are crucial in bending and subsequent melting of dsDNA: the Gln-wedge of Cse1, the L3 loop of Cse1, and the K-vise in Cas7.4 (Figure 5C). Each of these elements plays an important function in the full R-loop state; here we show that they perform a different set of functions in the seed-bubble state. Indeed, Ala substitutions in the K-vise reduced the DNA-binding affinity by ~5-fold, and decreased non-specific DNA binding dramatically (from 0.1–1 µM, Figure 5F). The six K-vise motifs are spaced regularly along the Cas7.1-6 scaffold. We speculate that they may guide the sequential melting of dsDNA from the PAM side. Upon PAM recognition, the dsDNA substrate is held in the bent conformation by the three points of contact, leading to a spontaneous DNA melting transition (Figure 5C). The target strand is then captured by crRNA, forming the seed-bubble (Figure 5C, 5E). Afterwards, during directional R-loop propagation, the dsDNA may melt in 6-bp steps and dwell on the K-vise of the next Cas7. Meanwhile, the non-target DNA strand travels in parallel on top of the target strand binding canyon. Upon complete R-loop formation, the non-target DNA strand is buried to the backside of the Cse2 dimer.

Figure 5. Nature of the seed sequence bubble unwound by TfuCascade.

A, B. Two different views of the TfuCascade/seed-bubble cryo-EM structure highlighting the bending at the PAM region, extent of the seed-bubble, and the paths of the target and non-target strands. Type 2 class is illustrated here to emphasize the path of the non-target strand. C. Function of Q-wedge, L3-loop and Cas7.4 K-vise in shaping the path of the seed-bubble. D. Inclusion of a post-seed-bubble dsDNA to model the entire DNA substrate in the TfuCascade/seed-bubble complex. DNA bending around the PAM region likely plays an important role in melting dsDNA into a seed-bubble. E. DNA substrate follows a distinct path in the seed-bubble and full R-loop states. Note the differences in directionality, curvature, and DNA bending angle. F. Alanine substitution in K-vise reduced the target DNA binding affinity of TfuCascade by ~5-fold.

Cas3 recruitment is specific to the full R-loop, but not the seed-bubble Cascade conformation

The mechanism of Cas3 recruitment by Cascade remains unclear. It was hypothesized that Cas3 selectively binds to the R-loop forming Cascade by distinguishing the conformational differences between the apo and R-loop forming Cascades. Here in the T. fusca Type I-E system, we provided direct evidence supporting this hypothesis. First, the EMSA data revealed that the seed-bubble and full R-loop TfuCascade migrated as distinct conformers, and that Cas3 selectively interacted with the full R-loop conformer (Figure 6A). Cryo-EM structures further defined the conformational differences between these two functional states. The TfuCascade/seed-bubble structure is similar to that of the apo EcoCascade, whereas the TfuCascade/full R-loop structure is similar to that of the R-loop forming TfuCascade and EcoCascade (Figure S4A). The large pivoting motion in Cse1_CTD and a correlated sliding motion in the Cse2 dimer (Jackson et al., 2014; Zhao et al., 2014; Mulepati et al., 2014; Hayes et al., 2016) have not taken place when TfuCascade opens the seed-bubble (Figure 6B). Importantly, in the seed-bubble state, the entire target strand binding canyon is accessible (Figure 6C). However, when TfuCascade undergoes conformational changes upon R-loop formation, the sliding motion of the inner belly aligns a negative patch on Cse2.1 and Cse2.2 with the corresponding K-Rim residues in Cas7.3 and Cas7.1, respectively (Figure 6D). The resulting salt bridge interactions (K97/K101→S94, L102→R93, K104→N23) seal the target strand binding canyon, locking the target DNA strand underneath (Figure 6E). The target strand locking contacts are not observed in the E. coli Cascade/R-loop mimic structure (Hayes et al., 2016), and the involvement of the main chain contacts makes it hard to evaluate how conserved this mechanism would be among Type I-E Cascades. 
However, our observation strongly agrees with the single molecule study showing that upon reaching the end of the protospacer, the R-loop becomes stably locked by the S. thermophilus Type I-E Cascade (Rutkauskas et al., 2015). Indeed, mutations aimed at disrupting the salt-bridges (N23A or R93A in Cse2, and Δ101-109/GG in Cas7) significantly altered the behavior of TfuCascade, leading to premature R-loop formation at previously non-permissive temperatures or less stable R-loops at high temperatures (Figure 6F). Because one of the two locks is located at the very end of the R-loop region, its presence suggests that the entire target DNA strand must have been engaged in heteroduplex formation with the crRNA spacer before the conformational changes took place to lock it underneath. Therefore, although the locking of the target DNA strand may not be conserved among all Type I systems, its presence in the T. fusca system defines the timing of the conformational changes, which we now conclude must have happened after R-loop formation (Figure 6G).

Figure 6. Cas3 recruitment is specific to the R-loop forming Cascade conformation.

A. Cas3 specifically binds to full R-loop forming TfuCascade in native EMSA. B. Conformational transition in Cse1 and Cse2 subunits between the seed-bubble (in darker colors) and full R-loop (motions interpolated in lighter colors) structures. Arrows indicate the direction of movements upon full R-loop formation. Additionally, the target DNA strand binding groove is open and accessible in the seed-bubble state (C), but locked up in two places (Cse2.2-Cas7.1 and Cse2.1-Cas7.3) in the full R-loop state (D). E. Disrupting the salt bridge interactions impairs TfuCascade-mediated R-loop formation. F. Mutagenesis to disrupt the target strand locks led to altered R-loop formation behavior. G. An updated mechanistic model for directional R-loop formation by Type I-E Cascade. New structural evidence points to a DNA bending-melting transition in seed-bubble formation, defines the timing of the conformational change and its importance in regulating Cas3 recruitment, and identifies a flexible bulge in the NTS that positively regulates Cas3-mediated DNA degradation.

Discussion

Originally discovered as an RNA-guided immune system for anti-viral defense in prokaryotes, the CRISPR-Cas systems have been repurposed for genome editing applications. The features that enable CRISPR-Cas systems to identify DNA targets efficiently and precisely are the very same characteristics highly sought after in practical applications. All DNA-targeting CRISPR systems invariably require a crRNA-guided R-loop formation process to identify a matching target. The process involves multiple steps to ensure efficiency and accuracy. It begins with protein-mediated PAM-recognition to reduce the search space, followed by seed-bubble formation to zoom in on a few potential matching protospacers, and finally a directional dsDNA unwinding/crRNA-DNA heteroduplex formation process to form the full R-loop structure. High-resolution structural information is essential for mechanistic understanding. This in turn inspires creative improvement of the genome editing tools (Slaymaker et al., 2016). However, the available structures are limited to either the apo or post R-loop states for the single-component Cas9 and Cpf1 systems and the multi-component Type I/III systems (Jiang and Doudna, 2015). In this regard, the Cascade/seed-bubble snapshot presented in this study is invaluable in capturing the early events in the R-loop formation process. The structure suggests that the PAM-recognition coupled DNA bending, rather than any protein motions in Cascade, drives the melting of dsDNA at the seed sequence region. The importance of DNA bending was not fully appreciated in the previous work because DNA was thought to travel in a straight path in the E. coli Cascade/R-loop structure (Hayes et al., 2016). However, this finding should not come as a total surprise; in many processes DNA bending has been shown as an important element leading to DNA melting, although ATP-binding or hydrolysis is usually involved to compensate for the energy loss associated with DNA unwinding. 
For example, melting of the replication origin in prokaryotes involves the formation of a DnaA-dsDNA filament that bends dsDNA severely. This introduces negative superhelical strain that melts the downstream sequence; the bubble is then stabilized by the ATP-dependent DnaA-ssDNA filament formation (Duderstadt et al., 2011). In Cascade, DNA bending is introduced during PAM recognition as the result of DNA minor groove recognition. In a recent review, Gorski et al. pointed out that a universal theme in RNA-guided systems is to facilitate the directional guide RNA/target DNA annealing from the seed-sequence region, and to have a pre-organized A-form RNA guide by the seed-sequence region to immediately compensate for the energy loss of DNA unwinding (Gorski et al., 2017). These points are nicely illustrated in our Cascade/seed-bubble structure by the additional finding that a larger-than-expected seed sequence is examined by Cascade, through the formation of two 5-bp pseudo A-form crRNA/DNA heteroduplexes. Several Cascade structure features play unexpected roles in the DNA bending-melting transition. These elements are highly conserved in the E. coli Cascade, and equivalent elements are likely preserved in other Type I systems. We are therefore one step closer to generating a movie to depict the crRNA-guided target-searching process in Type I systems (Figure 6G). Whether a similar DNA bending-melting transition takes place in the R-loop formation process by the single-component CRISPR systems remains unclear due to the lack of early state structures (Gorski et al., 2017).

All DNA-targeting CRISPR systems utilize a single-stranded nuclease or nucleases to degrade matching targets. This reduces off-targeting, as the target DNA only becomes single-stranded after R-loop formation and, therefore, has been sequence validated by the crRNA guide. The mechanism to efficiently couple R-loop formation and substrate degradation steps was not fully understood. In Type II systems, Cas9 encodes two integral nuclease domains, HNH and RuvC. Their activities are tightly controlled by the R-loop-dependent ssDNA exposure, and by additional conformational changes that re-position the nuclease domains (Jiang et al., 2016; Sternberg et al., 2015). Similar mechanisms hold true to a large extent for Cpf1 and other single-component systems (Yamano et al., 2016). For Type I CRISPR systems, the previous consensus was that DNA degradation is tightly regulated at the Cas3 nuclease-helicase recruitment step - Cascade only recruits Cas3 after it has successfully opened an R-loop at the target DNA (Hayes et al., 2016; Hochstrasser et al., 2014; Huo et al., 2014; Mulepati and Bailey, 2013; Westra et al., 2012). Our T. fusca Type I-E study has provided unprecedented structure snapshots of the dynamic conformational transition leading to R-loop formation. There is a clear conformational difference between the seed-bubble and R-loop forming Cascade. The conformational change takes place upon the completion of the R-loop formation, and Cas3 selectively binds to Cascade in the full R-loop state. These observations point to the presence of a negative regulatory mechanism to prevent premature recruitment of the degradation factor Cas3, until the entire protospacer region in dsDNA has been sequence validated by Cascade (Figure 6G). Importantly, our data also reveal that a different mode takes place after Cas3 binding, in which Cascade actively promotes target degradation by creating a flexible bulge in DNA for Cas3 cleavage. 
This bulge facilitates the handover of the non-target DNA strand from Cascade to Cas3, to be efficiently nicked by the HD nuclease of Cas3 (Figure 6G). Removal of this bulge prevents Cas3-mediated DNA nicking without affecting the efficiency of R-loop formation and Cas3 recruitment, underlining the importance of this activation mechanism. The nicked non-target DNA strand is possibly re-threaded through the helicase domains of Cas3, initiating the next phase of processive DNA unwinding and degradation (Figure 6G). We look forward to parallel mechanistic studies in other Type I systems to cross-validate whether the two-tiered regulation of the degradation step may be a universal theme among Type I CRISPR systems. Emerging Cascade structures from other Type I systems are still not at sufficient resolution or are not programmed to the right functional state (Chowdhury et al., 2017; Hochstrasser et al., 2016). The central theme in Type I CRISPR systems appears to be stringency, established through a sequence of causal events. This may have given the Type I system an evolutionary advantage, which may explain why it is the most prevalent CRISPR-Cas system found today.

Contact for Reagent and Resource Sharing

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Ailong Ke (ailong.ke@cornell.edu).

Experimental Model and Subject Details

Microbes

Escherichia coli cells were cultured in LB medium.

Method Details

Cloning, expression, and purification

The expression and purification of T. fusca Cascade was adapted from a previous protocol (Huo et al., 2014) with some modifications. Specifically, cse1 and the cse2-cas7-cas5e-cas6e gene cassette were cloned into a modified Twin-Strep-SUMO-pET19b vector (AmpR) and a pCDFDuet-1 vector (StrR), respectively. A synthetic CRISPR locus containing 4 identical copies of the repeat-spacer sequences was cloned into the pRSFDuet-1 vector (KanR). All three plasmids were sequence verified and co-transformed into E. coli BL21 (DE3) star cells. The cell culture was grown in LB medium at 37 °C until the optical density at 600 nm reached 0.8. Expression was induced by adding isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 1.0 mM at 25 °C overnight. Cells were harvested by centrifugation and lysed by sonication in buffer A containing 50 mM HEPES pH 7.5 and 500 mM NaCl. The lysate was centrifuged at 15,000 rpm for 60 min at 4 °C, and the supernatant was applied onto the pre-equilibrated Strep-Tactin Superflow resin (IBA, Göttingen, Germany). After washing with 100 mL of buffer A, the protein was eluted with buffer B (50 mM HEPES pH 7.5, 500 mM NaCl, and 5 mM desthiobiotin), and the eluted proteins were incubated with SUMO-protease at 4 °C overnight. The TfuCascade was further purified with size-exclusion chromatography (SEC, HiLoad 16/60 Superdex 200; GE Healthcare) equilibrated with buffer C (10 mM HEPES pH 7.5, 150 mM NaCl, 5 mM DTT). The peak fractions were pooled and snap-frozen in liquid nitrogen for later use (Figure S1A). The assembly of TfuCascade/R-loop and TfuCascade/seed-bubble is described in detail in the results section. The sequences of the oligos used in preparing DNA substrates for biochemical and structural studies are documented in the Key Resources Table.

Key Resources Table, related to Methods. DNA oligos and substrates used in the study.


Electrophoretic mobility shift assay

The protospacers with an AAG PAM were cloned into the pCDFDuet-1 vector between the BamHI and XhoI sites. The substrates were PCR amplified from the plasmids by using the fluorescent T7 primers and subsequently gel-purified. The dsDNA substrates produced this way are 276-bp in length, with a 5′-6FAM label at the non-target strand. DNA binding was carried out in 20 mM HEPES-NaOH, pH 7.5, 150 mM NaCl, and 5% glycerol. 3 nM of dsDNA was incubated with titrations of TfuCascade at 60 °C for 30 min. EMSA was performed at 4 °C on 2% agarose gels. Fluorescent signals were recorded using a Typhoon 9200 scanner.
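As a rough illustration of how such a titration could be interpreted, the sketch below computes the expected fraction of DNA bound for simple 1:1 binding. It uses the quadratic (ligand-depletion) form because the 3 nM DNA concentration is not negligible compared with an apparent binding constant in the ~15 nM range (Figure S1C). The single-site model, the function name, and the titration points are assumptions made for illustration; they are not the analysis used in this study.

```python
import math

def fraction_bound(protein_nM, dna_nM, kd_nM):
    """Fraction of DNA in complex for 1:1 binding (quadratic solution).

    Solves Kd = (P_total - C) * (D_total - C) / C for the complex C,
    so it does not assume that free protein equals total protein.
    """
    total = protein_nM + dna_nM + kd_nM
    complex_nM = (total - math.sqrt(total * total - 4.0 * protein_nM * dna_nM)) / 2.0
    return complex_nM / dna_nM

DNA_NM = 3.0   # dsDNA concentration stated in the protocol
KD_NM = 15.0   # illustrative value, taken from the ~15 nM estimate in Figure S1C

# Hypothetical Cascade titration points (nM), chosen only for illustration.
for cascade_nM in (0, 5, 15, 50, 150):
    print(cascade_nM, round(fraction_bound(cascade_nM, DNA_NM, KD_NM), 3))
```

Under these assumptions, half-maximal binding occurs at a total Cascade concentration somewhat above the Kd itself, because part of the protein is sequestered in the complex.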

Chemical and enzymatic probing analysis

The protocol for chemical probing with KMnO4 was modified from the previous publication (Jore et al., 2011). In detail, 3 nM of the 276-bp dsDNA substrates containing a 5’-6FAM label at the non-target strand was incubated with 50 nM Cascade in the 70 µl binding buffer containing 20 mM HEPES pH 7.5, 150 mM NaCl, and 5% glycerol. After incubation at the indicated temperatures for 30 min, 20 µl of each reaction was aliquoted for EMSA. The remaining 50 µl was supplemented with 5.5 µl of 16 mM KMnO4, and incubated at 30 °C for 2 min. The permanganate modification was stopped by the addition of 6.5 µl of β-mercaptoethanol and 6.5 µl of 500 mM EDTA. After phenol-chloroform extraction and ethanol precipitation, the nucleic acid pellets were resuspended in 50 µl of 10% piperidine and incubated at 90 °C for 30 min to induce cleavage at the modified residues. For enzymatic probing with P1 nuclease (US Biological) or S1 nuclease (Thermo Scientific), 40 µl of the Cascade–DNA complexes was supplemented with 10 µl reaction buffer (200 mM sodium acetate pH 5.3, 10 mM ZnSO4), and incubated with 1 unit of nuclease at 25 °C for 30 min. The cleavage reactions were stopped by the addition of EDTA to a final concentration of 20 mM. After phenol-chloroform extraction and ethanol precipitation, the pellet was resuspended and separated on a 10% denaturing polyacrylamide gel. The size markers were PCR amplified individually from the same dsDNA substrates using the same 5′-6FAM labeled forward primer and a reverse primer starting from different positions along the substrate. Fluorescent signals were recorded using a Typhoon 9200 scanner.
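The working concentrations implied by the volumes above can be checked with a simple dilution calculation. The sketch below assumes that volumes are additive and that the stated concentrations refer to the stock solutions; the helper name is hypothetical.

```python
def final_conc(stock_conc, added_ul, start_ul):
    """Final concentration after adding `added_ul` of stock to `start_ul` of sample.

    C_final = C_stock * V_added / (V_start + V_added), assuming additive volumes.
    The result has the same unit as `stock_conc`.
    """
    return stock_conc * added_ul / (start_ul + added_ul)

# 5.5 µl of 16 mM KMnO4 added to the remaining 50 µl of binding reaction.
kmno4_mM = final_conc(16.0, 5.5, 50.0)

# 10 µl of reaction buffer (200 mM sodium acetate, 10 mM ZnSO4) added to 40 µl of complex.
acetate_mM = final_conc(200.0, 10.0, 40.0)
zinc_mM = final_conc(10.0, 10.0, 40.0)

print(f"KMnO4: {kmno4_mM:.2f} mM")
print(f"sodium acetate: {acetate_mM:.1f} mM, ZnSO4: {zinc_mM:.1f} mM")
```

With these inputs the permanganate is present at roughly 1.6 mM during modification, and the nuclease reaction buffer is diluted five-fold.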

Cascade-mediated Cas3 DNA cleavage assay

The TfuCas3 protein was expressed and purified following the established protocol (Huo et al., 2014). The dsDNA substrate was PCR generated from 5’-fluorescently labeled primers. The pre-formed TfuCascade/R-loop complex was mixed with 100 nM of TfuCas3 in a cleavage buffer containing 10 mM HEPES pH 7.5, 150 mM NaCl, 10 mM MgCl2 and 100 µM CoCl2, and the reaction was incubated at 55 °C for 30 min. Afterwards, nucleic acids were phenol-chloroform extracted and separated on a 10% denaturing polyacrylamide gel. Fluorescent signals were recorded on a Typhoon 9200 scanner.

EM data acquisition

2.5 µl of ~0.8 mg/ml SEC-purified Cascade complexes were applied to a glow-discharged Quantifoil holey carbon grid (1.2/1.3, 400 mesh). Grids were blotted for 2.5 s at ~85% humidity and plunge-frozen in liquid ethane using a Cryoplunge 3 System (Gatan). Cryo-EM images were manually collected on a TF30 Polara electron microscope (FEI company) operated at 300 kV and equipped with a K2 Summit direct electron detector (Gatan), as previously described with minor modifications (Ru et al., 2015). The total exposure time of each movie stack was 7.2 s, leading to a total accumulated dose of 49.2 electrons per Å² fractionated into 36 frames (200 ms per frame).
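The exposure parameters above are internally consistent, as the following short calculation shows. It is a minimal sketch that assumes a constant dose rate throughout the exposure; the variable and function names are illustrative.

```python
TOTAL_EXPOSURE_S = 7.2   # total exposure time per movie stack (s)
TOTAL_DOSE = 49.2        # total accumulated dose (electrons per Å^2)
N_FRAMES = 36            # number of frames per movie stack

frame_time_s = TOTAL_EXPOSURE_S / N_FRAMES   # exposure per frame (s)
dose_per_frame = TOTAL_DOSE / N_FRAMES       # electrons per Å^2 per frame
dose_rate = TOTAL_DOSE / TOTAL_EXPOSURE_S    # electrons per Å^2 per second

def cumulative_dose(frame_index):
    """Dose accumulated by the end of a frame (1-based), assuming a constant dose rate."""
    return dose_per_frame * frame_index

print(f"frame time: {frame_time_s * 1000:.0f} ms")
print(f"dose per frame: {dose_per_frame:.3f} e/Å^2")
print(f"dose rate: {dose_rate:.2f} e/Å^2/s")
print(f"dose after frame 10: {cumulative_dose(10):.2f} e/Å^2")
```

The per-frame cumulative dose is the quantity that a dose-weighting scheme uses to down-weight high-resolution information in later frames.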

Image processing

EM image processing was carried out as previously described with minor modifications (Ru et al., 2015). Dose fractionated super-resolution movie stacks collected from the K2 Summit direct electron detector were binned 2 × 2 to a pixel size of 1.238 Å, and then subjected to motion correction using MotionCor2. A sum of all 36 frames of each movie stack was calculated following a dose-weighting scheme (Grant and Grigorieff, 2015), and used for all following steps of image processing except defocus determination. Defocus values were calculated with the sum of all movie frames without dose weighting, using the program CTFFIND3 (Mindell and Grigorieff, 2003). 2D classification, 3D classification and 3D refinement were carried out using RELION (Scheres, 2012). All refinements followed the gold-standard procedure, in which two half data sets were refined independently. RELION ‘post-processing’ was used to estimate resolution based on the Fourier shell correlation (FSC) = 0.143 criterion after correcting for the effects of a soft shape mask using high-resolution noise substitution (Chen et al., 2013). The overall resolution of the open seed-bubble cryo-EM map (3.8 Å) was estimated based on the FSC = 0.5 criterion instead, to better match the map quality and the validation FSC curves using half data-refined maps and atomic model. Local resolution variations were estimated from two half data maps using ResMap (Kucukelbir et al., 2014). Amplitudes of the final maps were corrected by applying a negative B-factor using RELION ‘post-processing’. The detailed data processing and refinement statistics for the two cryo-EM structures are summarized in Table S1.
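To make the difference between the FSC = 0.143 and FSC = 0.5 criteria concrete, the sketch below reads a resolution off an FSC curve at a chosen threshold by linear interpolation. The curve values are invented for illustration and are not data from this study; the function name is hypothetical, and RELION performs this estimate internally during post-processing.

```python
def resolution_at_threshold(freqs, fsc, threshold):
    """Resolution (Å) at which an FSC curve first drops below `threshold`.

    freqs: spatial frequencies in 1/Å, in ascending order
    fsc:   FSC values at those frequencies
    Returns None if no downward crossing is found between sampled points.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two samples bracketing the crossing.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None

PIXEL_SIZE_A = 1.238                 # binned pixel size stated above
NYQUIST_A = 2.0 * PIXEL_SIZE_A       # finest resolution representable at this sampling

# Made-up FSC curve, for illustration only.
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
fsc = [1.00, 0.98, 0.92, 0.75, 0.45, 0.15, 0.02]

res_0143 = resolution_at_threshold(freqs, fsc, 0.143)
res_05 = resolution_at_threshold(freqs, fsc, 0.5)
print(f"Nyquist limit: {NYQUIST_A:.3f} Å")
print(f"FSC = 0.143: {res_0143:.2f} Å, FSC = 0.5: {res_05:.2f} Å")
```

For any monotonically decaying curve, the 0.5 threshold is crossed at a lower spatial frequency than the 0.143 threshold, so it yields the more conservative (numerically larger) resolution estimate.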

Quantification and Statistical Analysis

All experiments were conducted with at least three biological replicates (n ≥ 3). Cryo-EM data collection, refinement, and validation statistics are reported in Table S1.

Data and Software Availability

The cryo-EM density maps of TfuCascade/full R-loop and TfuCascade/seed-bubble have been deposited in the Electron Microscopy Data Bank under accession numbers EMD-8478 and EMD-8477, and the coordinates for the corresponding atomic models have been deposited in the Protein Data Bank under accession numbers 5U0A and 5U07.

Supplementary Material

Movie S1, related to Figure 2. Rotating view of the cryo-EM reconstruction of the TfuCascade/full R-loop complex.

Densities were sharpened to reveal the high-resolution structure features.

Movie S2, related to Figure 5. Important structure features in the cryo-EM reconstruction of the TfuCascade/seed sequence bubble complex.

The disordered non-target DNA strand and the dsDNA following the seed-bubble were modeled to give perspectives. Note that because the conformational changes in TfuCascade have not taken place, the target strand-binding canyon is open and accessible.

Supplemental Figure S1, related to Figure 1. Biochemical reconstitution of Type I-E CRISPR interference from T. fusca.

A, B. Size-exclusion Chromatography and SDS-PAGE profile of purified TfuCascade. Red and black traces correspond to 260 and 280 nm UV absorptions, respectively. C. Native EMSA analysis of TfuCascade and dsDNA target interaction at 58 °C, with an apparent binding constant of ~ 15 nM. D. PAM preference by TfuCascade as revealed by native EMSA. E, F. Cas3-mediated cleavage pattern on TfuCascade-bound DNA target, at the non-target (5’-6FAM labeled) and target (5’-Cy5 labeled) DNA strands, respectively. G. Nucleotide-resolution mapping of Cas3 nicking sites on the non-target DNA strand. H, I. Temperature-dependent R-loop formation behavior by TfuCascade, as probed by KMnO4 footprinting and native EMSA, respectively. J. EMSA migration differences likely reflect different extents of DNA bending before and after R-loop formation.

Supplemental Figure S2, related to Figure 2. Cryo-EM reconstruction of TfuCascade/full R-loop complex.

A. Representative cryo-EM image and B. 2D averages of TfuCascade/full R-loop particles. C. 3D classification and refinement procedures. Three types of 3D reconstructions reached < 4.0 Å resolution. They differ by the presence/absence of PAM-distal dsDNA density, and the density features also differ slightly at the non-target strand gap region (red circles). The final 3D refinement after combining the cryo-EM particles from all good classes generated a density map at 3.3 Å resolution. D. Two different views of the cryo-EM density maps refined using the particles representing the conformations of type 1, 2, and 3. All three maps are low-pass filtered to 6 Å resolution. Major density feature differences are highlighted in the red circles. E. Final 3D reconstruction (left) and its cross-sectional view (right), colored according to local resolution. F. Histogram of voxels with different local resolution. G. Gold-standard FSC curves between two half maps that were calculated from two half data sets (in red), between the summed map and the final atomic model (in blue), between half map 2 and the atomic model refined against half map 1 (in green). H. Euler angle distribution of cryo-EM particles for calculating the final EM map. The height of each bar indicates the number of particles in a particular orientation.

Supplemental Figure S3, related to Figure 3, 5. Representative cryo-EM densities in the TfuCascade/full R-loop structure.

A. Contacts for PAM recognition. B. Gln-wedge insertion to unwind DNA from PAM-proximal region. C. Thumb insertion by every Cas7 subunit that disrupts the crRNA-target DNA pairing at every 6th position.

Supplemental Figure S4, related to Figure 3, 8. Comparison between TfuCascade/full R-loop and EcoCascade/R-loop mimic structures.

A. Overall structure alignment between TfuCascade/full R-loop (colored according to Figure 2) and EcoCascade/R-loop mimic (PDB: 5H9E, proteins in blue, nucleic acids in magenta) structures. Major structural differences found in the TfuCascade/full R-loop structure are enumerated in text. B. Structure alignment of Cse1 in two structures. Note domain orientation differences as well as different structure features in Cse1_NTD. C. Structure alignment of Cse2. TfuCse2 contains a large flexible internal loop and an extension at the C-terminal helix. D. Structure alignment of Cas7. Note the finger domain differences, the K-vise location difference, and the K-Rim location in TfuCas7. E. Structure alignment of Cas5e. F. Structure alignment of Cas6e. The two Cas6e structures are superimposable because the EcoCas6e structure was rigid-body docked into the low-resolution TfuCascade/full R-loop EM density without further refinement. Note the differences in the spacer region of the crRNA, due to a slightly extended Cas7 backbone conformation in TfuCascade/full R-loop structure.

Supplemental Figure S5, related to Figure 5. Cryo-EM reconstruction of the TfuCascade/seed-bubble complex.

A. Representative cryo-EM image and B. 2D averages of TfuCascade/seed-bubble particles. C. 3D classification and refinement procedures. The reconstructions from the class V (“type 1” conformation) and class VI (“type 2” conformation) in the last round of 3D classification are essentially identical except for the length of the target DNA strand density. The final 3D refinement after combining the cryo-EM particles from both good classes generated a density map at 3.8 Å resolution. D. Two different views of the cryo-EM density maps refined using the particles representing the conformations of type 1, type 2 or both types. All three maps are low-pass filtered to 6 Å resolution. E. Final 3D reconstruction (left) and its cross-sectional view (right), colored according to local resolution. F. Histogram of voxels with different local resolution. G. Gold-standard FSC curves between two half maps that were calculated from two half data sets (in red), between the summed map and the final atomic model (in blue), and between half map 2 and the atomic model refined against half map 1 (in green). H. Euler angle distribution of cryo-EM particles for calculating the final EM map. The height of each bar indicates the number of particles in a particular orientation.

Table S1, related to Methods. Statistical analysis of the cryo-EM structures presented in this study.

Acknowledgments

This work is supported by National Institutes of Health (NIH) grants GM118174 and GM102543 to A.K. We thank M. Chambers and Z. Li for EM technical support; B. Carragher, C. Potter, and R. Grassucci for initial EM analysis; and I. Finkelstein, I. Price, R. Battaglia, and A. Dolan for discussions.

Footnotes

Author Contributions

Y.X., M.L., MF.L., and A.K. designed the research. Y.X. and M.L. are the main contributors to biochemical reconstitutions, structure determination, and structure-function analyses. R.P.H., J.K., S.N., and F.D. contributed to biochemical analyses or assay setup. MF.L. and A.K. led the structural and biochemical analyses. A.K. and the rest of the authors wrote the manuscript.

The authors declare no competing financial interests.

References

  1. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. CRISPR provides acquired resistance against viruses in prokaryotes. Science. 2007;315:1709–1712. doi: 10.1126/science.1138140.
  2. Blosser TR, Loeff L, Westra ER, Vlot M, Kunne T, Sobota M, Dekker C, Brouns SJ, Joo C. Two distinct DNA binding modes guide dual roles of a CRISPR-Cas protein complex. Molecular cell. 2015;58:60–70. doi: 10.1016/j.molcel.2015.01.028.
  3. Bolotin A, Quinquis B, Sorokin A, Ehrlich SD. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology. 2005;151:2551–2561. doi: 10.1099/mic.0.28048-0.
  4. Brouns SJ, Jore MM, Lundgren M, Westra ER, Slijkhuis RJ, Snijders AP, Dickman MJ, Makarova KS, Koonin EV, van der Oost J. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 2008;321:960–964. doi: 10.1126/science.1159689.
  5. Caliando BJ, Voigt CA. Targeted DNA degradation using a CRISPR device stably carried in the host genome. Nature communications. 2015;6:6989. doi: 10.1038/ncomms7989.
  6. Chen S, McMullan G, Faruqi AR, Murshudov GN, Short JM, Scheres SH, Henderson R. High-resolution noise substitution to measure overfitting and validate resolution in 3D structure determination by single particle electron cryomicroscopy. Ultramicroscopy. 2013;135:24–35. doi: 10.1016/j.ultramic.2013.06.004.
  7. Chowdhury S, Carter J, Rollins MF, Golden SM, Jackson RN, Hoffmann C, Nosaka L, Bondy-Denomy J, Maxwell KL, Davidson AR, et al. Structure Reveals Mechanisms of Viral Suppressors that Intercept a CRISPR RNA-Guided Surveillance Complex. Cell. 2017;169 doi: 10.1016/j.cell.2017.03.012.
  8. Datsenko KA, Pougach K, Tikhonov A, Wanner BL, Severinov K, Semenova E. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nature communications. 2012;3:945. doi: 10.1038/ncomms1937.
  9. Duderstadt KE, Chuang K, Berger JM. DNA stretching by bacterial initiators promotes replication origin opening. Nature. 2011;478:209–213. doi: 10.1038/nature10455.
  10. Fineran PC, Gerritzen MJ, Suarez-Diez M, Kunne T, Boekhorst J, van Hijum SA, Staals RH, Brouns SJ. Degenerate target sites mediate rapid primed CRISPR adaptation. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:E1629–1638. doi: 10.1073/pnas.1400071111.
  11. Gorski SA, Vogel J, Doudna JA. RNA-based recognition and targeting: sowing the seeds of specificity. Nat Rev Mol Cell Biol. 2017;18:215–228. doi: 10.1038/nrm.2016.174.
  12. Grant T, Grigorieff N. Measuring the optimal exposure for single particle cryo-EM using a 2.6 angstrom reconstruction of rotavirus VP6. eLife. 2015;4 doi: 10.7554/eLife.06980.
  13. Hale CR, Zhao P, Olson S, Duff MO, Graveley BR, Wells L, Terns RM, Terns MP. RNA-guided RNA cleavage by a CRISPR RNA-Cas protein complex. Cell. 2009;139:945–956. doi: 10.1016/j.cell.2009.07.040.
  14. Hayes RP, Xiao Y, Ding F, van Erp PB, Rajashankar K, Bailey S, Wiedenheft B, Ke A. Structural basis for promiscuous PAM recognition in type I-E Cascade from E. coli. Nature. 2016;530:499–503. doi: 10.1038/nature16995.
  15. Hochstrasser ML, Taylor DW, Bhat P, Guegler CK, Sternberg SH, Nogales E, Doudna JA. CasA mediates Cas3-catalyzed target degradation during CRISPR RNA-guided interference. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:6618–6623. doi: 10.1073/pnas.1405079111.
  16. Hochstrasser ML, Taylor DW, Kornfeld JE, Nogales E, Doudna JA. DNA Targeting by a Minimal CRISPR RNA-Guided Cascade. Molecular cell. 2016;63:840–851. doi: 10.1016/j.molcel.2016.07.027.
  17. Huo Y, Nam KH, Ding F, Lee H, Wu L, Xiao Y, Farchione MD, Jr, Zhou S, Rajashankar K, Kurinov I, et al. Structures of CRISPR Cas3 offer mechanistic insights into Cascade-activated DNA unwinding and degradation. Nature structural & molecular biology. 2014;21:771–777. doi: 10.1038/nsmb.2875.
  18. Jackson RN, Golden SM, van Erp PB, Carter J, Westra ER, Brouns SJ, van der Oost J, Terwilliger TC, Read RJ, Wiedenheft B. Crystal structure of the CRISPR RNA-guided surveillance complex from Escherichia coli. Science. 2014;345:1473–1479. doi: 10.1126/science.1256328.
  19. Jiang F, Doudna JA. The structural biology of CRISPR-Cas systems. Current opinion in structural biology. 2015;30:100–111. doi: 10.1016/j.sbi.2015.02.002.
  20. Jiang F, Taylor DW, Chen JS, Kornfeld JE, Zhou K, Thompson AJ, Nogales E, Doudna JA. Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage. Science. 2016;351:867–871. doi: 10.1126/science.aad8282.
  21. Jore MM, Lundgren M, van Duijn E, Bultema JB, Westra ER, Waghmare SP, Wiedenheft B, Pul U, Wurm R, Wagner R, et al. Structural basis for CRISPR RNA-guided DNA recognition by Cascade. Nature structural & molecular biology. 2011;18:529–536. doi: 10.1038/nsmb.2019.
  22. Kucukelbir A, Sigworth FJ, Tagare HD. Quantifying the local resolution of cryo-EM density maps. Nat Methods. 2014;11:63–65. doi: 10.1038/nmeth.2727.
  23. Luo ML, Mullis AS, Leenay RT, Beisel CL. Repurposing endogenous type I CRISPR-Cas systems for programmable gene repression. Nucleic acids research. 2015;43:674–681. doi: 10.1093/nar/gku971.
  24. Makarova KS, Grishin NV, Shabalina SA, Wolf YI, Koonin EV. A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biology direct. 2006;1:7. doi: 10.1186/1745-6150-1-7.
  25. Makarova KS, Koonin EV. Annotation and Classification of CRISPR-Cas Systems. Methods Mol Biol. 2015;1311:47–75. doi: 10.1007/978-1-4939-2687-9_4.
  26. Makarova KS, Wolf YI, Alkhnbashi OS, Costa F, Shah SA, Saunders SJ, Barrangou R, Brouns SJ, Charpentier E, Haft DH, et al. An updated evolutionary classification of CRISPR-Cas systems. Nature reviews Microbiology. 2015;13:722–736. doi: 10.1038/nrmicro3569.
  27. Marraffini LA, Sontheimer EJ. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science. 2008;322:1843–1845. doi: 10.1126/science.1165771.
  28. Marraffini LA, Sontheimer EJ. Self versus non-self discrimination during CRISPR RNA-directed immunity. Nature. 2010;463:568–571. doi: 10.1038/nature08703.
  29. Mindell JA, Grigorieff N. Accurate determination of local defocus and specimen tilt in electron microscopy. J Struct Biol. 2003;142:334–347. doi: 10.1016/s1047-8477(03)00069-8.
  30. Mohanraju P, Makarova KS, Zetsche B, Zhang F, Koonin EV, van der Oost J. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science. 2016;353:aad5147. doi: 10.1126/science.aad5147.
  31. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Almendros C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology. 2009;155:733–740. doi: 10.1099/mic.0.023960-0.
  32. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J Mol Evol. 2005;60:174–182. doi: 10.1007/s00239-004-0046-3.
  33. Mulepati S, Bailey S. In vitro reconstitution of an Escherichia coli RNA-guided immune system reveals unidirectional, ATP-dependent degradation of DNA target. The Journal of biological chemistry. 2013;288:22184–22192. doi: 10.1074/jbc.M113.472233.
  34. Mulepati S, Heroux A, Bailey S. Crystal structure of a CRISPR RNA-guided surveillance complex bound to a ssDNA target. Science. 2014;345:1479–1484. doi: 10.1126/science.1256996.
  35. Nishimasu H, Nureki O. Structures and mechanisms of CRISPR RNA-guided effector nucleases. Current opinion in structural biology. 2016;43:68–78. doi: 10.1016/j.sbi.2016.11.013.
  36. Pourcel C, Salvignol G, Vergnaud G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology. 2005;151:653–663. doi: 10.1099/mic.0.27437-0.
  37. Redding S, Sternberg SH, Marshall M, Gibb B, Bhat P, Guegler CK, Wiedenheft B, Doudna JA, Greene EC. Surveillance and Processing of Foreign DNA by the Escherichia coli CRISPR-Cas System. Cell. 2015 doi: 10.1016/j.cell.2015.10.003.
  38. Ru H, Chambers MG, Fu TM, Tong AB, Liao M, Wu H. Molecular Mechanism of V(D)J Recombination from Synaptic RAG1-RAG2 Complex Structures. Cell. 2015;163:1138–1152. doi: 10.1016/j.cell.2015.10.055.
  39. Rutkauskas M, Sinkunas T, Songailiene I, Tikhomirova MS, Siksnys V, Seidel R. Directional R-Loop Formation by the CRISPR-Cas Surveillance Complex Cascade Provides Efficient Off-Target Site Rejection. Cell reports. 2015 doi: 10.1016/j.celrep.2015.01.067. [DOI] [PubMed] [Google Scholar]
  40. Scheres SH. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol. 2012;180:519–530. doi: 10.1016/j.jsb.2012.09.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Semenova E, Jore MM, Datsenko KA, Semenova A, Westra ER, Wanner B, van der Oost J, Brouns SJ, Severinov K. Interference by clustered regularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence. Proceedings of the National Academy of Sciences of the United States of America. 2011;108:10098–10103. doi: 10.1073/pnas.1104144108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Shmakov S, Abudayyeh OO, Makarova KS, Wolf YI, Gootenberg JS, Semenova E, Minakhin L, Joung J, Konermann S, Severinov K, et al. Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems. Molecular cell. 2015;60:385–397. doi: 10.1016/j.molcel.2015.10.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Sinkunas T, Gasiunas G, Waghmare SP, Dickman MJ, Barrangou R, Horvath P, Siksnys V. In vitro reconstitution of Cascade-mediated CRISPR immunity in Streptococcus thermophilus. The EMBO journal. 2013;32:385–394. doi: 10.1038/emboj.2012.352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Slaymaker IM, Gao L, Zetsche B, Scott DA, Yan WX, Zhang F. Rationally engineered Cas9 nucleases with improved specificity. Science. 2016;351:84–88. doi: 10.1126/science.aad5227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Sternberg SH, LaFrance B, Kaplan M, Doudna JA. Conformational control of DNA target cleavage by CRISPR-Cas9. Nature. 2015;527:110–113. doi: 10.1038/nature15544. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Szczelkun MD, Tikhomirova MS, Sinkunas T, Gasiunas G, Karvelis T, Pschera P, Siksnys V, Seidel R. Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:9798–9803. doi: 10.1073/pnas.1402597111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Westra ER, van Erp PB, Kunne T, Wong SP, Staals RH, Seegers CL, Bollen S, Jore MM, Semenova E, Severinov K, et al. CRISPR immunity relies on the consecutive binding and degradation of negatively supercoiled invader DNA by Cascade and Cas3. Molecular cell. 2012;46:595–605. doi: 10.1016/j.molcel.2012.03.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Wiedenheft B, Lander GC, Zhou K, Jore MM, Brouns SJ, van der Oost J, Doudna JA, Nogales E. Structures of the RNA-guided surveillance complex from a bacterial immune system. Nature. 2011a;477:486–489. doi: 10.1038/nature10402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Wiedenheft B, van Duijn E, Bultema JB, Waghmare SP, Zhou K, Barendregt A, Westphal W, Heck AJ, Boekema EJ, Dickman MJ, et al. RNA-guided complex from a bacterial immune system enhances target recognition through seed sequence interactions. Proceedings of the National Academy of Sciences of the United States of America. 2011b;108:10092–10097. doi: 10.1073/pnas.1102716108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Yamano T, Nishimasu H, Zetsche B, Hirano H, Slaymaker IM, Li Y, Fedorova I, Nakane T, Makarova KS, Koonin EV, et al. Crystal Structure of Cpf1 in Complex with Guide RNA and Target DNA. Cell. 2016;165:949–962. doi: 10.1016/j.cell.2016.04.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Zhao H, Sheng G, Wang J, Wang M, Bunkoczi G, Gong W, Wei Z, Wang Y. Crystal structure of the RNA-guided immune surveillance Cascade complex in Escherichia coli. Nature. 2014;515:147–150. doi: 10.1038/nature13733. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Movie S1, related to Figure 2. Rotating view of the cryo-EM reconstruction of the TfuCascade/full R-loop complex.

Densities were sharpened to reveal the high-resolution structural features.

Movie S2, related to Figure 5. Important structural features in the cryo-EM reconstruction of the TfuCascade/seed-bubble complex.

The disordered non-target DNA strand and the dsDNA following the seed-bubble were modeled to provide context. Note that because the conformational changes in TfuCascade have not yet taken place, the target strand-binding canyon is open and accessible.

Supplemental Figure S1, related to Figure 1. Biochemical reconstitution of Type I-E CRISPR interference from T. fusca.

A, B. Size-exclusion chromatography and SDS-PAGE profiles of purified TfuCascade. Red and black traces correspond to UV absorbance at 260 and 280 nm, respectively. C. Native EMSA analysis of the interaction between TfuCascade and its dsDNA target at 58 °C, with an apparent binding constant of ~15 nM. D. PAM preference of TfuCascade as revealed by native EMSA. E, F. Cas3-mediated cleavage pattern on the TfuCascade-bound DNA target, at the non-target (5’-6FAM labeled) and target (5’-Cy5 labeled) DNA strands, respectively. G. Nucleotide-resolution mapping of Cas3 nicking sites on the non-target DNA strand. H, I. Temperature-dependent R-loop formation by TfuCascade, as probed by KMnO4 footprinting and native EMSA, respectively. J. EMSA migration differences likely reflect different extents of DNA bending before and after R-loop formation.

Supplemental Figure S2, related to Figure 2. Cryo-EM reconstruction of the TfuCascade/full R-loop complex.

A. Representative cryo-EM image and B. 2D averages of TfuCascade/full R-loop particles. C. 3D classification and refinement procedures. Three types of 3D reconstructions reached < 4.0 Å resolution. They differ in the presence or absence of PAM-distal dsDNA density, and the density features also differ slightly at the non-target strand gap region (red circles). The final 3D refinement after combining the cryo-EM particles from all good classes generated a density map at 3.3 Å resolution. D. Two different views of the cryo-EM density maps refined using the particles representing the conformations of type 1, 2, and 3. All three maps are low-pass filtered to 6 Å resolution. Major differences in density features are highlighted by the red circles. E. Final 3D reconstruction (left) and its cross-sectional view (right), colored according to local resolution. F. Histogram of voxels with different local resolution. G. Gold-standard FSC curves between two half maps that were calculated from two half data sets (in red), between the summed map and the final atomic model (in blue), and between half map 2 and the atomic model refined against half map 1 (in green). H. Euler angle distribution of cryo-EM particles used for calculating the final EM map. The height of each bar indicates the number of particles in a particular orientation.

Supplemental Figure S3, related to Figures 3 and 5. Representative cryo-EM densities in the TfuCascade/full R-loop structure.

A. Contacts for PAM recognition. B. Gln-wedge insertion to unwind DNA from the PAM-proximal region. C. Thumb insertion by every Cas7 subunit, which disrupts the crRNA-target DNA pairing at every 6th position.

Supplemental Figure S4, related to Figures 3 and 8. Comparison between the TfuCascade/full R-loop and EcoCascade/R-loop mimic structures.

A. Overall structure alignment between the TfuCascade/full R-loop (colored according to Figure 2) and EcoCascade/R-loop mimic (PDB: 5H9E, proteins in blue, nucleic acids in magenta) structures. Major structural differences found in the TfuCascade/full R-loop structure are enumerated in the text. B. Structure alignment of Cse1 in the two structures. Note the domain orientation differences as well as the different structural features in Cse1_NTD. C. Structure alignment of Cse2. TfuCse2 contains a large flexible internal loop and an extension at the C-terminal helix. D. Structure alignment of Cas7. Note the finger domain differences, the K-vise location difference, and the K-Rim location in TfuCas7. E. Structure alignment of Cas5e. F. Structure alignment of Cas6e. The two Cas6e structures are superimposable because the EcoCas6e structure was rigid-body docked into the low-resolution TfuCascade/full R-loop EM density without further refinement. Note the differences in the spacer region of the crRNA, due to a slightly extended Cas7 backbone conformation in the TfuCascade/full R-loop structure.

Supplemental Figure S5, related to Figure 5. Cryo-EM reconstruction of the TfuCascade/seed-bubble complex.

A. Representative cryo-EM image and B. 2D averages of TfuCascade/seed-bubble particles. C. 3D classification and refinement procedures. The reconstructions from class V (“type 1” conformation) and class VI (“type 2” conformation) in the last round of 3D classification are essentially identical except for the length of the target DNA strand density. The final 3D refinement after combining the cryo-EM particles from both good classes generated a density map at 3.8 Å resolution. D. Two different views of the cryo-EM density maps refined using the particles representing the conformations of type 1, type 2, or both types. All three maps are low-pass filtered to 6 Å resolution. E. Final 3D reconstruction (left) and its cross-sectional view (right), colored according to local resolution. F. Histogram of voxels with different local resolution. G. Gold-standard FSC curves between two half maps that were calculated from two half data sets (in red), between the summed map and the final atomic model (in blue), and between half map 2 and the atomic model refined against half map 1 (in green). H. Euler angle distribution of cryo-EM particles used for calculating the final EM map. The height of each bar indicates the number of particles in a particular orientation.

Table S1, related to Methods. Statistical analysis of the cryo-EM structures presented in this study.
