Significance
CRISPR-associated transposons (CASTs) show tremendous promise for genome engineering yet remain poorly understood. Here, we present the cryo-electron microscopy structure of the transposase (TnsB) from the V-K CAST element from Scytonema hofmanni (ShCAST). We determine the molecular mechanism of TnsB recruitment to the target site (via the AAA+ regulator TnsC) and the structural details of the TnsB transposase. This TnsB structure reveals architectural similarities to MuA, but also key structural differences that are significant for understanding CAST transposition. Importantly, we highlight a base-flipping mechanism for stabilizing the 5′ end of the transposon, potentially to ensure the fidelity of synaptic complex assembly. The structures presented here provide a direct target for rational, structure-guided design strategies and re-engineering of CAST elements.
Keywords: CRISPR-associated transposon, cryo-EM, TnsB, transposase structure
Abstract
CRISPR-associated transposons (CASTs) are Tn7-like elements that are capable of RNA-guided DNA integration. Although structural data are known for nearly all core transposition components, the transposase component, TnsB, remains uncharacterized. Using cryo-electron microscopy (cryo-EM) structure determination, we reveal the conformation of TnsB during transposon integration for the type V-K CAST system from Scytonema hofmanni (ShCAST). Our structure of TnsB is a tetramer, revealing strong mechanistic relationships with the overall architecture of RNaseH transposases/integrases in general, and in particular the MuA transposase from bacteriophage Mu. However, key structural differences in the C-terminal domains indicate that TnsB’s tetrameric architecture is stabilized by a different set of protein–protein interactions compared with MuA. We describe the base-specific interactions along the TnsB binding site, which explain how different CAST elements can function on cognate mobile elements independent of one another. We observe that melting of the 5′ nontransferred strand of the transposon end is a structural feature stabilized by TnsB and furthermore is crucial for donor–DNA integration. Although not observed in the TnsB strand-transfer complex, the C-terminal end of TnsB serves a crucial role in transposase recruitment to the target site. The C-terminal end of TnsB adopts a short, structured 15-residue “hook” that decorates TnsC filaments. Unlike full-length TnsB, C-terminal fragments do not appear to stimulate filament disassembly in either of two different assays, suggesting that additional interactions between TnsB and TnsC are required for redistributing TnsC to appropriate targets. The structural information presented here will help guide future work in modifying these important systems as programmable gene integration tools.
CRISPR-associated transposons (CASTs) have co-opted Cas genes for RNA-guided DNA integration and are promising candidates for novel genome-editing methods (1, 2). CAST elements are fascinating because of their ability to integrate DNA payloads contained within the element at a precise position, with a specific orientation, and in a programmable manner (3–6). CAST elements are evolutionarily related to Tn7 elements and are often referred to as “Tn7-like” (2). Accordingly, Tn7 and Tn7-like CAST elements contain multiple conserved genes that likely share common functions, leading to newfound appreciation for decades of biochemical, genetic, and structural work on Tn7 and related elements (7, 8).
Despite remarkable diversity (1, 8, 9), all RNA-directed transposition systems characterized to date share multiple components: a CRISPR effector (Cas12k or Cascade), proteins dedicated to target capture (TniQ + TnsC), and a transposase called TnsB. By analogy to work from prototypic Tn7 (2), TnsB carries out transposon end recognition, pairing, and the chemical steps that result in integration of cognate element DNA. The V-K CAST system from Scytonema hofmanni (ShCAST) is especially appealing as a model system for mechanistic studies due to its simplicity (a single polypeptide chain encodes the effector) and robust in vitro activity (4). Currently, structural information exists for Cas12k (10, 11), TniQ, and TnsC (11, 12) but not for the TnsB transposase, and it remains unclear how these indispensable components interact to precisely direct insertions into a guide RNA–directed target site. More generally, structural information on the TnsB transposase is required to obtain a mechanistic understanding of the Tn7 and Tn7-like elements, given their broad distribution across diverse bacteria with many interesting targeting modalities, including all of the functionally described CAST elements.
Despite their similarities, the transposase components of the aforementioned transposons do not behave identically, and components are not interchangeable. ShCAST, like bacteriophage Mu, likely uses a replicative transposition mechanism (13) involving host-primed DNA replication of the element to generate cointegrates between the donor and target DNAs in vivo (14, 15). In contrast, prototypic Tn7 uses a cut-and-paste mechanism that directly forms a simple insertion (16) based on the heteromeric TnsA+TnsB transposase (17). TnsA and TnsB form a protein complex for which the nuclease activities of both proteins (TnsA and TnsB) are required to generate simple insertions (17–20), but the regulatory details of this process remain unresolved with Tn7 and related elements. A structure of the TnsB transposase would set the foundation for understanding the similarities that link related Tn7 and CAST elements, as well as the key differences that would explain their distinct behavior.
Results
TnsB and MuA Have Similar Architecture in the Context of the Strand-Transfer Complex.
ShCAST transposition likely follows that of many other transposition systems: Pairing of the transposon ends (Fig. 1A, Left) is followed by nucleophilic attack at the transposon ends that allows them to be joined to target DNA (Fig. 1A, Middle), resulting in the product DNA, referred to here as the strand-transfer DNA (Fig. 1A, Right). To understand how TnsB recognizes and pairs the transposon ends and subsequently juxtaposes them to target DNA, we reconstituted and imaged a TnsB strand-transfer complex (STC) using a symmetric DNA substrate containing the first 45 bp of the ShCAST transposon’s left end (Fig. 1B) (4). This DNA sequence contains the first full TnsB binding site, L1, and three-fourths of the second TnsB binding site, L2, to mimic the product of transposition (Fig. 1B; see SI Appendix for details). Reconstitution with this substrate resulted in a homogeneous, stable assembly (as assessed by size-exclusion chromatography; SI Appendix, Fig. S1) with which we obtained a high-resolution cryo-electron microscopy (cryo-EM) reconstruction (3.7-Å global resolution; Fig. 1D and SI Appendix, Fig. S2).
Rigid-body docking of isolated domains obtained from an AlphaFold prediction (21) resulted in a nearly full-length atomic model spanning the majority of the TnsB sequence (GenBank accession no. WP_084763316.1; Fig. 1C). TnsB forms a C2-symmetric tetrameric assembly organized around the strand-transfer DNA (Fig. 1D and E and Movie S1). The overall architecture and arrangement of functional domains are remarkably similar to the MuA STC (22) (Fig. 1F and G and SI Appendix, Fig. S3). MuA is a well-studied RNaseH transposase that is responsible for bacteriophage Mu integration. In the representative view shown (Fig. 1F and G), both complexes resemble an “X,” where the upper half of the complex consists of the target DNA (blue, Fig. 1F and G) and the lower half consists of the transposon ends (green, Fig. 1F and G). Both MuA and TnsB cleave the donor DNA in trans—the subunit whose DNA-binding domain interacts with DNA on the right-hand side of the complex (tan subunit, Fig. 1D and E) positions the catalytic domain to interact with the DNA on the left side of the target–donor junction and vice versa (Fig. 1D and E). Furthermore, both left and right halves of the complexes are identical, with each half containing two protein chains, each in different conformations that are determined by where they bind on the DNA substrate. The two TnsB binding sites on the strand-transfer DNA substrate are referred to as L1 and L2 (because the designed DNA substrate used ShCAST left ends). The corresponding TnsB conformers are distinguished by which TnsB binding site they occupy (Movie S1), and hence the TnsB monomer bound to L1 is referred to as B-L1, and TnsB bound to L2 is referred to as B-L2 (Fig. 1C and D; both are described in more detail below).
We have assigned TnsB domain names following MuA domain names (Fig. 1C) (22), given the remarkable similarities between the STC structures, in order to facilitate structural comparisons. Domains Iβ, Iγ, and IIβ are DNA-binding domains (Fig. 1C and Movie S1), domain IIα is the catalytic domain, and, finally, domains IIIα and IIIβ span the TnsB C terminus, which will be discussed in detail in the following sections. The B-L1 conformation includes residues 29 to 474 and is positioned at the target–donor junction (tan and light purple, Fig. 1C and D and Movie S1). The second distinct conformation, B-L2, spans residues 196 to 519 (orange and dark purple, Fig. 1C and D) and binds the second TnsB binding site (L2).
A Distinct Role for Helix IIIα in Stabilizing TnsB Strand-Transfer Architecture.
Compared with the MuA STC (22), two DNA-binding domains, Iβ and Iγ, from the TnsB B-L2 subunit are not present in our structure (compare Fig. 1F and G; domains present in MuA but absent in TnsB B-L2 are marked with a blue star in Fig. 1G), possibly due to the choice of substrate (our DNA contains an incomplete L2 TnsB binding site). We also observe structural differences between ShCAST TnsB and MuA assemblies. One example is how the tetrameric architecture is stabilized, most notably in the placement of helix IIIα (red asterisk, Fig. 1F and G). In MuA, helix IIIα adopts two different configurations in the R1- (purple square, Fig. 1G) and R2-bound MuA subunits (yellow triangle, Fig. 1G). In contrast, in the TnsB STC, helix IIIα appears to stabilize the tetramer by making different intersubunit interactions (Movie S1). B-L2 helix IIIα (red asterisk, Fig. 1F) wraps around the back of domain IIα of B-L1 (light purple, Fig. 1E) to nestle between the B-L1 (light purple and tan) subunits, forming interactions with both (Fig. 1D and H). In addition to helix IIIα, we observe intersubunit interactions between domain Iβ in B-L1 (tan, boxed in Fig. 1I) and domain IIβ in B-L2 (orange, Fig. 1I). Here, B-L1 domain Iβ completes a β-sheet within B-L2 domain IIβ (Fig. 1I). Therefore, while the TnsB STC contains many conserved features to ensure fidelity of synaptic complex assembly, it appears to have evolved different protein–protein interactions to hold the tetrameric assembly together compared with those found in the MuA STC.
ShCAST Transposase Recruitment Occurs via Interactions between TnsC and TnsB’s C Terminus.
We do not observe any ordered structure past domain IIIα in our TnsB STC structure (Fig. 1C), consistent with the disorder prediction in this region (Fig. 2A). Nevertheless, this is particularly interesting given the role of the transposase C terminus in both prototypic Tn7 and Mu. The last 22 residues of TnsB (residues 681 to 702) in prototypic Tn7 are essential for the TnsB–TnsC interaction and transposition (23). For Mu, the C terminus of MuA is crucial for stimulating adenosine triphosphate (ATP) hydrolysis (24) and disassembly of MuB filaments (the AAA+ protein providing a function analogous to ShCAST TnsC) (25, 26), implying that MuA C-terminal interactions with MuB are also relevant for MuA transposition. Motivated by the remarkable structural and functional similarities between MuB and ShCAST TnsC (12), we reasoned that the C-terminal 109 residues of TnsB (spanning domains IIIα and IIIβ, which we refer to as TnsBCTD; Fig. 2A) are most likely to interact with the TnsC filament. Because TnsB, like MuA, stimulates TnsC filament disassembly in a nucleotide-dependent manner (Fig. 2B) (12), we reasoned that full-length TnsB would not form a stable complex with TnsC filaments suitable for high-resolution structure determination, so we instead pursued structural characterization with TnsB fragments. In order to capture a homogeneous “recruitment-like” state, we added TnsBCTD in excess to AMPPNP-bound TnsC, which forms continuous helical filaments on target DNA (12).
The cryo-EM reconstruction of the TnsC filament coated with TnsBCTD revealed side-chain density features (3.5-Å resolution) corresponding to 14 residues decorating the surface of TnsC filaments (Fig. 2C and SI Appendix, Fig. S4). Atomic modeling into this density (SI Appendix, Fig. S5) indicated that this portion of TnsB most likely corresponds to the last 15 residues of TnsB (the last residue is not modeled), or residue positions 570 to 584, which we refer to as TnsBHook (Fig. 2A and D). Subsequent cryo-EM reconstruction of the TnsBHook peptide (residues 570 to 584; Fig. 2A) in the presence of TnsC filaments resulted in a reconstruction indistinguishable from the previous one, except for slight resolution differences (3.5 vs. 3.8 Å for the TnsBCTD vs. TnsBHook reconstructions, respectively; SI Appendix, Fig. S6), confirming the TnsBHook sequence register assignment. The lack of additional density corresponding to TnsBCTD in our cryo-EM reconstruction suggests positions outside the structured TnsBHook do not make specific contacts with the TnsC filament, which is consistent with TnsB disorder predictions (Fig. 2A). Taken together, the most parsimonious explanation for this is that the TnsBHook represents a structured interaction with TnsC connected by a flexible linker to the rest of the full-length TnsB. Deletions of either the TnsBCTD (ΔCTD, or equivalently TnsBΔCTD, corresponding to residues 1 to 475; Fig. 2A) or the TnsBHook (ΔHook, or TnsBΔHook, corresponding to residues 1 to 569; Fig. 2A) result in loss of transposition activity (Fig. 2E).
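The construct boundaries quoted above can be cross-checked with simple bookkeeping. The following is a minimal sketch in Python, assuming 1-based inclusive residue numbering and a full-length TnsB of 584 residues (as implied by the TnsBHook range); the dictionary keys are illustrative labels, not construct names from the original work.

```python
# Residue ranges (1-based, inclusive) as quoted in the text for ShCAST TnsB.
# The dictionary keys are illustrative labels chosen for this sketch.
CONSTRUCTS = {
    "TnsB_dCTD": (1, 475),
    "TnsB_dHook": (1, 569),
    "TnsB_CTD": (476, 584),
    "TnsB_Hook": (570, 584),
}


def span_length(first, last):
    """Return the number of residues in an inclusive residue range."""
    return last - first + 1


lengths = {name: span_length(*rng) for name, rng in CONSTRUCTS.items()}

# Each deletion construct should end exactly one residue before the
# complementary C-terminal fragment begins (no gap, no overlap).
dctd_is_complement = CONSTRUCTS["TnsB_dCTD"][1] + 1 == CONSTRUCTS["TnsB_CTD"][0]
dhook_is_complement = CONSTRUCTS["TnsB_dHook"][1] + 1 == CONSTRUCTS["TnsB_Hook"][0]

for name, n in lengths.items():
    print(f"{name}: {n} residues")
print("dCTD/CTD complementary:", dctd_is_complement)
print("dHook/Hook complementary:", dhook_is_complement)
```

Under these assumptions, TnsBCTD spans 109 residues and TnsBHook spans 15, in agreement with the lengths stated in the text.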
ShCAST target-site selection relies on the stimulation of TnsC filament disassembly by TnsB-promoted ATP hydrolysis to allow guide RNA–directed transposition (Fig. 2B) (12). Therefore, we wondered whether interactions with the TnsC filament were sufficient for hydrolyzable nucleotide-dependent filament disassembly, as observed in Mu (24, 27). However, none of the TnsB N-terminal (TnsBΔHook: 1 to 569; TnsBΔCTD: 1 to 475) or C-terminal fragments (TnsBHook: 570 to 584; TnsBCTD: 476 to 584) we assayed were sufficient to recapitulate the disassembly phenotype observed with full-length TnsB with ATP using EM imaging (Fig. 2F) or biochemical assays that track TnsC oligomerization on DNA (SI Appendix, Fig. S7), at least at concentrations for which full-length TnsB is effective at stimulating TnsC filament disassembly. Therefore, in contrast to Mu (24), this suggests that one or more additional interactions between TnsB and TnsC, in addition to that made with the TnsBHook, are required in order to stimulate ATP hydrolysis and filament disassembly in ShCAST. Although a MuA–MuB structure is not available, the interaction surface between TnsB and TnsC appears to colocalize to the same interaction surface mapped to MuB, assuming positions between TnsC and MuB are roughly equivalent (SI Appendix, Fig. S8A) (27, 28). Nevertheless, the lysine residues responsible for mediating transposase interactions in MuB do not appear conserved (SI Appendix, Fig. S8B), suggesting that the nature of interactions between the transposase and its AAA+ regulator varies across transposition systems.
Together, these results paint a picture of the initial steps of TnsB recruitment to the target site via the AAA+ regulator, TnsC. TnsB’s C-terminal hook interacts with TnsC along the surface of the filament, but interaction via the TnsBHook alone is insufficient to stimulate TnsC filament disassembly, indicating that one or more additional interactions between TnsB and TnsC not visualized here must be required. In addition, we reveal that the TnsB C-terminal hook is flexibly linked to the rest of TnsB. The flexible linker is not conserved in length or sequence among TnsB homologs from the V-K CAST elements (SI Appendix, Fig. S9). Nevertheless, given the relatively precise insertion spacing observed in ShCAST (4), it may play crucial roles in orienting TnsB to interact productively with the target site.
The TnsB Strand-Transfer Complex Stabilizes Highly Distorted DNA.
DNA distortions, particularly in the target-bound DNA, are canonical features of RNaseH transposase structures. The TnsB STC has highly distorted DNA (120° bend; Fig. 3A and B) surrounding the 5-bp target site (brown, Fig. 3A), comparable to that in MuA (22). Target DNA distortions are required to place the scissile phosphate appropriately in the active site (29). Consistent with this, the DDE catalytic residues (D205, D287, and E321; Fig. 3C) are positioned at the target–donor junction precisely at the DNA distortion (red star, Fig. 3B), coordinating a magnesium ion with the scissile phosphate poised for nucleophilic attack (Fig. 3C). Mutation of the catalytic residues significantly reduced transposition activity (Fig. 3D). Surprisingly, the D205A mutation did not completely abolish transposition, but there is no immediately nearby acidic residue that can compensate for the role of D205 (the closest Asp/Glu residue is D291, which is 7.2 Å away from the Mg2+ ion). Thus, further investigation is required to understand how the D205A mutant can still carry out transposition.
In MuA, helix IIIα of the R1-bound subunit (light purple, indicated with a purple square, Fig. 1G) has additional roles in stabilizing target-DNA distortions and preventing reversal of the reaction (30). The absence of a similar interaction in the TnsB STC structure (Fig. 1F) suggests that the role of helix IIIα in TnsB may primarily be for tetramer stabilization rather than stabilizing target-DNA distortions. Consistent with this, in TnsB the domain IIβ close to the target DNA is closely interacting with the sugar-phosphate backbone, whereas the equivalent domain in MuA is too far to interact with the target DNA (SI Appendix, Fig. S10). This suggests that target-DNA distortions in TnsB are stabilized via a different DNA-binding domain, namely domain IIβ.
TnsB Interactions with Donor DNA Delineate Transposon End Recognition.
Tn7-like elements have an 8-bp terminal sequence (gray, adjacent to the target-site duplication, and target site in brown, Fig. 3E) (2). In our structure, this 8-bp terminal sequence corresponds to the part of the DNA substrate contacted by the catalytic domain (domain IIα; SI Appendix, Fig. S11) and can be assigned to the contacts between the B-L1 subunit and target DNA near the target–donor junction (Fig. 3F). Transposon cargo and Tn7/CRISPR–associated genes are flanked by left and right ends, consisting of multiple 22-bp TnsB binding sites (1, 2, 31) (blue triangles, Fig. 3E). In order to understand the protein–DNA interactions that enable TnsB to recognize its cognate DNA sequence, we looked at DNA-binding domains Iβ and Iγ, which bind along the first TnsB binding site on the donor DNA (L1; Fig. 3F and G). The majority of protein–DNA interactions are sequence-nonspecific contacts with the sugar-phosphate backbone (Fig. 3F). However, several key residues located in the Iγ domain and in the Iβ–Iγ linker form sequence-specific nucleobase contacts. Within Iγ, R158 and K154 are within hydrogen-bonding distance of G−11 and G9, respectively (Fig. 3H). Interestingly, the Iβ–Iγ linker lies along the minor groove of the DNA duplex and contributes sequence-specific contacts. R106 and R99 are within hydrogen-bonding distance of T−14 and T16, respectively (Fig. 3I). The Iγ domain and the Iβ–Iγ linker make contact with nucleotide positions 5 to 19, which is roughly consistent with the pattern of conservation among TnsB binding sites (SI Appendix, Fig. S12). Although some base-specific interactions are observed in the Iβ domain (R58, R77, and R81), the lack of strong conservation in the TnsB donor sequence in this region (positions 20 to 30; SI Appendix, Fig. S12) suggests that these residues may not strongly contribute to transposon end recognition.
Therefore, the TnsB STC structure suggests that transposon end DNA recognition may be modular (i.e., independent and separable from catalytic function) in TnsB, like MuA (32), and could feasibly be altered using rational design strategies, as has been done in the past with MuA via the generation of a chimeric recombinase called “SinMu” (33).
TnsB Forms Specific Base-Stabilizing Contacts in the Nontransferred Strand.
Unlike prototypic Tn7 or other CAST elements (such as the I-F3 subfamily), ShCAST (like other V-K CAST elements) does not encode enzymatic activity for cleavage at the 5′ ends of the element (i.e., it does not encode TnsA) (15). Consistent with this, CAST V-K elements form cointegrates indicative of replicative transposition without subsequent resolution (14). Therefore, we were particularly intrigued to discover a unique structural conformation of the nontransferred strand at the 5′ ends of the transposon, a feature missing in MuA (Fig. 4A). The linker connecting domains Iγ and IIα in the L1-bound TnsB subunit snakes underneath each 5′ end of the element in the nontransferred strand (Fig. 4A), forming stabilizing interactions (Fig. 4B) with the first two nucleotide positions. We observe “melting” of the 5′ end of the nontransferred strand through a flipped-out base (T1; Fig. 4B). This specific conformation is stabilized by aromatic interactions with W178 and hydrogen bonding with S175 and R380 (Fig. 4B). Mutation of residues observed to interact with the nontransferred strand results in almost complete abrogation of transposition activity (Fig. 4C), highlighting the importance of the observed interactions.
We wondered whether specific interactions at the ends of the element were consistent with additional flanking DNA from the donor plasmid, as would be expected given TnsB’s transposition mechanism (Fig. 1A). Modeled flanking DNA from the 5′ end of the transposon is sterically accommodated within our existing structure (Fig. 4D), indicating that the DNA substrate we used here is consistent with formation of TnsB cointegrates. Therefore, it appears that the specific structural feature we observe at the 5′ end of the element is both important and consistent with TnsB’s expected transposition substrate. We postulate that the melting of the 5′ nontransferred strand may serve as a regulatory step that ensures the fidelity of synaptic complex assembly.
Discussion
The structures reported here include an STC of a Tn7-like CAST element, and also highlight the remarkable consistency across the catalytic domains of RNaseH transposases, specifically with respect to the Mu transposase (22), despite distant evolutionary relationships. This high degree of structural conservation across considerable evolutionary distance leads us to propose that TnsB from prototypic Tn7 adopts an architecture similar to ShCAST TnsB and MuA upon integration. While not addressed in this work, multiple internal TnsB binding sites found asymmetrically in the left and right ends (Fig. 3E) must somehow establish the strict orientation specificity found with these elements (3, 4, 34). Therefore, a lingering mystery for ShCAST and related transposons is how placement of internal binding sites establishes the orientation and fidelity of synaptic complex assembly.
AlphaFold predictions of the catalytic domain (domain IIα) of prototypic Tn7 TnsB superimpose well onto ShCAST TnsB (2.4 Å rmsd; SI Appendix, Fig. S13A). Interestingly, the region known to interact with TnsA in the prototypic Tn7 system (19) localizes to where flanking host DNA would be located (SI Appendix, Fig. S13B). Given that this is the position where the TnsA nuclease would need to localize in order to generate 5′ end cuts for generating simple insertions, this structure suggests that manipulation of ShCAST transposon characteristics via structure-based engineering is practically achievable.
The structural features we observe at the 5′ transposon end in the STC structure (Fig. 4B) have also been observed in the RAG1–RAG2 synaptic complex, in which a base-flipping mechanism is important for end recognition and stabilization of the heptamer of the recombination signal sequence (RSS) (35). In contrast, analogous base flipping is not observed in the MuA structure (22), which is not completely modeled in this region. The absence of this feature in MuA either results from a lack of resolution (due to anisotropic resolution) or, alternatively, indicates that Mu does not stabilize nicked ends in the same manner as ShCAST TnsB. Further research will be required to understand the exact functional role for base flipping in these elements.
The structural work described here also sheds light on the process of transposase recruitment to the target site for ShCAST and related transposition systems. We demonstrate that physical association between TnsBCTD and TnsC is primarily via the C-terminal hook that is capable of decorating TnsC filaments (Fig. 2). A total of 50 residues (520 to 569) are not observed in either our TnsBCTD–TnsC structure or the TnsB STC structure, which is consistent with predictions of disorder based on primary sequence (Fig. 2A). This suggests that this particular region of TnsB remains flexible and without structure, at least in the states we have captured here. This is consistent with a model in which a second interaction between TnsB and TnsC is required to recapitulate nucleotide-dependent TnsC filament disassembly, which is observed with full-length TnsB but not the TnsB fragments that we used to decorate TnsC. Such interactions may also be needed to activate the otherwise latent transposition activity in ShCAST TnsB. While the structures here reveal mechanistic insight into TnsB function and provide a basis for ShCAST engineering, this work also uncovers exciting questions centered on ShCAST transposon structure and function that will remain fascinating topics for future investigations.
Materials and Methods
Strand-Transfer Complex Reconstitution.
The strand-transfer DNA substrate was prepared by annealing three oligonucleotides, heating to 95 °C, and then cooling slowly to room temperature in annealing buffer (see SI Appendix for composition and SI Appendix, Table S2). The strand-transfer DNA substrate and purified TnsB were mixed in a 1:6 molar ratio with the following final buffer composition: 26 mM Hepes (pH 7.5), 5 mM Tris⋅HCl (pH 7.5), 20 mM KCl, 100 mM NaCl, 0.2 mM MgCl2, 15 mM MgOAc2, 3% glycerol, and 1.5 mM dithiothreitol (DTT). After incubation at 37 °C for 40 min, the sample was concentrated to ∼7 mg/mL using an Amicon Ultra centrifugal filter (50-kDa molecular weight cutoff, EMD Millipore); 250 μL of the concentrated sample was subjected to size-exclusion chromatography (Superdex S200 Increase 10/300 GL, Cytiva). Peak fractions from 9.2 to 10.7 mL were collected for cryo-EM sample preparation (SI Appendix, Fig. S1).
TnsBCTD–TnsC Complex Preparation.
TnsB and TnsC were purified following previously described protocols (4, 12). Protein truncation constructs consisting of TnsB’s 109 C-terminal residues (referred to throughout as TnsBCTD) were cloned from the ShTnsB vector (Addgene, 135525) and purified using previously described protocols (4, 12). To prepare the TnsBCTD–TnsC complex for cryo-EM imaging, TnsC filaments were formed by mixing purified TnsC with a 1:10 molar ratio of a 22-bp double-stranded DNA (dsDNA) substrate (SI Appendix, Table S2; see SI Appendix for more details). TnsC was allowed to polymerize on ice for 5 min before adding purified TnsBCTD at a twofold molar excess with respect to TnsC.
Cryo-EM Sample Preparation and Imaging.
Slightly different sample preparation protocols were used for the two samples (referred to as TnsB STC and TnsBCTD–TnsC) described in this manuscript. For the TnsB STC, homemade graphene oxide (GO) grids were used (see SI Appendix for fabrication details); 4 μL of reconstituted TnsB STC was loaded onto the carbon side of freshly fabricated GO grids. The sample was incubated on the grid for 20 s in the chamber of a Mark IV Vitrobot (ThermoFisher), which was set to 4 °C and 100% humidity. Grids were blotted using a blot force of 5 and blot time of 7 s prior to being plunged into liquid ethane cooled by liquid nitrogen. For the TnsBCTD–TnsC sample, R1.2/1.3 gold grids (UltraAuFoil, Quantifoil) were glow-discharged (PELCO easiGlow) using a 30-mA current for 30 s prior to sample application and vitrification; 4 μL of freshly reconstituted TnsBCTD–TnsC sample was applied to the gold grid. Vitrification conditions followed those of the TnsB STC (see above).
Vitrified samples were imaged using a Talos Arctica (ThermoFisher, operated at 200 keV) equipped with a BioQuantum energy filter (Gatan) and a K3 direct electron detector (Gatan). The microscope was subjected to stringent alignment procedures, including coma-free alignment and parallel illumination (36). High-throughput imaging was achieved using a 3-by-3 image shift in SerialEM (37). Image magnification settings corresponded to 63,000× magnification (1.33 Å per pixel scaling) and a nominal defocus range of −1.0 to −2.5 μm. Comprehensive imaging parameter details are presented in SI Appendix, Table S1.
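As a consistency check on the imaging settings, the sampling (Nyquist) limit implied by the pixel size can be compared with the reported map resolutions. The sketch below assumes the standard relation that the Nyquist limit is twice the pixel size and that both datasets were collected at the stated 1.33 Å per pixel; the function name and dictionary labels are illustrative.

```python
def nyquist_limit_angstrom(pixel_size_angstrom):
    """Finest spacing (in angstroms) that a given pixel size can sample:
    twice the pixel size."""
    return 2.0 * pixel_size_angstrom


pixel_size = 1.33  # angstroms per pixel, from the imaging settings in the text

# Reported global resolutions (angstroms) for the two reconstructions.
reported = {"TnsB STC": 3.7, "TnsBCTD-TnsC": 3.5}

limit = nyquist_limit_angstrom(pixel_size)

# A reported resolution is physically plausible only if it is no finer
# than the Nyquist limit (i.e., numerically greater than or equal to it).
within_limit = {name: res >= limit for name, res in reported.items()}

print(f"Nyquist limit: {limit:.2f} angstroms")
for name, ok in within_limit.items():
    print(f"{name}: {reported[name]} angstroms, within sampling limit: {ok}")
```

Under these assumptions the sampling limit is 2.66 Å, so both reported resolutions lie comfortably within what the detector sampling allows.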
Image Processing.
Warp (38) was used for micrograph preprocessing, including beam-induced motion correction, contrast transfer function (CTF) estimation, and initial particle picking. For the TnsB STC, a C1 ab-initio reconstruction was generated using cryoSPARC (39). Because the resulting reconstruction had apparent C2 symmetry, C2 symmetry was imposed for all subsequent refinement steps. For the TnsBCTD–TnsC reconstruction, a 20-Å low pass–filtered map of the ATPγS-bound TnsC cryo-EM reconstruction (EMD-23720) (12) was used as an initial reference for cryoSPARC helical reconstruction and refinement (39). Roughly the same refinement procedure was applied to both datasets: cryoSPARC particle alignment parameters and stacks were exported to RELION (40, 41) for subsequent refinement, including three-dimensional classification, CTF refinement (42), and Bayesian polishing (43). The final TnsB STC reconstruction had an estimated resolution of 3.7 Å (gold-standard Fourier shell correlation [FSC]), and the TnsBCTD–TnsC reconstruction had an estimated resolution of 3.5 Å. More comprehensive methodological details are presented in SI Appendix.
Atomic Model Building.
Different modeling procedures were used for the TnsB STC and TnsBCTD–TnsC filament cryo-EM reconstructions. For the TnsB STC cryo-EM map, the TnsB sequence was used to generate an AlphaFold2 (21) prediction. The top-ranked model was split by domain and manually docked into the cryo-EM map using UCSF Chimera (44). One half of the complex, containing two distinct conformations of TnsB and DNA, was completed manually using Coot (45) and C2 symmetry was used to generate the full complex. This was followed by manual inspection and further refinement using Coot (45). The full assembly was energy-minimized in the context of the cryo-EM map using Rosetta (46). Protein and DNA geometry was subjected to Phenix real-space refinement (47).
In the case of the TnsBCTD–TnsC filament cryo-EM reconstruction, TnsC and DNA models from the ATPγS-bound TnsC filament (Protein Data Bank [PDB] ID code 7M99) served as very close initial models. Small adjustments in the TnsC model were made using Rosetta energy minimization, employing helical symmetry to model two helical turns using a single asymmetric unit, as described previously (12). In order to identify the register of the TnsBHook fragment, a 14-residue polyalanine backbone was first built into the density. A custom script was used to thread all 96 possible registers, representing each possible threaded sequence spanning the 109-residue TnsBCTD construct (109 − 14 + 1 = 96 possible registers), onto the TnsB fragment backbone. Each initial model was then relaxed into the density independently, using Rosetta energy minimization (46). Additional Rosetta energy terms to assess atomic model-map fit (elec_dens_fast weight = 40) were enforced during refinement (SI Appendix, Fig. S5), and 30 models were generated for each energy minimization run. The best scoring model was used to assess the sequence register, as shown in SI Appendix, Fig. S5. Details of the model statistics and validation are presented in SI Appendix, Table S1.
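The register enumeration described above amounts to sliding a fixed-length window along the construct sequence. The following Python sketch is not the custom script used in this work; it only illustrates the counting, using a placeholder sequence (the real TnsBCTD sequence is not reproduced here) and a hypothetical helper named enumerate_registers. Relaxation and scoring of each threaded model in Rosetta are outside the scope of the sketch.

```python
def enumerate_registers(construct_seq, window, first_residue=1):
    """Yield (start, end, subsequence) for every contiguous window of the
    given length, with start/end in the numbering of the full protein."""
    n = len(construct_seq)
    for offset in range(n - window + 1):
        start = first_residue + offset
        end = start + window - 1
        yield start, end, construct_seq[offset:offset + window]


# Placeholder 109-residue string standing in for TnsBCTD (residues 476 to 584).
# This is NOT the real sequence; only its length matters for the count.
placeholder_ctd = "A" * 109

registers = list(
    enumerate_registers(placeholder_ctd, window=14, first_residue=476)
)

print("number of registers:", len(registers))
print("first register spans:", registers[0][:2])
print("last register spans:", registers[-1][:2])
```

With a 109-residue construct and a 14-residue backbone, the sketch yields 96 candidate registers, the first spanning residues 476 to 489 and the last spanning residues 571 to 584.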
In Vitro Transposition Assay.
In vitro transposition assays were carried out as previously described (4, 12). First, 48 μL of the target pot reaction (containing pTarget_PSP1 Addgene 127926, Cas12k, single-guide RNA [sgRNA], TnsC, and TniQ) and 48 μL of the donor pot reaction (containing pDonor_ShCAST Addgene 127924 and TnsB) were independently incubated at 37 °C for 15 min. Then, these two pots were combined and supplemented with 4 μL of 375 mM MgOAc2. The mixture was then incubated for 2 additional hours at 37 °C. The combined final transposition reaction consisted of the following: 50 nM Cas12k, 50 nM TnsC, 50 nM TniQ, 50 nM TnsB, 100 nM sgRNA, 26 mM Hepes (pH 7.5), 4 mM Tris (pH 7.4), 40 mM NaCl, 10 mM KCl, 0.8% glycerol, 2 mM DTT, 50 μg/mL bovine serum albumin, 0.04 mM ethylenediaminetetraacetate, 0.2 mM MgCl2, 15 mM MgOAc2, 0.54 nM pDonor_ShCAST, 0.45 nM pTarget_PSP1, and 2 mM ATP. Product DNA was purified from the reaction mixture using a GeneJET PCR Purification Kit (ThermoFisher), followed by heat-shock transformation into DH5α competent cells (a gift from the J.E.P. laboratory). Transformed cells were plated on agar plates with 50 μg/mL kanamycin for selection.
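The volumes above can be checked with simple dilution arithmetic: two 48-μL pots plus 4 μL of magnesium acetate give a 100-μL reaction, so the 375 mM stock is diluted to the stated 15 mM. The sketch below is a convenience calculation, not part of the published protocol; the function names are hypothetical, and the back-calculated pot concentration is an inference from the stated final values rather than a figure reported here.

```python
def final_concentration(stock_conc: float,
                        added_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after diluting a stock into the total reaction volume."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * added_volume_ul / total_volume_ul


def required_pot_concentration(final_conc: float,
                               pot_volume_ul: float,
                               total_volume_ul: float) -> float:
    """Concentration needed in a pot to reach `final_conc` after combining."""
    if pot_volume_ul <= 0:
        raise ValueError("pot volume must be positive")
    return final_conc * total_volume_ul / pot_volume_ul


TARGET_POT_UL = 48.0
DONOR_POT_UL = 48.0
MG_ACETATE_UL = 4.0
TOTAL_UL = TARGET_POT_UL + DONOR_POT_UL + MG_ACETATE_UL  # 100 uL

mg_acetate_final_mM = final_concentration(375.0, MG_ACETATE_UL, TOTAL_UL)
print(mg_acetate_final_mM)  # 15.0

# Inferred, not stated in the text: TnsB concentration in the donor pot
# that would yield 50 nM in the combined reaction.
tnsb_pot_nM = required_pot_concentration(50.0, DONOR_POT_UL, TOTAL_UL)
```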
TnsC Filament Disassembly Assay.
TnsC disassembly was probed using two different assays: an EM-based imaging assay and a biochemical assay. The imaging assay was carried out as follows: to initiate filament assembly, TnsC and 60-bp dsDNA (SI Appendix, Table S2) were added at a 25:1 molar ratio into a buffer containing 2 mM nucleotide (ATP or AMPPNP), 25 mM Hepes, 200 mM NaCl, 2% glycerol, 1 mM DTT, and 2 mM MgCl2. Filaments were then incubated with full-length TnsB or TnsB truncations (1:1 molar ratio of TnsC to TnsB), or with an equivalent volume of the buffer as a negative control. Each reaction was incubated at 30 °C for 1 h, followed by negative-stain EM. For the biochemical assay, desthiobiotinylated DNA was incubated with streptavidin magnetic beads. TnsC filament assembly was initiated (as described above) and then TnsB (full-length or truncation constructs) was added to the reaction mixture. After multiple washes, DNA was eluted from the beads using a 4 mM biotin solution and the associated proteins were examined using sodium dodecyl sulfate–polyacrylamide gel electrophoresis. SI Appendix includes more comprehensive details.
Acknowledgments
We thank the Cornell Center for Materials Research facility, as well as Katherine Spoth and Mariena Silvestry-Ramos, for maintenance of the electron microscopes used for this research (NSF-DMR1719875). We acknowledge the Extreme Science and Engineering Discovery Environment for computational resources used for image processing (MCB200090 to E.H.K.). We additionally thank Phillip Milner for help with synthesizing GO, Tristan Wellner for optimizing GO grid fabrication, Fang Zhang for help with the generation of DNA substrates, Shan-Chi “Popo” Hsieh for generously allowing us to include his alignment of V-K CAST TnsB sequences, and Michael T. Petassi for help with transposition assays. We also thank Phoebe Rice, Marcin Nowotny, and Nancy Craig for helpful discussions and critical reading of the manuscript. We additionally thank members of the E.H.K. and J.E.P. groups for helpful and stimulating discussions. This research is supported by NIH R01GM129118 (to J.E.P.), R21AI148941 (to J.E.P.), and R01GM124463 (to E.H.K.) and the Pew Biomedical Foundation (E.H.K.).
Footnotes
Competing interest statement: The J.E.P. laboratory has corporate funding for research that is not directly related to the work in this publication. Cornell University has filed patent applications with J.E.P. as inventor involving CRISPR-Cas systems associated with transposons that are not directly related to this work.
This article is a PNAS Direct Submission.
See online for related content such as Commentaries.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2202590119/-/DCSupplemental.
Data Availability
Electron density maps and atomic models reported in this article have been deposited in the Protein Data Bank and Electron Microscopy Data Bank (EMDB) (EMDB ID code 25454 TnsBCTD-TnsC filament, EMDB ID code 25455 TnsB STC, EMDB ID code 27140 TnsBHook-TnsC filament, PDB ID code 7SVV TnsBCTD-TnsC filament, and PDB ID code 7SVW TnsB STC).
All study data are included in the article and/or supporting information.
References
- 1. Faure G., et al., CRISPR-Cas in mobile genetic elements: Counter-defence and beyond. Nat. Rev. Microbiol. 17, 513–525 (2019).
- 2. Peters J. E., Makarova K. S., Shmakov S., Koonin E. V., Recruitment of CRISPR-Cas systems by Tn7-like transposons. Proc. Natl. Acad. Sci. U.S.A. 114, E7358–E7366 (2017).
- 3. Klompe S. E., Vo P. L. H., Halpin-Healy T. S., Sternberg S. H., Transposon-encoded CRISPR-Cas systems direct RNA-guided DNA integration. Nature 571, 219–225 (2019).
- 4. Strecker J., et al., RNA-guided DNA insertion with CRISPR-associated transposases. Science 365, 48–53 (2019).
- 5. Saito M., et al., Dual modes of CRISPR-associated transposon homing. Cell 184, 2441–2453.e18 (2021).
- 6. Petassi M. T., Hsieh S. C., Peters J. E., Guide RNA categorization enables target site choice in Tn7-CRISPR-Cas transposons. Cell 183, 1757–1771.e18 (2020).
- 7. Peters J. E., Tn7. Microbiol. Spectr. 2, MDNA3-0010-2014 (2014).
- 8. Peters J. E., Targeted transposition with Tn7 elements: Safe sites, mobile plasmids, CRISPR/Cas and beyond. Mol. Microbiol. 112, 1635–1644 (2019).
- 9. Rybarski J. R., Hu K., Hill A. M., Wilke C. O., Finkelstein I. J., Metagenomic discovery of CRISPR-associated transposons. Proc. Natl. Acad. Sci. U.S.A. 118, e2112279118 (2021).
- 10. Xiao R., et al., Structural basis of target DNA recognition by CRISPR-Cas12k for RNA-guided DNA transposition. Mol. Cell 81, 4457–4466.e5 (2021).
- 11. Querques I., Schmitz M., Oberli S., Chanez C., Jinek M., Target site selection and remodelling by type V CRISPR-transposon systems. Nature 599, 497–502 (2021).
- 12. Park J. U., et al., Structural basis for target site selection in RNA-guided DNA transposition systems. Science 373, 768–774 (2021).
- 13. Shapiro J. A., Molecular model for the transposition and replication of bacteriophage Mu and other transposable elements. Proc. Natl. Acad. Sci. U.S.A. 76, 1933–1937 (1979).
- 14. Vo P. L. H., Acree C., Smith M. L., Sternberg S. H., Unbiased profiling of CRISPR RNA-guided transposition products by long-read sequencing. Mob. DNA 12, 13 (2021).
- 15. Rice P. A., Craig N. L., Dyda F., Comment on “RNA-guided DNA insertion with CRISPR-associated transposases.” Science 368, eabb2022 (2020).
- 16. Bainton R., Gamas P., Craig N. L., Tn7 transposition in vitro proceeds through an excised transposon intermediate generated by staggered breaks in DNA. Cell 65, 805–816 (1991).
- 17. May E. W., Craig N. L., Switching from cut-and-paste to replicative Tn7 transposition. Science 272, 401–404 (1996).
- 18. Sarnovsky R. J., May E. W., Craig N. L., The Tn7 transposase is a heteromeric complex in which DNA breakage and joining activities are distributed between different gene products. EMBO J. 15, 6348–6361 (1996).
- 19. Choi K. Y., Li Y., Sarnovsky R., Craig N. L., Direct interaction between the TnsA and TnsB subunits controls the heteromeric Tn7 transposase. Proc. Natl. Acad. Sci. U.S.A. 110, E2038–E2045 (2013).
- 20. Lu F., Craig N. L., Isolation and characterization of Tn7 transposase gain-of-function mutants: A model for transposase activation. EMBO J. 19, 3446–3457 (2000).
- 21. Jumper J., et al., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
- 22. Montaño S. P., Pigli Y. Z., Rice P. A., The Mu transpososome structure sheds light on DDE recombinase evolution. Nature 491, 413–417 (2012).
- 23. Skelding Z., Queen-Baker J., Craig N. L., Alternative interactions between the Tn7 transposase and the Tn7 target DNA binding protein regulate target immunity and transposition. EMBO J. 22, 5904–5917 (2003).
- 24. Wu Z., Chaconas G., Characterization of a region in phage Mu transposase that is involved in interaction with the Mu B protein. J. Biol. Chem. 269, 28829–28833 (1994).
- 25. Greene E. C., Mizuuchi K., Visualizing the assembly and disassembly mechanisms of the MuB transposition targeting complex. J. Biol. Chem. 279, 16736–16743 (2004).
- 26. Greene E. C., Mizuuchi K., Target immunity during Mu DNA transposition. Transpososome assembly and DNA looping enhance MuA-mediated disassembly of the MuB target complex. Mol. Cell 10, 1367–1378 (2002).
- 27. Mizuno N., et al., MuB is an AAA+ ATPase that forms helical filaments to control target selection for DNA transposition. Proc. Natl. Acad. Sci. U.S.A. 110, E2441–E2450 (2013).
- 28. Coros C. J., Sekino Y., Baker T. A., Chaconas G., Effect of mutations in the C-terminal domain of Mu B on DNA binding and interactions with Mu A transposase. J. Biol. Chem. 278, 31210–31217 (2003).
- 29. Arinkin V., Smyshlyaev G., Barabas O., Jump ahead with a twist: DNA acrobatics drive transposition forward. Curr. Opin. Struct. Biol. 59, 168–177 (2019).
- 30. Fuller J. R., Rice P. A., Target DNA bending by the Mu transpososome promotes careful transposition and prevents its reversal. eLife 6, e21777 (2017).
- 31. Arciszewska L. K., Drake D., Craig N. L., Transposon Tn7. cis-acting sequences in transposition and transposition immunity. J. Mol. Biol. 207, 35–52 (1989).
- 32. Goldhaber-Gordon I., Early M. H., Baker T. A., MuA transposase separates DNA sequence recognition from catalysis. Biochemistry 42, 14633–14642 (2003).
- 33. Ling L., Montaño S. P., Sauer R. T., Rice P. A., Baker T. A., Deciphering the roles of multicomponent recognition signals by the AAA+ unfoldase ClpX. J. Mol. Biol. 427, 2966–2982 (2015).
- 34. Vo P. L. H., et al., CRISPR RNA-guided integrases for high-efficiency, multiplexed bacterial genome engineering. Nat. Biotechnol. 39, 480–489 (2021).
- 35. Ru H., et al., Molecular mechanism of V(D)J recombination from synaptic RAG1-RAG2 complex structures. Cell 163, 1138–1152 (2015).
- 36. Herzik M. A. Jr., Setting up parallel illumination on the Talos Arctica for high-resolution data collection. Methods Mol. Biol. 2215, 125–144 (2021).
- 37. Mastronarde D. N., Automated electron microscope tomography using robust prediction of specimen movements. J. Struct. Biol. 152, 36–51 (2005).
- 38. Tegunov D., Cramer P., Real-time cryo-electron microscopy data preprocessing with Warp. Nat. Methods 16, 1146–1152 (2019).
- 39. Punjani A., Rubinstein J. L., Fleet D. J., Brubaker M. A., cryoSPARC: Algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
- 40. Scheres S. H., RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
- 41. Scheres S. H., Processing of structurally heterogeneous cryo-EM data in RELION. Methods Enzymol. 579, 125–157 (2016).
- 42. Zivanov J., et al., New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife 7, e42166 (2018).
- 43. Zivanov J., Nakane T., Scheres S. H. W., A Bayesian approach to beam-induced motion correction in cryo-EM single-particle analysis. IUCrJ 6, 5–17 (2019).
- 44. Pettersen E. F., et al., UCSF Chimera—A visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
- 45. Emsley P., Lohkamp B., Scott W. G., Cowtan K., Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
- 46. Leaver-Fay A., et al., ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 487, 545–574 (2011).
- 47. Afonine P. V., et al., New tools for the analysis and validation of cryo-EM maps and atomic models. Acta Crystallogr. D Struct. Biol. 74, 814–840 (2018).