KEYWORDS: CRISPR-Cas, Cas4, spacer acquisition, Sulfolobus, virus
ABSTRACT
Clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems incorporate short DNA fragments from invasive genetic elements into host CRISPR arrays in order to generate host immunity. Recently, we demonstrated that the Csa3a regulator protein triggers CCN protospacer-adjacent motif (PAM)-dependent CRISPR spacer acquisition in the subtype I-A CRISPR-Cas system of Sulfolobus islandicus. However, the mechanisms underlying specific protospacer selection and spacer insertion remained unclear. Here, we demonstrate that two Cas4 family proteins (Cas4 and Csa1) have essential roles (i) in recognizing the 5′ PAM and 3′ nucleotide motif of protospacers and (ii) in determining both the spacer length and its orientation. Furthermore, we identify amino acid residues of the Cas4 proteins that facilitate these functions. Overexpression of the Cas4 and Csa1 proteins, and also that of an archaeal virus-encoded Cas4 protein, resulted in strongly reduced adaptation efficiency, and the former proteins yielded a high incidence of PAM-dependent atypical spacer integration or of PAM-independent spacer integration. We further demonstrated that in plasmid challenge experiments, overexpressed Cas4-mediated defective spacer acquisition in turn potentially enabled targeted DNA to escape subtype I-A CRISPR-Cas interference. In summary, these results define the specific involvement of diverse Cas4 proteins in in vivo CRISPR spacer acquisition. Furthermore, we provide support for an anti-CRISPR role for virus-encoded Cas4 proteins that involves compromising CRISPR-Cas interference activity by hindering spacer acquisition.
IMPORTANCE The Cas4 family endonuclease is an essential component of the adaptation module in many variants of CRISPR-Cas adaptive immunity systems. The Crenarchaeota Sulfolobus islandicus REY15A carries two cas4 genes (cas4 and csa1) linked to the CRISPR arrays. Here, we demonstrate that Cas4 and Csa1 are essential to CRISPR spacer acquisition in this organism. Both proteins specify the upstream and downstream conserved nucleotide motifs of the protospacers and define the spacer length and orientation in the acquisition process. Conserved amino acid residues, in addition to those recently reported, were identified to be important for these functions. More importantly, overexpression of the Sulfolobus viral Cas4 abolished spacer acquisition, providing support for an anti-CRISPR role for virus-encoded Cas4 proteins that inhibit spacer acquisition.
INTRODUCTION
Clustered regularly interspaced short palindromic repeats (CRISPRs) and CRISPR-associated (Cas) proteins generate a diversity of immune systems in most archaea and many bacteria that target invasive viruses and plasmids (1, 2). These diverse systems have been classified into two major classes and at least 6 basic types (types I to VI), which are further divided into multiple subtypes (3). Spacer acquisition into CRISPR arrays constitutes the first stage of the immune reaction (4); however, the molecular mechanisms involved in this process are still only partially understood. Short DNA fragments of similar length are excised from the invasive genetic element and integrated into the CRISPR loci, facilitated by the conserved core proteins Cas1 and Cas2 (1). The first successful demonstration of spacer acquisition under laboratory conditions was obtained for the Streptococcus thermophilus subtype II-A system (5), and more recent studies have focused mainly on type I systems (6), including the subtype I-A systems of Sulfolobus solfataricus and Sulfolobus islandicus (7–9), the subtype I-B system of Haloarcula hispanica (10), the subtype I-E system of Escherichia coli (11–13), and the subtype I-F systems of Pseudomonas aeruginosa (14) and Pectobacterium atrosepticum (15).
Spacer acquisition occurs by two related but differing mechanisms, the de novo acquisition and primed acquisition pathways. The former requires Cas1 and Cas2 in the subtype I-E system of E. coli (12, 16–18), Cas1, Cas2, Csa1, and Cas4 in the subtype I-A system of S. islandicus (19), and Cas1, Cas2, Cas9, and Csn2 in S. thermophilus subtype II-A systems (20, 21). In contrast, primed acquisition involves activation by a preexisting spacer in a CRISPR locus that matches the targeted genetic element, and this then triggers spacer acquisition for different type I CRISPR-Cas systems (15–17, 22, 23).
Protospacer-adjacent motifs (PAMs) in the DNA of invasive genetic elements determine the acquisition efficiency and integration specificity during the spacer acquisition process. These motifs have been shown to direct spacer acquisition for several archaea and bacteria (7, 9, 10, 12, 13). Moreover, the protospacer 3′-terminal motif is important for acquisition efficiency in the subtype I-E system of E. coli (13). Different subtypes acquire spacers of different lengths from invasive genetic elements. For example, most spacers (95%) in subtype I-E CRISPR arrays of E. coli are 32 bp (24), which is likely to reflect the structure of the substrate-binding Cas1-Cas2 complex (25). However, for other systems, including those of subtypes I-A and II-A (26, 27), spacer lengths vary, and moreover, they require proteins in addition to Cas1 and Cas2 for specific adaptation (19–21). Most of these additional proteins belong to the Cas4 family and carry nuclease activities. Recently, we showed that two Cas4 family proteins, Cas4 and Csa1, are essential for CRISPR adaptation in S. islandicus (19), and more recent studies have shown that the Cas4 nucleases can facilitate PAM recognition and influence spacer lengths and adaptation efficiency (28–30). In particular, the two Cas4 proteins (Cas4-1 and Cas4-2) in the Pyrococcus furiosus subtype I-A system have been observed to have distinct activities: Cas4-1 specifies the upstream PAM, while Cas4-2 specifies the conserved downstream motif (30).
S. islandicus REY15A has proven to be a successful model organism for studying crenarchaeal CRISPR-Cas systems. It carries one subtype I-A and two functionally different subtype III-B interference modules that share an adaptation module encoded adjacent to the subtype I-A interference module (31), and, importantly, a comprehensive array of genetic tools has been developed for this strain (26, 32). Recently, we demonstrated that Csa3a, the activator protein of the subtype I-A CRISPR-Cas immune response in S. islandicus, couples transcriptional activation of spacer acquisition and DNA repair genes and, moreover, enhances CRISPR-Cas interference of invading genetic elements by activating transcription of CRISPR arrays (9, 19). We also demonstrated that the Cas4 family protein genes cas4 and csa1 were essential for spacer acquisition (19). However, the mechanisms involved require further study, including determining the essential amino acid residues of the different Cas4 proteins. Moreover, the functions of virus-encoded Cas4 family proteins remain to be determined.
In the present study, we focused on the functions of the Cas4 family proteins Cas4 and Csa1 in CRISPR spacer acquisition in S. islandicus. We have found new in vivo characteristics of Sulfolobus Cas4 compared with previous in vitro findings on Sulfolobus Cas4 and with Cas4 proteins of other organisms. Moreover, we provide evidence that Sulfolobus virus-encoded Cas4 proteins may play an important role in reducing the efficiency of host CRISPR-Cas interference.
RESULTS
Cas4 family proteins are divided into different subtypes.
Many CRISPR-Cas adaptation systems require proteins in addition to Cas1 and Cas2 and, in particular, Cas4 family proteins that are encoded widely in archaeal and bacterial genomes and in their viruses. A phylogenetic tree derived from alignments of 28 Cas4 family proteins, including 21 genome-encoded Cas4 proteins from six archaeal genomes and seven archaeal viral Cas4-like proteins, is presented in Fig. 1. Here, Cas4_Sso1392 does not carry the four conserved cysteine residues (4-C motif) which form an Fe-S cluster. The sequences group into 4 subtypes, i.e., the CRISPR-related Cas4, CRISPR-related Csa1, solo Cas4, and viral Cas4-like proteins, where the genes of solo Cas4 and viral Cas4 proteins are not linked to CRISPR loci (Fig. 1).
Sulfolobus species generally encode multiple Cas4 proteins (Fig. 1). For example, in the crenarchaeon S. islandicus REY15A, the cas4 and csa1 genes neighbor CRISPR arrays, while two other cas4 genes are not linked to CRISPR arrays (Fig. 1). The euryarchaeon Pyrococcus furiosus JFW02 carries one CRISPR-linked cas4-1 gene and a second solo cas4-2 gene (Fig. 1). However, although the Cas4-1 (Pfu 1119)-encoding gene adjoins a CRISPR array, it diverges strongly from the Sulfolobus CRISPR-linked Csa1 or Cas4 (Fig. 1). Similarly, P. furiosus Cas4-2 (Pfu 1994) falls between solo Cas4 and viral Cas4-like proteins (Fig. 1). Several Sulfolobus rudiviruses and fuselloviruses encode Cas4 proteins, and their genes cluster in the phylogenetic tree (Fig. 1). This raises two questions: (i) what are the functions of CRISPR-linked Cas4 and Csa1 and the solo Cas4 proteins in the Sulfolobus genome, and (ii) what effects, if any, do the viral Cas4 proteins have on the CRISPR-Cas response of the infected host?
Cas4 proteins define the spacer origin and recognize the PAM and the 3′ conserved DNA motif.
S. islandicus REY15A encodes one spacer acquisition module, constituting Cas1, Cas2, Cas4, and Csa1 (Fig. 2A). The downstream gene, csa3a, encodes a transcriptional regulator that activates expression of the adaptation module (9). This, in turn, triggers CCN PAM-dependent de novo spacer acquisition (9, 19). Earlier, we demonstrated that deletion of the cas4 and csa1 genes eliminated the capacity for CRISPR spacer acquisition as judged by the inability of the deletion strain to produce larger PCR products in the CRISPR leader-proximal region (19).
In this study, we examined the effects of deletion and overexpression of Cas4 and Csa1 on spacer acquisition by high-throughput sequencing and analysis of newly acquired spacers. The high-throughput data are summarized in Table S1 in the supplemental material. Consistent with the earlier work (9, 19), the high-throughput sequencing data revealed that the control strain, carrying only the empty vector pSeSD, yielded relatively few new spacers, while many were observed in wild-type (wt) cells with csa3a overexpression (Fig. 2B). Moreover, in both strains, most new spacers were derived from the expression vector and few were from host genome DNA (Fig. 2B). Next, we examined the effects of Cas4 and Csa1 on spacer acquisition by testing cas4 and csa1 deletion strains (Fig. 2B). Very few new spacers were detected for either deletion strain, which indicated that Cas4 and Csa1 were essential for specific spacer acquisition. In addition, we demonstrated that enhancing expression of Cas4, and also Csa1, strongly reduced spacer acquisition (Fig. 2B), possibly because excess Cas4 or Csa1 forms large protein complexes, such as the decameric toroid reported previously (33), that disorder the adaptation complex. Furthermore, when we examined the origins of the spacers, we found that the relatively few spacers that were acquired in the cas4 and csa1 deletion or overexpression mutants showed a stronger bias to genomic DNA than to plasmid DNA (Fig. 2B). This observation reinforced the conclusion that the Cas4 proteins were crucial for specific selection of spacers.
Since CRISPR-Cas interference requires PAM recognition, we analyzed the PAM and 3′-nucleotide motif of the cognate protospacers. New spacers in the wt strain carrying the empty vector were derived predominantly from protospacers carrying conserved CCN PAM and 3′-A/G motifs (Fig. 2B and C). Moreover, protospacers in the csa3a overexpression strain also predominantly carried CCN PAM and the 3′-A/G motifs (Fig. 2B and C), consistent with our earlier results (9). In contrast, the CCN PAM was absent from protospacers of both the deletion and overexpression strains of cas4 and csa1 (Fig. 2B and C); the occurrence of the 3′-A/G motif at the +2 site was also reduced in these strains (Fig. 2B and C). These results indicated that proteins Cas4 and Csa1 play a role in locating the PAM and 3′-motif during spacer selection. Thus, the absence of the PAM in these strains will prevent CRISPR-Cas interference from the defective spacers. Potentially, new spacers can accumulate from any DNA site in the four strains, and this will produce a strong bias to host genomic DNA because of the much higher DNA yields relative to those of the low-copy-number csa3a overexpression plasmid (Fig. 2B). It should be noted that few new spacer reads (∼100 reads) were identified in the csa1 or cas4 overexpression strains (Fig. 2B). PAM sequences from these few reads showed less conservation. However, new spacers from csa1 or cas4 overexpression strains were matched to different locations on the genomic and plasmid DNAs, and the ratios between unique new spacer reads and total new spacer reads were 0.60 and 0.23, respectively, suggesting that identification of non-PAMs in these strains may be significant.
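The PAM tally and the unique-to-total read ratio described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study; the function names and the input format (a list of 3-nt upstream flanks and a list of spacer read sequences) are assumptions made for illustration only.

```python
def has_ccn_pam(upstream: str) -> bool:
    """Return True if the 3 nt immediately 5' of a protospacer read CCN.

    `upstream` is assumed to be written 5'->3' on the protospacer strand.
    """
    flank = upstream.upper()
    return len(flank) == 3 and flank.startswith("CC") and flank[2] in "ACGT"


def ccn_fraction(upstream_flanks):
    """Fraction of protospacers whose upstream flank is a CCN PAM."""
    flanks = list(upstream_flanks)
    if not flanks:
        return 0.0
    return sum(has_ccn_pam(f) for f in flanks) / len(flanks)


def unique_ratio(spacer_reads):
    """Ratio of unique new spacer sequences to total new spacer reads."""
    reads = list(spacer_reads)
    if not reads:
        return 0.0
    return len(set(reads)) / len(reads)
```

A ratio close to 1 means that most reads represent independent acquisition events, whereas a low ratio means that a few spacers dominate the reads, which is why the ratio matters when only about 100 reads are available.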
Novel amino acid residues essential for Cas4 protein functions.
Cas4 family proteins carry a few highly conserved amino acid residues (34) (Fig. 3A), and we examined whether they were involved in specific protospacer selection. First, we analyzed the amino acid residues in the RecB motifs of the 28 archaeal genome-encoded or archaeal virus-encoded Cas4 proteins (Fig. 3B). We identified a conserved Asp residue in S. islandicus Cas4 (D60 in SiRe_0763) and the CRISPR-linked Csa1 proteins and two conserved Tyr/Leu residues in the CRISPR-linked Cas4 proteins (YL100/101 in SiRe_0763) (Fig. 3B). Therefore, we mutated these amino acids separately with a view to establishing whether they were essential for specific spacer acquisition.
The cas4 gene and the mutant genes cas4D60A, cas4E79A, and cas4Y100A/L101A were cloned downstream of the csa3a gene on plasmid pSeSD, and expression of csa3a, cas4, and the cas4 mutants was under the control of the araS promoter (35). These plasmids were then transformed into the cas4 deletion cells to yield complementation strains. Whereas transformation of p(csa3a+cas4) reactivated highly efficient spacer acquisition, albeit with a strong bias to genomic DNA (Fig. 3C), transformation with p(csa3a+cas4D60A) generated a low level of specific spacer acquisition, again with a bias to genomic DNA, while transformation with the two other p(csa3a+cas4 mutant) plasmids failed to restore specific spacer acquisition (Fig. 3D). We infer from these results that Cas4 positions E79 and Y100/L101 are critical for selection of the CCN PAM in protospacers.
Cas4 facilitates specific spacer integration.
As described above, the CCN PAM was not detected in most protospacers from the cas4 and csa1 deletion or overexpression strains or in those deriving from genomic DNA (Fig. 2B and 3C and D). These results demonstrate that at suboptimal levels of Cas4, nonspecific spacer integration occurred. This result provided an explanation for the majority of spacers being derived from the much more prevalent host genomic DNA. Complementing cas4 in the cas4 deletion mutant strain that overexpressed csa3a restored the selection of protospacers with conserved 5′-CCN PAMs (Fig. 4). Complementing the cas4 deletion mutant with cas4D60A, cas4E79A, and cas4Y100A/L101A completely or partially restored selection of plasmid DNA protospacers carrying CCN PAMs (Fig. 4A) but not those derived from genomic DNA (Fig. 4B). However, CCN PAM-dependent inverted integration and slipped integration (atypical CCN) accounted for a large proportion of the spacers in the cas4D60A complementation strain (Fig. 4B), suggesting that cells with these genomic protospacers survived CRISPR interference guided by the flipped and slipped integrated spacers. These results indicate that amino acid residues D60, E79, and YL100/101 are important for selection of specific protospacers in Sulfolobus. In particular, complementing with cas4E79A caused half of the genomic protospacers to have no CCN PAM at either end (Fig. 4B), suggesting that E79 is extremely important for CCN PAM selection.
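The categories used above (typical, flipped, slipped, and PAM-independent integration) can be made concrete with a short Python sketch. This is an illustrative classifier, not the analysis code of this study. The window sizes, the order in which categories are tested, and the sign convention for the slip offset are all assumptions: a negative shift means that the spacer start moved upstream into the PAM, so that the 3-nt flank reads NCC.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def classify_integration(upstream5: str, spacer: str, downstream3: str) -> str:
    """Classify one spacer from the flanks of its protospacer.

    upstream5   -- the 5 nt immediately 5' of the protospacer (hypothetical window)
    spacer      -- the spacer sequence, in the orientation found in the array
    downstream3 -- the 3 nt immediately 3' of the protospacer
    """
    if len(upstream5) != 5 or len(spacer) < 2 or len(downstream3) != 3:
        raise ValueError("expected 5-nt upstream and 3-nt downstream flanks")
    # Positions -5..-1 followed by +1..+2; a canonical CCN PAM puts CC at -3/-2.
    context = (upstream5 + spacer[:2]).upper()
    if context[2:4] == "CC":
        return "typical"
    # A CCN PAM on the opposite strand appears as NGG downstream of the protospacer.
    if revcomp(downstream3).startswith("CC"):
        return "flipped"
    # Look for the CC dinucleotide displaced by 1 or 2 nt from its canonical site.
    for shift in (-1, 1, -2, 2):
        start = 2 - shift
        if context[start:start + 2] == "CC":
            return f"slipped({shift:+d})"
    return "none"
```

Under these assumptions, `classify_integration("AAACC", "GATTACA", "ATC")` returns `"slipped(-1)"` and a protospacer followed by TGG is reported as `"flipped"`.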
Cas4 nuclease defines spacer length.
Spacer lengths vary in Sulfolobus species, although they fall mainly in the range of 39 to 41 bp (26). Most spacers acquired in wild-type S. islandicus strains carrying pSeSD or csa3a overexpression vectors were 40 to 41 bp (Fig. 5), similar to the lengths of spacers present in the host CRISPR arrays. However, deletion of cas4 or csa1 not only strongly reduced spacer acquisition efficiency, as measured by the n value (Fig. 2B and 3C), but also yielded, on average, slightly shorter spacer lengths (Fig. 5). Overexpression of csa1 and cas4 also produced altered length distributions (Fig. 5). Complementing cas4 in the cas4 deletion strain did restore spacer lengths to the normal host size distribution, but whereas complementing with cas4Y100A/L101A did not alter average spacer lengths, complementing with cas4D60A and cas4E79A produced increased length distributions (Fig. 5). These results further reinforce that Cas4 is important for specific spacer acquisition.
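A length distribution of the kind compared above can be tabulated with a few lines of Python. The sketch below is illustrative only. The function name and the default 39-to-41-bp window (taken from the range quoted for Sulfolobus spacers) are assumptions, and the input is simply a list of spacer sequences.

```python
from collections import Counter


def length_profile(spacers, lo=39, hi=41):
    """Return (length counts, fraction of spacers with lo <= length <= hi)."""
    lengths = [len(s) for s in spacers]
    counts = Counter(lengths)
    if not lengths:
        return counts, 0.0
    in_range = sum(1 for n in lengths if lo <= n <= hi)
    return counts, in_range / len(lengths)
```

Comparing the returned fraction between strains gives a simple numerical summary of whether a mutation shifts spacers out of the normal host size range.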
Flipped and slipped spacers inactivate DNA interference of the subtype I-A system.
In order to demonstrate that flipped spacers abolish subtype I-A CRISPR-Cas interference activity, we designed a plasmid challenge experiment (illustrated in Fig. 6A). The DNA sequence of spacer 10 from CRISPR locus 2 of S. islandicus REY15A was cloned into the Sulfolobus-E. coli shuttle vector pSeSD under the control of the araS promoter (35) with different 5′ or 3′ motifs to either activate or escape from subtype I-A and III-B DNA interference activities (Fig. 6A). These challenging plasmids and the control expression vector pSeSD were transformed into S. islandicus E233S, and transformation efficiencies were estimated. If the plasmids were targeted by the subtype I-A and III-B interference complexes, guided by CRISPR RNA (crRNA) from spacer 10, the transformation efficiency would be strongly reduced.
The plasmid carrying the protospacer matching spacer 10 with a 5′-CCA PAM showed a very low transformation efficiency compared with that of the control plasmid, consistent with strong subtype I-A DNA interference having occurred (Fig. 6B). Another control plasmid carrying the same protospacer with a 5′-AAG motif showed a relatively low transformation efficiency compared with that of the pSeSD plasmid but a much higher transformation efficiency than that of the plasmid carrying the CCA motif (Fig. 6B). This indicated that the 5′-AAG motif inhibited subtype I-A interference. The third plasmid carrying the spacer with protospacer 5′-AGG and 3′-TGG sequences mimics the protospacer of the flipped spacers inserted into the CRISPR array (Fig. 6A). High “flip” plasmid transformation efficiency, similar to that of the AAG plasmid (Fig. 6B), would indicate that flipped spacers had integrated into CRISPR arrays and were inactive in subtype I-A.
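The relationship between a correctly oriented protospacer and its flipped counterpart can be checked with a small Python sketch. When a spacer is integrated in the inverted orientation, the crRNA effectively reads the opposite strand, so the two flanks swap sides and are reverse complemented. The 3′ flank used in the test below (CCT) is a hypothetical value chosen only so that the result reproduces the 5′-AGG and 3′-TGG pair mentioned above; it is not taken from the sequence of spacer 10.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flipped_flanks(five_prime: str, three_prime: str):
    """Flanks seen by a flipped spacer, given the flanks of the original protospacer.

    Returns (new 5' flank, new 3' flank), each written 5'->3'.
    """
    return revcomp(three_prime), revcomp(five_prime)
```

The original 5′-CCA PAM therefore ends up as a 3′-TGG motif on the flipped protospacer, where it can no longer serve as a 5′ PAM for subtype I-A interference.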
Slipped spacer integration can also impede subtype I-A DNA interference because the crRNAs will fail to recognize the CCN PAM sequence. To test this experimentally, we selected a protospacer from the template strand of the S. islandicus REY15A cmr2α gene and generated different mutants carrying single-site mutations (Fig. 6C). The altered spacers were cloned into the Sulfolobus expression vector pSeSD (35) to generate plasmid-borne mini-CRISPRs under the control of the arabinose-inducible promoter (Fig. 6C) (32, 36). These challenging plasmids were then transformed into S. islandicus E233S cells, and transformation efficiencies were estimated. If the plasmid-borne spacer crRNA causes interference at cmr2α, then low transformation efficiency will be observed, and vice versa. Since the designed spacer is from the cmr2α template strand, the crRNA will not base pair with the mRNA and, therefore, will not trigger the subtype III-B DNA interference activity. The transformation efficiency of pSeSD carrying the wild-type protospacer was very low compared with that of the empty vector, consistent with strong subtype I-A interference of cmr2α (Fig. 6D). The plasmid carrying the Slip-1 spacer (corresponding to the protospacer with the NCC motif) showed a transformation efficiency that was lower than that of the empty vector control but higher than that of the wild-type spacer plasmid, indicating that moderate plasmid interference had occurred (Fig. 6D). This reflects that a 5′-ACC motif of the protospacer can mediate subtype I-A interference of the target DNA with reduced efficiency (Fig. 6D). No interference by the other Slip spacers was observed at cmr2α, except for very weak interference by the Slip+2 spacer (Fig. 6D). We conclude that the slipped spacers strongly reduced or eliminated subtype I-A CRISPR-Cas interference.
A virus-encoded Cas4 protein hinders CRISPR acquisition.
The cas4 genes are present in most known viruses of the Sulfolobales (37). Sulfolobus spindle-shaped viruses (SSV) encode Cas4 proteins that cluster with a group of Cas4 proteins associated with type I CRISPR-Cas systems of the Thermococcales, and a few Cas4 proteins encoded by Icelandic rudiviruses (e.g., Sulfolobus islandicus rod-shaped virus [SIRV] in Fig. 1) also form a potential branch in this family (37). However, even given the widespread occurrence of cas4 genes in viral genomes, the functions of viral Cas4 in CRISPR spacer acquisition or in virus-host interaction are still unknown. In this study, the cas4 gene of SSV Ragged Hills (SSVRH) was cloned into the csa3a overexpression plasmid and transformed into S. islandicus in order to study its effect on CRISPR spacer acquisition. The positive-control strain overexpressing csa3a (wtOEcsa3a) acquired new spacers, as evidenced by the expanded PCR bands deriving from the leader-proximal CRISPR region (Fig. 7A). In contrast, two strains from single colonies overexpressing both csa3a and SSVRH cas4 failed to yield detectable new spacers (Fig. 7A). PCR products from the leader-proximal CRISPR regions of each of the above-mentioned strains were subsequently sequenced, and the results demonstrated that expression of SSVRH Cas4 produced strongly reduced CRISPR spacer acquisition efficiency (<1%) (Fig. 7B). We analyzed the 5′ and 3′ ends of the corresponding protospacers from the viral cas4 expression strains and found that whereas most protospacers of plasmid origin carried a conserved CCN PAM (70.7%), those of genome origin exhibited a lower level of 5′-CCN PAM (29.3%) (Fig. 7B). This result suggested that the new spacers were activating CRISPR subtype I-A interference and that surviving cells carried protospacers lacking PAMs. In addition, overexpression of SSVRH cas4 led to reduced conservation of the protospacer 3′-A/G motif (Fig. 7B). 
The length distribution of the adapted spacers from both viral cas4 expression strains was similar to that for the wt cells (Fig. 7C), indicating that viral Cas4 had no effect on spacer length. In summary, expression of SSVRH Cas4 protein strongly reduced spacer acquisition efficiency and reduced the conservation of the 3′-A/G motif (Fig. 7A and B) but did not lead to changes in the 5′-CCN PAM or the spacer length (Fig. 7B and C).
DISCUSSION
Conserved amino acid residues define Cas4 functions.
Cas4 family proteins occur widely within CRISPR-Cas adaptation modules of subtype I-A, I-B, I-C, I-D, I-U, and II-B and type V systems (2), and they include Csa1 of Sulfolobus subtype I-A systems and Csn2 of the subtype II-A systems of S. thermophilus (20) and Streptococcus pyogenes (21). High-resolution crystal structures of Cas4 proteins from S. solfataricus (33) and Pyrobaculum calidifontis (38) indicate that they carry two domains, an N-terminal RecB-like nuclease domain and a C-terminal domain containing an Fe-S cluster coordinated by four conserved cysteine residues. Moreover, biochemical studies have shown that Cas4 can exhibit 5′-to-3′ and 3′-to-5′ DNA exonuclease activity, as well as ATP-independent DNA unwinding activity (33, 34, 38), which suggests that these proteins produce single-strand DNA overhangs that are potential intermediates for insertion of new CRISPR spacers. Previously, we have demonstrated that both Cas4 and Csa1 proteins are essential for de novo spacer acquisition (19). Recently, an in vitro study has demonstrated that Cas4 of the Bacillus halodurans subtype I-C system interacts tightly with Cas1 integrase and processes double-stranded substrates with long 3′ overhangs through site-specific cleavage (29). Cas4 recognizes PAM sequences within the prespacers and prevents integration of unprocessed prespacers, ensuring correct integration (29). An in vivo study revealed that Cas4 of the subtype I-D system from the Synechocystis sp. strain 6803 pSYSA megaplasmid, in addition to Cas1 and Cas2, facilitates recognition of PAMs and defines the spacer length in a heterologous host, E. coli (28). Although one conserved amino acid residue of Cas4 protein from the Synechocystis subtype I-D system has been identified to define the PAM and spacer length (28), more work is required to study the functions of Cas4 proteins in their native hosts.
Most recently, the Cas4 functions of the P. furiosus subtype I-A system have been studied in vivo (30). The P. furiosus JFW02 genome carries two Cas4 genes: cas4-1 is linked with the CRISPR array, and cas4-2 is solo. Shiimori et al. observed distinct activities in the two Cas4 proteins, especially in recognition of the upstream PAM and downstream conserved DNA motif, respectively, in P. furiosus (30), while in our study, we found that both Csa1 and Cas4 were crucial for specifying the upstream PAM sequence in Sulfolobus (Fig. 2B). Shiimori et al. have also identified one conserved amino acid residue of the RecB motif involved in nuclease activity in both Cas4 proteins for defining the spacer length and orientation in P. furiosus (30). We experimentally identified three conserved amino acid residue sites of the RecB motif that were important for Cas4 function, in addition to that found by Shiimori and coworkers: residue D60, conserved in CRISPR-linked Csa1 proteins; Y100L101, conserved in CRISPR-linked Cas4 proteins; and E79, conserved in all Cas4 family proteins (Fig. 3B). In Fig. 8 we summarize the experimentally studied conserved amino acid residues in the RecB motif of Cas4 family proteins involved in different functions in CRISPR spacer acquisition. D60, identified in S. islandicus Cas4, is essential for spacer acquisition and defines spacer length, but it is not responsible for defining the 5′-CCN PAM and 3′ nucleotide motif (Fig. 8B). D70, corresponding to S. solfataricus Sso0001, is essential to its in vitro nuclease activity (34), and D70 corresponds to the residue in Cas4-1 or Cas4-2 of P. furiosus and determines acquisition efficiency, spacer length, and integration orientation and recognizes the 5′ or 3′ protospacer motifs in vivo (30) (Fig. 8B). E79, identified in Sulfolobus Cas4, influences acquisition efficiency and is important for determining spacer length, integration orientation, and the 5′ and 3′ motifs of protospacers (Fig. 8B). 
However, H91, corresponding to a residue in Cas4-1 or Cas4-2 of P. furiosus, is unimportant for Cas4 functions (30) (Fig. 8B). YL100/101, identified in this study, strongly affects Cas4 functions (Fig. 3D), probably because these amino acid residues affect the nuclease activity or because the conserved hydrophobic residue L101 is important for Cas4 structural integrity. Similarly, the 4-C cluster of both Cas4-1 and Cas4-2 of P. furiosus is essential for Cas4 functions (30), consistent with published findings that the Fe-S cluster is required for the structural integrity of Cas4 proteins (33).
Csa1 belongs to the Cas4 family proteins, which are specific for subtype I-A CRISPR-Cas systems (2). In this study, we report its functions in defining the PAM, the integration orientation, and the spacer length (Fig. 2, 4, and 5). Even though Csa1 proteins carry the same conserved amino acid residues as Cas4 proteins do, a long insert is found in the Csa1 amino acid sequence, suggesting that Csa1 proteins may play some different roles.
A putative anti-CRISPR-Cas mechanism evolved by viruses.
In this study, we show that overexpression of Cas4 or Csa1 can trigger a strong reduction in CRISPR-Cas spacer acquisition efficiency (Fig. 2B) and induce a larger fraction of new spacers to be integrated in a flipped or slipped manner or to be derived from protospacers with no conserved 5′-CCN PAM (Fig. 4). Overexpression of the host-encoded Cas4 reduces spacer acquisition efficiency and causes defective acquisition. We quantified the transcription levels of the cas4, csa1, and cas4 mutant genes in the studied strains. As shown in Table S2 in the supplemental material, the transcription level of the cas4 gene from the wild-type cells overexpressing the csa3a and cas4 genes [wtOE(csa3a+cas4)] was much higher (22.7-fold) than that from the wild-type cells overexpressing only the csa3a gene (wtOEcsa3a), and much lower spacer acquisition efficiency was observed in the wtOE(csa3a+cas4) strain. However, the transcription level of the cas4 gene in the cas4 gene deletion strain overexpressing the csa3a and cas4 genes [Δcas4OE(csa3a+cas4)] was higher than that in the wtOEcsa3a strain (137.8-fold) and also higher than that in the wtOE(csa3a+cas4) strain (6.1-fold). The higher expression level of cas4 in the Δcas4OE(csa3a+cas4) strain than in the wtOE(csa3a+cas4) strain is probably because deletion of the chromosomally encoded cas4 gene interrupted the cas4 antisense RNA (39). This antisense RNA against the chromosomally encoded cas4 transcript might mediate inhibition of cas4 translation and degradation of the cas4 transcripts. However, this raises a question about how overexpression of cas4 in the wt strain almost abolished spacer acquisition but overexpression of cas4 in the Δcas4 strain restored adaptation efficiency (Fig. 2B and 3C). In our previous work, the cas1 transcript level was 4-fold higher than the cas4 transcript level as seen from the transcriptome data for the wtOEcsa3a strain (19).
Therefore, in the wtOEcsa3a strain, Cas4 specifies the PAM on target DNA and probably binds one Cas1 subunit of the Cas1-Cas2 complex for specific integration (in the correct direction) at the leader-proximal site. This is supported by the new spacer data from high-throughput sequencing of the leader-proximal regions in the wtOEcsa3a strain, which show that most of the protospacers have a 5′-end-adjacent CCN PAM sequence (Fig. 2B and C). However, in the wtOE(csa3a+cas4) strain, Cas4 proteins at a higher concentration possibly bind all four Cas1 subunits of the Cas1-Cas2 complex. This saturated binding of Cas4 to the Cas1-Cas2 complex possibly affects the conformation of the adaptation complex, resulting in much lower acquisition efficiency (Fig. 2). Saturated binding of Cas4 to the Cas1-Cas2 complex also probably fails to specify the PAM and the correct direction of spacer acquisition (Fig. 4). In the Δcas4OE(csa3a+cas4) strain, a much higher expression level of Cas4 protein might make Cas4 form the decameric toroidal structure reported previously or otherwise self-assemble (33, 34). Thus, under this condition, less Cas4 binds to the Cas1-Cas2 complex, and the acquisition efficiency is restored (Fig. 3). However, confirming this hypothesis will require extensive biochemical and structural study of the Cas4-Cas1/Cas2 complex in combination with genetic studies.
Cas4-like nucleases are encoded by archaeal viruses (40, 41), phages (42), and transposable elements (43), and these genetic elements may use their Cas4 proteins to interact with the host-encoded adaptation complex. Moreover, a Cas4-like protein from the rudivirus SIRV2 has been shown to possess both 5′-to-3′ exonuclease and endonuclease activities in vitro (40, 44), suggesting that the viruses encode functional Cas4 nucleases that could participate in CRISPR spacer acquisition. Importantly, the expression of the rudivirus SIRV2-encoded Cas4-like protein continuously increased after infection of Sulfolobus cells (45). In this study, expression of SSVRH Cas4 strongly reduced spacer acquisition compared to that in cells lacking SSVRH Cas4 (Fig. 7A and B). High-throughput sequencing data also support the conclusion that viral Cas4 hinders de novo spacer acquisition in Sulfolobus (Fig. 7B). Analyzing the new spacers detected in the viral Cas4 expression strain, we found that SSVRH Cas4 has no effect on defining the 5′-CCN PAM or spacer length (Fig. 7B and C) but does affect the 3′-A/G motif (Fig. 7B). Together, these results suggest that some invasive genetic elements may have acquired Cas4 proteins to corrupt CRISPR-Cas spacer acquisition by strongly reducing the spacer acquisition efficiency. In turn, the mobile genetic elements may have evolved to escape from CRISPR-Cas interference by means of the Cas4-disordered adaptation module (the possible mechanism is discussed above), as also inferred from bioinformatic analyses (37).
Although expression of SSVRH Cas4 only reduced spacer acquisition efficiency and did not affect integration orientation or spacer length (Fig. 7), this cannot exclude the effects of viral Cas4 on atypical spacer acquisition. For example, it has been shown recently that a Campylobacter bacteriophage-encoded Cas4 homolog induces de novo spacer acquisition exclusively from host DNA (42). We extracted all the new spacers with fewer than 3 mismatches to genomic DNA from that study (42) and analyzed the 5′ and 3′ motifs of the cognate protospacers. However, no conserved PAM or 3′ motifs were detected on either end of the protospacers, indicating two possibilities: (i) the presence of a DNA interference module counterselects the functional genome-derived spacers with conserved PAMs and (ii) the viral Cas4 produces defective spacer acquisition that would prevent CRISPR-Cas targeting of the genomic DNA (data not shown). Given that only host DNA, and no viral DNA, is sampled into the CRISPR array, both possibilities could be reasonable.
Archaeal and bacterial cells have also evolved counteracting defense mechanisms against viral infection, including the production of antisense RNAs. For example, a high level of antisense transcripts encompassing cas2 and the upstream region of cas4 is detected upon STSV2 viral infection in S. islandicus. The antisense RNA yield increases significantly at around 7 days postinfection and then gradually decreases (39). This raises the possibility that host cells employ antisense RNAs to control the Cas4 expression level in order to minimize its influence on acquisition efficiency and the consequent inactivation of CRISPR-Cas interference against invasive genetic elements.
MATERIALS AND METHODS
Strains, growth, and transformation of Sulfolobus.
The S. islandicus strains employed, including the genetic host E233S (ΔpyrEF ΔlacS) (46), the E233S derivative cas deletion mutants, and the overexpression strains, were cultured in SCVy medium (0.2% sucrose, 0.2% [wt/vol] vitamin-free Casamino Acids [BD Difco vitamin assay], 0.005% yeast extract, and a mixed vitamin solution) at 78°C. S. islandicus cells were transformed by electroporation, and transformants were selected on two-layer phytal gel plates as described earlier (46). Transformation efficiencies of the interference and control plasmids were calculated as CFU per microgram of plasmid DNA. For experiments measuring new spacer uptake at the leader-repeat region of CRISPR loci, cells were grown in SCVy medium.
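The efficiency calculation described above (CFU per microgram of plasmid DNA) can be illustrated with a minimal sketch. The function name, the `plated_fraction` parameter, and the numbers in the usage note are hypothetical and are not taken from this study; they only show the arithmetic.

```python
def transformation_efficiency(colonies, dna_ng, plated_fraction=1.0):
    """Return transformation efficiency in CFU per microgram of plasmid DNA.

    colonies        -- colonies counted on the plate
    dna_ng          -- plasmid DNA used for electroporation, in nanograms
    plated_fraction -- fraction of the transformation mix that was plated
                       (hypothetical parameter; 1.0 means everything was plated)
    """
    if dna_ng <= 0:
        raise ValueError("dna_ng must be positive")
    if not 0 < plated_fraction <= 1:
        raise ValueError("plated_fraction must be in (0, 1]")
    dna_ug = dna_ng / 1000.0  # convert nanograms to micrograms
    # Scale the DNA amount by the plated fraction so that the colony
    # count is related only to the DNA that actually reached the plate.
    return colonies / (dna_ug * plated_fraction)
```

For instance, under these assumptions, `transformation_efficiency(120, 500, 0.1)` relates 120 colonies to the 0.05 µg of DNA that reached the plate. Comparing this value for an interference plasmid with that for a control plasmid gives a relative measure of CRISPR-Cas interference.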
Plasmid construction.
The csa1, cas4, cas4 mutant, and viral cas4 overexpression plasmids were constructed by cloning the corresponding genes into the pCsa3a plasmid (9) to form an operon with the csa3a gene. The csa1 gene (SiRe_0760), cas4 gene (SiRe_0763), and viral cas4 (p22) gene were amplified from S. islandicus REY15A or the fusellovirus SSVRH by PCR using FastPfu DNA polymerase (TransGene, Beijing, China) with the primer sets listed in Table S1 in the supplemental material and cloned into the csa3a overexpression plasmid pCsa3a (9), yielding the pCsa3a-Csa1, pCsa3a-Cas4, and pCsa3a-vCas4 plasmids, on which additional Shine-Dalgarno sequences were generated for translational initiation of Csa1 and the host and viral Cas4 proteins. For mutations in the cas4 gene, linear plasmids carrying the designed cas4 mutations were amplified from pCsa3a-Cas4 by PCR using FastPfu DNA polymerase. After purification, these linear plasmids were circularized by Gibson assembly before transformation into E. coli BL21 cells.
To construct interference plasmids for testing the flipped spacers, the S10 spacer of CRISPR locus 2 with different 5′-end PAM and 3′-end sequences (Fig. 5A) was cloned into the pSeSD vector. For testing the slipped spacers, a protospacer sequence from the cmr2α gene template strand and its “slipped” derivatives were cloned into the site between two repeat sequences to form a mini-CRISPR under the control of the araS promoter in pSeSD. Interference plasmids were electroporated into S. islandicus strain E233S, and transformation efficiencies were calculated. Primers used for constructing overexpression and interference plasmids are listed in Table S1 in the supplemental material.
PCR amplification of the integrated spacers.
Transformants were cultured in 10 ml SCVy medium at 78°C until the optical density at 600 nm (OD600) reached 0.3. Samples were taken from each culture (0.1 ml), and DNA was extracted from cells and employed as a PCR template. Leader-proximal regions of CRISPR locus 1 were amplified using Taq polymerase with forward and reverse primers CRISPR-F2 and CRISPR1S2-R (or CRISPR1S5-R). PCR products were separated by 1.5% agarose gel electrophoresis and visualized by ethidium bromide staining to identify the expanded bands. The PCR products were purified using a PCR cleanup kit and sent for high-throughput sequencing. Primers used to amplify the leader-proximal regions are listed in Table S1.
High-throughput sequencing and bioinformatics analysis.
Six single colonies from each transformant plate were picked and pooled in a single tube containing SCVy medium for growth. Genomic DNA was extracted from each pooled transformant culture and used for amplification of the leader-proximal region. The leader-proximal regions were also amplified from two samples of single colonies carrying a csa3a- and viral cas4-overexpressing plasmid. Equal amounts of the purified leader-proximal PCR products were subjected to HiSeq 3000 sequencing (National Key Laboratory of Crop Genetic Improvement). After paired-end data assembly and low-quality data filtration, reads containing multiple (≥2) repeats were selected. For reads containing two repeats, the intervening sequence was considered the original first spacer (S+1). For reads with three or four repeats, the leader-proximal spacers (S−2 and S−1) were considered the new spacers. The protospacer sequence for each spacer was identified using the BLASTN program against the pSeSD plasmid (35) or the S. islandicus REY15A genome sequence (31). The 3-bp sequence at the 5′ end of each protospacer was considered the PAM region. Perl scripts were run to analyze the protospacers. The conserved motifs of the 5′-end- and 3′-end-adjacent sequences of the protospacers were analyzed using WebLogo (47). The neighbor-joining tree was generated from a T-Coffee alignment of the proteins using Mega7 (48, 49), with pairwise distances between sequences uncorrected.
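The read-processing logic described above can be sketched as follows. This is an illustrative Python sketch, not the Perl scripts used in this study: it assumes exact repeat matches and reads oriented from the leader into the array, and it substitutes exact string search for BLASTN. All function names and the toy sequences in the usage note are hypothetical.

```python
def split_spacers(read, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    positions = []
    start = read.find(repeat)
    while start != -1:
        positions.append(start)
        start = read.find(repeat, start + len(repeat))
    return [read[a + len(repeat):b] for a, b in zip(positions, positions[1:])]


def new_spacers(read, repeat):
    """Return the leader-proximal (newly acquired) spacers of a read.

    A read with two repeats holds only the original first spacer, so the
    result is empty; with three or four repeats, every spacer except the
    last (leader-distal) one is treated as new.
    """
    return split_spacers(read, repeat)[:-1]


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_pam(spacer, reference, pam_len=3):
    """Locate the protospacer by exact match on either strand.

    Returns (strand, pam), where pam is the `pam_len` bases immediately
    5' of the protospacer on the matching strand, or None if no match
    with enough upstream sequence is found.
    """
    for strand, ref in (("+", reference), ("-", revcomp(reference))):
        i = ref.find(spacer)
        if i >= pam_len:
            return strand, ref[i - pam_len:i]
    return None
```

As a usage example with toy sequences, a read built as leader + repeat + `ACGTACGTTGCA` + repeat + old spacer + repeat yields one new spacer, and searching for it in the toy reference `TTTCCAACGTACGTTGCAGGTT` returns the plus strand with the upstream triplet `CCA`, which fits the 5′-CCN PAM. Collecting these triplets across all new spacers gives the input for a sequence logo.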
Accession number(s).
All high-throughput sequencing reads have been deposited in the SRA database under accession number PRJNA498890.
ACKNOWLEDGMENTS
We thank Roger A. Garrett and Qunxin She of the Danish Archaea Centre in the University of Copenhagen, Denmark, for critical reading of the manuscript.
This work was supported by the National Natural Science Foundation of China (no. 31671291 and 91751104 to N.P.), the National Postdoctoral Program for Innovative Talents (no. BX20180112 to T.L.), and the Fundamental Research Funds for the Central Universities (no. 2662015PY038 and 2662015PX199 to N.P.). Funding for open access charges is from the National Natural Science Foundation of China (no. 31671291).
Footnotes
Supplemental material for this article may be found at https://doi.org/10.1128/JB.00747-18.
REFERENCES
- 1. Makarova KS, Haft DH, Barrangou R, Brouns SJ, Charpentier E, Horvath P, Moineau S, Mojica FJ, Wolf YI, Yakunin AF, van der Oost J, Koonin EV. 2011. Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol 9:467–477. doi: 10.1038/nrmicro2577.
- 2. Makarova KS, Wolf YI, Alkhnbashi OS, Costa F, Shah SA, Saunders SJ, Barrangou R, Brouns SJ, Charpentier E, Haft DH, Horvath P, Moineau S, Mojica FJ, Terns RM, Terns MP, White MF, Yakunin AF, Garrett RA, van der Oost J, Backofen R, Koonin EV. 2015. An updated evolutionary classification of CRISPR-Cas systems. Nat Rev Microbiol 13:722–736. doi: 10.1038/nrmicro3569.
- 3. Koonin EV, Makarova KS, Zhang F. 2017. Diversity, classification and evolution of CRISPR-Cas systems. Curr Opin Microbiol 37:67–78. doi: 10.1016/j.mib.2017.05.008.
- 4. Westra ER, Swarts DC, Staals RH, Jore MM, Brouns SJ, van der Oost J. 2012. The CRISPRs, they are a-changin’: how prokaryotes generate adaptive immunity. Annu Rev Genet 46:311–339. doi: 10.1146/annurev-genet-110711-155447.
- 5. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315:1709–1712. doi: 10.1126/science.1138140.
- 6. Sternberg SH, Richter H, Charpentier E, Qimron U. 2016. Adaptation in CRISPR-Cas systems. Mol Cell 61:797–808. doi: 10.1016/j.molcel.2016.01.030.
- 7. Erdmann S, Garrett RA. 2012. Selective and hyperactive uptake of foreign DNA by adaptive immune systems of an archaeon via two distinct mechanisms. Mol Microbiol 85:1044–1056. doi: 10.1111/j.1365-2958.2012.08171.x.
- 8. Erdmann S, Le Moine Bauer S, Garrett RA. 2014. Inter-viral conflicts that exploit host CRISPR immune systems of Sulfolobus. Mol Microbiol 91:900–917. doi: 10.1111/mmi.12503.
- 9. Liu T, Li Y, Wang X, Ye Q, Li H, Liang Y, She Q, Peng N. 2015. Transcriptional regulator-mediated activation of adaptation genes triggers CRISPR de novo spacer acquisition. Nucleic Acids Res 43:1044–1055. doi: 10.1093/nar/gku1383.
- 10. Li M, Wang R, Xiang H. 2014. Haloarcula hispanica CRISPR authenticates PAM of a target sequence to prime discriminative adaptation. Nucleic Acids Res 42:7226–7235. doi: 10.1093/nar/gku389.
- 11. Swarts DC, Mosterd C, van Passel MW, Brouns SJ. 2012. CRISPR interference directs strand specific spacer acquisition. PLoS One 7:e35888. doi: 10.1371/journal.pone.0035888.
- 12. Yosef I, Goren MG, Qimron U. 2012. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res 40:5569–5576. doi: 10.1093/nar/gks216.
- 13. Yosef I, Shitrit D, Goren MG, Burstein D, Pupko T, Qimron U. 2013. DNA motifs determining the efficiency of adaptation into the Escherichia coli CRISPR array. Proc Natl Acad Sci U S A 110:14396–14401. doi: 10.1073/pnas.1300108110.
- 14. Cady KC, Bondy-Denomy J, Heussler GE, Davidson AR, O'Toole GA. 2012. The CRISPR/Cas adaptive immune system of Pseudomonas aeruginosa mediates resistance to naturally occurring and engineered phages. J Bacteriol 194:5728–5738. doi: 10.1128/JB.01184-12.
- 15. Richter C, Dy RL, McKenzie RE, Watson BN, Taylor C, Chang JT, McNeil MB, Staals RH, Fineran PC. 2014. Priming in the type I-F CRISPR-Cas system triggers strand-independent spacer acquisition, bi-directionally from the primed protospacer. Nucleic Acids Res 42:8516–8526. doi: 10.1093/nar/gku527.
- 16. Datsenko KA, Pougach K, Tikhonov A, Wanner BL, Severinov K, Semenova E. 2012. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nat Commun 3:945. doi: 10.1038/ncomms1937.
- 17. Li M, Wang R, Zhao D, Xiang H. 2014. Adaptation of the Haloarcula hispanica CRISPR-Cas system to a purified virus strictly requires a priming process. Nucleic Acids Res 42:2483–2492. doi: 10.1093/nar/gkt1154.
- 18. Arslan Z, Hermanns V, Wurm R, Wagner R, Pul U. 2014. Detection and characterization of spacer integration intermediates in type I-E CRISPR-Cas system. Nucleic Acids Res 42:7884–7893. doi: 10.1093/nar/gku510.
- 19. Liu T, Liu Z, Ye Q, Pan S, Wang X, Li Y, Peng W, Liang Y, She Q, Peng N. 2017. Coupling transcriptional activation of CRISPR-Cas system and DNA repair genes by Csa3a in Sulfolobus islandicus. Nucleic Acids Res 45:8978–8992. doi: 10.1093/nar/gkx612.
- 20. Wei Y, Terns RM, Terns MP. 2015. Cas9 function and host genome sampling in type II-A CRISPR-Cas adaptation. Genes Dev 29:356–361. doi: 10.1101/gad.257550.114.
- 21. Heler R, Samai P, Modell JW, Weiner C, Goldberg GW, Bikard D, Marraffini LA. 2015. Cas9 specifies functional viral targets during CRISPR-Cas adaptation. Nature 519:199–202. doi: 10.1038/nature14245.
- 22. Diez-Villasenor C, Guzman NM, Almendros C, Garcia-Martinez J, Mojica FJ. 2013. CRISPR-spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR-Cas I-E variants of Escherichia coli. RNA Biol 10:792–802. doi: 10.4161/rna.24023.
- 23. Fineran PC, Gerritzen MJ, Suarez-Diez M, Kunne T, Boekhorst J, van Hijum SA, Staals RH, Brouns SJ. 2014. Degenerate target sites mediate rapid primed CRISPR adaptation. Proc Natl Acad Sci U S A 111:E1629–E1638. doi: 10.1073/pnas.1400071111.
- 24. Savitskaya E, Semenova E, Dedkov V, Metlitskaya A, Severinov K. 2013. High-throughput analysis of type I-E CRISPR/Cas spacer acquisition in E. coli. RNA Biol 10:716–725. doi: 10.4161/rna.24325.
- 25. Wang J, Li J, Zhao H, Sheng G, Wang M, Yin M, Wang Y. 2015. Structural and mechanistic basis of PAM-dependent spacer acquisition in CRISPR-Cas systems. Cell 163:840–853. doi: 10.1016/j.cell.2015.10.008.
- 26. Garrett RA, Shah SA, Erdmann S, Liu G, Mousaei M, Leon-Sobrino C, Peng W, Gudbergsdottir S, Deng L, Vestergaard G, Peng X, She Q. 2015. CRISPR-Cas adaptive immune systems of the Sulfolobales: unravelling their complexity and diversity. Life (Basel) 5:783–817. doi: 10.3390/life5010783.
- 27. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Soria E. 2005. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J Mol Evol 60:174–182. doi: 10.1007/s00239-004-0046-3.
- 28. Kieper SN, Almendros C, Behler J, McKenzie RE, Nobrega FL, Haagsma AC, Vink JNA, Hess WR, Brouns S. 2018. Cas4 facilitates PAM-compatible spacer selection during CRISPR adaptation. Cell Rep 22:3377–3384. doi: 10.1016/j.celrep.2018.02.103.
- 29. Lee H, Zhou Y, Taylor DW, Sashital DG. 2018. Cas4-dependent prespacer processing ensures high-fidelity programming of CRISPR arrays. Mol Cell 70:48–59. doi: 10.1016/j.molcel.2018.03.003.
- 30. Shiimori M, Garrett SC, Graveley BR, Terns MP. 2018. Cas4 nucleases define the PAM, length, and orientation of DNA fragments integrated at CRISPR loci. Mol Cell 70:814–824. doi: 10.1016/j.molcel.2018.05.002.
- 31. Guo L, Brugger K, Liu C, Shah SA, Zheng H, Zhu Y, Wang S, Lillestol RK, Chen L, Frank J, Prangishvili D, Paulin L, She Q, Huang L, Garrett RA. 2011. Genome analyses of Icelandic strains of Sulfolobus islandicus, model organisms for genetic and virus-host interaction studies. J Bacteriol 193:1672–1680. doi: 10.1128/JB.01487-10.
- 32. Peng N, Han W, Li Y, Liang Y, She Q. 2017. Genetic technologies for extremely thermophilic microorganisms of Sulfolobus, the only genetically tractable genus of crenarchaea. Sci China Life Sci 60:1–16. doi: 10.1007/s11427-016-0355-8.
- 33. Lemak S, Beloglazova N, Nocek B, Skarina T, Flick R, Brown G, Popovic A, Joachimiak A, Savchenko A, Yakunin AF. 2013. Toroidal structure and DNA cleavage by the CRISPR-associated [4Fe-4S] cluster containing Cas4 nuclease SSO0001 from Sulfolobus solfataricus. J Am Chem Soc 135:17476–17487. doi: 10.1021/ja408729b.
- 34. Zhang J, Kasciukovic T, White MF. 2012. The CRISPR associated protein Cas4 is a 5' to 3' DNA exonuclease with an iron-sulfur cluster. PLoS One 7:e47232. doi: 10.1371/journal.pone.0047232.
- 35. Peng N, Deng L, Mei Y, Jiang D, Hu Y, Awayez M, Liang Y, She Q. 2012. A synthetic arabinose-inducible promoter confers high levels of recombinant protein expression in hyperthermophilic archaeon Sulfolobus islandicus. Appl Environ Microbiol 78:5630–5637. doi: 10.1128/AEM.00855-12.
- 36. Peng W, Feng M, Feng X, Liang YX, She Q. 2015. An archaeal CRISPR type III-B system exhibiting distinctive RNA targeting features and mediating dual RNA and DNA interference. Nucleic Acids Res 43:406–417. doi: 10.1093/nar/gku1302.
- 37. Hudaiberdiev S, Shmakov S, Wolf YI, Terns MP, Makarova KS, Koonin EV. 2017. Phylogenomics of Cas4 family nucleases. BMC Evol Biol 17:232. doi: 10.1186/s12862-017-1081-1.
- 38. Lemak S, Nocek B, Beloglazova N, Skarina T, Flick R, Brown G, Joachimiak A, Savchenko A, Yakunin AF. 2014. The CRISPR-associated Cas4 protein Pcal_0546 from Pyrobaculum calidifontis contains a [2Fe-2S] cluster: crystal structure and nuclease activity. Nucleic Acids Res 42:11144–11155. doi: 10.1093/nar/gku797.
- 39. Leon-Sobrino C, Kot WP, Garrett RA. 2016. Transcriptome changes in STSV2-infected Sulfolobus islandicus REY15A undergoing continuous CRISPR spacer acquisition. Mol Microbiol 99:719–728. doi: 10.1111/mmi.13263.
- 40. Guo Y, Kragelund BB, White MF, Peng X. 2015. Functional characterization of a conserved archaeal viral operon revealing single-stranded DNA binding, annealing and nuclease activities. J Mol Biol 427:2179–2191. doi: 10.1016/j.jmb.2015.03.013.
- 41. Prangishvili D, Koonin EV, Krupovic M. 2013. Genomics and biology of Rudiviruses, a model for the study of virus-host interactions in Archaea. Biochem Soc Trans 41:443–450. doi: 10.1042/BST20120313.
- 42. Hooton SP, Connerton IF. 2015. Campylobacter jejuni acquire new host-derived CRISPR spacers when in association with bacteriophages harboring a CRISPR-like Cas4 protein. Front Microbiol 5:744. doi: 10.3389/fmicb.2014.00744.
- 43. Krupovic M, Makarova KS, Forterre P, Prangishvili D, Koonin EV. 2014. Casposons: a new superfamily of self-synthesizing DNA transposons at the origin of prokaryotic CRISPR-Cas immunity. BMC Biol 12:36. doi: 10.1186/1741-7007-12-36.
- 44. Gardner AF, Prangishvili D, Jack WE. 2011. Characterization of Sulfolobus islandicus rod-shaped virus 2 gp19, a single-strand specific endonuclease. Extremophiles 15:619–624. doi: 10.1007/s00792-011-0385-0.
- 45. Quax TE, Voet M, Sismeiro O, Dillies MA, Jagla B, Coppee JY, Sezonov G, Forterre P, van der Oost J, Lavigne R, Prangishvili D. 2013. Massive activation of archaeal defense genes during viral infection. J Virol 87:8419–8428. doi: 10.1128/JVI.01020-13.
- 46. Deng L, Zhu H, Chen Z, Liang YX, She Q. 2009. Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles 13:735–746. doi: 10.1007/s00792-009-0254-2.
- 47. Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188–1190. doi: 10.1101/gr.849004.
- 48. Notredame C, Higgins DG, Heringa J. 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302:205–217. doi: 10.1006/jmbi.2000.4042.
- 49. Kumar S, Stecher G, Tamura K. 2016. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol 33:1870–1874. doi: 10.1093/molbev/msw054.