ABSTRACT
CRISPR-Cas systems are a highly effective immune mechanism for prokaryotes, providing defense against invading foreign DNA. By definition, all CRISPR-Cas systems have short repeats interspersing their spacers. These repeats play a key role in preventing cleavage of self DNA and in the integration of new spacers. Here we focus on the phenomenon of repeat modularity, namely the unexpectedly high degree of repeat conservation across different systems within a genome or between different species. We hypothesize that modularity can be beneficial for CRISPR-Cas containing organisms, because it facilitates horizontal acquisition of ‘pre-immunized’ CRISPR arrays and allows the utilization of spacers acquired by one system for use by other systems within the same cell.
KEYWORDS: CRISPR-Cas, repeats, virus, phage, archaea, lateral gene transfer, horizontal gene transfer, modularity, cas6
1. CRISPR repeats and their cellular interaction partners
CRISPR-Cas systems provide prokaryotes with an adaptive form of defense against invading foreign DNA, especially viruses. Because the immune memory is DNA-based, it is also heritable. CRISPR repeats are a defining characteristic of all CRISPR-Cas systems. These repetitive DNA elements enable the correct processing of the pre-crRNA into small crRNAs, while at the same time preventing auto-immune cleavage of the immune memory locus itself. Several Cas proteins must interact specifically with a particular repeat to exert their function. First, spacer integration requires the recognition of the first repeat by the Cas1-Cas2 complex [1–5], which can sometimes also include Cas4 [6]. Second, in all systems, processing of the pre-crRNA requires the RNA cleavage activity of either a dedicated protein or a housekeeping RNase. The enzyme performing this role is Cas6 in most type I and type III systems [7] and Cas5d (sometimes referred to as Cas5c) in type I-C systems [8–10] (reviewed in [11]). Type II systems utilize the housekeeping enzyme RNase III for pre-crRNA maturation in a process that also requires trans-activating CRISPR RNAs (tracrRNAs) [12] and Cas9. In type V-B systems, Cas12b and tracrRNAs are required for the processing of pre-crRNA [13], while in type V-A systems this function is performed by Cas12a alone [14,15]. In type VI-A systems, the processing is likewise accomplished solely by Cas13a [13]. Finally, a type III-B variant system in cyanobacteria has been shown to use the housekeeping enzyme RNase E for this role [16].
2. Repeat compatibility allows crRNA sharing by multiple systems within the same cell
Correct crRNA maturation requires specific recognition of structure and sequence features of the repeats by the processing machinery, and repeats have been accordingly classified into families based on structure and function [17,18]. One could expect that every specific CRISPR-Cas system would have its own distinct repeat family, which has uniquely coevolved with its cognate Cas1 and processing enzyme. However, it has been noted that divergent systems either within the same species or in different species often show high repeat sequence conservation [19–21]. Furthermore, within the same genome there are often multiple CRISPR-Cas systems that have nearly identical repeats. In the archaeon Methanosarcina mazei, two Cas6b paralogs belonging to different systems are able to cleave the repeats of both CRISPR loci, which are highly similar to one another [22]. Another example is the genome of the hyperthermophilic archaeon Pyrococcus furiosus, which encodes three different CRISPR-Cas systems (III-B; I-G, formerly classified as type I-B; and I-A) and seven CRISPR arrays, yet all arrays have identical repeats [23], implying that their pre-crRNA transcripts can be processed by a single Cas6 homolog. Such inter-system compatibility also implies that spacers acquired by a single acquisition complex could mediate destruction of foreign nucleic acids by several distinct interference complexes, thereby providing redundancy should one of these complexes fail to function (for example, due to an anti-CRISPR protein expressed by the virus). This illustrates the benefits of repeat conservation: the ability to rely on a single processing enzyme for multiple systems and the subsequent compatibility of the crRNAs with different systems within the same cell.
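The notion of repeat compatibility within a genome can be made concrete with a small sketch. Assuming each array is summarized by a single consensus repeat, the hypothetical helper below groups arrays whose repeats differ by at most a chosen number of mismatches, checking both orientations because arrays may be annotated on either strand. The array names, the toy sequences, and the mismatch threshold are illustrative assumptions and not data from the studies cited above.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def compatible(a: str, b: str, max_mismatches: int = 1) -> bool:
    """True if b matches a, in either orientation, within the threshold."""
    if len(a) != len(b):
        return False
    best = min(mismatches(a, b), mismatches(a, reverse_complement(b)))
    return best <= max_mismatches


def group_arrays(repeats: Dict[str, str], max_mismatches: int = 1) -> List[List[str]]:
    """Greedy grouping: each array joins the first group whose
    representative (first member) repeat it is compatible with."""
    groups: List[List[str]] = []
    for name, seq in repeats.items():
        for group in groups:
            if compatible(repeats[group[0]], seq, max_mismatches):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy sequences (hypothetical, not real CRISPR repeats).
toy = {
    "array_1": "GTTCCAATAAGACTC",
    "array_2": "GTTCCAATAAGACTA",  # one mismatch relative to array_1
    "array_3": "GAGTCTTATTGGAAC",  # reverse complement of array_1
    "array_4": "ACGTACGTACGTACG",  # unrelated sequence
}
groups = group_arrays(toy)
print(groups)
```

A genome in which all arrays fall into one group would, under these assumptions, be a candidate for crRNA sharing between systems; whether a given Cas6 homolog actually tolerates a particular mismatch remains an experimental question.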
3. Repeat conservation may aid the horizontal acquisition of immune memory
If CRISPR arrays can be acquired horizontally, this could provide another potential advantage for repeat compatibility between different species. This is a likely scenario in prokaryotic groups where CRISPR arrays tend to be plasmid-encoded rather than chromosomal, such as halophilic archaea [24]. Furthermore, in bacteria, CRISPR arrays and entire CRISPR-Cas loci have been shown to be laterally transferred via generalized transduction [25]. Since a single viral family can often infect multiple haloarchaeal genera [26], spacers acquired by one species can protect against viruses later encountered by another. Indeed, CRISPR repeats of diverse haloarchaeal genera (typically belonging to type I-B systems) retain an exceptionally high level of sequence similarity [21]. If this conservation is indeed beneficial, one would expect that even relatively remote type I-B systems that have highly divergent Cas6 nucleases would still retain this near-identity of repeats. Indeed, when we performed phylogenetic analysis of Cas6 proteins of halophilic archaea, this is exactly what was observed (Fig. 1). Notably, even when Cas6 sequence identity is rather low, such as 72% between the protein orthologs of Halorhabdus tiamatea and Natronobacterium gregoryi, the repeats can be nearly identical (29/30 nucleotide identity in this case). Such high protein sequence variation may exist because Cas6 orthologs can have highly variable active sites [27], with typically low turnover rates [28].
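To put the two identity figures above on the same scale, the following minimal sketch computes percent identity for a pair of pre-aligned, equal-length sequences. The function name and the 30-nt toy sequences are hypothetical and are not the actual repeats of the species mentioned; they only reproduce the case of a single substitution in 30 positions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned,
    equal-length sequences (no gap handling in this sketch)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 30-nt sequence and a copy carrying a single substitution.
repeat_a = "ACGT" * 7 + "AC"
repeat_b = repeat_a[:10] + "T" + repeat_a[11:]  # G -> T at index 10

print(len(repeat_a))                                   # 30
print(round(percent_identity(repeat_a, repeat_b), 1))  # 96.7
```

Under this measure, 29/30 identical nucleotides corresponds to roughly 96.7% identity, compared with the 72% identity reported for the corresponding Cas6 proteins.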
Figure 1.

Molecular phylogenetic analysis by the Maximum Likelihood method of selected haloarchaeal Cas6 orthologs, shown next to their respective CRISPR repeats. To represent the fact that many haloarchaeal species have slightly different repeats in different arrays, a WebLogo [30] representation is shown, created using http://weblogo.berkeley.edu/logo.cgi. Only complete genomes in which there is a single CRISPR repeat sequence family were used. The evolutionary history was inferred from an amino acid alignment of the proteins by using the Maximum Likelihood method. The tree with the highest log likelihood (−2618.88) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. All positions containing gaps and missing data were eliminated. There were a total of 197 positions in the final dataset. Evolutionary analyses were conducted using MEGA7 [31].
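The elimination of all positions containing gaps or missing data, as described in the legend, can be sketched as follows. This is not the MEGA7 implementation; the function name, the toy alignment, and the choice of "-" and "?" as gap and missing-data symbols are assumptions made for illustration.

```python
from typing import Dict


def complete_deletion(alignment: Dict[str, str], missing: str = "-?") -> Dict[str, str]:
    """Drop every alignment column in which at least one sequence
    has a gap or missing-data symbol."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = zip(*alignment.values())
    keep = [i for i, col in enumerate(columns)
            if not any(c in missing for c in col)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy amino acid alignment (hypothetical, not the Cas6 alignment).
toy_alignment = {
    "taxon_A": "MK-LVE?T",
    "taxon_B": "MKALVEQT",
    "taxon_C": "MRAL-EQT",
}
trimmed = complete_deletion(toy_alignment)
print(trimmed)
print(len(next(iter(trimmed.values()))))  # number of retained positions
```

In this toy case five of the eight columns are retained; in the analysis described above, the analogous filtering left 197 positions.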
Thus, repeat compatibility may facilitate the acquisition of either whole CRISPR systems or just ‘pre-immunized’ arrays while retaining functional compatibility between newly acquired and preexisting components. This would also imply a greater benefit for the community of cells than for the individual cell, as is the case when different cells acquire different spacers against the same virus [29].
4. Concluding words
While on the surface CRISPR repeats are the most static and predictable part of CRISPR-Cas systems, a more careful examination of these elements provides evidence for interesting evolutionary dynamics, including the accumulation of multiple systems and their subsequent emergent inter-connectedness, or the lateral transfer of systems and arrays. Additional bioinformatic analysis involving a broader representation of repeats and microbial taxa is required in order to test the generality of the trends that we have highlighted here. Moreover, the benefits conferred by repeat compatibility for specific archaea or bacteria should also be experimentally confirmed.
Funding Statement
This work was supported by the Israel Science Foundation [grant 535/15].
Disclosure of potential conflicts of interest
No potential conflicts of interest were disclosed.
Acknowledgments
The authors thank Israela Turgeman-Grott for helpful discussions.
References
- 1. Yosef I, Goren MG, Qimron U. Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucl Acids Res. 2012;40:5569–5576.
- 2. Wei Y, Chesne MT, Terns RM, et al. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus. Nucl Acids Res. 2015;43:1749–1758.
- 3. McGinn J, Marraffini LA. CRISPR-Cas systems optimize their immune response by specifying the site of spacer integration. Mol Cell. 2016;64:616–623.
- 4. Wang R, Li M, Gong L, et al. DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica. Nucl Acids Res. 2016;44:4266–4277.
- 5. Xiao Y, Ng S, Nam KH, et al. How type II CRISPR-Cas establish immunity through Cas1-Cas2-mediated spacer integration. Nature. 2017;550:137–141.
- 6. Plagens A, Tjaden B, Hagemann A, et al. Characterization of the CRISPR/Cas subtype I-A system of the hyperthermophilic crenarchaeon Thermoproteus tenax. J Bacteriol. 2012;194:2491–2500.
- 7. Carte J, Wang R, Li H, et al. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes. Genes Dev. 2008;22:3489–3496.
- 8. Hochstrasser ML, Taylor DW, Kornfeld JE, et al. DNA targeting by a minimal CRISPR RNA-guided cascade. Mol Cell. 2016;63:840–851.
- 9. Nam KH, Haitjema C, Liu X, et al. Cas5d protein processes pre-crRNA and assembles into a cascade-like interference complex in subtype I-C/Dvulg CRISPR-Cas system. Structure. 2012;20:1574–1584.
- 10. Punetha A, Sivathanu R, Anand B. Active site plasticity enables metal-dependent tuning of Cas5d nuclease activity in CRISPR-Cas type I-C system. Nucl Acids Res. 2014;42:3846–3856.
- 11. Hochstrasser ML, Doudna JA. Cutting it close: CRISPR-associated endoribonuclease structure and function. Trends Biochem Sci. 2015;40:58–66.
- 12. Deltcheva E, Chylinski K, Sharma CM, et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature. 2011;471:602–607.
- 13. Shmakov S, Abudayyeh OO, Makarova KS, et al. Discovery and functional characterization of diverse class 2 CRISPR-Cas systems. Mol Cell. 2015;60:385–397.
- 14. Zetsche B, Gootenberg JS, Abudayyeh OO, et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell. 2015;163:759–771.
- 15. Fonfara I, Richter H, Bratovic M, et al. The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature. 2016;532:517–521.
- 16. Behler J, Sharma K, Reimann V, et al. The host-encoded RNase E endonuclease as the crRNA maturation enzyme in a CRISPR-Cas subtype III-Bv system. Nat Microbiol. 2018;3:367–377.
- 17. Alkhnbashi OS, Costa F, Shah SA, et al. CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci. Bioinformatics. 2014;30:i489–i496.
- 18. Lange SJ, Alkhnbashi OS, Rose D, et al. CRISPRmap: an automated classification of repeat conservation in prokaryotic adaptive immune systems. Nucl Acids Res. 2013;41:8034–8044.
- 19. Garrett RA, Vestergaard G, Shah SA. Archaeal CRISPR-based immune systems: exchangeable functional modules. Trends Microbiol. 2011;19:549–556.
- 20. Li M, Liu H, Han J, et al. Characterization of CRISPR RNA biogenesis and Cas6 cleavage-mediated inhibition of a provirus in the haloarchaeon Haloferax mediterranei. J Bacteriol. 2013;195:867–875.
- 21. Maier LK, Lange SJ, Stoll B, et al. Essential requirements for the detection and degradation of invaders by the Haloferax volcanii CRISPR/Cas system I-B. RNA Biology. 2013;10:865–874.
- 22. Nickel L, Weidenbach K, Jager D, et al. Two CRISPR-Cas systems in Methanosarcina mazei strain Go1 display common processing features despite belonging to different types I and III. RNA Biology. 2013;10:779–791.
- 23. Terns RM, Terns MP. The RNA- and DNA-targeting CRISPR-Cas immune systems of Pyrococcus furiosus. Biochem Soc Trans. 2013;41:1416–1421.
- 24. Stachler AE, Turgeman-Grott I, Shtifman-Segal E, et al. High tolerance to self-targeting of the genome by the endogenous CRISPR-Cas system in an archaeon. Nucl Acids Res. 2017;45:5208–5216.
- 25. Watson B, Staals R, Fineran P. CRISPR-cas-mediated phage resistance enhances horizontal gene transfer by transduction. mBio. 2018;e02406–e02417.
- 26. Demina TA, Atanasova NS, Pietila MK, et al. Vesicle-like virion of Haloarcula hispanica pleomorphic virus 3 preserves high infectivity in saturated salt. Virology. 2016;499:40–51.
- 27. Reeks J, Sokolowski RD, Graham S, et al. Structure of a dimeric crenarchaeal Cas6 enzyme with an atypical active site for CRISPR RNA processing. Biochem J. 2013;452:223–230.
- 28. Li H. Structural principles of CRISPR RNA processing. Structure. 2015;23:13–20.
- 29. Van Houte S, Ekroth AK, Broniewski JM, et al. The diversity-generating benefits of a prokaryotic adaptive immune system. Nature. 2016;532:385–388.
- 30. Crooks GE, Hon G, Chandonia JM, et al. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190.
- 31. Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–1874.
