Abstract
Clustered, regularly interspaced, short palindromic repeats (CRISPR) act as an adaptive RNA-mediated immune mechanism in bacteria. They can also be used for identification and evolutionary studies based on polymorphisms within the CRISPR locus. We amplified and analyzed 6 CRISPR loci from 237 Shigella strains belonging to the 4 species groups, as well as 13 Escherichia coli strains. The CRISPR-associated (cas) gene sequence arrays of these strains were screened and compared. The CRISPR sequences from Shigella were conserved among subtypes, suggesting that CRISPR may represent a new identification tool for the detection and discrimination of Shigella species. Secondary structure analysis showed a different stem-loop structure at the terminal repeat, suggesting a distinct recognition mechanism in the formation of crRNA. In addition, the presence of “self-target” spacers and polymorphisms within CRISPR in Shigella indicated a selective pressure for inhibition of this system, which has the potential to damage “self DNA.” Homology analysis of spacers showed that CRISPR might be involved in the regulation of virulence transmission. Phylogenetic analysis based on CRISPR sequences from Shigella and E. coli indicated that although phenotypic properties maintain convergent evolution, the 4 Shigella species do not represent natural groupings. Surprisingly, comparative analysis of Shigella repeats with other species provided new evidence for CRISPR horizontal transfer. Our results suggested that CRISPR analysis is applicable for the detection of Shigella species and for investigation of evolutionary relationships.
Keywords: bacterial evolution, CRISPR/Cas, cas genes, horizontal transfer, polymorphism, Shigella
Introduction
Shigella is the causative agent of shigellosis, with nearly 167 million cases and more than one million deaths annually worldwide,1 mostly among children younger than 5 years. It is prevalent in less developed countries because of poorer sanitation conditions. Four species or groups of Shigella were recognized and categorized during the 1940s. Each can be further subdivided into one or more serotypes based on the O-antigen structures of the membrane-associated lipopolysaccharide: S. dysenteriae (17 serotypes), S. flexneri (16 serotypes), S. boydii (18 serotypes), and S. sonnei (one serotype). Shigella was thought to be closely related to Escherichia coli, and in recent years, increasing evidence has confirmed the relatedness of these 2 species.2,3 Phylogenetic studies have suggested that Shigella forms a single pathovar of E. coli.4
The clustered regularly interspaced short palindromic repeats (CRISPR) structure was first described in E. coli in 1987.5 It is characterized by 24–47-bp DNA direct repeats, separated by variable 21–72-bp sequences called spacers.6 Spacers are thought mainly to derive from invading DNA, and often have a specific arrangement in different bacteria.6,7 A “leader sequence” and cas (CRISPR-associated sequence) genes are often located adjacent to the CRISPR locus, and encode the proteins required for CRISPR function. Before their physiological role was understood, CRISPR-Cas systems were found to be a useful tool for typing bacterial diversity. A method called spoligotyping, based on CRISPR sequence information, was first used for differentiating Mycobacterium tuberculosis and is now a gold standard for routine genotyping.8,9 Subtyping methods based on analyses of the spacers of the CRISPR loci have also been developed for Yersinia pestis and Salmonella.10-15 Louwen and colleagues established that both spacer variation within the CRISPR array as well as single nucleotide polymorphisms in the cas genes were useful for typing Campylobacter jejuni.16 There are also ongoing studies to apply the CRISPR-based typing system to other bacteria.
Two decades after its discovery, CRISPR was defined as an adaptive RNA-mediated immune system that imparts sequence-specific immunity against mobile genetic elements such as bacteriophage and plasmids.7 In recent years, research into the function of CRISPR has shifted from defense against foreign genetic elements to its other regulatory functions in bacteria. Mounting evidence shows that different CRISPR-Cas systems can modulate various processes in bacteria, such as the genetic regulation of group behavior17,18 and virulence,19,20 even when there is only partial complementarity between spacers and their target sequence.17,18 Spacers are usually derived from mobile origins, such as phages and plasmids. However, several recent reports showed that incorporation of self-targeting spacers (targets within the chromosome) is fairly common and completely random, especially in type II CRISPRs.21-23 Although it has been hypothesized that the self-targeting CRISPR system may participate in gene regulation,24 the complete lack of conservation of self-targeting spacers across species suggests that self-targeting should be regarded as a form of autoimmunity.25 These spacers may lead to cytotoxicity because of the cleavage function of CRISPR systems, which results in large-scale genomic alterations.26 Even so, strains may survive in some cases by acquiring mutations that inactivate self-targeting. These mutations can occur in cas genes, spacers, repeats, and proto-spacer targets, and there is even evidence that the entire CRISPR/cas locus can be lost.25 However, research into the regulatory functions and self-targeting of CRISPR is still in its infancy, and more work is needed to validate and fully interpret these results.
In E. coli, 2 CRISPR loci have been confirmed and were designated as type 2 and type 4 loci.27 These 2 loci were then classified as type I-E and type I-F CRISPR systems, respectively.28 In general, the type I-E CRISPR-Cas system is the predominant type in E. coli.29 However, in Enterobacteriaceae, the I-E CRISPR-Cas system is silenced by the heat-stable nucleoid-structuring repressor protein in the natural state,30,31 resulting in a CRISPR system lacking the characteristics of a classical immune system.32 The conserved nature of CRISPR loci makes them useful markers for clonal population detection and phylogenetic analysis.32-35 Shigella species also contain CRISPR loci. Guo et al. analyzed CRISPRs in a collection of Shigella strains from different regions of China;36 however, few Shigella subtypes were examined and the study did not highlight the differences in CRISPRs among different subtypes. Therefore, a comprehensive study of the CRISPR system in the 4 Shigella species (S. dysenteriae, S. flexneri, S. boydii, and S. sonnei) has not yet been performed.
In this study, we amplified the CRISPR loci from 237 strains distributed among the 4 Shigella species. Polymorphisms within the CRISPR system, including the CRISPR locus and the cas gene sequence, were analyzed. In addition, CRISPR DNA from 13 E. coli strains was amplified and sequenced. The objectives of this study were: (i) to understand CRISPR diversity within Shigella serotypes; (ii) to find evidence for the regulatory functions of CRISPR in Shigella; (iii) to determine the unique characteristics of the CRISPR system in Shigella by comparison between Shigella and E. coli; and (iv) to research the evolutionary relationship between Shigella subtypes, E. coli, and other bacteria based on polymorphisms within the CRISPR loci.
Results
CRISPR locus distribution and characteristics in Shigella
We amplified and sequenced 6 CRISPR loci (CRa-f) within 237 strains belonging to the 4 Shigella species (Table 1). According to the location of the CRISPR loci and repeat sequences, we confirmed that the CRb, CRd, and CRe loci corresponded to the CRISPR3, CRISPR1, and CRISPR2 loci, respectively, described by Guo et al.36 The CRa, CRc, and CRf loci are listed as uncertain CRISPRs in the CRISPR database (http://crispr.u-psud.fr/). Although we did not determine whether these 3 loci were functional CRISPR regions, we determined that they had all the characteristics of CRISPR loci and could therefore be included in the typing and detection of Shigella. The overall analysis of the 6 CRISPR loci revealed that the average GC content of the spacers was 51.03%. Thirty-one spacers were found in the CRISPR loci of the Shigella strains, and the number of spacers varied from 1 to 13 among the 6 CRISPR loci: CRa (one spacer), CRb (3 spacers), CRc (one spacer), CRd (13 spacers), CRe (11 spacers), and CRf (2 spacers). The spacers ranged in length from 32 to 50 bp (Supplementary Text S1).
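The spacer statistics reported above (length range and average GC content) are straightforward to recompute once the spacer sequences in Supplementary Text S1 are available. The following is a minimal Python sketch, not the authors' pipeline. The function names are hypothetical, the average is assumed to be the mean of per-spacer percentages, and the two example sequences are spacer variants taken from Table 2 rather than the full set of 31 spacers.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring gaps and spaces."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def summarize_spacers(spacers: dict[str, str]) -> dict[str, float]:
    """Length range and mean per-spacer GC content (assumed averaging scheme)."""
    lengths = [len(s) for s in spacers.values()]
    return {
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_gc": mean(gc_content(s) for s in spacers.values()),
    }


# Illustrative input only: two spacer variants listed in Table 2.
example_spacers = {
    "CRd-s1-mu1": "TAAGTGATATCCATTATCGCATCCAGTGCGCC",
    "CRb-s2-mu2": "TTCACTGGTAACATACTCCACCCGCCCACCAT",
}
summary = summarize_spacers(example_spacers)
print(summary)
```

Running the same function over all 31 spacers would be one way to check the reported 51.03% figure, provided the averaging scheme matches the one the authors used.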
Table 1.
Summary of Shigella isolates
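| Group | Subtype | Number of strains |
|---|---|---|
| S. sonnei | - | 71 |
| S. flexneri | 1a | 12 |
| S. flexneri | 1b | 3 |
| S. flexneri | 2a | 35 |
| S. flexneri | 2b | 16 |
| S. flexneri | 2c | 19 |
| S. flexneri | 3a | 1 |
| S. flexneri | 4c | 60 |
| S. flexneri | 6 | 2 |
| S. flexneri | x | 3 |
| S. flexneri | y | 2 |
| S. dysenteriae | I | 1 |
| S. boydii | 1 | 1 |
| S. boydii | 2 | 1 |
| S. boydii | 3 | 1 |
| S. boydii | 4 | 1 |
| S. boydii | 5 | 1 |
| S. boydii | 8 | 1 |
| S. boydii | 9 | 1 |
| S. boydii | 10 | 1 |
| S. boydii | 12 | 1 |
| S. boydii | 13 | 1 |
| S. boydii | 14 | 1 |
| S. boydii | 18 | 1 |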
All of the S. sonnei strains tested contained the same 6 CRISPR loci, except for 4 strains that lacked one spacer (CRd-s2) close to the leader sequence of the CRd locus (Fig. 1). Similarly, one spacer (CRb-s1) was absent from the CRb loci of 2 S. flexneri strains (serotypes F2b and F3a) compared with the other S. flexneri strains tested. The CRISPR loci in F6 serotype strains also differed from those of the other S. flexneri subtypes: they did not contain the CRa or CRe loci, and contained only one repeat in the CRc locus. We tested only one S. dysenteriae strain (serotype A1), but this strain displayed the highest number of spacers in the CRd (8) and CRe (10) loci among all of the Shigella strains, and was the only serotype that did not contain the CRf locus. Because of the limited number of strains, we also screened for the 6 loci in the sequenced genomes of 2 S. dysenteriae strains (S. dysenteriae 1617 Gi: 558570689 and S. dysenteriae Sd197 Gi: 82775382) from NCBI. The results showed that the 2 genomes contained the same CRa, CRb, and CRc loci as our S. dysenteriae strain, and the same CRf locus as strains belonging to the other 3 Shigella groups tested. However, the 2 genome sequences did not contain the CRe locus. At the CRd locus, 3 spacers that were not detected in any of our strains were identified (Supplementary Text S1). Sequence analysis of the 12 different subtypes of S. boydii revealed that subtypes 1, 2, 3, 4, 8, 10, and 14 shared the same CRISPR loci genotype. S. boydii subtypes 5 and 9 were distinct from the other subtypes as they contained a different sequence in the CRa locus, a different spacer (CRb-s3) in the CRb locus, and additional spacers in the CRd locus. S. boydii subtype 13 was also distinct because it lacked the CRb and CRd loci.
Figure 1.

CRISPR loci in Shigella spp. and E. coli. (A) CRISPR loci in the 4 groups of Shigella spp. Their subtypes and the corresponding number of isolates (in parentheses) are listed. (B) The results of sequence analysis of products amplified from 13 E. coli strains using the 6 primer sets designed against the Shigella CRISPR sequences. The strain number and type are listed in the first column. The large diamonds represent spacers and the small diamonds represent repeats. The hexagons represent abbreviated runs of multiple spacers. The rectangles represent the replacement sequences. Horizontal lines represent no amplification product. “s” represents spacers found in Shigella spp.; “e” represents spacers found in E. coli. The diamond of CRe-s2 is small because its length is only 20 bp. The yellow diamond in the CRe locus represents the repeat that contains a 12-bp insertion. The direction of each R-S array, from the leader sequence to the terminal repeat, is marked with an arrow, except for the CRa, CRc, and CRf loci, whose leader sequences have not yet been identified.
All of the CRISPR loci in the Shigella species had distinct characteristics. Two arrangements of the CRa and CRc loci were identified: the first had only one repeat-spacer (R–S) unit (such as in S. sonnei), while the other had only one repeat (such as in S. boydii subtype 12 and S. flexneri subtype 6). Interestingly, there was no R-S unit in the CRa locus of S. flexneri or S. boydii subtypes 5, 9, and 13. Instead, the R-S regions were replaced with 54-bp (R1) and 87-bp (R2) gene segments, while the left and right flank sequences of the R-S regions remained unchanged. Sequence analysis of the CRb, CRd, and CRe loci revealed a high diversity in the R-S regions. Moreover, all S. flexneri subtypes (except F6) had an irregular arrangement of R-S units in the CRd locus, with a 28-bp gene fragment (R3) taking the place of CRd-s2 and a part of the repeat in S. sonnei (Fig. 1). It should be mentioned that CRd and CRe shared an identical repeat sequence. However, not all of the CRe repeats were regular: the third repeat was incomplete and the fourth repeat contained a 12-bp insertion fragment (Supplementary Text S1). Between these 2 repeats, there was a 461-bp unknown gene fragment (R4) in all the CRe loci. In addition, IS1 was inserted upstream of this fragment in all the S. sonnei strains. The CRf locus contained 2 R-S units, and was present in S. sonnei, S. flexneri, S. boydii, and 2 sequenced S. dysenteriae strains, but absent in the one S. dysenteriae strain included in the study.
Comprehensive sequence analysis revealed that the repeat sequences in the CRISPR loci were not always completely conserved because of point mutations in particular serotypes (Table 2). These mutations were generally found in the repeat within the R-S regions and could be used to distinguish between different subtypes. Furthermore, our results revealed the site of a point mutation within the spacer sequences. This finding differed from a previous report37 because the identified point mutations did not all occur at the end of the spacer sequence. The spacers of different serotypes were also relatively well conserved (Table 2). Therefore, repeat, spacer, and locus architecture of Shigella CRISPR regions could prove useful for the serological distinction of Shigella subtypes. However, because of the large variety of Shigella serotypes (such as S. flexneri subtypes), this method would mainly be useful for discriminating between groups and some subtypes. To increase the discriminatory power of this method, we also analyzed the cas gene sequences for the presence of single nucleotide polymorphisms, which has proven useful for Campylobacter typing.16 However, the usable point mutations only occurred in cas1 and cas2 and could be used to discriminate S. boydii subtypes 5 and 9 from the other Shigella subtypes (Supplementary Text S2). We also analyzed the secondary structure of all repeat sequences, and the CRb and CRd repeats formed a hairpin structure (Fig. 2). The hairpin structure formed by the terminal direct repeat (repeat furthest from the leader) sequence was divergent from the consensus direct repeat in the CRb and CRd loci, having shortened stem-loops (Fig. 2).
Table 2.
Point mutations of R-S sequences
| Loci | ID | DNA sequence | Length | Strains |
|---|---|---|---|---|
| CRa | s1-mu1 | CGGGTTGGATAAGGCGCTTATGCCGCATCCGACAGCTATCGCCGGATGCG | 50 | E130 |
| CRb | rep1-mu1 | GTTCACTGCCGTACAGGCAGCTTAAAAA | 28 | B3 B8 B10 E194 E182 |
| CRb | rep2-mu1 | GTTCACTGCCGTACAGACAGATAAAATG | 28 | S. sonnei |
| CRb | rep2-mu2 | GTTCACTGCCGTACAGGCAGATAAAATG | 28 | B12 E010 E129 |
| CRb | rep2-mu3 | GTTCACTGCTGTACAGGTAGATAAAATG | 28 | B5 B9 |
| CRb | s2-mu1 | TTCACAGGTAACATACTCCA----CCCACCAT | 28 | E194 E195 E182 E143 |
| CRb | s2-mu2 | TTCACTGGTAACATACTCCACCCGCCCACCAT | 32 | E010 E136 E133 E183 E200 E129 |
| CRc | rep-mu1 | TTTGTAGGCCTGATAAGACGCGTCAGCGTCGCATCAGGC | 39 | D1 |
| CRc | rep-mu2 | TTTGTAGGCCTGATAAGACGCGCCAGCGTCGCATCAGG- | 38 | B13 |
| CRc | s1-mu1 | TCCGGGTGCCGGATGCAGCGTGAACGCCTTATCCGGCCTACGGCTCTACGGCTCGGA | 57 | E130 |
| CRc | s1-mu2 | ------TGCCGGATGCGGCGTAAACGCCTTATCCGGCCTACGGCTCGGA | 43 | B13 |
| CRc | s1-mu3 | TCCGGGTGCCGGATGCGGCGTGAACGCCTTATCCGGCCTACGGCTCGGA | 49 | B13 |
| CRd | rep1 | GGGTGTTCTACCGCAGGGGCGGGG-AATTC | 29 | S. sonnei D1 E130 E136 E133 E195 E182 E143 E129 |
| CRd | rep1-mu1 | GGGAGTTCTACCGCAGGGGCGGGG-AATTC | 29 | S. flexneri B1 B2 B3 B4 B8 B10 B12 B14 B18 E244 |
| CRd | rep1-mu2 | GG-AGTTCTACCGCAGGGGCGGGGGAACTC | 29 | B5 |
| CRd | rep1-mu3 | GG-AATTCTACCGCAGAGGCGGGGGAACTC | 29 | B9 |
| CRd | rep2-mu1 | CGGTTTATCCCCGCTGGCGCGGGGAACTC | 29 | S. sonnei E130 |
| CRd | rep2-mu2 | CGGTTTATCCCCGCTGGCGCGGGGACCTC | 29 | A1 B5 |
| CRd | rep2-mu3 | CGGTTTATCCCCGCTGATGCGGGGAACAC | 29 | B9 |
| CRd | rep2-mu4 | AGGTTTATCCCCGCTGGCGCGGGGAACAC | 29 | E244 |
| CRd | rep2-mu5 | CGGTTTATCCCCACTGGCGCGGGGAACAC | 29 | E244 |
| CRd | rep2-mu6 | CGGTTTATCCCCGCTAGCGCGGGGAACAC | 29 | E244 |
| CRd | rep2-mu7 | CGGTTTATCCCCGCTGACGCGGGGAACAC | 29 | B9 |
| CRd | rep2-mu8 | CGGTTTATCCCCGCTGGCGTGGGGAACAC | 29 | A1 B9 |
| CRd | rep2-mu9 | CGGTTTATCTCCGCTGGAGCGGGGAACAC | 29 | S. flexneri except F6 subtype |
| CRd | rep2-mu10 | CGGTTTATCCCCGCTGACGCGGGGAACTC | 29 | E195 E182 |
| CRd | s1-mu1 | TAAGTGATATCCATTATCGCATCCAGTGCGCC | 32 | F6 B1 B2 B3 B4 B8 B10 B12 B14 B18 |
| CRd | s1-mu2 | TAAGTGATACCCATCATCGCATCCAGTGCGTC | 32 | S. flexneri except F6 subtype |
| CRd | s1-mu3 | CAAGTGATATCCATCATCGCATCCAGTGCGCC | 32 | B5 B9 |
| CRe | rep1 | ATGGTTATCCCCGCTGGCGCGGGGAACAT | 29 | S. sonnei |
| CRe | rep1-mu2 | ATAGTTATCCCCGCTGACGCGGGGAACAT | 29 | B5 B9 |
| CRe | rep1-mu3 | ATGGTTATCCCCGCTGACGCGGGGAACAT | 29 | B18 D1 all E. coli |
| CRf | rep-mu1 | AGGTAACTCAGGGAGAAATGTCAG | 24 | first repeat |
| CRf | rep-mu2 | AGGTAACTCAGGAAGAAATGTCAG | 24 | first repeat of B18 B1 B3 B8 B10 |
E: E. coli; B: S. boydii; D: S. dysenteriae; F: S. flexneri. The number after “E” represents the strain number. The number after “B,” “D,” or “F” represents the subtype. Hyphens within a sequence denote alignment gaps.
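The point mutations in Table 2 can be located by comparing each variant with a reference repeat, position by position. The sketch below is illustrative rather than the authors' method. The function name is hypothetical, it assumes the two sequences are already aligned and of equal length, and it uses the CRe repeats from Table 2 with the S. sonnei repeat (rep1) as the reference.

```python
def point_mutations(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, variant base) for each mismatch.

    Assumes the two sequences are pre-aligned and therefore equal in length;
    a gap character such as '-' is compared like any other symbol.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, var_base)
        for pos, (ref_base, var_base) in enumerate(zip(reference, variant), start=1)
        if ref_base != var_base
    ]


# CRe repeats as listed in Table 2.
cre_rep1 = "ATGGTTATCCCCGCTGGCGCGGGGAACAT"      # S. sonnei
cre_rep1_mu2 = "ATAGTTATCCCCGCTGACGCGGGGAACAT"  # S. boydii subtypes 5 and 9
cre_rep1_mu3 = "ATGGTTATCCCCGCTGACGCGGGGAACAT"  # B18, D1, all E. coli

for name, seq in [("rep1-mu2", cre_rep1_mu2), ("rep1-mu3", cre_rep1_mu3)]:
    print(name, point_mutations(cre_rep1, seq))
```

On these sequences, rep1-mu3 differs from rep1 at a single position, while rep1-mu2 carries that change plus one more. Differences of this kind are what separate S. boydii subtypes 5 and 9 from the other strains in the table.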
Figure 2.

Hairpin structures of the repeats in S. sonnei. The stem structure of the terminal direct repeat (furthest from the leader sequence) was shorter than the stem structures of the other repeats. (A) The secondary structure of the CRb repeats. (B) The secondary structure of the CRd repeats.
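To make the notion of a stem-loop concrete, the sketch below searches a repeat for its longest perfect inverted repeat, that is, the longest stem that Watson-Crick pairing alone could form. This is a deliberate simplification and not the structure-prediction method used for Figure 2: it ignores G-U wobble pairs, mismatches, bulges, and folding energies, and the function name and `min_loop` parameter are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def longest_stem(seq: str, min_loop: int = 3) -> tuple[int, int, int]:
    """Return (stem length, 0-based start of the 5' arm, loop length).

    Only perfect Watson-Crick pairs are counted, so this is a crude proxy
    for the hairpin a folding program would predict.
    """
    seq = seq.upper()
    n = len(seq)
    best = (0, 0, 0)
    for left in range(n):                             # innermost base of the 5' arm
        for right in range(left + min_loop + 1, n):   # innermost base of the 3' arm
            k = 0
            # Extend the stem outward while the bases remain complementary.
            while (
                left - k >= 0
                and right + k < n
                and COMPLEMENT.get(seq[left - k]) == seq[right + k]
            ):
                k += 1
            if k > best[0]:
                best = (k, left - k + 1, right - left - 1)
    return best


# CRe repeat of S. sonnei as listed in Table 2.
stem_len, stem_start, loop_len = longest_stem("ATGGTTATCCCCGCTGGCGCGGGGAACAT")
print(stem_len, stem_start, loop_len)
```

Applying the same function to a consensus repeat and to the terminal repeat of the same locus would give a rough comparison of stem lengths, but any conclusion about crRNA processing should rest on a proper folding analysis.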
We also compared the Shigella spacer sequences with the genomes of 18 standard Shigella strains, 4207 plasmids, and 522 phage sequences available from GenBank (Table 3 and Supplementary Table S1). The results showed that all 31 spacers showed similarity to plasmid and phage sequences of diverse species, and 24 of the 31 spacers (77%) showed similarity to Shigella chromosomal sequences. Among CRISPR spacers that were homologous to phage sequences, 26 (83%) showed similarity to phage sequences from enterobacteria, while 14 (45%) of the spacers with similarities to plasmid sequences showed similarity to enterobacterial plasmids. We also analyzed the genes potentially targeted by the spacers. Multiple potential target genes were found in the Shigella chromosome, including ybbX, glgP, and nrdI (Table 3). One spacer, CRd_sp5, shared a region of homology with the mobA gene, which encodes a DNA strand transferase. This gene plays a role in the conjugal transfer of multiple E. coli and S. sonnei plasmids, such as the colicin plasmid pDPT3 and virulence plasmid pSSE.38-40 The results also showed that more than one spacer shared regions of homology with Stx2 (Shiga toxin 2)-converting phage I and enterobacterial phage T4. It was previously shown that matches within the seed sequence region between spacer and proto-spacer are sufficient for the function of CRISPR systems.41,42 Among all homologous regions in our research, several contained the complete seed region, while others contained part of the seed region or none of it.
Table 3.
Homology analysis of Shigella spacers (partial results)
| Location | Query | Gi | Species | Genes | Identities (%) | Alignment length (bp) | Seed |
|---|---|---|---|---|---|---|---|
| Chromosome | SH_CRb_sp1 | 344915202 | Shigella flexneri 2a str. 301 | folK | 100 | 15 | N |
| Chromosome | SH_CRb_sp2 | 74310614 | Shigella sonnei Ss046 | ybbX | 100 | 14 | Y |
| Chromosome | SH_CRd_sp1 | 74310614 | Shigella sonnei Ss046 | glgP | 94.44 | 18 | Y |
| Chromosome | SH_CRd_sp9 | 82775382 | Shigella dysenteriae Sd197 | pyrG | 100 | 14 | N |
| Chromosome | SH_CRe_sp1 | 74310614 | Shigella sonnei Ss046 | yfcU | 100 | 14 | Y/N |
| Chromosome | SH_CRe_sp3 | 82542618 | Shigella boydii Sb227 | yfeA | 100 | 14 | N |
| Chromosome | SH_CRe_sp4 | 82775382 | Shigella dysenteriae Sd197 | nrdI | 100 | 14 | Y |
| Chromosome | SH_CRe_sp6 | 82775382 | Shigella dysenteriae Sd197 | metR | 100 | 14 | N |
| Chromosome | SH_CRc_sp1 | 110804074 | Shigella flexneri 5 str. 8401 | nrdB | 95.24 | 42 | - |
| Chromosome | SH_CRf_sp1 | 384541581 | Shigella flexneri 2002017 | zntA | 94.44 | 18 | - |
| Plasmid | SH_CRd_sp5 | 459466044 | Shigella sonnei plasmid pDPT3 | mobA | 94.74 | 19 | N |
| Plasmid | SH_CRd_sp5 | 204309817 | Escherichia coli plasmid pO26-S4 | mobA | 94.74 | 19 | N |
| Plasmid | SH_CRd_sp5 | 190410160 | Escherichia coli plasmid pIGMS5 | mobA | 94.74 | 19 | N |
| Phage | SH_CRb_sp3 | 20065797 | Stx2 converting phage I | non_coding_region | 100 | 13 | Y/N |
| Phage | SH_CRe_sp5 | 20065797 | Stx2 converting phage I | non_coding_region | 100 | 13 | Y/N |
| Phage | SH_CRe_sp8 | 116221992 | Stx2-converting phage 86 | Stx2-86_gp30 | 100 | 13 | N |
| Phage | SH_CRd_sp11 | 20065797 | Stx2 converting phage I | non_coding_region | 100 | 13 | N |
| Phage | SH_CRd_sp10 | 45686283 | Enterobacteria phage T1 | T1p48 | 100 | 14 | N |
| Phage | SH_CRe_sp11 | 17570786 | Enterobacteria phage T3 | gene 1 | 100 | 14 | Y/N |
| Phage | SH_CRe_sp7 | 29366675 | Enterobacteria phage T4 | gene 10 | 100 | 13 | Y/N |
| Phage | SH_CRe_sp2 | 29366675 | Enterobacteria phage T4 | gene sp | 100 | 13 | N |
Complete and detailed homology analysis results are listed in Supplementary Table S1.
GI: serial number of species genome; gene: genes potentially targeted by the spacers. Whether the homologous sequence contains the seed sequence is also indicated: Y, located completely within the seed region; Y/N, located across both seed region and non-seed region; N, outside of the seed region.
The seed sequences of the CRa, CRc, and CRf loci have not been reported.
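The Seed column of Table 3 and the identity percentages can both be reproduced with simple arithmetic. The sketch below is a hedged illustration only. It reads “Y” as meaning that the homologous region covers the whole seed, which is how the Results text describes it. The seed coordinates (spacer positions 1 to 8) are a placeholder assumption, not values given in this study, and both function names are hypothetical.

```python
def seed_category(aln_start: int, aln_end: int,
                  seed_start: int = 1, seed_end: int = 8) -> str:
    """Classify a homologous region relative to the seed.

    Coordinates are 1-based and inclusive on the spacer. The default seed
    interval (1-8) is an assumed placeholder.
    """
    if aln_start > aln_end or seed_start > seed_end:
        raise ValueError("start must not exceed end")
    overlap = min(aln_end, seed_end) - max(aln_start, seed_start) + 1
    if overlap <= 0:
        return "N"        # homologous region lies outside the seed
    if overlap == seed_end - seed_start + 1:
        return "Y"        # homologous region covers the complete seed
    return "Y/N"          # homologous region covers only part of the seed


def mismatch_count(identity_pct: float, aln_length: int) -> int:
    """Number of mismatches implied by a percent identity and alignment length."""
    return round(aln_length * (1 - identity_pct / 100))


# Hypothetical alignment coordinates on a 32-bp spacer.
print(seed_category(1, 14), seed_category(5, 18), seed_category(9, 22))
# Identity values taken from Table 3.
print(mismatch_count(94.44, 18), mismatch_count(94.74, 19), mismatch_count(95.24, 42))
```

Under this arithmetic, the 94.44% and 94.74% identities in Table 3 each correspond to a single mismatch, and the 95.24% identity over 42 bp corresponds to two.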
Comparative analysis with E. coli
Based on the evolutionary relationship between E. coli and Shigella, it was hypothesized that they would share a conserved CRISPR locus. To investigate this, 13 E. coli strains were used as templates for PCR, along with the CRISPR primers used for the Shigella strains. The CRa, CRb, CRc, CRd, and CRe loci were found in E. coli, with the CRa and CRc genotypes being almost identical to those observed in Shigella. The CRb, CRd, and CRe loci in E. coli contained a higher degree of polymorphism than in Shigella species (Fig. 1B). The Shigella spacers CRb-s2, CRd-s1, CRd-s2, CRe-s1, CRe-s2, CRe-s4, and CRe-s5 were present in E. coli, along with novel spacers not detected in Shigella. None of the E. coli strains tested contained the CRf locus, whereas this locus was present in all of the S. sonnei, S. flexneri, and S. boydii strains examined, as well as in the 2 sequenced S. dysenteriae strains. This suggested that the CRf locus can be used in the identification of Shigella species; however, the absence of this locus cannot rule out S. dysenteriae, because our results show that strains of this group may lack it.
The coexistence of repeats and spacers in the CRa, CRb, CRc, CRd, and CRe loci is further evidence of the evolutionary relatedness of Shigella species and E. coli. To investigate this evolutionary relatedness further, phylogenetic analysis of the CRISPR loci of Shigella subtypes and E. coli was conducted (Fig. 3). Three main clusters were evident on the resulting dendrogram. The first cluster mainly included S. flexneri and S. boydii strains. S. boydii subtypes 5, 9, and 13 constituted their own individual branches, independent of the other subtypes of this species. S. flexneri subtype 6 also occupied its own branch, distinct from the other S. flexneri subtypes, but within a cluster that included the 9 S. boydii subtypes 1, 2, 3, 4, 8, 10, 12, 14, and 18. S. dysenteriae and S. sonnei strains were located within the second and third clusters, respectively, along with the E. coli strains. These results suggested that S. sonnei, S. dysenteriae, and E. coli were more closely related to one another than to the other organisms, and that some subtypes were separated from the main cluster groups.
Figure 3.

Dendrogram based on the R-S arrays of Shigella spp. and E. coli. All the spacers of the 6 R-S arrays were used to construct this dendrogram. Sd, Sf, Sb, and Ss represent S. dysenteriae, S. flexneri, S. boydii, and S. sonnei, respectively. “*” for F2b represents the S. flexneri strain that lacks one spacer (s1) in the CRb locus. “*” for Ss-1 and Ss-2 represents the 67 S. sonnei strains and the 4 S. sonnei strains that lack one spacer (s2) in the CRd locus, respectively.
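The caption states that the dendrogram was built from the spacers of all 6 R-S arrays but does not give the distance measure. One common approach, shown here purely as an assumption, is to treat each strain as a set of spacers and compute pairwise Jaccard distances, which a clustering method such as UPGMA could then turn into a tree. The function names and the toy spacer profiles below are hypothetical and do not reproduce the study's data.

```python
from itertools import combinations


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 minus the fraction of spacers shared by two strains (0 = identical sets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Pairwise Jaccard distances keyed by alphabetically ordered strain pairs."""
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


# Toy spacer profiles (hypothetical): Ss-2 lacks CRd-s2, as described for
# 4 S. sonnei strains; "CRd-e1" stands in for an E. coli-specific spacer.
profiles = {
    "Ss-1": {"CRd-s1", "CRd-s2", "CRd-s3"},
    "Ss-2": {"CRd-s1", "CRd-s3"},
    "Ec": {"CRd-s1", "CRd-e1"},
}
for pair, dist in distance_matrix(profiles).items():
    print(pair, round(dist, 3))
```

In this toy example the two S. sonnei profiles are closer to each other than either is to the E. coli profile, which is the kind of signal the dendrogram summarizes across all loci.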
The cas genes from the E. coli and Shigella strains were then screened using cas gene-specific primers. Among the Shigella species, CRd was the only locus flanked by the cas genes. In S. dysenteriae subtype 1 and S. boydii subtypes 5 and 9 strains, the arrangement of the cas genes at the CRd locus was consistent with that of the CRISPR2.1/CAS-E locus in E. coli (i.e., cas2-cas1-cse3-cas5e-cse4-cse2-cse1-cas3), and was similarly located next to the iap genes.43 In S. sonnei strains, 2 insertion sequences, IS600 and ISSfl2, were present in the cse2 and cse3 genes. In S. flexneri, only a partial cas3 gene was identified, and no other cas genes were detected. In S. boydii, the cas gene sequences were also incomplete, with the exception of subtypes 5 and 9 (Fig. 4).
Figure 4.

cas gene arrangement of CRd in Shigella spp. The arrangement of the cas genes in S. dysenteriae subtype 1 and S. boydii subtypes 5 and 9 was consistent with the CRISPR2.1/CAS-E locus in E. coli, while 2 IS elements (red rectangles) were inserted among the cas genes in S. sonnei and only a partial cas3 gene was present in S. flexneri.
Comparative analysis with other bacterial species
By analyzing the information in the CRISPR database, 21 standard strains containing CRISPR loci with the same repeat sequence as the Shigella strains were identified (Table 4 and Table S2). The repeats were present within the CRb, CRc, and CRd/e loci. Only E. coli standard strains contained all 3 types of Shigella repeats, while Klebsiella oxytoca KCTC 1686 contained 2 types of repeats (CRb and CRd/e). The other standard strains contained only one type of repeat consistent with Shigella. Among them, Salmonella strains contained the greatest number of CRISPR loci (53 loci) with the CRd repeat (Table S2). Interestingly, the usual hosts of these standard strains belong to different biological kingdoms. Most species are human pathogens, while Dickeya, Pectobacterium, and Erwinia are plant pathogens, and Photorhabdus is an insect pathogen.44 Furthermore, the majority of the strains belong to the family Enterobacteriaceae, except one strain, Legionella pneumophila, which belongs to the Legionellaceae. L. pneumophila is a cause of human pneumonia and is ubiquitous in the environment, being found in freshwater and soil.45 Comparative analysis of spacer sequences was also conducted among the species that contained the same repeats. The results showed that, with the exception of a few E. coli strains, almost none of the standard strains contained the same spacers as Shigella.
Table 4.
Standard strains containing the same repeat as Shigella
| ID | Species of standard strains | Relative number | Usual host |
|---|---|---|---|
| CRb_rep | Escherichia coli | 17/57 | Human |
| CRb_rep | Legionella pneumophila | 1/10 | Human |
| CRb_rep | Enterobacter sakazakii | 1/1 | Human |
| CRb_rep | Klebsiella oxytoca | 1/2 | Human |
| CRb_rep | Rahnella aquatilis | 1/2 | Human |
| CRb_rep | Cronobacter sakazakii | 1/3 | Human |
| CRb_rep | Enterobacteriaceae bacterium | 1/1 | Human |
| CRb_rep | Serratia marcescens | 1/2 | Human |
| CRb_rep | Erwinia carotovora | 1/1 | Plant |
| CRb_rep | Erwinia amylovora | 1/3 | Plant |
| CRb_rep | Enterobacter sp. | 1/2 | Plant |
| CRb_rep | Photorhabdus luminescens | 1/1 | Insect |
| CRb_rep | Photorhabdus asymbiotica | 1/1 | Insect |
| CRc_rep | Escherichia coli | 12/57 | Human |
| CRd/e_rep | Escherichia coli | 25/57 | Human |
| CRd/e_rep | Salmonella | 25/45 | Human |
| CRd/e_rep | Cronobacter turicensis | 1/1 | Human |
| CRd/e_rep | Citrobacter rodentium | 1/1 | Human |
| CRd/e_rep | Klebsiella oxytoca | 2/2 | Human |
| CRd/e_rep | Escherichia blattae | 1/1 | Human |
| CRd/e_rep | Dickeya dadantii | 1/3 | Plant |
| CRd/e_rep | Pectobacterium wasabiae | 1/1 | Plant |
| CRd/e_rep | Pectobacterium carotovorum | 1/2 | Plant |
The relative number represents the number of standard strains containing the same direct repeat as Shigella compared with the number of strains of the same species listed in the CRISPRdb. Detailed results are displayed in Supplementary Table S2.
Discussion
The conservation of the CRISPR system makes it useful for typing and detection purposes. Based on CRISPR polymorphisms within enterohemorrhagic E. coli strains, Delannoy and colleagues developed a typing profile that was more specific than previously established typing methods, which were based on stx and eae gene polymorphisms alone or together with O:H serotypes.33 In this study, the potential application of CRISPR for the typing of Shigella species was investigated. Comparative analysis of CRISPR sequences between different Shigella subtypes indicated that the 6 CRISPR loci are distributed differently. Despite some CRISPR loci only containing imperfect R-S units or one repeat, it is still possible to detect Shigella strains and discriminate their subtypes based on conserved gene arrangements or the distribution of CRISPR loci in the corresponding subtype. Thus, if required, CRISPR could be used to detect and discriminate isolates during a Shigella outbreak or epidemic. Sequence analysis of conserved point mutations within the repeats and spacer sequences is another potential method of discrimination. Therefore, CRISPR polymorphisms could be used for the serological distinction of Shigella subtypes. However, this method may not be specific enough to distinguish each subtype because of the large variety of Shigella serotypes, such as is seen for S. flexneri. In addition, because of the limited number of S. dysenteriae isolates available for our study, future research is required to explore the practicability of this method for distinguishing S. dysenteriae subtypes.
Mutations within terminal repeat regions in diverse CRISPR systems have been reported, including point mutations within terminal repeats in the subtype I-E CRISPR-Cas system of E. coli.35,46,47 The secondary structure of the repeats has been analyzed in previous reports, particularly in type I CRISPR systems.46,48-50 This study also identified mutations within the terminal repeats. Based on this finding, we analyzed the corresponding hairpin structures and determined that the stems of the mutated sequences were shorter than those formed by the consensus direct repeats. In addition, we determined that E. coli and Shigella contain identical terminal repeat sequences, so their shortened stem-loops are consistent (Fig. 2). It has been reported that the stable stem-loop fold formed by the transcript of a single palindromic repeat can facilitate recognition by RNA-binding Cas proteins.27 It also includes an endonuclease cleavage site used in the process of CRISPR RNA formation. In general, the pre-CRISPR RNA cleavage site appears to be located immediately upstream of the 3′ terminal base of the stem-loop formed by the repeat.48 The terminal repeat, being the initial repeat next to the first acquired spacer, has a different stem-loop structure from the other repeats analyzed in this study. We therefore speculate that the terminal repeat has a distinct recognition mechanism or acts as a terminator during the transcription of R-S arrays.
Spacers that are identical to known sequences provide useful information because similarities between spacer and proto-spacer sequences can suggest the probable origin of the spacers.51,52 We therefore performed homology analysis to detect the targets of CRISPR. It should be noted that the current study did not determine whether the potential targets identified in our search are true spacer origins or would support CRISPR interference, as most of the homologous regions cover only part of the spacer sequence or lie outside the seed regions. Even so, we cannot rule out a relationship between the spacers and the potential targets, because mutations may have occurred in the proto-spacer regions after the acquisition of the CRISPR spacers.23 Escape mutations are particularly likely in phage sequences, as CRISPR is one of the fundamental drivers of phage evolution.53,54 Therefore, the data can still help to infer the likely origins of the spacers. In this study, a Shigella spacer was found to be associated with the transferase gene mobA, which is located in multiple plasmids (Table 3). It has been reported that spacers of many bacteria match multiple proto-spacers located on plasmids, especially on conjugative plasmids.55-58 This kind of association is consistent with the frequent horizontal transfer of plasmids. Moreover, some spacers shared homologous sequences with Stx2 (Shiga toxin 2)-converting phage I and enterobacterial phage T4. Bacteriophages, such as Stx1- and Stx2-converting phages, are thought to play an important role in the horizontal transfer of virulence genes.59,60 The acquisition of stx genes via Shiga toxin-converting bacteriophages in clinical Shigella isolates such as S. dysenteriae 4,61 S. flexneri 2,62 and S. sonnei63 suggests natural dissemination of the stx genes.
The homology detected between spacer and Stx-converting phage sequences therefore indicates that CRISPR may play a regulatory role in the transmission of virulence factors by controlling the access of lysogenic phage in Shigella species. Although most of these homologous spacers are located in the common host S. dysenteriae, the homologous regions in the spacers do not contain the entire seed sequence, which suggests that the current CRISPR system possibly cannot perform its interference function. We therefore speculate that this regulatory function was active in ancestral strains, but that the Stx-converting phage acquired mutations in the proto-spacer (especially in the seed sequence region), allowing it to escape the control of the CRISPR system. Furthermore, several reports have shown that multiple spacers can share homology with the same phage.51,52,64 In this study, a similar phenomenon was detected, whereby more than one spacer from the same or different loci shared a homologous fragment that was also present in a phage (such as CRb_sp3, CRe_sp5, CRd_sp11, and CRe_sp8 with stx2-converting phage) (Table 3). This phenomenon suggests that bacteria may mediate the regulatory or defense function of CRISPR by increasing the number of CRISPR targets present, thereby enhancing their efficiency.
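The seed argument above can be illustrated with a short Python sketch. It assumes the seed definition reported for the type I-E system (positions 1-5, 7, and 8 of the proto-spacer, counted from the PAM-proximal end; see ref. 41). The function names and the 12-nt toy sequences are hypothetical and are not taken from Table 3.

```python
# Seed positions (1-based, PAM-proximal end first). This is an assumption
# based on the type I-E literature; adjust for other CRISPR-Cas subtypes.
SEED_POSITIONS = (1, 2, 3, 4, 5, 7, 8)


def mismatch_positions(spacer: str, protospacer: str) -> list[int]:
    """Return the 1-based positions where two equal-length sequences differ."""
    if len(spacer) != len(protospacer):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(spacer.upper(), protospacer.upper())
    return [i for i, (a, b) in enumerate(pairs, start=1) if a != b]


def seed_is_intact(spacer: str, protospacer: str) -> bool:
    """True when no mismatch falls on an assumed seed position."""
    return not set(mismatch_positions(spacer, protospacer)) & set(SEED_POSITIONS)


# Toy sequences (hypothetical, for illustration only).
spacer = "ACGTTGCAAGCT"
escape_mutant = "ACCTTGCAAGCT"  # differs at position 3, inside the seed
tolerated = "ACGTTGCAAGGT"      # differs at position 11, outside the seed

print(seed_is_intact(spacer, escape_mutant))
print(seed_is_intact(spacer, tolerated))
```

Under this simplified model, a single substitution inside the seed is enough to flag a proto-spacer as a likely escape variant, whereas a substitution outside the seed is not.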
Barrangou and colleagues showed that the mechanism by which new spacers are acquired operates by the independent addition of each new R-S unit to the leader-proximal end of the R-S array.7 In this study, comparison of the CRISPR loci in different Shigella subtypes revealed that in the CRd locus, a section of the R-S unit is replaced with a 28-bp fragment (R3), while in the CRa locus, the entire R-S array is replaced with a 54-bp sequence (R1). Touchon et al. regarded the 461-bp fragment in CRe as a large spacer in their report.32 However, based on its location and length, we are inclined to believe that this region (R4) is also a replacement fragment. These likely represent genetic recombination events directed by the bacteria rather than the regular gain and loss of R-S units. All of the genetic diversity detected, such as replacements, deletions (cas3 in S. flexneri), and the incorporation of IS elements, indicates the instability of Shigella CRISPR regions. Yang and colleagues reported that the proportion of IS elements in Shigella species is higher than in E. coli,65 suggesting more active genomic rearrangement in Shigella species. These genetic rearrangements and deletions may be a response to stress or a complex environment, and aid in adaptation. Early comparative analyses of CRISPR spacers revealed sequence homology not only to “non-self” DNA of mobile genetic elements, but occasionally also to “self” endogenous chromosomal DNA,51 which is consistent with the situation in Shigella (Table 3).
Although the presence of self-targeting spacers has been suggested to allow control of gene expression, these spacers may also lead to cytotoxicity because of the cleavage function of CRISPR systems.26 Stern and colleagues proposed that the acquisition of self-targeting spacers is harmful to the stability of the host genome, and that self-targeting should be regarded as a form of autoimmunity.25 Furthermore, previous studies have demonstrated that an imperfect match between the spacer and proto-spacer regions can still elicit the function of the CRISPR-Cas system.41,66 Through this loose selectivity, CRISPR-Cas can apparently detect unrelated elements that share weak sequence identity, which increases the possibility of autoimmunity. It is therefore possible that in the presence of a complete and active CRISPR-Cas system, the acquisition of non-self-targeting spacers helps to defend against foreign elements. However, some new spacers could be detrimental to genome integrity because of partial homology with genome sequences. Accordingly, Shigella strains may have undergone genome rearrangements to silence the CRISPR system. Thus, the presence of insertion sequences and the deletion of some CRISPR loci and cas genes are more likely representative of a CRISPR-Cas system that is degrading.
Historically, Shigella was first described as Bacillus dysenteriae and was identified as the cause of dysentery. It was clearly related to Bacillus (now Escherichia) coli but was given a different name because B. coli was known as a commensal organism.67 Many different factors, such as nucleotide similarity, 16S rRNA gene sequence, and specific virulence genes, indicate the high degree of relatedness between Shigella species and E. coli. It has been proposed that they be classified as one species in the genus Escherichia, or even that Shigella species be treated as pathogenic forms of E. coli.2,68 However, it now seems clear that the different subtypes of Shigella species should not be categorized as a discrete group within E. coli, and that these organisms do not share a single evolutionary origin. Pupo and colleagues analyzed the evolutionary relationship between Shigella and E. coli by sequencing 8 housekeeping genes from 4 regions of the chromosome. They showed that S. sonnei and S. dysenteriae strains did not fall within the Shigella clusters, but instead clustered with E. coli.69 Our results and those of several other studies also indicated that some Shigella subtypes were located close to or within the E. coli cluster, while several subtypes, such as S. boydii 13 and S. flexneri 6, clustered independently from the other groups.70 Thus, the phenotypic properties of Shigella species reflect convergent evolution, but the 4 generally recognized subgroups of Shigella do not represent natural groupings. Touchon and colleagues reported that the genotype of the CRISPR locus in E. coli species is associated with the multilocus sequence typing phylogenetic clusters.32 Our results confirmed that it is feasible to conduct phylogenetic analyses and detect evolutionary distance based on the CRISPR loci of closely related species. However, this method only applies to bacteria with relatively conserved CRISPR systems, such as the Enterobacteriaceae.
Aside from its conventional vertical inheritance, the CRISPR system appears to propagate extensively by horizontal gene transfer. Plasmids, phages, and other mobile elements can harbor and transport CRISPR loci between different bacteria. As a result, the same CRISPR system subtypes can be found in divergent species with phylogenetically distant genomes.43,71 Here, 21 distinct species were identified as having the same repeat pattern within their CRISPR systems as the Shigella species, with 20 of them belonging to the family Enterobacteriaceae. Among these species, Dickeya dadantii, Pectobacterium wasabiae, and Erwinia amylovora are plant pathogens, while Photorhabdus luminescens and Photorhabdus asymbiotica are lethal pathogens of insects. It is interesting that the same repeat sequence is present in different pathogens infecting hosts from different biological kingdoms. This finding indicates that these species may be more closely related within the large Enterobacteriaceae family. The possibility of horizontal CRISPR transfer cannot be ruled out; however, considering that these strains belong to the same family, vertical inheritance of the CRISPR system seems more likely. One exception was identified: L. pneumophila, a ubiquitous environmental bacterium belonging to the family Legionellaceae. The lack of a reported evolutionary relationship between L. pneumophila and Shigella (or the Enterobacteriaceae generally) suggests that their ancestral species may have shared an environment at some stage of evolution, when horizontal transfer of CRISPR may have occurred. Therefore, the presence of the same CRISPR repeat in L. pneumophila and Shigella provides new evidence of horizontal transfer of the CRISPR system. In addition, this CRISPR region is present on an L. pneumophila plasmid, pLPL,72 which suggests that horizontal transfer of CRISPR is more likely to be mediated by plasmids.
CRISPR can also be used as an informative locus to compare the evolutionary distance between related species. However, this method can only serve as an auxiliary reference because horizontal transfer may have occurred between species. Therefore, this should be carried out in conjunction with phylogenetic analysis of the whole genome sequence to provide insight into the evolutionary relationship.
Taken together, our findings confirm that CRISPR is valuable for the detection and comparative analyses of Shigella strains. Based on the findings, CRISPR loci within Shigella species are relatively well conserved, allowing for the rapid discrimination of groups and partial subtypes. This straightforward screening method may prove helpful in the rapid detection of Shigella during an outbreak, thereby allowing for the implementation of effective intervention strategies. In addition, CRISPR analysis is an effective tool with which to study the evolutionary relationships among Shigella species, E. coli, and other bacteria, potentially providing important information regarding the molecular basis for the emergence of pathogenicity in these organisms. Research into the regulatory functions of CRISPR is still in its infancy. The CRISPR analysis of Shigella presented in this study revealed some new insights into the CRISPR sequence structure, which will contribute toward a more comprehensive understanding of the CRISPR system.
Materials and Methods
Bacterial isolates and DNA extraction
A summary of the 237 Shigella isolates analyzed in this study is provided in Table 1. All of the Shigella and E. coli isolates, which were collected by the Beijing, Shanghai, Xinjiang, and Shenyang Centers for Disease Prevention and Control in China, were identified by bacterial serology methods (Table S3). All isolates were stored at −80°C in 20% glycerol. When needed, isolates were grown overnight in Luria-Bertani liquid medium at 37°C. DNA was extracted using a Tiangen microbial DNA extraction kit (Tiangen) and stored at −20°C until use.
PCR amplification
Six CRISPR loci were identified among the 4 Shigella species groups based on data from the CRISPR database (CRISPRdb), and were designated CRa, CRb, CRc, CRd, CRe, and CRf (Fig. 1). These 6 CRISPR loci were amplified using primer pairs that targeted the regions flanking the CRISPR loci, while the cas gene sequence was amplified using primers CAS-A and CAS-B (Table 5). The reactions were performed in a 50-μl volume containing 10 ng of DNA template, 0.5 mM of each primer, 1 unit of ExTaq DNA polymerase (Takara), 200 mM dNTPs, and 10× PCR buffer (containing 500 mM KCl, 0.1 M Tris-HCl (pH 8.3), and 25 mM MgCl2). The cycling conditions were 95°C for 5 min, followed by 30 cycles of denaturation at 95°C for 30 s, annealing at 56°C for 30 s, and extension at 72°C for 30-60 s, with a final extension at 72°C for 10 min in an Applied Biosystems PCR cycler. Sequences were assembled and edited using the SeqMan module of the DNAstar package (DNAstar Inc.).
Table 5. Primers used for amplification of CRISPR loci and cas genes
| Position | Name | Primer sequence (5′–3′) |
|---|---|---|
| CRa | CRa-F | ATTAGTCGGCGTAAGAAAGA |
| CRa | CRa-R | GAACAGCGTGATTATGGATG |
| CRb | CRb-F | TTGTYAGGTAGGTTGGTGAAG |
| CRb | CRb-R | GCGAAGAGAAAGAACGAGTA |
| CRc | CRc-F | ATCTCTGCTAACACCAACTAC |
| CRc | CRc-R | CTACGACCCTGAATGGAATC |
| CRd | CRd-F | AGCGACTAACTGGAATCTTG |
| CRd | CRd-R | CAATCTGGCTACTGGAAGTG |
| CRe | CRe-F | CGATCCAGAGCTGGTCGAATG |
| CRe | CRe-R | AGTGCTCTTTAACATAATGGATG |
| CRf | CRf-F | GTCGGATCAAGGCTAAGTATA |
| CRf | CRf-R | GTCTCATCAATCAGTTCAGTG |
| CAS | CAS-A | CGTAACCCATCCAAATCC |
| CAS | CAS-B | CGAAGAAGTAGCCACCAC |
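Primer CRb-F contains the IUPAC degenerate base Y (C or T), so the synthesized oligonucleotide is a mixture of two sequences. The following Python sketch, which uses only the standard library, expands a degenerate primer into its concrete variants and reports their GC content. The helper names are hypothetical, and this check is an illustration rather than part of the workflow described here.

```python
from itertools import product

# Subset of the IUPAC nucleotide code table, sufficient for the primers in Table 5.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_primer(primer: str) -> list[str]:
    """List every concrete sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-degenerate sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


crb_f = "TTGTYAGGTAGGTTGGTGAAG"  # CRb-F as listed in Table 5
variants = expand_primer(crb_f)
for variant in variants:
    print(variant, f"GC={gc_fraction(variant):.2f}")
```

Listing the variants explicitly makes it easier to verify each one against the flanking sequence of the target locus in a given reference genome.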
Sequence polymorphism analyses
Sequences were aligned and nucleotide polymorphisms were identified using MEGA 5.05 software (http://www.megasoftware.net/). Microbial genome sequences were obtained from NCBI (http://www.ncbi.nlm.nih.gov/). Previously published CRISPR sequences from Shigella species were retrieved from the CRISPRdb and NCBI. In addition, the CRISPR database and its CRISPR identification application, CRISPRFinder, were used to find and retrieve CRISPR repeat and spacer sequences. The secondary structure of each of the repeats was analyzed using the RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi).
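RNAfold predicts minimum free energy structures. As a much cruder stand-in, the Python sketch below only searches for the longest perfect Watson-Crick stem in a repeat transcript, which is enough to illustrate how a point mutation within a palindromic repeat can shorten the stem. G-U wobble pairs, bulges, and thermodynamics are ignored. The function name, the minimum loop size of 3 nt, and both sequences are assumptions for illustration and are not the repeats shown in Fig. 2.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_stem(rna: str, min_loop: int = 3) -> tuple[int, int, int]:
    """Return (stem_length, start_of_5prime_arm, end_of_3prime_arm), 0-based.

    Only perfect, uninterrupted Watson-Crick helices are considered.
    """
    rna = rna.upper().replace("T", "U")
    best = (0, -1, -1)
    n = len(rna)
    for i in range(n):
        for j in range(i + 1, n):
            k = 0
            # Extend the helix inward while the bases pair and the
            # enclosed loop keeps at least min_loop unpaired bases.
            while (j - k) - (i + k) - 1 >= min_loop and (rna[i + k], rna[j - k]) in PAIRS:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best


# Illustrative 29-nt repeat-like sequence and a single-base variant (hypothetical).
consensus_like = "GAGUUCCCCGCGCCAGCGGGGAUAAACCG"
mutated = "GAGUUCCUCGCGCCAGCGGGGAUAAACCG"  # C to U at 0-based index 7

print(longest_stem(consensus_like)[0], longest_stem(mutated)[0])
```

A dedicated folding tool remains necessary for any real analysis, because the most stable structure is not always the one containing the longest perfect helix.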
Sequence computational analyses
For sequence similarity analysis, spacer sequences were compared with all publicly available sequences using the BLASTn search tool (http://blast.ncbi.nlm.nih.gov). The genome sequences of 522 phages, 4207 plasmids, and 18 Shigella strains were downloaded from NCBI. All spacer sequences were searched against local databases constructed from these genomes using the BLASTn search tool with default parameters, except for a word size of 7. An evolutionary dendrogram for the Shigella and E. coli strains examined in this study was generated using the BioNumerics software (Applied Maths) based on spacer arrangements within all 6 CRISPR loci.
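The similarity coefficient used in BioNumerics is not stated here. As a rough illustration only, the Python sketch below encodes each strain as the set of spacers it carries and computes pairwise Jaccard distances, which could then be passed to any hierarchical clustering routine. The choice of Jaccard distance, the strain names, and the spacer labels are all assumptions. A set-based encoding also ignores spacer order, so it captures spacer content rather than the full arrangement.

```python
from itertools import combinations


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A & B| / |A | B|; two empty profiles are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_table(profiles: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Pairwise distances keyed by (strain_1, strain_2) in sorted name order."""
    return {
        (s1, s2): jaccard_distance(profiles[s1], profiles[s2])
        for s1, s2 in combinations(sorted(profiles), 2)
    }


# Hypothetical spacer profiles; labels mimic the locus_spacer naming used in the text.
profiles = {
    "strain_A": {"CRb_sp1", "CRb_sp2", "CRd_sp1"},
    "strain_B": {"CRb_sp1", "CRb_sp2", "CRd_sp1"},
    "strain_C": {"CRb_sp1", "CRe_sp5"},
}
table = distance_table(profiles)
for pair, dist in table.items():
    print(pair, round(dist, 2))
```

Strains with identical spacer content are at distance 0, and the distance grows as fewer spacers are shared, which matches the intuition behind grouping strains by CRISPR profile.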
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Funding
This work was supported by the Mega-projects of Science and Technology Research (grant numbers 2013ZX10004607, 2012ZX10004215), the National Natural Science Foundation of China (grant numbers 81371854, 81373053), and the Beijing Science and Technology Nova program (grant number xx2013061).
Supplemental Material
Supplemental data for this article can be accessed on the publisher's website.
References
- 1.Kotloff K, Winickoff J, Ivanoff B, Clemens JD, Swerdlow D, Sansonetti P, Adak G, Levine M. Global burden of Shigella infections: implications for vaccine development and implementation of control strategies. Bull World Health Organ 1999; 77:651-66; PMID:10516787 [PMC free article] [PubMed] [Google Scholar]
- 2.Lan R, Lumb B, Ryan D, Reeves PR. Molecular Evolution of Large Virulence Plasmid in Shigella Clones and Enteroinvasive Escherichia coli. Infect Immun 2001; 69:6303-9; PMID:11553574; http://dx.doi.org/ 10.1128/IAI.69.10.6303-6309.2001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Zhang Y, Lin K. A phylogenomic analysis of Escherichia coli / Shigella group: implications of genomic features associated with pathogenicity and ecological adaptation. BMC Evol Biol 2012; 12:174; PMID:22958895; http://dx.doi.org/ 10.1186/1471-2148-12-174 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Lan R, Alles MC, Donohoe K, Martinez MB, Reeves PR. Molecular evolutionary relationships of enteroinvasive Escherichia coli and Shigella spp. Infect Immun 2004; 72:5080-8; PMID:15322001; http://dx.doi.org/ 10.1128/IAI.72.9.5080-5088.2004 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Ishino Y, Shinagawa H, Makino K, Amemura M, Nakata A. Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J Bacteriol 1987; 169:5429-33; PMID:3316184 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Grissa I, Vergnaud G, Pourcel C. The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics 2007; 8:172; PMID:17521438; http://dx.doi.org/ 10.1186/1471-2105-8-172 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. CRISPR provides acquired resistance against viruses in prokaryotes. Science 2007; 315:1709-12; PMID:17379808; http://dx.doi.org/ 10.1126/science.1138140 [DOI] [PubMed] [Google Scholar]
- 8.Kamerbeek J, Schouls L, Kolk A, van Agterveld M, van Soolingen D, Kuijper S, Bunschoten A, Molhuizen H, Shaw R, Goyal M, et al.. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol 1997; 35:907-14; PMID:9157152 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Groenen P, Bunschoten AE, Soolingen DV, van Errtbden JD. Nature of DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis; application for strain differentiation by a novel typing method. Mol Microbiol 1993; 10:1057-65; PMID:7934856; http://dx.doi.org/ 10.1111/j.1365-2958.1993.tb00976.x [DOI] [PubMed] [Google Scholar]
- 10.Fabre L, Zhang J, Guigon G, Le Hello S, Guibert V, Accou-Demartin M, De Romans S, Lim C, Roux C, Passet V. CRISPR typing and subtyping for improved laboratory surveillance of Salmonella infections. Plos One 2012; 7:e36995; PMID:22623967; http://dx.doi.org/ 10.1371/journal.pone.0036995 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Vergnaud G, Zhou D, Platonov ME, Pourcel C, Yang R, Anisimov AP, Neubauer H, Balakhonov SV, Rakin A, Dentovskaya SV. Analysis of the three Yersinia pestis CRISPR loci provides new tools for phylogenetic studies and possibly for the investigation of ancient DNA. Adv Exp Med Biol 2007:327-38; PMID:17966429; http://dx.doi.org/ 10.1007/978-0-387-72124-8_30 [DOI] [PubMed] [Google Scholar]
- 12.Li H, Li P, Xie J, Yi S, Yang C, Wang J, Sun J, Liu N, Wang X, Wu Z, et al.. New clustered regularly interspaced short palindromic repeat locus spacer pair typing method based on the newly incorporated spacer for Salmonella enterica. J Clin Microbiol 2014; 52:2955-62; PMID:24899040; http://dx.doi.org/ 10.1128/JCM.00696-14 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Liu F, Kariyawasam S, Jayarao BM, Barrangou R, Gerner-Smidt P, Ribot EM, Knabel SJ, Dudley EG. Subtyping Salmonella enterica serovar enteritidis isolates from different sources by using sequence typing based on virulence genes and clustered regularly interspaced short palindromic repeats (CRISPRs). Appl Environ Microbiol 2011; 77:4520-6; PMID:21571881; http://dx.doi.org/ 10.1128/AEM.00468-11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Shariat N, DiMarzio MJ, Yin S, Dettinger L, Sandt CH, Lute JR, Barrangou R, Dudley EG. The combination of CRISPR-MVLST and PFGE provides increased discriminatory power for differentiating human clinical isolates of Salmonella enterica subsp. enterica serovar Enteritidis. Food Microbiol 2013; 34:164-73; PMID:23498194; http://dx.doi.org/ 10.1016/j.fm.2012.11.012 [DOI] [PubMed] [Google Scholar]
- 15.Pourcel C, Salvignol G, Vergnaud G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 2005; 151:653-63; PMID:15758212; http://dx.doi.org/ 10.1099/mic.0.27437-0 [DOI] [PubMed] [Google Scholar]
- 16.Louwen R, Horst-Kreft D, de Boer A, Van der Graaf L, de Knegt G, Hamersma M, Heikema A, Timms A, Jacobs B, Wagenaar J. A novel link between Campylobacter jejuni bacteriophage defence, virulence and Guillain-Barré syndrome. Eur J Clin Microbiol Infect Dis 2013; 32:207-26; PMID:22945471; http://dx.doi.org/ 10.1007/s10096-012-1733-4 [DOI] [PubMed] [Google Scholar]
- 17.Cady KC, O'Toole GA. Non-identity-mediated CRISPR-bacteriophage interaction mediated via the Csy and Cas3 proteins. J Bacteriol 2011; 193:3433-45; PMID:21398535; http://dx.doi.org/ 10.1128/JB.01411-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Zegans ME, Wagner JC, Cady KC, Murphy DM, Hammond JH, O'Toole GA. Interaction between bacteriophage DMS3 and host CRISPR region inhibits group behaviors of Pseudomonas aeruginosa. J Bacteriol 2009; 191:210-9; PMID:18952788; http://dx.doi.org/ 10.1128/JB.00797-08 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Toledo-Arana A, Dussurget O, Nikitas G, Sesto N, Guet-Revillet H, Balestrino D, Loh E, Gripenland J, Tiensuu T, Vaitkevicius K, et al.. The Listeria transcriptional landscape from saprophytism to virulence. Nature 2009; 459:950-6; PMID:19448609; http://dx.doi.org/ 10.1038/nature08080 [DOI] [PubMed] [Google Scholar]
- 20.Mandin P, Repoila F, Vergassola M, Geissmann T, Cossart P. Identification of new noncoding RNAs in Listeria monocytogenes and prediction of mRNA targets. Nucleic Acids Res 2007; 35:962-74; PMID:17259222; http://dx.doi.org/ 10.1093/nar/gkl1096 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Wei Y, Terns RM, Terns MP. Cas9 function and host genome sampling in Type II-A CRISPR-Cas adaptation. Genes Dev 2015; 29:356-61; PMID:25691466; http://dx.doi.org/ 10.1101/gad.257550.114 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Heler R, Samai P, Modell JW, Weiner C, Goldberg GW, Bikard D, Marraffini LA. Cas9 specifies functional viral targets during CRISPR-Cas adaptation. Nature 2015; 519:199-202; PMID:25707807; http://dx.doi.org/ 10.1038/nature14245 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Dy RL, Pitman AR, Fineran PC. Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria. Mob Genet Elements 2013; 3:e26831; PMID:24251073; http://dx.doi.org/ 10.4161/mge.26831 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Sorek R, Kunin V, Hugenholtz P. CRISPR-a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat Rev Microbiol 2008; 6:181-6; PMID:18157154; http://dx.doi.org/ 10.1038/nrmicro1793 [DOI] [PubMed] [Google Scholar]
- 25.Stern A, Keren L, Wurtzel O, Amitai G, Sorek R. Self-targeting by CRISPR: gene regulation or autoimmunity? Trends Genet 2010; 26:335-40; PMID:20598393; http://dx.doi.org/ 10.1016/j.tig.2010.05.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Vercoe RB, Chang JT, Dy RL, Taylor C, Gristwood T, Clulow JS, Richter C, Przybilski R, Pitman AR, Fineran PC. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands. PLoS Genet 2013; 9:e1003454; PMID:23637624; http://dx.doi.org/ 10.1371/journal.pgen.1003454 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Kunin V, Sorek R, Hugenholtz P. Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biol 2007; 8:R61; PMID:17442114; http://dx.doi.org/ 10.1186/gb-2007-8-4-r61 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Makarova KS, Haft DH, Barrangou R, Brouns SJ, Charpentier E, Horvath P, Moineau S, Mojica FJ, Wolf YI, Yakunin AF, et al.. Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol 2011; 9:467-77; PMID:21552286; http://dx.doi.org/ 10.1038/nrmicro2577 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Diez-Villasenor C, Almendros C, Garcia-Martinez J, Mojica FJ. Diversity of CRISPR loci in Escherichia coli. Microbiol 2010; 156:1351-61; http://dx.doi.org/ 10.1099/mic.0.036046-0 [DOI] [PubMed] [Google Scholar]
- 30.Pougach K, Semenova E, Bogdanova E, Datsenko KA, Djordjevic M, Wanner BL, Severinov K. Transcription, processing and function of CRISPR cassettes in Escherichia coli. Mol Microbiol 2010; 77:1367-79; PMID:20624226; http://dx.doi.org/ 10.1111/j.1365-2958.2010.07265.x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Pul U, Wurm R, Arslan Z, Geissen R, Hofmann N, Wagner R. Identification and characterization of E. coli CRISPR-cas promoters and their silencing by H-NS. Mol Microbiol 2010; 75:1495-512; PMID:20132443; http://dx.doi.org/ 10.1111/j.1365-2958.2010.07073.x [DOI] [PubMed] [Google Scholar]
- 32.Touchon M, Charpentier S, Clermont O, Rocha EP, Denamur E, Branger C. CRISPR distribution within the Escherichia coli species is not suggestive of immunity-associated diversifying selection. J Bacteriol 2011; 193:2460-7; PMID:21421763; http://dx.doi.org/ 10.1128/JB.01307-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Delannoy S, Beutin L, Fach P. Use of clustered regularly interspaced short palindromic repeat sequence polymorphisms for specific detection of enterohemorrhagic Escherichia coli strains of serotypes O26:H11, O45:H2, O103:H2, O111:H8, O121:H19, O145:H28, and O157:H7 by real-time PCR. J Clin Microbiol 2012; 50:4035-40; PMID:23035199; http://dx.doi.org/ 10.1128/JCM.02097-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Delannoy S, Beutin L, Burgos Y, Fach P. Specific detection of enteroaggregative hemorrhagic Escherichia coli O104:H4 strains by use of the CRISPR locus as a target for a diagnostic real-time PCR. J Clin Microbiol 2012; 50:3485-92; PMID:22895033; http://dx.doi.org/ 10.1128/JCM.01656-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Touchon M, Rocha EP. The small, slow and specialized CRISPR and anti-CRISPR of Escherichia and Salmonella. PloS ONE 2010; 5:e11126; PMID:20559554; http://dx.doi.org/ 10.1371/journal.pone.0011126 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Guo X, Wang Y, Duan G, Xue Z, Wang L, Wang P, Qiu S, Xi Y, Yang H. Detection and analysis of CRISPRs of Shigella. Curr Microbiol 2015; 70:85-90; PMID:25199561; http://dx.doi.org/ 10.1007/s00284-014-0683-8 [DOI] [PubMed] [Google Scholar]
- 37.Cui Y, Li Y, Gorge O, Platonov ME, Yan Y, Guo Z, Pourcel C, Dentovskaya SV, Balakhonov SV, Wang X, et al.. Insight into microevolution of Yersinia pestis by clustered regularly interspaced short palindromic repeats. PloS ONE 2008; 3:e2652; PMID:18612419; http://dx.doi.org/ 10.1371/journal.pone.0002652 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Morales M, Attai H, Troy K, Bermudes D. Accumulation of single-stranded DNA in Escherichia coli carrying the colicin plasmid pColE3-CA38. Plasmid 2015; 77:7-16; PMID:25450765; http://dx.doi.org/ 10.1016/j.plasmid.2014.11.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Becker EC, Meyer RJ. Moba, the dna strand transferase of plasmid r1162: the minimal domain required for dna processing at the origin of transfer. J Biol Chem 2002; 277:14575-80; PMID:11839744; http://dx.doi.org/ 10.1074/jbc.M110759200 [DOI] [PubMed] [Google Scholar]
- 40.Jiang Y, Yang F, Zhang X, Yang J, Chen L, Yan Y, Nie H, Xiong Z, Wang J, Dong J, et al.. The complete sequence and analysis of the large virulence plasmid pSS of Shigella sonnei. Plasmid 2005; 54:149-59; PMID:16122562; http://dx.doi.org/ 10.1016/j.plasmid.2005.03.002 [DOI] [PubMed] [Google Scholar]
- 41.Semenova E, Jore MM, Datsenko KA, Semenova A, Westra ER, Wanner B, van der Oost J, Brouns SJ, Severinov K. Interference by clustered regularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence. Proc Natl Acad Sci U S A 2011; 108:10098-103; PMID:21646539; http://dx.doi.org/ 10.1073/pnas.1104144108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Wiedenheft B, van Duijn E, Bultema JB, Waghmare SP, Zhou K, Barendregt A, Westphal W, Heck AJ, Boekema EJ, Dickman MJ, et al.. RNA-guided complex from a bacterial immune system enhances target recognition through seed sequence interactions. Proc Natl Acad Sci U S A 2011; 108:10092-7; PMID:21536913; http://dx.doi.org/ 10.1073/pnas.1102716108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Haft DH, Selengut J, Mongodin EF, Nelson KE. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol 2005; 1:e60; PMID:16292354; http://dx.doi.org/ 10.1371/journal.pcbi.0010060 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Gerrard JG, McNevin S, Alfredson D, Forgan-Smith R, Fraser N. Photorhabdus Species: Bioluminescent Bacteria as Human Pathogens? Emerg Infect Dis 2003; 9:251; PMID:12603999; http://dx.doi.org/ 10.3201/eid0902.020222 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Fields BS, Benson RF, Besser RE. Legionella and Legionnaires' disease: 25 years of investigation. Clin Microbiol Rev 2002; 15:506-26; PMID:12097254; http://dx.doi.org/ 10.1128/CMR.15.3.506-526.2002 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Juranek S, Eban T, Altuvia Y, Brown M, Morozov P, Tuschl T, Margalit H. A genome-wide view of the expression and processing patterns of Thermus thermophilus HB8 CRISPR RNAs. RNA 2012; 18:783-94; PMID:22355165; http://dx.doi.org/ 10.1261/rna.031468.111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Deveau H, Garneau JE, Moineau S. CRISPR/Cas system and its role in phage-bacteria interactions. Annu Rev Microbiol 2010; 64:475-93; PMID:20528693; http://dx.doi.org/ 10.1146/annurev.micro.112408.134123 [DOI] [PubMed] [Google Scholar]
- 48.Brouns SJ, Jore MM, Lundgren M, Westra ER, Slijkhuis RJ, Snijders AP, Dickman MJ, Makarova KS, Koonin EV, van der Oost J. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 2008; 321:960-4; PMID:18703739; http://dx.doi.org/ 10.1126/science.1159689 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Plagens A, Tjaden B, Hagemann A, Randau L, Hensel R. Characterization of the CRISPR/Cas subtype I-A system of the hyperthermophilic crenarchaeon Thermoproteus tenax. J Bacteriol 2012; 194:2491-500; PMID:22408157; http://dx.doi.org/ 10.1128/JB.00206-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Sashital DG, Jinek M, Doudna JA. An RNA-induced conformational change required for CRISPR RNA cleavage by the endoribonuclease Cse3. Nat Struct Mol Biol 2011; 18:680-7; PMID:21572442; http://dx.doi.org/ 10.1038/nsmb.2043 [DOI] [PubMed] [Google Scholar]
- 51.Mojica FJ, García-Martínez J, Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. Infect Genet Evol 2005; 60:174-82 [DOI] [PubMed] [Google Scholar]
- 52.Horvath P, Romero DA, Coûté-Monvoisin A-C, Richards M, Deveau H, Moineau S, Boyaval P, Fremaux C, Barrangou R. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus. J Bacteriol 2008; 190:1401-12; PMID:18065539; http://dx.doi.org/ 10.1128/JB.01415-07 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Paez-Espino D, Sharon I, Morovic W, Stahl B, Thomas BC, Barrangou R, Banfield JF. CRISPR immunity drives rapid phage genome evolution in Streptococcus thermophilus. Mbio 2015; 6:e00262-15; PMID:25900652; http://dx.doi.org/ 10.1128/mBio.00262-15 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Hargreaves KR, Flores CO, Lawley TD, Clokie MR. Abundant and diverse clustered regularly interspaced short palindromic repeat spacers in Clostridium difficile strains and prophages target multiple phage types within this pathogen. Mbio 2014; 5:e01045-13; PMID:25161187; http://dx.doi.org/ 10.1128/mBio.01045-13 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Westra ER, Staals RH, Gort G, Hogh S, Neumann S, de la Cruz F, Fineran PC, Brouns SJ. CRISPR-Cas systems preferentially target the leading regions of MOBF conjugative plasmids. RNA Biol 2013; 10:749-61; PMID:23535265; http://dx.doi.org/ 10.4161/rna.24202 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Palmer KL, Gilmore MS. Multidrug-resistant enterococci lack CRISPR-cas. mBio 2010; 1:e00227-10; PMID:21060735; http://dx.doi.org/ 10.1128/mBio.00227-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Marraffini LA, Sontheimer EJ. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 2008; 322:1843-5; PMID:19095942; http://dx.doi.org/ 10.1126/science.1165771 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Zhang Q, Rho M, Tang H, Doak TG, Ye Y. CRISPR-Cas systems target a diverse collection of invasive mobile genetic elements in human microbiomes. Genome Biol 2013; 14:R40; PMID:23628424; http://dx.doi.org/ 10.1186/gb-2013-14-4-r40 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.García-Aljaro C, Muniesa M, Jofre J, Blanch AR. Newly identified bacteriophages carrying the stx2g Shiga toxin gene isolated from Escherichia coli strains in polluted waters. FEMS Microbiol Lett 2006; 258:127-35; http://dx.doi.org/ 10.1111/j.1574-6968.2006.00213.x [DOI] [PubMed] [Google Scholar]
- 60.Muniesa M, de Simon M, Prats G, Ferrer D, Panella H, Jofre J. Shiga toxin 2-converting bacteriophages associated with clonal variability in Escherichia coli O157:H7 strains of human origin isolated from a single outbreak. Infect Immun 2003; 71:4554-62; PMID:12874335; http://dx.doi.org/ 10.1128/IAI.71.8.4554-4562.2003 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Gupta SK, Strockbine N, Omondi M, Hise K, Fair MA, Mintz E. Emergence of Shiga toxin 1 genes within Shigella dysenteriae type 4 isolates from travelers returning from the Island of Hispanola. Am J Trop Med Hyg 2007; 76:1163-5; PMID:17556630
- 62.Gray MD, Lampel KA, Strockbine NA, Fernandez RE, Melton-Celsa AR, Maurelli AT. Clinical Isolates of Shiga Toxin 1a-Producing Shigella flexneri with an Epidemiological Link to Recent Travel to Hispaniola. Emerg Infect Dis 2014; 20:1669-77; PMID:25271406; http://dx.doi.org/10.3201/eid2010.140292
- 63.Beutin L, Strauch E, Fischer I. Isolation of Shigella sonnei lysogenic for a bacteriophage encoding gene for production of Shiga toxin. Lancet 1999; 353:1498; PMID:10232325; http://dx.doi.org/10.1016/S0140-6736(99)00961-7
- 64.Bolotin A, Quinquis B, Sorokin A, Ehrlich SD. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 2005; 151:2551-61; http://dx.doi.org/10.1099/mic.0.28048-0
- 65.Yang J, Sangal V, Jin Q, Yu J. Shigella Genomes: A Tale of Convergent Evolution and Specialization through IS Expansion and Genome Reduction. Genomes of Foodborne and Waterborne Pathogens 2011:23-39; http://dx.doi.org/10.1128/9781555816902.ch2
- 66.Gudbergsdottir S, Deng L, Chen Z, Jensen JV, Jensen LR, She Q, Garrett RA. Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when challenged with vector-borne viral and plasmid genes and protospacers. Mol Microbiol 2011; 79:35-49; PMID:21166892; http://dx.doi.org/10.1111/j.1365-2958.2010.07452.x
- 67.Bensted HJ. Dysentery bacilli-Shigella; a brief historical review. Can J Microbiol 1956; 2:163-74; PMID:13316611; http://dx.doi.org/10.1139/m56-022
- 68.van den Beld MJ, Reubsaet FA. Differentiation between Shigella, enteroinvasive Escherichia coli (EIEC) and noninvasive Escherichia coli. Eur J Clin Microbiol Infect Dis 2012; 31:899-904; PMID:21901636; http://dx.doi.org/10.1007/s10096-011-1395-7
- 69.Pupo GM, Lan R, Reeves PR. Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc Natl Acad Sci U S A 2000; 97:10567-72; PMID:10954745; http://dx.doi.org/10.1073/pnas.180094797
- 70.Pupo GM, Karaolis D, Lan R, Reeves PR. Evolutionary relationships among pathogenic and nonpathogenic Escherichia coli strains inferred from multilocus enzyme electrophoresis and mdh sequence studies. Infect Immun 1997; 65:2685-92; PMID:9199437
- 71.Godde JS, Bickerton A. The repetitive DNA elements called CRISPRs and their associated genes: evidence of horizontal transfer among prokaryotes. J Mol Evol 2006; 62:718-29
- 72.Cazalet C, Rusniok C, Bruggemann H, Zidane N, Magnier A, Ma L, Tichit M, Jarraud S, Bouchier C, Vandenesch F, et al. Evidence in the Legionella pneumophila genome for exploitation of host cell functions and high genome plasticity. Nat Genet 2004; 36:1165-73; PMID:15467720; http://dx.doi.org/10.1038/ng1447
