Abstract
Non-O157 Shiga toxin-producing Escherichia coli (STEC) strains are emerging food-borne pathogens causing life-threatening diseases and food-borne outbreaks. A better understanding of their evolution provides a framework for developing tools to control food safety. We obtained 15 genomes of non-O157 STEC strains, including O26, O111, and O103 strains. Phylogenetic trees revealed a close relationship between O26:H11 and O111:H11 and a scattered distribution of O111. We hypothesize that STEC serotypes with the same H antigens might share common ancestors.
TEXT
Shiga toxin-producing Escherichia coli (STEC) strains are deadly pathogens, causing hemorrhagic colitis (HC) and hemolytic-uremic syndrome (HUS) (4, 13). E. coli strains are classified by serotype, the combination of two categories of surface antigens: O (somatic) and H (flagellar). E. coli O157:H7 has caused more outbreaks and HUS cases in the United States than any other serotype. However, there is a growing concern about the health risk of non-O157 STEC (1), as more than 470 serotypes of STEC are associated with human diseases (2). In the United States, non-O157 STEC causes an estimated 112,752 cases of illness each year, which is more than the number of cases (estimated at 63,153) caused by E. coli O157:H7 (16). Among non-O157 STEC, serogroups O26, O111, and O103 are considered the most clinically important and are the most frequently identified in severe diseases and food-borne outbreaks (3, 14, 19). In this study, we used whole-genome sequencing data to examine the phylogenetic relationship of non-O157 STEC strains for a better understanding of the evolutionary history of these emerging pathogens.
Fifteen STEC strains representing different pulsed-field gel electrophoresis (PFGE) patterns, isolation years, hosts, and stx gene profiles, including O111:H11, O111:H8, O26:H11, O103:H2, and O103:H25, were selected for whole-genome sequencing analysis using the 454 pyrosequencing system (FLX; Roche, Branford, CT) to obtain draft genomes (Table 1). In addition, 28 E. coli published genomes were included for phylogenetic study (Table 1). The genome sizes of the 15 STEC strains ranged from 5.26 Mbp to 6.01 Mbp (Table 1). Multiple sequence alignment of all 43 genomes was performed using Mauve (5), and approximately 183,470 single nucleotide polymorphisms (SNPs) were identified.
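As an illustration of how SNP positions can be read off a multiple sequence alignment, the following minimal Python sketch scans aligned sequences column by column and keeps the columns in which unambiguous bases differ. It is not the Mauve workflow used in this study; the function name `snp_columns`, the toy strain names, and the toy sequences are hypothetical, and skipping columns that contain gaps or ambiguous bases is an assumption made for the example.

```python
from typing import Dict, List


def snp_columns(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based alignment columns where unambiguous bases differ."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    snps = []
    for i in range(length):
        column = {s[i].upper() for s in seqs}
        # Assumption: columns with gaps or ambiguous calls are skipped.
        if column <= set("ACGT") and len(column) > 1:
            snps.append(i)
    return snps


# Hypothetical toy alignment (not data from this study).
toy_alignment = {
    "strain_A": "ATGCCGTA",
    "strain_B": "ATGTCGTA",
    "strain_C": "ATGTCG-A",
}
positions = snp_columns(toy_alignment)
# Reduce each sequence to its SNP sites only, as input for tree building.
snp_matrix = {
    name: "".join(seq[i] for i in positions)
    for name, seq in toy_alignment.items()
}
print(positions, snp_matrix)
```

Only the fourth column is retained in this toy case, because the seventh column contains a gap.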
TABLE 1.
Serotypes, pathotypes, toxin genotypes, sources, and genome sizes of Escherichia coli strains used in this study^a
Strain | Serotype | Pathotype^b | Shiga toxin gene | Source | Size (Mbp) | Accession no. |
---|---|---|---|---|---|---|
CVM10021 | O26:H11 | STEC | stx1 | Cow | 5.50 | AKAZ00000000 |
CVM9942 | O26:H11 | STEC | stx1 | Cow | 5.62 | AJVW00000000 |
CVM10026 | O26:H11 | STEC | stx1 | Cow | 5.57 | AJVX00000000 |
CVM10030 | O26:H11 | STEC | stx1 | Cow | 5.50 | AKBA00000000 |
CVM9952 | O26:H11 | STEC | stx1 | Pig | 5.50 | AKBC00000000 |
CVM9634 | O111:H8 | STEC | stx1 + stx2 | Cow | 5.78 | AKAW00000000 |
CVM9602 | O111:H8 | STEC | stx1 | Human | 5.10 | AKAV00000000 |
CVM9574 | O111:H8 | STEC | stx1 + stx2 | Human | 5.36 | AJVV00000000 |
CVM9570 | O111:H8 | STEC | stx1 + stx2 | Cow | 5.51 | AJVU00000000 |
CVM9545 | O111:H11 | STEC | stx1 | Cow | 5.61 | AJVT00000000 |
CVM9455 | O111:H11 | STEC | stx2 | Unknown | 6.01 | AKAX00000000 |
CVM9534 | O111:H11 | STEC | stx1 | Cow | 5.46 | AJVS00000000 |
CVM9553 | O111:H11 | STEC | stx1 | Cow | 5.60 | AKAY00000000 |
CVM9340 | O103:H25 | STEC | stx1 | Human | 5.26 | AJVQ00000000 |
CVM9450 | O103:H2 | STEC | stx1 | Human | 5.39 | AJVR00000000 |
CFT073 | O6:K2:H1 | UPEC | | Unknown | 5.23 | AE014075 |
Sakai | O157:H7 | STEC | stx1 + stx2 | Human | 5.59 | BA000007 |
CB9615 | O55:H7 | EPEC | | Human | 5.39 | NC_013941 |
4865/96 | O145:H28 | STEC | stx2 | Human | 5.23 | AGTL00000000 |
53638 | O144:? | EIEC | | Unknown | 5.07 | AAKB00000000 |
101-1 | O−:H10 | EAEC | | Human | 4.98 | AAMK00000000 |
MG1655 | Unknown | Commensal | | Unknown | 4.64 | NC_000913 |
5.0959 | O121:H19 | STEC | stx2 | Unknown | 5.37 | AEZX00000000 |
TY-2482 | O104:H4 | EAEC + STEC | stx2 | Human | 5.29 | AFOG00000000 |
CL-3 | O113:H21 | STEC | stx2 | Human | 5.05 | AGTH00000000 |
B2F1 | O91:H21 | STEC | stx2 | Human | 5.01 | AGTI00000000 |
E24377A | O139:H28 | ETEC | | Unknown | 4.97 | NC_009801 |
DEC12B | O111:H2 | STEC | stx2 | Human | 5.49 | AIHB00000000 |
DEC12C | O111:NM | STEC | stx2 | Human | 5.45 | AIHC00000000 |
E22 | O103:H2 | EPEC | | Unknown | 5.53 | AAJV00000000 |
03-EN-705 | O45:H2 | STEC | stx1 | Human | 5.3 | AGTK00000000 |
12009 | O103:H2 | STEC | stx1 + stx2 | Human | 5.45 | NC_013353 |
E110019 | O111:H9 | EPEC | | Human | 5.38 | AAJW00000000 |
DEC15A | O111:H21 | EPEC | | Human | 5.25 | AIHO00000000 |
DEC15E | O111:H21 | EPEC | | Human | 5.23 | AIHS00000000 |
DEC8E | O111:H8 | STEC | stx1 | Human | 5.32 | AIGJ00000000 |
DEC8B | O111:H8 | STEC | stx1 + stx2 | Human | 5.37 | AIGG00000000 |
11128 | O111:H− | STEC | stx1 + stx2 | Human | 5.37 | NC_013364 |
DEC8D | O111:H11 | DEC | | Human | 5.46 | AIGI00000000 |
DEC8C | O111:H11 | STEC | stx1 | Cow | 5.91 | AIGH00000000 |
DEC10B | O26:H11 | STEC | stx1 | Human | 5.58 | AIGQ00000000 |
EPECCa14 | O26:H11 | STEC | stx1 | Unknown | 5.44 | ADUN00000000 |
11368 | O26:H11 | STEC | stx1 | Human | 5.69 | NC_013361 |
^a Data on strains named with CVM were from this study; the rest were from GenBank.
^b STEC, Shiga toxin-producing Escherichia coli; EPEC, enteropathogenic Escherichia coli; EIEC, enteroinvasive Escherichia coli; ETEC, enterotoxigenic Escherichia coli; EAEC, enteroaggregative Escherichia coli; DEC, diarrheagenic Escherichia coli.
Pulsed-field gel electrophoresis (PFGE) with XbaI was performed according to a non-O157 PulseNet protocol (http://www.pulsenetinternational.org/SiteCollectionDocuments/pfge/5%201_5%202_5%204_PNetStand_Ecoli_with_Sflexneri.pdf) and analyzed with BioNumerics software (Applied Maths, Austin, TX) using Dice coefficients and the unweighted pair group method with arithmetic averages (UPGMA) to construct a dendrogram with a 1.5% band position tolerance. eae subtypes were determined using PCR-restriction fragment length polymorphism (RFLP) as described by Tramuta et al. (20). The 15 STEC strains were grouped into two main clusters (Fig. 1) that separated H11 strains (O111:H11 and O26:H11) from H8 strains (O111:H8). However, PFGE was not able to differentiate O111:H11 from O26:H11. In addition, the O26:H11 and O111:H11 strains shared the same eae subtype (β), while the O111:H8 strains contained θ. Thus, according to PFGE profiles and eae subtypes, O111:H11 and O26:H11 appeared to be more closely related to each other than either was to O111:H8.
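To make the band-matching step concrete, the following minimal Python sketch computes a Dice coefficient between two banding patterns with a position tolerance. It does not reproduce the BioNumerics implementation: the function name `dice_similarity`, the greedy matching rule, the expression of band positions as percentages of the run length, and the band values themselves are all assumptions made for illustration.

```python
from typing import List


def dice_similarity(bands_a: List[float], bands_b: List[float],
                    tolerance: float = 1.5) -> float:
    """Dice coefficient 2*m / (n_a + n_b), where m is the number of bands
    matched within the position tolerance (greedy one-to-one matching)."""
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    total = len(a) + len(b)
    return 2.0 * matched / total if total else 1.0


# Hypothetical band positions (percent of run length), not study data.
pattern_1 = [10.0, 25.0, 40.0, 62.0]
pattern_2 = [10.8, 25.0, 47.0, 62.5, 80.0]
similarity = dice_similarity(pattern_1, pattern_2)
print(round(similarity, 3))
```

In this toy case three bands match within the tolerance, so the coefficient is 2 × 3 / (4 + 5). A UPGMA dendrogram would then be built from the matrix of such pairwise similarities (or from 1 minus similarity as a distance).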
Fig 1.
Dendrogram of PFGE profiles of 15 O26, O103, and O111 STEC isolates. The similarity of the PFGE profiles was based on the Dice algorithm with 1.5% tolerance. O26:H11 and O111:H11 strains showed a close relationship, grouped in the same cluster, and shared the same eae subtype. CVM, Center for Veterinary Medicine.
Seven housekeeping genes (aspC, clpX, fadD, icdA, lysP, mdh, and uidA) extracted from the genomes were selected for multilocus sequence typing (MLST) analysis as previously described for pathogenic E. coli (http://www.shigatox.net/ecmlst/protocols/index.html). The MLST analysis was performed using MEGA 5.05 (17) with 2,000 iterations (model, maximum composite likelihood; substitution, transitions plus transversions; gamma). The O111:H11, O26:H11, and O111:H8 strains formed one branch in the MLST dendrogram, with O26:H11 and O111:H11 clustering together in a lineage sister to the O111:H8 strains (Fig. 2). Interestingly, O26:H11 strain DEC10B clustered with the O111:H11 strains. Furthermore, five strains sharing the H2 antigen clustered together regardless of O serogroup. Genomic analysis revealed that O111:NM strain DEC12C carried the fliC allele encoding the H2 antigen.
Fig 2.
Dendrogram of MLST analyses using aspC, clpX, fadD, icdA, lysP, mdh, and uidA. Strains 4865/96 (O145:H28), 101-1 (O−:H10), 5.0959 (O121:H19), DEC12B (O111:H2), and E110019 (O111:H9) were not included in the MLST study because at least one of the selected gene alleles was either absent or only partially present.
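The exclusion rule described in the legend above can be sketched as follows. This minimal Python example concatenates the seven loci in a fixed order and drops any strain lacking a locus. The function name `concatenate_loci`, the strain labels, and the placeholder sequences are hypothetical, and the sketch detects only absent loci, not partially present ones.

```python
from typing import Dict, Optional

# The seven housekeeping loci named in the text, in a fixed order.
LOCI = ["aspC", "clpX", "fadD", "icdA", "lysP", "mdh", "uidA"]


def concatenate_loci(genes: Dict[str, Optional[str]]) -> Optional[str]:
    """Join the seven loci in order; return None if any locus is absent."""
    parts = []
    for locus in LOCI:
        seq = genes.get(locus)
        if not seq:
            return None  # strain is excluded from the MLST analysis
        parts.append(seq)
    return "".join(parts)


# Hypothetical placeholder sequences (three bases per locus).
complete_strain = {locus: "ACG" for locus in LOCI}
incomplete_strain = {locus: "ACG" for locus in LOCI if locus != "uidA"}

strains = {"toy_complete": complete_strain, "toy_incomplete": incomplete_strain}
included = {
    name: seq
    for name, seq in ((n, concatenate_loci(g)) for n, g in strains.items())
    if seq is not None
}
print(sorted(included))
```

The concatenated sequences of the retained strains would then be aligned and passed to the tree-building program.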
To further explore evolutionary relatedness, a parsimony phylogenetic tree based on genome-wide SNPs was constructed with 10,000 iterations using TNT (tree analysis using new technology) (9). Consistent with the PFGE and MLST data, the phylogenetic tree showed that the O26:H11 strains belonged to the same clade as the O111:H11 and O111:H8 strains but grouped more closely with strains of the same H type (O111:H11) (Fig. 3). In the TNT tree, E. coli O26:H11 DEC10B grouped with the O111:H11 strains, as in the MLST dendrogram (Fig. 2), indicating a close relationship between these strains. All O111 strains except O111:H2 formed one clade (Fig. 3), which included O111 enteropathogenic E. coli (EPEC), O111:H8, and O111:H11. Additionally, we reconstructed a maximum likelihood (ML) tree using Garli-2.0 (22) and a BioNJ neighbor-joining tree (8) using SeaView 4 (10) (data not shown); both displayed phylogenetic relationships similar to those in the TNT tree, with minor differences. For example, in the BioNJ tree, E. coli O26:H11 DEC10B grouped with the O26:H11 strains. The H2 strains clustered closely together in all phylogenetic trees (TNT, ML, and BioNJ), as they did in the MLST dendrogram. The phylogenetic trees also supported a close relationship between STEC O113:H21 and O91:H21 (Fig. 3).
Fig 3.
Parsimony phylogenetic tree of 43 E. coli strains from diverse pathotypes based on genome-wide single nucleotide polymorphisms (SNPs) with 10,000 iterations. Strains sequenced in this study are shown within boxes. A total of seven subgroups were labeled, as shown in the pairwise distance matrix (Table 2).
The phylogenetic trees indicated that a common ancestor might exist for strains of the same H type. For example, the H11 strains, including O26:H11 and O111:H11 (Fig. 2 and 3), shared a common ancestor, as did the H2 strains with different O antigens (the H2 group in Fig. 3). Previous studies also have suggested that STEC strains with the same H antigens might share common ancestors. For example, Iguchi et al. (11) indicated a close relatedness between STEC O103:H11 and O26:H11 by MLST analysis; these strains shared the same eae subtype (β-eae), which was also inserted at the same tRNA locus (pheU-tRNA). The O111:H11 strains used in this study also carry β-eae (Fig. 1). In addition, Konczy et al. (12) and Ziebell et al. (21) showed that STEC O69:H11 is closely related to O26:H11. Konczy et al. (12) also reported that H25 STEC strains (O103:H25, O119:H25, and O98:H25) and H21 STEC strains (O91:H21, O113:H21, O146:H21, and ONT:H21) each clustered together based on MLST. Together with our findings, these data support the view that some STEC strains with common H antigens originate from common ancestors. It is interesting that the four H groups included the serotypes O26:H11, O111:H11, O111:NM, O111:H2, O103:H2, O103:H25, O45:H2, O91:H21, and O113:H21, which have been identified among the most important non-O157 STEC serotypes associated with outbreaks and HUS. Thus, we hypothesize that, within each H group, some clinical and epidemic STEC serotypes might have evolved from a common ancestor. It is possible that ancestral strains of those H groups share a similar or identical genetic background and/or environmental niche that could facilitate acquisition of stx and other virulence genes essential to STEC pathogenesis.
Our phylogenetic analyses indicated that the H groups were monophyletic, while the serogroups were polyphyletic (Fig. 3). The scattered distribution of O111 strains, such as E. coli O111:H21 DEC15E (O111 EPEC group), E. coli O111:H2 DEC12B (H2 group), E. coli O111:H8 CVM9634 (O111:H8 group), and E. coli O111:H11 CVM9545 (O111:H11 group) (Fig. 3), suggested that strains from individual lineages might have acquired surface antigen genes independently in an ongoing parallel evolutionary process. Iguchi et al. (11) suggested that STEC O103:H2, O103:H11, and O103:H25 formed three different lineages by MLST analysis and had distinct eae subtypes. The MLST and SNP phylogenetic trees in this study also placed O103:H2 and O103:H25 on different lineages (Fig. 2 and 3). Thus, relying on serogroups alone may lead to misleading conclusions about the phylogenetic relatedness and health risks of STEC strains.
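The distinction drawn above can be checked mechanically on any rooted tree. The minimal Python sketch below tests whether a set of strains forms a clade, that is, whether some internal node has exactly those strains as its leaves. The nested-tuple tree is a hypothetical toy topology and not the tree in Fig. 3; the leaf labels and function names are assumptions made for the example. A group that fails the test is not monophyletic, although the test alone does not distinguish polyphyly from paraphyly.

```python
from typing import Any, Set


def leaves(tree: Any) -> Set[str]:
    """Collect leaf labels of a rooted tree written as nested tuples."""
    if isinstance(tree, str):
        return {tree}
    result: Set[str] = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree: Any, group: Set[str]) -> bool:
    """True if some clade of the tree contains exactly the given leaves."""
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical toy topology (labels are illustrative only).
toy_tree = (
    (("O26:H11_a", "O111:H11_a"), "O111:H8_a"),
    ("O103:H2_a", "O111:H2_a"),
)
h11_group = {"O26:H11_a", "O111:H11_a"}
o111_group = {"O111:H11_a", "O111:H8_a", "O111:H2_a"}
print(is_monophyletic(toy_tree, h11_group))   # H11 leaves form one clade
print(is_monophyletic(toy_tree, o111_group))  # O111 leaves do not
```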
Pairwise distance matrix analysis with 2,000 bootstrap iterations (substitution, transitions plus transversions; complete-delete option) (Table 2) was conducted using MEGA 5.05 (17) to determine the number of SNP differences (standard deviation) between selected groups, including the H7, eae-negative, H2, O26:H11, and O111 groups. The values shown are the numbers of base differences per sequence, averaged over all sequence pairs between groups. The smallest distance was found between the O111:H11 and O26:H11 strains, confirming their close relatedness. Previous studies indicated that O157:H7 evolved from O55:H7 in a series of steps, acquiring the O157 antigen gene cluster and other virulence genes through horizontal gene transfer (HGT) (6, 7, 15, 18). The distance between O157:H7 Sakai and O55:H7 CB9615 was 4,215 SNPs with a standard deviation of 37. Because O26:H11 strains are located in the O111 clade in the phylogenetic trees and display a closer relationship with O111:H11 strains than with O111:H8 strains, it is possible that O26:H11 evolved similarly from an ancestral O111:H11 strain by antigenic shift from O111 to O26 (the distance between the O26:H11 and O111:H11 groups was 3,617 [Table 2]). Sharing the same niche with other O26 strains may facilitate this genetic exchange. Comparative genomics analysis of O26:H11, O111:H11, and other STEC strains is under way to reveal the possible mechanism.
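The between-group statistic described above can be sketched in a few lines of Python: count the differing sites for every pair of sequences drawn from two groups, average the counts, and estimate a standard deviation by resampling alignment columns with replacement. This is a simplified illustration, not the MEGA implementation; the function names, the fixed random seed, and the toy sequences are assumptions, and the sequences are assumed to be gap free (so the complete-deletion step is not shown).

```python
import random
from statistics import mean, stdev
from typing import List, Sequence


def n_differences(x: str, y: str, sites: Sequence[int]) -> int:
    """Number of the given sites at which two aligned sequences differ."""
    return sum(x[i] != y[i] for i in sites)


def between_group_distance(group_a: List[str], group_b: List[str],
                           sites: Sequence[int]) -> float:
    """Mean number of differences over all between-group sequence pairs."""
    return mean([n_differences(x, y, sites) for x in group_a for y in group_b])


def bootstrap_sd(group_a: List[str], group_b: List[str],
                 replicates: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the distance over column-resampled replicates."""
    rng = random.Random(seed)
    n = len(group_a[0])
    estimates = []
    for _ in range(replicates):
        sites = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        estimates.append(between_group_distance(group_a, group_b, sites))
    return stdev(estimates)


# Hypothetical toy groups (not study data).
group_1 = ["ACGTACGT", "ACGTACGA"]
group_2 = ["ACCTACGA", "ACCTTCGA"]
distance = between_group_distance(group_1, group_2, range(len(group_1[0])))
sd = bootstrap_sd(group_1, group_2)
print(distance, round(sd, 2))
```

For the toy groups the four between-group pairs differ at 2, 3, 1, and 2 sites, giving a mean of 2.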
TABLE 2.
Pairwise distance matrix analysis of seven selected groups (values are no. of SNP differences [SD])
Group^a | H7 | eae negative | H2 | O111 EPEC | O111:H8 | O111:H11 |
---|---|---|---|---|---|---|
H7 | | | | | | |
eae negative | 57,246 (107) | | | | | |
H2 | 58,029 (139) | 21,523 (77) | | | | |
O111 EPEC | 59,530 (105) | 23,107 (98) | 23,442 (113) | | | |
O111:H8 | 59,498 (126) | 23,993 (75) | 22,558 (97) | 21,427 (83) | | |
O111:H11 | 59,417 (113) | 24,157 (98) | 22,717 (123) | 21,512 (84) | 4,324 (35) | |
O26:H11 | 57,176 (108) | 21,913 (78) | 22,138 (107) | 24,045 (102) | 6,556 (42) | 3,617 (37) |
^a Groups are as shown in Fig. 3.
In conclusion, analyses based on whole-genome-wide SNPs, MLST, and PFGE suggest that on some occasions O serogroups appear not to track evolutionary relatedness among pathogenic E. coli strains. Instead, H antigens may be better markers of shared ancestry for some STEC serotypes.
ACKNOWLEDGMENT
The study was supported in part by the Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD.
Footnotes
Published ahead of print 10 October 2012
REFERENCES
- 1. Bettelheim KA. 2007. The non-O157 shiga-toxigenic (verocytotoxigenic) Escherichia coli; under-rated pathogens. Crit. Rev. Microbiol. 33:67–87
- 2. Blanco JE, et al. 2004. Serotypes, virulence genes, and intimin types of Shiga toxin (verotoxin)-producing Escherichia coli isolates from human patients: prevalence in Lugo, Spain, from 1992 through 1999. J. Clin. Microbiol. 42:311–319
- 3. Brooks JT, et al. 2005. Non-O157 Shiga toxin-producing Escherichia coli infections in the United States, 1983–2002. J. Infect. Dis. 192:1422–1429
- 4. Caprioli A, Morabito S, Brugere H, Oswald E. 2005. Enterohaemorrhagic Escherichia coli: emerging issues on virulence and modes of transmission. Vet. Res. 36:289–311
- 5. Darling AC, Mau B, Blattner FR, Perna NT. 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 14:1394–1403
- 6. Feng P, Lampel KA, Karch H, Whittam TS. 1998. Genotypic and phenotypic changes in the emergence of Escherichia coli O157:H7. J. Infect. Dis. 177:1750–1753
- 7. Feng PC, et al. 2007. Genetic diversity among clonal lineages within Escherichia coli O157:H7 stepwise evolutionary model. Emerg. Infect. Dis. 13:1701–1706
- 8. Gascuel O. 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14:685–695
- 9. Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a program for phylogenetic analysis. Cladistics 24:774–786
- 10. Gouy M, Guindon S, Gascuel O. 2010. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol. Biol. Evol. 27:221–224
- 11. Iguchi A, Iyoda S, Ohnishi M. 2012. Molecular characterization reveals three distinct clonal groups among clinical Shiga toxin-producing Escherichia coli strains of serogroup O103. J. Clin. Microbiol. 50:2894–2900
- 12. Konczy P, et al. 2008. Genomic O island 122, locus for enterocyte effacement, and the evolution of virulent verocytotoxin-producing Escherichia coli. J. Bacteriol. 190:5832–5840
- 13. Miliwebsky E, et al. 2007. Prolonged fecal shedding of Shiga toxin-producing Escherichia coli among children attending day-care centers in Argentina. Rev. Argent. Microbiol. 39:90–92
- 14. Ogura Y, et al. 2009. Comparative genomics reveal the mechanism of the parallel evolution of O157 and non-O157 enterohemorrhagic Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 106:17939–17944
- 15. Rump LV, et al. 2012. Complete DNA sequence analysis of enterohemorrhagic Escherichia coli plasmid pO157_2 in beta-glucuronidase-positive E. coli O157:H7 reveals a novel evolutionary path. J. Bacteriol. 194:3457–3463
- 16. Scallan E, et al. 2011. Foodborne illness acquired in the United States—major pathogens. Emerg. Infect. Dis. 17:7–15
- 17. Tamura K, et al. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28:2731–2739
- 18. Tarr PI, et al. 2000. Acquisition of the rfb-gnd cluster in evolution of Escherichia coli O55 and O157. J. Bacteriol. 182:6183–6191
- 19. Tozzi AE, et al. 2003. Shiga toxin-producing Escherichia coli infections associated with hemolytic uremic syndrome, Italy, 1988–2000. Emerg. Infect. Dis. 9:106–108
- 20. Tramuta C, Robino P, Oswald E, Nebbia P. 2008. Identification of intimin alleles in pathogenic Escherichia coli by PCR-restriction fragment length polymorphism analysis. Vet. Res. Commun. 32:1–5
- 21. Ziebell K, et al. 2008. Applicability of phylogenetic methods for characterizing the public health significance of verocytotoxin-producing Escherichia coli strains. Appl. Environ. Microbiol. 74:1671–1675
- 22. Zwickl DJ. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. thesis. The University of Texas at Austin, Austin, TX