Abstract
In 2009, an outbreak of enterohemorrhagic Escherichia coli (EHEC) on an open farm infected 93 persons, and approximately 22% of these individuals developed hemolytic-uremic syndrome (HUS). Genome sequencing was used to investigate outbreak-derived animal and human EHEC isolates. Phylogeny based on the whole-genome sequence was used to place outbreak isolates in the context of the overall E. coli species and the O157:H7 sequence type 11 (ST11) subgroup. Four informative single nucleotide polymorphisms (SNPs) were identified and used to design an assay to type 122 other outbreak isolates. The SNP phylogeny demonstrated that the outbreak strain was from a lineage distinct from previously reported O157:H7 ST11 EHEC and was not a member of the hypervirulent clade 8. The strain harbored determinants for two Stx2 verotoxins and other putative virulence factors. When linked to the epidemiological information, the sequence data indicate that gross contamination of a single outbreak strain occurred across the farm prior to the first clinical report of HUS. The most likely explanation for these results is that a single successful strain of EHEC spread from a single introduction through the farm by clonal expansion and that contamination of the environment (including the possible colonization of several animals) led ultimately to human cases.
INTRODUCTION
Whole bacterial genome sequencing in medical microbiology is fast becoming a reality; however, the challenge of converting the primary sequence data into useful clinical or public health action remains unmet, because experience with such data is limited. This is especially true for bacterial pathogens from outbreaks where genetic variation is limited. For the identification of an isolate or an outbreak strain, comparison with genomes of very closely related organisms is required. For example, comparative genomics of two Listeria monocytogenes genomes from a single outbreak was used to show “proof of concept” that genome sequences could be used in real time to guide the subsequent public health response (1). Tracking two strains of Acinetobacter baumannii in a hospital has been achieved with the conclusion that “additional studies are needed to benchmark genetic variability” (2). Additionally, genomic analysis of 43 globally distributed methicillin-resistant Staphylococcus aureus (MRSA) strains and 20 isolates from a single hospital revealed the microevolution of MRSA and “the potential tracking of person to person spread” (3). What is clear is that a large sequence database comprising a comprehensive panel of well-chosen isolates is necessary if genome sequencing is to be useful in the field. Perhaps the most comprehensive single species genome data available thus far is for Escherichia coli and this includes the sequences for multiple isolates of the O104:H4 E. coli sequence type 678 (ST678) associated with the outbreak in Germany in 2011 (4).
Enterohemorrhagic E. coli (EHEC) is arguably the most important E. coli pathotype for humans in developed countries (5). The clinical symptoms associated with EHEC infection normally present as bloody diarrhea, but severe complications, such as hemolytic-uremic syndrome (HUS), are relatively common. Comparative genomic analysis has shown that the genomes of EHEC strains consist of a conserved core of approximately 4 Mb interspersed with approximately 1 Mb of horizontally acquired accessory DNA (6). The accessory DNA contains genes that encode many of the EHEC-specific virulence-associated functions. Clustering strains by gene content groups EHEC strains together regardless of their phylogenetic lineage, illustrating the convergence of this E. coli pathovar in terms of the genes involved in pathogenic potential and infection strategy (7).
One of the largest outbreaks of EHEC (E. coli O157 phage type 21/28) reported in the United Kingdom thus far was detected in visitors to an open farm in Godstone, Surrey, United Kingdom. Following the epidemiological investigation and use of multilocus variable number tandem repeat (VNTR) analysis (MLVA), general contamination of the environment was suspected, and the outbreak was quickly brought under control (8). In this study, we investigate the nature of the outbreak at the highest resolution possible, by using whole-genome sequencing, in order to understand whether the approach could provide further insight into the nature of the causal organisms, e.g., if they were mono- or polyphyletic, and to compare the resolution gained by this approach with that derived from more traditional epidemiological tools.
MATERIALS AND METHODS
Strains used in this study.
Strain selection for sequencing was carried out at the time of the outbreak at the farm in Surrey, United Kingdom, with the aims of defining variation and developing an assay to track the outbreak strains in real time as the outbreak continued to unfold. Sixteen isolates from the first set of outbreak samples, eight derived from human patients and eight from animals on the farm, were selected for sequencing (see Table S1 in the supplemental material). The remaining 106 strains collected during the investigation were subsequently subjected to Luminex analysis.
Reference laboratory methods.
Fecal specimens from humans and animals and environmental samples were cultured on MacConkey agar containing sorbitol, cefixime, and tellurite either at the local diagnostic microbiology laboratory or at the Animal Health Veterinary Laboratories. Non-sorbitol-fermenting colonies agglutinating with E. coli O157 latex beads were regarded as presumptive verocytotoxin-producing E. coli (VTEC) O157 and were submitted to the national E. coli reference laboratory at the Laboratory of Gastrointestinal Pathogens (LGP), London, United Kingdom, for confirmation and typing. Presumptive EHEC O157 isolates from human and nonhuman (animal and environment) sources were examined for the presence of the stx gene, subtyped by PCR and biochemical analysis, serotyped, and phage typed (9). All strains were tested by MLVA (10), and a subset were also examined by pulsed-field gel electrophoresis (PFGE) by the method of Barrett et al. (11).
DNA sequencing and genome assembly.
Genomic DNA from pure bacterial cultures (Table 1) was sequenced by both 454 and Illumina technologies. To generate the 454 data, 3-kb paired-end libraries were prepared, and the DNA was sequenced on the GS FLX sequencer with the Titanium protocol. The Illumina data were generated on a GAIIx sequencer using 54-bp single-end libraries. The 454 reads were assembled using the Newbler v2.1 de novo assembler (Roche). The mean of the resulting assembled genome sizes for the 16 EHEC strains sequenced in this study was 5.47 Mb.
Table 1.
Description of the SNPs within conserved loci showing variation between the Surrey outbreak strains
| SNP position in Sakai reference genome | Locus containing or near the SNP | Synonymous/nonsynonymous |
|---|---|---|
| 275809 | ECs0244 (hypothetical protein) | Synonymous |
| 842643 | ECs0755 (putative transcription regulator) | Nonsynonymous |
| 1237999 | ECs1152 (trimethylamine N-oxide reductase subunit) | Nonsynonymous |
| 3249761 | Between ECs5494 (tRNA-Lys) and ECs3279 (hypothetical protein) | Noncoding |
The sequences of the EHEC strain H093800014 were of the best quality: 5,434,900 Illumina reads, equating to 56× coverage and 949,592 454 reads with an average read length of 376 bp equal to approximately 68× coverage. To improve the quality of this genome, the 454 sequence data were combined with the Illumina reads. Initially, Velvet (version 0.7.55 with kmer 31, auto exp_cov, and a coverage cutoff of 4) was used to perform a de novo assembly of the Illumina reads. This resulted in 752 contigs of >500 bp with an N50 score of 14,831 bp. These were combined with the 454 reads as input for Newbler. The resulting assembly had the following statistics: 463 contigs with an average size of 12,028 bp, N50 score of 143,012 bp, and a total assembly length of 5,569,051 bp. EHEC strain H093800014 was used as a reference genome for all of the following comparative analyses.
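The coverage and N50 figures quoted above follow from simple arithmetic, sketched below. The function names are illustrative and not part of the original analysis, and the genome size used as the denominator is an assumption (the final assembly length), because the value the authors divided by is not stated. The computed coverage therefore lands close to, but not exactly on, the reported 56× and 68×.

```python
def coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Approximate fold coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def n50(contig_lengths: list[int]) -> int:
    """Contig length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


# Read counts and read lengths are those reported for strain H093800014.
# ASSUMPTION: the denominator is the final assembly length (5,569,051 bp).
GENOME_SIZE = 5_569_051
illumina_cov = coverage(5_434_900, 54, GENOME_SIZE)
roche454_cov = coverage(949_592, 376, GENOME_SIZE)
print(f"Illumina ~{illumina_cov:.1f}x, 454 ~{roche454_cov:.1f}x")
```

Calling `n50` on the list of contig lengths from an assembly would give the N50 score in the sense used above.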
Mapping and SNP calling.
To allow the same single nucleotide polymorphism (SNP) calling protocol to be applied to all samples, short reads were simulated for genomes for which we had only the published consensus sequence, as follows. The wgsim program from the SAMtools package (http://samtools.sourceforge.net) was run with the following arguments: -e 0 -r 0 -N 3000000 -d 250 -1 50 -2 50 input.fasta output.1.fastq output.2.fastq. The options correspond to an error rate of 0, a mutation rate of 0, 3 million read pairs, an outer distance between paired reads of 250, first read length of 50, and second read length of 50, respectively. This generated two fastq files representing 3 million paired-end reads of 50 bp with an insert size of 250 bp for each of the previously published genomes used in this study.
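For scripting the simulation across many reference genomes, the invocation can be assembled as an argument list. This is a minimal sketch: the helper name `wgsim_command` is hypothetical, the flags and values are exactly those given in the text, and the command is only printed here, not executed.

```python
import shlex


def wgsim_command(fasta: str, out1: str, out2: str) -> list[str]:
    """Build the wgsim argument list using the settings described in the text."""
    return [
        "wgsim",
        "-e", "0",        # base error rate
        "-r", "0",        # mutation rate
        "-N", "3000000",  # number of read pairs
        "-d", "250",      # outer distance between the two reads
        "-1", "50",       # length of the first read
        "-2", "50",       # length of the second read
        fasta, out1, out2,
    ]


cmd = wgsim_command("input.fasta", "output.1.fastq", "output.2.fastq")
print(shlex.join(cmd))
# To run it, pass cmd to subprocess.run(cmd, check=True); wgsim must be on PATH.
```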
Simulated reads from GenBank reference strains or experimental Illumina reads from the Surrey farm strains were mapped to genome sequences using Bowtie 0.12.7 (http://bowtie-bio.sourceforge.net/index.shtml) with the -m 1 option, which excludes non-uniquely-mapping reads. The resultant sequence alignment map (SAM) from Bowtie was sorted and indexed to produce a binary alignment map (BAM). SAMtools mpileup, run with default parameters, was used to create a variant call format (VCF) file from the BAMs, which was further parsed to extract only SNP positions that were of the highest quality in all genomes (defined as an overall VCF SNP quality score of ≥99, a VCF genotype quality score for each individual strain of ≥80, a minimum coverage of 15 reads, and a genotype that was homozygous for either the wild-type or the variant allele).
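The quality thresholds above can be expressed as a single predicate applied to each candidate site. The sketch below is not the authors' parser: the `StrainCall` record and the genotype strings ("0/0", "1/1", "0/1") are assumptions chosen for illustration, and the minimum coverage is assumed to apply to each strain individually. Only the numeric thresholds come from the text.

```python
from dataclasses import dataclass

MIN_SITE_QUAL = 99      # overall VCF SNP quality score
MIN_GENOTYPE_QUAL = 80  # per-strain genotype quality score
MIN_DEPTH = 15          # minimum read coverage (ASSUMED to be per strain)


@dataclass
class StrainCall:
    """Hypothetical per-strain record extracted from one VCF line."""
    genotype: str       # "0/0" homozygous wild type, "1/1" homozygous variant
    genotype_qual: int
    depth: int


def is_high_quality_snp(site_qual: float, calls: dict[str, StrainCall]) -> bool:
    """True only if the site passes every threshold in every strain."""
    if site_qual < MIN_SITE_QUAL:
        return False
    for call in calls.values():
        if call.genotype not in ("0/0", "1/1"):  # reject heterozygous calls
            return False
        if call.genotype_qual < MIN_GENOTYPE_QUAL or call.depth < MIN_DEPTH:
            return False
    return True


# Toy example with two made-up strains.
example = {
    "strainA": StrainCall("1/1", 99, 40),
    "strainB": StrainCall("0/0", 85, 22),
}
print(is_high_quality_snp(120, example))
```

Requiring the thresholds in all genomes is conservative: a site that is poorly covered in even one strain is dropped, which trades sensitivity for confidence in the retained positions.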
Phylogenetic comparisons.
SNPs were concatenated into pseudosequences and used to create maximum likelihood trees utilizing the Tamura-Nei model as implemented in MEGA5 (http://www.megasoftware.net/).
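Concatenating SNPs into pseudosequences means writing, for each strain, the base it carries at every retained SNP position in a fixed order, so that the resulting strings form an alignment suitable for tree building. A minimal sketch follows; the input layout (position mapped to per-strain bases), the strain names, and the positions are hypothetical.

```python
def snp_pseudosequences(calls: dict[int, dict[str, str]]) -> dict[str, str]:
    """Join each strain's bases across SNP positions, ordered by position."""
    positions = sorted(calls)
    if not positions:
        return {}
    strains = sorted(calls[positions[0]])
    return {
        strain: "".join(calls[pos][strain] for pos in positions)
        for strain in strains
    }


def to_fasta(seqs: dict[str, str]) -> str:
    """Render the pseudosequences as FASTA text for import into a tree program."""
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())


# Toy data: three made-up SNP positions in two made-up strains.
toy_calls = {
    250: {"strainA": "C", "strainB": "T"},
    100: {"strainA": "A", "strainB": "A"},
    900: {"strainA": "G", "strainB": "C"},
}
print(to_fasta(snp_pseudosequences(toy_calls)))
```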
EHEC SNP assay.
PCR primer pairs were designed to amplify an approximately 400-bp region around each SNP (a total of 6 SNPs). Primers (see Table S2 in the supplemental material) were designed for the reference genome sequence E. coli O157:H7 strain EDL933 (and checked against E. coli O157:H7 strain Sakai) using primer3 and checked for multiplexing potential using PrimerPlex V2.1 (Premier Biosoft International).
Allele-specific primers were also designed to terminate with one of the SNP bases. Two primers were designed for each SNP, one incorporating the reference genome nucleotide and one with the polymorphism (see Table S3 in the supplemental material). These primers were designed using PrimerPlex V2.1 software, which was also used to give each primer a 5′ X-tag complementary to the DNA anti-X-tag on the Luminex microspheres (Luminex Corp.) used in the Luminex assay. Once designed, all primers were checked for specificity using NCBI BLAST and synthesized by Eurogentec.
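The idea of a primer pair that terminates on the SNP base can be illustrated as follows. This is only a sketch of the principle: the actual primers were designed with PrimerPlex, and the function name, the default primer length of 20, and the toy template are assumptions. The 5′ X-tag is not modeled.

```python
def allele_specific_primers(
    template: str, snp_index: int, alt_base: str, length: int = 20
) -> tuple[str, str]:
    """Return (reference_primer, variant_primer), both ending at the SNP base.

    template  : forward-strand sequence containing the SNP
    snp_index : 0-based index of the SNP within template
    alt_base  : the polymorphic base
    length    : primer length (ASSUMED value, for illustration only)
    """
    if snp_index + 1 < length:
        raise ValueError("not enough upstream sequence for a primer of this length")
    upstream = template[snp_index - length + 1 : snp_index]
    reference_primer = upstream + template[snp_index]
    variant_primer = upstream + alt_base
    return reference_primer, variant_primer


# Toy template; index 5 holds the reference base 'C', and 'T' is the variant.
ref_primer, var_primer = allele_specific_primers("ACGTACGTAC", 5, "T", length=4)
print(ref_primer, var_primer)
```

Because the two primers differ only at the 3′-terminal base, extension is expected to proceed efficiently only from the primer that matches the allele present in the sample.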
Epidemiological analysis.
Isolates with SNP typing results available were linked to routine surveillance questionnaires received by the EHEC surveillance system. The median date used in the analysis was calculated based on the “study date” from the questionnaires and derived to best represent the illness of the patient. The study date is based on the following dates as available: onset of illness, date of specimen, date of questionnaire completion, and isolate receipt date at LGP. Cases were categorized as before or after the median date. Human cases of infection, where the person had visited the farm and so was likely to have been infected directly, were identified as primary, with secondary cases defined as those whose infection originated from a primary case.
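The derivation of the study date and the before/after split can be sketched as below. Two points are assumptions rather than statements from the text: the four dates are taken in the listed order of preference, and a case falling exactly on the median is counted as "before". The case identifiers and dates are made up.

```python
from datetime import date
from statistics import median
from typing import Optional


def study_date(
    onset: Optional[date],
    specimen: Optional[date],
    questionnaire: Optional[date],
    receipt: Optional[date],
) -> Optional[date]:
    """Return the first available date, in the ASSUMED order of preference."""
    for candidate in (onset, specimen, questionnaire, receipt):
        if candidate is not None:
            return candidate
    return None


def split_by_median(study_dates: dict[str, date]) -> dict[str, str]:
    """Label each case as before or after the median study date."""
    cutoff = median(d.toordinal() for d in study_dates.values())
    return {
        case: ("after" if d.toordinal() > cutoff else "before")
        for case, d in study_dates.items()
    }


# Toy cases with made-up dates.
toy = {
    "case1": study_date(date(2009, 8, 1), None, None, None),
    "case2": study_date(None, date(2009, 8, 3), None, date(2009, 8, 9)),
    "case3": study_date(None, None, None, date(2009, 8, 10)),
}
print(split_by_median(toy))
```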
STATA version 11 was used to perform exact logistic regression. Cases were grouped according to SNP profiles, and groups containing 8 or more cases were compared to look for an association between SNP profile and whether it was a primary or secondary case. Following this, secondary cases were removed and the process was repeated to look for associations with animal exposure. Age was controlled for in all analyses.
Nucleotide sequence accession numbers.
The Illumina fastq files and the 454 sff file for strain H093800014 were deposited in the European Nucleotide Archive in project ERP001863. The strains and sample accession numbers are as follows; H093460648, ERS181256; H093520770, ERS181257; H093580621, ERS181258; H093660495, ERS181259; H093700588, ERS181260; H093720386, ERS181261; H093740759, ERS181262; H093740495, ERS181263; H093800010, ERS181264; H093800011, ERS181265; H093800012, ERS181266; H093800013, ERS181267; H093800014, ERS181268; H093800016, ERS181269; H093800018, ERS181270; and H093800019, ERS181271.
RESULTS
Comparative analysis of the reference Surrey outbreak strain and published E. coli isolates.
In order to set the outbreak of EHEC in Surrey, United Kingdom, in the context of the pan-E. coli genome and to assess the variation between isolates from the outbreak, single nucleotide polymorphisms were determined. The presence and absence of these SNPs in conserved loci between genomes has been shown to provide an accurate measure of phylogenetic relationships.
Mapping of simulated or real short reads to a single reference genome, strain 55989, allowed SNPs at conserved positions to be called. The number of SNPs between the outbreak isolates was low relative to the variation observed across the E. coli pangenome (see “Comparative analyses of outbreak isolates” below), and therefore, a single isolate, H093800014, was used as a representative for the Surrey outbreak. The genome of this strain was assembled using both the 454 and Illumina reads as input for the Newbler assembly program. The inclusion of the Illumina reads helped correct homopolymeric tract frameshift errors in the longer 454 reads. Phylogenetic analysis of the SNPs relative to the 55989 reference strain showed that the Surrey outbreak EHEC fell within a group with the other O157:H7 EHEC ST11 strains (Fig. 1, inset). A more focused analysis including only O157:H7 EHEC ST11 isolates showed the Surrey strain to be distinct from O157:H7 EHEC ST11 clade 8, previously associated with hypervirulence (12, 13), and genetically distant from the other sequenced O157:H7 genomes (the closest strain, ec536, differed by 314 SNPs) (Fig. 1).
Fig 1.
Maximum likelihood joining tree of all sequenced O157:H7/ST11 genomes using 1,704 conserved SNPs representing diversity in 1,178 genes. The inset is a maximum likelihood tree of 34 E. coli genomes using 54,542 conserved SNPs representing diversity in 2,127 genes. It is rooted against Escherichia fergusonii. Bootstrap values are expressed as a decimal. EHEC isolates are shown in blue type, the Surrey outbreak strain is shown in red type, and the ST11 complex (ST11c) is shown on a yellow background. Enterohemorrhagic E. coli (EHEC), enteroaggregative E. coli (EAEC), nonpathogenic E. coli (NPEC), enterotoxigenic E. coli (ETEC), extraintestinal pathogenic E. coli (ExPEC), and enteropathogenic E. coli (EPEC) are indicated in the inset. The details of the strains used in this analysis including their accession number and pathotype are listed in Table S4 in the supplemental material.
The genome of the reference isolate H093800014 was searched using NCBI BLASTN for the presence of the stx1 and stx2 toxin genes, which are characteristic of EHEC isolates (14). No sequences with significant homology to stx1 were found, but two sequences with significant homology to stx2 were identified. Analysis of the predicted amino acid sequences of the products of the genes adjacent to these sequences identified two prophages that independently encode the toxin subtypes Stx2a and Stx2c (15, 16). Characteristic EHEC virulence factors intimin (eae) and the pO157 plasmid were also identified. The genome also harbored a full-length gene encoding the anaerobic nitric oxide reductase NorV, which has been reported as being associated with increased EHEC virulence and ability to cause HUS (17). Kulasekara et al. (18) describe five loci putatively involved in EHEC virulence, and our analysis revealed that the Surrey farm reference possessed three of these loci: ECSP_2687 (a homologue to the Shigella protein OspB that reduces cytokine expression), ECSP_1773 (a homologue to the Shigella protein OspG that is thought to prevent NF-κB activation) (19), and ECSP_0242 (a putative virulence factor that contains five ankyrin repeats).
Comparative analyses of outbreak isolates.
When the Surrey outbreak isolates were compared to E. coli O157:H7 Sakai, it was clear that they all lacked large portions of several prophages (Sp5, Sp7, Sp10, Sp13, and SpLE4). Outside these prophage sequences, the only differences between these genomes were in the lipopolysaccharide (LPS) biosynthetic gene cluster. These differences may be associated with structural changes in the LPS coat that are not sufficient to alter the serotype but do warrant further investigation. Isolates H093700588 and H093800011 also lacked part of Sakai prophage region Sp11 (see Fig. S1B in the supplemental material).
One hundred two contigs from the reference Surrey farm strain that did not align to the E. coli O157:H7 strain Sakai chromosome were investigated further by searching for homology in the nonredundant GenBank database using BLASTN. Of the 102 contigs, the sequences of 91 of the contigs showed greater than 95% nucleotide identity across greater than 85% of their length to regions within other published O157:H7 strains in the database. Fifty-seven of the contigs contained phage-related genes demonstrating that much of the variation observed between the strain responsible for the Surrey outbreak and other previously sequenced E. coli strains is a result of phage-mediated evolution. One contig whose sequence did not have homology to other complete E. coli genome sequences was a 2-kb contig with 99.4% nucleotide identity to a Stx2 bacteriophage, ϕMin27. This phage has previously been reported to be found in E. coli O157:H7 strain Min27, isolated from the feces of a piglet with diarrhea at a swine farm in Shanghai, China, in 2003 (20). This contig is presumably part of a Stx2 phage carrying an alternative gene cargo to that seen in other complete E. coli genome sequences.
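The criterion used to decide whether a contig matched a published O157:H7 genome (greater than 95% nucleotide identity across greater than 85% of the contig length) can be written as a small predicate. The function and parameter names are hypothetical, and the example values are made up rather than taken from the BLASTN results.

```python
MIN_IDENTITY_PCT = 95.0   # nucleotide identity threshold from the text
MIN_LENGTH_FRACTION = 0.85  # fraction of the contig that must be aligned


def matches_published_strain(
    identity_pct: float, aligned_len: int, contig_len: int
) -> bool:
    """True if a hit exceeds both the identity and the length-coverage thresholds."""
    if contig_len <= 0:
        raise ValueError("contig length must be positive")
    length_fraction = aligned_len / contig_len
    return identity_pct > MIN_IDENTITY_PCT and length_fraction > MIN_LENGTH_FRACTION


# Made-up hit: 99.4% identity over 1,900 bp of a 2,000-bp contig.
print(matches_published_strain(99.4, 1900, 2000))
```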
The Illumina sequences of all 16 outbreak isolates were mapped to the assembled H093800014 sequence and examined for variation. Four SNP differences were identified for the 16 sequenced isolates from the Surrey farm. All SNPs were deemed to be genuine after manual inspection (Table 1; see Fig. S1A in the supplemental material). The putative parental SNP type was inferred from the shared sequence profile for those sites present in all other sequenced EHEC O157 strains (Fig. 2A). The deletion from Sakai prophage region Sp11 mapped to two of the strains with haplotype H3 (H093700588 and H093800011). Therefore, it is likely that the deletion from this prophage occurred after the T → C mutation in the H1 haplotype to produce the H3 haplotype. These four SNP positions were used to assay all the strains isolated from the Surrey farm outbreak (n = 122 in Table S5 in the supplemental material) using the Luminex platform (see Materials and Methods).
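Defining haplotypes from a handful of SNPs amounts to grouping isolates by their allele profile; the number of differing positions between two profiles then gives the edge weights used for a minimum spanning tree. The sketch below uses made-up isolate names and four-base profiles, because the actual allele at each position for each isolate is not listed in the text.

```python
from collections import defaultdict


def group_by_profile(profiles: dict[str, str]) -> dict[str, list[str]]:
    """Group isolates that share an identical SNP profile (one haplotype each)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for isolate, profile in profiles.items():
        groups[profile].append(isolate)
    return {profile: sorted(members) for profile, members in groups.items()}


def hamming(a: str, b: str) -> int:
    """Number of SNP positions at which two profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


# HYPOTHETICAL profiles over four SNP positions.
toy_profiles = {"iso1": "TCGA", "iso2": "TCGA", "iso3": "CCGA", "iso4": "CCGT"}
haplotypes = group_by_profile(toy_profiles)
print(haplotypes)
print(hamming("TCGA", "CCGT"))
```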
Fig 2.

(A) Minimum spanning SNP tree drawn using the four SNPs identified in 16 isolates whose genomes had been sequenced. The haplotypes (H1 to H5) were comprised of the following isolates: H1, H093660495, H093800012, and H093800015; H2, H093460648, H093520770, and H093580621; H3, H093700588, H093720386, and H093800011; H4, H093740495, H093740759, H093800010, H093800014, and H093800016; H5, H093800018 and H093800019. The colors represent the proportion of strains sequenced that were from animal (red) or human (blue) sources. (B) Minimal spanning tree showing results for the five SNP types for all 122 isolates from the Surrey farm with the VNTR type colored. The proportions colored in red and green represent the major VNTR types observed in the outbreak (VNTR types 8-7-3-6-9-2-8-9 and 8-7-13-6-9-2-8-10, respectively).
The SNP-type data for the strain combined with the epidemiological information associated with each case was used to profile the outbreak in an attempt to determine when specific SNP types appeared and what proportion of the outbreak isolates could be attributed to each SNP type.
Figure 3 shows a histogram in which the frequency of each SNP type observed (by sample isolation date) is plotted. Two of the samples with the earliest isolation date were of an SNP type that represents a terminal node in the minimal spanning tree, and the ancestral SNP type was isolated throughout the outbreak. The number of isolates peaked 23 days into the outbreak, and this peak coincided with the observation of all SNP types.
Fig 3.
Histogram showing the distribution of SNP types over the progression of the outbreak. The minimal spanning tree shows results for the five SNP types (H1 to H5) for all 122 isolates from the Surrey Farm. SNP type H1 is ancestral.
Exact logistic regression showed that cases infected with SNP type H4 were 4.56 times more likely to have a study date after the median sample isolation date than cases that were not of this type (odds ratio [OR], 4.56; 95% confidence interval [95% CI], 1.17 to 20.10; P = 0.025), which fits with this SNP type being the most common SNP type and representing expansion of a single node. None of the other SNP types showed any statistically significant relationships with the onset date (Fig. 3) or animal exposure, corroborating the previous observation that several animals acted as the source of infection and that all SNP types were present at different locations on the farm. This distribution of SNP types and lack of concordance with isolation date show that the diversity of SNP types identified was present on the farm before any person became sick.
Current typing using VNTR sequences has proven to be useful in detecting outbreaks of EHEC O157, with outbreak strains sharing the same profile or very similar profiles (single locus variants) that are not observed in the background population. In this study, the same VNTR type was associated with multiple haplotypes: some VNTR types were observed independently on nonneighboring haplotypes, whereas others were unique to single haplotypes (Fig. 2B). All of these VNTR profiles are single locus variants, and this observation suggests that the differences in VNTR profiles represent rapid repeat number changes (as well as possible reversions) that occur at a higher rate than the single nucleotide mutations responsible for determining phylogenetic lineage.
DISCUSSION
Using public genomic data for comparison, we analyzed the genomes of 16 isolates of EHEC from a single open farm in Surrey, United Kingdom, for relatedness in the context of outbreak etiology. We identified four SNPs in the genome of the outbreak strain (Fig. 2A) which we used to define five haplotypes (subtypes of the outbreak strain defined by SNPs). Once the haplotypes had been defined, we designed a rapid SNP assay for their characterization in all isolates from the outbreak. The rationale was that if genome sequencing could be carried out early in an outbreak and if an SNP assay could be developed rapidly, then this could be used to test all subsequent isolates to define haplotypes for tracing the source of the outbreak. From the results of this assay, it is clear that a single successful strain of EHEC spread through the farm by clonal expansion before the human cases began (Fig. 3). This could have been caused by the widespread distribution of feces from an animal source or by multiplication of the outbreak strain in the environment. Furthermore, the results support the findings of the original epidemiology study (21), in that there was no association between haplotypes and risk factors or with sample date (Fig. 3). We also compared our results to those of the established typing techniques. Although the conclusions matched, neither MLVA types nor PFGE profiles reflect the true phylogenetic relationship of the isolates analyzed. Many EHEC outbreaks are defined by MLVA, and it is important to stress that VNTR typing, if interpreted appropriately, is a very valuable tool; however, attaching significance to single locus variants during outbreaks of EHEC is misleading.
There were many cases of severe disease (HUS) associated with this outbreak (22% of cases [21]), so we also analyzed the accessory genome to look for specific virulence factors. The presence of two copies of the stx2 genes and the full-length gene encoding NorV, along with the absence of the gene encoding Stx1, most likely explains why the outbreak was so severe, in terms of the high proportion of HUS, since these factors are associated with hypervirulence (17, 18, 22). The severity of outbreaks has also been associated with strain background, in particular with the group known as clade 8 (12). A genome SNP comparison with other O157:H7 EHEC ST11 isolates allowed us to group the Surrey farm outbreak strain as distinct from the clade 8 isolates. This suggests that the Stx2a toxin is independently responsible for the increased pathogenicity.
In this study, we have learnt from the development of an SNP-based assay that there was a clonal expansion of EHEC O157:H7. The clones could be distinguished by their SNP profile into 5 SNP types, which differed by a total of only four SNPs. We show that these SNP types were likely to have been present and widely distributed on the farm prior to the first human clinical case from this outbreak. The caveat to the use of SNP typing with a limited number of predefined SNPs is that it is impossible to look at microevolutionary events in these strains during the outbreak, since variation arising in an outbreak strain but not present in the sequenced strains will not be detected. Nevertheless, from the complete genome sequences of 16 isolates covering the whole outbreak, we have shown that there is enough variation to develop a robust typing scheme. Any SNP typing scheme, though, can be used only within the context for which it was designed. In this case, the typing scheme was useful only for this farm outbreak, and a new typing scheme would have to be developed for each outbreak. This has led us to conclude that whole-genome sequencing that detects all SNPs and insertion-deletion events would work for any pathogen and represents the future for molecular epidemiology. The biggest issue that remains is the lack of data available for comparison. If the benefits of whole-genome sequencing of pathogens are to be maximized, then alongside the development of technology and bioinformatics, a curated database must be established that microbiologists working in public health can use.
ACKNOWLEDGMENTS
This work was funded by the Wellcome Trust through the Wellcome Trust Sanger Institute and the Health Protection Agency, in part through grant CHPR108061.
Julian Parkhill has received support for conference travel and accommodation from Illumina Inc.
Footnotes
Published ahead of print 7 November 2012
Supplemental material for this article may be found at http://dx.doi.org/10.1128/JCM.01696-12.
REFERENCES
- 1. Gilmour MW, Graham M, Van Domselaar G, Tyler S, Kent H, Trout-Yakel KM, Larios O, Allen V, Lee B, Nadon C. 2010. High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak. BMC Genomics 11:120 doi:10.1186/1471-2164-11-120
- 2. Lewis T, Loman NJ, Bingle L, Jumaa P, Weinstock GM, Mortiboy D, Pallen MJ. 2010. High-throughput whole-genome sequencing to dissect the epidemiology of Acinetobacter baumannii isolates from a hospital outbreak. J. Hosp. Infect. 75:37–41
- 3. Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ, Bentley SD. 2010. Evolution of MRSA during hospital transmission and intercontinental spread. Science 327:469–474
- 4. Brzuszkiewicz E, Thurmer A, Schuldes J, Leimbach A, Liesegang H, Meyer FD, Boelter J, Petersen H, Gottschalk G, Daniel R. 2011. Genome sequence analyses of two isolates from the recent Escherichia coli outbreak in Germany reveal the emergence of a new pathotype: Entero-Aggregative-Haemorrhagic Escherichia coli (EAHEC). Arch. Microbiol. 193:883–891
- 5. Wong AR, Pearson JS, Bright MD, Munera D, Robinson KS, Lee SF, Frankel G, Hartland EL. 2011. Enteropathogenic and enterohaemorrhagic Escherichia coli: even more subversive elements. Mol. Microbiol. 80:1420–1438
- 6. Ohnishi M, Tanaka C, Kuhara S, Ishii K, Hattori M, Kurokawa K, Yasunaga T, Makino K, Shinagawa H, Murata T, Nakayama K, Terawaki Y, Hayashi T. 1999. Chromosome of the enterohemorrhagic Escherichia coli O157:H7; comparative analysis with K-12 MG1655 revealed the acquisition of a large amount of foreign DNAs. DNA Res. 6:361–368
- 7. Ogura Y, Ooka T, Iguchi A, Toh H, Asadulghani M, Oshima K, Kodama T, Abe H, Nakayama K, Kurokawa K, Tobe T, Hattori M, Hayashi T. 2009. Comparative genomics reveal the mechanism of the parallel evolution of O157 and non-O157 enterohemorrhagic Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 106:17939–17944
- 8. Health Protection Agency 2010. Review of the major outbreak of E. coli O157 in Surrey, 2009. Report of the Independent Investigation Committee June 2010. Health Protection Agency, London, United Kingdom
- 9. Willshaw GA, Smith HR, Cheasty T, Wall PG, Rowe B. 1997. Vero cytotoxin-producing Escherichia coli O157 outbreaks in England and Wales, 1995: phenotypic methods and genotypic subtyping. Emerg. Infect. Dis. 3:561–565
- 10. Hyytia-Trees E, Smole SC, Fields PA, Swaminathan B, Ribot EM. 2006. Second generation subtyping: a proposed PulseNet protocol for multiple-locus variable-number tandem repeat analysis of Shiga toxin-producing Escherichia coli O157 (STEC O157). Foodborne Pathog. Dis. 3:118–131
- 11. Barrett TJ, Lior H, Green JH, Khakhria R, Wells JG, Bell BP, Greene KD, Lewis J, Griffin PM. 1994. Laboratory investigation of a multistate food-borne outbreak of Escherichia coli O157:H7 by using pulsed-field gel electrophoresis and phage typing. J. Clin. Microbiol. 32:3013–3017
- 12. Manning SD, Motiwala AS, Springman AC, Qi W, Lacher DW, Ouellette LM, Mladonicky JM, Somsel P, Rudrik JT, Dietrich SE, Zhang W, Swaminathan B, Alland D, Whittam TS. 2008. Variation in virulence among clades of Escherichia coli O157:H7 associated with disease outbreaks. Proc. Natl. Acad. Sci. U. S. A. 105:4868–4873
- 13. Laing CR, Buchanan C, Taboada EN, Zhang Y, Karmali MA, Thomas JE, Gannon VP. 2009. In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:H7, one of which is associated with hyper-virulence. BMC Genomics 10:287 doi:10.1186/1471-2164-10-287
- 14. Mellmann A, Bielaszewska M, Kock R, Friedrich AW, Fruth A, Middendorf B, Harmsen D, Schmidt MA, Karch H. 2008. Analysis of collection of hemolytic uremic syndrome-associated enterohemorrhagic Escherichia coli. Emerg. Infect. Dis. 14:1287–1290
- 15. O'Brien AD, Tesh VL, Donohue-Rolfe A, Jackson MP, Olsnes S, Sandvig K, Lindberg AA, Keusch GT. 1992. Shiga toxin: biochemistry, genetics, mode of action, and role in pathogenesis. Curr. Top. Microbiol. Immunol. 180:65–94
- 16. Schmitt CK, McKee ML, O'Brien AD. 1991. Two copies of Shiga-like toxin II-related genes common in enterohemorrhagic Escherichia coli strains are responsible for the antigenic heterogeneity of the O157:H− strain E32511. Infect. Immun. 59:1065–1073
- 17. Orth D, Grif K, Khan AB, Naim A, Dierich MP, Wurzner R. 2007. The Shiga toxin genotype rather than the amount of Shiga toxin or the cytotoxicity of Shiga toxin in vitro correlates with the appearance of the hemolytic uremic syndrome. Diagn. Microbiol. Infect. Dis. 59:235–242
- 18. Kulasekara BR, Jacobs M, Zhou Y, Wu Z, Sims E, Saenphimmachak C, Rohmer L, Ritchie JM, Radey M, McKevitt M, Freeman TL, Hayden H, Haugen E, Gillett W, Fong C, Chang J, Beskhlebnaya V, Waldor MK, Samadpour M, Whittam TS, Kaul R, Brittnacher M, Miller SI. 2009. Analysis of the genome of the Escherichia coli O157:H7 2006 spinach-associated outbreak isolate indicates candidate genes that may enhance virulence. Infect. Immun. 77:3713–3721
- 19. Kim DW, Lenzen G, Page AL, Legrain P, Sansonetti PJ, Parsot C. 2005. The Shigella flexneri effector OspG interferes with innate immune responses by targeting ubiquitin-conjugating enzymes. Proc. Natl. Acad. Sci. U. S. A. 102:14046–14051
- 20. Su L, Yan Y. 2009. Characterization of a Shiga toxin 2-encoding bacteriophage ϕMin27 isolated from Escherichia coli O157:H7 strain of China. Afr. J. Microbiol. Res. 3:799–808
- 21. Ihekweazu C, Carroll K, Adak B, Smith G, Pritchard GC, Gillespie IA, Verlander NQ, Harvey-Vince L, Reacher M, Edeghere O, Sultan B, Cooper R, Morgan G, Kinross PT, Boxall NS, Iversen A, Bickler G. 2012. Large outbreak of verocytotoxin-producing Escherichia coli O157 infection in visitors to a petting farm in South East England, 2009. Epidemiol. Infect. 140:1400–1413
- 22. Fuller CA, Pellino CA, Flagler MJ, Strasser JE, Weiss AA. 2011. Shiga toxin subtypes display dramatic differences in potency. Infect. Immun. 79:1329–1337