Hybrid Shiga toxin-producing Escherichia coli (STEC) and uropathogenic E. coli (UPEC) strains of multilocus sequence type 141 (ST141) cause both urinary tract infections and diarrhea in humans and are phylogenetically positioned between STEC and UPEC strains.
KEYWORDS: molecular evolution, comparative genomics, Shiga toxin-producing Escherichia coli, uropathogenic Escherichia coli, heteropathogenicity
ABSTRACT
Hybrid Shiga toxin-producing Escherichia coli (STEC) and uropathogenic E. coli (UPEC) strains of multilocus sequence type 141 (ST141) cause both urinary tract infections and diarrhea in humans and are phylogenetically positioned between STEC and UPEC strains. We used comparative genomic analysis of 85 temporally and spatially diverse ST141 E. coli strains, including 14 STEC/UPEC hybrids, collected in Germany (n = 13) and the United States (n = 1) to reconstruct their molecular evolution. Whole-genome sequencing data showed that 89% of the ST141 E. coli strains either were STEC/UPEC hybrids or contained a mixture of virulence genes from other pathotypes. Core genome analysis and ancestral reconstruction revealed that the ST141 E. coli strains clustered into two lineages that evolved from a common ancestor in the mid-19th century. The STEC/UPEC hybrid emerged ∼100 years ago by acquiring an stx prophage, which integrated into a previously unknown insertion site between rcsB and rcsD, followed by the insertion of a pathogenicity island (PAI) similar to PAI II of UPEC strain 536 (PAI II536-like). The two variants of PAI II536-like were associated with tRNA genes leuX and pheU, respectively. Finally, microevolution within PAI II536-like and acquisition of the enterohemorrhagic E. coli plasmid were observed. Our data suggest that intestinal pathogenic E. coli (IPEC)/extraintestinal pathogenic E. coli (ExPEC) hybrids are widespread and that selection pressure within the ST141 E. coli population led to the emergence of the STEC/UPEC hybrid as a clinically important subgroup. We hypothesize that ST141 E. coli strains serve as a melting pot for pathogroup conversion between IPEC and ExPEC, contrasting the classical theory of pathogen emergence from nonpathogens and corroborating our recently described phenomenon of heteropathogenicity among pathogenic E. coli strains.
INTRODUCTION
Escherichia coli is a Gram-negative, nonpathogenic bacterium that usually colonizes the gastrointestinal tract of humans. Some E. coli strains, however, can evolve into pathogenic bacteria by the acquisition of specific virulence factors (1, 2). Depending on the set of virulence factors acquired, E. coli strains can cause intestinal diseases (diarrhea) or extraintestinal diseases in humans and are considered intestinal pathogenic E. coli (IPEC) or extraintestinal pathogenic E. coli (ExPEC) strains, respectively (1). Shiga toxin (Stx)-producing E. coli (STEC) is an important subgroup of IPEC and can cause diarrhea, bloody diarrhea, and, as the most severe complication, hemolytic-uremic syndrome (HUS) (1, 3). Even though E. coli O157:H7 is the most common STEC serotype associated with outbreaks and severe diseases, other non-O157:H7 serotypes have been increasingly detected in human disease (3–5). Other well-defined pathogroups of IPEC with characteristic virulence factors include enterotoxigenic E. coli (ETEC), enteropathogenic E. coli (EPEC), enteroinvasive E. coli (EIEC), adherent-invasive E. coli (AIEC), and enteroaggregative E. coli (EAEC) (1, 6–8). The ExPEC pathogroup comprises uropathogenic E. coli (UPEC) and sepsis- and meningitis-associated E. coli (MNEC) (1). Besides these pure pathogroups, a genotypic mosaic, which can be stable over a variable period of time, can emerge in some strains as a result of the acquisition and loss of multiple pathogroup-defining virulence factors via horizontal gene transfer (9); the most prominent example of a genotypic mosaic is E. coli O104:H4, which harbors determinants of both STEC and EAEC and which caused the largest HUS outbreak ever (10–13).
However, even after extensive recombination, the resulting pathogens usually remained in their niche, which is in line with the classical theory of pathogen emergence; this theory postulates that a pathogen emerges from a nonpathogen via the sequential and linear acquisition of virulence factors (8, 10–12).
Recently, we proposed a novel evolutionary hypothesis of pathogen emergence—which is in contrast to the classical theory of linear evolution—called “phased metamorphosis” (8). We demonstrated that STEC O2:H6 strains of multilocus sequence type 141 (ST141) were transitional pathogens and are phylogenetically positioned between IPEC and ExPEC (8). In contrast to the classical theory of pathogen emergence, these strains not only possessed and expressed the biologically active virulence factors typical of STEC and UPEC but also were able to cause both diarrhea and—at least in a mouse model—urinary tract infections (UTIs). The ST141 STEC O2:H6 strains were therefore considered an STEC/UPEC hybrid, and due to their ability to cause both diarrhea and UTIs, they were referred to as “heteropathogens” (8).
The aforementioned study focused on the phenotypes of only the STEC/UPEC hybrid strains and included a limited genotypic characterization; an exhaustive phylogenetic and systematic analysis to unravel the virulence gene diversity of all collected ST141 E. coli strains available in public databases and the evolutionary history of STEC/UPEC hybrid strains is still missing.
Here, we therefore used a comprehensive data set consisting of the whole-genome sequences (WGS) of all collected ST141 E. coli strains (including strains from various parts of the world) available in public databases for extensive (phylo)genomic analyses to characterize the diversity of virulence genes of ST141 E. coli strains and to elucidate the origin and evolution of the STEC/UPEC hybrid. These analyses enabled us to decipher the diversity of hybrids within ST141 E. coli strains and the transitional status of ST141 E. coli strains between IPEC and ExPEC in a broader context to propose an evolutionary model of the STEC/UPEC hybrid and to identify mobile genetic elements (MGE) impacting the evolution of the STEC/UPEC hybrid.
MATERIALS AND METHODS
Description of bacterial strains.
A total of 85 ST141 E. coli strains collected worldwide were used in this study. Metadata (i.e., the source of isolation and disease) associated with the genome sequences of 21 ST141 E. coli strains indicated that all strains except one (strain JP_2011_1) were clinical strains. Strain JP_2011_1 was collected from a river in Japan. The clinical strains were isolated from either urine or blood and were associated with UTI-induced bacteremia and bacteremia, respectively. Moreover, the genome sequences of 50 ST141 E. coli strains had no metadata. However, the majority of these strains were part of a large clinical surveillance project mainly from Europe (the European Survey of Carbapenemase-Producing Enterobacteriaceae, EuSCAPE Sequencing Project) and the United States (the E. coli UTI Defensins and UTI Bacteremia Initiative, Broad Institute). We speculate that they were clinical strains associated with disease in humans. The remaining 14 strains were STEC/UPEC hybrids isolated from stool samples from epidemiologically unrelated patients with diarrhea from Germany (n = 13) and the United States (n = 1). The STEC/UPEC hybrid strains collected from Germany were first identified to be STEC strains from patients with nonbloody diarrhea whose stools contained no other intestinal pathogens and whose illnesses resolved without progression to HUS (8). PCR-based analysis of the 13 German STEC/UPEC hybrid strains showed that they possessed virulence genes typical of both STEC (stx) and UPEC (hlyA, cnf1, vat, the clb island, iroBCDEN [here referred to as the iro cluster], and ybtAEQSTU [here referred to as the ybt cluster]) but lacked virulence genes typical of EAEC (set1, astA, pic, pet, and aatA), ETEC (elt and estI), and EIEC (ial and sen) (8). 
Additionally, the STEC/UPEC hybrid strains were experimentally shown to express the Shiga toxin gene (stx) and UPEC virulence genes (hlyA, cnf1, vat, the clb island, the iro cluster, and the ybt cluster), as well as to cause diarrhea and UTIs in mouse models (8). Furthermore, metadata associated with the genome sequence of the STEC/UPEC hybrid strain collected from the United States indicated that the strain was isolated from a child with diarrhea. This strain has not been phenotypically characterized to cause UTIs. Detailed descriptions of the strains are provided in Data Set S1 in the supplemental material.
Whole-genome sequence data sets.
Our genomic data set included genomic information for all ST141 E. coli strains collected from different parts of the world that were available in public databases up to May 2017. We retrieved ST141 E. coli strains with their respective metadata and accession numbers from the EnteroBase (https://enterobase.warwick.ac.uk/) and ribosomal multilocus sequence typing (rMLST) (https://pubmlst.org/rmlst/) databases (both databases were accessed in May 2017). Subsequently, the accession number of each strain was used to retrieve the corresponding WGS data from the NCBI Sequence Read Archive (SRA) or European Nucleotide Archive (ENA) database. In total, 85 ST141 E. coli strains were used in this study, and of these, WGS data sets for 72 strains were retrieved from the NCBI SRA or ENA database. The data sets for the remaining 13 ST141 E. coli strains were derived from epidemiologically unrelated ST141 E. coli strains collected from all over Germany between 2000 and 2005; these strains were sequenced in this and a previous study (8). Furthermore, 60 reference genome sequences comprising those of different pathogroups of IPEC, ExPEC, and commensal E. coli strains were added (Data Set S1).
Whole-genome sequencing and genome analysis.
WGS data sets from previous studies (n = 72) were retrieved as reads from the NCBI SRA or ENA database. The remaining 13 ST141 E. coli strains were sequenced using the Illumina short-read technology and Nextera XT chemistry with either 100-bp or 250-bp paired-end protocols on an Illumina HiScanSQ or MiSeq sequencer according to the manufacturer’s instructions. After quality trimming using default parameters in SeqSphere+ software (version 4.0.0; Ridom GmbH, Münster, Germany), sequence reads were de novo assembled using the Velvet (version 1.1.04) tool (14), available in SeqSphere+, as previously described (15). For a sample to be included in the study, it had to pass our internal WGS quality control; i.e., in silico multilocus sequence typing (MLST) analysis had to confirm that the strain was ST141 and ≥95% of the 2,325 core genome MLST (cgMLST) targets had to be identified, reflecting good genome coverage after sequencing (15).
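The inclusion rule described above can be written down as a short check. The following is a minimal sketch, not the SeqSphere+ implementation: the `Assembly` record and its field names are hypothetical, and only the two thresholds (ST141 and at least 95% of the 2,325 cgMLST targets) come from the text.

```python
from dataclasses import dataclass

CGMLST_TARGETS = 2325   # number of cgMLST targets stated in the text
MIN_FRACTION = 0.95     # at least 95% of the targets must be identified
EXPECTED_ST = 141       # sequence type required for inclusion


@dataclass
class Assembly:
    """Hypothetical summary record for one assembled genome."""
    name: str
    sequence_type: int   # ST called by in silico MLST
    targets_found: int   # cgMLST targets identified in the assembly


def passes_qc(assembly: Assembly) -> bool:
    """Return True if the assembly is ST141 and covers enough cgMLST targets."""
    enough_targets = assembly.targets_found >= MIN_FRACTION * CGMLST_TARGETS
    return assembly.sequence_type == EXPECTED_ST and enough_targets
```

Under these thresholds, an ST141 assembly needs at least 2,209 of the 2,325 targets, because 0.95 × 2,325 = 2,208.75.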
For comprehensive comparative genome analysis purposes, five STEC/UPEC hybrid strains (Data Set S1) were resequenced using the long-read Pacific Biosciences technology (Pacific Biosciences Inc., Menlo Park, CA, USA) to achieve a (nearly) fully closed genome. DNA was extracted using a MagAttract HMW DNA kit (Qiagen, Hilden, Germany). Subsequently, DNA was sheared to a 10-kb size using g-TUBEs (Covaris Ltd., Brighton, United Kingdom), and the library was prepared using a SMRTbell template preparation kit (version 1.0), as recommended by the manufacturer (Pacific Biosciences Inc.). The DNA polymerase/template complex was prepared using a Sequel binding kit (version 2.1; Pacific Biosciences Inc.) and sequenced on a Sequel system (Pacific Biosciences Inc.). The resulting sequences were assembled using the HGAP4 assembler integrated in SMRT Link software (version 5.1; Pacific Biosciences Inc.).
In silico MLST, rMLST, cgMLST, phylogrouping, and phylogenetic analysis.
In silico gene-by-gene MLST, rMLST, and cgMLST analyses were done using default parameters in SeqSphere+ software as described previously (11). The allelic profiles of each respective strain generated by in silico cgMLST were used to construct a minimum-spanning tree (MST) for the clustering of strains. In the cgMLST MST construction, missing genes were ignored in the pairwise comparison. Moreover, all strains were classified into E. coli reference (ECOR) phylogenetic groups using the default parameters in SeqSphere+ by BLAST analysis of the chuA (GenBank accession no. CP000243.1 [nucleotides 3914107 to 3916089]), tspE4.C2 (GenBank accession no. AF222188.1), and yjaA (GenBank accession no. U00006.1 [nucleotides 78476 to 78859]) genes against the WGS data sets (16, 17). Targets were considered present and included in the ECOR classification if there was ≥95% sequence identity and ≥99% query overlap. Furthermore, for phylogenetic analysis with bootstrapping, the sequences of the 1,445 E. coli Sakai genes present in 145 genomes, including 85 ST141 E. coli genomes and 60 IPEC, ExPEC, and commensal E. coli genomes from GenBank, were extracted, concatenated, and aligned using the MAFFT program (18). Subsequently, we constructed the neighbor-joining (NJ) tree in MEGA6 software (19) using a maximum composite likelihood substitution model with 1,000 bootstrap replications. The resulting phylogeny was edited using the iTOL (20) and Inkscape (http://www.inkscape.org/) programs.
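To make the phrase "missing genes were ignored in the pairwise comparison" concrete, the sketch below computes an allele distance between two cgMLST profiles while skipping loci that are missing in either strain. It is an illustration only; the profile representation, the function name, and the toy locus names are assumptions and do not reflect how SeqSphere+ stores or compares profiles.

```python
from typing import Dict, Optional

# Hypothetical representation: locus name -> allele number, None if missing.
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci with different alleles, ignoring loci missing in either profile."""
    distance = 0
    for locus in a.keys() & b.keys():
        allele_a, allele_b = a[locus], b[locus]
        if allele_a is None or allele_b is None:
            continue  # missing target: excluded from this pairwise comparison
        if allele_a != allele_b:
            distance += 1
    return distance


# Toy profiles with made-up locus names and allele numbers.
strain_1 = {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 7}
strain_2 = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 9}
print(allele_distance(strain_1, strain_2))  # locus2 and locus4 differ; locus3 is skipped
```

A matrix of such pairwise distances is the usual input for a minimum-spanning tree.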
Screening of virulence genes.
We further used in silico analysis to investigate the presence/absence of putative virulence genes associated with various E. coli pathogroups as previously described (8). The virulence genes investigated included genes encoding STEC-associated toxins (stx, EHEC-hlyAC, cdtABC, subAB), serine proteases (espP, espI), adhesins (eae, iha, saa, lpfAO26, lpfAO113, lpfAO157-OI141, lpfAO157-OI154, efa1, sfpACDFGHJ [here referred to as the sfp cluster]), the locus of enterocyte effacement-encoded type III secretion system (escV), and secreted proteins (espF, map, espG). Furthermore, UPEC-associated virulence genes (hlyA, cnf1, vat, sat, papABCDEFGHIJK [here referred to as the pap cluster], sfaABCDGHS [here referred to as the sfa cluster], focAFGH [here referred to as the foc cluster], and hek), iron acquisition system-associated genes (the iro cluster), yersiniabactin cluster-associated genes (the ybt cluster, irp1, irp2, fyuA), cdiAB, and a region of the clb island (8) were also searched for. Moreover, genes encoding the heat-labile (elt) and heat-stable (estI) enterotoxins of ETEC and genes for the virulence plasmid pInv marker (ial) and Shigella enterotoxin 2 homologue (sen), the last two of which are associated with EIEC, were also investigated. Finally, we screened for EAEC-associated genes, namely, the set1, astA, aggR, aaiC, pic, and pet genes, encoding Shigella enterotoxin 1, heat-stable enterotoxin 1, transcriptional regulator AggR, secretory protein AaiC, and the autotransporters Pic and Pet, respectively, and aatA, a marker for the EAEC virulence plasmid. FASTA allele libraries, which represent the allelic diversity of each virulence gene, were created and subsequently used to generate a task template implemented in SeqSphere+ as previously described (21). Details on the virulence genes and their respective accession numbers used for creating the allele libraries are available in Data Set S2. These analyses were done within SeqSphere+ software. 
Targets were considered present, failed, or absent as described by Strauß et al. (21). Briefly, targets were considered present if there was ≥95% sequence identity and ≥99% query overlap to any of the nucleotide sequences in the allele library. A target with a partial sequence, frameshifts, or nucleotide ambiguities was considered failed. If the respective gene was not found under the given parameters, it was rated absent.
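The three-way call described above can be sketched as a small decision function. The thresholds are taken from the text; the `Hit` record, its field names, and the order in which the conditions are tested are assumptions made for illustration. In particular, treating a hit that meets the identity threshold but falls short of the overlap threshold as a "partial sequence" (and therefore failed) is one possible reading, not a documented rule.

```python
from dataclasses import dataclass
from typing import Optional

MIN_IDENTITY = 95.0  # percent sequence identity, from the text
MIN_OVERLAP = 99.0   # percent query overlap, from the text


@dataclass
class Hit:
    """Hypothetical best match of a genome against one allele library."""
    identity: float           # percent identity to the best library allele
    overlap: float            # percent of the query covered by the match
    frameshift: bool = False  # match contains a frameshift
    ambiguous: bool = False   # match contains ambiguous nucleotides


def target_status(hit: Optional[Hit]) -> str:
    """Classify a virulence gene target as 'present', 'failed', or 'absent'."""
    if hit is None or hit.identity < MIN_IDENTITY:
        return "absent"   # nothing found under the given parameters
    if hit.overlap < MIN_OVERLAP or hit.frameshift or hit.ambiguous:
        return "failed"   # partial sequence, frameshift, or ambiguity
    return "present"
```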
Ancestral dating and phylogenetic analysis.
For 42 strains with available metadata, i.e., country and year of isolation (Data Set S1), coalescence-based analysis to determine the evolutionary rates and the timing of the common ancestor was performed using the Bayesian Evolutionary Analysis Sampling Trees (BEAST) software tool (version 1.8.4) (22). The sequences used for the BEAST analysis were the core genome genes, defined with E. coli O157:H7 Sakai as the reference, that could be extracted from all 42 ST141 E. coli strains. These genes were concatenated and then aligned using the MAFFT program, available on the MAFFT server (http://mafft.cbrc.jp/alignment/server/large.html) (18). To exclude genomic regions that might have undergone recombination, we applied the Gubbins tool (https://github.com/sanger-pathogens/gubbins) (23) and removed the detected regions from subsequent analysis.
Coalescence-based investigation of the evolutionary history and the timing of diversification events was done using triplicate runs in BEAST, available at CIPRES Science Gateway (24), as described elsewhere (25). In brief, test runs of different models, including the general time-reversible (GTR), the Hasegawa-Kishino-Yano (HKY), and the Jukes-Cantor (JC) models, under the assumption of both strict and relaxed clocks and different demographic models (the constant population size, exponentially growing population, and Bayesian skyline tree priors), were performed. For each test run, 200 million steps were performed, and the chain was sampled every 20,000th step. Ten percent of the chain was discarded as burn-in. The results of all test runs were compared, and the HKY, strict clock, constant population size model with a uniform clock rate ([0,1], initial = 0.001) emerged as the best model, based on having the highest effective sample size (ESS) and Bayes factor (BF) values (25). Data were evaluated with the Tracer (version 1.6) (26) and TreeAnnotator (version 1.8.4) (27) programs. The maximum clade credibility (MCC) tree was visualized using FigTree (version 1.4.3) software (28).
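The chain settings above translate into sample counts by simple arithmetic, sketched below. The numbers (200 million steps, sampling every 20,000th step, 10% burn-in, triplicate runs) come from the text; the variable names are hypothetical, the initial state of the chain is ignored, and pooling the three runs is an assumption.

```python
CHAIN_LENGTH = 200_000_000  # steps per run, from the text
SAMPLE_EVERY = 20_000       # sampling interval, from the text
BURN_IN_PERCENT = 10        # percentage of the chain discarded as burn-in
RUNS = 3                    # triplicate runs

samples_per_run = CHAIN_LENGTH // SAMPLE_EVERY            # 10,000 samples per run
burn_in_samples = samples_per_run * BURN_IN_PERCENT // 100
retained_per_run = samples_per_run - burn_in_samples

pooled_before_burn_in = RUNS * samples_per_run            # assumes runs are pooled
pooled_after_burn_in = RUNS * retained_per_run

print(samples_per_run, retained_per_run)
print(pooled_before_burn_in, pooled_after_burn_in)
```

Three runs of 10,000 samples give 30,000 sampled trees, which is the number quoted in the legend to Fig. 2; whether that figure was counted before or after burn-in removal is not stated.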
Comparative genome analysis.
To elucidate the impact of MGE on the evolution of the STEC/UPEC hybrid, five strains were resequenced using the SMRT sequencing technology. The assembled genomes were annotated with the Prokaryotic Genome Annotation Pipeline (PGAP), available at NCBI (29, 30). Prediction of prophage elements within the genomes was done with the aid of the PHAge search tool enhanced release (PHASTER; http://phaster.ca/) (31). The STEC/UPEC hybrid genomes were then compared to the reference genomes of UPEC strain 536, E. coli Sakai, and commensal E. coli strain K-12, substrain MG1655. Moreover, we compared the ST141 STEC/UPEC hybrid and Shiga toxin-negative ST141 genomes. Both comparisons were performed and visualized using the Artemis (32), Artemis Comparison Tool (ACT) (33), Mauve (34), and BLAST Ring Image Generator (BRIG) (35) programs.
Data availability.
Raw data (sequence reads) for all newly generated WGS data sets were deposited at ENA under study no. PRJEB31106. The annotated genomes of the five STEC/UPEC hybrid strains are available at NCBI under accession no. CP035498, SEST00000000, SESS00000000, SESR00000000, and SESQ00000000. The XML file that was finally used for BEAST analysis (including all parameters and sequences analyzed) is available at https://doi.org/10.17879/43169635496.
RESULTS
Whole genome-based typing and phylogrouping of ST141 E. coli strains.
Our internal in silico WGS quality control analysis demonstrated that all 85 E. coli strains used in this study belong to ST141. In silico phylogrouping analysis showed that all ST141 E. coli strains used in this study belong to phylogroup B2. The neighbor-joining (NJ) tree based on 1,445 core genome genes that were present in all strains confirmed the intermediate position of ST141 E. coli strains between IPEC and ExPEC (Fig. 1). To better illustrate the intermediate positioning of ST141 between IPEC and ExPEC, we generated an MST using an allele-based cgMLST approach by comparing up to 4,671 genes (missing genes were ignored in the pairwise comparison). Here, the MST also confirmed the intermediate position of ST141 E. coli strains between IPEC and ExPEC (see Fig. S1 in the supplemental material). In the MST, all strains of ST141 clustered together, differing in at most 236 alleles, whereas the non-ST141 strains (reference strains) differed in at least 1,520 alleles in comparison to the ST141 group.
FIG 1.
Neighbor-joining (NJ) tree showing the phylogenetic relationship between 85 ST141 E. coli strains and diverse intestinal pathogenic E. coli (IPEC) and extraintestinal pathogenic E. coli (ExPEC) reference strains. The NJ tree is based on 1,445 core genome E. coli Sakai genes present in all strains. The labels and numbers in the outer circle illustrate the sample identifier and sequence types (ST), respectively. The numbers above or below the branches represent percent bootstrap values based on 1,000 replications. Only branches with bootstrap values above 90% are indicated.
Diversity of ST141 E. coli virulence genes.
Next, we searched the genomic data for all ST141 strains for various virulence genes of STEC, other IPEC (ETEC, EAEC, EIEC, and EPEC), and ExPEC (UPEC and MNEC) strains. Pathogenic E. coli strains are classified into pathogroups by the presence of specific virulence genes (1). Whereas the classification of EPEC, ETEC, EIEC, and STEC is straightforward, it is less clear-cut for UPEC and MNEC (ExPEC), as many genes are indiscriminately associated with these two pathogroups (1). The following representative virulence genes detected in ST141 E. coli strains were used to classify strains into the STEC pathogroup (which contains stx) and the UPEC pathogroup (in which at least one of the following genes is present: hek, hlyA, cnf1, vat, the clb island, the iro cluster, the ybt cluster, the pap cluster, and the sfa cluster). Shiga toxin gene (stx)-negative strains carrying other STEC-associated virulence genes (the sfp cluster, cdtABC, iha) are referred to in this study as the uropathogenic Shiga toxin-negative (UPECStx−) pathogroup. Since EAEC strains have been found to be quite heterogeneous, we selected the following EAEC marker genes for our in silico analysis: astA, aggR, aaiC, aatA, pet, and pic (36, 37). However, we used astA and pic to classify strains into the EAEC pathogroup, as our in silico analyses showed that the aggR, aatA, and aaiC genes were absent in all our strains. The absence of these genes does not imply that the strains are not of the EAEC pathogroup, as strains lacking these stringent markers, i.e., the aggR, aatA, and aaiC genes, have been shown to express the EAEC phenotype (38). Furthermore, astA and pic have been shown to be predominantly present and significantly associated with diarrhea-causing EAEC (36, 37, 39). Likewise, the astA and/or pic gene has been used as a marker to classify strains into the EAEC pathogroup (36, 37, 40, 41).
Ultimately, the phenotype of these strains needs to be verified experimentally to confirm their association with the EAEC pathogroup.
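The marker-based rules described above can be summarized as a small rule set. The sketch below is illustrative only: the gene lists are taken from the text, whereas the function names, the ASCII label "UPECStx-", and the order of the labels in the output are assumptions.

```python
from typing import List, Set

# Marker sets as listed in the text.
UPEC_MARKERS = {"hek", "hlyA", "cnf1", "vat", "clb island", "iro cluster",
                "ybt cluster", "pap cluster", "sfa cluster"}
STEC_ASSOCIATED = {"sfp cluster", "cdtABC", "iha"}  # STEC-associated genes other than stx
EAEC_MARKERS = {"astA", "pic"}


def pathogroups(genes: Set[str]) -> List[str]:
    """Return the pathogroup labels implied by a set of detected genes."""
    groups = []
    if genes & UPEC_MARKERS:
        groups.append("UPEC")
    if "stx" in genes:
        groups.append("STEC")
    elif genes & STEC_ASSOCIATED:
        groups.append("UPECStx-")  # stx-negative but carries STEC-associated genes
    if genes & EAEC_MARKERS:
        groups.append("EAEC")
    return groups


def label(genes: Set[str]) -> str:
    """Join the pathogroups into a hybrid label such as 'UPEC/STEC'."""
    return "/".join(pathogroups(genes)) or "unclassified"


print(label({"stx", "hlyA", "cnf1"}))  # a strain with stx and UPEC markers
print(label({"vat", "astA"}))          # a strain with UPEC and EAEC markers
```

A strain that receives more than one label corresponds to what the text calls a hybrid.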
Our extensive in silico analysis based on the WGS data for 85 ST141 strains collected from different parts of the world revealed that the majority of the ST141 E. coli strains (89%) contained multiple pathogroup-specific virulence genes (for STEC, stx; for UPEC, hlyA, cnf1, the clb island, the iro cluster, hek, and the ybt cluster; and for EAEC, astA and pic) and are thus referred to as hybrids. In addition to the STEC/UPEC hybrid, our current study detected new hybrids, namely, UPEC/STEC/EAEC, UPEC/EAEC, and UPEC/UPECStx−. In total, 16% (14/85) of the ST141 E. coli strains were STEC/UPEC hybrids possessing the virulence genes of both STEC (stx) and UPEC (hlyA, cnf1, the clb island, the iro cluster, hek, and the ybt cluster). Of these strains, 13 were collected in Germany and 1 (strain US_2016_2) was collected in the United States (Data Set S3). These STEC/UPEC hybrids were found to possess the same stx2 allele (GenBank accession no. GU126552) that was subtyped as stx2b and that encodes Stx2b (42). Moreover, strain US_2016_2 contained an additional virulence gene typical of the EAEC pathogroup (astA), while another strain (DE_2003_9) also possessed the EHEC hemolysin (EHEC-hlyAC) gene carried on an EHEC plasmid.
All of the remaining 71 ST141 strains possessed at least one gene associated with UPEC virulence (e.g., the iro cluster, the ybt cluster, vat, hek, and the clb island); 9 of them possessed only genes associated with UPEC virulence and not with other pathotypes. Furthermore, 11 ST141 UPECStx− strains possessed at least one of the following STEC-associated virulence genes: the adhesin-encoding gene iha, which is responsible in eae-negative STEC strains for adherence to epithelial cells (43); the cytolethal distending toxin-encoding gene cluster cdtABC, which is responsible for cell cycle arrest and subsequent apoptosis (44); and the Sfp fimbria-encoding gene cluster sfp, which is responsible for mannose-resistant hemagglutination in sorbitol-fermenting EHEC strains and adherence to human intestinal epithelial cells (45, 46). Remarkably, these STEC-associated virulence genes (iha, cdtABC, sfp cluster) were not detected in the 14 STEC/UPEC hybrids. Of the remaining 51 ST141 strains, 15 (18% of all 85 strains) exhibited a combination of virulence genes characteristic of UPEC and EAEC (UPEC/EAEC hybrid). Virulence genes typical of ETEC, EIEC, or EPEC were not found in any of the 85 ST141 strains. A complete overview of the different combinations of pathotypes is given in Table 1.
TABLE 1.
Distribution of virulence genes of different pathogroups in ST141 E. coli strains
| Pathogroup/hybrid (no. of strains) | ExPEC | | | | | | | | | | | STEC | | | | | EAEC | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | vat | ybt cluster | clb island | iro cluster | cnf1 | hlyA | hek | pap cluster | sfa cluster | foc cluster | cdiAB | stx | sfp cluster | cdtABC | EHEC-hlyAC | iha | astA | set1 | pic | aggR | aaiC | aatA |
| STEC/UPEC (12) | + | + | + | + | + | + | + | + | + | + | + | + | − | − | − | − | − | − | − | − | − | − |
| UPEC/STEC/EAEC (1) | + | + | + | + | + | + | + | + | + | + | + | + | − | − | − | − | + | − | − | − | − | − |
| UPEC/STEC (1) | + | + | + | + | + | + | + | + | + | + | + | + | − | − | + | − | − | − | − | − | − | − |
| UPEC (9) | + | + | + | + | + | + | + | + | + | + | + | − | − | − | − | − | − | − | − | − | − | − |
| UPEC/UPECStx− [a] (11) | + | + | + | + | + | + | + | + | + | + | + | − | + | + | − | + | − | − | − | − | − | − |
| UPEC/EAEC (15) | + | + | + | + | + | + | + | + | + | + | + | − | − | − | − | − | + | + | + | − | − | − |
| Other [b] (36) | + | + | + | + | + | + | + | + | + | + | + | − | + | − | − | − | + | + | + | − | − | − |

[a] The strains possess other virulence genes of STEC but not the stx gene.
[b] The strains possess a combination of virulence genes of UPEC and EAEC strains and/or other STEC virulence genes but not the stx gene.
[c] Presence of virulence gene: −, absent; +, present.
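As a consistency check, the strain counts in Table 1 can be tallied against the percentages quoted in the text. This is a minimal sketch using only numbers from the table; the dictionary keys are shorthand, and reading the 89% figure as "every strain except the nine carrying only UPEC genes" is an assumption.

```python
# Strain counts per pathogroup/hybrid row, copied from Table 1.
counts = {
    "STEC/UPEC": 12,
    "UPEC/STEC/EAEC": 1,
    "UPEC/STEC": 1,
    "UPEC": 9,
    "UPEC/UPECStx-": 11,
    "UPEC/EAEC": 15,
    "Other": 36,
}

total = sum(counts.values())

# stx-positive rows: the 14 STEC/UPEC hybrids described in the text.
stx_positive = counts["STEC/UPEC"] + counts["UPEC/STEC/EAEC"] + counts["UPEC/STEC"]

# Assumption: "hybrids" are all strains except those with UPEC genes only.
hybrids = total - counts["UPEC"]


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print(total, stx_positive, hybrids)
print(percent(stx_positive, total), percent(counts["UPEC/EAEC"], total), percent(hybrids, total))
```

With these counts the rows sum to 85 strains, and the rounded percentages are 16%, 18%, and 89%, in agreement with the values reported in the text.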
Evolutionary history of ST141 STEC/UPEC hybrids.
To confirm the overall clustering of the ST141 E. coli strains and to analyze the phylogenetic relation among the strains more deeply, we constructed a neighbor-joining (NJ) tree based on 1,445 core genome genes that were present in all strains (Fig. 1). Here, we found a division of the ST141 E. coli strains into two distinct groups, with the STEC/UPEC hybrid strains being distributed within the two groups. Using BEAST and the temporal information for the strains, we constructed an MCC tree, where all major diversification events of BEAST analysis were supported by high posterior probability values (Fig. 2 and Fig. S2). BEAST analysis also showed the partition of the ST141 strains into two distinct ST141 lineages (lineages L1 and L2) (Fig. 2), which shared a common ancestor approximately 138 years ago. To verify whether genomic recombination is responsible for the split of the L1 and L2 lineages, the genomic data used for BEAST analysis were used to construct a split decomposition tree using the SplitsTree4 program (version 4.14.8) (47). Here, the split decomposition tree demonstrated that the split of these two lineages was independent and not a result of genomic recombination (Fig. S3). The two lineages differed by a total of 28 nonsynonymous and 58 synonymous base substitutions when the L2 lineage was compared to the L1 lineage. The distribution of virulence genes between the two lineages varied significantly. While 38% and 89% of the strains in L2 possessed astA (encoding EAEC heat-stable enterotoxin 1) and the sfp gene cluster, respectively, none of the strains in L1 possessed either of these virulence genes. The STEC/UPEC hybrids were distributed within the two lineages, with L1 containing 11 strains, whereas L2 contained 3 strains. Our most recent common ancestor (MRCA) and BEAST analyses of the STEC/UPEC hybrids are based on the STEC/UPEC strains in L2. This was for the following reasons.
There were fewer STEC/UPEC hybrid strains in L2, the virulence gene content of the 3 STEC/UPEC hybrid strains in L2 was also found in some STEC/UPEC hybrid strains in L1, and the temporal evolution of STEC/UPEC hybrids in L2 was similar to that of some STEC/UPEC hybrid strains in L1. The BEAST analysis suggested that the gain and loss of mobile virulence elements in STEC/UPEC hybrid genomes occurred several times and in parallel in the two separate lineages (L1, L2). According to our results, the loss and gain of the mobile element-encoded virulence genes hlyA, cnf1, hek, the pap cluster, stx, and EHEC-hly drove the evolution of the STEC/UPEC hybrid. Subsequent analysis revealed that the hlyA, cnf1, hek, and pap cluster genes were encoded on a pathogenicity island (PAI) similar to PAI II of UPEC strain 536, which we refer to as PAI II536-like. Similarly, the Stx- and EHEC-Hly-encoding genes were encoded on a prophage and EHEC plasmid, respectively. Detailed analysis revealed that approximately 100 years ago, the STEC/UPEC hybrid emerged by acquiring the stx prophage. Approximately 45 to 50 years later (in the 1960s), the STEC/UPEC hybrid acquired PAI II536-like. Subsequently, microevolution was observed within PAI II536-like after its acquisition, leading to the complete or partial loss of this PAI in some phylogenetic offshoots. This is in concordance with the fact that PAI II of UPEC strain 536 was shown to be highly unstable (48). Very recently (20 years ago), the STEC/UPEC hybrid possessing both stx and PAI II536-like acquired the EHEC-hly-carrying plasmid.
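The relative ages quoted above can be converted to approximate calendar years. The sketch below assumes that "years ago" is counted back from the most recent strain in the data set, which was collected in 2016 according to the legend to Fig. 2; this reference point and the variable names are assumptions made for illustration.

```python
MOST_RECENT_TIP = 2016  # collection year of the most recent strain (Fig. 2 legend)


def to_calendar_year(years_ago: int, reference: int = MOST_RECENT_TIP) -> int:
    """Convert an age in years before the reference year to a calendar year."""
    return reference - years_ago


split_l1_l2 = to_calendar_year(138)   # common ancestor of lineages L1 and L2
stx_prophage = to_calendar_year(100)  # acquisition of the stx prophage
pai_window = (stx_prophage + 45, stx_prophage + 50)  # PAI II536-like, 45 to 50 years later
ehec_plasmid = to_calendar_year(20)   # acquisition of the EHEC-hly plasmid

print(split_l1_l2, stx_prophage, pai_window, ehec_plasmid)
```

Under this assumption the split falls at about 1878, close to the MRCA date of 1880 given in the legend to Fig. 2, and the window for PAI II536-like (1961 to 1966) is consistent with "in the 1960s".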
FIG 2.
Bayesian maximum clade credibility tree calculated from 30,000 sampled trees and the presence/absence of ExPEC and IPEC virulence genes in ST141 E. coli strains. Node labels represent the posterior probability values. Branch colors represent the country of origin of each sample and their most recent common ancestor (MRCA). The countries of origin are coded as follows: DE, Germany; DK, Denmark; GB, Great Britain; JP, Japan; SE, Sweden; and US, United States. The time periods and ages of the strains can be inferred using the timeline at the bottom. The most recent strain was collected in 2016; the MRCA of all strains was dated to 1880 (95% HPD, 1785 to 2093). L1 and L2 represent lineage 1 and lineage 2, respectively. The presence/absence of virulence genes is color coded: blue, presence of a gene; white, absence of a gene; yellow, failed (the target gene had a partial sequence, a frameshift, or ambiguities in sequences). The results of a detailed analysis of the presence/absence of virulence genes of ST141 E. coli are shown in Data Set S3 in the supplemental material.
Based on consensus analysis of the results generated by BEAST and the presence/absence of virulence genes located on MGE, we propose here an evolutionary model for the majority of the ST141 STEC/UPEC hybrid lineage (Fig. 3). This model involved the insertion of the stx-harboring prophage into the chromosome, followed by the insertion of PAI II536-like, thereby setting the stage for the acquisition of IPEC and ExPEC virulence genes. In one STEC/UPEC hybrid lineage, the process concluded with the addition of the EHEC plasmid carrying the genes for virulence factors. In contrast, in the other STEC/UPEC hybrid lineage, microevolution within the unstable PAI II536-like was observed, leading to the partial or total loss of PAI II536-like (Fig. 3).
FIG 3.
Proposed evolutionary model of the STEC/UPEC hybrid based on consensus analysis of virulence genes and the maximum clade credibility tree of ST141 E. coli strains. The mobile genetic elements stx-harboring prophage (62 kb with 67 coding regions containing stx encoding Shiga toxin as the major virulence factor), PAI II536-like (two variants of 136 kb and 98 kb containing the major ExPEC virulence genes hek [which encodes Hek protein], the pap cluster [which encodes P fimbriae], hlyA [which encodes UPEC exotoxin α-hemolysin], and cnf1 [which encodes UPEC exotoxin cytotoxic necrotizing factor type 1]), and the EHEC-hly-harboring plasmid (which encodes the EHEC hemolysin [the hly gene]) are colored red, green, and black, respectively. Hypothetical strains are drawn with broken lines. Representative STEC/UPEC hybrid strains are indicated below the images. The illustration is not drawn to scale, and only important dates are shown.
Characterization of MGE driving the evolution of the ST141 STEC/UPEC hybrid.
To characterize the impact of MGE on the evolution of STEC/UPEC hybrids, we resequenced five STEC/UPEC hybrid strains (DE_2002_1, DE_2003_8, DE_2004_2, DE_2005_3, and DE_2009_11) using SMRT sequencing technology for subsequent comparative genome analysis with pure UPEC and STEC genomes. The STEC/UPEC hybrid exhibited a larger genome (5.4 Mb) and more prophage elements (n = 13) than commensal E. coli strain MG1655 (4.6 Mb, 6 prophage elements) and UPEC strain 536 (5.0 Mb, 3 prophage elements) but a smaller genome and fewer prophage elements than E. coli Sakai (5.5 Mb, 18 prophage elements). The stx-harboring prophage, one of the intact prophage elements within the STEC/UPEC hybrid genome, is approximately 62 kb, contains 67 coding DNA sequences (CDS), and is similar in size and content in all five strains (Fig. 4). Unlike in the prototypical EHEC O157:H7 strains EDL933 (49) and Sakai (50), in which the stx2-harboring prophage is integrated into the wrbA gene, in all STEC/UPEC hybrid strains the stx2-harboring prophage is integrated between rcsB and rcsD, a previously unknown integration site for stx-harboring prophages, indicating a single acquisition event. Interestingly, the integration site between rcsB and rcsD is not occupied by a prophage in commensal E. coli strain MG1655, UPEC strain 536, or E. coli Sakai.
FIG 4.
Integration of the stx-harboring prophage between the rcsB and rcsD genes of the STEC/UPEC hybrid. The stx-harboring prophage is 62 kb and contains 67 CDS. Not all genes encoded on the stx-harboring prophage are shown in the diagram. Genes on the forward strand are depicted as right-pointing arrows; genes on the reverse strand are depicted as left-pointing arrows. Genes showing synteny between genomes are highlighted with the same color. The numbers on the reference genome represent the position on the genome.
PAI II536-like is the second genomic element that affects the evolution of the STEC/UPEC hybrid. It was present in two strains (DE_2004_2, DE_2009_11) with different sizes (136 kb and 98 kb) but was completely absent in the other three strains (DE_2002_1, DE_2003_8, DE_2005_3). The 136-kb and 98-kb PAI II536-like elements had different insertion sites and were associated with tRNA genes leuX and pheU, respectively. Whereas the leuX tRNA-associated PAI II536-like inserted between the yjgB and yjhS genes of UPEC strain 536, which is the same insertion site as that of PAI II in UPEC strain 536, the pheU tRNA-associated PAI II536-like integrated between the UPEC 536 gene yjdC (ECP_RS22315) and a gene encoding a hypothetical protein (ECP_RS25880) (Fig. 5). The variation in content and insertion sites of PAI II536-like indicates both microevolution within the PAI and at least two independent acquisition events.
FIG 5.
Integration site of PAI II536-like within the genome of STEC/UPEC in comparison to that within the UPEC 536 genome. Genes on the forward strand are depicted as right-sided arrows; genes located on the reverse strand are depicted as left-sided arrows. Genes showing synteny between genomes are highlighted with the same color. Not all genes encoded on PAI II536-like and PAI II536 are shown in the diagram. The numbers on the reference genome represent the position on the genome. (A) Integration of the 98-kb PAI II536-like associated with pheU tRNA into heteropathogenic Stx-producing E. coli DE_2004_2. (B) Integration of the 136-kb PAI II536-like associated with leuX tRNA into heteropathogenic Stx-producing E. coli DE_2009_11.
Comparison of ST141 STEC/UPEC hybrid and Shiga toxin-negative ST141 genomes.
We further selected the genomes of eight Shiga toxin-negative ST141 strains (strains JP_2011_1, GB_2011_12, EuSCAPE_FI008, DK_2003_4, KTE88, SE_1996_1, US_2009_1, and UPEC) of diverse origins (Denmark, Great Britain, Japan, Sweden, and the United States) and sources of isolation (urine, blood, and a river) and compared their genomic contents to the complete genome of an STEC/UPEC hybrid (strain DE_2004_2). The comparative genome analysis demonstrated that the STEC/UPEC hybrid and Shiga toxin-negative ST141 genomes are highly similar (Fig. 6). Detailed analysis showed that, in total, six regions larger than 5 kb were missing in all eight Shiga toxin-negative strains but were present in the STEC/UPEC hybrid (Fig. 6A). Five of the six regions were prophages, and region 2 was associated with an insertion sequence. The genomic position, the number of coding regions, and, if known, the genomic content of the six regions are listed in Fig. 6B.
FIG 6.
(A) Comparative analysis of the ST141 STEC/UPEC hybrid strain DE_2004_2 complete genome with representative genomes of eight Shiga toxin-negative strains (strains JP_2011_1, GB_2011_12, EuSCAPE_FI008, DK_2003_4, KTE88, SE_1996_1, US_2009_1, and UPEC) collected from diverse origins (Denmark, Great Britain, Japan, Sweden, and the United States) and sources of isolation (urine, blood, and a river) using the BLAST Ring Image Generator. The concentric rings display similarity between the STEC/UPEC hybrid in the inner ring and the Shiga toxin-negative strains in the outer rings. The various colors indicate a BLAST result with a matched degree of shared regions, as shown in the key. The various sections labeled with region numbers on the outer circle represent genomic segments larger than 5 kb which are present in STEC/UPEC strain DE_2004_2 but absent from all eight Shiga toxin-negative strains. (B) Characteristics of the genomic regions present in STEC/UPEC hybrid strain DE_2004_2 but absent from all Shiga toxin-negative strains.
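The idea of reference segments larger than 5 kb that are absent from every comparison genome can be illustrated with a small interval computation. The Python sketch below is a hypothetical illustration and not the procedure used with the BLAST Ring Image Generator: it assumes that the BLAST hits of each comparison strain have already been reduced to (start, end) intervals on the reference genome, and the function names, strain labels, and coordinates are toy values.

```python
# Hypothetical sketch: find reference segments longer than min_len bp
# that are not covered by hits from ANY comparison genome.
# Coordinates are 0-based, half-open [start, end); all data are toy values.


def merge_intervals(intervals):
    """Merge overlapping or adjacent intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def regions_absent_from_all(ref_len, hits_per_strain, min_len=5000):
    """Return reference segments longer than min_len with no hit in any strain."""
    # A segment is absent from all strains if it lies outside the union of hits.
    covered = merge_intervals(
        hit for hits in hits_per_strain.values() for hit in hits
    )
    gaps, cursor = [], 0
    for start, end in covered:
        if start - cursor > min_len:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if ref_len - cursor > min_len:
        gaps.append((cursor, ref_len))
    return gaps


ref_len = 100_000
toy_hits = {
    "strain_A": [(0, 20_000), (30_000, 60_000), (70_000, 100_000)],
    "strain_B": [(0, 22_000), (28_000, 61_000), (64_000, 100_000)],
}
absent = regions_absent_from_all(ref_len, toy_hits)
print(absent)
```

In this toy example, the 3-kb gap between positions 61,000 and 64,000 is also uncovered in both strains but falls below the 5-kb threshold and is therefore not reported.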
DISCUSSION
Many studies have elaborated on the emergence and clinical significance of hybrid E. coli strains, and hybrids of STEC/ETEC have recently been reported in Germany, the United States, and Slovakia (51–53). The most clinically significant hybrid documented is the E. coli O104:H4 STEC/EAEC hybrid, which caused the largest HUS outbreak to date, centered in northern Germany, in 2011 (13, 54, 55). It was hypothesized that the unique combination of virulence traits of both EAEC and STEC strains, including Stx2 production and aggregative adherence to intestinal epithelial cells (12, 54), contributed to its high virulence. We therefore endeavored to understand in detail the diversity of virulence genes, the origin, and the evolution of the hybrids, particularly the STEC/UPEC hybrid described in our previous study (8), because of their unique intestinal and extraintestinal virulence phenotypes and intermediate phylogenetic position between STEC and UPEC. Evidence from our data corroborates that ST141 E. coli acts as an interface between the ExPEC and IPEC pathogroups and that the STEC/UPEC hybrid, a hybrid subgroup of ST141 E. coli strains, is a young clone that emerged a century ago by acquiring the stx-harboring prophage and further evolved into a dynamic strain by acquiring an ExPEC PAI and an EHEC plasmid.
Evidence from this and other studies clearly points to the ubiquitous presence of IPEC/ExPEC hybrids in wildlife, humans, and, presumably, the environment. In our analysis of 85 ST141 strains collected from different parts of the world, we were surprised by the high prevalence of IPEC/ExPEC hybrid strains (89%). Although we focused on only a single sequence type (ST) that might be particularly susceptible to recombination, other studies showed similar results. For example, Lindstedt et al. found IPEC/ExPEC hybrid strains to be predominant and present at a very high frequency (64.3%) in human fecal samples (56), and Lu et al. have also detected hybrids in the intestines of wild animals (79.2%) (41). We assume that the prevalence of hybrids in clinical samples is much higher than currently recognized. However, we believe that the majority of these pathogens remain undetected because clinicians and microbiologists screen only for IPEC or ExPEC virulence genes in patients who present with diarrhea or UTIs, respectively. Screening for both ExPEC and IPEC virulence genes irrespective of clinical symptoms or the site of infection could help to provide more data to better estimate the frequency, the disease burden, and the harm caused by these hybrid strains.
Our extended analysis of ST141 strains from different countries puts previous findings about the intermediate position of the STEC/UPEC hybrid between IPEC and ExPEC into a broader context. Here, we demonstrated that ST141 E. coli is one of possibly various lineages within the E. coli population that contain several hybrids (UPEC/EAEC, STEC/UPEC/EAEC, UPECStx−/UPEC), including the STEC/UPEC hybrids described previously (8). We therefore hypothesize that the ST141 E. coli lineage background is one of the melting pots within the E. coli population to which determinants from various pathogroups are added, which leads, because of a high level of recombination and mutation, to many relatively rare and unrelated hybrid genotypes sharing the ST141 E. coli lineage background. Fed from this melting pot, successfully adapted hybrids could grow within the population and serve as a reservoir for the exchange of IPEC and ExPEC virulence genes. Although we did not experimentally confirm the phenotypic expression of the different pathotypes in the newly detected hybrids (i.e., UPEC/EAEC, STEC/UPEC/EAEC, UPEC/UPECStx−) of our current study, in contrast to our previous study (8), we assume that they can express both IPEC and ExPEC phenotypes. We believe that the presence of such melting pots is a phenomenon during pathogen evolution even more common than expected. Recent data from Boll et al. (57) corroborate our hypothesis: they isolated ST131 E. coli, the most prevalent multidrug-resistant ExPEC lineage globally, from current community-acquired bacteremia and recurrent UTI cases in Denmark and found acquisition of the EAEC-defining virulence plasmid pAA and of additional ExPEC genes, thereby enhancing the ability to successfully colonize and subsequently infect humans (57). We speculate that in-depth analysis of all available ST131 E. coli strains could reveal the existence of other hybrids within ST131 E. coli and further affirm our hypothesis that heteropathogenicity is a common phenomenon of pathogenic E. coli strains.
The evolutionary analysis of our data showed that the STEC/UPEC hybrid is a young clone and that its evolution is driven by the loss/gain of the Stx-encoding bacteriophage, PAI II536-like, and the EHEC plasmid. The absence of microevolution within the stx genes among all STEC/UPEC hybrids, the insertion of the stx-harboring prophage into a previously unknown insertion site (between rcsB and rcsD), and the association of PAI II536-like with known and unknown UPEC tRNAs further indicate a recent and, presumably, even ongoing evolutionary process. These facts, together with the recent ancestry based on the Bayesian analysis (Fig. 2 and 3), suggest that these hybrids must have grown successfully within the ST141 E. coli population, pointing to an as yet unknown selection pressure (58).
It is worth noting that our results present a temporal bias due to insufficient metadata, because of which over half of the ST141 E. coli strains were excluded from the BEAST analysis. The strains used for BEAST analysis span only 2 decades (the oldest collection date is 1996), leading to wide highest posterior density (HPD) intervals. It is therefore difficult to ascertain the divergence times with high confidence, and the inclusion of additional metadata and strains may shift the results with respect to the time of diversification of STEC/UPEC hybrids. However, the presence/absence of virulence genes driving the evolution of the STEC/UPEC hybrid and the intermediate position of ST141 E. coli corroborate both our findings based on Bayesian statistics presented here and previous findings (8). Another limitation is the fact that, due to the extraordinary plasticity of the E. coli genome, the classification of strains into pathogroups can be ambiguous. An example is the differentiation of EAEC: astA can also be found in other, non-EAEC E. coli strains (38, 59), and other EAEC markers, e.g., aggR and aatA, are not present in all EAEC strains (38). By screening for various markers, however, we aimed to reduce this limitation. Finally, in contrast to the hybrids that were phenotypically confirmed in our previous study (8), we can only speculate that the other hybrids of this study will also express the phenotype of their respective hybrid pathogroups. Ultimately, the phenotypes of these new hybrids ought to be verified.
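To illustrate what the width of an HPD interval expresses, the following Python sketch estimates a 95% HPD interval from posterior samples as the narrowest window containing 95% of the sorted samples, a common approach for unimodal posteriors. It is an assumption-laden illustration and not necessarily the way BEAST or Tracer computes its intervals; the function name and the simulated node ages are hypothetical and are not data from this study.

```python
import random


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, int(round(mass * n)))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[best], ordered[best + k - 1]


# Simulated posterior samples of a node date (toy values, not study data):
# a well-constrained estimate versus a poorly constrained one.
rng = random.Random(1)
well_constrained = [rng.gauss(1900, 10) for _ in range(2000)]
poorly_constrained = [rng.gauss(1900, 60) for _ in range(2000)]

narrow = hpd_interval(well_constrained)
wide = hpd_interval(poorly_constrained)
print(f"narrow 95% HPD width: {narrow[1] - narrow[0]:.0f} years")
print(f"wide 95% HPD width: {wide[1] - wide[0]:.0f} years")
```

Both simulated posteriors are centered on the same date, but the poorly constrained one yields a much wider interval, which is the situation described above for a sampling window of only 2 decades.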
Because the data are insufficient, it is difficult to predict with certainty the direction of evolution of STEC/UPEC hybrid strains. However, we propose a direction of evolution from ExPEC toward IPEC based on the following evidence. First, all ST141 strains, including an environmental strain, used in this study possessed virulence genes typical of ExPEC but not of IPEC. Second, the in silico phylogrouping analysis showed that all the ST141 E. coli strains (including the hybrid strains) belonged to phylogroup B2, a group characteristic of most ExPEC strains (8, 60). In contrast, IPEC and commensal E. coli strains belong to other phylogroups (phylogroups A, B1, and D) (8) (see Data Set S1 in the supplemental material).
In conclusion, we demonstrated that the ST141 E. coli lineage serves as one of the melting pots for pathogroup conversion between IPEC and ExPEC, of which the STEC/UPEC hybrid is a clinically important subgroup. Although the STEC/UPEC hybrid is a young clone, our results corroborate our previous concept of heteropathogenicity and suggest a more general principle that certain highly variable and accessible clones drive the emergence and evolution of novel pathogens. Whether these pathogens will be evolutionarily successful in the long run very likely depends on their niche and the presence of selective pressure. Broader investigations combining phylogeny and systematic virulence gene analysis of other E. coli sequence types and of other bacterial species are required to understand these ongoing evolutionary processes in detail.
ACKNOWLEDGMENTS
This work was funded by Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) grant 281125614/GRK2220 (EvoPAD projects A2 and A3) and DFG grant SFB1009 (projects B04 and B05).
The funding agency had no role in the design or execution of the experiments or in the publication of the results.
We are grateful to Helge Karch (University of Münster, Münster, Germany) for the fruitful discussion and critical reading of the manuscript.
Footnotes
Supplemental material is available online only.
REFERENCES
- 1. Kaper JB, Nataro JP, Mobley HL. 2004. Pathogenic Escherichia coli. Nat Rev Microbiol 2:123–140. doi: 10.1038/nrmicro818.
- 2. Pallen MJ, Wren BW. 2007. Bacterial pathogenomics. Nature 449:835–842. doi: 10.1038/nature06248.
- 3. Karch H, Tarr PI, Bielaszewska M. 2005. Enterohaemorrhagic Escherichia coli in human medicine. Int J Med Microbiol 295:405–418. doi: 10.1016/j.ijmm.2005.06.009.
- 4. Mellmann A, Bielaszewska M, Köck R, Friedrich AW, Fruth A, Middendorf B, Harmsen D, Schmidt MA, Karch H. 2008. Analysis of collection of hemolytic uremic syndrome-associated enterohemorrhagic Escherichia coli. Emerg Infect Dis 14:1287–1290. doi: 10.3201/eid1408.071082.
- 5. Bielaszewska M, Mellmann A, Bletz S, Zhang W, Köck R, Kossow A, Prager R, Fruth A, Orth-Höller D, Marejková M, Morabito S, Caprioli A, Piérard D, Smith G, Jenkins C, Curová K, Karch H. 2013. Enterohemorrhagic Escherichia coli O26:H11/H−: a new virulent clone emerges in Europe. Clin Infect Dis 56:1373–1381. doi: 10.1093/cid/cit055.
- 6. Nataro JP, Kaper JB. 1998. Diarrheagenic Escherichia coli. Clin Microbiol Rev 11:142–201. doi: 10.1128/CMR.11.1.142.
- 7. Darfeuille-Michaud A. 2002. Adherent-invasive Escherichia coli: a putative new E. coli pathotype associated with Crohn’s disease. Int J Med Microbiol 292:185–193. doi: 10.1078/1438-4221-00201.
- 8. Bielaszewska M, Schiller R, Lammers L, Bauwens A, Fruth A, Middendorf B, Schmidt MA, Tarr PI, Dobrindt U, Karch H, Mellmann A. 2014. Heteropathogenic virulence and phylogeny reveal phased pathogenic metamorphosis in Escherichia coli O2:H6. EMBO Mol Med 6:347–357. doi: 10.1002/emmm.201303133.
- 9. Tchesnokova V, Radey M, Chattopadhyay S, Larson L, Weaver JL, Kisiela D, Sokurenko EV. 2019. Pandemic fluoroquinolone resistant Escherichia coli clone ST1193 emerged via simultaneous homologous recombinations in 11 gene loci. Proc Natl Acad Sci U S A 116:14740–14748. doi: 10.1073/pnas.1903002116.
- 10. Brzuszkiewicz E, Thürmer A, Schuldes J, Leimbach A, Liesegang H, Meyer FD, Boelter J, Petersen H, Gottschalk G, Daniel R. 2011. Genome sequence analyses of two isolates from the recent Escherichia coli outbreak in Germany reveal the emergence of a new pathotype: entero-aggregative-haemorrhagic Escherichia coli (EAHEC). Arch Microbiol 193:883–891. doi: 10.1007/s00203-011-0725-6.
- 11. Mellmann A, Harmsen D, Cummings CA, Zentz EB, Leopold SR, Rico A, Prior K, Szczepanowski R, Ji Y, Zhang W, McLaughlin SF, Henkhaus JK, Leopold B, Bielaszewska M, Prager R, Brzoska PM, Moore RL, Guenther S, Rothberg JM, Karch H. 2011. Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology. PLoS One 6:e22751. doi: 10.1371/journal.pone.0022751.
- 12. Rasko DA, Webster DR, Sahl JW, Bashir A, Boisen N, Scheutz F, Paxinos EE, Sebra R, Chin CS, Iliopoulos D, Klammer A, Peluso P, Lee L, Kislyuk AO, Bullard J, Kasarskis A, Wang S, Eid J, Rank D, Redman JC, Steyert SR, Frimodt-Møller J, Struve C, Petersen AM, Krogfelt KA, Nataro JP, Schadt EE, Waldor MK. 2011. Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany. N Engl J Med 365:709–717. doi: 10.1056/NEJMoa1106920.
- 13. Bielaszewska M, Mellmann A, Zhang W, Köck R, Fruth A, Bauwens A, Peters G, Karch H. 2011. Characterisation of the Escherichia coli strain associated with an outbreak of haemolytic uraemic syndrome in Germany, 2011: a microbiological study. Lancet Infect Dis 11:671–676. doi: 10.1016/S1473-3099(11)70165-7.
- 14. Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829. doi: 10.1101/gr.074492.107.
- 15. Mellmann A, Bletz S, Böking T, Kipp F, Becker K, Schultes A, Prior K, Harmsen D. 2016. Real-time genome sequencing of resistant bacteria provides precision infection control in an institutional setting. J Clin Microbiol 54:2874–2881. doi: 10.1128/JCM.00790-16.
- 16. Clermont O, Bonacorsi S, Bingen E. 2000. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl Environ Microbiol 66:4555–4558. doi: 10.1128/aem.66.10.4555-4558.2000.
- 17. Selander RK, Caugant DA, Whittam TS. 1987. Genetic structure and variation in natural populations of Escherichia coli, p 1625–1648. In Neidhardt FC, Ingraham JL, Low KB, Magasanik B, Schaechter M, Umbarger HE (ed), Escherichia coli and Salmonella typhimurium: cellular and molecular biology. American Society for Microbiology, Washington, DC.
- 18. Katoh K, Rozewicki J, Yamada KD. 2019. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform 20:1160–1166. doi: 10.1093/bib/bbx108.
- 19. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30:2725–2729. doi: 10.1093/molbev/mst197.
- 20. Letunic I, Bork P. 2016. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res 44:W242–W245. doi: 10.1093/nar/gkw290.
- 21. Strauß L, Ruffing U, Abdulla S, Alabi A, Akulenko R, Garrine M, Germann A, Grobusch MP, Helms V, Herrmann M, Kazimoto T, Kern W, Mandomando I, Peters G, Schaumburg F, von Müller L, Mellmann A. 2016. Detecting Staphylococcus aureus virulence and resistance genes: a comparison of whole-genome sequencing and DNA microarray technology. J Clin Microbiol 54:1008–1016. doi: 10.1128/JCM.03022-15.
- 22. Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol 29:1969–1973. doi: 10.1093/molbev/mss075.
- 23. Croucher NJ, Page AJ, Connor TR, Delaney AJ, Keane JA, Bentley SD, Parkhill J, Harris SR. 2015. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res 43:e15. doi: 10.1093/nar/gku1196.
- 24. Miller MA, Pfeiffer W, Schwartz T. 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees, p 1–8. Abstr Gateway Computing Environments Workshop (GCE), New Orleans, LA.
- 25. Strauß L, Stegger M, Akpaka PE, Alabi A, Breurec S, Coombs G, Egyir B, Larsen AR, Laurent F, Monecke S, Peters G, Skov R, Strommenger B, Vandenesch F, Schaumburg F, Mellmann A. 2017. Origin, evolution, and global transmission of community-acquired Staphylococcus aureus ST8. Proc Natl Acad Sci U S A 114:E10596–E10604. doi: 10.1073/pnas.1702472114.
- 26. Rambaut A, Suchard MA, Xie D, Drummond AJ. 2014. Tracer v1.6. http://beast.bio.ed.ac.uk/Tracer.
- 27. Rambaut A, Drummond AJ. 2016. TreeAnnotator version v1.8.4. http://beast.bio.ed.ac.uk.
- 28. Rambaut A. 2016. FigTree version 1.4.3. http://tree.bio.ed.ac.uk.
- 29. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569.
- 30. Haft DH, DiCuccio M, Badretdin A, Brover V, Chetvernin V, O'Neill K, Li W, Chitsaz F, Derbyshire MK, Gonzales NR, Gwadz M, Lu F, Marchler GH, Song JS, Thanki N, Yamashita RA, Zheng C, Thibaud-Nissen F, Geer LY, Marchler-Bauer A, Pruitt KD. 2018. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res 46:D851–D860. doi: 10.1093/nar/gkx1068.
- 31. Arndt D, Grant JR, Marcu A, Sajed T, Pon A, Liang Y, Wishart DS. 2016. PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res 44:W16–W21. doi: 10.1093/nar/gkw387.
- 32. Carver T, Harris SR, Berriman M, Parkhill J, McQuillan JA. 2012. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics 28:464–469. doi: 10.1093/bioinformatics/btr703.
- 33. Carver TJ, Rutherford KM, Berriman M, Rajandream MA, Barrell BG, Parkhill J. 2005. ACT: the Artemis Comparison Tool. Bioinformatics 21:3422–3423. doi: 10.1093/bioinformatics/bti553.
- 34. Darling AE, Mau M, Perna NT. 2010. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5:e11147. doi: 10.1371/journal.pone.0011147.
- 35. Alikhan NF, Petty NK, Ben Zakour NL, Beatson SA. 2011. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genomics 12:402. doi: 10.1186/1471-2164-12-402.
- 36. Müller D, Greune L, Heusipp G, Karch H, Fruth A, Tschäpe H, Schmidt MA. 2007. Identification of unconventional intestinal pathogenic Escherichia coli isolates expressing intermediate virulence factor profiles by using a novel single-step multiplex PCR. Appl Environ Microbiol 73:3380–3390. doi: 10.1128/AEM.02855-06.
- 37. Cerna JF, Nataro JP, Estrada-Garcia T. 2003. Multiplex PCR for detection of three plasmid-borne genes of enteroaggregative Escherichia coli strains. J Clin Microbiol 41:2138–2140. doi: 10.1128/jcm.41.5.2138-2140.2003.
- 38. Sarantuya J, Nishi J, Wakimoto N, Erdene S, Nataro JP, Sheikh J, Iwashita M, Manago K, Tokuda K, Yoshinaga M, Miyata K, Kawano Y. 2004. Typical enteroaggregative Escherichia coli is the most prevalent pathotype among E. coli strains causing diarrhea in Mongolian children. J Clin Microbiol 42:133–139. doi: 10.1128/jcm.42.1.133-139.2004.
- 39. Zamboni A, Fabbricotti SH, Fagundes-Neto U, Scaletsky IC. 2004. Enteroaggregative Escherichia coli virulence factors are found to be associated with infantile diarrhea in Brazil. J Clin Microbiol 42:1058–1063. doi: 10.1128/jcm.42.3.1058-1063.2004.
- 40. Piva IC, Pereira AL, Ferraz LR, Silva RS, Vieira AC, Blanco JE, Blanco M, Blanco J, Giugliano LG. 2003. Virulence markers of enteroaggregative Escherichia coli isolated from children and adults with diarrhea in Brasília, Brazil. J Clin Microbiol 41:1827–1832. doi: 10.1128/jcm.41.5.1827-1832.2003.
- 41. Lu S, Jin D, Wu S, Yang J, Lan R, Bai X, Liu S, Meng Q, Yuan X, Zhou J, Pu J, Chen Q, Dai H, Hu Y, Xiong Y, Ye C, Xu J. 2016. Insights into the evolution of pathogenicity of Escherichia coli from genomic analysis of intestinal E. coli of Marmota himalayana in Qinghai-Tibet plateau of China. Emerg Microbes Infect 5:e122. doi: 10.1038/emi.2016.122.
- 42. Scheutz F, Teel LD, Beutin L, Piérard D, Buvens G, Karch H, Mellmann A, Caprioli A, Tozzoli R, Morabito S, Strockbine NA, Melton-Celsa AR, Sanchez M, Persson S, O'Brien AD. 2012. Multicenter evaluation of a sequence-based protocol for subtyping Shiga toxins and standardizing Stx nomenclature. J Clin Microbiol 50:2951–2963. doi: 10.1128/JCM.00860-12.
- 43. Tarr PI, Bilge SS, Vary JC Jr, Jelacic S, Habeeb RL, Ward TR, Baylor MR, Besser TE. 2000. Iha: a novel Escherichia coli O157:H7 adherence-conferring molecule encoded on a recently acquired chromosomal island of conserved structure. Infect Immun 68:1400–1407. doi: 10.1128/iai.68.3.1400-1407.2000.
- 44. Pérès SY, Marchès O, Daigle F, Nougayrède JP, Herault F, Tasca C, De Rycke J, Oswald E. 1997. A new cytolethal distending toxin (CDT) from Escherichia coli producing CNF2 blocks HeLa cell division in G2/M phase. Mol Microbiol 24:1095–1107. doi: 10.1046/j.1365-2958.1997.4181785.x.
- 45. Brunder W, Khan AS, Hacker J, Karch H. 2001. Novel type of fimbriae encoded by the large plasmid of sorbitol-fermenting enterohemorrhagic Escherichia coli O157:H−. Infect Immun 69:4447–4457. doi: 10.1128/IAI.69.7.4447-4457.2001.
- 46. Müsken A, Bielaszewska M, Greune L, Schweppe CH, Müthing J, Schmidt H, Schmidt MA, Karch H, Zhang W. 2008. Anaerobic conditions promote expression of Sfp fimbriae and adherence of sorbitol-fermenting enterohemorrhagic Escherichia coli O157:NM to human intestinal epithelial cells. Appl Environ Microbiol 74:1087–1093. doi: 10.1128/AEM.02496-07.
- 47. Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23:254–267. doi: 10.1093/molbev/msj030.
- 48. Middendorf B, Hochhut B, Leipold K, Dobrindt U, Blum-Oehler G, Hacker J. 2004. Instability of pathogenicity islands in uropathogenic Escherichia coli 536. J Bacteriol 186:3086–3096. doi: 10.1128/jb.186.10.3086-3096.2004.
- 49. Perna NT, Plunkett G, Burland V, Mau B, Glasner JD, Rose DJ, Mayhew GF, Evans PS, Gregor J, Kirkpatrick HA, Pósfai G, Hackett J, Klink S, Boutin A, Shao Y, Miller L, Grotbeck EJ, Davis NW, Lim A, Dimalanta ET, Potamousis KD, Apodaca J, Anantharaman TS, Lin J, Yen G, Schwartz DC, Welch RA, Blattner FR. 2001. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409:529–533. doi: 10.1038/35054089.
- 50. Hayashi T, Makino K, Ohnishi M, Kurokawa K, Ishii K, Yokoyama K, Han CG, Ohtsubo E, Nakayama K, Murata T, Tanaka M, Tobe T, Iida T, Takami H, Honda T, Sasakawa C, Ogasawara N, Yasunaga T, Kuhara S, Shiba T, Hattori M, Shinagawa H. 2001. Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res 8:11–22. doi: 10.1093/dnares/8.1.11.
- 51. Vu-Khac H, Holoda E, Pilipcinec E, Blanco M, Blanco JE, Dahbi G, Mora A, López C, González EA, Blanco J. 2007. Serotypes, virulence genes, intimin types and PFGE profiles of Escherichia coli isolated from piglets with diarrhea in Slovakia. Vet J 174:176–187. doi: 10.1016/j.tvjl.2006.05.019.
- 52. Fratamico PM, Bhagwat AA, Injaian L, Fedorka-Cray PJ. 2008. Characterization of Shiga toxin-producing Escherichia coli strains isolated from swine feces. Foodborne Pathog Dis 5:827–838. doi: 10.1089/fpd.2008.0147.
- 53. Prager R, Fruth A, Busch U, Tietze E. 2011. Comparative analysis of virulence genes, genetic diversity, and phylogeny of Shiga toxin 2g and heat-stable enterotoxin STIa encoding Escherichia coli isolates from humans, animals, and environmental sources. Int J Med Microbiol 301:181–191. doi: 10.1016/j.ijmm.2010.06.003.
- 54. Frank C, Werber D, Cramer JP, Askar M, Faber M, An der Heiden M, Bernard H, Fruth A, Prager R, Spode A, Wadl M, Zoufaly A, Jordan S, Kemper MJ, Follin P, Müller L, King LA, Rosner B, Buchholz U, Stark K, Krause G, HUS Investigation Team. 2011. Epidemic profile of Shiga-toxin-producing Escherichia coli O104:H4 outbreak in Germany. N Engl J Med 365:1771–1780. doi: 10.1056/NEJMoa1106483.
- 55. Karch H, Denamur E, Dobrindt U, Finlay BB, Hengge R, Johannes L, Ron EZ, Tønjum T, Sansonetti PJ, Vicente M. 2012. The enemy within us: lessons from the 2011 European Escherichia coli O104:H4 outbreak. EMBO Mol Med 4:841–848. doi: 10.1002/emmm.201201662.
- 56. Lindstedt BA, Finton MD, Porcellato D, Brandal LT. 2018. High frequency of hybrid Escherichia coli strains with combined intestinal pathogenic Escherichia coli (IPEC) and extraintestinal pathogenic Escherichia coli (ExPEC) virulence factors isolated from human faecal samples. BMC Infect Dis 18:544. doi: 10.1186/s12879-018-3449-2.
- 57. Boll EJ, Stegger M, Hasman H, Roer L, Overballe-Petersen S, Ng K, Scheutz F, Hammerum AM, Dungu A, Hansen F, Lilje B, Hansen DS, Krogfelt KA, Price LB, Johnson JR, Struve C, Olesen B. 2018. Emergence of enteroaggregative Escherichia coli within the ST131 lineage as a cause of extraintestinal infections. bioRxiv 435941. doi: 10.1101/435941.
- 58. Smith JM, Feil EJ, Smith NH. 2000. Population structure and evolutionary dynamics of pathogenic bacteria. Bioessays 22:1115–1122.
- 59. Maluta RP, Leite JL, Rojas TCG, Scaletsky ICA, Guastalli EAL, Ramos MC, Dias da Silveira W. 2017. Variants of astA gene among extra-intestinal Escherichia coli of human and avian origin. FEMS Microbiol Lett 364:fnw285. doi: 10.1093/femsle/fnw285.
- 60. Dreux N, Denizot J, Martinez-Medina M, Mellmann A, Billig M, Kisiela D, Chattopadhyay S, Sokurenko E, Neut C, Gower-Rousseau C, Colombel JF, Bonnet R, Darfeuille-Michaud A, Barnich N. 2013. Point mutations in FimH adhesin of Crohn’s disease-associated adherent-invasive Escherichia coli enhance intestinal inflammatory response. PLoS Pathog 9:e1003141. doi: 10.1371/journal.ppat.1003141.
Data Availability Statement
Raw sequence reads for all newly generated WGS data were deposited at ENA under project study no. PRJEB31106. The annotated genomes of the five STEC/UPEC hybrid strains are available at NCBI under accession no. CP035498, SEST00000000, SESS00000000, SESR00000000, and SESQ00000000. The XML file that was finally used for BEAST analysis (including all parameters and sequences analyzed) is available at https://doi.org/10.17879/43169635496.






