Applied and Environmental Microbiology. 2012 Mar;78(5):1353–1360. doi: 10.1128/AEM.06663-11

High-Resolution Two-Locus Clonal Typing of Extraintestinal Pathogenic Escherichia coli

Scott J Weissman a, James R Johnson b, Veronika Tchesnokova c, Mariya Billig c, Daniel Dykhuizen d, Kim Riddell e, Peggy Rogers e, Xuan Qin f, Susan Butler-Wu g, Brad T Cookson c,g, Ferric C Fang c,g, Delia Scholes h, Sujay Chattopadhyay c, Evgeni Sokurenko c
PMCID: PMC3294456  PMID: 22226951

Abstract

Multilocus sequence typing (MLST) is usually based on the sequencing of 5 to 8 housekeeping loci in the bacterial chromosome and has provided detailed descriptions of the population structure of bacterial species important to human health. However, even strains with identical MLST profiles (known as sequence types or STs) may possess distinct genotypes, which enable different eco- or pathotypic lifestyles. Here we describe a two-locus, sequence-based typing scheme for Escherichia coli that utilizes a 489-nucleotide (nt) internal fragment of fimH (encoding the type 1 fimbrial adhesin) and the 469-nt internal fumC fragment used in standard MLST. Based on sequence typing of 191 model commensal and pathogenic isolates plus 853 freshly isolated clinical E. coli strains, this 2-locus approach—which we call CH (fumC/fimH) typing—consistently yielded more haplotypes than standard 7-locus MLST, splitting large STs into multiple clonal subgroups and often distinguishing different within-ST eco- and pathotypes. Furthermore, specific CH profiles corresponded to specific STs, or ST complexes, with 95% accuracy, allowing excellent prediction of MLST-based profiles. Thus, 2-locus CH typing provides a genotyping tool for molecular epidemiology analysis that is more economical than standard 7-locus MLST but has superior clonal discrimination power and, at the same time, corresponds closely to MLST-based clonal groupings.

INTRODUCTION

Escherichia coli infections, which encompass both intestinal syndromes (e.g., diarrhea, dysentery) and extraintestinal syndromes (e.g., urinary tract infection [UTI], septicemia, newborn meningitis), represent a significant public health burden worldwide (17). Most extraintestinal E. coli infections are caused by strains from phylogenetic groups B2 and D, within which are concentrated the horizontally mobile genetic determinants associated with extraintestinal virulence, such as toxins, adhesins, protectins, and iron-scavenging systems (17).

Multilocus sequence typing (MLST) is currently the preferred method for characterizing the relatedness of strains within bacterial species (19). Standardized MLST schemes have been established for numerous human pathogens, including E. coli (38). Certain E. coli sequence types (STs, in which MLST profiles are identical) are epidemiologically associated with specific extraintestinal syndromes, e.g., ST127 and ST73 with pyelonephritis (15, 16), while others have been associated with important emerging antimicrobial resistance properties, e.g., ST69 with trimethoprim-sulfamethoxazole resistance (20) and ST131 with fluoroquinolone resistance and extended-spectrum beta-lactamase production (22).

However, STs are not uniform with regard to genetic properties or ecotypic/pathotypic behaviors. Within ST95, for example, strains from the North American OMP6 clade of serotype O18:K1:H7 encode P fimbriae and hemolysin and are strongly associated with both newborn meningitis and UTI (14), while strains from the European OMP9 clade of O18:K1:H7 encode neither element and are associated only with newborn meningitis (1). Moreover, serotypes O1:K1 and O2:K1 also occur within ST95 but are associated with UTI, sepsis, and avian colibacillosis rather than neonatal meningitis (1, 38). Likewise, ST73 includes both pathogenic strains such as archetypal human pyelonephritis isolate CFT073, which is virulent in various animal models of extraintestinal infection, and probiotic/commensal strains such as Nissle 1917 and ABU83972, which not only are avirulent themselves but can protect colonized hosts against symptomatic infection by more pathogenic strains (35). Typing schemes that provide reliable discrimination between closely related strains may enhance understanding of the clinical behavior and properties of these sublineages and of the evolutionary events that gave rise to them.

The vast majority of E. coli strains encode type 1 fimbriae (13, 18). The fim cluster is located in a highly recombinogenic region on the E. coli chromosome, just downstream of the leuX tRNA locus (34), into which pathogenicity islands are frequently inserted. The fimH gene, encoding the type 1 fimbrial adhesin, is under positive selection for functional mutations (37), whereby single nucleotide polymorphisms (SNPs) can produce amino acid replacements that dramatically alter bacterial cell adhesion properties relevant to pathogenesis (36, 37). In a previous study, we observed within clinically prominent ST95 a remarkable level of allelic diversity in fimH—12 unique alleles among 44 isolates—arising both by homologous fim cluster recombination and amino acid replacement mutation under strong positive selection (37). The genetic diversity of fimH has been leveraged for use in various typing applications (2, 8, 32).

Typing methods should provide discriminatory power, reproducibility, and efficiency in terms of cost and time (28). Since the cost and labor required by standard 7-locus MLST limit its application in clinical microbiology, a reduction in the number of typing loci may be attractive, so long as discriminatory power can be maintained or even enhanced. We have characterized fimH variation in both a collection of geographically and ecologically diverse model E. coli isolates and a large collection of fresh clinical isolates in order to develop a high-resolution, sequence-based typing scheme that leverages the pathoadaptive properties of the FimH adhesin, combined with the phylogenetic reliability of a standard MLST locus, fumC.

MATERIALS AND METHODS

Reference E. coli strains.

The primary (reference) study strain collection included 191 commensal and pathogenic isolates of E. coli (see Table S1 in the supplemental material), including 70 strains from the E. coli reference (ECOR) collection and 38 strains with publicly available genome sequences. Two of our ECOR specimens—ECOR 43 and ECOR 59—were excluded from further study after confirmatory molecular testing failed to yield the expected profiles.

We included an additional 83 mainly extraintestinal isolates, of which 75 have been described previously (23, 26, 27, 36, 37). The 8 previously unpublished strains were urine or fecal isolates from humans and domesticated animals with acute UTI and included fecal isolate HFP004, from a collection of fecal isolates recovered from women with acute UTI (and their household members) treated at a family practice clinic in the Minneapolis, MN, area; clinical isolates JRF12A, JRF15A, JRF16A, JRF22A, JRF26A, and JRF173, from dogs or cats with acute UTI evaluated at an ambulatory veterinary practice in San Diego County, CA; and HI#2, a urine isolate collected from an otherwise healthy adult woman with cystitis at the University of Washington, Seattle, WA.

Previously unpublished isolates were assigned to 1 of the 4 traditionally recognized E. coli phylogenetic groups (A, B1, B2, and D) based on either PCR-based phylotyping, as previously described, or clustering with reference strains in a dendrogram based on concatenated MLST sequences (6, 38).

Fresh clinical (current) E. coli isolates.

The collection of fresh clinical (current) E. coli isolates consisted of 853 consecutive E. coli isolates recovered in the clinical microbiology laboratories of several medical institutions during the routine processing of various clinical specimens—mainly urine (91%) but also wound (3%), blood (2%), and other specimens—between October 2010 and January 2011. Of the current isolates, 300 were obtained from the Group Health Cooperative (Seattle, WA), 200 were from the University of Washington Medical Center (Seattle, WA), 143 were from the Harborview Medical Center (Seattle, WA), 110 were from the Minneapolis Veterans Administration Medical Center (Minneapolis, MN), and 100 were from Seattle Children's Hospital (Seattle, WA).

MLST and fimH sequencing.

Amplification and sequencing of the MLST loci were done as previously described (26, 36, 37). For fimH amplification and sequencing, we used the following fimH primers: fimH-F, CACTCAGGGAACCATTCAGGCA (binds 50 to 72 nucleotides [nt] upstream of fimH start); fimH-R, CTTATTGATAAACAAAAGTCAC (spans the last 21 nt of fimH). When necessary, the following mid primer was used to complete full-length fimH sequencing: fimH-mid, CGTTGTTTATAATTCGAG (binds nt 339 to 356 of fimH). The thermocycler program for all reactions consisted of 1 cycle of 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 57°C for 15 s, and 72°C for 1 min. Contigs were assembled using BioNumerics (Applied Maths, Sint-Martens-Latem, Belgium). To describe the predicted FimH peptides associated with each allele, we selected as our reference the consensus FimH protein that carries the most conserved residue at each amino acid position in the mature peptide (11, 26). Amino acids in the signal peptide were numbered 1 (start codon) through 21 and are indicated throughout this report with a preceding minus sign.

Identification of fimH-null strains.

Strains with publicly available genomes were assigned fimH-null status if they were found to have any interruption (e.g., by insertion sequence) or deletion of the region flanked by fimH primer annealing sites (as for 7 strains). Strains sequenced de novo for this study were assigned fimH-null status if they did not amplify a product of the expected size (975 bp) with the fimH primers used here (as for 5 ECOR strains).

Phylogenetic analysis.

For each strain, the seven MLST gene fragments were concatenated into a single sequence of approximately 3,500 nt. We used PAUP* 4.0b (30) to generate maximum-likelihood DNA trees for concatenated MLST sequences and for full-length fimH sequences.

Phylogenetic analysis of current E. coli.

Of 853 isolates, 611 (72%) had all 7 MLST genes sequenced, 5% had 6, 4.7% had 5, 6% had 4, 10% had 3, and 2% had only 2 MLST genes sequenced. The isolates that underwent less-than-full MLST analysis had a unique combination of sequenced genes that placed them in one of several major ST complexes with high probability (P < 0.0001).

Nucleotide polymorphism analysis.

Nucleotide polymorphism was measured by average pairwise diversity index, π, using MEGA version 4 (31). The polymorphism plot was derived from a series of π values across overlapping windows of 100 nt with a step size of 50 nt using ProSeq v2.91 (10).
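As a rough illustration of the sliding-window calculation, the sketch below computes π as the mean proportion of differing sites over all sequence pairs, using the 100-nt window and 50-nt step stated above. It is not the MEGA or ProSeq implementation. The function names are hypothetical, and the sketch assumes a gap-free alignment of equal-length sequences and uncorrected (p-distance) differences.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    if not pairs or length == 0:
        return 0.0
    # Count mismatched positions for every pair, then average per pair and per site.
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)


def sliding_pi(seqs, window=100, step=50):
    """Return (start, pi) per window; start is a 0-based offset into the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_diversity(chunk)))
    return results
```

Plotting the second element of each tuple against the window start would give a polymorphism profile of the kind described here, under the stated assumptions.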

Discriminatory power and cluster correlation analyses.

Discriminatory power was analyzed using Simpson's index of diversity (D) (12), which quantifies the likelihood that two individuals selected randomly from the same population will exhibit different types. Thus, the relative discriminatory power of two typing methods can be compared directly using D when they are applied to the same population. Correlation of clustering techniques was evaluated using the Wallace coefficient, which measures the probability that paired strains assigned to the same genotype group by one method are also classified in the same type by the other method. The publicly available script (3) described by Carriço and colleagues (http://biomath.itqb.unl.pt/ClusterComp) was implemented in BioNumerics.
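The two statistics can be sketched as follows. This is an illustrative Python version, not the ClusterComp script used in the study, and the function names are hypothetical. D is computed in its finite-sample form, D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], where n_j is the number of strains of type j and N is the total number of strains. The Wallace coefficient is computed as the number of strain pairs grouped together by both methods divided by the number of pairs grouped together by the first method.

```python
from collections import Counter


def simpsons_d(types):
    """Probability that two strains drawn without replacement have different types."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two strains")
    same = sum(c * (c - 1) for c in Counter(types).values())
    return 1 - same / (n * (n - 1))


def wallace(a, b):
    """P(pair shares a type under method b | pair shares a type under method a)."""
    pairs_a = sum(c * (c - 1) // 2 for c in Counter(a).values())
    pairs_ab = sum(c * (c - 1) // 2 for c in Counter(zip(a, b)).values())
    if pairs_a == 0:
        raise ValueError("method a groups no pair of strains together")
    return pairs_ab / pairs_a
```

Note that the Wallace coefficient is directional: wallace(a, b) and wallace(b, a) generally differ, so the order of the two typing methods matters when interpreting it.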

RESULTS

Discriminatory power and congruence of fimH and MLST.

To evaluate the suitability of fimH as a typing locus, we first determined the 7-locus MLST profiles and full-length fimH sequence of 191 reference E. coli strains. The individual MLST loci exhibited 26 to 37 alleles each (Table 1). An intact, full-length fimH sequence was obtained from 179 (94%) of the 191 reference strains, with 67 unique, full-length fimH alleles observed; the 12 fimH-null strains derived from the ECOR (5 strains) and publicly available genome (7 strains) collections. Thus, fimH exhibited greater sequence variation than the individual MLST housekeeping genes.

Table 1.

Numbers of types found and D values of individual and combined loci of 191 diverse E. coli isolates

Typing method              No. of types found    D (95% CI)
Single loci
    adk                    35                    0.890 (0.869–0.919)
    fumC                   37                    0.911 (0.887–0.935)
    gyrB                   37                    0.887 (0.858–0.915)
    icd                    31                    0.888 (0.865–0.915)
    mdh                    26                    0.851 (0.810–0.891)
    purA                   28                    0.839 (0.802–0.877)
    recA                   29                    0.893 (0.869–0.912)
    fimH                   68                    0.967 (0.959–0.976)
    fimHTR                 59                    0.962 (0.953–0.972)
Loci paired with fimH
    adk + fimHTR           99                    0.986 (0.982–0.991)
    fumC + fimHTR          102                   0.988 (0.983–0.992)
    gyrB + fimHTR          103                   0.987 (0.983–0.992)
    icd + fimHTR           98                    0.987 (0.983–0.992)
    mdh + fimHTR           96                    0.986 (0.981–0.990)
    purA + fimHTR          95                    0.986 (0.982–0.990)
    recA + fimHTR          98                    0.987 (0.983–0.991)
    MLST alone             91                    0.951 (0.930–0.972)
    MLST + fimH            126                   0.991 (0.986–0.995)
    MLST + fimHTR          123                   0.990 (0.986–0.995)

We next examined the congruence of the fimH and MLST phylogenies. In total, 91 unique MLST profiles (STs) were encountered among the reference strains, all differing by at least 1 nt in one locus and spanning the 4 traditionally recognized phylogenetic groups of E. coli, i.e., groups A, B1, B2, and D (Fig. 1, left panel). Of the 67 full-length fimH alleles, 58 were associated with a single phylogenetic group. This subset included the 25 alleles encoding FimH polymorphism N78 (Fig. 1, red cross lines), all of which were associated with phylogenetic group B2, and 7 alleles that appeared in multiple STs but all within a given phylogenetic group (Fig. 1; blue, green, and orange cross lines). Thus, these fimH alleles could be defined as phylogenetically restricted alleles. The remaining 9 alleles were associated with STs in 2 or more phylogenetic groups (Fig. 1, black cross lines), indicating that certain fimH alleles frequently move horizontally among phylogenetically distant lineages of E. coli and could be defined as phylogenetically dispersed alleles.
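The restricted/dispersed distinction amounts to counting the phylogenetic groups in which each allele is found. A minimal sketch of that bookkeeping follows; the function name is hypothetical, and the input is assumed to be one (allele, phylogenetic group) pair per strain.

```python
from collections import defaultdict


def classify_alleles(records):
    """Label each fimH allele by how many phylogenetic groups it occurs in.

    records: iterable of (allele, phylogroup) pairs, one per strain.
    """
    groups = defaultdict(set)
    for allele, phylogroup in records:
        groups[allele].add(phylogroup)
    return {
        allele: "restricted" if len(found_in) == 1 else "dispersed"
        for allele, found_in in groups.items()
    }
```

An allele seen only in group B2 strains would be labeled restricted, whereas one seen in both B2 and D strains would be labeled dispersed.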

Fig 1.

Sequence-based typing of a collection of 191 model E. coli isolates. (Left panel) Dendrogram of concatenated 7-locus MLST sequences. (Right panel) Dendrogram of full-length type 1 fimbrial adhesin gene fimH sequences, with fimH typing region (fimHTR) alleles and amino acid polymorphisms that differ from the consensus structure. Cross-connecting lines link same-strain MLST and fimH haplotypes. The scales at the bottoms of dendrograms indicate phylogenetic distance expressed as percent identity at the nucleotide level. ST numbering is according to the MLST database (http://mlst.ucc.ie/mlst/dbs/Ecoli). The number of isolates associated with the specified type (in parentheses) is shown only where that number is >1. Empty circles indicate STs that include a fimH-null strain. Colors of cross-connecting lines between dendrograms correspond to the colors of the phylogenetic group origins: red, group B2 only; blue, group B1 only; orange, group A only; green, group D only; black, two or more phylogenetic groups. FimH hot spot polymorphisms are underlined. Mature FimH peptide polymorphisms encoded outside the typing region are italicized.

Trimming fimH for typing applications.

Sequence typing customarily uses a relatively short region of each locus (400 to 500 bp) to allow sequence determination by using only two primers. To identify an internal fragment of fimH suitable for typing purposes, we performed sequence polymorphism analysis on the 67 unique full-length fimH sequences in our reference collection. We measured the distribution of polymorphisms between the two functional domains of fimH: the N-terminal lectin domain (encoded by nt 64 to 540), which contains the mannose-specific binding pocket, and the C-terminal pilin domain (nt 541 to 900), which anchors the FimH subunit to the type 1 fimbrial shaft (pilus). According to π values (average number of polymorphisms per nucleotide), the lectin domain (overall π = 0.022) was significantly more diverse (P = 0.02) than the pilin domain (overall π = 0.013; Fig. 2). The lectin domain-encoding region of fimH was more diverse than 6 of the 7 MLST loci (π range of 0.008 to 0.015, P < 0.05) and comparable in diversity only to fumC (π = 0.026; P = 0.37).

Fig 2.

Sliding-window nucleotide polymorphism plot of 7 MLST loci, as well as the fimH lectin and pilin domains. The signal peptide and two fimH domains (lectin and pilin) are partitioned by a vertical dashed line. The red fimH typing region (fimHTR) includes the entire fimH lectin domain and small portions of the adjacent regions (i.e., signal peptide to the left, pilin domain to the right). Overlapping windows of 100 nt with a step size of 50 nt were used. The average π value (± the standard error) is shown for each locus.

We also considered the location of “hot spot” amino acid residues within FimH that are repeatedly targeted by amino acid replacement mutations that have pathogenicity-enhancing (pathoadaptive) effects on E. coli (36). In the fimH sequences of the reference strains, we identified a total of 4 hot spots, 3 of which (codons 27, 66, and 74) occurred within the lectin domain of FimH and the fourth of which (codon 163) occurred within the proximal portion of the pilin domain.

Using these data, we identified a 489-bp segment (here referred to as the fimH typing region [fimHTR]) that begins at the first codon of the mature peptide and ends after mature peptide codon 163 (nt 550 to 552; Fig. 2, approximated in red).
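The coordinates of this segment follow from the 21-codon signal peptide described in Materials and Methods. The short sketch below works through the arithmetic; the function and constant names are hypothetical and are used here only for illustration.

```python
SIGNAL_PEPTIDE_CODONS = 21  # signal peptide residues 1 to 21, per Materials and Methods


def mature_codon_to_nt(codon):
    """1-based (start, end) nt positions in full-length fimH for a mature-peptide codon."""
    full_codon = SIGNAL_PEPTIDE_CODONS + codon
    end = 3 * full_codon
    return end - 2, end


def trim_to_typing_region(full_length_fimh):
    """Slice a full-length fimH sequence down to mature-peptide codons 1 through 163."""
    start, _ = mature_codon_to_nt(1)
    _, end = mature_codon_to_nt(163)
    return full_length_fimh[start - 1:end]  # convert 1-based inclusive to Python slice


first_nt, _ = mature_codon_to_nt(1)      # nt 64
codon_163 = mature_codon_to_nt(163)      # nt 550 to 552
region_length = codon_163[1] - first_nt + 1
print(first_nt, codon_163, region_length)  # prints: 64 (550, 552) 489
```

The result agrees with the stated 489-bp length and with the nt 550 to 552 position given for codon 163.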

Discriminatory power of fimHTR-based typing.

Within the reference collection, fimHTR distinguished 58 alleles, in comparison to the 67 alleles distinguished by full-length fimH. For typing purposes, we also defined fimH-null status as an additional character state (i.e., as an additional “allele”). According to Simpson's index of diversity (D), although full-length fimH distinguished more alleles than fimHTR, the discriminatory powers (i.e., the population diversity based on the locus sequence) of these 2 regions were nearly equivalent, with D = 0.967 (confidence interval [CI], 0.959 to 0.976) for full-length fimH and D = 0.962 (CI, 0.953 to 0.972) for fimHTR. Thus, each exceeded the discriminatory power of individual MLST loci and was not different from that of 7-locus MLST (Table 1).

We then evaluated each of the 7 MLST loci to select the best candidate for pairing with fimHTR to increase typing resolution. Among the 7 MLST loci, fumC demonstrated numerically the greatest discriminatory power (D = 0.911; CI, 0.887 to 0.935), although the values overlapped with most remaining loci (Table 1). As expected, then, pairing fimHTR with fumC produced the numerically highest discriminatory power (D = 0.988; CI, 0.983 to 0.992) of all such pairings and significantly exceeded the discriminatory power of full MLST (Table 1), although again the discriminatory power of the fimHTR-fumC pairing was not significantly different from that of the other pairings.

However, fumC has another feature that recommended it for pairing with fimHTR in the typing scheme: of the 7 MLST loci, fumC demonstrated the best congruence with both ST profiles and major phylogenetic groups (Table 2). These relationships were measured by the Wallace index (coefficient), which expresses the probability that paired strains assigned to the same genotype group by one method are also classified in the same type by the other method. The superior phylogenetic congruence of fumC is particularly important considering the congruence-disrupting effect of the phylogenetically dispersed fimH alleles, as discussed above. Therefore, we selected the fumC-fimHTR combination as the target loci for sequence typing and designated this the CH (fumC-fimH) typing scheme.

Table 2.

Correspondence of individual MLST loci with full ST profiles and phylogenetic groups of 191 diverse E. coli isolates using the Wallace index

Locus      Wallace index for ST    Wallace index for phylogenetic group
adk        0.462                   0.800
fumC       0.548                   0.986
gyrB       0.432                   0.900
icd        0.437                   0.944
mdh        0.328                   0.959
purA       0.305                   0.766
recA       0.459                   0.892
fimHTR     0.258                   0.504

Correlation between MLST and CH typing among current E. coli isolates.

To determine the resolution and specificity of CH typing in a field application, we analyzed 853 fresh clinical E. coli isolates. The isolates were collected consecutively as part of routine diagnostics in five different clinical microbiology labs, from October 2010 through January 2011, without any preselection criteria. All isolates were of extraintestinal origin, primarily from urine. The MLST loci could be sequenced in all of the isolates tested, while fimHTR could be sequenced in more than 99% of the isolates (n = 846).

In total, 210 unique MLST profiles (i.e., STs) were identified. Among them, 181 small STs each comprised <0.5% of the population (≤4 isolates; Fig. 3A), collectively accounting for 252 isolates (29.5%). Additionally, 24 medium STs each comprised 0.5 to 5% of the population (5 to 35 isolates in the collection), collectively accounting for 219 isolates (25.7%). Finally, 5 large STs each comprised more than 5% of the population (≥40 isolates in the collection), collectively accounting for 382 isolates (44.9%). Thus, while the number of individual STs decreased progressively along a gradient from small to large ST size, the greatest proportion of current isolates was accounted for by relatively few large STs, evidence of the highly clonal structure of clinical ExPEC isolates.

Fig 3.

Distribution of isolates and unique profiles by group size among 853 current E. coli strains. ST or CHT sizes: <0.5%, small; 0.5 to 5%, medium; >5%, large. Light bars, total number of strains in each category (left axis). Dark bars, total number of unique profiles in each category (right axis). The axis scale is the same in both panels.

The current clinical isolates carried 143 unique fimHTR alleles. When fumC and fimHTR were combined for CH typing, there were a total of 246 unique CH types (CHTs), i.e., more than the number of 7-locus STs (see above). Similar to STs, the number of CHTs decreased significantly from small (209 CHTs) to medium (34 CHTs) to large (3 CHTs) (Fig. 3B). However, compared to STs, the absolute number of small and medium CHTs was somewhat greater, while the number of large CHTs was significantly lower. Likewise, whereas with MLST the large STs collectively accounted for the most isolates, with CH typing the medium CHTs collectively accounted for the most isolates, indicating that CH typing splits larger STs into smaller CHTs.
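The bookkeeping behind these counts can be sketched as follows: each isolate's CH type is the pair of its fumC and fimHTR alleles (with fimH-null treated as one more fimHTR state), and each type is then binned by the share of the collection it represents, using the thresholds from the Fig. 3 legend. The function names and allele labels are hypothetical and serve only as an illustration.

```python
from collections import Counter


def ch_types(isolates):
    """Count isolates per CH type; each isolate is a (fumC allele, fimHTR allele) pair."""
    return Counter((fumc, fimhtr) for fumc, fimhtr in isolates)


def size_category(count, total):
    """Bin a type by its share of the collection: <0.5% small, 0.5 to 5% medium, >5% large."""
    fraction = count / total
    if fraction < 0.005:
        return "small"
    if fraction <= 0.05:
        return "medium"
    return "large"


def summarize(isolates):
    """Number of unique CH types in each size category."""
    counts = ch_types(isolates)
    total = sum(counts.values())
    return Counter(size_category(c, total) for c in counts.values())
```

Applying the same binning to ST assignments instead of allele pairs would give the corresponding ST distribution.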

To determine to what extent unique CHTs correspond with MLST-based clonal groups, we combined STs into ST complexes by using the eBURST program (http://eburst.mlst.net), where each ST must match at 6 of 7 loci with at least 1 other ST in the complex. Nearly half of the singleton STs (66 of 138) could be combined this way with another ST within the collection, with the rest remaining as individual STs; this yielded a total of 123 separate STs or ST complexes. Overall, 224 CHTs (i.e., >90% of the total) had a unique, specific match and another 3 CHTs were mostly (93 to 97%) matched to a single ST or ST complex. This gave an overall match rate between CH typing and MLST of 95.8%.
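The grouping rule stated above (each ST matches at least 1 other ST in its complex at 6 of 7 loci) can be approximated by single-linkage clustering. The sketch below is not eBURST itself, which additionally predicts founders, and the match-rate function uses an assumed definition (the fraction of isolates falling in the majority ST complex of their CHT) that may differ from how the reported figure was derived. All names are hypothetical.

```python
from collections import Counter, defaultdict


def st_complexes(profiles, min_shared=6):
    """Single-linkage grouping of STs that share at least min_shared of 7 alleles.

    profiles: dict mapping ST name to a tuple of 7 allele numbers.
    Returns a dict mapping each ST to a representative ST of its complex.
    """
    sts = list(profiles)
    parent = {st: st for st in sts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(sts):
        for b in sts[i + 1:]:
            shared = sum(x == y for x, y in zip(profiles[a], profiles[b]))
            if shared >= min_shared:
                parent[find(a)] = find(b)
    return {st: find(st) for st in sts}


def match_rate(isolates, complex_of):
    """Fraction of isolates whose ST complex is the majority complex of their CHT.

    isolates: list of (CHT, ST) pairs; complex_of: output of st_complexes.
    """
    by_cht = defaultdict(Counter)
    for cht, st in isolates:
        by_cht[cht][complex_of[st]] += 1
    matched = sum(max(counts.values()) for counts in by_cht.values())
    return matched / len(isolates)
```

Because linkage is transitive, two STs that differ at 2 loci can still fall in one complex if a third ST bridges them.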

The overall superior resolution and the clonal matching of CH typing relative to MLST are illustrated in Fig. 4, where the 5 largest ST complexes (each represented by >5% of the isolates) are compared directly to the corresponding CHTs. These large ST complexes included such notorious ExPEC clones as ST131, ST73, ST95, ST69, and ST127 (Fig. 4, upper panel). Only in the ST69 complex were there fewer CHTs than STs. In the other 4 ST complexes, CHTs outnumbered the corresponding STs by 2- to 3-fold. Furthermore, within each complex, except the ST69 complex, the major (founder) ST was split into 4 to 15 different CHTs. For the large ST complexes, almost all CHTs were specific to that complex (Fig. 4), with an overall match rate of 98.8%.

Fig 4.

Correspondence of fumC-fimH profiles (CHTs) with STs for the 5 largest ST complexes. Dotted lines connect minor STs with the corresponding CHTs; the remaining CHTs correspond to the predominant ST within the complex. CHT circles without a pie slice represent profiles with a total match to the ST complex; circles with a pie slice represent CHTs that mostly (93 to 97%) match the ST complex (the slice symbolizes the “nonmatch” isolates).

Thus, among current ExPEC isolates, 2-locus CH typing provided discriminatory power superior to that of MLST while maintaining robust clonal correspondence with the MLST-based clonal groups.

DISCUSSION

By pairing housekeeping locus fumC with a fimH fragment, we have devised a rapid sequence typing scheme for E. coli (designated CH typing) that preserves the phylogenetic signal, has superior discriminatory power, and may resolve clinically important sublineages within STs. It therefore could serve as a cost-effective alternative to full MLST.

Two genetic mechanisms contribute to the high discriminatory power of the fimH locus in CH typing. The first mechanism is fim cluster replacement by homologous recombination. Here, we demonstrated the existence of several phylogenetically dispersed fimH variants that appeared in 2 or more phylogenetic groups, presumably reflecting the independent entries of the corresponding fim clusters into expanding clones. The second mechanism is point mutation, leading to amino acid replacement. Such SNPs occur and accumulate in fimH allelic backgrounds with comparatively rapid evolutionary speed during the diversification of successful clones at the population level or even during the infection of a single host (36). They were shown to have dramatic effects on the mannose-specific uroepithelial cell adhesion that is central to E. coli urovirulence (37).

To create from full-length fimH (900 nt) a shorter fragment suitable for efficient molecular typing (≤500 nt), we selected mature peptide codons 1 to 163, which span the entire mannose-binding lectin domain, the interdomain linker, and a few N-terminal residues of the pilin domain. The fimHTR segment specifically captures polymorphic codons within which amino acid replacement SNPs often distinguish otherwise identical fimH alleles within specific STs. Such polymorphisms include “hot spots” that give rise to the same replacement in multiple allele backgrounds (e.g., G66S occurring in the S70/N78 and V202 allelic backgrounds) or to different replacements in distinct allele backgrounds (e.g., A27V and A27T). Although the proposed fimH partition does exclude functionally significant polymorphisms in the 21-amino-acid FimH signal peptide (25), inclusion of this region would have added only a single allele in the typing of this collection (data not shown), and was thus deemed unnecessary. Furthermore, the fimHTR segment defines an internal fragment that encompasses polymorphisms that are important for the discrimination of sublineages within extraintestinal pathogenic E. coli clones. For example, polymorphisms in codon 163 distinguish model pyelonephritis strain CFT073 from other ST73 strains with distinctive pathotypes, including canine UTI strain JRF26 and model asymptomatic bacteriuria strain ABU83972 (see Table S1 in the supplemental material). CH typing also splits apart the (cystitis-associated) OMP6 subclone from most (meningitis-associated) OMP9 strains of ST95.

Each of the STs currently prominent in extraintestinal disease—including traditional clonal groups such as ST95 and ST73 and emerging clonal groups such as ST69 and ST131—contains at least one of the phylogenetically dispersed fimHTR alleles (i.e., fimHTR27, -30, -34, -35, -41, and -54) in addition to one or more phylogenetically restricted fimH alleles. This suggests that fim cluster recombination may be a molecular indicator of clonal “blooms.” While some of the distinctive serotypic and pathotypic properties distinguishing strains within ST95 and ST73 have been described previously (35, 37), the partitioning of those properties according to phylogenetically restricted versus phylogenetically dispersed fimHTR alleles (see Fig. S1 in the supplemental material) that we observed may provide a novel insight into a molecular mechanism underlying clonal diversification. Indeed, the identity of fimH as a gene under positive selection for functional mutations (26, 37) and as a proven virulence factor in extraintestinal E. coli disease (7) may enable specific allele assignments to provide “value-added” predictive power regarding the pathotypic properties of the associated lineages.

We selected fumC as the single housekeeping locus for pairing with fimHTR in this typing scheme because, of the 7 MLST loci, it (i) provided the best discriminatory power, (ii) exhibited the highest level of nucleotide polymorphism, and (iii) was best able to predict the phylogenetic group. This combination of fumC and fimH provides greater discriminatory power than standard 7-locus MLST, which over the past decade replaced multilocus enzyme electrophoresis (1, 4, 5, 24) as the standard method for studying E. coli population structure. However, CH typing provides less resolving power than does pulsed-field gel electrophoresis (PFGE) (33), which is sensitive to subtler changes that accompany clonal diversification, such as chromosomal insertions, deletions, and inversions.

Although the apparent absence of an intact fimH locus in 6% of the model E. coli strains might appear to limit the applicability of CH typing, the proportion of fimH-null strains is clearly a collection-dependent trait. Among the model strains, the proportion was higher among fully sequenced strains, which included a significant proportion of enteric pathotypes (e.g., Shigella and enteroinvasive E. coli), and lower among the ECOR strains originally selected to broadly represent the diversity of the species. However, to allow the typeability of all E. coli isolates under the CH scheme, we have included fimH-null status as a fimHTR character state. Nonetheless, if wild-type populations recapitulate the fimH prevalence patterns observed in our collection, CH typing may not provide as high resolution among enteric pathotypes (particularly Shigella) as among extraintestinal pathotypes.

Clearly, CH typing by itself cannot replace standard MLST for definitive characterization of the clonal phylogeny of E. coli. However, when field tested here in parallel with MLST, CH typing proved highly capable of predicting the MLST profile. Indeed, the fact that CH typing correlated with specific STs or ST complexes for more than 90% of the isolates significantly increases its applicability as a molecular tool for both applied and basic investigations regarding the epidemiology and population structure of extraintestinal E. coli. For example, CH typing could be used to screen isolates in suspected point source outbreaks where PFGE is too labor-intensive and/or cost-prohibitive and random amplified polymorphic DNA is too poorly reproducible/portable, and could enable such screening to occur more quickly and at lower cost than standard MLST. Alternatively, CH typing could be used to evaluate large clinical isolate collections for sub-ST clonal diversity (e.g., in population studies of antimicrobial resistance [9, 21, 29]) and as a triage method for further molecular typing, whether by PFGE or whole-genome sequencing.

ACKNOWLEDGMENTS

This work was supported by Mentored Clinical Scientist Development K08 award AI057737A from the National Institute of Allergy and Infectious Diseases (S.J.W.) and NIH ARRA award 1RC4AI092828 (E.S.). This material is also based upon work supported by the Office of Research and Development, Medical Research Service, Department of Veterans Affairs (J.R.J.).

We thank Brian Johnston, Alena Gileva, Tawna Matthys, and medical technologists in the Harborview Medical Center and University of Washington Medical Center Microbiology laboratories for their excellent technical assistance.

Footnotes

Published ahead of print 6 January 2012

Supplemental material for this article may be found at http://aem.asm.org.

REFERENCES

  • 1. Achtman M, et al. 1983. Six widespread bacterial clones among Escherichia coli K1 isolates. Infect. Immun. 39:315–335
  • 2. Barl T, et al. 2008. Genotyping DNA chip for the simultaneous assessment of antibiotic resistance and pathogenic potential of extraintestinal pathogenic Escherichia coli. Int. J. Antimicrob. Agents 32:272–277
  • 3. Carriço JA, et al. 2006. Illustration of a common framework for relating multiple typing methods by application to macrolide-resistant Streptococcus pyogenes. J. Clin. Microbiol. 44:2524–2532
  • 4. Caugant DA, et al. 1983. Genetic diversity and relationships among strains of Escherichia coli in the intestine and those causing urinary tract infections. Prog. Allergy 33:203–227
  • 5. Caugant DA, et al. 1985. Genetic diversity in relation to serotype in Escherichia coli. Infect. Immun. 49:407–413
  • 6. Clermont O, Bonacorsi S, Bingen E. 2000. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl. Environ. Microbiol. 66:4555–4558
  • 7. Connell I, et al. 1996. Type 1 fimbrial expression enhances Escherichia coli virulence for the urinary tract. Proc. Natl. Acad. Sci. U. S. A. 93:9827–9832
  • 8. Dias RC, Moreira BM, Riley LW. 2010. Use of fimH single-nucleotide polymorphisms for strain typing of clinical isolates of Escherichia coli for epidemiologic investigation. J. Clin. Microbiol. 48:483–488
  • 9. Edelstein M, Pimkin M, Palagin I, Edelstein I, Stratchounski L. 2003. Prevalence and molecular epidemiology of CTX-M extended-spectrum beta-lactamase-producing Escherichia coli and Klebsiella pneumoniae in Russian hospitals. Antimicrob. Agents Chemother. 47:3724–3732
  • 10. Filatov DA. 2002. ProSeq: a software for preparation and evolutionary analysis of DNA sequence data sets. Mol. Ecol. Notes 2:621–624
  • 11. Hommais F, et al. 2003. The FimH A27V mutation is pathoadaptive for urovirulence in Escherichia coli B2 phylogenetic group isolates. Infect. Immun. 71:3619–3622
  • 12. Hunter PR, Gaston MA. 1988. Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity. J. Clin. Microbiol. 26:2465–2466
  • 13. Johnson JR, Delavari P, Kuskowski M, Stell AL. 2001. Phylogenetic distribution of extraintestinal virulence-associated traits in Escherichia coli. J. Infect. Dis. 183:78–88
  • 14. Johnson JR, Delavari P, O'Bryan TT. 2001. Escherichia coli O18:K1:H7 isolates from patients with acute cystitis and neonatal meningitis exhibit common phylogenetic origins and virulence factor profiles. J. Infect. Dis. 183:425–434
  • 15. Johnson JR, et al. 2008. Virulence genotypes and phylogenetic background of Escherichia coli serogroup O6 isolates from humans, dogs, and cats. J. Clin. Microbiol. 46:417–422
  • 16. Johnson JR, Owens KL, Clabots CR, Weissman SJ, Cannon SB. 2006. Phylogenetic relationships among clonal groups of extraintestinal pathogenic Escherichia coli as assessed by multi-locus sequence analysis. Microbes Infect. 8:1702–1713
  • 17. Johnson JR, Russo TA. 2002. Extraintestinal pathogenic Escherichia coli: “the other bad E. coli.” J. Lab. Clin. Med. 139:155–162
  • 18. Johnson JR, Stell AL. 2000. Extended virulence genotypes of Escherichia coli strains from patients with urosepsis in relation to phylogeny and host compromise. J. Infect. Dis. 181:261–272
  • 19. Maiden MC, et al. 1998. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. U. S. A. 95:3140–3145
  • 20. Manges AR, et al. 2001. Widespread distribution of urinary tract infections caused by a multidrug-resistant Escherichia coli clonal group. N. Engl. J. Med. 345:1007–1013
  • 21. Mendonça N, Leitao J, Manageiro V, Ferreira E, Canica M. 2007. Spread of extended-spectrum beta-lactamase CTX-M-producing Escherichia coli clinical isolates in community and nosocomial environments in Portugal. Antimicrob. Agents Chemother. 51:1946–1955
  • 22. Nicolas-Chanoine MH, et al. 2008. Intercontinental emergence of Escherichia coli clone O25:H4-ST131 producing CTX-M-15. J. Antimicrob. Chemother. 61:273–281
  • 23. Nowrouzian FL, Friman V, Adlerberth I, Wold AE. 2007. Reduced phase switch capacity and functional adhesin expression of type 1-fimbriated Escherichia coli from immunoglobulin A-deficient individuals. Infect. Immun. 75:932–940
  • 24. Ochman H, Selander RK. 1984. Standard reference strains of Escherichia coli from natural populations. J. Bacteriol. 157:690–693
  • 25. Ronald LS, et al. 2008. Adaptive mutations in the signal peptide of the type 1 fimbrial adhesin of uropathogenic Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 105:10937–10942
  • 26. Sokurenko EV, et al. 2004. Selection footprint in the FimH adhesin shows pathoadaptive niche differentiation in Escherichia coli. Mol. Biol. Evol. 21:1373–1383
  • 27. Stentebjerg-Olesen B, Chakraborty T, Klemm P. 1999. Type 1 fimbriation and phase switching in a natural Escherichia coli fimB null strain, Nissle 1917. J. Bacteriol. 181:7470–7478
  • 28. Struelens MJ. 1996. Consensus guidelines for appropriate use and evaluation of microbial epidemiologic typing systems. Clin. Microbiol. Infect. 2:2–11
  • 29. Suzuki S, et al. 2009. Change in the prevalence of extended-spectrum-beta-lactamase-producing Escherichia coli in Japan by clonal spread. J. Antimicrob. Chemother. 63:72–79
  • 30. Swofford DL. 2003. PAUP*: phylogenetic analysis using parsimony (* and other methods), version 4. Sinauer Associates, Sunderland, MA
  • 31. Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596–1599
  • 32. Tartof SY, Solberg OD, Riley LW. 2007. Genotypic analyses of uropathogenic Escherichia coli based on fimH single nucleotide polymorphisms (SNPs). J. Med. Microbiol. 56:1363–1369
  • 33. Tenover FC, et al. 1995. Interpreting chromosomal DNA restriction patterns produced by pulsed-field gel electrophoresis: criteria for bacterial strain typing. J. Clin. Microbiol. 33:2233–2239
  • 34. Touchon M, et al. 2009. Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet. 5:e1000344
  • 35. Vejborg RM, Friis C, Hancock V, Schembri MA, Klemm P. 2010. A virulent parent with probiotic progeny: comparative genomics of Escherichia coli strains CFT073, Nissle 1917 and ABU 83972. Mol. Genet. Genomics 283:469–484
  • 36. Weissman SJ, et al. 2007. Differential stability and trade-off effects of pathoadaptive mutations in the Escherichia coli FimH adhesin. Infect. Immun. 75:3548–3555
  • 37. Weissman SJ, et al. 2006. Clonal analysis reveals high rate of structural mutations in fimbrial adhesins of extraintestinal pathogenic Escherichia coli. Mol. Microbiol. 59:975–988
  • 38. Wirth T, et al. 2006. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol. Microbiol. 60:1136–1151

Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)