Abstract
Enterotoxigenic Escherichia coli (ETEC) is an enteric pathogen responsible for the majority of diarrheal cases worldwide. ETEC infections are estimated to cause 80,000 deaths annually, with the highest rates of burden, ca 75 million cases per year, amongst children under 5 years of age in resource-poor countries. It is also the leading cause of diarrhoea in travellers. Previous large-scale sequencing studies have found seven major ETEC lineages currently in circulation worldwide. We used PacBio long-read sequencing combined with Illumina sequencing to create high-quality complete reference genomes for each of the major lineages with manually curated chromosomes and plasmids. We confirm that the major ETEC lineages all harbour conserved plasmids that have been associated with their respective background genomes for decades, suggesting that the plasmids and chromosomes of ETEC are both crucial for ETEC virulence and success as pathogens. The in-depth analysis of gene content, synteny and correct annotations of plasmids will elucidate other plasmids with and without virulence factors in related bacterial species. These reference genomes allow for fast and accurate comparison between different ETEC strains, and these data will form the foundation of ETEC genomics research for years to come.
Subject terms: Genomics, Microbial genetics, Microbiology, Clinical microbiology, Pathogens
Introduction
Diarrheal pathogens are a leading cause of morbidity and mortality globally (WHO 2017), with enterotoxigenic Escherichia coli (ETEC) accounting for a large proportion of the diarrhoea cases in resource-poor countries1. An estimated 220 million cases each year are attributed to ETEC (WHO PPC 2020). The most vulnerable group is children under five years, but ETEC can also cause disease in adults and is the principal cause of diarrhoea in travellers. Resource-poor settings, where access to clean water is limited, enable the spread of ETEC, which is transmitted via the faecal-oral route through ingestion of contaminated food or water2. Disease severity ranges from mild to cholera-like symptoms with profuse watery diarrhoea. The infection is usually self-limiting, lasting three to four days, and may be treated by water and electrolyte rehydration to balance the loss of fluids and ions. There is strong evidence that an ETEC vaccine is of key importance in preventing children and adults from developing ETEC disease3. Several efforts are ongoing to develop an ETEC vaccine, with the majority focusing on immunogenic antigens potentially capable of inducing protection against a majority of the circulating ETEC clones3–6.
ETEC bacteria adhere to the small intestine through fimbrial, fibrillar or afimbrial outer-membrane structures called colonisation factors (CFs). Upon colonisation, the bacteria proliferate and secrete heat-labile toxin (LT) and/or heat-stable toxin (STh or STp), causing diarrhoea and often vomiting, which further spreads the bacteria in the environment7.
The ability of an ETEC strain to infect relies on its ability to adhere to cells of a specific host. To date, 27 different CFs with human tropism have been described, and individual ETEC strains usually express 1–3 different CFs8–14. The enterotoxins, LT and ST, can also be subdivided based on structure and function. Human-associated ETEC strains express one of the 28 different LT-I variants (LTh-1 and LTh-2 are the most common variants)15 alone or together with one of the genetically distinct types of STa; STh and STp16,17.
We have previously shown that ETEC strains causing human disease can be grouped into a set of clonal lineages that encompass strains with specific virulence profiles. Seven of the 21 identified lineages encompass ETEC strains that express the most commonly found CFs and toxin profiles amongst isolated clinical ETEC strains3,18.
There is currently one complete ETEC reference genome with curated annotations, H10407 (ref. 19). Several additional complete ETEC genomes are available20,21, some of which were annotated using automated annotation pipelines that often fail to correctly annotate ETEC-specific genes such as CFs. The rapid adoption of next-generation sequencing in public health, specifically within bacterial diseases22,23, and several large-scale sequencing studies18,24–28 have led to a sharp increase in the number of publicly available ETEC genomes. Most of these data were generated with short-read technologies, such as Illumina. A limitation of short-read sequence data is the inability to unambiguously resolve repetitive regions of a genome, leading to fragmented de novo assemblies of the underlying genome, missing regions and genes, and disjointed synteny. ETEC is a highly diverse pathogen both in the core genome and the accessory genome, including mobile genetic elements (MGEs). Clinically relevant MGEs, such as virulence plasmids, vary within ETEC strains. Hence, it is important to identify lineage-specific reference genomes that are carefully annotated, i.e. with manually curated annotations, and that include both chromosome and plasmid(s). Several complete genomes have been generated using long-read sequencing alone20,28; however, circularising some chromosomes and plasmids may be difficult, and small plasmids can be lost. Such assembly issues can be resolved using a hybrid assembly approach combining long-read and short-read sequencing data. In this report, we describe eight genomes, comprising eight chromosomes (seven successfully circularised) and 29 plasmids (24 successfully circularised) with curated annotations, from isolates representing the major ETEC lineages (L1-L7) that cause disease globally. They were sequenced using both short- and long-read sequencing technologies to provide the highest accuracy currently available.
These reference genomes will form the foundation of ETEC genomics research for years to come.
Results
Genome analysis of eight representative ETEC isolates
Eight ETEC strains representing the seven major ETEC lineages (L1-L7) comprising isolates with the most prevalent virulence factor profiles were sequenced, assembled, circularised and manually curated (Table 1).
Table 1.
Strain | Lineage | Phylogroup | MLST | O antigen | CF | Toxin profile | YoI^a | Location | Subject | Age of subject | D/AS^b |
---|---|---|---|---|---|---|---|---|---|---|---|
E925 | L1 | A | 2353 | O6 | CS1 + CS3 + CS21 | LT + STh | 2003 | Guatemala | Indigenous | Child < 5 yrs | D |
E1649 | L2 | A | 4 | O6 | CS2 + CS3 + CS21 | LT + STh | 1997 | Indonesia | Traveller | Adult | D |
E36 | L3 | B1 | 173 | O78 | CFA/I + CS21 | LT + STh | 1980 | Bangladesh | Indigenous | Adult/child | D |
E2980 | L3 | B1 | 5305 | O114 | CS7 | LT | 2010 | Bangladesh | Indigenous | Child < 5 yrs | D |
E1441 | L4 | A | 1312 | O25 | CS6 + CS21 | LT | 1997 | Kenya | Traveller | Adult | D |
E1779 | L5 | B1 | 443 | O115 | CS5 + CS6 | LT + STh | 2005 | Bangladesh | Indigenous | Adult | D |
E562 | L6 | A | 2332 | ON3 | CFA/I + CS21 | STh | 2000 | Mexico | Traveller | Adult | D/AS |
E1373 | L7 | E | 182 | O169 | CS6 | STp | 1996 | Indonesia | Traveller | Adult | D |
a YoI: Year of isolation.
b D/AS: Diarrhoea or Asymptomatic.
L3 is represented by two strains, one CS7 positive and one CFA/I positive. All chromosomes except one (E1779) were circularised. The average chromosome length was 4,927,521 bases (range 4,721,269–5,151,162) with an average GC content of 50.7% (range 50.4–50.9%), and the number of CDS ranged from 4409 to 4924 (Table S1). Each ETEC reference genome contains between two and five plasmids encompassing plasmid-specific features, some of which carry virulence genes and/or antibiotic resistance genes (Table 2, Additional file 2).
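Summary statistics of this kind (replicon length and GC content) can be reproduced from the deposited sequences along the following lines. This is a minimal sketch, not the pipeline used in the study: the function names and the toy sequences are assumptions made for illustration, and real input would be read from the FASTA records of the accessions in Table 2.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def summarise_lengths(lengths: list[int]) -> tuple[float, int, int]:
    """Return (mean, minimum, maximum) for a list of replicon lengths."""
    return sum(lengths) / len(lengths), min(lengths), max(lengths)


# Toy sequences for illustration only; these are not ETEC replicons.
toy_replicons = {"replicon_a": "ATGCGCGCTA", "replicon_b": "ATATATGCNN"}
for name, seq in toy_replicons.items():
    print(name, len(seq), round(gc_content(seq), 1))

mean_len, min_len, max_len = summarise_lengths([len(s) for s in toy_replicons.values()])
print(mean_len, min_len, max_len)
```

Ambiguous bases are excluded from the denominator here; a different convention (dividing by total length) would give slightly lower values for sequences containing N.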
Table 2.
Plasmid | Length (bp) | GC (%) | Inca | Plasmid features (no of copies) | Virulence genes | Putative virulence genes | Antibiotic resistance profile (genomic) | Acc. no | |
---|---|---|---|---|---|---|---|---|---|
L1 E925 CS1 + CS3 + CS21 LTh + STh |
pAvM_E925_4 | 116 803 | 48.4 | FII | ccdAB, hok-sok (antisense RNA-regulated system), psiAB, repAB, stbAB, tra genes | cstA-G (CS3), eltAB1 (LTh), estA3/4 (STh) | etpBAC | – | LR883051 |
pAvM_E925_5 | 82 909 | 48.6 | FII + FIB | psiAB, repA (2), repB, repE, parB, sopAB, tra genes | lngX1, R, S, T, X2, A-J, P (CS21) | – | – | LR883052 | |
pAvM_E925_6 | 82 314 | 47.8 | I1 | iib (colicin 1b), repA, stbAB, tra genes, pil locus, vapBC | cooB, A, C, D (CS1) | cexE | – | LR883053 | |
pAvM_E925_7 | 51 418 | 45.4 | FII | ccdAB, psiA, repAB, stbAB, tra genes | – | eatA_1-5 (two disrupted eatA copies) | – | LR883054 | |
L2 E1649 CS2* + CS3 + CS21 LTh + STh |
pAvM_E1649_8 | 120 141 | 47.2 | FII | ccdAB (duplicated), hok-sok (antisense RNA-regulated system), psiAB, repAB, stbAB, tra genes | cstA-G (CS3), eltAB1 (LTh), estA3/4 (STh) | eatA, etpBAC | – | LR882976 |
pAvM_E1649_9* | 102 017 | 47.6 | Y | P1 addiction system (phage related), repA, sopAB | – | – | | LR882977 | |
pAvM_E1649_10 | 86 517 | 45.0 | FII + FIB | hok-sok (antisense RNA-regulated system), psiAB, repA(2), repB, sopAB, tra genes | lngX1, R, S, T, X2, A-J, P (CS21) | – | – | LR882974 | |
pAvM_E1649_11* | 8 834 | 42.9 | No hits | ND | – | – | – | LR882975 | |
L3 E36 CFA/I + CS21 LTh + STh |
pAvM_E36_12* | 381 858 | 49.7 | FII + FIB | stbAB (4), psiAB (6), tra genes, repA (3), repB (3), sopAB (2), hok-sok (antisense RNA-regulated system), relE | cfaA, B, C, D (CFA/I), eltAB15 (LTh), lngX1, R, S, T, X2, A-J, P (CS21), estA2 (STh) | eatA, etpBAC | – | LR882998 |
pAvM_E36_13 | 99 448 | 51.6 | B/O/K/Z | stbAB, relE, repA, pil genes, psiAB | – | – | tetA, B, C, R; mdf(A)-like | LR882999 | |
L3 E2980 CS7 LTh |
pAvM_E2980_14 | 112 056 | 48.1 | I1 | parA, pil genes, relE, repA, stbAB | csvA, B, C, D (CS7), eltAB (LTh) | cexE | – | LR882979 |
pAvM_E2980_15 | 72 255 | 52.4 | FII | psiAB, relE, repAB, stbAB, tra genes | – | – | strA, strB, sul2, blaTEM-1B | LR882980 | |
pAvM_E2980_16 | 48 305 | 50.3 | I1-like | stbAB, repA, vapBC (TA-system) | – | eatA, etpBAC | – | LR882981 | |
L4 E1441 CS6 + CS21 LTh (LT17) |
pAvM_E1441_17 | 130 302 | 51.3 | FII + FIB | pemI/K (TA-system), psiAB, repA (2), repB, sopAB, srnAC (antisense RNA-regulated system), stbAB, tra genes | lngX1, R, S, T, X2, A-J, P (CS21) | – | aadA1, tetR, tetA, sul1, dfrh1 | LR883013 |
pAvM_E1441_18 | 94 840 | 47.1 | FII | parB, psiAB, repAB, stbAB | cssA, B, C, D (CS6), eltAB17 (LTh) | eatA_1, eatA_2, cexE | – | LR883014 | |
L5 E1779 CS5 + CS6 LTh + STh |
pAvM_E1779_19 | 142 377 | 47.6 | FII | ccdAB (TA-system), cea/cia (Colicin E), psiAB, repAB, stbAB (2 copies), tra genes, vapBC (TA-system) | csfA, B, C, E, F, D (CS5), cssA, B, C, D (CS6), estA3/4 (STh) | eatA_1, eatA_2 | – | LR883008 |
pAvM_E1779_20 | 88 759 | 51.8 | FII | hok-sok (antisense RNA-regulated system), psiAB, tra genes, repAB, stbAB | eltAB15 (LTh) | – | – | LR883009 | |
pAvM_E1779_21 | 82 464 | 51.0 | FIIY | repA (2), repB, tra genes, psiAB, parB, sopAB | – | – | – | LR883010 | |
pAvM_E1779_22 | 61 528 | 50.5 | FII | repAB, tra genes | – | – | | LR883011 | |
L6 E562 CFA/I + CS21 STh |
pAvM_E562_23 | 109 853 | 50.5 | I1 (+ FII) | parB, psiAB, stbAB, pil genes, tra genes | – | eatA, etpBAC | – | LR883001 |
pAvM_E562_24 | 86 655 | 48.6 | FII + FIB | psiAB, repA (2), repB, stbAB, tra genes | lngX1, R, S, T, X2, A-J, P (CS21) | – | – | LR883002 | |
pAvM_E562_25* | 81 468 | 46.7 | FII | psiAB (truncated psiA) relE/B (toxin-antitoxin system), repAB, stbAB, sopAB | cfaA, B, C, D (CFA/I), estA2 (STh) | – | – | LR883003 | |
pAvM_E562_26 | 88 318 | 52.9 | B/O/K/Z | pil genes, pndAC (antisense RNA-regulated system) psiAB, relE, repA, tra genes | – | – | – | LR883004 | |
pAvM_E562_27* | 83 375 | 40.0 | FII | hok-sok (antisense RNA-regulated system), pemI/pemK (TA system), psiAB, parB, repAB, tra genes | – | – | blaTEM-1b, tetAR, merRTPCADE (Tn21) | LR883005 | |
L7 E1373 CS6 STp |
pAvM_E1373_28 | 146 433 | 46.1 | FII + FIB | parB, psiAB, relE, repA (2), repB | cssA, B, C, D (CS6), estA5 (STp), CS8-like gene cluster, fae-related genes | – | – | LR882991 |
pAvM_E1373_29 | 109 318 | 46.4 | FIB | parB, parB-like, repA | – | – | – | LR882992 |
Comparative genomics of the chromosome
The chromosomes of the reference strains were aligned and compared using progressiveMauve (v2.4.0, URL: http://darlinglab.org/mauve/mauve.html)29, and the overall structure is conserved across all eight chromosomes (Figure S1). In total, 8348 chromosomal genes were identified in the eight ETEC strains, of which 3179 were considered part of the core genome shared by all eight reference strains. The majority of human commensal Escherichia coli (E. coli) strains belong to phylogenetic group A30,31. However, ETEC strains fall into multiple phylogenetic groups (A, B1, B2, D, E, F and clade I)18, with the majority found in groups A and B1. The phylogenetic groups of the eight ETEC reference strains have previously been determined using the triplex-PCR scheme32. The ETEC references were re-analysed using ClermonTyping33, which showed that strain E1373 belongs to phylogenetic group E while the other reference isolates belong to groups A and B1 (Table 1).
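The distinction between the total gene set and the core genome can be illustrated with simple set operations. The sketch below assumes that genes have already been clustered into orthologous groups by a pan-genome tool, so that each strain is represented by a set of group identifiers; the strain and gene names are hypothetical placeholders, and the function is not the method used in the study.

```python
def core_and_pan(gene_sets: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (core, pan) gene sets for a collection of strains.

    core: genes present in every strain (intersection of all sets)
    pan:  genes present in at least one strain (union of all sets)
    """
    if not gene_sets:
        return set(), set()
    sets = list(gene_sets.values())
    core = set.intersection(*sets)
    pan = set().union(*sets)
    return core, pan


# Hypothetical orthologue groups per strain, for illustration only.
toy = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneC", "geneE"},
}
core, pan = core_and_pan(toy)
print(sorted(core), len(pan))
```

In these terms, the 3179 core genes reported above correspond to the intersection over the eight chromosomes and the 8348 genes to their union.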
Plasmids
The plasmids of each isolate were annotated using Prokka, followed by manual curation of the annotations, including genes that are part of the conjugation machinery and known plasmid stability genes. Virulence factors (including CFs, toxins, EtpBAC and EatA), putative virulence factors (e.g. CexE), complete and partial insertion elements, and prophages were manually annotated, and antibiotic resistance determinants were annotated with the Comprehensive Antibiotic Resistance Database (CARD)34. The plasmids were designated pAvM_strainID_integer, e.g. pAvM_E925_4 (Additional file 2). Numbering of the plasmids reported in this study starts at 4 because three earlier plasmids, E873p1-3, from a different project have already been deposited in GenBank8.
Plasmids were typed by analysing the presence and variation of specific replication genes to assign the plasmids to incompatibility (Inc) groups. The Inc groups of the ETEC reference plasmids were first determined using PlasmidFinder and further classified into subtypes using pMLST35. The replicons identified are IncFII, IncFIIA, IncFIIS, IncFIB, IncFIC, IncI1 and IncY. Plasmids with replicon IncY, IncFIIY or IncB/O/K/Z mainly harboured plasmid-associated genes, such as stability and transfer genes. Importantly, replicons FII, FIB and I1 are strongly associated with virulence genes, as the genes encoding all CFs, toxins and the virulence factors EatA and EtpBAC are present on plasmids with these replicons. The majority of the ETEC plasmids analysed here (17/29) belong to IncFII, six of which have an additional IncFIB replicon. In six of the ETEC reference strains, two or three IncFII plasmids are present; for example, in strain E925, the plasmids pAvM_E925_4 and pAvM_E925_7 both belong to IncFII. However, these plasmids were further subtyped to FII-111 and FII-15, respectively (Table 2 and Additional file 3), which explains their compatibility.
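The pAvM_strainID_integer designation is regular enough to be parsed and validated automatically, which helps when cross-referencing plasmid names between tables and text. The sketch below assumes that strain identifiers always take the form "E" followed by digits, as in Table 1; the pattern and function name are illustrative and not part of the study.

```python
import re

# Assumed pattern: "pAvM_", a strain ID of the form E<digits>, "_", and an integer.
PLASMID_NAME = re.compile(r"^pAvM_(?P<strain>E\d+)_(?P<index>\d+)$")


def parse_plasmid_name(name: str) -> tuple[str, int]:
    """Split a plasmid designation into (strain ID, plasmid number)."""
    match = PLASMID_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid pAvM plasmid designation: {name!r}")
    return match.group("strain"), int(match.group("index"))


print(parse_plasmid_name("pAvM_E925_4"))
```

A name that omits the strain prefix or uses different capitalisation is rejected with a `ValueError`, which makes inconsistent spellings easy to spot.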
Virulence factors
The CFs expressed by the selected reference strains are CFA/I, CS1-CS3, CS5-CS7 and CS21. Four of the strains (E925, E1649, E36 and E1779) express both LT and ST, two strains (E2980 and E1441) express LT only, E562 expresses STh only and E1373 expresses STp only (Table 1). A plasmid can harbour multiple virulence genes, usually a CF locus and genes encoding one or two toxins. Interestingly, in the ETEC reference strains described here, multiple CF loci are usually not carried on the same plasmid but on separate plasmids. An exception is strain E1779, in which the CS5 and CS6 loci are located on the same plasmid (pAvM_E1779_19). In both E925 (L1) and E1649 (L2), the genes encoding CS3 (cstABGH), ST (estA) and LT (eltAB) are located on the same plasmid; both plasmids carry the FII replicon and are of roughly the same size (Table 2). Blastn comparison between these plasmids and additional plasmids that harbour the same virulence genes shows that they are highly conserved (Fig. 1). These results are consistent with the close genetic relationship and common ancestry of lineage 1 (L1) and lineage 2 (L2)18.
Besides CFs and toxins, additional virulence factors were identified in the majority of the strains (Table 2), with eatA and etpBAC being the most commonly found.
EatA is an immunogenic mucinase that contributes to virulence by degrading MUC2, the major protein component of mucus in the small intestine37,38. The etpBAC genes encode an adhesin located on the tip of the flagella that mediates adherence to host cells39,40. Four reference strains (E925, E1649, E36 and E562) harbour both eatA and etpBAC. In three strains, eatA and/or etpBAC are located on the same plasmid with an FII or FII + FIB replicon along with additional ETEC virulence genes, except in E562 and E2980, where eatA and etpBAC are located on an I1 + FII (pAvM_E562_23) and an I1 (pAvM_E2980_16) plasmid, respectively, which mainly contain plasmid-associated genes, including genes encoding the pil operon and tra operon (pAvM_E562_23). Furthermore, a less explored putative virulence factor is CexE, an extracytoplasmic protein that depends on the expression of the CFA/I regulator cfaD41 and was first identified in H10407 (ref. 42). Corroborating earlier findings, the CFA/I positive E36 (L3) and E562 (L6) isolates harbour cexE (pAvM_E36_12 and pAvM_E562_25). In addition, cexE is present in pAvM_E925_6, pAvM_E1779_19, pAvM_E2980_14, pAvM_E1441_18 and pAvM_E1373_28. CexE has previously also been identified in several CS5 + CS6 positive ETEC and shown to be upregulated in the presence of bile and sodium glycocholate-hydrate43. Bile is known to be involved in the regulation of several ETEC CFs44,45. The location of cexE seems to be conserved across specific strains. In pAvM_E36_12, pAvM_E1441_18, pAvM_E1779_19 and pAvM_E562_25, cexE is located upstream of the aatPABC locus, whereas in pAvM_E925_6 and pAvM_E2980_14, cexE is located downstream of rob (an AraC family transcriptional regulator) in the opposite direction. Plasmid pAvM_E925_4 harbours the aatPABC locus; however, cexE is located on a different plasmid (pAvM_E925_6) in this strain.
Comparison of plasmids with the same virulence profile
ETEC isolates within a lineage share the same virulence profile, specifically the same CF profile (Figures S2–S3). We verified by phylogenetic analyses that our selected isolates grouped within previously described lineages with confirmed virulence profiles (Figures S2–S3). Blastn searches were performed with each of the CF-positive plasmids from each reference genome, and the best hit(s) were used for subsequent analysis (Fig. 1). Most of the plasmids identified as related to the ETEC reference plasmids were not annotated; hence, when needed, these were annotated with Prokka, giving the annotation of the corresponding ETEC reference plasmid high priority. We show that plasmids with the same CF and toxin profile from the same lineage are often conserved (Fig. 1). For example, the two plasmids encoding CS3 (pAvM_E925_4 and pAvM_E1649_8) are highly similar to several CS3-harbouring plasmids from O6:H16 strains collected from various geographical locations between 1975 and 2014, including E. coli O6:H16 strain M9682-C1 plasmid unnamed2 (CP024277.1) and E. coli strain O6:H16 F5656C1 plasmid unnamed2 (CP024262.1), PacBio sequenced by Smith et al.20 (Fig. 1a). Furthermore, high coverage and similarity were found between the plasmids of isolate E1441 (L4) and the PacBio-sequenced plasmids of ETEC isolates ATCC 43886/E2539C1 and 2014EL-1346-6 (ref. 20). These isolates were collected in the seventies46 and 2014 (from a CDC collection), respectively, and assigned as O25:H16, which is the O group determined for E1441 in silico (Fig. 1e). Plasmids of E2980 (LT + CS7, L3) were validated by the PacBio-sequenced plasmids of ETEC isolate E2264 (Fig. 1d). Similarly, two plasmids of E1779 (LT, STh + CS5 + CS6, L5) were identified in E2265 (LT, STh + CS5 + CS6)28,43, although E1779 harboured two additional plasmids. Several additional L5 ETEC genomes have been sequenced within the GEMS study47, and high plasmid similarity and conservation in CS5 + CS6 positive L5 isolates was evident (Fig. 1f).
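Selecting the best hit per query from Blastn output can be sketched as follows. The example assumes the standard 12-column tabular output (`-outfmt 6`) and assumes that "best" means highest bit score; the criterion actually applied in the study is not specified, and the query and subject identifiers and all values below are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text: str) -> dict[str, dict]:
    """Return the highest-scoring alignment row for each query sequence."""
    best: dict[str, dict] = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text.strip()), fieldnames=FIELDS, delimiter="\t"
    )
    for row in reader:
        # Convert the numeric fields used for ranking and reporting.
        row["pident"] = float(row["pident"])
        row["bitscore"] = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or row["bitscore"] > best[query]["bitscore"]:
            best[query] = row
    return best


# Hypothetical alignments, for illustration only.
example = (
    "query_plasmid\tsubject_1\t98.5\t5000\t75\t0\t1\t5000\t1\t5000\t0.0\t9000\n"
    "query_plasmid\tsubject_2\t99.1\t8000\t72\t0\t1\t8000\t1\t8000\t0.0\t14500\n"
)
for query, hit in best_hits(example).items():
    print(query, hit["sseqid"], hit["pident"], hit["bitscore"])
```

For whole-plasmid comparisons, a single alignment row rarely covers the full query, so ranking by summed coverage across all alignments to a subject may be a more appropriate criterion than a single bit score.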
Overall the results show that ETEC plasmids are specific to lineages circulating worldwide and conserved over time (Fig. 1, Figures S2–S3, and Figures S4–S11 for more extensive plasmid annotation). Thus, the plasmids of major ETEC lineages must confer evolutionary advantages to their host genomes since they are seldom lost.
Antibiotic resistance
E. coli can become resistant to antibiotics both via the presence of antibiotic resistance genes and via the acquisition of adaptive and mutational changes in genes encoding efflux pumps and porins, which allow the bacterium to pump out the antibiotic molecules effectively48,49.
Antibiotic resistance genomic markers, both chromosomal and plasmid-borne, were identified using the CARD database34 (Table 2, Figures S12 and S13 and Additional file 2). Similar to other studies, IncFII and B/O/K/Z plasmids were found to harbour genes conferring antibiotic resistance50. Furthermore, the phenotypic antibiotic resistance profile was determined using clinical MIC breakpoints based on EUCAST (The European Committee on Antimicrobial Susceptibility Testing)51 (Table S2). Phenotypic antibiotic resistance profiles (Table S2) were mainly supported by the findings of antibiotic resistance genes, efflux pumps and porins (Figures S4 and S5 and Table S3), although some differences were found. All ETEC reference strains are phenotypically resistant to at least two of the 14 antibiotics tested (Table S2). Resistance to penicillins, norfloxacin (Nor) and chloramphenicol (Cm) is most common among these strains. Two of the strains, E1441 and E2980, harbour more than four antibiotic resistance genes as well as multiple efflux systems and porins (Figure S12, Figure S13 and Table S3). The plasmid pAvM_E1441_17 carries aadA1-like, dfrA15, sul1 and tetA(A) resistance genes (Table 2); the first three are located in a class 1 integron that confers resistance to streptomycin, trimethoprim and sulphonamide (sulphamethoxazole). The gene tetA(A) is part of a truncated Tn1721 transposon52. The E1441 strain was verified as resistant to tetracycline (Tet) and sulphamethoxazole-trimethoprim (Sxt), while streptomycin was not tested. A mer operon derived from Tn21 is also present in the resistance region of pAvM_E1441_17 (Table 2), indicating that the plasmid would likely also confer tolerance to mercury, although this was not confirmed. Interestingly, this multi-replicon (FII and FIB) plasmid also harbours the lng locus encoding CS21, one of the most prevalent ETEC CFs.
In isolate E2980, virulence plasmid pAvM_E2980_15 harboured multiple resistance genes in the same region (blaTEM-1b, strA, strB and sul2), conferring resistance to ampicillin, streptomycin and sulphonamides. E2980 was found to be resistant to ampicillin (Amp) and oxacillin (Oxa), which can be broken down by the beta-lactamase BlaTEM-1b (Table 2, Tables S2 and S3). E562 harbours three antibiotic resistance genes: ampC located in the chromosome, and the tet(A) and blaTEM-1b genes on an FII plasmid (pAvM_E562_27). The mer operon derived from Tn21 is also present in the region (Table 2 and Table S3). The phenotypic resistance profile of E562 matches the genomic profile, with resistance to tetracycline (Tet), ampicillin (Amp), amoxicillin-clavulanic acid (Amc) and oxacillin (Oxa) (Table S2). The plasmid pAvM_E36_13 contains a complete copy of Tn10, which encodes the tet(B) tetracycline resistance module. Although the pAvM_E1373_29 phage-like plasmid is cryptic, related plasmids such as the pHCM2-family of phage-like plasmids53 (described below) can harbour resistance genes such as blaCTX-M-14 (ref. 54) and blaCTX-M-15 (refs. 55,56).
Phenotypic intermediate resistance to ampicillin was found in E36 and E1779, encoded by the chromosomal gene ampC. Higher MIC values against ampicillin were found in the E2980 and E562 strains carrying blaTEM genes. Phenotypic resistance to ceftazidime (Caz) and ceftriaxone (Cro) was not found in the isolates, which is consistent with the absence of extended-spectrum beta-lactamase (ESBL) resistance genes in the sequence data.
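Interpreting a measured MIC against clinical breakpoints follows a simple rule, sketched below. EUCAST breakpoints are expressed as "S ≤ x" and "R > y" in mg/L, with values in between falling into the intermediate category. The drug names and breakpoint values in this example are hypothetical placeholders and must not be read as EUCAST values; the function is an illustration, not the procedure used to produce Table S2.

```python
def classify_mic(mic_mg_l: float, s_breakpoint: float, r_breakpoint: float) -> str:
    """Classify an MIC as 'S', 'I' or 'R' given 'S <= x' and 'R > y' breakpoints."""
    if s_breakpoint > r_breakpoint:
        raise ValueError("susceptible breakpoint must not exceed resistant breakpoint")
    if mic_mg_l <= s_breakpoint:
        return "S"
    if mic_mg_l > r_breakpoint:
        return "R"
    return "I"


# Hypothetical breakpoints (S <=, R >) in mg/L; placeholders, not EUCAST values.
breakpoints = {"drug_x": (8.0, 8.0), "drug_y": (0.25, 0.5)}

# Hypothetical MIC measurements for one isolate.
measurements = {"drug_x": 32.0, "drug_y": 0.5}

for drug, mic in measurements.items():
    s_bp, r_bp = breakpoints[drug]
    print(drug, mic, classify_mic(mic, s_bp, r_bp))
```

When the two breakpoints coincide, as for `drug_x` here, no intermediate category exists and every MIC is classified as either susceptible or resistant.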
Resistance to chloramphenicol (Cm) was found in five isolates, but none of the resistant isolates contained known resistance genes suggesting that chromosomal mutations or presence of efflux pumps may account for this reduced susceptibility.
The ETEC reference strains contain several efflux systems which could explain why the genotypic and phenotypic antibiotic resistance profile did not match for all antibiotics. All of the isolates harbour multiple efflux pumps located on the chromosome and plasmids (Table S3 and Figure S12). In E925, a non-synonymous mutation in acrF was identified (G1979A) resulting in a substitution from arginine to glutamine (A360Q). The effect on the expression and/or function of the AcrEF efflux pump was not verified.
Phenotypic resistance to norfloxacin (Nor) was found in six of the isolates. The isolates were analysed for chromosomal mutations likely to confer quinolone resistance using ResFinder, but a gyrA mutation (S83A) was found in only one strain, E2980; this mutation may confer resistance to nalidixic acid, norfloxacin and ciprofloxacin. However, E2980 was sensitive to nalidixic acid. Both mutations that alter the target (gyrA and parC) and the presence of efflux pumps can confer resistance to fluoroquinolones. The majority of the isolates are moderately resistant to the quinolones norfloxacin and nalidixic acid, most likely owing to the presence of two efflux pumps, AcrAB-R and AcrEF-R, since only one gyrA mutation was identified (in isolate E2980) and at least two mutations are usually needed to confer augmented resistance57.
Identification of phage-like plasmids in ETEC
Two of the ETEC reference strains (E1649 and E1373) harboured phage-like plasmids (pAvM_E1649_9 and pAvM_E1373_29), which encode proteins involved in DNA metabolism and DNA biosynthesis as well as structural bacteriophage proteins (capsid, tail, etc.). Both pAvM_E1649_9 and pAvM_E1373_29 contain genes associated with plasmid replication, division and maintenance (i.e. repA and parAB). Phage-like plasmids are found in various bacterial species, such as E. coli, Klebsiella pneumoniae, Yersinia pestis, Salmonella enterica serovar Typhi, Salmonella enterica serovar Typhimurium, Salmonella enterica serovar Derby and Acinetobacter baumannii58. The plasmid pAvM_E1649_9 belongs to the P1 phage-like plasmid family (Fig. 2a and Figure S14a), while pAvM_E1373_29 belongs to the pHCM2-family (Fig. 2b and Figure S14b), which can be traced back to a likely phage origin similar to the Salmonella phage SSU5 (ref. 53). Both phage-plasmids thus contain replication and/or partition genes of plasmid origin and a complete set of genes that are phage related in function and properties (Fig. 2 and Figure S14). Notably, phage-like plasmid pAvM_E1373_29 falls within the E. coli lineage of pHCM2 phage-like plasmids rather than among those found in Salmonella species. This indicates that phage-like plasmids have diversified within the bacterial species from which they were isolated.
Blastn searches confirmed high similarity (at least 80% at the DNA level across much of the sequence) of pAvM_E1373_29 to several phage-like plasmids found in E. coli, including ETEC O169:H41 isolate F8111-1SC3 (refs. 20,59), several blaCTX-M-15 positive phage-like plasmids (pANCO1 and pANCO2 (ref. 56), and PV234a), as well as a plasmid found in E. coli ST648 from wastewater and ST131 isolate SC367ECC (ref. 60). The P1 phage-like plasmid pAvM_E1649_9 is most similar to p1107-99 K, pEC2_5 isolated from human urine and p2448-3 from a UPEC ST131 isolate isolated from blood. The similarity is most pronounced at the amino acid level. Conservation and synteny are evident when pAvM_E1649_9 is compared to P1 phage.
Prophages present and their cargo genes
Prophages may insert into chromosomes and bring along genes required for lysogeny and lytic cycles, as well as cargo genes that are often picked up when DNA is compacted into the capsid. Cargo genes can significantly benefit the host bacterium by providing additional elements for defence against phage, immune evasion and environmental survival. PHASTER analyses identified prophages in the chromosomes of all ETEC reference isolates and in some of the plasmids (Table S4). Putative tellurite resistance operons in isolates E925, E36, E2980 and E1373 were all located in prophages. In addition, eatA (in E925 and E1649) and estA (STh) genes (E36) were prophage cargo genes.
Many prophage cargo genes identified in this study have properties related to inhibition of cell division. Among these are a variety of kil genes, which can enhance host bacterial survival in the presence of some antibiotics61. Some genes that are core entities within many prophages, such as zapA (from E1779_Pph_6) and dicB and dicC (found in phage E1779_Pph_7), have similar effects, as they can inhibit cell division in the presence of antibiotics, which raises the broader question of how they benefit the host bacterium.
A different gene of interest is the yfdR gene identified in E1779_Pph_7 (gene E1779_04412). YfdR curtails cellular division by inhibiting DNA replication under stress conditions encountered by the bacterial cell. Similarly, the iraM gene located in phage E1441_Pph_2 plays a role in RpoS stability.
OmpX homologs were found in numerous phages in this study. They are trans-membrane located and play a role in virulence as well as antibiotic resistance62. PerC is often associated with EPEC plasmids, where it seems to have a regulatory role for the attaching and effacing gene, eaeA63. The presence of a protein (PerC-family activator) containing the same PFAM domain (PF06069) as PerC in EPEC as cargo within an ETEC strain phage, E1779_Pph_7 located on the chromosome, is intriguing. Its ability to regulate other virulence genes is yet to be determined. Within the same phage, a gntR-like regulatory gene was identified. This gene plays a role in gluconate utilisation and induction of the Entner-Doudoroff pathway64.
Discussion
ETEC strains have previously been shown to fall into globally spread, genetically conserved lineages which encompass strains with specific virulence factor profiles18. The currently widely used ETEC reference strains H10407 (CFA/I) and E24377A (CS1 + CS3) are highly divergent from other strains with the same virulence profile sequenced more recently18, which highlights the need for relevant and representative ETEC reference strains and genomes. The long- and short-read sequenced strains presented here comprise complete reference genomes with separate chromosomal and plasmid sequences that allow more detailed studies of ETEC and E. coli phylogeny. The reference strains are representative isolates of their respective lineages and cluster phylogenetically together with different ETEC isolates sequenced by several other groups (Figure S2).
Previous studies confirmed that ETEC belongs to lineages that have spread globally. These analyses were mainly dependent on the shared core genome of chromosomal genes, while conservation of plasmids was indicated by the association between the plasmid-borne toxins and CFs and lineage18. Analysis of the plasmids sequenced in the present study showed that the conservation within ETEC lineages also includes plasmids. The role of toxin-antitoxin (TA) systems in the maintenance of these plasmids (or their presence in the chromosome) has not been considered here in detail; however, multiple TA systems were identified across the ETEC plasmids presented (Additional file 2), and their potential involvement will be revisited in a further paper.
Blast analyses confirm that the plasmids identified in this study are often highly homologous to other plasmids present in GenBank. For instance, the 94.8 kb plasmid pAvM_E1441_18 was 98% identical to two plasmids of 96 kb and 82 kb belonging to ETEC O25:H16 isolates ATCC 43886/E2539C1 and 2014EL-1346-6, PacBio sequenced by Smith et al.20 (Fig. 1e and Figure S6). Plasmid pAvM_E1441_18 is the major virulence plasmid of this lineage, carrying genes encoding LT and CS6.
The larger plasmid in E1441 (pAvM_E1441_17) carries both the genes for the ETEC CF CS21 and antibiotic resistance determinants. Furthermore, a complete conjugation machinery was present, suggesting that this is most likely a self-transmissible plasmid, though this was not confirmed. Movement of such a plasmid would result in the spread of ETEC virulence genes and AMR determinants.
Interestingly, Wachsmuth et al.46 analysed transfer frequencies in ETEC O25:H16 isolates (the same serogroup was identified in E1441) and found evidence that resistance to tetracycline and sulfathiazole was transferred but not the genes encoding LT46. The same study found evidence of two large plasmids of similar size46, corroborating our findings of two plasmids of similar size in E1441: one with eltAB and cssABCD but without the tra operon (pAvM_E1441_18), and the other a putatively mobile plasmid (pAvM_E1441_17) carrying the sul1 and tet(A) genes as well as the lng operon encoding CF CS21. Since ATCC 43886/E2539C1, E1441 and 2014EL-1346-6 were isolated in the 1970s, 1997 and 2014, respectively, our findings indicate that E1441 represents an ETEC lineage with stable plasmid content and a putative ability to transfer antibiotic resistance and the CS21 operon by transfer of one of the plasmids. Furthermore, pAvM_E1441_17 is a multi-replicon plasmid. Carrying multiple replicons has been described as a way for plasmids to broaden their host range, i.e. the possibility of being transferred between bacteria of different phylogenetic groups65,66. Whether this plasmid type is found in other E. coli remains to be investigated, but the finding that the L4 lineage retains both plasmids in isolates collected over time and worldwide indicates a strong selective force to keep the extra-chromosomal contents of both plasmids.
The ETEC O169:H41 isolate F8111-1SC3 plasmid "unnamed 2"20,59 is highly similar to pAvM_E1373_28 (Fig. 1h and Figure S9). The F8111-1SC3 isolate is part of a CDC collection of ETEC isolates from cruise ship outbreaks and diarrheal cases in the US between 1996 and 2003. The antibiotic resistance profiles of these isolates were determined59, and most isolates of O group 169 were tetracycline resistant, consistent with the finding of the tet gene in E1373, which was isolated in Indonesia in 1996. ETEC O169:H41 isolates expressing STp and CS6 are repeatedly reported to cause diarrhoea, particularly in Latin America47,67–69. Among the cruise ship isolates is the sequenced and characterised virulence plasmid pEntYN10 encoding STp and CS6, described as unstable and easily lost in vitro67,70. The E1373 plasmid pAvM_E1373_28 is highly homologous to pEntYN10 (Fig. 1h and Figure S9), and the virulence profile of ETEC O169:H41 is conserved in isolates collected globally. Hence, the reported instability of the plasmid is incongruent with current data indicating that plasmids are stable within this lineage and serotype.
Interestingly, two distinctive extra-chromosomal elements that are highly similar to the P1 and SSU5 phages were identified among the 8 ETEC reference strains sequenced (Fig. 2, Figure S14 and Table S4). The SSU5-like element carries several genes that allow it to function as a plasmid and belongs to the pHCM2-like family of phage-plasmids (Fig. 2b)53. These plasmids are devoid of virulence factors, transposons and antibiotic resistance markers, but they contain a significant number of DNA metabolism and biosynthesis genes, and they may contain bacteriophage inhibitory genes that have not yet been identified. Interestingly, several SSU5 phage-like plasmids have been shown to carry the ESBL gene blaCTX-M15 in extra-intestinal pathogenic E. coli isolates55. ESBL resistance seems to be absent or low in ETEC, and the SSU5 phage-like plasmid pAvM_E1373_29 does not contain antibiotic resistance genes. A recent study investigating the distribution of phage-plasmids shows that the phage homologs tend to be more conserved and the plasmid homologs more variable71. This is also seen in the phage-plasmids identified here, whose variable content includes, e.g., metabolism and biosynthesis genes that could be advantageous to the host cell.
To summarise, we provide fully assembled chromosomes and plasmids with manually curated annotations that will serve as new ETEC reference genomes. The in-depth analysis of gene content, synteny and correct annotations of plasmids will also help to elucidate other plasmids with and without virulence factors in related bacterial species. The ETEC reference genomes compared to other long-read sequenced ETEC genomes confirm that the major ETEC lineages harbour conserved plasmids that have been associated with their respective background genomes for decades. This supports the notion that the plasmids and chromosomes of ETEC are both crucial for ETEC virulence and success as pathogens.
Methods
Selection of strains
Initially, one to two ETEC strains with the CF profile specific to each lineage (L1–L7) were chosen from the large ETEC strain collection at the University of Gothenburg18 for PacBio sequencing. The seven lineages encompass clinically relevant ETEC strains expressing the most common virulence factor profiles, i.e. toxin and CF profiles18. The strains were selected based on the location and year of isolation to represent strains isolated from patients with diarrhoea from diverse geographical locations and at different time-points. After the genomes had been sequenced, assembled, circularised and annotated, a second selection was made for manual curation of the genomes. This selection was based on the quality of the genome assembly and the circularisation. The whole genomes of the ETEC reference strains were compared with one or two other long-read sequenced ETEC strains belonging to the same lineage using progressiveMAUVE (v2.4.0, URL: http://darlinglab.org/mauve/mauve.html)29, which showed that the strains are colinear (Figure S15). One representative ETEC genome from each lineage was annotated, with emphasis on the plasmids. The physical ETEC reference strains are available upon request.
Phenotypic toxin and CF analyses
ETEC isolates were identified by culture on MacConkey agar followed by an analysis of LT and ST toxin expression using GM1 ELISAs45. The expression of the different CFs was confirmed by dot-blot analysis45. Isolates had been kept in glycerol stocks at −70 °C, and each strain had been passaged as few times as possible.
Antibiotic susceptibility testing
All ETEC isolates were tested against 14 antimicrobial agents, and their minimum inhibitory concentrations (MICs) were determined by broth microdilution using EUCAST methodology51. The antimicrobial agents were: ampicillin, amoxicillin-clavulanic acid, oxacillin, ceftazidime, ceftriaxone, doxycycline, tetracycline, nalidixic acid, norfloxacin, azithromycin, erythromycin, chloramphenicol, nitrofurantoin and sulfamethoxazole-trimethoprim. All antibiotics were purchased from Sigma-Aldrich. E. coli ATCC 25922 was used as quality control. The MIC was recorded visually as the lowest concentration of antibiotic that completely inhibited growth.
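The MIC reading rule above can be illustrated with a minimal Python sketch. It is not part of the study's workflow. The function name, the input layout and the example values are hypothetical. The sketch also assumes that an inhibited well only counts if all higher concentrations are inhibited as well.

```python
def read_mic(wells):
    """Return the MIC from a dilution series, or None if growth was never fully inhibited.

    wells maps each tested concentration (mg/L) to True when visible growth was seen.
    """
    mic = None
    # Walk from the highest concentration downwards and stop at the first well with growth.
    for concentration in sorted(wells, reverse=True):
        if wells[concentration]:
            break
        mic = concentration
    return mic


# Hypothetical two-fold dilution series for one isolate and one antibiotic.
example_plate = {0.25: True, 0.5: True, 1: True, 2: False, 4: False, 8: False}
print(read_mic(example_plate))
```

If growth is seen at every concentration, the function returns None, which corresponds to an MIC above the tested range.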
DNA extraction and sequencing
Strains from each lineage (L1–L7) were SMRT-sequenced on the PacBio RSII. A hybrid de novo assembly was performed combining the reads from both the SMRT-sequenced and Illumina sequenced strains.
For Single-Molecule Real-Time (SMRT) sequencing (Pacific Biosciences), long intact strands of DNA are required. The genomic DNA extraction was performed as follows. Isolates were cultured in CFA broth overnight at 37 °C, followed by cell suspension in TE buffer (10 mM Tris and 1 mM EDTA pH 8.0) with 25% sucrose (Sigma) and lysis using 10 mg/ml lysozyme (in 0.25 Tris pH 8.0) (Roche). Cell membranes were digested with Proteinase K (Roche) and Sarkosyl NL-30 (Sigma) in the presence of EDTA. RNase A (Roche) was added to remove RNA molecules. A phenol–chloroform extraction was performed using a mixture of Phenol:Chloroform:Isoamyl Alcohol (25:24:1) (Sigma) in phase lock tubes (5prime). To precipitate the DNA, 2.5 volumes of 99% ethanol and 0.1 volume of 3 M NaAc pH 5.2 were used, followed by re-hydration in 10 mM Tris pH 8.0. DNA concentration was measured using a NanoDrop spectrophotometer (NanoDrop). On average, 10 μg of DNA was used for PacBio sequencing. Libraries for SMRT sequencing were prepared according to the manufacturer's (Pacific Biosciences) protocol. The DNA was stored in E buffer and sequenced at the Wellcome Sanger Institute. Isolates were sequenced with a single SMRTcell using the P6-C4 chemistry, to a target coverage of 40–60X using the PacBio RSII sequencer.
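As a worked example of the precipitation step, the short Python sketch below converts a sample volume into the ethanol and NaAc volumes implied by the stated ratios. It assumes that both ratios refer to the volume of the aqueous DNA sample, which the text does not specify. The function name and the 400 μl sample are hypothetical.

```python
def precipitation_volumes(sample_ul, ethanol_ratio=2.5, naac_ratio=0.1):
    """Return the reagent volumes (in microlitres) for an ethanol precipitation.

    Both ratios are taken relative to the sample volume (an assumption of this sketch).
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {
        "ethanol_99pct_ul": ethanol_ratio * sample_ul,
        "naac_3M_ul": naac_ratio * sample_ul,
    }


# Hypothetical 400 microlitre aqueous phase recovered after phenol-chloroform extraction.
print(precipitation_volumes(400))
```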
Assembly
The resulting raw sequencing data from SMRT sequencing were de novo assembled using the PacBio SMRT analysis pipeline (https://github.com/PacificBiosciences/SMRT-Analysis) (v2.3.0) utilising the Hierarchical Genome Assembly Process (HGAP)72. For all samples, the unfinished assembly produced a single, non-circular, chromosome plus some small contigs, some of which were plasmids or unresolved assembly variants. Using Circlator73 (v1.1.0), small self-contained contigs in the unfinished assembly were identified and removed, with the remaining contigs circularised. Quiver72 was then used to correct errors in the circularised region by mapping corrected reads back to the circularised assembly. As the strains had also been short read sequenced, and this data is of higher base quality, the short reads from the Illumina sequencing were used in combination with the long reads using Unicycler74 to generate high-quality assemblies.
Fully circularised chromosomes and plasmids were achieved for the majority of the strains. Cross-validation of the assemblies was performed where two or three strains of a lineage were sequenced (Figure S15). A single assembly from each lineage was chosen to act as the representative reference genome, with priority given to assemblies with the most complete and circularised chromosome and plasmids. In total, across the 8 selected representative strains, one chromosome and 5 of the 29 plasmids could not be circularised (independent of the two strains that were sequenced initially). These are indicated in Table 2 and Table S1. Between two and five plasmids were identified in the eight strains. Shorter contigs that could not be assembled properly contained phage genes and are included in the genomes and annotated as prophages (Table S4). Socru was used to validate the chromosome assemblies; they all have a biologically valid orientation and order of rRNA operons, of type GS1.0, which is seen in most E. coli in the public domain75. A multiple alignment of the chromosomes (Figure S1) was generated using progressiveMauve29 and visualised using R (v4.0.2, 2020-06-22, URL: https://www.R-project.org/)76, specifically the R package genoplotR77.
Phylogenetic tree
The phylogenetic relationship between the ETEC reference genomes and other ETEC and E. coli commensals and pathotypes was investigated. The following collections were included: ETEC-36218, ECOR78 and the Horesh collection79, along with additional ETEC genomes from several studies20,24,26,27,47,80,81. The reads of identified ETEC genomes from other studies were downloaded from GenBank and assembled using Velvet. Long-read sequenced ETEC genomes were included in the tree and were not re-assembled. The phylogroup of the ETEC strains was determined using ClermonTyping33 (v20.03). The virulence profile of the ETEC strains was determined using ARIBA82 (v2.14.16) with default settings using the custom ETEC virulence database (https://github.com/avonm/ETEC_vir_db). A total of 1066 genomes were included in the phylogenetic tree. The alignment of core genes (n = 2895) identified by Roary83 (v3.12.0) was converted to a SNP-only alignment using snp-sites84. A phylogenetic tree was produced with IQ-TREE85 (v1.6.10) using a GTR gamma model (GTR+F+I) optimised using the built-in model test and visualised using R (v4.0.2, 2020-06-22, URL: https://www.R-project.org/)76, specifically using the R packages GGTREE (v2.4.1, URL: https://github.com/YuLab-SMU/ggtree)86 and GGPLOT2 (v3.3.2, URL: https://ggplot2.tidyverse.org)87.
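To illustrate what converting a core gene alignment to a SNP-only alignment involves, the following Python sketch keeps only the alignment columns in which more than one of the bases A, C, G and T occurs. It is a simplified stand-in for the idea behind snp-sites, not its implementation, and the real tool's handling of gaps and ambiguous bases may differ. The function name and the toy sequences are hypothetical.

```python
def snp_only_alignment(aligned):
    """Keep only variable columns of an alignment given as {name: sequence}."""
    names = list(aligned)
    seqs = [aligned[name].upper() for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")

    variable_columns = []
    for i in range(length):
        # Only unambiguous bases count towards variation; gaps and N are ignored.
        bases = {seq[i] for seq in seqs} & set("ACGT")
        if len(bases) > 1:
            variable_columns.append(i)

    return {
        name: "".join(seq[i] for i in variable_columns)
        for name, seq in zip(names, seqs)
    }


# Toy alignment of three hypothetical genomes.
toy = {"strain_a": "ACGTAC", "strain_b": "ACGTTC", "strain_c": "AAGT-C"}
print(snp_only_alignment(toy))
```

Dropping invariant columns shrinks the input to the tree-building step considerably, since most core genome positions are identical across isolates.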
Gene prediction, annotation and comparative analysis
The final assembly was annotated using Prokka88 (v1.14.6). The annotations of all plasmids generated by Prokka were manually checked using the genome viewer Artemis89 and Geneious (v11.1.5, URL: http://www.geneious.com) together with blastp. Annotations of known ETEC virulence genes (colonisation factors, toxins, eatA and etpBAC) were added after BLAST+90 analysis using the reference genes available in the ETEC virulence database (https://github.com/avonm/ETEC_vir_db), and their annotations were updated accordingly. The LT and ST alleles were determined according to Joffre et al. (https://github.com/avonm/ETEC_toxin_variants_db)15,17. Where required, PFAM domains were searched using jackhmmer to back up any protein identified using blastp (https://www.ebi.ac.uk/Tools/hmmer/search/jackhmmer). Blastn and tblastx were used for plasmid comparison, using either the NCBI website or the BLAST Ring Image Generator (BRIG)36 (v0.95, URL: http://brig.sourceforge.net/).
Incompatibility groups
Due to the discrepancy between databases, two approaches were used to determine the Inc groups of the 25 plasmids. PlasmidFinder was used with a minimum identity threshold of 95% and a minimum coverage of 60%. The plasmids were further characterised by pMLST35, except for IncY, which are a group of prophages that replicate in a similar manner to autonomous plasmids (Additional File 3). IncB/O/K/Z plasmids were further typed by blastn comparison to the reference B/O (M93062), K (M93063) and Z (M93064) replicons.
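The effect of the identity and coverage thresholds can be sketched in Python as a simple filter over replicon hits. This is an illustration only and does not reproduce PlasmidFinder's code or output format. The RepliconHit record, its field names and the example hits are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RepliconHit:
    plasmid: str      # plasmid sequence that was queried
    replicon: str     # replicon reported by the typing tool
    identity: float   # percent identity to the reference replicon
    coverage: float   # percent of the reference replicon covered


def passing_hits(hits, min_identity=95.0, min_coverage=60.0):
    """Return the hits that meet both thresholds stated in the text."""
    return [
        hit for hit in hits
        if hit.identity >= min_identity and hit.coverage >= min_coverage
    ]


# Hypothetical hits; names and values are invented for illustration.
hits = [
    RepliconHit("plasmid_A", "IncFII", 98.5, 100.0),
    RepliconHit("plasmid_A", "IncFIB", 93.0, 100.0),  # fails identity
    RepliconHit("plasmid_B", "IncI1", 99.0, 55.0),    # fails coverage
]
for hit in passing_hits(hits):
    print(hit.plasmid, hit.replicon)
```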
oriT prediction
The location of the oriT in the plasmids, if present, was predicted using oriTFinder91 with Blast E-value cut-off set to 0.01.
Genomic antibiotic resistance profiling
Antibiotic resistance genes located on the chromosome and plasmid(s), as well as efflux pumps and porins known to confer resistance to antibiotics, were identified by running ARIBA82 using the CARD database92 with the default settings (minimum 90% sequence identity and no length cut-off). ARIBA combines a mapping/alignment and targeted local assembly approach to identify AMR genes and variants efficiently and accurately from paired sequencing reads. The heatmaps were visualised using Phandango (v.1.3.0, URL: https://jameshadfield.github.io/phandango/#/)93 with colors and text modified in Adobe Illustrator 2019 (v23.1.1). The presence of chromosomal mutations in gyrA and parC was determined with ResFinder (v3.2) from the Center for Genomic Epidemiology94.
Virulence gene prediction
The ETEC assemblies from the ETEC-NCBI collection (Additional file 4) were screened using abricate95 with default settings against the ETEC virulence database (https://github.com/avonm/ETEC_vir_db) for virulence gene (including eatA and etpBAC) prediction. A subset of the isolates in the ETEC-NCBI dataset has previously been analysed for the presence of EatA, where samples with negative PCR but positive western blots were included as positive80. Here, only isolates harbouring the eatA and etpBAC genes are considered positive.
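The genotype-only calling rule can be written down as a short Python sketch. It reflects one reading of the rule, in which each locus is called from gene presence alone and etpBAC requires all three genes. The gene labels are assumed to match the database entries, and the function name and example isolate are hypothetical.

```python
# Assumed gene labels for the etpBAC locus; the database may name them differently.
ETP_GENES = {"etpB", "etpA", "etpC"}


def call_eata_etpbac(genes_detected):
    """Call eatA and etpBAC from the set of genes detected in one assembly."""
    genes = set(genes_detected)
    return {
        "eatA": "eatA" in genes,
        # The locus is called positive only if all three genes were detected.
        "etpBAC": ETP_GENES <= genes,
    }


# Hypothetical screening result for a single isolate.
print(call_eata_etpbac(["eltA", "eltB", "eatA", "etpB", "etpA"]))
```

Under this rule, an isolate with a positive western blot but no detectable gene would be called negative, which is the distinction the text draws from the earlier study.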
Prophage prediction
The complete FASTA sequence of each ETEC reference genome was searched for phage genes and prophages using PHASTER (phaster.ca)96. The identified intact prophages are listed in Table S4. All prophages contained cargo genes, but only recognisable genes are stated, not hypothetical ones. Additional questionable and incomplete prophages were identified but have not been included here. The prophages have been given a specific identifier name and are also annotated as a mobile_element in the submitted chromosome and/or plasmid(s) of each strain.
Insertion sequences
Insertion sequences in the plasmids as well as surrounding the CS2 loci located on the chromosome of E1649 were annotated using both Galileo AMR software97 and the ISFinder database98. Complete and partial IS elements were annotated (> 95% identity with hits in ISFinder) along with the present genes encoding transposases. Three new insertion sequences were detected in this analysis and were submitted to ISFinder as TnEc2, TnEc3 and TnEc4. Transposons and other mobile elements (integrons and group II introns) were also identified using Galileo AMR and blastn against public databases.
Acknowledgements
No acknowledgements to mention.
Author contributions
A.v.M. conceived and designed the experiments, performed the experiments, analysed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored the paper and approved the final draft. G.B. and A.v.M. annotated all IS elements, transposons as well as other mobile elements, contributed to the paper and approved the final draft. D.P. and A.v.M. annotated the identified prophages, contributed to the paper and approved the final draft. C.B. performed the in silico analysis of the genomic antibiotic resistance profiling, contributed to the paper and approved the final draft. E.J. performed the antibiotic resistance profiling, contributed to the paper and approved the final draft. A.J.P. assembled the genomes, contributed to the paper and approved the final draft. A.M.S. conceived and designed the experiments contributed to the paper and approved the final draft. G.D. conceived and designed the experiments and approved the final draft. Å.S. conceived and designed the experiments, analysed data, authored the paper and approved the final draft.
Funding
AvM, AMS and ÅS were supported by the Swedish Foundation for Strategic Research (Grant No. SB12-0072). AvM was also supported by The Swedish Research Council (Grant No. 2018-06828) and the Swedish Society for Medical Research (P18-0140). AJP was supported by the Biotechnology and Biological Sciences Research Council (BBSRC); this research was funded by the BBSRC Institute Strategic Programme Microbes in the Food Chain BB/R012504/1. GD was supported by the Wellcome Trust (Grant WT 098051).
Data availability
The datasets supporting the conclusions of this article are included within the article and its additional files. The sequencing data generated in this study have been submitted to EMBL (Additional file 4 and 5). The physical ETEC reference strains can be requested by contacting the corresponding author Astrid von Mentzer (avm@sanger.ac.uk or mentzerv@chalmers.se). The database used for annotating ETEC virulence factors, the ETEC virulence database, including the LT and ST alleles, can be found in the GitHub repositories: https://github.com/avonm/ETEC_vir_db and https://github.com/avonm/ETEC_toxin_variants_db. An interactive version of the core genome phylogeny of the 1,065 E. coli and ETEC isolates along with the ETEC reference strains (Figure S2) reported here is accessible at https://microreact.org/project/2ZZzaHzeXbMEw9U2MAk7pK?tt=cr. Requests for obtaining clinical isolates collected as part of this study should be addressed to the corresponding author. Exchange of clinical isolates should always be in agreement with the University of Gothenburg.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-021-88316-2.
References
- 1. Khalil IA, Troeger C, Blacker BF, Rao PC, Brown A, Atherly DE, et al. Morbidity and mortality due to shigella and enterotoxigenic Escherichia coli diarrhoea: The Global Burden of Disease Study 1990–2016. Lancet Infect. Dis. 2018;18:1229–1240. doi: 10.1016/S1473-3099(18)30475-4.
- 2. Baron, S., Evans, D. J. & Evans, D. G. Escherichia coli in Diarrheal Disease. 1996.
- 3. Svennerholm A-M, Lundgren A. Recent progress toward an enterotoxigenic Escherichia coli vaccine. Expert Rev. Vaccines. 2012;11:495–507. doi: 10.1586/erv.12.12.
- 4. Lundgren A, Bourgeois L, Carlin N, Clements J, Gustafsson B, Hartford M, et al. Safety and immunogenicity of an improved oral inactivated multivalent enterotoxigenic Escherichia coli (ETEC) vaccine administered alone and together with dmLT adjuvant in a double-blind, randomized, placebo-controlled Phase I study. Vaccine. 2014;32:7077–7084. doi: 10.1016/j.vaccine.2014.10.069.
- 5. Harro C, Bourgeois AL, Sack D, Walker R, DeNearing B, Brubaker J, et al. Live attenuated enterotoxigenic Escherichia coli (ETEC) vaccine with dmLT adjuvant protects human volunteers against virulent experimental ETEC challenge. Vaccine. 2019;37:1978–1986. doi: 10.1016/j.vaccine.2019.02.025.
- 6. O’Ryan M, Vidal R, del Canto F, Salazar JC, Montero D. Vaccines for viral and bacterial pathogens causing acute gastroenteritis: Part II: Vaccines for Shigella, Salmonella, enterotoxigenic E. coli (ETEC) enterohemorragic E. coli (EHEC) and Campylobacter jejuni. Hum. Vacc. Immunother. 2015;11:601–619. doi: 10.1080/21645515.2015.1011578.
- 7. Qadri F, Svennerholm A-M, Faruque AS, Sack RB. Enterotoxigenic Escherichia coli in developing countries: Epidemiology, microbiology, clinical features, treatment, and prevention. Clin. Microbiol. Rev. 2005;18:465–483. doi: 10.1128/CMR.18.3.465-483.2005.
- 8. von Mentzer A, Tobias J, Wiklund G, Nordqvist S, Aslett M, Dougan G, et al. Identification and characterization of the novel colonization factor CS30 based on whole genome sequencing in enterotoxigenic Escherichia coli (ETEC). Sci. Rep. 2017;7:465. doi: 10.1038/s41598-017-00508-x.
- 9. Gaastra W, Svennerholm A-M. Colonization factors of human enterotoxigenic Escherichia coli (ETEC). Trends Microbiol. 1996;4:444–452. doi: 10.1016/0966-842X(96)10068-8.
- 10. Nada RA, Shaheen HI, Khalil SB, Mansour A, El-Sayed N, Touni I, et al. Discovery and phylogenetic analysis of novel members of class b enterotoxigenic Escherichia coli adhesive fimbriae. J. Clin. Microbiol. 2011;49:1403–1410. doi: 10.1128/JCM.02006-10.
- 11. Cádiz L, Torres A, Valdés R, Vera G, Gutiérrez D, Levine MM, et al. Coli surface antigen 26 acts as an adherence determinant of enterotoxigenic Escherichia coli and is cross-recognized by anti-CS20 antibodies. Front. Microbiol. 2018;9:248. doi: 10.3389/fmicb.2018.02463.
- 12. Canto FD, O’Ryan M, Pardo M, Torres A, Gutiérrez D, Cádiz L, et al. Chaperone-usher pili loci of colonization factor-negative human enterotoxigenic Escherichia coli. Front. Cell. Infect. Microbiol. 2017;6:CD009029. doi: 10.3389/fcimb.2016.00200.
- 13. Grewal HM, Valvatne H, Bhan MK, van Dijk L, Gaastra W, Sommerfelt H. A new putative fimbrial colonization factor, CS19, of human enterotoxigenic Escherichia coli. Infect. Immun. 1997;65:507–513. doi: 10.1128/IAI.65.2.507-513.1997.
- 14. Pichel M, Binsztein N, Viboud G. CS22, a novel human enterotoxigenic Escherichia coli adhesin, is related to CS15. Infect. Immun. 2000;68:3280–3285. doi: 10.1128/IAI.68.6.3280-3285.2000.
- 15. Joffre E, von Mentzer A, Ghany MAE, Oezguen N, Savidge T, Dougan G, et al. Allele variants of enterotoxigenic Escherichia coli heat-labile toxin are globally transmitted and associated with colonization factors. J. Bacteriol. 2015;197:392–403. doi: 10.1128/JB.02050-14.
- 16. Bolin I, Wiklund G, Qadri F, Torres O, Bourgeois AL, Savarino S, et al. Enterotoxigenic Escherichia coli with STh and STp genotypes is associated with diarrhea both in children in areas of endemicity and in travelers. J. Clin. Microbiol. 2006;44:3872–3877. doi: 10.1128/JCM.00790-06.
- 17. Joffre E, von Mentzer A, Svennerholm A-M, Sjöling A. Identification of new heat-stable (STa) enterotoxin allele variants produced by human enterotoxigenic Escherichia coli (ETEC). Int. J. Med. Microbiol. 2016;306:586–594. doi: 10.1016/j.ijmm.2016.05.016.
- 18. von Mentzer A, Connor TR, Wieler LH, Semmler T, Iguchi A, Thomson NR, et al. Identification of enterotoxigenic Escherichia coli (ETEC) clades with long-term global distribution. Nat. Genet. 2014;46:1321–1326. doi: 10.1038/ng.3145.
- 19. Crossman LC, Chaudhuri RR, Beatson SA, Wells TJ, Desvaux M, Cunningham AF, et al. A commensal gone bad: complete genome sequence of the prototypical enterotoxigenic Escherichia coli strain H10407. J. Bacteriol. 2010;192:5822–5831. doi: 10.1128/JB.00710-10.
- 20. Smith P, Lindsey RL, Rowe LA, Batra D, Stripling D, Garcia-Toledo L, et al. High-quality whole-genome sequences for 21 enterotoxigenic Escherichia coli strains generated with PacBio sequencing. Genome Announc. 2018;6:6167. doi: 10.1128/genomeA.01311-17.
- 21. Rasko DA, Rosovitz MJ, Myers GS, Mongodin EF, Fricke WF, Gajer P, et al. The pangenome structure of Escherichia coli: Comparative genomic analysis of E. coli commensal and pathogenic isolates. J. Bacteriol. 2008;190:6881–6893. doi: 10.1128/JB.00619-08.
- 22. Besser J, Carleton HA, Gerner-Smidt P, Lindsey RL, Trees E. Next-generation sequencing technologies and their application to the study and control of bacterial infections. Clin. Microbiol. Infect. 2018;24:335–341. doi: 10.1016/j.cmi.2017.10.013.
- 23. Quainoo S, Coolen JPM, van Hijum SAFT, Huynen MA, Melchers WJG, van Schaik W, et al. Whole-genome sequencing of bacterial pathogens: The future of nosocomial outbreak analysis. Clin. Microbiol. Rev. 2017;30:1015–1063. doi: 10.1128/CMR.00016-17.
- 24. Sahl JW, Steinsland H, Redman JC, Angiuoli SV, Nataro JP, Sommerfelt H, et al. A comparative genomic analysis of diverse clonal types of enterotoxigenic Escherichia coli reveals pathovar-specific conservation. Infect. Immun. 2011;79:950–960. doi: 10.1128/IAI.00932-10.
- 25. Sahl JW, Rasko DA. Analysis of global transcriptional profiles of enterotoxigenic Escherichia coli isolate E24377A. Infect. Immun. 2012;80:1232–1242. doi: 10.1128/IAI.06138-11.
- 26. Sahl JW, Sistrunk JR, Fraser CM, Hine E, Baby N, Begum Y, et al. Examination of the enterotoxigenic Escherichia coli population structure during human infection. MBio. 2015;6:e00501. doi: 10.1128/mBio.00501-15.
- 27. Sahl JW, Sistrunk JR, Baby NI, Begum Y, Luo Q, Sheikh A, et al. Insights into enterotoxigenic Escherichia coli diversity in Bangladesh utilizing genomic epidemiology. Sci. Rep. 2017;7:3402. doi: 10.1038/s41598-017-03631-x.
- 28. Begum YA, Rydberg HA, Thorell K, Kwak Y-K, Sun L, Joffre E, et al. In situ analyses directly in diarrheal stool reveal large variations in bacterial load and active toxin expression of enterotoxigenic Escherichia coli and Vibrio cholerae. mSphere. 2018;3:e00517-17. doi: 10.1128/mSphere.00517-17.
- 29. Darling AE, Mau B, Perna NT. ProgressiveMauve: Multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE. 2010;5:e11147. doi: 10.1371/journal.pone.0011147.
- 30. Li B, Sun J, Han L, Huang X, Fu Q, Ni Y. Phylogenetic groups and pathogenicity island markers in fecal Escherichia coli isolates from asymptomatic humans in China. Appl. Environ. Microbiol. 2010;76:6698–6700. doi: 10.1128/AEM.00707-10.
- 31. Tenaillon O, Skurnik D, Picard B, Denamur E. The population genetics of commensal Escherichia coli. Nat. Rev. Microbiol. 2010;8:207–217. doi: 10.1038/nrmicro2298.
- 32. Clermont O, Bonacorsi S, Bingen E. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl. Environ. Microbiol. 2000;66:4555–4558. doi: 10.1128/AEM.66.10.4555-4558.2000.
- 33. Beghain J, Bridier-Nahmias A, Nagard HL, Denamur E, Clermont O. ClermonTyping: An easy-to-use and accurate in silico method for Escherichia genus strain phylotyping. Microb. Genom. 2018;4:690. doi: 10.1099/mgen.0.000192.
- 34. McArthur AG, Waglechner N, Nizam F, Yan A, Azad MA, Baylay AJ, et al. The comprehensive antibiotic resistance database. Antimicrob. Agents Chemother. 2013;57:3348–3357. doi: 10.1128/AAC.00419-13.
- 35. Carattoli A, Zankari E, García-Fernández A, Larsen MV, Lund O, Villa L, et al. PlasmidFinder and pMLST: In silico detection and typing of plasmids. Antimicrob. Agents Chemother. 2014;58:AAC.02412-14-3903. doi: 10.1128/AAC.02412-14.
- 36. Alikhan N-F, Petty NK, Zakour NLB, Beatson SA. BLAST Ring image generator (BRIG): Simple prokaryote genome comparisons. BMC Genomics. 2011;12:402. doi: 10.1186/1471-2164-12-402.
- 37. Kumar P, Luo Q, Vickers TJ, Sheikh A, Lewis WG, Fleckenstein JM. EatA, an immunogenic protective antigen of enterotoxigenic Escherichia coli, degrades intestinal mucin. Infect. Immun. 2014;82:500–508. doi: 10.1128/IAI.01078-13.
- 38. Patel SK, Dotson J, Allen KP, Fleckenstein JM. Identification and molecular characterization of EatA, an autotransporter protein of enterotoxigenic Escherichia coli. Infect. Immun. 2004;72:1786–1794. doi: 10.1128/IAI.72.3.1786-1794.2004.
- 39. Fleckenstein JM, Roy K, Fischer JF, Burkitt M. Identification of a two-partner secretion locus of enterotoxigenic Escherichia coli. Infect. Immun. 2006;74:2245–2258. doi: 10.1128/IAI.74.4.2245-2258.2006.
- 40. Roy K, Hilliard GM, Hamilton DJ, Luo J, Ostmann MM, Fleckenstein JM. Enterotoxigenic Escherichia coli EtpA mediates adhesion between flagella and host cells. Nature. 2009;457:594–598. doi: 10.1038/nature07568.
- 41.Hibberd ML, McConnell MM, Willshaw GA, Smith HR, Rowe B. Positive regulation of colonization factor antigen I (CFA/I) production by enterotoxigenic Escherichia coli producing the colonization factors CS5, CS6, CS7, CS17, PCFO9, PCFO159:H4 and PCFO166. J. Gen. Microbiol. 1991;137:1963–1970. doi: 10.1099/00221287-137-8-1963. [DOI] [PubMed] [Google Scholar]
- 42.Pilonieta MC, Bodero MD, Munson GP. CfaD-dependent expression of a novel extracytoplasmic protein from enterotoxigenic Escherichia coli. J. Bacteriol. 2007;189:5060–5067. doi: 10.1128/JB.00131-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Joffre E, Nicklasson M, Álvarez-Carretero S, Xiao X, Sun L, Nookaew I, et al. The bile salt glycocholate induces global changes in gene and protein expression and activates virulence in enterotoxigenic Escherichia coli. Sci. Rep. 2019;9:108. doi: 10.1038/s41598-018-36414-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Nicklasson M, Sjöling Å, von Mentzer A, Qadri F, Svennerholm A-M. Expression of colonization factor CS5 of enterotoxigenic Escherichia coli (ETEC) is enhanced in vivo and by the bile component Na glycocholate hydrate. PLoS ONE. 2012;7:e35827. doi: 10.1371/journal.pone.0035827. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Sjöling Å, Wiklund G, Savarino SJ, Cohen DI, Svennerholm A-M. Comparative analyses of phenotypic and genotypic methods for detection of enterotoxigenic Escherichia coli toxins and colonization factors. J. Clin. Microbiol. 2007;45:3295–3301. doi: 10.1128/JCM.00471-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Wachsmuth K, Wells J, Shipley P, Ryder R. Heat-labile enterotoxin production in isolates from a shipboard outbreak of human diarrheal illness. Infect. Immun. 1979;24:793–797. doi: 10.1128/IAI.24.3.793-797.1979. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Hazen TH, Nagaraj S, Sen S, Permala-Booth J, Canto FD, Vidal R, et al. Genome and functional characterization of colonization factor antigen I- and CS6-encoding heat-stable enterotoxin-only enterotoxigenic Escherichia coli reveals lineage and geographic variation. mSystems. 2019;4:209. doi: 10.1128/mSystems.00329-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Woodford N, Ellington MJ. The emergence of antibiotic resistance by mutation. Clin. Microbiol. Infect. 2007;13:5–18. doi: 10.1111/j.1469-0691.2006.01492.x. [DOI] [PubMed] [Google Scholar]
- 49.Blair JMA, Webber MA, Baylay AJ, Ogbolu DO, Piddock LJV. Molecular mechanisms of antibiotic resistance. Nat. Rev. Microbiol. 2015;13:42–51. doi: 10.1038/nrmicro3380. [DOI] [PubMed] [Google Scholar]
- 50.Rozwandowicz M, Brouwer MSM, Fischer J, Wagenaar JA, Gonzalez-Zorn B, Guerra B, et al. Plasmids carrying antimicrobial resistance genes in Enterobacteriaceae. J. Antimicrob. Chemother. 2018;73:1121–1137. doi: 10.1093/jac/dkx488. [DOI] [PubMed] [Google Scholar]
- 51.EUCAST EC for AST. Determination of Minimum Inhibitory Concentrations (MICs) of Antibacterial Agents by Broth Dilution. Wiley (10.1111); 2003 Aug p. ix–xv.
- 52.Waters SH, Rogowsky P, Grinsted J, Altenbuchner J, Schmitt R. The tetracycline resistance determinants of RP1 and Tn1721: Nucleotide sequence analysis. Nucleic Acids Res. 1983;11:6089–6105. doi: 10.1093/nar/11.17.6089.
- 53.Octavia S, Sara J, Lan R. Characterization of a large novel phage-like plasmid in Salmonella enterica serovar Typhimurium. FEMS Microbiol. Lett. 2015.
- 54.Liu P, Li P, Jiang X, Bi D, Xie Y, Tai C, et al. Complete genome sequence of Klebsiella pneumoniae subsp. pneumoniae HS11286, a multidrug-resistant strain isolated from human sputum. J. Bacteriol. 2012;194:1841–1842. doi: 10.1128/JB.00043-12.
- 55.Falgenhauer L, Yao Y, Fritzenwanker M, Schmiedel J, Imirzalioglu C, Chakraborty T. Complete genome sequence of phage-like plasmid pECOH89, encoding CTX-M-15. Genome Announc. 2014;2:2227. doi: 10.1128/genomeA.00356-14.
- 56.Colavecchio A, Jeukens J, Freschi L, Rheault J-GE, Kukavica-Ibrulj I, Levesque RC, et al. Complete genome sequences of two phage-like plasmids carrying the CTX-M-15 extended-spectrum β-lactamase gene. Genome Announc. 2017;5:90. doi: 10.1128/genomeA.00102-17.
- 57.Jacoby GA. Mechanisms of resistance to quinolones. Clin. Infect. Dis. 2005;41:S120–S126. doi: 10.1086/428052.
- 58.Gilcrease EB, Casjens SR. The genome sequence of Escherichia coli tailed phage D6 and the diversity of Enterobacteriales circular plasmid prophages. Virology. 2018;515:203–214. doi: 10.1016/j.virol.2017.12.019.
- 59.Beatty ME, Bopp CA, Wells JG, Greene KD, Puhr ND, Mintz ED. Enterotoxin-producing Escherichia coli O169:H41, United States. Emerg. Infect. Dis. 2004;10:518–521. doi: 10.3201/eid1003.030268.
- 60.Cho S, Gupta SK, McMillan EA, Sharma P, Ramadan H, Jové T, et al. Genomic analysis of multidrug-resistant Escherichia coli from surface water in Northeast Georgia, United States: Presence of an ST131 epidemic strain containing blaCTX-M-15 on a phage-like plasmid. Microb. Drug Resist. 2019;mdr.2019.0306.
- 61.Wang X, Kim Y, Ma Q, Hong SH, Pokusaeva K, Sturino JM, et al. Cryptic prophages help bacteria cope with adverse environments. Nat. Commun. 2010;1:147. doi: 10.1038/ncomms1146.
- 62.Hu WS, Lin J-F, Lin Y-H, Chang H-Y. Outer membrane protein STM3031 (Ail/OmpX-like protein) plays a key role in the ceftriaxone resistance of Salmonella enterica serovar Typhimurium. Antimicrob. Agents Chemother. 2009;53:3248–3255. doi: 10.1128/AAC.00079-09.
- 63.Gómez-Duarte OG, Kaper JB. A plasmid-encoded regulatory region activates chromosomal eaeA expression in enteropathogenic Escherichia coli. Infect. Immun. 1995;63:1767–1776. doi: 10.1128/IAI.63.5.1767-1776.1995.
- 64.Murray EL, Conway T. Multiple regulators control expression of the Entner-Doudoroff Aldolase (Eda) of Escherichia coli. J. Bacteriol. 2005;187:991–1000. doi: 10.1128/JB.187.3.991-1000.2005.
- 65.Villa L, García-Fernández A, Fortini D, Carattoli A. Replicon sequence typing of IncF plasmids carrying virulence and resistance determinants. J. Antimicrob. Chemother. 2010;65:2518–2529. doi: 10.1093/jac/dkq347.
- 66.Osborn AM, Tatley FMDS, Steyn LM, Pickup RW, Saunders JR. Mosaic plasmids and mosaic replicons: evolutionary lessons from the analysis of genetic diversity in IncFII-related replicons. Microbiology. 2000;146:2267–2275. doi: 10.1099/00221287-146-9-2267.
- 67.Nishikawa Y, Helander A, Ogasawara J, Moyer NP, Hanaoka M, Hase A, et al. Epidemiology and properties of heat-stable enterotoxin-producing Escherichia coli serotype O169:H41. Epidemiol. Infect. 1998;121:31–42. doi: 10.1017/S0950268898001046.
- 68.Torres OR, González W, Lemus O, Pratdesaba RA, Matute JA, Wiklund G, et al. Toxins and virulence factors of enterotoxigenic Escherichia coli associated with strains isolated from indigenous children and international visitors to a rural community in Guatemala. Epidemiol. Infect. 2014;143:1662–1671. doi: 10.1017/S0950268814002295.
- 69.Sack DA, Shimko J, Torres O, Bourgeois AL, Francia DS, Gustafsson B, et al. Randomised, double-blind, safety and efficacy of a killed oral vaccine for enterotoxigenic E. coli diarrhoea of travellers to Guatemala and Mexico. Vaccine. 2007;25:4392–4400. doi: 10.1016/j.vaccine.2007.03.034.
- 70.Ban E, Yoshida Y, Wakushima M, Wajima T, Hamabata T, Ichikawa N, et al. Characterization of unstable pEntYN10 from enterotoxigenic Escherichia coli (ETEC) O169:H41. Virulence. 2015;6:735–744. doi: 10.1080/21505594.2015.1094606.
- 71.Pfeifer E, de Sousa JAM, Touchon M, Rocha EPC. Bacteria have numerous phage-plasmid families with conserved phage and variable plasmid gene repertoires. bioRxiv. 2020;2020.11.09.375378.
- 72.Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods. 2013;10:563–569. doi: 10.1038/nmeth.2474.
- 73.Hunt M, Silva ND, Otto TD, Parkhill J, Keane JA, Harris SR. Circlator: Automated circularization of genome assemblies using long sequencing reads. Genome Biol. 2015;16:294. doi: 10.1186/s13059-015-0849-0.
- 74.Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput. Biol. 2017;13:e1005595. doi: 10.1371/journal.pcbi.1005595.
- 75.Page AJ, Ainsworth EV, Langridge GC. Socru: Typing of genome-level order and orientation around ribosomal operons in bacteria. Microb. Genom. 2020.
- 76.R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2020. Available from: https://www.R-project.org/.
- 77.Guy L, Kultima JR, Andersson SGE. genoPlotR: Comparative gene and genome visualization in R. Bioinformatics. 2010;26:2334–2335. doi: 10.1093/bioinformatics/btq413.
- 78.Patel IR, Gangiredla J, Mammel MK, Lampel KA, Elkins CA, Lacher DW. Draft genome sequences of the Escherichia coli reference (ECOR) collection. Microbiol. Resour. Announc. 2018;7:e01133–e1218. doi: 10.1128/MRA.01133-18.
- 79.Horesh G, Blackwell GA, Tonkin-Hill G, Corander J, Heinz E, Thomson NR. A comprehensive and high-quality collection of Escherichia coli genomes and their genes. Microb. Genom. 2021.
- 80.Kuhlmann FM, Martin J, Hazen TH, Vickers TJ, Pashos M, Okhuysen PC, et al. Conservation and global distribution of non-canonical antigens in Enterotoxigenic Escherichia coli. PLoS Negl. Trop. Dis. 2019;13:e0007825.
- 81.Rasko DA, Canto FD, Luo Q, Fleckenstein JM, Vidal R, Hazen TH. Comparative genomic analysis and molecular examination of the diversity of enterotoxigenic Escherichia coli isolates from Chile. PLoS Negl. Trop. Dis. 2019;13:e0007828. doi: 10.1371/journal.pntd.0007828.
- 82.Hunt M, Mather AE, Sánchez-Busó L, Page AJ, Parkhill J, Keane JA, et al. ARIBA: Rapid antimicrobial resistance genotyping directly from sequencing reads. bioRxiv. 2017;118000.
- 83.Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MTG, et al. Roary: Rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015;31:3691–3693. doi: 10.1093/bioinformatics/btv421.
- 84.Page AJ, Taylor B, Delaney AJ, Soares J, Seemann T, Keane JA, et al. SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. Microb. Genom. 2016;2:e000056. doi: 10.1099/mgen.0.000056.
- 85.Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol. 2015;32:268–274. doi: 10.1093/molbev/msu300.
- 86.Yu G. Using ggtree to Visualize Data on Tree-Like Structures. Curr. Protoc. Bioinform. 2020;69:e96. doi: 10.1002/cpbi.96.
- 87.Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; 2016.
- 88.Seemann T. Prokka: Rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–2069. doi: 10.1093/bioinformatics/btu153.
- 89.Carver T, Harris SR, Berriman M, Parkhill J, McQuillan JA. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics. 2011;28:464–469. doi: 10.1093/bioinformatics/btr703.
- 90.Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421.
- 91.Li X, Xie Y, Liu M, Tai C, Sun J, Deng Z, et al. oriTfinder: A web-based tool for the identification of origin of transfers in DNA sequences of bacterial mobile genetic elements. Nucleic Acids Res. 2018;46:W229–W234.
- 92.Jia B, Raphenya AR, Alcock B, Waglechner N, Guo P, Tsang KK, et al. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res. 2017;45:D566–D573. doi: 10.1093/nar/gkw1004.
- 93.Hadfield J, Croucher NJ, Goater RJ, Abudahab K, Aanensen DM, Harris SR. Phandango: an interactive viewer for bacterial population genomics. Bioinformatics. 2017;34:292–293. doi: 10.1093/bioinformatics/btx610.
- 94.Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, et al. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 2012;67:2640–2644. doi: 10.1093/jac/dks261.
- 95.Seemann T. Abricate [Internet]. Available from: https://github.com/tseemann/abricate.
- 96.Arndt D, Grant JR, Marcu A, Sajed T, Pon A, Liang Y, et al. PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res. 2016;44:W16–21. doi: 10.1093/nar/gkw387.
- 97.Partridge SR, Tsafnat G. Automated annotation of mobile antibiotic resistance in Gram-negative bacteria: the Multiple Antibiotic Resistance Annotator (MARA) and database. J. Antimicrob. Chemother. 2018;73:883–890. doi: 10.1093/jac/dkx513.
- 98.Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34:D32–D36. doi: 10.1093/nar/gkj014.
Data Availability Statement
The datasets supporting the conclusions of this article are included within the article and its additional files. The sequencing data generated in this study have been submitted to EMBL (Additional files 4 and 5). The physical ETEC reference strains can be requested by contacting the corresponding author, Astrid von Mentzer (avm@sanger.ac.uk or mentzerv@chalmers.se). The database used for annotating ETEC virulence factors (the ETEC virulence database), including the LT and ST alleles, can be found in the GitHub repositories https://github.com/avonm/ETEC_vir_db and https://github.com/avonm/ETEC_toxin_variants_db. An interactive version of the core genome phylogeny of the 1,065 E. coli and ETEC isolates, along with the ETEC reference strains (Figure S2) reported here, is accessible at https://microreact.org/project/2ZZzaHzeXbMEw9U2MAk7pK?tt=cr. Requests for clinical isolates collected as part of this study should be addressed to the corresponding author. Exchange of clinical isolates should always be in agreement with the University of Gothenburg.