Microbiology Resource Announcements. 2024 Feb 20;13(3):e01007-23. doi: 10.1128/mra.01007-23

Draft genome sequences of Escherichia spp. isolates from New Zealand environmental sources

Patrick J Biggs 1,2,3, Marie Moinet 4, Lynn E Rogers 4, Megan Devane 5, Richard Muirhead 6, Rebecca Stott 7, Jonathan C Marshall 8, Adrian L Cookson 2,4
Editor: John J Dennehy 9
PMCID: PMC10927642  PMID: 38376223

ABSTRACT

Escherichia coli is often used as a fecal indicator bacterium for water quality monitoring. We report the draft genome sequences of 500 Escherichia isolates including newly described Escherichia species, namely Escherichia marmotae, Escherichia ruysiae, and Escherichia whittamii, obtained from diverse environmental sources to assist with improved public health risk assessments.

KEYWORDS: Escherichia spp., water quality, Escherichia marmotae, Escherichia whittamii, Escherichia ruysiae

ANNOUNCEMENT

Escherichia coli occurs naturally in the gut of warm-blooded animals (including birds) and is commonly used in freshwater quality monitoring as a proxy for fecal contamination and associated pathogens (1). However, the culture-based methods currently used to enumerate E. coli cannot distinguish fecal E. coli from naturalized or environmentally associated “E. coli-like” strains, also known as Escherichia cryptic clades (2–4). Escherichia whittamii (Cryptic Clade 2) (5), Escherichia ruysiae (Cryptic Clades 3 and 4) (6), and Escherichia marmotae (Cryptic Clade 5) (7) are recently described taxa, but their host species and environmental persistence remain to be established. This project focuses on whole-genome sequencing of E. coli and other Escherichia spp. from environmental sources (freshwater, riverine sediment, aquatic biofilm, soil, and fecal material from birds and mammals). Strains were obtained as part of a study examining the impact of contrasting land uses on Escherichia spp. and were cultured as described previously (8). Genomic data from E. coli and the new Escherichia spp. will provide information on the environmental survival of these bacteria and support more accurate fecal tracking, enabling the most impactful sources of contamination affecting waterways to be identified and rapidly addressed.

Each strain was resuscitated from freezer stocks and grown on sheep blood agar (Fort Richard, New Zealand) at 37°C. A single colony was subcultured onto fresh sheep blood agar, and DNA was extracted from five colonies of each strain using the Wizard Genomic DNA Purification Kit (Promega, WI, USA). Sequencing libraries were prepared using the Nextera XT DNA library preparation kit (Illumina, CA, USA) and sequenced on either the Illumina MiSeq or HiSeq X platform. The resultant 2 × 150-base paired-end reads were quality assessed with FastQC (v0.11.9) (9). Genomes were assembled de novo using the Nullarbor pipeline (v2.0.20191013) (10), which included read trimming with Trimmomatic (v0.39) (11) and assembly with SKESA (v2.4.0) (12), using the default parameters for each tool. The genomes were automatically annotated using the NCBI Prokaryotic Genome Annotation Pipeline (13).
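As an illustration of the read-quality step, the following is a minimal Python sketch of a per-read mean Phred score summary, one of the kinds of metric a tool such as FastQC reports. It is not FastQC's implementation and was not part of the published workflow. The function names and the two toy reads are hypothetical, and Phred+33 quality encoding is assumed.

```python
from statistics import mean


def read_fastq(lines):
    """Yield (read_id, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    return mean(ord(c) - offset for c in qual)


# Hypothetical toy input; a real run would iterate over an open FASTQ file.
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "ACGT\n", "+\n", "II##\n",
]
summary = {rid: mean_phred(q) for rid, _, q in read_fastq(example)}
print(summary)
```

In practice, reads with a low mean score or low-quality tails would be candidates for trimming before assembly.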

The average depth of coverage for the draft genomes was 101× (range 28–437×), with an average genome size of 4,904,079 bp (range 4,287,000–5,664,657 bp), an average of 107 contigs (range 26–356), and an average N50 of 145,778 bp (range 38,436–716,300 bp). The genomes were categorized according to source (Table 1), with fecal isolates (n = 182) the most common. Mash-based phylogroups were assigned in silico from assembled genomes using the ClermonTyper web tool (14). Eight established E. coli phylogroups were identified: A (n = 6), B1 (n = 182), B2 (n = 80), C (n = 8), D (n = 56), E (n = 15), F (n = 4), and G (n = 1), as well as Cryptic Clade 1 (n = 1) (15, 16). In addition, 147 isolates were identified as non-E. coli Escherichia species: Escherichia whittamii (Cryptic Clade 2, n = 2), Escherichia ruysiae (Cryptic Clade 3, n = 4), and Escherichia marmotae (Cryptic Clade 5, n = 141).
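For readers unfamiliar with the assembly statistics quoted above, the sketch below shows the standard definitions of N50 and mean depth of coverage in Python. It is illustrative only and does not reproduce how the reported values were calculated. The function names and the toy contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total of contigs,
    taken from longest to shortest, first reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


def mean_depth(n_reads, read_length, genome_size):
    """Approximate mean depth of coverage: total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


# Hypothetical toy assembly of five contigs (20 bases in total).
toy_contigs = [6, 5, 4, 3, 2]
print(n50(toy_contigs))

# Depth implied by a hypothetical 3,302,080 reads of 150 bases against the
# average genome size reported in the text (4,904,079 bp).
print(round(mean_depth(3_302_080, 150, 4_904_079)))
```

With 150-base reads, the reported average depth of 101× for an average-sized genome corresponds to roughly 3.3 million reads, ignoring any bases lost to trimming.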

TABLE 1.

Summary of 500 genomes from environmental sources

| Source | No. of genomes | Phylogroups observed | Cryptic lineages observed |
|---|---|---|---|
| Feces (avian) | 166 | B1, B2, C, D, E, F | 2, 3, 5 |
| Feces (mammal) | 14 | B1, C, D | None |
| Feces (unknown) | 2 | B1 | 5 |
| Biofilm | 90 | B1, B2, C, D, E, G | 1, 5 |
| Sediment | 78 | A, B1, B2, D, E | 5 |
| Soil | 45 | A, B1, B2, D, E | 5 |
| Water | 105 | A, B1, B2, C, D, F | 5 |
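The counts in Table 1 and in the text can be cross-checked with a few lines of Python. The sketch below simply re-enters the published numbers and verifies that the totals agree. The variable names are hypothetical and no data beyond those stated in the announcement are used.

```python
# Genomes per source, as listed in Table 1.
genomes_by_source = {
    "Feces (avian)": 166,
    "Feces (mammal)": 14,
    "Feces (unknown)": 2,
    "Biofilm": 90,
    "Sediment": 78,
    "Soil": 45,
    "Water": 105,
}

# E. coli phylogroup counts, plus Cryptic Clade 1, as given in the text.
ecoli_groups = {"A": 6, "B1": 182, "B2": 80, "C": 8, "D": 56,
                "E": 15, "F": 4, "G": 1, "Cryptic Clade 1": 1}

# Non-E. coli Escherichia species, as given in the text.
other_species = {"E. whittamii": 2, "E. ruysiae": 4, "E. marmotae": 141}

total_by_source = sum(genomes_by_source.values())
total_feces = sum(n for source, n in genomes_by_source.items()
                  if source.startswith("Feces"))
total_non_ecoli = sum(other_species.values())
total_by_lineage = sum(ecoli_groups.values()) + total_non_ecoli

print(total_by_source, total_feces, total_non_ecoli, total_by_lineage)
```

Both breakdowns account for all 500 genomes, and the three fecal categories together give the n = 182 quoted in the text.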

ACKNOWLEDGMENTS

This work was funded by the New Zealand Ministry of Business, Innovation and Employment project C10X1908 "Novel discriminatory tests for E. coli to improve water quality assessments." M.M. was funded through the Strategic Science Investment Funded project "Food Integrity."

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

We thank mana whenua and acknowledge the support of local hapū/iwi across the sample sites. We also thank Rose Collis for assistance with uploading sequence data. The authors declare no competing financial interests.

Contributor Information

Adrian L. Cookson, Email: adrian.cookson@agresearch.co.nz.

John J. Dennehy, Queens College Department of Biology, USA

DATA AVAILABILITY

The draft genome assemblies (JAVGZV010000000–JAVHJK010000000) and short reads were deposited at DDBJ/ENA/GenBank under BioProject accession number PRJNA1007421. A full listing of the source and phylogroup information for the 500 genomes can be found at https://doi.org/10.6084/m9.figshare.24152268 (17).

REFERENCES

  • 1. Till D, McBride G, Ball A, Taylor K, Pyle E. 2008. Large-scale freshwater microbiological study: rationale, results and risks. J Water Health 6:443–460. doi: 10.2166/wh.2008.071
  • 2. Walk ST, Alm EW, Gordon DM, Ram JL, Toranzos GA, Tiedje JM, Whittam TS. 2009. Cryptic lineages of the genus Escherichia. Appl Environ Microbiol 75:6534–6544. doi: 10.1128/AEM.01262-09
  • 3. Walk ST. 2015. The "Cryptic" Escherichia. EcoSal Plus 6. doi: 10.1128/ecosalplus.ESP-0002-2015
  • 4. Devane ML, Moriarty E, Weaver L, Cookson A, Gilpin B. 2020. Fecal indicator bacteria from environmental sources; strategies for identification to improve water quality monitoring. Water Res 185:116204. doi: 10.1016/j.watres.2020.116204
  • 5. Gilroy R, Ravi A, Getino M, Pursley I, Horton DL, Alikhan N-F, Baker D, Gharbi K, Hall N, Watson M, Adriaenssens EM, Foster-Nyarko E, Jarju S, Secka A, Antonio M, Oren A, Chaudhuri RR, La Ragione R, Hildebrand F, Pallen MJ. 2021. Extensive microbial diversity within the chicken gut microbiome revealed by metagenomics and culture. PeerJ 9:e10941. doi: 10.7717/peerj.10941
  • 6. van der Putten BCL, Matamoros S, Mende DR, Scholl ER, Consortium C, Schultsz C. 2021. Escherichia ruysiae sp. nov., a novel gram-stain-negative bacterium, isolated from a faecal sample of an international traveller. Int J Syst Evol Microbiol 71:004609. doi: 10.1099/ijsem.0.004609
  • 7. Liu S, Jin D, Lan R, Wang Y, Meng Q, Dai H, Lu S, Hu S, Xu J. 2015. Escherichia marmotae sp. nov., isolated from faeces of Marmota himalayana. Int J Syst Evol Microbiol 65:2130–2134. doi: 10.1099/ijs.0.000228
  • 8. Cookson AL, Marshall JC, Biggs PJ, Rogers LE, Collis RM, Devane ML, Stott R, Wilkinson DA, Kamke J, Brightwell G. 2022. Whole-genome sequencing and virulome analysis of Escherichia coli isolated from New Zealand environments of contrasting observed land use. Appl Environ Microbiol 88:e0027722. doi: 10.1128/aem.00277-22
  • 9. Andrews S. 2010. FastQC: a quality control tool for high throughput sequence data. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc
  • 10. Seemann T, Goncalves S, Silva A, Bulach DM, Schultz MB, Kwong JC, Howden BP. 2019. Nullarbor - pipeline to generate complete public health microbiology reports from sequenced isolates. Available from: https://github.com/tseemann/nullarbor
  • 11. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170
  • 12. Souvorov A, Agarwala R, Lipman DJ. 2018. SKESA: strategic k-mer extension for scrupulous assemblies. Genome Biol 19:153. doi: 10.1186/s13059-018-1540-z
  • 13. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569
  • 14. Beghain J, Bridier-Nahmias A, Le Nagard H, Denamur E, Clermont O. 2018. ClermonTyping: an easy-to-use and accurate in silico method for Escherichia genus strain phylotyping. Microb Genom 4:e000192. doi: 10.1099/mgen.0.000192
  • 15. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Environ Microbiol Rep 5:58–65. doi: 10.1111/1758-2229.12019
  • 16. Clermont O, Dixit OVA, Vangchhia B, Condamine B, Dion S, Bridier-Nahmias A, Denamur E, Gordon D. 2019. Characterization and rapid identification of phylogroup G in Escherichia coli, a lineage with high virulence and antibiotic resistance potential. Environ Microbiol 21:3107–3117. doi: 10.1111/1462-2920.14713
  • 17. Biggs PJ, Moinet M, Rogers LE, Devane M, Muirhead RW, Stott R, Marshall JC, Cookson AL. 2023. Supplemental Table 1. Assembly and strain information for 500 genomes sourced from the New Zealand environment. Available from: https://doi.org/10.6084/m9.figshare.24152268

Articles from Microbiology Resource Announcements are provided here courtesy of American Society for Microbiology (ASM)
