ABSTRACT
Extraintestinal pathogenic Escherichia coli (ExPEC) is a potential factor in ulcerative colitis etiology. We report here the complete genome and plasmid sequences of three Escherichia coli isolates, C 237-04 (p7), C 236-04A (p10A), and C 691-04A (p19A), obtained from fecal samples from ulcerative colitis patients in Copenhagen, Denmark.
ANNOUNCEMENT
Ulcerative colitis (UC) is characterized by periods of colonic mucosal inflammation, including signs and symptoms such as diarrhea, rectal bleeding, and stomach pain, followed by periods of remission. The disease etiology remains unsolved, but several host and environmental factors have been implicated (1, 2), including an association with virulent and pathogenic Escherichia coli (3–6). Studying E. coli strains isolated from UC patients helps us understand their role in UC etiology. Three clinical E. coli strains were isolated from patients’ fecal samples as part of a Danish clinical case-control study on inflammatory bowel disease (IBD), with informed written consent from participants and permission from the Regional Ethics Committee for Copenhagen County Hospitals (permission number KA03019).
The E. coli strains were isolated by suspending feces in phosphate-buffered saline, plating the suspension on SSI selective enteric medium (number 724; SSI Diagnostica, Hillerød, Denmark) (7, 8), and incubating the plates at 37°C overnight. Colonies were identified visually as E. coli, the species was confirmed using a Minibact E kit (9), and the cultures were stored in glycerol at −80°C. For DNA purification, fresh cultures were grown overnight at 37°C in LB medium, and DNA was extracted using the phenol-chloroform method as previously described (10). Separate DNA batches were prepared for short- and long-read sequencing. Purity and concentration were measured using a NanoDrop One spectrophotometer.
Sequencing and postprocessing were performed as described by Lallement et al. (11) using a combined short- and long-read sequencing approach. To favor long reads, shearing and size selection of the DNA were omitted before library preparation with an SQK-LSK109 ligation kit, and sequencing was performed on an Oxford Nanopore MinION instrument with an R9.4.1 flow cell. Base calling of the fast5 files to fastq files was performed using Guppy v5.0.7 with the dna_r9.4.1_450bps_sup model. Reads longer than 10,000 nucleotides (nt) were selected using Filtlong v0.2.1 (https://github.com/rrwick/Filtlong) and assembled into scaffolds either using Flye v2.9 (12) with the -g 5 --nano-hq arguments or using Minimap2 v2.24/Miniasm v0.3 (13) assembly followed by two iterations of Minimap2/Racon v1.5.0 polishing (MMR) (14). The Minimap2 settings were -x ava-ont for assembly and -x asm5 for polishing.
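The long-read steps described above can be sketched as a list of command lines. This is only an illustration, not the authors' script: the file names and the `long_read_steps` helper are hypothetical, the commands are built but not executed, and the genome size `5m` is an assumption about what the `-g 5` argument in the text stands for. Filtlong's `--min_length` keeps reads at or above the threshold, so it only approximates "longer than 10,000 nt".

```python
from typing import List, Optional, Tuple

# (argv, file that would receive stdout, or None if the tool writes its own output)
Step = Tuple[List[str], Optional[str]]


def long_read_steps(reads: str, genome_size: str = "5m") -> List[Step]:
    """Build (but do not run) the long-read filtering and assembly commands.

    All file names are hypothetical; genome_size="5m" is an assumed reading
    of the "-g 5" argument given in the text.
    """
    filtered = "filtered.fastq"
    steps: List[Step] = [
        # Keep only long reads (Filtlong writes FASTQ to stdout).
        (["filtlong", "--min_length", "10000", reads], filtered),
        # Route 1: Flye assembly of high-quality Nanopore reads.
        (["flye", "--nano-hq", filtered, "-g", genome_size, "-o", "flye_out"], None),
        # Route 2: all-vs-all overlaps with Minimap2, then Miniasm layout.
        (["minimap2", "-x", "ava-ont", filtered, filtered], "overlaps.paf"),
        (["miniasm", "-f", filtered, "overlaps.paf"], "miniasm.gfa"),
    ]
    # Miniasm emits GFA; converting it to FASTA ("draft_0.fasta") is omitted here.
    draft = "draft_0.fasta"
    for i in (1, 2):  # two Minimap2/Racon polishing iterations
        paf = f"polish_{i}.paf"
        polished = f"draft_{i}.fasta"
        steps.append((["minimap2", "-x", "asm5", draft, filtered], paf))
        steps.append((["racon", filtered, paf, draft], polished))
        draft = polished
    return steps


if __name__ == "__main__":
    for argv, stdout in long_read_steps("reads.fastq"):
        print(" ".join(argv) + (f" > {stdout}" if stdout else ""))
```

Keeping each step as an argument list plus an optional output file makes it straightforward to pass the steps to `subprocess.run` later, once the paths and tool versions have been checked against a real installation.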
Paired-end 150-bp short-read sequencing was performed by BGI Europe A/S on the BGISEQ-500 platform, and the reads were filtered and trimmed using fastp (15) with the quality filter settings -q 25 and -u 10. Hybrid assemblies from the long-read scaffolds and the short reads were constructed using SPAdes v3.15 with the --isolate and --trusted-contigs settings (16) and using Unicycler v0.5.0 (17). The hybrid assemblies additionally recovered the small circular plasmids in C 237-04 and C 236-04A, which were assembled solely from short reads. Using Unicycler, all replicon sequences were rotated to begin with dnaA (for chromosomes) or repA (for plasmids). Sequence annotation was performed by NCBI using PGAP v6.3 (18). Default parameters were used for all software unless otherwise specified. Sequence and assembly statistics are presented in Table 1.
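As a rough illustration of the short-read trimming and the SPAdes hybrid step, the sketch below assembles the two command lines. The file names and the `short_read_steps` helper are hypothetical, nothing is executed, and the Unicycler run is left out. In fastp, `-q` sets the per-base quality threshold and `-u` the percentage of bases allowed below it; SPAdes spells its options with two leading hyphens.

```python
import shlex
from typing import List


def short_read_steps(r1: str, r2: str, scaffolds: str) -> List[List[str]]:
    """Build (but do not run) the fastp and SPAdes commands.

    Input and output file names are hypothetical placeholders.
    """
    trimmed_1, trimmed_2 = "trimmed_1.fastq.gz", "trimmed_2.fastq.gz"
    return [
        # Quality filtering and trimming of the paired-end reads.
        ["fastp", "-i", r1, "-I", r2, "-o", trimmed_1, "-O", trimmed_2,
         "-q", "25", "-u", "10"],
        # Hybrid assembly: short reads plus long-read scaffolds as trusted contigs.
        ["spades.py", "--isolate", "-1", trimmed_1, "-2", trimmed_2,
         "--trusted-contigs", scaffolds, "-o", "spades_out"],
    ]


if __name__ == "__main__":
    for argv in short_read_steps("raw_1.fastq.gz", "raw_2.fastq.gz", "scaffolds.fasta"):
        print(shlex.join(argv))
```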
TABLE 1. Sequencing and assembly statistics

| Isolate | Replicon | No. of reads, BGI^a | No. of reads, ONT^b | Coverage (×), BGI | Coverage (×), ONT | ONT N50 (bp) | Size (bp) | GC content (%) | No. of predicted CDS^c | GenBank accession no. | SRA^d accession no., BGI | SRA accession no., ONT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C 237-04 | | 11,632,828 | 41,675 | | | 25,238 | | | | | SRR21857883 | SRR21857880 |
| | p7 (Chr)^e | 11,433,789 | | 349 | 66 | | 4,918,902 | 50.57 | 4,630 | CP109921 | | |
| | pP7_1 | 49,975 | | 1,203 | 0 | | 6,230 | 49.31 | 8 | CP109922 | | |
| | pP7_2 | 40,225 | | 4,136 | 0 | | 1,459 | 50.58 | 1 | CP109923 | | |
| C 236-04A | | 11,648,518 | 8,522 | | | 29,467 | | | | | SRR21857884 | SRR21857881 |
| | p10A (Chr) | 11,521,884 | | 347 | 13 | | 4,987,582 | 50.85 | 4,734 | CP109924 | | |
| | pP10A_1 | 27,678 | | 1,010 | 0 | | 4,109 | 45.44 | 5 | CP109925 | | |
| | pP10A_2 | 22,567 | | 1,097 | 0 | | 3,086 | 46.01 | 4 | CP109926 | | |
| | pP10A_3 | 16,005 | | 978 | 0 | | 2,455 | 48.80 | 4 | CP109927 | | |
| | pP10A_4 | 12,099 | | 1,464 | 0 | | 1,240 | 46.13 | 1 | CP109928 | | |
| C 691-04A | | 11,649,154 | 174,807 | | | 4,904 | | | | | SRR21857885 | SRR21857882 |
| | p19A (Chr) | 11,309,943 | | 328 | 45 | | 5,176,167 | 50.46 | 4,826 | CP109929 | | |
| | pP19A_1 | 302,594 | | 582 | 123 | | 77,976 | 52.26 | 100 | CP109930 | | |

^a BGI, BGISEQ sequencing; no. of reads mapped to each replicon using Bowtie2 v2.4.5 (19).
^b ONT, Oxford Nanopore Technologies.
^c CDS, coding DNA sequences.
^d SRA, Sequence Read Archive.
^e Chr, chromosome.
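The BGI coverage values in Table 1 are consistent with a simple estimate: mapped reads times read length, divided by replicon size. The text does not state how coverage was actually computed, so the sketch below is an assumption; in particular it treats every mapped read as a full 150 bp, which ignores trimming. The `mean_coverage` helper is hypothetical.

```python
def mean_coverage(n_reads: int, replicon_size_bp: int, read_length_bp: int = 150) -> float:
    """Approximate mean depth as total sequenced bases divided by replicon size.

    Assumes every mapped read contributes read_length_bp bases (no trimming).
    """
    return n_reads * read_length_bp / replicon_size_bp


# (mapped BGI reads, replicon size in bp), copied from Table 1
rows = {
    "p7 (Chr)": (11_433_789, 4_918_902),
    "pP7_1": (49_975, 6_230),
    "pP19A_1": (302_594, 77_976),
}

if __name__ == "__main__":
    for name, (reads, size) in rows.items():
        print(f"{name}: {mean_coverage(reads, size):.0f}x")
```

For these three replicons the estimate rounds to 349, 1,203, and 582, matching the tabulated BGI coverage. The very high depth of the small plasmids relative to the chromosomes may reflect higher copy number, although the source does not comment on this.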
Serotyping and virulence characterization (7) were confirmed in silico using web applications provided by the Center for Genomic Epidemiology, DTU (20), and EZClermont v0.6.3 (21). Isolates p7 and p19A were found to be genotypically more virulent than p10A, consistent with the disease state of the patients from whom they were isolated (active colitis for p7 and p19A and inactive colitis for p10A) (7).
Data availability.
This whole-genome sequencing project has been deposited at GenBank under BioProject accession number PRJNA882345. The version described in this paper is the first version. The accession numbers for the raw reads and assemblies are provided in Table 1.
ACKNOWLEDGMENT
We thank Jerry Wells of Wageningen University and Research for providing a preliminary sequence for isolate p19A. L.J. was supported by a grant from the Novo Nordisk Foundation (grant number NNF19OC0058547).
Contributor Information
Ole Skovgaard, Email: olesk@ruc.dk.
Karen A. Krogfelt, Email: karenak@ruc.dk.
David Rasko, University of Maryland School of Medicine.
REFERENCES
- 1. Ristow LC, Welch RA. 2016. Hemolysin of uropathogenic Escherichia coli: a cloak or a dagger? Biochim Biophys Acta 1858:538–545. doi: 10.1016/j.bbamem.2015.08.015.
- 2. Lee M, Chang EB. 2021. Inflammatory bowel diseases (IBD) and the microbiome—searching the crime scene for clues. Gastroenterology 160:524–537. doi: 10.1053/j.gastro.2020.09.056.
- 3. Mirsepasi-Lauridsen HC, Halkjaer SI, Mortensen EM, Lydolph MC, Nordgaard-Lassen I, Krogfelt KA, Petersen AM. 2016. Extraintestinal pathogenic Escherichia coli are associated with intestinal inflammation in patients with ulcerative colitis. Sci Rep 6:31152. doi: 10.1038/srep31152.
- 4. Mirsepasi-Lauridsen HC, Vallance BA, Krogfelt KA, Petersen AM. 2019. Escherichia coli pathobionts associated with inflammatory bowel disease. Clin Microbiol Rev 32:e00060-18. doi: 10.1128/CMR.00060-18.
- 5. Kamali DR, Feizi A, Halaji M, Fazeli H, Adibi P. 2021. The prevalence of adherent-invasive Escherichia coli and its association with inflammatory bowel diseases: a systematic review and meta-analysis. Front Med (Lausanne) 8:730243. doi: 10.3389/FMED.2021.730243.
- 6. Petersen AM, Halkjær SI, Gluud LL. 2015. Intestinal colonization with phylogenetic group B2 Escherichia coli related to inflammatory bowel disease: a systematic review and meta-analysis. Scand J Gastroenterol 50:1199–1207. doi: 10.3109/00365521.2015.1028993.
- 7. Petersen AM, Nielsen EM, Litrup E, Brynskov J, Mirsepasi H, Krogfelt KA. 2009. A phylogenetic group of Escherichia coli associated with active left-sided inflammatory bowel disease. BMC Microbiol 9:171. doi: 10.1186/1471-2180-9-171.
- 8. Blom M, Meyer A, Gerner-Smidt P, Gaarslev K, Espersen F. 1999. Evaluation of Statens Serum Institut enteric medium for detection of enteric pathogens. J Clin Microbiol 37:2312–2316. doi: 10.1128/JCM.37.7.2312-2316.1999.
- 9. Kjaeldgaard P, Nissen B, Lange N, Laursen H. 1986. Evaluation of Minibact, a new system for rapid identification of Enterobacteriaceae. Comparison of Minibact, Micro-ID and API 20E with a conventional method as reference. Acta Pathol Microbiol Immunol Scand B 94:57–61.
- 10. Rasool FN, Saavedra MA, Pamba S, Perold V, Mmochi AJ, Maalim M, Simonsen L, Buur L, Pedersen RH, Syberg K, Jelsbak L. 2021. Isolation and characterization of human pathogenic multidrug resistant bacteria associated with plastic litter collected in Zanzibar. J Hazard Mater 405:124591. doi: 10.1016/j.jhazmat.2020.124591.
- 11. Lallement C, Krogfelt KA, Skovgaard O, Jelsbak L. 2022. Complete genome sequence of a multidrug-resistant Klebsiella pneumoniae environmental isolate from Zanzibar, Tanzania, harboring novel insertion elements and two blaCTX-M-15 genes. Microbiol Resour Announc 11:e0026322. doi: 10.1128/MRA.00263-22.
- 12. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. 2019. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37:540–546. doi: 10.1038/s41587-019-0072-8.
- 13. Li H. 2016. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32:2103–2110. doi: 10.1093/bioinformatics/btw152.
- 14. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi: 10.1101/gr.214270.116.
- 15. Chen S, Zhou Y, Chen Y, Gu J. 2018. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560.
- 16. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021.
- 17. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595.
- 18. Li W, O'Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation. Nucleic Acids Res 49:D1020–D1028. doi: 10.1093/nar/gkaa1105.
- 19. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. doi: 10.1038/nmeth.1923.
- 20. Technical University of Denmark. 2011. Center for Genomic Epidemiology. http://www.genomicepidemiology.org/services/. Accessed 30 November 2022.
- 21. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Environ Microbiol Rep 5:58–65. doi: 10.1111/1758-2229.12019.
