ABSTRACT
The strain Nitrosomonas europaea W4 was isolated from a river water sample, and its complete genome sequence was determined. The genome encodes a functional Eut bacterial microcompartment for ethanolamine metabolism. Deciphering this genomic information is essential for optimizing high-biomass fermentation and advancing its application in wastewater treatment.
KEYWORDS: Nitrosomonas europaea, Eut BMC, ethanolamine
ANNOUNCEMENT
Nitrosomonas europaea W4 was isolated from the Shiqi River in Zhongshan, China. N. europaea exhibits strong ammonia-oxidizing capability and wide environmental distribution, but the species is difficult to isolate. Currently, only N. europaea ATCC 19718 has a complete genome in NCBI (1). This new Asian isolate may reveal genetic differences from the reference strain. N. europaea W4 was isolated using a 96-well plate method with inorganic liquid medium (2) and cultured at 30°C and 130 rpm in the dark. Bacterial biomass was harvested by centrifugation at 10,000 rpm for 5 min. Genomic DNA was extracted using the Mag-MK Bacterial Genomic DNA Extraction Kit. DNA quality was verified using a Qubit 4.0 (Thermo Fisher Scientific) for concentration measurement and pulsed-field gel electrophoresis (CHEF Mapper XA, Bio-Rad) to confirm fragment integrity (>50 kb) (3). High-quality DNA was used for library preparation and sequencing.
Whole-genome sequencing was performed on the PacBio Sequel IIe and Illumina NovaSeq 6000 (Illumina Inc., San Diego, CA, USA) platforms. For Illumina sequencing, 1 µg of genomic DNA was fragmented into 400–500 bp segments using a Covaris M220 Focused Acoustic Shearer, and sequencing libraries were prepared from the sheared fragments using the NEXTflex Rapid DNA-Seq Kit (Bioo Scientific, USA). For PacBio library preparation, 15 µg of DNA was spun in a Covaris g-TUBE (Covaris, MA) at 6,000 rpm for 60 s using an Eppendorf 5424 centrifuge. The sequencing library was purified three times with 0.45× volumes of Agencourt AMPure XP beads. A ~10 kb insert library was prepared and sequenced on one SMRT cell using standard protocols, and the PacBio Sequel IIe platform produced HiFi reads.

Quality control of the Illumina data was performed using fastp (v.0.23.0) (4) with default parameters. The quality-controlled Illumina reads and the HiFi reads were assembled using Unicycler (v.0.4.8) (5). After assembly, NextDenovo (v2.5.2) was used to automatically circularize the genome. Predicted CDSs were annotated against the NR, Swiss-Prot, GO, COG, and KEGG databases using BLAST+ (v.2.3.0) and DIAMOND (v.0.8.35), and the annotation results were visualized with Artemis (v18.2.0) (6).

PacBio sequencing yielded 41,266 long reads (total 339,733,834 bp), with an average read length of 8.2 kb, a maximum read length of 22.6 kb, an N50 of 8,670 bp, and ~120× coverage. Illumina sequencing yielded 7,922,322 reads (total 1,196,270,622 bp), with a reported sequencing depth of 151-fold, Q20/Q30 values of 98.70%/96.32%, and ~390× coverage. Comparative genomic analysis was conducted against the N. europaea ATCC 19718 reference genome (NCBI). The whole-genome average nucleotide identity, calculated via BLASTN, was 97.7% (7). The genome of N. europaea W4 comprises a chromosome (2,892,991 bp) and a plasmid (55,583 bp) (Table 1).
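As a rough cross-check of the read statistics, mean read length and fold coverage can be recomputed from the totals quoted in the text. The sketch below is illustrative only: the function names are hypothetical, and it assumes coverage is simply total sequenced bases divided by the combined chromosome and plasmid length, which may not be how the reported values were derived. Under that assumption the raw ratios come out near 115× (PacBio) and 406× (Illumina), close to the quoted ~120× and ~390×, and the Illumina totals correspond to 151 bp per read.

```python
# Figures quoted in the announcement.
CHROMOSOME_BP = 2_892_991
PLASMID_BP = 55_583
PACBIO_READS = 41_266
PACBIO_BASES = 339_733_834
ILLUMINA_READS = 7_922_322
ILLUMINA_BASES = 1_196_270_622


def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average read length in bp (total bases divided by number of reads)."""
    return total_bases / read_count


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Naive fold coverage (total sequenced bases divided by genome size)."""
    return total_bases / genome_size


# Assumption: the genome size used here is chromosome + plasmid.
genome_size = CHROMOSOME_BP + PLASMID_BP

pacbio_mean = mean_read_length(PACBIO_BASES, PACBIO_READS)
illumina_mean = mean_read_length(ILLUMINA_BASES, ILLUMINA_READS)
pacbio_cov = fold_coverage(PACBIO_BASES, genome_size)
illumina_cov = fold_coverage(ILLUMINA_BASES, genome_size)

print(f"Genome size: {genome_size:,} bp")
print(f"PacBio:   mean read {pacbio_mean:,.0f} bp, coverage {pacbio_cov:.1f}x")
print(f"Illumina: mean read {illumina_mean:,.0f} bp, coverage {illumina_cov:.1f}x")
```

Small differences from the published coverage values are expected if the reported figures were calculated after read filtering or against a different genome length.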
TABLE 1. Genomic properties of N. europaea W4
| Type | Name | GC% | Size (bp) | Average length (bp) | rRNAs | tRNAs | Other RNA | CDSs |
|---|---|---|---|---|---|---|---|---|
| Chromosome | - | 51.26 | 2,892,991 | 952 | 3 | 41 | 60 | 2,716 |
| Plasmid | - | 52.84 | 55,583 | 828 | - | 1 | - | 55 |
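The per-replicon values in Table 1 can be combined into genome-wide summaries. The following is a minimal sketch, not part of the original analysis: it computes a size-weighted GC content and a CDS density for each replicon, using only the sizes, GC percentages, and CDS counts from the table. The dictionary layout and function names are hypothetical.

```python
# Values copied from Table 1.
replicons = {
    "chromosome": {"size_bp": 2_892_991, "gc_percent": 51.26, "cds": 2_716},
    "plasmid": {"size_bp": 55_583, "gc_percent": 52.84, "cds": 55},
}


def weighted_gc(reps: dict) -> float:
    """Genome-wide GC%, weighting each replicon's GC% by its length."""
    total_bp = sum(r["size_bp"] for r in reps.values())
    return sum(r["size_bp"] * r["gc_percent"] for r in reps.values()) / total_bp


def cds_per_kb(rep: dict) -> float:
    """Number of predicted CDSs per kilobase of sequence."""
    return rep["cds"] / (rep["size_bp"] / 1000)


total_cds = sum(r["cds"] for r in replicons.values())
print(f"Genome-wide GC: {weighted_gc(replicons):.2f}%")
print(f"Total CDSs: {total_cds}")
for name, rep in replicons.items():
    print(f"{name}: {cds_per_kb(rep):.2f} CDSs per kb")
```

Because the plasmid accounts for under 2% of the total sequence, the weighted GC content stays very close to the chromosomal value.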
Functional annotation using eggNOG-mapper (v2.1.1.2) identified genes encoding the Eut bacterial microcompartment (8), highlighting the strain’s potential for ethanolamine metabolism (9). In-depth study of the genomic information of N. europaea W4 is important for achieving high-biomass fermentation of this isolate and for its application in wastewater treatment.
ACKNOWLEDGMENTS
The authors thank Shanghai Meiji Biomedical Technology Co., Ltd. for technical support in genome sequencing and data analysis.
This work was supported by the National Natural Science Foundation of China (51579092) and CEEC Major S&T Projects (CEEC2023-ZDYF-09).
Hongfei Wang: Investigation, Strain Screening, Gene Comparison, Writing – Original Draft. Kai Liu: Investigation, Gene Comparison. Jingxuan Deng: Strain Screening, Writing – Review and Editing. Jian Gao: Funding Acquisition, Writing – Review and Editing. Mingjun Liao: Data Planning, Research Methodology, Funding Acquisition, Supervision, Validation, Writing – Review and Editing.
Contributor Information
Mingjun Liao, Email: 455234600@qq.com.
Irene L. G. Newton, Indiana University, Bloomington, Bloomington, Indiana, USA
DATA AVAILABILITY
The raw sequencing data and the assembled genome have been submitted to the NCBI GenBank database, and the genomic sequence is publicly available. The BioSample number is SAM47145788, and the BioProject number is PRJNA1229865. The Illumina raw data accession number is SRX27835090, and the PacBio raw data accession number is SRX29463302. The accession number for the complete assembled genome is GCA_039905905.1. The accession number of the chromosome sequence is CP157069.1, and that of the plasmid sequence is CP157070.1.
REFERENCES
- 1. Chain P, Lamerdin J, Larimer F, Regala W, Lao V, Land M, Hauser L, Hooper A, Klotz M, Norton J, Sayavedra-Soto L, Arciero D, Hommes N, Whittaker M, Arp D. 2003. Complete genome sequence of the ammonia-oxidizing bacterium and obligate chemolithoautotroph Nitrosomonas europaea. J Bacteriol 185:2759–2773. doi: 10.1128/JB.185.9.2759-2773.2003
- 2. Verhagen FJM, Laanbroek HJ. 1991. Competition for ammonium between nitrifying and heterotrophic bacteria in dual energy-limited chemostats. Appl Environ Microbiol 57:3255–3263. doi: 10.1128/aem.57.11.3255-3263.1991
- 3. Guan H, Qin T, Zhou HJ, Ren HY, Duan YX, Xu X, Shao ZJ. 2013. Establishment and application of pulsed-field gel electrophoresis assay to screen plasmid from bacteria. Disease Surveillance 28:57–60.
- 4. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560
- 5. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595
- 6. Carver T, Harris SR, Berriman M, Parkhill J, McQuillan JA. 2012. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics 28:464–469. doi: 10.1093/bioinformatics/btr703
- 7. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. 2018. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun 9:5114. doi: 10.1038/s41467-018-07641-9
- 8. Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. 2021. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol Biol Evol 38:5825–5829. doi: 10.1093/molbev/msab293
- 9. Pokhrel A, Kang SY, Schmidt-Dannert C. 2021. Ethanolamine bacterial microcompartments: from structure, function studies to bioengineering applications. Curr Opin Microbiol 62:28–37. doi: 10.1016/j.mib.2021.04.008
