ABSTRACT
Algoriphagus is a genus of heterotrophic bacteria commonly found in diverse marine environments. Here, we report the complete genome sequence of Algoriphagus halophilus strain SOCE 003, which is 5,154,101 bp long and encodes 5,524 annotated protein-coding genes, 39 tRNAs, and 8 rRNAs. This genome information will help us understand the ecology of Algoriphagus.
KEYWORDS: Algoriphagus, marine bacterium, genome sequence, nanopore sequencing
ANNOUNCEMENT
The genus Algoriphagus comprises aerobic heterotrophic bacteria (1). Its members have been isolated from diverse marine habitats, including seawater (2), cold seeps (3), and mangrove rhizosphere (4). As of November 2024, the genus contains 50 species with valid names (www.bacterio.net/algoriphagus.html), and some of them encode the metabolic potential to degrade polysaccharides and other macromolecules (5, 6).
We report the complete genome sequence of Algoriphagus halophilus (A. halophilus) strain SOCE 003, isolated from the surface seawater of Dapeng Bay (22°52′43″ N, 114°01′93″ E), China. The measured salinity, chlorophyll a concentration, and temperature were approximately 30 ppt, 5.8 mg/L, and 24°C, respectively. Seawater was filtered through a 0.22 µm nitrocellulose membrane and incubated on Marine Broth 2216 agar (BD Difco, NJ, USA) at 28°C for 3 days. Bacterial colonies were isolated using the streak plating method and were cultivated in 2216 liquid medium to harvest microbial cells. Genomic DNA was extracted using the cetyltrimethylammonium bromide (CTAB) method with proteinase K, followed by chloroform:isoamyl alcohol (24:1) phase separation (7). DNA was precipitated with isopropanol, washed with 70% ethanol, resuspended in TE buffer, and quantified using the Qubit double-stranded DNA broad-range kit (Thermo Fisher Scientific, USA) (7).
Illumina sequencing was performed at Novogene Co., Ltd. (Tianjin, China) on a NovaSeq 6000 platform (Illumina, CA, USA) with the 2 × 150 bp paired-end strategy. Genomic DNA (0.2 µg) was sheared into ~350 bp fragments using the LE220R-plus ultrasonicator (Covaris, MA, USA), and the fragments were then end polished and A-tailed using the NEBNext Ultra DNA library preparation kit (New England Biolabs, MA, USA) following the manufacturer’s instructions. Illumina sequencing yielded 3.0 Gbp of raw reads. Adaptors and low-quality ends were trimmed using fastp v0.19.7 (8) with default parameters, producing 2.6 Gbp of clean reads.
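As a rough check on the yield reported above, mean sequencing depth can be estimated as total bases divided by genome length. The following is a minimal sketch and not part of the reported workflow: the function name is hypothetical, and the inputs are the rounded clean-read yield (2.6 Gbp) and the genome size given in the abstract (5,154,101 bp).

```python
def estimated_depth(total_bases: float, genome_length: int) -> float:
    """Return mean sequencing depth (fold coverage) as bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases / genome_length


# Values taken from the text; the read yield is rounded, so the depth is approximate.
GENOME_LENGTH_BP = 5_154_101
ILLUMINA_CLEAN_BASES = 2.6e9

illumina_depth = estimated_depth(ILLUMINA_CLEAN_BASES, GENOME_LENGTH_BP)
print(f"Illumina clean-read depth: ~{illumina_depth:.0f}x")
```

Under these assumptions, the clean Illumina reads correspond to roughly 500-fold coverage of the genome.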
DNA fragments >3 kb were selected using the Long Fragment Buffer (Oxford Nanopore Technologies, Oxford, UK), and libraries were prepared with the ligation kit SQK-LSK109 and sequenced on the MinION Mk1C platform with an R9.4.1 flow cell. Raw FAST5 files were processed using Guppy v6.1.7 (9) for adapter removal to obtain clean FASTQ reads. Nanopore quality control was conducted using NanoPlot v1.40.2 with a Q-value cutoff of >7 (10). In total, 381,637 Nanopore reads were generated, with an N50 of 6,221 bp and a mean read length of 2,323 bp.
An overlap graph was assembled from the Nanopore reads using Flye v2.9 (11), and overlaps between contigs were trimmed with default parameters. Unicycler v0.5.1 (12) was used to assemble the Nanopore contigs with the Illumina clean reads, and the circular contig was identified and rotated to begin with the dnaA gene on the forward strand. This assembly was corrected using Pilon v1.24 (13) and NextPolish v1.4.0 (14) with Illumina reads, resulting in a complete genome of 5,154,101 bp with a GC content of 39%. All tools were run with default parameters unless otherwise noted.
Genome annotation was performed using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (15), which predicted 5,524 protein-coding genes, 39 tRNAs, and 8 rRNAs (5S, 16S, and 23S). A maximum likelihood phylogenetic tree was constructed based on 16S rRNA genes using IQ-Tree v2.2.0 (16) (Fig. 1). The first 16S rRNA gene (Fragment 1) of A. halophilus strain SOCE 003 showed the highest identity (91.15%) to that of A. halophilus JC2051 (NR_025744.1). The complete genome sequence of A. halophilus SOCE 003 serves as a reference for deciphering the metabolism and adaptation of Algoriphagus in diverse marine habitats.
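Three quantities mentioned in the sequencing and assembly description (read N50, GC content, and rotation of a circular contig to a chosen start position) have simple definitions that can be illustrated in a few lines. The sketch below is illustrative only: it does not reproduce NanoPlot or Unicycler, all function names are hypothetical, and the rotation helper assumes the start coordinate of the gene is already known and lies on the forward strand.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that reads of length >= L contain at least half of all bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / counted


def rotate_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 0-based position `start` comes first."""
    if not seq:
        raise ValueError("sequence must not be empty")
    start %= len(seq)
    return seq[start:] + seq[:start]


# Toy inputs, not data from this study.
toy_n50 = n50([2, 3, 4, 5, 6])                   # 20 bases total; 6 + 5 >= 10
toy_gc = gc_content("ATGC")                      # 2 of 4 bases are G or C
toy_rotated = rotate_circular("GGATGAAACC", 2)   # start at the "ATG"
print(toy_n50, toy_gc, toy_rotated)
```

Rotation changes only the coordinate origin of a circular molecule, so sequence length and GC content are unaffected by it.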
Fig 1.
Maximum likelihood phylogenetic tree of A. halophilus SOCE 003 and its close relatives. Three 16S rRNA genes were identified in A. halophilus SOCE 003 (Fragments 1–3), with the highest pairwise sequence identity of 94.75% (Fragment 1 vs Fragment 2) and the lowest identity of 91.13% (Fragment 2 vs Fragment 3), as calculated using the distmat module of EMBOSS v6.6.0 (17) after alignment using MAFFT v7.453 (18). Public 16S rRNA gene sequences of closely related strains were obtained from the NCBI GenBank database, and only sequences with valid names (www.bacterio.net/algoriphagus.html) were selected. All 16S rRNA gene sequences were aligned using ClustalW (19), and the maximum likelihood phylogenetic tree was inferred from the alignment using the best-fitting model automatically selected by ModelFinder (20) in IQ-Tree v2.2.0 (16). The tree file was visualized using the iTOL web server (https://itol.embl.de). The three 16S sequences of A. halophilus SOCE 003 are shown in red. Three Cyclobacteriaceae bacterium strains (highlighted in blue) were used as the outgroup to root the tree. Branch lengths represent phylogenetic distances from the reference genome. Blue circles represent bootstrap values >80.
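The pairwise identities quoted in the legend are derived from aligned sequences. The sketch below shows one common way to compute percent identity from two rows of an alignment. It is not a reimplementation of EMBOSS distmat, which reports distances and has its own options for gap handling and correction; the function name is hypothetical, and skipping any column that contains a gap is an assumption made here for simplicity.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are skipped entirely; other tools may instead
    count them as mismatches, which gives a lower value.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment rows, not sequences from this study.
toy_identity = percent_identity("ACGTAC", "ACGTTC")   # 5 of 6 columns match
print(f"{toy_identity:.2f}%")
```

Because the result depends on how gaps are treated, identities computed this way are comparable only when the same convention is applied to every pair.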
ACKNOWLEDGMENTS
This study was supported by the National Natural Science Foundation of China (grant nos. 42276163 and 42476109), Shenzhen Science and Technology Innovation Commission Programme (grant no. JCYJ20220530115401003), SUSTech Undergraduate Teaching Quality and Education Reform Project (grant no. SJZLGC202437), and Shanghai Frontiers Science Center of Polar Research (SOO2004-03).
Contributor Information
Shengwei Hou, Email: housw@sustech.edu.cn.
Frank J. Stewart, Montana State University, Bozeman, Montana, USA
DATA AVAILABILITY
The complete genome sequence of Algoriphagus halophilus strain SOCE 003 has been deposited at GenBank under the BioProject accession number PRJNA1082223, the BioSample accession number SAMN40200481, and the GenBank accession number CP146486. The SRA accession numbers of the Illumina and Nanopore reads are SRR30551005 and SRR30565972, respectively.
REFERENCES
- 1. Bowman JP, Nichols CM, Gibson JAE. 2003. Algoriphagus ratkowskyi gen. nov., sp. nov., Brumimicrobium glaciale gen. nov., sp. nov., Cryomorpha ignava gen. nov., sp. nov. and Crocinitomix catalasitica gen. nov., sp. nov., novel flavobacteria isolated from various polar habitats. Int J Syst Evol Microbiol 53:1343–1355. doi: 10.1099/ijs.0.02553-0
- 2. Riyanti, Zumkeller CM, Spohn M, Mihajlovic S, Schwengers O, Goesmann A, Choironi NA, Schäberle TF, Harwoko H. 2023. Draft genome sequences of Algoriphagus sp. strain PAP.12 and Roseivirga sp. strain PAP.19, isolated from marine samples from Papua, Indonesia. Microbiol Resour Announc 12:e0126422. doi: 10.1128/mra.01264-22
- 3. Yan C, Chen C, Chai B, Ye Y, Anwar N, Zhao Z, Wang R, Huo Y, Zhang X, Wu M, Zheng D. 2022. Algoriphagus algorifonticola sp. nov., a marine bacterium isolated from cold spring area of South China Sea. Int J Syst Evol Microbiol 72. doi: 10.1099/ijsem.0.005365
- 4. Song ZM, Wang KL, Yin Q, Chen CC, Xu Y. 2020. Algoriphagus kandeliae sp. nov., isolated from mangrove rhizosphere soil. Int J Syst Evol Microbiol 70:1672–1677. doi: 10.1099/ijsem.0.003954
- 5. Alegado RA, Ferriera S, Nusbaum C, Young SK, Zeng Q, Imamovic A, Fairclough SR, King N. 2011. Complete genome sequence of Algoriphagus sp. PR1, bacterial prey of a colony-forming choanoflagellate. J Bacteriol 193:1485–1486. doi: 10.1128/JB.01421-10
- 6. Pinnaka AK, Tanuku NRS. 2014. The family Cyclobacteriaceae. In Rosenberg E, DeLong EF, Lory S, Stackebrandt E, Thompson F (ed), The prokaryotes. Springer, Berlin, Heidelberg.
- 7. Minas K, McEwan NR, Newbold CJ, Scott KP. 2011. Optimization of a high-throughput CTAB-based protocol for the extraction of qPCR-grade DNA from rumen fluid, plant and bacterial pure cultures. FEMS Microbiol Lett 325:162–169. doi: 10.1111/j.1574-6968.2011.02424.x
- 8. Chen S, Zhou Y, Chen Y, Gu J. 2018. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560
- 9. Wick RR, Judd LM, Holt KE. 2019. Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome Biol 20:129. doi: 10.1186/s13059-019-1727-y
- 10. De Coster W, D’Hert S, Schultz DT, Cruts M, Van Broeckhoven C. 2018. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34:2666–2669. doi: 10.1093/bioinformatics/bty149
- 11. Antipov D, Rayko M, Kolmogorov M, Pevzner PA. 2022. viralFlye: assembling viruses and identifying their hosts from long-read metagenomics data. Genome Biol 23:57. doi: 10.1186/s13059-021-02566-x
- 12. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595
- 13. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963
- 14. Hu J, Fan J, Sun Z, Liu S. 2020. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36:2253–2255. doi: 10.1093/bioinformatics/btz891
- 15. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569
- 16. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R. 2020. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol 37:1530–1534. doi: 10.1093/molbev/msaa015
- 17. Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16:276–277. doi: 10.1016/s0168-9525(00)02024-2
- 18. Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30:3059–3066. doi: 10.1093/nar/gkf436
- 19. Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680. doi: 10.1093/nar/22.22.4673
- 20. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589. doi: 10.1038/nmeth.4285

