ABSTRACT
We isolated Pseudovibrio ascidiaceicola strain 5337, a gut bacterium of the ascidian Ciona robusta, from Mission Bay, San Diego. The genomic assembly is 6.94 Mb and 99.99% complete, comprising 22 contigs and 6,613 protein-coding genes. Unicycler identified seven circular contigs, and PHASTEST identified 11 prophage regions, including two gene transfer agents.
KEYWORDS: gram-negative bacteria, lysogeny, prophage, ascidian model, Ciona robusta, host-microbe interactions, microbiome
ANNOUNCEMENT
We isolated Pseudovibrio ascidiaceicola strain 5337 to explore bacterial and phage diversity and trans-kingdom interactions within the tunicate gut. Its genome suggests potential roles of prophages and horizontal gene transfer in introducing novel genetic features that contribute to the development of symbiotic relationships with marine invertebrates (1, 2).
This strain was cultured from the gut of starved Ciona robusta harvested from Mission Bay, San Diego (32.780167°N, 117.242833°W), in March 2015 (3). Pooled gut homogenates from five animals were filtered through a 0.45 µm Sterivex filter. The filtrate was serially diluted, and 100 µL aliquots were plated onto BD Difco Marine Agar 2216. Distinct isolated colonies were inoculated into BD Difco Marine Broth and grown overnight at room temperature (20–22°C) with continuous shaking at 120 rpm (3). DNA was extracted from 1 mL of this broth culture (OD600 ≥ 1.2) using the PureLink Microbiome DNA Extraction Kit (Invitrogen).
For Illumina sequencing, the DNA was sonicated, assessed on a BioAnalyzer 2100 (Agilent Technologies), size-selected to obtain fragments of 400–600 bp, and used for library preparation with the NuGEN UltraLow DNA kit (Tecan Life Sciences). This library was sequenced by Eurofins MWG Operon LLC on the Illumina MiSeq platform (2 × 250 bp).
For PacBio sequencing, DNA was sheared using the Covaris g-TUBE, followed by Exo III and VII digestion, BluePippin size selection, damage repair, end repair, and adaptor ligation. DNA was purified at each step using 0.45× AMPure XP beads. Two libraries were constructed using the SMRTbell Library Preparation, 1.0 SPv3 kit and assessed using PicoGreen (Invitrogen), Agilent, and NanoDrop (Thermo Fisher Scientific) assays. Both libraries were sequenced by the University of Minnesota Genomics Center on the Sequel system using two 1M v3 SMRT cells.
On the KBase server (4), 2,765,034 Illumina-sequenced reads were trimmed using Trimmomatic v0.39 (5), and 300,239 PacBio reads (N50 = 9,239 bp) were trimmed to retain reads >1,000 bp using Filtlong v0.2.1 (https://github.com/rrwick/Filtlong). Default software parameters were used. Trimming retained 1,545,622 Illumina paired-end reads and 37,693 PacBio reads. Read quality was evaluated using FastQC v0.12.1 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), while read and genome N50 and GC content were predicted by QUAST (Galaxy v5.3.0 + galaxy0) (6) available on the Galaxy server (7, 8).
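The read counts above imply how much data survived trimming. The following is a minimal sketch of that arithmetic, using only the counts reported in this paragraph. The function name `retention` is illustrative, and the sketch assumes the before and after counts are expressed in the same units (the text does not state whether the retained Illumina count refers to reads or read pairs).

```python
def retention(before: int, after: int) -> float:
    """Return the percentage of reads kept after trimming."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * after / before


# Counts as reported in the text (assumed to be in matching units).
illumina_pct = retention(2_765_034, 1_545_622)
pacbio_pct = retention(300_239, 37_693)

print(f"Illumina reads retained: {illumina_pct:.1f}%")
print(f"PacBio reads retained:   {pacbio_pct:.1f}%")
```

Under that assumption, roughly half of the Illumina reads and about one-eighth of the PacBio reads passed the filters.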
Trimmed Illumina and PacBio reads were assembled using Unicycler v0.4.8 (9) and binned using MetaBAT2 (Galaxy v2.17 + galaxy0) (10). GTDB-Tk (Galaxy v2.4.0 + galaxy1) (11) assigned the binned assembly to the species Pseudovibrio ascidiaceicola with 96.86% ANI to GCF_900114245.1. The 16S rRNA sequence matched that of Pseudovibrio sp. FO-BEG1 (CP003147.1), with blastn comparison showing 100% query coverage and 98.79% identity (12).
The draft genome is 6.94 Mb and contains 22 contigs with 51.13% GC content and an N50 value of 699,393 bp. The genome coverage was 3.4× for Illumina reads and 72× for PacBio reads. The genome is 99.99% complete with 0.57% contamination and 6,613 coding sequences, as predicted by CheckM2 (Galaxy v1.0.2 + galaxy1) (13). The genome was annotated using RAST (14, 15), Prokka (Galaxy v1.14.6 + galaxy1) (16), PADLOC v2.0.0 (17), antiSMASH v7.0 (18), Pharokka (Galaxy v1.3.2 + galaxy0) (19), the PHASTEST servers (20), and PGAP (21). PHASTEST revealed 11 prophage regions (Table 1).
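The assembly size and N50 reported above can be cross-checked against the contig lengths listed in Table 1. The sketch below uses the standard N50 definition (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly). It is an illustration, not the QUAST implementation, and the names `n50` and `CONTIG_LENGTHS` are hypothetical.

```python
def n50(lengths: list[int]) -> int:
    """Length of the contig at which the running total of contigs,
    sorted longest first, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Contig lengths (bp) copied from Table 1, contigs 1 through 22.
CONTIG_LENGTHS = [
    1_386_464, 916_547, 765_036, 699_393, 512_900, 434_424,
    401_274, 358_698, 351_948, 176_086, 166_137, 141_274,
    140_715, 129_220, 103_963, 81_258, 81_139, 46_091,
    30_308, 12_744, 5_857, 1_658,
]

assembly_size = sum(CONTIG_LENGTHS)
print(f"contigs: {len(CONTIG_LENGTHS)}")
print(f"assembly size: {assembly_size / 1e6:.2f} Mb")
print(f"N50: {n50(CONTIG_LENGTHS):,} bp")
```

The four longest contigs together account for more than half of the assembly, which is why the N50 equals the length of the fourth contig.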
TABLE 1.
| Contig accession | Length (bp) | Features |
|---|---|---|
| JBOCEA010000001.1 | 1,386,464 | 214943:235296–prophage; 1038288:1057987–terpene-precursor production; 1304912:1326762–betalactone production; 1337972:1370918–prophage |
| JBOCEA010000002.1 | 916,547 | 320644:340598–terpene-precursor production; 344620:390576–prophage; 774671:775024–non-ribosomal peptide synthetase-like production |
| JBOCEA010000003.1 | 765,036 | 135265:149158–prophage (matches Pseudovibrio sp. FO-BEG1, complete genome) (gene transfer agent); 725209:757053–non-ribosomal peptide synthetase-independent, IucA/IucC-like siderophore production |
| JBOCEA010000004.1 | 699,393 | 20673:51518–prophage (gene transfer agent); 33620:53840–homoserine lactone production; 499543:505245–unspecified ribosomally synthesized and post-translationally modified peptide product production |
| JBOCEA010000005.1 a | 512,900 | 392766:452586–N-acyl amino acid production |
| JBOCEA010000006.1 a | 434,424 | 63737:91072–type I polyketide synthase production; 91396:126547–type III polyketide synthase production; 384999:396841–hydrogen cyanide production |
| JBOCEA010000007.1 | 401,274 | 277577:297552–terpene production |
| JBOCEA010000008.1 | 358,698 | 4492:13280–prophage |
| JBOCEA010000009.1 a | 351,948 | |
| JBOCEA010000010.1 | 176,086 | 18554:27744–unspecified ribosomally synthesized and post-translationally modified peptide product production; 162854:171643–prophage |
| JBOCEA010000011.1 | 166,137 | 94063:116682–prophage |
| JBOCEA010000012.1 | 141,274 | |
| JBOCEA010000013.1 a | 140,715 | |
| JBOCEA010000014.1 | 129,220 | 92883:111389–prophage |
| JBOCEA010000015.1 | 103,963 | 1494:39786–prophage |
| JBOCEA010000016.1 a | 81,258 | |
| JBOCEA010000017.1 a | 81,139 | |
| JBOCEA010000018.1 | 46,091 | |
| JBOCEA010000019.1 | 30,308 | |
| JBOCEA010000020.1 | 12,744 | 893:10064–prophage |
| JBOCEA010000021.1 a | 5,857 | |
| JBOCEA010000022.1 | 1,658 | |

a Circular contigs, as predicted by Unicycler.
Prophage regions were predicted by PHASTEST and secondary metabolite production was predicted by antiSMASH. A spreadsheet of complete annotations is available at https://github.com/ldishaw/Pseudovibrio_5337/. The prophage region on contig JBOCEA010000003.1 (https://www.ncbi.nlm.nih.gov/nuccore/JBOCEA010000003.1?report=asn1) matches the genomic sequence of Pseudovibrio sp. FO-BEG1, with 98% query coverage and 80.86% identity. This is the only prophage region with a match that has >50% query coverage.
Empty cells indicate open reading frames of unknown function.
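Each Features cell in Table 1 packs several `start:end–label` entries into one string. The sketch below shows one way to split such a cell into coordinate tuples and to flag features whose intervals intersect. It assumes the coordinates are inclusive positions on the contig, and the helper names `parse_features` and `overlaps` are hypothetical; the example cell is copied from the row for contig JBOCEA010000004.1.

```python
import re

# A coordinate pair "start:end" followed by an en dash, as used in Table 1.
COORD = re.compile(r"(\d+):(\d+)–")


def parse_features(cell: str) -> list[tuple[int, int, str]]:
    """Split a Features cell into (start, end, label) tuples."""
    matches = list(COORD.finditer(cell))
    features = []
    for i, m in enumerate(matches):
        # The label runs until the next coordinate pair or the end of the cell.
        label_end = matches[i + 1].start() if i + 1 < len(matches) else len(cell)
        label = cell[m.end():label_end].strip(" ;")
        features.append((int(m.group(1)), int(m.group(2)), label))
    return features


def overlaps(a: tuple[int, int, str], b: tuple[int, int, str]) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


cell = (
    "20673:51518–prophage (gene transfer agent) "
    "33620:53840–homoserine lactone production "
    "499543:505245–unspecified ribosomally synthesized and "
    "post-translationally modified peptide product production"
)

features = parse_features(cell)
overlapping = [
    (a[2], b[2])
    for i, a in enumerate(features)
    for b in features[i + 1:]
    if overlaps(a, b)
]
for first, second in overlapping:
    print(f"overlap: {first} / {second}")
```

For this row, the only intersecting pair is the gene transfer agent region and the predicted homoserine lactone cluster, whose listed coordinates share the span 33620 to 51518.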
ACKNOWLEDGMENTS
This project was supported by NSF IOS-2226050, NSF MCB-1817308, and NSF IOS-1456301 to LJD.
The authors would like to acknowledge support from the National Science Foundation, USA.
Contributor Information
Larry J. Dishaw, Email: ldishaw@usf.edu.
Kenneth M. Stedman, Portland State University, Portland, Oregon, USA
DATA AVAILABILITY
Sequence data are deposited at the National Center for Biotechnology Information (NCBI) under the BioProject ID PRJNA1261084 and BioSample ID SAMN48504407. Sequenced reads are deposited in the Sequence Read Archive (SRA) under accession numbers SRX28800662 (Illumina) and SRX28800663 (PacBio). The draft assembly is deposited in GenBank under accession number JBOCEA000000000.1.
REFERENCES
1. Alex A, Antunes A. 2018. Genus-wide comparison of Pseudovibrio bacterial genomes reveal diverse adaptations to different marine invertebrate hosts. PLoS One 13:e0194368. doi: 10.1371/journal.pone.0194368
2. Romano S, Fernàndez-Guerra A, Reen FJ, Glöckner FO, Crowley SP, O’Sullivan O, Cotter PD, Adams C, Dobson ADW, O’Gara F. 2016. Comparative genomic analysis reveals a diverse repertoire of genes involved in prokaryote-eukaryote interactions within the Pseudovibrio genus. Front Microbiol 7:387. doi: 10.3389/fmicb.2016.00387
3. Dishaw LJ, Leigh B, Cannon JP, Liberti A, Mueller MG, Skapura DP, Karrer CR, Pinto MR, De Santis R, Litman GW. 2016. Gut immunity in a protochordate involves a secreted immunoglobulin-type mediator binding host chitin and bacteria. Nat Commun 7:10617. doi: 10.1038/ncomms10617
4. Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, Dehal P, Ware D, Perez F, Canon S, et al. 2018. KBase: The United States department of energy systems biology knowledgebase. Nat Biotechnol 36:566–569. doi: 10.1038/nbt.4163
5. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170
6. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi: 10.1093/bioinformatics/btt086
7. Abueg LAL, Afgan E, Allart O, Awan AH, Bacon WA, Baker D, Bassetti M, Batut B, Bernt M, Blankenberg D, et al. 2024. The galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update. Nucleic Acids Res 52:W83–W94. doi: 10.1093/nar/gkae410
8. Ramsey J, Rasche H, Maughmer C, Criscione A, Mijalis E, Liu M, Hu JC, Young R, Gill JJ. 2020. Galaxy and apollo as a biologist-friendly interface for high-quality cooperative phage genome annotation. PLoS Comput Biol 16:e1008214. doi: 10.1371/journal.pcbi.1008214
9. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595
10. Kang DD, Li F, Kirton E, Thomas A, Egan R, An H, Wang Z. 2019. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7:e7359. doi: 10.7717/peerj.7359
11. Chaumeil PA, Mussig AJ, Hugenholtz P, Parks DH. 2020. GTDB-Tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics 36:1925–1927. doi: 10.1093/bioinformatics/btz848
12. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi: 10.1016/S0022-2836(05)80360-2
13. Chklovski A, Parks DH, Woodcroft BJ, Tyson GW. 2023. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat Methods 20:1203–1212. doi: 10.1038/s41592-023-01940-w
14. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, et al. 2008. The RAST server: rapid annotations using subsystems technology. BMC Genomics 9:75. doi: 10.1186/1471-2164-9-75
15. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, Edwards RA, Gerdes S, Parrello B, Shukla M, Vonstein V, Wattam AR, Xia F, Stevens R. 2014. The SEED and the rapid annotation of microbial genomes using subsystems technology (RAST). Nucleic Acids Res 42:D206–D214. doi: 10.1093/nar/gkt1226
16. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi: 10.1093/bioinformatics/btu153
17. Payne LJ, Meaden S, Mestre MR, Palmer C, Toro N, Fineran PC, Jackson SA. 2022. PADLOC: a web server for the identification of antiviral defence systems in microbial genomes. Nucleic Acids Res 50:W541–W550. doi: 10.1093/nar/gkac400
18. Blin K, Shaw S, Vader L, Szenei J, Reitz ZL, Augustijn HE, Cediel-Becerra JDD, de Crécy-Lagard V, Koetsier RA, Williams SE, et al. 2025. antiSMASH 8.0: extended gene cluster detection capabilities and analyses of chemistry, enzymology, and regulation. Nucleic Acids Res 53:W32–W38. doi: 10.1093/nar/gkaf334
19. Bouras G, Nepal R, Houtak G, Psaltis AJ, Wormald PJ, Vreugde S. 2023. Pharokka: a fast scalable bacteriophage annotation tool. Bioinformatics 39:btac776. doi: 10.1093/bioinformatics/btac776
20. Wishart DS, Han S, Saha S, Oler E, Peters H, Grant JR, Stothard P, Gautam V. 2023. PHASTEST: faster than PHASTER, better than PHAST. Nucleic Acids Res 51:W443–W450. doi: 10.1093/nar/gkad382
21. Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the prokaryotic genome annotation pipeline reach with protein family model curation. Nucleic Acids Res 49:D1020–D1028. doi: 10.1093/nar/gkaa1105
