Skip to main content
Microbiology Resource Announcements logoLink to Microbiology Resource Announcements
. 2023 Nov 9;12(12):e00460-23. doi: 10.1128/MRA.00460-23

Draft genomes of 3 cyanobacteria strains and 17 co-habiting proteobacteria assembled from metagenomes

Maximilian Berthold 1,, Martin Albrecht 2, Douglas A Campbell 1, Naaman M Omar 1
Editor: J Cameron Thrash3
PMCID: PMC10720521  PMID: 37943043

ABSTRACT

Cyanobium and Synechococcus are prominent, globally distributed cyanobacteria genera with ecological significance. Here, we report the genomes of the marine Synechococcus sp. CCMP836 and two strains of Cyanobium (CZS25K and CZS48M) along with the genomes of 17 co-occurring proteobacteria. These genomes will improve the strain-specific ecological positions.

KEYWORDS: cyanobacteria, proteobacteria, metagenomics

ANNOUNCEMENT

The global brackish microbiome is dominated by Synechococcus and Cyanobium (1) with more strains available than sequenced genomes. Cyanobium spp. CZS25K and CZS48M were isolated from the southern Baltic Sea in 2017 (surface water, Darss-Zingst lagoon system, a eutrophic lagoon system with a salinity range of 2 to 6 (2), 54.43°N, 12.68°E; see (3) for details on isolation) and obtained from the Applied Ecology Culture Collection at the University of Rostock. Synechococcus sp. CCMP836 (synonyms WH 8007, 838BG) was isolated from the surface waters of the Gulf of Mexico in 1980 (19.75°N, −92.41°W) (4) and obtained from NCMA Bigelow.

CZS25K and CZS48M were grown in freshwater BG11 media (1 ppt salinity), whereas CCMP836 was grown in marine BG11 media (32 ppt salinity) under a 12:12 photoperiod on a shaker for 7 days before DNA extractions. DNA was extracted from xenic cultures of CCMP836, CZS25K, and CZS48M using a Qiagen DNeasy kit. Libraries were prepared using the NEBNext Ultra II DNA Library Prep kit for Illumina (New England Biolabs) as per the manufacturer’s recommendations and sequenced using the Illumina NovaSeq S4 lane using the Xp protocol as per the manufacturer’s recommendations by Genome Québec, Montréal, Canada, yielding 30.6 million, 28.4 million, and 32.4 million paired-end reads of 150 bp, respectively. In a second extraction, DNA was extracted from xenic cultures of CCMP836, CZS25K, and CZS48M using a Qiagen Genomic-tip 20 /G kit. Libraries were prepared using the Pacific Biosciences Preparing whole genome and metagenome libraries using the SMRTbell prep kit 3.0 protocol and sequenced using a PacBio Sequel II (Genome Québec, Montréal, Canada). No size selection was done prior to sequencing. The sequence quality was checked using FASTQC 0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc) and subsequently processed using Cutadapt 4.1 (5). Illumina reads and PacBio CCS reads were error-corrected and de novo assembled by Spades v3.15.4 (6) using the “-s” option for PacBio CCS reads and the “−1” and “−2” options for Illumina reads. Assemblies were aligned and sorted using Bowtie2 v2.4.4 (7) and Samtools v1.17 (8) and binned using MetaBAT2 (9) running on default parameters (Table 1). Samtools v1.17 (8) was also used to calculate genome coverage. Bin reliability was verified using FOCUS 1.5 (10) and CheckM v1.2.0 (11), and gaps were closed by GenomeFinisher (12) under default parameters using alternate Megahit v1.2.9 (13) and GenPipes (14) assemblies. Megahit v1.2.9 was run using the “-r” option for PacBio CCS reads and the “−1” and “−2” options for Illumina reads and the “meta-sensitive” preset. GenPipes was run by Genome Québec, Montréal, Canada, using PacBio reads. The completeness of bins was assessed using BUSCO v5.2.2 (15) (Figure 1) and subsequently annotated using PGAP (16). Taxonomy was assigned using the GTDB-Tk v2.1.1 classify workflow (17). Default parameters were used for all software unless otherwise specified.

TABLE 1.

Taxonomy and attributes of binned metagenome-assembled genomesa

Phylum Order Organism Total length (Mbp) # contigs Largest contig (bp) GC (%) N 50 Genome coverage (×) Source strain Bin Genome contamination (CheckM) National Center for Biotechnology Information (NCBI) genome accession number NCBI Sequence Read Archive (SRA) number
C Synechococcales Cyanobium sp. CZS48M 2.769251 37 457,164 66.93 98,032 3,538.57 CZS48M 9 0.14 JAUCZK000000000 SRR24524117
SRR24524116
C Synechococcales Cyanobium sp. CZS25K 3.100390 34 369,722 67.92 132,422 4,373.01 CZS25K 11 0.82 JAUCZB000000000 SRR24524118
SRR24524119
C Synechococcales Synechococcus sp. CCMP836 2.253937 19 357,230 63.53 155,493 1,145.01 CCMP836 24 0 JAUCYY000000000 SRR24524120
SRR24524121
P Alteromonadales Alteromonas macleodii 4.571015 58 300,811 44.65 130,570 37.377 CZS25K 4 0 JAUCZE000000000 SRR24524118
SRR24524119
P Alteromonadales Alteromonas macleodii 4.503556 49 286,734 44.67 138,090 324.57 CCMP836 21 0 JAUCYV000000000 SRR24524120
SRR24524121
P Burkholderiales Hydrogenophaga sp. 4.017909 52 251,823 66.32 102,357 115.426 CZS48M 2 0 JAUCZG000000000 SRR24524117
SRR24524116
P Burkholderiales Hydrogenophaga sp. 3.855627 10 1,132,111 64.5 446,382 180.227 CZS25K 9 0 JAUCZF000000000 SRR24524118
SRR24524119
P Caulobacterales Maricaulis sp. 2.986590 32 433,653 60.81 129,008 233.057 CCMP836 23 0 JAUCYX000000000 SRR24524120
SRR24524121
P Oceanibaculales Oceanibaculum nanhaiense 3.500712 17 570,757 65.29 317,371 189.412 CZS48M 7 0 JAUCZI000000000 SRR24524117
SRR24524116
P Oceanospirillales Marinobacter salarius 4.275244 62 365,476 57.32 135,686 61.0175 CCMP836 13 0 JAUCYT000000000 SRR24524120
SRR24524121
P Rhizobiales Allorhizobium sp. 4.479681 30 509,794 61.4 227,190 40.1371 CCMP836 11 0 JAUCYR000000000 SRR24524120
SRR24524121
P Rhizobiales Allorhizobium sp. 4.342418 21 823,773 61.46 352,591 199.246 CZS25K 3 0 JAUCZD000000000 SRR24524118
SRR24524119
P Rhodobacterales Roseovarius sp. 4.804078 18 743,804 66.05 328,058 400.304 CCMP836 15 0 JAUCYU000000000 SRR24524120
SRR24524121
P Rhodobacterales Tabrizicola sp. 3.616173 17 987,302 63.3 372,849 319.115 CZS48M 3 0 JAUCZH000000000 SRR24524117
SRR24524116
P Rhodobacterales Rhodobacteraceae sp. 3.599685 21 573,686 63.23 224,738 249.302 CCMP836 22 0 JAUCYW000000000 SRR24524120
SRR24524121
P Rhodospirillales Thalassospira xiamenensis 4.764832 23 970,535 54.71 264,246 254.064 CCMP836 9 0 JAUCYZ000000000 SRR24524120
SRR24524121
P Sphingomonadales Blastomonas fulva 3.805544 56 387,436 64.57 116,093 62.7709 CZS25K 10 0 JAUCZA000000000 SRR24524118
SRR24524119
P Sphingomonadales Blastomonas sp. 3.658526 58 237,960 64.04 112,763 92.1295 CZS25K 1 0 JAUCZC000000000 SRR24524118
SRR24524119
P Sphingomonadales Blastomonas fulva 3.463794 8 1,112,259 64.39 693,030 1031.03 CZS48M 8 0 JAUCZJ000000000 SRR24524117
SRR24524116
P Sphingomonadales Parasphingorhabdus sp. 3.127999 28 434,863 59.39 180,829 20.2539 CCMP836 12 0.84 JAUCYS000000000 SRR24524120
SRR24524121
a

The phylum column indicates whether organisms are cyanobacteria (C) or proteobacteria (P).

Fig 1.

Fig 1

Predicted completeness of cyanobacterial and proteobacterial metagenome-assembled genomes based on core genes as analyzed by BUSCO.

Contributor Information

Maximilian Berthold, Email: mberthold@mta.ca.

J. Cameron Thrash, University of Southern California, Los Angeles, California, USA.

DATA AVAILABILITY

This project has been deposited at the NCBI under BioProject accession number PRJNA956506. The raw sequence metagenomic reads can be located on the SRA under accession numbers SRR24524116 to SRR24524121. Genome assemblies have been deposited at DDBJ/ENA/GenBank under accession numbers JAUCYR000000000 to JAUCZK000000000.

REFERENCES

  • 1. Doré H, Leconte J, Guyet U, Breton S, Farrant GK, Demory D, Ratin M, Hoebeke M, Corre E, Pitt FD, Ostrowski M, Scanlan DJ, Partensky F, Six C, Garczarek L, Blanchard JL. 2022. Global phylogeography of marine Synechococcus in coastal areas reveals strong community shifts. mSystems 7:e0065622. doi: 10.1128/msystems.00656-22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Schiewer U. 2008. Darß-Zingst Boddens, northern Rügener Boddens and Schlei, p 35–86. In Schiewer U (ed), Ecology of Baltic coastal waters. Springer, Berlin, Heidelberg. [Google Scholar]
  • 3. Albrecht M, Pröschold T, Schumann R. 2017. Identification of cyanobacteria in a Eutrophic Coastal lagoon on the Southern Baltic coast. Front Microbiol 8:923. doi: 10.3389/fmicb.2017.00923 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Wood AM. 1985. Adaptation of photosynthetic apparatus of marine ultraphytoplankton to natural light fields. Nature 316:253–255. doi: 10.1038/316253a0 [DOI] [Google Scholar]
  • 5. Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet j 17:10. doi: 10.14806/ej.17.1.200 [DOI] [Google Scholar]
  • 6. Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. 2017. metaSPAdes: a new versatile metagenomic assembler. Genome Res 27:824–834. doi: 10.1101/gr.213959.116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. doi: 10.1038/nmeth.1923 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup . 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079. doi: 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies - PMC. Retrieved May 23 May 2023. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662567. [DOI] [PMC free article] [PubMed]
  • 10. Silva GGZ, Cuevas DA, Dutilh BE, Edwards RA. 2014. FOCUS: an alignment-free model to identify organisms in metagenomes using non-negative least squares. PeerJ 2:e425. doi: 10.7717/peerj.425 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Retrieved 23 May 2023. https://genome.cshlp.org/content/25/7/1043 [DOI] [PMC free article] [PubMed]
  • 12. Guizelini D, Raittz RT, Cruz LM, Souza EM, Steffens MBR, Pedrosa FO. 2016. GFinisher: a new strategy to refine and finish bacterial genome assemblies. Sci Rep 6:34963. doi: 10.1038/srep34963 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. 2015. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31:1674–1676. doi: 10.1093/bioinformatics/btv033 [DOI] [PubMed] [Google Scholar]
  • 14. Bourgey M, Dali R, Eveleigh R, Chen KC, Letourneau L, Fillon J, Michaud M, Caron M, Sandoval J, Lefebvre F, Leveque G, Mercier E, Bujold D, Marquis P, Van PT, Anderson de Lima Morais D, Tremblay J, Shao X, Henrion E, Gonzalez E, Quirion P-O, Caron B, Bourque G. 2019. GenPipes: an open-source framework for distributed and scalable genomic analyses. Gigascience 8:giz037. doi: 10.1093/gigascience/giz037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol 38:4647–4654. doi: 10.1093/molbev/msab199 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. 2022. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics 38:5315–5316. doi: 10.1093/bioinformatics/btac672 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

This project has been deposited at the NCBI under BioProject accession number PRJNA956506. The raw sequence metagenomic reads can be located on the SRA under accession numbers SRR24524116 to SRR24524121. Genome assemblies have been deposited at DDBJ/ENA/GenBank under accession numbers JAUCYR000000000 to JAUCZK000000000.


Articles from Microbiology Resource Announcements are provided here courtesy of American Society for Microbiology (ASM)

RESOURCES