ABSTRACT
Here, we report the genome sequences of eight clinical isolates of Rothia, seven of which were isolated from the upper respiratory tract of people with cystic fibrosis (pwCF). Analyzing the genomes of members of the respiratory microbiome in pwCF can elucidate possible interactions among microbial community members.
KEYWORDS: genome analysis, Rothia, human microbiome
ANNOUNCEMENT
Rothia spp. are prevalent and transcriptionally active members of the microbial community found in cystic fibrosis (CF) sputum (1–4). However, the role of Rothia in the context of CF respiratory disease remains unknown. Therefore, we analyzed the genomes of eight clinical Rothia isolates, seeking to characterize Rothia associated with CF.
Oral swabs (oropharynx) were collected at Columbia University Medical Center under IRB approval (#IRBAAAE8112) from three participants in the Ecology of Cystic Fibrosis (Eco-CF) project (5) and preserved at −80°C in either glycerol or dimethyl sulfoxide. The preserved swabs were streaked onto Heart Infusion agar plates containing lincomycin (5 µg/mL) and colistin sulfate (10 µg/mL) and incubated at 37°C, both anaerobically and aerobically, for 3 days. Colonies with characteristic Rothia morphology were selected and subcultured to purity. Seven of the eight isolates were obtained by this procedure, while one isolate (ILR0003) was obtained through BEI Resources, NIAID, NIH as part of the Human Microbiome Project (6). For DNA extraction, each strain was grown overnight (~17 h) at 37°C with shaking in 10 mL BBL Brain Heart Infusion (BHI; Becton, Dickinson and Co) broth. Cells were harvested by centrifugation and washed 3× with Dulbecco’s PBS (Thermo Fisher Scientific). Genomic DNA was extracted using the Monarch Spin gDNA Extraction kit following the manufacturer’s protocol for Gram-positive bacteria (New England Biolabs). DNA was sent to SeqCoast Genomics (Portsmouth, NH, US) for library preparation and sequencing. Briefly, the DNA was sheared and size selected by on-bead tagmentation (7) using the Illumina DNA Prep Tagmentation kit with unique dual indexes. Sequencing was performed on the Illumina NextSeq2000 platform using a 300-cycle flow cell kit, producing 2 × 150 bp paired-end reads. PhiX control (1%–2%) was spiked into the run to support optimal base calling. Read demultiplexing, read trimming, and run analytics were performed using DRAGEN v3.10.12 (8), the on-board analysis software of the NextSeq2000. The KBase platform was used for genome analyses (9).
After quality checking the reads with FastQC v0.12.1 (10), genomes were assembled using SPAdes v3.15.3 (11) and checked for completeness using CheckM v1.0.18 (12). Assembly quality was assessed using QUAST v4.4 (14), and genomes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (13). Each isolate was identified in KBase from the whole-genome sequences of all isolates using SpeciesTree v2.2.0 (15). Default parameters were used except where otherwise noted.
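Two of the assembly statistics reported below, N50 and GC content, have simple definitions that can be written out directly. The following is a minimal Python sketch of those definitions only; it is not QUAST's implementation, the function names `n50` and `gc_percent` are illustrative, and the contig lengths and sequences in the usage lines are toy values rather than data from this study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running total of contigs,
    sorted longest first, reaches at least half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for a non-empty list


def gc_percent(sequences: Iterable[str]) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases supplied")
    return 100.0 * gc / total


# Toy values for illustration only (not taken from the isolates).
toy_lengths = [500, 300, 200, 100, 50]
toy_contigs = ["GGCC", "AATT"]
print(n50(toy_lengths))
print(gc_percent(toy_contigs))
```

Ambiguous bases such as N are excluded from the GC denominator here; that is one common convention and is an assumption of this sketch.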
Table 1 contains genome information for each isolate, including total reads generated, N50, and coverage. Of the eight isolates, three were identified as R. mucilaginosa and five were identified as R. dentocariosa. The genomes ranged in length from 2,284,086 to 2,579,644 bp, with GC content between 53.65% and 59.53%. The number of NCBI PGAP-predicted genes ranged from 1,806 to 2,239, with R. dentocariosa strains containing more genes (2,139 to 2,239) than R. mucilaginosa strains (1,806 to 1,815).
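The per-species comparison can be reproduced from the values in Table 1. The sketch below groups the isolates by assigned species and reports the count and the ranges of predicted gene number and GC content; the tuple layout and the function name `summarize_by_species` are illustrative choices, and the numbers are transcribed from Table 1.

```python
from collections import defaultdict

# (isolate, species, PGAP-predicted genes, GC %) transcribed from Table 1.
ISOLATES = [
    ("ILR0003", "R. mucilaginosa", 1812, 59.43),
    ("ILR0004", "R. mucilaginosa", 1815, 59.53),
    ("ILR0005", "R. dentocariosa", 2195, 53.66),
    ("ILR0006", "R. dentocariosa", 2194, 53.66),
    ("ILR0007", "R. dentocariosa", 2196, 53.66),
    ("ILR0008", "R. dentocariosa", 2239, 53.65),
    ("ILR0010", "R. mucilaginosa", 1806, 59.48),
    ("ILR0011", "R. dentocariosa", 2139, 53.77),
]


def summarize_by_species(rows):
    """Return isolate count and gene/GC ranges for each species."""
    grouped = defaultdict(list)
    for _isolate, species, genes, gc in rows:
        grouped[species].append((genes, gc))
    summary = {}
    for species, values in grouped.items():
        genes = [g for g, _ in values]
        gcs = [gc for _, gc in values]
        summary[species] = {
            "n": len(values),
            "genes_min": min(genes),
            "genes_max": max(genes),
            "gc_min": min(gcs),
            "gc_max": max(gcs),
        }
    return summary


summary = summarize_by_species(ISOLATES)
for species, stats in sorted(summary.items()):
    print(species, stats)
```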
TABLE 1.
Rothia isolate metadata, accession links, and genome assembly statistics
| Source | HMP | ECOCF0017 | ECOCF0017 | ECOCF0017 | ECOCF0017 | ECOCF0024 | ECOCF0041 | ECOCF0041 |
|---|---|---|---|---|---|---|---|---|
| Isolate | ILR0003 | ILR0004 | ILR0005 | ILR0006 | ILR0007 | ILR0008 | ILR0010 | ILR0011 |
| Taxonomy | R. mucilaginosa | R. mucilaginosa | R. dentocariosa | R. dentocariosa | R. dentocariosa | R. dentocariosa | R. mucilaginosa | R. dentocariosa |
| SRA accession | SRR31931841 | SRR31931840 | SRR31931839 | SRR31931838 | SRR31931837 | SRR31931836 | SRR31931835 | SRR31931834 |
| WGS accession | JBKQAZ000000000 | JBKQBA000000000 | JBKQBB000000000 | JBKQBC000000000 | JBKQBD000000000 | JBKQBE000000000 | JBKQBF000000000 | JBKQBG000000000 |
| Total reads | 1,816,161 | 2,700,816 | 2,588,374 | 3,056,208 | 2,700,534 | 2,777,206 | 3,033,056 | 2,668,962 |
| Length (bp) | 2,293,984 | 2,307,667 | 2,523,791 | 2,523,853 | 2,524,264 | 2,579,644 | 2,284,086 | 2,481,324 |
| Contigs | 30 | 10 | 10 | 8 | 11 | 22 | 12 | 9 |
| GC (%) | 59.43 | 59.53 | 53.66 | 53.66 | 53.66 | 53.65 | 59.48 | 53.77 |
| Genes (#) | 1,812 | 1,815 | 2,195 | 2,194 | 2,196 | 2,239 | 1,806 | 2,139 |
| N50 | 162,261 | 630,408 | 1,446,065 | 1,446,010 | 497,918 | 343,652 | 708,691 | 391,089 |
| Coverage | 119x | 176x | 154x | 182x | 160x | 161x | 199x | 161x |
| Completeness (%) | 99.36 | 98.12 | 99.34 | 99.34 | 99.34 | 99.34 | 98.12 | 99.34 |
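The text does not state how the coverage values in Table 1 were calculated. A common estimate is total sequenced bases divided by assembly length. Assuming every counted read contributes the full 150 bp (an assumption of this sketch that ignores trimming), that estimate rounds to the reported coverage for all eight isolates. The names `READ_LENGTH_BP`, `TABLE_1`, and `estimated_coverage` are illustrative.

```python
READ_LENGTH_BP = 150  # assumed per-read length, from the 2 x 150 bp run

# isolate: (total reads, assembly length in bp, reported coverage),
# transcribed from Table 1.
TABLE_1 = {
    "ILR0003": (1_816_161, 2_293_984, 119),
    "ILR0004": (2_700_816, 2_307_667, 176),
    "ILR0005": (2_588_374, 2_523_791, 154),
    "ILR0006": (3_056_208, 2_523_853, 182),
    "ILR0007": (2_700_534, 2_524_264, 160),
    "ILR0008": (2_777_206, 2_579_644, 161),
    "ILR0010": (3_033_056, 2_284_086, 199),
    "ILR0011": (2_668_962, 2_481_324, 161),
}


def estimated_coverage(total_reads: int, genome_length_bp: int,
                       read_length_bp: int = READ_LENGTH_BP) -> float:
    """Mean depth estimated as sequenced bases / assembly length."""
    return total_reads * read_length_bp / genome_length_bp


for isolate, (reads, length, reported) in TABLE_1.items():
    estimate = estimated_coverage(reads, length)
    print(f"{isolate}: estimated {estimate:.1f}x, reported {reported}x")
```

Mapping the reads back to each assembly would give a more direct depth measurement; the estimate above is only a consistency check on the tabulated values.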
ACKNOWLEDGMENTS
We would like to thank all members of the Ibberson and Lewin labs for careful reading of this manuscript. S.R.W. was supported in part by grant 006442G223 from the Cystic Fibrosis Foundation to C.B.I. This work was also supported by funds from the Doris Duke Foundation #2012060 and #2019106 to P.J.P.
Contributor Information
Carolyn B. Ibberson, Email: ibberson@utk.edu.
David Rasko, University of Maryland School of Medicine, Baltimore, Maryland, USA.
DATA AVAILABILITY
The raw sequencing paired-end reads and assembled whole-genome sequence (WGS) for each isolate were deposited in the Sequence Read Archive (SRA) and GenBank databases, respectively. The SRA and WGS accession numbers and links are listed in Table 1. Raw sequences, genome assemblies, and PGAP annotation files can be found under BioProject number PRJNA1208058.
REFERENCES
- 1. Adekoya AE, Kargbo HA, Ibberson CB. 2023. Defining microbial community functions in chronic human infection with metatranscriptomics. mSystems 8:e00593–23. doi: 10.1101/2023.06.06.543868
- 2. Carmody LA, Zhao J, Kalikin LM, LeBar W, Simon RH, Venkataraman A, Schmidt TM, Abdo Z, Schloss PD, LiPuma JJ. 2015. The daily dynamics of cystic fibrosis airway microbiota during clinical stability and at exacerbation. Microbiome 3:12. doi: 10.1186/s40168-015-0074-9
- 3. Whelan FJ, Heirali AA, Rossi L, Rabin HR, Parkins MD, Surette MG. 2017. Longitudinal sampling of the lung microbiota in individuals with cystic fibrosis. PLOS One 12:e0172811. doi: 10.1371/journal.pone.0172811
- 4. Guss AM, Roeselers G, Newton ILG, Young CR, Klepac-Ceraj V, Lory S, Cavanaugh CM. 2011. Phylogenetic and metabolic diversity of bacteria associated with cystic fibrosis. ISME J 5:20–29. doi: 10.1038/ismej.2010.88
- 5. Planet PJ. 2024. Data Set Eco-CF. MicrobiomeDB. Available from: https://microbiomedb.org/mbio/app/workspace/analyses/DS_d20b9c4094/new/details
- 6. The Human Microbiome Project Consortium. 2012. Structure, function and diversity of the healthy human microbiome. Nature 486:207–214. doi: 10.1038/nature11234
- 7. Illumina, Inc. 2025. Illumina DNA prep overview. Available from: https://support-docs.illumina.com/LP/IlluminaDNAPrep/Content/LP/Illumina_DNA/DNA-Prep/Overview.htm
- 8. Illumina, Inc. 2024. DRAGEN secondary analysis. Available from: https://www.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html
- 9. Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, Dehal P, Ware D, Perez F, Canon S, et al. 2018. KBase: The United States department of energy systems biology knowledgebase. Nat Biotechnol 36:566–569. doi: 10.1038/nbt.4163
- 10. Andrews S. 2010. FastQC: a quality control tool for high throughput sequence data. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc
- 11. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021
- 12. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055. doi: 10.1101/gr.186072.114
- 13. Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the prokaryotic genome annotation pipeline reach with protein family model curation. Nucleic Acids Res 49:D1020–D1028. doi: 10.1093/nar/gkaa1105
- 14. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi: 10.1093/bioinformatics/btt086
- 15. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLOS One 5:e9490. doi: 10.1371/journal.pone.0009490
