ABSTRACT
We report the draft genome sequence of Enterococcus gallinarum strain TY-1, isolated from human saliva in China. The genome is 3.4 Mb in size with a G + C content of 40.5% and encodes 3,150 predicted proteins, seven rRNAs, 56 tRNAs, four ncRNAs, and 76 pseudogenes.
KEYWORDS: Enterococcus gallinarum, draft genome sequence, human saliva
ANNOUNCEMENT
The genus Enterococcus, renowned for its strong adaptability, is commonly found in food, the environment, and the gastrointestinal tracts of humans and animals (1). Enterococcus gallinarum is an underdiagnosed but clinically significant opportunistic pathogen (2); however, reports of its isolation from human saliva and genomic characterization of oral isolates remain limited. To address this gap, we isolated E. gallinarum strain TY-1 from human saliva.
For sample collection, participants were instructed to abstain from food and drink for 1 h prior to sampling and to expectorate 1–3 mL of naturally secreted saliva into a sterile collection tube. In a biosafety cabinet, 1 mL of each saliva sample was serially diluted in sterile normal saline to 10⁻³, 10⁻⁴, and 10⁻⁵. Subsequently, 0.1-mL aliquots of each dilution were spread-plated onto MRS agar (Solarbio, China) supplemented with 0.5% (w/v) CaCO₃ (Solarbio, China). Inoculated plates were incubated anaerobically at 37°C for 48–72 h. Colonies exhibiting a clear CaCO₃-dissolution halo were selected, restreaked twice to ensure purity, and Gram-stained for preliminary identification as lactic acid bacteria. Genomic DNA was extracted from pure cultures using the TIANamp Bacteria DNA Kit (DP302; Tiangen, China). The 16S rRNA gene was amplified using universal primers 27F and 1492R (Table 1) and sequenced by Sanger sequencing for species identification.
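The dilution and plating scheme above determines how a plate count would map back to the undiluted saliva. The following is a minimal sketch of that back-calculation, assuming a standard spread-plate count; the source reports no colony counts, so the function name `cfu_per_ml` and the count of 42 colonies are hypothetical and purely illustrative.

```python
def cfu_per_ml(colonies: int, dilution: float, plated_ml: float = 0.1) -> float:
    """Back-calculate CFU per mL of undiluted sample from a single plate.

    colonies  -- colonies counted on the plate (hypothetical input)
    dilution  -- dilution of the plated suspension, e.g. 1e-4 for 10^-4
    plated_ml -- volume spread on the plate; 0.1 mL as described in the text
    """
    if colonies < 0 or dilution <= 0 or plated_ml <= 0:
        raise ValueError("colonies must be >= 0; dilution and volume must be > 0")
    # Each plate carries (dilution * plated_ml) mL of the original sample.
    return colonies / (dilution * plated_ml)


if __name__ == "__main__":
    # Hypothetical count, not a value from the study.
    for exponent in (3, 4, 5):
        estimate = cfu_per_ml(colonies=42, dilution=10.0 ** -exponent)
        print(f"10^-{exponent} plate: {estimate:.2e} CFU/mL")
```

In this study the plates were used to pick halo-forming colonies rather than to enumerate cells, so the calculation only illustrates what each dilution represents.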
TABLE 1.
16S rRNA gene and genome assembly statistics of E. gallinarum TY-1
| Category | Specific content |
|---|---|
| 16S rRNA gene | |
| GenBank accession | PX452890 |
| Forward primer 27F (5′–3′) | AGAGTTTGATCCTGGCTCAG |
| Reverse primer 1492R (5′–3′) | CGGCTACCTTGTTACGACTT |
| Reaction mixture (25 µL) | 1 µL DNA template (10–100 ng), 0.8 µL each primer (10 µmol/L), 12.5 µL 2× PrimeSTAR HS DNA polymerase (Cat. R010A, Takara, China), and 9.9 µL ddH₂O |
| Amplification conditions | 95°C for 5 min; 32 cycles of 95°C for 30 s, 55°C for 45 s, and 72°C for 90 s; 72°C for 10 min |
| Sequencing kit and system | BigDye Terminator v1.1 (4336701, Thermo Fisher) and ABI3730XL (Applied Biosystems, USA) |
| Genome feature | |
| GenBank assembly | ASM5060942v1 |
| Total length (bp) | 3,363,870 |
| Number of scaffolds | 52 |
| Scaffold N50 (kb) | 208.6 |
| Scaffold L50 | 5 |
| GC content (%) | 40.5 |
| Total genes | 3,293 |
| CDSs (total) | 3,226 |
| CDSs (protein-coding sequences) | 3,150 |
| rRNAs (5S, 16S, 23S) | 4, 1, 2 |
| Complete rRNAs (5S, 16S) | 1, 1 |
| Partial rRNAs (5S, 23S) | 3, 2 |
| tRNAs | 56 |
| ncRNAs | 4 |
| Pseudo genes (total) | 76 |
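
The gene counts in Table 1 can be cross-checked with simple arithmetic. The sketch below assumes, as is typical for PGAP summaries, that the total CDS count is the sum of protein-coding CDSs and pseudogenes, and that total genes are the sum of CDSs and RNA genes; that reading is an assumption, and the names `TABLE1` and `check_gene_counts` are illustrative.

```python
# Values copied from Table 1.
TABLE1 = {
    "total_genes": 3293,
    "cds_total": 3226,
    "cds_protein_coding": 3150,
    "pseudogenes": 76,
    "rrna": {"5S": 4, "16S": 1, "23S": 2},
    "trna": 56,
    "ncrna": 4,
}


def check_gene_counts(table: dict) -> dict:
    """Report whether the component counts add up to the stated totals."""
    rrna_total = sum(table["rrna"].values())
    cds_sum = table["cds_protein_coding"] + table["pseudogenes"]
    gene_sum = table["cds_total"] + rrna_total + table["trna"] + table["ncrna"]
    return {
        "rrna_total": rrna_total,
        "cds_total_matches": cds_sum == table["cds_total"],
        "total_genes_matches": gene_sum == table["total_genes"],
    }


if __name__ == "__main__":
    for key, value in check_gene_counts(TABLE1).items():
        print(f"{key}: {value}")
```

Under these assumptions the table is internally consistent: 3,150 + 76 = 3,226 CDSs, and 3,226 + 7 + 56 + 4 = 3,293 genes.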
A DNA library was prepared with the MagPure Bacterial DNA Kit (D6361-02; Magen, China). DNA was fragmented to 200–400 bp, end-repaired, 3′-adenylated, adapter-ligated, PCR-amplified, and size-selected with Hieff NGS DNA selection beads (cat. no. 12601) according to the manufacturer’s instructions. Libraries were quantified using a Qubit 4.0 fluorometer (Q33226; Thermo Fisher Scientific) with the high-sensitivity kit, size-verified on a 2% agarose gel, and paired-end sequenced (2 × 150 bp) on an Illumina NovaSeq 6000 (Sangon Biotech, China) (3), yielding 14.7 million raw paired-end reads. Adapters and low-quality bases were trimmed with fastp v0.23.0 (-q 15 -u 40 -n 5 -e 0 -l 35 -w 3) (4), reads were assembled de novo with SPAdes v3.5.0 (--careful -k 33,55,77,99) (5), and the assembly was polished with PrInSeS-G v1.0.0 (6). De novo repeat libraries were constructed with RepeatModeler v2.0.6, and repeats were masked with RepeatMasker v4.1.5 (7, 8). Genome annotation (CDSs, tRNAs, rRNAs, ncRNAs, and pseudogenes) was performed with the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v6.10 and Prokka v1.10 (9, 10). Functional classification of predicted proteins was carried out using the Clusters of Orthologous Genes (11) and Kyoto Encyclopedia of Genes and Genomes (12) databases. Default parameters were used for all software except where otherwise noted.
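As an illustration of the trimming and assembly steps, the sketch below builds the corresponding command lines without executing them. It reads the reported length filter and k-mer list as fastp's `-l` and SPAdes's `-k` options, following each tool's documented syntax; the file names, output directory, and function names are hypothetical, and the authors' exact invocations are not given in the text.

```python
import shlex


def fastp_cmd(r1: str, r2: str, out1: str, out2: str) -> list[str]:
    """Build a fastp call using the trimming parameters reported in the text."""
    return [
        "fastp",
        "-i", r1, "-I", r2,      # paired-end input
        "-o", out1, "-O", out2,  # paired-end output
        "-q", "15",              # base quality threshold (Phred)
        "-u", "40",              # max percent of unqualified bases per read
        "-n", "5",               # max number of N bases per read
        "-e", "0",               # average-quality filter (0 disables it)
        "-l", "35",              # minimum read length after trimming
        "-w", "3",               # worker threads
    ]


def spades_cmd(r1: str, r2: str, outdir: str) -> list[str]:
    """Build a SPAdes call with --careful and the reported k-mer sizes."""
    return [
        "spades.py", "--careful",
        "-k", "33,55,77,99",
        "-1", r1, "-2", r2,
        "-o", outdir,
    ]


if __name__ == "__main__":
    # Hypothetical file names; nothing is run, the commands are only printed.
    trim = fastp_cmd("TY1_R1.fq.gz", "TY1_R2.fq.gz", "trim_R1.fq.gz", "trim_R2.fq.gz")
    asm = spades_cmd("trim_R1.fq.gz", "trim_R2.fq.gz", "TY1_spades")
    print(shlex.join(trim))
    print(shlex.join(asm))
```

Building argument lists rather than shell strings keeps each option and its value explicit and avoids quoting problems if the lists are later passed to `subprocess.run`.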
The draft genome of E. gallinarum TY-1 comprises 3,363,870 bp assembled into 52 contigs (N50 = 208.6 kb; G + C content = 40.5%). Annotation identified 3,293 genes: 3,150 protein-coding CDSs, seven rRNAs, 56 tRNAs, four ncRNAs, and 76 pseudogenes (Table 1). Mean sequencing depth was 346.2×, and CheckM v1.2.3 estimated completeness at 92.47% with 4.8% contamination (13).
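For readers unfamiliar with the contiguity metrics in Table 1, N50 is the length of the sequence at which the cumulative length, summed from longest to shortest, first reaches half of the assembly, and L50 is the number of sequences needed to get there. The following is a minimal sketch of that definition; the function name `n50_l50` and the toy lengths are hypothetical and are not the TY-1 scaffold lengths.

```python
def n50_l50(lengths: list[int]) -> tuple[int, int]:
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running sum to half the
        # assembly defines N50; its rank in the sorted list is L50.
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: the running sum always reaches half")


if __name__ == "__main__":
    toy_lengths = [400, 300, 200, 100]  # hypothetical lengths, total 1,000
    n50, l50 = n50_l50(toy_lengths)
    print(f"N50 = {n50}, L50 = {l50}")
```

Applied to the actual assembly, the same definition corresponds to the reported values: five sequences account for at least half of the 3,363,870 bp, and the shortest of those five is 208.6 kb long.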
ACKNOWLEDGMENTS
This work was supported by the Doctoral Foundation of Qingdao Binhai University (BS2023B002), the Qingdao Binhai University Research and Innovation Platform Program (PTK202501), the Natural Science Foundation of Shandong Province (ZR2022QD153, ZR2023MH098), and the Innovation Team of the Molecular Oncology Research Laboratory. Illumina sequencing was performed by Sangon Biotech (Shanghai, China) (http://www.sangon.com). The manuscript was language-edited with AI-assisted tools (Kimi, Doubao).
Contributor Information
Tingting Tang, Email: sunnyting1230@126.com.
Zhenjiang Zech Xu, Nanchang University, Nanchang, Jiangxi, China.
DATA AVAILABILITY
The whole-genome shotgun sequence of Enterococcus gallinarum strain TY-1 has been deposited at DDBJ/ENA/GenBank under accession JBODOG000000000 (BioProject PRJNA1266649, BioSample SAMN48682517, SRA SRR34082452).
ETHICS APPROVAL
This study was conducted at Qingdao Binhai University and approved by the Research Ethics Committee (approval no. QBU2024-9). Written informed consent was obtained from all participants.
REFERENCES
- 1. Sakoui S, Derdak R, Pop OL, Vodnar DC, Jouga F, Teleky B-E, Addoum B, Simon E, Suharoschi R, Soukri A, El Khalfi B. 2024. Exploring technological, safety and probiotic properties of Enterococcus strains: impact on rheological parameters in fermented milk. Foods 13:586. doi: 10.3390/foods13040586
- 2. Hyderi Z, Saravanan K, M S DI, Ravi AV. 2026. The emerging clinical relevance of Enterococcus gallinarum: a roadmap for future research and diagnostics. Diagn Microbiol Infect Dis 114:117104. doi: 10.1016/j.diagmicrobio.2025.117104
- 3. Minoche AE, Dohm JC, Himmelbauer H. 2011. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol 12:R112. doi: 10.1186/gb-2011-12-11-r112
- 4. Chen S, Zhou Y, Chen Y, Gu J. 2018. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560
- 5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021
- 6. Massouras A, Hens K, Gubelmann C, Uplekar S, Decouttere F, Rougemont J, Cole ST, Deplancke B. 2010. Primer-initiated sequence synthesis to detect and assemble structural variants. Nat Methods 7:485–486. doi: 10.1038/nmeth.f.308
- 7. Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci USA 117:9451–9457. doi: 10.1073/pnas.1921046117
- 8. Tarailo-Graovac M, Chen N. 2009. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics Chapter 4:4. doi: 10.1002/0471250953.bi0410s25
- 9. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569
- 10. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi: 10.1093/bioinformatics/btu153
- 11. Tatusov RL, Galperin MY, Natale DA, Koonin EV. 2000. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res 28:33–36. doi: 10.1093/nar/28.1.33
- 12. Kanehisa M, Goto S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28:27–30. doi: 10.1093/nar/28.1.27
- 13. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055. doi: 10.1101/gr.186072.114
