Highlights
• A rare non-transitive dDDH conflict within Sphingobacterium siyangense exposes the limitation of the fixed 70% threshold.
• Phylogenomic and whole-genome evidence support Sphingobacterium siyangense as a single cohesive species.
• The genus Sphingobacterium has an extremely open pan-genome, shaped by extensive gene flux.
• An integrative genomic framework resolves species boundaries more reliably than single pairwise metrics.
• Proposed taxonomic revisions include designating Sphingobacterium ginsenosidimutans as a synonym of Sphingobacterium detergens and reclassifying Rhinopithecimicrobium faecis as Sphingobacterium faecis comb. nov.
Keywords: Sphingobacterium, Non-transitive dDDH, Species delimitation, Phylogenomics, Comparative genomics, Pan-genome, Prokaryotic taxonomy
Abstract
Accurate species delimitation in prokaryotes increasingly relies on genome-scale comparisons, yet fixed genomic thresholds can be unreliable in lineages shaped by extensive gene flux. In this study, we revisited the taxonomy of the genus Sphingobacterium using phylogenomic reconstruction and comprehensive whole-genome comparisons. The genus displays a highly open pan-genome, with only 22 universally conserved genes and nearly 200,000 cloud genes, indicating pronounced genomic plasticity. Within this complex evolutionary context, we detected a rare non-transitive paradox in digital DNA-DNA hybridization (dDDH) within the Sphingobacterium siyangense group. All strains share average nucleotide identity (ANI) values above the accepted species boundary (95%), yet some strain pairs exhibit dDDH values below the species threshold (70%), resulting in a conflict restricted to this metric. Phylogenomic analyses, core genome variation, average amino acid identity (AAI) patterns, and functional gene profiles consistently support the monophyly and genomic cohesion of these strains, showing that dependence on dDDH alone may lead to ambiguous species boundaries. Based on the combined evidence, we treat all members of the S. siyangense cluster as a single species and propose additional taxonomic revisions. Sphingobacterium ginsenosidimutans is recognized as a heterotypic synonym of Sphingobacterium detergens. The species Rhinopithecimicrobium faecis is proposed for reclassification as Sphingobacterium faecis comb. nov. These findings demonstrate that rigid dDDH cutoffs cannot fully capture evolutionary relationships and highlight the value of integrating phylogenomic and pan-genomic evidence for resolving complex species-level classifications in prokaryotes.
1. Introduction
In the post-genomic era, prokaryotic taxonomy has undergone a profound transformation with the rapid advancement of whole-genome sequencing technologies (Coenye et al., 2005; Konstantinidis and Tiedje, 2007). Traditional polyphasic taxonomy, which relies on phenotypic, chemotaxonomic, and 16S rRNA gene-based characteristics, is increasingly complemented, and in many cases replaced, by genome-based approaches (Hayashi Sant’Anna et al., 2019; Rajkumari et al., 2022; Riesco and Trujillo, 2024). Among these, average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) have emerged as widely accepted metrics for species delineation, with empirical threshold values of 95–96% ANI and/or 70% dDDH (Auch et al., 2010; Kim et al., 2014; Riesco and Trujillo, 2024). These genome-derived metrics provide higher resolution and reproducibility than traditional methods, and are now recommended by the International Committee on Systematics of Prokaryotes (ICSP) for defining prokaryotic species boundaries (Riesco and Trujillo, 2024).
However, fixed numerical thresholds have limitations, particularly in borderline cases where genome similarity is close to the cutoff (Jain et al., 2018; Riesco and Trujillo, 2024; Sentausa and Fournier, 2013). A conceptually significant but often overlooked issue is the non-transitive nature of ANI, dDDH, and other indices (Meier-Kolthoff and Göker, 2019; Parks et al., 2020; Rodriguez-R et al., 2024). Since these metrics are calculated independently for each genome pair, they do not necessarily conform to transitive logic. This can result in non-transitive relationships, where genomes A and B, and B and C meet species-level criteria, but A and C do not, despite their link through B (Parks et al., 2020). Such inconsistencies undermine the coherence of species groupings, challenging the reliability of binary frameworks for species classification. These issues emphasize the need for integrative approaches that combine quantitative metrics with phylogenomic context and functional genome annotation comparisons to better reflect evolutionary relationships.
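The non-transitive case described above (A–B and B–C meet the criterion, A–C does not) can be made concrete with a minimal Python sketch. It is an illustration only, not part of the study's workflow: the function name, the data layout, and the similarity values are hypothetical, and a pair is assumed to "fail" when its value is below the threshold.

```python
from itertools import combinations


def find_nontransitive_triads(sim, threshold):
    """Return triads in which exactly one of the three pairs falls below the threshold.

    sim maps frozenset({genome_1, genome_2}) to a pairwise similarity value (%).
    Each result is ((a, b, c), failing_pair).
    """
    genomes = sorted({g for pair in sim for g in pair})
    triads = []
    for a, b, c in combinations(genomes, 3):
        pairs = [(a, b), (b, c), (a, c)]
        below = [p for p in pairs if sim[frozenset(p)] < threshold]
        # Two pairs pass and one fails: the grouping is not transitive.
        if len(below) == 1:
            triads.append(((a, b, c), below[0]))
    return triads


# Hypothetical dDDH values (%), chosen only to illustrate the pattern.
example_dddh = {
    frozenset({"A", "B"}): 72.0,
    frozenset({"B", "C"}): 71.0,
    frozenset({"A", "C"}): 66.0,
}

result = find_nontransitive_triads(example_dddh, threshold=70.0)
```

With these illustrative values, the only triad is flagged because the A–C pair alone falls below the 70% cutoff, even though both genomes are linked to B above it.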
The accuracy of genome-based taxonomy also depends on the availability of high-quality genome sequences for type strains (Hugenholtz et al., 2021). Before 2018, most of the novel taxa were described using only phenotypic traits and 16S rRNA gene sequence data (Chun et al., 2018), leading to gaps and ambiguities in modern classifications. Recent large-scale efforts, such as those led by the World Data Centre for Microorganisms (WDCM), have substantially expanded the genomic coverage of type strains, enabling more rigorous taxonomic re-evaluation and refinement across diverse prokaryotic lineages (Fan et al., 2024; Shi et al., 2020; Wu and Ma, 2019).
Sphingobacterium, a genus within the phylum Bacteroidota, class Sphingobacteriia, order Sphingobacteriales, and family Sphingobacteriaceae, comprises 71 validly published species as recorded in the List of Prokaryotic names with Standing in Nomenclature (LPSN; https://lpsn.dsmz.de/genus/sphingobacterium) (Parte et al., 2020). These Gram-negative, non-spore-forming, facultatively anaerobic bacteria are commonly found in soil, water, gut, and plant-associated environments (Ahmed et al., 2014; Kim et al., 2024; Tao et al., 2024; Zhang et al., 2024; Zhang et al., 2021). Known for their sphingolipid-rich membranes, they exhibit resistance to antibiotics and environmental stresses (Sohlenkamp and Geiger, 2015). Many Sphingobacterium species also possess biotechnological potential, including the ability to degrade complex organic compounds such as plastics and industrial pollutants, as well as the production of exopolysaccharides for bioremediation and biotechnology applications (An et al., 2011; Figueiredo et al., 2022; Khampratueng and Anal, 2026; Tan et al., 2022).
Despite the increasing availability of genomic data for Sphingobacterium type and non-type strains, the taxonomy of this genus remains complex and unresolved in certain respects. Previous taxonomic studies have struggled to establish clear species boundaries, and genome-based analyses have highlighted ambiguities in species delineation. In our previous reassessment of Sphingobacterium, we identified subspecies-level divergence, underscoring the complexities of species classification within this genus (Li et al., 2024). With the release of additional genome sequences for various Sphingobacterium strains, we have an opportunity to revisit and refine the taxonomic relationships of the genus. In this re-evaluation, we encountered three key issues: (1) a non-transitive genome similarity paradox, (2) unexpectedly high genomic similarity between strains classified as separate species, and (3) the misclassification of a species under a different genus, necessitating its reclassification within Sphingobacterium. In this study, we address these challenges using whole-genome comparisons, aiming to clarify species boundaries and emphasize the need for integrating phylogenomic insights into taxonomic frameworks for more accurate species classification.
2. Methods
2.1. Genome data collection of Sphingobacterium strains
A complete list of validly published species in the family Sphingobacteriaceae was collected from LPSN (https://lpsn.dsmz.de/family/sphingobacteriaceae). To comprehensively clarify the phylogenomic relationships within the genus Sphingobacterium, we retrieved all publicly available genomes of type strains of Sphingobacteriaceae species (n=264) from the GenBank database. This dataset spanned 19 genera: Albibacterium (n=3), Anseongella (n=1), Arcticibacter (n=4), Daejeonella (n=3), Desertivirga (n=3), Hufsiella (n=2), Mucilaginibacter (n=72), Nubsella (n=1), Olivibacter (n=4), Paradesertivirga (n=1), Parapedobacter (n=10), Pararcticibacter (n=1), Pedobacter (n=89), Pelobium (n=1), Pseudopedobacter (n=2), Rhinopithecimicrobium (n=1), Rubrolithibacter (n=1), Solitalea (n=5), and Sphingobacterium (n=60). To further assess whether the observed non-transitive relationship within Sphingobacterium siyangense reflects strain-specific variation, we additionally included all publicly available genomes from cultured S. siyangense strains (n=21) from GenBank. Incorporation of these non-type genomes enabled a broader population-level comparison and provided deeper insight into the genomic coherence and non-transitivity patterns analyzed in this study. Genome quality was estimated by CheckM2 v1.0.2 (Chklovski et al., 2023), and only the genomes with completeness >95% and contamination <5% were retained for downstream analyses (Riesco and Trujillo, 2024).
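The quality filter stated above (completeness >95% and contamination <5%) can be sketched in a few lines of Python. This is a hedged illustration rather than the script used in the study: it assumes a tab-separated quality report with columns named Name, Completeness, and Contamination, and the three example rows are invented.

```python
import csv
import io

MIN_COMPLETENESS = 95.0   # retain genomes with completeness > 95%
MAX_CONTAMINATION = 5.0   # retain genomes with contamination < 5%


def passes_quality(row):
    """Apply both quality criteria to one row of the report."""
    return (float(row["Completeness"]) > MIN_COMPLETENESS
            and float(row["Contamination"]) < MAX_CONTAMINATION)


def filter_report(handle):
    """Return the names of genomes that pass both criteria."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["Name"] for row in reader if passes_quality(row)]


# Invented rows standing in for a real quality report (column names are an assumption).
example_report = io.StringIO(
    "Name\tCompleteness\tContamination\n"
    "genome_a\t99.8\t0.4\n"
    "genome_b\t93.1\t1.2\n"
    "genome_c\t98.5\t6.7\n"
)

kept = filter_report(example_report)
```

Here genome_b is dropped for low completeness and genome_c for high contamination, so only genome_a is kept.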
2.2. 16S rRNA gene-based phylogeny analysis
The 16S rRNA gene sequences of Sphingobacteriaceae species were retrieved from the NCBI database (https://www.ncbi.nlm.nih.gov/). Multiple sequence alignment was carried out in MEGA v12.0.11 (Kumar et al., 2024) using the ClustalW algorithm (Thompson et al., 1994). Phylogenetic relationships were inferred with the neighbor-joining method under Kimura’s two-parameter model (Kimura, 1980), and branch support was assessed with 1,000 bootstrap replicates. Positions containing gaps or missing data were excluded from the analysis. Algoriphagus aquimarinus DSM 23399T (jgi.1068028) was used as the outgroup.
2.3. Whole-genome-based phylogeny analysis
For phylogenomic reconstruction, the Genome Taxonomy Database Toolkit (GTDB-Tk; v2.4.1) (Chaumeil et al., 2022) with GTDB Release 226 (Parks et al., 2021) was employed to identify, extract, and align the concatenated set of 120 conserved single-copy bacterial marker genes (bac120). A maximum-likelihood phylogenomic tree was reconstructed using IQ-TREE 2 v2.3.5 (Minh et al., 2020) with the best-fitting substitution model of ‘Q.insect+F+I+R10’, as determined by ModelFinder v2.6 (Kalyaanamoorthy et al., 2017) according to the lowest Bayesian Information Criterion (BIC) score. Node support was estimated using 1,000 ultrafast bootstrap replicates. To validate the phylogenomic placement of all taxa, the ‘Up-to-date Bacterial Core Gene’ (UBCG) pipeline v3.0 (Na et al., 2018), based on 92 conserved bacterial core genes, was additionally applied. Moreover, the whole-genome sequence-based Genome BLAST Distance Phylogeny (GBDP) tree was also inferred in the TYGS platform (Meier-Kolthoff et al., 2022). All phylogenomic trees were visualized and annotated using the online tool Interactive Tree Of Life (iTOL) v7.1 (https://itol.embl.de/) (Letunic and Bork, 2024).
2.4. Calculations of genome-based similarity indices
Overall genome relatedness indices (OGRI), including average nucleotide identity (ANI), digital DNA–DNA hybridization (dDDH), and average amino acid identity (AAI), were used to calculate pairwise genomic similarities. ANI values were calculated using the Orthologous Average Nucleotide Identity Tool (OrthoANIu, https://www.ezbiocloud.net/tools/ani) (Lee et al., 2016). dDDH values were determined using the Genome-to-Genome Distance Calculator (GGDC; v3.0) web server, based on formula 2 (Meier-Kolthoff et al., 2022). According to established genomic criteria, strains with ANI >95–96% or dDDH >70% are considered conspecific (Riesco and Trujillo, 2024), while dDDH values below 70% typically indicate distinct species (Meier-Kolthoff et al., 2014). The AAI values were calculated using CompareM v0.1.2 (https://github.com/dparks1134/CompareM), with values >95% indicating that the strains belong to the same species (Konstantinidis et al., 2017).
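As a sketch of how the ANI and dDDH criteria above interact for a single genome pair, the following Python snippet checks whether the two metrics agree and whether the dDDH confidence interval spans the 70% cutoff. The class and function names are hypothetical. The S. detergens–S. ginsenosidimutans values are those reported in Section 3.3; for the S. siyangense–‘S. paramultivorum’ pair, the dDDH value and interval are as reported, while the ANI value of 95.6% is a placeholder assumption, since the text states only that this ANI exceeds 95%.

```python
from dataclasses import dataclass

ANI_THRESHOLD = 95.0    # lower bound of the 95-96% ANI species boundary
DDDH_THRESHOLD = 70.0   # dDDH species boundary


@dataclass
class PairOGRI:
    ani: float        # average nucleotide identity (%)
    dddh: float       # digital DNA-DNA hybridization point estimate (%)
    dddh_ci: tuple    # (lower, upper) bounds of the dDDH confidence interval


def assess(pair):
    """Return (call, ci_spans_threshold) for one genome pair."""
    ani_same = pair.ani > ANI_THRESHOLD
    dddh_same = pair.dddh > DDDH_THRESHOLD
    lower, upper = pair.dddh_ci
    ci_spans_threshold = lower <= DDDH_THRESHOLD <= upper
    if ani_same and dddh_same:
        call = "concordant: same species"
    elif not ani_same and not dddh_same:
        call = "concordant: different species"
    else:
        call = "discordant"
    return call, ci_spans_threshold


# Values reported in Section 3.3.
detergens_pair = PairOGRI(ani=96.5, dddh=71.2, dddh_ci=(68.2, 74.0))
# dDDH and CI as reported; the ANI value is a placeholder (reported only as >95%).
siyangense_pair = PairOGRI(ani=95.6, dddh=65.7, dddh_ci=(62.8, 68.6))

detergens_call = assess(detergens_pair)
siyangense_call = assess(siyangense_pair)
```

Reporting the interval alongside the point estimate makes borderline cases visible: a pair can pass the dDDH cutoff while its confidence interval still includes values below it.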
2.5. Genome annotation and comparative genomic analysis
Basic genomic features, including genome size, G+C content, scaffold/contig numbers, N50, and total gene counts, were assessed using CheckM v1.2.4 (Parks et al., 2015). Protein-coding genes were identified using Prokka v1.14.5 (Seemann, 2014). Secondary metabolite biosynthesis gene clusters (BGCs) were detected with antiSMASH v8.0 (Blin et al., 2025) under relaxed detection settings. Protein sequences predicted by Prokka were used for functional annotation and comparative analyses. Functional annotation was performed with eggNOG-mapper v2.1.12 (e-value <1e-5) (Cantalapiedra et al., 2021), assigning annotations to KEGG Orthology (KO), Clusters of Orthologous Groups (COG), and Carbohydrate-active enzymes (CAZymes) categories. Pathway reconstruction was carried out with KofamKOALA (e-value <1e-5) (Aramaki et al., 2019) against the KEGG database (release 113.0).
Core genome single nucleotide polymorphisms (SNPs) were identified from all S. siyangense genomes using Snippy v4.6.0 with default parameters. A multiple sequence alignment of core SNPs (core.aln) was generated using Snippy-core. Maximum likelihood phylogeny was inferred with IQ-TREE 2 with the best-fitting substitution model of ‘TVM+F+ASC+R4’. SNP alleles were also visualized as a heatmap in R using the ‘ComplexHeatmap’ package, with nucleotides (A, T, C, G) color-coded and genome order preserved. Recombination was assessed using Gubbins v3.4.1 (Croucher et al., 2015) with default settings, using the core SNP alignment as input.
2.6. Pan-genomic analysis
To explore the evolutionary dynamics and population genomic diversity of the genus Sphingobacterium, we performed a pan-genomic analysis on 61 genomes, comprising 60 Sphingobacterium genomes and one Rhinopithecimicrobium genome. The analysis was conducted using Roary v3.13.0 (Page et al., 2015) and Panaroo v1.5.2 (Tonkin-Hill et al., 2020). GFF annotation files generated by Prokka were used as inputs, with a BLASTp minimum percentage identity threshold of 95% for Roary. Additionally, we separately performed pan-genome analysis on 24 Sphingobacterium siyangense genomes using the same methods. Genes were categorized into four groups: core genes (99 % ≤ strains ≤ 100 %), soft-core genes (95 % ≤ strains < 99 %), shell genes (15 % ≤ strains < 95 %), and cloud genes (0 % ≤ strains < 15 %). The distribution of these gene clusters was visualized in R using the packages ‘ggplot2’, ‘dplyr’, ‘tidyr’, and ‘ggrepel’. The ‘final_graph.gml’ file generated by Panaroo was used to produce a pan-genome network visualization in Cytoscape v3.10.4 (Shannon et al., 2003).
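The four frequency classes defined above can be expressed as a small Python function. This is a minimal sketch of the stated cutoffs only, with a hypothetical function name; it is not the internal logic of Roary or Panaroo.

```python
def classify_gene(n_present, n_genomes):
    """Assign a gene cluster to a pan-genome category from its presence count.

    Cutoffs follow the definitions in the text:
    core 99-100%, soft-core 95% to <99%, shell 15% to <95%, cloud <15%.
    """
    if not 0 <= n_present <= n_genomes or n_genomes == 0:
        raise ValueError("presence count must lie between 0 and the number of genomes")
    fraction = n_present / n_genomes
    if fraction >= 0.99:
        return "core"
    if fraction >= 0.95:
        return "soft-core"
    if fraction >= 0.15:
        return "shell"
    return "cloud"


# With the 61 genomes of the genus-level analysis, a cluster must be present
# in all 61 genomes to be core, because 60/61 is about 98.4%.
examples = {n: classify_gene(n, 61) for n in (61, 60, 58, 57, 10, 9)}
```

One consequence of these cutoffs is that, for a dataset of 61 genomes, the core category is equivalent to strict presence in every genome, which helps explain how a single missing annotation can move a gene out of the core set.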
3. Results
3.1. Genome characteristics of the Sphingobacterium complex
A total of 60 genome sequences representing Sphingobacterium type strains, along with the genomes of Rhinopithecimicrobium faecis and 21 additional Sphingobacterium siyangense strains (non-type), were retrieved from GenBank (Table S1). All genomes exhibited >95% completeness and <5% contamination, meeting the quality criteria for downstream analyses (Riesco and Trujillo, 2024). Genome sizes of the 82 genomes (including 61 type strains and 21 non-type strains) ranged from 3.14 to 6.91 Mbp (mean 4.89 ± 0.96 Mbp), with G + C contents spanning 34.6% to 46.8% (mean 40.19 ± 2.6%). Detailed genomic features are provided in Table S1.
3.2. Phylogenetic analyses
Phylogenetic reconstruction based on 16S rRNA sequences revealed that Sphingobacterium species clustered into two major clades (Clade I and Clade II), with R. faecis nested within Clade I (Fig. S1). Genome-scale phylogenies inferred using GTDB and UBCG frameworks were consistent with this topology (Fig. 1), further supporting the placement of R. faecis within the genus Sphingobacterium. Phylogenomic relationships revealed two groups of closely related lineages: (1) S. siyangense subsp. siyangense CGMCC 1.6855T, S. siyangense subsp. cladoniae JCM 16113T, and ‘S. paramultivorum’ w15T; (2) S. detergens CECT 7938T and S. ginsenosidimutans JCM 16722T. These relationships reflect close evolutionary connections and highlight potential taxonomic uncertainty within the genus. A comprehensive genome-based phylogeny incorporating 264 Sphingobacteriaceae type strain genomes and 21 additional S. siyangense strain genomes is presented in Fig. S2.
Fig. 1.
Genome-based phylogenetic relationships among Sphingobacterium species and the placement of Rhinopithecimicrobium faecis. The left and right panels show maximum-likelihood trees inferred using the GTDB (120 marker genes) and UBCG (92 core genes) frameworks, respectively. Both trees display congruent clustering patterns. Bootstrap support values (1,000 replicates) ≥70% are indicated at nodes.
3.3. Genome-based similarity indices
Pairwise overall genome relatedness indices (OGRI), including average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH), were calculated for all type strains to further resolve taxonomic relationships within the genus Sphingobacterium (Fig. 2). Three closely related lineages, S. siyangense subsp. siyangense CGMCC 1.6855T, S. siyangense subsp. cladoniae JCM 16113T, and ‘S. paramultivorum’ w15T, showed high genomic similarity among adjacent pairs, with ANI values from 95.6% to 96.7% and dDDH values from 65.7% (confidence interval (CI): 62.8–68.6%) to 72.4% (CI: 69.4–75.2%) (Table S2). Notably, ANI between S. siyangense subsp. siyangense and ‘S. paramultivorum’ remained above the 95% species-level threshold, while the corresponding dDDH dropped below the 70% threshold (65.7%), generating a non-transitive pattern.
Fig. 2.
Genome-based similarity among selected Sphingobacterium type strains. Pairwise overall genome relatedness indices (OGRI), including average nucleotide identity (ANI, lower-left triangle) and digital DNA–DNA hybridization (dDDH, upper-right triangle), were calculated for the strains shown in this phylogenomic subtree. The heatmap is organized according to the genome-based phylogeny. Key strains are highlighted in red.
To comprehensively evaluate genomic divergence within S. siyangense, we incorporated 21 additional non-type genomes, resulting in a dataset of 24 genomes. All pairwise ANI values across these genomes exceeded 95%, confirming strong genomic cohesion within the species (Fig. 3A). In contrast, dDDH values displayed substantial variability, with many falling below the species threshold and showing poor consistency with phylogenetic relatedness. The GBDP tree indicated clear phylogenetic structure, but dDDH-based species/subspecies assignments did not reliably correspond to clade topology (Fig. 3B), suggesting reduced resolution of dDDH for taxa exhibiting higher intra-species diversity and complicating objective delineation of subspecies within Sphingobacterium. In comparison, S. detergens CECT 7938T and S. ginsenosidimutans JCM 16722T exhibited clear genomic cohesion, with 96.5% ANI and 71.2% dDDH (CI: 68.2–74.0%), supporting their classification as a single species.
Fig. 3.
Genomic cohesion and intra-species variation in Sphingobacterium siyangense. (A) Pairwise ANI (lower-left) and dDDH (upper-right) among 24 S. siyangense-related genomes, including type and non-type strains. All ANI values exceed the 95.0% species threshold, while dDDH values are highly variable and do not always match phylogeny. (B) GBDP-based tree showing phylogenetic relationships of the 24 S. siyangense-related genomes. Closely related genomes cluster together, but dDDH-based species and subspecies assignments do not consistently correspond with the phylogeny, highlighting limitations of pairwise dDDH metrics within the species.
Average amino acid identity (AAI) analyses further supported these patterns (Fig. S3). The closely related S. siyangense lineages exhibited AAI values >95% (Fig. S3A) (Konstantinidis et al., 2017), consistent with ANI-based species-level assignments. Across all 24 S. siyangense genomes, AAI also remained >95% (Fig. S3B), demonstrating highly conserved protein-level relatedness despite heterogeneous dDDH signals. The AAI value for the pair S. detergens–S. ginsenosidimutans (97.0%) likewise exceeded the 95% species threshold, in agreement with their high ANI values (Fig. S3A). Taken together, these results suggest that AAI provides a more stable indicator of evolutionary relatedness across the genus, complementing ANI and mitigating the limitations of dDDH in lineages with gradual genomic divergence.
3.4. Comparative functional genomic analysis within S. siyangense
To assess whether genomic variation within S. siyangense corresponds to functional divergence, comparative functional genomic analyses were conducted across all 24 genomes. PCoA based on functional gene profiles (KO presence/absence; Fig. 4) showed that all genomes clustered tightly with minimal dispersion, indicating high functional similarity. Consistently, COG functional categories and CAZyme repertoires were highly conserved, with only subtle variation among strains (Fig. S4). Secondary metabolite biosynthetic gene cluster (BGC) profiles also displayed substantial overlap (Fig. S5). The 24 S. siyangense-related genomes typically encoded 6–8 BGCs, including terpene, terpene-precursor, T3PKS, betalactone, and arylpolyene clusters, with only sporadic lineage-specific features such as RRE-containing, RiPP-like, NRPS-like, and triceptide clusters. These results demonstrate that, despite variability in dDDH values and non-transitive patterns observed at the genome level, the overall functional repertoire of S. siyangense remains largely consistent, supporting its treatment as a genomically cohesive species.
Fig. 4.
Principal coordinates analysis (PCoA) based on presence/absence of KEGG Orthology (KO) terms. Points are colored by group (orange: S. siyangense, green: other species). Dashed ellipses show the 95% confidence interval for S. siyangense. The right panel zooms in on the 24 S. siyangense genomes, with three reference strains highlighted. Only genomes from the clade containing S. siyangense are shown.
Core genome SNP comparisons further supported this genomic cohesion. SNP counts per S. siyangense genome ranged from ∼43,000 in the most closely related isolate (O113_G2) to ∼135,000 in the most divergent genome (‘S. paramultivorum’ w15T). Heterozygous sites were rare (<0.3% of variants) and no masked positions were present, reflecting reliable alignment of the core regions. Core SNP variation presented a relatively continuous divergence pattern among the 24 genomes, consistent with the overall genomic cohesion indicated by ANI and AAI metrics. The associated SNP heatmap further illustrated divergence patterns consistent with phylogenetic structure (Fig. S6). Recombination analysis revealed minimal horizontal gene transfer, and the recombination-filtered phylogeny closely matched the unfiltered tree (Fig. S7), indicating that recombination has limited impact on species-level genomic relationships. Together, these SNP-based and functional genomic results demonstrate strong evolutionary cohesion across the 24 genomes, supporting their classification as a single species, S. siyangense.
3.5. Pan-genomic structure of Sphingobacterium and S. siyangense
To explore genomic diversity and plasticity, pan-genome analyses were conducted at both genus and species levels using Roary and Panaroo. At the genus level, the phylogenomic tree combined with the Roary gene presence/absence matrix (Fig. 5A) revealed extensive variation in gene content across lineages, reflecting evolutionary diversification within the genus Sphingobacterium. The pan-genome, as determined by Roary, comprised 197,255 genes, including 22 core genes (present in ≥99% of strains), 6 soft-core genes (95–99%), 758 shell genes (15–95%), and 196,469 cloud genes (<15%). This analysis reveals a very limited core genome and a highly variable accessory genome. For a complementary perspective, the pan-genome analysis using Panaroo is also provided (Table S3). Gene accumulation curves showed rapid saturation of the core genome and steep expansion of the total and novel gene counts with the addition of new genomes (Fig. 5B). Panaroo-based network visualization highlighted that most genes are rare, with only a few conserved across nearly all strains, consistent with extensive accessory genome variability (Fig. 5C). Collectively, Roary and Panaroo yielded largely consistent results, confirming an open pan-genome with high genomic plasticity at the genus level.
Fig. 5.
Pan-genome of 61 Sphingobacterium type strains. (A) Phylogenomic tree combined with the Roary gene presence/absence matrix, illustrating variation in gene content across strains. (B) Gene accumulation curves showing the total number of genes (top) and conserved/core genes (bottom) as genomes are sequentially added. (C) Panaroo-based pan-genome network visualization. Node color intensity reflects the number of genomes containing each gene family, with edges representing shared gene content among strains.
At the species level, analyses of 24 S. siyangense genomes revealed strong genomic cohesion with a moderately open pan-genome. The pan-genome contained 3,138 core genes, 234 soft-core genes, 3,822 shell genes, and 8,704 cloud genes (Table S4). The Roary gene presence/absence matrix (Fig. 6A) showed high conservation of core genes across strains. Gene accumulation curves indicated rapid saturation of the core genome, while total and novel gene counts continued to increase (Fig. 6B), reflecting a moderately open accessory pan-genome within the species. Panaroo-based network visualization further highlighted that most genes are rare, with only a moderate number of shared modules across strains and limited strain-specific variation (Fig. 6C). Roary and Panaroo produced broadly comparable gene counts and trends, confirming the strong genomic cohesion of S. siyangense, while the presence of a moderately sized, variable accessory genome indicates that the species maintains an open pan-genome.
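The contrast between the genus-level and species-level pan-genomes can be checked with simple arithmetic on the category counts reported in the text. The short Python sketch below recomputes the totals and the share of each category; the variable and function names are illustrative, and the species-level total is derived here rather than quoted from the source.

```python
# Category counts as reported in the text for the genus-level (61 genomes)
# and species-level (24 S. siyangense genomes) analyses.
genus_counts = {"core": 22, "soft-core": 6, "shell": 758, "cloud": 196_469}
species_counts = {"core": 3_138, "soft-core": 234, "shell": 3_822, "cloud": 8_704}


def category_shares(counts):
    """Return (total gene clusters, percentage share of each category)."""
    total = sum(counts.values())
    shares = {name: 100 * value / total for name, value in counts.items()}
    return total, shares


genus_total, genus_share = category_shares(genus_counts)
species_total, species_share = category_shares(species_counts)

for label, total, share in (("genus", genus_total, genus_share),
                            ("species", species_total, species_share)):
    summary = ", ".join(f"{name} {pct:.2f}%" for name, pct in share.items())
    print(f"{label}: {total} gene clusters ({summary})")
```

The genus-level counts sum to the reported 197,255 genes, with cloud genes accounting for well over 99% of clusters, whereas core genes make up roughly a fifth of the species-level pan-genome. This difference in core share is what distinguishes an extremely open genus-level pan-genome from a cohesive species with a moderately sized accessory genome.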
Fig. 6.
Pan-genome of 24 S. siyangense genomes. (A) Phylogenomic tree combined with the Roary gene presence/absence matrix. (B) Gene accumulation curves for total and conserved genes, indicating rapid saturation of the core genome and more moderate expansion of the pan-genome relative to the genus-level analysis. (C) Panaroo-based pan-genome network, with node color indicating the number of genomes containing each gene family, highlighting shared modules and a moderately sized accessory genome.
3.6. Genome-based taxonomic revisions of Sphingobacterium
Genome-scale analyses reveal that several closely related taxa currently recognized as separate species or subspecies actually represent single genomically cohesive species. In particular, S. siyangense subsp. siyangense, S. siyangense subsp. cladoniae, and the not validly published ‘S. paramultivorum’ exhibit ANI and AAI values consistently above the species-level threshold (>95%), despite some pairwise dDDH values falling below 70%. Core genome, functional genomic, and pan-genome analyses further support their strong genomic and functional cohesion. Based on these results, we propose retaining Sphingobacterium siyangense as a single species and discarding the subspecies designations.
In addition, S. detergens and S. ginsenosidimutans show ANI and AAI values above 95% and minimal functional divergence. In accordance with the International Code of Nomenclature of Prokaryotes (ICNP) (Oren et al., 2023), the later-described S. ginsenosidimutans (Son et al., 2013) should be regarded as a heterotypic synonym of the earlier-described S. detergens (Marques et al., 2012), which is retained as the valid species name. Furthermore, phylogenomic analyses place Rhinopithecimicrobium faecis (Wang et al., 2024) within the Sphingobacterium clade, supporting its reclassification as Sphingobacterium faecis comb. nov. These proposed changes reflect both genomic cohesion and functional consistency, providing a more accurate and streamlined taxonomy for the genus Sphingobacterium.
4. Discussion
Species delimitation in prokaryotes has traditionally relied on pairwise genomic similarity metrics, notably average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH), with thresholds of 95–96% ANI and 70% dDDH widely accepted for species boundary definition (Hayashi Sant’Anna et al., 2019; Orellana, 2025; Riesco and Trujillo, 2024). While these metrics have been broadly valuable, their application can be limited in taxa with highly dynamic and open genomes (Chung et al., 2018; Parks et al., 2020). Our genome-scale analyses of Sphingobacterium illustrate this limitation. Within the S. siyangense complex, all genome pairs meet the ANI species threshold (>95%), yet one pair exhibits a dDDH value below 70%, producing a non-transitive pattern (Fig. 2, Fig. 3). This discordance is not due to conflicting signals from multiple metrics, but reflects the inherent non-transitive property of dDDH (Fig. 7), which may fail to hold simultaneously across a triad of closely related genomes (Parks et al., 2020). The highly plastic pan-genome of Sphingobacterium amplifies this effect, as variation in accessory gene content disproportionately influences pairwise genomic similarity (Fig. 5, Fig. 6). These observations underscore the limitations of relying solely on fixed pairwise thresholds for taxonomic inference.
Fig. 7.
Schematic representation of non-transitive relationships based on dDDH values among S. siyangense lineages. Strains: A, S. siyangense subsp. siyangense CGMCC 1.6855T; B, S. siyangense subsp. cladoniae JCM 16113T; C, ‘S. paramultivorum’ w15T. The diagram illustrates the non-transitive pattern observed for dDDH, where dDDH between A–B and B–C meets or approaches the species-level threshold, but A–C falls below 70%, resulting in non-transitive pairwise relationships. ANI and AAI values between all lineage pairs remain >95% (Figs. 2 and S3A), consistent with species-level assignment. Dotted circles on the left and right of A and C highlight the taxa involved in the non-transitive pattern. Inclusion of the 21 additional non-type S. siyangense genomes confirms that, despite this dDDH non-transitivity, all S. siyangense-related strains (n=24) belong to a single genomically cohesive species (Figs. 3A and S3B).
The underlying causes of such inconsistencies are multifactorial. The Sphingobacterium pan-genome is highly open, with a minimal core gene set and a large, dynamic accessory gene repertoire. Horizontal gene transfer, recombination, and lineage-specific gene loss can generate uneven similarity patterns across genomes (Seng et al., 2024; Wang et al., 2020; Zhong et al., 2019). Consequently, pairwise metrics such as dDDH can be strongly influenced by accessory gene content, yielding discordant values among genome pairs (Volpiano et al., 2021). Methodological differences in calculating distances may further amplify these inconsistencies, particularly near threshold boundaries (Chung et al., 2018). These factors collectively explain why dDDH may underrepresent evolutionary relatedness within highly cohesive species like S. siyangense, highlighting the need for integrative approaches that combine multiple genomic and functional metrics.
Our findings also highlight the significant role of proteome-level identity (AAI) as a complementary metric to ANI, particularly in cases where dDDH produces ambiguous results. In the S. siyangense complex, AAI values consistently exceed the 95% threshold across all strains, aligning with the high ANI values and providing a stable measure of genomic cohesion (Fig. S3). In contrast, dDDH values exhibit considerable variability, with some strain pairs failing to meet the 70% threshold despite strong genomic similarity as indicated by ANI and AAI. These discrepancies suggest that AAI offers a more reliable indicator of evolutionary relatedness, especially in taxa with complex pan-genomes where accessory gene content can disproportionately influence dDDH values. Our results underscore the utility of AAI as an additional, stable metric for species delimitation, particularly in cases where pairwise metrics like dDDH may fail to resolve taxonomic boundaries accurately. This aligns with previous suggestions that genome-based taxonomy should consider multiple, complementary metrics to achieve more reliable and consistent species delineation (Barco et al., 2020; Narsing Rao and Thamchaipenet, 2024; Riesco and Trujillo, 2024).
Functional genomic analyses further support our taxonomic conclusions. Despite substantial genomic variation, S. siyangense strains share a high degree of functional coherence. Comparative analysis of functional gene categories (KO, COG, and CAZyme) and secondary metabolite biosynthetic gene clusters (BGCs) across 24 S. siyangense-related genomes reveals remarkable conservation, with only subtle variations in specific gene families and BGCs (Figs. 4 and S4). This functional consistency supports the classification of these strains as a single species, with minor intraspecific variation rather than evidence of distinct species. Notably, the presence of lineage-specific BGCs, such as NRPS-like, NRPS and triceptide clusters, adds ecological relevance to the observed phylogenomic relationships (Fig. S5). While secondary metabolites are not formal taxonomic markers, they provide valuable insight into ecological differentiation and evolutionary adaptation, reinforcing the robustness of our species-level taxonomic assignments (Kuzmanović et al., 2022; Suresh et al., 2025; Wang et al., 2022). These findings align with previous work demonstrating the utility of functional genomic analysis in species delimitation, particularly in taxa with high genomic diversity (Dif et al., 2024; Simpson et al., 2023; Xu et al., 2021).
The patterns of genomic and functional similarity observed among closely related Sphingobacterium strains provide strong justification for revising species boundaries. High ANI and AAI values, coupled with minimal functional divergence, indicate that S. detergens and S. ginsenosidimutans represent genomically cohesive entities rather than distinct species. These findings are consistent with the principle of genomic coherence, which emphasizes that species delineation should reflect both evolutionary relatedness and functional consistency (Vos, 2011). By consolidating these taxa, the genus Sphingobacterium can be represented by a more streamlined taxonomy that reduces ambiguity and better reflects underlying biological relationships. This approach also demonstrates the importance of integrating multiple lines of genomic and functional evidence, rather than relying solely on pairwise thresholds, when establishing species-level classifications in prokaryotes.
Beyond revising species boundaries, this study highlights the broader utility and challenges of genome-based taxonomy. Increasing genomic data availability enables more accurate and reliable taxonomic assignments (Hugenholtz et al., 2021; Riesco and Trujillo, 2024), but reliance on single-metric thresholds (e.g., dDDH) may produce contradictions, particularly in taxa with open pan-genomes or high intra-species diversity. Integrative approaches combining ANI, AAI, dDDH, core-genome SNPs, pan-genome analyses, and functional data allow a more nuanced understanding of species boundaries. Nevertheless, several challenges remain. Functional predictions are still limited by the completeness and accuracy of existing databases, and phylogenetic reconstructions may be affected by factors such as recombination and model selection. Additionally, population-level studies with broader strain sampling could provide more detailed insights into evolutionary dynamics. Expanding experimental validation, especially for traits related to ecological adaptation or secondary metabolism, will further strengthen the predictive accuracy of genome-based taxonomy (Hart et al., 2025; Karlsen et al., 2023; Koblitz et al., 2025; Li et al., 2023; Ravinet et al., 2017).
5. Conclusion
Our genome-scale analyses of Sphingobacterium underscore the limitations of relying solely on fixed genomic thresholds (e.g., dDDH) for species delineation, particularly in taxa with highly dynamic and open genomes. By integrating multiple lines of evidence, including ANI, AAI, dDDH, core-genome SNPs, pan-genome structure, and functional gene content, we were able to resolve conflicts and clarify evolutionary relationships. These analyses support the classification of Sphingobacterium siyangense as a single genomically cohesive species, despite non-transitive patterns observed in dDDH comparisons. Additionally, we propose that Sphingobacterium ginsenosidimutans be reclassified as a synonym of Sphingobacterium detergens. Furthermore, we present evidence supporting the inclusion of Rhinopithecimicrobium faecis within the genus Sphingobacterium. Our study emphasizes the need for an integrative approach to microbial taxonomy that combines genomic, phylogenomic, and functional data, offering a more accurate and biologically meaningful framework for species classification and ensuring both nomenclatural stability and evolutionary relevance.
5.1. Emended description of Sphingobacterium siyangense Liu et al. (2008)
The species description is as given for Sphingobacterium siyangense (Liu et al., 2008) with the following amendments. Colonies on LB agar are slightly yellow (initially white during early growth) or yellowish. MK-7 is the sole menaquinone. The genome size ranges from 5.96 to 6.91 Mb, with a genomic DNA G+C content of 39.7–39.9%. GenBank accession numbers for the 16S rRNA gene and genome sequences of the type strain are EU046272 and VLKR00000000 (GCA_007830445.1), respectively.
The type strain is SY1T (=KCTC 22131T=CGMCC 1.6855T), which was isolated from a soil sample from Jiangsu Province, China. Another strain, No.6 (=KCTC 22613=JCM 16113), was isolated from a lichen (Cladonia sp.) collected on Geogeum Island, Korea. This species also encompasses more than 20 additional strains recovered from diverse environments, including strain w15 (GenBank assembly: GCA_009660355.1) from decaying wood in the Netherlands; FTD2 (GCA_023277505.1) from activated sludge in Poland; PDNC006 (GCA_016919365.1) from plastic debris in the USA; MMO-142 (GCA_037148175.1) and MMO-146 (GCA_037148115.1) from mosquitoes in the USA; and T12B17 (GCA_003610905.1) from soil in Enshi, China, among others.
5.2. Emended description of Sphingobacterium detergens Marqués et al. (2012)
Heterotypic synonym: Sphingobacterium ginsenosidimutans Son et al. 2014.
The species description is as given for Sphingobacterium detergens (Marques et al., 2012) with the following amendments. The pH range for growth is 5.0–10.0. MK-7 is the predominant menaquinone. The major components of cellular fatty acids (>10%) include summed feature 3 (C16:1 ω7c and/or C16:1 ω6c) and iso-C15:0. The genome size ranges from 6.45 to 6.73 Mb, with a genomic DNA G+C content of 39.8–40.1%. GenBank accession numbers for the 16S rRNA gene and genome sequences of the type strain are JN015213 and RAPY01000000 (GCA_003610355.1), respectively.
The type strain is 6.2ST (=LMG 26465T=CECT 7938T), which was isolated from a soil sample and identified as a biosurfactant producer. Another strain, THG 07 (=JCM 16722=KACC 14526), was isolated from the soil of a ginseng field in Pocheon, South Korea, and exhibits ginsenoside-converting activity.
5.3. Description of Sphingobacterium faecis comb. nov.
Sphingobacterium faecis (fae’cis. L. gen. fem. n. faecis, referring to faecal origin from where the type strain was isolated).
Basonym: Rhinopithecimicrobium faecis (Wang et al. 2024).
The description is as given for Rhinopithecimicrobium faecis (Wang et al., 2024). The genome size of the type strain is 3.14 Mb with a genomic DNA G+C content of 39.4%. The GenBank accession numbers for the 16S rRNA gene and genome sequences of the type strain are MZ413266 and JAGKSB010000000 (GCA_017942165.1), respectively.
The type strain, WQ 2009T (=CCTCC AB 2021153T=KCTC 82941T), was isolated from the faeces of Rhinopithecus bieti collected from the Yunnan Snub-nosed Monkey National Park.
Funding
This work was supported by the National Natural Science Foundation of China (No. 32400004), the “Tianchi Talents”–Young Doctor Program of Xinjiang Uygur Autonomous Region (No. E535851801), the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2025D01A141), and the Open Project of Guangdong Provincial Key Laboratory of Plant Stress Biology (No. 2024PlantKF02).
Data availability
All publicly available genome assemblies of Sphingobacteriaceae type species (n=264) and non-type Sphingobacterium siyangense strains (n=21) used in this study were obtained from the GenBank database. The accession numbers are presented in Table S1.
CRediT authorship contribution statement
Shuai Li: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Supervision, Visualization, Writing – original draft. Xin-Ran Wang: Investigation, Formal analysis, Validation. Wei Zhang: Resources. Wen-Jun Li: Project administration, Supervision, Writing – review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors sincerely appreciate the Microbial Resources, Ecology, and Evolution (MicroREE) Lab, led by Prof. Wen-Jun Li at Sun Yat-sen University, for providing high-performance computing resources and technical support for the bioinformatics analyses.
Footnotes
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.crmicr.2025.100524.
References
- Ahmed I., Ehsan M., Sin Y., Paek J., Khalid N., Hayat R., et al. Sphingobacterium pakistanensis sp. nov., a novel plant growth promoting rhizobacteria isolated from rhizosphere of Vigna mungo. Antonie Leeuwenhoek. 2014;105:325–333. doi: 10.1007/s10482-013-0077-0.
- An D., Na C., Bielawski J., Hannun Y.A., Kasper D.L. Membrane sphingolipids as essential molecular signals for Bacteroides survival in the intestine. Proc. Natl. Acad. Sci. U. S. A. 2011;108:4666–4671. doi: 10.1073/pnas.1001501107.
- Aramaki T., Blanc-Mathieu R., Endo H., Ohkubo K., Kanehisa M., Goto S., et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2019;36:2251–2252. doi: 10.1093/bioinformatics/btz859.
- Auch A.F., von Jan M., Klenk H.P., Göker M. Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison. Stand. Genom. Sci. 2010;2:117–134. doi: 10.4056/sigs.531120.
- Barco R.A., Garrity G.M., Scott J.J., Amend J.P., Nealson K.H., Emerson D. A genus definition for Bacteria and Archaea based on a standard genome relatedness index. mBio. 2020;11 doi: 10.1128/mBio.02475-19.
- Blin K., Shaw S., Vader L., Szenei J., Reitz Z.L., Augustijn H.E., et al. antiSMASH 8.0: extended gene cluster detection capabilities and analyses of chemistry, enzymology, and regulation. Nucleic Acids Res. 2025;53:W32–W38. doi: 10.1093/nar/gkaf334.
- Cantalapiedra C.P., Hernández-Plaza A., Letunic I., Bork P., Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 2021;38:5825–5829. doi: 10.1093/molbev/msab293.
- Chaumeil P.A., Mussig A.J., Hugenholtz P., Parks D.H. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics. 2022;38:5315–5316. doi: 10.1093/bioinformatics/btac672.
- Chklovski A., Parks D.H., Woodcroft B.J., Tyson G.W. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat. Methods. 2023;20:1203–1212. doi: 10.1038/s41592-023-01940-w.
- Chun J., Oren A., Ventosa A., Christensen H., Arahal D.R., da Costa M.S., et al. Proposed minimal standards for the use of genome data for the taxonomy of prokaryotes. Int. J. Syst. Evol. Microbiol. 2018;68:461–466. doi: 10.1099/ijsem.0.002516.
- Chung M., Munro J.B., Tettelin H., Dunning Hotopp J.C. Using core genome alignments to assign bacterial species. mSystems. 2018;3 doi: 10.1128/mSystems.00236-18.
- Coenye T., Gevers D., de Peer Y.V., Vandamme P., Swings J. Towards a prokaryotic genomic taxonomy. FEMS Microbiol. Rev. 2005;29:147–167. doi: 10.1016/j.fmrre.2004.11.004.
- Croucher N.J., Page A.J., Connor T.R., Delaney A.J., Keane J.A., Bentley S.D., et al. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res. 2015;43:e15. doi: 10.1093/nar/gku1196.
- Dif G., Djemouai N., Bouras N., Zitouni A. Reclassification of two Nocardiopsis species using whole genome analysis. Antonie Leeuwenhoek. 2024;118:28. doi: 10.1007/s10482-024-02038-9.
- Fan G., Sun Q., Sun Y., Liu D., Li S., Li M., et al. GCM and gcType in 2024: comprehensive resources for microbial strains and genomic data. Nucleic Acids Res. 2024;53:D763–D771. doi: 10.1093/nar/gkae1057.
- Figueiredo G., Gomes M., Covas C., Mendo S., Caetano T. The unexplored wealth of microbial secondary metabolites: the Sphingobacteriaceae case study. Microb. Ecol. 2022;83:470–481. doi: 10.1007/s00248-021-01762-3.
- Hart R., Moran N.A., Ochman H. Genomic divergence across the tree of life. Proc. Natl. Acad. Sci. U. S. A. 2025;122 doi: 10.1073/pnas.2319389122.
- Hayashi Sant’Anna F., Bach E., Porto R.Z., Guella F., Hayashi Sant’Anna E., Passaglia L.M.P. Genomic metrics made easy: what to do and where to go in the new era of bacterial taxonomy. Crit. Rev. Microbiol. 2019;45:182–200. doi: 10.1080/1040841X.2019.1569587.
- Hugenholtz P., Chuvochina M., Oren A., Parks D.H., Soo R.M. Prokaryotic taxonomy and nomenclature in the age of big sequence data. ISME J. 2021;15:1879–1892. doi: 10.1038/s41396-021-00941-x.
- Jain C., Rodriguez-R L.M., Phillippy A.M., Konstantinidis K.T., Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 2018;9:5114. doi: 10.1038/s41467-018-07641-9.
- Kalyaanamoorthy S., Minh B.Q., Wong T.K.F., von Haeseler A., Jermiin L.S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 2017;14:587–589. doi: 10.1038/nmeth.4285.
- Karlsen S.T., Rau M.H., Sanchez B.J., Jensen K., Zeidan A.A. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol. Rev. 2023;47:fuad030. doi: 10.1093/femsre/fuad030.
- Khampratueng P., Anal A.K. Enhancing the biodegradation of low-density polyethylene (LDPE) using novel bacterial consortia: Bacillus sp. AS3 and Sphingobacterium sp. AS8. J. Environ. Sci. 2026;159:263–270. doi: 10.1016/j.jes.2025.04.007.
- Kim M., Oh H.S., Park S.C., Chun J. Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int. J. Syst. Evol. Microbiol. 2014;64:346–351. doi: 10.1099/ijs.0.059774-0.
- Kim S., Heo J., Choi H., Lee D., Kwon S.W., Kim Y. Sphingobacterium oryzagri sp. nov., isolated from rice paddy soil. Int. J. Syst. Evol. Microbiol. 2024;74 doi: 10.1099/ijsem.0.006371.
- Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 1980;16:111–120. doi: 10.1007/BF01731581.
- Koblitz J., Reimer L.C., Pukall R., Overmann J. Predicting bacterial phenotypic traits through improved machine learning using high-quality, curated datasets. Commun. Biol. 2025;8:897. doi: 10.1038/s42003-025-08313-3.
- Konstantinidis K.T., Tiedje J.M. Prokaryotic taxonomy and phylogeny in the genomic era: advancements and challenges ahead. Curr. Opin. Microbiol. 2007;10:504–509. doi: 10.1016/j.mib.2007.08.006.
- Konstantinidis K.T., Rossello-Mora R., Amann R. Uncultivated microbes in need of their own taxonomy. ISME J. 2017;11:2399–2406. doi: 10.1038/ismej.2017.113.
- Kumar S., Stecher G., Suleski M., Sanderford M., Sharma S., Tamura K. MEGA12: molecular evolutionary genetic analysis version 12 for adaptive and green computing. Mol. Biol. Evol. 2024;41:msae263. doi: 10.1093/molbev/msae263.
- Kuzmanović N., Biondi E., Overmann J., Puławska J., Verbarg S., Smalla K., et al. Genomic analysis provides novel insights into diversification and taxonomy of Allorhizobium vitis (i.e. Agrobacterium vitis). BMC Genomics. 2022;23:462. doi: 10.1186/s12864-022-08662-x.
- Lee I., Ouk Kim Y., Park S.C., Chun J. OrthoANI: An improved algorithm and software for calculating average nucleotide identity. Int. J. Syst. Evol. Microbiol. 2016;66:1100–1103. doi: 10.1099/ijsem.0.000760.
- Letunic I., Bork P. Interactive Tree of Life (iTOL) v6: recent updates to the phylogenetic tree display and annotation tool. Nucleic Acids Res. 2024;52:W78–W82. doi: 10.1093/nar/gkae268.
- Li S., Lian W.H., Han J.R., Ali M., Lin Z.L., Liu Y.H., et al. Capturing the microbial dark matter in desert soils using culturomics-based metagenomics and high-resolution analysis. NPJ Biofilms Microbiomes. 2023;9:67. doi: 10.1038/s41522-023-00439-8.
- Li S., Liu J., Huang J., Dong L., Li W.J. Genome-based reclassification of Sphingobacterium soli Fu et al. 2017 as a later heterotypic synonym of Sphingobacterium cellulitidis Huys et al. 2017 and proposal of Sphingobacterium siyangense subsp. siyangense subsp. nov. and Sphingobacterium siyangense subsp. cladoniae subsp. nov. Int. J. Syst. Evol. Microbiol. 2024;74 doi: 10.1099/ijsem.0.006610.
- Liu R., Liu H., Zhang C.X., Yang S.Y., Liu X.H., Zhang K.Y., et al. Sphingobacterium siyangense sp. nov., isolated from farm soil. Int. J. Syst. Evol. Microbiol. 2008;58:1458–1462. doi: 10.1099/ijs.0.65696-0.
- Marques A.M., Burgos-Diaz C., Aranda F.J., Teruel J.A., Manresa A., Ortiz A., et al. Sphingobacterium detergens sp. nov., a surfactant-producing bacterium isolated from soil. Int. J. Syst. Evol. Microbiol. 2012;62:3036–3041. doi: 10.1099/ijs.0.036707-0.
- Meier-Kolthoff J.P., Hahnke R.L., Petersen J., Scheuner C., Michael V., Fiebig A., et al. Complete genome sequence of DSM 30083T, the type strain (U5/41T) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy. Stand. Genom. Sci. 2014;9:2. doi: 10.1186/1944-3277-9-2.
- Meier-Kolthoff J.P., Göker M. TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy. Nat. Commun. 2019;10:2182. doi: 10.1038/s41467-019-10210-3.
- Meier-Kolthoff J.P., Carbasse J.S., Peinado-Olarte R.L., Göker M. TYGS and LPSN: a database tandem for fast and reliable genome-based classification and nomenclature of prokaryotes. Nucleic Acids Res. 2022;50:D801–D807. doi: 10.1093/nar/gkab902.
- Minh B.Q., Schmidt H.A., Chernomor O., Schrempf D., Woodhams M.D., von Haeseler A., et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020;37:1530–1534. doi: 10.1093/molbev/msaa015.
- Na S.I., Kim Y.O., Yoon S.H., Ha S.M., Baek I., Chun J. UBCG: Up-to-date bacterial core gene set and pipeline for phylogenomic tree reconstruction. J. Microbiol. 2018;56:280–285. doi: 10.1007/s12275-018-8014-6.
- Narsing Rao M.P., Thamchaipenet A. In: Modern Taxonomy of Bacteria and Archaea: New Methods, Technology and Advances. Li W.J., Jiao J.Y., Salam N., Rao M.P.N., editors. Springer Nature; Singapore, Singapore: 2024. pp. 133–140.
- Orellana L.H. Average nucleotide identity — the backbone of modern ecological genomics. Nat. Rev. Genet. 2025 doi: 10.1038/s41576-025-00911-5.
- Oren A., Arahal D.R., Göker M., Moore E.R.B., Rossello-Mora R., Sutcliffe I.C. International Code of Nomenclature of Prokaryotes. Prokaryotic Code (2022 revision). Int. J. Syst. Evol. Microbiol. 2023;73 doi: 10.1099/ijsem.0.005585.
- Page A.J., Cummins C.A., Hunt M., Wong V.K., Reuter S., Holden M.T.G., et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015;31:3691–3693. doi: 10.1093/bioinformatics/btv421.
- Parks D.H., Imelfort M., Skennerton C.T., Hugenholtz P., Tyson G.W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043–1055. doi: 10.1101/gr.186072.114.
- Parks D.H., Chuvochina M., Chaumeil P.A., Rinke C., Mussig A.J., Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat. Biotechnol. 2020;38:1079–1086. doi: 10.1038/s41587-020-0501-8.
- Parks D.H., Chuvochina M., Rinke C., Mussig A.J., Chaumeil P.A., Hugenholtz P. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 2021;50:D785–D794. doi: 10.1093/nar/gkab776.
- Parte A.C., Sardà Carbasse J., Meier-Kolthoff J.P., Reimer L.C., Göker M. List of prokaryotic names with standing in nomenclature (LPSN) moves to the DSMZ. Int. J. Syst. Evol. Microbiol. 2020;70:5607–5612. doi: 10.1099/ijsem.0.004332.
- Rajkumari J., Katiyar P., Dheeman S., Pandey P., Maheshwari D.K. The changing paradigm of rhizobial taxonomy and its systematic growth upto postgenomic technologies. World J. Microbiol. Biotechnol. 2022;38:206. doi: 10.1007/s11274-022-03370-w.
- Ravinet M., Faria R., Butlin R.K., Galindo J., Bierne N., Rafajlovic M., et al. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. J. Evol. Biol. 2017;30:1450–1477. doi: 10.1111/jeb.13047.
- Riesco R., Trujillo M.E. Update on the proposed minimal standards for the use of genome data for the taxonomy of prokaryotes. Int. J. Syst. Evol. Microbiol. 2024;74 doi: 10.1099/ijsem.0.006300.
- Rodriguez-R L.M., Conrad R.E., Viver T., Feistel D.J., Lindner B.G., Venter S.N., et al. An ANI gap within bacterial species that advances the definitions of intra-species units. mBio. 2024;15 doi: 10.1128/mbio.02696-23.
- Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–2069. doi: 10.1093/bioinformatics/btu153.
- Seng R., Chomkatekaew C., Tandhavanant S., Saiprom N., Phunpang R., Thaipadungpanit J., et al. Genetic diversity, determinants, and dissemination of Burkholderia pseudomallei lineages implicated in melioidosis in Northeast Thailand. Nat. Commun. 2024;15:5699. doi: 10.1038/s41467-024-50067-9.
- Sentausa E., Fournier P.E. Advantages and limitations of genomics in prokaryotic taxonomy. Clin. Microbiol. Infect. 2013;19:790–795. doi: 10.1111/1469-0691.12181.
- Shannon P., Markiel A., Ozier O., Baliga N.S., Wang J.T., Ramage D., et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
- Shi W., Sun Q., Fan G., Hideaki S., Moriya O., Itoh T., et al. gcType: a high-quality type strain genome database for microbial phylogenetic and functional research. Nucleic Acids Res. 2020;49:D694–D705. doi: 10.1093/nar/gkaa957.
- Simpson A.C., Sengupta P., Zhang F., Hameed A., Parker C.W., Singh N.K., et al. Phylogenomics, phenotypic, and functional traits of five novel (Earth-derived) bacterial species isolated from the International Space Station and their prevalence in metagenomes. Sci. Rep. 2023;13 doi: 10.1038/s41598-023-44172-w.
- Sohlenkamp C., Geiger O. Bacterial membrane lipids: diversity in structures and pathways. FEMS Microbiol. Rev. 2015;40:133–159. doi: 10.1093/femsre/fuv008.
- Son H.M., Yang J.E., Kook M.C., Shin H.S., Park S.Y., Lee D.G., et al. Sphingobacterium ginsenosidimutans sp. nov., a bacterium with ginsenoside-converting activity isolated from the soil of a ginseng field. J. Gen. Appl. Microbiol. 2013;59:345–352. doi: 10.2323/jgam.59.345.
- Suresh R., Jayachandiran S., Balu P., Ramasamy D. Comparative genomics reveals genetic diversity and differential metabolic potentials of the species of Arachnia and suggests reclassification of Arachnia propionica E10012 (=NBRC_14587) as novel species. Arch. Microbiol. 2025;207:93. doi: 10.1007/s00203-025-04302-6.
- Tan H., Kong D., Li Q., Zhou Y., Jiang X., Wang Z., et al. Metabolomics reveals the mechanism of tetracycline biodegradation by a Sphingobacterium mizutaii S121. Environ. Pollut. 2022;305 doi: 10.1016/j.envpol.2022.119299.
- Tao Y., Yang C., Dong K., Luo W., Ye L., Pu J., et al. Two new members of the genus Sphingobacterium: Sphingobacterium zhuxiongii sp. nov. and Sphingobacterium luzhongxinii sp. nov. Int. J. Syst. Evol. Microbiol. 2024;74 doi: 10.1099/ijsem.0.006488.
- Thompson J.D., Higgins D.G., Gibson T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
- Tonkin-Hill G., MacAlasdair N., Ruis C., Weimann A., Horesh G., Lees J.A., et al. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biol. 2020;21:180. doi: 10.1186/s13059-020-02090-4.
- Volpiano C.G., Sant'Anna F.H., Ambrosini A., de Sao Jose J.F.B., Beneduzi A., Whitman W.B., et al. Genomic metrics applied to Rhizobiales (Hyphomicrobiales): species reclassification, identification of unauthentic genomes and false type strains. Front. Microbiol. 2021;12 doi: 10.3389/fmicb.2021.614957.
- Vos M. A species concept for bacteria based on adaptive divergence. Trends Microbiol. 2011;19:1–7. doi: 10.1016/j.tim.2010.10.003.
- Wang J., Li Y., Pinto-Tomas A.A., Cheng K., Huang Y. Habitat adaptation drives speciation of a Streptomyces species with distinct habitats and disparate geographic origins. mBio. 2022;13 doi: 10.1128/mbio.02781-21.
- Wang Q., Zhan P.C., Han X.L., Lu T. Metagenomic and culture-dependent analysis of Rhinopithecius bieti gut microbiota and characterization of a novel genus of Sphingobacteriaceae. Sci. Rep. 2024;14 doi: 10.1038/s41598-024-64727-9.
- Wang S., Meade A., Lam H.M., Luo H. Evolutionary timeline and genomic plasticity underlying the lifestyle diversity in Rhizobiales. mSystems. 2020;5 doi: 10.1128/msystems.00438-20.
- Wu L., Ma J. The global catalogue of Microorganisms (GCM) 10K type strain sequencing project: providing services to taxonomists for standard genome sequencing and annotation. Int. J. Syst. Evol. Microbiol. 2019;69:895–898. doi: 10.1099/ijsem.0.003276.
- Xu S., Li Z., Huang Y., Han L., Che Y., Hou X., et al. Whole genome sequencing reveals the genomic diversity, taxonomic classification, and evolutionary relationships of the genus Nocardia. PLoS Negl. Trop. Dis. 2021;15 doi: 10.1371/journal.pntd.0009665.
- Zhang C., Zhang G., Chen Y., Zheng S., Du J., Zhao Z., et al. Sphingobacterium tenebrionis sp. nov., isolated from intestine of mealworm. Int. J. Syst. Evol. Microbiol. 2024;74 doi: 10.1099/ijsem.0.006455.
- Zhang M., Li A., Xu S., Chen M., Yao Q., Xiao B., et al. Sphingobacterium micropteri sp. nov. and Sphingobacterium litopenaei sp. nov., isolated from aquaculture water. Int. J. Syst. Evol. Microbiol. 2021;71 doi: 10.1099/ijsem.0.005091.
- Zhong Z., Kwok L.Y., Hou Q., Sun Y., Li W., Zhang H., et al. Comparative genomic analysis revealed great plasticity and environmental adaptation of the genomes of Enterococcus faecium. BMC Genomics. 2019;20:602. doi: 10.1186/s12864-019-5975-8.