Abstract
Dermatophilaceae polyphosphate-accumulating organisms (PAOs), formerly classified as Tetrasphaera PAOs, play pivotal roles in enhanced biological phosphorus removal (EBPR). However, their phylogenetic diversity, ecological preferences, and metabolic traits remain poorly characterized, and a robust marker gene for their classification is lacking. Here, we performed an extensive phylogenomic and metabolic analysis of Dermatophilaceae PAOs utilizing 46 newly recovered metagenome-assembled genomes from a laboratory-scale EBPR reactor treating high-strength wastewater and full-scale wastewater treatment plants. These analyses revealed a previously uncharacterized PAO genus, named here as Candidatus Dermatophostum, which shows a specific preference for high-phosphorus environments. Its representative species, Ca. Dermatophostum ammonifactor, was enriched in the EBPR reactor and its PAO phenotype was confirmed by polyphosphate staining and fluorescence in situ hybridization. Integrative meta-omics combining genomic, transcriptomic, and protein structure analyses revealed its specialized metabolic capabilities for phosphate metabolism, glycogen synthesis, and dissimilatory nitrate reduction to ammonium. Moreover, Ca. Dermatophostum was found to be widely distributed across wastewater treatment plants worldwide, underscoring both its diverse metabolic capabilities and potential engineering implications for mitigating nitrous oxide (N2O) emissions in EBPR systems. Finally, we propose a ppk1-based classification framework that resolves Dermatophilaceae PAOs into six distinct clades, consistent with whole-genome phylogeny, and demonstrate that ppk1 can serve as a reliable marker gene for tracking these populations. Together, these findings expand the ecological and functional understanding of Dermatophilaceae PAOs and highlight their promise for advancing sustainable wastewater treatment and resource recovery.
Keywords: enhanced biological phosphorus removal; polyphosphate-accumulating organisms; Dermatophilaceae; ppk1 marker; metabolic mechanism; phylogenetic diversity
Introduction
Phosphorus (P) is a non-renewable resource essential for all life forms. However, the global phosphate rock reserves, the primary source of phosphorus for fertilizers, are projected to be depleted within 50–400 years [1]. This looming scarcity underscores the urgent need for phosphorus recovery and recycling, particularly from wastewater. Enhanced biological phosphorus removal (EBPR) is a promising and cost-effective technology that utilizes polyphosphate-accumulating organisms (PAOs) to remove phosphorus from wastewater by storing it as polyphosphate (polyP) during alternating “feast-famine” cycles [2, 3]. The ability of PAOs to selectively accumulate phosphorus in this manner makes EBPR an attractive solution for sustainable wastewater treatment.
Historically, Ca. Accumulibacter was the first PAO identified using 16S rRNA gene-based sequencing analysis [4], and it remains one of the most extensively characterized PAO genera. However, subsequent studies have revealed that PAOs of the genus Tetrasphaera are more abundant in full-scale EBPR systems and may play a more significant role in phosphorus removal than previously recognized [5, 6]. Although the 16S rRNA gene has traditionally served as the primary marker for identification and phylogenetic delineation of Tetrasphaera PAOs [7], subsequent genome-based studies have revealed that the 16S rRNA gene lacks sufficient resolution to distinguish this lineage. As a result, genome-based taxonomy has revised the understanding of Tetrasphaera PAOs, placing them into multiple genera within the Dermatophilaceae family [8–11]. Recently, two novel genera of Dermatophilaceae PAOs, namely Ca. Phosphoribacter and Ca. Lutibacillus, were identified in wastewater treatment plants (WWTPs). Of these, Ca. Phosphoribacter comprises six species. Its two most abundant and often co-occurring species possess identical V1–V3 16S rRNA gene amplicon sequence variants, but show <95% genome-wide average nucleotide identity (ANI) and exhibit distinct metabolic capabilities [11]. These findings suggest that the diversity of Dermatophilaceae PAOs has likely been underestimated and emphasize the need for more robust markers to resolve the taxonomy and ecological distribution of Dermatophilaceae PAOs.
In addition to their taxonomic complexity, Dermatophilaceae PAOs exhibit diverse metabolic capabilities [3]. In terms of nitrogen metabolism, some members of this family encode nitrate and nitrite reductases [11, 12], yet complete denitrification pathways have not been detected, and the capacity for dissimilatory nitrate reduction to ammonium (DNRA) remains uncertain [13]. Furthermore, Dermatophilaceae PAOs are fermentative PAOs, showing distinct anaerobic metabolic processes compared to conventional Ca. Accumulibacter PAOs. These microorganisms utilize fermentation to generate energy for phosphate uptake and anaerobic maintenance [11, 12, 14, 15]. Notably, members of Dermatophilaceae PAOs synthesize different intracellular storage compounds during the anaerobic phase. Glycogen [12], free amino acids [16], polyhydroxyalkanoates (PHA) [17] and cyanophycin [11] have been suggested as potential storage compounds, but experimental evidence remains inconsistent and inconclusive [6, 18]. Thus, the metabolism of Dermatophilaceae PAOs during the anaerobic phase remains insufficiently understood, particularly for the novel lineages.
Despite recent advances in genomic research, the functional capabilities of Dermatophilaceae PAOs, particularly under high-strength wastewater conditions, remain incompletely characterized at multiple levels, including gene expression, metabolic activity, and protein structure. These conditions pose unique challenges and opportunities for cost-efficient phosphorus recovery, requiring PAOs that have specialized metabolic capacities and can thrive under nutrient fluctuations. Therefore, it is essential to explore the genomic and metabolic diversity of Dermatophilaceae PAOs to identify those with enhanced potential for phosphorus removal. In this study, we aimed to address these gaps by recovering 46 new metagenome-assembled genomes (MAGs) of Dermatophilaceae PAOs from both lab-scale reactors and global WWTPs. Specifically, we (i) systematically characterized the taxonomy and metabolism of these PAOs, (ii) identified ppk1 as a more informative phylogenetic marker than the 16S rRNA gene and developed an integrated genome-ppk1 classification framework that delineates six distinct clades, and (iii) identified a novel genus of PAOs, Ca. Dermatophostum, whose representative species, Ca. Dermatophostum ammonifactor, exhibits a specialized ecological preference for high-phosphorus conditions. By integrating genomic and transcriptomic approaches, this study not only advances the understanding of Dermatophilaceae PAOs but also supports their potential application in sustainable phosphorus removal and recovery in wastewater treatment.
Materials and methods
Bioreactor setup, sludge sampling, and fluorescence in situ hybridization
A 10 l sequencing batch reactor was operated to enrich PAOs under high-strength wastewater conditions. The influent contained 394.8 ± 34.7 mg/l total organic carbon (TOC), 134.0 ± 12.74 mg/l nitrogen, and 25.6 ± 1.7 mg/l phosphorus. Every 8 h, 5 l of synthetic wastewater was fed to the reactor, resulting in a hydraulic retention time of 16 h. The reactor was inoculated with activated sludge from a local WWTP in Hangzhou, China. Details of the reactor setup, operational strategy, and routine monitoring were described previously [19] and are also summarized in Supplementary Method S1. The efficiency of the reactor was evaluated during a steady-state cycle on Day 180.
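The hydraulic retention time quoted above follows directly from the feeding regime. The following is a minimal sketch of that arithmetic, assuming the stated 10 l is the working volume of the reactor; the function and argument names are illustrative and do not come from the original study.

```python
def hydraulic_retention_time(working_volume_l, feed_volume_l, cycle_length_h):
    """Return the hydraulic retention time (h) of a sequencing batch reactor.

    HRT = working volume / average volumetric feed rate.
    Assumption: the whole feed volume is exchanged once per cycle.
    """
    if feed_volume_l <= 0 or cycle_length_h <= 0:
        raise ValueError("feed volume and cycle length must be positive")
    feed_rate_l_per_h = feed_volume_l / cycle_length_h
    return working_volume_l / feed_rate_l_per_h


# Values taken from the reactor description: 10 l reactor, 5 l fed every 8 h.
hrt_h = hydraulic_retention_time(10.0, 5.0, 8.0)
print(f"HRT = {hrt_h:.0f} h")
```

With the values given in the text, the sketch reproduces the reported retention time of 16 h.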
For microbial community analysis, biomass samples were collected regularly for DNA and RNA extraction and sequencing (see below and Supplementary Dataset S1). To investigate the spatial organization of Dermatophilaceae PAOs, fluorescence in situ hybridization (FISH) was performed using 16S rRNA-targeted probes to visualize and identify Dermatophilaceae PAOs in the EBPR reactor on Day 180. The FISH protocol followed a previously described method [19], and the detailed experimental procedures and probe sequences are provided in Supplementary Method S2 and Table S1. DAPI staining was used to detect polyP in the cells, according to Nguyen et al. [20]. The stained sludge samples, including those with FISH probe and DAPI, were mounted with antifade solution (Leagene, China) and examined using a laser scanning confocal microscope (Zeiss LSM800, Germany). For polyP detection, the excitation wavelength was set to 364 nm, with the emission wavelength ranging from 540 to 590 nm, to exclude the emission wavelength of DNA (397–515 nm) [20].
DNA extraction, library construction, metagenomic sequencing
Sludge samples from the reactor were regularly collected and stored at −80°C until DNA extraction. Total genomic DNA was extracted using the FastDNA spin kit for soil (MP Biomedicals, USA) and stored at −20°C. Details of the DNA extraction and quality assessment are provided in Supplementary Method S3. Metagenomic DNA libraries were prepared using the NEB Next Ultra DNA Library Prep Kit for Illumina (NEB, USA). Sequencing was performed on the NovaSeq 6000 System (Illumina) using a 150 bp paired-end sequencing strategy at Novogene (Beijing, China). A total of 291.63 Gbp of metagenomic sequencing data were generated from the time series reactor samples (n = 16), and detailed information is available in Supplementary Dataset S1.
Metagenome pretreatment, assembly, and binning
The raw metagenomic data generated from the laboratory EBPR system underwent quality control using FastQC [21] v0.11.7 and MultiQC [22] v1.7. Raw reads were filtered to remove sequencing adapters and low-quality reads using Fastp [23] v0.19.7 and PRINSEQ-lite [24] v0.20.4. Two assembly strategies, single sample assembly and multiple sample co-assembly, were employed in this study, using SPAdes [25] v3.9.0 and MEGAHIT [26], respectively. The generated contigs were then binned into MAGs using three binning tools (i.e. MetaBAT2 [27], MaxBin [28], and CONCOCT [29]) in the MetaWRAP pipeline [30] v1.3.0. Detailed workflows and parameters are provided in Supplementary Method S4. To recover the genomes of Dermatophilaceae PAOs from global WWTPs, a metagenomic dataset (2.72 Tbp) shared by the Global Water Microbiome Consortium was downloaded from the NCBI SRA database (PRJNA509305) [31]. The single sample assembly and binning strategy described above was applied to recover MAGs from this global dataset.
Metagenome-assembled genome analysis and functional annotation
The recovered MAGs from different assembly strategies were dereplicated using dRep [32] v2.3.2. The relative abundance was calculated based on the read coverage from all dereplicated MAGs using CoverM (v0.2.0, https://github.com/wwood/CoverM). Genome taxonomy was determined using GTDB-Tk [33] v2.1.0 and its dependencies Prodigal [34] v2.6.3, HMMER [35] v3.1b2, pplacer [36] v1.1, FastANI [37] v1.32, FastTree [38] v2.1.9 and Mash [39] v2.2. Genome quality was assessed using CheckM [40] v1.2.0. MAGs assigned to the Dermatophilaceae family were selected for further comparative genomics analysis and functional annotation. The ANI between pairwise MAGs assigned to Dermatophilaceae was calculated using pyani [41] v0.2.12. The resulting ANI matrix was processed and visualized using R [42] v4.1.0. Correlation analyses between Dermatophilaceae PAO abundance and physicochemical parameters were performed based on Spearman’s rank correlation and linear regression.
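The correlation step above can be illustrated with a small, dependency-free sketch of Spearman's rank correlation: values are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. The original analysis was run in R and its exact function calls are not given here, so this is only an assumption-labeled illustration; the function names and the toy numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: relative abundance (%) vs. influent phosphorus (mg/l).
abundance = [0.5, 2.0, 8.0, 31.0]
influent_p = [3.0, 6.0, 12.0, 25.0]
print(spearman_rho(abundance, influent_p))
```

Because the statistic depends only on ranks, it captures monotonic but non-linear relationships, which is why it is commonly paired with linear regression as in the text.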
Protein-coding genes were predicted from MAG contigs using Prodigal [34] v2.6.3 and initially annotated using Prokka [43] v1.14.6. To improve the functional annotation accuracy, additional annotation was performed using KofamScan [44] against the KEGG database [45] and EnrichM v0.5.0, which uses Diamond [46] v0.9.22.123 to search the protein sequences against a KO-annotated uniref100 database. Key genes involved in the specialized metabolism of novel PAOs were further verified via blastp searches against NCBI and UniProt databases. The combined annotations were manually cross-validated and used to reconstruct metabolic pathways, guided by KEGG Mapper v4.1 [47] and visualized in BioRender (https://biorender.com/). Detailed parameters are provided in Supplementary Method S5.
Phylogenetic analysis of novel polyphosphate accumulating organisms
Phylogenetic analysis of PAOs was performed based on genome, 16S rRNA gene, and ppk1 gene sequences. First, a genome-level phylogenetic tree was generated using GTDB-Tk [33] v2.1.0. Genomes of the Dermatophilaceae family in GTDB were downloaded and used as reference genomes. The genome tree was re-rooted by setting three Kineococcus MAGs as the outgroup. Second, the 16S rRNA gene sequences of Dermatophilaceae MAGs were predicted and extracted using Barrnap (https://github.com/tseemann/barrnap) with the options “--kingdom bac --outseq”. The extracted sequences were aligned using MAFFT [48] v7.505 with the “--auto” parameter. A maximum-likelihood tree was constructed from the gene alignments using FastTree [38] v2.1.11 with default parameters. The full-length 16S rRNA gene sequences from the families Intrasporangiaceae and Dermatophilaceae in the MiDAS4.8 [49] database were used as reference sequences and outgroups, respectively. Third, ppk1 gene sequences detected in the newly recovered Dermatophilaceae MAGs and their reference genomes were extracted according to their genome annotation results. The obtained ppk1 gene set was aligned using MAFFT [48] v7.505 with the “--auto” parameter. A phylogenetic tree was constructed using FastTree [38] v2.1.11 with default parameters. All three phylogenetic trees were visualized, annotated and refined using iTOL [50].
RNA isolation, metatranscriptomic sequencing, and bioinformatics analysis
Dermatophilaceae PAOs exhibit metabolic traits distinct from classical PAOs during the anaerobic phase, including carbon utilization, fermentation and storage compound synthesis. To investigate anaerobic-phase metabolism and gene expression in Dermatophilaceae PAOs, 15 biomass samples were collected at the end of the anaerobic phase for metatranscriptomic analysis. To improve the RNA yield, every three consecutive time-point samples were pooled equally for RNA extraction. Detailed information about the biomass samples is summarized in Supplementary Dataset S1. Total RNA was extracted using the RNA PowerSoil Total RNA Isolation Kit (MoBio, USA) following the manufacturer’s instructions. Residual DNA was removed using RNase-Free DNase I (TIANGEN, China). RNA quality, concentration, and integrity were determined as described in Supplementary Method S6. The ribosomal RNA was removed using the TIANSeq rRNA Depletion Kit (TIANGEN, China). The rRNA-depleted RNA was then reverse-transcribed, and cDNA libraries were prepared using the TruSeq Stranded mRNA Kit (Illumina, USA). The constructed libraries were sequenced on the NovaSeq 6000 System (Illumina) using a paired-end (2 × 150) sequencing strategy at the Personal Biotechnology Co., Ltd. (Shanghai, China).
Raw metatranscriptomic reads were subjected to quality control prior to further analysis. Sequencing adapters were trimmed using cutadapt [51] (v1.17), and low-quality reads were removed using a sliding-window algorithm in fastp [23] (v0.20.0). Ribosomal RNA reads were removed with sortMeRNA [52]. The remaining high-quality reads were mapped to the MAGs using hisat2 [53]. Gene-level read counts were calculated using HTseq [54] with the intersection-strict mode. Transcript abundance was then normalized to transcripts per kilobase million (TPM) using StringTie [55], thereby accounting for gene length and sequencing depth. Detailed parameters are provided in Supplementary Method S6.
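To make the normalization concrete, the sketch below applies the standard TPM definition to a table of gene-level read counts: counts are first divided by gene length in kilobases, and the resulting rates are scaled so that they sum to one million per sample. It illustrates the general formula only and is not a reimplementation of StringTie; the gene names and numbers are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM for a single sample.

    counts:     mapping of gene name -> mapped read count
    lengths_bp: mapping of gene name -> gene length in base pairs
    """
    # Step 1: reads per kilobase corrects for gene length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Step 2: scaling by the per-sample total corrects for sequencing depth.
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / total_rpk * 1e6 for gene, value in rpk.items()}


# Hypothetical example: equal read counts, but gene_b is twice as long.
example_counts = {"gene_a": 100, "gene_b": 100}
example_lengths = {"gene_a": 1000, "gene_b": 2000}
tpm = counts_to_tpm(example_counts, example_lengths)
for gene, value in tpm.items():
    print(gene, round(value, 1))
```

In this toy case the shorter gene receives twice the TPM of the longer one despite identical read counts, which is the length correction referred to in the text. Because TPM values are relative within a sample, summed TPM for a genome reflects both population abundance and per-cell transcription.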
Protein structure prediction, ligand docking, and conservation analysis
Protein structure prediction was performed using AlphaFold2 via the ColabFold pipeline [56]. Model quality was evaluated using per-residue pLDDT scores, with structures exhibiting average scores above 90 considered reliable for downstream analysis. Ligand-binding sites and molecular docking were predicted using CB-Dock2 [57]. The evolutionary conservation of amino acid residues was analyzed using ConSurf [58]. Detailed settings and parameters are provided in Supplementary Method S7.
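The reliability criterion above can be expressed as a simple filter on the per-residue pLDDT scores of a predicted model. The sketch below assumes the scores have already been extracted into a list of floats (for example from the B-factor column of the predicted structure file); the helper names and example values are hypothetical.

```python
def mean_plddt(per_residue_plddt):
    """Return the average pLDDT over all residues of a predicted model."""
    if not per_residue_plddt:
        raise ValueError("no pLDDT scores supplied")
    return sum(per_residue_plddt) / len(per_residue_plddt)


def is_reliable(per_residue_plddt, threshold=90.0):
    """Apply the criterion from the text: average pLDDT above 90."""
    return mean_plddt(per_residue_plddt) > threshold


# Hypothetical per-residue scores for two predicted models.
confident_model = [95.2, 93.8, 91.0, 96.4]
uncertain_model = [88.0, 72.5, 91.3, 60.1]
print(is_reliable(confident_model), is_reliable(uncertain_model))
```

An average hides local variation, so a model that passes this filter may still contain low-confidence loops; inspecting per-residue scores around predicted binding pockets is a sensible complement.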
Results and discussion
Bioreactor performance and microbial community structure
The EBPR reactor achieved efficient phosphorus removal with an average removal efficiency of 90.8 (±3.5) % during the stable phase (Days 142–266), at an influent phosphate concentration of 0.82 ± 0.004 mmol P/L (equal to 25.6 ± 1.7 mg P/L, Fig. 1a). The reactor exhibited typical EBPR dynamics, with anaerobic phosphorus release and aerobic phosphorus uptake rates of 0.61 and 0.79 mmol P/L, respectively, on Day 180 (Fig. S1). This performance was comparable to previously reported EBPR systems enriched with Tetrasphaera-related PAOs (95% biovolume) [17], indicating efficient phosphorus uptake and removal.
Figure 1.
Linking phosphorus removal performance with microbial composition and dynamics in the EBPR reactor. (a) Time series of phosphorus concentrations in the influent and effluent over 266 days of reactor operation. (b) Temporal dynamics of microbial community composition at the family level, based on metagenomic read mapping. (c) Phylogenomic tree showing the placement of 30 Dermatophilaceae MAGs recovered from the EBPR reactor (pink, in the third concentric ring), 16 MAGs from 226 global WWTPs (blue), and 14 MAGs from a previous study by Singleton et al. (green) [11]. The phylogenetic tree was constructed based on the concatenated alignment of 120 single-copy marker gene proteins using GTDB-Tk. The genomes taxonomically classified as Dermatophilaceae family in GTDB were downloaded and used as reference genomes. The tree was re-rooted using 3 Kineococcus MAGs as the outgroup. Outer concentric rings (from inside to outside) indicate genome completeness, contamination, and source, respectively. (d) Temporal dynamics of high-abundance Dermatophilaceae MAGs in the EBPR reactor during the 261-day operation.
The improved reactor performance coincided with the enrichment of Dermatophilaceae, whose relative abundance increased from 0.7% in the seed sludge to 27.5% by Day 102 and 45.3% by Day 261 (Fig. 1b). This trend paralleled the increase in phosphorus removal efficiency from 48.3% at Day 20 to 87.4% at Day 105 and 95.6% at Day 266 (Fig. 1a), suggesting a functional role for this lineage in phosphorus removal. In addition to phosphorus, the system also demonstrated substantial nitrogen and organic carbon removal during the stable phase (Days 142–266). The total nitrogen decreased from 9.57(±0.91) mmol/l to 1.39(±0.37) mmol/l, and TOC was reduced from 32.9(±1.74) mmol/l to 0.61(±0.45) mmol/l, corresponding to removal efficiencies of 72.7(±13.4) % for nitrogen and 98.1(±1.4) % for TOC, respectively (Fig. S2). These results collectively confirm the system’s capability for simultaneous removal of phosphorus, nitrogen, and organic carbon.
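The molar concentrations and removal efficiencies reported above can be related to the mass-based influent values in the Methods with two short helper functions. This is a minimal sketch assuming standard atomic masses for carbon and nitrogen; the function names are illustrative. Note that an efficiency computed from mean influent and effluent concentrations need not equal the mean of per-sample efficiencies.

```python
# Standard atomic masses (g/mol) used for the unit conversion.
MOLAR_MASS_G_PER_MOL = {"C": 12.011, "N": 14.007}


def mg_per_l_to_mmol_per_l(concentration_mg_per_l, element):
    """Convert an element-based mass concentration to mmol/l."""
    return concentration_mg_per_l / MOLAR_MASS_G_PER_MOL[element]


def removal_efficiency_percent(influent, effluent):
    """Percentage of the influent load removed (same units for both inputs)."""
    if influent <= 0:
        raise ValueError("influent concentration must be positive")
    return (influent - effluent) / influent * 100.0


# Influent values from the Methods: 394.8 mg/l TOC and 134.0 mg/l nitrogen.
toc_in_mmol = mg_per_l_to_mmol_per_l(394.8, "C")
n_in_mmol = mg_per_l_to_mmol_per_l(134.0, "N")
print(round(toc_in_mmol, 1), round(n_in_mmol, 2))

# TOC removal from the mean concentrations reported in the text.
print(round(removal_efficiency_percent(32.9, 0.61), 1))
```

With these inputs the conversion recovers the reported influent values of 32.9 mmol/l TOC and 9.57 mmol/l nitrogen, and the TOC removal matches the stated 98.1%.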
Genome phylogeny reveals novel taxonomic diversity within Dermatophilaceae polyphosphate-accumulating organisms
From 291.63 Gbp metagenomic data derived from 15 EBPR reactor samples, we recovered 382 medium and high-quality MAGs (>70% completeness and <10% contamination), including 30 assigned to the Dermatophilaceae family (Supplementary Dataset S2). These MAGs represented on average 73.8 (± 5.8) % of metagenomic reads across reactor samples and 28.9% in the seed sludge, indicating their representativeness in the EBPR microbiome. To extend the phylogenetic landscape of Dermatophilaceae PAOs, we further recovered and screened 2641 MAGs from a global metagenomic dataset spanning 226 WWTPs [31], identifying 16 additional Dermatophilaceae MAGs (Supplementary Dataset S2). These newly recovered MAGs greatly expand the known genomic repertoire of the family and provide a valuable resource for exploring taxonomic relationships, metabolic traits, and ecological niches (see below).
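The quality thresholds used to retain MAGs can be written as a simple filter over genome quality estimates such as those reported by CheckM. The sketch below is illustrative only: the record type and the two EBPR_bin names are hypothetical, whereas the Ammonifactor values (97.6% completeness, 0% contamination) are those reported for that genome in this study.

```python
from typing import NamedTuple


class MagQuality(NamedTuple):
    name: str
    completeness: float   # percent
    contamination: float  # percent


def passes_quality(mag, min_completeness=70.0, max_contamination=10.0):
    """Apply the criteria from the text: >70% complete and <10% contaminated."""
    return mag.completeness > min_completeness and mag.contamination < max_contamination


mags = [
    MagQuality("Ammonifactor", 97.6, 0.0),
    MagQuality("EBPR_bin.hypothetical_1", 65.0, 2.1),   # too incomplete
    MagQuality("EBPR_bin.hypothetical_2", 88.4, 12.7),  # too contaminated
]
retained = [mag.name for mag in mags if passes_quality(mag)]
print(retained)
```

Both conditions must hold, so a nearly complete genome is still discarded when its contamination estimate exceeds the limit.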
Of the 46 MAGs recovered, 44 were assigned to eight known or candidate genera within Dermatophilaceae: JAGOME01 (3), Austwickia (7), Kineosphaera (4), Ca. Phosphoribacter (14), Ca. Lutibacillus (4), Tetrasphaera_A (7), Phycicoccus_A (4), and Ornithinibacter (1), whereas two remained unclassified (Fig. 1c and Supplementary Dataset S2). JAGOME01, a currently uncharacterized genus in GTDB, was initially identified from WWTP activated sludge [59]. Phylogenomic analysis further revealed that the JAGOME01 MAGs formed a monophyletic group closely related to the known PAO genus Phycicoccus_A (Fig. 1c). This close evolutionary proximity suggests that JAGOME01 may share functional traits with known PAOs. Indeed, all JAGOME01 MAGs encoded key genes involved in polyP metabolism (e.g. ppk1, ppk2, ppx and ppgk) and phosphorus transport (e.g. pit or pstSABC) [13], supporting their functional potential as PAOs (Fig. 2 and Supplementary Dataset S3). The ANI values between JAGOME01 and other PAO genera ranged from 75% to 77%, consistent with genus-level ANI boundaries in Dermatophilaceae [11], supporting the classification of JAGOME01 as a novel genus (Fig. S3 and Supplementary Dataset S4). Based on these findings, we propose the novel genus Ca. Dermatophostum, named to reflect its taxonomic affiliation (within family Dermatophilaceae), environmental origins (“tum” from the Latin lutum, meaning mud), and phosphorus-assimilating function.
Figure 2.
Functional potential of Dermatophilaceae MAGs and their closest relatives. (a) The maximum likelihood phylogenetic tree based on ppk1 gene sequences retrieved from Dermatophilaceae MAGs. (b) Comparison of the functional potential among Dermatophilaceae MAGs, with a focus on nutrient removal-related functions. Three representative MAGs of Ca. Phosphoribacter (EBPR_bin.20), Ca. Lutibacillus (EBPR_bin.18), and Ca. Dermatophostum (Ammonifactor), which exhibit high abundance (>1%) in EBPR reactor, are highlighted in red. Gene names are grouped based on their related functions: polyphosphate (polyP) synthesis, phosphorus (P) transport, nitrogen (N) cycling, amino acid transport and synthesis, polyhydroxyalkanoates (PHA) synthesis, glycogen synthesis and fermentation.
The representative species, Ca. Dermatophostum ammonifactor, was selectively enriched in the EBPR reactor. Its genome (97.6% completeness and 0% contamination) reached a peak relative abundance of 31.4% on Day 127 and it accounted for 87.7% of the total Dermatophilaceae PAO abundance during Days 121–200 (Fig. 1d), indicating a central role in the reactor. FISH analysis detected a high proportion of Dermatophilaceae PAOs within the reactor community, and DAPI staining revealed stronger signals under aerobic conditions (Fig. 3a–c) compared to anaerobic conditions (Fig. 3d–f). Given the dominance of Ca. Dermatophostum ammonifactor as the primary PAO, these results suggest that most cells detected by FISH and DAPI staining may belong to the novel PAO Ca. Dermatophostum ammonifactor. Additionally, we found that all three Ca. Dermatophostum MAGs were recovered from the reactor, whereas most Ca. Phosphoribacter MAGs (10/14) originated from global WWTPs. Environmental conditions, particularly influent phosphorus concentrations, differ between the EBPR reactor and WWTPs. This contrast indicates habitat-driven niche differentiation within Dermatophilaceae PAOs, associated with their distinct metabolic traits, which will be discussed in subsequent sections.
Figure 3.
Micrographs of the EBPR microbiome at Day 180, co-stained with FISH probes and DAPI. Bacteria targeted by the EUBmix probe are shown in green; the 16S rRNA of Dermatophilaceae PAOs targeted by the TETmix probe is shown in orange; and the light yellow indicates dual staining with EUBmix and TETmix probes targeting PAOs. Bacterial cells that accumulated polyP are stained with DAPI and present in blue. Panels (a) and (d) show dual staining with EUBmix and TETmix probes; panels (b) and (e) show staining with TETmix only; and panels (c) and (f) show DAPI staining only. Panels (a–c) correspond to samples taken from the aerobic phase, and panels (d–f) correspond to samples from the anaerobic phase. Detailed information on FISH probes can be found in Supplementary Method S2 and Table S1.
Polyphosphate kinase ppk1 as a robust phylogenetic marker for Dermatophilaceae polyphosphate-accumulating organisms
Although widely used, the 16S rRNA gene lacks sufficient resolution to distinguish closely related Dermatophilaceae PAOs [60]. For example, Ca. Dermatophostum ammonifactor was misclassified as Tetrasphaera midas_s_4428, with a global sequence similarity of 98.1% (Supplementary Dataset S5). Similarly, known PAOs such as Phycicoccus_A elongatus, Ca. Phosphoribacter, and Ca. Lutibacillus have historically been grouped under Tetrasphaera based solely on 16S rRNA gene sequences [11, 13], highlighting the limited taxonomic resolution of the 16S rRNA gene.
Given its essential role in polyP synthesis and its widespread use in delineating Ca. Accumulibacter clades [61], we hypothesized that ppk1 presents a promising alternative phylogenetic marker for resolving the diversity of Dermatophilaceae PAOs. To evaluate this, we conducted a comprehensive comparison between ppk1 and genome-based phylogenies for Dermatophilaceae PAOs. The ppk1-based phylogeny closely mirrored the genome-based phylogeny (Fig. 4), suggesting ppk1 reliably reflects whole-genome evolutionary relationships within this lineage.
Figure 4.
Comparison of genome-based, ppk1-based, and 16S rRNA gene-based phylogenies of Dermatophilaceae PAOs. The maximum-likelihood genome tree was constructed from the alignment of 120 single copy marker gene proteins, each trimmed to 5000 amino acids before alignment to ensure uniform sequence length and remove non-relevant regions, using GTDB-Tk. The maximum-likelihood ppk1 and 16S rRNA gene trees were generated from alignments of the ppk1 and 16S rRNA genes extracted from the genomes. Background colors represent the clade divisions based on ppk1 and 16S rRNA gene phylogeny. The alignment of clade structure across the three trees highlights the higher resolution and congruence of ppk1 and genome-based phylogenies, in contrast to the limited discriminatory power of the 16S rRNA gene.
Based on the consistent ppk1 and genome-based phylogenies, Dermatophilaceae PAOs can be grouped into at least six distinct clades (1–6), corresponding to genus-level taxonomic divisions within this family (Fig. 4 and Supplementary Dataset S4). Within these clades, multiple subclades reflect species-level divisions, demonstrating the fine-scale discriminatory power of ppk1 (Figs S4 and S5). Specifically, the genera Phycicoccus_A, Tetrasphaera_A, Tetrasphaera, Ca. Phosphoribacter, Ca. Lutibacillus and Ca. Dermatophostum corresponded exclusively to clades 1–6, respectively, with their subclades corresponding to distinct species within each genus (Fig. 4, Figs S4 and S5). These findings support our hypothesis that ppk1 serves as a reliable genetic marker for resolving species-level diversity in Dermatophilaceae PAOs and establish an effective framework for their classification. They also highlight the potential of ppk1 as a broadly applicable PAO marker, warranting further investigation across diverse PAO lineages. Furthermore, ppk1 exhibited higher recoverability than the 16S rRNA gene in our MAGs. Specifically, all 46 Dermatophilaceae MAGs contained ppk1, whereas only six contained 16S rRNA gene sequences. This underscores the utility of ppk1 as an accessible marker for detecting, classifying, and tracking Dermatophilaceae PAOs across diverse ecosystems and provides a practical framework for future comparative and functional studies of this microbial lineage in EBPR processes.
Comparative transcriptomic and evolutionary insights into Dermatophilaceae polyphosphate-accumulating organisms
Dermatophilaceae PAOs exhibit metabolic traits that differ from classical PAOs, such as Ca. Accumulibacter, particularly in the anaerobic phase, including organic substrate utilization, fermentation, and intracellular storage compound synthesis. To characterize anaerobic-phase gene expression in novel Dermatophilaceae PAOs and to compare them with other coexisting PAOs in the system, including previously described genera such as Ca. Phosphoribacter and Ca. Lutibacillus [11], we performed metatranscriptomic analyses of PAO MAGs recovered from the EBPR reactor.
Overall expression activity
During the anaerobic phase, Dermatophilaceae PAOs consistently represented a substantial proportion of the total PAO-associated transcripts, accounting for 79.5% of the summed PAO transcripts during Days 54–66 (Fig. 5a). Within this group, Ca. Dermatophostum contributed the majority of the Dermatophilaceae-associated transcripts, comprising 92.8% of this fraction during Days 113–127 (Fig. 5b). In combination with its high abundance as the most prevalent PAO in the reactor (Fig. 5c), these results suggest that the Ca. Dermatophostum genus plays an important functional role in the EBPR reactor. Within this genus, Ca. Dermatophostum ammonifactor was the most abundant and transcriptionally active species, increasing from 0.02% at Day 0 to 31.43% at Day 127, and remaining at 21.38% during Days 127–179 (Fig. 1d). Correspondingly, its total transcript abundance increased from 608.0 TPM at Day 9 to 17 932.8 TPM at Day 54, and remained elevated at 20 708.1 TPM during Days 113–179 (Fig. 5d). In contrast, anaerobic-phase transcripts from other coexisting PAOs, including Ca. Lutibacillus EBPR_bin.18 and Ca. Phosphoribacter EBPR_bin.20, declined sharply over time, from 5254.8 and 27 295.5 TPM at Day 54 to only 57.6 and 28.6 TPM at Day 224, respectively (Fig. 5d). Although transcript abundance reflects both transcriptional activity and population abundance, these results indicate that Ca. Dermatophostum ammonifactor is the dominant and transcriptionally active PAO in the EBPR reactor. In addition, the average transcript abundance of housekeeping genes in Ca. Dermatophostum ammonifactor (151.4 TPM) was higher than in Ca. Lutibacillus EBPR_bin.18 (42.2 TPM) and Ca. Phosphoribacter EBPR_bin.20 (47.6 TPM) during Days 54–179 (Fig. 5e). As housekeeping gene abundance correlates with the level of active cells in a microbial community [62], this result indicates that Ca. Dermatophostum ammonifactor maintained a higher proportion of transcriptionally active cells, further supporting its important role in the reactor.
Figure 5.
Activity, gene expression, genomic abundance and protein structure of novel Dermatophilaceae PAOs in lab-scale EBPR system. (a) Total transcriptional activity of Dermatophilaceae PAOs. (b) Expression profiles of the novel PAO genus Ca. Dermatophostum. (c) Genomic abundance of potential PAOs in EBPR reactor. (d) Metatranscriptomic comparison of all potential PAO MAGs in the bioreactor. (e) Expression of genes related to phosphorus metabolism, nitrogen metabolism and organic substrate transport (amino acids and sugars), fermentation, and storage polymer synthesis (PHA and glycogen). These data include three representative Dermatophilaceae PAOs MAGs, namely EBPR_bin.20 (clade 4), EBPR_bin.18 (clade 5), and Ammonifactor (clade 6), which belong to Ca. Phosphoribacter (Ca. P.), Ca. Lutibacillus (Ca. L.) and Ca. Dermatophostum (Ca. D.), respectively. Each column corresponds to a sampling stage during reactor operation: S1 (Days 9–23), S2 (Days 54–66), S3 (Days 113–127); S4 (Days 170–179). Each row represents the gene expression of a genome. Color intensity represents the log-transformed normalized expression level, measured in TPM. White (empty) boxes indicate the absence of the corresponding gene in the genome. Detailed expression values are provided in Supplementary Dataset S7. (f) Conservation and binding pocket of the pit transporter in Ca. Dermatophostum ammonifactor. Surface representation of five predicted binding pockets on the pit transporter protein, with conservation scores mapped onto the surface. The color scale represents varying degrees of conservation: blue (variable), cyan (average), and pink (conserved), with yellow indicating insufficient data. Binding pocket characteristics and conservation score of contact residue of the pit transporter in Ca. Dermatophostum ammonifactor as predicted by CB-Dock2.
Polyphosphate metabolism
To evaluate transcriptional activity in polyP metabolism during the anaerobic phase, we profiled transcripts of genes involved in polyP synthesis, hydrolysis and phosphorus transport (gene list in Supplementary Dataset S6). Once the EBPR community was established and stabilized (Days 54–179), ppk1 and ppk2 were consistently expressed in Ca. Dermatophostum ammonifactor, with average transcript abundances of 261.8 and 2121.8 TPM, respectively. In comparison, Ca. Lutibacillus EBPR_bin.18 only expressed ppk2 during Days 54–66, and Ca. Phosphoribacter EBPR_bin.20 expressed ppk1 and ppk2 during Days 54–127 (Fig. 5e). This pattern suggests that Ca. Dermatophostum ammonifactor exhibits persistent phosphorus metabolic activity in the EBPR reactor. Moreover, ppk2 showed higher expression than ppk1, consistent with its preferential role in catalyzing polyP hydrolysis for energy production under anaerobic conditions [2, 63, 64]. In addition, all three genomes co-expressed exopolyphosphatase (ppx) with ppk2 for polyP hydrolysis, whereas endopolyphosphatase (ppn1) was absent (Figs 2b and 5d), consistent with its exclusive presence in eukaryotic cells [65].
For polyP degradation, Dermatophilaceae PAOs encoded only the ppgk gene, whereas the pap gene was absent (Figs 2 and 5e). Metatranscriptomic data confirmed ppgk expression in Ca. Dermatophostum ammonifactor (58.2 TPM), Ca. Lutibacillus EBPR_bin.18 (3036.9 TPM), and Ca. Phosphoribacter EBPR_bin.20 (20 TPM), suggesting that Dermatophilaceae PAOs preferentially utilize the ppgk pathway for anaerobic energy generation. In contrast, classical Ca. Accumulibacter PAOs and Azonexus PAOs encode only the pap gene [13], which transfers phosphate from polyP to AMP rather than directly phosphorylating glucose [66], thereby requiring additional steps for energy metabolism [13]. This difference implies that Dermatophilaceae PAOs may exhibit greater metabolic efficiency under anaerobic conditions, enhancing their resilience to environmental fluctuations, such as changes in oxygen availability or substrate supply.
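The transcript abundances in this section are reported in TPM. As a minimal sketch of how TPM is conventionally derived from read counts, the following illustrates the standard definition (length-normalize each gene, then rescale so that the values sum to one million). The function name and the example counts and gene lengths are hypothetical and are not taken from the study's pipeline or data.

```python
def counts_to_tpm(read_counts, gene_lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    Standard definition: divide each count by the gene length in kilobases
    (reads per kilobase, RPK), then rescale so all values sum to 1e6.
    """
    if len(read_counts) != len(gene_lengths_bp):
        raise ValueError("counts and lengths must have the same length")
    if any(length <= 0 for length in gene_lengths_bp):
        raise ValueError("gene lengths must be positive")

    # Reads per kilobase for each gene.
    rpk = [count / (length / 1000.0)
           for count, length in zip(read_counts, gene_lengths_bp)]
    total_rpk = sum(rpk)
    if total_rpk == 0:
        # No mapped reads at all: every gene has zero expression.
        return [0.0 for _ in rpk]

    # Rescale so that the TPM values sum to one million.
    return [value / total_rpk * 1e6 for value in rpk]


# Hypothetical example: three genes with counts proportional to their lengths,
# so all three end up with the same TPM.
example_tpm = counts_to_tpm([10, 20, 30], [1000, 2000, 3000])
print(example_tpm)
```

Because TPM values are rescaled within each sample, they describe relative transcript abundance in that sample rather than absolute transcript numbers.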
Phosphate transport
Regarding phosphate transport, Ca. Lutibacillus EBPR_bin.18 and Ca. Phosphoribacter EBPR_bin.20 encoded and expressed both the high-affinity phosphate transporters (pstSCAB) and low-affinity phosphate transporters (pit) (Figs 2 and 5e). In contrast, Ca. Dermatophostum ammonifactor, unlike these PAOs and the well-studied model organism Phycicoccus_A elongatus (clade 1), exclusively encoded and expressed the pit transporter. Structural modeling of the pit transporter in Ca. Dermatophostum ammonifactor predicted a 329-residue structure with a high average pLDDT score of 93.4 (Fig. S6), indicating high confidence in both its global fold and residue-level accuracy. Protein-ligand docking analysis identified five potential binding pockets, with pockets 1 and 2 standing out due to their larger cavity volume and stronger binding affinity (Fig. 5f). Further conservation analysis indicated that pocket 1 had a moderate conservation score of 5.0, whereas pocket 2 exhibited a higher score of 6.6. These findings suggest that pocket 1 may represent a more flexible region for modulating phosphate accommodation, whereas pocket 2 likely plays a critical role in preserving structural stability and functionality of the transporter.
The pit transporter is proton-driven and high-throughput and is favored in phosphate-rich environments [67, 68]. Conversely, the pst transporter is ATP-driven and typically more efficient in low-phosphate environments [69, 70]. The absence of pst in Ca. Dermatophostum ammonifactor may reflect limited competitiveness under low-phosphorus conditions, while conferring an advantage in high-phosphate environments. Whether the pst system was lost through evolutionary selection or was never present in Ca. Dermatophostum ammonifactor requires further investigation, but this phosphate uptake trait highlights its potential relevance for high-phosphate wastewater treatment.
Nitrogen transformation and cycling
Integrating biological nitrogen and phosphorus removal is critical for optimizing wastewater treatment processes [71]. Like the extensively studied Phycicoccus_A elongatus of clade 1, Ca. Dermatophostum ammonifactor and Ca. Phosphoribacter EBPR_bin.20 encoded respiratory nitrate reductase genes (narG/H/I) (Fig. 2 and Supplementary Dataset S3), suggesting their ability to reduce nitrate (NO3−) to nitrite (NO2−). Metatranscriptomic analysis confirmed the expression of narG/H/I in Ca. Dermatophostum ammonifactor and Ca. Phosphoribacter EBPR_bin.20 in the anaerobic phase, with the average expression values of 234.8 and 34.2 TPM, respectively (Fig. 5e and Supplementary Dataset S7). Ca. Lutibacillus EBPR_bin.18, however, lacked narG/H/I but encoded and expressed nirK and norB, which are associated with the reduction of NO2− to nitrous oxide (N2O).
All Dermatophilaceae PAOs lacked the nitrous oxide reductase gene (nosZ) (Fig. 2), indicating an inability to reduce N2O to N2. Ca. Dermatophostum ammonifactor (clade 6) encoded and expressed the complete nrfA/nrfH operon in the anaerobic phase (Supplementary Dataset S3 and S7). The predicted nrfA protein exhibits high similarity to a homolog from the same family (96% coverage, 81.8% identity) (Fig. S7). This high sequence similarity, together with the presence of the complete nrfA/nrfH operon, suggests that Ca. Dermatophostum ammonifactor has the potential to perform DNRA. A previous study reported that Tetrasphaera japonica (clade 3) encoded nirB and nirD for DNRA [12]. We further found that members of the Tetrasphaera_A genus (clade 2), including EBPR_bin.31, SRR14932593_bin.9 and EBPR_bin.224, also encoded nirB and nirD (Fig. 2). In contrast, DNRA-related genes were not detected in clades 1, 4 and 5. Together, these findings indicate a functional divergence in nitrogen metabolism among Dermatophilaceae PAOs.
The expression of nrfA and nrfH peaked at Days 54–66 with 72.8 and 42.2 TPM, respectively, which was lower than both the housekeeping gene average (149 TPM) and the expression of the denitrification gene norB (1161.9 TPM). When interpreted in the context of relative transcriptional investment across nitrogen transformation pathways, these patterns suggest that denitrification-associated processes were more transcriptionally prominent than DNRA in Ca. Dermatophostum ammonifactor under the examined EBPR conditions. Although DNRA activity was relatively low in this system, this pathway represents a truncated nitrogen cycle in which nitrate is reduced directly to ammonium, fundamentally reducing N₂O emissions [72]. Future studies should aim to elucidate the environmental factors and ecological trade-offs that regulate DNRA expression in EBPR systems, which could open new avenues for utilizing DNRA-capable PAOs, such as Ca. Dermatophostum ammonifactor, to simultaneously recover phosphorus and mitigate greenhouse gas emissions in EBPR systems.
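The comparison of DNRA and denitrification transcripts against the housekeeping gene average can be made explicit with a short calculation. The sketch below uses only the TPM values quoted in the paragraph above; the helper name `relative_expression` and the choice to report a log2 ratio are illustrative assumptions, not part of the original analysis.

```python
import math

# TPM values as quoted in the text (peak nrfA/nrfH expression at Days 54-66,
# norB expression, and the housekeeping gene average).
reported_tpm = {"nrfA": 72.8, "nrfH": 42.2, "norB": 1161.9}
housekeeping_mean_tpm = 149.0


def relative_expression(tpm, reference_tpm):
    """Return (ratio, log2 ratio) of a gene's TPM relative to a reference TPM."""
    if tpm <= 0 or reference_tpm <= 0:
        raise ValueError("TPM values must be positive to take a log ratio")
    ratio = tpm / reference_tpm
    return ratio, math.log2(ratio)


relative = {gene: relative_expression(tpm, housekeeping_mean_tpm)
            for gene, tpm in reported_tpm.items()}

for gene, (ratio, log2_ratio) in relative.items():
    print(f"{gene}: {ratio:.2f}x housekeeping mean (log2 = {log2_ratio:+.2f})")
```

With the reported values, nrfA and nrfH fall at roughly 0.49 and 0.28 times the housekeeping mean, whereas norB is about 7.8 times higher, which is the asymmetry the text describes.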
Organic substrate transport and fermentation
Dermatophilaceae PAOs in the bioreactor exhibited a diverse array of transporter genes for organic substrate uptake. Ca. Dermatophostum ammonifactor, Ca. Lutibacillus EBPR_bin.18 and Ca. Phosphoribacter EBPR_bin.20 actively expressed genes associated with branched-chain amino acid transport (livKHMGF), polar amino acid transport (paSP) and saccharide transport (malK) under anaerobic conditions (Fig. 5e), suggesting their role in amino acid and saccharide uptake in the reactor. Beyond these substrates, they also expressed genes for the uptake of putrescine (puuP), glycine (opuD), maltose (malY), and fucose (fucP) under anaerobic conditions (Supplementary Note S1, Fig. 5e and Fig. S8), indicating their metabolic versatility. The diversity in transporter expression suggests niche differentiation among PAOs, reducing direct competition and supporting coexistence through the exploitation of different carbon sources and metabolic strategies.
In addition to substrate uptake, Dermatophilaceae PAOs exhibited the capacity to anaerobically ferment amino acids and glucose into central metabolites, such as succinate, pyruvate, and acetate. For example, Ca. Dermatophostum ammonifactor and Ca. Phosphoribacter EBPR_bin.20 encoded and expressed key genes (phaA, fadA and acd) for valine and leucine fermentation to succinate, while Ca. Dermatophostum ammonifactor, Ca. Lutibacillus EBPR_bin.18 and Ca. Phosphoribacter EBPR_bin.20 encoded and expressed genes for alanine fermentation to succinate (gltB, gdhA, ald, and glnA) and pyruvate (aceF and aceE) in the anaerobic phase (Fig. 5e). Additionally, Ca. Dermatophostum ammonifactor and Ca. Lutibacillus EBPR_bin.18 encoded the complete phosphate acetyltransferase-acetate kinase pathway (ackA and pta) for converting glucose to acetate (Fig. 5e). These fermentation products (e.g. succinate, pyruvate, and acetate) can serve as substrates for other functional bacteria, including denitrifiers and other PAOs, fostering syntrophic interactions that stabilize microbial communities and enhance nutrient removal. Such potential cooperation highlights the ecological role of Dermatophilaceae PAOs and their potential to support robust and reliable performance in full-scale EBPR systems.
Glycogen and polyhydroxyalkanoates metabolism
Regarding storage compound synthesis under anaerobic conditions, Ca. Dermatophostum ammonifactor encoded and expressed the acetyl-CoA acetyltransferase (Figs 2 and 5e) but lacked acetoacetyl-CoA reductase (phaB) and PHA synthase (phaC), implying its inability to produce PHA. In addition, Ca. Dermatophostum ammonifactor, Ca. Lutibacillus EBPR_bin.18, and Ca. Phosphoribacter EBPR_bin.20 encoded and expressed glgM and glgE, indicating glycogen synthesis via the GlgE pathway, with average anaerobic phase expression levels of 97.1, 2024.2, and 110.0 TPM, respectively (Fig. 5e). Genes for glycogen degradation (amyA, glgP, or ma) were also expressed during the anaerobic phase, supporting active turnover. Thus, glycogen appears to be the primary storage polymer in these PAOs. In Ca. Accumulibacter and Azonexus PAOs, glycogen is synthesized via the GlgC pathway [6], in which glucose-1-P is converted to ADP-glucose by adenylyltransferase (glgC), followed by elongation by glycogen synthase (glgA) and branching enzyme (glgB) [13]. In contrast, Dermatophilaceae PAOs utilize maltose-1-P as the precursor via the GlgE pathway, thereby bypassing ATP-dependent steps of the GlgC pathway. This unique strategy may enhance glycogen synthesis efficiency and provide an energetic advantage under EBPR conditions.
Metabolic model and wide distribution of “Ca. Dermatophostum ammonifactor”
To further illustrate its metabolic potential, we constructed a metabolic model of Ca. Dermatophostum ammonifactor (Fig. 6). It exhibits metabolic flexibility in fermenting serine, alanine, glycine, and glutamine to central metabolites such as succinate, pyruvate, and acetate, demonstrating its capacity to utilize diverse carbon sources. A complete Embden–Meyerhof–Parnas pathway supports efficient ATP generation, while the pentose phosphate pathway supplies reducing power for biosynthesis. These metabolic traits enable Ca. Dermatophostum ammonifactor to sustain both energy and redox homeostasis under dynamic environmental conditions, which may enhance its ecological role within EBPR consortia and contribute to robust phosphorus removal in full-scale wastewater treatment systems.
Figure 6.
Metabolic model of Ca. Dermatophostum ammonifactor, the representative species of a new genus within Dermatophilaceae PAOs. Only genes relevant to carbon, phosphorus, nitrogen, energy metabolisms, and nutrient transport are shown. Solid lines indicate genes detected in the genome of Ca. Dermatophostum ammonifactor recovered in this study, while dotted lines represent genes not detected in the genome. Specific genes involved in these pathways can be found in Supplementary Dataset S4 and S6.
The newly described genus Ca. Dermatophostum is widely distributed across WWTPs worldwide, being detected in 65% of facilities in Europe, 50% in Africa, and 33.9% in Asia (Fig. 7a). Its highest abundance was observed in plants in Germany (0.37%) and Switzerland (0.30%), exceeding that of previously described genera such as Ca. Lutibacillus (0.03% and 0%) and Tetrasphaera_A (0.09% and 0.02%), and comparable to that of Phycicoccus_A (0.5% and 0.18%) (Fig. 7b). To explore potential environmental drivers of Ca. Dermatophostum, we examined the relationships between its relative abundance and influent nutrient parameters across WWTP samples. Although influent phosphorus concentrations were low and highly variable among WWTPs (2–8 mg P/L, Supplementary Dataset S8), a significant positive correlation between Ca. Dermatophostum abundance and total phosphorus was consistently supported by both parametric and non-parametric analyses (linear regression: adjusted R² = 0.14, P < 0.01; Spearman’s ρ = 0.32, P < 0.05; Fig. 7c), suggesting that higher influent phosphorus levels may favor this taxon. Ca. Dermatophostum also exhibited strong positive correlations with two other widespread PAO genera, Ca. Phosphoribacter and Phycicoccus_A (Spearman, ρ = 0.64–0.73, P < 0.001, Fig. S9), indicating that Ca. Dermatophostum coexists with these PAOs. However, no significant correlations were observed between Ca. Phosphoribacter or Phycicoccus_A and influent phosphorus concentrations (Fig. 7d and e), implying that these PAOs may not share the same phosphorus-associated ecological preference as Ca. Dermatophostum. In addition, Ca. Dermatophostum was strongly enriched and transcriptionally active in the high-phosphorus EBPR reactor examined in this study, with an influent phosphorus concentration of 25.6 ± 1.7 mg P/L (Figs 1 and 5). Consistent with our findings, a recent study reported the enrichment of Ca. Dermatophostum (as JAGOME01) in an EBPR reactor with influent phosphorus concentrations up to 65.54 ± 19.41 mg P/L [73]. Together, these findings suggest that Ca. Dermatophostum shows a specialized ecological preference for high-phosphorus conditions, becoming both enriched and transcriptionally active in such environments, underscoring its potential for high-phosphorus wastewater treatment.
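The two statistics reported for the abundance–phosphorus relationship (Spearman's ρ and the adjusted R² of a simple linear regression) can be illustrated with a small, dependency-free sketch. The data below are hypothetical placeholders, not values from Supplementary Dataset S8, and the function names are our own; P values are deliberately omitted because computing them requires a statistics package.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R^2 of a least-squares line; for one predictor, R^2 = r^2."""
    n = len(x)
    r2 = pearson_r(x, y) ** 2
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical influent total phosphorus (mg P/L) and relative abundance (%).
influent_tp = [2.1, 3.4, 4.0, 5.5, 6.2, 7.8]
abundance = [0.01, 0.05, 0.03, 0.10, 0.08, 0.30]

print(f"Spearman rho = {spearman_rho(influent_tp, abundance):.3f}")
print(f"adjusted R2  = {adjusted_r2(influent_tp, abundance):.3f}")
```

Reporting both statistics is useful here because Spearman's ρ only assumes a monotonic relationship, whereas the adjusted R² measures how much variance a straight-line fit explains after penalizing for the number of predictors.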
Figure 7.
Global distribution and environmental associations of Ca. Dermatophostum in wastewater treatment plants worldwide (n = 224). (a) Detection frequencies of Ca. Dermatophostum across continents. (b) Relative abundances detected in wastewater treatment plants across countries. (c) Correlation between Ca. Dermatophostum abundance and influent total phosphorus (TP) in analyzed WWTPs. (d) Correlation between Ca. Phosphoribacter abundance and influent TP in analyzed WWTPs. (e) Correlation between Phycicoccus_A abundance and influent TP in analyzed WWTPs. Correlation coefficients were calculated based on Spearman’s rank correlation and linear regression (grey line).
Environmental and ecological implications
This study provides new insights into the phylogenetic diversity and metabolic capabilities of Dermatophilaceae PAOs, a previously under-characterized but widespread lineage in EBPR systems. While the 16S rRNA gene has traditionally been used for PAO classification and quantification, its limited resolution in distinguishing Dermatophilaceae PAOs has been well-documented. Through comparative phylogenetic analysis, we demonstrate that ppk1 is a robust and reliable marker for resolving the diversity of Dermatophilaceae PAOs. This finding offers a much-needed basis for the development of targeted approaches, such as amplicon sequencing and fluorescence probe design, for the detection, quantification and characterization of this important PAO lineage across diverse ecosystems. Furthermore, we established an effective classification framework that enables fine-scale delineation of Dermatophilaceae PAOs into distinct clades and sub-clades. This framework addresses a long-standing challenge of inconsistent taxonomic assignment of Dermatophilaceae PAOs across studies. Accordingly, it helps establish an explicit taxonomy-metabolism-function network of Dermatophilaceae PAOs, providing guidance for their future application and management in phosphorus control and recovery.
EBPR is widely applied in domestic wastewater treatment, where influent phosphorus concentrations typically range from 4 to 12 mg P/L [74], with our survey showing an average of 5.5 mg P/L. However, its application to high-strength wastewater (>20 mg P/L), such as effluents from food processing, dairy production, and livestock farming, is increasingly of interest [74], owing to its potential for cost-efficient biological phosphorus recovery. Achieving efficient phosphorus removal under such conditions requires PAOs with specialized metabolic traits. In this study, we identified Ca. Dermatophostum ammonifactor, a member of the novel genus Ca. Dermatophostum within the Dermatophilaceae family, which exhibits a preference for high-phosphorus environments and represents a promising candidate for high-strength wastewater treatment. Beyond phosphorus removal, Ca. Dermatophostum ammonifactor also showed the potential for DNRA, which facilitates nitrogen transformation and potentially reduces nitrous oxide (N₂O) emissions. As wastewater treatment facilities transition toward carbon-neutral and resource recovery objectives, harnessing the metabolic capabilities of novel PAOs like Ca. Dermatophostum ammonifactor may contribute to both climate change mitigation and sustainable water management. Together, this study provides new insights into the ecological and functional roles of Dermatophilaceae PAOs and their biotechnological potential, supporting the development of more resilient and sustainable wastewater treatment systems.
Etymology of Dermatophilaceae polyphosphate-accumulating organisms representing novel genus and species
Based on phylogeny, habitat and metabolic traits, the etymology of the novel genus and species of Dermatophilaceae PAOs discovered in the present study is proposed as follows. “Candidatus Dermatophostum”: the genus name Dermatophostum consists of “Dermato”, representing the Dermatophilaceae family; “phos”, representing the Latin for phosphorus; and “tum”, representing the Latin lutum (mud), indicating that this genus of Dermatophilaceae discovered in sludge is capable of phosphorus assimilation. The species name (compounded from “Ammoni”, Latinized from ammonium, and “factor”, Latin factor, a maker) refers to the metabolic trait of ammonium transformation of this species.
Supplementary Material
Acknowledgements
We thank team members including Dr. Zhiguo Zhang for offering help and advice on data analysis, Ms. Yisong Xu for laboratory management support, and Mr. Guoqing Zhang for managing the laboratory computational server. We acknowledge the Westlake University HPC Center for computational support.
Contributor Information
Hui Wang, Zhejiang Provincial Key Laboratory of Intelligent Low-Carbon Biosynthesis, School of Engineering, Westlake University, Hangzhou, Zhejiang 310030, China.
Ze Zhao, Zhejiang Provincial Key Laboratory of Intelligent Low-Carbon Biosynthesis, School of Engineering, Westlake University, Hangzhou, Zhejiang 310030, China.
Limin Lin, Zhejiang Provincial Key Laboratory of Intelligent Low-Carbon Biosynthesis, School of Engineering, Westlake University, Hangzhou, Zhejiang 310030, China.
Ao Dong, Zhejiang Provincial Key Laboratory of Intelligent Low-Carbon Biosynthesis, School of Engineering, Westlake University, Hangzhou, Zhejiang 310030, China; Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China.
Ye Deng, State Key Laboratory of Regional Environment and Sustainability, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China.
Jizhong Zhou, Institute for Environmental Genomics, Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK 73019, United States.
Feng Ju, Zhejiang Provincial Key Laboratory of Intelligent Low-Carbon Biosynthesis, School of Engineering, Westlake University, Hangzhou, Zhejiang 310030, China; Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China; Center for Future Foods, Muyuan Laboratory, Zhengzhou, Henan 450016, China.
Conflicts of interest
None declared.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 42477517 to F.J. and Grant No. 42207546 to L.L.). The authors also acknowledge support from the Westlake University-Muyuan Joint Research Institute (Grant No. WU2024MY003).
Data availability
All raw metagenomic sequencing data generated in this study have been submitted to CNGB under the project accession number CNP0003076. All raw metatranscriptomic sequencing data generated in this study have been submitted to CNGB under the project accession number CNP0004051.
References
- 1. Luo X, Elrys AS, Zhang L et al. The global fate of inorganic phosphorus fertilizers added to terrestrial ecosystems. One Earth 2024;7:1402–13. doi:10.1016/j.oneear.2024.07.002
- 2. Garcia Martin H, Ivanova N, Kunin V et al. Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. Nat Biotechnol 2006;24:1263–9. doi:10.1038/nbt1247
- 3. Nielsen PH, McIlroy SJ, Albertsen M et al. Re-evaluating the microbiology of the enhanced biological phosphorus removal process. Curr Opin Biotechnol 2019;57:111–8. doi:10.1016/j.copbio.2019.03.008
- 4. Maszenan AM, Seviour RJ, Patel BKC et al. Friedmanniella spumicola sp. nov. and Friedmanniella capsulata sp. nov. from activated sludge foam: gram-positive cocci that grow in aggregates of repeating groups of cocci. Int J Syst Evol Microbiol 1999;49 Pt 4:1667–80. doi:10.1099/00207713-49-4-1667
- 5. Oehmen A, Lemos PC, Carvalho G et al. Advances in enhanced biological phosphorus removal: from micro to macro scale. Water Res 2007;41:2271–300. doi:10.1016/j.watres.2007.02.030
- 6. Liu R, Hao X, Chen Q et al. Research advances of tetrasphaera in enhanced biological phosphorus removal: a review. Water Res 2019;166:115003. doi:10.1016/j.watres.2019.115003
- 7. Nguyen HT, Le VQ, Hansen AA et al. High diversity and abundance of putative polyphosphate-accumulating Tetrasphaera-related bacteria in activated sludge systems. FEMS Microbiol Ecol 2011;76:256–67. doi:10.1111/j.1574-6941.2011.01049.x
- 8. Parks DH, Chuvochina M, Waite DW et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 2018;36:996–1004. doi:10.1038/nbt.4229
- 9. Nouioui I, Carro L, Garcia-Lopez M et al. Genome-based taxonomic classification of the phylum Actinobacteria. Front Microbiol 2018;9:2007. doi:10.3389/fmicb.2018.02007
- 10. Zhang Y, Kinyua MN. Identification and classification of the Tetrasphaera genus in enhanced biological phosphorus removal process: a review. Rev Environ Sci Biotechnol 2020;19:699–715. doi:10.1007/s11157-020-09549-7
- 11. Singleton CM, Petriglieri F, Wasmund K et al. The novel genus, ‘Candidatus phosphoribacter’, previously identified as Tetrasphaera, is the dominant polyphosphate accumulating lineage in EBPR wastewater treatment plants worldwide. ISME J 2022;16:1605–16. doi:10.1038/s41396-022-01212-z
- 12. Kristiansen R, Nguyen HT, Saunders AM et al. A metabolic model for members of the genus Tetrasphaera involved in enhanced biological phosphorus removal. ISME J 2013;7:543–54. doi:10.1038/ismej.2012.136
- 13. Ruiz-Haddad L, Ali M, Pronk M et al. Demystifying polyphosphate-accumulating organisms relevant to wastewater treatment: a review of their phylogeny, metabolism, and detection. Environ Sci Ecotechnol 2024;21:100387. doi:10.1016/j.ese.2024.100387
- 14. Xie X, Deng X, Chen J et al. Two new clades recovered at high temperatures provide novel phylogenetic and genomic insights into Candidatus Accumulibacter. ISME Commun 2024;4:ycae049. doi:10.1093/ismeco/ycae049
- 15. Oehmen A, Carvalho G, Lopez-Vazquez CM et al. Incorporating microbial ecology into the metabolic modelling of polyphosphate accumulating organisms and glycogen accumulating organisms. Water Res 2010;44:4992–5004. doi:10.1016/j.watres.2010.06.071
- 16. Nguyen HT, Kristiansen R, Vestergaard M et al. Intracellular accumulation of glycine in polyphosphate-accumulating organisms in activated sludge, a novel storage mechanism under dynamic anaerobic-aerobic conditions. Appl Environ Microbiol 2015;81:4809–18. doi:10.1128/AEM.01012-15
- 17. Close K, Marques R, Carvalho VCF et al. The storage compounds associated with Tetrasphaera PAO metabolism and the relationship between diversity and P removal. Water Res 2021;204:117621. doi:10.1016/j.watres.2021.117621
- 18. Fernando EY, McIlroy SJ, Nierychlo M et al. Resolving the individual contribution of key microbial populations to enhanced biological phosphorus removal with raman-fish. ISME J 2019;13:1933–46. doi:10.1038/s41396-019-0399-7
- 19. Wang H, Wang Y, Zhang G et al. Temporal dynamics and performance association of the Tetrasphaera-enriched microbiome for enhanced biological phosphorus removal. Engineering 2023;29:168–78. doi:10.1016/j.eng.2022.10.016
- 20. Nguyen HTT, Nielsen JL, Nielsen PH. 'Candidatus Halomonas phosphatis', a novel polyphosphate-accumulating organism in full-scale enhanced biological phosphorus removal plants. Environ Microbiol 2012;14:2826–37. doi:10.1111/j.1462-2920.2012.02826.x
- 21. Fastqc: A quality control tool for high throughput sequence data. https://www.bioinformatics.babraham.ac.uk/publications.html.
- 22. Ewels P, Magnusson M, Lundin S et al. Multiqc: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 2016;32:3047–8. doi:10.1093/bioinformatics/btw354
- 23. Chen S, Zhou Y, Chen Y et al. Fastp: An ultra-fast all-in-one fastq preprocessor. Bioinformatics 2018;34:i884–90. doi:10.1093/bioinformatics/bty560
- 24. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 2011;27:863–4. doi:10.1093/bioinformatics/btr026
- 25. Nurk S, Meleshko D, Korobeynikov A et al. Metaspades: a new versatile metagenomic assembler. Genome Res 2017;27:824–34. doi:10.1101/gr.213959.116
- 26. Li D, Luo R, Liu CM et al. Megahit v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods 2016;102:3–11. doi:10.1016/j.ymeth.2016.02.020
- 27. Kang DD, Li F, Kirton E et al. Metabat 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 2019;7:e7359. doi:10.7717/peerj.7359
- 28. Wu Y-W, Tang Y-H, Tringe SG et al. Maxbin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome 2014;2:26. doi:10.1186/2049-2618-2-26
- 29. Alneberg J, Bjarnason BS, de Bruijn I et al. Binning metagenomic contigs by coverage and composition. Nat Methods 2014;11:1144–6. doi:10.1038/nmeth.3103
- 30. Uritskiy GV, DiRuggiero J, Taylor J. Metawrap—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 2018;6:158. doi:10.1186/s40168-018-0541-1
- 31. Zhu C, Wu L, Ning D et al. Global diversity and distribution of antibiotic resistance genes in human wastewater treatment systems. Nat Commun 2025;16:4006. doi:10.1038/s41467-025-59019-3
- 32. Olm MR, Brown CT, Brooks B et al. Drep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J 2017;11:2864–8. doi:10.1038/ismej.2017.126
- 33. Chaumeil PA, Mussig AJ, Hugenholtz P et al. Gtdb-tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics 2019;36:1925–7. doi:10.1093/bioinformatics/btz848
- 34. Hyatt D, Chen GL, Locascio PF et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010;11:119. doi:10.1186/1471-2105-11-119
- 35. Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol 2011;7:e1002195. doi:10.1371/journal.pcbi.1002195
- 36. Matsen FA, Kodner RB, Armbrust EV. Pplacer: linear time maximum-likelihood and bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics 2010;11:538. doi:10.1186/1471-2105-11-538
- 37. Jain C, Rodriguez-R LM, Phillippy AM et al. High throughput ani analysis of 90k prokaryotic genomes reveals clear species boundaries. Nat Commun 2018;9:5114. doi:10.1038/s41467-018-07641-9
- 38. Price MN, Dehal PS, Arkin AP. Fasttree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 2009;26:1641–50. doi:10.1093/molbev/msp077
- 39. Ondov BD, Treangen TJ, Melsted P et al. Mash: fast genome and metagenome distance estimation using minhash. Genome Biol 2016;17:132. doi:10.1186/s13059-016-0997-x
- 40. Parks DH, Imelfort M, Skennerton CT et al. Checkm: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 2015;25:1043–55. doi:10.1101/gr.186072.114
- 41. Pritchard L, Glover RH, Humphris S et al. Genomics and taxonomy in diagnostics for food security: soft-rotting enterobacterial plant pathogens. Anal Methods 2016;8:12–24. doi:10.1039/C5AY02550H
- 42. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2009.
- 43. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 2014;30:2068–9. 10.1093/bioinformatics/btu153 [DOI] [PubMed] [Google Scholar]
- 44. Aramaki T, Blanc-Mathieu R, Endo H et al. Kofamkoala: KEGG Ortholog assignment based on profile hmm and adaptive score threshold. Bioinformatics 2019;36:2251–2. 10.1093/bioinformatics/btz859 [DOI] [Google Scholar]
- 45. Kanehisa M, Sato Y, Kawashima M et al. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res 2016;44:D457–62. 10.1093/nar/gkv1070 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using diamond. Nat Methods 2015;12:59–60. 10.1038/nmeth.3176 [DOI] [PubMed] [Google Scholar]
- 47. Kanehisa M, Sato Y. KEGG mapper for inferring cellular functions from protein sequences. Protein Sci 2020;29:28–35. 10.1002/pro.3711 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Katoh K, Misawa K, Ki K et al. Mafft: a novel method for rapid multiple sequence alignment based on fast fourier transform. Nucleic Acids Res 2002;30:3059–66. 10.1093/nar/gkf436 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Dueholm MKD, Nierychlo M, Andersen KS et al. MiDAS 4: a global catalogue of full-length 16S rRNA gene sequences and taxonomy for studies of bacterial communities in wastewater treatment plants. Nat Commun 2022;13:1908. 10.1038/s41467-022-29438-7
- 50. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 2021;49:W293–6. 10.1093/nar/gkab301
- 51. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 2011;17:3. 10.14806/ej.17.1.200
- 52. Kopylova E, Noé L, Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics 2012;28:3211–7. 10.1093/bioinformatics/bts611
- 53. Kim D, Paggi JM, Park C et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 2019;37:907–15. 10.1038/s41587-019-0201-4
- 54. Putri GH, Anders S, Pyl PT et al. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics 2022;38:2943–5. 10.1093/bioinformatics/btac166
- 55. Pertea M, Pertea GM, Antonescu CM et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 2015;33:290–5. 10.1038/nbt.3122
- 56. Mirdita M, Schütze K, Moriwaki Y et al. ColabFold: making protein folding accessible to all. Nat Methods 2022;19:679–82. 10.1038/s41592-022-01488-1
- 57. Liu Y, Yang X, Gan J et al. CB-Dock2: improved protein-ligand blind docking by integrating cavity detection, docking and homologous template fitting. Nucleic Acids Res 2022;50:W159–64. 10.1093/nar/gkac394
- 58. Ashkenazy H, Abadi S, Martz E et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res 2016;44:W344–50. 10.1093/nar/gkw408
- 59. Schneider D, Zühlke D, Poehlein A et al. Metagenome-assembled genome sequences from different wastewater treatment stages in Germany. Microbiol Resour Announc 2021;10:e0050421. 10.1128/mra.00504-21
- 60. Petriglieri F, Singleton CM, Kondrotaite Z et al. Reevaluation of the phylogenetic diversity and global distribution of the genus “Candidatus Accumulibacter”. mSystems 2022;7:e0001622–2. 10.1128/msystems.00016-22
- 61. Kolakovic S, Freitas EB, Reis MAM et al. Accumulibacter diversity at the sub-clade level impacts enhanced biological phosphorus removal performance. Water Res 2021;199:117210. 10.1016/j.watres.2021.117210
- 62. Milanese A, Mende DR, Paoli L et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat Commun 2019;10:1014. 10.1038/s41467-019-08844-4
- 63. Neville N, Roberge N, Jia Z. Polyphosphate kinase 2 (PPK2) enzymes: structure, function, and roles in bacterial physiology and virulence. Int J Mol Sci 2022;23:670. 10.3390/ijms23020670
- 64. Du H, Yang L, Wu J et al. Simultaneous removal of phosphorus and nitrogen in a sequencing batch biofilm reactor with transgenic bacteria expressing polyphosphate kinase. Appl Microbiol Biotechnol 2012;96:265–72. 10.1007/s00253-011-3839-5
- 65. Andreeva N, Trilisenko L, Eldarov M et al. Polyphosphatase PPN1 of Saccharomyces cerevisiae: switching of exopolyphosphatase and endopolyphosphatase activities. PLoS One 2015;10:e0119594. 10.1371/journal.pone.0119594
- 66. Itoh H, Shiba T. Polyphosphate synthetic activity of polyphosphate:AMP phosphotransferase in Acinetobacter johnsonii 210A. J Bacteriol 2004;186:5178–81. 10.1128/jb.186.15.5178-5181.2004
- 67. Martín JF, Liras P. Molecular mechanisms of phosphate sensing, transport and signalling in Streptomyces and related Actinobacteria. Int J Mol Sci 2021;22:1129. 10.3390/ijms22031129
- 68. Harris RM, Webb DC, Howitt SM et al. Characterization of PitA and PitB from Escherichia coli. J Bacteriol 2001;183:5008–14. 10.1128/jb.183.17.5008-5014.2001
- 69. Rosenberg H, Gerdes RG, Chegwidden K. Two systems for the uptake of phosphate in Escherichia coli. J Bacteriol 1977;131:505–11. 10.1128/jb.131.2.505-511.1977
- 70. Zhang R, He J, Wang M et al. The role of high affinity phosphate transporters in aerobic phosphate uptake during acetate- or propionate-fed enhanced biological phosphorus removal process. J Water Process Eng 2024;67:106178. 10.1016/j.jwpe.2024.106178
- 71. Larriba O, Rovira-Cal E, Juznic-Zonta Z et al. Evaluation of the integration of P recovery, polyhydroxyalkanoate production and short-cut nitrogen removal in a mainstream wastewater treatment process. Water Res 2020;172:115474. 10.1016/j.watres.2020.115474
- 72. Zhao L, Chen J, Shen G et al. Dissimilatory nitrate reduction to ammonia in the natural environment and wastewater treatment facilities: a comprehensive review. Environ Technol Innovation 2025;37:104011. 10.1016/j.eti.2024.104011
- 73. Close K, Ye L, Wang D et al. The return sludge sidestream process promotes phosphorus removal stability with synergistic activity of polyphosphate accumulating organism populations. J Environ Chem Eng 2025;13:117331. 10.1016/j.jece.2025.117331
- 74. Kang D, Yuan Z, Li G et al. Toward integrating EBPR and the short-cut nitrogen removal process in a one-stage system for treating high-strength wastewater. Environ Sci Technol 2023;57:13247–57. 10.1021/acs.est.3c03917
Data Availability Statement
All raw metagenomic sequencing data generated in this study have been deposited in CNGB under project accession number CNP0003076. All raw metatranscriptomic sequencing data generated in this study have been deposited in CNGB under project accession number CNP0004051.







