Skip to main content
ISME Communications logoLink to ISME Communications
. 2024 Mar 27;4(1):ycae042. doi: 10.1093/ismeco/ycae042

Quantifying microbial guilds

Juan Rivas-Santisteban 1,, Pablo Yubero 2, Semidán Robaina-Estévez 3, José M González 4, Javier Tamames 5, Carlos Pedrós-Alió 6
PMCID: PMC11069341  PMID: 38707845

Abstract

The ecological role of microorganisms is of utmost importance due to their multiple interactions with the environment. However, assessing the contribution of individual taxonomic groups has proven difficult despite the availability of high throughput data, hindering our understanding of such complex systems. Here, we propose a quantitative definition of guild that is readily applicable to metagenomic data. Our framework focuses on the functional character of protein sequences, as well as their diversifying nature. First, we discriminate functional sequences from the whole sequence space corresponding to a gene annotation to then quantify their contribution to the guild composition across environments. In addition, we identify and distinguish functional implementations, which are sequence spaces that have different ways of carrying out the function. In contrast, we found that orthology delineation did not consistently align with ecologically (or functionally) distinct implementations of the function. We demonstrate the value of our approach with two case studies: the ammonia oxidation and polyamine uptake guilds from the Malaspina circumnavigation cruise, revealing novel ecological dynamics of the latter in marine ecosystems. Thus, the quantification of guilds helps us to assess the functional role of different taxonomic groups with profound implications on the study of microbial communities.

Keywords: microbial ecology, microbial guilds, phylogenetics, functional ecology, polyamine uptake, ammonia oxidation

Introduction

The large number and variety of functions carried out by microorganisms have a major impact on ecosystems. However, these functions involve a plethora of substrates and biocatalysts of varying nature, such as proteins [1], which are subject to evolutionary forces. Consequently, the quantitative ecological study of such functions faces several challenges.

First, proteins may diverge due to evolution, mainly through gene duplication, adaptation, and drift [2–6]. Admittedly, proteins carrying out a given function will diverge as microorganisms adapt to different conditions, while drift will cause stochastic changes between sequences despite conserving the function. These processes maintain a variety of proteins with different degrees of similarity that fulfill the same function (Fig. 1) [7, 8].

Figure 1.

Figure 1

Protein sequence similarity cannot fully anticipate functional activity. Notes: Any particular function can be performed by a plethora of different protein sequences, generated by adaptive evolution and drift. Top: traditionally, automatic annotation considers any given protein sequence below a dissimilarity threshold as functional. Although the threshold can be adjusted, there are several pitfalls: (i) the threshold value is prone to errors; (ii) the functional space may display discontinuities or gaps; and (iii) proteins beyond the dissimilarity threshold may be able to perform the function. Bottom: we exemplify these scenarios with real protein sequences related to the Inline graphic-subunit of the ammonia monooxygenase (amoA). We show: AOA (archaeal) and AOB (bacterial) amoA sequences well within the similarity threshold; pmoA, which mainly oxidize Inline graphic-alkanes and usually does not carry out the function but falls within the threshold; hapE, which can oxidize a wide range of acetophenone derivatives, but not ammonia, is beyond the threshold; and CTP-synthase-1, which oxidizes L-glutamine to form a nucleic acid precursor, but it also can oxidize ammonia, thus performing the molecular function, falls well beyond any reasonable sequence similarity to AOA-amoA.

Second, most proteins are inherited vertically, and therefore taxonomically related organisms will often share a similar set of functions [9]. However, there are alternatives to vertical inheritance. The main one is horizontal transfer [10], which has been reported even among organisms belonging to remarkably distant lineages [11]. There are other mechanisms to acquire functions not related to inheritance, such as unrelated proteins converging in function [12, 13]. Therefore, neither taxonomy nor sequence similarity can fully predict the occurrence or strength of microbial functions.

There is, thus, a need for a non-taxonomic approach to analyze the ecology of microbial functions, which are mainly executed by a pool of divergent proteins. The ecological guild concept can fulfill this role. The classical definition of a guild is “a group of species that exploit the same class of environmental resources in a similar way […] without regard to taxonomic position, that overlaps significantly in their niche requirements” [14]. This view was conceived as a way to study the ecology of macroorganisms and became popular in the 1970s. For example, all insect predators can be studied together as they are members of the insectivorous guild, without considering the taxonomic group they belong to [15, 16]. Thus, guilds are broadly understood as the functional groups into which communities can be subdivided, unlike the concept of population, which consists of taxonomical groups.

However, the classical definition does not suit the needs of microbial ecology. In macrofauna, guilds are usually identified by feeding behavior [17]. The range of behaviors is determined by the complex genotypic and environmental input in which an individual develops [18, 19]. Thus, the resource exploitation carried out by the insectivorous guild (searching, capturing, ingesting, and digesting an insect) can be easily delimited and observed. In contrast, microbial feeding phenotypes are much closer to their genotypes [20]. The acquisition of a nutrient by prokaryotes depends almost exclusively on a few proteins [21, 22]. Therefore, our ability to observe microbial phenotypes is often limited to inference from sequenced metagenomes, where we rely on gene prediction and automatic function annotations, which are unreliable or simply unavailable. Furthermore, the requirement for a similar exploitation of resources is inherently ambiguous.

These reasons have led to a lack of consensus on how to define and quantify microbial guilds, despite a persistent use of the term [23–25]. Below are some examples.

Wu et al. [26] use the term “microbial guild” to assign a functional value based on spatial co-occurrence among taxa, yet co-occurrence in space does not necessarily imply sharing the same function [27], although it might help functional prediction [28]. In another study, motility of diatoms reflected their nutritional traits and were classified accordingly [29]. However, this method is hardly extensible to all microbes. Other authors proposed that guilds should be restricted to taxa exploiting the same resource in a given space and time [30, 31], but guild membership should not be bounded spatially or temporally, as microbes from different communities may contribute to the same guild. Last, Pedrós-Alió [32] defined microbial guilds as “a group of microorganisms using the same energy and carbon sources and the same electron donors and acceptors”. However, this imposes a too narrow eligibility. Under this definition, methanotrophs using different electron donors while exploiting methane would not belong to the same methane consumption guild. It is clear, then, that different authors have understood the guild concept in different ways and this detracts from its usefulness. Could we amend such definitions to better capture ecological dynamics?

Our goal is to provide a definition that fits the needs of microbial ecology, while remaining applicable to all organisms. A guild is a list of organisms that exploit or produce a key resource, and membership will be granted regardless of their lineage, their environment, or how they have implemented the needed functions. Importantly, the contribution from each member to the guild varies significantly with space and time. As a result, membership is not a static quality, and the value of a member needs to be quantified in each context.

Considering this definition, we here present a method to quantify microbial guilds through the abundance, richness, and efficiency of their proteins. In addition, we improve current methods for functional delineation to strictly discriminate functional from non-functional sequences, and to categorize them ecologically. With our conceptual and methodological contributions, we avoided the problems related to similarity in niche occupancy and protein variability, which led to ambiguity and imprecision when describing microbial guilds.

Methods

Construction of the marine prokaryotic genomes database (1 in Fig. S1)

To facilitate the construction of the gene-specific reference databases, we compiled a database of peptide sequences obtained from a collection of prokaryotic, quality-filtered genomes (MAGs and SAGs) from marine environments. Specifically, we retrieved genomes from the following databases: the MAR database, all 1,270 complete genomes, and 5,521 partial genomes that had the “high quality” status as described in [33]; the OceanDNA database [34], all 52,325 genomes, as they had been quality-filtered based on their completeness and degree of contamination with the formula: percent completeness Inline graphic percent contamination Inline graphic; the collection compiled by Paoli et al. [35], which includes genomes from various origins such as TARA OCEANS [36] and XORG—in this case, only genomes passing the same quality filter applied in OceanDNA were kept, amounting to a total of 26,942 additional genomes. All the genomes considered had assigned taxonomy obtained with the GTDB Toolkit 2.0.0 [37] available in their corresponding databases. Finally, all sequences were merged into a single database, reads were further quality filtered with fastp 0.20.1, and sequence duplicates were removed with seqkit rmdud 2.0.0 [38] using default parameters.

Functional marker selection (2a in Fig. S1)

The search for functional markers was carried out by means of an extensive bibliographic comparison. This methodology is based on choosing publicly available hidden Markov models (HMMs) [39] for one or several genes, trying to avoid functional paralogs to maximize functional univocity. In order to choose an HMM as a guild marker, we followed the following conservative criteria: (i) the construction of the HMM must be congruent with the sequences that have reviewed functional evidence in the literature; (ii) the metagemonic sequences retrieved with the tested HMM can be filtered out by a specific quality argument, derived from the inner workings of genomic architecture (i.e. synteny) or a consequence of the evolutive history of the gene (i.e. similar sequences that have undergone functional drift). With this methodology, we selected the best minimal markers for the guilds analyzed in this work.

Construction of the gene-specific reference database (2b in Fig. S1)

We used the selected profile HMMs and HMMER3 [40] to retrieve candidate sequences of the target gene from our collected marine peptide database. Gather score thresholds were used as a quality filter when available; otherwise, a minimum E-score threshold of Inline graphic was employed.

To facilitate inference and later visual inspection of the phylogenetic trees, sequence hits were further filtered to set a maximum database size of N representative sequences. To this end, we applied a series of filters. First, we set minimal and maximal sequence length cut-off values of l1 and l2, respectively. Second, we removed sequence duplicates through seqkit’s rmdup subcommand with default parameters. Third, we applied CD-HIT [41] with default parameters to reduce redundancy in the peptide database. Finally, if the database size was larger than the allowed maximum after applying CD-HIT, we further reduced the number of representative peptides through RepSet [42], an optimization-based algorithm that obtains a series of nested sets of representative peptides of decreasing size. Specifically, we selected the maximal set of representative peptides with a size lower than the established size threshold value.

Usage of synteny during gene-specific reference database construction (2c in Fig. S1)

In some cases, we used syntenic information to reduce uncertainty due to the potential presence of paralogs during the reference peptide database construction. To this end, we employed the Python package Pynteny [43], which facilitates synteny-aware profile, HMM-based searchers. After generating a list of synteny-complaint target-peptide matches, we followed the same protocol to reduce database size when required to meet the established reference database size threshold value.

Inference of the gene-specific reference trees (3a in Fig. S1)

Once the peptide reference database was constructed, we employed MUSCLE [44] with default parameter values to perform a multiple sequence alignment of the reference database. Next, we used the previous alignment and IQ-TREE [45] with default parameter values to build a reference phylogenetic tree for each target gene. We determined the substitution model through ModelTest [46] by selecting the model with the highest AIC score.

Classification of clusters within the reference phylogenetic tree (3b in Fig. S1)

Once we have constructed the reference tree, we can now propagate functional information that corresponds to the different regions of the reference tree. In the case of amoA, the functional information was obtained directly from the sequences we used to build the tree, and the clusters inferred from the similarity between sequences.

In the case of the polyamine binding reference tree, we needed other criteria to classify clusters. To check whether different clusters are associated with different environmental conditions, we carried out an extensive literature search of the environmental preferences of 321 species that matched 478 leaves (41%). We assembled a curated collection of physicochemical preferences for these species that included tolerance ranges and optimal values of temperature, salinity, and pH, as well as other variables such as motility (Table S1). For each internal node, we calculated the average values of all its leaves. To determine whether the associations with environmental variables of the cluster were significant, these node averages were compared to the distribution observed under Inline graphic randomizations to obtain their Inline graphic-scores. Nodes with an average value of the Inline graphic-score larger than 3 (i.e. Inline graphic-value Inline graphic) were considered significant for the particular environmental variable (Table S2). To select the most general internal nodes, we focused on those that are significant but whose parent node is not.

To determine whether these sequence clusters were expected by the taxonomy of the organisms, we constructed a null model of phylogenetic divergence with two phylomarkers (16S and rplB). We then compared the divergence of Inline graphic-like sequences for the same organisms, finding that taxonomy does not explain, in most cases, the drift found in functional genes; however, environmental variables do, especially for nodes that separate more leaves on the tree.

This methodology can be applied to any type of functional evidence, not only environmental, but also kinetic, substrate preference, or any other type.

Preprocessing of query sequences (4a in Fig. S1)

Query sequences were retrieved from the metagenomes following SqueezeMeta’s pipeline [47].

Placement of query sequences (4b in Fig. S1)

To place query sequences (metagenomic output) in the reference tree, we first obtained an alignment between the query and the reference sequences with PaPaRa 2.5 [48]. Then, we placed query sequences with the tool EPA-ng 0.3.8 [49]. Additionally, we employed the Gappa toolkit 0.8, specifically the command gappa examine graft [50], to visualize the placed sequences on the reference tree using default parameters. The phylogenetic placement tree was visualized using the Interactive Tree of Life [51].

Taxonomical and functional labeling of placed query sequences (4c in Fig. S1)

We employed the Gappa toolkit [50], specifically the command gappa examine assign, to assign taxonomy to placed sequences. Briefly, Gappa first assigns a consensus taxonomy to each internal node of the tree and then assigns to each query sequence the closest taxonomy in the reference tree weighted by the likelihood of each placement. We employed default parameters and the best_hit option to retrieve only the taxonomic assignments with the highest total placement likelihood for each query. To assign functional labels to placed queries, we selected the function of the tree cluster in which each query had been placed. To this end, we first added the cluster label to each taxonomic path of the reference sequences as an additional (artificial) taxon above the domain level. In this manner, we could employ gappa examine assign to assign both taxonomy and the cluster label (i.e. function) to each placed query.

Quantification of the polyamine-uptake guild from metagenomic data (5 in Fig. S1)

Further technical details regarding the application of our method can be found in the Supplementary Information and the repository tutorials (https://github.com/pyubero/microguilds/tree/main/Tutorials).

Expected sequence richness model

In our data, we invariably observed that larger sequence abundance corresponds to larger sequence richness; this arises naturally when sampling a large pool of distinct elements far from saturation. Briefly, we assess the expected richness by fitting

graphic file with name DmEquation1.gif (1)

where Inline graphic are the sequences considered for a taxon, implementation, and environment, abundance is measured in TPM, richness is the natural number of unique sequences, and where Inline graphic and Inline graphic are constants. We consistently obtained Inline graphic (Fig. S2). Consequently, when computing the impact coefficients later in Equation (2), the term Inline graphic rewards sequences that display a larger richness than expected.

16S and rplB sequences

To screen phylogenetic deviations between functions and phylomarkers (Fig. 4B), we obtained the nucleotide sequences of the 16S ribosomal subunit and the rplB gene from the assembly genomic RNA and CDS provided by the NCBI for 319 out of the 321 species found in pure culture. When 16S sequences were ¡1000bp, we used instead sequences from other strains as they should remain well conserved within the same species. All RefSeq assembly accession numbers and alternative GIs for 16S data were automatically retrieved from the NCBI; a detailed list is available in Table S3.

Figure 4.

Figure 4

Polyamine uptake implementations revealed by environmental cues are non-synonymous with phylogeny. Notes: (A) A phylogenetic reconstruction of potF, a polyamine-binding subunit of an ABC transporter. Due to the very scarce experimental evidence available, we here determine the different implementations as sequence spaces adapted to specific environmental conditions (see Methods, and Tables S1 and S2). We gathered environmental data for 478 tree locations representing 321 organisms (green tags). We found six nodes enriched by environmental preferences (Inline graphic randomizations, one-tailed Inline graphic-values Inline graphic), which included most sequences. Thus, the tree acts as a metagenomic query classifier. (B) To find whether the previous clusters grouped taxonomically related species, we tested the divergence between last common ancestors (LCAs) in the trees of potF and a pair of phylogenetic markers: ribosomal 16S RNA and rplB genes. The average phylogenetic distances between common groups of organisms (nodes in the tree) in the 16S RNA (Inline graphic-axis) are compared to those in the rplB gene (black dots) or to those in the potF gene (colored dots). The size of the dot represents the number of children leaves compared. The two phylogenetic markers follow a straight line as expected. The potF sequences, however, significantly depart from this linearity. The null model is equivalent to the neutral variation expected from phylogeny and was fitted with a bivariate regression (mean and Inline graphic, black solid and dotted lines). Red edges to the circles highlight the nodes that define the environmental clusters in panel A, which are beyond the taxonomy-related null model. The red arrow shows an example that corresponds with Oceanobacter kriegii and Thalassobius gelatinovorus, where their potF sequences are more similar than what is expected by their taxonomic markers.

Results

Quantification of microbial guilds from metagenomes

Our method quantifies the contribution of metagenomic sequences across three ecological aspects. Briefly, the first is the taxonomy assigned to the sequences, in our case, the protein(s); it consists of the list of guild contributors. The second distinguishes between modes in which the function is carried out; that is, how the function is implemented across the different taxa and environments, e.g. high versus low affinity transport of ammonia or thermophilic versus psychrophilic oxidation of acetate. Often, an implementation clearly relates to taxonomy; however, in many instances a single taxon may have more than one implementation and a given implementation may be shared by several taxa (Fig. 2A). Last, we consider the environment, which allows analysis of context-dependent contributions to the guild. This approach results in a detailed picture of the guild composition (Fig. S3).

Figure 2.

Figure 2

Guild quantification method. Notes: (A) Example showing how abundance, observed diversity, and univocity are calculated. A tree is constructed with reference sequences (lines, far left). Different clusters are (Inline graphic) or are not (Inline graphic) assigned to the function based on available evidence. Metagenomic sequences (stars) are placed in the reference tree. Sequence Inline graphic does not fall within the known Inline graphic clusters and is discarded. The remaining sequences fall in two different clusters (implementations Inline graphic in yellow and Inline graphic in pink). A given taxon may have different implementations and different taxa may hold the same. Next, abundance of reads for each cluster (column A, arbitrary abundances are shown as examples) and observed diversity (column Inline graphic, counting the different unique sequences in each cluster) are calculated. Real metagenomic data quantification is available in Table S4. (B) Calculation of expected diversity, Inline graphic. In general, the more abundant a gene is, the larger the diversity of sequences. For each gene, these variables show a log–log relationship. When real data from a metagenome are placed in the graph, some show either higher or lower diversity than expected from their abundance. By calculating Inline graphic, we reward the former and penalize the latter.

In practice, we consider that sequences associated with a taxon, an implementation, and a context are equivalent, and we assess their importance considering three factors: their abundance; their richness, as the number of unique sequences; and their efficiency carrying out the function (Fig. 2A). While abundant and efficient proteins reasonably favor the function’s output, we also include richness as it typically promotes functional stability in communities and is thus of considerable importance [52], especially in microbiomes where levels of functional redundancy and niche overlap are very high [53]. In fact, it is a general postulate of ecology that the richness of the ecosystem’s constituent elements is synonymous with its complexity and maturity [54, 55]. Therefore, we quantify the contribution of a group of equivalent sequences Inline graphic to a guild through the impact coefficient Inline graphic:

graphic file with name DmEquation2.gif (2)

where Inline graphic is the summed abundance of the corresponding sequences, Inline graphic and Inline graphic are the observed and expected richness, and Inline graphic is the univocity of the function. Because we consider that the richness is key for the resilience of the function over time, we reward (or penalize) cases with higher (or lower) sequence richness than expected (Fig. 2B). The denominator Inline graphic corrects for the fact that greater richness is invariably observed with increased abundance (see the Methods section). Last, the univocity refers to the efficiency of the sequences. For clarity, in this study we only consider a binary classification of Inline graphic where sequences are either functional (Inline graphic) or non-functional (Inline graphic), although this definition allows for intermediate values, e.g. by assessing the average efficiency to accomplish the function.

In summary, in this work we rely on normalized abundance of metagenomic reads to quantitatively characterize guilds according to taxonomy, implementations, and contexts through the impact coefficients Inline graphic, where larger values reflect the abundance in the environment of a given implementation, the occurrence of unexpected protein richness, and the likelihood of the proteins to perform the guild-definitory function.

Enhancing functional profiling through the discrimination of sequence spaces in amoA

The previous formalism first requires a reliable functional annotation of sequences. However, automatic annotation relies on similarity with known sequences, producing many false positives because similarity alone is often not accurate enough to discriminate functionality (Fig. 1) [56]. In this subsection, we present a method that greatly improves automatic function annotation by using reference trees as functional sequence classifiers. We illustrate the procedure using ammonia oxidation as an example, whose effector is typically an ammonia monooxygenase (AMO). We focus on the gene encoding the A subunit (amoA), which conducts the catalytic activity of the enzyme complex [57, 58] and has undergone extensive functional description.

Figure 3 shows the reference amoA tree to classify related sequences from our metagenomic dataset using an in-house curated oceanic database (see Methods; Fig. 3A, Fig. S4, and File S1). Within the tree, we identified regions containing functional or non-functional sequences based on experimental evidence from the literature [59–61]. These regions highlight several particularities of AMO enzymes (full clustering in Fig. S4). On the one hand, amoA genes have evolved among taxonomic groups with different metabolic pathways. Thus, the ancestral AMO enzyme has shifted from a moderate affinity for a broad spectrum of substrates to a restricted substrate specificity, suboptimal for the substrate preference of each organism [62]. Therefore, other closely related enzymes retained some promiscuity or ambivalence; e.g. particulate methane monooxygenases (pMMO) are able to oxidize ammonia even though their main substrate is methane [63, 64], or pBMO, which, apart from oxidizing butane, can hydroxylate other short alkanes (CInline graphic–CInline graphic) [65, 66]. Therefore, we can discriminate between proteins carrying out ammonia oxidation with Inline graphic and others that have higher affinities for methane or other simple aliphatic alkanes [67], with Inline graphic for ammonia oxidation (Fig. 1). On the other hand, there are two different implementations among functional sequences: the archaeal and bacterial AMO, AOA, and AOB, respectively. Each occupies different niches [68] and acts differently on the key substrate: AOA-amoA proteins display a Inline graphic-fold higher affinity for ammonia, dominating in oligotrophic environments [69].

Figure 3.

Figure 3

Distinguishing functional from non-functional sequences improves the assessment of the ammonia oxidation guild. Notes: (A) Reconstructed phylogeny of amoA, used as a reference tree to classify ammonia-oxidizing capable sequences. The tree contains 135 sequences with strong functional evidence based on either biochemical or physiological features, or inferred by homology to quality sequences (see Methods). For clarity, we only highlight clusters of sequences where metagenomic Malaspina samples have been successfully placed (full clustering in Fig. S4). Among those, we distinguish sequence clusters with proven ammonium oxidation function (shaded greens) from sequences with a broader substrate spectrum and sequences with evidence of being non-functional for ammonia oxidation (shaded orange). Note that AOA-amoA form a single cluster, but AOB-amoA comprises two dissimilar clusters. Evidence of function was gathered from various sources, albeit the main ones are marked in pink and blue in the outer circle. The size of red circles is proportional to the number of reads. (B) Log representation of abundance values (TPM) of the amoA classified queries found in Malaspina metagenomes (red circles from panel A). We excluded 31% of unique sequences (1.01% of total TPM) corresponding to the non-univocal tree clusters, keeping a conservative criterion.

Then, we retrieved amoA sequences from Malaspina metagenomes [70] and placed them on to the tree (Fig. 3A). First, the placements were robust as we did not find any spurious sequence. However, we found that from a total of 129 unique sequences, 40 were discarded by the tree placement due to paralogy (31.0%), and only 82 were bona fide for ammonia oxidation (63.5%). Note that, despite the similarity between sequences in the AOB-amoA and Inline graphic clusters, we were able to distinguish functional from non-functional proteins. In Fig. 3B, we show the results of the classification of the Malaspina queries per implementation, taxon, and water column depth in terms of their abundance. We are able to focus our analysis on the AOA and AOB implementations (in green), which are the reliable functional sequences in contrast to existing approaches that are unable to filter out false positives, and to distinguish implementations.

Overall, this shows that several amoA-like sequences were non-functional, highlighting the need to recognize false positives (Inline graphic) to accurately quantify the ammonia oxidation function in metagenomes. In addition, identifying different regions in the tree allowed us to identify different amoA implementations, providing a more detailed description of the ammonia oxidizers in the Malaspina metagenomes.

Discriminating functional sequence spaces of potF based on environmental preferences

Unlike amoA, unfortunately, most genes have limited or ambiguous experimental evidence. This is the case of potF, which is a subunit of an ATP-binding cassette (ABC transporter) that binds and imports putrescine-like polyamines. Here, we were unable to delimit sequence spaces by substrate preference due to the scarcity of experimental evidence (Fig. S5), but we were able to characterize clusters using a different criterion.

As in the previous case, we built a reference tree (Fig. 4A), but using the HMM corresponding to polyamine binding, KEGG K11073 [71] (File S2) to identify functional sequences with the same oceanic database. Thus, the reconstructed reference tree showed a collection of HMM-retrieved potF-like sequences grouped by their similarity where each sequence is represented by a leaf of the tree. The ability of a protein to perform a function is influenced by its surrounding environment, thus requiring specific conditions to perform it effectively. In particular, periplasmic transporters are more exposed to changing environments than cytoplasmic proteins. Therefore, we wished to explore whether the divergence between tree branches depended on environmental conditions.

To that aim, we retrieved from the literature the environmental preferences of the bacteria present in the reference tree. We obtained 15 environmental variables for 321 organisms in pure culture, corresponding to 478 of 1158 tree leaves (Tables S1 and S2, and Fig. S6). We then searched for sequence spaces that shared similar environmental properties (randomized tests, one-tailed Inline graphic-values Inline graphic). First, we observed significant nodes that grouped only a handful of sequences, highlighting properties that the taxonomy accurately predicts (e.g. hydrocarbon degradation) [72]. Second, we found significant nodes that grouped all sequences in just a few clusters under a combination of preferences to temperature, salinity, and pH, suggesting groups undergoing either common environmental adaptation, or horizontal gene transfer (shaded colors in Fig. 4A).

We wondered whether these environmental features were deducible from taxonomy or not. To answer, we compared the tree structures between a phylomarker and potF-like sequences for the same list of organisms (see Methods). We found that the significant internal nodes grouped sequences that diverged differently than expected from the phylomarker (Fig. 4B). This divergence can be exemplified by two different limit cases. On the one hand, the same organism may have more than one sequence, which may be in either distant or nearby positions in the tree; e.g. Pseudomonas alcaligenes appeared in six distant positions among three different clusters (Fig. S7). On the other hand, sequences from distant taxa may unexpectedly converge in similarity. This is the case of Oceanobacter kriegii and Thalassobius gelatinovorus, an alpha- and gamma-proteobacteria, respectively, whose normalized phylogenetic distance on a 16S tree is large, while being remarkably close in the potF-like tree (0.67 versus 0.16; red arrow in Fig. 4B). Thus, the phylogenetic signal poorly predicted divergence in potF-like sequences. Therefore, we used the significant internal nodes to define different implementations and classify sequences, underlining broader trends and properties across the reference tree of potF.

Finally, we placed the potF-like short queries from Malaspina metagenomes on to the annotated tree, discarding Inline graphic (4.13%) poor-quality unique sequences (see Methods; grey circles in Fig. 4A). The rest of the placements had a mean weighted likelihood robustness of 0.89 and populated the entire tree, suggesting that most known marine polyamine-like binding proteins were well represented in our database. In this way, we decoupled functional behaviors from taxonomic positions of potF-like proteins, and discretized the sequence space using environmental evidence.

Decoding ecological dynamics in the polyamine uptake guild in marine contexts

We finally quantified the polyamine uptake guild in the Malaspina circumnavigation samples, showcasing the potential of our approach. This specific guild is reported to be ubiquitous in the oceans [73], so it was ideal to assess the guild composition in different marine environments. Using the classified high-quality sequences from the previous section (red circles in Fig. 4A), we calculated the impact coefficients Inline graphic for each taxon, implementation, and environment, and analyzed the guild composition across three different depth layers: epipelagic (0–200 m), mesopelagic (200–1000 m), and bathypelagic (1000–4000 m).

Figure 5 compares the results between state-of-the-art methodologies (A), our approach considering only abundance (B), and our approach based on impact coefficients (C). First, note that current methodologies collapse the whole potF sequence space into a single bar, while we are able to differentiate between implementations. Second, the impact coefficients (which include the richness ratio, Equation (2)) reveal taxa that have a huge effector richness and that might be fundamental for functional stability (e.g. Sulfitobacter sp001629325 and Pseudooceanicola atlanticus). Therefore, our approach provides a more comprehensive understanding of the guild structure and its differences across contexts.

Figure 5.

Figure 5

Our guild approach identifies important ecological patterns in the polyamine uptake guild in contrast to the traditional approach. Notes: (A) The output of a standard automatic metagenomic analysis. The results have been displayed as histograms and as radial plots with stacked log values to improve the visualization of taxa spanning different orders of magnitude. (B) Our guild approach considering only abundance. (C) The impact coefficients, which also ponder sequence richness, shown across three environments (columns). What in the automatic annotation was a single bar has been split into several sectors, each of them corresponding to one of the implementations as identified in Fig. 4A. We only show taxa with the largest contributions, the rest are grouped under the “Others” tag (black). The contribution to the function fluctuates in both taxonomic identity and implementation preference with non-trivial relationships with depth. For example, UBA11654 sp001629325 (striped blue, red arrows) contributes in the epipelagic with two different implementations, cIII and cIIa, while it only contributes through cIII in the mesopelagic, and disappears in the bathypelagic. Moreover, Sulfitobacter sp. and P. atlanticus (black arrows) were only visible considering expected richness. Note how easy it is to observe distinct functional trends for each taxon, in contrast with traditional approaches (panel A). The incertae implementation is representing the absence of k values in the undefined sequence spaces of the reference tree.

Overall, our results showed that the polyamine uptake guild is important throughout the entire water column, but with a different composition of taxa and implementations. The main implementations of polyamine uptake found were those significant for activity under saline conditions (cIa, cIb, and cIII), which is coherent with the fact that the samples were all marine. Also, we observed considerable functional redundancy as there were different implementations in every sample and several taxa within each implementation. Specifically, the guild structure changed significantly between the epipelagic and mesopelagic, both in taxonomic composition and in the estimated strength of each of the implementations. Between the mesopelagic and bathypelagic, the pattern was remarkably taxon-preserved but the net contribution of each implementation to the overall function changed slightly, with more top contributors in the latter. However, the polyamine uptake function persisted throughout the water column despite these changes in taxonomic composition, showing a species turnover with depth.

In addition, we easily compared the guild across environments by aggregating implementations and/or taxa (in Inline graphic of Equation (2)). Thus, we computed the fold-changes of the impact coefficients Inline graphic, and Inline graphic. Even though several implementations were more important and taxonomically diverse in the bathypelagic, the main guild activity occurred in the mesopelagic. Similarly, we could also compare the guild’s composition across environments and implementations by aggregating all sequences corresponding to different taxa (in Inline graphic of Equation (2)). The most remarkable change corresponded to implementation cIIb between the epipelagic and mesopelagic Inline graphic (Fig. 6), suggesting an implementation-dependent bloom in the mesopelagic. This is an interesting finding, because the implementation cIIb is significant for organisms with a wide pH range (Fig. 4), and the greatest variability of oceanic pH occurs precisely in the mesopelagic (Fig. S8). This confirms the ability of our approach to reveal environmental dependences of biological functions in general.

Figure 6.

Figure 6

Changes with depth in the importance of the polyamine uptake guild. An advantage of using the impact coefficients to determine the structure of a guild is that we can visualize the functional contribution in a variety of ways. For example, here we look at the fold changes in the contribution between different ocean layers for different implementations. First, the Inline graphic values of each implementation are added for each layer (Inline graphic, Inline graphic, Inline graphic). Then the values for two different layers are divided (Inline graphic, Inline graphic). It is easily observed which implementations depend the most on depth; in this case, Inline graphic and Inline graphic, sequence spaces putatively adapted to a wide pH range. Interestingly, the sharp changes in these two implementations correspond with the area of the water column with the largest shift in pH toward acidity in the oxycline (Fig. S8).

Discussion

Diversification of functional protein sequences

Most functions are performed by evolving proteins. In addition, each function is often found in many different environments. Thus, diversification of the function-capable sequences is not only expected but frequently observed [74–76]. In order to quantify microbial guilds, the issue of how proteins diversify while maintaining function must be considered.

First, diversification of a protein can lead to promiscuity or pleiotropy [77, 78], especially when horizontal transfer events occur [79]. It is then likely that the protein may partially or totally lose its original function, undergoing a process of readaptation to its new genomic and environmental context [11, 80]. Such diversification constitutes the main problem when delimiting functional sequence spaces. A common method to predict function among divergent proteins is the orthology delineation, which account for speciation and duplication events, whereby their chronology allows the classification of gene sets into paralogs, orthologs, inparalogs, outparalogs, etc. [81, 82]. Yet, orthology delineation methods were less accurate than our approach in terms of functional grouping within the same protein family. Instead, our approach relies on biochemical evidence at the sequence level. We clustered the amoA tree by means of orthology delineation with possvm [83], which relies on the species overlap algorithm [84] (Fig. S9). The result was different from that obtained with our method. Some ortholog groups were correctly delineated, such as AOB (Inline graphic; OG40). However, they included methane oxidizing (Inline graphic), butane oxidizing (Inline graphic), and ammonia oxidizing (Inline graphic; AOB-gamma) sequences in the same ortholog group (OG16). Consequently, we opted for a conservative strategy, i.e. to propagate the biochemical evidence at the sequence level to robust regions of reference trees.

Once the functional sequence spaces have been properly identified, the next task is to determine what is functionally distinctive about these divergent groups. Neutral drift undoubtedly contributes to protein sub- and neo-functionalization [85]. However, the observed functional plasticity in a population of sequences may be constrained by selection, allowing different kinetics in diverse environments, even modifying the substrate preferences [86–88]. We can expect that sequence variants adapted to similar conditions will converge in similarity. Thus, if the entire sequence space performing exactly the same function is grouped into clusters of sequence similarity, groups of sequences that are expected to work alike in similar environments shall emerge (Figs. 3 and 4), instead of at-random groupings. For quantification of microbial guilds, we defined the function implementations as these groups of sequences that work equivalently towards a key resource, regardless their lineage (same binding affinity, substrate spectrum, temperature, pH or salinity preferences, etc.).

How the environment constrains microbial protein diversification

Like all other organisms, microbes achieve proteostasis through the expression of regulatory feedbacks, tuning of non-covalent interactions between structural subunits, and sequence re-adaptation [89–91]. All of these mechanisms act in multiple levels and can have an immediate impact on substrate accommodation [92]. A single amino acid change may be crucial for the specificity between the substrate and its binding site [93, 94]. Moreover, modification of residues at sites other than the conserved regions of the protein can often be structurally important [95]. Regarding the quaternary structure, the protein subunits evolve to remain bound under physiological conditions, and to monomerize in out-of-range environments [96]. Sometimes, due in part to the non-covalent nature of these protein–protein bonds, it is possible to recover function when physiological conditions return [96]. For all these reasons, it can be stated that any functional protein is the fine-tuned product of a sequence to a particular range of environmental conditions.

However, the process itself is poorly understood in a mechanistic way despite continued efforts [97–99]. More recently, Panja et al. [100] statistically explored the changes in microbial proteins adapted to different types of environments, and they diverged both in terms of amino acid composition and in their arrangement, a result in line with the “weak selection” concept [101]. According to these and other previous results, salinity, pH, and temperature would represent the major environmental drivers of how proteins evolve [4, 102–104], modifying the catalytic kinetics, substrate specificity or conditional stability while maintaining the same function [88, 105].

From these facts, it is deduced that environmental constraints, beyond speciation processes and gene duplications, are key controllers of functional structure in microbiomes. Therefore, we longed to classify proteins ecologically by their ability to operate similarly across environments.

Determining microbial implementations considering the nature of protein diversification

As a case study, we chose a function that is difficult to explore and quantify, which is organic nitrogen acquisition through putrescine and other related polyamines. The difficulty of exploring this function is given by the following issues: first, substrate affinity is moderately unspecific and, although there may be a slight preferential binding to spermidine or putrescine depending on certain amino acids [106], our results indicate that proteins with the same substrate preferences are not strictly clustered in the tree (Fig. S5); and second, there is an extreme shortage of sequences with functional experimental evidence.

As stated above, the microbial guild quantification method aims (i) to discriminate sequence spaces that correspond to the same function, and then (ii) to characterize groups of functional sequences that work in a similar way (implementations of the function). The first objective improved automatic gene annotation, while the second categorized it functionally. To do both, we built several reference phylogenetic trees for oceanic organisms and used them as sequence space classifiers (Figs. 3 and 4).

Regarding objective (ii), we wanted to classify the performances of ABC transporter-associated polyamine-binding proteins. As we did not have sufficient information on the preferential binding to each polyamine or its kinetics, we decided to characterize groups that work alike in a different way. We manually obtained environmental preference information for as many of the cultured organisms present in our reference tree as possible. Then, we tested how good the tree topology was at discriminating groups of sequences putatively adapted to work in given ranges of the environmental variables.

When evaluating sequentially all nodes in the tree, we found that some internal nodes had a significant correspondence with particular environmental variables. These nodes divided the tree into large clades containing sequences that correlated with salinity, pH, and/or temperature (Fig. 4A). In addition, motility was also very significant for the same group of sequences related to pH variability. One possibility was that these environmentally consistent clades were phylogenetically equivalent. That is, the adaptation of transporter proteins to a given environmental condition should have been predicted by taxonomy. However, this was not the case. We found that the divergence of the bigger groups was not explained by taxonomy (Fig. 4B). Consequently, we defined the implementations for polyamine uptake as the sequence spaces corresponding to these inner nodes. Note that, until this work, the study of microbial guilds failed to consider how the functions were implemented, as organisms contributing to the guild in different ways prevented membership.

Decoupling taxonomy and function required a redefinition of guilds

We have argued above that taxonomic position is not, in many cases, synonymous with function. In addition to those arguments, there is some research actually focused on decoupling taxonomy from functional assets [107, 108]. Moreover, machine-learning approaches appear to outperform niche prediction with functions rather than phylogeny [109]. This means that, at least in specific cases, it is possible to better predict the occurrence of function in an environment by its physicochemical features rather than by the taxonomic composition detected therein [108]. Despite this, some recent efforts have attempted to statistically infer microbial guilds through patterns of variation in species assignments and ecosystem properties related to the samples [110]. These authors were trying to find a solution without relying on automatic annotation. However, as they state, the taxa recovered with their model are just likely related to a function, but we cannot know the mechanism, or how they are related, and it assumes an absolute concurrence of taxonomy with function.

In this context, our guild definition has proven useful for decoupling taxonomy from how the molecular functions are carried out, and our method was robust enough to improve automatic annotation. Now, microorganisms no longer have to occupy the resource space in a similar way to obtain membership, but can contribute to the guild with different implementations of the function over the key resource. For example, two individuals of the same species may be exploiting the same resource at very different rates, or two individuals of different species may be equivalent in terms of consumption of the key resource; taxonomy is not an absolute predictor of either functions or how they had been implemented. Decoupling these effects was therefore mandatory to quantify microbial guilds.

As shown in the Results section, the acquisition of putrescine-like polyamines is a ubiquitous trait in the examined ocean layers, which is consistent with previous literature on the topic [73]. Nevertheless, we added novel insights about how this guild changes with depth by detailing how the different species implemented the function. The guild presents itself, however, in multiple shapes; it changes both its taxonomic composition and the most abundant implementations, and seems to follow trends that correspond to the different physicochemical characteristics of the three ocean depths analyzed. In fact, there are characteristic guild patterns that seem to be better explained by depth than by sampling spot or latitude. We found that this function was carried out by a wide variety of taxa and every considered implementation of polyamine uptake was always present, although with different strength. This result seems to support the statement of functional redundancy being more prevalent than expected by chance in microbiomes [111].

In the study by Bergauer et al. [73], polyamine uptake was one of the traits studied to analyze microbial heterotrophy in different ocean layers. Regarding the analysis of the guild, our approach adds: (i) high-quality depuration of spurious metagenomic queries (Fig. 4A); (ii) discrimination of sequence spaces corresponding to a functional attribute (Fig. 5B) not deducible from taxonomy (Fig. 4B); and (iii) determination of the unexpected diversification of function’s proteins (Fig. 5C). Finally, it makes the comparison between different guilds easier, as the ecological values are standardized by the same theoretical framework.

Thus, in Fig. 6, the Inline graphic ratio had strong responses among implementations adapted to a wide pH range (Inline graphic and Inline graphic). In our dataset, this remarkable increase is shown to be exclusive for this type of implementation, and can be explained by the rapid depth-dependent acidification, a characteristic feature of the mesopelagic oxycline [112, 113]. In general, the lowest pH levels in the water column correspond with the presence of an oxygen minimum. The decrease in pH is mainly driven by the increased concentration of dissolved carbonic acid, which also relates to biological activity of upper layers (Fig. S8). Values of pH are also dependent on more strictly abiotic factors such as temperature, salinity, and pressure, acting as dissociation constant modifiers [114, 115]. Therefore, the pH minimum is strongly linked to mesopelagic depths and may exhibit some seasonality. A key result was that the implementations that most changed their representation (in both abundance and richness) between epipelagic and mesopelagic were precisely those preserving their function despite pH variability. This was not only a validation of the method, but a significant step forward in knowledge of functional ecology. We can now make use of a new window to search for structure.

Conclusions

This work has three strands of contributions: (i) conceptually, we propose a new definition of guild, as previous definitions led to ambiguity, especially in microorganisms; (ii) methodologically, we have been able to formalize the new concept of guild for the study of microbiomes by delineating the sequence spaces and assigning functional properties to them, and to develop a quantification method that corrects for efficiency and expected richness of effectors, in addition to their abundance; (iii) ecologically, we have been able to quantify and describe for the first time the polyamine uptake guild in water columns from tropical and subtropical latitudes, identifying key players, and finding a major correspondence between certain sequence spaces of potF and the oceanic environment in which they perform their function.

Supplementary Material

Supplementary_ISMEcommun_ycae042

Acknowledgments

We are grateful to Juan F. Poyatos (CNB-CSIC) for discussion of the methodologies and their implications, to Alberto Pascual-García (CNB-CSIC) for the feedback and discussion on the epistemology of the microbial guild, and to Diego Jiménez for encouraging this work. We also thank the CSIC-LifeHUB forum (PIE-202120E047-Conexiones-Life) for generating space for discussion and allowing these ideas to mature.

Contributor Information

Juan Rivas-Santisteban, Microbiome Analysis Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, Calle Darwin no. 3, Madrid, 28049, Spain.

Pablo Yubero, Logic of Genomic Systems Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, Spain.

Semidán Robaina-Estévez, Department of Microbiology, University of La Laguna, Spain.

José M González, Department of Microbiology, University of La Laguna, Spain.

Javier Tamames, Microbiome Analysis Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, Calle Darwin no. 3, Madrid, 28049, Spain.

Carlos Pedrós-Alió, Microbiome Analysis Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, Calle Darwin no. 3, Madrid, 28049, Spain.

Author contributions

Juan Rivas-Santisteban (conceptualization, methodology, validation, writing—original draft, investigation, data curation, visualization); Pablo Yubero (methodology, software, formal analysis, writing—review & editing, data curation, visualization); Semidán Robaina-Estévez (methodology, writing—review & editing); José M. González (methodology); Javier Tamames and Carlos Pedrós-Alió (writing—review & editing, project administration, supervision, funding acquisition).

Conflicts of interest

The authors declare no competing interests.

Funding

We acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI). Projects PID2019-110011RB-C31 and -C32 funded by MCIN/AEI/10.13039/501100011033 and work supported by PhD fellowship PRE2020-096130 from the Spanish Ministerio de Ciencia e Innovación and the European Social Fund.

Data availability

All data generated or analysed during this study are included in this published article and its supplementary information files.

Materials and correspondence

Correspondence and request for materials should be addressed to Juan Rivas-Santisteban.

References

  • 1. Cech TR. Evolution of biological catalysis: ribozyme to RNP enzyme. In Cold Spring Harbor Symposia on Quantitative Biology, Vol. 74. Cold Spring Harbor Laboratory Press, 2009, 11–16. 10.1101/sqb.2009.74.024. [DOI] [PubMed] [Google Scholar]
  • 2. Masel J. Genetic drift. Curr Biol  2011;21:R837–8. 10.1016/j.cub.2011.08.007 [DOI] [PubMed] [Google Scholar]
  • 3. Altenhoff AM, Dessimoz C. Inferring orthology and paralogy. Methods Mol Biol  2012;855:259–79. 10.1007/978-1-61779-582-4_9 [DOI] [PubMed] [Google Scholar]
  • 4. Lanyi JK. Salt-dependent properties of proteins from extremely halophilic bacteria. Bacteriol Rev  1974;38:272–90. 10.1128/br.38.3.272-290.1974 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Ladero M, Ruiz G, Pessela BCC  et al.  Thermal and ph inactivation of an immobilized thermostable Inline graphic-galactosidase from thermus sp. strain t2: comparison to the free enzyme. Biochem Eng J  2006;31:14–24. 10.1016/j.bej.2006.05.012 [DOI] [Google Scholar]
  • 6. Lynch M, Ackerman MS, Gout J-F  et al.  Genetic drift, selection and the evolution of the mutation rate. Nat Rev Genet  2016;17:704–14. 10.1038/nrg.2016.104 [DOI] [PubMed] [Google Scholar]
  • 7. Liberles DA, Teichmann SA, Bahar I  et al.  The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci  2012;21:769–85. 10.1002/pro.2071 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Dourado H, Mori M, Hwa T  et al.  On the optimality of the enzyme–substrate relationship in bacteria. PLoS Biol  2021;19:e3001416. 10.1371/journal.pbio.3001416 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Baiser B, Lockwood JL. The relationship between functional and taxonomic homogenization. Glob Ecol Biogeogr  2011;20:134–44. 10.1111/j.1466-8238.2010.00583.x [DOI] [Google Scholar]
  • 10. van de Guchte M. Horizontal gene transfer and ecosystem function dynamics. Trends Microbiol  2017;25:699–700. 10.1016/j.tim.2017.07.002 [DOI] [PubMed] [Google Scholar]
  • 11. Husnik F, McCutcheon JP. Functional horizontal gene transfer from bacteria to eukaryotes. Nat Rev Microbiol  2018;16:67–79. 10.1038/nrmicro.2017.137 [DOI] [PubMed] [Google Scholar]
  • 12. Pagé A, Tivey MK, Stakes DS  et al.  Temporal and spatial archaeal colonization of hydrothermal vent deposits. Environ Microbiol  2008;10:874–84. 10.1111/j.1462-2920.2007.01505.x [DOI] [PubMed] [Google Scholar]
  • 13. Storz JF. Causes of molecular convergence and parallelism in protein evolution. Nat Rev Genet  2016;17:239–50. 10.1038/nrg.2016.11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Root RB. The niche exploitation pattern of the blue-gray gnatcatcher. Ecol Monogr  1967;37:317–50. 10.2307/1942327 [DOI] [Google Scholar]
  • 15. Koran M, Kropil R. What are ecological guilds? Dilemma of guild concepts. Russ J Ecol  2014;45:445–7. 10.1134/S1067413614050178 [DOI] [Google Scholar]
  • 16. Nebel S, Mills A, McCracken J  et al.  Declines of aerial insectivores in North America follow a geographic gradient. Avian Conservation and Ecology  2010;5. 10.5751/ACE-00391-050201 [DOI] [Google Scholar]
  • 17. Hohberg K. Soil nematode fauna of afforested mine sites: genera distribution, trophic structure and functional guilds. Appl Soil Ecol  2003;22:113–26. 10.1016/S0929-1393(02)00135-X [DOI] [Google Scholar]
  • 18. Chiel HJ, Beer RD. The brain has a body: adaptive behavior emerges from interactions of nervous system, body and environment. Trends Neurosci  1997;20:553–7. 10.1016/S0166-2236(97)01149-1 [DOI] [PubMed] [Google Scholar]
  • 19. Hillis TL, Mallory FF. Sexual dimorphism in wolves (Canis lupus) of the Keewatin District, northwest territories, Canada. Can J Zool  1996;74:721–5. 10.1139/z96-081 [DOI] [Google Scholar]
  • 20. Torsvik V, Øvreås L. Microbial diversity and function in soil: from genes to ecosystems. Curr Opin Microbiol  2002;5:240–5. 10.1016/S1369-5274(02)00324-7 [DOI] [PubMed] [Google Scholar]
  • 21. Gregory TR. Genome size evolution in animals. In The Evolution of the Genome, Amsterdam: Elsevier, 2005, 3–87. 10.1016/B978-012301463-4/50003-6 [DOI] [Google Scholar]
  • 22. Gregory TR and DeSalle R. Comparative genomics in prokaryotes. In The Evolution of the Genome. Amsterdam: Elsevier, 2005, 585–675. 10.1016/B978-012301463-4/50012-7 [DOI] [Google Scholar]
  • 23. Veshareh MJ, Nick HM. A novel relationship for the maximum specific growth rate of a microbial guild. FEMS Microbiol Lett  2021;368. 10.1093/femsle/fnab064 [DOI] [PubMed] [Google Scholar]
  • 24. Jones CM, Spor A, Brennan FP  et al.  Recently identified microbial guild mediates soil N2O sink capacity. Nat Clim Chang  2014;4:801–5. 10.1038/nclimate2301 [DOI] [Google Scholar]
  • 25. Martinović T, Odriozola I, Mašínová T  et al.  Temporal turnover of the soil microbiome composition is guild-specific. Ecol Lett  2021;24:2726–38. 10.1111/ele.13896 [DOI] [PubMed] [Google Scholar]
  • 26. Guojun W, Zhao N, Zhang C  et al.  Guild-based analysis for understanding gut microbiome in human health and diseases. Genome medicine  2021;13:1–12. 10.1186/s13073-021-00840-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Graham DW, Knapp CW, Van Vleck ES  et al.  Experimental demonstration of chaotic instability in biological nitrification. ISME Journal  2007;1:385–93. 10.1038/ismej.2007.45 [DOI] [PubMed] [Google Scholar]
  • 28. Ryabov A, Blasius B, Hillebrand H  et al.  Estimation of functional diversity and species traits from ecological monitoring data. Proc Natl Acad Sci  2022;119:e2118156119. 10.1073/pnas.2118156119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Passy SI. Diatom ecological guilds display distinct and predictable behavior along nutrient and disturbance gradients in running waters. Aquat Bot  2007;86:171–8. 10.1016/j.aquabot.2006.09.018 [DOI] [Google Scholar]
  • 30. Fauth JE, Bernardo J, Camara M  et al.  Simplifying the jargon of community ecology: a conceptual approach. Am Nat  1996;147:282–6. 10.1086/285850 [DOI] [Google Scholar]
  • 31. Nemergut DR, Schmidt SK, Fukami T  et al.  Patterns and processes of microbial community assembly. Microbiol Mol Biol Rev  2013;77:342–56. 10.1128/MMBR.00051-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Pedrós-Alió C. Toward an autecology of bacterioplankton. In: Plankton Ecology. Berlin: Springer, 1989, 297–336. [Google Scholar]
  • 33. Klemetsen T, Raknes IA, Juan F  et al.  The MAR databases: development and implementation of databases specific for marine metagenomics. Nucleic Acids Res  2018;46:D692–9. 10.1093/nar/gkx1036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Nishimura Y, Yoshizawa S. The OceanDNA MAG catalog contains over 50,000 prokaryotic genomes originated from various marine environments. Scientific Data  2022;9:305. 10.1038/s41597-022-01392-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Paoli L, Ruscheweyh H-J, Forneris CC  et al.  Biosynthetic potential of the global ocean microbiome. Nature  2022;607:111–8. 10.1038/s41586-022-04862-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Sunagawa S, Coelho LP, Chaffron S  et al.  Structure and function of the global ocean microbiome. Science  2015;348. 10.1126/science.1261359 [DOI] [PubMed] [Google Scholar]
  • 37. Chaumeil P-A, Mussig AJ, Hugenholtz P  et al.  GTDB-Tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics  2020;36:1925–7. 10.1093/bioinformatics/btz848 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Shen W, Le S, Li Y  et al.  SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS One  2016;11:e0163962. 10.1371/journal.pone.0163962 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Sona V, Vinayaka CR, Natale DA  et al.  Structure-guided rule-based annotation of protein functional sites in UniProt knowledgebase. In Bioinformatics for Comparative Proteomics, Methods in Molecular Biology, Vol. 694. New York: Springer, 2011, 91–105. [DOI] [PubMed] [Google Scholar]
  • 40. Johnson LS, Eddy SR, Portugaly E. Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics  2010;11:1–8. 10.1186/1471-2105-11-431 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Limin F, Niu B, Zhu Z  et al.  CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics  2012;28:3150–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Libbrecht MW, Bilmes JA, Noble WS. Choosing non-redundant representative subsets of protein sequence data sets using submodular optimization. Proteins: Structure, Function, and Bioinformatics  2018;86:454–66. 10.1002/prot.25461 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Robaina-Estévez S. Pynteny: Synteny-Aware HMM Searches Made Easy. Zenodo, 2022. [Google Scholar]
  • 44. Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics  2004;5:1–19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Minh BQ, Schmidt HA, Chernomor O  et al.  IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol  2020;37:1530–4. 10.1093/molbev/msaa015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Darriba D, Posada D, Kozlov AM  et al.  ModelTest-NG: a new and scalable tool for the selection of DNA and protein evolutionary models. Mol Biol Evol  2020;37:291–4. 10.1093/molbev/msz189 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Tamames J, Puente-Sánchez F. Squeezemeta, a highly portable, fully automatic metagenomic analysis pipeline. Front Microbiol  2019;9:3349. 10.3389/fmicb.2018.03349 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Berger SA, Stamatakis A. Papara 2.0: a vectorized algorithm for probabilistic phylogeny-aware alignment extension. Heidelberg Institute for Theoretical Studies, 2012. [Google Scholar]
  • 49. Barbera P, Kozlov AM, Czech L  et al.  EPA-ng: massively parallel evolutionary placement of genetic sequences. Syst Biol  2019;68:365–9. 10.1093/sysbio/syy054 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Czech L, Barbera P, Stamatakis A. Genesis and gappa: processing, analyzing and visualizing phylogenetic (placement) data. Bioinformatics  2020;36:3263–5. 10.1093/bioinformatics/btaa070 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res  2016;44:W242–5. 10.1093/nar/gkw290 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Ives AR, Klug JL, Gross K. Stability and species richness in complex communities. Ecol Lett  2000;3:399–411. 10.1046/j.1461-0248.2000.00144.x [DOI] [Google Scholar]
  • 53. Vidal MC, Wang SP, Rivers DM  et al.  Species richness and redundancy promote persistence of exploited mutualisms in yeast. Science  2020;370:346–50. 10.1126/science.abb6703 [DOI] [PubMed] [Google Scholar]
  • 54. Margalef R. Diversity and stability: a practical proposal and a model of interdependence. Brookhaven Symposia in Biology  1969;22:25–37. [PubMed] [Google Scholar]
  • 55. Margalef R. On certain unifying principles in ecology. Am Nat  1963;97:357–74. 10.1086/282286 [DOI] [Google Scholar]
  • 56. Valencia A. Automatic annotation of protein function. Curr Opin Struct Biol  2005;15:267–74. 10.1016/j.sbi.2005.05.010 [DOI] [PubMed] [Google Scholar]
  • 57. Ensign SA, Hyman MR, Arp DJ. In vitro activation of ammonia monooxygenase from Nitrosomonas europaea by copper. J Bacteriol  1993;175:1971–80. 10.1128/jb.175.7.1971-1980.1993 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Rotthauwe J-H, Witzel K-P, Liesack W. The ammonia monooxygenase structural gene amoA as a functional marker: molecular fine-scale analysis of natural ammonia-oxidizing populations. Appl Environ Microbiol  1997;63:4704–12. 10.1128/aem.63.12.4704-4712.1997 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Eloy RJ, Alves BQ, Minh TU  et al.  Unifying the global phylogeny and environmental distribution of ammonia-oxidising archaea based on amoA genes. Nat Commun  2018;9:1517. 10.1038/s41467-018-03861-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Khadka R, Clothier L, Wang L  et al.  Evolutionary history of copper membrane monooxygenases. Front Microbiol  2018;9:2493. 10.3389/fmicb.2018.02493 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Wright CL, Schatteman A, Crombie AT  et al.  Inhibition of ammonia monooxygenase from ammonia-oxidizing archaea by linear and aromatic alkynes. Appl Environ Microbiol  2020;86:e02388–19. 10.1128/AEM.02388-19 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Lau E, Lukich D, Le T. The evolution of the amoA, pmoA and bmoA genes measured by ka/ks ratios. Proceedings of the West Virginia Academy of Science  2016;88. 10.55632/pwvas.v88i1.10 [DOI] [Google Scholar]
  • 63. Ward BB. Kinetic studies on ammonia and methane oxidation by Nitrosococcus oceanus. Arch Microbiol  1987;147:126–33. 10.1007/BF00415273 [DOI] [Google Scholar]
  • 64. Oudova-Rivera B, Wright CL, Crombie AT  et al.  The effect of methane and methanol on the terrestrial ammonia oxidising archaeon ‘Candidatus Nitrosocosmicus franklandus c13’. Environ Microbiol  2023;25:948–61. 10.1111/1462-2920.16316 [DOI] [PubMed] [Google Scholar]
  • 65. Sayavedra-Soto LA, Hamamura N, Liu C-W  et al.  The membrane-associated monooxygenase in the butane-oxidizing gram-positive bacterium nocardioides sp. strain cf8 is a novel member of the amo/pmo family. Environ Microbiol Rep  2011;3:390–6. 10.1111/j.1758-2229.2010.00239.x [DOI] [PubMed] [Google Scholar]
  • 66. Munch J, Pullmann P, Zhang W  et al.  Enzymatic hydroxylations of sp3-carbons. ACS Catal  2021;11:9168–203. 10.1021/acscatal.1c00759 [DOI] [Google Scholar]
  • 67. Rochman FF, Kwon M, Khadka R  et al.  Novel copper-containing membrane monooxygenases (CuMMOs) encoded by alkane-utilizing Betaproteobacteria. ISME Journal  2020;14:714–26. 10.1038/s41396-019-0561-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Shafiee RT, Diver PJ, Snow JT  et al.  Marine ammonia-oxidising archaea and bacteria occupy distinct iron and copper niches. ISME Communications  2021;1:1. 10.1038/s43705-021-00001-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Martens-Habbena W, Berube PM, Urakawa H  et al.  Ammonia oxidation kinetics determine niche separation of nitrifying archaea and bacteria. Nature  2009;461:976–9. 10.1038/nature08465 [DOI] [PubMed] [Google Scholar]
  • 70. Sánchez P, Coutinho FH, Sebastián M  et al.  Marine picoplankton metagenomes and MAGs from eleven vertical profiles obtained by the Malaspina Expedition. Sci Data  2024;11:154. 10.1038/s41597-024-02974-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Pistocchi R, Kashiwagi K, Miyamoto S  et al.  Characteristics of the operon for a putrescine transport system that maps at 19 minutes on the Escherichia coli chromosome. J Biol Chem  1993;268:146–52. 10.1016/S0021-9258(18)54126-0 [DOI] [PubMed] [Google Scholar]
  • 72. Moreno-Ulloa A, Diaz VS, Tejeda-Mora JA  et al.  Chemical profiling provides insights into the metabolic machinery of hydrocarbon-degrading deep-sea microbes. Msystems  2020;5:10–1128. 10.1128/mSystems.00824-20 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Bergauer K, Fernandez-Guerra A, Garcia JAL  et al.  Organic matter processing by microbial communities throughout the Atlantic water column as revealed by metaproteomics. Proc Natl Acad Sci  2018;115:E400–8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Fay JC, Chung-I W. Sequence divergence, functional constraint, and selection in protein evolution. Annu Rev Genomics Hum Genet  2003;4:213–35. 10.1146/annurev.genom.4.020303.162528 [DOI] [PubMed] [Google Scholar]
  • 75. Pascual-García A, Abia D, Méndez R  et al.  Quantifying the evolutionary divergence of protein structures: the role of function change and function conservation. Proteins: Structure, Function, and Bioinformatics  2010;78:181–96. 10.1002/prot.22616 [DOI] [PubMed] [Google Scholar]
  • 76. Soria PS, McGary KL, Rokas A. Functional divergence for every paralog. Mol Biol Evol  2014;31:984–92. 10.1093/molbev/msu050 [DOI] [PubMed] [Google Scholar]
  • 77. Hult K, Berglund P. Enzyme promiscuity: mechanism and applications. Trends Biotechnol  2007;25:231–8. 10.1016/j.tibtech.2007.03.002 [DOI] [PubMed] [Google Scholar]
  • 78. Ruelens P, Wynands T, Arjan J  et al.  Interaction between mutation type and gene pleiotropy drives parallel evolution in the laboratory. Philos Trans R Soc B  2023;378:20220051. 10.1098/rstb.2022.0051 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Glasner ME, Truong DP, Morse BC. How enzyme promiscuity and horizontal gene transfer contribute to metabolic innovation. FEBS J  2020;287:1323–42. 10.1111/febs.15185 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Cheng D, Cheng C-HC, Ye H  et al.  Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci  2010;107:21593–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Koonin EV. Orthologs, paralogs, and evolutionary genomics. Annu Rev Genet  2005;39:309–38. 10.1146/annurev.genet.39.073003.114725 [DOI] [PubMed] [Google Scholar]
  • 82. Altenhoff AM, Boeckmann B, Capella-Gutierrez S  et al.  Standardized benchmarking in the quest for orthologs. Nat Methods  2016;13:425–30. 10.1038/nmeth.3830 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Grau-Bové X, Sebé-Pedrós A. Orthology clusters from gene trees with possvm. Mol Biol Evol  2021;38:5204–8. 10.1093/molbev/msab234 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Huerta-Cepas J, Dopazo H, Dopazo J  et al.  The human phylome. Genome Biol  2007;8:1–16. 10.1186/gb-2007-8-6-r109 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Kimura M. The neutral theory of molecular evolution: a review of recent evidence. Japanese Journal of Genetics  1991;66:367–86. 10.1266/jjg.66.367 [DOI] [PubMed] [Google Scholar]
  • 86. Alam MS, Garg SK, Agrawal P. Studies on structural and functional divergence among seven whiB proteins of mycobacterium tuberculosis h37rv. FEBS J  2009;276:76–93. 10.1111/j.1742-4658.2008.06755.x [DOI] [PubMed] [Google Scholar]
  • 87. Offre P, Kerou M, Spang A  et al.  Variability of the transporter gene complement in ammonia-oxidizing archaea. Trends Microbiol  2014;22:665–75. 10.1016/j.tim.2014.07.007 [DOI] [PubMed] [Google Scholar]
  • 88. Zhao Q. Molecular and thermodynamic mechanisms for protein adaptation. Eur Biophys J  2022;51:519–34. 10.1007/s00249-022-01618-9 [DOI] [PubMed] [Google Scholar]
  • 89. Ullmann A, Jacob F, Monod J. On the subunit structure of wild-type versus complemented Inline graphic-galactosidase of Escherichia coli. J Mol Biol  1968;32:1–13. 10.1016/0022-2836(68)90140-X [DOI] [PubMed] [Google Scholar]
  • 90. Gidalevitz T, Prahlad V, Morimoto RI. The stress of protein misfolding: from single cells to multicellular organisms. Cold Spring Harbor Perspect Biol  2011;3:a009704. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Manara A, DalCorso G, Baliardini C  et al.  Pseudomonas putida response to cadmium: changes in membrane and cytosolic proteomes. J Proteome Res  2012;11:4169–79. 10.1021/pr300281f [DOI] [PubMed] [Google Scholar]
  • 92. Thompson J, Reese-Wagoner A, Banaszak L. Liver fatty acid binding protein: species variation and the accommodation of different ligands. Biochimica et Biophysica Acta (BBA)-molecular and cell biology of. Lipids  1999;1441:117–30. [DOI] [PubMed] [Google Scholar]
  • 93. Gierse JK, McDonald JJ, Hauser SD  et al.  A single amino acid difference between cyclooxygenase-1 (COX-1) and- 2 (COX-2) reverses the selectivity of COX-2 specific inhibitors. J Biol Chem  1996;271:15810–4. 10.1074/jbc.271.26.15810 [DOI] [PubMed] [Google Scholar]
  • 94. Price MN, Arkin AP. Interactive analysis of functional residues in protein families. Msystems  2022;e00705–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Sadowski MI, Jones DT. The sequence–structure relationship and protein function prediction. Curr Opin Struct Biol  2009;19:357–62. 10.1016/j.sbi.2009.03.008 [DOI] [PubMed] [Google Scholar]
  • 96. Traut TW. Dissociation of enzyme oligomers: a mechanism for allosteric regulation. Crit Rev Biochem Mol Biol  1994;29:125–63. 10.3109/10409239409086799 [DOI] [PubMed] [Google Scholar]
  • 97. Kreitman M. The neutral theory is dead. Long live the neutral theory. BioEssays  1996;18:678–83. 10.1002/bies.950180812 [DOI] [PubMed] [Google Scholar]
  • 98. Reed CJ, Lewis H, Trejo E  et al.  Protein adaptations in archaeal extremophiles. Archaea  2013;2013:1–14. 10.1155/2013/373275 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Tamuri AU, Dos Reis M. A mutation–selection model of protein evolution under persistent positive selection. Mol Biol Evol  2022;39. 10.1093/molbev/msab309 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100. Panja AS, Maiti S, Bandyopadhyay B. Protein stability governed by its structural plasticity is inferred by physicochemical factors and salt bridges. Sci Rep  2020;10:1–9. 10.1038/s41598-020-58825-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Akashi H, Osada N, Ohta T. Weak selection and protein evolution. Genetics  2012;192:15–31. 10.1534/genetics.112.140178 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Fisher M, Gokhman I, Pick U  et al.  A structurally novel transferrin-like protein accumulates in the plasma membrane of the unicellular green alga dunaliella Salina grown in high salinities. J Biol Chem  1997;272:1565–70. 10.1074/jbc.272.3.1565 [DOI] [PubMed] [Google Scholar]
  • 103. Kumar S, Singh SK, Gromiha MM. Temperature-dependent molecular adaptations in microbial proteins. Encyclopedia of Industrial Biotechnology: Bioprocess, Bioseparation, and Cell Technology  2009;1–22 [Google Scholar]
  • 104. Tamames J, Abellán JJ, Pignatelli M  et al.  Environmental distribution of prokaryotic taxa. BMC Microbiol  2010;10:85–14. 10.1186/1471-2180-10-85 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Huston AL, Haeggström JZ, Feller G. Cold adaptation of enzymes: structural, kinetic and microcalorimetric characterizations of an aminopeptidase from the Arctic psychrophile Colwellia psychrerythraea and of human leukotriene a4 hydrolase. Biochimica et Biophysica Acta (BBA)-Proteins and Proteomics  2008;1784:1865–72. 10.1016/j.bbapap.2008.06.002 [DOI] [PubMed] [Google Scholar]
  • 106. Kashiwagi K, Pistocchi R, Shibuya S  et al.  Spermidine-preferential uptake system in Escherichia coli. Identification of amino acids involved in polyamine binding in PotD protein. J Biol Chem  1996;271:12205–8. 10.1074/jbc.271.21.12205 [DOI] [PubMed] [Google Scholar]
  • 107. Louca S, Parfrey LW, Doebeli M. Decoupling function and taxonomy in the global ocean microbiome. Science  2016;353:1272–7. 10.1126/science.aaf4507 [DOI] [PubMed] [Google Scholar]
  • 108. Tamames J, Sánchez PD, Nikel PI  et al.  Quantifying the relative importance of phylogeny and environmental preferences as drivers of gene content in prokaryotic microorganisms. Front Microbiol  2016;7:433. 10.3389/fmicb.2016.00433 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Alneberg J, Bennke C, Beier S  et al.  Ecosystem-wide metagenomic binning enables prediction of ecological niches from genomes. Communications Biology  2020;3:1–10. 10.1038/s42003-020-0856-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110. Shan X, Goyal A  Gregor R  et al.  Annotation-free discovery of functional groups in microbial communities. Nat Ecol Evol  2023;7:716–24. 10.1038/s41559-023-02021-z [DOI] [PubMed] [Google Scholar]
  • 111. Puente-Sanchez F, Pascual-García A, Bastolla U  et al.  Cross-biome microbial networks reveal functional redundancy and suggest genome reduction through functional complementarity. bioRxiv  2022. 10.1101/2022.09.11.507163 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112. Park K. Deep-sea pH. Science  1966;154:1540–2. 10.1126/science.154.3756.1540 [DOI] [PubMed] [Google Scholar]
  • 113. Dickson AG. The measurement of sea water ph. Mar Chem  1993;44:131–42. 10.1016/0304-4203(93)90198-W [DOI] [Google Scholar]
  • 114. Byrne RH, McElligott S, Feely RA  et al.  The role of pHT measurements in marine CO2-system characterizations. Deep-Sea Res I Oceanogr Res Pap  1999;46:1985–97. 10.1016/S0967-0637(99)00031-X [DOI] [Google Scholar]
  • 115. Ternon JF, Oudot C, Gourlaouen V  et al.  The determination of pHT in the equatorial Atlantic Ocean and its role in the sound absorption modeling in seawater. J Mar Syst  2001;30:67–87. 10.1016/S0924-7963(01)00038-0 [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary_ISMEcommun_ycae042

Data Availability Statement

All data generated or analysed during this study are included in this published article and its supplementary information files.


Articles from ISME Communications are provided here courtesy of Oxford University Press

RESOURCES