PLOS Computational Biology. 2022 Sep 9;18(9):e1010491. doi: 10.1371/journal.pcbi.1010491

Species abundance correlations carry limited information about microbial network interactions

Susanne Pinto 1,*, Elisa Benincà 2, Egbert H van Nes 3, Marten Scheffer 3, Johannes A Bogaards 4
Editor: Kiran Raosaheb Patil 5
PMCID: PMC9518925  PMID: 36084152

Abstract

Unraveling the network of interactions in ecological communities is a daunting task. Common methods to infer interspecific interactions from cross-sectional data are based on co-occurrence measures. For instance, interactions in the human microbiome are often inferred from correlations between the abundances of bacterial phylogenetic groups across subjects. We tested whether such correlation-based methods are indeed reliable for inferring interaction networks. For this purpose, we simulated bacterial communities by means of the generalized Lotka-Volterra model, with variation in model parameters representing variability among hosts. Our results show that correlations can be indicative for presence of bacterial interactions, but only when measurement noise is low relative to the variation in interaction strengths between hosts. Indication of interaction was affected by type of interaction network, process noise and sampling under non-equilibrium conditions. The sign of a correlation mostly coincided with the nature of the strongest pairwise interaction, but this is not necessarily the case. For instance, under rare conditions of identical interaction strength, we found that competitive and exploitative interactions can result in positive as well as negative correlations. Thus, cross-sectional abundance data carry limited information on specific interaction types. Correlations in abundance may hint at interactions but require independent validation.

Author summary

The bacteria in and on our body (the human microbiome) largely determine how our body functions, and whether we stay healthy or get sick. These bacteria do not live on their own, but interact among each other and with their human host. Finding out which bacteria interact with each other is cumbersome, but patterns of joint occurrence between species might provide a clue to their ecological dependencies. We investigated whether correlations in species abundance can be used for the purpose of ecological network reconstruction. We simulated different bacterial communities with known interactions according to a theoretical population model. After having collected virtual samples from our simulated data, we performed a correlation analysis and then compared the correlation network with our known interaction network. We found that correlations can be informative for underlying interactions, but ecological conclusions should be drawn carefully. An obvious limitation of correlation analysis is that direction of interaction cannot be recovered from co-occurrence data, making correlations insensitive for detection of asymmetric interactions. In addition, we found that competitive and exploitative interactions can induce positive as well as negative correlations. We recommend careful interpretation and validation when inferring networks from cross-sectional abundance data.

Introduction

The human body harbors an exceptional bacterial diversity [1]. The composition of these bacterial communities is generally shaped by characteristics of the host and by the ecological dependencies among bacterial species themselves [2–4]. These dependencies often occur through competitive or synergistic interactions, which may lead to a (mutual) decrease or increase in the abundance of interacting species [5]. For instance, it is known that bacteria can interact with each other through excreted metabolites, which can function as an antimicrobial or as a food source [2,6]. Among other mechanisms, negative interactions take place when toxic compounds produced by one species harm other bacteria, whereas positive interactions occur when bacteria feed on the nutrients that are produced by others. Many other forms of interaction exist, depending on the effects experienced by the species involved. Knowledge of interspecific interactions in the human microbiome is paramount to understand ecological processes and compositional changes in relation to health and disease [7,8].

Most human microbiome studies are limited to only a few samples in time, presenting mere ‘snapshots’ of the microbial ecosystem, even if these are derived from hundreds of human hosts. A common way to infer microbial networks from such cross-sectional data is by quantifying co-occurrence, e.g., through (partial) correlations, between bacterial phylogenetic groups. Several different conclusions have been derived from such endeavors, for example on species associations that reflect shared or overlapping niche preferences [9], microbial community structure [10,11], the resilience of microbial communities to perturbations [12] and keystone species in microbial networks [13]. Currently there are several correlation-based network tools available that can deal with the difficulties of microbiome data, such as the compositionality [14–16]. The potential of correlation-based approaches for uncovering microbial networks has been highlighted in previous research [17].

Whether correlation-based networks represent meaningful ecological structure in microbial communities is, however, debated. Carr et al. (2019) showed that spurious correlations may occur due to the use of sequencing methods, data transformations and the large number of unmeasured variables [18]. Berry & Widder (2014) and Hirano & Takemoto (2019) assessed the performance of different co-occurrence methods for inferring interaction structure and found that their performance strongly depends on the underlying network properties, like network size and density and the number of samples used to construct the network [13,19]. Apart from the challenges of metagenomic-based abundance data and disagreement between various network tools, here we question whether correlations themselves are at all useful to distinguish between different ecological interaction types. Resource competition and metabolic cooperation have been successfully inferred within environmental microbiomes, by linking ecological distribution data to multi-species metabolic models and subsequent verification of putative interactions by means of experimental co-growth analysis [20]. However, host-associated microbiomes often include non-culturable organisms, without information on nutrient requirements or metabolic function. Likewise, performance of correlation analysis in relation to alternative interaction types in human microbiota is not well understood and deserves further investigation.

Correspondence of correlations with ecological interactions needs to be studied against a known ground truth, which can be achieved by means of simulation. Mathematical models have been used as ground truth in assessment of correlation network techniques before (e.g. [21]), but correlation networks have not been systematically investigated against distinct interaction types in dynamic models. This requires elucidation especially as the ‘true’ ecological networks governing microbiome dynamics are still unknown. For this purpose, we assessed the performance of correlation-based network reconstruction by simulating abundance data based on the generalized Lotka-Volterra (gLV) model. The gLV model describes the collective dynamics of multiple species by means of an interaction matrix that can modulate different types of interactions [22]. The model is commonly used in microbiome studies for different aims: to simulate microbial communities under various interaction structures [22], to infer interaction structure from time-series data [12], to forecast population dynamics after a perturbation [23], to infer the network topology from steady state samples [24] and to identify the efficiency of intervention protocols in altering the state of a system via the addition or subtraction of microbial species [25]. In ecology, gLV-type models have been questioned for their reliance on pairwise additive interactions, as well as for the strictly linear effects imposed on interspecific interactions. Nonetheless, from the perspective of network inference, it makes sense to first investigate gLV-type models, as their first-order description of ecological dependencies, specified through a pairwise interaction matrix, resembles the objective of correlation analysis and most network models [2].

In addressing how gLV-type interactions can be inferred from cross-sectional data, we mainly focus on the correspondence between the obtained correlation-based networks and the underlying network of ecological interactions. We specifically investigate how inference of microbial interaction types is enabled by interindividual variation in population-dynamic parameters, e.g., species-specific carrying capacities, intrinsic growth rates and strength of interspecific interactions, and how network reconstruction is affected by gLV model assumptions. We highlight several situations where correlations cannot distinguish microbial interaction types, and therefore recommend careful interpretation and validation when inferring networks from cross-sectional abundance data.

Methods

Two species Lotka-Volterra model with self-limitation

First, we investigated how interactions between two species of microbial populations are displayed in terms of correlations in abundances in the Lotka-Volterra model. For the sake of convenience, we use the term ‘species’, although in studies with real microbiome data it is often not possible to characterize the taxonomic abundances at species level and therefore genera or higher taxonomic levels are often used instead.

The two-species Lotka-Volterra model is given by the following set of ordinary differential equations:

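$$
\begin{aligned}
\frac{dN_1}{dt} &= r_1 N_1 \left(1 - K_1^{-1} N_1 + \alpha_{12} N_2\right) \\
\frac{dN_2}{dt} &= r_2 N_2 \left(1 - K_2^{-1} N_2 + \alpha_{21} N_1\right)
\end{aligned}
\tag{1}
$$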

Here, Ni is the abundance of either species 1 or species 2 (with i = 1 or i = 2). The term ri is the intrinsic growth rate of each species, here normalized to 1 and 2 per time unit for species 1 and 2 respectively. The effect of each species’ abundance on its own growth is defined in terms of the species-specific carrying capacities Ki, with αii = –1/Ki denoting intraspecific competition. We arbitrarily chose the carrying capacity for the first species to be higher than the carrying capacity for the second species (K1 = 1.5; K2 = 1.1), meaning intraspecific competition is less strong for species 1 compared to species 2. Furthermore, αij (i = 1, 2; j = 1, 2; i ≠ j) indicates the interspecific interactions (the effect of one species’ abundance on the growth of the other species). A positive αij (e.g., as in the case of mutualism) denotes a positive effect of species j on the growth of species i, a negative αij (e.g., as in the case of competition) means a negative effect of species j on the growth of species i (S1 Fig). We assessed the effect of variation in the interspecific interaction parameters on correlation in equilibrium abundance between both species. For this purpose, the interspecific interaction strengths (α12 and α21) were drawn randomly from two normal distributions with similar or different mean and similar or different standard deviations (σα). Moreover, we also investigated the situation where |α12| = |α21|. Note that it was not possible to achieve stable co-existence for every combination of α12 and α21. More information on the conditions for co-existence can be found in the supplementary information (S1 Text).
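The link between interaction parameters and equilibrium abundances in Eq 1 can be illustrated with a minimal Python sketch (the study itself was carried out in R). Setting both growth terms to zero gives a linear system with a closed-form interior solution. The function names, the means of ±0.2, the standard deviation of 0.05 and the seed are illustrative assumptions, not values from the study, and the sketch only checks that the equilibrium is positive, not that it is stable (see S1 Text for the co-existence conditions).

```python
import random
import statistics


def two_species_equilibrium(a12, a21, k1=1.5, k2=1.1):
    """Interior equilibrium of Eq 1, or None if it is not strictly positive.

    Setting both brackets in Eq 1 to zero gives
        N1 = K1 * (1 + a12 * N2),  N2 = K2 * (1 + a21 * N1),
    which is solved in closed form below. Stability is NOT checked.
    """
    denom = 1.0 - a12 * a21 * k1 * k2
    if denom == 0.0:
        return None
    n1 = k1 * (1.0 + a12 * k2) / denom
    n2 = k2 * (1.0 + a21 * k1) / denom
    if n1 <= 0.0 or n2 <= 0.0:
        return None
    return n1, n2


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def equilibrium_correlation(mean12, mean21, sd=0.05, n_hosts=300, seed=1):
    """Correlation of equilibrium abundances across hypothetical hosts.

    a12 and a21 are drawn independently from normal distributions
    (illustrative means and sd); draws without a positive equilibrium
    are discarded and redrawn.
    """
    rng = random.Random(seed)
    n1s, n2s = [], []
    while len(n1s) < n_hosts:
        eq = two_species_equilibrium(rng.gauss(mean12, sd), rng.gauss(mean21, sd))
        if eq is not None:
            n1s.append(eq[0])
            n2s.append(eq[1])
    return pearson(n1s, n2s)
```

Calling `equilibrium_correlation(0.2, 0.2)` and `equilibrium_correlation(-0.2, -0.2)` compares a mutualistic with a competitive parameterization under these assumed values.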

Generalized host-specific Lotka-Volterra model

Microbial abundance is not only shaped by intra- and interspecific interactions, but also by host characteristics, for example lifestyle, diet and age [26]. Therefore, we investigated the performance of correlation-based network inference of microbial networks for a host-specific version of the gLV model. The host specific gLV model is given by:

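$$
\frac{dN_{i,m}}{dt} = r_{i,m} N_{i,m} \left(1 - K_{i,m}^{-1} N_{i,m} + \sum_{\substack{j=1 \\ j \neq i}}^{s} \alpha_{ij,m} N_{j,m}\right) \tag{2}
$$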

Here, Ni,m is the abundance of each species i in host m, with i = 1, …, s (s being the total number of bacterial species) and m = 1, …, 300 (the total number of hosts). The terms ri,m and Ki,m are the intrinsic growth rates and the carrying capacities of each species i in host m. The carrying capacities are kept separated from the interaction matrix A which only contains interspecific interactions (namely, the pairwise terms αij), facilitating a one-to-one comparison with the correlation matrix.
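As a rough illustration of Eq 2 for a single host, the sketch below evaluates the right-hand side and integrates it with a fixed-step fourth-order Runge-Kutta scheme. This is a stand-in chosen to keep the example dependency-free; it is not the solver used in the study. The function names, step size and run time are assumptions.

```python
def glv_rhs(n, r, k, a):
    """Right-hand side of Eq 2 for one host.

    n, r, k: lists of length s (abundances, growth rates, carrying capacities).
    a: s x s nested list of interspecific interactions; the diagonal is ignored
       because self-limitation enters through the carrying capacities.
    """
    s = len(n)
    rates = []
    for i in range(s):
        interaction = sum(a[i][j] * n[j] for j in range(s) if j != i)
        rates.append(r[i] * n[i] * (1.0 - n[i] / k[i] + interaction))
    return rates


def integrate_glv(n0, r, k, a, t_end=500.0, dt=0.1):
    """Integrate Eq 2 from n0 up to t_end with fixed-step RK4 (assumed scheme)."""

    def shifted(state, slope, h):
        return [x + h * d for x, d in zip(state, slope)]

    n = list(n0)
    for _ in range(int(round(t_end / dt))):
        s1 = glv_rhs(n, r, k, a)
        s2 = glv_rhs(shifted(n, s1, 0.5 * dt), r, k, a)
        s3 = glv_rhs(shifted(n, s2, 0.5 * dt), r, k, a)
        s4 = glv_rhs(shifted(n, s3, dt), r, k, a)
        n = [x + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
             for x, d1, d2, d3, d4 in zip(n, s1, s2, s3, s4)]
    return n
```

Without interspecific interactions, each species should settle at its own carrying capacity, which gives a quick sanity check of the implementation.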

Parameterization of the base case simulations

We started with a base case and added variations to it step by step. Note that the base-case parametrization does not reflect any particular real-world system. Rather, parameters were chosen so as to facilitate computation and promote co-existence between species. Variations to the base-case parameters are shown later on, but also here, findings should be appreciated from a qualitative rather than quantitative viewpoint. In the base case, the number of bacterial species equals ten. The species-specific growth rates ri and the species-specific carrying capacities Ki were randomly drawn from the uniform distributions U(0.05, 0.1) and U(0, 1), respectively. The density of the interaction matrix A in the base case was chosen such that both sparsity of the interaction network and co-existence of the species were promoted in all simulations; in the base case, density was ¼, meaning that three out of four possible interactions were set to zero. Moreover, to ensure co-existence between species in the model we chose stronger intraspecific interactions than pairwise interspecific interactions. The species-specific parameters αij were drawn from a Gaussian mixture distribution, as follows. Half of the interactions were drawn from a negative normal distribution: αij ~ N(–0.25, 0.1); and the other half of the interactions were drawn from a positive normal distribution: αij ~ N(0.25, 0.1). All interactions were restricted to lie between –0.5 and 0.5, i.e., the normal distributions were truncated at –0.5 and 0.5. The parameters ri, Ki and the interaction matrix A were randomly drawn 1000 times from the aforementioned distributions to obtain 1000 different parameter combinations. Hereafter, host-specific parameters were drawn from log-normal distributions around species-specific parameters, as follows:

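$$
\begin{cases}
\ln(|\alpha_{ij,m}|) \sim N(\ln(|\alpha_{ij}|), \sigma_\alpha) \\
\ln(r_{i,m}) \sim N(\ln(r_i), \sigma_r) \\
\ln(K_{i,m}) \sim N(\ln(K_i), \sigma_K)
\end{cases}
\tag{3}
$$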

Here, σα denotes the interindividual variability in interspecific interactions among the 300 hosts (with σα = 0.25 in the base case), and |αij,m| denotes the absolute strength of interaction from species j on the growth of species i for each host m. Note that, for the sake of simplicity, the use of log-normal distributions was adopted to induce fold-changes around population means, where both the presence and the sign of interspecific interactions are kept constant across hosts. However, this may be untrue in real microbiota as many microbes can change metabolic pathways and therefore may switch between interaction types and interaction partners. In the base case model, the carrying capacities and growth rates were kept constant across hosts, meaning σr and σK were set equal to 0.
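The host-level draws of Eq 3 can be sketched as follows. The sketch treats σ as the standard deviation on the log scale, keeps zero entries at zero and preserves the sign of each interaction, as described above. The function names and the seed handling are illustrative assumptions.

```python
import math
import random


def host_specific(value, sigma, rng):
    """Log-normal fold-change around a species-level parameter (Eq 3).

    The magnitude is drawn as exp(N(ln|value|, sigma)); the sign of `value`
    is kept, and absent interactions (value == 0) stay absent.
    """
    if value == 0.0 or sigma == 0.0:
        return value
    magnitude = math.exp(rng.gauss(math.log(abs(value)), sigma))
    return math.copysign(magnitude, value)


def host_matrices(a, sigma_alpha=0.25, n_hosts=300, seed=0):
    """One host-specific interaction matrix per host, drawn around matrix `a`."""
    rng = random.Random(seed)
    return [[[host_specific(x, sigma_alpha, rng) for x in row] for row in a]
            for _ in range(n_hosts)]
```

The same helper can be applied to the growth rates and carrying capacities by passing σr or σK instead of σα.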

The simulation process yielded 300,000 timeseries (300 host-specific timeseries for each of the 1000 ten-species networks). The running time of the model was chosen such that all species reached their equilibrium abundance. If at least one species did not survive (i.e., when its abundance dropped below 0.001), we rejected the simulation in favor of another randomly drawn parameter set. After sampling the abundances at equilibrium, we added independent and identically distributed noise υ to mimic uncertainty in measurements (with υ ~ U(–0.01, 0.01) in the base case). This measurement noise can be thought of as representing, for example, sampling errors, environmental contamination, batch effects during sequencing, or annotation errors in reference genomes [27]. Simulations were performed in R (R version 3.6.0; https://www.r-project.org/). The gLV model was solved with the lsoda function from the deSolve package (version 1.24) which uses a FORTRAN ODE solver written by Petzold & Hindmarsh (1995) [28]. R code is available via GitHub (https://github.com/susannepinto/gLV_microbiome.git). A general overview of the base case simulation design is given in Fig 1.
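The survival rule and the measurement-noise step described above can be sketched in a few lines of Python. The function names and the `seed` argument are assumptions made for the illustration; the threshold of 0.001 and the base-case noise range come from the text. How noisy values that fall below zero were handled is not stated here, so the sketch leaves them untouched.

```python
import random

EXTINCTION_THRESHOLD = 0.001  # abundance below which a species counts as extinct


def all_survive(abundances, threshold=EXTINCTION_THRESHOLD):
    """True if no species has dropped below the extinction threshold."""
    return all(x >= threshold for x in abundances)


def add_measurement_noise(abundances, half_width=0.01, seed=None):
    """Add i.i.d. uniform noise U(-half_width, half_width) to sampled abundances.

    No clipping is applied: very low abundances may become negative.
    """
    rng = random.Random(seed)
    return [x + rng.uniform(-half_width, half_width) for x in abundances]
```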

Fig 1. Representation of the workflow.


In an interaction network singular green and red arrows represent a commensalistic interaction and an amensalistic interaction respectively, whereas double green arrows represent mutualism and double red arrows competition. A green and red arrow signifies an exploitative interaction. See S1 Fig for more details. (A) A random interaction matrix i. This interaction matrix is implemented in the gLV model (B) together with the intrinsic growth rates and carrying capacities of the species. (C) All timeseries are (slightly) different due to the variation in the interaction strengths. (D) The partial correlations are calculated from the abundances per species sampled from the 300 different hosts at equilibrium. Only the significant correlations and the lower part of the matrix are used for the comparison with the original interaction matrix i. Variations to the workflow were studied by adding for example a perturbation or process noise.

Variations to the base case model

We studied multiple variations to the base case model. As in the base case, we ran 1000 simulations per variation. As a first variation, we added host-specific variability to the species-specific parameters ri and Ki using Eq 3, with σr = 0.25 and σK = 0.25.

Second, we varied the amount of measurement noise, from υ ~ U(–0.01, 0.01) (medium noise in the base case) to υ ~ U(–0.001, 0.001) (low noise) and to υ ~ U(–0.1, 0.1) (high noise). We also simulated timeseries with a different type of noise, namely varying magnitudes of process noise W (S2 Fig). In contrast to measurement noise, which was added only to the sampled abundances, process noise was added to the gLV model such that within-host population dynamics were perturbed at discrete time intervals Δt (Δt = 1 time unit). The time-varying process noise was drawn from a log-normal distribution to prevent the abundances from dropping below zero, i.e. ΔWi = ln(Ni,m(Δt))–ln(Ni,m(t)) ~ N(ln(Ni,t), σW) (with σW ~ N(0, 1) for high process noise and σW ~ N(0, 0.1) for low process noise).

Further, we simulated data with interaction strengths drawn from a uniform (αij ~ U(–0.5, 0.5)) or unimodal (αij ~ N(0, 0.15)) distribution. As in the base case, the interaction strengths were restricted to lie between –0.5 and 0.5 (S3 Fig).

We also analyzed three different structures of microbial networks. First, we increased the number of species s to 30. To promote co-existence, we also reduced the density of the interaction matrix to 1/6. Second, we simulated a network based on a producer-consumer relation between the species (S4 Fig). Instead of random interaction networks (S4A Fig), the producer-consumer networks are based on a cross-feeding structure between producers and consumers (with equal numbers of producers and consumers) (S4B Fig). Producers excrete metabolites which are consumed by the consumers. Because consumers remove the ‘waste’ from the producers, the presence of a consumer can also be beneficial for the producers. Therefore, between producers and consumers positive interactions are more likely to occur than negative interactions. For this purpose, we drew the consumer-producer interactions from the positive side of the Gaussian mixture distribution (αij ~ N(0.25, 0.1)). In contrast, among producers and among consumers themselves, the interactions are predominantly negative as these species are more likely to compete for similar resources. We therefore drew the interactions among producers and among consumers from the negative side of the Gaussian mixture distribution (αij ~ N(–0.25, 0.1)). Third, we simulated a microbial network with interaction hubs, i.e. a network containing species with unusually high numbers of ecological interactions compared to other species in the network (S4 Fig) [29]. Hub-species networks were created according to the Barabási-Albert model [30] and implemented with the barabasi.game function from the igraph package (version 1.2.11). In the network-generating algorithm, interactions are distributed according to a mechanism of preferential attachment. Thus, species with interactions obtain a higher chance of getting more interactions, resulting in a few ‘hub-species’ with many interactions. We constructed two scale-free directed graphs (with power = 2), denoting “incoming” and “outgoing” interactions, and combined these to obtain a bidirected graph. Density was kept similar to the base case model (1/4).
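The preferential-attachment mechanism can be illustrated with a simplified Python sketch. It is not a reimplementation of the igraph routine used in the study: each new node adds a single directed edge, the attachment weight is assumed to be in-degree raised to `power` plus one, and the step that combines two graphs into a bidirected graph with density 1/4 is not reproduced. The function name and defaults are hypothetical.

```python
import random


def preferential_attachment_edges(n_nodes, power=2.0, seed=0):
    """Directed edges (new_node, target) grown by preferential attachment.

    Each new node links to one existing node, chosen with probability
    proportional to in_degree ** power + 1 (assumed weighting), so nodes
    that already receive many links tend to receive more.
    """
    rng = random.Random(seed)
    in_degree = [0] * n_nodes
    edges = []
    for new in range(1, n_nodes):
        weights = [in_degree[old] ** power + 1.0 for old in range(new)]
        target = rng.choices(range(new), weights=weights, k=1)[0]
        edges.append((new, target))
        in_degree[target] += 1
    return edges
```

With a superlinear `power`, a small number of nodes typically collect most incoming edges, which is the hub structure the scenario is meant to represent.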

Next, we also investigated how network inference is affected by sample size by considering a scenario with 3000 instead of 300 hosts. We did this for the base-case model with random interaction networks, as well as for the producer-consumer and hub-species networks described above.

Last, we investigated the effect of a perturbation on the performance of network inference. The populations were perturbed after 175 time units, with a perturbation that lasted for 50 time units. The perturbation was modelled by taking a new set of random carrying capacities per species per sample. Due to the simulated perturbation, the equilibrium distribution shifted. After the perturbation, the species grew back to their original equilibrium. Sampling occurred before, during or after the perturbation.

Assessment of correlation-based network inference

With the simulated data at hand, we created a dataset with the abundances of the model species sampled at equilibrium for each host m. After adding measurement noise to the data, we inferred the correlations between species by calculating the partial Pearson correlation coefficients ρ between all abundances Ni across the m different hosts (Fig 1). We did not use plain correlations, because partial correlations have the advantage of controlling for confounding interactions (e.g. interactions between bacterial species affecting the abundance of a third species) [31]. Agreement between the partial correlation matrix and the interaction matrix A from the gLV model was assessed qualitatively, i.e., we only considered whether significant entries in the partial correlation matrix agreed with the interaction matrix in terms of non-zero entries with the correct sign. We used the Benjamini-Hochberg procedure to control for the expected proportion of ‘false discoveries’ after calculating partial correlations between each pair of species [32]. The results (true positives, true negatives, false positives and false negatives) were stored in a confusion matrix (Table 1). Because a correlation matrix is symmetric and an interaction matrix A is not, we only used half of the partial correlation matrix (Fig 1D) to construct the confusion matrix. For a correctly classified interaction, either one or both interactions in the upper and lower part of the A matrix must have the same sign as in the lower part of the partial correlation matrix. This can produce a bias, because asymmetric interactions can result in a true positive result for correspondence of the correlation coefficient (ρ) with either interaction. For example, for exploitative interactions, both negative and positive correlations are classified as true positive results. 
Therefore, we tested the effect of this bias on the success of network inference by specifying the intended sign in correlation analysis, as the sign of the strongest interaction in each pair of species. Hence, for an exploitative interaction, only a positive or a negative correlation is correct, depending on the weights of the asymmetric interactions. Secondly, we also tested the effect of this bias on the success of network inference by setting the rule that the sign of both interactions must be matched by the inferred correlation coefficient. Hence, only mutualism and competition can be inferred correctly, as amensalism, commensalism and exploitative interactions are asymmetric.
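Partial Pearson correlations can be obtained from the inverse of the covariance matrix: for species i and j, the partial correlation given all other species is the negated off-diagonal entry of the inverse, divided by the square root of the product of the two corresponding diagonal entries. The sketch below implements this in plain Python with a Gauss-Jordan inversion. It is a minimal illustration, not the routine used in the study, and it omits the significance test and the Benjamini-Hochberg correction. All function names are assumptions.

```python
def covariance_matrix(samples):
    """Sample covariance; `samples` is a list of hosts, each a list of abundances."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda row_idx: abs(aug[row_idx][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row_idx in range(p):
            if row_idx != col:
                factor = aug[row_idx][col]
                aug[row_idx] = [v - factor * w for v, w in zip(aug[row_idx], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(samples):
    """Matrix of partial correlations, each pair conditioned on all other species."""
    omega = invert(covariance_matrix(samples))
    p = len(omega)
    return [[1.0 if i == j else -omega[i][j] / (omega[i][i] * omega[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]
```

In a chain where a first variable drives a second and the second drives a third, the first and third are correlated marginally, but their partial correlation vanishes once the middle variable is conditioned on.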

Table 1. The confusion matrix as used in this study.

The inferred partial correlation coefficient ρ (from the lower part of the partial correlation matrix) must have the same sign as one of the interactions in the interaction matrix A to be considered as a true positive finding in base case analysis.

Interaction in the A matrix from the model        Inferred partial correlation
                                                  Negative*         Not significant    Positive*
No interaction              0, 0                  false positive    true negative      false positive
Mutualism                   +, +                  false positive    false negative     true positive
Competition                 –, –                  true positive     false negative     false positive
Commensalism                +, 0 | 0, +           false positive    false negative     true positive
Amensalism                  –, 0 | 0, –           true positive     false negative     false positive
Exploitative interaction    +, – | –, +           true positive     false negative     true positive

* Only significant partial correlations (with p < 0.05) are considered after correction for multiple testing with Benjamini-Hochberg procedure.
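The base-case rules of Table 1 can be written as a small Python function. It takes the two directed interaction strengths of a species pair and the sign of the inferred partial correlation, with 0 standing for "not significant". The function names and the encoding of the correlation sign are assumptions made for this sketch.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def classify(a_ij, a_ji, rho_sign):
    """Classify one species pair following the base-case rules of Table 1.

    a_ij, a_ji: the two directed interactions between species i and j.
    rho_sign:   -1 or +1 for a significant partial correlation, 0 otherwise.
    """
    interaction_signs = {sign(a_ij), sign(a_ji)} - {0}
    if not interaction_signs:  # no interaction in either direction
        return "true negative" if rho_sign == 0 else "false positive"
    if rho_sign == 0:  # an interaction exists but was not detected
        return "false negative"
    # Correct when the correlation sign matches the interaction in either direction
    return "true positive" if rho_sign in interaction_signs else "false positive"
```

For an exploitative pair such as `classify(0.3, -0.2, -1)`, both correlation signs count as correct, which is the source of the bias examined with the stricter rules.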

Performance of network inference was evaluated with precision and recall and a combination of both measures, called the F1-score [33]. The precision is the fraction of correctly classified interactions among the total number of significantly predicted interactions (i.e., significant partial correlations) and the recall is the fraction of correctly classified interactions among the total number of non-zero interactions in the interaction matrix A. The F1-score (on a scale from 0 (no agreement) to 1 (perfect agreement)) is obtained as the harmonic mean of precision and recall, weighted equally, as given in the following equation:

$$
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{4}
$$
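A short Python sketch of these three measures, following the wording of the definitions above: precision divides by the number of significant partial correlations and recall by the number of species pairs with a non-zero interaction. The function name, the argument names and the convention of returning zero when a denominator is zero are assumptions.

```python
def precision_recall_f1(n_correct, n_significant, n_nonzero):
    """Precision, recall and F1-score (Eq 4) from three counts.

    n_correct:     correctly classified interactions (true positives)
    n_significant: significant partial correlations
    n_nonzero:     species pairs with a non-zero interaction in A
    """
    precision = n_correct / n_significant if n_significant else 0.0
    recall = n_correct / n_nonzero if n_nonzero else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, 6 correct interactions out of 8 significant correlations and 12 true interactions give a precision of 0.75, a recall of 0.5 and an F1-score of 0.6.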

Results

Inference of asymmetric and symmetric interactions in a two-species system

Correlations in abundances of the species in a two-species Lotka-Volterra model are shaped by the type of interaction involved. Fig 2 shows scatterplots of the abundances of two bacterial species for different interaction mechanisms over a range of different combinations of α12 and α21. Mutualistic interactions clearly yielded a positive correlation in abundance between the two species involved (Figs 2A and S5). Competitive interactions generally yielded negative correlations (Figs 2B and S5). However, under perfectly symmetric competition (when α12 = α21) we did find a positive correlation depending on interaction strength and carrying capacities of the species involved (S5D Fig, second panel). In the situation where one of the two species does not experience any benefits or limitations in growth from the other species, as is the case with commensalism and amensalism (i.e. α12 = 0 or α21 = 0), correlations are zero because one of the species will grow to its carrying capacity irrespective of the abundance of the other species (Fig 2C and 2D).

Fig 2.


Scatter plots between the abundances of two bacterial species for different interaction mechanisms: (A) mutualism, (B) competition, (C) commensalism, (D) amensalism and (E, F) exploitative interactions. The abundances of the two species N1 and N2 at equilibrium are shown as scatterplots and have been obtained by running the two-species Lotka-Volterra model, with K1 = 1.5; K2 = 1.1; r1 = 1; r2 = 2 and αij drawn randomly from normal distributions with identical means and standard deviations (α12 ~ N(|0.7|, 0.2), α21 ~ N(|0.7|, 0.2)). In the case of commensalism and amensalism: α12 ~ N(|0.7|, 0.2) and α21 = 0. The two species can co-exist under certain combinations of αij (S1 Text). The grey polygon indicates the area where co-existence is possible. Note that the axes have different ranges in each subplot. Because the two species have different carrying capacities, the two situations of exploitative interactions are different: in exploitative interaction type 1, species 1 is exploited by species 2, and in exploitative interaction type 2, species 2 is exploited by species 1.

Exploitative interactions among bacteria, benefitting one but harming the other species, generally yielded positive correlations (Figs 2E, 2F and S5), but negative correlations were also found. This happened when the exploitative benefit was of equal magnitude as the harm done to the other species (S5D Fig), or of similar mean magnitude but with more variation (e.g., species 1 is exploited by species 2, with –α12 = α21 and σα12 << σα21 (exploitative interaction type 1), or species 2 is exploited by species 1, with α12 = –α21 and σα21 << σα12 (exploitative interaction type 2)) (S5B Fig). However, if the exploitative benefit outweighs the harm done to the other species, exploitative interactions will generally yield positive correlations. It should also be noted that the two species were not exchangeable, because species 1 was given a weaker intraspecific interaction strength than species 2. Thus, in the absence of interspecific interactions, species 1 can reach a higher abundance at equilibrium. This means that, for the same interspecific interaction strength, the species with the higher carrying capacity exerts a stronger (negative) effect on the growth of the other species.

Network inference under various interaction types

Here we used the base case model to assess the success rate of recovering a particular interaction type between pairs of species: amensalism, commensalism, exploitative interactions, mutualism and competition (S1 Fig). Fig 3A shows that correlations were more often found in mutualistic and competitive interactions, where interacting species experience the same qualitative effects from each other, than in amensalistic and commensalistic interactions, where only one species experiences an effect from the presence of another species. For exploitative interactions among bacteria, either a positive or negative correlation coefficient ρ could be found, with a success rate comparable to amensalistic and commensalistic interactions. Contrary to the results that included symmetric interactions, there was no difference in the successful inference of positive versus negative interactions in any interaction type (Fig 3B). For all interaction types, the sign of the significant correlation coefficient ρ mostly agreed with the sign of the interaction (Fig 3A and 3B). However, neither the type nor the direction of the original interaction could be recovered from the inferred correlations.

Fig 3. The percentage of significant partial correlations (with sign matching interaction in either direction), as recovered from the base case model.


(A) For different types of pairwise interactions and (B) for the different correlations.

Network inference under various sources of process variability

Next, we investigated how correct network inference was affected by several variations to the base case model (Fig 4 and S1 Table). In all cases considered, interactions were recovered with precision exceeding recall. This means that the likelihood of missing an interaction (i.e., 1 – recall) was higher than the likelihood of finding a false interaction (i.e., 1 – precision), illustrating the effect of false discovery rate control.

Fig 4. Inference under various sources of process variability.


For the different scenarios we show the precision, recall and the F1-score. (A) The base case model. (B) Host-specific variation in the carrying capacities and intrinsic growth rates. (C) Decreased and increased amount of measurement noise (υ) and the effect of process noise (W) (S2 Fig). (D) Interaction strengths drawn from a uniform and unimodal distribution (S3 Fig). (E) The results for a 30 species system, a network based on a producer-consumer structure and a network with hub interactions (S4 Fig). (F) The effect on network inference of specifying the intended sign in correlation analysis, as the sign of the strongest interaction in each pair of species, or of setting the rule that the sign of both interactions must be matched by the inferred correlation coefficient (strict inference). (G) Three scenarios with 3000 hosts, for the base-case with random interaction networks as well as for the scenarios with structured (i.e. producer-consumer and hub-species) networks. Network inference was assessed by the F1-score, which measures agreement between the interaction matrix in the gLV model and the inferred partial correlation matrix on a scale from 0 (no agreement) to 1 (perfect agreement) (according to the rules of Table 1). The dashed line indicates the median result from the base case model. The bars of the boxplots indicate the variability of the data outside the middle 50% (i.e., the lower 25% of scores and the upper 25% of scores).

Partial correlations corresponded to non-zero entries in the interaction matrix only when interindividual variation existed in the interaction parameters (αij) and/or carrying capacities (Ki) (Fig 4A and 4B). These parameters directly influence microbial abundance patterns, as interspecific interactions and carrying capacities determine the equilibrium of the gLV model. The intrinsic growth rate only determines the speed at which species reach their equilibrium, and this parameter is not informative for the equilibrium abundances. In fact, performance under interindividual variation in growth rates was just as bad as the performance under pure measurement noise with no variation in model parameters (Fig 4B).
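The reasoning above can be checked numerically. The sketch below assumes one common way of writing the gLV model with self-limitation, dN_i/dt = r_i N_i (1 - Σ_j a_ij N_j / K_i) with a_ii = 1, in which a positive off-diagonal a_ij denotes competition. This form, the sign convention, the parameter values and the function names are assumptions for illustration and may differ from the parametrization used in the simulations. Under this form, the coexistence equilibrium solves A N* = K and does not involve r.

```python
import numpy as np

def glv_equilibrium(A, K):
    """Coexistence equilibrium N* solving A @ N* = K (all species present)."""
    return np.linalg.solve(A, K)

def simulate(r, A, K, n0, t_end, dt=0.01):
    """Forward-Euler integration of dN_i/dt = r_i N_i (1 - (A N)_i / K_i)."""
    n = np.array(n0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        n = n + dt * r * n * (1.0 - (A @ n) / K)
    return n

# Assumed toy parameters for two weakly competing species
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
K = np.array([1.0, 0.8])
n0 = [0.05, 0.05]

n_star = glv_equilibrium(A, K)
slow = np.array([0.5, 0.5])   # slow-growing host
fast = np.array([2.0, 2.0])   # fast-growing host

early_slow = simulate(slow, A, K, n0, t_end=2.0)
early_fast = simulate(fast, A, K, n0, t_end=2.0)
late_slow = simulate(slow, A, K, n0, t_end=100.0)
late_fast = simulate(fast, A, K, n0, t_end=100.0)

print("equilibrium:", n_star)
print("t = 2:  ", early_slow, early_fast)
print("t = 100:", late_slow, late_fast)
```

Both hosts end at the same equilibrium regardless of growth rate, whereas their abundances differ shortly after the start, when the populations are still growing.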

Performance of correlation-based network inference was robust to measurement noise, if measurement noise was small compared to interindividual variation in process parameters (Fig 4C). When measurement noise became of the same magnitude as the variation in interspecific interactions, the F1-score deteriorated, and it was no longer possible to use correlations as a proxy for interactions (Fig 4C). We also checked whether adding process noise would affect the inference. We did observe a significant improvement of the inference from a model with process noise relative to only measurement noise (Fig 4C and S1 Table).
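The dependence on the noise-to-signal ratio can be illustrated with a two-species toy model. The sketch below is not the simulation pipeline used for Fig 4. It assumes the same gLV form with self-limitation as a simple stand-in, draws host-specific competition coefficients with a relative standard deviation of 25%, computes the equilibrium abundances analytically and then adds multiplicative Gaussian measurement noise. All parameter values and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hosts = 2000
K = 1.0   # assumed common carrying capacity

# Host-specific competition coefficients (assumed mean 0.5, SD 0.125),
# clipped so that stable coexistence is guaranteed
a12 = np.clip(rng.normal(0.5, 0.125, n_hosts), 0.1, 0.9)
a21 = np.clip(rng.normal(0.5, 0.125, n_hosts), 0.1, 0.9)

# Equilibrium of dN_i/dt = r_i N_i (1 - (N_i + a_ij N_j) / K) for two species
denom = 1.0 - a12 * a21
n1 = K * (1.0 - a12) / denom
n2 = K * (1.0 - a21) / denom

def observed_correlation(cv):
    """Pearson correlation after adding measurement noise with coefficient of variation cv."""
    obs1 = n1 * (1.0 + cv * rng.standard_normal(n_hosts))
    obs2 = n2 * (1.0 + cv * rng.standard_normal(n_hosts))
    return float(np.corrcoef(obs1, obs2)[0, 1])

r_low_noise = observed_correlation(0.01)   # noise far below between-host variation
r_high_noise = observed_correlation(1.0)   # noise far above between-host variation
print(r_low_noise, r_high_noise)
```

In this setup the competitive interaction should show up as a clearly negative correlation when measurement noise is small, and the correlation should shrink towards zero when the noise dominates the between-host variation.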

Hereafter, we investigated the effect of drawing the interaction strengths from different types of distributions (Figs 4D and S3). We did not observe a difference between the success rate of network inference under a Gaussian mixture distribution or uniform distribution, which were conditioned to have similar variances (S1 Table). However, successful inference deteriorates with reduced interaction strength, as success rates were better under a Gaussian mixture distribution or uniform distribution, as compared to a unimodal distribution around zero (with smaller variance) (Fig 4D). The weaker interactions have a smaller effect on equilibrium abundances of other species, which makes them harder to detect with correlation analysis.

Fig 4E shows the results for different network types. Increasing the number of species from 10 to 30 had a significant negative effect on the success of the inference (S1 Table), which was mainly due to reduced precision. Conversely, F1-scores were improved as compared to the base-case when assuming a producer-consumer based network (S4 Fig and S1 Table), on account of an improved recall. The inference in a network with interaction hubs (as explained in S4 Fig) was significantly worse than in a random network, which could be attributed to a somewhat reduced recall.

Note that problems may arise with asymmetric relationships. When using the rule that pairwise correlations should match the strongest interaction between both species involved as the intended sign, we found only a slight non-significant reduction in F1-score as compared to the base case scenario (Fig 4F and S1 Table). Thus, pairwise interactions wherein the net effect on population growth is positive or negative are mostly picked up as such in correlation analysis. However, under the rule that mutual interactions must both be reflected in the sign of the correlations, asymmetric interactions cannot be recovered as correlations are symmetric. We indeed found much lower F1-scores when detection of asymmetric interactions was no longer considered as a true positive result after inferring a significant correlation coefficient ρ (either positive or negative) (Fig 4F).
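The two matching rules can be written down explicitly. The following sketch assumes that a positive coefficient denotes a beneficial effect and a negative coefficient a harmful effect; this sign convention and the function name `intended_sign` are assumptions for illustration and need not match the rules of Table 1 in every detail.

```python
def intended_sign(a_ij, a_ji, rule="strongest"):
    """Sign a correlation would need in order to count as a correct detection.

    Returns +1 or -1 for the intended sign, 0 if the pair does not interact,
    and None if no single correlation sign can satisfy the rule.
    Assumption: a positive coefficient is beneficial, a negative one harmful.
    """
    if a_ij == 0 and a_ji == 0:
        return 0
    if rule == "strongest":
        # Sign of the interaction with the largest absolute strength
        # (an exact tie with opposite signs is resolved arbitrarily in favour of a_ij)
        strongest = a_ij if abs(a_ij) >= abs(a_ji) else a_ji
        return 1 if strongest > 0 else -1
    if rule == "strict":
        # Both directions must share the sign of the correlation
        if a_ij > 0 and a_ji > 0:
            return 1
        if a_ij < 0 and a_ji < 0:
            return -1
        return None   # asymmetric pair: a symmetric correlation cannot match both
    raise ValueError("rule must be 'strongest' or 'strict'")

# An exploitative pair: species i benefits (+0.6) while species j is harmed (-0.2)
print(intended_sign(0.6, -0.2, rule="strongest"))   # 1
print(intended_sign(0.6, -0.2, rule="strict"))      # None
```

Under the strict rule, every asymmetric pair returns `None`, which is why such interactions can no longer be counted as true positives.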

Finally, we verified that network inference improved with increasing sample size. This applied to models with random as well as structured interaction networks (Fig 4G). In the base case, precision was somewhat reduced at increased sample size notwithstanding Benjamini-Hochberg control. However, this was compensated by substantially improved recall, resulting in significantly increased F1-scores. Interestingly, precision stayed more or less constant at increased sample size in producer-consumer and hub-species networks, whereas recall improved but remained somewhat behind that of random networks.

Network inference under non-equilibrium conditions

Fig 5 shows that the equilibrium assumption is not necessary for successful correlation-based network inference. In fact, our results even suggest that a perturbation can positively affect the performance of network inference. Variation in the growth rates becomes significantly informative outside the equilibrium (S2 Table). Also, variation in the interactions becomes even more informative when the population is still growing towards the equilibrium. Network inference is impaired only right after the start of a perturbation, when the population is still far from a new equilibrium, unless the interindividual variation is in the carrying capacities (Fig 5B). We also assessed the success of correlation-based inference when the sampling occurred randomly in time in relation to the perturbation. We found that the F1-score resembled an average of F1-scores across various sampling timepoints.

Fig 5. The effect of a perturbation on correlation-based network inference.

(A) Example of a timeseries. Dashed lines represent sampling timepoints. Sampling was performed during the perturbation (t1 = green, t2 = yellow, t3 = blue and t4 = grey) and at equilibrium (t5 = dark blue). Alternatively, sampling was performed randomly between t = 100 and t = 1000 (random = pink). (B) Results (F1-scores) of network inference for sampling at various timepoints. After a perturbation all species grow back to their original equilibrium. The bars of the boxplots indicate the variability outside the middle 50% (i.e., the lower 25% of scores and the upper 25% of scores). Dashed lines represent median results of sampling during equilibrium.

Discussion

Correlation-based network inference has been used in many studies and for many different types of human and environmental microbial communities [31]. The reliability of the results with regard to true ecological dependencies has been criticized, to the extent that correlation analysis has been suggested to almost never reveal anything substantive about the biotic relationships between bacteria [18]. However, the theoretical basis that enables ecological interactions to be inferred from cross-sectional abundance data remains poorly understood. Most of the previous research has focused on the reconstructed network properties or the difficulties pertaining to metagenomics-based abundance patterns, e.g., the compositionality of the data and the high proportion of zeros [18,31,34]. While these difficulties are pervasive and merit further consideration, here, we question whether correlations are at all useful in distinguishing different interaction types in microbial networks.

We demonstrated multiple pitfalls when using correlation-based methods for inferring interactions. Some of those pitfalls are well known, as they relate to the inherent symmetry of correlation-based metrics and the frequent asymmetry of ecological interactions [18]. As a result, asymmetric interaction types (commensalism, amensalism and exploitative interactions) cannot be recovered with indication of the direction of interaction, which agrees with prior work done by Weiss et al. (2016) [21]. Symmetric interaction types, where species involved affect each other’s growth in a qualitatively similar way (competition, mutualism) can be recovered, although competitive interactions may also result in positive correlations, albeit in very rare cases where species have identical competitive strength. Likewise, we found that exploitative interactions generally induce positive correlations, especially in the likely circumstance where the exploitative benefit outweighs the harm to the exploited species. These findings might explain why empirical correlation-based networks have a relative shortage of negative correlations [20,34,35]. It remains to be investigated whether the high frequency of positive edges in reconstructed networks is caused by methodological limitations or whether the interspecific interactions in host-associated microbiota are primarily mutualistic [36–39].

Still, as illustrated by our analysis, correlations in microbial abundance across independently sampled hosts can be indicative for underlying ecological interactions under host-specific variation in microbial population dynamics. That is, if microbial groups of interest are omnipresent and their interactions are appropriately captured by generalized Lotka-Volterra (gLV) dynamics, the variation in population abundances should be driven by interindividual variability in population-dynamic parameters. In the context of the gLV model, the informative parameters are primarily related to intrinsic growth rates, carrying capacities and strength of between-species interactions of microbial groups considered. A change in species abundances can be informative for the interactions among those species, as was also previously shown by Stone and Roberts (1991) [40]. It remains to be determined how much variability across individual hosts is driven by external forcing and by gradual differences in process-related parameters relative to measurement noise. On one hand, it is well known that microbes adapt to host-specific environments, shaped by diet, lifestyle, hormonal regulation, immune system, etcetera [26]. As an example, increased abundance of a particular bacterial species at increased glucose intake levels might be reflective of increased resource availability (affecting carrying capacity and growth rate) or superior competitive strength (affecting interactions with other species) [6]. On the other hand, environmental drivers of bacterial growth can operate over different spatial and temporal scales and correlations in abundance can be reflective of shared environmental niches that have no meaning in terms of direct biotic interactions [1].

Therefore, a correlation between the abundance of two species does not imply that those species are interacting [41]. Many of the detected correlations may be caused by shared environmental preferences rather than species interactions [42]. This kind of environmental filtering can mask putative between-species interactions as well as induce spurious correlations [18]. Also, co-occurring species may appear to depend on each other when their co-occurrence is actually explained by a shared dependency on a third party, so that apparent pairwise dependencies may in fact reflect higher-order interactions [43]. Berry and Widder (2014) claimed that network interpretation is only possible if samples are derived from similar environments [13]. Our analysis suggests that network inference partially depends on a degree of heterogeneity in population-dynamic parameters. If differences in bacterial abundances between hosts are mainly due to measurement noise, their correlations are not informative of underlying interactions. In our simulations, with relative standard deviation in process-related parameters between hosts of about 25%, inference performed well as long as measurement noise had coefficients of variation well below 10% of mean bacterial abundances. Strikingly, the inference of interactions was even improved when process noise was added. More research is needed to delineate the extent to which correlation analyses can be confounded by latent environmental drivers of microbial population dynamics, and how strongly one should condition on environmental or host homogeneity.

Our results have been obtained by using the gLV model. While the gLV model has been very popular in microbiome research because of its manageability, it has several drawbacks. In ecology, the gLV model has been criticized for the absence of trophic levels within the model [44]. This is in contrast to most classical ecological (e.g. plant-herbivore or predator-prey) systems, where direct consumption and predation offer more opportunity for top-down regulation, possibly obscuring interactions in co-occurrence patterns [45]. But trophic levels are probably not so relevant in the human microbiome as bacteria mainly interact with each other through excreted metabolites [2]. Furthermore, the interactions between bacteria might be much more complex than the additive and pairwise interactions that the gLV model assumes. Momeni et al. (2017) claimed that pairwise modeling will often fail to predict microbial dynamics, as many interactions occur through chemical production pathways (such as cross-feeding and nutrient competition) involving more than two species [46]. Correlation analysis fails to capture the resulting higher-order interactions, for which more advanced techniques, e.g. graphical models [47], might be more appropriate. It is unclear how well directed links predicted by these methods recover true ecological interaction types. Often, they require more prior knowledge of the network of microbial interactions, time series or more fine-grained data on the pathways of interaction. Moreover, microbial networks can be bi-directed and cyclic [20], which poses problems for inference of directionality and type of interactions from mere cross-sectional data. More classical methods of separating direct from indirect interactions, e.g. path analysis [48], rely on testing of specific alternative causal hypotheses, which can only be considered as a next step in network inference.

To shed more light on causal pathways, there is a need in microbial ecology for models that can describe the full set of metabolite concentrations, metabolic fluxes and species abundances within a community [49]. Based on metabolic modelling, Freilich et al. (2011) concluded that cooperative interactions are relatively rare among free-living bacteria and, if present, are often unidirectional [20]. Machado et al. (2021) suggested that mutualistic interactions are much more common among host-associated bacteria, which often form highly cooperative communities and have smaller genomes and fewer metabolic genes compared to other species [39]. Cooperative communities are resilient to nutrient change and adaptable to a wide variety of different environments, including the human body [20,43]. Metabolic modeling is still challenging and heavily based on a priori assumptions, but is also a rapidly developing field that may prove useful for computational validation of correlation-based interaction networks [50].

In addition, the gLV model disregards important biological processes, such as adaptation (for instance, switching of mutualistic partners due to, for example, horizontal gene transfer [51]), that may affect the topology of ecological networks, rather than the strength of ecological interactions in a network. Furthermore, the gLV model displays dynamics that are characterized by strong equilibrium attractors. Many studies have shown the occurrence of complex dynamics, such as alternative stable states [52], oscillations and chaos, in experimental [53–55] as well as field studies [56] of ecological communities. Whether this also applies to the bacterial communities inhabiting the human body is still unknown, due to the paucity of long-term human microbiome studies. However, a study among a thousand western individuals has suggested the existence of tipping elements in the intestinal microbiome [57], indicating the possible presence of alternative attractors in the dynamics of gut microbiome communities [58,59].

As a general critique, the use of simulated data based on gLV dynamics raises the question to what extent the necessary model assumptions (and therefore the results) are representative for the human microbiome. Of course, real data are much more complex than simulated data. To reiterate, our base-case parametrization does not reflect any particular real-world system, and findings should be appreciated from a qualitative rather than quantitative viewpoint. Even so, while models can only serve as very crude approximations, the main features of model-based analysis might still hold, as demonstrated by Freilich et al. (2018) [42]. They compared a well-resolved, empirically defined interaction network of species in the rocky intertidal zone in central Chile to a reconstructed network based on the co-occurrence of those species. Their findings are similar to ours in several respects. For example, they found that weak interactions are missed more often than interactions above a certain threshold. They also concluded that the ability to correctly detect a true link varies across different interaction types, and that positive interactions are better detected than negative interactions. Interestingly, in line with our results, they also found that negative interactions are misclassified as positive interactions more often than vice versa.

In our simulation studies, the chance of finding false interactions was well under control by using partial correlations with adjustment for multiple testing. It should be noted that application of correlation-based network reconstruction to real-world high-throughput microbial abundance data typically requires additional constraints for control of false discovery rates. Real-world microbiome data have some specific challenges which may negatively affect the success of correlation-based network inference. The compositionality of the data, the diversity of species (with many rare species) and the density of interactions make these networks harder to predict and apparent correlations more likely to appear [14,19]. Various correlation-based methods, often free of charge and distributed as ready-made software packages, are available to handle these challenges. However, Weiss et al. (2016) showed that with the same data, there is much disagreement between the inferred networks generated by different tools [21]. Thus, even if correlations are a useful proxy of microbial interactions, performance of network inference in high-dimensional settings will also strongly depend on the specific network modelling approach taken.
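As an illustration of what partial correlations with adjustment for multiple testing can look like in practice, the sketch below computes partial correlations from the inverse covariance matrix, derives approximate p-values with the Fisher z-transformation and applies the Benjamini-Hochberg step-up procedure. The choice of estimator and test, the function names and the toy data are assumptions for illustration and are not necessarily the implementation used in this study.

```python
import math
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the inverse covariance of X (samples x species)."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def fisher_z_pvalues(pcor, n_samples):
    """Approximate two-sided p-values for each species pair (upper triangle)."""
    p = pcor.shape[0]
    iu = np.triu_indices(p, k=1)
    r = np.clip(pcor[iu], -0.999999, 0.999999)
    # Each partial correlation conditions on the p - 2 remaining species
    z = np.arctanh(r) * math.sqrt(n_samples - (p - 2) - 3)
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return iu, pvals

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at false discovery rate q."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])   # largest rank meeting the criterion
        reject[order[:k + 1]] = True
    return reject

# Toy chain x1 -> x2 -> x3: x1 and x3 are related only through x2
rng = np.random.default_rng(0)
n = 5000
x1 = rng.standard_normal(n)
x2 = x1 + rng.standard_normal(n)
x3 = x2 + rng.standard_normal(n)
X = np.column_stack([x1, x2, x3])

marginal_13 = float(np.corrcoef(x1, x3)[0, 1])
pcor = partial_correlations(X)
pairs, pvals = fisher_z_pvalues(pcor, n)      # pairs ordered (0,1), (0,2), (1,2)
significant = benjamini_hochberg(pvals, q=0.05)
print(marginal_13, pcor[0, 2], significant)
```

In this toy chain the marginal correlation between x1 and x3 is substantial, while their partial correlation should be close to zero, which is the sense in which partial correlations help separate direct from indirect associations.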

To summarize, correlation-based methods are particularly insensitive for the detection of asymmetric interactions (such as exploitative interactions, amensalism or commensalism), as direction of interaction cannot be recovered from co-occurrence data. Still, they may perform well when applied to networks that are dominated by mutualistic and competitive interactions, as in producer-consumer systems. Applicability of correlation-based network inference to readily available microbiome data thus depends on the type of interactions that govern microbiome dynamics, which likely depends on each application. To conclude, our study suggests that hypotheses about microbial interactions, generated with correlation-based methods, should be questioned with domain-specific knowledge. We again emphasize that careful interpretation and validation are required.

Supporting information

S1 Text. Co-existence in a two-species Lotka-Volterra model with self-limitation.

Table A. Conditions for stable co-existence in the two-species Lotka-Volterra model. Fig A. Zero-growth isoclines (“null-clines”) in the two-species Lotka-Volterra model.

(PDF)

S1 Fig. Cartoon illustrating the different interaction mechanisms.

(PDF)

S2 Fig. The effect of process noise (W) on the within host population dynamics.

(PDF)

S3 Fig. Distributions of interaction strengths in three different scenarios.

(PDF)

S4 Fig. Network structures used in the different case studies.

(PDF)

S5 Fig. The effect of αij on the correlations between the abundances of two bacterial species for different interactions mechanisms.

(PDF)

S1 Table. Mann-Whitney U test results for the F1-scores of the base case model and for the F1-scores of the model with different sources of process variability.

(PDF)

S2 Table. Mann-Whitney U test results for the F1-scores of the samples taken during equilibrium (t5 in Fig 5) and for the F1-scores of the samples taken outside equilibrium.

(PDF)

Data Availability

All relevant code is available via GitHub (https://github.com/susannepinto/gLV_microbiome.git).

Funding Statement

This publication is part of the project "Ecology meets human health: unraveling the complex dynamics of human microbiota to direct therapeutic intervention" financed by the Dutch Organization for Scientific Research (NWO) through the research program Complexity in Health and Nutrition (NWO grant 645.001.002; www.nwo.nl/onderzoeksprogrammas/complexiteit), with co-funding by the National Institute for Public Health and the Environment (RIVM) of the Netherlands. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Consortium THMP. Structure, function and diversity of the healthy human microbiome. Nature. 2012; 486: 207–214. doi: 10.1038/nature11234
2. Faust K, Raes J. Microbial interactions: from networks to models. Nature Reviews Microbiology. 2012; 10: 538–550. doi: 10.1038/nrmicro2832
3. Falony G, Joossens M, Vieira-Silva S, Wang J, Darzi Y, Faust K, et al. Population-level Analysis of Gut Microbiota Variation. Science. 2016; 352: 560–564.
4. Karkman A, Lehtimäki J, Ruokolainen L. The ecology of human microbiota: dynamics and diversity in health and disease. Annals Of The New York Academy Of Sciences. 2017: 1–15. doi: 10.1111/nyas.13326
5. Hibbing ME, Fuqua C, Parsek MR, Peterson SB. Bacterial competition: Surviving and thriving in the microbial jungle. Nature Reviews Microbiology. 2010; 8: 15–25. doi: 10.1038/nrmicro2259
6. Woyke T, Teeling H, Ivanova NN, Huntemann M, Richter M, Gloeckner FO, et al. Symbiosis insights through metagenomic analysis of a microbial consortium. Nature. 2006; 443(7114): 950–955. doi: 10.1038/nature05192
7. van Nood E, Vrieze A, Nieuwdorp M, Fuentes S, Zoetendal EG, de Vos WM, et al. Duodenal infusion of donor feces for recurrent Clostridium difficile. The New England Journal of Medicine. 2013; 368: 407–415.
8. Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and stability. Science. 2015; 350: 663–666. doi: 10.1126/science.aad2602
9. Chaffron S, Rehrauer H, Pernthaler J, von Mering C. A global network of coexisting microbes from environmental and whole-genome sequence data. Genome Res. 2010; 20(7): 947–959. doi: 10.1101/gr.104521.109
10. Riera JL, Baldo L. Animal Microbial co-occurrence networks of gut microbiota reveal community conservation and diet-associated shifts in cichlid fishes. Microbiome. 2020; 2: 36.
11. Vemuri R, Martoni CJ, Kavanagh K, Eri R. Lactobacillus acidophilus DDS-1 Modulates the Gut Microbial Co-Occurrence Networks in Aging Mice. Nutrients. 2022; 14. doi: 10.3390/nu14050977
12. Stein RR, Bucci V, Toussaint NC, Buffie CG, Rätsch G, Pamer EG, et al. Ecological modeling from time-series inference: insight into dynamics and stability of intestinal microbiota. Plos Computational Biology. 2013; 9: 1–11. doi: 10.1371/journal.pcbi.1003388
13. Berry D, Widder S. Deciphering microbial interactions and detecting keystone species with co-occurrence networks. Frontiers in Microbiology. 2014; 5: 1–14.
14. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. Plos Computational Biology. 2012; 8: 1–11.
15. Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and Compositionally Robust Inference of Microbial Ecological Networks. Plos Computational Biology. 2015; 11(5). doi: 10.1371/journal.pcbi.1004226
16. Faust K, Raes J. CoNet app: inference of biological association networks using Cytoscape. F1000Research. 2016; 5. doi: 10.12688/f1000research.9050.2
17. Faust K, Sathirapongsasuti JF, Izard J, Segata N, Gevers D, Raes J, et al. Microbial co-occurrence relationships in the human microbiome. Plos Computational Biology. 2012; 8: 1–17.
18. Carr A, Diener C, Baliga NS, Gibbons SM. Use and abuse of correlation analyses in microbial ecology. The International Society of Microbial Ecology Journal. 2019; 13: 2647–2655.
19. Hirano H, Takemoto K. Difficulty in inferring microbial community structure based on co-occurrence network approaches. BMC Bioinformatics. 2019; 20: 1–14.
20. Freilich S, Zarecki R, Eilam O, Segal ES, Henry CS, Kupiec M, et al. Competitive and cooperative metabolic interactions in bacterial communities. Nature communications. 2011; 589(2): 7. doi: 10.1038/ncomms1597
21. Weiss S, Treuren WV, Lozupone C, Faust K, Friedman J, Deng Y, et al. Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. The International Society of Microbial Ecology Journal. 2016; 10: 1669–1681. doi: 10.1038/ismej.2015.235
22. Gonze D, Coyte KZ, Lahti L, Faust K. Microbial communities as dynamical systems. Current Opinion in Microbiology. 2018; 44: 41–49. doi: 10.1016/j.mib.2018.07.004
23. Bucci V, Tzen B, Li N, Simmons M, Tanoue T, Bogart E, et al. MDSINE: Microbial Dynamical Systems INference Engine for microbiota time-Series analyses. Genome Biology. 2016; 17: 1–17.
24. Xiao Y, Angulo MT, Friedman J, Waldor MK, Weiss ST, Liu Y-Y. Mapping the ecological networks of microbial communities. Nature communications. 2017; 8: 1–12.
25. Jones EW, Shankin-Clarke P, Carlson JM. Navigation and control of outcomes in a generalized Lotka-Volterra model of the microbiome. In: Kotas J (eds). Advances in Nonlinear Biological Systems: Modeling and Optimal Control, 11th edn. American Institute of Mathematical Sciences, USA, pp 97–120, 2020.
26. Cho I, Blaser MJ. The human microbiome: At the interface of health and disease. Nature Reviews Genetics. 2012; 13: 260–270. doi: 10.1038/nrg3182
27. Burnham P, Gomez-Lopez N, Heyang M, Cheng AP, Lenz JS, Dadhania DM, et al. Separating the signal from the noise in metagenomic cell-free DNA sequencing. Microbiome. 2020; 8: 1–9.
28. Hindmarsh AC, Petzold LR. Algorithms and software for ordinary differential equations and differential algebraic equations, Part II: Higher-order methods and software packages. Computers in Physics. 1995; 9: 148–155.
29. Goh K-Il OE, Jeong H, Kahng B, Kim D. Classification of scale-free networks. PNAS. 2002; 99(20): 12583–12588. doi: 10.1073/pnas.202301299
30. Barabási A. Network Science. 1 ed: Cambridge University Press; 2016.
31. Layeghifard M, Hwang DM, Guttman DS. Disentangling interactions in the microbiome: A network perspective. Trends in Microbiology. 2017; 25: 217–228. doi: 10.1016/j.tim.2016.11.008
32. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. 1995; 57: 289–300.
33. Sasaki Y. The truth of the F-measure. University of Manchester: School of Computer Science. 2007: 1–5.
34. Cougoul A, Bailly X, Vourc’h G, Gasqui P. Rarity of microbial species: In search of reliable associations. PLoS ONE. 2019; 14: 1–15. doi: 10.1371/journal.pone.0200458
35. Gupta S, Hjelmsø MH, Lehtimäki J, Li X, Mortensen MS, Russel J, et al. Environmental shaping of the bacterial and fungal community in infant bed dust and correlations with the airway microbiota. Microbiome. 2020; 7: 1–16. doi: 10.1186/s40168-020-00895-w
36. Ma ZS. The P/N (Positive-to-Negative Links) ratio in complex networks—a promising in silico biomarker for detecting changes occurring in the human microbiome. Microbial Ecology. 2018; 75: 1063–1073. doi: 10.1007/s00248-017-1079-7
37. Seelbinder B, Chen J, Brunke S, Vazquez-Uribe R, Santhaman R, Meyer A-C, et al. Antibiotics create a shift from mutualism to competition in human gut communities with a longer-lasting impact on fungi than bacteria. Microbiome. 2020; 12: 1–20. doi: 10.1186/s40168-020-00899-6
38. Röttjers L, Faust K. From hairballs to hypotheses-biological insights from microbial networks. FEMS Microbiology Reviews. 2018; 42: 761–780. doi: 10.1093/femsre/fuy030
39. Machado D, Maistrenko O, Andrejev S, Kim Y, Bork P, Patil K. Polarization of microbial communities between competitive and cooperative metabolism. Nat Ecol Evol. 2021; 5(2): 195–203. doi: 10.1038/s41559-020-01353-4
40. Stone L, Roberts A. Conditions for a species to gain advantage from the presence of competitors. Ecology. 1991; 72: 947–957.
41. Fisher CK, Mehta P. Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression. PLoS ONE. 2014; 9: 1–10. doi: 10.1371/journal.pone.0102451
42. Freilich MA, Wieters E, Broitman BR, Marquet PA, Navarrete SA. Species co-occurrence networks: can they reveal trophic and non-trophic interactions in ecological communities? The Ecological Society of America. 2018; 99: 690–699. doi: 10.1002/ecy.2142
43. Mayfield M, Stouffer D. Higher-order interactions capture unexplained complexity in diverse communities. Nat Ecol Evol. 2017; 1(0062). doi: 10.1038/s41559-016-0062
44. Tilman D. The importance of the mechanisms of interspecific competition. The American Naturalist. 1987; 129: 769–774.
45. Pace M, Cole J, Carpenter S, Kitchell J. Trophic cascades revealed in diverse ecosystems. Trends in Ecology and Evolution. 1999; 14: 483–488. doi: 10.1016/s0169-5347(99)01723-1
46. Momeni B, Xie L, Shou W. Lotka-Volterra pairwise modeling fails to capture diverse pairwise microbial interactions. eLife. 2017; 6: 1–34. doi: 10.7554/eLife.25051
47. Højsgaard S, Edwards D, Lauritzen S. Graphical models with R: Springer Science & Business Media; 2012.
48. Wootton J. Predicting Direct and Indirect Effects: An Integrated Approach Using Experiments and Path Analysis. Ecology. 1994; 75(1): 151–165.
49. Hanemaaijer M, Röling WFM, Olivier BG, Khandelwal RA, Teusink B, Bruggeman FJ. Systems modeling approaches for microbial community studies: from metagenomics to inference of the community structure. Frontiers in Microbiology. 2015; 6: 1–12.
50. Succurro A, Ebenhöh O. Review and perspective on mathematical modeling of microbial ecosystems. Biochemical Society Transactions. 2018; 46: 403–412. doi: 10.1042/BST20170265
51. Moran NA, Wernegreen JJ. Lifestyle evolution in symbiotic bacteria: Insights from genomics. Trends in Ecology and Evolution. 2000; 15: 321–326. doi: 10.1016/s0169-5347(00)01902-9
52. Scheffer M, Carpenter S, Foley JA, Folke C, Walker B. Catastrophic shifts in ecosystems. Nature. 2001; 413: 591–596. doi: 10.1038/35098000
53. Costantino RF, Desharnais RA, Cushing JM, Dennis B. Chaotic dynamics in an insect population. Science. 1997; 275: 389–391. doi: 10.1126/science.275.5298.389
54. Becks L, Hilker FM, Malchow H, Jürgens K, Arndt H. Experimental demonstration of chaos in a microbial food web. Nature. 2005; 435: 1226–1229. doi: 10.1038/nature03627
55. Benincà E, Huisman J, Heerkloss R, Jöhnk KD, Branco P, van Nes EH, et al. Chaos in a long-term experiment with a plankton community. Nature. 2008; 451: 822–825. doi: 10.1038/nature06512
56. Benincà E, Ballantine B, Ellner SP, Huisman J. Species fluctuations sustained by a cyclic succession at the edge of chaos. Proceedings of the National Academy of Science. 2015; 112(20): 6389–6394. doi: 10.1073/pnas.1421968112
57. Lahti L, Salojärvi J, Salonen A, Scheffer M, Vos de WM. Tipping elements in the human intestinal ecosystem. Nature communications. 2014; 5: 1–10. doi: 10.1038/ncomms5344
58. Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, Dakos V, et al. Early-warning signals for critical transitions. Nature. 2009; 461: 53–59. doi: 10.1038/nature08227
59. Scheffer M, Carpenter SR, Lenton TM, Bascompte J, Brock W, Dakos V, et al. Anticipating Critical Transitions. Science. 2012; 338: 344–348. doi: 10.1126/science.1225244
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010491.r001

Decision Letter 0

Kiran Raosaheb Patil

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

24 Feb 2022

Dear Dr. Pinto,

Thank you very much for submitting your manuscript "Species abundance correlations cannot distinguish interaction types in microbial networks" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Kiran Raosaheb Patil, Ph.D.

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this manuscript, the authors approach the question of whether cross-sectional sampling of microbial communities, followed by correlation analysis of quantitative abundances, can reveal insights into the types of ecological interactions occurring between microbes in a community. The authors test this by performing Lotka-Volterra simulations under known interaction topologies, and varying the parameters and/or noise-levels during each run. They then perform partial correlation analysis across multiple snapshots that have resulted from the simulations, testing whether the original input interactions can be retrieved.
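
The simulate-then-correlate procedure summarised in this comment can be illustrated with a minimal sketch. It is not the authors' code (that is in the GitHub repository named in the Data Availability Statement). The three-species network, the noise levels, the Euler integration scheme and all function names below are assumptions chosen only for illustration.

```python
import numpy as np

def simulate_glv(r, A, x0, dt=0.02, steps=2000):
    """Euler integration of dx_i/dt = x_i * (r_i + sum_j A_ij * x_j), one row per host."""
    x = x0.copy()
    for _ in range(steps):
        growth = r + np.einsum("hij,hj->hi", A, x)   # per-host matrix-vector product
        x = np.clip(x + dt * x * growth, 0.0, None)  # keep abundances non-negative
    return x

def partial_correlations(samples):
    """Partial correlations obtained from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(samples, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(0)
n_species, n_hosts = 3, 200

# Hypothetical mean network: self-limitation on the diagonal,
# species 0 and 1 compete, species 2 interacts with nobody.
A_mean = np.array([[-1.0, -0.3,  0.0],
                   [-0.3, -1.0,  0.0],
                   [ 0.0,  0.0, -1.0]])
off_diag_links = (A_mean != 0) & ~np.eye(n_species, dtype=bool)

# Between-host variability (assumed values): interaction strengths and growth rates.
A = A_mean + off_diag_links * rng.normal(0.0, 0.1, size=(n_hosts, n_species, n_species))
r = 1.0 + rng.normal(0.0, 0.05, size=(n_hosts, n_species))

# One cross-sectional snapshot per host, taken near equilibrium.
samples = simulate_glv(r, A, x0=np.full((n_hosts, n_species), 0.5))
pcor = partial_correlations(samples)
print(np.round(pcor, 2))
```

In this toy setting the competing pair should show a clearly negative partial correlation, while the entries involving the non-interacting species stay close to zero. No measurement noise is added here, which is the favourable case for inference.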

Overall, the study is dealing with a relevant problem, and it is technically well executed. However, I doubt whether its findings are all that surprising/novel. Most scientists would doubt that detailed mechanistic (ecological) insights could ever be drawn from cross-sectional correlation analysis alone, and I am not aware of studies that conclude about fine differences between directionalities and types of interactions from mere correlation data. More importantly, there are more advanced techniques such as Structural Equation Models or Bayesian Networks, which attempt to infer causality from cross-sectional community data, but are not tested by the authors.

Furthermore, one of their novel findings (namely that parasitic interactions can manifest in both positive as well as negative correlations) is perhaps somewhat unrealistic: negative correlations are only observed in setups where the positive effect of the host on the parasite is much smaller than the negative effect of the parasite on the host (cf. line 329 and following). This appears to be biologically unrealistic; most parasites have evolved to avoid overly straining the host … the above setup is somewhat difficult to perceive as biologically meaningful/stable.

Also, given that generic interaction types (positive and negative) are by-and-large well detected (Fig. 2 and 3A), the title of the study is perhaps unnecessarily pessimistic. A bit more nuance would perhaps be more helpful to the reader.

The parameter and design choices during the modelling are fairly arbitrary: the interaction networks are small and have an entirely random topology – meaning that important features such as interaction-hubs and -modules are not considered. The authors rarely motivate their design and parameter choices, which makes the interpretation of their observations difficult. The level of noise in the “high” noise settings seems large – in my opinion it is not clear whether such levels of measurement noise prevail in practice.

Their "strict inference" analysis (Fig. 4F) would be fairer if cases in which the true interaction weights are asymmetric, but the correlation analysis yields the “sign” of the interaction with the larger weight, were counted as “true positive” predictions. As it stands, the task cannot be solved, since the intended sign is not specified.

In addition to the concerns above, there are a number of minor points as below:

- “only a few samples at a time” repeatedly used, but misleading, since most studies have dozens or hundreds of samples (not to mention large scale projects with thousands)

- Figure reference in line 329 should probably be Fig. 2E/2F, since 2C/2D don’t show parasitism

- L 407: the authors use the term significant without statistically quantifying this statement

- L 472: but only in very rare cases (see Fig. 3A), this should be explicitly mentioned here

- L 480-81: where do the authors demonstrate that omnipresence of species is generally required? If they do, this should be referred to here

- L553: where do the authors show that positive interactions are more readily detectable than negative ones? (to the contrary, L359-60 reads: “There was no difference between the successful inference of positive interactions over negative interactions”)

- Fig. 1: may need re-arrangement (B comes before A)

Reviewer #2: Please find the review uploaded as a Word document.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: Review PCOMPBIOL-D-22-00053.docx

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010491.r003

Decision Letter 1

Kiran Raosaheb Patil

23 May 2022

Dear Dr. Pinto,

Thank you very much for submitting your manuscript "Species abundance correlations carry limited information about microbial network interactions" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

The reviewer has asked for several clarifications and corrections, all of which I agree with. Further to those comments, I would recommend shaping the Introduction and Discussion to reflect extensive studies that have rather successfully used metabolic models in conjunction with co-occurrence analyses (binary as well as higher-order): e.g. Freilich et al. Nat Communications, 2011 and Machado et al. Nat Ecol Evol, 2021. The results from these studies could be used to balance your discussion around the limits / benefits of correlational analyses.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Kiran Raosaheb Patil, Ph.D.

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The authors have improved, extended and clarified the manuscript. There are a few (minor) issues left, which if addressed could further improve the manuscript:

with regard to author response 2: “have not been systematically investigated against a ground truth” ... just for clarification: using mathematical models (gLV and others) as a ground truth has been done extensively in the literature (e.g. Weiss et al. (2016), also cited in the manuscript), albeit in variations that differ from what the authors did.

with regard to author response 3: “However, these techniques are also unable to recover directionality and type of interactions” ... SEMs/Bayesian networks do infer directed graphs (albeit with limitations), this sentence therefore seems misleading in the current form and requires either relevant references or proper explanation (similar to the response provided by the authors). Suggestion: “It is unclear, how well directed links predicted by these methods recover true ecological interaction types.”

with regard to author response 4: For clarification ... the given example constitutes not parasitism, but interference competition, where one species actively prevents another from obtaining resources used by both. The resulting interaction pattern would not be +/-, but rather -(weak)/-(strong), with one strong and one weak competitor. The original point stands: relationships in which one species simultaneously benefits from and harms another species are more likely to be stable if the exploitative benefit for species 1 outweighs the harm done to species 2. Observing unusual interactions with the opposite pattern, while low in number, should at least be briefly discussed, especially if they are used to support the central claim that interaction types are not detectable via correlation methods. As a side note, perhaps the +/- class of interactions could be more adequately labeled “exploitative”, as they also include non-parasitic interactions such as predation.

with regard to author response 5: [regarding the adjusted abstract]: “limited information about the underlying web of interactions” sounds too strong, as many important characteristics of interaction networks (interaction signs, hub detection, various graph metrics) are either well recovered by correlations or have not been investigated by the authors. Suggestion: "limited information on specific interaction types” or “cannot distinguish detailed interaction types”. The line “competitive interactions may result in positive as well as negative correlations” is slightly misleading as is: requires quantification, as only very small fractions for competitive, commensal and amensal interactions had minority signs. Especially since this is already stated in the text (L. 530), it should for transparency also be in the abstract.

with regard to author response 7: While the addition to the methods section helps clarity, it is easy to overlook. It would therefore be useful to also have a condensed reference to the parameterization challenges in the caveat section of the Discussion section.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010491.r005

Decision Letter 2

Kiran Raosaheb Patil

5 Aug 2022

Dear Ms. Pinto,

Thank you very much for submitting your manuscript "Species abundance correlations carry limited information about microbial network interactions" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

"...interactions are relatively rare among free-living bacteria and, if present, are often unidirectional. In contrast, Machado et al. (2021) suggested that" --> In contrast is not fully incorrect since free-living and host-associated communities have been shown to have different patterns. I think removing "in contrast" should fix it.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Kiran Raosaheb Patil, Ph.D.

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: In this second revision, the authors have extensively answered the remaining criticisms. They have strengthened the manuscript by including further references and clarifications. I can now support its publication.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: None

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

References:

Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010491.r007

Decision Letter 3

Kiran Raosaheb Patil

15 Aug 2022

Dear Ms. Pinto,

We are pleased to inform you that your manuscript 'Species abundance correlations carry limited information about microbial network interactions' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Kiran Raosaheb Patil, Ph.D.

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010491.r008

Acceptance letter

Kiran Raosaheb Patil

29 Aug 2022

PCOMPBIOL-D-22-00053R3

Species abundance correlations carry limited information about microbial network interactions

Dear Dr Pinto,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Co-existence in a two-species Lotka-Volterra model with self-limitation.

    Table A. Conditions for stable co-existence in the two-species Lotka-Volterra model. Fig A. Zero-growth isoclines (“null-clines”) in the two-species Lotka-Volterra model.

    (PDF)

    S1 Fig. Cartoon illustrating the different interaction mechanisms.

    (PDF)

    S2 Fig. The effect of process noise (W) on the within host population dynamics.

    (PDF)

    S3 Fig. Distributions of interaction strengths in three different scenarios.

    (PDF)

    S4 Fig. Network structures used in the different case studies.

    (PDF)

    S5 Fig. The effect of αij on the correlations between the abundances of two bacterial species for different interaction mechanisms.

    (PDF)

    S1 Table. Mann-Whitney U test results for the F1-scores of the base case model and for the F1-scores of the model with different sources of process variability.

    (PDF)

    S2 Table. Mann-Whitney U test results for the F1-scores of the samples taken during equilibrium (t5 in Fig 5) and for the F1-scores of the samples taken outside equilibrium.

    (PDF)

    Attachment

    Submitted filename: Review PCOMPBIOL-D-22-00053.docx

    Attachment

    Submitted filename: Rebuttal_PCOMPBIOL_26042022.docx

    Attachment

    Submitted filename: Rebuttal_PCOMPBIOL_10072022.docx

    Attachment

    Submitted filename: Rebuttal_PCOMPBIOL_12082022_2.docx

    Data Availability Statement

    All relevant code is available via GitHub (https://github.com/susannepinto/gLV_microbiome.git).


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
