Figure 5. Community metabolite trajectories cluster into qualitatively distinct groups which can be classified based on presence and absence of key microbial species.
(a) Schematic of experiment and network representing a minimal spanning tree across the 95 communities where weights (indicated by edge length) are equal to the Euclidean distance between the metabolite trajectories for each community. Node colors indicate clusters determined as described in the Materials and methods. Red node with black outline annotated with ‘25’ represents the 25-member community. Annotations indicate the most specific microbial species presence/absence rules that describe most data points in the cluster of the corresponding color as determined by a decision tree classifier (Materials and methods). Communities that deviate from the rules for their cluster are indicated with a border matching the color of the closest cluster whose rules they do follow. Network visualization generated using the draw_kamada_kawai function in networkx (v2.1) for Python 3. (b–g) Temporal changes in metabolite concentrations for communities within each cluster (indicated by sub-plot border color), with individual communities denoted by transparent lines. Solid lines and shaded regions represent the mean ±1 s.d. of all communities in the cluster. (h) Schematic of LSTM model training and computation of gradients to evaluate impact of species abundance on metabolite concentrations in a specific community context. (i) Heatmap of model M3 prediction accuracy for four metabolites in the 34 validation communities at each time point (Pearson correlation , N=34 for all tests). (j) Heatmap of the gradient analysis of model M3 as described in (h) for the full 25-species community. N and p-values are reported in Supplementary file 1.



