2022 Jun 23;11:e73870. doi: 10.7554/eLife.73870

Figure 2. The LSTM model can predict the temporal changes in species abundance in a 12-member synthetic human gut community in response to periodic dilution (passaging).

(a) Proposed LSTM modeling methodology for the dynamic prediction of species abundance in a microbial community. X represents a vector of species abundances. The initial abundance is the input to the first LSTM cell, whose output is trained to predict the abundance at the next time point. The predicted abundance then becomes the input to another LSTM cell with shared weights, which predicts the abundance at the subsequent time point. The process is repeated for every time point at which measurements are available. Thus, all predictions are forecasted from the abundance at time 0. (b) Scatter plot of measured (true) and predicted species abundance of a 12-member synthetic human gut community at 12 hr (N = 876, p-value = 2.44e-257). (c) Scatter plot of measured (true) and predicted abundance at 24 hr (p-value = 6.51e-257). (d) Scatter plot of measured (true) and predicted abundance at 36 hr (p-value = 7.42e-257). (e) Scatter plot of measured (true) and predicted abundance at 48 hr (p-value = 1.66e-227). (f) Scatter plot of measured (true) and predicted abundance at 60 hr (p-value = 3.39e-227).
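
The rollout described in panel (a) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the class `TinyLSTMCell`, the function `rollout`, the hidden size of 8, the linear readout from the hidden state to the abundance vector, and the random, untrained weights are all assumptions made for illustration. The sketch shows only the structure of the method: one set of weights is reused at every step, and each prediction is fed back as the next input, so the whole trajectory is forecast from the abundance at time 0.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    # Matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

class TinyLSTMCell:
    """Hypothetical LSTM cell with a linear readout (random, untrained weights)."""

    GATES = ("input", "forget", "output", "cand")

    def __init__(self, n_species, n_hidden, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        n_in = n_species + n_hidden
        self.n_hidden = n_hidden
        self.W = {g: mat(n_hidden, n_in) for g in self.GATES}
        self.b = {g: [0.0] * n_hidden for g in self.GATES}
        self.W_out = mat(n_species, n_hidden)  # hidden state -> abundance vector

    def step(self, x, h, c):
        z = x + h  # concatenate the abundance vector and the hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["cand"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        x_next = matvec(self.W_out, h_new)  # predicted abundance at the next time point
        return x_next, h_new, c_new

def rollout(cell, x0, n_steps):
    """Forecast n_steps time points from the initial abundance x0 only."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    x = list(x0)
    trajectory = []
    for _ in range(n_steps):
        # The same cell (shared weights) is applied at every step, and its
        # prediction becomes the input for the next step.
        x, h, c = cell.step(x, h, c)
        trajectory.append(x)
    return trajectory

cell = TinyLSTMCell(n_species=12, n_hidden=8)
x0 = [1.0 / 12] * 12                 # illustrative initial abundances, not real data
traj = rollout(cell, x0, n_steps=5)  # five steps, e.g. 12, 24, 36, 48, and 60 hr
```

Because the weights are untrained, the numbers in `traj` carry no biological meaning; in a trained model, the loss would compare each entry of `traj` with the measured abundance at the corresponding time point.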


Figure 2—figure supplement 1. Prediction of temporal changes of species abundance for a few representative communities by the LSTM network.


(a) Histogram of prediction R² scores on the test set. The prediction R² score for each community is computed between the measured (true) and predicted abundances of all species in that community at all time points. Prediction of individual species abundance in communities that display (b) accurate predictions (11-member community), (c) predictions close to the median (3-member community), and (d) poor predictions (4-member community). We note the difference in abundance scales for some species such as BH and BO. Despite the order-of-magnitude difference in the scales, our proposed LSTM network predicts species abundance well and consistently across all species in (a). This is primarily due to feature standardization during both training and inference of the LSTM network.
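
The two computations mentioned in this caption, a per-community R² score pooled over all species and time points, and feature standardization, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the R² score is the coefficient of determination and that standardization means subtracting a per-species mean and dividing by a per-species standard deviation estimated on the training set. The function names and all numbers are hypothetical.

```python
def r2_score(true, pred):
    # Coefficient of determination: 1 - SS_res / SS_tot (assumed definition).
    mean_true = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

def community_r2(true_traj, pred_traj):
    # Pool all species at all time points of one community into flat lists.
    true_flat = [v for time_point in true_traj for v in time_point]
    pred_flat = [v for time_point in pred_traj for v in time_point]
    return r2_score(true_flat, pred_flat)

def fit_standardizer(train_rows):
    # Per-species mean and standard deviation, estimated on training data only.
    n = len(train_rows)
    cols = list(zip(*train_rows))
    means = [sum(col) / n for col in cols]
    stds = [(sum((v - m) ** 2 for v in col) / n) ** 0.5 or 1.0
            for col, m in zip(cols, means)]
    return means, stds

def standardize(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def unstandardize(row, means, stds):
    # Map model outputs back to the original abundance scale.
    return [v * s + m for v, m, s in zip(row, means, stds)]

# Two hypothetical species whose abundances differ by an order of magnitude.
train_rows = [[0.01, 0.1], [0.03, 0.3]]
means, stds = fit_standardizer(train_rows)
scaled = [standardize(row, means, stds) for row in train_rows]

# Hypothetical measured and predicted trajectories (2 time points, 2 species).
true_traj = [[0.01, 0.1], [0.03, 0.3]]
pred_traj = [[0.012, 0.11], [0.028, 0.29]]
score = community_r2(true_traj, pred_traj)
```

After standardization both species span the same range in `scaled`, so neither dominates the training loss merely because of its scale. The same `means` and `stds` would be reused at inference time, and `unstandardize` would return predictions to the original scale before scoring.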