Skip to main content
PLOS Computational Biology logoLink to PLOS Computational Biology
. 2020 Sep 29;16(9):e1007786. doi: 10.1371/journal.pcbi.1007786

Minimizing the number of optimizations for efficient community dynamic flux balance analysis

James D Brunner 1,*, Nicholas Chia 1
Editor: Pedro Mendes2
PMCID: PMC7546477  PMID: 32991583

Abstract

Dynamic flux balance analysis uses a quasi-steady state assumption to calculate an organism’s metabolic activity at each time-step of a dynamic simulation, using the well-known technique of flux balance analysis. For microbial communities, this calculation is especially costly and involves solving a linear constrained optimization problem for each member of the community at each time step. However, this is unnecessary and inefficient, as prior solutions can be used to inform future time steps. Here, we show that a basis for the space of internal fluxes can be chosen for each microbe in a community and this basis can be used to simulate forward by solving a relatively inexpensive system of linear equations at most time steps. We can use this solution as long as the resulting metabolic activity remains within the optimization problem’s constraints (i.e. the solution to the linear system of equations remains a feasible to the linear program). As the solution becomes infeasible, it first becomes a feasible but degenerate solution to the optimization problem, and we can solve a different but related optimization problem to choose an appropriate basis to continue forward simulation. We demonstrate the efficiency and robustness of our method by comparing with currently used methods on a four species community, and show that our method requires at least 91% fewer optimizations to be solved. For reproducibility, we prototyped the method using Python. Source code is available at https://github.com/jdbrunner/surfin_fba.

Author summary

The standard methods in the field for dynamic flux balance analysis (FBA) carry a prohibitively high computational cost because it requires solving a linear optimization problem at each time-step. We have developed a novel method for producing solutions to this dynamical system which greatly reduces the number of optimization problems that must be solved. We prove mathematically that we can solve the optimization problem once and simulate the system forward as an ordinary differential equation (ODE) for some time interval, and solutions to this ODE provide solutions to the optimization problem. Eventually, the system reaches an easily check-able condition which implies that another optimization problem must be solved. We compare our method against typically used methods for dynamic FBA to validate that it provides equivalent solutions while requiring fewer linear-program solutions.


This is a PLOS Computational Biology Methods paper.

Introduction

Microbial communities and human health

The makeup of microbial communities is often complex, dynamic, and hard to predict. However, microbial community structure has a profound effect on human health and disease [17]. These two facts have led to significant interest in mathematical models which can predict relative abundances among microbes in a community. Various dynamical models have been proposed to explain and predict microbial community population dynamics [812]. Among these are models which propose that interactions between species are mediated by the metabolites that each species produces and consumes [13, 14], and there is significant evidence that these models perform better than models which depend on direct interaction between species [15, 16].

Recently, advances in genetic sequencing have allowed the creation of genome-scale models (GEMs) that reflect the internal network of cellular metabolism, and can therefore be used to predict metabolite use and production [1719]. This technique can be extended to microbial community modeling by combining GEMs of different species. There has been significant interest in using GEMs to predict relative populations of stable microbial communities [2026]. Community metabolic modeling can not only predict relative populations, but also holds the potential to predict and explain the community metabolite yield, which can have a profound effect on health [4]. Furthermore, model repositories such as the online bacterial bioinformatics resource PATRIC [27] or the BiGG model database [28] make it possible to build community models using information from individual species investigations.

GEMs can be used to predict microbial growth rates as well as metabolite consumption and production rates using a process called flux balance analysis (FBA). Because these predictions appear in the form of rates of change, they can be used to define a metabolite mediated dynamical model simply by taking as a vector field the rates of change predicted by FBA. We can therefore combine the techniques of metabolite mediated dynamic modeling and community metabolic modeling to produce dynamic predictions of microbial community population size and metabolite yield. This strategy is called dynamic FBA [2931], and has recently been used to model microbial communities [3234].

Dynamic FBA, when implemented naïvely, requires a linear optimization problem to be repeatedly solved, and carries a high computational cost for even small communities. Furthermore, in silico experiments may need to be repeated many times over various environmental conditions or using various parameter choices in order to make robust conclusions or to accurately fit model parameters. As a result, implementations of dynamic FBA which depend on optimization at every time-step carry a prohibitively high computational cost when used to simulate larger microbial communities. The implementation of dynamic FBA in the popular COBRA toolbox software package [17] is done in this way, and essentially all more efficient available tools for simulating dynamic FBA fundamentally use an ODE solver approach with optimization at each time-step [24, 31, 3538]. Dynamic FBA can be improved by taking advantage of the linear structure of the optimization problem which provides a choice of basis for an optimal solution that may be reused at future time-steps [39, 40]. However, the optimizations that are required by this strategy involve solutions with non-unique bases. This means that a basis chosen at random may not provide an optimal solution to the linear program at future time-steps because it provides a solution that is non-optimal or infeasible.

In order to implement dynamic FBA without optimizing at each time step, we use an optimal basic set for the FBA linear optimization problem to create a system of linear equations whose solutions at future time-steps coincide with the solutions to the FBA optimization problem. To solve the problem of non-uniqueness among bases, we prove that there exists a choice of basis that allows forward simulation for a given optimal flux solution and provide a method to choose this basis. Note that this method does not choose among a set of non-unique optimal flux solutions, but instead chooses a basis for a single given optimum. To choose among multiple optimal flux solutions, biological, rather than mathematical, considerations should be used.

In this manuscript, we detail how dynamic FBA can be simulated forward without re-optimization for some time interval, and give a method for doing so. We propose conditions on an optimal basic set for the FBA linear optimization problem which allows for forward simulation, and we prove that such a choice exists. We then detail how to choose this basis set, and finally give examples of simulations which demonstrate the power of our method. For reproducibility, we make a prototype implementation of our method in the Python language available at https://github.com/jdbrunner/surfin_fba.

Background

Flux balance analysis

With the advent of genetic sequencing and the resulting genome scale reconstruction of metabolic pathways, methods have been developed to analyze and draw insight from such large scale models [18]. To enable computation of relevant model outcomes, constraint based reconstruction and analysis (COBRA) is used to model steady state fluxes vi through a microorganism’s internal metabolic reactions under physically relevant constraints [18]. One of the most basic COBRA methods, called flux balance analysis (FBA) optimizes some combination of reaction fluxes ∑γivi which correspond to increased cellular biomass, subject to the constraint that the cell’s internal metabolism is at equilibrium:

Γv=0 (1)

where Γ is the stoichiometric matrix, a matrix describing the stoichiometry of the metabolic model.

This optimization is chosen because it reflects the optimization carried out by nature through evolution [18]. The vector γ = (γ1, γ2, …, γd) is an encoding of cellular objectives, reflecting the belief that the cell will be optimized to carry out these objectives. The constraint Eq (1) means that any optimal set of fluxes found by FBA corresponds to a steady state of the classical model of chemical reaction networks [41]. This reflects the assumption that the cell will approach an internal chemical equilibrium.

The optimization is done over a polytope of feasible solutions defined by the inequalities vi,minvivi,max, or possibly more complicated linear constraints. See Fig 1 for a geometric representation of an example of the type of linear optimization problem that is carried out. By convention, forward and reverse reactions are not separated and so negative flux is allowed. Linear optimization problems like FBA often give rise to an infinite set of optimal flux vectors v = (v1, v2, …, vd). Geometrically, this set will correspond to some face of the polytope of feasible solutions. To draw conclusions despite this limitation, many methods have been developed to either characterize the set of optimal solutions, as with flux variability analysis (FVA), or enforce more constraints on the network to reduce the size of this set, as with loopless FVA [18].

Fig 1. Geometric representation of Example 1 for t3 > t2 > t1 > 0, showing the three options for bases which are equivalent at t = 0.

Fig 1

Note that the best choice depends on the function c(t) = (10, 10, 30 − t) and cannot be chosen using the static problem alone. The feasible region of the optimization problem is shown in gray.

Dynamic FBA

FBA provides a rate of increase of biomass which can be interpreted as a growth rate for a cell. Furthermore, a subset of the reactions of a GEM represent metabolite exchange between the cell and its environment. By interpreting constraints on nutrient exchange reactions within the metabolic network as functions of the available external metabolites and fluxes of exchange reactions as metabolite exchange rates between the cell and its environment, the coupled system can be modeled. The simplest way to do this is to use an Euler method, as in [30].

In addition to Euler’s method, more sophisticated ODE solvers may be used in the so-called “direct” method of simply recomputing the FBA optimization at every time-step. This can provide better solution accuracy and potentially larger time-steps, but may also require more than one FBA optimization at each time-step. For instance, the Runge-Kutta fourth order method [42] requires four FBA solutions at each time step. Direct methods are implemented in the COBRA toolbox [17] and are the central algorithm in many modern tools, including those of Zhuang et al. [31, 35], Harcombe et al. [36], Zomorrodi et al. [24], Louca and Doebeli [37], and Popp and Centler [38]. Notably, any direct method requires at least one complete recalculation of the network fluxes at each time-step.

However, resolving the system at each time step is not necessary, as the solution the optimization problem at some initial time can actually be used to compute future optimal solutions. Höffner et al., [40], used this observation to introduce a variable step-size method for dynamic FBA. In that method a basic index set is chosen by adding biological constraints to the optimization problem hierarchically until a unique optimal flux vector is found. The challenge of such an approach is in choosing the basis for the optimal solution, as the optimal basis is not guaranteed to be unique even for a unique optimal flux solution. In fact, due to the nature of the method of Höffner et al. and of our method, any optimization past the initial solution that must be carried out is guaranteed to have a solution with a non-unique basis. Furthermore, many choices of optimal basis will not provide a solution for future time-steps, so that choosing among these bases must be done intelligently. Unfortunately, Höffner et al. [40] do not provide a method for choosing among non-unique bases for a single linear program solution.

Our method seeks to solve this problem by choosing a basis which is most likely to remain optimal as simulation proceeds forward from the possibilities provided by an FBA solution. We therefore prioritize reducing the number of times the linear program must be solved, choosing our basis based on the mathematical properties of the system which gives the best chance of providing a solution at future time-steps.

Additionally, a method described as the “dynamic optimization approach” was introduced in Mahadevan et al., [29], however this method is computationally expensive. In particular, the method given in [29] involves optimizing over the entire time-course simulated, and so is formulated as a non-linear program which only needs to be solved once. While this method requires only one optimization, this optimization is itself prohibitively difficult due to the dimensionality of the problem growing with the fineness of time-discretization.

The dynamic FBA model for communities

We can write a metabolite mediated model for the population dynamics of a community of organisms x = (x1, …, xp) on a medium composed of nutrients y = (y1, …, ym):

x˙i=gi(ψi(y))xi (2)
y˙j=-i=1pψij(y)xi (3)

where ψi is a vector of the fluxes of nutrient exchange reactions for organism xi as determined by FBA. Using FBA to determine ψi is therefore a quasi-steady state assumption on the internal metabolism of the organisms xi [4345].

Recall that the basic assumption of flux balance analysis is that, given a matrix Γi that gives the stoichiometry of the network of reactions in a cell of organism xi that growth gi(y) is the maximum determined by solving the following linear program [18]:

{max(vi·γi)Γivi=0ci1vci2(y)} (4)

where ci1 is some vector of lower flux bounds while ci2(y) is some vector-valued function of the available metabolites which represents upper flux bounds. The key observation allowing dynamic FBA is that the optimal solution to this problem also determines ψi simply by taking ψij to be the value of the flux vij of the appropriate metabolite exchange reaction. For clarity, we will relabel the elements of vi so that ψik = vij if vij is the kth exchange flux, and ϕik = vij if vij is the kth internal flux. The objective vector γi indicates which reactions within the cell contribute directly to cellular biomass, and so is non-zero only in elements corresponding to internal fluxes. We can therefore rewrite this vector to include only elements corresponding to internal fluxes, so that the objective of the optimization is to maximize γiϕi.

The stoichiometry of metabolite exchange reactions is represented by standard basis vectors [18]. Therefore, we can partition Γi as

Γi=[I-Γi*0Γi] (5)

where I is the identity matrix of appropriate size, and Γi* and Γi contain the stoichiometry of the internal reactions [18, 46, 47]. Making this change in notation allows us to see that the optimization problem of flux balance analysis is essentially internal to the cell, with external reactions providing constraints.

We can see from Eq (5) that ker(Γi) is isomorphic to ker(Γi), and so we can maximize over this kernel. Then, the exchange reaction fluxes are determined by the internal fluxes according to the linear mapping ψi=Γi*ϕi. The maximization of FBA becomes a maximization problem over the internal fluxes. We rewrite Eq (4) using Eq (5) and combine with Eqs (2) and (3) to form the differential algebraic system

dxidt=xi(γi·ϕi) (6)
dydt=-ixiΓi*ϕi (7)
{max(ϕi·γi)Γiϕi=0ci1[Γi*I]ϕici2(y)} (8)

where each ϕi is determined by the optimization Eq (8), all carried out separately. Note that this is a metabolite mediated model of community growth as defined in [15]. That is, the coupling of the growth of the separate microbes is due to the shared pool of metabolites y. Each separate optimization which determines ϕi at a single time-step depends on y, and each ϕi determines some change in y. Furthermore, each optimization is carried out in a manner that depends only the status of the metabolite pool and is independent from the optimizations of other organisms. There is therefore no shared “community objective”. Instead, each organism optimizes according to only its own internal objective.

We write, for full generality, upper and lower dynamic bounds on internal and exchange reactions, and assume that each function cij(y) ∈ C. We let

Ai=[(Γi*)T,-(Γi*)T,I,-I,]T (9)

so that we can rewrite the optimization problem Eq (8) as

{max(ϕi·γi)Aiϕici(y,t)Γiϕi=0} (10)

for ease of notation.

We now hope to select a basic index set Ii for Eq (10) for each organism xi so that each ϕi(t) is a solution to the resulting linear system of equations.

Methods

Linear optimization preliminaries

In this manuscript, we will rewrite the FBA optimization problem in the form

{max(ϕ·γ)AϕcΓϕ=0} (11)

where the matrices A and Γ are derived from the stoichiometric matrix and flux constraints. Such a problem is often referred to as a linear program (LP). We now recall some well known results from the study of linear programming (see, for example [40, 48]).

First, we note that Eq (11) can be rewritten in the so-called standard form with the addition of slack variables s = (s1, …, sn) which represent the distance each of the n constraints is from its bound as follows:

{max(ϕ˜·γ˜)[A˜I][ϕ˜s]=cϕ˜i0,si0}. (12)

Standard form requires that we rewrite ϕi=ϕi+ϕi and then define ϕ~=(ϕ1+,ϕ2+,...,ϕd+,ϕ1,ϕ2,...,ϕd) so that we require non-negativity of each variable, and the matrix A~=[AB], B = −A. We rewrite the problem in this form to make use of established results, and for ease of notation will write ϕ instead of ϕ~ when it is clear which form of the problem we are discussing.

We will make use of the well-known result that there exists an optimal basis or basic set for a bounded linear program [49]. To state this result, we first define the notation BJ to be the matrix with columns of [A~I] corresponding to some index set {k1,k2,...,kn}=J, and if BJ is invertible we define the notation wJ(a) so that

(wJ(a))l={(BI-1a)jl=kjJ0lJ (13)

for any an. We may now define a basic optimal solution and corresponding basic index set.

Definition 1 A basic optimal solution to a linear program is an optimal solution along with some index set {k1,k2,...,kn}=I such that w=wI(c), where c is the vector of constraints as in Eq (12). The variables {wi|iI} are referred to as basic variables, and the index set I is referred to as the basic index set.

Finally, if there exists a bounded, optimal solution to Eq (12), then there exists a basic optimal solution and corresponding basic index set.

For a given basic optimal solution vector w, there may be more than one basic index set I such that w=wI(b). Such a solution is called degenerate. Clearly a necessary condition for such non-uniqueness is that there exists some kI such that wk = 0. This is also a sufficient condition as long as there is some column of [A~I] which is not in the column space of BI\{k}.

Forward simulation without re-solving

Consider again Eq (10), the linear program that must be solved at each time point of the dynamical system for each microbial population. Information from prior solutions can inform future time-steps as long as the region of feasible solutions has not qualitatively changed. Thus, we may only need to solve the optimization problem a few times over the course of a simulation. The key observation making this possible is that the simplex method of solving a linear program provides an optimal basis for the solution. We may often re-use this basis for future time-steps within some time interval, and therefore find optimal solutions without re-solving the linear program.

In order to do this, we need to find a form of the solution which may be evolved in time. Thus, we turn the system of linear inequalities given in the linear program into a system of linear equations. Then, if this system has a unique solution we have reduced the task to solving a system of equations rather than optimizing over a system of inequalities. We can find such a system of equations by solving the linear program once, and using this solution to create a system of equations whose solution provides the optimal flux ϕi using a basic index set. We then use this same system to simulate forward without the need to re-solve the optimization problem until the solution to the system of equations is no longer a feasible solution to the linear program.

First, the linear program Eq (10) is transformed into standard form (Eq (12)). Then, a basic optimal solution is found with corresponding basic index set Ii. The dynamical system Eqs (6), (7) and (10) can then be evolved in time using Eq (13). This evolution is accurate until some wij becomes negative (meaning that the solution is no longer a feasible solution to the linear program). At this point, a new basis must be chosen. That is, until wIi(c(t)) becomes infeasible, we let (ϕj1(ci(t)),...,ϕjm(ci(t)),s1(ci(t)),...,sn(ci(t)))=wIi(ci(t)) and replace Eqs (6), (7) and (10) with

dxidt=xi(γi·ϕi(ci(t))) (14)
dydt=-ixiΓi*ϕi(ci(t)) (15)

One major difficulty in this technique is that a unique wi does not guarantee a unique basis set Ii. If we have some (wIi)j=0 for jIi, then there exists some alternate set I^i such that wI^i=wIi. Such a solution wIi is called degenerate. In a static implementation of a linear program, the choice of basis of a degenerate solution is not important, as one is interested in the optimal vector and optimal value. However, as we will demonstrate with 1, the choice of basis of a degenerate solution is important in a dynamic problem. In fact, if the system given in Eqs (14) and (15) is evolved forward until wIi(ci(t)) becomes infeasible, the time at which the system becomes infeasible is the time at which we have some (wIi)j=0 for jIi. Thus, we need to resolve Eq (10) whenever wIi(ci(t)) becomes degenerate, which will be the final time-point at which the wIi(ci(t)) is feasible.

Example 1 Consider the dynamic linear program

{max((1,1)·v)[100112]v[101030-t]vi0} (16)

In standard form at t = 0, this linear program becomes

{max((1,1)·v)[101000101012001][vs]=[101030]vi,si0} (17)

which has the unique solution w = (10, 10, 0, 0, 0). There are three choices of basic index sets: I1={1,2,3}, I2={1,2,4}, and I3={1,2,5}. The resulting bases are

BI1=[101010120]BI2=[100011120]BI3=[100010121]

Computing Eq (13) at t > 0 for each, we have that BI1 yields wI1(c(t))=(10t,10,t,0,0), BI2 yields wI2(c(t))=(10,10t2,0,t2,0), and BI3 yields wI3(c(t))=(10,10,0,0,t), shown in Fig 1 for t > 0. Thus, only wI2(c(t)) solves the dynamic problem because wI1(c(t)) is not optimal and wI3(c(t)) is not feasible for t > 0. We may follow wI2 and be insured of remaining at an optimal solution to the linear program until t = 20 + ε, at which point wI2=(10,ε/2,0,10,0), which is not a feasible solution to the linear program. At time t = 20, a re-optimization is required to choose a new basis.

Notice that the correct choice of basis fundamentally depends on the time-varying bound function c(t) = (10, 10, 30 − t). To see this, consider other possible time-varying bounds c(t) which have c(0) = (10, 10, 30). For example, if c(t) = (10 − t, 10 − t, 30), then only BI3 would give the correct w(c(t)) for t > 0.

A basis for the flux vector

We now provide a method to choose a basis Ii for each organism xi in the case of a degenerate solution. Consider an optimal solution wi to the linear program Eq (12). To simulate forward according to Eqs (14) and (15), we need for each organism xi a basic index set Ii such that

{w˙i=wIi(ddtci)[A˜I]w˙=ddtci(wIi)j=0w˙ij0} (18)

so that the solution remains feasible, and furthermore that w˙i is optimal over the possible choice of basic index sets for wi. This is obviously a necessary condition for forward simulation within some non-empty time interval, and can be made sufficient (although no longer necessary) by making the inequality (wIi)j=0w˙ij0 strict. We use the relaxed condition for more practical applicability.

In order to develop a method based on the above observation (i.e., Eq (18)), we must know that Eq (12) has such a solution. We therefore require the following lemma, which is proved in S1 Text:

Lemma 1 For a linear program with the form given in Eq (12) with a basic optimal solution w, there exists a basic index set I such that Eq (18) holds and w˙ is optimal over the possible choice of basic index sets for w.

If Eq (12) has only a non-degenerate solution, the unique basis will satisfy this requirement. The challenge remains to choose from among the possible bases of a degenerate solution.

To do this, we form a second linear program analogous to Eq (18) in the following way. We first find all constraints aij (i.e. rows of Ai or Γi) such that aijϕi = cij(t), calling this set Si. Note that this set contains all the rows of Γi, for which we regard cij(t) = 0 for all t > 0. Note that if the solution given is a basic optimal solution, the rank of the matrix whose rows are aij for aijSi is d, where again d is the number of internal fluxes. This is true because we include constraints of the type a < ϕij < b as rows of Ai.

Then, we solve the linear program

{max(w˙i·γi)aj·ϕ˙idcijdt,ajSi} (19)

We may then use any basis BIi which solves Eq (19) as long as it has exactly d non-basic slack variables. Lemma 1 tells us that such a choice exists, although it may be necessary to manually pivot non-slack variables into the basis set given by the numerical solver. In testing the algorithm, this was necessary when using IBM ILOG CPLEX Optimization Studio to solve, but not when using The Gurobi Optimizer. Note that we do not need the entire basis BIi, but instead only need the d × d submatrix formed by rows of Ai or Γi which correspond to non-basic slack variables in the solution to Eq (19). These appear as rows (ai, 0) in BIi, and so this sub-matrix uniquely determines ϕi. We call this smaller matrix Bi, and label the set of row indices as J.

The chosen basis J and corresponding constraints are used to simulate forward until that particular solution becomes infeasible. At that time, we have an optimal solution to Eq (10) simply by continuity. We therefore do not need to resolve Eq (10) but instead re-form and solve Eq (19).

Pseudo-Code of the method

Below, we present as pseudo-code an outline of the method. A practical implication may need to adaptively adjust the time-step Δt to insure that no resource is artificially over-depleted past 0.

Algorithm 1: Dynamic FBA algorithm following Lemma 1. Note that for numerical stability and speed, we may store the matrices Qi, Ri such that QiRi = Bi is the QR-factorization of Bi rather than either storing Bi1 or solving completely during each time step of numerical integration.

Input: Final time T, initial microbial biomasses xi(0), initial nutrient concentrations yj(0), maximum inflow rates of nutrients αi, stoichiometric matrices Γi

OutPut: Timecourse simulation of biomass and nutrient concentrations

1 for each microbial population i do

2  Set wi(0) to be solution to Eq (13) which lies on a vertex of the feasible polytope.;

3  Solve Eq (21) to find initial basis Bi

4 end

5 while t < T do

6  Integrate Eqs (14) and (15) from t to t + Δt with ϕi=Bi1cJ(y(t),t);

7if Bi1cJ(y(t+Δt),t+Δt) is not a feasible solution then

8   reset xi = xi(t), yj = yj(t);

9   Solve Eq (21) to find new basis Bi, with additional constraints representing bounds violated by Bi1cJ(y(t),t).

10end

11 end

Results

Number of optimizations

We can compare the efficiency of Algorithm 1 with modern dynamic FBA methods by counting the number of times a large linear program must be carried out over the course of a simulation. At their core, state-of-art dynamic FBA tools such as d-OptCom [24] and COMETS [36] employ the direct method of calling an ODE-solving method with the linear program set as the right-hand-side. In the case of Euler’s method, the resulting ODE can be integrated by hand between time-steps. This last strategy is often referred to as the “static optimization approach” [40].

We compared simulation of various combinations of the organisms Escherichia coli str. K-12 substr. MG1655 (model iJR904), Saccharomyces cerevisiae S288C (model iND705), Pseudomonas putida KT2440 (model iJN746) and Mycobacterium tuberculosis H37Rv (model iEK1008), using models from the BiGG database [28] (see S2 Table for details). We counted the optimizations required for our model, as well as for direct methods using the numerical ODE solvers vode, zvode, lsoda, dopri5, and dop853 from the SciPy library. All of these numerical ODE solvers use adaptive step sizes for accuracy and stability, and so represent optimized choices of time-steps. Additionally, we compared the method of Höffner et al. as implemented in the MatLab package DFBAlab [39]. The number of re-optimizations required for each simulation is shown in Table 1 and the time-point of each re-optimization that was carried out is shown in Fig 2.

Table 1. Number of realizations required to simulate to time t = 5 with no cell death or metabolite flow, using M9 minimal medium.

*Simulation failed at t = 3.034277.

Model Combination Solution Method
Algorithm 1 Höffner vode zvode lsoda dopri5 dop853
iJR904 7 1 62 62 116 3313 6228
iND750 4 1 91 91 85 3508 6514
iJN746 4 13 166 167 376 1176 2249
iEK1008 4 4 120 120 208 2768 5148
iJR904 + iND750 4 24 240 211 346 5586 10469
iJR904 + iJN746 30 479 420 420 744 2695 5579
iJR904 + iEK1008 20 136 216 216 454 3385 6411
iND750 + iEK1008 8 32 311 311 509 5284 9888
iJR904 + iND750 + iEK1008 18 32* 451 451 1282 6225 11961
iJR904 + iND750 + iJN746 + iEK1008 56 672 1122 1122 2242 6837 13529

Fig 2. Time-points of re-optimizations required in simulations using the proposed method, the method of Höffner et al. [40] and various direct methods, shown in blue.

Fig 2

Shown in orange are times at which the direct method solver encountered an infeasible linear program due to numerical error.

For our method and the direct method, we allowed exchange of every metabolite detailed in S1 Table with initial metabolite concentrations given by that same file, and with initial biomass of 0.3 for each species. The file sim_comm.py in the supplementary repository https://github.com/jdbrunner/surfin_fba contains complete simulation set-up.

To compare with the method of Höffner et al. [40], we use the newly available Python package from the research group of Dr. David Tourigny titled dynamic-fba [50] for single organisms. This package allows simulation without secondary optimizations, as our does, and so is more similar to our prototype tool for comparison. Unfortunately, this package is currently only able to simulate single organisms at the time of publishing. For microbial communities, we can compare with the MatLab package DFBAlab [39] which requires all dynamics variables to be optimized in a secondary optimization. For simulations with DFBAlab, we use only the low-concentration metabolites D-glucose, oxygen, and cob(I)alamin from the M9 medium detailed in S1 Table as dynamically varying metabolites. It is worth noting that these are the most favorable conditions we could find for the method of Höffner [39, 40] et al. which are still biologically equivalent to our other simulations.

Error estimation

Our method provides much less theoretical error in dynamic FBA solutions than traditional methods. In fact, Algorithm 1 implies that a simulation of a microbial community can be divided into time intervals on which the algorithm is exact. Of course, this assumes that the linear ODE solved in these intervals is solved exactly rather than numerically.

Precisely, there exits some sequence t0 = 0 < t1 < ⋯ < tn−1 < tn = T such that if we know the optimal flux vectors wi(tl) at time tl, then Lemma 1 implies the existence of a set of invertible matrices Bil such that solutions to Eqs (14) and (15) are solutions to Eqs (6), (7) and (10) for t ∈ [tl, tl+1]. Therefore, if we are able to identify the tl exactly, then Algorithm 1 provides exact solutions to the dynamic FBA problem Eqs (6), (7) and (10). Of course, numerical limitations imply that we will not re-optimize precisely at each tl, and so we must investigate the impact of this error. However, once re-optimization is done, the method is again exact. The result is that we have no local truncation error for any time step taken between re-optimization after tl and the interval endpoint tl+1, except for error due to numerical integration. In comparison, direct methods provide some integration error at every time step. This error depends on the integration strategy used, and so for example the Euler’s method based static optimization approach carries first order local truncation error at each time step. This can easily lead to ODE overshoot and infeasible linear programs at future time-steps.

Assume that tl−1 is known exactly, and N is such that

t1=tl-1+(N-1)Δttl<tl-1+NΔt=t2,

so that there is some possible error in the interval [t1, t2]. We can estimate the accumulated error in this time interval using a power series expansion. Let x(t), y(t) be solutions to Eqs (6), (7) and (10) and x~,y~ be solutions given by Algorithm 1 for t ∈ [t1, t2). Furthermore, let Bil1 be the invertible matrices derived by solving Eq (10) at tl−1 and Bil those derived by solving at tl. Then, x(t1)=x~(t1) and y(t1)=y~(t1). For each xi we expand, assuming some regularity of the functions c(y),

xi(t2)-x˜i(t2)=(Δt)xi(t1)(γi·((Bil-1)-1-(Bil-1)-1)c^i(y(t1))+o(Δt) (20)

and see that this method gives first order local error in time steps that require a re-optimization.

The local error, while first order, only appears at time steps in which a re-optimization occurred, and so global error will scale with the number of necessary re-optimizations. This is in contrast with the classical use of Euler’s method, which gives first order local error at every time-step, or any other direct ODE method, whose error is dependent on the solver used.

We may compare the solutions provided by direct methods with those provided by the method presented in Algorithm 1 and by the method of Höffner et al. [40]. The root-sum-square (l2) difference in results are shown in Table 2, and example simulations are shown in Fig 3. As we argue above, direct methods are less accurate in theory that the algorithm presented in Algorithm 1. Furthermore, direct simulations routinely failed to simulate to time t = 5 without encountering an infeasible linear program. This infeasibility is the result of numerical error accumulating throughout the simulation. The comparisons in Table 2 can be summarized by three distinct characteristics. First, in the case of S.cerevisiae, the direct methods agree well with the newly presented method. Secondly, in the case of E.coli and M.tuberculosis, error seems to begin accumulating immediately. Finally, in the case of P.putida, the simulations agree well up to some time-point at which the direct method fails and either quits entirely (as in the case of the dopri5 solver which returns small error) or continues at a constant value.

Table 2. l2 difference in solutions to single-organism simulations between direct methods and the method presented in Algorithm 1.

vode zvode lsoda dopri5 dop853 Hoffner et al.
E.coli 5.09933 5.09933 4.61467 5.09928 5.09928 4.68578
M.tuberculosis 1.45401 1.45401 1.45417 1.45415 1.45415 2.48691
S.cerevisiae 0.00426 0.00426 0.00430 0.00429 0.00429 3.06105
P.putida 15.29177 15.29177 0.07080 15.23826 15.26221 4.78751

Fig 3. Simulations of E.coli, S.cerevisae, M.tuberculosis and P.putida using Algorithm 1, direct solvers, and the method of Höffner et al.

Fig 3

In simulations of E.coli M.tuberculosis, there is discrepancy early in the simulation. In contrast, simulations of P.putida agree up to the point that an ODE solver fails.

We note that discrepancies in dynamic FBA simulation may not always be due to numerical error, but instead due to non-uniqueness in optimal flux solutions. Our method provides a strategy for choosing between non-unique representations (in the form of a basis) of a single optimal flux solution. The method of Höffner et al. [40] provides a lexicographic strategy for choosing between non-unique optimal flux solutions based on biological, rather than mathematical, considerations. We note that for complete reproducibility, our method should be integrated with some biologically based strategy for choosing between non-unique optima.

Examples & applications

There has been a recent surge in interest in modeling microbial communities using genome-scale metabolic models, much of which has focused on equilibrium methods [4, 21, 22, 26, 51]. In order to capture transient behavior and dynamic responses to stimuli, dynamic FBA has also been applied to microbial communities [24, 34, 52]. However, community dynamic FBA invariably leads to a large dynamical system with a high-dimensional parameter space, often with little to no knowledge of parameter values. Any parameter fitting therefore requires repeated numerical simulation of the system. Existing tools to do this are built around a direct simulation approach, requiring many linear program solutions. By drastically reducing the number of optimizations required for numerical simulation, our approach offers the promise of efficient numerical simulation of dynamic FBA which will make parameter fitting more tractable, and may even allow conclusions without well-fit parameters.

Below, we demonstrate that the problem of parameter fitting is an important one by showing that experimental outcome in even small communities is sensitive to changes in kinetic parameters. Precisely, the kinetic parameters governing the uptake rate of nutrients (i.e., the parameters of the functions ci2 in Eq (4)) have a profound effect on species competition.

Next, we show how repeated simulation with randomly sampled parameters can provide some insight into community structure even without a well-fit set of nutrient uptake parameters. These examples demonstrate the importance of efficient dynamic FBA to microbial community modeling.

Prediction dependence on nutrient uptake

The set of unknown functions ci2(y) in Eq (4) present a profound problem for dynamic FBA simulation. If the behavior of the system is sensitive to the functions chosen and parameters of those functions, a single simulation will be of little use in drawing biological conclusion. In order to demonstrate that such a sensitivity exists, we repeatedly simulated the same simple community with different randomly drawn parameters. While a more realistic choice of function may be saturating or sigmoidal (as with Hill or Michaelis-Menten kinetics), for the following experiment we take these functions to be linear:

cij2(y)=κijyj, (21)

meaning that the maximum uptake rate of nutrient yj by organism xi is proportional to the concentration of yj. This choice minimizes the number of parameters that must be chosen for our analysis of parameter sensitivity, and is in line with an assumption of simple mass action kinetics [53, 54].

The choice of κij may have a profound effect on the outcome of a community simulation, as it represents how well an organism can sequester a resource when this will optimize the organism’s growth. In order study this effect in a small community, we sampled a three-species community model with κij ∈ (0, 1) chosen uniformly at random. We used models for E.coli, S.cerevisiae and M.tuberculosis downloaded from the BiGG model database [28].

We simulated with no dilution of metabolites or microbes, and no replenishment of nutrients. In every simulation, some critical metabolite was eventually depleted and the organisms stopped growing. We recorded the simulated final biomass of each organism from each simulation, and the results are shown in Fig 4.

Fig 4.

Fig 4

(Top) Histogram of the final simulated biomass of each of E.coli, S.cerevisiae and M.tuberculosis from 95 simulations, each with different metabolite uptake rates κij. (Bottom) Pair-wise comparison of the final simulated biomass densities using a kernel density estimation. In red is the result of uniform uptake rates κij = 1 for all i, j.

Community growth effects

As we saw in previous section, community growth outcomes depend on the choice of nutrient uptake rates κij. Using Algorithm 1, we can perform Monte-Carlo sampling in order to understand the possible effects on some microorganism of growing in some community. To do this, we randomly sample the set of uptake rates κij and run simulations of various communities for the chosen uptake rates. Then, the correlation between communities of final simulated biomass of some organism can be interpreted as the effect of the community on the growth of that organism. A correlation less than 1 between growth of an organism in different communities indicates that the community is having some effect. To see the direction of this effect, we can fit a simple linear regression model (best fit line) to the final simulated biomasses. Then, the slope of this line tells us if the organism benefits or is harmed by being in one community over another.

We again simulated E.coli, S.cerevisiae and M.tuberculosis downloaded from the BiGG model database [28]. Simulations were run with the M9 medium described in S1 Table, with no replenishment of resources.

Each organism grew to a larger final simulated biomass when alone compared to when in a trio with the other two, which is unsurprising given the finite resources. This difference was the least pronounced for S.cerevisiae, suggesting that this organism is the least negatively effected by the competition. However, this can be seen as only a preliminary observation without better estimates of uptake parameters. Best-fit lines are shown in Fig 5. Efficient dynamic FBA allows repeated simulation with randomly sampled parameters, which gives an indication of likely behavior even without accurate parameter fitting.

Fig 5. Final simulated biomass of E.coli, S.cerevisiae and M.tuberculosis when grown alone or in pairs, for randomly sampled modeled parameters.

Fig 5

Best fit lines indicate the average effect of the community on an organism’s growth.

Conclusion

Understanding, predicting, and manipulating the make-up of microbial communities requires understanding a complex dynamic process. Genome-scale metabolic models provide an approximation to this process through the quasi-steady state assumption which leads to dynamic flux balance analysis. However, this system is large and hard to simulate numerically, let alone analyze for qualitative behaviors. As a first step towards a thorough analysis of community of organisms modeled with dynamic FBA, an efficient method of numerical simulation would provide an essential tool. However, modern tools for simulating dynamic FBA rely on repeatedly solving an optimization problem at every time step [24, 31, 3538].

Dynamic FBA simulation can be improved by considering the structure of these linear programs so that many fewer optimizations are required. As of now, the algorithm of Höffner et al. [40] is the only published method which takes advantage of this observation. However, that method does not account for the degeneracy of solutions to the relevant linear programs, meaning that it can choose a solution that cannot be carried forward in time. We present a method that chooses a basis for forward simulation. In contrast to the method of Höffner et al., we choose this basis in such a way that increases the likelihood that this forward simulation is actually possible.

Efficient dynamic FBA will allow better parameter fitting to time-longitudinal data. Furthermore, it allows for a search of parameter space which can help predict likely model outcomes or learn maps from parameter values to model outcomes.

Supporting information

S1 Text. Proof of Lemma 1.

Proof of the main lemma of the paper.

(PDF)

S1 Table. M9 medium File.

Defines an M9 minimal medium as adapted from Monk et al. [55].

(CSV)

S2 Table. List of models used.

Provides name, ID, and URL for the four models used in analysis of the method.

(CSV)

Data Availability

The source code is available at https://github.com/jdbrunner/surfin_fba.

Funding Statement

This work was supported by funding from the DeWitt and Curtiss Family Foundation (JDB), National Cancer Institute grant R01 CA179243 (NC), and the Center for Individualized Medicine, Mayo Clinic (NC and JDB). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Braundmeier AG, Lenz KM, Inman KS, Chia N, Jeraldo P, Walther-António MRS, et al. Individualized medicine and the microbiome in reproductive tract. Frontiers in Physiology. 2015;6:97 10.3389/fphys.2015.00097 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Calcinotto A, Brevi A, Chesi M, Ferrarese R, Perez LG, Grioni M, et al. Microbiota-driven interleukin-17-producing cells and eosinophils synergize to accelerate multiple myeloma progression. Nature communications. 2018;9(1):4832 10.1038/s41467-018-07305-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Flemer B, Lynch DB, Brown JM, Jeffery IB, Ryan FJ, Claesson MJ, et al. Tumour-associated and non-tumour-associated microbiota in colorectal cancer. Gut. 2017;66(4):633–643. 10.1136/gutjnl-2015-309595 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Hale VL, Jeraldo P, Chen J, Mundy M, Yao J, Priya S, et al. Distinct microbes, metabolites, and ecologies define the microbiome in deficient and proficient mismatch repair colorectal cancers. Genome Medicine. 2018;10(1):78 10.1186/s13073-018-0586-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Ng KM, Ferreyra JA, Higginbottom SK, Lynch JB, Kashyap PC, Gopinath S, et al. Microbiota-liberated host sugars facilitate post-antibiotic expansion of enteric pathogens. Nature. 2013;502:96 EP–. 10.1038/nature12503 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Round JL, Mazmanian SK. The gut microbiota shapes intestinal immune responses during health and disease. Nature Reviews Immunology. 2009;9:313 EP–. 10.1038/nri2515 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Walsh DM, Mert I, Chen J, Hou X, Weroha SJ, Chia N, et al. The Role of Microbiota in Human Reproductive Tract Cancers. In: American Journal of Physical Anthropology. vol. 168. Wiley 111 River St, Hoboken 07030-5774, NJ USA; 2019. p. 260–261.
  • 8. Fisher CK, Mehta P. Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression. PloS one. 2014;9(7):e102451 10.1371/journal.pone.0102451 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Friedman J, Higgins LM, Gore J. Community structure follows simple assembly rules in microbial microcosms. Nature Ecology & Evolution. 2017;1:0109 EP–. 10.1038/s41559-017-0109 [DOI] [PubMed] [Google Scholar]
  • 10. Goyal A, Maslov S. Diversity, Stability, and Reproducibility in Stochastically Assembled Microbial Ecosystems. Phys Rev Lett. 2018;120:158102 10.1103/PhysRevLett.120.158102 [DOI] [PubMed] [Google Scholar]
  • 11. Stein RR, Bucci V, Toussaint NC, Buffie CG, Rätsch G, Pamer EG, et al. Ecological modeling from time-series inference: insight into dynamics and stability of intestinal microbiota. PLoS computational biology. 2013;9(12):e1003388 10.1371/journal.pcbi.1003388 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Sung J, Kim S, Cabatbat JJT, Jang S, Jin YS, Jung GY, et al. Global metabolic interaction network of the human gut microbiota for context-specific community-scale analysis. Nature communications. 2017;8:15393; 15393–15393. 10.1038/ncomms15393 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Niehaus L, Boland I, Liu M, Chen K, Fu D, Henckel C, et al. Microbial coexistence through chemical-mediated interactions. bioRxiv. 2018. 10.1038/s41467-019-10062-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Posfai A, Taillefumier T, Wingreen NS. Metabolic Trade-Offs Promote Diversity in a Model Ecosystem. Phys Rev Lett. 2017;118:028103 10.1103/PhysRevLett.118.028103 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Brunner JD, Chia N. Metabolite-mediated modelling of microbial community dynamics captures emergent behaviour more effectively than species–species modelling. Journal of the Royal Society Interface. 2019;16(159):20190423 10.1098/rsif.2019.0423 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Momeni B, Xie L, Shou W. Lotka-Volterra pairwise modeling fails to capture diverse pairwise microbial interactions. Elife. 2017;6:e25051 10.7554/eLife.25051 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Heirendt L, Arreckx S, Pfau T, Mendoza S, Richelle A, Heinken A, et al. Creation and analysis of biochemical constraint-based models: the COBRA toolbox v3. 0. arXiv. arXiv preprint arXiv:171004038. 2017;. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Lewis NE, Nagarajan H, Palsson BO. Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nature Reviews Microbiology. 2012;10:291 EP–. 10.1038/nrmicro2737 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Lloyd CJ, Ebrahim A, Yang L, King ZA, Catoiu E, O’Brien EJ, et al. COBRAme: A computational framework for genome-scale models of metabolism and gene expression. PLOS Computational Biology. 2018;14(7):1–14. 10.1371/journal.pcbi.1006302 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Chan SHJ, Simons MN, Maranas CD. SteadyCom: Predicting microbial abundances while ensuring community stability. PLOS Computational Biology. 2017;13(5):1–25. 10.1371/journal.pcbi.1005539 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Diener C, Resendis-Antonio O. Micom: metagenome-scale modeling to infer metabolic interactions in the microbiota. bioRxiv. 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Gottstein W, Olivier BG, Bruggeman FJ, Teusink B. Constraint-based stoichiometric modelling from single organisms to microbial communities. Journal of the Royal Society Interface. 2016;13(124):20160627 10.1098/rsif.2016.0627 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Mendes-Soares H, Mundy M, Soares LM, Chia N. MMinte: an application for predicting metabolic interactions among the microbial species in a community. BMC Bioinformatics. 2016;17(1):343 10.1186/s12859-016-1230-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Zomorrodi AR, Islam MM, Maranas CD. d-OptCom: Dynamic Multi-level and Multi-objective Metabolic Modeling of Microbial Communities. ACS Synthetic Biology. 2014;3(4):247–257. 10.1021/sb4001307 [DOI] [PubMed] [Google Scholar]
  • 25. Borer B, Ataman M, Hatzimanikatis V, Or D. Modeling metabolic networks of individual bacterial agents in heterogeneous and dynamic soil habitats (IndiMeSH). PLoS computational biology. 2019;15(6). 10.1371/journal.pcbi.1007127 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Koch S, Kohrs F, Lahmann P, Bissinger T, Wendschuh S, Benndorf D, et al. RedCom: A strategy for reduced metabolic modeling of complex microbial communities and its application for analyzing experimental datasets from anaerobic digestion. PLoS computational biology. 2019;15(2):e1006759 10.1371/journal.pcbi.1006759 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Wattam AR, Davis JJ, Assaf R, Boisvert S, Brettin T, Bun C, et al. Improvements to PATRIC, the all-bacterial bioinformatics database and analysis resource center. Nucleic acids research. 2017;45(D1):D535–D542. 10.1093/nar/gkw1017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. King ZA, Lu J, Dräger A, Miller P, Federowicz S, Lerman JA, et al. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models. Nucleic acids research. 2016;44(D1):D515–D522. 10.1093/nar/gkv1049 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Mahadevan R, Edwards JS, Doyle FJ. Dynamic Flux Balance Analysis of Diauxic Growth in Escherichia coli. Biophysical Journal. 2002;83(3):1331—1340. 10.1016/S0006-3495(02)73903-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Varma A, Palsson BO. Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Applied and Environmental Microbiology. 1994;60(10):3724–3731. 10.1128/AEM.60.10.3724-3731.1994 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Zhuang K, Izallalen M, Mouser P, Richter H, Risso C, Mahadevan R, et al. Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments. The ISME journal. 2011;5(2):305 10.1038/ismej.2010.117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Henson MA, Hanly TJ. Dynamic flux balance analysis for synthetic microbial communities. IET systems biology. 2014;8(5):214–229. 10.1049/iet-syb.2013.0021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Song HS, Cannon WR, Beliaev AS, Konopka A. Mathematical modeling of microbial community dynamics: a methodological review. Processes. 2014;2(4):711–752. 10.3390/pr2040711 [DOI] [Google Scholar]
  • 34. Succurro A, Segrè D, Ebenhöh O. Emergent subpopulation behavior uncovered with a community dynamic metabolic model of Escherichia coli diauxic growth. Msystems. 2019;4(1). 10.1128/mSystems.00230-18 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Zhuang K, Ma E, Lovley DR, Mahadevan R. The design of long-term effective uranium bioremediation strategy using a community metabolic model. Biotechnology and bioengineering. 2012;109(10):2475–2483. 10.1002/bit.24528 [DOI] [PubMed] [Google Scholar]
  • 36. Harcombe WR, Riehl WJ, Dukovski I, Granger BR, Betts A, Lang AH, et al. Metabolic Resource Allocation in Individual Microbes Determines Ecosystem Interactions and Spatial Dynamics. Cell Reports. 2014;7(4):1104—1115. 10.1016/j.celrep.2014.03.070. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Louca S, Doebeli M. Calibration and analysis of genome-based models for microbial ecology. Elife. 2015;4:e08208 10.7554/eLife.08208 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Popp D, Centler F. μbialSim: constraint-based dynamic simulation of complex microbiomes. BioRxiv. 2019; p. 716126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Gomez JA, Höffner K, Barton PI. DFBAlab: a fast and reliable MATLAB code for dynamic flux balance analysis. BMC bioinformatics. 2014;15(1):409 10.1186/s12859-014-0409-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Höffner K, Harwood SM, Barton PI. A reliable simulator for dynamic flux balance analysis. Biotechnology and Bioengineering. 2012;110(3):792–802. [DOI] [PubMed] [Google Scholar]
  • 41. Feinberg M, Horn F. Dynamics of open chemical systems and the algebraic structure of the underlying reaction network. Chemical Engineering Science. 1973;29:775–787. [Google Scholar]
  • 42. Bradie B. A Friendly Introduction to Numerical Analysis. Pearson Education Inc.; 2006. [Google Scholar]
  • 43. Baroukh C, Muñoz-Tamayo R, Steyer JP, Bernard O. DRUM: a new framework for metabolic modeling under non-balanced growth. Application to the carbon metabolism of unicellular microalgae. PloS one. 2014;9(8). 10.1371/journal.pone.0104499 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Øyås O, Stelling J. Genome-scale metabolic networks in time and space. Current Opinion in Systems Biology. 2018;8:51–58. 10.1016/j.coisb.2017.12.003 [DOI] [Google Scholar]
  • 45. Zazueta CL, Bernard O, Gouzé JL. Reduction of Metabolic Networks keeping Core Dynamics. Discrete Applied Mathematics. 2018;157(10):2483–2493. [Google Scholar]
  • 46. Kondo A, Ishii J, Hara KY, Hasunuma T, Matsuda F. Development of microbial cell factories for bio-refinery through synthetic bioengineering. Journal of biotechnology. 2013;163(2):204–216. 10.1016/j.jbiotec.2012.05.021 [DOI] [PubMed] [Google Scholar]
  • 47. Bordbar A, Monk JM, King ZA, Palsson BO. Constraint-based models predict metabolic and associated cellular functions. Nature Reviews Genetics. 2014;15(2):107–120. 10.1038/nrg3643 [DOI] [PubMed] [Google Scholar]
  • 48. Bertsimas D, Tsitsiklis JN. Introduction to linear optimization. vol. 6 Athena Scientific; Belmont, MA; 1997. [Google Scholar]
  • 49. Tardella F. The fundamental theorem of linear programming: extensions and applications. Optimization. 2011;60(1-2):283–301. 10.1080/02331934.2010.506535 [DOI] [Google Scholar]
  • 50.Tourigny DS, Muriel JC, Beber ME. dfba: Software for efficient simulation of dynamic flux-balance analysis models in Python; 2020. https://gitlab.com/davidtourigny/dynamic-fba.
  • 51. Islam MM, Fernando SC, Saha R. Metabolic modeling elucidates the transactions in the rumen microbiome and the shifts upon virome interactions. Frontiers in microbiology. 2019;10:2412 10.3389/fmicb.2019.02412 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Xu X, Zarecki R, Medina S, Ofaim S, Liu X, Chen C, et al. Modeling microbial communities from atrazine contaminated soils promotes the development of biostimulation solutions. The ISME journal. 2019;13(2):494–508. 10.1038/s41396-018-0288-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Horn F, Jackson R. General mass action kinetics. Archive for Rational Mechanics and Analysis. 1972;47. [Google Scholar]
  • 54.Feinberg M. Lectures on Chemical Reaction Networks; 1979. http://www.crnt.osu.edu/LecturesOnReactionNetworks.
  • 55. Monk JM, Charusanti P, Aziz RK, Lerman JA, Premyodhin N, Orth JD, et al. Genome-scale metabolic reconstructions of multiple Escherichia coli strains highlight strain-specific adaptations to nutritional environments. Proceedings of the National Academy of Sciences. 2013;110(50):20338–20343. 10.1073/pnas.1307797110 [DOI] [PMC free article] [PubMed] [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1007786.r001

Decision Letter 0

Pedro Mendes, Stefano Allesina

28 May 2020

Dear Dr. Brunner,

Thank you very much for submitting your manuscript "Minimizing the number of optimizations for efficient community dynamic flux balance analysis." for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Pedro Mendes, PhD

Associate Editor

PLOS Computational Biology

Stefano Allesina

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors propose a method for efficient dynamic flux balance analysis (DFBA) of community metabolic models that reduces the number of LP solutions as compared to brute-force methods based on solving the LP at each time step. The method is applied to an 8-species community to demonstrate its usefulness. While the development of numerical methods for community DFBA is an important problem, the manuscript has several shortcomings that need to be addressed before publication can be recommended.

1. First and foremost, the authors fail to explain why their method is superior to or even different from the DFBA method proposed by the Barton group at MIT [33, 34]. The two methods seem very similar as suggested by the authors: "Hoeffner et al, [34], used this observation to introduce a variable

step-size method for dynamic FBA. In that method a basic index set is chosen by adding biological constraints to the optimization problem hierarchically until a unique optimal solution is found. The challenge of such an approach is in choosing the basis for the optimal solution, as the optimal basis is not guaranteed to be unique. We describe how to choose a basis from among the possibilities provided from an FBA solution which is most likely to remain optimal as simulation proceeds forward. We therefore prioritize reducing the number of times the linear program must be solved." So the claim innovation here is more rational choice of the optimal basis? This point is critical and not clearly explained.

2. While the authors are correct that the COBRA toolbox offers a simple and inefficient Euler-based method (e.g. [30]), I would hardly consider this method to be state-of-the-art. If the authors would like to claim superiority, then they should compare their method to the DFBA method proposed by the Barton group [33, 34].

3. When comparing methods, the authors state that the Euler-type DFBA method required a very short time step to ensure non-negative metabolite concentrations. Perhaps a more efficient implementation of the Euler method would be to utilize an ODE solution method that ensure non-negativity of the dependent variables, such as the ODE solvers available in MATLAB. Did the authors consider this option, which would allow a fairer comparison?

4. The application to the 8-species community is a reasonable test of the method, but some biological context also would be useful. For example, why focus on uracil as a crossfed metabolite? Why assume linear linear uptake kinetics rather than the usual Michael-Menten kinetics?

5. A computation time of over 15 minutes for a 1 hour simulation (see Fig. 5) does not seem particularly fast to me. Please compare computation times with a state-of-the-art DFBA code.

6. Several of the plots lack axes labels (e.g. Fig. 6, 7 and 9). Please correct.

Reviewer #2: In this manuscript, Brunner and Chia have proposed an algorithm to accelerate the simulation of time-course behavior of microbial communities. The existing dFBA algorithm was improved upon by implementing a solution-hopping strategy in order to reduce the number of times the FBA problem must be solved at each time step. The authors have shown that this approach remains scalable for communities containing up to 8 members when compared to the classic dFBA algorithm. Overall, this speed-up is significant and when properly benchmarked against existing state-of-the-art algorithms, has the potential to be an important advancement in the field of microbial community modeling.

Major comments:

1. The authors have only compared their algorithm against the classical dFBA algorithm. However, in recent years, several meaningful algorithms have been developed such as d-OptCom (Zomorrodi and Islam et al, ACS Synthetic Biology, 2014), COMETS (Harcombe et al, Cell Rep., 2014), and SteadyCom (Chan et al, PLoS Comput Biol, 2017). The authors must benchmark their method in terms of both time and performance with the above methods.

2. The proposed method assumes that maximization of total biomass is the community objective. A consequence of this is that only the fastest growing organism will show any growth over time and cause the community to become a monoculture. The authors must comment on how their method safeguards against such outcomes.

3. The classical dFBA algorithm consists of a numerical ODE integration wrapped around an instantaneous FBA problem. It is the requirement of 10,000 integration steps that leads to the high computational cost. As a result of this, the reported “85% reduction in computation time” may be inflated. It would be more meaningful to compare the two algorithms using an adaptive time-step based integration such as adaptive Runge-Kutta or Richardson’s extrapolation, or a faster ODE solver such as CVODE from the SUNDIALS package.

4. Because the forward simulation method approximates the optimal FBA solution at multiple time steps, the authors must also compare the accuracy of their simulations with that of classical dFBA and comment on the accuracy vs. speed trade-off of their proposed algorithm.

5. Metabolic fluxes have a large dynamic range of values spanning up to three orders of magnitude. This can often lead to numerical instability issues where concentrations assume negative values. The authors must justify why 10,000 time-steps are sufficient to bypass this issue.

6. The authors have stated that re-solving is necessary when the solution becomes infeasible. However, metabolic networks are carefully constructed such that, as long as concentration of species in the extracellular media remain non-negative, the FBA problem will never be infeasible. The authors must clarify what leads to infeasibility in their FBA problem. Could this be an artefact of the approximation introduced in this algorithm to bypass re-solving the FBA problem?

7. The authors must also include a discussion on the new analyses enabled by the accelerated simulation of microbial communities. In particular, what previously difficult to answer questions can now be addressed by the community due to the advancement afforded by this proposed method.

Minor Comments:

1. The authors must clarify that their method improves upon the Static Optimization Approach of dFBA.

2. In the formulation shown in Equation (4), the quantity “c” is used prior to its definition. Furthermore, the function c(y) has not been defined in this manuscript.

3. It is recommended that different subscripts be used to denote microbial species, metabolites, and reactions for clarity of presentation. The same subscript “i” has been used to represent all of these quantities.

4. In equations (6), (7), and (8), it is unclear why all fluxes are represented in terms of intracellular fluxes only. Generally, the kernel of the stoichiometry matrix is represented in terms of all but one extracellular exchange fluxes, and the intracellular fluxes causing a rank-deficiency in the stoichiometry matrix.

Reviewer #3: The authors present a new algorithm (surfinFBA) for computing dynamic FBA simulations which does not require solving the linear problem at every time step, which greatly improves the computation times. Although the method looks promising, the quality of the manuscript is not so great and, in my point of view, needs substantial restructuring.

Major issues:

* The main emphasis of the paper is put on the mathematical details of deriving the algorithm, which would be interesting in a mathematical / computer science journal, but is not so interesting to the target audience of this journal. I would suggest moving these details to a supplementary text, and focusing the main text on the simulation results and benchmark of the new method.

* The authors present the classic and new algorithm as Figures in the manuscript, and throughout the text often refer to running these algorithms as implementing / simulating Fig.1 and Fig. 3. I find this inappropriate and very confusing. The two algorithms should be presented as methods in the Methods section and referred to by their names, not by figure numbers.

* The authors cite the first dFBA method by Varma and Palsson (1994), but they should also mention the first implementation of dFBA for multi-species simulation, the DyMMM method by Zhuang and co-workers (2011).

* There are also more recent multi-species dFBA methods like MCM (Louca and Doebeli, 2015) and microbialSim (Popp and Centler, bioRxiv, 2019). The authors should also compare the advantages / disadvantages of their method with these more recent methods. For instance, in their pre-print, Popp and Centler show that microbialSim as able to simulate a 773-species community, a much larger size than the 8-species community simulated with surfinFBA.

* The authors should better classify how they setup the simulation conditions. In the paper they mention they use the "COBRApy default simulated medium". This does not make sense. COBRApy is just a simulation library, it makes no assumptions regarding the medium. The medium is something the user must decide upon when parameterizing a model.

Minor issues:

* The authors mention they downloaded the models from the PATRIC database, which as far as I know is a genome database, not a model database, and I could not find how to download an SBML model from PATRIC. The models provided in the github repository are JSON files with modelSEED annotations, so I supposed they were built with modelSEED. This should be better clarified.

Reviewer #4: We have read “Minimizing he number of optimizations for efficient community dynamic flux balance analysis” by Brunner and Chia. The authors have developed a method named surfinFBA for performing dynamic flux balance analysis calculations for in-silico communities of organisms that reduces the number of optimization problems that must be solved. This approach leads to an 85% reduction in compute time required relative to the standard method implemented in the COBRA toolbox (a package frequently used by the genome-scale modelling community). We applaud the authors use of github to disseminate the method in a usable format. Our interest in this manuscript is only tempered by a lack of discussion, and insufficient benchmarking relative to the classical method performed by the authors.

Major comments

1. The author demonstrates that surfinFBA provides a 85% speed increase in dynamic FBA simulations relative to the standard method. Supplementary File 1 compares surfinFBA to the classic method implemented in the COBRA toolbox. We appreciate seeing the time differences however there is no comparison of results between the two implementations. The author must add an analysis demonstrating the results are equivalent between the two tools, otherwise the time differences is less meaningful. Adding computed dFBA profiles from both the classic method and the novel method to the supplementary file for comparison would be a first step.

2. A key value of FBA and dFBA simulations is their mechanistic nature enabling interpretation and contextualization of results. Thus, we were disappointed in the statement on line 436 where the authors mention that P. veronii had “some competitive advantage” in the default media. What media, specifically was used? Models usually come pre-loaded with a default media, but the COBRA toolbox itself does not (this should be corrected in the manuscript). The author should utilize the simulation results to trace the media component providing this competitive advantage to P. veronii. What aspect of the metabolic network for P. veronnii provides this competitive advantage? A deeper dive into some of the results presented here would provide more value to the manuscript.

3. The manuscript ends rather abruptly, and we were disappointed in the lack of discussion section. Now that dFBA simulations can be run more efficiently what applications do the authors envision? The applications presented at the end are informative but additional discussion of their value would be illuminating.

4. Please provide details of all simulations performed. The stoichiometric matrices and flux bounds used should be provided in an excel sheet

Minor comments

1. Be sure to distinguish between the python and matlab implementations of the cobra toolbox – I believe all references are to the python implementation but please make this explicit. Also reference the version used for benchmarking.

2. Be sure to label axes in all figures – e.g. 6 and 7.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No: Numerical data underlying figures do not seem to be provided according to the data availability policy of the journal.

Reviewer #4: No: Simulation parameters used to generate the results should be shared. The stoichiometric matrices and flux bounds used should be provided in an excel sheet

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Reviewer #4: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1007786.r003

Decision Letter 1

Pedro Mendes, Stefano Allesina

21 Aug 2020

Dear Dr. Brunner,

We are pleased to inform you that your manuscript 'Minimizing the number of optimizations for efficient community dynamic flux balance analysis.' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Pedro Mendes, PhD

Associate Editor

PLOS Computational Biology

Stefano Allesina

Deputy Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors undertook some serious revisions to address my previous comments and concerns. Therefore, I am satisfied with the revision.

Reviewer #2: The authors have sufficiently addressed our raised concerns.

Reviewer #4: The authors have performed a significant amount of work to address all of our comments and suggestions

We encourage them to comb through the manuscript one last time for grammatical errors that were introduced in the revision

We noticed a few, e.g. line 601 "by show that"

Besides these very minor issues we feel the the manuscript has been greatly improved through the revision.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #4: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #4: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1007786.r004

Acceptance letter

Pedro Mendes, Stefano Allesina

24 Sep 2020

PCOMPBIOL-D-20-00387R1

Minimizing the number of optimizations for efficient community dynamic flux balance analysis.

Dear Dr Brunner,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Sarah Hammond

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Proof of Lemma 1.

    Proof of the main lemma of the paper.

    (PDF)

    S1 Table. M9 medium File.

    Defines an M9 minimal medium as adapted from Monk et al. [55].

    (CSV)

    S2 Table. List of models used.

    Provides name, ID, and URL for the four models used in analysis of the method.

    (CSV)

    Attachment

    Submitted filename: ResponsesToReviewers.docx

    Data Availability Statement

    The source code is available at https://github.com/jdbrunner/surfin_fba.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS

    RESOURCES