Biophysical Journal. 2022 Apr 14;121(10):1919–1930. doi: 10.1016/j.bpj.2022.04.012

Relationship between fitness and heterogeneity in exponentially growing microbial populations

Anna Paola Muntoni 1,2, Alfredo Braunstein 1,2,3, Andrea Pagnani 1,2,3, Daniele De Martino 4,, Andrea De Martino 1,2,5,∗∗
PMCID: PMC9199093  PMID: 35422414

Abstract

Despite major environmental and genetic differences, microbial metabolic networks are known to generate consistent physiological outcomes across vastly different organisms. This remarkable robustness suggests that, at least in bacteria, metabolic activity may be guided by universal principles. The constrained optimization of evolutionarily motivated objective functions, such as the growth rate, has emerged as the key theoretical assumption for the study of bacterial metabolism. While conceptually and practically useful in many situations, the idea that certain functions are optimized is hard to validate in data. Moreover, it is not always clear how optimality can be reconciled with the high degree of single-cell variability observed in experiments within microbial populations. To shed light on these issues, we develop an inverse modeling framework that connects the fitness of a population of cells (represented by the mean single-cell growth rate) to the underlying metabolic variability through the maximum entropy inference of the distribution of metabolic phenotypes from data. While no clear objective function emerges, we find that, as the medium gets richer, the fitness and inferred variability for Escherichia coli populations follow and slowly approach the theoretically optimal bound defined by minimal reduction of variability at given fitness. These results suggest that bacterial metabolism may be crucially shaped by a population-level trade-off between growth and heterogeneity.

Significance

Evolutionary reasoning suggests that growth rate maximization may be the key organization principle of microbial metabolism. While appealing, optimality is hard to validate by directly extracting objective functions from data. Using a maximum entropy framework to infer metabolic phenotypes from population-level experiments, we show here that, as growth conditions improve, Escherichia coli cells approach a theoretical limit that connects their average growth rate to cell-to-cell variability in metabolic phenotypes. Specifically, as the former increases, the reduction in variability stays close to a minimum. This suggests that the organization of microbial metabolism (and some of the trade-offs that characterize it) may result from the need to preserve large metabolic heterogeneity in any growth condition.

Introduction

A standard assumption of theoretical models of microbial metabolism is that cells regulate the fluxes through metabolic reactions to maximize their growth rate (i.e., their biomass output) (1, 2, 3, 4). While intuitive and highly successful in many applications (5), this idea is not easy to validate in exponentially growing microbial populations. Quantitative studies of the interplay between metabolism and gene expression suggest, for instance, that microbial fitness is strongly gauged by regulatory constraints (6,7), biosynthetic costs (8,9), and the ability to respond to changing environments (10, 11, 12, 13, 14). As a consequence, the trade-offs arising from a complex multiobjective optimization often give a more accurate description of microbial growth than straightforward biomass maximization (15, 16, 17). Moreover, experiments characterizing bacterial growth at single-cell resolution have shown tight links between fitness and cell-to-cell variability (18, 19, 20). Growth rate distributions and metabolic fluxes indeed appear to be best captured by modeling such variability rather than assuming growth rate maximization (21,22), with the implication that trade-offs between metabolism and gene expression may affect not only bulk (average) properties but also the overall structure of a microbial population. While genome-scale models of metabolic networks can account for some of these facts (23, 24, 25, 26), the maximization of biomass output remains a key conceptual premise.

Addressing the question of “what cells actually want” (27) essentially requires reversing the usual theoretical pipeline and inferring from empirical data (reaction fluxes, growth rates, nutrient intake rates, etc.) 1) how the flow of metabolites through the metabolic network is organized and 2) whether some objective function is optimized. In this work, we develop a framework to learn the probability distribution of metabolic phenotypes (namely, whole-network flux configurations) using mass-spectrometry data (28) to inform a constraint-based model of E. coli’s metabolism. This approach differs significantly from previous inference studies, specifically those of Refs. (21,22), where the probability of observing a certain phenotype was effectively assumed to be a Boltzmann-like exponential function of its growth rate. No such assumption is made here. Rather, for each experimental sample, we compute the most likely distribution of phenotypes compatible with flux data by resorting to the maximum entropy (MaxEnt) principle. In a nutshell, this approach prescribes that, if one has to infer a probability distribution subject to a given set of constraints, the distribution having the largest entropy provides the best guess, in the sense of being closest to uniform (i.e., minimizing its divergence from the uniform distribution), thus avoiding the introduction of biases that are not needed to accommodate the constraints (see (29) for a simple introduction to this idea). Each inferred distribution is then characterized via 1) its mean biomass output (a proxy for the fitness) and 2) its “information content,” a global measure of cell-to-cell variability introduced in (21) that in essence quantifies the “volume” of the space of allowed phenotypes over which the inferred distribution is spread, with high information content corresponding to small volume.
Fitness and information values inferred at different glucose levels appear to draw a well-defined curve in the fitness-information plane, supporting the idea of a tight link between growth and (inferred) variability. As a benchmark, we compare this curve against a purely theoretical bound obtained by maximizing, in each condition, the mean biomass output at fixed information content (similar to a “rate-distortion curve” in information theory (30)). We found that empirical populations qualitatively follow and slowly approach the theoretical limit as the growth medium gets richer. In other words, as the fitness (mean biomass output) increases, the inferred phenotypic variability tends to remain as large as possible. This quantitatively supports the idea that heterogeneity plays a key role in shaping the fitness of a microbial population.

Materials and methods

Constraint-based model of the metabolic network

Given a network reconstruction defined by the matrix $\mathbf{S}$ of the stoichiometric coefficients of metabolic reactions (with $N$ the number of reactions and $M$ that of chemical species, including exchange fluxes between the cell and the medium), feasible flux vectors $\mathbf{v}=\{v_i\}$ are assumed to satisfy the nonequilibrium steady state (NESS) mass balance conditions $\mathbf{S}\mathbf{v}=\mathbf{0}$ (5) (Fig. 1 A). Once ranges of variability of the form $v_i\in[v_{i,\min},v_{i,\max}]$ are supplied for each flux, solutions span a convex polytope of dimension at most equal to $N-\mathrm{rank}(\mathbf{S})$ (31). This represents the “feasible space” $\mathcal{F}$ of the metabolic network. In principle, all vectors $\mathbf{v}\in\mathcal{F}$ are viable network states (phenotypes). Flux balance analysis and related approaches typically focus on the optimal phenotype, defined as the flux vector that maximizes a specific objective, usually the “biomass synthesis rate” $v_{\mathrm{bm}}$ included in the network reconstructions (2), which quantifies the rate at which biomass precursors are produced, in the correct proportions, in state $\mathbf{v}$. The vector $\mathbf{v}\in\mathcal{F}$ that maximizes $v_{\mathrm{bm}}$ can be found by linear programming. We shall hereafter write $v_{\mathrm{bm}}^{\max}\equiv\max_{\mathbf{v}\in\mathcal{F}}v_{\mathrm{bm}}$. Here, we aim at using experimental data on fluxes to infer a probability density on $\mathcal{F}$ that most efficiently represents our empirical knowledge. Since the biomass synthesis rate corresponds to the growth rate according to the metabolic model, the variable $v_{\mathrm{bm}}$ has the dimension of a rate, i.e., $\mathrm{h}^{-1}$. In the following, we use a rather heavy notation to denote the biomass synthesis rate under different conditions and approximations. For the sake of clarity, this notation is summarized in the supporting material, “notation used for the biomass production rate.”
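As an illustration of the linear-programming step described above, the following minimal sketch maximizes the biomass flux on a toy network. The four-reaction network, its bounds, and the names `max_biomass` and `BIOMASS_INDEX` are hypothetical and chosen only for illustration; they are not the E. coli reconstruction used in this work.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (NOT the E. coli reconstruction used in the paper):
#   v0: -> A (uptake)   v1: A -> B   v2: B -> (biomass)   v3: A -> (secretion)
S = np.array([
    [1.0, -1.0,  0.0, -1.0],  # mass balance of metabolite A
    [0.0,  1.0, -1.0,  0.0],  # mass balance of metabolite B
])
BOUNDS = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]
BIOMASS_INDEX = 2


def max_biomass(S, bounds, biomass_index):
    """Maximize the biomass flux subject to S v = 0 and the flux bounds."""
    objective = np.zeros(S.shape[1])
    objective[biomass_index] = -1.0  # linprog minimizes, so flip the sign
    result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return -result.fun, result.x


v_bm_max, v_opt = max_biomass(S, BOUNDS, BIOMASS_INDEX)
# The quantity N - rank(S) quoted in the text, evaluated for the toy network
null_space_dim = S.shape[1] - np.linalg.matrix_rank(S)
print(f"v_bm_max = {v_bm_max:.3f}, N - rank(S) = {null_space_dim}")
```

In this toy case the uptake cap is the only binding constraint, so the optimal biomass flux equals the maximum uptake. A genome-scale model would supply its own stoichiometric matrix and bounds in place of the hypothetical ones above.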

Figure 1.


(A) Constraint-based models use large-scale reconstructions of cellular metabolic networks, encoded in a stoichiometric matrix $\mathbf{S}$, together with biochemical or regulatory constraints on fluxes. Nonequilibrium steady states (NESSs) of the network satisfy the conditions $\mathbf{S}\mathbf{v}=\mathbf{0}$. The (high-dimensional, as $N>M$) polytope of solutions is the feasible space $\mathcal{F}$ of the system. The biomass output associated with each feasible flux vector (phenotype) represents its fitness. (B) Empirical mean biomass rate $v_{\mathrm{bm}}^{\mathrm{xp}}$ (markers, from (15,32)) and maximum biomass output $v_{\mathrm{bm}}^{\max}$ (continuous line) predicted by flux balance analysis for the metabolic network and glucose uptakes from (15,32). (C) Experiment-derived averages for a small set of fluxes and the bounds defined by the feasible space can be used to inform a maximum entropy inference procedure that allows us to determine the most likely distribution of phenotypes for the entire network. The distribution inferred for each experiment yields a point in the (fitness, information) plane. (D) Inferred fitness-information values for a set of 33 experiments probing E. coli growth in glucose-minimal medium. Green line: fitness-information (F-I) bounds numerically computed by expectation propagation (EP) for the 33 experiments (all curves perfectly overlap). In each condition, fitness values $\langle v_{\mathrm{bm}}\rangle=\langle v_{\mathrm{bm}}\rangle^{\mathrm{optm}}$ have been rescaled by the corresponding value of $v_{\mathrm{bm}}^{\max}$. Markers: fitness is computed according to the inferred distribution, i.e., $\langle v_{\mathrm{bm}}\rangle=\langle v_{\mathrm{bm}}\rangle^{\mathrm{inf}}$. The black marker denotes the rescaled fitness $\langle v_{\mathrm{bm}}\rangle_{q(\mathbf{v};\beta=0)}/v_{\mathrm{bm}}^{\max}$ corresponding to a uniform distribution on $\mathcal{F}$ ($I=0$, $\beta=0$), which separates the upper branch of the F-I bound (growth faster than $\langle v_{\mathrm{bm}}\rangle_{q(\mathbf{v};0)}$, $\beta>0$) from the lower one (growth slower than $\langle v_{\mathrm{bm}}\rangle_{q(\mathbf{v};0)}$, $\beta<0$). The gray area represents the infeasible region. In color bars in (B) and (D), orange and blue shades are used for data coming from (15) and (32), respectively.

Experimental data and network reconstruction

We have considered the 17 experiments described in (32) and the 16 from (15), which study glucose-limited E. coli growth and employ the same network reconstruction for flux estimation. Taken together, the data cover 33 values of the growth rate, from ca. 0.05/h to 1/h (a range that includes the key phenotypic crossover marked by the onset of acetate overflow (9,33)) and provide, for each condition, estimates for the population-averaged fluxes through a small set of reactions from the central carbon pathways (34). The former dataset yields the expectation values of 26 fluxes at various growth rates and glucose intakes below the acetate onset point. The latter quantifies instead 25 fluxes in a nutrient-rich medium, with acetate excretion observed in 11 experiments. As fluxes were measured relative to the glucose uptake in (15), we converted them to net fluxes using the glucose uptake values reported in Table S5 of (15). The data we used finally comprised 33 vectors $\mathbf{v}^{\mathrm{xp}}$ of average fluxes and the vector $\boldsymbol{\sigma}^{\mathrm{xp}}$ of their experimental errors. To define the feasible space $\mathcal{F}$, we derived the stoichiometric matrix relative to the network reconstruction given in Table S1 in (15), after mapping these fluxes to those measured in (32) via their chemical equations (supporting material, “reactions mapping”). To define reaction reversibility, we assigned bounds $[-1000,0]$ or $[0,1000]$ to irreversible reactions, and $[-1000,1000]$ to reversible ones. We then implemented two modifications. First, we turned reaction udhA from positive irreversible to reversible to allow for the negative values that were measured. Next, using the algorithm given in (35), we found that reaction sdhABCD was responsible for a thermodynamically infeasible loop in the network. To prevent it, we changed this reaction from reversible to positive irreversible, so that all flux configurations we consider are thermodynamically consistent. We finally performed flux variability analysis (36) to restrict the range of variability of each flux.
To define the growth medium, the glucose import flux was set to the value reported in each experiment. The corresponding flux can then be encoded in an experiment-dependent vector $\mathbf{b}$, so that the NESS conditions take the form $\mathbf{S}\mathbf{v}=\mathbf{b}$. The network we consider is ultimately composed of $N=73$ fluxes and $M=49$ metabolites, while the feasible space $\mathcal{F}$ is a convex polytope of dimension 26, i.e., with 26 independent degrees of freedom. Fig. 1 B shows, for all experiments, the measured biomass rate $v_{\mathrm{bm}}^{\mathrm{xp}}$ (markers) together with the maximum value of the biomass rate $v_{\mathrm{bm}}^{\max}$ predicted by flux balance analysis for the network just described (black line).

MaxEnt distribution

To infer the probability density of flux configurations, $p(\mathbf{v})$, we use empirical data to constrain the space of probability densities on $\mathcal{F}$. Specifically, we would like to impose that the mean flux $\langle v_j\rangle=\int_{\mathcal{F}}v_j\,p(\mathbf{v})\,d\mathbf{v}$ of every reaction $j$ that has been experimentally quantified matches its experimental estimate $v_j^{\mathrm{xp}}$. We denote by $\mathcal{E}$ the set of experimentally measured fluxes. According to the MaxEnt principle, the least-biased guess for $p(\mathbf{v})$ compatible with these constraints is given by the solution of

$$\max_{p(\mathbf{v})}H[p]\quad\text{subject to}\quad\langle v_j\rangle=v_j^{\mathrm{xp}}\quad\forall j\in\mathcal{E},\tag{1}$$

where $H[p]=-\int_{\mathcal{F}}p(\mathbf{v})\log_2 p(\mathbf{v})\,d\mathbf{v}$ is the entropy. This yields

$$p(\mathbf{v};\mathbf{c})=\frac{1}{Z(\mathbf{c})}\exp\Big[\sum_{j\in\mathcal{E}}c_j v_j\Big],\tag{2}$$

where $\mathbf{c}=\{c_j\}$ is the vector of Lagrange multipliers (“fields” for short) enforcing the constraints in Eq. (1) and $Z(\mathbf{c})$ is a factor ensuring proper normalization ($\int_{\mathcal{F}}p(\mathbf{v};\mathbf{c})\,d\mathbf{v}=1$). The values of the fields $c_j$ must be determined from the conditions

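$$\langle v_j\rangle\equiv\int_{\mathcal{F}}v_j\,p(\mathbf{v};\mathbf{c})\,d\mathbf{v}=v_j^{\mathrm{xp}}\quad\forall j\in\mathcal{E}.\tag{3}$$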

To solve Eq. (3) in a general setting, one can rely neither on the Monte Carlo schemes employed in (21) nor on Boltzmann learning (supporting material, “fitting averages using HR-based Boltzmann machine learning”) because of their excessive computational cost. Furthermore, we observed that, due to experimental uncertainties, some empirical means $v_j^{\mathrm{xp}}$ given in (15,32) lie outside the feasible polytope $\mathcal{F}$ defined by the network reconstruction employed in those studies. In other words, there is no configuration $\mathbf{v}\in\mathcal{F}$ satisfying $v_j=v_j^{\mathrm{xp}}$ exactly, and therefore the above-mentioned approach cannot be applied directly. To account for this issue, we introduce in the next section a slightly modified MaxEnt model, whose parameters are determined by expectation propagation (EP), a highly efficient algorithm for approximate Bayesian inference with broad applicability (37, 38, 39).
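When the feasible space is small enough to enumerate, the conditions in Eq. (3) can be solved by a plain moment-matching iteration on the fields, which conveys the idea behind Boltzmann learning. The sketch below is only a toy illustration and is not the EP scheme used in this work. The two-flux triangular feasible space, the grid resolution, the target means, the learning rate, and the function names are all assumptions made for the example.

```python
import numpy as np

# Hypothetical 2-flux feasible space: the triangle v1, v2 >= 0, v1 + v2 <= 1,
# discretized on a regular grid so that averages become finite sums.
grid = np.linspace(0.0, 1.0, 101)
V = np.array([(a, b) for a in grid for b in grid if a + b <= 1.0 + 1e-12])


def mean_flux(c):
    """Mean flux vector under p(v; c) proportional to exp(c . v) on the grid."""
    log_weights = V @ c
    weights = np.exp(log_weights - log_weights.max())  # stabilized exponentials
    weights /= weights.sum()
    return weights @ V


def fit_fields(target, learning_rate=5.0, n_iter=5000, tol=1e-10):
    """Iterate c <- c + lr * (target - <v>_c) until the means match the target."""
    c = np.zeros(V.shape[1])
    for _ in range(n_iter):
        gap = target - mean_flux(c)
        if np.linalg.norm(gap) < tol:
            break
        c += learning_rate * gap
    return c


target = np.array([0.5, 0.2])  # assumed "experimental" means, inside the triangle
c_fit = fit_fields(target)
print("fitted fields:", c_fit, "means:", mean_flux(c_fit))
```

The iteration converges here because the target lies inside the feasible space. If the target lay outside, as happens for some of the measured means discussed above, no finite set of fields could reproduce it, which is the motivation for the modified model of the next section.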

Computation of the MaxEnt distribution via EP

We define, for each measured flux $j\in\mathcal{E}$, the auxiliary variable $v_j^e=v_j^{\mathrm{xp}}+\eta_j$, where $\eta_j$ is a Gaussian random variable with zero mean and standard deviation $\gamma^{-1/2}$ (no cross correlations between different fluxes are assumed). We then aimed at determining the distribution of phenotypes $\mathbf{v}$ such that 1) $\mathbf{v}$ lies in $\mathcal{F}$, 2) fluxes in $\mathcal{E}$ take on values as close as possible (within experimental errors) to those encoded in the auxiliary vector $\mathbf{v}^e$, and 3) averages of $\mathbf{v}^e$ match empirical averages $\mathbf{v}^{\mathrm{xp}}$. The (MaxEnt) distribution satisfying these constraints reads

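$$p(\mathbf{v},\mathbf{v}^e;\mathbf{c})=\frac{1}{Z^e(\mathbf{c})}\prod_{j\in\mathcal{E}}e^{-\frac{\gamma}{2}(v_j^e-v_j)^2+c_j v_j^e},\tag{4}$$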

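where $Z^e(\mathbf{c})$ is the normalization constant, obtained by integrating the product in Eq. (4) over $\mathbf{v}^e\in\mathbb{R}^n$ and $\mathbf{v}\in\mathcal{F}$, and $\mathbf{c}$ is the vector of Lagrange multipliers ensuring that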

$$\langle v_j^e\rangle_{p(\mathbf{v},\mathbf{v}^e;\mathbf{c})}=v_j^{\mathrm{xp}}\quad\text{for each }j\in\mathcal{E}.\tag{5}$$

In the limit of large $\gamma$, the distribution will concentrate on the point (or face) that is “closest” to the empirical vector $\mathbf{v}^{\mathrm{xp}}$. To set a specific value for $\gamma$, however, we imposed that the average variance of the auxiliary fluxes $\mathbf{v}^e$ match the average variance of experimental errors, i.e.,

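$$\frac{1}{|\mathcal{E}|}\sum_{j\in\mathcal{E}}\big\langle(v_j^e-v_j^{\mathrm{xp}})^2\big\rangle_{p(\mathbf{v},\mathbf{v}^e;\mathbf{c})}=\frac{1}{|\mathcal{E}|}\sum_{j\in\mathcal{E}}(\sigma_j^{\mathrm{xp}})^2\tag{6}$$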

(see supporting material, “mathematical details of the expectation propagation algorithm” for details). Note that Eq. (4) is closely related to Eq. (2), since

$$\int_{\mathbb{R}^n}p(\mathbf{v},\mathbf{v}^e;\mathbf{c})\,d\mathbf{v}^e\propto p(\mathbf{v};\mathbf{c}).\tag{7}$$
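The relation in Eq. (7) can be checked by completing the square for each measured flux. The short derivation below is a sketch under the assumptions stated for Eq. (4), namely independent Gaussian factors with a common precision $\gamma$. The shorthand $u_j=v_j^e-v_j$ is introduced here only for convenience.

```latex
\begin{align}
\int_{\mathbb{R}} e^{-\frac{\gamma}{2}(v_j^e - v_j)^2 + c_j v_j^e}\, dv_j^e
  &= e^{c_j v_j}\int_{\mathbb{R}} e^{-\frac{\gamma}{2}u_j^2 + c_j u_j}\, du_j \\
  &= \sqrt{\frac{2\pi}{\gamma}}\; e^{c_j^2/(2\gamma)}\; e^{c_j v_j}.
\end{align}
```

The prefactor does not depend on $\mathbf{v}$, so taking the product over the measured fluxes leaves a marginal proportional to $\exp[\sum_j c_j v_j]$ on the feasible space, which is the exponential form of Eq. (2).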

To avoid the high computational cost of calculating the vector $\mathbf{c}$ and the scalar $\gamma$ through Boltzmann learning, we employ a variant of EP (37, 38, 39), an algorithm that computes a multivariate Gaussian approximation $q(\mathbf{v},\mathbf{v}^e;\mathbf{c})$ of $p(\mathbf{v},\mathbf{v}^e;\mathbf{c})$. Note that, thanks to the relation in Eq. (7) and the Gaussian approximation provided by EP, we can characterize the fluxes $\mathbf{v}\in\mathcal{F}$ by means of a marginal density $q(\mathbf{v};\mathbf{c})$, fully parametrized by a vector of means $\boldsymbol{\mu}$ and a covariance matrix $\boldsymbol{\Sigma}$. Details are given in supporting material, “mathematical details of the expectation propagation algorithm.” After applying EP to each experiment $a$ to compute the distribution $q(\mathbf{v},\mathbf{v}^e;\mathbf{c}_a)$ approximating $p(\mathbf{v},\mathbf{v}^e;\mathbf{c}_a)$, expectation values of constrained fluxes coincide within error bars with empirical means as well as with results obtained by (potentially more accurate but less efficient) Monte Carlo Hit-and-Run calculations (40) (supporting material, “fitting averages using HR-based Boltzmann machine learning” and “fitting quality”; Figs. S3, S4, and S5). The script performing EP to compute the joint distribution in Eq. (4), along with its single-flux marginals, using constraints derived from the datasets considered in this paper is available at https://github.com/infernet-h2020/MetaME.

Inferred fitness and information content

To characterize the inferred distributions (4) (one per experimental population), we use two quantities (Fig. 1 C): 1) the mean inferred biomass output

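$$\langle v_{\mathrm{bm}}\rangle^{\mathrm{inf}}=\langle v_{\mathrm{bm}}\rangle_{q(\mathbf{v};\mathbf{c})}=\int_{\mathcal{F}}v_{\mathrm{bm}}\,q(\mathbf{v};\mathbf{c})\,d\mathbf{v},\tag{8}$$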

and 2) the “information content” per degree of freedom defined as

$$I^{\mathrm{inf}}=\frac{H[p(\mathbf{v};\mathbf{0})]-H[p(\mathbf{v};\mathbf{c})]}{\dim(\mathcal{F})\,\ln 2}\quad[\text{bits}].\tag{9}$$

$\langle v_{\mathrm{bm}}\rangle^{\mathrm{inf}}$ is a proxy for the population growth rate (fitness) when cell-to-cell variability is sufficiently small (19). $I^{\mathrm{inf}}$, namely the entropy loss relative to the uniform density on $\mathcal{F}$, quantifies instead how “spread” $p(\mathbf{v};\mathbf{c})$ is over $\mathcal{F}$. Low entropy, i.e., low phenotypic variability, implies high information content and vice versa. We calculated the information content per degree of freedom from the Gaussian approximation to Eq. (4), i.e.,

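$$H_a^q=\frac{1}{2}\log\det\left(2\pi e\,\boldsymbol{\Sigma}_a\right),\tag{10}$$
$$I_a^{\mathrm{inf}}=\frac{H_0-H_a^q}{\dim(\mathcal{F})\,\ln 2},\tag{11}$$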

where H0 is the entropy of the multivariate Gaussian distribution approximating the uniform distribution on F.
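Eqs. (10) and (11) reduce to a few lines of linear algebra once covariance matrices are available. The sketch below assumes that both covariance matrices are full rank and expressed in the same set of independent coordinates, so that their size equals the number of degrees of freedom. The function names and the three-dimensional example matrices are hypothetical.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian: 0.5 * log det(2*pi*e*cov)."""
    sign, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * np.asarray(cov, dtype=float))
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return 0.5 * logdet


def information_per_dof(cov_reference, cov_inferred):
    """Entropy loss with respect to the reference, in bits per degree of freedom."""
    dim = np.asarray(cov_reference).shape[0]
    entropy_loss = gaussian_entropy(cov_reference) - gaussian_entropy(cov_inferred)
    return entropy_loss / (dim * np.log(2.0))


# Hypothetical example: halving the standard deviation along every direction
cov_reference = np.eye(3)
cov_inferred = np.eye(3) / 4.0
info_bits = information_per_dof(cov_reference, cov_inferred)
print(f"information content: {info_bits:.3f} bits per degree of freedom")
```

In this example every direction loses a factor of two in standard deviation, which corresponds to one bit per degree of freedom. Using `slogdet` rather than `det` avoids overflow or underflow when the dimension is large.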

F-I bound

To benchmark inferred distributions, we reasoned that a large population of cells can be described by a probability density $p(\mathbf{v})$ over $\mathcal{F}$ as long as the feasible space can be taken to be the same for all cells in the population (i.e., if all cells obey the same set of physicochemical, regulatory, and environmental constraints). Clearly, the same fitness (mean biomass output $\langle v_{\mathrm{bm}}\rangle=\int_{\mathcal{F}}v_{\mathrm{bm}}\,p(\mathbf{v})\,d\mathbf{v}$) can be achieved by different probability densities $p(\mathbf{v})$. Among equally fit populations, however, those with the largest entropy encode less information into $p(\mathbf{v})$ and are hence likely to face smaller costs associated with the regulation of fluxes. Therefore, it is reasonable to define the optimal population for any given information content as the one solving

$$\max_{p(\mathbf{v})}\langle v_{\mathrm{bm}}\rangle\quad\text{subject to}\quad I[p]=\bar{I}.\tag{12}$$

Equivalently, the optimal population is the one achieving a given fitness at the smallest reduction of flux variability, i.e., the solution of

$$\min_{p(\mathbf{v})}I[p]\quad\text{subject to}\quad\langle v_{\mathrm{bm}}\rangle=\bar{v}_{\mathrm{bm}}.\tag{13}$$

The solution of Eq. (13) has the form

$$p(\mathbf{v})=\frac{1}{Z(\beta)}e^{\beta v_{\mathrm{bm}}},\tag{14}$$

where $\beta$ is the Lagrange multiplier enforcing the constraint $\langle v_{\mathrm{bm}}\rangle=\bar{v}_{\mathrm{bm}}$ and $Z(\beta)$ is a normalization constant. The distribution in Eq. (14) depends on the single parameter $\beta$. As $\beta$ increases, $\bar{v}_{\mathrm{bm}}$ increases and the values of $\langle v_{\mathrm{bm}}\rangle$ and $I$ associated with Eq. (14) change, drawing a curve in the $(I,\langle v_{\mathrm{bm}}\rangle)$ plane that only depends on the specifics of $\mathcal{F}$. We call this curve the fitness-information (F-I) bound. By definition, optimal populations (fastest-growing if $\beta>0$, slowest-growing if $\beta<0$) have fitness and information values that lie on this curve.

Given a feasible space $\mathcal{F}$, the F-I bound can be computed by EP, as the latter yields a multivariate Gaussian approximation $q(\mathbf{v};\beta)$ to the distribution in Eq. (14). The mean biomass synthesis rate increases as $\beta$ increases. We therefore selected $10^3$ values of $\beta$ in the range $[0,10^4]$ and, for each of these, computed $q(\mathbf{v};\beta)$ and the corresponding mean biomass $\langle v_{\mathrm{bm}}\rangle^{\mathrm{optm}}=\langle v_{\mathrm{bm}}\rangle_{q(\mathbf{v};\beta)}$ and information content $I^{\mathrm{optm}}$. The latter quantities give the F-I bound (the green line in Fig. 1 D). (The same curve can also be computed through the Monte Carlo protocols described in (21,22).) Note that, in principle, one obtains a different F-I bound in each condition, since both $\mathcal{F}$ and $v_{\mathrm{bm}}^{\max}$ change with the glucose level. However, when the values of $\langle v_{\mathrm{bm}}\rangle^{\mathrm{optm}}$ are rescaled by $v_{\mathrm{bm}}^{\max}$, all curves collapse onto the green line because fluxes scale linearly with the glucose uptake. In this respect, the F-I bound shown in Fig. 1 D provides a highly robust characterization of the cell’s metabolic capabilities.
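The shape of an F-I bound can be illustrated on the simplest possible feasible space, where everything is available in closed form. The sketch below assumes a single flux $v\in[0,1]$ that plays the role of the biomass rate, so that Eq. (14) becomes $p(v)\propto e^{\beta v}$ on the unit interval. This one-dimensional setting and the function names are assumptions made for illustration, and no EP is involved.

```python
import math


def fi_point(beta):
    """Return (mean fitness, information in bits) for p(v) ~ exp(beta*v) on [0, 1]."""
    if abs(beta) < 1e-6:
        return 0.5, 0.0  # uniform density: mean 1/2, zero information
    b = abs(beta)
    one_minus_exp = -math.expm1(-b)                    # 1 - exp(-b), computed stably
    mean_pos = 1.0 / one_minus_exp - 1.0 / b           # mean for +|beta|
    log_z_pos = b + math.log(one_minus_exp) - math.log(b)  # ln Z for +|beta|
    entropy = log_z_pos - b * mean_pos                 # same for -|beta| by v -> 1 - v
    mean = mean_pos if beta > 0 else 1.0 - mean_pos
    # The uniform density on [0, 1] has zero entropy, so I = -H / ln 2 (one d.o.f.)
    return mean, -entropy / math.log(2.0)


def fi_bound(betas):
    """Trace the fitness-information curve over a list of beta values."""
    return [fi_point(beta) for beta in betas]


betas = [0.5 * k for k in range(-40, 41)]
curve = fi_bound(betas)
for beta, (fitness, info) in zip(betas[::10], curve[::10]):
    print(f"beta = {beta:6.1f}  fitness = {fitness:.4f}  I = {info:.4f} bits")
```

Positive values of $\beta$ trace the upper branch (mean above 1/2), negative values trace the lower branch, and $\beta=0$ gives the uniform point at zero information, mirroring the structure described for Fig. 1 D.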

Results

Inferred fitness-information relationship

Inferred values of (rescaled) fitness and information, $(\langle v_{\mathrm{bm}}\rangle^{\mathrm{inf}}/v_{\mathrm{bm}}^{\max},I^{\mathrm{inf}})$, for the 33 experiments used to inform the MaxEnt problem (materials and methods) are shown by the markers in Fig. 1 D. Different color groups (orange versus blue) are used for data coming from (15) and (32), respectively, with shades corresponding to different growth conditions, while triangles versus circles denote the presence ($\nu_{\mathrm{ac}}^{\mathrm{exp}}>0$) or absence ($\nu_{\mathrm{ac}}^{\mathrm{exp}}=0$) of acetate excretion. As the growth medium gets richer, both the mean biomass output and the information encoded in inferred distributions increase, following a remarkably well-defined trend. In other words, faster-growing populations tend to spread over smaller and smaller portions of the feasible space. Such a relationship captures the trade-off between static (instantaneous) fitness and cell-to-cell variability in bacterial populations, here derived from reaction fluxes rather than through direct quantification of elongation rates or interdivision times (see, e.g., (19)).

Empirical versus optimal fitness-information relationship

The theoretically optimal fitness-information relationship is encoded by the F-I bound, which describes the maximal mean biomass output achievable in $\mathcal{F}$ at any given information content (or, vice versa, the minimum information content required to achieve a given mean biomass) (materials and methods). The bound obtained for the network reconstruction of (15,32) is shown by the green line in Fig. 1 D. The top (respectively, bottom) branch of the line corresponds to optimal states with $\beta>0$ (respectively, $\beta<0$) in Eq. (14), where the biomass synthesis rate is higher (respectively, lower) than the value obtained by an unbiased uniform sampling of $\mathcal{F}$ ($\beta=0$ in Eq. (14)). Such a value is displayed as a black square in Fig. 1 D. F-I pairs lying within the two branches of the F-I bound (white area) are feasible, while pairs in the gray area are forbidden. Inferred F-I pairs consistently lie in the feasible region. A comparison between inferred and optimal scenarios reveals, however, two qualitatively different regimes (Fig. 2 A). In poorer media, populations appear to be significantly suboptimal but rapidly close the gap with the theoretically optimal mean biomass production rate as the medium improves. Here, inferred values of $I$ are larger than optimal ones by about 1 bit across the whole range of growth rates (Fig. 1 D). This suggests that a remodeling of the protein repertoire leading to faster growth at roughly the same regulatory costs is likely the key driver of the organization of flux patterns, as described, e.g., in (7). In richer media (faster growth), the fitness remains instead at a roughly constant (small) distance from the optimum, suggesting that growth is mainly information limited: increases in fitness require fine-tuning metabolic fluxes to encode more information into $p(\mathbf{v};\mathbf{c})$. Notably, the crossover from one regime to the other occurs around the growth rate where acetate overflow sets in (9).

Figure 2.


(A) Vertical distance between the F-I bound and inferred F-I pairs versus empirical growth rate. The dashed line is a guide for the eye. (B) Norms of the projections of the fields c onto the feasible space F (supporting material, “projections of the coefficients along individual flux directions”) as a function of the experimentally measured biomass. In color bars, orange and blue shades are used for data coming from (15) and (32), respectively.

Physical meaning of the inferred Lagrange multipliers

Qualitative changes in the flux distributions are also reflected in the behavior of the inferred fields $\mathbf{c}=\{c_j\}$ (see Eq. (2)). Roughly speaking, these quantities can be interpreted as “forces” acting on fluxes: the larger $c_j$, the more $p(\mathbf{v};\mathbf{c})$ is deformed in the direction of flux $j$ with respect to the uniform distribution on $\mathcal{F}$ with the prescribed glucose uptake in order for $\langle v_j^e\rangle$ to match $v_j^{\mathrm{xp}}$ (materials and methods). The projection $\mathbf{c}^{\mathrm{proj}}$ of the field vector $\mathbf{c}$ onto the feasible space $\mathcal{F}$ (or, more precisely, its norm $\|\mathbf{c}^{\mathrm{proj}}\|$, see supporting material, “projections of the coefficients along individual flux directions”) therefore quantifies the overall deformation required to reproduce data once the glucose uptake rate is given. One sees (Fig. 2 B) that $\|\mathbf{c}^{\mathrm{proj}}\|$ decreases approximately as $v_{\mathrm{bm}}^{-0.4}$ as the growth rate increases. In other words, as the glucose uptake increases, inferred distributions get closer to being as broad as possible given the glucose uptake. Because optimal populations are the least constrained at any given fitness, one may think that experimental populations also get globally less constrained as the medium gets richer. This is, however, not the case. To show it, one must compute, for every experiment, the fitness-information pairs obtained by varying only the biomass output (i.e., the coefficient associated with the biomass reaction) at fixed fields $\mathbf{c}$. For any given experiment (i.e., for any given $\mathbf{c}$), this procedure returns a line in the fitness-information plane describing the values of $I$ that would correspond to a population subject to the same fields, but carrying a higher fitness. By construction, populations lying to the left of this line at any given growth rate encode less information into the flux distribution than the reference population and are therefore globally less constrained. A representative example in Fig. S8 shows the isofield lines for two experiments from (32) and one from (15).
One sees that populations tend to lie to the right of these lines. Hence, contrary to intuition, faster-growing populations are slightly more constrained than slower ones despite being closer to the optimal fitness and information content.

Finally, the projections of c along the directions of individual degrees of freedom of the feasible space F provide information about the degree of regulation of individual reactions (Fig. S10). At slower growth, reactions belonging to glycolysis, the Entner-Doudoroff pathway and the glyoxylate shunt get more and more downregulated compared with the unbiased mean as the glucose level is limited, while fluxes through the pentose phosphate pathway, the TCA cycle and oxidative phosphorylation are mostly upregulated. This picture effectively recapitulates known patterns of proteome allocation in E. coli (7). The projection of c on the biomass production rate is relatively large at faster growth but behaves erratically in poor media. Marginal flux distributions (supporting material, “probability densities of the nonmeasured fluxes” and Fig. S7) confirm this picture in greater detail.

Inferred versus optimal patterns of pathway regulation

Besides quantifying the distortion of the inferred phenotype distribution (2) with respect to a uniform distribution (at given nutrient intake), the fields c also provide information about how different metabolic pathways are used in different conditions. At the simplest level, principal-component analysis (PCA) performed on the ensemble of the 33 inferred c vectors (one per experiment) shows that experiments are classified by the projection on the first component in two clusters characterized by distinct acetate excretion profiles (supporting material, “dimensionality reduction” and Fig. S1; note that PCA correctly assigns the data from (15) to the two clusters). This confirms carbon overflow as the key separator of phenotypic behavior in carbon-limited E. coli growth.

At a more refined level, one can consider the average variance of fluxes in each pathway $P$ (with pathways defined according to the network reconstruction provided in Table S1 of (15)), given by

$$\overline{\mathrm{Var}(v_i)}_{i\in P}=\frac{1}{|P|}\sum_{i\in P}\Sigma_{ii}.\tag{15}$$

A larger (respectively, smaller) variance indicates that the pathway is globally less (respectively, more) regulated, as its fluxes are allowed larger variability on average. Fig. 3 reports a comparison between optimal and inferred values for six key pathways. While glycolysis, Entner-Doudoroff, and pentose phosphate pathways appear to be tightly controlled in all conditions (both in optimal and inferred distributions), respiration and the TCA cycle display a significant (and remarkably similar) pattern of variability in both cases. Notably, their variability undergoes strong modulations as the growth rate changes. The major difference between the two cases is seen in the fermentation pathway, which appears to be much more variable in empirical populations than it is at optimality. Finally, in both optimal and inferred distributions the overall variability of fluxes within pathways appears to decrease close to the acetate onset point (ca. 0.6/h), again pointing to the occurrence of a major regulatory transition. Detailed results for all pathways included in the network model are given in Fig. S9. Taken together, changes in the variability of pathways (Fig. 3) and in the regulation of different reactions (Fig. S10) suggest that some of the known empirical facts regarding the use of cellular resources may result from the need to preserve a sufficiently large metabolic variability in any growth condition.
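The pathway-averaged variance of Eq. (15) only requires the diagonal of the flux covariance matrix and a list of reaction indices per pathway. The following sketch uses a hypothetical four-flux covariance matrix and made-up pathway assignments; in practice both would come from the inferred distribution and from the network reconstruction.

```python
import numpy as np


def pathway_mean_variance(cov, pathways):
    """Average the diagonal of the flux covariance matrix over each pathway (Eq. 15)."""
    variances = np.diag(np.asarray(cov, dtype=float))
    return {name: float(variances[list(members)].mean())
            for name, members in pathways.items()}


# Hypothetical 4-flux covariance and pathway assignment, for illustration only
cov = np.diag([0.1, 0.3, 2.0, 4.0])
pathways = {"pathway_A": [0, 1], "pathway_B": [2, 3]}
mean_var = pathway_mean_variance(cov, pathways)
print(mean_var)
```

With these made-up numbers, the second pathway has the larger mean variance and would be read as the less tightly regulated of the two.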

Figure 3.


Mean variance of fluxes (Eq. 15) through six metabolic pathways as a function of the growth rate (empirical values) in optimal (purple) and inferred (cyan) distributions. (Note that markers appear darker at points where many of them overlap. This is mainly due to replicated experiments in (32).)

Reduced representations of metabolic activity are not sparse

An important problem arising in the analysis of metabolic networks concerns the possibility that whole network flux configurations might be efficiently represented by a small number of collective variables (“pathways”), whose control would be the central task of metabolic regulation. Experiments probing these variables would effectively allow us to reconstruct the activity across the whole network. Fig. 4 indeed shows that about 90% of the empirical variance of inferred fields c (red line) is explained by the first five principal components, suggesting that a significant reduction of dimensionality might be possible (as found, e.g., in (41)). Encouraged by this, we then estimated how well one can reconstruct inferred Lagrange coefficients and mean fluxes from PCA coefficients as a function of the number of PCA components included in the calculation. Ideally, an efficient representation would only require a small number of components. Denoting by k the number of PCA components included, as quality indicators we used the quantities

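$$\varepsilon_{\mathrm{coeff}}(k)=\overline{\left\|\mathbf{c}_a^{\mathrm{proj}}-\mathbf{c}_a^{\mathrm{PCA}(k)}\right\|},\tag{16}$$
$$\varepsilon_{\mathrm{means}}(k)=\overline{\left\|\langle\mathbf{v}_{\mathcal{E}}\rangle_{\mathbf{c}_a}-\langle\mathbf{v}_{\mathcal{E}}\rangle_{\mathbf{c}_a^{\mathrm{PCA}(k)}}\right\|},\tag{17}$$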

where $\mathbf{c}_a^{\mathrm{proj}}$ (respectively, $\mathbf{c}_a^{\mathrm{PCA}(k)}$) is the projected vector of Lagrange multipliers for experiment $a=1,\ldots,33$ from the full inference problem (respectively, obtained by including only the first $k$ PCA components), while $\langle\mathbf{v}_{\mathcal{E}}\rangle_{(\cdot)}$ collects the mean values of the measured fluxes $\mathcal{E}$ according to a MaxEnt distribution parametrized by the external fields $(\cdot)$. The overbar denotes the average over the 33 experiments. Fig. 4 shows that $\varepsilon_{\mathrm{coeff}}$ is generically small and decreases fast with $k$, in line with the fact that the first five principal components explain most of the variability of the original coefficients. $\varepsilon_{\mathrm{means}}$, however, decreases much more slowly. In practice, lossless inference of the coefficients $\mathbf{c}$ (and hence the reconstruction of mean fluxes) requires the inclusion of the first 18 principal components (supporting material, “accessing the most informative fluxes” and Fig. S2). Incidentally, this number coincides with the number of fluxes measured in (15,32) that are linearly independent. Note that, besides reproducing measured fluxes, inferred distributions also yield predictions for metabolic fluxes that are inaccessible to labeling experiments, stored in the marginal densities of individual fluxes (supporting material, “probability densities of the nonmeasured fluxes” and Fig. S7). Likewise, they allow for accurate predictions of the values of physiological parameters quantified independently of fluxes, such as growth and acetate excretion rates, when such quantities are excluded from the inference procedure (Fig. 5).
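The truncation analysis behind Eq. (16) can be sketched with a singular value decomposition. The example below builds synthetic field vectors (random, low-rank plus noise) in place of the inferred ones, so the numbers it prints carry no biological meaning, and the function name is hypothetical. It covers only the analogue of $\varepsilon_{\mathrm{coeff}}$; evaluating the analogue of Eq. (17) would additionally require recomputing the MaxEnt means for each truncated field vector.

```python
import numpy as np


def pca_truncation_errors(C):
    """Explained variance and mean reconstruction error versus number of components.

    C has one row per experiment and one column per field coordinate.
    """
    mean = C.mean(axis=0)
    U, s, Vt = np.linalg.svd(C - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    errors = []
    for k in range(1, len(s) + 1):
        C_k = mean + (U[:, :k] * s[:k]) @ Vt[:k]      # rank-k reconstruction
        errors.append(np.linalg.norm(C - C_k, axis=1).mean())
    return explained, np.array(errors)


# Synthetic stand-in for the 33 inferred field vectors (assumed, not real data)
rng = np.random.default_rng(0)
C = rng.normal(size=(33, 3)) @ rng.normal(size=(3, 10))
C += 0.01 * rng.normal(size=C.shape)
explained, errors = pca_truncation_errors(C)
print("explained variance:", np.round(explained[:4], 4))
print("mean reconstruction error:", np.round(errors[:4], 4))
```

Because the synthetic data are essentially rank three, three components already explain almost all of the variance. A high explained variance for the fields does not by itself guarantee small errors on the mean fluxes, which is the distinction drawn in the text.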

Figure 4.

Explained variance of the coefficients (in percentage, red line) and reconstruction error of the original fields (continuous blue line) and phenotypes (dashed blue line) as a function of the number of PCA components employed to compute their low-rank counterpart.

Figure 5.

Mean values of the acetate excretion rate (top) and biomass production rate (bottom), measured (purple) and predicted from the inferred distribution (cyan), for the different experiments from (15,32). For comparison, bare flux balance analysis predictions correspond to $v_{\mathrm{ex\,ace}}=0$ and $v_{\mathrm{bm}}/v_{\mathrm{bm}}^{\max}=1$, respectively. Error bars represent experimental errors as reported in (15,32).

Metabolic control coefficients

The fact that the complexity of metabolic activity is not captured by a few effective variables indicates that the moments of (2) are highly sensitive to the values of the inferred fields c: small changes in the latter can induce large rearrangements in the distribution. Metabolism, in other words, forms a system of globally coupled processes. Within the theoretical framework employed here, this aspect is fully quantified by the flux-flux correlation matrices. Indeed, starting from the definition of Z(c), i.e., (see (2))

$$Z(c)=\int_{F}\exp\Big[\sum_{j\in E}c_j v_j\Big]\,dv, \tag{18}$$

one can easily show by a direct calculation that

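$$\frac{\partial \ln Z(c)}{\partial c_i}=\frac{1}{Z(c)}\int_{F}v_i\exp\Big[\sum_{j\in E}c_j v_j\Big]\,dv\equiv\langle v_i\rangle_c \tag{19}$$
$$\frac{\partial^2 \ln Z(c)}{\partial c_i\,\partial c_j}=\langle v_i v_j\rangle_c-\langle v_i\rangle_c\langle v_j\rangle_c\equiv C_{ij} \tag{20}$$
$$\frac{\partial\langle v_i\rangle_c}{\partial c_j}=\frac{\partial\langle v_j\rangle_c}{\partial c_i}=C_{ij}, \tag{21}$$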

where $\langle\cdot\rangle_c=\int_{F}(\cdot)\,p(v;c)\,dv$ denotes the average over the inferred distribution. Hence (see (21)), flux-flux correlations $C_{ij}$ (computable by EP, see supporting material, “mathematical details of the expectation propagation algorithm”) can be immediately interpreted as “metabolic control coefficients” relating changes in flux i to changes in field j (or vice versa). For instance, the correlation matrix computed for a representative experiment from (32) (supporting material, “correlation matrices” and Fig. S6) suggests that glycolytic reactions are positively coupled to other glycolytic reactions (e.g., an upregulation in one reaction, quantified by a change in the corresponding field, will increase the mean flux through the other), whereas they are mostly negatively coupled to reactions in the pentose phosphate pathway (upregulations in the latter will decrease the flux through the former). These couplings span the entire metabolic network. Indeed, one sees that changes in one field typically propagate to a large number of reactions, supporting the idea that the cross talk between metabolic reactions is significantly nonlocal.
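The identity between the response of mean fluxes to field changes and the flux-flux covariance can be checked numerically on a toy example. The sketch below is not the EP computation used in the study: it assumes a hypothetical three-dimensional "feasible space" represented by uniformly sampled points in a unit box, reweights them by exp(c.v), and compares a finite-difference estimate of the derivative of the mean with respect to each field against the covariance matrix. All names and numbers are illustrative assumptions.

```python
import numpy as np

def tilted_moments(V, c):
    """Mean and covariance of v under p(v;c) proportional to exp(c.v),
    with the feasible space represented by the sample points (rows) of V."""
    logw = V @ c
    w = np.exp(logw - logw.max())   # subtract the max for numerical stability
    w /= w.sum()
    mean = w @ V
    centered = V - mean
    cov = (centered * w[:, None]).T @ centered
    return mean, cov

# Toy feasible space: uniform points in a unit box (an assumption)
rng = np.random.default_rng(1)
V = rng.uniform(0.0, 1.0, size=(20000, 3))
c = np.array([0.5, -1.0, 2.0])

mean, C_cov = tilted_moments(V, c)

# Central finite differences: J[i, j] approximates d<v_i>/dc_j
h = 1e-5
J = np.zeros((3, 3))
for j in range(3):
    dc = np.zeros(3)
    dc[j] = h
    m_plus, _ = tilted_moments(V, c + dc)
    m_minus, _ = tilted_moments(V, c - dc)
    J[:, j] = (m_plus - m_minus) / (2 * h)

print(np.max(np.abs(J - C_cov)))
```

For the sampled measure the identity holds exactly, so the residual printed at the end reflects only finite-difference and round-off error. The symmetry of J mirrors the equality of the two derivatives in (21).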

Discussion

Biological significance of the information content

While the idea that fitness and information content of flux distributions are interrelated seems rather natural, the physiological meaning of the latter is not obvious. Technically, I quantifies the deviation of the inferred distribution p(v;c) from uniformity in a given medium. Small values of I imply that experiment-derived constraints do not significantly modify our previous knowledge of the flux distribution, corresponding to all flux vectors in the feasible space defined by the given uptake rates being equally likely. On the other hand, large values of I imply that the inferred likelihood has a small overlap with the uniform distribution in the same medium. In this sense, one can say that I provides a proxy for the amount of metabolic regulation required to grow in a given medium. We have seen that the information content per degree of freedom more than doubles as the growth rate goes from 0.05/h to about 1/h (Fig. 1 D). Such a gain is mainly due to a systemic fine-tuning of fluxes and correlations rather than to the tightened control of a few pathways, in agreement with evidence suggesting that system-wide rearrangements underlie response to changing carbon levels in E. coli (7).
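One way to make "deviation from uniformity" concrete is a Kullback-Leibler divergence from the uniform distribution on the feasible space, expressed in bits. The sketch below rests on that reading, which is an assumption here since the definition of I is given earlier in the paper and is not reproduced in this section. It uses a hypothetical sampled box as the feasible space and an exponential-family distribution p(v;c) proportional to exp(c.v). The function name `information_bits` and all numerical values are illustrative.

```python
import numpy as np

def information_bits(V, c):
    """KL divergence (in bits) of p(v;c), proportional to exp(c.v),
    from the uniform distribution over the sample points (rows) of V."""
    logw = V @ c
    shift = logw.max()
    log_Z = shift + np.log(np.exp(logw - shift).sum())   # log of sum_i exp(c.v_i)
    p = np.exp(logw - log_Z)                              # normalized weights
    # KL(p || uniform) = sum_i p_i * ln(p_i * N) = <c.v> - ln Z + ln N
    kl_nats = p @ logw - log_Z + np.log(len(V))
    return kl_nats / np.log(2)

# Toy feasible space and field direction (assumptions)
rng = np.random.default_rng(3)
V = rng.uniform(0.0, 1.0, size=(10000, 3))
direction = np.array([1.0, -0.5, 0.25])

scales = [0.0, 1.0, 5.0, 20.0]
info = [float(information_bits(V, s * direction)) for s in scales]
print(info)
```

With vanishing fields the distribution is uniform and the information is zero; it grows as the fields become stronger. Dividing by the dimension of the space would give a per-degree-of-freedom value. With a finite sample the estimate cannot exceed log2 of the number of points, so it saturates for very strong fields.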

Our analysis also shows that inferred distributions exceed the minimum required information content by roughly 1 bit per degree of freedom in all growth conditions (Fig. 1 D). This suggests that the biochemical constraints used to define the feasible space (flux reversibility, ranges of variability, etc.) might be too conservative. Further ingredients affecting the metabolism of single cells, such as biosynthetic costs (25), might also reduce the feasible space and bring data closer to the theoretical bound. However, the gap may also indicate that population growth requires a minimum amount of regulatory information, in line with the idea that minimal complexity (as opposed to minimal number of components) is the defining characteristic of cells (42). That regulatory interactions and mechanical effects are crucial in determining E. coli’s overall metabolic capabilities is indicated by the fact that they remain substantially unchanged after a large-scale removal of unnecessary genes (43, 44, 45). By contrast, they are significantly affected by the selective knockout of a small number of genes through which specific cellular tasks are optimized (46, 47, 48). In this sense, constraint-based models may be missing a substantial amount of regulatory interactions that would effectively reduce the size of the feasible space F. Identifying these constraints could bring empirical populations closer to the F-I bound and provide crucial hints about the nature of optimality in bacterial growth.

It is finally important to remark that the F-I bound we define is fundamentally different from the fitness-information relationship derived in (49). In that case, one quantifies the information about nutrient availability that has to be encoded in the level of a nutrient-processing enzyme to achieve a given fitness. In our case, information is a measure of the high-dimensional space of flux configurations that is effectively accessible to the system.

Limitations of the study

Besides the information encoded in the network structure, the key physical assumption made in our inference is that metabolic networks are at a NESS described by the mass balance conditions alone. This means that we do not account for factors such as biosynthetic costs, molecular crowding, membrane occupancy, etc. All of these are likely essential for the metabolic behavior of single cells. However, including them in an inverse model defined on F would necessarily require additional assumptions about how they are linked to metabolic fluxes.

On the technical side, our study is limited by two facts that are not easily avoided. 1) The data sets we used are not homogeneous, so it is a priori difficult to consider one as a continuation of the other at different growth rates. Carrying out this study on a broader, unique fluxomic data set covering a large enough range of growth rates would likely yield a more clean-cut picture. That a consistent scenario can emerge despite this limitation is in this respect quite remarkable. 2) In our framework, we implicitly interpret measured flux variances as proxies for the cell-to-cell variability. While this assumption has given consistent results when used in the context of single-cell data (21,22), a more detailed understanding of the sources of variability and error in fluxomics would allow one to fine-tune the application of the MaxEnt scheme for the inverse problem considered here. It is, however, important to note that, while our approach is capable of efficiently representing the empirical variability, it cannot point to specific causal factors behind it. For this goal, different types of models (e.g., biochemically detailed dynamical models) are necessary.

Relation to other approaches

The most immediate comparison for our results is given by standard biomass maximization, which corresponds to the limit $\beta\to\infty$ in (14). Previous work has shown that empirical data, including distributions of elongation rates in exponentially growing populations and measured fluxes, are better described using (14) with finite β rather than its $\beta\to\infty$ limit (21,22). Here, we have effectively quantified how close flux distributions inferred from data are to (14) in terms of fitness and information content. Another set of potentially related problems concerns the experiment-guided determination of an objective function for constraint-based models. Different techniques have been proposed in the past to infer objectives or discriminate between various alternatives (50, 51, 52, 53, 54, 55). While the vector c of Lagrange multipliers does partially align with the biomass output, our analysis does not highlight a clear objective function for constraint-based models. On the contrary, our results support the idea that the growth of bacterial populations is governed by a trade-off between mean single-cell biomass and heterogeneity. Notice that optimizing the mean biomass over time provides individual cells with an effective way to cope with multiple sources of variability (56). For instance, bacteria in fluctuating environments may be unable to adjust fluxes to the distribution that maximizes the instantaneous biomass synthetic rate due to the biosynthetic cost of the regulatory machinery implementing the adjustments. Regulatory programs selected over longer timescales would essentially optimize the frequency with which metabolism is adjusted in varying conditions.
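The relation between finite β and the biomass-maximization limit can be illustrated on a toy model. The sketch assumes that (14), which is not reproduced in this section, has the Boltzmann form in which each flux configuration is weighted by exp(β v_bm). The feasible space is a hypothetical sampled box and the biomass flux a hypothetical linear function of the fluxes; neither corresponds to the E. coli network used in the study.

```python
import numpy as np

def boltzmann_growth_stats(v_bm, beta):
    """Mean and standard deviation of the biomass flux when each sampled
    configuration is weighted proportionally to exp(beta * v_bm)."""
    logw = beta * v_bm
    w = np.exp(logw - logw.max())   # stabilized weights
    w /= w.sum()
    mean = w @ v_bm
    std = np.sqrt(w @ (v_bm - mean) ** 2)
    return mean, std

# Toy feasible space and biomass function (assumptions)
rng = np.random.default_rng(2)
V = rng.uniform(0.0, 1.0, size=(50000, 4))
biomass = V @ np.array([0.4, 0.3, 0.2, 0.1])
v_max = biomass.max()               # stand-in for the optimum found by flux balance analysis

betas = [0.0, 5.0, 50.0, 500.0, 5000.0]
stats = [boltzmann_growth_stats(biomass, b) for b in betas]
means = np.array([m for m, _ in stats]) / v_max   # mean growth relative to the maximum
stds = np.array([s for _, s in stats]) / v_max    # relative spread across configurations

for b, m, s in zip(betas, means, stds):
    print(b, round(float(m), 4), round(float(s), 4))
```

At β = 0 all configurations are equally likely and the spread is largest. As β grows the mean approaches the maximum and the spread collapses, which is the regime described by standard biomass maximization. Finite β interpolates between the two, trading mean growth against variability.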

Finally, we note that MaxEnt-based models of metabolic networks have been employed in the past for a variety of purposes: to guide the decomposition of flux configurations into physiologically significant modes (57,58); explain the variability observed in bacterial populations (21,59) and continuous cell cultures (60,61) (note: the latter article appeared while the present paper was under review); reproduce empirical data on fluxes (22); derive dynamic strategies of cellular resource allocation (62); or predict response times to changing environments (63). While also based on the MaxEnt principle, the work presented here faces the question of heterogeneity from a rather different viewpoint, aiming essentially at bridging the gap between optimization-based methods and empirical results by building an efficient representation of metabolic data using constraint-based models. Our hope is that such an approach will lead to new theoretical insights into the nature and optimality of bacterial growth.

Conclusion

We have shown here that, as the growth medium gets richer, phenotype distributions inferred for E. coli populations appear to follow and slowly approach a theoretical limit that quantitatively relates the mean biomass production rate to the cell-to-cell variability in metabolic phenotypes. Specifically, the mean biomass gets closer to the maximum allowed by the inferred heterogeneity of the population. Despite the fact that large fluctuations affect the activity of metabolic pathways, the scenario we obtain reproduces some of the well-known trade-offs that characterize E. coli growth under carbon limitation, including downregulation of glycolysis, upregulation of respiration and the TCA cycle, and the transition to acetate overflow. This suggests that E. coli populations trade some of their fitness to maintain their metabolic heterogeneity nearly as large as possible in all growth conditions considered.

Author contributions

A.P.M. performed research, analyzed the data, prepared figures and tables, contributed materials and analysis tools, wrote the paper, and critically reviewed the manuscript. A.B., A.P., D.D.M., and A.D.M. conceived and designed research, contributed materials and analysis tools, wrote the paper and critically reviewed the manuscript.

Acknowledgments

A.B., A.D.M., A.P., and A.P.M. acknowledge financial support from Marie Skłodowska-Curie, grant agreement no. 734439 (INFERNET).

Editor: Mark Alber.

Footnotes

Supporting material can be found online at https://doi.org/10.1016/j.bpj.2022.04.012.

Contributor Information

Daniele De Martino, Email: daniele.demartino@ehu.eus.

Andrea De Martino, Email: andrea.demartino@polito.it.

Supporting citations

References (64, 65, 66, 67) appear in the supporting material.

Supporting material

Document S1. Figures S1–S10
mmc1.pdf (7.5MB, pdf)
Document S2. Article plus supporting material
mmc2.pdf (8.9MB, pdf)

References

  • 1. Lewis N.E., Nagarajan H., Palsson B.O. Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nat. Rev. Microbiol. 2012;10:291–305. doi: 10.1038/nrmicro2737.
  • 2. Feist A.M., Palsson B.O. The biomass objective function. Curr. Opin. Microbiol. 2010;13:344–349. doi: 10.1016/j.mib.2010.03.003.
  • 3. Dourado H., Lercher M.J. An analytical theory of balanced cellular growth. Nat. Commun. 2020;11:1226–1314. doi: 10.1038/s41467-020-14751-w.
  • 4. Bruggeman F.J., Planqué R., et al. Teusink B. Searching for principles of microbial physiology. FEMS Microbiol. Rev. 2020;44:821–844. doi: 10.1093/femsre/fuaa034.
  • 5. Bordbar A., Monk J.M., et al. Palsson B.O. Constraint-based models predict metabolic and associated cellular functions. Nat. Rev. Genet. 2014;15:107–120. doi: 10.1038/nrg3643.
  • 6. Scott M., Gunderson C.W., et al. Hwa T. Interdependence of cell growth and gene expression: origins and consequences. Science. 2010;330:1099–1102. doi: 10.1126/science.1192588.
  • 7. Hui S., Silverman J.M., et al. Williamson J.R. Quantitative proteomic analysis reveals a simple strategy of global resource allocation in bacteria. Mol. Syst. Biol. 2015;11:784. doi: 10.15252/msb.20145697.
  • 8. Flamholz A., Noor E., et al. Milo R. Glycolytic strategy as a tradeoff between energy yield and protein cost. Proc. Natl. Acad. Sci. 2013;110:10039–10044. doi: 10.1073/pnas.1215283110.
  • 9. Basan M., Hui S., et al. Hwa T. Overflow metabolism in Escherichia coli results from efficient proteome allocation. Nature. 2015;528:99–104. doi: 10.1038/nature15765.
  • 10. Basan M., Honda T., et al. Sauer U. A universal trade-off between growth and lag in fluctuating environments. Nature. 2020;584:7821. doi: 10.1038/s41586-020-2505-4.
  • 11. Utrilla J., O’Brien E.J., et al. Palsson B.O. Global rebalancing of cellular resources by pleiotropic point mutations illustrates a multi-scale mechanism of adaptive evolution. Cell Syst. 2016;2:260–271. doi: 10.1016/j.cels.2016.04.003.
  • 12. Mori M., Schink S., et al. Hwa T. Quantifying the benefit of a proteome reserve in fluctuating environments. Nat. Commun. 2017;8:1225–1228. doi: 10.1038/s41467-017-01242-8.
  • 13. Towbin B.D., et al. Alon U. Optimality and sub-optimality in a bacterial growth law. Nat. Commun. 2017;8:14123–14128. doi: 10.1038/ncomms14123.
  • 14. Erickson D.W., Schink S.J., et al. Hwa T. A global resource allocation strategy governs growth transition kinetics of Escherichia coli. Nature. 2017;551:119–123. doi: 10.1038/nature24299.
  • 15. Schuetz R., Zamboni N., et al. Sauer U. Multidimensional optimality of microbial metabolism. Science. 2012;336:601–604. doi: 10.1126/science.1216882.
  • 16. Shoval O., Sheftel H., et al. Alon U. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science. 2012;336:1157–1160. doi: 10.1126/science.1217405.
  • 17. Mori M., Marinari E., De Martino A. A yield-cost tradeoff governs Escherichia coli’s decision between fermentation and respiration in carbon-limited growth. NPJ Syst. Biol. Appl. 2019;5:16–19. doi: 10.1038/s41540-019-0093-4.
  • 18. Kiviet D.J., Nghe P., et al. Tans S.J. Stochasticity of metabolism and growth at the single-cell level. Nature. 2014;514:376–379. doi: 10.1038/nature13582.
  • 19. Taheri-Araghi S., Bradde S., et al. Jun S. Cell-size control and homeostasis in bacteria. Curr. Biol. 2015;25:385–391. doi: 10.1016/j.cub.2014.12.009.
  • 20. Kennard A.S., Osella M.J., et al. Cosentino Lagomarsino M. Individuality and universality in the growth-division laws of single E. coli cells. Phys. Rev. E. 2016;93:012408. doi: 10.1103/physreve.93.012408.
  • 21. De Martino D., Capuani F., De Martino A. Growth against entropy in bacterial metabolism: the phenotypic trade-off behind empirical growth rate distributions in E. coli. Phys. Biol. 2016;13:036005. doi: 10.1088/1478-3975/13/3/036005.
  • 22. De Martino D., MC Andersson A., et al. Tkacik G. Statistical mechanics for metabolic networks during steady state growth. Nat. Commun. 2018;9:2988–2989. doi: 10.1038/s41467-018-05417-9.
  • 23. O’Brien E.J., Lerman J.A., et al. Palsson B.O. Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction. Mol. Syst. Biol. 2013;9:693. doi: 10.1038/msb.2013.52.
  • 24. Goelzer A., Muntel J., et al. Becher D. Quantitative prediction of genome-wide resource allocation in bacteria. Metab. Eng. 2015;32:232–243. doi: 10.1016/j.ymben.2015.10.003.
  • 25. Mori M., Hwa T., et al. Marinari E. Constrained allocation flux balance analysis. PLoS Comput. Biol. 2016;12:e1004913. doi: 10.1371/journal.pcbi.1004913.
  • 26. Reimers A.M., Knoop H., et al. Steuer R. Cellular trade-offs and optimal resource allocation during cyanobacterial diurnal growth. Proc. Natl. Acad. Sci. 2017;114:E6457–E6465. doi: 10.1073/pnas.1617508114.
  • 27. Feist A.M., Palsson B.O. What do cells actually want? Genome Biol. 2016;17:110. doi: 10.1186/s13059-016-0983-3.
  • 28. Dai Z., Locasale J.W. Understanding metabolism with flux analysis: from theory to application. Metab. Eng. 2017;43:94–102. doi: 10.1016/j.ymben.2016.09.005.
  • 29. De Martino A., De Martino D. An introduction to the maximum entropy approach and its application to inference problems in biology. Heliyon. 2018;4:e00596. doi: 10.1016/j.heliyon.2018.e00596.
  • 30. MacKay D.J. Cambridge University Press; 2003. Information Theory, Inference and Learning Algorithms.
  • 31. Schrijver A. John Wiley & Sons; 1998. Theory of Linear and Integer Programming.
  • 32. Nanchen A., Schicker A., Sauer U. Nonlinear dependency of intracellular fluxes on growth rate in miniaturized continuous cultures of Escherichia coli. Appl. Environ. Microbiol. 2006;72:1164–1172. doi: 10.1128/aem.72.2.1164-1172.2006.
  • 33. Wolfe A.J. The acetate switch. Microbiol. Mol. Biol. Rev. 2005;69:12–50. doi: 10.1128/mmbr.69.1.12-50.2005.
  • 34. Buescher J.M., Antoniewicz M.R., et al. Gottlieb E. A roadmap for interpreting 13C metabolite labeling patterns from cells. Curr. Opin. Biotechnol. 2015;34:189–201. doi: 10.1016/j.copbio.2015.02.003.
  • 35. De Martino D., Capuani F., et al. Marinari E. Counting and correcting thermodynamically infeasible flux cycles in genome-scale metabolic networks. Metabolites. 2013;3:946–966. doi: 10.3390/metabo3040946.
  • 36. Gudmundsson S., Thiele I. Computationally efficient flux variability analysis. BMC Bioinformatics. 2010;11:489. doi: 10.1186/1471-2105-11-489.
  • 37. Opper M., Winther O. Gaussian processes for classification: mean-field algorithms. Neural Comput. 2000;12:2655–2684. doi: 10.1162/089976600300014881.
  • 38. Minka T.P. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc.; 2001. Expectation propagation for approximate Bayesian inference; pp. 362–369.
  • 39. Braunstein A., Muntoni A.P., Pagnani A. An analytic approximation of the feasible space of metabolic networks. Nat. Commun. 2017;8:14915–14919. doi: 10.1038/ncomms14915.
  • 40. De Martino D., Mori M., Parisi V. Uniform sampling of steady states in metabolic networks: heterogeneous scales and rounding. PLoS One. 2015;10:e0122670. doi: 10.1371/journal.pone.0122670.
  • 41. Furusawa C., Kaneko K. Formation of dominant mode by evolution in biological systems. Phys. Rev. E. 2018;97:042410. doi: 10.1103/physreve.97.042410.
  • 42. Xavier J.C., Patil K.R., Rocha I. Systems biology perspectives on minimal and simpler cells. Microbiol. Mol. Biol. Rev. 2014;78:487–509. doi: 10.1128/mmbr.00050-13.
  • 43. Gerosa L., Sauer U. Regulation and control of metabolic fluxes in microbes. Curr. Opin. Biotechnol. 2011;22:566–575. doi: 10.1016/j.copbio.2011.04.016.
  • 44. Posfai G., Plunkett G., et al. Burland V. Emergent properties of reduced-genome Escherichia coli. Science. 2006;312:1044–1046. doi: 10.1126/science.1126439.
  • 45. Minton A.P., Rivas G. The Minimal Cell. Springer; Dordrecht: 2011. Biochemical reactions in the crowded and confined physiological environment: physical chemistry meets synthetic biology; pp. 73–89.
  • 46. Carlson R., Srienc F. Fundamental Escherichia coli biochemical pathways for biomass and energy production: identification of reactions. Biotechnol. Bioeng. 2004;85:1–19. doi: 10.1002/bit.10812.
  • 47. Trinh C.T., Carlson R., et al. Srienc F. Design, construction and performance of the most efficient biomass producing E. coli bacterium. Metab. Eng. 2006;8:628–638. doi: 10.1016/j.ymben.2006.07.006.
  • 48. Trinh C.T., Unrean P., Srienc F. Minimal Escherichia coli cell for the most efficient production of ethanol from hexoses and pentoses. Appl. Environ. Microbiol. 2008;74:3634–3643. doi: 10.1128/aem.02708-07.
  • 49. Bialek W. Princeton University Press; 2012. Biophysics: Searching for Principles.
  • 50. Burgard A.P., Maranas C.D. Optimization-based framework for inferring and testing hypothesized metabolic objective functions. Biotechnol. Bioeng. 2003;82:670–677. doi: 10.1002/bit.10617.
  • 51. Gianchandani E.P., Oberhardt M.A., et al. Papin J.A. Predicting biological system objectives de novo from internal state measurements. BMC Bioinformatics. 2008;9:43. doi: 10.1186/1471-2105-9-43.
  • 52. Chiu H.C., Segrè D. Genome Informatics 2008: Genome Informatics Series. Vol. 20. 2008. Comparative determination of biomass composition in differentially active metabolic states; pp. 171–182.
  • 53. Zhao Q., Stettner A.I., et al. Segrè D. Mapping the landscape of metabolic goals of a cell. Genome Biol. 2016;17:109. doi: 10.1186/s13059-016-0968-2.
  • 54. Yang L., Saunders M.A., et al. Bento J. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019. Estimating cellular goals from high-dimensional biological data; pp. 2202–2211.
  • 55. Knorr A.L., Jain R., Srivastava R. Bayesian-based selection of metabolic objective functions. Bioinformatics. 2007;23:351–357. doi: 10.1093/bioinformatics/btl619.
  • 56. De Martino A., Gueudré T., Miotto M. Exploration-exploitation tradeoffs dictate the optimal distributions of phenotypes for populations subject to fitness fluctuations. Phys. Rev. E. 2019;99:012417. doi: 10.1103/physreve.99.012417.
  • 57. Zhao Q., Kurata H. Maximum entropy decomposition of flux distribution at steady state to elementary modes. J. Biosci. Bioeng. 2009;107:84–89. doi: 10.1016/j.jbiosc.2008.09.011.
  • 58. Zhao Q., Kurata H. Use of maximum entropy principle with Lagrange multipliers extends the feasibility of elementary mode analysis. J. Biosci. Bioeng. 2010;110:254–261. doi: 10.1016/j.jbiosc.2010.01.015.
  • 59. De Martino D., Capuani F., De Martino A. Quantifying the entropic cost of cellular growth control. Phys. Rev. E. 2017;96:010401. doi: 10.1103/physreve.96.010401.
  • 60. Fernandez-de-Cossio-Diaz J., Mulet R. Maximum entropy and population heterogeneity in continuous cell cultures. PLoS Comput. Biol. 2019;15:e1006823. doi: 10.1371/journal.pcbi.1006823.
  • 61. Pereiro-Morejón J.A., Fernández-de-Cossio-Díaz J., Mulet R. Inferring metabolic fluxes in nutrient-limited continuous cultures: a Maximum Entropy Approach with minimum information. arXiv. 2021. Preprint at arXiv:2109.13149. doi: 10.1016/j.isci.2022.105450.
  • 62. Tourigny D.S. Dynamic metabolic resource allocation based on the maximum entropy principle. J. Math. Biol. 2020;80:2395–2430. doi: 10.1007/s00285-020-01499-6.
  • 63. De Martino D., Masoero D. Asymptotic analysis of noisy fitness maximization, applied to metabolism & growth. J. Stat. Mech. Theor. Exp. 2016;2016:123502. doi: 10.1088/1742-5468/aa4e8f.
  • 64. Braunstein A., Paola Muntoni A., et al. Pieropan M. Compressed sensing reconstruction using expectation propagation. J. Phys. A: Math. Theor. 2020;53:184001. doi: 10.1088/1751-8121/ab3065.
  • 65. Saldida J., Muntoni A.P., et al. Heinemann M. Unbiased metabolic flux inference through combined thermodynamic and 13C flux analysis. bioRxiv. 2020. Preprint. doi: 10.1101/2020.06.29.177063.
  • 66. Bernstein D.S. Princeton University Press; 2009. Matrix Mathematics: Theory, Facts, and Formulas.
  • 67. Orth J.D., Fleming R.M.T., Palsson B.Ø. Reconstruction and use of microbial metabolic networks: the core Escherichia coli metabolic model as an educational guide. EcoSal Plus. 2010;4:2. doi: 10.1128/ecosalplus.10.2.1.1.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
