PLOS Computational Biology. 2023 Jun 6;19(6):e1011156. doi: 10.1371/journal.pcbi.1011156

Mathematical properties of optimal fluxes in cellular reaction networks at balanced growth

Hugo Dourado 1,*, Wolfram Liebermeister 2, Oliver Ebenhöh 3, Martin J Lercher 1
Editor: Pedro Mendes 4
PMCID: PMC10275479  PMID: 37279246

Abstract

The physiology of biological cells evolved under physical and chemical constraints, such as mass conservation across the network of biochemical reactions, nonlinear reaction kinetics, and limits on cell density. For unicellular organisms, the fitness that governs this evolution is mainly determined by the balanced cellular growth rate. We previously introduced growth balance analysis (GBA) as a general framework to model and analyze such nonlinear systems, revealing important analytical properties of optimal balanced growth states. It has been shown that at optimality, only a minimal subset of reactions can have nonzero flux. However, no general principles have been established to determine if a specific reaction is active at optimality. Here, we extend the GBA framework to study the optimality of each biochemical reaction, and we identify the mathematical conditions determining whether a reaction is active or not at optimal growth in a given environment. We reformulate the mathematical problem in terms of a minimal number of dimensionless variables and use the Karush-Kuhn-Tucker (KKT) conditions to identify fundamental principles of optimal resource allocation in GBA models of any size and complexity. Our approach helps to identify from first principles the economic values of biochemical reactions, expressed as marginal changes in cellular growth rate; these economic values can be related to the costs and benefits of proteome allocation into the reactions’ catalysts. Our formulation also generalizes the concepts of Metabolic Control Analysis to models of growing cells. We show how the extended GBA framework unifies and extends previous approaches of cellular modeling and analysis, putting forward a program to analyze cellular growth through the stationarity conditions of a Lagrangian function. GBA thereby provides a general theoretical toolbox for the study of fundamental mathematical properties of balanced cellular growth.

Author summary

Mathematical models are an important tool to understand and predict the complex behavior of biological cells. This behavior is driven by nonlinear physical constraints that cannot be captured entirely in the prevalent modeling frameworks, which rely on simplified linear optimizations. The next generation of more realistic cell models will depend on an efficient mathematical formulation for the corresponding nonlinear optimization problem that facilitates the analytical study and numerical simulation of large models. Here, we present a succinct formulation for this nonlinear problem, and we derive the analytical properties of fluxes at optimal growth. We also show how these analytical properties can be understood in terms of economics and control theory, where they expose trade-offs related to the allocation of proteins.

Introduction

A core feature of microbial cells is self-replication—their ability to build a complete, identical cell exclusively out of the chemical compounds found in the environment. If a population of asynchronously replicating microbial cells grows exponentially in a constant environment, its self-replication can often be assumed to result from balanced growth, a non-equilibrium steady state in which every cellular component accumulates at the same rate in proportion to its total amount [1]. For non-interacting microbes in a constant environment, the balanced growth rate is equivalent to fitness [2].

The cellular composition is thus often interpreted as an approximate solution to a problem of optimal allocation, driven by natural selection. Accordingly, theoretical methods estimating the optimal allocation are used as a reference to understand cellular composition in vivo [3–8].

At the whole-cell level, a mechanistic understanding of the quantitative principles that shape cellular balanced growth has been approached predominantly through methods collectively classified as constraint-based modeling (CBM). CBM approaches define a solution space of feasible cellular states (usually defined by reaction fluxes) based on simple, mechanistic constraints. The predominant constraint in CBMs is flux balance, encoded through a linear system of equations that constrains the space of allowed reaction fluxes v [9, 10],

$$S\,\mathbf{v} = \mathbf{0}. \tag{1}$$

Here, v is a vector of reaction fluxes, i.e., reaction rates in units of [moles][time]⁻¹[volume]⁻¹. Each row of the stoichiometric matrix S corresponds to one metabolite, while each column corresponds to a metabolic reaction, with entries listing the corresponding stoichiometric coefficients of substrates (negative integers) and products (positive integers).

Thermodynamics and physiological limits, such as a limited nutrient uptake capacity, are typically approximated through fixed upper and/or lower bounds on the modeled fluxes v [11]. The most widely used CBM approach, Flux Balance Analysis (FBA) [11, 12], obtains plausible physiological states by optimizing some objective function over the feasible flux vectors. Frequently, the objective function is the flux through a hypothetical biomass reaction $v_{\mathrm{bio}}$, which mimics the accumulation of precursors for macromolecules and the consumption of energy for their assembly during growth.

Resource Balance Analysis (RBA) and metabolism and expression models (ME-models)—which are also based on optimization under constraints—go beyond FBA by aiming to model metabolism in its most general sense, with the ultimate goal of representing all chemical reactions that occur in a living organism [8, 13]. In contrast to FBA, these methods take into account the burden of producing the macromolecules (proteins and RNA) required for catalyzing each flux. They approximate the corresponding kinetic rate laws as linear relations between fluxes and the concentration of their catalysts, ignoring the dependence on reactant concentrations (except for dependencies on extracellular concentrations, which serve as model parameters).

All widely used CBMs [8, 11, 13, 14] are formulated as linear optimization problems, which can be solved efficiently even for genome-scale models with thousands of reactions. Accordingly, they are currently the most efficient tools to predict and understand realistic cellular models. However, by construction, these linear methods cannot capture the potentially complex nonlinear relationship between biochemical reaction fluxes—and hence cellular growth—and the concentrations of reactants involved as substrates and products. Instead of accounting for nonlinear kinetics, these methods rely on linear, phenomenological assumptions.

Nonlinear CBMs [6, 15–19] account for constraints such as nonlinear kinetic rate laws, linking the concentration of metabolites to reaction fluxes. This link means that the metabolite concentrations are now an output of the model instead of an input. Molenaar et al. [6] introduced “self-replicator” models that maximize the cellular growth rate, with reaction fluxes that are limited by fundamental physiological constraints including mass conservation, nonlinear rate laws, and limited protein density. Importantly, these models are completely self-contained, in the sense that in order to grow and self-replicate, all of a model’s individual components have to be produced explicitly by the model itself. Instead of using a phenomenological “biomass reaction”, the constrained optimization of growth predicts the detailed cell composition, and all possible trade-offs in resource allocation can be accounted for from first principles.

Similar to RBA and ME models, self-replicator models include a “ribosome” reaction that produces the necessary proteins. The proteins can be classified into three categories: transport proteins in the cell surface, which exchange mass with the environment; enzymes, which catalyze internal metabolic reactions; and the ribosome itself, which catalyzes the internal protein production, and which for simplicity is assumed to be composed only of proteins. The study of models of this type relies on the numerical solution of nonlinear optimization problems: while the approach in principle accommodates models with any number of reactions, the models presented so far have small, highly simplified reaction networks [6, 15–19].

We have previously formalized a general framework for modeling and analyzing nonlinear CBMs, an approach we termed growth balance analysis (GBA) [4] (Fig 1). GBA models are based on the self-replicator scheme, but instead of considering a fixed protein concentration, they consider a fixed combined mass density of all intracellular components, including also metabolites. Optimal cellular resource allocation, as predicted by GBA models, emerges exclusively from quantitative biochemical and physical principles, including the intrinsic nonlinear nature of the underlying reaction kinetics. In general, the optimization of nonlinear models is a non-convex problem, frequently hampered by the existence of multiple local optima [20]. Several studies have explored ad-hoc analytical solutions to convex, minimal nonlinear cell models consisting of up to three cellular reactions. Despite their simplicity, simulations with these schematic models are qualitatively consistent with the experimentally observed behavior of actual cells [6, 15–19].

Fig 1. Constraints in a GBA model.

A) In a GBA model, a cell exchanges external reactants (red circles) via transporters (blue squares at the cell surface); converts internal reactants (green circles) via enzymatic reactions (blue squares inside the metabolic network); and produces all proteins catalyzing the reactions (blue rectangle “a”) via a ribosome reaction “r”. The ribosome reaction consumes and returns metabolites to the metabolic network. In its strict sense, the metabolic network comprises the conversion of small molecules into energy and precursors for macromolecules. A model may also describe metabolism in its more general sense, including other enzymatic reactions such as those for DNA replication and transcription. B) All reactions in the model must conserve mass, a concept that comprises (i) mass balance within reactions: one unit of mass consumed (-1) results in one unit of mass produced (+1); and (ii) flux balance of reactant production and consumption, including the dilution by growth of all components (dashed arrow). C) Each reaction flux is catalyzed by a specific protein with turnover time τ (or equivalently, turnover rate k = 1/τ). τ is determined by kinetic rate laws and depends on the concentrations of reactants involved in the reaction; k = 1/τ has a maximal value kcat. D) Two basic density constraints govern the cellular interior: (i) the density of proteins “a”, and (ii) the total density ρ, which is the sum of all protein and metabolite concentrations.

The optimization of large nonlinear CBMs, such as GBA models, is still an open problem for numerical methods [20]. Thus, in previous work, we proposed a different approach to the analysis of GBA models: instead of looking for the optimal state of a GBA model with numerical methods, we ask what the analytical properties of this optimal state are. We named the equations specifying these analytical properties the balance equations of the corresponding GBA model [4]. If we further assume, as most CBM approaches do [6, 8, 11, 13–19], that cells are at an optimal state (or at least close to one), then the balance equations become useful tools to estimate and understand the resource allocation in actual biological cells. We derived the balance equation for each reactant in a GBA model and applied these equations to successfully predict the protein allocation into the ribosome of both E. coli and yeast across various growth conditions [4]. The application of these balance equations to estimate complete cellular states is, however, limited: in our original derivation [4], the stoichiometric matrix S of interest is assumed to have full column rank, which is the case for all optimal states of GBA models when one restricts S to the columns corresponding to active reactions (i.e., those with non-zero flux) [4, 21, 22]. The active reactions at optimal growth form an Elementary Growth Mode (EGM) [23], and they represent an Elementary Flux Mode (EFM) [21, 22] of a related FBA problem [4]. However, the optimal choice of this EGM/EFM and its constituent active reactions is not known a priori for large-scale models, and cannot be explained by our previous analytical study [4].

Below, we generalize our previous analytical study of GBA models by deriving the analytical properties of each reaction at optimal balanced growth, now also accounting for models with column-rank-deficient stoichiometric matrices—i.e., biochemical networks with alternative pathways. These analytical properties can be seen as generalized balance equations, explaining from first principles the optimal resource allocation strategy for each reaction in a cell. In particular, they explain from first principles the exact mathematical condition determining whether a reaction is active or not in an optimal growth state. We then interpret these balance equations in terms of marginal costs and benefits of reactions with respect to their influence on growth, and quantify how changes in the model parameters and external conditions control the optimal growth rate.

Results

We first present the notation and mathematical definitions for growth optimization, including an objective function and constraints. We then reformulate the problem in terms of flux fractions as the only free variables, which greatly simplifies the subsequent analytical study. Finally, we explore the consequences that emerge from the necessary optimality conditions in terms of economics and control theory, and discuss their biological significance.

Growth modeling

We define a GBA model as the triple of model parameters (M, τ, ρ). The matrix M describes the mass fractions of internal reactants consumed and produced by each reaction; τ is a vector of catalytic turnover times for all reactions, where each is a function of internal reactant concentrations c and possibly also external concentrations x (assumed to be fixed and given); and ρ is the combined mass concentration of all internal components. In the following paragraphs, we provide more detailed descriptions of the model constituents M, τ, and ρ. Here and below, we use the term “reaction” to also encompass transport processes across the cell surface, which are “catalyzed” by transporter proteins or protein complexes.

The matrix M was first introduced in [4]. It is constructed from the stoichiometric matrix S for the total, closed system, i.e., including rows for external reactants. We add a column “r” for the ribosome reaction that produces all cellular protein, as well as a row “a” corresponding to the total concentration of all proteins in mass units. We now first convert all entries to masses, by multiplying each row with the corresponding molecular mass. Because of mass balance, each column must then sum to 0. We next normalize each column such that the sum of its negative entries equals −1 and the sum of its positive entries equals +1. Now the entries correspond to the mass fractions of each reactant (rows) going into and out of each reaction (columns), as illustrated for the example in Fig 1B. Finally, we reduce the normalized matrix to a matrix for an open system, by dropping all rows for reactants external to the modeled cell. For the remaining internal reactants, we will assume a quasi-steady state and thus enforce mass conservation.
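The construction steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation (their R code is in S1 File). The function name `mass_fraction_matrix`, the two-reaction network, and the molecular masses are hypothetical, and the stoichiometric matrix is assumed to already contain the ribosome column “r” and the protein row “a”.

```python
def mass_fraction_matrix(S, masses, internal_rows):
    """Convert a closed-system stoichiometric matrix S into a mass fraction matrix M.

    S             -- list of rows (one per reactant) of molar stoichiometric coefficients
    masses        -- molecular mass of each reactant, in the same order as the rows of S
    internal_rows -- indices of the rows to keep in the open-system matrix M
    """
    n_rows, n_cols = len(S), len(S[0])
    # Step 1: convert molar coefficients to masses (row i times molecular mass i).
    W = [[S[i][j] * masses[i] for j in range(n_cols)] for i in range(n_rows)]
    # Step 2: normalize each column so consumed mass sums to -1 and produced mass to +1.
    for j in range(n_cols):
        produced = sum(W[i][j] for i in range(n_rows) if W[i][j] > 0)
        consumed = -sum(W[i][j] for i in range(n_rows) if W[i][j] < 0)
        if abs(produced - consumed) > 1e-9 * max(produced, consumed):
            raise ValueError(f"column {j} is not mass balanced")
        for i in range(n_rows):
            W[i][j] /= produced
    # Step 3: drop the rows of external reactants to obtain the open system.
    return [W[i] for i in internal_rows]


# Hypothetical closed system: rows = (external nutrient, metabolite m, protein a),
# columns = (transport s, ribosome r); three metabolites make one protein unit.
S = [[-1, 0],
     [1, -3],
     [0, 1]]
masses = [100.0, 100.0, 300.0]  # assumed molecular masses, arbitrary mass units
M = mass_fraction_matrix(S, masses, internal_rows=[1, 2])
print(M)
```

In this toy case the transport column keeps a single +1 entry once the external row is dropped, whereas the ribosome column still sums to zero.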

As illustrated in Fig 2, to simplify the notation for the following theoretical development, we partition the columns of M (indexed together by j) into index sets for reaction types: s for transport processes across the cell surface; e for internal enzymatic reactions; and r for the ribosome reaction, which is the only one that produces protein. We partition the rows of internal reactants (indexed together by i) into indices m for metabolites and a for total protein. We use the term “metabolites” in its more general sense, referring to any molecule in the cell that is not a protein. We distinguish vectors by using boldface, and vector components by using italics with the appropriate upper or lower index, e.g., $\mathbf{c}$ is the column vector of all internal reactant concentrations, $c^i$ are its components, and we use a lower index to indicate the components $c_i$ of the row vector $\mathbf{c}^\top$.

Fig 2. Schematic overview of the mass conservation constraint.

The constraint M v = μc is determined by the mass fraction matrix M, the column vector of mass fluxes v, the growth rate μ, and the column vector of internal reactant mass concentrations c. The indices indicate partitions according to the type of reaction (columns of M, v) or reactant (rows of M, c). The index i = (m, a) corresponds to rows for internal reactants, comprising rows m for metabolites, and a row “a” for the total mass concentration of all proteins. The index j = (s, e, r) corresponds, respectively, to transporters, enzymes, and the ribosome. We also use the index l for all reactions when necessary. Note that the row “a” of M has only one nonzero entry, $M^a_r$, corresponding to the mass fraction of protein produced by the ribosome reaction r. Different colors indicate three different types of reactions: red (transporters), blue (enzymes), green (ribosome); and two types of reactants: light gray (metabolites), dark gray (total protein), resulting in six partitions of M.

The ribosome reaction represents the last step in protein synthesis, and is assumed to be catalyzed by a “ribosome” consisting entirely of protein. We here ignore the RNA components of the ribosome for simplicity, but it is possible to extend the modeled ribosome to a more realistic RNA-protein complex. In addition, the enzymatic reactions (e) could be extended so that they include details of protein translation that occur before the last, “ribosome” step (r) [5]. Note that nonlinear genome-scale GBA models can be created from existing linear genome-scale models, by extending their stoichiometric matrix S with the addition of a ribosome reaction, normalizing it to M with the molecular masses, and adding the kinetic rate laws τ and density ρ (see below).

We assume that every reaction represented by a column j in M is catalyzed by a protein or protein complex with concentration pj—specifically, a transporter (s), an enzyme (e), or the ribosome (r). The corresponding flux vj is assumed to be proportional to pj, expressed as vj = pj/τj(c, x). Thus, τj is a function defined as the inverse of the usual metabolite-dependent factor in kinetic rate laws. τj must have a negative value when the flux is negative. Fig 1C shows the relationship between turnover time τ and turnover number kcat. S1 Text provides a basic discussion of rate laws and the necessary kinetic parameters. c = (cm, ca) is the vector of internal reactant concentrations, comprising all metabolite concentrations cm as well as ca, the combined mass concentration of all proteins. Hence, each τj may depend not only on the concentrations of the substrates and products of the corresponding reaction, but also on inhibitors and regulatory metabolites not involved in the turnover itself. The transport processes s are the only reactions whose rate laws may depend on the external concentrations x.

Note that in accordance with the normalization of M, all concentrations of proteins $p^j$ and reactants $c^i$ throughout this work are in units of [mass][volume]⁻¹. Fluxes ([mass][volume]⁻¹[time]⁻¹) and the kinetic parameters must then also be expressed in mass units, e.g., Michaelis constants $K_\mathrm{m}$ in [mass][volume]⁻¹ and turnover numbers $k_\mathrm{cat}$ in product mass per protein mass per time, resulting in [time]⁻¹.
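As a small illustration of this unit convention, the following Python sketch converts molar kinetic parameters into mass units. The helper names and all numerical values are hypothetical. The conversion assumes that the mass-based turnover number is the molar turnover number multiplied by the total product mass formed per turnover and divided by the catalyst's molecular mass.

```python
def kcat_to_mass_units(kcat_molar, product_mass_per_turnover, catalyst_mass):
    """Turnover number as product mass per catalyst mass per time ([time]^-1)."""
    return kcat_molar * product_mass_per_turnover / catalyst_mass


def km_to_mass_units(km_molar, reactant_mass):
    """Michaelis constant as a mass concentration ([mass][volume]^-1)."""
    return km_molar * reactant_mass


# Hypothetical enzyme: 10 turnovers per second, 180 g/mol of product per turnover,
# a 36,000 g/mol catalyst, and Km = 0.5 mmol/L for a 180 g/mol substrate.
kcat_mass = kcat_to_mass_units(10.0, 180.0, 36000.0)  # in s^-1
km_mass = km_to_mass_units(0.5e-3, 180.0)             # in g/L
print(kcat_mass, km_mass)
```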

ρ, the final constituent of GBA models, is the sum of all internal concentrations. We assume ρ to be constant, which is consistent with experimental data on E. coli across growth conditions and even across the cell cycle [24–26]. The mass balances exploited for the normalization of M mean that all reactants involved in reactions must be accounted for in the model and hence be included in the value of ρ; e.g., in a realistic model water is a reactant in many reactions, so ρ corresponds in this case to the total cellular density (or buoyant density). Simplified models may instead include only dry mass components, so that both M and ρ consider only these.

Mass conservation implies that in the mass fraction matrix M, each column sum $\gamma_j := \sum_i M^i_j$ is zero if it involves only the consumption and production of internal reactants (indices e, r). In contrast, transport reactions (with indices s), which bring mass into and out of the modeled system, do not conserve mass, resulting in the equations

$$\gamma_r = 0, \qquad \gamma_e = 0, \qquad \gamma_s \neq 0. \tag{2}$$

The property (2) guarantees mass conservation within reactions, information that is not always fully encoded in the stoichiometric matrix S, as many models ignore common reactants such as water (see discussion in S1 Text). While external reactants have no corresponding rows in M, their concentrations x may influence the turnover times $\tau_s$ of transporters. We present examples of GBA models in S1 Text and R code for their numerical optimization in S1 File.

We are interested in the cellular physiology, defined through the concentration vectors c, p and the vector of reaction fluxes v, at balanced growth. For a given model specified by (M, τ, ρ) and a given environment specified by x, balanced growth at the instantaneous rate μ is specified by the following constraints:

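$$M\,\mathbf{v} = \mu\,\mathbf{c} \quad \text{(mass conservation in balanced growth)} \tag{3}$$
$$\mathbf{p} = \mathbf{v} \odot \boldsymbol{\tau}(\mathbf{c}, \mathbf{x}) \quad \text{(reaction kinetics)} \tag{4}$$
$$c^a = \sum_j p^j \quad \text{(definition of total protein density } c^a\text{)} \tag{5}$$
$$\rho = \sum_i c^i \quad \text{(constant cellular density)} \tag{6}$$
$$\mathbf{c} \geq \mathbf{0} \quad \text{(non-negative reactant concentrations)} \tag{7}$$
$$\mathbf{p} \geq \mathbf{0}, \quad \text{(non-negative protein concentrations)} \tag{8}$$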

where ⊙ indicates element-wise multiplication. We say that any state (c, p, v) satisfying Eqs 3–8 with growth rate μ > 0 is a Balanced Growth State (BGS) for the model specified by (M, τ, ρ) and the environment specified by x. The Optimal Growth State (OGS) is the BGS resulting in the maximal growth rate μ. It can be shown that any OGS must always use a minimal subset of active reactions, i.e., growth becomes impossible if one of the active reactions is deactivated without simultaneously activating other reactions [4, 21, 22]. We term BGSs that use such minimal subsets of reactions Elementary Growth States (EGSs) [4]. Each EGS corresponds to an Elementary Flux Mode (EFM) [27] of the “linearized” version of the balanced growth problem with fixed concentrations c [4]; in that case, τ(c, x) also has fixed values and all of Eqs 3–8 become linear. EGSs are specific instances of Elementary Growth Modes (EGMs) [23], sets of states using the same minimal set of active reactions.

The model’s balanced growth property is captured by the right-hand side of Eq 3. We assume that the growth rate is always positive, μ > 0. Thus, for internal nodes with non-zero concentration ($c^i \neq 0$), there is a necessary mass flow to offset the dilution through the associated volume growth at rate μ [28]. Note that the total protein concentration $c^a$ defined by Eq 5 has no fixed value, but is constrained by the concentration of enzymes and transporters required to sustain the reaction fluxes, by the row “a” in Eq 3 that specifies mass conservation in balanced growth, and by the total density constraint in Eq 6. We summarize our assumptions about proteins as follows: i) all proteins have the same amino acid composition (determined by the entries in the column $M_r$) and are produced by the ribosome following identical kinetics; ii) the total protein concentration $c^a$ is defined as the sum of all protein mass concentrations p by Eq 5; iii) $c^a$ relates to the ribosome flux $v^r$ and growth rate μ via the row “a” in the mass conservation constraint given by Eq 3 ($M^a_r v^r = \mu c^a$); and iv) $c^a$ also relates to the density constraint via the sum in Eq 6. From our modeling perspective, we might think of “protein” as being produced by the ribosome and instantly distributed across all reactions such that each individual protein catalyst (transporter, enzyme, or ribosome) maintains its concentration in balanced growth.

Below, we will be interested in the analytical properties of the OGS for a given model (M, τ, ρ) and environment x. From Eqs 3–5, we see that the variables (c, p, v) are highly interdependent. The above formulation does not lend itself to expressing μ as an explicit function of these variables, which makes it poorly suited to numerical or analytical studies. If one can find a mathematically equivalent formulation based on fewer, independent variables, then this would facilitate the use of the KKT conditions, analogous to how generalized coordinates facilitate the solution of problems in Lagrangian mechanics [29]. Thus, we next focus on a corresponding reformulation of the optimization problem. This formulation will apply to all BGSs, and only later will we use it to examine OGSs.

A reformulation in terms of flux fractions f

Our guiding thought below is that there can be a correspondence between cell states at different growth rates, which can be expressed in the form of scaling relations. These scaling relations extend the mass fraction scaling of M to fluxes and concentrations. Specifically, we define biomass fractions

$$\mathbf{b} := \frac{\mathbf{c}}{\rho} \quad \text{(dimensionless)} \tag{9}$$

(equivalent to c = ρ b, since ρ > 0), which express concentrations as fractions of the total cellular density; and we define flux fractions

$$\mathbf{f} := \frac{\mathbf{v}}{\mu\rho} \quad \text{(dimensionless)} \tag{10}$$

(equivalent to v = μρ f, since we assume μρ > 0), which express fluxes as fractions of the net mass uptake—i.e., the net growth—of the cell, μρ. Thus, each flux fraction fj describes the activity of reaction j relative to the total cellular mass production. We note that the flux fractions f may in principle assume any real value, in the same way the fluxes vj do, including negative values when the corresponding reactions are running backwards (in which case τj < 0). The reversibility of reactions is not an input, but an emergent output pattern here, due to the sign of each turnover time τj, as we discuss below.

Importantly, Eq 3 describing mass conservation in balanced growth no longer depends explicitly on μ when written in terms of f and b. Substituting v = μρ f and c = ρ b into Eq 3 and dividing both sides by μρ > 0 gives

$$M\,\mathbf{f} = \mathbf{b}. \tag{11}$$

This equation also implies that the mass fractions b are uniquely determined by the flux fractions f, independently of μ. Conveniently, this unique dependence also means that we can express the turnover times as functions of only f and the fixed parameters ρ, M, and x:

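$$\boldsymbol{\tau} = \boldsymbol{\tau}(\mathbf{c}, \mathbf{x}) = \boldsymbol{\tau}(\rho M \mathbf{f}, \mathbf{x}). \tag{12}$$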

In the following discussion, we mostly focus on the dependence of τ on f, and for simplicity of notation we do not state the dependence of τ on the fixed parameters (ρ, M, x) explicitly. Importantly, τ does not depend explicitly on μ, which otherwise would cause a recursion problem when further expressing the growth rate μ in terms of only f and τ(f), as we will see below. The resulting dimensionality reduction of the solution space not only simplifies the analytical considerations below, but also potentially accelerates numerical optimizations [30].

From Eqs 4 and 10, we obtain the expression for protein concentrations in terms of f, μ, and ρ,

$$\mathbf{p} = \mu\rho\,\mathbf{f} \odot \boldsymbol{\tau}(\mathbf{f}). \tag{13}$$

The combined mass fraction of all proteins in the cell, ba, is the sum of all p in the last equation, divided by ρ:

$$b^a = \mu\,\mathbf{f} \cdot \boldsymbol{\tau}(\mathbf{f}). \tag{14}$$

Thus, we can calculate the total protein mass fraction during balanced growth from μ and f, based only on reaction kinetics. However, through Eq 11, the same total protein mass fraction is also related to f through the corresponding row “a” in M:

$$b^a = M^a\,\mathbf{f} = M^a_r f^r, \tag{15}$$

where $M^a$ is the row of M corresponding to the total protein concentration, and the second equality reflects our assumption that the “ribosome” reaction r is the only one producing proteins (with no reaction consuming them), so that $M^a_j = 0$ for $j \neq r$. Equating the right-hand sides of Eqs 14 and 15 and solving for μ (with $b^a \neq 0 \Rightarrow \mathbf{f} \cdot \boldsymbol{\tau}(\mathbf{f}) \neq 0$), we get the growth function

$$\mu(\mathbf{f}) = \frac{M^a_r f^r}{\mathbf{f} \cdot \boldsymbol{\tau}(\mathbf{f})}. \tag{16}$$

Thus, the growth rate becomes an explicit function of only the flux fractions f. μ still depends on the fixed parameters ρ, M, and x through the functions τ = τ(ρ M f, x). Note that if fluxes v were used instead of the flux fractions f, then τ(c, x) = τ(M v/μ, x), which would cause a recursion issue when defining the growth rate as a function of v and τ following the same procedure [7, 23]. In that case, one is forced to account for c as separate variables, thereby increasing the dimensionality of the problem. The same recursion issue occurs when formulating the problem in terms of protein concentrations p.
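To make the growth function concrete, the Python sketch below evaluates it for a hypothetical two-reaction model (one transporter s and the ribosome r, with one metabolite m and total protein a). The irreversible Michaelis-Menten form of the turnover times and all parameter values are assumptions chosen for illustration; they are not taken from the models in S1 Text.

```python
def turnover_times(f, M, rho, x, kin):
    """Turnover times tau_j(rho * M f, x) for the toy model (assumed rate laws)."""
    # Metabolite concentration c^m = rho * b^m, with b = M f (row 0 is the metabolite).
    c_m = rho * sum(M[0][j] * f[j] for j in range(len(f)))
    tau_s = (1.0 + kin["Km_s"] / x) / kin["kcat_s"]    # transporter, depends on external x
    tau_r = (1.0 + kin["Km_r"] / c_m) / kin["kcat_r"]  # ribosome, depends on internal c^m
    return [tau_s, tau_r]


def growth_rate(f, M, rho, x, kin):
    """Growth function mu(f) = M^a_r f^r / (f . tau(f)); row a and column r come last."""
    tau = turnover_times(f, M, rho, x, kin)
    return M[-1][-1] * f[-1] / sum(fj * tj for fj, tj in zip(f, tau))


# Hypothetical model: rows (m, a), columns (s, r); all values are illustrative.
M = [[1.0, -1.0],
     [0.0, 1.0]]
rho = 340.0  # total density, assumed units g/L
x = 1.0      # external nutrient concentration, assumed units g/L
kin = {"kcat_s": 10.0, "Km_s": 1.0, "kcat_r": 5.0, "Km_r": 10.0}  # per hour, g/L

f = [1.0, 0.5]  # flux fractions (f^s, f^r)
mu = growth_rate(f, M, rho, x, kin)
print(mu)
```

Only f is passed as a variable; the concentrations entering the rate laws are recomputed from ρ M f, so μ does not have to be known in advance.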

From now on, we will consider b (Eq 11) and τ (Eq 12) as functions of f, and treat f as the only free variables. Having written the growth rate μ as a function of f, we now do the same for the remaining constraints, which leaves a formulation with far fewer variables and constraints.

In the scaled variables, the density constraint (Eq 6) is reduced to

$$\sum_i b^i = 1. \tag{17}$$

Using Eq 11, we can rewrite this constraint in terms of flux fractions f. We see that in balanced growth, the density constraint (Eq 17) is equivalent to a flux balance on the cell surface,

$$1 = \boldsymbol{\gamma} \cdot \mathbf{f} = \sum_s \gamma_s f^s, \tag{18}$$

where the second equality comes from Eq 2: only the columns s sum up to non-zero values $\gamma_s$, so only transport fluxes $f^s$ are limited by this constraint. The nature of this constraint as a global mass balance becomes more evident if we multiply the whole expression by μρ: the net mass uptake $\sum_s \gamma_s v^s$ going through the cell surface must equal the rate of biomass production μρ.
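The equivalence between the density constraint and the surface flux balance can be checked numerically. The Python sketch below uses a hypothetical model with an uptake transporter, an excretion transporter, and the ribosome; the matrix entries and flux fractions are illustrative assumptions. It computes the column sums γ and confirms that the biomass fractions b = M f sum to γ · f.

```python
# Hypothetical model: rows (metabolite m, protein a),
# columns (uptake s1, excretion s2, ribosome r).
M = [[1.0, -1.0, -1.0],
     [0.0, 0.0, 1.0]]

# Column sums gamma_j = sum_i M^i_j: non-zero only for the two transporters.
gamma = [sum(row[j] for row in M) for j in range(len(M[0]))]

# Illustrative flux fractions chosen so that gamma . f = 1.
f = [1.5, 0.5, 0.4]

# Biomass fractions b = M f.
b = [sum(M[i][j] * f[j] for j in range(len(f))) for i in range(len(M))]

gamma_dot_f = sum(g * fj for g, fj in zip(gamma, f))
print(gamma, b, sum(b), gamma_dot_f)
```

Because the identity holds for any f, imposing γ · f = 1 on the transport fluxes is enough to fix the total of the biomass fractions.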

Any solution to the growth function (Eq 16) automatically respects internal mass conservation, protein density, and the kinetic constraints: for any given vector f, μ(f) returns the unique growth rate satisfying these constraints (which also depend on ρ through τ = τ(ρ M f, x)). The flux balance at the cell surface is enforced separately by Eq 18 on transporters, making these fundamentally different from enzymatic and ribosome reactions. In particular, for a model with only one transporter s, Eq 18 already determines the scaled uptake rate $f^s = 1/\gamma_s$. With two transport fluxes, one flux is uniquely determined by the other; a simple example would be a model that only has transporters for glucose uptake and CO2 excretion (see example model “C” in S1 Text). More generally, Eq 18 can be used to uniquely determine one transport flux fraction in terms of the others, reducing the number of variables by one. For clarity of presentation, however, we will keep Eq 18 as a separate constraint and not eliminate any variable, until the introduction of growth control coefficients in the corresponding section.
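This behavior can be illustrated by reconstructing the unscaled state (c, v, p) from a given f and checking the residuals of Eqs 3, 5, and 6. The Python sketch below uses a hypothetical transporter-plus-ribosome model; the Michaelis-Menten turnover times and all parameter values are assumptions made for illustration.

```python
# Hypothetical model: rows (m, a), columns (s, r); all values are illustrative.
M = [[1.0, -1.0],
     [0.0, 1.0]]
RHO, X = 340.0, 1.0
KCAT_S, KM_S, KCAT_R, KM_R = 10.0, 1.0, 5.0, 10.0


def state_from_f(f):
    """Reconstruct (mu, c, v, p) from flux fractions f via Eqs 11, 12, 16, 10 and 4."""
    b = [sum(M[i][j] * f[j] for j in range(2)) for i in range(2)]  # Eq 11
    c = [RHO * bi for bi in b]
    tau = [(1.0 + KM_S / X) / KCAT_S, (1.0 + KM_R / c[0]) / KCAT_R]  # assumed rate laws
    mu = M[1][1] * f[1] / sum(fj * tj for fj, tj in zip(f, tau))    # Eq 16
    v = [mu * RHO * fj for fj in f]                                  # Eq 10
    p = [vj * tj for vj, tj in zip(v, tau)]                          # Eq 4
    return mu, c, v, p


def residuals(mu, c, v, p):
    """Absolute violations of Eq 3 (mass), Eq 5 (protein sum), and Eq 6 (density)."""
    mass = max(abs(sum(M[i][j] * v[j] for j in range(2)) - mu * c[i]) for i in range(2))
    protein = abs(c[1] - sum(p))
    density = abs(RHO - sum(c))
    return mass, protein, density


good = residuals(*state_from_f([1.0, 0.5]))  # satisfies gamma . f = 1
bad = residuals(*state_from_f([1.2, 0.5]))   # violates gamma . f = 1
print(good, bad)
```

In the second case, mass conservation and the protein sum still hold, but the density residual is no longer zero, which is why the surface flux balance has to be imposed as a separate constraint.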

Finally, writing the non-negativity constraints on proteins and reactant concentrations in terms of f results in the following element-wise inequalities on the corresponding vectors

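$$\mathbf{f} \odot \boldsymbol{\tau}(\mathbf{f}) = \frac{\mathbf{p}}{\mu\rho} \geq \mathbf{0}, \tag{19}$$
$$M\,\mathbf{f} = \mathbf{b} = \frac{\mathbf{c}}{\rho} \geq \mathbf{0}. \tag{20}$$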

We are now in the position to provide a concise formulation of growth rate optimization in terms of flux fractions f. Combining Eqs 16, 18, 19, 20, the optimal growth problem for a given environment x becomes

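$$\begin{aligned} \underset{\mathbf{f} \in \mathbb{R}^N}{\text{maximize}} \quad & \mu(\mathbf{f}) = \frac{M^a_r f^r}{\mathbf{f} \cdot \boldsymbol{\tau}(\mathbf{f})} \\ \text{subject to:} \quad & \boldsymbol{\gamma} \cdot \mathbf{f} = 1 \\ & \mathbf{f} \odot \boldsymbol{\tau}(\mathbf{f}) \geq \mathbf{0} \\ & M\,\mathbf{f} \geq \mathbf{0}, \end{aligned} \tag{21}$$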

where f is a vector containing a real-valued flux fraction for each reaction ($\mathbf{f} \in \mathbb{R}^N$, with N := number of columns in M), and the turnover times τ = τ(ρ M f, x) are functions that depend on f and on the parameters ρ, M, x. In the following discussion, we assume that all $\tau_j$ are different from zero, which simply means there is no reaction with infinite turnover rate (this relationship can be visualized in Fig 1C). We note that the direction of reactions (i.e., the sign of each $f^j$) is not enforced here, as it is in some other methods, but instead emerges as a result of the optimization (Eq 21); because of the constraint in Eq 19, a non-zero $f^j$ must always have the same sign as $\tau_j$, which is in turn a thermodynamic property of reaction j determined by its kinetic parameters and the relevant reactant concentrations [31]. If the rate laws have a general functional form, these functions will be parameterized by the set of kinetic parameters K. After solving this optimization problem, all original cellular variables (unscaled fluxes as well as unscaled metabolite and protein concentrations) can be easily reconstructed from f. In the following, we will refer to π as the vector of parameters that define the optimization problem, which includes M, ρ, and x, as well as the elements of K. The parameters in π are considered fixed until the section “Growth Control and Adaptation”, where we study the sensitivity of optimal growth to marginal changes in the components of π.
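For very small models, this optimization can be solved directly. The Python sketch below treats a hypothetical transporter-plus-ribosome model in which the surface flux balance fixes the transport flux fraction, leaving the ribosome flux fraction as the single free variable. The rate laws, parameter values, and the use of a golden-section search are assumptions for illustration; this is neither the R code of S1 File nor a method for general non-convex models.

```python
import math

# Hypothetical model: rows (m, a), columns (s, r); all values are illustrative.
M = [[1.0, -1.0],
     [0.0, 1.0]]
RHO, X = 340.0, 1.0
KCAT_S, KM_S, KCAT_R, KM_R = 10.0, 1.0, 5.0, 10.0
GAMMA_S = M[0][0] + M[1][0]  # column sum of the single transporter


def mu_of_fr(fr):
    """Growth rate as a function of the ribosome flux fraction f^r only."""
    fs = 1.0 / GAMMA_S                         # surface flux balance fixes f^s
    c_m = RHO * (M[0][0] * fs + M[0][1] * fr)  # metabolite concentration rho * b^m
    tau_s = (1.0 + KM_S / X) / KCAT_S          # assumed Michaelis-Menten forms
    tau_r = (1.0 + KM_R / c_m) / KCAT_R
    return M[1][1] * fr / (fs * tau_s + fr * tau_r)


def golden_section_max(func, lo, hi, tol=1e-10):
    """Locate the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) > func(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


# Non-negativity of b = M f restricts f^r to the open interval (0, 1) in this model.
fr_opt = golden_section_max(mu_of_fr, 1e-9, 1.0 - 1e-9)
mu_opt = mu_of_fr(fr_opt)
print(fr_opt, mu_opt)
```

For this toy model, 1/μ is a sum of convex terms in the ribosome flux fraction, so the maximum is unique and a one-dimensional search suffices. No such guarantee holds for larger models, where the problem is in general non-convex.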

Table 1 lists all symbols used below.

Table 1. Symbols used.

Symbol Description (units)
A growth adaptation coefficient
b biomass fraction vector
c reactant concentration vector ([mass][volume]−1)
C control coefficient matrix
f flux fraction vector
E indirect sensitivity matrix ([time])
K set of kinetic parameters (various units)
L Lagrangian ([time]−1)
M mass fraction matrix
N number of reactions (= number of columns in M)
p protein concentration vector ([mass][volume]−1)
S stoichiometric matrix ([mol])
v flux vector ([mass][volume]−1[time]−1)
x external reactant concentration vector ([mass][volume]−1)
γ vector with column sums of M
Γ growth control coefficient ([time]−1)
ε direct sensitivity matrix ([time])
ϕ proteome mass fraction vector
θ KKT multiplier for the protein non-negativity constraint ([time]−2)
λ KKT multiplier for the density constraint ([time]−1)
μ growth rate ([time]−1)
π parameter vector (various units)
ρ mass density ([mass][volume]−1)
τ turnover time vector ([time])
Index Description
a all proteins
e enzymatic reactions
m metabolites
i internal reactants (including m and a)
r ribosome reaction
s surface reactions (i.e., transport reactions)
j reactions (including s, e, r)
l reactions (including s, e, r)

Growth analysis

Next, we utilize the problem reformulation in terms of flux fractions f to derive general necessary conditions of OGSs, valid for any GBA model, including those with redundant pathways. First, for each reaction, we will derive explicit expressions for shadow prices in the optimal state; in constrained optimization in economics, the shadow price is the change, per infinitesimal unit of the constraint, in the optimal value of the objective function of an optimization problem obtained by relaxing the constraint. The term has also been applied to biological systems in the context of constraint-based optimization [32]. We then derive equations for the state variables f themselves, which must hold in any optimal state. This development constitutes a generalization of our previous analytical approach to GBA [4], which was restricted to models with matrices M of full column rank. The latter condition is not generally satisfied by realistic cellular models, as many cellular biochemical reactions are structurally redundant, i.e., their columns in M are linearly dependent on other columns. OGSs always have non-redundant active reactions (i.e., the active M has full column rank) [4, 21, 22], but this optimal set of active reactions is generally not known a priori. In contrast, the following analysis in terms of flux fractions is valid for any M of arbitrary size and rank.

For the following, we emphasize that the state of our system is completely determined by scaled fluxes fj, which serve as independent variables. All other variables are fully dependent on them: the unscaled fluxes v, the scaled and unscaled concentrations b, c, and p, the reaction times τ, and the growth rate μ.

All following analyses benefit from knowing the system’s sensitivity to small changes of each of the independent variables fj. The partial derivatives of the system’s properties c(f), v(f), b(f), τ(f), p(f), and μ(f) with respect to each fj provide explicit expressions for sensitivity coefficients similar to the ones introduced in Metabolic Value Theory [32], based on the original concepts of Metabolic Control Analysis (MCA) [9, 10]. A unique feature of the present treatment arises from the system of equations in Eq 11, which determines the linear dependence of biomass fractions b on f, so that the partial derivative of bi with respect to fj is given simply as

$$\frac{\partial b_i}{\partial f_j} = M^i_j. \tag{22}$$

Via the chain rule of differentiation, this expression also determines the partial derivatives with respect to fj for any functions of bi. A case of particular interest in the following discussions is the vector of turnover times τ = τ(c, x) = τ(ρ b, x) = τ(ρ M f, x). We first define the (direct) time elasticities (elasticities in short), the sensitivity of each turnover time τl(c, x) = τl(ρ b, x) with respect to each biomass fraction bi, as

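$$\varepsilon^l_i := \frac{\partial \tau_l}{\partial b_i} = \frac{\partial \tau_l}{\partial c_i}\frac{\partial c_i}{\partial b_i} = \frac{\partial \tau_l}{\partial c_i}\,\rho, \tag{23}$$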

where we used the chain rule of differentiation in the first equality and Eq 9 in the second. We then use the direct elasticities εil to express the sensitivity of τl to a change in a flux fraction fj, defined as the indirect time elasticity matrix E (or indirect elasticity in short), with entries

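$$E^l_j := \frac{\partial \tau_l}{\partial f_j} = \sum_i \frac{\partial \tau_l}{\partial c_i}\frac{\partial c_i}{\partial b_i}\frac{\partial b_i}{\partial f_j} = \sum_i \varepsilon^l_i M^i_j, \tag{24}$$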

where we used Eq 22 in the last equality. In the following discussion, we assume that the kinetic rate laws do not depend on the total protein concentration ca, meaning $\partial\tau_l/\partial c_a = 0$ and hence $\varepsilon^l_a = 0$ for all reactions l. This would be different if, for example, one accounted for macromolecular crowding effects via the kinetic rate laws [33]. The indirect elasticities E and direct elasticities ε share some resemblance with the Jacobian and elasticity matrices defined in Metabolic Value Theory and MCA, although we do not intend to explore the exact relationships in this work. For an example of direct and indirect elasticities, where τ follows a simple Michaelis-Menten rate law, see S1 Text.
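To make Eqs 23 and 24 concrete, the sketch below evaluates the direct and indirect elasticities for a hypothetical two-reaction model (transporter s, ribosome r; reactants m and a) and compares the analytic E with a central finite difference of τ. The network, the irreversible Michaelis-Menten form of τr, the parameter values, and all function names are assumptions for illustration only.

```python
# Hypothetical toy model: rows of M are reactants (m, a), columns are reactions (s, r).
M = [[1.0, -1.0],
     [0.0, 1.0]]
RHO = 340.0
X = 1.0
KCAT = [10.0, 5.0]
KM = [1.0, 50.0]


def turnover_times(f):
    """tau_s depends only on external x; tau_r depends on c_m = rho * b_m."""
    c_m = RHO * (M[0][0] * f[0] + M[0][1] * f[1])
    return [(1.0 + KM[0] / X) / KCAT[0], (1.0 + KM[1] / c_m) / KCAT[1]]


def direct_elasticities(f):
    """eps[l][i] = rho * d tau_l / d c_i (Eq 23); rows l = (s, r), columns i = (m, a)."""
    c_m = RHO * (M[0][0] * f[0] + M[0][1] * f[1])
    dtau_r_dcm = -KM[1] / (KCAT[1] * c_m ** 2)
    return [[0.0, 0.0],                   # tau_s has no internal reactant dependence
            [RHO * dtau_r_dcm, 0.0]]      # no dependence on total protein c_a


def indirect_elasticities(f):
    """E[l][j] = sum_i eps[l][i] * M[i][j] (Eq 24), i.e. the matrix product eps M."""
    eps = direct_elasticities(f)
    return [[sum(eps[l][i] * M[i][j] for i in range(2)) for j in range(2)]
            for l in range(2)]


def finite_difference_E(f, h=1e-6):
    """Central-difference estimate of d tau_l / d f_j, for comparison."""
    E = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up, dn = list(f), list(f)
        up[j] += h
        dn[j] -= h
        t_up, t_dn = turnover_times(up), turnover_times(dn)
        for l in range(2):
            E[l][j] = (t_up[l] - t_dn[l]) / (2.0 * h)
    return E


f = [1.0, 0.6]                            # an arbitrary feasible state
E_analytic = indirect_elasticities(f)
E_numeric = finite_difference_E(f)
print(E_analytic)
print(E_numeric)
```

In this example, increasing fs raises bm and therefore lowers τr (negative entry of E), whereas increasing fr depletes m and raises τr (positive entry of E).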

In the remainder of this paper, we will explore three complementary types of analyses of GBA systems. First, in the growth optimality section we will state the analytical conditions necessary for an optimal state f*. Second, in the growth economy section we will calculate the sensitivity of a (not necessarily optimal) growth rate μ to small changes in each f, which we interpret in economic terms as marginal values of reactions. Third, in the growth control and adaptation section we will estimate the sensitivity of the optimal growth rate μ* to small changes in the previously fixed parameters π. In each of these analyses, the sensitivity measures captured by E will appear naturally in the results.

Growth optimality

We next calculate the necessary analytical conditions for the optimal growth state (OGS). This calculation extends our previous analytical approach, which was restricted to GBA models with matrices M of full column rank [4], to general GBA models with arbitrary matrices M, facilitated by the reformulation of the GBA problem in terms of flux fractions f. We approach this problem by studying the Karush-Kuhn-Tucker (KKT) conditions [34, 35], which generalize the method of Lagrange multipliers by also accounting for inequality constraints, here present due to the non-negativity of concentrations. To simplify the presentation in this section, we account explicitly only for the non-negativity of protein concentrations, but not for the non-negativity of metabolite concentrations. Under the reasonable assumption that metabolites with zero concentration do not participate in any active reactions, the resulting conditions remain necessary when this latter constraint is also taken into account; the full calculations can be found in S1 Text.

We define the Lagrangian L(f,λ,θ) for a given GBA model (M, τ, ρ) and external concentrations x as

$$\mathcal{L}(f, \lambda, \theta) := \mu(f) + \lambda\,(\gamma^\top f - 1) + \sum_j \theta_j\, f_j\, \tau_j(f). \tag{25}$$

The KKT multipliers λ, θ are auxiliary variables used to find the optimal state, but they also encode important economic and control information about the system at optimality, as we will see later. λ relates to the equality constraint enforcing the fixed cell density, connected to f via the flux balance at the surface (Eq 18); the entries of θ relate to the inequality constraints enforcing the non-negativity of proteins (Eq 19). Solving the KKT conditions (see Methods for details), we get the balance equations determining the necessary condition for each reaction j at optimal growth:

$$(\partial_j \mu + \lambda \gamma_j)\, f_j = 0, \tag{26}$$

where ∂jμ ≔ ∂μ/∂fj indicates the partial derivative of μ with respect to fj, calculated from Eq 16 as

$$\partial_j \mu = \frac{\mu}{b_a}\left(M^a_j - \mu\tau_j - \mu f^\top E_j\right), \tag{27}$$

with Ej representing the column j of E, and λ is the optimal value of the density constraint multiplier

$$\lambda = \frac{\mu^2}{b_a}\, f^\top E f \tag{28}$$

(see Methods for the detailed calculations). When we further consider that only transporters have a nonzero column sum γj (Eq 2), we get an equivalent expression for the optimal λ that highlights its particular dependence on the reactions directly connected to the transport reactions (see S1 Text).
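The step from the balance equations (Eq 26) to the expression for λ (Eq 28) can be sketched in three lines; the full derivation is in Methods. The sketch uses only Eqs 16, 18, 26 and 27, together with b_a = Σ_j M^a_j f_j, which follows from b = M f.

```latex
\begin{aligned}
0 &= \sum_j \left(\partial_j\mu + \lambda\gamma_j\right) f_j
   = \sum_j f_j\,\partial_j\mu + \lambda
   && \text{(sum of Eq 26 over } j\text{, with } \gamma^\top f = 1\text{)} \\
\sum_j f_j\,\partial_j\mu
  &= \frac{\mu}{b_a}\Big(\underbrace{\textstyle\sum_j M^a_j f_j}_{=\,b_a}
     - \underbrace{\mu f^\top\tau}_{=\,b_a} - \mu f^\top E f\Big)
   = -\frac{\mu^2}{b_a}\, f^\top E f
   && \text{(Eqs 27 and 16)} \\
\lambda &= \frac{\mu^2}{b_a}\, f^\top E f
   && \text{(Eq 28)}
\end{aligned}
```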

Eq 26 is a necessary property of the global optimum of problem (21), but in principle it could also be satisfied by other stationary points of the constrained growth rate defined by Eq 21, which can exist whenever the optimization problem (21) is non-convex. The possible non-uniqueness of solutions of Eq 26 is, however, not our concern here, since this study focuses only on the analytical properties of the global optimum, not on methods to calculate it.

Eq 26 generalizes the necessary analytical conditions we found before [4] for the optimal states of GBA models with full column rank matrix M; in that case, the conditions could only be applied to arbitrary models if one had prior knowledge of which reactions are active at optimality, effectively reducing M to an “active” matrix of interest (which is guaranteed to be of full column rank [4]). Here, no prior knowledge of active reactions is required. Instead, Eq 26 provides the very condition determining whether each reaction is active at optimality: a reaction with nonzero flux fj requires that the corresponding term in parentheses (i.e., the corresponding θj, see Methods) is equal to zero. Conversely, if the term in parentheses is different from zero (θj ≠ 0), then the reaction cannot carry flux at optimality (fj = 0). In particular, the ribosome evidently needs to be active for balanced cellular growth, as proteins are required as catalysts; thus, θr = 0 must always hold in optimal states. The KKT multipliers θj are the shadow prices [32] of the quantities τjfj = pj/(μρ), each of which has units of time and can be understood as the fraction pj/ρ of the total growth time 1/μ that is allocated to producing protein j in the biomass.

We may also express Eq 26 for each reaction j in the usual, unscaled variables v (fluxes), and c (reactant concentrations, including metabolites and total protein), by using Eqs 4, 9, 10 and 11 (see S1 Text)

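$$\left(M^a_j - \mu\tau_j - v^\top \frac{\partial\tau}{\partial c}\, M_j + v^\top \frac{\partial\tau}{\partial c}\,\frac{c}{\rho}\,\gamma_j\right) v_j = 0, \tag{29}$$

where Mj indicates the column j of M, and $\partial\tau/\partial c$ denotes the matrix of unscaled derivatives $\partial\tau_l/\partial c_i = \varepsilon^l_i/\rho$ (Eq 23). We now continue our analysis in terms of the flux fractions f, since these are the variables of the optimization problem (Eq 21). However, we keep in mind that the same change of variables to p, v, c is possible in all the following equations, as done for Eq 29.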

Growth economy

As growth rate is closely related to fitness [2], it makes sense to view growth rate as the primary value of the cellular economy. In this subsection, we will thus explore the economy of balanced cellular growth, by asking how a small change in the state variables fj affects the growth rate μ of any optimal or non-optimal state. Below, we will see that the necessary conditions of optimal growth specify that the marginal costs and benefits of each flux must be perfectly balanced.

We define the marginal value of flux j as the partial derivative ∂jμ, which quantifies the marginal gain in growth rate resulting from a small increase in fj. From Eq 27, we see that the marginal value can be expressed naturally as a multiple of the growth rate per mass fraction of protein in biomass, μ/ba. As we will see next, the corresponding adimensional factor—the term in parentheses in Eq 27—quantifies different types of costs (when negative) and benefits (when positive) of reaction j in terms of its influence on protein allocation,

$$\frac{b_a}{\mu}\,\partial_j\mu = M^a_j - \mu\tau_j - \mu f^\top E_j. \tag{30}$$

The first summand quantifies how a marginal increase in fj increases the total protein fraction in the cell density ba = ca/ρ (see Eq 5),

$$M^a_j = \frac{\partial b_a}{\partial f_j}. \tag{31}$$

We name this contribution to the normalized marginal value the marginal protein production. As we assume that the ribosome reaction is the only reaction that consumes or produces protein, this reaction (j = r) is the only one with a nonzero (and positive) marginal production benefit.

To interpret the remaining summands, recall that an individual protein’s mass fraction in the cellular density can be expressed as pl/ρ = τlvl/ρ = μτlfl. The last two terms in Eq 30 quantify the combined decrease of individual protein fractions in cellular density (pl/ρ) caused by a marginal increase in fj at fixed μ,

$$-\mu\tau_j - \mu f^\top E_j = -\left(\frac{\partial(\mu f^\top\tau)}{\partial f_j}\right)_{\mu} = -\sum_l\left(\frac{\partial (p_l/\rho)}{\partial f_j}\right)_{\mu}. \tag{32}$$

Here, the first summand quantifies the change in pl/ρ at fixed turnover times, which is evidently non-zero only for the enzyme catalyzing the perturbed flux j itself. We name this term, −μτj, the marginal (protein) investment into j. The final summand quantifies the local change of the individual protein concentrations that must occur to compensate the changes in the turnover times (quantified by the indirect elasticity E), themselves caused by changes in metabolite concentrations forced due to flux balance. We name it the marginal (protein) opportunity of j, as it is related to opportunity costs and benefits in economics. For the typical case of reactions running in the forward direction (fj > 0), τj is positive, and thus the marginal investment into j is negative, representing a cost. If all fluxes are non-negative, beneficial decreases in turnover times correspond to negative E, resulting in positive marginal opportunity (i.e., marginal opportunity benefit).

We can now summarize our insights about cellular economy, in particular about changes in the growth rate μ in response to changes in a flux fj. The first and second terms in Eq 30 are simple, direct consequences of the flux change: the marginal production benefit, an increase in protein production if fj is the ribosome flux; and the marginal investment, an increase in the protein concentration required to sustain an increased fj. The third term in Eq 30, the marginal opportunity, is more interesting, though equally easy to understand. As a simple consequence of mass conservation (Eq 11), a change in fj while keeping all other fluxes fixed must result in changes in the concentrations of all reactants consumed or produced in the corresponding reaction. These concentration changes modify the turnover times τl(c) of all reactions l whose kinetics depend on them, either because they are directly connected to those reactants or because they act as inhibitors or activators; see Fig 3 for an example. Keeping the corresponding fluxes fl constant requires matching changes in the concentrations pl of the catalyzing proteins (Eq 4). This total amount of “protein saved” due to a change in fj is quantified by $-\mu\sum_l f_l E^l_j$.
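The three contributions to the marginal value can be computed explicitly for a small example. The sketch below does so for a hypothetical two-reaction model (transporter s, ribosome r; reactants m and a) with irreversible Michaelis-Menten turnover times, and checks the resulting marginal values (Eq 27) against finite differences of μ. The model, its parameter values, the function names, and the closed-form optimum used at the end are all assumptions specific to this illustration.

```python
# Hypothetical toy model: reactants (m, a) as rows of M, reactions (s, r) as columns.
M = [[1.0, -1.0],
     [0.0, 1.0]]
RHO, X = 340.0, 1.0
KCAT = [10.0, 5.0]
KM = [1.0, 50.0]


def turnover_times(f):
    c_m = RHO * (M[0][0] * f[0] + M[0][1] * f[1])
    return [(1.0 + KM[0] / X) / KCAT[0], (1.0 + KM[1] / c_m) / KCAT[1]]


def growth_rate(f):
    """mu(f) = b_a / (f . tau(f)) with b_a = sum_j M^a_j f_j (Eq 16)."""
    t = turnover_times(f)
    b_a = M[1][0] * f[0] + M[1][1] * f[1]
    return b_a / (f[0] * t[0] + f[1] * t[1])


def indirect_elasticities(f):
    """E[l][j] = d tau_l / d f_j; only tau_r depends on f (through b_m)."""
    b_m = M[0][0] * f[0] + M[0][1] * f[1]
    eps_rm = -KM[1] / (KCAT[1] * RHO * b_m ** 2)    # rho * d tau_r / d c_m
    return [[0.0, 0.0], [eps_rm * M[0][0], eps_rm * M[0][1]]]


def marginal_terms(f, j):
    """The three summands of Eq 30 for reaction j."""
    mu, t, E = growth_rate(f), turnover_times(f), indirect_elasticities(f)
    production = M[1][j]
    investment = -mu * t[j]
    opportunity = -mu * sum(f[l] * E[l][j] for l in range(2))
    return production, investment, opportunity


def marginal_value(f, j):
    """d mu / d f_j according to Eq 27."""
    b_a = M[1][0] * f[0] + M[1][1] * f[1]
    return growth_rate(f) / b_a * sum(marginal_terms(f, j))


def marginal_value_fd(f, j, h=1e-6):
    """Central finite difference of mu with respect to f_j."""
    up, dn = list(f), list(f)
    up[j] += h
    dn[j] -= h
    return (growth_rate(up) - growth_rate(dn)) / (2.0 * h)


f = [1.0, 0.6]                                   # a feasible, non-optimal state
for j, name in enumerate(["s", "r"]):
    print(name, marginal_terms(f, j), marginal_value(f, j), marginal_value_fd(f, j))

# Optimal f_r for this toy model, from d(1/mu)/d f_r = 0 (derived for this sketch only).
tau_s = turnover_times(f)[0]
f_r_opt = 1.0 / (1.0 + (KM[1] / (KCAT[1] * RHO * tau_s)) ** 0.5)
balance_r = sum(marginal_terms([1.0, f_r_opt], 1))
print("ribosome balance at the optimum:", balance_r)
```

In this example the ribosome has a marginal production benefit of 1, a negative marginal investment, and a negative marginal opportunity (a larger fr depletes m and slows the ribosome), whereas the transporter has a positive marginal opportunity. At the toy model's optimum the three ribosome terms cancel.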

Fig 3. The dependence of marginal opportunity on the reaction neighborhood.

The figure shows a simple example of a reaction (j = 2, red) that is directly connected to two metabolites (m = 1, 2) and thereby to two other reactions j = 1, 3. Reaction j = 2 is also connected indirectly to reaction j = 4 by inhibiting it through metabolite m = 2 (indicated by the blunt arrow ⊤). The marginal opportunity of reaction j = 2 is $-\mu f^\top E_2 = -\mu\left(f_1\frac{\partial\tau_1}{\partial b_1}M^1_2 + f_2\frac{\partial\tau_2}{\partial b_1}M^1_2 + f_2\frac{\partial\tau_2}{\partial b_2}M^2_2 + f_3\frac{\partial\tau_3}{\partial b_2}M^2_2 + f_4\frac{\partial\tau_4}{\partial b_2}M^2_2\right)$. It quantifies how a marginal change in f2, while keeping all other fl fixed, causes (i) an inevitable change in b1, b2 due to the flux balance (Eq 11); which by consequence causes (ii) an inevitable change in τ1, τ2, τ3, τ4, as these are functions of b1, b2; which finally causes (iii) an inevitable change in p1, p2, p3, p4 due to the kinetic constraints (Eq 4) at fixed v1, v2, v3, v4 (determined by the fixed flux fractions and growth rate (Eq 10)). The example also shows how the information about mass conservation and reaction kinetics is completely built into the definition of the growth function (Eq 16).

The above results confirm the often postulated central role of proteins in the cellular economy [31, 36, 37]. While the measure of cellular economic value may be the growth rate itself, protein concentrations constitute the general currency in which we can express the contributions of cellular subsystems. We can highlight this central economic role of proteins further by relating the marginal values ∂jμ (changes in growth rate in response to flux changes) to changes in the allocation of proteome fractions ϕl ≔ pl/ca:

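$$\partial_j\mu = \frac{\mu}{b_a}\left(M^a_j - \mu\tau_j - \mu f^\top E_j\right) = -\mu\sum_l\left(\frac{\partial\phi_l}{\partial f_j}\right)_{\mu}. \tag{33}$$

The second equality follows directly from taking the derivative of $\sum_l \phi_l = \mu\rho\, f^\top\tau / (\rho\, M^a f)$, where $M^a$ is the row of M corresponding to total protein, with respect to fj, at constant μ.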

We can now look at the balance equations at optimal growth from an economic perspective. For the ribosome and active enzymatic reactions, a zero marginal value ∂jμ = 0 also means a zero shadow price, θj = 0 (Eqs 54 and 55), so the reaction is optimal, and growth cannot be accelerated by increasing or decreasing fj by a small amount. This insight provides an intuitive interpretation for the balance equations for the ribosome (Eq 57) and for all active, internal enzymatic reactions (Eq 58). The only exception is the transporters: in contrast to all other flux fractions, their shadow price (Eq 56) depends both on their marginal value and on their marginal biomass production, $\mu f^\top E f \gamma_s$ (a cost when negative and a benefit when positive).

For active enzymes with zero marginal value—and thus for all active enzymes at optimality (Eq 58)—Eq 27 simplifies to

$$\tau_e + f^\top E_e = 0. \tag{34}$$

This simple relationship shows that at optimality, the marginal investment into e should perfectly balance its marginal opportunity. As the last equation involves only the neighborhood of e (defined as all reactions l such that $E^l_e \neq 0$), we can study such relationships at optimality locally, without full knowledge about the entire reaction network. We thus do not need the entire matrix M or complete knowledge of parameterized turnover time functions in the vector τ.

In the preceding subsection, we studied how any optimal or non-optimal growth rate μ is sensitive to marginal changes in one of the flux fractions, resulting in an economic understanding of marginal flux values in terms of their relationship with protein allocation. We next reinterpret some of these results from the perspective of control theory, and turn to a complementary problem that focuses on the sensitivity of the optimal growth rate to changes in the model parameters and external concentrations.

Growth control and adaptation

We are first interested in the total control that each fj has on the (optimal or non-optimal) growth rate μ, accounting also for the density constraint limitation. In order to do that, we choose one active transport reaction s′ and express its corresponding flux fraction fs′ as a function of the other fluxes via the density constraint (Eq 18),

$$f_{s'} = \frac{1}{\gamma_{s'}}\Big(1 - \sum_{l \neq s'} \gamma_l f_l\Big), \tag{35}$$

where the sum over l ≠ s′ runs over all other reactions. Thus,

$$\frac{\partial f_{s'}}{\partial f_j} = -\frac{\gamma_j}{\gamma_{s'}}, \tag{36}$$

which is non-zero only if j is also a transport reaction (so γj ≠ 0). We now define the Growth Control Coefficients Γj as

$$\Gamma_j := \partial_j\mu - \partial_{s'}\mu\,\frac{\gamma_j}{\gamma_{s'}}, \tag{37}$$

where the first term quantifies the growth change caused by fj itself, and the second term quantifies the growth change caused by the change in fs′ that the density constraint imposes in response to the changed fj. Note that for the ribosome and enzymatic reactions, the growth control coefficient is simply the marginal value, since γr = γe = 0. For models with only one transport reaction s, Γs = 0, since fs = 1/γs is fixed by the density constraint and cannot be changed; conveniently, this is also captured by Eq 37. If s′ is optimal, θs′ = 0, and Eq 26 determines ∂s′μ = −λγs′, so in that case

$$\Gamma_j = \partial_j\mu + \lambda\gamma_j, \tag{38}$$

and the balance equation Eq 26 is thus equivalent to

$$\Gamma_j f_j = 0. \tag{39}$$
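The growth control coefficient of Eq 37 is the total derivative of μ with respect to fj once the flux fraction of the chosen transporter s′ has been eliminated through the density constraint. The short derivation below spells this out using only Eqs 35 and 36; the notation dμ/dfj for this constrained derivative is introduced here for illustration.

```latex
\begin{aligned}
\frac{d\mu}{df_j}
  &= \partial_j\mu + \partial_{s'}\mu\,\frac{\partial f_{s'}}{\partial f_j}
   = \partial_j\mu - \partial_{s'}\mu\,\frac{\gamma_j}{\gamma_{s'}} = \Gamma_j
   && (j \neq s'), \\
\Gamma_e &= \partial_e\mu, \qquad \Gamma_r = \partial_r\mu
   && (\gamma_e = \gamma_r = 0), \\
\Gamma_{s'} &= \partial_{s'}\mu - \partial_{s'}\mu\,\frac{\gamma_{s'}}{\gamma_{s'}} = 0 .
\end{aligned}
```

The last line is the formal counterpart of the single-transporter case: a flux fraction that is fully determined by the density constraint has no independent control over the growth rate.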

We may also see the optimal condition for enzymes (Eq 34) in terms of protein concentrations, by multiplying it element-wise with ve (so it is also valid now for inactive enzymes),

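$$p_e = \sum_l p_l\, C^l_e, \tag{40}$$

where we defined

$$C^l_e := \frac{f_e}{v_l}\left(\frac{\partial v_l}{\partial f_e}\right)_{p_l} = \frac{f_e}{v_l}\left(\frac{\partial (p_l/\tau_l)}{\partial f_e}\right)_{p_l} = -\frac{f_e}{\tau_l}\frac{\partial\tau_l}{\partial f_e}, \tag{41}$$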

via Eq 4 and using partial derivatives at fixed pl. Cel can be seen as (scaled) control coefficients (CC), analogous to (scaled) control coefficients in MCA [9, 10]. This result is analogous to how enzyme concentrations and their respective CC relate at optimal fluxes constrained by a fixed total enzyme concentration [38] (see S1 Text for a detailed discussion). For an example of control coefficients where τ follows a simple Michaelis-Menten rate law, see S1 Text.

We now explore the sensitivity of the optimal growth rate to changes in one parameter π in the vector π. The growth problem (Eq 21) is constrained by the parameters π, including the arguments necessary to determine the turnover times τ at given f. This means that any marginal change in one of the parameters π would lead to changes in the solution f* of the optimization (Eq 21). In this sense, the parameters π can be understood as control variables, while the corresponding optimal state f*, and its functions μ* = μ(f*), v* = v(f*), p* = p(f*), and c* = c(f*) are the response variables. Fig 4 summarizes these relationships.

Fig 4. The parameters π and their control on the optimal cellular state f*.

Because growth rate is closely related to fitness, we are also particularly interested in how marginal changes in one of the previously fixed parameters π affect the optimal growth rate μ* [39]. We can estimate this effect directly via the envelope theorem [4, 40], by effectively considering the optimal state f* as fixed and treating the parameters π as the new independent variables, making it unnecessary to calculate the new optimal state after the parameter change. To do that, we first simplify the problem by assuming that these marginal changes have no effect on which reactions are active, so we simplify the Lagrangian (Eq 25) by ignoring the inequality constraints; note that in this case only the objective function μ can be influenced by parameter changes, since the density constraint only depends on M, whose entries cannot be changed continuously. Second, we can think about the optimal growth rate μ* as a function of the parameters, $\mu^*(\pi) = \mathcal{L}(f^*(\pi), \lambda^*(\pi), \pi)$, so the total change dμ*/dπ induced by a marginal change in a parameter π can be calculated via the chain rule

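$$\frac{d\mu^*}{d\pi} = \sum_j \frac{\partial\mathcal{L}}{\partial f_j}\frac{df_j^*}{d\pi} + \frac{\partial\mathcal{L}}{\partial\lambda}\frac{d\lambda^*}{d\pi} + \frac{\partial\mathcal{L}}{\partial\pi} = \frac{\partial\mathcal{L}}{\partial\pi}, \tag{42}$$

where the last equality comes from $\partial\mathcal{L}/\partial f_j = \partial\mathcal{L}/\partial\lambda = 0$ according to the stationarity (Eq 49) and primal feasibility (Eq 18) at an optimal state.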

We now define growth adaptation coefficients A as the relative change in the optimal growth rate μ* in response to a small, relative change in one control variable π

$$A_\pi := \frac{\pi}{\mu^*}\frac{d\mu^*}{d\pi} = \frac{\pi}{\mu(f)}\frac{\partial\mathcal{L}}{\partial\pi}, \tag{43}$$

where here and in the rest of this section f is to be understood as the optimal state before the change in the parameter π. Note that in the following discussion, the parameters π of interest only influence $\mathcal{L}$ via the objective function μ, so the partial derivatives $\partial\mathcal{L}/\partial\pi$ are simply evaluated as ∂μ/∂π at fixed f.

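For direct changes in the turnover times τj (e.g., through changing the corresponding $1/k_{\mathrm{cat},j}$), the growth adaptation coefficient is calculated by evaluating the growth function μ and its partial derivative at fixed f,

$$A_{\tau_j} := \frac{\tau_j}{\mu^*}\frac{d\mu^*}{d\tau_j} = \frac{\tau_j}{\mu}\frac{\partial\mu}{\partial\tau_j} = -\frac{\mu\rho f_j\tau_j}{c_a} = -\phi_j, \tag{44}$$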

where we effectively treated τj as a variable in the growth equation Eq 16, and ϕj = pj/ca is the optimal proteome fraction allocated to reaction j before the change in τj. This result is consistent with the observation that drugs targeting the most highly expressed catalysts, such as the ribosome, have the strongest effects on cellular growth rates [5, 41].
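The relation in Eq 44 can be checked numerically on a small example by rescaling one turnover time, re-optimizing, and comparing the relative change in the optimal growth rate with the proteome fraction before the change. The sketch below does this for the ribosome of a hypothetical two-reaction model (transporter s, ribosome r) with irreversible Michaelis-Menten kinetics; the model, parameter values, the `scale_r` argument, and the function names are assumptions made for illustration.

```python
RHO, X = 340.0, 1.0
KCAT_S, KM_S = 10.0, 1.0
KCAT_R, KM_R = 5.0, 50.0


def turnover_times(f_r, scale_r=1.0):
    """tau at f = (1, f_r); scale_r rescales tau_r, e.g. via a change in 1/kcat_r."""
    tau_s = (1.0 + KM_S / X) / KCAT_S
    tau_r = scale_r * (1.0 + KM_R / (RHO * (1.0 - f_r))) / KCAT_R
    return tau_s, tau_r


def growth_rate(f_r, scale_r=1.0):
    """mu(f) for the toy model, with M^a_r = 1 and f_s = 1."""
    tau_s, tau_r = turnover_times(f_r, scale_r)
    return f_r / (tau_s + f_r * tau_r)


def argmax_golden(fun, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if fun(c) > fun(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def optimal_state(scale_r=1.0):
    f_r = argmax_golden(lambda fr: growth_rate(fr, scale_r), 1e-9, 1.0 - 1e-9)
    return f_r, growth_rate(f_r, scale_r)


f_r_opt, mu_opt = optimal_state()
# phi_r = p_r / c_a = mu * tau_r * f_r / b_a, and b_a = f_r in this toy model.
phi_r = mu_opt * turnover_times(f_r_opt)[1]

# Relative change of the re-optimized growth rate per relative change in tau_r.
delta = 1e-4
mu_up = optimal_state(1.0 + delta)[1]
mu_dn = optimal_state(1.0 - delta)[1]
A_tau_r = (mu_up - mu_dn) / (2.0 * delta * mu_opt)

print(f"A_tau_r = {A_tau_r:.6f}, -phi_r = {-phi_r:.6f}")
```

The re-optimization is done here only to verify the result; the point of the envelope theorem is that Eq 44 gives the same number from the original optimal state alone.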

For changes in some external parameter such as a concentration x, the growth adaptation coefficient is again calculated by evaluating the growth function μ and its partial derivative at fixed f, and using the chain rule of differentiation we obtain

$$A_x := \frac{x}{\mu^*}\frac{d\mu^*}{dx} = \frac{x}{\mu}\sum_s\frac{\partial\mu}{\partial\tau_s}\frac{\partial\tau_s}{\partial x} = -\sum_s \phi_s\,\frac{x}{\tau_s}\frac{\partial\tau_s}{\partial x}, \tag{45}$$

where the summation runs over transporters s (only transporters have kinetic rate laws depending on external concentrations). According to Eq 45, the growth adaptation coefficient of an external concentration x is simply the sum over the “scaled elasticities” $(x/\tau_s)\,\partial\tau_s/\partial x$ of the transporters of x, weighted by the optimal proteome fractions ϕs allocated to each s before the change in x. This result gives an explicit quantitative estimate of which external concentrations should be changed in order to cause the largest change in the optimal growth rate. This equation may hence provide a useful tool for improving the growth media environment for industrial cell cultures, and for quantifying the effect of drugs aimed at decreasing the growth of pathogens and cancer cells. If the turnover times τ depend explicitly on other external parameters, such as pH and temperature, growth adaptation coefficients can be calculated and interpreted exactly as in Eq 45.
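As a worked instance of Eq 45, suppose each transporter of x follows an irreversible Michaelis-Menten rate law in x, with turnover number k_cat,s and Michaelis constant K_s. This functional form and these symbols are assumptions made for illustration.

```latex
\begin{aligned}
\tau_s(x) &= \frac{1}{k_{\mathrm{cat},s}}\left(1 + \frac{K_s}{x}\right)
\;\Rightarrow\;
\frac{x}{\tau_s}\frac{\partial\tau_s}{\partial x} = -\frac{K_s}{K_s + x}, \\
A_x &= \sum_s \phi_s\,\frac{K_s}{K_s + x}.
\end{aligned}
```

Under this assumption, A_x lies between 0 and the summed proteome fraction of the transporters of x: nearly saturated transporters (x much larger than K_s) contribute almost nothing, whereas for x much smaller than K_s each transporter contributes close to its full proteome fraction ϕs.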

The growth adaptation coefficient with respect to the mass density ρ, assuming it affects turnover times τ only through reactant concentrations c, reads

$$A_\rho := \frac{\rho}{\mu^*}\frac{d\mu^*}{d\rho} = \frac{\rho}{\mu}\sum_{l,i}\frac{\partial\mu}{\partial\tau_l}\frac{\partial\tau_l}{\partial c_i}\frac{\partial c_i}{\partial\rho} = -\frac{\rho}{\mu}\sum_{l,i,j}\left(\frac{\mu^2}{b_a}\, f_l\, \frac{\partial\tau_l}{\partial c_i}\, M^i_j f_j\right) = -\frac{\mu}{b_a}\, f^\top E f, \tag{46}$$

where $\partial c_i/\partial\rho = b_i = \sum_j M^i_j f_j$ according to Eqs 9 and 11, and the last equality comes from the definition of the indirect elasticity (Eq 24). From this expression and λ in Eq 28, we see that −λ = μAρ; at optimality, the negative KKT multiplier for the density constraint, −λ, quantifies the absolute increase in growth rate caused by a marginal increase in ρ, given by μ itself times the proportional change, Aρ. Thus, the extra term in the shadow price of transporters (compare θs in Eq 56 to θr and θe) quantifies the growth rate benefit gained by allowing the violation of the density constraint (Eq 18) caused by a small increase in fs.
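The relation −λ = μAρ can be verified numerically on a small example by perturbing ρ, re-optimizing, and comparing with the closed-form expression −(μ/b_a) f⊤E f. The sketch below uses a hypothetical two-reaction model (transporter s, ribosome r) with irreversible Michaelis-Menten kinetics; the model, parameter values, and function names are illustrative assumptions, and ρ is assumed to affect τ only through the internal concentration c_m.

```python
X = 1.0
KCAT_S, KM_S = 10.0, 1.0
KCAT_R, KM_R = 5.0, 50.0
RHO = 340.0


def turnover_times(f_r, rho):
    """tau at f = (1, f_r); only tau_r depends on rho, through c_m = rho * (1 - f_r)."""
    tau_s = (1.0 + KM_S / X) / KCAT_S
    tau_r = (1.0 + KM_R / (rho * (1.0 - f_r))) / KCAT_R
    return tau_s, tau_r


def growth_rate(f_r, rho):
    tau_s, tau_r = turnover_times(f_r, rho)
    return f_r / (tau_s + f_r * tau_r)


def f_E_f(f_r, rho):
    """f^T E f for f = (1, f_r), with E = eps M and M^m_s = 1, M^m_r = -1."""
    eps_rm = -KM_R / (KCAT_R * rho * (1.0 - f_r) ** 2)   # rho * d tau_r / d c_m
    E_rs, E_rr = eps_rm, -eps_rm
    return f_r * (E_rs * 1.0 + E_rr * f_r)


def argmax_golden(fun, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if fun(c) > fun(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def optimal_state(rho):
    f_r = argmax_golden(lambda fr: growth_rate(fr, rho), 1e-9, 1.0 - 1e-9)
    return f_r, growth_rate(f_r, rho)


f_r_opt, mu_opt = optimal_state(RHO)
b_a = f_r_opt                                             # b_a = M^a_r f_r with M^a_r = 1
A_rho_formula = -mu_opt / b_a * f_E_f(f_r_opt, RHO)       # Eq 46
lam = mu_opt ** 2 / b_a * f_E_f(f_r_opt, RHO)             # Eq 28

# Numerical check: relative change of the re-optimized growth rate per relative change in rho.
delta = 1e-4
mu_up = optimal_state(RHO * (1.0 + delta))[1]
mu_dn = optimal_state(RHO * (1.0 - delta))[1]
A_rho_numeric = (mu_up - mu_dn) / (2.0 * delta * mu_opt)

print(f"A_rho (Eq 46) = {A_rho_formula:.6f}, numeric = {A_rho_numeric:.6f}, lambda = {lam:.6f}")
```

In this toy model A_ρ is positive and λ is negative: a higher density raises the metabolite concentration at fixed f, which shortens the ribosome's turnover time.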

Just as the economy of growth is deeply connected to protein allocation, so is growth control. For Aj and Ax, this connection is clear from Eqs 44 and 45, respectively. For Aρ, we first note that it relates to optimal marginal values via Eq 53,

$$\mu A_\rho = -\lambda = \sum_j (\partial_j\mu)\, f_j = \sum_s (\partial_s\mu)\, f_s. \tag{47}$$

At optimality, the summands on the RHS are zero for the ribosome and for enzymatic reactions ((∂jμ) fj = 0 for j = r, e), and the summation over j can thus be restricted to only transporters s. Thus, at optimality, the absolute change in optimal growth rate caused by increasing ρ, μAρ, is equal to the summed marginal effects of transport fluxes on the growth rate, ∂sμ, weighted by the flux fractions fs themselves. To see the full connection between Aρ and protein allocation, we insert Eq 33 into Eq 47 to obtain

$$A_\rho = -\sum_{l,s}\left(\frac{\partial\phi_l}{\partial f_s}\right)_{\mu} f_s. \tag{48}$$

This equation shows that the proportional effect on the optimal growth rate that is exerted by a marginal increase in ρ, Aρ, equals the combined marginal effects of transport fluxes fs on proteome allocation fractions, weighted by the transport fluxes themselves.

Discussion

Modeling frameworks that are essentially linear, such as FBA and RBA, are typically analyzed numerically, as the efficiency of linear programming facilitates fast solutions even for genome-scale models [7, 8, 11]. In contrast, the construction and solution of genome-scale non-linear models face two major obstacles, both intimately linked to the kinetic rate laws. First, experimental estimates for the required kinetic parameters (kcat and Km values in the simplest case of generalized Michaelis-Menten kinetics) are lacking for most reactions [42]. This problem can be alleviated by using parameter estimates from artificial intelligence approaches [43–45]. Second, the non-linearity of enzymatic rate laws makes numerical optimizations much more difficult than for linear systems, explaining why existing studies have been limited to models with only a handful of reactions [6, 15–19]. Numerical optimization is particularly problematic for models with redundant pathways, where the optimization problem is non-convex [20].

The succinct mathematical formulation for modeling balanced cellular growth developed in this paper helps to address both problems. On the one hand, the reduction of the problem description to a minimal number of independent variables (the flux fractions) reduces the dimensionality of the search space, and may thus help to accelerate numerical approaches to find optimal states. On the other hand, this formulation allowed us to identify necessary conditions for states of maximal growth rate. For enzymatic reactions e, these conditions (Eq 34) are “local” in the sense that they only depend on the flux fractions f and on the kinetic parameters of reactions directly connected to e itself, i.e., they only depend on the fluxes and parameters of reactions whose turnover times τj are directly affected by changes in fe. In contrast, the optimality conditions for the ribosome and transport reactions (see Eqs 57 and 59) do require the knowledge of the full vector f and of all model parameters, which are required explicitly or via μ as determined by Eq 16.

The concise formulation also helped in the interpretation of the optimality conditions from the perspectives of economy and control theory. The marginal change in growth rate induced by each flux change is seen as the flux’s marginal economic value, while the growth adaptation coefficient of each model parameter or external concentration is the change in the optimal growth rate induced by a marginal change in this parameter. The close correspondence between the mathematical expressions obtained in both perspectives helps to clarify the mathematical and conceptual links between these usually separate fields of study, including the extension of previous results of metabolic control analysis (MCA), developed for ad-hoc objectives in static sub-networks, to the holistic problem of cellular growth in GBA models. In MCA, one typically treats enzyme concentrations as control variables and studies how small changes to them affect reactant concentrations and fluxes. Here, all these variables are not only connected, but are uniquely determined by the flux fraction vector f. Moreover, the growth rate μ itself is explicitly connected to f through the growth function (Eq 16). Through these connections, we can quantify the sensitivity of the cellular growth rate, and hence approximately of organismal fitness, to changes in the control variables π, something not possible in the usual MCA framework [9, 10]. The growth adaptation coefficients provide explicit expressions for the effects on growth rate caused by small changes in control variables at optimality. Due to the close relationship between growth rate and fitness, these estimates could be used to interpret and predict evolutionary changes in these variables.

A closely related nonlinear cellular modeling approach accounts for the different amino acid compositions of individual proteins by including “personalized” ribosome reactions for each protein [23, 46, 47]. In contrast to GBA, this type of model cannot be simplified using flux fractions f, as it requires a mathematical formulation that includes explicit variables for metabolite concentrations. Experimental data for E. coli [48] indicate that the relative content of the 20 amino acids in its total proteome changes very little over 22 highly distinct growth environments (mean coefficient of variation = 2.46%, maximal CV = 7.55%, see Table A in S1 Text), suggesting that, at least globally, different protein compositions are likely not a major factor driving significant changes in the optimal cellular state. Thus, a unique ribosome reaction with fixed column Mr is a realistic assumption over all these growth conditions. Further study is necessary to identify whether the different compositions of individual proteins may cause significant changes in their allocation across environments.

All analytical results in this study were derived exclusively from the growth constraints assumed in GBA models: mass conservation in balanced growth, reaction kinetics, cellular density, and non-negative concentrations. For the analysis of optimal growth, we encoded all corresponding information into a single Lagrangian function, parameterized in terms of the constraints. We formulated the problem with the flux fractions f as the only free variables, and used KKT conditions to obtain the necessary conditions for optimal growth states (OGSs). Through these conditions, the marginal protein allocation emerges as the natural underlying currency in the cell economy; this relationship has frequently been asserted [31, 36, 37], but is derived here entirely from first principles.

The KKT framework provides a straightforward way to incorporate new constraints, analogous to how physical theories using the Lagrangian formalism account for additional forces by adding corresponding functions and Lagrange multipliers to the Lagrangian. A re-derivation of the KKT conditions will then result in an extended set of balance equations. Among the potential extra physiological constraints, one might also consider phenomenological constraints such as the recently reported relationship between the cellular surface/volume ratio and the growth rate [49].

One fundamental physiological limitation that could be included in this way but is not considered explicitly here is the diffusion limit of molecules within cellular compartments. This limit links density and kinetic constraints. A higher dry mass density increases the “crowding effect” within cells [26], which entails a lower diffusion rate and by consequence a longer time for reactants to find their catalysts; this effect can be modeled directly by including a corresponding dependence in the Michaelis constants Km. A study on the crowding effects of all cellular concentrations—including those of small molecules—found that the observed E. coli dry mass density is in the range expected if evolution had optimized the cellular density for maximal growth rate [33]. In this sense, a fixed density constraint on all molecules, as considered here, may be seen as a simplifying approximation, justified by the observed constancy of cellular buoyant and dry mass densities across different growth conditions [25, 49], with the exception only of large changes in environmental osmolarities [26].

The Lagrangian formalism described here also allows a direct generalization of the theory to other objective functions, i.e., other measures of fitness at balanced growth. This can be done by incorporating a new objective function F(f) and adding a new constraint for the growth rate via the term ω(μ − μ0), where μ is determined by the growth function, ω is the corresponding KKT multiplier, and μ0 is the constrained growth rate, now given as an input.
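As a sketch of what such a generalization could look like, assume that the Lagrangian of the growth problem has the form implied by Eqs 49 to 53 in Methods, i.e., the objective plus a density term with multiplier λ and non-negativity terms with multipliers θ_j. The symbol for the generalized Lagrangian and the sign conventions below are assumptions made for illustration, not taken from the text.

```latex
% Assumed form of the generalized Lagrangian: the growth rate is replaced
% by a new objective F(f) and enters only through the constraint mu = mu_0.
\mathcal{L}_F(\mathbf{f}, \lambda, \omega, \boldsymbol{\theta})
  = F(\mathbf{f})
  + \lambda \Big( \sum_j \gamma_j f_j - 1 \Big)
  + \omega \big( \mu(\mathbf{f}) - \mu_0 \big)
  + \sum_j \theta_j f_j \tau_j

% Stationarity combined with complementary slackness (theta_j f_j tau_j = 0)
% then gives the analogue of Eq 51:
\theta_j = - \frac{ \partial_j F + \omega \, \partial_j \mu + \lambda \gamma_j }{ \tau_j }
```

Under these assumptions, the growth rate enters the conditions only through the weighted term ω ∂_j μ, so ω plays the role of the marginal value of growth rate with respect to the new objective.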

An important step toward a more general theory of cellular growth would be to extend the present analytical approach to changing environments, and to derive similar analytical conditions for time-dependent optimal cellular states f(t). In this situation, fitness is determined by the proportional growth in a given period of time, so the objective function becomes the integral of the specific growth rate μ(t) [17, 50], under the same constraints as discussed here. This dynamical extension to a theory of proportional growth optimization would help to generalize the existing results on dynamic metabolic flux optimization [51], and to build more realistic models for cells in cyclical environments, such as feast-famine cycles of the gut microbiome [52] or day-night cycles of photosynthetic microbes [19].
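To make the time-dependent objective concrete: for exponential growth with a time-varying specific rate, the fold change in biomass over a period equals the exponential of the integral of μ(t). The following minimal sketch evaluates this with the trapezoidal rule; the function name and the feast-famine time course are hypothetical and serve only as an illustration.

```python
import math


def proportional_growth(times, mu_values):
    """Fold change in biomass, exp(integral of mu dt), via the trapezoidal rule."""
    integral = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        integral += 0.5 * (mu_values[i] + mu_values[i - 1]) * dt
    return math.exp(integral)


# HYPOTHETICAL feast-famine cycle: times in hours, growth rates in 1/h.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
mu_values = [1.0, 1.0, 0.0, 0.0, 1.0]

fold_change = proportional_growth(times, mu_values)
print(f"fold change over the cycle: {fold_change:.3f}")
```

Maximizing this fold change is equivalent to maximizing the integral itself, since the exponential is monotonic.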

In sum, the concise mathematical formulation of the growth optimization problem developed here provides a powerful toolbox for the analysis and solution of mechanistic descriptions of optimal cellular physiology and growth. It thereby opens a path toward a fundamental understanding of organizing principles of biological cells. While biological systems will never be fully optimal, the study of optimal growth strategies provides an extremely useful null model for the action of natural selection.

Methods

The necessary KKT conditions include the primal feasibility conditions (Eqs 18 and 19), and

$$\partial_j \mathcal{L} = 0 \quad \text{(stationarity)} \tag{49}$$
$$\theta_j f_j \tau_j = 0 \quad \text{(complementary slackness)}, \tag{50}$$

where $\partial_j := \partial/\partial f_j$ indicates the partial derivative with respect to $f_j$.

The stationarity conditions can be solved for the corresponding optimal multipliers $\theta_j$, resulting in

$$\theta_j = -\frac{\partial_j \mu + \lambda \gamma_j}{\tau_j}, \tag{51}$$

where $\lambda$ is the optimal value for the density multiplier. After an element-wise multiplication of both sides of Eq 51 with $f_j \tau_j$, we can use the complementary slackness ($\theta_j \tau_j f_j = 0$) to get

$$(\partial_j \mu) f_j + \lambda \gamma_j f_j = 0. \tag{52}$$

Now summing the last equation over all $j$ and using the primal feasibility (Eq 18) results in

$$\lambda = -\sum_j (\partial_j \mu) f_j. \tag{53}$$
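The logic of Eqs 50 to 53 can be checked numerically on a toy problem. The sketch below is not a GBA model: it assumes a hypothetical concave objective μ(f) = −½ Σ_j (f_j − a_j)², constant τ_j > 0, and γ_j = 1 for all j, so that the optimum is the Euclidean projection of a onto the simplex. All names and numbers are illustrative assumptions.

```python
def project_to_simplex(a):
    """Euclidean projection of a onto {f : f_j >= 0, sum_j f_j = 1}."""
    shift = 0.0
    cumulative = 0.0
    for k, value in enumerate(sorted(a, reverse=True), start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0.0:
            shift = candidate  # last k passing the test gives the threshold
    return [max(a_j - shift, 0.0) for a_j in a]


def kkt_multipliers(a, tau):
    """Optimal f, density multiplier (Eq 53) and theta_j (Eq 51) for the toy problem."""
    f = project_to_simplex(a)
    grad_mu = [a_j - f_j for a_j, f_j in zip(a, f)]  # d mu / d f_j for the toy mu
    lam = -sum(g * f_j for g, f_j in zip(grad_mu, f))  # Eq 53
    theta = [-(g + lam) / t for g, t in zip(grad_mu, tau)]  # Eq 51, gamma_j = 1
    return f, lam, theta


# HYPOTHETICAL toy parameters (illustration only).
a = [0.8, 0.5, -0.2]
tau = [2.0, 1.0, 4.0]

f, lam, theta = kkt_multipliers(a, tau)
slackness = [th * f_j * t for th, f_j, t in zip(theta, f, tau)]  # Eq 50
print("f =", f, "lambda =", lam, "theta =", theta)
```

In this toy case the third variable is inactive (f_3 = 0) with a positive multiplier, while the two active variables have multipliers that vanish up to rounding, which is the pattern that complementary slackness requires.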

Combining Eqs 51, 27 and 28, we can now express each multiplier θj explicitly in terms of the flux fractions f at optimality, resulting in slightly different expressions for ribosomal, enzymatic, and transport reactions:

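$$\theta_r = \frac{\mu}{b^a}\,\frac{1}{\tau_r}\left(-M^a_r + \mu\,\tau_r + \mu\,\mathbf{f}\,\mathbf{E}_r\right) \tag{54}$$
$$\theta_e = \frac{\mu}{b^a}\,\frac{1}{\tau_e}\left(\mu\,\tau_e + \mu\,\mathbf{f}\,\mathbf{E}_e\right) \tag{55}$$
$$\theta_s = \frac{\mu}{b^a}\,\frac{1}{\tau_s}\left(\mu\,\tau_s + \mu\,\mathbf{f}\,\mathbf{E}_s - \mu\,\mathbf{f}\,\mathbf{E}\,\mathbf{f}\,\gamma_s\right). \tag{56}$$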

By inserting these expressions into the complementary slackness conditions (Eq 50), we can now solve for f, which results in the balance equations for ribosomal, enzymatic, and transport reactions:

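$$\left(M^a_r - \mu\,\tau_r - \mu\,\mathbf{f}\,\mathbf{E}_r\right) f_r = 0 \tag{57}$$
$$\left(\tau_e + \mathbf{f}\,\mathbf{E}_e\right) f_e = 0 \tag{58}$$
$$\left(\tau_s + \mathbf{f}\,\mathbf{E}_s - \mathbf{f}\,\mathbf{E}\,\mathbf{f}\,\gamma_s\right) f_s = 0, \tag{59}$$

where we simplified the expressions by exploiting that $\mu, b^a, \tau_j \neq 0$.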
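The term proportional to γ_s in the transport expressions (Eqs 56 and 59) comes from the density multiplier λ. One way to see this is sketched below. The sketch assumes that Eqs 27 and 28 give the growth rate derivative in the form shown in the first line, and that the growth function satisfies μ Σ_j τ_j f_j = b^a with b^a = M^a_r f_r; both forms are inferred from Eqs 54 to 56 rather than quoted from the main text.

```latex
% Assumed derivative of the growth function (M^a_j = 0 unless j = r):
\partial_j \mu = \frac{\mu}{b^a}
  \left( M^a_j - \mu\,\tau_j - \mu\,\mathbf{f}\,\mathbf{E}_j \right)

% Insert into Eq 53, using \sum_j M^a_j f_j = b^a and
% \mu \sum_j \tau_j f_j = b^a:
\lambda = -\sum_j (\partial_j \mu)\, f_j
        = -\frac{\mu}{b^a}
          \left( b^a - b^a - \mu\,\mathbf{f}\,\mathbf{E}\,\mathbf{f} \right)
        = \frac{\mu^2}{b^a}\,\mathbf{f}\,\mathbf{E}\,\mathbf{f}

% Eq 51 for a transport reaction s (M^a_s = 0, \gamma_s \neq 0)
% then reproduces Eq 56:
\theta_s = -\frac{\partial_s \mu + \lambda \gamma_s}{\tau_s}
         = \frac{\mu}{b^a}\,\frac{1}{\tau_s}
           \left( \mu\,\tau_s + \mu\,\mathbf{f}\,\mathbf{E}_s
                  - \mu\,\mathbf{f}\,\mathbf{E}\,\mathbf{f}\,\gamma_s \right)
```

Under the same assumptions, setting γ_j = 0 recovers Eq 55 for enzymatic reactions, and keeping the M^a_r term recovers Eq 54 for the ribosome reaction.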

Supporting information

S1 Text. Detailed derivation of the balance equations, rate laws and kinetic parameters, mass balance and the stoichiometric matrix S, examples of GBA models, the dependence of λ on transporters, optimal enzyme concentrations and control coefficients, Fig A (Schematics and parameters defining each model example), Table A (Amino acid frequency in the E. coli proteome at various growth conditions).

(PDF)

S1 File. Zip file containing files for numerical optimization.

As described in the S1 Text.

(ZIP)

Acknowledgments

We thank Stefan Müller for discussions about KKT conditions, Alexander Kroll for verifying early calculations, and Xiao-Pan Hu for providing data in Table A in S1 Text.

Data Availability

All data and code are available in the S1 File.

Funding Statement

This work was funded by a Volkswagenstiftung “Life?” grant to MJL, and by the German Research Foundation through grant CRC 1310 to MJL, and, under Germany’s Excellence Strategy, grant EXC 2048/1 (Project ID: 390686111) to OE and MJL. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Bremer H, Dennis PP. Modulation of Chemical Composition and Other Parameters of the Cell at Different Exponential Growth Rates. EcoSal Plus. 2008.
  • 2. Fisher RA, Bennett JH. The genetical theory of natural selection: a complete variorum edition. Oxford University Press; 1999. Available from: http://books.google.com/books?id=sT4lIDk5no4C.
  • 3. Dourado H, Mori M, Hwa T, Lercher MJ. On the optimality of the enzyme–substrate relationship in bacteria. PLOS Biology. 2021;19(10):1–18. doi: 10.1371/journal.pbio.3001416
  • 4. Dourado H, Lercher MJ. An analytical theory of balanced cellular growth. Nature Communications. 2020;11(1):1226. doi: 10.1038/s41467-020-14751-w
  • 5. Hu XP, Dourado H, Schubert P, Lercher MJ. The protein translation machinery is expressed for maximal efficiency in Escherichia coli. Nature Communications. 2020;11(1):5260. doi: 10.1038/s41467-020-18948-x
  • 6. Molenaar D, van Berlo R, de Ridder D, Teusink B. Shifts in growth strategies reflect tradeoffs in cellular economics. Molecular Systems Biology. 2009;5(1):323. doi: 10.1038/msb.2009.82
  • 7. Goelzer A, Fromion V, Scorletti G. Cell Design in Bacteria As a Convex Optimization Problem. Automatica. 2011;47(6):1210–1218. doi: 10.1016/j.automatica.2011.02.038
  • 8. O’Brien EJ, Lerman JA, Chang RL, Hyduke DR, Palsson B. Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction. Molecular Systems Biology. 2013. doi: 10.1038/msb.2013.52
  • 9. Heinrich R, Rapoport TA. A Linear Steady-State Treatment of Enzymatic Chains. European Journal of Biochemistry. 1974;42(1):89–95. doi: 10.1111/j.1432-1033.1974.tb03318.x
  • 10. Kacser H, Burns JA. The control of flux. Symp Soc Exp Biol. 1973;27:65–104.
  • 11. Lewis NE, Nagarajan H, Palsson BO. Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nature Reviews Microbiology. 2012;10(4):291–305. doi: 10.1038/nrmicro2737
  • 12. Watson MR. Metabolic maps for the Apple II. Biochemical Society Transactions. 1984;12(6):1093–1094. doi: 10.1042/bst0121093
  • 13. Goelzer A, Muntel J, Chubukov V, Jules M, Prestel E, Nölker R, et al. Quantitative prediction of genome-wide resource allocation in bacteria. Metabolic Engineering. 2015;32:232–243. doi: 10.1016/j.ymben.2015.10.003
  • 14. Mori M, Hwa T, Martin OC, De Martino A, Marinari E. Constrained Allocation Flux Balance Analysis. PLOS Computational Biology. 2016;12(6):1–24. doi: 10.1371/journal.pcbi.1004913
  • 15. Weiße AY, Oyarzún DA, Danos V, Swain PS. Mechanistic links between cellular trade-offs, gene expression, and growth. Proceedings of the National Academy of Sciences. 2015;112(9):E1038–E1047. doi: 10.1073/pnas.1416533112
  • 16. Maitra A, Dill KA. Bacterial growth laws reflect the evolutionary importance of energy efficiency. Proceedings of the National Academy of Sciences. 2015;112(2):406–411. doi: 10.1073/pnas.1421138111
  • 17. Giordano N, Mairet F, Gouzé JL, Geiselmann J, de Jong H. Dynamical Allocation of Cellular Resources as an Optimal Control Problem: Novel Insights into Microbial Growth Strategies. PLOS Computational Biology. 2016;12(3):e1004802. doi: 10.1371/journal.pcbi.1004802
  • 18. Towbin BD, Korem Y, Bren A, Doron S, Sorek R, Alon U. Optimality and sub-optimality in a bacterial growth law. Nature Communications. 2017;8:14123. doi: 10.1038/ncomms14123
  • 19. Faizi M, Zavřel T, Loureiro C, Červený J, Steuer R. A model of optimal protein allocation during phototrophic growth. BioSystems. 2018;166:26–36. doi: 10.1016/j.biosystems.2018.02.004
  • 20. Wortel MT, Noor E, Ferris M, Bruggeman FJ, Liebermeister W. Metabolic enzyme cost explains variable trade-offs between microbial growth rate and yield. PLOS Computational Biology. 2018;14(2):1–21. doi: 10.1371/journal.pcbi.1006010
  • 21. Müller S, Regensburger G, Steuer R. Enzyme allocation problems in kinetic metabolic networks: Optimal solutions are elementary flux modes. Journal of Theoretical Biology. 2014;347:182–190. doi: 10.1016/j.jtbi.2013.11.015
  • 22. Wortel MT, Peters H, Hulshof J, Teusink B, Bruggeman FJ. Metabolic states with maximal specific rate carry flux through an elementary flux mode. The FEBS Journal. 2014;281(6):1547–1555. doi: 10.1111/febs.12722
  • 23. de Groot DH, Hulshof J, Teusink B, Bruggeman FJ, Planqué R. Elementary Growth Modes provide a molecular description of cellular self-fabrication. PLOS Computational Biology. 2020;16(1):1–33. doi: 10.1371/journal.pcbi.1007559
  • 24. Baldwin WW, Myer R, Powell N, Anderson E, Koch AL. Buoyant density of Escherichia coli is determined solely by the osmolarity of the culture medium. Archives of Microbiology. 1995;164(2):155–157. doi: 10.1007/BF02525322
  • 25. Kubitschek HE, Baldwin WW, Graetzer R. Buoyant density constancy during the cell cycle of Escherichia coli. Journal of Bacteriology. 1983;155(3):1027–1032. doi: 10.1128/jb.155.3.1027-1032.1983
  • 26. Cayley S, Lewis BA, Guttman HJ, Record MT. …Characterization of the cytoplasm of Escherichia coli K-12 as a function of external osmolarity: Implications for protein-DNA interactions in vivo. Journal of Molecular Biology. 1991;222(2):281–300. doi: 10.1016/0022-2836(91)90212-O
  • 27. Schuster S, Hilgetag C. On elementary flux modes in biochemical reaction systems at steady state. Journal of Biological Systems. 1994;02(02):165–182. doi: 10.1142/S0218339094000131
  • 28. Benyamini T, Folger O, Ruppin E, Shlomi T. Flux balance analysis accounting for metabolite dilution. Genome Biology. 2010;11(4):R43. doi: 10.1186/gb-2010-11-4-r43
  • 29. Marion JB, Thornton ST. Classical dynamics of particles and systems; 3rd edition. San Diego: Harcourt Brace Jovanovich; 1988. Available from: https://bib-pubdb1.desy.de/record/337680.
  • 30. Luenberger DG, Ye Y. Linear and nonlinear programming. 3rd ed. International Series in Operations Research & Management Science. New York, NY: Springer; 2008.
  • 31. Noor E, Flamholz A, Bar-Even A, Davidi D, Milo R, Liebermeister W. The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization. PLOS Computational Biology. 2016;12(11):1–29. doi: 10.1371/journal.pcbi.1005167
  • 32. Liebermeister W. The value structure of metabolic states. bioRxiv. 2022.
  • 33. Pang TY, Lercher MJ. Optimal density of biological cells. bioRxiv [Preprint]. 2020.
  • 34. Karush W. Minima of functions of several variables with inequalities as side constraints. MSc Dissertation, Dept of Mathematics, Univ of Chicago. 1939.
  • 35. Kuhn HW, Tucker AW. Nonlinear Programming. In: Neyman J, editor. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. 1951; p. 481–492.
  • 36. Dekel E, Alon U. Optimality and evolutionary tuning of the expression level of a protein. Nature. 2005;436(7050):588–592. doi: 10.1038/nature03842
  • 37. Kafri M, Metzl-Raz E, Jona G, Barkai N. The Cost of Protein Production. Cell Reports. 2016;14(1):22–31. doi: 10.1016/j.celrep.2015.12.015
  • 38. Klipp E, Heinrich R. Competition for enzymes in metabolic pathways: Implications for optimal distributions of enzyme concentrations and for the distribution of flux control. Biosystems. 1999;54(1):1–14. doi: 10.1016/S0303-2647(99)00059-3
  • 39. Wilken SE, Besançon M, Kratochvíl M, Foko Kuate CA, Trefois C, Gu W, et al. Interrogating the effect of enzyme kinetics on metabolism using differentiable constraint-based models. Metabolic Engineering. 2022; p. S1096-7176(22)00117-3. doi: 10.1016/j.ymben.2022.09.002
  • 40. Afriat S. Theory of Maxima and the Method of Lagrange. SIAM Journal on Applied Mathematics. 1971;20(3):343–357. doi: 10.1137/0120037
  • 41. Scott M, Gunderson CW, Mateescu EM, Zhang Z, Hwa T. Interdependence of cell growth and gene expression: Origins and consequences. Science. 2010;330(6007):1099–1102. doi: 10.1126/science.1192588
  • 42. Nilsson A, Nielsen J, Palsson BO. Metabolic Models of Protein Allocation Call for the Kinetome. Cell Systems. 2017;5(6):538–541. doi: 10.1016/j.cels.2017.11.013
  • 43. Heckmann D, Lloyd CJ, Mih N, Ha Y, Zielinski DC, Haiman ZB, et al. Machine learning applied to enzyme turnover numbers reveals protein structural correlates and improves metabolic models. Nature Communications. 2018;9(1):5252. doi: 10.1038/s41467-018-07652-6
  • 44. Kroll A, Engqvist MKM, Heckmann D, Lercher MJ. Deep learning allows genome-scale prediction of Michaelis constants from structural features. PLOS Biology. 2021;19(10):1–21. doi: 10.1371/journal.pbio.3001402
  • 45. Li F, Yuan L, Lu H, Li G, Chen Y, Engqvist MKM, et al. Deep learning-based kcat prediction enables improved enzyme-constrained model reconstruction. Nature Catalysis. 2022;5(8):662–672. doi: 10.1038/s41929-022-00798-z
  • 46. Müller S. Elementary growth modes/vectors and minimal autocatalytic sets for kinetic/constraint-based models of cellular growth. bioRxiv [Preprint]. 2021.
  • 47. Müller S, Széliová D, Zanghellini J. Elementary vectors and autocatalytic sets for resource allocation in next-generation models of cellular growth. PLOS Computational Biology. 2022;18(2):1–27. doi: 10.1371/journal.pcbi.1009843
  • 48. Schmidt A, Kochanowski K, Vedelaar S, Ahrné E, Volkmer B, Callipo L, et al. The quantitative and condition-dependent Escherichia coli proteome. Nature Biotechnology. 2015;34:104. doi: 10.1038/nbt.3418
  • 49. Oldewurtel ER, Kitahara Y, van Teeffelen S. Robust surface-to-mass coupling and turgor-dependent cell width determine bacterial dry-mass density. Proceedings of the National Academy of Sciences. 2021;118(32):e2021416118. doi: 10.1073/pnas.2021416118
  • 50. Nordholt N, van Heerden JH, Bruggeman FJ. Biphasic Cell-Size and Growth-Rate Homeostasis by Single Bacillus subtilis Cells. Current Biology. 2020;30(12):2238–2247.e5. doi: 10.1016/j.cub.2020.04.030
  • 51. Planqué R, Hulshof J, Teusink B, Hendriks JC, Bruggeman FJ. Maintaining maximal metabolic flux by gene expression control. PLOS Computational Biology. 2018;14(9):1–20. doi: 10.1371/journal.pcbi.1006412
  • 52. Mori M, Schink S, Erickson DW, Gerland U, Hwa T. Quantifying the benefit of a proteome reserve in fluctuating environments. Nature Communications. 2017;8(1):1225. doi: 10.1038/s41467-017-01242-8
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011156.r001

Decision Letter 0

Pedro Mendes, Kiran Raosaheb Patil

30 Jan 2023

Dear Dr. Dourado,

Thank you very much for submitting your manuscript "Growth Mechanics: General principles of optimal cellular resource allocation in balanced growth" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. Please note that the correct Reviewer #3 comments are now attached.

While you should address all reviewers' comments, the most important issue would be to provide a more extensive biological motivation for the work and a case study, as requested by Reviewer 1.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Kiran Raosaheb Patil, Ph.D.

Section Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Summary:

In this work, the authors introduce a mathematical method to perform growth balance analysis based on a known method to solve convex nonlinear optimization. The current manuscript only focuses on the mathematical formulation but does not provide biologically relevant examples to demonstrate the usefulness of the formulation. The manuscript often reads as lacking in essential detail. Comments are also provided below after the reviewing process. In addition, the reviewer suggests that the authors move detailed mathematical derivations and formulation to Supplementary Methods and focus on explaining the formulation (i.e., so as to help readers understand rather than derive the formulation).

Major concerns

The introduction appears to present only a partial picture of the proposed GM framework with respect to other, non-linear, metabolic modeling frameworks, where many are concerned with the paucity of information needed to construct a kinetic model. Frameworks such as Ensemble Models (EMs), GRASP, iSCHRUNK (within ORACLE), and K-FIT are used to overcome this barrier. Is this not also a barrier to the GBA/GM frameworks? Other common terms in linear modeling frameworks, such as genome-scale models (used in the discussion, line 590) should be introduced here if used elsewhere.

Do elements of vector v in M*v = mu*c (Eq. 3) correspond to fluxes of the reactions facilitated/catalyzed by transporters (transport reaction), enzymes (metabolic reaction), and ribosomes (protein translation reaction)? Or do those elements in v correspond to those machineries’ synthesis fluxes? If the former is true, then how are the machinery’s synthesis fluxes accounted for in the formulation?

In addition, it is unclear to the reviewer on what constraints are formed from the column “r” and row “a” of M. On protein and ribosome, total protein concentration is constrained but to be less than a variable “c”? Then, “c^a” is constrained to be less than “rho”. Is “rho” a constant/parameter? What about ribosome concentration? Does this mean the model’s maximal growth rate is limited by other constraint(s) rather than protein and ribosome availability (as proteins and ribosomes can be produced as much as the cells need)?

Line 183-184, the statement “The property (2) guarantees mass conservation within reactions…” is not true as M is derived from S. The two sets of positive and negative coefficients in M were mentioned to be normalized on individual reactions. In other words, the two sums of molecular weight of positive and negative terms, respectively, can differ (which would indicate that the reaction is mass- and element-imbalanced).

To create Eq. 11, does this mean that the growth rate “mu” has to be constant? Thus, to solve a GBA optimization (previous or current formulation), does “mu” need to be set to a constant? If so, how could one solve for the maximal “mu” in GBA?

Line 222, the authors state that extracellular concentrations, “x”, are constant. Is this assumption compatible with simulation for batch conditions (i.e., nutrients are depleted over time)?

Based on Eq. 19, are flux fraction “f_j” variables positive? What about reversible reactions? Also, could the Eq. 19 constraint be replaced by the “f_j >= 0” constraint because “tau_j” are positive? If not, could the authors provide a comment on the current form of Eq. 19?

It would be helpful to readers to include in the growth problem (Eq. 21) notation to indicate the variables’ domains (e.g., are flux variables strictly non-negative?). This could also be helpful in understanding how to construct models which can use this framework (e.g. are reversible reactions allowed?).

The reviewer does not find support for the statement presented in lines 605 to 607: “These conditions are local for each reaction, i.e., they do not require complete knowledge of the cellular reaction network and its kinetics.” Without complete knowledge, how are M and “tau” defined? In this work, overall, there is no concrete, concise, statement or statements as to what knowledge is needed to build a model to which this analysis can be applied.

According to the abstract and title, the purpose of this manuscript is to introduce the Growth Mechanics (GM) framework, but the phrase “Growth Mechanics” is only used in the title and abstract, and the acronym is only used in the abstract and last paragraph of the discussion. What is the GM framework? How is it different from the GBA framework, and where are the generalizations which are mentioned in the abstract? How is it more powerful? Some discussion on the distinctiveness of the two frameworks and a comparison of their applications would be useful.

Marked inconsistencies exist in how “tau” is discussed. In Table 1, it is noted as a turnover time for the reaction, and later on discussed as the inverse of the usual factor in kinetic rate laws (lines 152 to 153, the reviewer assumed this refers to a “k_cat” with units of inverse time, so it is understood that “k_cat” = 1/“tau”). In enzyme kinetics, turnover number is a constant for the system (e.g. “k_cat” = “V_max” / “E_t”). So Table 1 and the beginning of the manuscript suggest a constant “tau” value. However, In line 148, this statement is made: “...and adding the kinetic rate laws ‘tau’ and density ‘rho’”. Further, in lines 156 to 158, it is stated that “tau” is dependent on concentrations for both participant and non-participant metabolites. In Eq. 4, τ is treated as a concentration-dependent function. In these lines and equations, it appears that “tau” evolves from a simple turnover time vector into a Michaelis-Menten form description of the reaction rate, as lines 161 to 165 discuss Michaelis constants and turnover numbers, as well as the unit for input to this equation, and line 333 discusses using a “tau” that follows “a simple Michaelis-Menten rate law”.

How are proteins included and formulated in the model? Is it then that there is a single pseudo-protein in the GBA models (the section from lines 118 to 120 is written in the singular form, indicating a single protein metabolite and single protein-producing reaction from the ribosome)? How is the composition of this single protein determined? Note that this seems to contradict lines 149 to 151, Eqs. 4 and 5, as well as many other places in the manuscript which suggest that each reaction j has an associated protein or protein complex.

This manuscript is considered by the reviewer to be incomplete because no (biologically) relevant examples and applications are provided for the methods and computational models. The authors mentioned examples in Supplementary Materials but they are only simple toy examples.

If the tool requires the use of pre-defined kinetic law (in “tau”) with pre-defined kinetic parameters, then some other tool would need to be used in conjunction with this modeling framework. This is acknowledged in Lines 592 to 595. However, many such parameter estimating tools or approaches (such as ensemble modeling) integrate parameter estimation with kinetic parameter estimation. Therefore, what would be the advantage of using this approach as opposed to an integrated approach? How do these model structures compare with other kinetic modeling approaches?

It appears that a number of bilinear terms exist in the formulation. How is this consistent with the purported convexity of the formulation?

Minor concerns

Line 15, Constraint Based Reconstruction and Analysis (COBRA) and Genome-Scale Model (GSM or GEM) are more commonly used acronyms, and should be included here for clarity and easier linking to related works.

Lines 33-40, is the argument being made that RBA and ME models are both CBM models? If so, specify. Many other works consider these a different class of models from GSM models, so this could be unintentionally confusing if not clarified.

Lines 43-44, “widely adopted” rather than “most powerful” should be used.

Table 1, please make sure that a symbol is defined before it is used (index i is noted as containing m and a, but m is not yet defined).

Line 110, for clarity please specify if M,τ, and ρ are parameters or variables.

Lines 123-124, please specify if the negative and positive entries are within a particular column.

Line 163, please ensure consistency in how units are described. In Table 1, the units for v are described as “[mass][volume]^-1[time]^-1” whereas here they are described as “[mass x volume ^-1 x time ^-1]”. This happens elsewhere as well, this example is intended to draw all such instances to the authors’ attention.

Line 168, could the value for “rho” be provided in the main text?

Line 237, please specify the “the previous two equations”.

Reviewer #2: Review of Growth Mechanics: General principles of optimal cellular resource allocation in balanced growth

by Bob Planqué

This is a well-written and most welcome paper on the conditions that hold in states of optimal balanced growth, in the particular case of an EFM/EGM, and in which it is assumed that all protein and ribosome in the cell may be lumped into one protein compartment. I particularly like the new view of reformulating the balanced growth equations using flux fractions, and the fact that in this particular model the metabolite concentration indeed drop out of the equations, even though they were explicitly taken into account to start with.

I have enjoyed reading the manuscript, and only have a number of comments for improvement, clarification and link to other literature.

One of the omissions that I think should really be dealt with is the link to the elementary mode literature. The case considered is that in which the reaction matrix has full rank. This is equivalent to restricting to an Elementary Mode, whether it is an Elem. Flux Mode [refs 21 ,22 in the ms] or an Elem Growth Mode [ref 44,45, maybe also 46?], as I am sure the authors are well aware. Given the central role E(F/G)Ms have come to play in our systems biology literature, I think it is essential that this concept is mentioned, rather than just refer to papers that deal with them.

In l 87, the authors refer to two papers in which it is proved that EFMs are specific flux optimisers. In [44], it is shown that EGMs are growth rate optimisers. The model considered in this paper falls somewhat in the middle of these approaches (EFMs disregarding the self-replication aspect and not accounting for protein synthesis, EGMs accounting for everything, differentiating between different enzyme synthesis rates, etc.), and it is not immediately clear to me that the EFM proofs apply to the case at hand. Maybe it is best to refer to [44] (and maybe also [45] which deals with other cases in-between EFMs and EGMs and also contains proofs of growth rate optimisation in elementary modes) when making the claim that one may restrict to the case of full matrix rank. Then all bases are covered.

l 117 and further: Here the matrix M is introduced. I had trouble understanding the construction of the last row of M, even though it turned out to be trivial. I do not think the construction follows Molenaar et al (who did differentiate between protein compartments, and were the inspiration of the introduction of the 'alpha' ribosomal fractions used in [44]). Please explain this last row more clearly, preferably by giving a tiny example. At present, examples are in the SI (but this is not mentioned in the ms at this point, so this could be another solution), which is not too accessible to the reader.

In Eq (5) - (6) there are two constraints, but in l 192 it is mentioned that the protein constraint is an emergent property, rather than a real hard constraint (i.e. c^a is not known beforehand). So why include it then? I guess I'm missing something here.

l 242: the recursion has been noted in older papers, starting with the RBA models by Goelzer et al (2009). It is a mainstay of ME-type models. It is clearly also mentioned in [44]. The insight that this recursion disappears here is very neat, and I need to think about that more deeply. But please add some refs here.

Growth analysis, page 9.

This whole section aims to derive conditions that hold at optimality + steady state. In particular for EFMs (without the self-replication part taken into account), this has been done in Planqué et al. (2018), in which such equations are coupled to dynamic enzyme synthesis rates to control the maximal specific flux in varying environments. The situation here is of course a bit different, but the two situations are closely related, so a reference seems in place, either here or in the Discussion.

In this section, it is also not mentioned whether such optimal states actually necessarily exist, or whether there are multiple (local) optima. This all has to do with the convexity properties of the relevant functional, of which the authors are well aware. References such as the paper by Wolfram and Elad in 2016 on Enzyme Cost Minimisation and convexity, and also (Planqué et al. 2018) which improves this slightly to strict convexity, are relevant, but I don't think they solve the case here immediately. The authors would do well to change 'the optimal state' to 'an optimal state' in several places, such as in lines 345 and 385.

Discussion

l 606: I read there are local conditions for each reaction as necessary conditions for optimal growth. I didn't quite understand this part of the ms, I have to say (I didn't have the time to think in detail about it), but I find this surprising. Surely, because reactions have substrates and products, such conditions must be coupled? See Planqué et al (2018) for a situation where this is clearly the case. But as I said, the situations do not exactly compare.

I somehow find peculiar the idea of lumping all protein and ribosome into one pool, while on the other hand calculating (differing) steady state protein and ribosome concentrations for different conditions. Would it not be more natural to explain this by saying that all protein synthesis rates (per unit of ribosome) are assumed to be equal (which is also what comes out of having a constant relevant amino acid abundance assumption, see l 638, and which is also discussed in [44])? Now it sometimes reads as if you both lump things and do not lump things. The idea of marginal protein allocation clearly hinges on proteins being present in different concentrations (in optimum or otherwise). I think this just needs to be clarified, preferably already in the part where the model is introduced.

l 683: The situation considered here is that of optimal control, but there are alternatives, such as adaptive control. In the latter case, this is essentially the qORAC-like framework, see Planqué et al (2018). As it already exists (for a slightly different case than the one considered here), and the extension from qORAC to the present paper would be only a small change (I think), it should be cited here.

Minor comments and questions:

l 36: the burden => the enzyme/protein burden?

l 72: what is the difference between a fixed protein concentration and a fixed combined mass density of their components? I think I understand because I know how this is usually formulated, but the reader might not.

l 284: set OF kinetic parameters

l 294: what are shadow prices? I understand there must be some link to economic arguments, but this needs to be explained, or at least given some reference to aid the reader.

References:

Planqué et al. (2018). Maintaining maximal metabolic rates by gene expression control. PLoS Comput Biol. 14(9):e1006412.

Reviewer #3: In this work, Dourado et al. extend their framework called Growth Balance Analysis to a more general system with several meaningful advantages. All formulae are expressed using a single independent variable f (for flux) and therefore are simpler than other types of cell growth models. By this, they lay the groundwork for a universal modeling approach which has the potential to bring together many disjoint approaches and hopefully increase cooperation between modelers that use FBA, MCA, kinetic models and others. The text is very well written and I haven't found any scientific or mathematical issues.

I can offer a few suggestions that might improve the text even further:

1. The use of Einstein's summation convention is, in my view, not very helpful. Indeed, it might appeal to some physicists and could be a bit less verbose - but I don't think the benefit outweighs the downside of being less standard and harder to read for many people.

2. I might have misunderstood, but it appears that fluxes (v^j) and turnover times (τ^j) can be negative (and indeed they are not constrained to be non-negative like other variables). I think stating this explicitly could help readability.

3. At some point, metabolite concentrations and protein concentrations are "replaced" with fluxes based on the balanced growth assumption (i.e. unlike in FBA where fluxes balance to 0, here each one balances to the dilution rate of the metabolite/protein defined by cellular growth). First, this idea was presented in a similar context in 2010 by Benyamini et al. (https://doi.org/10.1186/gb-2010-11-4-r43), albeit only for metabolites.

Furthermore, in the discussion (lines 602-603) the authors highlight the advantage of using this approach in minimizing the number of independent variables and thus assisting the numerical solvers. However, this might be a slight overstatement since many models indeed have explicit protein concentration variables, but then the biosynthesis flux is the dependent variable (and very simple to express as a function of the concentration). For metabolites, the case is not very different (the Sv = 0 constraint might have more rows and be more rank deficient, but almost all solvers can deal with this easily).
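
As a brief illustration of the contrast drawn in point 3, the two balance conditions can be written side by side. The notation is generic and assumed for illustration only (S for the stoichiometric matrix of internal components, v for the flux vector, c for the vector of concentrations, μ for the growth rate); it need not match the symbols used in the manuscript.

```latex
\begin{align}
  % Flux balance (FBA): no net production of internal metabolites
  S\,v &= 0 \\
  % Balanced growth: net production of each component offsets its dilution by growth
  S\,v &= \mu\,c
  \quad\Longrightarrow\quad
  c = \frac{1}{\mu}\,S\,v \qquad (\mu > 0)
\end{align}
```

Under the second condition, the concentrations follow from the fluxes and the growth rate, which is the sense in which they are "replaced" rather than carried as separate independent variables.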

4. The manuscript is quite long and dense (in terms of mathematical definitions and derivations). There is no easy solution to this, but perhaps the authors might consider splitting it or moving some parts to a supplementary section and really focus only on the main message. In addition, perhaps a few toy examples (with simulations or analytical solutions) could be helpful as keeping track of all the abstract math symbols all the way until the end is a bit daunting. That being said, I greatly appreciated the table of symbols on page 4. Perhaps one could add more of these symbols to figure 1A as well?

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Elad Noor

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011156.r003

Decision Letter 1

Pedro Mendes, Kiran Raosaheb Patil

4 May 2023

Dear Dr. Dourado,

We are pleased to inform you that your manuscript 'Mathematical properties of optimal fluxes in cellular reaction networks at balanced growth' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Pedro Mendes, PhD

Academic Editor

PLOS Computational Biology

Kiran Patil

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed our comments and suggestions for changes. The revised manuscript is significantly improved.

Reviewer #3: The authors have addressed all of my and the other reviewers' comments.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: Yes: Elad Noor

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011156.r004

Acceptance letter

Pedro Mendes, Kiran Raosaheb Patil

1 Jun 2023

PCOMPBIOL-D-22-01649R1

Mathematical properties of optimal fluxes in cellular reaction networks at balanced growth

Dear Dr Dourado,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Detailed derivation of the balance equations, rate laws and kinetic parameters, mass balance and the stoichiometric matrix S, examples of GBA models, the dependence of λ on transporters, optimal enzyme concentrations and control coefficients, Fig A (Schematics and parameters defining each model example), Table A (Amino acid frequency in the E. coli proteome at various growth conditions).

    (PDF)

    S1 File. Zip file containing files for numerical optimization.

    As described in the S1 Text.

    (ZIP)

    Attachment

    Submitted filename: Response to Reviewers.pdf

    Data Availability Statement

    All data and code are available in the S1 File.


    Articles from PLOS Computational Biology are provided here courtesy of PLOS
