Abstract
The physiology of biological cells evolved under physical and chemical constraints, such as mass conservation across the network of biochemical reactions, nonlinear reaction kinetics, and limits on cell density. For unicellular organisms, the fitness that governs this evolution is mainly determined by the balanced cellular growth rate. We previously introduced growth balance analysis (GBA) as a general framework to model and analyze such nonlinear systems, revealing important analytical properties of optimal balanced growth states. It has been shown that at optimality, only a minimal subset of reactions can have nonzero flux. However, no general principles have been established to determine if a specific reaction is active at optimality. Here, we extend the GBA framework to study the optimality of each biochemical reaction, and we identify the mathematical conditions determining whether a reaction is active or not at optimal growth in a given environment. We reformulate the mathematical problem in terms of a minimal number of dimensionless variables and use the Karush-Kuhn-Tucker (KKT) conditions to identify fundamental principles of optimal resource allocation in GBA models of any size and complexity. Our approach helps to identify from first principles the economic values of biochemical reactions, expressed as marginal changes in cellular growth rate; these economic values can be related to the costs and benefits of proteome allocation into the reactions’ catalysts. Our formulation also generalizes the concepts of Metabolic Control Analysis to models of growing cells. We show how the extended GBA framework unifies and extends previous approaches of cellular modeling and analysis, putting forward a program to analyze cellular growth through the stationarity conditions of a Lagrangian function. GBA thereby provides a general theoretical toolbox for the study of fundamental mathematical properties of balanced cellular growth.
Author summary
Mathematical models are an important tool to understand and predict the complex behavior of biological cells. This behavior is driven by nonlinear physical constraints that cannot be captured entirely in the prevalent modeling frameworks, which rely on simplified linear optimizations. The next generation of more realistic cell models will depend on an efficient mathematical formulation for the corresponding nonlinear optimization problem that facilitates the analytical study and numerical simulation of large models. Here, we present a succinct formulation for this nonlinear problem, and we derive the analytical properties of fluxes at optimal growth. We also show how these analytical properties can be understood in terms of economics and control theory, where they expose trade-offs related to the allocation of proteins.
Introduction
A core feature of microbial cells is self-replication—their ability to build a complete, identical cell exclusively out of the chemical compounds found in the environment. If a population of asynchronously replicating microbial cells grows exponentially in a constant environment, its self-replication can often be assumed to result from balanced growth, a non-equilibrium steady state in which every cellular component accumulates at the same rate in proportion to its total amount [1]. For non-interacting microbes in a constant environment, the balanced growth rate is equivalent to fitness [2].
The cellular composition is thus often interpreted as an approximate solution to a problem of optimal allocation, driven by natural selection. Accordingly, theoretical methods estimating the optimal allocation are used as a reference to understand cellular composition in vivo [3–8].
At the whole-cell level, a mechanistic understanding of the quantitative principles that shape cellular balanced growth has been approached predominantly through methods collectively classified as constraint-based modeling (CBM). CBM approaches define a solution space of feasible cellular states (usually defined by reaction fluxes) based on simple, mechanistic constraints. The predominant constraint in CBMs is flux balance, encoded through a linear system of equations that constrain the space of allowed reactions fluxes v [9, 10],
(1) |
Here, v is a vector of reaction fluxes, i.e., reaction rates in units of [moles][time]−1[volume]−1. Each row of the stoichiometric matrix S corresponds to one metabolite, while each column corresponds to a metabolic reaction, with entries listing the corresponding stoichiometric coefficients of substrates (negative integers) and products (positive integers).
Thermodynamics and physiological limits—such as a limited nutrient uptake capacity—are typically approximated through fixed upper and/or lower bounds on the modeled fluxes v [11]. The most widely used CBM approach, Flux Balance Analysis (FBA) [11, 12], obtains plausible physiological states by optimizing some objective function over the feasible flux vectors. Frequently, the objective function is the flux through a hypothetical biomass reaction vbio, which mimics the accumulation of precursors for macromolecules and the consumption of energy for their assembly during growth.
Resource Balance Analysis (RBA) and metabolism and expression models (ME-models)—which are also based on optimization under constraints—go beyond FBA by aiming to model metabolism in its most general sense, with the ultimate goal of representing all chemical reactions that occur in a living organism [8, 13]. In contrast to FBA, these methods take into account the burden of producing the macromolecules (proteins and RNA) required for catalyzing each flux. They approximate the corresponding kinetic rate laws as linear relations between fluxes and the concentration of their catalysts, ignoring the dependence on reactant concentrations (except for dependencies on extracellular concentrations, which serve as model parameters).
All widely used CBMs [8, 11, 13, 14] are formulated as linear optimization problems, which can be solved efficiently even for genome-scale models with thousands of reactions. Accordingly, they are currently the most efficient tools to predict and understand realistic cellular models. However, by construction, these linear methods cannot capture the potentially complex nonlinear relationship between biochemical reaction fluxes—and hence cellular growth—and the concentrations of reactants involved as substrates and products. Instead of accounting for nonlinear kinetics, these methods rely on linear, phenomenological assumptions.
Nonlinear CBMs [6, 15–19] account for constraints such as nonlinear kinetic rate laws, linking the concentration of metabolites to reaction fluxes. This link means that the metabolite concentrations are now an output of the model instead of an input. Molenaar et al. [6] introduced “self-replicator” models that maximize the cellular growth rate, with reaction fluxes that are limited by fundamental physiological constraints including mass conservation, nonlinear rate laws, and limited protein density. Importantly, these models are completely self-contained, in the sense that in order to grow and self-replicate, all of a model’s individual components have to be produced explicitly by the model itself. Instead of using a phenomenological “biomass reaction”, the constrained optimization of growth predicts the detailed cell composition, and all possible trade-offs in resource allocation can be accounted for from first principles.
Similar to RBA and ME models, self-replicator models include a “ribosome” reaction that produces the necessary proteins. The proteins can be classified into three categories: transport proteins in the cell surface, which exchange mass with the environment; enzymes, which catalyze internal metabolic reactions; and the ribosome itself, which catalyzes the internal protein production, and which for simplicity is assumed to be composed only of proteins. The study of models of this type relies on the numerical solution of nonlinear optimizations: while it in principle accommodates models with any number of reactions, actual presented models have small, highly simplified reaction networks [6, 15–19].
We have previously formalized a general framework for modeling and analyzing nonlinear CBMs, an approach we termed growth balance analysis (GBA) [4] (Fig 1). GBA models are based on the self-replicator scheme, but instead of considering a fixed protein concentration, they consider a fixed combined mass density of all intracellular components, including also metabolites. Optimal cellular resource allocation, as predicted by GBA models, emerges exclusively from quantitative biochemical and physical principles, including the intrinsic nonlinear nature of the underlying reaction kinetics. In general, the optimization of nonlinear models is a non-convex problem, frequently hampered by the existence of multiple local optima [20]. Several studies have explored ad-hoc analytical solutions to convex, minimal nonlinear cell models consisting of up to three cellular reactions. Despite their simplicity, simulations with these schematic models are qualitatively consistent with the experimentally observed behaviour of actual cells [6, 15–19].
The optimization of large nonlinear CBMs, such as GBA models, is still an open problem for numerical methods [20]. Thus, in previous work, we proposed a different approach to the analysis of GBA models—instead of looking for the optimal state of a GBA model with numerical methods, we ask: what are the analytical properties of this optimal state? We named the equations specifying these analytical properties the balance equations of the corresponding GBA model [4]. If we further assume, as most CBM approaches do [6, 8, 11, 13–19], that cells are at an optimal state (or at least close to one), then the balance equations become useful tools to estimate and understand the resource allocation in actual biological cells. We derived the balance equation for each reactant in a GBA mode and applied these equations to successfully predict the protein allocation into the ribosome of both E. coli and yeast across various growth conditions [4]. The application of these balance equations to estimate complete cellular states is however limited; in our original derivation [4], the stoichiometric matrix S of interest is assumed to have full column rank, which is the case for all optimal states of GBA models when one restricts S to the columns corresponding to active reactions (i.e., those with non-zero flux) [4, 21, 22]. The active reactions at optimal growth form an Elementary Growth Mode (EGM) [23], and they represent an Elementary Flux Mode (EFM) [21, 22] of a related FBA problem [4]. Unfortunately, however, the optimal choice of this EGM/EFM and its constituting active reactions is not known a priori for large-scale models, and cannot be explained by our previous analytical study [4].
Below, we generalize our previous analytical study of GBA models by deriving the analytical properties of each reaction at optimal balanced growth, now also accounting for models with column-rank-deficient stoichiometric matrices—i.e., biochemical networks with alternative pathways. These analytical properties can be seen as generalized balance equations, explaining from first principles the optimal resource allocation strategy for each reaction in a cell. In particular, they explain from first principles the exact mathematical condition determining whether a reaction is active or not in an optimal growth state. We then interpret these balance equations in terms of marginal costs and benefits of reactions with respect to their influence on growth, and quantify how changes in the model parameters and external conditions control the optimal growth rate.
Results
We first present the notation and mathematical definitions for growth optimization, including an objective function and constraints. We then reformulate the problem in terms of flux fractions as the only free variables, which greatly simplifies the subsequent analytical study. Finally, we explore the consequences that emerge from the necessary optimality conditions in terms of economics and control theory, and discuss their biological significance.
Growth modeling
We define a GBA model as the triple of model parameters (M, τ, ρ). The matrix M describes the mass fractions of internal reactants consumed and produced by each reaction; τ is a vector of catalytic turnover times for all reactions, where each is a function of internal reactant concentrations c and possibly also external concentrations x (assumed to be fixed and given); and ρ is the combined mass concentration of all internal components. In the following paragraphs, we provide more detailed descriptions of the model constituents M, τ, and ρ. Here and below, we use the term “reaction” to also encompass transport processes across the cell surface, which are “catalyzed” by transporter proteins or protein complexes.
The matrix M was first introduced in [4]. It is constructed from the stoichiometric matrix S for the total, closed system, i.e., including rows for external reactants. We add a column “r” for the ribosome reaction that produces all cellular protein, as well as a row “a” corresponding to the total concentration of all proteins in mass units. We now first convert all entries to masses, by multiplying each row with the corresponding molecular mass. Because of mass balance, each column must then sum to 0. We next normalize each column such that the sum of its negative entries equals −1 and the sum of its positive entries equals +1. Now the entries correspond to the mass fractions of each reactant (rows) going into and out of each reaction (columns), as illustrated for the example in Fig 1B. Finally, we reduce the normalized matrix to a matrix for an open system, by dropping all rows for reactants external to the modeled cell. For the remaining internal reactants, we will assume a quasi-steady state and thus enforce mass conservation.
As illustrated in Fig 2, to simplify the notation for the following theoretical development, we partition the columns of M (indexed together by j) into index sets for reaction types: s for transport processes across the cell surface; e for internal enzymatic reactions; and r for the ribosome reaction, which is the only one that produces protein. We partition the rows of internal reactants (indexed together by i) into indices m for metabolites and a for total protein. We use the term “metabolites” in its more general sense, referring to any molecule in the cell that is not a protein. We distinguish vectors by using boldface, and vector components by using italics with the appropriate upper or lower index, e.g., c is the column vector of all internal reactant concentrations, ci are its components, and we use a lower index to indicate the components ci of the row vector c⊤.
The ribosome reaction represents the last step in protein synthesis, and is assumed to be catalyzed by a “ribosome” consisting entirely of protein. We here ignore the RNA components of the ribosome for simplicity, but it is possible to extend the modeled ribosome to a more realistic RNA-protein complex. In addition, the enzymatic reactions (e) could be extended so that they include details of protein translation that occur before the last, “ribosome” step (r) [5]. Note that nonlinear genome-scale GBA models can be created from existing linear genome-scale models, by extending their stoichiometric matrix S with the addition of a ribosome reaction, normalizing it to M with the molecular masses, and adding the kinetic rate laws τ and density ρ (see below).
We assume that every reaction represented by a column j in M is catalyzed by a protein or protein complex with concentration pj—specifically, a transporter (s), an enzyme (e), or the ribosome (r). The corresponding flux vj is assumed to be proportional to pj, expressed as vj = pj/τj(c, x). Thus, τj is a function defined as the inverse of the usual metabolite-dependent factor in kinetic rate laws. τj must have a negative value when the flux is negative. Fig 1C shows the relationship between turnover time τ and turnover number kcat. S1 Text provides a basic discussion of rate laws and the necessary kinetic parameters. c = (cm, ca)⊤ is the vector of internal reactant concentrations, comprising all metabolite concentrations cm as well as ca, the combined mass concentration of all proteins. Hence, each τj may depend not only on the concentrations of the substrates and products of the corresponding reaction, but also on inhibitors and regulatory metabolites not involved in the turnover itself. The transport processes s are the only reactions whose rate laws may depend on the external concentrations x.
Note that in accordance with the normalization of M, all concentrations of proteins pj and reactants ci throughout this work are in units of [mass][volume]−1. Fluxes ([mass][volume]−1[time]−1) and the kinetic parameters must then also be expressed in mass units, e.g., Michaelis constants Km in [mass][volume]−1 and turnover numbers kcat in product mass per protein mass per time, resulting in [time]−1.
ρ, the final constituent of GBA models, is the sum of all internal concentrations. We assume ρ to be constant, which is consistent with experimental data on E. coli across growth conditions and even across the cell cycle [24–26]. The mass balances exploited for the normalization of M mean that all reactants involved in reactions must be accounted for in the model and hence be included in the value of ρ; e.g., in a realistic model water is a reactant in many reactions, so ρ corresponds in this case to the total cellular density (or buoyant density). Simplified models may instead include only dry mass components, so that both M and ρ consider only these.
Mass conservation implies that in the mass fraction matrix M, each column sum is zero if it involves only the consumption and production of internal reactants (indices e, r). In contrast, transport reactions (with indices s), which bring mass into and out of the modeled system, do not conserve mass, resulting in the equations
(2) |
The property (2) guarantees mass conservation within reactions, an information that is not always fully encoded in the stoichiometric matrix S, as many models ignore common reactants such as water (see discussion in S1 Text). While external reactants have no corresponding rows in M, their concentrations x may influence the turnover times τs of transporters. We present examples of GBA models in S1 Text and R code for their numerical optimization in S1 File.
We are interested in the cellular physiology, defined through the concentration vectors c, p and the vector of reaction fluxes v, at balanced growth. For a given model specified by (M, τ, ρ) and a given environment specified by x, balanced growth at the instantaneous rate μ is specified by the following constraints:
(3) |
(4) |
(5) |
(6) |
(7) |
(8) |
where ⊙ indicates element-wise multiplication. We say that any state (c,p,v) satisfying Eqs 3–8 with growth rate μ > 0 is a Balanced Growth State (BGS) for the model specified by (M, τ, ρ) and the environment specified by x. The Optimal Growth State (OGS) is the BGS resulting in the maximal growth rate μ. It can be shown that any OGS must always use a minimal subset of active reactions, i.e., growth becomes impossible if one of the active reactions is deactivated without simultaneously activating other reactions [4, 21, 22]. We term BGSs that use such minimal subsets of reactions Elementary Growth States (EGSs) [4]. Each EGS corresponds to an Elementary Flux Mode (EFM) [27] of the “linearized” version of the balanced growth problem with fixed concentrations c [4]; in that case, τ(c, x) also have fixed values and all Eqs 3–8 become linear. EGSs are specific instances of Elementary Growth Modes (EGMs) [23], sets of states using the same minimal set of active reactions.
The model’s balanced growth property is captured by the right hand side of Eq 3. We assume that the growth rate is always positive, μ > 0. Thus, for internal nodes with non-zero concentration (ci ≠ 0), there is a necessary mass flow to offset the dilution through the associated volume growth at rate μ [28]. Note that the total protein concentration ca defined by Eq 5 has no fixed value, but is constrained by the concentration of enzymes and transporters required to sustain the reaction fluxes, by a row “a” in Eq 3 that specifies mass conservation in balanced growth, and by the total density constraint in Eq 6. We summarize our assumptions about proteins as the following: i) all proteins have the same amino acid composition (determined by the entries in the column Mr) and are produced by the ribosome following identical kinetics; ii) the total protein concentration ca is defined as the sum of all protein mass concentrations p by Eq 5; iii) ca relates to the ribosome flux vr and growth rate μ via the row “a” in the mass conservation constraint given by Eq 3 (); and iv) ca also relates to the density constraint via the sum in Eq 6. From our modeling perspective, we might think of “protein” as being produced by the ribosome and instantly distributed across all reactions such that each individual protein catalyst (transporter, enzyme, or ribosome) maintains its concentration in balanced growth.
Below, we will be interested in the analytical properties of the OGS for a given model (M, τ, ρ) and environment x. From Eqs 3–5, we see that the variables (c, p, v) are highly interdependent. The above formulation does not lend itself to expressing μ as an explicit function of these variables, which makes it not ideal for numerical or analytical studies. If one can find a mathematically equivalent formulation based on fewer, independent variables, then this would facilitate the use of the KKT conditions, analogous to how generalized coordinates facilitate the solution of problems in Lagrangian mechanics [29]. Thus, we next focus on a corresponding reformulation of the optimization problem. This formulation will apply to all BGSs, and only later we will use it to examine OGSs.
A reformulation in terms of flux fractions f
Our guiding thought below is that there can be a correspondence between cell states at different growth rates, which can be expressed in the form of scaling relations. These scaling relations extend the mass fraction scaling of M to fluxes and concentrations. Specifically, we define biomass fractions
(9) |
(equivalent to c = ρ b, since ρ > 0), which express concentrations as fractions of the total cellular density; and we define flux fractions
(10) |
(equivalent to v = μρ f, since we assume μρ > 0), which express fluxes as fractions of the net mass uptake—i.e., the net growth—of the cell, μρ. Thus, each flux fraction fj describes the activity of reaction j relative to the total cellular mass production. We note that the flux fractions f may in principle assume any real value, in the same way the fluxes vj do, including negative values when the corresponding reactions are running backwards (in which case τj < 0). The reversibility of reactions is not an input, but an emergent output pattern here, due to the sign of each turnover time τj, as we discuss below.
Importantly, Eq 3 describing mass conservation in balanced growth does not depend explicitly on μ anymore when written in terms of f and b:
(11) |
This equation also implies that the mass fractions b are uniquely determined by the flux fractions f, independently of μ. Conveniently, this unique dependence also means that we can express the turnover times as functions of only f and the fixed parameters ρ, M, and x:
(12) |
In the following discussion, we mostly focus on the dependence of τ on f, and for simplicity of notation we do not state the dependence of τ on the fixed parameters (ρ, M, x) explicitly. Importantly, τ does not depend explicitly on μ, which otherwise would cause a recursion problem when further expressing the growth rate μ in terms of only f and τ(f), as we will see below. The resulting dimensionality reduction of the solution space not only simplifies the analytical considerations below, but also potentially accelerates numerical optimizations [30].
From Eqs 4 and 10, we obtain the expression for protein concentrations in terms of f, μ, and ρ,
(13) |
The combined mass fraction of all proteins in the cell, , is the sum of all p in the last equation, divided by ρ:
(14) |
Thus, we can calculate the total protein mass fraction during balanced growth from μ and f, based only on reaction kinetics. However, through Eq 11, the same total protein mass fraction is also related to f through the corresponding row “a” in M:
(15) |
where Ma is the row of M corresponding to the total protein concentration, and the second equality reflects our assumption that the “ribosome” reaction r is the only one producing proteins (with no reaction consuming them), so that for . Equating the right hand sides of the previous two Eqs 14 and 15 and solving for μ (with ba ≠ 0 ⇒ f⊤τ(f) ≠ 0), we get the growth function
(16) |
Thus, the growth rate becomes an explicit function of only the flux fractions f. μ still depends on the fixed parameters ρ, M, and x through the functions τ = τ(ρ M f, x). Note that if fluxes v were used instead of the flux fractions f, then τ(c, x) = τ(M v/μ, x), which would cause a recursion issue when defining the growth rate as a function of v and τ following the same procedure [7, 23]. In that case, one is forced to account for c as separate variables, thereby increasing the dimensionality of the problem. The same recursion issue occurs when formulating the problem in terms of protein concentrations p.
From now on, we will consider b (Eq 11) and τ (Eq 12) as functions of f, and treat f as the only free variables. After writing the growth rate μ as a function of f, we now do the same thing for our remaining constraints, so now we have much fewer variables and constraints.
In the scaled variables, the density constraint (Eq 6) is reduced to
(17) |
Using Eq 11, we can rewrite this constraint in terms of flux fractions f. We see that in balanced growth, the density constraint (Eq 17) is equivalent to a flux balance on the cell surface,
(18) |
where the second equality comes from Eq 2: only the columns s sum up to non-zero values γs, so only transport fluxes fs are limited by this constraint. The nature of this constraint as a global mass balance becomes more evident if we multiply the whole expression by μρ: the net mass uptake going through the cell surface must equal the rate of biomass production μρ.
Any solution to the growth function (Eq 16) automatically respects internal mass conservation, protein density and the kinetic constraints: for any given vector f, μ(f) returns the unique growth rate satisfying these constraints (which also depend on ρ through τ = τ(ρ M f, x). The flux balance at the cell surface is enforced separately by Eq 18 on transporters, making these fundamentally different from enzymatic and ribosome reactions. In particular, for a model with only one transporter s, Eq 18 already determines the scaled uptake rate fs = (γs)−1. With two transport fluxes, one flux is uniquely determined by the other; a simple example would be a model that only has transporters for glucose uptake and CO2 excretion (see example model “C” in S1 Text). More generally, Eq 18 can be used to uniquely determine one transport flux fraction in terms of the others, reducing the number of variables by one. For clarity of presentation, however, we will keep Eq 18 as a separate constraint and not eliminate any variable, until the introduction of growth control coefficients in the corresponding section.
Finally, writing the non-negativity constraints on proteins and reactant concentrations in terms of f results in the following element-wise inequalities on the corresponding vectors
(19) |
(20) |
We are now in the position to provide a concise formulation of growth rate optimization in terms of flux fractions f. Combining Eqs 16, 18, 19, 20, the optimal growth problem for a given environment x becomes
(21) |
where f is a vector containing a real-valued flux fraction for each reaction (, with N ≔ number of columns in M), and the turnover times τ = τ(ρ M f, x) are functions that depend on f and on the parameters ρ, M, x. In the following discussion, we assume that all τj are different from zero, which simply means there is no reaction with infinite turnover rate (this relationship can be visualized in Fig 1C). We note that the direction of reactions (i.e. the sign of each fj) is not enforced here such as in some other methods, but instead emerges as a result of the optimization (21); because of the constraint 19, a non-zero fj should always have the same sign as τj, which is in turn a thermodynamic property of reaction j determined by its kinetic parameters and the relevant reactant concentrations [31]. If the rate laws have a general functional form, these functions will be parameterised by the set of kinetic parameters . After solving this optimization problem, all original cellular variables (unscaled fluxes as well as unscaled metabolite and protein concentrations) can be easily reconstructed from f. In the following, we will refer to π as the vector of parameters that define the optimization problem, which includes M, ρ, and x, as well as the elements of . The parameters in π are considered fixed until the section “Growth Control and Adaptation”, where we study the sensitivity of optimal growth to marginal changes in the components of π.
Table 1 lists all symbols used below.
Table 1. Symbols used.
Symbol | Description (units) |
A | growth adaptation coefficient |
b | biomass fraction vector |
c | reactant concentration vector ([mass][volume]−1) |
C | control coefficient matrix |
f | flux fraction vector |
E | indirect sensitivity matrix ([time]) |
set of kinetic parameters (various units) | |
Lagrangian ([time]−1) | |
M | mass fraction matrix |
N | number of reactions (= number of columns in M) |
p | protein concentration vector ([mass][volume]−1) |
S | stoichiometric matrix ([mol]) |
v | flux vector ([mass][volume]−1[time]−1) |
x | external reactant concentration vector ([mass][volume]−1) |
γ | vector with column sums of M |
Γ | growth control coefficient ([time]−1) |
ε | direct sensitivity matrix ([time]) |
ϕ | proteome mass fraction vector |
θ | KKT multiplier for the protein non-negativity constraint ([time]−2) |
λ | KKT multiplier for the density constraint ([time]−1) |
μ | growth rate ([time]−1) |
π | parameter vector (various units) |
ρ | mass density ([mass][volume]−1) |
τ | turnover time vector ([time]) |
Index | Description |
a | all proteins |
e | enzymatic reactions |
m | metabolites |
i | internal reactants (including m and a) |
r | ribosome reaction |
s | surface reactions (i.e., transport reactions) |
j | reactions (including s, e, r) |
l | reactions (including s, e, r) |
Growth analysis
Next, we utilize the problem reformulation in terms of flux fractions f to derive general necessary conditions of OGSs, valid for any GBA model—including those with redundant pathways. First, for each reaction, we will derive explicit expressions for shadow prices in the optimal state; in constrained optimization in economics, the shadow price is the change, per infinitesimal unit of the constraint, in the optimal value of the objective function of an optimization problem obtained by relaxing the constraint. This term has been applied also to biological systems in the context of constraint-based optimization. [32]. We then derive equations for the state variables f themselves, which must hold in any optimal state. This development constitutes a generalization of our previous analytical approach to GBA [4], which was restricted to models with matrices M of full column rank. The latter condition is not generally satisfied by realistic cellular models, as many cellular biochemical reactions are structurally redundant, i.e., their columns in M are linearly dependent on other columns. OGSs always have non-redundant active reactions (i.e., the active M has full column rank) [4, 21, 22], but this optimal set of active reactions is generally not known a priori. In contrast, the following analysis in terms of flux fractions is valid for any M of arbitrary size and rank.
For the following, we emphasize that the state of our system is completely determined by scaled fluxes fj, which serve as independent variables. All other variables are fully dependent on them: the unscaled fluxes v, the scaled and unscaled concentrations b, c, and p, the reaction times τ, and the growth rate μ.
All following analyses benefit from knowing the system’s sensitivity to small changes of each of the independent variables fj. The partial derivatives of the system’s properties c(f), v(f), b(f), τ(f), p(f), and μ(f) with respect to each fj provide explicit expressions for sensitivity coefficients similar to the ones introduced in Metabolic Value Theory [32], based on the original concepts of Metabolic Control Analysis (MCA) [9, 10]. A unique feature of the present treatment arises from the system of equations in Eq 11, which determines the linear dependence of biomass fractions b on f, so that the partial derivative of bi with respect to fj is given simply as
(22) |
Via the chain rule of differentiation, this expression also determines the partial derivatives with respect to fj for any functions of bi. A case of particular interest in the following discussions is the vector of turnover times τ = τ(c, x) = τ(ρ b, x) = τ(ρ M f, x). We first define the (direct) time elasticities (elasticities in short), the sensitivity of each turnover time τl(c, x) = τl(ρ b, x) with respect to each biomass fraction bi, as
(23) |
where we used the chain rule of differentiation in the first equality and Eq 9 in the second. We then use the direct elasticities to express the sensitivity of τl to a change in a flux fraction fj, defined as the indirect time elasticity matrix E (or indirect elasticity in short), with entries
(24) |
where we used Eq 22 in the last equality. In the following discussion, we assume that the kinetic rate laws do not depend on the total protein concentration ca, meaning for all reactions l. That would be different if, for example, one accounts for the macromolecular crowding effects via kinetic rate laws [33]. The indirect elasticities E and direct elasticities ε share some resemblance with the Jacobian and elasticity matrices defined in Metabolic Value Theory and MCA, although we do not intend to explore the exact relationships in this work. For an example of direct and indirect elasticities, where τ follows a simple Michaelis-Menten rate law, see S1 Text.
In the remainder of this paper, we will explore three complementary types of analyses of GBA systems. First, in the growth optimality section we will state the analytical conditions necessary for an optimal state f*. Second, in the growth economy section we will calculate the sensitivity of a (not necessarily optimal) growth rate μ to small changes in each f, which we interpret in economic terms as marginal values of reactions. Third, in the growth control and adaptation section we will estimate the sensitivity of the optimal growth rate μ* to small changes in the previously fixed parameters π. In each of these analyses, the sensitivity measures captured by E will appear naturally in the results.
Growth optimality
We next calculate the necessary analytical conditions for the optimal growth state (OGS). This calculation extends our previous analytical approach, which was restricted to GBA models with matrices M of full column rank [4], to general GBA models with arbitrary matrices M, facilitated by the reformulation of the GBA problem in terms of flux fractions f. We approach this problem by studying the Karush-Kuhn-Tucker (KKT) conditions [34, 35], which generalize the method of Lagrange multipliers by also accounting for inequality constraints, here present due to the non-negativity of concentrations. To simplify the presentation in this section, we here account explicitly only for the non-negativity of protein concentrations, but not for the non-negativity of metabolite concentrations. Under the reasonable assumption that metabolites with zero concentration do not participate in any active reactions, the resulting necessary conditions are also necessary when accounting for this latter constraint; the full calculations can be found in S1 Text.
We define the Lagrangian for a given GBA model (M, τ, ρ) and external concentrations x as
(25) |
The KKT multipliersλ, θ are auxiliary variables used to find the optimal state, but also encode important economic and control information about the system at optimality, as we will see later. λ relates to the equality constraint enforcing the fixed cell density, connected to f via the flux balance at the surface (Eq 18); θ relate to the inequality constraint enforcing the non-negativity of proteins (Eq 19). Solving the KKT conditions (see Methods for details), we get the balance equations determining the necessary condition for each reaction j at optimal growth:
(26) |
where ∂jμ ≔ ∂μ/∂fj indicates the partial derivative of μ with respect to fj, calculated from Eq 16 as
(27) |
with Ej representing the column j of E, and λ is the optimal value of the density constraint multiplier
(28) |
(see Methods for the detailed calculations). When we further consider that only transporters have a nonzero column sum γj (Eq 2), we get an equivalent expression for the optimal λ that highlights its particular dependence on the reactions directly connected to the transport reactions (see S1 Text).
Eq 26 is a necessary property of the global optimum of (21), but it could be in principle also satisfied by other local stationary points of the constrained growth rate defined by Eq. 21, which exist whenever the optimization (21) is non-convex. The possible non-uniqueness of solutions for Eq 26 is however not of our concern here, since in this study we are focusing only on the analytical properties of the global optimum, not on methods to calculate it.
Eq 26 generalizes the necessary analytical conditions we found before [4] for the optimal states of GBA models with full column rank matrix M; in that case, the conditions could only be applied to arbitrary models if one had prior knowledge of what reactions are active at optimality, effectively reducing M to an “active” matrix of interest (which is guaranteed to be of full column rank [4]). Here, no prior knowledge of active reactions is required. Instead, Eq 26 provides the very condition determining whether each reaction is active at optimality: a reaction with nonzero flux fj requires that the corresponding term in parentheses (i.e., the corresponding θj, see Methods) is equal to zero. Conversely, if the term in parentheses is different from zero (θj ≠ 0), then the reaction cannot carry flux at optimality (fj = 0). In particular, the ribosome evidently needs to be active for balanced cellular growth, as proteins are required as catalysts; thus, θr = 0 must always hold in optimal states. The KKT multipliers θ are the shadow prices [32] of each τjfj = pj/(μρ), which has unit of time, and can be understood as the fraction pj/ρ of the total growth time 1/μ which is allocated to produce the protein j in the biomass.
We may also express Eq 26 for each reaction j in the usual, unscaled variables v (fluxes), and c (reactant concentrations, including metabolites and total protein), by using Eqs 4, 9, 10 and 11 (see S1 Text)
(29) |
where Mj indicates the column j of M. We now continue our analysis in terms of the flux fractions f, since these are the variables of the optimization problem (Eq. 21). However, we keep in mind that the same change of variables to p, v, c is possible in all the following equations, as done for Eq 29.
Growth economy
As growth rate is closely related to fitness [2], it makes sense to view growth rate as the primary value of the cellular economy. In this subsection, we will thus explore the economy of balanced cellular growth, by asking how a small change in the state variables fj affects the growth rate μ of any optimal or non-optimal state. Below, we will see that the necessary conditions of optimal growth specify that the marginal costs and benefits of each flux must be perfectly balanced.
We define the marginal value of flux j as the partial derivative ∂jμ, which quantifies the marginal gain in growth rate resulting from a small increase in fj. From Eq 27, we see that the marginal value can be expressed naturally as a multiple of the growth rate per mass fraction of protein in biomass, μ/ba. As we will see next, the corresponding adimensional factor—the term in parentheses in Eq 27—quantifies different types of costs (when negative) and benefits (when positive) of reaction j in terms of its influence on protein allocation,
(30) |
The first summand quantifies how a marginal increase in fj increases the total protein fraction in the cell density ba = ca/ρ (see Eq 5),
(31) |
We name this contribution to the normalized marginal value the marginal protein production. As we assume that the ribosome reaction is the only reaction that consumes or produces protein, this reaction (j = r) is the only one with a nonzero (and positive) marginal production benefit.
To interpret the remaining summands, we remember that an individual protein’s mass fraction in the cellular density can be expressed as pl/ρ = τlvl/ρ = μτlfl. The last two terms in Eq 30 quantify the combined decrease of individual protein fractions in cellular density (pl/ρ) caused by a marginal increase in fj at fixed μ,
(32) |
Here, the first summand quantifies the change in pl/ρ at fixed turnover times, which is evidently non-zero only for the enzyme catalyzing the perturbed flux j itself. We name this term, −μτj, the marginal (protein) investment into j. The final summand quantifies the local change of the individual protein concentrations that must occur to compensate the changes in the turnover times (quantified by the indirect elasticity E), themselves caused by changes in metabolite concentrations forced due to flux balance. We name it the marginal (protein) opportunity of j, as it is related to opportunity costs and benefits in economics. For the typical case of reactions running in the forward direction (fj > 0), τj is positive, and thus the marginal investment into j is negative, representing a cost. If all fluxes are non-negative, beneficial decreases in turnover times correspond to negative E, resulting in positive marginal opportunity (i.e., marginal opportunity benefit).
We can now summarize our insights about cellular economy, in particular about changes in the growth rate μ in response to changes in a flux fj. The first and second terms in Eq 30 are simple, direct consequences of the flux change: the marginal production benefit, an increase in protein production if fj is the ribosome flux; and the marginal investment, an increase in the protein concentration required to sustain an increased fj. The third term in Eq 30, the marginal opportunity, is more interesting, though equally easy to understand. As a simple consequence of mass conservation (Eq 11), a change in fj while keeping all other fluxes fixed must result in changes in the concentrations of all reactants consumed or produced in the corresponding reaction. These concentration changes modify the turnover times τl(c) of all reactions l whose kinetics depend on them, either because they are directly connected to those reactants or because they act as inhibitors or activators; see Fig 3 for an example. Keeping the corresponding fluxes fl constant requires matching changes in the concentrations pl of the catalyzing proteins (Eq 4). This total amount of “protein saved” due to a change in fj is quantified by .
The above results confirm the often postulated central role of proteins in the cellular economy [31, 36, 37]. While the measure of cellular economic value may be the growth rate itself, protein concentrations constitute the general currency in which we can express the contributions of cellular subsystems. We can highlight this central economic role of proteins further by relating the marginal values ∂jμ—changes in growth rate in response to flux changes—to changes in the allocation of proteome fractions ϕl ≔ pl/ca:
(33) |
The second equality follows directly from taking the derivative of with respect to fj, at constant μ.
We can now look at the balance equations at optimal growth from an economic perspective. For the ribosome and active enzymatic reactions, a zero marginal value ∂jμ = 0 also means a zero shadow price, θj = 0 (Eqs 54 and 55)—so the reaction is optimal, and growth cannot be accelerated by increasing or decreasing fj by a small amount. This insight provides an intuitive interpretation for the balance equations for the ribosome (Eq 57) and for all active, internal enzymatic reactions (Eq 58). An exception are only the transporters. In contrast to all other flux fractions, their shadow price (Eq 56) depends both on their marginal value and on their marginal biomass production, μ f⊤E f γs (a cost when negative and a benefit when positive).
For active enzymes with zero marginal value—and thus for all active enzymes at optimality (Eq 58)—Eq 27 simplifies to
(34) |
This simple relationship shows that at optimality, the marginal investment into e should perfectly balance its marginal opportunity. As the last equation involves only the neighborhood of e (defined as all reactions l such that ), we can study such relationships at optimality locally, without full knowledge about the entire reaction network. We thus do not need the entire matrix M or complete knowledge of parameterized turnover time functions in the vector τ.
In the preceding subsection, we studied how any optimal or non-optimal growth rate μ is sensitive to marginal changes in one of the flux fractions, resulting in an economic understanding of marginal flux values in terms of their relationship with protein allocation. We next reinterpret some of these results from the perspective of control theory, and turn to a complementary problem that focuses on the sensitivity of the optimal growth rate to changes in the model parameters and external concentrations.
Growth control and adaptation
We are first interested in the total control that each fj has on the (optimal or non-optimal) growth rate μ, accounting also for the density constraint limitation. In order to do that, we choose one active transport reaction s′ and express its corresponding as a function of the other fluxes via the density constraint (Eq 18),
(35) |
where l ≠ s′ sums over all other reactions. Thus,
(36) |
which is non-zero only if j is also a transport reaction (so γj ≠ 0). We now define the Growth Control Coefficients Γj as
(37) |
where the first term quantifies the growth change caused by fj itself, and the second term quantifies the growth change caused by a change in , itself changed due to the changed fj and the density constraint. Note that for the ribosome and enzymatic reactions, their growth control coefficient is simply their marginal value, since . For models with only one transport reaction s, Γs = 0, since fs = (γs)−1 is fixed by the density constraint and cannot be changed. Conveniently, this is also captured by Eq 37. If s′ is optimal, θs′ = 0, and Eq 26 determines ∂s′μ = −λγs′, so in that case
(38) |
and the balance equation Eq 26 is thus equivalent to
(39) |
We may also see the optimal condition for enzymes (Eq 34) in terms of protein concentrations, by multiplying it element-wise with ve (so it is also valid now for inactive enzymes),
(40) |
where we defined
(41) |
via Eq 4 and using partial derivatives at fixed pl. can be seen as (scaled) control coefficients (CC), analogous to (scaled) control coefficients in MCA [9, 10]. This result is analogous to how enzyme concentrations and their respective CC relate at optimal fluxes constrained by a fixed total enzyme concentration [38] (see S1 Text for a detailed discussion). For an example of control coefficients where τ follows a simple Michaelis-Menten rate law, see S1 Text.
We now explore the sensitivity of the optimal growth rate to changes in one parameter π in the vector π. The growth problem (Eq 21) is constrained by the parameters π, including the arguments necessary to determine the turnover times τ at given f. This means that any marginal change in one of the parameters π would lead to changes in the solution f* of the optimization (Eq 21). In this sense, the parameters π can be understood as control variables, while the corresponding optimal state f*, and its functions μ* = μ(f*), v* = v(f*), p* = p(f*), and c* = c(f*) are the response variables. Fig 4 summarizes these relationships.
Because growth rate is closely related to fitness, we are also particularly interested in how marginal changes in one of the previously fixed parameters π affect the optimal growth rate μ* [39]. We can estimate this effect directly via the envelope theorem [4, 40], by effectively considering the optimal state f* as fixed and treating the parameters π as the new independent variables, making it unnecessary to calculate the new optimal state after the parameter change. To do that, we first simplify the problem by assuming that these marginal changes have no effect on which reactions are active, so we simplify the Lagrangian (Eq 25) by ignoring the inequality constraints; note that in this case only the objective function μ can be influenced by parameter changes, since the density constraint only depends on M, whose entries cannot be changed continuously. Second, we can think about the optimal growth rate μ* as a function of the parameters , so the total change induced by a marginal change in a parameter π can be calculated via the chain rule
(42) |
where the last equality comes from according to the stationarity (Eq 49) and primal feasibility (Eq 18) at an optimal state.
We now define growth adaptation coefficients A as the relative change in the optimal growth rate μ* in response to a small, relative change in one control variable π
(43) |
where here and in the rest of this section f is to be understood as the optimal state before the change in the parameter π. Note that in the following discussion, the parameters π of interest only influence via the objective function μ, so the partial derivatives are simply evaluated as ∂μ/∂π at fixed f.
For direct changes in the turnover times τj (e.g., through changing the corresponding ), the growth adaptation coefficient is calculated by evaluating the growth function μ and its partial derivative at fixed f,
(44) |
where we effectively treated τj as a variable in the growth equation Eq 16, and ϕj = pj/ca is the optimal proteome fraction allocated to reaction j before the change in τj. This result is consistent with the observation that drugs targeting the most highly expressed catalysts, such as the ribosome, have the strongest effects on cellular growth rates [5, 41].
For changes in some external parameter such as a concentration x, the growth adaptation coefficient is again calculated by evaluating the growth function μ and its partial derivative at fixed f, and using the chain rule of differentiation we obtain
(45) |
where we have a summation over s (only transporters s have kinetic rate laws depending on external concentrations). According to Eq 45, the growth adaptation coefficient of an external concentration x is simply the sum over the “scaled elasticities” of the transporters of x, weighted by the optimal proteome fractions ϕs allocated to each s before the change in x. This result gives an explicit quantitative estimation on which external concentrations should be changed in order to cause the most change in the optimal growth rate. This equation may hence provide a useful tool for improving the growth media environment for industrial cell cultures, and for quantifying the effect of drugs aimed at decreasing the growth of pathogens and cancer cells. If the turnover times τ depend explicitly on other external parameters, such as pH and temperature, growth adaptation coefficients can be calculated and interpreted exactly as in Eq 45.
The growth adaptation coefficient with respect to the mass density ρ, assuming it affects turnover times τ only through reactant concentrations c, reads
(46) |
where according to Eqs 9 and 11, and the last equality comes from the definition of the indirect elasticity (Eq 24). From this expression and λ in Eq 28, we see that −λ = μAρ; at optimality, the negative KKT multiplier for the density constraint, −λ, quantifies the absolute increase in growth rate caused by a marginal increase in ρ, given by μ itself times the proportional change, Aρ. Thus, the extra term in the shadow price of transporters (compare θs in Eq 56 to θr and θe) quantifies the growth rate benefit gained by allowing the violation of the density constraint (Eq 18) caused by a small increase in fs.
Just as the economy of growth is deeply connected to protein allocation, so is growth control. For Aj and Ax, this connection is clear from Eqs 44 and 45, respectively. For Aρ, we first note that it relates to optimal marginal values via Eq 53,
(47) |
At optimality, the summands on the RHS are zero for the ribosome and for enzymatic reactions (∂j μfj = 0 for ), and the summation over j can thus be restricted to only transporters s. Thus, at optimality, the absolute change in optimal growth rate caused by increasing ρ, μAρ, is equal to the summed marginal effects of transport fluxes on the growth rate, ∂sμ, weighted by the flux fractions fs themselves. To see the full connection between Aρ and protein allocation, we insert Eq 33 into Eq 47 to obtain
(48) |
This equation shows that the proportional effect on the optimal growth rate that is exerted by a marginal increase in ρ, Aρ, equals the combined marginal effects of transport fluxes fs on proteome allocation fractions, weighted by the transport fluxes themselves.
Discussion
Modeling frameworks that are essentially linear, such as FBA and RBA, are typically analyzed numerically, as the efficiency of linear programming facilitates fast solutions even for genome-scale models [7, 8, 11]. In contrast, the construction and solution of genome-scale non-linear models faces two major obstacles, both intimately linked to the kinetic rate laws. First, experimental estimates for the required kinetic parameters—kcat and Km values in the simplest case of generalized Michaelis-Menten kinetics—are lacking for most reactions [42]. This problem can be alleviated by using parameter estimates from artificial intelligence approaches [43–45]. Second, the non-linearity of enzymatic rate laws makes numerical optimizations much more difficult than for linear systems, explaining why existing studies have been limited to models with only a handful of reactions [6, 15–19]. Numerical optimization is particularly problematic for models with redundant pathways, where the optimization problem is non-convex [20].
The succinct mathematical formulation for modeling balanced cellular growth developed in this paper helps to address both problems. On the one hand, the reduction of the problem description to a minimal number of independent variables—the flux fractions—reduces the dimensionality of the search space, and may thus help to accelerate numerical approaches to find optimal states. On the other hand, this formulation allowed us to identify necessary conditions for states of maximal growth rate. For enzymatic reactions e, these conditions (Eq 34) are “local” in the sense that they only depend on the flux fractions f and on the kinetic parameters of reactions directly connected to e itself, i.e., they only depend on the fluxes and parameters of reactions whose turnover times τj are directly affected by changes in fe. On the other hand, the optimality conditions for the ribosome and transport reactions (see Eqs 57 and 59) do require the knowledge of the full vector f and of all model parameters, which are required explicitly or via μ as determined by Eq 16.
The concise formulation also helped in the interpretation of the optimality conditions from the perspectives of economy and control theory. The marginal change in growth rate induced by each flux change is seen as the flux’s marginal economic value, while the growth adaptation coefficient of each model parameter or external concentration is the change in the optimal growth rate induced by a marginal change in this parameter. The close correspondence between the mathematical expressions obtained in both perspectives helps to clarify the mathematical and conceptual links between these usually separate fields of study, including the extension of previous results of metabolic control analysis (MCA), developed for ad-hoc objectives in static sub-networks, to the holistic problem of cellular growth in GBA models. In MCA, one typically treats enzyme concentrations as control variables and studies how small changes to them affect reactant concentrations and fluxes. Here, all these variables are not only connected, but are uniquely determined by the flux fraction vector f. Moreover, the growth rate μ itself is explicitly connected to f through the growth function (Eq 16). Through these connections, we can quantify the sensitivity of the cellular growth rate, and hence approximately of organismal fitness, to changes in the control variables π, something not possible in the usual MCA framework [9, 10]. The growth adaptation coefficients provide explicit expressions for the effects on growth rate caused by small changes in control variables at optimality. Due to the close relationship between growth rate and fitness, these estimates could be used to interpret and predict evolutionary changes in these variables.
A closely related nonlinear cellular modeling approach accounts for the different amino acid compositions of individual proteins by including “personalized” ribosome reactions for each protein [23, 46, 47]. In contrast to GBA, this type of model cannot be simplified using flux fractions f, as it requires a mathematical formulation that includes explicit variables for metabolite concentrations. Experimental data for E. coli [48] indicates that the 20 amino acid content into its total proteome changes very little over 22 highly distinct growth environments (mean coefficient of variation = 2.46%, maximal CV = 7.55%, see Table A in S1 Text), suggesting that—at least globally—different protein compositions are likely not a major factor driving significant changes in the optimal cellular state. Thus, a unique ribosome reaction with fixed column Mr is a realistic assumption over all these growth conditions. Further study is necessary to identify whether the different compositions of individual proteins may cause significant changes in their allocation across environments.
All analytical results in this study were derived exclusively from the growth constraints assumed in GBA models: mass conservation in balanced growth, reaction kinetics, cellular density, and non-negative concentrations. For the analysis of optimal growth, we encoded all corresponding information into a single Lagrangian function, parameterized in terms of the constraints. We formulated the problem with the flux fractions f as the only free variables, and used KKT conditions to obtain the necessary conditions for optimal growth states (OGSs). Through these conditions, the marginal protein allocation emerges as the natural underlying currency in the cell economy; this relationship has frequently been asserted [31, 36, 37], but is derived here entirely from first principles.
The KKT framework provides a straight-forward way to incorporate new constraints, analogous to how physical theories using the Lagrangian formalism account for additional forces by adding corresponding functions and Lagrange multipliers into the Lagrangian. A re-derivation of the KKT conditions will then result in an extended set of balance equations. Among the potential extra physiological constraints, one might consider also phenomenological constraints such as the recently reported relationship between the cellular surface/volume ratio and the growth rate [49].
One fundamental physiological limitation that could be included in this way but is not considered explicitly here is the diffusion limit of molecules within cellular compartments. This limit links density and kinetic constraints. A higher dry mass density increases the “crowding effect” within cells [26], which entails a lower diffusion rate and by consequence a longer time for reactants to find their catalysts; this effect can be modeled directly by including a corresponding dependence in the Michaelis constants Km. A study on the crowding effects of all cellular concentrations—including those of small molecules—found that the observed E. coli dry mass density is in the range expected if evolution had optimized the cellular density for maximal growth rate [33]. In this sense, a fixed density constraint on all molecules, as considered here, may be seen as a simplifying approximation, justified by the observed constancy of cellular buoyant and dry mass densities across different growth conditions [25, 49], with the exception only of large changes in environmental osmolarities [26].
The Lagrangian formalism described here also allows a direct generalization of the theory to other objective functions, i.e., other measures of fitness at balanced growth. This can be done by incorporating a new objective function F(f) and adding a new constraint for the growth rate via ω(μ − μ0), where μ is determined by the growth function, ω is the corresponding KKT multiplier, and μ0 is the constrained growth rate given now as an input.
An important step toward a more general theory of cellular growth would be to extend the present analytical approach to changing environments, and to derive similar analytical conditions for time-dependent optimal cellular states f(t). In this situation, fitness is determined by the proportional growth in a given period of time, so the objective function becomes the integral of the specific growth rate μ(t) [17, 50], under the same constraints as discussed here. This dynamical extension to a theory of proportional growth optimization would help to generalize the existing results on dynamic metabolic flux optimization [51], and building more realistic models for cells in cyclical environments, such as feast-famine cycles of the gut microbiome [52] or day-night cycles of photosynthetic microbes [19].
In sum, the concise mathematical formulation of the growth optimization problem developed here provides a powerful toolbox for the analysis and solution of mechanistic descriptions of optimal cellular physiology and growth. It thereby opens a path toward a fundamental understanding of organizing principles of biological cells. While biological systems will never be fully optimal, the study of optimal growth strategies provides an extremely useful null model for the action of natural selection.
Methods
The necessary KKT conditions include the primal feasibility conditions given by Eqs 18 and 19), and
(49) |
(50) |
where ∂j ≔ ∂/∂fj indicates the partial derivative with respect to fj.
The stationarity conditions can be solved for the corresponding optimal multipliers θj, resulting in
(51) |
where λ is the optimal value for the density multiplier. After an element-wise multiplication of both sides of Eq 51 with fjτj, we can use the complementary slackness (θjτjfj = 0) to get
(52) |
Now summing the last equation over all j and using the primal feasibility (Eq 18) results in
(53) |
Combining Eqs 51, 27 and 28, we can now express each multiplier θj explicitly in terms of the flux fractions f at optimality, resulting in slightly different expressions for ribosomal, enzymatic, and transport reactions:
(54) |
(55) |
(56) |
By inserting these expressions into the complementary slackness conditions (Eq 50), we can now solve for f, which results in the balance equations for ribosomal, enzymatic, and transport reactions:
(57) |
(58) |
(59) |
where we simplified the expressions by exploiting that μ, ba, τj ≠ 0.
Supporting information
Acknowledgments
We thank Stefan Müller for discussions about KKT conditions, Alexander Kroll for verifying early calculations, and Xiao-Pan Hu for providing data in Table A in S1 Text.
Data Availability
All data and codes are available in the S1 File.
Funding Statement
This work was funded by a Volkswagenstiftung “Life?” grant to MJL, and by the German Research Foundation through grant CRC 1310 to MJL, and, under Germany’s Excellence Strategy, grant EXC 2048/1 (Project ID: 390686111) to OE and MJL. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.Bremer H DP. Modulation of Chemical Composition and Other Parameters of the Cell at Different Exponential Growth Rates. EcoSal Plus. 2008;. [DOI] [PubMed]
- 2.Fisher RA, Bennett JH. The genetical theory of natural selection: a complete variorum edition. Oxford University Press; 1999. Available from: http://books.google.com/books?id=sT4lIDk5no4C.
- 3. Dourado H, Mori M, Hwa T, Lercher MJ. On the optimality of the enzyme–substrate relationship in bacteria. PLOS Biology. 2021;19(10):1–18. doi: 10.1371/journal.pbio.3001416 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Dourado H, Lercher MJ. An analytical theory of balanced cellular growth. Nature Communications. 2020;11(1):1226. doi: 10.1038/s41467-020-14751-w [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Hu XP, Dourado H, Schubert P, Lercher MJ. The protein translation machinery is expressed for maximal efficiency in Escherichia coli. Nature Communications. 2020;11(1):5260. doi: 10.1038/s41467-020-18948-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Molenaar D, van Berlo R, de Ridder D, Teusink B. Shifts in growth strategies reflect tradeoffs in cellular economics. Molecular Systems Biology. 2009;5(1):323. doi: 10.1038/msb.2009.82 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Goelzer A, Fromion V, Scorletti G. Cell Design in Bacteria As a Convex Optimization Problem. Automatica. 2011;47(6):1210–1218. doi: 10.1016/j.automatica.2011.02.038 [DOI] [Google Scholar]
- 8. O’Brien EJ, Lerman JA, Chang RL, Hyduke DR, Palsson B. Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction. Molecular Systems Biology. 2013. doi: 10.1038/msb.2013.52 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Heinrich R, Rapoport TA. A Linear Steady-State Treatment of Enzymatic Chains. European Journal of Biochemistry. 1974;42(1):89–95. doi: 10.1111/j.1432-1033.1974.tb03318.x [DOI] [PubMed] [Google Scholar]
- 10. Kacser H, Burns JA. The control of flux. Symp Soc Exp Biol. 1973;27:65–104. [PubMed] [Google Scholar]
- 11. Lewis NE, Nagarajan H, Palsson BO. Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nature Reviews Microbiology. 2012;10(4):291–305. doi: 10.1038/nrmicro2737 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Watson MR. Metabolic maps for the Apple II. Biochemical Society Transactions. 1984;12(6):1093–1094. doi: 10.1042/bst0121093 [DOI] [Google Scholar]
- 13. Goelzer A, Muntel J, Chubukov V, Jules M, Prestel E, Nölker R, et al. Quantitative prediction of genome-wide resource allocation in bacteria. Metabolic Engineering. 2015;32:232–243. doi: 10.1016/j.ymben.2015.10.003 [DOI] [PubMed] [Google Scholar]
- 14. Mori M, Hwa T, Martin OC, De Martino A, Marinari E. Constrained Allocation Flux Balance Analysis. PLOS Computational Biology. 2016;12(6):1–24. doi: 10.1371/journal.pcbi.1004913 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Weiße AY, Oyarzún DA, Danos V, Swain PS. Mechanistic links between cellular trade-offs, gene expression, and growth. Proceedings of the National Academy of Sciences. 2015;112(9):E1038–E1047. doi: 10.1073/pnas.1416533112 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Maitra A, Dill KA. Bacterial growth laws reflect the evolutionary importance of energy efficiency. Proceedings of the National Academy of Sciences. 2015;112(2):406–411. doi: 10.1073/pnas.1421138111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Giordano N, Mairet F, Gouzé JL, Geiselmann J, de Jong H. Dynamical Allocation of Cellular Resources as an Optimal Control Problem: Novel Insights into Microbial Growth Strategies. PLOS Computational Biology. 2016;12(3):e1004802. doi: 10.1371/journal.pcbi.1004802 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Towbin BD, Korem Y, Bren A, Doron S, Sorek R, Alon U. Optimality and sub-optimality in a bacterial growth law. Nature Communications. 2017;8:14123. doi: 10.1038/ncomms14123 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Faizi M, Zavřel T, Loureiro C, Červený J, Steuer R. A model of optimal protein allocation during phototrophic growth. BioSystems. 2018;166:26–36. doi: 10.1016/j.biosystems.2018.02.004 [DOI] [PubMed] [Google Scholar]
- 20. Wortel MT, Noor E, Ferris M, Bruggeman FJ, Liebermeister W. Metabolic enzyme cost explains variable trade-offs between microbial growth rate and yield. PLOS Computational Biology. 2018;14(2):1–21. doi: 10.1371/journal.pcbi.1006010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Müller S, Regensburger G, Steuer R. Enzyme allocation problems in kinetic metabolic networks: Optimal solutions are elementary flux modes. Journal of Theoretical Biology. 2014;347:182–190. doi: 10.1016/j.jtbi.2013.11.015 [DOI] [PubMed] [Google Scholar]
- 22. Wortel MT, Peters H, Hulshof J, Teusink B, Bruggeman FJ. Metabolic states with maximal specific rate carry flux through an elementary flux mode. The FEBS Journal. 2014;281(6):1547–1555. doi: 10.1111/febs.12722 [DOI] [PubMed] [Google Scholar]
- 23. de Groot DH, Hulshof J, Teusink B, Bruggeman FJ, Planqué R. Elementary Growth Modes provide a molecular description of cellular self-fabrication. PLOS Computational Biology. 2020;16(1):1–33. doi: 10.1371/journal.pcbi.1007559 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Baldwin WW, Myer R, Nicole Powell, Anderson E, Koch AL. Buoyant density of Escherichia coli is determined solely by the osmolarity of the culture medium. Archives of Microbiology. 1995;164(2):155–157. doi: 10.1007/BF02525322 [DOI] [PubMed] [Google Scholar]
- 25. Kubitschek HE, Baldwin WW, Graetzer R. Buoyant density constancy during the cell cycle of Escherichia coli. Journal of bacteriology. 1983;155:3. doi: 10.1128/jb.155.3.1027-1032.1983 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Cayley S, Lewis BA, Guttman HJ, Record MT. …Characterization of the cytoplasm of Escherichia coli K-12 as a function of external osmolarity: Implications for protein-DNA interactions in vivo. Journal of Molecular Biology. 1991;222(2):281–300. doi: 10.1016/0022-2836(91)90212-O [DOI] [PubMed] [Google Scholar]
- 27. Schuster S, Hilgetag C. On elementary flux modes in biochemical reaction systems at steady state. Journal of Biological Systems. 1994;02(02):165–182. doi: 10.1142/S0218339094000131 [DOI] [Google Scholar]
- 28. Benyamini T, Folger O, Ruppin E, Shlomi T. Flux balance analysis accounting for metabolite dilution. Genome Biology. 2010;11(4):R43. doi: 10.1186/gb-2010-11-4-r43 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Marion JB, Thornton ST. Classical dynamics of particles and systems; 3rd edition. San Diego: Harcourt Brace Jovanovich; 1988. Available from: https://bib-pubdb1.desy.de/record/337680. [Google Scholar]
- 30.Luenberger DG, Ye Y. Linear and nonlinear programming. 3rd ed. International Series in Operations Research Management Science. New York, NY: Springer; 2008.
- 31. Noor E, Flamholz A, Bar-Even A, Davidi D, Milo R, Liebermeister W. The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization. PLOS Computational Biology. 2016;12(11):1–29. doi: 10.1371/journal.pcbi.1005167 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. Liebermeister W. The value structure of metabolic states. bioRxiv. 2022. [Google Scholar]
- 33.Pang TY, Lercher MJ. Optimal density of biological cells. bioRxiv [Preprint]. 2020.
- 34.Karush W. Minima of functions of several variables with inequalities as side constraints. M Sc Dissertation Dept of Mathematics, Univ of Chicago. 1939;.
- 35. Kuhn HW, Tucker AW. Nonlinear Programming Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. Neyman. 1951; p. 481–492. [Google Scholar]
- 36. Dekel E, Alon U. Optimality and evolutionary tuning of the expression level of a protein. Nature. 2005;436(7050):588–592. doi: 10.1038/nature03842 [DOI] [PubMed] [Google Scholar]
- 37. Kafri M, Metzl-Raz E, Jona G, Barkai N. The Cost of Protein Production. Cell Reports. 2016;14(1):22–31. doi: 10.1016/j.celrep.2015.12.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. Klipp E, Heinrich R. Competition for enzymes in metabolic pathways:: Implications for optimal distributions of enzyme concentrations and for the distribution of flux control. Biosystems. 1999;54(1):1–14. doi: 10.1016/S0303-2647(99)00059-3 [DOI] [PubMed] [Google Scholar]
- 39. Wilken SE, Besançon M, Kratochvíl M, Foko Kuate CA, Trefois C, Gu W, et al. Interrogating the effect of enzyme kinetics on metabolism using differentiable constraint-based models. Metabolic engineering. 2022; p. S1096—7176(22)00117—3. doi: 10.1016/j.ymben.2022.09.002 [DOI] [PubMed] [Google Scholar]
- 40. Afriat S. Theory of Maxima and the Method of Lagrange. SIAM Journal on Applied Mathematics. 1971;20(3):343–357. doi: 10.1137/0120037 [DOI] [Google Scholar]
- 41. Scott M, Gunderson CW, Mateescu EM, Zhang Z, Hwa T. Interdependence of cell growth and gene expression: Origins and consequences. Science. 2010;330(6007):1099–1102. doi: 10.1126/science.1192588 [DOI] [PubMed] [Google Scholar]
- 42. Nilsson A, Nielsen J, Palsson BO. Metabolic Models of Protein Allocation Call for the Kinetome. Cell Systems. 2017;5(6):538–541. doi: 10.1016/j.cels.2017.11.013 [DOI] [PubMed] [Google Scholar]
- 43. Heckmann D, Lloyd CJ, Mih N, Ha Y, Zielinski DC, Haiman ZB, et al. Machine learning applied to enzyme turnover numbers reveals protein structural correlates and improves metabolic models. Nature Communications. 2018;9(1):5252. doi: 10.1038/s41467-018-07652-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44. Kroll A, Engqvist MKM, Heckmann D, Lercher MJ. Deep learning allows genome-scale prediction of Michaelis constants from structural features. PLOS Biology. 2021;19(10):1–21. doi: 10.1371/journal.pbio.3001402 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. Li F, Yuan L, Lu H, Li G, Chen Y, Engqvist MKM, et al. Deep learning-based kcat prediction enables improved enzyme-constrained model reconstruction. Nature Catalysis. 2022;5(8):662–672. doi: 10.1038/s41929-022-00798-z [DOI] [Google Scholar]
- 46.Müller S. Elementary growth modes/vectors and minimal autocatalytic sets for kinetic/constraint-based models of cellular growth. bioRxiv [Preprint]. 2021.
- 47. Müller S, Széliová D, Zanghellini J. Elementary vectors and autocatalytic sets for resource allocation in next-generation models of cellular growth. PLOS Computational Biology. 2022;18(2):1–27. doi: 10.1371/journal.pcbi.1009843 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Schmidt A, Kochanowski K, Vedelaar S, Ahrné E, Volkmer B, Callipo L, et al. The quantitative and condition-dependent Escherichia coli proteome. Nature Biotechnology. 2015;34:104 EP–. doi: 10.1038/nbt.3418 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Oldewurtel ER, Kitahara Y, van Teeffelen S. Robust surface-to-mass coupling and turgor-dependent cell width determine bacterial dry-mass density. Proceedings of the National Academy of Sciences. 2021;118(32):e2021416118. doi: 10.1073/pnas.2021416118 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Nordholt N, van Heerden JH, Bruggeman FJ. Biphasic Cell-Size and Growth-Rate Homeostasis by Single Bacillus subtilis Cells. Current Biology. 2020;30(12):2238–2247.e5. doi: 10.1016/j.cub.2020.04.030 [DOI] [PubMed] [Google Scholar]
- 51. Planqué R, Hulshof J, Teusink B, Hendriks JC, Bruggeman FJ. Maintaining maximal metabolic flux by gene expression control. PLOS Computational Biology. 2018;14(9):1–20. doi: 10.1371/journal.pcbi.1006412 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Mori M, Schink S, Erickson DW, Gerland U, Hwa T. Quantifying the benefit of a proteome reserve in fluctuating environments. Nature Communications. 2017;8(1):1225. doi: 10.1038/s41467-017-01242-8 [DOI] [PMC free article] [PubMed] [Google Scholar]