Significance
Growing theoretical and experimental evidence demonstrates that cells can (and do) spend biochemical energy while regulating their genes. Here, we explore the impact of departing from equilibrium in simple networks known to pervade biology. We learn that beyond increasing sensitivity, dissipation can unlock more flexible input–output behaviors that are otherwise forbidden without spending energy. These more complex behaviors could enable cells to perform more sophisticated functions using simpler systems than those needed at equilibrium.
Keywords: nonequilibrium, gene regulation, transcription, biophysics
Abstract
Cells adapt to environments and tune gene expression by controlling the concentrations of proteins and their kinetics in regulatory networks. In both eukaryotes and prokaryotes, experiments and theory increasingly attest that these networks can and do consume biochemical energy. How does this dissipation enable cellular behaviors forbidden in equilibrium? This open question demands quantitative models that transcend thermodynamic equilibrium. Here, we study the control of simple, ubiquitous gene regulatory networks to explore the consequences of departing equilibrium in transcription. Employing graph theory to model a set of especially common regulatory motifs, we find that dissipation unlocks nonmonotonicity and enhanced sensitivity of gene expression with respect to a transcription factor’s concentration. These features allow a single transcription factor to act as both a repressor and activator at different concentrations or achieve outputs with multiple concentration regimes of locally enhanced sensitivity. We systematically dissect how energetically driving individual transitions within regulatory networks, or pairs of transitions, generates a wide range of more adjustable and sensitive phenotypic responses than in equilibrium. These results generalize to more complex regulatory scenarios, including combinatorial control by multiple transcription factors, which we relate and often find collapse to simple mathematical behaviors. Our findings quantify necessary conditions and detectable consequences of energy expenditure. These richer mathematical behaviors—feasibly accessed using biological energy budgets and rates—may empower cells to accomplish sophisticated regulation with simpler architectures than those required at equilibrium.
Gene regulation—to which biology owes much of its exquisite sophistication (1)—is achieved by many network architectures that allow (and credibly depend on) energy expenditure (2–5). To adapt to varying environments, cells often dynamically tune concentrations of transcription factors (6) or inducers as their available control variables. This biochemical control adjusts the probabilities of cellular states by regulating rate constants that depend on the transcription factor or effector. Organisms use the specific shapes of these input (transcription factor concentration) to output (transcription) relationships to execute extraordinary signal processing (7). Given their centrality, these induction curves also promise to help clarify how spending biochemical energy empowers the very dynamism and fidelity of the living. While unexplained energy expenditures in aerobically growing bacteria have been documented since at least the 1980s (8), new measurements and models (9–12) confirm that cells often operate with surprising surpluses in energetic capacity relative to known expenditures. These observations place fresh urgency on deciphering how dissipation modifies vital functions such as gene regulation (13).
How can nonequilibrium relieve fundamental constraints on physiological adaptation, or enhance the flexibility of cellular behavior? To confront this question, here we investigate the nonequilibrium output behaviors of networks formed by regulatory binding proteins. These systems can represent the dynamic behaviors of genetic transcription executed by RNA polymerase and regulated by transcription factors as control variables (Fig. 1). While our analysis generalizes to accommodate more complex regulation, including combinatorial action by many transcription factors and DNA looping (Fig. 2 and SI Appendix), we focus on the widely relevant case of a single transcription factor governing a cycle of four states. This architecture is putatively the most common regulatory architecture in well-characterized bacteria (Fig. 2). This motif is also among the simplest closed systems capable of breaking equilibrium using basic reactions pervasive in biology. Accordingly, quantifying the behavior and limitations of the square graph is also a crucial step toward understanding more complex transcriptional networks.
Fig. 1.
Structure of a fundamental gene-regulatory motif. A square cycle of four-states emerges when up to two molecules (such as a transcription factor and polymerase ) can bind to a common substrate (say a genome). Output observables are linear combinations of the state probabilities; for instance, mRNA production scales with the probabilities of transcriptionally active states where polymerase is bound to the genome (states and ). These outputs vary with the control parameter , here schematized as the concentration of a transcription factor. (Numerical values of rate constants underneath edges are biologically plausible values, as inferred in SI Appendix, section 2B).
Fig. 2.
Incidence of regulatory architectures for transcription, where a square cycle’s response is common. (A) Regulatory architectures reported in E. coli, organized by the numbers of activating and repressing transcription factors associated with each regulated promoter, according to data in the RegulonDB database (19) release 11.2. Over half of these regulated promoters are reported to be regulated by the most common architecture of a single transcription factor (purple bars) and thus can be described by a square graph of states (Fig. 1). Less common network architectures with multiple transcription factors form hypercubic state spaces whose complete responses are analyzed in SI Appendix. (B) While larger binding networks can generically show more complex regulatory responses than the square cycle, simpler topologies and responses result from restrictions like overlapping binding site (SI Appendix, section 3D). Over half of all nontrivial transcriptionally potent derivative networks formed by up to three transcription factors show a response that resembles the square graph with respect to at least one transcription factor. (C) Transcriptional regulation by DNA looping accomplishes distinct responses, whose properties are analyzed algebraically and numerically in SI Appendix.
Given their simplicity, equivalents of the system in Fig. 1 have enjoyed earlier study in guises such as enzymatic control (14); remodeling of nucleosomes (5); and other settings in transcription (15, 16). In this work, we use tools from graph theory (17, 18) to explore the full space of transcriptional steady-state outputs available for this system under different energetic drives, compared to equilibrium control. We find that all equilibrium responses must be monotonic (with one inflection point) as a function of control variables, such as the concentration of transcription factor, measured in a conventional logarithmic scale. In contrast, we find that nonequilibrium models can exhibit three types of output: an “equilibrium-like,” monotonic response with one inflection point, potentially displaced from equilibrium; a new—but still-monotonic—shape with three inflection points; and a new, surprising nonmonotonic shape with two inflection points, where, for instance, increasing a control variable can change its effect from repression to activation. Combining analytical and numerical analysis, we globally bound the maximal sensitivities of transcriptional responses. Demonstrating that these mathematical behaviors are feasible to access within biological energy expenditures around typical rates, we systematically analyze the impact of breaking detailed balance along each transition rate. This analysis establishes design principles for optimizing sensitivity and unlocking behaviors that are especially prone to implicate nonequilibrium in measurements.
These broader, multiply-inflected transcriptional responses unlocked by nonequilibrium could be harnessed to achieve useful physiological functions. Our findings illustrate surprising regularity revealed by graph theoretic tools and dissect how even primordial biological networks operating out of equilibrium can rival the regulatory sophistication of (plausibly) larger, slower networks at equilibrium.
Results
Modeling Pervasive Gene Regulatory Motifs.
At steady state, a system is in equilibrium (or, equivalently, at detailed balance) if, for all pairs of states , the probability flux into state from state equals the flux into state from state , where is the probability of state and is the transition rate from state to . Otherwise, the system is out of equilibrium and requires energetic dissipation to sustain the system’s steady state. Nonequilibrium steady states can only be achieved with systems that contain at least one cycle; linear or branched architectures at steady state must be at equilibrium (for systems closed to external material inputs; see SI Appendix, section 1B and refs. 20 and 21). A single cycle is thus the simplest closed setting where the intriguing new consequences of nonequilibrium become possible.
A cycle of four states emerges naturally from up to two molecules binding or unbinding to a substrate. When the substrate is a promoter site on the genome, one molecule is RNA polymerase , and the second molecule is a transcription factor protein that can enhance or impede polymerase binding to the genome, the resulting cycle captures transcriptional regulation. Specifically, the four states represent the empty site of the genome substrate (“S”); the genome substrate bound to the transcription factor only (“X”); to the polymerase only (“P”); or to both (“XP”). Fig. 1 illustrates this central, motivating setting. (Note that the transcription factor and polymerase concentrations and do not affect whether the system is in or out of equilibrium and can be tuned while maintaining any extent of disequilibrium; see SI Appendix, section 1C.
This square cycle of states pervades gene regulation. We analyzed the incidence of regulatory architectures reported for Escherichia coli, according to the most recent release of the canonical RegulonDB database of known interactions (19). As summarized in Fig. 2A, over half of all regulated promoters in E. coli are reported to accommodate a single copy of a transcription factor. The four-state cycle also finds widespread examples or structural-equivalents in models of eukaryotic gene regulation (5, 15, 22, 23). Accordingly, these promoters and genes can be modeled by the four-state cycle in Fig. 1. These cyclic architectures also contrast the more commonly studied, noncyclic, motif of simple repression that cannot break detailed balance (SI Appendix, section 1B) (1, 6, 24–26). We also provisionally quantified an empirically common role for the square graph motif in the eukaryote Drosophila, according to interactions reported by the Redfly (27) database (SI Appendix, section 2A.2). However, this apparent commonality of a single transcription factor binding site per gene is probably more an artifact of our field’s substantial regulatory ignorance for eukaryotes than a well-tested empirical finding, since only ∼1.8% of genes are canonically annotated with known regulatory interactions in Drosophila.
Fig. 2 also documents other more complex regulation, including by multiple transcription factors and via DNA looping, used by prokaryotes and eukaryotes. Tellingly, eukaryotic gene expression is a setting where explicit consumption of adenosine triphosphate (ATP) is especially plausible (3, 4) yet poorly understood (2, 13, 15). These larger networks often form hypercubic state spaces or variants thereof. We analyze extensively these more complex networks, and their reachable behaviors, in SI Appendix. Often, however, we find that biophysical constraints (such as overlapped binding sites) compel even these more complex networks to resemble the mathematical behavior of the square graph with respect to at least one transcription factor as a control variable (Fig. 2B and SI Appendix, section 3D). In sum, the prevalence of architectures that are exactly square cycles across organisms—and those that mathematically resemble the square cycle—establish the importance of understanding the regulatory capabilities of the square graph. Further, understanding how the square graph operates is a key prerequisite for understanding more complex regulatory networks. However, for completeness, we derive how a wide variety of other common regulatory architectures perform using similar analyses in SI Appendix, section 3.
Kinetic measurements often justify the assumption that transcription factors bind and unbind with genomes quickly relative to transcription by polymerase. This separation of timescales makes macroscopic gene expression proportional to the steady-state probability of finding the system in transcriptionally active microstates. (We precisely validate this assumption for our setting using plausible transcriptional rates in SI Appendix, section 2C.)
We note that the average gene production rate , proportional to gene expression, is a typical and crucial output of interest. This response grows with the net probability that the polymerase is bound, where is the transcription rate once the polymerase is bound, is the probability of the state , where just the polymerase is bound, and is the probability of the state , where both polymerase and transcription factor are bound.
However, other outputs (that depend on other states) may also be biologically or experimentally significant. For instance, the localization of the transcription factors themselves to the genome (to recruit other cofactors or epigenetic modifications) can shape biological function independent of the polymerase, e.g., invoking the probability . We accommodate the breadth of these possible outputs by studying how any (nonnegative) linear combination of state probabilities varies with the transcription factor concentration as a control variable, where gives the potency of the th state. These different outputs and problem settings are captured by adopting particular , but as we will now see, all are subject to universal behavior.
Nonequilibrium Steady-State Output Responses.
To explore how these input–output functions operate away from equilibrium, we cannot apply equilibrium statistical mechanical models that assign thermodynamic energies to each state to calculate their probabilities (1, 6, 28–30). Instead, we now require a fully kinetic description (also known as a chemical master equation or continuous-time Markov chain) based on transitions between states. A large increase in complexity and the number of parameters typically accompanies this generalization. Fortunately, these dynamics admit a beautiful and powerful correspondence to graph theory that helps tame this complexity (17, 18, 20). Our guide is the Matrix Tree Theorem, which gives a simple diagrammatic procedure on a network’s structure to find stationary probabilities (Materials and Methods and SI Appendix, section 2D).* In brief, the Matrix Tree Theorem asserts that at steady state, the probability of any state is proportional to the sum of products of rate constants over all spanning trees rooted in that state. A rooted spanning tree of a graph is a subset of (directed) edges that collectively visits every state exactly once, has no cycles, and designates a specific root vertex . As illustrated by the example in Fig. 3A, each state in a spanning tree has exactly one outgoing edge (except the root state, which has only incoming edges); thus, spanning trees contain a unique directed path from any state to the root state.
Fig. 3.

Nonequilibrium response of the four-state fundamental transcriptional motif. (A) An example of a spanning tree (rooted in state XP) like those that define steady-state probabilities via the Matrix Tree Theorem. (B) All 16 directed, rooted spanning trees of the four-state cycle in 1(A): trees are grouped by the root state (in columns) and by how many participating edges depend on the control parameter (in rows). As guaranteed by the Matrix Tree Theorem, the steady-state probability of any state—in or out of equilibrium—is given by the sum of the weights of these spanning trees, introducing up to a quadratic dependence in in any output, as represented by Eq. 1. (C–E) Three universal output behaviors (regulatory shape phenotypes) can result from this architecture. A monotonic “equilibrium-like” output (C) manifests a Hill-like or MWC-like response, behavior familiar from equilibrium thermodynamic models. However, exclusively out of equilibrium, new multiply-inflected regulatory shape phenotypes become possible. Under drive, outputs can (D) vary nonmonotonically and reach two inflection points with the control parameter; or show three inflection points and vary monotonically (E). These richer phenotypes show a wider set of properties that characterize each curve: These include the “leak” value of the observable when the control variable is absent (, in orange; the saturation asymptotic limit as the control variable is maximally present (; in light blue); the observable’s values at intermediate plateau regions (; in red); and slopes 1 and 2 at inflection points and when they are defined (in green and purple, respectively).
Counting all sixteen rooted spanning trees of the four-state transcriptional system (Fig. 3B) and deploying the Tree Theorem explains how probabilities must vary with the transcription factor control parameter . Depending on the root (separated by column in Fig. 3B), each spanning tree carries two edges that depend on (Top row of Fig. 3B); one edge (Middle row, Fig. 3B); or no dependent edges (Bottom row, Fig. 3B). This structure yields relative probabilities with up to quadratic scaling with . Hence we find that, for this class of architectures, the form of any output function , in or out of equilibrium, is a ratio of quadratic polynomials in ,
| [1] |
where the coefficients , , , , , and are sums of subsets of (weighted) directed spanning trees carrying various -dependencies (SI Appendix, section 2D). The denominator, the sum of all rooted spanning trees and hence also a quadratic polynomial, serves as a normalizing factor that converts statistical weights to probabilities and represents a nonequilibrium partition function.
Note that while we derived the output form Eq. 1 using the particular choice of -dependent arrows appropriate for this transcriptional setting, the same formalism can treat many other control parameters that appear quite (structurally or biologically) distinct from these details, such as a concentration of another internal molecule (for instance polymerase, ) or an external molecule (for instance explicit drive by ). Specifically, any graph need only display up to two powers of the control variable among its spanning trees—a modest structural condition—for its response to exactly reproduce that of the square graph. SI Appendix, section 2H gives further examples of different placements of controlled edges that still produce a network output with the functional form of Eq. 1, and therefore remain precisely addressable by the analysis of this paper. Other outputs, like those of the more complicated networks of multiple transcription factors analyzed in SI Appendix, invite a fresh application of the Matrix Tree Theorem but benefit from the same framework; an ensemble of such analyses relevant to transcription appears in SI Appendix, section 3.
Equilibrium Output Curves Are Constrained.
Eq. 1 describes all input–output curves, in or out of equilibrium, produced by this four-state transcriptional system. When detailed balance does hold, this equation becomes equivalent to thermodynamic statistical-mechanical models (as it must). We explain algebraic correspondences to thermodynamic models, like those communing with earlier transcriptional experiments (6, 30), in SI Appendix, section 2G.3. Importantly, we find that the equilibrium condition demotes any observable output to the simpler form of a ratio of linear polynomials in , namely
| [2] |
for constants set wholly by thermodynamic parameters (SI Appendix, section 2G.1). Not coincidentally, this functional form formally reproduces or evokes the Hill induction, Michaelis–Menten, Langmuir-binding, Monod–Wyman–Changeux, or two-state Fermi function forms from the equilibrium statistical mechanics of binding commonly used to model and fit induction curves in natural (6, 31) or synthetic (32) settings. This equilibrium curve is paradigmatic of our biochemical intuition—smoothly saturating, with one point of inflection, with respect to transcription factor concentration in a conventional logarithmic scale (Fig. 3A and SI Appendix, section 2E).†
While earlier insightful work identified raw algebraic expressions similar to Eqs. 1 and 2 in different contexts (including refs. 16 and 34), their functional flexibility has not been deeply explored, and such studies largely asked distinct questions (e.g. focusing on sensitivity; see SI Appendix, section 2G.6). Next, we derive biological consequences of these behaviors that represent precise contrasts between nonequilibrium and equilibrium regulation.
New Regulatory Shape Phenotypes Unlocked by Nonequilibrium.
How much more complex is the regulation realizable by nonequilibrium outputs (Eq. 1), compared to that of their equilibrium special case, (Eq. 2)? To reach the qualitative essence of this question, we first investigate the possible shapes of the output curve. Specifically, we monitor the output’s changes in concavity with respect to the control parameter. We postpone comment on the characteristic positions and scales of output curves—any shifts in their horizontal position (viz. characteristic concentration scales) or vertical expanses (e.g. maximally induced responses)—until shortly.
Neglecting scales and shifts allows us to collapse the general, six-parameter output curve of Eq. 1 to a normalized function of just two emergent shape parameters,
| [3] |
Here, the emergent shape parameters and are complicated functions of the coefficients in Eq. 1 (and hence of underlying rate constants), and is the governing concentration measured in terms of a characteristic concentration scale (all defined in SI Appendix, section 2F). The values and are the leakiness (uninduced) and saturation (maximally induced) responses; we return to these values in the following subsections. This representation preserves the concavity of the response function, allowing us to explore shapes and quantitative features in a two-dimensional space more efficiently and comprehensively than possible in the space of the eight rates.‡
Harnessing this collapsed representation, we find that all output curves assume just three different universal shapes (Materials and Methods and SI Appendix, section 2I). First, the output can be “equilibrium-like:” monotonic and saturating with a single inflection point with respect to the control parameter (on a log scale), recalling the shape of the equilibrium response (Fig. 3C). Uniquely out of equilibrium, however, two additional multiply-inflected response shapes become possible. Under energy expenditure, outputs can become nonmonotonic and show two inflection points (Fig. 3D), or remain monotonic with three inflection points (Fig. 3E), with respect to the log of the control parameter. Responses with three inflections are always shaped as depicted in Fig. 3E: maximally steep at the first and third inflection points, but minimally steep at the second inflection point. Clearly, these nonequilibrium curves are marked departures from simple equilibrium-like curves but show a remarkable parsimony and regularity, given that they describe all departures from equilibrium for any rate parameter values. These three regulatory behaviors can pose different physiological implications for an organism; admit distinct quantitative constraints on sensitivity (as we will soon see); and require different conditions on underlying rate constants to be reached. In view of their categorical differences, we refer to these possible shapes as regulatory (shape) phenotypes.§
Leakiness, Saturation, and EC50 Are Tunable at Equilibrium.
Does spending energy enable finer control over these quantitative traits, beyond growing their number? In fact, only some quantitative traits are given extra adjustability by spending energy. Without the transcription factor, the system cannot be found in any microstate that involves it, collapsing four states into just the two states. This pair of states forms an acyclic graph, so these steady-state probabilities must show detailed balance (i.e., are set purely thermodynamically). Thus, leakiness , determined exclusively by and states, can be adjusted freely while maintaining detailed balance. Analogously, when the transcription factor concentration is saturating (), the system is never found in the two microstates without the transcription factor, again admitting an orthogonal description of a balance between two states, now . Hence, saturation is also freely adjustable at equilibrium. These leakiness and saturation values are independently adjustable by two separate energy parameters—the binding energies of the polymerase to the genome when the transcription factor is absent or present, respectively. At equilibrium, once the leakiness and saturation are fixed by energy parameters, the response’s maximal sensitivity (slope at the inflection point) is predetermined and no longer tunable, as revealed by its algebraic dependencies. In contrast, while the location of the governing inflection point depends on these two energy parameters, it can also be tuned—remaining at equilibrium—using another energy parameter (the binding energy between the transcription factor and genome); further details are discussed in SI Appendix, section 2G.2.
Nonequilibrium Control of Sensitivity Obeys Shape-Dependent Global Bounds.
Out of equilibrium, the sensitivity of responses enjoys greater adjustability. Specifically, the diversity of input–output curves accessible under drive motivates us to assess sensitivity by a suitably normalized slope , defined by
| [4] |
where and are the extremal values of the observable over all , and is a characteristic concentration scale ensuring dimensional consistency. For monotonic curves, the maximum and minimum responses are necessarily the uninduced leakiness and the maximally induced saturation (or vice versa), whereas for nonmonotonic responses with two inflections, the maximal and minimal responses can occur at intermediate finite values of . This normalized sensitivity is directly related to familiar measures such as the logarithmic sensitivity and the effective Hill coefficient, but more naturally and tightly describes sensitivities of nonmonotonic phenotypes (SI Appendix, section 2J).This measure parameterizes sensitivity with greater indifference to absolute scales of the function.
By combining wide numerical sampling, symbolic inequality solving, and analytical arguments (SI Appendix, section 2J), we investigated the maximal normalized sensitivity any response curve can exhibit for the four-state system across its three possible shape phenotypes. We found that sensitivity is tightly bounded above and below by precise finite limits; these limits vary by phenotype. Fig. 4 summarizes these bounds, visualized by how normalized and centered response curves behave around inflection points of maximal slope. Equilibrium response curves always show a normalized sensitivity of exactly one-fourth. Out of equilibrium, singly-inflected response curves can increase this maximal sensitivity up to one-half, or decrease maximal sensitivity below the equilibrium value to a numerical value of about 0.16. (We lack a coherent explanation for this curious numerical lower bound but verified it by precise symbolic inequality solving; see SI Appendix, section 2J). Nonequilibrium with two inflection points all have maximal sensitivity of at least the equilibrium level of one-fourth, but up to one-half. Nonequilibrium curves with three inflection points all show maximal sensitivity of at most the equilibrium level of one-fourth, and at least a sensitivity of one-eighth.
Fig. 4.
Global bounds, in or out of equilibrium, restrict maximal (normalized) response sensitivity (with respect to input concentrations on a log scale). The normalized responses are plotted near points of inflection that maximize slope, separated by shape phenotype. When the output has one inflection point (Left), the maximal sensitivity is bounded between a minimum of (blue line) and a maximum of 1/2 (red line) for any set of rate values or any dissipation; this subsumes the equilibrium case, whose normalized sensitivity is fixed at 1/4 (black dotted line). When the output has two inflections (Middle), the maximal sensitivity is bounded between 1/4 and 1/2. When the output has three inflections (Right), the maximal sensitivity is bounded between 1/8 and 1/4.
Cast in terms of the raw maximal sharpness of each response curve, these bounds report that raw maximal sharpness is always between one-eighth and one-half of the distance between the maximum and minimum responses per -fold increase in the concentration . We stress that these bounds on sensitivity, in terms of the observed and , are tighter quantitative constraints than bounds merely in terms of the maximal or minimal transcription rates (or other potency values) or that any microstate of the system can show, as can be connected to recent, related upper bounds (33). This follows since the extrema of the average observable response curve over all are usually more restricted than the most extreme potencies over microstates (namely, and ) (SI Appendix, section 2J.4).
These findings emphasize that network architecture and dissipation are not the only hard global constraints that bound sensitivity. The global shape of the response curve further categorically constrains the possible sensitivity. This relationship is potentially biologically relevant: For instance, it is impossible for an organism regulated by the square-graph transcriptional motif to achieve both a triply-inflected output curve and a normalized sensitivity greater than that at equilibrium. This represents a tradeoff between the shape complexity of a response and its maximal sensitivity.
Breaking Detailed Balance Along Each Edge.
Our foregoing analysis has been mathematically general. That is, the constrained shapes and bounds on sensitivity hold for any response following Eq. 1, over all rate constant values and energetic dissipations. These constraints also apply even—as previously noted—if the response is produced by a different underlying graph architecture than the particular transcriptional motif shown in Fig. 1, as long as the graph still yields spanning trees that depend up to quadratically on the control variable. Just because multiply-inflected or adjustable response curves are mathematically possible, however, does not establish that they are biologically plausible. To assess whether these behaviors can be accessed using physiologically plausible amounts of energy expenditure or typical biological rates, we now specialize to the plausible particulars of transcription as in Fig. 1. In the remainder of this paper, we quantify the extent of dissipation sustaining a nonequilibrium steady-state by focusing on the free energy coupled to the system, with units of or Joule; we refer to this quantity as the nonequilibrium driving force or simply as the (net) drive (SI Appendix, section 1D). In addition, we now adopt the transcriptional potencies and . This choice makes our response observable the probability that polymerase is bound to the genome. (More general responses can be easily analyzed analogously; see SI Appendix.)
Typical empirical binding energies, diffusion-limited rates, and single-molecule kinetic measurements yield order-of-magnitude estimates for the eight rates governing transcription at equilibrium (SI Appendix, section 2B and Fig. 1). First, we choose a set of default rates consistent with these orders-of-magnitude (given in the Lower Right stem plot of Fig. 5C). Next, we investigate how breaking detailed balance by spending energy to increase or decrease a single rate constant at a time—while keeping the seven other rates fixed at biological default values—modulates the transcriptional response curve. Hydrolyzing an ATP molecule makes available 20 of energy [Bionumbers ID 101701, (35, 36)] that can be used as a chemical potential gradient to drive transitions [for instance, by powering an enzymatically assisted pathway (37)]. This amount of free energy is also the scale observed to power active processes like biomolecular motors (38). Accordingly, to conservatively emulate a biological energy budget, we allot a maximum of just two ATP hydrolyses’ worth of free energy, , to break detailed balance. This budget for drive allows a given individual rate to be scaled by up to a factor .
Fig. 5.

Systematically breaking detailed balance edge-by-edge. (A) Example of how spending energy to modify a single rate (here, )—while the seven other rates remain fixed—changes the response curve away from default equilibrium behavior (pale yellow curve labeled “0” net drive and outlined in black). Responses from rate values larger than (or smaller than) at equilibrium are shown in increasingly red (or blue) colors, respectively; curves are also labeled with the numerical values of the net drive that generated them in units (positive for an increase; negative for a decrease). Each curve’s resulting inflection points are marked by yellow, orange, or pink markers, denoting one to three inflection points (respectively), and summarized in the associated one-dimensional (shape phenotypic) phase-diagram with the same colors on the Right. Inset: the position of the final inflection point versus net drive (power law exponent is ∼1). (B) Another representative behavior is displayed when is instead the rate varied. Inset: the saturation versus net drive (power law exponent is ∼1). (C) Summary of how all eight rates respond to energy expenditure to realize different regulatory shape phenotypes. Below, stem plots give precise values of each default rate constant at equilibrium. (These rates satisfy initial “broken symmetries” that violate the conditions in Eq. 5 by default, facilitating more ready access to nonmonotonicity. SI Appendix, section 2K documents the impact of departing from different default starting rates that instead satisfy Eq. 5, in addition to the impacts of driving all other edges.) (Here, the reference concentration scale setting the horizontal offset of the concentration axis is ).
Applied edge-by-edge, this procedure reveals that biologically feasible energy expenditures dramatically modify the response curve and easily attain all three regulatory shape phenotypes. Illustrating this regulatory plasticity, Fig. 5A shows how breaking detailed balance by scaling a rate up (increasingly red curves) or down (increasingly green-blue curves) can shift response curves to the left or right on the horizontal axis (effectively tuning what EC50 formerly represented at equilibrium), and also smoothly change the number of inflection points (See SI Appendix, section 2K for many further example response curves). Yet even for the same net nonequilibrium driving force, the consequences of breaking detailed balance depend significantly on the edge it is broken along. Fig. 5B shows another representative behavior by modifying a different edge, where the major effect of departing equilibrium is to modulate the leakiness, saturation, or intermediate scales of the response. Despite the diversity of this regulation, quantitatively regular control behavior emerges as well: Inset plots emphasize that phenotypic properties such as the position, , of the final inflection point and the saturation, , scale as power laws with the net drive over some regimes.
This broad regulatory flexibility is sustained over all eight rate constants, whose comprehensive response behaviors under drive are analyzed in SI Appendix, section 2K. Fig. 5C summarizes how driving each rate attains different shape phenotypes (number of inflections). Notably, any rate can be driven to access any of the three response shape phenotypes at some small, biologically feasible dissipation. Yet the minimum nonequilibrium driving force values needed to unlock a given phenotype—and the fraction of rate space manifesting this phenotype—varies markedly across the rates. For instance, the two-inflection-point nonequilibrium response shape (orange) is only reached for a fairly narrow, fine-tuned region of drive for the rates and , but is the most common shape phenotype over finite net drives for the rates and . Such variable consequences of injecting energy along different rate transitions reflect the privileged roles that states and play in the graph, given that their probability is the transcriptionally potent response we monitor. The contrasting impacts of modifying each edge are also sensitive to the default rates that define the system’s biological equilibrium starting point, a revealing dependence that we will return to in the penultimate Results section.
Breaking Detailed Balance Two Edges at a Time.
Adjusting one edge at a time, as we have just investigated, is one of many ways a network could invest energy to control its input–output function. For instance, the classical scheme of kinetic proofreading recognized that many steps could each be driven independently (39), as has later been repeatedly observed in the multistep ways that T-cell or mitogen-activated protein kinases activation implement kinetic proofreading (40–43) or in mechanochemical operation of myosin motors (44). How do such distributed investments of energy afford expanded control of response functions? To understand this question, we now appraise how breaking detailed balance along up to two edges at a time expands how different response behaviors may be accessed. With two independent drives (one for each edge’s departure from its default biological value), the formerly-one-dimensional phase diagrams of Fig. 5 become slices of two-dimensional phase diagrams that map where response shapes are reached (see Fig. 6 A and B; and also the census of how all twenty-eight rate pairs behave found in SI Appendix, section 2K).
Fig. 6.
Breaking detailed balance along two edges unlocks higher sensitivity and multiply-inflected outputs with smaller drive than required for breaking detailed balance along single edges. (A) Adjusting the rate pair —while fixing the other six rates at their default biological values at equilibrium (of Figs. 3A and 5C’s stem plot)—varies the number of inflection points (light yellow: one inflection, orange: two inflections, pink: three inflections), in a 2D analog of Fig. 3. Specifically, this rate pair illustrates a case where nonmonotonic two-inflection curves can be reached with only an infinitesimal net drive. (B) In contrast, when tuning , a finite minimum drive is needed to access nonmonotonicity; numerical sampling reveals that this total drive is the same as required while only tuning one edge at a time. (C) Maxima of raw slope over the same modulations (axes) of the rate pair shown in (A), with slope-maximizing rates within the permissible rate space indicated with a circle. is a reference concentration. (D) Overlaying the same positions of maximal slope for all twenty-eight rate pairs emphasizes that optimal slopes are found at the boundary of the permissible rate space. Marker colors reflect the maximal slope achieved for each rate pair. Panel (E) summarizes the behavior of panel (D) by representing each optimal rate pair value with two important natural parameters: the net drive (either the log ratio or log product of each rate’s difference from their equilibrium starting values, depending on the relative (counter)clockwise orientation of the rates in a pair); and the net total distance in rate space between the optimal values and their starting values, .
Geometrically more complex than their one-edge equivalents in Fig. 5, these two-edge phase diagrams expose new ways to transition between the shape phenotypes. One measure of this new facility is the energetic cost needed to reach nonmonotonic (two inflection-point) response curves. Starting from biological equilibrium, what is the minimum net drive required for the response to become nonmonotonic, when energy can be injected along just one edge at a time (Fig. 5) or up to two edges at a time (Fig. 6 A and B)? Regarding this question, we find that the possible pairs of edges can be divided into two types. A few—like the edge pair illustrated in Fig. 6B—require the same finite total dissipation to reach nonmonotonicity as needed if only driving either individual edge. However, the majority of rate pairs—such as the edge pair —offer a dissipative bargain: By controlling both rates, it is possible to find a point in rate space where only an infinitesimal departure from detailed balance activates nonmonotonicity (as circled in Fig. 6A). These infinitesimal minimal drives contrast the finite drives always required while modifying single edges (Fig. 5C). This new economy is enjoyed by the 22 rate pairs that include at least one of the four special rates , or ; their membership is a clue for identifying critical conditions on nonmonotonicity as we deduce in the next Results section.
The richer behaviors achievable by breaking detailed balance along two rates (instead of just one) become even more pronounced from the lens of sensitivity. The heatmap of Fig. 6C depicts the maximal unnormalized sharpness reached by modifying the rate pair (the same rates mapped phenotypically in the phase space of Fig. 6A). If only one rate constant at a time were allowed to be driven, only the slices of sharpness along the white dotted and vertical and horizontal lines would be accessible, at most realizing a maximal unnormalized sharpness of 0.15 with respect to the concentration on a log scale. However, once both edges can be modified, it becomes possible to access the maximal slope region on the Lower Right, yielding a greater maximum sensitivity of about 0.35. Repeating this procedure for all 28 rate pairs, as shown in Fig. 6D, we find that the points in rate space that maximize slope all require both rate constants in each pair to be modified from their default equilibrium values (lying away from the and vertical and horizontal lines). To maximize sensitivity, all rate pairs show one (but usually not both) rate constant that has been driven to the maximal extent allowed by the nonequilibrium driving force budget (localizing optimal points to the borders—but not necessarily corners—in Fig. 6D). The net drive ensuing from both rate’s departure from their equilibrium values is often distinct from those independent departures. Fig. 6E recasts the same slope-maximizing points in Fig. 6D in terms of these two separate properties (the net drive , and the average geometric distance, , each edge moved from its biological starting point.) Different rate pairs show dramatically different optimal maximum sensitivities at varying cost: Choosing to break detailed balance along the can achieve a maximal slope of about 0.35 (probability units per -fold change in [X]) at a net drive of only (dark gray marker), but choosing less wisely the rate pair at best attains a slope of about 0.054 (probability units per -fold change in [X]), even while spending a net energy almost four times as large. Collectively, these findings highlight how distributing dissipation over the transitions in a network can achieve more precise and dramatic responses.
Generic Rate Conditions Forbid Access to Nonmonotonic Responses.
Why, as we have seen, are nonmonotonic responses accessed with different ease while driving some rates—or still more economically, rate pairs—rather than others? How do the default equilibrium rates from which biology departs affect the tunability of responses? Confronting these questions leads us to glean general kinetic conditions that enable or forbid nonmonotonicity. We reformulate the criterion for nonmonotonicity to explicitly invoke net drive and rate constants (SI Appendix, section 2L). Using these analytical arguments, we determine that nonmonotonicity is forbidden for any net drive when transition rates satisfy the following, surprisingly loose, conditions:
| [5] |
That is, if the presence of the transcription factor on the genome increases or decreases the polymerase’s binding rate in a sense opposite to its effect on the unbinding rate (or leaves either unchanged), the response must depend on the transcription factor monotonically. Only when the transcription factor plays a functionally “ambiguous,” dualistic role—coherently changing both the polymerase’s binding and unbinding rates (that themselves have opposite effects on the response)—may the response become nonmonotonic under a sufficient net drive. Since access to nonmonotonicity is governed by kinetic conditions in Eq. 5—but thermodynamic parameters instead set whether a response is globally activating or repressing (SI Appendix, section 2L)—the qualitative origin of nonmonotonicity stems from when kinetic and thermodynamic aspects in the system oppose each other.
This condition of Eq. 5 helps explain why some rates and rate pairs reach regulatory shape phenotypes so differently under drive, and how default starting rate constants matter. A comprehensive census of responses while driving one edge at a time when default rates satisfy Eq. 5 is provided in SI Appendix, section 2K. Instructively, Eq. 5 demands that when the transcription factor does not change the polymerase’s (un)binding rates—namely, either or —the response must be monotonic. By default, under the often reasonable classical assumption that the binding rate of polymerase is purely diffusion-limited (1), the transcription factor indeed may not affect the polymerase’s binding rate, thus forcing the response to be monotonic.¶ This type of biophysical constraint may contribute to why transcriptional responses are most canonically pictured as monotonic. However, while plausible, this biophysical scenario is hardly inescapable or universal. In fact, even for architectures as “simple” as lac repression, there is gathering empirical evidence that proteins associate with DNA binding sites under more intricate regulation than merely diffusion (45). Transcription factors that mediate steric access to the genome (dissipatively or not), such as via DNA looping (46), may also be especially prone to contravene this condition.
More Complex Regulatory Architectures Obey Similar Constraints.
As documented earlier in Fig. 2, more complex regulatory networks than the square cycle also govern transcription, albeit less commonly. In SI Appendix, section 3, we use the Matrix Tree Theorem to derive how combinatorial regulation by multiple transcription factors—forming hypercubic graphs—generalize the algebraic behavior of the square graph. Further, we also investigate two distinct architectures for DNA looping, thus generalizing quantitative input–output functions characterized in previous experiments (e.g., ref. 47) but previously studied with models confined by equilibrium assumptions (30, 47). These larger networks also produce outputs that are ratios of particular (multivariate) polynomials in transcription factor concentrations as control variables. Like in the square graph, these responses attain restricted subsets of monomials in the control variables, whose constrained algebraic forms we calculate in SI Appendix, section 3. At equilibrium, these responses analogously collapse further to much simpler expressions with constrained shapes. To learn which nonequilibrium behaviors are not just mathematically possible but also biologically plausible to reach in such systems, we further systematically broke detailed balance along each edge in the architecture of DNA looping by repression. This operation illustrates how the edge-by-edge analysis in Fig. 5 can extend to more complex graphs. These algebraic and numerical analyses promise to identify constraints on the ultimate signal processing and combinatorial logic that cells can perform even with larger networks, in or out of equilibrium. For instance, we also derive separate kinetic criteria needed for DNA looping architectures to attain nonmonotonic response shapes. These conditions reveal how even in these more complex networks, narrow and biophysically challenging parameter regimes are required to reach specific response features (SI Appendix, section 3).
Discussion
In this work, we dissected how spending energy transforms the control of gene expression in minimal and common transcriptional architectures. Harnessing a kinetic description and diagrammatic procedure from graph theory, we found that any transcriptional outputs follow a universal form with respect to a control parameter like a transcription factor’s concentration. Focusing on a particularly simple and pervasive regulatory motif, we found these responses may only adopt three shapes, including an equilibrium-like (monotonic, singly-inflected, saturating) response. Uniquely out of equilibrium, however, two unexpected and noncanonical output behaviors become possible: a doubly-inflected, nonmonotonic response, and a triply-inflected, monotonic response. Underneath wide parametric complexity, we established tight global bounds on the maximal sensitivity of transcriptional responses and learned these can vary and tradeoff with response shape. Next, we systematically mapped how biologically feasible amounts of energy along single rates or rate pairs control responses. These findings established that the noncanonical responses are easily accessed around rates plausible for transcription, especially when dissipation can be distributed more widely over a network. Last, we uncovered global, transparent kinetic conditions that forbid (or enable) novel nonmonotonic responses. An important aspect of the present work is that beyond focusing on point quantities such as the sensitivity, we characterize the entire input–output function over the full range of different detailed-balance-breaking behaviors.
The flexible regulation unlocked by nonequilibrium could be widely biological salient. Responses that can show three inflection points—instead of just one at equilibrium—could accomplish the role of two classical (singly-inflected) input–output functions. Since an inflection can mark a local region of enhanced output sensitivity, and effectively implement a threshold, this functionality could allow cells to achieve distinct cellular fates, such as in Wolpert’s classical French Flag model (48). By contrast to our small architecture, canonical pictures of multiple thresholded responses usually require multiple genes—often at least one specific gene per threshold (49). One important example is the celebrated Dorsal protein in Drosophila, where two critical thresholds have been proposed to accomplish twist gene activation and decapentaplegic gene repression to help establish distinct parts of dorsal patterns in embryonic development (50, figure 2.26, p. 64). We propose that triply-inflected responses from a single gene could accomplish some of this same functionality with a smaller architecture.
Nonmonotonic response functions with two inflection points could empower cells to accomplish more sophisticated signal processing, such as band-pass or band-gap filtering of chemical inputs, and/or generate temporal pulses of chemical outputs. Similar implications have been explored by Alon (64), inter alios, who established how nonmonotonic outputs can be produced by chaining together incoherent feed-forward loops (51–54). To achieve more complex outputs, these networks use transcriptional interactions among multiple genes at equilibrium—e.g., from two to six (or more) genes in such examples. Hence, these networks operate with comparatively larger sizes and timescales than mere binding-unbinding reactions on a single gene’s regulatory network like the square graph we study in this report. We suggest these comparisons contribute useful material to a maturing discourse about when and how biology uses thermodynamic or kinetic control mechanisms (37, 44).
Even responses that remain “equilibrium-like” with a single inflection benefit from energy expenditure, since our bounds establish they may be up to two times more sensitive than at equilibrium, and enjoy new kinetic (instead of merely thermodynamic) ways of controlling the location of the governing inflection point (EC50). Such adaptation evokes earlier influential models of ultrasensitivity by energy expenditure (55).
While only mild net drives unlock useful regulatory shapes and traits, our analysis emphasizes other mechanistic factors that govern how easily these behaviors can be accessed, or measured as signatures of nonequilibrium in natural or synthetic settings. First, the biological network’s architecture determines whether these distinct macroscopic behaviors can be attained at all. Although prokaryotic gene regulation has regularly shown a compelling coherence between quantitative measurements and equilibrium statistical mechanical models [including demanding studies from our own laboratories over the past two decades (6, 25, 28, 56, 57) and beyond (46)], many of the most fiercely interrogated systems (e.g. the lac repressor) are indeed exactly those with acyclic network topologies that make nonequilibrium steady-states impossible (without open fluxes) and guarantee detailed balance. This reflects a possible overrepresentation of biological settings where detailed balance may be expected a priori to apply on mere structural grounds. On the other hand, the means to spend energy biochemically clearly exist, even in bacteria through two-component regulatory systems (58) and other active settings like nucleosome remodeling in eukaryotes (5). Accordingly, our findings invite and vigorous reappraisal of whether signatures of nonequilibrium are lurking in architectures that are more prone to accommodate it, such as the four-state “simple activation” motif we discussed here. Moreover, the measurements (or synthetic biological perturbations) needed to map the nonequilibrium landscape of transcriptional responses must differ from the convenient binding site modifications [e.g. parallel promoter libraries (25, 59)] previously used to test equilibrium models, since manipulating binding energies inherently preserves detailed balance. Developing fresh experimental approaches to augment or attenuate a single transition between microstates (or set of transitions) in situ to break detailed balance is a crucial direction of future empirical work, whose value is advocated for by our results. To manipulate and probe tractable models of transcription, these methods might include optogenetic control (60, 61), or adjustments of governing enzyme concentrations or activities.
Second, where energy is invested crucially dictates which regulatory behaviors are available. We found that investing energy along more than one rate at once was capable of achieving more dramatic response curves more economically. This finding may help explain the many observations in biological systems where energy is independently injected along multiple steps (39–44). However, since each independently regulated injection of energy may also be accompanied by architectural costs, not all examples of biological regulation may contain the distributed dissipation machinery required to make novel nonequilibrium response signatures conspicuous.
Third, the structures of responses while breaking detailed balance edge-by-edge, and our general kinetic criteria that forbid nonmonotonicity, highlight that certain critical imbalances between rate constants are needed to produce some out-of-equilibrium signatures. On basic biophysical grounds, some natural systems may (or may not) exhibit the required rate imbalances to make novel responses as easy to activate (SI Appendix, section 2L.2). Indeed, the rate imbalances required to produce nonmonotonicity we found are nonobvious. These kinetic criteria have significant implications for organizing parameter explorations. For instance, we show in SI Appendix, section 2M that two exciting, completely separate studies recently published (2, 15) exploring the informational and functional consequences of nonequilibrium impose simplifying assumptions on rate constants that in fact preclude the possibility of nonmonotonic responses, according to equivalents of our monotonicity criterion. We expect that this approach and our kinetic criteria will help future works include and capture the regulatory consequences of these rich behaviors. We anticipate this flexibility may be especially germane for environments that present nonuniform input statistics.
The contrast between the nonequilibrium steady-states possible to support using this “simple activation” architecture, and the difficulty of sustaining nonequilibrium steady-states in a simple repression architecture that lacks a cycle, also possibly provides a distinct design principle to understand the timeless question of why both activators and repressors are employed as distinct architectures when they can produce the same mean gene expression. Intriguing rationalizations based on ecological demand have been offered for why these architectures are used differently in E. coli, such as the classical proposal by Savageau (62–64). We speculate that another, quite distinct, feature—the very possibility of using nonequilibrium to steer input–output response curves so flexibly—may also contribute to why organisms might use a simple-activation (or other cycle-containing) architecture over acyclic architectures, all other features being equal. Whether this nonequilibrium controllability significantly shapes the natural incidence of regulatory architectures can only be assessed using quantitative measurements of input–output behaviors from a much broader set of architectures than the relatively narrow (e.g., Lac repressor, Bicoid, Ph05, and CI in bacteriophage- switch) subjects of existing analyses.
Our work urges the value of measuring the global shapes of input–output functions. This approach complements but contrasts the more common practice of attempting to fit and compare single scalar parameters like sensitivity or pointwise proofreading-ratios (13, 55). Unexpectedly flexible global curve shapes might be even more obvious and robust signatures of nonequilibrium than pointwise estimators, since scalar measures like derivatives are often highly sensitive to measurement noise (65). Yet, just as for pointwise measures, inferring nonequilibrium from global shape requires sufficient knowledge of a system’s states. As our wider analysis of transcriptional architectures (SI Appendix, section 3) stresses, larger graphs can reach complex shapes at equilibrium only accessible by small graphs out of equilibrium. Whether a measured complex response stems from a simple system operating out of equilibrium, or a more architecturally complex system operating near equilibrium, may depend on separate lines of empirical evidence, including knowledge of binding site topology. Nevertheless, such details are often known—or knowable—for promoters of interest.
Our calculations also provide explicit maps of parameter spaces that can guide the naturalist looking for whether this expanded regulation occurs naturally in specific examples of transcription. This information is also a guide to the synthetic biologist who endeavors to engineer such responses in genetic circuits and exploit the advantages of producing complex regulation using a small driven network, instead of a comparatively larger, more slowly tuned network of multiple genes and/or proteins at equilibrium.
Beyond advocating for experiments, our findings invite many theoretical extensions. How dissipation affects the intricate tradeoffs between sensitivity, specificity, speed, and stochasticity in (steady-state or transient) gene regulation is a large, open, physiologically relevant question amenable to further graph-theoretic dissection. In addition, we hope for deeper analytical rationalization of our bounds on sensitivity; our upper bounds surely share similar foundations with looser, more architecturally general, bounds recently and insightfully established by Owen and Horowitz (33), though our additional lower bounds and different mathematical quantities suggest separate theoretical ingredients.
Overall, we foresee that graph-theoretic treatments like we have deployed here—and as have been first so powerfully established and refined by foundational investigators (18, 20, 33, 66)—will help understand still more sophisticated networks. Just as Feynman diagrams and other diagrammatic reasoning catalyze field theory and particle physics (67, 68), drawing graphs promises to help build structural principles about how energy enlivens biology.
Materials and Methods
Nonequilibrium Steady-State Probabilities via the Matrix Tree Theorem.
Consider a continuous-time Markov chain with states, whose transition rates between states and are stored in the th element of the transition matrix , and so the probabilities of finding the system in these states evolve according to
(With this convention of as a column vector, the columns of the matrix sum to zero and the diagonal entries are accordingly .) Identifying our Markov system as a weighted graph, a spanning tree over the states is a set of edges that visits every state exactly once. A spanning tree
i
rooted in a state contains no outgoing edges from state (and exactly one outgoing edge for every other state ). These notions are summarized in the example of Fig. 3A. The Matrix Tree Theorem (MTT), also known as the Markov Chain Tree Theorem, states that at steady state , the statistical weight of the th microscopic state is the sum of products of rate constants over spanning trees rooted in node
![]() |
[6] |
where is the number of spanning trees rooted in (18, 24). This weight is the relative odds of finding the system in state as a fraction of all the statistical weights , namely . Applying the MTT to the regulatory motif of Fig. 3A indicates that any steady-state probabilistic observable depends on the transcription factor control parameter according to Eq. 1 (SI Appendix, section 2D).
Emergent Shape Parameters and Shape Phenotypes.
The collapsed shape representation of Eq. 3 allows us to solve for the number of positive solutions to , yields the numbers of possible inflection points (via, for instance, Descartes’ rule of signs or explicit inequality solving) and hence shapes (SI Appendix, section 2I). Numerical and symbolic analysis of the space formed by these two emergent shape parameters (Eq. 3 and SI Appendix, section 2I.2) helps establish our global bounds on sensitivity. Ultimately, this collapsed representation is also a crucial theoretical stepladder to find the generic conditions forbidding nonmonotonicity given in Eq. 5 (SI Appendix, section 2I).
Single Edge and Edge Pair Perturbations.
We estimated default biological rates for transcription at equilibrium by synthesizing reported binding affinities, association rates, and diffusion constants (SI Appendix, section 2B). We solved the condition for an inflection point symbolically and numerically (SI Appendix, sections 2I–2K).
Supplementary Material
Appendix 01 (PDF)
Acknowledgments
We gratefully acknowledge feedback from and formative discussions with Forte Shinko, Nicholas Lammers, Avi Flamholz, Jordan Horowitz, Jeremy Owen, and Jané Kondev. G.L.S. thanks the US NSF Graduate Research Fellowship Program under Grant DGE-1745301 and Caltech’s Center for Environmental Microbial Interactions for support. R.P. is supported by the Rosen Center at Caltech and the NIH 1R35 GM118043 (Maximizing Investigators’ Research Award). H.G.G. is supported by NIH R01 Award (R01GM139913) and NIH R21 Award (R21HD107436), and the Koret-UC Berkeley-Tel Aviv University Initiative in Computational Biology and Bioinformatics. H.G.G. is also a Chan Zuckerberg Biohub–San Francisco Investigator.
Author contributions
S.D.M., G.L.S., P.D., H.G.G., and R.P. designed research; S.D.M., G.L.S., and P.D. performed research; R.P. and H.G.G. directed the project; S.D.M., G.L.S., H.G.G., and R.P. wrote the manuscript; and G.L.S. wrote the paper.
Competing interests
The authors declare no competing interest.
Footnotes
This article is a PNAS Direct Submission Y.T. is a guest editor invited by the Editorial Board.
*Though the square graph could also be analyzed directly by (more arduous) linear algebra, this setting already shows how the Matrix Tree Theorem dramatically simplifies practical calculations and organizes terms in powers of the control variable (as shown in Fig. 3B). It also shows power, calculationally and conceptually, in more complex graphs where linear-algebraic operations become forbidding (SI Appendix, section 3).
†Throughout our analysis and discussion in this paper, we monitor the shape, number of inflection points, and sensitivity of transcriptional outputs with respect to the control parameter of the concentration of transcription factor, on a logarithmic scale. We use this logarithmic convention in alignment with common practice in biochemical and transcriptional studies (6, 32, 33).
‡The two-parameter simplicity of Eq. 3 is one possible nonequilibrium sophistication of the (usually one-parameter) data collapses used to unify simpler, equilibrium, two-state physiological responses (31) and regulation (6) in bacteria.
§We use the phrase “regulatory (shape) phenotype,” referring to the overall shape of a response curve, to distinguish our meaning from the usage of ref. 2, who instead referred to specific quantitative traits within curves of a single mathematical shape (such as sensitivity or noise) as “regulatory phenotypes.”
¶By contrast, by the assumption that the transcription factor has the typical biophysical effect of changing the affinity between the polymerase and genome, the polymerase’s off-rate from the genome is affected by the transcription factor’s presence, and . So usually it is not an equality between polymerase’s off-rates that prevents a response from being nonmonotonic.
Contributor Information
Gabriel L. Salmon, Email: gsalmon@caltech.edu.
Rob Phillips, Email: phillips@pboc.caltech.edu.
Data, Materials, and Software Availability
Symbolic and numerical code used for this study’s analyses and figures is available open source. See https://github.com/RPGroup-PBoC/graphnoneq (69). All other data are included in the manuscript and/or SI Appendix.
Supporting Information
References
- 1.Morrison M., Razo-Mejia M., Phillips R., Reconciling kinetic and thermodynamic models of bacterial transcription. PLoS Comput. Biol. 17, e1008572 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Grah R., Zoller B., Tkačik G., Nonequilibrium models of optimal enhancer function. Proc. Natl. Acad. Sci. U.S.A. 117, 31614–31622 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Wong F., Gunawardena J., Gene regulation in and out of equilibrium. Annu. Rev. Biophys. 49, 199–226 (2020). [DOI] [PubMed] [Google Scholar]
- 4.Coulon A., Chow C. C., Singer R. H., Larson D. R., Eukaryotic transcriptional dynamics: from single molecules to cell populations. Nat. Rev. Genet. 14, 572–584 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Shelansky R., Boeger H., Nucleosomal proofreading of activator-promoter interactions. Proc. Natl. Acad. Sci. U.S.A. 117, 2456–2461 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Razo-Mejia M., et al. , Tuning transcriptional regulation through signaling: A predictive theory of allosteric induction. Cell Syst. 6, 456–469 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Jacob F., Monod J., “On the regulation of gene activity” in Cold Spring Harbor Symposia Quantitative Biology (Cold Spring Harbor Laboratory Press, 1961), vol. 26, pp. 193–211.
- 8.Farmer I. S., Jones C. W., The energetics of Escherichia coli during aerobic growth in continuous culture. Eur. J. Biochem. 67, 115–122 (1976). [DOI] [PubMed] [Google Scholar]
- 9.Mori M., Cheng C., Taylor B. R., Okano H., Hwa T., Functional decomposition of metabolism allows a system-level quantification of fluxes and protein allocation towards specific metabolic functions. Nat. Commun. 14, 4161 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Rodenfels J., Neugebauer K. M., Howard J., Heat oscillations driven by the embryonic cell cycle reveal the energetic costs of signaling. Dev. Cell 48, 646–658 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Yu Q., Zhang D., Tu Y., Inverse power law scaling of energy dissipation rate in nonequilibrium reaction networks. Phys. Rev. Lett. 126, 080601 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Yang X., et al. , Physical bioenergetics: Energy fluxes, budgets, and constraints in cells. Proc. Natl. Acad. Sci. U.S.A. 118, e2026786118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Zoller B., Gregor T., Tkačik G., Eukaryotic gene regulation at equilibrium, or non? Curr. Opin. Syst. Biol. 31, 100435 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Ge H., Qian M., Qian H., Stochastic theory of nonequilibrium steady states. Part II: Applications in chemical biophysics. Phys. Rep. 510, 87–118 (2012). [Google Scholar]
- 15.Lammers N. C., Flamholz A. I., Garcia H. G., Competing constraints shape the nonequilibrium limits of cellular decision-making. Proc. Natl. Acad. Sci. U.S.A. 120, e2211203120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Nam K. M., Martinez-Corral R., Gunawardena J., The linear framework: using graph theory to reveal the algebra and thermodynamics of biomolecular systems. Interface Focus 12, 20220013 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Gunawardena J., A linear framework for time-scale separation in nonlinear biochemical systems. PLoS One 7, e36321 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Mirzaev I., Gunawardena J., Laplacian dynamics on general graphs. Bull. Math. Biol. 75, 2118–2149 (2013). [DOI] [PubMed] [Google Scholar]
- 19.Tierrafría V. H., et al. , Regulondb 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12. Microb. Genom. 8, mgen000833 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Hill T. L., Free Energy Transduction and Biochemical Cycle Kinetics (Courier, 2013). [Google Scholar]
- 21.Qian H., Open-system nonequilibrium steady state: Statistical thermodynamics, fluctuations, and chemical oscillations. J. Phys. Chem. B 110, 15063–15074 (2006). [DOI] [PubMed] [Google Scholar]
- 22.R. Shelansky et al., A telltale sign of irreversibility in transcriptional regulation. bioRxiv [Preprint] (2022). 10.1101/2022.06.27.497819 (Accessed 1 April 2023). [DOI]
- 23.Mirny L. A., Nucleosome-mediated cooperativity between transcription factors. Proc. Natl. Acad. Sci. U.S.A. 107, 22534–22539 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Phillips R., Garcia H. G., Physical Genomics: From E. coli to Elephants (Princeton University Press, 2024). [Google Scholar]
- 25.Ireland W. T., et al. , Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time. eLife 9, e55308 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Rydenfelt M., Garcia H. G., Cox R. S. III, Phillips R., The influence of promoter architectures and regulatory motifs on gene expression in Escherichia coli. PLoS One 9, e114347 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Keränen S. V., Villahoz-Baleta A., Bruno A. E., Halfon M. S., REDfly: An integrated knowledgebase for insect regulatory genomics. Insects 13, 618 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Garcia H. G., Phillips R., Quantitative dissection of the simple repression input-output function. Proc. Natl. Acad. Sci. U.S.A. 108, 12173–12178 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Garcia H. G., Kondev J., Orme N., Theriot J. A., Phillips R., Thermodynamics of biological processes in Methods in enzymology. Elsevier 492, 27–59 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Bintu L., et al. , Transcriptional regulation by the numbers: Models. Curr. Opin. Genet. Dev. 15, 116–124 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Swem L. R., Swem D. L., Wingreen N. S., Bassler B. L., Deducing receptor signaling parameters from in vivo analysis: LuxN/AI-1 quorum sensing in Vibrio harveyi. Cell 134, 461–473 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Meyer A. J., Segall-Shapiro T. H., Glassey E., Zhang J., Voigt C. A., Escherichia coli “marionette’’ strains with 12 highly optimized small-molecule sensors. Nat. Chem. Biol. 15, 196–204 (2019). [DOI] [PubMed] [Google Scholar]
- 33.Owen J. A., Horowitz J. M., Size limits the sensitivity of kinetic schemes. Nat. Commun. 14, 1280 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Hartich D., Barato A. C., Seifert U., Nonequilibrium sensing and its analogy to kinetic proofreading. New J. Phys. 17, 055026 (2015). [Google Scholar]
- 35.Milo R., Jorgensen P., Moran U., Weber G., Springer M., Bionumbers-the database of key numbers in molecular and cell biology. Nucleic Acids Res. 38, D750–D753 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Tran Q. H., Unden G., Changes in the proton potential and the cellular energetics of Escherichia coli during growth by aerobic and anaerobic respiration or by fermentation. Eur. J. Biochem. 251, 538–543 (1998). [DOI] [PubMed] [Google Scholar]
- 37.Horowitz J. M., Zhou K., England J. L., Minimum energetic cost to maintain a target nonequilibrium state. Phys. Rev. E. 95, 042102 (2017). [DOI] [PubMed] [Google Scholar]
- 38.Lin M. M., General efficiency relation for dissipative molecular machines. Proc. R. Soc. A 480, 20230388 (2024). [Google Scholar]
- 39.Hopfield J. J., Kinetic proofreading: A new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc. Natl. Acad. Sci. U.S.A. 71, 4135–4139 (1974). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Britain D. M., Town J. P., Weiner O. D., Progressive enhancement of kinetic proofreading in T cell antigen discrimination from receptor activation to DAG generation. eLife 11, e75263 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Swain P. S., Siggia E. D., The role of proofreading in signal transduction specificity. Biophys. J. 82, 2928–2933 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.McKeithan T. W., Kinetic proofreading in T-cell receptor signal transduction. Proc. Natl. Acad. Sci. U.S.A. 92, 5042–5046 (1995). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Cui W., Mehta P., Identifying feasible operating regimes for early T-cell recognition: The speed, energy, accuracy trade-off in kinetic proofreading and adaptive sorting. PLoS One 13, e0202331 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Mallory J. D., Kolomeisky A. B., Igoshin O. A., Kinetic control of stationary flux ratios for a wide range of biochemical processes. Proc. Natl. Acad. Sci. U.S.A. 117, 8884–8889 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Marklund E., et al. , Sequence specificity in DNA binding is mainly governed by association. Science 375, 442–445 (2022). [DOI] [PubMed] [Google Scholar]
- 46.Kuhlman T., Zhang Z., Saier M. H. Jr., Hwa T., Combinatorial transcriptional control of the lactose operon of Escherichia coli. Proc. Natl. Acad. Sci. U.S.A. 104, 6043–6048 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Boedicker J. Q., Garcia H. G., Phillips R., Theoretical and experimental dissection of DNA loop-mediated repression. Phys. Rev. Lett. 110, 018101 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Wolpert L., Positional information and the spatial pattern of cellular differentiation. J. Theor. Biol. 25, 1–47 (1969). [DOI] [PubMed] [Google Scholar]
- 49.Papageorgiou S., Almirantis Y., Gradient model describes the spatial-temporal expression pattern of HOXA genes in the developing vertebrate limb. Dev. Dyn. 207, 461–469 (1996). [DOI] [PubMed] [Google Scholar]
- 50.L. Wolpert, C. Tickle, A. M. Arias, Principles of development (Oxford University Press, USA, ed. 6, 2015).
- 51.Kaplan S., Bren A., Dekel E., Alon U., The incoherent feed-forward loop can generate non-monotonic input functions for genes. Mol. Syst. Biol. 4, 203 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Ishihara S., Fujimoto K., Shibata T., Cross talking of network motifs in gene regulation that generates temporal pulses and spatial stripes. Genes Cells 10, 1025–1038 (2005). [DOI] [PubMed] [Google Scholar]
- 53.Basu S., Gerchman Y., Collins C. H., Arnold F. H., Weiss R., A synthetic multicellular system for programmed pattern formation. Nature 434, 1130–1134 (2005). [DOI] [PubMed] [Google Scholar]
- 54.Entus R., Aufderheide B., Sauro H. M., Design and implementation of three incoherent feed-forward motif based biological concentration sensors. Syst. Synth. Biol. 1, 119–128 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Tu Y., The nonequilibrium mechanism for ultrasensitivity in a biological switch: Sensing by Maxwell’s demons. Proc. Natl. Acad. Sci. U.S.A. 105, 11737–11741 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Jones D. L., Brewster R. C., Phillips R., Promoter architecture dictates cell-to-cell variability in gene expression. Science 346, 1533–1536 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Brewster R. C., et al. , The transcription factor titration effect dictates level of gene expression. Cell 156, 1312–1323 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Mitrophanov A. Y., Groisman E. A., Signal integration in bacterial two-component regulatory systems. Genes. Dev. 22, 2601–2611 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Yu T. C., et al. , Multiplexed characterization of rationally designed promoter architectures deconstructs combinatorial logic for IPTG-inducible systems. Nat. Commun. 12, 325 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Zhang W., et al. , Optogenetic control with a photocleavable protein. PhoCl. Nat. Methods 14, 391–394 (2017). [DOI] [PubMed] [Google Scholar]
- 61.Liu H., Gomez G., Lin S., Lin S., Lin C., Optogenetic control of transcription in zebrafish. PLoS One 7, e50738 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Savageau M. A., Design of molecular control mechanisms and the demand for gene expression. Proc. Natl. Acad. Sci. U.S.A. 74, 5647–5651 (1977). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Savageau M. A., Genetic regulatory mechanisms and the ecological niche of Escherichia coli. Proc. Natl. Acad. Sci. U.S.A. 71, 2453–2455 (1974). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.U. Alon, An introduction to systems biology: Design principles of biological circuits (CRC press, ed. 2, 2019). See especially §7.6 “Demand rules for gene regulation can minimize errors,” p. 129.
- 65.Lanshammar H., On precision limits for derivatives numerically calculated from noisy data. J. Biomech. 15, 459–470 (1982). [DOI] [PubMed] [Google Scholar]
- 66.Schnakenberg J., Network theory of microscopic and macroscopic behavior of master equation systems. Rev. Mod. Phys. 48, 571 (1976). [Google Scholar]
- 67.Kaiser D., Drawing Theories Apart: The Dispersion of Feynman Diagrams in Postwar Physics (University of Chicago Press, 2005). [Google Scholar]
- 68.Veltman M., Diagrammatica: The Path to Feynman Diagrams (Cambridge University Press, 1994), vol. 4. [Google Scholar]
- 69.S. Mahdavi, G. Salmon, Code for Studying Nonequilibrium Graphs. Github. https://github.com/RPGroup-PBoC/graphnoneq. Deposited 2 June 2024.
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Appendix 01 (PDF)
Data Availability Statement
Symbolic and numerical code used for this study’s analyses and figures is available open source. See https://github.com/RPGroup-PBoC/graphnoneq (69). All other data are included in the manuscript and/or SI Appendix.





