Allosteric conformational ensembles have unlimited capacity for integrating information

John W Biddle; Rosa Martinez-Corral; Felix Wong; Jeremy Gunawardena

doi:10.7554/eLife.65498

eLife. 2021 Jun 9;10:e65498. doi: 10.7554/eLife.65498

Editors: Aleksandra M Walczak, Arvind Murugan

PMCID: PMC8189718 PMID: 34106049

Abstract

Integration of binding information by macromolecular entities is fundamental to cellular functionality. Recent work has shown that such integration cannot be explained by pairwise cooperativities, in which binding is modulated by binding at another site. Higher-order cooperativities (HOCs), in which binding is collectively modulated by multiple other binding events, appear to be necessary but an appropriate mechanism has been lacking. We show here that HOCs arise through allostery, in which effective cooperativity emerges indirectly from an ensemble of dynamically interchanging conformations. Conformational ensembles play important roles in many cellular processes but their integrative capabilities remain poorly understood. We show that sufficiently complex ensembles can implement any form of information integration achievable without energy expenditure, including all patterns of HOCs. Our results provide a rigorous biophysical foundation for analysing the integration of binding information through allostery. We discuss the implications for eukaryotic gene regulation, where complex conformational dynamics accompanies widespread information integration.

Research organism: None

Introduction

Cells receive information in different ways, of which molecular binding is the most diverse and widespread. Binding events influence downstream biological functions. In the biophysical treatment that we present here, biological functions, such as the output of a gene or the oxygen-carrying capacity of haemoglobin, are quantified as averages over the probabilities of microscopic states. We will be concerned with how binding events collectively determine these probability distributions and will refer to this process as the integration of binding information.

The most proximal form of such integration is pairwise cooperativity, in which binding at one site modulates binding at another site. This can arise through direct interaction, where one binding event creates a molecular surface, which either stabilises or destabilises the other binding event. This situation is illustrated in Figure 1A, which shows the binding of ligand to sites on a target molecule. (In considering the target of binding, we use ‘molecule’ for simplicity to denote any molecular entity, from a single polypeptide to a macromolecular aggregate such as an oligomer or complex with multiple components.) We use the notation $K_{i, S}$ for the association constant—on-rate divided by off-rate, with dimensions of (concentration)⁻¹—where $i$ denotes the binding site and $S$ denotes the set of sites which are already bound. This notation was introduced in previous work (Estrada et al., 2016) and is explained further in the Materials and methods. It allows binding to be analysed while keeping track of the context in which binding occurs, which is essential for making sense of how binding information is integrated.

Figure 1. — (A) Pairwise cooperativity by direct interaction on a target molecule (grey). As discussed in the text, the target could be any molecular entity. Left: target molecule with no ligands bound; numbers $1, \dots, 6$ denote the binding sites. Right: target molecule after binding of blue ligand to site 2. (B) Indirect long-distance pairwise cooperativity, which can arise ‘effectively’ through allostery. (C) Higher-order cooperativity, in which multiple bound sites, 2, 4 and 6, affect binding at site 5.

Oxygen binding to haemoglobin is a classical example of integration of binding information, for which Linus Pauling gave the first biophysical definition of cooperativity (Pauling, 1935). At a time when the mechanistic details of haemoglobin were largely unknown, Pauling assumed that cooperativity arose from direct interactions between the four haem groups. He defined the pairwise cooperativity for binding to site $i$, given that site $j$ is already bound, as the fold change in the association constant compared to when site $j$ is not bound. In other words, the pairwise cooperativity is given by $K_{i, \{j\}} / K_{i, \emptyset}$, where $\emptyset$ denotes the empty set. (Pauling considered non-pairwise effects but deemed them unnecessary to account for the available data.) It is conventional to say that the cooperativity is ‘positive’ if this ratio is greater than 1 and ‘negative’ if this ratio is less than 1; the sites are said to be ‘independent’ if the cooperativity is exactly 1, in which case binding to site $j$ has no influence on binding to site $i$. This terminology reflects the underlying free energy (Equation 1). Association constants and cooperativities may be thought of as an alternative way of describing the free-energy landscape, as we will explain in more detail in the Results. Figure 1A depicts the situation in which there is negative cooperativity for binding to site 1 and positive cooperativity for binding to site 3, given that site 2 is bound.

Studies of feedback inhibition in metabolic pathways revealed that information to modulate binding could also be conveyed over long distances on a target molecule, beyond the reach of direct interactions (Changeux, 1961; Gerhart, 2014; Figure 1B). Monod and Jacob coined the term ‘allostery’ for this form of indirect cooperativity (Monod and Jacob, 1961). Monod, Wyman and Changeux (MWC) and, independently, Koshland, Némethy and Filmer (KNF) put forward equilibrium thermodynamic models, which showed how effective cooperativity could arise from the interplay between ligand binding and conformational change (Koshland et al., 1966; Monod et al., 1965). In the two-conformation MWC model (Figure 2B), there is no ‘intrinsic’ cooperativity—the binding sites are independent in each conformation—and ‘effective’ cooperativity arises as an emergent property of the dynamically interchanging ensemble of conformations.

Figure 2. — (A) Plots of the binding function, whose shape reflects the interactions between binding sites, as described in the text. (B) The Monod, Wyman and Changeux (MWC) model with a population of dimers in two quaternary conformations, with each monomer having one binding site and ligand binding shown by a solid black disc. The two monomers are considered to be distinguishable, leading to four microstates. Directed arrows show transitions between microstates. This picture anticipates the graph-theoretic representation used later in this paper. (C) Schematic of the end points of the allosteric pathway between the tense, fully deoxygenated and the relaxed, fully oxygenated conformations of a single haemoglobin tetramer, $α_{1} α_{2} β_{1} β_{2}$, showing the tertiary and quaternary changes, based on Figure 4 of Perutz, 1970. Haem group (red); oxygen (cyan disc); salt bridge (positive, magenta disc; negative, blue bar); DPG is 2,3-diphosphoglycerate.

In these studies, the effective cooperativity between sites was not quantitatively determined. Instead, the presence of cooperativity was inferred from the shape of the binding function, which is the average fraction of bound sites, or fractional saturation, as a function of ligand concentration (Figure 2A). The famous MWC formula is an expression for this binding function (Monod et al., 1965). If the sites are effectively independent, the binding function has a hyperbolic shape, similar to that of a Michaelis–Menten curve. A sigmoidal curve, which flattens first and then rises more steeply, indicates positive cooperativity, while a curve which rises steeply first and then flattens indicates negative cooperativity. Surprisingly, despite decades of study, the effective cooperativity of allostery is still largely assessed in this way, through the shape of the binding function, which is sometimes quantified in terms of a sensitivity or Hill coefficient. However, the shape of the binding function, and any associated Hill coefficient, are measures which aggregate over conformations and binding states, and they give little insight into how binding information is being integrated. To put it another way, the underlying free-energy landscape cannot be inferred from the shape of the binding function: as we will see below, different free-energy landscapes can give rise to indistinguishable binding functions. One of the contributions of this paper is to show how effective cooperativities can be quantified, providing thereby a set of parameters which collectively describe the allosteric free-energy landscape and placing allosteric information integration on a similar biophysical foundation to that provided by Pauling for direct interactions between two sites.
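The contrast between hyperbolic and sigmoidal binding functions can be illustrated with a minimal sketch of the classical two-conformation MWC fractional saturation for $n$ identical sites that are independent within each conformation. The formula is the standard textbook form rather than anything derived in this paper, and the function names, the parameter names (`L`, `K_R`, `K_T`) and the numerical values are illustrative assumptions.

```python
import math

def mwc_saturation(x, n, L, K_R, K_T):
    """Fractional saturation of the classical two-conformation MWC model.

    x: ligand concentration; n: number of identical sites;
    L: equilibrium ratio [T0]/[R0] of the unliganded conformations;
    K_R, K_T: per-site association constants in the R and T conformations.
    """
    r, t = K_R * x, K_T * x
    bound = r * (1 + r) ** (n - 1) + L * t * (1 + t) ** (n - 1)
    total = (1 + r) ** n + L * (1 + t) ** n
    return bound / total

def hill_coefficient(x, n, L, K_R, K_T, h=1e-6):
    """Local slope of log(Y / (1 - Y)) against log(x), by central difference."""
    def logit(c):
        y = mwc_saturation(c, n, L, K_R, K_T)
        return math.log(y / (1 - y))
    lo, hi = x * (1 - h), x * (1 + h)
    return (logit(hi) - logit(lo)) / (math.log(hi) - math.log(lo))

# Equal affinities in both conformations: the sites are effectively independent.
print(hill_coefficient(2.0, 4, 1000.0, 1.0, 1.0))   # close to 1 (hyperbolic)
# T binds 100-fold more weakly than R (assumed values): sigmoidal curve.
print(hill_coefficient(5.0, 4, 1000.0, 1.0, 0.01))  # greater than 1
```

The Hill coefficient computed here is a single number obtained by aggregating over both conformations and all binding states, which is why it says little about how the individual binding events influence one another.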

The MWC and KNF models are phenomenological: effective cooperativity arises as an emergent property of a conformational ensemble. This leaves open the question of how information is propagated between distant binding sites across a single molecule. This question was particularly relevant to haemoglobin, for which it had become clear that the haem groups were sufficiently far apart that direct interactions were implausible. Perutz’s X-ray crystallography studies of haemoglobin revealed a pathway of structural transitions during cooperative oxygen binding which linked one conformation to another (Figure 2C), thereby relating the single-molecule viewpoint to the ensemble viewpoint (Perutz, 1970). These pioneering studies provided important justification for key aspects of the MWC model, which has endured as one of the most successful mathematical models in biology (Changeux, 2013; Marzen et al., 2013).

Allostery was initially thought to be limited to certain symmetric protein oligomers like haemoglobin and to involve only a few, usually two, conformations. But Cooper and Dryden's theoretical demonstration that information could be conveyed by fluctuations around a dominant conformation anticipated the emergence of a more dynamical perspective (Cooper and Dryden, 1984; Henzler-Wildman and Kern, 2007). At the single-molecule level, it has been found that binding information can be conveyed over long distances by complex atomic networks, of which Perutz’s linear pathway (Figure 2C) is only a simple example (Schueler-Furman and Wodak, 2016; Kornev and Taylor, 2015; Knoverek et al., 2019; Wodak et al., 2019). These atomic networks may in turn underpin complex ensembles of conformations in many kinds of target molecules and allosteric regulation is now seen to be common to most cellular processes (Nussinov et al., 2013; Changeux and Christopoulos, 2016; Motlagh et al., 2014; Lorimer et al., 2018; Wodak et al., 2019; Ganser et al., 2019). The unexpected finding of widespread intrinsic disorder in proteins has been particularly influential in prompting a reassessment of the classical structure-function relationship, with conformations which may only be fleetingly present providing plasticity of binding to many partners (Wrabl et al., 2011; Wright and Dyson, 2015; Berlow et al., 2018).

However, while ensembles have grown greatly in complexity from MWC’s two conformations and new theoretical frameworks for studying them have been introduced (Wodak et al., 2019), the quantitative analysis of information integration has barely changed beyond pairwise cooperativity. In the present paper, we will be particularly concerned with higher-order cooperativities (HOCs) in which multiple binding events collectively modulate another binding site (Figure 1C). Such higher-order effects can be quantified by association constants, $K_{i, S}$, where the set $S$ has more than one bound site. The size of $S$, denoted by $\#(S)$, is the order of cooperativity, so that pairwise cooperativity may be considered as HOC of order 1. For the example in Figure 1C, the ratio, $K_{5, \{2, 4, 6\}} / K_{5, \emptyset}$, defines the non-dimensional HOC of order 3 for binding to site 5, given that sites 2, 4 and 6 are already bound. The notation used here is essential to express such higher-order concepts.

Higher-order effects have been discussed in previous studies (Dodd et al., 2004; Peeters et al., 2013; Martini, 2017; Gruber and Horovitz, 2018) and treated systematically in the mutant-cycle strategy developed in Horovitz and Fersht, 1990 and recently reviewed (Carter, 2017). The latter approach relies on perturbing residues or modules to unravel networks of energetic couplings within a macromolecule. It focusses on the single-molecule scale in contrast to the ensemble scale of the present paper (Figure 2). Mutant-cycle studies have confirmed the presence of substantial higher-order interactions underlying information propagation in proteins (Jain and Ranganathan, 2004; Sadovsky and Yifrach, 2007; Carter et al., 2017). The two approaches may be seen as different ways of analysing the free-energy landscape, as we explain in the Results.

HOCs were introduced in Estrada et al., 2016, where it was shown that experimental data on the sharpness of gene expression could not be accounted for purely in terms of pairwise cooperativities (Park et al., 2019a). In this context, the target molecule is the chromatin structure containing the relevant transcription factor (TF) binding sites and the analogue of the binding function is the steady-state probability of RNA polymerase being recruited, considered as a function of TF concentration (Estrada et al., 2016; Park et al., 2019a). The Hunchback gene considered in those studies, which is thought to have six binding sites for the TF Bicoid, requires HOCs up to order 5 to account for the data, under the assumption that the regulatory machinery is operating at thermodynamic equilibrium, without energy expenditure. An important problem emerging from this previous work, and one of the starting points for the present paper, is to identify a molecular mechanism capable of implementing such HOCs.

In the present paper, we show that allosteric conformational ensembles can implement any pattern of effective HOCs. Accordingly, they can implement any form of information integration that is achievable at thermodynamic equilibrium. We work at the ensemble level (Figure 2B) using a graph-based representation of Markov processes developed previously (below). We introduce a systematic method of ‘coarse graining’, which is likely to be broadly useful for other studies. This allows us to define the effective HOCs arising from any allosteric ensemble, no matter how complex. These effective HOCs provide a quantitative language in which the integrative capabilities of any ensemble can be specified. We show, in particular, that allosteric ensembles can account for the experimental data on Hunchback mentioned above, which was the problem that prompted the present study. It is straightforward to determine the binding function from the effective HOCs, and we derive a generalised MWC formula for an arbitrary ensemble, which recovers the functional perspective. Our results subsume and generalise previous findings and clarify issues which have been present since the concept of allostery was introduced. Our graph-based approach further enables general theorems to be rigorously proved for any ensemble (below), in contrast to calculation of specific models which has been the norm up to now.

Our analysis raises questions about how effective HOCs are implemented at the level of single molecules, similar to those answered by Perutz for haemoglobin and the MWC model (Figure 2C). This important problem lies outside the scope of the present paper and requires different methods (Wodak et al., 2019), such as the mutant-cycle approach mentioned above (Carter, 2017). Our analysis is also restricted to ensembles which are at thermodynamic equilibrium without expenditure of energy, as is generally assumed in studies of allostery. Energy expenditure may be present in maintaining a conformational ensemble, for example, through post-translational modification, but the significance of this has not been widely appreciated in the literature. Thermodynamic equilibrium sets fundamental physical limits on information processing in the form of ‘Hopfield barriers’ (Estrada et al., 2016; Biddle et al., 2019; Wong and Gunawardena, 2020). Energy expenditure can bypass these barriers and substantially enhance equilibrium capabilities. However, the study of non-equilibrium systems is more challenging and we must defer analysis of this interesting problem to subsequent work (Discussion).

The integration of binding information through cooperativities leads to the integration of biological functions. Haemoglobin offers a vivid example of how allostery implements this relationship. This one target molecule integrates two distinct functions, of taking up oxygen in the lungs and delivering oxygen to the tissues, by having two distinct conformations, each adapted to one of the functions, and dynamically interchanging between them. In the lungs, with a higher oxygen partial pressure, binding cooperativity causes the relaxed conformation to be dominant in the molecular population, which thereby takes up oxygen; in the tissues, with a lower oxygen pressure, binding cooperativity causes the tense conformation to be dominant in the population, which thereby gives up oxygen. Evolution may have used this integrative strategy more widely than just to transport oxygen, and we review in the Discussion some of the evidence for an analogy between functional integration by haemoglobin and by gene regulation.

Results

Construction of the allostery graph

Our approach uses the linear framework for timescale separation (Gunawardena, 2012), details of which are provided in the 'Materials and methods' along with further references. We briefly outline the approach here.

In the linear framework, a suitable biochemical system is described by a finite directed graph with labelled edges. In our context, graph vertices represent microstates of the target molecule and graph edges represent transitions between microstates, for which the edge labels are the instantaneous transition rates. A linear framework graph specifies a finite-state, continuous-time Markov process, and any reasonable such Markov process can be described by such a graph. We will be concerned with the probabilities of microstates at steady state. These probabilities can be interpreted in two ways, which reflect the ensemble and single-molecule viewpoints of Figure 2. From the ensemble perspective, the probability is the proportion of target molecules which are in the specified microstate, once the molecular population has reached steady state, considered in the limit of an infinite population. From the single-molecule perspective, the probability is the proportion of time spent in the specified microstate, in the limit of infinite time. The equivalence of these definitions comes from the ergodic theorem for Markov processes (Stroock, 2014). These different interpretations may be helpful when dealing with different biological contexts: a population of haemoglobin molecules may be considered from the ensemble viewpoint, while an individual gene may be considered from the single-molecule viewpoint. As far as the determination of probabilities is concerned, the two viewpoints are equivalent.

The graph representation may also be seen as a discrete approximation of a continuous energy landscape, as in Figure 3, in which the target molecule is moving deterministically on a high-dimensional landscape in response to a potential, while being buffeted stochastically through interactions with the surrounding thermal bath (Frauenfelder et al., 1991). In mathematics, this approximation goes back to the work of Wentzell and Freidlin on large deviation theory for stochastic differential equations in the low noise limit (Ventsel' and Freidlin, 1970; Freidlin and Wentzell, 2012). It has been exploited more recently to sample energy landscapes in chemical physics (Wales, 2006) and in the form of Markov State Models arising from molecular dynamics simulations (Noé and Fischer, 2008; Sengupta and Strodel, 2018). In this approximation, the vertices correspond to the minima of the free energy up to some energy cut-off, the edges correspond to appropriate limiting barrier crossings and the labels correspond to transition rates over the barrier.

The linear framework graph, or the accompanying Markov process, describes the time-dependent behaviour of the system. Our concern in the present paper is with systems which have reached a steady state of thermodynamic equilibrium, so that detailed balance, or microscopic reversibility, is satisfied. The assumption of thermodynamic equilibrium has been standard since allostery was introduced (Koshland et al., 1966; Monod et al., 1965) but has significant implications, as pointed out in the Introduction, and we will return to this issue in the Discussion. At thermodynamic equilibrium, we can dispense with dynamical information and work with what we call ‘equilibrium graphs’ (Figure 3). These are also directed graphs with labelled edges but the edge labels no longer contain dynamical information in the form of rates but rather ratios of forward to reverse rates. These ratios are determined by the minima of the free-energy landscape, with the equilibrium label on the edge from vertex $i$ to vertex $j$ being given by the formula in Figure 3 . Free energy is often expressed relative to a reference level, as we will do below, so it will be convenient to write the equilibrium label from $i$ to $j$ as

$$\exp\left(- \frac{Δ Φ_{j} - Δ Φ_{i}}{k_{B} T}\right), \tag{1}$$

where $Δ Φ_{u}$ is the relative free-energy of vertex $u$ , $k_{B}$ is Boltzmann’s constant and $T$ is the absolute temperature (Figure 3). Note that if the edge in question involves components from outside the graph itself, such as a ligand which binds to $i$ to yield $j$ , then the chemical potential of the ligand will contribute to the free energy. This contribution will manifest itself in the presence of a ligand concentration term in the edge label, as seen in Figure 4. The equilibrium edge labels are the only parameters needed at thermodynamic equilibrium and the free energies of the vertices can be recovered from them, up to an additive constant. From now on, in the main text, when we say ‘graph’, we will mean ‘equilibrium graph’.
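The relationship between relative free energies and equilibrium edge labels in Equation 1 can be sketched numerically as follows. The three vertices `a`, `b`, `c`, their free energies (in units of $k_{B} T$) and the function names are hypothetical, chosen only to show that reverse labels are reciprocals and that free energies are recovered from labels up to the choice of reference.

```python
import math

def equilibrium_label(dphi_i, dphi_j, kT=1.0):
    """Equation 1: label on the edge i -> j from the relative free energies."""
    return math.exp(-(dphi_j - dphi_i) / kT)

def relative_free_energy(path_labels, kT=1.0):
    """Free energy of the end of a path, taking the start vertex as zero."""
    return -kT * sum(math.log(label) for label in path_labels)

# Hypothetical relative free energies, in units of kT.
dphi = {"a": 0.0, "b": -1.0, "c": 0.5}

label_ab = equilibrium_label(dphi["a"], dphi["b"])  # exp(1): b is more stable than a
label_bc = equilibrium_label(dphi["b"], dphi["c"])  # exp(-1.5)
label_ba = equilibrium_label(dphi["b"], dphi["a"])  # reciprocal of label_ab

# Multiplying labels along a -> b -> c recovers the free energy of c relative to a.
recovered_c = relative_free_energy([label_ab, label_bc])
print(label_ab * label_ba, recovered_c)
```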

We explain such graphs using our main example. Figure 4 shows the graph, $A$, for an allosteric ensemble, with multiple conformations $c_{1}, \dots, c_{N}$ and multiple sites, $1, \dots, n$, for binding of a single ligand ($n = 3$ in the example). The graph vertices represent abstract conformations with patterns of ligand binding, denoted $(c_{k}, S)$, where the index $k$ designates the conformation with $1 \leq k \leq N$, and $S \subseteq \{1, \dots, n\}$ is the subset of bound sites. Directed edges represent transitions arising either from binding without change of conformation (‘vertical’ edges), $(c_{k}, S) \to (c_{k}, S \cup \{i\})$ where $i \notin S$, which occur for all conformations $c_{k}$, or from conformational change without binding (‘horizontal’ edges), $(c_{k}, S) \to (c_{j}, S)$ where $k \neq j$, which occur for all binding subsets $S$. Edges are shown in only one direction for clarity (when binding or unbinding is present, we use the direction of binding), but edges are always reversible, in accordance with thermodynamic equilibrium. Ignoring labels and thinking only in terms of vertices and edges, or ‘structure’, $A$ has a product form: the vertical subgraphs, $A^{c_{k}}$, consisting of those vertices with conformation $c_{k}$ and all edges between them, all have the same structure and the horizontal subgraphs, $A_{S}$, consisting of those vertices with binding subset $S$ and all edges between them, also all have the same structure (Figure 4). Structurally speaking, we can think of $A$ as the graph product (Ahsendorf et al., 2014) of the vertical subgraph $A^{c_{1}}$ and the horizontal subgraph $A_{\emptyset}$ (Figure 4).
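The product-form structure can be made concrete with a short sketch that enumerates the vertices and the two kinds of edges. The function name `allostery_graph` and the argument `horizontal_pairs`, which lists the pairs of conformations joined by an edge in the horizontal subgraph, are hypothetical; labels are ignored and only forward edges are listed, as in the description above.

```python
from itertools import combinations

def allostery_graph(N, n, horizontal_pairs):
    """Structure of an allostery graph with N conformations and n sites.

    Vertices are pairs (k, S): conformation index k and frozenset S of bound sites.
    horizontal_pairs: assumed list of (k, j) conformation pairs that interconvert.
    """
    sites = range(1, n + 1)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(sites, r)]
    vertices = [(k, S) for k in range(1, N + 1) for S in subsets]
    # Vertical edges: binding to a free site i without change of conformation.
    vertical = [((k, S), (k, S | {i}))
                for k in range(1, N + 1) for S in subsets
                for i in sites if i not in S]
    # Horizontal edges: conformational change with the bound subset S unchanged.
    horizontal = [((k, S), (j, S)) for (k, j) in horizontal_pairs for S in subsets]
    return vertices, vertical, horizontal

# Two conformations and three sites, with a single interconversion 1 <-> 2.
vertices, vertical, horizontal = allostery_graph(2, 3, [(1, 2)])
print(len(vertices), len(vertical), len(horizontal))
```

Every vertical subgraph has the same $2^{n}$ vertices and $n 2^{n-1}$ binding edges, and every horizontal subgraph repeats the pattern given by `horizontal_pairs`, which is the sense in which the structure is a product.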

In an allostery graph, ‘conformation’ is meant abstractly as any state for which binding association constants can be defined. It does not imply any particular atomic configuration of a target molecule nor make any commitments as to how the pattern of binding changes.

The product-form structure of the allostery graph reflects the ‘conformational selection’ viewpoint of MWC, in which conformations exist prior to ligand binding, rather than the ‘induced fit’ viewpoint of KNF, in which binding can induce new conformations. Considerable evidence now exists for conformational selection, in the form of transient, rarely populated conformations which exist prior to binding (Tzeng and Kalodimos, 2011). Induced fit may be incorporated within our graph-based approach by treating new conformations as always present but at extremely low probability. One of the original justifications for induced fit was that it enabled negative cooperativities, in contrast to conformational selection (Koshland and Hamadani, 2002), but we will show below that induced fit is not necessary for this and that negative HOCs arise naturally in our approach. Accordingly, the product-form structure of our allostery graphs is both convenient and powerful.

The edge labels are the non-dimensional ratios of the forward transition rate to the reverse transition rate; accordingly, the label for the reverse edge is the reciprocal of the label for the forward edge (Materials and methods). Labels may include the influence of components outside the graph, such as a binding ligand. For instance, the label for the binding edge $(c_{k}, S) \to (c_{k}, S \cup \{i\})$ is $x K_{c_{k}, i, S}$, where $x$ is the ligand concentration and $K_{c_{k}, i, S}$ is the association constant (Figure 1A), with dimensions of (concentration)⁻¹, as described in the Introduction. Horizontal edge labels are not individually annotated and need only be specified for the horizontal subgraph of empty conformations, $A_{\emptyset}$, since all other labels are determined by detailed balance (Materials and methods).
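To see how detailed balance fixes the remaining horizontal labels, consider the simplified case, assumed here purely for illustration, in which the sites are independent within each conformation. Going around the cycle formed by binding in $c_{k}$, changing conformation and unbinding in $c_{j}$ shows that the label on $(c_{k}, S) \to (c_{j}, S)$ equals the label on the empty edge multiplied by the ratio of association constants for each site in $S$; the ligand concentration cancels. The function name and the numerical values are hypothetical.

```python
def horizontal_label(label_empty, K_from, K_to, S):
    """Label on (c_k, S) -> (c_j, S), given the label on (c_k, {}) -> (c_j, {}).

    K_from, K_to: per-site association constants in c_k and c_j respectively,
    assuming independent sites within each conformation.
    """
    label = label_empty
    for i in S:
        label *= K_to[i] / K_from[i]
    return label

# MWC-like example with two sites: R -> T has label 1000 when nothing is bound,
# and T binds each site 100-fold more weakly than R (assumed values).
K_R = {1: 1.0, 2: 1.0}
K_T = {1: 0.01, 2: 0.01}
label_both_bound = horizontal_label(1000.0, K_R, K_T, {1, 2})
print(label_both_bound)  # about 0.1: with both sites bound, R is favoured over T
```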

The graph structure allows HOCs between binding events to be calculated, as suggested in the Introduction. We will define this first for the ‘intrinsic’ HOCs which arise in a given conformation and explain in the next section how ‘effective’ HOCs are defined for the ensemble. In conformation $c_{k}$, the intrinsic HOC for binding to site $i$, given that the sites in $S$ are already bound, denoted $ω_{c_{k}, i, S}$, is defined by normalising the corresponding association constant to that for binding to site $i$ when nothing else is bound (Estrada et al., 2016),

$$ω_{c_{k}, i, S} = \frac{K_{c_{k}, i, S}}{K_{c_{k}, i, \emptyset}} . \tag{2}$$

HOCs are non-dimensional quantities. If $S$ has only a single site, say $S = \{j\}$, then the intrinsic HOC of order 1, $ω_{c_{k}, i, \{j\}}$, is the classical pairwise cooperativity between sites $i$ and $j$. There is positive or negative intrinsic HOC if $ω_{c_{k}, i, S} > 1$ or $ω_{c_{k}, i, S} < 1$, respectively, and independence if $ω_{c_{k}, i, S} = 1$ (Figure 1A).
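A minimal sketch of Equation 2 for a single conformation is given below. The association constants are stored in a dictionary keyed by the site and the set of sites already bound; the dictionary layout, the function names and the numerical values are assumptions made for illustration.

```python
def intrinsic_hoc(K, i, S):
    """Equation 2: HOC for binding to site i, given that the sites in S are bound."""
    return K[(i, frozenset(S))] / K[(i, frozenset())]

def classify(omega, tol=1e-12):
    """Conventional terminology for a cooperativity value."""
    if abs(omega - 1.0) <= tol:
        return "independent"
    return "positive" if omega > 1.0 else "negative"

# Hypothetical association constants for site 1, in units of 1/concentration.
K = {
    (1, frozenset()): 2.0,        # nothing else bound
    (1, frozenset({2})): 6.0,     # site 2 already bound
    (1, frozenset({2, 3})): 0.5,  # sites 2 and 3 already bound
}

omega_order_1 = intrinsic_hoc(K, 1, {2})     # pairwise cooperativity
omega_order_2 = intrinsic_hoc(K, 1, {2, 3})  # HOC of order 2
print(omega_order_1, classify(omega_order_1))
print(omega_order_2, classify(omega_order_2))
```

In this made-up example, site 2 alone enhances binding at site 1, whereas sites 2 and 3 together suppress it, a pattern that cannot be expressed with pairwise cooperativities only.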

For any graph $G$ , the steady-state probabilities of the vertices can be calculated from the edge labels. For each vertex, $v$ , in $G$ , the probability, ${Pr}_{v} (G)$ , is proportional to the quantity, $μ_{v} (G)$ , obtained by multiplying the edge labels along any directed path of edges from a fixed reference vertex to $v$ . It is a consequence of detailed balance that $μ_{v} (G)$ does not depend on the choice of path in $G$ . This implies algebraic relationships among the edge labels. These can be fully determined from $G$ and independent sets of parameters can be chosen (Materials and methods). For the allostery graph, a convenient choice vertically is those association constants $K_{c_{k}, i, S}$ with $i$ less than all the sites in $S$ , denoted $i < S$ ; horizontal choices are discussed in the Materials and methods but are not needed for the main text.

Since probabilities must add up to 1, it follows that

$$\mathrm{Pr}_{v} (G) = \frac{μ_{v} (G)}{\sum_{u \in G} μ_{u} (G)} . \tag{3}$$

Equation 3 yields the same result as equilibrium statistical mechanics, with the denominator being the partition function for the thermodynamic grand canonical ensemble. Equilibrium statistical mechanics typically focusses only on vertices and uses their free energies as the fundamental parameters. Directed graphs of the form considered here were previously used in Hill, 1966 and Schnakenberg, 1976 to study systems away from thermodynamic equilibrium, where the graph edges become essential to represent entropy production (Wong and Gunawardena, 2020). We find that the graph remains just as useful at thermodynamic equilibrium because binding and unbinding are the fundamental mechanisms through which information is integrated and these mechanisms must be represented by graph edges. Indeed, as the next section shows, graphs are invaluable for formulating higher-order concepts.
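The calculation behind Equation 3 can be sketched for a small equilibrium graph. The function below multiplies edge labels along paths from a reference vertex, found by breadth-first search, checks that every edge is consistent with the result (so that the choice of path does not matter) and then normalises. The function name, the assumption that the graph is connected and the numerical values for the two-site example are illustrative.

```python
from collections import deque

def steady_state(edges, reference):
    """Steady-state probabilities of an equilibrium graph (Equation 3).

    edges: dict {(u, v): label} of forward edges; reverse labels are reciprocals.
    The graph is assumed to be connected.
    """
    labels = dict(edges)
    for (u, v), label in edges.items():
        labels[(v, u)] = 1.0 / label
    neighbours = {}
    for (u, v) in labels:
        neighbours.setdefault(u, []).append(v)
    mu = {reference: 1.0}
    queue = deque([reference])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in mu:
                mu[v] = mu[u] * labels[(u, v)]
                queue.append(v)
    # Path independence: every edge must agree with the computed mu values.
    for (u, v), label in labels.items():
        if abs(mu[u] * label - mu[v]) > 1e-9 * mu[v]:
            raise ValueError("edge labels violate detailed balance")
    Z = sum(mu.values())  # partition function
    return {v: m / Z for v, m in mu.items()}

# Two sites in one conformation, ligand concentration x = 0.5 (assumed values).
x = 0.5
E, S1, S2, S12 = frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})
edges = {
    (E, S1): x * 2.0,    # K_{1, {}}  = 2
    (E, S2): x * 4.0,    # K_{2, {}}  = 4
    (S1, S12): x * 8.0,  # K_{2, {1}} = 8
    (S2, S12): x * 4.0,  # K_{1, {2}} = 4, so K_{1,{}} K_{2,{1}} = K_{2,{}} K_{1,{2}}
}
pr = steady_state(edges, E)
print(pr[E], pr[S1], pr[S2], pr[S12])
```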

Our specification of an allostery graph allows for arbitrary conformational complexity and arbitrary interacting ligands (we consider only one ligand here for simplicity), with the independent association constants in each conformation being arbitrary and with arbitrary changes in these parameters between conformations. Moreover, the abstract nature of ‘conformation’, as described above, permits substantial generality. Allostery graphs can be formulated to encompass the two conformations of MWC (Marzen et al., 2013), nested models (Robert et al., 1987), the fluctuations of Cooper and Dryden, 1984 and more recent views of dynamical allostery (Tzeng and Kalodimos, 2011), the multiple domains of the Ensemble Allosteric Model developed by Hilser and colleagues (Hilser et al., 2012) and applied also to intrinsically disordered proteins (Motlagh et al., 2012), other ensemble models (LeVine and Weinstein, 2015; Tsai and Nussinov, 2014) and Markov State Models arising from molecular dynamics simulations (Noé and Fischer, 2008).

Relationships between higher-order measures

As mentioned in the Introduction, a systematic approach to higher-order effects using mutant-cycle analysis was developed in Horovitz and Fersht, 1990 and Horovitz and Fersht, 1992 and widely used subsequently (Carter, 2017). The HOCs presented above were introduced in our previous work (Estrada et al., 2016), and the present paper is concerned not with HOCs per se, but with effective HOCs that arise from an allosteric ensemble, as will be described below. Nevertheless, it may still be helpful to explain the relationship between our HOCs and the higher-order couplings arising from mutant-cycle analysis. We are grateful to an anonymous reviewer for making this point to us. The material which follows may be of particular interest to those familiar with the relevant literature but is not required for the main results of the paper.

Both HOCs and higher-order couplings can be seen as different ways of analysing the underlying free-energy landscape. Both approaches make essential use of directed graphs to organise this landscape. Figure 5A shows the labelled equilibrium graph for ligand binding to three sites in a single conformation, while Figure 5B shows a directed graph of the kind used in Horovitz and Fersht, 1990 for defining higher-order couplings for perturbations to three sites. The latter graphs are sometimes called ‘boxes’ (Horovitz and Fersht, 1990). We use ‘sites’ here for either individual residues or the modules described in Carter, 2017. Perturbations are typically mutations, such as replacement of an asparagine residue by alanine. The choice of replacement can make a difference to the results, but this is not usually depicted in graph representations like Figure 5B. The directed edges have rather different interpretations in the two examples in Figure 5: for the equilibrium graph in Figure 5A, a directed edge represents the biochemical process of ligand binding; for the coupling graph in Figure 5B, a directed edge represents an experimental perturbation. In both cases, the vertices have an associated free energy, denoted $Δ Φ_{S}$, where $S \subseteq \{1, \dots, n\}$ is either the subset of bound sites in the equilibrium graph (Figure 5A) or the subset of perturbed sites in the coupling graph (Figure 5B). The $Δ$ notation is conventionally used in the literature to signify a free-energy difference (Equation 1) or free energy relative to a chosen zero level. A frequent choice of zero is the free energy of empty binding or of the unperturbed state, in which case $Δ Φ_{\emptyset} = 0$, but we have not assumed this here. Note that the free energies of the equilibrium graph have a contribution from the ligand, which manifests itself in the dependence of the edge labels on the ligand concentration, $x$, while the free energies of the coupling graph do not. Despite this difference, the free energies provide in both cases the fundamental independent thermodynamic parameters, of which there are $2^{n} - 1$ for $n$ sites, in terms of which both HOCs and higher-order couplings can be rigorously defined.

Figure 5. — (A) Equilibrium graph, similar to those in Figure 4, for binding of a ligand to three sites on a single conformation, ordered as shown at the base, and annotated with edge labels. The single conformation has been omitted from subscripts for clarity. (B) Directed graph used to define higher-order couplings, for a macromolecule with three sites or modules (solid squares), ordered as shown at the base, with perturbations indicated by blue colour in place of black. Vertices are annotated with the corresponding free energy.

The definition is easiest for HOCs. Equation 1 tells us that the edge label, $x K_{i, S}$ , is given by

x K_{i, S} = \exp (- \frac{Δ Φ_{S \cup {i}} - Δ Φ_{S}}{k_{B} T}) .

(4)

We omit the single conformation from subscripts for clarity. It follows from Equation 2 that HOCs can be written in terms of free energies as follows:

ω_{i, S} = \exp (- \frac{(Δ Φ_{S \cup {i}} - Δ Φ_{S}) - (Δ Φ_{{i}} - Δ Φ_{\emptyset})}{k_{B} T}) .

(5)

HOCs are non-dimensional quantities associated to graph edges. As noted above, there are algebraic relationships among them arising from detailed balance at thermodynamic equilibrium. An independent set of parameters is formed by restricting to those for which $i < S$ , of which there are $2^{n} - n - 1$ . Taken together with the $n$ ‘bare’ association constants for initial ligand binding, $K_{i, \emptyset}$ , they form a complete set of $2^{n} - 1$ independent parameters for the free-energy landscape. It follows from Equations 4 and 5 that these parameters can be used to recover the fundamental free energies, so that the two sets of parameters are mathematically equivalent.
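The equivalence between the two parameter sets can be checked numerically. The following is a minimal sketch, not part of the original analysis: it draws arbitrary free energies for $n = 3$ sites, computes edge labels and HOCs from Equations 4 and 5, and then rebuilds each free energy from the bare edge labels and the independent HOCs alone. The names `subsets`, `edge_label`, `hoc` and `recover_phi`, the choice $k_{B} T = 1$ and the random free energies are all illustrative assumptions. The dictionary `bare` holds the edge labels $x K_{i, \emptyset}$ at a fixed, unspecified ligand concentration.

```python
import itertools
import math
import random

KT = 1.0  # k_B T in arbitrary energy units (illustrative assumption)

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def edge_label(phi, i, S):
    """Edge label x*K_{i,S} from Equation 4."""
    return math.exp(-(phi[S | {i}] - phi[S]) / KT)

def hoc(phi, i, S):
    """HOC omega_{i,S} from Equation 5, written as a ratio of edge labels."""
    return edge_label(phi, i, S) / edge_label(phi, i, frozenset())

def recover_phi(phi_empty, bare, omega, S):
    """Rebuild the free energy of vertex S from bare labels and independent HOCs.

    Sites are bound in decreasing order, so each step uses omega_{i,Z} with i < Z.
    """
    total, Z = phi_empty, frozenset()
    for i in sorted(S, reverse=True):
        total -= KT * math.log(bare[i] * omega[(i, Z)])
        Z = Z | {i}
    return total

random.seed(0)
n = 3
sites = range(1, n + 1)
phi = {S: random.uniform(-2.0, 2.0) for S in subsets(sites)}  # hypothetical landscape

bare = {i: edge_label(phi, i, frozenset()) for i in sites}
omega = {(i, Z): hoc(phi, i, Z)
         for Z in subsets(sites) for i in sites
         if all(i < j for j in Z)}
independent = [key for key in omega if key[1]]  # drop omega_{i,empty}, which is 1
max_error = max(abs(recover_phi(phi[frozenset()], bare, omega, S) - phi[S])
                for S in subsets(sites))
```

Binding the sites of $S$ in decreasing order is one convenient path through the graph; by detailed balance any other path would give the same free energy, but it would involve HOCs outside the independent set.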

Mutant-cycle studies often refer to both Horovitz and Fersht, 1990 and Horovitz and Fersht, 1992, which present apparently different measures of higher-order coupling. The second of these papers introduces what we will refer to as the ‘residual free energy’ of a vertex and denote $Δ ϕ_{S}$ . This is the free energy remaining at vertex $S$ after accounting for the contributions from all proper subsets of $S$ . The residual free energy may be concisely defined recursively, starting from $Δ ϕ_{\emptyset} = Δ Φ_{\emptyset}$ , by

Δ ϕ_{S} = Δ Φ_{S} - (\sum_{X \subset S} Δ ϕ_{X}) .

(6)

We see from Equation 6 that $Δ ϕ_{{i}} = Δ Φ_{{i}} - Δ Φ_{\emptyset}$ and that $Δ ϕ_{{i, j}} = Δ Φ_{{i, j}} - (Δ Φ_{{i}} + Δ Φ_{{j}}) + Δ Φ_{\emptyset}$ . $Δ ϕ_{S}$ may be calculated directly from $Δ Φ_{X}$ but, as the previous example suggests, overlapping contributions of the actual free energies must be cancelled out (Horovitz and Fersht, 1992, Equation 4),

Δ ϕ_{S} = \sum_{0 \leq k \leq # (S)} {(- 1)}^{# (S) - k} (\sum_{Y \subseteq S, # (Y) = k} Δ Φ_{Y}) .

(7)

To see why Equation 7 is a consequence of Equation 6, note first that Equation 7 gives the correct result for $S = \emptyset$ . It may then be recursively checked by assuming it holds for $X \subset S$ and substituting into Equation 6 to check that it holds for $S$ . Each subset $Y \subset S$ contributes a term $\pm Δ Φ_{Y}$ arising from $Δ ϕ_{X}$ for each $X$ that satisfies $Y \subseteq X \subset S$ . The sign of $Δ Φ_{Y}$ coming from Equation 7 is ${(- 1)}^{# (X) - # (Y)}$ . These terms almost completely cancel each other out because, letting $p = # (S) - # (Y)$ ,

\sum_{Y \subseteq X \subset S} (- 1)^{# (X) - # (Y)} = \sum_{V \subset S ∖ Y} (- 1)^{# (V)} = \sum_{0 \leq j < p} (\begin{matrix} p \\ j \end{matrix}) (- 1)^{j} = (- 1)^{p + 1} .

Taking into account the additional sign coming from Equation 6, we recover Equation 7 for $S$ . This proves recursively that Equation 7 is the solution of Equation 6 in terms of free energies.
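The agreement between the recursive definition and the closed form can also be confirmed by direct computation. The sketch below is illustrative only: the function names and the randomly drawn free energies for four sites are assumptions made for the example.

```python
import itertools
import random

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def residual_recursive(phi, S, cache):
    """Equation 6: free energy at S minus the residuals of all proper subsets."""
    if S not in cache:
        cache[S] = phi[S] - sum(residual_recursive(phi, X, cache)
                                for X in subsets(S) if X != S)
    return cache[S]

def residual_closed(phi, S):
    """Equation 7: alternating sum of free energies over all subsets of S."""
    return sum((-1) ** (len(S) - len(Y)) * phi[Y] for Y in subsets(S))

random.seed(1)
sites = range(1, 5)
phi = {S: random.uniform(-2.0, 2.0) for S in subsets(sites)}  # hypothetical landscape
cache = {}
max_gap = max(abs(residual_recursive(phi, S, cache) - residual_closed(phi, S))
              for S in subsets(sites))
```

For the empty set the recursion has no proper subsets to subtract, so it returns $Δ Φ_{\emptyset}$ , which matches the starting condition of Equation 6.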

We can go further to show how $Δ ϕ_{S}$ is expressed in terms of HOCs. For this, we must assume that $q = # (S) > 1$ . When $q = 1$ , ligand binding contributes to $Δ ϕ_{S}$ , but when $q > 1$ that is no longer the case, as we will see. Choose any site $i \in S$ . The summation in Equation 7 involves $2^{q}$ terms $Δ Φ_{Y}$ . It can be reorganised into a sum of $2^{q - 1}$ terms of the form $\pm (Δ Φ_{Z \cup {i}} - Δ Φ_{Z})$ , where $Z \subseteq S ∖ {i}$ . The sign of these terms is given by the sign of $Δ Φ_{Z \cup {i}}$ coming from Equation 7 and is therefore ${(- 1)}^{# (S) - # (Z) - 1}$ . Because $q > 1$ , the set $S ∖ {i}$ is non-empty and so has as many subsets $Z$ of even size as of odd size; hence there must be equal numbers of +1 and −1 signs. It follows from Equation 4 that

\exp (- \frac{Δ ϕ_{S}}{k_{B} T}) = \prod_{Z \subseteq S ∖ {i}} (x K_{i, Z})^{(- 1)^{# (S) - # (Z) - 1}},

where the double exponent just means that the right-hand side is a ratio in which those terms for which $# (S) - # (Z)$ is odd go in the numerator and those terms for which $# (S) - # (Z)$ is even go in the denominator. Using Equation 2, we can rewrite $K_{i, Z}$ as $K_{i, \emptyset} ω_{i, Z}$ . Since there are equal numbers of each sign, we can cancel each occurrence of $x K_{i, \emptyset}$ between numerator and denominator to yield a formula for residual free energies in terms of HOCs when $# (S) > 1$ :

\exp (- \frac{Δ ϕ_{S}}{k_{B} T}) = \prod_{Z \subseteq S ∖ {i}} (ω_{i, Z})^{(- 1)^{# (S) - # (Z) - 1}} .

(8)

The choice of $i \in S$ in Equation 8 is arbitrary. As an illustration of Equation 8, recalling from Equation 5 that $ω_{i, \emptyset} = 1$ , we see that

\exp (- \frac{Δ ϕ_{{i_{1}, i_{2}}}}{k_{B} T}) = ω_{i_{1}, {i_{2}}}, \exp (- \frac{Δ ϕ_{{i_{1}, i_{2}, i_{3}}}}{k_{B} T}) = \frac{ω_{i_{1}, {i_{2}, i_{3}}}}{ω_{i_{1}, {i_{2}}} ω_{i_{1}, {i_{3}}}} .

(9)

Equations 8 and 9 show how the residual free energy is built up from binding at any given site to the hierarchy of subsets of the remaining sites.
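Equation 8 can be checked against Equation 7 for every subset with more than one site and every choice of $i \in S$ . The sketch below is a hedged illustration with assumed names, $k_{B} T = 1$ and arbitrary random free energies; it is not taken from the original calculations.

```python
import itertools
import math
import random

KT = 1.0  # k_B T in arbitrary energy units (illustrative assumption)

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def hoc(phi, i, S):
    """HOC omega_{i,S} from Equation 5."""
    binding_at_S = phi[S | {i}] - phi[S]
    binding_at_empty = phi[frozenset({i})] - phi[frozenset()]
    return math.exp(-(binding_at_S - binding_at_empty) / KT)

def residual_closed(phi, S):
    """Residual free energy from Equation 7."""
    return sum((-1) ** (len(S) - len(Y)) * phi[Y] for Y in subsets(S))

def residual_from_hocs(phi, S, i):
    """Equation 8 solved for the residual free energy; needs #(S) > 1 and i in S."""
    log_product = sum((-1) ** (len(S) - len(Z) - 1) * math.log(hoc(phi, i, Z))
                      for Z in subsets(S - {i}))
    return -KT * log_product

random.seed(2)
sites = range(1, 5)
phi = {S: random.uniform(-2.0, 2.0) for S in subsets(sites)}  # hypothetical landscape
max_gap = max(abs(residual_from_hocs(phi, S, i) - residual_closed(phi, S))
              for S in subsets(sites) if len(S) > 1 for i in S)
```

Looping over every $i \in S$ also exercises the statement that the choice of site in Equation 8 is arbitrary.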

Residual free energies can be thought of as a measure of collective synergy between sites (Horovitz and Fersht, 1992). They are associated to graph vertices and constitute $2^{n} - 1$ independent parameters, with no algebraic relationships between them. It follows from Equations 6 and 7 that they are mathematically equivalent to the fundamental free energies. Residual free energies have also been independently described for other purposes in Equation 4 of Martini, 2017.

The higher-order couplings introduced in Horovitz and Fersht, 1990 appear at first sight to be quite different from the residual free energies introduced in Horovitz and Fersht, 1992. The couplings are described by examples for low orders, as are typically encountered in practice (Horovitz and Fersht, 1990). We provide a general definition here by introducing a slightly more complex version. A coupling is associated to a pair, consisting of, first, a vertex, $Z \subseteq {1, \dots, n}$ , and, second, an ordered sequence of distinct sites, $(i_{1}, \dots, i_{k})$ , none of which are in $Z$ , so that $Z \cap {i_{1}, \dots, i_{k}} = \emptyset$ . The vertex $Z$ should be thought of as an ‘offset’ within the coupling graph and the sites, $i_{1}, \dots, i_{k}$ as specifying an ordered sequence of perturbations undertaken around $Z$ . Higher-order couplings are conventionally used in the literature only for $Z = \emptyset$ , but this more complex version is needed for the definition in Equation 11 below. Associated to such a pair $Z, (i_{1}, \dots, i_{k})$ is a $k$ th order coupling, which we will denote by $Δ^{k} γ_{Z, (i_{1}, \dots, i_{k})}$ . We start by defining the first-order coupling, $Δ^{1} γ_{Z, (i_{1})}$ , for any $Z$ satisfying the restriction above, in terms of the free energy,

Δ^{1} γ_{Z, (i_{1})} = Δ Φ_{Z \cup {i_{1}}} - Δ Φ_{Z} .

(10)

With that in hand, we can define for $k \geq 2$ , again for any $Z$ satisfying the restriction

Δ^{k} γ_{Z, (i_{1}, \dots, i_{k})} = Δ^{k - 1} γ_{Z \cup {i_{k}}, (i_{1}, \dots, i_{k - 1})} - Δ^{k - 1} γ_{Z, (i_{1}, \dots, i_{k - 1})},

(11)

where it is clear that $Z \cup {i_{k}}$ must be disjoint from ${i_{1}, \dots, i_{k - 1}}$ , so that the right-hand side of Equation 11 is recursively well defined. Unravelling Equations 11 and 10, we see that

Δ^{2} γ_{Z, (i_{1}, i_{2})} = Δ^{1} γ_{Z \cup {i_{2}}, (i_{1})} - Δ^{1} γ_{Z, (i_{1})} = Δ Φ_{Z \cup {i_{1}, i_{2}}} - Δ Φ_{Z \cup {i_{2}}} - (Δ Φ_{Z \cup {i_{1}}} - Δ Φ_{Z}),

(12)

which corresponds when $Z = \emptyset$ to Equation 1 of Horovitz and Fersht, 1990. With some more work, it can be seen that Equation 11 reproduces the $k = 3$ and $k = 4$ examples in Horovitz and Fersht, 1990. Equation 12 expresses the intuition behind higher-order coupling, that it measures the effect of a perturbation relative to the unperturbed state, hierarchically for a sequence of perturbations.

It can be seen quite easily from Equations 5 and 12 that

\exp (- \frac{Δ^{2} γ_{Z, (i_{1}, i_{2})}}{k_{B} T}) = \frac{ω_{i_{1}, Z \cup {i_{2}}}}{ω_{i_{1}, Z}} .

(13)

We note from Equation 13 that ‘order’ is counted differently between HOCs and conventional higher-order couplings: when $Z = \emptyset$ , Equation 13 relates a higher-order coupling with $k = 2$ to a HOC of order 1. Substituting Equation 13 into Equation 11 and continuing the recursion, we find that

\exp (- \frac{Δ^{3} γ_{Z, (i_{1}, i_{2}, i_{3})}}{k_{B} T}) = \frac{ω_{i_{1}, Z \cup {i_{2}, i_{3}}} ω_{i_{1}, Z}}{ω_{i_{1}, Z \cup {i_{2}}} ω_{i_{1}, Z \cup {i_{3}}}},

at which point the similarity with Equation 9 becomes evident and the pattern emerges. It can be shown by direct substitution in Equation 11 that the following general formula holds, which expresses higher-order couplings in terms of HOCs for any $k \geq 2$ :

\exp (- \frac{Δ^{k} γ_{Z, (i_{1}, \dots, i_{k})}}{k_{B} T}) = \prod_{X \subseteq {i_{2}, \dots, i_{k}}} (ω_{i_{1}, Z \cup X})^{(- 1)^{k - 1 - # (X)}} .

(14)
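As a numerical check of Equation 14, the recursion in Equations 10 and 11 can be evaluated directly and compared with the product of HOCs, including for a non-empty offset $Z$ . The sketch below uses assumed function names, $k_{B} T = 1$ and arbitrary random free energies on five sites.

```python
import itertools
import math
import random

KT = 1.0  # k_B T in arbitrary energy units (illustrative assumption)

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def hoc(phi, i, S):
    """HOC omega_{i,S} from Equation 5."""
    binding_at_S = phi[S | {i}] - phi[S]
    binding_at_empty = phi[frozenset({i})] - phi[frozenset()]
    return math.exp(-(binding_at_S - binding_at_empty) / KT)

def coupling(phi, Z, seq):
    """k-th order coupling at offset Z, from Equations 10 and 11."""
    if len(seq) == 1:
        return phi[Z | {seq[0]}] - phi[Z]
    return coupling(phi, Z | {seq[-1]}, seq[:-1]) - coupling(phi, Z, seq[:-1])

def coupling_from_hocs(phi, Z, seq):
    """Equation 14 solved for the coupling; needs k >= 2."""
    k, first, rest = len(seq), seq[0], seq[1:]
    log_product = sum((-1) ** (k - 1 - len(X)) * math.log(hoc(phi, first, Z | X))
                      for X in subsets(rest))
    return -KT * log_product

random.seed(3)
phi = {S: random.uniform(-2.0, 2.0) for S in subsets(range(1, 6))}  # hypothetical
cases = [(frozenset({5}), (1, 2)),
         (frozenset({5}), (1, 2, 3)),
         (frozenset(), (2, 4, 1, 3))]
gaps = [abs(coupling(phi, Z, seq) - coupling_from_hocs(phi, Z, seq))
        for Z, seq in cases]
```

For $k = 3$ the product in Equation 14 has four factors, one for each subset $X \subseteq {i_{2}, i_{3}}$ ; the factor for $X = \emptyset$ is $ω_{i_{1}, Z}$ , which equals 1 only when $Z = \emptyset$ .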

Comparing Equation 14 with Equation 8 we see that, despite their very different definitions in Equations 11 and 6, conventional higher-order couplings are the same as residual free energies. Indeed, for $k \geq 1$ ,

Δ^{k} γ_{\emptyset, (i_{1}, \dots, i_{k})} = Δ ϕ_{{i_{1}, \dots, i_{k}}} .

(15)

Equation 15 may seem strange because a higher-order coupling is defined in terms of an ordered sequence of perturbations, $(i_{1}, \dots, i_{k})$ , while a residual free energy depends only on the subset of sites, ${i_{1}, \dots, i_{k}}$ , without respect to the order of sites. It is a consequence of detailed balance at thermodynamic equilibrium that the order in which the perturbations are undertaken does not matter. For example, it is clear from Equation 12 that $Δ^{2} γ_{\emptyset, (i_{1}, i_{2})} = Δ^{2} γ_{\emptyset, (i_{2}, i_{1})}$ . More generally, if ρ is any permutation of the perturbed sites, so that ρ is a bijective function, $ρ : {i_{1}, \dots, i_{k}} \to {i_{1}, \dots, i_{k}}$ , then it can be shown that

Δ^{k} γ_{Z, (i_{1}, \dots, i_{k})} = Δ^{k} γ_{Z, (ρ (i_{1}), \dots, ρ (i_{k}))} .

(16)

Note that Equation 16 follows from Equation 15 when $Z = \emptyset$ . This property of invariance under permutation is referred to as ‘symmetry’ in Horovitz and Fersht, 1990 and is similar to the algebraic relations which give rise to the independent HOCs, $ω_{i, S}$ with $i < S$ , as described previously.
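Equations 15 and 16 lend themselves to a direct numerical check. The following sketch, with assumed names and arbitrary random free energies on four sites, evaluates the coupling for every ordering of three perturbed sites, both at the empty offset and at a non-empty one.

```python
import itertools
import random

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def residual_closed(phi, S):
    """Residual free energy from Equation 7."""
    return sum((-1) ** (len(S) - len(Y)) * phi[Y] for Y in subsets(S))

def coupling(phi, Z, seq):
    """k-th order coupling at offset Z, from Equations 10 and 11."""
    if len(seq) == 1:
        return phi[Z | {seq[0]}] - phi[Z]
    return coupling(phi, Z | {seq[-1]}, seq[:-1]) - coupling(phi, Z, seq[:-1])

random.seed(4)
phi = {S: random.uniform(-2.0, 2.0) for S in subsets(range(1, 5))}  # hypothetical
perturbed = (1, 2, 3)
offset = frozenset({4})

at_origin = [coupling(phi, frozenset(), order)
             for order in itertools.permutations(perturbed)]
at_offset = [coupling(phi, offset, order)
             for order in itertools.permutations(perturbed)]
target = residual_closed(phi, frozenset(perturbed))

gap_eq15 = max(abs(value - target) for value in at_origin)  # Equation 15
spread_eq16 = max(at_offset) - min(at_offset)               # Equation 16
```

Both quantities should vanish up to floating-point rounding, for any choice of free energies.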

The equality between the higher-order couplings introduced in Horovitz and Fersht, 1990 and the residual free energies introduced in Horovitz and Fersht, 1992, as described in Equation 15, is presumably well known to those in the field. It seems to be implicitly assumed in Horovitz and Fersht, 1992, but we have not found a clear statement of it in the literature. It would be difficult to formulate one in the absence of a general definition of higher-order coupling, as we have given in Equation 11. The formulas above may therefore be of some value in offering a rigorous treatment.

Each of the measures we have discussed, HOCs, residual free energies and higher-order couplings, offers a different way of analysing the free-energy landscape using the graphs in Figure 5. HOCs are associated to graph edges; residual free energies are associated to graph vertices; and higher-order couplings are associated to sequences of sites, at least when symmetries are ignored. As we have seen above, the three measures are mathematically equivalent. However, they are useful for different purposes. HOCs tell us about the integration of binding information; residual free energies capture the collective synergy between sets of sites; and higher-order couplings show how these same synergies can be extracted from a sequence of experimental perturbations. One advantage of HOCs is that they are non-dimensional quantities in terms of which it is straightforward to calculate the other measures. By doing so, we were able to show rigorously that higher-order couplings are also residual free energies (Equation 15).

Having explained how various higher-order measures are related to each other, we return to the question of how effective cooperativity arises from allosteric ensembles with multiple conformations. For this problem, HOCs are much easier to use than either residual free energies or higher-order couplings. With Equations 8 and 14 now available, effective residual free energies or effective higher-order couplings may be calculated from the effective HOCs that we construct below, but we will not exploit this capability in the present paper.

Coarse graining yields effective HOCs

As MWC showed, even if there is no intrinsic cooperativity in any conformation, an effective cooperativity can arise from the ensemble. This is usually detected in the shape of the binding function (Figure 2A). Here, we introduce a method of coarse graining through which effective cooperativities can be rigorously defined. We illustrate this for the allostery graph, $A$ , and explain the general coarse-graining method in the Materials and methods. For allostery, the idea is to treat the horizontal subgraphs, $A_{S}$ , as the vertices of a new coarse-grained graph, $A^{ϕ}$ , (Figure 4, bottom right). There is an edge between two vertices in $A^{ϕ}$ , if, and only if, there is an edge in $A$ between the corresponding horizontal subgraphs. It is not hard to see that $A^{ϕ}$ is identical in structure to any of the vertical subgraphs $A^{c_{k}}$ . We can think of $A^{ϕ}$ as if it represents a single effective conformation to which ligand is binding, and we can index each vertex of $A^{ϕ}$ by the corresponding subset of bound sites, $S$ . The key point, as explained in detail in the Materials and methods, is that it is possible to assign labels to the edges in $A^{ϕ}$ so that

{Pr}_{S} (A^{ϕ}) = \sum_{k = 1}^{N} {Pr}_{(c_{k}, S)} (A),

(17)

with $A^{ϕ}$ being at thermodynamic equilibrium under these label assignments. According to Equation 17, the probability of being in a coarse-grained vertex of $A^{ϕ}$ is identical to the overall probability of being in any of the corresponding vertices of $A$ . This is exactly the property a coarse graining should satisfy at steady state. It is not difficult to see why a procedure like this should work. The coarse-graining formula in Equation 17 tells us the expected probability distribution on the coarse-grained graph, $A^{ϕ}$ . Equation 3 can then be used to back out the equilibrium labels on the edges of $A^{ϕ}$ which give rise to this probability distribution. We provide a more direct way of achieving the same result in Equation 40. This assignment of labels to $A^{ϕ}$ is the only way to ensure Equation 17 at equilibrium, so that the coarse graining is both systematic and unique. The Materials and methods gives a more careful treatment for coarse graining any linear framework graph, which may not itself be at thermodynamic equilibrium.

Our coarse-graining procedure offers a general method for calculating how effective behaviour emerges, at thermodynamic equilibrium, from a more detailed underlying mechanism. This procedure is likely to be broadly useful for other studies. We note that it applies only to the steady state. It does not provide a coarse graining of the underlying dynamics, which is a much harder problem.

Because $A^{ϕ}$ resembles the graph for ligand binding at a single conformation, we can calculate HOCs for $A^{ϕ}$ —equivalently, effective HOCs for $A$ —just as we did above, by normalising the effective association constants. Once the dust of calculation has settled (Materials and methods), we find that $A$ has effective association constants and effective HOCs:

K_{i, S}^{ϕ} = \frac{⟨ K_{c_{k}, i, S} . μ_{S} (A^{c_{k}}) ⟩}{⟨ μ_{S} (A^{c_{k}}) ⟩} and ω_{i, S}^{ϕ} = \frac{⟨ K_{c_{k}, i, S} . μ_{S} (A^{c_{k}}) ⟩}{⟨ K_{c_{k}, i, \emptyset} ⟩ ⟨ μ_{S} (A^{c_{k}}) ⟩} .

(18)

The quantity $μ_{S} (A^{c_{k}})$ is calculated by multiplying labels over paths, as above, within the vertical subgraph $A^{c_{k}}$ . The terms within angle brackets, of the form $⟨ X (c_{k}) ⟩$ , where $X (c_{k})$ is some function over conformations $c_{k}$ , denote averages over the steady-state probability distribution of the horizontal subgraph of empty conformations, $A_{\emptyset}$ : $⟨ X (c_{k}) ⟩ = \sum_{1 \leq k \leq N} X (c_{k}) {Pr}_{c_{k}} (A_{\emptyset})$ . The right-hand formula in Equation 18 for the effective HOCs has a suggestive structure: it is an average of a product divided by the product of the averages. The effective parameters in Equation 18 provide a biophysical language in which the integrative capabilities of any ensemble can be rigorously specified.
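To make the coarse graining concrete, the sketch below builds a small hypothetical ensemble, evaluates the effective association constants from Equation 18, and checks the property in Equation 17: the probabilities computed on the coarse-grained graph from the effective labels agree with the probabilities of the full allostery graph summed over conformations. The ensemble sizes, the ligand concentration, the random weights and all function names are assumptions made for illustration. The sketch uses the product structure of the allostery graph, in which the weight of vertex $(c_{k}, S)$ is $μ_{c_{k}} (A_{\emptyset}) μ_{S} (A^{c_{k}})$ .

```python
import itertools
import random

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

random.seed(5)
n, N, x = 3, 4, 0.7  # sites, conformations, ligand concentration (all illustrative)
sites = range(1, n + 1)
all_S = subsets(sites)

# Hypothetical ensemble: w[c][S] is mu_S(A^c) at unit concentration, with w[c][empty] = 1.
w = [{S: (random.uniform(0.1, 5.0) if S else 1.0) for S in all_S} for _ in range(N)]
mu = [{S: w[c][S] * x ** len(S) for S in all_S} for c in range(N)]
# Intrinsic association constants implied by the vertical path products.
K = [{(i, S): w[c][S | {i}] / w[c][S] for S in all_S for i in sites if i not in S}
     for c in range(N)]
# lam[c] is mu_c(A_empty); the reference conformation has weight 1.
lam = [1.0] + [random.uniform(0.1, 5.0) for _ in range(N - 1)]
probs = [value / sum(lam) for value in lam]

def effective_K(i, S):
    """Left-hand formula of Equation 18."""
    num = sum(p * K[c][(i, S)] * mu[c][S] for c, p in enumerate(probs))
    den = sum(p * mu[c][S] for c, p in enumerate(probs))
    return num / den

def effective_hoc(i, S):
    """Right-hand formula of Equation 18, as a ratio of effective constants."""
    return effective_K(i, S) / effective_K(i, frozenset())

def mu_effective(S):
    """Path product of effective labels x*K from the empty vertex to S."""
    total, Z = 1.0, frozenset()
    for i in sorted(S):
        total *= x * effective_K(i, Z)
        Z = Z | {i}
    return total

norm_eff = sum(mu_effective(S) for S in all_S)
pr_coarse = {S: mu_effective(S) / norm_eff for S in all_S}

norm_full = sum(lam[c] * mu[c][S] for c in range(N) for S in all_S)
pr_summed = {S: sum(lam[c] * mu[c][S] for c in range(N)) / norm_full for S in all_S}
max_gap = max(abs(pr_coarse[S] - pr_summed[S]) for S in all_S)
```

The agreement reflects a telescoping product: each effective label equals $⟨ μ_{S \cup {i}} (A^{c_{k}}) ⟩ / ⟨ μ_{S} (A^{c_{k}}) ⟩$ , so the path product to $S$ collapses to $⟨ μ_{S} (A^{c_{k}}) ⟩$ .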

Effective HOCs for MWC-like ensembles

The functional viewpoint is readily recovered from the ensemble. A generalised MWC formula can be given in terms of effective HOCs, from which the classical two-conformation MWC formula is easily derived (Materials and methods). Some expected properties of effective HOCs are also easily checked (Materials and methods). First, $ω_{i, S}^{ϕ}$ is independent of ligand concentration, $x$ . Second, there is no effective HOC for binding to an empty conformation, so that $ω_{i, \emptyset}^{ϕ} = 1$ . Third, if there is only one conformation, $c_{1}$ , then the effective HOC reduces to the intrinsic HOC, so that $ω_{i, S}^{ϕ} = ω_{c_{1}, i, S}$ .

More illuminating are the effective HOCs for the MWC model. We consider any conformational ensemble which is MWC-like: there is no intrinsic HOC in any conformation, so that $ω_{c_{k}, i, S} = 1$ and $K_{c_{k}, i, S} = K_{c_{k}, i, \emptyset}$ ; and the bare association constants are identical at all sites, so that we can set $K_{c_{k}, i, \emptyset} = K_{c_{k}}$ . There may, however, be any number of conformations, not just the two conformations of the classical MWC model. It then follows that $ω_{i, S}^{ϕ}$ depends only on the size of $S$ , so that we can write $ω_{i, S}^{ϕ}$ as $ω_{s}^{ϕ}$ , where $s = # (S)$ is the order of cooperativity. Equation 18 then simplifies to (Materials and methods)

ω_{s}^{ϕ} = \frac{⟨ {(K_{c_{k}})}^{s + 1} ⟩}{⟨ K_{c_{k}} ⟩ ⟨ {(K_{c_{k}})}^{s} ⟩} .

(19)

We see that, although there is no intrinsic HOC in any conformation, effective HOC of each order arises from the moments of $K_{c_{k}}$ over the probability distribution on $A_{\emptyset}$ . In particular, Equation 19 shows that the effective pairwise cooperativity is $ω_{1}^{ϕ} = ⟨ {(K_{c_{k}})}^{2} ⟩ / {⟨ K_{c_{k}} ⟩}^{2}$ .

In studies of G-protein coupled receptor (GPCR) allostery, Ehlert relates ‘empirical’ to ‘ultimate’ levels of explanation by a procedure similar to our coarse graining, but with only two conformations, and calculates a ‘cooperativity constant’ which is the same as $ω_{1}^{ϕ}$ (Ehlert, 2016). Gruber and Horovitz calculate ‘successive ligand binding constants’ for the two-conformation MWC model which are the same as effective association constants, $K_{s}^{ϕ}$ , (Gruber and Horovitz, 2018) (Materials and methods). To our knowledge, these are the only other calculations of effective allosteric quantities. We note that Equation 19 applies to all HOCs, not just pairwise, and to any MWC-like ensemble, not just those with two conformations.

The classical MWC model yields only positive cooperativity (Koshland and Hamadani, 2002; Monod et al., 1965), as measured in the functional perspective (Figure 2A). We find that MWC-like ensembles yield positive effective HOCs of all orders. Strikingly, these effective HOCs increase with increasing order of cooperativity: provided $K_{c_{k}}$ is not constant over conformations (Materials and methods),

1 < ω_{1}^{ϕ} < ω_{2}^{ϕ} < \dots < ω_{n - 1}^{ϕ} .

(20)

This shows that ensembles with independent and identical sites, including the two-conformation MWC model, can effectively implement high orders and high levels of positive cooperativity. Equation 20 is very informative, and we return to it in the Discussion.
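Equations 19 and 20 are straightforward to evaluate. The sketch below does so for a two-conformation MWC-like ensemble; the association constants, the probabilities on the empty conformations and the function names are hypothetical values chosen only for illustration.

```python
def moment(Ks, probs, power):
    """Average of K^power over the distribution on the empty conformations."""
    return sum(p * k ** power for k, p in zip(Ks, probs))

def effective_hoc_mwc(Ks, probs, s):
    """Effective HOC of order s for an MWC-like ensemble (Equation 19)."""
    return moment(Ks, probs, s + 1) / (moment(Ks, probs, 1) * moment(Ks, probs, s))

Ks = [1.0, 10.0]    # hypothetical association constants in the two conformations
probs = [0.9, 0.1]  # hypothetical probabilities of the two empty conformations
n = 6               # number of identical, independent sites

hocs = [effective_hoc_mwc(Ks, probs, s) for s in range(1, n)]
increasing = all(a < b for a, b in zip(hocs, hocs[1:]))

# With a single value of K across conformations, every effective HOC is 1.
flat = [effective_hoc_mwc([3.0, 3.0], [0.5, 0.5], s) for s in range(1, n)]
```

The ratio of successive effective HOCs is $⟨ {(K_{c_{k}})}^{s + 2} ⟩ ⟨ {(K_{c_{k}})}^{s} ⟩ / {⟨ {(K_{c_{k}})}^{s + 1} ⟩}^{2}$ , which is at least 1 by the Cauchy-Schwarz inequality; this is one way to see why the sequence in Equation 20 increases.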

It is often suggested that negative cooperativity requires a different kind of ensemble to those considered here, such as one allowing KNF-style induced fit (Koshland and Hamadani, 2002). However, if two sites are independent but not identical, so that $K_{c_{k}, 1, \emptyset} \neq K_{c_{k}, 2, \emptyset}$ , then, with just two conformations, the effective pairwise cooperativity can become negative. Indeed, $ω_{1, {2}}^{ϕ} < 1$ , if, and only if, the values of the association constants are not in the same relative order in the two conformations (Materials and methods). Negative effective cooperativity can arise from non-identical sites and does not need a special kind of ensemble.
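For two independent sites, Equation 18 gives $ω_{1, {2}}^{ϕ} = ⟨ K_{c_{k}, 1, \emptyset} K_{c_{k}, 2, \emptyset} ⟩ / (⟨ K_{c_{k}, 1, \emptyset} ⟩ ⟨ K_{c_{k}, 2, \emptyset} ⟩)$ , so that the effective pairwise cooperativity falls below 1 exactly when the two bare association constants are negatively correlated across conformations. The sketch below evaluates this for two hypothetical two-conformation ensembles; the numerical values and the function name are assumptions.

```python
def pairwise_effective_hoc(K_site1, K_site2, probs):
    """Effective pairwise HOC for two independent sites, from Equation 18."""
    def mean(values):
        return sum(p * v for p, v in zip(probs, values))
    joint = [a * b for a, b in zip(K_site1, K_site2)]
    return mean(joint) / (mean(K_site1) * mean(K_site2))

probs = [0.5, 0.5]  # hypothetical probabilities of the two empty conformations

# Site 1 binds more tightly in the first conformation, site 2 in the second.
opposed = pairwise_effective_hoc([10.0, 1.0], [1.0, 10.0], probs)

# Both sites bind more tightly in the first conformation.
aligned = pairwise_effective_hoc([10.0, 1.0], [10.0, 1.0], probs)
```

In the first ensemble the sites favour different conformations and the effective cooperativity is negative ( `opposed` is below 1); in the second they favour the same conformation and it is positive ( `aligned` is above 1).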

Integrative flexibility of ensembles

Equation 18 shows that effective HOCs of any order can arise for a conformational ensemble but does not reveal what values they can attain. Can they vary arbitrarily? The question can be rigorously posed as follows. Suppose that we are considering $n$ binding sites and that numbers $β_{i} > 0$ , for $1 \leq i \leq n$ , and $α_{i, S} > 0$ , for $i < S$ , are chosen at will. Does there exist a conformational ensemble such that the bare effective association constants satisfy $K_{i, \emptyset}^{ϕ} = β_{i}$ , and the independent effective HOCs satisfy $ω_{i, S}^{ϕ} = α_{i, S}$ ?

To address this question, we assume that there is no intrinsic HOC, so as not to introduce cryptically what we want to generate. It follows that the sites cannot be identical, for otherwise Equation 20 shows that integrative flexibility is impossible. Accordingly, the bare association constants, $K_{c_{k}, i, \emptyset}$ for $1 \leq i \leq n$ , can be treated as $n$ free parameters in each conformation $c_{k}$ . If there are $N$ conformations in the ensemble, then there are $N - 1$ free parameters coming from the horizontal edges (Materials and methods), giving $n N + (N - 1)$ free parameters in total. Since $2^{n} - 1$ independent effective parameters must be matched, dimensional considerations imply that the effective association constants and effective HOCs cannot all take arbitrary values if $N (n + 1) < 2^{n}$ . Conversely, we prove the following flexibility theorem: any pattern of values can be realised by an allosteric ensemble with no intrinsic cooperativity, to any required degree of accuracy, provided there are enough conformations with the right probability distribution and the right patterns of bare association constants.

To see why this is possible, we outline the argument here and give rigorous details in Theorem 1 in the Materials and methods. Other arguments may of course be possible and the details presented here should not be thought of as the only way for the results to hold. We will use an allostery graph $A$ whose conformations are indexed by subsets $T \subseteq {1, \dots, n}$ and denoted $c_{T}$ . Both binding subsets and conformations will then be indexed by subsets of ${1, \dots, n}$ . To avoid confusion, we will use $S$ to label binding subsets and $T$ to label conformations, so that a vertex of $A$ will be $(c_{T}, S)$ . The allostery graph for the case $n = 2$ is shown in Figure 6. We will focus on the horizontal subgraph of empty conformations, $A_{\emptyset}$ , because that is what is needed for calculating effective HOCs using Equation 18. We will take the reference vertex of $A_{\emptyset}$ to be $c_{\emptyset}$ . Recall from what was explained previously that the product of the equilibrium labels along any path in $A_{\emptyset}$ from the reference vertex to the vertex $c_{T}$ is the quantity $μ_{c_{T}} (A_{\emptyset})$ , in terms of which the steady-state probabilities of $A_{\emptyset}$ are given by Equation 3. Let $λ_{T} = μ_{c_{T}} (A_{\emptyset})$ . These quantities are $2^{n} - 1$ free parameters whose values we are going to assign. They are more convenient for our purposes than an independent set of equilibrium labels for $A_{\emptyset}$ . By Equation 3,

{Pr}_{c_{T}} (A_{\emptyset}) = \frac{λ_{T}}{\sum_{X \subseteq {1, \dots, n}} λ_{X}} .

(21)

Figure 6. — There are $n = 2$ sites and $2^{2} = 4$ conformations, giving a 16-vertex allostery graph (top). Vertices indicate a bound site with a solid black dot and an unbound site with a black dash. Sites are indexed in increasing order, $1, 2$ , from left to right. The red vertical binding edges carry a factor of ε in their equilibrium labels; the blue vertical binding edges do not, as specified in the text and Equation 60. The vertices of the allostery graph are annotated with the values of $μ_{S} (A^{c_{T}})$ , as specified in the text and Equation 61. The horizontal subgraph of empty conformations is shown at the bottom, with conformations indexed below each vertex by subsets of ${1, 2}$ and annotated above each vertex with the corresponding value of $λ_{T}$ , as specified by Equation 24.

The other free parameters that we need are $n$ quantities, $κ_{1}, \dots, κ_{n} > 0$ , to which we will subsequently assign values, in terms of which we will define the intrinsic association constants. We will assume that the sites are independent in each conformation, so that all intrinsic HOCs of $A$ are 1. It follows that $K_{c_{T}, i, S} = K_{c_{T}, i, \emptyset}$ . We then set $K_{c_{T}, i, \emptyset} = κ_{i}$ if $i \in T$ , and $K_{c_{T}, i, \emptyset} = ε κ_{i}$ if $i \notin T$ . Here, ε is a small positive quantity which can be chosen to determine the degree of accuracy to which the $β_{i}$ and $α_{i, S}$ are approximated. In the calculations which follow, we will only be interested in terms which do not involve ε as a factor. Because the sites are independent in each conformation, it follows that, in the vertical subgraph, $A^{c_{T}}$ , at any conformation $c_{T}$ , $μ_{S} (A^{c_{T}}) = (\prod_{i \in S} κ_{i}) x^{# (S)}$ , whenever $S \subseteq T$ . However, if $S ⊈ T$ , then $μ_{S} (A^{c_{T}})$ acquires factors of ε and so $μ_{S} (A^{c_{T}}) \approx 0$ , where ≈ means simply that the related quantities become equal as ε becomes very small. In this case, for our purposes, $μ_{S} (A^{c_{T}})$ is negligible whenever $S ⊈ T$ . Figure 6 illustrates how this plays out in the allostery graph for $n = 2$ .

To calculate the effective association constants, the left-hand formula in Equation 18 shows that we must evaluate the averages $⟨ K_{c_{T}, i, S} . μ_{S} (A^{c_{T}}) ⟩$ and $⟨ μ_{S} (A^{c_{T}}) ⟩$ . Using Equation 21,

⟨ μ_{S} (A^{c_{T}}) ⟩ = \sum_{T} μ_{S} (A^{c_{T}}) (\frac{λ_{T}}{\sum_{X} λ_{X}}) .

The only terms in the sum which do not involve ε as a factor are those $T$ for which $S \subseteq T$ . Furthermore, the definition of $μ_{S} (A^{c_{T}})$ given above shows that these terms do not depend on $T$ . Similarly, using Equation 21 again,

⟨ K_{c_{T}, i, S} . μ_{S} (A^{c_{T}}) ⟩ = \sum_{T} K_{c_{T}, i, S} . μ_{S} (A^{c_{T}}) (\frac{λ_{T}}{\sum_{X} λ_{X}})

and the only terms in the sum which do not involve ε as a factor are those for which $S \cup {i} \subseteq T$ . These terms also do not depend on $T$ . It follows from Equation 18 that

K_{i, S}^{ϕ} = \frac{⟨ K_{c_{T}, i, S} . μ_{S} (A^{c_{T}}) ⟩}{⟨ μ_{S} (A^{c_{T}}) ⟩} \approx κ_{i} (\frac{\sum_{S \cup {i} \subseteq T} λ_{T}}{\sum_{S \subseteq T} λ_{T}}),

(22)

where we have ignored all terms involving ε as a factor.

Equation 22 tells us two things. First, that the effective association constants are approximately proportional to the corresponding κ’s. Hence, if the proportionality constants, which depend only on the $λ_{T}$ , are determined, we can choose the $κ_{i}$ so as to make the bare effective association constants $K_{i, \emptyset}^{ϕ}$ approximately equal to $β_{i}$ . Second, Equation 22 tells us that the effective HOCs, $ω_{i, S}^{ϕ}$ , are independent of the $κ_{i}$ and depend only on the $λ_{T}$ ,

ω_{i, S}^{ϕ} = \frac{K_{i, S}^{ϕ}}{K_{i, \emptyset}^{ϕ}} \approx \frac{(\sum_{\emptyset \subseteq T} λ_{T}) (\sum_{S \cup {i} \subseteq T} λ_{T})}{(\sum_{{i} \subseteq T} λ_{T}) (\sum_{S \subseteq T} λ_{T})} .

(23)

It remains for us to assign values to the $λ_{T}$ so that the effective HOCs become approximately equal to the α’s.

To do this, assume that, for the conformation $c_{T}$ , the subset $T$ is written as $T = {i_{1}, \dots, i_{k}}$ , where the indices are in increasing order, $i_{1} < i_{2} < \dots < i_{k}$ . Because of this ordering, $i_{j} < {i_{j + 1}, \dots, i_{k}}$ , so the quantities $α_{i_{j}, {i_{j + 1}, \dots, i_{k}}}$ are among those given to us by hypothesis. Hence, we can define

λ_{T} = α_{i_{1}, {i_{2}, \dots, i_{k}}} α_{i_{2}, {i_{3}, \dots, i_{k}}} \dots α_{i_{k - 1}, {i_{k}}} δ^{k} .

(24)

Here, δ is another small positive quantity, similar to ε, which can be chosen to set the degree of accuracy to which the β’s and α’s are approximated. As with ε, we will treat as negligible terms in which δ is a factor. Figure 6 illustrates Equation 24 for the case $n = 2$ .

It can be seen from Equation 24 that $\sum_{X \subseteq T} λ_{T} = λ_{X} (1 + U)$ , where $U$ has a factor of δ and is therefore negligible as δ becomes very small, $U \approx 0$ . It then follows from Equation 23 that

ω_{i, S}^{ϕ} = \frac{(1 + U) λ_{S \cup {i}} (1 + U)}{δ (1 + U) λ_{S} (1 + U)},

(25)

where we have used $U$ as a generic symbol for quantities which are negligible as δ becomes very small. By Equation 24, $λ_{S \cup {i}} = α_{i, S} δ λ_{S}$ , so that

ω_{i, S}^{ϕ} \approx α_{i, S} .

(26)

This establishes part of what is required. For the other part, we can return to Equation 22 and set

κ_{i} = β_{i} (\frac{\sum_{\emptyset \subseteq T} λ_{T}}{\sum_{{i} \subseteq T} λ_{T}}),

so that, by Equation 22 with $S = \emptyset$ ,

K_{i, \emptyset}^{ϕ} \approx β_{i} .

(27)

Equations 26 and 27 show that the effective association constants and effective HOCs can take arbitrary positive values to any desired degree of accuracy, as determined by ε and δ. This establishes the flexibility theorem. The Materials and methods provides a more careful treatment in Theorem 1, which rigorously establishes the approximation as ε and δ become very small.
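The construction can be followed numerically for a small case. The sketch below, for $n = 3$ sites, assigns the $λ_{T}$ from Equation 24, chooses each $κ_{i}$ by solving Equation 22 at $S = \emptyset$ so that the bare effective association constant matches $β_{i}$ , and then evaluates the effective parameters from Equation 18 without discarding any terms in ε or δ. The target values, the choices $δ = 10^{-4}$ and $ε = 10^{-9}$ , and all function names are assumptions made for illustration. In this sketch ε is taken much smaller than δ, which appears to be needed because conformations $c_{T}$ with small $T$ carry much larger weights $λ_{T}$ .

```python
import itertools

def subsets(sites):
    """All subsets of `sites` as frozensets, ordered by size."""
    items = sorted(sites)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def build_ensemble(n, beta, alpha, delta, eps):
    """Weights lam[T] (Equation 24) and intrinsic constants K[T][i] for each c_T."""
    sites = range(1, n + 1)
    confs = subsets(sites)
    lam = {}
    for T in confs:
        order = sorted(T)
        value = delta ** len(order)
        for j in range(len(order) - 1):
            value *= alpha[(order[j], frozenset(order[j + 1:]))]
        lam[T] = value
    total = sum(lam.values())
    # kappa_i solves Equation 22 at S = empty for the target beta_i.
    kappa = {i: beta[i] * total / sum(v for T, v in lam.items() if i in T)
             for i in sites}
    K = {T: {i: kappa[i] * (1.0 if i in T else eps) for i in sites} for T in confs}
    return lam, K

def effective_K(lam, K, i, S):
    """Equation 18 (left) at x = 1; x and the normalisation of lam cancel."""
    num = den = 0.0
    for T, weight in lam.items():
        mu = 1.0
        for j in S:
            mu *= K[T][j]  # sites are independent within each conformation
        num += weight * K[T][i] * mu
        den += weight * mu
    return num / den

def effective_hoc(lam, K, i, S):
    return effective_K(lam, K, i, S) / effective_K(lam, K, i, frozenset())

# Hypothetical targets: bare constants beta_i and independent HOCs alpha_{i,S}, i < S.
beta = {1: 1.0, 2: 2.0, 3: 0.5}
alpha = {(1, frozenset({2})): 2.0, (1, frozenset({3})): 0.5,
         (2, frozenset({3})): 3.0, (1, frozenset({2, 3})): 4.0}

lam, K = build_ensemble(3, beta, alpha, delta=1e-4, eps=1e-9)
hoc_errors = {key: abs(effective_hoc(lam, K, key[0], key[1]) / target - 1.0)
              for key, target in alpha.items()}
bare_errors = {i: abs(effective_K(lam, K, i, frozenset()) / beta[i] - 1.0)
               for i in beta}
```

The relative errors shrink as δ and ε are reduced, at the cost of a wider range of parameter values in the ensemble.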

Figures 7 and 8 together illustrate the flexibility theorem. Figure 7A shows three arbitrarily chosen patterns of effective parameters for a target molecule with four ligand binding sites. Figure 7B shows the corresponding overall binding functions (black curves) together with the individual site-specific binding functions (coloured curves). As a matter of thermodynamics, the overall binding function is always an increasing function of ligand concentration. In contrast, the site-specific binding functions may increase or decrease depending on the combinations of positive and negative effective HOCs in Figure 7A, and thereby show more clearly the complexity arising from those different combinations. The implementation of the effective parameters by an allosteric ensemble, as specified by the flexibility theorem, is illustrated in Figure 8. Figure 8A shows the allosteric ensemble for $n = 4$ sites as a product graph with 16 binding patterns and 16 conformations. Figure 8B shows the intrinsic association constants in each conformation coming from the proof of the flexibility theorem, to an accuracy of 0.01. Figure 8C confirms that this allosteric ensemble closely reproduces the overall binding functions in Figure 7B.

Figure 8. — (A) The allostery graph, $A$ , which implements the choices of effective higher-order cooperativities (HOCs) in Figure 7, shown as the product of the vertical subgraph of binding patterns at conformation $c_{\emptyset}$ , $A^{c_{\emptyset}}$ , and the horizontal subgraph of empty conformations, $A_{\emptyset}$ . As required in the proof of the flexibility theorem, both conformations and binding subsets are indexed by subsets of ${1, \dots, n}$ , where $n$ is the number of binding sites. Since $n = 4$ for the effective HOCs in Figure 7, there are 16 binding subsets and 16 conformations, $c_{\emptyset}, \dots, c_{{1, 2, 3, 4}}$ . (B) Intrinsic bare association constants, $K_{c_{T}, i, \emptyset}$ , in each conformation, in arbitrary units of (concentration)⁻¹, and the probability distribution on the subgraph of empty conformations, $A_{\emptyset}$ , for the allostery graph in (A), giving the three choices of effective parameters in Figure 7A to an accuracy of 0.01 (Materials and methods), colour coded on a log scale as shown in the respective legends below. (C) Overall binding functions for the three parameterised ensembles in (B) (black curves), overlaid on the overall binding functions from Figure 7B (red curves), which were calculated from the effective parameters. The match is too close for the red curves to be visible. Numerical values are given in the Materials and methods. Calculations were undertaken in a Mathematica notebook, available on request.

In respect of the dimensional argument made previously, the allostery graph used in the proof above has $2^{n} - 1$ free parameters for $A_{\emptyset}$ and the $κ_{1}, \dots, κ_{n}$ are a further $n$ free parameters, giving $2^{n} - 1 + n$ free parameters in total. This is more than the minimal required number of $2^{n} - 1$ but not by much. It remains an interesting open question whether a conformational ensemble can be constructed, perhaps with more free parameters, which gives the effective HOCs exactly, rather than only approximately. One consequence of the definitions of $K_{c_{T}, i, \emptyset}$ and of $λ_{T}$ in Equation 24 is that the parameters of the allosteric ensemble become exponentially small, as is evident for the examples in Figure 8B. Another interesting question is whether alternative constructions can be found which do not exhibit such a broad range of parameter values. Irrespective of these questions, the proof given above confirms that there is no fundamental biophysical limitation to achieving any pattern of values to any desired degree of accuracy. Accordingly, a central result of the present paper is that sufficiently complex allosteric ensembles can implement any form of information integration that is achievable without energy expenditure.

Allosteric ensembles for Hill functions

As mentioned in the Introduction, the starting point for the present paper was to account for experimental data on gene expression. Studies in Drosophila have shown that the Hunchback gene, in response to the maternal TF Bicoid, is sharply expressed in a way that is well fitted, after appropriate normalisation, to a Hill function, $ℋ_{h} (x) = x^{h} / (1 + x^{h})$ . This sharp expression underlies the initial patterning of anterior-posterior stripes in the early Drosophila embryo. Estimated values for the Hill coefficient, $h$ , vary depending on the experimental construct and time of measurement but are typically in the range $4 \leq h \leq 8$ during early nuclear cycle 14 (Tran et al., 2018). The relevant promoter is believed to have $n = 6$ Bicoid binding sites, and the mechanistic basis for the sharpness is the subject of considerable interest. We showed in previous work that, if the promoter was assumed to have six Bicoid binding sites and to be operating at thermodynamic equilibrium, then the highest Hill coefficient that could be achieved of $h = 6$ , at the so-called Hopfield barrier, required HOCs for Bicoid binding of order up to 5 (Estrada et al., 2016). In particular, pairwise cooperativities, which had previously been invoked to account for the sharpness (Gregor et al., 2007), are not sufficient to explain the data. Left open by this previous work was a molecular mechanism which could create the high-order HOCs required for Hill functions. We have seen above that allosteric ensembles can create any pattern of HOCs, so it is natural to ask if there are allosteric ensembles which yield good approximations to Hill functions.
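As a small illustration of the Hill function defined above, the following sketch evaluates $ℋ_{h} (x)$ and estimates its slope at the half-maximum point, $x = 1$, where the analytical value is $h / 4$. The function names `hill` and `slope_at_half_max` are hypothetical and are not taken from the original calculations.

```python
def hill(x: float, h: float) -> float:
    """Hill function H_h(x) = x^h / (1 + x^h) for normalised concentration x >= 0."""
    xh = x ** h
    return xh / (1.0 + xh)


def slope_at_half_max(h: float, dx: float = 1e-6) -> float:
    """Central-difference estimate of dH_h/dx at x = 1 (analytically h / 4)."""
    return (hill(1.0 + dx, h) - hill(1.0 - dx, h)) / (2.0 * dx)


if __name__ == "__main__":
    # Hill coefficients in the range discussed in the text.
    for h in (4, 5, 6, 8):
        print(h, hill(1.0, h), slope_at_half_max(h))
```

Every Hill function passes through $(1, 0.5)$, so the Hill coefficient only controls how steeply the curve rises around that point.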

We implemented a numerical optimisation algorithm to find binding functions which approximated Hill functions (Materials and methods). Hill functions are naturally normalised so that $ℋ_{h} (1) = 0.5$, so we followed the procedure introduced previously (Estrada et al., 2016) of normalising concentration to its value at half-maximum: if the normalised binding function is denoted $f (x)$, then $f (1) = 0.5$. Figure 9 shows results for an allosteric ensemble with four conformations for ligand binding to six sites. The ensemble has no intrinsic cooperativity in any conformation, so that $K_{c_{k}, i, S} = K_{c_{k}, i, \emptyset}$ for any binding subset $S \subseteq \{1, \dots, 6\}$, while the bare association constants, $K_{c_{k}, i, \emptyset}$, differ between the conformations (Figure 9B). This gives $4 \times 6 = 24$ free parameters together with an additional three free parameters for the independent equilibrium labels on the horizontal subgraph $A_{\emptyset}$ (Figure 9A). We limited the parameter ranges so that the $K_{c_{k}, i, \emptyset}$ were in the range $[10^{- 4}, 10^{4}]$ and the equilibrium labels of $A_{\emptyset}$ were in the range $[10^{- 6}, 10^{6}]$. With these settings, it was not difficult to find normalised binding functions which are very well fitted by the Hill function, $ℋ_{h} (x)$, for Hill coefficients $h = 4$, 5 and 6 (Figure 9D).
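The following is a minimal sketch of how a normalised binding function of this kind can be evaluated. It is not the optimisation procedure itself. It assumes that the overall binding function is the average fraction of bound sites and that, in the absence of intrinsic cooperativity, sites bind independently within each conformation, so that conformation $c_{k}$ contributes a weight $λ_{k} \prod_{i} (1 + K_{c_{k}, i, \emptyset} x)$ to the partition function. The names `K`, `lam`, `binding_function` and `half_max_concentration` are hypothetical, and the numerical values in the example are illustrative only and are not those of Figure 9.

```python
import math
from typing import Sequence


def binding_function(x: float, K: Sequence[Sequence[float]], lam: Sequence[float]) -> float:
    """Average fraction of bound sites at ligand concentration x.

    K[k][i] is the bare association constant of site i in conformation k;
    lam[k] is the equilibrium weight of the empty conformation k relative
    to the reference conformation (so lam[0] = 1 by convention).
    """
    n = len(K[0])
    partition = 0.0
    bound = 0.0
    for K_k, lam_k in zip(K, lam):
        weight = lam_k
        for K_ki in K_k:
            weight *= 1.0 + K_ki * x  # independent sites within one conformation
        occupancy = sum(K_ki * x / (1.0 + K_ki * x) for K_ki in K_k)
        partition += weight
        bound += weight * occupancy
    return bound / (n * partition)


def half_max_concentration(K, lam, lo: float = 1e-12, hi: float = 1e12, iters: int = 200) -> float:
    """Bisection on a log scale, using the fact that the binding function increases with x."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if binding_function(mid, K, lam) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def normalised_binding_function(x: float, K, lam) -> float:
    """Binding function with concentration rescaled so that f(1) = 0.5."""
    return binding_function(x * half_max_concentration(K, lam), K, lam)


if __name__ == "__main__":
    # Illustrative (made-up) parameters: four conformations, six sites.
    K = [[1e-3] * 6, [1e-1] * 6, [1e1] * 6, [1e3] * 6]
    lam = [1.0, 1e-2, 1e-4, 1e-6]
    for x in (0.5, 1.0, 2.0):
        print(x, normalised_binding_function(x, K, lam))
```

For two conformations with identical sites, `binding_function` reduces to the classical MWC formula, which provides a convenient check on the sketch.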

We were able to find multiple sets of parameters which yielded excellent fits; Figure 9 shows two representative examples for each Hill coefficient. It is evident that very different numerical ensembles (Figure 9B) can give almost identical binding functions (Figure 9D). This reinforces the point made in the Introduction that the binding function and associated measures of its shape, such as a Hill coefficient, are aggregate measures which give little insight into how binding information is being integrated. For this, the patterns of effective parameters provide more detailed information. As can be seen from Figure 9C, effective HOCs of all orders up to 5 are needed for each Hill function, as suggested previously (Estrada et al., 2016), with predominantly positive effective HOCs, $ω_{i, S}^{ϕ} > 1$, and varying amounts of independence, $ω_{i, S}^{ϕ} = 1$.

It is interesting to ask what role the size of the ensemble plays in approximating Hill functions. We cannot give a definitive answer but can make some observations. We were able to approximate $ℋ_{6}$ with a two-conformation ensemble with six sites but only with much wider parametric ranges. It was also more difficult in terms of optimisation time to find a good fit, and we did not find multiple fits. This suggests that the larger the ensemble the easier it is to approximate Hill functions with limited parameter ranges. It is also conceivable that the size of the ensemble may have to increase with the number of binding sites to retain control over the parametric ranges. We must leave such issues to subsequent work. While our results are numerical, and therefore limited to the ensemble we have analysed, it seems clear that allosteric ensembles provide a molecular mechanism that can closely approximate Hill functions with the required high orders of effective cooperativity, thereby providing a solution to our original question. Since Hill functions are widely used to fit data, the potential for an underlying allosteric mechanism may be broadly useful.

Discussion

Jacques Monod famously described allostery as ‘the second secret of life’ (Ullmann, 2011). It is only relatively recently, however, that the prescience of his remark has been appreciated and the wealth of conformational ensembles present in most cellular processes has been revealed (Changeux and Christopoulos, 2016; Motlagh et al., 2014; Nussinov et al., 2013).

The present paper seeks to expand the existing allosteric perspective by providing a biophysical foundation for information integration by conformational ensembles. Equation 48 and Equation 49 in the Materials and methods (Equation 18 above) provide for the first time a rigorous definition of effective, higher-order quantities (the association constants, $K_{i, S}^{ϕ}$, and cooperativities, $ω_{i, S}^{ϕ}$) arising from any ensemble. Since our methods are equivalent to those of equilibrium statistical mechanics (Materials and methods), these definitions correctly aggregate the free-energy contributions which emerge in the ensemble from ligand binding to a conformation, intrinsic cooperativity within a conformation and conformational change. As noted above, our results encompass recent work on effective properties of the classical, two-conformation MWC ensemble, for pairwise cooperativity (Ehlert, 2016) and for higher-order association constants (Gruber and Horovitz, 2018), but they hold more generally for ensembles of arbitrary complexity with any number of conformations, including those with intrinsic cooperativities.

The effective quantities introduced here provide a language in which the integrative capabilities of an ensemble can be rigorously expressed. To begin with, the overall binding function can be determined in terms of the effective quantities through a generalised MWC formula (Materials and methods), thereby recovering the functional viewpoint (Figure 2A) from the ensemble viewpoint (Figure 2B). This generalised MWC formula reduces to the usual MWC formula for the classical two-conformation MWC model (Equation 55). We also clarify issues which had been difficult to understand in the absence of a quantitative definition of effective quantities. We find that the classical MWC model exhibits effective HOCs of any order and that these are always positive. In other words, binding always encourages further binding. Moreover, these effective HOCs increase strictly with increasing order (Equation 20), so that the more sites which are bound, the greater the encouragement to further binding. We see that HOC has always been present, even for oxygen binding to haemoglobin, albeit unrecognised for lack of an appropriate quantitative definition. Equation 20 confirms in a more precise way the long-standing realisation from the functional perspective that the MWC model exhibits only positive cooperativity; at the same time it succinctly expresses the rigidity and limitations of this model.

It is often stated in the allostery literature that negative cooperativity requires induced fit, in which binding induces conformations which are not present prior to binding. This view goes back to Koshland, who pointed to the emergence of negative cooperativity in the KNF model of allostery, which allows induced fit, and contrasted that to the positive cooperativity of the MWC model, which assumes conformational selection (Koshland and Hamadani, 2002). Our language of effective quantities permits a more discriminating analysis. It confirms, as just pointed out, that the classical MWC model exhibits only positive effective HOCs but also shows that induced fit is not required for negative effective HOC, which can arise just as readily from conformational selection (Materials and methods). What is required is not a different kind of ensemble but, rather, binding sites that are not identical.

Our main result, on the flexibility of conformational ensembles, shows that positive and negative HOCs of any value can occur in any pattern whatsoever, provided that the conformational ensemble is sufficiently complex, with enough conformations (Figure 8). Since the effective quantities provide a complete parameterisation of an ensemble at thermodynamic equilibrium, we see that conformational ensembles can implement any form of information integration that is achievable without external sources of energy. In particular, allosteric ensembles can be found whose binding functions closely approximate Hill functions (Figure 9), thereby answering the question which prompted this study, as to how such functions might arise in gene regulation.

Eukaryotic gene regulation is one of the most complex forms of cellular information processing (Wong and Gunawardena, 2020). Information from the binding of multiple TFs at many sites, often widely distributed across the genome in distal enhancer sequences, must be integrated to determine whether, and in what manner, a gene is expressed. The results of the present paper offer a way to think further about how such integration takes place (Tsai and Nussinov, 2011). We focus on gene regulation, but our results may also be useful for analysing other mechanisms of information integration, such as GPCRs (Thal et al., 2018).

As pointed out in the Introduction, haemoglobin solves the problem of integrating two quite different physiological functions—picking up oxygen in the lungs and delivering oxygen to the tissues—by having two conformations, each adapted to one of these functions, and dynamically inter-converting between them (Figure 10A). The effective cooperativity of oxygen binding ensures that the appropriate conformation dominates the ensemble in the distinct contexts of the lungs, where oxygen is abundant, and the tissues, where oxygen is scarce, so that oxygen is transferred from the former to the latter.

Genes have to be regulated to achieve yet more elaborate forms of integration, with the same gene being expressed differently in different contexts. Such pleiotropy is particularly evident in developmental genes (Bolt and Duboule, 2020) but usually occurs in distinct cells within the developing organism. The same gene is present in these cells, but it may be difficult to know whether the corresponding regulatory machineries are also the same. More directly suitable examples for the present discussion arise in individual cells exposed to distinct stimuli (Molina et al., 2013; Kalo et al., 2015; Lin et al., 2015), which may be particularly the case for neurons or cells of the immune system (Marco et al., 2020; Smale et al., 2013).

Depending on the input pattern of TFs present in a given cellular context (Figure 10B, left), a gene may be expressed in a certain way, as a distribution of splice isoforms, each with an overall level of mRNA expression and a pattern of stochastic bursting (Lammers et al., 2020; Figure 10B, right). A different input pattern of TFs may elicit a different mRNA output. Our results suggest that one way in which these different input-output relationships could be integrated in the workings of a single gene is through allostery of the overall regulatory machinery. An allosteric analogy in gene regulation was previously made by Mirny, 2010, building upon observations of indirect cooperativity between TFs that were mediated by nucleosomes (Miller and Widom, 2003). In the allosteric analogy, TF binding to DNA takes place in one of two conformations—nucleosome present or absent—which dynamically interchange, leading to the classical MWC model. Here, we build upon Mirny’s idea to suggest that not only indirect cooperativity but also, more broadly, information integration may be accounted for by the conformational dynamics of the gene regulatory machinery. The latter comprises not just individual nucleosomes but whatever other molecular entities are implicated in conveying information from TF binding sites to RNA polymerase and the transcriptional machinery (Figure 10B, centre), as discussed below. If this hypothesis is correct, then the flexibility result tells us that the overall regulatory conformational ensemble must exhibit sufficient complexity to implement the integration of binding information.

Studies of individual regulatory components have revealed many levels of conformational complexity. DNA itself exhibits conformational changes in respect of TF binding (Kim et al., 2013). Nucleosomes are moved or evicted to alter chromatin conformation and DNA accessibility (Mirny, 2010; Voss and Hager, 2014). TFs, in particular, show high levels of intrinsic disorder compared to other classes of proteins (Liu et al., 2006), especially in their activation domains, and these disordered regions exhibit dynamic multivalent interactions characteristic of higher-order effects (Chong et al., 2018; Clark et al., 2018). Hub TFs like p53 exhibit high levels of conformational flexibility in the context of specific DNA binding (Demir et al., 2017). Transcriptional co-regulators, which do not directly bind DNA but are recruited there by TFs, exhibit substantial conformational complexity: CBP/p300 has multiple intrinsically disordered regions which facilitate higher-order cooperative interactions (Dyson and Wright, 2016), while the Mediator complex exhibits quite remarkable conformational changes upon binding to TFs (Allen and Taatjes, 2015). Transcription initiation sub-complexes such as TFIID, which help assemble the transcriptional machinery, show conformational plasticity (Nogales et al., 2017), while the C-terminal domain of RNA Pol II, which is repetitive and intrinsically disordered, shows surprising local structural heterogeneity (Portz et al., 2017). The significance of RNA conformational dynamics during transcription is becoming clearer (Ganser et al., 2019). Finally, transcription may also be regulated within larger-scale entities, such as transcription factories (Edelman and Fraser, 2012), phase-separated condensates (Sabari et al., 2018) and topological domains (Benabdallah and Bickmore, 2015). 
The role of such entities remains a matter of debate (Mir et al., 2019), but they may play a significant role in conveying information over long genomic distances between distal enhancers and target promoters (Furlong and Levine, 2018). From the perspective taken here, in view of their size and extent, they may exhibit conformational dynamics on longer timescales.

These various findings have emerged largely independently of each other. They indicate the presence of many conformations of components of the gene regulatory machinery, with these components dynamically interchanging on varying timescales. The collective effect of these coupled dynamics is difficult to predict but we can hazard some guesses. It has been suggested, for example, that multi-protein complexes like Mediator couple the conformational repertoires of their component proteins into complex allosteric networks for processing information (Lewis, 2010). From an ensemble viewpoint, if component $X$ has $m$ conformations and component $Y$ has $n$ conformations, we might naively expect that the coupling of $X$ and $Y$ in a complex yields roughly $m n$ conformations. Following this multiplicative logic for the many components involved in eukaryotic gene regulation, from DNA itself to condensates and domains, suggests that the gene regulatory machinery has enormous conformational capacity with a deep hierarchy of timescales.

In making the analogy to haemoglobin, it is the conformational dynamics which implements the transfer of information from upstream TF inputs to downstream gene output. In any given cellular context, as determined by the input pattern of TFs, we may expect one, or perhaps a few, overall regulatory conformations which are well-adapted to generate the required mRNA output and these conformations will be the most frequently observed. The ensemble may exhibit complex patterns of positive and negative effective HOCs among the input TFs which will characterise the required output. In the light of our flexibility theorem, the occurrence of such HOCs, which appear to be necessary to account for data on gene regulation (Park et al., 2019a), may be seen as evidence for conformational complexity. When the cellular context changes, different conformations, adapted to produce the output required in the new context, may be present most often—although careful inspection may show them to have been more fleetingly present previously, as would be expected under conformational selection. More broadly, the complexity of the regulatory conformational ensemble and its dynamics reflects the complexity of functional integration which the gene has to undertake.

Furlong and Levine have suggested a ‘hub and condensate’ model for the overall gene regulatory machinery, which brings together aspects of earlier models to account for how remote enhancers communicate with target promoters (Furlong and Levine, 2018). The allosteric perspective taken here emphasises the significance of conformational dynamics for the functional integration undertaken by such ‘hubs’.

Testing these ideas on the scale of the regulatory machinery presents a daunting challenge, but recent developments point the way towards approaching them, including advances in cryo-EM (Lewis and Costa, 2020), single-molecule microscopy (Li et al., 2019; Bacic et al., 2020), NMR (Shi et al., 2020), synthetic biology (Park et al., 2019b) and the measurement of higher-order quantities (Gruber and Horovitz, 2018). Before experiments can be formulated, an appropriate conceptual picture needs to be described and that is what we have tried to formulate here. We now know a great deal about the molecular components involved in gene regulation, but the question of how these components collectively give rise to function has been harder to grasp. The allosteric analogy to haemoglobin, upon which we have built here, suggests a potential way to fill this gap.

In extending the haemoglobin analogy, we have sidestepped the issue of energy expenditure. This is not relevant for haemoglobin, but it can hardly be avoided in considering eukaryotic gene regulation, where reorganisation of chromatin and nucleosomes requires energy-dissipating motor proteins and post-translational modifications driven by chemical potential differences are found on all components of the regulatory machinery (Wong and Gunawardena, 2020). What impact such energy expenditure has on ensemble functional integration is a very interesting question. In a separate study that was stimulated by the present paper, we have confirmed that, if a conformational ensemble is maintained in steady state away from thermodynamic equilibrium, then it can exhibit greater functional capabilities than at equilibrium. We hope to report on these findings subsequently. The results presented here offer a rigorous starting point for thinking about how regulatory ensembles integrate binding information at thermodynamic equilibrium. If, indeed, regulatory energy expenditure is essential for gene expression function, as studies increasingly suggest (Park et al., 2019a; Grah et al., 2020; Wolff et al., 2021), new methods, both theoretical and experimental, will be required to understand its functional significance.

Materials and methods

The linear framework

Background and references

The graphs described in the main text, like those in Figure 4, are ‘equilibrium graphs’, which are convenient for describing systems at thermodynamic equilibrium. Equilibrium graphs are derived from linear framework graphs. The distinction between them is that the latter specifies a dynamics, while the former specifies an equilibrium steady state. We first explain the latter and then describe the former. Throughout this section we will use ‘graph’ to mean ‘linear framework graph’ and ‘equilibrium graph’ to mean the kind of graph used in the main text.

The linear framework was introduced in Gunawardena, 2012, developed in Mirzaev and Gunawardena, 2013, Mirzaev and Bortz, 2015, applied to various biological problems in Ahsendorf et al., 2014, Dasgupta et al., 2014, Estrada et al., 2016, Wong et al., 2018a, Wong et al., 2018b, Yordanov and Stelling, 2018, Biddle et al., 2019, Yordanov and Stelling, 2020 and reviewed in Gunawardena, 2014, Wong and Gunawardena, 2020. Technical details and proofs of the ideas described here can be found in Gunawardena, 2012, Mirzaev and Gunawardena, 2013, as well as in the Supplementary Information of Estrada et al., 2016, Wong et al., 2018b, Biddle et al., 2019.

The framework uses finite, directed graphs with labelled edges and no self-loops to analyse biochemical systems under timescale separation. In a typical timescale separation, the vertices represent ‘fast’ components or states, which are assumed to reach steady state; the edges represent reactions or transitions; and the edge labels represent rates with dimensions of (time)⁻¹. The labels may include contributions from ‘slow’ components, which are not represented by vertices but which interact with them, such as binding ligands in the case of allostery.

Linear framework graphs and dynamics

Graphs will always be connected, so that they cannot be separated into sub-graphs between which there are no edges. The set of vertices of a graph $G$ will be denoted by $ν (G)$. For a general graph, the vertices will be indexed by numbers $1, \dots, N \in ν (G)$ and vertex 1 will be taken to be the reference vertex. Particular kinds of graphs, such as the allostery graphs discussed in the paper, may use a different indexing. An edge from vertex $i$ to vertex $j$ will be denoted $i \to j$ and the label on that edge by $ℓ (i \to j)$. A subscript, as in $i \to_{G} j$, may be used to specify which graph is under discussion. When discussing graphs, we use the word ‘structure’ to refer to properties that depend on vertices and edges only, ignoring the labels.

A graph gives rise to a dynamical system by assuming that each edge is a chemical reaction under mass-action kinetics with the label as the rate constant. Since each edge has only a single source vertex, the corresponding dynamics is linear and can be represented by a linear differential equation in matrix form:

\frac{d u}{d t} = ℒ (G) u .

(28)

Here, $G$ is the graph, $u$ is a vector of component concentrations and $ℒ (G)$ is the Laplacian matrix of $G$. Since material is only moved between vertices, there is a conservation law, $\sum_{i} u_{i} (t) = u_{\mathrm{tot}}$. By setting $u_{\mathrm{tot}} = 1$, $u$ can be treated as a vector of probabilities. In such a stochastic setting, Equation 28 is the master equation (Kolmogorov forward equation) of the underlying Markov process. This is a general representation: given any well-behaved Markov process on a finite state space, there is a graph, whose vertices are the states, for which Equation 28 is the master equation.
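The construction of the Laplacian from the edge labels can be sketched as follows. This is an illustrative sketch rather than code from the original work: vertices are numbered from 0 instead of 1, the helper names `laplacian` and `euler_step` are hypothetical, and a simple forward Euler step stands in for a proper ODE solver.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (source, target), with vertices numbered 0..N-1


def laplacian(num_vertices: int, labels: Dict[Edge, float]) -> List[List[float]]:
    """Laplacian matrix L such that du/dt = L u under mass-action kinetics."""
    L = [[0.0] * num_vertices for _ in range(num_vertices)]
    for (i, j), rate in labels.items():
        if i == j:
            raise ValueError("self-loops are not allowed")
        L[j][i] += rate  # material arriving at j from i
        L[i][i] -= rate  # material leaving i
    return L


def euler_step(L: List[List[float]], u: List[float], dt: float) -> List[float]:
    """One forward Euler step of du/dt = L u (for illustration only)."""
    N = len(u)
    return [u[r] + dt * sum(L[r][c] * u[c] for c in range(N)) for r in range(N)]


if __name__ == "__main__":
    # A hypothetical reversible three-vertex chain 0 <-> 1 <-> 2.
    labels = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0, (2, 1): 0.5}
    L = laplacian(3, labels)
    u = [1.0, 0.0, 0.0]
    for _ in range(1000):
        u = euler_step(L, u, 0.001)
    print(u, sum(u))
```

Each column of the Laplacian sums to zero, which is the matrix form of the conservation law: the total $\sum_{i} u_{i}$ is unchanged by the dynamics.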

The linear dynamics in Equation 28 gives the linear framework its name and is common to all applications. The treatment of the external components, which appear in the edge labels and which introduce nonlinearities, depends on the application. For the case of allostery treated here, we make the same assumptions as in thermodynamics for the grand canonical ensemble, with each ligand being present in a reservoir from which binding and unbinding to graph vertices does not change its free concentration. In this case, the edge labels are effectively constant. The same assumptions are implicitly used in other studies of allostery.

Steady states and thermodynamic equilibrium

The dynamics in Equation 28 always tends to a steady state, at which $d u / d t = 0$, and, under the fundamental timescale separation, it is assumed to have reached a steady state. If the graph is strongly connected, it has a unique steady state up to a scalar multiple, so that $\dim \ker ℒ (G) = 1$. Strong connectivity means that, given any two distinct vertices, $i$ and $j$, there is a path of directed edges from $i$ to $j$, $i = i_{1} \to i_{2} \to \dots \to i_{k - 1} \to i_{k} = j$. Under strong connectivity, a representative steady state for the dynamics, $ρ (G) \in \ker ℒ (G)$, may be calculated in terms of the edge labels by the Matrix Tree Theorem. We omit the corresponding expression as it is not needed here, but it can be found in any of the references given above. This expression holds whether or not the steady state is one of thermodynamic equilibrium. However, at thermodynamic equilibrium, the description of the steady state simplifies considerably because detailed balance holds. This means that the graph is reversible, so that, if $i \to j$, then also $j \to i$, and each pair of such edges is independently in flux balance, so that

ρ_{i} (G) ℓ (i \to j) = ρ_{j} (G) ℓ (j \to i) .

(29)

This ‘microscopic reversibility’ is a fundamental property of thermodynamic equilibrium. Note that a reversible, connected graph is necessarily strongly connected.

Take any path of reversible edges from the reference vertex 1 to some vertex $i$, $1 = i_{1} ⇌ i_{2} ⇌ \dots ⇌ i_{k - 1} ⇌ i_{k} = i$, and let $μ_{i} (G)$ be the product of the label ratios along the path:

μ_{i} (G) = (\frac{ℓ (i_{1} \to i_{2})}{ℓ (i_{2} \to i_{1})}) \times \dots \times (\frac{ℓ (i_{k - 1} \to i_{k})}{ℓ (i_{k} \to i_{k - 1})}) .

(30)

It is straightforward to see from Equation 29 that $μ_{i} (G)$ does not depend on the chosen path and that $ρ_{i} (G) = μ_{i} (G) ρ_{1} (G)$ . The vector $μ (G)$ is therefore a scalar multiple of $ρ (G)$ and so also a steady state for the dynamics. The detailed balance formula in Equation 29 also holds for μ in place of ρ. At thermodynamic equilibrium, the only parameters needed to describe steady states are label ratios.
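The path-product calculation of $μ (G)$ in Equation 30 can be sketched as a breadth-first traversal from the reference vertex. This is an illustrative sketch with hypothetical function names; vertices are numbered from 0, so vertex 0 plays the role of the reference vertex 1. The check in `satisfies_detailed_balance` tests Equation 29 with μ in place of ρ, and fails when the supplied rates are not consistent with thermodynamic equilibrium.

```python
import math
from collections import deque
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def mu_from_rates(num_vertices: int, labels: Dict[Edge, float], reference: int = 0) -> List[float]:
    """Product of label ratios along a path from the reference vertex (Equation 30)."""
    neighbours: Dict[int, List[int]] = {v: [] for v in range(num_vertices)}
    for (i, j) in labels:
        if (j, i) not in labels:
            raise ValueError("the graph must be reversible")
        neighbours[i].append(j)
    mu = {reference: 1.0}
    queue = deque([reference])
    while queue:
        i = queue.popleft()
        for j in neighbours[i]:
            if j not in mu:
                mu[j] = mu[i] * labels[(i, j)] / labels[(j, i)]
                queue.append(j)
    if len(mu) != num_vertices:
        raise ValueError("the graph must be connected")
    return [mu[v] for v in range(num_vertices)]


def satisfies_detailed_balance(mu: List[float], labels: Dict[Edge, float]) -> bool:
    """Check mu_i * l(i -> j) == mu_j * l(j -> i) on every edge."""
    return all(
        math.isclose(mu[i] * labels[(i, j)], mu[j] * labels[(j, i)], rel_tol=1e-9)
        for (i, j) in labels
    )


if __name__ == "__main__":
    # A hypothetical triangle whose label ratios are 2, 3 and 6 = 2 * 3.
    rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0, (2, 1): 1.0, (0, 2): 6.0, (2, 0): 1.0}
    mu = mu_from_rates(3, rates)
    print(mu, satisfies_detailed_balance(mu, rates))
```

When detailed balance holds, the result does not depend on which path the traversal happens to follow, in line with the path independence noted above.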

Equilibrium graphs and independent parameters

This observation about label ratios leads to the concept of an equilibrium graph. Suppose that $G$ is a linear framework graph which can reach thermodynamic equilibrium and is therefore reversible (above). $G$ gives rise to an equilibrium graph, $ℰ (G)$ , as follows. The vertices and edges of $ℰ (G)$ are the same as those of $G$ , but the edge labels in $ℰ (G)$ , which we will refer to as ‘equilibrium edge labels’ and denote $ℓ_{e q} (i \to j)$ , are the label ratios in $G$ . In other words,

ℓ_{e q} (i \to j) = \frac{ℓ (i \to_{G} j)}{ℓ (j \to_{G} i)} .

(31)

Scheme 1 illustrates the relationship between the linear framework graph and the corresponding equilibrium graph. Note that the equilibrium edge labels of $ℰ (G)$ are non-dimensional and that $ℓ_{e q} (j \to i) = ℓ_{e q} (i \to j)^{- 1}$ . The equilibrium edge labels are the essential parameters for describing a state of thermodynamic equilibrium.

These parameters are not independent because Equation 29 implies algebraic relationships among them. Indeed, Equation 29 is equivalent to the following ‘cycle condition’, which we formulate for $ℰ (G)$ : given any cycle of edges, $i_{1} \to i_{2} \to \dots \to i_{k - 1} \to i_{1}$ , the product of the equilibrium edge labels along the cycle is always 1:

ℓ_{e q} (i_{1} \to i_{2}) \times \dots \times ℓ_{e q} (i_{k - 1} \to i_{1}) = 1 .

(32)

This cycle condition is equivalent to the detailed balance condition in Equation 29 and either condition is equivalent to $G$ being at thermodynamic equilibrium.
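The passage from rates to equilibrium edge labels (Equation 31) and the cycle condition (Equation 32) can be illustrated with a short sketch. The function names and the example rates are hypothetical; vertices are numbered from 0.

```python
from typing import Dict, Sequence, Tuple

Edge = Tuple[int, int]


def equilibrium_labels(rates: Dict[Edge, float]) -> Dict[Edge, float]:
    """Equation 31: l_eq(i -> j) = l(i -> j) / l(j -> i).

    Raises KeyError if some edge lacks its reverse, i.e. the graph is not reversible.
    """
    return {(i, j): rates[(i, j)] / rates[(j, i)] for (i, j) in rates}


def cycle_product(cycle: Sequence[int], eq: Dict[Edge, float]) -> float:
    """Product of equilibrium labels along cycle[0] -> cycle[1] -> ... -> cycle[0]."""
    product = 1.0
    closed = list(cycle) + [cycle[0]]
    for a, b in zip(closed[:-1], closed[1:]):
        product *= eq[(a, b)]
    return product


if __name__ == "__main__":
    # Hypothetical rates on a triangle, chosen so that the cycle condition holds.
    rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0, (2, 1): 1.0, (0, 2): 6.0, (2, 0): 1.0}
    eq = equilibrium_labels(rates)
    print(eq[(0, 1)], eq[(1, 0)])        # reciprocal labels
    print(cycle_product([0, 1, 2], eq))  # close to 1 at thermodynamic equilibrium
```

A product that differs from 1 on any cycle signals that the chosen rates cannot describe a system at thermodynamic equilibrium.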

There is a systematic procedure for choosing a set of equilibrium edge label parameters which are both independent, so that there are no algebraic relationships among them, and also complete, so that all other equilibrium edge labels can be algebraically calculated from them. Recall that a spanning tree of $G$ is a connected subgraph, $T$ , which contains each vertex of $G$ (spanning) and which has no cycles when edge directions are ignored (tree). Any strongly connected graph has a spanning tree and the number of edges in such a tree is one less than the number of vertices in the graph. Since $G$ and $ℰ (G)$ have the same vertices and edges, they have identical spanning trees. The equilibrium edge labels $ℓ_{e q} (i \to_{T} j)$ , taken over all edges $i \to j$ of $T$ , form a complete and independent set of parameters at thermodynamic equilibrium. In particular, if $G$ has $N$ vertices, there are $N - 1$ independent parameters at thermodynamic equilibrium.
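The spanning-tree parameterisation can be sketched as follows: the $N - 1$ equilibrium labels on the tree edges determine $μ$ at every vertex, and the label on any edge outside the tree then follows as a ratio of $μ$ values. The function names are hypothetical, vertices are numbered from 0, and the example graph (a square, as for two binding sites in a single conformation) and its label values are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def mu_from_spanning_tree(num_vertices: int, tree_labels: Dict[Edge, float], reference: int = 0) -> List[float]:
    """Compute mu at every vertex from equilibrium labels on spanning-tree edges.

    tree_labels holds one entry (i, j) -> l_eq(i -> j) per tree edge.
    """
    if len(tree_labels) != num_vertices - 1:
        raise ValueError("a spanning tree has exactly N - 1 edges")
    neighbours: Dict[int, List[Tuple[int, float]]] = {v: [] for v in range(num_vertices)}
    for (i, j), value in tree_labels.items():
        neighbours[i].append((j, value))        # traversing i -> j
        neighbours[j].append((i, 1.0 / value))  # traversing j -> i uses the reciprocal
    mu = {reference: 1.0}
    stack = [reference]
    while stack:
        i = stack.pop()
        for j, value in neighbours[i]:
            if j not in mu:
                mu[j] = mu[i] * value
                stack.append(j)
    if len(mu) != num_vertices:
        raise ValueError("the given edges do not connect every vertex")
    return [mu[v] for v in range(num_vertices)]


def derived_label(mu: List[float], i: int, j: int) -> float:
    """Equilibrium label of any edge i -> j, forced by detailed balance."""
    return mu[j] / mu[i]


if __name__ == "__main__":
    # Square with edges 0-1, 0-2, 1-3, 2-3; the tree omits the edge 2-3.
    tree = {(0, 1): 2.0, (0, 2): 4.0, (1, 3): 3.0}
    mu = mu_from_spanning_tree(4, tree)
    print(mu, derived_label(mu, 2, 3))
```

In this example the three tree labels are free, while the fourth label, on the edge from vertex 2 to vertex 3, is fixed by the cycle condition.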

In the main text, we defined an equilibrium allostery graph, $A$ (Figure 4), without specifying a corresponding linear framework graph, $G$, for which $ℰ (G) = A$. Because label ratios are used in an equilibrium graph, there is no unique linear framework graph corresponding to it. However, some choice of transition rates, $ℓ (i \to_{G} j)$ and $ℓ (j \to_{G} i)$, can always be made such that their ratio is $ℓ_{e q} (i \to_{ℰ (G)} j)$. Hence, some linear framework graph $G$ can always be defined such that $ℰ (G) = A$. In some of the constructions below, we will work with the linear framework graph, $G$, rather than with the equilibrium graph $A$ and will then show that the construction does not depend on the choice of $G$.

Steady-state probabilities and equilibrium statistical mechanics

The steady-state probability of vertex $i$ , ${Pr}_{i} (G)$ , can be calculated from the steady state of the dynamics by normalising, so that

{Pr}_{i} (G) = \frac{ρ_{i} (G)}{ρ_{1} (G) + \dots + ρ_{N} (G)} \quad \text{or} \quad {Pr}_{i} (G) = \frac{μ_{i} (G)}{μ_{1} (G) + \dots + μ_{N} (G)},

(33)

where the first formula holds for any strongly connected graph and the second formula also holds if the graph is at thermodynamic equilibrium. In the latter case, Equation 29 holds and $μ (G)$ can be defined by Equation 30. The second formula in Equation 33 corresponds to Equation 3. If the graph is at thermodynamic equilibrium, the equilibrium edge labels may be interpreted thermodynamically, as illustrated in Figure 3 and discussed in the main text (Equation 1):

ℓ_{e q} (i \to j) = \exp (\frac{Δ Φ}{k_{B} T}) .

(34)

If Equation 34 is used to expand the second formula in Equation 33, it gives the specification of equilibrium statistical mechanics for the grand canonical ensemble, with the denominator being the partition function.

It will be helpful to let $Π (G)$ and $Ψ (G)$ denote the corresponding denominators in Equation 33, so that $Π (G) = ρ_{1} (G) + \dots + ρ_{N} (G)$ for any strongly connected graph and $Ψ (G) = μ_{1} (G) + \dots + μ_{N} (G)$ for a graph which is at thermodynamic equilibrium. We will refer to $Π (G)$ and $Ψ (G)$ as partition functions. It follows from Equation 33 that

{Pr}_{i} (G) Π (G) = ρ_{i} (G) or {Pr}_{i} (G) Ψ (G) = μ_{i} (G),

(35)

depending on the context.
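As an illustrative sketch of the second formula in Equation 33, the following Python fragment computes the quantities $μ_{i}$ as products of equilibrium labels along paths from the reference vertex and then normalises by the partition function. The function names `mu_from_tree` and `probabilities`, and the representation of a spanning tree as a dictionary of edge labels, are assumptions made for this example and do not come from the paper.

```python
from collections import deque
from fractions import Fraction


def mu_from_tree(tree_labels, reference):
    """Compute mu_i as the product of equilibrium labels along the tree path.

    tree_labels maps a directed edge (i, j) of a spanning tree to l_eq(i -> j).
    The reverse edge j -> i carries the reciprocal label.
    """
    neighbours = {}
    for (i, j), label in tree_labels.items():
        label = Fraction(label)
        neighbours.setdefault(i, []).append((j, label))      # edge i -> j
        neighbours.setdefault(j, []).append((i, 1 / label))  # edge j -> i
    mu = {reference: Fraction(1)}  # the reference vertex has mu = 1
    queue = deque([reference])
    while queue:
        i = queue.popleft()
        for j, label in neighbours.get(i, []):
            if j not in mu:
                mu[j] = mu[i] * label
                queue.append(j)
    return mu


def probabilities(mu):
    """Normalise mu by the partition function Psi (second formula in Equation 33)."""
    psi = sum(mu.values())
    return {i: m / psi for i, m in mu.items()}
```

Exact rational arithmetic is used so that identities between probabilities can be checked without rounding error.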

The allostery graph

Structure and labels

An allostery graph, $A$ , is an equilibrium graph which describes the interplay between conformational change and ligand binding, as illustrated in Figure 4. Its vertices are indexed by $(c_{k}, S)$ , where $c_{k}$ specifies a conformation with $1 \leq k \leq N$ and $S \subseteq {1, \dots, n}$ specifies a subset of sites bound by a ligand whose concentration is $x$ . There is no difficulty in allowing multiple ligands and overlapping binding sites, but to keep the formalism simple, we describe here the case of a single ligand and distinct binding sites.

Recall from the main text that $A$ has vertical subgraphs, $A^{c_{k}}$ , consisting of vertices $(c_{k}, R)$ for all binding subsets, $R$ , together with all edges between them, with the vertices indexed by binding subsets, $R$ , and with $R = \emptyset$ being the reference vertex. $A$ has horizontal subgraphs, $A_{S}$ , consisting of vertices $(c_{i}, S)$ for all conformations $c_{i}$ , together with all edges between them, with the vertices labelled by conformations $c_{i}$ , and with $c_{1}$ being the reference vertex. The product structure of $A$ is revealed by all vertical subgraphs having the same structure as each other and all horizontal subgraphs having the same structure as each other (Figure 4).

As for the labels, the vertical binding edges have equilibrium labels,

ℓ_{e q} ((c_{k}, S) \to_{A} (c_{k}, S \cup {i})) = x K_{c_{k}, i, S} (i \notin S),

(36)

where $x$ is the concentration of the ligand and $K_{c_{k}, i, S}$ is the association constant for binding to site $i$ when the ligand is already bound at the sites in $S$ . The horizontal edges, which represent transitions between conformations, have equilibrium labels, $ℓ_{e q} ((c_{k}, S) \to_{A} (c_{l}, S))$ , which are not individually annotated. However, it is only necessary to specify these equilibrium labels for a single horizontal subgraph, of which the subgraph of empty conformations, $A_{\emptyset}$ , is particularly convenient. To see this, let us calculate the quantity $μ_{(c_{k}, S)} (A)$ using Equation 30. Taking the reference vertex in $A$ to be $(c_{1}, \emptyset)$ , we can always find a path to any given vertex $(c_{k}, S)$ of $A$ by first moving horizontally within $A_{\emptyset}$ from $(c_{1}, \emptyset)$ to $(c_{k}, \emptyset)$ and then moving vertically within $A^{c_{k}}$ from $(c_{k}, \emptyset)$ to $(c_{k}, S)$ . According to Equation 30, the steady state is given by the product of the equilibrium labels along this path, so that

μ_{(c_{k}, S)} (A) = μ_{c_{k}} (A_{\emptyset}) μ_{S} (A^{c_{k}}) .

(37)

Now consider any horizontal edge in $A$ , $(c_{k}, S) \to (c_{l}, S)$ . Since $A$ is at thermodynamic equilibrium, it follows from Equation 29, using μ in place of ρ, and Equation 37, that

ℓ_{e q} ((c_{k}, S) \to_{A} (c_{l}, S)) = \frac{μ_{(c_{l}, S)} (A)}{μ_{(c_{k}, S)} (A)} = (\frac{μ_{c_{l}} (A_{\emptyset})}{μ_{c_{k}} (A_{\emptyset})}) (\frac{μ_{S} (A^{c_{l}})}{μ_{S} (A^{c_{k}})}) .

Applying Equation 29 to $A_{\emptyset}$ , with μ in place of ρ, we see that

ℓ_{e q} ((c_{k}, \emptyset) \to_{A_{\emptyset}} (c_{l}, \emptyset)) = \frac{μ_{c_{l}} (A_{\emptyset})}{μ_{c_{k}} (A_{\emptyset})} .

Hence, it follows that

ℓ_{e q} ((c_{k}, S) \to_{A} (c_{l}, S)) = ℓ_{e q} ((c_{k}, \emptyset) \to_{A_{\emptyset}} (c_{l}, \emptyset)) (\frac{μ_{S} (A^{c_{l}})}{μ_{S} (A^{c_{k}})}) .

(38)

Accordingly, all the labels in $A$ are determined by the vertical labels in Equation 36, from which $μ_{S} (A^{c_{k}})$ and $μ_{S} (A^{c_{l}})$ are determined, and the horizontal labels in the subgraph of empty conformations, $A_{\emptyset}$ . As can be seen from Scheme 2, Equation 38 amounts to exploiting the equilibrium cycle condition in Equation 32.

Scheme 2. A hypothetical allostery graph shows how the label for the edge at the top, which links the vertical subgraphs at conformations $c_{k}$ and $c_{l}$ at the binding subset $S$ , can be calculated from the quantities $μ_{S} (A^{c_{k}})$ and $μ_{S} (A^{c_{l}})$ and the edge label at the bottom. The $μ_{S}$ quantities come from paths to the vertices in question from the respective reference vertices in the vertical subgraphs, as specified in Equation 30 and Scheme 1. The edge label at the bottom comes from the horizontal subgraph of empty conformations, $A_{\emptyset}$ . The vertical subgraphs $A^{c_{k}}$ and $A^{c_{l}}$ have the same structure and the paths are shown as the same in each subgraph, but they could be arbitrary paths because of the cycle condition at thermodynamic equilibrium (Equation 32). Once appropriate directions are taken, the two paths and the edges at the top and bottom constitute a large cycle in the allostery graph and Equation 38 is simply a rewriting of Equation 32 applied to this cycle.

Independent parameters

We can choose any spanning tree in the horizontal subgraph of empty conformations, $A_{\emptyset}$ . As explained above, the equilibrium labels on the edges of this tree define a complete set of $N - 1$ independent parameters for $A_{\emptyset}$ . As for the vertical subgraphs, $A^{c_{k}}$ , which all have the same structure, consider the subgraph of $A^{c_{k}}$ consisting of all edges, together with the corresponding source and target vertices, of the form, $(c_{k}, S) \to (c_{k}, S \cup {i})$ , where $\emptyset \subseteq S \subset {1, \dots, n}$ and $i$ is less than all the sites in $S$ ( $i < S$ ). It is not difficult to see that this subgraph is a spanning tree of $A^{c_{k}}$ (Estrada et al., 2016, SI, §3.2). Accordingly, the association constants, $K_{c_{k}, i, S}$ from Equation 36, with $i < S$ , form a complete set of $2^{n} - 1$ independent parameters for $A^{c_{k}}$ . Because of the product structure of $A$ , adjoining the spanning trees in $A^{c_{k}}$ , for each conformation $c_{k}$ with $1 \leq k \leq N$ , to the spanning tree in $A_{\emptyset}$ , yields a spanning tree in $A$ . Hence, the independent parameters for each $A^{c_{k}}$ together with the $N - 1$ independent parameters for $A_{\emptyset}$ are also collectively independent as parameters for $A$ , giving $N (2^{n} - 1) + (N - 1) = N 2^{n} - 1$ parameters in total, which is one less than the number of vertices of $A$ . It follows from the description of labels above that these parameters are also complete for $A$ , so that any equilibrium label in $A$ can be expressed in terms of them.
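The spanning tree of a vertical subgraph described above can be enumerated directly. The following Python sketch lists the edges $S \to S \cup {i}$ with $i < S$ for $n$ sites and counts the resulting independent parameters of the whole allostery graph. The function names are hypothetical, and binding subsets are represented as frozensets purely for convenience.

```python
from itertools import combinations


def vertical_tree_edges(n):
    """Edges S -> S + {i} of a vertical subgraph in which i is less than every site in S."""
    sites = range(1, n + 1)
    edges = []
    for size in range(n):
        for S in combinations(sites, size):
            for i in sites:
                # all(...) is True for the empty subset, so every edge from
                # the empty subset to a singleton {i} is included.
                if i not in S and all(i < j for j in S):
                    edges.append((frozenset(S), frozenset(S) | {i}))
    return edges


def count_independent_parameters(N, n):
    """Parameters of A: one per tree edge in each vertical subgraph, plus N - 1 for A_empty."""
    return N * len(vertical_tree_edges(n)) + (N - 1)
```

Each non-empty binding subset is reached by exactly one such edge (the one that adds its smallest site), which is why the edges form a spanning tree with $2^{n} - 1$ edges.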

A general method of coarse graining

Coarse graining a linear framework graph and Equation 17

We will describe the coarse-graining procedure for an arbitrary reversible linear framework graph, $G$ , and then explain how this can be adapted to an equilibrium graph, as described for the allostery graph $A$ in the main text.

We will say that a graph $G$ is in-uniform if, given any vertex $j \in ν (G)$ , then for all edges $i \to j$ , $ℓ (i \to j)$ does not depend on the source vertex $i$ .

Lemma 1

Suppose that $G$ is reversible and in-uniform. Then, $G$ is at thermodynamic equilibrium and the vector θ given by $θ_{j} = ℓ (i \to j)$ , which is well-defined by hypothesis, is a basis element in $\ker L (G)$ and a steady state for the dynamics.

Proof: If $i_{1} ⇌ i_{2} ⇌ \dots ⇌ i_{k - 1} ⇌ i_{k}$ is any path of reversible edges in $G$ , then the product of the label ratios along the path satisfies

(\frac{ℓ (i_{1} \to i_{2})}{ℓ (i_{2} \to i_{1})}) (\frac{ℓ (i_{2} \to i_{3})}{ℓ (i_{3} \to i_{2})}) \dots (\frac{ℓ (i_{k - 2} \to i_{k - 1})}{ℓ (i_{k - 1} \to i_{k - 2})}) (\frac{ℓ (i_{k - 1} \to i_{k})}{ℓ (i_{k} \to i_{k - 1})}) = \frac{ℓ (i_{k - 1} \to i_{k})}{ℓ (i_{2} \to i_{1})},

(39)

because the intermediate terms cancel out by the in-uniform hypothesis. If the path is a cycle, so that $i_{k} = i_{1}$ , then, again because of the in-uniform hypothesis, the right-hand side of Equation 39 is 1. Hence, $G$ satisfies the cycle condition in Equation 32 and is therefore at thermodynamic equilibrium. For the last statement, assume that $i_{1}$ is the reference vertex 1 and that $i_{k} = j$ , for any vertex $j$ . Using Equation 30, we see that $μ_{j} (G) = θ_{j} / θ_{1}$ . Since $θ_{1}$ is a scalar multiple, the last statement follows.

◼
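Lemma 1 can be checked numerically on a small example. The Python sketch below labels every edge $i \to j$ of a reversible graph with $θ_{j}$ , confirms that the product of label ratios around a cycle is 1, and confirms that θ is a steady state by showing that the net flux into each vertex vanishes. The function names and the triangle graph used in the usage note are illustrative assumptions.

```python
from fractions import Fraction


def in_uniform_labels(edges, theta):
    """Label each directed edge i -> j of a reversible graph with theta[j]."""
    labels = {}
    for i, j in edges:
        labels[(i, j)] = Fraction(theta[j])
        labels[(j, i)] = Fraction(theta[i])
    return labels


def net_flux(labels, state):
    """Net flux into each vertex for the linear dynamics with the given edge labels."""
    flux = {v: Fraction(0) for v in state}
    for (i, j), rate in labels.items():
        flux[j] += rate * state[i]  # inflow to j along i -> j
        flux[i] -= rate * state[i]  # matching outflow from i
    return flux


def cycle_product(labels, cycle):
    """Product of the label ratios around a closed path given as a list of vertices."""
    product = Fraction(1)
    for i, j in zip(cycle, cycle[1:] + cycle[:1]):
        product *= labels[(i, j)] / labels[(j, i)]
    return product
```

For a triangle with `theta = {1: 2, 2: 5, 3: 7}`, `net_flux(labels, theta)` is zero at every vertex and `cycle_product(labels, [1, 2, 3])` equals 1, in agreement with the lemma.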

Now let $G$ be an arbitrary reversible graph, which need not satisfy detailed balance. Let $G_{1}, \dots, G_{m}$ be any partition of the vertices of $G$ , so that $G_{i} \subseteq ν (G)$ , $G_{1} \cup \dots \cup G_{m} = ν (G)$ and $G_{i} \cap G_{j} = \emptyset$ when $i \neq j$ . Let $𝒞 (G)$ be the labelled directed graph with $ν (𝒞 (G)) = {1, \dots, m}$ and let $u \to_{𝒞 (G)} v$ if, and only if, there exists $i \in G_{u}$ and $j \in G_{v}$ such that $i \to_{G} j$ . Finally, let the edge labels of $𝒞 (G)$ be given by

ℓ (u \to_{𝒞 (G)} v) = Q (\sum_{i \in G_{v}} ρ_{i} (G)) .

(40)

The quantity $Q$ in Equation 40 is chosen arbitrarily so that the dimension of $ℓ (u \to v)$ is ${(\text{time})}^{- 1}$ , as required for an edge label. This is necessary because, by the Matrix Tree Theorem, the dimension of $ρ_{i} (G)$ is ${(\text{time})}^{1 - N}$ , where $N$ is the number of vertices in $G$ . However, $Q$ plays no role in the analysis which follows because the coarse graining applies only to the steady state of $𝒞 (G)$ , not its transient dynamics, and, as we will see, $𝒞 (G)$ is always at thermodynamic equilibrium, so that $Q$ disappears when equilibrium edge labels are considered.

Note that $𝒞 (G)$ inherits reversibility from $G$ and that $𝒞 (G)$ is in-uniform. Hence, by Lemma 1, $𝒞 (G)$ is at thermodynamic equilibrium and

λ μ_{v} (𝒞 (G)) = Q (\sum_{i \in G_{v}} ρ_{i} (G)),

(41)

where λ is a scalar that does not depend on $v \in ν (𝒞 (G))$ . Since $G_{1}, \dots, G_{m}$ is a partition of the vertices of $G$ , it follows from Equation 41 that

λ Ψ (𝒞 (G)) = λ (\sum_{v \in ν (𝒞 (G))} μ_{v} (𝒞 (G))) = Q (\sum_{i \in ν (G)} ρ_{i} (G)) = Q Π (G) .

Equations 35 and 41 then show that both λ and $Q$ cancel in the ratio for the steady-state probabilities, so that

{Pr}_{v} (𝒞 (G)) = \sum_{i \in G_{v}} {Pr}_{i} (G) .

(42)

Equation 42 is the coarse-graining equation, as given in Equation 17.
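The steps leading to Equation 42 can be sketched in a few lines of Python. The example assumes that the steady-state quantities $ρ_{i} (G)$ are already available (here they are simply supplied as numbers) and that the coarse-grained graph is connected. The function name `coarse_grain` is hypothetical.

```python
from fractions import Fraction


def coarse_grain(rho, partition, Q=Fraction(1)):
    """Steady-state probabilities of the coarse-grained graph C(G).

    rho maps each vertex of G to rho_i(G); partition is a list of blocks of
    vertices. Every edge into block v carries the label Q * sum(rho_i for i
    in block v) (Equation 40), so C(G) is in-uniform.
    """
    theta = [Q * sum(Fraction(rho[i]) for i in block) for block in partition]
    mu = [t / theta[0] for t in theta]  # Lemma 1: mu_v = theta_v / theta_1
    psi = sum(mu)                       # partition function of C(G)
    return [m / psi for m in mu]
```

The arbitrary factor `Q` cancels when `mu` is formed, so the returned probabilities do not depend on it, and each one equals the sum of the fine-grained probabilities over the corresponding block.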

Coarse graining an equilibrium graph

The coarse-graining procedure described above can be applied to any reversible graph, which need not be at thermodynamic equilibrium. However, the coarse graining described in the paper was for an equilibrium graph. It is not difficult to see that the construction above can be undertaken consistently for any equilibrium graph. It is helpful to first establish a more general observation. The choice of edge labels for $𝒞 (G)$ , as given in Equation 40, is not the only one for which Equation 42 holds, as the appearance of the factor $Q$ indicates. However, the label ratios in $𝒞 (G)$ are uniquely determined by the labels of $G$ .

Suppose that $G$ is a reversible graph with a vertex partition $G_{1}, \dots, G_{m}$ , as above. $G$ need not be at thermodynamic equilibrium. Suppose that $C$ is a graph which is isomorphic to $𝒞 (G)$ as a directed graph (‘structurally isomorphic’), in the sense that it has identical vertices and edges but may have different edge labels. (Technically speaking, an ‘isomorphism’ allows for the vertices of $C$ to have an alternative indexing to those of $𝒞 (G)$ as long as the two indexings can be inter-converted so as to preserve the edges. For simplicity of exposition, we assume that the indexing is, in fact, identical. No loss of generality arises from doing this.)

Lemma 2

Suppose that $C$ is at thermodynamic equilibrium and the coarse-graining equation (Equation 42) holds for $C$ , so that ${Pr}_{u} (C) = \sum_{i \in G_{u}} {Pr}_{i} (G)$ . If $u ⇌_{C} v$ is any reversible edge, then its equilibrium label depends only on $G$ ,

ℓ_{e q} (u \to_{C} v) = \frac{\sum_{i \in G_{v}} ρ_{i} (G)}{\sum_{i \in G_{u}} ρ_{i} (G)},

and $C$ and $𝒞 (G)$ are isomorphic as equilibrium graphs, so that identical edges have identical equilibrium labels.

Proof: It follows from Equation 35 that ${Pr}_{i} (G) = ρ_{i} (G) / Π (G)$ and, since $C$ is at thermodynamic equilibrium, ${Pr}_{u} (C) = μ_{u} (C) / Ψ (C)$ . Using the coarse-graining equation for ${Pr}_{u} (C)$ , we see that

μ_{u} (C) = (\sum_{i \in G_{u}} ρ_{i} (G)) (\frac{Ψ (C)}{Π (G)}) .

(43)

Since $C$ is at thermodynamic equilibrium, Equation 29, with μ in place of ρ, implies that

ℓ_{e q} (u \to_{C} v) = \frac{μ_{v} (C)}{μ_{u} (C)} .

Substituting with Equation 43, the partition functions cancel out to give the formula above. Since $𝒞 (G)$ satisfies the same assumptions as $C$ , it has the same equilibrium labels. Hence, $C$ and $𝒞 (G)$ must be isomorphic as equilibrium graphs.

◼

Corollary 1

Suppose that $A$ is an equilibrium graph and that $G$ is any graph for which $ℰ (G) = A$ , as described above. If any coarse graining of $G$ is undertaken to yield the coarse-grained graph $𝒞 (G)$ , which must be at thermodynamic equilibrium, then

ℓ_{e q} (u \to_{𝒞 (G)} v) = \frac{\sum_{i \in A_{v}} μ_{i} (A)}{\sum_{i \in A_{u}} μ_{i} (A)}

and $ℰ (𝒞 (G))$ depends only on $A$ and not on the choice of $G$ .

Proof: $A$ acquires from $G$ the same coarse graining, with the partition $A_{1}, \dots, A_{m}$ of $ν (A)$ , where $A_{i} = G_{i} \subseteq ν (G) = ν (A)$ . By hypothesis, $G$ is at thermodynamic equilibrium, so that $ρ_{i} (G) = λ μ_{i} (G)$ for some scalar multiple λ. Also, since $ℰ (G) = A$ , $μ_{i} (G) = μ_{i} (A)$ . Substituting in the formula in Lemma 2 yields the formula above. The equilibrium labels of $𝒞 (G)$ therefore depend only on the equilibrium labels of $A$ , as required.

◼

It follows from Corollary 1 that coarse graining can be carried out on an equilibrium graph, $A$ , by choosing any graph $G$ for which $ℰ (G) = A$ and carrying out the coarse-graining procedure described above on $G$ . This justifies the coarse-graining construction described in the main text.

Coarse graining the allostery graph

Proof of Equation 18

As described in the main text and Figure 4, the coarse-grained allostery graph, $A^{ϕ} = 𝒞 (A)$ , is defined using the partition of $A$ by its horizontal subgraphs, $A_{S}$ , where $S$ runs through all binding subsets, $S \subseteq {1, \dots, n}$ . $A^{ϕ}$ has the same structure of vertices and edges as any of the binding subgraphs, $A^{c_{k}}$ , and is indexed in the same way by the binding subsets, $S$ . Scheme 3 shows an example, which illustrates the calculations undertaken in this section.

Scheme 3. At top left is an example allostery graph, with binding of a single ligand to $n = 2$ sites for $N = 3$ conformations. Vertices indicate a bound site with a solid black dot and an unbound site with a black dash and binding subsets are colour coded: both sites unbound, black; only site 1 bound, magenta; only site 2 bound, cyan; both sites bound, blue. Some vertices are annotated and some edge labels are shown, with $x$ denoting ligand concentration. Note that the allostery graph has been oriented with its reference vertex at the top, in contrast to the graphs in the main text figures, in order to accommodate the formulas. Example calculations of $μ_{S}$ based on Equation 30 are shown for the vertical subgraph $A^{c_{3}}$ . At bottom is the horizontal subgraph $A_{\emptyset}$ along with the calculation of its steady-state probability distribution in terms of the equilibrium labels, $l_{1}, l_{2}$ , and the quantities $μ_{c_{k}}$ . At top right is the coarse-grained allostery graph, $A^{ϕ}$ , with vertices colour coded as for the binding subsets of the allostery graph. Equation 48 for the effective association constants is illustrated below $A^{ϕ}$ .

Consider the reversible edge in $A^{ϕ}$ , $S ⇌ S \cup {i}$ , where $i \notin S$ . This reversible edge effectively arises from the binding and unbinding of ligand at site $i$ . According to Equation 36, its effective association constant, $K_{i, S}^{ϕ}$ , should satisfy

x K_{i, S}^{ϕ} = ℓ_{e q} (S \to_{A^{ϕ}} S \cup {i}) .

(44)

Since $A$ is at thermodynamic equilibrium, we can make use of the formula in Corollary 1 to rewrite this as

K_{i, S}^{ϕ} = x^{- 1} (\frac{\sum_{1 \leq k \leq N} μ_{(c_{k}, S \cup {i})} (A)}{\sum_{1 \leq k \leq N} μ_{(c_{k}, S)} (A)}) .

Equations 30 and 36 tell us that $μ_{(c_{k}, S \cup {i})} (A) = x K_{c_{k}, i, S} μ_{(c_{k}, S)} (A)$ , so that, after rearranging,

K_{i, S}^{ϕ} = \sum_{1 \leq k \leq N} K_{c_{k}, i, S} (\frac{μ_{(c_{k}, S)} (A)}{\sum_{1 \leq k \leq N} μ_{(c_{k}, S)} (A)}) .

(45)

We can now appeal to Equations 35 and 37 to rewrite the term in brackets on the right as

\frac{μ_{S} (A^{c_{k}}) μ_{c_{k}} (A_{\emptyset})}{\sum_{1 \leq k \leq N} μ_{S} (A^{c_{k}}) μ_{c_{k}} (A_{\emptyset})} = \frac{μ_{S} (A^{c_{k}}) {Pr}_{c_{k}} (A_{\emptyset})}{\sum_{1 \leq k \leq N} μ_{S} (A^{c_{k}}) {Pr}_{c_{k}} (A_{\emptyset})} .

(46)

At this point, it will be helpful to introduce the following notation. If $G$ is any equilibrium graph and $u : ν (G) \to 𝐑$ is any real-valued function defined on the vertices of $G$ , let $⟨ u ⟩$ denote the average of $u$ over the steady-state probability distribution of $G$ ,

⟨ u ⟩ = \sum_{i \in ν (G)} u_{i} {Pr}_{i} (G) .

(47)

With this notation in hand, we can rewrite the denominator in Equation 46 as $⟨ μ_{S} (A^{c_{k}}) ⟩$ , where, from now on, averages will be taken over the steady-state probability distribution of the horizontal subgraph of empty conformations, $A_{\emptyset}$ (Scheme 3, bottom). Inserting this expression back into Equation 45 and rearranging, we obtain a formula for the effective association constant as a ratio of averages,

K_{i, S}^{ϕ} = \frac{⟨ K_{c_{k}, i, S} . μ_{S} (A^{c_{k}}) ⟩}{⟨ μ_{S} (A^{c_{k}}) ⟩},

(48)

which gives the first formula in Equation 18. The ‘dot’ in Equation 48 signifies a product to make the formula easier to read. Scheme 3 demonstrates this calculation. Recall from the main text that HOCs are defined by normalising to the empty binding subset, so that $ω_{i, S}^{ϕ} = K_{i, S}^{ϕ} / K_{i, \emptyset}^{ϕ}$ . Furthermore, since the reference vertex of the vertical subgraphs, $A^{c_{k}}$ , is taken to be the empty binding subset, $μ_{\emptyset} (A^{c_{k}}) = 1$ . It follows that the effective HOCs are given by

ω_{i, S}^{ϕ} = \frac{⟨ K_{c_{k}, i, S} . μ_{S} (A^{c_{k}}) ⟩}{⟨ K_{c_{k}, i, \emptyset} ⟩ . ⟨ μ_{S} (A^{c_{k}}) ⟩},

(49)

which gives the second formula in Equation 18.
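Equations 48 and 49 translate directly into a short calculation. The Python sketch below is restricted, as a simplifying assumption, to conformations with independent sites (no intrinsic HOCs), so that $μ_{S} (A^{c_{k}})$ is a product of the terms $x K_{c_{k}, i}$ over $i \in S$ . The data layout (a list `K` of dictionaries, one per conformation, and a list `lam` of the quantities $μ_{c_{k}} (A_{\emptyset})$ ) and the function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def mu_vertical(K_k, S, x):
    """mu_S(A^{c_k}) for independent sites: product of x * K over the bound sites."""
    return prod((x * K_k[i] for i in S), start=Fraction(1))


def average(values, lam):
    """Average over the steady-state distribution of A_empty, with weights lam."""
    total = Fraction(sum(lam))
    return sum(v * l for v, l in zip(values, lam)) / total


def effective_K(K, lam, i, S, x):
    """Effective association constant K^phi_{i,S} as a ratio of averages (Equation 48)."""
    mus = [mu_vertical(K_k, S, x) for K_k in K]
    numerator = average([K_k[i] * m for K_k, m in zip(K, mus)], lam)
    return numerator / average(mus, lam)


def effective_hoc(K, lam, i, S, x):
    """Effective HOC omega^phi_{i,S}, normalised to the empty subset (Equation 49)."""
    return effective_K(K, lam, i, S, x) / effective_K(K, lam, i, frozenset(), x)
```

With two conformations, `K = [{1: 1, 2: 1}, {1: 2, 2: 3}]` and equal weights `lam = [1, 1]`, the value of `effective_hoc(K, lam, 1, frozenset({2}), x)` is 7/6 for any positive `x`, which illustrates that the effective HOC does not depend on the ligand concentration.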

Elementary properties of effective HOCs

The main text describes three elementary properties of effective HOCs which follow from Equation 49. The only quantity in Equation 49 which involves the ligand concentration, $x$ , is $μ_{S} (A^{c_{k}})$ . It follows from Equation 30 that this quantity is a monomial in $x$ of the form $a x^{p}$ , where $a$ does not involve $x$ and $p = # (S)$ . In particular, $x^{p}$ does not depend on the conformation $c_{k}$ . It follows that $x^{p}$ can be extracted from the averages in Equation 49 and cancelled between the numerator and denominator. Hence, $ω_{i, S}^{ϕ}$ is independent of $x$ . If $S = \emptyset$ , then $μ_{S} (A^{c_{k}}) = 1$ for all $1 \leq k \leq N$ and it follows from Equation 49 that $ω_{i, \emptyset}^{ϕ} = 1$ . Finally, if there is only one conformation $c_{1}$ , the averages in Equation 49 collapse and $μ_{S} (A^{c_{1}})$ cancels above and below, so that $ω_{i, S}^{ϕ} = ω_{c_{1}, i, S}$ , as required.

Generalised MWC formula

The original MWC formula calculates the binding curve, or fractional saturation, of the two-conformation model as a function of ligand concentration $x$ (Monod et al., 1965). Here, we do the same for an arbitrary allostery graph, $A$ . Let $s = # (S)$ . The fractional saturation of $A$ is given by the average binding,

\sum_{1 \leq k \leq N} \sum_{S \subseteq {1, \dots, n}} s {Pr}_{(c_{k}, S)} (A),

normalised to the number of binding sites, $n$ . By the coarse-graining formula in Equation 42, we can rewrite the fractional saturation as

\frac{1}{n} (\sum_{S \subseteq {1, \dots, n}} s {Pr}_{S} (A^{ϕ})) .

(50)

The probability, ${Pr}_{S} (A^{ϕ})$ , can be calculated using Equation 33, which requires the quantities $μ_{S} (A^{ϕ})$ . These can in turn be calculated by the path formula in Equation 30. We can choose the path in $A^{ϕ}$ to use the independent parameters introduced above. Let $S = {i_{1}, \dots, i_{s}}$ , where $i_{1} < \dots < i_{s}$ . Making use of Equation 44, we see that

μ_{S} (A^{ϕ}) = K_{i_{1}, {i_{2}, \dots, i_{s}}}^{ϕ} K_{i_{2}, {i_{3}, \dots, i_{s}}}^{ϕ} \dots K_{i_{s - 1}, {i_{s}}}^{ϕ} K_{i_{s}, \emptyset}^{ϕ} x^{s} .

(51)

Equation 51 can be rewritten in terms of the non-dimensional effective HOCs, but it is simpler for our purposes to use instead the effective association constants, $K_{i, S}^{ϕ}$ . The dependence on $x$ in Equation 51 shows that average binding is given by the logarithmic derivative of the partition function, $Ψ (A^{ϕ})$ , so the fractional saturation can be written as

\frac{1}{n} (\sum_{S \subseteq {1, \dots, n}} s {Pr}_{S} (A^{ϕ})) = \frac{1}{n} (\frac{x}{Ψ (A^{ϕ})}) (\frac{d Ψ (A^{ϕ})}{d x}) .

(52)

With this in mind, Equation 51 shows that the partition function can be written as a polynomial in $x$ ,

Ψ (A^{ϕ}) = \sum_{S \subseteq {1, \dots, n}} μ_{S} (A^{ϕ}) = \sum_{0 \leq s \leq n} (\sum_{1 \leq i_{1} < \dots < i_{s} \leq n} K_{i_{1}, {i_{2}, \dots, i_{s}}}^{ϕ} K_{i_{2}, {i_{3}, \dots, i_{s}}}^{ϕ} \dots K_{i_{s - 1}, {i_{s}}}^{ϕ} K_{i_{s}, \emptyset}^{ϕ}) x^{s} .

Finally, the $K_{i, S}^{ϕ}$ can be determined as averages over the horizontal subgraph of empty conformations using Equation 48. In this way, the fractional saturation in Equation 52 is ultimately determined by the independent parameters of $A$ , giving rise thereby to a generalised MWC formula that is valid for any allostery graph. We explain below how the classical MWC formula is recovered using this procedure.
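The procedure can be illustrated with a small calculation that compares the logarithmic-derivative form of the fractional saturation (Equation 52) with a direct average over all vertices of the allostery graph. As a simplifying assumption, the sketch treats conformations with independent sites. The function names and the data layout (a list `K` of per-conformation dictionaries of association constants and a list `lam` of weights $μ_{c_{k}} (A_{\emptyset})$ ) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def coarse_coefficients(K, lam, n):
    """Coefficients a_s with Psi(A^phi) = sum over s of a_s * x**s."""
    total = Fraction(sum(lam))
    coeffs = []
    for s in range(n + 1):
        a = Fraction(0)
        for S in combinations(range(1, n + 1), s):
            a += sum(l * prod(K_k[i] for i in S) for K_k, l in zip(K, lam)) / total
        coeffs.append(a)
    return coeffs


def fractional_saturation(coeffs, x):
    """(1/n) * (x / Psi) * dPsi/dx for the polynomial partition function (Equation 52)."""
    n = len(coeffs) - 1
    psi = sum(a * x**s for s, a in enumerate(coeffs))
    x_dpsi = sum(s * a * x**s for s, a in enumerate(coeffs))  # x * dPsi/dx
    return x_dpsi / (n * psi)


def direct_saturation(K, lam, n, x):
    """Average number of bound sites over all vertices (c_k, S) of A, divided by n."""
    weighted = Fraction(0)
    total = Fraction(0)
    for K_k, l in zip(K, lam):
        for s in range(n + 1):
            for S in combinations(range(1, n + 1), s):
                w = l * prod(x * K_k[i] for i in S)  # mu of vertex (c_k, S)
                weighted += s * w
                total += w
    return weighted / (n * total)
```

The two routes agree exactly, which is the content of the coarse-graining formula in Equation 42 applied to average binding.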

Effective HOCs for MWC-like models

Proof of Equation 19 and related work

Let $A$ be an allostery graph with ligand binding to $n$ sites which are independent and identical in each conformation. Because of independence, $ω_{c_{k}, i, S} = 1$ , so that $K_{c_{k}, i, S} = K_{c_{k}, i, \emptyset}$ does not depend on $S$ ; because the sites are identical, $K_{c_{k}, i, S}$ does not depend on $i$ . Hence, we may write $K_{c_{k}, i, S} = K_{c_{k}}$ and the labels on the binding edges of the vertical subgraph $A^{c_{k}}$ are all given by $x K_{c_{k}}$ (Equation 36). It follows from Equation 30 that $μ_{S} (A^{c_{k}}) = {(x K_{c_{k}})}^{s}$ , where $s = # (S)$ . The factor $x^{s}$ does not depend on the conformation and cancels between the numerator and denominator of Equation 49. Equation 49 then tells us that $ω_{i, S}^{ϕ}$ also depends only on $s$ , so that we can write it as $ω_{s}^{ϕ}$ , and Equation 49 simplifies to

ω_{s}^{ϕ} = \frac{⟨ {(K_{c_{k}})}^{s + 1} ⟩}{⟨ K_{c_{k}} ⟩ ⟨ {(K_{c_{k}})}^{s} ⟩},

(53)

which gives Equation 19.

If we consider the effective association constant instead of the effective HOC, then, with the same assumptions as above, Equation 48 tells us that

K_{s}^{ϕ} = \frac{⟨ {(K_{c_{k}})}^{s + 1} ⟩}{⟨ {(K_{c_{k}})}^{s} ⟩} .

Suppose that only two conformations, $R$ and $T$ , are present. Let $ℓ_{e q} (c_{R} \to c_{T}) = L$ and write $K_{c_{T}}$ and $K_{c_{R}}$ as $K_{T}$ and $K_{R}$ , respectively. Then, for any random variable on conformations, $X_{c_{k}}$ , the average is given by $⟨ X_{c_{k}} ⟩ = (X_{c_{R}} + X_{c_{T}} L) / (1 + L)$ . Hence,

K_{s}^{ϕ} = \frac{K_{R}^{s + 1} + K_{T}^{s + 1} L}{K_{R}^{s} + K_{T}^{s} L},

(54)

which is the formula for the (s + 1)-th ‘intrinsic binding constant’ given by Gruber and Horovitz, 2018, Equation (2.10). In their analysis, the word ‘intrinsic’ corresponds to our ‘effective’.

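We can use Equation 54 to work out what the generalised MWC formula derived above yields for the classical MWC model. Substituting Equation 54 in Equation 51, the intermediate terms in the product cancel out, leaving only the numerator of the first factor and the denominator of the last, which is $1 + L$ , so that

μ_{S} (A^{ϕ}) = \frac{(K_{R}^{s} + K_{T}^{s} L) x^{s}}{1 + L},

in which the right-hand side depends only on $s = # (S)$ . Collecting together subsets of the same size, the partition function of $A^{ϕ}$ may be written as follows, where the constant factor $1 / (1 + L)$ does not depend on $x$ and therefore cancels in the logarithmic derivative in Equation 52,

Ψ (A^{ϕ}) = \frac{1}{1 + L} \sum_{0 \leq s \leq n} (\binom{n}{s}) (K_{R}^{s} + K_{T}^{s} L) x^{s} = \frac{{(1 + x K_{R})}^{n} + L {(1 + x K_{T})}^{n}}{1 + L} .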

It then follows from Equation 52 that the fractional saturation is given by

\frac{1}{n} (\frac{x}{Ψ (A^{ϕ})}) (\frac{d Ψ (A^{ϕ})}{d x}) = \frac{x K_{R} {(1 + x K_{R})}^{n - 1} + x K_{T} L {(1 + x K_{T})}^{n - 1}}{{(1 + x K_{R})}^{n} + L {(1 + x K_{T})}^{n}} .

If we set $α = x K_{R}$ and $c α = x K_{T}$ , this gives, for the fractional saturation,

\frac{α {(1 + α)}^{n - 1} + c α L {(1 + c α)}^{n - 1}}{{(1 + α)}^{n} + L {(1 + c α)}^{n}},

(55)

which recovers the classical MWC formula in the notation of Monod et al., 1965, Equation 2.
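The recovery of the classical formula can be confirmed numerically. The Python sketch below evaluates Equation 55 and, separately, the average binding computed from the weights of the coarse-grained binding subsets grouped by size. The function names are hypothetical, and the weights are used only up to a constant factor that cancels in the ratio.

```python
from fractions import Fraction
from math import comb


def mwc_saturation(alpha, c, L, n):
    """Classical MWC fractional saturation (Equation 55)."""
    numerator = alpha * (1 + alpha) ** (n - 1) + c * alpha * L * (1 + c * alpha) ** (n - 1)
    denominator = (1 + alpha) ** n + L * (1 + c * alpha) ** n
    return numerator / denominator


def saturation_from_subsets(x, K_R, K_T, L, n):
    """Average binding over the coarse-grained graph, grouping subsets by size s.

    Each of the comb(n, s) subsets of size s has weight proportional to
    (K_R**s + L * K_T**s) * x**s.
    """
    weights = [comb(n, s) * (K_R**s + L * K_T**s) * x**s for s in range(n + 1)]
    return sum(s * w for s, w in enumerate(weights)) / (n * sum(weights))
```

Setting `alpha = x * K_R` and `c = K_T / K_R` makes the two functions return the same value, since the sum of `s * comb(n, s) * a**s` over `s` equals `n * a * (1 + a)**(n - 1)`.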

Proof of Equation 20

The following result is likely to be known in other contexts, but we have not been able to find mention of it.

Lemma 3

Suppose that $X$ is a positive random variable, $X > 0$ , over a finite probability distribution. If $s \geq 1$ , the following moment inequality holds,

⟨ X^{s + 1} ⟩ ⟨ X^{s - 1} ⟩ \geq {⟨ X^{s} ⟩}^{2},

with equality if, and only if, $X$ is constant over the distribution.

Proof: Suppose that the states of the probability space are indexed by $1 \leq i \leq m$ and that $p_{i}$ denotes the probability of state $i$ . Then,

⟨ X^{s} ⟩ = \sum_{i} X_{i}^{s} p_{i} .

(56)

The quantity $α_{s} = ⟨ X^{s + 1} ⟩ ⟨ X^{s - 1} ⟩ - {⟨ X^{s} ⟩}^{2}$ can then be written as

α_{s} = (\sum_{i} X_{i}^{s + 1} p_{i}) (\sum_{i} X_{i}^{s - 1} p_{i}) - {(\sum_{i} X_{i}^{s} p_{i})}^{2} .

Collecting together terms in $p_{i} p_{j}$ , we can rewrite this as

α_{s} = \sum_{1 \leq i \leq m} (\sum_{i < j \leq m} (X_{i}^{s + 1} X_{j}^{s - 1} + X_{i}^{s - 1} X_{j}^{s + 1} - 2 X_{i}^{s} X_{j}^{s}) p_{i} p_{j}) .

(57)

Note that the terms corresponding to $i = j$ yield $(X_{i}^{s + 1} X_{i}^{s - 1} - X_{i}^{s} X_{i}^{s}) p_{i}^{2} = 0$ and so do not contribute to Equation 57. Choose any pair $1 \leq i \leq m$ and $i < j \leq m$ and let $X_{j} = μ X_{i}$ . Then, the coefficient of $p_{i} p_{j}$ in Equation 57 becomes

X_{i}^{s + 1} X_{i}^{s - 1} μ^{s - 1} + X_{i}^{s - 1} X_{i}^{s + 1} μ^{s + 1} - 2 X_{i}^{s} X_{i}^{s} μ^{s} = {(X_{i}^{s})}^{2} μ^{s - 1} (1 - 2 μ + μ^{2}) .

Now, $1 - 2 μ + μ^{2} = {(μ - 1)}^{2} \geq 0$ for $μ \in 𝐑$ , with equality if, and only if, $μ = 1$ . Since $X > 0$ by hypothesis, $μ > 0$ , so the coefficient of $p_{i} p_{j}$ is positive unless $μ = 1$ . Hence, $α_{s} > 0$ unless $X_{i} = X_{j}$ whenever $1 \leq i \leq m$ and $i < j \leq m$ , which means that $X$ is constant over the distribution. Of course, if $X$ is constant, then clearly $α_{s} = 0$ for all $s \geq 1$ . The result follows.

◼
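A quick numerical check of Lemma 3 and of the pairwise expansion in Equation 57 is given below as a Python sketch. The function names are hypothetical, and the random variable is represented by a list of values `X` with a matching list of probabilities `p`.

```python
from fractions import Fraction


def moment(X, p, s):
    """The s-th moment of X over the finite distribution p (Equation 56)."""
    return sum(x**s * q for x, q in zip(X, p))


def alpha(X, p, s):
    """The quantity alpha_s, defined as the moment difference in Lemma 3."""
    return moment(X, p, s + 1) * moment(X, p, s - 1) - moment(X, p, s) ** 2


def alpha_pairwise(X, p, s):
    """The same quantity written as a sum over pairs i < j (Equation 57)."""
    m = len(X)
    return sum(
        (X[i] ** (s + 1) * X[j] ** (s - 1)
         + X[i] ** (s - 1) * X[j] ** (s + 1)
         - 2 * X[i] ** s * X[j] ** s) * p[i] * p[j]
        for i in range(m)
        for j in range(i + 1, m)
    )
```

For a non-constant positive `X` with strictly positive probabilities both functions return the same strictly positive value, and both return zero when `X` is constant.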

Corollary 2

If $A$ is an MWC-like allostery graph, its effective HOCs satisfy

1 \leq ω_{1}^{ϕ} \leq ω_{2}^{ϕ} \leq \dots \leq ω_{n - 1}^{ϕ},

(58)

with equality at any stage if, and only if, $K_{c_{k}}$ is constant over $A_{\emptyset}$ .

Proof: It follows from Equation 53 that we can rewrite the effective HOCs recursively as

ω_{s}^{ϕ} = ω_{s - 1}^{ϕ} \frac{⟨ {(K_{c_{k}})}^{s + 1} ⟩ ⟨ {(K_{c_{k}})}^{s - 1} ⟩}{{⟨ {(K_{c_{k}})}^{s} ⟩}^{2}} .

(59)

Since $ω_{0}^{ϕ} = 1$ , the result follows by recursively applying Lemma 3 to $X = K_{c_{k}} > 0$ . Equation 58 gives Equation 20.

◼
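The ordering in Equation 58 can be observed on a concrete MWC-like example. The Python sketch below evaluates Equation 53 for $s = 0, \dots, n - 1$ . The function name and the inputs (a list `K` of the constants $K_{c_{k}}$ and a list `lam` of positive weights proportional to the probabilities over $A_{\emptyset}$ ) are assumptions of this example.

```python
from fractions import Fraction


def hoc_sequence(K, lam, n):
    """Effective HOCs omega_0, ..., omega_{n-1} of an MWC-like graph (Equation 53)."""
    total = Fraction(sum(lam))

    def avg(s):
        # Average of K**s over the distribution on conformations.
        return sum(Fraction(k) ** s * l for k, l in zip(K, lam)) / total

    return [avg(s + 1) / (avg(1) * avg(s)) for s in range(n)]
```

The first entry is always 1, the sequence is strictly increasing when the $K_{c_{k}}$ are not all equal, and every entry is 1 when they are all equal.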

Negative effective cooperativity

We consider an allostery graph $A$ with two conformations and two sites, in which binding is independent but not identical, so that the association constants differ between sites. Let $K_{c_{k}, 1, \emptyset} = K_{c_{k}, 1}$ and $K_{c_{k}, 2, \emptyset} = K_{c_{k}, 2}$ , for $k = 1, 2$ . Since the sites are independent, $ω_{c_{k}, 1, {2}} = 1$ , so that $K_{c_{k}, 1, {2}} = K_{c_{k}, 1}$ , for $k = 1, 2$ . It follows from Equation 30—see also Scheme 1—that

μ_{{1}} (A^{c_{k}}) = x K_{c_{k}, 1} and μ_{{2}} (A^{c_{k}}) = x K_{c_{k}, 2} for k = 1, 2 .

Let λ be the single equilibrium label in the horizontal subgraph of empty conformations,

λ = ℓ_{e q} (c_{1} \to_{A_{\emptyset}} c_{2}) = ℓ_{e q} ((c_{1}, \emptyset) \to_{A} (c_{2}, \emptyset)) .

It follows from Equations 30 and 33—see also the similar calculation in Scheme 3—that ${Pr}_{c_{1}} (A_{\emptyset}) = 1 / (1 + λ)$ and ${Pr}_{c_{2}} (A_{\emptyset}) = λ / (1 + λ)$ . We know from Equation 49 that

ω_{1, {2}}^{ϕ} = \frac{⟨ K_{c_{k}, 1, {2}} . μ_{{2}} (A^{c_{k}}) ⟩}{⟨ K_{c_{k}, 1, \emptyset} ⟩ . ⟨ μ_{{2}} (A^{c_{k}}) ⟩},

and using the identifications above, we see that

\begin{aligned} ⟨ K_{c_{k}, 1, {2}} . μ_{{2}} (A^{c_{k}}) ⟩ & = \frac{(K_{c_{1}, 1} K_{c_{1}, 2} + λ K_{c_{2}, 1} K_{c_{2}, 2}) x}{1 + λ} \\ ⟨ μ_{{2}} (A^{c_{k}}) ⟩ & = \frac{(K_{c_{1}, 2} + λ K_{c_{2}, 2}) x}{1 + λ} \\ ⟨ K_{c_{k}, 1, \emptyset} ⟩ & = \frac{K_{c_{1}, 1} + λ K_{c_{2}, 1}}{1 + λ} . \end{aligned}

Substituting and simplifying, we find that

\begin{aligned} ω_{1, {2}}^{ϕ} & = \frac{(K_{c_{1}, 2} K_{c_{1}, 1} + λ K_{c_{2}, 2} K_{c_{2}, 1}) \cdot (1 + λ)}{(K_{c_{1}, 2} + λ K_{c_{2}, 2}) \cdot (K_{c_{1}, 1} + λ K_{c_{2}, 1})} \\ & = \frac{K_{c_{1}, 1} K_{c_{1}, 2} + λ (K_{c_{1}, 1} K_{c_{1}, 2} + K_{c_{2}, 1} K_{c_{2}, 2}) + λ^{2} K_{c_{2}, 1} K_{c_{2}, 2}}{K_{c_{1}, 1} K_{c_{1}, 2} + λ (K_{c_{1}, 1} K_{c_{2}, 2} + K_{c_{2}, 1} K_{c_{1}, 2}) + λ^{2} K_{c_{2}, 1} K_{c_{2}, 2}} . \end{aligned}

The first and last terms are the same in the numerator and denominator, so it follows that $ω_{1, {2}}^{ϕ} < 1$ if, and only if,

\frac{K_{c_{1}, 1} K_{c_{1}, 2} + K_{c_{2}, 1} K_{c_{2}, 2}}{K_{c_{1}, 1} K_{c_{2}, 2} + K_{c_{2}, 1} K_{c_{1}, 2}} < 1,

which is to say

K_{c_{1}, 1} K_{c_{1}, 2} + K_{c_{2}, 1} K_{c_{2}, 2} - (K_{c_{1}, 1} K_{c_{2}, 2} + K_{c_{2}, 1} K_{c_{1}, 2}) < 0 .

The left-hand side factors to give

(K_{c_{1}, 1} - K_{c_{2}, 1}) (K_{c_{1}, 2} - K_{c_{2}, 2}) < 0 .

We see that negative cooperativity arises if, and only if, the sites have opposite patterns of association constants in the two conformations.
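The criterion can be tried out with a short Python sketch that evaluates the first expression for $ω_{1, {2}}^{ϕ}$ above. The function name and the argument layout (pairs of association constants for sites 1 and 2 in each conformation) are hypothetical.

```python
from fractions import Fraction


def omega_12(K_c1, K_c2, lam):
    """Effective cooperativity between two independent sites in two conformations.

    K_c1 = (K_{c1,1}, K_{c1,2}) and K_c2 = (K_{c2,1}, K_{c2,2}); lam is the
    equilibrium label from conformation c1 to c2 in the subgraph A_empty.
    """
    lam = Fraction(lam)
    numerator = (K_c1[0] * K_c1[1] + lam * K_c2[0] * K_c2[1]) * (1 + lam)
    denominator = (K_c1[1] + lam * K_c2[1]) * (K_c1[0] + lam * K_c2[0])
    return numerator / denominator
```

For example, `omega_12((10, 1), (1, 10), 1)` equals 40/121, which is less than 1: site 1 binds more strongly in the first conformation and site 2 in the second, so the product of differences is negative. With `omega_12((1, 1), (2, 3), 1)` both sites favour the second conformation and the value is 7/6.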

Flexibility of allostery

The integrative flexibility theorem

We provide here a complete version of the proof that was sketched in the main text, showing rigorously how the approximation is handled. Some preliminary notation is needed. Recall that if $X$ is a finite set (typically, a subset of ${1, \dots, n}$ ), then $# (X)$ will denote the number of elements in $X$ . If $X$ and $Y$ are sets, then $X \setminus Y$ will denote the complement of $Y$ in $X$ , $X \setminus Y = {i \in X, i \notin Y}$ . To control the approximation, we will use the ‘little o’ notation: $𝒪_{u} (1)$ will stand for any quantity which depends on $u$ and for which $𝒪_{u} (1) \to 0$ as $u \to 0$ . For instance, $A u + B u^{2}$ is $𝒪_{u} (1)$ but $(A u + B u^{2}) / u$ is $𝒪_{u} (1)$ if, and only if, $A = 0$ . This notation allows complicated expressions which vanish in the limit as $u \to 0$ to be written concisely. Note that $f (u) \to A$ as $u \to 0$ if, and only if, $f (u) = A + 𝒪_{u} (1)$ , which is a useful trick for simplifying $f$ .

Theorem 1

Suppose $n \geq 1$ and choose $2^{n} - 1$ arbitrary positive numbers

β_{i} > 0 (1 \leq i \leq n) and α_{i, S} > 0 (\emptyset \neq S \subseteq {1, \dots, n}, i < S) .

Given any $ε > 0$ and $δ > 0$ , there exists an allosteric conformational ensemble, which has no intrinsic HOC in any conformation, such that

K_{i, \emptyset}^{ϕ} = β_{i} + 𝒪_{ε} (1) and ω_{i, S}^{ϕ} = α_{i, S} + 𝒪_{ε} (1) + 𝒪_{δ} (1)

for all corresponding values of $i$ and $S$ .

Proof: Recall from the main text that we use an allostery graph $A$ whose conformations are indexed by subsets $T \subseteq {1, \dots, n}$ and denoted $c_{T}$ , as illustrated in Figure 6. The reference vertex of $A$ is $r = (c_{\emptyset}, \emptyset)$ . For the horizontal subgraph of empty conformations, $A_{\emptyset}$ , let $λ_{T} = μ_{c_{T}} (A_{\emptyset})$ . It follows from Equation 30, using μ in place of ρ, that the $λ_{T}$ determine the equilibrium labels of $A_{\emptyset}$ . Keeping in mind that $λ_{\emptyset} = 1$ , the $λ_{T}$ form a set of $2^{n} - 1$ independent parameters for $A_{\emptyset}$ , as explained above. The steady-state probabilities are then given by ${Pr}_{c_{T}} (A_{\emptyset}) = λ_{T} / (\sum_{\emptyset \subseteq X \subseteq {1, \dots, n}} λ_{X})$ (Equation 35).

Let $κ_{1}, \dots, κ_{n} > 0$ be positive quantities whose values we will subsequently choose. We assume that all intrinsic HOCs are one and, for any binding microstate $S \subseteq {1, \dots, n}$ , we set

K_{c_{T}, i, S} = K_{c_{T}, i, \emptyset} = \begin{cases} κ_{i} & \text{if } i \in T \\ ε κ_{i} & \text{if } i \notin T \end{cases}

(60)

If $c_{T}$ is a conformation and $S \subseteq {1, \dots, n}$ is a binding microstate, it follows from Equation 60 that

μ_{S} (A^{c_{T}}) = (\prod_{i \in S} κ_{i} x) ε^{\# (S \setminus T)} = \begin{cases} (\prod_{i \in S} κ_{i}) x^{\# (S)} & \text{if } S \subseteq T \\ 𝒪_{ε} (1) x^{\# (S)} & \text{otherwise} . \end{cases}

(61)
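Equation 61 can be checked numerically with a few lines of Python. This is a minimal sketch; the function name `mu` and the values of $κ_{i}$, ε and $x$ are hypothetical.

```python
from math import prod

def mu(S, T, kappa, eps, x):
    """Label mu_S(A^{c_T}) under Equation 60.

    S, T   : sets of site indices (binding microstate, conformation index)
    kappa  : dict mapping site i to kappa_i
    eps, x : the small parameter and the ligand concentration
    """
    S, T = frozenset(S), frozenset(T)
    # Each bound site contributes kappa_i * x, with an extra factor eps
    # for every bound site that lies outside T.
    return prod(kappa[i] * x for i in S) * eps ** len(S - T)

kappa = {1: 2.0, 2: 3.0, 3: 5.0}   # hypothetical values
eps, x = 1e-6, 0.1

inside = mu({1, 2}, {1, 2, 3}, kappa, eps, x)   # S is a subset of T
outside = mu({1, 2}, {1}, kappa, eps, x)        # site 2 lies outside T
```

When $S \subseteq T$ no factor of ε appears; otherwise the label is suppressed by one factor of ε per site of $S$ missing from $T$.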

After coarse graining, we can calculate effective association constants and effective HOCs using the formulas in Equations 48 and 49. Let $S$ be a binding microstate and $i \notin S$ . Using Equation 48 and Equations 60 and 61,

K_{i, S}^{ϕ} = κ_{i} (\frac{\sum_{S \cup {i} \subseteq T} λ_{T} + 𝒪_{ε} (1)}{\sum_{S \subseteq T} λ_{T} + 𝒪_{ε} (1)}) .

Letting $ε \to 0$ , we can use the trick described above to rewrite this as

K_{i, S}^{ϕ} = κ_{i} (\frac{\sum_{S \cup {i} \subseteq T} λ_{T}}{\sum_{S \subseteq T} λ_{T}}) + 𝒪_{ε} (1) .

(62)

Equation 62 is the more rigorous version of Equation 22. It follows from Equation 62, using the same trick to reorganise the terms which are $𝒪_{ε} (1)$ , that the effective HOCs are

ω_{i, S}^{ϕ} = \frac{K_{i, S}^{ϕ}}{K_{i, \emptyset}^{ϕ}} = \frac{(\sum_{\emptyset \subseteq T} λ_{T}) (\sum_{S \cup {i} \subseteq T} λ_{T})}{(\sum_{{i} \subseteq T} λ_{T}) (\sum_{S \subseteq T} λ_{T})} + 𝒪_{ε} (1) .

(63)

Equation 63 is the more rigorous version of Equation 23. We see that the effective HOCs are independent of the quantities $κ_{i}$ and depend only on the parameters, $λ_{T}$ , of the horizontal subgraph $A_{\emptyset}$ .
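The independence of the effective HOCs from the $κ_{i}$ can be illustrated numerically. The sketch below assumes that the coarse-grained label of a binding microstate is the $λ_{T}$-weighted sum of the labels $μ_{S} (A^{c_{T}})$ over conformations, which is the form that reproduces the expression preceding Equation 62. All function names and parameter values are hypothetical.

```python
from itertools import combinations
from math import prod

def subsets(n):
    sites = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(sites, k)]

def effective_K(i, S, lam, kappa, eps):
    """K^phi_{i,S}: ratio of lambda-weighted labels for S u {i} and S (x cancels)."""
    S = frozenset(S)
    Si = S | {i}
    num = sum(l * prod(kappa[j] for j in Si) * eps ** len(Si - T) for T, l in lam.items())
    den = sum(l * prod(kappa[j] for j in S) * eps ** len(S - T) for T, l in lam.items())
    return num / den

def effective_hoc(i, S, lam, kappa, eps):
    """omega^phi_{i,S} = K^phi_{i,S} / K^phi_{i,empty}."""
    return effective_K(i, S, lam, kappa, eps) / effective_K(i, frozenset(), lam, kappa, eps)

def leading_term(i, S, lam):
    """Main term of Equation 63, which involves only the lambda_T."""
    S = frozenset(S)
    def tail(X):
        return sum(l for T, l in lam.items() if X <= T)
    return tail(frozenset()) * tail(S | {i}) / (tail(frozenset({i})) * tail(S))

# Hypothetical labels for n = 3 and two unrelated choices of kappa.
lam = dict(zip(subsets(3), [1.0, 0.3, 0.7, 1.5, 2.0, 0.2, 0.9, 4.0]))
kappa_a = {1: 2.0, 2: 3.0, 3: 5.0}
kappa_b = {1: 0.1, 2: 40.0, 3: 7.0}

w_a = effective_hoc(1, {2, 3}, lam, kappa_a, 1e-9)
w_b = effective_hoc(1, {2, 3}, lam, kappa_b, 1e-9)
main = leading_term(1, {2, 3}, lam)
```

Both choices of $κ_{i}$ give the same effective HOC, and for small ε that value agrees with the main term of Equation 63.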

We can now specify the $λ_{T}$ . If $T = {i_{1}, \dots, i_{k}}$ , where $i_{1} < i_{2} < \dots < i_{k}$ , we set

λ_{T} = α_{i_{1}, {i_{2}, \dots, i_{k}}} α_{i_{2}, {i_{3}, \dots, i_{k}}} \dots α_{i_{k - 1}, {i_{k}}} δ^{k},

(64)

where each of the α quantities is given by hypothesis. Note that the exponent of δ depends only on the size of $T$ and not on which elements $T$ contains. Equation 64 is illustrated in Figure 6.
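Equation 64 translates directly into code. In the following sketch, the function name `lam_T` and the convention of storing $α_{i, S}$ in a dictionary keyed by `(i, frozenset(S))` are assumptions made for illustration, as are the numerical values.

```python
def lam_T(T, alpha, delta):
    """lambda_T from Equation 64.

    alpha maps (i, frozenset(S)), with i smaller than every element of S,
    to alpha_{i,S}.
    """
    idx = sorted(T)                       # i_1 < i_2 < ... < i_k
    value = delta ** len(idx)             # exponent depends only on the size of T
    for pos in range(len(idx) - 1):       # alpha_{i_1,{i_2..i_k}} ... alpha_{i_{k-1},{i_k}}
        value *= alpha[(idx[pos], frozenset(idx[pos + 1:]))]
    return value

# Hypothetical alpha values for n = 3.
alpha = {
    (1, frozenset({2})): 2.0,
    (1, frozenset({3})): 3.0,
    (2, frozenset({3})): 5.0,
    (1, frozenset({2, 3})): 7.0,
}
delta = 0.1

full = lam_T({1, 2, 3}, alpha, delta)     # alpha_{1,{2,3}} * alpha_{2,{3}} * delta**3
pair = lam_T({2, 3}, alpha, delta)        # alpha_{2,{3}} * delta**2
```

With these values, `full` equals $α_{1, {2, 3}}$ times `pair` times δ, which is the recursion that the proof relies on when $i < S$.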

It follows from Equation 64 that, given any $X \subseteq {1, \dots, n}$ ,

\sum_{X \subseteq T} λ_{T} = λ_{X} (1 + 𝒪_{δ} (1)) .

Using this, we see that the main term in Equation 63 has the form

(1 + 𝒪_{δ} (1)) \cdot \frac{λ_{S \cup {i}} (1 + 𝒪_{δ} (1))}{δ (1 + 𝒪_{δ} (1)) λ_{S} (1 + 𝒪_{δ} (1))} .

(65)

It follows from Equation 64 that, when $i < S$ , $λ_{S \cup {i}} = α_{i, S} λ_{S} δ$ , so using the trick above for reorganising the $𝒪_{δ} (1)$ terms, we can rewrite Equation 65 as $α_{i, S} + 𝒪_{δ} (1)$ . Substituting back into Equation 63, we see that, when $i < S$ ,

ω_{i, S}^{ϕ} = α_{i, S} + 𝒪_{ε} (1) + 𝒪_{δ} (1) .

(66)

Equation 66 is the more rigorous version of Equation 26.

With the choice of $λ_{T}$ given by Equation 64, we can return to Equation 62 with $S = \emptyset$ and define

κ_{i} = β_{i} {(\frac{\sum_{{i} \subseteq T} λ_{T}}{\sum_{\emptyset \subseteq T} λ_{T}})}^{- 1} .

Substituting back into Equation 62 with $S = \emptyset$ , we see that

K_{i, \emptyset}^{ϕ} = β_{i} + 𝒪_{ε} (1) .

(67)

Equation 67 is the more rigorous version of Equation 27. The result follows from Equations 66 and 67.

◼

Construction of Figure 8

We implemented in a Mathematica notebook the proof strategy in Theorem 1 for any number $n$ of sites. The notebook takes as input parameters the $β_{i}$ and the $α_{i, S}$ for $i < S$ in the statement of the theorem, along with specified values for the quantities ε and δ. It produces as output the effective bare association constants, $K_{i, \emptyset}^{ϕ}$, and effective HOCs, $ω_{i, S}^{ϕ}$ for $i < S$, as given by Theorem 1. The values of ε and δ can then be adjusted so that the calculated $K_{i, \emptyset}^{ϕ}$ and $ω_{i, S}^{ϕ}$ are as close as required to the $β_{i}$ and $α_{i, S}$. The notebook is available on request.
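For readers without access to the notebook, the same strategy can be sketched in Python. This is not the authors' Mathematica code: it is an independent, minimal reconstruction from the proof of Theorem 1, and the function names, the dictionary layout for $α_{i, S}$ and the example values are all assumptions.

```python
from itertools import combinations
from math import prod

def subsets(n):
    sites = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(sites, k)]

def build_ensemble(beta, alpha, delta):
    """Return (lam, kappa): lambda_T from Equation 64 and kappa_i as in the proof."""
    lam = {}
    for T in subsets(len(beta)):
        idx = sorted(T)
        lam[T] = delta ** len(idx) * prod(
            alpha[(idx[p], frozenset(idx[p + 1:]))] for p in range(len(idx) - 1))
    total = sum(lam.values())
    kappa = {i: beta[i] * total / sum(l for T, l in lam.items() if i in T)
             for i in beta}
    return lam, kappa

def effective_K(i, S, lam, kappa, eps):
    """Effective association constant, keeping every power of eps."""
    S = frozenset(S)
    Si = S | {i}
    num = sum(l * eps ** len(Si - T) for T, l in lam.items())
    den = sum(l * eps ** len(S - T) for T, l in lam.items())
    return kappa[i] * num / den

def max_relative_error(beta, alpha, eps, delta):
    """Larger of the relative errors in the bare constants and in the HOCs."""
    lam, kappa = build_ensemble(beta, alpha, delta)
    bare = {i: effective_K(i, frozenset(), lam, kappa, eps) for i in beta}
    errors = [abs(beta[i] - bare[i]) / beta[i] for i in beta]
    for (i, S), target in alpha.items():
        omega = effective_K(i, S, lam, kappa, eps) / bare[i]
        errors.append(abs(target - omega) / target)
    return max(errors)

# Hypothetical two-site example: beta_1, beta_2 and alpha_{1,{2}}.
beta = {1: 2.0, 2: 3.0}
alpha = {(1, frozenset({2})): 5.0}

err_coarse = max_relative_error(beta, alpha, eps=1e-8, delta=1e-4)
err_fine = max_relative_error(beta, alpha, eps=1e-12, delta=1e-6)
```

In this example the error shrinks as δ and ε are reduced together. Note that ε has to be small relative to δ, because the leading corrections scale like ε/δ.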

Figure 8 shows the results from using this notebook on three examples, chosen by hand to illustrate different patterns of effective bare association constants and effective HOCs. The actual numerical values are listed below.

The colour names used here refer to the colour code for the three examples in Figure 8. The maximum error was calculated as the larger of $\max_{i} | \frac{β_{i} - K_{i, \emptyset}^{ϕ}}{β_{i}} |$ and $\max_{i, S} | \frac{α_{i, S} - ω_{i, S}^{ϕ}}{α_{i, S}} |$ . The quantities δ and ε were adjusted to make the maximum error less than 0.01.

The binding curves for each example (Figure 7B) show the dependence on concentration of average binding to site $i$ (coloured curves), which can be written in terms of the coarse-grained graph, $A^{ϕ}$ , in the form

\sum_{S \subseteq {1, \dots, n}} χ_{i} (S) {Pr}_{S} (A^{ϕ}) .

Here, $χ_{i} (S)$ is the indicator function for $i$ being in $S$ ,

χ_{i} (S) = \begin{cases} 1 & \text{if } i \in S \\ 0 & \text{if } i \notin S . \end{cases}

Since the size of $S$ , which was denoted by $s$ above, is given by $s = \sum_{1 \leq i \leq n} χ_{i} (S)$ , we see from Equation 50 that the fractional saturation (Figure 7B, black curves) is the sum of the average bindings over all sites, normalised to the number of sites, $n$ .
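The relationship between the average binding to each site and the fractional saturation can be verified on a toy example. The sketch below uses two independent sites with hypothetical association constants instead of a coarse-grained allosteric ensemble; the function names are also hypothetical.

```python
from itertools import combinations
from math import prod

def subsets(n):
    sites = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(sites, k)]

def average_binding(probs, n):
    """Average binding to each site i: sum over S of chi_i(S) * Pr_S."""
    return {i: sum(p for S, p in probs.items() if i in S) for i in range(1, n + 1)}

def fractional_saturation(probs, n):
    """Mean number of bound sites, normalised to the number of sites."""
    return sum(len(S) * p for S, p in probs.items()) / n

# Hypothetical check with two independent sites at concentration x.
K = {1: 2.0, 2: 0.5}
x = 1.5
weights = {S: prod(K[i] * x for i in S) for S in subsets(2)}
Z = sum(weights.values())
probs = {S: w / Z for S, w in weights.items()}

per_site = average_binding(probs, 2)
saturation = fractional_saturation(probs, 2)
```

For independent sites, the average binding to site $i$ reduces to $K_{i} x / (1 + K_{i} x)$, and the fractional saturation is the mean of the per-site values.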

	Maroon		Orange		Red
	$δ = 10^{- 7}, ε = 10^{- 12}$		$δ = 10^{- 7}, ε = 10^{- 14}$		$δ = 10^{- 7}, ε = 10^{- 16}$
$i$	$β_{i}$	$K_{i, \emptyset}^{ϕ}$	$β_{i}$	$K_{i, \emptyset}^{ϕ}$	$β_{i}$	$K_{i, \emptyset}^{ϕ}$
1	1.5777	1.5776	0.031353	0.031353	0.21257	0.21257
2	24.013	24.014	0.011104	0.011104	0.84301	0.84301
3	89.958	89.959	13.195	13.195	9.8514	9.8514
4	0.015685	0.015685	52.437	52.437	27.000	27.000
$i, S$	$α_{i, S}$	$ω_{i, S}^{ϕ}$	$α_{i, S}$	$ω_{i, S}^{ϕ}$	$α_{i, S}$	$ω_{i, S}^{ϕ}$
$1, {2}$	0.084815	0.0848456	1.0801	1.0801	50.455	50.454
$1, {3}$	3.7432	3.7432	34.768	34.768	0.016359	0.016401
$1, {4}$	0.044245	0.044264	0.032668	0.032669	0.60018	0.60018
$2, {3}$	30.240	30.239	4.0683	4.0683	7.2944	7.2944
$2, {4}$	0.074064	0.074083	1.5098	1.5098	0.010809	0.010809
$3, {4}$	9.2687	9.2685	0.025183	0.025184	0.012613	0.012613
$1, {2, 3}$	4.0933	4.0933	0.31238	0.31238	57.783	57.783
$1, {2, 4}$	15.687	15.683	0.70016	0.70016	0.025618	0.025623
$1, {3, 4}$	0.013335	0.013349	0.13042	0.13056	4.4450	4.4450
$2, {3, 4}$	0.082851	0.082892	2.5235	2.5235	0.13584	0.13584
$1, {2, 3, 4}$	6.5843	6.5825	0.017404	0.017407	0.063587	0.063833
Max. error	0.00105		0.00105		0.00386
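The maximum error reported for the Maroon example can be recomputed from the tabulated values. The following sketch copies the target and effective values from the Maroon columns of the table above; the variable names are hypothetical.

```python
# Targets (beta_i, then alpha_{i,S}) and effective values (K^phi, then omega^phi),
# in the row order of the table, Maroon columns only.
targets = [1.5777, 24.013, 89.958, 0.015685,
           0.084815, 3.7432, 0.044245, 30.240, 0.074064, 9.2687,
           4.0933, 15.687, 0.013335, 0.082851, 6.5843]
effective = [1.5776, 24.014, 89.959, 0.015685,
             0.0848456, 3.7432, 0.044264, 30.239, 0.074083, 9.2685,
             4.0933, 15.683, 0.013349, 0.082892, 6.5825]

# Largest relative deviation of an effective value from its target.
max_error = max(abs(t - e) / t for t, e in zip(targets, effective))
```

The largest deviation comes from the row $1, {3, 4}$ and rounds to the tabulated value of 0.00105, within the stated bound of 0.01.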


Allosteric ensembles for Hill functions

Construction of Figure 9

As described in the main text, we considered an allosteric ensemble with four conformations and six ligand binding sites with no intrinsic cooperativity in any conformation. Accordingly, the bare association constants, $K_{c_{k}, i, \emptyset}$, constitute 6 free parameters for each conformation $c_{k}$, $k = 1, \dots, 4$, giving 24 free parameters. A further 3 free parameters arise for the independent equilibrium labels of the horizontal subgraph of empty conformations, $A_{\emptyset}$, giving 27 free parameters in total. The association constants were restricted to lie in the range $[10^{- 4}, 10^{4}]$ and the equilibrium labels in the range $[10^{- 6}, 10^{6}]$. To compare the binding function, $f (u)$, to the Hill functions $ℋ_{h} (x)$, the concentration variable, $u$, was normalised to its half-maximal value, $u_{0.5}$, for which $f (u_{0.5}) = 0.5$ (Estrada et al., 2016). The normalised binding function, $g (x) = f (x u_{0.5})$, then satisfies $g (1) = 0.5$.

We followed a two-step procedure to find binding functions which approximated Hill functions. The algorithm is publicly available on GitHub (github.com/rosamc/allostery-paper-2021; copy archived at swh:1:rev:386b23961732962e8ac8390322c9c6e6dfc39168), and we describe it here in general terms. For step 1, we used the measures of position, $γ (g)$, and steepness, $ρ (g)$, of a normalised binding function, $g (x)$, introduced previously (Estrada et al., 2016). The steepness of $g (x)$ is the maximum value of its derivative,

ρ (g) = \max_{x \geq 0} \frac{d g}{d x},

and the position of $g$ is the normalised concentration at which that maximum occurs,

γ (g) = z, \quad \text{such that} \quad \left. \frac{d g}{d x} \right|_{x = z} = ρ (g) .
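The normalisation to $u_{0.5}$ and the two shape measures can be sketched as follows. This is not the code from the GitHub repository. It assumes that the binding function is the fractional saturation of an ensemble with independent sites in each conformation, that the Hill function has the standard form $x^{h} / (1 + x^{h})$, and that the maximum of the derivative lies in $[0, 5]$. All function names and the random seed are hypothetical.

```python
import numpy as np

def binding_function(u, K, lam):
    """Fractional saturation at concentration(s) u.

    K   : (conformations, sites) bare association constants
    lam : (conformations,) labels of the empty conformations, lam[0] = 1
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None, None]
    K = np.asarray(K, dtype=float)[None, :, :]
    lam = np.asarray(lam, dtype=float)[None, :]
    factors = 1.0 + K * u                        # (points, conformations, sites)
    weights = lam * np.prod(factors, axis=2)     # statistical weight per conformation
    occupancy = np.sum(K * u / factors, axis=2)  # mean bound sites per conformation
    return np.sum(weights * occupancy, axis=1) / (np.sum(weights, axis=1) * K.shape[2])

def half_maximal(K, lam, lo=1e-12, hi=1e12, iterations=200):
    """Bisect on log(u) for f(u) = 0.5, using that f increases with u."""
    for _ in range(iterations):
        mid = np.sqrt(lo * hi)
        if binding_function(mid, K, lam)[0] < 0.5:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

def normalised(K, lam):
    """Return g(x) = f(x * u_half)."""
    u_half = half_maximal(K, lam)
    return lambda x: binding_function(np.asarray(x, dtype=float) * u_half, K, lam)

def position_and_steepness(g, x_max=5.0, points=20001):
    """Estimate (gamma, rho) from finite differences on a uniform grid."""
    x = np.linspace(0.0, x_max, points)
    slope = np.gradient(g(x), x)
    k = int(np.argmax(slope))
    return x[k], slope[k]

def hill(h):
    return lambda x: x ** h / (1.0 + x ** h)

# One random parameter set, sampled logarithmically within the stated ranges.
rng = np.random.default_rng(0)
K = 10.0 ** rng.uniform(-4, 4, size=(4, 6))
lam = np.concatenate(([1.0], 10.0 ** rng.uniform(-6, 6, size=3)))
g = normalised(K, lam)
gamma, rho = position_and_steepness(g)

# Sanity check of the shape measures on the Hill function with h = 2.
gamma2, rho2 = position_and_steepness(hill(2))
```

For the Hill function with $h = 2$, the derivative is maximal at $x = 1 / \sqrt{3}$ with value $3 \sqrt{3} / 8$, which gives a convenient check on the finite-difference estimate.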

The combination of these two measures provides an estimate of the shape of the binding function (Estrada et al., 2016). Starting with a seed for random number generation, we randomly sampled parameter values independently and logarithmically within the ranges specified above to find parameter sets for which $γ (g) \in [0.5, 1.2]$ and $ρ (g) \in [0.5, 1.3]$ , which ensures that $g$ is not too far in position-steepness space from the Hill functions (Estrada et al., 2016, Supplementary Information, §6.1). This narrows down the search space substantially. Once such a parameter set has been found, step 2 of the procedure followed a Monte Carlo optimisation as follows. This algorithm was fine-tuned by hand, and full details are available with the source code on GitHub. The error between the selected binding function $g$ and the appropriate Hill function, $ℋ_{h}$ , was measured as the average absolute difference between the functions at 1000 logarithmically spaced points between 0.0005 and 5,

δ (g, ℋ_{h}) = \frac{\sum_{1 \leq j \leq 1000} | g (0.0005 u^{j}) - ℋ_{h} (0.0005 u^{j}) |}{1000},

where $u = 10^{0.0003003}$. Starting from the initial parameter set, $θ_{0}$, as selected in the first step, we randomly chose each parameter with probability $p$ and, for each chosen parameter, we randomly picked a new value $v_{1}$ logarithmically in the range $[m v_{0}, M v_{0}]$, where $v_{0}$ is the existing parameter value. If the chosen value fell outside the appropriate parameter range, we took $v_{1}$ to be the limit of the range. Having done this for each parameter to generate a new parameter set, $θ_{1}$, we accepted $θ_{1}$ for the next step of the iteration if $δ (g_{θ_{1}}, ℋ_{h}) < δ (g_{θ_{0}}, ℋ_{h})$; if not, we still accepted $θ_{1}$ with probability β and otherwise retained $θ_{0}$. The algorithm parameters $p$, $m$ and $M$ were adjusted so that $p$ decreased and the range $[m, M]$ narrowed as the error decreased. Iterations were continued to an upper limit of $5 \times 10^{6}$, or until a parameter set was found for which $δ (g_{θ}, ℋ_{h}) < 0.0002$. Step 1 and iterations of step 2 were undertaken with $β = 0.25, 0.5$ and 0.75 for each of 290 initial seeds for random number generation, and the examples shown in Figure 9 were selected from among those with the least error. For Hill coefficient $h = 4$, we had to relax the error bound slightly and the two examples shown in Figure 9 satisfy $0.0003 < δ (g_{θ}, ℋ_{h}) < 0.0004$.
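A single iteration of step 2 can be sketched as follows. This is a simplified illustration, not the code in the GitHub repository. The evaluation grid is built directly from the stated endpoints with `numpy.geomspace`, the Hill function is assumed to have the standard form $x^{h} / (1 + x^{h})$, the schedule by which $p$, $m$ and $M$ change with the error is not reproduced, and all function names are hypothetical.

```python
import numpy as np

# 1000 logarithmically spaced points between 0.0005 and 5 (assumed grid).
X_GRID = np.geomspace(0.0005, 5.0, 1000)

def hill(h):
    return lambda x: x ** h / (1.0 + x ** h)

def error(g, h):
    """Average absolute difference between g and the Hill function on X_GRID."""
    return float(np.mean(np.abs(g(X_GRID) - hill(h)(X_GRID))))

def propose(theta, lower, upper, p, m, M, rng):
    """Perturb each parameter with probability p, log-uniformly in [m*v0, M*v0]."""
    chosen = rng.random(theta.shape) < p
    factor = np.exp(rng.uniform(np.log(m), np.log(M), size=theta.shape))
    new = np.where(chosen, theta * factor, theta)
    return np.clip(new, lower, upper)   # out-of-range values go to the range limit

def accept(err_new, err_old, beta, rng):
    """Keep an improvement; otherwise accept with probability beta."""
    return bool(err_new < err_old or rng.random() < beta)

# Hypothetical starting point: 24 association constants and 3 equilibrium labels.
rng = np.random.default_rng(1)
lower = np.concatenate((np.full(24, 1e-4), np.full(3, 1e-6)))
upper = np.concatenate((np.full(24, 1e4), np.full(3, 1e6)))
theta0 = np.ones(27)
theta1 = propose(theta0, lower, upper, p=1.0, m=0.5, M=2.0, rng=rng)
theta_same = propose(theta0, lower, upper, p=0.0, m=0.5, M=2.0, rng=rng)
```

In a full search, `error` would be evaluated on the normalised binding function built from each parameter set, and `propose` and `accept` would be iterated until the error bound or the iteration limit is reached.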

Acknowledgements

We are indebted to Hernan Garcia and an anonymous reviewer for questions and suggestions which helped to improve this paper. JWB and JG were supported by US National Science Foundation (NSF) Award #1462629. RMC was supported by US National Institutes of Health award #GM122928 and EMBO Fellowship ALTF683-2019. FW was supported by the James S McDonnell Foundation and NSF Graduate Research Fellowship #DGE1144152.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Jeremy Gunawardena, Email: jeremy_gunawardena@hms.harvard.edu.

Aleksandra M Walczak, École Normale Supérieure, France.

Arvind Murugan, University of Chicago, United States.

Funding Information

This paper was supported by the following grants:

National Science Foundation 1462629 to John W Biddle, Jeremy Gunawardena.
National Institutes of Health GM122928 to Rosa Martinez-Corral.
European Molecular Biology Organization ALTF683-2019 to Rosa Martinez-Corral.
National Science Foundation DGE1144152 to Felix Wong.
James S. McDonnell Foundation to Felix Wong.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Formal analysis, Writing - review and editing.

Conceptualization, Formal analysis, Supervision, Methodology, Writing - original draft, Writing - review and editing.

Additional files

Transparent reporting form

elife-65498-transrepform.docx (105.3 KB, docx)

Data availability

No data has been generated or acquired for this study, which is purely theoretical.

References

Ahsendorf T, Wong F, Eils R, Gunawardena J. A framework for modelling gene regulation which accommodates non-equilibrium mechanisms. BMC Biology. 2014;12:102. doi: 10.1186/s12915-014-0102-4.
Allen BL, Taatjes DJ. The mediator complex: a central integrator of transcription. Nature Reviews Molecular Cell Biology. 2015;16:155–166. doi: 10.1038/nrm3951.
Bacic L, Sabantsev A, Deindl S. Recent advances in single-molecule fluorescence microscopy render structural biology dynamic. Current Opinion in Structural Biology. 2020;65:61–68. doi: 10.1016/j.sbi.2020.05.006.
Benabdallah NS, Bickmore WA. Regulatory domains and their mechanisms. Cold Spring Harbor Symposia on Quantitative Biology; 2015. pp. 45–51.
Berlow RB, Dyson HJ, Wright PE. Expanding the paradigm: intrinsically disordered proteins and allosteric regulation. Journal of Molecular Biology. 2018;430:2309–2320. doi: 10.1016/j.jmb.2018.04.003.
Biddle JW, Nguyen M, Gunawardena J. Negative reciprocity, not ordered assembly, underlies the interaction of Sox2 and Oct4 on DNA. eLife. 2019;8:e41017. doi: 10.7554/eLife.41017.
Bolt CC, Duboule D. The regulatory landscapes of developmental genes. Development. 2020;147:dev171736. doi: 10.1242/dev.171736.
Carter CW. High-Dimensional mutant and modular thermodynamic cycles, molecular switching, and free energy transduction. Annual Review of Biophysics. 2017;46:433–453. doi: 10.1146/annurev-biophys-070816-033811.
Carter CW, Chandrasekaran SN, Weinreb V, Li L, Williams T. Combining multi-mutant and modular thermodynamic cycles to measure energetic coupling networks in enzyme catalysis. Structural Dynamics. 2017;4:032101. doi: 10.1063/1.4974218.
Changeux JP. The feedback control mechanisms of biosynthetic L-threonine deaminase by L-isoleucine. Cold Spring Harbor Symposia on Quantitative Biology; 1961. pp. 313–318.
Changeux JP. 50 years of allosteric interactions: the twists and turns of the models. Nature Reviews Molecular Cell Biology. 2013;14:819–829. doi: 10.1038/nrm3695.
Changeux JP, Christopoulos A. Allosteric modulation as a unifying mechanism for receptor function and regulation. Cell. 2016;166:1084–1102. doi: 10.1016/j.cell.2016.08.015.
Chong S, Dugast-Darzacq C, Liu Z, Dong P, Dailey GM, Cattoglio C, Heckert A, Banala S, Lavis L, Darzacq X, Tjian R. Imaging dynamic and selective low-complexity domain interactions that control gene transcription. Science. 2018;361:eaar2555. doi: 10.1126/science.aar2555.
Clark S, Myers JB, King A, Fiala R, Novacek J, Pearce G, Heierhorst J, Reichow SL, Barbar EJ. Multivalency regulates activity in an intrinsically disordered transcription factor. eLife. 2018;7:e36258. doi: 10.7554/eLife.36258.
Cooper A, Dryden DT. Allostery without conformational change. A plausible model. European Biophysics Journal : EBJ. 1984;11:103–109. doi: 10.1007/BF00276625.
Dasgupta T, Croll DH, Owen JA, Vander Heiden MG, Locasale JW, Alon U, Cantley LC, Gunawardena J. A fundamental trade-off in covalent switching and its circumvention by enzyme bifunctionality in glucose homeostasis. Journal of Biological Chemistry. 2014;289:13010–13025. doi: 10.1074/jbc.M113.546515.
Demir Ö, Leong PU, Amaro RE. Full-length p53 tetramer bound to DNA and its quaternary dynamics. Oncogene. 2017;36:1451–1460. doi: 10.1038/onc.2016.321.
Dodd IB, Shearwin KE, Perkins AJ, Burr T, Hochschild A, Egan JB. Cooperativity in long-range gene regulation by the lambda CI repressor. Genes & Development. 2004;18:344–354. doi: 10.1101/gad.1167904.
Dyson HJ, Wright PE. Role of intrinsic protein disorder in the function and interactions of the transcriptional coactivators CREB-binding protein (CBP) and p300. Journal of Biological Chemistry. 2016;291:6714–6722. doi: 10.1074/jbc.R115.692020.
Edelman LB, Fraser P. Transcription factories: genetic programming in three dimensions. Current Opinion in Genetics & Development. 2012;22:110–114. doi: 10.1016/j.gde.2012.01.010.
Ehlert FJ. Cooperativity has empirical and ultimate levels of explanation. Trends in Pharmacological Sciences. 2016;37:620–623. doi: 10.1016/j.tips.2016.06.001.
Estrada J, Wong F, DePace A, Gunawardena J. Information integration and energy expenditure in gene regulation. Cell. 2016;166:234–244. doi: 10.1016/j.cell.2016.06.012.
Frauenfelder H, Sligar SG, Wolynes PG. The energy landscapes and motions of proteins. Science. 1991;254:1598–1603. doi: 10.1126/science.1749933.
Freidlin MI, Wentzell AD. Random Perturbations of Dynamical Systems. Heidelberg, Germany: Springer; 2012.
Furlong EEM, Levine M. Developmental enhancers and chromosome topology. Science. 2018;361:1341–1345. doi: 10.1126/science.aau0320.
Ganser LR, Kelly ML, Herschlag D, Al-Hashimi HM. The roles of structural dynamics in the cellular functions of RNAs. Nature Reviews Molecular Cell Biology. 2019;20:474–489. doi: 10.1038/s41580-019-0136-0.
Gerhart J. From feedback inhibition to allostery: the enduring example of aspartate transcarbamoylase. FEBS Journal. 2014;281:612–620. doi: 10.1111/febs.12483.
Grah R, Zoller B, Tkačik G. Nonequilibrium models of optimal enhancer function. PNAS. 2020;117:31614–31622. doi: 10.1073/pnas.2006731117.
Gregor T, Tank DW, Wieschaus EF, Bialek W. Probing the limits to positional information. Cell. 2007;130:153–164. doi: 10.1016/j.cell.2007.05.025.
Gruber R, Horovitz A. Unpicking allosteric mechanisms of homo-oligomeric proteins by determining their successive ligand binding constants. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170176. doi: 10.1098/rstb.2017.0176.
Gunawardena J. A linear framework for time-scale separation in nonlinear biochemical systems. PLOS ONE. 2012;7:e36321. doi: 10.1371/journal.pone.0036321.
Gunawardena J. Time-scale separation--Michaelis and Menten's old idea, still bearing fruit. FEBS Journal. 2014;281:473–488. doi: 10.1111/febs.12532.
Henzler-Wildman K, Kern D. Dynamic personalities of proteins. Nature. 2007;450:964–972. doi: 10.1038/nature06522.
Hill TL. Studies in irreversible thermodynamics IV. diagrammatic representation of steady state fluxes for unimolecular systems. Journal of Theoretical Biology. 1966;10:442–459. doi: 10.1016/0022-5193(66)90137-8.
Hilser VJ, Wrabl JO, Motlagh HN. Structural and energetic basis of allostery. Annual Review of Biophysics. 2012;41:585–609. doi: 10.1146/annurev-biophys-050511-102319.
Horovitz A, Fersht AR. Strategy for analysing the co-operativity of intramolecular interactions in peptides and proteins. Journal of Molecular Biology. 1990;214:613–617. doi: 10.1016/0022-2836(90)90275-Q.
Horovitz A, Fersht AR. Co-operative interactions during protein folding. Journal of Molecular Biology. 1992;224:733–740. doi: 10.1016/0022-2836(92)90557-Z.
Jain RK, Ranganathan R. Local complexity of amino acid interactions in a protein core. PNAS. 2004;101:111–116. doi: 10.1073/pnas.2534352100.
Kalo A, Kanter I, Shraga A, Sheinberger J, Tzemach H, Kinor N, Singer RH, Lionnet T, Shav-Tal Y. Cellular levels of signaling factors are sensed by β-actin alleles to modulate transcriptional pulse intensity. Cell Reports. 2015;11:419–432. doi: 10.1016/j.celrep.2015.03.039.
Kim S, Broströmer E, Xing D, Jin J, Chong S, Ge H, Wang S, Gu C, Yang L, Gao YQ, Su XD, Sun Y, Xie XS. Probing allostery through DNA. Science. 2013;339:816–819. doi: 10.1126/science.1229223.
Knoverek CR, Amarasinghe GK, Bowman GR. Advanced methods for accessing protein Shape-Shifting present new therapeutic opportunities. Trends in Biochemical Sciences. 2019;44:351–364. doi: 10.1016/j.tibs.2018.11.007.
Kornev AP, Taylor SS. Dynamics-Driven allostery in protein kinases. Trends in Biochemical Sciences. 2015;40:628–647. doi: 10.1016/j.tibs.2015.09.002.
Koshland DE, Némethy G, Filmer D. Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry. 1966;5:365–385. doi: 10.1021/bi00865a047.
Koshland DE, Hamadani K. Proteomics and models for enzyme cooperativity. Journal of Biological Chemistry. 2002;277:46841–46844. doi: 10.1074/jbc.R200014200.
Lammers NC, Kim YJ, Zhao J, Garcia HG. A matter of time: using dynamics and theory to uncover mechanisms of transcriptional bursting. Current Opinion in Cell Biology. 2020;67:147–157. doi: 10.1016/j.ceb.2020.08.001.
LeVine MV, Weinstein H. AIM for allostery: using the Ising model to understand information processing and transmission in allosteric biomolecular systems. Entropy. 2015;17:2895–2918. doi: 10.3390/e17052895.
Lewis BA. Understanding large multiprotein complexes: applying a multiple allosteric networks model to explain the function of the mediator transcription complex. Journal of Cell Science. 2010;123:159–163. doi: 10.1242/jcs.057216.
Lewis JS, Costa A. Caught in the act: structural dynamics of replication origin activation and fork progression. Biochemical Society Transactions. 2020;48:1057–1066. doi: 10.1042/BST20190998.
Li J, Dong A, Saydaminova K, Chang H, Wang G, Ochiai H, Yamamoto T, Pertsinidis A. Single-Molecule nanoscopy elucidates RNA polymerase II transcription at single genes in live cells. Cell. 2019;178:491–506. doi: 10.1016/j.cell.2019.05.029.
Lin Y, Sohn CH, Dalal CK, Cai L, Elowitz MB. Combinatorial gene regulation by modulation of relative pulse timing. Nature. 2015;527:54–58. doi: 10.1038/nature15710.
Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK. Intrinsic disorder in transcription factors. Biochemistry. 2006;45:6873–6888. doi: 10.1021/bi0602718.
Lorimer GH, Horovitz A, McLeish T. Allostery and molecular machines. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170173. doi: 10.1098/rstb.2017.0173.
Marco A, Meharena HS, Dileep V, Raju RM, Davila-Velderrain J, Zhang AL, Adaikkan C, Young JZ, Gao F, Kellis M, Tsai LH. Mapping the epigenomic and transcriptomic interplay during memory formation and recall in the hippocampal engram ensemble. Nature Neuroscience. 2020;23:1606–1617. doi: 10.1038/s41593-020-00717-0.
Martini JWR. A measure to quantify the degree of cooperativity in overall titration curves. Journal of Theoretical Biology. 2017;432:33–37. doi: 10.1016/j.jtbi.2017.08.010.
Marzen S, Garcia HG, Phillips R. Statistical mechanics of Monod-Wyman-Changeux (MWC) models. Journal of Molecular Biology. 2013;425:1433–1460. doi: 10.1016/j.jmb.2013.03.013.
Miller JA, Widom J. Collaborative competition mechanism for gene activation in vivo. Molecular and Cellular Biology. 2003;23:1623–1632. doi: 10.1128/MCB.23.5.1623-1632.2003.
Mir M, Bickmore W, Furlong EEM, Narlikar G. Chromatin topology, condensates and gene regulation: shifting paradigms or just a phase? Development. 2019;146:dev182766. doi: 10.1242/dev.182766.
Mirny LA. Nucleosome-mediated cooperativity between transcription factors. PNAS. 2010;107:22534–22539. doi: 10.1073/pnas.0913805107.
Mirzaev I, Bortz DM. Laplacian dynamics with synthesis and degradation. Bulletin of Mathematical Biology. 2015;77:1013–1045. doi: 10.1007/s11538-015-0075-7.
Mirzaev I, Gunawardena J. Laplacian dynamics on general graphs. Bulletin of Mathematical Biology. 2013;75:2118–2149. doi: 10.1007/s11538-013-9884-8.
Molina N, Suter DM, Cannavo R, Zoller B, Gotic I, Naef F. Stimulus-induced modulation of transcriptional bursting in a single mammalian gene. PNAS. 2013;110:20563–20568. doi: 10.1073/pnas.1312310110.
Monod J, Wyman J, Changeux JP. On the nature of allosteric transitions: a plausible model. Journal of Molecular Biology. 1965;12:88–118. doi: 10.1016/S0022-2836(65)80285-6.
Monod J, Jacob F. Teleonomic mechanisms in cellular metabolism, growth, and differentiation. Cold Spring Harbor Symposia on Quantitative Biology; 1961. pp. 389–401.
Motlagh HN, Li J, Thompson EB, Hilser VJ. Interplay between allostery and intrinsic disorder in an ensemble. Biochemical Society Transactions. 2012;40:975–980. doi: 10.1042/BST20120163.
Motlagh HN, Wrabl JO, Li J, Hilser VJ. The ensemble nature of allostery. Nature. 2014;508:331–339. doi: 10.1038/nature13001.
Noé F, Fischer S. Transition networks for modeling the kinetics of conformational change in macromolecules. Current Opinion in Structural Biology. 2008;18:154–162. doi: 10.1016/j.sbi.2008.01.008.
Nogales E, Fang J, Louder RK. Structural dynamics and DNA interaction of human TFIID. Transcription. 2017;8:55–60. doi: 10.1080/21541264.2016.1265701.
Nussinov R, Tsai CJ, Ma B. The underappreciated role of allostery in the cellular network. Annual Review of Biophysics. 2013;42:169–189. doi: 10.1146/annurev-biophys-083012-130257.
Park J, Estrada J, Johnson G, Vincent BJ, Ricci-Tam C, Bragdon MD, Shulgina Y, Cha A, Wunderlich Z, Gunawardena J, DePace AH. Dissecting the sharp response of a canonical developmental enhancer reveals multiple sources of cooperativity. eLife. 2019a;8:e41266. doi: 10.7554/eLife.41266.
Park M, Patel N, Keung AJ, Khalil AS. Engineering epigenetic regulation using synthetic Read-Write modules. Cell. 2019b;176:227–238. doi: 10.1016/j.cell.2018.11.002.
Pauling L. The oxygen equilibrium of hemoglobin and its structural interpretation. PNAS. 1935;21:186–191. doi: 10.1073/pnas.21.4.186.
Peeters E, van Oeffelen L, Nadal M, Forterre P, Charlier D. A thermodynamic model of the cooperative interaction between the archaeal transcription factor Ss-LrpB and its tripartite operator DNA. Gene. 2013;524:330–340. doi: 10.1016/j.gene.2013.03.118.
Perutz MF. Stereochemistry of cooperative effects in haemoglobin. Nature. 1970;228:726–734. doi: 10.1038/228726a0.
Portz B, Lu F, Gibbs EB, Mayfield JE, Rachel Mehaffey M, Zhang YJ, Brodbelt JS, Showalter SA, Gilmour DS. Structural heterogeneity in the intrinsically disordered RNA polymerase II C-terminal domain. Nature Communications. 2017;8:15231. doi: 10.1038/ncomms15231.
Robert CH, Decker H, Richey B, Gill SJ, Wyman J. Nesting: hierarchies of allosteric interactions. PNAS. 1987;84:1891–1895. doi: 10.1073/pnas.84.7.1891.
Sabari BR, Dall'Agnese A, Boija A, Klein IA, Coffey EL, Shrinivas K, Abraham BJ, Hannett NM, Zamudio AV, Manteiga JC, Li CH, Guo YE, Day DS, Schuijers J, Vasile E, Malik S, Hnisz D, Lee TI, Cisse II, Roeder RG, Sharp PA, Chakraborty AK, Young RA. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361:eaar3958. doi: 10.1126/science.aar3958.
Sadovsky E, Yifrach O. Principles underlying energetic coupling along an allosteric communication trajectory of a voltage-activated K+ channel. PNAS. 2007;104:19813–19818. doi: 10.1073/pnas.0708120104.
Schnakenberg J. Network theory of microscopic and macroscopic behavior of master equation systems. Reviews of Modern Physics. 1976;48:571–585. doi: 10.1103/RevModPhys.48.571.
Schueler-Furman O, Wodak SJ. Computational approaches to investigating allostery. Current Opinion in Structural Biology. 2016;41:159–171. doi: 10.1016/j.sbi.2016.06.017.
Sengupta U, Strodel B. Markov models for the elucidation of allosteric regulation. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170178. doi: 10.1098/rstb.2017.0178.
Shi H, Rangadurai A, Abou Assi H, Roy R, Case DA, Herschlag D, Yesselman JD, Al-Hashimi HM. Rapid and accurate determination of atomistic RNA dynamic ensemble models using NMR and structure prediction. Nature Communications. 2020;11:5531. doi: 10.1038/s41467-020-19371-y.
Smale ST, Plevy SE, Weinmann AS, Zhou L, Ramirez-Carrozzi VR, Pope SD, Bhatt DM, Tong AJ. Toward an understanding of the gene-specific and global logic of inducible gene transcription. Cold Spring Harbor Symposia on Quantitative Biology; 2013. pp. 61–68.
Stroock DW. An Introduction to Markov Processes. In: Vakil R, editor. Graduate Texts in Mathematics. Berlin, Germany: Springer-Verlag; 2014. pp. 1–203.
Thal DM, Glukhova A, Sexton PM, Christopoulos A. Structural insights into G-protein-coupled receptor allostery. Nature. 2018;559:45–53. doi: 10.1038/s41586-018-0259-z.
Tran H, Desponds J, Perez Romero CA, Coppey M, Fradin C, Dostatni N, Walczak AM. Precision in a rush: trade-offs between reproducibility and steepness of the hunchback expression pattern. PLOS Computational Biology. 2018;14:e1006513. doi: 10.1371/journal.pcbi.1006513.
Tsai CJ, Nussinov R. Gene-specific transcription activation via long-range allosteric shape-shifting. Biochemical Journal. 2011;439:15–25. doi: 10.1042/BJ20110972.
Tsai CJ, Nussinov R. A unified view of "how allostery works". PLOS Computational Biology. 2014;10:e1003394. doi: 10.1371/journal.pcbi.1003394.
Tzeng SR, Kalodimos CG. Protein dynamics and allostery: an NMR view. Current Opinion in Structural Biology. 2011;21:62–67. doi: 10.1016/j.sbi.2010.10.007.
Ullmann A. In memoriam: Jacques Monod (1910-1976). Genome Biology and Evolution. 2011;3:1025–1033. doi: 10.1093/gbe/evr024.
Ventsel' AD, Freidlin MI. On small random perturbations of dynamical systems. Russian Mathematical Surveys. 1970;25:1–55. doi: 10.1070/RM1970v025n01ABEH001254.
Voss TC, Hager GL. Dynamic regulation of transcriptional states by chromatin and transcription factors. Nature Reviews Genetics. 2014;15:69–81. doi: 10.1038/nrg3623.
Wales DJ. Energy landscapes: calculating pathways and rates. International Reviews in Physical Chemistry. 2006;25:237–282. doi: 10.1080/01442350600676921.
Wodak SJ, Paci E, Dokholyan NV, Berezovsky IN, Horovitz A, Li J, Hilser VJ, Bahar I, Karanicolas J, Stock G, Hamm P, Stote RH, Eberhardt J, Chebaro Y, Dejaegere A, Cecchini M, Changeux JP, Bolhuis PG, Vreede J, Faccioli P, Orioli S, Ravasio R, Yan L, Brito C, Wyart M, Gkeka P, Rivalta I, Palermo G, McCammon JA, Panecka-Hofman J, Wade RC, Di Pizio A, Niv MY, Nussinov R, Tsai CJ, Jang H, Padhorny D, Kozakov D, McLeish T. Allostery in its many disguises: from theory to applications. Structure. 2019;27:566–578. doi: 10.1016/j.str.2019.01.003.
Wolff MR, Schmid A, Korber P, Gerland U. Effective dynamics of nucleosome configurations at the yeast PHO5 promoter. eLife. 2021;10:e58394. doi: 10.7554/eLife.58394.
Wong F, Amir A, Gunawardena J. Energy-speed-accuracy relation in complex networks for biological discrimination. Physical Review E. 2018a;98:012420. doi: 10.1103/PhysRevE.98.012420.
Wong F, Dutta A, Chowdhury D, Gunawardena J. Structural conditions on complex networks for the Michaelis-Menten input-output response. PNAS. 2018b;115:9738–9743. doi: 10.1073/pnas.1808053115.
Wong F, Gunawardena J. Gene regulation in and out of equilibrium. Annual Review of Biophysics. 2020;49:199–226. doi: 10.1146/annurev-biophys-121219-081542.
Wrabl JO, Gu J, Liu T, Schrank TP, Whitten ST, Hilser VJ. The role of protein conformational fluctuations in allostery, function, and evolution. Biophysical Chemistry. 2011;159:129–141. doi: 10.1016/j.bpc.2011.05.020.
Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nature Reviews Molecular Cell Biology. 2015;16:18–29. doi: 10.1038/nrm3920.
Yordanov P, Stelling J. Steady-state differential dose response in biological systems. Biophysical Journal. 2018;114:723–736. doi: 10.1016/j.bpj.2017.11.3780.
Yordanov P, Stelling J. Efficient manipulation and generation of Kirchhoff polynomials for the analysis of non-equilibrium biochemical reaction networks. Journal of the Royal Society Interface. 2020;17:20190828. doi: 10.1098/rsif.2019.0828.

eLife. doi: 10.7554/eLife.65498.sa1

Decision letter

Editor: Arvind Murugan

Reviewed by: Hernan G Garcia

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This paper extends classical models of molecular cooperativity to higher order cooperativity, where the binding of ligand by a protein is affected by other already bound ligands. The work quantifies effective higher order cooperativity between 3 or more ligands that interact indirectly by biasing the underlying (equilibrium) molecular ensemble. The work should be of broad interest to protein scientists since it suggests a new way of quantifying empirical observations of cooperativity.

Decision letter after peer review:

Thank you for submitting your article "Allosteric conformational ensembles have unlimited capacity for integrating information" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by Arvind Murugan as Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Hernan G Garcia (Reviewer #1).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The reviewers had mixed opinions, primarily with respect to clarity of the paper and presenting a clear relationship to prior work. In particular, reviewer #2 has concerns about the way cooperativity is quantified here, the benefits of this approach and its relationship to prior work. Below, I summarize a few areas where the paper must be improved prior to being acceptable for publication. Please also refer to the reviewers' detailed reports for constructive criticism that will make this paper more readable and impactful.

1. Flavor of results in the main paper: The work relies on significant mathematical work that is entirely confined to the appendices. The main paper is too superficial as a result and the reader should have more meat to sink their teeth into. See reviewer's comments for suggestions – e.g., some equations (or intuition behind equations) can be moved from the appendix to the main paper. I present one suggestion re: Figure 4 below. Feel free to address this important issue in other ways instead.

Figure 4 is the only figure that presents some sense of the results and is much too brief. Perhaps Figure 4 can be unpacked, possibly into an additional figure, offering intuition into the remarkable binding curves shown (e.g., with positive and negative cooperativity in different regimes). For example, you could show the kinetic network needed to get one or two of the most interesting binding curves shown in Figure 4. The current visualization in Figure 4 in terms of heatmaps is hard to interpret.

The mathematical content in Materials and methods needs to be better integrated with the argument in the main text. One way to do this would be to add notes in the Methods that point to concepts discussed in the main text. See reviewer comments re: the same.

2. Relationship to prior work: Your work seeks to do two distinct things: (a) demonstrate that equilibrium conformational ensembles can implement any pattern of HOCs, (b) introduce a new way to quantify higher order cooperativity that's distinct from binding curve shape.

As one of the reviewers points out, the presentation of (b), relationship to prior work and benefits of the new measure over prior work should be better clarified. See reviewer comments for more. Could you spell out an example or two where the binding curve is an unwieldy or misleading characterization of cooperativity while your HOC coefficient performs better?

3. Concrete biological example – theory can and should precede experiments. But the paper will have more impact if the authors can lay out how to use the framework here to perform or interpret experiments. Ideally this would be done with a concrete example of a protein or protein complex where these ideas might potentially have relevance, how what is known about its conformations predicts HOCs and binding curves, what experimental signatures one might look for and so on – even if there is currently no data.

See review comments for other suggestions.

Reviewer #1:

Often in biology, in phenomena ranging from the binding of oxygen to hemoglobin to the binding of transcription factors to DNA, it is observed that the binding of a second ligand to its substrate is more likely than the binding of the first ligand. This so-called cooperativity is usually associated with direct ligand-ligand interactions. However, an increasing body of theoretical work rooted in the Monod-Wyman-Changeux and Koshland-Némethy-Filmer models has shown that, if the substrate can adopt two conformations, cooperativity can arise in the absence of direct interactions between ligands.
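
The indirect origin of cooperativity described above can be illustrated numerically. The following is a minimal sketch, assuming the textbook two-state MWC binding function with n identical sites; the function names and the parameter values (n = 4, L = 1000, K_R = 1, K_T = 0.01) are illustrative assumptions and are not taken from the manuscript under review. It estimates the Hill slope at half-saturation, which exceeds 1 even though no direct ligand-ligand interaction is included.

```python
import math

def mwc_fraction_bound(x, n, L, K_R, K_T):
    """Fractional saturation of a two-state MWC molecule with n identical sites.

    x        : free ligand concentration
    L        : [T]/[R] ratio with no ligand bound (hypothetical value below)
    K_R, K_T : per-site association constants in the R and T conformations
    """
    r = 1.0 + K_R * x
    t = 1.0 + K_T * x
    bound = K_R * x * r ** (n - 1) + L * K_T * x * t ** (n - 1)
    total = r ** n + L * t ** n
    return bound / total

def hill_slope(f, x, h=1e-4):
    """Central-difference estimate of d log(Y / (1 - Y)) / d log(x)."""
    def logit(y):
        return math.log(y / (1.0 - y))
    return (logit(f(x * math.exp(h))) - logit(f(x * math.exp(-h)))) / (2.0 * h)

def half_saturation(f, lo=1e-9, hi=1e9, iters=200):
    """Bisection on log(x) for f(x) = 0.5; assumes f is increasing in x."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if f(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def two_state(x):
    # Illustrative parameters only; not fitted to any data.
    return mwc_fraction_bound(x, n=4, L=1000.0, K_R=1.0, K_T=0.01)

def one_state(x):
    # L = 0 removes the T conformation, leaving independent identical sites.
    return mwc_fraction_bound(x, n=4, L=0.0, K_R=1.0, K_T=0.01)

slope_two_state = hill_slope(two_state, half_saturation(two_state))
slope_one_state = hill_slope(one_state, half_saturation(one_state))
print(f"two conformations: Hill slope at half-saturation = {slope_two_state:.2f}")
print(f"one conformation:  Hill slope at half-saturation = {slope_one_state:.2f}")
```

With L = 0 the ensemble collapses to a single conformation and the slope returns to 1, so in this sketch the steepness comes entirely from the conformational equilibrium.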

Despite the widespread adoption of these models, they have presented limitations when confronted with real data. For example, quantitatively recapitulating gene expression input-output functions in eukaryotes often calls for more than the pairwise interactions that lead to classic cooperativity. Instead, in order to reconcile theory and experiment it is necessary to invoke higher-order cooperativity. Here, multiple bound ligands act in a collective fashion to influence the binding (or unbinding) of additional ligands.

Biddle et al. propose an intriguing theoretical model for realizing higher-order cooperativity between binding sites in a single substrate in the absence of energy dissipation, which means that the systems considered must adhere to the strict constraints of microscopic reversibility imposed by thermodynamic equilibrium. They demonstrate that, by extending previous models and allowing the substrate to fluctuate between multiple distinct conformational states, systems may achieve arbitrary higher-order cooperativity (HOC) behaviors, even at thermodynamic equilibrium. Their graph-based method extends the idea of allosteric regulation to apply to systems with many distinct conformational degrees of freedom and, as such, should, in principle, provide a useful conceptual tool for interrogating the wide range of biological processes in which allostery is thought to play some role.

The paper is extremely well-written, with ample room for the introduction of concepts, including their historical background, and for the discussion. However, we worry that the difficulty of their mathematical notation, as well as their choice to relegate key details about both the derivation and the application of their method to the SI, will limit the impact and pedagogical value of this creative and timely work.

Likewise, the considerable import of their finding that sufficiently complex allosteric systems can realize any regulatory logic that is achievable at thermodynamic equilibrium is somewhat obscured by the absence of a clear, detailed application to a concrete biological system. All the same, we view this work as an exciting step towards developing theoretical models that adequately attend to the richness and complexity of real biological systems.

Strengths:

– The paper offers a new framework for thinking about how complex allosteric systems with multiple distinct conformations function to integrate information from ligand binding.

– The authors show that allostery, when sufficiently complex, can provide a physical basis for the emergence of higher-order cooperativities of an arbitrary nature.

– The authors provide an intuitive method for coarse-graining systems with many conformations into a single, tractable ligand-binding graph, which can then be used to quantify higher-order cooperativities between binding sites. This method should prove a useful tool for navigating the complexities present in many real biological systems.

– The authors show that their framework is consistent with (and therefore subsumes) previously used MWC models.

Weaknesses:

– Due to the strong results and implications of the paper, the mathematical proofs in the Materials and methods section must be easy to follow and accessible to the reader. The abundance of indices and references back and forth from the main text make it difficult to follow and evaluate the authors' claims throughout this work. The derivations of the authors' coarse-graining procedure and their expression for effective higher-order-cooperativity, as well as their proof that sufficiently complex allosteric systems can achieve any regulatory logic, are nowhere to be found in the main text. While it may not be practical to include these pieces in full, the authors often could at least provide qualitative intuition for the origins and implications of the expressions they present.

– The lemmas and proofs in the Materials and methods are stated mostly in the form of equations, with little explanation of how each proof connects to the concepts explained in the main text.

– It is worth noting that the authors limit themselves to considering systems at thermodynamic equilibrium. This is perfectly understandable given the considerable scope of the work already undertaken, but it will be interesting to see what new behaviors might emerge from systems operating away from equilibrium in future work.

– Given that this paper considers only the equilibrium situation, it would be interesting to explicitly state the advantage of adopting the linear framework as opposed to a thermodynamic description in terms of, for example, Boltzmann weights.

– The absence of a thorough, well-illustrated application to a concrete biological system somewhat dampens the paper's impact.

– The authors use the phrase "information integration" multiple times throughout, but they never provide a precise definition of what they mean. Typically a treatment of information transmission would be expected to deal with noise, as well as mean behavior, but that is not done here. They need to clearly define this term early on. While the authors provide an example that does give some intuition in lines 126-136, it might be helpful to move this discussion earlier to provide more context for the rest of the discussion in the introduction.

– In line 41, the authors point out that previous studies investigating effective cooperative effects in MWC models do not "quantitatively determine" the effective cooperativity, but instead infer it indirectly from the shape of the binding curve. However, they do not tell us why this matters. What can we expect to gain by quantifying effective cooperativity directly?

– What is the benefit of having more than 2 conformations? Can the authors show, quantitatively, how performance scales with the number of conformations? The discussion in lines 340-344 provides some basis for this, but the point seems worthy of further discussion and illustration. Is there a graphical way to illustrate the space of achievable integrative behaviors, and how this expands with increasing N (for some given n)?

– This work would be significantly strengthened by including a concrete example that demonstrates both how the framework could be employed to analyze a biological system and what it tells us about how conformational flexibility impacts integrative behaviors. For instance, the authors could revisit their earlier work on the hunchback gene in fruit flies (Estrada et al., Cell, 2016; Park et al., eLife, 2019), and show how the space of achievable GRFs expands with the number of conformational degrees of freedom.

Reviewer #2:

In this paper, the authors argue correctly that quantification of higher-order coupling (HOC) is crucial for the understanding of biological systems at many different levels of description. I found the paper hard to read. This is due, in part, to the lack of connection with previous descriptions of HOC. The most basic description of pairwise coupling is usually through linkage analysis developed by Wyman. Such coupling is often described by cycles, e.g. a double-mutant cycle or a cycle that describes binding of some ligand X in the absence and presence of a second ligand Y. Pairwise coupling is usually considered to have a dimension of 2 (and not 1 as in the work here). A natural extension to HOC coupling is then done via higher-order dimensional constructs, e.g. triple-mutant boxes for the 3-way coupling between 3 residues (JMB 1990 Aug 5;214(3):613-7; PNAS 2004 Jan 6;101(1):111-6; Annu Rev Biophys. 2017 May 22;46:433-453). Consequently, a key question for me about the current work is the relationship between the previously used measure for HOC and the one described here.

Also, is there an advantage to using the measure proposed in the current work? It seems to me that the description here bypasses intermediate orders of coupling. In other words, nth order coupling is not described in terms of all the lower orders of coupling. Is that a good thing?

In addition, the authors ignore (lines 48-50) the existence of the Hill constant which provides a measure of cooperativity despite having some shortcomings and (line 83) the many previous papers about HOC as mentioned above.

Other comments:

1. Line 308 and elsewhere – it seems that statistical corrections for the binding constants were not introduced. This is OK if stated and not misinterpreted.

2. Line 321 – HOC usually diminishes with factorial decomposition. Why not here?

3. Lines 328, 401-402 – site-heterogeneity leads to apparent negative cooperativity but it is apparent since it can involve no coupling or 'communication' between sites. It should not, therefore, be presented as a possible source for HOC and is not true negative cooperativity.

4. Line 338 – I thought that intrinsic HOC can arise only when the sites are not identical, so what am I missing, unless it's the statistical factor?

5. Figure 4 – why can binding decrease with increasing substrate concentration?

6. Lines 385-392 – for hemoglobin, affinity increases but cooperativity actually decreases at high substrate concentrations because most of the molecules are 'locked' in the R state. Is this captured by the current formalism?

7. Line 699 – fix typo: i to k; I don't understand Equation 15. If each term in the product is a ratio of the terms for forward and reverse directions, then so should the result on the rhs be. Thermodynamically, a product of equilibrium constants is an equilibrium constant but the result on the rhs is not.

8. The analogy with TF binding is potentially problematic because of confusion between different levels of cooperativity. For example, IPTG binding to the lac repressor dimer occurs without cooperativity but 2 IPTG molecules need to be bound for transcription to occur. Hence, measuring transcription as a function of IPTG concentration appears to be very cooperative but the fraction bound as a function of IPTG concentration is not.
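
Comments 3 and 8 both concern apparent cooperativity that arises without any coupling between sites. A minimal sketch of the two cases follows, assuming two independent sites and, for comment 8, a hypothetical readout proportional to the probability that both sites are occupied. The function names and constants are illustrative assumptions and do not model the lac system quantitatively.

```python
import math

def site_occupancy(K, x):
    """Occupancy of a single independent site with association constant K."""
    return K * x / (1.0 + K * x)

def fraction_bound(x, K1, K2):
    """Average occupancy of two independent sites (no coupling between them)."""
    return 0.5 * (site_occupancy(K1, x) + site_occupancy(K2, x))

def both_bound(x, K):
    """Probability that two identical independent sites are both occupied."""
    return site_occupancy(K, x) ** 2

def hill_slope(f, x, h=1e-4):
    """Central-difference estimate of d log(Y / (1 - Y)) / d log(x)."""
    def logit(y):
        return math.log(y / (1.0 - y))
    return (logit(f(x * math.exp(h))) - logit(f(x * math.exp(-h)))) / (2.0 * h)

# Comment 3: heterogeneous sites (K1 = 100, K2 = 1) are half-saturated at
# x = 1 / sqrt(K1 * K2); the slope there is below 1 despite independence.
K1, K2 = 100.0, 1.0
x_half = 1.0 / math.sqrt(K1 * K2)
slope_heterogeneous = hill_slope(lambda x: fraction_bound(x, K1, K2), x_half)

# Comment 8: identical sites bind non-cooperatively (slope 1), yet a readout
# that requires both sites to be occupied is steeper at low concentration.
K = 1.0
x_low = 1e-3
slope_binding = hill_slope(lambda x: fraction_bound(x, K, K), x_low)
slope_readout = hill_slope(lambda x: both_bound(x, K), x_low)

print(f"heterogeneous sites, binding slope: {slope_heterogeneous:.3f}")
print(f"identical sites, binding slope:     {slope_binding:.3f}")
print(f"identical sites, readout slope:     {slope_readout:.3f}")
```

In both cases the sites do not interact in this sketch, which is the distinction drawn above between apparent and true cooperativity.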

eLife. 2021 Jun 9;10:e65498. doi: 10.7554/eLife.65498.sa2

Author response

Essential revisions:

The reviewers had mixed opinions, primarily with respect to clarity of the paper and presenting a clear relationship to prior work. In particular, reviewer #2 has concerns about the way cooperativity is quantified here, the benefits of this approach and its relationship to prior work. Below, I summarize a few areas where the paper must be improved prior to being acceptable for publication. Please also refer to the reviewers' detailed reports for constructive criticism that will make this paper more readable and impactful.

1. Flavor of results in the main paper: The work relies on significant mathematical work that is entirely confined to the appendices. The main paper is too superficial as a result and the reader should have more meat to sink their teeth into. See reviewer's comments for suggestions – e.g., some equations (or intuition behind equations) can be moved from the appendix to the main paper. I present one suggestion re: Figure 4 below. Feel free to address this important issue in other ways instead.

Figure 4 is the only figure that presents some sense of the results and is much too brief. Perhaps Figure 4 can be unpacked, possibly into an additional figure, offering intuition into the remarkable binding curves shown (e.g., with positive and negative cooperativity in different regimes). For example, you could show the kinetic network needed to get one or two of the most interesting binding curves shown in Figure 4. The current visualization in Figure 4 in terms of heatmaps is hard to interpret.

The mathematical content in Materials and methods needs to be better integrated with the argument in the main text. One way to do this would be to add notes in the Methods that point to concepts discussed in the main text. See reviewer comments re: the same.

Our previous experience has been that most readers would prefer not to confront the mathematics and we had structured the paper accordingly. We apologise for this misjudgement and have taken the following steps to provide more "meat" in the main text.

– We have described the free-energy landscape in more detail, with a new Equation 1 and a new Figure 3.

– As a response to point 2 below, we have added a new section to the Results in which we explain in detail the mathematical relationship between higher-order cooperativity measures. We have introduced a new Figure 5 and new Equations 4 to 16, along with 3 other unnumbered displayed equations.

– We have explained in more detail the basis of coarse graining and the further details provided in the Materials and methods (lines 443-50).

– We have included the essential details of the proof of the flexibility theorem in the main text. This material includes the new Equations 21 to 27, along with 3 other unnumbered displayed equations, as well as the new Figure 6, which is enhanced from what was previously Scheme 2 in the Materials and methods. We still provide a fully rigorous and concise proof in the Materials and methods.

– We have broken up the old Figure 4 into two new figures (Figures 7 and 8), as requested, and included a new depiction of the allostery graph in Figure 8A.

2. Relationship to prior work: Your work seeks to do two distinct things: (a) demonstrate that equilibrium conformational ensembles can implement any pattern of HOCs, (b) introduce a new way to quantify higher order cooperativity that's distinct from binding curve shape.

As one of the reviewers points out, the presentation of (b), relationship to prior work and benefits of the new measure over prior work should be better clarified. See reviewer comments for more. Could you spell out an example or two where the binding curve is an unwieldy or misleading characterization of cooperativity while your HOC coefficient performs better?

This is an important point and we apologise for our unfamiliarity with the prior work described by Reviewer #2. We have now pointed out this prior work in the Introduction (lines 98-106) and included a new section of the Results entitled Relationships between higher-order measures (pages 15-21) in which we carefully explain the relationship between our HOCs and the two forms of higher-order couplings introduced in previous work. We present general formulas for the couplings described in both Horovitz and Fersht 1992 (Equations 6 and 7) and Horovitz and Fersht 1990 (Equation 11). The latter formula seems to be new, to our knowledge. We further give new general formulas for calculating both measures from our HOCs (Equations 8 and 14), from which we deduce rigorously that the two measures introduced in Horovitz and Fersht 1990, 1992 are, in fact, the same (Equation 15). We were surprised not to find a clear statement of this equality in the literature. We presume that it is well known to those in the field and tacitly assumed. We note that it would not be easy to formulate a rigorous statement of this equality in the absence of a general definition for the higher-order couplings introduced in Horovitz and Fersht 1990. We have now provided such a definition in Equation 11. We hope, therefore, that this new section will be of some value and that it provides a full answer to the Reviewer's question as to "the relationship between the previously used measure for HOC and the one described here". As to the benefits of our HOCs, we make comparisons between all the measures in the penultimate paragraph of the new section. We feel that each measure is suitable for a different purpose and we explain why our HOCs are well suited to the problems studied in the present paper.

3. Concrete biological example – theory can and should precede experiments. But the paper will have more impact if the authors can lay out how to use the framework here to perform or interpret experiments. Ideally this would be done with a concrete example of a protein or protein complex where these ideas might potentially have relevance, how what is known about its conformations predicts HOCs and binding curves, what experimental signatures one might look for and so on – even if there is currently no data.

We had included an extensive discussion of the implications of our results for gene regulation, based on the "haemoglobin analogy", as depicted in the old Figure 5 (now Figure 10), and we remarked on the kinds of experiments that would be needed to test this conceptual picture (lines 803-11). We feel this does illustrate the significance of our findings but acknowledge that this material is Discussion rather than Results. Accordingly, we have included a new final section of the Results entitled Allosteric ensembles for Hill functions (pages 32-5) and a new figure (Figure 9) to show that allosteric ensembles can be found whose binding functions closely approximate Hill functions.

See review comments for other suggestions.

Reviewer #1:

[…] – Given that this paper considers only the equilibrium situation, it would be interesting to explicitly state the advantage of adopting the linear framework as opposed to a thermodynamic description in terms of, for example, Boltzmann weights.

We thank the Reviewer for this suggestion. We have now explained the advantage of the graph-based linear framework at the point where we discuss equilibrium statistical mechanics (lines 266-74). We have also noted there the central role that linear framework graphs play in the subsequent new section in which we examine the relationship between higher-order measures.

– The authors use the phrase "information integration" multiple times throughout, but they never provide a precise definition of what they mean. Typically a treatment of information transmission would be expected to deal with noise, as well as mean behavior, but that is not done here. They need to clearly define this term early on. While the authors provide an example that does give some intuition in lines 126-136, it might be helpful to move this discussion earlier to provide more context for the rest of the discussion in the introduction.

We apologise for not being clear about what we mean by "integration". We were not thinking of it in terms of information theory, as the Reviewer suggests, but, rather, as the process by which the occurrence of ligand binding influences downstream function. We have now stated this in the second sentence of the text (lines 3-7).

– In line 41, the authors point out that previous studies investigating effective cooperative effects in MWC models do not "quantitatively determine" the effective cooperativity, but instead infer it indirectly from the shape of the binding curve. However, they do not tell us why this matters. What can we expect to gain by quantifying effective cooperativity directly?

Briefly, we gain access to the free-energy landscape, which cannot be acquired from aggregated measures such as the shape of the binding curve. To introduce this point, we have now added a sentence at lines 31-32 to explain how association constants or cooperativities are another way of describing free energies. We have then explained more carefully on lines 53-62 the significance of effective cooperativities for describing the free-energy landscape.
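
The correspondence mentioned here is the standard thermodynamic one. As a sketch, write $K_{i,S}$ for the association constant for binding to site $i$ when the sites in $S$ are already bound, and $\omega_{i,S}$ for the corresponding cooperativity relative to the empty molecule. This notation, and the reference concentration $c^\circ$ that makes the argument of the logarithm dimensionless, are assumptions made for illustration and should be checked against the definitions in the main text.

```latex
\Delta G_{i,S} = -RT \ln\left(K_{i,S}\, c^\circ\right),
\qquad
\omega_{i,S} = \frac{K_{i,S}}{K_{i,\emptyset}}
\quad\Longrightarrow\quad
\Delta G_{i,S} - \Delta G_{i,\emptyset} = -RT \ln \omega_{i,S}.
```

For example, at $T = 298\ \mathrm{K}$, where $RT \approx 2.48\ \mathrm{kJ\,mol^{-1}}$, a cooperativity of $\omega = 10$ corresponds to a free-energy difference of about $-5.7\ \mathrm{kJ\,mol^{-1}}$.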

– What is the benefit of having more than 2 conformations? Can the authors show, quantitatively, how performance scales with the number of conformations? The discussion in lines 340-344 provides some basis for this, but the point seems worthy of further discussion and illustration. Is there a graphical way to illustrate the space of achievable integrative behaviors, and how this expands with increasing N (for some given n)?

We fully agree with the Reviewer that these are interesting questions but we fear that answering them amounts to writing another paper. As the Reviewer notes, we have explained why more conformations are mathematically essential to achieve flexibility (lines 520-21) and we have proved that, with enough conformations, complete flexibility can be achieved (Integrative flexibility of ensembles and Theorem 1 in the Materials and methods). We also note, in the new final section of the results, that the number of conformations may play a role in the flexibility with which Hill functions can be approximated (lines 649-56). However, as we point out, the impact of the number of conformations is a delicate question because of the potential interplay between numbers of sites, numbers of conformations and parametric ranges. To go further and to work out how the number of conformations influences function requires substantial further work. We feel this is more appropriate to a follow-up study.

– This work would be significantly strengthened by including a concrete example that demonstrates both how the framework could be employed to analyze a biological system and what it tells us about how conformational flexibility impacts integrative behaviors. For instance, the authors could revisit their earlier work on the hunchback gene in fruit flies (Estrada et al., Cell, 2016; Park et al., eLife, 2019), and show how the space of achievable GRFs expands with the number of conformational degrees of freedom.

Our thanks to the Reviewer for this suggestion. We have now included a new final section of the Results entitled Allosteric ensembles for Hill functions (pages 32-5) along with a new Figure 9 in which we show how the Hill functions, which provide fits to experimental data on hunchback, can be recovered from an allosteric ensemble.

Reviewer #2:

In this paper, the authors argue correctly that quantification of higher-order coupling (HOC) is crucial for the understanding of biological systems at many different levels of description. I found the paper hard to read. This is due, in part, to the lack of connection with previous descriptions of HOC. The most basic description of pairwise coupling is usually through linkage analysis developed by Wyman. Such coupling is often described by cycles, e.g. a double-mutant cycle or a cycle that describes binding of some ligand X in the absence and presence of a second ligand Y. Pairwise coupling is usually considered to have a dimension of 2 (and not 1 as in the work here). A natural extension to HOC coupling is then done via higher-order dimensional constructs, e.g. triple-mutant boxes for the 3-way coupling between 3 residues (JMB 1990 Aug 5;214(3):613-7; PNAS 2004 Jan 6;101(1):111-6; Annu Rev Biophys. 2017 May 22;46:433-453). Consequently, a key question for me about the current work is the relationship between the previously used measure for HOC and the one described here.

Also, is there an advantage to using the measure proposed in the current work? It seems to me that the description here bypasses intermediate orders of coupling. In other words, nth order coupling is not described in terms of all the lower orders of coupling. Is that a good thing?

In addition, the authors ignore (lines 48-50) the existence of the Hill constant which provides a measure of cooperativity despite having some shortcomings and (line 83) the many previous papers about HOC as mentioned above.

We are grateful to the Reviewer for pointing out the previous work on higher-order measures and apologise for having overlooked it. We have addressed this important matter in detail and discussed the advantages of the new measure, as fully described in point 2 above. We have now cited in a new paragraph of the Introduction (lines 98-106) all the references provided by the Reviewer as well as Horovitz and Fersht 1992, which we have discussed further in the Results (see point 2), Jain and Ranganathan 2004, Sadovsky and Yifrach 2007 and Carter et al. 2017. We hope these revisions go some way towards placing the paper in the context of previous work.

Pairwise coupling is usually considered to have a dimension of 2 (and not 1 as in the work here).

We agree that this is so for the customary higher-order couplings and we have used the new Equation 13 to point out this difference (lines 386-8). We note that the situation is more complicated when there is a non-trivial "offset", which arises in the new treatment of higher-order couplings which we have provided (Equation 11). The offset increases the order of the corresponding HOC, as can be seen from Equations 13 or 14.

It seems to me that the description here bypasses intermediate orders of coupling. In other words, nth order coupling is not described in terms of all the lower orders of coupling. Is that a good thing?

Indeed, the Reviewer is correct in saying that our HOCs are not hierarchical. Whether that is a good thing or not depends, presumably, on what kinds of problems one is trying to address. We believe that HOCs are well suited to describe integration of binding information and specifically to understand how such integration arises "effectively" from a conformational ensemble through coarse graining. This is one of the main contributions of our paper, for which a hierarchical measure of coupling would have been substantially harder to work with. Furthermore, as we show in Equations 8 and 14, our HOCs can precisely describe the hierarchical "intermediate orders of coupling" which are present in the higher-order measures introduced in Horovitz and Fersht 1990 and 1992. With Equations 8 and 14 now available, there is no difficulty in calculating the effective higher-order couplings arising from any conformational ensemble, thereby recovering the "intermediate orders of coupling" in this generalised setting.

In addition, the authors ignore (lines 48-50) the existence of the Hill constant which provides a measure of cooperativity despite having some shortcomings.

We have now mentioned the Hill coefficient (lines 53-9) and explained more carefully why aggregated measures of this kind provide only limited information about the underlying free energies. This point is reiterated in the last section of the Results (lines 640-5) and in the new Figure 9.

Other comments:

1. Line 308 and elsewhere – it seems that statistical corrections for the binding constants were not introduced. This is OK if stated and not misinterpreted.

The Reviewer is correct that we do not use statistical factors. These are required when binding states are represented by the number of bound sites. We avoid the issue by tracking each bound site individually, as a member of the subset of bound sites. At the specific point to which the Reviewer refers, now Equation 19, we show that HOCs depend only on the number of bound sites. Statistical factors do not appear to be necessary for the discussion that follows.
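The point about statistical factors can be illustrated with a minimal sketch, which is not taken from the paper. It assumes n identical, independent sites with a hypothetical site association constant `K`. Counting binding states by subsets of bound sites gives one microstate per subset, whereas counting by the number of bound sites lumps together all subsets of the same size, which is where the binomial (statistical) factors come from. The function names are illustrative only.

```python
from itertools import combinations
from math import comb


def microstates_by_count(n):
    """Number of bound-site subsets of each size k for n sites."""
    return {k: sum(1 for _ in combinations(range(n), k)) for k in range(n + 1)}


def macroscopic_constants(n, K):
    """Stepwise constants for n identical, independent sites (assumption).

    When states are labelled only by the number k of bound sites, the k-th
    stepwise constant picks up the statistical factor
    comb(n, k) / comb(n, k - 1) = (n - k + 1) / k.
    """
    return [K * comb(n, k) / comb(n, k - 1) for k in range(1, n + 1)]


if __name__ == "__main__":
    print(microstates_by_count(4))
    print(macroscopic_constants(4, 1.0))
```

In the subset representation each of the microstates is listed explicitly, so no such factor needs to be attached to the site-level constants.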

2. Line 321 – HOC usually diminishes with factorial decomposition. Why not here?

We are not sure what the Reviewer means by "factorial decomposition". However, our finding that cooperativity increases with order for the MWC-like ensemble (Equation 20) was for our definition of HOC. It is conceivable that this is not the case for the measures introduced in Horovitz and Fersht 1990, 1992. Indeed, Equations 8 and 14, which show how higher-order couplings are calculated from HOCs, involve a ratio of HOCs. Hence, it would be possible, in principle, for these other measures to diminish with order, as the Reviewer suggests, even though our HOCs do not. However, we have not investigated this matter further.

3. Lines 328, 401-402 – site-heterogeneity leads to apparent negative cooperativity but it is apparent since it can involve no coupling or 'communication' between sites. It should not, therefore, be presented as a possible source for HOC and is not true negative cooperativity.

We have been careful to make the distinction which the Reviewer draws between cooperativity at the level of a single molecule, and "effective" cooperativity, at the level of an ensemble. We distinguish throughout the paper between the "intrinsic" cooperativity within a given conformation and the "effective" cooperativity arising from the ensemble. We prefer "effective" to either "apparent" or "false" cooperativity. We do not present the heterogeneity of sites as a source of negative cooperativity, only of negative effective cooperativity (line 400 in the original paper; lines 699-700 in the revision). We feel this is a reasonable way to maintain the distinction which the Reviewer makes.

4. Line 338 – I thought that intrinsic HOC can arise only when the sites are not identical so what am I missing unless it's the statistical factor.

There seems to be some confusion here. We define "intrinsic" HOC to be the cooperativity between sites in a single conformation (Equation 2). We define sites to be "identical" if they have the same association constants for binding (lines 477-8). It is possible for sites to be identical and still have intrinsic HOCs but, in the passage in question, we impose the requirement that all intrinsic HOCs are one, so that the sites are independent. This means that any effective cooperativity which arises in the ensemble cannot be attributed to intrinsic cooperativity arising from an individual conformation.

5. Figure 4 – why can binding decrease with increasing substrate concentration?

Average total binding, or fractional saturation, cannot decrease with increasing substrate, no matter what cooperativities are present. That is a consequence of thermodynamics. However, average binding at an individual site can increase or decrease depending on the pattern of cooperativities, as shown in Figure 7B.
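The contrast between total binding and binding at an individual site can be checked numerically. The following is a minimal sketch under stated assumptions: it uses three sites and a hypothetical set of equilibrium weights `WEIGHTS`, one for each subset of bound sites, chosen only for illustration and not taken from Figure 7B. Average total binding is the logarithmic derivative of the partition function with respect to concentration, so its slope is a variance and is never negative, while the occupancy of site 1 alone is free to fall over part of the concentration range.

```python
import numpy as np

# Hypothetical equilibrium weights for each subset of bound sites (assumption).
# Site 1 binds strongly on its own, but the pair {2, 3} is strongly favoured,
# which displaces site 1 at intermediate concentrations.
WEIGHTS = {
    (): 1.0,
    (1,): 100.0, (2,): 1.0, (3,): 1.0,
    (1, 2): 1.0, (1, 3): 1.0, (2, 3): 1000.0,
    (1, 2, 3): 1.0,
}


def binding_averages(x):
    """Return (average number of bound sites, occupancy of site 1) at concentration x."""
    z = total = site1 = 0.0
    for subset, mu in WEIGHTS.items():
        w = mu * x ** len(subset)      # statistical weight of this binding state
        z += w
        total += len(subset) * w
        if 1 in subset:
            site1 += w
    return total / z, site1 / z


xs = np.logspace(-4, 4, 400)
totals, site1 = np.array([binding_averages(x) for x in xs]).T

if __name__ == "__main__":
    print("total binding monotone:", bool(np.all(np.diff(totals) >= -1e-12)))
    print("site-1 occupancy decreases somewhere:", bool(np.min(np.diff(site1)) < 0))
```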

6. Lines 385-392 – for hemoglobin affinity increases but cooperativity actually decreases at high substrate concentrations because most of the molecules are 'locked' in the R state. Is this captured by the current formalism?

We do not know which measure of cooperativity the Reviewer has in mind here. However, if the implication is that some measure of cooperativity becomes concentration dependent, then none of the measures discussed in the paper have that property. They are all independent of concentration. Accordingly, the current formalism would not capture the behaviour described by the Reviewer, although it seems like an interesting question to explore further.

7. Line 699 – fix typo: i to k; I don't understand Equation 15. If each term in the product is a ratio of the terms for forward and reverse directions so should the result on the rhs. Thermodynamically, a product of equilibrium constants is an equilibrium constant but the result on the rhs is not.

Corrected. Thank you! The old Equation 15 (new Equation 39) is for a linear framework graph. In our treatment in this section, the only requirement for an edge label is that it is a rate, with units of (time)^-1, and no thermodynamic terms, such as ligand concentrations, are specified within the labels. Accordingly, the ratios in Equation 39 are all non-dimensional, so no inconsistency arises between the left-hand and right-hand sides.
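A small numerical sketch may help to show why a product of label ratios is dimensionless. It assumes, without access to the equation itself here, that the formula in question has the path-product form customary for reversible graphs at thermodynamic equilibrium, in which the steady-state weight of a vertex is the product of forward-to-reverse label ratios along a path from a reference vertex. The three-vertex chain and the rate values are hypothetical.

```python
import numpy as np

# Hypothetical edge labels (units of 1/time) on a reversible chain 0 <-> 1 <-> 2.
RATES = {(0, 1): 2.0, (1, 0): 0.5, (1, 2): 3.0, (2, 1): 1.5}


def laplacian(rates, n):
    """Matrix L of the master equation dp/dt = L p for the labelled graph."""
    L = np.zeros((n, n))
    for (i, j), k in rates.items():
        L[j, i] += k   # probability flowing into j from i
        L[i, i] -= k   # probability leaving i
    return L


def steady_state_from_kernel(rates, n):
    """Steady state as the normalised null vector of the Laplacian."""
    _, _, vt = np.linalg.svd(laplacian(rates, n))
    v = vt[-1]         # right singular vector for the zero singular value
    return v / v.sum()


def steady_state_from_path_products(rates, path):
    """Steady state from products of (forward / reverse) label ratios along a path."""
    mu = [1.0]
    for i, j in zip(path, path[1:]):
        mu.append(mu[-1] * rates[(i, j)] / rates[(j, i)])  # each ratio is dimensionless
    mu = np.array(mu)
    return mu / mu.sum()


if __name__ == "__main__":
    print(steady_state_from_kernel(RATES, 3))
    print(steady_state_from_path_products(RATES, [0, 1, 2]))
```

Because every factor is a ratio of two rates, the time units cancel and both sides of the relation are pure numbers.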

8. The analogy with TF binding is potentially problematic because of confusion between different levels of cooperativity. For example, IPTG binding to the lac repressor dimer occurs without cooperativity but 2 IPTG molecules need to be bound for transcription to occur. Hence, measuring transcription as a function of IPTG concentration appears to be very cooperative but the fraction bound as a function of IPTG concentration is not.

Indeed, we agree that cooperativity depends crucially on which input is being considered: if the input is the TF, that gives a very different result than if the input is IPTG. We do not see this as problematic but, rather, as a potential source of confusion if the input is not clearly specified. To address the Reviewer's concern, we have made sure to say "input pattern of TFs" throughout the Discussion.
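The Reviewer's example can be sketched numerically to show how the apparent cooperativity depends on which quantity is read out. This is an illustration under simplifying assumptions, not a model of the lac repressor: two identical, independent sites with a hypothetical association constant of one, and an output taken to be proportional to the probability that both sites are occupied. The helper `hill_slope` is an illustrative name for the slope of the Hill plot.

```python
import numpy as np


def hill_slope(x, y):
    """Numerical slope of the Hill plot, d ln(y / (1 - y)) / d ln x."""
    return np.gradient(np.log(y / (1.0 - y)), np.log(x))


x = np.logspace(-3, 3, 2001)   # ligand concentration in units of 1/K (assumption)
bound = x / (1.0 + x)          # fractional occupancy of one independent site
output = bound ** 2            # assumed readout: both sites must be occupied

half = np.argmin(np.abs(output - 0.5))          # grid point nearest half-maximal output
slope_bound = hill_slope(x, bound)[half]        # non-cooperative binding
slope_output = hill_slope(x, output)[half]      # steeper than the binding curve

if __name__ == "__main__":
    print(f"Hill slope of fraction bound: {slope_bound:.3f}")
    print(f"Hill slope of two-site readout: {slope_output:.3f}")
```

The binding curve has a Hill slope of one, while the readout is steeper even though no interaction between the sites has been introduced, which is why the input and output being compared need to be stated explicitly.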

Associated Data

Supplementary Materials

Transparent reporting form

elife-65498-transrepform.docx (105.3 KB, docx)

Data Availability Statement

No data has been generated or acquired for this study, which is purely theoretical.

[bib1] Ahsendorf T, Wong F, Eils R, Gunawardena J. A framework for modelling gene regulation which accommodates non-equilibrium mechanisms. BMC Biology. 2014;12:102. doi: 10.1186/s12915-014-0102-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib2] Allen BL, Taatjes DJ. The mediator complex: a central integrator of transcription. Nature Reviews Molecular Cell Biology. 2015;16:155–166. doi: 10.1038/nrm3951. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib3] Bacic L, Sabantsev A, Deindl S. Recent advances in single-molecule fluorescence microscopy render structural biology dynamic. Current Opinion in Structural Biology. 2020;65:61–68. doi: 10.1016/j.sbi.2020.05.006. [DOI] [PubMed] [Google Scholar]

[bib4] Benabdallah NS, Bickmore WA. Regulatory domains and their mechanisms. Cold Spring Harbor Symposia on Quantitative Biology; 2015. pp. 45–51. [DOI] [PubMed] [Google Scholar]

[bib5] Berlow RB, Dyson HJ, Wright PE. Expanding the paradigm: intrinsically disordered proteins and allosteric regulation. Journal of Molecular Biology. 2018;430:2309–2320. doi: 10.1016/j.jmb.2018.04.003. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib6] Biddle JW, Nguyen M, Gunawardena J. Negative reciprocity, not ordered assembly, underlies the interaction of Sox2 and Oct4 on DNA. eLife. 2019;8:e410172018. doi: 10.7554/eLife.41017. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib7] Bolt CC, Duboule D. The regulatory landscapes of developmental genes. Development. 2020;147:dev171736. doi: 10.1242/dev.171736. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib8] Carter CW. High-Dimensional mutant and modular thermodynamic cycles, molecular switching, and free energy transduction. Annual Review of Biophysics. 2017;46:433–453. doi: 10.1146/annurev-biophys-070816-033811. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib9] Carter CW, Chandrasekaran SN, Weinreb V, Li L, Williams T. Combining multi-mutant and modular thermodynamic cycles to measure energetic coupling networks in enzyme catalysis. Structural Dynamics. 2017;4:032101. doi: 10.1063/1.4974218. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib10] Changeux JP. The feedback control mechanisms of biosynthetic L-threonine deaminase by L-isoleucine. Cold Spring Harbor Symposia on Quantitative Biology; 1961. pp. 313–318. [DOI] [PubMed] [Google Scholar]

[bib11] Changeux JP. 50 years of allosteric interactions: the twists and turns of the models. Nature Reviews Molecular Cell Biology. 2013;14:819–829. doi: 10.1038/nrm3695. [DOI] [PubMed] [Google Scholar]

[bib12] Changeux JP, Christopoulos A. Allosteric modulation as a unifying mechanism for receptor function and regulation. Cell. 2016;166:1084–1102. doi: 10.1016/j.cell.2016.08.015. [DOI] [PubMed] [Google Scholar]

[bib13] Chong S, Dugast-Darzacq C, Liu Z, Dong P, Dailey GM, Cattoglio C, Heckert A, Banala S, Lavis L, Darzacq X, Tjian R. Imaging dynamic and selective low-complexity domain interactions that control gene transcription. Science. 2018;361:eaar2555. doi: 10.1126/science.aar2555. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib14] Clark S, Myers JB, King A, Fiala R, Novacek J, Pearce G, Heierhorst J, Reichow SL, Barbar EJ. Multivalency regulates activity in an intrinsically disordered transcription factor. eLife. 2018;7:e36258. doi: 10.7554/eLife.36258. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib15] Cooper A, Dryden DT. Allostery without conformational change. A plausible model. European Biophysics Journal : EBJ. 1984;11:103–109. doi: 10.1007/BF00276625. [DOI] [PubMed] [Google Scholar]

[bib16] Dasgupta T, Croll DH, Owen JA, Vander Heiden MG, Locasale JW, Alon U, Cantley LC, Gunawardena J. A fundamental trade-off in covalent switching and its circumvention by enzyme bifunctionality in glucose homeostasis. Journal of Biological Chemistry. 2014;289:13010–13025. doi: 10.1074/jbc.M113.546515. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib17] Demir Ö, Leong PU, Amaro RE. Full-length p53 tetramer bound to DNA and its quaternary dynamics. Oncogene. 2017;36:1451–1460. doi: 10.1038/onc.2016.321. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib18] Dodd IB, Shearwin KE, Perkins AJ, Burr T, Hochschild A, Egan JB. Cooperativity in long-range gene regulation by the lambda CI repressor. Genes & Development. 2004;18:344–354. doi: 10.1101/gad.1167904. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib19] Dyson HJ, Wright PE. Role of intrinsic protein disorder in the function and interactions of the transcriptional coactivators CREB-binding protein (CBP) and p300. Journal of Biological Chemistry. 2016;291:6714–6722. doi: 10.1074/jbc.R115.692020. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib20] Edelman LB, Fraser P. Transcription factories: genetic programming in three dimensions. Current Opinion in Genetics & Development. 2012;22:110–114. doi: 10.1016/j.gde.2012.01.010. [DOI] [PubMed] [Google Scholar]

[bib21] Ehlert FJ. Cooperativity has empirical and ultimate levels of explanation. Trends in Pharmacological Sciences. 2016;37:620–623. doi: 10.1016/j.tips.2016.06.001. [DOI] [PubMed] [Google Scholar]

[bib22] Estrada J, Wong F, DePace A, Gunawardena J. Information integration and energy expenditure in gene regulation. Cell. 2016;166:234–244. doi: 10.1016/j.cell.2016.06.012. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib23] Frauenfelder H, Sligar SG, Wolynes PG. The energy landscapes and motions of proteins. Science. 1991;254:1598–1603. doi: 10.1126/science.1749933. [DOI] [PubMed] [Google Scholar]

[bib24] Freidlin MI, Wentzell AD. Random Perturbations of Dynamical Systems. Heidleberg, Germany: Springer; 2012. [DOI] [Google Scholar]

[bib25] Furlong EEM, Levine M. Developmental enhancers and chromosome topology. Science. 2018;361:1341–1345. doi: 10.1126/science.aau0320. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib26] Ganser LR, Kelly ML, Herschlag D, Al-Hashimi HM. The roles of structural dynamics in the cellular functions of RNAs. Nature Reviews Molecular Cell Biology. 2019;20:474–489. doi: 10.1038/s41580-019-0136-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib27] Gerhart J. From feedback inhibition to allostery: the enduring example of aspartate transcarbamoylase. FEBS Journal. 2014;281:612–620. doi: 10.1111/febs.12483. [DOI] [PubMed] [Google Scholar]

[bib28] Grah R, Zoller B, Tkačik G. Nonequilibrium models of optimal enhancer function. PNAS. 2020;117:31614–31622. doi: 10.1073/pnas.2006731117. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib29] Gregor T, Tank DW, Wieschaus EF, Bialek W. Probing the limits to positional information. Cell. 2007;130:153–164. doi: 10.1016/j.cell.2007.05.025. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib30] Gruber R, Horovitz A. Unpicking allosteric mechanisms of homo-oligomeric proteins by determining their successive ligand binding constants. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170176. doi: 10.1098/rstb.2017.0176. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib31] Gunawardena J. A linear framework for time-scale separation in nonlinear biochemical systems. PLOS ONE. 2012;7:e36321. doi: 10.1371/journal.pone.0036321. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib32] Gunawardena J. Time-scale separation--Michaelis and Menten's old idea, still bearing fruit. FEBS Journal. 2014;281:473–488. doi: 10.1111/febs.12532. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib33] Henzler-Wildman K, Kern D. Dynamic personalities of proteins. Nature. 2007;450:964–972. doi: 10.1038/nature06522. [DOI] [PubMed] [Google Scholar]

[bib34] Hill TL. Studies in irreversible thermodynamics IV. diagrammatic representation of steady state fluxes for unimolecular systems. Journal of Theoretical Biology. 1966;10:442–459. doi: 10.1016/0022-5193(66)90137-8. [DOI] [PubMed] [Google Scholar]

[bib35] Hilser VJ, Wrabl JO, Motlagh HN. Structural and energetic basis of allostery. Annual Review of Biophysics. 2012;41:585–609. doi: 10.1146/annurev-biophys-050511-102319. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib36] Horovitz A, Fersht AR. Strategy for analysing the co-operativity of intramolecular interactions in peptides and proteins. Journal of Molecular Biology. 1990;214:613–617. doi: 10.1016/0022-2836(90)90275-Q. [DOI] [PubMed] [Google Scholar]

[bib37] Horovitz A, Fersht AR. Co-operative interactions during protein folding. Journal of Molecular Biology. 1992;224:733–740. doi: 10.1016/0022-2836(92)90557-Z. [DOI] [PubMed] [Google Scholar]

[bib38] Jain RK, Ranganathan R. Local complexity of amino acid interactions in a protein core. PNAS. 2004;101:111–116. doi: 10.1073/pnas.2534352100. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib39] Kalo A, Kanter I, Shraga A, Sheinberger J, Tzemach H, Kinor N, Singer RH, Lionnet T, Shav-Tal Y. Cellular levels of signaling factors are sensed by β-actin alleles to modulate transcriptional pulse intensity. Cell Reports. 2015;11:419–432. doi: 10.1016/j.celrep.2015.03.039. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib40] Kim S, Broströmer E, Xing D, Jin J, Chong S, Ge H, Wang S, Gu C, Yang L, Gao YQ, Su XD, Sun Y, Xie XS. Probing allostery through DNA. Science. 2013;339:816–819. doi: 10.1126/science.1229223. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib41] Knoverek CR, Amarasinghe GK, Bowman GR. Advanced methods for accessing protein Shape-Shifting present new therapeutic opportunities. Trends in Biochemical Sciences. 2019;44:351–364. doi: 10.1016/j.tibs.2018.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib42] Kornev AP, Taylor SS. Dynamics-Driven allostery in protein kinases. Trends in Biochemical Sciences. 2015;40:628–647. doi: 10.1016/j.tibs.2015.09.002. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib43] Koshland DE, Némethy G, Filmer D. Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry. 1966;5:365–385. doi: 10.1021/bi00865a047. [DOI] [PubMed] [Google Scholar]

[bib44] Koshland DE, Hamadani K. Proteomics and models for enzyme cooperativity. Journal of Biological Chemistry. 2002;277:46841–46844. doi: 10.1074/jbc.R200014200. [DOI] [PubMed] [Google Scholar]

[bib45] Lammers NC, Kim YJ, Zhao J, Garcia HG. A matter of time: using dynamics and theory to uncover mechanisms of transcriptional bursting. Current Opinion in Cell Biology. 2020;67:147–157. doi: 10.1016/j.ceb.2020.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib46] LeVine MV, Weinstein H. AIM for allostery: using the ising model to understand information processing and transmission in allosteric biomolecular systems. Entropy. 2015;17:2895–2918. doi: 10.3390/e17052895. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib47] Lewis BA. Understanding large multiprotein complexes: applying a multiple allosteric networks model to explain the function of the mediator transcription complex. Journal of Cell Science. 2010;123:159–163. doi: 10.1242/jcs.057216. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib48] Lewis JS, Costa A. Caught in the act: structural dynamics of replication origin activation and fork progression. Biochemical Society Transactions. 2020;48:1057–1066. doi: 10.1042/BST20190998. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib49] Li J, Dong A, Saydaminova K, Chang H, Wang G, Ochiai H, Yamamoto T, Pertsinidis A. Single-Molecule nanoscopy elucidates RNA polymerase II transcription at single genes in live cells. Cell. 2019;178:491–506. doi: 10.1016/j.cell.2019.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib50] Lin Y, Sohn CH, Dalal CK, Cai L, Elowitz MB. Combinatorial gene regulation by modulation of relative pulse timing. Nature. 2015;527:54–58. doi: 10.1038/nature15710. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib51] Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK. Intrinsic disorder in transcription factors. Biochemistry. 2006;45:6873–6888. doi: 10.1021/bi0602718. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib52] Lorimer GH, Horovitz A, McLeish T. Allostery and molecular machines. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170173. doi: 10.1098/rstb.2017.0173. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib53] Marco A, Meharena HS, Dileep V, Raju RM, Davila-Velderrain J, Zhang AL, Adaikkan C, Young JZ, Gao F, Kellis M, Tsai LH. Mapping the epigenomic and transcriptomic interplay during memory formation and recall in the hippocampal engram ensemble. Nature Neuroscience. 2020;23:1606–1617. doi: 10.1038/s41593-020-00717-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib54] Martini JWR. A measure to quantify the degree of cooperativity in overall titration curves. Journal of Theoretical Biology. 2017;432:33–37. doi: 10.1016/j.jtbi.2017.08.010. [DOI] [PubMed] [Google Scholar]

[bib55] Marzen S, Garcia HG, Phillips R. Statistical mechanics of Monod-Wyman-Changeux (MWC) models. Journal of Molecular Biology. 2013;425:1433–1460. doi: 10.1016/j.jmb.2013.03.013. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib56] Miller JA, Widom J. Collaborative competition mechanism for gene activation in vivo. Molecular and Cellular Biology. 2003;23:1623–1632. doi: 10.1128/MCB.23.5.1623-1632.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib57] Mir M, Bickmore W, Furlong EEM, Narlikar G. Chromatin topology, condensates and gene regulation: shifting paradigms or just a phase? Development. 2019;146:dev182766. doi: 10.1242/dev.182766. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib58] Mirny LA. Nucleosome-mediated cooperativity between transcription factors. PNAS. 2010;107:22534–22539. doi: 10.1073/pnas.0913805107. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib59] Mirzaev I, Bortz DM. Laplacian dynamics with synthesis and degradation. Bulletin of Mathematical Biology. 2015;77:1013–1045. doi: 10.1007/s11538-015-0075-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib60] Mirzaev I, Gunawardena J. Laplacian dynamics on general graphs. Bulletin of Mathematical Biology. 2013;75:2118–2149. doi: 10.1007/s11538-013-9884-8. [DOI] [PubMed] [Google Scholar]

[bib61] Molina N, Suter DM, Cannavo R, Zoller B, Gotic I, Naef F. Stimulus-induced modulation of transcriptional bursting in a single mammalian gene. PNAS. 2013;110:20563–20568. doi: 10.1073/pnas.1312310110. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib62] Monod J, Wyman J, Changeux JP. On the nature of allosteric transitions: a plausible model. Journal of Molecular Biology. 1965;12:88–118. doi: 10.1016/S0022-2836(65)80285-6. [DOI] [PubMed] [Google Scholar]

[bib63] Monod J, Jacob F. Teleonomic mechanisms in cellular metabolism, growth, and differentiation. Cold Spring Harbor Symposia on Quantitative Biology; 1961. pp. 389–401. [DOI] [PubMed] [Google Scholar]

[bib64] Motlagh HN, Li J, Thompson EB, Hilser VJ. Interplay between allostery and intrinsic disorder in an ensemble. Biochemical Society Transactions. 2012;40:975–980. doi: 10.1042/BST20120163. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib65] Motlagh HN, Wrabl JO, Li J, Hilser VJ. The ensemble nature of allostery. Nature. 2014;508:331–339. doi: 10.1038/nature13001. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib66] Noé F, Fischer S. Transition networks for modeling the kinetics of conformational change in macromolecules. Current Opinion in Structural Biology. 2008;18:154–162. doi: 10.1016/j.sbi.2008.01.008. [DOI] [PubMed] [Google Scholar]

[bib67] Nogales E, Fang J, Louder RK. Structural dynamics and DNA interaction of human TFIID. Transcription. 2017;8:55–60. doi: 10.1080/21541264.2016.1265701. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib68] Nussinov R, Tsai CJ, Ma B. The underappreciated role of allostery in the cellular network. Annual Review of Biophysics. 2013;42:169–189. doi: 10.1146/annurev-biophys-083012-130257. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib69] Park J, Estrada J, Johnson G, Vincent BJ, Ricci-Tam C, Bragdon MD, Shulgina Y, Cha A, Wunderlich Z, Gunawardena J, DePace AH. Dissecting the sharp response of a canonical developmental enhancer reveals multiple sources of cooperativity. eLife. 2019a;8:e41266. doi: 10.7554/eLife.41266. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib70] Park M, Patel N, Keung AJ, Khalil AS. Engineering epigenetic regulation using synthetic Read-Write modules. Cell. 2019b;176:227–238. doi: 10.1016/j.cell.2018.11.002. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib71] Pauling L. The oxygen equilibrium of hemoglobin and its structural interpretation. PNAS. 1935;21:186–191. doi: 10.1073/pnas.21.4.186. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib72] Peeters E, van Oeffelen L, Nadal M, Forterre P, Charlier D. A thermodynamic model of the cooperative interaction between the archaeal transcription factor Ss-LrpB and its tripartite operator DNA. Gene. 2013;524:330–340. doi: 10.1016/j.gene.2013.03.118. [DOI] [PubMed] [Google Scholar]

[bib73] Perutz MF. Stereochemistry of cooperative effects in haemoglobin. Nature. 1970;228:726–734. doi: 10.1038/228726a0. [DOI] [PubMed] [Google Scholar]

[bib74] Portz B, Lu F, Gibbs EB, Mayfield JE, Rachel Mehaffey M, Zhang YJ, Brodbelt JS, Showalter SA, Gilmour DS. Structural heterogeneity in the intrinsically disordered RNA polymerase II C-terminal domain. Nature Communications. 2017;8:15231. doi: 10.1038/ncomms15231. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib75] Robert CH, Decker H, Richey B, Gill SJ, Wyman J. Nesting: hierarchies of allosteric interactions. PNAS. 1987;84:1891–1895. doi: 10.1073/pnas.84.7.1891. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib76] Sabari BR, Dall'Agnese A, Boija A, Klein IA, Coffey EL, Shrinivas K, Abraham BJ, Hannett NM, Zamudio AV, Manteiga JC, Li CH, Guo YE, Day DS, Schuijers J, Vasile E, Malik S, Hnisz D, Lee TI, Cisse II, Roeder RG, Sharp PA, Chakraborty AK, Young RA. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361:eaar3958. doi: 10.1126/science.aar3958. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib77] Sadovsky E, Yifrach O. Principles underlying energetic coupling along an allosteric communication trajectory of a voltage-activated K+ channel. PNAS. 2007;104:19813–19818. doi: 10.1073/pnas.0708120104. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib78] Schnakenberg J. Network theory of microscopic and macroscopic behavior of master equation systems. Reviews of Modern Physics. 1976;48:571–585. doi: 10.1103/RevModPhys.48.571. [DOI] [Google Scholar]

[bib79] Schueler-Furman O, Wodak SJ. Computational approaches to investigating allostery. Current Opinion in Structural Biology. 2016;41:159–171. doi: 10.1016/j.sbi.2016.06.017. [DOI] [PubMed] [Google Scholar]

[bib80] Sengupta U, Strodel B. Markov models for the elucidation of allosteric regulation. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373:20170178. doi: 10.1098/rstb.2017.0178. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib81] Shi H, Rangadurai A, Abou Assi H, Roy R, Case DA, Herschlag D, Yesselman JD, Al-Hashimi HM. Rapid and accurate determination of atomistic RNA dynamic ensemble models using NMR and structure prediction. Nature Communications. 2020;11:5531. doi: 10.1038/s41467-020-19371-y. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib82] Smale ST, Plevy SE, Weinmann AS, Zhou L, Ramirez-Carrozzi VR, Pope SD, Bhatt DM, Tong AJ. Toward an understanding of the gene-specific and global logic of inducible gene transcription. Cold Spring Harbor Symposia on Quantitative Biology; 2013. pp. 61–68. [DOI] [PubMed] [Google Scholar]

[bib83] Stroock DW. An Introduction to Markov Processes. In: Vakil R, editor. Graduate Texts in Mathematics. Berlin, Germany: Springer-Verlag; 2014. pp. 1–203. [DOI] [Google Scholar]

[bib84] Thal DM, Glukhova A, Sexton PM, Christopoulos A. Structural insights into G-protein-coupled receptor allostery. Nature. 2018;559:45–53. doi: 10.1038/s41586-018-0259-z. [DOI] [PubMed] [Google Scholar]

[bib85] Tran H, Desponds J, Perez Romero CA, Coppey M, Fradin C, Dostatni N, Walczak AM. Precision in a rush: trade-offs between reproducibility and steepness of the hunchback expression pattern. PLOS Computational Biology. 2018;14:e1006513. doi: 10.1371/journal.pcbi.1006513. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib86] Tsai CJ, Nussinov R. Gene-specific transcription activation via long-range allosteric shape-shifting. Biochemical Journal. 2011;439:15–25. doi: 10.1042/BJ20110972. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib87] Tsai CJ, Nussinov R. A unified view of "how allostery works". PLOS Computational Biology. 2014;10:e1003394. doi: 10.1371/journal.pcbi.1003394. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib88] Tzeng SR, Kalodimos CG. Protein dynamics and allostery: an NMR view. Current Opinion in Structural Biology. 2011;21:62–67. doi: 10.1016/j.sbi.2010.10.007. [DOI] [PubMed] [Google Scholar]

[bib89] Ullmann A. In memoriam: jacques Monod (1910-1976) Genome Biology and Evolution. 2011;3:1025–1033. doi: 10.1093/gbe/evr024. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib90] Ventsel' AD, Freidlin MI. On small random perturbations of dynamical systems. Russian Mathematical Surveys. 1970;25:1–55. doi: 10.1070/RM1970v025n01ABEH001254. [DOI] [Google Scholar]

[bib91] Voss TC, Hager GL. Dynamic regulation of transcriptional states by chromatin and transcription factors. Nature Reviews Genetics. 2014;15:69–81. doi: 10.1038/nrg3623. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib92] Wales DJ. Energy landscapes: calculating pathways and rates. International Reviews in Physical Chemistry. 2006;25:237–282. doi: 10.1080/01442350600676921. [DOI] [Google Scholar]

[bib93] Wodak SJ, Paci E, Dokholyan NV, Berezovsky IN, Horovitz A, Li J, Hilser VJ, Bahar I, Karanicolas J, Stock G, Hamm P, Stote RH, Eberhardt J, Chebaro Y, Dejaegere A, Cecchini M, Changeux JP, Bolhuis PG, Vreede J, Faccioli P, Orioli S, Ravasio R, Yan L, Brito C, Wyart M, Gkeka P, Rivalta I, Palermo G, McCammon JA, Panecka-Hofman J, Wade RC, Di Pizio A, Niv MY, Nussinov R, Tsai CJ, Jang H, Padhorny D, Kozakov D, McLeish T. Allostery in its many disguises: from theory to applications. Structure. 2019;27:566–578. doi: 10.1016/j.str.2019.01.003. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib94] Wolff MR, Schmid A, Korber P, Gerland U. Effective dynamics of nucleosome configurations at the yeast PHO5 promoter. eLife. 2021;10:e58394. doi: 10.7554/eLife.58394. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib95] Wong F, Amir A, Gunawardena J. Energy-speed-accuracy relation in complex networks for biological discrimination. Physical Review E. 2018a;98:012420. doi: 10.1103/PhysRevE.98.012420. [DOI] [PubMed] [Google Scholar]

[bib96] Wong F, Dutta A, Chowdhury D, Gunawardena J. Structural conditions on complex networks for the Michaelis-Menten input-output response. PNAS. 2018b;115:9738–9743. doi: 10.1073/pnas.1808053115. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib97] Wong F, Gunawardena J. Gene regulation in and out of equilibrium. Annual Review of Biophysics. 2020;49:199–226. doi: 10.1146/annurev-biophys-121219-081542. [DOI] [PubMed] [Google Scholar]

[bib98] Wrabl JO, Gu J, Liu T, Schrank TP, Whitten ST, Hilser VJ. The role of protein conformational fluctuations in Allostery, function, and evolution. Biophysical Chemistry. 2011;159:129–141. doi: 10.1016/j.bpc.2011.05.020. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib99] Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nature Reviews Molecular Cell Biology. 2015;16:18–29. doi: 10.1038/nrm3920.

[bib100] Yordanov P, Stelling J. Steady-state differential dose response in biological systems. Biophysical Journal. 2018;114:723–736. doi: 10.1016/j.bpj.2017.11.3780.

[bib101] Yordanov P, Stelling J. Efficient manipulation and generation of Kirchhoff polynomials for the analysis of non-equilibrium biochemical reaction networks. Journal of the Royal Society Interface. 2020;17:20190828. doi: 10.1098/rsif.2019.0828.
