Abstract
We map a class of well-mixed stochastic models of biochemical feedback in steady state to the mean-field Ising model near the critical point. The mapping provides an effective temperature, magnetic field, order parameter, and heat `capacity that can be extracted from biological data without fitting or knowledge of the underlying molecular details. We demonstrate this procedure on fluorescence data from mouse T cells, which reveals distinctions between how the cells respond to different drugs. We also show that the heat capacity allows inference of the absolute molecule number from fluorescence intensity. We explain this result in terms of the underlying fluctuations, and we demonstrate the generality of our work.
I. INTRODUCTION
Positive feedback is ubiquitous in biochemical networks and can lead to a bifurcation from a monostable to a bistable cellular state [1–4]. Near the bifurcation point, the bistable state often reflects a choice between two accessible but opposing cell fates. For example, in T cells, the distribution of doubly phosphorylated ERK (ppERK) can be bimodal [4]. ppERK is a protein that initiates cell proliferation and is implicated in the self- or non-self-decision between mounting an immune response or not [4,5].
The bifurcation point is similar to an Ising-type critical point in physical systems such as fluids, magnets, and superconductors, where a disordered state transitions to one of two ordered states at a critical temperature [6]. In fact, universality tells us that the two should not just be similar, they should be the same: because they are both bifurcating systems, both types of systems should exhibit the same critical scaling exponents and therefore belong to the same universality class [6]. Although this powerful idea has allowed diverse physical phenomena to be united into specific behavioral classes, the application of universality to biological systems is still developing [7–14].
Biological tools such as flow cytometry, fluorescence microscopy, and RNA sequencing allow reliable experimental estimates of abundance distributions, inspiring researchers to seek to apply insights from statistical physics to biological data. In particular, recent studies have demonstrated that biological systems on many scales, from molecules [15], to cells [16–21], to populations [22–24], exhibit signatures consistent with physical systems near a critical point. However, some of these studies have come under scrutiny because some of the signatures, particularly scaling laws, can arise far from or independent of a critical point [25–27]. Part of the problem is that the identification of appropriate scaling variables from data can be ambiguous, and one is often left looking for scaling relationships in an unguided way.
Typical approaches to the interpretation of abundance distributions include fitting to either detailed mechanistic models of the underlying reaction scheme, or to an effective description of the data such as a Gaussian or log-normal mixture model. The former approach is usually difficult to parametrize and difficult to generalize to other systems. The latter approach often suffers from numerical issues (the likelihood is unbounded and the expectation-maximization algorithm can lead to spurious solutions [28]). Moreover, the vicinity of a bifurcation point is precisely where a mixture analysis is most likely to fail. In contrast, mapping to a statistical physics framework is expected to be universal, in the sense that the precise microscopic details of a broad range of biochemical models are unimportant near the bifurcation point, as they are coarse-grained rather than particular reaction parameters.
Here we provide a framework for mapping well-mixed stochastic models of biochemical feedback to the mean-field Ising model, and we apply it to published data on T cells. This allows us to extract effective thermodynamic quantities from experimental data without needing to fit to a parametric model of the system. This makes the theory applicable to a broad class of biological datasets without worrying about model selection or goodness-of-fit criteria. The theory provides insights on how T cells respond to drugs, and it reveals distinctions between one type of drug response and another. Furthermore, we find that one of the thermodynamic quantities (the heat capacity) provides a way to estimate absolute molecule number from fluorescence level in bifurcating systems. We demonstrate that our results can be extended to cases in which feedback is indirect, and we discuss further extensions, including to spatiotemporal dynamics.
II. RESULTS
We consider a reaction network in a cell where X is the molecular species of interest, and the other species A, B, C, etc. form a chemical bath for X [Fig. 1(a)]. The reactions of interest produce or degrade an X molecule, can involve the bath species, and in principle are reversible. We allow for nonlinear feedback on X, meaning that the production of an X molecule in a particular reaction might require a certain number of X molecules as reactants. This leads to an arbitrary number of reactions of the form
(1) |
where in the rth reaction, jr are stoichiometric integers describing the nonlinearity, are the forward (+) and backward (−) reaction rates, and represent bath species involved as reactants (+) or products (−). A simple and well-studied special case of Eq. (1) is Schlögl’s second model [29–36], in which X is either produced spontaneously from bath species A, or in a trimolecular reaction from two existing X molecules and bath species B (i.e., R = 2, j1 = 0, j2 = 2, , , and ).
We assume that molecules are well-mixed and that the numbers of bath molecules are constant. The latter assumption is equivalent to integrating out all species but X, such that the feedback on X arises directly from X itself [Eq. (1)]. However, in general the feedback will be indirect, with X regulating dynamic species in the bath that in turn regulate X (this is almost certainly the case in the T cells we study here). Therefore, we consider this more general case later in Sec. II D and show that the results discussed below remain unchanged.
The master equation for the probability of observing n molecules of species X according to Eq. (1) is
(2) |
where and are the total birth and death propensities, and and are the forward and backward propensities of each reaction pair. Here are the numbers of molecules of the bath species involved in reaction r, and the factorials account for the number of ways that X molecules can meet in a reaction. The steady state of Eq. (2) is [37,38]
(3) |
where is set by normalization. In the second step of Eq. (3) we define an effective birth propensity fn ≡ nbn−1/dn corresponding to spontaneous death with propensity n [Fig. 1(a)]. In general, fn is an arbitrary, nonlinear feedback function governed by the reaction network. For the Schlögl model, it is fn =[aK2 +s(n−1)(n−2)]/[(n−1) (n − 2) + K2], where we have introduced the dimensionless quantities , , and . As a ubiquitous example, we also consider the Hill function fn = a + snH/(nH + KH) with coefficient H. Importantly, the inverse of Eq. (3),
(4) |
allows calculation of the feedback function from the distribution [39], as utilized when analyzing the experimental data later in Sec. II B.
The quantity fn − n determines the dynamic stability: there can be either one or two stable states n∗ [Fig. 1(b)], and the transition from a monostable to a bistable regime occurs at a bifurcation point [Fig. 1(c), inset]. These deterministic regimes correspond stochastically to unimodal and bimodal distributions pn, respectively, with maxima at n∗, while the bifurcation point corresponds to a distribution that is flat on top [Fig. 1(c)].
A. Ising mapping and scaling exponents
To understand the scaling behavior near the bifurcation point, we expand the stability condition to third order around a point nc satisfying . This choice of nc eliminates the quadratic term in the dynamic forcing fn − n, equivalent to eliminating the cubic term in an effective potential as in Ginzburg–Landau theory [40]. Defining the parameters
(5) |
the expansion becomes h − θm − m3/3 = 0. This expression is equivalent to the expansion of the Ising mean-field equation m = tanh[(m + h)/(1 + θ)] for small magnetization m, where θ = (T − Tc)/Tc is the reduced temperature, and h is the dimensionless magnetic field [40]. Therefore, in our system we interpret m as the order parameter, θ as an effective reduced temperature, and h as an effective field. Explicit expressions for nc, θ, and h in terms of the biochemical parameters and vice versa are given for the Schlögl and Hill models in Appendix A.
We see in Figs. 1(c) and 1(d) that nc determines where the distribution is centered, that θ drives the system to the unimodal (θ > 0) or bimodal (θ < 0) state, and that h biases the system to high (h > 0) or low (h < 0) molecule numbers. Note that unlike in the Ising model, even when h = 0 an asymmetry persists between the high and low states [see the purple distribution in Fig. 1(c)]. The reason is that in the master equation [Eq. (2)], unlike in Ginzburg-Landau theory, fluctuations scale with molecule number, such that the high state is wider than the low state.
The equivalence between our system and the Ising mean-field equation near the critical point [Eq. (5)] implies that our system has the same scaling exponents β = 1/2, γ = 1, and δ = 3 as the Ising universality class in its mean-field limit [40]. For completeness, we verify in Appendix B that these scalings are indeed obeyed by the Schlögl and Hill models.
However, Eq. (5) does not explicitly determine the value of the exponent α. The reason is that, unlike β, γ, and δ, the exponent α depends on the entire distribution pn, not just the maxima. Specifically, α concerns the heat capacity, C|h=0 ~ |θ|−α, which depends on the entropy S and thus pn. The equilibrium definition C = T∂T S generalizes to a nonequilibrium system like ours when one uses the Shannon entropy S = −kB ∑n pn ln pn [41]. Since T = (1 + θ)Tc, we have C = (1 + θ)∂θS, or
(6) |
where . Equation (6) follows from performing the θ derivative using the expression in Eq. (3), the expansion below Eq. (5), and the definition of θ [Eq. (5)]. We see in Fig. 2(a) that when h = 0, C exhibits a minimum at θ∗. We see in Fig. 2(b) that θ∗ vanishes as the system size increases, nc → ∞. This implies that C|h=0 ~ |θ|0 to subquadratic order in θ, or α = 0, again consistent with the Ising universality class in its mean-field limit. Interestingly, whereas C is discontinuous in the mean-field Ising model [40] and constant in the van der Waals model of a fluid [6], it is minimized here; nevertheless, in all cases α = 0. Note from Fig. 2(a) that C is negative near θ = 0; negative heat capacity is a well-known feature of nonequilibrium steady states [42–44].
B. Application to immune cell data
To demonstrate the utility of our theory, we apply it to published data from T cells [4]. In these experiments, chemotherapy drugs inhibit the enzymes MEK and SRC in the biochemical networks of the cells. The inhibition results in bimodal (low dose) or unimodal (high dose) distributions of ppERK abundance, which is measured as fluorescence intensity I by flow cytometry. The distributions are shown for a range of drug doses in Figs. 3(a) and 3(b) (the insets show distributions of log intensity for clarity). Experimental details are given in the original publication [4] and are summarized in Appendix C, along with the drugs and dose amounts.
First, we compute the feedback function f from each distribution using Eq. (4) (see Appendix D). Figures 3(c) and 3(d) show the corresponding forcing functions [compare to Fig. 1(b)]. As expected, in each case we see that the forcing function transitions from two stable states to one stable state as the drug is applied.
Then, we compute Ic (the analog of nc in units of fluorescence intensity), θ, and h from the feedback function using Eq. (5) (see Appendix D). These quantities are shown as a function of drug dose in Figs. 3(e)–3(g). We see that the behavior is different depending on whether MEK inhibitor (MEKi) or SRC inhibitor (SRCi) is applied. Specifically, MEKi decreases Ic, increases θ, and decreases h, whereas SRCi only decreases h, leaving the other quantities unchanged. Thus, the effective thermodynamic quantities can differentiate cellular responses to different perturbations, such as the application of different drugs.
Furthermore, the mapping provides an intuitive interpretation of the drug responses. MEKi causes a transition from a bimodal to a unimodal state in the expected way: by increasing the reduced temperature θ from a negative to a positive value [Fig. 3(f)]. In the process, Ic decreases [Fig. 3(e)], meaning that the unimodal state is shifted to lower molecule number, near the lower mode of the bimodal state [Fig. 3(a), inset]. In contrast, SRCi causes a transition from a bimodal to a unimodal state in a different way: by decreasing the field while leaving θ and Ic unchanged [Figs. 3(e)–3(g)]. In essence, the distribution remains bimodal and unshifted, except that the field causes the high mode to diminish in weight [Fig. 3(b), inset]. Interestingly, the mean dose-response curves are similar for the two drugs [4], but our mapping elucidates precisely how the transitions are different at the distribution level. Related conclusions were drawn in [4], but those conclusions relied on fitting the distributions to a five-parameter Gaussian mixture model, which is expected to fail near the bifurcation point. Here we use only three parameters and no fitting, and we emerge with an intuitive interpretation in terms of thermodynamic quantities.
Finally, we note that for both drugs the effective field is negative at all doses [Fig. 3(g)]. The reason is that the fluorescence distributions have long tails (which is why they are often easier to visualize in log space); see Figs. 3(a) and 3(b). In the theory, a long tail is indistinguishable from a low-molecule-number bias in the peak, which corresponds to h < 0. We address the possible origins and implications of the long tails in Sec. III.
C. Estimation of molecule number
We now apply the theory to compute the heat capacity from the T cell data. Specifically, we compute C using Eq. (6) (see Appendix D) for all drugs and doses used in the experiments [4] (Appendix C). Unlike the other thermodynamic quantities, C requires a conversion from fluorescence intensity to molecule number because it depends explicitly on the distribution pn [Eq. (6)]. Therefore, we compute C for various values of the conversion factor I1, where n = I/I1. The results are shown in Fig. 4. We see that irrespective of I1 over four orders of magnitude, the data closest to h = 0 (yellow) exhibit a global minimum in C at θ = 0, as expected from Fig. 2(a). However, we also see that the depth of the minimum agrees with that of the theory only for the particular choice I1 ≈ 0.1 [Fig. 4(c)].
To obtain a more precise estimate of I1, we plot the sum of squared errors between the data and the theory as a function of I1 in Fig. 4(e). We focus on the bifurcation region by considering only values of θ within −Δθ ⩽ θ ⩽ Δθ, and we find that our results are not sensitive to the choice of θ [Fig. 4(f)]. This procedure (see the details in Appendix D) results in an estimate of I1 = 0.5 ± 0.2, as seen in Fig. 4(f). This value of I1 corresponds to ppERK molecules in the high mode averaged across all cases with no inhibitor. It is possible to compare this value with previous measurements on these cells. In two separate experiments, it was estimated that there are approximately 100000 [5] and 214000 [45] ERK molecules per cell, and that only about 50% of these molecules are doubly phosphorylated during T cell receptor activation [5] (see Appendix D). These considerations give a range of roughly 50000–107000 ppERK molecules, which is consistent with our estimate of 170000 ± 70000. The agreement is especially notable given that T cell protein abundances generally span six orders of magnitude, from tens to tens of millions of molecules per cell [45].
Why does the heat capacity extract the conversion between fluorescence intensity and molecule number? As mentioned above, α is the only exponent that is a function of pn instead of just its maxima. This means that the plot of C versus θ contains information not only about means or modes, but also about fluctuations. The notion that fluctuation information is essential for converting from intensity to molecule number can be seen with a simpler example: a Poisson distribution. Here we would have . From this relation it is clear that information about not only the mean () but also the fluctuations () in intensity is necessary and sufficient to infer the conversion factor I1. In our case, the heat capacity is extracting similar information, but for a bifurcating system.
D. Generalization to indirect feedback
In the T cells, it is well known that ppERK does not apply feedback to its own activation directly, but rather indirectly via upstream components [4,5,46]. Therefore, we seek to determine the extent to which the above results are sensitive to our assumption in the theory that the feedback is direct. To this end, we construct a minimal extension of the model in Eq. (1) in which the feedback is indirect:
(7) |
Here X is produced, is degraded, and reversibly dimerizes (first line); the dimer D produces a species A that produces X and is degraded (second line); and the dimer also produces a species B that degrades X and is degraded (third line). Equation (7) is an extension of Eq. (1) because there are multiple stochastic variables (X, D, A, and B), there are irreversible reactions, and X feeds back on itself indirectly through D, A, and B instead of directly.
The deterministic steady state of Eq. (7) is
(8) |
where c0 ≡ k1/k2, c2 ≡ k3k5k6/(k2k4k7), c3 ≡ k3k8k9/(k2k4k10), and the molecule numbers of D, A, and B have been eliminated in favor of n∗ by setting their own time derivatives to zero. Because Eq. (8) is cubic in n∗, we see immediately that it has the same form as the expanded Ising mean-field equation h − θm − m3/3 = 0 [see Eq. (5)]. Specifically, defining m = (n∗ − nc)/nc as in Eq. (5), the choice nc = c2/(3c3) eliminates the term quadratic in m and implies and . It immediately follows that this model has the same exponents β = 1/2, γ = 1, and δ = 3 as the mean-field Ising universality class.
To test whether the heat capacity for this model exhibits the same features as that for the direct feedback model in Fig. 2(a), we compute the steady-state marginal distribution pn using stochastic simulations [47] of Eq. (7). Specifically, we set k3/k4 = 1/nc and k5/k7 = k8/k10 = 1 to ensure that the numbers of D, A, and B molecules, respectively, are on the order of nc. We then set k4/k2 = k7/k2 = k10/k2 = ρ, where ρ is a free parameter that determines whether the degradation timescales of D, A, and B, respectively, are faster (ρ > 1) or slower (ρ < 1) than that of X. These conditions, along with the definitions of nc, θ, and h above, constitute nine equations for nine reaction rates, plus k2, which sets the units of time. Solving these equations yields expressions for the rates in terms of nc, θ, h, and ρ that we use in the simulations.
Figure 5(a) shows the heat capacity C as a function of θ for h = 0, nc = 100, and ρ = {0.1, 1, 10}, where C = (1 + θ)∂θS is computed from the entropy S = −kB ∑n pn ln pn by numerical derivative. We see that for all ρ values, the curves exhibit a minimum at θ = 0, implying α = 0, and they rise more steeply for negative than for positive θ as in Fig. 2(a).
We then investigate whether Eq. (1) remains valid as a coarse-grained description of the extended model in Eq. (7). To answer this question, we infer values of nc, θ, h, and C directly from the simulation data pn using the same protocol as for the experimental data. That is, we compute fn via Eq. (4), and then compute θ, h, and C from its derivatives at nc according to Eqs. (5) and (6), where nc satisfies . As with the experimental data (see Appendix D), derivatives are calculated using a Savitsky-Golay filter [48], although here we apply the filter directly to fn and perform the analysis directly in n space, not log space.
Figure 5(b) shows the result of this procedure for the inferred heat capacity C as a function of the inferred θ. We see that, as with the exact C and θ [Fig. 5(a)], the data exhibit a minimum at θ = 0 and rise more steeply for negative than for positive θ. Note that the values of C and θ are different in (a) and (b), which is expected because the shape of pn is not quantitatively the same in the two models of Eqs. (1) and (7); nonetheless, the shape of the C versus θ curves remains the same. We have checked that the inferred values of nc and h are distributed around their known values of 100 and 0, respectively, and that the shape persists across a range of filter window sizes.
These results suggest that the main findings above are not sensitive to our assumption that feedback is direct, and therefore that we are justified in using Eq. (1) as a coarse-grained model to analyze the T cell data.
III. DISCUSSION
We have employed the fact that a feedback-induced bifurcation exhibits the scaling properties of the mean-field Ising universality class to provide a simple prescription for modeling and analyzing biological data. Contrary to existing mixture-model approaches, our method is most valuable near the bifurcation point, which is where biologically significant cell-fate decisions are expected to take place. Our approach provides the effective order parameter, reduced temperature, magnetic field, and heat capacity from experimental distributions without fitting or needing to know the molecular details. By applying the approach to T cell flow cytometry data, we discovered that these quantities discriminate between cellular responses in an intuitive, interpretable way, and that the heat capacity allows estimation of the molecule number from fluorescence intensity for a bifurcating system. By generalizing the theory to include indirect feedback, we demonstrated the capacity to model realistic signaling cascades where indirect feedback is common. Our approach should be applicable to other systems observed to undergo a pitchfork-like bifurcation and the associated unimodal-to-bimodal transition in abundance distributions, but not to systems that have an absorbing or extinction state, as they are expected to fall under a different universality class [49,50].
The theory assumes only birth-death reactions and neglects more complex mechanisms such as bursting [51,52] or parameter fluctuations [53,54]. These mechanisms are known to produce long tails and may be responsible for the long tails observed in the experimental data [Figs. 3(a) and 3(b)]. Cell-to-cell variability (CCV) may also contribute to the long tails, as it is known to be present in T cell populations [55]. Our theory neglects CCV and instead assumes that the distribution of molecule numbers across the population is the same as that traced out by a single cell over time. Although CCV may play an important role, one generically expects the role of intrinsic fluctuations to be amplified near a critical point, and models that ignore CCV have been shown to be sufficient to explain both the bimodality [3] and variance properties [56] of ppERK in T cells. Moreover, the fact that our theory provides an estimate of the molecule number that is consistent with other estimates suggests that intrinsic fluctuations play a large role. Distinguishing between intrinsic fluctuations and long-lived CCV is an important topic for future work.
Our work provides key tools that can be used for a broader exploration of biological systems. The approach is applicable to any experimental dataset that exhibits unimodal and bimodal abundance distributions, and could lead to a unified picture of diverse cell types and environmental perturbations in terms of effective thermodynamic quantities. At the same time, several extensions of our work are natural. For example, the dynamics of the theory could be probed to investigate the consequences of critical slowing down for driven or dynamically perturbed systems with feedback. Alternatively, the theory could be generalized to systems that are not well-mixed, such as intracellular compartments or communicating populations, to investigate space-dependent universal behavior and its biological implications.
IV. DATA AVAILABILITY
Data and code for all figures and the MIFlowCyt record are available [59].
ACKNOWLEDGMENTS
This work was supported by Human Frontier Science Program Grant No. LT000123/2014 (A.E.), National Institutes of Health (NIH) Grant No. R01 GM082938 (A.E.), Simons Foundation Grant No. 376198 (T.A.B. and A.M.), and the Intramural Research Program of the NIH, Center for Cancer Research, National Cancer Institute.
APPENDIX A: MAPPING FOR SCHLÖGL AND HILL MODELS
Here we provide the mapping from nc, θ, and h to the biochemical parameters and vice versa for the Schlögl and Hill models. For the Schlögl model, the feedback function is
(A1) |
The condition is satisfied by
(A2) |
The parameters θ and h are given by Eq. (5), where
(A3) |
(A4) |
(A5) |
These expressions are inverted to write the biochemical parameters a, s, and K in terms of nc, θ, and h:
(A6) |
(A7) |
(A8) |
where x ≡ 2nc − 3.
Similarly, for the Hill model we have
(A9) |
(A10) |
(A11) |
(A12) |
(A13) |
(A14) |
(A15) |
(A16) |
In the Hill model, H is an additional free parameter.
APPENDIX B: SCALING EXPONENTS β, γ, AND δ
Here we verify that the stochastic Schlögl and Hill models have the scaling exponents β, γ, and δ of the mean-field Ising universality class. Specifically, we expect m = ±(−3θ)β for h = 0 and θ < 0, with β = 1/2; χ = θ−γ or χ = (−2θ)−γ for θ > 0 or θ < 0, respectively, with γ = 1, where χ ≡ (∂hm)h=0 is the dimensionless susceptibility; and m = (3h)1/δ for θ = 0, with δ = 3. Figure 6 computes these quantities from the parameters and maxima of pn for the Schlögl and Hill models using the mapping in Eq. (5). We see that the scalings hold, as expected.
APPENDIX C: EXPERIMENTAL METHODS
The experimental data analyzed in Fig. 3, along with a detailed description of the experimental methods, have been published previously [4]. In this appendix, we briefly summarize the experimental system and methods. The drugs and dose ranges used in Figs. 3 and 4 are listed in Table I.
TABLE I.
Drug | Inhibits | Dose range (nM) | Shape in Fig. 4 |
---|---|---|---|
PD325901 | MEK | 0.09–1000 | Up triangle |
AZD6244 | MEK | 2.4–5000 | Down triangle |
Trametinib | MEK | 0.5–1000 | Right triangle |
Dasatinib | SRC | 0.09–1000 | Circle |
Bosutinib | SRC | 0.5–1000 | Square |
PP2 | SRC | 24–50 000 | Diamond |
The data investigate inhibition of the antigen-driven MAP kinase cascade in primary CD8+ mouse T cells. A natural way to stimulate T cells is to load a peptide (a fragment of an antigenic protein that the T cells are programed to recognize) onto antigen-presenting cells. We achieve this by incubating RMA-S cells with antigen at 37 °C. At the same time, we harvest the spleen and lymph nodes of a RAG2−/− OT1 mouse, which has T cells specific only to the ovalbumin peptide with the amino acid sequence SIINFEKL. When we mix the OT1 T cells with the antigen-loaded RMA-S cells, we expose the OT1 T cells to their activating peptide. In response, the T cells activate their receptors through a SRC Family kinase (Lck). This triggers an enzymatic cascade, which in turn actives Ras-Raf-MEK-ERK leading to double phosphorylation of ERK, rendering it capable of communicating with the nucleus. By waiting for 10 min, the signaling reaches steady state and the distribution of the abundance of doubly phosphorylated ERK (ppERK) is the readout.
To measure the abundance of ppERK, we use fluorescence cytometry. Specifically, we introduce ppERK-targeted antibodies that are preconjugated with a fluorescent dye. Because antibodies selectively attach to their target molecule with negligible false-positives, the fluorescence intensity of the dye is proportional to the abundance of ppERK. To measure the intensity, approximately 30 000 cells per sample are passed oneby-one through a microfluidic device where they encounter a series of excitation lasers. Each cell yields one intensity value, and the histogram provides an estimate of the distribution of ppERK abundance across the population. We assume that the distribution across the population is a fair representation of the steady-state distribution of ppERK abundance of a single cell. This is reasonable (and is the accepted practice) since while the cells are alive and the experiment is taking place, they are in a dilute suspension (approximately 30 000 cells in 100 μL), not close enough together to influence each other.
APPENDIX D: EXPERIMENTAL DATA ANALYSIS
We calculate the forcing functions and the effective thermodynamic quantities Ic, θ, h, and C from an experimental intensity distribution using the following procedure. First, we set n = I/I1 to convert p(I) to pn, where the intensity of one molecule I1 converts from intensity I to molecule number n. We will see below that only C will depend on the value of I1.
Next, because the experimental distributions are long-tailed, we convert Eq. (5) to ln I space for numerical stability. Here we provide the necessary conversions between functions of n from the theory, and functions of ℓ ≡ ln I from the experiments, as the probability distributions over n and ℓ do not have the same functional forms [57]. In what follows, a prime denotes the derivative of a function with respect to its argument (n for f; and ℓ for q, Q, and ϕ). ℓ and n are related as
(D1) |
We denote the distribution of ℓ as q(ℓ). Approximating n as continuous, probability conservation requires
(D2) |
Using Eq. (D2), the feedback function [Eq. (4)] is
(D3) |
where, using Eq. (D1),
(D4) |
The last steps define ϵ ≡ 1/n and assume that for most values of n with appreciable probability we have ϵ ≪ 1. Therefore, Eq. (D3) becomes
(D5) |
where we have kept to first order in ϵ. Defining ϕ(ℓ) ≡ fn − n, from Eq. (D5) we have
(D6) |
In the last step, we define Q(ℓ) ≡ −ℓ + ln q so that ϕ is computed as a total derivative, which we find more numerically stable. The ϕ(ℓ) are the forcing functions plotted in Figs. 3(c) and 3(d).
The point nc is defined by . Equation (D1) implies
(D7) |
such that the condition becomes
(D8) |
Therefore, we define a point ℓc by
(D9) |
Numerically, we enforce Eq. (D9) by writing it as 0 = ∂ℓ(ϕ′ − ϕ), and therefore
(D10) |
Then
(D11) |
from Eq. (D1).
Derivatives of f with respect to n at nc are related in a straightforward way to derivatives of ϕ with respect to ℓ at ℓc. First, the zeroth derivative is, by Eq. (D6),
(D12) |
where nc is defined in Eq. (D11). Then, using Eq. (D7), the first derivative is
(D13) |
Finally, by a similar procedure, the third derivative is
(D14) |
where the second step uses Eq. (D9). Using Eqs. (D11)–(D14), θ and h [Eq. (5)] become
(D15) |
(D16) |
Note that they do not depend on I1.
To estimate the derivatives in Eqs. (D15) and (D16), we apply a Savitsky-Golay filter to the experimental q(ℓ) [48]. Savitsky-Golay filtering replaces each data point with the value of a polynomial of order J that is fit to the data within a window W of the point. Since we require three derivatives of ϕ(ℓ) [Eqs. (D15) and (D16)], which depends on the first derivative of q(ℓ) [Eq. (D6)], we use the minimum value J = 4. Thus, the procedure requires the adjustable parameter W/L, where L is the number of ln I bins. We find that L = 100 and W = 25 suffice [Figs. 3, 4, and 5(b)], and that results are robust to W/L.
The analysis is demonstrated for an example experimental distribution in Fig. 7. In summary, we do the following:
Plot q(ℓ) from the data using L bins [Fig. 7(a), black].
Filter q(ℓ) using window W [Fig. 7(a), red].
Compute Ic, θ, and h from ℓc, ϕ, and its derivatives using Eqs. (D11), (D15), and (D16).
Compute pn from the data using I1.
Compute C/kB from pn, θ, nc [Eq. (D11)], [Eq. (D14)], and fn [Eq. (D5)] using Eq. (6).
Figure 7(d) shows that Ic falls between the maxima as expected, and that θ and h are negative corresponding to a distribution that is bimodal and skewed to the left, respectively.
To estimate the value of I1, consider χ2, defined as
(D17) |
where N is the number of data points, Ci/kB is the value of the heat capacity for each data point, C(θi)/kB is the predicted value of the heat capacity at the location θi of that data point, and is the variance for data point i. Under the simplifying assumption that takes the same value σ2 for all data points, we have χ2 = s/σ2, where s is the sum of squared errors plotted in Fig. 4(e). As a function of I1, χ2 should scale quadratically near its minimum,
(D18) |
where the location and curvature of the minimum give the best estimate and error in the estimate , respectively [58]. In terms of s, we have
(D19) |
where s∗ is the minimal value. The value σ2 is, by definition, the average squared deviation of the data from the theory [58],
(D20) |
here evaluated at the minimum s∗. Inserting this result into Eq. (D19), we obtain
(D21) |
We see that if I1 deviates from by , then s is larger than its minimal value by a factor of 1 + N−1. This criterion, illustrated by the black line in Fig. 4(e), is used to determine .
Figure 4(e) is restricted to data whose θ values are less than or equal to θ = 0.05 in magnitude, of which there are N = 20 points. As Δθ increases, N increases, and the minimum of s also becomes less sharp. These effects compensate, yielding an estimate of I1 whose value and error are insensitive to Δθ, as seen in Fig. 4(f). Averaged across θ values, we find I1 = 0.5 and , as reported in the main text.
We compare our estimate of ppERK molecule number to two previous studies. In [5], it was estimated that there are 100 000 ERK molecules per cell (see the Results section in [5]). From [45], we estimate that there are 214000 ERK molecules per cell. Specifically, from the Excel file associated with Fig. 1 in [45], we sum the mean number (column I) of ERK1 (also called MAPK3, row 2345) and ERK2 (also called MAPK1, row 874) to obtain 214000 molecules to three significant digits. In [5], it was estimated that 50% of ERK molecules are doubly phosphorylated during T cell receptor activation (see the caption of Fig. S2 in [5]).
References
- [1].Mitrophanov AY and Groisman EA, Bioessays 30, 542 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [2].Tkačik G, Walczak AM, and Bialek W, Phys. Rev. E 85, 041903 (2012). [DOI] [PubMed] [Google Scholar]
- [3].Das J, Ho M, Zikherman J, Govern C, Yang M, Weiss A, Chakraborty AK, and Roose JP, Cell 136, 337 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Vogel RM, Erez A, and Altan-Bonnet G, Nat. Commun 7, 12428 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Altan-Bonnet G and Germain RN, PLoS Biol. 3, e356 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Goldenfeld N, Lectures on Phase Transitions and the Renormalization Group, Advanced Book Program (Addison-Wesley, Reading, MA, 1992). [Google Scholar]
- [7].Mora T and Bialek W, J. Stat. Phys 144, 268 (2011). [Google Scholar]
- [8].Munoz MA, Rev. Mod. Phys 90, 031001 (2018). [Google Scholar]
- [9].Salman H, Brenner N, Tung C.-k., Elyahu N, Stolovicki E, Moore L, Libchaber A, and Braun E, Phys. Rev. Lett 108, 238105 (2012). [DOI] [PubMed] [Google Scholar]
- [10].Brenner N, Newman CM, Osmanović D, Rabin Y, Salman H, and Stein DL, Phys. Rev. E 92, 042713 (2015). [DOI] [PubMed] [Google Scholar]
- [11].Pal M, Ghosh S, and Bose I, Phys. Biol 12, 016001 (2014). [DOI] [PubMed] [Google Scholar]
- [12].Ridden SJ, Chang HH, Zygalakis KC, and MacArthur BD, Phys. Rev. Lett 115, 208103 (2015). [DOI] [PubMed] [Google Scholar]
- [13].Qian H, Ao P, Tu Y, and Wang J, Chem. Phys. Lett 665, 153 (2016). [Google Scholar]
- [14].Hidalgo J, Grilli J, Suweis S, Muñoz MA, Banavar JR, and Maritan A, Proc. Natl. Acad. Sci. USA 111, 10095 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [15].Mora T, Walczak AM, Bialek W, and Callan CG, Proc. Natl. Acad. Sci. USA 107, 5405 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Kastner DB, Baccus SA, and Sharpee TO, Proc. Natl. Acad. Sci. USA 112, 2533 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Krotov D, Dubuis JO, Gregor T, and Bialek W, Proc. Natl. Acad. Sci. USA 111, 3683 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [18].De Palo G, Yi D, and Endres RG, PLoS Biol. 15, e1002602 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [19].Chen X, Dong X, Be’er A, Swinney HL, and Zhang HP, Phys. Rev. Lett 108, 148101 (2012). [DOI] [PubMed] [Google Scholar]
- [20].Aguilar-Hidalgo D, Werner S, Wartlick O, González-Gaitán M, Friedrich BM, and Jülicher F, Phys. Rev. Lett 120, 198102 (2018). [DOI] [PubMed] [Google Scholar]
- [21].Wan KY and Goldstein RE, Phys. Rev. Lett 121, 058103 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [22].Bialek W, Cavagna A, Giardina I, Mora T, Pohl O, Silvestri E, Viale M, and Walczak AM, Proc. Natl. Acad. Sci. USA 111, 7212 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [23].Attanasi A, Cavagna A, Del Castello L, Giardina I, Melillo S, Parisi L, Pohl O, Rossaro B, Shen E, Silvestri E et al. , Phys. Rev. Lett 113, 238102 (2014). [DOI] [PubMed] [Google Scholar]
- [24].Cavagna A, Conti D, Creato C, Del Castello L, Giardina I, Grigera TS, Melillo S, Parisi L, and Viale M, Nat. Phys 13, 914 (2017). [Google Scholar]
- [25].Schwab DJ, Nemenman I, and Mehta P, Phys. Rev. Lett 113, 068102 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [26].Touboul J and Destexhe A, Phys. Rev. E 95, 012413 (2017). [DOI] [PubMed] [Google Scholar]
- [27].Newman ME, Contemp. Phys 46, 323 (2005). [Google Scholar]
- [28].Biernacki C, Celeux G, and Govaert G, Comput. Stat. Data Anal 41, 561 (2003). [Google Scholar]
- [29].Schlögl F, Z. Phys 253, 147 (1972). [Google Scholar]
- [30].Dewel G, Walgraef D, and Borckmans P, Z. Phys. B 28, 235 (1977). [Google Scholar]
- [31].Nicolis G and Malek-Mansour M, J. Stat. Phys 22, 495 (1980). [Google Scholar]
- [32].Brachet M and Tirapegui E, Phys. Lett. A 81, 211 (1981). [Google Scholar]
- [33].Grassberger P, Z. Phys. B 47, 365 (1982). [Google Scholar]
- [34].Prakash S and Nicolis G, J. Stat. Phys 86, 1289 (1997). [Google Scholar]
- [35].Liu D-J, Guo X, and Evans JW, Phys. Rev. Lett 98, 050601 (2007). [DOI] [PubMed] [Google Scholar]
- [36].Vellela M and Qian H, J. R. Soc., Interf 6, 925 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [37].Van Kampen NG, Stochastic Processes in Physics and Chemistry (Elsevier, Amsterdam, 1992). [Google Scholar]
- [38].Gardiner CW, Handbook of Stochastic Methods (Springer, Berlin, 1985). [Google Scholar]
- [39].Walczak AM, Mugler A, and Wiggins CH, Proc. Natl. Acad. Sci. USA 106, 6529 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [40].Kopietz P, Bartosch L, and Schütz F, Introduction to the Functional Renormalization Group (Springer, Berlin, 2010), Vol. 798. [Google Scholar]
- [41].Mandal D, Phys. Rev. E 88, 062135 (2013). [DOI] [PubMed] [Google Scholar]
- [42].Zia R, Praestgaard E, and Mouritsen O, Am. J. Phys 70, 384 (2002). [Google Scholar]
- [43].Boksenbojm E, Maes C, Netočný K, and Pešek J, Europhys. Lett 96, 40001 (2011). [Google Scholar]
- [44].Bisquert J, Am. J. Phys 73, 735 (2005). [Google Scholar]
- [45].Hukelmann JL, Anderson KE, Sinclair LV, Grzes KM, Murillo AB, Hawkins PT, Stephens LR, Lamond AI, and Cantrell DA, Nat. Immunol 17, 104 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [46].Shin S-Y, Rath O, Choo S-M, Fee F, McFerran B, Kolch W, and Cho K-H, J. Cell Sci 122, 425 (2009). [DOI] [PubMed] [Google Scholar]
- [47].Gillespie DT, J. Phys. Chem 81, 2340 (1977). [Google Scholar]
- [48].Savitzky A and Golay MJ, Anal. Chem 36, 1627 (1964). [Google Scholar]
- [49].Ohtsuki T and Keyes T, Phys. Rev. A 35, 2697 (1987). [DOI] [PubMed] [Google Scholar]
- [50].Grassberger P and Sundermeyer K, Phys. Lett. B 77, 220 (1978). [Google Scholar]
- [51].Friedman N, Cai L, and Xie XS, Phys. Rev. Lett 97, 168302 (2006). [DOI] [PubMed] [Google Scholar]
- [52].Mugler A, Walczak AM, and Wiggins CH, Phys. Rev. E 80, 041921 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [53].Shahrezaei V, Ollivier JF, and Swain PS, Mol. Syst. Biol 4, 196 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [54].Horsthemke W and Lefever R, Noise-induced Transitions (Springer, Berlin, 1984). [Google Scholar]
- [55].Cotari JW, Voisinne G, Dar OE, Karabacak V, and Altan-Bonnet G, Sci. Sign 6, ra17 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [56].Prill RJ, Vogel R, Cecchi GA, Altan-Bonnet G, and Stolovitzky G, PLoS ONE 10, e0125777 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [57].Erez A, Vogel R, Mugler A, Belmonte A, and Altan-Bonnet G, Cytom. Pt. A 93, 611 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- [58].Bevington PR and Robinson DK, Data Reduction and Error Analysis for the Physical Sciences (McGraw-Hill, New York, 2002). [Google Scholar]
- [59]. https://github.com/AmirErez/UniversalImmune.
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
Data and code for all figures and the MIFlowCyt record are available [59].