Parameter-free model discrimination criterion based on steady-state coplanarity

Heather A Harrington; Kenneth L Ho; Thomas Thorne; Michael PH Stumpf

Proc Natl Acad Sci U S A. 2012 Sep 11;109(39):15746-15751. doi: 10.1073/pnas.1117073109


PMCID: PMC3465434 PMID: 22967512

Abstract

We introduce a procedure for deciding when a mass-action model is incompatible with observed steady-state data that does not require any parameter estimation. Thus, we avoid the difficulties of nonlinear optimization typically associated with methods based on parameter fitting. Instead, we borrow ideas from algebraic geometry to construct a transformation of the model variables such that any set of steady states of the model under that transformation lies on a common plane, irrespective of the values of the model parameters. Model rejection can then be performed by assessing the degree to which the transformed data deviate from coplanarity. We demonstrate our method by applying it to models of multisite phosphorylation and cell death signaling. Our framework offers a parameter-free perspective on the statistical model selection problem, which can complement conventional statistical methods in certain classes of problems where inference has to be based on steady-state data and the model structures allow for suitable algebraic relationships among the steady-state solutions.

Keywords: chemical reaction networks, mass-action kinetics, ordinary differential equations, singular values, algebraic statistics

In many branches of science and engineering, one is often interested in the problem of model selection: Given observed data and a set of candidate models for the process generating that data, which is the most appropriate model for that process? Such a situation commonly arises when the inner workings of a process are not completely understood, so that multiple models are consistent with the current state of knowledge. For mechanistic models, e.g., ordinary differential equation (ODE) or stochastic dynamical models, most selection techniques involve parameter estimation, which typically requires some form of optimization, exploration of the parameter space, or formal inference procedure (1, 2). For sufficiently complicated models, however, this task can become infeasible, owing to the nonlinearity and multimodality of the objective function (which penalizes any differences between the data and the model predictions), as well as the high dimensionality of the parameter space (3).

Here, we present a framework for the discrimination of mass-action ODE models (and suitable generalizations thereof) that does not require or rely upon such estimated parameters. Our method (Fig. 1) operates on steady-state data and combines techniques from algebraic geometry, linear algebra, and statistics to determine when a given model is incompatible with the data under all choices of the model parameters. The core idea is to use the model equations to construct a transformation of the original variables such that any set of steady states of the model under that transformation possesses a simple geometric structure, irrespective of parameter values. In this case, we insist that the transformed steady states lie on a plane, which we detect numerically; if the observed data are not coplanar under the transformation induced by a given model, then we can confidently reject that model.

Fig. 1. — Parameter-free method for model discrimination.

The idea of transformation to coplanarity has been employed before, but previous efforts were limited, in part, by the difficulty of detecting and quantifying coplanarity systematically. For example, in ref. 4, it was necessary to first manually reduce the dimension of the transformed space to three so that coplanarity could be assessed visually. Other related research using similar methods includes refs. 5–7. The current work extends existing methodologies by devising a numerical scheme for quantifying the deviation from coplanarity that generalizes to higher dimensions and allows for statistical interpretation. Thus, we provide a richer and more powerful framework for the application of this basic technique. Chemical reaction network theory (CRNT) (8, 9) and stoichiometric network analysis (10) likewise embrace a parameter-free philosophy and can also be exploited for model selection (11–13).

It is worth noting that our method provides a necessary but (generally) not sufficient condition for model compatibility: A model that is compatible with the data must provide a transformation to coplanarity, but a model that achieves coplanarity is not necessarily compatible, due to additional degrees of freedom introduced in the transformation process. This work is in contrast to traditional approaches based on parameter fitting, which provide a sufficient but not necessary condition because local extrema in the cost function surface may prevent a suitable fit. These two approaches are therefore complementary and can be used together for improved model selection.

The remainder of this paper is organized as follows. First, we introduce the concept of steady-state invariants (4, 5), polynomials that vanish at steady state and which depend only on experimentally accessible variables. Then we illustrate how to use steady-state invariants to deduce coplanarity requirements for model compatibility and how to detect such coplanarity numerically; we also discuss invariants in the context of standard parameter fitting techniques. Next, we apply our method to models of multisite phosphorylation and cell death signaling. Finally, we end with some generalizations and concluding remarks.

Steady-State Invariants

Consider a chemical reaction network model

$$\sum_{j=1}^{N} s_{ij} X_j \xrightarrow{k_i} \sum_{j=1}^{N} s'_{ij} X_j, \qquad i = 1,\dots,R,$$

[1]

in the species X₁,…,X_N, where s_ij and s′_ij are the stoichiometric coefficients of X_j in the reactant and product sets, respectively, of reaction i, with rate constant k_i. Under mass-action kinetics, the model has dynamics

$$\dot{x}_j = \sum_{i=1}^{R} k_i \left( s'_{ij} - s_{ij} \right) \prod_{l=1}^{N} x_l^{s_{il}}, \qquad j = 1,\dots,N,$$

[2]

where x_i is the concentration of species X_i (throughout, we follow the convention that lowercase letters denote the concentrations of the corresponding species indicated in uppercase). These equations provide a quantitative description of the model and can, in principle, be used to test its validity by assessing the degree to which they are satisfied by observed data. Unfortunately, in practice, the required variables are rarely all available. In particular, the velocities ẋ_i can be difficult to measure, so we can often consider only the steady state ẋ = 0, as we will do here. Furthermore, certain species may be experimentally inaccessible due to technological limitations; we eliminate these variables from the equations if possible.
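As a minimal sketch of the mass-action dynamics just described, the right-hand side can be evaluated directly from the reactant and product stoichiometric coefficients. The function name `mass_action_rhs` and the two-reaction network below are hypothetical illustrations, not part of the models studied in this paper.

```python
import numpy as np

def mass_action_rhs(x, k, S_react, S_prod):
    """Right-hand side of mass-action kinetics.

    x       : concentrations, shape (N,)
    k       : rate constants, shape (R,)
    S_react : reactant stoichiometric coefficients s_ij, shape (R, N)
    S_prod  : product stoichiometric coefficients s'_ij, shape (R, N)
    """
    x = np.asarray(x, dtype=float)
    # Rate of reaction i: k_i * prod_j x_j ** s_ij
    rates = np.asarray(k, dtype=float) * np.prod(x ** S_react, axis=1)
    # Net change of species j: sum_i (s'_ij - s_ij) * rate_i
    return (S_prod - S_react).T @ rates

# Hypothetical network: 2 A -> B (rate constant 2), B -> 2 A (rate constant 1)
S_react = np.array([[2, 0], [0, 1]])
S_prod = np.array([[0, 1], [2, 0]])
dxdt = mass_action_rhs([3.0, 4.0], [2.0, 1.0], S_react, S_prod)
```

A steady state is any concentration vector at which this function returns zero in every component.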

For simple models, this elimination can be done by hand, but a more systematic approach is required in general. One such approach is to use Gröbner bases (14), a central tool in computational algebraic geometry that provides a generalization of Gaussian elimination for multivariate polynomial systems. Here, we follow the general procedure of Manrai and Gunawardena (4). Let ℚ[a] be the polynomial ring consisting of all polynomials in the parameters a = (k₁,…,k_R) with coefficients from the rational numbers ℚ, and let ℚ(a) be its fraction field, comprising all elements of the form f/g, where f, g ∈ ℚ[a] with g ≠ 0. Clearly, each ẋ_i ∈ ℚ(a)[x], the ring of all polynomials in x = (x₁,…,x_N) with coefficients in ℚ(a). Note that the parameters a have been absorbed into the coefficient field ℚ(a); thus, by performing all operations over ℚ(a), we can treat a symbolically, i.e., without specifying any particular parameter values.

To characterize the steady state ẋ = 0, we construct the ideal J = ⟨ẋ₁,…,ẋ_N⟩ generated by ẋ₁,…,ẋ_N, consisting of all polynomials Σ_i h_i ẋ_i, where each h_i ∈ ℚ(a)[x]. Clearly, J contains all elements of ℚ(a)[x] that vanish at steady state. To obtain only those elements of J that do not depend on the variables x₁,…,x_i, we consider the ith elimination ideal J_i = J ∩ ℚ(a)[x_obs], where x_obs = (x_{i+1},…,x_N) denotes the “observable” variables. Here, it is useful to introduce Gröbner bases, which are special sets of generators with the so-called elimination property that if g = (g₁,…,g_M) is a Gröbner basis for J under the lexicographic ordering x₁ > ⋯ > x_N, then J_i = ⟨g_obs⟩, where g_obs = g ∩ ℚ(a)[x_obs] are precisely those elements of g containing only the variables x_obs. The polynomials g_obs generate all elements of ℚ(a)[x_obs] that vanish at steady state and so characterize the projection of the steady state onto the variables x_obs.

Procedurally, we compute a reduced Gröbner basis g of J with respect to a suitable lexicographic ordering using standard algorithms, then obtain g_obs by subselection. For numerical convenience we further rescale each polynomial in g_obs so that all coefficients belong to ℚ[a] (i.e., we multiply through by their common denominator). Then the elements of g_obs = (I₁,…,I_{N_inv}) have the form

$$I_i(x_{\mathrm{obs}}; a) = \sum_{j=1}^{n_i} f_{ij}(a) \prod_{k=1}^{N_{\mathrm{obs}}} x_k^{t_{ijk}}, \qquad f_{ij} \in \mathbb{Q}[a],$$

[3]

where we have applied the relabeling x_obs = (x₁,…,x_{N_obs}) and the t_ijk are nonnegative integer exponents. Clearly, each I_i is a polynomial in x_obs that vanishes at steady state; we call such polynomials steady-state invariants (or sometimes just invariants for short).
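The elimination procedure can be sketched on a small example. The snippet below uses SymPy rather than the Sage and Singular setup used for this paper, and the toy network (S + E ⇌ C → P + E, P → S, with the complex C treated as unobserved) is a hypothetical illustration. Symbols that are not listed as generators are treated by SymPy as coefficients, which plays the role of working over ℚ(a).

```python
import sympy as sp

# Rate constants (symbolic parameters) and concentrations
k1, k2, k3, k4 = sp.symbols("k1 k2 k3 k4")
c, e, s, p = sp.symbols("c e s p")

# Steady-state polynomials of the toy network; the remaining species
# equations are linear combinations of these two
f_c = k1 * s * e - (k2 + k3) * c   # complex C
f_p = k3 * c - k4 * p              # product P

# Lexicographic ordering with the unobserved variable c first,
# so that it is eliminated where possible
G = sp.groebner([f_c, f_p], c, e, s, p, order="lex")

# Subselect the basis elements that involve only observable variables
invariants = [g for g in G.exprs if not g.has(c)]
```

For this toy network the subselection leaves a single invariant, proportional to k1 k3 e s − (k2 + k3) k4 p, whose coefficients are polynomials in the rate constants.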

Note in general that steady-state invariants may fail to exist because J_i may be trivial (i.e., contain only the zero polynomial). Moreover, invariants and their properties (e.g., degrees) can depend delicately on the choice of monomial ordering. Some manual intervention is therefore often required to obtain useful invariants. We will not treat this important (but subtle) issue here, instead focusing on the analysis of given invariants, however they are obtained. This approach also has the advantage of separating the computation of invariants from their interpretation, in principle allowing the use of invariants from various theories. Steady-state invariants, if they exist, describe relationships between observable variables that hold at steady state for any given realization of parameter values, regardless of other factors such as initial conditions.

For full details on the computational procedure employed, see the accompanying Sage worksheet, which contains code for all computations performed (Materials and Methods). For further background on algebraic geometry and Gröbner bases (including the potential problems of obtaining them), see ref. 14; for other methods of variable elimination, see, e.g., ref. 15. Similar algebraic ideas have also appeared in the context of phylogenetics (16, 17).

Model Discrimination

We start with a set of steady-state measurements x̂^(i) of the observable variables x_obs, for i = 1,…,m, and a given model with steady-state invariants I₁,…,I_{N_inv}.

Data Coplanarity.

An invariant, I ∈ g_obs, can be written somewhat simplified as

$$I(x_{\mathrm{obs}}; a) = \sum_{j=1}^{n} f_j(a) \prod_{k=1}^{N_{\mathrm{obs}}} x_k^{t_{jk}}.$$

[4]

We first describe a procedure for deciding whether it is possible that the invariant is compatible with the data, i.e., that each measurement x̂^(i) of x_obs satisfies

$$I(\hat{x}^{(i)}; a) = 0, \qquad i = 1,\dots,m,$$

[5]

for some choice of a. We therefore rewrite Eq. 4 as

$$I(x_{\mathrm{obs}}; a) = \sum_{j=1}^{n} b_j y_j,$$

[6]

where y_j = ∏_k x_k^{t_jk} and b_j = f_j(a), with y = (y₁,…,y_n) and b = (b₁,…,b_n). Let φ be the map taking x_obs to y. Then compatibility implies that the transformed variable ŷ^(i) = φ(x̂^(i)) corresponding to any observation x̂^(i), considered as a point in ℝⁿ with coordinates (ŷ₁^(i),…,ŷ_n^(i)), lies on the hyperplane defined by the coefficients b. In other words, compatibility with the data {x̂^(i)} implies that the corresponding transformed data {ŷ^(i)} are coplanar.
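As a small worked illustration of this rewriting, consider a hypothetical invariant in three observables (s, e, p) with rate constants k₁,…,k₄ (an assumed toy example, not one of the invariants analyzed below):

```latex
\begin{aligned}
I(s, e, p; a) &= k_1 k_3\, s e - (k_2 + k_3) k_4\, p = b_1 y_1 + b_2 y_2, \\
y &= \varphi(s, e, p) = (s e,\; p), \\
b &= \bigl(k_1 k_3,\; -(k_2 + k_3) k_4\bigr).
\end{aligned}
```

Every steady state of such a model maps to a point (se, p) on the line b₁y₁ + b₂y₂ = 0 through the origin, whatever the values of the rate constants, so testing for this linear structure requires no knowledge of a.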

In general, it is possible that the invariant vanishes trivially (b = 0) under some choice of parameters, for which coplanarity need no longer hold. To discount this case, we can check, for instance, that the denominator of the corresponding g_obs is never zero. Then I always has at least one nonzero coefficient; hereafter in this section, we assume that the invariant is non-vanishing in this sense.

Let Y ∈ ℝ^{m×n} be the matrix whose rows consist of the transformed data ŷ^(i) = φ(x̂^(i)). Then the data are coplanar if and only if Yb = 0 for some nontrivial column vector b ≠ 0. Such a vector, by definition, resides in the null space of Y, which can be found using the singular value decomposition Y = UΣV^T, where the diagonal elements of Σ give the singular values σ_i≥0 encoding the “stretch” of each basis vector in V. In particular, the smallest singular value σ_min bounds the norm ‖Yb‖ for any b ≠ 0 via

$$\|Yb\| \geq \sigma_{\min} \|b\|,$$

[7]

so if σ_min > 0, then the data cannot be coplanar (18). More generally, σ_min gives the least squares deviation of the data from coplanarity under the scaling constraint ‖b‖ = 1. This quantity depends only on the data and is therefore parameter-free.

Note that this requirement holds for any choice of b, regardless of whether it can be realized by the original parameters a. In this sense, the condition of small σ_min provides a necessary but not sufficient criterion for model compatibility. The additional degrees of freedom introduced by neglecting the functional forms f_j effectively linearize the compatibility condition expressed by Eq. 5, allowing for a simple, direct solution.
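The singular value test can be sketched in a few lines of NumPy. The data below are synthetic points in ℝ³ (a hypothetical example, not transformed data from any model in this paper); one set is projected onto a plane through the origin and the other is left generic.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma_min(Y):
    """Smallest singular value of Y (rows are transformed data points)."""
    return np.linalg.svd(Y, compute_uv=False)[-1]

# Unit normal of a hypothetical plane b . y = 0
b = np.array([1.0, -2.0, 0.5])
b /= np.linalg.norm(b)

points = rng.normal(size=(50, 3))
Y_coplanar = points - np.outer(points @ b, b)  # remove the component along b
Y_generic = points                             # no coplanar structure

s_coplanar = sigma_min(Y_coplanar)
s_generic = sigma_min(Y_generic)

# The right singular vector belonging to sigma_min estimates b up to sign
_, _, Vt = np.linalg.svd(Y_coplanar)
b_est = Vt[-1]
```

For the projected points σ_min is zero up to rounding error, whereas for the generic points it is of order one, which is the distinction the criterion relies on.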

To account for the presence of noise, suppose that we know each component x̂_k of a measurement x̂ only up to an error δx̂_k, with

$$\delta\hat{x}_k \sim \epsilon\, \hat{x}_k Z,$$

[8]

where Z is a standard normal random variable. We imagine that the noise parameter ϵ is given, for example, by instrument error. Then from the perturbation equation

$$\hat{y} + \delta\hat{y} = \varphi(\hat{x} + \delta\hat{x}),$$

[9]

we find, expanding to first order, that the error is propagated to the transformed variables as δŷ ≈ ∇φ(x̂) δx̂, where ∇φ is the Jacobian of φ, with elements (∇φ)_jk = ∂y_j/∂x_k. Therefore,

$$\delta\hat{y}_j \sim \epsilon \left[ \sum_k \left( \frac{\partial y_j}{\partial x_k} \hat{x}_k \right)^2 \right]^{1/2} Z.$$

[10]

We now consider the effect of the δŷ on σ_min under the null hypothesis that the underlying noise-free y are coplanar with coefficients b (of unit norm). Thus, we study the vector Yb, whose entries are perturbed from zero to

$$\hat{y} \cdot b = \sum_j b_j\, \delta\hat{y}_j \sim \epsilon \left[ \sum_k \left( \sum_j b_j \frac{\partial y_j}{\partial x_k} \hat{x}_k \right)^2 \right]^{1/2} Z$$

[11]

for each transformed datum ŷ. Since ‖b‖ = 1 by assumption, if we rescale each row of Y by its corresponding effective error

$$\epsilon_{\hat{y}} = \epsilon \left[ \sum_j \sum_k \left( \frac{\partial y_j}{\partial x_k} \hat{x}_k \right)^2 \right]^{1/2},$$

[12]

thus obtaining Y^′, then, by the Cauchy-Schwarz inequality, each entry of Y^′b has the form μ_iZ with |μ_i| ≤ 1, for i = 1,…,m. We hence define the coplanarity error

$$\Delta = \sigma_{\min}(Y'),$$

[13]

which, from the discussion above, is bounded by the length of a normal random vector with variances μ_i², whose distribution function clearly dominates that of the length of a normal random vector with unit variances. But this latter quantity simply follows the χ distribution with m degrees of freedom. In other words,

$$\Pr(\Delta \geq t) \lesssim \Pr(\chi_m \geq t) \quad \text{for all } t \geq 0;$$

[14]

if p_α is the upper α-percentile for χ_m (e.g., α = 0.05), then

$$\Pr(\Delta \geq p_\alpha) \lesssim \alpha,$$

[15]

which gives an approximate criterion for rejecting coplanarity. As the amount of data increases, the approximation improves since σ_min(Y^′) → ‖Y^′b‖ as m → ∞ by the symmetry of Eq. 10.
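The rejection test can be sketched as follows, assuming relative measurement noise of size ϵ and a transformation φ made of monomials, for which (∂y_j/∂x_k) x_k equals the exponent of x_k in y_j times y_j. The helper names, the exponent matrix `T`, and the synthetic data are hypothetical; the row scaling is one reading of the effective error described above, not a transcription of the code accompanying this paper.

```python
import numpy as np
from scipy.stats import chi

def monomials(X, T):
    """Evaluate y_j = prod_k x_k ** T[j, k] for each row of X (shape m x N_obs)."""
    return np.prod(X[:, None, :] ** T[None, :, :], axis=2)

def coplanarity_error(X, T, eps):
    """sigma_min of the transformed data after scaling each row by its effective error."""
    Y = monomials(X, T)
    # For monomials, (dy_j/dx_k) * x_k = T[j, k] * y_j
    eff = eps * np.sqrt((Y ** 2) @ (T ** 2).sum(axis=1))
    return np.linalg.svd(Y / eff[:, None], compute_uv=False)[-1]

def threshold(m, alpha=0.05):
    """Upper alpha-percentile of the chi distribution with m degrees of freedom."""
    return chi.ppf(1.0 - alpha, df=m)

rng = np.random.default_rng(1)
m, eps = 100, 1e-6
T = np.array([[1, 1, 0], [0, 0, 1]])       # y = (s * e, p)
s, e = rng.lognormal(0.0, np.log(2.0), size=(2, m))

def corrupt(X):
    """Multiply by log-normal noise of relative size eps."""
    return X * np.exp(eps * rng.standard_normal(X.shape))

X_ok = corrupt(np.column_stack([s, e, 0.5 * s * e]))        # satisfies p = 0.5 * s * e
X_bad = corrupt(np.column_stack([s, e, 0.5 * s * e ** 2]))  # generated by a different law

p_alpha = threshold(m)
delta_ok = coplanarity_error(X_ok, T, eps)
delta_bad = coplanarity_error(X_bad, T, eps)
```

With m = 100 and α = 0.05 the threshold evaluates to about 11.2. In this sketch the compatible data fall below it, while the data generated by the other law exceed it by several orders of magnitude.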

Depending on the exact situation at hand, it may be appropriate to choose a more conservative significance level α or to invoke additional criteria to decide whether a model is acceptable. In the examples below, however, we will see that whether a model can be rejected is often fairly obvious, and in such cases we will simply use the asymptotic arguments based on the χ_m distribution.

Invariant Minimization.

Steady-state invariants can also be used in conjunction with standard parameter fitting techniques. The basic approach is to minimize the Frobenius norm of the matrix θ ∈ ℝ^{m×N_inv}, with one entry per invariant-datum pair given by the value of that invariant at that datum, over the parameters, which readily provides a sufficient condition for model compatibility since any a producing a small norm provides parameters that fit the data by construction. However, the condition is not necessary because suitable parameters may fail to be found even for compatible models due to the intricacies of the objective function. Clearly, prior knowledge of a can be used to guide the optimization away from such difficulties.

Assuming that the model and its parameters are correct, each invariant I(x̂; a) = 0 in principle. However, due to noise, I(x̂; a) ∼ ϵ_I Z, where

$$\epsilon_I = \epsilon \left[ \sum_k \left( \sum_j f_j(a) \frac{\partial y_j}{\partial x_k} \hat{x}_k \right)^2 \right]^{1/2}$$

[16]

by Eq. 11. Therefore, if we use I(x̂; a)/ϵ_I as the entry of θ corresponding to invariant I and datum x̂, then the invariant error

$$\|\theta\|_F \sim \chi_{m N_{\mathrm{inv}}}.$$

[17]

This quantity can be used to compute the likelihood L(a) = Pr[θ(a)] and allows, e.g., various likelihood-based selection schemes (19, 20), assuming that the optimization can be performed. Here, we use the Akaike information criterion (AIC),

$$A = 2R - 2 \ln L_{\max},$$

[18]

where R is the number of parameters in a = (k₁,…,k_R) and L_max = max_a L(a), which penalizes model complexity; the preferred model is the one with the minimum AIC (21).

Results

We apply our methods to two illustrative biological processes for which competing models exist: multisite phosphorylation and cell death signaling.

Multisite Phosphorylation.

We focus first on phosphorylation, a key cellular regulatory mechanism that has been the subject of extensive study, both experimentally (22–24) and theoretically (4, 5, 25–27). Following ref. 4, we consider a two-site system with reactions,

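$$S_u + K \rightleftharpoons KS_u \xrightarrow{c_{uv}} S_v + K,$$

[19]

$$S_v + F \rightleftharpoons FS_v \xrightarrow{\gamma_{vu}} S_u + F,$$

[20]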

where u,v∈{0,1}² are bit strings of length two, encoding the occupancies of each site (0 or 1 for the absence or presence, respectively, of a phosphate), with u having fewer set bits than v; S_u is the phosphoform with phosphorylation state u; K is a kinase, an enzyme that adds phosphates; and F is a phosphatase, an enzyme that removes phosphates. Each enzyme can be either processive (P), where more than one phosphate modification may be achieved in a single step, or distributive (D), where only one modification is allowed before the enzyme dissociates from the substrate (c₀₀₁₁ = 0 for K, γ₁₁₀₀ = 0 for F). This mechanistic diversity generates four competing models: PP, PD, DP, and DD, where the first letter designates the mechanism of the kinase, and the second, that of the phosphatase.

As in ref. 4, we consider only the concentrations x_obs = (s₀₀,s₀₁,s₁₀,s₁₁) as observable and use the ordering,

[21]

with which we are able to eliminate all other variables except f from the dynamics of each model. The remaining Gröbner basis polynomials are of the form p(f,x_obs) = f·q(x_obs), where f ≠ 0 unless there is no phosphatase in the system, which we assume not to be the case, so we take only the observable part q(x_obs). It is easy to check that the resulting denominators are always of one sign.

Each model has three steady-state invariants. Matched appropriately, the invariants for model PP share the same transformed variables y = φ(x_obs) as those for PD; the same is true for DP and DD. Thus, in terms of the transformed data, only the kinase mechanism is discriminative. Between PP/PD and DP/DD, two invariants (I₁ and I₂) are discriminative in principle, though only one (I₂) succeeds numerically: For simulated data from the PP/PD models, provided that the noise level is sufficiently low, coplanarity on I₂ is able to correctly reject the DP/DD models at significance level α = 0.05 (Δ ∼ 10⁵ versus Δ ∼ 1 for PP/PD at ϵ = 10^-9, against a threshold of p_α = 11.2). The corresponding test using DP/DD data is not successful due to the form of I₂, which has transformed variables,

[22]

[23]

for PP/PD and DP/DD, respectively, i.e., y^PP/PD has the additional variable s₀₀s₁₀ over y^DP/DD. Therefore, PP/PD models can be made to fit DP/DD data simply by setting the coefficient corresponding to s₀₀s₁₀ to zero, which is in fact what we observe. No model is rejected on the basis of data generated from it.

We emphasize that these results are specific to the particular ordering chosen. Indeed, one can make the phosphatase mechanism discriminative instead by reversing the order of the variables x_obs in Eq. 21. The exhaustive analysis of such orderings is beyond the scope here; rather, we aim to illustrate the potential uses (and usefulness) of this type of approach using concrete examples.

Although the condition of coplanarity is technically valid only at steady state, there should nevertheless be some convergence over time to coplanarity for any compatible model. We hence compute Δ for the PP/PD and DP/DD models along time course trajectories simulated from model PP at various levels of ϵ (Fig. 2A). For low noise, the results confirm convergence for invariants previously identified as compatible (all I_i for PP/PD; I₁ and I₃ for DP/DD), with stagnation for incompatible invariants (I₂ for DP/DD); these results suggest wider applicability of this method, provided that the data are approaching steady state reasonably fast. As the noise increases, however, Δ decreases inversely proportionally, until the stagnation point hits the basal error level of Δ ∼ 1 and we lose all power to reject. Additional simulations estimate the critical noise level at ϵ ∼ 10^-4 (Fig. 2B).

Fig. 2. — Discrimination of multisite phosphorylation models. (A) Coplanarity error Δ of the steady-state invariants of the PP/PD (*Left*) and DP/DD (*Right*) models along time course trajectories simulated from the PP model, corrupted by various levels of noise (lined, ϵ = 10^-9; dashed, ϵ = 10^-6; dotted, ϵ = 10^-3). At each noise level, the errors for three invariants are shown (blue, I₁; green, I₂; red, I₃). (B) Coplanarity error Δ of DP/DD invariants on PP data at steady state as a function of the noise level ϵ; invariants colored as in A. The shaded region indicates the regime over which the DP/DD models can be rejected at significance level α = 0.05. (C) Invariant error AIC A for each model (blue, PP; green, PD; red, DP; cyan, DD) on data generated from the PP (*Upper Left*), PD (*Upper Right*), DP (*Lower Left*), and DD (*Lower Right*) models.

To further discriminate between all four models, we next turn to invariant minimization. The required optimization involves highly nonlinear functions, so success should be expected only if we have good initial estimates of the model parameters, which can be a rather strong demand. In such a case, however, minimization is indeed capable of identifying the correct model from the data so long as ϵ ≲ 10^-5 (Fig. 2C). These results reinforce our belief that the algebraic approach proposed here naturally complements conventional (i.e., parametric) reverse engineering schemes such as optimization or inference procedures.

Cell Death Signaling.

We next apply our methods to receptor-mediated cell death signaling, the so-called extrinsic apoptosis pathway, which plays a prominent role in cancers and other diseases (28–31). Specifically, we consider the assembly of the death-inducing signaling complex (DISC), a multiprotein oligomer formed by the association of FasL, a death ligand, with its cognate receptor Fas (32, 33).

We investigate two models of DISC formation. The first (34), which we call the cross-linking model, is based on the successive binding of Fas (R) to FasL (L),

$$L + R \rightleftharpoons C_1,$$

[24]

$$C_1 + R \rightleftharpoons C_2,$$

[25]

$$C_2 + R \rightleftharpoons C_3,$$

[26]

where C_i is the complex FasL∶Fas_i. The second (6), which we call the cluster model, posits three forms of Fas (inactive, X; active and unstable, Y; active and stable, Z) and specifies receptor cluster-stabilization events driven by FasL,

graphic file with name pnas.1117073109eq80.jpg

[27]

[28]

[29]

[30]

where the last two reactions represent entire families generated by taking i = 2 or 3, with j = 1,…,i and k = 1,…,j. The cluster model is capable of bistability, whereas the cross-linking model exhibits only monostable behavior (6).

The two models are structurally very different, and discriminating between them requires some care. Hence, following ref. 6, we establish a correspondence between the models by considering the apoptotic signal ζ transduced by the DISC, defined as ζ = c₁ + 2c₂ + 3c₃ for the cross-linking model and ζ = z for the cluster model. We assume that ζ is experimentally accessible; other variables assumed accessible include λ, the total concentration of FasL (λ = l + c₁ + c₂ + c₃ and λ = l for the cross-linking and cluster models, respectively), and ρ, the total concentration of Fas (ρ = r + c₁ + 2c₂ + 3c₃ and ρ = x + y + z, respectively). Eliminating all other variables via the orderings (c₂,c₃,λ,ρ,ζ) and (y,λ,ρ,ζ) for the cross-linking and cluster models, respectively (after appropriate variable substitutions), we obtain one non-vanishing steady-state invariant for each model. The dimensions of the transformed spaces are 5 and 15 for the cross-linking and cluster models, respectively.

As for phosphorylation, we compute the coplanarity error for each invariant on time course data simulated from each model at various noise levels. Although results are inconclusive for data from the cross-linking model, the coplanarity criterion can reject the cross-linking model on the basis of cluster model data at α = 0.05, provided that ϵ ≲ 10^-2 (Fig. 3 A and B). The minimization protocol also correctly identifies the model from the data over the same range of noise levels (Fig. 3C).

Fig. 3. — Discrimination of cell death signaling models. (A) Coplanarity error Δ of the steady-state invariants of the cross-linking (*Left*) and cluster (*Right*) models along time course trajectories simulated from the cluster model, corrupted by various levels of noise (blue, ϵ = 10^-9; green, ϵ = 10^-6; red, ϵ = 10^-3). (B) Coplanarity error Δ of model invariants (blue, cross-linking; green, cluster) on cluster data at steady state as a function of the noise level ϵ. The shaded region indicates the regime over which the cross-linking model can be rejected at significance level α = 0.05. (C) Invariant error AIC A for each model (blue, cross-linking; green, cluster) on data generated from the cross-linking (*Left*) and cluster (*Right*) models.

Discussion

In this paper, we have presented a model discrimination scheme based on steady-state coplanarity that does not require known or estimated parameter values. Thus we are able to sidestep the parameter inference problem common to many fields including systems biology (3, 35). Such algebraic methods are not always effective, however; steady-state invariants may not exist, and even when they do, the additional degrees of freedom introduced by effective linearization can cause the method to fail. A promising solution to the problem when invariants cannot be calculated using Gröbner bases may be to employ invariants from CRNT (36). Our results also suggest a somewhat low tolerance for noise, which can restrict its applicability. Significantly, our method has the unique feature that it can be applied with complete ignorance of parameter values, and is therefore a useful additional tool in the analysis of inverse problems involving dynamical systems.

Rather than competing directly with current model discrimination techniques, we expect that coplanarity will form one end of an entire spectrum of methods, to be used when no parameter information is available. At the other end lie methods based on parameter estimation (including invariant minimization), which, for dynamical systems, can depend delicately on qualitative and quantitative aspects of the systems under consideration (37, 38). The intermediate regime comprises techniques that can leverage partial knowledge, for instance, constraints on certain parameter values or qualitative features of the dynamics (39). Along this spectrum, naturally, the discriminative power increases with the amount of prior information available. In this broader context, coplanarity can be used to efficiently reject candidate models before employing more demanding parameter estimation tools. Thus, it can serve as a preprocessor to thin out the model space. The real advantages and limitations of any inferential procedure become apparent once its performance can be evaluated in real-world applications, which is perhaps particularly true for this current approach. Certainly, a range of theoretical and computational issues surrounds algebraic methods, which will likely impact their applicability. Here we have found that a pragmatic approach yields some useful insights for small and intermediate-sized problems.

Finally, we remark that the presented scheme is perhaps the simplest of a potential class of parameter-free selection methods based on the detection of geometric structure. In this view, transformation to coplanarity is just one of many low-dimensional descriptions of such structure. The existence of low-dimensional representations has recently been predicted in neuronal signaling (40), and can ultimately be attributed to the inherent robustness of biological systems (41, 42).

Materials and Methods

Gröbner Basis Calculation.

All reduced Gröbner bases are computed over the field ℚ(a) of rational functions in the parameters a with rational coefficients, under a suitable lexicographic ordering with the observables x_obs located at the end of the variable list, using the computer algebra system Singular (http://www.singular.uni-kl.de/) as interfaced through Sage (http://www.sagemath.org/).

Data Generation.

For each model, parameters are drawn independently from a log-normal distribution with median μ^∗ = e^μ = 1 and multiplicative standard deviation σ^∗ = e^σ = 2, where μ and σ are the mean and standard deviation, respectively, of the underlying normal distribution. Using these parameters, m = 100 time course trajectories are computed for each model via integration of the model ODEs over the time interval 0 ≤ t ≤ 100; each trajectory is seeded by random initial conditions sampled from a log-normal distribution, also with μ^∗ = 1 and σ^∗ = 2. Integration is performed using the solver LSODA as wrapped in SciPy (http://www.scipy.org/). The data are then corrupted by noise of varying levels from ϵ = 10^-9 to 10^-1: for each ϵ, the nominal data are multiplied by random log-normal samples with μ^∗ = 1 and σ^∗ = 1 + ϵ.
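The data generation protocol can be sketched as follows. The toy network (S + E ⇌ C → P + E, P → S) and all function names are hypothetical stand-ins for the models of this paper; `scipy.integrate.odeint` is used because it wraps LSODA.

```python
import numpy as np
from scipy.integrate import odeint

rng = np.random.default_rng(2)

def rhs(x, t, k):
    """Mass-action dynamics of a hypothetical toy network."""
    s, e, c, p = x
    k1, k2, k3, k4 = k
    bind = k1 * s * e
    return [-bind + k2 * c + k4 * p,   # substrate S
            -bind + (k2 + k3) * c,     # enzyme E
            bind - (k2 + k3) * c,      # complex C
            k3 * c - k4 * p]           # product P

def lognormal(size, mult_sd=2.0):
    """Log-normal samples with median 1 and multiplicative standard deviation mult_sd."""
    return rng.lognormal(mean=0.0, sigma=np.log(mult_sd), size=size)

k = lognormal(4)                    # one parameter set for the model
t = np.linspace(0.0, 100.0, 201)    # time interval 0 <= t <= 100
m = 100                             # number of trajectories

# Each trajectory starts from its own random initial condition
trajectories = np.array([odeint(rhs, lognormal(4), t, args=(k,)) for _ in range(m)])

# Multiplicative log-normal noise with median 1 and multiplicative sd 1 + eps
eps = 1e-3
noisy = trajectories * lognormal(trajectories.shape, mult_sd=1.0 + eps)
```

The final time point of each noisy trajectory then serves as an approximate steady-state measurement, and the earlier time points can be used to follow the approach to coplanarity.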

Invariant Minimization.

Invariant error likelihood maximization is performed in two phases. First, an approximate optimal parameter set is obtained by minimizing the Frobenius norm of the matrix η, where each entry corresponds to an invariant-datum pair as in θ, but with values , where

graphic file with name pnas.1117073109eq84.jpg

[31]

The result is then taken as an initial parameter estimate to compute L_max. All optimizations are performed using L-BFGS-B (43) through SciPy, with lower and upper bounds of 0.01 and 100, respectively, for each variable. The minimization of ‖η‖_F is seeded with initial value 1 for all variables.
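The first optimization phase can be sketched with SciPy's L-BFGS-B interface, using the same bounds and starting point as above. The invariant and the synthetic data are hypothetical, and the residuals are left unnormalized, which is an assumption made for brevity rather than the scaling used for η.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# Hypothetical noise-free steady-state data obeying k1*k3*s*e = (k2 + k3)*k4*p
k_true = np.array([2.0, 1.0, 1.0, 0.5])
s, e = rng.lognormal(0.0, np.log(2.0), size=(2, 100))
p = k_true[0] * k_true[2] * s * e / ((k_true[1] + k_true[2]) * k_true[3])

def invariant_residuals(k):
    """Value of the toy invariant at every datum for rate constants k."""
    k1, k2, k3, k4 = k
    return k1 * k3 * s * e - (k2 + k3) * k4 * p

def objective(k):
    """Squared Frobenius norm of the (single-column) invariant matrix."""
    r = invariant_residuals(k)
    return float(r @ r)

x0 = np.ones(4)                   # initial value 1 for all variables
bounds = [(0.01, 100.0)] * 4      # lower and upper bounds on each variable
f0 = objective(x0)
res = minimize(objective, x0, method="L-BFGS-B", bounds=bounds)
```

Because the toy invariant depends on the rate constants only through two products, many parameter sets fit equally well; the fitted `res.x` need not match `k_true` even when the residual is driven close to zero.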

Computational Platform.

All computations are performed centrally in Sage, making use of its interfaces to various programs. Plots are produced using matplotlib (http://matplotlib.sourceforge.net/). The Sage worksheet for this paper, which contains code for all computations performed, is available at http://www.sagenb.org/home/pub/3462/.

ACKNOWLEDGMENTS.

We thank Carsten Wiuf, Elisenda Feliu, and Sarah Filippi for their comments on the manuscript. H.A.H. and M.P.H.S. gratefully acknowledge the Leverhulme Trust. K.L.H. was funded in part by National Science Foundation Grant DGE-0333389. T.T. and M.P.H.S. also acknowledge funding from the Biotechnology and Biological Sciences Research Council. M.P.H.S. is a Royal Society Wolfson Research Merit award holder.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

References

1. Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface. 2009;6:187–202. doi: 10.1098/rsif.2008.0172.
2. Vyshemirsky V, Girolami MA. Bayesian ranking of biochemical systems models. Bioinformatics. 2008;24:833–839. doi: 10.1093/bioinformatics/btm607.
3. Ashyraliyev M, Fomekong-Nanfack Y, Kaandorp JA, Blom JG. Systems biology: Parameter estimation for biochemical models. FEBS J. 2009;276:886–902. doi: 10.1111/j.1742-4658.2008.06844.x.
4. Manrai AK, Gunawardena J. The geometry of multisite phosphorylation. Biophys J. 2008;95:5533–5543. doi: 10.1529/biophysj.108.140632.
5. Gunawardena J. Distributivity and processivity in multisite phosphorylation can be distinguished through steady-state invariants. Biophys J. 2007;93:3828–3834. doi: 10.1529/biophysj.107.110866.
6. Ho KL, Harrington HA. Bistability in apoptosis by receptor clustering. PLoS Comput Biol. 2010;6:e1000956. doi: 10.1371/journal.pcbi.1000956.
7. Martínez-Forero I, Peláez-López A, Villoslada P. Steady state detection of chemical reaction networks using a simplified analytical method. PLoS One. 2010;5:e10823. doi: 10.1371/journal.pone.0010823.
8. Feinberg M. Chemical reaction network structure and the stability of complex isothermal reactors—I. The deficiency zero and deficiency one theorems. Chem Eng Sci. 1987;42:2229–2268.
9. Feinberg M. Chemical reaction network structure and the stability of complex isothermal reactors—II. Multiple steady states for networks of deficiency one. Chem Eng Sci. 1988;43:1–25.
10. Clarke BL. Stoichiometric network analysis. Cell Biophys. 1988;12:237–253. doi: 10.1007/BF02918360.
11. Conradi C, Saez-Rodriguez J, Gilles ED, Raisch J. Using chemical reaction network theory to discard a kinetic mechanism hypothesis. IEEE Proc Syst Biol. 2005;152:243–248. doi: 10.1049/ip-syb:20050045.
12. Ellison P, Feinberg M. How catalytic mechanisms reveal themselves in multiple steady-state data: I. Basic principles. J Mol Catal A Chem. 2000;154:155–167.
13. Ellison P, Feinberg M, Yue MH, Saltsburg H. How catalytic mechanisms reveal themselves in multiple steady-state data: II. An ethylene hydrogenation example. J Mol Catal A Chem. 2000;154:169–184.
14. Cox D, Little J, O’Shea D. Ideals, Varieties, and Algorithms. New York: Springer; 1997.
15. Feliu E, Wiuf C. Variable elimination in chemical reaction networks with mass-action kinetics. SIAM J Appl Math. 2012;72:959–981. doi: 10.1007/s00285-012-0510-4.
16. Cavender JA, Felsenstein J. Invariants of phylogenies in a simple case with discrete states. J Classif. 1987;4:57–71.
17. Pachter L, Sturmfels B. The mathematics of phylogenomics. SIAM Rev. 2007;49:3–31.
18. Belsley DA, Kuh E, Welsch R. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Hoboken, NJ: Wiley; 2004.
19. Casella G, Berger RL. Statistical Inference. 2nd Ed. Pacific Grove, CA: Duxbury Press; 2001.
20. Schwarz G. Estimating the dimension of a model. Ann Statist. 1978;6:461–464.
21. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Contr. 1974;19:716–723.
22. Cohen P. The role of protein phosphorylation in neural and hormonal control of cellular activity. Nature. 1982;296:613–620. doi: 10.1038/296613a0.
23. Cohen P. The regulation of protein function by multisite phosphorylation—A 25 year update. Trends Biochem Sci. 2000;25:596–601. doi: 10.1016/s0968-0004(00)01712-6.
24. Seger R, Krebs EG. The MAPK signaling cascade. FASEB J. 1995;9:726–735.
25. Gunawardena J. Multisite protein phosphorylation makes a good threshold but a poor switch. Proc Natl Acad Sci USA. 2005;102:14617–14622. doi: 10.1073/pnas.0507322102.
26. Huang CYF, Ferrell JE Jr. Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc Natl Acad Sci USA. 1996;93:10078–10083. doi: 10.1073/pnas.93.19.10078.
27. Thomson M, Gunawardena J. Unlimited multistability in multisite phosphorylation systems. Nature. 2009;460:274–277. doi: 10.1038/nature08102.
28. Thompson CB. Apoptosis in the pathogenesis and treatment of disease. Science. 1995;267:1456–1462. doi: 10.1126/science.7878464.
29. Raff M. Cell suicide for beginners. Nature. 1998;396:119–122. doi: 10.1038/24055.
30. Meier P, Finch A, Evan G. Apoptosis in development. Nature. 2000;407:796–801. doi: 10.1038/35037734.
31. Fulda S, Debatin KM. Extrinsic versus intrinsic apoptosis pathways in anticancer chemotherapy. Oncogene. 2006;25:4798–4811. doi: 10.1038/sj.onc.1209608.
32. Ashkenazi A, Dixit VM. Death receptors: Signaling and modulation. Science. 1998;281:1305–1308. doi: 10.1126/science.281.5381.1305.
33. Peter ME, Krammer PH. The CD95 (APO-1/Fas) DISC and beyond. Cell Death Differ. 2003;10:26–35. doi: 10.1038/sj.cdd.4401186.
34. Lai R, Jackson TL. A mathematical model of receptor-mediated apoptosis: Dying to know why FasL is a trimer. Math Biosci Eng. 2004;1:325–338. doi: 10.3934/mbe.2004.1.325.
35. Gunawardena J. Models in systems biology: The parameter problem and meanings of robustness. In: Lodhi HM, Muggleton SH, editors. Elements of Computational Systems Biology. Hoboken, NJ: Wiley; 2010. pp. 19–47.
36. Karp RL, Pérez Millán M, Dasgupta T, Dickenstein A, Gunawardena J. Complex-linear invariants of biochemical networks. J Theor Biol. 2012;311:130–138. doi: 10.1016/j.jtbi.2012.07.004.
37. Erguler K, Stumpf MPH. Practical limits for reverse engineering of dynamical systems: A statistical analysis of sensitivity and parameter inferability in systems biology models. Mol Biosyst. 2011;7:1593–1602. doi: 10.1039/c0mb00107d.
38. Gutenkunst RN, et al. Universally sloppy parameter sensitivities in systems biology models. PLoS Comput Biol. 2007;3:e189. doi: 10.1371/journal.pcbi.0030189.
39. Mélykúti B, August E, Papachristodoulou A, El-Samad H. Discriminating between rival biochemical network models: Three approaches to optimal experiment design. BMC Syst Biol. 2010;4:38. doi: 10.1186/1752-0509-4-38.
40. Barbano PE, et al. A mathematical tool for exploring the dynamics of biological networks. Proc Natl Acad Sci USA. 2007;104:19169–19174. doi: 10.1073/pnas.0709955104.
41. Csete ME, Doyle JC. Reverse engineering of biological complexity. Science. 2002;295:1664–1669. doi: 10.1126/science.1069981.
42. Kitano H. Biological robustness. Nat Rev Genet. 2004;5:826–837. doi: 10.1038/nrg1471.
43. Zhu C, Byrd RH, Lu P, Nocedal J. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw. 1997;23:550–560.
