From graph topology to ODE models for gene regulatory networks

Xiaohan Kang; Bruce Hajek; Yoshie Hanzawa

doi:10.1371/journal.pone.0235070

2020 Jun 30;15(6):e0235070. doi: 10.1371/journal.pone.0235070

Editor: Enrique Hernandez-Lemus

PMCID: PMC7326199 PMID: 32603340

Abstract

A gene regulatory network can be described at a high level by a directed graph with signed edges, and at a more detailed level by a system of ordinary differential equations (ODEs). The former qualitatively models the causal regulatory interactions between ordered pairs of genes, while the latter quantitatively models the time-varying concentrations of mRNA and proteins. This paper clarifies the connection between the two types of models. We propose a property, called the constant sign property, for a general class of ODE models. The constant sign property characterizes the set of conditions (system parameters, external signals, or internal states) under which an ODE model is consistent with a signed, directed graph. If the constant sign property for an ODE model holds globally for all conditions, then the ODE model has a single signed, directed graph. If the constant sign property for an ODE model only holds locally, which may be more typical, then the ODE model corresponds to different graphs under different sets of conditions. In addition, two versions of the constant sign property are given and a relationship between them is proved. As an example, the ODE models that capture the effect of cis-regulatory elements involving protein complex binding, based on the model in the GeneNetWeaver source code, are described in detail and shown to satisfy the global constant sign property with a unique consistent gene regulatory graph. Even a single gene regulatory graph is shown to have many ODE models of GeneNetWeaver type consistent with it, due to combinatorial complexity and continuous parameters. Finally, the question of how closely data generated by one ODE model can be fit by another ODE model is explored. It is observed that the fit is better if the two models come from the same graph.

Introduction

A gene regulatory network is a collection of molecular classes such that each molecular class interacts with a small number of other molecular classes, creating a sparse graph structure [1]. A goal of systems biology is to understand gene regulatory networks and infer them from data [2, 3]. A directed graph with vertices representing genes and signed edges representing gene-to-gene interactions, also known as a circuit model [4] or a logical model [5], is a model with a high level of abstraction (see S1 Appendix). The vertices of such graph models often consist only of the genes and not the properties of the derived proteins, because the latter information is usually not available. An ordinary differential equation (ODE) model is far more detailed than a graph model: it quantitatively describes the dynamics of the time-varying mRNA and protein concentrations of the genes, and can be used to capture complex effects, including protein–protein interaction, post-translational modification, environmental signals, diffusion of proteins in different parts of the cell, and various time constants. As a result, ascribing a directed graph to a biologically plausible gene regulatory network can miss important biological details and dynamics because of the abstraction. However, it is significantly more challenging to ascribe a particular ODE model to a gene regulatory network than to ascribe a directed graph, because an ODE model requires much finer classification and possibly orders of magnitude more data. As one example, the work [6] is notable for successful identification of an ODE model that captures the gene regulatory network underlying the dynamics of the circadian clock. The ODE model in [6] is based on a number of previous empirical and modeling studies, and it is shown that parameters for the model can be selected to give a good match to the data. In general, however, without such prior knowledge, the relation between the graph models and the ODE models is unclear. The purpose of this paper is to explore the connections between the two types of models.

We propose a property of the ODE models, called the constant sign property (CSP), such that an ODE model corresponds to a single graph model under a set of conditions if and only if the ODE model satisfies CSP under that set of conditions. An ODE model is said to satisfy the global constant sign property (GCSP) if it satisfies CSP under all conditions, in which case the ODE model corresponds to a single graph model. Typically, an ODE model corresponds to different graph models under different conditions, reflecting the context-dependent and time-varying nature of biological systems [7, 8]. An ODE model that does not satisfy GCSP is illustrated in Fig 1.

Fig 1 — The ODE model f governs the dynamics of all parts of the plant, and expression data collected from different parts of a plant (flower vs. leaf) can correspond to different graph models.

One particularly rich class of ODE models that satisfy GCSP is based on GeneNetWeaver [10, 11], the software used to generate expression data in DREAM challenges 3–5 [11–13] and recently applied to single-cell analysis [14, 15]. In these ODE models a layer of intermediate elements called modules is constructed, with transcription factors (TFs) as their input and target genes as their output. The activity level of a module depends on its input and its type, and determines the production rate of its output. The modules model the binding of protein complexes to DNA in transcriptional regulation. TFs can regulate the target gene through one or multiple modules. Assuming that for each TF and each target gene there is only one module that takes the TF as an input and the target gene as an output, we show that CSP is satisfied, so each GeneNetWeaver ODE model has a well-defined graph model associated with it. The combinatorial number of possible module configurations (i.e., the number of the modules and their input and output) and the continuous-valued parameters make the GeneNetWeaver ODE models extremely rich.

The organization of this paper is as follows. In the first subsection of the Materials and Methods section, we describe the ODE models and the graph models, and propose two notions of CSP. In the second subsection of the Materials and Methods section, we describe ODE models based on GeneNetWeaver. The Results section has three subsections. In the first, a relation of the two notions of CSP is provided. In the second, the GeneNetWeaver ODE models are shown to satisfy the constant sign property, and their complexity is investigated. In the third, a case study of a core soybean flowering network based on the literature is presented to demonstrate the use of the GeneNetWeaver ODE models. First it is illustrated that a single signed, directed graph model has a large space of consistent ODE models. Second, to study how different the GeneNetWeaver ODE models are, we explore the problem of numerically fitting parameters of one ODE model to synthetic expression data generated from another. The generalization, implication and limitation of CSP are discussed before the concluding remarks.

Materials and methods

ODE model and constant sign property

In this section we define the constant sign property, a property under which ODE models are consistent with signed directed graphs. Roughly speaking, CSP holds when unilaterally increasing the expression level of one gene causes the expression level of another gene to move in one direction. In other words, the effect of one regulator gene has a constant sign on a target gene. In rare cases, CSP may hold globally, regardless of the expression levels of all the genes and the concentrations of any other molecular classes. More generally, CSP may hold only for a set of expression levels and system parameters, leading to a local definition. We present the precise definition of CSP in this section.

Let x₁(t), x₂(t), …, x_n(t) be the mRNA abundances for the n genes (the observables) at time t. Let x_{n+1}(t), x_{n+2}(t), …, x_{n+m}(t) be the protein concentrations (the unobservables) at time t, which may include derived (protein complexes and modifications like protein phosphorylation) and localized (e.g., cytoplasmic and nuclear) proteins. Let x_{n+m+1}(t), x_{n+m+2}(t), …, x_{n+m+l}(t) be the strengths of the chemical and environmental signals (the controllables, e.g., temperature and photoperiod) at time t. Let x(t) = (x_i(t):i∈[n+m+l]) be the system state at time t, where [n] denotes the set of integers {1, 2, …, n}. Let $λ \in R^{s}$ be the parameters of the ODE model and let $f_{i} : R^{n + m + l} \times R^{s} \to R$ be the time derivative of x_i as a function of the (n + m + l)-dimensional system state and the parameters for i ∈ [n + m]. Note the domain of f_i is assumed to be the entire Euclidean space rather than a subset of it without loss of generality because one can always restrict f_i to a subset of states that x takes. Examples of f for the single-input case (n + m + l = 1) include the Michaelis–Menten kinetics and the more general Hill kinetics. Examples of f for the multi-input case (n + m + l ≥ 2) include the Shea–Ackers model [16, 17], which is the average production rate based on a Gibbs measure of the control states, and the GeneNetWeaver model to be discussed later in this paper, which models the additive effect of multiple intermediate Shea–Ackers type modules. Both the Shea–Ackers model and the GeneNetWeaver model generalize the Hill kinetics to multi-input scenarios in their own ways and are, among many other sophisticated ODE models, within the framework of ODE models in this paper.

Formally, given the numbers of molecular classes (i.e., n classes of mRNAs, m classes of proteins, and l classes of molecular signals), the dynamics of an ODE model are characterized by the collection of time derivatives for the uncontrollable variables f = (f_i:i∈[n + m]). In the rest of the paper an ODE model refers to the collection of the functions f. The trajectories of the mRNA and protein concentrations evolving with time depend on $(x^{0}, \tilde{x}, λ)$ , where $x^{0} = (x_{i}^{0} : i \in [n + m])$ are the initial conditions of the mRNAs and proteins at time 0, $\tilde{x} = ({\tilde{x}}_{i} (t) : n + m + 1 \leq i \leq n + m + l, t \geq 0)$ are the predefined external signal strengths for all time, and $λ \in R^{s}$ are the parameters. The trajectories can then be obtained by solving the following initial value problem.

\begin{matrix} x_{i} (0) = x_{i}^{0}, i \in [n + m], \end{matrix}

\begin{matrix} x_{i} (t) = {\tilde{x}}_{i} (t), n + m + 1 \leq i \leq n + m + l, t \geq 0, \end{matrix}

\begin{matrix} \frac{d x_{i} (t)}{d t} = f_{i} (x (t), λ), i \in [n + m] . \end{matrix}

Note the signals (x_i:n + m + 1≤i≤n + m + l) are exogenously controlled and not solved via the equations. In this paper we assume existence and uniqueness of the solution on the entire positive time horizon for ease of exposition. The concept of CSP can be easily generalized to ODE models where only local solutions exist.
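To make the initial value problem above concrete, the following is a minimal numerical sketch, not part of the original method. It assumes a forward-Euler discretization, and the function `simulate`, the toy model `toy_f`, and all numerical values are hypothetical illustrations. The exogenous signals are supplied as a function of time and are never integrated, mirroring the remark that they are not solved via the equations.

```python
def simulate(f, x0, signals, lam, t_end, dt=1e-3):
    """Forward-Euler solve of dx_i/dt = f_i(x, lam) for the internal variables.

    f       : callable (x, lam) -> list of n+m time derivatives
    x0      : initial values of the n+m internal variables
    signals : callable t -> list of l exogenous signal values (prescribed)
    """
    internal = list(x0)
    for step in range(int(round(t_end / dt))):
        # Full (n+m+l)-dimensional state: internal variables, then signals.
        x = internal + list(signals(step * dt))
        dx = f(x, lam)
        internal = [xi + dt * di for xi, di in zip(internal, dx)]
    return internal


# Hypothetical toy model with n = 1 mRNA, m = 1 protein, l = 1 signal.
def toy_f(x, lam):
    mrna, protein, signal = x
    production, decay = lam
    return [production * signal - decay * mrna,  # mRNA driven by the signal
            mrna - protein]                      # protein tracks the mRNA


final = simulate(toy_f, x0=[0.0, 0.0], signals=lambda t: [1.0],
                 lam=(2.0, 1.0), t_end=20.0)
```

With a constant signal this toy system relaxes toward the steady state where both variables equal production/decay. A stiff or large model would call for an adaptive ODE solver rather than fixed-step Euler.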

Infinitesimal monotonicity

We first define a version of monotonicity called infinitesimal monotonicity such that CSP using this definition of monotonicity can be applied to a broad class of ODE models.

Roughly speaking, infinitesimal monotonicity characterizes the monotone influence of one observed variable on another over a sufficiently short period of time. Such monotonicity depends on the current system state. For each regulator–target pair, to avoid external and indirect influence, we clamp the exogenous signals as well as the observed variables other than the target to their initial values, so only the unobserved variables and the target observed variable are allowed to change with time. The clamped value of the regulator can be perturbed. A change in the constant value of the regulator can cause a change in the target observed variable in continuous time, possibly through one or multiple unobserved variables. The system with the input at the regulator observable and output at the target observable is thus treated as a black box in the sense that one does not need to know its internal states (the unobservables) to determine the infinitesimal monotonicity of the system. This assumes that the initial internal states are fixed.

Given the ODE model f, and given a state $x \in R^{n + m + l}$ and parameters $λ \in R^{s}$ , let j be the target gene and let the dynamics of the clamped ODE model be driven by

\begin{matrix} {\hat{f}}_{k}^{(j)} = \begin{cases} f_{k} & \text{if } k \in \{j\} \cup [n + 1 : n + m], \\ 0 & \text{otherwise,} \end{cases} \end{matrix}

for any k ∈ [n + m + l]. Here [a: b] denotes the set of integers {a, a + 1, …, b}. Then ${\hat{f}}^{(j)} = ({\hat{f}}_{k}^{(j)} : k \in [n + m + l])$ determines the dynamics of a system where the mRNA abundances and exogenous signals remain constant across time except for the mRNA abundance of gene j. Fix a potential regulator gene i ≠ j and let $(η^{(j)} (t, h, x, λ) \in R^{n + m + l} : t \geq 0)$ be the solution of the initial value problem with initial condition (x_i + h, x_{−i}), dynamics ${\hat{f}}^{(j)}$ , and parameters λ, where (x_i + h, x_{−i}) denotes the state x with its i-th coordinate replaced by x_i + h. Note that η^{(j)} also includes the clamped exogenous signals. Also note that for any t we have

\begin{matrix} η_{k}^{(j)} (t, h, x, λ) \equiv x_{k} \text{ for } k \in [n] \setminus \{i, j\} \text{ and } k > n + m, \end{matrix}

\begin{matrix} η_{i}^{(j)} (t, h, x, λ) \equiv x_{i} + h, \end{matrix}

and

\begin{matrix} η_{j}^{(j)} (0, h, x, λ) = x_{j} . \end{matrix}

The following definition gives a precise characterization of the target gene expression to be strictly increasing or decreasing with respect to the regulator gene expression in a small future time period.

Definition 1 (Infinitesimal monotonicity). For an ODE model f at state x with parameters λ and (i, j)∈[n]² with i ≠ j, the infinitesimal monotonicity for i on j is given by

\begin{matrix} B_{inf} (i, j, x, λ) = \begin{cases} \emptyset & \text{if } \forall h \text{ and } \forall t, \ η_{j}^{(j)} (t, h, x, λ) = η_{j}^{(j)} (t, 0, x, λ), \\ \{1\} & \text{if } \exists ϵ > 0 \text{ such that } \forall t \in (0, ϵ) \text{ and } \forall h \in (- ϵ, 0) \cup (0, ϵ), \ \frac{η_{j}^{(j)} (t, h, x, λ) - η_{j}^{(j)} (t, 0, x, λ)}{h} > 0, \\ \{- 1\} & \text{if } \exists ϵ > 0 \text{ such that } \forall t \in (0, ϵ) \text{ and } \forall h \in (- ϵ, 0) \cup (0, ϵ), \ \frac{η_{j}^{(j)} (t, h, x, λ) - η_{j}^{(j)} (t, 0, x, λ)}{h} < 0, \\ \{1, - 1\} & \text{otherwise.} \end{cases} \end{matrix}

Equivalently, in less mathematical terms, B_inf(i, j, x, λ) = ∅ indicates gene i does not affect gene j at state x and parameters λ. The cases with B_inf(i, j, x, λ) = {1} and {−1} indicate gene i activates or represses gene j, respectively, at state x and parameters λ in a small time period with small perturbation. The case with B_inf(i, j, x, λ) = {1, −1} indicates gene i does not affect gene j in a monotone way.
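Definition 1 can be probed numerically, as in the hedged sketch below. It integrates the clamped system with forward Euler for a few short horizons t and a few small perturbations h, and classifies the signs of the difference quotients. This is only a finite sample of the "for all t and h" conditions, so it can suggest but never prove a value of B_inf. The helper names, the toy model `toy_f`, the sampled values of t and h, and the 0-based indexing are all assumptions made for illustration.

```python
def clamped_target(f, x, lam, i, j, unobserved, h, t, dt=1e-3):
    """Approximate eta_j(t, h, x, lam) for the system clamped for target j."""
    state = list(x)
    state[i] += h                      # perturbed regulator, then held constant
    free = [j] + list(unobserved)      # only these coordinates evolve
    for _ in range(int(round(t / dt))):
        dx = f(state, lam)             # derivatives at the current state
        for k in free:
            state[k] += dt * dx[k]
    return state[j]


def probe_monotonicity(f, x, lam, i, j, unobserved, eps=0.05,
                       times=(0.01, 0.02, 0.04)):
    """Sampled stand-in for B_inf(i, j, x, lam); a heuristic, not a proof."""
    signs = set()
    for t in times:
        base = clamped_target(f, x, lam, i, j, unobserved, 0.0, t)
        for h in (-eps, -eps / 2, eps / 2, eps):
            diff = clamped_target(f, x, lam, i, j, unobserved, h, t) - base
            quotient = diff / h
            signs.add((quotient > 0) - (quotient < 0))   # -1, 0 or 1
    if signs == {0}:
        return set()                   # no sampled effect
    if signs == {1} or signs == {-1}:
        return signs                   # consistent sign on all samples
    return {1, -1}                     # mixed samples: indeterminate


# Hypothetical model: index 0 = mRNA of gene A, 1 = mRNA of gene B,
# 2 = protein of gene A. lam > 0 means A activates B, otherwise A represses B.
def toy_f(x, lam):
    a_mrna, b_mrna, a_protein = x
    hill = a_protein / (1 + a_protein)
    regulated = hill if lam > 0 else 1 - hill
    return [-a_mrna, regulated - b_mrna, a_mrna - a_protein]


x0 = [1.0, 0.5, 1.0]
activation = probe_monotonicity(toy_f, x0, +1, i=0, j=1, unobserved=[2])
repression = probe_monotonicity(toy_f, x0, -1, i=0, j=1, unobserved=[2])
no_effect = probe_monotonicity(toy_f, x0, +1, i=1, j=0, unobserved=[2])
```

In this toy model the regulator acts on the target only through the protein, so the effect on the target appears at second order in t, which is why the probe integrates over several Euler steps rather than one.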

Remark 1. Note the case B_inf(i, j, x, λ) = {1, −1} can happen when the expression level of the target gene j reaches the maximum with respect to x_i, so that a change of x_i in either direction will cause the solution $η_{j}^{(j)} (t, h, x, λ)$ to decrease for small t, in which case the monotonicity is indeterminate (neither increasing nor decreasing).

In practice the values of x and λ may be unknown, so we are interested in how B_inf varies with x and λ. Usually we expect some level of continuity of B_inf with respect to x and λ, so the infinitesimal monotonicity of the ODE model may be consistent in a small set of (x, λ) pairs, denoted by S. In the case when S equals the entire state–parameter space, the infinitesimal monotonicity is consistent globally. The following definition generalizes Definition 1 by checking the consistency of infinitesimal monotonicity over a set S, and defines an associated graph.

Definition 2 (Infinitesimal gene regulatory graph). The infinitesimal gene regulatory graph of an ODE model f over $S \subseteq R^{n + m + l} \times R^{s}$ is given by a graph $([n], E_{inf} (S), B_{inf} (S))$ , where the set of edge labels B_inf(S) = (B_inf(i, j, S):(i, j)∈[n]², i≠j) is defined by

\begin{matrix} B_{inf} (i, j, S) = \bigcup_{(x, λ) \in S} B_{inf} (i, j, x, λ) \end{matrix}

and the set of edges is

\begin{matrix} E_{inf} (S) = \{(i, j) : B_{inf} (i, j, S) \neq \emptyset\} . \end{matrix}

Equivalently, in less mathematical terms, B_inf(i, j, S) = ∅ indicates gene i does not affect gene j when (x, λ) is in S. The case with B_inf(i, j, S) = {1} indicates gene i can increase gene j for some (x, λ) in S, but cannot decrease gene j for any (x, λ) in S. The case with B_inf(i, j, S) = {−1} indicates gene i can decrease gene j for some (x, λ) in S, but cannot increase gene j for any (x, λ) in S. The case with B_inf(i, j, S) = {1, −1} indicates the monotonicity is indeterminate over S.

Definition 3 (Infinitesimal constant sign property). An ODE model f satisfies the infinitesimal constant sign property over $S \subseteq R^{n + m + l} \times R^{s}$ if, for all $(i, j) \in E_{inf} (S)$ , either B_inf(i, j, S) = {1} or B_inf(i, j, S) = {−1}. In other words, the ODE model satisfies the infinitesimal constant sign property on S if no pair (i, j) has indeterminate monotonicity on S.
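Definitions 2 and 3 amount to a union of per-point edge labels followed by a check that every edge carries exactly one sign. The sketch below illustrates this on a finite sample of S. The function name `graph_over_sample` and the example labels are hypothetical, and a finite sample can only approximate the union over all of S.

```python
def graph_over_sample(labels_at_points):
    """Aggregate per-point labels into edge labels, edges, and a CSP flag.

    labels_at_points : list of dicts {(i, j): set of signs}, one dict per
                       sampled (x, lam) in S; each set is a subset of {1, -1}.
    """
    union = {}
    for labels in labels_at_points:
        for pair, signs in labels.items():
            union.setdefault(pair, set()).update(signs)
    edges = {pair for pair, signs in union.items() if signs}
    # CSP over the sample: every edge is labelled {1} or {-1}, never {1, -1}.
    csp = all(union[pair] in ({1}, {-1}) for pair in edges)
    return edges, union, csp


# Hypothetical labels sampled from two regions of the state space.
flower_points = [{(0, 1): {1}, (1, 0): set()},
                 {(0, 1): {1}, (1, 0): {-1}}]
leaf_points = [{(0, 1): {-1}, (1, 0): {-1}}]

flower_edges, flower_labels, flower_csp = graph_over_sample(flower_points)
_, mixed_labels, mixed_csp = graph_over_sample(flower_points + leaf_points)
```

Here the flower sample alone yields a single signed graph, while pooling the flower and leaf samples gives the pair (0, 1) both signs, so the pooled set fails CSP even though each region may satisfy it separately.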

Remark 2. The set S represents the set of states where the infinitesimal CSP holds. If S is the entire state space then we say the infinitesimal CSP holds globally. Complex biological systems usually do not satisfy CSP globally, but may satisfy CSP locally over the set S where the system states reside. For example, in Fig 1, the gene expressions in the flowers may be contained in set S₁ where the infinitesimal CSP is satisfied with a gene regulatory graph G₁, while the gene expressions in the leaves may be contained in set S₂ that does not intersect with S₁, and the infinitesimal CSP is satisfied with a different gene regulatory graph G₂.

Sum–product monotonicity

Infinitesimal monotonicity gives a natural notion of monotonicity, but it is expressed in terms of the solutions of the differential equations, and solving the differential equations can be analytically challenging and numerically unstable. Hence, in this section we focus on ODE models with a smooth f and propose another notion of monotonicity that does not require solving the system of ODEs.

Definition 4 (Molecular graph). The molecular graph of an ODE model is a graph whose vertices are the internal molecular classes (i.e., the observables and the unobservables) and whose edges indicate non-constant effects among the internal molecular classes with signs indicating monotonicity of the effects. Formally, given an ODE model f, the molecular graph at state $x \in R^{n + m + l}$ with parameters $λ \in R^{s}$ is a directed graph with vertices [n + m] and edges $E_{mol}$ , where

\begin{matrix} E_{mol} = \{(i, j) \in {[n + m]}^{2} : \text{there exist } x \in R^{n + m + l}, \ λ \in R^{s}, \text{ and } x_{i}^{'} \in R \text{ such that } f_{j} (x, λ) \neq f_{j} ((x_{i}^{'}, x_{- i}), λ)\} . \end{matrix}

In other words $(i, j) \notin E_{mol}$ if f_j does not actually depend on x_i. See Fig 2(A) for an example of a molecular graph. Note in general we could have edges from unobservables to unobservables (e.g., protein–protein interactions) and from observables to observables (modeling fast translation where mRNA abundances and protein concentrations are considered the same).

The molecular graph represents the interactions among all the molecular classes. However, usually only the mRNA abundances are measured; the proteins and their derived products are not measured, making the molecular graph only partially observed. As a result, one often seeks an induced graph on the mRNA classes, which leads to the following definitions analogous to the clamped systems for infinitesimal monotonicity.

Definition 5 (Unobserved path of length q for q ≥ 1). Given a molecular graph, the set of unobserved paths from one mRNA to another is the set of paths that do not go through another mRNA. Formally, given n, m, l, and edges $E_{mol} \subseteq {[n + m]}^{2}$ and i, j ∈ [n] with i ≠ j, the set of unobserved paths of length q from i to j is

\begin{matrix} P_{i j}^{q} = \{(r_{0}, r_{1}, \dots, r_{q}) \in {[n + m]}^{q + 1} : r_{q} = i, \ r_{0} = j, \text{ and } \forall k \in [1 : q - 1], \ r_{k} \in [n + 1 : n + m], \text{ and } \forall k \in [q], \ (r_{k}, r_{k - 1}) \in E_{mol}\} . \end{matrix}

Definition 6 (Molecular distance). The molecular distance from i to j is

\begin{matrix} q_{i j}^{*} = \begin{cases} \min \{q : P_{i j}^{q} \neq \emptyset\} & \text{if } P_{i j}^{q} \neq \emptyset \text{ for some } q, \\ \infty & \text{otherwise.} \end{cases} \end{matrix}
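Definitions 5 and 6 can be illustrated with a small enumeration, sketched below under stated assumptions: vertices are 0-based, with observables 0, …, n−1 and unobservables n, …, n+m−1, and the function names and the example edge set are hypothetical. Paths are returned in the order used in Definition 5, with the target first and the regulator last.

```python
def unobserved_paths(edges, n, i, j, q):
    """All (r_0, ..., r_q) with r_q = i, r_0 = j and unobserved interior."""
    successors = {}
    for a, b in edges:                    # edge (a, b) means a affects b
        successors.setdefault(a, []).append(b)
    walks = [(i,)]                        # grown forward from the regulator
    for step in range(1, q + 1):
        last = step == q
        extended = []
        for walk in walks:
            for b in successors.get(walk[-1], []):
                # Interior vertices must be unobserved; the end must be j.
                if (last and b == j) or (not last and b >= n):
                    extended.append(walk + (b,))
        walks = extended
    return [tuple(reversed(walk)) for walk in walks]


def molecular_distance(edges, n, m, i, j):
    """Smallest q with a nonempty path set, or infinity if none exists."""
    # A shortest path visits each unobserved vertex at most once.
    for q in range(1, m + 2):
        if unobserved_paths(edges, n, i, j, q):
            return q
    return float("inf")


# Hypothetical molecular graph: mRNAs 0 and 1, proteins 2 and 3.
example_edges = [(0, 2), (2, 1), (2, 3), (3, 1)]
paths_len2 = unobserved_paths(example_edges, n=2, i=0, j=1, q=2)
paths_len3 = unobserved_paths(example_edges, n=2, i=0, j=1, q=3)
dist_01 = molecular_distance(example_edges, n=2, m=2, i=0, j=1)
dist_10 = molecular_distance(example_edges, n=2, m=2, i=1, j=0)
```

The enumeration grows walks one edge at a time, which is exponential in the worst case; for large graphs a breadth-first search would be the more economical way to obtain the distance alone.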

Definition 7 (Sum–product monotonicity). For genes i and j, state x and parameters λ, the sum–product monotonicity is defined by

\begin{matrix} B_{sum} (i, j, x, λ) = \begin{cases} \emptyset & \text{if } q_{i j}^{*} = \infty, \\ \{1\} & \text{if } q_{i j}^{*} < \infty \text{ and } Δ (i, j, x, λ) > 0, \\ \{- 1\} & \text{if } q_{i j}^{*} < \infty \text{ and } Δ (i, j, x, λ) < 0, \\ \{1, - 1\} & \text{if } q_{i j}^{*} < \infty \text{ and } Δ (i, j, x, λ) = 0, \end{cases} \end{matrix}

where $Δ (i, j, x, λ) ≜ \sum_{r \in P_{i j}^{q_{i j}^{*}}} \prod_{l = 1}^{q_{i j}^{*}} \partial_{r_{l}} f_{r_{l - 1}} (x, λ)$ .

Note that B_sum is based only on derivatives of f and does not require solving the ODEs. It plays a role similar to that of B_inf. Thus we can define the sum–product gene regulatory graph and the sum–product constant sign property in the same way as in Definitions 2 and 3. A relation between the infinitesimal monotonicity and the sum–product monotonicity is given in Proposition 1 in the Results section.
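As a hedged illustration of Definition 7, the sketch below evaluates Δ from a given list of shortest unobserved paths. Two choices here are assumptions rather than part of the definition: partial derivatives are estimated by central finite differences instead of analytically, and the case Δ = 0 is detected with a small tolerance `tol`. The toy model and 0-based indexing are also hypothetical.

```python
def partial(f, x, lam, target, wrt, step=1e-6):
    """Central-difference estimate of d f_target / d x_wrt at (x, lam)."""
    up, down = list(x), list(x)
    up[wrt] += step
    down[wrt] -= step
    return (f(up, lam)[target] - f(down, lam)[target]) / (2 * step)


def sum_product(f, x, lam, shortest_paths, tol=1e-8):
    """Return (Delta, B_sum) from the unobserved paths of minimal length.

    Each path is (r_0, ..., r_q) with r_0 the target and r_q the regulator.
    """
    if not shortest_paths:                # infinite molecular distance
        return None, set()
    delta = 0.0
    for r in shortest_paths:
        term = 1.0
        for pos in range(1, len(r)):      # one factor per edge of the path
            term *= partial(f, x, lam, target=r[pos - 1], wrt=r[pos])
        delta += term
    if delta > tol:
        return delta, {1}
    if delta < -tol:
        return delta, {-1}
    return delta, {1, -1}                 # Delta numerically zero


# Hypothetical model: index 0 = mRNA A, 1 = mRNA B, 2 = protein A.
# lam > 0 means protein A activates gene B, otherwise it represses gene B.
def toy_f(x, lam):
    a_mrna, b_mrna, a_protein = x
    hill = a_protein / (1 + a_protein)
    regulated = hill if lam > 0 else 1 - hill
    return [-a_mrna, regulated - b_mrna, a_mrna - a_protein]


x0 = [1.0, 0.5, 1.0]
path = [(1, 2, 0)]                        # target 1, via protein 2, from 0
delta_act, sign_act = sum_product(toy_f, x0, +1, path)
delta_rep, sign_rep = sum_product(toy_f, x0, -1, path)
no_path = sum_product(toy_f, x0, +1, [])
```

For the activating case the single path contributes the derivative of the Hill term at protein level 1, which is 1/4, times the unit translation derivative.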

GeneNetWeaver ODE model

We consider a differential equation model such that transcription factors participate in modules which bind to the promoter regions of a given target gene. This model is based on the GeneNetWeaver software version 3 [10]. Part of the model of the popular simulator is described in the studies [12] and [11], but there is no good reference that precisely describes the model. So in this section we describe the generative model in GeneNetWeaver based on a given directed graph, and show in the next section that the CSP is satisfied. Note that GeneNetWeaver models are a special class of ODE models with the molecular graphs being bipartite, resulting in no unobserved paths of length greater than 2, unlike the general case as illustrated in Fig 2. GeneNetWeaver allows fast protein–protein interactions through the f function, but does not characterize slow protein–protein interactions or external signals.

The model in GeneNetWeaver is based on standard modeling assumptions (see [19]) including statistical thermodynamics, as described in the study [20]. The activity level of the promoter of a gene is controlled by one or more cis-regulatory modules, which for brevity we refer to as modules. A module can be either an enhancer or a silencer. Each module has one or more transcription factors as activators, and possibly one or more TFs as deactivators. For each target gene, a number of modules are associated with its TFs such that each TF is an input of one of the modules. For simplicity assume that each module regulates only a single target gene.

Let $([n], E, b)$ be a directed signed graph with vertices [n], edge set $E$ , and edge signs b. For target gene j, let $N_{j} ≜ \{i \in [n] : (i, j) \in E\}$ be the set of its TFs and let $S_{j} \subseteq P (N_{j})$ be a partition of N_j according to the input of the modules. Then the modules for target gene j can be indexed by the tuple (K, j) (denoted by K: j in the subscripts), where $K \in S_{j}$ . Note that each TF regulates the target gene j through only one module. The random model for assignment of the TFs to modules and of the parameters in GeneNetWeaver is summarized in S2 Appendix. Let the sets of activators and deactivators for module K: j be A_{K: j} and D_{K: j} with A_{K: j}∪D_{K: j} = K and A_{K: j}∩D_{K: j} = ∅. For a module K: j, let c_{K: j} be the type (1 for enhancer and −1 for silencer) and r_{K: j} the mode (1 for synergistic binding and 0 for independent binding). Note that r_{K: j} only matters for multi-input modules (i.e., those with |K|>1). Let β_{K: j} ≥ 0 be the absolute effect of module K: j on gene j in mRNA production rate. Note that by the construction in S2 Appendix, it is guaranteed that $b_{i j} = c_{K : j} (1_{\{i \in A_{K : j}\}} - 1_{\{i \in D_{K : j}\}})$ for each i ∈ K.

Let x_i(t) and y_i(t) be the mRNA and protein concentrations for gene i at time t. We ignore t in the remainder of the paper for simplicity. The dynamics are given by

\begin{matrix} \frac{d x_{i}}{d t} = f_{i} (y) - δ_{i} x_{i} \end{matrix}

and

\begin{matrix} \frac{d y_{i}}{d t} = f_{i}^{(p)} (x_{i}) - δ_{i}^{(p)} y_{i}, \end{matrix}

where f_i(y) is the relative activation rate for gene i (i.e. the mRNA production rate for gene i for the normalized variables) discussed in the next two subsections, $f_{i}^{(p)} (x_{i}) = ρ_{i} x_{i}$ is the translation rate of protein i, and δ_i and $δ_{i}^{(p)}$ are the degradation rates of the mRNA and the protein. Because only x is observed in RNA-seq experiments, without loss of generality the unit of the unobserved protein concentrations can be chosen such that $ρ_{i} = δ_{i}^{(p)}$ for all i (see nondimensionalization in the study [12]). Note the GeneNetWeaver model is a special ODE model with m = n and l = 0.
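The two coupled equations can be stepped numerically as in the sketch below, which is an illustration under assumptions and not the GeneNetWeaver implementation. It uses forward Euler, sets ρ_i equal to the protein degradation rate as in the normalization above, and takes the production function from a hypothetical two-gene example in which gene 0 is produced at a constant rate and activates gene 1 through a single-input Hill term with k = 1 and h = 2.

```python
def gnw_step(x, y, production, delta, delta_p, dt):
    """One forward-Euler step of the mRNA (x) and protein (y) dynamics."""
    rates = production(y)                 # f_i(y) for every gene i
    new_x = [xi + dt * (ri - di * xi)
             for xi, ri, di in zip(x, rates, delta)]
    # With rho_i = delta_p_i the protein equation is delta_p_i * (x_i - y_i).
    new_y = [yi + dt * dpi * (xi - yi)
             for xi, yi, dpi in zip(x, y, delta_p)]
    return new_x, new_y


# Hypothetical production function for two genes.
def toy_production(y):
    nu = y[0] ** 2                        # (y_0 / k)^h with k = 1, h = 2
    return [0.8, nu / (1 + nu)]


x, y = [0.0, 0.0], [0.0, 0.0]
for _ in range(4000):                     # 40 time units with dt = 0.01
    x, y = gnw_step(x, y, toy_production,
                    delta=[1.0, 1.0], delta_p=[1.0, 1.0], dt=0.01)
```

At the fixed point each protein level equals its mRNA level and each mRNA level equals its production rate divided by its degradation rate, which gives a simple sanity check for the simulated values.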

Activity level of a single module

For edge (i, j), the normalized expression level of gene i, ν_ij, is defined by

\begin{matrix} ν_{i j} = {(\frac{y_{i}}{k_{i j}})}^{h_{i j}}, \end{matrix}

where k_ij is the Michaelis–Menten normalizing constant and h_ij is a small positive integer, the Hill constant, representing the number of copies of the TF i that need to bind to the promoter region of gene j to activate the gene. (If gene i is not bound to the promoter region of gene j, it is like taking the Hill constant equal to zero and thus normalized expression level equal to one.) The activity level of module K: j denoted by M_{K: j}, which is the probability that module K: j is active, is given in the following three cases.

Type 1 modules: Input TFs bind to module independently

In this case, r_{K: j} = 0, and we have

\begin{matrix} M_{K : j} = (\prod_{i \in A_{K : j}} \frac{ν_{i j}}{1 + ν_{i j}}) (\prod_{i \in D_{K : j}} \frac{1}{1 + ν_{i j}}) . \end{matrix}

Interpreting each fraction as the probability that an activator is actively bound (or a deactivator is not bound), the activation M_{K: j} is the probability that all the inputs of module K: j are working together to activate the module, i.e., the probability that the module is active. It is assumed that for a module to be active, all the activators must be bound and all the deactivators must be unbound, and all the bindings happen independently.

One can think of module K: j as a system with 2^{|A_{K: j}|+ |D_{K: j}|} possible states of the inputs. Suppose each input i binds with rate ν_ij and unbinds with rate 1 independently. Then the stationary probability of the state that all the activators are bound and none of the deactivators is bound is M_{K: j}.

Alternatively, one can assign additive energy of

\begin{matrix} E_{i j} & = - \log ν_{i j} \\ & = - h_{i j} \log \frac{y_{i}}{k_{i j}} \end{matrix}

to each bound input gene i and energy zero to each unbound gene. Then M_{K: j} is the probability that all activators are bound and none of the deactivators is bound in the Gibbs measure. In other words, the Type 1 modules are Shea–Ackers models with all binding states possible and only the one state with all the activators initiating transcription.
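The Gibbs interpretation can be checked by brute force, as in the following sketch. It enumerates all binding states, weights each state by exp(−energy) with energy −log ν per bound input, and compares the probability of the single active state with the Type 1 product formula. The function names and the numerical values of ν are hypothetical.

```python
import math
from itertools import product


def type1_activity(nu_act, nu_deact):
    """Product formula: activators bound, deactivators unbound."""
    m = 1.0
    for v in nu_act:
        m *= v / (1 + v)
    for v in nu_deact:
        m *= 1 / (1 + v)
    return m


def gibbs_activity(nu_act, nu_deact):
    """Probability of the active state under the Gibbs measure."""
    nus = list(nu_act) + list(nu_deact)
    energies = [-math.log(v) for v in nus]          # energy of a bound input
    active = tuple([1] * len(nu_act) + [0] * len(nu_deact))
    partition, active_weight = 0.0, 0.0
    for bound in product((0, 1), repeat=len(nus)):  # all binding states
        energy = sum(e for e, b in zip(energies, bound) if b)
        weight = math.exp(-energy)
        partition += weight
        if bound == active:
            active_weight = weight
    return active_weight / partition


nu_activators, nu_deactivators = [2.0, 0.5], [3.0]
from_formula = type1_activity(nu_activators, nu_deactivators)
from_gibbs = gibbs_activity(nu_activators, nu_deactivators)
```

The two values agree because the partition function factorizes into the product of (1 + ν) over all inputs when the binding energies are additive.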

Type 2 modules: TFs are all activators and bind to module as a complex

In this case, D_{K: j} = ∅, r_{K: j} = 1, and we have

\begin{matrix} M_{K : j} = \frac{\prod_{i \in A_{K : j}} ν_{i j}}{1 + \prod_{i \in A_{K : j}} ν_{i j}} . \end{matrix}

One can think of such a module as a system with only two states: bound by the activator complex, or unbound. The transition rate from unbound to bound is $\prod_{i \in A_{K : j}} ν_{i j}$ , and that from bound to unbound is 1. Then the activation of the module is the probability of the bound state in the stationary distribution, given by M_{K: j}.

Alternatively, this corresponds to the Shea–Ackers model as in the previous case, except all the states other than fully unbound and fully bound are unstable (i.e. have infinite energy).

Type 3 modules: Some TFs are deactivators and bind to module as a complex

In this case, D_{K: j} ≠ ∅ and r_{K: j} = 1, and we have

\begin{matrix} M_{K : j} = \frac{\prod_{i \in A_{K : j}} ν_{i j}}{1 + \prod_{i \in A_{K : j}} ν_{i j} + (\prod_{i \in A_{K : j}} ν_{i j}) (\prod_{i \in D_{K : j}} ν_{i j})} . \end{matrix}

(1)

In this case the system can be in one of three states: unbound, bound by the activator complex, and bound by the deactivated (activator) complex. The Gibbs measure in the Shea–Ackers model for Type 3 modules with three stable states (i.e. have finite energy) assigns probability M_{K: j} to the activated state.

Note that if the product over the empty set, ∏_{i ∈ ∅} ν_ij, is understood to be 0, then Eq (1) reduces to the Type 2 formula when D_{K: j} = ∅. Historically, however, $\prod_{i \in D_{K : j}} ν_{i j}$ was evaluated as 1 (the usual empty-product convention) in an early version of GeneNetWeaver, which caused a bug that produced incorrect Type 2 modules.

Remark 3. Presumably it is possible for there to be more than three stable states for a module, so additional types of modules could arise, but for simplicity, following GeneNetWeaver, we assume at least one of the three cases above holds.

Remark 4. If a module K: j has only one input i (i.e. K = {i}) then the module is type 1 and $M_{K : j} = \frac{ν_{i j}}{1 + ν_{i j}}$ or $M_{K : j} = \frac{1}{1 + ν_{i j}}$ . We will see later in the random model of GeneNetWeaver that only the former (single activator) is allowed.

GeneNetWeaver software uses the 3 types of modules derived above. In all three cases the activation M_{K: j} is monotonically increasing in y_i for activators i ∈ A_{K: j}, and monotonically decreasing in y_i for deactivators i ∈ D_{K: j}.
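The three module types can be collected into one function, sketched below as an illustration rather than a transcription of the GeneNetWeaver code. The names `nu` and `module_activity` and the sample values are hypothetical. Type 2 is handled in its own branch because Python's `math.prod` returns 1 for an empty sequence, which is exactly the convention that must be avoided when D_{K: j} is empty.

```python
import math


def nu(y, k, h):
    """Normalized expression level (y / k)^h of one input TF."""
    return (y / k) ** h


def module_activity(nu_act, nu_deact, synergistic):
    """Activity M of a module from the normalized levels of its inputs."""
    if not synergistic:                       # Type 1: independent binding
        m = 1.0
        for v in nu_act:
            m *= v / (1 + v)
        for v in nu_deact:
            m *= 1 / (1 + v)
        return m
    prod_act = math.prod(nu_act)
    if not nu_deact:                          # Type 2: activator complex only
        return prod_act / (1 + prod_act)
    prod_deact = math.prod(nu_deact)          # Type 3: Eq (1)
    return prod_act / (1 + prod_act + prod_act * prod_deact)


type1 = module_activity([1.0], [1.0], synergistic=False)
type2 = module_activity([2.0, 0.5], [], synergistic=True)
type3 = module_activity([nu(2.0, 1.0, 1)], [3.0], synergistic=True)

# Spot checks of the stated monotonicity for a Type 3 module.
more_activator = module_activity([2.5], [3.0], synergistic=True)
more_deactivator = module_activity([2.0], [4.0], synergistic=True)
```

The last two values probe monotonicity at single points only; they are consistent with the activity increasing in an activator and decreasing in a deactivator, but they do not replace the analytic argument.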

Production rate as a function of multiple module activations

The relative activation of gene j as a function of the protein concentrations y is

\begin{matrix} f_{j} (y) = \sum_{s \in \{0, 1\}^{S_{j}}} α_{j, s} (\prod_{K \in S_{j} : s_{K} = 1} M_{K : j}) (\prod_{K \in S_{j} : s_{K} = 0} (1 - M_{K : j})), \end{matrix}

(2)

where α_{j,s} is the relative activation of the promoter under the module configuration s. Note that α in Eq (2) gives $2^{| S_{j} |}$ degrees of freedom, one for every possible subset of the modules being active. However, following the GeneNetWeaver computer code [10], we assume that the interaction among the modules is linear, meaning that for some choice of α_{j,basal}, $(c_{K : j} : K \in S_{j})$ , and $(β_{K : j} : K \in S_{j})$ , we have for any configuration $s \in \{0, 1\}^{S_{j}}$ ,

\begin{matrix} α_{j, s} = α_{j, basal} + \sum_{K \in S_{j} : s_{K} = 1} c_{K : j} β_{K : j}, \end{matrix}

(3)

This reduces the number of degrees of freedom for α to $| S_{j} | + 1$ . Then, combining Eqs (2) and (3) yields

\begin{matrix} f_{j} (y) & = E [α_{j, S}] \\ & = α_{j, basal} + \sum_{K \in S_{j}} c_{K : j} β_{K : j} E [S_{K}] \\ & = α_{j, basal} + \sum_{K \in S_{j}} c_{K : j} β_{K : j} M_{K : j}, \end{matrix}

(4)

where S is distributed by the product distribution of the Bernoulli distributions with means $(M_{K : j} : K \in S_{j})$ . So the relative activation, or the mRNA production rate, of a gene is given by the basal activation plus the inner product of the module effects and the module activations. We also note that Eq (4) does not actually require the module states to be statistically independent: by linearity of expectation, all we need to know to compute the relative activation of a gene are the marginal activation probabilities of the individual modules.
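The reduction from Eq (2) to Eq (4) can be confirmed numerically for a small case. The sketch below evaluates Eq (2) by enumerating all module configurations with the linear α of Eq (3), and compares the result with the closed form of Eq (4). The function names and the parameter values are hypothetical.

```python
from itertools import product


def production_eq2(alpha, activities):
    """Eq (2): expectation of alpha(s) over independent module states s."""
    total = 0.0
    for s in product((0, 1), repeat=len(activities)):
        prob = 1.0
        for s_k, m_k in zip(s, activities):
            prob *= m_k if s_k else 1 - m_k
        total += alpha(s) * prob
    return total


def linear_alpha(basal, c, beta):
    """Eq (3): basal level plus the signed effects of the active modules."""
    return lambda s: basal + sum(c_k * b_k
                                 for s_k, c_k, b_k in zip(s, c, beta) if s_k)


def production_eq4(basal, c, beta, activities):
    """Eq (4): closed form that needs only the marginal activities."""
    return basal + sum(c_k * b_k * m_k
                       for c_k, b_k, m_k in zip(c, beta, activities))


basal, c, beta = 0.2, (1, -1), (0.5, 0.1)     # one enhancer, one silencer
activities = (0.3, 0.6)
via_eq2 = production_eq2(linear_alpha(basal, c, beta), activities)
via_eq4 = production_eq4(basal, c, beta, activities)
```

Eq (2) costs a sum over 2^{|S_j|} configurations, whereas Eq (4) is linear in the number of modules, which is the practical benefit of the linearity assumption.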

Taking into account the three different types of modules described in the previous section on activity level of a single module, Eq (4) yields the following expression for the relative activation of gene j:

\begin{matrix} f_{j} (y) & = α_{j, basal} + \sum_{K : r_{K : j} = 0} c_{K : j} β_{K : j} (\prod_{i \in A_{K : j}} \frac{ν_{i j}}{1 + ν_{i j}}) (\prod_{i \in D_{K : j}} \frac{1}{1 + ν_{i j}}) \\ & + \sum_{\begin{matrix} K : r_{K : j} = 1 \\ D_{K : j} = \emptyset \end{matrix}} c_{K : j} β_{K : j} \frac{\prod_{i \in A_{K : j}} ν_{i j}}{1 + \prod_{i \in A_{K : j}} ν_{i j}} \\ & + \sum_{\begin{matrix} K : r_{K : j} = 1 \\ D_{K : j} \neq \emptyset \end{matrix}} c_{K : j} β_{K : j} \frac{\prod_{i \in A_{K : j}} ν_{i j}}{1 + \prod_{i \in A_{K : j}} ν_{i j} + (\prod_{i \in A_{K : j}} ν_{i j}) (\prod_{i \in D_{K : j}} ν_{i j})} . \end{matrix}

(5)

As we will see in the Results section, f satisfies the CSP. Note that in the actual GeneNetWeaver source code every α_j,s is truncated to the interval [0, 1]:

\begin{matrix} α_{j, s} = {[α_{j, basal} + \sum_{K \in S_{j} : s_{K} = 1} c_{K : j} β_{K : j}]}_{0}^{1}, \end{matrix}

where ${[x]}_{0}^{1} = max {min {x, 1}, 0}$ is the projection of x to the [0, 1] interval. Then the relative activation in each state may not be linear in the individual module effects. In that case one has to resort to Eq (2) instead of Eq (5) for computing the mRNA production rate. The resulting truncated model does not necessarily satisfy the CSP because f_j may not be monotone in M_{K: j} in Eq (2).
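To see why the truncated model has to fall back on Eq (2), the sketch below evaluates Eq (2) with clipped α values and compares it with the untruncated linear expression of Eq (4). The two-enhancer example and all numbers are hypothetical; they are chosen only so that the configuration with both modules active exceeds 1 before clipping.

```python
from itertools import product


def clip01(x):
    """Projection of x onto the interval [0, 1]."""
    return max(min(x, 1.0), 0.0)


def f_truncated(alpha_basal, c, beta, M):
    """Eq (2) with every alpha_{j,s} clipped to [0, 1]."""
    total = 0.0
    for s in product((0, 1), repeat=len(M)):
        raw = alpha_basal + sum(ck * bk for ck, bk, sk in zip(c, beta, s) if sk)
        weight = 1.0
        for mk, sk in zip(M, s):
            weight *= mk if sk else 1 - mk
        total += clip01(raw) * weight
    return total


def f_linear(alpha_basal, c, beta, M):
    """Eq (4), valid only without truncation."""
    return alpha_basal + sum(ck * bk * mk for ck, bk, mk in zip(c, beta, M))


# Hypothetical gene with two enhancer modules; 0.5 + 0.4 + 0.4 > 1 triggers clipping.
args = (0.5, [1, 1], [0.4, 0.4], [0.9, 0.9])
print(f_truncated(*args), f_linear(*args))
```

In this example the truncated production rate stays within [0, 1] while the linear expression does not, so the two formulas disagree as soon as any configuration is clipped.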

Results

A relation between infinitesimal monotonicity and sum–product monotonicity

The following result establishes the equivalence of the two notions of monotonicity for ODE models that satisfy the sum–product CSP. So if the sum–product CSP holds, we do not need to distinguish between the sum–product CSP and the infinitesimal CSP. Consequently, given an ODE model, one can easily find the corresponding graph models for different system parameters, external signals, and internal states by calculating the sum products of the first-order partial derivatives of the input function f.

Proposition 1. If f is smooth and satisfies the sum–product CSP over $S \subseteq R^{n + m + l} \times R^{s}$ , then it also satisfies the infinitesimal CSP over S, and the sum–product gene regulatory graph and the infinitesimal gene regulatory graph are the same.

Proof. It suffices to show B_sum(i, j, x, λ) = B_inf(i, j, x, λ) if B_sum(i, j, x, λ)≠{1, −1} for any (x, λ)∈S. For fixed i, j, x, λ, let $η (t, h) ≜ η^{(j)} (t, h, x, λ)$ be the solution of the clamped initial value problem at time t with initial condition η(0, h) = (x_i + h, x_−i). We are interested in the sign of

\begin{matrix} g (t, h) ≜ η_{j} (t, h) - η_{j} (t, 0) . \end{matrix}

If $q_{i j}^{*} = \infty$ then we readily have B_sum(i, j, x, λ) = B_inf(i, j, x, λ) = ∅. Suppose $q_{i j}^{*} = q < \infty$ . Then by Corollary 4.1 in Section 5 of [21] (page 101), f being smooth implies g is also smooth, and we can show that (see the proof in S3 Appendix)

\begin{matrix} \partial_{t^{a} h^{b}} g (0, 0) = \begin{cases} Δ (i, j, x, λ) & \text{if } (a, b) = (q, 1), \\ 0 & \text{if } 0 \leq a \leq q - 1 \text{ or } b = 0 . \end{cases} \end{matrix}

(6)

Hence by the multivariate Taylor’s theorem (see, e.g., [22])

\begin{matrix} g (t, h) & = g (0, 0) + g^{'} (0, 0) (t, h) + \frac{1}{2} g^{(2)} (0, 0) {(t, h)}^{2} + \dots \\ & \quad + \frac{1}{(q + 1)!} g^{(q + 1)} (0, 0) {(t, h)}^{q + 1} + o ({| t |}^{q + 1} + {| h |}^{q + 1}) \\ & = 0 + 0 + \dots + 0 + \frac{1}{(q + 1)!} (\frac{\partial^{q + 1} g}{\partial t^{q + 1}} (0, 0) t^{q + 1} + \binom{q + 1}{1} \frac{\partial^{q + 1} g}{\partial t^{q} \partial h} (0, 0) t^{q} h \\ & \quad + \dots + \frac{\partial^{q + 1} g}{\partial h^{q + 1}} (0, 0) h^{q + 1}) + o ({| t |}^{q + 1} + {| h |}^{q + 1}) \\ & = \frac{1}{q!} Δ (i, j, x, λ) t^{q} h + o ({| t |}^{q + 1} + {| h |}^{q + 1}) \end{matrix}

as (t, h) → (0, 0). So g(t, h) has the same sign as Δ(i, j, x, λ)t^q h in a sufficiently small neighborhood of (0, 0). Hence B_sum(i, j, x, λ) = B_inf(i, j, x, λ).

Remark 5. If multiple ODE models satisfy CSP with the same gene regulatory graph, then they can be combined into a single ODE model with a different parameterization so that the combined ODE model still satisfies CSP with the same gene regulatory graph. For example, ODE models for different environmental temperatures can be considered either different models or a single unified model with a temperature parameter. Then the temperature-specific models satisfy CSP with the same gene regulatory graph if and only if the unified model satisfies CSP for all temperatures.

Remark 6. The effect of a gene on itself can be either autoregulation or degradation. The two effects can be distinguished with the molecular graph: a self-loop with negative derivative indicates degradation, and a loop of multiple hops indicates autoregulation. The infinitesimal monotonicity does not distinguish the two effects.

The following is an example of an ODE model that does not satisfy CSP globally, based on the interactions among FT, TFL1, FD, and LFY genes in the study [9].

Example 1. Consider a four-gene ODE model with the following dynamics for gene 4.

\begin{matrix} {\dot{x}}_{4} & = f_{4} (x_{1}, x_{2}, x_{3}) \\ ≜ \frac{x_{1} x_{3}}{λ_{1} + x_{1} x_{3}} \frac{λ_{2}}{λ_{2} + x_{2} x_{3}}, \end{matrix}

where we use x for both the mRNA and protein concentrations. The biological meaning could be that genes 1 and 3 form a protein complex that activates gene 4, while genes 2 and 3 form a protein complex that represses gene 4. The effect of gene 3 on gene 4 then does not satisfy the CSP globally. Indeed, one can check that

\begin{matrix} \partial_{3} f_{4} = \frac{x_{1} λ_{2}}{{(λ_{1} + x_{1} x_{3})}^{2} {(λ_{2} + x_{2} x_{3})}^{2}} (λ_{1} λ_{2} - x_{1} x_{2} x_{3}^{2}) . \end{matrix}

So gene 3 activates gene 4 if $λ_{1} λ_{2} > x_{1} x_{2} x_{3}^{2}$ , and represses gene 4 if $λ_{1} λ_{2} < x_{1} x_{2} x_{3}^{2}$ .
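The sign change in Example 1 can be confirmed numerically. The sketch below compares the closed-form partial derivative with a central finite difference and evaluates its sign on both sides of the threshold; the parameter values (all equal to 1, so that the threshold is x₃ = 1) are chosen purely for illustration.

```python
def f4(x1, x2, x3, lam1, lam2):
    """Production rate of gene 4 in Example 1."""
    return x1 * x3 / (lam1 + x1 * x3) * lam2 / (lam2 + x2 * x3)


def d3_f4_closed_form(x1, x2, x3, lam1, lam2):
    """Partial derivative of f4 with respect to x3, as stated in the text."""
    num = x1 * lam2 * (lam1 * lam2 - x1 * x2 * x3 ** 2)
    den = (lam1 + x1 * x3) ** 2 * (lam2 + x2 * x3) ** 2
    return num / den


def d3_f4_numeric(x1, x2, x3, lam1, lam2, eps=1e-6):
    """Central finite difference in x3 (illustrative step size)."""
    up = f4(x1, x2, x3 + eps, lam1, lam2)
    down = f4(x1, x2, x3 - eps, lam1, lam2)
    return (up - down) / (2 * eps)


# Illustrative values: lam1 * lam2 = 1 and x1 * x2 = 1, so the sign flips at x3 = 1.
for x3 in (0.5, 2.0):
    exact = d3_f4_closed_form(1.0, 1.0, x3, 1.0, 1.0)
    approx = d3_f4_numeric(1.0, 1.0, x3, 1.0, 1.0)
    print(x3, exact, approx)
```

Below the threshold the derivative is positive (activation) and above it the derivative is negative (repression), so no single sign can be assigned to the edge from gene 3 to gene 4 over the whole state space.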

Here is an example of a molecular graph having a shorter unobserved path dominating a longer unobserved path with the opposite sign, taken from part of the gene regulatory network in the study [23], achieving CSP with the sign of the shorter path (see Fig 3).

Fig 3 — (A) The molecular graph with blue edges indicating positive partial derivatives and red edges indicating negative partial derivatives. (B) The gene regulatory graph.

Example 2. The mRNA ELF4^m is transcribed into the protein ELF4^p, which then forms the complex EC^c with the protein LUX^p. The complex EC^c induces the transcription of the mRNA GI^m. Then there is a 3-hop path (ELF4^m–ELF4^p–EC^c–GI^m) and a 4-hop path (ELF4^m–ELF4^p–LUX^p–EC^c–GI^m) from ELF4^m to GI^m with opposite signs. The ODE model of the molecular graph satisfies CSP with ELF4 activating GI in the gene regulatory graph.

GeneNetWeaver: CSP and complexity

In this section GeneNetWeaver models (without the truncation of the α terms in the implementation) are shown to satisfy the CSP globally, regardless of the parameters and the system states, and thus correspond to the signed directed graphs that were used to generate the models. Moreover, when data is generated through multifactorial perturbation for the DREAM challenge (primarily for generation of stationary expression levels, rather than trajectories), each ensemble of networks produced is also associated with the same directed signed graph. This is in contrast to the Shea–Ackers model, which is shown to be able to generate non-monotone behavior [17]. Formally we have the following result.

Proposition 2. Given any directed signed graph, every model in the ensemble of GeneNetWeaver models satisfies CSP over (0, ∞)²ⁿ and its gene regulatory graph coincides with the given graph.

Proof. Fix any model of the ensemble of GeneNetWeaver models for the given graph. For any target gene j and its regulator i ∈ N_j, there exists a unique module, indexed by K:j, whose input $K \in S_{j}$ includes i. Then for any of the three module types,

\begin{matrix} \partial_{ν_{i j}} M_{K : j} \begin{cases} > 0 & \text{if } i \in A_{K : j}, \\ < 0 & \text{if } i \in D_{K : j} . \end{cases} \end{matrix}

Then by Eq (4),

\begin{matrix} \partial_{y_{i}} f_{j} = c_{K : j} β_{K : j} \partial_{ν_{i j}} M_{K : j} h_{i j} \frac{y_{i}^{h_{i j} - 1}}{k_{i j}^{h_{i j}}} \end{matrix}

and

\begin{matrix} \partial_{x_{i}} f_{i}^{(p)} = ρ_{i} . \end{matrix}

Because only c_{K: j} and $\partial_{ν_{i j}} M_{K : j}$ can be negative in $\partial_{y_{i}} f_{j} \partial_{x_{i}} f_{i}^{(p)}$ , the sum–product of the first-order partial derivatives of the path from x_i to x_j has the same sign as $c_{K : j} \partial_{ν_{i j}} M_{K : j}$ , which is consistent with the sign b_ij in the given graph by the construction in S2 Appendix. Hence by Proposition 1 the fixed ODE model satisfies CSP over all positive state vectors with gene regulatory graph equal to the given graph. Repeating this for all ODE models in the ensemble proves the proposition.

We now discuss the complexity of GeneNetWeaver ODE models for a given gene regulatory graph. The complexity comes from both the large number of parameters and the combinatorial nature of the module configurations. The complexity indicates that ODE models are both much more detailed and considerably harder to infer compared to the graphical models.

For each gene i there are 5 non-negative real parameters (α_i,basal, x_i(0), y_i(0), $δ_{i}^{(m)}$ , $δ_{i}^{(p)}$ ). For each edge (i, j) there is a non-negative real parameter (k_ij) and an integer parameter (h_ij). For each module K: i there is a positive real parameter (β_{K: i}) and two binary parameters (c_{K: i} and r_{K: i}).

The module configuration encodes great combinatorial complexity. Given a gene has K ≥ 1 input genes, the number of ways to partition the genes into modules is the Kth Bell number. The first ten Bell numbers are 1, 2, 5, 15, 52, 203, 877, 4140, 21147, and 115975. In addition, each input to a given module needs to be classified as an activator or deactivator.
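The Bell numbers quoted above can be reproduced with the Bell triangle. The sketch below is a standard construction and is not taken from the GeneNetWeaver source code; the function name is illustrative.

```python
def bell_numbers(count):
    """Return [B_1, ..., B_count], the numbers of set partitions of 1..count elements."""
    row = [1]  # first row of the Bell triangle
    bells = []
    for _ in range(count):
        bells.append(row[-1])  # the last entry of row n is B_n
        new_row = [row[-1]]  # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells


print(bell_numbers(10))
```

Each partition of the input genes of a target gene corresponds to one way of grouping them into modules, before the activator or deactivator role of each input and the binding type of each module are chosen.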

Case study: Soybean flowering networks

In this section the similarities of the ODE models corresponding to three different graph models are studied. First the classes of ODE models are listed for the three graph models. Then, to investigate their similarities, we generate expression data from one ODE model, and fit another model to the data by optimizing the parameters. The goodness of fit of one class of ODE models to the data generated from another is used as a metric of similarity. As we will see, ODE models corresponding to the same graph model tend to have a higher similarity, while those from different graph models tend to have a lower similarity, as long as the least-squares problem is sufficiently overdetermined. The result implies that the graph model corresponding to the ODE model may be recovered with a moderate amount of data, while the amount of data required for ODE model recovery may be of a much higher order. The simulation code for the data fitting results is available at [24].

Five-gene graph and ODE models

In this section we explicitly write out the classes of GeneNetWeaver ODE models of three graph models. The first two graph models are compiled from the literature and differ only in the sign of one edge (the difference was discovered in the study [25]). The third graph model is an arbitrary five-gene repressilator included for comparison.

Flowering network with COL1a activating E1

A graph model of a five-gene soybean flowering network is shown in Fig 4. The network is based on the flowering network for Arabidopsis and homologs of Arabidopsis genes found in soybean (see Table 1). The corresponding gene IDs are shown in Table 2.

Table 1. Core flowering genes.

regulatory interaction	reference
E1 activates COL1a	[26]
E1 activates FT4	[27]
COL1a activates E1	[25]
COL1a represses E1	[26]
COL1a activates FT4	[26], [25]
COL1a represses FT2a	[26], [25]
FT4 represses AP1a	[27]^*
FT2a activates AP1a	[28]

* For FT4 only, not for the interaction with AP1a.

Table 2. Core flowering genes.

index	gene ID	gene name
1	Glyma.06G207800	E1
2	Glyma.08G255200	COL1a
3	Glyma.08G363100	FT4
4	Glyma.16G150700	FT2a
5	Glyma.16G091300	AP1a

The mRNA and protein concentrations of the soybean genes E1, COL1a, FT4, FT2a, and AP1a are denoted by (x_i)_{1≤i ≤ 5} and (y_i)_{1≤i ≤ 5}. The differential equations based on the GeneNetWeaver model are

\begin{matrix} {\dot{x}}_{1} = α_{1, basal} + \frac{{(y_{2} / k_{21})}^{h_{21}}}{1 + {(y_{2} / k_{21})}^{h_{21}}} β_{2 : 1} - δ_{1}^{(m)} x_{1} . \end{matrix}

(7)

\begin{matrix} {\dot{x}}_{2} = α_{2, basal} + \frac{{(y_{1} / k_{12})}^{h_{12}}}{1 + {(y_{1} / k_{12})}^{h_{12}}} β_{1 : 2} - δ_{2}^{(m)} x_{2} . \end{matrix}

(8)

\begin{matrix} \begin{cases} {\dot{x}}_{3} & = α_{3, basal} + \frac{{(y_{1} / k_{13})}^{h_{13}}}{1 + {(y_{1} / k_{13})}^{h_{13}}} \frac{{(y_{2} / k_{23})}^{h_{23}}}{1 + {(y_{2} / k_{23})}^{h_{23}}} β_{12 : 3} - δ_{3}^{(m)} x_{3} \\ & \text{(independent binding), or} \\ {\dot{x}}_{3} & = α_{3, basal} + \frac{{(y_{1} / k_{13})}^{h_{13}} {(y_{2} / k_{23})}^{h_{23}}}{1 + {(y_{1} / k_{13})}^{h_{13}} {(y_{2} / k_{23})}^{h_{23}}} β_{12 : 3} - δ_{3}^{(m)} x_{3} \\ & \text{(synergistic binding), or} \\ {\dot{x}}_{3} & = α_{3, basal} + \frac{{(y_{1} / k_{13})}^{h_{13}}}{1 + {(y_{1} / k_{13})}^{h_{13}}} β_{1 : 3} + \frac{{(y_{2} / k_{23})}^{h_{23}}}{1 + {(y_{2} / k_{23})}^{h_{23}}} β_{2 : 3} - δ_{3}^{(m)} x_{3} \\ & \text{(two modules).} \end{cases} \end{matrix}

(9)

\begin{matrix} {\dot{x}}_{4} = {(α_{4, basal} - \frac{{(y_{2} / k_{24})}^{h_{24}}}{1 + {(y_{2} / k_{24})}^{h_{24}}} β_{2 : 4})}^{+} - δ_{4}^{(m)} x_{4} . \end{matrix}

(10)

\begin{matrix} \begin{cases} {\dot{x}}_{5} & = α_{5, basal} + \frac{1}{1 + {(y_{3} / k_{35})}^{h_{35}}} \frac{{(y_{4} / k_{45})}^{h_{45}}}{1 + {(y_{4} / k_{45})}^{h_{45}}} β_{34 : 5} - δ_{5}^{(m)} x_{5} \\ & \text{(independent binding enhancer), or} \\ {\dot{x}}_{5} & = {(α_{5, basal} - \frac{{(y_{3} / k_{35})}^{h_{35}}}{1 + {(y_{3} / k_{35})}^{h_{35}}} \frac{1}{1 + {(y_{4} / k_{45})}^{h_{45}}} β_{34 : 5})}^{+} - δ_{5}^{(m)} x_{5} \\ & \text{(independent binding silencer), or} \\ {\dot{x}}_{5} & = α_{5, basal} + \frac{{(y_{4} / k_{45})}^{h_{45}}}{1 + {(y_{4} / k_{45})}^{h_{45}} + {(y_{3} / k_{35})}^{h_{35}} {(y_{4} / k_{45})}^{h_{45}}} β_{34 : 5} - δ_{5}^{(m)} x_{5} \\ & \text{(synergistic binding enhancer), or} \\ {\dot{x}}_{5} & = {(α_{5, basal} - \frac{{(y_{3} / k_{35})}^{h_{35}}}{1 + {(y_{3} / k_{35})}^{h_{35}} + {(y_{3} / k_{35})}^{h_{35}} {(y_{4} / k_{45})}^{h_{45}}} β_{34 : 5})}^{+} - δ_{5}^{(m)} x_{5} \\ & \text{(synergistic binding silencer), or} \\ {\dot{x}}_{5} & = {(α_{5, basal} - \frac{{(y_{3} / k_{35})}^{h_{35}}}{1 + {(y_{3} / k_{35})}^{h_{35}}} β_{3 : 5} + \frac{{(y_{4} / k_{45})}^{h_{45}}}{1 + {(y_{4} / k_{45})}^{h_{45}}} β_{4 : 5})}^{+} - δ_{5}^{(m)} x_{5} \\ & \text{(two modules).} \end{cases} \end{matrix}

(11)

\begin{matrix} {\dot{y}}_{1} = ρ_{1} (x_{1} - y_{1}) . \end{matrix}

(12)

\begin{matrix} {\dot{y}}_{2} = ρ_{2} (x_{2} - y_{2}) . \end{matrix}

(13)

\begin{matrix} {\dot{y}}_{3} = ρ_{3} (x_{3} - y_{3}) . \end{matrix}

(14)

\begin{matrix} {\dot{y}}_{4} = ρ_{4} (x_{4} - y_{4}) . \end{matrix}

(15)

\begin{matrix} {\dot{y}}_{5} = ρ_{5} (x_{5} - y_{5}) . \end{matrix}

(16)

Here (x)⁺ = max{x, 0}. We apply nondimensionalization by setting δ_i = α_i,basal + ∑_j β_{j: i}, so that the steady state expression levels are between 0 and 1. Given the graph, there are 15 configurations of the ODEs (3 for x₃ times 5 for x₅). We use [i, j] with 1 ≤ i ≤ 3 and 1 ≤ j ≤ 5 to denote the configuration using the ith equation for x₃ and the jth equation for x₅, and use the symbol F_[i,j],+ to denote the class of flowering network ODE models with configuration [i, j] (the plus sign signifies that COL1a activates E1). The initial conditions, namely the 5 mRNA abundances x(0)’s and the 5 protein concentrations y(0)’s, are 10-dimensional. In addition, there are 24–26 positive real parameters (depending on the configuration) and 7 discrete parameters (the Hill coefficients) for the dynamics. For example, for configuration [1, 1], the parameters for the dynamics consist of the basal activations α’s (5), the Michaelis–Menten constants k’s (7), the absolute effects of modules β’s (7), and the translation rates ρ’s (5), summing up to 24 parameters.

Flowering network with COL1a repressing E1

A slight variant of the soybean flowering graph model in Fig 4 is shown in Fig 5. Note the only difference is the sign of the edge from COL1a to E1. The symbol F_[i,j],− denotes the class of ODE models Eqs (7)–(16) with the ith and the jth configurations in Eqs (9) and (11), but with Eq (7) replaced by

\begin{matrix} {\dot{x}}_{1} = {(α_{1, basal} - \frac{{(y_{2} / k_{21})}^{h_{21}}}{1 + {(y_{2} / k_{21})}^{h_{21}}} β_{2 : 1})}^{+} - δ_{1}^{(m)} x_{1} . \end{matrix}

(17)

Here the negative sign in F_[i,j],− signifies that COL1a represses E1. The number of parameters is the same as for the network in Fig 4.

Repressilator

An arbitrary repressilator network is shown in Fig 6.

The symbol R denotes the class of ODE models for the repressilator, given below.

\begin{matrix} {\dot{x}}_{1} = {(α_{1, basal} - \frac{{(y_{3} / k_{31})}^{h_{31}}}{1 + {(y_{3} / k_{31})}^{h_{31}}} β_{3 : 1})}^{+} - δ_{1}^{(m)} x_{1} . \end{matrix}

(18)

\begin{matrix} {\dot{x}}_{2} = {(α_{2, basal} - \frac{{(y_{1} / k_{12})}^{h_{12}}}{1 + {(y_{1} / k_{12})}^{h_{12}}} β_{1 : 2})}^{+} - δ_{2}^{(m)} x_{2} . \end{matrix}

(19)

\begin{matrix} {\dot{x}}_{3} = {(α_{3, basal} - \frac{{(y_{4} / k_{43})}^{h_{43}}}{1 + {(y_{4} / k_{43})}^{h_{43}}} β_{4 : 3})}^{+} - δ_{3}^{(m)} x_{3} . \end{matrix}

(20)

\begin{matrix} {\dot{x}}_{4} = {(α_{4, basal} - \frac{{(y_{5} / k_{54})}^{h_{54}}}{1 + {(y_{5} / k_{54})}^{h_{54}}} β_{5 : 4})}^{+} - δ_{4}^{(m)} x_{4} . \end{matrix}

(21)

\begin{matrix} {\dot{x}}_{5} = {(α_{5, basal} - \frac{{(y_{2} / k_{25})}^{h_{25}}}{1 + {(y_{2} / k_{25})}^{h_{25}}} β_{2 : 5})}^{+} - δ_{5}^{(m)} x_{5} . \end{matrix}

(22)

There is only one possible configuration for each target gene. The dynamics involve 20 parameters.

Data generation

The synthetic expression dataset is generated as follows. For the generated data, we use F_[1,1],+ (the flowering network with configuration [1, 1] and COL1a activating E1) with a fixed set of parameters for the dynamics. For a single set of trajectories (i.e., for a single plant), we use a set of initial values x(0)’s and y(0)’s generated uniformly at random between 0 and 1. The entire dataset may consist of only a single set of trajectories, corresponding to a single plant; or the dataset may consist of multiple sets of trajectories, corresponding to multiple plants. If multiple sets of trajectories are used, the initial conditions for each set of trajectories are generated independently, while the parameters for the dynamics are the same across all sets of trajectories. In other words, we model distinct plants by assuming distinct initial conditions, while using common parameters for the dynamics. To produce the data, the x variables are sampled at time points 0, 1, 2, 3, 4, 5, 6, so that each set of trajectories (i.e., each plant) produces 35 data points. Because each set of trajectories is sampled at different times from the system with one initial condition representing different stages of a single plant, the synthetic datasets are of multi-shot sampling, as opposed to one-shot sampling in practice where each individual is only sampled once [29]. We also generate random expression datasets with reflected Brownian motions with covariance 0.05, and denote such a stochastic model by B.
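The data generation procedure for F_[1,1],+ can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code of [24]: all parameter values are hypothetical, the fixed-step fourth-order Runge-Kutta integrator is a stand-in for whatever solver the original code uses, and the names `PARAMS`, `rhs`, `simulate_plant`, and `generate_dataset` are introduced here for illustration only.

```python
import random

EDGES = [(2, 1), (1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 5)]
MODULES = ["2:1", "1:2", "12:3", "2:4", "34:5"]  # one module per target gene in [1, 1]
PARAMS = {  # hypothetical values, chosen only to make the sketch runnable
    "alpha": [0.1] * 5,
    "beta": dict(zip(MODULES, [0.5, 0.5, 0.5, 0.1, 0.5])),
    "k": {e: 0.5 for e in EDGES},
    "h": {e: 2 for e in EDGES},
    "rho": [1.0] * 5,
}
# Nondimensionalization from the text: delta_i = alpha_i,basal + sum of beta over modules of i.
PARAMS["delta"] = [a + PARAMS["beta"][m] for a, m in zip(PARAMS["alpha"], MODULES)]


def rhs(state, p):
    """Right-hand side of Eqs (7)-(16) for configuration [1, 1]."""
    x, y = state[:5], state[5:]
    a, b, d = p["alpha"], p["beta"], p["delta"]

    def H(i, j):  # Hill term nu / (1 + nu) of regulator i on target j (1-based)
        nu = (y[i - 1] / p["k"][(i, j)]) ** p["h"][(i, j)]
        return nu / (1 + nu)

    dx = [
        a[0] + H(2, 1) * b["2:1"] - d[0] * x[0],                   # Eq (7)
        a[1] + H(1, 2) * b["1:2"] - d[1] * x[1],                   # Eq (8)
        a[2] + H(1, 3) * H(2, 3) * b["12:3"] - d[2] * x[2],        # Eq (9), independent binding
        max(a[3] - H(2, 4) * b["2:4"], 0.0) - d[3] * x[3],         # Eq (10)
        a[4] + (1 - H(3, 5)) * H(4, 5) * b["34:5"] - d[4] * x[4],  # Eq (11), first form
    ]
    dy = [p["rho"][i] * (x[i] - y[i]) for i in range(5)]           # Eqs (12)-(16)
    return dx + dy


def rk4_step(state, p, dt):
    """One classical Runge-Kutta step."""
    def shifted(k, scale):
        return [s + scale * dt * ki for s, ki in zip(state, k)]
    k1 = rhs(state, p)
    k2 = rhs(shifted(k1, 0.5), p)
    k3 = rhs(shifted(k2, 0.5), p)
    k4 = rhs(shifted(k3, 1.0), p)
    return [s + dt / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
            for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4)]


def simulate_plant(p, rng, n_times=7, steps_per_unit=100):
    """One set of trajectories: random x(0), y(0) in [0, 1), x sampled at t = 0, ..., 6."""
    state = [rng.random() for _ in range(10)]
    samples = [state[:5]]
    for _ in range(n_times - 1):
        for _ in range(steps_per_unit):
            state = rk4_step(state, p, 1.0 / steps_per_unit)
        samples.append(state[:5])
    return samples


def generate_dataset(p, n_plants, seed=0):
    """S x T x n nested list: shared dynamics, independent initial conditions per plant."""
    rng = random.Random(seed)
    return [simulate_plant(p, rng) for _ in range(n_plants)]


data = generate_dataset(PARAMS, n_plants=2)
print(len(data), len(data[0]), len(data[0][0]))  # S, T, n
```

Only the mRNA variables x are recorded, matching the 35 data points per plant described above; the protein variables y evolve in the simulation but are treated as unobserved.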

Fitting results

The counts for data points and parameters are summarized in Table 3. Note that with a single set of trajectories, the number of parameters is close to the number of data points. As the number of sets of trajectories increases, the number of data points outgrows the number of parameters because each additional set provides 35 new data points while only allowing 10 more parameters from the initial conditions (because the dynamic parameters are shared across all sets of trajectories).

Table 3. Number of parameters in different ODE models.

S (number of sets of trajectories)	1	2	5	10
STn (number of data points)	35	70	175	350
F_[1,1],+	34	44	74	124
F_[3,5],+	36	46	76	126
F_[1,1],−	34	44	74	124
R	30	40	70	120
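The entries of Table 3 follow from simple counting: each set of trajectories contributes T × n = 35 data points and 2n = 10 initial-condition parameters, while the dynamic parameters are shared. The sketch below reproduces the table under that reading; the dynamic-parameter counts (24, 26, 24, 20) are taken from the text and from the S = 1 column of the table rather than derived independently.

```python
T, n = 7, 5  # time points per trajectory, number of genes

# Number of shared dynamic parameters per model class (read off the text and Table 3).
dynamic_params = {"F[1,1],+": 24, "F[3,5],+": 26, "F[1,1],-": 24, "R": 20}


def counts(S):
    """Return (number of data points, {model: number of fitted parameters}) for S sets."""
    data_points = S * T * n
    params = {name: d + 2 * n * S for name, d in dynamic_params.items()}
    return data_points, params


for S in (1, 2, 5, 10):
    print(S, counts(S))
```

The ratio of data points to parameters grows from roughly 1 at S = 1 toward 3.5 as S increases, which is the sense in which the least-squares problem becomes overdetermined.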

A basin-hopping algorithm in the Python package LMFIT [30] is used to perform the global optimization of the curve fitting (see details in the source code of the simulation [24]). The sample size varies between 35 and 350 depending on the number of sets of trajectories. The fit is evaluated by the fitting loss and the coefficients of determination (R²) shown in Tables 4 and 5. The fitting loss function for two S×T×n tensors x and $\hat{x}$ is defined by

\begin{matrix} l (x, \hat{x}) = {(\frac{1}{S T n} \sum_{i = 1}^{S} \sum_{j = 1}^{T} \sum_{k = 1}^{n} {(x_{i j k} - {\hat{x}}_{i j k})}^{2})}^{1 / 2}, \end{matrix}

where S is the number of sets of trajectories in the dataset, T the number of time points, and n the number of genes.
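The fitting loss is a root-mean-square error over all entries of the tensor. A minimal sketch, with tensors represented as nested Python lists and with the helper `relative_loss` introduced here only to express a loss relative to the average nondimensionalized expression level 0.5 used in the observations on Tables 4 and 5:

```python
from math import sqrt


def fitting_loss(x, x_hat):
    """Root-mean-square difference between two S x T x n nested lists."""
    total, count = 0.0, 0
    for x_set, h_set in zip(x, x_hat):
        for x_time, h_time in zip(x_set, h_set):
            for a, b in zip(x_time, h_time):
                total += (a - b) ** 2
                count += 1
    return sqrt(total / count)


def relative_loss(loss, mean_level=0.5):
    """Loss relative to an average expression level (0.5 after nondimensionalization)."""
    return loss / mean_level


# Toy 1 x 2 x 2 tensors: every entry of x_hat is off by 0.1.
x = [[[0.2, 0.4], [0.6, 0.8]]]
x_hat = [[[0.3, 0.5], [0.7, 0.9]]]
print(fitting_loss(x, x_hat), relative_loss(0.0208))
```

With this convention a loss of 0.0208 corresponds to a relative loss of about 4%, and a loss of 0.0015 to about 0.3%.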

Table 4. Fitting losses using different classes of ODE models on different synthetic datasets.

S (number of sets of trajectories)	1	2	5	10
fit F_[1,1],+ model to F_[1,1],+ data	0.0015	0.0015	0.0010	0.0009
fit F_[3,5],+ model to F_[1,1],+ data	0.0016	0.0021	0.0019	0.0021
fit F_[1,1],− model to F_[1,1],+ data	0.0032	0.0036	0.0165	0.0208
fit R model to F_[1,1],+ data	0.0030	0.0037	0.0148	0.0204
fit F_[1,1],+ model to B data	0.1269	0.1125	0.1307	0.1390

Table 5. Coefficients of determination using different classes of ODE models on different synthetic datasets.

S (number of sets of trajectories)	1	2	5	10
fit F_[1,1],+ model to F_[1,1],+ data	0.99996	0.99995	0.99999	0.99999
fit F_[3,5],+ model to F_[1,1],+ data	0.99995	0.99991	0.99996	0.99995
fit F_[1,1],− model to F_[1,1],+ data	0.99980	0.99974	0.99702	0.99517
fit R model to F_[1,1],+ data	0.99983	0.99972	0.99760	0.99535
fit F_[1,1],+ model to B data	0.88639	0.90175	0.87241	0.87517

Note the time scale of the ODE is assumed to be known, which restricts how fast the expression levels can change. The time scale thus acts as a regularizer to prevent overfitting.

We make the following observations from Tables 4 and 5.

1. The implemented optimization algorithm failed to find the optimal parameters in row 1 (the best fit should be a perfect fit with zero loss), but the relative loss compared to the average nondimensionalized expression level 0.5 is very small (less than 0.5%), and the coefficients of determination are close to 1. Both indicate a near-optimal fit.
2. ODE models from all three graph models (rows 1, 2, 3, and 4) fit the synthetic flowering network data well when there are only one or two sets of trajectories (columns 1 and 2). The relative losses are less than 1% and R² is larger than 0.9997. We can see from Table 3 that the number of data points is close to the number of parameters in the S = 1 setting, and only moderately larger in the S = 2 setting. So when S ≤ 2 the three graph models in this case study are nearly indistinguishable. In other words, one may not be able to infer the graph structure with very limited data.
3. When fitting the models to 5 or 10 sets of trajectories simultaneously, i.e., when the system is sufficiently overdetermined, only the models from the correct graph (rows 1 and 2) fit well. The models from incorrect graphs (rows 3 and 4) suffer a roughly 4% relative loss after fitting for 10 sets of trajectories and R² falls below 0.998. Note that F_[1,1],− differs from the ground truth of the data F_[1,1],+ only by the sign of one edge, while the model R shares no edges with the ground truth at all. Yet the slight variant of the ground-truth graph fits as poorly as the completely different repressilator graph.
4. Both F_[1,1],+ and F_[3,5],+ fit the F_[1,1],+ data very well for all numbers of sets of trajectories (rows 1 and 2). This indicates that the classes of ODE models with different configurations of the same graph model are similar in terms of data fitting. Consequently, even with data sufficient to infer the correct graph model, it may be impossible to infer the specific ODE model.
5. The models from the flowering network cannot fit the random dataset (reflected Brownian motions with covariance 0.05) well. It turns out that the ODE models with 34 parameters have trouble following the highly variable 35 data points from the reflected Brownian motions. The poor fit to the random dataset shows great redundancy in the parameters in terms of generating data points. It also indicates that the fitting results for the synthetic ODE data are significant compared to fitting a random dataset.

Discussion

Generalization of CSP to related gene regulatory network models

The concept of CSP can be applied to many other models. We first explain this for continuous-state models, and then for discrete-state models.

Continuous-state models

A network model somewhat similar to ODE models is a fixed-point model. The study by Van den Bulcke et al. [31] uses a fixed-point model for gene regulatory networks. ODE models based on Michaelis–Menten and Hill kinetics and linear degradation terms are used to determine the expression level of a given gene as a function of the expression levels of other genes. Then a fixed point is produced. This can model equilibrium points, also known as resting points, of ODE models. The concept of the constant sign property can be applied to fixed-point models as well. Van den Bulcke et al. [31] focus on models for the network topology, which is not addressed in this paper.

Other continuous-state models have been used for gene regulatory networks. The study by Mendes et al. [32] simulates gene regulatory networks using a biochemical simulator called Gepasi [33], which models complex biochemical pathways using ODEs. For such biochemical systems, the constant sign property discussed in this paper can be used to find the causal dependency among observed variables (e.g., mRNA abundances in the special case of gene regulatory networks). In order to avoid the difficult calibration of the parameters in ODEs, Ocone et al. [34] model the promoter by a binary state process and approximate the transcription–translation network with stochastic differential equations. The constant sign property can be generalized to such hybrid models by introducing a notion of monotonicity for the stochastic systems. It is worth mentioning that the constant sign property is defined with directionality for causal relationships among the genes and is not suitable for models based on mere correlation (e.g., graphical Gaussian models [35]).

Discrete-state models

One common type of discrete model used for gene regulatory networks is the Bayesian network (see, e.g., Friedman et al. [36]). Boolean networks, as a special case of Bayesian networks, are used to capture qualitative gene regulation (see, e.g., Liang et al. [37]), for which the constant sign property can be defined based on the monotonicity of the Boolean functions. The study by Husmeier [38] evaluates a dynamic Bayesian network inference algorithm using simulated data based on an ODE model whose genetic network model is taken from Zak et al. [39] and whose equations are taken from chemical kinetics (see Chapter 22 of Atkins and de Paula [40]). Similarly, the study by Smith et al. [41] also proposes a dynamic Bayesian network algorithm, and evaluates its performance on sampled and quantized data from a dynamic Bayesian network simulator that models different regions of the brain of songbirds regulated by their behaviors. The simulated data is generated with a small step size before being sampled, and thus resembles an ODE model simulator. For the dynamic Bayesian network, gene expressions are quantized to discrete values. The constant sign property can also be applied to dynamic Bayesian network models using a partial order of the conditional distributions (e.g., stochastic dominance) of target genes given the expressions of their regulators. Husmeier [38] gives an example of a graphical model that is more detailed than the gene regulatory graph in this paper. Although both the GeneNetWeaver model and the ODE models in Husmeier [38] are based on chemical kinetic equations, one difference is that the Michaelis–Menten and the Hill kinetics in GeneNetWeaver arise from considerations of a faster time scale of the binding of TF to the promoter regions (see Alon [19]). Nevertheless, both GeneNetWeaver and the ODE models for realistic simulation in Husmeier [38] fall into the general framework of ODE models in this paper and hence the constant sign property we have proposed applies to both.

Implication of GCSP

GCSP of an ODE model generalizes the notion of a linear dynamical system by allowing the variation of the state vector (i.e., the concentrations of molecular classes) to be nonlinear in the state vector so long as the overall effect of the most influential pathways in the molecular graph keeps the same sign (i.e., activation stays activation and repression stays repression regardless of the expression of the regulator, the target gene, or any other molecular classes). Biologically, GCSP indicates homogeneity of the gene regulatory network in the sense that the qualitative properties of gene regulation are preserved after cellular differentiation and under different external conditions. Lack of GCSP indicates significant change in regulatory functions after cellular differentiation and under different external conditions. Note that GCSP is more likely to hold for the subnetwork of a small number of genes compared to a larger network.

Limitation of infinitesimal CSP

The definitions of CSP proposed in this paper focus on short time behavior. Over short time periods, the paths with the smallest number of hops dominate. Often the shortest paths have the strongest influence, as seen in Example 2. But in some cases the shortest paths could be weaker than some slightly longer paths, and if the longer paths have an opposite sign, then the focus on short time and shortest paths can be misleading, because the longer paths will take over quickly after the brief initial dominance by the shortest paths. In the extreme case of a complete molecular graph, where every molecular class has a (possibly tiny) regulatory effect on every other molecular class, the gene regulatory graph defined in this paper would be determined by only the direct edges in the molecular graph and all the actual biological pathways would be entirely ignored. This also shows the importance of network sparsity.

Conclusion

Gene regulatory networks are modeled at different abstraction levels with a tradeoff between accuracy and tractability. Graph models with signed directed edges provide circuit-like characterization of gene regulation, while ODE models quantify detailed dynamics for various molecular classes. The constant sign property proposed in this paper connects the two types of models by identifying a set of conditions under which ODE models correspond to a single graph model, and provides a deeper understanding of the context-dependent and time-varying nature of gene regulatory networks. A class of ODE models for a given graph model, based on the source code of the popular software package GeneNetWeaver, is described in detail and shown to satisfy the global constant sign property. Exploration of data fitting of one ODE model to the data generated from another shows a better fit when the two models have the same graph model.

Supporting information

S1 Appendix. Basic model of gene interaction.

A brief review on both graph models and ODE models is given here.

(PDF)

S2 Appendix. Random model for production functions used in GeneNetWeaver.

Specific module generation and parameter ranges in GeneNetWeaver are described here.

(PDF)

S3 Appendix. Proof of Eq (6).

A proof of the equation involving the partial derivatives of the solution of dynamical systems.

(PDF)

Data Availability

The computer simulation source code is available at https://github.com/Veggente/graph-ode.

Funding Statement

This work was supported by the Plant Genome Research Program from the National Science Foundation (NSF-IOS-PGRP-1823145) to B.H. and Y.H., and by the Communication and Information Foundations program from the National Science Foundation (NSF-CCF-CIF-1900636) to B.H.

References

1. Davidson E, Levin M. Gene regulatory networks. Proc Natl Acad Sci USA. 2005;102(14):4935–4935. doi:10.1073/pnas.0502024102
2. Emmert-Streib F, Dehmer M, Haibe-Kains B. Gene regulatory networks and their applications: Understanding biological and medical problems in terms of networks. Front Cell Dev Biol. 2014;2.
3. Emmert-Streib F, Dehmer M, Haibe-Kains B. Untangling statistical and biological models to understand network inference: The need for a genomics network ontology. Front Genet. 2014;5.
4. Kim HD, Shay T, O’Shea EK, Regev A. Transcriptional regulatory circuits: Predicting numbers from alphabets. Science. 2009;325(5939):429–432. doi:10.1126/science.1171347
5. Karlebach G, Shamir R. Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008;9(10):770–780. doi:10.1038/nrm2503
6. Seaton DD, Smith RW, Song YH, MacGregor DR, Stewart K, Steel G, et al. Linked circadian outputs control elongation growth and flowering in response to photoperiod and temperature. Mol Syst Biol. 2015;11(1). doi:10.15252/msb.20145766
7. Kolar M, Song L, Ahmed A, Xing EP. Estimating time-varying networks. Ann Appl Stat. 2010;4(1):94–123. doi:10.1214/09-AOAS308
8. Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, Gerstein M. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004;431(7006):308–312. doi:10.1038/nature02782
9. Jaeger KE, Pullen N, Lamzin S, Morris RJ, Wigge PA. Interlocking feedback loops govern the dynamic behavior of the floral transition in Arabidopsis. Plant Cell. 2013;25(3):820–833. doi:10.1105/tpc.113.109355
10. Schaffter T, Marbach D. GeneNetWeaver; 2012. Available from: https://github.com/tschaffter/gnw.
11. Schaffter T, Marbach D, Floreano D. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics. 2011;27(16):2263–2270. doi:10.1093/bioinformatics/btr373
12. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci USA. 2010;107(14):6286–6291. doi:10.1073/pnas.0913357107
13. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, et al. Wisdom of crowds for robust gene network inference. Nat Methods. 2012;9(8):796–804. doi:10.1038/nmeth.2016
14. Chan TE, Stumpf MPH, Babtie AC. Gene regulatory network inference from single-cell data using multivariate information measures. Cell Syst. 2017;5(3):251–267.e3. doi:10.1016/j.cels.2017.08.014
15. Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17(2):147–154. doi:10.1038/s41592-019-0690-6
16. Shea MA, Ackers GK. The OR control system of bacteriophage lambda: A physical-chemical model for gene regulation. J Mol Biol. 1985;181(2):211–230. doi:10.1016/0022-2836(85)90086-5
17. Gedeon T, Mischaikow K, Patterson K, Traldi E. When activators repress and repressors activate: A qualitative analysis of the Shea–Ackers model. Bull Math Biol. 2008;70(6):1660–1683. doi:10.1007/s11538-008-9313-6
18. Locke JCW, Southern MM, Kozma-Bognár L, Hibberd V, Brown PE, Turner MS, et al. Extension of a genetic network model by iterative experimentation and mathematical analysis. Mol Syst Biol. 2005;1(1). doi:10.1038/msb4100018
19. Alon U. An introduction to systems biology: Design principles of biological circuits. CRC Press; 2006.
20. Ackers GK, Johnson AD, Shea MA. Quantitative model for gene regulation by lambda phage repressor. Proc Natl Acad Sci USA. 1982;79(4):1129–1133. doi:10.1073/pnas.79.4.1129
21. Hartman P. Ordinary Differential Equations. 2nd ed. SIAM; 2002.
22. Zorich VA. Mathematical Analysis II. Springer Berlin Heidelberg; 2016. Available from: https://doi.org/10.1007/978-3-662-48993-2.
23. Pokhilko A, Fernández AP, Edwards KD, Southern MM, Halliday KJ, Millar AJ. The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops. Mol Syst Biol. 2012;8(1):574. doi:10.1038/msb.2012.6
24. Kang X. Graph and ODE models simulations; 2020. Available from: https://github.com/Veggente/graph-ode.
25. Wu F, Kang X, Wang M, Haider W, Price WB, Hajek B, et al. Transcriptome-enabled network inference revealed the GmCOL1 feed-forward loop and its roles in photoperiodic flowering of soybean. Front Plant Sci. 2019;10. doi:10.3389/fpls.2019.01221
26. Cao D, Li Y, Lu S, Wang J, Nan H, Li X, et al. GmCOL1a and GmCOL1b function as flowering repressors in soybean under long-day conditions. Plant Cell Physiol. 2015;56(12):2409–2422. doi:10.1093/pcp/pcv152
27. Zhai H, Lü S, Liang S, Wu H, Zhang X, Liu B, et al. GmFT4, a homolog of FLOWERING LOCUS T, is positively regulated by E1 and functions as a flowering repressor in soybean. PLOS ONE. 2014;9(2):e89030. doi:10.1371/journal.pone.0089030
28. Nan H, Cao D, Zhang D, Li Y, Lu S, Tang L, et al. GmFT2a and GmFT5a redundantly and differentially regulate flowering through interaction with and upregulation of the bZIP transcription factor GmFDL19 in soybean. PLOS ONE. 2014;9(5):e97669. doi:10.1371/journal.pone.0097669
29. Kang X, Hajek B, Wu F, Hanzawa Y. Time series experimental design under one-shot sampling: The importance of condition diversity. PLOS ONE. 2019;14(10):e0224577. doi:10.1371/journal.pone.0224577
30. Newville M, Stensitzki T, Allen DB, Ingargiola A. LMFIT: Non-Linear Least-Square Minimization and Curve-Fitting for Python; 2014. Available from: https://zenodo.org/record/11813.
31. Van den Bulcke T, Van Leemput K, Naudts B, van Remortel P, Ma H, Verschoren A, et al. SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinform. 2006;7(1):43. doi:10.1186/1471-2105-7-43
32. Mendes P, Sha W, Ye K. Artificial gene networks for objective comparison of analysis algorithms. Bioinformatics. 2003;19(Suppl 2):ii122–ii129.
33. Mendes P. Biochemistry by numbers: simulation of biochemical pathways with Gepasi 3. Trends Biochem Sci. 1997;22(9):361–363. doi:10.1016/S0968-0004(97)01103-1
34. Ocone A, Millar AJ, Sanguinetti G. Hybrid regulatory models: A statistically tractable approach to model regulatory network dynamics. Bioinformatics. 2013;29(7):910–916. doi:10.1093/bioinformatics/btt069
35. Wille A, Zimmermann P, Vranová E, Fürholz A, Laule O, Bleuler S, et al. Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biol. 2004;5(11):R92. doi:10.1186/gb-2004-5-11-r92
36. Friedman N, Linial M, Nachman I, Pe’er D. Using Bayesian networks to analyze expression data. J Comput Biol. 2000;7(3-4):601–620. doi:10.1089/106652700750050961
37. Liang S, Fuhrman S, Somogyi R. REVEAL, a general reverse engineering algorithm for inference of genetic network architectures. Pac Symp Biocomput. 1998;3:18–29.
38. Husmeier D. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003;19(17):2271–2282. doi:10.1093/bioinformatics/btg313
39. Zak DE, Doyle III FJ, Gonye GE, Schwaber JS. Simulation studies for the identification of genetic networks from cDNA array and regulatory activity data. In: Proc Int Conf Syst Biol; 2001. p. 231–238.
40. Atkins P, de Paula J. Physical Chemistry. 9th ed. Oxford University Press; 2010.
41. Smith VA, Jarvis ED, Hartemink AJ. Evaluating functional network inference using simulations of complex biological systems. Bioinformatics. 2002;18(Suppl 1):S216–S224. doi:10.1093/bioinformatics/18.suppl_1.S216

PLoS One. doi: 10.1371/journal.pone.0235070.r001

Decision Letter 0

Enrique Hernandez-Lemus

10 Mar 2020

PONE-D-20-02246

From graph topology to ODE models for gene regulatory networks

PLOS ONE

Dear Dr. Kang,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The reviewers point out the need to deepen the level of description and to clarify the concepts behind your analysis and results. Please also compare the performance of the present method with different methods, and highlight the novelty and usefulness of this approach.

We would appreciate receiving your revised manuscript by Apr 24 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Enrique Hernandez-Lemus, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Overall, most given explanations are too brief and for this reason the ideas introduced are difficult to understand. The paper should be extended - at the moment it is anyway quite short.

Importantly, there is no deeper discussion of the obtained results (also a discussion section is missing). At least 3-4 of discussion should be added.

The paper is currently in a draft stage and requires more work.

Introduction:

The related literature is not properly cited or discussed. Please add more background information about this. In total, about 40 papers should be cited (currently only 18 are cited).

It is unclear to me how the authors define a 'gene regulatory network'. Frequently this is done wrong, as discussed here

PMID: 25364745

Importantly, a gene network refers to all possible types of molecular interactions, including transcriptional regulation or protein interaction. If it referred only to the former, it would be a transcriptional regulatory network.

Furthermore, the modeling perspective of such networks has been discussed in detail in

PMID: 25221572

It is unclear if GeneNetWeaver is essential for the analysis or if different tools could be used, e.g.,

PMID: 16438721

This point should be made clear.

GeneNetWeaver is an old model. Is there no newer development in this area?

Fitting results

This section is entirely unclear. I guess a regression is performed? What sample size is used? What is R^2? What regularization has been used? How is the fit evaluated? How is overfitting dealt with?

Conclusions

The insights provided are shallow. This problem relates back to the points I made above.

What does the CSP mean biologically?

The paper does not mention that GRNs usually have many more than 5 genes. Can one extend the model to 100 genes?

Reviewer #2: The paper under review is unfocused and rather weak.

The main purpose seems to be to define carefully an ODE model called GeneNetWeaver that has been used to produce data used in DREAM challenges. Then the main result seems to be the observation, in three lines, that the ODEs generated by this model have monotone dependence on their arguments and therefore are consistent with a network graphical model with signed edges.

However, there is insufficient justification why this class of models reflects biological reality. For instance, module 1 uses the product of all activators and deactivators, which assumes that all activators must be bound for transcription to happen. This is not true in many promoter regions where either activator can activate the gene. If these modules can then be considered independently, this must be explained and stated.

In line 205 a statement about a sum being considered 1 vs zero is puzzling, and left hanging without any explanation. Yet this seems to be a crucial point of the discussion about the relationship between modules 2 and 3.

There is no comparison with older similar models of gene regulation based on statistical mechanics and occupancy of promoters. In particular, the line of papers

Ackers GK, Johnson AD, Shea MA. Quantitative model for gene regulation by lambda phage repressor. Proc Natl Acad Sci U S A. 1982 Feb;79(4):1129-33.

Shea MA, Ackers GK. The OR control system of bacteriophage lambda. A physical-chemical model for gene regulation. J Mol Biol. 1985 Jan 20;181(2):211-30.

should be cited and discussed. In particular,

Gedeon T, Mischaikow K, Patterson K, Traldi E. When activators repress and repressors activate: a qualitative analysis of the Shea-Ackers model. Bull Math Biol. 2008 Aug;70(6):1660-83. doi: 10.1007/s11538-008-9313-6. Epub 2008 Jul 22.

has shown that the Shea-Ackers type model does not have to have the monotonicity property that is claimed in this paper for GeneNetWeaver models.

A discussion of which assumptions on GeneNetWeaver models guarantee the monotonicity must be included.

It is not clear what the case study communicates. Artificial data is generated and then fit with ODE models. This has no bearing on biology.

I recommend either a rejection or a major revision for this paper.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review PloS One 2020.pdf

PLoS One. 2020 Jun 30;15(6):e0235070. doi: 10.1371/journal.pone.0235070.r002

Author response to Decision Letter 0

26 May 2020

Please see the response letter.

Attachment

Submitted filename: response.pdf

PLoS One. doi: 10.1371/journal.pone.0235070.r003

Decision Letter 1

Enrique Hernandez-Lemus

9 Jun 2020

From graph topology to ODE models for gene regulatory networks

PONE-D-20-02246R1

Dear Dr. Kang,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Enrique Hernandez-Lemus, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

PLoS One. doi: 10.1371/journal.pone.0235070.r004

Acceptance letter

Enrique Hernandez-Lemus

17 Jun 2020

PONE-D-20-02246R1

From graph topology to ODE models for gene regulatory networks

Dear Dr. Kang:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Prof. Enrique Hernandez-Lemus

Academic Editor

PLOS ONE

[pone.0235070.ref001] 1. Davidson E, Levin M. Gene regulatory networks. Proc Natl Acad Sci USA. 2005;102(14):4935–4935. 10.1073/pnas.0502024102 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref002] 2. Emmert-Streib F, Dehmer M, Haibe-Kains B. Gene regulatory networks and their applications: Understanding biological and medical problems in terms of networks. Front Cell Dev Biol. 2014;2. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref003] 3. Emmert-Streib F, Dehmer M, Haibe-Kains B. Untangling statistical and biological models to understand network inference: The need for a genomics network ontology. Front Genet. 2014;5. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref004] 4. Kim HD, Shay T, O’Shea EK, Regev A. Transcriptional regulatory circuits: Predicting numbers from alphabets. Science. 2009;325(5939):429–432. 10.1126/science.1171347 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref005] 5. Karlebach G, Shamir R. Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008;9(10):770–780. 10.1038/nrm2503 [DOI] [PubMed] [Google Scholar]

[pone.0235070.ref006] 6. Seaton DD, Smith RW, Song YH, MacGregor DR, Stewart K, Steel G, et al. Linked circadian outputs control elongation growth and flowering in response to photoperiod and temperature. Mol Syst Biol. 2015;11(1). 10.15252/msb.20145766 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref007] 7. Kolar M, Song L, Ahmed A, Xing EP. Estimating time-varying networks. Ann Appl Stat. 2010;4(1):94–123. 10.1214/09-AOAS308 [DOI] [Google Scholar]

[pone.0235070.ref008] 8. Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, Gerstein M. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004;431(7006):308–312. 10.1038/nature02782 [DOI] [PubMed] [Google Scholar]

[pone.0235070.ref009] 9. Jaeger KE, Pullen N, Lamzin S, Morris RJ, Wigge PA. Interlocking feedback loops govern the dynamic behavior of the floral transition in Arabidopsis. Plant Cell. 2013;25(3):820–833. 10.1105/tpc.113.109355 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref010] 10.Schaffter T, Marbach D. GeneNetWeaver; 2012. Available from: https://github.com/tschaffter/gnw.

[pone.0235070.ref011] 11. Schaffter T, Marbach D, Floreano D. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics. 2011;27(16):2263–2270. 10.1093/bioinformatics/btr373 [DOI] [PubMed] [Google Scholar]

[pone.0235070.ref012] 12. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci USA. 2010;107(14):6286–6291. 10.1073/pnas.0913357107 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref013] 13. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, et al. Wisdom of crowds for robust gene network inference. Nat Methods. 2012;9(8):796–804. 10.1038/nmeth.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref014] 14. Chan TE, Stumpf MPH, Babtie AC. Gene regulatory network inference from single-cell data using multivariate information measures. Cell Syst. 2017;5(3):251–267.e3. 10.1016/j.cels.2017.08.014 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref015] 15. Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17(2):147–154. 10.1038/s41592-019-0690-6 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref016] 16. Shea MA, Ackers GK. The OR control system of bacteriophage lambda: A physical-chemical model for gene regulation. J Mol Biol. 1985;181(2):211–230. 10.1016/0022-2836(85)90086-5 [DOI] [PubMed] [Google Scholar]

[pone.0235070.ref017] 17. Gedeon T, Mischaikow K, Patterson K, Traldi E. When activators repress and repressors activate: A qualitative analysis of the Shea–Ackers model. Bull Math Biol. 2008;70(6):1660–1683. 10.1007/s11538-008-9313-6 [DOI] [PubMed] [Google Scholar]

[pone.0235070.ref018] 18. Locke JCW, Southern MM, Kozma-Bognár L, Hibberd V, Brown PE, Turner MS, et al. Extension of a genetic network model by iterative experimentation and mathematical analysis. Mol Syst Biol. 2005;1(1). 10.1038/msb4100018 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref019] 19. Alon U. An introduction to systems biology: Design principles of biological circuits. CRC press; 2006. [Google Scholar]

[pone.0235070.ref020] 20. Ackers GK, Johnson AD, Shea MA. Quantitative model for gene regulation by lambda phage repressor. Proc Natl Acad Sci USA. 1982;79(4):1129–1133. 10.1073/pnas.79.4.1129 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref021] 21. Hartman P. Ordinary Differential Equations. 2nd ed SIAM; 2002. [Google Scholar]

[pone.0235070.ref022] 22.Zorich VA. Mathematical Analysis II. Springer Berlin Heidelberg; 2016. Available from: https://doi.org/10.10072F978-3-662-48993-2.

[pone.0235070.ref023] 23. Pokhilko A, Fernández AP, Edwards KD, Southern MM, Halliday KJ, Millar AJ. The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops. Mol Syst Biol. 2012;8(1):574 10.1038/msb.2012.6 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0235070.ref024] 24.Kang X. Graph and ODE models simulations; 2020. Available from: https://github.com/Veggente/graph-ode.

25. Wu F, Kang X, Wang M, Haider W, Price WB, Hajek B, et al. Transcriptome-enabled network inference revealed the GmCOL1 feed-forward loop and its roles in photoperiodic flowering of soybean. Front Plant Sci. 2019;10. doi:10.3389/fpls.2019.01221

26. Cao D, Li Y, Lu S, Wang J, Nan H, Li X, et al. GmCOL1a and GmCOL1b function as flowering repressors in soybean under long-day conditions. Plant Cell Physiol. 2015;56(12):2409–2422. doi:10.1093/pcp/pcv152

27. Zhai H, Lü S, Liang S, Wu H, Zhang X, Liu B, et al. GmFT4, a homolog of FLOWERING LOCUS T, is positively regulated by E1 and functions as a flowering repressor in soybean. PLOS ONE. 2014;9(2):e89030. doi:10.1371/journal.pone.0089030

28. Nan H, Cao D, Zhang D, Li Y, Lu S, Tang L, et al. GmFT2a and GmFT5a redundantly and differentially regulate flowering through interaction with and upregulation of the bZIP transcription factor GmFDL19 in soybean. PLOS ONE. 2014;9(5):e97669. doi:10.1371/journal.pone.0097669

29. Kang X, Hajek B, Wu F, Hanzawa Y. Time series experimental design under one-shot sampling: The importance of condition diversity. PLOS ONE. 2019;14(10):e0224577. doi:10.1371/journal.pone.0224577

30. Newville M, Stensitzki T, Allen DB, Ingargiola A. LMFIT: Non-Linear Least-Square Minimization and Curve-Fitting for Python; 2014. Available from: https://zenodo.org/record/11813.

31. Van den Bulcke T, Van Leemput K, Naudts B, van Remortel P, Ma H, Verschoren A, et al. SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinform. 2006;7(1):43. doi:10.1186/1471-2105-7-43

32. Mendes P, Sha W, Ye K. Artificial gene networks for objective comparison of analysis algorithms. Bioinformatics. 2003;19(Suppl 2):ii122–ii129.

33. Mendes P. Biochemistry by numbers: simulation of biochemical pathways with Gepasi 3. Trends Biochem Sci. 1997;22(9):361–363. doi:10.1016/S0968-0004(97)01103-1

34. Ocone A, Millar AJ, Sanguinetti G. Hybrid regulatory models: A statistically tractable approach to model regulatory network dynamics. Bioinformatics. 2013;29(7):910–916. doi:10.1093/bioinformatics/btt069

35. Wille A, Zimmermann P, Vranová E, Fürholz A, Laule O, Bleuler S, et al. Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biol. 2004;5(11):R92. doi:10.1186/gb-2004-5-11-r92

36. Friedman N, Linial M, Nachman I, Pe’er D. Using Bayesian networks to analyze expression data. J Comput Biol. 2000;7(3-4):601–620. doi:10.1089/106652700750050961

37. Liang S, Fuhrman S, Somogyi R. REVEAL, a general reverse engineering algorithm for inference of genetic network architectures. Pac Symp Biocomput. 1998;3:18–29.

38. Husmeier D. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003;19(17):2271–2282. doi:10.1093/bioinformatics/btg313

39. Zak DE, Doyle III FJ, Gonye GE, Schwaber JS. Simulation studies for the identification of genetic networks from cDNA array and regulatory activity data. In: Proc Int Conf Syst Biol; 2001. p. 231–238.

40. Atkins P, de Paula J. Physical Chemistry. 9th ed. Oxford University Press; 2010.

41. Smith VA, Jarvis ED, Hartemink AJ. Evaluating functional network inference using simulations of complex biological systems. Bioinformatics. 2002;18(Suppl 1):S216–S224. doi:10.1093/bioinformatics/18.suppl_1.S216

