Propagation of genetic variation in gene regulatory networks

Erik Plahte; Arne B Gjuvsland; Stig W Omholt

doi:10.1016/j.physd.2013.04.002

Author manuscript; available in PMC: 2014 Aug 1.

Published in final edited form as: Physica D. 2013 Apr 17;256-257:7–20. doi: 10.1016/j.physd.2013.04.002

PMCID: PMC3752980 NIHMSID: NIHMS486320 PMID: 23997378

Abstract

A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network’s feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

Keywords: Gene regulation, Network, Haploid, Diploid, Genetic variation, Feedback

1. Introduction

Understanding the genotype to phenotype map is essential for a whole range of problems in evolutionary biology, production biology and biomedicine. As gene regulatory networks are the main mediating agents for setting up this map, a theory that can tell us how genetic variation is phenotypically manifested in gene regulatory networks as a function of regulatory anatomy may prove most helpful. Such a theory will be an important contribution to a future quantitative genetics theory linking genes, phenotypes and population level genetic phenomena in causal models based on how genes actually work and interact. More specifically, by being able to describe how the effects of genetic variation propagate in a network one will be able to predict how genetic variation in a gene affects network pathways and processes. In this way one may be able to tie genetic variation in gene networks to a whole range of biological processes that generate high-level phenotypic features. Moreover, at the generic level such a theory can be used in a systematic way to reveal recurrent patterns of how variation is propagated in specific types of regulatory anatomies.

We assume that the network is composed of a set of interacting nodes or loci. Each locus can in principle be regarded as a module by being a functional unit or subsystem of molecular processes whose working may be unknown, but which includes the whole transcriptional and translational machinery that produces the output of the locus [1, 2]. The phenotypes of a network are the stable equilibrium values of the gene products of all the loci in the network. Each locus is susceptible to genetic variation, and we assume that the genetic variation affects the promoter region of a given gene, but that there is no variation in the coding region of the gene. Many experimental results justify the relevance of this assumption. There are examples of noncoding mutations affecting production rates [3], mRNA processing rates [4, 5], the shape of the cis-regulatory input function [6, 7, 8], and mRNA decay rates [9, 10, 11]. In a recent study of adaptive evolution in threespine sticklebacks, Jones et al. found that in 41% of the genes allelic variation was regulatory, in 42% it was probably regulatory, and in only 17% it was coding [12].

To fully understand the functional properties of a diploid gene it is desirable to model its two alleles as separate quantities. This was first done by Omholt et al. [13] to show how the phenomena of genetic dominance, overdominance, additivity, and epistasis could be seen as generic features of simple diploid gene regulatory networks. This model framework was later used to introduce the so-called allele interaction concept [14]. In the present paper we develop these ideas further by proposing a way by which a diploid gene modelled in this fashion can be represented as a single entity and described by a single ODE for its gene product.

Based on these premises we provide a new vocabulary for analysing how genetic variation is manifested in a wide class of haploid and diploid gene regulatory networks possessing negative and positive feedback loops. We introduce terms to describe how a change in equilibrium value at one locus affects the equilibrium values of all other loci, how to identify the causal chains of loci conveying a genetic signal from one locus to another, and how genetic variation at a particular locus affects the equilibrium value phenotype of the locus itself. In [14] we investigated the relationships between single locus gene action concepts and regulatory network anatomy in small networks. Here we extend the analysis to gene regulatory networks with an arbitrary number of loci and complex feedback structures. This extension is highly relevant for understanding epistasis and pleiotropy in genotype-phenotype maps. Epistasis refers to situations where the effect of a genetic substitution at one locus depends on the genotype at another locus. Pleiotropy describes situations where one gene influences several phenotypes rather than a single one. Since epistasis and pleiotropy are inherent to biological networks, a system-level understanding of these phenomena is needed [15, 16].

By this work we contribute to the long and strong tradition originating with the works of René Thomas on relating generic systemic properties to the web of feedback loops [17, 18], while at the same time elucidating the link between genetics and systems dynamics. Our results provide further support to the view that nonlinear system dynamics will make up a major part of the core of the mathematical foundation of a future quantitative genetics theory [19, 20].

2. Propagation of genetic variation: features shared by haploid and diploid networks

At this stage we are not concerned with the inner workings of each gene due to genetic variation, but assume that the output rate of a locus is a given function of the concentration levels of its regulators, which we assume are one or several gene outputs. Thus in the first part of the paper we deal with characteristics of propagation of genetic variation that are shared by both haploid and diploid networks.

We combine results from linear algebra and network theory (see e.g. [21]) with gene network ideas to describe how genetic variation in one locus propagates to the other loci in the system in terms of the equilibrium values of the state variables. We introduce the term propagation function to describe how a change in equilibrium value of one node affects the equilibrium values of all other nodes, the term propagation chain to describe a chain of actions conveying a genetic signal from one node in the network to another, and finally, the term feedback function to describe how genetic variation at any particular locus affects the equilibrium value of the locus product itself.

A brief explanation of our notation is found in Appendix A.

2.1. Basic rate equations

We assume the network N is composed of a set of n loci X_i, i = 1, …, n, where n ≥ 2. The non-negative variable z_i represents the possibly time dependent concentration or amount of the output of X_i and acts as input to other loci in the network or contributes to the network’s net output. The dynamics of N is described by a set of autonomous rate equations E_i for z_i, i ∈ N = {1,2, …, n},

{\dot{z}}_{i} = f_{i} (z, a_{i}) = r_{i} (z, a_{i}) - γ_{i} z_{i},

(1)

where $z \in R_{+}^{n}$ is the n-component vector with non-negative components z_i, r_i(z, a_i) is differentiable with respect to z in a certain open and convex domain W, and γ_i > 0 is the relative degradation rate of z_i. The quantity a = {a_i}, i ∈ N, represents a set of parameters defining the system’s genotype, the subset a_i defining the genotype of X_i and comprising quantities like maximum production rate, activation thresholds, affinities of activators and inhibitors, mRNA to protein conversion rate, etc. In many modelling approaches of this type, r_i is a Boolean or Boolean-like functional of sigmoidal functions or piecewise constant functions, see [22] for a review of modelling approaches for gene networks. It should be noted that there could be long and complicated chains of effects incorporated into r_i(z, a_i) [23].

We assume that for each combination of genotypes of the loci X_i in N, the system composed of Eqs. (1) has a single hyperbolic, asymptotically stable and differentiable point-like solution x in W. We show in Section 2.2 that under reasonable assumptions an equilibrium x always exists. If N has no positive loops, x is unique [24, 25]. To avoid having to discuss possible problems related to multistationarity, we invoke the additional assumption that the equilibrium of the system is unique within the domain of phase space of interest even if there are positive loops in the system.

2.2. Propagation functions

A shift in the equilibrium value of some x_k due to a change in parameters specific for X_k will propagate through the network and lead to shifts in other equilibrium values. The propagation follows the network connections, which can be read out from the Jacobian J of Eq. (1) in the stable state x. To the network N corresponding to Eq. (1) we associate a signed digraph G. To each node or locus X_i is associated a vertex X_i in G. Let X_j → X_i indicate a direct effect from X_j to X_i if J_{i j} = ∂r_i(z, a)/∂z_j ≠ 0 at z = x. The effect of X_j on X_i is positive (negative) if the rate of change ż_i increases (decreases) when z_j increases. For this direct effect there is a corresponding directed arc in G from X_j to X_i with a sign equal to the sign of J_{i j} associated to it. The sequence of direct effects X_k → X_j → ⋯ → X_l is called a chain from X_k to X_l if each node in the chain occurs only once [26]. This chain corresponds to a simple path in G from X_k to X_l. We will use the term propagation chain.

The following proposition shows that for each pair k, l ∈ N, where l ≠ k, there exists a propagation function p_lk which determines how the perturbed value of x_l due to a genetic variation in X_k is given in terms of x_k.

Proposition 1

Let k ∈ N be given, let L = N \ {k}, and consider the set of equilibrium conditions

f_{l} (x_{L}, x_{k}, a_{l}) = r_{l} (x_{L}, x_{k}, a_{l}) - γ_{l} x_{l} = 0, l \in L,

(2)

where all x_i ≥ 0, and all r_l satisfy r_l(x_L, x_k, a_l) > 0 for x_l = 0. For any x_k the system of equations f_L(x_L, x_k, a_L) = 0 has at least one set of solutions x_l = p_lk(x_k, a^(k)), where a^(k) is the set of parameters not occurring in the rate equation of X_k.

The proposition follows directly from Theorem 4.9 in [27]. Because the equation f_k(x_L, x_k, a_k) = 0 is not included in the system of equations f_L(x_L, x_k, a_L) = 0, the solution x_L is independent of the X_k-specific parameters a_k. This fact is important because it implies that the effect on x_l of any genetic variation of X_k is given by a fixed propagation function p_lk.

From this follows the usefulness of the propagation functions. A genotypic variation (mutation) in a gene may lead to new equilibrium values of the gene products in the network. One way of addressing this would be to try to parametrise the genotypic variation, and then model the dependence of the equilibrium values on the relevant parameters. The propagation functions offer a simpler solution because they require no knowledge of how the mutated gene could be modelled. They only relate the observable equilibrium concentrations. There is no need to take account of what is the cause of the genetic variation of X_k, how this manifests itself in a shift of parameter values in a_k, or how this parameter value shift might influence p_lk. All that matters are the shifted values of x_k and x_l. For a given k the set of all the functions p_lk contains all information about how the genetic change in X_k becomes manifested in the network against a fixed genetic background (the genotypes of all the other genes).

In the following we explore the properties of the propagation functions and show how they are related to the structure and interactions in the network. We derive them from network properties, and we also use what can be learned about them from observed equilibrium values to obtain information about causal chains in the network.

For a given k the functions p_lk are in principle observable by varying the genotype of X_k while keeping the other loci fixed and recording the shifted equilibrium values. Of course, solving explicitly for p_lk in a given model is in general not feasible because of the non-linearities in the system. However, finding the derivative of p_lk is a linear problem. In the following we relate the derivative $p_{lk}^{'} (x_{k}, a^{(k)}) = q_{lk} (x_{k}, a^{(k)})$ to the values of the elements of the Jacobian J of Eqs. (1) for a given k and any l ≠ k. Let L = N \ {l, k} and j ∈ L. All the equilibrium conditions E_L define x_j as a function of x_k, i.e. x_L = p_Lk(x_k). Then, when the expression in Eq. (6) below for dx_l/dx_k exists,

γ_{l} x_{l} = r_{l} (x_{l}, x_{k}, p_{L k} (x_{k}))

(3)

defines x_l as a function of x_k around the steady state. Differentiating Eq. (3) with respect to x_k gives

\sum_{j \neq k} J_{lj} q_{jk} = - J_{lk} .

(4)

Let Q^(k) be the column vector with components q_ik and ν^(k) the column vector with elements ∂ f_i/∂ x_k, both with i = k excluded. Then

J^{(kk)} Q^{(k)} = - ν^{(k)} .

(5)

Using Cramer’s rule and interchanging columns in the numerator finally leads to

\frac{d x_{l}}{d x_{k}} = q_{lk} (x_{k}, a^{(k)}) = {(- 1)}^{k + l} \frac{D^{(kl)}}{D^{(kk)}} .

(6)

Note that the right hand side is in fact independent of f_k(x, a) because row number k in J is deleted in both determinants. This confirms Proposition 1. However, despite this, genotype variation in X_k will shift the equilibrium values and indirectly affect the values of the matrix elements of J. Furthermore, it follows from the implicit function theorem (see e.g. [28]) that if D^(kk) ≠ 0 in x, then there is a unique differentiable mapping p_lk : x_k ↦ x_l in a neighbourhood of x whose derivative can be given as above.
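The relation in Eq. (6) can be checked numerically on a small example. The following is a minimal sketch, not the authors' code: the three-locus model, its parameter values and all function names are assumptions made for illustration. It perturbs the genotype parameter a_1 of X_1, records the shifted equilibria, and compares the finite-difference ratio Δx_l/Δx_1 with the determinant ratio of Eq. (6), reading D^(kl) as the determinant of J with row k and column l deleted.

```python
# Sketch: compare Eq. (6) with a finite difference of equilibrium values.
# The model, its parameters and all names below are illustrative assumptions.

GAMMA = [1.0, 1.0, 1.0]  # assumed degradation rates


def rates(x, a1):
    """Production rates r_i(x) of a hypothetical 3-locus network."""
    x1, x2, x3 = x
    return [a1 / (1.0 + x3),                    # X3 represses X1; a1 is the genotype of X1
            2.0 * x1 / (1.0 + x1),              # X1 activates X2
            1.5 * x2 / (1.0 + x2) + 0.5 * x1]   # X2 and X1 activate X3


def equilibrium(a1, steps=5000, dt=0.05):
    """Integrate dz/dt = r(z) - gamma*z with explicit Euler until it settles."""
    x = [1.0, 1.0, 1.0]
    for _ in range(steps):
        r = rates(x, a1)
        x = [xi + dt * (ri - g * xi) for xi, ri, g in zip(x, r, GAMMA)]
    return x


def jacobian(x, a1, h=1e-6):
    """Central-difference Jacobian J_ij = d f_i / d z_j at x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(x); up[j] += h
        dn = list(x); dn[j] -= h
        ru, rd = rates(up, a1), rates(dn, a1)
        for i in range(n):
            J[i][j] = (ru[i] - rd[i]) / (2 * h) - (GAMMA[i] if i == j else 0.0)
    return J


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if not M:
        return 1.0
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def minor(J, row, col):
    """Determinant of J with the given row and column deleted."""
    return det([r[:col] + r[col + 1:] for i, r in enumerate(J) if i != row])


def q(J, l, k):
    """Eq. (6): q_lk = (-1)^(k+l) D^(kl) / D^(kk), with 0-based indices."""
    return (-1) ** (k + l) * minor(J, k, l) / minor(J, k, k)


a1, da = 2.0, 1e-3
x = equilibrium(a1)
x_mut = equilibrium(a1 + da)   # "mutated" X1: only a_1 changes
J = jacobian(x, a1)
k = 0
for l in (1, 2):
    fd = (x_mut[l] - x[l]) / (x_mut[k] - x[k])
    print(f"q_{l + 1}{k + 1}: Eq. (6) = {q(J, l, k):.4f}, finite difference = {fd:.4f}")
```

Row k of J enters neither determinant, so the comparison does not depend on how the mutation in X_1 is parametrised, only on the shifted values of x_1 and x_l.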

Eq. (6) shows that the propagation of genetic variation in locus X_k is intimately linked to the feedback loop structure of the network in the stable state. While the left hand side of Eq. (6) can be approximated by finite differences of observable equilibrium values after a perturbation of X_k, its right hand side depends on the feedback structure of the network, which is not directly accessible. In the following section we introduce the propagation chain concept and show how it is linked to J, and how it discloses the biological implications of Eq. (6). First, however, we show that Eq. (6) sheds some light on the conditions for the validity of the chain rule for functions defined implicitly by a set of equations.

From an imprudent application of the chain rule to x_m = p_ml(x_l) and x_l = p_lk(x_k) one might be tempted to conclude that x_m = p_ml ∘ p_lk(x_k) and

p_{ml}^{'} p_{lk}^{'} = p_{mk}^{'},

(7)

where k ∈ N, m ∈ N and k ≠ m. This, however, is not generally true. In Appendix C we prove and comment on the following result:

Proposition 2

Assume the variables have been renumbered such that k = 1 and 1 < l < m < n, and define the sets L = {1 : l}, M = {(l + 1) :n}, Q = {1 : (l − 1)}, R = {l : n}, where {i : j} = {i, i + 1, …, j} for i < j and {i : i} = {i}. In terms of partitioned matrices

J = (\begin{matrix} J_{LQ} & J_{LR} \\ J_{MQ} & J_{MR} \end{matrix}) .

(8)

If J_MQ = 0, the chain rule Eq. (7) is fulfilled.

The opposite conclusion is not true, however, as there may be nonzero elements in J_MQ that do not enter into feedback loops without jeopardising the rule.

Because the numbering of the nodes is arbitrary and immaterial, this result can be interpreted as follows. If all chains of effects from X_k to X_m pass through X_l, then the chain rule Eq. (7) is fulfilled, even if there are return chains from X_m to X_k so that both nodes are members of a feedback loop. However, if there exists a chain from X_k to X_m that does not pass through X_l, the chain rule may be violated. Apart from this, the network structure is immaterial.

As a simple illustration we consider the three-gene system

\begin{array}{rcl} x_{1} & = & r_{1} (x_{3}), \\ x_{2} & = & r_{2} (x_{1}), \\ x_{3} & = & r_{3} (x_{1}, x_{2}), \end{array}

(9)

in which all γ_i = 1. The two chains X₁ → X₂ → X₃ and X₁ → X₃ constitute a feedforward loop from X₁ to X₃, X₂ playing the role of the intermediate element X_l in Eq. (7). Then

\begin{array}{rclcl} \frac{d x_{2}}{d x_{1}} & = & q_{21} & = & \frac{d r_{2}}{d x_{1}}, \\ \frac{d x_{3}}{d x_{2}} & = & q_{32} & = & \frac{\frac{\partial r_{3}}{\partial x_{2}}}{1 - \frac{\partial r_{3}}{\partial x_{1}} \frac{\partial r_{1}}{\partial x_{3}}}, \\ \frac{d x_{3}}{d x_{1}} & = & q_{31} & = & \frac{\partial r_{3}}{\partial x_{1}} + \frac{\partial r_{3}}{\partial x_{2}} q_{21} . \end{array}

(10)

Obviously, q₃₁ ≠ q₃₂ q₂₁ if r₃ depends explicitly on x₁, in which case there is a chain from X₁ to X₃ not passing through X₂. On the other hand, the arc X₃ → X₁ causes no problem.
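To make the failure of the naive chain rule concrete, the expressions in Eq. (10) can be evaluated for assumed values of the partial derivatives. The numbers and the function name below are purely illustrative and are not taken from any fitted model.

```python
# Sketch: evaluate Eq. (10) for assumed values of the partial derivatives.

def propagation_derivatives(d21, d32, d31, d13):
    """Return (q21, q32, q31) for the three-gene system of Eq. (9), all gamma_i = 1.

    d21 = dr2/dx1, d32 = dr3/dx2, d31 = dr3/dx1, d13 = dr1/dx3.
    """
    q21 = d21
    q32 = d32 / (1.0 - d31 * d13)
    q31 = d31 + d32 * q21
    return q21, q32, q31


# Direct arc X1 -> X3 present: q31 differs from q32 * q21.
q21, q32, q31 = propagation_derivatives(d21=0.5, d32=0.8, d31=0.3, d13=-0.4)
print("with X1 -> X3:   ", q31, q32 * q21)

# Direct arc removed: every chain from X1 to X3 passes through X2,
# and the chain rule holds even though the return arc X3 -> X1 is kept.
q21, q32, q31 = propagation_derivatives(d21=0.5, d32=0.8, d31=0.0, d13=-0.4)
print("without X1 -> X3:", q31, q32 * q21)
```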

2.3. Propagation chains and feedback loops

We start this section with a few standard definitions and clarifications.

A circuit is a set of elements in the Jacobian J whose circuit product (the product of all the elements in the circuit) contributes to det(J) or one of its principal subdeterminants. An element in a circuit represents either an action from one node to another or to itself (a regulatory element), or a degradation term. Thus, a circuit with i elements involves i nodes. The signed circuit product of a circuit equals its circuit product times a sign factor defined in Appendix B. A full circuit is a circuit with n elements. The length of a circuit equals the number of elements in the circuit. The sign of a circuit equals the sign of its circuit product.
If there is a circuit among a subset of nodes and another circuit among another disjoint subset of nodes, the two circuits are subcircuits in a composite circuit. The circuit product of a composite circuit can always be factorised as a product of two or more subcircuit products. The sign of the composite circuit equals the product of the signs of all the subcircuits.
A proper circuit is a circuit that is not composite. Its circuit product cannot be factorised into subcircuit products.
A feedback loop or just a loop is a circuit that only comprises regulatory elements. A feedback loop comprises one or more closed chains of actions or effects (closed paths) in the network in which any node in the chains occurs just once.
An autoregulatory loop is a loop with one member, arising from a node whose product acts on its own dose-response function.

For example, if

J = \left(\begin{array}{ccc} - γ_{1} & 0 & 0 \\ 0 & - γ_{2} & c_{23} \\ 0 & c_{32} & - γ_{3} \end{array}\right),

(11)

there is just one regulatory loop in the network (X₂ ⇄ X₃), but several (composite) circuits, for instance −γ₁c₂₃c₃₂. The degradation terms in Eq. (1) ensure that all nodes are members of one or more circuits which could be purely regulatory loops or a mixture of regulatory effects and degradation terms. A circuit product is therefore always either a loop product or equal to a loop product times one or more factors −γ_j. Accordingly, in the mathematical sense there exists at least one (proper or composite) circuit L_N comprising all nodes, even in cases where there is no full (regulatory) loop. In Appendix B we recall a few useful facts about subdeterminants and circuits.
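The full circuits of a small Jacobian can be listed by enumerating the nonzero terms of the Leibniz expansion of det(J). The sketch below does this for a Jacobian with the zero pattern of Eq. (11). It assumes that the sign factor of Appendix B coincides with the sign of the corresponding permutation; the numerical values and function names are illustrative only.

```python
from itertools import permutations


def permutation_sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def full_circuits(J):
    """Nonzero Leibniz terms of det(J) as (permutation, circuit product, sign)."""
    n = len(J)
    terms = []
    for perm in permutations(range(n)):
        product = 1.0
        for i in range(n):
            product *= J[i][perm[i]]
        if product != 0.0:
            terms.append((perm, product, permutation_sign(perm)))
    return terms


g1, g2, g3 = 1.0, 2.0, 3.0   # assumed degradation rates
c23, c32 = 0.5, 0.7          # assumed regulatory strengths
J = [[-g1, 0.0, 0.0],
     [0.0, -g2, c23],
     [0.0, c32, -g3]]

terms = full_circuits(J)
for perm, product, sign in terms:
    print(perm, "circuit product:", product, "signed:", sign * product)
print("det(J) =", sum(sign * product for _, product, sign in terms))
```

Only two permutations survive: the identity, which gives the pure degradation circuit, and the transposition of nodes 2 and 3, which gives the composite circuit containing the loop X₂ ⇄ X₃.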

Let U = {u₁, u₂, …, u_ρ} be a subset of N with k as its first element and l as its last, and let ρ = |U| be the number of elements in U. Then C_U is a propagation chain X_u₁ → X_u₂ → X_u₃ → … → X_{u_ρ} if the product

C_{U} = J_{u_{ρ} u_{ρ - 1}} J_{u_{ρ - 1} u_{ρ - 2}} \dots J_{u_{2} u_{1}}

(12)

is nonzero. If C_U is made to close on itself by appending the action X_{u_ρ} → X_u₁, it becomes the loop L_U with loop product P_U = J_{u₁u_ρ} C_U.

Next we show that if some q_lk(x_k, a^(k)) ≠ 0, there must be a chain propagating the effect of a shift in x_k from X_k to X_l. (The opposite is not true, as the contributions from two or more chains might accidentally cancel.) Combining Eq. (6) with known formulas for the expansion of determinants in terms of minors [27], we can express q_lk as

q_{lk} (x_{k}, a^{(k)}) = \frac{1}{D^{(kk)}} \sum_{U} {(- 1)}^{ρ - 1} D_{VV} C_{U},

(13)

where U is any chain set with u₁ = k and u_ρ = l, ρ = |U|, V = N \ U, C_U = C_U (J) is the chain product of U, and the sum runs over all such U. Keep in mind that Eq. (6) presupposes k ≠ l. Combining Eq. (6) and Eq. (13) we see that D^(kl) is a weighted sum of the chain products in J of all chains leading from X_k to X_l. If no such chain exists for given k and l, then q_lk(x_k, a^(k)) = 0, as expected.
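The equivalence between Eq. (6) and Eq. (13) can be verified by brute force on a small matrix. The sketch below enumerates all chains from X_k to X_l, forms the chain products of Eq. (12), weights each by the principal minor D_VV of the nodes outside the chain and by a sign (−1)^(|U|−1), and compares the sum with (−1)^(k+l) D^(kl). The test matrix is an arbitrary integer matrix chosen for exact arithmetic, and the helper names are assumptions.

```python
from itertools import permutations


def det(M):
    """Determinant by Laplace expansion; the empty matrix has determinant 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def submatrix(J, rows, cols):
    """Rows and columns of J selected by index lists."""
    return [[J[i][j] for j in cols] for i in rows]


def chain_product(J, U):
    """Eq. (12): product of J[dst][src] along the chain U = (u_1, ..., u_rho)."""
    C = 1
    for src, dst in zip(U, U[1:]):
        C *= J[dst][src]
    return C


def minor_kl(J, k, l):
    """D^(kl): determinant of J with row k and column l deleted."""
    n = len(J)
    return det(submatrix(J, [i for i in range(n) if i != k],
                         [j for j in range(n) if j != l]))


def chain_sum(J, k, l):
    """Sum over all chains U from k to l of (-1)^(|U|-1) * D_VV * C_U."""
    n = len(J)
    others = [i for i in range(n) if i not in (k, l)]
    total = 0
    for size in range(len(others) + 1):
        for middle in permutations(others, size):
            U = (k,) + middle + (l,)
            V = [i for i in range(n) if i not in U]
            total += (-1) ** (len(U) - 1) * det(submatrix(J, V, V)) * chain_product(J, U)
    return total


# Arbitrary integer test matrix (an assumption, not taken from the paper).
J = [[-3, 1, 0, 2],
     [2, -4, 1, 0],
     [0, 3, -5, 1],
     [1, 0, 2, -6]]

for k in range(4):
    for l in range(4):
        if k != l:
            assert chain_sum(J, k, l) == (-1) ** (k + l) * minor_kl(J, k, l)
print("chain sum equals (-1)^(k+l) D^(kl) for all k != l")
```

Chains containing a zero Jacobian element contribute nothing, so for a sparse network only the chains that actually exist in the digraph G enter the sum.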

For a given gene regulatory network model, Eq. (6), or alternatively Eq. (13), allows us to obtain analytical expressions predicting how variation in a gene X_k affects the equilibrium concentrations of all other genes in the network. In those cases where q_lk (x_k, a^(k)) equals zero, the genetic variation in X_k does not become manifested in the output of node X_l even though the equilibrium concentration of x_k is changed. If the variation becomes manifested in the output of more than one locus, the introduced polymorphism is pleiotropic. Since q_lk(x_k, a^(k)) depends explicitly on all chains leading from X_k to X_l, a change in genotype at one or more loci involved can potentially modify the effect on x_l of a shift in x_k, leading to epistasis. This implies that the epistasis and pleiotropy features of all loci can be mapped in a systematic way. This information can be used to validate a particular model against experimental measurements of q_lk(x_k, a^(k)) as well as to identify generic characteristics of how variation is manifested as a function of regulatory anatomy.

2.4. The regulatory feedback effect on x_k of genetic variation in X_k

The formula (13) for q_lk is only valid for k ≠ l. We now want to define a function which can be used to determine the effect of genotypic variation in X_k on x_k itself. It is obvious from Eq. (1) that even in an isolated node X_k without autoregulation, a change of genotype manifested as a change of a_k will in general lead to a shifted value of x_k. We will call this an unmediated effect. In addition there may be contributions from mediated effects due to the feedback loops involving X_k, including an autoregulatory loop. For example, in a system with the Jacobian in Eq. (11), a change of genotype in X₂ will lead to a shift in x₂ for two reasons: a change of the dose-response function r₂, and because of the loop X₂ ⇄ X₃. The resultant of both effects determines how x_k responds to genetic variation in X_k.

We let X_L be the set of nodes apart from X_k itself that act directly on X_k, and X_M the remaining set of nodes, such that {k} ∪ L ∪ M = N. The stationarity condition for node X_k is

γ_{k} x_{k} = r_{k} (x_{k}, x_{L}, a_{k}) .

(14)

According to Proposition 1 we can in principle find x_l = p_lk(x_k, a^(k)) for all l ∈ L ∪ M, i.e. all l ≠ k. Inserting this into Eq. (14) gives

γ_{k} x_{k} = r_{k} (x_{k}, p_{Lk} (x_{k}, a^{(k)}), a_{k}) .

(15)

We define the feedback function ϕ_k for X_k by

ϕ_{k} (x_{k}, a) = r_{k} (x_{k}, p_{Lk} (x_{k}, a^{(k)}), a_{k}),

(16)

or just ϕ_k(x_k) = r_k(x_k, p_Lk(x_k)), and express the stationarity condition for X_k as

x_{k} = \frac{1}{γ_{k}} ϕ_{k} (x_{k}, a) .

(17)

For a given genotype, expressed as a given value of the parameter set a, the value of x_k can be found as the (by assumption stable and unique) solution of this equation. If $ψ_{k} (x_{k}, a) = ϕ_{k}^{'} (x_{k}, a) \equiv 0$ , where the prime denotes the derivative with respect to x_k, then X_k is not involved in any regulatory feedback loop. However, if $ψ_{k} (x_{k}, a) = ϕ_{k}^{'} (x_{k}, a) \neq 0$ , there is an effective feedback of X_k on itself, mediated by one or more loops. Therefore the feedback function ϕ_k describes and quantifies the feedback effects of changes in the equilibrium value of X_k on itself.

The derivative of ϕ_k can be expressed in terms of the Jacobi matrix elements. Differentiating Eq. (16) with respect to x_k, using x_l = p_lk(x_k, a^(k)), Eq. (6) and that ∂r_k/∂x_m = 0 for all m ∈ M defined just before Eq. (14), we find

ψ_{k} (x_{k}, a) = \frac{\partial r_{k}}{\partial x_{k}} + \sum_{l \in L} \frac{\partial r_{k}}{\partial x_{l}} q_{lk} = γ_{k} + \frac{D}{D^{(kk)}} .

(18)

Let F_k be the sum of the signed circuit products (defined in Appendix B) of all full circuits in J in which there is a real regulation of X_k, but not necessarily of the other nodes. (We do not consider the linear degradation as a regulation. For example, in J defined in Eq. (11), there are two full circuits with circuit products −γ₁ γ₂ γ₃ and −γ₁ c₂₃ c₃₂, respectively, but only the latter includes a real regulation of X₂ (by X₃) and would contribute to F₂. Neither contributes to F₁.)

As an illustration we consider the system

\begin{array}{rcl} γ_{1} x_{1} & = & r_{1} (x_{1}, x_{2}, x_{3}), \\ γ_{2} x_{2} & = & r_{2} (x_{1}), \\ γ_{3} x_{3} & = & r_{3} (x_{2}), \end{array}

(19)

with the two loops X₁ ⇄ X₂ and X₁ → X₂ → X₃ → X₁. With k = 1 we readily find

ψ_{1} = γ_{1} + \frac{1}{γ_{2} γ_{3}} (γ_{2} γ_{3} J_{11} + γ_{3} J_{12} J_{21} + J_{13} J_{32} J_{21}) .

(20)

The expression in the parenthesis is F₁. Its second term comes with a positive sign because the minus sign for −γ₃ is cancelled by the negative signature factor of the loop X₁ ⇄ X₂. Here as always, F_k is independent of γ_k, but not of the other degradation rates.
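For the system in Eq. (19) the slope ψ₁ can also be obtained directly from the first equality in Eq. (18), which gives a term-by-term check of the determinant form. The following worked step is a sketch that assumes J_{11} denotes the full Jacobian element ∂r₁/∂x₁ − γ₁ and that D^{(11)} is the determinant of J with row 1 and column 1 deleted.

```latex
\begin{array}{rcl}
q_{21} & = & \dfrac{J_{21}}{γ_{2}}, \qquad q_{31} = \dfrac{J_{32}}{γ_{3}} q_{21} = \dfrac{J_{32} J_{21}}{γ_{2} γ_{3}}, \\[2ex]
ψ_{1} & = & \dfrac{\partial r_{1}}{\partial x_{1}} + J_{12} q_{21} + J_{13} q_{31} = \dfrac{\partial r_{1}}{\partial x_{1}} + \dfrac{J_{12} J_{21}}{γ_{2}} + \dfrac{J_{13} J_{32} J_{21}}{γ_{2} γ_{3}}, \\[2ex]
D & = & γ_{2} γ_{3} J_{11} + γ_{3} J_{12} J_{21} + J_{13} J_{32} J_{21}, \qquad D^{(11)} = γ_{2} γ_{3}, \\[2ex]
γ_{1} + \dfrac{D}{D^{(11)}} & = & γ_{1} + J_{11} + \dfrac{J_{12} J_{21}}{γ_{2}} + \dfrac{J_{13} J_{32} J_{21}}{γ_{2} γ_{3}} = ψ_{1} .
\end{array}
```

The two loop terms correspond to X₁ ⇄ X₂ and X₁ → X₂ → X₃ → X₁, each divided by the degradation rates of the nodes it passes through other than X₁.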

We also note that D^(kk) is the sum of the signed circuit products of all circuits (proper and composite) of length n − 1 which do not involve X_k. Expanding D along row k gives

D = \frac{\partial r_{k}}{\partial x_{k}} D^{(kk)} - γ_{k} D^{(kk)} + \sum_{j \neq k} {(- 1)}^{k + j} \frac{\partial r_{k}}{\partial x_{j}} D^{(kj)} .

(21)

According to the lemma in Appendix B a determinant can be expanded as a sum of its signed circuit products. The first term in Eq. (21) is the sum of all full, composite circuit products with an autoregulatory subcircuit in X_k. Each term in the last sum is the determinant of a matrix K^kj obtained by setting all elements in row k and column j except J_kj equal to zero. Then det(K^kj) is the sum of all circuit products in J which involve the element J_kj, i.e. in which X_k contributes with an active regulation. This gives D = F_k − γ_k D^(kk), and if D^(kk) ≠ 0,

ψ_{k} (x_{k}, a) = \frac{F_{k}}{D^{(kk)}} = γ_{k} \frac{F_{k}}{F_{k} - D} .

(22)

By this we have obtained a formula that relates the gain of the feedback function of a locus X_k to the circuit products of the full circuits in which X_k is regulated. This circuit would be either a full loop or a set of subloops, one of them involving X_k, and a number of degradation terms. It provides an analytic basis for the intuition that a high gain is obtained if the loops that X_k enters into are much stronger than the rest, i.e. if |F_k| ≫ |D^(kk)|. Note that because x is hyperbolic by assumption, D ≠ 0, thus ψ_k (x_k, a) ≠ γ_k.

If ψ_k(x_k, a) = 0, then F_k = 0, which means that there is no effective regulation of X_k or the effects of the regulating loops happen to cancel. Then assume ψ_k(x_k, a) ≠ 0. Solving Eq. (22) with respect to D and using that (−1)ⁿ D > 0 (see Appendix B) leads to

{(- 1)}^{n} F_{k} Ω_{k} (x_{k}, a) > 0,

(23)

where

Ω_{k} (x_{k}, a) = \frac{ψ_{k} (x_{k}, a) - γ_{k}}{ψ_{k} (x_{k}, a)} .

(24)

Assume there exists a full circuit composed of a proper loop L involving X_k and perhaps a number of degradation terms. Let P be the loop product of L. As is illustrated in Eq. (20), the sign of this circuit's product is equal to sign(P) independently of the number of degradation terms, because the negative signs of the degradation terms are compensated by the signature factor (see Lemma 2 in Appendix B) of the full loop. If sign(F_k) = sign(P), we call L a sign-dominant loop of X_k. The signature factor of L is (−1)ⁿ⁻¹ because it has n members. The sign of its contribution to F_k is therefore (−1)ⁿ⁻¹ sign(P), yielding the result

P Ω_{k} (x_{k}, a) < 0,

(25)

which will be used to prove Proposition 5. From this follows readily

Proposition 3

If P > 0, then 0 < ψ_k(x_k, a) < γ_k, and if P < 0, then ψ_k(x_k, a) < 0 or ψ_k(x_k, a) > γ_k, and vice versa.

Thus, a positive sign-dominant proper loop implies a feedback function with positive slope bounded by the degradation rate, while a negative sign-dominant loop implies either negative slope or a large positive slope of ϕ_k. If L is a composite loop, Eq. (25) is replaced by

{(- 1)}^{n + ε_{L}} P Ω_{k} (x_{k}, a) > 0,

(26)

where (−1)^ε_L is the signature of the loop. If F_k ≠ 0, there is always at least one sign-dominant loop for X_k.
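The sign rule of Proposition 3 can be illustrated on the smallest possible case, a two-node loop X₁ ⇄ X₂ without autoregulation, where the single loop is trivially sign-dominant. The sketch below uses assumed Jacobian entries and hypothetical function names; it computes ψ₁ from Eq. (18), Ω₁ from Eq. (24), and prints the product P Ω₁ appearing in Eq. (25).

```python
def feedback_slope(J, gammas, k):
    """psi_k = gamma_k + D / D^(kk), Eq. (18), specialised to a two-node network."""
    D = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    other = 1 - k
    D_kk = J[other][other]   # 1x1 minor left after deleting row and column k
    return gammas[k] + D / D_kk


def check(J12, J21, gammas=(1.0, 2.0)):
    """Return (P, psi_1, Omega_1) for the loop X1 <-> X2 with assumed entries."""
    g1, g2 = gammas
    J = [[-g1, J12], [J21, -g2]]
    D = g1 * g2 - J12 * J21
    assert D > 0, "stability requires (-1)^n D > 0, here with n = 2"
    P = J12 * J21                       # loop product
    psi = feedback_slope(J, gammas, k=0)
    omega = (psi - g1) / psi            # Eq. (24)
    return P, psi, omega


for J12, J21 in [(0.8, 1.5), (-0.8, 1.5)]:
    P, psi, omega = check(J12, J21)
    print(f"P = {P:+.2f}, psi_1 = {psi:+.3f}, P * Omega_1 = {P * omega:+.3f}")
```

For the positive loop the slope lies between 0 and γ₁, for the negative loop it is negative, and in both cases P Ω₁ is negative.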

To compute the values of x_k for a slight change of genotype in X_k, the shift in a_k must also be taken into account. Let x_k = x_k(a) be the solution of Eq. (17), and let b ∈ a_k be a single parameter. Differentiating Eq. (17) and introducing Jacobi elements as in the derivation of Eq. (18) we find

\frac{\partial x_{k}}{\partial b} = - \frac{D^{(kk)}}{D} \frac{\partial r_{k}}{\partial b} = - \frac{D^{(kk)}}{D} \frac{\partial ϕ_{k}}{\partial b} = \frac{1}{γ_{k} - ψ_{k} (x_{k}, a)} \frac{\partial ϕ_{k}}{\partial b} .

(27)

This formula emphasises the importance of the feedback function as a source of information about the phenotypic effects of genotype changes.
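A finite-difference check of the last expression in Eq. (27) is straightforward when the propagation function is available in closed form. The sketch below assumes a two-locus network in which X₂ represses X₁ and X₁ activates X₂; the functional forms, the parameter b (a maximal production rate of X₁) and all names are illustrative assumptions.

```python
GAMMA1, GAMMA2 = 1.0, 1.0   # assumed degradation rates


def p21(x1):
    """Propagation function x2 = p_21(x1) = r_2(x1) / gamma_2 for the assumed r_2."""
    return 2.0 * x1 / (1.0 + x1) / GAMMA2


def phi1(x1, b):
    """Feedback function phi_1(x1) = r_1(p_21(x1), b), with X2 repressing X1."""
    return b / (1.0 + p21(x1))


def equilibrium_x1(b):
    """Solve gamma_1 * x1 = phi_1(x1, b) by bisection on [0, b / gamma_1]."""
    lo, hi = 0.0, b / GAMMA1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if GAMMA1 * mid - phi1(mid, b) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b, h = 2.0, 1e-5
x1 = equilibrium_x1(b)

dp21 = 2.0 / (1.0 + x1) ** 2 / GAMMA2        # p_21'(x1)
psi1 = -b / (1.0 + p21(x1)) ** 2 * dp21      # psi_1 = d phi_1 / d x1
dphi_db = 1.0 / (1.0 + p21(x1))              # d phi_1 / d b at fixed x1

predicted = dphi_db / (GAMMA1 - psi1)        # last expression in Eq. (27)
observed = (equilibrium_x1(b + h) - equilibrium_x1(b - h)) / (2.0 * h)
print("predicted dx1/db:", predicted)
print("observed  dx1/db:", observed)
```

Because the loop is negative, ψ₁ < 0 and the response of x₁ to a change in b is damped relative to the unregulated value (∂ϕ₁/∂b)/γ₁.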

We are now ready to use these results to analyse diploid networks.

3. Allele interaction in networks with diploid loci

The rest of the paper deals with models of diploid systems, that is, systems in which chromosomes come in pairs with one variant of each gene, called an allele, on each of the two chromosomes. Thus, each gene is composed of two alleles, each allele being regulated more or less independently of the other, and the product of the gene is some combination of the product of each of the two alleles. If the two alleles are identical, the gene is called homozygotic, if they are different, the gene is heterozygotic, and if one of the alleles has been knocked out, it is hemizygotic.

Since the dawn of genetics, additive and dominant gene actions in diploids have been defined by comparing heterozygote and homozygote phenotypes without reference to, or model of, the functional dependency between the two alleles composing each genotype. However, from [14] as well as the present paper it is clear that it is precisely the interaction between the two alleles that gives rise to non-additive gene action. Consequently, the genetics concepts of additive and dominant gene actions cannot explain basic phenomena in genetics theory from regulatory biology. Exploiting the additivity and nonadditivity properties of the two alleles, Gjuvsland et al. [14] showed that by means of the new concept of allele interaction, gene regulatory systems with one or two loci can be linked to single locus genetic theory.

We first present ways of modelling a network of genes involving diploid loci in an efficient way, and then introduce the concept of allele interaction. Finally, we study how the sign of the allele interaction is related to the feedback structure of the network. When studying allele interaction, we contrast different genotypes at a single focal locus without specifying the genotype of the rest of the loci.

3.1. Allele-specific diploid gene regulatory network models

As our objective is to relate the changes in genotypic value (i.e. the phenotype) due to allelic variation of a locus X_i to a potential interaction between its two alleles, we need to model the function and regulation of the two alleles as two distinct entities. A bi-allelic node X_i, with one allele on each of the two chromosomes, splits into two subnodes $X_{i}^{1}$ and $X_{i}^{2}$, with $z_{i}^{1}$ and $z_{i}^{2}$ representing the concentration of gene product from each of the two chromosomes, respectively.

As stated in the introduction we assume that the outputs of the two alleles are functionally equivalent in the sense that they regulate other genes in the same fashion, genetic differences manifesting themselves only in the regulation of the two alleles, not in qualitative differences in their output. This implies that the nodes in the network are regulated by the total gene product $z_{i} = z_{i}^{1} + z_{i}^{2}$ , not by its two constituents separately. If this assumption should not hold for a gene X_i, the dose-response function of a downstream gene could depend on $z_{i}^{1}$ and $z_{i}^{2}$ separately. In such cases the simplifications described below would not be justified, and one would have to model the two alleles of this gene by two separate equations.

We let superscripts α_i and β_i denote the alleles in $X_{i}^{1}$ and $X_{i}^{2}$ , respectively, α_i ∈ {1,2}, β_i ∈ {1,2}. If the genotype of X_i is biallelic with alleles α_i, β_i, we model its rate equations by

\begin{array}{l} {\dot{z}}_{i}^{1} = f_{i}^{α_{i}} (z) = r_{i}^{α_{i}} (z) - γ_{i}^{α_{i}} z_{i}^{1}, \\ {\dot{z}}_{i}^{2} = f_{i}^{β_{i}} (z) = r_{i}^{β_{i}} (z) - γ_{i}^{β_{i}} z_{i}^{2}, \end{array}

(28)

where $z_{i} = z_{i}^{1} + z_{i}^{2}$ . For simplicity we suppress the parameters a_i from the arguments of the dose-response functions in the following. If (α_i, β_i) = (1, 1) or (α_i, β_i) = (2, 2), the equations describe a homozygous locus X_i, while (α_i, β_i) = (1,2) describes the heterozygote. This model for a diploid node was first proposed by Omholt et al. [13].

If X_i is a homozygous locus, α_i = β_i, and simple addition of the two equations gives

{\dot{z}}_{i} = 2 r_{i}^{α_{i}} (z) - γ_{i}^{α_{i}} z_{i} .

(29)

In the hemizygous genotypes where one allele has been knocked out and the remaining copy is of genotype α_i, Eqs. (28) are reduced to

{\dot{z}}_{i}^{1} = f_{i}^{α_{i}} (z) = r_{i}^{α_{i}} (z) - γ_{i}^{α_{i}} z_{i}^{1},

(30)

and $z_{i} = z_{i}^{1}$ . For each polymorphic locus we may therefore consider five different genotypes: the bi-allelic genotypes 11, 12, and 22, and the mono-allelic genotypes 1 and 2.
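The five genotypes can be illustrated with a minimal sketch of Eqs. (28) and (30) for a single autoregulated locus. The dose-response function r(z) = a/(1 + z), the allele parameters in `ALLELES`, the Euler scheme, and all names are assumptions made for this example; they are not taken from the paper.

```python
# Hypothetical allele-specific parameters: maximal production rate a and
# relative degradation rate gamma for allele types 1 and 2.
ALLELES = {1: {"a": 2.0, "gamma": 1.0}, 2: {"a": 12.0, "gamma": 2.0}}


def r(z_total, a):
    """Hypothetical dose-response function: production repressed by total product."""
    return a / (1.0 + z_total)


def simulate(genotype, t_end=60.0, dt=0.01):
    """Euler-integrate one rate equation per allele present in `genotype`.

    `genotype` is a tuple of allele labels: (1, 2) is the heterozygote of
    Eq. (28), (1,) is a hemizygote as in Eq. (30). Every allele is regulated
    by the total product z = sum of the allele-specific products.
    Returns the total product at t_end (taken as the equilibrium value).
    """
    z = [0.0 for _ in genotype]
    for _ in range(int(t_end / dt)):
        total = sum(z)
        z = [
            zi + dt * (r(total, ALLELES[al]["a"]) - ALLELES[al]["gamma"] * zi)
            for zi, al in zip(z, genotype)
        ]
    return sum(z)


GENOTYPES = [(1,), (2,), (1, 1), (1, 2), (2, 2)]
genotypic_values = {g: simulate(g) for g in GENOTYPES}
```

With these illustrative parameters the hemizygote values are 1 and 2, and the biallelic values follow from z(1 + z) = 4, 8 and 12 for genotypes 11, 12 and 22, respectively.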

In the following we consider a network in which X_n is polymorphic while the genotypes of the remaining loci are unspecified but fixed. We first describe it by the extended system S_E defined by the rate equations

S_{E} : \begin{cases} {\dot{z}}_{i} = r_{i} (z) - γ_{i} z_{i}, \quad i = 1, \dots, n - 1, \\ {\dot{z}}_{n}^{1} = r_{n}^{α_{n}} (z) - γ_{n}^{α_{n}} z_{n}^{1}, \\ {\dot{z}}_{n}^{2} = r_{n}^{β_{n}} (z) - γ_{n}^{β_{n}} z_{n}^{2}, \end{cases}

(31)

where $z_{n} = z_{n}^{1} + z_{n}^{2}$ and z = [z₁, …, z_n]. For simplicity we drop the subscript n to α_n and β_n in the following. As above, we denote the presupposed asymptotically stable state of Eq. (31) by x = [x₁, …, x_n].

3.2. Aggregating diploid loci

Contrary to Eq. (28), common ways of modelling gene regulatory networks describe a gene by a single equation for the total output of the gene, even when the gene is diploid. In the present section we investigate whether these two contrasting modelling schemes can be unified into a common modelling approach.

By exploiting the assumption that the diploid node X_n only acts on the other nodes by its total output z_n, we want to convert the extended model into a new model expressed in terms of the total product of a locus, while still keeping track of the properties of each of the two alleles. In other words, we want to construct a system S_A obtained by merging $X_{n}^{1}$ and $X_{n}^{2}$ into one aggregated node X_n with a single rate equation for z_n. We will call this conversion an aggregation. The rationale for this operation is that an aggregated model facilitates considerably the theoretical analysis of allele interaction in high-dimensional systems.

In fact, almost all gene regulatory models occurring in the literature are aggregated in the sense that they describe each gene by just one variable representing the amount or concentration of the gene’s output. This is so even if the gene is diploid, and even in cases where several of the genes probably have allelic variation in the coding region as well, and perhaps produce qualitatively different outputs. Gene transcription and translation are very complicated processes which are only very crudely modelled by the kind of equations studied in the present paper. Even if the two allele products act in the same way such that only their total concentration matters as regulatory agents, there may be different degradation rates operating at the mRNA stage, during the translation process or later. If $γ_{n}^{α} = γ_{n}^{β} = γ_{n}$, then obviously ${\dot{z}}_{n} = r_{n}^{α} (z, a_{n}) + r_{n}^{β} (z, a_{n}) - γ_{n} z_{n}$. However, when $γ_{n}^{α} \neq γ_{n}^{β}$, it is impossible to combine the last two equations in Eqs. (31) into one rate equation for z_n. The crucial problem in these cases is to perform the aggregation in such a way that the aggregated model reproduces the properties of the original extended model.

A natural solution would be to assume that the total dose-response function of the gene is the sum of the dose-response functions for each of the two alleles, and that the relative degradation rate of the total gene product $z_{n} = z_{n}^{1} + z_{n}^{2}$ is an average of the two allelic degradation rates $γ_{n}^{1}$ and $γ_{n}^{2}$ . We will call such a model an aggregated model S_A of S_E:

S_{A} : \begin{cases} {\dot{y}}_{i} = r_{i} (y) - γ_{i} y_{i}, \quad i = 1, \dots, n - 1, \\ {\dot{y}}_{n} = r_{n}^{α} (y) + r_{n}^{β} (y) - γ_{n}^{αβ} y_{n}, \end{cases}

(32)

where

γ_{n}^{αβ} = \frac{γ_{n}^{α} x_{n}^{α ∣ \circ} + γ_{n}^{β} x_{n}^{β ∣ \circ}}{x_{n}^{α ∣ \circ} + x_{n}^{β ∣ \circ}} .

(33)
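Eq. (33) is a weighted average of the two allelic degradation rates, with the hemizygote equilibrium values as weights. A minimal sketch follows; the function name and the numerical values are hypothetical and serve only to illustrate the formula.

```python
def aggregated_gamma(gamma_alpha, gamma_beta, x_alpha_hemi, x_beta_hemi):
    """Aggregated relative degradation rate of Eq. (33).

    gamma_alpha, gamma_beta   : allelic degradation rates
    x_alpha_hemi, x_beta_hemi : hemizygote equilibrium values x^{alpha|o}, x^{beta|o}
    """
    weight_sum = x_alpha_hemi + x_beta_hemi
    if weight_sum <= 0.0:
        raise ValueError("hemizygote equilibrium values must have a positive sum")
    return (gamma_alpha * x_alpha_hemi + gamma_beta * x_beta_hemi) / weight_sum


# Hypothetical heterozygote: rates 1 and 2, hemizygote values 1 and 2.
gamma_het = aggregated_gamma(1.0, 2.0, 1.0, 2.0)   # (1*1 + 2*2) / (1 + 2)

# Homozygote: both alleles identical, so the average is the allelic rate itself.
gamma_hom = aggregated_gamma(2.0, 2.0, 2.0, 2.0)
```

Being a convex combination, the aggregated rate always lies between the two allelic rates, and for a homozygote it reduces to the allelic rate, in agreement with Eq. (29).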

Let z(t, z⁰) and y(t, y⁰) be the solutions of S_E and S_A, respectively, satisfying z(0, z⁰) = z⁰ and y(0,y⁰) = y⁰. It is easy to see that if $z^{*} = [z_{1}^{*}, \dots, z_{n}^{*}]$ is a steady point of S_E, then y* = z* is a steady point of S_A. It is not obvious that if z* is an asymptotically stable point of S_E, then y* = z* is an asymptotically stable point of S_A, and if z* is hyperbolic, then y* is hyperbolic. However, we show in Appendix D that this is in fact the case. We also show by extensive numerical simulations that in the majority of cases the temporal behaviours of z(t, z⁰) and y(t, z⁰) are approximately equal and qualitatively similar for a range of actual parameter values and realistic common initial values y⁰ = z⁰ ≠ z*. This being the case, we call S_A a well-founded aggregation of S_E.

These results strongly suggest that the idea of aggregating a diploid model in this way makes sense. If S_E has several biallelic nodes, we apply this aggregation procedure to each node in turn. If each aggregation is well-founded, we finally arrive at a well-founded, fully aggregated system S_FA. Its diploid loci X_i of genotype α_iβ_i are described by

{\dot{y}}_{i} = r_{i}^{α_{i}} (y) + r_{i}^{β_{i}} (y) - γ_{i}^{α_{i} β_{i}} y_{i},

(34)

where $γ_{i}^{α_{i} β_{i}}$ is given by Eq. (33) with n replaced by i. Haploid nodes X_j are described by

{\dot{y}}_{j} = r_{j} (y) - γ_{j} y_{j} .

(35)

If all nodes are diploid and admit the above aggregation process, the dimensionality of the model has been reduced from 2n to n, leading to the fully aggregated model S_FA. In the next section we use S_FA to investigate the consequences of knockout behaviour and different allele combinations in genotype-phenotype maps (GP maps).

3.3. The allele interaction concept

The concept of allele interaction for a polymorphic locus X for some specific phenotypic trait in a regulatory network was defined by Gjuvsland et al. [14]. Recall that $x_{i}^{1 ∣ \circ}$ and $x_{i}^{2 ∣ \circ}$ are the hemizygote genotypic values of a locus X_i when only allele 1, respectively allele 2 is present, and $x_{i}^{11}$ , $x_{i}^{12}$ and $x_{i}^{22}$ are the biallelic homozygote and heterozygote genotypic values. The heterozygote allele interaction value $Δ_{i}^{12}$ of X_i is defined as

Δ_{i}^{12} = x_{i}^{12} - (x_{i}^{1 ∣ \circ} + x_{i}^{2 ∣ \circ}) .

(36)

We define the two homozygote allele interaction values $Δ_{i}^{11}$ and $Δ_{i}^{22}$ in the same way, in general

Δ_{i}^{αβ} = x_{i}^{αβ} - (x_{i}^{α ∣ \circ} + x_{i}^{β ∣ \circ}),

(37)

where α,β ∈ {1, 2}. An allele interaction is said to be negative if $Δ_{i}^{αβ} < 0$ and positive if $Δ_{i}^{αβ} > 0$ .

Mendelian dominance is expressed by the dominance value

d_{i} = x_{i}^{12} - \frac{x_{i}^{11} + x_{i}^{22}}{2} .

(38)

The name “dominance value” stems from the fact that if d_i ≠ 0, then one of the alleles contributes more to (dominates) the equilibrium value, shifting the heterozygous value away from the midpoint between the two homozygous equilibrium values. Allele interaction is closely related to d_i because $d_{i} = Δ_{i}^{12} - (Δ_{i}^{11} + Δ_{i}^{22}) / 2$ . Gjuvsland et al. [14] showed that if an isolated node X is under negative autoregulation, then its three allele interaction values are negative, while if the autoregulation is positive, they are positive. Building upon the theoretical machinery developed above, we show in the following that these results can be generalised to higher dimensional gene regulatory networks with more complex feedback structures. In this way we are able to build new theory relating gene action concepts and regulatory network anatomy to quantitative genetics.
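The relation between the dominance value and the allele interaction values follows directly from Eqs. (37) and (38), because the hemizygote terms cancel. A minimal sketch that computes both quantities from the five genotypic values is given below; the function names and the numerical values are hypothetical illustrations, not measurements.

```python
def allele_interaction(x_biallelic, x_hemi_a, x_hemi_b):
    """Eq. (37): biallelic value minus the sum of the two hemizygote values."""
    return x_biallelic - (x_hemi_a + x_hemi_b)


def dominance_value(x11, x12, x22):
    """Eq. (38): heterozygote value minus the homozygote midpoint."""
    return x12 - (x11 + x22) / 2.0


# Hypothetical genotypic values for one locus.
x1_hemi, x2_hemi = 1.0, 2.0
x11, x12, x22 = 1.5, 2.5, 3.0

delta11 = allele_interaction(x11, x1_hemi, x1_hemi)   # 1.5 - 2.0
delta12 = allele_interaction(x12, x1_hemi, x2_hemi)   # 2.5 - 3.0
delta22 = allele_interaction(x22, x2_hemi, x2_hemi)   # 3.0 - 4.0

d = dominance_value(x11, x12, x22)
d_from_deltas = delta12 - (delta11 + delta22) / 2.0
```

In this example all three allele interaction values are negative while the dominance value is positive, which shows that the two concepts carry different information.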

If the two alleles of X_i were completely independent, one would expect $Δ_{i}^{12} = 0$, as in this case the total output of the gene would be just the sum of the outputs from the two alleles. A nonzero value would therefore indicate some kind of one-way or mutual action between the alleles. In a single-locus model a nonzero allele interaction value could be a consequence of the feedback between the two alleles. Non-feedback mechanisms, such as transvection [29] in which one allele has an effect on the other (but not the other way round), could also lead to nonzero allele interaction [14].

To search for systemic causes of nonzero allele interaction values we examine the two-locus systems in Figure 1. Their rate equations are given in Appendix E. The node X₁ is diploid and splits in two subnodes $X_{1}^{1}$ and $X_{1}^{2}$ of type α₁ = 1 and β₁ = 2, respectively. Taking the total equilibrium concentration of X₁ as the system’s phenotype, we want to investigate the allele interaction value $Δ_{1}^{12}$ of the system in Figure 1a. We then have to compare the equilibrium value $y_{1}^{12}$ with $y_{1}^{1 ∣ \circ} + y_{1}^{2 ∣ \circ}$ , the phenotype values when only one allele is present. Let the node X̃₂ be a copy of X₂. Then $y_{1}^{1 ∣ \circ}$ and $y_{1}^{2 ∣ \circ}$ are the phenotype values of the two systems in Figure 1c.

Figure 1. (a) The two-locus system analysed in Section 3.3 to investigate the source of nonzero allele interaction values. The nodes $X_{1}^{1}$ and $X_{1}^{2}$ are the two alleles of the locus X₁. Black arrows indicate direct effects. (b) The interaction diagram of the artificial μ-system. The node X̃₂ is a genetically identical copy of X₂. The two red arrows indicate actions whose strengths depend on μ. (c) If μ = 0, the actions are zero. The μ-system splits into two independent subsystems, each of them equivalent to the original system in (a) with one allele knocked out. (d) If μ = 1, the strengths of the actions are the same as in the system in (a). In this case the μ-system is equivalent to the original system in (a) without any allele knockout.

To ease the comparison between these two systems we introduce the artificial μ-system in Figure 1b. The two red arcs represent actions whose strengths (expressed by the magnitudes of the corresponding Jacobian elements) are proportional to a parameter μ which can be varied in [0, 1]. The equilibrium values of $X_{1}^{1}$ and $X_{1}^{2}$ in the μ-system are $y_{1}^{1 ∣ 2} (μ)$ and $y_{1}^{2 ∣ 1} (μ)$ , and the corresponding allele interaction value is $Δ_{1}^{12} (μ)$ .

If μ = 0, the μ-system simplifies to the two independent subsystems in Figure 1c. The one to the left (right) is the allele knockout system with only $X_{1}^{1} (X_{1}^{2})$ left because X̃₂ and X₂ are presumed identical. The phenotypic value of the whole system is $y_{1}^{12} (0) = x_{1}^{1 ∣ \circ} + x_{1}^{2 ∣ \circ}$ . Therefore $Δ_{1}^{12} (0) = y_{1}^{12} (0) - y_{1}^{1 ∣ \circ} (0) - y_{1}^{2 ∣ \circ} (0) = x_{1}^{1 ∣ \circ} + x_{1}^{2 ∣ \circ} - x_{1}^{1 ∣ \circ} - x_{1}^{2 ∣ \circ} = 0$ .

If μ = 1, the μ-system is represented by Figure 1d. From its rate equations it follows that the equilibrium conditions of this system are the same as for Figure 1a, leading to equal allele interaction values. Therefore the μ-system interpolates continuously between the diploid system in Figure 1a and the allele knockout system in Figure 1c.

Differentiating the equilibrium conditions of the μ-system, we find that $d Δ_{1}^{12} / d μ ∣_{μ = 0}$ is in general nonzero. Details are given in Appendix E. Because $Δ_{1}^{12} (0) = 0$ , this implies that even for infinitesimally small μ the μ-system has a nonzero allele interaction value. One might think that this nonzero value is caused by the feedback loop $X_{1}^{1} \to {\tilde{X}}_{2} \to X_{1}^{2} \to X_{2} \to X_{1}^{1}$ . However, there is still a nonzero allele interaction value if the arc from $X_{1}^{2}$ to X₂ is removed. In this case there is no mutual interaction between the alleles, only an indirect action from $X_{1}^{1}$ to $X_{1}^{2}$ , but no chain from $X_{1}^{2}$ to $X_{1}^{1}$ . We conclude that a nonzero allele interaction value could be caused by feedback among the two alleles, but that a one-way action is sufficient.

3.4. Allele interaction, feedback functions and feedback loops

The allele interaction values can be computed from directly observable quantities. In this section we show how they can be related to properties of the network. Using finite differences and the mean value theorem, the derivatives of the dose-response functions can be estimated in terms of the single allele effects

\begin{matrix} δ_{l}^{α_{k} β_{k} \setminus α_{k}} & = & x_{l}^{α_{k} β_{k}} - x_{l}^{β_{k} ∣ \circ}, \\ δ_{l}^{α_{k} β_{k} \setminus β_{k}} & = & x_{l}^{α_{k} β_{k}} - x_{l}^{α_{k} ∣ \circ} \end{matrix}

(39)

which quantify the effect on any locus X_l of activating the second allele in the initially hemizygous locus X_k. Then

p_{lk}^{'} (c_{l}^{α_{k} β_{k}}) = q_{lk} (c_{l}^{α_{k} β_{k}}) = \frac{δ_{l}^{α_{k} β_{k} \setminus β_{k}}}{δ_{k}^{α_{k} β_{k} \setminus β_{k}}},

(40)

where $c_{l}^{α_{k} β_{k}} \in (x_{k}^{α_{k} ∣ \circ}, x_{k}^{α_{k} β_{k}})$ . We use the subscript l in $c_{l}^{α_{k} β_{k}}$ because its value clearly depends on l.

Because the function p_lk is independent of the allelic composition of X_k, we can get four independent estimates of q_lk by combining $X_{k}^{11}$ with $X_{k}^{1}$ , $X_{k}^{22}$ with $X_{k}^{2}$ , and $X_{k}^{12}$ with $X_{k}^{1}$ and with $X_{k}^{2}$ . Note however that they will refer to different and unknown arguments, so that all together they will provide an estimate of the average value of q_lk in the interval between the minimum and maximum of the five genotypic values $x_{k}^{1 ∣ \circ}$ , $x_{k}^{2 ∣ \circ}$ , $x_{k}^{11}$ , $x_{k}^{12}$ and $x_{k}^{22}$ .

If a model for a given network exists, we can use Eq. (40) to estimate how the single allele effect propagates through the network as a consequence of polymorphism in X_k and the network connectivities, and use this to test the model. Conversely, measurements of the single allele effects from a polymorphic locus give information about the network connections [30].
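The estimate in Eq. (40) is a ratio of two single allele effects. The sketch below illustrates it with a hypothetical propagation function p(x_k) = 3/(1 + x_k) and hypothetical equilibrium values for X_k; in an experiment the four equilibrium values would be measured rather than computed. By the mean value theorem the ratio equals the derivative of p at some point between the hemizygote and the biallelic value of x_k.

```python
import math


def estimate_q(x_l_full, x_l_hemi, x_k_full, x_k_hemi):
    """Eq. (40): ratio of the single allele effects on X_l and on X_k.

    *_full : equilibrium value with both alleles of X_k present
    *_hemi : equilibrium value with one allele of X_k knocked out
    """
    delta_l = x_l_full - x_l_hemi
    delta_k = x_k_full - x_k_hemi
    if delta_k == 0.0:
        raise ValueError("activating the second allele did not change x_k")
    return delta_l / delta_k


def p(x_k):
    """Hypothetical propagation function from X_k to X_l (illustration only)."""
    return 3.0 / (1.0 + x_k)


x_k_hemi, x_k_full = 1.0, 3.0                   # hypothetical values of x_k
x_l_hemi, x_l_full = p(x_k_hemi), p(x_k_full)   # corresponding values of x_l

q = estimate_q(x_l_full, x_l_hemi, x_k_full, x_k_hemi)

# Point c where p'(c) = -3/(1+c)**2 equals the estimate q.
c = math.sqrt(-3.0 / q) - 1.0
```

Here the estimate is negative, as expected for a decreasing propagation function, and the intermediate point c lies inside the interval spanned by the two values of x_k.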

Again dropping the subscript k from α_k and β_k, we denote in the following the feedback functions of X_k by $ϕ_{k}^{α}$ , $ϕ_{k}^{β}$ and $ϕ_{k}^{αβ}$ , and similarly for F_k, etc. According to Eq. (17) the stationarity conditions for the allele combinations α∣∘, β∣∘ and αβ are

\begin{matrix} γ_{k}^{α} x_{k}^{α ∣ \circ} & = & ϕ_{k}^{α} (x_{k}^{α ∣ \circ}, a), \\ γ_{k}^{β} x_{k}^{β ∣ \circ} & = & ϕ_{k}^{β} (x_{k}^{β ∣ \circ}, a), \\ γ_{k}^{αβ} x_{k}^{αβ} & = & ϕ_{k}^{α} (x_{k}^{αβ}, a) + ϕ_{k}^{β} (x_{k}^{αβ}, a) = ϕ_{k}^{αβ} (x_{k}^{αβ}, a), \end{matrix}

(41)

where $γ_{k}^{αβ}$ is computed in accordance with Eq. (33). The last equation follows because, as is evident from Proposition 1, to derive p_Lk we do not use the stationarity condition for X_k, and all the other stationarity conditions are invariant under polymorphism of X_k and do not have superscripts α and β. The allele interaction value $Δ_{k}^{αβ} = x_{k}^{αβ} - x_{k}^{α ∣ \circ} - x_{k}^{β ∣ \circ}$ is then given in terms of the solutions of the three Eqs. (41). The following proposition relates $Δ_{k}^{αβ}$ to the derivatives $ψ_{k}^{α}$ and $ψ_{k}^{β}$ of the feedback functions $ϕ_{k}^{α}$ and $ϕ_{k}^{β}$ .

Proposition 4

For any biallelic locus X_k, k ∈ N, there exist numbers $c_{k}^{αβ} \in (x_{k}^{α ∣ \circ}, x_{k}^{αβ})$ and $c_{k}^{β α} \in (x_{k}^{β ∣ \circ}, x_{k}^{αβ})$ such that

Δ_{k}^{αβ} = \frac{ψ_{k}^{α} (c_{k}^{αβ}, a) x_{k}^{β ∣ \circ} + ψ_{k}^{β} (c_{k}^{βα}, a) x_{k}^{α ∣ \circ}}{γ_{k}^{αβ} - ψ_{k}^{α} (c_{k}^{αβ}, a) - ψ_{k}^{β} (c_{k}^{β α}, a)} .

(42)

Proof

From Eqs. (17) and (41) it follows that

Δ_{k}^{αβ} = \frac{1}{γ_{k}^{αβ}} (ϕ_{k}^{α} (x_{k}^{αβ}, a) + ϕ_{k}^{β} (x_{k}^{αβ}, a)) - x_{k}^{α ∣ \circ} - x_{k}^{β ∣ \circ} .

Inserting $x_{k}^{αβ} = x_{k}^{α ∣ \circ} + (Δ_{k}^{αβ} + x_{k}^{β ∣ \circ})$ into $ϕ_{k}^{α} (x_{k}^{αβ}, a)$ and $x_{k}^{αβ} = x_{k}^{β ∣ \circ} + (Δ_{k}^{αβ} + x_{k}^{α ∣ \circ})$ into $ϕ_{k}^{β} (x_{k}^{αβ}, a)$ and using the mean value theorem on both functions lead after some elementary algebra and repeated use of Eqs. (33) and (17) to Eq. (42).
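The algebra can be spelled out as follows. This is a sketch that uses only the mean value theorem, the stationarity conditions in Eqs. (41) and the definition in Eq. (33); the parameter argument a is suppressed, and we write ψ_k^α and ψ_k^β for the derivatives evaluated at the intermediate points $c_{k}^{αβ}$ and $c_{k}^{βα}$.

```latex
\begin{aligned}
ϕ_{k}^{α}(x_{k}^{αβ}) &= ϕ_{k}^{α}(x_{k}^{α ∣ \circ}) + ψ_{k}^{α}\,(Δ_{k}^{αβ} + x_{k}^{β ∣ \circ})
  = γ_{k}^{α} x_{k}^{α ∣ \circ} + ψ_{k}^{α}\,(Δ_{k}^{αβ} + x_{k}^{β ∣ \circ}), \\
ϕ_{k}^{β}(x_{k}^{αβ}) &= ϕ_{k}^{β}(x_{k}^{β ∣ \circ}) + ψ_{k}^{β}\,(Δ_{k}^{αβ} + x_{k}^{α ∣ \circ})
  = γ_{k}^{β} x_{k}^{β ∣ \circ} + ψ_{k}^{β}\,(Δ_{k}^{αβ} + x_{k}^{α ∣ \circ}), \\
γ_{k}^{αβ} x_{k}^{αβ} &= ϕ_{k}^{α}(x_{k}^{αβ}) + ϕ_{k}^{β}(x_{k}^{αβ}) \\
  &= γ_{k}^{α} x_{k}^{α ∣ \circ} + γ_{k}^{β} x_{k}^{β ∣ \circ}
     + (ψ_{k}^{α} + ψ_{k}^{β})\,Δ_{k}^{αβ}
     + ψ_{k}^{α} x_{k}^{β ∣ \circ} + ψ_{k}^{β} x_{k}^{α ∣ \circ}, \\
γ_{k}^{α} x_{k}^{α ∣ \circ} + γ_{k}^{β} x_{k}^{β ∣ \circ}
  &= γ_{k}^{αβ}\,(x_{k}^{α ∣ \circ} + x_{k}^{β ∣ \circ}) \quad \text{by Eq. (33)}, \\
γ_{k}^{αβ}\,Δ_{k}^{αβ} &= (ψ_{k}^{α} + ψ_{k}^{β})\,Δ_{k}^{αβ}
     + ψ_{k}^{α} x_{k}^{β ∣ \circ} + ψ_{k}^{β} x_{k}^{α ∣ \circ}.
\end{aligned}
```

Solving the last line for $Δ_{k}^{αβ}$ gives Eq. (42).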

By combining Eq. (42) with Eq. (23) and using the following lemma, we are able to relate the sign of $Δ_{k}^{αβ}$ to properties of the feedback loops of the system.

Lemma 1

Let E be an open subset of R₊, let ϕ₁ : E → R₊ and ϕ₂ : E → R₊ be two positive, strictly monotonic and differentiable functions. Define ϕ₁₂(x) = ϕ₁(x) + ϕ₂(x), and assume that γ_ix = ϕ_i(x), i = 1,2, have unique solutions x₁, x₂ in E, where γ_i > 0 and x₁ ≤ x₂ by convention. Define

γ = \frac{γ_{1} x_{1} + γ_{2} x_{2}}{x_{1} + x_{2}},

(43)

and assume the solution x₁₂ of γx = ϕ₁₂(x) is also in E. Define Δ₁₂ = x₁₂ − x₁ − x₂ and δ₁ = x₁₂ − x₂, δ₂ = x₁₂ − x₁.

If $ϕ_{1}^{'} (x) < 0$ and $ϕ_{2}^{'} (x) < 0$ for all x ∈ E, then Δ₁₂ < 0 and δ₂ > 0.
If $0 < ϕ_{1}^{'} (x) < γ_{1}$ , $0 < ϕ_{2}^{'} (x) < γ_{2}$ for all x ∈ E, then Δ₁₂ > 0 and δ₁ > 0, δ₂ > 0.
If $ϕ_{1}^{'} (x) > γ_{1}$ , $ϕ_{2}^{'} (x) > γ_{2}$ for all x ∈ E, then Δ₁₂ < 0 and δ₁ < 0.

The proof is in Appendix F.
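The first two cases of Lemma 1 can be checked numerically on simple examples. The sketch below is an illustration under stated assumptions, not a substitute for the proof: the functions ϕ₁ and ϕ₂, the rates γ₁ and γ₂, the bracketing interval and the bisection solver are all hypothetical choices made for this example.

```python
def bisect_root(g, lo=0.0, hi=100.0, iters=200):
    """Root of g on [lo, hi] by bisection, assuming g(lo) < 0 < g(hi)
    and a single sign change (true for the increasing g used below)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lemma1_quantities(phi1, gamma1, phi2, gamma2):
    """Compute x1, x2, the averaged rate of Eq. (43), x12 and the differences."""
    x1 = bisect_root(lambda x: gamma1 * x - phi1(x))
    x2 = bisect_root(lambda x: gamma2 * x - phi2(x))
    gamma = (gamma1 * x1 + gamma2 * x2) / (x1 + x2)          # Eq. (43)
    x12 = bisect_root(lambda x: gamma * x - phi1(x) - phi2(x))
    return {"x1": x1, "x2": x2, "x12": x12,
            "Delta12": x12 - x1 - x2,
            "delta1": x12 - x2, "delta2": x12 - x1}


# Case 1: both derivatives negative (decreasing functions).
neg = lemma1_quantities(lambda x: 2.0 / (1.0 + x), 1.0,
                        lambda x: 12.0 / (1.0 + x), 2.0)

# Case 2: 0 < phi_i' < gamma_i (derivatives are 0.5/(1+x)**2 <= 0.5).
pos = lemma1_quantities(lambda x: 1.0 + 0.5 * x / (1.0 + x), 1.0,
                        lambda x: 2.0 + 0.5 * x / (1.0 + x), 1.5)
```

In both examples x₁ ≤ x₂, as required by the convention in the lemma, and the signs of Δ₁₂, δ₁ and δ₂ agree with the corresponding statements.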

Recall the definition $Ω_{i} (x) = (ϕ_{i}^{'} (x) - γ_{i}) / ϕ_{i}^{'} (x)$ in Eq. (24). Assume Ω₁ and Ω₂ have the same sign. It follows from Lemma 1 that if Ω_i < 0 for i = 1, 2, then Δ₁₂ > 0, and if Ω_i > 0 for i = 1, 2, then Δ₁₂ < 0. Combining this with Eqs. (23)-(26) we readily arrive at

Proposition 5

If $F_{k}^{α}$ and $F_{k}^{β}$ , α ≠ β, have the same sign S_k, then ${(- 1)}^{n} S_{k} Δ_{k}^{αβ} < 0$ . If P is the loop product of a sign-dominant, proper loop for X_k, then $P Δ_{k}^{αβ} > 0$ . If the loop is sign-dominant but composite and has signature factor (−1)^ε, then ${(- 1)}^{n + ε - 1} P Δ_{k}^{αβ} > 0$ .

The allele interaction values $Δ_{k}^{αβ}$ are directly observable by subjecting each X_k to allele knockout and recording the unperturbed and perturbed equilibrium values of x_k. If this is done for all X_k, a set of exact sign conditions on the loop structure of the system is obtained. This may be particularly useful for homozygous systems, because then $F_{k}^{α} = F_{k}^{β}$ and there will be no problem with the sign S_k.

4. Discussion and conclusions

Combining network theory and linear algebra results with mathematical models of gene regulatory networks, we have introduced relevant concepts and provided analytical insights on how genetic variation is propagated in gene networks. We hope that our results may contribute to a future theory on the pleiotropy and epistasis features of genetic variation in haploid and diploid gene networks as a function of regulatory architecture and functional location of genetic variation.

We have also shown that the modelling framework for diploid gene networks developed by Omholt et al. [13] in which a diploid node is described by two rate equations, can be transformed—in our language: aggregated—into a standard type model in which each locus, haploid or diploid, is described by just one rate equation.

The time-dependent solutions of the aggregated models are qualitatively equivalent to those of the corresponding models in the framework of Omholt et al., and the equilibrium solutions of the former are stable when those of the latter are. Qualitative equivalence is here to be taken in an informal sense, meaning that the graphs of the solution curves look similar, and that the curves are relatively close to each other in a sense given in Appendix D.

The variables of the aggregated model are the observable total gene products of the loci. The model depends explicitly on the genotypes of the two alleles of the diploid loci. This property facilitates investigations on how the genotypic value of a diploid locus (i.e. its phenotype) depends on its genotype. It further reduces the size of the model from perhaps 2n down to n. This reduction also makes it much easier to read out the connection and the feedback loops between the loci in “everyday” language in which we talk about a gene as one entity despite the fact that it is composed of two more or less independent alleles. To the best of our knowledge this provides for the first time a rationale for modeling diploid gene regulatory networks with one node for each locus even though the locus may be polymorphic and show intra-locus interaction effects.

Finally, we have shown that for a wide range of network architectures the sign of the allele interaction is independent of the shape of the rate functions and parameter values, and does not change with mutations in the other nodes or under external noise. More specifically, Proposition 5 confirms and generalizes the result in [14] for an isolated gene. It shows the close connection between the sign of the allele interaction for a polymorphic locus X_k and the feedback loops it is involved in. Its main importance is that recording the equilibrium values x_k for a hemizygotic and either a homo- or heterozygotic locus X_k gives information about the network interactions and feedback loops involving X_k. These genotypes are within experimental reach for several organisms, and the machinery developed above can be tested in several settings. Hemizygous collections are already available for yeast [31]. Of course there may be networks for which the actual genotypes lead to more complex sign relations so that the above results would not be valid. Irrespective of whether the sign relations are valid or not, if these three allele interaction values $Δ_{k}^{αα}$ , $Δ_{k}^{αβ}$ and $Δ_{k}^{ββ}$ have equal signs s_k, a tentative hypothesis is that X_k has one or more sign-dominant loops with sign s_k.

Gjuvsland et al. [14] showed that in systems with one or two loci, a biallelic locus can display up to 18 qualitatively different allele interaction sign patterns (triplets of +, − and 0 representing the signs of Δ¹¹, Δ¹² and Δ²²). In a single locus system with autoregulation only a subset of 7 of these could be realised with monotonic dose-response functions. With non-monotonic dose-response functions, however, 16 sign patterns could be generated. They also showed analytically that for each allele combination, the sign of the allele interaction value and the sign of the autoregulatory loop were equal (their Supporting Information, Result 1). For the autoregulatory system of an isolated node X₁, the sign of F₁ is just the sign of the autoregulatory loop, which equals the sign of the derivative of the dose-response function. Therefore, a non-monotonic dose-response function implies that F₁, and with it the allele interaction value, may take both signs, depending on parameter values.

Consider then a multi-locus system with monotonic dose-response functions (Figure 2). The two full loops X₁ → X₂ → X₃ → X₁ and (X₁ → X₃ → X₁)(X₂ → X₂) are incoherent (their contributions to F₁ have opposite signs) because F₁ = J₁₃J₃₂J₂₁ − J₁₃J₃₁J₂₂ and J₁₃J₃₂J₂₁ > 0, J₁₃J₃₁J₂₂ > 0. Depending on parameter values either the one or the other may determine the sign of F₁ and give opposite signs to $Δ_{1}^{αβ}$ . In this multi-node network, varying sign of $Δ_{1}^{αβ}$ can be obtained with monotonic dose-response functions, while this could only be obtained with non-monotonic dose-response functions in the single node autoregulatory system. Based on this we conjecture that with monotonic dose-response functions a much wider range of allele interaction sign motifs can be obtained in multi-gene systems than for autoregulated genes.
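The dependence of the sign of F₁ on parameter values can be made concrete with a small sketch. The Jacobian entries below are hypothetical numbers chosen only so that both products J₁₃J₃₂J₂₁ and J₁₃J₃₁J₂₂ are positive, as stated above; they are not derived from a particular model of the network in Figure 2.

```python
def feedback_F1(J13, J32, J21, J31, J22):
    """F1 = J13*J32*J21 - J13*J31*J22 for the three-node example."""
    loop_three = J13 * J32 * J21   # full loop X1 -> X2 -> X3 -> X1
    loop_composite = J13 * J31 * J22   # composite loop (X1 -> X3 -> X1)(X2 -> X2)
    return loop_three - loop_composite


# Hypothetical Jacobian elements; both loop products are positive in each set.
set_a = dict(J13=-1.0, J32=2.0, J21=-1.0, J31=1.0, J22=-1.0)
set_b = dict(J13=-1.0, J32=0.5, J21=-1.0, J31=1.0, J22=-1.0)

F1_a = feedback_F1(**set_a)   # three-node loop dominates
F1_b = feedback_F1(**set_b)   # composite loop dominates
```

Changing the strength of a single interaction (here J₃₂) is enough to switch which loop dominates and hence to flip the sign of F₁.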

Figure 2. A system with an incoherent feedforward motif (from X₁ to X₃) and three feedback loops. An arrow denotes positive action, a crossbar negative action. The two full loops X₁ → X₂ → X₃ → X₁ and (X₁ → X₃ → X₁) (X₂ → X₂) are incoherent: their contributions to each F_i have opposite signs.

Our results provide a theoretical basis for two kinds of experimental tests of network models: (i) checking the sign of the allele interaction for any node by allele knockout in the same node; and (ii) checking the effect of allele knockout in one node on the equilibrium values of other nodes. In both cases the checking can be made independently on either of the homozygotes and on the heterozygote. This gives three possible combinations for each polymorphic locus. If the allelic composition of each of these loci can be selected or imposed experimentally and independently for each locus, the number of different tests can in principle be very large. The formalism developed above may be combined with systematic measurement of the effects of allele knockouts and their effects on the other nodes in the network to deduce the connectivity of networks for which no model so far exists. This approach would be very similar to the approach suggested by Kholodenko et al. [30].

We have deliberately refrained from dealing with networks with multiple stable states. Surely, multistationarity is a generic characteristic of nonlinear dynamic systems, but is not a relevant issue in a large number of biological systems. Nor have we allowed genetic variation affecting the coding part of a gene. For such genes aggregation is generally not possible, as the two allele products may have different effects on other genes. It would not make sense to sum the two product concentrations, and the two alleles would simply have to be modelled by separate rate equations. The model framework we have used for the theory development is of course very simple both in terms of the relationship between the gene product expression level and the production rate from downstream loci and the neglect of more complex regulatory anatomies involving for example noncoding RNA (see e.g. [32, 33, 34, 35]). Including more biological realism along these lines would make it more complicated to develop the theory, but might at the same time disclose deeper insight into the propagation of genetic variation in real networks. Our formalism can easily also account for other network agents than gene loci, and can be used to study e.g. regulatory structures involving gene networks, metabolic networks and protein signalling networks. We anticipate that such an endeavor will yield new insight into the manifestation of genetic variation in nonlinear biological systems.

Acknowledgments

This work has been supported in part by The Research Council of Norway, project number 178901/V30, Bridging the gap: disclosure, understanding and exploitation of the genotype-phenotype map, and by the Virtual Physiological Rat Project funded through NIH grant P50-GM094503.

Appendix A

Notation

In this appendix we explain our notation for subsets of vectors and matrices, and for equilibrium values for different genotypes (allelic compositions) of diploid genes.

Let U and V be subsets of N = {1,2, …, n}. We use the notation z_U = {z_k}_k∈U, X_U = {X_k}_k∈U, etc. In matrix equations z_U denotes the corresponding column vector. The n × n Jacobian matrix of Eq. (1) at the stable point x is denoted J, and J_UV is the matrix obtained from J by selecting the rows U and the columns V (without interchanging rows or columns). We use the notation X^(U) to denote the set of nodes not in X_U, etc., and denote the corresponding set of variables by z^(U). Similarly, x^(k) is the set of all x_i except x_k or the vector obtained by removing x_k from x. Let i ∈ U and j ∈ V. The matrix $J_{UV}^{(ij)}$ is obtained from J by selecting the rows U and the columns V in J and deleting row i and column j in J. The superscript (i ∘) indicates that only row i and no column is deleted, and (∘ j) that only column j and no row is deleted. We also define D = det(J), $D_{UV}^{(ij)} = \det (J_{UV}^{(ij)})$ if |U| = |V|, and D^(ij) = det(J^(ij)). It goes without saying that if there is no superscript, no row or column is deleted, and if there is no subscript, all rows and columns are included. Similarly, if L ⊂ N, p_Lk(x_k, a^(k)) = {p_lk(x_k, a^(k)) ∣ l ∈ L}.

The genotype of a diploid gene X_i is denoted g_i =α_iβ_i, where α_i and β_i take the values 1 or 2, indicating two different alleles. All equilibrium values depend on the total genotype g = [g₁ … g_n] of the system, but we do not complicate formulas by stating this explicitly. Instead, we let $x_{i}^{α_{i} β_{i}}$ denote the equilibrium value of X_i when its genotype is α_iβ_i. Thus, $x_{i}^{11}$ , $x_{i}^{12}$ and $x_{i}^{22}$ are the stable equilibrium values of X_i when both alleles are present and both are of type 1, of types 1 and 2, and both of type 2, respectively.

The stable equilibrium values for X_i when one of the alleles has been knocked out are $x_{i}^{1 ∣ \circ}$ and $x_{i}^{2 ∣ \circ}$ , where ∘ indicates a nil value, i.e. that the allele is absent. Finally, $x_{i}^{1 ∣ 1}$ and $x_{i}^{1 ∣ 2}$ represent the equilibrium value of the output from a subnode of X_i with allele of type 1 when the other allele is of type 1 or 2, respectively. For example $x_{i}^{11} = x_{i}^{1 ∣ 1} + x_{i}^{1 ∣ 1} = 2 x_{i}^{1 ∣ 1}$ , $x_{i}^{12} = x_{i}^{1 ∣ 2} + x_{i}^{2 ∣ 1}$ , and $x_{i}^{22} = x_{i}^{2 ∣ 2} + x_{i}^{2 ∣ 2} = 2 x_{i}^{2 ∣ 2}$ . Note however that while e.g. $z_{i}^{1}$ is the (time dependent) output of $X_{i}^{1}$ whatever its actual genotype, $x_{i}^{1 ∣ \circ}$ is the equilibrium concentration of the gene product of X_i when the copy of the gene on one chromosome is knocked out and the one present is allele α_i = 1.

Appendix B

Circuits and loops

In this appendix we recall some useful facts related to the circuit structure of a real n × n matrix A.

Lemma 2 ([24])

Let k ∈ N be given, let U be any subset of N with k elements, and let π(U) be the set of permutations of U, including the identity permutation. Let V ∈ π(U) and define the circuit product

P (U, V) = A_{U_{1} V_{1}} A_{U_{2} V_{2}} \cdots A_{U_{k} V_{k}}

(B.1)

and

S_{U} = \sum_{V \in π (U)} {(- 1)}^{ε (U, V)} P (U, V),

(B.2)

where ε(U, V) is the number of subcircuit products in the circuit product P(U, V) with an even number of factors, and

s_{k} = \sum_{U} S_{U},

(B.3)

where the sum runs over all U for which |U| = k. Then S_U = D_UU, and the characteristic polynomial of A is

p_{n} (λ) = λ^{n} - s_{1} λ^{n - 1} + s_{2} λ^{n - 2} + \dots + {(- 1)}^{n} s_{n} .

(B.4)

In particular, the trace T = tr(A) = s₁ and the determinant D = det(A) = s_n. Of course, s_n = S_N. We call (−1)^{ε(U, V)} P(U, V) the signed circuit product of the circuit corresponding to the circuit product P(U, V). To express signs we use the sign function defined by sign(x) = −1 if x < 0, sign(0) = 0, sign(x) = +1 if x > 0.
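Lemma 2 can be checked on a small example. The following is a minimal sketch, not part of the original analysis: the 3 × 3 matrix `A` and all function names are illustrative assumptions. It builds S_U from signed circuit products, where the sign is determined by the number of even-length subcircuits, and compares the result with determinants obtained by exact Gaussian elimination.

```python
from fractions import Fraction
from itertools import combinations, permutations


def even_cycle_count(perm):
    """Number of even-length cycles of a permutation given as a tuple of images."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start in seen:
            continue
        length, node = 0, start
        while node not in seen:
            seen.add(node)
            node = perm[node]
            length += 1
        if length % 2 == 0:
            count += 1
    return count


def signed_circuit_sum(A, U):
    """S_U of Eq. (B.2): sum of signed circuit products over all permutations of U."""
    total = Fraction(0)
    for perm in permutations(range(len(U))):
        product = Fraction(1)
        for pos, image in enumerate(perm):
            product *= A[U[pos]][U[image]]
        total += (-1) ** even_cycle_count(perm) * product
    return total


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return result


def s_coefficients(A):
    """[s_1, ..., s_n] of Eq. (B.3)."""
    n = len(A)
    return [sum(signed_circuit_sum(A, U) for U in combinations(range(n), k))
            for k in range(1, n + 1)]


def char_poly(A, lam):
    """Right-hand side of Eq. (B.4) evaluated at lam."""
    n, s = len(A), s_coefficients(A)
    return lam ** n + sum((-1) ** k * s[k - 1] * lam ** (n - k)
                          for k in range(1, n + 1))


# Illustrative matrix with one three-element circuit 1 -> 2 -> 3 -> 1.
A = [[-2, 1, 0], [0, -1, 3], [-4, 0, -1]]
print(s_coefficients(A))
```

For this matrix s₁ is the trace and s₃ the determinant, and `char_poly(A, lam)` agrees with det(λI − A) at any test value of λ.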

A square matrix whose eigenvalues all have negative real parts will be called a stable matrix. The following result is well known.

Lemma 3

[36, Vol. 2, p. 220] If the real n × n matrix A is stable, then

{(- 1)}^{j} s_{j} > 0, all j \in N .

(B.5)
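As an illustration of Lemma 3, the sketch below evaluates the sums s_j of principal minors for a small matrix that is stable by construction. The matrix and the function names are assumptions chosen for this example; the matrix is block lower-triangular, so its eigenvalues (−1 ± 2i and −3) can be read off without numerical eigenvalue routines.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row (adequate for small n)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def principal_minor_sums(A):
    """[s_1, ..., s_n], where s_k is the sum of all k x k principal minors D_UU."""
    n = len(A)
    return [sum(det([[A[i][j] for j in U] for i in U])
                for U in combinations(range(n), k))
            for k in range(1, n + 1)]


# The upper-left 2 x 2 block has eigenvalues -1 +/- 2i; the last diagonal
# entry contributes the eigenvalue -3, so every real part is negative.
A = [[-1, -2, 0],
     [2, -1, 0],
     [1, 1, -3]]

s = principal_minor_sums(A)
signed = [(-1) ** j * s_j for j, s_j in enumerate(s, start=1)]
print(s, signed)
```

All entries of `signed` are positive, as Eq. (B.5) requires. Note that the condition is necessary only: for n ≥ 3 a matrix can satisfy Eq. (B.5) without being stable.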

Appendix C

Proof of Proposition 2

Proof

When F_MQ = 0, the four determinants D^(1l), D^(lm), D^(1m) and D^(ll) are all block triangular, and can be expressed as

\begin{matrix} D^{(1 l)} & = & D_{LQ}^{(1 l)} D_{MM}, \\ D^{(lm)} & = & D_{QQ} D_{MR}^{(\circ m)}, \\ D^{(1 m)} & = & D_{LQ}^{(1 l)} D_{MR}^{(\circ m)}, \\ D^{(ll)} & = & D_{QQ} D_{MM} . \end{matrix}

(C.1)

The notation for subscripts and superscripts was defined in Appendix A. From Eqs. (C.1) it follows immediately that

D^{(1 l)} D^{(lm)} = D^{(1 m)} D^{(ll)},

(C.2)

which is equivalent to the chain rule due to Eq. (6).
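The identity (C.2) can also be verified numerically. The sketch below rests on assumptions made purely for illustration: the nodes are ordered as Q = {1, …, l − 1}, then the node l, then M = {l + 1, …, n}; the condition F_MQ = 0 is represented by a vanishing Jacobian block J_MQ; and all function names are hypothetical. Integer entries keep the determinants exact.

```python
import random


def det(M):
    """Exact integer determinant by cofactor expansion (adequate for small n)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def minor(J, i, j):
    """D^(ij): determinant of J with row i and column j deleted (0-based indices)."""
    return det([row[:j] + row[j + 1:] for k, row in enumerate(J) if k != i])


def cut_node_jacobian(n_q, n_m, seed):
    """Random integer matrix whose block J_MQ (rows M, columns Q) is zero."""
    rng = random.Random(seed)
    n = n_q + 1 + n_m
    J = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    for i in range(n_q + 1, n):      # rows belonging to M
        for j in range(n_q):         # columns belonging to Q
            J[i][j] = 0
    return J


def chain_rule_sides(J, l, m):
    """Left- and right-hand sides of Eq. (C.2); node 1 has index 0."""
    lhs = minor(J, 0, l) * minor(J, l, m)
    rhs = minor(J, 0, m) * minor(J, l, l)
    return lhs, rhs


n_q, n_m = 2, 3
J = cut_node_jacobian(n_q, n_m, seed=1)
l, m = n_q, n_q + 2                  # l is the cut node, m lies in M
print(chain_rule_sides(J, l, m))
```

Both sides agree for every such matrix, because each of the four minors factorises into the diagonal blocks listed in Eqs. (C.1).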

Appendix D

Justification of aggregated models

In this appendix we justify the claim in Section 3.2 that the aggregated model S_A is a well-founded aggregation of S_E.

The simple model for transcription regulation developed by Bintu et al. [37, 38] and Buchler et al. [39] is based upon setting the transcription rate proportional to the binding probability of transcription factors and polymerase to the gene’s binding site. They used traditional Boltzmann statistics to derive formulas for the binding probabilities. Extending this analysis to biallelic genes remains to be done. Unfortunately, a physico-chemical analysis of transcription soon gets very complicated [40, 41], but the following simple argument lends some justification to the assumption that the production rate of the biallelic gene is just the sum of the two monoallelic production rates. Assume the number of transcription factor molecules is much larger than the number of binding sites of the gene, and that the effect of non-specific binding sites for the transcription factors can be disregarded. Then the number of transcription factor molecules available for binding to one chromosome is not appreciably reduced if a small fraction of them are bound to the other chromosome. If the probability of binding to the one chromosome is independent of what happens at the other chromosome, the total probability that transcription factors will bind to the gene and initiate transcription is just the sum of the two single-allele probabilities, and the total transcription rate is the sum of the two single-allele transcription rates.

The following proposition shows that S_A possesses the same asymptotic stability properties as S_E.

Proposition 6

Let $z^{*} = [z_{1}^{*}, \dots, z_{n}^{*}]$ be an asymptotically stable point for S_E. If the Jacobian J of S_A is diagonalisable in z*, then z* is an asymptotically stable point for S_A.

Proof

If $γ_{n}^{1} = γ_{n}^{2} = γ_{n}$ , the equations for $z_{n}^{1}$ and $z_{n}^{2}$ of S_E can be added, leading to the equations for S_A.

We then assume $γ_{n}^{1} \neq γ_{n}^{2}$ . Let z(t, y⁰), where $z_{n} = z_{n}^{1} + z_{n}^{2}$ , be a solution of S_E as given by Eqs. (31) such that lim_t→∞ z(t, y⁰) = z*, and define u(t) = z(t, y⁰) − z*. The definition of S_A ensures that z* is a stationary point for both systems.

Because z* is an asymptotically stable state for S_E, for any ε > 0 there exists a T > 0 such that ∥u(t)∥ < ε for t > T. By choosing y⁰ sufficiently close to z* we can ensure that ∥u∥ < ε for all positive t.

We proceed by investigating the rate equations for the n-component vector u(t).

\begin{array}{rcl} {\dot{u}}_{i} & = & r_{i} (z^{*} + u) - γ_{i} z_{i} = r_{i} (z^{*} + u) - r_{i} (z^{*}) - γ_{i} u_{i}, \\ {\dot{u}}_{n} & = & r_{n} (z^{*} + u) - γ_{n}^{1} z_{n}^{1} - γ_{n}^{2} z_{n}^{2} \\ & = & r_{n} (z^{*} + u) - r_{n} (z^{*}) - γ_{n}^{1} u_{n}^{1} - γ_{n}^{2} u_{n}^{2}, \end{array}

(D.1)

where i = 1, …, n − 1 and $r_{n} = r_{n}^{1} + r_{n}^{2}$ . After a little algebra the equation for u̇_n can be written

{\dot{u}}_{n} = r_{n} (z^{*} + u) - r_{n} (z^{*}) - γ_{n} u_{n} + e_{n} (u),

(D.2)

where γ_n is defined in Eq. (33), and

e_{n} (u) = \frac{1}{z_{n}^{*}} (γ_{n}^{1} z_{n}^{* 2} - γ_{n}^{2} z_{n}^{* 1}) (u_{n}^{2} - u_{n}^{1}) .

(D.3)

The mean value theorem for a mapping r : Rⁿ →Rⁿ is [28]

Theorem 1

Suppose r : W → Rⁿ is differentiable on the open set W ⊂ Rⁿ, and that the line segment joining z* and z lies in W. Then there exist numbers α_i, 0 < α_i ≤ 1, and vectors wⁱ = (1 − α_i) z* + α_iz, i = 1, …, n, such that

r_{i} (z) - r_{i} (z^{*}) = D r_{i} (w^{i}) (z - z^{*}), i = 1, \dots, n,

(D.4)

where D = [∂/∂z₁, …, ∂/∂z_n], and r_i, z and z* are column vectors.

Note that wⁱ lies on the line segment between z* and z.

Let J(z) be the Jacobian of S_A, defined by J_ij(z) = ∂ f_i(z) / ∂z_j. Applying the mean value theorem to Eq. (D.2) we get

\dot{u} = H (u, z^{*}, a) u + e (u),

(D.5)

where H (u, z*, a) is obtained as follows: Let νⁱ = (1 − α_i)z* + α_i(z* + u) = z* + α_iu, where 0 < α_i < 1, and define a = [α₁, …, α_n]. Then H(u, z*, a) is the matrix obtained by evaluating the elements of J(z) in row number i in the point νⁱ, i = 1, …,n. Obviously, H (u, z*, a) → J(z*) = J* when t → ∞ and u(t) → 0.

We write H (u, z*, a) = J* + E(u, a) = PD*P⁻¹ + E(u, a), where D* is the diagonal eigenvalue matrix and P the eigenvector matrix for J*. Then E(u, a) → 0 when t → ∞. Considering Eq. (D.5) as an inhomogeneous ODE for u(t) and introducing ν(t) = P⁻¹ u(t), its solution is

ν (t) = e^{D^{*} t} ν^{0} + \int_{0}^{t} e^{D^{*} (t - τ)} (E (Pν (τ), a) + e (Pν (τ))) dτ .

(D.6)

With w(τ) = E(Pν(τ), a) + e(Pν(τ)) we write this more simply as

\begin{array}{rcl} ν (t) & = & e^{D^{*} t} ν^{0} + \int_{0}^{t} e^{D^{*} (t - τ)} w (τ) dτ \\ & = & e^{D^{*} t} ν^{0} + \int_{0}^{t} e^{D^{*} (t - τ)} dτ \, \bar{w} (t) \\ & = & e^{D^{*} t} ν^{0} + {(D^{*})}^{- 1} (e^{D^{*} t} - I) \bar{w} (t), \end{array}

(D.7)

where w̄(t), which is the vector of mean values of the components of w(t), is bounded by the minimum and maximum of w(t) in [0, t] because the remaining integrand is positive for each component of ν(t).

Let {ν^0j}, j = 1, …, n be a set of linearly independent vectors, and ν^j (t) the corresponding solutions given by Eq. (D.7). Because w̄(t) → 0 when t → ∞, the set of ν^j (t) is also linearly independent for sufficiently large t. Letting V⁰, V(t) and W̄ (t) be the matrices with ν^0j, ν^j (t) and w̄^j (t) as columns, respectively, we get

V (t) = e^{D^{*} t} V^{0} + {(D^{*})}^{- 1} (e^{D^{*} t} - I) \bar{W} (t),

(D.8)

leading to

e^{D^{*} t} = (V (t) + {(D^{*})}^{- 1} \bar{W} (t)) {(V^{0} + {(D^{*})}^{- 1} \bar{W} (t))}^{- 1} .

(D.9)

The last factor in Eq. (D.9) is well-defined for sufficiently large t and approaches (V⁰)⁻¹ when t → ∞ because W̄ (t) approaches the zero matrix. It follows that ∥exp(D*t)∥ → 0 when t → ∞. Let μ be the spectral abscissa of D*. For all t ≥ 0, exp(μt) ≤ ∥exp(D*t)∥ [42, Theorem 15.3]. This shows that exp(μt) → 0 when t → ∞. Therefore μ < 0, and z* is an asymptotically stable and hyperbolic point for S_A.

To compare the temporal behaviours of S_E and S_A we had to rely on numerical simulations. To justify the aggregation, the temporal behaviours of z(t, z⁰) and y(t, z⁰) should be approximately equal and qualitatively similar for a range of actual parameter values and realistic common initial values z⁰ ≠ z*. Although very close similarity far from the common equilibrium point cannot be expected, at least the behaviours near the equilibrium should be quantitatively similar. We quantify the degree of similarity of the solution curves by the relative discrepancy

RelErr (y, z) = {\frac{\int_{0}^{\infty} {(y_{i} (t, z^{0}) - z_{i} (t, z^{0}))}^{2} dt}{\int_{0}^{\infty} {(z_{i} (t, z^{0}) - z_{i}^{*})}^{2} dt}}_{i \in N} .

(D.10)

One advantage of this discrepancy measure is that both integrals converge exponentially if z* is hyperbolic so that they can easily be computed numerically by integrating to a sufficiently large and finite T. As a measure of the similarity of solutions near the equilibrium we used
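The relative discrepancy of Eq. (D.10) can be approximated from sampled trajectories as sketched below. This is an illustration under stated assumptions rather than the code used for the simulations: the trapezoidal rule, the finite horizon `T`, the step `dt`, and the analytically defined test curves are all choices made for this example, and the ratio is returned separately for each component i because the way the components are combined is not specified here.

```python
import math


def trapezoid(values, dt):
    """Trapezoidal approximation of an integral from equally spaced samples."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def rel_err(y, z, z_star, dt):
    """Per-component ratio of Eq. (D.10) for trajectories sampled every dt.

    y and z are lists of state vectors; z_star is the common equilibrium.
    """
    ratios = []
    for i in range(len(z_star)):
        num = trapezoid([(yt[i] - zt[i]) ** 2 for yt, zt in zip(y, z)], dt)
        den = trapezoid([(zt[i] - z_star[i]) ** 2 for zt in z], dt)
        ratios.append(num / den)
    return ratios


# Hypothetical test curves that decay exponentially towards z_star.
z_star = [1.0, 2.0]
dt, T = 0.01, 30.0
times = [k * dt for k in range(int(T / dt) + 1)]
z = [[z_star[0] + math.exp(-t), z_star[1] + math.exp(-2 * t)] for t in times]
y = [[z_star[0] + 1.1 * math.exp(-t), z_star[1] + 0.8 * math.exp(-2 * t)] for t in times]

print(rel_err(y, z, z_star, dt))
```

Because the integrands decay exponentially, truncating the integrals at a finite T changes the result only marginally, in line with the remark above on hyperbolic equilibria.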

EigDiff (z^{*}; S_{E}, S_{A}) = \log \left( \sum_{j \in N} \frac{| λ_{j} - Λ_{j} |}{| Λ_{j} |} e^{ℜ (Λ_{j})} \right),

(D.11)

where {Λ_j} and {λ_j} are the sets of eigenvalues of the Jacobians of S_E and S_A, respectively. Because S_E has one additional eigenvalue, one of its eigenvalues has to be excluded from the sum in Eq. (D.11). We excluded the eigenvalue that minimises the sum. The purpose of the exponential factor is to simulate the fact that an eigenvalue contributes to the solution of the linearised equations around z* by this factor. This similarity measure is justified if there is a corresponding similarity between the two sets of eigenvectors, because then for a given solution z(t) it will be possible to construct a solution y(t) which will be quantitatively similar to z(t) close to the equilibrium.
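A minimal sketch of how Eq. (D.11) might be evaluated is given below. Several details are assumptions made for illustration: the logarithm is taken as the natural logarithm, the remaining eigenvalues of S_E are paired with those of S_A after sorting both sets by real and then imaginary part, and the function name `eig_diff` is hypothetical. The extra eigenvalue of S_E is excluded by trying every candidate and keeping the one that minimises the sum.

```python
import math


def eig_diff(eigs_E, eigs_A):
    """Eq. (D.11) for eigenvalue lists with len(eigs_E) == len(eigs_A) + 1."""
    def order(c):
        c = complex(c)
        return (c.real, c.imag)

    lam = sorted(eigs_A, key=order)
    best = math.inf
    for skip in range(len(eigs_E)):
        # Drop one eigenvalue of S_E and pair the rest with those of S_A
        # (pairing by sorted order is an assumption of this sketch).
        Lam = sorted(eigs_E[:skip] + eigs_E[skip + 1:], key=order)
        total = sum(abs(a - e) / abs(e) * math.exp(complex(e).real)
                    for a, e in zip(lam, Lam))
        best = min(best, total)
    return math.log(best) if best > 0 else -math.inf


# Hypothetical spectra: S_E has one fast extra mode at -10.
value = eig_diff([-1, -2, -10], [-1.1, -2])
print(value)
```

In this example the sum is minimised by excluding the eigenvalue −10, leaving a single relative deviation of 0.1 weighted by e⁻¹.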

Numerical simulations for a range of n-values show that in almost all cases the eigenvalues of S_A match the eigenvalues of S_E very closely (Figure D.3). The scatterplots in the left column show that, except in a few cases, RelErr(y, z) and EigDiff (z*; S_E, S_A) are much smaller than 0, indicating that the solution of S_A lies relatively close to the solution of S_E, and that the differences between the eigenvalues {Λ_j} of S_E and {λ_j} of S_A are much smaller in magnitude than the eigenvalues of the Jacobian of S_E.

Except in a small number of cases, the temporal behaviours of the solutions also match closely. Typically, at least when n is large, there are appreciable differences between the two solutions only for a few variables (Figure D.4). Frequently this happens for variables that do not approach their final state monotonically, either because they oscillate towards the equilibrium or because they approach a limit cycle. Also, there could be multistationarity in the systems such that the two solutions approach different final states.

Figure D.4 — Selected examples of solution curves for extended systems (blue) and the corresponding aggregated systems (green). a: A typical case with n = 4 in which the two systems differ significantly in just one variable. b: A case with n = 4 of particularly bad similarity. In both cases, however, the two sets of curves are qualitatively similar, but the oscillations and dips are shifted in time. c: A system with n = 3 having a stable limit cycle for both systems.

Below follows a summary of the simulation details. The rate functions f_k, k = 1, …, n − 1, $f_{n}^{1}$ and $f_{n}^{2}$ were given by the function

f_{i} (z) = a_{i} B_{j} (Z_{k}, Z_{l}) - γ_{i} z_{i},

(D.12)

where a_i and γ_i were scalars chosen at random from a uniform distribution over (0,1), and B_j is any of the 14 non-constant Boolean functions of two variables, chosen at random for each i, but equal for $f_{n}^{1}$ and $f_{n}^{2}$ . The function

Z_{k} = \frac{z_{k}^{p_{k}} + h_{k} θ_{k}^{p_{k}}}{z_{k}^{p_{k}} + θ_{k}^{p_{k}}}

(D.13)

is the generalised Hill function derived from applying Boltzmann statistics to transcription regulation [43]. Note that here the superscripts are powers. The thresholds θ_k were chosen at random uniformly over (0, 1), the steepness parameters p_k (equivalent to the Hill exponent) were picked from a uniform distribution of integers in [1,10], and the inverse fold changes h_k were also chosen at random from a uniform distribution over (0, 1). The two inputs to the Boolean functions were chosen at random among the variables Z_i, but were the same for $r_{n}^{1}$ and $r_{n}^{2}$ . For each value of n we ran 500 simulations. For each parameter set, both solutions were started from the same randomly chosen initial point. The systems that did not converge to a stable point, or in which the two systems approached different attractors, were disregarded. That left us with the number of cases mentioned in the caption of Figure D.3.
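The building blocks of Eqs. (D.12) and (D.13) can be sketched as follows. This is an illustration under assumptions, not the simulation code itself: the function names are hypothetical, and the Boolean function B_j is represented here by a continuous AND implemented as a product of its two inputs, which is one common convention and is not stated in the text.

```python
import random


def hill(z, theta, p, h):
    """Generalised Hill function of Eq. (D.13); equals h at z = 0 and tends to 1."""
    return (z ** p + h * theta ** p) / (z ** p + theta ** p)


def sample_regulator(rng):
    """Draw theta and h uniformly from the unit interval and p from {1, ..., 10}."""
    return {"theta": rng.uniform(0.0, 1.0),
            "p": rng.randint(1, 10),
            "h": rng.uniform(0.0, 1.0)}


def rate(z, i, k, l, a, gamma, regs, boolean=lambda u, v: u * v):
    """Rate function of Eq. (D.12) with inputs Z_k and Z_l.

    The default `boolean` is a continuous AND (an assumption of this sketch).
    """
    Z_k = hill(z[k], **regs[k])
    Z_l = hill(z[l], **regs[l])
    return a * boolean(Z_k, Z_l) - gamma * z[i]


rng = random.Random(0)
regs = [sample_regulator(rng) for _ in range(3)]
z = [rng.uniform(0.0, 1.0) for _ in range(3)]
print(rate(z, i=0, k=1, l=2, a=rng.uniform(0.0, 1.0),
           gamma=rng.uniform(0.0, 1.0), regs=regs))
```

At z_k = θ_k the Hill function takes the midpoint value (1 + h_k)/2, which is a convenient sanity check on the sampled parameters.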

Appendix E

The root of nonzero allele interaction values

The basic two-node system in Figure 1a is in our standard notation given by the rate equations

\begin{matrix} {\dot{z}}_{1}^{1} & = & r_{1}^{1} (z_{2}) - γ_{1}^{1} z_{1}^{1}, \\ {\dot{z}}_{1}^{2} & = & r_{1}^{2} (z_{2}) - γ_{1}^{2} z_{1}^{2}, \\ {\dot{z}}_{2} & = & r_{2} (z_{1}) - γ_{2} z_{2}, \end{matrix}

(E.1)

where $z_{1} = z_{1}^{1} + z_{1}^{2}$ . For the μ-system in Figure 1b the rate equations are

\begin{array}{rcl} {\dot{z}}_{1}^{1} & = & r_{1}^{1} (z_{2}) - γ_{1}^{1} z_{1}^{1}, \\ {\dot{z}}_{1}^{2} & = & r_{1}^{2} ({\tilde{z}}_{2}) - γ_{1}^{2} z_{1}^{2}, \\ {\dot{z}}_{2} & = & r_{2} (z_{1}^{1} + μ z_{1}^{2}) - γ_{2} z_{2}, \\ {\dot{\tilde{z}}}_{2} & = & r_{2} (μ z_{1}^{1} + z_{1}^{2}) - γ_{2} {\tilde{z}}_{2}, \end{array}

(E.2)

where μ ∈ [0,1]. By assumption the equilibrium conditions of Eq. (E.2) define unique stable equilibrium values $y_{1}^{12} (μ) = y_{1}^{1 ∣ 2} (μ) + y_{1}^{2 ∣ 1} (μ)$ , y₂(μ) and ỹ₂(μ). The allele interaction value is $Δ_{1}^{12} (μ) = y_{1}^{12} (μ) - (y_{1}^{1 ∣ \circ} (μ) + y_{1}^{2 ∣ \circ} (μ))$ . Using implicit differentiation, doing some straightforward algebra and finally taking the limit μ → 0, we find

\lim_{μ \to 0} \frac{d Δ_{1}^{12} (μ)}{d μ} = \frac{u_{1}^{2} u_{2}}{γ_{1}^{2} γ_{2} - u_{1}^{2} u_{2}} y_{1}^{1} + \frac{u_{1}^{1} u_{2}}{γ_{1}^{1} γ_{2} - u_{1}^{1} u_{2}} y_{1}^{2},

(E.3)

where $u_{1}^{α}$ and u₂ represent the derivatives of the corresponding dose-response functions with respect to their argument. Because $Δ_{1}^{12} (0) = 0$ (see the main text), it follows that for arbitrarily small μ > 0 the μ-system has a nonzero allele interaction value. We may conclude that in general, this is true also for μ = 1, in which case the μ-system is equivalent to the basic system defined by Eqs. (E.1).

If the red arrow from $X_{1}^{2}$ in Figure 1b is missing, the loop $X_{1}^{1} \to {\tilde{X}}_{2} \to X_{1}^{2} \to X_{2} \to X_{1}^{1}$ is broken, and there is no longer a regulatory loop in the system. In this case the subsystem $X_{1}^{2}$ , X₂, does not act on the two other nodes, and $\lim_{μ \to 0} d Δ_{1}^{12} (μ) / d μ$ no longer depends on $y_{1}^{2}$ . Only the first term in Eq. (E.3) remains, and the conclusion is still valid.
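Eq. (E.3) can be checked numerically for a concrete μ-system. The sketch below is an illustration under assumptions: the dose-response functions r₁^α(z) = a_α/(1 + z) and r₂(z) = bz, the parameter values, the explicit Euler integration to steady state, and all names are choices made for this example only. A linear r₂ is used so that its derivative u₂ takes the same value in both terms of Eq. (E.3). The allele interaction value is taken as the heterozygote equilibrium minus the sum of the two knockout equilibria.

```python
G11, G12, G2 = 1.0, 1.5, 1.0      # decay rates gamma_1^1, gamma_1^2, gamma_2
A1, A2, B = 2.0, 3.0, 1.0         # hypothetical dose-response parameters


def r1(a, z):
    return a / (1.0 + z)          # decreasing dose-response (negative feedback)


def dr1(a, z):
    return -a / (1.0 + z) ** 2    # derivative of r1 with respect to z


def r2(z):
    return B * z                  # linear, so its derivative u_2 equals B


def equilibrium(mu, allele1=True, allele2=True, steps=20000, dt=0.01):
    """Integrate Eqs. (E.2) with explicit Euler steps until steady state."""
    z11 = z12 = z2 = z2t = 1.0
    for _ in range(steps):
        d11 = (r1(A1, z2) if allele1 else 0.0) - G11 * z11
        d12 = (r1(A2, z2t) if allele2 else 0.0) - G12 * z12
        d2 = r2(z11 + mu * z12) - G2 * z2
        d2t = r2(mu * z11 + z12) - G2 * z2t
        z11, z12 = z11 + dt * d11, z12 + dt * d12
        z2, z2t = z2 + dt * d2, z2t + dt * d2t
    return z11, z12, z2, z2t


def allele_interaction(mu):
    """Heterozygote equilibrium minus the sum of the two knockout equilibria."""
    y11, y12, _, _ = equilibrium(mu)
    knock1 = equilibrium(mu, allele2=False)[0]
    knock2 = equilibrium(mu, allele1=False)[1]
    return (y11 + y12) - (knock1 + knock2)


def e3_prediction():
    """Right-hand side of Eq. (E.3), evaluated at the mu = 0 equilibrium."""
    y11, y12, y2, y2t = equilibrium(0.0)
    u11, u12, u2 = dr1(A1, y2), dr1(A2, y2t), B
    return (u12 * u2 / (G12 * G2 - u12 * u2) * y11
            + u11 * u2 / (G11 * G2 - u11 * u2) * y12)


mu = 1e-4
slope = (allele_interaction(mu) - allele_interaction(0.0)) / mu
print(slope, e3_prediction())
```

With these parameters both equilibrium values equal 1 at μ = 0 and Eq. (E.3) evaluates to −2/3; the one-sided finite-difference slope agrees with this value up to a discretisation error of order μ.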

Appendix F

Proof of Lemma 1

Proof

We adapt the numbering such that x₁ ≤ x₂. The intersections between the curves y = ϕ_i(x) and y = γ_ix, i = 1, 2, and y = ϕ₁ (x) + ϕ₂ (x) and y = γx define the solutions x₁, x₂ and x₁₂, respectively. We consider three cases separately.

The case $ϕ_{i}^{'} (x) < 0, i = 1, 2$ . Assume x₁₂ ≥ x₁ + x₂. Then
$\begin{array}{rcl} γ x_{12} & = & ϕ_{1} (x_{12}) + ϕ_{2} (x_{12}) \leq ϕ_{1} (x_{1} + x_{2}) + ϕ_{2} (x_{1} + x_{2}) \\ & < & ϕ_{1} (x_{1}) + ϕ_{2} (x_{2}) = γ_{1} x_{1} + γ_{2} x_{2} = γ (x_{1} + x_{2}), \end{array}$
contradicting the assumption. Thus, x₁₂ < x₁ + x₂.

Then assume x₁₂ ≤ x₁. This leads to
$\begin{array}{rcl} γ x_{12} & \geq & ϕ_{1} (x_{1}) + ϕ_{2} (x_{1}) \\ & > & ϕ_{1} (x_{1}) + ϕ_{2} (x_{2}) = γ (x_{1} + x_{2}) > γ x_{12}, \end{array}$
which is impossible. Thus, x₁₂ > x₁. In passing we note that it is not possible to draw a definite conclusion about which is the larger of γ₁ and γ₂.

The case $0 < ϕ_{i}^{'} (x) < γ_{i}$ , i = 1, 2. In this case the existence of x₁₂ has to be assumed. Writing ϕ₁₂(x) = ϕ₁(x) + ϕ₂(x), the three curves y = ϕ₁(x), y = ϕ₂(x), and y = ϕ₁₂(x) intersect the lines y = γ₁x, y = γ₂x and y = γx, respectively, from above as in case 1. Then ϕ₁₂(x₁ + x₂) > ϕ₁(x₁) + ϕ₂(x₂) = γ₁x₁ + γ₂x₂ = γ(x₁ + x₂). This implies that in x = x₁ + x₂, ϕ₁₂(x) > γx, from which it follows that in this point the curve y = ϕ₁₂(x) lies above the line y = γx. Therefore the intersection x₁₂ lies to the right of x₁ + x₂, i.e. x₁₂ > x₁ + x₂.

The case $ϕ_{i}^{'} (x) > γ_{i}$ , i = 1, 2. Here, too, the existence of x₁₂ is not ensured, but we have assumed it exists. In this case $ϕ_{12}^{'} (x) > γ_{1} + γ_{2} > γ$ , and the three curves intersect the corresponding lines from below. As ϕ₁₂(x) > ϕ_i(x), it follows that ϕ₁₂(x_i) > γ_ix_i. Assume γ₁ < γ₂. Thus, in both points x₁ and x₂, ϕ₁₂(x) > γx. This implies that x₁₂ < x₁, x₁₂ < x₂, thus x₁₂ < x₁ + x₂. The case γ₂ < γ₁ goes likewise.
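The first two cases of the proof can be illustrated numerically. In the sketch below, the dose-response functions, the common decay rate γ₁ = γ₂ = γ = 1, the bisection bracket and the function names are all assumptions chosen for this example. The decreasing functions give x₁ < x₁₂ < x₁ + x₂, while the increasing functions with slope below γ give x₁₂ > x₁ + x₂.

```python
def bisect_root(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) > 0 > g(hi) and a single crossing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equilibria(phi1, phi2, gamma, hi=100.0):
    """Solve phi_i(x) = gamma x for each allele and phi_1 + phi_2 = gamma x."""
    x1 = bisect_root(lambda x: phi1(x) - gamma * x, 0.0, hi)
    x2 = bisect_root(lambda x: phi2(x) - gamma * x, 0.0, hi)
    x12 = bisect_root(lambda x: phi1(x) + phi2(x) - gamma * x, 0.0, hi)
    return x1, x2, x12


# Case 1: decreasing dose-response functions (negative autoregulation).
neg = equilibria(lambda x: 2.0 / (1.0 + x), lambda x: 6.0 / (1.0 + x), gamma=1.0)

# Case 2: increasing dose-response functions with slope 0.25 < gamma.
pos = equilibria(lambda x: 1.0 + 0.25 * x, lambda x: 1.0 + 0.25 * x, gamma=1.0)

print(neg, pos)
```

In the first example x₁ = 1, x₂ = 2 and x₁₂ = (−1 + √33)/2 ≈ 2.37; in the second x₁ = x₂ = 4/3 and x₁₂ = 4.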

References

1. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–C52. doi: 10.1038/35011540.
2. Lauffenburger DA. Cell signaling pathways as control modules: Complexity for simplicity? Proc Natl Acad Sci U S A. 2000;97:5031–5033. doi: 10.1073/pnas.97.10.5031.
3. Hoogendoorn B, Coleman SL, Guy CA, Smith K, Bowen T, Buckland PR, O’Donovan MC. Functional analysis of human promoter polymorphisms. Human Molecular Genetics. 2003;12:2249–2254. doi: 10.1093/hmg/ddg246.
4. Gehring NH, Frede U, Neu-Yilik G, Hundsdoerfer P, Vetter B, Hentze MW, Kulozik AE. Increased efficiency of mRNA 3′ end formation: a new genetic mechanism contributing to hereditary thrombophilia. Nature Genetics. 2001;28:389–392. doi: 10.1038/ng578.
5. Peng J, Murray EL, Schoenberg DR. The poly(A)-limiting element enhances mRNA accumulation by increasing the efficiency of pre-mRNA 3′ processing. RNA. 2005;11:958–965. doi: 10.1261/rna.2020805.
6. Wang RL, Stec A, Hey J, Lukens L, Doebley J. The limits of selection during maize domestication. Nature. 1999;398:236–239. doi: 10.1038/18435.
7. Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB. Gene regulation at the single-cell level. Science. 2005;307:1962–1965. doi: 10.1126/science.1106914.
8. Mayo AE, Setty Y, Shavit S, Zaslaver A, Alon U. Plasticity of the cis-regulatory input function of a gene. PLoS Biology. 2006;4:e45. doi: 10.1371/journal.pbio.0040045.
9. Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. Journal of Molecular Evolution. 2003;57:694–701. doi: 10.1007/s00239-003-2519-1.
10. Capon F, Allen MH, Ameen M, Burden AD, Tillman D, Barker JN, Trembath RC. A synonymous SNP of the corneodesmosin gene leads to increased mRNA stability and demonstrates association with psoriasis across diverse ethnic groups. Human Molecular Genetics. 2004;13:2361–2368. doi: 10.1093/hmg/ddh273.
11. Chamary JV, Hurst L. Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biology. 2005;6:R75. doi: 10.1186/gb-2005-6-9-r75.
12. Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM, Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya C, Lander ES, Di Palma F, Lindblad-Toh K, Kingsley DM. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484:55–61. doi: 10.1038/nature10944.
13. Omholt SW, Plahte E, Øyehaug L, Xiang KF. Gene regulatory networks generating the phenomena of additivity, dominance and epistasis. Genetics. 2000;155:969–980. doi: 10.1093/genetics/155.2.969.
14. Gjuvsland AB, Plahte E, Ådnøy T, Omholt SW. Allele interaction – single locus genetics meets regulatory biology. PLoS ONE. 2010;5:e9379. doi: 10.1371/journal.pone.0009379.
15. Gibson G. Epistasis and pleiotropy as natural properties of transcriptional regulation. Theoretical Population Biology. 1996;49:58–89. doi: 10.1006/tpbi.1996.0003.
16. Tyler AL, Asselbergs FW, Williams SM, Moore JH. Shadows of complexity: what biological networks reveal about epistasis and pleiotropy. Bioessays. 2009;31:220–227. doi: 10.1002/bies.200800022.
17. Thomas R, D’Ari R. Biological Feedback. CRC Press; Boca Raton, USA: 1990.
18. Thomas R, Kaufman M. Multistationarity, the basis of cell differentiation and memory. II. Logical analysis of regulatory networks in terms of feedback circuits. Chaos. 2001;11:180–195. doi: 10.1063/1.1349893.
19. Welch SM, Dong ZS, Roe JL, Das S. Flowering time control: gene network modelling and the link to quantitative genetics. Aust J Agric Res. 2005;56:919–936.
20. Omholt SW. From bean-bag genetics to feedback genetics: bridging the gap between regulatory biology and classical genetics. Landes Bioscience; Georgetown, Texas, USA: 2006.
21. Maybee JS, Olesky DD, van den Driessche P, Wiener G. Matrices, Digraphs and Determinants. University of Colorado; 1987. Technical Report.
22. de Jong H. Modeling and simulation of genetic regulatory systems: A literature review. J Comput Biol. 2002;9:67–104. doi: 10.1089/10665270252833208.
23. Brazhnik P, de la Fuente A, Mendes P. Gene networks: how to put the function in genomics. Trends Biotechnol. 2002;20:467–472. doi: 10.1016/s0167-7799(02)02053-x.
24. Plahte E, Mestl T, Omholt SW. Feedback loops, stability and multistationarity in dynamical systems. JBS. 1995;3:409–413.
25. Soulé C. Graphic requirements for multistationarity. ComPlexUs. 2003;1:123–133.
26. Maybee JS. Principal Minor Determinant Formulas. University of Colorado; 1973. Technical Report CU-CS-033-73.
27. Radulescu O, Lagarrigue S, Siegel A, Veber P, Le Borgne M. Topology and static response of interaction networks in molecular biology. J R Soc Interface. 2006;3:185–196. doi: 10.1098/rsif.2005.0092.
28. Marsden JE, Hoffman MJ. Elementary Classical Analysis. 2nd ed. W. H. Freeman; New York: 1993.
29. Duncan IW. Transvection effects in Drosophila. Annu Rev Genet. 2002;36:521–556. doi: 10.1146/annurev.genet.36.060402.100441.
30. Kholodenko BN, Kiyatkin A, Bruggeman FJ, Sontag E, Westerhoff HV, Hoek JB. Untangling the wires: A strategy to trace functional interactions in signaling and gene networks. Proc Natl Acad Sci U S A. 2002;99:12841–12846. doi: 10.1073/pnas.192442699.
31. Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, Arkin AP, Astromoff A, El Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian K-D, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Guldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kotter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang C-y, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M. Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002;418:387–391. doi: 10.1038/nature00935.
32. Mattick JS. RNA regulation: a new genetics? Nat Rev Genet. 2004;5:316–323. doi: 10.1038/nrg1321.
33. Hobert O. Gene regulation by transcription factors and microRNAs. Science. 2008;319:1785–1786. doi: 10.1126/science.1151651.
34. Makeyev EV, Maniatis T. Multilevel regulation of gene expression by microRNAs. Science. 2008;319:1789–1790. doi: 10.1126/science.1152326.
35. Shimoni Y, Friedlander G, Hetzroni G, Niv G, Altuvia S, Biham O, Margalit H. Regulation of gene expression by small non-coding RNAs: a quantitative view. Mol Syst Biol. 2007;3. doi: 10.1038/msb4100181.
36. Gantmacher FR. The Theory of Matrices. AMS Chelsea Publishing; Providence, Rhode Island: 2000.
37. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Kuhlman T, Phillips R. Transcriptional regulation by the numbers: applications. Curr Opin Genet Dev. 2005;15:125–135. doi: 10.1016/j.gde.2005.02.006.
38. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Phillips R. Transcriptional regulation by the numbers: models. Curr Opin Genet Dev. 2005;15:116–124. doi: 10.1016/j.gde.2005.02.007.
39. Buchler NE, Gerland U, Hwa T. On schemes of combinatorial transcription logic. Proc Natl Acad Sci U S A. 2003;100:5136–5141. doi: 10.1073/pnas.0930314100.
40. Sengupta AM, Djordjevic M, Shraiman BI. Specificity and robustness in transcription control networks. Proc Natl Acad Sci U S A. 2002;99:2072–2077. doi: 10.1073/pnas.022388499.
41. von Hippel PH, Berg OG. On the specificity of DNA-protein interactions. Proc Natl Acad Sci U S A. 1986;83:1608–1612. doi: 10.1073/pnas.83.6.1608.
42. Trefethen LN, Embree M. Spectra and Pseudospectra: the Behavior of Nonnormal Matrices and Operators. Princeton University Press; Princeton, N.J.: 2005.
43. Buchler NE, Gerland U, Hwa T. Nonlinear protein degradation and the function of genetic circuits. Proc Natl Acad Sci U S A. 2005;102:9559–9564. doi: 10.1073/pnas.0409553102.

[R1] 1.Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–C52. doi: 10.1038/35011540. [DOI] [PubMed] [Google Scholar]

[R2] 2.Lauffenburger DA. Cell signaling pathways as control modules: Complexity for simplicity? Proc Natl Acad Sci U S A. 2000;97:5031–5033. doi: 10.1073/pnas.97.10.5031. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R3] 3.Hoogendoorn B, Coleman SL, Guy CA, Smith K, Bowen T, Buckland PR, O’Donovan MC. Functional analysis of human promoter polymorphisms. Human Molecular Genetics. 2003;12:2249–2254. doi: 10.1093/hmg/ddg246. [DOI] [PubMed] [Google Scholar]

[R4] 4.Gehring NH, Frede U, Neu-Yilik G, Hundsdoerfer P, Vetter B, Hentze MW, Kulozik AE. Increased efficiency of mRNA 3′ end formation: a new genetic mechanism contributing to hereditary thrombophilia. Nature Genetics. 2001;28:389–392. doi: 10.1038/ng578. [DOI] [PubMed] [Google Scholar]

[R5] 5.Peng J, Murray EL, Schoenberg DR. The poly(A)-limiting element enhances mRNA accumulation by increasing the efficiency of pre-mRNA 3′ processing. RNA. 2005;11:958–965. doi: 10.1261/rna.2020805. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] 6.Wang RL, Stec A, Hey J, Lukens L, Doebley J. The limits of selection during maize domestication. Nature. 1999;398:236–239. doi: 10.1038/18435. [DOI] [PubMed] [Google Scholar]

[R7] 7.Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB. Gene regulation at the single-cell level. Science. 2005;307:1962–1965. doi: 10.1126/science.1106914. [DOI] [PubMed] [Google Scholar]

[R8] 8.Mayo AE, Setty Y, Shavit S, Zaslaver A, Alon U. Plasticity of the cis-regulatory input function of a gene. PLoS Biology. 2006;4:e45. doi: 10.1371/journal.pbio.0040045. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. Journal of Molecular Evolution. 2003;57:694–701. doi: 10.1007/s00239-003-2519-1. [DOI] [PubMed] [Google Scholar]

[R10] 10.Capon F, Allen MH, Ameen M, Burden AD, Tillman D, Barker JN, Trembath RC. A synonymous SNP of the corneodesmosin gene leads to increased mRNA stability and demonstrates association with psoriasis across diverse ethnic groups. Human Molecular Genetics. 2004;13:2361–2368. doi: 10.1093/hmg/ddh273. [DOI] [PubMed] [Google Scholar]

[R11] 11.Chamary JV, Hurst L. Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biology. 2005;6:R75. doi: 10.1186/gb-2005-6-9-r75. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R12] 12.Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM, Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya C, Lander ES, Di Palma F, Lindblad-Toh K, Kingsley DM. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484:55–61. doi: 10.1038/nature10944. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] 13.Omholt SW, Plahte E, Øyehaug L, Xiang KF. Gene regulatory networks generating the phenomena of additivity, dominance and epistasis. Genetics. 2000;155:969–980. doi: 10.1093/genetics/155.2.969. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] 14.Gjuvsland AB, Plahte E, Ådnøy T, Omholt SW. Allele interaction – single locus genetics meets regulatory biology. PLoS ONE. 2010;5:e9379. doi: 10.1371/journal.pone.0009379. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] 15.Gibson G. Epistasis and pleiotropy as natural properties of transcriptional regulation. Theoretical Population Biology. 1996;49:58–89. doi: 10.1006/tpbi.1996.0003. [DOI] [PubMed] [Google Scholar]

[R16] 16.Tyler AL, Asselbergs FW, Williams SM, Moore JH. Shadows of complexity: what biological networks reveal about epistasis and pleiotropy. Bioessays. 2009;31:220–227. doi: 10.1002/bies.200800022. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R17] 17.Thomas R, D’Ari R. Biological Feedback. CRC Press; Boca Raton, USA: 1990. [Google Scholar]

[R18] 18.Thomas R, Kaufman M. Multistationarity, the basis of cell differentiation and memory. ii. logical analysis of regulatory networks in terms of feedback circuits. Chaos. 2001;11:180–195. doi: 10.1063/1.1349893. [DOI] [PubMed] [Google Scholar]

[R19] 19.Welch SM, Dong ZS, Roe JL, Das S. Flowering time control: gene network modelling and the link to quantitative genetics. Aust J Agric Res. 2005;56:919–936. [Google Scholar]

[R20] 20.Omholt SW. From bean-bag genetics to feedback genetics: bridging the gap between regulatory biology and classical genetics. Landes Bioscience; Georgetown, Texas, USA: 2006. [Google Scholar]

[R21] 21. Maybee JS, Olesky DD, van den Driessche P, Wiener G. Matrices, Digraphs and Determinants. University of Colorado; 1987. Technical Report.

[R22] 22. de Jong H. Modeling and simulation of genetic regulatory systems: A literature review. J Comput Biol. 2002;9:67–104. doi: 10.1089/10665270252833208.

[R23] 23. Brazhnik P, de la Fuente A, Mendes P. Gene networks: how to put the function in genomics. Trends Biotechnol. 2002;20:467–472. doi: 10.1016/s0167-7799(02)02053-x.

[R24] 24. Plahte E, Mestl T, Omholt SW. Feedback loops, stability and multistationarity in dynamical systems. J Biol Syst. 1995;3:409–413.

[R25] 25. Soulé C. Graphic requirements for multistationarity. ComPlexUs. 2003;1:123–133.

[R26] 26. Maybee JS. Principal Minor Determinant Formulas. University of Colorado; 1973. Technical Report CU-CS-033-73.

[R27] 27. Radulescu O, Lagarrigue S, Siegel A, Veber P, Le Borgne M. Topology and static response of interaction networks in molecular biology. J R Soc Interface. 2006;3:185–196. doi: 10.1098/rsif.2005.0092.

[R28] 28. Marsden JE, Hoffman MJ. Elementary Classical Analysis. 2nd ed. W. H. Freeman; New York: 1993.

[R29] 29. Duncan IW. Transvection effects in Drosophila. Annu Rev Genet. 2002;36:521–556. doi: 10.1146/annurev.genet.36.060402.100441.

[R30] 30. Kholodenko BN, Kiyatkin A, Bruggeman FJ, Sontag E, Westerhoff HV, Hoek JB. Untangling the wires: A strategy to trace functional interactions in signaling and gene networks. Proc Natl Acad Sci U S A. 2002;99:12841–12846. doi: 10.1073/pnas.192442699.

[R31] 31. Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, Arkin AP, Astromoff A, El Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian K-D, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Guldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kotter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang C-y, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M. Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002;418:387–391. doi: 10.1038/nature00935.

[R32] 32. Mattick JS. RNA regulation: a new genetics? Nat Rev Genet. 2004;5:316–323. doi: 10.1038/nrg1321.

[R33] 33. Hobert O. Gene regulation by transcription factors and microRNAs. Science. 2008;319:1785–1786. doi: 10.1126/science.1151651.

[R34] 34. Makeyev EV, Maniatis T. Multilevel regulation of gene expression by microRNAs. Science. 2008;319:1789–1790. doi: 10.1126/science.1152326.

[R35] 35. Shimoni Y, Friedlander G, Hetzroni G, Niv G, Altuvia S, Biham O, Margalit H. Regulation of gene expression by small non-coding RNAs: a quantitative view. Mol Syst Biol. 2007;3. doi: 10.1038/msb4100181.

[R36] 36. Gantmacher FR. The Theory of Matrices. AMS Chelsea Publishing; Providence, Rhode Island: 2000.

[R37] 37. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Kuhlman T, Phillips R. Transcriptional regulation by the numbers: applications. Curr Opin Genet Dev. 2005;15:125–135. doi: 10.1016/j.gde.2005.02.006.

[R38] 38. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Phillips R. Transcriptional regulation by the numbers: models. Curr Opin Genet Dev. 2005;15:116–124. doi: 10.1016/j.gde.2005.02.007.

[R39] 39. Buchler NE, Gerland U, Hwa T. On schemes of combinatorial transcription logic. Proc Natl Acad Sci U S A. 2003;100:5136–5141. doi: 10.1073/pnas.0930314100.

[R40] 40. Sengupta AM, Djordjevic M, Shraiman BI. Specificity and robustness in transcription control networks. Proc Natl Acad Sci U S A. 2002;99:2072–2077. doi: 10.1073/pnas.022388499.

[R41] 41. von Hippel PH, Berg OG. On the specificity of DNA-protein interactions. Proc Natl Acad Sci U S A. 1986;83:1608–1612. doi: 10.1073/pnas.83.6.1608.

[R42] 42. Trefethen LN, Embree M. Spectra and Pseudospectra: the Behavior of Nonnormal Matrices and Operators. Princeton University Press; Princeton, N.J.: 2005.

[R43] 43. Buchler NE, Gerland U, Hwa T. Nonlinear protein degradation and the function of genetic circuits. Proc Natl Acad Sci U S A. 2005;102:9559–9564. doi: 10.1073/pnas.0409553102.
