Author manuscript; available in PMC: 2014 Aug 1.
Published in final edited form as: Physica D. 2013 Apr 17;256-257:7–20. doi: 10.1016/j.physd.2013.04.002

Propagation of genetic variation in gene regulatory networks

Erik Plahte a,c, Arne B Gjuvsland a,c,*, Stig W Omholt b,c,d
PMCID: PMC3752980  NIHMSID: NIHMS486320  PMID: 23997378

Abstract

A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network’s feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

Keywords: Gene regulation, Network, Haploid, Diploid, Genetic variation, Feedback

1. Introduction

Understanding the genotype to phenotype map is essential for a whole range of problems in evolutionary biology, production biology and biomedicine. As gene regulatory networks are the main mediating agents for setting up this map, a theory that can tell us how genetic variation is phenotypically manifested in gene regulatory networks as a function of regulatory anatomy may prove most helpful. Such a theory will be an important contribution to a future quantitative genetics theory linking genes, phenotypes and population level genetic phenomena in causal models based on how genes actually work and interact. More specifically, by being able to describe how the effects of genetic variation propagate in a network one will be able to predict how genetic variation in a gene affects network pathways and processes. In this way one may be able to tie genetic variation in gene networks to a whole range of biological processes that generate high-level phenotypic features. Moreover, at the generic level such a theory can be used in a systematic way to reveal recurrent patterns of how variation is propagated in specific types of regulatory anatomies.

We assume that the network is composed of a set of interacting nodes or loci. Each locus can in principle be regarded as a module by being a functional unit or subsystem of molecular processes whose working may be unknown, but which includes the whole transcriptional and translational machinery that produces the output of the locus [1, 2]. The phenotypes of a network are the stable equilibrium values of the gene products of all the loci in the network. Each locus is susceptible to genetic variation, and we assume that the genetic variation affects the promoter region of a given gene, but that there is no variation in the coding region of the gene. Many experimental results justify the relevance of this assumption. There are examples of noncoding mutations affecting production rates [3], mRNA processing rates [4, 5], the shape of the cis-regulatory input function [6, 7, 8], and mRNA decay rates [9, 10, 11]. In a recent study of adaptive evolution in threespine sticklebacks, Jones et al. found that in 41% of the genes allelic variation was regulatory, in 42% it was probably regulatory, and in only 17% it was coding [12].

To fully understand the functional properties of a diploid gene it is desirable to model its two alleles as separate quantities. This was first done by Omholt et al. [13] to show how the phenomena of genetic dominance, overdominance, additivity, and epistasis could be seen as generic features of simple diploid gene regulatory networks. This model framework was later used to introduce the so-called allele interaction concept [14]. In the present paper we develop these ideas further by proposing a way by which a diploid gene modelled in this fashion can be represented as a single entity and described by a single ODE for its gene product.

Based on these premises we provide a new vocabulary for analysing how genetic variation is manifested in a wide class of haploid and diploid gene regulatory networks possessing negative and positive feedback loops. We introduce terms to describe how a change in equilibrium value at one locus affects the equilibrium values of all other loci, how to identify the causal chains of loci conveying a genetic signal from one locus to another, and how genetic variation at a particular locus affects the equilibrium value phenotype of the locus itself. In [14] we investigated the relationships between single locus gene action concepts and regulatory network anatomy in small networks. Here we extend the analysis to gene regulatory networks with an arbitrary number of loci and complex feedback structures. This extension is highly relevant for understanding epistasis and pleiotropy in genotype-phenotype maps. Epistasis refers to situations where the effect of a genetic substitution at one locus depends on the genotype at another locus. Pleiotropy describes situations where one gene influences several phenotypes rather than a single one. Since epistasis and pleiotropy are inherent to biological networks, a system-level understanding of these phenomena is needed [15, 16].

By this work we contribute to the long and strong tradition originating with the works of René Thomas on relating generic systemic properties to the web of feedback loops [17, 18], while at the same time elucidating the link between genetics and systems dynamics. Our results provide further support to the view that nonlinear system dynamics will make up a major part of the core of the mathematical foundation of a future quantitative genetics theory [19, 20].

2. Propagation of genetic variation: features shared by haploid and diploid networks

At this stage we are not concerned with the inner workings of each gene due to genetic variation, but assume that the output rate of a locus is a given function of the concentration levels of its regulators, which we assume are one or several gene outputs. Thus in the first part of the paper we deal with characteristics of propagation of genetic variation that are shared by both haploid and diploid networks.

We combine results from linear algebra and network theory (see e.g. [21]) with gene network ideas to describe how genetic variation in one locus propagates to the other loci in the system in terms of the equilibrium values of the state variables. We introduce the term propagation function to describe how a change in equilibrium value of one node affects the equilibrium values of all other nodes, the term propagation chain to describe a chain of actions conveying a genetic signal from one node in the network to another, and finally, the term feedback function to describe how genetic variation at any particular locus affects the equilibrium value of the locus product itself.

A brief explanation of our notation is found in Appendix A.

2.1. Basic rate equations

We assume the network N is composed of a set of n loci Xi, i = 1, …, n, where n ≥ 2. The non-negative variable zi represents the possibly time-dependent concentration or amount of the output of Xi and acts as input to other loci in the network or contributes to the network’s net output. The dynamics of N is described by a set of autonomous rate equations Ei for zi, i ∈ N = {1, 2, …, n},

$$\dot{z}_i = f_i(z, a_i) = r_i(z, a_i) - \gamma_i z_i, \tag{1}$$

where z ∈ ℝ₊ⁿ is the n-component vector with non-negative components zi, ri(z, ai) is differentiable with respect to z in a certain open and convex domain W, and γi > 0 is the relative degradation rate of zi. The quantity a = {ai}, i ∈ N, represents a set of parameters defining the system’s genotype, the subset ai defining the genotype of Xi and comprising quantities like maximum production rate, activation thresholds, affinities of activators and inhibitors, mRNA to protein conversion rate, etc. In many modelling approaches of this type, ri is a Boolean or Boolean-like functional of sigmoidal functions or piecewise constant functions, see [22] for a review of modelling approaches for gene networks. It should be noted that there could be long and complicated chains of effects incorporated into ri(z, ai) [23].

We assume that for each combination of genotypes of the loci Xi in N, the system composed of Eqs. (1) has a single hyperbolic, asymptotically stable and differentiable point-like solution x in W. We show in Section 2.2 that under reasonable assumptions an equilibrium x always exists. If N has no positive loops, x is unique [24, 25]. To avoid having to discuss possible problems related to multistationarity, we invoke the additional assumption that the equilibrium of the system is unique within the domain of phase space of interest even if there are positive loops in the system.
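As a concrete instance of Eq. (1), the following minimal sketch integrates a hypothetical two-locus negative feedback loop until it settles at its stable equilibrium. The Hill-type dose-response functions, the parameter values and the names `hill`, `rates` and `equilibrium` are illustrative assumptions, not part of the framework itself; with these choices the equilibrium is x = (1, 1).

```python
def hill(z, theta=1.0, p=2):
    """Increasing sigmoid with values in [0, 1); theta and p are illustrative."""
    return z**p / (z**p + theta**p)


def rates(z, gamma=(1.0, 1.0)):
    """Right-hand side of Eq. (1): production r_i(z) minus degradation gamma_i * z_i."""
    z1, z2 = z
    r1 = 2.0 * (1.0 - hill(z2))  # assumed: X2 represses X1
    r2 = 2.0 * hill(z1)          # assumed: X1 activates X2
    return (r1 - gamma[0] * z1, r2 - gamma[1] * z2)


def equilibrium(z0=(0.5, 0.5), dt=0.01, steps=20000):
    """Forward-Euler integration, long enough for the state to settle."""
    z = list(z0)
    for _ in range(steps):
        dz = rates(z)
        z = [zi + dt * dzi for zi, dzi in zip(z, dz)]
    return z


x = equilibrium()
print(x, rates(x))
```

Forward Euler is used only for brevity; any ODE solver would serve, since only the equilibrium is of interest here.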

2.2. Propagation functions

A shift in the equilibrium value of some xk due to a change in parameters specific for Xk will propagate through the network and lead to shifts in other equilibrium values. The propagation follows the network connections, which can be read out from the Jacobian J of Eq. (1) in the stable state x. To the network N corresponding to Eq. (1) we associate a signed digraph G. To each node or locus Xi is associated a vertex Xi in G. Let Xj → Xi indicate a direct effect from Xj to Xi if Jij = ∂ri(z, a)/∂zj ≠ 0 in z = x. The effect of Xj on Xi is positive (negative) if the rate of change żi increases (decreases) when zj increases. For this direct effect there is a corresponding directed arc in G from Xj to Xi with a sign equal to the sign of Jij associated to it. The sequence of direct effects Xk → Xj → ⋯ → Xl is called a chain from Xk to Xl if each node in the chain occurs only once [26]. This chain corresponds to a simple path in G from Xk to Xl. We will use the term propagation chain.

The following proposition shows that for each pair k, l ∈ N, where l ≠ k, there exists a propagation function plk which determines how the perturbed value of xl due to a genetic variation in Xk is given in terms of xk.

Proposition 1

Let k ∈ N be given, let L = N \ {k}, and consider the set of equilibrium conditions

$$f_l(x_L, x_k, a_l) = r_l(x_L, x_k, a_l) - \gamma_l x_l = 0, \quad l \in L, \tag{2}$$

where all xi ≥ 0, and all rl satisfy rl(xL, xk, al) > 0 for xl = 0. For any xk the system of equations fL(xL, xk, aL) = 0 has at least one set of solutions xl = plk(xk, a(k)), where a(k) is the set of parameters not occurring in the rate equation of Xk.

The proposition follows directly from Theorem 4.9 in [27]. Because the equation fk(xL, xk, ak) = 0 is not included in the system of equations fL(xL, xk, aL) = 0, the solution xL is independent of the Xk-specific parameters ak. This fact is important because it implies that the effect on xl of any genetic variation of Xk is given by a fixed propagation function plk.

From this follows the usefulness of the propagation functions. A genotypic variation (mutation) in a gene may lead to new equilibrium values of the gene products in the network. One way of addressing this would be to try to parametrise the genotypic variation, and then model the dependence of the equilibrium values on the relevant parameters. The propagation functions offer a simpler solution because they require no knowledge of how the mutated gene could be modelled. They only relate the observable equilibrium concentrations. There is no need to take account of what is the cause of the genetic variation of Xk, how this manifests itself in a shift of parameter values in ak, or how this parameter value shift might influence plk. All that matters are the shifted values of xk and xl. For a given k the set of all the functions plk contains all information about how the genetic change in Xk becomes manifested in the network against a fixed genetic background (the genotypes of all the other genes).
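The independence of plk from the Xk-specific parameters can be illustrated numerically. In this minimal sketch, which assumes a hypothetical two-locus loop with Hill-type dose-response functions (all names and values are illustrative), the genotype of X1 is varied through its maximal production rate `a1`, and every resulting equilibrium pair (x1, x2) falls on the same curve x2 = p21(x1).

```python
def hill(z, theta=1.0, p=2):
    """Increasing sigmoid with values in [0, 1); theta and p are illustrative."""
    return z**p / (z**p + theta**p)


def equilibrium(a1, dt=0.01, steps=40000):
    """Equilibrium of a two-locus loop; a1 plays the role of the genotype of X1."""
    z1, z2 = 0.5, 0.5
    for _ in range(steps):
        dz1 = a1 * (1.0 - hill(z2)) - z1  # assumed: X2 represses X1, gamma_1 = 1
        dz2 = 2.0 * hill(z1) - z2         # assumed: X1 activates X2, gamma_2 = 1
        z1, z2 = z1 + dt * dz1, z2 + dt * dz2
    return z1, z2


def p21(x1):
    """Propagation function from X1 to X2; it contains no X1-specific parameter."""
    return 2.0 * hill(x1)


pairs = [equilibrium(a1) for a1 in (1.0, 2.0, 3.0)]
for x1, x2 in pairs:
    print(x1, x2, p21(x1))
```

Here p21 is available in closed form only because the example is so small; in general it would have to be recorded from the shifted equilibria.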

In the following we explore the properties of the propagation functions and show how they are related to the structure and interactions in the network. We derive these properties from the network structure, and we also use what can be learned about propagation functions from observed equilibrium values to obtain information about causal chains in the network.

For a given k the functions plk are in principle observable by varying the genotype of Xk while keeping the other loci fixed and recording the shifted equilibrium values. Of course, solving explicitly for plk in a given model is in general infeasible because of the non-linearities in the system. However, finding the derivative of plk is a linear problem. In the following we relate the derivative p′lk(xk, a(k)) = qlk(xk, a(k)) to the values of the elements of the Jacobian J of Eqs. (1) for a given k and any l ≠ k. Let L = N \ {l, k} and j ∈ L. All the equilibrium conditions EL define xj as a function of xk, i.e. xL = pLk(xk). Then, when the expression in Eq. (6) below for dxl/dxk exists,

$$\gamma_l x_l = r_l(x_l, x_k, p_{Lk}(x_k)) \tag{3}$$

defines xl as a function of xk around the steady state. Differentiating Eq. (3) with respect to xk gives

$$\sum_{j \neq k} J_{lj}\, q_{jk} = -J_{lk}. \tag{4}$$

Let Q(k) be the column vector with components qik and ν(k) the column vector with elements ∂fi/∂xk, both with i = k excluded. Then

$$J^{(kk)}\, Q^{(k)} = -\nu^{(k)}. \tag{5}$$

Using Cramer’s rule and interchanging columns in the numerator finally leads to

$$\frac{dx_l}{dx_k} = q_{lk}(x_k, a^{(k)}) = (-1)^{k+l}\, \frac{D^{(kl)}}{D^{(kk)}}. \tag{6}$$

Note that the right hand side is in fact independent of fk(x, a) because row number k in J is deleted in both determinants. This confirms Proposition 1. However, despite this, genotype variation in Xk will shift the equilibrium values and indirectly affect the values of the matrix elements of J. Furthermore, it follows from the implicit function theorem (see e.g. [28]) that if D(kk) ≠ 0 in x, then there is a unique differentiable mapping plk : xk ↦ xl in a neighbourhood of x whose derivative can be given as above.
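Eq. (6) can be checked against the linear system in Eq. (4) for any numerical Jacobian. The sketch below is a minimal illustration under two assumptions: D(kl) is taken to be the determinant of J with row k and column l deleted, and indices are 0-based, which leaves the parity of k + l unchanged. The matrix values and the function names are hypothetical.

```python
from itertools import permutations


def det(M):
    """Determinant by Leibniz expansion; adequate for small matrices."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1.0 if inversions % 2 else 1.0
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total


def minor(J, row, col):
    """Determinant of J with one row and one column deleted."""
    n = len(J)
    return det([[J[i][j] for j in range(n) if j != col] for i in range(n) if i != row])


def q(J, l, k):
    """Eq. (6) with 0-based indices: q_lk = (-1)^(k+l) D^(kl) / D^(kk)."""
    return (-1) ** (k + l) * minor(J, k, l) / minor(J, k, k)


def residual(J, l, k):
    """Left-hand side minus right-hand side of Eq. (4) for row l."""
    n = len(J)
    return sum(J[l][j] * q(J, j, k) for j in range(n) if j != k) + J[l][k]


# Hypothetical Jacobian of a three-locus network (values chosen for illustration).
J = [[-1.0, 0.0, -0.8],
     [0.5, -1.0, 0.0],
     [0.3, 0.7, -1.0]]
print([q(J, l, 0) for l in (1, 2)])
print([residual(J, l, 0) for l in (1, 2)])
```

The residuals vanish up to rounding, as they must for any J with D(kk) ≠ 0.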

Eq. (6) shows that the propagation of genetic variation in locus Xk is intimately linked to the feedback loop structure of the network in the stable state. While the left hand side of Eq. (6) can be approximated by finite differences of observable equilibrium values after a perturbation of Xk, its right hand side depends on the feedback structure of the network, which is not directly accessible. In the following section we introduce the propagation chain concept and show how it is linked to J, and how it discloses the biological implications of Eq. (6). First, however, we show that Eq. (6) sheds some light on the conditions for the validity of the chain rule for functions defined implicitly by a set of equations.

From an imprudent application of the chain rule to xm = pml(xl) and xl = plk(xk) one might be tempted to conclude that xm = pml ∘ plk(xk) and

$$p'_{ml}\, p'_{lk} = p'_{mk}, \tag{7}$$

where k ∈ N, m ∈ N and k ≠ m. This, however, is not generally true. In Appendix C we prove and comment on the following result:

Proposition 2

Assume the variables have been renumbered such that k = 1 and 1 < l < m < n, and define the sets L = {1 : l}, M = {(l + 1) : n}, Q = {1 : (l − 1)}, R = {l : n}, where {i : j} = {i, i + 1, …, j} for i < j and {i : i} = {i}. In terms of partitioned matrices

$$J = \begin{pmatrix} J_{LQ} & J_{LR} \\ J_{MQ} & J_{MR} \end{pmatrix}. \tag{8}$$

If JMQ = 0, the chain rule Eq. (7) is fulfilled.

The converse is not true, however: there may be nonzero elements in JMQ that do not enter into feedback loops, and such elements do not jeopardise the rule.

Because the numbering of the nodes is arbitrary and immaterial, this result can be interpreted as follows. If all chains of effects from Xk to Xm pass through Xl, then the chain rule Eq. (7) is fulfilled, even if there are return chains from Xm to Xk so that both nodes are members of a feedback loop. However, if there exists a chain from Xk to Xm that does not pass through Xl, the chain rule may be violated. Apart from this, the network structure is immaterial.

As a simple illustration we consider the three-gene system

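$$x_1 = r_1(x_3), \quad x_2 = r_2(x_1), \quad x_3 = r_3(x_1, x_2), \tag{9}$$

in which all γi = 1. The two chains X1 → X2 → X3 and X1 → X3 constitute a feedforward loop from X1 to X3, X2 playing the role of the intermediate element Xl in Eq. (7). Then

$$\frac{dx_2}{dx_1} = q_{21} = \frac{dr_2}{dx_1}, \quad \frac{dx_3}{dx_2} = q_{32} = \frac{\partial r_3 / \partial x_2}{1 - \dfrac{\partial r_3}{\partial x_1} \dfrac{\partial r_1}{\partial x_3}}, \quad \frac{dx_3}{dx_1} = q_{31} = \frac{\partial r_3}{\partial x_1} + \frac{\partial r_3}{\partial x_2}\, q_{21}. \tag{10}$$

Obviously, q31 ≠ q32 q21 if r3 depends explicitly on x1, in which case there is a chain from X1 to X3 not passing through X2. On the other hand, the arc X3 → X1 causes no problem.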
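The failure of the chain rule in this example is easy to reproduce with numbers. The sketch below evaluates the three slopes in Eq. (10) from assumed values of the partial derivatives of r1, r2 and r3 at the equilibrium; the function name `slopes` and all numerical values are hypothetical.

```python
def slopes(dr1_dx3, dr2_dx1, dr3_dx1, dr3_dx2):
    """Return (q21, q32, q31) as given in Eq. (10), with all gamma_i = 1."""
    q21 = dr2_dx1
    q32 = dr3_dx2 / (1.0 - dr3_dx1 * dr1_dx3)
    q31 = dr3_dx1 + dr3_dx2 * q21
    return q21, q32, q31


# No direct arc from X1 to X3: every chain to X3 passes through X2.
q21, q32, q31 = slopes(dr1_dx3=-0.5, dr2_dx1=0.6, dr3_dx1=0.0, dr3_dx2=0.8)
chain_rule_gap_without_shortcut = abs(q31 - q32 * q21)

# With a direct arc from X1 to X3 the product q32 * q21 no longer equals q31.
q21, q32, q31 = slopes(dr1_dx3=-0.5, dr2_dx1=0.6, dr3_dx1=0.4, dr3_dx2=0.8)
chain_rule_gap_with_shortcut = abs(q31 - q32 * q21)

print(chain_rule_gap_without_shortcut, chain_rule_gap_with_shortcut)
```

In both cases the return arc from X3 to X1 is present (dr1_dx3 ≠ 0); only the shortcut from X1 to X3 decides whether the gap is zero.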

2.3. Propagation chains and feedback loops

We start this section with a few standard definitions and clarifications.

  • A circuit is a set of elements in the Jacobian J whose circuit product (the product of all the elements in the circuit) contributes to det(J) or one of its principal subdeterminants. An element in a circuit represents either an action from one node to another or to itself (a regulatory element), or a degradation term. Thus, a circuit with i elements involves i nodes. The signed circuit product of a circuit equals its circuit product times a sign factor defined in Appendix B. A full circuit is a circuit with n elements. The length of a circuit equals the number of elements in the circuit. The sign of a circuit equals the sign of its circuit product.

  • If there is a circuit among a subset of nodes and another circuit among another disjoint subset of nodes, the two circuits are subcircuits in a composite circuit. The circuit product of a composite circuit can always be factorised as a product of two or more subcircuit products. The sign of the composite circuit equals the product of the signs of all the subcircuits.

  • A proper circuit is a circuit that is not composite. Its circuit product cannot be factorised into subcircuit products.

  • A feedback loop or just a loop is a circuit that only comprises regulatory elements. A feedback loop comprises one or more closed chains of actions or effects (closed paths) in the network in which any node in the chains occurs just once.

  • An autoregulatory loop is a loop with one member, arising from a node whose product acts on its own dose-response function.

For example, if

$$J = \begin{pmatrix} -\gamma_1 & 0 & 0 \\ 0 & -\gamma_2 & c_{23} \\ 0 & c_{32} & -\gamma_3 \end{pmatrix}, \tag{11}$$

there is just one regulatory loop in the network (X2 ⇄ X3), but several (composite) circuits, for instance −γ1c23c32. The degradation terms in Eq. (1) ensure that all nodes are members of one or more circuits which could be purely regulatory loops or a mixture of regulatory effects and degradation terms. A circuit product is therefore always either a loop product or equal to a loop product times one or more factors −γj. Accordingly, in the mathematical sense there exists at least one (proper or composite) circuit LN comprising all nodes, even in cases where there is no full (regulatory) loop. In Appendix B we recall a few useful facts about subdeterminants and circuits.
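The full circuits of a small Jacobian can be listed by enumerating the nonzero terms in the Leibniz expansion of det(J). The following sketch does this for the matrix in Eq. (11) with hypothetical numerical values. It assumes that the sign factor of a full circuit is the sign of the corresponding permutation, which is the usual convention but is not spelled out in this section.

```python
from itertools import permutations


def full_circuits(J):
    """List (permutation, circuit product, signed circuit product) for all full circuits."""
    n = len(J)
    circuits = []
    for perm in permutations(range(n)):
        product = 1.0
        for i in range(n):
            product *= J[i][perm[i]]
        if product != 0.0:
            inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
            sign = -1.0 if inversions % 2 else 1.0  # assumed sign factor: permutation sign
            circuits.append((perm, product, sign * product))
    return circuits


# Hypothetical values for the parameters in Eq. (11).
g1, g2, g3, c23, c32 = 1.0, 2.0, 3.0, 0.5, 0.4
J = [[-g1, 0.0, 0.0],
     [0.0, -g2, c23],
     [0.0, c32, -g3]]

circuits = full_circuits(J)
determinant = sum(signed for _, _, signed in circuits)
for perm, product, signed in circuits:
    print(perm, product, signed)
print(determinant)
```

Two full circuits survive, with circuit products −γ1γ2γ3 and −γ1c23c32, and their signed products sum to det(J) = −γ1(γ2γ3 − c23c32).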

Let U = {u1, u2, …, uρ} be a subset of N with k as its first element and l as its last, and let ρ = |U| be the number of elements in U. Then CU is a propagation chain Xu1 → Xu2 → Xu3 → … → Xuρ if the product

$$C_U = J_{u_\rho u_{\rho-1}}\, J_{u_{\rho-1} u_{\rho-2}} \cdots J_{u_2 u_1} \tag{12}$$

is nonzero. If CU is made to close on itself by appending the action Xuρ → Xu1, it becomes the loop LU with loop product PU = Ju1uρ CU.

Next we show that if some qlk(xk, a(k)) ≠ 0, there must be a chain propagating the effect of a shift in xk from Xk to Xl. (The opposite is not true, as the contributions from two or more chains might accidentally cancel.) Combining Eq. (6) with known formulas for the expansion of determinants in terms of minors [27], we can express qlk as

$$q_{lk}(x_k, a^{(k)}) = \frac{1}{D^{(kk)}} \sum_U (-1)^{|U|-1}\, D_V^V\, C_U, \tag{13}$$

where U is any chain set with u1 = k and uρ = l, V = N \ U, CU = CU(J) is the chain product of U, and the sum runs over all such U. Keep in mind that Eq. (6) presupposes k ≠ l. Combining Eq. (6) and Eq. (13) we see that D(kl) is a weighted sum of the chain products in J of all chains leading from Xk to Xl. If no such chain exists for given k and l, then qlk(xk, a(k)) = 0, as expected.
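The chain expansion can be compared with the determinant formula in Eq. (6) by brute force. This minimal sketch enumerates all simple chains from Xk to Xl in a hypothetical three-locus Jacobian. It assumes that the weight of a chain U is (−1)^(|U|−1) times the principal subdeterminant of J on the complement V = N \ U, with the empty determinant equal to 1, and it uses 0-based indices.

```python
from itertools import permutations


def det(M):
    """Determinant by Leibniz expansion; the empty matrix has determinant 1."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1.0 if inversions % 2 else 1.0
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total


def sub(J, rows, cols):
    """Submatrix of J on the given rows and columns."""
    return [[J[i][j] for j in cols] for i in rows]


def q_minor(J, l, k):
    """Eq. (6): ratio of the minors D^(kl) and D^(kk), with sign (-1)^(k+l)."""
    idx = range(len(J))
    rows = [i for i in idx if i != k]
    d_kl = det(sub(J, rows, [j for j in idx if j != l]))
    return (-1) ** (k + l) * d_kl / det(sub(J, rows, rows))


def q_chains(J, l, k):
    """Eq. (13): weighted sum over all simple chains from X_k to X_l."""
    n = len(J)
    rows = [i for i in range(n) if i != k]
    others = [i for i in range(n) if i not in (k, l)]
    total = 0.0
    for size in range(len(others) + 1):
        for middle in permutations(others, size):
            U = (k,) + middle + (l,)
            chain = 1.0
            for a, b in zip(U, U[1:]):
                chain *= J[b][a]  # J[b][a] is the direct effect of X_a on X_b
            if chain == 0.0:
                continue
            V = [i for i in range(n) if i not in U]
            total += (-1) ** (len(U) - 1) * det(sub(J, V, V)) * chain
    return total / det(sub(J, rows, rows))


# Hypothetical Jacobian (values chosen for illustration).
J = [[-1.0, 0.0, -0.8],
     [0.5, -1.0, 0.0],
     [0.3, 0.7, -1.0]]
print(q_minor(J, 2, 0), q_chains(J, 2, 0))
```

For k = 0 and l = 2 the two chains X0 → X2 and X0 → X1 → X2 contribute 0.3 and 0.35, respectively, to the sum.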

For a given gene regulatory network model, Eq. (6), or alternatively Eq. (13), allows us to obtain analytical expressions predicting how variation in a gene Xk affects the equilibrium concentrations of all other genes in the network. In those cases where qlk(xk, a(k)) equals zero, the genetic variation in Xk does not become manifested in the output of node Xl even though the equilibrium concentration of xk is changed. If the variation becomes manifested in the output of more than one locus, the introduced polymorphism is pleiotropic. Since qlk(xk, a(k)) depends explicitly on all chains leading from Xk to Xl, a change in genotype at one or more loci involved can potentially modify the effect on xl of a shift in xk, leading to epistasis. This implies that the epistasis and pleiotropy features of all loci can be mapped in a systematic way. This information can be used to validate a particular model against experimental measurements of qlk(xk, a(k)) as well as to identify generic characteristics of how variation is manifested as a function of regulatory anatomy.

2.4. The regulatory feedback effect on xk of genetic variation in Xk

The formula (13) for qlk is only valid for k ≠ l. We now want to define a function which can be used to determine the effect of genotypic variation in Xk on xk itself. It is obvious from Eq. (1) that even in an isolated node Xk without autoregulation, a change of genotype manifested as a change of ak will in general lead to a shifted value of xk. We will call this an unmediated effect. In addition there may be contributions from mediated effects due to the feedback loops involving Xk, including an autoregulatory loop. For example, in a system with the Jacobian in Eq. (11), a change of genotype in X2 will lead to a shift in x2 for two reasons: a change of the dose-response function r2, and because of the loop X2 ⇄ X3. The resultant of both effects determines how xk responds to genetic variation in Xk.

We let XL be the set of nodes apart from Xk itself that act directly on Xk, and XM the remaining set of nodes, such that {k} ∪ L ∪ M = N. The stationarity condition for node Xk is

$$\gamma_k x_k = r_k(x_k, x_L, a_k). \tag{14}$$

According to Proposition 1 we can in principle find xl = plk(xk, a(k)) for all l ∈ L ∪ M, i.e. all l ≠ k. Inserting this into Eq. (14) gives

$$\gamma_k x_k = r_k(x_k, p_{Lk}(x_k, a^{(k)}), a_k). \tag{15}$$

We define the feedback function ϕk for Xk by

$$\phi_k(x_k, a) = r_k(x_k, p_{Lk}(x_k, a^{(k)}), a_k), \tag{16}$$

or just ϕk(xk) = rk(xk, pLk(xk)), and express the stationarity condition for Xk as

$$x_k = \frac{1}{\gamma_k}\, \phi_k(x_k, a). \tag{17}$$

For a given genotype, expressed as a given value of the parameter set a, the value of xk can be found as the (by assumption stable and unique) solution of this equation. If ψk(xk, a) = ϕ′k(xk, a) ≡ 0, where the prime denotes the derivative with respect to xk, then Xk is not involved in any regulatory feedback loop. However, if ψk(xk, a) = ϕ′k(xk, a) ≠ 0, there is an effective feedback of Xk on itself, mediated by one or more loops. Therefore the feedback function ϕk describes and quantifies the feedback effects of changes in the equilibrium value of Xk on itself.

The derivative of ϕk can be expressed in terms of the Jacobi matrix elements. Differentiating Eq. (16) with respect to xk, using xl = plk(xk, a(k)), Eq. (6) and that ∂rk/∂xm = 0 for all m ∈ M defined just before Eq. (14), we find

$$\psi_k(x_k, a) = \frac{\partial r_k}{\partial x_k} + \sum_{l \in L} \frac{\partial r_k}{\partial x_l}\, q_{lk} = \gamma_k + \frac{D}{D^{(kk)}}. \tag{18}$$

Let Fk be the sum of the signed circuit products (defined in Appendix B) of all full circuits in J in which there is a real regulation of Xk, but not necessarily of the other nodes. (We do not consider the linear degradation as a regulation. For example, in J defined in Eq. (11), there are two full circuits with circuit products −γ1 γ2 γ3 and −γ1 c23 c32, respectively, but only the latter includes a real regulation of X2 (by X3) and would contribute to F2. Neither contributes to F1.)

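As an illustration we consider the system

$$\gamma_1 x_1 = r_1(x_1, x_2, x_3), \quad \gamma_2 x_2 = r_2(x_1), \quad \gamma_3 x_3 = r_3(x_2), \tag{19}$$

with the two loops X1 ⇄ X2 and X1 → X2 → X3 → X1. With k = 1 we readily find

$$\psi_1 = \gamma_1 + \frac{1}{\gamma_2 \gamma_3} \left( \gamma_2 \gamma_3 J_{11} + \gamma_3 J_{12} J_{21} + J_{13} J_{32} J_{21} \right). \tag{20}$$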

The expression in the parenthesis is D. Replacing the diagonal Jacobian element J11 = ∂r1/∂x1 − γ1 by its regulatory part ∂r1/∂x1 turns it into F1. Its second term comes with a positive sign because the minus sign for −γ3 is cancelled by the negative signature factor of the loop X1 ⇄ X2. Here as always, Fk is independent of γk, but not of the other degradation rates.
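The illustration can be verified numerically for the network structure of Eq. (19). In this minimal sketch the matrix `R` holds assumed values of the regulatory derivatives ∂ri/∂xj at the equilibrium, and the Jacobian is built as J = R − diag(γ). All numbers and function names are hypothetical, and indices are 0-based. The gain ψ1 is computed once from the sum in Eq. (18) and once from the determinant ratio, and is compared with F1 divided by γ2γ3.

```python
from itertools import permutations


def det(M):
    """Determinant by Leibniz expansion; adequate for small matrices."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1.0 if inversions % 2 else 1.0
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total


def minor(J, row, col):
    """Determinant of J with one row and one column deleted."""
    n = len(J)
    return det([[J[i][j] for j in range(n) if j != col] for i in range(n) if i != row])


def psi(R, gamma, k):
    """Return psi_k computed from both expressions in Eq. (18)."""
    n = len(R)
    J = [[R[i][j] - (gamma[i] if i == j else 0.0) for j in range(n)] for i in range(n)]
    d_kk = minor(J, k, k)
    q = {l: (-1) ** (k + l) * minor(J, k, l) / d_kk for l in range(n) if l != k}
    direct = R[k][k] + sum(R[k][l] * q[l] for l in q)
    via_det = gamma[k] + det(J) / d_kk
    return direct, via_det


gamma = [1.0, 2.0, 1.5]
R = [[0.2, -0.6, 0.5],   # r1 depends on x1, x2 and x3
     [0.7, 0.0, 0.0],    # r2 depends on x1 only
     [0.0, 0.9, 0.0]]    # r3 depends on x2 only
direct, via_det = psi(R, gamma, 0)
F1 = (gamma[1] * gamma[2] * R[0][0]
      + gamma[2] * R[0][1] * R[1][0]
      + R[0][2] * R[2][1] * R[1][0])
print(direct, via_det, F1 / (gamma[1] * gamma[2]))
```

All three printed numbers coincide, and none of them changes if `gamma[0]` is altered, since F1 and D(11) do not contain γ1.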

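We also note that D(kk) is the sum of the signed circuit products of all circuits (proper and composite) of length n − 1 which do not involve Xk. Expanding D along row k gives

$$D = \frac{\partial r_k}{\partial x_k}\, D^{(kk)} - \gamma_k D^{(kk)} + \sum_{j \neq k} (-1)^{k+j}\, \frac{\partial r_k}{\partial x_j}\, D^{(kj)}. \tag{21}$$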

According to the lemma in Appendix B a determinant can be expanded as a sum of its signed circuit products. The first term in Eq. (21) is the sum of all full, composite circuit products with an autoregulatory subcircuit in Xk. Each term in the last sum is the determinant of a matrix Kkj obtained by setting all elements in row k and column j except Jkj equal to zero. Then det(Kkj) is the sum of all circuit products in J which involve the element Jkj, i.e. in which Xk contributes with an active regulation. This gives D = Fk − γkD(kk), and if D(kk) ≠ 0,

$$\psi_k(x_k, a) = \frac{F_k}{D^{(kk)}} = \frac{\gamma_k F_k}{F_k - D}. \tag{22}$$

By this we have obtained a formula that relates the gain of the feedback function of a locus Xk to the circuit products of the full circuits in which Xk is regulated. This circuit would be either a full loop or a set of subloops, one of them involving Xk, and a number of degradation terms. It provides an analytic basis for the intuition that a high gain is obtained if the loops that Xk enters into are much stronger than the rest, i.e. if |Fk| ≫ |D(kk)|. Note that because x is hyperbolic by assumption, D ≠ 0, thus ψk(xk, a) ≠ γk.

If ψk(xk, a) = 0, then Fk = 0, which means that there is no effective regulation of Xk or the effects of the regulating loops happen to cancel. Then assume ψk(xk, a) ≠ 0. Solving Eq. (22) with respect to D and using that (−1)^n D > 0 (see Appendix B) leads to

$$(-1)^n F_k\, \Omega_k(x_k, a) > 0, \tag{23}$$

where

$$\Omega_k(x_k, a) = \frac{\psi_k(x_k, a) - \gamma_k}{\psi_k(x_k, a)}. \tag{24}$$

Assume there exists a full circuit composed of a proper loop L involving Xk and perhaps a number of degradation terms. Let P be the loop product of L. As is illustrated in Eq. (20), the sign of this circuit’s product is equal to sign(P) independently of the number of degradation terms, because the negative signs of the degradation terms are compensated by the signature factor (see Lemma 2 in Appendix B) of the full loop. If sign(Fk) = sign(P), we call L a sign-dominant loop of Xk. The signature factor of L is (−1)^(n−1) because it has n members. The sign of its contribution to Fk is therefore (−1)^(n−1) sign(P), yielding the result

$$P\, \Omega_k(x_k, a) < 0, \tag{25}$$

which will be used to prove Proposition 5. From this follows readily

Proposition 3

If P > 0, then 0 < ψk(xk, a) < γk, and if P < 0, then ψk(xk, a) < 0 or ψk(xk, a) > γk, and vice versa.

Thus, a positive sign-dominant proper loop implies a feedback function with positive slope bounded by the degradation rate, while a negative sign-dominant loop implies either negative slope or a large positive slope of ϕk. If L is a composite loop, Eq. (25) is replaced by

$$(-1)^{n + \varepsilon_L}\, P\, \Omega_k(x_k, a) > 0, \tag{26}$$

where (−1)^(εL) is the signature of the loop. If Fk ≠ 0, there is always at least one sign-dominant loop for Xk.
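Proposition 3 can be checked on the smallest possible case. The sketch below assumes a two-locus loop without autoregulation, with Jacobian rows (−γ1, c12) and (c21, −γ2), so that P = c12c21, D = γ1γ2 − P, and Eq. (18) gives ψ1 = c12c21/γ2. The function names and the numerical values are hypothetical.

```python
def omega(psi, gamma_k):
    """Eq. (24): Omega_k = (psi_k - gamma_k) / psi_k."""
    return (psi - gamma_k) / psi


def two_locus_loop(c12, c21, gamma1=1.0, gamma2=1.0):
    """Return (P, psi_1, P * Omega_1) for a stable two-locus loop."""
    P = c12 * c21                 # loop product of the single proper loop
    D = gamma1 * gamma2 - P       # det(J) for n = 2
    if D <= 0.0:
        raise ValueError("stability requires (-1)^n D > 0, i.e. D > 0 for n = 2")
    psi1 = P / gamma2             # Eq. (18) with no autoregulation of X1
    return P, psi1, P * omega(psi1, gamma1)


positive_loop = two_locus_loop(0.5, 0.8)    # P > 0
negative_loop = two_locus_loop(-2.0, 1.5)   # P < 0
print(positive_loop)
print(negative_loop)
```

In this small example P·Ω1 reduces to −D, so Eq. (25) is just the stability condition D > 0: the positive loop gives 0 < ψ1 < γ1 and the negative loop gives ψ1 < 0.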

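To compute the values of xk for a slight change of genotype in Xk, the shift in ak must also be taken into account. Let xk = xk(a) be the solution of Eq. (17), and let b ∈ ak be a single parameter. Differentiating Eq. (17) and introducing Jacobi elements as in the derivation of Eq. (18) we find

$$\frac{\partial x_k}{\partial b} = -\frac{D^{(kk)}}{D}\, \frac{\partial r_k}{\partial b} = -\frac{D^{(kk)}}{D}\, \frac{\partial \phi_k}{\partial b} = \frac{1}{\gamma_k - \psi_k(x_k, a)}\, \frac{\partial \phi_k}{\partial b}. \tag{27}$$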

This formula emphasises the importance of the feedback function as a source of information about the phenotypic effects of genotype changes.
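The last expression in Eq. (27) can be compared with a direct finite-difference estimate of the equilibrium shift. The sketch assumes a hypothetical two-locus negative loop with Hill-type dose-response functions, in which `b` is the maximal production rate of X1 and the feedback function ϕ1 is known in closed form. All names and values are illustrative.

```python
def hill(z):
    """Increasing sigmoid z^2 / (1 + z^2); an illustrative dose-response shape."""
    return z * z / (1.0 + z * z)


def phi(x1, b):
    """Feedback function of X1: r1 evaluated along x2 = p21(x1)."""
    x2 = 2.0 * hill(x1)           # assumed propagation function p21 (gamma_2 = 1)
    return b * (1.0 - hill(x2))   # assumed r1: X2 represses X1, b is the maximal rate


def solve_x1(b, gamma1=1.0):
    """Solve gamma1 * x1 = phi(x1, b), Eq. (17), by bisection."""
    lo, hi = 0.0, b / gamma1 + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma1 * mid - phi(mid, b) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sensitivity_formula(b, gamma1=1.0, h=1e-6):
    """Right-hand side of Eq. (27), with derivatives of phi by central differences."""
    x1 = solve_x1(b, gamma1)
    psi = (phi(x1 + h, b) - phi(x1 - h, b)) / (2.0 * h)
    dphi_db = (phi(x1, b + h) - phi(x1, b - h)) / (2.0 * h)
    return dphi_db / (gamma1 - psi)


def sensitivity_numeric(b, gamma1=1.0, h=1e-5):
    """Direct estimate of the shift of the equilibrium x1 with b."""
    return (solve_x1(b + h, gamma1) - solve_x1(b - h, gamma1)) / (2.0 * h)


print(solve_x1(2.0), sensitivity_formula(2.0), sensitivity_numeric(2.0))
```

Bisection suffices here because ϕ1 decreases with x1 in a negative loop, so γ1x1 − ϕ1 has a single root; at b = 2 the equilibrium is x1 = 1 with ψ1 = −1, and both estimates give a sensitivity close to 0.25.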

We are now ready to use these results to analyse diploid networks.

3. Allele interaction in networks with diploid loci

The rest of the paper deals with models of diploid systems, that is, systems in which chromosomes come in pairs with one variant of each gene, called an allele, on each of the two chromosomes. Thus, each gene is composed of two alleles, each allele being regulated more or less independently of the other, and the product of the gene is some combination of the product of each of the two alleles. If the two alleles are identical, the gene is called homozygous; if they are different, it is heterozygous; and if one of the alleles has been knocked out, it is hemizygous.

Since the dawn of genetics, additive and dominant gene actions in diploids have been defined by comparing heterozygote and homozygote phenotypes without reference to, or model of, the functional dependency between the two alleles composing each genotype. However, from [14] as well as the present paper it is clear that it is precisely the interaction between the two alleles that gives rise to non-additive gene action. Consequently, the genetics concepts of additive and dominant gene actions cannot explain basic phenomena in genetics theory from regulatory biology. Exploiting the additivity and nonadditivity properties of the two alleles, Gjuvsland et al. [14] showed that by means of the new concept of allele interaction, gene regulatory systems with one or two loci can be linked to single locus genetic theory.

We first present ways of modelling a network of genes involving diploid loci in an efficient way, and then introduce the concept of allele interaction. Finally, we study how the sign of the allele interaction is related to the feedback structure of the network. When studying allele interaction, we contrast different genotypes at a single focal locus without specifying the genotype of the rest of the loci.

3.1. Allele-specific diploid gene regulatory network models

As our objective is to relate the changes in genotypic value (i.e. the phenotype) due to allelic variation of a locus Xi to a potential interaction between its two alleles, we need to model the function and regulation of the two alleles as two distinct entities. A bi-allelic node Xi, with one allele on each of the two chromosomes, splits into two subnodes Xi1 and Xi2, with zi1 and zi2 representing the concentration of gene product from each of the two chromosomes, respectively.

As stated in the introduction we assume that the outputs of the two alleles are functionally equivalent in the sense that they regulate other genes in the same fashion, genetic differences manifesting themselves only in the regulation of the two alleles, not in qualitative differences in their output. This implies that the nodes in the network are regulated by the total gene product zi=zi1+zi2, not by its two constituents separately. If this assumption should not hold for a gene Xi, the dose-response function of a downstream gene could depend on zi1 and zi2 separately. In such cases the simplifications described below would not be justified, and one would have to model the two alleles of this gene by two separate equations.

We let superscripts $\alpha_i$ and $\beta_i$ denote the alleles in $X_i^1$ and $X_i^2$, respectively, $\alpha_i \in \{1, 2\}$, $\beta_i \in \{1, 2\}$. If the genotype of $X_i$ is biallelic with alleles $\alpha_i$, $\beta_i$, we model its rate equations by

$$ \dot z_i^1 = f_i^{\alpha_i}(z) = r_i^{\alpha_i}(z) - \gamma_i^{\alpha_i} z_i^1, \qquad \dot z_i^2 = f_i^{\beta_i}(z) = r_i^{\beta_i}(z) - \gamma_i^{\beta_i} z_i^2, \tag{28} $$

where $z_i = z_i^1 + z_i^2$. For simplicity we suppress the parameters $a_i$ from the arguments of the dose-response functions in the following. If $(\alpha_i, \beta_i) = (1, 1)$ or $(\alpha_i, \beta_i) = (2, 2)$, the equations describe a homozygous locus $X_i$, while $(\alpha_i, \beta_i) = (1, 2)$ describes the heterozygote. This model for a diploid node was first proposed by Omholt et al. [13].

If $X_i$ is a homozygous locus, $\alpha_i = \beta_i$, and simple addition of the two equations gives

$$ \dot z_i = 2 r_i^{\alpha_i}(z) - \gamma_i^{\alpha_i} z_i. \tag{29} $$

In the hemizygous genotypes where one allele has been knocked out and the remaining copy is of genotype $\alpha_i$, Eqs. (28) are reduced to

$$ \dot z_i^1 = f_i^{\alpha_i}(z) = r_i^{\alpha_i}(z) - \gamma_i^{\alpha_i} z_i^1, \tag{30} $$

and $z_i = z_i^1$. For each polymorphic locus we may therefore consider five different genotypes: the bi-allelic genotypes 11, 12, and 22, and the mono-allelic genotypes 1 and 2.
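As a rough illustration of Eqs. (28)-(30), the following sketch integrates one rate equation per allele present for a single locus that represses itself through its total product. The Hill-type dose-response function, the parameter values in `ALLELES` and the function names are assumptions made for this example only; they are not taken from the paper.

```python
# Hypothetical single-locus illustration of Eqs. (28)-(30); all numbers are invented.

def hill_repressor(z, vmax, theta, p=2.0):
    """Decreasing dose-response r(z): production falls as total product z rises."""
    return vmax / (1.0 + (z / theta) ** p)

# Assumed allele parameters: maximal rate, threshold, relative degradation rate.
ALLELES = {
    1: {"vmax": 10.0, "theta": 2.0, "gamma": 1.0},
    2: {"vmax": 6.0, "theta": 3.0, "gamma": 1.5},
}

def equilibrium(genotype, dt=0.01, t_end=200.0):
    """Forward-Euler integration with one rate equation per allele present.

    genotype is a tuple such as (1,), (1, 2) or (2, 2); a knocked-out allele
    is simply left out. Returns the per-allele outputs at time t_end.
    """
    z = [0.0] * len(genotype)
    for _ in range(int(t_end / dt)):
        total = sum(z)  # every allele is regulated by the total gene product
        z = [
            zi + dt * (hill_repressor(total, ALLELES[a]["vmax"], ALLELES[a]["theta"])
                       - ALLELES[a]["gamma"] * zi)
            for zi, a in zip(z, genotype)
        ]
    return z

if __name__ == "__main__":
    # The five genotypes considered in the text: 1, 2, 11, 12 and 22.
    for g in [(1,), (2,), (1, 1), (1, 2), (2, 2)]:
        print(g, round(sum(equilibrium(g)), 4))
```

Leaving an allele out of the tuple plays the role of the knockout, so the same routine covers the hemizygous and the bi-allelic cases.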

In the following we consider a network in which $X_n$ is polymorphic while the genotypes of the remaining loci are unspecified but fixed. We first describe it by the extended system $S_E$ defined by the rate equations

$$ S_E: \begin{cases} \dot z_i = r_i(z) - \gamma_i z_i, & i = 1, \ldots, n-1, \\ \dot z_n^1 = r_n^{\alpha_n}(z) - \gamma_n^{\alpha_n} z_n^1, \\ \dot z_n^2 = r_n^{\beta_n}(z) - \gamma_n^{\beta_n} z_n^2, \end{cases} \tag{31} $$

where $z_n = z_n^1 + z_n^2$ and $z = [z_1, \ldots, z_n]$. For simplicity we drop the subscript $n$ to $\alpha_n$ and $\beta_n$ in the following. As above, we denote the presupposed asymptotically stable state of Eq. (31) by $x = [x_1, \ldots, x_n]$.

3.2. Aggregating diploid loci

Contrary to Eq. (28), common ways of modelling gene regulatory networks describe a gene by a single equation for the total output of the gene, even when the gene is diploid. In the present section we investigate whether these two contrasting modelling schemes can be unified into a common modelling approach.

By exploiting the assumption that the diploid node $X_n$ only acts on the other nodes by its total output $z_n$, we want to convert the extended model into a new model expressed in terms of the total product of a locus, while still keeping track of the properties of each of the two alleles. In other words, we want to construct a system $S_A$ obtained by merging $X_n^1$ and $X_n^2$ into one aggregated node $X_n$ with a single rate equation for $z_n$. We will call this conversion an aggregation. The rationale for this operation is that an aggregated model considerably facilitates the theoretical analysis of allele interaction in high-dimensional systems.

In fact, almost all gene regulatory models occurring in the literature are aggregated in the sense that they describe each gene by just one variable representing the amount or concentration of the gene’s output. This is so even if the gene is diploid, and even in cases where several of the genes probably have allelic variation in the coding region as well, and perhaps produce qualitatively different outputs. Gene transcription and translation are very complicated processes which are only very crudely modelled by the kind of equations studied in the present paper. Even if the two allele products act in the same way such that only their total concentration matters as regulatory agents, there may be different degradation rates operating at the mRNA stage, during the translation process or later. If $\gamma_n^\alpha = \gamma_n^\beta = \gamma_n$, then adding the last two equations in Eqs. (31) gives $\dot z_n = r_n^\alpha(z, a_n) + r_n^\beta(z, a_n) - \gamma_n z_n$. However, when $\gamma_n^\alpha \ne \gamma_n^\beta$, it is impossible to combine the two last equations in Eqs. (31) into one rate equation for $z_n$. The crucial problem in these cases is to perform the aggregation in such a way that the aggregated model reproduces the properties of the original extended model.
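The equal-degradation case can be checked numerically. The sketch below integrates the two allele equations and the single summed equation side by side for one locus and records the largest gap between the total outputs. The two dose-response functions, the value of `GAMMA` and the initial values are invented for illustration.

```python
# Hypothetical check: with a common degradation rate the two allele
# equations add up to a single rate equation for the total product.

def r_alpha(z):
    """Assumed dose-response function of allele alpha."""
    return 8.0 / (1.0 + z * z)

def r_beta(z):
    """Assumed dose-response function of allele beta."""
    return 3.0 / (1.0 + 0.5 * z)

GAMMA = 1.2  # common relative degradation rate (assumed value)

def max_gap(dt=0.01, steps=5000):
    """Largest |z1 + z2 - y| along forward-Euler trajectories of both models."""
    z1, z2 = 0.3, 0.1   # allele-specific outputs in the two-equation model
    y = z1 + z2         # total output in the one-equation model
    gap = 0.0
    for _ in range(steps):
        total = z1 + z2
        z1, z2 = (z1 + dt * (r_alpha(total) - GAMMA * z1),
                  z2 + dt * (r_beta(total) - GAMMA * z2))
        y = y + dt * (r_alpha(y) + r_beta(y) - GAMMA * y)
        gap = max(gap, abs(z1 + z2 - y))
    return gap

if __name__ == "__main__":
    print(max_gap())  # expected to be at the level of rounding error
```

With unequal degradation rates the same comparison no longer closes exactly, which is the situation the aggregation procedure is meant to handle.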

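A natural solution would be to assume that the total dose-response function of the gene is the sum of the dose-response functions for each of the two alleles, and that the relative degradation rate of the total gene product $z_n = z_n^1 + z_n^2$ is an average of the two allelic degradation rates $\gamma_n^1$ and $\gamma_n^2$. We will call such a model an aggregated model $S_A$ of $S_E$:

$$ S_A: \begin{cases} \dot y_i = r_i(y) - \gamma_i y_i, & i = 1, \ldots, n-1, \\ \dot y_n = r_n^\alpha(y) + r_n^\beta(y) - \gamma_n^{\alpha\beta} y_n, \end{cases} \tag{32} $$

where

$$ \gamma_n^{\alpha\beta} = \frac{\gamma_n^\alpha x_n^\alpha + \gamma_n^\beta x_n^\beta}{x_n^\alpha + x_n^\beta}. \tag{33} $$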

Let $z(t, z_0)$ and $y(t, y_0)$ be the solutions of $S_E$ and $S_A$, respectively, satisfying $z(0, z_0) = z_0$ and $y(0, y_0) = y_0$. It is easy to see that if $z^* = [z_1^*, \ldots, z_n^*]$ is a steady point of $S_E$, then $y^* = z^*$ is a steady point of $S_A$. It is less obvious that if $z^*$ is an asymptotically stable point of $S_E$, then $y^* = z^*$ is an asymptotically stable point of $S_A$, and that if $z^*$ is hyperbolic, then $y^*$ is hyperbolic. We show in Appendix D that this is in fact the case. We also show by extensive numerical simulations that in the majority of cases the temporal behaviours of $z(t, z_0)$ and $y(t, z_0)$ are approximately equal and qualitatively similar for a range of actual parameter values and realistic common initial values $y_0 = z_0 \ne z^*$. This being the case, we call $S_A$ a well-founded aggregation of $S_E$.

These results strongly suggest that the idea of aggregating a diploid model in this way makes sense. If $S_E$ has several biallelic nodes, we apply this aggregation procedure to each node. If each aggregation is well-founded, we finally arrive at a well-founded, fully aggregated system $S_{FA}$. Its diploid loci $X_i$ of genotype $\alpha_i\beta_i$ are described by

$$ \dot y_i = r_i^{\alpha_i}(y) + r_i^{\beta_i}(y) - \gamma_i^{\alpha_i\beta_i} y_i, \tag{34} $$

where $\gamma_i^{\alpha_i\beta_i}$ is given by Eq. (33) with $n$ replaced by $i$. Haploid nodes $X_j$ are described by

$$ \dot y_j = r_j(y) - \gamma_j y_j. \tag{35} $$

If all nodes are diploid and admit the above aggregation process, the dimensionality of the model has been reduced from 2n to n, leading to the fully aggregated model SFA. In the next section we use SFA to investigate the consequences of knockout behaviour and different allele combinations in genotype-phenotype maps (GP maps).

3.3. The allele interaction concept

The concept of allele interaction for a polymorphic locus $X$ for some specific phenotypic trait in a regulatory network was defined by Gjuvsland et al. [14]. Recall that $x_i^{1|\circ}$ and $x_i^{2|\circ}$ are the hemizygote genotypic values of a locus $X_i$ when only allele 1, respectively allele 2, is present, and $x_i^{11}$, $x_i^{12}$ and $x_i^{22}$ are the biallelic homozygote and heterozygote genotypic values. The heterozygote allele interaction value $\Delta_i^{12}$ of $X_i$ is defined as

$$ \Delta_i^{12} = x_i^{12} - (x_i^{1|\circ} + x_i^{2|\circ}). \tag{36} $$

We define the two homozygote allele interaction values $\Delta_i^{11}$ and $\Delta_i^{22}$ in the same way, in general

$$ \Delta_i^{\alpha\beta} = x_i^{\alpha\beta} - (x_i^{\alpha|\circ} + x_i^{\beta|\circ}), \tag{37} $$

where $\alpha, \beta \in \{1, 2\}$. An allele interaction is said to be negative if $\Delta_i^{\alpha\beta} < 0$ and positive if $\Delta_i^{\alpha\beta} > 0$.

Mendelian dominance is expressed by the dominance value

$$ d_i = x_i^{12} - \frac{x_i^{11} + x_i^{22}}{2}. \tag{38} $$

The name “dominance value” stems from the fact that if $d_i \ne 0$, then one of the alleles contributes more to (dominates) the equilibrium value, shifting the heterozygous value away from the midpoint between the two homozygous equilibrium values. Allele interaction is closely related to $d_i$ because $d_i = \Delta_i^{12} - (\Delta_i^{11} + \Delta_i^{22})/2$. Gjuvsland et al. [14] showed that if an isolated node $X$ is under negative autoregulation, then its three allele interaction values are negative, while if the autoregulation is positive, they are positive. Building upon the theoretical machinery developed above, we show in the following that these results can be generalised to higher dimensional gene regulatory networks with more complex feedback structures. In this way we are able to build new theory relating gene action concepts and regulatory network anatomy to quantitative genetics.
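The definitions in Eqs. (36)-(38) translate directly into a few lines of code. The sketch below uses invented equilibrium values, and the function names are chosen for this example only. It also checks the stated relation between the dominance value and the three allele interaction values, which holds because the hemizygote terms cancel.

```python
# Allele interaction (Eq. 37) and dominance value (Eq. 38) from equilibrium
# values. The numbers below are invented for illustration.

def allele_interaction(x_biallelic, x_hemi_a, x_hemi_b):
    """Biallelic genotypic value minus the sum of the two hemizygote values."""
    return x_biallelic - (x_hemi_a + x_hemi_b)

def dominance(x11, x12, x22):
    """Heterozygote value minus the midpoint of the two homozygote values."""
    return x12 - (x11 + x22) / 2.0

x1, x2 = 3.0, 2.4                 # hemizygote values for alleles 1 and 2
x11, x12, x22 = 4.0, 3.7, 3.1     # biallelic genotypic values

d11 = allele_interaction(x11, x1, x1)
d12 = allele_interaction(x12, x1, x2)
d22 = allele_interaction(x22, x2, x2)
d = dominance(x11, x12, x22)

if __name__ == "__main__":
    print(d11, d12, d22, d)
```

All three interaction values are negative for these numbers, the pattern reported for an isolated node under negative autoregulation.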

If the two alleles of $X_i$ were completely independent, one would expect $\Delta_i^{12} = 0$, as in this case the total output of the gene would be just the sum of the outputs from the two alleles. A nonzero value would therefore indicate some kind of one-way or mutual action between the alleles. In a single-locus model a nonzero allele interaction value could be a consequence of the feedback between the two alleles. Non-feedback mechanisms, such as transvection [29] in which one allele has an effect on the other (but not the other way round), could also lead to nonzero allele interaction [14].

To search for systemic causes of nonzero allele interaction values we examine the two-locus systems in Figure 1. Their rate equations are given in Appendix E. The node $X_1$ is diploid and splits in two subnodes $X_1^1$ and $X_1^2$ of type $\alpha_1 = 1$ and $\beta_1 = 2$, respectively. Taking the total equilibrium concentration of $X_1$ as the system’s phenotype, we want to investigate the allele interaction value $\Delta_1^{12}$ of the system in Figure 1a. We then have to compare the equilibrium value $y_1^{12}$ with $y_1^{1|\circ} + y_1^{2|\circ}$, the phenotype values when only one allele is present. Let the node 2 be a copy of $X_2$. Then $y_1^{1|\circ}$ and $y_1^{2|\circ}$ are the phenotype values of the two systems in Figure 1c.

Figure 1.


a The two-locus system analysed in Section 3.3 to investigate the source of nonzero allele interaction values. The nodes $X_1^1$ and $X_1^2$ are the two alleles of the locus $X_1$. Black arrows indicate direct effects. b The interaction diagram of the artificial μ-system. The node 2 is a genetically identical copy of $X_2$. The two red arrows indicate actions whose strengths depend on μ. c If μ = 0, the actions are zero. The μ-system splits in two independent subsystems, each of them equivalent to the original system in a with one allele knocked out. d If μ = 1, the strengths of the actions are the same as in the system in a. In this case the μ-system is equivalent to the original system in a without any allele knockout.

To ease the comparison between these two systems we introduce the artificial μ-system in Figure 1b. The two red arcs represent actions whose strengths (expressed by the magnitudes of the corresponding Jacobian elements) are proportional to a parameter μ which can be varied in [0, 1]. The equilibrium values of $X_1^1$ and $X_1^2$ in the μ-system are $y_1^{1|2}(\mu)$ and $y_1^{2|1}(\mu)$, and the corresponding allele interaction value is $\Delta_1^{12}(\mu)$.

If μ = 0, the μ-system simplifies to the two independent subsystems in Figure 1c. The one to the left (right) is the allele knockout system with only $X_1^1$ ($X_1^2$) left because 2 and $X_2$ are presumed identical. The phenotypic value of the whole system is $y_1^{12}(0) = x_1^{1|\circ} + x_1^{2|\circ}$. Therefore $\Delta_1^{12}(0) = y_1^{12}(0) - y_1^{1|\circ}(0) - y_1^{2|\circ}(0) = x_1^{1|\circ} + x_1^{2|\circ} - x_1^{1|\circ} - x_1^{2|\circ} = 0$.

If μ = 1, the μ-system is represented by Figure 1d. From its rate equations it follows that the equilibrium conditions of this system are the same as for Figure 1a, leading to equal allele interaction values. Therefore the μ-system interpolates continuously between the diploid system in Figure 1a and the allele knockout system in Figure 1c.

Differentiating the equilibrium conditions of the μ-system, we find that $d\Delta_1^{12}/d\mu\,|_{\mu=0}$ is in general nonzero. Details are given in Appendix E. Because $\Delta_1^{12}(0) = 0$, this implies that even for infinitesimally small μ the μ-system has a nonzero allele interaction value. One might think that this nonzero value is caused by the feedback loop $X_1^1 \to X_2 \to X_1^2 \to X_2 \to X_1^1$. However, there is still a nonzero allele interaction value if the arc from $X_1^2$ to $X_2$ is removed. In this case there is no mutual interaction between the alleles, only an indirect action from $X_1^1$ to $X_1^2$, but no chain from $X_1^2$ to $X_1^1$. We conclude that a nonzero allele interaction value could be caused by feedback among the two alleles, but that a one-way action is sufficient.

3.4. Allele interaction, feedback functions and feedback loops

The allele interaction values can be computed from directly observable quantities. In this section we show how they can be related to properties of the network. Using finite differences and the mean value theorem, the derivatives of the dose-response functions can be estimated in terms of the single allele effects

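$$ \delta_l^{\alpha_k\beta_k \setminus \alpha_k} = x_l^{\alpha_k\beta_k} - x_l^{\beta_k|\circ}, \qquad \delta_l^{\alpha_k\beta_k \setminus \beta_k} = x_l^{\alpha_k\beta_k} - x_l^{\alpha_k|\circ}, \tag{39} $$

which quantify the effect on any locus $X_l$ of activating the second allele in the initially hemizygous locus $X_k$. Then

$$ p'_{lk}(c_l^{\alpha_k\beta_k}) = q_{lk}(c_l^{\alpha_k\beta_k}) = \frac{\delta_l^{\alpha_k\beta_k \setminus \beta_k}}{\delta_k^{\alpha_k\beta_k \setminus \beta_k}}, \tag{40} $$

where $c_l^{\alpha_k\beta_k} \in (x_k^{\alpha_k|\circ}, x_k^{\alpha_k\beta_k})$. We use the subscript $l$ in $c_l^{\alpha_k\beta_k}$ because its value clearly depends on $l$.

Because the function $p_{lk}$ is independent of the allelic composition of $X_k$, we can get four independent estimates of $q_{lk}$ by combining $X_k^{11}$ with $X_k^{1|\circ}$, $X_k^{22}$ with $X_k^{2|\circ}$, and $X_k^{12}$ with $X_k^{1|\circ}$ and with $X_k^{2|\circ}$. Note however that they will refer to different and unknown arguments, so that all together they will provide an estimate of the average value of $q_{lk}$ in the interval between the minimum and maximum of the five genotypic values $x_k^{1|\circ}$, $x_k^{2|\circ}$, $x_k^{11}$, $x_k^{12}$ and $x_k^{22}$.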

If a model for a given network exists, we can use Eq. (40) to estimate how the single allele effect propagates through the network as a consequence of polymorphism in Xk and the network connectivities, and use this to test the model. Conversely, measurements of the single allele effects from a polymorphic locus give information about the network connections [30].
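The ratio in Eq. (40) can be sketched as follows. The steady-state relation `p` between the focal locus and a downstream locus is a made-up stand-in for a propagation function, and the equilibrium values are invented. The point is only that the ratio of single allele effects is a finite-difference slope, which by the mean value theorem equals the derivative at some intermediate argument.

```python
# Finite-difference estimate of a propagation-function derivative from single
# allele effects, in the spirit of Eqs. (39)-(40). All values are hypothetical.

def single_allele_effect(x_biallelic, x_hemizygous):
    """Change in an equilibrium value when the second allele is activated."""
    return x_biallelic - x_hemizygous

def propagation_slope(xl_bi, xl_hemi, xk_bi, xk_hemi):
    """Ratio of the effect on the downstream locus to the effect on the focal locus."""
    dk = single_allele_effect(xk_bi, xk_hemi)
    if dk == 0:
        raise ValueError("focal locus shows no single allele effect")
    return single_allele_effect(xl_bi, xl_hemi) / dk

def p(xk):
    """Assumed steady-state dependence of the downstream locus on the focal locus."""
    return 5.0 / (1.0 + xk)

def p_prime(xk):
    """Exact derivative of the assumed relation, used only for comparison."""
    return -5.0 / (1.0 + xk) ** 2

xk_hemi, xk_bi = 1.0, 3.0   # focal locus: hemizygous and biallelic equilibria
slope = propagation_slope(p(xk_bi), p(xk_hemi), xk_bi, xk_hemi)

if __name__ == "__main__":
    print(slope, p_prime(xk_hemi), p_prime(xk_bi))
```

In an experiment only the four equilibrium values would be available; the comparison with `p_prime` is possible here because the relation was chosen by hand.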

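Again dropping the subscript $k$ from $\alpha_k$ and $\beta_k$, we denote in the following the feedback functions of $X_k$ by $\phi_k^\alpha$, $\phi_k^\beta$ and $\phi_k^{\alpha\beta}$, and similarly for $F_k$, etc. According to Eq. (17) the stationarity conditions for the allele combinations α∣∘, β∣∘ and αβ are

$$ \gamma_k^\alpha x_k^{\alpha|\circ} = \phi_k^\alpha(x_k^{\alpha|\circ}, a), \qquad \gamma_k^\beta x_k^{\beta|\circ} = \phi_k^\beta(x_k^{\beta|\circ}, a), \qquad \gamma_k^{\alpha\beta} x_k^{\alpha\beta} = \phi_k^\alpha(x_k^{\alpha\beta}, a) + \phi_k^\beta(x_k^{\alpha\beta}, a) = \phi_k^{\alpha\beta}(x_k^{\alpha\beta}, a), \tag{41} $$

where $\gamma_k^{\alpha\beta}$ is computed in accordance with Eq. (33). The last equation follows because, as is evident from Proposition 1, to derive $p_{Lk}$ we do not use the stationarity condition for $X_k$, and all the other stationarity conditions are invariant under polymorphism of $X_k$ and do not have superscripts $\alpha$ and $\beta$. The allele interaction value $\Delta_k^{\alpha\beta} = x_k^{\alpha\beta} - x_k^{\alpha|\circ} - x_k^{\beta|\circ}$ is then given in terms of the solutions of the three Eqs. (41). The following proposition relates $\Delta_k^{\alpha\beta}$ to the derivatives $\psi_k^\alpha$ and $\psi_k^\beta$ of the feedback functions $\phi_k^\alpha$ and $\phi_k^\beta$.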

Proposition 4

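For any biallelic locus $X_k$, $k \in N$, there exist numbers $c_k^{\alpha\beta} \in (x_k^{\alpha|\circ}, x_k^{\alpha\beta})$ and $c_k^{\beta\alpha} \in (x_k^{\beta|\circ}, x_k^{\alpha\beta})$ such that

$$ \Delta_k^{\alpha\beta} = \frac{\psi_k^\alpha(c_k^{\alpha\beta}, a)\, x_k^{\beta|\circ} + \psi_k^\beta(c_k^{\beta\alpha}, a)\, x_k^{\alpha|\circ}}{\gamma_k^{\alpha\beta} - \psi_k^\alpha(c_k^{\alpha\beta}, a) - \psi_k^\beta(c_k^{\beta\alpha}, a)}. \tag{42} $$
Proof

From Eqs. (17) and (41) it follows that

$$ \Delta_k^{\alpha\beta} = \frac{1}{\gamma_k^{\alpha\beta}} \left( \phi_k^\alpha(x_k^{\alpha\beta}, a) + \phi_k^\beta(x_k^{\alpha\beta}, a) \right) - x_k^{\alpha|\circ} - x_k^{\beta|\circ}. $$

Inserting $x_k^{\alpha\beta} = x_k^{\alpha|\circ} + (\Delta_k^{\alpha\beta} + x_k^{\beta|\circ})$ into $\phi_k^\alpha(x_k^{\alpha\beta}, a)$ and $x_k^{\alpha\beta} = x_k^{\beta|\circ} + (\Delta_k^{\alpha\beta} + x_k^{\alpha|\circ})$ into $\phi_k^\beta(x_k^{\alpha\beta}, a)$ and using the mean value theorem on both functions lead after some elementary algebra and repeated use of Eqs. (33) and (17) to Eq. (42).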
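The elementary algebra can be spelled out as follows. This is a sketch under the assumption that Eq. (33) is read as $\gamma_k^{\alpha\beta}(x_k^{\alpha|\circ} + x_k^{\beta|\circ}) = \gamma_k^{\alpha} x_k^{\alpha|\circ} + \gamma_k^{\beta} x_k^{\beta|\circ}$, where $x_k^{\alpha|\circ}$ and $x_k^{\beta|\circ}$ denote the hemizygote values. The parameter argument $a$ is suppressed, $\Delta$ abbreviates $\Delta_k^{\alpha\beta}$, and $\psi^\alpha$, $\psi^\beta$ abbreviate $\psi_k^\alpha(c_k^{\alpha\beta})$, $\psi_k^\beta(c_k^{\beta\alpha})$.

```latex
\begin{align*}
\phi_k^{\alpha}(x_k^{\alpha\beta})
  &= \phi_k^{\alpha}(x_k^{\alpha|\circ}) + \psi^{\alpha}\,(\Delta + x_k^{\beta|\circ})
   = \gamma_k^{\alpha} x_k^{\alpha|\circ} + \psi^{\alpha}\,(\Delta + x_k^{\beta|\circ}), \\
\phi_k^{\beta}(x_k^{\alpha\beta})
  &= \gamma_k^{\beta} x_k^{\beta|\circ} + \psi^{\beta}\,(\Delta + x_k^{\alpha|\circ}), \\
\gamma_k^{\alpha\beta}\,(\Delta + x_k^{\alpha|\circ} + x_k^{\beta|\circ})
  &= \gamma_k^{\alpha} x_k^{\alpha|\circ} + \gamma_k^{\beta} x_k^{\beta|\circ}
     + (\psi^{\alpha} + \psi^{\beta})\,\Delta
     + \psi^{\alpha} x_k^{\beta|\circ} + \psi^{\beta} x_k^{\alpha|\circ}, \\
(\gamma_k^{\alpha\beta} - \psi^{\alpha} - \psi^{\beta})\,\Delta
  &= \psi^{\alpha} x_k^{\beta|\circ} + \psi^{\beta} x_k^{\alpha|\circ}.
\end{align*}
```

The first two lines use the mean value theorem together with the hemizygote stationarity conditions, the third is the biallelic stationarity condition, and the last follows after cancelling the terms covered by the assumed reading of Eq. (33). Dividing by the bracket on the left, when it is nonzero, gives the expression in Eq. (42).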

By combining Eq. (42) with Eq. (23) and using the following lemma, we are able to relate the sign of Δkαβ to properties of the feedback loops of the system.

Lemma 1

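Let $E$ be an open subset of $\mathbb{R}_+$, let $\phi_1 : E \to \mathbb{R}_+$ and $\phi_2 : E \to \mathbb{R}_+$ be two positive, strictly monotonic and differentiable functions. Define $\phi_{12}(x) = \phi_1(x) + \phi_2(x)$, and assume that $\gamma_i x = \phi_i(x)$, $i = 1, 2$, have unique solutions $x_1$, $x_2$ in $E$, where $\gamma_i > 0$ and $x_1 \le x_2$ by convention. Define

$$ \gamma = \frac{\gamma_1 x_1 + \gamma_2 x_2}{x_1 + x_2}, \tag{43} $$

and assume the solution $x_{12}$ of $\gamma x = \phi_{12}(x)$ is also in $E$. Define $\Delta_{12} = x_{12} - x_1 - x_2$ and $\delta_1 = x_{12} - x_2$, $\delta_2 = x_{12} - x_1$.

  1. If $\phi_1'(x) < 0$ and $\phi_2'(x) < 0$ for all $x \in E$, then $\Delta_{12} < 0$ and $\delta_2 > 0$.

  2. If $0 < \phi_1'(x) < \gamma_1$, $0 < \phi_2'(x) < \gamma_2$ for all $x \in E$, then $\Delta_{12} > 0$ and $\delta_1 > 0$, $\delta_2 > 0$.

  3. If $\phi_1'(x) > \gamma_1$, $\phi_2'(x) > \gamma_2$ for all $x \in E$, then $\Delta_{12} < 0$ and $\delta_1 < 0$.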

The proof is in Appendix F.
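The first two cases of Lemma 1 can be checked numerically on simple examples. The sketch below solves $\gamma_i x = \phi_i(x)$ by bisection, forms $\gamma$ as in Eq. (43) and returns $\Delta_{12}$, $\delta_1$ and $\delta_2$. The two pairs of functions (decreasing Hill-type functions, and increasing linear functions with slopes below the degradation rates) and all parameter values are invented. Here $\phi_i'$ is read as the derivative of $\phi_i$, and the labelling is chosen so that the first solution is the smaller one.

```python
# Numerical spot check of cases 1 and 2 of Lemma 1 on hypothetical functions.

def solve(gamma, phi, lo=1e-9, hi=1e3, iters=200):
    """Bisection for the root of gamma*x - phi(x) on [lo, hi]."""
    def g(x):
        return gamma * x - phi(x)
    if not (g(lo) < 0.0 < g(hi)):
        raise ValueError("root is not bracketed")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lemma_quantities(phi1, gamma1, phi2, gamma2):
    """Return (Delta12, delta1, delta2) as defined in Lemma 1."""
    x1 = solve(gamma1, phi1)
    x2 = solve(gamma2, phi2)
    gamma = (gamma1 * x1 + gamma2 * x2) / (x1 + x2)      # Eq. (43)
    x12 = solve(gamma, lambda x: phi1(x) + phi2(x))
    return x12 - x1 - x2, x12 - x2, x12 - x1

# Case 1: both functions strictly decreasing.
case1 = lemma_quantities(lambda x: 6.0 / (1.0 + x * x / 9.0), 1.5,
                         lambda x: 10.0 / (1.0 + x * x / 4.0), 1.0)

# Case 2: both functions increasing with slope below their degradation rate.
case2 = lemma_quantities(lambda x: 1.0 + 0.3 * x, 1.0,
                         lambda x: 3.0 + 0.5 * x, 2.0)

if __name__ == "__main__":
    print("case 1:", case1)
    print("case 2:", case2)
```

A spot check of this kind cannot replace the proof in Appendix F, but it is a quick way to catch a sign error when transcribing the conditions.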

Recall the definition $\Omega_i(x) = (\phi_i'(x) - \gamma_i)/\phi_i'(x)$ in Eq. (24). Assume $\Omega_1$ and $\Omega_2$ have the same sign. It follows from Lemma 1 that if $\Omega_i < 0$ for $i = 1, 2$, then $\Delta_{12} > 0$, and if $\Omega_i > 0$ for $i = 1, 2$, then $\Delta_{12} < 0$. Combining this with Eqs. (23)-(26) we readily arrive at

Proposition 5

If Fkα and Fkβ, αβ, have the same sign Sk, then (1)nSkΔkαβ<0. If P is the loop product of a sign-dominant, proper loop for Xk, then PΔkαβ>0. If the loop is sign-dominant but composite and has signature factor (−1)ε, then (1)n+ε1PΔkαβ>0.

The allele interaction values $\Delta_k^{\alpha\beta}$ are directly observable by subjecting each $X_k$ to allele knockout and recording the unperturbed and perturbed equilibrium values of $x_k$. If this is done for all $X_k$, a set of exact sign conditions on the loop structure of the system is obtained. This may be particularly useful for homozygous systems, because then $F_k^\alpha = F_k^\beta$ and there will be no problem with the sign $S_k$.

4. Discussion and conclusions

Combining network theory and linear algebra results with mathematical models of gene regulatory networks, we have introduced relevant concepts and provided analytical insights on how genetic variation is propagated in gene networks. We hope that our results may contribute to a future theory on the pleiotropy and epistasis features of genetic variation in haploid and diploid gene networks as a function of regulatory architecture and functional location of genetic variation.

We have also shown that the modelling framework for diploid gene networks developed by Omholt et al. [13] in which a diploid node is described by two rate equations, can be transformed—in our language: aggregated—into a standard type model in which each locus, haploid or diploid, is described by just one rate equation.

The time-dependent solutions of the aggregated models are qualitatively equivalent to those of the corresponding models in the framework of Omholt et al., and the equilibrium solutions of the former are stable when those of the latter are. Qualitative equivalence is here to be taken in an informal sense, meaning that the graphs of the solution curves look similar, and that the curves are relatively close to each other in a sense given in Appendix D.

The variables of the aggregated model are the observable total gene products of the loci. The model depends explicitly on the genotypes of the two alleles of the diploid loci. This property facilitates investigations on how the genotypic value of a diploid locus (i.e. its phenotype) depends on its genotype. It further reduces the size of the model from up to $2n$ down to $n$. This reduction also makes it much easier to read out the connections and the feedback loops between the loci in “everyday” language in which we talk about a gene as one entity despite the fact that it is composed of two more or less independent alleles. To the best of our knowledge this provides for the first time a rationale for modeling diploid gene regulatory networks with one node for each locus even though the locus may be polymorphic and show intra-locus interaction effects.

Finally, we have shown that for a wide range of network architectures the sign of the allele interaction is independent of the shape of the rate functions and parameter values, and does not change with mutations in the other nodes or under external noise. More specifically, Proposition 5 confirms and generalizes the result in [14] for an isolated gene. It shows the close connection between the sign of the allele interaction for a polymorphic locus $X_k$ and the feedback loops it is involved in. Its main importance is that recording the equilibrium values $x_k$ for a hemizygous and either a homo- or heterozygous locus $X_k$ gives information about the network interactions and feedback loops involving $X_k$. These genotypes are within experimental reach for several organisms, and the machinery developed above can be tested in several settings. Hemizygous collections are already available for yeast [31]. Of course there may be networks for which the actual genotypes lead to more complex sign relations so that the above results would not be valid. Irrespective of whether the sign relations are valid or not, if the three allele interaction values $\Delta_k^{\alpha\alpha}$, $\Delta_k^{\alpha\beta}$ and $\Delta_k^{\beta\beta}$ have equal signs $s_k$, a tentative hypothesis is that $X_k$ has one or more sign-dominant loops with sign $s_k$.

Gjuvsland et al. [14] showed that in systems with one or two loci, a biallelic locus can display up to 18 qualitatively different allele interaction sign patterns (triplets of +, − and 0 representing the signs of $\Delta^{11}$, $\Delta^{12}$ and $\Delta^{22}$). In a single locus system with autoregulation only a subset of 7 of these could be realised with monotonic dose-response functions. With non-monotonic dose-response functions, however, 16 sign patterns could be generated. They also showed analytically that for each allele combination, the sign of the allele interaction value and the sign of the autoregulatory loop were equal (their Supporting Information, Result 1). For the autoregulatory system of an isolated node $X_1$, the sign of $F_1$ is just the sign of the autoregulatory loop, which equals the sign of the derivative of the dose-response function. Therefore, a non-monotonic dose-response function implies that $F_1$, and hence the allele interaction value, may take both signs, depending on parameter values.

Consider then a multi-locus system with monotonic dose-response functions (Figure 2). The two full loops $X_1 \to X_2 \to X_3 \to X_1$ and $(X_1 \to X_3 \to X_1)(X_2 \to X_2)$ are incoherent (their contributions to $F_1$ have opposite signs) because $F_1 = J_{13}J_{32}J_{21} - J_{13}J_{31}J_{22}$ and $J_{13}J_{32}J_{21} > 0$, $J_{13}J_{31}J_{22} > 0$. Depending on parameter values, either of the two may determine the sign of $F_1$, giving opposite signs to $\Delta_1^{\alpha\beta}$. In this multi-node network, a varying sign of $\Delta_1^{\alpha\beta}$ can be obtained with monotonic dose-response functions, while this could only be obtained with non-monotonic dose-response functions in the single node autoregulatory system. Based on this we conjecture that with monotonic dose-response functions a much wider range of allele interaction sign motifs can be obtained in multi-gene systems than for autoregulated genes.

Figure 2.


A system with an incoherent feedforward motif (from $X_1$ to $X_3$) and three feedback loops. An arrow denotes positive action, a crossbar negative action. The two full loops $X_1 \to X_2 \to X_3 \to X_1$ and $(X_1 \to X_3 \to X_1)(X_2 \to X_2)$ are incoherent: their contributions to each $F_i$ have opposite signs.

Our results provide a theoretical basis for two kinds of experimental tests of network models: (i) checking the sign of the allele interaction for any node by allele knockout in the same node; and (ii) checking the effect of allele knockout in one node on the equilibrium values of other nodes. In both cases the checking can be made independently on either of the homozygotes and on the heterozygote. This gives three possible combinations for each polymorphic locus. If the allelic composition of each of these loci can be selected or imposed experimentally and independently for each locus, the number of different tests can in principle be very large. The formalism developed above may be combined with systematic measurement of the effects of allele knockouts and their effects on the other nodes in the network to deduce the connectivity of networks for which no model so far exists. This approach would be very similar to the approach suggested by Kholodenko et al. [30].

We have deliberately refrained from dealing with networks with multiple stable states. Surely, multistationarity is a generic characteristic of nonlinear dynamic systems, but is not a relevant issue in a large number of biological systems. Nor have we allowed genetic variation affecting the coding part of a gene. For such genes aggregation is generally not possible, as the two allele products may have different effects on other genes. It would not make sense to sum the two product concentrations, and the two alleles would simply have to be modelled by separate rate equations. The model framework we have used for the theory development is of course very simple both in terms of the relationship between the gene product expression level and the production rate from downstream loci and the neglect of more complex regulatory anatomies involving for example noncoding RNA (see e.g. [32, 33, 34, 35]). Including more biological realism along these lines would make it more complicated to develop the theory, but might at the same time disclose deeper insight into the propagation of genetic variation in real networks. Our formalism can easily also account for other network agents than gene loci, and can be used to study e.g. regulatory structures involving gene networks, metabolic networks and protein signalling networks. We anticipate that such an endeavor will yield new insight into the manifestation of genetic variation in nonlinear biological systems.

Acknowledgments

This work has been supported in part by The Research Council of Norway, project number 178901/V30, Bridging the gap: disclosure, understanding and exploitation of the genotype-phenotype map, and by the Virtual Physiological Rat Project funded through NIH grant P50-GM094503.

Appendix A

Notation

In this appendix we explain our notation for subsets of vectors and matrices, and for equilibrium values for different genotypes (allelic compositions) of diploid genes.

Let $U$ and $V$ be subsets of $N = \{1, 2, \ldots, n\}$. We use the notation $z_U = \{z_k\}_{k \in U}$, $X_U = \{X_k\}_{k \in U}$, etc. In matrix equations $z_U$ denotes the corresponding column vector. The $n \times n$ Jacobian matrix of Eq. (1) in the stable point $x$ is denoted $J$, and $J_{UV}$ is the matrix obtained from $J$ by selecting the rows $U$ and the columns $V$ (without interchanging rows or columns). We use the notation $X^{(U)}$ to denote the set of nodes not in $X_U$, etc., and denote the corresponding set of variables by $z^{(U)}$. Similarly, $x^{(k)}$ is the set of all $x_i$ except $x_k$, or the vector obtained by removing $x_k$ from $x$. Let $i \in U$ and $j \in V$. The matrix $J_{UV}^{(ij)}$ is obtained from $J$ by selecting the rows $U$ and the columns $V$ in $J$ and deleting row $i$ and column $j$ in $J$. The superscript $(i\,\circ)$ indicates that only row $i$ and no column is deleted, and $(\circ\,j)$ that only column $j$ and no row is deleted. We also define $D = \det(J)$, $D_{UV}^{(ij)} = \det(J_{UV}^{(ij)})$ if $|U| = |V|$, and $D^{(ij)} = \det(J^{(ij)})$. It goes without saying that if there is no superscript, no row or column is deleted, and if there is no subscript, all rows and columns are included. Similarly, if $L \subseteq N$, $p_{Lk}(x_k, a^{(k)}) = \{p_{lk}(x_k, a^{(k)}) \mid l \in L\}$.

The genotype of a diploid gene $X_i$ is denoted $g_i = \alpha_i\beta_i$, where $\alpha_i$ and $\beta_i$ take the values 1 or 2, indicating two different alleles. All equilibrium values depend on the total genotype $g = [g_1 \ldots g_n]$ of the system, but we do not complicate formulas by stating this explicitly. Instead, we let $x_i^{\alpha_i\beta_i}$ denote the equilibrium value of $X_i$ when its genotype is $\alpha_i\beta_i$. Thus, $x_i^{11}$, $x_i^{12}$ and $x_i^{22}$ are the stable equilibrium values of $X_i$ when both alleles are present and both are of type 1, of types 1 and 2, and both of type 2, respectively.

The stable equilibrium value for $X_i$ when one of the alleles has been knocked out is $x_i^{1|\circ}$ and $x_i^{2|\circ}$, where $\circ$ indicates a nil value, i.e. that the allele is absent. Finally, $x_i^{1|1}$ and $x_i^{1|2}$ represent the equilibrium value of the output from a subnode of $X_i$ with allele of type 1 when the other allele is of type 1 or 2, respectively. For example $x_i^{11} = x_i^{1|1} + x_i^{1|1} = 2x_i^{1|1}$, $x_i^{12} = x_i^{1|2} + x_i^{2|1}$, and $x_i^{22} = x_i^{2|2} + x_i^{2|2} = 2x_i^{2|2}$. Note however that while e.g. $z_i^1$ is the (time dependent) output of $X_i^1$ whatever its actual genotype, $x_i^{1|\circ}$ is the equilibrium concentration of the gene product of $X_i$ when the copy of the gene on one chromosome is knocked out and the one present is allele $\alpha_i = 1$.

Appendix B

Circuits and loops

In this appendix we recall some useful facts related to the circuit structure of a real n × n matrix A.

Lemma 2 ([24])

Let $k \in N$ be given, let $U$ be any subset of $N$ with $k$ elements, and let $\pi(U)$ be the set of permutations of $U$, including the identity permutation. Let $V \in \pi(U)$ and define the circuit product

$$ P(U, V) = A_{U_1V_1} A_{U_2V_2} \cdots A_{U_kV_k} \tag{B.1} $$

and

$$ S_U = \sum_{V \in \pi(U)} (-1)^{\varepsilon(U, V)} P(U, V), \tag{B.2} $$

where $\varepsilon(U, V)$ is the number of subcircuit products in the circuit product $P(U, V)$ with an even number of factors, and

$$ s_k = \sum_U S_U, \tag{B.3} $$

where the sum runs over all $U$ for which $|U| = k$. Then $S_U = D_{UU}$, and the characteristic polynomial of $A$ is

$$ p_n(\lambda) = \lambda^n - s_1\lambda^{n-1} + s_2\lambda^{n-2} - \cdots + (-1)^n s_n. \tag{B.4} $$

In particular, the trace $T = \mathrm{tr}(A) = s_1$ and the determinant $D = \det(A) = s_n$. Of course, $s_n = S_N$. We call $(-1)^{\varepsilon(U, V)} P(U, V)$ the signed circuit product of the circuit corresponding to the circuit product $P(U, V)$. To express signs we use the sign function defined by sign(x) = −1 if x < 0, sign(0) = 0, sign(x) = +1 if x > 0.

A square matrix for which all eigenvalues have a negative real part, will be called a stable matrix. The following result should be well-known.

Lemma 3

[36, Vol. 2, p. 220] If the real $n \times n$ matrix $A$ is stable, then

$$ (-1)^j s_j > 0, \quad \text{all } j \in N. \tag{B.5} $$
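Lemmas 2 and 3 can be illustrated on a small matrix. The sketch below computes each $s_k$ as the sum of the $k \times k$ principal minors, which by Lemma 2 equals the sum of the $S_U$ over subsets with $k$ elements, and then checks the sign condition (B.5). The 3 × 3 matrix is invented; its characteristic polynomial $\lambda^3 + 6\lambda^2 + 11\lambda + 6.5$ satisfies the Routh-Hurwitz conditions, so it is stable. The function names are chosen for this example only.

```python
# Coefficients s_k of the characteristic polynomial as sums of principal
# minors, and the sign condition (B.5), for a hypothetical stable matrix.
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def s_coefficients(a):
    """Return [s_1, ..., s_n], where s_k sums all k-by-k principal minors of a."""
    n = len(a)
    coeffs = []
    for k in range(1, n + 1):
        coeffs.append(sum(det([[a[i][j] for j in u] for i in u])
                          for u in combinations(range(n), k)))
    return coeffs

A = [[-1.0, 0.5, 0.0],
     [0.0, -2.0, 1.0],
     [-1.0, 0.0, -3.0]]

s = s_coefficients(A)
signs_ok = all((-1) ** j * sj > 0 for j, sj in enumerate(s, start=1))

if __name__ == "__main__":
    print(s, signs_ok)
```

For this matrix $s_1$ is the trace and $s_3$ the determinant, as stated after Eq. (B.4).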

Appendix C

Proof of Proposition 2

Proof

When $F_{MQ} = 0$, the four determinants $D^{(1l)}$, $D^{(lm)}$, $D^{(1m)}$ and $D^{(ll)}$ are all block triangular, and can be expressed as

$$ D^{(1l)} = D_{LQ}^{(1l)} D_{MM}, \quad D^{(lm)} = D_{QQ} D_{MR}^{(\circ m)}, \quad D^{(1m)} = D_{LQ}^{(1l)} D_{MR}^{(\circ m)}, \quad D^{(ll)} = D_{QQ} D_{MM}. \tag{C.1} $$

The notation for subscripts and superscripts was defined in Appendix A. From Eqs. (C.1) it follows trivially that

$$ D^{(1l)} D^{(lm)} = D^{(1m)} D^{(ll)}, \tag{C.2} $$

which is equivalent to the chain rule due to Eq. (6).

Appendix D

Justification of aggregated models

In this appendix we justify the claim in Section 3.2 that the aggregated model SA is a well-founded aggregation of SE.

The simple model for transcription regulation developed by Bintu et al. [37, 38] and Buchler et al. [39] is based upon setting the transcription rate proportional to the binding probability of transcription factors and polymerase to the gene’s binding site. They used traditional Boltzmann statistics to derive formulas for the binding probabilities. Extending this analysis to biallelic genes remains to be done. Unfortunately, a physio-chemical analysis of transcription soon gets very complicated [40, 41], but the following simple argument lends some justification to the assumption that the production rate of the biallelic gene is just the sum of the two monoallelic production rates. Assume the number of transcription factor molecules is much larger than the number of binding sites of the gene, and that the effect of non-specific binding sites for the transcription factors can be disregarded. Then the number of transcription factor molecules available for binding to one chromosome is not appreciably reduced if a small fraction of them are bound to the other chromosome. If the probability of binding to the one chromosome is independent of what happens at the other chromosome, the total probability that transcription factors will bind to the gene and initiate transcription is just the sum of the two single-allele probabilities, and the total transcription rate is the sum of the two single-allele transcription rates.

The following proposition shows that SA possesses the same asymptotic stability properties as SE.

Proposition 6

Let $z^* = [z_1^*, \ldots, z_n^*]$ be an asymptotically stable point for $S_E$. If the Jacobian $J$ of $S_A$ is diagonalisable in $z^*$, then $z^*$ is an asymptotically stable point for $S_A$.

Proof

If $\gamma_n^1 = \gamma_n^2 = \gamma_n$, the equations for $z_n^1$ and $z_n^2$ of $S_E$ can be added, leading to the equations for $S_A$.

We then assume $\gamma_n^1 \ne \gamma_n^2$. Let $z(t, z_0)$, where $z_n = z_n^1 + z_n^2$, be a solution of $S_E$ as given by Eqs. (31) such that $\lim_{t \to \infty} z(t, y_0) = z^*$, and define $u(t) = z(t, y_0) - z^*$. The definition of $S_A$ ensures that $z^*$ is a stationary point for both systems.

Because $z^*$ is an asymptotically stable state for $S_E$, for any $\varepsilon > 0$ there exists a $T > 0$ such that $\|u(t)\| < \varepsilon$ for $t > T$. By choosing $y_0$ sufficiently close to $z^*$ we can ensure that $\|u\| < \varepsilon$ for all positive $t$.

We proceed by investigating the rate equations for the $n$-component vector $u(t)$.

$$ \begin{aligned} \dot u_i &= r_i(z^* + u) - \gamma_i z_i = r_i(z^* + u) - r_i(z^*) - \gamma_i u_i, \\ \dot u_n &= r_n(z^* + u) - \gamma_n^1 z_n^1 - \gamma_n^2 z_n^2 = r_n(z^* + u) - r_n(z^*) - \gamma_n^1 u_n^1 - \gamma_n^2 u_n^2, \end{aligned} \tag{D.1} $$

where $i = 1, \ldots, n-1$ and $r_n = r_n^1 + r_n^2$. After a little algebra the equation for $u_n$ can be written

$$ \dot u_n = r_n(z^* + u) - r_n(z^*) - \gamma_n u_n + e_n(u), \tag{D.2} $$

where γn is defined in Eq. (33), and

en(u)=1zn(γn1zn2γn2zn1)(un2un1). (D.3)

The mean value theorem for a mapping $r : \mathbb{R}^n \to \mathbb{R}^n$ is [28]

Theorem 1

Suppose $r : W \to \mathbb{R}^n$ is differentiable on the open set $W \subseteq \mathbb{R}^n$, and that the line segment joining $z^*$ and $z$ lies in $W$. Then there exist numbers $\alpha_i$, $0 < \alpha_i \le 1$, and vectors $w_i = (1 - \alpha_i) z^* + \alpha_i z$, $i = 1, \ldots, n$, such that

$$ r_i(z) - r_i(z^*) = D r_i(w_i)\,(z - z^*), \quad i = 1, \ldots, n, \tag{D.4} $$

where $D = [\partial/\partial z_1, \ldots, \partial/\partial z_n]$, and $r_i$, $z$ and $z^*$ are column vectors.

Note that $w_i$ lies on the line segment between $z^*$ and $z$.

Let $J(z)$ be the Jacobian of $S_A$, defined by $J_{ij}(z) = \partial f_i(z) / \partial z_j$. Applying the mean value theorem to Eq. (D.2) we get

$$ \dot u = H(u, z^*, a)\,u + e(u), \tag{D.5} $$

where H (u, z*, a) is obtained as follows: Let νi = (1 − αi)z* + αi(z* + u) = z* + αiu, where 0 < αi < 1, and define a = [α1, …, αn]. Then H(u, z*, a) is the matrix obtained by evaluating the elements of J(z) in row number i in the point νi, i = 1, …,n. Obviously, H (u, z*, a) → J(z*) = J* when t → ∞ and u(t) → 0.

We write H (u, z*, a) = J* + E(u, a) = PD*P−1 + E(u, a), where D* is the diagonal eigenvalue matrix and P the eigenvector matrix for J*. Then E(u, a) → 0 when t → ∞. Considering Eq. (D.5) as an inhomogeneous ODE for u(t) and introducing ν(t) = P−1 u(t), its solution is

ν(t)=eDtν0+0teD(tτ)(E((τ),a)+e((τ))). (D.6)

With $w(\tau) = E(u(\tau), a) + e(u(\tau))$ we write this more simply as

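$$\nu(t) = e^{D^* t}\nu_0 + \int_0^t e^{D^*(t-\tau)}\, w(\tau)\,d\tau = e^{D^* t}\nu_0 + \int_0^t e^{D^*(t-\tau)}\,d\tau\;\bar w(t) = e^{D^* t}\nu_0 + (-D^*)^{-1}\bigl(I - e^{D^* t}\bigr)\,\bar w(t), \tag{D.7}$$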

where $\bar w(t)$, which is the vector of mean values of the components of w(t), is bounded by the minimum and maximum of w(t) in [0, t] because the remaining integrand is positive for each component of ν(t).
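The last equality in Eq. (D.7) can be checked one diagonal entry at a time. The following worked step is a sketch for a single real, nonzero diagonal entry d of D*; the symbol d is introduced here for illustration only, and complex entries are not covered.

```latex
\int_0^t e^{d(t-\tau)}\,d\tau
  = \left[-\frac{1}{d}\,e^{d(t-\tau)}\right]_{\tau=0}^{\tau=t}
  = \frac{e^{dt} - 1}{d}
  = (-d)^{-1}\bigl(1 - e^{dt}\bigr).
```

Because the integrand $e^{d(t-\tau)}$ is positive for real d, the weighted mean-value argument applies to this entry, which is the sense in which the mean value is bounded by the extremes of the corresponding component of w on [0, t].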

Let {ν0j}, j = 1, …, n be a set of linearly independent vectors, and νj(t) the corresponding solutions given by Eq. (D.7). Because $\bar w(t) \to 0$ when t → ∞, the set of νj(t) is also linearly independent for sufficiently large t. Letting V0, V(t) and $\bar W(t)$ be the matrices with ν0j, νj(t) and $\bar w_j(t)$ as columns, respectively, we get

$$V(t) = e^{D^* t} V_0 + (-D^*)^{-1}\bigl(I - e^{D^* t}\bigr)\,\bar W(t), \tag{D.8}$$

leading to

$$e^{D^* t} = \bigl(V(t) - (-D^*)^{-1}\bar W(t)\bigr)\bigl(V_0 - (-D^*)^{-1}\bar W(t)\bigr)^{-1}. \tag{D.9}$$

The last factor in Eq. (D.9) is well-defined for sufficiently large t and approaches $(V_0)^{-1}$ when t → ∞ because $\bar W(t)$ approaches the zero matrix. It follows that $\|e^{D^* t}\| \to 0$ when t → ∞. Let μ be the spectral abscissa of D*. For all t ≥ 0, exp(μt) ≤ ∥exp(D*t)∥ [42, Theorem 15.3]. This shows that exp(μt) → 0 when t → ∞. Therefore μ < 0, and z* is an asymptotically stable and hyperbolic point for SA.

To compare the temporal behaviours of SE and SA we had to rely on numerical simulations. To justify the aggregation, the temporal behaviours of z(t, z0) and y(t, z0) should be approximately equal and qualitatively similar for a range of actual parameter values and realistic common initial values $z_0 \neq z^*$. Although very close similarity far from the common equilibrium point cannot be expected, at least the behaviours near the equilibrium should be quantitatively similar. We quantify the degree of similarity of the solution curves by the relative discrepancy

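$$\mathrm{RelErr}(y, z) = \left\{ \frac{\int_0^\infty \bigl(y_i(t, z_0) - z_i(t, z_0)\bigr)^2\,dt}{\int_0^\infty \bigl(z_i(t, z_0) - z_i^*\bigr)^2\,dt} \right\}_{i \in N}. \tag{D.10}$$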

One advantage of this discrepancy measure is that both integrals converge exponentially if z* is hyperbolic so that they can easily be computed numerically by integrating to a sufficiently large and finite T. As a measure of the similarity of solutions near the equilibrium we used
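The truncated integrals in Eq. (D.10) can be approximated from sampled solution curves. The following is a minimal sketch, not the authors' simulation code: the helper names `trapezoid` and `rel_err`, the trapezoidal rule, the choice to return one ratio per component, and the toy exponential curves are all assumptions made for illustration.

```python
import math

def trapezoid(samples, dt):
    """Composite trapezoidal rule for equally spaced samples."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def rel_err(y_traj, z_traj, z_star, dt):
    """Per-component ratio of Eq. (D.10), both integrals truncated at the last sample.

    y_traj[k][i] and z_traj[k][i] hold component i of the solutions of the
    aggregated and extended systems at time k * dt.
    """
    ratios = []
    for i in range(len(z_star)):
        num = trapezoid([(y[i] - z[i]) ** 2 for y, z in zip(y_traj, z_traj)], dt)
        den = trapezoid([(z[i] - z_star[i]) ** 2 for z in z_traj], dt)
        ratios.append(num / den)
    return ratios

# Toy one-component example with hypothetical curves (not from the paper):
# z decays to z* = 1 at rate 1, y decays to the same point at rate 1.1.
dt, T = 0.001, 40.0
times = [k * dt for k in range(int(round(T / dt)) + 1)]
z_star = [1.0]
z_traj = [[1.0 + math.exp(-t)] for t in times]
y_traj = [[1.0 + math.exp(-1.1 * t)] for t in times]
ratios = rel_err(y_traj, z_traj, z_star, dt)
```

For these toy curves both integrals are available in closed form, and their ratio is 1/231, which the sampled estimate reproduces closely because the integrands have decayed to negligible size well before T.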

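$$\mathrm{EigDiff}(z^*; S_E, S_A) = \log\left( \sum_{j \in N} \frac{|\lambda_j - \Lambda_j|}{|\Lambda_j|}\, e^{\Re(\Lambda_j)} \right), \tag{D.11}$$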

where {Λj} and {λj} are the sets of eigenvalues of the Jacobians of SE and SA, respectively. Because SE has one additional eigenvalue, one of its eigenvalues has to be excluded from the sum in Eq. (D.11). We excluded the eigenvalue that minimises the sum. The purpose of the exponential factor is to simulate the fact that an eigenvalue contributes to the solution of the linearised equations around z* by this factor. This similarity measure is justified if there is a corresponding similarity between the two sets of eigenvectors, because then for a given solution z(t) it will be possible to construct a solution y(t) which will be quantitatively similar to z(t) close to the equilibrium.
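The exclusion step described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the function name `eig_diff` is hypothetical, the eigenvalues are taken as already computed, the two sets are paired by sorting on (real part, imaginary part), the exponential weight is read as exp(Re Λj), and the natural logarithm is used.

```python
import math

def eig_diff(eigs_extended, eigs_aggregated):
    """Sketch of Eq. (D.11).

    eigs_extended:   list of the n + 1 eigenvalues of the Jacobian of the extended system.
    eigs_aggregated: list of the n eigenvalues of the Jacobian of the aggregated system.
    Each candidate eigenvalue of the extended system is left out in turn and the
    smallest resulting sum is kept, as described in the text.
    """
    key = lambda v: (v.real, v.imag)  # assumed pairing rule: sort both sets the same way
    lam = sorted(eigs_aggregated, key=key)
    best = math.inf
    for skip in range(len(eigs_extended)):
        kept = sorted(eigs_extended[:skip] + eigs_extended[skip + 1:], key=key)
        total = sum(abs(l - L) / abs(L) * math.exp(L.real) for l, L in zip(lam, kept))
        best = min(best, total)
    return math.log(best)  # undefined if the two spectra coincide exactly

# Hypothetical spectra for illustration only.
value = eig_diff([-1 + 0j, -2 + 0j, -5 + 0j], [-1.01 + 0j, -2 + 0j])
```

In this toy case the minimising choice drops the eigenvalue −5, leaving a single nonzero term 0.01·exp(−1), so the result is negative, in line with sums well below 1.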

Numerical simulations for a range of n-values show that in almost all cases the eigenvalues of SA match the eigenvalues of SE very closely (Figure D.3). The scatterplots in the left column show that except in a few cases, RelErr(y, z) and EigDiff(z*; SE, SA) are much smaller than 0, showing that the solution of SA lies relatively close to the solution of SE, and that the differences between the eigenvalues {Λj} of SE and {λj} of SA are much smaller in magnitude than the eigenvalues of the Jacobian of SE.

Figure D.3.


Comparisons of the temporal behaviour of the extended model SE and the aggregate model SA for varying parameter combinations. Top row: 3 aggregated nodes, 423 data points. Second row: 7 aggregated nodes, 394 data points. Third row: 14 aggregated nodes, 352 data points. Bottom row: 24 aggregated nodes, 392 data points. Left panels: scatterplot of RelErr(y, z) vs. EigDiff(z*; SE, SA) for 423 parameter sets. Middle panels: the distribution of RelErr(y, z). Right panels: the distribution of EigDiff(z*; SE, SA). Parameter values and details about the simulations are given in the text.

Except in a small number of cases, the temporal behaviours of the solutions also match closely. Typically, at least when n is large, there are appreciable differences between the two solutions only for a few variables (Figure D.4). Frequently this happens for variables that do not approach their final state monotonically, either because they oscillate towards the equilibrium or because they approach a limit cycle. Also, there could be multistationarity in the systems such that the two solutions approach different final states.

Figure D.4.


Selected examples of solution curves for extended systems (blue) and the corresponding aggregated systems (green). a: A typical case with n = 4 in which the two systems differ significantly in just one variable. b: A case with n = 4 of particularly bad similarity. In both cases, however, the two sets of curves are qualitatively similar, but the oscillations and dips are shifted in time. c: A system with n = 3 having a stable limit cycle for both systems.

Below follows a summary of the simulation details. The rate functions fk, k = 1, …, n − 1, fn1 and fn2 were given by the function

$$f_i(z) = a_i\,B_j(Z_k, Z_l) - \gamma_i z_i, \tag{D.12}$$

where ai and γi were scalars chosen at random from a uniform distribution over (0,1), and Bj is any of the 14 non-constant Boolean functions of two variables, chosen at random for each i, but equal for fn1 and fn2. The function

$$Z_k = \frac{z_k^{p_k} + h_k\,\theta_k^{p_k}}{z_k^{p_k} + \theta_k^{p_k}} \tag{D.13}$$

is the generalised Hill function derived from applying Boltzmann statistics to transcription regulation [43]. Note that here the superscripts are powers. The thresholds θk are chosen at random uniformly over (0, 1), the steepness parameters pk (equivalent to the Hill exponent) were picked from a uniform distribution of integers in [1,10], and the inverse fold changes hk were also chosen at random from a uniform distribution over (0, 1). The two inputs to the Boolean functions were chosen at random among the variables Zi, but the same for rn1 and rn2. For each value of n we ran 500 simulations. For each parameter set, both solutions were started from the same randomly chosen initial point. The systems that did not converge to a stable point or in which the two systems approached different attractors, were disregarded. That left us with the number of cases mentioned in the caption of Figure D.3.
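The building blocks of Eqs. (D.12) and (D.13) can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and the product form used for a logical AND is an assumed algebraic version of one of the 14 Boolean functions, since the exact continuous form of each B_j is not given in this appendix.

```python
import random

def hill(z, theta, p, h):
    """Generalised Hill function of Eq. (D.13): equals h at z = 0 and tends to 1 for large z."""
    return (z ** p + h * theta ** p) / (z ** p + theta ** p)

def soft_and(Zk, Zl):
    """Assumed algebraic form of Boolean AND acting on inputs in [0, 1]."""
    return Zk * Zl

def rate(z_i, Zk, Zl, a_i, gamma_i, B=soft_and):
    """Right-hand side of Eq. (D.12) for one node."""
    return a_i * B(Zk, Zl) - gamma_i * z_i

def draw_parameters(rng):
    """One random parameter set following the ranges stated in the text."""
    return {
        "a": rng.random(),        # production scale, uniform on [0, 1)
        "gamma": rng.random(),    # decay rate, uniform on [0, 1)
        "theta": rng.random(),    # threshold, uniform on [0, 1)
        "h": rng.random(),        # inverse fold change, uniform on [0, 1)
        "p": rng.randint(1, 10),  # integer steepness in [1, 10]
    }

params = draw_parameters(random.Random(0))
```

With θ = 0.5, p = 3 and h = 0.2, for example, `hill` returns 0.2 at z = 0 and (1 + h)/2 = 0.6 at z = θ, which shows how h sets the basal level and θ the midpoint of the response.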

Appendix E

The root of nonzero allele interaction values

The basic two-node system in Figure 1a is in our standard notation given by the rate equations

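$$\begin{aligned}
\dot z_{11} &= r_{11}(z_2) - \gamma_{11} z_{11},\\
\dot z_{12} &= r_{12}(z_2) - \gamma_{12} z_{12},\\
\dot z_2 &= r_2(z_1) - \gamma_2 z_2,
\end{aligned} \tag{E.1}$$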

where $z_1 = z_{11} + z_{12}$. For the μ-system in Figure 1b the rate equations are

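$$\begin{aligned}
\dot z_{11} &= r_{11}(z_2) - \gamma_{11} z_{11},\\
\dot z_{12} &= r_{12}(\tilde z_2) - \gamma_{12} z_{12},\\
\dot z_2 &= r_2(z_{11} + \mu z_{12}) - \gamma_2 z_2,\\
\dot{\tilde z}_2 &= r_2(\mu z_{11} + z_{12}) - \gamma_2 \tilde z_2,
\end{aligned} \tag{E.2}$$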

where μ ∈ [0,1]. By assumption the equilibrium conditions of Eq. (E.2) define unique stable equilibrium values $y_1^{12}(\mu) = y_{11}^{2}(\mu) + y_{12}^{1}(\mu)$, $y_2(\mu)$ and $\tilde y_2(\mu)$. The allele interaction value is $\Delta_1^{12}(\mu) = y_1^{12}(\mu) - \bigl(y_{11}(\mu) + y_{12}(\mu)\bigr)$. Using implicit differentiation, doing some straightforward algebra and finally taking the limit μ → 0, we find

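$$\lim_{\mu \to 0} \frac{d\Delta_1^{12}(\mu)}{d\mu} = \frac{u_{12}\,\tilde u_2}{\gamma_{12}\gamma_2 - u_{12}\,\tilde u_2}\; y_{11} + \frac{u_{11}\,u_2}{\gamma_{11}\gamma_2 - u_{11}\,u_2}\; y_{12}, \tag{E.3}$$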

where $u_{1\alpha}$ and $u_2$ represent the derivatives of the corresponding dose-response functions with respect to their argument. Because $\Delta_1^{12}(0) = 0$ (see the main text), it follows that for arbitrarily small μ > 0 the μ-system has a nonzero allele interaction value. We may conclude that in general, this is true also for μ = 1, in which case the μ-system is equivalent to the basic system defined by Eqs. (E.1).
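The limit in Eq. (E.3) can be checked numerically on a toy instance of the μ-system. The sketch below rests on several assumptions that are not taken from the paper: the second regulator node is treated as a separate variable (called `y2t` here), allele 1 is regulated by the first regulator and allele 2 by the second, the dose-response functions and unit decay rates are arbitrary choices, and the equilibrium is found by simple fixed-point sweeps.

```python
# Hypothetical dose-response functions and their derivatives.
def r11(z): return 1.0 / (1.0 + z)
def r12(z): return 2.0 / (1.0 + z)
def r2(x): return x / (1.0 + x)
def d_r11(z): return -1.0 / (1.0 + z) ** 2
def d_r12(z): return -2.0 / (1.0 + z) ** 2
def d_r2(x): return 1.0 / (1.0 + x) ** 2

G11 = G12 = G2 = 1.0  # decay rates gamma_11, gamma_12, gamma_2 (assumed values)

def equilibrium(mu, sweeps=500):
    """Fixed-point sweeps on the equilibrium conditions of the mu-system."""
    y11 = y12 = y2 = y2t = 0.5
    for _ in range(sweeps):
        y2 = r2(y11 + mu * y12) / G2    # regulator driven mainly by allele 1
        y2t = r2(mu * y11 + y12) / G2   # regulator driven mainly by allele 2
        y11 = r11(y2) / G11
        y12 = r12(y2t) / G12
    return y11, y12, y2, y2t

# Equilibrium and derivatives at mu = 0, where the two alleles are decoupled.
y11, y12, y2, y2t = equilibrium(0.0)
u11, u12 = d_r11(y2), d_r12(y2t)
u2, u2t = d_r2(y11), d_r2(y12)
formula = (u12 * u2t / (G12 * G2 - u12 * u2t) * y11
           + u11 * u2 / (G11 * G2 - u11 * u2) * y12)

# One-sided finite difference of the allele interaction value at small mu.
mu = 1e-6
a, b, _, _ = equilibrium(mu)
finite_diff = ((a + b) - (y11 + y12)) / mu
```

For this particular choice both feedback loops are negative, so the slope is negative, and the finite difference agrees with the closed-form expression up to an error of the order of μ.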

If the red arrow from X12 in Figure 1b is missing, the loop $X_{11} \to \tilde X_2 \to X_{12} \to X_2 \to X_{11}$ is broken, and there is no longer a regulatory loop in the system. In this case the subsystem $X_{12}$, $\tilde X_2$ does not act on the two other nodes, and $\lim_{\mu \to 0} d\Delta_1^{12}(\mu)/d\mu$ no longer depends on y12. Only the first term in Eq. (E.3) remains, and the conclusion is still valid.

Appendix F

Proof of Lemma 1

Proof

We adapt the numbering such that $x_1 \le x_2$. The intersections between the curves y = ϕi(x) and y = γix, i = 1, 2, and y = ϕ1(x) + ϕ2(x) and y = γx define the solutions x1, x2 and x12, respectively. We consider three cases separately.

  1. The case $\phi_i'(x) < 0$, i = 1, 2. Assume $x_{12} \ge x_1 + x_2$. Then
    $$\gamma x_{12} = \phi_1(x_{12}) + \phi_2(x_{12}) \le \phi_1(x_1 + x_2) + \phi_2(x_1 + x_2) < \phi_1(x_1) + \phi_2(x_2) = \gamma_1 x_1 + \gamma_2 x_2 = \gamma(x_1 + x_2),$$
    contradicting the assumption. Thus, x12 < x1 + x2.
    Then assume $x_{12} \le x_1$. This leads to
    $$\gamma x_{12} \ge \phi_1(x_1) + \phi_2(x_1) \ge \phi_1(x_1) + \phi_2(x_2) = \gamma(x_1 + x_2) > \gamma x_{12},$$
    which is impossible. Thus, x12 > x1. In passing we note that it is not possible to draw a definite conclusion about which is the larger of γ1 and γ2.

  2. The case $0 < \phi_i'(x) < \gamma_i$, i = 1, 2. In this case, too, existence of x12 has to be assumed. The three curves y = ϕ1(x), y = ϕ2(x), and y = ϕ12(x) intersect the lines y = γ1x, y = γ2x and y = γx, respectively, from above as in case 1. Then ϕ12(x1 + x2) > ϕ1(x1) + ϕ2(x2) = γ1x1 + γ2x2 = γ(x1 + x2). Hence at x = x1 + x2 the curve y = ϕ12(x) lies above the line y = γx. Therefore the intersection x12 lies to the right of x1 + x2, i.e. x12 > x1 + x2.

  3. The case $\phi_i'(x) > \gamma_i$, i = 1, 2. The existence of x12 is not ensured, but we have assumed it exists. In this case $\phi_{12}'(x) > \gamma_1 + \gamma_2 > \gamma$, and the three curves intersect the corresponding lines from below. As ϕ12(x) > ϕi(x), it follows that ϕ12(xi) > γixi. Assume γ1 < γ2. Thus, in both points x1 and x2, ϕ12(x) > γx. This implies that x12 < x1, x12 < x2, thus x12 < x1 + x2. The case γ2 < γ1 goes likewise.
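Case 1 of the proof can be illustrated numerically. The sketch below uses two hypothetical decreasing functions and decay rates chosen for illustration, finds the three intersections by bisection, and takes γ as the weighted mean implied by the relation γ1x1 + γ2x2 = γ(x1 + x2) used in the proof.

```python
def bisect(g, lo, hi, iterations=200):
    """Root of g on [lo, hi], assuming g(lo) > 0 > g(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical decreasing dose-response functions and decay rates.
def phi1(x): return 1.0 / (1.0 + x)
def phi2(x): return 2.0 / (1.0 + x)
gamma1, gamma2 = 1.0, 0.5

# Single-function intersections phi_i(x) = gamma_i * x.
x1 = bisect(lambda x: phi1(x) - gamma1 * x, 0.0, 10.0)
x2 = bisect(lambda x: phi2(x) - gamma2 * x, 0.0, 10.0)

# Weighted-mean decay rate, then the intersection for the summed function.
gamma = (gamma1 * x1 + gamma2 * x2) / (x1 + x2)
x12 = bisect(lambda x: phi1(x) + phi2(x) - gamma * x, 0.0, 10.0)
```

For this choice the numbering already satisfies x1 ≤ x2, and the computed x12 falls strictly between x1 and x1 + x2, as case 1 asserts.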


References

  1. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–C52. doi: 10.1038/35011540.
  2. Lauffenburger DA. Cell signaling pathways as control modules: Complexity for simplicity? Proc Natl Acad Sci U S A. 2000;97:5031–5033. doi: 10.1073/pnas.97.10.5031.
  3. Hoogendoorn B, Coleman SL, Guy CA, Smith K, Bowen T, Buckland PR, O’Donovan MC. Functional analysis of human promoter polymorphisms. Human Molecular Genetics. 2003;12:2249–2254. doi: 10.1093/hmg/ddg246.
  4. Gehring NH, Frede U, Neu-Yilik G, Hundsdoerfer P, Vetter B, Hentze MW, Kulozik AE. Increased efficiency of mRNA 3′ end formation: a new genetic mechanism contributing to hereditary thrombophilia. Nature Genetics. 2001;28:389–392. doi: 10.1038/ng578.
  5. Peng J, Murray EL, Schoenberg DR. The poly(A)-limiting element enhances mRNA accumulation by increasing the efficiency of pre-mRNA 3′ processing. RNA. 2005;11:958–965. doi: 10.1261/rna.2020805.
  6. Wang RL, Stec A, Hey J, Lukens L, Doebley J. The limits of selection during maize domestication. Nature. 1999;398:236–239. doi: 10.1038/18435.
  7. Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB. Gene regulation at the single-cell level. Science. 2005;307:1962–1965. doi: 10.1126/science.1106914.
  8. Mayo AE, Setty Y, Shavit S, Zaslaver A, Alon U. Plasticity of the cis-regulatory input function of a gene. PLoS Biology. 2006;4:e45. doi: 10.1371/journal.pbio.0040045.
  9. Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. Journal of Molecular Evolution. 2003;57:694–701. doi: 10.1007/s00239-003-2519-1.
  10. Capon F, Allen MH, Ameen M, Burden AD, Tillman D, Barker JN, Trembath RC. A synonymous SNP of the corneodesmosin gene leads to increased mRNA stability and demonstrates association with psoriasis across diverse ethnic groups. Human Molecular Genetics. 2004;13:2361–2368. doi: 10.1093/hmg/ddh273.
  11. Chamary JV, Hurst L. Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biology. 2005;6:R75. doi: 10.1186/gb-2005-6-9-r75.
  12. Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM, Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya C, Lander ES, Di Palma F, Lindblad-Toh K, Kingsley DM. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484:55–61. doi: 10.1038/nature10944.
  13. Omholt SW, Plahte E, Øyehaug L, Xiang KF. Gene regulatory networks generating the phenomena of additivity, dominance and epistasis. Genetics. 2000;155:969–980. doi: 10.1093/genetics/155.2.969.
  14. Gjuvsland AB, Plahte E, Ådnøy T, Omholt SW. Allele interaction – single locus genetics meets regulatory biology. PLoS ONE. 2010;5:e9379. doi: 10.1371/journal.pone.0009379.
  15. Gibson G. Epistasis and pleiotropy as natural properties of transcriptional regulation. Theoretical Population Biology. 1996;49:58–89. doi: 10.1006/tpbi.1996.0003.
  16. Tyler AL, Asselbergs FW, Williams SM, Moore JH. Shadows of complexity: what biological networks reveal about epistasis and pleiotropy. Bioessays. 2009;31:220–227. doi: 10.1002/bies.200800022.
  17. Thomas R, D’Ari R. Biological Feedback. CRC Press; Boca Raton, USA: 1990.
  18. Thomas R, Kaufman M. Multistationarity, the basis of cell differentiation and memory. II. Logical analysis of regulatory networks in terms of feedback circuits. Chaos. 2001;11:180–195. doi: 10.1063/1.1349893.
  19. Welch SM, Dong ZS, Roe JL, Das S. Flowering time control: gene network modelling and the link to quantitative genetics. Aust J Agric Res. 2005;56:919–936.
  20. Omholt SW. From bean-bag genetics to feedback genetics: bridging the gap between regulatory biology and classical genetics. Landes Bioscience; Georgetown, Texas, USA: 2006.
  21. Maybee JS, Olesky DD, van den Driessche P, Wiener G. Matrices, Digraphs and Determinants. University of Colorado; 1987. Technical Report.
  22. de Jong H. Modeling and simulation of genetic regulatory systems: A literature review. J Comput Biol. 2002;9:67–104. doi: 10.1089/10665270252833208.
  23. Brazhnik P, de la Fuente A, Mendes P. Gene networks: how to put the function in genomics. Trends Biotechnol. 2002;20:467–472. doi: 10.1016/s0167-7799(02)02053-x.
  24. Plahte E, Mestl T, Omholt SW. Feedback loops, stability and multistationarity in dynamical systems. JBS. 1995;3:409–413.
  25. Soulé C. Graphic requirements for multistationarity. ComPlexUs. 2003;1:123–133.
  26. Maybee JS. Principal Minor Determinant Formulas. University of Colorado; 1973. Technical Report CU-CS-033-73.
  27. Radulescu O, Lagarrigue S, Siegel A, Veber P, Le Borgne M. Topology and static response of interaction networks in molecular biology. J R Soc Interface. 2006;3:185–196. doi: 10.1098/rsif.2005.0092.
  28. Marsden JE, Hoffman MJ. Elementary Classical Analysis. 2nd ed. W. H. Freeman; New York: 1993.
  29. Duncan IW. Transvection effects in Drosophila. Annu Rev Genet. 2002;36:521–556. doi: 10.1146/annurev.genet.36.060402.100441.
  30. Kholodenko BN, Kiyatkin A, Bruggeman FJ, Sontag E, Westerhoff HV, Hoek JB. Untangling the wires: A strategy to trace functional interactions in signaling and gene networks. Proc Natl Acad Sci U S A. 2002;99:12841–12846. doi: 10.1073/pnas.192442699.
  31. Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, Arkin AP, Astromoff A, El Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian K-D, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Guldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kotter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang C-y, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M. Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002;418:387–391. doi: 10.1038/nature00935.
  32. Mattick JS. RNA regulation: a new genetics? Nat Rev Genet. 2004;5:316–323. doi: 10.1038/nrg1321.
  33. Hobert O. Gene regulation by transcription factors and microRNAs. Science. 2008;319:1785–1786. doi: 10.1126/science.1151651.
  34. Makeyev EV, Maniatis T. Multilevel regulation of gene expression by microRNAs. Science. 2008;319:1789–1790. doi: 10.1126/science.1152326.
  35. Shimoni Y, Friedlander G, Hetzroni G, Niv G, Altuvia S, Biham O, Margalit H. Regulation of gene expression by small non-coding RNAs: a quantitative view. Mol Syst Biol. 2007;3. doi: 10.1038/msb4100181.
  36. Gantmacher FR. The Theory of Matrices. AMS Chelsea Publishing; Providence, Rhode Island: 2000.
  37. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Kuhlman T, Phillips R. Transcriptional regulation by the numbers: applications. Curr Opin Genet Dev. 2005;15:125–135. doi: 10.1016/j.gde.2005.02.006.
  38. Bintu L, Buchler NE, Garcia HG, Gerland U, Hwa T, Kondev J, Phillips R. Transcriptional regulation by the numbers: models. Curr Opin Genet Dev. 2005;15:116–124. doi: 10.1016/j.gde.2005.02.007.
  39. Buchler NE, Gerland U, Hwa T. On schemes of combinatorial transcription logic. Proc Natl Acad Sci U S A. 2003;100:5136–5141. doi: 10.1073/pnas.0930314100.
  40. Sengupta AM, Djordjevic M, Shraiman BI. Specificity and robustness in transcription control networks. Proc Natl Acad Sci U S A. 2002;99:2072–2077. doi: 10.1073/pnas.022388499.
  41. von Hippel PH, Berg OG. On the specificity of DNA-protein interactions. Proc Natl Acad Sci U S A. 1986;83:1608–1612. doi: 10.1073/pnas.83.6.1608.
  42. Trefethen LN, Embree M. Spectra and Pseudospectra: the Behavior of Nonnormal Matrices and Operators. Princeton University Press; Princeton, N.J.: 2005.
  43. Buchler NE, Gerland U, Hwa T. Nonlinear protein degradation and the function of genetic circuits. Proc Natl Acad Sci U S A. 2005;102:9559–9564. doi: 10.1073/pnas.0409553102.
