Current Genomics. 2009 Nov;10(7):511–525. doi: 10.2174/138920209789208237

A Tutorial on Analysis and Simulation of Boolean Gene Regulatory Network Models

Yufei Xiao 1,2,*
PMCID: PMC2808677  PMID: 20436877

Abstract

Driven by the desire to understand genomic functions through the interactions among genes and gene products, research on gene regulatory networks has become an active area in genomic signal processing. Among the most studied mathematical models are Boolean networks and probabilistic Boolean networks, which are rule-based dynamic systems. This tutorial provides an introduction to the essential concepts of these two Boolean models and presents up-to-date analysis and simulation methods developed for them. In the Analysis section, we show that Boolean models are Markov chains; on that basis we present a Markovian steady-state analysis of attractors and also reveal the relationship between probabilistic Boolean networks and dynamic Bayesian networks (another popular genetic network model), again via Markov analysis. The last subsection is dedicated to structural analysis, which opens a door to other topics such as network control. The Simulation section starts from the basic tasks of creating state transition diagrams and finding attractors, proceeds to the simulation of network dynamics and the estimation of steady-state distributions, and finally comes to an algorithm for generating artificial Boolean networks with prescribed attractors. The contents are arranged in a roughly logical order, such that the Markov chain analysis lays the basis for most of the Analysis section and also prepares the reader for the topics in the Simulation section.

1. INTRODUCTION

In most living organisms, the genome carries the hereditary information that governs life, death, and reproduction. Central to genomic functions are the coordinated interactions between genes (both the protein-coding DNA sequences and the regulatory non-coding DNA sequences), RNAs and proteins, forming the so-called gene regulatory networks (or genetic regulatory networks).

The urgency of understanding gene regulation at the systems level has increased tremendously since the early stages of genomics research. A driving force is that, if we can build good gene regulatory network models and apply intervention techniques to control the genes, we may find better treatments for diseases resulting from aberrant gene regulation, such as cancer. In the past decade, the invention of high-throughput technologies has made it possible to harvest large quantities of data efficiently, which is turning the quantitative study of gene regulatory networks into a reality. Such study requires the application of signal processing techniques and fast computing algorithms to process the data and interpret the results. These needs in turn have fueled the development of genomic signal processing and the use of mathematical models to describe the complex interactions between genes.

The roles of mathematical models for gene regulatory networks include:

  • Describing genetic regulations at a system level;

  • Enabling artificial simulation of network behavior;

  • Predicting new structures and relationships;

  • Making it possible to analyze or intervene in the network through signal processing methods.

Among various mathematical endeavors are two Boolean models, Boolean networks (BNs) [1] and probabilistic Boolean networks (PBNs) [2], in which each node (gene) takes on two possible values, ON or OFF (or 1 and 0), and the way genes interact with each other is formulated by standard logic functions. They constitute an important class of models for gene regulatory networks, in that they capture some fundamental characteristics of gene regulations, are conceptually simple, and their rule-based structures bear physical and biological meanings. Moreover, Boolean models can be physically implemented by electronic circuits, and demonstrate rich dynamics that can be studied using mathematical and signal processing theory (for instance, Markov chains [2, 3]).

In practice, Boolean models have been successfully applied to describe real gene regulatory relations (for instance, the Drosophila segment polarity network [4]), and the attractors of BNs and PBNs have been associated with cellular phenotypes in living organisms [5]. The association of network attractors and actual phenotypes has inspired the development of control strategies [6] to increase the likelihood of reaching desirable attractors (“good” phenotypes) and decrease the likelihood of undesirable attractors (“bad” phenotypes such as cancer). The effort of applying control theory to Boolean models is especially appealing in the medical community, as it holds the potential to guide effective intervention and treatment in cancer.

The author would like to bring the fundamentals of Boolean models to a wider audience in light of their theoretical value and pragmatic utility. This tutorial will introduce the basic concepts of Boolean networks and probabilistic Boolean networks, present the mathematical essentials, and discuss some analyses developed for the models and the common simulation issues. It is written for researchers in the genomic signal processing area, as well as researchers with general mathematics, statistics, engineering, or computer science backgrounds who are interested in this topic. It intends to provide a quick reference to the fundamentals of Boolean models, allowing the readers to apply those techniques to their own studies. Formal definitions and mathematical foundations will be laid out concisely, with some in-depth mathematical details left to the references.

2. PRELIMINARIES

In Boolean models, each variable (known as a node) can take two possible values, 1 (ON) and 0 (OFF). A node can represent a gene, RNA sequence, or protein, and its value (1 or 0) indicates its measured abundance (expressed or unexpressed; high or low). In this paper, we use “node” and “gene” interchangeably.

A state in Boolean models is a binary vector of all the gene values measured at the same time, and is also called the gene activity (or expression) profile (GAP). The state space of a Boolean model consists of all the possible states, and its size is 2^n for a model with n nodes.

Definition 1 [2, 7] A Boolean network is defined on a set of n binary-valued nodes (genes) $V = \{x_1, \ldots, x_n\}$, $x_i \in \{0,1\}$, where each node $x_i$ has $k_i$ parent nodes (regulators) chosen from V, and its value at time t + 1 is determined by its parent nodes at time t through a Boolean function $f_i$,

$x_i(t+1) = f_i\big(x_{i_1}(t), x_{i_2}(t), \ldots, x_{i_{k_i}}(t)\big), \quad \{i_1, \ldots, i_{k_i}\} \subseteq \{1, \ldots, n\}.$ (1)

$k_i$ is called the connectivity of $x_i$, and $f_i$ is the regulatory function. Defining the network function $f = (f_1, \ldots, f_n)$, we denote the Boolean network by β(V, f). Letting the network state at time t be $x(t) = (x_1(t), \ldots, x_n(t))$, the state transition x(t) → x(t + 1) is governed by f, written as x(t + 1) = f(x(t)).
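To make Definition 1 concrete, the following minimal Python sketch encodes a three-node BN and performs one synchronous state transition. The regulatory functions are those of Example 1 in Section 3.4.2; all variable and function names are illustrative.

```python
# A three-gene Boolean network beta(V, f); the functions are those of Example 1.

def f1(x):                      # x1(t+1) = x3(t)
    return x[2]

def f2(x):                      # x2(t+1) = 0 (constant function)
    return 0

def f3(x):                      # x3(t+1) = x1(t) XOR x2(t)
    return x[0] ^ x[1]

def network_function(x):
    """The network function f: x(t+1) = f(x(t)), all genes updated synchronously."""
    return (f1(x), f2(x), f3(x))

state = (1, 0, 1)                          # x(t)
next_state = network_function(state)       # x(t+1)
print(state, "->", next_state)             # (1, 0, 1) -> (1, 0, 1): a singleton attractor
```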

In Boolean networks, genetic interactions and regulations are hard-wired under the assumption of biological determinism. However, a gene regulatory network is not a closed system: it interacts with its environment and with other genetic networks, and it is also likely that genetic regulation is inherently stochastic; therefore, Boolean networks have limitations in their modeling power. Probabilistic Boolean networks were introduced to address this issue [2, 7]: they are composed of a family of Boolean networks, each of which is considered a context [8]. At any given time, gene regulation is governed by one component Boolean network, and network switches are possible, so that at a later time instant, genes can interact under a different context. In this sense, probabilistic Boolean networks are more flexible in modeling and interpreting biological data.

Definition 2 [2, 3, 7] A probabilistic Boolean network is defined on $V = \{x_1, \ldots, x_n\}$, $x_i \in \{0,1\}$, and consists of r Boolean networks $\beta_1(V, f_1), \ldots, \beta_r(V, f_r)$, with associated network selection probabilities $c_1, \ldots, c_r$ such that $\sum_{j=1}^{r} c_j = 1$. The network function of the j-th BN is $f_j = (f_{j1}, \ldots, f_{jn})$. At any time, genes are regulated by one of the BNs, and at the next time instant, there is a probability q (the switching probability) of changing networks; once a change is decided upon, we choose a BN randomly (from the r BNs) according to the selection probabilities. Letting p be the rate of random gene perturbation (flipping a gene value from 0 to 1 or 1 to 0), the state transition of the PBN at time t (assuming operation under $\beta_j$) is probabilistic, namely [3],

$x(t+1) = \begin{cases} f_j(x(t)), & \text{with probability } (1-p)^n, \\ x(t) \oplus \gamma, & \text{with probability } 1-(1-p)^n, \end{cases}$ (2)

where ⊕ is bit-wise modulo-2 addition, $\gamma = (\gamma_1, \ldots, \gamma_n)$ is a random binary vector with $\Pr\{\gamma_i = 1\} = p$, and x(t) ⊕ γ denotes a random perturbation of the state x(t) (one or more genes are flipped). Let the set of network functions be $F = \{f_1, \ldots, f_r\}$; we denote the PBN by G(V, F, c, p) (see Remark 1).

Alternatively, the PBN can be represented as G(V, Ψ, α, p), with $\Psi = \{\Psi_1, \ldots, \Psi_n\}$ and $\alpha = \{\alpha_1, \ldots, \alpha_n\}$. In this representation, each node $x_i$ is regarded as being regulated by a set of l(i) Boolean functions $\Psi_i = \{\psi_1^{(i)}, \ldots, \psi_{l(i)}^{(i)}\}$ with the corresponding set of function selection probabilities $\alpha_i = \{\alpha_1^{(i)}, \ldots, \alpha_{l(i)}^{(i)}\}$, $\sum_{j=1}^{l(i)} \alpha_j^{(i)} = 1$. The two representations are related such that any network function $f_j$ is a realization of the regulatory functions of the n genes obtained by choosing one function from the function set $\Psi_i$ for each gene $x_i$, and we can write

$f_j = \big(\psi_{j_1}^{(1)}, \ldots, \psi_{j_n}^{(n)}\big), \quad j_i \in \{1, \ldots, l(i)\}.$ (3)

Moreover, if it is an independent PBN, namely $\Pr\{\psi_{j_1}^{(1)}, \ldots, \psi_{j_n}^{(n)}\} = \prod_{i=1}^{n} \Pr\{\psi_{j_i}^{(i)}\}$, then c and α are related by

$c_j = \prod_{i=1}^{n} \alpha_{j_i}^{(i)}.$ (4)

Remark 1 q does not appear in the PBN representation because, under the network switching scheme described above, it can be shown that the probability of being in $\beta_j$ at any time is equal to $c_j$, regardless of q. However, if we modify the network switching scheme such that, once a network switch is decided upon, we randomly choose any network other than the current one, it will require the definition of r(r–1) conditional selection probabilities, $c_{jk} = \Pr\{f_k \mid f_j\}$, $k, j \in \{1, \ldots, r\}$, $k \neq j$, and the derivation of $\Pr\{f_j\}$ (the probability of being in $\beta_j$) is left as an exercise to the reader.

A Boolean model with a finite number of nodes has a finite state space. From the definition of a Boolean network, it follows that its state transitions are deterministic, that is, given a state, its successor state is unique. Naturally, if we represent the whole state space and the transitions among the states of a BN graphically, we obtain a state transition diagram.

Definition 3 The state transition diagram of an n-node Boolean network β(V, f) is a directed graph D(S, E). S is a set of 2^n vertices, each representing a possible state of the Boolean network; E is a set of 2^n edges, each pointing from a state to its successor state. If a state transits to itself, the edge is a loop. The state transitions are computed by evaluating x(t+1) = f(x(t)) exactly 2^n times, with x(t) taking the values 00...0, 00...1, ..., 11...1 in turn.

Fig. (1) is an example of state transition diagram of a three-node BN. Like BNs, a PBN also has finite state space. Although state transitions in a PBN are not deterministic, they can be represented probabilistically. We will show how to construct the state transition diagram of a PBN in the Simulation section.

Fig. (1). A state transition diagram D(S, E).

With the help of state transition diagram, such as the one in Fig. (1), we can easily visualize that in a BN, any state trajectory in time x(0)→ x(1)→ x(2)→ ... must end up in a “trap”, and stay there forever unless a gene perturbation occurs. Similarly, if neither gene perturbation nor network switching has occurred, a time trajectory in a PBN will end up in a “trap” in one of the component BNs too; however, either gene perturbation or a network switch may cause it to escape from the trap. In spite of this, when gene perturbation and network switching are rare, a PBN is most likely to reach a “trap” before either occurs and will spend a reasonably long time there.

Definition 4 Starting from any initial state in a finite Boolean network, when free of gene perturbation, state transitions will allow the network to reach a finite set of states {a1,...,am} and cycle among them in a fixed order forever. The set of states is called an attractor, denoted by A. If A contains merely one state, it is a singleton attractor; otherwise, it is an attractor cycle. The set of states from which the network will eventually reach an attractor A constitutes the basin of attraction of A. A BN may have more than one attractor.

The attractors of a PBN are defined as the union of the attractors of its component BNs. In particular, if a PBN is composed of r BNs, and the k-th BN has $m_k$ attractors $A_{k1}, A_{k2}, \ldots, A_{km_k}$, then the attractors of the PBN are $\{A_{11}, A_{12}, \ldots, A_{1m_1}\} \cup \cdots \cup \{A_{r1}, A_{r2}, \ldots, A_{rm_r}\}$.

In a BN, different basins of attraction are depicted in the state transition diagram as disjoint subgraphs. In Fig. (1), D(S, E) is composed of three disjoint subgraphs, D1(S1,E1), D2(S2,E2) , and D3(S3,E3). 110 and 101 are singleton attractors, while 100 and 111 constitute a cycle. Their respective basins of attraction are S1 = {000,010,110}, S2 = {101} and S3 = {001,011,100,111}.

We are interested in the attractors of a Boolean model for at least two reasons: (1) Attractors represent the stable states of a dynamic system, thus they are tied to the long term behavior of Boolean models; (2) Earlier researchers demonstrated the association of cellular phenotype with attractors [5], thus giving a biological meaning to the attractors. Intuitively, when an attractor has a large basin of attraction, the corresponding phenotype is more likely than that of an attractor with much smaller basin of attraction. To develop intervention strategies that change the long term behavior of Boolean models, it is important to study the attractors.

3. ANALYSES OF BOOLEAN MODELS

Although analysis and simulation are two parallel subjects with Boolean models, the former includes some essential results that lay a foundation for the latter. In this section, we visit Boolean model analysis first.

One of the central ideas with Boolean models is their connection with Markov chains (subsection 3.1). Because of this, Boolean models, under certain conditions, possess steady-state distributions. The steady-state probabilities of attractors, which indicate the long-run trend of network dynamics, can be found analytically via Markov chain analysis (subsection 3.2). Moreover, the relationship between PBNs and Bayesian networks (another class of gene regulatory network models) can be established in a similar manner (subsection 3.3). Lastly, a subsection will be dedicated to structural analysis, which opens a door to other topics beyond this tutorial (such as control of genetic networks).

3.1. Markov Chain Analysis

As readers will soon find out, the transition probability matrix introduced below is not only a convenience in Markov chain analysis, but is also useful in simulation, to be discussed in Section 4.

3.1.1. Transition Probability Matrix

For a Boolean model of n nodes, a transition probability matrix $T = [t_{ij}]_{2^n \times 2^n}$ can be defined, where $t_{ij}$ indicates the probability of transition from one state (equal to i–1 when the binary vector is converted to an integer) to another state (corresponding to j–1).

In a Boolean network β(V,f), tij can be computed by

$t_{ij} = \begin{cases} 1, & \exists\, s \in \{0,1\}^n \text{ such that } \mathrm{dec}(s) = i-1 \text{ and } \mathrm{dec}(f(s)) = j-1, \\ 0, & \text{otherwise}, \end{cases}$ (5)

where dec(·) converts a binary vector to an integer, for instance, dec(00101) = 5. Since a BN is deterministic, T contains exactly one 1 in each row, and all other elements are 0.

In a PBN consisting of r BNs β1(V,f1),...,βr(V,fr), tij can be computed as follows [2, 3]. Note that p (random gene perturbation rate) and γ are defined as in Definition 2, and ck is the selection probability of βk.

$t_{ij} = \sum_{k=1}^{r} \Pr\{\beta_k \text{ is selected}\} \cdot \Pr\{s \to w \mid \beta_k \text{ is selected}\}$
$\quad = \sum_{k=1}^{r} c_k \cdot \Big[\Pr\{s \to w \text{ by state transition} \mid f_k\} + \Pr\{s \to w \text{ by random gene perturbation} \mid f_k\}\Big]$
$\quad = \sum_{k=1}^{r} c_k \cdot \Big[(1-p)^n\, \mathbf{1}\{f_k(s) = w\} + p_\gamma\, \mathbf{1}\{i \neq j\}\Big],$ (6)

where $\mathrm{dec}(s) = i-1$, $\mathrm{dec}(w) = j-1$, the $\mathbf{1}\{\cdot\}$'s are indicator functions, $p_\gamma = p^{l}(1-p)^{n-l}$, and $l$, the number of 1's in the random vector $\gamma = (\gamma_1, \ldots, \gamma_n)$, equals the Hamming distance between s and w.

Taking a closer look at Eq. (6), we find that T is the sum of a perturbation-free transition matrix $\bar{T}$ and a perturbation matrix $\tilde{T}$,

$\bar{T} = (1-p)^n \sum_{j=1}^{r} c_j T_j,$ (7)
$\tilde{T} = [\tilde{t}_{ij}], \quad \tilde{t}_{ij} = p^{\eta_{ij}} (1-p)^{n-\eta_{ij}}\, \mathbf{1}\{i \neq j\},$ (8)

where $T_j$ and $c_j$ are the transition probability matrix and the network selection probability of the j-th Boolean network, respectively; $\eta_{ij}$ is the Hamming distance between states s and w, with dec(s) = i–1 and dec(w) = j–1.

$T_j$ is sparse, with only 2^n non-zero entries (out of 2^n × 2^n), each corresponding to a state transition driven by the network function $f_j$ and involving n computations; $\tilde{T}$ depends only on n and p, with each entry involving n computations. Thus, the computational complexity of constructing T is $O(nr2^n)$ [9].
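As an illustration of Eqs. (7) and (8), the following Python sketch builds the transition probability matrix of a PBN from its component network functions. It is a direct, unoptimized implementation (it enumerates all 2^n states, so it is only practical for small n); the function names are illustrative.

```python
import numpy as np
from itertools import product

def bn_transition_matrix(f, n):
    """T_j for one component BN: T_j[i, k] = 1 iff the network function f maps
    the state with integer code i to the state with integer code k."""
    N = 2 ** n
    Tj = np.zeros((N, N))
    for i, s in enumerate(product([0, 1], repeat=n)):      # dec(s) = i
        w = f(s)
        Tj[i, int("".join(map(str, w)), 2)] = 1.0
    return Tj

def pbn_transition_matrix(fs, c, p, n):
    """T = (1-p)^n * sum_j c_j T_j + T_tilde  (Eqs. (7)-(8)).
    fs: list of network functions; c: selection probabilities; p: perturbation rate."""
    N = 2 ** n
    T_bar = (1 - p) ** n * sum(cj * bn_transition_matrix(fj, n) for fj, cj in zip(fs, c))
    states = list(product([0, 1], repeat=n))
    T_tilde = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i != j:
                eta = sum(a != b for a, b in zip(states[i], states[j]))  # Hamming distance
                T_tilde[i, j] = p ** eta * (1 - p) ** (n - eta)
    return T_bar + T_tilde
```

Each row of the resulting matrix sums to 1, since the perturbation entries contribute $1-(1-p)^n$ and the weighted component-BN entries contribute $(1-p)^n$.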

3.1.2. Boolean Models are Markov Chains

Given the definition of the T matrix in Section 3.1.1, we can see that a Boolean model with n genes is a homogeneous Markov chain with N = 2^n states, with T being the Markov matrix and $\sum_{j=1}^{2^n} t_{ij} = 1$ for every i. A state x (a binary vector of length n) in the Boolean model has a one-to-one correspondence with the i-th state (1 ≤ i ≤ N) of the associated Markov chain via dec(x) = i–1.

What is the use of the matrix T? Letting $W = T^n$, one can show that the (i,j)-th element of W equals the probability of transition from the i-th state to the j-th state of the Markov chain in n steps, $w_{ij} = \Pr\{x(t+n) = z \mid x(t) = y\}$, where dec(z) = j–1 and dec(y) = i–1.

The proof is left as an exercise to the reader.

An N -state Markov chain possesses a stationary distribution (or invariant distribution) if there exists a probability distribution π = (π1,...,πN) such that

π = πT.

π = πT implies $\pi = \pi T^n$ for all n. Thus, in a Markov chain with stationary distribution π, if we start from the i-th state with probability $\pi_i$, the chance of being in any state j after an arbitrary number of steps is always $\pi_j$.

An N-state Markov chain possesses a steady-state distribution $\pi^* = (\pi_1^*, \ldots, \pi_N^*)$ if, starting from any initial distribution π,

$\pi^* = \lim_{k \to \infty} \pi T^k,$

which means that regardless of the initial state, the probability of the Markov chain being in the i-th state in the long run is $\pi_i^*$. A Markov chain possessing a stationary distribution does not necessarily possess a steady-state distribution.

Why should it concern us whether the Markov chain has a steady-state distribution or not? It is because we are interested in the Boolean model associated with the Markov chain and would like to know how it behaves in the long run. As a reminder, the attractors of a Boolean model are often associated with cellular phenotypes, so by finding the steady-state probabilities of a given attractor, we get a general picture of the likelihood of a certain phenotype. When a Boolean model (namely, its Markov chain) possesses a steady-state distribution, we can find those probabilities by simulating the model for a long time, starting from an arbitrary initial state x(0). In fact, this implies the equivalence of “space average” and “time average”, a common concept in stochastic processes.

When will a Markov chain possess a steady-state distribution? It turns out that an ergodic Markov chain will do. A Markov chain is said to be ergodic if it is irreducible and aperiodic [10].

Definition 5 A Markov chain is irreducible if it is possible to go from every state to every state (not necessarily in one move).

Definition 6 In a Markov chain, a state has period d if, starting from this state, the chain can return to it only in numbers of steps that are multiples of d. A state is periodic if it has some period > 1. A Markov chain is aperiodic if none of its states is periodic.

A Boolean network possesses a stationary distribution, but not a steady-state distribution unless it has one singleton attractor and no other attractors. Here we show how to find a stationary distribution. Assume a BN has m singleton attractors $a_1, \ldots, a_m$, or an attractor cycle $\{a_1, \ldots, a_m\}$, where $\mathrm{dec}(a_1) = i_1 - 1, \ldots, \mathrm{dec}(a_m) = i_m - 1$; then π with $\pi_{i_1} = \cdots = \pi_{i_m} = 1/m$ and $\pi_j = 0$ for $j \notin \{i_1, \ldots, i_m\}$ is a stationary distribution (the proof is left as an exercise to the reader). If a BN has a combination of singleton attractors and cycles, π can be constructed such that the probabilities corresponding to the singleton attractors are equal, the probabilities corresponding to the states within each attractor cycle are equal, and $\sum_{i=1}^{N} \pi_i = 1$. When there is only one attractor in the BN, the stationary distribution is unique.
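As a quick numerical check of this construction (a sketch only, reusing bn_transition_matrix() and the Example 1 functions f1, f2, f3 from the earlier sketches), the three-gene BN of Example 1 has singleton attractors 000 and 101 and the attractor cycle {001, 100}, so a distribution with equal mass on the attractors and equal probabilities inside the cycle is stationary:

```python
import numpy as np

n = 3
T = bn_transition_matrix(lambda x: (f1(x), f2(x), f3(x)), n)   # single BN, no perturbation

# Equal mass (1/3) on each attractor; split the cycle's mass evenly over its two states.
pi = np.zeros(2 ** n)
pi[0b000] = pi[0b101] = 1.0 / 3
pi[0b001] = pi[0b100] = 1.0 / 6

assert np.allclose(pi, pi @ T)      # pi = pi T, so pi is stationary
```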

When p, q > 0, a PBN possesses a steady-state distribution, because the Markov chain corresponding to the PBN is ergodic. Interested readers can find the proof in [3]. Given that a PBN has a steady-state distribution, we can obtain it in two ways: (1) by solving the linear equations $\pi(T - I) = 0$, $\sum_{i=1}^{N} \pi_i = 1$ (I is the identity matrix); interested readers can consult books on linear algebra; or (2) by using the empirical methods in Section 4.3. If we are interested in the steady-state probabilities of the attractors only, an analytic method exists, to be discussed next.

3.2. Analytic Method for Computing the Steady-State Probabilities of Attractors

Recall from Section 2 that attractors are important to the long-term behavior of Boolean models because they are associated with cellular phenotypes; now we also know that PBNs possess steady-state distributions, which means that a PBN has a unique long-term trend independent of initial state. Therefore, we would naturally ask the question, how can we find the long-term probabilities of these attractors which are so important to us?

In the following, we will present a Markov chain based analytic method that answers this question, and more details, including proofs, can be found in [11].

3.2.1. Steady-State Distributions of Attractors in a BN with Perturbations

First consider a special case of PBN, Boolean network with perturbations (BNp), in which any gene has a probability p of flipping its value. A BNp inherits all the attractors and corresponding basins of attraction from the original BN. Because of the random gene perturbations, BNps possess steady-state distributions (the proof is similar to that of PBN, and it is left as an exercise to the reader).

A BNp defined on V = {x1,...,xn} with gene perturbation rate p can be viewed as a homogeneous irreducible Markov chain $X_t$ with state space $\{0,1\}^n$. Let $x, y \in \{0,1\}^n$ be any two states; then at any time t, $P_y(x) = \Pr\{X_{t+1} = x \mid X_t = y\}$ is the probability of a state transition from y to x.

For $X_t$, there exists a unique steady-state distribution π. Let the steady-state probability of state x be π(x), and let $B \subseteq \{0,1\}^n$ be a collection of states; then the steady-state probability of B is $\pi(B) = \sum_{x \in B} \pi(x)$.

Assume the BNp has attractors A1,...,Am, with corresponding basins of attraction (or simply referred to as basins) B1,...,Bm. Since the attractors are subsets of the basins,

$\pi(A_k) = \pi(A_k \mid B_k)\, \pi(B_k).$ (9)

Therefore, we can compute the steady-state probability of any attractor $A_k$ in two steps: (1) compute the steady-state probability of the basin $B_k$, π(B_k); and (2) compute the conditional probability of the attractor $A_k$ given that the system is in $B_k$, π(A_k | B_k).

(I). Obtaining the Steady-State Probability of Basin, π(Bk)

Define a random variable τ(t) which measures the time elapsed between the last perturbation and the current time t. τ(t) = 0 means a perturbation occurs at t. For any starting state h, let

$P_{B_i \to B_k} = \lim_{t \to \infty} \Pr\{X_t \in B_k \mid X_{t-1} \in B_i, X_0 = h, \tau(t) = 0\},$ (10)

and define the conditional probability of being in state $x \in B$, given that the system is inside a set B, prior to a perturbation,

$\pi^*(x \mid B) := \lim_{t \to \infty} \Pr\{X_{t-1} = x \mid X_{t-1} \in B, X_0 = h, \tau(t) = 0\}.$ (11)

The following theorem represents the steady-state distribution of the basins as the solution of a group of linear equations, where the coefficients are the $P_{B_i \to B_k}$'s. The lemma that follows gives the formula for the coefficients.

Theorem 1

$\pi(B_k) = \sum_{i=1}^{m} P_{B_i \to B_k}\, \pi(B_i).$ (12)

Lemma 1

$P_{B_i \to B_k} = \sum_{x \in B_k} \sum_{y \in B_i} P_y(x)\, \pi^*(y \mid B_i),$ (13)

where $P_y(x)$ is the probability that the state transition goes from y to x in one step by gene perturbation.

Now the only unknown is $\pi^*(y \mid B_i)$. When p is small, the system spends the majority of its time inside an attractor, and we can use the following approximation,

$\pi^*(y \mid B_i) \approx \frac{1}{|A_i|}\, \mathbf{1}\{y \in A_i\},$ (14)

where |Ai| is the cardinality of Ai. Therefore,

$P_{B_i \to B_k} \approx \frac{1}{|A_i|} \sum_{x \in B_k} \sum_{y \in A_i} P_y(x).$ (15)
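The following Python sketch evaluates the approximation of Eq. (15), given the attractors, the basins, and the perturbation rate p. One modeling assumption is made explicit here: $P_y(x)$ is taken to be the gene-perturbation transition probability conditioned on at least one gene flipping (hence the normalization by $1-(1-p)^n$), so that each row of the resulting basin-to-basin matrix sums to one, as Theorem 1 requires. States are tuples of bits, and all names are illustrative.

```python
import numpy as np

def perturbation_prob(y, x, p):
    """P_y(x): probability that a random gene perturbation moves state y to x,
    conditioned on at least one gene flipping (an assumption; see the lead-in)."""
    n = len(y)
    eta = sum(a != b for a, b in zip(y, x))        # Hamming distance
    if eta == 0:
        return 0.0
    return p ** eta * (1 - p) ** (n - eta) / (1 - (1 - p) ** n)

def basin_transition_matrix(attractors, basins, p):
    """Approximate P(B_i -> B_k) by Eq. (15): average over the attractor states y
    of A_i of the probability that a perturbation lands in basin B_k."""
    m = len(basins)
    P = np.zeros((m, m))
    for i, Ai in enumerate(attractors):
        for k, Bk in enumerate(basins):
            P[i, k] = sum(perturbation_prob(y, x, p) for x in Bk for y in Ai) / len(Ai)
    return P
```

With this matrix, Theorem 1 reduces to solving $\pi_B = \pi_B P$ together with the normalization $\sum_k \pi(B_k) = 1$, which can be done with a linear solver or by power iteration as in Section 4.3.1.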
(II). Obtaining the Steady-State Probability of Attractor, π(Ak)

Lemma 2 For basin Bk, initial state h, and fixed value j≥0,

$\lim_{t \to \infty} \Pr\{X_{t-j} = x \mid X_{t-j} \in B_k, X_0 = h, \tau(t) = j\} = \frac{1}{\pi(B_k)} \sum_{i=1}^{m} \sum_{y \in B_i} P_y(x)\, \pi^*(y \mid B_i)\, \pi(B_i).$ (16)

Lemma 3 If $\delta(x, A_k)$ is the number of iterations of f needed to reach the attractor $A_k$ from the state x, then for any such x and $b < 1$,

$\sum_{j=\delta(x, A_k)}^{\infty} (1-b)\, b^{j} = b^{\delta(x, A_k)}.$ (17)

Applying the two lemmas and letting $b = (1-p)^n$, we can obtain the steady-state probability of the attractor $A_k$.

Theorem 2

$\pi(A_k) = \sum_{i=1}^{m} \sum_{x \in B_k} \sum_{y \in B_i} P_y(x)\, \pi^*(y \mid B_i)\, (1-p)^{n\,\delta(x, A_k)}\, \pi(B_i).$ (18)

When p is small, using the approximation in Eq. (14), we have

$\pi(A_k) \approx \sum_{i=1}^{m} \frac{1}{|A_i|} \sum_{x \in B_k} \sum_{y \in A_i} P_y(x)\, (1-p)^{n\,\delta(x, A_k)}\, \pi(B_i).$ (19)

3.2.2. Steady-State Distributions of Attractors in a PBN

In a PBN, we represent the pair (x,f) as the state of a homogeneous Markov chain, (Xt,Ft), and the transition probabilities are defined as

$P_{(y,g)}(x, f) = \Pr\{X_{t+1} = x, F_{t+1} = f \mid X_t = y, F_t = g\}.$ (20)

Assume the PBN is composed of r BNs $\beta_1(V, f_1), \ldots, \beta_r(V, f_r)$. Within BN $\beta_k$, the attractors and basins are denoted $A_{ki}$ and $B_{ki}$, $i = 1, \ldots, m_k$. The computation of the steady-state probabilities is now split into three steps: (1) the steady-state probabilities $\pi(B_{ki}, f_k)$ of the basins, (2) the conditional probabilities $\pi(A_{ki}, f_k \mid B_{ki}, f_k)$, and (3) an approximation to the marginal steady-state probabilities $\pi(A_{ki})$ (since different BNs may have the same attractor).

The computations in steps (1) and (2) are similar to those for the BNp, with $(B_{ki}, f_k)$ in place of $B_k$ whenever applicable, and there is one extra summation $\sum_{k=1}^{r}$ over the r component BNs. Interested readers can find the details in [11].

From steps (1) and (2), we can obtain π(Aki,fk). The last step sums up π(Aki,fl) over l whenever the l -th BN has Aki as an attractor,

$\pi(A_{ki}) = \sum_{l=1}^{r} \pi(A_{ki}, f_l).$ (21)

Since π(Aki,fl) is unknown when k ≠ l, we use the following approximation when p is small,

$\pi(A_{ki}, f_l \mid A_{lj}, f_l) \approx \frac{|A_{ki} \cap A_{lj}|}{|A_{lj}|}.$ (22)

Thus,

$\pi(A_{ki}, f_l) \approx \sum_{j=1}^{m_l} \pi(A_{ki}, f_l \mid A_{lj}, f_l)\, \pi(A_{lj}, f_l) \approx \sum_{j=1}^{m_l} \frac{|A_{ki} \cap A_{lj}|}{|A_{lj}|}\, \pi(A_{lj}, f_l),$ (23)

and

$\pi(A_{ki}) \approx \sum_{l=1}^{r} \sum_{j=1}^{m_l} \frac{|A_{ki} \cap A_{lj}|}{|A_{lj}|}\, \pi(A_{lj}, f_l).$ (24)

3.3. Relationship Between PBNs and Bayesian Networks

Bayesian networks (BaNs) are graphical models that describe the conditional probabilistic dependencies between variables, and they have been used to model genetic regulatory networks [12]. An advantage of BaNs is that they involve model selection to optimally explain the observed data [2]; BaNs can use either continuous or discrete variables, which makes them more flexible for modeling. In comparison, Boolean models have explicit regulatory rules that carry biological information, which can be more appealing to biologists than the statistical representation of BaNs. Although Boolean models use binary-quantized variables, which limits the data usage, they are computationally less complex than BaNs when learning the network structure from data (see Section 3.3 of [2] for a more detailed discussion and references). Since network structure learning is out of the scope of this article, interested readers can refer to [12] for Bayesian learning, [13] for Boolean network learning, and [8, 14] for PBN learning.

While BNs are deterministic, PBNs and BaNs are related by their probabilistic nature; like PBNs, dynamic BaNs can be considered Markov chains too. In the following analysis, we show that an equivalence between PBNs and dynamic BaNs can be established under certain conditions [15]. In this analysis, the random gene perturbation rate p of the PBN is assumed to be 0.

A BaN with n random variables X1,...,Xn (not necessarily binary) is represented by Ba(H,Θ), where H is a directed acyclic graph whose vertices correspond to the n variables and Θ is a set of conditional probability distributions induced by graph H. Letting X = (X1,...,Xn), xi be a realization of the random variable Xi, and Pa(Xi) be the parents of Xi, the unique joint probability distribution over the n variables is given by

$\Pr\{x_1, \ldots, x_n\} = \prod_{i=1}^{n} \Pr\{x_i \mid Pa(X_i)\}.$

A dynamic Bayesian network (DBN) is a temporal extension of a BaN, and consists of two parts: (1) an initial BaN $Ba_0 = (H_0, \Theta_0)$ that defines the joint distribution of the variables $x_1(0), \ldots, x_n(0)$, and (2) a transition BaN $Ba_1 = (H_1, \Theta_1)$ that defines the transition probabilities $\Pr\{X(t) \mid X(t-1)\}$ for all t. Letting x represent a realization of X, the joint distribution of $X(0), \ldots, X(T)$ can be expressed by

$\Pr\{x(0), \ldots, x(T)\} = \Pr\{x(0)\} \prod_{t=1}^{T} \Pr\{x(t) \mid x(t-1)\} = \prod_{i=1}^{n} \Pr\{x_i(0) \mid Pa(X_i(0))\} \prod_{t=1}^{T} \prod_{j=1}^{n} \Pr\{x_j(t) \mid Pa(X_j(t))\}.$ (25)

In a PBN G(V, F, c), where $V = \{x_1, \ldots, x_n\}$, $x_i \in \{0,1\}$, and $F = \{f_1, \ldots, f_r\}$, the joint probability distribution of states over the time period [0, T] can be expressed as

$\Pr\{x(0), \ldots, x(T)\} = \Pr\{x(0)\} \prod_{t=1}^{T} \Pr\{x(t) \mid x(t-1)\}.$

For an independent PBN,

$\Pr\{x(0), \ldots, x(T)\} = \Pr\{x(0)\} \prod_{t=1}^{T} \prod_{i=1}^{n} \Pr\{x_i(t) \mid x(t-1)\}.$ (26)

3.3.1. An Independent PBN as a Binary-Valued DBN

Let the independent PBN be G(V, Ψ, α) (the alternative representation following Definition 2). First, since a BaN can represent an arbitrary joint distribution, the distribution of the initial state of the PBN, $\Pr\{x(0)\}$, can be represented by some $Ba_0$. Second, to construct $Ba_1(H_1, \Theta_1)$ from the PBN, we let the set $X_j^i \subseteq V$ denote the regulators of gene $x_i$ in function $\psi_j^{(i)}$, so that

$Pa(x_i) = \bigcup_{j=1}^{l(i)} X_j^i.$ (27)

We construct the graph $H_1$ with two layers of nodes: the first layer has nodes $X_1(t-1), \ldots, X_n(t-1)$, the second layer has nodes $X_1(t), \ldots, X_n(t)$, and there exists a directed edge from $X_k(t-1)$ to $X_i(t)$ if there is a $j \in \{1, \ldots, l(i)\}$ in the PBN such that $x_k \in X_j^i$. Thus in $H_1$, $Pa(X_i(t))$ corresponds to the set of all possible regulators of $x_i$ in the PBN.

Let $D_i$ be the joint distribution of the variables in $Pa(x_i)$, and recall that $\alpha_j^{(i)} = \Pr\{\psi_j^{(i)} \text{ is used}\}$; then

$\Pr\{X_i = 1\} = \sum_{j=1}^{l(i)} \Pr\{X_i = 1 \mid \psi_j^{(i)} \text{ is used}\} \cdot \alpha_j^{(i)}$ (28)
$= \sum_{j=1}^{l(i)} \sum_{x \in \{0,1\}^{|Pa(X_i)|}} D_i(x)\, \psi_j^{(i)}(x) \cdot \alpha_j^{(i)}$ (29)
$= \sum_{x \in \{0,1\}^{|Pa(X_i)|}} D_i(x) \sum_{j=1}^{l(i)} \psi_j^{(i)}(x)\, \alpha_j^{(i)},$ (30)

and we have

$\Pr\{X_i(t) = 1 \mid Pa(X_i(t)) = z\} = \sum_{j=1}^{l(i)} \psi_j^{(i)}(z)\, \alpha_j^{(i)}.$ (31)

Eq. (31) defines $\Theta_1$ (induced by $H_1$) for each node; thus any independent PBN G(V, Ψ, α) can be expressed as a binary DBN $(Ba_0, Ba_1)$.

Remark 2 Strictly speaking, the input variables of $\psi_j^{(i)}$ are a subset of $Pa(x_i)$, so the notations in Eqs. (29)-(31) are not accurate when we use the same vector x (or z) for $\psi_j^{(i)}$ and for $D_i$ (or $Pa(X_i(t))$). We should understand that these notations are used only as a convenience.
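The construction of Eq. (31) can be carried out mechanically; the short Python sketch below tabulates the conditional probability table of one DBN node from a PBN node's function set and selection probabilities (following Remark 2, every function here takes the full parent vector; the two-function node and its numbers are made up for illustration).

```python
from itertools import product

def node_cpt(psi_i, alpha_i, num_parents):
    """Pr{X_i(t) = 1 | Pa(X_i(t)) = z} = sum_j psi_j^(i)(z) * alpha_j^(i)  (Eq. (31)).
    psi_i: list of Boolean functions of the parent vector; alpha_i: their probabilities."""
    cpt = {}
    for z in product([0, 1], repeat=num_parents):
        cpt[z] = sum(a * psi(z) for psi, a in zip(psi_i, alpha_i))
    return cpt

# Hypothetical node regulated by AND of its two parents with probability 0.7,
# and by OR of its two parents with probability 0.3.
psi_i = [lambda z: z[0] & z[1], lambda z: z[0] | z[1]]
alpha_i = [0.7, 0.3]
print(node_cpt(psi_i, alpha_i, num_parents=2))
# {(0, 0): 0.0, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 1.0}
```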

3.3.2. A Binary-Valued DBN as an Independent PBN

Assume a DBN $(Ba_0, Ba_1)$ defined on $X = (X_1, \ldots, X_n)$ is given, where the $X_i$'s are binary-valued random variables. We now demonstrate how to construct a PBN. Define the set of nodes V = {x1,...,xn} in the PBN corresponding to $X_1, \ldots, X_n$, and let the distribution of the PBN initial state $x(0) = (x_1(0), \ldots, x_n(0))$ match $\Theta_0$ in $Ba_0(H_0, \Theta_0)$.

In $Ba_1(H_1, \Theta_1)$, assume $Pa(X_i(t))$ contains $k_i$ variables $X_{i_1}, \ldots, X_{i_{k_i}}$. For each $X_i$, we enumerate each conditional probability regarding $X_i(t)$ in $\Theta_1$ as a triplet $(z_j, y_j, p_j)$, with $z_j \in \{0,1\}$, $y_j = y_{j1} y_{j2} \ldots y_{jk_i} \in \{0,1\}^{k_i}$, and $p_j = \Pr\{X_i(t) = z_j \mid Pa(X_i(t)) = y_j\}$; there are $2^{k_i+1}$ such triplets. The triplets are arranged such that the first $2^{k_i}$ of them have $z_j = 1$, with their $p_j$'s in ascending order. For every $j \leq 2^{k_i}$, define a sequence of symbols $\tilde{x}_j = \tilde{x}_{j1} \tilde{x}_{j2} \ldots \tilde{x}_{jk_i}$, where the symbol $\tilde{x}_{jd}$ is the variable $x_{i_d}$ if $y_{jd} = 1$, and the negation $\bar{x}_{i_d}$ if $y_{jd} = 0$.

Letting $l(i) = 2^{k_i} + 1$, we define the set of l(i) Boolean functions for gene $x_i$ in the PBN as $\Psi_i = \{\psi_1^{(i)}, \ldots, \psi_{l(i)-1}^{(i)}, \psi_{l(i)}^{(i)}\}$, where

$\psi_m^{(i)} = \tilde{x}_m \vee \tilde{x}_{m+1} \vee \cdots \vee \tilde{x}_{l(i)-1}, \quad \text{for } 1 \leq m \leq l(i)-1,$ (32)

is a disjunction of conjunctions (each $\tilde{x}_s$ standing for the conjunction of its $k_i$ symbols), and $\psi_{l(i)}^{(i)}$ is the zero function. Defining the corresponding function selection probabilities $\alpha_1^{(i)} = p_1$, $\alpha_m^{(i)} = p_m - p_{m-1}$ for $1 < m \leq l(i)-1$, and $\alpha_{l(i)}^{(i)} = 1 - p_{l(i)-1}$, it can be verified that

$\Pr\{X_i(t) = 1 \mid Pa(X_i(t)) = y_j\} = \sum_{m=1}^{l(i)} \psi_m^{(i)}(y_j)\, \alpha_m^{(i)} = \sum_{m=1}^{j} \psi_m^{(i)}(y_j)\, \alpha_m^{(i)} = p_1 + \sum_{m=2}^{j} (p_m - p_{m-1}) = p_j.$ (33)

Therefore, a binary DBN can be represented as a PBN G(V, Ψ, α), where $\Psi = \{\Psi_1, \ldots, \Psi_n\}$, $\alpha = \{\alpha_1, \ldots, \alpha_n\}$, and $\alpha_i = \{\alpha_1^{(i)}, \ldots, \alpha_{l(i)}^{(i)}\}$. It should be noted that the mapping from a binary DBN to an independent PBN is not unique, and the above representation is one solution.

Summarizing subsections 3.3.1 and 3.3.2, we have the following theorem [15].

Theorem 3 Independent PBNs G(V, Ψ, α) and binary-valued DBNs $(Ba_0, Ba_1)$, whose initial and transition BaNs $Ba_0$ and $Ba_1$ are assumed to have only within-slice and between-consecutive-slice connections, respectively, can represent the same joint distribution over their common variables.

3.4. Structural Analysis

Boolean models, like any other networks, raise two issues of interest: Is the model robust? Is the model controllable? From the standpoint of system stability, we require the model to be robust, namely, resistant to small changes in the network; from the standpoint of network intervention, we desire that the network be controllable, such that it responds to certain perturbations. There needs to be a balance between the two properties. These two questions encourage researchers to do the following: (1) find structural properties of the network that are related to robustness and controllability; (2) seek ways to analyze the effect of perturbations and to design control techniques.

In 3.4.1, (1) is addressed. We review some structural measures of Boolean models that quantify the propagation of expression level change from one gene to others (or vice versa). In 3.4.2, (2) is partly addressed, where we review structural perturbations, and present a methodology that analyzes the perturbation on Boolean functions. Since the control techniques are out of the scope of this paper, interested readers can find more information in the review articles [16, 17].

3.4.1. Quantitative Measures of the Structure

In gene regulatory networks, the interactions among genes are reflected in two things: the connections among genes, and the Boolean functions defined upon those connections. Whether we are interested in robustness or in controllability, it all boils down to one central question: how does a change in the expression level of one gene lead to changes in other genes in the network, and vice versa? Here, we introduce three measures of structural properties related to this question: canalization, influence and sensitivity.

When a gene is regulated by several parent genes through function f, some parent genes can be more important in determining its value than others. An extreme case is canalizing function, in which one variable (canalizing variable) can determine the function output regardless of other variables.

Definition 7 [18] A Boolean function f : {0,1}^n → {0,1} is said to be canalizing if there exist an $i \in \{1, \ldots, n\}$ and $u, v \in \{0,1\}$ such that for all $x_1, \ldots, x_n \in \{0,1\}$, if $x_i = u$ then $f(x_1, \ldots, x_n) = v$.

In gene regulatory networks, canalizing variables are also referred to as the master genes. Canalization is commonly observed in real organisms, and it plays an important role in the stability of genetic regulation, as discussed in [19, 20]. Mathematically, researchers have shown that canalization is associated with the stability of Boolean networks. For more theoretical work, see [21-23].

Other than canalization, the degree of gene-gene interaction can be described in more general terms, and we define two quantitative measures, influence and sensitivity, as follows.

Consider a Boolean function f with input variables x1,...,xn. Letting x = (x1,...,xn), we define the influence of a gene on the function f.

Definition 8 [2] The influence of a variable xj on the Boolean function f is the expectation of the partial derivative with respect to the distribution D(x),

$I_j(f) = E_D\!\left[\frac{\partial f(x)}{\partial x_j}\right] = \Pr\left\{\frac{\partial f(x)}{\partial x_j} = 1\right\} = \Pr\{f(x) \neq f(x^{(j)})\}.$ (34)

Note that the partial derivative of f with respect to $x_j$ is

$\frac{\partial f(x)}{\partial x_j} = f(x) \oplus f(x^{(j)}),$ (35)

in which $x^{(j)} = (x_1, \ldots, 1-x_j, \ldots, x_n)$ (with $x_j$ toggled).

In a BN, each node $x_i$ has one regulatory function $f_i$, so the influence of node $x_j$ (assuming it regulates $x_i$) on $x_i$ is $I_j(x_i) = I_j(f_i)$. In a PBN, letting the set of regulating functions for $x_i$ be $\psi_1^{(i)}, \ldots, \psi_{l(i)}^{(i)}$, with function selection probabilities $\alpha_1^{(i)}, \ldots, \alpha_{l(i)}^{(i)}$, the influence of gene $x_j$ on $x_i$ is

$I_j(x_i) = \sum_{k=1}^{l(i)} I_j\big(\psi_k^{(i)}\big) \cdot \alpha_k^{(i)}.$ (36)

Thus, for a Boolean model with n genes, an influence matrix Γ of dimension n × n can be constructed, with its (i, j) element being $\Gamma_{ij} = I_i(x_j)$. We can define the influence of gene $x_i$ to be the collective influence of $x_i$ on all other genes,

$r(x_i) = \sum_{j=1}^{n} \Gamma_{ij}.$ (37)

Related to influence, we define the sensitivity of a function,

$s_x(f) = \sum_{j=1}^{n} \big[f(x) \oplus f(x^{(j)})\big].$ (38)

Then the average sensitivity of f with respect to distribution D is

$s(f) = E_D[s_x(f)] = \sum_{j=1}^{n} E_D\big[f(x) \oplus f(x^{(j)})\big] = \sum_{j=1}^{n} I_j(f).$ (39)

The average sensitivity measures how much, on average, the function f changes between Hamming-distance-one neighbors (i.e., input vectors that differ by one bit). For PBNs, the average sensitivity of gene $x_i$ is (cf. Eq. (37))

$s(x_i) = \sum_{j=1}^{n} I_j(x_i) = \sum_{j=1}^{n} \Gamma_{ji}.$ (40)

Biologically, the influence of a gene indicates its overall impact on other genes. A gene with high influence has the potential to regulate the system dynamics, and its perturbation has significant downstream effects. The sensitivity of a gene measures its stability or autonomy. Low sensitivity means that other genes have little effect on it, and “house-keeping” genes usually have this property [2]. It has been shown that such quantitative measures (or variants) can help guide the control of genetic networks [24] and aid in steady-state analysis [25].
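For a single Boolean function, both measures can be computed by exhaustive enumeration of the input space. The Python sketch below does this under the uniform distribution D (an assumption; any other distribution would simply reweight the sum); the XOR example reuses f3 of Example 1, for which every input is fully influential.

```python
from itertools import product

def influence(f, n, j):
    """I_j(f) = Pr{ f(x) != f(x^(j)) } under the uniform distribution on {0,1}^n (Eq. (34))."""
    changed = 0
    for x in product([0, 1], repeat=n):
        xj = list(x)
        xj[j] = 1 - xj[j]                   # toggle the j-th input
        changed += f(x) != f(tuple(xj))
    return changed / 2 ** n

def average_sensitivity(f, n):
    """s(f) = sum_j I_j(f)  (Eq. (39))."""
    return sum(influence(f, n, j) for j in range(n))

f_xor = lambda x: x[0] ^ x[1]               # f3 of Example 1
print([influence(f_xor, 2, j) for j in range(2)])    # [1.0, 1.0]
print(average_sensitivity(f_xor, 2))                  # 2.0
```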

3.4.2. Structural Perturbation Analysis

There are two types of perturbation on Boolean models: perturbation on network states and perturbation on network structure. The former refers to a sudden (forced or spontaneous) change in the current state from x to x' , which causes the system dynamics to be disturbed temporarily. Such disturbance is transient in nature, because the network nodes and connections are intact, and the underlying gene regulation principles do not change. Therefore, the network attractors and the basins of attraction remain the same. However, if the perturbed Boolean model has multiple attractors, state perturbations may cause convergence to a different attractor than the original one, and may change the steady-state distribution of the network. This type of perturbation has been studied extensively (e.g. [26]), and finds its use in network control (e.g. [6]).

Perturbation on network structure refers to any change in the “wiring” or functions of the network. For instance, we may remove or add a gene to the network, change connections among genes, change the Boolean functions, or even change the synchronous Boolean network to an asynchronous model (where not all the genes are updated at the same time). Structural perturbation is more complex and less studied, compared to state perturbation. When network structure is perturbed, the network attractors and basins of attraction will be impacted, therefore the long-term consequence is more difficult to gauge than that of state perturbation.

The reasons for studying structural perturbation are: (1) the modeling of gene regulatory networks is subject to uncertainty, and it is desirable to study the effect of small differences between network models on the dynamic behavior; (2) it is likely that gene regulation, like other biological functions, has intrinsic stochasticity, and it is of interest to predict the consequences of any perturbation in regulation; (3) changing the network structure can alter the network steady-state distribution, so structural perturbation can be an alternative means (with respect to state perturbation) of network control [25, 27, 28].

In [8], the authors developed theories to predict the impact of function perturbations on network dynamics and attractors, and the main results are presented below. For more applications, see [28]. For further analysis in terms of the steady-state distribution and applications to network intervention, see [25].

Problem formulation. Given a Boolean network β(V,f), V = {x1,...,xn}, f = (f1,...,fn) , if one or more functions have one or more flips on their truth table outputs, we would like to predict the effect on state transitions and attractors.

Assume gene $x_i$ has $k_i$ regulators $x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}$; then the truth table of $f_i$ has $2^{k_i}$ rows, as shown below. The input vector on row j is denoted $a_j^i \in \{0,1\}^{k_i}$; for instance, $a_1^i = 00\ldots0$. If we flip the output on row j, we call it a one-bit function perturbation on $f_i$ and denote the perturbed function by $f_i^{(j)}$.

Row label   $x_{i_1} x_{i_2} \cdots x_{i_{k_i}}$   $f_i(\cdot)$
1   00...0   0
2   00...1   1
...   ...   ...
$2^{k_i}$   11...1   0

Any state transition $s \to w$ consists of n mappings, $f_i: s \mapsto w_i$. We define $In_i(s) = (s_{i_1}, s_{i_2}, \ldots, s_{i_{k_i}})$, the sub-vector of s that corresponds to the regulators of $x_i$.

The following proposition and corollaries state the basic effects of one-bit function perturbation on the state transitions and attractors. Proofs and extensions to two-bit perturbations can be found in [28].

Proposition 1 A state transition $s \to w$ is affected by the one-bit perturbation $f_i \to f_i^{(j)}$ if and only if $In_i(s) = a_j^i$. If the state transition is affected, the new state transition will be $s \to w^{(i)}$, where $w^{(i)}$ is defined to be the same as w except that the i-th digit is flipped.

Corollary 1 If $x_i$ has $k_i$ regulators, then the one-bit perturbation $f_i \to f_i^{(j)}$ will result in $2^{n-k_i}$ changed state transitions, equivalent to $2^{n-k_i}$ altered edges in the state transition diagram.

Corollary 2 (Invariant singleton attractor) Suppose state S is a singleton attractor. It will no longer be a singleton attractor following the one-bit perturbation $f_i \to f_i^{(j)}$ if and only if $In_i(S) = a_j^i$.

Corollary 3 (Emerging singleton attractor) A non-singleton-attractor state S becomes a singleton attractor as a result of the one-bit perturbation $f_i \to f_i^{(j)}$ if and only if the following are true: (1) $In_i(S) = a_j^i$, and (2) absent the perturbation, $S \to S^{(i)}$.

We use the following toy example to demonstrate the above results. From these results, more applications can be derived, such as controlling the network steady-state distribution through function perturbation, or identifying functional perturbation by observing phenotype changes [28].

Example 1 Consider a BN with n = 3 genes,

$x_1(t+1) = x_3(t),$ (41)
$x_2(t+1) = 0,$ (42)
$x_3(t+1) = x_1(t)\bar{x}_2(t) + \bar{x}_1(t) x_2(t),$ (43)

where the truth table of f3 is shown below and the state transition diagram is shown in Fig. (2).

Fig. (2). State transition diagram of the original BN, Example 1.

Row label x1 x2 f3(.)
1 0 0 0
2 0 1 1
3 1 0 1
4 1 1 0

If a one-bit perturbation forces $f_3$ to become $f_3^{(3)}$, then since $k_3 = 2$, two state transitions will be affected. By Proposition 1, states 100 and 101 no longer transit to 001 and 101 but to 000 and 100, respectively. Because of that, the attractor cycle {001, 100} will be affected. Moreover, Corollary 2 predicts that the singleton attractor 000 is robust to the perturbation while 101 is not. The predictions are confirmed by the new state transition diagram shown in Fig. (3).
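The predictions of Proposition 1 for Example 1 can be checked by brute force. In the Python sketch below, the perturbed function $f_3^{(3)}$ is obtained by flipping the output of $f_3$ whenever the regulator values equal the row-3 input (1, 0); enumerating all eight states then reveals exactly the two altered transitions (names are illustrative).

```python
from itertools import product

def f(x):                              # original network function of Example 1
    return (x[2], 0, x[0] ^ x[1])

def f_perturbed(x):                    # f3 -> f3^(3): output flipped when (x1, x2) = (1, 0)
    w = list(f(x))
    if (x[0], x[1]) == (1, 0):
        w[2] = 1 - w[2]
    return tuple(w)

for s in product([0, 1], repeat=3):
    if f(s) != f_perturbed(s):
        print(s, ":", f(s), "->", f_perturbed(s))
# (1, 0, 0) : (0, 0, 1) -> (0, 0, 0)
# (1, 0, 1) : (1, 0, 1) -> (1, 0, 0)
```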

Fig. (3). State transition diagram of the perturbed BN, Example 1.

Finally, the author would like to remind the readers that other works on (various types of) structural perturbation are available. For instance, in [29], the authors added a redundant node to a Boolean network, such that the bolstered network is more resistant to a one-bit function perturbation (as defined above). In [30], the effect of asynchronous updating of a Drosophila segment polarity network model is examined in terms of the phenotypes (steady states). In [25], the authors derived analytical results on how function perturbations affect network steady-state distributions and applied them to structural intervention. In [31], the author modeled gene knockdown and broken regulatory pathways in Boolean networks and analyzed the effects.

4. SIMULATION ISSUES WITH BOOLEAN MODELS

Recall from Section 2 that a Boolean model of n genes has a finite state space, and a BN has deterministic dynamic behavior which can be fully captured by the state transition diagram. A PBN is probabilistic in nature, therefore its state transition is also probabilistic. For both BNs and PBNs, attractors are characteristic of their long-term behavior. Given the above knowledge, if we would like to know anything about a Boolean model, we should find out its state transition diagram and attractors first. This is to be discussed in Section 4.1.

For Boolean models, the most commonly encountered simulation issues include: (1) how to generate the time sequence data of a network, x(0), x(1), ..., x(t), ...; (2) how to find the network steady-state distribution if it exists; and (3) how to produce artificial Boolean models with prescribed attractors to facilitate other studies. Among them, (1) is a basic practice that is utilized in (2) and (3); we deal with the three issues in Sections 4.2, 4.3 and 4.4, respectively. Note that the techniques in Section 4.1 are crucial to all three issues.

4.1. Generating State Transition Diagram and Finding Attractors

To obtain the state transition diagram of a BN, we first compile a state transition table. Assuming n nodes in the network, x1,...,xn, we evaluate the current state x(t) to be 00...0, 00...1, ..., 11...0, and 11...1 in turn, compute their respective x(t+1)'s, and tabulate the results. The states can also be represented by integers instead of binary vectors. Table 1 is an example for n = 3. In practice, we only store the second row (“next states”) for computational purposes, because by default the current states are always arranged so that they correspond to the integers 0, 1, 2, ..., 2^n–1.

Table 1.

Example of a State Transition Table for a BN

Current states 000 001 010 011 100 101 110 111
Next states 000 000 100 000 010 010 110 010

To obtain the state transition diagram of a BN, we draw 2^n vertices, each representing a possible state, and connect two vertices by a directed edge if one state transits to the other according to the state transition table. If a state transits to itself, the edge points to itself. Fig. (4) is the state transition diagram based on Table 1.

Fig. (4). State transition diagram of a Boolean network.

Similarly, for a PBN with gene perturbation rate p = 0, we can draw its state transition diagram by combining the state transition diagrams of its component BNs. Now each edge has a probability attached to it, representing the probability of one state transiting to the other. For example, if a PBN is composed of two BNs, where the first BN has the state transitions shown in Table 1, the second BN has the state transitions shown in Table 2, and their selection probabilities are c1 and c2 = 1–c1 respectively, then when p = 0, the PBN's state transition diagram is as shown in Fig. (5). When p > 0, a state transition can be driven either by a network function or by random gene perturbations, and we may refer to the transition probability matrix T when constructing the state transition diagram. It should be noted that the sum of the probabilities of all the edges exiting a vertex must always be 1.

Table 2.

State Transition Table for the Second BN in a PBN

Current states 000 001 010 011 100 101 110 111
Next states 001 001 101 001 010 010 110 010

Fig. (5). Example of the state transition diagram of a probabilistic Boolean network when the gene perturbation rate p = 0.

The following is a simple algorithm for finding the attractors of a BN based on the state transition table (using the integer representation of states).

Algorithm 1 (Finding attractors)

  1. Generate an array a of size 2^n, and initialize all $a_i$'s to 0. $a_i$ corresponds to state i–1.

  2. Search for singleton attractors. For each state i between 0 and 2^n–1, look up the (i+1)-th entry of the state transition table for its next state j. If j = i, then j is a singleton attractor; set $a_{j+1} := 1$.

  3. Search for attractor cycles. For each state i between 0 and 2^n–1, if $a_{i+1} = 0$, look up the state transition table repeatedly for the successor states of i, i → j → k → ..., until a singleton attractor or an attractor cycle is reached. If an attractor cycle is reached, save the cycle states and set the corresponding elements of a to 1.
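A compact Python rendering of Algorithm 1 is sketched below. It takes the “next states” row of the transition table (as integers) and returns the attractors as lists of integer states; running it on Table 1 recovers the singleton attractors 000 and 110 and the cycle {010, 100}.

```python
def find_attractors(next_state):
    """Attractors of a BN from its state transition table (Algorithm 1).
    next_state[i] is the integer successor of state i, 0 <= i <= 2^n - 1."""
    N = len(next_state)
    in_attractor = [False] * N                # the array a of Algorithm 1
    attractors = []

    # Step 2: singleton attractors are states that map to themselves.
    for i in range(N):
        if next_state[i] == i:
            in_attractor[i] = True
            attractors.append([i])

    # Step 3: follow successors until a known attractor or a new cycle is reached.
    for i in range(N):
        if not in_attractor[i]:
            path, j = [], i
            while j not in path and not in_attractor[j]:
                path.append(j)
                j = next_state[j]
            if not in_attractor[j]:           # the walk closed a new cycle at j
                cycle = path[path.index(j):]
                for s in cycle:
                    in_attractor[s] = True
                attractors.append(cycle)
    return attractors

print(find_attractors([0, 0, 4, 0, 2, 2, 6, 2]))   # Table 1 -> [[0], [6], [2, 4]]
```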

4.2. Simulating a Dynamic System

A common practice with a Boolean model defined on V = {x1,...,xn} is to generate time sequence data x(0), x(1), x(2), .... A direct method is to start from an initial state x(0) and apply the Boolean functions repeatedly to find the subsequent states (for BNs and PBNs), sometimes taking into consideration network switches and gene perturbations (for PBNs).

An alternative way, which is more efficient when the simulation time is long ($t \gg 2^n$), is to utilize the information of the state transition diagram (encoded in the state transition table) or the transition probability matrix T. For a BN, this entails converting the current state x(t) to an integer and looking up the state transition table or the matrix T for the next state x(t+1). For a PBN, one can start from a randomly chosen initial state and a randomly chosen initial network (out of the r BNs), and follow either of the two protocols below. Note that we follow the notations of Definition 2 and use p, q to denote the random gene perturbation rate and the network switching probability, respectively. Network selection probabilities are denoted by c1,...,cr.

  • Table-lookup and real-time computation based method. Construct r state transition tables for the r component BNs, respectively (setting the gene perturbation rate p = 0). At any time t, if in the k-th network, generate n independent [0,1] uniformly distributed random numbers $p_1, \ldots, p_n$. If $p_i < p$, flip $x_i(t)$ to obtain $x_i(t+1)$; if $p_i \geq p$ for all i (no gene perturbation), convert x(t) to an integer and look up the k-th state transition table to find x(t+1). Finally, generate a [0,1] uniformly distributed random number $q_s$ and compare it to q to decide whether the system will switch networks at t+1; if a switch is to occur, choose from the r networks according to the selection probabilities. (A sketch of this protocol is given after the list below.)

  • T matrix based method. Compute the transition probability matrix T. If dec(x(t)) = i–1, generate a [0,1] uniformly distributed random number $p_t$. If $\sum_{l=1}^{j-1} t_{il} \leq p_t < \sum_{l=1}^{j} t_{il}$ ($t_{il}$ is the (i, l) element of T), then convert j–1 to an n-bit vector s (dec(s) = j–1); the next state is x(t+1) = s.
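The table-lookup protocol, referenced in the first bullet above, might be sketched in Python as follows. The component transition tables are the perturbation-free tables of Section 4.1 (here, Tables 1 and 2); all names, the seed, and the parameter values are illustrative.

```python
import random

def simulate_pbn(tables, c, p, q, steps, n, seed=0):
    """Time sequence x(0), ..., x(steps) of a PBN via the table-lookup protocol.
    tables[k][i]: integer successor of state i under the k-th component BN;
    c: selection probabilities; p: perturbation rate; q: switching probability."""
    rng = random.Random(seed)
    r = len(tables)
    k = rng.choices(range(r), weights=c)[0]           # initial context
    x = rng.randrange(2 ** n)                         # initial state, as an integer
    trajectory = [x]
    for _ in range(steps):
        flips = [i for i in range(n) if rng.random() < p]
        if flips:                                     # random gene perturbation
            for i in flips:
                x ^= 1 << (n - 1 - i)                 # x_1 is the most significant bit
        else:                                         # ordinary state transition
            x = tables[k][x]
        if rng.random() < q:                          # possible network switch
            k = rng.choices(range(r), weights=c)[0]
        trajectory.append(x)
    return trajectory

t1 = [0, 0, 4, 0, 2, 2, 6, 2]                         # Table 1
t2 = [1, 1, 5, 1, 2, 2, 6, 2]                         # Table 2
print(simulate_pbn([t1, t2], c=[0.6, 0.4], p=0.005, q=0.1, steps=20, n=3))
```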

Another issue in simulating a PBN is the choice of the parameters p and q. As stated in Section 2, the network switching probability q does not affect the probability of being in any constituent BN, and in theory we can choose any value for q; however, we prefer a small q because, in a biological system, switching networks corresponds to a change of context (reflecting a change of regulatory paradigm, caused either by environmental change or by internal signals), which should not occur very often. Moreover, if q is large, or even q = 1, then network switching is frequent, and a short time sequence of data x(t1), x(t1+1), ..., x(t2) is more likely to come from several BNs instead of from one single BN. This may pose a difficulty if we try to identify the underlying PBN and its component BNs from the sequence data [32]. On the other hand, if q is too small and the number of BNs in the PBN is large, it will take too long to obtain the steady-state distribution by simulation. Usually p should be small to reflect the rarity of random gene perturbation, and we let p << q. A small p is also helpful if the generated sequence data will be used as artificial time-series data for the identification of the underlying PBN and its component BNs. However, if p is too small, it will take longer to obtain the steady-state distribution. Usually, we can choose q = 0.01 ~ 0.2 and p = 0.01% ~ 0.5%.

4.3. Obtaining the Steady-State Distributions

4.3.1. Power Method

As discussed in Section 3.1, a PBN possesses a steady-state distribution when q, p > 0 [3]. By definition, this distribution π* is the unique solution to the linear equations π = πT with the constraint $\sum_i \pi_i = 1$, and it can be estimated by iteration, given the transition probability matrix T (assuming n genes and N = 2^n states).

Algorithm 2 (Finding steady-state distribution)

  1. Set δ* and generate an initial distribution $\pi^{(0)} = (\pi_1^{(0)}, \ldots, \pi_N^{(0)})$; let k := 0.

  2. DO

    Compute $\pi^{(k+1)} := \pi^{(k)} T$;

    $\delta := \|\pi^{(k+1)} - \pi^{(k)}\|$;

    k := k+1;

    UNTIL (δ < δ*)

  3. π* :=π(k).

Note that ||·|| can be any norm, such as the infinity norm $\|\cdot\|_\infty$.
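Algorithm 2 might be sketched in Python as follows; the matrix T can come, for instance, from the pbn_transition_matrix() sketch of Section 3.1.1 (names and tolerance are illustrative).

```python
import numpy as np

def steady_state_power(T, delta_star=1e-9, max_iter=100_000, seed=0):
    """Algorithm 2: iterate pi <- pi T until the change falls below delta*.
    T: 2^n x 2^n transition probability matrix of the PBN."""
    rng = np.random.default_rng(seed)
    pi = rng.random(T.shape[0])
    pi /= pi.sum()                                      # arbitrary initial distribution
    for _ in range(max_iter):
        pi_next = pi @ T
        if np.max(np.abs(pi_next - pi)) < delta_star:   # infinity norm of the change
            return pi_next
        pi = pi_next
    return pi                                           # last iterate if not converged
```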

When the number of BNs in a PBN is large and some BNs have small selection probabilities, an approximation method for constructing T is proposed in [33]. In the approximation, $\hat{T}$ is computed instead of T, ignoring the $r_0$ BNs whose selection probabilities $c_{k_1}, \ldots, c_{k_{r_0}}$ are less than a threshold value ε,

$\hat{T} = T' + \tilde{T},$ (44)
$T' = (1-p)^n \sum_{j=1,\, j \notin \{k_1, \ldots, k_{r_0}\}}^{r} c_j T_j \Big/ \Big(1 - \sum_{i=1}^{r_0} c_{k_i}\Big),$ (45)

where $T_j$ (1 ≤ j ≤ r) and $\tilde{T}$ are defined as in Eqs. (7) and (8). If $\hat{T}$ is used in place of T, and the solution of $\pi = \pi \hat{T}$ is $\hat{\pi}$, the expected relative error in the steady-state distribution is shown to be bounded by O(ε) [33]:

$E\big[\|\hat{\pi} - \hat{\pi} T\|\big] \big/ \|\pi\| < (2 + 2n) \sum_{i=1}^{r_0} c_{k_i} < 2(n+1)\, r_0\, \varepsilon.$ (46)

An alternative way of obtaining the steady-state distribution is the following: if we are interested in the attractors only, knowing that the majority of the steady-state probability mass lies on the attractors when p is small, we may apply the Markov chain based analytic method of Section 3.2.

4.3.2. Monte Carlo Simulation Method [34]

This method requires the generation of a long time sequence of data, x(0), x(1), ..., x(T), such that the frequencies of all the possible 2^n states approach the steady-state distribution. In a given n-gene PBN with gene perturbation rate p, the smaller p is and the larger n is, the longer it takes to converge to the steady-state distribution. In general, we need to simulate at least $10 \cdot 2^n \cdot p^{-1}$ steps.

To estimate when the PBN has converged to its steady-state distribution, we can use the Kolmogorov-Smirnov test. The basic idea of the Kolmogorov-Smirnov test is to measure the closeness of an empirical probability distribution to the theoretical distribution. Since the latter (the steady-state distribution) is unknown in this case, we test the closeness of two empirical distributions instead.

To obtain two quasi-i.i.d. (independently and identically distributed) samples from the PBN, we select two sub-sampled sequences $x(t_1), x(t_1+\Delta), \ldots, x(t_1+(M-1)\Delta)$ and $x(t_2), x(t_2+\Delta), \ldots, x(t_2+(M-1)\Delta)$, with $t_1 < t_2$, $t_2 - t_1 \geq m\Delta$, and $0 < m < M$, and the Kolmogorov-Smirnov statistic is defined as

$K = \frac{1}{M} \max_{s} \left| \sum_{m=0}^{M-1} \mathbf{1}_{[00\ldots0,\,s]}\big(x(t_1 + m\Delta)\big) - \sum_{m=0}^{M-1} \mathbf{1}_{[00\ldots0,\,s]}\big(x(t_2 + m\Delta)\big) \right|.$ (47)

In this definition, the maximum is over the state space $\{0,1\}^n$, and $\mathbf{1}_{[00\ldots0,\,s]}(x)$ is an indicator function that equals 1 if and only if $x \in [00\ldots0, s]$ for $s \in \{0,1\}^n$ (that is, x does not exceed s in the integer ordering of the states), and equals 0 otherwise.
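In other words, K is the largest gap between the two empirical cumulative distributions over the integer-ordered state space. A minimal Python sketch (assuming states are encoded as integers, e.g. as returned by the simulate_pbn() sketch of Section 4.2) is:

```python
import numpy as np

def ks_statistic(sample1, sample2, n):
    """Kolmogorov-Smirnov statistic of Eq. (47) for two equal-length samples of
    PBN states, each state given as an integer in [0, 2^n - 1]."""
    M = len(sample1)
    assert len(sample2) == M
    N = 2 ** n
    # Cumulative counts of 1_[00...0, s](x) over the integer ordering of states.
    cdf1 = np.cumsum(np.bincount(sample1, minlength=N))
    cdf2 = np.cumsum(np.bincount(sample2, minlength=N))
    return np.max(np.abs(cdf1 - cdf2)) / M

# Two sub-sampled, well-separated stretches of one long trajectory (illustrative):
# traj = simulate_pbn(...); K = ks_statistic(traj[10000:20000:100], traj[50000:60000:100], n)
```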

4.4. Generating Artificial BNs with Prescribed Attractors [35]

In a simulation study of Boolean models, it is often necessary to create artificial networks with certain properties. Of special interest is the problem of generating artificial BNs with a given set of attractors, since attractors are hypothesized to correspond to cellular phenotypes and play an important role in the long term behavior of Boolean models.

First, note that the state transition diagram can be partitioned into level sets, where level set lj consists of all states that transit to one of the attractors in exactly j steps, and the attractors belong to the level set l0.

Problem formulation [35] Given a set of n nodes V = {x1,...,xn}, a family of n subsets $P_1, \ldots, P_n \subseteq V$ with 0 < k ≤ |Pi| ≤ K, a set A of d states (binary vectors of n bits), and integers l, L satisfying 0 < l < L, we wish to construct a BN defined on V that satisfies the following constraints: the set of regulators of node $x_i$ is $P_i$ (P = {P1,...,Pn} is called the regulator set of V), the attractors are $A_1, \ldots, A_m$ such that $\bigcup_{j=1}^{m} A_j = A$, and the BN has between l and L level sets.

Specifically, if we are interested in constructing a BN with only singleton attractors, its state transition diagram will be a d -forest (containing d single-rooted trees) if the BN has d singleton attractors. The following theorem gives the number of all possible state transition diagrams that only contain singleton attractors (the proof can be found in [35]).

Theorem 4 The cardinality of the collection of all forests on N vertices is $(N+1)^{N-1}$.

Since $N = 2^n$ and the number of all possible state transition diagrams is $N^N$, when n is large the ratio $(N+1)^{N-1}/N^N$ is asymptotically $e/2^n$; thus a brute-force search has a low success rate.

Assuming only singleton attractors are allowed, the following algorithm solves the search problem formulated above. A second algorithm is also given in [35], but it is shown to be less efficient.

Algorithm 3 (Generating artificial Boolean network)

  1. Randomly generate or give in advance a set A of d states (as singleton attractors).

  2. Randomly generate a predictor set P, where each Pi has k to K nodes. If Step 2 has been repeated more than a pre-specified number of times, go back to Step 1.

  3. Check whether the attractor set A is compatible with P, i.e., whether each state in A can transit to itself under regulator sets P (if two states in A agree on some $P_i$ but differ in gene $x_i$, no choice of $f_i$ with regulators $P_i$ can make both of them fixed points). If not compatible, go back to Step 2.

  4. Fill in the entries of the truth tables that correspond to the attractors generated in Step 1. Using the predictor set P, randomly fill in the remaining entries of the truth tables. If Step 4 has been repeated more than a pre-specified number of times, go back to Step 2.

  5. Based on the truth tables generated in Step 4, search the state transition diagram D for cycles of any length other than the attractor self-loops. If such a cycle is found, go back to Step 4; otherwise continue to Step 6.

  6. If D has fewer than l or more than L level sets, go back to Step 4; otherwise continue to Step 7.

  7. Save the generated BN and terminate the algorithm.
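The following Python sketch illustrates a simplified version of Algorithm 3: it keeps the attractor set A fixed (omitting the restart of Step 1), assumes k ≤ K ≤ n, and all function and variable names are assumptions of the sketch rather than part of [35].

```python
import itertools
import random

def next_state(state, P, tables):
    """Synchronous BN update: node i reads its regulators P[i] and looks up its truth table."""
    return tuple(tables[i][tuple(state[j] for j in P[i])] for i in range(len(state)))

def valid_diagram(n, P, tables, attractors, l, L):
    """Step 5: reject cycles other than the attractor self-loops.
    Step 6: require between l and L level sets."""
    steps = {a: 0 for a in attractors}          # state -> steps needed to reach an attractor
    for state in itertools.product((0, 1), repeat=n):
        path, x = [], state
        while x not in steps:
            if x in path:                       # a cycle that is not a prescribed attractor
                return False
            path.append(x)
            x = next_state(x, P, tables)
        for j, y in enumerate(reversed(path), start=1):
            steps[y] = steps[x] + j
    num_levels = max(steps.values()) + 1        # level sets l_0, ..., l_max
    return l <= num_levels <= L

def generate_bn(n, attractors, k, K, l, L, max_tries=1000):
    """Simplified Algorithm 3: singleton attractors only, attractor set A kept fixed."""
    attractors = [tuple(a) for a in attractors]
    for _ in range(max_tries):                  # Step 2: random regulator sets of size k..K
        P = [sorted(random.sample(range(n), random.randint(k, K))) for _ in range(n)]
        # Step 3: compatibility -- attractors agreeing on P[i] must not demand different x_i.
        constraints = [dict() for _ in range(n)]
        compatible = True
        for a in attractors:
            for i in range(n):
                pattern = tuple(a[j] for j in P[i])
                if constraints[i].setdefault(pattern, a[i]) != a[i]:
                    compatible = False
        if not compatible:
            continue
        for _ in range(max_tries):              # Steps 4-6: fill truth tables and test
            tables = []
            for i in range(n):
                table = {}
                for bits in itertools.product((0, 1), repeat=len(P[i])):
                    # Attractor-fixed entries first; the remaining entries at random.
                    table[bits] = constraints[i].get(bits, random.randint(0, 1))
                tables.append(table)
            if valid_diagram(n, P, tables, set(attractors), l, L):
                return P, tables                # Step 7: save the generated BN
    return None
```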

5. CLOSING WORDS

This paper has presented the following analysis and simulation issues of Boolean networks and probabilistic Boolean networks, which are models for gene regulatory networks.

  • Analysis. An important aspect of Boolean models is that they can be viewed as homogeneous Markov chains; for a PBN, when the network switching probability q > 0 and the gene perturbation rate p > 0, it possesses a steady-state distribution. Markov analysis serves as a basis for finding the steady-state probabilities of attractors and for proving the equivalence of PBNs and dynamic Bayesian networks. Finally, a structural analysis is provided, where quantitative measures of gene-to-gene relationships are introduced and the effects of perturbations on Boolean functions are analyzed.

  • Simulation. Central to the simulation of Boolean models is the use of the state transition diagram and the transition probability matrix. For network simulation, different methods are presented and simple guidelines for parameter selection are provided. To test the convergence of a simulated PBN to its steady-state distribution, the Kolmogorov-Smirnov statistic can be employed. Lastly, an algorithm for generating artificial BNs with prescribed attractors is presented.

For more references on Boolean models, and to obtain a MATLAB toolbox for BN/PBN, readers can visit http://personal.systemsbiology.net/ilya/PBN/PBN.htm. Another online source of papers is http://gsp.tamu.edu/Publications/journal-publications.

ACKNOWLEDGEMENT

The author thanks Yuping Xiao for giving valuable critique on the initial manuscript. The anonymous reviewers' insightful comments have helped the author's revision considerably.

REFERENCES

1. Kauffman SA. The Origins of Order: Self-Organization and Selection in Evolution. New York: Oxford University Press; 1993.
2. Shmulevich I, Dougherty ER, Kim S, Zhang W. Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics. 2002;18:261–274. doi: 10.1093/bioinformatics/18.2.261.
3. Shmulevich I, Dougherty ER, Zhang W. Gene perturbation and intervention in probabilistic Boolean networks. Bioinformatics. 2002;18:1319–1331. doi: 10.1093/bioinformatics/18.10.1319.
4. Albert R, Othmer HG. The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J. Theor. Biol. 2003;223:1–18. doi: 10.1016/s0022-5193(03)00035-3.
5. Huang S. Gene expression profiling, genetic networks, and cellular states: an integrating concept for tumorigenesis and drug discovery. J. Mol. Med. 1999;77:469–480. doi: 10.1007/s001099900023.
6. Datta A, Choudhary A, Bittner ML, Dougherty ER. External control in Markovian genetic regulatory networks. Mach. Learn. 2003;52:169–191.
7. Shmulevich I, Dougherty ER, Zhang W. From Boolean to probabilistic Boolean networks as models of genetic regulatory networks. Proc. IEEE. 2002;90:1778–1792.
8. Dougherty ER, Xiao Y. Design of probabilistic Boolean networks under the requirement of contextual data consistency. IEEE Trans. Signal Processing. 2006;54:3603–3613.
9. Zhang S-Q, Ching W-K, Ng MK, Akutsu T. Simulation study in probabilistic Boolean network models for genetic regulatory networks. Int. J. Data Min. Bioinform. 2007;1(3):217–240. doi: 10.1504/ijdmb.2007.011610.
10. Çinlar E. Introduction to Stochastic Processes. 1st ed. Englewood Cliffs, NJ: Prentice Hall; 1997.
11. Brun M, Dougherty ER, Shmulevich I. Steady-state probabilities for attractors in probabilistic Boolean networks. EURASIP J. Signal Processing. 2005;85:1993–2013.
12. Friedman N, Linial M, Nachman I, Pe'er D. Using Bayesian networks to analyze expression data. J. Comput. Biol. 2000;7:601–620. doi: 10.1089/106652700750050961.
13. Ideker T, Thorsson V. Discovery of regulatory interactions through perturbation: inference and experiment design. Pac. Symp. Biocomput. 2000;5:302–313. doi: 10.1142/9789814447331_0029.
14. Xiao Y, Dougherty ER. Optimizing consistency-based design of context-sensitive gene regulatory networks. IEEE Trans. Circ. Syst. I. 2006;53:2431–2437.
15. Lähdesmäki H, Hautaniemi S, Shmulevich I, Yli-Harja O. Relationship between probabilistic Boolean networks and dynamic Bayesian networks as models for gene regulatory networks. Signal Processing. 2006;86:814–834. doi: 10.1016/j.sigpro.2005.06.008.
16. Datta A, Pal R, Dougherty ER. Intervention in probabilistic gene regulatory networks. Curr. Genomics. 2006;1(2):167–184.
17. Datta A, Pal R, Choudhary A, Dougherty ER. Control approaches for probabilistic gene regulatory networks. IEEE Signal Processing Mag. 2007;24(1):54–63.
18. Shmulevich I, Lähdesmäki H, Dougherty ER, Astola J, Zhang W. The role of certain Post classes in Boolean network models of genetic networks. PNAS. 2003;100(19):10734–10739. doi: 10.1073/pnas.1534782100.
19. Szallasi Z, Liang S. Modeling the normal and neoplastic cell cycle with realistic Boolean genetic networks: their application for understanding carcinogenesis and assessing therapeutic strategies. Pac. Symp. Biocomput. 1998;3:66–76.
20. Martins-Jr DC, Hashimoto RF, Braga-Neto U, Bittner ML, Dougherty ER. Intrinsically multivariate predictive genes. IEEE J. Select. Topics Signal Processing. 2008;2:424–439.
21. Kauffman S, Peterson C, Samuelsson B, Troein C. Genetic networks with canalizing Boolean rules are always stable. PNAS. 2004;101(49):17102–17107. doi: 10.1073/pnas.0407783101.
22. Aldana M, Cluzel P. A natural class of robust networks. PNAS. 2003;100(15):8710–8714. doi: 10.1073/pnas.1536783100.
23. Reichhardt CJO, Bassler KE. Canalization and symmetry in Boolean genetic regulatory networks. J. Phys. A: Math. Theor. 2007;40:4339–4350.
24. Choudhary A, Datta A, Bittner ML, Dougherty ER. Assignment of terminal penalties in controlling genetic regulatory networks. American Control Conference, Portland, OR, 2005.
25. Qian X, Dougherty ER. Effect of function perturbation on the steady-state distribution of genetic regulatory networks: optimal structural intervention. IEEE Trans. Signal Processing. 2008;56(10):4966–4975.
26. Qu X, Aldana M, Kadanoff LP. Numerical and theoretical studies of noise effects in the Kauffman model. J. Stat. Phys. 2002;109(5/6):967–985.
27. Shmulevich I, Dougherty ER, Zhang W. Control of stationary behavior in probabilistic Boolean networks by means of structural intervention. J. Biol. Syst. 2002;10:431–445.
28. Xiao Y, Dougherty ER. The impact of function perturbations in Boolean networks. Bioinformatics. 2007;23:1265–1273. doi: 10.1093/bioinformatics/btm093.
29. Gershenson C, Kauffman SA, Shmulevich I. The role of redundancy in the robustness of random Boolean networks. In: Artificial Life X, Proceedings of the Tenth International Conference on the Simulation and Synthesis of Living Systems; 2006. Available online at http://arxiv.org/abs/nlin/0511018.
30. Chaves M, Albert R, Sontag ED. Robustness and fragility of Boolean models for genetic regulatory networks. J. Theor. Biol. 2005;235:431–449. doi: 10.1016/j.jtbi.2005.01.023.
31. Xiao Y. Modeling gene knockdown and pathway blockage in gene regulatory networks. Proc. IEEE Workshop on Systems Biology and Medicine, Philadelphia, PA, November 2008.
32. Marshall S, Yu L, Xiao Y, Dougherty ER. Inference of a probabilistic Boolean network from a single observed temporal sequence. EURASIP J. Bioinform. Syst. Biol. 2007; Article ID 32454, 15 pages. doi: 10.1155/2007/32454.
33. Ching W-K, Zhang S, Ng MK, Akutsu T. An approximation method for solving the steady-state probability distribution of probabilistic Boolean networks. Bioinformatics. 2007;23:1511–1518. doi: 10.1093/bioinformatics/btm142.
34. Shmulevich I, Gluhovsky I, Hashimoto R, Dougherty ER, Zhang W. Steady-state analysis of genetic regulatory networks modeled by probabilistic Boolean networks. Comp. Funct. Genomics. 2003;4:601–608. doi: 10.1002/cfg.342.
35. Pal R, Ivanov I, Datta A, Bittner ML, Dougherty ER. Generating Boolean networks with a prescribed attractor structure. Bioinformatics. 2005;21:4021–4025. doi: 10.1093/bioinformatics/bti664.
