Abstract
Boolean networks (BNs) are a simple and popular mathematical model that has attracted significant attention in systems biology because of its capacity to reveal the behavior of genetic regulatory networks. Observability, an important network property, plays a vital role in deciphering the mechanisms that drive a genetic regulatory network and has been widely investigated. Prior studies examined the observability of BNs and other complex networks. However, the observability of attractors, which can serve as biomarkers for disease, has not been fully examined in the literature. In this study, we formulate a new definition of singleton or cyclic attractor observability in BNs and develop an effective methodology to solve the resulting problem. We also show that the complexity of our method is O(Pmn), where P is the maximal period of a cyclic attractor, m is the number of attractors and n is the number of genes. Importantly, we confirm that our method can faithfully predict the expression pattern of segment polarity genes in Drosophila melanogaster, and show that it deals with the observability problem effectively and efficiently.
Keywords: Attractor, Boolean networks, observability
I. INTRODUCTION
Systems biology refers to the study of the interactions among the components of biological systems, a field that has undergone rapid development in the post-genome era. Genetic regulatory networks (often simply referred to as genetic networks) have been intensively studied to better understand the interactions of various genes, molecules and proteins. Several formalisms have been employed to model genetic regulatory processes. BNs [1]-[3] have received particular attention owing to their capacity to capture the highly dynamic behavior of genetic regulation.
The first problem typically encountered concerns the topological structure of a Boolean network, such as fixed points, cycles and basins of attraction [4]-[6]. Many studies have analyzed the attractors of randomly generated BNs [3], [7], in large part because different attractors can be used to represent distinct cell types. Several methods have been suggested [8]-[10] to enumerate and/or identify attractors in BNs. For example, Devloo et al. proposed an effective method based on a transformation to a constraint satisfaction problem [8], and Irons developed another method employing small subnetworks [9]. Zhang et al. [10] proposed algorithms to enumerate singleton and small attractors and analyzed the average-case time complexities of these algorithms. Of note, detecting a singleton attractor (i.e., a fixed state) has been shown to be NP-hard [11]. In addition, Akutsu et al. [12] developed algorithms with guaranteed worst-case time complexity for singleton attractor detection in BNs with restricted Boolean functions. Importantly, all these approaches deal with 2^n × 2^n matrices (where n is the number of genes in a BN), which makes applying them to large-scale BNs exceedingly challenging. Therefore, Akutsu et al. [13] developed an integer linear programming (ILP)-based approach to the attractor control problem for medium-size BNs. Even though the authors of [13] reported that the attractor control problem is computationally hard, an ILP-based methodology can be applied to moderate-size BNs. Each of these studies, however, examined only randomly generated single BNs (i.e., one cell type). It is therefore necessary to develop strategies capable of analyzing multiple BNs (i.e., various cell types). In a genuine cellular context there are various kinds of cells, so it is more realistic to perform attractor analysis for multiple BNs.
Motivated by this, Qiu et al. [14] recently examined this problem and employed the ILP-based method to solve it.
The control of Boolean networks is also a challenging problem. According to control theory, a dynamic system is controllable if it can be driven from any initial state to any desired final state with a suitable choice of inputs [15]. Several studies have been conducted on this problem [16], [17]. When a random Boolean network is examined, the principal interest lies in the stationary distribution of the system. For a deterministic network, the reachability problem, as in control theory, becomes the common concern [18].
Observability, as the dual of controllability, is a significant concept that needs further exploration. Several studies have recently examined observability problems. Cheng and Qi [19], [20] adopted a semi-tensor product strategy to examine observability and related problems. While their design is effective for formulating and constructing the problems, applying their method to large-scale BNs is impractical. Of note, Liu et al. [21] proposed a novel view of the observability of complex networks. In this view, the aim of attractor observability is to determine the minimum number of consecutive nodes necessary to discriminate distinct attractor cycles from each other in the system. This work provides an opportunity to diagnose disease types, since different attractors represent different cell types. That said, we suggest here that the proposed problem is a particular case (the binary-alphabet case) of the minimal key problem examined in [12], [22], [23]. It has been shown that the minimal key problem is NP-hard even for the binary alphabet [23]. Cheng et al. [24] proposed an ILP-based method to solve the observability problem, but it was limited to singleton attractors. However, cyclic attractors exist in real-world systems, and the issue of observability for cyclic attractors has not yet been fully addressed. Qiu et al. [25] proposed a method to study the observability of singleton and cyclic attractors. However, the sensitivity of that method to the number of genes and the number of attractors needs to be further explored, and its application to a real model requires further discussion. Thus, the problem we address in this study is to identify the minimum set of contiguous nodes that discriminates the singleton or cyclic attractors. Furthermore, we analyze how the desired number of nodes for distinguishing cell types varies with the gene number and with the attractor number, respectively.
We show that our proposed method is efficient and can solve large-size problems in O(Pmn) in the worst case, where P is the maximum period of a cyclic attractor, m is the number of (singleton or cyclic) attractors and n is the number of genes (nodes) in a BN.
This paper has two main contributions. First, we formulate a new biological problem based on Boolean networks: discriminating cell types with a minimum number of consecutive nodes. Second, we propose an effective and efficient method, which works in O(Pmn), to solve this problem. Using the expression pattern of the segment polarity genes in Drosophila melanogaster, we demonstrate that our approach can efficiently identify the minimum set of contiguous genes or proteins that discriminates the attractors. Furthermore, experiments confirm that the proposed methodology is efficient and can solve the problem in seconds.
The remainder of the paper is structured as follows. In the “Problem formulation” section, we introduce some background and formulate the problem. “Methodology” presents a novel method for the problem and gives two propositions. “Numerical Experiments” describes the materials and experimental results that validate the efficiency and effectiveness of our method. Finally, conclusions and future directions are given in the last section.
II. PROBLEM FORMULATION
This section (1) introduces the BN model and the attractor detection problem which are closely related to attractor observability, and (2) then formulates the attractor observability problem in BNs.
A BN G(V, F) consists of a set of nodes V = {ν1, ν2, ⋯ , νn}, used to represent genes, and a list of Boolean functions F = (f1, f2, ⋯ , fn), where fi : {0, 1}^n → {0, 1}. Each node (e.g., a gene) is assumed to take either 1 (active) or 0 (inactive) as its state value. Let νi(t) denote the state of node i at time t. The state of node i at time t + 1 is then given as follows:
$$\nu_i(t+1) = f_i\bigl(\nu_{i_1}(t), \nu_{i_2}(t), \ldots, \nu_{i_{h_i}}(t)\bigr),$$
where ν_{i_1}, ⋯ , ν_{i_{h_i}} are the input nodes of νi.
This indicates that the gene state νi at time t + 1 is determined by the states of hi genes at the previous time t, where hi, the number of input nodes, is called the indegree of νi. The maximum indegree of a BN is denoted by H = maxi{hi}. We write x ∨ y, x ∧ y, ¬x and x ⊕ y for the logical OR of x and y, the logical AND of x and y, the logical NOT of x and the exclusive OR of x and y, respectively. The overall gene expression at time t in a BN is given by
$$\mathrm{gap}(t) = [\nu_1(t), \nu_2(t), \ldots, \nu_n(t)]$$
and is called the network Gene Activity Profile (GAP) at time t. Note that gap(t) ranges from [0, 0, . . . , 0] to [1, 1, . . . , 1], so there are 2^n possible global states in total. The following details an example of a BN.
Example 1:
Each gene νi is regulated by a Boolean function fi. A state transition diagram and the dynamics of a BN are shown in Figure 1. The BN truth table is shown in Table 1. For example, the fourth row of the table shows that if the state of the BN is [0, 1, 1] at time t, then the state at time t + 1 will be [0, 1, 0]. Similarly, the arc from 011 to 010 indicates that when the BN state is [0, 1, 1] at time t, the state is [0, 1, 0] at time t + 1. In total, there are 2^n potential states in a BN containing n nodes. Thus, the state transition table contains 2^n rows, and the corresponding state transition diagram has 2^n vertices.
FIGURE 1.
Example of a BN.
TABLE 1.
The truth table.
State | υ1(t) | υ2(t) | υ3(t) | υ1(t + 1) | υ2(t + 1) | υ3(t + 1) |
---|---|---|---|---|---|---|
1 | 0 | 0 | 0 | 1 | 1 | 0 |
2 | 0 | 0 | 1 | 1 | 1 | 0 |
3 | 0 | 1 | 0 | 0 | 1 | 0 |
4 | 0 | 1 | 1 | 0 | 1 | 0 |
5 | 1 | 0 | 0 | 1 | 0 | 1 |
6 | 1 | 0 | 1 | 1 | 1 | 1 |
7 | 1 | 1 | 0 | 0 | 0 | 1 |
8 | 1 | 1 | 1 | 0 | 1 | 1 |
The state of each gene is determined by its associated regulatory function. Given an initial state, a BN will ultimately enter a certain set of global states (i.e., a directed cycle in the state transition diagram). We call such a set an attractor. When an attractor consists of one global state (i.e., a single fixed point), it is a singleton attractor. Otherwise, it is a cyclic attractor, and p denotes its period if it consists of p global states. We can see from Figure 1 that the network eventually evolves into one of two attractors: the singleton attractor [0, 1, 0] or the cyclic attractor of period 2, [1, 1, 0] ↔ [0, 0, 1].
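As a minimal sketch of these dynamics, the following Python snippet encodes update rules that reproduce Table 1 and follows every state until it revisits one, which yields the attractors. The rules f1 = ¬ν2, f2 = ¬ν1 ∨ ν3 and f3 = ν1 are an assumption inferred from the truth table rather than quoted from the text, and the names `step` and `find_attractors` are illustrative only.

```python
from itertools import product

def step(state):
    """One synchronous update; the rules are inferred from Table 1 (assumption)."""
    v1, v2, v3 = state
    return (1 - v2, (1 - v1) | v3, v1)

def find_attractors(n, update):
    """Follow each of the 2^n states until a state repeats, then record the cycle."""
    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = update(state)
        cycle = seen[seen.index(state):]        # states on the directed cycle
        k = cycle.index(min(cycle))             # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return sorted(attractors, key=len)

attractors = find_attractors(3, step)
for a in attractors:
    kind = "singleton" if len(a) == 1 else f"cyclic (period {len(a)})"
    print(kind, a)
```

Exhaustive enumeration of this kind visits all 2^n states, so it is only practical for very small n.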
A. OBSERVABILITY OF ATTRACTORS
Once attractors are detected, one can analyze the BN attractor control problem [13], [14]. Observability, as the complement of controllability, has been examined by Cheng and Qi [19], who developed necessary and sufficient conditions for the observability problem. Cheng et al. [24] proposed an efficient and effective approach for solving singleton attractor observability in BNs. However, the problem of observability for cyclic attractors has not yet been resolved. Thus, in this study, we propose a new methodology for the observability problem of both singleton and cyclic attractors in BNs, which is more representative of biological reality. Note that once a BN is given, its attractors can be detected. In brief, assume a set of singleton or cyclic attractors is given as S = {S1, S2, . . . , Sm}. Our method then attempts to identify a set of consecutive genes of minimum cardinality that can be used to distinguish distinct attractor cycles from each other in a system. For instance, we consider four singleton or cyclic attractors with seven genes, given as S = {S1, S2, S3, S4}.
where S1 and S2 are singleton attractors, and S3 and S4 are cyclic attractors of period 2. Since there are four attractors in total, at least two nodes are required to distinguish them. If we observe only the third and fourth nodes, we can identify which attractor the system belongs to (i.e., (1, 1), (0, 0), (0, 1) and (1, 0) indicate S1, S2, S3 and S4, respectively). The attractor observability problem is therefore formulated as follows:
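The requirement of at least two nodes follows from a counting argument, sketched here in LaTeX. The symbol w (the number of observed nodes) is introduced only for this illustration, and the argument assumes each attractor must show its own binary pattern on the observed nodes.

```latex
% w observed binary nodes can display at most 2^w distinct patterns,
% so telling m attractors apart requires
2^{w} \ge m \;\Longleftrightarrow\; w \ge \lceil \log_2 m \rceil ,
\qquad \text{e.g. } m = 4 \;\Rightarrow\; w \ge \lceil \log_2 4 \rceil = 2 .
```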
Definition 1: ATTRACTOR OBSERVABILITY (AO)
Instance: List of (singleton or cyclic) attractors in a BN,
Problem: Determine the minimum cardinality of consecutive nodes that can distinguish different attractor cycles from one another in a BN.
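To make Definition 1 concrete for singleton attractors, here is a brute-force sketch that tries every run of consecutive columns, shortest first. It is not the procedure proposed in this paper, only a reference check under the assumption that each attractor is one binary row; the function name `min_window` and the data in `S` are hypothetical.

```python
from math import ceil, log2

def min_window(rows):
    """Shortest run of consecutive columns on which all rows differ.

    rows: equal-length 0/1 tuples, one per attractor state.
    Returns (length, start) with a 0-based start, or None if two rows are identical.
    """
    m, n = len(rows), len(rows[0])
    lower = max(1, ceil(log2(m)))           # counting lower bound on the length
    for w in range(lower, n + 1):
        for s in range(n - w + 1):
            patterns = {tuple(r[s:s + w]) for r in rows}
            if len(patterns) == m:          # every attractor shows a unique pattern
                return w, s
    return None

# Hypothetical singleton attractors over seven genes (illustrative data only).
S = [
    (0, 0, 1, 1, 0, 1, 0),
    (0, 0, 0, 0, 0, 1, 0),
    (0, 0, 0, 1, 0, 1, 0),
    (0, 0, 1, 0, 0, 1, 0),
]
print(min_window(S))   # (2, 2): the third and fourth nodes in 1-based numbering
```

The search inspects on the order of n^2 windows, so it serves as a correctness baseline rather than an efficient method.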
III. METHODOLOGY
In this section, we describe our approach to the attractor observability problem.
A. PROPOSED ALGORITHM FOR AO
Assume a set of (singleton or cyclic) attractors is given as S = {S1, S2, . . . , Sm}, where the size (number of global states) of each Si, i ∈ {1, 2, ⋯ , m}, is li. We then have a total of L = l1 ∙ l2 ⋯ lm possible combinations and employ a set of matrices A = {A1, A2, ⋯ , AL} to describe all of them. Each matrix Ai, i ∈ {1, 2, ⋯ , L}, is of size m × n (m: number of attractors; n: number of nodes (genes)). Each row of Ai corresponds to one global state of a singleton or cyclic attractor, and each column represents the states of one gene. We then develop our algorithm for AO based on the matrices in A. The major steps of the algorithm are detailed below.
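The construction of the combinations can be sketched in a few lines of Python: one state is picked from each attractor, and every such choice gives one m × n matrix. The attractor set `S` below is hypothetical and chosen only to show the bookkeeping.

```python
from itertools import product
from math import prod

# Hypothetical attractors: two singletons and two cycles of period 2 (n = 4 genes).
S = [
    [(1, 1, 0, 0)],
    [(0, 0, 1, 1)],
    [(0, 1, 0, 1), (1, 0, 1, 0)],
    [(0, 1, 1, 0), (1, 0, 0, 1)],
]

L = prod(len(s) for s in S)                    # l_1 * l_2 * ... * l_m
A = [list(rows) for rows in product(*S)]       # each A[i] is one m x n combination

print(L, len(A))                               # 4 4
```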
The Procedure
Step I: For i ∈ {1, 2, ⋯ , L}, assume the size of matrix Ai is m × n, where m is attractor number and n is gene number. Repeat Step II-VIII.
Step II: Apply the arithmetic ⊕ operator to each pair of rows of Ai to generate a new matrix biStatei, which is of size m(m − 1)/2 × n. We then define
$$\mathrm{biState}_i(k, :) = A_i(p, :) \oplus A_i(q, :), \qquad 1 \le p < q \le m,$$
where k indexes the pairs (p, q). Recall that the arithmetic ⊕ is defined as follows:
$$0 \oplus 0 = 0, \quad 0 \oplus 1 = 1, \quad 1 \oplus 0 = 1, \quad 1 \oplus 1 = 0.$$
Step III: For each individual row of matrix biStatei, identify the runs of consecutive 0′s with length no less than ⌈log2(m)⌉ and place them in matrix Bcon (assume that the total number of entries in Bcon is r). Another matrix Bindex of size r × 2 is then created to record the indices of Bcon (e.g., Bindex(i, 1) and Bindex(i, 2) correspond to the first and last elements of row i of Bcon, i ∈ {1, ⋯ , r}, respectively).
Step IV: Generate the union of Bcon and denote it as Buni.
Step V: Let U represent a vector which ranges from 1 to n (i.e., U = {1, ⋯ , n}). If U \ Buni ≠ ∅ holds, then any ⌈log2(m)⌉ consecutive nodes that contain an element of U \ Buni form the desired minimal contiguous node set, and the corresponding minimal number for Ai, denoted minLi, is ⌈log2(m)⌉. The desired minimal number of consecutive nodes has therefore been identified (i.e., ⌈log2(m)⌉), and the algorithm stops. Otherwise, continue.
Step VI: Employ a sorting algorithm (such as bucket sort) to Bindex. Then rank vector Bindex (:, 1) in an ascending order and update the order in Bindex(:, 2) accordingly. As an example,
(1) |
after applying the sorting algorithm, it becomes
(2) |
Step VII: Compare last elements of rows with identical first element in Bindex. Discard rows with smaller last elements and keep the rows with the largest last element in a new matrix Bni of size l × 2 as the rows with smaller last elements are subsets of the row with the largest last element. Thus, Bni is
(3) |
Step VIII: The desired node number for Ai is
$$\max\Bigl\{\min_{j \in \{1, \ldots, l-1\}} \bigl\{\mathrm{Bn}_i(j, 2) - \mathrm{Bn}_i(j+1, 1) + 3\bigr\},\ \lceil \log_2(m) \rceil\Bigr\}$$
and is denoted as minLi. If minLi = ⌈log2(m)⌉, stop the algorithm. Otherwise, return to Step II.
Step IX: Compare the values minLi, i ∈ {1, 2, ⋯ , L}, and choose the minimum as the desired minimal number of consecutive nodes necessary to distinguish different attractors.
Proposition 1: If the number of elements in Buni from matrix Ai, i ∈ {1, 2, ⋯ , L}, is less than n, then the minimal number of desired nodes for Ai equals ⌈log2(m)⌉.
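Steps II to V of the procedure above can be sketched as follows. This is an illustrative reading, not a reference implementation: it assumes that biState holds the XOR of every pair of rows of Ai, the function names are hypothetical, and the matrix `A` is made-up data in which only column 4 stays uncovered.

```python
from itertools import combinations
from math import ceil, log2

def bi_state(A):
    """Step II: XOR every pair of rows; a 0 marks a column where the two rows agree."""
    return [[a ^ b for a, b in zip(r1, r2)] for r1, r2 in combinations(A, 2)]

def zero_runs(row, k):
    """Step III: maximal runs of 0's with length >= k, as 1-based (first, last) pairs."""
    runs, start = [], None
    for j, x in enumerate(list(row) + [1], start=1):   # sentinel 1 closes a trailing run
        if x == 0 and start is None:
            start = j
        elif x != 0 and start is not None:
            if j - start >= k:
                runs.append((start, j - 1))
            start = None
    return runs

def uncovered_columns(A):
    """Steps IV-V: columns of U = {1..n} that lie in no recorded zero run."""
    m, n = len(A), len(A[0])
    k = ceil(log2(m))
    b_index = [run for row in bi_state(A) for run in zero_runs(row, k)]
    b_uni = {c for first, last in b_index for c in range(first, last + 1)}
    return sorted(set(range(1, n + 1)) - b_uni), b_index

# Hypothetical combination with m = 4 attractors and n = 6 genes.
A = [
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 1, 0, 1, 0),
    (0, 0, 1, 1, 0, 0),
]
free, b_index = uncovered_columns(A)
print(free)   # [4]: any 2 consecutive columns containing column 4 separate the rows
```

With this data, both [ν3, ν4] and [ν4, ν5] distinguish the four rows, which matches the situation covered by Proposition 1.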
Proof: Let (ν1, ν2, . . . , νn) represent the n consecutive nodes of a given BN. Since the number of elements in Buni is less than n, at least one element of U \ Buni must exist; denote it by νx. It then suffices to show that there exists a pair (x1, x2) such that (νx1 , . . . , νx, . . . , νx2) are the desired minimal consecutive nodes, where x2 − x1 + 1 = ⌈log2(m)⌉. In other words, we must show that no row of the matrix biState has all-0 elements from column x1 to x2. Each element of the x-th column takes one of two values (i.e., 0 or 1). In the first case, biState(t, x) = 0 for some t, and the run of consecutive 0’s covering biState(t, x) has length < ⌈log2(m)⌉, since otherwise x ∈ Buni. This implies that at least one 1 exists from column x1 to x2 in row t. In the second case, biState(t, x) = 1, and it is obvious that no run of consecutive 0’s can span columns x1 to x2. Based on this, we conclude that (νx1 , . . . , νx, . . . , νx2) are the desired minimal consecutive nodes for one attractor combination Ai in the BN. □
Proposition 2: If the number of elements in Buni from matrix Ai, i ∈ {1, 2, ⋯ , L}, is n, then the minimal cardinality of desired nodes for Ai is:
$$\max\Bigl\{\min_{j} \bigl\{\mathrm{Bn}_i(j, 2) - \mathrm{Bn}_i(j+1, 1) + 3\bigr\},\ \lceil \log_2(m) \rceil\Bigr\},$$
where j ∈ {1, ⋯ , l − 1}.
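Proof: Consider the following two cases:
Case one: If
$$\min_{j} \bigl\{\mathrm{Bn}_i(j, 2) - \mathrm{Bn}_i(j+1, 1) + 3\bigr\} \ge \lceil \log_2(m) \rceil,$$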
then it suffices to demonstrate that the sub-matrix biState(:, Bni(j + 1, 1) − 1 : Bni(j, 2) + 1) has no row whose elements are all 0. Note that biState(t1, Bni(j, 2)) is the last 0 in row t1 of the run from column Bni(j, 1) to Bni(j, 2) (for some t1), indicating that biState(t1, Bni(j, 2) + 1) must be 1. However, if t ≠ t1, there are two potential values of biState(t, Bni(j, 2) + 1) (i.e., 0 or 1).
If biState(t, Bni(j, 2) + 1) = 0, then biState(t, Bni(j + 1, 1) − 1) must be 1. Otherwise, Bni(j + 1, 1) would be replaced by Bni(j + 1, 1) − 1.
If biState(t, Bni(j, 2) + 1) = 1, then row t contains at least one 1 within columns Bni(j + 1, 1) − 1 to Bni(j, 2) + 1.
Similarly, biState(t2, Bni(j + 1, 1)) is the first 0 in row t2 of the run from column Bni(j + 1, 1) to Bni(j + 1, 2) (for some t2), which implies that biState(t2, Bni(j + 1, 1) − 1) = 1.
If t3 ≠ t2, biState(t3, Bni(j + 1, 1) − 1) may take 0 or 1.
If biState(t3, Bni(j + 1, 1) − 1) = 0, then the run of consecutive 0’s containing biState(t3, Bni(j + 1, 1) − 1) in row t3 has length < ⌈log2(m)⌉. Otherwise, Bni(j + 1, 1) would not be the original one.
If biState(t3, Bni(j + 1, 1) − 1) = 1, then row t3 contains at least one 1 within columns Bni(j + 1, 1) − 1 to Bni(j, 2) + 1. Thus, min{Bni(j, 2) − Bni(j + 1, 1) + 3} gives the solution.
Case two: If
$$\min_{j} \bigl\{\mathrm{Bn}_i(j, 2) - \mathrm{Bn}_i(j+1, 1) + 3\bigr\} < \lceil \log_2(m) \rceil,$$
then it is sufficient to consider the case in which biState(t3, Bni(j + 1, 1) − 1) is 0, because in the other cases min{Bni(j, 2) − Bni(j + 1, 1) + 3} is not less than ⌈log2(m)⌉. Thus, we extend the indicated columns so that they contain column Bni(j + 1, 1) − 1 and have length ⌈log2(m)⌉. Columns of length ⌈log2(m)⌉ then produce the desired solution. Assume
$$c = \max\Bigl\{\min_{j} \bigl\{\mathrm{Bn}_i(j, 2) - \mathrm{Bn}_i(j+1, 1) + 3\bigr\},\ \lceil \log_2(m) \rceil\Bigr\},$$
and since we have shown that there exists a node set that can differentiate the cell types, it remains to prove that c is optimal. Put another way, there is no set of consecutive nodes of length c′ (c′ < c) that can distinguish the attractors of Ai. Assume there exists a column set (c1, ⋯ , ct) of length c′; it is then sufficient to show that c′ < c does not hold. If c′ is less than c, then a row exists such that the columns from c1 to ct form a subset of the columns from Bni(j, 1) to Bni(j, 2). This means that a row exists whose elements on these columns are all 0’s. Therefore, we conclude that no c′ with c′ < c exists, and max {min{Bni(j, 2) − Bni(j + 1, 1) + 3}, ⌈log2(m)⌉}, j ∈ {1, ⋯ , l − 1}, is the desired solution for Ai. □
We give an example to illustrate the first case of our algorithm.
Example 2: Let A be given below:
(4) |
Then we have
(5) |
and we can see that the matrix Bcon will be in the form
(6) |
It is noted that the union of Bcon is {[1 2 3], [5 6]}, which is a proper subset of U. Thus, any run of ⌈log2(m)⌉ (i.e., 2) consecutive columns that contains an element of U \ Buni is a desired set of nodes. Here the only element of U \ Buni is 4, and the runs of length 2 that include 4 are [3, 4] and [4, 5], which are exactly the minimal sets of nodes necessary to discriminate the attractors. Thus, [ν3, ν4] or [ν4, ν5] may be taken as the desired minimal consecutive nodes. However, if the union of Bcon is exactly equal to U, for instance,
(7) |
Then we have
(8) |
Similarly, we can obtain the matrix Bcon which is in this form:
Since the union of the vectors in Bcon covers 1 to n, it is necessary to apply Step VI. The matrix Bindex is sorted by its first column using the bucket sort algorithm, and for each first element only the row with the greatest last element is kept. Bni is then given by
(9) |
Then applying our proposed algorithm, the minimum set of columns is as follows:
which is equal to 2. Furthermore, since A has 4 rows, at least ⌈log2(m)⌉ (i.e., 2) nodes are required to distinguish the attractors, which implies that no smaller number can be found for any other combination of the attractor set. Thus, we conclude that the length of the minimal set of nodes needed to distinguish the attractors is two. In this example, (ν2, ν3) represents the desired minimal consecutive nodes.
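The sorting and pruning of Bindex used in this example (Steps VI and VII) can be sketched as follows. The sketch swaps in Python's built-in sort in place of bucket sort, and both the function name `prune_intervals` and the interval values are hypothetical.

```python
def prune_intervals(b_index):
    """Sort (first, last) pairs by first element and keep, for each first
    element, only the pair with the largest last element."""
    best = {}
    for first, last in b_index:
        best[first] = max(last, best.get(first, last))
    return sorted(best.items())

b_index = [(3, 5), (1, 2), (1, 4), (3, 4), (2, 6)]   # hypothetical zero-run indices
print(prune_intervals(b_index))                       # [(1, 4), (2, 6), (3, 5)]
```

Rows such as (1, 2) are dropped because the run they describe is contained in the longer run (1, 4) with the same first column.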
In the case where a BN consists of singleton and cyclic attractors, we consider the following example to illustrate our method.
Example 3: We assume the steady states are as follows,
then there are l1 ∙ l2 ∙ l3 ∙ l4 = 4 combinations in total. The combinations are
(10) |
(11) |
(12) |
(13) |
By applying our algorithm, the minimal lengths of consecutive nodes for the combinations Ai (i = 1, 2, 3, 4) are 4, 6, 3 and 2, respectively. Comparing these values, we conclude that the minimal cardinality of consecutive nodes necessary to distinguish the different attractor cycles is 2. Since the cyclic attractors are of period 2, the algorithm works in O(2mn). Furthermore, it should be noted that m is typically very small, so it is reasonable to apply our method to singleton or cyclic attractors in large-scale networks.
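The "one value per combination, then take the minimum" logic of this example can be illustrated with a brute-force sketch. The per-combination search below is a simple window scan rather than the procedure of Steps II to VIII, and the attractor set `S` is hypothetical (it is not the data of Example 3).

```python
from itertools import product
from math import ceil, log2

def min_window_len(rows):
    """Length of the shortest run of consecutive columns separating all rows."""
    m, n = len(rows), len(rows[0])
    for w in range(max(1, ceil(log2(m))), n + 1):
        for s in range(n - w + 1):
            if len({r[s:s + w] for r in rows}) == m:
                return w
    return None

# Hypothetical attractors over five genes: two singletons, two cycles of period 2.
S = [
    ["00000"],
    ["11111"],
    ["00001", "10000"],
    ["00011", "11000"],
]
combos = [list(c) for c in product(*S)]          # A_1, ..., A_4
min_l = [min_window_len(A) for A in combos]      # one minL_i per combination
print(min_l, min(min_l))                         # [3, 4, 4, 3] 3
```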
B. COMPLEXITY ANALYSIS
We have conducted a complexity analysis for our algorithm. The principal computational costs come from two parts: generating the matrix biState and running the sorting algorithm. The first part requires $\binom{m}{2} n$ operations, where $\binom{m}{2} = m(m-1)/2$ is the number of attractor pairs. Since the number of attractors in a given BN is usually quite small, we conclude that the computational complexity of this step is O(n). For the sorting in Step VI, we adopt bucket sort, which works in O(n), and we repeat the algorithm Pm times so that all possible combinations of singleton or cyclic attractors are considered. Thus, we conclude that our algorithm is both effective and efficient for the problem of attractor observability and works in O(Pmn).
IV. NUMERICAL EXPERIMENTS
In this section, we perform computational experiments to validate our methodology for attractor observability. Initially, we randomly generated a singleton attractor set and repeated the experiment ten times with different simulated attractors. We then took the average value as the final result. The notations used are as follows.
m: attractor number;
n: node number;
time: average time (in seconds) for each trial;
numNode: the desired minimal node number.
We utilize the above notations herein.
As seen in Table 2, our method was effective and efficient in solving the observability problem of singleton attractors for large-scale networks. Even when the BN had up to 300 nodes, the average elapsed time was less than one second, which is much faster than the ILP-based method [24]. In addition, the desired number of nodes to distinguish different attractor cycles was consistently small even though the node number was large. For our next analysis, we randomly generated a cyclic attractor set and set the period of the cyclic attractors to two. Computational times are shown in Table 3, which illustrates the efficiency of our method. The running time was around 1 second even when the number of network nodes was set to 1000. Therefore, our algorithm can be applied to cyclic attractors, instead of being confined to singleton attractors.
TABLE 2.
Results on observability of singleton attractor for our method.
n/m | 100/10 | 150/15 | 200/20 | 250/25 | 300/30 |
---|---|---|---|---|---|
Time (sec) | 0.013 | 0.02 | 0.024 | 0.029 | 0.047 |
numNode | 4.1 | 5.1 | 6 | 6.5 | 6.9 |
TABLE 3.
Results on observability of cyclic attractors with period 2.
n/m | 100/4 | 400/5 | 600/6 | 800/7 | 1000/8 |
---|---|---|---|---|---|
Time (sec) | 0.19 | 0.19 | 0.15 | 0.75 | 1.42 |
numNode | 2 | 2.1 | 3 | 3 | 3.2 |
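The singleton-attractor experiment can be mimicked with the following sketch, which draws random attractor sets and averages the resulting numNode over ten trials. It is only an approximation of the setup described above: the generator, the seed and the brute-force window search are assumptions, so the numbers it prints need not match Table 2.

```python
import random
from math import ceil, log2
from statistics import mean

def min_window_len(rows):
    """Length of the shortest run of consecutive columns separating all rows."""
    m, n = len(rows), len(rows[0])
    for w in range(max(1, ceil(log2(m))), n + 1):
        for s in range(n - w + 1):
            if len({r[s:s + w] for r in rows}) == m:
                return w
    return None

def random_attractors(n, m, rng):
    """Draw m distinct random binary states of length n (stand-ins for singleton attractors)."""
    rows = set()
    while len(rows) < m:
        rows.add(tuple(rng.randint(0, 1) for _ in range(n)))
    return sorted(rows)

rng = random.Random(0)                      # fixed seed so the run is repeatable
n, m, trials = 100, 10, 10
num_node = mean(min_window_len(random_attractors(n, m, rng)) for _ in range(trials))
print(f"n={n}, m={m}: average numNode = {num_node}")
```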
To examine the distribution of the desired number of nodes when the number of attractors (m) is fixed, we fixed m at 20 and conducted the analysis for values of n ranging from 100 to 1000. The desired numbers of nodes are shown in Figure 2. Since most of the desired numbers are 6, our method is insensitive to the node number in BNs with a fixed number of attractors. Therefore, variation in n does not significantly affect the desired node number for discriminating the attractors.
FIGURE 2.
Sensitivity analysis on the number of nodes in BNs. Step size 100 is used for n in [100,1000] and m is fixed to be 20.
We also examined the distribution of the desired node number when the number of nodes (n) was fixed at 100. We adopted 10 as the step size for m in [10,100]. As illustrated in Figure 3, most of the desired numbers of consecutive nodes are quite small even though m is large. Therefore, we can discriminate the set of attractors in large-scale networks by identifying a small list of desired nodes.
FIGURE 3.
Sensitivity analysis on the number of attractors in BNs. Step size 10 is used for m in [10,100] and n is fixed to be 100.
To further verify our strategy, we applied our methodology to a popular genetic regulatory model of Drosophila melanogaster. In [26], Albert et al. proposed a model to describe embryonic pattern formation in the fruit fly Drosophila melanogaster. Their Boolean model consisted of 60 variables whose steady states were identified by manually solving a system of Boolean equations. Subsequently, the authors of [27] developed a strategy for efficiently identifying the attractors of the network. To analyze the model, they first standardized the variables of the Boolean rules described in [26] by renaming them, i.e., SLPi or wgi to x1, ⋯ , x60. The variables xi and the corresponding genes are summarized in Table 4. They next applied their proposed method and obtained the steady states shown in Table 5. Each row in Table 5 corresponds to a stable attractor and each column represents a gene (or protein). Attractors are denoted by binary values, with 1 representing a gene being expressed (or a high protein concentration) and 0 representing a gene not being expressed (or a low concentration). Based on the given attractor set, we applied our method to identify the minimum cardinality of contiguous nodes (i.e., 22) necessary to distinguish the attractors. Specifically, (x3, ⋯ , x24) are the desired minimum consecutive variables, which implies that these 22 nodes can be used to discriminate the attractors. The 22 variables correspond to the list of genes or proteins shown in Table 6.
TABLE 4.
Correspondence of variables and gene names.
Gene/protein | Compartment 1 | Compartment 2 | Compartment 3 | Compartment 4 |
---|---|---|---|---|
SLP | x1 | x16 | x31 | x46 |
wg | x2 | x17 | x32 | x47 |
WG | x3 | x18 | x33 | x48 |
en | x4 | x19 | x34 | x49 |
EN | x5 | x20 | x35 | x50 |
hh | x6 | x21 | x36 | x51 |
HH | x7 | x22 | x37 | x52 |
ptc | x8 | x23 | x38 | x53 |
PTC | x9 | x24 | x39 | x54 |
PH | x10 | x25 | x40 | x55 |
SMO | x11 | x26 | x41 | x56 |
ci | x12 | x27 | x42 | x57 |
CI | x13 | x28 | x43 | x58 |
CIA | x14 | x29 | x44 | x59 |
CIR | x15 | x30 | x45 | x60 |
TABLE 5.
Steady states of a Drosophila model.
000000001001101000000001001101100000001001101100000001001101 |
000111100010000000111100010000111000011111110111000011111110 |
000000011111110000111100010000111000011111110100000001001101 |
011000011111110000111100010000111000011111110100000001001101 |
000000011111110000111101000000111000011111110100000001001101 |
011000011111110000111101000000111000011111110100000001001101 |
000111100010000000000011111110100000001001101111000011111110 |
000111101000000000000011111110100000001001101111000011111110 |
000111100010000011000011111110100000001001101111000011111110 |
000111101000000011000011111110100000001001101111000011111110 |
TABLE 6.
Detected minimum set of 22 genes or proteins.
compartment 1 | WG,en,EN,hh,HH,ptc,PTC,PH,SMO,ci,CI,CIA,CIR |
compartment 2 | SLP,wg,WG,en,EN,hh,HH,ptc,PTC |
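The reported window can be cross-checked directly on the steady states of Table 5 with a brute-force scan, sketched below. This is an independent check rather than the algorithm of Section III; the names `STEADY_STATES`, `GENES` and `min_window` are illustrative, and the gene order per compartment is taken from Table 4.

```python
from math import ceil, log2

# Rows of Table 5, copied verbatim (60 binary values each).
STEADY_STATES = [
    "000000001001101000000001001101100000001001101100000001001101",
    "000111100010000000111100010000111000011111110111000011111110",
    "000000011111110000111100010000111000011111110100000001001101",
    "011000011111110000111100010000111000011111110100000001001101",
    "000000011111110000111101000000111000011111110100000001001101",
    "011000011111110000111101000000111000011111110100000001001101",
    "000111100010000000000011111110100000001001101111000011111110",
    "000111101000000000000011111110100000001001101111000011111110",
    "000111100010000011000011111110100000001001101111000011111110",
    "000111101000000011000011111110100000001001101111000011111110",
]
# Gene/protein order within each compartment, as listed in Table 4.
GENES = ["SLP", "wg", "WG", "en", "EN", "hh", "HH", "ptc", "PTC",
         "PH", "SMO", "ci", "CI", "CIA", "CIR"]

def min_window(rows):
    """Shortest run of consecutive variables on which all rows differ.
    Returns (length, first) with a 1-based first variable."""
    m, n = len(rows), len(rows[0])
    for w in range(ceil(log2(m)), n + 1):
        for s in range(n - w + 1):
            if len({r[s:s + w] for r in rows}) == m:
                return w, s + 1
    return None

w, first = min_window(STEADY_STATES)
names = [f"{GENES[(x - 1) % 15]} (compartment {(x - 1) // 15 + 1})"
         for x in range(first, first + w)]
print(w, f"x{first}..x{first + w - 1}")     # 22 x3..x24
print(names)
```

The scan agrees with the result above: 22 consecutive variables, x3 to x24, running from WG in compartment 1 to PTC in compartment 2.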
V. CONCLUSION
In this work, we addressed a novel problem, the observability of singleton or cyclic attractors, which is of value in distinguishing different attractor cycles. We developed an effective and efficient methodology to solve this problem, and our complexity analysis shows that it works in O(Pmn) time. As the number of attractors (m) is typically small, our method can be applied to large-scale networks in addition to medium-size ones. We also performed computational experiments verifying the efficiency and effectiveness of the method, and applied our algorithm to a real-world scenario, i.e., segment polarity gene expression in Drosophila melanogaster. Our results suggest that our strategy may provide a novel tool for identifying useful genetic network biomarkers for the detection of disease. In future work, we will focus on developing more efficient methods to solve this problem.
ACKNOWLEDGMENT
The authors are grateful to the anonymous reviewers for their valuable suggestions and remarks which helped to improve the quality of the paper. This paper was presented in part at the workshop session of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM18) (2018) in Madrid, Spain. The full paper has not been published.
This work was supported in part by the Natural Science Foundation of SZU under Grant 000393, in part by the Natural Science Foundation of Shenzhen under Grant JCYJ 20170817100950436, and in part by the National Natural Science Foundation of China NSFC under Grant 11901575, Grant 91730301, and Grant 61472428.
Biography
YUSHAN QIU received the B.E. degree from the School of Mathematics, South China Normal University, Guangdong, China, in 2007, and the Ph.D. degree from the Department of Mathematics, The University of Hong Kong, in 2015. After that, she held a postdoctoral position with Northwestern University, Evanston, IL, USA, from 2015 to 2016. She joined Shenzhen University as an Assistant Professor in September 2016. Her research interests include bioinformatics and machine learning.
YULONG HUANG is currently pursuing a major in biomedical science and a minor in philosophy, with expected graduation, in fall 2021. She is also a rising junior with the College of Allied Health, University of South Alabama. As a part of her school’s Early Acceptance Program, she plans to attend the University of South Alabama College of Medicine, in spring 2021. In addition, she is an Active Member in organizations such as the Alpha Epsilon Delta, the Biomedical Sciences Society, the Honors College Association, the Chinese Students and Scholars Association, and the Reformed University Fellowship.
SHAOBO TAN is currently pursuing the master’s degree with the School of Computing, University of South Alabama, under the supervision of Dr. J. Huang. His research areas focus on machine intelligence, data semantics, and knowledge acquisition from large amounts of data.
DONGQI LI is currently a junior student with the Contra Costa Christian School. As an external member of Dr. J. Huang's research group at the School of Computing, University of South Alabama, he has been conducting research remotely for the past year, under the supervision of Dr. Huang.
ADA CHAELI VAN DER ZIJP-TAN is currently pursuing a major in biomedical science and minors in mathematics, chemistry, and biology with the College of Allied Health. She participates as a student selected for the College of Medicine Early Acceptance Program, University of South Alabama. She is also involved in the Honors College, Alpha Epsilon Delta Medical Honors Society, and the Chinese Scholars and Students Association on campus.
GLEN M. BORCHERT received the B.S. degree in biology from the University of Tennessee, in 2000, and the Ph.D. degree in interdisciplinary genetics from the University of Iowa, in 2007, and completed Postdoctoral Fellowships at Illinois State and UC Berkeley. He started his own laboratory at the University of South Alabama, in 2012, and the Genetic Mechanisms Cluster of the National Science Foundation named him an NSF CAREER Investigator in 2014. His current research focuses on identifying novel genes and genetic regulators in an array of bacterial, plant, and animal systems through next generation sequencing analysis.
HAO JIANG received the B.Sc. degree from the Harbin Institute of Technology, in 2009, and the Ph.D. degree from The University of Hong Kong, in 2013. She is currently an Assistant Professor with the School of Mathematics, Renmin University of China. Her research interests include mathematical modeling in bioinformatics and matrix analysis-based machine learning methods.
JINGSHAN HUANG received the Ph.D. degree in computer science and engineering from the University of South Carolina, in 2007. He joined the University of South Alabama, in 2009. His research areas are data science and computational life science in general; in particular, the semantics and analysis of data as well as their innovative applications on human genomics and transcriptomics. His research is supported by NIH, NSF, and DOE.
REFERENCES
- [1] Kauffman SA, “Metabolic stability and epigenesis in randomly constructed genetic nets,” J. Theor. Biol., vol. 22, no. 3, pp. 437–467, March 1969.
- [2] Kauffman S, The Origins of Order: Self-organization and Selection in Evolution. New York, NY, USA: Oxford Univ. Press, 1993.
- [3] Kauffman S, At Home in the Universe. Oxford, U.K.: Oxford Univ. Press, 1995.
- [4] Albert R and Barabasi A-L, “Dynamics of complex systems: Scaling laws for the period of Boolean networks,” Phys. Rev. Lett., vol. 84, no. 24, pp. 5660–5663, June 2000.
- [5] Albert R and Othmer H, “The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster,” J. Theor. Biol., vol. 223, no. 1, pp. 1–18, June 2003.
- [6] Drossel B, Mihaljev T, and Greil F, “Number and length of attractors in a critical Kauffman model with connectivity one,” Phys. Rev. Lett., vol. 94, no. 8, March 2005, Art. no. 088701.
- [7] Samuelsson B and Troein C, “Superpolynomial growth in the number of attractors in Kauffman networks,” Phys. Rev. Lett., vol. 90, no. 9, March 2003, Art. no. 98701.
- [8] Devloo V, Hansen P, and Labbé E, “Identification of all steady states in large networks by logical analysis,” Bull. Math. Biol., vol. 65, no. 6, pp. 1025–1051, November 2003.
- [9] Irons D, “Improving the efficiency of attractor cycle identification in BNs,” Phys. D, Nonlinear Phenomena, vol. 217, no. 1, pp. 7–21, May 2006.
- [10] Zhang SQ, Hayashida M, Akutsu T, Ching WK, and Ng MK, “Algorithms for finding small attractors in Boolean networks,” EURASIP J. Bioinf. Syst. Biol., vol. 2007, January 2007, Art. no. 20180.
- [11] Akutsu T, Kuhara S, Maruyama O, and Miyano S, “A system for identifying genetic networks from gene expression patterns produced by gene disruptions and overexpressions,” Genome Informat., vol. 9, pp. 151–160, September 1998.
- [12] Akutsu T and Bao F, “Approximating minimum keys and optimal substructure screens,” in Proc. Int. Comput. Combinatorics Conf. (COCOON), 1996, pp. 290–299.
- [13] Akutsu T, Zhao Y, Hayashida M, and Tamura T, “Integer programming-based approach to attractor detection and control of Boolean networks,” IEICE Trans. Inf. Syst., vol. E95-D, no. 12, p. 2960, 2012.
- [14] Qiu Y, Tamura T, Ching WK, and Akutsu T, “On control of singleton attractors in multiple Boolean networks: Integer programming-based method,” BMC Syst. Biol., vol. 8, p. S7, January 2014.
- [15] Luenberger D, Introduction to Dynamic Systems: Theory, Models, Application. Hoboken, NJ, USA: Wiley, 1979.
- [16] Pal R, Datta A, Bittner ML, and Dougherty ER, “Intervention in context-sensitive probabilistic Boolean networks,” Bioinformatics, vol. 21, no. 7, pp. 1211–1218, April 2005.
- [17] Pal R, Datta A, Bittner M, and Dougherty ER, “Optimal infinite-horizon control for probabilistic Boolean networks,” IEEE Trans. Signal Process., vol. 53, no. 6, pp. 2375–2387, June 2006.
- [18] Akutsu T, Hayashida M, Ching W-K, and Ng MK, “Control of Boolean networks: Hardness results and algorithms for tree structured networks,” J. Theor. Biol., vol. 244, no. 4, pp. 670–679, February 2007.
- [19] Cheng D and Qi H, “Controllability and observability of Boolean control networks,” Automatica, vol. 45, no. 7, pp. 1659–1667, July 2009.
- [20] Cheng D, “Input-state approach to Boolean networks,” IEEE Trans. Neural Netw., vol. 20, no. 3, pp. 512–521, March 2009.
- [21] Liu Y-Y, Slotine J-J, and Barabasi A-L, “Observability of complex systems,” Proc. Nat. Acad. Sci. USA, vol. 110, no. 7, pp. 2460–2465, February 2013.
- [22] Lucchesi C and Osborn S, “Candidate keys for relations,” J. Comput. Syst. Sci., vol. 17, no. 2, pp. 270–279, October 1978.
- [23] Motwani R and Xu Y, “Efficient algorithms for masking and finding quasi-identifiers,” in Proc. Conf. Very Large Data Bases (VLDB), Vienna, Austria, September 2007, pp. 83–93.
- [24] Cheng X, Qiu Y, Jiang H, Yim M, and Ching WK, “Integer programming-based method for observability of singleton attractors in Boolean networks,” IET Syst. Biol., vol. 11, no. 1, pp. 30–35, February 2017.
- [25] Qiu Y, Huang Y, Tan S, Li D, Zijp-Tan A, Fong A, Borchert G, and Huang J, “Novel method for singleton and cyclic attractor observability in Boolean networks,” in Proc. IEEE Int. Conf. Bioinf. Biomed. (BIBM), Madrid, Spain, December 2018, pp. 1526–1531.
- [26] Albert R and Othmer H, “The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster,” J. Theor. Biol., vol. 223, no. 1, pp. 1–18, July 2003.
- [27] Hinkelmann F, Brandon M, Guang B, McNeill R, Blekherman G, Veliz-Cuba A, and Laubenbacher R, “ADAM: Analysis of discrete models of biological systems using computer algebra,” BMC Bioinformatics, vol. 12, July 2011, Art. no. 295.