Abstract
Background
Controlling complex molecular regulatory networks is getting a growing attention as it can provide a systematic way of driving any cellular state to a desired cell phenotypic state. A number of recent studies suggested various control methods, but there is still deficiency in finding out practically useful control targets that ensure convergence of any initial network state to one of attractor states corresponding to a desired cell phenotype.
Results
To find out practically useful control targets, we introduce a new concept of phenotype control kernel (PCK) for a Boolean network, defined as the collection of all minimal sets of control nodes having their fixed state values that can generate all possible control sets which eventually drive any initial state to one of attractor states corresponding to a particular cell phenotype of interest. We also present a detailed method with which we can identify PCK in a systematic way based on the layered network and converging tree of a given network. We identify all candidates for control nodes from the layered network and then hierarchically search for all possible minimal sets by using the converging tree. We show the usefulness of PCK by applying it to cell proliferation and apoptosis signaling networks and comparing the results with other control methods. PCK is the unique control method for Boolean network models that can be used to identify all possible minimal sets of control nodes. Interestingly, many of the minimal sets have only one or two control nodes.
Conclusions
Based on the new concept of PCK, we can identify all possible minimal sets of control nodes that can drive any molecular network state to one of multiple attractor states representing a same desired cell phenotype.
Electronic supplementary material
The online version of this article (10.1186/s12918-018-0576-8) contains supplementary material, which is available to authorized users.
Keywords: Biological network, Boolean network model, Attractor, Basin, Target control, Network control, Phenotype control kernel, Layered network, Converging tree
Background
The ultimate goal of systems biology is to control a cellular state which is determined by the dynamics of the underlying molecular regulatory network. Here, the cellular state transition dynamics is governed by both topology (structural information on interaction between molecules) and regulatory functions (operations of the types of interactions).
There have been attempts (presented by various ‘control methods’) to find out ‘control nodes’ that can drive the network state to a desired one by fixing the state values of the control nodes or directly controlling the control nodes with external signals. One attempt was to find out a dominating set from the topology of undirected networks [1–7], which is defined as a set of central nodes that are connected to all the other nodes in undirected networks. Another bunch of works suggested driver nodes [8, 9], steering kernel [10] and feedback vertex sets [11, 12] based only on the topology of directed networks. To drive any given initial state composed of all the nodes in the network into any other final state within a finite time (‘full control’), it is enough to control driver nodes in [8] with external input signals directly acting on the driver nodes, where the dynamics of the network is modelled by a system of linear differential equations and the signals can be explicitly defined. The control method in [9] was developed to find out driver nodes to drive some nodes of interest instead of all the nodes as in [8], which can be considered as a generalization of [8] and is called ‘target control’ compared to full control. For transition between two specific states of a network instead of any two states as in [8], the structure-based method in [10] can be used where the dynamics is also modelled by a system of linear differential equations. It is possible that fewer nodes are enough to control some networks by using the methods in [9, 10] than in [8]. The strategy in [11, 12] is to make a given network acyclic by removing its feedback loops, where the removal is implemented by fixing state values of some nodes in the loops. When applying the structure-based method in [11, 12], the dynamics is modelled by a system of first order differential equations and the fixed values must be the values of the nodes in the desired attractor, where the point attractor for the differential equations is defined as a vector composed of all the node values at which the first derivatives of the nodes are zero. Despite the differential equations used in [11, 12] must satisfy some properties, the equations can be nonlinear.
Although the control methods based only on the topology can provide control nodes for a large size of networks, topology itself is not enough to identify control nodes that can ensure transition towards a desired attractor, which was shown by using Boolean network models [13]. Note that the Boolean attractor is defined as a set of vectors of Boolean state values of all the nodes that is closed with respect to transition (i.e., if a state vector is in the set, then next state vector is also included in the set). So, a number of other studies were carried out to use regulatory functions of the network in finding out control nodes. The control method in [14] was developed for continuous models as in [8–12] whose goal is to identify a sequence of perturbations to the undesired state of the system that can drive it to the attraction basin of the desired stable state, where the basin is the set of states converging to the desired state. However, there is no efficient algorithm of finding out the exact basin. Among various types of regulatory functions, the Boolean function is particularly useful for modeling large-scale regulatory networks as it is a parameter-free logical function and thereby we can avoid parameter estimation which is often a critical limitation in mathematical modeling of such large-scale networks [15–17], so many control methods for Boolean network models were developed [18–25]. The basic idea in [18] is to numerically find out a minimal control set by using a genetic algorithm. The control method integrating the structure of a given network and the Boolean regulatory functions was developed in [19] by using both minimal strongly connected components and their fixed points of the Boolean model. The approach in [19] is based on the concept in [20]. The desired final state used in the control methods in [18–20] must be a Boolean attractor before applying the control methods, but there is no such restriction on the desired final state in [21–25] as in [8–10]. The state values of control nodes found in [18–20] and [21–25] are fixed and directly controlled with external Boolean inputs as in [11, 12] and [8–10], respectively. Given a set of external control nodes for a Boolean network which has a tree structure, the polynomial time algorithm in [21] and integer programming-based approach in [22] were introduced to find out a sequence of the state values of the external control nodes to force an initial state to transit toward a desired state within desired time steps. Necessary and sufficient conditions for controllability and observability of Boolean control networks are considered in [23] based on semi-tensor product of matrices, which is a new matrix product and requires high computational cost. The control method in [24] is based on computational algebra, which forces the desired state to become a fixed point attractor. This method is only applicable to the case of controlling a desired state to become a point attractor. To avoid undesirable state transitions, edge-deletion strategy can be applied as in [25].
In most cases, there are multiple attractor states corresponding to a particular cell phenotype of interest, which is defined by the state values of some nodes instead of all nodes. So, in case a control method for Boolean models is to drive a given network state to converge to any attractor corresponding to the phenotype, the desired final state depends only on the state values of the phenotype nodes instead of all nodes, which is a type of target control as in [9] for continuous models. However, there is no such target control method for Boolean network models as far as we know. In addition, there can be multiple control sets, which is the set of control nodes, for a given network and so we need target control methods whose goal is to find out all possible control sets, but there is no such target control method yet. Therefore, in this paper, we present a novel and practical target control method for Boolean network models with which we can identify all minimal control sets and show its usefulness by applying it to biological network examples.
Results
The layered network and converging tree of an example network
To illustrate the main idea, let us consider a Boolean network model of a small size with a unique phenotype node P and update rules as shown in Fig. 1a where the desired phenotype value is P = 0. The procedure of finding minimal control sets for P = 0 is largely composed of two parts: The first part is hierarchically constructing a layered network (Fig. 1b) from the given network (Fig. 1a), where each layer consists of nodes of the given network. The second part is hierarchically finding out minimal control sets (Fig. 2a-d) and constructing the converging tree (Fig. 2e) which consists of the minimal control sets.
The layered network can be constructed as follows. The 0th layer of the layered network (Fig. 1b) is composed of the phenotype node P. The input nodes to P are C and E, which comprise the 1st layer. Likewise, among the nodes of A, B, D and F which are not used in the 0th nor 1st layer, the input nodes to C or E are B, D and F, which comprise the 2nd layer. Finally, the remaining node A is an input node to D, so the 3rd layer is composed only of A. Since there is no remaining input node to a node in the 3rd layer, the 4th layer does not exist and therefore the 3rd layer becomes the last layer in the layered network. The step-by-step procedure for construction of the layered network implies the uniqueness of the layered network. Using the layered network, we can separate the nodes in a given network into two groups: One is the set of nodes which have an influence on the phenotype nodes and the other is the set of nodes which have no influence. The two groups are presented for a simplified mitogen-activated protein kinase network. In addition, to find out a minimal control set containing a control node ϒ of interest, we can use the information of the location of the node ϒ in the layered network, which is shown for a simplified cancer cell signaling network. In particular, information on the influential nodes obtained from the layered network is used to implement the algorithm for construction of the converging tree. We can then find out minimal control sets in the converging tree by hierarchically applying necessary and sufficient conditions for the phenotype node to have the desired value as follows:
Step 0. Determine the desired value of the phenotype node. The 0th level of the converging tree (Fig. 2e) is the singleton set {P = 0} of the desired steady state value of the phenotype node P (Fig. 2a). For simplicity, we use the notation {P = 0} instead of the set {P|P = 0}.
Step 1. Find children sets in the 1st level of the converging tree that directly generate the parent set {P = 0} in the 0th level. To define the concepts of child and parent sets, let us consider that a minimal control set S1 is located in the ith level. If inserting S1 into the update rules results in a minimal control set S2 in the (i − 1)th level, then the minimal control set S1 is called the child set of the parent set S2. The parent set of S2 is also called the ancestor set of S1. We often omit the term ‘minimal’ in ‘minimal control sets’ if there is no confusion. The candidate nodes of the children sets of {P = 0} are P, C and E because of P* = C&E. Then the steady state value P = 0 is directly generated by substituting one of the three perturbations {P = 0}, {C = 0} and {E = 0} to the update rules (Fig. 1a) and, as a result, these three perturbation targets constitute the control sets in the 1st level. For a new child set S, we apply the following two rules of removing ‘included’ control sets or ‘contradictory’ children sets to both S ∪ Sbefore and Sancester, where Sbefore is the union of control sets found in the previous process and Sancester the union of the parent and ancestor sets of the child set S:
The first rule is to remove ‘included’ control sets in the set S ∪ Sbefore (see the first removal rule in Methods for details).
The second rule is to remove S if the value of a control node N in S is ‘contradictory’ to the value of the control node N in Sancester(see the second removal rule in Methods for details).
For the control set S={P = 0} in the 1st level, we apply the removal rules to S ∪ Sbefore = {P = 0}∪{P = 0} = {P = 0} and Sancester = {P = 0} and, as a result, {P = 0} in the 1st level is included in {P = 0} in the 0th level. Therefore, the control set {P = 0} in the 1st level is removed by the first rule. For the control set S={C = 0} in the 1st level, we apply the removal rules to S ∪ Sbefore={C = 0}∪{P = 0} and Sancester={P = 0} and, as a result, we find that no control set is removed until the 1st level except {P = 0} in the 1st level. Similarly, in the case of S={E = 0} in the 1st level, we have S ∪ Sbefore={E = 0}∪{C = 0}∪{P = 0} and Sancester={P = 0}, and therefore we find that no control set is removed until the 1st level except {P = 0} in the 1st level. As a result, the 1st level consists of two control sets {C = 0} and {E = 0} which are referred to as signals for the control set {P = 0} in the 0th level (Fig. 2b). In the following, for a singleton and parent set {K = 0}, we exclude such a node K from the candidate nodes of its children sets. Since the level of the minimal control set {C = 0} is the 1st level, {C = 0} can directly control the phenotype value P = 0 in the 0th level.
Step 2. Find children sets that directly generate each parent set in the 1st level. Since there exist two parent sets {C = 0} and {E = 0} in the 1st level (Fig. 2b), the children sets can be found for each parent set (Fig. 2c).
Step2–1. Parent set {C = 0} in the 1st level. The candidate nodes of the children sets are B, D and E because of the update rule C* = (!B)&D&E for the node C. Then the steady state value C = 0 is directly generated by substituting one of the three perturbations {B = 1}, {D = 0}, {E = 0} to the update rule. Applying the first removal rule, we find that {E = 0} in the 2nd level is removed due to {E = 0} in the 1st level and that the remaining control sets in the 2nd level are {B = 1} and {D = 0}.
Step2–2. Parent set {E = 0} in the 1st level. The candidate nodes are D and F because of the update rule E* = D|F. Then the steady state value E = 0 is directly generated by substituting the perturbation {(D, F) = (0,0)} to the update rule. Applying the first removal rule, we find that {(D,F) = (0,0)} in the 2nd level is removed due to {D = 0} in the 2nd level and that the parent set {E = 0} in the 1st level does not have a child set, resulting in that {E = 0} becomes a leaf set in the 1st level.
The control sets in the 2nd level are {B = 1} and {D = 0} which are referred to as signals for the parent set {C = 0} in the 1st level. Since the level of the minimal control set {B = 1} is the 2nd level, {B = 1} can indirectly control the phenotype value P = 0 in the 0th level via the parent set {C = 0} in the 1st level. Following the similar process, we can construct the converging tree in Fig. 2e (see Additional file 1 for details). Finally, we find out six minimal control sets: {C = 0} and {E = 0} in the 1st level, {B = 1} and {D = 0} in the 2nd level (children sets of {C = 0}), {A = 0} in the 3rd level (child set of {D = 0}), and {F = 0} in the last level (child set of {A = 0}). Using the converging tree, we find out that the minimal control set {F = 0} generates the minimal control sets which are the parent or ancestor sets {A = 0}, {D = 0} and {C = 0}. Since all possible sets of control targets must contain at least one minimal control set (see the proof in Additional file 1), we call the collection of such minimal control sets ‘phenotype control kernel (PCK)’ for a given Boolean network model. Therefore, PCK consists of six minimal control sets in this case,
Two biological network models
We apply our method to two biological network models. The first one is the mitogen-activated protein kinase (MAPK) model [26] with 53 nodes and 88 links, where Boolean update rules are described with logic functions (the first tab in Additional file 2). To show that our method can also be applied to models with threshold functions instead of logic functions, we employed another cancer cell signaling network model [27] of 96 nodes and 265 links, where the Boolean update rules are described with threshold functions (the first tab in Additional file 3).
The simplified MAPK network and control strategy
The MAPK model has four stimuli (DNA_damage, EGFR_stimulus, FGFR3_stimulus, TGFBR_stimulus) and three phenotype nodes (Apoptosis, Growth_Arrest, Proliferation) (Additional file 4: Figure S1).
We use the four input values (DNA_damage, EGFR_Stimulus, FGFR3_Stimulus, TGFBR_Stimulus) = (0,1,1,0), where the input values represent proliferative conditions. The MAPK model with these input values has five attractors for 1000 random initial states, where the value of Proliferation in each attractor is not fixed as Proliferation = 1. So, to make a proper cancer state space in which all attractors have Proliferation = 1, we consider typical oncogenic mutations (p53, PI3K, RAS, CREB, PPP2CA) = (0,1,1,1,0) as well as the input values. By applying these input values and mutations to the original update rules, we find that all attractors have.
which indicates that all attractors represent a proliferation phenotype. Hence, we find that any initial state of the MAPK network will converge to a cancer state under this condition (the last tab in Additional file 2).
To identify control targets under the aforementioned condition, we first need to determine how to represent the mutations in the MAPK network. For this purpose, let us consider the cancer therapies targeting oncogene addiction and synthetic lethality effects [28]: A cancer therapy targeting the oncogene addiction identifies a mutated gene and attempts to suppress oncogenic signal from the mutated gene [29]. Similarly, to represent the control situation in the MAPK network, we consider two mutant PPP2CA and CREB as input nodes generating oncogenic signals. As a result, we consider MEK1_2, ERK, p70, DUSP1, p38, MSK and MYC in the downstream of the two mutants as control targets that can suppress the oncogenic signals from CREB and PPP2CA.
Recently, synthetic lethality is targeted as an alternative of anti-cancer therapy. This approach is not based on perturbation of a single gene but simultaneous perturbation of more than one gene which causes death of cancer cells [30, 31]. In our case, to find out such synthetic lethal pair, we can first substitute state values of the mutant p53, PI3K and RAS as well as the four input values into the original update rules, which results in the nonzero value of proliferation node. Then, the control node identified by our method for the desired value Proliferation = 0 can be a synthetic lethality partner for one of p53, PI3K and RAS. After the aforementioned representation of the five mutations in the MAPK network, we find the fixed state values of 24 nodes and the simplified update rules for 29 nodes (the fourth tab in Additional file 2). We refer to the network obtained from the simplified update rules as a simplified MAPK network which has only one phenotype node Proliferation (the green node in Fig. 3) and two input nodes (CREB, PPP2CA) marked with yellow balls in Fig. 3. Since the mutations (CREB, PPP2CA) = (1,0) drive the simplified MAPK network to have (Apoptosis, Growth_Arrest, Proliferation) = (0,0,1), the simplified MAPK network can be called a cancer cell signaling network. Therefore, using the simplified network, we can identify control targets including [1] targets for suppressing oncogenic signal flow from CREB and PPP2CA and [2] targets for inducing synthetic lethality with p53, PI3K and RAS.
Let us consider a case that we can only control non-phenotype nodes except CREB and PPP2CA in the simplified MAPK network and that we cannot change the determined phenotypic values (Apoptosis, Growth_Arrest) = (0,0) whereas we can obtain Proliferation = 0 by controlling some of the 26 nodes. In this case, the goal of control is to stop proliferation in the original network, indicating that all attractors of the original network should have the phenotypic values (Apoptosis, Growth_Arrest, Proliferation) = (0,0,0). To achieve this goal, we apply our method to the simplified MAPK network model and, as a result, we can find all control sets.
The layered network of the simplified MAPK network
The nodes of the simplified MAPK network can be further partitioned according to where the nodes are located in the layered network as shown in Fig. 3, where 14 red nodes are the candidates for control nodes for the desired value of the phenotype node Proliferation (the green node in Fig. 3). On the other hand, any perturbation of the other 12 white nodes in Fig. 3 cannot drive the network to have the desired value of Proliferation.
The converging tree of the simplified MAPK network
In the following, we describe the construction procedure of the converging tree step by step.
Step 0. Determine the desired value of the phenotype node. Given the input values and mutations, any initial state converges to a cancer state with (Apoptosis, Growth_Arrest, Proliferation) = (0,0,1). Since we cannot change the steady state values (Apoptosis, Growth_Arrest) = (0,0) but can only change the Proliferation node by controlling the simplified MAPK network, the desired steady state value of Proliferation is Proliferation = 0, which is located in the 0th level of the converging tree as shown in Fig. 4.
Step1. Find children sets that directly generate the parent set {Proliferation = 0} in the 0th level. Since the possible control nodes in the 1st level are input nodes to Proliferation, we find that the candidates for control nodes are p70 and MYC from Proliferation* = p70&MYC. Then the steady state value Proliferation = 0 is determined by one of the two perturbations {p70 = 0} and {MYC = 0}. Applying the two removal rules, we find that no control set is removed up to the present level. Therefore the 1st level consists of {p70 = 0} and {MYC = 0}.
Step2. Find children sets that directly generate each parent set in the 1st level. Since the 1st level contains two parent sets {p70 = 0} and {MYC = 0}, the children sets can be found for each parent set.
Step2–1. Parent set {p70 = 0} in the 1st level. The possible control node is ERK because of p70* = ERK. Then the steady state value p70 = 0 is determined by {ERK = 0}. Applying the two removal rules, we find that no control set is removed up to the present level.
Step2–2. Parent set {MYC = 0} in the 1st level. The possible control node is MSK because of MYC* = MSK. Then the steady state value MYC = 0 is determined by the perturbation {MSK = 0}. Applying the two removal rules, we find that no control set is removed up to the present level.
It follows from Step2–1 and Step2–2 that the control sets in the 2nd level are {ERK = 0} and {MSK = 0}.
Step3. Find children sets that directly generate each parent set in the 2nd level. The 2nd level contains two control sets {ERK = 0} and {MSK = 0}. We then follow the two steps as in Step 2.
Step3–1. Parent set {ERK = 0} in the 2nd level. The possible control node is MEK1_2 because of ERK* = MEK1_2. Then the steady state value ERK = 0 is determined by {MEK1_2 = 0}. Applying the two removal rules, we find that no control set is removed up to the present level.
Step3–2. Parent set {MSK = 0} in the 2nd level. The candidates for control nodes are ERK and p38 because of MSK* = ERK|p38. Then the steady state value MSK = 0 is determined by the perturbation {(ERK, p38) = (0,0)}. Applying the first removal rule, we find that {(ERK, p38) = (0,0)} is removed since the control set {ERK = 0} exists in the 2nd level. Thus {MSK = 0} becomes a leaf set in the 2nd level.
It follows from Step3–1 and Step3–2 that the control set in the 3rd level is {MEK1_2 = 0}.
Step4. Find children sets that directly generate the parent set {MEK1_2 = 0} in the 3rd level. The candidates for control nodes are PPP2CA and AP1 because of MEK1_2* =!(PPP2CA|AP1), which is simplified as MEK1_2* =!AP1 from the mutation PPP2CA = 0. Then the steady state value MEK1_2 = 0 is determined by {AP1 = 1}. Applying the two removal rules, we find that no control set is removed up to the present level. Therefore the 4th level has only one control set {AP1 = 1}.
Following the similar procedure with the two removal rules, we can obtain the converging tree in Fig. 4 (see Additional file 5 for the other levels).
Comparison with other control methods
Our method can be considered as a target control method for Boolean network models, which drives some target nodes of interest to have desired values instead of all nodes in the network. Hence, previous target control methods for Boolean network models, if any, can be compared with our method. However, there is no such a target control method for Boolean network models, so direct comparison is not possible at present.
Although the method introduced in [9] is a target control method which is not for Boolean network models, we can still compare our method with it by using the simplified MAPK network model where the target node set consists of the Proliferation node (filled orange circle in Fig. 4a). The method in [9] provides a set of driver nodes that is enough to drive the Proliferation node to have its desired state value by using the greedy algorithm which constructs a series of bipartite graphs as shown in Additional file 6. As a result, the PPP2CA node becomes the unique driver node (filled red circle in Fig. 4a). For convenience, the set of all driver nodes is also called a control set from now on. Note that if a different maximum matching is chosen, a different control set {CREB} can be obtained. Then the method in [9] needs to control one node (denoted by number 1 before the green bar in Fig. 4a) and similarly our method needs only one or two layered nodes in the layered network for the phenotype control (denoted by numbers 1 and 2 before the blue bars in Fig. 4a), where the layered nodes are marked with purple circles in the bottom box in Fig. 4a. As expected, fewer driver nodes are needed for target control methods than the full control methods in [8] and [11, 12] (see Additional file 7).
Applying the greedy algorithm and our method to the simplified MAPK network model, we obtain one and eleven control sets in Fig. 4b, respectively. If the greedy algorithm is applied multiple times and different maximum matchings are chosen at each time, multiple control sets can be obtained. The difference of the number of control sets between the method in [9] and our method originates from the fact that driver nodes can be obtained only from the structure of the network but our method depends on Boolean update rules. Note that if a set Ψ of some nodes is not a minimal control set and controlling the set Ψ leads the simplified MAPK Boolean network to have the desired value Proliferation = 0, then a subset of Ψ is one of the 11 control sets in the converging tree and thereby the set Ψ is unnecessary with respect to perturbation.
The simplified cancer cell signaling network with threshold update functions
The cancer cell signaling network has six inputs (Mutagen, GFs, Nutrients, TNFa, Hypoxia, Gli) and four output nodes (AcidLactic, Apoptosis, Glut_1 and DNARepair) as in Additional file 8: Figure S2.
Hypoxia condition is a common condition of cancer cells in vivo, so let us consider a simplified network model under this condition and find out control targets that can induce apoptosis [32]. Note that there are some fixed values (PTEN, APC, Max, p14, FOXO, ROS) = (1,1,1,0,1,0) in the original update rules. Substituting these fixed values as well as input Hypoxia = 1 into the original update rules, we finally have fixed values for 40 nodes and simplified update rules for the remaining 56 nodes including three output nodes Apoptosis, DNA_Repair and Glut_1 (the second tab in Additional file 3). Note that the attractors of the cancer cell signaling network are preserved after the substitution [33]. We refer to the cancer cell signaling network obtained from the simplified update rules as a simplified cancer cell signaling network.
Construction of the converging tree based on the layered network of the simplified cancer cell signaling network
Since the cell death of a cancer cell is more preferable than the cell cycle arrest, let us consider only the value of the Apoptosis node for finding control targets without considering that of DNA_Repair node or Glut_1 node. The number of layered nodes in the layered network is 39 (marked with red balls in Fig. 5). The other 17 nodes (marked with white balls in Fig. 5) have no influence on the Apoptosis node.
In order to find out a control set that contains CHK1/2 as a control node, we can use the layered network in Fig. 5 in which CHK1/2 is located in the 4th layer. For this purpose, we need to construct a converging tree up to at least the 4th level, where we find a control set in Additional file 9 which contain the control node CHK1/2 as follows:
which is marked with red CHK1/2 in Fig. 6.
Discussion
Considering the numbers of initial states and desired final states of a given network, we can classify control methods into three groups by using the network and state representation in Fig. 7a-b. A first group of studies is to find out a control set that can drive a given initial state to a desired state within a finite time (denoted by ‘one-to-one control’ and illustrated in the top subfigures of Fig. 7c-e) as suggested in [8–10]. The structure-based control method in [8] guarantees that the target set {A, B, C, D, E, F, P} in Fig. 1 is controllable given any initial and final states by controlling only one node F as shown in Fig. 7e. In case the desired target set consists of a part of all nodes, the target control method in [9] can be applied. For instance, if the target set is {A, B, F, P} instead of {A, B, C, D, E, F, P}, then the node F is the driver node for the target set {A, B, F, P} as shown in Fig 7e. A second group of studies is to look for control targets that can drive any initial state to one desired attractor (denoted by ‘any-to-one control’ and illustrated in the middle subfigures of Fig. 7c-e) as proposed in [11, 12], [18] and [19]. The desired final state used in the second group must be one of attractors which already exist in the state space of the given network before applying a control method. However, our method has no such restriction on the desired final attractor, which means that there can be an initial state whose trajectory after applying PCK converges to an attractor which is not an attractor before applying PCK as in Fig. 1, Fig. 3 and Additional file 2. On the other hand, PCK belongs to a third group, which ensures driving any initial state to one of (possibly) multiple attractors corresponding to a particular phenotype of interest (denoted by ‘any-to-multiple control’ and illustrated in the bottom subfigures of Fig. 7c-e) as explained in detail in the third and fourth tabs of Additional file 3. In this case, any control set in PCK can drive a given network state to one of attractors of the same phenotype, where the convergence to a particular attractor depends on the initial state and control set. Using our method, different initial states might converge to a same attractor or a same initial state before and after applying PCK might be driven to different attractors depending on the control set chosen from PCK. Applying any control in PCK, all attractors will have the desired phenotype value. We need to note that PCK only ensures convergence to a same phenotype of interest and provides all possible minimal control sets for this purpose.
In case a desired phenotype is defined by the state values of multiple nodes as in [34–36], our method can also be applied if the number of children sets does not increase too much owing to the two removal rules, as shown in Additional file 10. However, if the number of children sets is enormously increased and the children sets are not efficiently eliminated by the two removal rules, then the complexity can become high. As a result, if we cannot complete the construction of a converging tree, we need to modify our algorithm such that the number of children sets is restricted.
Even if the construction of a converging tree is incomplete, the converging tree provides us with useful information since we can find control sets hierarchically from the root set. Let us consider a case when we need to know the level of a converging tree, in which there is a control set containing a given control node N before we construct the converging tree. If the node N is not a direct or indirect input node to the phenotype nodes of interest, it is not possible to find such a control set and therefore we cannot find such a level even after completion of the converging tree. Otherwise, we can use the relationship between such a level and the shortest length of paths from the given node N to the phenotype nodes. To deal with both cases, we have introduced the layered network which hierarchically consists of direct or indirect input nodes to the phenotype nodes. We need to note that the layered network can be constructed by using only the topology of a given directed network and it does not require the information about regulatory functions. If the node N is not included in the layered network, then there is no control set having the node N as its control node. For instance, SOS, FRS2, SPRY, PKC, FGFR3, GRB2, PLCG, EGFR, BCL2, MAX, p14 and MDM2 in Fig. 3 are not included in the layered network. Otherwise, we need to construct the converging tree with multiple levels. For instance, if N is the node N = CHK1/2 in Fig. 5, then the node Nis located in the 4th layer of the layered network and therefore we need to construct the converging tree with at least 4 levels as in Fig. 6. In this respect, our method can be considered as a hierarchical control strategy based on both topology and regulatory functions of the network in finding out control nodes.
Conclusions
There is a growing interest in controlling complex biological networks, but no practical method is available that can be used to find out all possible combinations of control targets ensuring convergence to a desired cell phenotype. To resolve this problem, in this study, we introduced the concept of PCK and presented a detailed method of identifying PCK based on layered network and converging tree. We showed that PCK can generate all control sets. The converging tree can be useful for biologists or clinicians since it can be used to find out drug targets or to realize precision medicine (or personalized medicine) by identifying control targets that are interpreted or serve as drug targets. For instance, we can construct a patient-specific cancer cell signaling network model by reflecting the genomic variation of the tumor sample and apply this method to identify the most effective drug targets in consideration of such genomic variation effects in the network dynamics.
In this paper, we considered an intracellular regulatory network model and developed one pair of the layered network and converging tree of the single network model. Considering the heterogeneous cell population of cancer, we need to further expand the present approach by considering multiple network models at the same time. For instance, for two Boolean network models representing two cancer cell types, we can construct two pairs of the layered network and converging tree. In this way, we might be able to find out a set of common control sets that can be applied to the two cancer networks for the desired values of the phenotype nodes. Such a control set can be found by combining the layered network (topological property) and the converging tree (dynamical property). This remains as a future study.
Methods
Let P1, ⋯, Pℓ be some of the phenotype nodes of a given Boolean network, which are used to define the desired state values Pi = di (1 ≤ i ≤ ℓ) for a positive integer ℓ and constants di ∈ {0, 1}. The term ‘minimal control set’ denotes a minimal set of nodes having their fixed state values that can ensure convergence of the Boolean network to the desired values Pi = di (1 ≤ i ≤ ℓ) by inserting the fixed state values into the update rules of the Boolean network, where the elements of the minimal control set are referred to as ‘control nodes’ and it is said that the minimal control set generates (or determines) the desired state values Pi = di (1 ≤ i ≤ ℓ). Any minimal control set can be hierarchically found from a tree structure obtained by solving a system of some Boolean equations constructed from the update rules, so some terminologies are employed here to indicate the location of minimal control sets: root, child, parent, ancestor, descendant and leaf sets. The root set is {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)}. If a minimal control set S1 directly generates the root set {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)}, then the minimal control set S1 is called the child set of the parent set {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)}. If S2 is a child set of S1, then {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)} is called the ancestor set of S2. In addition, S2 is called the descendant set of {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)}. If S1 has no child set, then S1 is called a leaf set.
Layered network
The layered network of a given network consists of layers, where the 0th layer consists of the phenotype nodes of interest. Let us denote the nodes located in the kth layer for any k in {0, ⋯, i} as for some nonnegative integers i and γk. Then the (i + 1)th layer consists of the nodes which are inputs to at least one of and not contained in . Each node of the layered network is referred as a layered node. Since the input nodes to each node in the network are unique, the layered network is unique.
The first removal rule for included control sets
Let be a control set in the kth level. If a control set in the ith level satisfies
for some k, all 1 ≤ m ≤ ξk, 1 ≤ gmki ≤ ς and ξk ≤ ς.
then Ci is removed. In this case, Ci is said to be ‘included’ in Ck from the point of view of perturbations. For instance, we define that a control set {A = 0} includes a control set {(A, B) = (0,1)} from the point of view of perturbations and that the control set {(A, B) = (0,1)} is said to be an included control set. Therefore, the control set {(A, B) = (0,1)} can be removed. Such a rule is referred to as the first removal rule for included control sets.
The second removal rule for contradictory children sets
Let a candidate for a minimal control set in the (i + 1)th level have parent or ancestor sets in the kth level for 0 ≤ k ≤ i and a positive integer γk, where for 1 ≤ j ≤ γk and . If Ci + 1 satisfies
and for some m, k, j and gmikj,
then the candidate Ci + 1 is removed. In this case Ci + 1 in the (i + 1)th level is said to be a contradictory child set. For example, as in Fig. 2d, the control set {(C,F) = (1,0)} in the 3rd level has an ancestor set {C = 0} in the 1st level. Applying the child set {(C,F) = (1,0)} into the given network, we can obtain the parent set {B = 1} but cannot the ancestor set {C = 0}, so that we cannot obtain the desired phenotype value. Therefore, we need to remove the minimal control set {(C,F) = (1,0)} in the 3rd level, where the removal is referred to as the second removal rule for ‘contradictory’ children sets.
Algorithm for the converging tree
The converging tree of a given network can be hierarchically constructed as follows.
Step 0. Determine the desired steady state values of the phenotype nodes. Let P1, ⋯, Pℓ (1 ≤ ℓ) be some of the phenotype nodes of a given Boolean network such that the phenotype nodes have desired steady state values (P1, ⋯, Pℓ) = (d1, ⋯, dℓ) for a positive integer ℓ and constants dυ (1 ≤ υ ≤ ℓ) of values 0 or 1. Then the 0th level of the converging tree is the set {(P1, ⋯, Pℓ) = (d1, ⋯, dℓ)}, which is called the root set of the converging tree.
Step 1. Let the control sets C1, ⋯, Ck be non-leaf sets of the ith level of the converging tree for a nonnegative integer i and a positive integer k, where for 1 ≤ j ≤ k, a positive integer γj and . Here is a control node of the control set Cj with the fixed state value .
Step 1–1. Find children sets of each parent set Cj in the ith level.
Each one of is a solution of the system of the Boolean equations
where is a positive integer and is the update function for .
Before applying our algorithm, we can prepare the collection of solutions of each Boolean equation , ni ∈ {0, 1}, where is the update rule for xi. Then the solutions of the system can be obtained by the Cartesian product of sets of solutions of each Boolean equation.
Step 1–1-1. Find included sets among by applying the first removal rule.
Case 1-1. All are included.
If j < k, replace j with j + 1 and go to Step 1–1.
Otherwise, go to Step2.
Case 1-2. Some of are not included.
Denote by the children sets that are not included and go to Step 1–1-2.
Step 1–1-2. Find contradictory children sets among by applying the second removal rule.
Case 2-1. All are contradictory.
If j < k, replace j with j + 1 and go to Step 1–1.
Otherwise, go to Step2.
Case 2-2. Some of are not contradictory.
Denote by the children sets that are not contradictory and go to Step 1–1-3.
Step 1–1-3. Find control sets in the τth level (1 ≤ τ ≤ i + 1) which are included in one of . If there exists an included control set, go to Case 3. Otherwise, replace j with j + 1 and go to Step 1–1 when j < k, and go to Step2 when j = k.
Case 3-1. The included control sets in the τth level are not parent or ancestors of any one of .
Remove the included control sets and their descendant sets. If there exists a next parent Cυ(j + 1 ≤ υ ≤ k), repeat Step 1–1 with Cυ instead of Cj. Otherwise, go to Step 2 .
Case 3-2. Other case.
Let the Lth level (0 ≤ L ≤ i) be the lowest level among the levels in which there exists a parent or ancestor set included in one of . Denote the included control sets in the Lth level by CL. Replace CL with the children sets of which include CL and remove the ξth levels (L < ξ) as well as included control sets in the Lth level. Repeat Step 1–1 from the Lth level instead of the ith level.
Step 2. Let the control sets D1, ⋯, Dθ consist of the (i + 1)thlevel of the converging tree for a positive integer θ. If all D1, ⋯, Dθ are leaf sets, then construction of the converging tree is completed and PCK is the collection of all minimal control sets in the converging tree. Otherwise, repeat Step 1 for non-leaf sets in the (i + 1)th level instead of C1, ⋯, Ck.
Remark. The algorithm for the converging tree can be summarized as two parts: The first part is to find out the layered nodes that have influence on the desired phenotype values. The second part is to apply the two removal rules to eliminate ‘included’ or ‘contradictory’ control sets. As a result, we can obtain the minimal control sets in each level of the converging tree. Before applying our algorithm, we can prepare the collection of solutions of each Boolean equation ,ni ∈ {0, 1}, where is the update rule for xi. Then the solutions of system of Boolean equations can be obtained by the Cartesian product of sets of solutions of each Boolean equation. So, the computational complexity is as follows.
Let pi (1 ≤ i ≤ ηpi) be a parent set at the ith level, where the nodes of the set pi are denoted by (pn)i, j (1 ≤ j ≤ ηpni). Since we can find out children nodes (cn)i, j, k (1 ≤ k ≤ ηcnij) of the parent node (pn)i, j without solving Boolean equations by using the collection prepared a priori, the number of children sets of the parent set pi is and the number of children sets of all the parent sets in the ith level is . Hence, the computational complexity for finding out children sets of all the parent sets in the ith level is
The computational complexity of applying the first removal rule is
In case there is no removed child set, the computational complexity of applying the second removal rule is
If the number of children sets is not increased owing to the two removal rules, the computational complexity would be not too high, as shown in Additional file 10. However, if the number of children sets becomes enormously increased since the children sets are not eliminated by the two removal rules, then the complexity might become high. As a result, if we cannot complete the construction of converging tree, we need to modify our algorithm such that the number of children sets is restricted.
Additional files
Acknowledgments
Not applicable.
Funding
This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea Government, the Ministry of Science and ICT (2017R1A2A1A17069642, 2015M3A9A7067220, and 2013M3A9A7046303). It was also supported by the KAIST Future Systems Healthcare Project from the Ministry of Science, ICT & Future Planning, the KUSTAR-KAIST Institute, Korea under the R&D program supervised by the KAIST, and the grant of the Korean Health Technology R&D Project, Ministry of Health & Welfare, Republic of Korea (HI13C2162).
Availability of data and materials
The data supporting the results of this article are included and cited within the article and its additional files.
Abbreviations
- MAPK
Mitogen-activated protein kinase
- MFVS
Minimum feedback vertex set
- PCK
Phenotype control kernel
Authors’ contributions
Conceived and designed the experiments: SMC and KHC. Performed the experiments: SMC, BB and JIJ. Analyzed the data: SMC, BB, JIJ and KHC. Wrote the paper: SMC and KHC. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
Electronic supplementary material
The online version of this article (10.1186/s12918-018-0576-8) contains supplementary material, which is available to authorized users.
Contributor Information
Sang-Mok Choo, Email: smchoo@ulsan.ac.kr.
Byunghyun Ban, Email: needleworm@kaist.ac.kr.
Jae Il Joo, Email: mustard@kaist.ac.kr.
Kwang-Hyun Cho, Phone: +82-42-350-4325, Email: ckh@kaist.ac.kr, http://sbie.kaist.ac.kr/.
References
- 1.Milenkovic T, Memisevic V, Bonato A, Przulj N. Dominating biological networks. PLoS One. 2011;6(8):e23016. doi: 10.1371/journal.pone.0023016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Nacher JC, Akutsu T. Dominating scale-free networks with variable scaling exponent: heterogeneous networks are not difficult to control. New J Phys. 2012;14(7):073005. doi: 10.1088/1367-2630/14/7/073005. [DOI] [Google Scholar]
- 3.Wuchty S. Controllability in protein interaction networks. P Natl Acad Sci USA. 2014;111(19):7156–7160. doi: 10.1073/pnas.1311231111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Khuri S, Essentiality WS. Centrality in protein interaction networks revisited. Bmc Bioinformatics. 2015;16:109. doi: 10.1186/s12859-015-0536-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Nacher JC, Akutsu T. Minimum dominating set-based methods for analyzing biological networks. Methods. 2016;102:57–63. doi: 10.1016/j.ymeth.2015.12.017. [DOI] [PubMed] [Google Scholar]
- 6.Zhang XF, Ou-Yang L, Zhu Y, Wu MY, Dai DQ. Determining minimum set of driver nodes in protein-protein interaction networks. Bmc Bioinformatics. 2015;16:146. doi: 10.1186/s12859-015-0591-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Wang H, Zheng H, Browne F, Wang C, editors. Minimum dominating sets in cell cycle specific protein interaction networks. 2014. pp. 2–5. [Google Scholar]
- 8.Liu YY, Slotine JJ, Barabasi AL. Controllability of complex networks. Nature. 2011;473(7346):167–173. doi: 10.1038/nature10011. [DOI] [PubMed] [Google Scholar]
- 9.Gao J, Liu YY, D'Souza RM, Barabasi AL. Target control of complex networks. Nat Commun. 2014;5:5415. doi: 10.1038/ncomms6415. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Wu FX, Wu L, Wang J, Liu J, Chen L. Transittability of complex networks and its applications to regulatory biomolecular networks. Sci Rep-Uk. 2014;4:4819. doi: 10.1038/srep04819. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Fiedler B, Mochizuki A, Kurosawa G, Saito D. Dynamics and control at feedback vertex sets. I: informative and determining nodes in regulatory networks. J Dyn Differ Equ. 2013;25(3):563–604. doi: 10.1007/s10884-013-9312-7. [DOI] [Google Scholar]
- 12.Mochizuki A, Fiedler B, Kurosawa G, Saito D. Dynamics and control at feedback vertex sets. II: a faithful monitor to determine the diversity of molecular activities in regulatory networks. J Theor Biol. 2013;335:130–146. doi: 10.1016/j.jtbi.2013.06.009. [DOI] [PubMed] [Google Scholar]
- 13.Gates AJ, Rocha LM. Control of complex networks requires both structure and dynamics. Sci Rep-Uk. 2016;6:24456. doi: 10.1038/srep24456. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Cornelius SP, Kath WL, Motter AE. Realistic control of network dynamics. Nat Commun. 2013;4:1942. doi: 10.1038/ncomms2939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Kauffman SA. Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol. 1969;22(3):437–467. doi: 10.1016/0022-5193(69)90015-0. [DOI] [PubMed] [Google Scholar]
- 16.Glass L, Kauffman SA. The logical analysis of continuous, non-linear biochemical control networks. J Theor Biol. 1973;39(1):103–129. doi: 10.1016/0022-5193(73)90208-7. [DOI] [PubMed] [Google Scholar]
- 17.Wang RS, Saadatpour A, Albert R. Boolean modeling in systems biology: an overview of methodology and applications. Phys Biol. 2012;9(5):055001. doi: 10.1088/1478-3975/9/5/055001. [DOI] [PubMed] [Google Scholar]
- 18.Kim J, Park SM, Cho KH. Discovery of a kernel for controlling biomolecular regulatory networks. Sci Rep-Uk. 2013;3:2223. doi: 10.1038/srep02223. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Zanudo JG, Albert R. Cell fate reprogramming by control of intracellular network dynamics. PLoS Comput Biol. 2015;11(4):e1004193. doi: 10.1371/journal.pcbi.1004193. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Wang RS, Albert R. Elementary signaling modes predict the essentiality of signal transduction network components. BMC Syst Biol. 2011;5:44. doi: 10.1186/1752-0509-5-44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Akutsu T, Hayashida M, Ching WK, Ng MK. Control of Boolean networks: hardness results and algorithms for tree structured networks. J Theor Biol. 2007;244(4):670–679. doi: 10.1016/j.jtbi.2006.09.023. [DOI] [PubMed] [Google Scholar]
- 22.Akutsu T, Zhao Y, Hayashida M, Tamura T. Integer programming-based approach to attractor detection and control of Boolean networks. Ieice T Inf Syst. 2012;E95d(12):2960–2970. doi: 10.1587/transinf.E95.D.2960. [DOI] [Google Scholar]
- 23.Cheng DZ, Qi HS. Controllability and observability of Boolean control networks. Automatica. 2009;45(7):1659–1667. doi: 10.1016/j.automatica.2009.03.006. [DOI] [Google Scholar]
- 24.Murrugarra D, Veliz-Cuba A, Aguilar B, Laubenbacher R. Identification of control targets in Boolean molecular network models via computational algebra. BMC Syst Biol. 2016;10:94. doi: 10.1186/s12918-016-0332-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Murrugarra D, Dimitrova ES. Molecular network control through boolean canalization. EURASIP J Bioinform Syst Biol. 2015;2015(1):9. doi: 10.1186/s13637-015-0029-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Grieco L, Calzone L, Bernard-Pierrot I, Radvanyi F, Kahn-Perles B, Thieffry D. Integrative modelling of the influence of MAPK network on cancer cell fate decision. PLoS Comput Biol. 2013;9(10):e1003286. doi: 10.1371/journal.pcbi.1003286. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Fumia HF, Martins ML. Boolean network model for cancer pathways: predicting carcinogenesis and targeted therapy outcomes. PLoS One. 2013;8(7):e69008. doi: 10.1371/journal.pone.0069008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Lord CJ, Ashworth A. Mechanisms of resistance to therapies targeting BRCA-mutant cancers. Nat Med. 2013;19(11):1381–1388. doi: 10.1038/nm.3369. [DOI] [PubMed] [Google Scholar]
- 29.Luo J, Solimini NL, Elledge SJ. Principles of cancer therapy: oncogene and non-oncogene addiction. Cell. 2009;136(5):823–837. doi: 10.1016/j.cell.2009.02.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Wang T, Yu H, Hughes NW, Liu B, Kendirli A, Klein K, et al. Gene essentiality profiling reveals gene networks and synthetic lethal interactions with Oncogenic Ras. Cell. 2017;168(5):890–903. doi: 10.1016/j.cell.2017.01.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Kaelin WG., Jr The concept of synthetic lethality in the context of anticancer therapy. Nat Rev Cancer. 2005;5(9):689–698. doi: 10.1038/nrc1691. [DOI] [PubMed] [Google Scholar]
- 32.Wilson WR, Hay MP. Targeting hypoxia in cancer therapy. Nat Rev Cancer. 2011;11(6):393–410. doi: 10.1038/nrc3064. [DOI] [PubMed] [Google Scholar]
- 33.Saadatpour A, Albert R, Reluga TC. A reduction method for Boolean network models proven to conserve attractors. SIAM J Appl Dyn Syst. 2013;12(4):1997–2011. doi: 10.1137/13090537X. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Steinway SN, Zanudo JGT, Michel PJ, Feith DJ, Loughran TP, Albert R. Combinatorial interventions inhibit TGFbeta-driven epithelial-to-mesenchymal transition and support hybrid cellular phenotypes. NPJ Syst Biol Appl. 2015;1:15014. doi: 10.1038/npjsba.2015.14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Cahan P, Li H, Morris SA, Lummertz da Rocha E, Daley GQ, Collins JJ. CellNet: network biology applied to stem cell engineering. Cell. 2014;158(4):903–915. doi: 10.1016/j.cell.2014.07.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Cho KH, Joo JI, Shin D, Kim D, Park SM. The reverse control of irreversible biological processes. Wiley Interdiscip Rev Syst Biol Med. 2016;8(5):366–377. doi: 10.1002/wsbm.1346. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The data supporting the results of this article are included and cited within the article and its additional files.