PLOS One. 2013 Feb 15;8(2):e56832. doi: 10.1371/journal.pone.0056832

Using Consensus Bayesian Network to Model the Reactive Oxygen Species Regulatory Pathway

Liangdong Hu 1, Limin Wang 1,*
Editor: Frank Emmert-Streib
PMCID: PMC3574104  PMID: 23457624

Abstract

The Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the Bayesian network from microarray data directly. Although large numbers of Bayesian network learning algorithms have been developed, their accuracy is low when they are applied to microarray data, because the databases used to learn the networks contain too few microarray samples. In this paper, we propose a consensus Bayesian network, constructed by combining Bayesian networks derived from the relevant literature with Bayesian networks learned from microarray data. It has a higher accuracy than a Bayesian network learned from any one database alone. In the experiments, we validated the Bayesian network combination algorithm on several classic machine learning databases and used the consensus Bayesian network to model Escherichia coli's ROS pathway.

Introduction

Reactive Oxygen Species (ROS) are formed as by-products of the normal metabolism of aerobic organisms; they can react with DNA and produce damage [1]. Cells protect themselves from ROS through detoxification and repair mechanisms [2], [3]. The microarray is a powerful tool for measuring the expression of a large number of genes. Given microarray expressions, it is possible to construct directly the regulatory pathway by which organisms respond to oxidative stress.

An outstanding idea is the use of the Bayesian network for representing regulatory pathways [4]–[7]. A Bayesian network is a Directed Acyclic Graph (DAG) used for representing probabilistic relationships between variables. It was first proposed by Pearl [8], and Jensen [9] gave an intuitive definition. A great deal of work has been done on the automatic learning of Bayesian networks from databases, and large numbers of learning algorithms based on different methodologies have been developed [10]–[13]; they achieve high accuracy on classic machine learning databases. However, when these algorithms are applied to learn Bayesian networks from microarray data, the accuracies are low. Careful study shows that this is because the databases used to learn the networks contain too few microarray samples. Microarray chips are expensive, so it is difficult to obtain a large number of microarray measurements from one laboratory or one database, and a few hundred expression profiles cannot guarantee a high learning accuracy.

To overcome this problem, we propose a consensus Bayesian network constructed by combining several Bayesian networks. The consensus Bayesian network is approximately equal to the Bayesian network learned from the database obtained by merging the combined networks' corresponding databases; this equivalent database may have enough data, so the accuracy can be improved. The main procedure for constructing the consensus Bayesian network is as follows: (1) review all relevant literature and derive the published Bayesian networks; (2) search for microarray expressions not used in that literature and download them to learn Bayesian networks; (3) combine all these Bayesian networks into the consensus Bayesian network.

Combination of Bayesian networks involves both the combination of graph models and the aggregation of probability distributions [14]–[17]. Utz [18] proposed a method that combines many different Bayesian networks into an undirected graph in which each edge carries a weight representing the frequency with which the edge occurs in the component networks. Zhang et al. [19] proposed a method for fusing Bayesian networks: they construct an initial network based on the union and intersection of the input networks and then search for the structure that maximizes a scoring function (the CH criterion). Our Bayesian network combination algorithm is instead based on the properties of probability. Using probabilistic independence, Conditional Probability Tables (CPTs) can be extended so that corresponding nodes' CPTs take the same form, and an aggregation function can then be applied to them. After every pair of corresponding CPTs has been extended, the combination of the Bayesian networks reduces to the aggregation of every corresponding node's CPTs, provided the networks' variables' prior orders are consistent with each other. Because some nodes' CPTs were extended beforehand, they may have bogus parents after the combination; we must find these, delete the bogus edges, and simplify the CPTs. The combination algorithm can also be applied to Bayesian networks defined over different variable sets by using the extension of a Bayesian network.

Escherichia coli was used in the experiment: a constructed ROS pathway was derived from the literature written by Hodges et al. [20], and 612 microarray expression profiles were downloaded from the Many Microbe Microarrays Database (M3D) [21]. 27 genes were identified from the EcoCyc [22] ROS detoxification pathway. A consensus Bayesian network using the 27 genes as variables was constructed by combining the Bayesian network from the literature with the Bayesian network learned from the 612 microarray expressions. To demonstrate the combination of Bayesian networks defined over different variable sets, we used a prediction program to find genes that may be involved in the ROS pathway, learned a Bayesian network using the 27 genes and the newly found genes as variables, and then combined this network with the network from the literature to construct a new consensus Bayesian network.

Results

Validation on classic machine learning databases

In order to validate whether the consensus Bayesian network BNc constructed by combining Bayesian networks BN1 and BN2 is equivalent to the Bayesian network learned from the database obtained by merging the two networks' corresponding databases D1 and D2, 6 databases were downloaded from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/), and the databases of the ALARM net and the Chest-clinic net were generated by BN PowerConstructor. For each database D, a subset of its samples was chosen randomly and used as D1, and the remaining samples in D were used as D2. Two Bayesian networks BN1 and BN2 were learned from D1 and D2, respectively, and the consensus Bayesian network BNc was constructed by combining BN1 and BN2. After that, a reference Bayesian network BND was learned from D; BNc was compared with BND, and the proportion of the number of identical edges between BNc and BND to the total number of edges in BNc and BND (the similarity) was computed. The program was run 100 times to compute the average similarity. All results of the experiments are shown in Table 1: every average similarity is at least 75%, so the consensus Bayesian network BNc is approximately equal to the Bayesian network BND. Although the combination algorithm was validated on 8 different databases whose data types differ widely, this does not affect the results: the Bayesian network learning algorithm simply computes the distributions by counting samples and determines the relationships between the variables by analyzing those distributions, while the Bayesian network combination algorithm combines the learned networks and does not touch the data. So the type of data does not affect the validation.
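The similarity measure above can be computed directly from the two networks' edge sets. The following is a minimal sketch; reading "the proportion of identical edges to the total number of edges in both networks" as a Dice-style coefficient (each shared edge counted once per network) is our interpretation, and the edge sets shown are hypothetical:

```python
def edge_similarity(edges_a, edges_b):
    """Dice-style similarity between two directed edge sets:
    identical edges (counted in both networks) over the total
    number of edges in both networks; equals 1.0 when they coincide."""
    a, b = set(edges_a), set(edges_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

# Hypothetical edge sets for a consensus network and a reference network.
bn_c = {("A", "B"), ("B", "C"), ("C", "D")}
bn_d = {("A", "B"), ("B", "C"), ("B", "D")}
print(edge_similarity(bn_c, bn_d))  # 2 shared edges out of 6 total -> 0.666...
```

Averaging this value over 100 random splits of the database reproduces the procedure used to fill Table 1.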

Table 1. Validation of the combination algorithm.

Database | n | m | similarity | T (s) | similarity′ | T′ (s)
Letter Recognition | 17 | 20000 | 100.0% | 0.000009 | 100.0% | 0.000010
Shuttle | 10 | 14500 | 100.0% | 0.000008 | 100.0% | 0.000008
Parkinsons Telemonitoring | 26 | 5875 | 79.4±2.2% | 0.086804 | 77.9±1.2% | 1437.502573
Image Segmentation | 20 | 2310 | 80.6±1.7% | 0.066748 | 78.0±1.9% | 835.820385
Contraceptive Method Choice | 10 | 1473 | 83.2±2.1% | 0.033214 | 82.5±2.5% | 18.325412
Solar Flare | 13 | 1389 | 75.0±3.0% | 0.043424 | 76.6±2.5% | 261.702598
ALARM net | 37 | 10000 | 97.8±2.2% | 0.123528 | 95.6±2.2% | 372.952340
Chest-clinic net | 8 | 1000 | 93.4±0.4% | 0.026708 | 93.4±0.4% | 12.259816

Here n is the number of variables in the database, m is the number of samples, similarity is the average proportion of the number of identical edges between BNc and BND to the total number of edges in BNc and BND, and T is the execution time of the Bayesian network combination program. The table shows that the similarity depends on the number of samples; this is because the algorithms are based on the computation of probabilities, and the accuracy of that computation is sensitive to the number of samples. Specifically, there are two reasons: (a) the real distributions of the variables cannot be reflected if the database has only a few samples; (b) the equation used to compute the probabilities is sensitive to the number of samples. In the experiments on Letter Recognition and Shuttle the similarity is 100%; these two databases have enough samples to provide enough information for constructing the real Bayesian networks, so the learned networks BN1, BN2 and BND are completely the same, and consequently the consensus network BNc and BND are also the same. The columns similarity′ and T′ give the results of the experiments using the fusion method proposed by Zhang et al. [19] instead of our combination algorithm; T and T′ show that our algorithm works more efficiently. The running time of our algorithm depends only on the number of nodes in the network, whereas the execution time of Zhang's fusion method grows exponentially as the size of the biggest clique in the clique tree increases.

The consensus Bayesian network BNc is approximately equal to the Bayesian network BND, so we can view BND's database D as BNc's equivalent database, and D has more samples than D1 or D2. The use of the consensus Bayesian network therefore helps to solve the problem of insufficient data in the partial databases, and the accuracy can be improved. The true structures of the ALARM net and the Chest-clinic net are known, so we compared the learned networks with the known networks; the results, shown in Table 2, indicate that BNc has a higher accuracy than BN1 or BN2.

Table 2. Comparison of the accuracies.

 | ALARM: BN1 | BN2 | BNc | Chest-clinic: BN1 | BN2 | BNc
edges | 52 | 49 | 48 | 10 | 12 | 8
missing edges | 0 | 1 | 0 | 1 | 0 | 0
extra edges | 6 | 4 | 2 | 3 | 4 | 0

Here "edges" is the number of edges in the learned Bayesian network, "missing edges" is the number of edges of the true structure absent from it, and "extra edges" is the number of its edges absent from the true structure. The true structures of the ALARM net and the Chest-clinic net contain 46 and 8 directed edges, respectively.

Construction of the consensus Bayesian network for modeling Escherichia coli's ROS pathway

The consensus Bayesian network is constructed by combining Bayesian networks derived from the literature with Bayesian networks learned from microarray data. First, the relevant literature was reviewed and a ROS pathway was derived from the paper written by Hodges et al. [20]; we denote this network BN1. In that work, 27 genes identified from the EcoCyc ROS detoxification pathway were chosen as variables and 305 microarray expressions were used to learn the network. Second, microarray data were searched and an expression build of 612 microarray expressions was downloaded from the M3D database; Bayesian network BN2, which also uses the 27 genes as variables, was learned from these expressions. Finally, the consensus Bayesian network BNc was constructed by combining the two networks; the result is shown in Figure 1. In the combination program we took as weights the known sample counts, w1 = 305 and w2 = 612 (see the aggregation function in Methods), together with a threshold δ.

Figure 1. Consensus Bayesian network BNc.

The 27 genes were identified from the EcoCyc ROS detoxification pathway.

A novel prediction algorithm based on the computation of mutual information was developed to identify genes that are strongly associated with a particular gene in the regulatory pathway: if X is a gene in the pathway and gene Y is strongly associated with X, then Y may work together with X and also be involved in the pathway. The main procedure is as follows. Let set S include all the known genes in the regulatory pathway, and let set R include the rest of the organism's genes. Choose one gene X in S; for each gene Y in R, compute the mutual information I(X;Y); if I(X;Y) > ε, gene Y is related to gene X and may be involved in the pathway too. The program ends when every gene in R has been tested.
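The prediction procedure above can be sketched as follows. The gene names, expression profiles, and threshold in the usage example are purely illustrative; mutual information is estimated from discretized expression vectors by simple counting:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) from paired discrete observations, in nats."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def predict_pathway_genes(pathway, candidates, expr, eps):
    """Flag each candidate gene whose mutual information with some
    known pathway gene exceeds the threshold eps."""
    hits = {}
    for y in candidates:
        for x in pathway:
            mi = mutual_information(expr[x], expr[y])
            if mi > eps:
                hits[y] = (x, mi)   # y may be involved in the pathway too
                break
    return hits

# Toy discretized expressions (0/1/2 = under/normal/over); names are made up.
expr = {
    "pathway_gene": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
    "geneA": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],  # tracks the pathway gene
    "geneB": [1] * 12,                               # uninformative
}
print(predict_pathway_genes(["pathway_gene"], ["geneA", "geneB"], expr, eps=0.1))
```

Only `geneA` is flagged: its empirical mutual information with the pathway gene is log 3, while the constant profile of `geneB` yields zero.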

The 27 genes identified from the EcoCyc ROS detoxification pathway were used as set S, while the rest of the genes of Escherichia coli were used as set R. The program found 4 genes that may be involved in the ROS pathway; the results are shown in Table 3. A new Bayesian network BN3 using the 31 (27 + 4) genes as variables was learned from the 612 microarray expressions. BN3 contains more genes than BN1, so BN1 was extended to cover the 31 genes. Then a new consensus Bayesian network BNc′ was constructed by combining the extended BN1 and BN3; the result is shown in Figure 2.

Table 3. The 4 genes identified by the prediction program.

Gene X | Gene Y | Mutual information I(X;Y)
[the four gene pairs and their mutual information values appear only as images in the original]

The genes X were identified from the EcoCyc ROS detoxification pathway. The interactions between the listed gene pairs can also be found in the EcoCyc database.

Figure 2. Consensus Bayesian network BNc′.

The 27 genes were identified from the EcoCyc ROS detoxification pathway, while the 4 additional genes were identified by the prediction program.

Discussion

In the discussion, we address this question: does the Bayesian network learned from microarray expressions match a known regulatory pathway?

Before answering this question, we carried out an experiment. Its procedure can be described as follows: let V include all of the genes of Escherichia coli, and construct an undirected graph G = (V, E), where E = {(X, Y) : I(X;Y) > ε}. Let G′ be the largest connected subgraph of G. The great majority of the genes in V turned out to be included in G′. Since mutual information I(X;Y) > ε means that genes X and Y interact, this phenomenon shows that almost all genes in Escherichia coli are related, directly or indirectly. We can infer that some genes may be involved in different regulatory pathways simultaneously; otherwise, if no gene were involved in more than one regulatory pathway, that is, if the regulatory pathways of Escherichia coli had no intersection, we could not observe thousands of genes related directly or indirectly. On the other hand, before the microarray measurements the Escherichia coli cells were alive, so almost all of their regulatory pathways were at work. Thus, although two genes must interact if there is a directed edge between them in the Bayesian network, it is hard to determine which regulatory pathway that edge belongs to. For example, consider a directed edge between two genes in BNc (Figure 1): there must be an interaction between the two genes. They are involved in the regulation of transcription (EcoCyc database), and this biological process was at work when the expressions were measured, so the edge may exist because they are regulating transcription.
However, the ROS detoxification pathway (EcoCyc database) also contains the two genes, so the edge may instead exist because they are regulating the response to oxidative stress. It is therefore hard to determine which regulatory pathway the directed edge belongs to. Conversely, if there is no edge between two genes in the Bayesian network, then the two genes do not interact directly in any regulatory pathway. Hence, if a known regulatory pathway contains k genes and we use these k genes as variables to learn a Bayesian network from microarray expressions, then all of the interactions between the k genes are contained in the Bayesian network, but some of these interactions may not be contained in this particular regulatory pathway. This means the regulatory pathway is a subgraph of the Bayesian network. Although the Bayesian network is therefore not equivalent to the regulatory pathway, it still has important significance: with its guidance, the number of biological experiments can be greatly reduced when modeling a regulatory pathway.

Methods

Data preprocessing

The algorithms in this paper can only process discrete data. However, the 612 microarray expression profiles of Escherichia coli downloaded from the M3D database are continuous. The expression data for each gene were therefore discretized using a maximum-entropy approach with three equally sized bins (q3 quantization), dividing each gene's expressions into three categories: underexpressed, normal, and overexpressed.
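A rank-based tercile split is one straightforward way to realize this q3 quantization. The exact tie-breaking rule used in the paper is not specified, so the following is a sketch:

```python
def q3_discretize(values):
    """Split one gene's expression values into three equally sized bins
    by rank: 0 = underexpressed, 1 = normal, 2 = overexpressed."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices by value
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(rank * 3 // n, 2)  # bottom/middle/top third
    return labels

print(q3_discretize([5.1, 0.2, 9.8, 3.3, 7.0, 1.4]))  # [1, 0, 2, 1, 2, 0]
```

Applying this per gene across the 612 expression profiles yields the discrete data the learning algorithms require.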

Usually, Bayesian networks derived from the literature have only a structure, so we have three ways to obtain the parameters: (1) if the program implementing the learning algorithm is available on the internet, both the structure and the parameters of the Bayesian network can be obtained by running the program directly; (2) if the microarray data used in the literature were collected in a database available on the internet, we can download these microarray data to learn the parameters; (3) sometimes the corresponding database cannot be found, or the Bayesian network was not learned from a database but constructed from biological experiments directly; then the distribution for each node can be estimated by analyzing the genes' special characteristics and the relationships between genes.

Bayesian network

A Bayesian network defined over a variable set V can be represented as a pair (G, Θ), where G is a DAG in which each directed edge represents a dependence, and Θ is a group of CPTs, one per node of the DAG. Usually, G is called the Bayesian network's structure and can be represented as a pair (V, E), where E is the edge set; Θ is called the Bayesian network's parameters. Because G is a directed acyclic graph, its nodes have a topological order, which we call the prior order. Let BN1 and BN2 be two Bayesian networks with DAGs G1 = (V, E1) and G2 = (V, E2), respectively; the two networks' variables' prior orders are consistent with each other if the graph (V, E1 ∪ E2) is acyclic. Let X be a node in G; X's direct predecessor nodes are called its parents, denoted Pa(X), and X's CPT represents the conditional probability P(X | Pa(X)). For instance, the CPT shown in Figure 3(c) shows that the conditioning variable is a parent of X, and each entry gives the probability of one value of X given one value assignment of its parents. Two CPTs have the same form if they are tables with the same structure in which the conditional probabilities in corresponding positions represent the same conditional probability; for example, two CPTs both representing P(X | Y, Z) have the same form, whereas a CPT representing P(X | Y) and a CPT representing P(X | Y, Z) do not.

Figure 3. Extension and simplification of CPT.

Bayesian network learning algorithm

Usually, a Bayesian network is learned from a database and represents the probabilistic relationships between the variables in that database. A Bayesian network thus matches a database, which we call the Bayesian network's corresponding database. Bayesian network learning includes structure learning and parameter learning. In this paper we use the information-theoretic learning algorithm proposed by Cheng et al. [11] to learn the Bayesian network's structure.

Dependence between two variables can be quantitatively computed using mutual information. The mutual information I(X;Y) between two variables X and Y is defined as:

I(X;Y) = Σ_{x,y} P(x, y) log [ P(x, y) / (P(x) P(y)) ]   (1)

where x and y range over the expression values of X and Y, respectively. Mutual information is non-negative, I(X;Y) ≥ 0, with equality if and only if X and Y are independent. Given a threshold ε (ε > 0), X and Y are considered related if I(X;Y) > ε. Similarly, the conditional mutual information I(X;Y|Z) is defined as:

I(X;Y|Z) = Σ_{x,y,z} P(x, y, z) log [ P(x, y | z) / (P(x | z) P(y | z)) ]   (2)
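Equation (2) can be evaluated from discrete data by counting joint occurrences. A minimal sketch (natural logarithm; the example series are illustrative):

```python
import math
from collections import Counter

def conditional_mutual_information(xs, ys, zs):
    """I(X;Y|Z) estimated from paired discrete observations, using
    p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ], which is
    Equation (2) rewritten in terms of joint probabilities."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz, pyz, pz = Counter(zip(xs, zs)), Counter(zip(ys, zs)), Counter(zs)
    return sum((c / n) * math.log(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
               for (x, y, z), c in pxyz.items())

xs = [0, 1, 0, 1]
print(conditional_mutual_information(xs, xs, [0] * 4))  # I(X;X|const) = log 2
print(conditional_mutual_information([0, 0, 1, 1], [0, 1, 0, 1], [0] * 4))  # 0.0
```

With a constant Z this reduces to the plain mutual information of Equation (1), as the two example calls show.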

The main procedure of Cheng's Bayesian network structure learning algorithm is as follows:

Step 1. Create the initial undirected graph. A Maximum Weight Spanning Tree (MWST) [23] is used as the initial graph. Let L be the list of all candidate undirected edges over the variable set V, sorted in descending order of mutual information. For each edge in L, add it to the undirected graph (and delete it from L) if it does not form a cycle. The loop ends when the graph contains |V| − 1 edges.
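Step 1 is essentially Kruskal's algorithm run on mutual-information weights. A sketch with a union-find cycle test (the edge weights below are hypothetical MI values):

```python
def max_weight_spanning_tree(nodes, weighted_edges):
    """Build the MWST: scan candidate edges in descending order of
    mutual information, adding each edge unless it would close a
    cycle, until the graph contains |V| - 1 edges."""
    parent = {v: v for v in nodes}

    def find(v):  # union-find root lookup with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for u, v, _w in sorted(weighted_edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:                 # adding (u, v) does not form a cycle
            parent[ru] = rv
            tree.append((u, v))
            if len(tree) == len(nodes) - 1:
                break
    return tree

edges = [("A", "B", 0.9), ("B", "C", 0.5), ("A", "C", 0.1)]
print(max_weight_spanning_tree(["A", "B", "C"], edges))  # [('A', 'B'), ('B', 'C')]
```

The weakest edge ("A", "C") is rejected because it would close a cycle, leaving the two strongest dependences as the initial skeleton.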

Step 2. Add edges. For a candidate edge between X and Y, let C_X be the set of nodes that lie on paths between X and Y and are simultaneously in the neighborhood of X, define C_Y analogously, and let C be whichever of C_X and C_Y contains fewer nodes. For each remaining edge (X, Y) in L, add it to the undirected graph (and delete it from L) if I(X;Y|C) > ε holds.

Step 3. Remove redundant edges. For each edge (X, Y) in the undirected graph, delete it if I(X;Y|C) ≤ ε holds, i.e., if X and Y become independent given the cut-set C.

Step 4. Determine the edges' directions. For each pair of non-adjacent nodes X and Y with a common neighbor Z, orient the edges as X → Z ← Y if

I(X;Y|Z) − I(X;Y) > ε′   (3)

holds, where ε′ is a threshold. The directions of some further undirected edges can then be determined using the Bayesian network's acyclic property. For the remaining undirected edges, the local Minimal Description Length (MDL) score [24] is used to choose the direction that yields the smaller MDL score.

In Bayesian network parameter learning, the following equation is used to compute the conditional probabilities in each node's CPT:

P(x | pa) = N(x, pa) / N(pa)   (4)

where x is a value of node X, pa is a value assignment of its parent set Pa(X), and N(·) is the number of samples in the database satisfying the given assignment.
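Equation (4) amounts to maximum-likelihood estimation by counting. A sketch, with samples represented as dicts from variable name to discrete value (the variable names are illustrative):

```python
from collections import Counter

def learn_cpt(samples, child, parents):
    """Estimate P(x | pa) = N(x, pa) / N(pa) as in Equation (4).
    Returns a dict mapping (x, pa_tuple) to a probability."""
    joint, marginal = Counter(), Counter()
    for s in samples:
        pa = tuple(s[p] for p in parents)
        joint[(s[child], pa)] += 1   # N(x, pa)
        marginal[pa] += 1            # N(pa)
    return {(x, pa): c / marginal[pa] for (x, pa), c in joint.items()}

samples = [{"A": 0, "C": 0}, {"A": 0, "C": 0},
           {"A": 0, "C": 1}, {"A": 1, "C": 1}]
cpt = learn_cpt(samples, child="C", parents=["A"])
print(cpt[(0, (0,))])  # N(C=0, A=0) / N(A=0) = 2/3
```

This CPT representation, keyed by (value, parent-assignment), is also the one assumed in the extension and aggregation sketches below in Methods.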

Extension and simplification of CPT

Theorem 1

Given variables X and Y, P(X | Y) = P(X) (for all values of X and Y) holds if X and Y are independent.

Corollary 1

Given variables X and Y and any other variable Z, P(X | Y, Z) = P(X | Z) holds if X and Y are independent given Z.

Suppose we have the CPT of node X as shown in Figure 3(a); it can be extended into the form shown in Figure 3(b) if X and the added variable are independent of each other. Since they are independent, the added variable cannot affect the distribution of X, so by Corollary 1 the conditional probability is unchanged for each of its values. Accordingly, two CPTs of the same node in different Bayesian networks can be extended into the same form and then aggregated, even if the node does not have the same parent set in the two networks. Specifically, suppose node X has parent sets Pa1(X) and Pa2(X), Pa1(X) ≠ Pa2(X), in BN1 and BN2, respectively. Then the two CPTs of X do not have the same form and the aggregation function cannot be applied. (The CPTs of X shown in Figure 3(b) and Figure 3(c) have the same form, so the aggregation function can be applied to the conditional probabilities in the corresponding positions of the two tables, while it cannot be applied to the CPTs shown in Figure 3(a) and Figure 3(c).) However, we can take Pa1(X) ∪ Pa2(X) as the parent set and extend both CPTs of X into the form P(X | Pa1(X) ∪ Pa2(X)); the two CPTs then have the same form and the aggregation function can be applied. This means that in BN1 we also view the nodes of Pa2(X) \ Pa1(X) as parents of X, although they are not real parents and do not affect X's conditional probability. We call these parents bogus parents, and the directed edges between a node and its bogus parents bogus edges; an example is the dashed edge shown in Figure 4(c).
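The extension step can be sketched as replicating each CPT entry over every value of the added (bogus) parents, since by Corollary 1 the probability does not depend on them. CPTs are represented as dicts from (value, parent-value tuple) to probability; all names and numbers are illustrative:

```python
from itertools import product

def extend_cpt(cpt, old_parents, new_parents, domains):
    """Extend a CPT from conditioning set old_parents to the superset
    new_parents: each entry P(x | old_pa) is copied unchanged for every
    value combination of the added parents (the bogus parents)."""
    added = [p for p in new_parents if p not in old_parents]
    extended = {}
    for (x, old_pa), prob in cpt.items():
        assignment = dict(zip(old_parents, old_pa))
        for combo in product(*(domains[p] for p in added)):
            assignment.update(zip(added, combo))
            new_pa = tuple(assignment[p] for p in new_parents)
            extended[(x, new_pa)] = prob   # probability is unchanged
    return extended

# P(C | A) extended to the form P(C | A, B), with B a bogus parent.
cpt = {(0, (0,)): 0.7, (1, (0,)): 0.3, (0, (1,)): 0.2, (1, (1,)): 0.8}
ext = extend_cpt(cpt, ["A"], ["A", "B"], {"B": [0, 1]})
print(ext[(0, (0, 0))], ext[(0, (0, 1))])  # both equal 0.7
```

Every row of the extended table is constant in B, which is exactly the property the variance test later exploits to detect and remove bogus parents.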

Figure 4. An example to demonstrate the combination of two Bayesian networks.

Two Bayesian networks with given weights w1 and w2 are to be combined. The CPTs of one of their nodes do not have the same form, so they need to be extended. After the extension, the two networks' structures and the forms of every pair of corresponding CPTs are completely the same (the dashed edges represent bogus edges), and the aggregation function can be applied to the conditional probabilities in the corresponding positions of each pair of corresponding CPTs. In the combined Bayesian network, variance is used to test that node's two parents: the parent whose average variance falls below the threshold δ is a bogus parent, so its CPT entries are simplified and the bogus edge is deleted, yielding the consensus Bayesian network.

Theorem 2

Given variables X and Y, X is independent of Y if the conditional probability of X given Y does not change when Y takes different values.

Proof

Assume that Y has k possible values y_1, …, y_k and that, for every value of Y, P(X = x | Y = y_i) = p_x, a constant not depending on y_i. Then for every x we have:

P(x) = Σ_{i=1}^{k} P(x | y_i) P(y_i) = p_x Σ_{i=1}^{k} P(y_i) = p_x

So P(x | y) = P(x) for all y, and therefore X and Y are independent.

End of the proof.

Corollary 2

Given variables X and Y and any other variable Z, X is independent of Y given Z if the conditional probability of X given Y and Z does not change when Y takes different values (Z being held fixed, only Y changes).

Theorem 2 and Corollary 2 can be used to determine whether two nodes are independent of each other. Suppose we have the CPT of node X as shown in Figure 3(c): if the conditional probabilities of X for the different values of a parent Y are equal, or approximately equal, it follows that X and Y are independent and Y is not a parent of X; the CPT of node X can then be simplified into the form shown in Figure 3(d). Because the conditional probabilities in the CPT of X are discrete values, variance can be used to determine whether the conditional probability of X changes when Y takes different values. Assume X's parent set is Pa(X) with Y ∈ Pa(X). First, for each value x of X and each value assignment of Pa(X) \ {Y}, compute the variance σ² of the conditional probabilities obtained as Y takes its different values. Second, compute the average variance σ̄² of all these σ² over the values of x and the assignments of Pa(X) \ {Y}. Given a threshold δ (δ > 0), if σ̄² < δ, the conditional probability of X almost does not change when Y takes different values, so X and Y are independent and Y is not a parent of X. In the combination algorithm, CPTs are extended beforehand, so some nodes may have bogus parents after the aggregation of CPTs; this method finds them, after which the CPTs are simplified and the bogus edges deleted. The threshold δ can be selected using domain knowledge: the average variance is clearly positive when the variables are related and close to zero when they are independent, so δ is chosen between the average variances typically observed in these two cases.
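The variance test can be sketched as follows: for each value of X and each assignment of the other parents, take the variance of P(x | pa) as the candidate parent's value varies, average these variances, and compare with δ. The CPT layout, names, and threshold are illustrative:

```python
from itertools import product
from statistics import pvariance

def is_bogus_parent(cpt, x_values, parents, domains, candidate, delta):
    """Return True if `candidate` is a bogus parent of X: the average
    variance of P(x | parents), taken as the candidate's value varies
    with everything else fixed, falls below the threshold delta."""
    others = [p for p in parents if p != candidate]
    variances = []
    for x in x_values:
        for rest in product(*(domains[p] for p in others)):
            assignment = dict(zip(others, rest))
            probs = []
            for v in domains[candidate]:
                assignment[candidate] = v
                pa = tuple(assignment[p] for p in parents)
                probs.append(cpt[(x, pa)])
            variances.append(pvariance(probs))  # spread over candidate's values
    return sum(variances) / len(variances) < delta

# P(X | A, B) in which B visibly changes nothing.
cpt = {(0, (0, 0)): 0.7, (0, (0, 1)): 0.7, (0, (1, 0)): 0.2, (0, (1, 1)): 0.2,
       (1, (0, 0)): 0.3, (1, (0, 1)): 0.3, (1, (1, 0)): 0.8, (1, (1, 1)): 0.8}
dom = {"A": [0, 1], "B": [0, 1]}
print(is_bogus_parent(cpt, [0, 1], ["A", "B"], dom, "B", 1e-6))  # True
print(is_bogus_parent(cpt, [0, 1], ["A", "B"], dom, "A", 1e-6))  # False
```

In this toy CPT the rows are constant in B but vary strongly in A, so B is flagged as bogus while A is retained.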

Aggregation function

Assume that the conditional probability of node X given a parent assignment pa in Bayesian networks BN1 and BN2 is P1(x | pa) and P2(x | pa), respectively. Then in the consensus Bayesian network the conditional probability of X can be computed using the following equation:

P(x | pa) = (w1 · P1(x | pa) + w2 · P2(x | pa)) / (w1 + w2)   (5)

where w1 is the weight of BN1 and w2 is the weight of BN2. A weight is a positive integer representing the degree of belief in the corresponding Bayesian network: w1 > w2 means BN1 is more reliable than BN2, and in the limiting case w2 = 0, BN1 is treated as absolutely reliable.

Next, we would like to discuss why we choose this aggregation function. The combination of Bayesian networks must satisfies this property: the consensus Bayesian network Inline graphic constructed by combining Bayesian networks Inline graphic and Inline graphic is equivalent to the Bayesian network learned from the database obtained by merging the two Bayesian networks' corresponding databases Inline graphic and Inline graphic. Then the aggregation function should satisfy it too. Assume that node Inline graphic is not the parent of Inline graphic in any Bayesian network, then in the consensus Bayesian network, Inline graphic can not be the parent of Inline graphic. Then CPTs of Inline graphic in different Bayesian networks after extension not only have a same form, but also contains all of Inline graphic's possible parent nodes. So, we needn't consider the nodes which are not included in the parent set of Inline graphic when aggregating the CPTs. When computing the conditional probability of one node in Bayesian network, Equation (4) is used. The conditional probability of Inline graphic in Bayesian networks Inline graphic and Inline graphic are Inline graphic and Inline graphic, respectively. Assume that the number of samples satisfy Inline graphic in Inline graphic is Inline graphic, then the number of samples satisfy Inline graphic and Inline graphic in Inline graphic is Inline graphic; the number of samples satisfy Inline graphic in Inline graphic is Inline graphic, then the number of samples satisfy Inline graphic and Inline graphic in Inline graphic is Inline graphic. So, the conditional probability of Inline graphic in the Bayesian network learned from the database obtained by merging Inline graphic and Inline graphic is:

[Equation (6): image not reproduced]

On the other hand, the samples satisfying Inline graphic in Inline graphic and in Inline graphic obey the same distribution. So, we have:

[Equation (7): image not reproduced]

where Inline graphic and Inline graphic are the total numbers of samples in Inline graphic and Inline graphic, respectively. The conditional probability of Inline graphic then becomes:

[Equation (8): image not reproduced]

The total numbers of samples in the databases are sometimes unknown, so we use the weights of the Bayesian networks in their place, which turns Equation (8) into Equation (5). In the experiment, we still use the total numbers of samples, as they are already known.
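The weighted aggregation of Equation (8) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `aggregate_cpts` and the dict-of-tuples CPT layout are our own assumptions, not part of the paper; the weights play the role of the sample counts.

```python
def aggregate_cpts(cpt1, w1, cpt2, w2):
    """Weighted average of two CPTs that share the same form.

    Each CPT maps a parent configuration (a tuple) to a distribution
    over the node's values (a dict value -> probability).  The weights
    w1 and w2 play the role of the sample counts in Equation (8).
    """
    merged = {}
    for config in cpt1:
        merged[config] = {
            value: (w1 * cpt1[config][value] + w2 * cpt2[config][value]) / (w1 + w2)
            for value in cpt1[config]
        }
    return merged

# Two CPTs for a binary node X with one binary parent (illustrative data).
cpt_a = {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}}
cpt_b = {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.4, 1: 0.6}}
merged = aggregate_cpts(cpt_a, 100, cpt_b, 300)
print(merged[(0,)])  # weighted toward cpt_b, which has the larger weight
```

With weights 100 and 300, the entry for parent value 0 becomes (100·0.9 + 300·0.7)/400 = 0.75, matching the form of Equation (8).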

Combination of Bayesian networks

If two Bayesian networks are defined over the same variable set and their variables' prior orders are consistent with each other, they can be combined using the following method:

Step 1. Extend every pair of corresponding CPTs in the two Bayesian networks into the same form. The structures of the two Bayesian networks are then completely identical (although some of their edges are bogus edges).

Step 2. Use the aggregation function to aggregate the conditional probabilities at the corresponding positions of each pair of corresponding CPTs.

Step 3. In each CPT after aggregation, compute the variance Inline graphic for each parent node Inline graphic and test whether Inline graphic holds to judge whether node Inline graphic is a bogus parent; if Inline graphic holds, simplify the CPT and delete the bogus edge.

After simplifying the CPTs and deleting the bogus edges, the consensus Bayesian network is obtained. Figure 4 shows an example of the combination of two Bayesian networks. However, the variables' prior orders of two Bayesian networks are not always consistent with each other, so some directed edges may need to be reversed. The principle of reversal is to ensure that the Bayesian network after reversal is equivalent to the original one.
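The bogus-parent test in Step 3 can be sketched as follows. This is an illustrative sketch under our own assumptions: the paper's exact variance statistic is not reproduced here, so we use the variance of the conditional probabilities across the candidate parent's values (other parents held fixed), and the threshold `eps`, the function name, and the CPT layout are all hypothetical.

```python
from itertools import groupby

def is_bogus_parent(cpt, parent_index, eps=1e-6):
    """Judge a parent bogus if, with the other parents fixed, the
    conditional distribution barely varies with that parent's value."""
    def others(config):
        # Configuration of all parents except the one under test.
        return tuple(v for i, v in enumerate(config) if i != parent_index)

    for _, group in groupby(sorted(cpt, key=others), key=others):
        rows = [cpt[config] for config in group]
        for value in rows[0]:
            probs = [row[value] for row in rows]
            mean = sum(probs) / len(probs)
            variance = sum((p - mean) ** 2 for p in probs) / len(probs)
            if variance >= eps:
                return False
    return True

# X has parents (A, B); the distribution ignores B, so B is bogus.
cpt = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.9, 1: 0.1},
      (1, 0): {0: 0.2, 1: 0.8}, (1, 1): {0: 0.2, 1: 0.8}}
print(is_bogus_parent(cpt, 0))  # A: distributions differ -> False
print(is_bogus_parent(cpt, 1))  # B: distributions identical -> True
```

When a parent is judged bogus, the CPT is simplified by keeping one representative row per configuration of the remaining parents and the corresponding edge is deleted.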

Extension of Bayesian network

Sometimes the Bayesian networks to be combined are not defined over the same variable set, so they need to be extended first. Specifically, given two Bayesian networks Inline graphic and Inline graphic whose variable sets satisfy Inline graphic and Inline graphic, if their variables' prior orders are consistent with each other, Inline graphic can be extended into Inline graphic using the following method:

Step 1. Extend Inline graphic's DAG Inline graphic into Inline graphic. Let Inline graphic, and then add all the directed edges satisfying

[Equation: image not reproduced]

into graph Inline graphic. These added edges are not in Inline graphic originally, so we call them extended edges.

Step 2. Compute each node's CPT. For a node Inline graphic: if Inline graphic, its CPT is the same as the CPT of Inline graphic in Inline graphic; if Inline graphic and there is no directed edge satisfying Inline graphic, its CPT is the same as the CPT of Inline graphic in Inline graphic; if Inline graphic and there are directed edges satisfying Inline graphic, three possible situations may appear in the extended Bayesian network, as shown in Figure 5. The conditional probabilities of Inline graphic in these three situations can be computed with the following equations, respectively:

Figure 5. Three possible situations in the extended Bayesian network Inline graphic.


Here Inline graphic, Inline graphic, Inline graphic and Inline graphic may be two nodes or two node sets, each node of which has a directed edge pointing to Inline graphic. In Inline graphic, solid lines represent the edges originally in Inline graphic, and dashed lines represent the extended edges. The undirected edge Inline graphic represents one of three cases: (1) the directed edge Inline graphic; (2) the directed edge Inline graphic; (3) Inline graphic and Inline graphic are disconnected.

In Figure 5(a)

[Equation (9): image not reproduced]

where Inline graphic is the number of expression values of Inline graphic. Inline graphic and Inline graphic can be computed using the standard Bayesian network inference algorithm [25] in Inline graphic, while Inline graphic is already known in Inline graphic.

In Figure 5(b)

[Equation (10): image not reproduced]

In Figure 5(c)

[Equation (11): image not reproduced]

If Inline graphic and Inline graphic are disconnected in Inline graphic, we can only deduce that Inline graphic and Inline graphic are independent given Inline graphic; however, this does not affect the conditional probabilities Inline graphic and Inline graphic, so the conditional probability of Inline graphic can be computed using Equation (9). If both Inline graphic and Inline graphic, and Inline graphic and Inline graphic, are disconnected in Inline graphic, it follows that Inline graphic and Inline graphic are independent, so the conditional probability of Inline graphic can be computed using Equation (10). After obtaining every node's CPT in Inline graphic, the extension of Bayesian network Inline graphic is finished.

After extending Inline graphic into Inline graphic and Inline graphic into Inline graphic, Inline graphic and Inline graphic are defined over the same variable set Inline graphic, and they can then be combined using the combination algorithm.
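The simplest extension case, where the new parent is independent of the node (the Equation (10)-style situation), amounts to replicating the CPT rows for every value of the extended parent. The sketch below illustrates only this case; the function name `extend_cpt` and the CPT layout are our own illustrative assumptions.

```python
def extend_cpt(cpt, n_new_parent_values):
    """Replicate each row of a CPT for every value of a new (extended)
    parent appended at the end of the parent configuration.  Because the
    node is independent of the new parent, its distribution is unchanged."""
    return {
        config + (y,): dict(dist)
        for config, dist in cpt.items()
        for y in range(n_new_parent_values)
    }

# X originally has one binary parent A; extend with a binary parent Y.
cpt_x = {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}}
extended = extend_cpt(cpt_x, 2)
print(extended[(0, 0)] == extended[(0, 1)])  # True: Y never changes P(X|A)
```

The edge added this way is exactly a bogus edge in the sense of the combination algorithm: the variance test of Step 3 would later remove it again if no other network supports it.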

Funding Statement

This research is supported by Special Science Foundation for the Doctoral Program of China (No. 200801831011) and Postdoctoral Science Foundation of China (No. 20100481053). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Ramotar D, Popoff SC, Gralla EB, Demple B (1991) Cellular role of yeast apn1 apurinic endonuclease/3′-diesterase: repair of oxidative and alkylation dna damage and control of spontaneous mutation. Molecular and Cellular Biology 11: 4537–4544. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Demple B, Harrison L (1994) Repair of oxidative damage to dna: enzymology and biology. Annual Review of Biochemistry 63: 915–948. [DOI] [PubMed] [Google Scholar]
  • 3. Bohr VA, Dianov GL (1999) Oxidative dna damage processing in nuclear and mitochondrial dna. Biochimie 81: 155–160. [DOI] [PubMed] [Google Scholar]
  • 4. Friedman N, Linial M, Nachman I, Pe'er D (2000) Using bayesian networks to analyze expression data. Journal of Computational Biology 7: 601–620. [DOI] [PubMed] [Google Scholar]
  • 5. Irene M, Jeremy D, David P (2002) Modelling regulatory pathways in e. coli from time series expression profiles. Bioinformatics 18: S241–S248. [DOI] [PubMed] [Google Scholar]
  • 6. Beal M, Falciani F, Ghahramani Z, Rangel C, Wild D (2005) A bayesian approach to reconstructing genetic regulatory networks with hidden factors. Bioinformatics 21: 349–356. [DOI] [PubMed] [Google Scholar]
  • 7. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D (2007) How to infer gene networks from expression profiles. Molecular Systems Biology 3: 78. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Pearl J (1986) Fusion, propagation and structuring in belief networks. Artificial Intelligence 29: 241–288. [Google Scholar]
  • 9.Jensen F (2001) Bayesian networks and decision graphs. New York: Springer. [Google Scholar]
  • 10. Cooper GF, Herskovits E (1992) A bayesian method for the induction of probabilistic networks from data. Machine Learning 9: 309–347. [Google Scholar]
  • 11. Cheng J, Greiner R, Kelly J, Bell D, Liu WR (2002) Learning bayesian networks from data: an information-theory based approach. Artificial Intelligence 137: 43–90. [Google Scholar]
  • 12. Tsamardinos I, Brown LE, Aliferis CF (2006) The max-min hill-climbing bayesian network structure learning algorithm. Machine Learning 65: 31–78. [Google Scholar]
  • 13. Cano A, Masegosa A, Moral S (2011) A method for integrating expert knowledge when learning bayesian networks from data. IEEE transactions on systems, man, and cybernetics 41: 1382–1394. [DOI] [PubMed] [Google Scholar]
  • 14. Wong SM, Butz CJ (2001) Constructing the dependency structure of a multiagent probabilistic network. IEEE Transactions on Knowledge and Data Engineering 13: 395–415. [Google Scholar]
  • 15. Yang ZQ, Wright RN (2006) Privacy-preserving computation of bayesian networks on vertically partitioned data. IEEE Transactions on Knowledge and Data Engineering 18: 1253–1264. [Google Scholar]
  • 16. Pavlin G, de Oude P, Maris M, Nunnink J, Hood T (2010) A multi-agent systems approach to distributed bayesian information fusion. Information Fusion 11: 267–282. [Google Scholar]
  • 17. Sagrado J, Moral S (2003) Qualitative combination of bayesian networks. International Journal of Intelligent Systems 18: 237–249. [Google Scholar]
  • 18. Utz CM (2010) Learning ensembles of bayesian network structures using random forest techniques. Master Thesis of University of Oklahoma [Google Scholar]
  • 19. Zhang Y, Yue K, Yue M, Liu W (2011) An approach for fusing bayesian networks. Journal of Information and Computational Science 8: 194–201. [Google Scholar]
  • 20. Hodges AP, Dai DJ, Xiang ZS, Woolf P, Xi CW, et al. (2010) Bayesian network expansion identifies new ros and biofilm regulators. PLoS ONE 5: e9513. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Faith JJ, Driscoll ME, Fusaro VA, Cosgrove EJ, Hayete B, et al. (2008) Many microbe microarrays database: uniformly normalized affymetrix compendia with structured experimental metadata. Nucleic Acids Research 36: D866–D870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Keseler IM, Collado VJ, Gama CS, Ingraham J, Paley S (2005) Ecocyc: a comprehensive database resource for escherichia coli. Nucleic Acids Research 33: D334–D337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Chow CK, Liu CN (1968) Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 14: 462–467. [Google Scholar]
  • 24. Lam W, Bacchus F (1994) Learning bayesian belief networks: an approach based on the mdl principle. Computational Intelligence 10: 269–293. [Google Scholar]
  • 25.Zhang LW, Guo HP (2006) Introduction to Bayesian network. Peking: Science Press. [Google Scholar]
