A new Bayesian piecewise linear regression model for dynamic network reconstruction

Mahdi Shafiee Kamalabad; Marco Grzegorczyk

doi:10.1186/s12859-021-03998-9

2021 Apr 26;22(Suppl 2):196.

PMCID: PMC8074473 PMID: 33902443

Abstract

Background

Linear regression models are important tools for learning regulatory networks from gene expression time series. A conventional assumption for non-homogeneous regulatory processes on a short time scale is that the network structure stays constant across time, while the network parameters are time-dependent. The objective is then to learn the network structure along with changepoints that divide the time series into time segments. An uncoupled model learns the parameters separately for each segment, while a coupled model enforces the parameters of any segment to stay similar to those of the previous segment. In this paper, we propose a new consensus model that infers for each individual time segment whether it is coupled to (or uncoupled from) the previous segment.

Results

The results show that the new consensus model is superior to the uncoupled and the coupled model, as well as superior to a recently proposed generalized coupled model.

Conclusions

The newly proposed model has the uncoupled and the coupled model as limiting cases, and it is able to infer the best trade-off between them from the data.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-03998-9.

Keywords: Bayesian piece-wise linear regression, Gene regulatory networks, Network reconstruction, Segment-wise parameter coupling

Background

Non-homogeneous dynamic Bayesian networks have become a popular tool for learning the structures of cellular regulatory networks from gene expression and protein concentration data. The traditional (homogeneous) dynamic Bayesian network models assume the network parameters to stay constant across time. This can lead to biased results and wrong conclusions, as cellular regulatory processes can change in time. It was therefore proposed to combine dynamic Bayesian network models with Bayesian changepoint processes, see, e.g., [1–3]. A multiple changepoint process is then used to divide the temporal data into disjoint segments, and the data within each segment are modelled by linear regression models. For most cellular processes on a short time scale it is not realistic to assume that the network structure changes over time. The network structure is therefore usually assumed to stay unchanged and only the network parameters are assumed to be time-varying. As a motivation for this assumption consider a gene regulatory network, in which an edge from gene $Z_{i}$ to gene $Z_{j}$ , $Z_{i} \to Z_{j}$ , would typically indicate that gene $Z_{i}$ codes for a transcription factor that can bind to the promoter of gene $Z_{j}$ , so that $Z_{j}$ ’s transcription is initiated. The ability to bind to the promoter (= the edge connection) is unlikely to change within a short time period, whereas the extent of binding (= the network interaction parameter) can undergo quick temporal changes. Regarding our two real-life applications to S. cerevisiae (yeast) and A. thaliana (plant) gene expression data, the assumption of a fixed network structure therefore seems more faithful.

The uncoupled model, akin to the models proposed by Lèbre et al. [1] and Dondelinger et al. [3], learns the segment-specific network parameters for each segment separately. To allow for information-sharing with respect to the network parameters, models with globally [4] and sequentially [5] coupled network parameters were proposed. As sequential information-sharing seems more suitable for temporally ordered segments, we focus here on the sequential coupling. The underlying idea is that the network parameters of each segment should be enforced to stay similar to those of the previous segment. Grzegorczyk and Husmeier [5] proposed a coupled model, in which the posterior expectations of the network parameters of segment h are used as prior expectations for the next segment $h + 1$ . The strength of the coupling, i.e. the variance of the network parameter priors, is regulated by a coupling parameter. Although it was shown that this is very useful for applications where the network parameters stay similar over time, the fully coupled model has the drawback that it enforces coupling and does not feature any possibility for uncoupling. In this paper we therefore propose a partially segment-wise coupled model, which can be seen as a consensus model between the uncoupled and the fully coupled model. Discrete binary indicator variables $δ_{h}$ indicate for each segment h whether it is coupled to the previous segment ( $δ_{h} = 1$ ) or uncoupled from it ( $δ_{h} = 0$ ). Along with the network structure and the data segmentation the values of those indicator variables are inferred from the data. The new partially coupled model reaches the original models in the limit: If it couples all segments ( $δ_{h} = 1$ for all $h \geq 2$ ), it becomes the fully coupled model. If it uncouples all segments ( $δ_{h} = 0$ for all h), it becomes the uncoupled model.

In our earlier work [6] we have proposed a new generalized fully coupled model. While the fully coupled model from [5] couples all neighbouring segments $(h - 1, h)$ with the same coupling strength $λ \in R^{+}$ , the generalised (fully) coupled model from [6] uses for each pair of neighbouring segments $(h - 1, h)$ a segment-specific coupling strength parameter $λ_{h} \in R^{+}$ . This leads to a higher model flexibility, but like the coupled model the generalized coupled model still does not allow for uncoupling. In our comparative evaluation study, we will compare the new partially coupled model with the three competing models: the uncoupled model, the (fully) coupled model, and the generalized (fully) coupled model.

In recent works alternative model refinements have been proposed [7, 8]. These models distinguish coupled from uncoupled network edges rather than distinguishing coupled from uncoupled time segments. The partially non-homogeneous model from Shafiee Kamalabad et al. [7] builds on the idea that only some network parameters (i.e. some edges) might be subject to changes, while other network parameters (i.e. edges) might stay constant. The model has been designed for analysing data that have been measured under different experimental conditions, so that it does not allow the segmentation of a time series to be inferred. The non-homogeneous model from Shafiee Kamalabad and Grzegorczyk [8] distinguishes between two groups of edges: (i) edges that are fully coupled among all segments and (ii) edges that are uncoupled among all segments. The new model that we propose here is conceptually related, but complementary in that it replaces the concept of partially coupled edges by the concept of partially coupled time segments.

We note that network reconstruction is a topical research field in the computational biology literature and that many different network reconstruction approaches have been proposed over the years. However, most of the proposed models do not focus on non-homogeneous regulatory processes but rely on the homogeneity of the regulatory processes. For some applications this assumption of homogeneity can be too restrictive; compare, e.g., our data applications. In response to one of the reviewers of our paper, we here briefly discuss a few recently proposed network reconstruction methods. Vignes et al. [9] investigated and compared a wide variety of methods, ranging from Bayesian networks to penalised linear regression based models, and proposed a meta-analysis based on Fisher’s Inverse Chi-Square meta-test for combining different approaches. Huang et al. [10] proposed to apply Bayesian model averaging for linear regression methods. The method uses a closed form solution to compute the edge posterior probabilities within a hybrid framework of Bayesian model averaging and linear regression. Xing et al. [11] proposed a Candidate Auto Selection algorithm based on the pairwise mutual information and breakpoint detection. A greedy search algorithm is then used to search for the best network topology. Unlike the above-mentioned models, Fan et al. [12] proposed to impose a prior on the topology information in their inference process. Incorporating this prior information can partially compensate for the lack of reliable data. They then developed a Bayesian group lasso with spike and slab prior approach based on non-parametric models. Xu et al. [13] proposed to employ a series of linear regression problems to model the relationship between the network nodes. They use an efficient variational Bayes method for optimization and inference of the unknown network parameters.

Methods

Learning dynamic networks with time-varying parameters

Consider N random variables $Z_{1}, \dots, Z_{N}$ that are the nodes of a network. Let $D$ denote an N-by- $(T + 1)$ data matrix, whose N rows correspond to the variables and whose $T + 1$ columns correspond to time points $t = 1, \dots, T + 1$ . The element in the ith row and tth column, $D_{i, t}$ , is the value of $Z_{i}$ at time point t. For temporal data it is typically assumed that the regulatory interactions are subject to a lag of one time point. For example, an edge $Z_{i} \to Z_{j}$ indicates that $D_{j, t + 1}$ ( $Z_{j}$ at $t + 1$ ) depends on $D_{i, t}$ ( $Z_{i}$ at t). The variable $Z_{i}$ is then called a parent (node) of $Z_{j}$ .

Because of the lag, there is no need for any acyclicity constraint, and for each node $Z_{j}$ ( $j = 1, \dots, N$ ) the parent nodes can be learned separately. This has computational advantages, since the ‘network learning task’ can be separated into N independent ‘parent learning tasks’. Hence, when a computer cluster is available, the N parent sets can be learned in parallel, so that the inference algorithms scale up well.

A popular method is to apply linear regression, where $Y : = Z_{j}$ is the response and $\{Z_{1}, \dots, Z_{j - 1}, Z_{j + 1}, \dots, Z_{N}\} = : \{X_{1}, \dots, X_{n}\}$ are potential covariates (with $n : = N - 1$ ). Because of the lag, $T + 1$ time points yield T observations for the linear regression model. Each observation $D_{t}$ $(t \in \{1, \dots, T\})$ consists of a response value $Y = D_{j, t + 1}$ and the shifted covariate values: $X_{1} = D_{1, t}, \dots, X_{j - 1} = D_{j - 1, t}, X_{j} = D_{j + 1, t}, \dots, X_{n} = D_{N, t}$ .
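
The lag-one construction can be illustrated with a minimal Python sketch. The function name `lagged_regression_data`, the list-of-rows data layout and the 0-based node index are assumptions made here for illustration only; they are not part of the original implementation.

```python
def lagged_regression_data(D, j):
    """Build lag-one regression data for target node j (0-based index).

    D is an N-by-(T+1) data matrix given as a list of N rows.
    Returns (y, X): y[t] is the value of node j at the next time point,
    and X[t] holds the values of all other nodes at the current time point.
    """
    N = len(D)
    T = len(D[0]) - 1  # T + 1 time points yield T observations
    y = [D[j][t + 1] for t in range(T)]
    X = [[D[i][t] for i in range(N) if i != j] for t in range(T)]
    return y, X


# Toy data (made up): N = 3 nodes observed at T + 1 = 4 time points.
D = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.6, 0.7, 0.8],
    [9.0, 8.0, 7.0, 6.0],
]
y, X = lagged_regression_data(D, j=1)
print(y)  # responses of the second node, shifted by one time point
print(X)  # covariate values of the remaining n = N - 1 nodes
```

Because each target node only needs its own `(y, X)` pair, the N parent learning tasks can be run independently of each other.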

Having inferred a covariate set $π_{j}$ for each $Z_{j}$ , a network is built by merging the covariate sets: $G : = \{π_{1}, \dots, π_{N}\}$ . There is an edge $Z_{i} \to Z_{j}$ in $G$ if and only if $Z_{i} \in π_{j}$ .

As the same linear regression approaches are used for each $Z_{j}$ , we describe the models using a general terminology: Let Y be the response and let $X_{1}, \dots, X_{n}$ be the covariates of the linear regression model.

To allow for time-dependent regression coefficients, a piece-wise linear regression model can be used. Changepoints $τ : = \{τ_{1}, \dots, τ_{H - 1}\}$ with $1 \leq τ_{h} < T$ divide the observations $D_{1}, \dots, D_{T}$ into disjoint segments $h = 1, \dots, H$ containing $T_{1}, \dots, T_{H}$ consecutive data points, so that $\sum_{h = 1}^{H} T_{h} = T$ . Observation $D_{t}$ ( $1 \leq t \leq T$ ) belongs to segment h if $τ_{h - 1} < t \leq τ_{h}$ , where $τ_{0} : = 0$ and $τ_{H} : = T$ are two pseudo changepoints.

We assume all covariate sets $π \subset \{X_{1}, \dots, X_{n}\}$ with up to $F = 3$ covariates to be equally likely a priori, $p (π) = c$ , while covariate sets with more than $F$ covariates get a zero prior probability (‘fan-in restriction’). Further we assume that the distance between changepoints is geometrically distributed with hyperparameter $p \in (0, 1)$ , so that

\begin{matrix} p (τ) = {(1 - p)}^{τ_{H} - τ_{H - 1} - 1} \cdot \prod_{h = 1}^{H - 1} p \cdot {(1 - p)}^{τ_{h} - τ_{h - 1} - 1} = {(1 - p)}^{(T - 1) - (H - 1)} \cdot p^{H - 1} \end{matrix}
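
As a small illustration of the segmentation rule and of the closed-form changepoint prior above, consider the following Python sketch. The helper names `segment_of` and `log_changepoint_prior` and the toy values of T, the changepoints and p are hypothetical.

```python
import math
from bisect import bisect_left


def segment_of(t, changepoints):
    """1-based segment index of observation t (1 <= t <= T).

    changepoints is the sorted list [tau_1, ..., tau_{H-1}]; observation t
    lies in segment h when tau_{h-1} < t <= tau_h.
    """
    return bisect_left(changepoints, t) + 1


def log_changepoint_prior(changepoints, T, p):
    """Closed form: log p(tau) = ((T-1)-(H-1)) log(1-p) + (H-1) log(p)."""
    n_cp = len(changepoints)  # this is H - 1
    return (T - 1 - n_cp) * math.log(1.0 - p) + n_cp * math.log(p)


T = 10
tau = [3, 7]  # H = 3 segments: {1..3}, {4..7}, {8..10}
print([segment_of(t, tau) for t in range(1, T + 1)])
print(log_changepoint_prior(tau, T, p=0.1))
```

Working on the log scale keeps the prior numerically stable for long time series, which matters once it enters Metropolis-Hastings acceptance ratios.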

With $y = y_{τ} : = \{y_{1}, \dots, y_{H}\}$ being the set of segment-specific response vectors, implied by the changepoint set $τ$ , the posterior distribution takes the form:

\begin{matrix} p (π, τ, θ | y) \propto p (π) \cdot p (τ) \cdot p (θ | π, τ) \cdot p (y | π, τ, θ) \end{matrix}

where $θ = θ (π, τ)$ denotes the set of all model parameters, including segment-specific parameters as well as parameters that are shared among segments.

In the following subsections we assume $π \subset \{X_{1}, \dots, X_{n}\}$ and the segmentation $y = \{y_{1}, \dots, y_{H}\}$ , induced by $τ$ , to be fixed, and we do not make $π$ and $τ$ explicit anymore. Without loss of generality, we assume that $π$ contains the first k covariates: $π : = \{X_{1}, \dots, X_{k}\}$ . For fixed $π$ and $τ$ , Eq. (1) reduces to:

\begin{matrix} p (θ | y) \propto p (θ) \cdot p (y | θ) \end{matrix}

A generic Bayesian piece-wise linear regression model

Consider a Bayesian linear regression model, where Y is the response and $X_{1}, \dots, X_{k}$ are the covariates. We assume that T observations $D_{1}, \dots, D_{T}$ have been made at equidistant time points and that the data can be subdivided into disjoint segments $h \in \{1, \dots, H\}$ , where segment h contains $T_{h}$ data points and has a segment-specific regression coefficient vector $w_{h}$ . Let $y_{h}$ be the response vector and $X_{h}$ be the design matrix for segment h, where each $X_{h}$ includes a first column of 1’s for the intercept. For each segment $h = 1, \dots, H$ we assume a Gaussian likelihood:

\begin{matrix} y_{h} | (w_{h}, σ^{2}) \sim N (X_{h} w_{h}, σ^{2} I) \end{matrix}

where $I$ is the identity matrix, and $σ^{2}$ is a noise variance parameter that is shared among all segments. We impose an inverse Gamma prior on $σ^{2}$ , $σ^{- 2} \sim G A M (α_{σ}, β_{σ})$ , and we assume that the vectors $w_{h}$ have Gaussian priors:

\begin{matrix} w_{h} | (μ_{h}, Σ_{h}, σ^{2}) \sim N (μ_{h}, σ^{2} Σ_{h}) \end{matrix}

where $μ_{h}$ is a $(k + 1)$ -dimensional vector, and $Σ_{h}$ is a positive definite $(k + 1)$ -by- $(k + 1)$ matrix. Re-using the parameter $σ^{2}$ in Eq. (3) yields a fully-conjugate prior in both $w_{h}$ and $σ^{2}$ (see, e.g., Sections 3.3 and 3.4 in Gelman [14]). Figure 1 shows a graphical model representation of this generic model. For notational convenience we define:

\begin{matrix} θ : = \{μ_{1}, \dots, μ_{H} ; Σ_{1}, \dots, Σ_{H}\} \end{matrix}

Fig. 1 — Graphical representation of the generic model. Parameters that have to be inferred are represented by white circles. The data and the fixed hyperparameters are represented by grey circles. Circles within the plate are specific for segment h

The full conditional distribution of $w_{h}$ is (cp. Section 3.3 in [15]):

\begin{matrix} w_{h} | (y_{h}, σ^{2}, θ) \sim N ({[Σ_{h}^{- 1} + X_{h}^{T} X_{h}]}^{- 1} (Σ_{h}^{- 1} μ_{h} + X_{h}^{T} y_{h}), σ^{2} {(Σ_{h}^{- 1} + X_{h}^{T} X_{h})}^{- 1}) \end{matrix}

and the segment-specific marginal likelihoods with $w_{h}$ integrated out are:

\begin{matrix} y_{h} | (σ^{2}, θ) \sim N (X_{h} μ_{h}, σ^{2} C_{h} (θ)) \end{matrix}

where $C_{h} (θ) : = I + X_{h} Σ_{h} X_{h}^{T}$ (cp. Section 3.3 in [15]). From Eq. (5) we get:

\begin{matrix} p (σ^{2} | y, θ) \propto p (σ^{2}) \cdot \prod_{h = 1}^{H} p (y_{h} | σ^{2}, θ) \propto {(σ^{- 2})}^{α_{σ} + \frac{1}{2} \cdot T - 1} e^{- σ^{- 2} (β_{σ} + \frac{1}{2} \cdot Δ^{2} (θ))} \end{matrix}

where $y : = \{y_{1}, \dots, y_{H}\}$ and $Δ^{2} (θ) : = \sum_{h = 1}^{H} {(y_{h} - X_{h} μ_{h})}^{T} C_{h} {(θ)}^{- 1} (y_{h} - X_{h} μ_{h})$ . The shape of $p (σ^{2} | y, θ)$ implies:

\begin{matrix} σ^{- 2} | (y, θ) \sim G A M (α_{σ} + \frac{1}{2} \cdot T, β_{σ} + \frac{1}{2} \cdot Δ^{2} (θ)) \end{matrix}

For the marginal likelihood, with $w_{h}$ ( $h = 1, \dots, H$ ) and $σ^{2}$ integrated out, we apply the rule from Section 2.3.7 of Bishop [15]:

\begin{matrix} p (y | θ) = \frac{Γ (\frac{T}{2} + α_{σ})}{Γ (α_{σ})} \cdot \frac{π^{- T / 2} \cdot {(2 β_{σ})}^{α_{σ}}}{{(\prod_{h = 1}^{H} \det (C_{h} (θ)))}^{1 / 2}} \cdot {(2 β_{σ} + Δ^{2} (θ))}^{- (\frac{T}{2} + α_{σ})} \end{matrix}
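
The closed-form marginal likelihood of Eq. (7) can be evaluated on the log scale, as in the following minimal NumPy sketch. The function name `log_marginal_likelihood`, its argument layout (one list entry per segment) and the toy data are assumptions made for illustration; the Gamma prior on $σ^{- 2}$ is taken to be in shape-rate form, which is what the posterior in Eq. (6) suggests.

```python
import numpy as np
from math import lgamma, log, pi


def log_marginal_likelihood(ys, Xs, mus, Sigmas, alpha_sigma, beta_sigma):
    """Log of Eq. (7): p(y | theta) with all w_h and sigma^2 integrated out.

    ys, Xs, mus, Sigmas are lists with one entry per segment h.
    Each X_h is assumed to already contain the leading column of ones.
    """
    T = sum(len(y) for y in ys)
    log_det_sum = 0.0
    delta2 = 0.0
    for y, X, mu, Sigma in zip(ys, Xs, mus, Sigmas):
        C = np.eye(len(y)) + X @ Sigma @ X.T         # C_h(theta)
        resid = y - X @ mu
        _, log_det = np.linalg.slogdet(C)
        log_det_sum += log_det
        delta2 += resid @ np.linalg.solve(C, resid)  # segment term of Delta^2
    return (lgamma(T / 2 + alpha_sigma) - lgamma(alpha_sigma)
            - (T / 2) * log(pi)
            + alpha_sigma * log(2 * beta_sigma)
            - 0.5 * log_det_sum
            - (T / 2 + alpha_sigma) * log(2 * beta_sigma + delta2))


# Toy example (made-up numbers): two segments, intercept plus one covariate.
rng = np.random.default_rng(0)
Xs = [np.column_stack([np.ones(4), rng.normal(size=4)]) for _ in range(2)]
ys = [X @ np.array([0.5, 1.0]) + 0.1 * rng.normal(size=4) for X in Xs]
mus = [np.zeros(2), np.zeros(2)]
Sigmas = [np.eye(2), np.eye(2)]
print(log_marginal_likelihood(ys, Xs, mus, Sigmas, alpha_sigma=0.01, beta_sigma=0.01))
```

Using `slogdet` and `solve` avoids forming explicit determinants and inverses of $C_{h} (θ)$ , which is numerically safer for longer segments.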

When all parameters in $θ$ are fixed, the marginal likelihood of the piece-wise linear regression model can be computed in closed form. In typical models the (hyper-)hyperparameters in $θ$ depend on hyperparameters with their own hyperprior distributions. From now on we will only include the free hyperparameters in $θ$ . In the following subsections we describe four possible model instantiations, namely: the uncoupled model (M1), the coupled model (M2), the newly proposed partially coupled model (M3), and the generalized coupled model (M4). In the forthcoming subsections we will introduce further mathematical symbols. For convenience, Table 1 lists the mathematical symbols that we will use in this paper.

Table 1. List of mathematical symbols

Symbol	Description	Prior distribution
N	Total number of nodes (genes)	–
n	Number of potential parent nodes, here $n = N - 1$	–
h	Data segment h	–
H	Total number of data segments	–
k	Number of covariates in covariate set	–
t	Data point t	–
$σ^{2}$	Noise variance parameter	$σ^{- 2} \sim G A M (α_{σ}, β_{σ})$
$λ_{c}$	Coupling strength parameter, $h > 1$	$λ_{c}^{- 1} \sim G A M (α_{c}, β_{c})$
$λ_{u}$	SNR parameter, $h = 1$	$λ_{u}^{- 1} \sim G A M (α_{u}, β_{u})$
$λ_{h}$	hth coupling strength parameter (M4 model)	$λ_{h}^{- 1} \sim G A M (α_{c}, β_{c})$
$δ_{h}$	hth coupling indicator variable (M3 model)	$δ_{h} \sim B E R (p)$ , $p \sim B E T A (a, b)$
T	Total number of data points	–
$T_{h}$	Number of data points in segment h	–
$D_{i}$	ith data point	–
$Z_{i}$	ith network node	–
$π_{i}$	Parent (covariate) set of ith node, $Z_{i}$	$p (π) = c$ if $| π | \leq 3$ , $p (π) = 0$ if $| π | > 3$
$τ$	Changepoint set	$p (τ) = {(1 - p)}^{(T - 1) - (H - 1)} \cdot p^{H - 1}$
$τ_{h}$	Changepoint h	–
$X_{i}$	ith covariate	–
$X_{h}$	Design matrix of segment h	–
$y_{h}$	Response vector of segment h	$y_{h} | (w_{h}, σ^{2}) \sim N (X_{h} w_{h}, σ^{2} I)$
$w_{h}$	Regression coefficient vector of segment h	$w_{h} | (μ_{h}, Σ_{h}, σ^{2}) \sim N (μ_{h}, σ^{2} Σ_{h})$
${\tilde{w}}_{h - 1}$	Posterior expectation of $w_{h - 1}$	–


Model M1: the uncoupled model

A standard approach, akin to the models of Lèbre et al. [1] and Dondelinger et al. [3], is to set $μ_{h} = 0$ and to assume that the matrices $Σ_{h}$ are diagonal matrices $Σ_{h} = λ_{u} I$ , where the parameter $λ_{u} \in R^{+}$ is shared among segments and assumed to be inverse Gamma distributed, $λ_{u}^{- 1} \sim G A M (α_{u}, β_{u})$ . In the supplementary material we provide a graphical model representation of the uncoupled model (M1). Using the notation of the generic model, we have:

\begin{matrix} θ = \{λ_{u}\}, C_{h} (λ_{u}) = I + λ_{u} X_{h} X_{h}^{T}, Δ^{2} (λ_{u}) : = \sum_{h = 1}^{H} y_{h}^{T} C_{h} {(λ_{u})}^{- 1} y_{h} \end{matrix}

For the posterior distribution of the uncoupled model we have:

\begin{matrix} p (w, σ^{2}, λ_{u} | y) \propto p (σ^{2}) \cdot p (λ_{u}) \cdot \prod_{h = 1}^{H} p (w_{h} | σ^{2}, λ_{u}) \cdot \prod_{h = 1}^{H} p (y_{h} | σ^{2}, w_{h}) \end{matrix}

where $w : = \{w_{1}, \dots, w_{H}\}$ . From Eq. (9) it follows for the full conditional distribution of $λ_{u}$ :

\begin{matrix} p (λ_{u} | y, w, σ^{2}) & \propto p (λ_{u}) \cdot \prod_{h = 1}^{H} p (w_{h} | σ^{2}, λ_{u}) \\ & \propto {(λ_{u}^{- 1})}^{α_{u} + \frac{H \cdot (k + 1)}{2} - 1} \cdot \exp \{- λ_{u}^{- 1} (β_{u} + \frac{1}{2} σ^{- 2} \sum_{h = 1}^{H} w_{h}^{T} w_{h})\} \end{matrix}

and the shape of the latter density implies:

\begin{matrix} λ_{u}^{- 1} | (y, w, σ^{2}) \sim G A M (α_{u} + \frac{H \cdot (k + 1)}{2}, β_{u} + \frac{1}{2} σ^{- 2} \sum_{h = 1}^{H} w_{h}^{T} w_{h}) \end{matrix}

Since the full conditional distribution of $λ_{u}$ depends on $σ^{2}$ and $w$ , those parameters have to be sampled first. From Eq. (6) a value of $σ^{2}$ can be sampled via a collapsed Gibbs-sampling step, with the $w_{h}$ ’s being integrated out. Subsequently, given $σ^{2}$ , Eq. (4) can be used to sample the vectors $w_{h}$ ’s. Finally, for each $λ_{u}$ sampled from Eq. (10) the marginal likelihood, $p (y | λ_{u})$ , can be computed by plugging in the expressions from Eq. (8) into Eq. (7).
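
The sampling order described above can be sketched as one Gibbs sweep for the uncoupled model. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `gibbs_sweep_m1`, the hyperparameter values and the toy data are hypothetical, and the Gamma distributions are assumed to be in shape-rate form (NumPy's `gamma` takes a scale, hence the reciprocals).

```python
import numpy as np


def gibbs_sweep_m1(ys, Xs, lam_u, a_sig, b_sig, a_u, b_u, rng):
    """One sweep for M1: sigma^2 (collapsed), then all w_h, then lambda_u."""
    T = sum(len(y) for y in ys)
    H = len(ys)
    k1 = Xs[0].shape[1]  # k + 1, intercept column included

    # Eq. (6): sigma^{-2} | y, lambda_u, with the w_h integrated out.
    delta2 = 0.0
    for y, X in zip(ys, Xs):
        C = np.eye(len(y)) + lam_u * X @ X.T
        delta2 += y @ np.linalg.solve(C, y)
    sigma2 = 1.0 / rng.gamma(a_sig + T / 2, 1.0 / (b_sig + delta2 / 2))

    # Eq. (4) with mu_h = 0 and Sigma_h = lambda_u * I.
    ws = []
    for y, X in zip(ys, Xs):
        A = np.eye(k1) / lam_u + X.T @ X
        mean = np.linalg.solve(A, X.T @ y)
        cov = sigma2 * np.linalg.inv(A)
        cov = (cov + cov.T) / 2  # guard against round-off asymmetry
        ws.append(rng.multivariate_normal(mean, cov))

    # Eq. (10): lambda_u^{-1} | w, sigma^2.
    ss = sum(w @ w for w in ws)
    lam_inv = rng.gamma(a_u + H * k1 / 2, 1.0 / (b_u + ss / (2 * sigma2)))
    return sigma2, ws, 1.0 / lam_inv


# Toy run with made-up data and hyperparameters.
rng = np.random.default_rng(1)
Xs = [np.column_stack([np.ones(5), rng.normal(size=5)]) for _ in range(2)]
ys = [X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=5) for X in Xs]
sigma2, ws, lam_u = gibbs_sweep_m1(ys, Xs, 1.0, 0.01, 0.01, 2.0, 0.2, rng)
print(sigma2, lam_u)
```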

Model M2: the (fully) coupled model

The (fully) coupled model, proposed by Grzegorczyk and Husmeier [5], uses the posterior expectation of $w_{h - 1}$ as prior expectation for $w_{h}$ . Only the first segment $h = 1$ has an uninformative prior:

\begin{matrix} w_{h} \sim \begin{cases} N (0, σ^{2} λ_{u} I) & \text{if } h = 1 \\ N ({\tilde{w}}_{h - 1}, σ^{2} λ_{c} I) & \text{if } h > 1 \end{cases} \end{matrix}

where ${\tilde{w}}_{h - 1}$ is the posterior expectation of $w_{h - 1}$ (cp. Eq. (4)):

\begin{matrix} {\tilde{w}}_{h - 1} : = \begin{cases} {[Σ_{1}^{- 1} + X_{1}^{T} X_{1}]}^{- 1} (X_{1}^{T} y_{1}) & \text{if } h = 2 \\ {[Σ_{h - 1}^{- 1} + X_{h - 1}^{T} X_{h - 1}]}^{- 1} (λ_{c}^{- 1} {\tilde{w}}_{h - 2} + X_{h - 1}^{T} y_{h - 1}) & \text{if } h > 2 \end{cases} \end{matrix}

The parameter $λ_{c}$ has been called the ‘coupling parameter’ and it has been assumed that it has an inverse Gamma prior distribution, $λ_{c}^{- 1} \sim G A M (α_{c}, β_{c})$ . Using the notation from the generic model (see Fig. 1), we note that Eq. (11) corresponds to:

\begin{matrix} μ_{h} = \begin{cases} 0 & \text{if } h = 1 \\ {\tilde{w}}_{h - 1} & \text{if } h > 1 \end{cases} , Σ_{h} = \begin{cases} λ_{u} I & \text{if } h = 1 \\ λ_{c} I & \text{if } h > 1 \end{cases} , \\ C_{h} (θ) = \begin{cases} I + λ_{u} X_{h} X_{h}^{T} & \text{if } h = 1 \\ I + λ_{c} X_{h} X_{h}^{T} & \text{if } h > 1 \end{cases} , θ = \{λ_{u}, λ_{c}\}, Δ^{2} (θ) = \sum_{h = 1}^{H} {(y_{h} - X_{h} {\tilde{w}}_{h - 1})}^{T} C_{h} {(θ)}^{- 1} (y_{h} - X_{h} {\tilde{w}}_{h - 1}) \end{matrix}

with ${\tilde{w}}_{0} : = 0$ , $λ_{u}^{- 1} \sim G A M (α_{u}, β_{u})$ and $λ_{c}^{- 1} \sim G A M (α_{c}, β_{c})$ . As ${\tilde{w}}_{h - 1}$ is treated like a fixed hyperparameter when used as input for segment h, we exclude the parameters ${\tilde{w}}_{1}, \dots, {\tilde{w}}_{H - 1}$ from $θ$ .

In the supplementary material we provide a graphical model representation of the coupled M2 model. For the posterior we have:

\begin{matrix} p (w, σ^{2}, λ_{u}, λ_{c} | y) \propto p (σ^{2}) \cdot p (λ_{u}) \cdot p (λ_{c}) \cdot p (w_{1} | σ^{2}, λ_{u}) \cdot \\ \prod_{h = 2}^{H} p (w_{h} | σ^{2}, λ_{c}) \cdot \prod_{h = 1}^{H} p (y_{h} | σ^{2}, w_{h}) \end{matrix}

In analogy to the derivations in the previous subsection one can derive (cp. [5]):

\begin{matrix} λ_{u}^{- 1} | (y, w, σ^{2}, λ_{c}) \sim G A M (α_{u} + \frac{1 \cdot (k + 1)}{2}, β_{u} + \frac{1}{2} σ^{- 2} D_{u}^{2}) \end{matrix}

\begin{matrix} λ_{c}^{- 1} | (y, w, σ^{2}, λ_{u}) \sim G A M (α_{c} + \frac{(H - 1) \cdot (k + 1)}{2}, β_{c} + \frac{1}{2} σ^{- 2} D_{c}^{2}) \end{matrix}

where $D_{u}^{2} : = w_{1}^{T} w_{1}$ and $D_{c}^{2} : = \sum_{h = 2}^{H} {(w_{h} - {\tilde{w}}_{h - 1})}^{T} (w_{h} - {\tilde{w}}_{h - 1})$ .

For each $θ = \{λ_{u}, λ_{c}\}$ the marginal likelihood, $p (y | λ_{u}, λ_{c})$ , can be computed by plugging the expressions $C_{h} (θ)$ and $Δ^{2} (θ)$ into Eq. (7).
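
The recursion for the posterior expectations ${\tilde{w}}_{h - 1}$ that serve as prior means in the coupled model can be sketched as follows. The function name `coupled_prior_means` and the toy numbers are hypothetical; the sketch assumes $Σ_{1} = λ_{u} I$ and $Σ_{h} = λ_{c} I$ for $h > 1$ , as stated above.

```python
import numpy as np


def coupled_prior_means(ys, Xs, lam_u, lam_c):
    """Return [w~_0, w~_1, ..., w~_{H-1}], i.e. the prior means mu_1..mu_H."""
    k1 = Xs[0].shape[1]
    w_tilde = [np.zeros(k1)]  # w~_0 := 0
    # Segment h contributes w~_h, which is the prior mean of segment h + 1.
    for h, (y, X) in enumerate(zip(ys[:-1], Xs[:-1]), start=1):
        lam = lam_u if h == 1 else lam_c       # Sigma_h = lam * I
        A = np.eye(k1) / lam + X.T @ X
        b = w_tilde[-1] / lam + X.T @ y        # Sigma_h^{-1} mu_h + X_h^T y_h
        w_tilde.append(np.linalg.solve(A, b))
    return w_tilde


# Toy example (made-up values): three segments with identity design matrices.
Xs = [np.eye(2), np.eye(2), np.eye(2)]
ys = [np.array([2.0, 4.0]), np.array([1.0, 0.0]), np.array([0.0, 0.0])]
for w in coupled_prior_means(ys, Xs, lam_u=1.0, lam_c=1.0):
    print(w)
```

Since each ${\tilde{w}}_{h}$ depends on all earlier segments, the prior means have to be recomputed from the first affected segment onwards whenever $λ_{u}$ , $λ_{c}$ or the segmentation changes.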

Model M3: the new partially segment-wise coupled model

We propose a new ‘consensus’ model between the M1 and the M2 model. The new model (M3) allows each segment $h > 1$ either to couple to or to uncouple from the preceding segment $h - 1$ . We use an uninformative prior for the first segment $h = 1$ , and for all segments $h > 1$ we introduce a binary variable $δ_{h}$ which indicates whether segment h is coupled to ( $δ_{h} = 1$ ) or uncoupled from ( $δ_{h} = 0$ ) the preceding segment $h - 1$ :

\begin{matrix} w_{h} \sim \begin{cases} N (0, σ^{2} λ_{u} I) & \text{if } h = 1 \\ N (δ_{h} \cdot {\tilde{w}}_{h - 1}, σ^{2} λ_{c}^{δ_{h}} λ_{u}^{1 - δ_{h}} I) & \text{if } h > 1 \end{cases} \end{matrix}

where ${\tilde{w}}_{h - 1}$ is the posterior expectation of $w_{h - 1}$ . The new priors from Eq. (15) yield for $h \geq 2$ the following posterior expectations (cp. Eq. (4)):

\begin{matrix} {\tilde{w}}_{h - 1} = {(λ_{c}^{- δ_{h - 1}} λ_{u}^{- (1 - δ_{h - 1})} I + X_{h - 1}^{T} X_{h - 1})}^{- 1} (δ_{h - 1} λ_{c}^{- 1} {\tilde{w}}_{h - 2} + X_{h - 1}^{T} y_{h - 1}) \end{matrix}

With ${\tilde{w}}_{0} : = 0$ and $δ_{1} : = 0$ , we have in the generic model notation:

\begin{matrix} μ_{h} = δ_{h} {\tilde{w}}_{h - 1}, Σ_{h} = λ_{c}^{δ_{h}} λ_{u}^{1 - δ_{h}} I, θ = \{λ_{u}, λ_{c}, {\{δ_{h}\}}_{h \geq 2}\}, C_{h} (θ) = I + λ_{c}^{δ_{h}} λ_{u}^{1 - δ_{h}} X_{h} X_{h}^{T} \end{matrix}

We assume the binary variables $δ_{2}, \dots, δ_{H}$ to have Bernoulli prior distributions, $δ_{h} \sim B E R (p)$ , with a common hyperparameter $p \in [0, 1]$ having a Beta hyperprior distribution, $p \sim B E T A (a, b)$ . We note that

$δ_{h} = 0$ for all $h \geq 2$ gives model M1 with $P (w_{h}) = N (0, λ_{u} σ^{2} I)$ for all h,
$δ_{h} = 1$ for all $h \geq 2$ gives model M2 with $P (w_{h}) = N ({\tilde{w}}_{h - 1}, λ_{c} σ^{2} I)$ for $h \geq 2$ .

The new partially segment-wise coupled model infers the variables $δ_{h}$ ( $h \geq 2$ ) from the data. It searches for the best trade-off between the models M1 and M2.

A graphical model representation of the partially coupled model is shown in Fig. 2. For $δ_{h} \sim B E R (p)$ with $p \sim B E T A (a, b)$ the joint marginal density of $\{δ_{h}\}_{h \geq 2}$ is:

\begin{matrix} p ({\{δ_{h}\}}_{h \geq 2}) = \int p (p) \prod_{h = 2}^{H} p (δ_{h} | p) d p = \frac{Γ (a + b)}{Γ (a) Γ (b)} \cdot \frac{Γ (a + \sum_{h = 2}^{H} δ_{h}) Γ (b + \sum_{h = 2}^{H} (1 - δ_{h}))}{Γ (a + b + (H - 1))} \end{matrix}
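
The Beta-Bernoulli marginal above only depends on the number of coupled and uncoupled segments, so it can be evaluated with log-Gamma functions. The following sketch uses a hypothetical function name, `log_delta_prior`, and arbitrary example values for a and b.

```python
from math import lgamma, exp


def log_delta_prior(deltas, a, b):
    """Log of the marginal p({delta_h}_{h>=2}) with p integrated out.

    deltas holds delta_2, ..., delta_H (each 0 or 1).
    """
    n = len(deltas)   # H - 1
    n1 = sum(deltas)  # number of coupled segments among h >= 2
    n0 = n - n1       # number of uncoupled segments among h >= 2
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + lgamma(a + n1) + lgamma(b + n0) - lgamma(a + b + n))


# With a = b = 1 every count of coupled segments is equally likely a priori.
for deltas in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(deltas, exp(log_delta_prior(deltas, a=1.0, b=1.0)))
```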

For the posterior distribution of the partially segment-wise coupled model we get:

\begin{matrix} p (w, σ^{2}, λ_{u}, λ_{c}, \{δ_{h}\}_{h \geq 2} | y) \propto & p (σ^{2}) \cdot p (λ_{u}) \cdot p (λ_{c}) \cdot p (\{δ_{h}\}_{h \geq 2}) \cdot p (w_{1} | σ^{2}, λ_{u}) \\ & \cdot \prod_{h = 2}^{H} p (w_{h} | σ^{2}, λ_{u}, λ_{c}, δ_{h}) \cdot \prod_{h = 1}^{H} p (y_{h} | σ^{2}, w_{h}) \end{matrix}

For the full conditional distributions of $λ_{u}$ and $λ_{c}$ we have:

\begin{matrix} p (λ_{u} | y, w, σ^{2}, λ_{c}, \{δ_{h}\}_{h \geq 2}) \propto & p (λ_{u}) \cdot \prod_{h : δ_{h} = 0} p (w_{h} | σ^{2}, λ_{u}) \\ p (λ_{c} | y, w, σ^{2}, λ_{u}, \{δ_{h}\}_{h \geq 2}) \propto & p (λ_{c}) \cdot \prod_{h : δ_{h} = 1} p (w_{h} | σ^{2}, λ_{c}) \end{matrix}

where $δ_{1} : = 0$ is fixed. From the shapes of the densities it follows:

\begin{matrix} λ_{u}^{- 1} | (y, w, σ^{2}, λ_{c}, \{δ_{h}\}_{h \geq 2}) \sim & G A M (α_{u} + \frac{H_{u} \cdot (k + 1)}{2}, β_{u} + \frac{1}{2} σ^{- 2} D_{u}^{2}) \\ λ_{c}^{- 1} | (y, w, σ^{2}, λ_{u}, \{δ_{h}\}_{h \geq 2}) \sim & G A M (α_{c} + \frac{H_{c} \cdot (k + 1)}{2}, β_{c} + \frac{1}{2} σ^{- 2} D_{c}^{2}) \end{matrix}

where $H_{c} = \sum_{h} δ_{h}$ is the number of coupled segments, $H_{u} = \sum_{h} (1 - δ_{h})$ is the number of uncoupled segments, so that $H_{c} + H_{u} = H$ , and

\begin{matrix} D_{u}^{2} : = \sum_{h : δ_{h} = 0} w_{h}^{T} w_{h}, D_{c}^{2} : = \sum_{h : δ_{h} = 1} {(w_{h} - {\tilde{w}}_{h - 1})}^{T} (w_{h} - {\tilde{w}}_{h - 1}) \end{matrix}

For each parameter instantiation $θ = \{λ_{u}, λ_{c}, \{δ_{h}\}_{h \geq 2}\}$ the marginal likelihood, $p (y | θ)$ , can be computed with Eq. (7), where $C_{h} (θ)$ was defined above, and

\begin{matrix} Δ^{2} (θ) = \sum_{h = 1}^{H} {(y_{h} - δ_{h} X_{h} {\tilde{w}}_{h - 1})}^{T} {[I + λ_{c}^{δ_{h}} λ_{u}^{1 - δ_{h}} X_{h} X_{h}^{T}]}^{- 1} (y_{h} - δ_{h} X_{h} {\tilde{w}}_{h - 1}) \end{matrix}

We have for each binary variable $δ_{k}$ ( $k = 2, \dots, H$ ):

\begin{matrix} p (δ_{k} = 1 | λ_{u}, λ_{c}, \{δ_{h}\}_{h \neq k}, y) \propto p (y | λ_{u}, λ_{c}, \{δ_{h}\}_{h \neq k}, δ_{k} = 1) \cdot p (\{δ_{h}\}_{h \neq k}, δ_{k} = 1) \end{matrix}

so that its full conditional distribution is:

\begin{matrix} δ_{k} | (λ_{u}, λ_{c}, \{δ_{h}\}_{h \neq k}, y) \sim B E R (\frac{p (y | λ_{u}, λ_{c}, \{δ_{h}\}_{h \neq k}, δ_{k} = 1) \cdot p (\{δ_{h}\}_{h \neq k}, δ_{k} = 1)}{\sum_{j = 0}^{1} p (y | λ_{u}, λ_{c}, \{δ_{h}\}_{h \neq k}, δ_{k} = j) \cdot p (\{δ_{h}\}_{h \neq k}, δ_{k} = j)}) \end{matrix}

Each $δ_{k}$ ( $k > 1$ ) can therefore be sampled with a collapsed Gibbs sampling step, where $\{w_{h}\}$ , $σ^{2}$ and $p$ have been integrated out.
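
A collapsed Gibbs step of this kind can be sketched generically in Python. The function name `sample_delta_k` and the two callables `log_marglik` and `log_prior` are hypothetical placeholders: they stand for a routine that evaluates the log marginal likelihood of Eq. (7) for a given configuration of the indicators and for the log of the joint marginal prior of the indicators, respectively.

```python
import math
import random


def sample_delta_k(k, deltas, log_marglik, log_prior, rng=random):
    """Draw delta_k from its full conditional (collapsed Gibbs step).

    deltas: dict {h: delta_h} for h = 2..H (current state, left unmodified).
    log_marglik(deltas) -> log p(y | lambda_u, lambda_c, deltas)
    log_prior(deltas)   -> log p(deltas)
    Returns (new_delta_k, probability_of_coupling).
    """
    log_post = []
    for j in (0, 1):
        cand = dict(deltas)
        cand[k] = j
        log_post.append(log_marglik(cand) + log_prior(cand))
    # Normalise on the log scale to avoid numerical underflow.
    m = max(log_post)
    w0, w1 = (math.exp(v - m) for v in log_post)
    prob_coupled = w1 / (w0 + w1)
    return int(rng.random() < prob_coupled), prob_coupled


# Toy stand-ins (made up): coupling segment 2 triples the marginal likelihood.
toy_marglik = lambda d: math.log(3.0) if d[2] == 1 else 0.0
toy_prior = lambda d: 0.0
print(sample_delta_k(2, {2: 0, 3: 1}, toy_marglik, toy_prior, random.Random(0)))
```

Changing $δ_{k}$ alters the prior mean of segment k and hence the posterior expectations of all later segments, so `log_marglik` has to re-evaluate the full product over segments rather than a single segment-specific term.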

Fig. 2 — Graphical representation of the new partially coupled model (M3). Parameters that have to be inferred are represented by white circles. The data and the fixed hyperparameters are represented by grey circles. The two rectangles indicate definitions, which depend on the parent nodes. Circles and definitions within the plate are segment-specific. For each segment the model infers if the prior for $w_{h}$ is coupled to ( $δ_{h} = 1$ ) or uncoupled from ( $δ_{h} = 0$ ) the preceding segment $h - 1$

Model M4: the generalised (fully) coupled model

In [6] we proposed to generalise the (fully) coupled model (i.e. the M2 model) by introducing a segment-specific coupling parameter $λ_{h}$ for each segment $h > 1$ . This yields:

\begin{matrix} w_{h} \sim \begin{cases} N (0, σ^{2} λ_{u} I) & \text{if } h = 1 \\ N ({\tilde{w}}_{h - 1}, σ^{2} λ_{h} I) & \text{if } h > 1 \end{cases} \end{matrix}

where ${\tilde{w}}_{h - 1}$ is the posterior expectation of $w_{h - 1}$ . For the parameters $λ_{h}$ we have assumed that they are inverse Gamma distributed, $λ_{h}^{- 1} \sim G A M (α_{c}, β_{c})$ , with hyperparameters $α_{c}$ and $β_{c}$ . In the supplementary material we provide a graphical model representation of the M4 model. Recalling the generic notation and setting ${\tilde{w}}_{0} : = 0$ and $λ_{1} : = λ_{u}$ , Eq. (17) gives:

\begin{matrix} μ_{h} = {\tilde{w}}_{h - 1}, Σ_{h} = λ_{h} I, C_{h} (θ) = I + λ_{h} X_{h} X_{h}^{T}, θ = \{λ_{u}, \{λ_{h}\}_{h \geq 2}\}, \\ \text{and } Δ^{2} (θ) = \sum_{h = 1}^{H} {(y_{h} - X_{h} {\tilde{w}}_{h - 1})}^{T} C_{h} {(θ)}^{- 1} (y_{h} - X_{h} {\tilde{w}}_{h - 1}) \end{matrix}

For the posterior we have:

\begin{matrix} p (w, σ^{2}, λ_{u}, \{λ_{h}\}_{h \geq 2} | y) \propto & p (σ^{2}) \cdot p (λ_{u}) \cdot (\prod_{h = 2}^{H} p (λ_{h})) \\ & \cdot p (w_{1} | σ^{2}, λ_{u}) \cdot \prod_{h = 2}^{H} p (w_{h} | σ^{2}, λ_{h}) \cdot \prod_{h = 1}^{H} p (y_{h} | σ^{2}, w_{h}) \end{matrix}

For $k = 2, \dots, H$ it follows:

\begin{matrix} λ_{k}^{- 1} | (y, w, σ^{2}, λ_{u}, \{λ_{h}\}_{h \neq k}) \sim & G A M (α_{c} + \frac{(k + 1)}{2}, β_{c} + \frac{1}{2} σ^{- 2} D_{k}^{2}) \\ \text{and } λ_{u}^{- 1} | (y, w, σ^{2}, \{λ_{h}\}_{h \geq 2}) \sim & G A M (α_{u} + \frac{(k + 1)}{2}, β_{u} + \frac{1}{2} σ^{- 2} D_{u}^{2}) \end{matrix}

where $D_{u}^{2} : = w_{1}^{T} w_{1}$ and $D_{k}^{2} : = {(w_{k} - {\tilde{w}}_{k - 1})}^{T} (w_{k} - {\tilde{w}}_{k - 1})$ .

For each $θ = \{λ_{u}, \{λ_{h}\}_{h \geq 2}\}$ the marginal likelihood, $p (y | λ_{u}, \{λ_{h}\}_{h \geq 2})$ , can be computed with Eq. (7), using the expressions $C_{h} (θ)$ and $Δ^{2} (θ)$ defined above.

Unlike the proposed partially coupled M3 model, the generalized coupled M4 model does not feature any mechanism to uncouple neighbouring segments. Like the fully coupled M2 model, the M4 model has been designed such that it has to couple all neighbouring segments. The only advantage over the M2 model is that the M4 model introduces segment-specific coupling parameters, so that the coupling strength(s) can vary over time.

Reversible jump Markov chain Monte Carlo inference

We use Reversible Jump Markov Chain Monte Carlo simulations to generate posterior samples $\{π^{(w)}, τ^{(w)}, θ^{(w)}\}_{w = 1, \dots, W}$ . In each iteration we re-sample the parameters in $θ$ from their full conditional distributions (Gibbs sampling), and we perform two Metropolis-Hastings moves: one on the covariate set $π$ and one on the changepoint set $τ$ . For the four models (M1–M4) Eq. (1) takes the form:

\begin{matrix} p (π, τ, θ | y) \propto \begin{cases} p (π) p (τ) p (λ_{u}) \cdot p (y | π, τ, λ_{u}) & \text{M1} \\ p (π) p (τ) p (λ_{u}) \cdot p (λ_{c}) \cdot p (y | π, τ, λ_{u}, λ_{c}) & \text{M2} \\ p (π) p (τ) p (λ_{u}) \cdot p (λ_{c}) \cdot p (\{δ_{h}\}_{h \geq 2}) \cdot p (y | π, τ, λ_{u}, λ_{c}, \{δ_{h}\}_{h \geq 2}) & \text{M3} \\ p (π) p (τ) p (λ_{u}) \cdot (\prod_{h = 2}^{H} p (λ_{h})) \cdot p (y | π, τ, λ_{u}, \{λ_{h}\}_{h \geq 2}) & \text{M4} \end{cases} \end{matrix}

All likelihood terms, $p (y | \dots)$, are marginalized over $σ^{2}$ and $\{w_{h}\}$, and for the new M3 model the Bernoulli parameter $p$ has also been integrated out.

For the models M1–M2 the dimension of $θ$ does not depend on $τ$, while for the models M3–M4 it does. The M3 model has a discrete parameter $δ_{h} \in \{0, 1\}$ and the M4 model has a continuous parameter $λ_{h} \in \mathbb{R}^{+}$ for each segment $h > 1$.

The model-specific full conditional distributions for the Gibbs sampling steps have been provided above. For sampling $π$ we implement 3 moves: covariate ‘removal (R)’, ‘addition (A)’, and ‘exchange (E)’. Each move proposes to replace $π$ by a new covariate set $π^{*}$ having one covariate more (A) or less (R) or exchanged (E). When randomly selecting the move type and the involved covariate(s), we get for all models the acceptance probability:

\begin{matrix} A (π \to π^{*}) = \min \left\{1, \frac{p (y | π^{*}, \dots)}{p (y | π, \dots)} \cdot \frac{p (π^{*})}{p (π)} \cdot HR_{π}\right\} \\ \text{with the Hastings ratios:} \quad HR_{π, R} = \frac{| π |}{n - | π^{*} |}, \quad HR_{π, A} = \frac{n - | π |}{| π^{*} |}, \quad HR_{π, E} = 1 \end{matrix}
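The acceptance rule for the covariate moves can be sketched in a few lines of Python. The snippet below is an illustration rather than the authors' implementation: the function names are hypothetical, the marginal likelihoods and priors are assumed to be supplied by the caller on the log scale, and the computation is done in log space purely for numerical stability.

```python
import math

def hastings_ratio_cov(move, size_old, size_new, n):
    """Hastings ratio for a covariate move; n is the number of candidate covariates."""
    if move == "R":                       # removal: |pi| / (n - |pi*|)
        return size_old / (n - size_new)
    if move == "A":                       # addition: (n - |pi|) / |pi*|
        return (n - size_old) / size_new
    if move == "E":                       # exchange: symmetric proposal
        return 1.0
    raise ValueError("move must be 'R', 'A' or 'E'")

def acceptance_prob(log_lik_new, log_lik_old, log_prior_new, log_prior_old, hr):
    """min{1, likelihood ratio * prior ratio * Hastings ratio}, evaluated in log space."""
    log_ratio = (log_lik_new - log_lik_old) + (log_prior_new - log_prior_old) + math.log(hr)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
```

A move would then be accepted if a uniform random number on $[0, 1]$ falls below the returned probability. Note that the Hastings ratio of an addition and that of the removal which undoes it multiply to one.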

For sampling $τ$ we also implement 3 move types: changepoint ‘birth (B)’, ‘death (D)’, and ‘re-allocation (R)’ moves. Each move proposes to replace $τ$ by a new changepoint set $τ^{*}$ having one changepoint added (B) or deleted (D) or re-allocated (R). When randomly selecting the move type, the involved changepoint and the new changepoint location, we get for M1 and M2:

\begin{matrix} A (τ \to τ^{*}) = \min \left\{1, \frac{p (y | τ^{*}, \dots)}{p (y | τ, \dots)} \cdot \frac{p (τ^{*})}{p (τ)} \cdot HR_{τ}\right\} \\ \text{where} \quad HR_{τ, B} = \frac{T - 1 - | τ |}{| τ^{*} |}, \quad HR_{τ, D} = \frac{| τ |}{T - 1 - | τ^{*} |}, \quad HR_{τ, R} = 1 \end{matrix}

For the model M3 (proposed here) and the model M4 from [6], the changepoint moves also affect the numbers of parameters in $\{δ_{h}\}_{h \geq 2}$ and $\{λ_{h}\}_{h \geq 2}$, respectively. For all segments that stay identical we keep the parameters unchanged. For all new segments we re-sample the corresponding parameters. For the new model M3 we flip coins to get candidates for the involved $δ_{h}$’s. This yields:

\begin{matrix} A ([τ, \{δ_{h}\}] \to [τ^{*}, \{δ_{h}\}^{*}]) = \min \left\{1, \frac{p (y | τ^{*}, \{δ_{h}\}^{*}, \dots)}{p (y | τ, \{δ_{h}\}, \dots)} \cdot \frac{p (τ^{*})}{p (τ)} \cdot \frac{p (\{δ_{h}\}^{*})}{p (\{δ_{h}\})} \cdot HR_{τ} \cdot c_{τ}\right\} \end{matrix}

where $c_{τ, B} = 2$ for birth, $c_{τ, D} = 1 / 2$ for death, and $c_{τ, R} = 1$ for re-allocation moves. For the model M4 we follow [6] and re-sample the involved $λ_{h}$ ’s from their priors $p (λ_{h})$ . We obtain:

\begin{matrix} A ([τ, \{λ_{h}\}] \to [τ^{*}, \{λ_{h}\}^{*}]) = \min \left\{1, \frac{p (y | τ^{*}, \{λ_{h}\}^{*}, \dots)}{p (y | τ, \{λ_{h}\}, \dots)} \cdot \frac{p (τ^{*})}{p (τ)} \cdot HR_{τ}\right\} \end{matrix}

Note that the additional factor $c_{τ} : = \frac{p (\{λ_{h}\})}{p (\{λ_{h}\}^{*})}$ of the Hastings ratio has been cancelled against the prior ratio $\frac{p (\{λ_{h}\}^{*})}{p (\{λ_{h}\})}$.
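The proposal-related factors of the changepoint moves can be collected in one small helper. The Python sketch below is illustrative only; the function name and the string labels for moves and models are hypothetical. It returns $HR_{τ} \cdot c_{τ}$ for model M3 and $HR_{τ}$ alone for the other models, where for M4 the extra factor is omitted because it cancels against the prior ratio of the $λ_{h}$’s, as noted above.

```python
def changepoint_move_factor(move, n_old, n_new, T, model):
    """Product of the Hastings ratio HR_tau and the extra factor c_tau.

    n_old = |tau|, n_new = |tau*|, T = number of time points.
    """
    if move == "B":                       # birth: one changepoint added
        hr, c = (T - 1 - n_old) / n_new, 2.0
    elif move == "D":                     # death: one changepoint deleted
        hr, c = n_old / (T - 1 - n_new), 0.5
    elif move == "R":                     # re-allocation: symmetric proposal
        hr, c = 1.0, 1.0
    else:
        raise ValueError("move must be 'B', 'D' or 'R'")
    if model != "M3":                     # c_tau only enters the M3 acceptance ratio
        c = 1.0
    return hr * c
```

The returned value would be multiplied by the likelihood ratio and the prior ratios before taking the minimum with 1. As a consistency check, the factor of a birth move and that of the death move which reverses it multiply to one.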

Edge scores and areas under precision recall curves (AUC)

For a network with N variables $Z_{1}, \dots, Z_{N}$ we infer N separate regression models. For each $Z_{i}$ we get a sample $\{π_{i}^{(w)}, τ_{i}^{(w)}, θ_{i}^{(w)}\}_{w = 1, \dots, W}$ from the $i$th posterior. From the covariate sets we form a sample of graphs $\{G^{(w)}\}_{w = 1, \dots, W}$ with $G^{(w)} = \{π_{1}^{(w)}, \dots, π_{N}^{(w)}\}$. For each edge $Z_{i} \to Z_{j}$ the edge posterior probability (edge score) is:

\begin{matrix} {\hat{e}}_{i, j} = \frac{1}{W} \sum_{w = 1}^{W} I_{i \to j} (G^{(w)}) \quad \text{where} \quad I_{i \to j} (G^{(w)}) = \left\{ \begin{matrix} 1 & \text{if } Z_{i} \in π_{j}^{(w)} \\ 0 & \text{if } Z_{i} \notin π_{j}^{(w)} \end{matrix} \right. \end{matrix}

If the true network is known and has M edges, we can quantify the network reconstruction accuracy. For each threshold $ξ \in [0, 1]$ we extract the $n_{ξ}$ edges whose scores ${\hat{e}}_{i, j}$ exceed $ξ$, and we count the number of true positives $T_{ξ}$ among them. Plotting the precisions $P_{ξ} : = T_{ξ} / n_{ξ}$ against the recalls $R_{ξ} : = T_{ξ} / M$ gives the precision-recall curve. We refer to the area under this curve as the AUC value.
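The edge scores and the precision-recall curve can be sketched as follows. This Python snippet is a simplified illustration with hypothetical function names, and it makes three assumptions that the text does not specify: self-loops are skipped, thresholds are placed just below each distinct score value (so that "exceeds the threshold" becomes "score at least as large as that value"), and the area is approximated with the trapezoidal rule starting from recall 0.

```python
def edge_scores(parent_set_samples, n_nodes):
    """Edge score = fraction of posterior samples in which i is a parent of j.

    parent_set_samples[w][j] is the sampled parent set of node j in sample w.
    """
    W = len(parent_set_samples)
    return {(i, j): sum(i in sample[j] for sample in parent_set_samples) / W
            for i in range(n_nodes) for j in range(n_nodes) if i != j}

def pr_curve(scores, true_edges):
    """List of (recall, precision) points, one per distinct score threshold."""
    M = len(true_edges)
    points = []
    for s in sorted(set(scores.values()), reverse=True):
        selected = [e for e, v in scores.items() if v >= s]
        tp = sum(e in true_edges for e in selected)   # true positives T_xi
        points.append((tp / M, tp / len(selected)))   # (R_xi, P_xi)
    return points

def area_under_curve(points):
    """Trapezoidal area under the precision-recall points (an assumed convention)."""
    area = 0.0
    prev_r, prev_p = 0.0, points[0][1]
    for r, p in points:
        area += (r - prev_r) * (p + prev_p) / 2.0
        prev_r, prev_p = r, p
    return area
```

Other interpolation schemes for precision-recall curves exist and can give slightly different AUC values, so the numbers produced by this sketch need not match those reported here.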

Hyperparameter settings and simulation details

The hyperparameters of the priors and hyperpriors of the four NH-DBN models (M1–M4) have to be specified in advance, and we note that the hyperparameter setting can have an effect on the resulting posterior distributions and so on the network reconstruction results. Selecting appropriate hyperparameters is therefore a crucial task. In the absence of genuine prior knowledge (e.g. from experts or from the literature), we re-use the rather uninformative (and thus generic) parameter settings from earlier publications. Re-using those hyperparameters also has the advantage that our empirical results can be compared with earlier reported results. More specifically, we proceed as follows:

For the models M1, M2 and M4 we re-use the hyperparameters from the earlier works by Lèbre et al. [1], Grzegorczyk and Husmeier [5], and Shafiee Kamalabad and Grzegorczyk [6]: $σ^{- 2} \sim G A M (α_{σ} = ν, β_{σ} = ν)$ with $ν = 0.005$ , $λ_{u}^{- 1} \sim G A M (α_{u} = 2, β_{u} = 0.2)$ , and $λ_{c}^{- 1} \sim G A M (α_{c} = 3, β_{c} = 3)$ . For the new partially coupled model M3 we use the same setting with the extension: $δ_{h} \sim B E R (p)$ with $p \sim B E T A (a = 1, b = 1)$ , which seems to be a very natural choice. For the M3 model we also tested several alternative hyperparameter settings, but we did not observe significantly deviating results, indicating that the M3 model is rather robust with respect to the hyperparameter settings. For more thorough studies on how the hyperparameter setting affects the network reconstruction results, we refer to the work by Grzegorczyk and Husmeier [5].

For all models M1–M4 we run each reversible jump Markov chain Monte Carlo simulation for $V = 100{,}000$ iterations. Setting the burn-in phase to 0.5V (50%) and thinning out by the factor 10 during the sampling phase yields $W = 0.5 V / 10 = 5000$ samples from each posterior. To check for convergence, we compared the samples of independent simulations, using standard trace plot diagnostics as well as scatter plots of the estimated edge scores. For most of the data sets analysed here, the diagnostics indicated almost perfect convergence already after $V = 10{,}000$ iterations; see Fig. 7a for an example.
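The sample count follows directly from the burn-in and thinning settings. As a small illustration, the Python sketch below lists which iterations would be retained; the function name is hypothetical, and the indexing convention (iterations numbered from 1, every 10th iteration kept after the burn-in) is an assumption.

```python
def retained_iterations(V, burn_in_fraction=0.5, thin=10):
    """Iteration indices (1-based) kept after burn-in and thinning."""
    start = int(V * burn_in_fraction)          # last burn-in iteration
    return list(range(start + thin, V + 1, thin))

kept = retained_iterations(100000)
print(len(kept))  # number of posterior samples W
```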

Fig. 7 — Analysis of the real yeast data. (a) For each run length $V \in \{100, 1000, 10{,}000, 100{,}000\}$ we performed 15 RJMCMC simulations with the partially coupled model (M3). We used the hyperparameter $p = 0.05$ for the changepoint prior. For each V there is a scatter plot where the simulation-specific edge scores (vertical axis) are plotted against the average scores for that V (horizontal axis). (b) We implemented the models M1–M4 with different hyperparameters p of the geometric distribution for the distance between changepoints. For each p the bars show the model-specific average AUC scores. The error bars indicate standard deviations

Data

Synthetic network data

For model comparisons we generated various synthetic network data sets. We report here on two studies with realistic network topologies, shown in Figs. 3 and 4. In both studies we assumed the data segmentation to be known. Hence, we kept the changepoints in $τ$ fixed at their true locations and did not perform reversible jump Markov chain Monte Carlo moves on $τ$.

Fig. 3 — Yeast networks. Left: the true yeast network with $N = 5$ nodes and $M = 8$ edges. Right: yeast network prediction obtained with model M3. The grey (dotted) edges correspond to false positives (negatives)

Study 1 For the RAF pathway with $N = 11$ nodes and $M = 20$ edges, shown in Fig. 4 and taken from Sachs et al. [16], we generated data with $H = 4$ segments having $m = 10$ data points each. For each node $Z_{i}$ and its parent nodes in $π_{i}$ we sampled the regression coefficients for $h = 1$ from standard Gaussian distributions and collected them in a vector $w_{1}^{i}$, which we normalised to Euclidean norm 1, $w_{1}^{i} \leftarrow w_{1}^{i} / \| w_{1}^{i} \|$. For the segments $h = 2, 3, 4$ we used either $w_{h}^{i} = w_{h - 1}^{i}$ ($δ_{h} = 1$, coupled) or $w_{h}^{i} = - w_{h - 1}^{i}$ ($δ_{h} = 0$, uncoupled). The design matrices $X_{h}^{i}$ contain a first column of 1’s for the intercept and the segment-specific values of the parent nodes, shifted by one time point. To the segment-specific values of $Z_{i}$, $z_{h}^{i} = X_{h}^{i} w_{h}^{i}$, we added Gaussian noise element-wise with standard deviation $σ = 0.05$. For each of the 8 coupling scenarios $(δ_{2}, δ_{3}, δ_{4}) \in \{0, 1\}^{3}$ we generated 25 data sets having different regression coefficients.
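The coefficient scheme of Study 1 can be sketched in Python as follows. This is a minimal illustration for a single node, not the authors' generator: the function names are hypothetical, the coefficient vector is assumed to include the intercept (hence `n_parents + 1` entries), and the construction of the time-lagged design matrices is left out.

```python
import math
import random

def unit_gaussian_vector(dim, rng):
    """Standard Gaussian vector rescaled to Euclidean norm 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def study1_coefficients(n_parents, deltas, rng):
    """Segment-wise coefficients w_1, ..., w_4 for deltas = (delta_2, delta_3, delta_4)."""
    w = [unit_gaussian_vector(n_parents + 1, rng)]   # +1 for the intercept (assumption)
    for delta in deltas:
        prev = w[-1]
        # coupled: keep the previous vector; uncoupled: flip its sign
        w.append(list(prev) if delta == 1 else [-x for x in prev])
    return w

def noisy_response(X, w, sigma, rng):
    """z = X w plus element-wise Gaussian noise with standard deviation sigma."""
    return [sum(x * c for x, c in zip(row, w)) + rng.gauss(0.0, sigma) for row in X]
```

For example, `study1_coefficients(2, (1, 0, 1), random.Random(1))` would produce the vectors for the scenario ‘0101’ of a node with two parents.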

Study 2 This study is similar to the first one with three changes: (i) We used the yeast network with $N = 5$ nodes and $M = 8$ edges, shown in the left panel of Fig. 3 and taken from Cantone et al. [17]. (ii) Again we generated data with $H = 4$ segments, but we varied the number of time points per segment $m \in \{2, 3, \dots, 12\}$. (iii) We focused on one scenario: For each node $Z_{i}$ and its parent nodes in $π_{i}$ we generated two vectors $w_{⋄}^{i}$ and $w_{⋆}^{i}$ with standard Gaussian distributed entries. We re-normalised the first vector to Euclidean norm 1, $w_{⋄}^{i} \leftarrow w_{⋄}^{i} / \| w_{⋄}^{i} \|$, and the second vector to norm 0.5, $w_{⋆}^{i} \leftarrow 0.5 \cdot w_{⋆}^{i} / \| w_{⋆}^{i} \|$. We set $w_{1}^{i} = w_{2}^{i} = w_{⋄}^{i}$, so that the segments $h = 1$ and $h = 2$ are coupled, and $w_{3}^{i} = w_{4}^{i} = (w_{⋄}^{i} + w_{⋆}^{i}) / \| w_{⋄}^{i} + w_{⋆}^{i} \|$, so that the segments $h = 3$ and $h = 4$ are coupled, while the coupling between $h = 3$ and $h = 2$ is ‘moderate’. For each m we generated 25 data matrices with different regression coefficients.
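The coefficient construction of Study 2 can be sketched in the same way. The Python snippet below is an illustration for a single node under the same assumptions as before (hypothetical function names, intercept included in the coefficient vector).

```python
import math
import random

def scaled_gaussian_vector(dim, norm, rng):
    """Standard Gaussian vector rescaled to the given Euclidean norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    length = math.sqrt(sum(x * x for x in v))
    return [norm * x / length for x in v]

def study2_coefficients(n_parents, rng):
    """Coefficients w_1 = w_2 = w_diamond and w_3 = w_4 = normalised (w_diamond + w_star)."""
    dim = n_parents + 1                                   # +1 for the intercept (assumption)
    w_diamond = scaled_gaussian_vector(dim, 1.0, rng)
    w_star = scaled_gaussian_vector(dim, 0.5, rng)
    shifted = [a + b for a, b in zip(w_diamond, w_star)]
    length = math.sqrt(sum(x * x for x in shifted))       # at least 0.5, so never zero
    w_34 = [x / length for x in shifted]
    return [w_diamond, list(w_diamond), w_34, list(w_34)]
```

Because the perturbation has half the norm of $w_{⋄}^{i}$, elementary geometry gives that the angle between the coefficient vectors of segments 2 and 3 is at most 30 degrees, which is one way to read the ‘moderate’ change between these segments.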

Yeast gene expression data

Cantone et al. [17] synthetically designed a network in S. cerevisiae (yeast) with $N = 5$ genes, and measured gene expression data under galactose- and glucose-metabolism: 16 measurements were taken in galactose and 21 measurements were taken in glucose, at 20-minute intervals between measurements. Although the network is small, it is an ideal benchmark data set: the network structure is known, so that network reconstruction methods can be cross-compared on real wet-lab data. We follow Grzegorczyk and Husmeier and pre-process the data as described in [5]. The true network structure is shown in the left panel of Fig. 3. As an example, a network prediction obtained with the partially coupled model (M3) is shown in the right panel. For the prediction we extracted the 8 edges with the highest scores.

Arabidopsis gene expression data

The circadian clock in Arabidopsis thaliana optimizes the gene regulatory processes with respect to the daily dark:light cycles (photo periods). In four experiments Arabidopsis plants were entrained in different dark:light cycles, before gene expression data were measured under constant light condition over 24- and 48-h time intervals. We follow Grzegorczyk and Husmeier [5] and merge the four time series to one single data set with $T = 47$ data points and focus our attention on the $N = 9$ core genes: LHY, TOC1, CCA1, ELF4, ELF3, GI, PRR9, PRR5, and PRR3.

Results

In this section we present the results of a comparative evaluation study, in which we compare the performance of the new partially coupled model (M3) with the competing models M1, M2 and M4. Throughout this section we use the new M3 model as reference model.

Results for synthetic network data

We start with the RAF-pathway for which we generated network data for 8 different coupling scenarios. Figure 5a compares the network reconstruction accuracies in terms of average AUC value differences. For 6 out of 8 scenarios the three AUC differences are clearly and significantly in favour of M3. Not surprisingly, for the two extreme scenarios, where all segments $h \geq 2$ are either coupled (‘0111’) or uncoupled (‘0000’), M3 performs slightly worse than the fully coupled models (M2 and M4) or the uncoupled model (M1), respectively. But unlike the uncoupled model (M1) for coupled data (‘0111’), and unlike the coupled models (M2 and M4) for uncoupled data (‘0000’), the partially coupled model (M3) never performs significantly worse than the respective ‘gold-standard’ model. For the partially coupled model, Fig. 5b shows the posterior probabilities that the segments $h = 2, 3, 4$ are coupled. The trends are in good agreement with the true coupling mechanism. Model M3 correctly infers whether the regression coefficients stay similar (identical) or change (substantially). The generalised coupled model (M4) can only adjust the segment-specific coupling strengths, but has no option to uncouple. Like the coupled model (M2), it fails when the parameters are subject to drastic changes. When comparing the coupled model (M2) with the generalised coupled model (M4), we see that M2 performs better when only one segment is coupled, while the new M4 model is superior to M2 if two segments are coupled, see the scenarios ‘0011’, ‘0110’, and ‘0101’.

Fig. 5 — Results for synthetic RAF pathway data. We distinguish 8 coupling scenarios $(δ_{1} = 0, δ_{2}, δ_{3}, δ_{4})$ . a Each histogram has three bars for the average AUC differences between the partially coupled model (M3) and the other models: ‘M3 versus M2 [= Coupled]’ (white), ‘M3 versus M4 [= Generalised]’ (black), and ‘M3 versus M1 [= Uncoupled]’ (grey). The error bars indicate t-test confidence intervals. b Diagnostic for the partially coupled model (M3): The bars give the posterior probabilities $p (δ_{h} = 1 | D)$ that segment h is coupled to $h - 1$ ( $h = 2, 3, 4$ )

For the yeast network we generated data corresponding to a ‘0101’ coupling scheme, and the change of the parameters (from the 2nd to the 3rd segment) is less drastic than for the RAF pathway data. Figure 6 shows how the AUC differences vary with the number of time points T, where $T = 4 m$ and m is the number of data points per segment. For sufficiently many data points the effect of the prior diminishes and all models yield high AUC values (see bottom right panel). There are then no significant differences between the AUC values anymore. However, for the lower sample sizes the new partially coupled model (M3) again performs clearly best. For $12 \leq T \leq 28$ model M3 is significantly superior to all other models, and for $30 \leq T \leq 40$ it still significantly outperforms the uncoupled (M1) and the coupled (M2) model. The performance of the generalised model (M4) is comparable to the performance of the uncoupled model. For moderate sample sizes ($12 \leq T \leq 44$) model M4 is significantly better than the fully coupled model (M2).

Fig. 6 — Results for synthetic yeast data, generated under coupling scenario (0, 1, 1, 1). Five panels show the average AUC differences plotted against the numbers of data points T. The error bars indicate t-test confidence intervals. The bottom right panel shows the model-specific average AUC values

Results for yeast gene expression data

For the yeast gene expression data we assume the changepoint(s) to be unknown and we infer the segmentation from the data. Figure 7a shows convergence diagnostics for the partially coupled model (M3). It can be seen from the scatter plots that $V = 10{,}000$ RJMCMC iterations already yield almost perfect convergence. The edge scores of 15 independent MCMC runs are almost identical to each other.

The average AUC scores of the models M1–M4 are shown in Fig. 7b. Since the number of inferred changepoints grows with the hyperparameter p of the geometric distribution on the distance between changepoints, we implemented the models with different p’s. The uncoupled model is superior to the coupled model for the lowest p ($p = 0.02$) only, but becomes more and more inferior to the coupled model as p increases. This result is consistent with the finding in Grzegorczyk and Husmeier [5] and can be explained as follows: As the hyperparameter of the changepoint prior $p \in (0, 1)$ increases, the number of inferred data segments H grows so that the individual data segments $h = 1, \dots, H$ get shorter. The individual segments h then cover fewer data points and are thus less informative. The coupling scheme allows for information-sharing among segments. The information content of large segments is sufficient for inference, so that coupling does not provide any noteworthy advantage. But for short (uninformative) segments information coupling improves the inference certainty, as coupling allows for the incorporation of information from the preceding segment(s). Therefore the potential improvement that can be gained by coupling grows with the hyperparameter p.

The new partially coupled model (M3) performs consistently better than the uncoupled and the coupled model (M1–M2). The only exception occurs for $p = 0.1$, where the coupled model (M2) appears to perform slightly (but not significantly) better than M3. For p’s up to $p = 0.05$ the fully coupled (M2) and the generalised fully coupled model (M4) perform approximately equally well. However, for the three highest p’s the M4 model performs better than the coupled model (M2) and even outperforms the new partially coupled model (M3). While the performances of the models M1–M3 decrease with the number of changepoints, the performance of the model M4 stays rather robust.

Subsequently, we re-analysed the yeast data with $K = 1, \dots, 5$ fixed changepoints. Figure 8a, b shows the average AUC scores and the AUC score differences in favour of the partially coupled model (M3). Panel (a) reveals that the new partially coupled model (M3) again reaches the highest network reconstruction accuracy. Panel (b) shows that the superiority of M3 is significant, with only one exception: For $K = 1$ the uncoupled model M1 does not perform worse than the partially coupled model (M3).

Fig. 8 — Results for real yeast data with fixed changepoints. We imposed $K \in \{1, \dots, 5\}$ changepoints and kept them fixed. K changepoints yield $H = K + 1$ segments. For each K we used the first changepoint to separate the two parts of the time series (galactose vs. glucose metabolism). Successively we located the next changepoint in the middle of the longest segment to divide it into 2 segments, until K changepoints were set. a shows the model-specific average total AUC scores with error bars indicating standard deviations. b shows the AUC score differences in favour of the partially coupled model (M3). Here the error bars indicate t-test confidence intervals
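The placement rule for the fixed changepoints described in the caption can be sketched as below. This Python snippet is only an illustration: the function name is hypothetical, and the tie-breaking rule (the earliest of several equally long segments is split) and the rounding of the midpoint are assumptions, since the text does not specify them.

```python
def fixed_changepoints(T, first_cp, K):
    """Place K changepoints: the first at first_cp, each further one in the
    middle of the currently longest segment."""
    cps = [first_cp]
    while len(cps) < K:
        bounds = [0] + sorted(cps) + [T]
        lengths = [(bounds[i + 1] - bounds[i], i) for i in range(len(bounds) - 1)]
        # longest segment; the earliest one wins ties (assumed convention)
        length, i = max(lengths, key=lambda t: (t[0], -t[1]))
        cps.append(bounds[i] + length // 2)
    return sorted(cps)
```

As an illustrative call, `fixed_changepoints(37, 16, 3)` splits a series of 16 + 21 points first at the medium switch, then halves the longer part, then the next longest. The actual series lengths after the pre-processing described in [5] may differ from these raw measurement counts.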

Subsequently, we also investigated the segment-specific coupling posterior probabilities $p (δ_{h} = 1 | D)$ ( $h = 2, \dots, H = K + 1$ ) for the new partially coupled model (M3) and the posterior distributions of the coupling parameters $λ_{u}, λ_{2}, \dots, λ_{K + 1}$ for the generalised model (M4), but we could not find clear trends for any gene. As an example, we provide the results for gene ASH1 in Fig. 9a, b. Panel (a) shows that the coupling posterior probabilities of model M3 do not have a clear pattern. However, it becomes obvious that the partially coupled model makes use of segment-wise switches between the uncoupled and the coupled approach. Panel (b) shows that the distributions of the segment-specific coupling parameters, $λ_{2}, \dots, λ_{K + 1}$ , of model M4 stay rather similar among segments. This explains why the generalised coupled model (M4) is not superior to the fully coupled model (M2).

Fig. 9 — Results for real yeast data with fixed changepoints. We imposed $K \in \{1, \dots, 5\}$ changepoints and kept them fixed. K changepoints yield $H = K + 1$ segments. For each K we used the first changepoint to separate the two parts of the time series (galactose vs. glucose metabolism). Successively we located the next changepoint in the middle of the longest segment to divide it into 2 segments, until K changepoints were set. a Diagnostic for the partially coupled model (M3): The bars give the posterior probabilities $p (δ_{h} = 1 | D)$ that segment h is coupled to $h - 1$ ($h = 2, \dots, K + 1$) for target gene ASH1. b Diagnostic for the generalised coupled model (M4): In each panel there is a boxplot for each segment $h = 2, \dots, K + 1$ showing the distributions of the logarithmic coupling parameters $λ_{h}$ for target gene ASH1

Application to Arabidopsis gene expression data

For the Arabidopsis gene expression data we cannot objectively compare the network reconstruction accuracies of the four models, since the true circadian clock network is not known. We therefore only applied the new partially coupled model (M3), which we had found to be the best model in our earlier studies. Figure 10 shows the Arabidopsis network, which was reconstructed using the hyperparameter $p = 0.1$ for the geometric distribution on the distance between changepoints. To obtain a network prediction, we extracted the 20 edges with the highest edge scores. Although a proper evaluation of the network prediction is beyond the scope of this paper, we note that several features of the network are consistent with the plant biology literature. E.g. the feedback loop between LHY and TOC1 is the most important key feature of the circadian clock network (see, e.g., the work by Locke et al. [18]). Many of the other predicted edges have been reported in more recent works. E.g. the edges $L H Y \to E L F 3$ , $L H Y \to E L F 4$ , $G I \to T O C 1$ , $E L F 3 \to P R R 3$ and $E L F 4 \to P R R 9$ can all be found in the circadian clock network (hypothesis) of Herrero et al. [19].

Fig. 10 — Prediction of the circadian clock network in *Arabidopsis thaliana*. The prediction was obtained with the proposed partially coupled model (M3), using the hyperparameter $p = 0.1$ for the geometric distribution on the distance between changepoints. The network shows the 20 edges with the highest edge scores. We have added the label ‘L’ to those edges that have already been reported in the biology literature. For more details see the main text

Discussion and conclusions

We have proposed a new Bayesian piece-wise linear regression model for reconstructing regulatory networks from gene expression time series. The new partially coupled model (M3), whose graphical model representation is given in Fig. 2, is a consensus model between the uncoupled model (M1) and the fully coupled model (M2). In the uncoupled model (M1) the segment-specific regression coefficients have to be learned for each segment separately. In the fully coupled model (M2) each segment is compelled to be coupled to the previous one. The new partially coupled model (M3) combines features of the uncoupled and the fully coupled model, and it can infer for each individual time segment whether it is coupled to (or uncoupled from) the preceding segment.

We have cross-compared the new model (M3) with the two established models (M1–M2) as well as with the generalised coupled model (M4) that makes use of segment-specific coupling parameters [6]. In our data applications, the new partially coupled model (M3) reached significantly better network reconstruction accuracies than its competitors (M1, M2, and M4).

In an earlier work [6], we found that the performances of the fully coupled model (M2) and of the generalised fully coupled model (M4) can be improved by imposing additional hyperpriors on the hyperparameters of the coupling strength parameter. In our future work we will therefore investigate whether either the use of hyperpriors or the use of segment-specific continuous (coupling/SNR) parameters along the lines of the M4 model can improve the new partially coupled model (M3). Moreover, we will also try to combine the concept of partially coupled time segments of the proposed model (M3) with the recently proposed concept of partially coupled edges [8]. The combination of both concepts will yield a highly flexible novel NH-DBN model, in which each individual network edge is partially segment-wise coupled. We will empirically test whether this new hybrid model leads to improved network reconstruction results or whether it suffers from model over-flexibility.

Supplementary information

12859_2021_3998_MOESM1_ESM.pdf (170.7 KB, pdf)

Additional file 1. Graphical model representations of the three competing models are provided as additional files. Figure 11 shows a graphical model representation of the M1 model. Figure 12 shows a graphical model representation of the M2 model. Figure 13 shows a graphical model representation of the M4 model.

Acknowledgements

Not applicable.

About this supplement

This article has been published as part of BMC Bioinformatics Volume 22, Supplement 2 2021: 15th and 16th International Conference on Computational Intelligence methods for Bioinformatics and Biostatistics (CIBB 2018-19). The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-22-supplement-2.

Abbreviations

DBN: Dynamic Bayesian network
NH-DBN: Non-homogeneous dynamic Bayesian network
MCMC: Markov chain Monte Carlo
RJMCMC: Reversible jump Markov chain Monte Carlo
SNR: Signal-to-noise ratio
AUC: Area under the precision-recall curve

Authors' contributions

Both authors contributed equally to the methodological work. MSK performed the computational work and drafted the manuscript. MG supervised the project and revised the draft version of the manuscript. All authors read and approved the final manuscript.

Funding

Not applicable.

Availability of data and materials

The datasets analysed during the current study are available in the figshare repository, https://figshare.com/s/96f578777aa6b43f3638

We note that the data stem from earlier publications [5, 17].

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Mahdi Shafiee Kamalabad, Email: m.shafiee@tilburguniversity.edu.

Marco Grzegorczyk, Email: m.a.gzegorczyk@rug.nl.

References

1. Lèbre S, Becq J, Devaux F, Lelandais G, Stumpf MPH. Statistical inference of the time-varying structure of gene-regulation networks. BMC Syst Biol. 2010;4:130. doi: 10.1186/1752-0509-4-130.
2. Grzegorczyk M, Husmeier D. Improvements in the reconstruction of time-varying gene regulatory networks: dynamic programming and regularization by information sharing among genes. Bioinformatics. 2011;27(5):693–699. doi: 10.1093/bioinformatics/btq711.
3. Dondelinger F, Lèbre S, Husmeier D. Non-homogeneous dynamic Bayesian networks with Bayesian regularization for inferring gene regulatory networks with gradually time-varying structure. Mach Learn. 2012;90:191–230. doi: 10.1007/s10994-012-5311-x.
4. Grzegorczyk M, Husmeier D. Regularization of non-homogeneous dynamic Bayesian networks with global information-coupling based on hierarchical Bayesian models. Mach Learn. 2013;91:105–154. doi: 10.1007/s10994-012-5326-3.
5. Grzegorczyk M, Husmeier D. A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology. Stat Appl Genet Mol Biol SAGMB. 2012;11(4) (Article 7).
6. Shafiee Kamalabad M, Grzegorczyk M. Improving nonhomogeneous dynamic Bayesian networks with sequentially coupled parameters. Stat Neerl. 2018;72(3):281–305. doi: 10.1111/stan.12136.
7. Shafiee Kamalabad M, Heberle AM, Thedieck K, Grzegorczyk M. Partially non-homogeneous dynamic Bayesian networks based on Bayesian regression models with partitioned design matrices. Bioinformatics. 2019;35(12):2108–2117. doi: 10.1093/bioinformatics/bty917.
8. Shafiee Kamalabad M, Grzegorczyk M. Non-homogeneous dynamic Bayesian networks with edge-wise sequentially coupled parameters. Bioinformatics. 2020;36(4):1198–1207. doi: 10.1093/bioinformatics/btz690.
9. Vignes M, Vandel J, Allouche D, Ramadan-Alban N, Cierco-Ayrolles C, Schiex T, Mangin B, De Givry S. Gene regulatory network reconstruction using Bayesian networks, the Dantzig selector, the Lasso and their meta-analysis. PLoS ONE. 2011;6(12):29165. doi: 10.1371/journal.pone.0029165.
10. Huang X, Zi Z. Inferring cellular regulatory networks with Bayesian model averaging for linear regression (BMALR). Mol Biol Syst. 2014;10(8):2023–2030. doi: 10.1039/c4mb00053f.
11. Xing L, Guo M, Liu X, Wang C, Wang L, Zhang Y. An improved Bayesian network method for reconstructing gene regulatory network based on candidate auto selection. BMC Genom. 2017;18(9):17–30. doi: 10.1186/s12864-017-4228-y.
12. Fan Y, Wang X, Peng Q. Inference of gene regulatory networks using Bayesian nonparametric regression and topology information. Comput Math Methods Med. 2017;2017:8307530. doi: 10.1155/2017/8307530.
13. Xu S, Zhang C-X, Wang P, Zhang J. Variational Bayesian complex network reconstruction. CoRR 2018.
14. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. 2nd ed. London: Chapman and Hall/CRC; 2004.
15. Bishop CM. Pattern recognition and machine learning. Singapore: Springer; 2006.
16. Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP. Protein-signaling networks derived from multiparameter single-cell data. Science. 2005;308:523–529. doi: 10.1126/science.1105809.
17. Cantone I, Marucci L, Iorio F, Ricci MA, Belcastro V, Bansal M, Santini S, di Bernardo M, di Bernardo D, Cosma MP. A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches. Cell. 2009;137:172–181. doi: 10.1016/j.cell.2009.01.055.
18. Locke JCW, Kozma-Bognár L, Gould PD, Fehér B, Kevei E, Nagy F, Turner MS, Hall A, Millar AJ. Experimental validation of a predicted feedback loop in the multi-oscillator clock of Arabidopsis thaliana. Mol Syst Biol. 2006;2(1):59. doi: 10.1038/msb4100102.
19. Herrero E, Kolmos E, Bujdoso N, Yuan Y, Wang M, Berns MC, Uhlworm H, Coupland G, Saini R, Jaskolski M, Webb A, Concalves J, Davis SJ. EARLY FLOWERING4 recruitment of EARLY FLOWERING3 in the nucleus sustains the Arabidopsis circadian clock. Plant Cell. 2012;24(2):428–443. doi: 10.1105/tpc.111.093807.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

12859_2021_3998_MOESM1_ESM.pdf^{(170.7KB, pdf)}

Data Availability Statement

The datasets analysed during the current study are available in the figshare repository, https://figshare.com/s/96f578777aa6b43f3638

We note that the data stem from earlier publications [5, 17].
