. Author manuscript; available in PMC: 2025 Dec 1.
Published in final edited form as: IEEE Trans Biomed Eng. 2024 Nov 21;71(12):3390–3401. doi: 10.1109/TBME.2024.3423803

A Deep Dynamic Causal Learning Model to Study Changes in Dynamic Effective Connectivity during Brain Development

Yingying Wang 1, Chen Qiao 2, Gang Qu 3, Vince D Calhoun 4, Julia M Stephen 5, Tony W Wilson 6, Yu-Ping Wang 7
PMCID: PMC11700232  NIHMSID: NIHMS2030680  PMID: 38968024

Abstract

Objective:

Brain dynamic effective connectivity (dEC) characterizes time-varying patterns of information transmission between brain regions, providing insight into the biological mechanisms underlying brain development. However, most existing methods capture only fixed or temporally invariant EC, leaving dEC largely unexplored.

Methods:

Herein we propose a deep dynamic causal learning model specifically designed to capture dEC. It includes a dynamic causal learner to detect time-varying causal relationships from spatio-temporal data, and a dynamic causal discriminator to validate these findings by comparing original and reconstructed data.

Results:

Our model outperforms established baselines in the accuracy of identifying dynamic causalities when tested on the simulated data. When applied to the Philadelphia Neurodevelopmental Cohort, the model uncovers distinct patterns in dEC networks across different age groups. Specifically, the evolution process of brain dEC networks in young adults is more stable than in children, and significant differences in information transfer patterns exist between them.

Conclusion:

This study highlights the brain’s developmental trajectory, where networks transition from undifferentiated to specialized structures with age, in accordance with the improvement of an individual’s cognitive and information processing capability.

Significance:

The proposed model consists of the identification and verification of dynamic causality, utilizing the fused spatio-temporal information from fMRI. As a result, it can accurately detect dEC and characterize its evolution over age.

Keywords: Brain development, dynamic causality, dynamic effective connectivity, spatio-temporal information

I. Introduction

The human brain is a highly complex neural network system consisting of spatially distinct but functionally interconnected regions. Different brain areas coordinate and regulate various bodily functions through complex neuronal interactions [1], [2]. Functional magnetic resonance imaging (fMRI) technology, with its high spatial resolution and non-invasive nature, is widely used to investigate complex brain functional networks [3]–[5]. Research on brain functional networks primarily focuses on two aspects: functional connectivity (FC) [6] and effective connectivity (EC) [7]. While FC indicates statistical dependencies between different brain regions, EC illustrates the causal influences that one region exerts over another, thereby revealing both the intensity and direction of information flow to enhance our understanding of human cognitive mechanisms [8], [9]. Brain dynamic EC (dEC) signifies the evolving EC that includes both the causality distributed across brain networks and the corresponding temporal changes, which can be applied to identify dynamic causality between brain regions.

Causal modeling, the process of deducing causal relationships from observed data, has gained increasing attention in brain EC analysis [10], [11]. Specifically, each brain region is viewed as a node in a causal graph, with directed edges representing ECs between brain regions [12]. Various causal discovery algorithms have been employed to identify EC, including Granger causality (GC), transfer entropy (TE), structural equation models (SEM), and dynamic Bayesian networks (DBN). For example, Ting et al. proposed a factor-based GC model to learn large-scale EC [13]. Kim et al. proposed a two-stage unified SEM to identify EC from multisubject, multivariate fMRI [14]. Wei et al. established weighted directed functional brain networks from electroencephalogram data using normalized phase TE [15]. Liu et al. proposed a CTE-score function based on conditional entropy (CE) and TE to evaluate the quality of candidate ECs from fMRI [16]. Rajapakse et al. proposed a unified probabilistic framework to simultaneously consider the detection of brain activation and the estimation of EC [17]. With the development of deep learning, neural networks have been successfully applied to causal discovery. Tank et al. [18] inferred nonlinear causal relationships by applying structured multilayer perceptrons (MLP) or recurrent neural networks (RNN) combined with sparsity constraints on the weights. Nauta et al. proposed a temporal causal discovery framework (TCDF) that uses attention-based convolutional neural networks to learn causality from time series data [19]. Kipf et al. introduced a neural relational inference (NRI) model based on a variational autoencoder to learn the interactions and dynamics of a system [20]. These methods predominantly focus on fixed, temporally invariant EC; thus, they are called static causal discovery methods.

To identify time-varying causality, Robinson et al. proposed a nonhomogeneous DBN (NHDBN) by replacing the linear model with a piecewise linear model, in which the regression coefficients are learned separately for each segment [21]. Grzegorczyk et al. introduced the sequentially coupled NHDBN (Seq-DBN) and the globally coupled NHDBN (Glob-DBN) to allow information exchange among time segments [22], [23]. Kamalabad et al. proposed an NHDBN with an edge-wise coupling scheme, which infers whether the interaction parameters should be coupled for each individual edge [24]. Liu et al. proposed a non-stationary DBN method to estimate brain dEC [25]. However, these models require prior assumptions on the networks; thus, the selection of the prior distributions affects the final parameter estimation and prediction results. Additionally, NHDBNs need to infer change points in the data, which increases the computational complexity of the model, especially for large-scale datasets. These requirements greatly limit the application of NHDBNs.

Additionally, previous research has demonstrated that the temporal and spatial information in human brain fMRI data is particularly important for brain functional network analysis [26]. Temporal dynamic features of brain regions, coupled with spatial dependency, characterize the spatio-temporal fusion information of brain regions. Gadgil et al. proposed a model based on spatio-temporal graph convolution networks (STGCN) for investigating rs-fMRI data [27]. Jiang et al. proposed an anatomy-guided spatio-temporal graph convolutional network (AG-STGCN) to discover the regularity and variability of FC differences between gyri and sulci across multiple task domains [28]. Lian et al. proposed a pedestrian trajectory prediction model based on a spatio-temporal graph convolutional network (PTP-STGCN) that uses a Transformer to capture temporal features and graph convolution to extract spatial features [29]. Experimental results show that these methods achieve favorable results in classification and correlation recognition by capturing deep spatio-temporal features. However, they cannot reveal the causal relationships between nodes.

To address the above issues, a deep dynamic causal learning (DDCL) model is proposed to identify dynamic causal relationships between brain regions in an unsupervised manner. The main contributions of the paper can be summarized as follows.

1). Constructing a deep network-based dynamic causal learning model:

Our model introduces a novel dual-module architecture: a dynamic causal learner with a spatio-temporal feature learning module combining a temporal convolutional network (TCN) with a spatial attention mechanism, and a dynamic causal discriminator employing a spatio-temporal SEM (ST-SEM) module. This combination allows a more nuanced capture and verification of dynamic causality, enabling the learning of spatio-temporal data in ways that traditional methods cannot.

2). Discovering dEC Networks in the Human Brain:

Applied to simulated data and real fMRI data from the Philadelphia Neurodevelopmental Cohort (PNC), our model demonstrates superior accuracy in identifying dEC networks, revealing developmental changes in brain connectivity. It uncovers the transformation from segregated to integrated dEC networks as children mature into young adults, thereby validating the model's utility in studying brain development.

II. Methodology

In this section, DDCL is introduced. Some preliminary work is first presented, including TCN, the attention mechanism, and SEM. Then, the details of the dynamic causal learner and the dynamic causal discriminator are described. Finally, the learning process of the model is presented. The flow chart of DDCL is shown in Fig. 1.

Fig. 1.


The flow chart of the $i$-th subnetwork $\mathcal{N}_i$ of DDCL. Each subnetwork $\mathcal{N}_i$ takes the past time series of all nodes as input to reconstruct $x_i$. $\mathrm{DCL}_i$ denotes the dynamic causal learner corresponding to the $i$-th node $x_i$, and $\mathrm{DCD}_i$ the corresponding dynamic causal discriminator. STFL and CE denote the spatio-temporal feature learning module and the causality estimation module, respectively. ST-SEM and CV denote the spatio-temporal SEM module and the causality verification module, respectively. TIA and SIA together form the ST-SEM module.

A. Preliminary Work

1). TCN:

Compared with recurrent neural networks, TCN alleviates the problem of exploding/vanishing gradients and is widely used in time series analysis [30]. TCN consists of multiple TCN blocks, each constructed with causal convolutions and dilated convolutions. Dilated convolution is an effective method for handling long-range time series since its receptive field grows exponentially with the number of convolutional layers. Causal convolution is a type of convolution designed for time series: the output at time $t$ depends only on the inputs at time $t$ and before it. Specifically, for a time series $x = (x_1, x_2, \ldots, x_T)$, the prediction $\hat{x}_t$ at timestep $t$ is

$\hat{x}_t = (x * f)(t) = \sum_{k=0}^{K-1} f(K-k)\, x_{t-k}$ (1)

where $f$ represents a convolution kernel of size $K$.
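The causal convolution in (1) can be sketched directly in NumPy; this is an illustrative stand-in (the kernel indexing and zero left-padding are implementation assumptions), not the paper's TCN code:

```python
import numpy as np

def causal_conv(x, f):
    """Causal convolution of Eq. (1): hat{x}_t depends only on x_t and earlier.

    x : (T,) input series; f : (K,) kernel, with f[K-1-k] playing the role
    of f(K-k) in Eq. (1). Zero left-padding is an implementation assumption.
    """
    K, T = len(f), len(x)
    x_pad = np.concatenate([np.zeros(K - 1), x])   # no leakage from the future
    return np.array([
        sum(f[K - 1 - k] * x_pad[K - 1 + t - k] for k in range(K))
        for t in range(T)
    ])
```

For example, with kernel `f = [1, 1]` the output at each step is the sum of the current and previous sample, so `causal_conv([1, 2, 3], [1, 1])` yields `[1, 3, 5]`.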

2). Attention mechanisms:

The essence of the attention mechanism is to calculate the weights of different nodes/series in the data, which is an effective way to help the model decide which features are more important. Therefore, an attention mechanism [31] is introduced here to account for the spatial dependencies of nodes in the data.

Let $X = (x_1, x_2, \ldots, x_N)^T \in \mathbb{R}^{N \times d}$ be a sample, where $N$ represents the number of nodes and $d$ represents the dimension of features, and let $\alpha \in \mathbb{R}^{N \times N}$ denote the attention score matrix, with $\alpha_{ij}$ representing the attention score between node $i$ and node $j$. To calculate the attention score $\alpha_{ij}$, the weight coefficient $e_{ij}$, which indicates the influence that node $j$ exerts on node $i$, is first given by:

$e_{ij} = (\tilde{U}\tilde{V}^T)_{ij}$ (2)

where $\tilde{U} = \tanh(UH^{(l)})$ and $\tilde{V} = \tanh(VH^{(l)})$, $U, V$ are the parameter matrices, and $H^{(l)} \in \mathbb{R}^{N \times d'}$ represents the extracted feature at layer $l$. The attention score $\alpha_{ij}$ is

$\alpha_{ij} = \mathrm{softmax}(e_{ij}) = \dfrac{\exp(e_{ij})}{\sum_{k=1, k \neq i}^{N} \exp(e_{ik})}$ (3)
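Equations (2) and (3) can be sketched as follows; the exact shapes of $U$ and $V$ and the masked softmax over $j \neq i$ are assumptions based on the surrounding text:

```python
import numpy as np

def attention_scores(H, U, V):
    """Attention scores of Eqs. (2)-(3), a minimal sketch.

    H : (N, d') node features H^(l); U, V : (d'', d') parameter matrices
    (shapes are assumptions -- the paper leaves them implicit).
    Returns alpha : (N, N), each row normalized over j != i.
    """
    U_t = np.tanh(H @ U.T)        # \tilde{U} = tanh(U H^(l)), row-wise form
    V_t = np.tanh(H @ V.T)        # \tilde{V} = tanh(V H^(l))
    e = U_t @ V_t.T               # e_ij = (\tilde{U} \tilde{V}^T)_ij, Eq. (2)
    N = e.shape[0]
    alpha = np.zeros_like(e)
    for i in range(N):
        idx = [j for j in range(N) if j != i]        # exclude self-influence
        ex = np.exp(e[i, idx] - np.max(e[i, idx]))   # numerically stable softmax
        alpha[i, idx] = ex / ex.sum()                # Eq. (3)
    return alpha
```

Each row of the returned matrix sums to one and has a zero diagonal, matching the no-self-influence convention used later in the SIA step.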

3). SEM:

Let $A \in \mathbb{R}^{N \times N}$ be the weighted adjacency matrix of the causal graph $G$ corresponding to $X$. The linear SEM [32] is

$X = A^T X + E$ (4)

where $E = (e_1, e_2, \ldots, e_N)^T$ is random noise. If $A_{ij} \neq 0$, $x_j$ is a direct cause of $x_i$ and there is a directed edge $x_j \rightarrow x_i$ in the causal graph $G$.
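As a quick illustration of (4): for an acyclic weight matrix, data satisfying the linear SEM can be sampled in closed form. The edge convention (a matrix `W` playing the role of $A^T$) and the toy three-node graph below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# W[i, j] != 0 encodes a directed edge x_j -> x_i, so W plays the role of
# A^T in Eq. (4). For an acyclic graph, X = (I - W)^{-1} E solves X = W X + E.
W = np.array([[0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0],    # x_0 -> x_1 with weight 0.7
              [0.0, 0.5, 0.0]])   # x_1 -> x_2 with weight 0.5
E = rng.normal(size=(3, 100))     # random noise, N = 3 nodes, d = 100 samples
X = np.linalg.solve(np.eye(3) - W, E)
assert np.allclose(X, W @ X + E)  # the structural equation holds exactly
```

Causal discovery then amounts to recovering the non-zero pattern of `W` from the observed `X` alone.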

B. DDCL

Consider a set of samples $S = \{X_1, X_2, \ldots, X_K\}$ containing $K$ subjects. Each subject, denoted $X_k$, is represented by a matrix in $\mathbb{R}^{N \times T}$, where $N$ is the number of nodes and $T$ the number of time points. For simplicity, we refer to each subject as $X$ and write $X = (x_1, x_2, \ldots, x_N)^T$, where each $x_i$ is a time series of length $T$. For each node $x_i$, $x_i^t$ is the observed value at time $t$, and $x_i^{(t)} = (x_i^1, \ldots, x_i^{t-1}, x_i^t)$ denotes the observed series from the initial time up to time $t$. Let $X^t = (x_1^{(t)}, x_2^{(t)}, \ldots, x_N^{(t)})$.

DDCL is constructed with $N$ independent subnetworks $\mathcal{N}_1, \mathcal{N}_2, \ldots, \mathcal{N}_N$; each $\mathcal{N}_i$ consists of a dynamic causal learner, which learns the dynamic causal relationships between $x_i$ and the other nodes, and a dynamic causal discriminator, which verifies the accuracy of the learned causality.

1). Dynamic Causal Learner (DCL):

DCL is constructed to discover the dynamic causality between nodes by utilizing spatio-temporal information.

For node $x_i$, the dynamic causal learner $\mathrm{DCL}_i$, built on a spatio-temporal feature learning module and a causality estimation module, is used to identify the dynamic causality between $x_i$ and the other nodes in the observed data. The learned $A_i^t \in \mathbb{R}^N$ $(t = 2, 3, \ldots, T)$ is a causal mask vector that measures the causality between $x_i$ and the other nodes. Specifically, the value $A_{ij}^t \in [0, 1]$ represents the strength of the causality $x_j \rightarrow x_i$ at time $t$; if $A_{ij}^t = 0$, $x_j$ is not a cause of $x_i$ at time $t$.

i). Spatio-temporal feature learning (STFL) module:

The STFL module is used to fuse the spatio-temporal information in the observed data. Specifically, for each subject $X^t$, $\mathrm{TCN}_{\mathrm{ident},i}$ is adopted to extract deep temporal features of the subject's nodes:

$h_i^t = \mathrm{TCN}_{\mathrm{ident},i}(X^t)$ (5)

where $h_i^t = (h_{i,1}^t, h_{i,2}^t, \ldots, h_{i,N}^t) \in \mathbb{R}^{N \times d_1}$, and $\mathrm{TCN}_{\mathrm{ident},i}$ corresponds to subnetwork $\mathcal{N}_i$. Each $h_{i,j}^t$ denotes the extracted deep temporal feature of the $j$-th node $x_j$ at time $t$, relying on the current and historical information of $x_j$, and $d_1$ is the dimension of features.

The spatial attention module is utilized to capture the spatial relationships between nodes. Let $\alpha_i^t = (\alpha_{i1}^t, \alpha_{i2}^t, \ldots, \alpha_{iN}^t)$ be the spatial attention vector, initialized as defined in Section II-A.2. Based on the temporal feature $h_i^t$ extracted by $\mathrm{TCN}_{\mathrm{ident},i}$, the spatio-temporal representation $H_i^t$ is obtained:

$H_i^t = \alpha_i^t \odot h_i^t = (\alpha_{i1}^t h_{i1}^t, \alpha_{i2}^t h_{i2}^t, \ldots, \alpha_{iN}^t h_{iN}^t) \in \mathbb{R}^{N \times d_1}$ (6)

where $H_i^t = (H_{i1}^t, H_{i2}^t, \ldots, H_{iN}^t)$ is the deep spatio-temporal representation of the node features, which fuses the historical temporal information of $x_i$ and the spatial impact from the other nodes at the current moment.

ii). Causality estimation (CE) module:

The CE module is used to learn the dynamic causal mask vector. Specifically, a multilayer perceptron (MLP) is utilized to estimate whether there is causality between node $x_i$ and the other nodes based on the spatio-temporal representation $H_i^t$:

$A_i^t = \sigma(f(H_i^t; \theta_i^t))$ (7)

where $\sigma(\cdot)$ is the sigmoid activation function and $f$ is an MLP with parameters $\theta_i^t$. The learned $A_i^t \in \mathbb{R}^N$ $(t = 2, 3, \ldots, T)$ represents the causality between the $i$-th node $x_i$ and the other nodes at time $t$.

Then, $X^t$ is masked by the causal mask vector $A_i^t$:

$\tilde{X}^t = A_i^t \odot X^t, \quad \text{i.e.,} \quad \tilde{x}_j^t = A_{ij}^t\, x_j^t$

where $\tilde{X}^t = (\tilde{x}_1^t, \tilde{x}_2^t, \ldots, \tilde{x}_N^t)$. $\tilde{X}^t$ is used as input to the subsequent dynamic causal discriminator to verify the accuracy of the causal mask vector $A_i^t$.
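One DCL step (the combination of Eqs. 6-7 plus the masking) might be sketched as below, assuming the TCN features $h_i^t$ are already computed and collapsing the MLP $f$ to a single linear layer for brevity (both assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dcl_step(h, alpha, W, b):
    """One step of DCL_i at time t (Eqs. 6-7), a minimal sketch.

    h     : (N, d1) temporal features from TCN_ident,i (Eq. 5), assumed given
    alpha : (N,)    spatial attention scores alpha_i^t
    W, b  : a single linear layer standing in for the MLP f(.; theta_i^t)
    Returns the causal mask vector A_i^t with entries in (0, 1).
    """
    H = alpha[:, None] * h                  # Eq. (6): H_ij^t = alpha_ij^t h_ij^t
    A = sigmoid(H.reshape(-1) @ W + b)      # Eq. (7), one-layer stand-in for MLP
    return A
```

The returned vector can then mask the input series element-wise, scaling each candidate cause node $x_j$ by $A_{ij}^t$.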

2). Dynamic Causal Discriminator (DCD):

To verify the accuracy of the causal mask vectors discovered by the dynamic causal learner, a dynamic causal discriminator is constructed.

Directly quantifying the divergence between the actual and inferred causality is impracticable, given that the true causality within the observed data is often unknown. Thus, the reconstruction strategy of SEM is employed to verify the accuracy of the learned causality [33]. Specifically, for the causal mask vectors $A_i^t$ $(t = 2, 3, \ldots, T)$, the $i$-th dynamic causal discriminator $\mathrm{DCD}_i$ is built on a ST-SEM module and a causality verification module to measure the difference between the original data $x_i$ and the reconstructed data $\hat{x}_i$. The ST-SEM is constructed with a temporal information accumulation step and a spatial information aggregation step. The closer the reconstructed data $\hat{x}_i$ is to the original data $x_i$, the more accurate the causal mask vector is.

i). ST-SEM module:

The ST-SEM module is used to reconstruct the data $\hat{x}_i$ from the masked $\tilde{X}^t$, in which each component $\tilde{x}_j^t$ is considered a cause node of $x_i$ at time $t$, through a temporal information accumulation (TIA) step and a spatial information aggregation (SIA) step. In the TIA step, $\mathrm{TCN}_{\mathrm{recon},i}$ is applied to learn the current representation from the masked $\tilde{X}^t$:

$\tilde{h}_i^t = \mathrm{TCN}_{\mathrm{recon},i}(\tilde{X}^t)$ (8)

where $\tilde{h}_i^t = (\tilde{h}_{i1}^t, \tilde{h}_{i2}^t, \ldots, \tilde{h}_{iN}^t) \in \mathbb{R}^{N \times d_2}$ is the representation matrix with $d_2$ being the dimension of features, and $\mathrm{TCN}_{\mathrm{recon},i}$ corresponds to subnetwork $\mathcal{N}_i$. Each representation vector $\tilde{h}_{ij}^t$ accumulates the information of cause node $\tilde{x}_j$ at time $t$ together with the historical information before it.

To fuse the spatial information for a better reconstruction, the SIA step is utilized. Based on the temporal information $\tilde{h}_i^t$ obtained by $\mathrm{TCN}_{\mathrm{recon},i}$, formulas (2) and (3) are used to calculate the spatial attention vector $\alpha_i^t = (\alpha_{i1}^t, \alpha_{i2}^t, \ldots, \alpha_{iN}^t) \in \mathbb{R}^N$. Since we do not consider self-influence, $\alpha_{ii}^t = 0$. The spatial attention is then applied to aggregate the information of the cause nodes. Let

$z_i^t = \tanh(W_1(\alpha_i^t \odot \tilde{h}_i^t) + b_1)$ (9)

where $W_1 \in \mathbb{R}^{d_2 \times d_3}$ and $b_1 \in \mathbb{R}^{d_3}$ are the weight matrix and bias vector, respectively. The feature $z_i^t \in \mathbb{R}^{N \times d_3}$ serves as a representation that accumulates the historical information of each cause node while accounting for the spatial dependency between each cause node and $x_i$. The feature $z_i^t$ is utilized to reconstruct $\hat{x}_i^t$:

$\hat{x}_i^t = W_3 \tanh(W_2 z_i^t + b_2) + e_i$ (10)

where $W_2 \in \mathbb{R}^{d_3 \times d_4}$, $W_3 \in \mathbb{R}^{d_4 \times 1}$, and $b_2 \in \mathbb{R}^{d_4}$ are the weight matrices and bias vector, respectively, $d_4$ is the dimension of features, and $e_i \in \mathbb{R}$ is Gaussian noise.

Based on (8), (9), and (10), the ST-SEM is devised by leveraging the temporal and spatial information of the cause nodes to reconstruct $\hat{x}_i$. If the reconstructed $\hat{x}_i$ closely resembles the original $x_i$, the causal mask vectors $A_i^t$ $(t = 2, 3, \ldots, T)$ are valid.
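The reconstruction in Eqs. (9)-(10) might be sketched as follows; the final summation over cause nodes is an assumption, since the paper leaves the aggregation to the scalar $\hat{x}_i^t$ implicit:

```python
import numpy as np

def st_sem_reconstruct(h_tilde, alpha, W1, b1, W2, b2, W3, e_i=0.0):
    """ST-SEM reconstruction of x_i^t (Eqs. 9-10), a minimal sketch.

    h_tilde : (N, d2) TIA features from TCN_recon,i (Eq. 8), assumed given
    alpha   : (N,)    SIA attention scores with alpha[i] = 0 (no self-influence)
    W1 (d2, d3), W2 (d3, d4), W3 (d4,) : weights; b1, b2 : biases
    """
    z = np.tanh((alpha[:, None] * h_tilde) @ W1 + b1)   # Eq. (9), z: (N, d3)
    per_node = np.tanh(z @ W2 + b2) @ W3                # Eq. (10) per cause node
    # Summing the per-cause-node contributions to a scalar is an assumption.
    return float(per_node.sum() + e_i)
```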

ii). Causality verification (CV) module:

To verify the accuracy of the causal mask vector, the causality verification module is applied to compare the similarity between the original $x_i$ and the reconstructed $\hat{x}_i$. Specifically, the reconstruction loss quantitatively evaluates the discrepancy between $x_i$ and $\hat{x}_i$, whereas the causal sparsity loss modulates the sparsity of the causal mask vectors $A_i^t$ $(t = 2, 3, \ldots, T)$. This design is predicated on empirical findings from neurological studies, which have elucidated the presence of pivotal sparse features within neural information processing in humans [34]. Thus, for the $i$-th node $x_i$, the loss function is

$L_i = L_{\mathrm{recon},i} + \lambda L_{\mathrm{sparse},i}$ (11)

where $L_{\mathrm{recon},i}$ is the reconstruction loss, $L_{\mathrm{sparse},i}$ is the causal sparsity loss, and $\lambda$ is the hyper-parameter that controls the sparsity of the causal mask vector.

The reconstruction loss $L_{\mathrm{recon},i}$ between the reconstructed $\hat{x}_i$ and the original $x_i$ is expressed as:

$L_{\mathrm{recon},i} = \frac{1}{T-1} \sum_{t=2}^{T} \| \hat{x}_i^t - x_i^t \|_2^2$ (12)

The causal sparsity loss employs the log-sum norm. Compared with the traditional $L_1$ and $L_2$ norms, the log-sum norm better approximates the $L_0$ norm, i.e., it has a stronger sparse learning ability [35]. Therefore, the log-sum norm is used to impose sparsity constraints on $A_i^t$, and the causal sparsity loss is defined by

$L_{\mathrm{sparse},i} = \frac{1}{N-1} \sum_{j=1, j \neq i}^{N} \sum_{t=2}^{T} \log\!\left( \frac{A_{ij}^t}{\epsilon} + 1 \right)$ (13)

where $\epsilon$ is the scale parameter of the log-sum regularization. Under the guidance of $L_i$, the subnetwork $\mathcal{N}_i$ of DDCL can discover the dynamic causal mask vectors $A_i^t$ $(t = 2, 3, \ldots, T)$ that reveal the dynamic causality in the observed data.
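The loss in Eqs. (11)-(13) is straightforward to express; the default `lam` and `eps` values below are illustrative, not the paper's tuned settings:

```python
import numpy as np

def ddcl_loss(x_hat, x, A, i, lam=0.3, eps=1e-2):
    """Loss of subnetwork N_i (Eqs. 11-13). lam and eps values are illustrative.

    x_hat, x : (T,)   reconstructed / original series for node i (t = 2..T used)
    A        : (T, N) causal mask vectors A_i^t stacked over time
    """
    T, N = A.shape
    recon = np.mean((x_hat[1:] - x[1:]) ** 2)               # Eq. (12)
    mask = np.ones(N, dtype=bool)
    mask[i] = False                                         # skip j = i
    sparse = np.sum(np.log(A[1:, mask] / eps + 1.0)) / (N - 1)  # Eq. (13)
    return recon + lam * sparse                             # Eq. (11)
```

A perfect reconstruction with an all-zero mask yields zero loss, while any non-zero mask entry adds a log-sum penalty that pushes the learner toward sparse causal graphs.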

3). Algorithm of DDCL:

Each $\mathcal{N}_i$ is trained independently to estimate the causal mask vectors $A_i^t$ $(t = 2, 3, \ldots, T)$. Independent training of $\mathcal{N}_i$ allows parallel processing and can significantly improve computational efficiency, especially when dealing with large-scale networks. Specifically, in the training of $\mathcal{N}_i$, the dynamic causal discriminator $\mathrm{DCD}_i$ is first optimized with the causal mask vector initialized as $A_i^t = [1, 1, \ldots, 1]$; the dynamic causal learner $\mathrm{DCL}_i$ is then trained together with $\mathrm{DCD}_i$. When all $N$ subnetworks are well-trained, the causal mask vectors $A_i^t$ $(i = 1, 2, \ldots, N)$ are combined to form the causal mask matrix $A^t$ at time $t$. The overall training procedure of $\mathcal{N}_i$ is shown in Algorithm 1.
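The control flow of this training procedure might be skeletonized as follows; the inner update steps are placeholders standing in for gradient descent on the losses above, not a runnable training routine:

```python
import numpy as np

def train_subnetwork(i, X, epochs_dcd=200, epochs_dcl=200):
    """Skeleton of the training of N_i (Algorithm 1); the update steps are
    placeholders standing in for gradient descent on L_recon,i and L_i.

    X : (N, T) one subject's data. Returns the mask vectors A_i^t as (T, N).
    """
    N, T = X.shape
    A = np.ones((T, N))           # initialize A_i^t = [1, 1, ..., 1]
    for _ in range(epochs_dcd):   # 1) optimize DCD_i with the fixed all-one mask
        pass                      #    (minimize L_recon,i w.r.t. DCD_i params)
    for _ in range(epochs_dcl):   # 2) train DCL_i together with DCD_i
        pass                      #    (minimize L_i = L_recon,i + lam * L_sparse,i)
    return A

# Stacking A^t[i, :] = A_i^t over i = 1..N yields the causal mask matrix A^t.
```

Because the `N` subnetworks share no parameters, the outer loop over `i` can be distributed across workers, which is the source of the parallel speedup reported later.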


III. Experiments and Analysis

In this section, we first compare the proposed DDCL with static causal discovery methods on simulated fMRI data to assess the performance of DDCL. Subsequently, DDCL is compared with time-varying causal discovery methods on Yeast gene expression data. Finally, DDCL is applied to real resting state fMRI (rs-fMRI) data, aiming to delineate the brain dEC networks of children and young adults. This application is intended to reveal the evolution in dEC networks over time and analyze the differences in information transfer patterns in brain functional networks between different groups. All experiments were conducted on a Linux server with Intel(R) Xeon(R) Gold 4130 CPU @2.10GHz, 48 cores, and 251GB of RAM, running Ubuntu 20.04.6 LTS.

A. Experiments and Evaluations on Simulation Dataset

The simulated NetSim fMRI data were provided by Smith et al. [36]; they were generated based on the dynamic causal model, and the ground truth networks are shown in Fig. 2. The details of the dataset are shown in Table I. The interactions in all the simulated networks are nonlinear. DDCL is compared with static EC methods, including GC [37], TE [38], component-wise MLP (cMLP), component-wise LSTM (cLSTM) [18], TCDF [19], NRI [20], and fNRI [39].

Fig. 2.


The ground truth networks of simulated data [36]

TABLE I.

The details of the NetSim dataset

Data Nodes Session(min) TR(s) Number of Subjects
Sim1 5 10 3.0 50
Sim2 10 10 3.0 50
Sim3 15 10 3.0 50
Sim4 50 10 3.0 50

The number of hidden units of DDCL is set to 16, and Adam is used to update the network parameters with a learning rate of 1 × 10−3. The sparsity penalty coefficient is set to 0.7 by grid search. The parameters of the comparison algorithms are selected according to the existing literature, and all codes are from the original papers. Each model underwent ten random experiments. The area under the receiver operating characteristic curve (AUROC) is used to evaluate the performance of the different models. The presence of a connection is regarded as the positive class, and the absence of a connection as the negative class. Thus, the problem of whether the causality between nodes can be identified accurately is converted into a binary classification problem. A higher AUROC indicates better classification performance, that is, a better ability to identify causal relationships. The experimental results of DDCL and the baseline methods are shown in Table II. The exceptional performance of DDCL can be attributed primarily to the effective use of spatio-temporal information and sparse constraints on causality.
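The evaluation described above treats each candidate edge as one binary classification decision. A minimal rank-based AUROC, with edge scores assumed to be, for example, time-averaged mask values, might look like:

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC: P(score of a true edge > score of a non-edge),
    counting ties as 1/2. scores, labels: flattened over candidate edges,
    labels 1 = connection present (positive class), 0 = absent."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

When every true edge outscores every non-edge the value is 1.0; chance-level scoring gives 0.5.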

TABLE II.

AUROC values (%) of DDCL and baselines on Smith datasets

Method Sim1 Sim2 Sim3 Sim4
GC 60.1 ± 3.1 56.0 ± 3.3 52.3 ± 3.9 37.5 ± 3.6
TE 61.3 ± 1.6 57.2 ± 1.7 55.4 ± 3.0 41.6 ± 1.9
NRI 68.9 ± 4.2 68.0 ± 3.5 66.7 ± 4.7 55.1 ± 1.7
fNRI 71.6 ± 2.9 70.3 ± 4.1 67.9 ± 4.0 57.5 ± 2.3
cMLP 62.7 ± 5.1 61.1 ± 4.3 59.9 ± 4.5 55.3 ± 4.2
cLSTM 66.5 ± 5.0 64.1 ± 4.2 61.3 ± 5.3 55.5 ± 4.0
TCDF 84.0 ± 4.7 81.6 ± 2.8 80.3 ± 4.9 71.2 ± 2.0
DDCL 92.3 ± 3.4 91.0 ± 4.0 89.6 ± 3.3 83.4 ± 3.2

The Yeast gene expression data were provided by Cantone et al. [40]. The data contain 5 genes; each gene was measured 37 times, with 16 measurements taken in galactose and 21 in glucose. The network structure is fixed but the strength of the connections changes with time, as shown in Fig. 3(a). The proposed model is compared with dynamic causal discovery methods, namely NHDBN, Seq-DBN, and Glob-DBN. Ten experiments were performed for each method; the results are shown in Fig. 3(b), and the AUROC and running time are presented in Table III. The results indicate that DDCL exhibits the shortest running time, owing to its capability for parallel processing. The running time of NHDBN is twice that of DDCL, while the running times of Seq-DBN and Glob-DBN are approximately three times that of DDCL. Additionally, DDCL has the highest AUROC value, which indicates that it has the best ability to identify the causal relationships.

Fig. 3.


(a) The true yeast network; (b) Comparison of AUROC values on the Yeast gene expression data

TABLE III.

The mean (±SD) of AUROC and running time for different methods

Methods DDCL NHDBN Seq-DBN Glob-DBN
AUROC 0.82 ± 0.02 0.63 ± 0.02 0.64 ± 0.01 0.76 ± 0.02
Times(s) 61 ± 1.7 122 ± 1.6 175 ± 1.3 209 ± 1.9

B. Experiments and Analysis on Real rs-fMRI data

1). Data collection and preprocessing:

The PNC is a large-scale collaborative project between the Brain Behavior Laboratory at the University of Pennsylvania and the Children's Hospital of Philadelphia, widely studied for understanding the mechanisms of human brain development. The dataset, acquired with a Siemens TIM Trio scanner, contains resting-state fMRI data from 204 young adults aged 216–271 months and 193 children aged 103–144 months [41]. The details of the subjects are listed in Table IV. Standard brain imaging preprocessing of the collected data, including motion correction, spatial normalization, and spatial smoothing with a 3 mm full-width half-maximum Gaussian kernel, was first implemented using Statistical Parametric Mapping 12 (SPM12). A regression procedure was subsequently applied to remove the influence of motion, and band-pass filtering was applied to the functional time series within the frequency range of 0.01–0.1 Hz. Following Power et al. [42], the standard 264 regions of interest (ROIs) were defined. By averaging the time series of all voxels in each ROI, the data were parcellated into a 264×T matrix for each subject, where T = 124 is the number of time points with a time interval of 3 s.

TABLE IV.

Demographic characteristics of the subjects

Children Young Adults
Number 193 204
Gender(male/female) 91/102 81/123
Age(Mean±SD,months) 124.06±11.33 231.50±12.14
Ethnicity
ASIAN 3(1.5%) 0(0%)
AFRICAN 77(39.9%) 74(36.3%)
AMERICAN 0(0%) 2(1%)
OTHER/MIXED 20(10.4%) 17(8.3%)
CAUCASIAN/WHITE 92(47.7%) 111(54.4%)
HAWAIIAN/PACIFIC 1(0.5%) 0(0%)

2). Evaluation of DDCL effectiveness:

For the preprocessed data, DDCL is applied to identify the brain dEC networks of the two age groups. The causal mask matrix is initialized as an all-one matrix. A standard Adam optimizer with a learning rate of 1 × 10−3 is employed for model optimization. The number of epochs of the dynamic causal learner and the dynamic causal discriminator is set to 200. The causal sparsity penalty coefficient is set to 0.3 by grid search. With these common parameters, DDCL is used to learn dEC networks for children and young adults separately, and individual dECs are obtained for each subject. Based on these, hypothesis testing was used to determine whether the information flow between brain regions clearly exists. Specifically, for each group, all subjects' dECs were averaged to obtain group-level dECs, denoted by $E \in \mathbb{R}^{T \times N \times N}$ with $T$ the number of time points and $N$ the number of ROIs. Then, for each pair of ROIs, a t-test was used to test whether the mean of the dEC over time is non-zero, which is the prerequisite for the existence of an effective connection. The significance level is set to 0.01. After the hypothesis testing, 9849 significantly non-zero dECs were identified in children and 3604 in young adults. The number of dECs in young adults is significantly smaller than that in children, aligning with extant research [43].
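The per-edge hypothesis test described above is a one-sample t-test on the time course of each group-level dEC entry. A sketch of the test statistic (comparing |t| against the two-sided critical value at α = 0.01 is left to a statistics table or library):

```python
import numpy as np

def edge_tstat(dec_series):
    """One-sample t statistic for H0: the mean dEC over time is zero.

    dec_series : (T,) group-averaged dEC values of one ROI pair.
    |t| is then compared against the two-sided critical value at
    alpha = 0.01 with T - 1 degrees of freedom.
    """
    T = len(dec_series)
    m = dec_series.mean()
    s = dec_series.std(ddof=1)      # sample standard deviation
    return m / (s / np.sqrt(T))
```

Edges whose statistic clears the critical threshold are the "significantly non-zero dECs" counted for each group.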

To validate the effectiveness of DDCL, we compare it with other brain functional network methods. According to whether the brain network contains directional information, these methods are divided into two categories: correlation-based and causality-based methods. Correlation-based methods include the Pearson correlation (PC), Kendall correlation (KC), and Spearman correlation (SC) methods. Causality-based methods include the aforementioned cMLP and cLSTM. The three dynamic causal methods, NHDBN, Seq-DBN, and Glob-DBN, are not compared here due to their high computational complexity. Specifically, on the real rs-fMRI data, the average running time of DDCL is 23 min per subject. Since the network scale and the length of the time series are critical factors affecting the computational complexity of NHDBN [44], and the average running time of NHDBN is 528 min per subject, it is impractical to apply NHDBN to the rs-fMRI dataset with 397 subjects, as its total running time would be approximately 5 months. Meanwhile, the average running times of Seq-DBN and Glob-DBN are 660 min and 747 min per subject, respectively, so they are also unsuitable for a dataset of this scale.

Fig. 4 shows the brain functional networks estimated by the aforementioned methods. Fig. 4(a) is obtained by the proposed DDCL model and shows the temporal average EC of the group-level dEC for each group. The average EC reflects the mean value of the dEC over the entire duration, which can, to a certain extent, show significant differences in dynamic information flow between children and young adults. Figs. 4(b) and (c) are obtained by cMLP and cLSTM, which are static EC methods; thus, Fig. 4 does not present the "dynamic" aspect directly. Notably, DDCL discerns sparser EC patterns than cMLP and cLSTM and more effectively delineates the differences in brain functional networks between children and young adults. Figs. 4(d)-(f) show the brain FC networks obtained by three correlation-based methods. While these methods discern differences in brain networks between groups, FC merely indicates a statistical dependence among ROIs without ascertaining the directionality of information flow. In contrast, DDCL identifies the brain EC network, which reflects the direction of information flow between ROIs. Additionally, DDCL can infer dEC, which further shows the evolution of information transfer patterns over time in the brain.

Fig. 4.


Brain functional networks of children and young adults learned by different methods. The upper subgraphs (a-c) show the ECs inferred by different causal discovery algorithms, and the lower subgraphs (d-f) show the FCs inferred by different correlation coefficient methods. DDCL outperforms cMLP and cLSTM in revealing the essential differences between the two groups. Compared with the FC networks, the EC network inferred by DDCL is sparser and reflects the direction of information flow among ROIs.

To further illustrate the competitiveness of DDCL, the structural similarity (SSIM), cosine similarity (CS), Jensen-Shannon divergence (JSD), and structural Euclidean distance (SED) are used as evaluation metrics. SSIM and CS quantify the similarity of two causal matrices [45], and JSD and SED reflect the disparities in connections between children and young adults [46]. Let $E_c$ and $E_a$ represent the EC networks of children and young adults, respectively. The metrics are defined as follows.

$\mathrm{SSIM}(E_c, E_a) = \dfrac{(2\mu_c\mu_a + c_1)(2\sigma_{ca} + c_2)}{(\mu_c^2 + \mu_a^2 + c_1)(\sigma_c^2 + \sigma_a^2 + c_2)}$

$\mathrm{CS}(E_c, E_a) = \dfrac{\vec{E}_c \cdot \vec{E}_a}{\|\vec{E}_c\|\,\|\vec{E}_a\|}$

$\mathrm{JSD}(E_c, E_a) = \dfrac{1}{2}\mathrm{KL}(\vec{E}_c \,\|\, M_1) + \dfrac{1}{2}\mathrm{KL}(\vec{E}_a \,\|\, M_1)$

$\mathrm{SED}(E_c, E_a) = \dfrac{1}{M_2}\sum_i \dfrac{(E_{c,i} - E_{a,i})^2}{\sigma_i^2}$

where $\mu_c, \mu_a$ are the means of $E_c, E_a$; $\sigma_c, \sigma_a$ are their standard deviations and $\sigma_{ca}$ their covariance; and $c_1, c_2$ are saturation constants that contribute to numerical stability. The arrow denotes vectorization: if $A = (a_{ij})_{n \times n}$, then $\vec{A} = (a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{nn})^T \in \mathbb{R}^{n^2}$. $\mathrm{KL}(\cdot\,\|\,\cdot)$ is the Kullback-Leibler (KL) divergence between two distributions, $M_1 = (\vec{E}_c + \vec{E}_a)/2$, $\sigma_i$ is the standard deviation of $E_{c,i}, E_{a,i}$, and $M_2$ is the sum of the numbers of non-zero elements in $E_c$ and $E_a$.
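Two of these metrics, CS and JSD, can be sketched as follows; normalizing the absolute connectivity values into distributions for the KL terms is an assumption, since the text does not state how the vectorized matrices are converted to distributions:

```python
import numpy as np

def cos_sim(Ec, Ea):
    """CS: cosine similarity of the vectorized connectivity matrices."""
    u, v = Ec.ravel(), Ea.ravel()
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def jsd(Ec, Ea, eps=1e-12):
    """JSD with M1 = (p + q) / 2; the matrices are made nonnegative and
    normalized into distributions first (an assumption -- the text leaves
    the conversion to distributions implicit)."""
    p = np.abs(Ec).ravel(); p = p / p.sum()
    q = np.abs(Ea).ravel(); q = q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical networks give CS of 1 and JSD of 0; the more the two group networks diverge, the lower the CS and the higher the JSD, matching how the metrics are read in the comparison below.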

The comparison results of all methods are shown in Table V. DDCL achieves the best performance on all four metrics, which demonstrates its effectiveness. In detail, DDCL yields the smallest SSIM and CS values, showing that the dEC networks of children and young adults learned by DDCL have the lowest similarity, which helps reveal the essential difference in dECs between the two groups. The larger JSD means that the dEC networks of children and young adults learned by DDCL differ more markedly in distribution, and the larger SED reflects that the dEC networks of children are more distant from those of young adults in Euclidean space. These results demonstrate that DDCL can better characterize the differences between groups and contribute to further research on age-related changes in brain functional mechanisms.

TABLE V.

The differences of discovered brain EC networks between children and young adults by different methods

Method                              SSIM    CS      JSD     SED
Correlation-based methods
  Pearson-Corr                      0.844   0.979   0.002   1.960
  Kendall-Corr                      0.889   0.980   0.002   1.957
  Spearman-Corr                     0.865   0.981   0.003   1.957
Causality-based methods
  cMLP                              0.994   0.689   0.000   1.870
  cLSTM                             0.854   1.000   0.000   1.674
DDCL (proposed)                     0.090   0.310   0.076   5.242

Additionally, the proposed DDCL is compared with spatio-temporal learning methods that classify different groups by capturing correlations between brain regions: STGCN [27], AG-STGCN [28], and PTP-STGCN [29]. Four metrics are chosen to evaluate the classification ability of each model: classification accuracy (ACC), precision, recall, and F1-score. These experiments are conducted using 10-fold cross-validation, and the average performance results are presented in Fig. 5. The proposed DDCL model outperforms the other methods.
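For reference, the four classification metrics above can all be derived from confusion-matrix counts. The following minimal NumPy sketch (the function name `binary_scores` and the choice of positive class are illustrative, not from the paper) treats one group, e.g. young adults, as the positive label 1.

```python
import numpy as np


def binary_scores(y_true, y_pred):
    # Accuracy, precision, recall, and F1 from binary confusion counts,
    # treating label 1 as the positive class.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    acc = np.mean(y_true == y_pred)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

In a 10-fold cross-validation setting, these scores would be computed per fold on the held-out subjects and then averaged.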

Fig. 5. Comparison of classification performance.

The proposed DDCL demonstrates superior performance over the comparison methods on both simulated and real fMRI data. Its excellent performance mainly stems from two points. On the one hand, in contrast to causality-based methods, the effective use of spatio-temporal information and sparse constraints enables DDCL to reveal the essential differences in dEC between children and young adults. On the other hand, compared with correlation-based methods, the brain dEC networks identified by DDCL are asymmetric matrices that reflect the direction of information flow between brain regions, which allows biological mechanisms related to brain development to be discovered more accurately.

3). Analysis of dEC differences between two groups:

To further understand the dEC patterns, the 264 ROIs were divided into 13 resting-state networks (RSNs) [42]. Twelve of these RSNs have well-characterized functions: the cingulo-opercular task control network (COTCN), sensory/somatomotor network (SSN), default mode network (DMN), auditory network (AN), visual network (VN), frontoparietal task control network (FPTCN), memory retrieval network (MRN), subcortical network (SCN), salience network (SN), dorsal attention network (DAN), ventral attention network (VAN), and cerebellar network (CN); they relate to language, memory, perception of movement, cognition, vision, and other brain functions. The thirteenth, the uncertain network (UN), consists of 28 ROIs and exhibits only weak correlations with the other RSNs [47].

Fig. 6(a) shows the distributions of dECs among RSNs in children and young adults. There are a large number of dECs within SSN, DMN, and VN for both groups, whereas there are very few connections within MRN and CN. The number of dECs among RSNs is significantly decreased in young adults compared to children, i.e., children's brains show more dispersed dEC patterns, which is consistent with [48]. Additionally, the connectivity between the 12 RSNs with well-defined functions and UN is weaker in young adults than in children. This suggests that brain functional networks gradually become specialized as the brain develops.

Fig. 6. (a) The upper subgraph represents the distribution of dECs of children in RSNs; the bottom subgraph represents the distribution of dECs of young adults in RSNs. (b) The upper subgraph represents the distribution of significantly enhanced dECs in young adults compared to children; the bottom subgraph represents the distribution of significantly weakened dECs in young adults. (c) In these subgraphs, the y-axis represents the information flow intensity. The upper subgraph represents differences in information inflow and outflow of each RSN in children; the middle subgraph represents differences in information inflow and outflow of each RSN in young adults; the bottom subgraph represents the difference in information self-flow in each RSN between children and young adults.

For the dECs inferred by DDCL, the hypothesis testing method in [49] is used to determine whether the observed changes in dEC are significant; details of the hypothesis tests can be found in the Appendix. After testing, 3012 dECs significantly enhanced with age and 8397 significantly weakened with age were observed, so the number of decreased connections exceeds that of the increased ones.

The upper subfigure of Fig. 6(b) shows the dECs enhanced within and among RSNs with age. Significantly enhanced dECs appear within SSN, DMN, VN, and FPTCN. The lower subfigure of Fig. 6(b) shows the significantly weakened dECs within and among RSNs with age. Weakened connections appear within DMN, SSN, VN, and FPTCN, and between DMN and other RSNs, such as SSN, VN, FPTCN, SN, VAN, and UN, including both bidirectional and unidirectional connections. Weakened dECs between FPTCN and other RSNs, such as VN, SSN, SN, and UN, can also be observed. In short, both weakened and enhanced dECs occur within RSNs, but most dECs among RSNs are weakened, indicating that redundant connectivity in children's brain networks gradually disappears and more integrated brain functional networks develop with age.

The upper and middle subfigures in Fig. 6(c) show the differences in information inflow and outflow intensity of each RSN for children and young adults, respectively. The blue bars represent the information inflow intensity, and the orange bars the outflow intensity. For children, information inflow exceeds outflow in SSN, DMN, FPTCN, VN, and SCN, whereas in young adults the inflow intensity is weaker than the outflow in those RSNs. Moreover, children show more information outflow than inflow in AN, SN, and VAN, while in young adults the inflow and outflow intensities are roughly balanced. The bottom subfigure shows the differences in dECs within each RSN between children and young adults. The light blue and dark blue bars represent the intensity of dECs within each RSN for children and young adults, respectively; children have stronger dECs within each RSN than young adults.
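The inflow, outflow, and self-flow summaries discussed above can be read directly off an asymmetric EC matrix. The NumPy sketch below is illustrative, not the authors' code: the function name `rsn_flows` and the edge convention ec[i, j] = influence of region i on region j are assumptions, and each ROI carries an RSN label.

```python
import numpy as np


def rsn_flows(ec, labels):
    # Summarize directed EC at the RSN level. Assumed convention:
    # ec[i, j] is the causal influence of region i on region j, so for
    # a network n, outflow sums edges leaving n, inflow sums edges
    # entering n, and self-flow sums edges with both endpoints in n.
    labels = np.asarray(labels)
    flows = {}
    for n in np.unique(labels):
        inside = labels == n
        self_flow = np.abs(ec[np.ix_(inside, inside)]).sum()
        outflow = np.abs(ec[np.ix_(inside, ~inside)]).sum()
        inflow = np.abs(ec[np.ix_(~inside, inside)]).sum()
        flows[n] = (inflow, outflow, self_flow)
    return flows
```

Comparing the per-RSN tuples between the two groups gives exactly the kind of bar-chart summary shown in Fig. 6(c).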

4). Analysis of dEC changes over time:

To explore the differences in time-varying dEC patterns between the two groups, the variance of dEC intensity over time is employed. Specifically, Fig. 7(a) shows the variance of dEC over time within and between ROIs. Compared with young adults, children show stronger fluctuations of dEC between ROIs, and the number of dECs with significant fluctuations is much larger in children. Fig. 7(b) shows the variance of dEC over time within and between RSNs. In children, fluctuations of dEC are found within DMN, SSN, VN, and FPTCN; in addition, there are fluctuations of bidirectional dEC between DMN and other RSNs, such as SSN, VN, FPTCN, SN, SCN, and UN. Unlike children, young adults show only extremely slight fluctuations, found between SSN and AN and between VN and AN, with little fluctuation of dEC within and among the other RSNs.
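The fluctuation measure used here, the variance of dEC intensity over time, can be sketched for a (time x region x region) dEC array; this is an illustrative NumPy snippet rather than the authors' implementation.

```python
import numpy as np


def dec_fluctuation(dec):
    # dec: array of shape (T, N, N), the dEC matrix at each time step.
    # The variance along the time axis yields one fluctuation score per
    # directed connection; large values mark edges whose causal
    # strength varies strongly over the scan.
    return np.var(dec, axis=0)
```

Averaging these edge-wise scores within and between RSN blocks produces the network-level fluctuation maps of Fig. 7(b).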

Fig. 7. (a) Fluctuation of EC over time between ROIs in children and young adults. (b) Fluctuation of EC over time between RSNs in children and young adults.

Based on Fig. 7(b), the top two RSNs with significant fluctuations are SSN and DMN. To illustrate the temporal changes of dEC more clearly, Fig. 8 shows the mean and 95% confidence interval across subjects of the information self-flow, inflow, and outflow intensity between SSN and other RSNs, and between DMN and other RSNs. To reduce the asynchronous effects of rs-fMRI between children and young adults, our main focus is on overall trends and differential characteristics at the group level. It can be observed that the information flow direction remains unchanged during the entire scanning period in both children and young adults. For young adults, whether for SSN or DMN, the distribution of information flow intensity across subjects is very concentrated, whereas for children there are strong fluctuations of information flow intensity between subjects. This observation suggests that with the growth from childhood to young adulthood, functional networks become mature and specialized and can transmit information more efficiently and steadily. Moreover, for children, the information inflow exceeds the outflow between SSN and other RSNs and between DMN and other RSNs, whereas in young adults the inflow is weaker than the outflow between those two RSNs and the others. This is consistent with what was presented in Fig. 6(c) using average EC. Additionally, the information self-flow of SSN and DMN is stronger in children than in young adults, which is also consistent with the results shown in Fig. 6(a). These observations mean that similar significant differences can be revealed by both average EC and dEC.

Fig. 8. (a) The mean and 95% confidence interval of information self-flow, inflow, and outflow intensity between SSN and other RSNs across subjects. (b) The mean and 95% confidence interval of information self-flow, inflow, and outflow intensity between DMN and other RSNs across subjects. The x-axis represents scan time, and the y-axis represents the information flow intensity.

IV. Discussion

In this study, we presented a deep dynamic causal learning method to investigate the differences in dynamic effective connectivity between children and young adults. The dECs with significant differences are mainly observed within or among SSN, VN, DMN, FPTCN, SCN, and SN, which are strongly related to cognition, working memory, emotion, information processing, and vision.

DMN is a functional network consisting primarily of the lateral temporal lobe, medial frontal lobe, posterior cingulate gyrus, and hippocampus. It is particularly associated with memory, as well as other cognitive and psychological functions [50]. Studies have shown that DMN is continuously activated during rest, and its activity never stops [51]. SSN includes the cingulate gyrus, precuneus, superior frontal gyrus, and the pre- and postcentral gyri, which are mainly involved in emotional, sensory, and cognitive activities [52]. FPTCN mainly comprises the frontal gyrus and superior parietal lobule, which are chiefly responsible for working memory maintenance, complex problem-solving, and cognitive tasks. It also plays a significant role in creativity tasks because it relates to top-down cognitive and executive control [53]. SN mainly includes the cingulate gyrus, insula, supramarginal gyrus, and paracentral lobules. In this network, task-relevant information and the physical characteristics of a stimulus are used to judge its salience and to regulate attention [54]. VN, which comprises the cuneus, lingual gyrus, and middle occipital gyrus, is involved in processing visual information [55]. SCN includes the lentiform nucleus, thalamus, and extranuclear regions; according to [56], it is crucial for memory, attention, perception, and consciousness.

According to our results, there are a large number of dECs within DMN, as well as between DMN and other RSNs, in both children and young adults. The weakened dECs outweigh the enhanced ones as the brain develops, consistent with conclusions from earlier studies [57]. In particular, our analysis reveals weakened dECs between DMN and SN with development. SN and DMN collaborate to constitute the allostatic-interoceptive brain system, which governs energy regulation inside the body and encompasses various psychological processes, including decision-making, memory, and emotional and pain processing [58]. Previous research has demonstrated that a more defensive brain organization of the allostatic-interoceptive system corresponds to increased connectivity between DMN and SN [59]. Our findings show that the connectivity between DMN and SN in the resting state is stronger in children than in young adults, suggesting that children exhibit a more defensive brain organization. Furthermore, the dECs between DMN and FPTCN weaken as the brain develops. Past research has linked weakened connectivity between DMN and FPTCN to enhanced reading skills during the developmental period [60]. This consistent finding suggests that as the brain develops, functional networks progressively specialize, and higher-order cognition is inversely correlated with the intensity of DMN-FPTCN connectivity. Additionally, we found weakened functional connectivity between DMN and SCN with age. Studies have shown that reduced connectivity associated with SCN indicates a general decrease in cortical functional connectivity during development [61].

Apart from the aforementioned differences between DMN and some RSNs, weakened dECs between FPTCN and SSN, and between FPTCN and VN, can also be observed with age. This may be due to FPTCN's participation in activities such as working memory maintenance, complex problem-solving, and cognitive processes. During childhood, since the autonomous thinking system is not yet fully established, the completion of such cognitive activities mostly depends on somatosensory, auditory, and task-related cues. However, young adults have a better-developed cognitive system, displaying less dependence on external factors such as sounds and tasks, and possessing a more distinct understanding of personal evaluation. This discovery is consistent with previous assertions [62]. Furthermore, prior studies have shown weakened connectivity between FPTCN and SSN during development, whereas children show increased connectivity between these two RSNs [63]. This is mostly ascribed to the progressive development of intricate cognitive processes with age, while redundant connections persist in the brain networks of children.

By investigating the time-varying dEC patterns of children and young adults, it can be observed that the fluctuations in dEC between RSNs are more significant in children than in young adults, indicating that the brain dEC networks of young adults are more stable. Furthermore, the most significant fluctuations occur in DMN, which is involved in introspective thought and self-reflection [50], [64], including reflecting on the past and envisioning the future [65]. For young adults, the distribution of information flow within DMN is more concentrated and stable across subjects, demonstrating that DMN matures in young adulthood and that higher-order cognition related to internal mentation develops gradually. Additionally, the distribution of information flow intensity in SSN is more concentrated in young adults than in children. SSN is primarily responsible for interacting with the external environment [66]; this indicates that the SSN of young adults is more specialized and can process external information more efficiently.

In summary, our research demonstrates that the resting brain dEC networks of young adults are more stable than those of children, and that information transmission patterns change with age. Furthermore, young adults demonstrate more mature higher-order cognition involving internal mentation and an increasing capacity to process external information. Additionally, most dECs weaken with development: young adults present more focused and organized dEC patterns, whereas children show more diffuse patterns [67]. This indicates that the brain network changes from undifferentiated to more specialized structures with age [68].

V. Conclusion

In this study, the dEC of the brain is investigated to reveal the change in information transmission patterns between brain areas over time. We propose a DDCL model to identify dynamic causal relationships among brain areas from fMRI data in an unsupervised manner. Specifically, a dynamic causal learner identifies dynamic causality by incorporating spatio-temporal information, and a dynamic causal discriminator validates the accuracy of the learned dynamic causality. DDCL is applied to both simulated and real PNC data to discover dEC networks. The analyses on simulated data show that DDCL identifies dEC more accurately than baseline methods. The experimental results on PNC data demonstrate that DDCL can delineate the changes in dEC between children and young adults. Our findings indicate that the brain dEC networks of young adults manifest greater stability than those of children, coupled with pronounced disparities in information transfer patterns between the two groups. In addition, higher-order cognition involving internal mentation progressively develops with growth, and the brain functional networks of young adults can process external information more efficiently. In particular, our results suggest a developmental transition of brain networks from undifferentiated to more specialized and organized structures as age progresses.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Nos. 12271429, 12090021, and 12226007), the Natural Science Basic Research Program of Shaanxi (No. 2022JM-005) and was partly supported by the National Institutes of Health (R01 MH104680, R01 GM109068, R01 MH121101, R01 MH116782, R01 MH118013 and P20-GM144641). (Corresponding authors: Chen Qiao and Yu-Ping Wang.)

Appendix

The hypothesis testing of significant changes

For each pair of ROIs, the increase or decrease in effective connectivity between the two groups is assessed with hypothesis tests. First, an F-test is used to check whether there is a significant difference in variance between children and young adults. Then, based on the F-test result, a t-test is used to check whether there is a significant difference in mean between the two groups: if the variances differ significantly, the t-test for unequal variances (Welch's t-test) is used; otherwise, the t-test for equal variances is used.

In the experiments, the significance level is set to 0.01. When the p-value is below 0.01, we conclude that there is a significant difference in dEC between the two groups for this pair of ROIs. Moreover, if the difference of the group means, M = μ1 − μ2, is positive, then there is an increased effective connection with value M; similarly, a decreased effective connection is defined when M < 0.
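A minimal sketch of this two-step procedure for one ROI pair is given below, using SciPy; the function name `ec_change` and the return convention (0 for no significant change, otherwise the signed mean difference M) are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy import stats


def ec_change(x1, x2, alpha=0.01):
    # Two-step test for one ROI pair: an F-test first decides whether
    # the group variances differ; that result selects Student's
    # (equal-variance) or Welch's (unequal-variance) t-test on means.
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    f = np.var(x1, ddof=1) / np.var(x2, ddof=1)
    dfn, dfd = len(x1) - 1, len(x2) - 1
    # Two-sided p-value for the F-test on the variance ratio.
    p_var = 2 * min(stats.f.sf(f, dfn, dfd), stats.f.cdf(f, dfn, dfd))
    equal_var = p_var >= alpha
    t, p_mean = stats.ttest_ind(x1, x2, equal_var=equal_var)
    if p_mean >= alpha:
        return 0.0  # no significant change in this connection
    return float(np.mean(x1) - np.mean(x2))  # signed magnitude M
```

Applying this to every ROI pair and counting positive and negative returns yields the enhanced and weakened connection counts reported in the results.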

Contributor Information

Yingying Wang, School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049 P.R. China.

Chen Qiao, School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049 P.R. China.

Gang Qu, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118 USA.

Vince D. Calhoun, Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA 30303.

Julia M. Stephen, Mind Research Network, Albuquerque, NM 87106.

Tony W. Wilson, Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE 68010.

Yu-Ping Wang, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118 USA.

REFERENCES

[1] Gosak M et al., "Network science of biological systems at different scales: A review," Phys. Life Rev., vol. 24, pp. 118–135, 2018.
[2] Majhi S et al., "Chimera states in neuronal networks: A review," Phys. Life Rev., vol. 28, pp. 100–121, 2019.
[3] Ma Z-Z et al., "Tracking whole-brain connectivity dynamics in the resting-state fMRI with post-facial paralysis synkinesis," Brain Res. Bull., vol. 173, pp. 108–115, 2021.
[4] Tokuda T et al., "Multiple clustering for identifying subject clusters and brain sub-networks using functional connectivity matrices without vectorization," Neural Netw., vol. 142, pp. 269–287, 2021.
[5] Gong J et al., "Dual temporal and spatial sparse representation for inferring group-wise brain networks from resting-state fMRI dataset," IEEE Trans. Biomed. Eng., vol. 65, no. 5, pp. 1035–1048, 2018.
[6] van den Heuvel MP and Hulshoff Pol HE, "Exploring the brain network: A review on resting-state fMRI functional connectivity," Eur. Neuropsychopharmacol., vol. 20, no. 8, pp. 519–534, 2010.
[7] Friston KJ, "Functional and effective connectivity: a review," Brain Connect., vol. 1, no. 1, pp. 13–36, 2011.
[8] Ji J et al., "A survey on brain effective connectivity network learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 34, pp. 1879–1899, 2021.
[9] Samdin SB et al., "A unified estimation framework for state-related changes in effective brain connectivity," IEEE Trans. Biomed. Eng., vol. 64, no. 4, pp. 844–858, 2017.
[10] Liao W et al., "Kernel Granger causality mapping effective connectivity on fMRI data," IEEE Trans. Med. Imaging, vol. 28, no. 11, pp. 1825–1835, 2009.
[11] Deshpande G and Hu X, "Investigating effective brain connectivity from fMRI data: Past findings and current issues with reference to Granger causality analysis," Brain Connect., vol. 2, no. 5, pp. 235–245, 2012.
[12] Sanchez-Romero R et al., "Estimating feedforward and feedback effective connections from fMRI time series: Assessments of statistical methods," Netw. Neurosci., vol. 3, no. 2, pp. 274–306, 2019.
[13] Ting C-M et al., "Estimating effective connectivity from fMRI data using factor-based subspace autoregressive models," IEEE Signal Process. Lett., vol. 22, no. 6, pp. 757–761, 2015.
[14] Kim J et al., "Unified structural equation modeling approach for the analysis of multisubject, multivariate functional MRI data," Hum. Brain Mapp., vol. 28, no. 2, pp. 85–93, 2007.
[15] Wei S et al., "Analysis of weight-directed functional brain networks in the deception state based on EEG signal," IEEE J. Biomed. Health Inform., vol. 27, no. 10, pp. 4736–4747, 2023.
[16] Liu J et al., "Inferring effective connectivity networks from fMRI time series with a temporal entropy-score," IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 10, pp. 5993–6006, 2022.
[17] Rajapakse JC et al., "Probabilistic framework for brain connectivity from functional MR images," IEEE Trans. Med. Imaging, vol. 27, no. 6, pp. 825–833, 2008.
[18] Tank A et al., "Neural Granger causality," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 8, pp. 4267–4279, 2021.
[19] Nauta M et al., "Causal discovery with attention-based convolutional neural networks," Mach. Learn. Knowl. Extr., vol. 1, no. 1, pp. 312–340, 2019.
[20] Kipf T et al., "Neural relational inference for interacting systems," in ICML, 2018, pp. 2688–2697.
[21] Robinson JW and Hartemink AJ, "Learning non-stationary dynamic Bayesian networks," J. Mach. Learn. Res., vol. 11, pp. 3647–3680, 2010.
[22] Grzegorczyk M and Husmeier D, "A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology," Stat. Appl. Genet. Mol. Biol., vol. 11, no. 4, 2012.
[23] Grzegorczyk M and Husmeier D, "Regularization of non-homogeneous dynamic Bayesian networks with global information-coupling based on hierarchical Bayesian models," Mach. Learn., vol. 91, pp. 105–154, 2013.
[24] Kamalabad MS and Grzegorczyk M, "Non-homogeneous dynamic Bayesian networks with edge-wise sequentially coupled parameters," Bioinformatics, vol. 36, pp. 1198–1207, 2019.
[25] Liu J et al., "Estimating brain effective connectivity in fMRI data by non-stationary dynamic Bayesian networks," in Proc. IEEE Int. Conf. Bioinf. Biomed. (BIBM), 2019, pp. 834–839.
[26] Shinn M et al., "Functional brain networks reflect spatial and temporal autocorrelation," Nat. Neurosci., vol. 26, no. 5, pp. 867–878, 2023.
[27] Gadgil S et al., "Spatio-temporal graph convolution for resting-state fMRI analysis," in MICCAI, 2020.
[28] Jiang M et al., "Anatomy-guided spatio-temporal graph convolutional networks (AG-STGCNs) for modeling functional connectivity between gyri and sulci across multiple task domains," IEEE Trans. Neural Netw. Learn. Syst., pp. 1–11, 2022.
[29] Lian J et al., "PTP-STGCN: Pedestrian trajectory prediction based on a spatio-temporal graph convolutional neural network," Appl. Intell., vol. 53, pp. 2862–2878, 2023.
[30] Bai S et al., "An empirical evaluation of generic convolutional and recurrent networks for sequence modeling," arXiv:1803.01271, 2018.
[31] Velickovic P et al., "Graph attention networks," in ICLR, 2018.
[32] Shimizu S et al., "A linear non-Gaussian acyclic model for causal discovery," J. Mach. Learn. Res., vol. 7, pp. 2003–2030, 2006.
[33] Ji J et al., "Estimating effective connectivity by recurrent generative adversarial networks," IEEE Trans. Med. Imaging, vol. 40, no. 12, pp. 3326–3336, 2021.
[34] Olshausen BA and Field DJ, "Sparse coding of sensory inputs," Curr. Opin. Neurobiol., vol. 14, no. 4, pp. 481–487, 2004.
[35] Qiao C et al., "Log-sum enhanced sparse deep neural network," Neurocomputing, vol. 407, pp. 206–220, 2020.
[36] Smith SM et al., "Network modelling methods for fMRI," Neuroimage, vol. 54, no. 2, pp. 875–891, 2011.
[37] Granger CWJ, "Investigating causal relations by econometric models and cross-spectral methods," Econometrica, vol. 37, no. 3, pp. 424–438, 1969.
[38] Schreiber T, "Measuring information transfer," Phys. Rev. Lett., vol. 85, no. 2, p. 461, 2000.
[39] Webb E et al., "Factorised neural relational inference for multi-interaction systems," arXiv:1905.08721, 2019.
[40] Cantone I et al., "A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches," Cell, vol. 137, no. 1, pp. 172–181, 2009.
[41] Satterthwaite TD et al., "Neuroimaging of the Philadelphia Neurodevelopmental Cohort," Neuroimage, vol. 86, pp. 544–553, 2014.
[42] Power J et al., "Functional network organization of the human brain," Neuron, vol. 72, no. 4, pp. 665–678, 2011.
[43] Supekar K et al., "Development of large-scale functional brain networks in children," PLoS Biol., vol. 7, 2009.
[44] Grzegorczyk M and Husmeier D, "Non-homogeneous dynamic Bayesian networks for continuous data," Mach. Learn., vol. 83, pp. 355–419, 2011.
[45] Wang Z et al., "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.
[46] Lin J, "Divergence measures based on the Shannon entropy," IEEE Trans. Inf. Theory, vol. 37, no. 1, pp. 145–151, 1991.
[47] Engel J et al., "Connectomics and epilepsy," Curr. Opin. Neurol., vol. 26, no. 2, pp. 186–194, 2013.
[48] He L et al., "Decreased dynamic segregation but increased dynamic integration of the resting-state functional networks during normal aging," Neuroscience, vol. 437, pp. 54–63, 2020.
[49] Qiao C et al., "A deep autoencoder with sparse and graph Laplacian regularization for characterizing dynamic functional connectivity during brain development," Neurocomputing, vol. 456, pp. 97–108, 2021.
[50] Raichle ME et al., "A default mode of brain function," Proc. Natl. Acad. Sci. U. S. A., vol. 98, no. 2, pp. 676–682, 2001.
[51] Spreng RN et al., "The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis," J. Cogn. Neurosci., vol. 21, no. 3, pp. 489–510, 2009.
[52] Londei A et al., "Sensory-motor brain network connectivity for speech comprehension," Hum. Brain Mapp., vol. 31, 2009.
[53] Pillay S et al., "Perceptual demand and distraction interactions mediated by task-control networks," Neuroimage, vol. 138, pp. 141–146, 2016.
[54] Seeley WW, "The salience network: a neural system for perceiving and responding to homeostatic demands," J. Neurosci., vol. 39, no. 50, pp. 9878–9882, 2019.
[55] Corbetta M et al., "The reorienting system of the human brain: From environment to theory of mind," Neuron, vol. 58, no. 3, pp. 306–324, 2008.
[56] Kang J et al., "Energy landscape analysis of the subcortical brain network unravels system properties beneath resting state dynamics," Neuroimage, vol. 149, pp. 153–164, 2017.
[57] Cai B et al., "Estimation of dynamic sparse connectivity patterns from resting state fMRI," IEEE Trans. Med. Imaging, vol. 37, no. 5, pp. 1224–1234, 2018.
[58] Kleckner IR et al., "Evidence for a large-scale brain system supporting allostasis and interoception in humans," Nat. Hum. Behav., vol. 1, no. 5, p. 0069, 2017.
[59] Kozlowska K et al., "'Motoring in idle': The default mode and somatomotor networks are overactive in children and adolescents with functional neurological symptoms," NeuroImage Clin., vol. 18, pp. 730–743, 2018.
[60] Jolles DD et al., "Relationships between intrinsic functional connectivity, cognitive control, and reading achievement across development," Neuroimage, vol. 221, p. 117202, 2020.
[61] Allen EA et al., "A baseline for the multivariate comparison of resting-state networks," Front. Syst. Neurosci., vol. 5, 2011.
[62] Broyd SJ et al., "Default-mode brain dysfunction in mental disorders: A systematic review," Neurosci. Biobehav. Rev., vol. 33, no. 3, pp. 279–296, 2009.
[63] Cai B et al., "Estimation of dynamic sparse connectivity patterns from resting state fMRI," IEEE Trans. Med. Imaging, vol. 37, pp. 1224–1234, 2018.
[64] Qin P and Northoff G, "How is our self related to midline regions and the default-mode network?" Neuroimage, vol. 57, no. 3, pp. 1221–1233, 2011.
[65] Buckner RL et al., "The brain's default network," Ann. N.Y. Acad. Sci., vol. 1124, 2008.
[66] Fuster JM, "The prefrontal cortex—an update: time is of the essence," Neuron, vol. 30, pp. 319–333, 2001.
[67] Kelly AC et al., "Development of anterior cingulate functional connectivity from late childhood to early adulthood," Cereb. Cortex, vol. 19, no. 3, pp. 640–657, 2009.
[68] Rombouts SARB, "A comprehensive study of whole-brain functional connectivity in children and young adults," Cereb. Cortex, vol. 21, no. 2, pp. 385–391, 2011.
