Abstract
Connectivity studies of the brain are usually based on functional Magnetic Resonance Imaging (fMRI) experiments involving many subjects. These studies need to take into account not only the interaction between areas of a single brain but also the differences amongst those subjects. In this paper we develop a methodology called the group-structure (GS) approach that models possible heterogeneity between subjects and searches for distinct homogeneous sub-groups according to some measure that reflects the connectivity maps. We suggest a GS method that uses a novel distance based on a model selection measure, the Bayes factor. We then develop a new class of Multiregression Dynamic Models to estimate individual networks whilst acknowledging a GS type dependence structure across subjects. We compare the efficacy of this methodology to three other methods, virtual-typical-subject (VTS), individual-structure (IS) and common-structure (CS), used to infer a group network using both synthetic and real fMRI data. We find that the GS approach provides results that are both more consistent with the data and more flexible in their interpretative power than its competitors. In addition, we present two methods, the Individual Estimation of Multiple Networks (IEMN) and the Marginal Estimation of Multiple Networks (MEMN), generated from the GS approach and used to estimate all types of networks informed by an experiment: individual, homogeneous-subgroup and group networks. These methods are then compared both from a theoretical perspective and in practice using real fMRI data.
Keywords: Multiregression Dynamic Model, Bayesian network, Group analysis, Cluster analysis, Functional Magnetic Resonance Imaging (fMRI)
Highlights
- The group-structure (GS) approach models possible heterogeneity between subjects.
- The search for distinct homogeneous subgroups is based on the connectivity maps.
- All four methods for estimating a group network were compared using synthetic and real fMRI data.
- IEMN and MEMN estimate individual, homogeneous-subgroup and group networks.
1. Introduction
Functional MRI measures changes in metabolism that occur in brain tissue. Using fMRI, it is possible to locate the areas of the brain which are responsible, for example, for memory, language and hearing. This technique can now provide an image of the brain which has both good spatial and temporal resolution. Furthermore, measurements can now be obtained both safely and noninvasively (Poldrack et al., 2011). Although fMRI experiments are widely used to understand how the brain works in response to certain tasks, they also provide a very useful tool to help understand the working of the brain when a person is resting (Raichle, 2010). These mechanisms are important to understand because they provide a backcloth against which to measure the impact of activity on brain function. These resting-state experiments are conducted by having the subject remain in a state of quiet repose and will be the focus of this paper.
One way to understand the brain during this resting-state is to study the connectivity among different cerebral areas, i.e., to assess how one neural system influences another (Smith et al., 2011; Friston, 2011; Poldrack et al., 2011). The interpretation of this connectivity depends strongly on the statistical procedure used to estimate it. For instance, functional connectivity is defined as the correlation among measurements of neuronal activity of different areas (Friston, 2011). A correlation between two regions, however, does not imply that one region directly influences the other. Effective connectivity is therefore defined on the basis of hypotheses concerning potential causal relationships in the brain (e.g. Spirtes et al., 2000; Pearl, 2000). Mechelli et al. (2002) clarify the difference between functional and effective connectivity by stating that functional connectivity is defined as the temporal correlations among neurophysiological events in different neural systems, whereas effective connectivity is defined as the influence that one neural system exerts over another.
Connectivities are usually studied using different classes of graphical model (see e.g. Sporns, 2011). Such a graph consists of nodes and edges, in which the latter represent connectivity between pairs of nodes sited at a voxel or a defined brain region. Once such nodes have been defined, an integration study, e.g. a Bayesian network (BN; Korb and Nicholson, 2004), can be used to find edges. These edges are generally either undirected, when the model simply reflects some dependence of measurements at two connected sites, or directed, where the receiving node (the child) is hypothesised to be influenced by the donating node (the parent). Directed acyclic graphs (DAGs) are graphs whose edges are all directed and in which no directed path starts and ends at the same node; see the examples of DAGs in Fig. 1. The focus of this paper will be on studying effective connectivity through a particular class of DAG models.
Fig. 1.
Data were simulated from three different graphical structures: DAG1, DAG2 and DAG3 (first row). The GS approach found three groups; the estimated graphs are in the second row. The estimated DAGs for the VTS, CS and IS approaches are in the third row.
One useful family of statistical models that has recently been used to study brain connectivity is the Multiregression Dynamic Model (MDM; Costa et al., 2015). In this model the connections are represented by parameters that vary over time. As a result, the MDM generalises the class of BNs into a useful class where connectivity strengths between two nodes are hypothesised to change over time. In contrast to most methods, which use Granger causality to estimate effective connectivity, in this model the edges denote direct contemporaneous relationships that might exist between nodes.
The MDM is not only flexible: it admits a conjugate analysis, and its predictive likelihood has a closed form, which allows us to perform fast model selection. In Costa et al. (2015) we provide an efficient search over this large class of networks based on an integer programming algorithm. The so-called Multiregression Dynamic Model-Integer Programming Algorithm (MDM-IPA) performed well in detecting the presence of a network connection and also in distinguishing the directionality of the relationships between the brain regions. Here, we also search over another large class of models, the Multiregression Dynamic Model-Directed Graph Model (MDM-DGM) class, which does not demand the acyclic constraints required for the vanilla MDM and searches the larger class of directed graphs. This appears to perform even better in applications in neuroscience than the vanilla MDM, because, unlike the MDM-IPA, it is able to model the bidirectional communication between brain regions which typically exists in this domain. Applications of these search methods to real resting-state and task fMRI data can be seen, e.g., in Costa (2014), Costa et al. (2015), Harbord et al. (2016) and Costa et al. (2017).
However, connectivity studies are usually based on fMRI experiments using many subjects and not just one. Therefore, ideally, any analysis should take into account not only the interaction between the areas of a single brain but also the differences among subjects. Thus, in multi-subject experiments, there are generally three main analyses of interest: (i) to estimate individual networks; (ii) to compare them and verify whether the group of people is homogeneous (if it is not, explanatory variables such as gender, age and disease can be added to the model to explain the heterogeneity we discover); (iii) to estimate networks that represent homogeneous subgroups and the entire group of these individuals.
The purpose of this paper is to present some new approaches that address the three issues described above. For simplicity, we have organised this paper into two parts. In the first part, corresponding to Sections 2 to 5, we compare four methods used to estimate group networks. We also demonstrate the importance of doing this and explicitly how to study the homogeneity of a group. In standard fMRI studies different subjects can display very different relational maps of connectivity from each other. Therefore, whilst combining subjects can be a very helpful way to extract information about the whole group, any assumption that implies subjects are a posteriori exchangeable is very contentious in this domain. Fortunately, the algorithm we use to search for the best fitting models for a single subject can also be used as the basis of a separation measure for first clustering subjects into those exhibiting similar connectivity relationships between the regions of their brain. We demonstrate here that using this clustering method to first determine promising subgroups to combine performs much better than any current direct combination method.
The use of group analysis to assess the integration of activity in brain regions – as we perform here – has two advantages (Mechelli et al., 2002). First, it is possible to investigate directly which connections are different across subjects. For instance, some patients may not have a particular connection that exists in healthy people. Alternatively, the connection strength may vary according to age. Second, by allowing the measurements from each individual subject to be modelled, the additional degrees of freedom relative to, for example, typical-subject analyses can potentially improve the statistical power of any analysis we propose.
Most group analysis methods do not first verify whether or not the group of subjects is a sample from populations with different connectivity patterns. In contrast, the standard group-structure (GS) approaches aim to find homogeneous sub-groups according to the connectivity maps. Here we suggest an analysis of this type which uses a novel distance based on a model selection measure, the Bayes factor (Jeffreys, 1961). The first step in analysing such a group of subjects is to calculate and record the dynamic regression scores for all individuals so as to find MAP models within our classes. Of course, this is a huge task. However, once we have done this we can then use the scores we calculate to determine various types of aggregate graphs relatively cheaply. In this paper we show how, within our dynamic framework, such aggregate graphical models can be discovered which depict useful relationships both within and between the brain images of different subjects.
We note that although the GS approach developed here is applied to analyses using the class of MDM models, it can be used with any other probabilistic graphical model. We show, using both synthetic and real fMRI data, that the GS approach provides results that are more scientifically plausible and more sensitive to potential heterogeneities within a population than the virtual-typical-subject (VTS), individual-structure (IS) and common-structure (CS) approaches.
In this first part of the paper, therefore, we discuss methods used to infer only the group network. In the second part of this work, corresponding to Section 6, we extend this discussion to all types of networks informed by an experiment: individual, homogeneous-subgroup and group networks. We define the Individual Estimation of Multiple Networks (IEMN) approach, which is basically the GS approach shown in the first part, i.e. the individual networks are estimated independently whilst the subgroup networks, made up of homogeneous subjects, are estimated as a function of the same graphical structure for all individuals. We then compare this to a new approach called the Marginal Estimation of Multiple Networks (MEMN) that enables us to infer individual networks using the information of other subjects. This method searches both individual and subgroup networks considering a distance between homogeneous subjects, as shown below.
In theory, MEMN provides more precise results than IEMN because, when the graphical structure is unknown, the learning algorithm adds more uncertainty to the inferential process. In this sense, the use of information from other subjects when learning individual networks, as the MEMN does, can improve this estimation process. We note that Oates et al. (2014a) and Oates et al. (2014b) presented a method similar to MEMN, in which all the individual and subgroup networks are estimated simultaneously and denser graphs are penalised. However, their network learning is more complex and rather less transparent than the one we discuss here. IEMN and MEMN are compared both in theory and in practice using real fMRI data. A short description of, and some references for, all methods used in this work can be found in Appendix A.
This paper is structured as follows. Section 2 provides a brief literature review of some methods used for group analysis. Section 3 describes these group analysis methods in the context of the MDM. Section 4 compares the four methods using synthetic data whilst Section 5 analyses a group of subjects using real resting-state fMRI data. Section 6 presents the methods used to estimate both individual and group networks, IEMN and MEMN, and a comparison between them, and finally conclusions are given in Section 7.
2. A short review of group analysis methods
Many fMRI experiments with multiple subjects have been conducted recently. In general, four approaches which deal with multiple datasets may be found in the neuroimaging literature (e.g. Mechelli et al., 2002; Li et al., 2008; Ramsey et al., 2010; Gates and Molenaar, 2012). The first approach is the virtual-typical-subject (VTS). This approach ignores inter-subject variability, assuming that the information from different datasets comes from the same subject. This “typical subject” can be found by calculating the average of the observed variables for every node over subjects or by concatenating the datasets, so that methods designed for a single individual can be used (Zheng and Rajapakse, 2006; Rajapakse and Zhou, 2007; Li et al., 2008). Of course, when datasets are concatenated, the number of data points per node increases. Consequently there are more degrees of freedom for estimating the parameters. However, the assumptions underpinning this approach, i.e. that “variations in connectivity from subject to subject are random, well-behaved and uninteresting”, are not always true (Mechelli et al., 2002). Moreover, the variability of concatenated data may be significantly higher than individual variability whilst the variability of averaged data may be very much lower than the usual variability found for each subject. Therefore, the results of this group-based analysis may not reflect some of the features found in the individual context (Gonçalves et al., 2001). In addition, it is not possible to compare the interactions according to different characteristics, such as task performance or gender.
Gates and Molenaar (2012) give two reasons why VTS is not suitable for modelling a brain network. The first concerns the connectivity strength. It is generally accepted that this is expected to vary between subjects. However, some researchers want to study the relationship between this connectivity strength and disease level and this cannot be addressed using this method. Second, the communication pattern among brain regions may well differ from individual to individual. For instance, in a study of fMRI activation pattern related to writing, three of five regions showed inconsistent results across subjects (Sugihara et al., 2006). To address these problems a second method, common-structure (CS), and a third method, individual-structure (IS), have been proposed.
The CS approach considers the same network structure for all subjects but allows the parameters to differ between them. The connectivity strengths are expected to vary over subjects due to measurement error or to individual characteristics of the influences from one region to another. An example of this approach is the Independent Multiple-sample Greedy Equivalence Search (IMaGES), which uses BIC scores to find a Markov equivalence class, essentially summing the scores over subjects who are assumed to share the same graphical structure (Ramsey et al., 2010). Clearly the CS approach cannot, therefore, allow for the possibility that the pattern of connectivity may diverge over subjects. However, such a divergence may plausibly happen. For example, in a resting-state experiment, when people are free to think of anything, someone may use their memory whilst others may do mental arithmetic. Another reason why the assumptions on which such methods are based might be violated is that a group of patients may well have a disease with different degrees of severity. This could then result in different connections as well as different connectivity strengths arising between brain regions (Li et al., 2008).
The next approach, individual-structure (IS), drives the network learning process individually in each dataset, and the results are then pooled into a single network. The group network is usually formed by including the edges that exist for most subjects. An example of this approach is Oates (2013), who proposed an algorithm that first scored individuals and then constructed a “group network” by minimising the distance between the individual networks and this group network. Although the IS approach seems to cope well with the different interactions, its results are often inconsistent amongst individuals because subjects tend in practice to display obvious heterogeneities (Gonçalves et al., 2001; Mechelli et al., 2002). We show an example of this in Section 4.
It is not possible to say which method is generally superior to the others, since the interpretation of results depends directly on the assumptions of each method, which can vary across different populations and experiments (Li et al., 2008). However, it is fair to say that although some of these methods explicitly recognise the inter-subject variability, none of them assesses the homogeneity of the individual connectivity maps. If the approaches described above are applied to a heterogeneous group, then any conclusions based on the group network can obviously be misleading. A fourth method, the group-structure (GS) approach, aims to model such potential heterogeneity between subjects.
The GS approach studies group homogeneity through cluster analysis, using a particular measure of similarity between subjects. If this analysis suggests that subjects should be clustered into disjoint subgroups, then the group of subjects is not homogeneous. When heterogeneity is found to be present, any subsequent analyses of interest should be done for every subgroup independently. Note that no prior classification information is necessary to use these methods. For instance, Kherif et al. (2004) defined a separation measure between two subjects’ data based on a multivariate correlation. In another example, Gates (2012) estimated the effective connectivity of both individual and group networks using the Group Iterative Multiple model estimation (GIMME; Gates and Molenaar, 2012), proposing the correlation between connectivity strengths as the separation measure between subjects. In this paper, we propose an alternative separation measure between subjects as a function of a model selection criterion: the Bayes factor.
3. Group analysis using the multiregression dynamic model
In this section we first describe the MDM, a Bayesian dynamic regression model used to estimate effective connectivity. We then show how the VTS, CS and IS approaches can be applied in the context of the MDM. Finally, we present a novel GS approach that uses a distance based on a model selection criterion.
3.1. The multiregression dynamic model
The MDM is a class of multivariate time series models which embeds putative causal hypotheses among its variables over time (Queen and Smith, 1993; Queen and Albers, 2009). In the MDM, the multivariate model for the observable series is broken down into simpler univariate regression dynamic linear models (DLMs) so that the effective connectivity (the parents’ effect on a node) is allowed to vary across the period of an experiment. Formally, this model is described by the following observation equations, system equation and initial information (Queen and Smith, 1993):

$$Y_t(r) = F_t(r)^{\top}\,\theta_t(r) + v_t(r), \qquad v_t(r) \sim N\big(0, V_t(r)\big),$$
$$\theta_t = G_t\,\theta_{t-1} + w_t, \qquad w_t \sim N\big(0, W_t\big),$$

where $r = 1, \ldots, n$ indexes regions, $t = 1, \ldots, T$ indexes time points, $N$ denotes the Gaussian distribution, $\theta_t = \big(\theta_t(1)^{\top}, \ldots, \theta_t(n)^{\top}\big)^{\top}$, $\theta_t(r)$ is the $p_r$-dimensional parameter vector for node $r$, and $F_t(r)$ is a known linear function of the parents of node $r$ and assumed fixed. For nodes that do not have parents, $F_t(r) = 1$. In addition, $G_t(1), \ldots, G_t(n)$ are square matrices which form the block diagonal matrix $G_t$.
The initial information is defined as follows:

$$\big(\theta_0(r) \mid \phi(r), y^0\big) \sim N\big(m_0(r), C_0(r)\big), \qquad \big(\phi(r) \mid y^0\big) \sim G\big(a_0(r), b_0(r)\big),$$

where $\phi(r)$ is the observation precision, defined as the reciprocal of the observation variance, and $G$ represents the Gamma distribution. The initial information is the probabilistic representation of the previous knowledge about $\theta_0(r)$ given the information at time $t = 0$, i.e. $y^0$. The mean vector $m_0(r)$ is an initial estimate of the parameters whilst the variance–covariance matrix $C_0(r)$ represents the uncertainty about this mean and the relationships among the parameters. $W_t(r)$ is a square matrix and is defined through discount factors (West and Harrison, 1997). Because the equations of the MDM can be viewed as a collection of nested univariate DLMs, the parameters can be estimated using the well-known Kalman filter recurrences over time. Moreover, since the relationship between the observed variables and their parents is linear, the errors follow a Gaussian distribution and the precision has a Gamma distribution, the analysis is conjugate. This ensures parametric economy and identifiability of the model and, perhaps most usefully, a closed-form predictive likelihood for evaluating the fit of any potential connectivity graph.
Thus one of the most popular ways of comparing two models is to use the Bayes factor measure (Jeffreys, 1961; West and Harrison, 1997). This is defined as the ratio between the predictive likelihoods of two models, model $m_1$ and model $m_2$, say. The joint log predictive likelihood (LPL) is calculated as

$$\mathrm{LPL}(m) = \log p\big(y^{T} \mid m\big) = \sum_{t=1}^{T} \log p\big(y_t \mid y^{t-1}, m\big), \qquad (1)$$

where $y^{t}$ is the observed value of the time series up to time $t$, the conditional forecast distribution $p(y_t \mid y^{t-1}, m)$ has the closed form of a Student's t-distribution, and $m$ reflects the current choice of model that determines the relationships between regions (Costa et al., 2015). To compare model $m_1$ to model $m_2$, we use the log Bayes factor (BF), so that

$$\log \mathrm{BF} = \mathrm{LPL}(m_1) - \mathrm{LPL}(m_2).$$

Heuristically, the higher the score of a model, the better its fit to the observed data.
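To make the conjugate analysis concrete, the following is a minimal sketch, in R, of the filtering recurrences and the LPL of Eq. (1) for a single MDM node. It follows the standard discount-factor DLM recurrences with unknown observation precision (West and Harrison, 1997); the function name `mdm_node_lpl`, the default discount factor and the prior settings are illustrative assumptions rather than the authors' implementation.

```r
# Minimal sketch: closed-form filtering and log predictive likelihood (LPL)
# for one MDM node, using a discount-factor DLM with unknown precision.
mdm_node_lpl <- function(y, X, delta = 0.95, m0 = NULL, C0 = NULL,
                         n0 = 1, d0 = 1) {
  TT <- length(y); p <- ncol(X)
  if (is.null(m0)) m0 <- rep(0, p)
  if (is.null(C0)) C0 <- diag(3, p)
  m <- m0; C <- C0; n <- n0; d <- d0; s <- d / n
  lpl <- 0
  for (t in seq_len(TT)) {
    Ft <- X[t, ]                          # regression vector: parents' values (or 1)
    a  <- m; R <- C / delta               # prior at time t (identity G, discounting)
    f  <- sum(Ft * a)                     # one-step forecast location
    q  <- drop(t(Ft) %*% R %*% Ft) + s    # one-step forecast scale
    e  <- y[t] - f
    lpl <- lpl + dt(e / sqrt(q), df = n, log = TRUE) - 0.5 * log(q)  # Student-t forecast
    A  <- (R %*% Ft) / q                  # adaptive coefficient
    m  <- a + drop(A) * e                 # posterior mean update
    n  <- n + 1
    d  <- d + s * e^2 / q
    s_new <- d / n                        # updated estimate of the observation variance
    C  <- (s_new / s) * (R - A %*% t(A) * q)
    s  <- s_new
  }
  lpl
}

# Usage sketch: 'bold' is a T x n matrix of region time series; X holds a column of
# ones if node r has no parents, otherwise the observed series of its parents.
# lpl_r <- mdm_node_lpl(y = bold[, r], X = cbind(bold[, parents_of_r]))
```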
To search over networks we have used the recent MDM-IPA (Costa et al., 2015), which is an efficient search-and-score method using scores found through Eq. (1), the GOBNILP system (Cussens, 2011; Bartlett and Cussens, 2013) and the SCIP IP framework (Achterberg, 2007). In our case, for any candidate model $m$, $\mathrm{LPL}(m)$ is a sum of ‘local scores’, one for each node $r$, each corresponding to a model hypothesising causal relationships from a particular set of other nodes. The local score for node $r$ is thus determined by the choice of parent set specified by the model $m$. Let $s_r(\mathrm{pa}(r))$ denote this local score, so that $\mathrm{LPL}(m) = \sum_{r=1}^{n} s_r(\mathrm{pa}(r))$. We can now view model selection for the MDM as a search for parent sets $\mathrm{pa}(1), \ldots, \mathrm{pa}(n)$ which maximise $\sum_{r=1}^{n} s_r(\mathrm{pa}(r))$, subject to there existing an MDM model $m$ with these parent sets, corresponding to the best set of causal relationships under the constraint that the whole system must be acyclic (see details in Costa et al., 2015). It is also possible to search for a graphical structure without the DAG constraint by choosing the set of parents that maximises the LPL for each node independently. We call this algorithm the MDM-DGM (Directed Graph Model; Costa et al., 2017).
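For the unconstrained (MDM-DGM) case, the node-wise search can be sketched directly: each node's parent set is chosen to maximise its own local score. The helper `mdm_node_lpl` is the hypothetical scorer sketched above, and `max_parents` is an illustrative cap on parent-set size; the acyclic optimum of the MDM-IPA additionally requires an integer-programming step over these same local scores (GOBNILP/SCIP), which is not reproduced here.

```r
# Minimal sketch of a node-wise search over parent sets (no acyclicity constraint).
# 'bold' is a T x n matrix of region time series.
dgm_search <- function(bold, max_parents = 3) {
  n <- ncol(bold)
  best <- vector("list", n)
  for (r in seq_len(n)) {
    cands <- setdiff(seq_len(n), r)
    best_score <- -Inf; best_pa <- integer(0)
    for (k in 0:min(max_parents, length(cands))) {
      sets <- if (k == 0) list(integer(0)) else asplit(combn(cands, k), 2)
      for (pa in sets) {
        X  <- if (length(pa) == 0) matrix(1, nrow(bold), 1) else bold[, pa, drop = FALSE]
        sc <- mdm_node_lpl(bold[, r], X)      # local score s_r(pa)
        if (sc > best_score) { best_score <- sc; best_pa <- pa }
      }
    }
    best[[r]] <- list(parents = best_pa, score = best_score)
  }
  best
}
```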
3.2. Some approaches for group analysis
The VTS approach finds a typical subject, assuming the same network with exactly the same connectivity for all subjects. Within an MDM framework, the “typical subject” is defined by first calculating the average of the time series variables over all subjects. Based only on this “ordinary subject”, the search method applied in Costa et al. (2015) can then be used to find the group network. Note that the local score for node $r$ can now be written as follows:

$$s_r^{\mathrm{VTS}}\big(\mathrm{pa}(r)\big) = \sum_{t=1}^{T} \log p\big(\bar{y}_t(r) \mid \bar{y}^{\,t-1}, \mathrm{pa}(r)\big),$$

where $\bar{y}_t(r)$ is the average of the observed variables at time $t$ and node $r$ over subjects. Here the conditioning on $\mathrm{pa}(r)$ indicates the model defined by the parent set of node $r$, so that the group network consists of the chosen parent sets $\mathrm{pa}(1), \ldots, \mathrm{pa}(n)$. The connectivity strength for the group network is estimated from the smoothed posterior distribution of the parameters $\theta_t(r)$ using the MDM fitted to this typical subject.
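Under these assumptions, the VTS analysis amounts to averaging the subjects' time series node by node and running the single-subject search once; a brief sketch reusing the hypothetical helpers above:

```r
# 'bold_list' is a list of T x n matrices, one per subject, with matched ROIs.
vts_search <- function(bold_list, ...) {
  bold_avg <- Reduce(`+`, bold_list) / length(bold_list)  # "typical subject"
  dgm_search(bold_avg, ...)                               # single search on the average
}
```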
The CS approach assumes that all subjects share the same group network structure, but that the parameters may differ over subjects. In this way, the parameter estimation process can be applied to each subject independently. Therefore, for individual models represented by the same graph, the scores used in the search process are defined as follows:

$$s_r^{\mathrm{CS}}\big(\mathrm{pa}(r)\big) = \sum_{i=1}^{I} \sum_{t=1}^{T} \log p\big(y^{(i)}_t(r) \mid y^{(i),\,t-1}, \mathrm{pa}(r)\big), \qquad (2)$$

where $I$ is the number of subjects, $y^{(i)}_t(r)$ is the observed variable for region $r$ and subject $i$ at time $t$, and $y^{(i),\,t-1}$ is the observed cumulative data up to time $t-1$ for subject $i$. Note that within the CS approach the parents of a particular node are the same for all subjects. The group network is estimated using a search algorithm whose scores are given in Eq. (2). The connectivity strength for the group network is estimated as the average of the smoothed estimates of the parameters $\theta_t(r)$ over subjects.
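In code, the only change from the single-subject case is that each candidate parent set is scored by the sum of the subjects' local scores, as in Eq. (2); a minimal sketch, again using the hypothetical `mdm_node_lpl` scorer:

```r
# CS pooled local score for node r and candidate parent set pa (Eq. (2)).
# The same search machinery as before is then applied once to these pooled scores.
cs_local_score <- function(bold_list, r, pa) {
  sum(vapply(bold_list, function(b) {
    X <- if (length(pa) == 0) matrix(1, nrow(b), 1) else b[, pa, drop = FALSE]
    mdm_node_lpl(b[, r], X)     # one subject's contribution to the pooled score
  }, numeric(1)))
}
```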
In contrast, using this model the IS approach learns individual networks independently, using the individual scores

$$s_r^{(i)}\big(\mathrm{pa}_i(r)\big) = \sum_{t=1}^{T} \log p\big(y^{(i)}_t(r) \mid y^{(i),\,t-1}, \mathrm{pa}_i(r)\big), \qquad (3)$$

where $\mathrm{pa}_i(r)$ is the parent set of node $r$ for subject $i$, so that the individual network for subject $i$ consists of the parent sets chosen for its $n$ nodes. The group network structure consists of the edges that exist in the individual networks of most subjects. The MDM is then fitted for all subjects using the group network and, as in the CS approach, the connectivity strength is estimated as the average of the smoothed estimates of the parameters $\theta_t(r)$ over subjects.
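A sketch of the pooling step of the IS approach, assuming each subject's estimated network has already been converted to a 0/1 adjacency matrix:

```r
# IS sketch: build a group adjacency matrix by majority vote over individually
# estimated networks. 'adj_list' is a list of n x n 0/1 adjacency matrices,
# one per subject (e.g. derived from dgm_search applied to each subject).
is_group_graph <- function(adj_list, threshold = 0.5) {
  freq <- Reduce(`+`, adj_list) / length(adj_list)
  (freq > threshold) * 1          # keep edges present for most subjects
}
```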
3.3. Clustering with pairwise log Bayes factor separation
Here we define a new method designed to be sensitive to potential heterogeneities over the connectivity graphs of different subjects as well as their connectivity strengths. In this group-structure (GS) approach, subjects are first grouped according to the similarities in their graphs. These similarities are defined by a separation measure, $d(i,j)$, calculated for every pair of subjects $i$ and $j$. The individual networks, $\hat{m}_i$ and $\hat{m}_j$, are compared with the pairwise group network, $\hat{m}_{ij}$, using

$$d(i,j) = \Big[\mathrm{LPL}_i\big(\hat{m}_i\big) + \mathrm{LPL}_j\big(\hat{m}_j\big)\Big] - \Big[\mathrm{LPL}_i\big(\hat{m}_{ij}\big) + \mathrm{LPL}_j\big(\hat{m}_{ij}\big)\Big],$$

where $\mathrm{LPL}_i(m)$ denotes the log predictive likelihood of Eq. (1) evaluated on subject $i$'s data under model $m$. Here the individual networks, $\hat{m}_i$, are estimated by maximising the scores in Eq. (3). The pairwise group network, $\hat{m}_{ij}$, is then estimated by maximising the sum of the scores for only the two subjects $i$ and $j$, as in Eq. (2) with $I = 2$.

Some properties of $d(i,j)$ are given below.
1. For the MDM-IPA, the scores are exactly the LPL. So $d(i,j)$ can be seen as the logBF comparing the model that assumes subjects $i$ and $j$ have different graphical structures to one where it is assumed that they share the same one. Thus, we call this separation measure the pairwise logBF separation.
2. The pairwise logBF separation is symmetric, i.e. $d(i,j) = d(j,i)$.
3. If the estimated individual graphical structures for subjects $i$ and $j$ are the same, then $d(i,j) = 0$. Since $\hat{m}_i$ maximises the scores in Eq. (3), $\mathrm{LPL}_i(\hat{m}_i) \geq \mathrm{LPL}_i(m)$ for any other possible network $m$ for subject $i$, and similarly for subject $j$. By definition, the pairwise group network assumes that both subjects share the same graphical structure; so, when $\hat{m}_i = \hat{m}_j$, this common structure is itself a candidate pairwise group network, and because $\hat{m}_{ij}$ maximises the scores in Eq. (2) the two bracketed terms in the definition of $d(i,j)$ coincide. Therefore $d(i,j) = 0$.
4. By definition, the separation is non-negative. This is because $\mathrm{LPL}_k(\hat{m}_k) \geq \mathrm{LPL}_k(\hat{m}_{ij})$ for $k = i, j$, since each $\hat{m}_k$ is selected by maximising the scores in Eq. (3) over all possible individual networks for subject $k$, including $\hat{m}_{ij}$. Thus $d(i,j) \geq 0$.
Using a cluster analysis with these pairwise logBF separations, subjects are grouped according to the similarity of their networks. Eq. (2) is then used to score models for subjects belonging to the same subgroup. As a result, a graphical structure is estimated independently for each of the subgroups. Only then is the connectivity strength estimated per subgroup, as the average of the estimated parameters over the (approximately homogeneous) subjects belonging to the same subgroup.
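As a concrete illustration of these steps, the sketch below builds the pairwise logBF separation matrix and clusters subjects with hierarchical clustering and the hybrid dynamic tree cut, as used later in Sections 4 and 5. The wrappers `score_indiv(i)` (the maximised individual score of subject i) and `score_pair(i, j)` (the maximised shared-structure score for the pair) are hypothetical stand-ins for the search routines of Section 3.1, and the average linkage and minimum cluster size are illustrative choices rather than the authors' settings.

```r
library(dynamicTreeCut)

# Pairwise logBF separation: (sum of best individual scores) minus the best
# score obtained when the two subjects are forced to share one structure.
pairwise_logbf <- function(I, score_indiv, score_pair) {
  s <- vapply(seq_len(I), score_indiv, numeric(1))
  D <- matrix(0, I, I)
  for (i in seq_len(I - 1)) for (j in (i + 1):I) {
    D[i, j] <- D[j, i] <- (s[i] + s[j]) - score_pair(i, j)   # symmetric, non-negative
  }
  D
}

# Cluster subjects on the separation matrix and cut with the hybrid algorithm.
cluster_subjects <- function(D, min_size = 4) {
  hc  <- hclust(as.dist(D), method = "average")
  lab <- cutreeDynamic(dendro = hc, distM = D, method = "hybrid",
                       minClusterSize = min_size)
  mds <- cmdscale(as.dist(D), k = 2)      # 2-D depiction of the separations
  list(tree = hc, subgroup = lab, mds = mds)
}
```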
To compare the computational cost of these methods, recall that learning an individual network follows two basic steps: (1) calculating the scores for each set of parents for the individual nodes; (2) finding the optimal MDM using the MDM-IPA or the MDM-DGM over the full model. The run-time of the first step depends critically on the number of nodes and the sample size. It is necessary to fit a dynamic linear model for every node and every set of parents — there are $2^{n-1}$ possible sets of parents per node. We have found that step 1 takes dramatically longer than step 2; see Costa et al. (2015). For example, for an 11-node network with 100 time points, step 1 took around 168 min whilst step 2 took around 30 s, on a 2.7 GHz quad-core Intel Core i7 Linux host with 16 GB of memory. Thus, the VTS approach takes the shortest time because the network search algorithm (steps 1 and 2) is applied only once, for a “typical” subject. In the CS approach, step 1 is applied for each subject, the scores are summed over all individuals, and step 2 is applied only once. For the IS and GS approaches, steps 1 and 2 are applied individually for all subjects. Therefore, as step 2 takes a relatively short time, the difference between the IS and CS approaches tends to be small. However, because the individual scores must be summed and step 2 applied for every pair of subjects, the GS approach takes the longest. The greater the number of subjects, the bigger these run-time differences become.
4. Comparing methods using synthetic data
In this section, we compare the four group analysis approaches described above using synthetic data. The aim of this section is to assess the efficacy of the methods when subjects are sampled from populations whose individuals may exhibit different networks. In order to obtain more realistic data, we simulated data based on the results found in the analysis of the real datasets shown in Section 5. We therefore considered three different DAGs that were estimated for most people in the real application (subgroups 1, 2 and 4 from Fig. 5), and we repeat these DAGs here in Fig. 1 (DAG1, DAG2 and DAG3) for easy viewing. We simulated data for 10 subjects for each DAG, taking the true parameters to be the averages of the estimates over subjects found in Section 5.
Fig. 5.
The connectivity strength standardised mean for each edge, with nodes indexing the rows and columns, estimated using the MDM-IPA for each subgroup defined in Fig. 3; only significant connectivities are shown.
The GS approach
The pairwise logBF separation for all pairs of subjects was evaluated as shown in Section 3.3, using the MDM-IPA. In order to assess the homogeneity of this group, we used hierarchical clustering (Everitt et al., 2011, Chapter 4) and multidimensional scaling (MDS; Everitt et al., 2011, Section 2.3.3). The hierarchical clustering results can be illustrated through a dendrogram and, to define subgroups, we use the dynamic tree cut (hybrid algorithm; Langfelder et al., 2008). Fig. 2 (left) shows the dendrogram, which was found using the R function hclust and the dynamicTreeCut package with a fixed minimum cluster size. In this diagram, subjects are labelled by the index of their respective DAG. The hybrid algorithm correctly identifies the number of subgroups, i.e. the three coloured rectangles under the dendrogram, and all subjects were correctly grouped. Moreover, the average separation between subjects belonging to the same group was far smaller than the average separation between different groups. This provides strong evidence that people from different subgroups have different graphical structures.
Fig. 2.
Dendrogram (left) and MDS (right) for the synthetic data using the pairwise logBF separation. Labels 1, 2 and 3 correspond to subjects simulated from DAG1, DAG2 and DAG3, respectively. Coloured rectangles under the dendrogram identify the three subgroups found by the hybrid algorithm. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Following this, multidimensional scaling (MDS) was explored in this context. MDS depicts the pattern of separations between subjects within a Euclidean space. By using geometry, the best separation between subjects in a low-dimensional scaling is used to represent the original dissimilarity measure (Everitt et al., 2011, Section 2.3.3). Note that, from this MDS plot, it is possible to recognise subgroups and outliers, and also to obtain a measure of the quality of this approximate Euclidean depiction. For instance, Fig. 2 (right) shows a 2D plot which captured almost 99% of this information, where subjects are labelled according to the number of their DAG as before. Clearly, subjects from DAG1 are at the bottom left of the figure, whilst subjects from DAG2 are on the right and those from DAG3 at the top left.
Comparing group analysis approaches
The graphical structures for the VTS, CS and IS approaches were estimated as described in Section 3.2, using the MDM-IPA (see Fig. 1). As there are three true DAGs and these approaches estimate only a single DAG, we compared the performance of the methods against a graph formed by the edges that exist for most people (i.e. present in the majority of individuals). We also estimated sensitivity and specificity measures, in which the former represents the proportion of edges that exist in the true graph and are correctly identified as such in the estimated graph, whilst the latter represents the proportion of edges that do not exist in the true graph and are correctly identified as such in the estimated graph. Table 1 shows these estimated measures for all four approaches.
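For reference, the sensitivity and specificity in Table 1 can be computed from the true and estimated adjacency matrices as follows (directed edges, diagonal excluded); this is a generic sketch, not the authors' evaluation code:

```r
# Sensitivity / specificity of an estimated adjacency matrix against the truth.
edge_accuracy <- function(true_adj, est_adj) {
  off <- row(true_adj) != col(true_adj)            # ignore the diagonal
  tp  <- sum(true_adj == 1 & est_adj == 1 & off)   # true edges recovered
  tn  <- sum(true_adj == 0 & est_adj == 0 & off)   # absent edges correctly absent
  c(sensitivity = tp / sum(true_adj == 1 & off),
    specificity = tn / sum(true_adj == 0 & off))
}
```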
Unsurprisingly, the IS approach picked up most of the edges that exist for the majority of individuals and estimated only a few additional false-positive edges. The VTS approach provided the worst performance, recovering the smallest proportion of these popular edges and adding more false-positive edges: averaging the time series over subjects degraded the estimated graph.
In contrast to the other methods, the GS approach identified the three different networks perfectly, each subgroup being made up of 10 individuals (see Fig. 2). As a result, the estimated sensitivity and specificity were among the highest of all the methods, averaging approximately 87% and 95%, respectively, as shown in Table 1.
Table 1.
The estimated sensitivity and specificity measures for the VTS, CS, IS and GS approaches. These measures were estimated for the first three approaches considering a graph formed by the most popular edges, whilst all three true DAGs were considered for the GS approach.
| Approach | Sensitivity | Specificity |
|---|---|---|
| VTS | 58% | 73% |
| CS | 71% | 87% |
| IS | 88% | 95% |
| GS-subgroup 1 | 83% | 94% |
| GS-subgroup 2 | 96% | 96% |
| GS-subgroup 3 | 83% | 95% |
5. Group-structure in practice
This real application consists of resting-state fMRI data from a 15-min experiment with 32 subjects (TR = 1140 ms, 2 × 2 × 2 mm voxels). ROIs were defined either functionally or based on the Harvard-Oxford atlas: left and right amygdala, ventromedial prefrontal cortex (VMPFC), left and right dorsolateral prefrontal cortex (DLPFC), left and right posterior insula (PostInsula), left and right anterior insula (AntInsula), left and right orbitofrontal cortex (OFC), and anterior midcingulate cortex (aMCC). More details can be found in Bijsterbosch et al. (2015). The GS approach, using the MDM-IPA, was applied, and the result can be seen in Fig. 3 through a dendrogram (left) and an MDS plot (right). Based on the hybrid algorithm, we defined six subgroups: red as subgroup 1; royal blue as subgroup 2; grey (without subject 6) as subgroup 3; light blue as subgroup 4; yellow as subgroup 5; and subject 6 as subgroup 6.
Fig. 3.
Dendrogram (left) and MDS (right) of the real fMRI data using the pairwise logBF separation for the MDM-IPA. Coloured rectangles under the dendrogram identify the subgroups found by the hybrid algorithm. Subgroup 1 is defined as Red; Subgroup 2 is Royal Blue; Subgroup 3 is Grey (without subject 6); Subgroup 4 is Light Blue; Subgroup 5 is Yellow; and Subgroup 6 is Subject 6. The MDS graph illustrates the subjects in their respective colours and captures most of the information provided by the dissimilarity measure. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
It is possible to use some characteristics of the subjects to begin to explain the differences between subgroups. For example, Fig. 4 shows a significant difference in the percentage of men between subgroups 3 and 4 (left), and in the percentage of high trait anxiety individuals between subgroups 2 and 5 (right).
Fig. 4.
The proportion of males (left) and the proportion of subjects who have high trait anxiety (right) by subgroup, as defined in Fig. 3. The bars represent 95% HPD intervals calculated assuming a non-informative prior.
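The HPD intervals in Fig. 4 summarise the posterior of a subgroup proportion under a non-informative prior. A minimal sketch, assuming a uniform Beta(1, 1) prior (so the posterior is Beta(x + 1, n - x + 1)) and finding the shortest 95% interval numerically:

```r
# Shortest (HPD) 95% interval for a proportion, assuming a Beta(1, 1) prior.
prop_hpd <- function(x, n, level = 0.95) {
  a <- x + 1; b <- n - x + 1
  lower <- seq(0, 1 - level, length.out = 2000)          # candidate lower tail probabilities
  width <- qbeta(lower + level, a, b) - qbeta(lower, a, b)
  i <- which.min(width)                                   # shortest interval of mass 'level'
  c(qbeta(lower[i], a, b), qbeta(lower[i] + level, a, b))
}

# Example (illustrative counts): 7 men out of 10 subjects in a subgroup.
# prop_hpd(7, 10)
```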
The graphical structures for each subgroup were estimated independently, summing the scores of the subjects belonging to the same subgroup and then applying the MDM-IPA. To simplify the analysis, we calculated a connectivity strength standardised mean for each edge and subgroup, defined as the average of the location parameters of the smoothed distribution over time and subjects, standardised by the corresponding average scale parameter, the sample size and the number of subjects in the subgroup. Fig. 5 provides this connectivity strength standardised mean for each edge (nodes indexing the rows and columns), per subgroup, only for those edges with significant Binomial tests after false discovery rate correction (FDR; Benjamini and Hochberg, 1995). We note that the standardised mean for the connectivity from node 8 (AntInsula-L) to node 9 (AntInsula-R) in subgroup 6 was far larger than the others, and its displayed value was reduced in this figure in order to clarify the differences among the other connectivities.
Note that not only does the existence of an effective connectivity differ among subgroups, but the connectivity strength may also vary from one subgroup to another. For instance, suppose we are interested in comparing subgroup 3 to subgroup 4, and subgroup 2 to subgroup 5, because they appear to be correlated with the covariates gender and high trait anxiety, as shown in Fig. 4. We can then calculate a connectivity strength standardised difference as a coarse measure of the connectivity in one subgroup compared to another. Fig. 6 (left) provides this difference in connectivity strength between subgroups 3 and 4, in which the percentage of males is higher in the former subgroup than in the latter (Fig. 4, left). Pink connectivities are stronger in subgroup 3 (men) than in subgroup 4 (women), whilst blue connectivities are stronger in subgroup 4 (women) than in subgroup 3 (men). Note that the connectivity from node 8 (left AntInsula) to node 9 (right AntInsula) is strongest in subgroup 4 (women) whilst the reciprocal connectivity, from node 9 to node 8, is strongest in subgroup 3 (men). Also, connectivities in which the parent is node 12 (aMCC) tend to be stronger in subgroup 4 (women) than in the male subgroup.
Fig. 6.
The connectivity strength standardised difference for each edge (nodes indexing the rows and columns), between subgroup 3 and subgroup 4 (left), and between subgroup 2 and subgroup 5 (right). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Another example is the difference between subgroups 2 and 5, where the proportion of subjects who have high trait anxiety is larger in the former subgroup than in the latter (Fig. 4, right). The connectivity strength standardised difference between these two subgroups can be seen in Fig. 6 (right). An interesting result is that some connectivities are strongest in one subgroup whilst their reciprocal connectivities are strongest in the other subgroup. For instance, the connectivity from node 1 (left Amygdala) to node 2 (right Amygdala) is strongest in subgroup 5 whilst the reciprocal connectivity, from node 2 to node 1, is strongest in subgroup 2. A similar result can be seen between the following pairs of nodes: 4 and 5 (left and right DLPFC); 5 and 11 (right DLPFC and right OFC); 7 and 9 (right PostInsula and right AntInsula); 7 and 12 (right PostInsula and aMCC); 9 and 12 (right AntInsula and aMCC). Note that most of these connectivities are on the right side of the brain.
6. The estimation of multiple networks
We discussed above four approaches used to estimate a group network, considering different ways of combining individual information. In this section, we consider a complementary problem and provide ways to infer all the graphs: individual, subgroup and group networks. We present two methods here: the Individual Estimation of Multiple Networks (IEMN) and the Marginal Estimation of Multiple Networks (MEMN). Both methods incorporate the GS approach to deal with a heterogeneous group, but the second one, MEMN, appears to be more robust than IEMN because it estimates the individual networks using information from other subjects.
In Section 6.1, we define the IEMN method, in which subjects are first grouped using a cluster analysis, as in the GS approach. IEMN then estimates the individual networks independently, after which the subgroup and the group networks are estimated using the CS approach. This method therefore mainly addresses the problem of a heterogeneous population. The second method, MEMN, is developed from a method suggested by Oates (2013), in which subgroup and group networks are estimated using the IS approach together with a similarity measure between the individual and the subgroup/group network structures. This method then estimates the individual networks using the information from other subjects belonging to the same homogeneous subgroup. IEMN and MEMN are compared using real fMRI data in Section 6.2.
6.1. The IEMN and the MEMN
In this section we describe a new approach for searching over MDMs which estimates not only group networks but also individual networks, whilst taking into account the information from other subjects. This approach, called the Marginal Estimation of Multiple Networks (MEMN), was originally developed for Dynamic Bayesian Networks (DBNs), using a penalty function that represents the distance between group and individual networks (Oates, 2013). We generalise this method by combining it with the GS approach, i.e. the cluster analysis shown above, including one more step to estimate the subgroup networks. Oates (2013) used his method to consider the probability of a particular edge existing. Here, in contrast, we develop MEMN based on the log predictive likelihood (LPL). Furthermore, Oates (2013) assumed that the parameter of the penalty function was known, being defined by scientists. It may not be easy to suggest appropriate values for this parameter, especially when the study consists of a novel experiment, and a misspecification of this parameter will produce erroneous results. We discuss here some new possibilities for estimating it from data, for example by maximising the LPL.
A comparison between the MEMN and the Individual Estimation of Multiple Networks (IEMN) is also provided in this section. The IEMN is basically the GS approach described above, in which the individual networks are estimated independently whilst the subgroup networks are estimated by summing the scores over subjects within the same subgroup.
Reviewing the notation: $m_i$ is a graphical structure for subject $i$, within the space of candidate networks; $m_{(g)}$ is a graphical structure for subgroup $g$; $K_g$ is the set of indices of the subjects belonging to subgroup $g$, according to the cluster analysis, and $|K_g|$ denotes the number of subjects in that subgroup. Finally, $m_{\mathrm{group}}$ denotes a graphical structure for the whole group of subjects (see Fig. 7).
Fig. 7.
Individual networks $m_i$; subgroup networks $m_{(g)}$, found by cluster analysis; and the group network $m_{\mathrm{group}}$.
6.1.1. The Individual Estimation of Multiple Networks (IEMN)
IEMN: Individual Graphical Structures:
The maximum a posteriori probability (MAP) estimator of $m_i$ for IEMN can be defined as

$$\hat{m}_i = \arg\max_{m_i} \sum_{r=1}^{n} s_r^{(i)}\big(\mathrm{pa}_i(r)\big),$$

where $\mathrm{pa}_i(r)$ denotes the parent set of node $r$ under $m_i$ and the local score $s_r^{(i)}$ is found per subject and node as in Eq. (3). Therefore, the MDM-IPA or the MDM-DGM can be applied to find $\hat{m}_i$ for each subject independently, using the scores $s_r^{(i)}$.
IEMN: Subgroup Graphical Structures:
Now the MAP estimator of the subgroup network $m_{(g)}$ is

$$\hat{m}_{(g)} = \arg\max_{m_{(g)}} \sum_{r=1}^{n} \sum_{i \in K_g} s_r^{(i)}\big(\mathrm{pa}_{(g)}(r)\big),$$

where the summed score is defined as in Eq. (2), restricted to the subjects $i \in K_g$. Then $\hat{m}_{(g)}$ can be estimated by the MDM-IPA or the MDM-DGM for each subgroup independently.
IEMN: Group Graphical Structure:
To search for the group network, the CS approach can be applied so that the scores are summed over all subjects, as in Eq. (2). Then the MAP estimator of the group network for the whole population can be defined as

$$\hat{m}_{\mathrm{group}} = \arg\max_{m_{\mathrm{group}}} \sum_{r=1}^{n} \sum_{i=1}^{I} s_r^{(i)}\big(\mathrm{pa}_{\mathrm{group}}(r)\big),$$

where $I$ is the total number of subjects. Again, $\hat{m}_{\mathrm{group}}$ can be estimated by the MDM-IPA or the MDM-DGM.
6.1.2. The Marginal Estimation of Multiple Networks (MEMN)
MEMN: Individual Graphical Structures:
This method scores individual networks based on the information from the subgroup network and the individual networks of other subjects who belong to the same subgroup, as follows (see details in Appendix B).
The MEMN score for the individual network of subject $i$ is the sum of the individual local scores of Eq. (3) and the logarithm of a prior term that carries the information from the subgroup network $m_{(g)}$ and from the individual networks of the other subjects in the subgroup (Eq. (5); the full expression is derived in Appendix B). Here subject $i$ belongs to subgroup $g$, and $K_g \setminus \{i\}$ denotes the set of the indices of all subjects belonging to subgroup $g$ except subject $i$.
The prior term penalises the Structural Hamming Distance (SHD) between $m_i$ and $m_{(g)}$, i.e., for each node $r$, the number of nodes that are parents of node $r$ in only one of the two networks. We reduce the number of hyperparameters by assuming that the parents of node $r$ are a priori equally likely to be shared between subject $i$ and subgroup $g$. The penalty hyperparameter $\lambda$ is usually specified in a subjective manner. Oates (2013) suggested writing this prior as a function of the probability of maintaining the status (present/absent) of each edge between the individual network and the subgroup network. That is,
$$p\big(m_i \mid m_{(g)}\big) \propto \exp\!\big(-\lambda\,\big|E(m_i)\,\triangle\,E(m_{(g)})\big|\big), \qquad (6)$$

where $E(m)$ denotes the edge set of network $m$ and $A \triangle B$ denotes the set of elements contained in A but not in B plus the elements contained in B but not in A. Under Eq. (6), the prior odds that an individual graph agrees with its subgroup graph regarding the status of a particular edge are $e^{\lambda}$. For instance, setting $\lambda = 0.7$, the probability of maintaining the edge status is almost twice the probability of not maintaining the edge status between the subgroup and individual networks. These odds increase rapidly for larger values of $\lambda$.
The MAP estimator of the individual network for subject $i$ is then obtained by maximising the penalised scores of Eq. (5). The individual network structure for subject $i$ can thus be found using the scores in Eq. (5) and the MDM-IPA or the MDM-DGM.
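To fix ideas, the sketch below computes the node-level SHD between two parent sets and a penalised local score of the form "LPL term of Eq. (3) minus lambda times the distance to the subgroup network's parent set". Treating the penalty as acting node by node in this way is an assumption of the sketch, consistent with the prior in Eq. (6) but not a verbatim transcription of Eq. (5).

```r
# Node-level Structural Hamming Distance: parents of node r present in only one of the two sets.
shd_node <- function(pa_a, pa_b) {
  length(union(setdiff(pa_a, pa_b), setdiff(pa_b, pa_a)))
}

# Penalised local score for one candidate parent set of node r in subject i's network:
# its LPL-based local score minus lambda times its distance from the subgroup parent set.
memn_local_score <- function(lpl_candidate, pa_candidate, pa_subgroup, lambda = 0.7) {
  lpl_candidate - lambda * shd_node(pa_candidate, pa_subgroup)
}
```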
MEMN: Subgroup Graphical Structures:
The subgroup network $m_{(g)}$ is found through its posterior probability, as follows (see details in Appendix C). The subgroup network structure is then found by applying the MDM-IPA or the MDM-DGM to these scores.
MEMN: Group Graphical Structure:
The estimation of the group network structure using the individual networks follows the same idea shown above for the subgroup networks: the group scores are defined analogously over all subjects (Eq. (7)), and the MAP estimator of the group network is found by maximising them with the MDM-IPA or the MDM-DGM.
Comparing the MEMN with the IEMN in Theory
In IEMN, the individual networks are estimated independently, assuming that subjects have different networks, and so the information of one subject is not used in the estimation of another subject's network. In contrast, the group network is estimated assuming that all subjects come from the same population and so share the same graphical structure.
The higher the $\lambda$, the more similar the group network results of IEMN and MEMN are. This is because IEMN assumes that all subjects have the same graphical structure when estimating the group network; similarly, in MEMN, the higher the $\lambda$, the more the scores favour individual networks whose distance to the group network is small, and so the more similar the individual graphs are to each other.

Conversely, the smaller the $\lambda$, the more similar the individual results of IEMN and MEMN are. When $\lambda = 0$, the prior term in Eq. (5) is proportional to a constant, and so the score is a function of the LPL alone; the estimated individual graphical structures using IEMN are then the same as those using MEMN. In contrast, as $\lambda$ increases, the scores of individual networks that differ more from the group network are penalised more heavily, therefore increasing the divergence between the IEMN and MEMN results.
Note that the comments above about the group network also apply to the subgroup networks. Some of the properties cited above will be demonstrated in the next section.
6.2. The application of multiple networks
We next use a rich fMRI study that has information from five different experimental conditions, called sessions: Session 1 is a (conventional) resting-state condition; Session 2 is a motor condition in which individuals performed a tapping task; Session 3 is a visual condition in which individuals watched a movie; Sessions 4 and 5 are a combination of the visual and motor conditions, but in the former the tapping is done in a random way whilst in the latter individuals tap when they see certain random events in the movie. Data were acquired on multiple subjects, and each acquisition consists of 230 time points, sampled every 1.3 s, with 2 × 2 × 2 mm voxels. The FSL software1 was used for preprocessing, including head motion correction, an automated artefact removal procedure (Salimi-Khorshidi et al., 2014; Griffanti et al., 2014) and intersubject registration. We use ROIs defined on 5 motor brain regions and 6 visual regions. The motor nodes used are Cerebellum, Putamen, Supplementary Motor Area (SMA), Precentral Gyrus and Postcentral Gyrus (nodes numbered from 1 to 5, respectively) whilst the visual nodes used are Visual Cortex V1, V2, V3, V4, V5 and task negative (V1 V2; nodes numbered from 6 to 11, respectively). The observed time series are computed as the average of the BOLD fMRI data over the voxels of each of these defined brain areas (Costa et al., 2017).
In this section we discuss the main differences between IEMN and MEMN, highlighted in the previous section, now considering a real application. In addition, we explore some new methods for determining $\lambda$ and discuss the impact of its value on the results of multi-subject analyses. We show that, depending on the chosen value of $\lambda$, IEMN and MEMN can provide completely different results. It should be remembered that the parameter space of $\lambda$ is $[0, \infty)$, ranging from the belief that individuals have different connectivity maps to the hypothesis that they share the same one.
Some possibilities for defining this parameter are (i) through a scientific belief statement (Oates, 2013), e.g., as shown in Section 6.1.2, $\lambda = 0.7$ implies that the probability of maintaining the edge status (absent/present) is almost twice the probability of not maintaining the edge status between the group and the individual networks; (ii) maximising the LPL (or, equivalently, maximising the scores); (iii) maximising the posterior probability of the individual networks or of the group network; (iv) by cross-validation. We show here that different ways of estimating $\lambda$ can lead to very different values of this parameter, and so to divergent analyses.
In this section we also compare the MDM-IPA and the MDM-DGM. We show that the graphs estimated by the MDM-DGM are usually denser than the DAGs from the MDM-IPA, accommodating the possible cycles in the communication among brain regions. We also show that the methods described here can be used to compare data from different experimental conditions.
For the purposes described above, we are using an external validation study, in which the estimated individual networks were compared to predictive networks, i.e. networks estimated using the data from other subjects. For simplicity, we are considering two levels: the individual and the group network. However, this analysis could of course also be applied to subgroup networks as well.
First, the individual networks were estimated using IEMN and using MEMN with a range of values of $\lambda$, under both the MDM-IPA and the MDM-DGM. Then the predicted individual network for each subject was estimated from the group analysis of all the other subjects, using the same methods as before.
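The external validation loop itself is simple to express; `estimate_individual` and `estimate_group` below stand in for the IEMN/MEMN estimation routines of Section 6.1 and are assumptions of this sketch:

```r
# Leave-one-subject-out validation sketch.
# bold_list: list of T x n matrices, one per subject.
loo_validate <- function(bold_list, estimate_individual, estimate_group) {
  I <- length(bold_list)
  lapply(seq_len(I), function(i) {
    est  <- estimate_individual(bold_list[[i]])   # network from subject i's data only
    pred <- estimate_group(bold_list[-i])         # "predicted" network from everyone else
    list(estimated = est, predicted = pred)
  })
}
```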
Fig. 8 shows the estimated and predicted graphical structures for subject 1 in the resting-state condition, considering all the methods described above. As expected, the estimated individual network found using IEMN was similar to the one using MEMN with a small $\lambda$ (see the first and second columns and the first and third rows). In contrast, as the predicted results were found using the group-network methods, and as discussed above, the predicted network of IEMN was similar to the MEMN one with a large $\lambda$ (see the first and last columns and the second and last rows).
Fig. 8.
The estimated and predicted networks, using the MDM-IPA and the MDM-DGM, for subject 1 in the resting-state condition (Session 1), considering IEMN (first column) and MEMN (from the second column onwards) for increasing values of $\lambda$. The motor nodes used are Cerebellum, Putamen, Supplementary Motor Area (SMA), Precentral Gyrus and Postcentral Gyrus (yellow nodes numbered from 1 to 5, respectively) whilst the visual nodes used are Visual Cortex V1, V2, V3, V4, V5 and task negative (V1 V2; blue nodes numbered from 6 to 11, respectively).
The parameter $\lambda$ can be defined by maximising the individual and the group scores. These were evaluated as functions of the posterior probabilities of the individual and group networks, via Eq. (5) and Eq. (7), respectively. The value of $\lambda$ that maximised these scores was 0.7 for both the individual and the group networks, and for both the MDM-IPA and the MDM-DGM (see Table 2).
Table 2.
The average scores for the individual networks and the group networks, over subjects and sessions, for the network learning algorithms MDM-IPA and MDM-DGM and different values of $\lambda$.
| $\lambda$ | Individual, IPA | Individual, DGM | Group, IPA | Group, DGM |
|---|---|---|---|---|
| 0.1 | 20 336.50 | 20 851.97 | 19 414.24 | 19 426.27 |
| 0.7 | 20 382.76 | 20 904.73 | 19 423.86 | 19 507.02 |
| 10 | 19 929.91 | 20 731.38 | 18 119.40 | 19 348.62 |
| 100 | 18 844.87 | 20 729.59 | 14 671.61 | 19 347.24 |
| 1000 | 14 349.80 | 20 729.59 | 13 499.95 | 19 347.24 |
Analysing the number of edges, the results also depend clearly on whether individual or group networks are considered. We can see in Fig. 8 that the higher the $\lambda$, the denser the estimated individual networks were (first and third rows), whilst the group networks became sparser as $\lambda$ grew (second and last rows).
We also studied which method provides graphical structures that are more similar over subjects. Fig. 9 shows the average SHD between two estimated individual networks, over all pairs of subjects and all sessions, considering IEMN and MEMN (for several values of $\lambda$), using the MDM-IPA (blue bars) and the MDM-DGM (orange bars). Considering the complete individual graphical structures (Fig. 9, left), IEMN and MEMN with small values of $\lambda$ produced results that differ across subjects. This is expected, since a large $\lambda$ implies a similar structure between the individual networks and the group network, and hence among the individual graphs; in this way, MEMN with a large $\lambda$ is suitable for a homogeneous group. This conclusion is confirmed when only the significant connections of the estimated individual graphs are considered (right panel), although the difference between the methods is then faint.
Fig. 9.
The average structural Hamming distance between two estimated individual networks, over all pairs of subjects and all sessions, considering IEMN and MEMN (for several values of $\lambda$), using the MDM-IPA (blue bars) and the MDM-DGM (orange bars), for all edges (left) and only the significant edges (right) of the estimated networks. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The estimated and the predicted networks were compared using the logBF and the SHD, both for the complete graphs and considering only significant edges. We then computed the percentage of subjects for whom the distance between the estimated and the predicted network is smallest when both networks belong to the same session. Fig. 10 provides the average of this percentage of correctly predicting the network session, over all sessions, comparing the estimated to the predicted networks using the same method (i.e. IEMN, and MEMN with a given $\lambda$), and comparing the estimated networks using IEMN (i.e. individual graphs estimated independently) to the predicted networks using MEMN. Since there are five sessions, the chance of predicting the correct network session at random is 20% (dashed horizontal line in the figure). The green lines represent 95% HPD intervals, employing a non-informative prior distribution. In general, IEMN and MEMN provided among the best results; for the MDM-IPA, MEMN had the highest percentage of correctly predicting the network session. Overall, the MDM-IPA predicted the network session more accurately than the MDM-DGM.
Fig. 10.
The average percentage of correctly predicting the network session, over all sessions, comparing the estimated to the predicted networks under the same method (IEMN, and MEMN for each hyperparameter value), and comparing the networks estimated with IEMN to those predicted with MEMN across the hyperparameter grid, using the MDM-IPA (blue bars) and the MDM-DGM (orange bars). The dashed horizontal line marks the chance of predicting the network session correctly at random. The green lines represent the 95% HPD intervals under a non-informative prior distribution. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
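The percentages and intervals in Fig. 10 can be sketched as follows, assuming a three-way array of distances (logBF or SHD) between each subject's estimated network for one session and the predicted networks for every session. The Beta-posterior HPD interval below uses a flat Beta(1, 1) prior as one non-informative choice, which may differ from the exact prior used here.

```python
import numpy as np
from scipy import stats, optimize

def session_prediction_rate(dist):
    """Fraction of (subject, session) cases in which the same-session
    distance is the smallest.

    dist[i, s, t] is the distance between subject i's estimated network for
    session s and the predicted network for session t.
    """
    chosen = dist.argmin(axis=2)                       # session with smallest distance
    return float((chosen == np.arange(dist.shape[1])).mean())

def beta_hpd(successes, trials, cred=0.95, a0=1.0, b0=1.0):
    """Shortest (HPD) credible interval for a binomial proportion under a
    Beta(a0, b0) prior; a0 = b0 = 1 is a flat, non-informative choice."""
    post = stats.beta(a0 + successes, b0 + trials - successes)
    width = lambda p: post.ppf(p + cred) - post.ppf(p)
    res = optimize.minimize_scalar(width, bounds=(0.0, 1.0 - cred), method="bounded")
    return post.ppf(res.x), post.ppf(res.x + cred)
```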
To compare sessions, we identified which sessions were best at predicting each particular session. For instance, Table 3 shows, in columns 3 and 5, the sessions with the highest percentage of correctly predicting the network of the session in column 1, for the MEMN; a sketch of this tabulation is given after Table 3. Considering all methods and the different hyperparameter values, in general:
- Session 1, resting-state condition, better predicted Session 2, motor condition;
- Session 3, visual condition, and Session 4, visual and motor (random) condition, better predicted each other;
- Session 3 also better predicted Session 5, visual and motor condition;
- There was no predominant session that predicted Session 1, resting-state.
Note that these results are consistent with the conclusions of Costa et al. (2017), where Session 1, resting-state, was responsible for the greatest difference among sessions in terms of connectivity maps, and Session 3, visual condition, was closer to Session 4, visual and motor (random) condition, than to the other sessions.
Table 3.
The percentage of subjects for whom a particular session is chosen by the smallest value of the logBF comparing the estimated network (using IEMN) to the predicted network (using MEMN). Columns 2 and 4 show this percentage for the same session as in column 1, for MDM-IPA and MDM-DGM, respectively. Columns 3 and 5 show the session (and the corresponding percentage) that best predicts the session in column 1 among all other sessions, for MDM-IPA and MDM-DGM, respectively.
| Session | MDM-IPA: % right session | MDM-IPA: predictor session (%) | MDM-DGM: % right session | MDM-DGM: predictor session (%) |
|---|---|---|---|---|
| 1-RS | 40% | 5 (33%) | 47% | 3 (27%) and 4 (27%) |
| 2-Motor | 47% | 1 (33%) | 33% | 1 (27%) |
| 3-Visual | 33% | 4 (40%) | 40% | 4 (33%) |
| 4-Visual Motor (random) | 40% | 3 (33%) | 33% | 1 (40%) |
| 5-Visual Motor | 27% | 3 (40%) | 13% | 4 (47%) |
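As referenced above, the following sketch shows how the entries of Table 3 could be tabulated from the same distance array used in the previous sketch; the layout of `dist` and the assumption of a tie-free argmin are simplifications.

```python
import numpy as np

def predictor_session_table(dist):
    """Summarise, for each target session s, (i) the fraction of subjects for
    whom s itself gives the smallest distance and (ii) the other session most
    often giving the smallest distance, as in Table 3.

    dist[i, s, t] as in the previous sketch.
    """
    n_subjects, n_sessions, _ = dist.shape
    rows = []
    for s in range(n_sessions):
        chosen = dist[:, s, :].argmin(axis=1)          # chosen session per subject
        right = float(np.mean(chosen == s))
        others = chosen[chosen != s]
        if others.size:
            sessions, counts = np.unique(others, return_counts=True)
            best_other = int(sessions[counts.argmax()]) + 1   # 1-based session label
            best_share = counts.max() / n_subjects
        else:
            best_other, best_share = None, 0.0
        rows.append({"session": s + 1, "right": right,
                     "best_other": best_other, "best_other_share": best_share})
    return rows
```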
7. Conclusions
Many experimental designs in neuroscience involve data collected on multiple subjects. Subjects may differ with respect to neural connectivity, so that the corresponding graphs may be subject-specific Sugihara et al. (2006), Li et al. (2008). The elements of neural architecture are nevertheless assumed to be largely conserved among subjects. It is therefore natural to leverage this similarity to improve statistical efficiency, both by improving the robustness of the inferred graphical structure and by reducing small-sample bias (Mechelli et al., 2002). The statistical challenge of estimating multiple related graphical models has recently received much attention, e.g. the VTS, CS and IS approaches Mechelli et al. (2002), Zheng and Rajapakse (2006), Rajapakse and Zhou (2007), Li et al. (2008), Ramsey et al. (2010). Table 4 summarises the main differences in the assumptions of the methods discussed in this work.
Table 4.
The main assumptions for all methods described in this work. Note that IEMN and MEMN are based on the GS approach.
| Assumptions | VTS | CS | IS | IEMN | MEMN |
|---|---|---|---|---|---|
| The degree of similarity between subjects is fixed | | | | | |
| Homogeneity | | | | | |
| The same graphical structure | | | | | |
| The same connectivity strengths | | | | | |
Here the VTS approach was applied to the average of the time series over subjects. Although this method performed poorly in the synthetic study, it was consistent with the other methods for the real fMRI data. The CS approach provided dense graphs in both the synthetic and the real studies. However, in the sub-groups of simulated data where most subjects share the same graphical structure, the process of summing scores performed very well. The IS approach provided sparser graphs than the other methods, and its results appeared to reflect the behaviour of the majority of subjects.
Studies suggest that it might be possible to increase statistical efficiency, often considerably, by formulating an appropriate joint model that couples multiple graphs. However, these methods use an exchangeability assumption, treating the entire group as homogeneous. Therefore these approaches are bound to provide inconsistent results for a heterogeneous group, as shown above. In our study, we saw that only the GS approach can recognise the heterogeneity that existed in the group.
We also developed the IEMN and MEMN methods based on the GS approach. We showed that these two approaches provide similar results when the coupling hyperparameter is small for individual networks and large for group networks. Moreover, the larger the hyperparameter, the denser the individual networks but the sparser the group networks. We also discussed procedures that can be used to estimate this hyperparameter. The results found here suggest that its estimation depends on the aim of the study: if one wishes to predict the connectivity for a new subject, cross-validation can be used (see the sketch below), whereas if the focus is on estimating individual networks, the hyperparameter can be chosen by maximising the posterior distribution. In general, the appropriate choice is also related to how homogeneous the group is, and hence to how much information from the other datasets should be included in the analysis of any one dataset.
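A sketch of the cross-validation option mentioned above is given here; `fit_group_model` and `predictive_score` are hypothetical stand-ins for fitting the MEMN at a given hyperparameter value on the training subjects and scoring the held-out subject's data (e.g. by its log predictive likelihood).

```python
def loso_select_hyperparameter(grid, subjects, fit_group_model, predictive_score):
    """Leave-one-subject-out choice of the coupling hyperparameter.

    For each candidate value, every subject is held out in turn, the group
    model is refitted on the remaining subjects, and the held-out subject's
    predictive score is accumulated; the value with the best total wins.
    """
    totals = {}
    for value in grid:
        total = 0.0
        for held_out in subjects:
            training = [s for s in subjects if s != held_out]
            model = fit_group_model(value, training)
            total += predictive_score(model, held_out)
        totals[value] = total
    return max(totals, key=totals.get), totals
```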
As the aim of this work is to discuss methods that deal with multiple networks, and since in principle these approaches can be used with other graphical models, we do not explore here other methods applied to fMRI data. However, a reader interested in this subject can find a comparison between the MDM and other methods commonly applied to neuroscience data (e.g. Granger causality, linear dynamical systems and Patel's conditional dependence measures) in Costa et al. (2015), and a recent discussion of state-space methods used to study dynamic networks with application to fMRI data in Solo (2016).
Acknowledgements
This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 and by the CAPES Foundation within the Ministry of Education, Brazil (grant BEX0706/10-8). We are grateful to Janine Bijsterbosch and Sonia Bishop for the fMRI data.
Appendix A. Short description and some references for all methods used in this work
- The Multiregression Dynamic Model (MDM):
It is a class of multivariate time series models which embeds putative causal hypotheses among its variables over time Queen and Smith (1993), Queen and Albers (2009), Costa et al. (2015); a minimal filtering sketch for a single node is given after this list;
- The Multiregression Dynamic Model-Integer Programming Algorithm (MDM-IPA):
It is an efficient search-and-score method that provides a search over the large class of networks based on an integer programming algorithm Bartlett and Cussens (2013), Costa et al. (2015);
- The Multiregression Dynamic Model-Directed Graph Model (MDM-DGM):
It is also a network-learning algorithm, but it searches for graphical structures without the constraint that they be DAGs (Costa et al., 2017);
- The Virtual-typical-subject (VTS) Approach:
It ignores inter-subject variability, assuming that the information from the different datasets comes from the same subject Zheng and Rajapakse (2006), Rajapakse and Zhou (2007), Li et al. (2008);
- The Common-structure (CS) Approach:
It considers the same network structure but allows the parameters to differ between subjects Ramsey et al. (2010), Li et al. (2008);
- The Individual-structure (IS) Approach:
It runs the network-learning process individually on each dataset and then pools the results into a single network Mechelli et al. (2002), Li et al. (2008), Oates (2013);
- The Group-structure (GS) Approach:
It studies group homogeneity through cluster analysis, considering a particular measure of similarity between subjects Kherif et al. (2004), Gates (2012);
- The Individual Estimation of Multiple Networks (IEMN) Approach:
It is basically the GS approach, i.e. the individual networks are estimated independently, whilst the networks of subgroups made up of homogeneous subjects are estimated by assuming the same graphical structure for all individuals within a subgroup Kherif et al. (2004), Gates (2012);
- The Marginal Estimation of Multiple Networks (MEMN) Approach:
It searches both individual and subgroup networks considering a distance between homogeneous subjects Oates (2013), Oates et al. (2014a), Oates et al. (2014b).
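As flagged in the MDM entry above, the following is a minimal, hypothetical filtering sketch for a single MDM node conditional on a candidate parent set. It treats the observational variance as known and uses a single discount factor, which simplifies the full treatment in Queen and Smith (1993) and Costa et al. (2015), where the variance is also learnt and the one-step forecasts are Student-t rather than Gaussian. The sum of log one-step forecast densities plays the role of the node's score.

```python
import numpy as np
from scipy import stats

def mdm_node_log_score(y, X, delta=0.95, obs_var=1.0):
    """Log predictive likelihood of one MDM node given its parents.

    y : (T,) series of the node; X : (T, p) regressors built from the
    parents' series (typically with a column of ones for the intercept).
    The regression coefficients follow a random walk whose variance is set
    implicitly by the discount factor `delta`.
    """
    T, p = X.shape
    m = np.zeros(p)               # prior mean of the dynamic coefficients
    C = 100.0 * np.eye(p)         # vague prior covariance
    log_score = 0.0
    for t in range(T):
        F = X[t]
        R = C / delta                         # discounted prior covariance at time t
        f = F @ m                             # one-step forecast mean
        q = F @ R @ F + obs_var               # one-step forecast variance
        log_score += stats.norm.logpdf(y[t], loc=f, scale=np.sqrt(q))
        A = R @ F / q                         # adaptive (Kalman) gain
        m = m + A * (y[t] - f)                # posterior mean update
        C = R - np.outer(A, A) * q            # posterior covariance update
    return log_score

# A candidate parent set is scored by, e.g.:
#   X = np.column_stack([np.ones(T), series[:, parent_1], series[:, parent_2]])
#   score = mdm_node_log_score(series[:, child], X)
# and the score of a whole network is the sum of its node scores.
```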
Appendix B. Individual graphical structures for the MEMN
(8)
In Eq. (8), the prior term is proportional to a constant because we are assuming that a priori all subgroup network structures are equally probable.
Appendix C. Subgroup graphical structures for the MEMN
Again, the prior is taken to be proportional to a constant.
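Both appendices reduce to posteriors that are proportional to the corresponding marginal likelihoods once the prior over structures is flat. A small sketch of the resulting normalisation, assuming the log marginal likelihood scores of the candidate structures have already been computed, is:

```python
import numpy as np

def structure_posterior(log_scores):
    """Posterior probabilities over candidate graphical structures under a
    flat prior, computed from their log marginal likelihood scores with the
    log-sum-exp trick for numerical stability."""
    log_scores = np.asarray(log_scores, dtype=float)
    shifted = log_scores - log_scores.max()
    weights = np.exp(shifted)
    return weights / weights.sum()

# e.g. structure_posterior([-105.2, -103.9, -110.4])
# -> three probabilities summing to one, largest for the second structure
```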
References
- Achterberg T. TU Berlin; 2007. Constraint Integer Programming. (Ph.D. thesis)
- Bartlett M., Cussens J. Advances in Bayesian network learning using integer programming. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence; 2013. arXiv preprint arXiv:1309.6825.
- Benjamini Y., Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 1995;57(1):289–300.
- Bijsterbosch J., Smith S., Bishop S.J. Functional connectivity under anticipation of shock: Correlates of trait anxious affect versus induced anxiety. J. Cogn. Neurosci. 2015. doi: 10.1162/jocn_a_00825.
- Costa L. University of Warwick; 2014. Studying Effective Brain Connectivity using Multiregression Dynamic Models. (Diss.)
- Costa L., Nichols T., Smith J. Studying the effective brain connectivity using multiregression dynamic model-directed graphical models. Braz. J. Probab. Stat. 2017;31(4):765–800.
- Costa L., Smith J., Nichols T., Cussens J., Duff E.P., Makin T.R. Searching multiregression dynamic models of resting-state fMRI networks using integer programming. Bayesian Anal. 2015;10(2):441–478.
- Cussens J. Bayesian network learning with cutting planes. In: Cozman F.G., Pfeffer A., editors. Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011). AUAI Press; Barcelona: 2011. pp. 153–160.
- Everitt B., Landau S., Leese M., Stahl D. fifth ed. Wiley; Chichester: 2011. Cluster Analysis.
- Friston K.J. Functional and effective connectivity: a review. Brain Connect. 2011;1(1):13–36. doi: 10.1089/brain.2011.0008.
- Gates K.M. Identifying subgroups using fMRI connectivity maps. In: Paper presented at the Annual Meeting of the Society for Neuroscience, New Orleans; 2012.
- Gates K.M., Molenaar P.C.M. Group search algorithm recovers effective connectivity maps for individuals in homogeneous and heterogeneous samples. NeuroImage. 2012;63:310–319. doi: 10.1016/j.neuroimage.2012.06.026.
- Gonçalves M.S., Hall D.A., Johnsrude I.S., Haggard M.P. Can meaningful effective connectivities be obtained between auditory cortical regions? NeuroImage. 2001;14:1353–1360. doi: 10.1006/nimg.2001.0954.
- Griffanti L., Salimi-Khorshidi G., Beckmann C.F., Auerbach E.J., Douaud G., Sexton C.E., Zsoldos E., Ebmeier K.P., Filippini N., Mackay C.E., Moeller S. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage. 2014;95:232–247. doi: 10.1016/j.neuroimage.2014.03.034.
- Harbord R., Costa L., Smith J.Q., Bijsterbosch J., Bishop S., Nichols T.E. Scaling up directed graphical models for resting-state fMRI with stepwise regression. In: The Organization for Human Brain Mapping (OHBM), Geneva, 26–30 June; 2016.
- Jeffreys H. third ed. Oxford University Press; London: 1961. Theory of Probability.
- Kherif F., Poline J.-B., Mériaux S., Benali H., Flandin G., Brett M. Group analysis in functional neuroimaging: selecting subjects using similarity measures. NeuroImage. 2004;20(4):2197–2208. doi: 10.1016/j.neuroimage.2003.08.018.
- Korb K.B., Nicholson A.E. Chapman & Hall/CRC; Boca Raton: 2004. Bayesian Artificial Intelligence. Computer Science and Data Analysis.
- Langfelder P., Zhang B., Horvath S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics. 2008;24(5):719–720. doi: 10.1093/bioinformatics/btm563.
- Li J., Wang Z.J., Palmer S.J., McKeown M.J. Dynamic Bayesian network modeling of fMRI: a comparison of group-analysis methods. NeuroImage. 2008;41(2):398–407. doi: 10.1016/j.neuroimage.2008.01.068.
- Mechelli A., Penny W.D., Price C.J., Gitelman D.R., Friston K.J. Effective connectivity and intersubject variability: using a multisubject network to test differences and commonalities. NeuroImage. 2002;17(3):1459–1469. doi: 10.1006/nimg.2002.1231.
- Oates C.J. The University of Warwick; 2013. Bayesian Inference for Protein Signalling Networks. (Ph.D. thesis, Chapter 4)
- Oates C.J., Costa L., Nichols T.E. Toward a multisubject analysis of neural connectivity. Neural Comput. 2014. doi: 10.1162/NECO_a_00690.
- Oates C.J., Smith J.Q., Mukherjee S., Cussens J. Exact estimation of multiple directed acyclic graphs. 2014. arXiv preprint arXiv:1404.1238.
- Pearl J. Cambridge University Press; Cambridge: 2000. Causality: Models, Reasoning, and Inference.
- Poldrack R.A., Mumford J.A., Nichols T.E. Cambridge University Press; 2011. Handbook of FMRI Data Analysis.
- Queen C.M., Albers C.J. Intervention and causality: forecasting traffic flows using a dynamic Bayesian network. J. Amer. Statist. Assoc. 2009;104(486):669–681.
- Queen C.M., Smith J.Q. Multiregression dynamic models. J. R. Stat. Soc. Ser. B Stat. Methodol. 1993;55:849–870.
- Raichle M.E. Two views of brain function. Trends Cogn. Sci. 2010;14(4):180–190. doi: 10.1016/j.tics.2010.01.008.
- Rajapakse J.C., Zhou J. Learning effective brain connectivity with dynamic Bayesian networks. NeuroImage. 2007;37:749–760. doi: 10.1016/j.neuroimage.2007.06.003.
- Ramsey J.D., Hanson S.J., Hanson C., Halchenko Y.O., Poldrack R.A., Glymour C. Six problems for causal inference from fMRI. NeuroImage. 2010;49:1545–1558. doi: 10.1016/j.neuroimage.2009.08.065.
- Salimi-Khorshidi G., Douaud G., Beckmann C.F., Glasser M.F., Griffanti L., Smith S.M. Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers. NeuroImage. 2014;90:449–468. doi: 10.1016/j.neuroimage.2013.11.046.
- Smith S.M., Miller K.L., Salimi-Khorshidi G., Webster M., Beckmann C., Nichols T., Ramsey J., Woolrich M. Network modeling methods for FMRI. NeuroImage. 2011;54(2):875–891. doi: 10.1016/j.neuroimage.2010.08.063.
- Solo V. State-space analysis of Granger–Geweke causality measures with application to fMRI. Neural Comput. 2016;28(5):914–949. doi: 10.1162/NECO_a_00828.
- Spirtes P., Glymour C.N., Scheines R. second ed. MIT Press; Cambridge, MA: 2000. Causation, Prediction, and Search.
- Sporns O. MIT Press; 2011. Networks of the Brain.
- Sugihara G., Kaminaga T., Sugishita M. Interindividual uniformity and variety of the "Writing center": a functional MRI study. NeuroImage. 2006;32:1837–1849. doi: 10.1016/j.neuroimage.2006.05.035.
- West M., Harrison P.J. second ed. Springer-Verlag; New York: 1997. Bayesian Forecasting and Dynamic Models.
- Zheng X., Rajapakse J.C. Learning functional structure from fMR images. NeuroImage. 2006;31(4):1601–1613. doi: 10.1016/j.neuroimage.2006.01.031.
Further Reading
- McGonigle D.J., Howseman A.M., Athwal B.S., Friston K.J., Frackowiak R.S., Holmes A.P. Variability in fMRI: an examination of intersession differences. NeuroImage. 2000;11:708–734. doi: 10.1006/nimg.2000.0562.
- Petris G., Petrone S., Campagnoli P. Springer; New York: 2009. Dynamic Linear Models with R.










