Author manuscript; available in PMC: 2016 Apr 1.
Published in final edited form as: Comput Med Imaging Graph. 2014 May 10;41:117–125. doi: 10.1016/j.compmedimag.2014.05.001

Examining the multifactorial nature of a cognitive process using Bayesian brain-behavior modeling

Running head: Bayesian brain-behavior modeling

Rong Chen 1,*, Edward H Herskovits 2
PMCID: PMC4226745  NIHMSID: NIHMS600398  PMID: 24880892

Abstract

Establishing relationships among brain structures and cognitive functions is a central task in cognitive neuroscience. Existing methods to establish associations among a set of function variables and a set of brain regions, such as dissociation logic and conjunction analysis, are hypothesis-driven. We propose a new data-driven approach to structure-function association analysis. We validated it by analyzing a simulated atrophy study. We applied the proposed method to a study of aging and dementia. We found that the most significant dementia-related and age-related volume reductions were in the hippocampal formation and the supramarginal gyrus, respectively. These findings suggest a multi-component brain-aging model.

Keywords: structure-function association, Bayesian analysis, cognitive process, aging and dementia

1. Introduction

A central task in cognitive neuroscience is the establishment of relationships among brain structures and neural activity on the one hand, and cognitive functions or processes on the other. Such relationships are referred to as brain-behavior or structure-function associations. Delineating brain-behavior associations is of fundamental importance to understanding the neural bases of cognition. The classic examples of brain-behavior association involve the neural bases of language [1]. Broca and Wernicke examined patients whose brain damage caused difficulty with language (aphasia). Broca reported a lack of language production in patients with damage to the posterior and inferior regions of the left frontal lobe (Broca’s area); and Wernicke reported that patients with damage to the posterior/superior aspect of the left temporal lobe (Wernicke’s area) could no longer comprehend language. Putting these two studies together, we have a brain-behavior association involving two brain regions (Broca’s area and Wernicke’s area) and two cognitive processes (producing language and comprehending language). Broca’s area is specialized for producing language, whereas Wernicke’s area is specialized for comprehending language.

In this manuscript, we focus on the problem of establishing associations among a set of function variables (denoted by F) and a set of brain regions (denoted by R). Two examples of such an analysis are dissociation analysis [2–4] and conjunction analysis [5–8]. Both inferential methods have played important roles in cognitive neuroscience.

One of the fundamental approaches to the demonstration of structure-function associations in lesion-based brain studies is to establish associations and dissociations [3, 4]. An association pattern is established when damage in brain region A disrupts the performance of task α. A dissociation pattern is established when damage in brain region A disrupts the performance of task α but not that of task β; in this case, region A and task β are dissociated. A more rigorous approach to the demonstration of brain-behavior associations is to establish a double-dissociation model [2]. A double-dissociation pattern is established when damage in brain region A impairs task α but not task β, whereas damage in region B impairs task β but not task α. In a double-dissociation analysis, investigators model interactions among two function variables, Fα and Fβ, and two brain structures, RA and RB. The observation of double dissociation provides evidence of functionally distinct neural systems, which is central to the delineation of underlying mechanisms.

In conjunction analysis, the central question is “which brain regions are damaged in all tasks?”. For example, in a study involving two function variables where Fα represents the language production deficit and Fβ represents the language comprehension deficit, our goal is to find brain regions that are damaged in patients with language production deficit and language comprehension deficit.

Existing methods for the elucidation of brain-behavior associations have two major limitations. First, these methods are confirmatory; that is, they are designed to confirm a particular hypothesis, rather than compare models. Davies recognized this limitation when he said “None of this be allowed to suggest that double dissociation rules out the possibility of any kind of alternative explanation” [9]. For example, consider a study with model M1, in which damage in brain region A impairs task α but not task β, whereas damage in region B impairs task β but not α. This model may reasonably explain the data observed in this study; however, this analysis ignores the possibility that there exist one or more additional models that offer more plausible explanations of these data.

The second major limitation of existing methods is their focus on functional specialization. Functional specialization and functional integration are two fundamental principles of brain functional organization [10]. Functional specialization posits that a brain region is specialized for some aspect of a cognitive process, whereas functional integration emphasizes interactions among brain regions. Existing approaches do not consider regional interactions and therefore cannot detect functional-integration patterns.

We propose a new method, Bayesian brain-behavior modeling (BBM), for the delineation of brain-behavior associations. BBM directly models the joint-probability distribution among brain regions and task variables, and directly addresses the two major limitations of existing methods for brain-behavior associations. First, BBM has a model-comparison mechanism: from a set of candidate theories (the model space), BBM searches for the model that is most consistent with the observed data. Second, BBM generates a network model, which provides a natural framework within which to explore functional integration: brain regions are modeled as nodes, and interactions are modeled as edges between nodes.

As a data-driven approach, BBM accepts, but does not require, the specification of a prior model. BBM can reveal arbitrary interactions among brain regions and function variables, not limited to the dissociation pattern or conjunction pattern.

2. Methods

The overarching goal of BBM is to delineate structure-function associations. Let RA denote a feature associated with brain region A, such as the presence of activation in a functional MR (fMR) experiment, or the presence of a lesion or morphological feature manifest on structural MR. Let Fα denote the functional assessment of a process α (e.g., performance on a particular task); Fα could also represent the presence or absence of a disorder, such as Alzheimer’s disease. In this framework, we model a structure-function association as an association between {RA} and {Fα}.

2.1 Background: Bayesian networks

BBM is based on Bayesian-network models [11]. A Bayesian network is a probabilistic graphical model that specifies a joint probability distribution over a set of variables V = {X1, X2, …, Xn}. A Bayesian network B consists of two components: a structure S, and parameters Θ; i.e., B = (S, Θ). The structure of the Bayesian network describes the probabilistic associations among the variables, and is represented as a directed acyclic graph. Nodes in this graph represent variables of interest, such as brain regions or clinical variables. A directed edge in this graph from a variable Xi to Xj, written as Xi → Xj, indicates that the variables Xi and Xj are associated, that Xi is a parent node of Xj, and that Xj is a child node of Xi. Each variable in a Bayesian network is associated with a conditional-probability distribution Pr(Xi | pa(Xi)), where pa(Xi) represents the parent set of Xi. A discrete Bayesian network can represent any joint distribution over these discrete variables [11].

Figure 1 shows a simple example of a Bayesian network. There are four variables in this Bayesian network: two brain regions (RA and RB) and two function variables (Fα and Fβ). This Bayesian network represents a scenario in which Fα is probabilistically determined by the state of region RA, and Fβ is probabilistically determined by that of RB; therefore, in this network, pa(Fα) = {RA} and pa(Fβ) = {RB}.

Figure 1. A Bayesian network representing double dissociation in a neuropsychology study.

When variables are categorical, Pr(Xi | pa(Xi)) can be represented as a conditional probability table. Let θijk = Pr(Xi = k | pa(Xi) = j) be the conditional probability of Xi assuming state k given that its parents, pa(Xi), assume joint state j. If Xi does not have parents, then θijk is the marginal probability distribution of Xi. Θ ={θijk} represents the parameters of a Bayesian network, from which the joint distribution over all variables can be computed. In Figure 1, the conditional probability Pr(Fα = abnormal | RA = damaged) = 0.8 means that the probability of a subject’s having abnormal Fα is 0.8 when region RA is damaged.

A critical notion in Bayesian network modeling is that of a Markov blanket. In probabilistic terms, the Markov blanket of node X, denoted by mb(X), is the minimum set of variables that renders X conditionally independent of all other variables in the Bayesian network. In the context of predicting the state of X based on knowledge of a subset of the variables in a Bayesian network, we achieve greatest accuracy when we know the states of the Markov blanket of X. That is, nodes in the Markov blanket of X are jointly most predictive of X.
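In a Bayesian network, the Markov blanket of a node consists of its parents, its children, and the other parents of its children. The following Python sketch reads a Markov blanket off a graph under that standard characterization; the dictionary-of-parent-sets representation and the function name `markov_blanket` are illustrative choices, not part of BBM as described here.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a DAG.

    parents: dict mapping every node to the set of its parent nodes.
    """
    # Children are the nodes that list `node` among their parents.
    children = {c for c, ps in parents.items() if node in ps}
    # Co-parents are the other parents of those children.
    co_parents = set()
    for c in children:
        co_parents |= set(parents[c])
    return (set(parents[node]) | children | co_parents) - {node}


# The double-dissociation network of Figure 1: RA -> Fa, RB -> Fb.
figure1 = {"RA": set(), "RB": set(), "Fa": {"RA"}, "Fb": {"RB"}}
mb_fa = markov_blanket(figure1, "Fa")   # contains only "RA"
```

Because function variables are treated as leaf nodes later in the text, their Markov blankets reduce to their parent sets in that setting.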

One of the advantages of the BN representation is its powerful inferential capability [11]. The inference task is to find the posterior distribution of a set of outcome variables, given values for evidence variables. For example, in Figure 1, we may be interested in the posterior probability that Fα is abnormal, given that RA is damaged. In this query, the outcome variable is Fα, and the evidence variable is RA. Standard BN inference algorithms [12] can efficiently answer such queries.
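As a minimal sketch of such a query, the Python snippet below encodes the four-node network of Figure 1 as conditional probability tables and answers queries by brute-force enumeration of the joint distribution. Only Pr(Fα = abnormal | RA = damaged) = 0.8 comes from the text; every other probability, and all function and variable names, are placeholders assumed for illustration. Enumeration is practical only for very small networks and stands in for the standard inference algorithms cited above.

```python
from itertools import product

# States: 1 = damaged/abnormal, 0 = intact/normal.
# Only P(Fa=1 | RA=1) = 0.8 is taken from the text; the rest are assumed.
p_RA = {1: 0.3, 0: 0.7}
p_RB = {1: 0.4, 0: 0.6}
p_Fa_given_RA = {1: 0.8, 0: 0.1}   # P(Fa = 1 | RA)
p_Fb_given_RB = {1: 0.7, 0: 0.2}   # P(Fb = 1 | RB)


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(ra, rb, fa, fb):
    """Joint probability as the product of each node's CPT entry."""
    return (p_RA[ra] * p_RB[rb]
            * bernoulli(p_Fa_given_RA[ra], fa)
            * bernoulli(p_Fb_given_RB[rb], fb))


def posterior(query, evidence):
    """P(query | evidence); both arguments are dicts of name -> state."""
    names = ("RA", "RB", "Fa", "Fb")
    num = den = 0.0
    for values in product((0, 1), repeat=4):
        state = dict(zip(names, values))
        if any(state[k] != v for k, v in evidence.items()):
            continue                      # inconsistent with the evidence
        p = joint(*values)
        den += p
        if all(state[k] == v for k, v in query.items()):
            num += p
    return num / den


p_abnormal = posterior({"Fa": 1}, {"RA": 1})   # recovers the CPT entry, about 0.8
```

The same function answers diagnostic queries, such as `posterior({"RA": 1}, {"Fa": 1})`, which reverse the direction of the edge.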

2.2 Data preprocessing

The input to our approach consists of subjects’ image volumes, I = {I1, …, Im}, along with values for two function variables {{Fα1, Fβ1}, …, {Fαm, Fβm}}, where m is the number of subjects. Fα and Fβ are binary variables.

The goal of data preprocessing is to extract features that represent regional states. A brain atlas defines a set of structures in a canonical coordinate system. Let Ri be a binary variable that represents the state of structure i, i.e., normal or abnormal. Our goal is the inference of Ri based on I.

For morphometric studies, we can use an image-processing pipeline similar to that described in [13] to obtain regional volumes. This pipeline consists of four steps: skull stripping, segmentation, spatial normalization, and RAVENS analysis [14], the last of which yields a voxel-wise volumetric map. This image-processing pipeline yields regional volumes for each structure defined on a brain template (corrected for intracranial volume). For each atlas structure, we apply a threshold to convert its normalized volume into a binary variable: if a subject’s structure’s volume is less than a threshold, such as the sample median (i.e., if there is sufficient volume loss), we label it ‘abnormal’ (i.e., atrophic); otherwise, we label it ‘normal’. This pipeline is used in [13, 15].

Similarly, for lesion-deficit studies, our data-preprocessing pipeline includes three steps. First, we delineate abnormal brain voxels based on MR or CT findings, either manually or with automatic segmentation software. We refer to the delineated abnormal brain region as the subject’s lesion map. In a subject’s lesion map, if a voxel is lesioned, it is labeled as ‘1’; otherwise, it is labeled as ‘0’. In the second step, we register each subject's lesion map to a brain template; for this task, we co-register each subject’s image volume to the atlas using a mutual-information maximization algorithm [16]. We then apply the normalization parameters derived during the registration process to that subject’s lesion map. This step yields a lesion map for each subject, defined in the template space. In the third step, we infer Ri. For a given brain region i, we define the lesion fraction Li as

$$L_i = \frac{N_{\mathrm{lesion}}}{N_{\mathrm{total}}} \tag{1}$$

where Nlesion and Ntotal are the number of abnormal voxels and the total number of voxels in region i, respectively. We infer Ri based on {L1, …, Lm}. For a threshold th, if Li < th, we set Ri = 0; otherwise, we set Ri = 1. There are two commonly used methods for determining this threshold: we can choose the threshold based on experts’ knowledge, or we set the threshold as the sample mean or median.
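The lesion-fraction step can be sketched in a few lines of Python. This is a toy illustration that treats a lesion map and a region mask as flat lists of 0/1 voxel labels; the function names and the example threshold of 0.3 are assumptions made for the sketch, not a description of the authors' software.

```python
def lesion_fraction(lesion_map, region_mask):
    """Equation (1): fraction of a region's voxels that are lesioned.

    lesion_map:  0/1 per voxel, 1 = lesioned.
    region_mask: 0/1 per voxel, 1 = voxel belongs to region i.
    """
    n_total = sum(region_mask)
    n_lesion = sum(l for l, m in zip(lesion_map, region_mask) if m)
    return n_lesion / n_total


def region_state(fraction, th):
    """R_i = 0 if L_i < th, otherwise R_i = 1."""
    return 0 if fraction < th else 1


lesion_map = [1, 1, 0, 0, 1, 0]
region_mask = [1, 1, 1, 1, 0, 0]        # the region covers the first four voxels
L_i = lesion_fraction(lesion_map, region_mask)   # 2 of 4 voxels lesioned
R_i = region_state(L_i, th=0.3)                   # th = 0.3 is an example value
```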

2.3 The BBM Algorithm

2.3.1 Model generation

Given a set of regions {Ri}, and a set of function variables {Fα}, BBM generates a BN that models probabilistic associations among these variables. We use the K2 score [17] for model generation; this metric quantifies how well a given BN structure fits the observed data D, and is widely used to guide the generation of BN models from data. The K2 score is the marginal likelihood Pr(D | S):

$$K2(D, S) = \Pr(D \mid S) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!, \tag{2}$$

where S is the structure, ri is the number of states of the ith variable, qi is the number of joint states of the parents of the ith variable, Nijk is the number of samples in D for which the ith variable assumes its kth state and its parents assume their jth joint state, and Nij = Σk Nijk.

An important characteristic of the K2 score is decomposability [17]: for a single variable Xi, the K2 score is defined as

$$K2(X_i, D, S) = \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!. \tag{3}$$

Decomposability allows us to determine the optimal parent set for each variable individually, as opposed to optimizing the entire network structure at once. In this way, decomposability can greatly simplify the network-selection process.
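The per-variable score of Equation (3) can be sketched in Python as follows. The sketch works in log space (via `math.lgamma`, using log (n − 1)! = lgamma(n)) because the factorials overflow quickly; that choice, the list-of-dicts data layout, and the function name `log_k2_local` are assumptions made for illustration. Parent configurations that never occur in the data contribute a factor of 1, so only observed configurations are visited.

```python
from collections import Counter
from math import lgamma


def log_k2_local(data, child, parent_names, r_child):
    """Log of Equation (3) for one variable.

    data:         list of dicts mapping variable name -> observed state.
    child:        name of the variable X_i being scored.
    parent_names: list of names forming the candidate parent set.
    r_child:      number of states r_i of the child.
    """
    n_ijk = Counter()   # counts per (parent configuration j, child state k)
    n_ij = Counter()    # counts per parent configuration j
    for row in data:
        j = tuple(row[p] for p in parent_names)
        n_ijk[(j, row[child])] += 1
        n_ij[j] += 1

    score = 0.0
    for n in n_ij.values():
        # log (r_i - 1)!  -  log (N_ij + r_i - 1)!
        score += lgamma(r_child) - lgamma(n + r_child)
    for count in n_ijk.values():
        score += lgamma(count + 1)          # log N_ijk!
    return score


# Toy data in which F always equals R: the parent set {R} should score higher.
toy = [{"R": s, "F": s} for s in (0, 1) for _ in range(5)]
with_parent = log_k2_local(toy, "F", ["R"], r_child=2)
without_parent = log_k2_local(toy, "F", [], r_child=2)
```

By decomposability, the log of the network score in Equation (2) is the sum of these local terms over all variables.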

In addition to defining a metric that measures how well a structure-function model fits a particular data set, we require a mechanism for selecting candidate models, which the algorithm will compare to maximize the metric. BBM uses a Markov Chain Monte Carlo (MCMC) method, called Gibbs sampling, to select candidate models. This algorithm is shown in Figure 2. In BBM, we assume that {Fα} (the function variables) are leaf nodes. That is, they don’t have children. In the Gibbs sampling algorithm, for each variable, the conditional probability Pr(pa(k) | S(k), D) is calculated based on the K2 score, where S(k) represents the BN structure excluding parent nodes for node k, and pa(k) is the parent set of node k. Let Si denote the BN structure in iteration i. For each node k, we consider structures obtained by the addition of one parent node for node k, deletion of one parent node for node k, or no change. We calculate the conditional probability for each candidate structure, Pr(pa(k) | S(k), D), as follows:

$$\Pr(pa(k) \mid S(k), D) = \frac{1}{c} \Pr(D \mid pa(k), S(k)) \, \Pr(pa(k)), \tag{4}$$

where c is a normalization constant, and Pr(D | pa(k), S(k)) is the K2 score. Based on a non-informative prior distribution, Pr(pa(k)) is a uniform distribution over the edge space.
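One update of this sampler for a single node can be sketched as below, assuming a uniform prior so that Pr(pa(k)) cancels in Equation (4). The function names, the `log_score` callback (for example, a log K2 score), and the `allowed` argument are hypothetical; in particular, the sketch assumes the caller has already restricted `allowed` to parents that keep the graph acyclic and keep function variables as leaf nodes.

```python
import math
import random


def candidate_parent_sets(current, allowed):
    """The current parent set plus every set one addition or deletion away."""
    current = frozenset(current)
    cands = [current]                                            # no change
    cands += [current | {p} for p in allowed if p not in current]  # add one
    cands += [current - {p} for p in current]                    # delete one
    return cands


def sample_parent_set(cands, log_score, rng=random):
    """Draw one candidate with probability proportional to exp(log_score)."""
    logs = [log_score(c) for c in cands]
    m = max(logs)                                  # subtract max for stability
    weights = [math.exp(v - m) for v in logs]
    total = sum(weights)                           # plays the role of c
    probs = [w / total for w in weights]
    choice = rng.choices(cands, weights=probs, k=1)[0]
    return choice, probs


cands = candidate_parent_sets({"R1"}, ["R1", "R2", "R3"])
chosen, probs = sample_parent_set(cands, lambda c: 0.0, random.Random(0))
```

Sweeping this update over all nodes and repeating for many iterations yields a sequence of structures from which the posterior over structures can be approximated.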

Figure 2. Gibbs sampling for generating Bayesian networks.

Relative to greedy-search methods such as forward selection, the advantages of Gibbs sampling are twofold: it does not require a predefined node ordering in order to search for candidate models, and it can provide the posterior distribution of a BN structure, Pr(S | D). However, Gibbs sampling does so at the expense of much greater computational resources (memory and computation time) than greedy search.

Given a BN structure, the maximum-likelihood estimate of Θ [18] is

$$\theta_{ijk} = \frac{N_{ijk}}{N_{ij}}. \tag{5}$$

2.3.2 Inference

Given the generated BN that describes interactions among brain regions and function variables, investigators are often interested in the separation of function variables {Fα}; that is, they want to know whether Fα and Fβ are associated with distinct brain regions, or conversely whether there are brain regions associated with both Fα and Fβ. To make these determinations, we examine the Markov blankets of function variables, because nodes in the Markov blanket of X are jointly most predictive of X. Therefore, mb(X) is a promising set of biomarkers for X. In particular, we can compare mb(Fα) and mb(Fβ) to determine whether Fα and Fβ are associated with distinct brain regions. One metric that is often used to assess the overlap between two sets, such as mb(Fα) and mb(Fβ), is the Dice coefficient [19], which is defined as

$$\frac{2\,\Omega\big(mb(F_\alpha) \cap mb(F_\beta)\big)}{\Omega\big(mb(F_\alpha)\big) + \Omega\big(mb(F_\beta)\big)},$$

where Ω is an operator that calculates the cardinality of a set. The Dice coefficient ranges from 0 to 1, the former indicating two disjoint sets, and the latter indicating identity.
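For concreteness, the Dice coefficient of two Markov blankets can be computed as in the short Python sketch below. The function name `dice` and the convention of returning 1 when both sets are empty are assumptions; the text does not say how that case is handled.

```python
def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0   # assumed convention: two empty sets count as identical
    return 2 * len(a & b) / (len(a) + len(b))


separate = dice({"smg"}, {"hippo"})          # disjoint blankets
identical = dice({"hippo"}, {"hippo"})       # same blanket
partial = dice({"smg", "spl"}, {"smg"})      # one shared region out of three
```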

Another way to compare Fα with Fβ is to examine whether biomarkers of Fα are predictive of Fβ, and whether biomarkers of Fβ are predictive of Fα [20]. If Fα and Fβ were separate processes, we would expect that biomarkers of Fα would not be predictive of Fβ, and that biomarkers of Fβ would not be predictive of Fα. Let acc(Fα, mb(Fα)) represent the prediction accuracy of predicting Fα based on mb(Fα). We can calculate four prediction accuracies: acc(Fα, mb(Fα)), acc(Fα, mb(Fβ)), acc(Fβ, mb(Fα)), acc(Fβ, mb(Fβ)). If Fα and Fβ are separate processes, acc(Fα, mb(Fα)) and acc(Fβ, mb(Fβ)) should be high, and acc(Fα, mb(Fβ)) and acc(Fβ, mb(Fα)) should be low.
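The four accuracies can be illustrated with the Python sketch below. The text does not specify the classifier or how accuracy is estimated, so the sketch assumes the simplest option: a majority-vote lookup table over the joint states of the biomarker set, scored on the same data it was built from (resubstitution). The function name `accuracy` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def accuracy(data, target, biomarkers):
    """Resubstitution accuracy of a majority-vote lookup table.

    data:       list of dicts mapping variable name -> state.
    target:     function variable to predict.
    biomarkers: list of region variables used as predictors.
    """
    table = defaultdict(Counter)
    for row in data:
        key = tuple(row[b] for b in biomarkers)
        table[key][row[target]] += 1
    # Predicting the most frequent target state per configuration gets
    # exactly that state's count correct.
    correct = sum(max(c.values()) for c in table.values())
    return correct / len(data)


# Toy double dissociation: Fa copies RA, Fb copies RB, RA and RB independent.
toy = [{"RA": a, "RB": b, "Fa": a, "Fb": b} for a in (0, 1) for b in (0, 1)]
acc = {(f, r): accuracy(toy, f, [r]) for f in ("Fa", "Fb") for r in ("RA", "RB")}
```

In this toy case the matched pairs reach accuracy 1 and the crossed pairs stay at chance (0.5), which is the pattern expected of separate processes.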

3. Experimental Results

We present two experiments in this section: Experiment 1 evaluates BBM using simulated data. In Experiment 2, we apply this approach to a study of aging and dementia.

3.1 Simulated data: lesion-deficit analysis

In this experiment, we evaluated BBM using simulated data, for which we have the ground truth. In particular, we synthesized brain-behavior association data sets, each of which consisted of several brain regions and two function variables. Figure 3 shows the structures of these ground-truth BNs. The Bayesian network in Figure 3 (a) represents a classic double-dissociation pattern. The BN in Figure 3 (b) models a scenario in which two function variables are directly dependent on two different regions (R5 → Fα and R6 → Fβ). Note that there are some higher-order interactions in this model; for example, both R5 and R6 are dependent on R4. In Figure 3 (c), there are five structure variables and two function variables, and one brain structure (R5) is directly associated with both function variables. Therefore, Fα and Fβ are not directly associated with separate neuronal systems. Previous research has suggested that brain networks have the small-world property [21]: brain regions tend to have low average path lengths. Following this principle, we generated the BN in Figure 3 (d), which represents a complicated brain network involving 48 brain regions.

Figure 3. The ground-truth Bayesian networks for simulated data. (a) Canonical double dissociation. (b) Two function variables are directly dependent on two different brain regions. (c) One region (R5) is directly associated with both function variables. (d) A complicated network comprising 48 brain regions.

The networks in Figure 3 represent a broad spectrum of brain-behavior associations, with which we could determine BBM’s accuracy and scalability. The key component of BBM is a BN-induction algorithm; such algorithms scale well with the number of samples, and have been shown to scale to millions of samples [17, 22]. The BNs in Figure 3 include a model with a small number of variables (Figure 3 (a)), two models of medium complexity (Figure 3 (b) and Figure 3 (c)), and a model with high complexity (Figure 3 (d)).

3.1.1 Lesion simulation

To make the simulation more realistic, we introduced simulated lesions into the data set. First, we chose several brain regions from the AAL template [23]. We then randomly sampled the ground-truth BN to generate 100 samples. For example, when we synthesized data based on the BN shown in Figure 3 (a), one of the samples we obtained was [BA22 = intact, BA39 = damaged, Fα = normal, Fβ = abnormal]; this nomenclature indicates that, for this sample, Brodmann area BA22 is intact, BA39 is damaged, Fα is normal, and Fβ is abnormal. To generate image data for this sample, we introduced lesions into voxels in BA39. That is, for signal level λ and noise level ν, voxels in BA39 were lesioned with probability λ, and voxels not in BA39 were lesioned with probability ν. We chose λ = 0.7 and ν = 0.1 for lesioned regions, and λ = 0.1 and ν = 0.1 for intact regions. We had previously applied this lesion-simulation procedure in [24].
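A toy version of this voxel-level simulation is sketched below in Python. It reads the stated settings as "voxels of a damaged region are lesioned with probability 0.7, all other voxels with probability 0.1"; that reading, the flat list of per-voxel region labels, the voxel counts, and the function name are all assumptions made for the sketch.

```python
import random


def simulate_lesion_map(region_labels, damaged_regions, lam=0.7, nu=0.1, rng=None):
    """Return a 0/1 lesion label per voxel.

    region_labels:   region name for each voxel.
    damaged_regions: regions sampled as 'damaged' from the ground-truth BN.
    lam, nu:         lesion probability inside / outside damaged regions.
    """
    if rng is None:
        rng = random.Random()
    return [1 if rng.random() < (lam if label in damaged_regions else nu) else 0
            for label in region_labels]


# One simulated subject: BA22 intact, BA39 damaged (2000 voxels each, assumed).
labels = ["BA22"] * 2000 + ["BA39"] * 2000
lesions = simulate_lesion_map(labels, {"BA39"}, rng=random.Random(0))
load_ba22 = sum(lesions[:2000]) / 2000     # expected to be near nu
load_ba39 = sum(lesions[2000:]) / 2000     # expected to be near lam
```

With these settings, the lesion load of a damaged region falls well above a 0.3 cut-off and that of an intact region well below it, so thresholding the simulated image recovers the sampled regional state.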

To threshold these images to obtain lesion maps, for a brain region i, if its lesion load for a particular subject was greater than 0.3 (i.e., if more than 30% of its volume was lesioned), we set Ri = lesioned; otherwise, we set Ri = intact. We submitted these binary lesion maps, along with the binary function variables, to BBM.

3.1.2 Results

Let Ψ denote the union of the Markov blankets of Fα and that of Fβ; that is, Ψ = {mb(Fα), mb(Fβ)}; then, the Dice metric indicating the similarity of the estimated Ψ and ground truth Ψ is

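$$\frac{1}{2}\left(\frac{2\,\Omega\big(\widehat{mb}(F_\alpha) \cap mb^{*}(F_\alpha)\big)}{\Omega\big(\widehat{mb}(F_\alpha)\big) + \Omega\big(mb^{*}(F_\alpha)\big)} + \frac{2\,\Omega\big(\widehat{mb}(F_\beta) \cap mb^{*}(F_\beta)\big)}{\Omega\big(\widehat{mb}(F_\beta)\big) + \Omega\big(mb^{*}(F_\beta)\big)}\right), \tag{6}$$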

where Ω is the operator that calculates the cardinality of a set, and mb^ and mb* are the estimated and ground-truth Markov blankets, respectively.

We assessed BBM’s ability to infer structure-function associations using the Dice metric as shown in Equation (6). In order to assess the stability of BBM, we randomly sampled each ground-truth BN in Figure 3 100 times, thereby generating 400 data sets, and we computed the median and the inter-quartile range [25] of the Dice metric across the 100 data sets for each network. The inter-quartile range is the distance between the 75th percentile and the 25th percentile. We used the median and inter-quartile range to describe the variability of the Dice metric because this metric does not follow a normal distribution.
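The two summary statistics can be computed with Python's standard library, as in the sketch below. Several conventions exist for estimating percentiles from a finite sample, and the text does not say which was used; the "inclusive" method chosen here is an assumption, as are the function name and the example values.

```python
from statistics import median, quantiles


def median_and_iqr(values):
    """Median and inter-quartile range (75th minus 25th percentile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


# Hypothetical Dice values from five simulated data sets.
dice_values = [0.5, 0.75, 1.0, 1.0, 1.0]
med, iqr = median_and_iqr(dice_values)
```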

Table 1 lists the median and inter-quartile range of the Dice metric for the simulated data. It is generally accepted that a Dice metric > 0.7 represents good agreement [26]. We found that the median Dice metric was very high: it was 1 (perfect concordance) for the models in Figure 3 (a), (b), and (c), and was 0.83 for the model in Figure 3 (d). Table 1 also shows that the inter-quartile range, which represents the variability of the Dice metric, was low. Based on these findings, we conclude that 1) BBM is able to infer structure-function associations accurately; 2) BBM is stable under sampling (low inter-quartile range); and 3) BBM scales well with the number of variables because it can detect complicated multivariate structure-function associations.

Table 1. Dice metrics for the simulated data.

Model        | Median | Inter-quartile range
Figure 3 (a) | 1      | 0
Figure 3 (b) | 1      | 0.08
Figure 3 (c) | 1      | 0.13
Figure 3 (d) | 0.83   | 0.16

3.2 Examining the multifactorial nature of aging effects on cognition

In Experiment 2a, our goal was to investigate age-related and dementia-related reductions in regional gray-matter volumes in the brain. Some researchers have suggested that Alzheimer’s disease (AD) is an exaggeration of normal aging, in which cognitive functioning declines along a continuum and dementia is an acceleration of the same process that causes normal cognitive decline [27]. However, recent research has supported a multifactorial perspective, in which age-related brain changes are distinct from pathological changes in AD [20, 28]. In this experiment, we determine whether the age-related pattern in regional gray-matter volume reduction is distinct from that found in people with dementia.

Head and colleagues performed a region-of-interest (ROI) based analysis on the image data of 100 subjects [28]. These elderly individuals were recruited from the registry of the Washington University Alzheimer Disease Research Center [28]. The T1-weighted high-resolution structural MR scan for each subject was acquired using a Siemens 1.5T Vision instrument (Erlangen, Germany) and a spoiled gradient-echo sequence (time to repetition 9.7 msec, time delay 4 msec, flip angle = 10°, time following inversion pulse 20 msec, trigger delay 200 msec). Each participant was assessed with the Washington University Clinical Dementia Rating (CDR) [29]; participants with CDR greater than or equal to 0.5 were classified as demented, while those with CDR = 0 were classified as nondemented. Subjects with CDR greater than or equal to 0.5 were classified as having dementia of Alzheimer’s type.

Head’s analysis was hypothesis-driven and centered on two brain structures: the hippocampus and corpus callosum. In order to examine the relationship between non-demented aging and dementia, they divided study subjects into subgroups, based on whether they were above or below 78 years, and their dementia status. That is, there were two function variables: Dementia and Aging. Dementia represented whether a subject was cognitively normal (CDR = 0) or demented (CDR ≥ 0.5), and Aging represented whether a subject’s age was above or below the sample median (78 years). Head et al. then performed a double-dissociation analysis for these two function variables and volumes of the hippocampus and corpus callosum. They found that dementia was associated with hippocampal volume changes, whereas normal aging was related to the volume of the anterior portion of the corpus callosum.

We re-analyzed the image and clinical data of these 100 subjects, to detect interactions among regional brain volumes, aging, and dementia of Alzheimer’s type, using BBM. The differences between our analysis and Head’s analysis were: 1) Our analysis was data-driven; and 2) Our analysis involved 21 bilateral structures defined on the MNI Jakob template, corrected for intracranial volume. These structures covered the entire brain.

3.2.1 BBM analysis

There were two function variables: Dementia and Aging. Dementia represented whether a subject was cognitively normal (CDR = 0) or demented (CDR ≥ 0.5), and Aging represented whether a subject’s age was above or below 78 years. The definitions of these function variables are the same as those used in [28].

Using the image-processing pipeline for morphometric studies described in Section 2.2, we obtained structural volumes for 21 bilateral structures defined on the MNI Jakob template, corrected for intracranial volume. Similar to Head’s analysis, we combined bilateral structures; these structures covered the entire brain. We thresholded each structure’s volume into a binary variable: if a subject's structure volume was less than the sample median across all subjects, we labeled it as ‘atrophic’; otherwise, we labeled it as ‘normal’.

We provided these 21 structure variables and two function variables as input to the BBM algorithm. BBM generated the Bayesian-network model shown in Figure 4. The result of primary interest is the union of the Markov blankets of the function variables, denoted by Ψ (i.e., the Markov blanket of Aging and that of Dementia). We had Ψ = mb(Aging) ∪ mb(Dementia) = {smg} ∪ {hippo}. Ψ is depicted in Figure 5. First, BBM results indicated that the two processes (Aging and Dementia) are directly related to two different neuronal systems. Dementia is primarily associated with the hippocampal formation: when the hippocampal formation demonstrates significant volume loss, that subject has a high probability of having dementia (Pr(Dementia = present | hippo = atrophy) = 0.77); in contrast, when the hippocampal formation does not demonstrate significant volume loss, that subject has a high probability of being cognitively normal (Pr(Dementia = absent | hippo = normal) = 0.77). In summary, this model indicates that dementia is characterized by volume reduction in the hippocampal formation. We see a similar pattern of probabilistic associations between the supramarginal gyrus and the aging process: volume loss in the supramarginal gyrus predicts that a subject will have age greater than 78 (Pr(Age ≥ 78 | smg = atrophy) = 0.69). In addition, we found that dementia is primarily associated with volume loss in structures in the medial temporal lobe, whereas the aging process is primarily related to structures in the frontal and parietal lobes.

Figure 4. A Bayesian-network model representing probabilistic associations among Alzheimer’s Disease (AD), Aging, and 21 brain structures. Left: the structure of the Bayesian network. Right: the conditional probability tables for Dementia and Aging. ag - angular gyrus, amyg - amygdala, cing - cingulate gyrus, ent - entorhinal cortex, hippo - hippocampus, ifg - inferior frontal gyrus, iog - inferior occipital gyrus, lfog - lateral frontal orbital gyrus, mefg - medial frontal gyrus, mifg - middle frontal gyrus, mfog - medial frontal orbital gyrus, mog - middle occipital gyrus, pc - perirhinal cortex, pcg - precentral gyrus, ph - parahippocampal gyrus, pocg - postcentral gyrus, sfg - superior frontal gyrus, smg - supramarginal gyrus, sog - superior occipital gyrus, spl - superior parietal lobule, th - thalamus.

Figure 5. The Markov blanket of Aging (the supramarginal gyrus, the green structure) and that of Dementia (the hippocampus, the blue structure), overlaid on the MNI template, in radiological convention.

We assessed the distinction between Aging and Dementia by examining whether biomarkers of Aging are predictive of Dementia, and whether biomarkers of Dementia are predictive of Aging. Table 2 lists the prediction accuracy of Aging and Dementia using different biomarker sets. We found that mb(Aging) (the biomarker set of Aging) was predictive of Aging with accuracy = 0.69, but not predictive of Dementia (accuracy = 0.50); mb(Dementia) was predictive of Dementia with accuracy = 0.78, but not predictive of Aging (accuracy = 0.51). These findings suggest that Aging and Dementia are separate processes.

Table 2. Prediction accuracies of Aging and Dementia based on different biomarker sets.

Biomarker set | Accuracy for Aging | Accuracy for Dementia
mb(Aging)     | 0.69               | 0.50
mb(Dementia)  | 0.51               | 0.78

We used resampling methods for model validation and stability assessment. In particular, for a data set D, we resampled it using either the jackknife or the bootstrap method, to obtain a new data set Dr. Then we generated a BN model from Dr. We collected the BN models generated from all resampled {Dr} to form a model ensemble. From this model ensemble we calculated frequencies and the mode of Ψ. We found that there were 4 different Ψ in the ensemble. The Markov blankets of the function variables generated using the original data correspond to the mode of the pattern ensemble (probability = 0.89). This demonstrated that the Markov blanket of the function variables generated from the original data is a stable pattern under data perturbation, and therefore is not likely to be due to a statistical artifact.
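The bootstrap variant of this stability check can be sketched in Python as follows. The callback `learn_pattern` is a hypothetical stand-in for running BBM on a resampled data set and returning Ψ; the number of resamples, the seed, and the function name are likewise assumptions, since the text does not report them.

```python
import random
from collections import Counter


def bootstrap_pattern_frequencies(data, learn_pattern, n_resamples=200, seed=0):
    """Relative frequency of each pattern Psi across bootstrap resamples.

    data:          list of subject records.
    learn_pattern: callable mapping a data set to an iterable of region names
                   (hypothetical stand-in for the BBM model-generation step).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_resamples):
        # Sample subjects with replacement, keeping the sample size fixed.
        resample = [rng.choice(data) for _ in data]
        counts[frozenset(learn_pattern(resample))] += 1
    return {pattern: c / n_resamples for pattern, c in counts.items()}


# Dummy learner that always returns the same pattern, to show the call shape.
subjects = [{"id": i} for i in range(100)]
freqs = bootstrap_pattern_frequencies(subjects, lambda d: {"smg", "hippo"})
mode_pattern = max(freqs, key=freqs.get)
```

The mode of the returned frequencies, and its relative frequency, correspond to the most common Ψ and its probability in the ensemble.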

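We investigated whether the pattern Ψ is stable under changes to the threshold that we applied to each structure’s volume. We tried a range of thresholds (the 40th, 45th, 50th (median), 55th, and 60th percentiles). The Markov blanket of Aging and that of Dementia are listed in Table 3. The supramarginal gyrus appeared in the Markov blanket of Aging, and the hippocampus in that of Dementia, at every threshold. Ψ was identical at the 40th, 50th, and 55th percentiles; additional structures entered at the 45th percentile (the superior parietal lobule for Aging and the precentral gyrus for Dementia) and at the 60th percentile (the parahippocampal gyrus for Dementia).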

Table 3. The Markov blanket of Aging and that of Dementia for different thresholds.

Threshold (percentile) | Markov blanket of Aging                       | Markov blanket of Dementia
40                     | Supramarginal gyrus                           | Hippocampus
45                     | Supramarginal gyrus, Superior parietal lobule | Hippocampus, Precentral gyrus
50                     | Supramarginal gyrus                           | Hippocampus
55                     | Supramarginal gyrus                           | Hippocampus
60                     | Supramarginal gyrus                           | Hippocampus, Parahippocampal gyrus

3.2.2 Comparison of BBM and double dissociation

To further demonstrate the differences and connections between BBM and double-dissociation approaches, we analyzed the same data set using the double-dissociation approach.

Experiment 2b demonstrates the similarities between the double-dissociation approach and BBM. To evaluate the hypothesis that there exists a double dissociation among Dementia, Aging, the hippocampus, and the supramarginal gyrus, we performed four one-way ANOVA tests: hippo - Aging, hippo - Dementia, smg - Aging, and smg - Dementia. We used the conventional significance level of 0.05. We found a significant difference in hippocampal volumes between the normal and demented groups (p-value < 0.001); however, we did not detect a significant difference in hippocampal volumes between the Age ≥ 78 and Age < 78 groups (p-value = 0.15). Similarly, we found no significant difference in supramarginal-gyrus volumes between the normal and demented groups (p-value = 0.12), whereas we found a significant difference in supramarginal-gyrus volumes between the Age ≥ 78 and Age < 78 groups (p-value < 0.001). This analysis establishes a double-dissociation pattern; that is, Dementia is primarily related to the hippocampal formation, but not the supramarginal gyrus, whereas Aging is primarily related to the supramarginal gyrus, but not the hippocampal formation. This experiment demonstrates that when just one brain structure is associated with each function variable, and the expert can formulate the appropriate theoretical model, the double-dissociation approach and BBM will obtain similar results, assuming that BBM does not find structures that are more strongly associated with either function variable than those specified by the researcher.
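The four-test logic can be sketched as follows. The F statistic is the standard one-way ANOVA ratio; because this sketch uses only the Python standard library, the p-value is obtained with a permutation test, which is a substitute for the F-distribution p-value that an ANOVA would normally report. The function names, `n_perm`, and `seed` are assumptions, and 0.001 stands in as an upper bound for the reported "p-value < 0.001".

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    pooled = [v for g in groups for v in g]
    grand = mean(pooled)
    means = [mean(g) for g in groups]
    k, n = len(groups), len(pooled)
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p(groups, n_perm=2000, seed=0):
    """Permutation p-value for the F statistic (stand-in for the F-distribution p-value)."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        it = iter(pooled)
        shuffled = [[next(it) for _ in range(s)] for s in sizes]
        # Small tolerance so that re-orderings of the observed split count as ties.
        if f_statistic(shuffled) >= observed * (1 - 1e-12):
            hits += 1
    return (hits + 1) / (n_perm + 1)

def is_double_dissociation(p, s1, s2, f1, f2, alpha=0.05):
    """True if s1 relates to f1 only and s2 relates to f2 only, at level alpha."""
    return (p[s1, f1] < alpha and p[s1, f2] >= alpha
            and p[s2, f2] < alpha and p[s2, f1] >= alpha)

# p-values reported in the text; 0.001 is an upper bound for "< 0.001".
reported = {
    ("hippo", "Dementia"): 0.001, ("hippo", "Aging"): 0.15,
    ("smg", "Dementia"): 0.12, ("smg", "Aging"): 0.001,
}
pattern_found = is_double_dissociation(reported, "hippo", "smg", "Dementia", "Aging")
```

With the reported p-values, the crossover check returns `True`, matching the double-dissociation pattern described above.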

In experiment 2c, we demonstrate important differences between the double-dissociation approach and BBM. A researcher who formulates the hypothesis that there exists double dissociation among Dementia, Aging, the hippocampus, and the angular gyrus can test this hypothesis using analysis of variance, and would find that Dementia is primarily related to the hippocampal formation (p-value < 0.001), but not the angular gyrus (p-value = 0.19), whereas Aging is primarily related to the angular gyrus (p-value < 0.01), but not the hippocampal formation (p-value = 0.12). In this manner, double dissociation is established. We use MA to denote this hypothesis.

However, a competing theory, MB (the interactions among Dementia, Aging, the hippocampus, and the supramarginal gyrus), may offer a more plausible explanation of the data. We calculated the marginal likelihoods of MA and MB based on Equation (2), and then the Bayes factor, defined as Pr(D | MB)/Pr(D | MA). The Bayes factor is 56.3, which suggests that, relative to MA, MB is more strongly supported by the data under consideration30. However, because double-dissociation approaches are hypothesis-driven and lack a model-selection mechanism, this more plausible competing theory would be ignored under the double-dissociation framework. This experiment demonstrates a significant limitation of double-dissociation logic.
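A Bayes-factor comparison of two discrete network structures can be sketched as below. Equation (2) is not reproduced in this section, so the sketch assumes a closed-form marginal likelihood of the K2 type (uniform Dirichlet priors, as in Cooper and Herskovits17); the variable names, the two toy structures `m_a` and `m_b`, and the toy data are hypothetical and do not reproduce the study data or the reported value of 56.3.

```python
from collections import Counter
from math import exp, lgamma

def log_k2_family(rows, child, parents, arity=2):
    """Log marginal likelihood of one node given its parents (K2 metric)."""
    n_ijk = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    n_ij = Counter(tuple(r[p] for p in parents) for r in rows)
    score = 0.0
    for j, n in n_ij.items():
        # log[(arity - 1)! / (n + arity - 1)!] + sum_k log(N_ijk!)
        score += lgamma(arity) - lgamma(n + arity)
        score += sum(lgamma(n_ijk[j, k] + 1) for k in range(arity))
    return score

def log_marginal(rows, structure):
    """structure maps each node name to a tuple of its parents' names."""
    return sum(log_k2_family(rows, child, parents)
               for child, parents in structure.items())

# Toy binary data: smg follows Aging in 18 of 20 rows, ag is unrelated to Aging.
rows = [{"Aging": i % 2, "smg": (i % 2) ^ int(i < 2), "ag": (i // 2) % 2}
        for i in range(20)]

m_a = {"Aging": (), "ag": ("Aging",), "smg": ()}   # Aging linked to ag
m_b = {"Aging": (), "smg": ("Aging",), "ag": ()}   # Aging linked to smg

log_bf = log_marginal(rows, m_b) - log_marginal(rows, m_a)
bayes_factor = exp(log_bf)  # Pr(D | m_b) / Pr(D | m_a)
```

Working in log space avoids overflow of the factorial terms; a Bayes factor above 1 favors the structure in the numerator.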

4. Conclusion and Discussions

We have described and evaluated a novel method, BBM, for discovering structure-function associations. BBM generates a probabilistic model of associations among brain structures and function variables. Relative to existing structure-function detection methods, such as dissociation logic and conjunction analysis, BBM provides neuroscientists with a more powerful, data-driven means of delineating structure-function associations, and does not require a pre-specified hypothesis. Our experiments, based on both simulated data and data from a study of aging and dementia, demonstrate that BBM effectively detects structure-function associations.

As the analysis of the simulated data demonstrated, BBM was able to detect structure-function associations effectively, even for the scenario in which many brain regions were involved (Figure 3 (d)). We also found that BBM generates a model that is stable under data perturbation, and that it is relatively straightforward to determine whether the results are stable, or whether more samples would be required.

In the study of aging and dementia (Experiments 2a and 2b), we found that the two function variables (Aging and Dementia) are directly related to two different neuronal systems. In particular, we found that Dementia is characterized by volume reduction in the hippocampal formation; in contrast, Aging is characterized by volume reduction in the supramarginal gyrus. Our findings are consistent with a multifactorial framework of cognitive aging, which suggests that more than one process may be responsible for cognitive decline20, 28, 31. Our findings are also consistent with those from other studies of Alzheimer’s disease and aging. In particular, MR studies have reported that, compared with normal controls, patients with Alzheimer’s disease manifest volume reduction in structures within the medial temporal lobe, particularly the hippocampus32, whereas in the setting of aging, the frontal and parietal lobes demonstrate atrophy33.

In Experiment 2c, if we were to employ dissociation logic, and thus test the hypothesis that there exists a double dissociation among the superior frontal gyrus, the hippocampus, Aging, and Dementia, we would confirm that atrophy of the superior frontal gyrus is associated with Aging but not with Dementia, and that atrophy of the hippocampus is associated with Dementia but not with Aging. However, a stronger association exists between atrophy of the supramarginal gyrus and Aging, which dissociation logic could have detected only if the researchers performing the analysis had postulated a corresponding hypothesis. In this case, dissociation logic yields an over-simplified version of the model shown in Figure 4. In fact, the class of models that BBM can generate is a multivariate generalization of the model generated by dissociation logic, and therefore is a proper superset of the models that the latter can generate.

In the study of aging and dementia (Experiment 2), each participant was assessed with the Washington University Clinical Dementia Rating (CDR)29; subjects with CDR greater than or equal to 0.5 were classified as having dementia of the Alzheimer’s type. The demented group comprised 33 subjects with CDR=0.5 and 17 subjects with CDR=1. A widely used definition34 of mild cognitive impairment (MCI) is CDR=0.5. Because our demented group combined subjects with CDR=0.5 and CDR=1, the biomarkers detected in our experiment are for dementia of the Alzheimer’s type, rather than for MCI specifically. In the future, we will apply BBM to a dataset centering on MCI, and investigate the interactions among aging, mild cognitive impairment, and regional brain volumes. This could shed light on the overlapping atrophy patterns of aging and mild cognitive impairment.

In this paper, we based these analyses on discrete Bayesian networks. Using discrete BNs for these analyses required that we threshold all continuous variables, possibly leading to loss of information. An alternative approach is to use hybrid networks. Hybrid Bayesian networks can incorporate a combination of continuous and discrete variables. A discrete node, Yi, cannot have continuous parents; and the conditional probability of Yi can be parameterized as θijk=P(Yi = k | pa(Yi) = j). For a continuous node Zi, the conditional probability is a Gaussian linear regression on the continuous parents with parameters depending on the configuration of the discrete parents. Each of these two model types (discrete and hybrid BNs) has benefits and drawbacks. A hybrid BN does not suffer from information loss caused by thresholding; however, it can detect only those dependencies that are close to linear. The discrete BN may suffer from information loss during thresholding; however, it can model arbitrary nonlinear interactions among variables. In fact, a discrete BN can model any joint distribution over discrete variables11; it therefore has the potential to model interesting and important interactions that cannot be modeled using a hybrid network.
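The two local models described above can be written out explicitly. This is a sketch of the standard conditional linear Gaussian parameterization; the symbols $\mathrm{pa}_c$, $\mathrm{pa}_d$, $\alpha_{ij}$, $\boldsymbol{\beta}_{ij}$, and $\sigma_{ij}^{2}$ are notation introduced here for illustration and do not appear in the text.

```latex
% Discrete node Y_i with (discrete) parent configuration j:
\theta_{ijk} = P\bigl(Y_i = k \mid \mathrm{pa}(Y_i) = j\bigr),
\qquad \sum_{k} \theta_{ijk} = 1 .

% Continuous node Z_i with continuous parents pa_c(Z_i) = z
% and discrete parents in configuration pa_d(Z_i) = j:
Z_i \mid \mathrm{pa}_c(Z_i) = \mathbf{z},\ \mathrm{pa}_d(Z_i) = j
\;\sim\;
\mathcal{N}\bigl(\alpha_{ij} + \boldsymbol{\beta}_{ij}^{\top}\mathbf{z},\ \sigma_{ij}^{2}\bigr).
```

The mean of $Z_i$ is linear in $\mathbf{z}$ for each discrete configuration $j$, which is why a hybrid network of this form captures only dependencies among continuous variables that are close to linear.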

Delineating brain-behavior associations is a process in which we model interactions among structure and function variables. It is not a classification problem, in which the primary goal is to predict a single group membership variable based on a set of brain regions. There are two major differences between brain-behavior modeling and classification. First, the output of a brain-behavior modeling method is a descriptive model that links specific brain structure measures to specific cognitive functions, instead of a predictive model. Second, brain-behavior modeling usually incorporates several function variables (in the double dissociation method, we have two function variables), whereas classification centers on one function variable.

One way to address the problem of establishing associations among a set of function variables and a set of brain regions is to treat it as a multiple-input multiple-output (MIMO) system modeling problem1. Existing methods, such as dependent Gaussian process models35, can then be used to solve it. One advantage of the BN-based method is that the generated model is declarative36: the interactions among brain regions and function variables are represented by a network model, which is intuitive.

BBM is atlas-based. The advantages of the atlas-based approach are its computational tractability and reproducibility37. However, it requires a pre-defined atlas. An alternative approach is to jointly group voxels into brain regions and detect the Markov blanket of the clinical variable38–40. We plan to extend our current work to incorporate brain parcellation into the algorithm.

In summary, BBM is a powerful data-driven method for delineating structure-function associations. It does not require a pre-specified model, and can generate a model describing interactions among brain regions and function variables.

Acknowledgements

This work was supported by National Institutes of Health grant R01 AG13743, which is funded by the National Institute on Aging and the National Institute of Mental Health; this work was also supported by the American Recovery and Reinvestment Act and by National Institutes of Health grant R03 EB-009310.

Footnotes

1. This was pointed out to us by one of the reviewers.

References

  • 1. Purves D, Brannon EM, Cabeza R, et al. Principles of Cognitive Neuroscience. Sinauer Associates Inc; 2007.
  • 2. Teuber HL. Physiological psychology. Annual Review of Psychology. 1955;6:267–296. doi: 10.1146/annurev.ps.06.020155.001411.
  • 3. Shallice T. From Neuropsychology to Mental Structure. Cambridge University Press; 1988.
  • 4. Morgan JE, Ricker JH. Textbook of Clinical Neuropsychology. Taylor & Francis Group; 2008.
  • 5. Price CJ, Friston KJ. Cognitive conjunction: a new approach to brain activation experiments. Neuroimage. 1997 May;5(4 Pt 1):261–270. doi: 10.1006/nimg.1997.0269.
  • 6. Friston KJ, Holmes AP, Price CJ, Buchel C, Worsley KJ. Multisubject fMRI studies and conjunction analyses. Neuroimage. 1999 Oct;10(4):385–396. doi: 10.1006/nimg.1999.0484.
  • 7. Nichols T, Brett M, Andersson J, Wager T, Poline JB. Valid conjunction inference with the minimum statistic. Neuroimage. 2005 Apr 15;25(3):653–660. doi: 10.1016/j.neuroimage.2004.12.005.
  • 8. Rudert T, Lohmann G. Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics. J Magn Reson Imaging. 2008 Dec;28(6):1533–1539. doi: 10.1002/jmri.21518.
  • 9. Davies M. Double Dissociation: Understanding its Role in Cognitive Neuropsychology. Mind & Language. 2010;25(5):500–540.
  • 10. Friston KJ. Models of brain function in neuroimaging. Annu Rev Psychol. 2005;56:57–87. doi: 10.1146/annurev.psych.56.091103.070311.
  • 11. Pearl J. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann; 1988.
  • 12. Cowell R. Introduction to Inference for Bayesian Networks. In: Proceedings of the NATO Advanced Study Institute on Learning in Graphical Models. Kluwer Academic Publishers; 1998. pp. 9–26.
  • 13. Chen R, Herskovits EH. Network analysis of mild cognitive impairment. NeuroImage. 2006;29(4):1252–1259. doi: 10.1016/j.neuroimage.2005.08.020.
  • 14. Goldszal A, Davatzikos C, Pham D, Yan M, Bryan RN, Resnick SM. An image processing protocol for qualitative and quantitative volumetric analysis of brain images. J. Comput. Assisted Tomogr. 1998;22:827–837. doi: 10.1097/00004728-199809000-00030.
  • 15. Chen R, Herskovits EH. Machine-learning techniques for building a diagnostic model for very mild dementia. Neuroimage. 2010;52(1):234–244. doi: 10.1016/j.neuroimage.2010.03.084.
  • 16. Jenkinson M, Smith S. A global optimisation method for robust affine registration of brain images. Med. Image Anal. 2001;5(2):143–156. doi: 10.1016/s1361-8415(01)00036-6.
  • 17. Cooper GF, Herskovits EH. A Bayesian method for the induction of probabilistic networks from data. Machine Learning. 1992;9:309–347.
  • 18. Heckerman D, Geiger D, Chickering DM. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning. 1995;20:197–243.
  • 19. Dice L. Measures of the Amount of Ecologic Association Between Species. Ecology. 1945;26:297–302.
  • 20. Siedlecki KL, Habeck CG, Brickman AM, Gazes Y, Stern Y. Examining the multifactorial nature of cognitive aging with covariance analysis of positron emission tomography data. J Int Neuropsychol Soc. 2009 Nov;15(6):973–981. doi: 10.1017/S1355617709990592.
  • 21. Bullmore E, Sporns O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat Rev Neurosci. 2009 Mar;10(3):186–198. doi: 10.1038/nrn2575.
  • 22. Chen R, Sivakumar K, Kargupta H. Collective Mining of Bayesian Networks from Distributed Heterogeneous Data. Knowl. Inf. Syst. 2004:164–187.
  • 23. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage. 2002 Jan;15(1):273–289. doi: 10.1006/nimg.2001.0978.
  • 24. Chen R, Herskovits EH. Voxel-based Bayesian lesion-symptom mapping. Neuroimage. 2010 Jan 1;49(1):597–602. doi: 10.1016/j.neuroimage.2009.07.061.
  • 25. Upton G, Cook I. Understanding Statistics. Oxford University Press; 1996.
  • 26. Zijdenbos AP, Dawant BM, Margolin RA, Palmer AC. Morphometric analysis of white matter lesions in MR images: method and validation. IEEE Trans Med Imaging. 1994;13(4):716–724. doi: 10.1109/42.363096.
  • 27. Brayne C, Calloway P. Normal ageing, impaired cognitive function, and senile dementia of the Alzheimer's type: a continuum? Lancet. 1988 Jun 4;1(8597):1265–1267. doi: 10.1016/s0140-6736(88)92081-8.
  • 28. Head D, Snyder AZ, Girton LE, Morris JC, Buckner RL. Frontal-hippocampal double dissociation between normal aging and Alzheimer's disease. Cereb Cortex. 2005 Jun;15(6):732–739. doi: 10.1093/cercor/bhh174.
  • 29. Morris JC. The Clinical Dementia Rating (CDR): current version and scoring rules. Neurology. 1993 Nov;43(11):2412–2414. doi: 10.1212/wnl.43.11.2412-a.
  • 30. Jeffreys H. The Theory of Probability. Oxford: 1961.
  • 31. Gabrieli JD. Memory systems analyses of mnemonic disorders in aging and age-related diseases. Proc Natl Acad Sci U S A. 1996 Nov 26;93(24):13534–13540. doi: 10.1073/pnas.93.24.13534.
  • 32. Samuel M, Colchester ACF. Structural and functional magnetic resonance imaging in neurodegenerative diseases. In: Beal MF, Lang AE, Ludolph AE, editors. Neurodegenerative Diseases. Cambridge: Cambridge University Press; 2005. pp. 253–289.
  • 33. Dennis NA, Cabeza R. Neuroimaging of healthy cognitive aging. In: Craik F, Salthouse TA, editors. Handbook of aging and cognition: Third edition. Mahwah, NJ: Erlbaum; 2008.
  • 34. Gauthier S, Reisberg B, Zaudig M, et al. Mild cognitive impairment. Lancet. 2006 Apr 15;367(9518):1262–1270. doi: 10.1016/S0140-6736(06)68542-5.
  • 35. Boyle P, Frean M. Dependent Gaussian processes. In: Advances in Neural Information Processing Systems. 2005;17.
  • 36. Koller D, Friedman N. Probabilistic Graphical Models: Principles and Techniques. The MIT Press; 2009.
  • 37. Bullmore ET, Bassett DS. Brain graphs: graphical models of the human brain connectome. Annu Rev Clin Psychol. 2011;7:113–140. doi: 10.1146/annurev-clinpsy-040510-143934.
  • 38. Chen R, Herskovits EH. Graphical-model based morphometric analysis. IEEE Transactions on Medical Imaging. 2005;24(10):1237–1248. doi: 10.1109/TMI.2005.854305.
  • 39. Chen R, Hillis AE, Pawlak M, Herskovits EH. Voxelwise Bayesian lesion deficit analysis. NeuroImage. 2008;40:1633–1642. doi: 10.1016/j.neuroimage.2008.01.014.
  • 40. Chen R, Herskovits EH. Graphical-model-based multivariate analysis of functional magnetic resonance data. NeuroImage. 2007;35:635–647. doi: 10.1016/j.neuroimage.2006.11.040.
