Abstract
Resting-state fMRI provides a method to examine the functional network of the brain under spontaneous fluctuations. A number of studies have proposed using resting-state BOLD data to parcellate the brain into functional subunits. In this work, we present two state-of-the-art graph-based partitioning approaches, and investigate their application to the problem of brain network segmentation using resting-state fMRI. The two approaches, the normalized cut (Ncut) and the modularity detection algorithm, are also compared to the Gaussian mixture model (GMM) approach. We show that the Ncut approach performs consistently better than the modularity detection approach, and it also outperforms the GMM approach for in vivo fMRI data. Resting-state fMRI data were acquired from 43 healthy subjects, and the Ncut algorithm was used to parcellate several different cortical regions of interest. The group-wise delineation of the functional subunits based on resting-state fMRI was highly consistent with the parcellation results from two task-based fMRI studies (one with 18 subjects and the other with 20 subjects). The findings suggest that whole-brain parcellation of the cortex using resting-state fMRI is feasible, and that the Ncut algorithm provides the appropriate technique for this task.
Keywords: Resting-state fMRI, BOLD, Gaussian mixture model, normalized cut, modularity
1. Introduction
Resting-state fMRI has been used to examine functional connections between cortical regions since the method was first presented (Biswal et al., 1995). The approach typically uses task-free time course information obtained using blood oxygenation level dependent (BOLD) contrast acquisitions (Ogawa et al., 1990) and measures the temporal correlation between different regions within a single subject over time. The resting-state functional connectivity fMRI (fc-fMRI) approach has been used to investigate a number of basic neuroscience questions such as the connectivity present in different brain states (Vincent et al., 2007; Martuzzi et al., 2009) and the relationship between functional connectivity and behavior (Hampson et al., 2006, 2004). This approach also provides an opportunity to examine network properties in the brain and to parcellate the brain into minimal functional subunits based on the correlated BOLD signal. Parcellation of the cortex into individual subunits based on resting-state data opens up the possibility of developing a subunit atlas analogous to the Brodmann areas but based on cortical function rather than cytoarchitecture. A number of clustering techniques have been suggested for segmenting the brain using resting-state fMRI, including independent component analysis (ICA) (Damoiseaux et al., 2006; Chena et al., 2008; Luca et al., 2006), Gaussian mixture model (GMM) (Golland et al., 2008), and hierarchical clustering (Achard et al., 2006; Salvador et al., 2005), to name a few. In this paper, we focus on clustering algorithms based on graph theory. Graph theory is a common methodology for studying complex networks (Boccaletti et al., 2006; Watts and Strogatz, 1998; Achard et al., 2006; Sporns et al., 2007; Buckner et al., 2009). Many recent applications of graph theory to brain network analysis have focused on the small-world architecture (Watts and Strogatz, 1998).
Small-world networks allow highly efficient parallel information processing for a low wiring cost (Latora and Marchiori, 2001). Such networks have been identified in both structural and functional analysis of brain data (Achard et al., 2006; Sporns et al., 2007). In addition, graph theory also offers superb tools for partitioning networks. Graph-based clustering approaches have gained popularity in image segmentation (Shi and Malik, 2000; Boykov and Kolmogorov, 2004) and machine learning applications (Belkin and Niyogi, 2004). Most recently, some of these techniques have also been applied to the analysis of brain networks using resting-state fMRI (Thirion et al., 2006; van den Heuvel et al., 2008; Schwartz et al., 2008).
Graph partitioning approaches can be divided into two major categories. One set of algorithms attempts to solve a combinatorial optimization problem and obtain a binary (integer) indicator function, e.g., the max-flow/min-cut algorithm (Boykov and Kolmogorov, 2004). The advantage of these algorithms is that the indicator function defines the partition directly. However, combinatorial optimization is often very difficult to solve, and many minimization/maximization functionals are intractable for this kind of optimization. The other set of algorithms instead relaxes the binary constraint and solves for a real valued function. The relaxation makes the optimization problem tractable and relatively easier to solve. The real valued function is later converted to obtain the partition. Converting the real valued function to the optimal binary (integer) solution is nontrivial; nevertheless, we are able to obtain a solution that is very close to the optimum. In this work we investigate algorithms of the second kind. In particular, we selected two graph partitioning algorithms, the normalized cut (Ncut) algorithm (Shi and Malik, 2000) and the modularity detection algorithm (Newman, 2006b,a), and applied them to segment resting-state BOLD based functional connectivity data. We also applied the GMM (McLachlan and Peel, 2000) approach in our experiment for comparison. The GMM is a probabilistic approach with the underlying assumption that the data has a Gaussian density distribution. It is a robust unsupervised data clustering approach and has been applied to resting-state fMRI data analysis (Golland et al., 2008). The comparison between graph-based approaches and the GMM gives us a more comprehensive evaluation of the strengths and weaknesses of these algorithms when working with resting-state fMRI data. A systematic comparison of the algorithms using both synthetic and real resting-state fMRI data is presented below.
A group consistency measure based on the average entropy is introduced as a criterion to evaluate the performance of the algorithms; this is particularly valuable in applications such as this one, where no ground truth is available. We show that the normalized cut algorithm has the best overall performance, and that the segmentation obtained using the Ncut algorithm is the most consistent across groups of subjects. In addition, we show that delineation based on resting-state fMRI is highly consistent with delineation under task conditions. The agreement provides compelling evidence that functional parcellation of the brain can be revealed using resting-state fMRI and that the parcellations are meaningful with respect to task-based delineation of functional subunits in the brain. The paper is organized as follows. In the Theory section, we review the two graph partitioning algorithms. The generation of synthetic data and the acquisition of two in vivo fMRI datasets are described in the Methods section. Parcellation results using in vivo data from two regions of interest are presented in Results with the performance evaluation of the three selected algorithms.
2. Graph-based Partitioning
A graph G consists of a set of vertices V = {v1, v2, …, vN} and a set of edges E = {e(i, j), vi, vj ∈ V}. Given an fMRI dataset, each voxel corresponds to a vertex in V, and N is the total number of voxels. The edges between two vertices are defined based on the functional connectivity (e.g. correlation coefficients). If voxel vi and voxel vj are functionally connected, then e(i, j) = 1, otherwise e(i, j) = 0. To better characterize the differences in functional connectivity, a real value is assigned to each edge, denoted as w(i, j), w(i, j) > 0 if e(i, j) = 1 and w(i, j) = 0 if e(i, j) = 0.
We next define a few quantities that are commonly used in graph partitioning algorithms. Given a graph G = (V, E), a two-way partition of G is denoted as (A, Ā), where A ∪ Ā = V and A ∩ Ā = ∅. The indicator vector x = [x1, x2, …, xN] of the partition is defined by,
$$x_i = \begin{cases} 1, & v_i \in A \\ -1, & v_i \in \bar{A} \end{cases} \tag{1}$$
The N × N weight matrix W has w(i, j) as its entries. d is the degree vector, di = Σjw(i, j). The Laplacian of the graph is given by,
$$L = D - W, \qquad D = \mathrm{diag}(d_1, \ldots, d_N) \tag{2}$$
Vol(A) = Σvi∈A di is the volume of the set A, which is different from the cardinality |A| of the set (total number of points).
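The quantities defined above can be made concrete on a small example. The following is a minimal sketch using NumPy on a hypothetical four-vertex graph; the weight values are illustrative only and are not taken from any dataset in this work, and the Laplacian uses the standard definition L = D − W.

```python
import numpy as np

# Toy symmetric weight matrix for N = 4 vertices (illustrative values only).
W = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
])

d = W.sum(axis=1)          # degree vector, d_i = sum_j w(i, j)
D = np.diag(d)             # diagonal degree matrix
L = D - W                  # graph Laplacian (standard definition)

# Two-way partition (A, A-bar) with A = {v1, v2}, stored as a boolean mask.
in_A = np.array([True, True, False, False])

cut = W[in_A][:, ~in_A].sum()   # total weight of edges crossing the partition
vol_A = d[in_A].sum()           # Vol(A): sum of the degrees of vertices in A
card_A = int(in_A.sum())        # |A|: number of vertices in A

print(cut, vol_A, card_A)
```

Note that Vol(A) depends on the edge weights while |A| depends only on the vertex count, which is the distinction drawn in the text.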
2.1. Normalized Cuts and Spectral Clustering
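In their seminal paper, Shi and Malik (2000) proposed the normalized cut for segmenting natural images. A two-way normalized cut is defined by
$$\mathrm{Ncut}(A, \bar{A}) = \frac{\mathrm{cut}(A, \bar{A})}{\mathrm{Vol}(A)} + \frac{\mathrm{cut}(A, \bar{A})}{\mathrm{Vol}(\bar{A})}, \qquad \mathrm{cut}(A, \bar{A}) = \sum_{v_i \in A,\, v_j \in \bar{A}} w(i, j) \tag{3}$$
Normalizing the cut value by the total edge connection to all the vertices in the graph removes the bias towards separating out small sets of points. Ncut(A, Ā) can be written in a matrix form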
It has been shown (Shi and Malik, 2000) that minimizing Ncut is equivalent to minimizing the Rayleigh quotient given by
$$Q(y) = \frac{y^{T}(D - W)\,y}{y^{T} D\, y} \tag{4}$$
with the constraint that y is piecewise constant and yTd = 0, and D is a diagonal matrix, D(i, i) = di. By removing the piecewise constant constraint of y, the minimum of Q(y) is achieved by setting y equal to the smallest nontrivial eigenvector ϕ1 of the normalized Laplacian L̃. In the two-way case, the binary partition is obtained by splitting ϕ1 at a chosen value τ. Several options are available: one can choose 0 (sign cut), or the median of ϕ1 (bisection). In our implementation we search for τ such that the corresponding indicator vector x gives the best Ncut(A, Ā). The discretization in R-way segmentation is more complicated, and can be found by either weighted K-means clustering (Bach and Jordan, 2004) or the method proposed in (Yu and Shi, 2003). The normalized cut algorithm is very closely related to spectral clustering, which uses the first nontrivial eigenvector ψ1 (the Fiedler vector) of the graph Laplacian matrix L. In fact, it has been shown that the Fiedler vector is the real valued solution to the following minimization problem,
$$\min_{A} \; \frac{\mathrm{cut}(A, \bar{A})}{|A|} + \frac{\mathrm{cut}(A, \bar{A})}{|\bar{A}|} \tag{5}$$
The two optimizations differ in that one algorithm normalizes using Vol(A), while the other one normalizes using |A|. There is some evidence (Chung, 1997) from a spectral graph theoretical point of view that the normalized Laplacian has better behavior than the standard graph Laplacian.
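The relaxed two-way normalized cut with a search over the splitting value τ can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `ncut_value` and `two_way_ncut` are hypothetical, the symmetric form D^{-1/2}(D − W)D^{-1/2} is assumed for the normalized Laplacian (its eigenvectors, rescaled by D^{-1/2}, solve the generalized problem (D − W)y = λDy), and the graph is assumed connected with strictly positive degrees.

```python
import numpy as np

def ncut_value(W, in_A):
    """Ncut(A, A-bar) = cut/Vol(A) + cut/Vol(A-bar) for a boolean mask in_A."""
    d = W.sum(axis=1)
    cut = W[in_A][:, ~in_A].sum()
    return cut / d[in_A].sum() + cut / d[~in_A].sum()

def two_way_ncut(W):
    """Relaxed two-way normalized cut followed by a threshold search."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian (assumed normalization).
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    # Smallest nontrivial eigenvector, mapped back to the generalized problem.
    y = d_inv_sqrt * eigvecs[:, 1]
    best_mask, best_val = None, np.inf
    # Try each entry of y (except the largest) as the splitting value tau.
    for tau in np.sort(y)[:-1]:
        mask = y <= tau
        if mask.all():          # skip degenerate splits caused by ties
            continue
        val = ncut_value(W, mask)
        if val < best_val:
            best_mask, best_val = mask, val
    return best_mask, best_val

# Toy graph: two fully connected triplets joined by one weak edge.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
W[2, 3] = W[3, 2] = 0.01
np.fill_diagonal(W, 0.0)

mask, val = two_way_ncut(W)
print(mask, val)
```

On this toy graph the search recovers the two triplets. Which side receives `True` depends on the arbitrary sign of the eigenvector.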
2.2. Modularity Detection
Arguing that the size of segments is not an appropriate partitioning criterion, Newman (2006b) proposed the use of the so-called modularity function to find tightly connected communities in a graph. The modularity function measures the difference between the number of edges within a community and the expected number of such edges. Therefore, maximizing the modularity function helps find strongly connected structures independent of their size. The modularity matrix of a graph is defined to be
$$B = W - \frac{d\,d^{T}}{2m}, \qquad B(i, j) = w(i, j) - \frac{d_i d_j}{2m} \tag{6}$$
where 2m = Σijw(i, j). The optimal solution is found by maximizing xTBx with the constraint |x| = 1. Both a combinatorial algorithm (Clauset et al., 2004) and a spectral approach (Newman, 2006a) are available for solving the problem. Here we are interested in the spectral approach, where the real valued solution is the first eigenvector of B which has the largest positive eigenvalue. The splitting point is chosen to be 0 for a 2-way segmentation. One can apply the algorithm recursively to divide the graph into multiple subunits. Table 1 summarizes the optimization criteria, the matrices that are essential for the optimization and the corresponding real valued solutions of the three graph based partitioning algorithms.
Table 1. Optimization and solution of the three graph based partitioning algorithms.
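| Algorithm | Optimization Target Function | Matrix Form | Real-valued Solution |
|---|---|---|---|
| Ncut | $\mathrm{cut}(A,\bar{A})/\mathrm{Vol}(A) + \mathrm{cut}(A,\bar{A})/\mathrm{Vol}(\bar{A})$ | $y^{T}Ly \,/\, y^{T}Dy$ | eigenvector of L̃ |
| Average Cut | $\mathrm{cut}(A,\bar{A})/\lvert A\rvert + \mathrm{cut}(A,\bar{A})/\lvert\bar{A}\rvert$ | $y^{T}Ly \,/\, y^{T}y$ | eigenvector of L |
| Modularity Detection | (No. of edges in A - expected No. of edges in A) + (No. of edges in Ā - expected No. of edges in Ā) | xTBx | eigenvector of B |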
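The spectral two-way modularity split described in Section 2.2 can be sketched in a few lines. This is a hedged illustration: the function name `modularity_split` is hypothetical, and the modularity matrix is taken to be B = W − ddᵀ/(2m), the weighted form of Newman's definition.

```python
import numpy as np

def modularity_split(W):
    """Two-way split by the sign of the leading eigenvector of B."""
    d = W.sum(axis=1)
    two_m = d.sum()                          # 2m = sum_ij w(i, j)
    B = W - np.outer(d, d) / two_m           # modularity matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    if eigvals[-1] <= 1e-12:
        # No positive eigenvalue: the spectral method proposes no split.
        return np.ones(len(d), dtype=bool)
    return eigvecs[:, -1] > 0                # splitting point at 0

# Toy graph: two fully connected triplets joined by one weak edge.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
W[2, 3] = W[3, 2] = 0.01
np.fill_diagonal(W, 0.0)

labels = modularity_split(W)
print(labels)
```

Applying the function recursively to each resulting subgraph would give a multi-way partition, as noted in the text. The sketch above covers only a single split.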
3. Material and Methods
In this section, we first describe the generation of four synthetic datasets. For the first three datasets, the data distribution is well modeled. For the last dataset, we used experimental fMRI data obtained in vivo, so we have no explicit control over the data distribution. By using these two types of synthetic datasets, we get a more comprehensive estimate of the performance of the algorithms. The acquisition of two in vivo fMRI datasets is also presented in this section: one is from a resting-state study and the other is from a task-based study. Parameter selection for graph-based approaches is discussed in Section 3.4, and the group consistency measure we propose is presented in Section 3.5.
3.1. Synthetic data
Three sets of synthetic data were generated to compare the performance of the three algorithms under different circumstances. For each set, we created fifty different single slice fMRI data sets. The image matrix size was 32 × 32, with 1200 time points for each time course. The statistics computed in the results section were averaged over the fifty data sets. The signal time course simulated was a sinusoidal function at frequency f = 0.05Hz with a fixed phase θ, and the additive noise is i.i.d. white Gaussian with standard deviation 1. The noise corrupted time series is given by,
The time courses were sampled at TR = 1.55 s (to match the parameters of the real fMRI acquisition). The signal to noise ratio, , where T = 1200 is the length of the time courses. The one slice of fMRI data is divided into two parts according to configuration A or B shown in Figure 1. The signal time courses for the two segments have the same frequency but a phase difference of . The purpose of creating an unbalanced configuration B is to test whether any of the three algorithms is biased towards equal-size partitioning. Table 2 shows the parameters for the three sets of synthetic data.
Table 2. Synthetic datasets.
| Dataset | SNR | Configuration A (balanced) | Configuration B (unbalanced) |
|---|---|---|---|
| syn-data1 | 0.04 | ✓ | |
| syn-data2 | 0.02 | ✓ | |
| syn-data3 | 0.04 | | ✓ |
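A synthetic slice of the kind described in Section 3.1 could be generated along the following lines. This is a sketch under explicit assumptions: the signal amplitude, the phase difference, and the spatial layout of the two regions are not recoverable from the text here, so `amplitude`, `phase_diff`, and the column-wise layouts are hypothetical placeholders. Only the matrix size, the number of time points, TR, the frequency, and the unit-variance white Gaussian noise follow the description, and the 1:7 size ratio for configuration B follows the Discussion.

```python
import numpy as np

def make_synthetic_slice(config="A", amplitude=0.2, phase_diff=np.pi / 2,
                         size=32, n_time=1200, tr=1.55, freq=0.05, seed=0):
    """Return (data, labels): a size x size x n_time slice and its ground truth."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_time) * tr                     # sample times in seconds
    labels = np.ones((size, size), dtype=int)
    # Hypothetical layouts: A splits the slice in half; B gives region 2
    # one eighth of the voxels, so region 1 is seven times larger.
    n_cols = size // 2 if config == "A" else size // 8
    labels[:, :n_cols] = 2
    phase = np.where(labels == 2, phase_diff, 0.0)  # per-voxel phase offset
    signal = amplitude * np.sin(2 * np.pi * freq * t + phase[..., None])
    noise = rng.standard_normal((size, size, n_time))  # i.i.d., std = 1
    return signal + noise, labels

data, labels = make_synthetic_slice("A")
print(data.shape, (labels == 1).sum(), (labels == 2).sum())
```

Repeating the call with fifty different seeds would mirror the fifty realizations per dataset described above.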
We also created one more synthetic dataset (syn-data4) from a real fMRI dataset described in 3.2.1. This set includes 43 independent slices of fMRI data that were collected from 43 healthy subjects. The size of each slice is 28 by 28, and configuration A was used for this dataset. Resting-state fMRI time courses from the intra-parietal sulcus were randomly selected for one region, and time courses from the visual cortex were randomly selected for the other region (there were two subjects who had fewer than 392 voxels in the IPS and we generated additional time courses by interpolation). All the resting-state time courses were detrended and low-pass filtered.
3.2. In-vivo fMRI Data
The segmentation algorithms described above were applied to both resting-state fMRI data and task-based fMRI data. In this manner, we can test the hypothesis that the delineation of functional subunits is invariant under different conditions, such as task or resting-state.
3.2.1. Resting-State fMRI
Imaging was performed on a 3T Siemens Trio scanner at the Yale MRRC. A T1-weighted 3-plane localizer was used to localize the slices to be obtained and T1 anatomic scans were collected in the axial-oblique orientation parallel to the ac-pc line. Resting-state fMRI data was obtained using a gradient echo T2*-weighted echo planar imaging sequence, flip angle alpha= 80, echo time TE = 30ms, repetition time TR = 1550ms, 64 × 64 matrix, with 25 slices 6mm thick, skip 0mm, 22 × 22cm2 FOV, providing whole-brain coverage with voxel size of 3.4mm × 3.4mm × 6mm. Eight 6-min runs of resting-state data were collected. 43 healthy right-handed subjects participated in the study after giving informed written consent.
3.2.2. Task-Based fMRI
Task-based fMRI data was collected for a study of BOLD signal change between different sessions in a test/retest experiment (Buck et al., 2008). Imaging was performed on a 3T Siemens Trio scanner at the Yale MRRC. Functional data was obtained using gradient echo planar imaging during tasks, flip angle alpha = 80, echo time TE = 30ms, repetition time TR = 2000 ms, 64 × 64 matrix, with 24 slices 5mm thick, 20 × 20 cm2 FOV. High resolution anatomic scans were also acquired using a 3D MPRAGE volume acquisition, alpha = 15, TE = 2.83ms, TR = 1500ms, inversion time TI = 800ms, 256 × 256 × 160 matrix with voxel size 1.0 × 1.0 × 1.0mm3. Each subject underwent the Stroop task (Stroop, 1935). Twenty healthy subjects participated in the study. A test/retest analysis of the fMRI data from this task has previously been published (Buck et al., 2008).
3.2.3. Preprocessing
Functional data was motion and slice timing corrected using SPM5. A Gaussian kernel of FWHM 6 mm was applied for spatial smoothing. The mean time courses from the white matter (WM) and the cerebrospinal fluid (CSF) were calculated and the data were orthogonalized with respect to the two mean time courses and to the six motion related signals estimated by SPM5. After the orthogonalization, linear trends were removed and lowpass filtering (< 0.1Hz) was applied.
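The nuisance removal and filtering steps can be illustrated with ordinary least squares and a Fourier-domain filter. This is a sketch, not the SPM5 pipeline: the function names are hypothetical, the linear trend is removed jointly with the nuisance regressors rather than in a separate later step, and an ideal (brick-wall) low-pass filter is assumed because the filter design is not specified in the text.

```python
import numpy as np

def regress_out(ts, nuisance):
    """Remove nuisance signals, the mean, and a linear trend.

    ts: (T, V) array of voxel time courses.
    nuisance: (T, R) array, e.g. WM mean, CSF mean, six motion parameters.
    """
    T = ts.shape[0]
    trend = np.linspace(-1.0, 1.0, T)
    X = np.column_stack([np.ones(T), trend, nuisance])   # design matrix
    beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
    return ts - X @ beta          # residuals are orthogonal to every column of X

def lowpass_fft(ts, tr, cutoff_hz=0.1):
    """Ideal low-pass filter: zero all Fourier components above cutoff_hz."""
    freqs = np.fft.rfftfreq(ts.shape[0], d=tr)
    spec = np.fft.rfft(ts, axis=0)
    spec[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=ts.shape[0], axis=0)
```

A typical call would be `lowpass_fft(regress_out(ts, nuisance), tr=1.55)` for the resting-state data.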
3.3. Region of Interest
3.3.1. ROI 1: Visual Cortex
A mask of the primary visual cortex (Brodmann area 17) and the secondary visual cortex (Brodmann area 18) was created based on the Yale Brodmann Atlas (available at http://bioimagesuite.org). The mask was mapped to each individual space and voxels within the mask were recruited for the experiment. Because of the variation across individual brains, the number of voxels N (3.4 × 3.4 × 6mm3 resolution) inside BA 17/18 varies from 583 to 1077.
3.3.2. ROI 2: Intraparietal Sulcus
A recently published task-based study by our group (Roth et al., 2009) showed that the intraparietal sulcus is actively involved in working memory tasks and that the specific task conditions can subdivide this region into smaller functional subunits. A mask of the intraparietal sulcus was made based on the activation result in the MNI space. The mask was mapped to individual subject space and the number of included voxels varies from 371 to 616 across subjects.
3.4. Parameter Selection
A graph is constructed by connecting each vertex to its k nearest neighbors in terms of functional distance. Denote the time course at voxel vi as fi = [fi(1), fi(2), …, fi(T)]; the functional distance is given by the Euclidean norm ‖fi − fj‖. When fi and fj have zero mean and unit length (‖fi‖ = ‖fj‖ = 1), the functional distance is directly related to the Pearson correlation coefficient corr(fi, fj): ‖fi − fj‖² = 2 − 2corr(fi, fj). The weight is defined using a Gaussian kernel, which is commonly used in graph-based approaches,
The weight matrix W (or the Laplacian matrix L) is controlled by two parameters, the number of nearest neighbors k and the scale parameter σ. k controls the sparsity of the matrix, which is estimated to be k/N. The scale parameter σ controls the decay of the Gaussian kernel. When σ → 0, W approaches an identity matrix; when σ → ∞, W becomes equivalent to an adjacency (binary) matrix. Since the optimization defined by both graph-based algorithms depends only on the weight matrix (see Table 1), choices of k and σ directly affect the partitioning results.
In the experiment, we constructed a number of graphs using different k and σ. k was sampled at values such that the sparsity of the matrix is approximately 0.02, 0.05 and 0.1. σ was sampled proportional to the median of the functional distances. The median of functional distances, denoted as ξ, was estimated over all pairs of voxels. We set σ equal to ξ/2, ξ and 2ξ for the synthetic data, and for in vivo fMRI we used σ = ξ, because results from synthetic data showed that partitioning results are much less sensitive to σ than to k.
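The graph construction described in this section can be sketched as follows. Several details are assumptions rather than the authors' stated choices: the Gaussian kernel is written as exp(−dist²/(2σ²)), the k-nearest-neighbor relation is symmetrized by keeping an edge if either endpoint selects the other, and the function name `knn_gaussian_graph` is hypothetical.

```python
import numpy as np

def knn_gaussian_graph(ts, k, sigma_scale=1.0):
    """ts: (N, T) voxel time courses. Returns (W, xi)."""
    f = ts - ts.mean(axis=1, keepdims=True)            # zero mean
    f = f / np.linalg.norm(f, axis=1, keepdims=True)   # unit length
    corr = np.clip(f @ f.T, -1.0, 1.0)                 # Pearson correlation
    dist = np.sqrt(np.maximum(2.0 - 2.0 * corr, 0.0))  # ||f_i - f_j||
    N = len(f)
    xi = np.median(dist[np.triu_indices(N, k=1)])      # median over all pairs
    sigma = sigma_scale * xi                           # e.g. 0.5, 1.0 or 2.0
    W = np.exp(-dist**2 / (2.0 * sigma**2))            # assumed kernel form
    # Keep each vertex's k nearest neighbors (smallest functional distance).
    d_no_self = dist + np.diag(np.full(N, np.inf))
    nn = np.argsort(d_no_self, axis=1)[:, :k]
    keep = np.zeros((N, N), dtype=bool)
    keep[np.repeat(np.arange(N), k), nn.ravel()] = True
    keep |= keep.T                                     # symmetrize (assumption)
    return np.where(keep, W, 0.0), xi

rng = np.random.default_rng(1)
W, xi = knn_gaussian_graph(rng.standard_normal((20, 50)), k=3)
print(W.shape, xi)
```

For a target sparsity s, one would choose k ≈ sN. For example, with N = 1024 voxels, k = 20, 50 and 100 correspond to sparsities of roughly 0.02, 0.05 and 0.1.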
3.5. Performance Evaluation and Group Consistency
For synthetic datasets, performance of the algorithms was evaluated based on the ground truth. However, we have no access to the ground truth for real fMRI data, so we could only evaluate the performance indirectly. Assuming there truly exists a functional division, this division should remain stable for each subject and remain consistent across groups of subjects. Under this assumption, the consistency of partitioning across subjects can be used to evaluate the performance of the algorithms.
In our experiment, the partitioning was performed in the individual subject space of resolution 3.4375 × 3.4375 × 6 mm³ and normalized to a reference space of resolution 3 × 3 × 3 mm³. For the sake of simplicity, we assume the partitioning is two-way. The consistency measure could be easily extended for multi-label cases. Let Zs = [zs(1), zs(2), …, zs(N)] denote the label vector for subject s, and zs(i) ∈ {1, 2}. The probability of voxel vi being classified as subunit 1 is given by
$$\Pr(v_i = 1) = \frac{1}{S}\sum_{s=1}^{S} \delta\big(z_s(i), 1\big), \qquad S = \text{number of subjects} \tag{7}$$
where δ(·, ·) is the Kronecker delta function. The uncertainty of label assignment at a single voxel can be assessed by the discrete entropy,
$$H(i) = -\sum_{c \in \{1, 2\}} \Pr(v_i = c)\,\log \Pr(v_i = c) \tag{8}$$
If vi is unanimously assigned to one subunit, then H(i) = 0, and if Pr(vi = 1) = Pr(vi = 2) = 0.5, then H(i) = log(2) = 0.6931. The cross subject consistency is defined to be the average entropy over all voxels,
$$\mathcal{H} = \frac{1}{N}\sum_{i=1}^{N} H(i) \tag{9}$$
The smaller the average entropy ℋ, the better the agreement between partitions across all subjects. Given the synthetic data, we are able to examine the relation between classification error and group consistency measured by the average entropy. Figure 2 shows that the classification error is positively correlated with the average entropy. Therefore group consistency is a good measure of an algorithm's efficacy to identify a subdivision.
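The group consistency measure can be computed directly from the per-subject label vectors. The sketch below uses only the standard library; the function name `average_entropy` is hypothetical, the natural logarithm is used (consistent with H(i) = log(2) = 0.6931 at maximum uncertainty), and it assumes the labels have already been matched across subjects so that subunit 1 denotes the same region for everyone.

```python
import math

def average_entropy(label_vectors):
    """label_vectors: one list of labels in {1, 2} per subject, all of length N.

    Returns (average entropy over voxels, list of per-voxel entropies).
    """
    n_subjects = len(label_vectors)
    n_voxels = len(label_vectors[0])
    per_voxel = []
    for i in range(n_voxels):
        p1 = sum(1 for z in label_vectors if z[i] == 1) / n_subjects
        h = 0.0
        for p in (p1, 1.0 - p1):
            if p > 0.0:                  # convention: 0 * log(0) = 0
                h -= p * math.log(p)     # natural logarithm
        per_voxel.append(h)
    return sum(per_voxel) / n_voxels, per_voxel

# Four subjects, three voxels: voxel 1 unanimous, voxel 2 split 50/50,
# voxel 3 unanimous.
labels = [[1, 1, 2], [1, 2, 2], [1, 1, 2], [1, 2, 2]]
avg, per_voxel = average_entropy(labels)
print(avg, per_voxel)
```

In this toy example only the second voxel contributes, with entropy log(2), so the average is log(2)/3.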
Figure 2.
Relationship between classification errors and group consistency obtained using synthetic data. X-axis: the total number of voxels that are incorrectly classified; Y-axis: the group consistency measured by average entropy across fifty subjects (realizations). The quantities on the X and Y axes are positively correlated, with Pearson correlation coefficient r = 0.9853.
4. Experiment Results
4.1. Synthetic Data
We applied the normalized cut algorithm, the modularity detection algorithm and the GMM to the four synthetic datasets. The classification errors were averaged over fifty independent realizations (forty-three for syn-data4). Table 3 summarizes the error statistics of the two graph-based approaches using different k and σ. Table 4 lists the best classification errors achieved by all three algorithms.
Table 3. Classification errors from synthetic datasets.
| Dataset | k | Ncut, σ = ξ/2 | Ncut, σ = ξ | Ncut, σ = 2ξ | Modularity, σ = ξ/2 | Modularity, σ = ξ | Modularity, σ = 2ξ |
|---|---|---|---|---|---|---|---|
| syn-data1 | 20 | 23 | 23 | 23 | 41 | 38 | 37 |
| syn-data1 | 50 | 12 | 12 | 12 | 15 | 14 | 14 |
| syn-data1 | 100 | 8 | 8 | 8 | 10 | 10 | 9 |
| syn-data2 | 20 | 152 | 130 | 129 | 220 | 199 | 194 |
| syn-data2 | 50 | 222 | 84 | 84 | 110 | 103 | 100 |
| syn-data2 | 100 | 246 | 65 | 65 | 76 | 72 | 72 |
| syn-data3 | 20 | 55 | 78 | 83 | 451 | 417 | 421 |
| syn-data3 | 50 | 35 | 52 | 55 | 380 | 312 | 311 |
| syn-data3 | 100 | 36 | 49 | 51 | 362 | 300 | 280 |
| syn-data4 | 20 | 5 | 5 | 6 | 188 | 74 | 53 |
| syn-data4 | 50 | 17 | 19 | 20 | 137 | 59 | 43 |
| syn-data4 | 100 | 36 | 36 | 33 | 115 | 50 | 46 |
Table 4. Best classification errors from synthetic datasets.
| Algorithm | syn-data1 | syn-data2 | syn-data3 | syn-data4 |
|---|---|---|---|---|
| Ncut | 8 | 65 | 36 | 5 |
| Modularity | 9 | 72 | 280 | 43 |
| GMM | 5 | 43 | 10 | 208 |
4.2. Real fMRI Data
In the experiments with real fMRI data, we were interested in subdividing two regions of interest of the human cortex, namely the visual cortex (VC) and the intraparietal sulcus (IPS). The delineation of the two ROIs was obtained using resting-state fMRI and task-based fMRI data. We denoted the four sets of data as “fMRI-data1”, “fMRI-data2”, “fMRI-data3” and “fMRI-data4”, respectively (see Table 5 for explanation).
Table 5. Four sets of fMRI data.
| ROI | Condition | |||
|---|---|---|---|---|
| VC | IPS | Resting | Task | |
| fMRI-data1 | ✓ | ✓ | ||
| fMRI-data2 | ✓ | ✓ | ||
| fMRI-data3 | ✓ | ✓ | ||
| fMRI-data4 | ✓ | ✓ | ||
The segmentation was done in subject space so that each voxel in the region of interest was assigned an integer label (1 or 2). The label images were then transformed into the reference space of resolution 3 × 3 × 3 mm³ to obtain a group-wise segmentation. The labels in the group-wise result were assigned by majority vote.
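A majority-vote rule of this kind can be sketched as follows. The function name `majority_vote` is hypothetical, and the handling of exact ties (assigned to label 1 here) is an assumption, since the tie rule is not stated in the text.

```python
def majority_vote(label_vectors):
    """label_vectors: one list of labels in {1, 2} per subject, all of length N.

    Returns the group-wise label for each voxel.
    """
    n_subjects = len(label_vectors)
    group = []
    for i in range(len(label_vectors[0])):
        votes_for_1 = sum(1 for z in label_vectors if z[i] == 1)
        # Ties go to label 1 here; this tie rule is an assumption.
        group.append(1 if 2 * votes_for_1 >= n_subjects else 2)
    return group

print(majority_vote([[1, 2, 2], [1, 1, 2], [1, 2, 1]]))
```

As with the entropy measure, this presumes that label 1 refers to the same subunit in every subject before voting.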
We used the group consistency measure (defined in Section 3.5) to evaluate the segmentation results. Figure 3 and Figure 4 show the entropy (Eq. 8) at each voxel of the group-wise segmentation from all three algorithms. Table 6 summarizes the average entropy. The Ncut algorithm has the overall best performance, whereas the GMM algorithm has the worst group consistency among the three. The segmentation by the Ncut algorithm shows the least amount of red (where red indicates lower group consistency than yellow) in Figures 3 and 4, and the red is tightly packed near the boundary of the two subunits. In contrast, the red region shown on the entropy map based on the GMM results is much more diffuse.
Figure 3.

Entropy calculated from group segmentation results of fMRI-data1. High consistency (small entropy value) indicated by the white/yellow color spectrum, low consistency (large entropy value) shown with the red color spectrum. A: results from the Ncut algorithm, average entropy 0.1825; B: results from the modularity detection algorithm, average entropy 0.3570; C: results from the GMM, average entropy 0.5023.
Figure 4.

Entropy calculated from group segmentation results from fMRI-data3. High consistency (small entropy value) indicated by the white/yellow color spectrum, low consistency (large entropy value) shown with the red color spectrum. A: results from the Ncut algorithm, average entropy 0.3866; B: results from the modularity detection algorithm, average entropy 0.4880; C: results from the GMM, average entropy 0.4888.
Table 6. Group consistency (ℋ) from four sets of fMRI data.
| Algorithm | fMRI-data1 | fMRI-data2 | fMRI-data3 | fMRI-data4 |
|---|---|---|---|---|
| Ncut | 0.1825 | 0.1877 | 0.3866 | 0.3815 |
| Modularity | 0.3570 | 0.3192 | 0.4880 | 0.3575 |
| GMM | 0.5023 | 0.4888 | 0.4888 | 0.4772 |
One important goal in this work was to compare delineation of functional subunits under different conditions (resting-state data versus fMRI data in task active and task-non-active areas). Figure 5 shows the two-way segmentation of the visual cortex obtained using the normalized cut algorithm. The task-based fMRI data were acquired during a Stroop task. Note that although the visual cortex was actively involved during the Stroop task, the task itself was not designed to elicit functional differences in the visual cortex, thus this can be thought of as a non-specific task condition. We see that the results are highly consistent between the resting-state and the non-specific task state. The same way of subdividing the visual cortex was also shown by two other groups of investigators (Salvador et al., 2005; Smith et al., 2009). Figure 6 compares the segmentation results of the intraparietal sulcus. The parcellations are also consistent across conditions (i.e. task versus resting-state BOLD data). In this comparison, we not only have results from resting-state (Fig. 6A) and non-specific task state (Fig. 6B), but we also have results from a 3rd experiment (working memory task (Roth et al., 2009)) that explicitly delineated the functional subunits based on the particular task conditions (Fig. 6C).
Figure 5.

Group segmentation results of the visual cortex (VC). Both segmentations were obtained using the normalized cut algorithm. The colormap shows the classification of each voxel with its probability. The green/blue spectrum indicates membership of group I: blue indicates 100% agreement across individuals, and green indicates a little above 50% agreement across individuals. The red/yellow spectrum indicates membership of group II. A: segmentation based on resting-state fMRI; B: segmentation based on task-based fMRI.
Figure 6.

Group segmentation results of the intra-parietal sulcus (IPS). All three segmentations were obtained using the normalized cut algorithm. The colormap shows the classification of each voxel with its probability. The green/blue spectrum indicates membership of group I: blue indicates 100% agreement across individuals, and green indicates a little above 50% agreement across individuals. The red/yellow spectrum indicates membership of group II. A: segmentation based on resting-state fMRI; B: segmentation based on a Stroop task fMRI study; C: segmentation based on memory update/refresh task fMRI study.
5. Discussion
5.1. Ncut vs. Modularity Detection
According to Table 3, the normalized cut algorithm and the modularity detection algorithm performed almost equally well for syn-data1 and syn-data2, when the two sub-regions are of the same size. However for syn-data3, where one sub-region is seven times bigger than the other, the modularity detection algorithm failed to separate the two sub-regions. The failure of the modularity detection algorithm in the unbalanced case could be problematic for brain segmentation applications. The Ncut algorithm still attained reasonable results for syn-data3, and it outperformed the modularity detection algorithm significantly for syn-data4. In experiments with real fMRI data, Table 6 shows that the group results obtained using the Ncut algorithm are more consistent than those obtained using the modularity detection algorithm.
5.2. Non-Gaussian Mixture Distribution
It is not surprising to see that the GMM algorithm, which had the fewest clustering errors for syn-data1, syn-data2 and syn-data3, failed to separate the two regions in syn-data4. The datasets syn-data1, syn-data2 and syn-data3 were constructed using the Gaussian mixture model, so the GMM is the optimal algorithm to identify the component clusters. However, for syn-data4, constructed from resting-state fMRI data, the distribution of the data points is quite different. To the authors' knowledge, there are no published reports where investigators have attempted to model the distribution explicitly. This difference in the data distribution also makes the graphs constructed from syn-data1, syn-data2 and syn-data3 quite different from the graphs constructed from syn-data4. We can use the degree vector d to illustrate the difference. The left panel in Fig. 7 shows the histogram of d from one syn-data1 realization, and the middle panel shows the histogram of d from one syn-data4 realization. We can see that when the data follows a Gaussian mixture model, the degree vector is distributed approximately according to an exponential distribution (red curve in the rightmost panel). But with real fMRI data, the distribution of the degree vector resembles a gamma distribution with its shape parameter equal to two (or higher) (blue curve in the rightmost panel).
Figure 7.
Histogram of the degree vector d. Left: the corresponding graph was constructed based on one realization from syn-data1. Middle: the corresponding graph was constructed based on one realization from syn-data4, which was from resting-state fMRI. Right: blue curve: gamma distribution with shape parameter equal to 2; red curve: exponential distribution.
Based on the statistics shown in Table 6, we carried out a paired t-test on the difference of the group consistency between GMM and Ncut, and obtained p = 0.0443. In other words, in working with real fMRI data, Ncut works significantly better than GMM. Graph-based approaches are in general more versatile in situations where the data distribution is unknown, because the segmentation is based on pairwise connections rather than any assumptions about the global data distribution.
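The paired comparison can be checked from the values in Table 6 with a few lines of standard-library Python. This is a sketch assuming a two-sided paired t-test on the four datasets; the closed-form expression for the Student t distribution function used below holds only for three degrees of freedom, which is the case here (four paired observations).

```python
import math

ncut = [0.1825, 0.1877, 0.3866, 0.3815]   # Table 6, Ncut row
gmm = [0.5023, 0.4888, 0.4888, 0.4772]    # Table 6, GMM row

diffs = [g - n for g, n in zip(gmm, ncut)]
n = len(diffs)
mean = sum(diffs) / n
var = sum((x - mean) ** 2 for x in diffs) / (n - 1)   # sample variance
t = mean / math.sqrt(var / n)                          # paired t statistic

# Closed-form Student t CDF, valid only for 3 degrees of freedom (n = 4).
u = t / math.sqrt(3.0)
cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
p_two_sided = 2.0 * (1.0 - cdf)

print(round(t, 3), round(p_two_sided, 4))
```

With the rounded table entries this gives a t statistic of about 3.35 and a two-sided p-value of about 0.044, in line with the reported p = 0.0443.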
5.3. Parameter Selection
Table 3 and Table 4 show that the value of σ does not significantly affect the performance of the graph based approaches. It is reasonable to take σ equal to the median of the functional distances (σ = ξ). However the neighborhood size k plays a more important role. Table 3 shows that for syn-data1,2,3, the classification error decreases as k increases. Therefore for datasets with a Gaussian mixture distribution, it is desirable to choose k as large as possible. However for real fMRI data, the rule changes. Table 3 shows that the normalized cut algorithm achieved the best classification error at the smallest k value (k = 20). It is clear that smaller k is preferred. But for the modularity detection algorithm, the classification error depends on both σ and k, and it is not obvious how one should choose the combination. In our experiments with real fMRI datasets, the best group consistency was achieved by using small k (k/N = 0.02), for both the Ncut algorithm and the modularity detection algorithm. When working with fMRI datasets of larger size (N > 10K), one could choose an even smaller k/N ratio.
5.4. Resting vs Task
Figure 5 and Figure 6 show that the delineation of functional subunits in the two regions of interest is highly consistent between resting-state and task-based conditions. This agreement provides strong support for the claim that functional organization, in terms of subunit delineation, is maintained in both resting-state and task-based conditions, and can be revealed using resting-state fMRI. It is important to note that there is substantial evidence that the strength of the connections between different subunits can change with task (Hampson et al., 2004, 2006) or brain state (Bartels and Zeki, 2005; Vincent et al., 2007; Greicius and Menon, 2004), but the results here suggest that the underlying functional subunits do not change with task.
6. Conclusion
We have presented a review of state-of-the-art graph-based partitioning algorithms, and applied two such algorithms to the problem of parcellating the brain into functional subunits with resting-state fMRI data as input. The two graph-based approaches were compared against each other and against the GMM approach that is commonly used for clustering. We found that one of the graph-based approaches, the normalized cut algorithm, outperformed the other two algorithms on the in vivo fMRI data, in the sense that its segmentation was the most consistent across subjects. Additionally, the results showed that the subdivision of one sensory brain region and one higher cognitive brain region into small functional subunits remained invariant across resting-state and task-based conditions. This work has shown the feasibility of using an algorithm such as the Ncut to parcellate the cortex into functional subunits using resting-state BOLD data. This approach can potentially allow us to build a whole-brain atlas of minimal functional subunits that would provide a much more relevant context for describing fMRI results than current atlases such as the Brodmann atlas.
Furthermore, the majority of connectivity-based analyses rely on seed-to-seed or seed-to-whole-brain connectivity analysis, and both of these approaches are highly sensitive to the definition of the boundaries of the seed region. Various approaches, including functional task-based and anatomic-based seed definitions, have been used. Anatomic-based seed definitions are very problematic in cortical or subcortical regions that do not have clear anatomic boundaries. Seed regions based on functional tasks appear to work well, but functional localizers are not available for all cortical regions, and thus this approach does not allow analysis of the whole brain. If the region used as a seed contains a mixture of different time-courses (i.e. is poorly defined with respect to local connectivity), then correlations with this seed may not be meaningful. Moreover, for whole-brain survey studies such as drug trials or genetic phenotyping studies, neither the task-based functional approach nor the anatomic delineation of seed regions is available across the whole brain, and hence both methods are inadequate for such studies.
Many recent studies examining network properties of the brain through connectivity also rely on predefined nodes, again obtained from an anatomic atlas or from function, as starting points. The approach presented here can be used to delineate minimal subunits as node definitions; because these nodes have uniform time-courses, the network properties extracted from them are potentially more meaningful. In summary, the approach presented in this work could have a significant impact by producing an atlas of minimal functional subunits (minimal in the sense that they are as small as possible while maintaining across-subject consistency) for use in reporting fMRI task-based results, for providing starting regions of interest for further connectivity analyses of diseased or healthy populations, and for further analysis of network properties in the brain.
Figure 1.
Left: configuration A, the two parts have the same number of voxels; Right: configuration B, one part is seven times larger than the other.
Acknowledgments
This work was supported by NIH EB000473 and EB006494.
References
- Achard S, Salvador R, Whitcher B, Suckling J, Bullmore E. A resilient, low-frequency, small-world human brain functional network with highly connected association cortical hubs. Neuroscience. 2006:63–72. doi: 10.1523/JNEUROSCI.3874-05.2006.
- Bach F, Jordan M. Learning spectral clustering. Advances in Neural Information Processing Systems (NIPS) 2004;16.
- Bartels A, Zeki S. Brain dynamics during natural viewing conditions–a new guide for mapping connectivity in vivo. NeuroImage. 2005;24:339–349. doi: 10.1016/j.neuroimage.2004.08.044.
- Belkin M, Niyogi P. Semi-supervised learning on Riemannian manifolds. Machine Learning. 2004;56:209–239.
- Biswal B, Yetkin F, Haughton V, Hyde J. Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magnetic Resonance in Medicine. 1995:537–541. doi: 10.1002/mrm.1910340409.
- Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: Structure and dynamics. Physics Reports. 2006:175–308.
- Boykov Y, Kolmogorov V. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004;26:1124–1137. doi: 10.1109/TPAMI.2004.60.
- Buck R, Singhal H, Arora J, Schlitt H, Constable RT. Detecting change in BOLD signal between sessions for atlas-based anatomical ROIs. NeuroImage. 2008;40. doi: 10.1016/j.neuroimage.2008.01.001.
- Buckner RL, Sepulcre J, Talukdar T, Krienen FM, Liu H, Hedden T, Andrews-Hanna JR, Sperling RA, Johnson KA. Cortical hubs revealed by intrinsic functional connectivity: Mapping, assessment of stability, and relation to Alzheimer's disease. Neuroscience. 2009;29:1880–1893. doi: 10.1523/JNEUROSCI.5062-08.2009.
- Chena S, Rossa TJ, Zhana W, Myers CS, Chuang KS, Heishman SJ, Stein EA, Yang Y. Group independent component analysis reveals consistent resting-state networks across multiple sessions. Brain Research. 2008;1239:141–151. doi: 10.1016/j.brainres.2008.08.028.
- Chung F. Spectral Graph Theory. CBMS-AMS; 1997.
- Clauset A, Newman MEJ, Moore C. Finding community structure in very large networks. Physical Review. 2004;70. doi: 10.1103/PhysRevE.70.066111.
- Damoiseaux JS, Rombouts SARB, Barkhof F, Scheltens P, Stam CJ, Smith SM, Beckmann CF. Consistent resting-state networks across healthy subjects. PNAS. 2006;103:13848–13853. doi: 10.1073/pnas.0601417103.
- Golland Y, Golland P, Bentin S, Malach R. Data-driven clustering reveals a fundamental subdivision of the human cortex into two global systems. Neuropsychologia. 2008;46:540–553. doi: 10.1016/j.neuropsychologia.2007.10.003.
- Greicius MD, Menon V. Default-mode activity during a passive sensory task: uncoupled from deactivation but impacting activation. J Cogn Neurosci. 2004;16:1484–1492. doi: 10.1162/0898929042568532.
- Hampson M, Driesen N, Skudlarski P, Gore J, Constable R. Brain connectivity related to working memory performance. Neuroscience. 2006;26:13338–43. doi: 10.1523/JNEUROSCI.3408-06.2006.
- Hampson M, Olson I, Leung HC, Skudlarski P, Gore J. Changes in functional connectivity of human MT/V5 with visual motion input. NeuroReport. 2004;15:1315–1319. doi: 10.1097/01.wnr.0000129997.95055.15.
- Latora V, Marchiori M. Efficient behavior of small-world networks. Physical Review Letters. 2001;87. doi: 10.1103/PhysRevLett.87.198701.
- Luca MD, Beckmann C, Stefano ND, Matthews P, Smith S. fMRI resting state networks define distinct modes of long-distance interactions in the human brain. NeuroImage. 2006;29:1359–1367. doi: 10.1016/j.neuroimage.2005.08.035.
- McLachlan GJ, Peel D. Finite Mixture Models. Wiley; 2000.
- Martuzzi R, Ramani R, Qiu M, Rajeevan N, Constable RT. Functional connectivity and alterations in baseline brain state in humans. NeuroImage. 2009. doi: 10.1016/j.neuroimage.2009.07.028.
- Newman MEJ. Finding community structure in networks using the eigenvectors of matrices. Physical Review. 2006a;74. doi: 10.1103/PhysRevE.74.036104.
- Newman MEJ. Modularity and community structure in networks. PNAS. 2006b;103:8577–8582. doi: 10.1073/pnas.0601602103.
- Ogawa S, Lee T, Kay A, Tank D. Brain magnetic resonance imaging with contrast dependent on blood oxygenation. PNAS. 1990;87:9868–9872. doi: 10.1073/pnas.87.24.9868.
- Roth J, Raye MK, Constable RT. Similar and dissociable mechanisms for attention to internal versus external information. NeuroImage. 2009. doi: 10.1016/j.neuroimage.2009.07.002. In press.
- Salvador R, Suckling J, Coleman M, Pickard J, Menon D, Bullmore E. Neurophysiological architecture of functional magnetic resonance images of human brain. Cerebral Cortex. 2005. doi: 10.1093/cercor/bhi016.
- Schwartz A, Gozzi A, Bifone A. Community structure and modularity in networks of correlated brain activity. Magn Reson Imag. 2008;26:914–920. doi: 10.1016/j.mri.2008.01.048.
- Shi J, Malik J. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000;22:888–905.
- Smith SM, Fox PT, Miller KL, Glahn DC, Fox PM, Mackay CE, Filippini N, Watkins KE, Toro R, Laird AR, Beckmann CF. Correspondence of the brain's functional architecture during activation and rest. PNAS. 2009;106:13040–13045. doi: 10.1073/pnas.0905267106.
- Sporns O, Honey CJ, Kotter R. Identification and classification of hubs in brain networks. PLoS ONE. 2007;10. doi: 10.1371/journal.pone.0001049.
- Stroop J. Studies of interference in serial verbal reactions. J Exp Psychology. 1935;18:643–662.
- Thirion B, Dodel S, Poline J. Detection of signal synchronizations in resting-state fMRI datasets. NeuroImage. 2006;29:321–327. doi: 10.1016/j.neuroimage.2005.06.054.
- van den Heuvel M, Mandl R, Pol HH. Normalized cut group clustering of resting-state fMRI data. PLoS ONE. 2008;3:1–11. doi: 10.1371/journal.pone.0002001.
- Vincent J, Patel G, Fox M, Snyder A, Baker J, Essen DV, Zempel J, Snyder L, Corbetta M, Raichle M. Intrinsic functional architecture in the anaesthetized monkey brain. Nature. 2007:83–86. doi: 10.1038/nature05758.
- Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998;393. doi: 10.1038/30918.
- Yu SX, Shi J. Multiclass spectral clustering. International Conference on Computer Vision. 2003.



