Brain Connectivity. 2021 Mar 15;11(2):132–145. doi: 10.1089/brain.2020.0794

A Classification-Based Approach to Estimate the Number of Resting Functional Magnetic Resonance Imaging Dynamic Functional Connectivity States

Debbrata K Saha 1, Eswar Damaraju 1, Barnaly Rashid 2, Anees Abrol 1, Sergey M Plis 1, Vince D Calhoun 1
PMCID: PMC7993535  PMID: 33317408

Abstract

Aim: To determine the optimal number of connectivity states in dynamic functional connectivity analysis.

Introduction: Recent work has focused on the study of dynamic (vs. static) brain connectivity in resting functional magnetic resonance imaging data. In this work, we focus on the temporal correlation between time courses extracted from coherent networks, called functional network connectivity (FNC). Dynamic FNC is most commonly estimated using a sliding window-based approach to capture short periods of FNC change. These data are then clustered to estimate transient connectivity patterns, or states. Determining the number of states is a challenging problem, and the elbow criterion is one of the most widely used approaches for doing so.

Materials and Methods: In our work, we present an alternative approach that evaluates classification (e.g., healthy controls [HCs] vs. patients) as a measure to select the optimal number of states (clusters). We apply different classification strategies to perform classification between HCs and patients with schizophrenia for different numbers of states (i.e., varying the model order in the clustering algorithm). We compute cross-validated accuracy for different model orders to evaluate the classification performance.

Results: Our results are consistent with our earlier work, which showed that overall accuracy improves when dynamic connectivity measures are used separately or in combination with static connectivity measures. Results also show that the optimal model order for classification differs from that obtained using the standard k-means model selection method, and that such optimization improves cross-validated accuracy. The optimal model order obtained from the proposed approach also gives significantly improved classification performance over the traditional model selection method.

Conclusion: The observed results suggest that if one's goal is to perform classification, using the proposed approach as a criterion for selecting the optimal number of states in dynamic connectivity analysis leads to improved accuracy in hold-out data.

Impact statement

We present a novel approach to estimate the optimal number of brain states for dynamic functional network connectivity (dFNC), which is typically estimated using the elbow criterion, silhouette width, gap statistic, and similar methods. Our results show that the proposed classification framework (1) identifies a clear optimal solution and (2) obtains higher cross-validated accuracy than the elbow criterion. In sum, this provides a new approach for estimating and characterizing the optimal number of cluster centroids in dFNC measures, which appears to be useful for classification, prediction, and characterization of healthy and disordered brain function.

Keywords: classification, fMRI, functional connectivity, ICA, model order, schizophrenia

Introduction

The unconstrained resting human brain has been shown to exhibit time-varying functional connectivity (FC) dynamics (Chang and Glover, 2010; Hutchison et al., 2013; Sakoglu et al., 2010). Since then, researchers have developed several methods to estimate these intrinsic connectivity dynamics (Calhoun et al., 2014; Preti et al., 2017) and to study if these dynamic connectivity estimates provide additional diagnostic value above average (static) FC between brain regions that assumes constancy of connectivity throughout the scan (Damaraju et al., 2014; Rashid et al., 2016).

FC between two brain regions in resting-state functional magnetic resonance imaging (rsfMRI) data is computed as a pairwise statistical dependency (usually correlation) between their time courses (TCs). To delineate brain FC, rsfMRI data have been analyzed using a variety of analytical tools. Two of the major approaches are (1) seed-based analysis (Biswal et al., 1995; Greicius et al., 2003) and (2) data-driven methods, such as independent component analysis (ICA) (Calhoun and Adali, 2012; Calhoun et al., 2001b, 2009; Damoiseaux et al., 2006; Fox and Raichle, 2007). Connectivity estimates computed from brain network TCs derived using blind decomposition techniques, such as ICA, are referred to as functional network connectivity (FNC) (Jafri et al., 2008). To capture dynamic changes in brain connectivity, one of the most common methods for estimating time-varying connectivity states is the sliding window method, in which correlations between network TCs are computed within each window and then clustered using algorithms such as k-means to identify recurring stable patterns of FNC across time and subjects (Allen et al., 2014).
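The sliding window plus k-means pipeline described above can be sketched as follows. This is a minimal toy example, not the article's implementation: random TCs stand in for ICA-derived network TCs, with 5 networks instead of 47, and scikit-learn's k-means stands in for the GIFT toolbox. The 22-TR window and 1-TR step match the settings used later in this article.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy network time courses: (timepoints, networks); real data would be
# ICA-derived TCs (e.g., 162 TRs x 47 ICNs per subject).
tc = rng.standard_normal((162, 5))

win, step = 22, 1                    # 22-TR window, sliding in 1-TR steps
n_nets = tc.shape[1]
iu = np.triu_indices(n_nets, k=1)    # upper triangle = unique network pairs

windows = []
for start in range(0, tc.shape[0] - win + 1, step):
    c = np.corrcoef(tc[start:start + win].T)   # windowed FNC matrix
    windows.append(c[iu])                      # vectorize unique pairs
windows = np.asarray(windows)                  # (n_windows, n_pairs)

# Cluster the windowed FNC patterns into k recurring "states"
k = 4
states = KMeans(n_clusters=k, n_init=10, random_state=0).fit(windows)
print(windows.shape, states.cluster_centers_.shape)
```

Each cluster centroid is one connectivity "state"; the question addressed in this article is how to choose k.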

Recently, there has been growing interest in developing techniques to classify subjects into diagnostic groups using FC across various research domains (Allen et al., 2014; Kim et al., 2016; Plitt et al., 2015; Rahaman et al., 2019; Saha et al., 2017, 2019). Some recent studies have evaluated classification among bipolar disorder and schizophrenia (SZ) patients using features generated from FC (Arbabshirani et al., 2013; Shen et al., 2010; Su et al., 2013). An atlas-based approach was utilized in Shen et al. (2010) to compute the mean TCs of 116 brain regions for both resting-state control and patient subjects. They extracted features from each subject by computing the correlation between the TCs, reduced the feature dimension using a dimensionality reduction method, and finally classified patients from controls with good accuracy. In another approach, features derived from sliding-window-based dynamic FNC (dFNC) measures were shown to provide significant improvements in classification accuracy of patients with SZ and bipolar disorder from healthy controls (HCs) compared with static FNC (sFNC) measures (Rashid et al., 2016); in that work, the dFNC states were computed by clustering sliding-windowed FNC estimates using k-means. In a further classification approach, features were extracted from discrete and stable rsfMRI network states generated by k-means and then used to train a support vector machine (SVM) classifier to distinguish major depressive disorder from HCs (Byun et al., 2014). Extracting features from different states raises the challenge of determining the optimal number of states. Data-driven methods to select the model order for the k-means algorithm include (1) the elbow method, (2) the gap statistic (Tibshirani et al., 2001), and (3) the mean silhouette width (Rousseeuw, 1987).
The main disadvantage of the elbow and silhouette methods for selecting the model order is that both rely on global clustering characteristics. In addition, the elbow cannot always be unambiguously identified: sometimes there is no elbow, and sometimes several elbows exist in a given data distribution (Kodinariya and Makwana, 2013). A limitation of the gap statistic is that it can struggle to find the optimal number of clusters when the data are not well separated (Wang et al., 2018). Moreover, it is unclear whether the number of clusters obtained using these methods is optimal for classification. To mitigate this problem, we introduce an approach that estimates the optimal number of dFNC states based on the maximal classification rate within a nested cross-validation analysis. While we build our current work on the previously published work by Rashid and associates (2016), we address a few limitations of that work. Rashid and colleagues focused on classification performance using static, dynamic, and combined connectivity features; however, the authors did not utilize a compact feature set for classification purposes. In this study, we explore both capturing dynamic information from time-varying connectivity windows and identifying a set of robust and compact features that lead to better classification performance. Identifying the number of clusters in the k-means approach that represents the complete data matrix in the most informative way, while eliminating information redundancy, is a challenging issue. This is especially important for classification algorithms, where feature selection and reduction are vital for optimal model performance. Our framework focuses on this issue to capture the optimal feature sets.

In our work, we extensively consider static, dynamic, and combined (static and dynamic) features in our classification task and show how the selection of the optimal model order influences classification accuracy. We also conduct statistical significance tests on the classification accuracies between the different approaches using paired t-tests. We further investigate whether subclustering of the obtained k-means states (hierarchical clustering) can improve classification accuracy by capturing additional variation in the dFNC measures. In addition, instead of taking all sFNC features for classification, we search for the optimal number of sFNC features that further enhances classification accuracy when combined with dFNC features. In our final analysis, we identify the most predictive brain regions by evaluating the predictive power of the sFNC features in the combined (sFNC+dFNC) classification analysis. In this work, we propose a classification-based model order selection approach to identify the optimal number of distinct states. We show that our approach (1) provides an unambiguous estimate and (2) results in higher classification accuracy in hold-out data. To the best of our knowledge, our proposed method is the first approach to estimate the optimal number of dFNC states based on classification rates.

Materials and Methods

IRB approvals and patient consents

The study was approved and monitored by the Institutional Review Boards of the following collaborating data collection sites included in this work: University of California Irvine, University of California Los Angeles, University of California San Francisco, Duke University, University of North Carolina, University of New Mexico, University of Iowa, and University of Minnesota. Signed informed consent was obtained from each participant at the outset, permitting deidentified data to be shared between centers.

Participants

In our experiments, data were acquired from the multisite Functional Imaging Biomedical Informatics Research Network (fBIRN) project (Potkin and Ford, 2009). This study included rsfMRI data scanned during the eyes-closed state and collected from seven different sites across the United States (Supplementary Data and Supplementary Tables S1 and S2). These data consist of 163 HCs (117 males, 46 females; mean age 36.9) and 151 SZ patients (114 males, 37 females; mean age 37.8), and the two groups were matched by age and gender. Previous studies from our group have used these data as well (Abrol et al., 2017; Damaraju et al., 2014).

Imaging parameters

Imaging data were acquired using two types of scanners: data from six sites were collected on a 3T Siemens Tim Trio System scanner, and one site used a 3T General Electric Discovery MR750 scanner. rsfMRI scans were obtained using a standard gradient-echo echo planar imaging sequence with field of view 220 × 220 mm (64 × 64 matrix), repetition time (TR) = 2 sec, echo time (TE) = 30 ms, flip angle = 77°, 162 volumes, and 32 sequential ascending axial slices of 4 mm thickness with a 1 mm gap.

Data preprocessing

Raw fMRI data went through a preprocessing pipeline using a combination of AFNI (http://afni.nimh.nih.gov/) (Cox, 1996), SPM, and GIFT toolboxes. Briefly, the preprocessing steps included (1) subject head motion correction using the INRIAlign toolbox in SPM; (2) slice-timing correction using SPM to address the timing differences associated with slice acquisition; (3) despiking the data using AFNI's 3dDespike algorithm to reduce the impact of outliers; (4) normalizing the data to a Montreal Neurological Institute template and resampling to 3 mm isotropic voxels; and (5) smoothing the data to 6 mm full-width-at-half-maximum using AFNI's BlurToFWHM algorithm, which performs smoothing via a conservative finite difference approximation to the diffusion equation and reduces scanner-specific variability in smoothness by providing "smoothness equivalence" to data across multiple sites (Friedman et al., 2008). Finally, before group independent component analysis (GICA), variance normalization was performed on each voxel's TC, since this approach provides a better decomposition of subcortical (SC) sources in addition to cortical networks (Allen et al., 2010). GICA and post-ICA processing were done using GIFT, and sFNC and dFNC were computed using the GIFT dFNC toolbox.

Group independent component analysis

A GICA framework, as implemented in the GIFT software (Calhoun et al., 2001a; Erhardt et al., 2011), was used to analyze the preprocessed functional data. Spatial ICA models the subject data as a linear mixture of spatially independent components. Principal component analysis (PCA) was first used to reduce each subject's 162-time-point data to 100 principal directions. These subject-reduced data were concatenated across time, and a group-level PCA was applied to reduce the concatenated data to the 100 directions of maximum group variability. The Infomax algorithm (Bell and Sejnowski, 1995) was used to estimate 100 independent components from the group-PCA-reduced matrix. To confirm stability, the ICA algorithm was repeated 20 times in ICASSO, and a central solution was selected from the modes of the component clusters. Finally, the spatiotemporal regression back-reconstruction approach (Calhoun et al., 2001a; Erhardt et al., 2011) was used to retrieve subject-specific spatial maps (SMs) and TCs.

Post-ICA processing

One-sample t-test maps of each SM across all subjects were computed and thresholded to identify the regional peak activation clusters for that component. Furthermore, the mean power spectra of the corresponding TCs of all components were computed. For each independent component (IC), we computed the percentage of the thresholded component SM that overlapped with gray matter. If the overlap was above 60% (indicating high overlap with gray matter; low spatial overlap with known vascular, ventricular, motion, and susceptibility artifacts; and TCs dominated by low-frequency fluctuations), we considered that component to be an intrinsic connectivity network (ICN); otherwise, we excluded it. Based on this selection procedure, 47 of the 100 independent components were retained as ICNs. The cluster stability/quality (Iq) index (returned by ICASSO) for each estimate-cluster ranks the corresponding ICA estimate. The Iq index across the 20 ICASSO runs was very high (Iq > 0.9) for all components, except for an ICN resembling the language network (Iq = 0.74).

After obtaining the 47 ICNs, their corresponding TCs for each subject were detrended, orthogonalized with respect to the subject-specific motion parameters, and then despiked to reduce the effect of outliers on subsequent FNC measures [see supplementary fig. 1 in Allen and associates (2014)]. Despiking was performed by first detecting spikes using AFNI's 3dDespike algorithm and then replacing them with values obtained from a third-order spline fit to neighboring clean portions of the data.

Static functional network connectivity

sFNC was computed as the pairwise correlation between ICN TCs, using the entire TC of each ICN over the full scan period. Before computing FNC, the ICN TCs were band-pass filtered with a passband of 0.01–0.15 Hz using a fifth-order Butterworth filter. After computing the mean sFNC matrix across subjects, this matrix was arranged into modular partitions using the Louvain algorithm of the brain connectivity toolbox.
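A minimal sketch of this sFNC computation, assuming NumPy/SciPy and toy random TCs in place of the real ICN TCs; the filter order, passband, and TR follow the values stated above, and the zero-phase `filtfilt` call is an illustrative choice rather than the toolbox's exact implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

TR = 2.0                      # seconds per volume
fs = 1.0 / TR                 # sampling frequency (0.5 Hz)

# Fifth-order Butterworth band-pass, passband 0.01-0.15 Hz
b, a = butter(5, [0.01, 0.15], btype="bandpass", fs=fs)

rng = np.random.default_rng(1)
tc = rng.standard_normal((162, 47))          # toy (timepoints, ICNs)
tc_filt = filtfilt(b, a, tc, axis=0)         # zero-phase filtering

sfnc = np.corrcoef(tc_filt.T)                # 47 x 47 sFNC matrix
iu = np.triu_indices(47, k=1)
features = sfnc[iu]                          # 1081 static features
print(features.shape)
```

The 1081 upper-triangle values per subject are exactly the static feature vector used for classification later in this article.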

Finally, the rows of the sFNC matrix were partitioned into SC, auditory (AUD), visual (VIS), sensorimotor (SM), a broad set of regions involved in cognitive control (CC) and attention, default-mode network (DMN), and cerebellar (CB) components. To measure group differences among ICNs in sFNC, a multivariate analysis of covariance framework (Allen et al., 2011) was used, with diagnosis, site, and gender as factors, age as a covariate, and interactions of gender and age by diagnosis. In addition, to account for residual motion-related variance, mean framewise displacement (meanFD) was included as a nuisance covariate in the ICA-derived measures, as suggested in recent studies (Sakoglu et al., 2010; Yan et al., 2013). A reduced model was obtained by eliminating one term and any associated interactions at a time. At each step, the multivariate model compared the variance explained in the response variable by the current full model and the reduced model using the Wilks' Lambda likelihood ratio test statistic (Christensen, 2001). In this model reduction step, the term that did not meet the threshold α = 0.01 for the F-test was deemed least significant and removed.

Dynamic functional network connectivity

To compute dFNC between ICA TCs, a sliding window of 22 TRs (44 sec), advanced in steps of 1 TR, was used, following our earlier work (Allen et al., 2014). To reduce the noise of covariance estimates from such short time series, covariance was estimated from the regularized inverse covariance matrix (ICOV) (Smith et al., 2011; Varoquaux et al., 2010) using the graphical LASSO framework (Friedman et al., 2008), with an L1-norm constraint applied to the ICOV to enforce sparsity. The regularization parameter for each subject was optimized in a cross-validation framework by assessing the log-likelihood of that subject's unseen data. A recent report suggested that the original graphical LASSO implementation might not ensure positive semidefiniteness of the estimated covariance matrix (Mazumder and Hastie, 2012); in our analysis, we therefore examined the eigenvalues of the estimated dynamic covariance matrices and confirmed that they were all positive. Finally, the covariance values of the dFNC estimates for each subject were Fisher Z-transformed and residualized with respect to age, gender, and site using the reduced model determined from our sFNC analysis.
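The per-window regularized estimation can be sketched with scikit-learn's `GraphicalLassoCV` as a stand-in for the graphical LASSO framework used here: its cross-validated choice of the L1 penalty is analogous to the per-subject optimization described above, though not identical to it. The window below is toy random data, and the positive-eigenvalue check mirrors the sanity check reported in the text.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(5)
# One toy sliding window: 22 samples x 10 networks (real windows: 22 x 47)
window = rng.standard_normal((22, 10))

# L1-regularized inverse covariance (graphical LASSO); cross-validation
# selects the regularization strength from the data
model = GraphicalLassoCV(cv=3).fit(window)
cov = model.covariance_

# Sanity check: estimated covariance should be positive definite
eigvals = np.linalg.eigvalsh(cov)

# Fisher Z-transform of the implied correlations (clip to avoid arctanh(1))
d = np.sqrt(np.diag(cov))
corr = cov / np.outer(d, d)
z = np.arctanh(np.clip(corr, -0.999, 0.999))
print(cov.shape, bool(np.all(eigvals > 0)))
```

In the real pipeline this estimate is computed for every window of every subject, then Fisher Z-transformed and residualized as described above.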

Classification framework

We evaluated classification using sFNC, dFNC, and a combination of sFNC and dFNC features, following one of our earlier works (Rashid et al., 2016). In all experimental settings, we used a linear SVM in a repeated, nested cross-validation analysis: fivefold cross-validation with 10 repetitions. We used a linear kernel and optimized the cost parameter (C) using a grid search. To find the optimal value of C, we computed the cross-validated accuracy on each training fold for different values of C (minimum 0.03125, maximum 2, increment 0.01) and applied the model with the optimal (i.e., validated) value of this hyperparameter to the test set of that fold.
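The nested scheme above can be sketched with scikit-learn: an inner grid search over C wrapped in an outer fivefold loop. This is a toy illustration with random features and labels and a coarser C grid than the 0.01-step grid used in the article; in the real analysis the outer loop is additionally repeated 10 times.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

rng = np.random.default_rng(2)
X = rng.standard_normal((60, 20))   # toy (subjects x features)
y = rng.integers(0, 2, 60)          # toy HC/SZ labels

# Inner grid search over the SVM cost parameter C (article range ~0.03125-2)
grid = {"C": np.arange(0.03125, 2.0, 0.25)}   # coarser grid for the sketch
inner = StratifiedKFold(5, shuffle=True, random_state=0)
clf = GridSearchCV(SVC(kernel="linear"), grid, cv=inner)

# Outer fivefold loop: the validated model is applied to each held-out fold
outer = StratifiedKFold(5, shuffle=True, random_state=1)
acc = cross_val_score(clf, X, y, cv=outer)
print(acc.mean())
```

Because C is chosen only on training folds, the outer accuracies remain unbiased estimates of hold-out performance.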

sFNC approach

To perform classification on sFNC, we first split the data into five folds, each consisting of training and testing sets. For any given fold, we trained a linear SVM classifier on the training set, classified the subjects in the testing set using the trained classifier, and recorded the resulting accuracy. The same procedure was applied to the remaining four folds, and the whole procedure was repeated 10 times. The conceptual framework of the sFNC approach is shown in Figure 1.

FIG. 1.

An overview of the sFNC approach. Group ICA is used to decompose resting-state data from 314 subjects into 100 independent components. Among them, 47 ICNs are identified based on peak activation and overlapping criteria. Spatiotemporal regression back-reconstruction approach is used to retrieve subject-specific SMs and TCs. sFNC is computed as the covariance of TCs, which is finally used as static features. FNC, functional network connectivity; ICA, independent component analysis; sFNC, static FNC; SMs, spatial maps; TCs, time courses. Color images are available online.

dFNC approach

To evaluate classification using dFNC, we used the same data folds as in the sFNC approach to maintain consistency. We applied the k-means clustering algorithm to estimate the connectivity states, and the centroid information from these states was used to compute the classification features. We used different model orders (k = 2–10) to extract group-specific centroid information. That is, for any model order k, we took a fold of data and applied k-means to the HC and SZ training groups separately. After k-means completed, we saved the centroid information from the HC and SZ groups and concatenated these group-specific centroids to form a regression matrix Rgroups×centroids. Using this regression matrix, we regressed the windowed FNC matrices at each time point and computed the beta coefficients, β; at each time point there were model order (k) × 2 beta coefficients. Next, we computed the mean beta coefficients across all time windows of a given subject to obtain the final dFNC features (FeatdFNC); for model order k, each subject thus has 2k β coefficients. Finally, we trained a linear SVM using the beta coefficients of the training sets and computed the accuracy on the testing sets using this trained classifier. The same procedure was applied to the remaining four folds, and the whole procedure was repeated 10 times for each model order. The conceptual framework of the dFNC approach is shown in Figure 2.
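The centroid-regression step can be sketched as follows. This is a toy version under stated assumptions: random windowed FNC stands in for the real HC/SZ training windows, k = 4, and ordinary least squares (`np.linalg.lstsq`) is used as the regression; averaging the per-window betas yields the 2k-dimensional subject feature described above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
k, n_pairs = 4, 1081

# Toy windowed FNC for the two training groups: (windows, connections)
hc_windows = rng.standard_normal((500, n_pairs))
sz_windows = rng.standard_normal((500, n_pairs))

# Group-specific k-means centroids, concatenated into a regression matrix R
hc_c = KMeans(n_clusters=k, n_init=5, random_state=0).fit(hc_windows).cluster_centers_
sz_c = KMeans(n_clusters=k, n_init=5, random_state=0).fit(sz_windows).cluster_centers_
R = np.vstack([hc_c, sz_c])                  # (2k, connections)

# One subject's windowed FNC: regress each window onto the centroids,
# giving 2k betas per window; average over windows -> final dFNC feature
subj = rng.standard_normal((136, n_pairs))   # 136 windows
betas, *_ = np.linalg.lstsq(R.T, subj.T, rcond=None)   # (2k, windows)
feat_dfnc = betas.mean(axis=1)               # 2k-dimensional feature
print(feat_dfnc.shape)
```

Each beta measures how strongly a window resembles one group-specific state, so the averaged betas compactly summarize a subject's dynamics with only 2k numbers.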

FIG. 2.

An overview of the dFNC approach. At first, a group-wise k-means clustering was applied to training groups separately to obtain cluster centroids per group (HC and SZ). Next, a regression matrix was formed and the windowed FNC matrices at each time point were regressed based on this to extract the beta coefficient (β). Finally, the mean beta coefficient was computed across all the time windows for each subject, which are considered dynamic features. dFNC, dynamic FNC; HC, healthy control; SZ, schizophrenia. Color images are available online.

Combined sFNC and dFNC approach

Following the sFNC and dFNC evaluations, to evaluate the combined (static+dynamic) approach we used the same data folds as in the previous two computations. In this approach, for dimensionality reduction and feature selection, we estimated the most informative features using the double input symmetric relevance (DISR) method (Meyer and Bontempi, 2006), using these selectively reduced features instead of the entire set of sFNC features. The DISR method selects a subset of variables (i.e., features) such that the combination of selected variables yields higher mutual information with the target than the collective information given by each variable taken individually. We ran this analysis for a range of DISR feature counts (15, 20, 30, 40, 50, 100) to observe the fluctuation and variability of classification performance across different feature combinations. For any fold of data under a given model order (k), we extracted the DISR features from the sFNC training set, referring to these features as FeatsFNC. To obtain features from dFNC (FeatdFNC) for the same fold, we applied the same approach as in the dFNC case.

In the next analysis step, the extracted FeatsFNC and FeatdFNC of each subject were concatenated to form a set of combined features (FeatsFNC+dFNC). We then trained a linear SVM classifier using these combined features of the training set. For the testing set of the same fold, we used the same static feature (FeatsFNC) columns that were selected from the training set by the DISR method. To obtain the dFNC features (FeatdFNC), we computed the beta coefficients (β) of each subject using the regression matrix (Rgroups×centroids) of the training sets, following the same regression technique as outlined before. Next, we combined the static features (FeatsFNC) and dynamic features (FeatdFNC) to form the combined features (FeatsFNC+dFNC) of that testing set and classified the subjects using the training set's classifier. The same procedure was applied to the remaining four folds, the whole procedure was repeated 10 times for each model order, and this in turn was repeated for each of the DISR feature counts mentioned above. The conceptual framework of the combined (static+dynamic) FNC approach is shown in Figure 3.
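The feature-combination step can be sketched as follows. Note that DISR itself is not available in scikit-learn; as a loudly flagged stand-in, this toy example ranks features by univariate mutual information with the label (`mutual_info_classif`), keeps the top 40, and concatenates them with (here faked) 2k dFNC betas. All data are random placeholders.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(6)
X = rng.standard_normal((100, 200))          # toy (subjects x sFNC features)
y = rng.integers(0, 2, 100)                  # toy HC/SZ labels

# Stand-in for DISR: rank features by mutual information with the label
# and keep the top 40 (the article's best-performing DISR feature count)
mi = mutual_info_classif(X, y, random_state=0)
top40 = np.argsort(mi)[::-1][:40]

feat_sfnc = X[:, top40]                      # (subjects, 40) static features
feat_dfnc = rng.standard_normal((100, 8))    # placeholder 2k betas for k = 4
combined = np.hstack([feat_sfnc, feat_dfnc]) # combined feature matrix
print(combined.shape)
```

As in the article, the selected static columns would be fixed on the training fold and reused unchanged on the test fold to avoid leakage.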

FIG. 3.

An overview of the combined (static+dynamic) approach. Here, static and dynamic features are concatenated to create the combined features for the classification. However, instead of using all static features, the most informative features are selected using the DISR method from precomputed static features. DISR, double input symmetric relevance. Color images are available online.

Intersection approach

We use the intersection approach to check the robustness of the features. In this analysis, we keep the experimental setup the same as in the combined sFNC and dFNC analysis (Combined sFNC and dFNC Approach section), except that we add another layer for picking static features after applying the DISR method to the sFNC data. For a given data partition, we extract 40 DISR static features for each fold and retain the most stable static features (FeatsFNC_Intersection) by computing the intersection across all five folds. Finally, following the regression procedure of the dFNC Approach section, we concatenate FeatsFNC_Intersection and FeatdFNC to form combined features (FeatsFNC_Intersection+dFNC), which are used for classification with a linear SVM.

In addition, to identify the dominant static features contributing to classification accuracy across our entire set of experiments, we took the intersection of the static features (FeatsFNC_Intersection) across all 10 data partitions.
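The intersection step itself reduces to a set operation over the per-fold selections. The feature indices below are hypothetical, chosen only to illustrate the mechanics.

```python
from functools import reduce

# Hypothetical per-fold DISR selections: indices of the top-ranked sFNC
# features returned for each of the five folds (real runs select 40 each)
fold_selections = [
    {3, 17, 42, 99, 512},
    {3, 17, 42, 77, 512},
    {3, 17, 42, 512, 640},
    {3, 17, 42, 512, 900},
    {3, 17, 42, 101, 512},
]

# Stable features = those selected in every fold
stable = sorted(reduce(set.intersection, fold_selections))
print(stable)   # -> [3, 17, 42, 512]
```

The same intersection applied across the 10 data partitions yields the dominant static connections reported in the final analysis.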

Hierarchical approach

In this experiment, we examine whether further (i.e., second-level) clustering of the first-level states can capture additional variation in the dFNC measures and thereby improve classification accuracy. Here, we take information from the hierarchical states and feed it into our classification framework, monitoring the classifier's performance to gauge the contribution of this additional information. We test this hypothesis using three different experimental setups on the states obtained from running k-means (with the validated model order 4 only) on dFNC. We chose model order 4 because it yielded the highest accuracy in all our previous experimental settings.

In setup 1, we first ran k-means with model order 4 on each fold's dFNC training samples to estimate states. We then applied k-means with model order 4 again to the highest-occupancy state (the state containing the most observations). Next, the centroids from the first- and second-level (hierarchical) states were concatenated, and the beta coefficients were computed using the regression technique described in the dFNC Approach section.

In setup 2, we first ran k-means with model order 4 on each fold's dFNC training samples to estimate states. We then applied k-means with model order 2 to the highest-occupancy state. The centroids from the first- and second-level states were concatenated, and the beta coefficients were computed using the regression technique described in the dFNC Approach section.

In setup 3, we first ran k-means with model order 4 on each fold's dFNC training samples. Each first-level state was then further clustered by running k-means with model order 2 (resulting in eight substates per fold). Finally, as in the previous two setups, the centroids from these eight substates were concatenated and the beta coefficients were computed using the regression technique described in the dFNC Approach section.
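Setup 3 can be sketched as a two-level k-means: 4 first-level states, each split into 2 substates, giving 8 substate centroids per fold. The windowed FNC here is toy random data, and scikit-learn's k-means stands in for the actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
windows = rng.standard_normal((800, 50))   # toy windowed FNC (windows x connections)

# First level: model order 4
km1 = KMeans(n_clusters=4, n_init=5, random_state=0).fit(windows)

# Second level (setup 3): split every first-level state into 2 substates
centroids = []
for state in range(4):
    members = windows[km1.labels_ == state]
    km2 = KMeans(n_clusters=2, n_init=5, random_state=0).fit(members)
    centroids.append(km2.cluster_centers_)
R = np.vstack(centroids)                   # 8 substate centroids
print(R.shape)
```

The stacked substate centroids then play the same role as the regression matrix in the flat dFNC approach, yielding 8 betas per window.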

Results

Forty-seven ICNs were obtained through decomposition using spatial ICA, following one of our earlier investigations of whole-brain FC (Damaraju et al., 2014). Spatial maps of the 47 ICNs are shown in Figure 4. The ICNs are categorized by anatomical and functional domain into the SC, AUD, VIS, SM, CC, DMN, and CB networks.

FIG. 4.

Composite maps of the 47 identified ICNs, sorted into seven subcategories. Each color in the composite maps corresponds to a different ICN. ICNs, intrinsic connectivity networks. Color images are available online.

To perform classification on sFNC, we computed the pairwise correlation between the TCs of the 47 ICNs. Each subject contributes 47C2 = 1081 unique pairwise connections; these data form a 314 × 1081 feature matrix at the group level. Note that we initially take all sFNC features into account for classification, but for the combined (static+dynamic) case we extract the top informative features using the DISR method, as described in the Combined sFNC and dFNC Approach section. For dFNC, we computed the pairwise correlation between the TCs of the 47 ICNs within each dynamic window using the sliding-window approach (Allen et al., 2014; Rashid et al., 2014), yielding a 314 × 136 × 1081 (subjects × windows × connections) dFNC matrix. Recall that we regressed this dFNC matrix against the regression matrix estimated from the group-level states, which finally produces the dFNC features for classification, as described in the dFNC Approach section.

The boxplots in Figures 5–9 show the distribution of mean accuracies across the cross-validation repeats. The comparison of cross-validated classification accuracy between sFNC and dFNC is presented in Figure 5. In our classification framework, sFNC yielded an overall mean accuracy of 77.16% (range [75.49, 79.34]). For dFNC, the maximum accuracy occurred at model order 4, with a mean accuracy of 79.45% (range [77.96, 80.89]). The comparison of static versus combined (static+dynamic) features for a range of DISR feature counts is shown in Figure 6. After combining static with dynamic features, the maximum accuracy was obtained at model order 4 with the top 40 DISR static features, with a mean accuracy of 80.45% (range [78.37, 81.87]). A similar pattern was observed in each case, with the same model order (k = 4) being validated.

FIG. 5.

Classification accuracy of sFNC and dFNC (flat approach). On the X-axis, labels k = 2–10 indicate the model order for dFNC, and the Y-axis indicates the mean accuracy. Each boxplot consists of 10 points, where each point is the mean accuracy across the five folds of one repetition (10 repetitions in total). Color images are available online.

FIG. 6.

Classification accuracy of static versus combined (static+dynamic) features. Each subplot shows accuracy for a different number of DISR features (extracted from sFNC). On the X-axis, S represents accuracy for sFNC (selected static DISR features), labels 2–10 indicate the model order for combined features (static DISR features + dFNC beta coefficients), and the Y-axis indicates the mean accuracy. Color images are available online.

FIG. 7.

Classification accuracy of the static, dynamic, and combined FNC approaches. On the X-axis, labels k = 2–10 indicate the model order for dFNC, and the Y-axis indicates the mean classification accuracy. *indicates p < 0.05 (FDR corrected), **indicates p < 0.01 (FDR corrected), ***indicates p < 0.001 (FDR corrected). FDR, false discovery rate. Color images are available online.

FIG. 8.

Classification accuracy of the 40 DISR features versus the common DISR features (obtained by intersection across all folds) in the intersection approach. On the X-axis, labels k = 2–10 indicate the model order, and the Y-axis reports the mean accuracy values. For each model order, there is one pair of boxplots, where the red boxplot indicates the classification accuracy of the flat 40 DISR features and the green one represents the accuracy of the DISR intersection features. Color images are available online.

FIG. 9.

Classification accuracy of model order 4 for different settings of the hierarchical approach. On the X-axis, "first level" indicates the accuracy for model order 4 without hierarchical clustering. Settings "1–3" indicate the hierarchical classification accuracy of model order 4 for three different setups. Color images are available online.

We also conducted statistical significance tests on the classification accuracies for the static versus dynamic, dynamic versus combined, and static versus combined approaches using paired t-tests. The results of this analysis are summarized in Figure 7. In the combined approach, we used 40 static DISR features for all model orders. sFNC showed a statistically significant improvement over dFNC at model order 2, whereas dFNC showed statistically significant improvements over sFNC at model orders 3 and 4. Finally, the combined (static+dynamic) approach significantly outperformed sFNC at model orders 2, 3, and 4, and dFNC at model orders 2 and 10. For the validated model order (k = 4), we found significant improvements for the dynamic versus static (p < 0.01) and combined versus static (p < 0.001) comparisons.
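A minimal sketch of such a comparison, using scipy's paired t-test plus a hand-rolled Benjamini-Hochberg FDR step (the accuracy values below are simulated for illustration, not the paper's):

```python
import numpy as np
from scipy.stats import ttest_rel

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg FDR: returns a boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, float)
    order = np.argsort(p)
    m = p.size
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = np.nonzero(below)[0].max() + 1 if below.any() else 0
    mask = np.zeros(m, bool)
    mask[order[:k]] = True
    return mask

# simulated per-repeat mean accuracies (10 cross-validation repetitions)
rng = np.random.default_rng(1)
acc_static   = 77.2 + rng.normal(0, 0.8, 10)
acc_dynamic  = 79.4 + rng.normal(0, 0.8, 10)
acc_combined = 80.4 + rng.normal(0, 0.8, 10)

# paired across the same CV repetitions
pvals = [ttest_rel(acc_dynamic, acc_static).pvalue,
         ttest_rel(acc_combined, acc_dynamic).pvalue,
         ttest_rel(acc_combined, acc_static).pvalue]
print(fdr_bh(pvals))
```

The pairing matters because all three feature sets are evaluated on the same partitions of subjects, so the per-repeat accuracies are correlated.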

The classification accuracies of the intersection approach are shown in Figure 8. In this case, the intersected static features together with the dynamic features performed slightly better than the traditional combined approach in all experimental setups. Each boxplot consists of 10 points, where each point is the mean accuracy over the five cross-validated folds under a given partition of the data. Here, yet again, the highest classification accuracy was obtained at model order 4 (consistent with all previous results), where the mean accuracy is 81.53% with range [80.53–83.15]. Finally, we identified seven dominant sFNC features that were picked by the DISR feature selection method across all folds and all partitions, as shown in Figure 10.
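The intersection step itself is simple set logic. The sketch below uses synthetic per-fold selections in which seven "stable" feature indices (a stand-in for the seven dominant network pairs) always survive, while the remaining top-40 slots vary by fold:

```python
import numpy as np

def common_features(selected_per_fold):
    """Features selected in every fold and partition (intersection approach)."""
    return set.intersection(*(set(s) for s in selected_per_fold))

rng = np.random.default_rng(2)
stable = set(range(7))                       # hypothetical always-selected features
folds = []
for _ in range(50):                          # 5 folds x 10 partitions
    extra = rng.choice(np.arange(7, 1081), size=33, replace=False)
    folds.append(stable | set(extra.tolist()))

print(sorted(common_features(folds)))        # [0, 1, 2, 3, 4, 5, 6]
```

Features that appear in only a few folds are discarded, which is exactly why the intersected set is more robust than any single fold's top 40.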

FIG. 10.

Brain mapping of most predictive static features. Three pairs of thalamocortical connections (two thalamus-VIS and one thalamus-motor pair), two SM-VIS pairs, one SM-SC pair, and one VIS-AUD pair were consistently selected by the DISR method across all partitions of data. All 47 ICNs (selected from 100 independent components) are shown on the diagonal. AUD, auditory; SC, subcortical; SM, sensorimotor; VIS, visual. Color images are available online.

The classification accuracies of the hierarchical approach are shown in Figure 9. In this figure, the first plot (labeled "first level") represents the accuracy of dFNC for model order 4. The other three plots show the classification accuracies for three different settings of second-level (hierarchical) clustering, as described in the Hierarchical Approach section. In our analysis, the classification accuracy of the traditional clustering approach always outperformed the second-level clustering settings. For model order 4 of dFNC, we obtained the maximum accuracy, with a mean of 79.45% and range [77.96–80.89]. Meanwhile, in the hierarchical approach, after using second-level clustering information with model order 4 of dFNC, the best accuracy was obtained for Setting 2, with a mean of 79.01% and range [77.73–80.59].

Discussion

dFNC provides information about time-varying connectivity fluctuations (Allen et al., 2014). Unlike sFNC, it captures the local connectivity of each window rather than the mean connectivity. Hence, information that is missed by sFNC can potentially be captured by dFNC approaches. A recent study has shown that these informative dynamic features help distinguish between HCs and patients in a classification framework (Rashid et al., 2016). Results presented in Figure 5 show that the classification accuracy of dFNC outperforms sFNC for model orders 3 to 8, consistent with our earlier work, with the best accuracy obtained at model order 4. However, for higher model orders (k = 9, 10), the observed dFNC classification accuracies drop. We speculate that this may be due to the splitting of low-occupancy states, analogous to the results observed in Setting 3 of the hierarchical approach analyses (Fig. 9). Under such circumstances, outlier clusters (e.g., single-subject observations clustered together) may also occur, adding noise to the analysis. Moreover, the results demonstrate a consistent and sizable (k = 3–8) range over which this pattern of interest holds.

The classification accuracy improved after combining the dynamic and top sFNC features, as also reported in Rashid and associates (2016). In our work, we ran experiments for different subsets of static features selected by the DISR method, as presented in Figure 6. Consistent with the earlier work (Rashid et al., 2016), which used a separate data set, the classification accuracy improved over that obtained from sFNC or dFNC individually. The best accuracy was achieved with 40 static DISR features at model order 4, and the combined feature approach significantly outperformed sFNC for model orders 2, 3, and 4. Despite the lower accuracy of dFNC compared with sFNC at model order 2, the combined approach significantly outperformed both sFNC and dFNC there. A possible explanation is that the subset of static features (40 DISR features) plays a vital role in increasing the classification accuracy. This observation demonstrates that dynamic and static FNC methods capture complementary aspects of connectivity, and that the classification accuracy increases after combining the dynamic and top static features compared with using either feature set alone.
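DISR itself is not available in common Python libraries; the sketch below uses scikit-learn's `mutual_info_classif` purely as an illustrative stand-in for ranking static features before concatenating them with dynamic beta features. All data and labels here are simulated; only the dimensions mirror the text (1081 connections, 40 selected features, k = 4 states).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(6)
X_static = rng.standard_normal((314, 1081))      # simulated sFNC features
X_dyn = rng.standard_normal((314, 4 * 1081))     # simulated beta features, k = 4 states
y = rng.integers(0, 2, 314)                      # simulated HC/SZ labels

# rank static features and keep the top 40 (mirroring the 40-feature DISR subset)
mi = mutual_info_classif(X_static, y, random_state=0)
top40 = np.argsort(mi)[::-1][:40]

# combined (static+dynamic) feature matrix
X_combined = np.hstack([X_static[:, top40], X_dyn])
print(X_combined.shape)                          # (314, 4364)
```

In the actual pipeline, the feature ranking must be refit inside each training fold to avoid leaking label information into the selection.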

We utilized an intersection approach to assess the robustness of the features. In the combined approach described in the Combined sFNC and dFNC Approach section, the static features selected in each fold are independent. A feature that appears only once, in a single fold, may have been picked by chance, and this may affect performance. Our experimental results demonstrate the robustness of the features (Fig. 8). The intersection approach also identifies the most dominant static features (representing different brain network pairs) for classification. Our analysis highlighted seven brain network pairs that are consistently dominant across all 10 partitions of the data, as presented in Figure 10. These include three pairs of thalamocortical connections (two thalamus-VIS and one thalamus-motor pair), two sensorimotor (SM)-VIS pairs, one SM-SC pair, and one VIS-AUD pair. Notably, dysconnectivity of the thalamocortical connections has been reported earlier by Anticevic and associates (2014), and our findings are in line with this. The reduced anticorrelation between the thalamus and the sensorimotor network in patients with SZ is a good predictor and is consistent with earlier findings (Woodward and Heckers, 2016).

We used the hierarchical approach to examine whether information from the hierarchical states would improve the classification accuracy. The model order that yielded the best classification accuracy in our experiments was 4. We expected that further subdivision of the high-occupancy state (the state containing the maximum number of time windows after running k-means) and the information from the resulting hierarchical states might improve the accuracy further. In our analysis, however, the accuracy of the hierarchical approach (Fig. 9) did not outperform the dFNC approach (described in the dFNC Approach section). We speculate that by applying the hierarchical approach to the high-occupancy state, we eventually generate low-occupancy states, and the information from these hierarchical states was not useful for improving classification accuracy.
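The second-level split can be sketched with scipy's `kmeans2` on simulated window data (dimensions reduced for brevity; k = 4 at the first level, then a further split of the highest-occupancy state):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(3)
windows = rng.standard_normal((1000, 50))    # simulated windowed FNC samples

# first level: cluster all windows into k = 4 states
cent1, lab1 = kmeans2(windows, 4, minit='++', seed=0)
occupancy = np.bincount(lab1, minlength=4)
hi = int(occupancy.argmax())                 # highest-occupancy state

# second level: re-cluster only the windows assigned to the dominant state
sub = windows[lab1 == hi]
cent2, lab2 = kmeans2(sub, 2, minit='++', seed=0)
print(occupancy, sub.shape[0])
```

Each second-level cluster necessarily has lower occupancy than its parent state, which is consistent with the speculation above that the split produces low-occupancy states of limited discriminative value.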

Based on our experiments, we achieved the best cross-validated classification accuracy at model order 4 (i.e., k = 4 in k-means clustering) for distinguishing HC and SZ subjects in the fBIRN data. The elbow method suggested an optimal value of k = 5 for these data (Damaraju et al., 2014). We also ran the silhouette and gap statistic methods on the fBIRN data, setting the maximum cluster size to 10; these methods suggested model orders of 2 and 10, respectively. These results confirm our expectation that the optimal model order of the k-means algorithm for classification can differ from the model order obtained from these standard methods. Our work thus shows the benefit of conducting an exhaustive search on the training data and using it to select the clustering model order for brain imaging applications.
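The exhaustive search amounts to a loop over k, scoring each model order by cross-validated accuracy and keeping the argmax. A heavily simplified scikit-learn sketch (random data, so the chosen k is meaningless here; the helper name `cv_accuracy_for_k` is ours, and the clustering is applied to subject vectors rather than windows for brevity):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.standard_normal((120, 30))   # simulated per-subject connectivity summaries
y = rng.integers(0, 2, 120)          # simulated HC/SZ labels

def cv_accuracy_for_k(X, y, k):
    """Cluster into k states, regress the data on the centroids to obtain
    beta features, and score a linear SVM by 5-fold cross-validation."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    betas, *_ = np.linalg.lstsq(km.cluster_centers_.T, X.T, rcond=None)
    return cross_val_score(SVC(kernel='linear', C=1.0), betas.T, y, cv=5).mean()

scores = {k: cv_accuracy_for_k(X, y, k) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Unlike the elbow, silhouette, or gap statistic, this criterion selects k directly for the downstream task rather than for cluster compactness.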

Conclusion

In this work, we have investigated the efficiency of dFNC features for classifying HCs and SZ patients using different classification strategies. We have proposed a classification-based approach to estimate the optimal model order for dFNC, aiming to overcome the shortcomings of the elbow, silhouette, gap statistic, and similar methods that are traditionally used to find the optimal k in k-means. Our results showed that by using the proposed classification framework, higher accuracy can be achieved than with the traditional methods for selecting the optimal k. Finally, our classification framework can be used as a tool for estimating and characterizing the optimal number of cluster centroids in dFNC measures, which appears to be useful for classification, prediction, and characterization of healthy and disordered brain function.

Limitations and Future Directions

As dynamic features, we consider the mean regression coefficients across time for each of the derived HC or SZ states, when fitted to the sliding window-based dynamic FC samples. While the mean regression coefficients provide a compact and meaningful feature set, other approaches to extracting dynamic connectivity features should be considered in future work. For instance, instead of the mean, features such as the standard deviation or the instantaneous change in similarity to a given state would be interesting to explore. There are also experimental limitations in choosing the optimal window size for the dFNC approach: the optimal window should be able to measure the variability of FC while capturing short-time effects (Sakoglu et al., 2010). We used a sliding-window approach with a fixed window size of 22 TR (44 sec) in steps of 2 TR, following our earlier work (Allen et al., 2014; Damaraju et al., 2014). Capturing different patterns of variability using different window lengths in a dFNC approach, and fitting the resulting features into the classification framework, would be an interesting direction for further research.
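For reference, the tapered window used in Allen et al. (2014) — a rectangle convolved with a Gaussian — can be sketched as follows. The σ = 3 TR value and the exact normalization here should be treated as illustrative rather than an exact reproduction of that pipeline.

```python
import numpy as np

def tapered_window(width=22, sigma=3):
    """Rectangular window of `width` TRs convolved with a Gaussian (sigma in TRs)."""
    rect = np.ones(width)
    t = np.arange(-10, 11)                   # Gaussian support (21 samples)
    gauss = np.exp(-t**2 / (2.0 * sigma**2))
    taper = np.convolve(rect, gauss / gauss.sum(), mode='same')
    return taper / taper.max()

w = tapered_window()
print(w.shape)                               # (22,)
```

The taper down-weights samples at the window edges, which smooths the transition between successive windows compared with a hard rectangular window.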

In addition to the linear SVM classifier, a variety of other classifiers could be used to further examine the robustness of our proposed approach. For the SVM, we used a linear kernel and optimized the parameter C using a grid search. To the best of our knowledge, this is the first study to propose a classification-based approach for model order selection in a dFNC clustering analysis. A detailed comparison of different SVM kernels and optimization of other SVM hyperparameters would be interesting to explore in the future.
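Such a C search can be done with scikit-learn's `GridSearchCV`; the grid values below are placeholders, since the paper does not list its grid, and the data are simulated.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.standard_normal((100, 20))
y = (X[:, 0] + 0.5 * rng.standard_normal(100) > 0).astype(int)  # weakly separable labels

# 5-fold grid search over the SVM regularization parameter C
grid = GridSearchCV(SVC(kernel='linear'), {'C': [0.01, 0.1, 1, 10, 100]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Replacing `SVC(kernel='linear')` with an RBF kernel (and adding `gamma` to the grid) is one way to run the kernel comparison suggested above.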

Supplementary Material

Supplemental data
Supp_Data.docx (15.9KB, docx)
Supplemental data
Supp_TableS1-S2.pdf (26KB, pdf)

Authors' Contributions

Authors V.D.C., S.M.P., D.K.S., and E.D. designed the study. Author E.D. preprocessed the data. Authors D.K.S., E.D., B.R., A.A., and V.D.C. designed the experiment. Author D.K.S ran the analyses, carried out the statistical analysis, and wrote the first draft of the article. Authors B.R., V.D.C., A.A., and E.D. edited the article. All authors contributed to the different phases of experimental analyses and approved the final version of the article.

Author Disclosure Statement

The authors (D.K.S., E.D., B.R., A.A., S.M.P., V.D.C.) declare no potential conflicts of interest regarding personal, financial, or any other relationships with other people or organizations for this work. Finally, the authors or their affiliated institutions do not have any agreement or contract that could be considered involvement for the financial interest in this work.

Funding Information

This study was supported by the National Institutes of Health grants R01EB020407 and R01MH118695.


References

1. Abrol A, Rashid B, Rachakonda S, Damaraju E, Calhoun VD. 2017. Schizophrenia shows disrupted links between brain volume and dynamic functional connectivity. Front Neurosci 11:624.
2. Allen EA, Damaraju E, Plis SM, Erhardt EB, Eichele T, Calhoun VD. 2014. Tracking whole-brain connectivity dynamics in the resting state. Cereb Cortex 24:663–676.
3. Allen EA, Erhardt EB, Eichele T, Mayer AS, Calhoun VD. 2010. Comparison of pre-normalization methods on the accuracy and reliability of group ICA results. Organ Hum Brain Mapp. Abstract no: 3422.
4. Allen E, Erhardt E, Damaraju E, Gruner W, Segall J, Silva R, et al. 2011. A baseline for the multivariate comparison of resting-state networks. Front Syst Neurosci 5:2.
5. Anticevic A, Cole MW, Repovs G, Murray JD, Brumbaugh MS, Winkler AM, et al. 2014. Characterizing thalamo-cortical disturbances in schizophrenia and bipolar illness. Cereb Cortex 24:3116–3130.
6. Arbabshirani MR, Kiehl K, Pearlson G, Calhoun VD. 2013. Classification of schizophrenia patients based on resting-state functional network connectivity. Front Neurosci 7:133.
7. Bell AJ, Sejnowski TJ. 1995. An information-maximization approach to blind separation and blind deconvolution. Neural Comput 7:1129–1159.
8. Biswal B, Yetkin FZ, Haughton VM, Hyde JS. 1995. Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magn Reson Med 34:537–541.
9. Byun HY, Lu JJ, Mayberg HS, Günay C. 2014. Classification of resting state fMRI datasets using dynamic network clusters. Modern Artificial Intelligence for Health Analytics 14:2–6.
10. Calhoun VD, Adali T. 2012. Multisubject independent component analysis of fMRI: a decade of intrinsic networks, default mode, and neurodiagnostic discovery. IEEE Rev Biomed Eng 5:60–73.
11. Calhoun VD, Adali T, Pearlson GD, Pekar JJ. 2001a. A method for making group inferences from functional MRI data using independent component analysis. Hum Brain Mapp 14:140–151.
12. Calhoun VD, Eichele T, Pearlson G. 2009. Functional brain networks in schizophrenia: a review. Front Hum Neurosci 3:17.
13. Calhoun VD, Adali T, Pearlson G, Pekar J. 2001b. Group ICA of functional MRI data: separability, stationarity, and inference. In Proceedings of the International Conference on ICA and BSS, San Diego, CA.
14. Calhoun V, Miller R, Pearlson G, Adalı T. 2014. The chronnectome: time-varying connectivity networks as the next frontier in fMRI data discovery. Neuron 84:262–274.
15. Chang C, Glover GH. 2010. Time-frequency dynamics of resting-state brain connectivity measured with fMRI. Neuroimage 50:81–98.
16. Christensen R. 2001. Advanced Linear Modeling: Multivariate, Time Series, and Spatial Data; Nonparametric Regression and Response Surface Maximization. New York, NY: Springer.
17. Cox RW. 1996. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res 29:162–173.
18. Damaraju E, Allen EA, Belger A, Ford JM, McEwen S, Mathalon DH, et al. 2014. Dynamic functional connectivity analysis reveals transient states of dysconnectivity in schizophrenia. NeuroImage Clin 5:298–308.
19. Damoiseaux JS, Rombouts SARB, Barkhof F, Scheltens P, Stam CJ, Smith SM, et al. 2006. Consistent resting-state networks across healthy subjects. Proc Natl Acad Sci U S A 103:13848–13853.
20. Erhardt EB, Rachakonda S, Bedrick EJ, Allen EA, Adali T, Calhoun VD. 2011. Comparison of multi-subject ICA methods for analysis of fMRI data. Hum Brain Mapp 32:2075–2095.
21. Fox MD, Raichle ME. 2007. Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat Rev Neurosci 8:700–711.
22. Friedman J, Hastie T, Tibshirani R. 2008. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9:432–441.
23. Greicius MD, Krasnow B, Reiss AL, Menon V. 2003. Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proc Natl Acad Sci U S A 100:253–258.
24. Hutchison RM, Womelsdorf T, Allen EA, Bandettini PA, Calhoun VD, Corbetta M, et al. 2013. Dynamic functional connectivity: promise, issues, and interpretations. Neuroimage 80:360–378.
25. Jafri MJ, Pearlson GD, Stevens M, Calhoun VD. 2008. A method for functional network connectivity among spatially independent resting-state components in schizophrenia. Neuroimage 39:1666–1681.
26. Kim J, Calhoun VD, Shim E, Lee J-H. 2016. Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: evidence from whole-brain resting-state functional connectivity patterns of schizophrenia. NeuroImage 124:127–146.
27. Kodinariya T, Makwana PR. 2013. Review on determining number of cluster in k-means clustering. Int J Adv Res Comput Sci Manage Stud 1:90–95.
28. Mazumder R, Hastie T. 2012. The graphical lasso: new insights and alternatives. Electron J Stat 6:2125.
29. Meyer PE, Bontempi G. 2006. On the use of variable complementarity for feature selection in cancer classification. In Applications of Evolutionary Computing, Berlin, Heidelberg.
30. Plitt M, Barnes KA, Martin A. 2015. Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards. Neuroimage Clin 7:359–366.
31. Potkin SG, Ford JM. 2009. Widespread cortical dysfunction in schizophrenia: the FBIRN imaging consortium. Schizophr Bull 35:15–18.
32. Preti MG, Bolton TAW, Van De Ville D. 2017. The dynamic functional connectome: state-of-the-art and perspectives. Neuroimage 160:41–54.
33. Rahaman MA, Turner JA, Gupta CN, Rachakonda S, Chen J, Liu JY, et al. 2019. N-BiC: a method for multi-component and symptom biclustering of structural MRI data: application to schizophrenia. IEEE Trans Biomed Eng 67:110–121.
34. Rashid B, Arbabshirani MR, Damaraju E, Cetin MS, Miller R, Pearlson GD, et al. 2016. Classification of schizophrenia and bipolar patients using static and dynamic resting-state fMRI brain connectivity. Neuroimage 134:645–657.
35. Rashid B, Damaraju E, Pearlson GD, Calhoun VD. 2014. Dynamic connectivity states estimated from resting fMRI identify differences among schizophrenia, bipolar disorder, and healthy control subjects. Front Hum Neurosci 8:897.
36. Rousseeuw PJ. 1987. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65.
37. Saha DK, Calhoun VD, Du Y, Fu Z, Panta SR, Plis SM. 2019. dSNE: a visualization approach for use with decentralized data. bioRxiv:826974.
38. Saha DK, Calhoun VD, Panta SR, Plis SM. 2017. See without looking: joint visualization of sensitive multi-site datasets. Int Joint Conf Artif Intell, pp. 2672–2678.
39. Sakoglu U, Pearlson GD, Kiehl KA, Wang YM, Michael AM, Calhoun VD. 2010. A method for evaluating dynamic functional network connectivity and task-modulation: application to schizophrenia. MAGMA 23:351–366.
40. Shen H, Wang L, Liu Y, Hu D. 2010. Discriminative analysis of resting-state functional connectivity patterns of schizophrenia using low dimensional embedding of fMRI. Neuroimage 49:3110–3121.
41. Smith SM, Miller KL, Salimi-Khorshidi G, Webster M, Beckmann CF, Nichols TE, et al. 2011. Network modelling methods for FMRI. Neuroimage 54:875–891.
42. Su L, Wang L, Shen H, Feng G, Hu D. 2013. Discriminative analysis of non-linear brain connectivity in schizophrenia: an fMRI study. Front Hum Neurosci 7:702.
43. Tibshirani R, Walther G, Hastie T. 2001. Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc Ser B Stat Methodol 63:411–423.
44. Varoquaux G, Gramfort A, Poline J-B, Thirion B. 2010. Brain covariance selection: better individual functional connectivity models using population prior. Adv Neural Inf Process Syst 23:2334–2342.
45. Wang M, Abrams ZB, Kornblau SM, Coombes KR. 2018. Thresher: determining the number of clusters while removing outliers. BMC Bioinformatics 19:9.
46. Woodward ND, Heckers S. 2016. Mapping thalamocortical functional connectivity in chronic and early stages of psychotic disorders. Biol Psychiatry 79:1016–1025.
47. Yan C-G, Cheung B, Kelly C, Colcombe S, Craddock RC, Martino AD, et al. 2013. A comprehensive assessment of regional variation in the impact of head micromovements on functional connectomics. NeuroImage 76:183–201.


Articles from Brain Connectivity are provided here courtesy of Mary Ann Liebert, Inc.
