Author manuscript; available in PMC: 2022 May 11.
Published in final edited form as: IEEE J Biomed Health Inform. 2021 May 11;25(5):1712–1723. doi: 10.1109/JBHI.2020.3019421

Multi-paradigm fMRI fusion via sparse tensor decomposition in brain functional connectivity study

Yipu Zhang 1, Li Xiao 2, Gemeng Zhang 3, Biao Cai 4, Julia M Stephen 5, Tony W Wilson 6, Vince D Calhoun 7, Yu-Ping Wang 8
PMCID: PMC7904970  NIHMSID: NIHMS1667028  PMID: 32841133

Abstract

Functional magnetic resonance imaging (fMRI) is a powerful technique with the potential to estimate individual variations in behavioral and cognitive traits. Joint learning of multiple datasets can exploit their complementary information to improve learning performance, but it also poses a data fusion challenge: how to effectively integrate brain patterns elicited by multiple fMRI datasets. Most current data fusion methods analyze each dataset separately and then infer the relationships among them, which fails to exploit the multidimensional structure inherent across modalities and may ignore complex but important interactions. To address this issue, we propose a novel sparse tensor decomposition method to integrate multiple task-stimulus (paradigm) fMRI data. Treating each fMRI paradigm as one modality, our proposed method considers the relationships across subjects and modalities simultaneously. Specifically, a third-order tensor is first built from the functional network connectivity (FNC) of subjects in multiple fMRI paradigms. A novel sparse tensor decomposition with regularization terms is then designed to factorize the tensor into a series of rank-one components, which extracts the components shared across modalities as the embedded features. The L2,1-norm regularizer (i.e., group sparsity) is enforced to select a few common features among multiple subjects. The proposed method is validated on real three-paradigm fMRI datasets from the Philadelphia Neurodevelopmental Cohort (PNC) study, for the study of the relationship between the FNC and human cognitive abilities. Experimental results show that our method outperforms several competing methods in the prediction of individuals with different cognitive behaviors assessed via the wide range achievement test (WRAT). Furthermore, our method discovers the FNC related to the cognitive behaviors, such as the connectivity associated with the default mode network (DMN) for three paradigms, and the connectivity between DMN and visual (VIS) domains within the emotion task.

Keywords: brain functional connectivity, data fusion, feature extraction, multi-paradigm fMRI, tensor decomposition

I. Introduction

Functional connectome analysis is crucial for uncovering individual-specific brain networks [1], [2]. Over the past few decades, a variety of functional connectome-based studies have focused on identifying variations in behavioral and cognitive traits, such as intelligence and disease symptoms, for individuals of different ages and genders [3], [4]. Functional connectivity has proven to be a useful tool for exploring the function of the human brain; it is represented as a network whose nodes are regions-of-interest (ROIs) and whose edges are defined by the correlation between the time courses of the ROIs.

Functional magnetic resonance imaging (fMRI) has been widely used to measure brain activity due to its noninvasiveness, and high spatial and temporal resolution [5]. Resting state (RS) fMRI is one of the tasks to investigate individual differences in the organizational patterns of functional networks, and measure intrinsic or spontaneous activity in the brain such as the default mode network (DMN) [6]. To improve the quantification of functional connectivity, a variety of analytical methods such as graph theory have been applied to fMRI data analysis [7], [8]. However, the resting state is an unconstrained state that may fail to capture the full range of individual differences in terms of functional connectivity [9]. Task-based fMRI is collected while participants are functionally involved with a specific task, which explores the individual differences due to brain activity changes stimulated by a paradigm [10], [11]. Finn et al. [2] estimated functional connectivity profiles using both task and resting state fMRI as a fingerprint that can accurately identify subjects from a large group. Greene et al. [12] predicted fluid intelligence by using a model built from task fMRI data (working memory (WM) or emotion (EM)), which outperformed a model built from resting state fMRI. These results also suggested that certain tasks may bring about meaningful finding across subjects with different traits, motivating a paradigm shift from resting to task-based functional connectivity analysis.

In a recent study, Sripada et al. [13] found that resting-state connectivity can be leveraged to produce generalizable markers of neurocognitive functioning, and highlighted the importance of task control default mode network interconnections as a major locus of individual differences in cognitive functioning. In particular, when complementary information is obtained from multiple datasets generated by different task-trait experiments, data fusion has led to novel findings in many recent neuroimaging studies. It has also been suggested in [3], [14] that there is a commonality between different modalities (different functional tasks or different imaging modalities) implicated by the same underlying pathology. Approaches such as machine learning and statistical analysis are widely applied in this research [15]–[18]. In the study of functional brain networks, Xiao et al. [19] proposed a framework based on alternating diffusion maps that combines two-paradigm fMRI datasets to enhance the prediction of intelligence quotient (IQ). Kaufmann et al. [3] also reported commonality across different paradigms assessed via a set of task-reference comparison experiments. However, the underlying algorithms of these works were primarily based on the matrix decomposition framework, which reduced the dimension of each dataset separately and used similarity scores to identify the linked features between different datasets. As a result, they may not reflect the complex interactions and specific structures of these multidimensional data [20].

To overcome this limitation, tensors, also called multidimensional arrays, are well recognized as a powerful tool to represent complex high dimensional data and have been applied to the fusion of structural and diffusion MRI data [21], a three-way array analysis for brain imaging [22], dynamic network states detection [23], etc. Specifically, Beckmann and Smith [20] presented tensor ICA, which applied independent component analysis (ICA) to tensor decomposition for multi-subject fMRI analysis. Tensor ICA decomposed a third-order tensor with dimensions of voxels × time × subjects into a set of independent spatial maps associated with time courses and subject modes. Based on tensor ICA, Hore et al. [24] presented a Bayesian method for the analysis of gene expression data from multiple tissue types, which decomposed the tensor into a series of latent components to uncover gene network linked to genetic variations. Similarly, Ma et al. [25] modeled multi-view graph data as tensors to learn graph embedding and captured the local structure. These studies demonstrate tensor decomposition to be a powerful approach for data representation and dimension reduction.

Despite recent success of employing tensor-based methods for fMRI signal analysis, most of these models ignore the subject-subject relation between modalities, which could otherwise improve the performance. To bridge the gap mentioned above, we propose a novel multi-paradigm sparse tensor decomposition method (MSTD) to integrate multiple fMRI datasets (here the modalities refer to fMRI data collected under multiple paradigms). We are motivated by the work in [26], [27], which leverage the information from multiple task-stimulus to uncover the individual differences in functional brain networks associated with cognitive abilities. The relationships of ROIs from both multiple subjects and multiple task-based fMRI datasets are aggregated to model a third-order tensor. Then the CANDECOMP/PARAFAC decomposition (CPD) based method [28] is utilized to factorize the tensor into a series of rank-one components. Note that to find the shared significant sparsity patterns across subjects, L2,1-norm and L1-norm regularizations are enforced in tensor decomposition. Each component in the decomposition represents a functional sub-network as a new embedding feature. Moreover, the features are extracted via hierarchical clustering for the further individual cognition identification. To validate the efficiency and effectiveness of the proposed model, i.e., MSTD, we applied it to classify and predict the WRAT score [29] using the Philadelphia Neurodevelopmental Cohort (PNC) dataset [30]. Results show that MSTD outperforms other existing methods, such as single modal LASSO model (SM) [31], multi-modal multi-task learning (MTL) [32], a manifold regularized multi-task learning (NM2TL) we proposed recently [33] and CPD.

The primary contributions of this paper can be summarized in three aspects. First, the proposed method integrates multiple brain functional information from three fMRI paradigms. That is, MSTD is applied to explore three functional brain networks simultaneously. Second, regularization terms are incorporated into tensor decomposition to obtain low-dimensional embedding of the brain network. Third, our method provides a practical framework to not only extract shared components via tensor decomposition but also associate the FNC with cognitive abilities.

The rest of this paper is organized as follows. Section II introduces the tensor decomposition method and our proposed model. Experimental results on the PNC dataset are presented in Section III, followed by conclusions in Section IV.

II. Method

A. Notation and preliminaries

In this paper, lowercase, boldface lowercase, boldface uppercase and calligraphic letters are used to denote scalars, vectors, matrices and tensors, respectively. For a matrix $\mathbf{A}$, its transpose, Moore-Penrose pseudoinverse, $(i, j)$-th entry and $r$-th column are denoted as $\mathbf{A}^\top$, $\mathbf{A}^\dagger$, $a_{ij}$ and $\mathbf{a}_r$, respectively. Let $\mathbb{R}$ be the set of real numbers. For convenience, the notations and definitions used in this paper are listed in TABLE I.

TABLE I:

Notations and definitions.

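Notation/Abbr | Definition/Full name

ROI | Regions of interest.
FNC | Functional network connectivity.
RS | Resting state.
WM | Working memory task.
EM | Emotion task.
CPD | The CANDECOMP/PARAFAC decomposition.
WRAT | The wide range achievement test.

$\mathbf{T}_{(\kappa)}$ | The mode-$\kappa$ unfolding matrix of a tensor $\mathcal{T}$.
$R$ | The rank of a tensor (also called the number of components).
$\delta$ | The number of ROIs.
$I$ | The number of pairwise ROI correlations, $I = \delta(\delta - 1)/2$.
$J$ | The number of subjects.
$K$ | The number of fMRI paradigms.
$\mathbf{F}$ | The FNC representation matrix, $\mathbf{F} \in \mathbb{R}^{\delta \times \delta}$.

$\circ$ | The outer product, $\mathbf{a}_r \circ \mathbf{b}_r = \mathbf{a}_r \mathbf{b}_r^\top$.
$\otimes$ | The Kronecker product, $\mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} a_{11}\mathbf{B} & \cdots & a_{1J}\mathbf{B} \\ \vdots & \ddots & \vdots \\ a_{I1}\mathbf{B} & \cdots & a_{IJ}\mathbf{B} \end{bmatrix}$.
$\odot$ | The Khatri-Rao product, $\mathbf{A} \odot \mathbf{B} = [\mathbf{a}_1 \otimes \mathbf{b}_1 \;\; \mathbf{a}_2 \otimes \mathbf{b}_2 \;\; \cdots \;\; \mathbf{a}_R \otimes \mathbf{b}_R]$.
$*$ | The Hadamard product, $\mathbf{A} * \mathbf{B} = \begin{bmatrix} a_{11}b_{11} & \cdots & a_{1J}b_{1J} \\ \vdots & \ddots & \vdots \\ a_{I1}b_{I1} & \cdots & a_{IJ}b_{IJ} \end{bmatrix}$.
$\|\mathcal{T}\|_F$ | The Frobenius norm of $\mathcal{T}$, $\|\mathcal{T}\|_F = \sqrt{\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K} (t_{ijk})^2}$.
$\|\mathbf{A}\|_{2,1}$ | The L2,1-norm of $\mathbf{A}$, $\|\mathbf{A}\|_{2,1} = \sum_{i=1}^{I}\sqrt{\sum_{j=1}^{J} (a_{ij})^2}$.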

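CPD is the most well-known tensor factorization method, which represents a third-order tensor $\mathcal{T} \in \mathbb{R}^{I \times J \times K}$ as a linear combination of rank-one terms:

$$\mathcal{T} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r + \mathcal{E} \tag{1}$$

where $R$ is the rank of the tensor and $\mathcal{E} \in \mathbb{R}^{I \times J \times K}$ is a three-way array denoting noise. In terms of the standard matrix representations of $\mathcal{T}$, the components are usually estimated by minimizing the following equivalent objective functions:

$$\min \frac{1}{2}\left\|\mathbf{T}_{(1)} - \mathbf{A}(\mathbf{C} \odot \mathbf{B})^\top\right\|_F^2 = \min \frac{1}{2}\left\|\mathbf{T}_{(2)} - \mathbf{B}(\mathbf{C} \odot \mathbf{A})^\top\right\|_F^2 = \min \frac{1}{2}\left\|\mathbf{T}_{(3)} - \mathbf{C}(\mathbf{B} \odot \mathbf{A})^\top\right\|_F^2 \tag{2}$$

where $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ are $I \times R$, $J \times R$ and $K \times R$ matrices, respectively. For CPD, the Alternating Least Squares (ALS) algorithm [34], [35] is the most widely used due to its high speed and ease of implementation [36]. In each iteration, the ALS approach solves for one factor matrix with the other two fixed, cycling through the three matrices, and repeats this procedure until the convergence criterion is satisfied. The update equations are given by Eq.(3).

$$\mathbf{A} \leftarrow \mathbf{T}_{(1)}\left[(\mathbf{C} \odot \mathbf{B})^\top\right]^\dagger, \qquad \mathbf{B} \leftarrow \mathbf{T}_{(2)}\left[(\mathbf{C} \odot \mathbf{A})^\top\right]^\dagger, \qquad \mathbf{C} \leftarrow \mathbf{T}_{(3)}\left[(\mathbf{B} \odot \mathbf{A})^\top\right]^\dagger \tag{3}$$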
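The unfolding and Khatri-Rao conventions behind the ALS update for A can be illustrated with a minimal NumPy sketch. It assumes the mode-1 unfolding is the I × JK matrix whose columns run over j fastest, so that the unfolded tensor equals A(C ⊙ B)ᵀ in the noiseless case; the function names `unfold`, `khatri_rao`, `cp_tensor` and `als_update_A` are hypothetical and are not taken from the paper.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding (0-based) with the remaining indices ordered first-fastest."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (m x R) and Y (n x R), giving an (m*n x R) matrix."""
    return np.einsum("mr,nr->mnr", X, Y).reshape(-1, X.shape[1])

def cp_tensor(A, B, C):
    """Sum of R rank-one terms a_r o b_r o c_r (Eq. (1) without noise)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def als_update_A(T, B, C):
    """First update of Eq. (3): A <- T_(1) [(C kr B)^T]^+ with B and C held fixed."""
    return unfold(T, 0) @ np.linalg.pinv(khatri_rao(C, B).T)

# Synthetic, noiseless example (sizes chosen only for illustration).
rng = np.random.default_rng(0)
I, J, K, R = 6, 5, 4, 2
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
T = cp_tensor(A, B, C)
A_hat = als_update_A(T, B, C)  # recovers A when B and C are the true factors
```

With the true B and C and no noise, the single least-squares step returns A up to numerical precision, which is a convenient sanity check for the index ordering.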

B. The proposed multi-paradigm sparse tensor decomposition

The Blood-Oxygen-Level-Dependent (BOLD) signal of fMRI measures the brain activity dynamically by detecting changes associated with blood flow at different spatial locations in the brain. In this way, the brain networks can be built based on the BOLD signals to represent functional connectivity across brain regions. For the FNC of individuals, the nodes can be defined by brain ROIs, and the weight of the edge between two nodes is represented as the temporal correlation (covariance) of the fMRI time series. Then for the $j$-th individual, the network representation matrix $\mathbf{F}_j$ ($j = 1, \ldots, J$) can be obtained by calculating the Pearson's correlation between each pair of ROIs. Since $\mathbf{F}_j$ is symmetric, it can also be converted into a vector via vectorization of the upper or lower triangular part of $\mathbf{F}_j$. So the FNC of all subjects is denoted as $\mathbf{G} \in \mathbb{R}^{J \times \frac{\delta(\delta-1)}{2}}$.

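To take advantage of complementary information from multi-paradigm fMRI data, the data is tensorized as $\mathcal{T} \in \mathbb{R}^{I \times J \times K}$ by stacking the FNC of all paradigms. In our approach, we aim to learn a series of meaningful components for the representation of $\mathcal{T}$. For the multi-paradigm fMRI tensor $\mathcal{T}$, the objective function is formulated as the following optimization problem based on CPD.

$$\min \frac{1}{2}\left\|\mathcal{T} - [[\mathbf{A}, \mathbf{B}, \mathbf{C}]]\right\|_F^2 = \min \frac{1}{2}\left\|\mathcal{T} - \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r\right\|_F^2 \tag{4}$$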
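The construction of the input tensor described above (Pearson correlation per subject, vectorization of the triangular part, stacking over paradigms) can be sketched as follows. This is only an illustration on synthetic time series; the helper names `fnc_vector` and `build_tensor` and the nested-list input layout are assumptions, not part of the paper.

```python
import numpy as np

def fnc_vector(time_series):
    """Vectorize the strictly lower triangle of the ROI-by-ROI Pearson correlation matrix.

    time_series: array of shape (n_timepoints, n_rois).
    Returns a vector of length n_rois * (n_rois - 1) / 2.
    """
    F = np.corrcoef(time_series, rowvar=False)      # delta x delta, symmetric
    rows, cols = np.tril_indices(F.shape[0], k=-1)  # exclude the diagonal
    return F[rows, cols]

def build_tensor(data):
    """Stack FNC vectors into an I x J x K tensor.

    data[k][j] holds the (n_timepoints x n_rois) series of subject j in paradigm k.
    """
    per_paradigm = [
        np.column_stack([fnc_vector(ts) for ts in subjects])  # I x J for one paradigm
        for subjects in data
    ]
    return np.stack(per_paradigm, axis=2)

# Synthetic example: 10 ROIs, 4 subjects, 3 paradigms (sizes are illustrative only).
rng = np.random.default_rng(0)
n_rois, n_subjects, n_paradigms = 10, 4, 3
data = [
    [rng.standard_normal((50, n_rois)) for _ in range(n_subjects)]
    for _ in range(n_paradigms)
]
T = build_tensor(data)
```

Here the first tensor mode indexes ROI pairs, the second subjects and the third paradigms, matching the roles of A, B and C in Eq.(4).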

Here, each column of $\mathbf{A} \in \mathbb{R}^{I \times R}$ contains the ROIs' correlation scores, which interpret each component as a functional sub-network. $\mathbf{B} \in \mathbb{R}^{J \times R}$ is a matrix with $b_{jr}$ ($j = 1, 2, \ldots, J$; $r = 1, 2, \ldots, R$) denoting the score of the $r$-th component for the $j$-th subject, and $\mathbf{C}$ is a $K \times R$ matrix with $c_{kr}$ ($k = 1, 2, \ldots, K$; $r = 1, 2, \ldots, R$) denoting the weight of the $r$-th component for the $k$-th paradigm.

CPD offers a dimension reduction approach to map the original network data into a number of sub-network representations. Note that each component obtained by Eq.(4) consists of three vectors of scores indicating the relative contribution of each individual, FNC and paradigm. By enforcing sparsity constraints on these latent components, the most significant basis components of FNC will be selected and linearly combined to represent the original whole-brain functional network. More concretely, given an input tensor $\mathcal{T} \in \mathbb{R}^{I \times J \times K}$, we can learn $R$ sparse functional sub-networks denoted as $\mathbf{A}$, the new low-dimensional representation for each subject as $\mathbf{B}$ and the component weights in each paradigm as $\mathbf{C}$. To this end, the objective function of the multi-paradigm sparse tensor decomposition model is proposed based on CPD as

$$\min_{\mathbf{A}, \mathbf{B}, \mathbf{C}} \frac{1}{2}\left\|\mathcal{T} - [[\mathbf{A}, \mathbf{B}, \mathbf{C}]]\right\|_F^2 + \lambda_1 \|\mathbf{A}\|_1 + \lambda_2 \|\mathbf{B}\|_{2,1} \quad \text{s.t.} \;\; \sum_{i=1}^{I} a_{ir}^2 = 1, \;\; \sum_{k=1}^{K} c_{kr}^2 = 1, \;\; \forall r \in \{1, 2, \ldots, R\}. \tag{5}$$

where $\lambda_1$ and $\lambda_2$ are two positive regularization parameters. In Eq.(5), the L1-norm regularization is imposed on $\mathbf{A}$, which extracts significant ROI correlations as nonzero values and gives each component a sparse sub-network representation. To further identify the relevant significant components, the L2,1-norm regularization is imposed on $\mathbf{B}$ to extract the shared and significant components across all the subjects. The L2,1-norm penalizes each column of $\mathbf{B}$ as a whole and enforces sparsity among the columns by removing unimportant components. $\lambda_1$ and $\lambda_2$ are tuning parameters, which not only control the feature extraction when reconstructing FNC but also determine the sparsity and scale of the components. Since there is no gold-standard criterion for selecting these parameters, in our experiments their values are determined empirically to ensure that the reconstructed FNC exhibits a significant level of sparsity in terms of spatial distribution.

In addition, the constraints imposed on $\mathbf{A}$ and $\mathbf{C}$ normalize each column in each iteration, which prevents these matrices from taking arbitrarily large values that would in turn force the entries of $\mathbf{B}$ to be small. As a result, a new low-dimensional representation $\mathbf{B}$ is obtained for each subject through multi-paradigm fMRI data. Fig.1(a) shows the procedure of multi-paradigm sparse tensor decomposition.
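To make the terms of Eq.(5) concrete, the sketch below evaluates the objective for given factor matrices and applies the unit-norm column constraint. It assumes, following the description in the text, that the L2,1 term sums the Euclidean norms of the columns of B; `mstd_objective` and `normalize_columns` are hypothetical helper names.

```python
import numpy as np

def mstd_objective(T, A, B, C, lam1, lam2):
    """Value of the Eq. (5) objective for A (I x R), B (J x R), C (K x R)."""
    reconstruction = np.einsum("ir,jr,kr->ijk", A, B, C)  # sum of R rank-one terms
    fit = 0.5 * np.sum((T - reconstruction) ** 2)         # 0.5 * squared Frobenius norm
    l1_A = np.abs(A).sum()                                # L1 penalty on A
    # Assumption: the L2,1 penalty is taken over the columns of B.
    l21_B = np.linalg.norm(B, axis=0).sum()
    return fit + lam1 * l1_A + lam2 * l21_B

def normalize_columns(M):
    """Scale every column to unit Euclidean norm (the constraint on A and C)."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

# Synthetic, noiseless example with illustrative sizes and parameter values.
rng = np.random.default_rng(0)
A = normalize_columns(rng.standard_normal((6, 2)))
B = rng.standard_normal((5, 2))
C = normalize_columns(rng.standard_normal((4, 2)))
T = np.einsum("ir,jr,kr->ijk", A, B, C)
value = mstd_objective(T, A, B, C, lam1=1e-3, lam2=8.0)
```

Because the tensor is built exactly from the factors, the fit term vanishes and the returned value consists of the two penalty terms only.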

Fig. 1:

The main framework of the proposed method. (a) The procedure of multi-paradigm sparse tensor decomposition. (b) The procedure of clustering and extracting features for classification and prediction.

C. Optimization algorithm

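To solve Eq.(5), an efficient algorithm is proposed that iteratively updates the matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. According to Eq.(2), the objective function can be rewritten as:

$$\mathcal{L} := \frac{1}{2}\left\|\mathbf{T}_{(1)} - \mathbf{A}(\mathbf{C} \odot \mathbf{B})^\top\right\|_F^2 + \lambda_1 \|\mathbf{A}\|_1 + \lambda_2 \|\mathbf{B}\|_{2,1} \tag{6}$$

or

$$\mathcal{L} := \frac{1}{2}\left\|\mathbf{T}_{(2)} - \mathbf{B}(\mathbf{C} \odot \mathbf{A})^\top\right\|_F^2 + \lambda_1 \|\mathbf{A}\|_1 + \lambda_2 \|\mathbf{B}\|_{2,1} \tag{7}$$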

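Taking the derivative of Eq.(6) with respect to $\mathbf{A}$ and the derivative of Eq.(7) with respect to $\mathbf{B}$, we obtain

$$\frac{\partial \mathcal{L}}{\partial \mathbf{A}} = \left(\mathbf{A}(\mathbf{C} \odot \mathbf{B})^\top - \mathbf{T}_{(1)}\right)(\mathbf{C} \odot \mathbf{B}) + \lambda_1\,\mathrm{sgn}(\mathbf{A}) \tag{8}$$
$$\frac{\partial \mathcal{L}}{\partial \mathbf{B}} = \left(\mathbf{B}(\mathbf{C} \odot \mathbf{A})^\top - \mathbf{T}_{(2)}\right)(\mathbf{C} \odot \mathbf{A}) + \lambda_2 \mathbf{B}\boldsymbol{\Sigma} \tag{9}$$

where $\mathrm{sgn}(\cdot)$ is the signum function, $\mathbf{B}\boldsymbol{\Sigma} = \partial \|\mathbf{B}\|_{2,1} / \partial \mathbf{B}$, and $\boldsymbol{\Sigma} \in \mathbb{R}^{R \times R}$ is a diagonal matrix with the $r$-th diagonal element equal to $1/\|\mathbf{b}_r\|_2$:

$$\boldsymbol{\Sigma} = \begin{bmatrix} 1/\|\mathbf{b}_1\|_2 & & & \\ & 1/\|\mathbf{b}_2\|_2 & & \\ & & \ddots & \\ & & & 1/\|\mathbf{b}_R\|_2 \end{bmatrix}$$

Setting these derivatives to zero gives the update procedure.

$$\mathbf{A}^{t+1} = \left[\mathbf{T}_{(1)}(\mathbf{C}^t \odot \mathbf{B}^t) - \lambda_1\,\mathrm{sgn}(\mathbf{A})\right]\left[(\mathbf{C}^t \odot \mathbf{B}^t)^\top(\mathbf{C}^t \odot \mathbf{B}^t)\right]^{-1} \tag{10}$$
$$\mathbf{B}^{t+1} = \mathbf{T}_{(2)}(\mathbf{C}^t \odot \mathbf{A}^t)\left[(\mathbf{C}^t \odot \mathbf{A}^t)^\top(\mathbf{C}^t \odot \mathbf{A}^t) + \lambda_2 \boldsymbol{\Sigma}\right]^{-1} \tag{11}$$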

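In addition, matrix $\mathbf{C}$ can be updated by

$$\mathbf{C}^{t+1} = \mathbf{T}_{(3)}(\mathbf{B}^t \odot \mathbf{A}^t)\left((\mathbf{A}^t)^\top\mathbf{A}^t * (\mathbf{B}^t)^\top\mathbf{B}^t\right)^\dagger \tag{12}$$

where $t$ denotes the $t$-th iteration. To ensure that the constraints in Eq.(5) are satisfied, we further normalize each component of $\mathbf{A}$ and $\mathbf{C}$ in each iteration. The iteration terminates when the relative error satisfies $|e_{t+1} - e_t| \le 10^{-6}$, where $e_t = \|\mathcal{T}^t - \mathcal{T}\|_F / \|\mathcal{T}\|_F$ and $\mathcal{T}^t$ is the tensor reconstructed at iteration $t$. The pseudocode of the proposed optimization algorithm is summarized in Algorithm 1.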

Algorithm 1.

multi-paradigm sparse tensor decomposition

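Input: tensor $\mathcal{T}$, the number of components $R$, $\lambda_1$ and $\lambda_2$;
Output: $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$;
Initialization: $\mathbf{B}$, $\mathbf{C}$
for t = 1 to Max-Iteration do
  update $\mathbf{A}$ by Eq.(10)
  for r = 1 to R do
    $\mathbf{a}_r^t = \mathbf{a}_r^t / \|\mathbf{a}_r^t\|_2$
  end for
  update $\mathbf{B}$ by Eq.(11)
  update $\mathbf{C}$ by Eq.(12)
  for r = 1 to R do
    $\mathbf{c}_r^t = \mathbf{c}_r^t / \|\mathbf{c}_r^t\|_2$
  end for
  if the convergence criterion is satisfied then break
end for
Return matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$.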
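The steps of Algorithm 1 can be sketched in NumPy as below. This is a minimal illustration under several assumptions that are not stated in the paper: A is also initialized randomly (Eq.(10) needs a value for sgn(A)), each update uses the most recent factors, pseudoinverses replace plain inverses for numerical safety, and the scale removed from the columns of C is absorbed into B so that the reconstruction is unchanged. The function names are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding (0-based), remaining indices ordered first-fastest."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (m x R) and Y (n x R)."""
    return np.einsum("mr,nr->mnr", X, Y).reshape(-1, X.shape[1])

def mstd(T, R, lam1, lam2, max_iter=200, tol=1e-6, seed=0):
    """Sketch of the alternating updates of Eqs. (10)-(12) with column normalization."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, R))  # assumption: A is initialized as well
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    eps = 1e-12                      # guards against division by zero
    norm_T = np.linalg.norm(T)
    prev_err, err = np.inf, np.inf
    for _ in range(max_iter):
        # Eq. (10): update A, then normalize its columns.
        CB = khatri_rao(C, B)
        A = (unfold(T, 0) @ CB - lam1 * np.sign(A)) @ np.linalg.pinv(CB.T @ CB)
        A = A / np.maximum(np.linalg.norm(A, axis=0), eps)
        # Eq. (11): update B with the diagonal reweighting matrix Sigma.
        CA = khatri_rao(C, A)
        Sigma = np.diag(1.0 / np.maximum(np.linalg.norm(B, axis=0), eps))
        B = unfold(T, 1) @ CA @ np.linalg.pinv(CA.T @ CA + lam2 * Sigma)
        # Eq. (12): least-squares update of C, then normalize its columns.
        BA = khatri_rao(B, A)
        C = unfold(T, 2) @ BA @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        c_norms = np.maximum(np.linalg.norm(C, axis=0), eps)
        C = C / c_norms
        B = B * c_norms              # assumption: keep the reconstruction unchanged
        # Relative reconstruction error and stopping rule.
        err = np.linalg.norm(T - np.einsum("ir,jr,kr->ijk", A, B, C)) / norm_T
        if abs(prev_err - err) <= tol:
            break
        prev_err = err
    return A, B, C, err

# Synthetic rank-2 tensor; with lam1 = lam2 = 0 the updates reduce to plain CPD-ALS.
rng = np.random.default_rng(1)
T = np.einsum("ir,jr,kr->ijk",
              rng.standard_normal((8, 2)),
              rng.standard_normal((6, 2)),
              rng.standard_normal((3, 2)))
A, B, C, err = mstd(T, R=2, lam1=0.0, lam2=0.0)
```

In practice the regularization parameters would be chosen by the grid search described in the experimental setting; the zero values above are used only to keep the example easy to reason about.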

D. Classification and prediction of cognitive abilities

Although a number of rank-one components are obtained by the proposed model, the uniqueness of the solution is sensitive to the number of components $R$. According to previous studies [37], a uniqueness condition of CPD for a third-order tensor is $\mathrm{krank}(\mathbf{A}) + \mathrm{krank}(\mathbf{B}) + \mathrm{krank}(\mathbf{C}) \ge 2R + 2$, where the Kruskal rank $\mathrm{krank}(\mathbf{A})$ of a matrix $\mathbf{A}$ is the maximum value such that any $\mathrm{krank}(\mathbf{A})$ columns of $\mathbf{A}$ are linearly independent.
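The Kruskal rank in the condition above can be computed by brute force for small matrices, as in the following sketch. It enumerates all column subsets, so it is only practical for a handful of columns; `kruskal_rank` and `satisfies_kruskal_condition` are hypothetical names used for illustration.

```python
import itertools
import numpy as np

def kruskal_rank(M, tol=1e-10):
    """Largest k such that every set of k columns of M is linearly independent."""
    n_cols = M.shape[1]
    k_rank = 0
    for k in range(1, n_cols + 1):
        all_independent = all(
            np.linalg.matrix_rank(M[:, list(cols)], tol=tol) == k
            for cols in itertools.combinations(range(n_cols), k)
        )
        if not all_independent:
            break
        k_rank = k
    return k_rank

def satisfies_kruskal_condition(A, B, C):
    """Check krank(A) + krank(B) + krank(C) >= 2R + 2 for R = number of columns."""
    R = A.shape[1]
    return kruskal_rank(A) + kruskal_rank(B) + kruskal_rank(C) >= 2 * R + 2

M_full = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])   # any two columns are independent
M_dup = np.array([[1.0, 1.0],
                  [0.0, 0.0]])         # two identical columns
k_full = kruskal_rank(M_full)
k_dup = kruskal_rank(M_dup)
```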

As each component serves as the basis for sparsely representing the whole-brain functional network, it is essential to interpret the functional significance of those hundreds of network components and establish their correspondences across a group of subjects and multiple paradigms. In this paper, we mainly focus on several significant components that are highly correlated with cognitive behaviors assessed by WRAT. To identify and select such essential components that are more easily interpretable, a clustering procedure is employed, which is inspired by Block Term Decomposition (BTD) [38]. More specifically, the absolute values of the correlations between all pairs of components in $\mathbf{A}$ are calculated. Hierarchical clustering is then used to group similar components, using the absolute value of the correlation as the similarity measure regardless of whether the correlation is positive or negative. The clustering is terminated when no correlations between clusters are above 0.6, according to Ref. [24]. That is, for each $\mathbf{a}_v$ of $\mathbf{A}$, if $|\mathrm{corr}(\mathbf{a}_v, \mathbf{a}_p)| > 0.6$ ($v, p = 1, 2, \ldots, R$), we have $L_v = L_v \cup \{\mathbf{a}_p\}$. Then

$$\mathbf{a}_v = \begin{cases} \dfrac{\mathbf{a}_v + \sum_{\mathbf{a}_q \in L_v} \mathbf{a}_q}{|L_v| + 1}, & \mathrm{corr}(\mathbf{a}_v, \mathbf{a}_p) > 0.6 \\[2ex] \dfrac{\mathbf{a}_v + \sum_{\mathbf{a}_q \in L_v} (-\mathbf{a}_q)}{|L_v| + 1}, & \mathrm{corr}(\mathbf{a}_v, \mathbf{a}_p) < -0.6 \end{cases} \tag{13}$$

where $|\cdot|$ is the cardinality of a set. The individual scores and the paradigm scores for the corresponding $\mathbf{a}_p$ can be regarded as the low-rank factorization of the corresponding matrix of the cluster. In other words,

$$\mathcal{T} \approx \sum_{v=1}^{V} \mathbf{a}_v \circ (\mathbf{B}_v \mathbf{C}_v^\top) \tag{14}$$

where $\mathbf{B}_v \in \mathbb{R}^{J \times L_v}$ and $\mathbf{C}_v \in \mathbb{R}^{K \times L_v}$ are two rank-$L_v$ matrices. Eq.(14) is a BTD of the tensor $\mathcal{T}$ in $V$ rank-$(L_v, L_v, 1)$ terms [39]. Based on the above procedure, the number of components is reduced while the component weights for both individuals and paradigms are kept. That is, the same functional sub-networks are shared among all subjects with different weights in multiple paradigms, and the low-rank combination of individual and paradigm scores ensures that the components obtained by tensor decomposition can be used for further analysis. To validate our proposed method, we call the linear support vector machine (SVM) and support vector regression (SVR) functions in Matlab directly with default hyperparameters to test the subsequent classification and prediction power, respectively. SVM is well known as a state-of-the-art classifier and has been extensively used in fMRI data analysis [40] due to its high accuracy. Since the goal of this paper is to test the performance of the proposed framework for extracting embedded features of FNC, the low-dimensional representation $\mathbf{B}$ and the corresponding WRAT scores are used as the inputs to the SVM and the SVR models. The procedure of clustering and extracting features for WRAT classification and prediction is shown in Fig.1(b).
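The sign-aware averaging of Eq.(13) can be illustrated with the sketch below. Note that it swaps the hierarchical clustering used in the paper for a simpler greedy grouping (each not-yet-assigned component collects all unassigned components whose absolute correlation with it exceeds the threshold), purely to keep the example short; `merge_correlated_components` is a hypothetical name.

```python
import numpy as np

def merge_correlated_components(A, threshold=0.6):
    """Greedy stand-in for the clustering step: group columns of A with |corr| > threshold
    and replace each group by its sign-aligned average, in the spirit of Eq. (13)."""
    R = A.shape[1]
    corr = np.corrcoef(A, rowvar=False)          # R x R correlations between components
    assigned = np.zeros(R, dtype=bool)
    merged, groups = [], []
    for v in range(R):
        if assigned[v]:
            continue
        members = [p for p in range(R)
                   if p != v and not assigned[p] and abs(corr[v, p]) > threshold]
        signs = np.sign(corr[v, members])        # flip negatively correlated components
        a_v = (A[:, v] + (A[:, members] * signs).sum(axis=1)) / (len(members) + 1)
        assigned[[v] + members] = True
        merged.append(a_v)
        groups.append([v] + members)
    return np.column_stack(merged), groups

# Three components: the second is a negatively scaled copy of the first,
# the third is uncorrelated with both.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
A = np.column_stack([x, -2.0 * x, y])
A_merged, groups = merge_correlated_components(A)
```

In this toy case the first two columns are merged into a single component equal to 1.5 times `x`, and the third column is kept as its own group.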

Computational analysis.

In Algorithm 1, the objective function $\mathcal{L}$ is solved by alternately updating the matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. The time complexity is therefore analyzed according to the update equations, i.e., Eq.(10), Eq.(11) and Eq.(12). In Eq.(10), $\mathbf{T}_{(1)}$ is an $I \times JK$ matrix and $(\mathbf{C}^t \odot \mathbf{B}^t)$ is a $JK \times R$ matrix, whose matrix-matrix multiplication has a time complexity of $O(IJKR)$ ($I \gg J, K, R$). Similarly, the time complexity of Eq.(11) is $O(IJKR)$. In Eq.(12), the time complexity of calculating the pseudoinverse of an $R \times R$ matrix is $O(R^3)$, and the time complexity of computing the matrix multiplication is also $O(IJKR)$. In the stage of hierarchical clustering, the time complexity of computing a correlation is $O(I)$, and the time complexity of clustering all the components is $O(IR)$. So in the whole procedure, the main computational cost is the tensor decomposition. The orders of magnitude of $I$, $J$, $K$ and $R$ are usually $O(10^4)$, $O(10^2)$, $O(10^1)$ and $O(10^2)$, respectively. MSTD iteratively optimizes the matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ to minimize the objective function and requires $O(10^9)$ operations per iteration, which is acceptable for multiple fMRI datasets.
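As a rough check of the per-iteration cost, the orders of magnitude quoted above can simply be multiplied out; the comparison with the pseudoinverse term is added here only for illustration.

```latex
I \cdot J \cdot K \cdot R \;\approx\; 10^{4} \times 10^{2} \times 10^{1} \times 10^{2} \;=\; 10^{9},
\qquad
R^{3} \;\approx\; (10^{2})^{3} \;=\; 10^{6} \;\ll\; 10^{9}.
```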

III. Experiments and results

A. Data pre-processing

The data used in the experiments are acquired from the Philadelphia Neurodevelopmental Cohort (PNC) [30], which is a large-scale collaborative research project between the Brain Behavior Laboratory at the University of Pennsylvania and the Center for Applied Genomics at the Children’s Hospital of Philadelphia. The PNC project mainly focused on characterizing brain and behavior and their interactions with genetics by combining neuroimaging, clinical and cognitive phenotypes, and genomics techniques. Multi-paradigm neuroimaging data and multiple genetic factors for nearly 900 adolescents aged from 8 to 21 years are obtained from the PNC dataset, which are available in the dbGaP database [30].

In this paper, the relationship between individual differences in WRAT and brain activity is investigated during the engagement of cognitive abilities, measured with RS, EM and WM fMRI. The WRAT scores assessed the general cognition and learning ability (such as reading recognition, spelling and arithmetic computation) of subjects, which was a one hour computerized neurocognitive battery (CNB) administered in the PNC study. While some reservations remain, the WRAT is still an effective method to estimate intelligence quotient (IQ) values [41]. To mitigate the influence of age over the final results, we selected the 342 subjects whose ages were above 16 years [42]. The distribution of WRAT scores of 342 subjects is shown in Fig.2.

Fig. 2:

The WRAT score distribution among the 342 subjects.

All MRI scans were conducted on a single 3T Siemens TIM Trio whole-body scanner, with a single-shot, interleaved multi-slice, gradient-echo, echo planar imaging sequence. The repetition time and echo time were 3000 ms and 32 ms, and the total scan durations for RS, EM and WM were 6.2 minutes, 11.6 minutes and 10.5 minutes, respectively. Standard preprocessing steps were applied using SPM12 (www.fil.ion.ucl.ac.uk/spm/), including motion correction, spatial normalization to standard MNI space, and spatial smoothing with a 3 mm FWHM Gaussian kernel. The functional time series were band-pass filtered in the 0.01 Hz to 0.1 Hz frequency range. 264 ROIs (containing 21,384 voxels) were extracted based on the Power coordinates [43] with a sphere radius parameter of 5 mm, and the Pearson correlation between the time courses of each pair of ROIs was calculated. A 264 × 264 correlation matrix (FC matrix) was thus obtained for each subject in each fMRI paradigm. To reduce redundant information, only the lower triangular portion of the symmetric correlation matrix was reshaped into a vector with 34,716 correlation values. In the subsequent analysis, these 34,716 values were the features extracted from each of the three fMRI paradigms for each subject.
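The feature count follows directly from the number of ROIs via the relation $I = \delta(\delta - 1)/2$ given in TABLE I:

```latex
I \;=\; \frac{\delta(\delta - 1)}{2} \;=\; \frac{264 \times 263}{2} \;=\; \frac{69{,}432}{2} \;=\; 34{,}716 .
```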

B. Experimental setting

In our experiments, all regularization parameters in the model were tuned by 4-fold cross-validation on the training set through a grid search within their respective ranges, namely the sparsity levels $\lambda_1 \in \{10^{-3}, 2 \times 10^{-3}, \ldots, 8 \times 10^{-3}\}$ and $\lambda_2 \in \{8, 10, 12, 14, 16\}$. To assess the reliability of components, Refs. [21], [44] examined decompositions with $R = 10$, $R = 50$ and $R = 100$ components. They found that additional components appeared when $R = 100$, and that these sometimes split the $R = 50$ components into finer subdivisions. Based on this observation, we set the number of components to $R = 100$ to guarantee the uniqueness of the tensor decomposition with sufficient components for analysis. Furthermore, due to computational limitations, increasing the number of components much beyond this would slow down the computation considerably and expand memory usage beyond an acceptable level. A proper number therefore ultimately depends on the quality of the data and the level of detail desired from the decomposition.

Note that a large number of features (34,716 ROI correlations) and a relatively small number of samples (342 subjects) may cause overfitting or high computational complexity. To address this, in each fMRI paradigm dataset, we excluded the features whose correlation with the WRAT scores had a p-value greater than or equal to 0.05, and took the union of the features retained in the three datasets to train the model. For further analysis, we investigate whether the proposed method can classify two discriminative subsets according to cognitive ability. Here, the low and high cognitive subsets are defined by extracting the subjects in the bottom and top $\tau$ percent of the WRAT scores, $\tau \in \{10, 20, 30\}$ [45]. Through these subset extractions, three groups of subsets with different numbers of subjects are obtained for the classification experiments. A 4-fold cross-validation is used to evaluate the classification performance. That is, the complete data were randomly partitioned into 4 disjoint subsets of equal size; each subset was successively selected as the testing set, and the other 3 subsets were used to train the SVM classifier. Finally, the trained SVM model was applied to classify the WRAT group of the subjects in the testing set. We repeated this process 100 times to reduce the effect of sampling bias in the cross-validation.
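The subset extraction and the 4-fold partition described above can be sketched in plain Python. The sketch assumes that the group size is the floor of τ percent of the cohort and uses a simple shuffled round-robin split; the function names and the synthetic scores are illustrative only.

```python
import random

def extreme_groups(scores, tau):
    """Indices of the bottom and top tau percent of subjects by score."""
    n = len(scores) * tau // 100                 # assumption: floor of tau percent
    if n == 0:
        return [], []
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:n], order[-n:]                 # (low group, high group)

def four_fold_splits(indices, n_folds=4, seed=0):
    """Yield (train, test) index lists for a random partition into disjoint folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test = folds[f]
        train = [i for g in range(n_folds) if g != f for i in folds[g]]
        yield train, test

# Synthetic scores for 342 subjects (placeholder values, not PNC data).
scores = [70 + (i * 37) % 60 for i in range(342)]
low, high = extreme_groups(scores, tau=10)
splits = list(four_fold_splits(low + high))
```

With 342 subjects, τ = 10, 20 and 30 give 34, 68 and 102 subjects per group, i.e. 68, 136 and 204 subjects in total, matching the subject counts in TABLE II.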

With the selected components, the strength and significance of the correlation between ROIs is measured by a hypothesis test. The null hypothesis of no correlation between two ROIs is written as

$$H_0: \rho_{11} = \rho_{12} = \cdots = \rho_{uw} = \rho_{\eta_i \eta_j} = 0$$

versus the alternative hypothesis

$$H_A: \exists\, u, w > 0, \; \rho_{uw} \neq 0$$

where $\eta_i$ and $\eta_j$ are the numbers of voxels in the $i$-th ROI and the $j$-th ROI, and $\rho_{uw}$ is the correlation between the $u$-th voxel and the $w$-th voxel. To test the hypothesis, we calculated the pair-wise correlations between the voxels from the $i$-th ROI and the $j$-th ROI, and averaged the correlations over all voxel-voxel pairs to account for the varying sizes of the ROIs as follows.

$$\rho_{\eta_i \eta_j} = \frac{1}{\eta_i \eta_j} \sum_{u=1}^{\eta_i} \sum_{w=1}^{\eta_j} |\rho_{uw}| \tag{15}$$

Then the significance of the correlation is evaluated by comparing $\rho_{ij}$ with the null statistics $\rho_{ij}^0$ obtained from 10,000 permutations of the samples.

$$p\text{-value} = \frac{\left|\{\rho_h^0 \ge \rho_{ij};\; h = 1, 2, \ldots, H\}\right|}{10000}$$

Here, random sampling from the 342 subjects is performed repeatedly $H = 50$ times, each time selecting the same proportion of subjects for the training and test sets.
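A permutation test built on the averaged statistic of Eq.(15) might look like the following sketch. It assumes that the null distribution is generated by shuffling the samples of one ROI relative to the other, which is one plausible reading of "permutations of the samples"; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def mean_abs_roi_correlation(Xi, Xj):
    """Eq. (15): mean absolute correlation over all voxel pairs.

    Xi has shape (n_samples, eta_i) and Xj has shape (n_samples, eta_j).
    """
    n_i = Xi.shape[1]
    corr = np.corrcoef(Xi, Xj, rowvar=False)  # (eta_i + eta_j) x (eta_i + eta_j)
    return np.abs(corr[:n_i, n_i:]).mean()    # cross-ROI block only

def permutation_p_value(Xi, Xj, n_perm=10000, seed=0):
    """Fraction of permuted statistics that reach or exceed the observed one."""
    rng = np.random.default_rng(seed)
    observed = mean_abs_roi_correlation(Xi, Xj)
    exceed = 0
    for _ in range(n_perm):
        shuffled = Xi[rng.permutation(Xi.shape[0])]  # break the sample alignment
        if mean_abs_roi_correlation(shuffled, Xj) >= observed:
            exceed += 1
    return exceed / n_perm

# Two synthetic ROIs driven by a shared signal, so their voxels are strongly correlated.
rng = np.random.default_rng(1)
shared = rng.standard_normal(100)
Xi = shared[:, None] + 0.1 * rng.standard_normal((100, 3))
Xj = shared[:, None] + 0.1 * rng.standard_normal((100, 4))
p_value = permutation_p_value(Xi, Xj, n_perm=500)
```

A smaller number of permutations is used in the example call only to keep the run time short.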

C. Results

The results in Fig.3 show that the sparsity parameters affect the classification accuracy in the three tests. Since there are more females than males, as shown in Fig.2, we report the p-values of a Chi-square test comparing the numbers of males and females in the high-IQ and low-IQ groups. The p-values in the three tests are 0.1456, 0.1684 and 0.0665, respectively. All of these values are larger than 0.05, indicating that sex differences would not influence the WRAT classification performance in our experiments. For the test with the smallest sample, 68 subjects are retained from the top and bottom 10% of the 342 subjects, with average WRAT scores of 130.26 and 74.56, respectively. The highest accuracy of our model is 0.8591, obtained when $\lambda_1 = 5 \times 10^{-3}$ and $\lambda_2 = 8$. For the 20% test, the average WRAT scores of the subjects selected from the two ends are 123.10 and 79.49, respectively. The highest accuracy is 0.7517 under the same parameter setting as for the 10% group. When $\tau = 30$, the best classification accuracy is 0.6774, obtained when $\lambda_1 = 2 \times 10^{-3}$ and $\lambda_2 = 14$. It can be seen from Fig.3 that our model is sensitive to the sparsity parameters only in a small range. To better understand the effects of these parameters, the performance of the tensor decomposition model with $\lambda_1 = 0$ and $\lambda_2 = 0$ is also tested. In this case, our proposed model reduces to the CPD model. Table II compares the classification results of our model and CPD. The accuracy in the three tests increases by approximately 15.75, 8.17 and 2.43 percentage points relative to CPD, respectively. This demonstrates that the regularization terms in Eq.(5) improve the overall classification performance. Notably, as the percentage of individuals included increases, the number of subjects in both the low and high cognitive groups grows and the difference in WRAT scores between the two groups decreases. This is because the groups are expanded by choosing individuals whose WRAT scores are less distinct (top/bottom 30% WRAT scores vs. top/bottom 10% WRAT scores), which may lead to a moderate decrease in classification accuracy.

Fig. 3:

The classification performance on 4-fold cross-validation for various parameter values. (a). 10%. (b). 20%. (c). 30%.

TABLE II:

The comparison of classification performance between MSTD and CPD models.

Data selection | 10% | 20% | 30%

Number of subjects | 68 | 136 | 204
Gender (Male/Female) | 34/34 | 62/74 | 89/105
P-value of gender | 0.1456 | 0.1684 | 0.0665

Ave WRAT | 102.41 | 101.29 | 100.91
Ave WRAT (Top/Bottom) | 130.26/74.56 | 123.10/79.49 | 118.91/82.90

Accuracy of CPD | 0.7016 | 0.6700 | 0.6531

Accuracy of MSTD | 0.8591 | 0.7517 | 0.6774

Accuracy improved | 15.75% | 8.17% | 2.43%

To further validate the performance of our model, MSTD is also used to predict the WRAT score, and its performance is compared with that of other competing algorithms: SM [31], MTL [32], NM2TL [33] and CPD, where SM is a single-task based model, and MTL and NM2TL are joint learning models for two tasks. A 4-fold cross-validation is applied to evaluate the WRAT prediction performance of all these methods, repeated 100 times independently on the whole dataset of 342 subjects. The model performance was quantified with both the correlation coefficient (CC) and the root mean square error (RMSE) between the predicted and actual WRAT scores of the subjects in the testing set. According to the classification accuracy on the different groups, the parameters of MSTD were tuned in a smaller range ($\lambda_1 \in \{2 \times 10^{-3}, 3 \times 10^{-3}, \ldots, 8 \times 10^{-3}\}$, $\lambda_2 \in \{8, 10, 12, 14, 16\}$).
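The two evaluation metrics named above can be computed as in this plain-Python sketch; the predicted and actual scores are made-up numbers used only to exercise the functions.

```python
import math

def correlation_coefficient(predicted, actual):
    """Pearson correlation coefficient (CC) between predicted and actual scores."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)

def rmse(predicted, actual):
    """Root mean square error between predicted and actual scores."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical WRAT predictions for four test subjects.
predicted = [95.0, 102.0, 110.0, 88.0]
actual = [90.0, 105.0, 120.0, 85.0]
cc = correlation_coefficient(predicted, actual)
error = rmse(predicted, actual)
```

A higher CC and a lower RMSE both indicate better agreement between predicted and actual scores, which is how the methods are ranked in Fig.4.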

Fig.4 summarizes the performance of all methods for predicting cognitive abilities. The MSTD model shows the best performance in both CC and RMSE. More specifically, MSTD obtains the best CC of 0.4599 and the best RMSE of 14.5682 by combining three fMRI paradigms. CPD achieves the second best CC of 0.4090 and outperforms the other methods, i.e., 0.3796 by NM2TL using the WM and EM datasets, 0.3218 and 0.3233 by MTL using the WM and EM datasets, respectively, and 0.3179 and 0.3225 by SM using the WM and EM datasets, respectively. In RMSE, NM2TL achieves the second best result of 14.7559, which is slightly smaller than 14.9789 by CPD, 15.3024 by MTL using WM, 15.2762 by MTL using EM, 15.4841 by SM using WM and 15.3965 by SM using EM. In the prediction task, MSTD performs better than CPD in terms of both CC and RMSE. That is, the regularization terms indeed improve the model performance by extracting more discriminative features associated with cognitive behaviors. In addition, the two tensor-based models using three paradigms provide better performance than the models using one or two fMRI paradigms. MSTD has the minimum error among the compared models. Apart from NM2TL, CPD has a significantly smaller RMSE than the other four models. This reveals that using the information provided by multiple datasets can effectively improve prediction accuracy and reduce errors.

Fig. 4:

The cognitive abilities prediction performance of 5 methods. (a). Performance in CCs. (b). Performance in RMSEs.

D. Analysis and discussion

Although 100 components are obtained by the proposed method, only a fraction of them are related to cognitive behaviors. To facilitate the analysis, we used the components obtained from the 10% group of the WRAT classification and restricted them to those most highly related to cognitive abilities (the top 5 ranked components) based on the component weights of the SVM classifier. On the basis of the work of Power et al. [43], these five components are visualized in 12 functional networks via the BrainNet Viewer [46] (https://www.nitrc.org/projects/bnv/). The networks include Sensory/somatomotor Hand (SM/H), Sensory/somatomotor Mouth (SM/M), Cingulo-Opercular task control (CON), Auditory (AUD), Default Mode (DMN), Memory Retrieval (MRN), Visual (VIS), Fronto-Parietal task control (FPN), Salience (SN), Subcortical (SUB), Ventral Attention (VAN) and Dorsal Attention (DAN).
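
The component selection step can be sketched as a simple ranking. The sketch below assumes that "component weights of the SVM classifier" refers to one weight per component from a linear SVM and that components are ranked by absolute weight; both are assumptions made for illustration, and the weights shown are hypothetical.

```python
def top_k_components(weights, k=5):
    """Return the indices of the k components with the largest absolute weight."""
    order = sorted(range(len(weights)),
                   key=lambda i: abs(weights[i]),
                   reverse=True)
    return order[:k]

# HYPOTHETICAL classifier weights for 8 components (the study uses 100).
hypothetical_weights = [0.02, -0.91, 0.35, 0.00, 0.77, -0.10, 0.64, -0.48]

selected = top_k_components(hypothetical_weights, k=5)
print("selected component indices:", selected)
```

Ranking by absolute value treats strongly negative and strongly positive weights as equally informative, since both move a subject away from the decision boundary.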

The 1st component reflects a complex sub-network composed of different regions. Fig.5(A-a) displays the number of connections in the 1st component. Here, 1 is added to the corresponding element whenever the correlation between two nodes is not zero. Fig.5(A-b) shows the paradigm weights of the 1st component in the three paradigms, which are 0.58, 0.59 and 0.56 for RS, EM and WM, respectively. This indicates that the sub-network of the 1st component appears in all three paradigms simultaneously. Fig.6(A) shows the functional connectivity sub-network in different regions from the axial view, including SM/H, DMN, FPN and DAN. In these figures, the nodes are distinguished by different colors according to the functional networks in which they are located. The node size indicates the node connectivity strength (NCS). Specifically, the NCS of the i-th node is defined as the sum of the absolute values of the i-th row of matrix F. A greater NCS implies a stronger connection of a node to the other regions. The edge color indicates correlation or anti-correlation between two nodes (red or blue, respectively), and the edge width represents the magnitude of the connectivity strength between nodes.
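
The two quantities described above, the NCS of each node and the number of nonzero connections between functional networks, can be sketched as follows. The matrix `F`, the node-to-network assignment and the symmetric counting convention are hypothetical choices for illustration; the actual component matrices and the exact counting procedure are not specified in this section.

```python
def node_connectivity_strength(F):
    """NCS of node i: sum of the absolute values in row i of F."""
    return [sum(abs(value) for value in row) for row in F]

def count_network_connections(F, node_network, n_networks):
    """Count nonzero connections between every pair of networks."""
    counts = [[0] * n_networks for _ in range(n_networks)]
    n_nodes = len(F)
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):  # visit each undirected edge once
            if F[i][j] != 0:
                a, b = node_network[i], node_network[j]
                counts[a][b] += 1
                if a != b:
                    counts[b][a] += 1  # keep the count matrix symmetric
    return counts

# HYPOTHETICAL symmetric connectivity matrix for 4 nodes.
F = [
    [0.0, 0.3, 0.0, -0.2],
    [0.3, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [-0.2, 0.0, 0.0, 0.0],
]
# HYPOTHETICAL assignment: nodes 0 and 1 in network 0, nodes 2 and 3 in network 1.
node_network = [0, 0, 1, 1]

ncs = node_connectivity_strength(F)
counts = count_network_connections(F, node_network, n_networks=2)
print("NCS per node:", ncs)
print("connection counts:", counts)
```

In this toy example the sparsity of `F` is what keeps the counts small, which mirrors how the sparse components concentrate their connections in a few network pairs.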

Fig. 5:

The number of connections from the 1st component to the 5th component (from A-a to E-a) in 12 main functional brain networks, and the weights of five components occurring in three fMRI paradigms (from A-b to E-b).

Fig. 6:

The connectivity from the 1st component to the 5th component (from A to E).

The 2nd and 3rd components mainly concern DMN. As shown in Fig.5(B-a), the 2nd component exhibits significant contributions only in DMN, while the 3rd component shows links between DMN and other regions in Fig.5(C-a). Based on these observations, Fig.6(B) displays the DMN connectivity of the 2nd component from the sagittal left, axial and sagittal right views (from left to right), and Fig.6(C) shows the main sub-networks between DMN and three other regions, SM/H, VIS and DAN. Both components appear in all three fMRI paradigms: the weights of the 2nd component in RS, EM and WM are 0.57, 0.61 and 0.54 (Fig.5(B-b)), and those of the 3rd component are 0.53, 0.51 and 0.68 (Fig.5(C-b)), respectively.

It should be noted that the connections in the first three components are mainly located in SM/H, DMN, FPN and DAN, which is in accordance with previous studies in the literature. For instance, Ref. [2] reported that the medial frontal network and FPN emerged as the most successful in individual subject identification, and that the combination of these two frontoparietal networks significantly outperformed either network alone. Ref. [47] also indicated that the connections between pre-frontal and frontal cortices (FPN) comprising DAN were associated with intelligence, and significant brain-behavior associations were also observed for DMN, FPN and VIS. The predictive power of sub-networks in SM/H and DMN was found by [48], which is in line with the observations in the 1st component. Moreover, similar results for the 3rd component were also described in [47], [49]. The correlations between lower functional connectivity and higher intelligence scores involve the connections within cortical areas comprising DMN as well as the functional interactions between these areas and regions within SM/H and DAN. In the 2nd and 3rd components, the connectivity in DMN is stronger than in other regions, which is consistent with the current knowledge that DMN exhibits relatively high connectivity within itself [50].

Fig.5(D) and (E) show that the 4th and 5th components reflect the contributions of two specific tasks (EM and WM) to cognitive behaviors. Specifically, Fig.5(D) shows that the connectivity of the 4th component is concentrated in SM/H-DMN and that it emerges in EM and WM with weights of 0.59 and 0.79, respectively. Fig.6(D) displays the connections of the 4th component from the sagittal left, axial and sagittal right views. In Fig.5(E), the 5th component shows a strong relationship between DMN and other regions, including VIS and DAN, and it occurs in EM with the highest weight of 0.86.

From the perspective of the brain's functional systems, the common connectivity domains of the first three components include DMN-DMN, DMN-SM/H, DMN-FPN and DMN-DAN, which appear in all three paradigms. That is, the functional connectivity of these regions is highly related to the resting state, and it remains active when performing the working memory or emotion task. DMN activates in task-free states and is implicated in spontaneous thought, evaluation and memory; it works in both cooperative and antagonistic ways with task control networks [50], [51]. In addition, cognitive functions rely on a complex interaction between these networks, where task control networks provide top-down regulatory signals that modulate spontaneous processing unfolding in association cortices within DMN [13]. Moreover, the 4th component reveals a particular ROI node in SM/H, which links with other brain regions and exists in both EM and WM but rarely in RS. This is likely caused by the motor activity when the subjects press the button in a specific task. The 5th component indicates some special FNCs linking DMN with VIS and DAN. Compared with RS and WM, the 5th component appears more in EM. It is also described in [52] that the ROI is associated with complex cognitive functions and activated in the emotion task.

The selected FNCs are mainly within or across the frontal, parietal, occipital and temporal lobes. The top ranked ROI nodes of each component (those with an NCS greater than 0.4) are listed in detail in the Supplementary material. It can be observed that the majority of nodes in the 1st component are located in the superior frontal gyrus, middle frontal gyrus and precuneus. The majority of nodes in the 2nd component are within the middle temporal gyrus. The 3rd component contains three nodes within the supramarginal gyrus, medial frontal gyrus and angular gyrus, respectively. The 5th component, which occurs in the EM task, involves more cerebral cortex areas, such as the cingulate gyrus, precuneus, angular gyrus, middle temporal gyrus, supramarginal gyrus, posterior cingulate, sub-gyral and middle occipital gyrus. In addition, some nodes located in the middle temporal gyrus, supramarginal gyrus and angular gyrus appear recurrently in the 2nd, 3rd and 5th components, which suggests that these overlapping brain domains are activated under different task stimuli. These findings are consistent with previous studies indicating that activation across several regions within the frontal, parietal, occipital and temporal lobes significantly predicts cognitive abilities [33], [53], [54].

In summary, our main contributions are threefold. First, using a tensor model to combine the information from multi-paradigm datasets can effectively select features shared among multiple subjects. Second, utilizing regularization terms can further remove redundant information, leading to improved accuracy of classification and prediction. Third, the experimental analysis on the PNC data indicates that variations within and between DMN and SM/H, FPN and DAN, as well as the connections between VIS, DAN and DMN in task control FNC, are particularly significant for highlighting individual differences in cognitive behaviors.

IV. Conclusion

In this paper, we proposed a sparse tensor-based method to identify individual differences from brain functional connectivity using multiple fMRI paradigms. Specifically, the FNC of subjects in multiple fMRI paradigms is modeled as a three-way tensor, which is then decomposed into a series of rank-one components. Furthermore, sparsity regularization terms are incorporated into the tensor model so that significant FNC subnetworks can be extracted. Using the low-dimensional components of each subject extracted by our sparse tensor model as features, SVM and SVR were trained for classifying and predicting cognitive abilities. Our method was validated on the PNC dataset from a comprehensive study of brain development. The experimental results demonstrated that our model outperformed other competing models. Meanwhile, some cognition-relevant functional connectivity and brain regions were discovered, and some of them are consistent with previous studies in the literature. In general, the proposed method offers a valid way to circumvent the overfitting problem in high-dimensional but small-sample fMRI datasets. It considers the relationships across subjects and can select the shared features among multiple fMRI paradigms, which is effective in combining complementary information from multiple fMRI paradigms. As a result, it can help identify reliable biomarkers reflecting individual differences.

Supplementary Material

Acknowledgments

Manuscript received XXX; accepted XXX. This work was supported in part by NIH under Grant R01GM109068, Grant R01MH104680, Grant R01MH107354, R01AR059781, R01EB006841, R01EB005846, R01MH116782, R01MH121101, P20GM130447 and P20GM103472, in part by NSF under Grant 1539067, in part by the Fundamental Research Funds for the Central Universities, Chang’an University (CHD) NO.300102329102 and in part by the Natural Science Foundation of Shaanxi NO.2019JM-536.

Contributor Information

Yipu Zhang, School of Electronics and Control Engineering, Chang’an University, Xi’an, Shaanxi, 710064, China.

Li Xiao, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118.

Gemeng Zhang, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118.

Biao Cai, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118.

Julia M. Stephen, Mind Research Network, Albuquerque, NM 87106.

Tony W. Wilson, Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE 68198.

Vince D. Calhoun, Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS) {Georgia State University, Georgia Institute of Technology, Emory University}, Atlanta, GA 30030 and Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131.

Yu-Ping Wang, Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118.

References

  • [1] Sporns O, “The human connectome: a complex network,” Annals of the New York Academy of Sciences, vol. 1224, no. 1, pp. 109–125, 2011.
  • [2] Finn ES, Shen X, Scheinost D, Rosenberg MD, Huang J, Chun MM, Papademetris X, and Constable RT, “Functional connectome fingerprinting: identifying individuals using patterns of brain connectivity,” Nature neuroscience, vol. 18, no. 11, p. 1664, 2015.
  • [3] Kaufmann T, Alnæs D, Doan NT, Brandt CL, Andreassen OA, and Westlye LT, “Delayed stabilization and individualization in connectome development are related to psychiatric disorders,” Nature neuroscience, vol. 20, no. 4, p. 513, 2017.
  • [4] Shen X, Finn ES, Scheinost D, Rosenberg MD, Chun MM, Papademetris X, and Constable RT, “Using connectome-based predictive modeling to predict individual behavior from brain connectivity,” nature protocols, vol. 12, no. 3, p. 506, 2017.
  • [5] Calhoun VD, Miller R, Pearlson G, and Adalı T, “The chronnectome: time-varying connectivity networks as the next frontier in fmri data discovery,” Neuron, vol. 84, no. 2, pp. 262–274, 2014.
  • [6] Fox MD and Raichle ME, “Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging,” Nature reviews neuroscience, vol. 8, no. 9, p. 700, 2007.
  • [7] Tian L, Wang J, Yan C, and He Y, “Hemisphere-and gender-related differences in small-world brain networks: a resting-state functional mri study,” Neuroimage, vol. 54, no. 1, pp. 191–202, 2011.
  • [8] Zille P, Calhoun VD, Stephen JM, Wilson TW, and Wang Y-P, “Fused estimation of sparse connectivity patterns from rest fmri—application to comparison of children and adult brains,” IEEE transactions on medical imaging, vol. 37, no. 10, pp. 2165–2175, 2018.
  • [9] Buckner RL, Krienen FM, and Yeo BT, “Opportunities and limitations of intrinsic functional connectivity mri,” Nature neuroscience, vol. 16, no. 7, p. 832, 2013.
  • [10] Finn ES, Scheinost D, Finn DM, Shen X, Papademetris X, and Constable RT, “Can brain state be manipulated to emphasize individual differences in functional connectivity?” Neuroimage, vol. 160, pp. 140–151, 2017.
  • [11] Lv J, Jiang X, Li X, Zhu D, Chen H, Zhang T, Zhang S, Hu X, Han J, Huang H et al., “Sparse representation of whole-brain fmri signals for identification of functional networks,” Medical image analysis, vol. 20, no. 1, pp. 112–134, 2015.
  • [12] Medaglia JD, Lynall M-E, and Bassett DS, “Cognitive network neuroscience,” Journal of cognitive neuroscience, vol. 27, no. 8, pp. 1471–1491, 2015.
  • [13] Sripada C, Rutherford S, Angstadt M, Thompson WK, Luciana M, Weigard A, Hyde LH, and Heitzeg M, “Prediction of neurocognition in youth from resting state fmri,” Molecular psychiatry, pp. 1–9, 2019.
  • [14] Calhoun VD and Sui J, “Multimodal fusion of brain imaging data: a key to finding the missing link (s) in complex mental illness,” Biological psychiatry: cognitive neuroscience and neuroimaging, vol. 1, no. 3, pp. 230–244, 2016.
  • [15] Calhoun VD and Adali T, “Multisubject independent component analysis of fmri: a decade of intrinsic networks, default mode, and neurodiagnostic discovery,” IEEE reviews in biomedical engineering, vol. 5, pp. 60–73, 2012.
  • [16] Anderson A, Douglas PK, Kerr WT, Haynes VS, Yuille AL, Xie J, Wu YN, Brown JA, and Cohen MS, “Non-negative matrix factorization of multimodal mri, fmri and phenotypic data reveals differential changes in default mode subnetworks in adhd,” NeuroImage, vol. 102, pp. 207–219, 2014.
  • [17] Lin D, Cao H, Calhoun VD, and Wang Y-P, “Sparse models for correlative and integrative analysis of imaging and genetic data,” Journal of neuroscience methods, vol. 237, pp. 69–78, 2014.
  • [18] Fang J, Lin D, Schulz SC, Xu Z, Calhoun VD, and Wang Y-P, “Joint sparse canonical correlation analysis for detecting differential imaging genetics modules,” Bioinformatics, vol. 32, no. 22, pp. 3480–3488, 2016.
  • [19] Xiao L, Stephen JM, Wilson TW, Calhoun VD, and Wang Y, “Alternating diffusion map based fusion of multimodal brain connectivity networks for iq prediction,” IEEE Transactions on Biomedical Engineering, vol. 66, pp. 2140–2151, 2018.
  • [20] Beckmann CF and Smith SM, “Tensorial extensions of independent component analysis for multisubject fmri analysis,” Neuroimage, vol. 25, no. 1, pp. 294–311, 2005.
  • [21] Groves AR, Beckmann CF, Smith SM, and Woolrich MW, “Linked independent component analysis for multimodal data fusion,” Neuroimage, vol. 54, no. 3, pp. 2198–2217, 2011.
  • [22] Miwakeichi F, Martínez-Montes E, Valdés-Sosa PA, Nishiyama N, Mizuhara H, and Yamaguchi Y, “Decomposing eeg data into space–time–frequency components using parallel factor analysis,” NeuroImage, vol. 22, no. 3, pp. 1035–1045, 2004.
  • [23] Mahyari AG, Zoltowski DM, Bernat EM, and Aviyente S, “A tensor decomposition-based approach for detecting dynamic network states from eeg,” IEEE Transactions on Biomedical Engineering, vol. 64, no. 1, pp. 225–237, 2017.
  • [24] Hore V, Viñuela A, Buil A, Knight J, McCarthy MI, Small K, and Marchini J, “Tensor decomposition for multiple-tissue gene expression experiments,” Nature genetics, vol. 48, no. 9, p. 1094, 2016.
  • [25] Ma G, He L, Lu C-T, Shao W, Yu PS, Leow AD, and Ragin AB, “Multi-view clustering with graph embedding for connectome analysis,” in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 2017, pp. 127–136.
  • [26] Zhu X, Li X, Zhang S, Ju C, and Wu X, “Robust joint graph sparse coding for unsupervised spectral feature selection,” IEEE transactions on neural networks and learning systems, vol. 28, no. 6, pp. 1263–1275, 2017.
  • [27] Wang H, Nie F, Huang H, Risacher S, Ding C, Saykin AJ, Shen L et al., “Sparse multi-task regression and feature selection to identify brain imaging predictors for memory performance,” in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 557–562.
  • [28] Hitchcock FL, “The expression of a tensor or a polyadic as a sum of products,” Journal of Mathematics and Physics, vol. 6, no. 1–4, pp. 164–189, 1927.
  • [29] Wilkinson GS and Robertson GJ, “Wide range achievement test 4 (wrat4),” Lutz, FL: Psychological Assessment Resources, 2006.
  • [30] Satterthwaite TD, Connolly JJ, Ruparel K, Calkins ME, Jackson C, Elliott MA, Roalf DR, Hopson R, Prabhakaran K, Behr M et al., “The philadelphia neurodevelopmental cohort: a publicly available resource for the study of normal and abnormal brain development in youth,” Neuroimage, vol. 124, pp. 1115–1119, 2016.
  • [31] Tibshirani R, “Regression shrinkage and selection via the lasso: a retrospective,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 73, no. 3, pp. 273–282, 2011.
  • [32] Zhang D, Shen D, Initiative ADN et al., “Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in alzheimer’s disease,” NeuroImage, vol. 59, no. 2, pp. 895–907, 2012.
  • [33] Xiao L, Stephen JM, Wilson TW, Calhoun VD, and Wang Y-P, “A manifold regularized multi-task learning model for iq prediction from two fmri paradigms,” IEEE Transactions on Biomedical Engineering, vol. 67, no. 3, pp. 796–806, 2020.
  • [34] Carroll JD and Chang J-J, “Analysis of individual differences in multidimensional scaling via an n-way generalization of “eckart-young” decomposition,” Psychometrika, vol. 35, no. 3, pp. 283–319, 1970.
  • [35] Harshman RA, “Foundations of the parafac procedure: Models and conditions for an “explanatory” multimodal factor analysis,” 1970.
  • [36] Tomasi G and Bro R, “A comparison of algorithms for fitting the parafac model,” Computational Statistics & Data Analysis, vol. 50, no. 7, pp. 1700–1734, 2006.
  • [37] Kruskal JB, “Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics,” Linear algebra and its applications, vol. 18, no. 2, pp. 95–138, 1977.
  • [38] De Lathauwer L, “Blind separation of exponential polynomials and the decomposition of a tensor in rank-(l_r,l_r,1) terms,” SIAM Journal on Matrix Analysis and Applications, vol. 32, no. 4, pp. 1451–1474, 2011.
  • [39] Sorber L, Van Barel M, and De Lathauwer L, “Optimization-based algorithms for tensor decompositions: Canonical polyadic decomposition, decomposition in rank-(lr, lr, 1) terms, and a new generalization,” SIAM Journal on Optimization, vol. 23, no. 2, pp. 695–720, 2013.
  • [40] Yang H, Liu J, Sui J, Pearlson G, and Calhoun VD, “A hybrid machine learning method for fusing fmri and genetic data: combining both improves classification of schizophrenia,” Frontiers in human neuroscience, vol. 4, p. 192, 2010.
  • [41] Griffin SL, Mindt MR, Rankin EJ, Ritchie AJ, and Scott JG, “Estimating premorbid intelligence comparison of traditional and contemporary methods across the intelligence continuum,” Archives of Clinical Neuropsychology, vol. 17, no. 5, pp. 497–507, 2002.
  • [42] Zille P, Calhoun VD, and Wang Y-P, “Enforcing co-expression within a brain-imaging genomics regression framework,” IEEE transactions on medical imaging, vol. 37, no. 12, pp. 2561–2571, 2018.
  • [43] Power JD, Cohen AL, Nelson SM, Wig GS, Barnes KA, Church JA, Vogel AC, Laumann TO, Miezin FM, Schlaggar BL et al., “Functional network organization of the human brain,” Neuron, vol. 72, no. 4, pp. 665–678, 2011.
  • [44] Groves AR, Smith SM, Fjell AM, Tamnes CK, Walhovd KB, Douaud G, Woolrich MW, and Westlye LT, “Benefits of multi-modal fusion analysis on a large-scale dataset: life-span patterns of inter-subject variability in cortical morphometry and white matter microstructure,” Neuroimage, vol. 63, no. 1, pp. 365–380, 2012.
  • [45] Cai B, Zhang G, Hu W, Zhang A, Zille P, Zhang Y, Stephen JM, Wilson TW, Calhoun VD, and Wang Y-P, “Refined measure of functional connectomes for improved identifiability and prediction,” Human brain mapping, vol. 40, no. 16, pp. 4843–4858, 2019.
  • [46] Xia M, Wang J, and He Y, “Brainnet viewer: a network visualization tool for human brain connectomics,” PloS one, vol. 8, no. 7, p. e68910, 2013.
  • [47] Hearne LJ, Mattingley JB, and Cocchi L, “Functional brain networks related to individual differences in human intelligence at rest,” Scientific reports, vol. 6, p. 32328, 2016.
  • [48] Cai B, Zhang G, Zhang A, Stephen JM, Wilson TW, Calhoun VD, and Wang Y, “Capturing dynamic connectivity from resting state fmri using time-varying graphical lasso,” IEEE Transactions on Biomedical Engineering, vol. 66, pp. 1852–1862, 2019.
  • [49] Hu W, Cai B, Zhang A, Calhoun VD, and Wang Y-P, “Deep collaborative learning with application to multimodal brain development study,” IEEE Transactions on Biomedical Engineering, 2019.
  • [50] Buckner RL, Andrews-Hanna JR, and Schacter DL, “The brain’s default network,” Annals of the New York Academy of Sciences, vol. 1124, no. 1, pp. 1–38, 2008.
  • [51] Gerlach KD, Spreng RN, Gilmore AW, and Schacter DL, “Solving future problems: default network and executive activity associated with goal-directed mental simulations,” Neuroimage, vol. 55, no. 4, pp. 1816–1824, 2011.
  • [52] Jacoby N, Bruneau E, Koster-Hale J, and Saxe R, “Localizing pain matrix and theory of mind networks with both verbal and non-verbal stimuli,” Neuroimage, vol. 126, pp. 39–48, 2016.
  • [53] Greene AS, Gao S, Scheinost D, and Constable RT, “Task-induced brain state manipulation improves prediction of individual traits,” Nature communications, vol. 9, no. 1, p. 2807, 2018.
  • [54] Graham S, Jiang J, Manning V, Nejad AB, Zhisheng K, Salleh SR, Golay X, Berne YI, and Mckenna PJ, “Iq-related fmri differences during cognitive set shifting,” Cerebral Cortex, vol. 20, no. 3, pp. 641–649, 2010.
