Author manuscript; available in PMC: 2020 Apr 17.
Published in final edited form as: IEEE Trans Med Imaging. 2019 Aug 2;39(3):611–620. doi: 10.1109/TMI.2019.2931708

Analyzing Dynamical Brain Functional Connectivity As Trajectories on Space of Covariance Matrices

Mengyu Dai , Zhengwu Zhang , Anuj Srivastava
PMCID: PMC7164686  NIHMSID: NIHMS1573353  PMID: 31395539

Abstract

Human brain functional connectivity (FC) is often measured as the similarity of functional MRI responses across brain regions when a brain is either resting or performing a task. This paper aims to statistically analyze the dynamic nature of FC by representing the collective time-series data, over a set of brain regions, as a trajectory on the space of covariance matrices, or symmetric positive-definite matrices (SPDMs). We use a recently developed metric on the space of SPDMs to quantify differences across FC observations, and for clustering and classification of FC trajectories. To facilitate large-scale and high-dimensional data analysis, we propose a novel, metric-based dimensionality reduction technique that reduces data from large SPDMs to small SPDMs. We illustrate this comprehensive framework using data from the Human Connectome Project (HCP) database for multiple subjects and tasks, with task classification rates that match or outperform state-of-the-art techniques.

Keywords: Dynamic functional connectivity, fMRI pattern classification, covariance trajectory, Riemannian metric, dimension reduction

I. Introduction

There has been great interest in studying the brain functional connectome generated from functional MRI (fMRI), which measures the blood oxygen level dependent (BOLD) contrast signal inside the brain over a period of time. The recent Human Connectome Project (HCP) [1] facilitates such research by providing a high-quality, publicly available dataset. The functional connectome, also referred to as functional connectivity (FC), is estimated as a set of statistical dependencies of fMRI signals among remote anatomical regions. These dependencies are expressed as quantifications of similarity, correlation, or covariance between simultaneous BOLD measurements across regions of the human brain. Recent studies [2] have shown that FC is a dynamic process and changes continuously, even in the resting state. The short-term FC is often represented as a correlation or covariance matrix of fMRI data over a small time window, with the matrix size equal to the number of brain regions being considered.

Our primary goal is to quantitatively represent and compare dynamic FC between different anatomical regions over long intervals, representing either the performance of certain tasks or the brain at rest. We achieve this by representing dynamic FCs as trajectories on the space of covariance matrices, and comparing these trajectories using a Riemannian metric on this space. This, in turn, leads to classification of dynamic FCs and prediction of various kinds of disease or stimuli from fMRI data. One challenge is the high dimensionality of the observed data: it becomes expensive to compare covariance trajectories when the dimension of the covariances (i.e., the number of regions being studied) is very large. To handle this issue, we provide a metric-based dimension-reduction method that reduces large covariance matrices to small ones and significantly improves the computational efficiency of the ensuing statistical analysis.

Significant progress has been made over the last few years in the field of fMRI data analysis. One focus has been the classification of tasks and external stimuli using fMRI data. In [3], the authors used a multivariate pattern analysis (MVPA) approach to test the hypothesis that motivation-related enhancement of cognitive control results from improved encoding and representation of task set information. In the study presented in [4], fMRI data were analyzed using deep learning tools for Alzheimer's disease prediction. Barachant et al. [5] studied the classification of covariance matrices using SVM classifiers with a Riemannian-metric-based kernel. Zhao et al. [6] proposed a Riemannian framework for performing group analysis on longitudinal RS-fMRI data; their experiments studied the potential use of this framework in longitudinal connectivity analysis. Dodero et al. [7] extracted traces of brain function by constructing trajectories in a low-dimensional space. Their approach allows studying dynamic FC in detail from the moment-to-moment fluctuations of brain signals during resting state or other conditions. Qiu et al. [8] proposed a framework for computing the mean of a set of brain functional networks and embedding brain functional networks into a low-dimensional space, using Locally Linear Embedding (LLE) under the Log-Euclidean Riemannian metric. Dodero et al. [9] presented a general computational framework for classifying connectivity graphs using kernels defined on the Riemannian manifold of positive-definite matrices; based on a representation through a regularized graph Laplacian, this method can also be applied to the classification of any kind of connectivity graph with non-negative weights. Huang et al. [10] proposed a regression approach that smooths out noise by exploiting the geometric structure of correlation matrices.
They demonstrated that the method is useful for identifying distinct state spaces in the resting-state connectivities of males and females through the transition matrices of the common states. Despite these achievements, existing methods either do not provide quantitative measures for representing and comparing dynamic FCs at the network level, or lack efficient solutions for region-based, large-scale, multi-class classification problems. Here we aim to address these issues using a novel combination of: a representation of dynamic FC as a sequence of symmetric positive-definite matrices (SPDMs), a Riemannian structure on SPDMs, and a tool for dimension reduction from large SPDMs to small SPDMs.

We first parcellate the human brain into functional units or regions, and represent short-term FC as SPDMs, with entries corresponding to covariances of localized fMRI signals. We use a sliding window to segment the multivariate time series into overlapping temporal blocks, and estimate a covariance matrix for each block. The original time-series data (BOLD signal) are thus converted into a time series of covariance matrices, termed a covariance trajectory or SPDM trajectory. Next, we utilize a Riemannian structure (described in several places, including [11]) on the space of SPDMs to facilitate the comparison of trajectories. To improve computational efficiency, we propose a novel dimension-reduction approach based on the chosen Riemannian metric on SPDMs. The novel contributions of this paper are as follows:

  1. Dynamic FC Representation Using SPDM Trajectories: We represent the dynamic FCs as SPDM trajectories and compare these trajectories using an advanced Riemannian framework.

  2. Metric-Based Task Classification: We use a Riemannian metric to perform registration, comparison, and classification of dynamic FCs as covariance trajectories.

  3. SPDM Dimension Reduction: We introduce a metric-based dimension-reduction technique to map large SPDMs into small ones, thus improving the computational efficiency and facilitating statistical analysis of data involving hundreds of regions.

The rest of this paper is organized as follows. In Section II, we describe the mathematical components of the proposed framework, including: the estimation and representation of FC using SPDMs, Riemannian metrics for comparing SPDM trajectories, and dimension reduction for large SPDMs. Section III presents experimental results using both simulated and real HCP data. The paper ends with a short discussion and summary in Section IV.

II. Methodology

In this section we lay out the different components of our framework. After standard preprocessing steps (see details in the real data analysis section), we obtain a time series of the fMRI BOLD signal for each brain region of interest (ROI). We then select a set of ROIs and consider the corresponding vector-valued or multivariate time series. A sliding-window method is used to temporally group elements of this multivariate time series into overlapping blocks. Within each block we estimate a covariance matrix, and thus obtain a time series of covariance matrices, i.e., an SPDM trajectory, for each fMRI recording. These data are then ready for statistical analysis and classification. We describe the main pieces of this analysis next.

A. Covariance Estimation

The first step is to estimate a correlation or a covariance matrix that represents the short-term FC between distinct ROIs in the brain. More specifically, we pick n ROIs and then represent FC over a small time-interval using an n × n covariance matrix of the corresponding BOLD signals over that interval. In this paper we use the approach from [12] to estimate the covariance matrix. The estimated Σ is an optimal convex linear combination of the sample covariance matrix S and the identity matrix I, i.e., Σ = ρ1I + ρ2S, where the optimal weights ρ1 and ρ2 are estimated from the data. The optimality is defined with respect to a quadratic loss function, asymptotically as the number of observations and the number of variables go to infinity together. Extensive Monte Carlo simulations confirm that the asymptotic results tend to hold well even in finite sample situations. Please refer to [12] for more details. Now, after estimating a covariance matrix for each time-interval segmented by a sliding window, we obtain a covariance trajectory for each multivariate time series.
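As an illustration, the sliding-window construction of a covariance trajectory can be sketched as follows, using scikit-learn's LedoitWolf estimator for the shrinkage estimate Σ = ρ1·I + ρ2·S; the window and step sizes here are arbitrary toy values, not the settings used in our experiments.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def covariance_trajectory(bold, window=80, step=10):
    """Slide a window over a (T, n) multivariate BOLD series and
    return a (num_windows, n, n) array of shrinkage covariance
    estimates (Ledoit-Wolf: Sigma = rho1*I + rho2*S)."""
    T, n = bold.shape
    covs = []
    for start in range(0, T - window + 1, step):
        block = bold[start:start + window]          # (window, n) samples
        covs.append(LedoitWolf().fit(block).covariance_)
    return np.stack(covs)                           # the SPDM trajectory

# toy example: 300 time points, 4 ROIs
rng = np.random.default_rng(0)
traj = covariance_trajectory(rng.standard_normal((300, 4)))
```

Each window yields one SPDM, so the trajectory length is determined by the window and step sizes rather than by the raw number of time points.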

B. Riemannian Structure on Symmetric Positive-Definite Matrices (SPDMs)

In order to quantify differences in FCs, represented by covariance matrices, we need a metric structure on the manifold of covariance matrices, or SPDMs. While there are several Riemannian structures in the literature [11, 13, 14], we use the one introduced in [11], since it has the advantage of having closed forms for many of the operations we need on the SPDM manifold, e.g., the geodesic distance, parallel transport, exponential map, and inverse exponential map. Zhang et al. [11] have also demonstrated that this metric is superior to other metrics, such as the log-Euclidean one [13], for analyzing dynamic FCs.

Let $\tilde{\mathcal{P}}_n$ be the space of $n \times n$ SPDMs, and let $\mathcal{P}_n$ be its subset of matrices with determinant one. In our approach, we impose separate distances on the determinant-one matrices and on the determinants themselves. Any square matrix $G$ with unit determinant, i.e., $G \in SL(n)$, can be written as a product of two square matrices $G = PS$, where $P \in \mathcal{P}$ and $S \in SO(n)$ ($SO(n)$ being the set of all $n \times n$ rotation matrices). This is called the polar decomposition. It motivates us to analyze $P$ by representing it as a $G$ after removing $S$. More formally, we identify $\mathcal{P}$ with the quotient space $SL(n)/SO(n)$. This identification is based on the map $\pi : SL(n)/SO(n) \to \mathcal{P}$ given by $\pi([G]) = \sqrt{\tilde{G}\tilde{G}^T}$ for any $\tilde{G} \in [G]$, where the square-root is the symmetric, positive-definite square-root of a symmetric matrix. The notation $[G]$ stands for all possible rotations of the matrix $G$: $[G] = \{GS \mid S \in SO(n)\}$. The inverse map of $\pi$ is given by $\pi^{-1}(P) = [P] \equiv \{PS \mid S \in SO(n)\}$. This establishes a one-to-one correspondence between the quotient space $SL(n)/SO(n)$ and $\mathcal{P}$. Skipping further details, this process leads to the following geodesic distance between points in $\mathcal{P}$. For any $P_1, P_2 \in \mathcal{P}$:

$d_{\mathcal{P}}(P_1, P_2) = \|A_{12}\|_F, \quad (1)$

where $A_{12} = \log(P_{12})$, $P_{12} = \sqrt{P_1^{-1} P_2^2 P_1^{-1}}$, and $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. Since for any $\tilde{P} \in \tilde{\mathcal{P}}$ we have $\det(\tilde{P}) > 0$, we can express $\tilde{P}$ as the pair $\left(P, \frac{1}{n}\log(\det(\tilde{P}))\right)$ with $P = \tilde{P}/\det(\tilde{P})^{1/n} \in \mathcal{P}$. Thus, $\tilde{\mathcal{P}}$ is identified with the product space $\mathcal{P} \times \mathbb{R}^{+}$, and we take a weighted combination of the distances on these two components to reach a metric on $\tilde{\mathcal{P}}$:

$d_{\tilde{\mathcal{P}}}(I, \tilde{P})^2 = d_{\mathcal{P}}(I, P)^2 + \frac{1}{n}\left(\log(\det(\tilde{P}))\right)^2. \quad (2)$

For any two arbitrary covariances $\tilde{P}_1$ and $\tilde{P}_2$, let $\tilde{P}_{12} = \tilde{P}_1^{-1}\tilde{P}_2 S_{12}$ for some optimal $S_{12} \in SO(n)$ (obtained using Procrustes alignment). Also, note that for $\tilde{P}_{12} \in \tilde{\mathcal{P}}$, we have $\det(\tilde{P}_{12}) = \det(\tilde{P}_2)/\det(\tilde{P}_1)$. Therefore, the resulting squared geodesic distance between $\tilde{P}_1$ and $\tilde{P}_2$ is:

$d_{\tilde{\mathcal{P}}}(\tilde{P}_1, \tilde{P}_2)^2 = d_{\mathcal{P}}(I, P_{12})^2 + \frac{1}{n}\left(\log(\det(\tilde{P}_2)) - \log(\det(\tilde{P}_1))\right)^2. \quad (3)$

Under this framework, we can also obtain closed-form solutions for the exponential map, inverse exponential map, parallel transport, and Riemannian curvature tensor, which are often useful. We refer the reader to [14] for more details.

Note that in Eqn (3), the distance between covariance matrices is made up of two components: the determinant term and the unit-determinant symmetric matrix term. We can choose arbitrary relative weights to combine these two components. While, in some simple cases, it has been shown that one can obtain decent classification performance using only the determinant term, in general the second component provides critical information about the actual difference between covariance trajectories. As an example, Fig. 1 shows evolutions of log(det(α(t)))/n versus t, where α denotes a covariance trajectory, for different experimental conditions. It can be seen from this plot that the determinant information alone is not sufficient to classify tasks and experimental conditions. Therefore, in the following sections, we mainly focus on the contributions from the unit-determinant symmetric matrix term (the first term on the right side of Eqn (3)).
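A minimal numerical sketch of the distances in Eqns (1)-(3), assuming the reading $P_{12} = \sqrt{P_1^{-1}P_2^2P_1^{-1}}$; the matrix square root and logarithm come from SciPy, and the two test matrices are arbitrary toy SPDMs.

```python
import numpy as np
from scipy.linalg import sqrtm, logm

def dist_unit_det(P1, P2):
    """Eqn (1): d_P(P1, P2) = ||log(P12)||_F with
    P12 = sqrt(P1^{-1} P2^2 P1^{-1}), for unit-determinant SPDMs."""
    P1inv = np.linalg.inv(P1)
    P12 = np.real(sqrtm(P1inv @ P2 @ P2 @ P1inv))
    return np.linalg.norm(np.real(logm(P12)), 'fro')

def dist_spdm(P1t, P2t):
    """Eqn (3): split each SPDM into a unit-determinant part and its
    log-determinant, then combine the two squared distance terms."""
    n = P1t.shape[0]
    d1, d2 = np.linalg.det(P1t), np.linalg.det(P2t)
    P1 = P1t / d1 ** (1.0 / n)   # unit-determinant parts
    P2 = P2t / d2 ** (1.0 / n)
    return np.sqrt(dist_unit_det(P1, P2) ** 2
                   + (np.log(d2) - np.log(d1)) ** 2 / n)

A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
d = dist_spdm(A, B)
```

The distance vanishes for identical inputs and is symmetric, because the eigenvalues of $P_1^{-1}P_2^2P_1^{-1}$ and $P_2^{-1}P_1^2P_2^{-1}$ are reciprocal pairs and the log-norm is unchanged under inversion.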

Fig. 1. Determinants of covariance matrices in 80 trajectories from 4 different task activities in the HCP data (n = 4).

C. Riemannian Metric for FC Trajectories

To compare FC trajectories from different observations, we need a method for calculating distances between such trajectories. While the previously defined metric on SPDMs provides a way forward, an important issue remains. The time rates of execution of different tasks across subjects and experiments may not be tightly synchronized. Even if the cues are offered at exact times, the reaction times of different subjects can differ, and even the same subject can differ in response time across experiments. One needs to account for this execution-rate variability in order to better understand pure FC variation. Mathematically, this variability is represented as follows.

Let $\alpha$ denote a smooth trajectory on the Riemannian manifold of SPDMs $\mathcal{P}$, where $\mathcal{P}$ is endowed with the Riemannian distance in (2). Let $\mathcal{M}$ denote the set of all such trajectories: $\mathcal{M} = \{\alpha : [0,1] \to \mathcal{P} \mid \alpha \text{ is smooth}\}$. Also, define $\Gamma$ to be the set of all boundary-preserving diffeomorphisms of $[0,1]$: $\Gamma = \{\gamma : [0,1] \to [0,1] \mid \gamma(0) = 0, \gamma(1) = 1, \gamma \text{ a diffeomorphism}\}$. Elements of $\Gamma$ represent different execution rates across different experimental recordings. If $\alpha$ is a trajectory on $\mathcal{P}$, then $\alpha \circ \gamma$ is a trajectory that follows the same sequence of points as $\alpha$ but at the evolution rate governed by $\gamma$. That is, $\alpha$ and $\alpha \circ \gamma$ differ only in their execution rates, or response times; the actual FC responses are identical. It is important to note that $\Gamma$ forms a group under the composition operation.

Let $\alpha_1$ and $\alpha_2$ be two smooth trajectories in $\mathcal{M}$. A simple way to establish a metric between them is to use $\int_0^1 d(\alpha_1(t), \alpha_2(t))\,dt$. Although this is a natural way to quantify differences between $\alpha_1$ and $\alpha_2$, it suffers from the problem that $d(\alpha_1, \alpha_2) \neq d(\alpha_1 \circ \gamma_1, \alpha_2 \circ \gamma_2)$ when the activities are observed under different execution rates $\gamma_1$ and $\gamma_2$. This implies that even identical FC trajectories will have a non-zero distance between them if they are performed at different execution rates. The execution rate $\gamma$ is considered a nuisance variable when we only care about the values or the shapes of the trajectories. We will later demonstrate that removing the effects of $\gamma$ facilitates better classification of FCs in certain scenarios. To make the comparison of trajectories invariant to their execution rates, we use the Riemannian metric described in [11]. This construction is described next.

Definition 1

Let $\alpha : [0,1] \to \mathcal{P}$ be a smooth trajectory with starting point $\alpha(0) = P \in \mathcal{P}$. Define the transported square-root vector field (TSRVF) of $\alpha$ to be a scaled parallel transport of the velocity field $\dot{\alpha}$ along $\alpha$ to the starting point $P$: for each $t \in [0,1]$, $q(t) = \frac{\dot{\alpha}(t)_{\alpha(t) \to P}}{\sqrt{|\dot{\alpha}(t)|}} \in T_P(\mathcal{P})$, where $|\cdot|$ denotes the norm defined through the Riemannian metric on $\mathcal{P}$, and $V_{\alpha(t) \to \alpha(0)}$ denotes the parallel transport of a vector $V$ from $\alpha(t)$ to $\alpha(0)$ along $\alpha$. A covariance trajectory $\alpha$ is then represented by the pair $(P, q(t))$ in this analysis.

For any point $P \in \mathcal{P}$, denote the set of all square-integrable curves in $T_P(\mathcal{P})$ by $\mathcal{B}_P \equiv \mathbb{L}^2([0,1], T_P(\mathcal{P}))$. The space of all smooth SPDM trajectories then becomes an infinite-dimensional vector bundle $\mathcal{B} = \coprod_{P \in \mathcal{P}} \mathcal{B}_P$, the indexed union of $\mathcal{B}_P$ over all $P \in \mathcal{P}$.

Given two points $(P_1, q_1)$ and $(P_2, q_2)$ on $\mathcal{B}$, we want to find the geodesic path connecting them. Let $(x(s), v(s, \cdot))$, $s \in [0,1]$, be a path with $(x(0), v(0,\cdot)) = (P_1, q_1)$ and $(x(1), v(1,\cdot)) = (P_2, q_2)$. To be a geodesic, $(x(s), v(s,\cdot))$ needs to satisfy certain conditions given in [11, 15], and these conditions lead to a geodesic-shooting method for finding the geodesic between two points on $\mathcal{B}$. Skipping the details, we provide the final expressions for geodesic distances.

Definition 2

Given two trajectories $\alpha_1, \alpha_2$ and their representations $(P_1, q_1)$, $(P_2, q_2)$, let $(x(s), v(s, \cdot))$, $s \in [0,1]$, be the geodesic between $(P_1, q_1)$ and $(P_2, q_2)$ on $\mathcal{B}$. The geodesic distance is given by:

$d_c((P_1, q_1), (P_2, q_2)) = \sqrt{l_x^2 + \int_0^1 \left|q_1^{\parallel}(t) - q_2(t)\right|^2 dt}. \quad (4)$

This distance has two components: (1) the length of the baseline between the starting points on $\mathcal{P}$, $l_x = \int_0^1 |\dot{x}(s)|\,ds$; and (2) the standard $\mathbb{L}^2$ norm on $\mathcal{B}_{P_2}$ between the TSRVFs of the two trajectories, where $q_1^{\parallel}$ denotes the parallel transport of $q_1 \in \mathcal{B}_{P_1}$ along $x$ to $P_2$. In what follows, we will also write $d_c((P_1, q_1), (P_2, q_2))$ as $d_c(\alpha_1, \alpha_2)$ for simplicity.

The main motivation for using the TSRVF representation of an SPDM trajectory is that the following important property holds: $d_c(\alpha_1, \alpha_2) = d_c(\alpha_1 \circ \gamma, \alpha_2 \circ \gamma)$. This key property allows us to study dynamic FCs in a manner that is invariant to their evolution rates. That is, we can compare two equivalence classes, or orbits, $[\alpha_1]$ and $[\alpha_2]$, where $[\alpha_i] = \{\alpha_i \circ \gamma \mid \gamma \in \Gamma\}$.

Definition 3

Define the geodesic distance $d_q$ on $\mathcal{B}/\Gamma$ as the shortest distance between two orbits in $\mathcal{B}$:

$d_q([\alpha_1], [\alpha_2]) = \inf_{\gamma \in \Gamma} d_c\left((P_1, q_1), \left(P_2, (q_2 \circ \gamma)\sqrt{\dot{\gamma}}\right)\right). \quad (5)$

It has been shown in [11] that this is a proper metric on the space $\mathcal{B}/\Gamma$. To solve the above equation, we adopt the following optimization strategy:

$\min_{(x,v),\,\gamma}\left(l_x^2 + \int_0^1 \left|q_1^{\parallel} - (q_2 \circ \gamma)\sqrt{\dot{\gamma}}\right|^2 dt\right). \quad (6)$

This optimization is very expensive since we need to iteratively search for both the optimal path (x, v) on $\mathcal{B}$ and the reparameterization $\gamma$. Instead, we use a fast approximation. We first fix the baseline x connecting the two trajectories to be the geodesic between $\alpha_1(0)$ and $\alpha_2(0)$ on $\mathcal{P}$, and then align their TSRVFs using the Dynamic Programming algorithm. This substantially speeds up the solution, albeit at the cost of potentially differing from the optimal solution of the theoretical formulation.
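The Dynamic Programming alignment step can be sketched as follows. This is a simplified DTW-style recursion over discretized vector fields, not the exact algorithm of [11]; the step pattern and the toy sequences are illustrative.

```python
import numpy as np

def dp_align(q1, q2):
    """Dynamic-programming alignment of two discretized vector fields
    q1, q2 of shape (T, d). Returns the accumulated matching cost and
    the warping path (a discrete stand-in for gamma)."""
    T = len(q1)
    cost = np.full((T, T), np.inf)
    cost[0, 0] = np.sum((q1[0] - q2[0]) ** 2)
    for i in range(T):
        for j in range(T):
            if i == j == 0:
                continue
            best = min(cost[i - 1, j] if i > 0 else np.inf,
                       cost[i, j - 1] if j > 0 else np.inf,
                       cost[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
            cost[i, j] = np.sum((q1[i] - q2[j]) ** 2) + best
    # backtrack to recover the warping path from (T-1, T-1) to (0, 0)
    path, i, j = [(T - 1, T - 1)], T - 1, T - 1
    while (i, j) != (0, 0):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min((s for s in steps if s[0] >= 0 and s[1] >= 0),
                   key=lambda s: cost[s])
        path.append((i, j))
    return cost[-1, -1], path[::-1]

rng = np.random.default_rng(1)
q = rng.standard_normal((20, 6))
total, path = dp_align(q, q)   # aligning a sequence to itself
```

Aligning a sequence with itself recovers the identity warping (the diagonal path) at zero cost, which is the sanity check implied by the rate-invariance property above.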

D. Dimension Reduction for SPDMs

As mentioned above, the number of ROIs in brain connectivity studies can become very large [16], resulting in FC trajectories made up of large SPDMs. Comparing such trajectories can become computationally expensive and, therefore, we seek a method for data reduction that preserves the symmetric, positive-definite nature of covariance matrices. Our main idea is to find a linear projection that maps high-dimensional SPDMs to low-dimensional SPDMs in a principled, near-optimal manner. In addition to providing computational simplification, the low-dimensional SPDMs also facilitate analyses in the following ways:

  1. Such a projection can bring FC trajectories associated with different numbers of ROIs to the same smaller dimension, making comparisons between them possible.

  2. In the case that not all ROIs carry the same amount of information, dimension reduction can help filter out some extraneous components.

The problem of dimension reduction of SPDMs has been studied and used in a variety of computer vision and pattern recognition problems, see e.g., [17, 18]. In this paper, we develop a novel dimension-reduction technique adapted to the Riemannian metric presented in Section II B. Thus, the reduced SPDMs are especially suitable for analyzing under the proposed Riemannian framework.

Given a set of $n \times n$ unit-determinant SPDMs $\{P_i\}$, where $n$ is a large integer, our goal is to find an orthogonal matrix $B \in \mathbb{R}^{n \times d}$, where $d \ll n$ and $B^T B = I_d$, to project each $P_i$ to $Q_i \in \mathbb{R}^{d \times d}$ according to $Q_i = B^T P_i B$. The space of such orthogonal matrices is called a Stiefel manifold, denoted $S_{n,d}$. The next question is: what should be the optimality criterion for defining $B$? A simple yet important idea is that pairwise distances between the given SPDMs should be preserved as much as possible after the projection. That is, find $B \in S_{n,d}$ such that $d_{\mathcal{P}_d}(Q_i, Q_j) \approx d_{\mathcal{P}_n}(P_i, P_j)$ for all $i, j$ in the training set. This criterion can be formulated as:

$\underset{B \in S_{n,d}}{\operatorname{argmin}} \sum_{i,j}\left(d_{\mathcal{P}_n}(P_i, P_j) - d_{\mathcal{P}_d}(Q_i, Q_j)\right)^2.$

A direct optimization of this quantity over $B \in S_{n,d}$ is complicated due to the complexity of the chosen Riemannian metric. Instead, we develop an approximation in which the comparison of distances is replaced by a direct comparison of the relevant matrices.

In the original space, the distance between matrices $P_i$ and $P_j$ is governed by the matrix $P_{ij} = \sqrt{P_i^{-1} P_j^2 P_i^{-1}}$ (see Eqn (1)). Similarly, the distance in the smaller space is determined by the matrix $Q_{ij} = \sqrt{Q_i^{-1} Q_j^2 Q_i^{-1}}$. In order to compare these matrices, we need to bring them to the same space. Let $\hat{P}_i$ denote the reconstruction of $P_i$ from its projection $Q_i$, i.e., $\hat{P}_i = B Q_i B^T \in \mathbb{R}^{n \times n}$. Our goal is to find $B \in S_{n,d}$ that minimizes the quantity:

$\underset{B \in S_{n,d}}{\operatorname{argmin}} \sum_{i,j}\left\|P_{ij} - \sqrt{\hat{P}_i^{-1}\hat{P}_j^2\hat{P}_i^{-1}}\right\|^2. \quad (7)$

However, this specification requires the following proviso. Since $\hat{P}_i$ has rank $d$, it is not invertible, and one needs to use its pseudoinverse instead. Let $\hat{P}_i^{\dagger} = B Q_i^{-1} B^T$ denote the pseudoinverse of $\hat{P}_i$ (in the appendix we establish this expression as a pseudoinverse). Then, we have the following result.
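The claimed pseudoinverse can be checked numerically; the sketch below verifies the Moore-Penrose conditions for $\hat{P} = BQB^T$ and $\hat{P}^{\dagger} = BQ^{-1}B^T$ with a random Stiefel point $B$ (all sizes are toy values).

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 8, 3

# random B with orthonormal columns (a point on the Stiefel manifold)
B, _ = np.linalg.qr(rng.standard_normal((n, d)))

# random d x d SPDM Q and its rank-d reconstruction P_hat = B Q B^T
K = rng.standard_normal((d, d))
Q = K @ K.T + d * np.eye(d)
P_hat = B @ Q @ B.T
P_dag = B @ np.linalg.inv(Q) @ B.T   # the claimed pseudoinverse
```

Since $B^T B = I_d$, products telescope: $\hat{P}\hat{P}^{\dagger}\hat{P} = BQQ^{-1}QB^T = \hat{P}$, and $\hat{P}\hat{P}^{\dagger} = BB^T$ is symmetric, so all four Moore-Penrose conditions hold.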

Lemma 1.

Under the conditions specified above, we have, for all i, j,

$\left\|P_{ij} - \sqrt{\hat{P}_i^{\dagger}\hat{P}_j^2\hat{P}_i^{\dagger}}\right\|^2 = \left\|P_{ij} - BQ_{ij}B^T\right\|^2.$

To prove this, one only needs to show that $\sqrt{\hat{P}_i^{\dagger}\hat{P}_j^2\hat{P}_i^{\dagger}} = BQ_{ij}B^T$, which is done in the appendix. This provides another interpretation of the objective function.

Lemma 2.

The optimization of quantity in Lemma 1 can be rephrased as follows.

$\underset{B \in S_{n,d}}{\operatorname{argmin}} \sum_{i,j=1}^{N}\left\|P_{ij} - BQ_{ij}B^T\right\|^2 = \underset{B \in S_{n,d}}{\operatorname{argmax}} \sum_{i,j=1}^{N} \operatorname{tr}\left(B^T P_{ij} B \, B^T P_{ij} B\right).$

The proof of Lemma 2 is given in the appendix. We solve this optimization problem on the Stiefel manifold using the Matlab toolbox Manopt [19].
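In place of Manopt, the maximization in Lemma 2 can be sketched in NumPy with projected gradient ascent and a QR retraction; the step size, iteration count, and test matrices below are illustrative choices, not the settings used in our experiments.

```python
import numpy as np

def stiefel_dimred(P_list, d, iters=200, lr=1e-4):
    """Maximize sum_ij tr(B^T P B B^T P B) over the Stiefel manifold
    S_{n,d} by projected gradient ascent with a QR retraction
    (a NumPy stand-in for a Manopt-style solver)."""
    n = P_list[0].shape[0]
    rng = np.random.default_rng(3)
    B, _ = np.linalg.qr(rng.standard_normal((n, d)))

    def objective(B):
        return sum(np.trace((B.T @ P @ B) @ (B.T @ P @ B)) for P in P_list)

    for _ in range(iters):
        # Euclidean gradient of tr((B^T P B)^2) w.r.t. B is 4 P B (B^T P B)
        G = sum(4.0 * P @ B @ (B.T @ P @ B) for P in P_list)
        B, _ = np.linalg.qr(B + lr * G)   # retract back onto B^T B = I_d
    return B, objective(B)

# toy SPDMs: a rank-2 part plus a multiple of the identity
rng = np.random.default_rng(4)
Ps = []
for _ in range(5):
    K = rng.standard_normal((10, 2))
    Ps.append(K @ K.T + 10.0 * np.eye(10))

B, val = stiefel_dimred(Ps, d=3)
```

The objective is invariant to right-multiplication of B by any orthogonal matrix, so the sign ambiguity of the QR factorization is harmless.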

III. Experiments

In this section, we provide a number of simulations and real data experiments to illustrate the proposed methodology. The first simulated example illustrates the proposed dimension-reduction technique. The second simulation studies the combined effects of dimension reduction and temporal alignment. In the real data experiment section, we present a few classification experiments based on different ROIs, activities, and classifiers with and without dimension reduction and alignment.

A. Simulation Studies on Dimension Reduction and Alignment

Simulation Experiment 1:

We first randomly generate k sets of SPDMs, each set containing T independent n × n SPDMs. Let $P_i(j)$ be the jth SPDM in the ith set; it is generated as $P_i(j) = K_i K_i^T + nI_n + \epsilon_{ij}\epsilon_{ij}^T$, where $K_i$ and $\epsilon_{ij}$ are n × n matrices with i.i.d. standard normal entries. In this simulation we used the values n = 100, T = 20 and k = 10. Next, we apply our dimension-reduction method to reduce the dimension of the SPDMs from n = 100 to d = 20, 10 and 5. To evaluate this reduction, we calculate pairwise-distance matrices before and after dimension reduction and display them in Fig. 2. The results show that, although the dimension of the SPDMs changes dramatically, the patterns of distances within and across sets are preserved well for d = 20 and d = 10, with a slight deterioration for d = 5. To quantify changes in the distance matrices, we let D be the distance matrix between covariance matrices in the original dimension and $D_d$ be the distance matrix after reducing the dimension to d, and display the Frobenius norm $\|D - D_d\|_F$ vs d in Fig. 2(e). We also use some real data associated with the social task, along with the AAL [20] template, to display functional connectivities computed from reconstructed SPDMs in different dimensions, visualized with BrainNet Viewer [21]. We threshold the precision matrices associated with these SPDMs to define edges in the graphs, as shown in Fig. 3. The similarity between these graphs indicates the success of data compression with our dimension-reduction method.

Fig. 2. Pairwise distances between 200 covariance matrices before and after dimension reduction. (a) shows the distance matrix before dimension reduction, and (b), (c), (d) show the distance matrices after reducing the dimension to d = 20, 10 and 5, respectively. (e) shows ∥D − Dd∥F vs dimension.

Fig. 3. Reconstructed connectivities using precision matrices estimated from reconstructed SPDMs of different dimensions.

Simulation Experiment 2:

Next, we study the combined effects of dimension reduction and temporal alignment. We simulate covariance trajectories in the space of large SPDMs and time warp them to generate new trajectories. Then, we perform dimension reduction and solve for temporal registration in the reduced space, in that order, and compare the warping results to the original time warpings.

We start by generating several covariance sequences, labeled $\{S_i\}$, each with 20 time points and covariance size 100 × 100. They are generated as follows: first we simulate a time series in $\mathbb{R}^{100}$ from the multivariate normal distribution $N(0, I_{100})$ for t = 1, …, 300, and then use a sliding window with window size 80 and step size 10 to segment the time series into overlapping blocks. We then estimate a covariance in each block to form a 20-length sequence of 100 × 100 covariances. The resulting sequence is resampled at 20 time points after temporal smoothing with a Gaussian kernel, resulting in $S_i$. Next, we randomly generate warping functions $\gamma_i$ and form new sequences $S_i \circ \gamma_i$, denoted by $\tilde{S}_i$. Finally, we apply dimension reduction to $S_i$ and $\tilde{S}_i$ to obtain trajectories of size d × d × 20, with d ≪ 100, denoted by $S_i^d$ and $\tilde{S}_i^d$. Then, we use the optimization stated in Eqn (6) to align $S_i^d$ to $\tilde{S}_i^d$. In Fig. 4, we show the recovered warping functions for different cases. Each panel represents a different experiment with different simulated data. The ground-truth warping function is shown in black, while the other curves represent the recovered warping functions for different values of d. From these results, we can see that for a moderate dimension reduction, with d = 50 or d = 20, the alignment results are near perfect. However, a large reduction (d ≤ 5) can result in some loss of information and lead to a deterioration in alignment performance.
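The warping step can be sketched as follows; the increment-based sampler for γ and the nearest-index resampling are illustrative simplifications, not the exact simulation procedure used above.

```python
import numpy as np

def random_warping(T, rng, roughness=0.5):
    """Random diffeomorphism of [0,1] sampled at T points: the
    cumulative sum of positive increments, normalized so that
    gamma(0) = 0 and gamma(1) = 1 (hence strictly increasing)."""
    inc = np.exp(roughness * rng.standard_normal(T - 1))
    g = np.concatenate(([0.0], np.cumsum(inc)))
    return g / g[-1]

def warp_trajectory(S, gamma):
    """Resample a (T, n, n) covariance trajectory at warped times,
    S(gamma(t)), via nearest-index lookup."""
    T = len(S)
    idx = np.clip(np.round(gamma * (T - 1)).astype(int), 0, T - 1)
    return S[idx]

rng = np.random.default_rng(5)
gamma = random_warping(20, rng)
S = np.stack([np.eye(3) * (k + 1) for k in range(20)])  # toy trajectory
S_warped = warp_trajectory(S, gamma)
```

The warped sequence visits the same covariance values as the original but at a different rate, which is exactly the nuisance variability that the alignment step is meant to remove.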

Fig. 4. The recovered warping functions before and after different levels of dimension reduction. The x-axis and y-axis represent time before and after warping, respectively.

The computational cost of analyzing a large set of SPDM trajectories is a significant concern, motivating the need for dimension reduction. In Fig. 5 we plot the computational cost of comparing trajectories, with and without temporal alignment, for different sizes of SPDM trajectories. This cost was measured on a PC with an Intel i7-6700HQ CPU and 8GB RAM. One can see that the cost of performing temporal alignment grows rapidly with the matrix size and becomes impractical for n larger than 100. This is an important consideration, and sometimes the gains from alignment are nullified by the increase in computational cost.

Fig. 5. Time cost for calculating the distance between two 333 × 333 × 20 trajectories before and after alignment with different degrees of dimension reduction.

B. Real Data Experiments

Next we use the HCP dataset to perform quantitative analysis of fMRI responses from different ROIs under different activities. The majority of the HCP fMRI data were acquired at 3T. A series of 4D fMRI images was acquired for each subject while they performed different tasks involving different neural systems, e.g., the visual, motor and cognition systems. The acquired images have an isotropic spatial resolution of 2 mm and a temporal resolution of 0.7 s. All fMRI data in HCP are preprocessed by removing spatial distortions, realigning volumes to compensate for subject motion, registering the fMRI to the structural MRI, reducing the bias field, normalizing the 4D image to a global mean, masking the data with the final brain mask, and aligning the brain to a standard (MNI) space [22]. These preprocessed fMRI data are then available for connectivity analysis.

We begin by segmenting the brain into multiple ROIs using an existing template [23]. With this template, the cortical area is divided into several networks: visual, auditory, default, frontoparietal, dorsal attention, cingulo-opercular, ventral attention, and salience. The fMRI time series for each region (333 regions in total) are extracted using the CONN functional connectivity toolbox [24]. As in the simulation study, we use a sliding-window method to obtain covariance trajectories representing dynamic FCs. For each activity we use the first 176 time points of the fMRI time series. Despite the incompleteness of the data, this setup ensures that the resulting sequences are compared on the same time scale. In total, we extract data from 196 subjects under five different tasks - motor, gambling, social cognition, language, emotion - and the resting state, and use them in our analysis.

In each classification experiment, we calculate the pairwise distance dq, as defined in Eqn (5), between individual trajectories, and feed these distances into two types of classifiers. One is an SVM with an RBF kernel. The other is a simple deep neural network (DNN) classifier with four dense layers, each with dropout and ReLU activation, and a final dense layer with softmax activation, implemented in Keras with a TensorFlow [25] backend. We find that alignment helps in some cases, although in the more general multiclass classification tasks we can obtain the same level of accuracy using dc in Eqn (4). Thus, one can choose Eqn (4) in those tasks to save computation time. The results of individual experiments are presented next.
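Feeding precomputed distances to an SVM can be sketched as below: pairwise distances are turned into an RBF-style kernel K = exp(−d²/2σ²) and passed to scikit-learn's SVC with kernel='precomputed'. The toy Euclidean distances and the median-heuristic bandwidth stand in for the trajectory distances dq.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(6)

# toy stand-in for trajectory distances: two well-separated groups
pts = np.concatenate([rng.normal(0, 1, (20, 5)), rng.normal(5, 1, (20, 5))])
labels = np.array([0] * 20 + [1] * 20)
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)  # pairwise "d_q"

# turn distances into an RBF-style kernel and fit a precomputed-kernel SVM
sigma = np.median(D[D > 0])          # median-heuristic bandwidth
K = np.exp(-(D ** 2) / (2 * sigma ** 2))
clf = SVC(kernel='precomputed').fit(K, labels)
acc = clf.score(K, labels)           # training accuracy on the toy problem
```

At prediction time, the classifier only needs the kernel rows between test and training trajectories, so new subjects require distances to the training set rather than a full recomputation.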

1). Dynamic FCs under different sets of ROIs and a fixed task:

In this experiment, we restrict attention to the gambling task and use different sets of ROIs to study the corresponding dynamic FCs. The first set of 4 ROIs is from the Salience network and the second set of 4 ROIs is from the CinguloParietal network. Fig. 6 shows the locations of these ROIs in the human brain. We extract the fMRI signals from 40 subjects in the HCP dataset, obtaining a total of 80 trajectories, 40 for each set of ROIs. We then calculate two 80 × 80 pairwise distance matrices, before and after temporal alignment. In Fig. 7, we display these distance matrices along with a histogram of the relative changes in pairwise distances due to alignment (for all trajectories). From this figure, we see that the dynamic FCs generated by the Salience network are more homogeneous than the ones generated by the CinguloParietal network. This is reasonable because ROIs in the CinguloParietal network are actively involved in the gambling task [26]. The histogram in the last panel shows that there is a significant reduction (~10-20%) in distances due to temporal alignment of trajectories.

Fig. 6. Plots of ROIs in the Salience, CinguloParietal, VentralAttn and DorsalAttn networks in the template used.

Fig. 7. Pairwise distance matrices between the 80 trajectories from the gambling task. Group 1: first 40 trajectories from 4 ROIs in the Salience network; Group 2: last 40 trajectories from 4 ROIs in the CinguloParietal network. (a) Pairwise distances before alignment. (b) Distances after alignment. (c) Histogram of (dc − dq)/dc.

Next, these distance matrices are used as features in a binary classification task, to study whether we can distinguish the dynamic FCs generated by different brain regions when subjects perform the same task. The average classification rates before and after alignment using the DNN classifier are shown in Table I. From this result, we see that alignment improves the classification rate for the class with higher variation.

TABLE I. Average classification rates in the Salience and CinguloParietal networks before and after alignment.

                   Salience   CinguloParietal
Before alignment   80%        83%
After alignment    80%        91%

2). Dynamic FCs under different tasks but the same ROIs:

Here we are interested in the classification of dynamic FCs under different tasks using the same set of ROIs. To save computational cost, in the following experiments we use dc for computing distances between trajectories. We present results from three scenarios:

a). Scenario 1: Classification and prediction based on ROIs in the VentralAttn and DorsalAttn networks:

We use 196 subjects from the HCP dataset. For four selected tasks - gambling, social, emotion and resting state - we construct dynamic FC trajectories for each subject using ROIs either in the VentralAttn or in the DorsalAttn network. The VentralAttn network has 20 ROIs, resulting in 784 dynamic FC trajectories of size 20 × 20 × 20. The DorsalAttn network has 29 ROIs, resulting in 784 dynamic FC trajectories of size 29 × 29 × 20. We then calculate pairwise distances between trajectories, with dimension reduction, and feed the distance features to the classifiers. The average overall classification rates based on five-fold cross-validation are shown in Fig. 9. The classification rates are relatively stable, except when we reduce the dimension all the way to d = 5. Notably, dimension reduction lowers the computational cost of comparing trajectories by a significant amount while still retaining good performance. In Fig. 9 we also present results obtained using the trajectories directly as input to an SVM classifier, denoted SVM-t. The results show that using distances as features yields higher success rates in this low-dimensional classification task.
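A sliding-window construction of such covariance trajectories can be sketched as follows; the window length, stride and ridge constant are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def covariance_trajectory(X, win=30, stride=10):
    """Sliding-window covariance trajectory from a T x n fMRI matrix X
    (T time points, n ROIs); returns an array of shape (m, n, n)."""
    T, n = X.shape
    mats = []
    for start in range(0, T - win + 1, stride):
        C = np.cov(X[start:start + win], rowvar=False)
        # a small ridge keeps each window's matrix strictly positive definite
        mats.append(C + 1e-6 * np.eye(n))
    return np.stack(mats)

# e.g. 200 time points over the 20 VentralAttn ROIs
X = np.random.default_rng(1).standard_normal((200, 20))
traj = covariance_trajectory(X)
print(traj.shape)  # (18, 20, 20)
```

Stacking one such trajectory per subject and task yields the collection of m-step sequences of n × n SPDMs that the pairwise distances are computed on.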

Fig. 9. Classification results in the VentralAttn and DorsalAttn networks involving 196 subjects and four activities, under different reduced dimensions. The top four lines are results using distance features as classifier input; the bottom two lines are results using trajectories as input.

A confusion matrix of the classification results using all ROIs in the DorsalAttn network and a DNN classifier is shown in Fig. 8. For one class (resting state) the success rate reaches 100%, while it is lower for the other classes.

Fig. 8. Class-wise classification results in the DorsalAttn network involving four activities.

b). Scenario 2: Classification of dynamic FC trajectories based on ROIs in other networks:

Here we use 196 subjects and different sets of ROIs - those in the DorsalAttn (DA), RetrosplenialTemporal (RT), CinguloParietal (CP) and SMmouth (SM) networks - as the data sources for classification. Using the same settings as in Scenario 1, the average overall classification rates for the two classifiers are shown in Fig. 10. We study a four-class classification (gambling, social, emotion and resting state) versus a more difficult six-class classification (motor, gambling, social, language, emotion and resting state). The success rates using the proposed metric reach 93% in the four-class problem but drop to 72% in the six-class problem. In this setup, despite the simple architecture of the DNN classifier, its overall performance is as competitive as the SVM's. One could seek further improvement using more advanced neural network architectures. For comparison, we also compute distances between covariance trajectories using the log-Euclidean metric between SPDMs, denoted SVM-L in Fig. 10. Our proposed metric gives better overall performance, while the log-Euclidean metric performs better in the six-class problem using ROIs in the DorsalAttn network. One interpretation is that the two metrics capture different characteristics of dynamic functional connectivity across activities, so in practice one may consider combining distance features computed from different metrics for classification.
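The log-Euclidean distance used in the SVM-L comparison has a simple closed form, d(P1, P2) = ||logm(P1) − logm(P2)||_F. A minimal sketch, computing the matrix logarithm of an SPDM via its eigendecomposition:

```python
import numpy as np

def spd_log(P):
    """Matrix logarithm of an SPDM via eigendecomposition."""
    w, V = np.linalg.eigh(P)
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_dist(P1, P2):
    """Log-Euclidean distance: || logm(P1) - logm(P2) ||_F."""
    return np.linalg.norm(spd_log(P1) - spd_log(P2), "fro")

A = np.random.default_rng(2).standard_normal((4, 4))
P1 = A @ A.T + 4 * np.eye(4)   # a random SPDM
P2 = P1 + 0.5 * np.eye(4)      # a nearby SPDM
print(log_euclidean_dist(P1, P1))       # 0 up to numerical error
print(log_euclidean_dist(P1, P2) > 0)   # positive for distinct inputs
```

Summing (or integrating) such pointwise distances along two time-indexed covariance trajectories gives a trajectory-level log-Euclidean distance analogous to dc.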

Fig. 10. Classification results from different networks using four activities and six activities.

c). Scenario 3: Classification of dynamic FCs with a large number of brain regions and dimension reduction, with comparison:

In this experiment, we involve all 333 ROIs and four activities (gambling, social, emotion and resting state), with 100 subjects for each activity. We also compare the performance of our proposed pipeline with that of regular PCA. To apply PCA, for each averaged ROI-based multivariate time series of dimension n (n = 333 here), we first reduce its dimension to d using PCA, then transform it into a covariance trajectory and calculate pairwise distances as before. We did not run the full-dimensional case (d = n = 333) because of its high computational and storage cost.
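The PCA baseline described above can be sketched as follows; the series length and window settings are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 333))   # T x n averaged ROI time series, n = 333

# PCA baseline: reduce the n-dimensional series to d dimensions first,
# then build the covariance trajectory in the reduced space.
d = 10
X_red = PCA(n_components=d).fit_transform(X)   # shape (200, d)

win, stride = 30, 10
traj = np.stack([
    np.cov(X_red[s:s + win], rowvar=False)
    for s in range(0, X_red.shape[0] - win + 1, stride)
])
print(traj.shape)  # (18, 10, 10)
```

Note the design difference: PCA reduces the raw time series before covariances are formed, whereas the proposed method projects the SPDMs themselves while trying to preserve pairwise distances.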

The pairwise distance matrices between the 400 trajectories under different degrees of dimension reduction are shown in Fig. 11. Despite the dimension reduction, the block patterns are well preserved. Classification results using the SVM classifier for these setups are presented in Table II. The proposed pipeline provides the highest classification rates across the different dimensions, highlighting the power of the proposed method.

Fig. 11. Pairwise distance matrices between 400 trajectories from four activities using 333 ROIs with dimension reduction. Panels (a), (b) and (c) show d = 100, 30 and 10, respectively.

TABLE II. Classification rates for four activities with 333 ROIs under dimension reduction.

DR method   d = 10   d = 30   d = 100
Proposed    80%      80%      82%
PCA         48%      59%      53%

The existing fMRI pattern classification literature utilizes data from different sources. Some works use particular datasets to classify mental disorders, and some aim at classifying task/resting activities. To our knowledge, only a few works use the HCP dataset for the classification of activities. One example is Zhang et al. [27], where the classification accuracy reaches 100% when distinguishing tfMRI from rsfMRI in a two-class classification problem. [28] and [29] also reach high classification rates in two-task classification problems. In contrast, our results achieve high classification rates in a multi-task classification problem, and also reach a 100% classification rate for activities such as the resting state.

Another use of the proposed framework is to compare covariance trajectories of different dimensions, i.e., with different numbers of ROIs. Using the dimension reduction in our method, one can bring different trajectories to the same dimension, thus making comparisons between these trajectories possible. As an example, we use resting-state data and different numbers of ROIs in 10 networks to form covariance trajectories, with 30 subjects in each network, resulting in a total of 300 trajectories. We project all covariance trajectories to d = 4 and compute pairwise distances between all 300 trajectories. The results, shown in Fig. 12, reveal a strong consistent pattern in the Salience network under the resting state [30].

Fig. 12. Pairwise distances of 300 trajectories from 10 networks (FrontoParietal, SMhand, Salience, Visual, CinguloOperc, VentralAttn, DorsalAttn, RetrosplenialTemporal, CinguloParietal and SMmouth, respectively) after reducing dimensions to d = 4.

IV. Discussion

In this paper, we have presented a comprehensive framework for comparing multivariate fMRI time series, based on a distance between trajectories in the space of SPDMs. We also derive a method for SPDM dimension reduction, which saves significant computational cost while preserving pairwise distances as much as possible. This method can further be used to compare covariance trajectories of different dimensions. Experimental results show high classification accuracies in multi-class classification problems, especially for the emotion task and resting-state data.

Through extensive experiments, we demonstrate the usefulness of the proposed framework in fMRI data analysis. Under different tasks, the same set of brain regions can display different dynamic FC patterns. Some brain regions are very informative in distinguishing tasks, while others are not. These findings are consistent with the existing literature; e.g., in [31], the authors discovered that FCs are dominated by common patterns, and tasks only modestly influence brain networks. Classification of tasks using full-brain fMRI does not always yield the best performance. Hence, particular brain regions should be selected for task-based fMRI analysis and classification. The selection of brain regions can be guided by prior neuroscientific knowledge or by data-driven methods [32].

We also note that in a multi-class classification problem, distance features can be similar for certain tasks due to similar functional connectivity patterns of the selected ROIs, so the resulting classification rates are relatively low for these tasks. In terms of classification, we expect that, given the distance features, advanced deep learning frameworks and ensemble models may further improve performance.

Acknowledgements

This research was supported in part by the NSF grants to AS – NSF DMS CDS&E 1621787 and NSF CCF 1617397. ZZ was supported in part by SAMSI via grant NSF DMS 1127914 during this research.

The authors are thankful to the organizers of the HCP database for making it available to the public for this research. HCP is funded in part by the WU-Minn consortium (PIs: Van Essen and Ugurbil), 1U54MH091657.

Appendix

A. Discussion on the Pseudoinverse of $\hat{P}$

For an $n \times d$ orthogonal matrix $B$ (i.e., $B^T B = I_d$), an $n \times n$ SPDM $P$, and the projection $Q = B^T P B \in \mathbb{R}^{d \times d}$, the reconstruction of $P$ is given by $\hat{P} = B Q B^T$. Since $\hat{P}$ is singular, we cannot invert it, but we can use the notion of a pseudoinverse instead. Let the SVD of $Q$ be $Q = U Q_0 U^T$, and let an SVD of $\hat{P}$ be
$$\hat{P} = [U_1\ U_2] \begin{bmatrix} S_1 & 0 \\ 0 & S_2 \end{bmatrix} [U_1\ U_2]^T,$$
where $S_1 \in \mathbb{R}^{d \times d}$ and $S_2 \in \mathbb{R}^{(n-d) \times (n-d)}$. Since $\hat{P}$ is not full rank, $S_2$ is a matrix of zeros, and $\hat{P} = U_1 S_1 U_1^T$. Its pseudoinverse is thus defined to be $\hat{P}^{\dagger} = U_1 S_1^{-1} U_1^T$.

On the other hand, $\hat{P} = B Q B^T = B U Q_0 U^T B^T$. Since $U$ is a rotation matrix, $BU$ is just a rotation of $B$ and is still orthonormal. Setting $BU = U_1$ and $Q_0 = S_1$, we obtain an SVD of $\hat{P}$, and its pseudoinverse can be written as $\hat{P}^{\dagger} = BU Q_0^{-1} (BU)^T = B Q^{-1} B^T$.
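The closed form $\hat{P}^{\dagger} = B Q^{-1} B^T$ can be checked numerically; a small sketch with a randomly generated SPDM and orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 8, 3

A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)                        # an n x n SPDM
B, _ = np.linalg.qr(rng.standard_normal((n, d)))   # n x d with B^T B = I_d

Q = B.T @ P @ B            # d x d projection
P_hat = B @ Q @ B.T        # rank-d reconstruction of P

# numerical pseudoinverse of the singular P_hat vs. the closed form
lhs = np.linalg.pinv(P_hat, rcond=1e-10)   # cutoff discards the zero modes
rhs = B @ np.linalg.inv(Q) @ B.T
print(np.allclose(lhs, rhs))  # True
```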

B. Proof of Lemma 1

Given $\hat{P} = B Q B^T$ and $\hat{P}^{\dagger} = B Q^{-1} B^T$, we have
$$\hat{P}_i^{\dagger} \hat{P}_j^2 \hat{P}_i^{\dagger} = (B Q_i^{-1} B^T)(B Q_j B^T)(B Q_j B^T)(B Q_i^{-1} B^T)$$
for all $i$ and $j$. Since $B^T B = I_d$, this equals
$$B Q_i^{-1} B^T B Q_j B^T B Q_j B^T B Q_i^{-1} B^T = B Q_i^{-1} Q_j^2 Q_i^{-1} B^T = B Q_{ij} B^T .$$
Therefore, with $P_{ij} = P_i^{-1} P_j^2 P_i^{-1}$ and $Q_{ij} = Q_i^{-1} Q_j^2 Q_i^{-1}$, we can write
$$P_i^{-1} P_j^2 P_i^{-1} \approx \hat{P}_i^{\dagger} \hat{P}_j^2 \hat{P}_i^{\dagger} \quad \Longleftrightarrow \quad P_{ij} \approx B Q_{ij} B^T .$$
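The identity in Lemma 1 can likewise be verified numerically; a sketch with random SPDMs and a random orthonormal $B$:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 8, 3

def rand_spdm(k):
    A = rng.standard_normal((k, k))
    return A @ A.T + k * np.eye(k)

P_i, P_j = rand_spdm(n), rand_spdm(n)
B, _ = np.linalg.qr(rng.standard_normal((n, d)))   # B^T B = I_d

Q_i, Q_j = B.T @ P_i @ B, B.T @ P_j @ B
Phat_j = B @ Q_j @ B.T
pinv_i = B @ np.linalg.inv(Q_i) @ B.T    # pseudoinverse of B Q_i B^T (Appendix A)

# Lemma 1: Phat_i^+ Phat_j^2 Phat_i^+ = B (Q_i^{-1} Q_j^2 Q_i^{-1}) B^T
Qi_inv = np.linalg.inv(Q_i)
lhs = pinv_i @ Phat_j @ Phat_j @ pinv_i
rhs = B @ (Qi_inv @ Q_j @ Q_j @ Qi_inv) @ B.T
print(np.allclose(lhs, rhs))  # True
```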

C. Proof of Lemma 2

Starting with the left side and simplifying, we get:
$$\begin{aligned}
\operatorname*{argmin}_{B, Q_{ij}} \sum_{i,j} \| P_{ij} - B Q_{ij} B^T \|^2
&= \operatorname*{argmin}_{B, Q_{ij}} \sum_{i,j} \operatorname{tr}\big[ (P_{ij} - B Q_{ij} B^T)(P_{ij} - B Q_{ij} B^T) \big] \\
&= \operatorname*{argmin}_{B, Q_{ij}} \sum_{i,j} \operatorname{tr}\big( P_{ij} P_{ij} - P_{ij} B Q_{ij} B^T - B Q_{ij} B^T P_{ij} + B Q_{ij} B^T B Q_{ij} B^T \big) \\
&= \operatorname*{argmin}_{B, Q_{ij}} \sum_{i,j} \operatorname{tr}\big( P_{ij} P_{ij} - 2 B^T P_{ij} B Q_{ij} + Q_{ij} Q_{ij} \big) .
\end{aligned}$$

To simplify notation, we replace the trace operator with the matrix inner product, i.e., $\operatorname{tr}(AB) = \langle A, B \rangle = \sum_{i,j} a_{ij} b_{ij}$ for symmetric matrices. Now, for each pair $i, j$, we have
$$\begin{aligned}
\operatorname*{argmin}_{B, Q_{ij}} \operatorname{tr}\big( P_{ij} P_{ij} - 2 B^T P_{ij} B Q_{ij} + Q_{ij} Q_{ij} \big)
&= \operatorname*{argmin}_{B, Q_{ij}} \big( \langle P_{ij}, P_{ij} \rangle - 2 \langle B^T P_{ij} B, Q_{ij} \rangle + \langle Q_{ij}, Q_{ij} \rangle \big) \\
&= \operatorname*{argmin}_{B, Q_{ij}} \big( - \langle B^T P_{ij} B, B^T P_{ij} B \rangle + \langle B^T P_{ij} B - Q_{ij},\, B^T P_{ij} B - Q_{ij} \rangle \big) .
\end{aligned}$$

It is easy to see that, to minimize the objective function, we need to set $Q_{ij} = B^T P_{ij} B$ for each $i, j$. Substituting $Q_{ij} = B^T P_{ij} B$, we have
$$\operatorname*{argmin}_{B, Q_{ij}} \operatorname{tr}\big( P_{ij} P_{ij} - 2 B^T P_{ij} B Q_{ij} + Q_{ij} Q_{ij} \big)
= \operatorname*{argmin}_{B} \big( - \langle B^T P_{ij} B, B^T P_{ij} B \rangle \big)
= \operatorname*{argmax}_{B} \operatorname{tr}\big( B^T P_{ij} B B^T P_{ij} B \big) .$$

Therefore, combining over all $i, j$:
$$\operatorname*{argmin}_{B, Q_{ij}} \sum_{i,j} \| P_{ij} - B Q_{ij} B^T \|_F^2
= \operatorname*{argmax}_{B} \sum_{i,j} \operatorname{tr}\big( B^T P_{ij} B B^T P_{ij} B \big) .$$
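The key step of the proof, that for a fixed $B$ the inner minimization is solved by $Q_{ij} = B^T P_{ij} B$, can be checked numerically (with a single random $P$ standing in for one $P_{ij}$):

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 8, 3

A0 = rng.standard_normal((n, n))
P = A0 @ A0.T + n * np.eye(n)                      # stands in for one P_ij
B, _ = np.linalg.qr(rng.standard_normal((n, d)))   # B^T B = I_d

A = B.T @ P @ B        # candidate optimum Q = B^T P B

def objective(Q):
    R = P - B @ Q @ B.T
    return np.sum(R * R)          # squared Frobenius norm

# identity from the proof: ||P - B Q B^T||_F^2
#   = <P, P> - <A, A> + ||A - Q||_F^2, with A = B^T P B
Q = rng.standard_normal((d, d))
Q = (Q + Q.T) / 2                 # a symmetric competitor
ident = np.sum(P * P) - np.sum(A * A) + np.sum((A - Q) ** 2)
print(np.isclose(objective(Q), ident))   # True
print(objective(A) <= objective(Q))      # Q = B^T P B is the minimizer
```

The decomposition makes the argument transparent: the objective exceeds its minimum by exactly $\|A - Q\|_F^2$, which vanishes only at $Q = B^T P B$.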

REFERENCES

[1] Van Essen D, Smith S, Barch D, Behrens T, Yacoub E, and Ugurbil K, "The WU-Minn Human Connectome Project: An overview," NeuroImage, vol. 80, pp. 62–79, 2013.
[2] Hutchison R, Womelsdorf T, Allen E, Bandettini P, Calhoun V, Corbetta M, Della Penna S, Duyn J, Glover G, Gonzalez-Castillo J et al., "Dynamic functional connectivity: Promise, issues, and interpretations," NeuroImage, vol. 80, pp. 360–378, 2013.
[3] Etzel J, Cole M, Zacks J, Kay K, and Braver T, "Reward motivation enhances task coding in frontoparietal cortex," Cerebral Cortex, vol. 26, no. 4, pp. 1647–1659, April 2016.
[4] Sarraf S and Tofighi G, "DeepAD: Alzheimer's disease classification via deep convolutional neural networks using MRI and fMRI," bioRxiv, 2016.
[5] Barachant A, Bonnet S, Congedo M, and Jutten C, "Classification of covariance matrices using a Riemannian-based kernel for BCI applications," Neurocomputing, vol. 112, pp. 172–178, July 2013.
[6] Zhao Q, Kwon D, and Pohl KM, "A Riemannian framework for longitudinal analysis of resting-state functional connectivity," in MICCAI, 2018.
[7] Dodero L, Sona D, Meskaldji DE, Murino V, and Van De Ville D, "Traces of human functional activity: Moment-to-moment fluctuations in fMRI data," in 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), April 2016, pp. 1307–1310.
[8] Qiu A, Lee A, Tan M, and Chung MK, "Manifold learning on brain functional networks in aging," Medical Image Analysis, vol. 20, no. 1, pp. 52–60, 2015.
[9] Dodero L, Minh HQ, Biagio MS, Murino V, and Sona D, "Kernel-based classification for brain connectivity graphs on the Riemannian manifold of positive definite matrices," in 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), April 2015, pp. 42–45.
[10] Huang S-G, Samdin SB, Ting C-M, Ombao H, and Chung MK, "Statistical model for dynamically-changing correlation matrices with application to resting-state brain connectivity," arXiv:1812.10050, December 2018.
[11] Zhang Z, Su J, Klassen E, Le H, and Srivastava A, "Rate-invariant analysis of covariance trajectories," Journal of Mathematical Imaging and Vision, published online, 2018.
[12] Ledoit O and Wolf M, "A well-conditioned estimator for large-dimensional covariance matrices," Journal of Multivariate Analysis, vol. 88, no. 2, pp. 365–411, 2004.
[13] Pennec X, Fillard P, and Ayache N, "A Riemannian framework for tensor computing," Int. J. Comput. Vision, vol. 66, no. 1, pp. 41–66, 2006.
[14] Su J, Dryden IL, Klassen E, Le H, and Srivastava A, "Fitting optimal curves to time-indexed, noisy observations on nonlinear manifolds," Image and Vision Computing, vol. 30, no. 6–7, pp. 428–442, 2012.
[15] Le Brigant A, "A discrete framework to find the optimal matching between manifold-valued curves," Journal of Mathematical Imaging and Vision, pp. 1–31, 2018.
[16] Arslan S, Ktena SI, Makropoulos A, Robinson EC, Rueckert D, and Parisot S, "Human brain mapping: A systematic comparison of parcellation methods for the human cerebral cortex," NeuroImage, vol. 170, pp. 5–30, 2018.
[17] Harandi MT, Salzmann M, and Hartley RI, "Dimensionality reduction on SPD manifolds: The emergence of geometry-aware methods," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[18] Li Y and Lu R, "Locally preserving projection on symmetric positive definite matrix Lie group," CoRR, vol. abs/1703.09499, 2017.
[19] Boumal N, Mishra B, Absil P-A, and Sepulchre R, "Manopt, a Matlab toolbox for optimization on manifolds," J. Mach. Learn. Res., vol. 15, no. 1, pp. 1455–1459, January 2014.
[20] Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, and Joliot M, "Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain," NeuroImage, vol. 15, no. 1, pp. 273–289, 2002.
[21] Xia M, Wang J, and He Y, "BrainNet Viewer: A network visualization tool for human brain connectomics," PLOS ONE, vol. 8, no. 7, pp. 1–15, July 2013.
[22] Glasser MF, Sotiropoulos SN, Wilson JA, Coalson TS, Fischl B, Andersson JL, Xu J, Jbabdi S, Webster M, Polimeni JR et al., "The minimal preprocessing pipelines for the Human Connectome Project," NeuroImage, vol. 80, pp. 105–124, 2013.
[23] Gordon EM, Laumann TO, Adeyemo B, Huckins JF, Kelley WM, and Petersen SE, "Generation and evaluation of a cortical area parcellation from resting-state correlations," Cerebral Cortex, vol. 26, no. 1, p. 288, 2016.
[24] Whitfield-Gabrieli S and Nieto-Castanon A, "Conn: A functional connectivity toolbox for correlated and anticorrelated brain networks," Brain Connectivity, vol. 2, no. 3, pp. 125–141, 2012.
[25] Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015.
[26] van Amelsvoort T and Hernaus D, "Effect of pharmacological interventions on the fronto-cingulo-parietal cognitive control network in psychiatric disorders: A transdiagnostic systematic review of fMRI studies," Frontiers in Psychiatry, vol. 7, p. 82, 2016.
[27] Zhang S, Li X, Lv J, Jiang X, Guo L, and Liu T, "Characterizing and differentiating task-based and resting state fMRI signals via two-stage sparse representations," Brain Imaging and Behavior, vol. 10, pp. 21–32, 2016.
[28] Chung AW, Pesce E, Monti RP, and Montana G, "Classifying HCP task-fMRI networks using heat kernels," in 2016 International Workshop on Pattern Recognition in Neuroimaging (PRNI), 2016, pp. 1–4.
[29] Liu H, Zhang M, Hu X, Ren Y, Zhang S, Han J, Guo L, and Liu T, "fMRI data classification based on hybrid temporal and spatial sparse representation," in 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), April 2017, pp. 957–960.
[30] Day G, Farb N, Tang-Wai D, Masellis M, Black S, Freedman M, Pollock B, and Chow T, "Salience network resting-state activity: Prediction of frontotemporal dementia progression," JAMA Neurology, vol. 70, no. 10, pp. 1249–1253, 2013.
[31] Gratton C, Laumann TO, Nielsen AN, Greene DJ, Gordon EM, Gilmore AW, Nelson SM, Coalson RS, Snyder AZ, Schlaggar BL et al., "Functional brain networks are dominated by stable group and individual factors, not cognitive or daily variation," Neuron, vol. 98, no. 2, pp. 439–452, 2018.
[32] Wang L, Zhang Z, and Dunson D, "Symmetric bilinear regression for signal subgraph estimation," IEEE Transactions on Signal Processing, vol. 67, no. 7, pp. 1929–1940, 2019.
