Author manuscript; available in PMC: 2020 Jun 1.
Published in final edited form as: IEEE Trans Biomed Eng. 2018 Oct 11;66(6):1549–1558. doi: 10.1109/TBME.2018.2875467

Scalable and Robust Tensor Decomposition of Spontaneous Stereotactic EEG Data

Jian Li 1, Justin P Haldar 2, John C Mosher 3, Dileep R Nair 4, Jorge A Gonzalez-Martinez 5, Richard M Leahy 6
PMCID: PMC6677658  NIHMSID: NIHMS1529898  PMID: 30307856

Abstract

Objective:

Identification of networks from resting brain signals is an important step in understanding the dynamics of spontaneous brain activity. We approach this problem using a tensor-based model.

Methods:

We develop a rank-recursive Scalable and Robust Sequential Canonical Polyadic Decomposition (SRSCPD) framework to decompose a tensor into several rank-1 components. Robustness and scalability are achieved using a warm start for each rank based on the results from the previous rank.

Results:

In simulations we show that SRSCPD consistently outperforms the multi-start alternating least square (ALS) algorithm over a range of ranks and signal-to-noise ratios (SNRs), with lower computation cost. When applying SRSCPD to resting in-vivo stereotactic EEG (SEEG) data from two subjects with epilepsy, we found components corresponding to default mode and motor networks in both subjects. These components were also highly consistent within subject between two sessions recorded several hours apart. Similar components were not obtained using the conventional ALS algorithm.

Conclusion:

Consistent brain networks and their dynamic behaviors were identified from resting SEEG data using SRSCPD.

Significance:

SRSCPD is scalable to large datasets and therefore a promising tool for identification of brain networks in long recordings from single subjects.

Index Terms: Tensor decomposition, dynamic functional connectivity, stereotactic EEG, optimization

I. Introduction

EXPLORING functional connectivity (FC) in resting brain signals is a rich approach to studying brain networks [1]. Of particular recent interest is the dynamic nature of FC [2]. The most commonly used strategy for decoding dynamic FC (DFC) is to compute correlation or coherence using a sliding window [3], [4]. However, using a long temporal window to obtain robust FC estimates inevitably leads to over-smoothing of dynamic changes [5], [6]. To overcome this difficulty, principal component analysis (PCA)-based and independent component analysis (ICA)-based approaches have been proposed. Although they do not introduce temporal smoothing, these methods require either that the time series of the networks be independent (temporal ICA) [7] or that the spatial modes of the networks be disjoint (spatial ICA) [8], [9] or orthogonal (PCA) [10], whereas real networks can overlap and be correlated in both space and time [11].

Tensors are a generalization of matrices in which we can represent data with more than two indices. For example, here we use a third-order tensor to represent SEEG data in terms of space, time and frequency (in the following “tensors” refers to tensors of order 3 or higher). As with matrices, tensors can be represented as a sum of rank-1 components. Structured data often admit low-rank models with respect to this tensor representation. One of the two popular tensor models is the canonical polyadic (CP) form [12], which has been shown [13]–[15] to be equivalent to parallel factors analysis (PARAFAC) [16] and the canonical decomposition (CANDECOMP) [17].

CP is a model that can capture structure inherent in multidimensional data when represented in a low rank tensor of order 3 or higher. Conversely, standard 2D matrix decomposition methods, such as PCA or ICA, applied to matricized or unfolded tensors [15] are typically not able to capture this structure using a similar low-rank model. Moreover, CP decomposition has a unique solution under less restrictive conditions than the orthogonality or independence assumptions implicit in PCA or ICA [18], [19]. This latter property is particularly appealing when analyzing SEEG or other brain data that can be represented as a third or higher-order tensor, since we can avoid the restrictive assumptions of orthogonality or independence between components.

Among the many algorithms for computing the CP decomposition, alternating least squares (ALS) [16], [17] is the most widely used because of its simplicity relative to alternatives [14], [15], [20], [21]. Comparisons in these papers show that, in general, ALS provides solutions of similar quality to other algorithms. While some are more robust to over-factoring than ALS, especially in ill-conditioned cases, this comes at the expense of higher computational complexity in both memory and time [15].

ALS-based CP decomposition has been widely used in EEG analysis by transforming the raw EEG recordings to a time-frequency representation using short-time Fourier or Morlet wavelet transforms and applying a 3-way CP decomposition (channel by time by frequency) [22]. Möck [23] applied CP to event-related potentials (ERP). Miwakeichi et al. [24] analyzed both spontaneous and evoked EEG recordings and showed that theta activity was predominant during a task condition, while alpha activity was observed continuously during both rest and task conditions. CP decomposition has also been applied to ictal EEG recordings from patients with epilepsy. The extracted components have been used to localize the seizure onset zone [25]–[27] as well as to remove artifacts [25], [26]. When applied to group data analysis, CP decomposition and the extracted components/features can be used for a variety of purposes: classification [28], cross-modality comparison [29], and hypothesis testing [30].

The application of CP decomposition to functional MRI (fMRI) data has also been explored, usually at the group level with the goal of finding common factors among subjects [14]. Individual analysis using CP is rarely performed because fMRI data lacks the rich spectral information present in EEG data. Andersen et al. [31] applied CP decomposition to finger-tapping fMRI data. Beckmann et al. [32] proposed a variation of CP which extends probabilistic independent component analysis (PICA) to higher orders by adding an independence constraint in the spatial dimension. While tensor-PICA was originally applied to task fMRI data, Damoiseaux et al. [33] applied it to resting fMRI data and identified 10 networks consistent with previous findings.

In previous studies where CP decomposition was applied to either EEG or fMRI data [24]–[29], [31], [32], [34], [35], the size of the datasets is relatively small (the largest dimension has on the order of $10^4$ elements or less). To explore dynamic functional connectivity, especially large-scale dynamics, we need to explore much larger datasets. There are two issues that have limited studies of this kind: scalability and robustness.

Scalability:

The majority of the studies cited above either truncate the data into short temporal segments (e.g. in EEG ictal and event related potential recordings) or heavily down-sample the data in the spatial domain (e.g. in fMRI studies) or sometimes both. The degrees of freedom (DOF), approximately reflected by the largest dimension in the CP model, are larger ($\geq 10^5$) for a typical 10-minute SEEG recording used in our experiments than in these previous studies. Moreover, as we will show in Section II below, the computational complexity increases approximately quadratically as the DOF increase when using ALS to estimate rank. Additionally, to achieve similar quality of solutions relative to our algorithm, ALS has to be applied with multi-start as shown in Section III, resulting in an even higher complexity. Therefore, in order to compute CP decompositions on data of this size, a fast and efficient algorithm is required.

Robustness (against local minima):

It is well known that the ALS algorithm on a CP model is not guaranteed to converge to a global minimum or a stationary point, even when multi-start is applied during the optimization [14], [15]. The local minimum problem becomes more severe as the number of components increases. Performance is further compromised when a larger number of components than necessary are fit to the data (over-factoring), resulting in splitting rank-1 components into two or more factors.

Several techniques have been explored to improve the robustness and efficiency of the ALS algorithm. For example, Rajih et al. [36] added a line search after each major ALS iteration. Navasca et al. [37] applied Tikhonov regularization on each sub-problem in ALS iteration. However, similar to the ALS alternatives reviewed above, these modifications result in significantly higher computation cost limiting their practical utility, particularly for large scale problems.

In 2D matrix scenarios, Haldar et al. [38] proposed an incremented-rank PowerFactorization (IRPF) approach to solve minimum-rank matrix recovery problems, where higher-rank solutions were obtained recursively using lower-rank results as warm initializations, resulting in substantially improved performance compared to the standard convex optimization approach. Performance was also theoretically characterized later in [39]. In this work, we extend IRPF to higher-order tensors, with the goal of resolving the scalability and robustness issues discussed above. We refer to our approach as “scalable and robust sequential CP decomposition” (SRSCPD). As we show below, this algorithm is more robust than ALS and can be extended to large-scale problems. An outline of the remainder of the paper is as follows. Section II will briefly summarize the notation that will be used in later sections. Section III will describe the SRSCPD framework and the experimental design for both simulation and in-vivo data. Section IV will present the experiment results and conclusions follow in Section V.

II. Notation and Preliminaries

We first define some necessary notation and review the ALS algorithm, which we use as part of our SRSCPD framework in Section III. We largely follow the notational conventions and definitions of Kolda and Bader [14].

A. Scalar, Vector, Matrix and Tensor

A scalar is denoted by a lowercase letter, e.g. $x$; a vector by a bold lowercase letter, e.g. $\mathbf{x}$; a matrix by a bold uppercase letter, e.g. $\mathbf{X}$; and a tensor by a bold script letter, e.g. $\mathcal{X}$. The number of dimensions is called the order and each dimension is referred to as a mode. We use a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ in the following, with individual elements denoted by $x_{i,j,k}$. The notation and algorithms extend naturally to higher-order tensors.

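A rank-1 tensor can be expressed as the outer product of vectors, i.e. $\mathcal{X} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c}$, where “$\circ$” represents the vector outer product. We use the norm:

$$\|\mathcal{X}\| = \sqrt{\sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} x_{i,j,k}^{2}} \tag{1}$$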

B. Matricization

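Tensors can be unfolded or “matricized” into matrix form [14]. Matricization along dimension $n$ is denoted by $\mathbf{X}_{(n)}$, so that a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ can be matricized into $\mathbf{X}_{(1)} \in \mathbb{R}^{I \times JK}$ or $\mathbf{X}_{(2)} \in \mathbb{R}^{J \times IK}$ or $\mathbf{X}_{(3)} \in \mathbb{R}^{K \times IJ}$.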
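The unfolding can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code: the function name `unfold` is hypothetical, and the column ordering (earlier modes vary fastest) is assumed to follow the convention of [14].

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization; columns are ordered so that earlier modes vary fastest."""
    # Move the chosen mode to the front, then flatten the rest in column-major order.
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

I, J, K = 2, 3, 4
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

X1 = unfold(X, 0)  # shape (I, J*K): element (i, j + J*k) equals X[i, j, k]
X2 = unfold(X, 1)  # shape (J, I*K): element (j, i + I*k) equals X[i, j, k]
X3 = unfold(X, 2)  # shape (K, I*J): element (k, i + I*j) equals X[i, j, k]
```

With this ordering, the three unfoldings hold the same entries as the tensor and differ only in which mode indexes the rows.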

C. Kronecker, Khatri-Rao and Hadamard Product

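We use the definitions of the following matrix products defined in [14] and repeated here for convenience. The Kronecker product $\mathbf{X} \otimes \mathbf{Y}$ of matrix $\mathbf{X} \in \mathbb{R}^{I \times J}$ and $\mathbf{Y} \in \mathbb{R}^{K \times L}$ is defined as

$$\mathbf{X} \otimes \mathbf{Y} = \begin{bmatrix} x_{11}\mathbf{Y} & x_{12}\mathbf{Y} & \cdots & x_{1J}\mathbf{Y} \\ x_{21}\mathbf{Y} & x_{22}\mathbf{Y} & \cdots & x_{2J}\mathbf{Y} \\ \vdots & \vdots & \ddots & \vdots \\ x_{I1}\mathbf{Y} & x_{I2}\mathbf{Y} & \cdots & x_{IJ}\mathbf{Y} \end{bmatrix} \tag{2}$$

where $x_{ij}$ is the $(i,j)$-th element of $\mathbf{X}$.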

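The Khatri-Rao product $\mathbf{X} \odot \mathbf{Y}$ of matrix $\mathbf{X} \in \mathbb{R}^{I \times K}$ and $\mathbf{Y} \in \mathbb{R}^{J \times K}$ is the column-wise Kronecker product of $\mathbf{X}$ and $\mathbf{Y}$

$$\mathbf{X} \odot \mathbf{Y} = \begin{bmatrix} \mathbf{x}_1 \otimes \mathbf{y}_1 & \mathbf{x}_2 \otimes \mathbf{y}_2 & \cdots & \mathbf{x}_K \otimes \mathbf{y}_K \end{bmatrix} \tag{3}$$

where $\mathbf{x}_i$ is the $i$-th column of $\mathbf{X}$.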

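The Hadamard product $\mathbf{X} * \mathbf{Y}$ of matrix $\mathbf{X} \in \mathbb{R}^{I \times J}$ and $\mathbf{Y} \in \mathbb{R}^{I \times J}$ is the element-wise matrix product

$$\mathbf{X} * \mathbf{Y} = \begin{bmatrix} x_{11}y_{11} & x_{12}y_{12} & \cdots & x_{1J}y_{1J} \\ x_{21}y_{21} & x_{22}y_{22} & \cdots & x_{2J}y_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ x_{I1}y_{I1} & x_{I2}y_{I2} & \cdots & x_{IJ}y_{IJ} \end{bmatrix}. \tag{4}$$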

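In the following we use the property of the Khatri-Rao product [14]:

$$(\mathbf{X} \odot \mathbf{Y})^{\dagger} = (\mathbf{X}^{T}\mathbf{X} * \mathbf{Y}^{T}\mathbf{Y})^{\dagger} (\mathbf{X} \odot \mathbf{Y})^{T} \tag{5}$$

where $\mathbf{X}^{\dagger}$ represents the Moore-Penrose pseudo-inverse of $\mathbf{X}$.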
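The three products and the pseudo-inverse property in Eq. (5) can be checked numerically. The sketch below uses NumPy with random test matrices; the helper name `khatri_rao` and the matrix sizes are illustrative assumptions, not part of the original work.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (I x K) and Y (J x K), giving (I*J) x K."""
    assert X.shape[1] == Y.shape[1], "both matrices need the same number of columns"
    return np.einsum("ik,jk->ijk", X, Y).reshape(-1, X.shape[1])

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
Y = rng.standard_normal((4, 3))

kron = np.kron(X, Y)              # Kronecker product, Eq. (2): shape (20, 9)
kr = khatri_rao(X, Y)             # Khatri-Rao product, Eq. (3): shape (20, 3)
gram = (X.T @ X) * (Y.T @ Y)      # Hadamard product of the two 3 x 3 Gram matrices

# Eq. (5): the pseudo-inverse of the tall Khatri-Rao product needs only a 3 x 3 pseudo-inverse.
lhs = np.linalg.pinv(kr)
rhs = np.linalg.pinv(gram) @ kr.T
eq5_holds = np.allclose(lhs, rhs)
```

The identity follows because $(\mathbf{X} \odot \mathbf{Y})^{T}(\mathbf{X} \odot \mathbf{Y}) = \mathbf{X}^{T}\mathbf{X} * \mathbf{Y}^{T}\mathbf{Y}$, so the expensive pseudo-inverse is replaced by one of a small square matrix.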

D. CP Decomposition

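CP decomposes a tensor into a sum of rank-1 tensors or components. For a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$

$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r + \mathcal{E} \tag{6}$$

where $\mathbf{a}_r \in \mathbb{R}^{I}$, $\mathbf{b}_r \in \mathbb{R}^{J}$, $\mathbf{c}_r \in \mathbb{R}^{K}$, $R$ is the rank or the number of components and $\mathcal{E}$ is the error tensor. If we group the components in each mode into a matrix, i.e. let $\mathbf{A} = [\mathbf{a}_1\ \mathbf{a}_2\ \cdots\ \mathbf{a}_R] \in \mathbb{R}^{I \times R}$ and similarly for $\mathbf{B} \in \mathbb{R}^{J \times R}$ and $\mathbf{C} \in \mathbb{R}^{K \times R}$, then the CP decomposition can be expressed as [14]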

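$$\mathbf{X}_{(1)} = \mathbf{A}(\mathbf{C} \odot \mathbf{B})^{T} + \mathbf{E}_{(1)} \tag{7}$$

or

$$\mathbf{X}_{(2)} = \mathbf{B}(\mathbf{C} \odot \mathbf{A})^{T} + \mathbf{E}_{(2)} \tag{8}$$

or

$$\mathbf{X}_{(3)} = \mathbf{C}(\mathbf{B} \odot \mathbf{A})^{T} + \mathbf{E}_{(3)} \tag{9}$$

where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ are called the loading matrices for the three modes respectively.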
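As a quick numerical check of Eqs. (6) to (9) in the noise-free case ($\mathcal{E} = 0$), the sketch below builds a small tensor from random loading matrices and compares each unfolding with its matrix form. The helper names, sizes and random data are assumptions made for illustration.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (I x R) and Y (J x R)."""
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

def unfold(X, mode):
    """Mode-n matricization with earlier modes varying fastest along the columns."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
A, B, C = rng.random((I, R)), rng.random((J, R)), rng.random((K, R))

# Eq. (6) with zero error: sum over r of the outer products a_r o b_r o c_r.
X = np.einsum("ir,jr,kr->ijk", A, B, C)

ok7 = np.allclose(unfold(X, 0), A @ khatri_rao(C, B).T)  # Eq. (7)
ok8 = np.allclose(unfold(X, 1), B @ khatri_rao(C, A).T)  # Eq. (8)
ok9 = np.allclose(unfold(X, 2), C @ khatri_rao(B, A).T)  # Eq. (9)
```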

E. Computation of CP decomposition and the ALS algorithm

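Suppose we want to find the best rank $R$ approximation of $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ via

$$\min_{\hat{\mathcal{X}}} \|\mathcal{X} - \hat{\mathcal{X}}\| + g(\hat{\mathcal{X}}) \tag{10}$$

where $\hat{\mathcal{X}} = \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$, $\lambda_r$ represents the scale of component $r$, and $\mathbf{a}_r$, $\mathbf{b}_r$ and $\mathbf{c}_r$ have unit norm. $g(\hat{\mathcal{X}}) = \mu_1 g_1(\mathbf{A}) + \mu_2 g_2(\mathbf{B}) + \mu_3 g_3(\mathbf{C})$ is a regularizing function with $(\mu_1, \mu_2, \mu_3)$ the corresponding regularization parameters. The ALS algorithm solves this problem in an alternating fashion. We first solve for $\mathbf{A}$ with $\mathbf{B}$ and $\mathbf{C}$ fixed, then solve for $\mathbf{B}$ with $\mathbf{A}$ and $\mathbf{C}$ fixed, and so on. This procedure is repeated until some convergence criterion is satisfied. Note that, for quadratic regularizers, each sub-problem reduces to ordinary least squares. Specifically, assume $\mathbf{B}$ and $\mathbf{C}$ are fixed and we are solving for $\mathbf{A}$. Using the equivalent matrix expression discussed above, we can write the optimization problem as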

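$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \|\mathbf{X}_{(1)} - \mathbf{A}(\mathbf{C} \odot \mathbf{B})^{T}\|_{F} + \mu_1 g_1(\mathbf{A}) \tag{11}$$

The solution with $\mu_1 = 0$ (without regularization) reduces to a regular least-squares solution:

$$\hat{\mathbf{A}} = \mathbf{X}_{(1)} \left[(\mathbf{C} \odot \mathbf{B})^{T}\right]^{\dagger} \tag{12}$$

Using the property in Eq. (5), we can rewrite this as

$$\hat{\mathbf{A}} = \mathbf{X}_{(1)} (\mathbf{C} \odot \mathbf{B}) (\mathbf{C}^{T}\mathbf{C} * \mathbf{B}^{T}\mathbf{B})^{\dagger} \tag{13}$$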

This expression is almost always preferable to (12) because it achieves a much lower computational complexity by only calculating the pseudo-inverse of an $R \times R$ matrix. Finally, we normalize each component and set $\lambda_r$ equal to the normalization factor for the $r$th component, $r = 1, \ldots, R$. For the case $\mu_1 \neq 0$, the solution in (12) is replaced by the solution to (11), which will be closed form if $g_1(\mathbf{A})$ is quadratic but may require an iterative solution in other cases. The full ALS algorithm is shown in Algorithm I.
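The unregularized ALS iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm I: the function name `cp_als`, the uniform random initialization, the iteration cap and the stopping rule (change in fit error below a tolerance) are all choices made here for the example.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (I x R) and Y (J x R)."""
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

def unfold(X, mode):
    """Mode-n matricization with the column ordering of Eqs. (7)-(9)."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

def cp_als(X, R, n_iter=500, tol=1e-12, seed=0):
    """Unregularized CP-ALS for a third-order tensor (mu_1 = mu_2 = mu_3 = 0)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, R)) for n in X.shape)  # assumed uniform(0, 1) start
    X1, X2, X3 = (unfold(X, n) for n in range(3))
    norm_x, prev_err = np.linalg.norm(X), np.inf
    for _ in range(n_iter):
        # Each update is Eq. (13): only an R x R pseudo-inverse is required.
        A = X1 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X2 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = X3 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
        err = np.linalg.norm(X1 - A @ khatri_rao(C, B).T)
        if abs(prev_err - err) <= tol * norm_x:  # assumed convergence criterion
            break
        prev_err = err
    # Normalize each column to unit norm and collect the scales in lambda.
    lam = np.ones(R)
    for M in (A, B, C):
        norms = np.linalg.norm(M, axis=0)
        norms[norms == 0] = 1.0
        M /= norms
        lam *= norms
    return A, B, C, lam, err
```

For example, `A, B, C, lam, err = cp_als(X, R=3)` returns unit-norm loading matrices, the component scales and the final fit error. Constraints or regularizers would replace the three closed-form updates with the corresponding constrained sub-problem solvers.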

III. Materials and Methods

A. SRSCPD Framework

The best rank-r approximation of a matrix with respect to the Frobenius norm is given by the leading r factors of the SVD. This is not the case for CP decomposition of a higher-order tensor. Kolda [40] showed an example where the best rank-1 approximation is not part of the best rank-2 approximation of a tensor. As a result, components in the CP decomposition for a given desired rank should be found simultaneously. Smilde et al. [41] (example 4.3) showed that the naïve sequential CP, in which a rank-1 tensor is fit to the residue at each iteration, failed to extract the correct components even when the data are known to be perfectly trilinear. Interestingly, this greedy sequential approach is still frequently used, simply because it is the most tractable approach to fitting tensor models to large datasets [42], [43].

The determination of tensor rank is NP-hard [44]. Many metrics have been proposed to help find the correct rank, e.g. the core consistency diagnostic (CORCONDIA) [45], difference in fit (DIFFIT) [46] and automatic relevance determination (ARD) [47]. All these metrics require a set of decomposition results for all ranks up to the maximum rank $R$. Obtaining such a set of solutions using CP decomposition is quadratically more complex than finding a rank-1 approximation, as we need to compute the decompositions for each rank $r = 1, 2, \ldots, R$ separately ($1 + 2 + \cdots + R = O(R^2)$). This represents a significant challenge to the use of higher rank tensor models and was the primary motivation for our development of the SRSCPD framework.
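As a worked instance of this count (a sketch that charges one unit of work per fitted component, which is a simplifying assumption made only for illustration), the total over all ranks is:

```latex
\sum_{r=1}^{R} r \;=\; \frac{R(R+1)}{2} \;=\; O(R^{2}),
\qquad \text{e.g. } R = 10 \;\Rightarrow\; \frac{10 \cdot 11}{2} = 55 \text{ component fits, versus } 1 \text{ for a single rank-1 fit.}
```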

The SRSCPD framework is built on the original ALS algorithm. Our goal is to compute a rank-recursive set of decompositions from rank 1 to rank R. Our approach uses the result for rank r to initialize the decomposition for rank r + 1. Initialization for the additional component is found by fitting a rank-1 tensor to the residual from the rank r fit. In contrast, the original ALS algorithm does not use any information from rank r when fitting a model of rank r + 1. This “warm start” greatly improves convergence speed relative to the standard ALS algorithm except in very low rank cases (see simulation result in Fig. 4). The warm start may also help to avoid poor local minima in this non-convex optimization problem. As a result, we are able to address the problems with robustness and scalability for large-scale datasets.

Figure 4:

Simulation results. Boxplots of the run time in seconds over 100 Monte Carlo trials are shown as a function of R. M denotes the number of random initializations when using the original ALS algorithm. The top-left panel shows a zoomed-in view for easier comparison at lower ranks.

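The full SRSCPD framework is shown in Algorithm II for a third-order tensor example. The inputs of the algorithm are a tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ and the desired maximum rank $R$. For each iteration $r$, a rank-$r$ approximation is calculated using the original CP-ALS algorithm with initializations $\{\mathbf{A}^{*}, \mathbf{B}^{*}, \mathbf{C}^{*}, \boldsymbol{\lambda}^{*}\}$. The initializations are formed by concatenating the solutions $\{\mathbf{A}_{r-1}, \mathbf{B}_{r-1}, \mathbf{C}_{r-1}, \boldsymbol{\lambda}_{r-1}\}$ from the previous iteration $r - 1$ with the rank-1 approximation $\{\mathbf{a}', \mathbf{b}', \mathbf{c}', \lambda'\}$ of the residue tensor $\mathcal{X}_{res}$, where $\mathcal{X}_{res}$ is obtained by subtracting the reconstructed tensor using $\{\mathbf{A}_{r-1}, \mathbf{B}_{r-1}, \mathbf{C}_{r-1}, \boldsymbol{\lambda}_{r-1}\}$ from the original data tensor $\mathcal{X}$.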
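The rank-recursive warm start described above can be sketched as follows. This is an illustrative outline, not Algorithm II itself: it uses plain unconstrained ALS with a fixed number of iterations, absorbs the scales $\boldsymbol{\lambda}$ into the first loading matrix, and starts each rank-1 fit of the residue from a uniform random vector. All of these, and the function names, are assumptions of the example.

```python
import numpy as np

def khatri_rao(X, Y):
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

def unfold(X, mode):
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

def reconstruct(A, B, C):
    """Sum of rank-1 components (scales absorbed into the factors)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def als(X, init, n_iter=100):
    """Plain CP-ALS started from init = (A, B, C), using the update of Eq. (13)."""
    A, B, C = (M.copy() for M in init)
    X1, X2, X3 = (unfold(X, n) for n in range(3))
    for _ in range(n_iter):
        A = X1 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X2 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = X3 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
    return A, B, C

def srscpd(X, R_max, seed=0):
    """Return a list of rank-1 ... rank-R_max solutions, each warm-started from the last."""
    rng = np.random.default_rng(seed)
    solutions, prev = [], None
    for r in range(1, R_max + 1):
        # Residue of the previous fit (the data itself when r = 1).
        X_res = X if prev is None else X - reconstruct(*prev)
        # Rank-1 fit of the residue supplies the initialization of the new component.
        new = als(X_res, tuple(rng.random((n, 1)) for n in X.shape))
        if prev is None:
            init = new
        else:
            init = tuple(np.hstack([P, q]) for P, q in zip(prev, new))
        prev = als(X, init)  # refine all r components jointly
        solutions.append(prev)
    return solutions
```

Because every rank reuses the previous solution, the whole set of decompositions needed by rank-selection metrics comes out of a single pass, and with exact least-squares updates the fit error cannot increase from one rank to the next.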

SRSCPD is flexible in the sense that techniques that have been proposed to improve the ALS algorithm can be directly incorporated. For example, one can add a line search along the estimated gradient descent direction for each mode at the end of each major iteration of ALS [36]. Moreover, constraints and regularization terms can be applied to each of the ALS sub-problems, e.g. non-negativity, sparsity, and smoothness.

B. Simulation

We simulated SEEG data [48] with 100 channels, 200 Hz sampling rate, 2 second duration for ranks from R = 1 to 10. In each component a total of N channels were co-activated where N was chosen randomly between 2 and 10. For each of the active channels for each component we generated a time series to represent a block activation pattern with the signal switching on and off, respectively, in active and inactive blocks. The number of active blocks over the 2-second period was selected randomly between 2 and 5 and both the minimum block length and the minimum interval between any adjacent activated blocks was set to 0.1 second. Within each active block, the signals in each component were unit amplitude sinusoids with frequencies chosen randomly between 10 and 80 Hz. Finally, we added white Gaussian noise to the simulated data with a range of SNRs.

The third-order tensor $\mathcal{X}$ was generated by calculating the magnitude squared of the complex Morlet wavelet transform (MWT) coefficients of the simulated data matrix with center frequency 1 Hz, time resolution full-width-half-maximum (FWHM) of 2 seconds [49] in a linearly spaced frequency range from 1 to 100 Hz with interval 1 Hz. Thus, the final tensor $\mathcal{X}$ has dimensions $I \times J \times K$, where $I = 100$, $J = 400$, $K = 100$. An example of the model used to simulate the data is shown in Fig. 1. Note that overlaps between components may occur in any of the three modes.

Figure 1:

An example of the simulated data with 5 components. Each component is represented by a distinct color in all three modes. From left to right: The channel (spatial) mode shows the activated channels that participate in each network; The time (temporal) mode shows the block activation pattern for each network; The spectral mode shows the Morlet wavelet frequency spectrum for each network.

We first compared the robustness of the decomposition using the SRSCPD framework against ALS using 1, 2 and 5 random initializations (generated from a standard uniform distribution in the interval (0,1) using MATLAB (The MathWorks, Inc., Natick, MA, USA) function “rand”). The same convergence criterion was used for both algorithms and in all cases, we computed solutions from rank 1 to $R$. In both algorithms we used a non-negativity constraint on all loading matrices, because the squared magnitude of the wavelet coefficients is naturally non-negative and the constraint helps avoid degeneracy [50]. Let $\mathbf{A} \in \mathbb{R}^{I \times R}$, $\mathbf{B} \in \mathbb{R}^{J \times R}$, $\mathbf{C} \in \mathbb{R}^{K \times R}$ be the loading matrices in each of the three modes as described in Section II. Then in each sub-problem of the ALS, we used the following cost function for $\mathbf{A}$ (likewise for $\mathbf{B}$ and $\mathbf{C}$)

$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \|\mathbf{X}_{(1)} - \mathbf{A}(\mathbf{C} \odot \mathbf{B})^{T}\|_{F} \quad \text{s.t. } \mathbf{A} \geq 0 \tag{14}$$

where “$\geq$” denotes the element-wise inequality.

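Since we know the ground truth under the simulated settings, we assessed the quality of the solutions using the averaged congruence product (ACP) [51]. ACP is a measure of correlations between components. Specifically, let $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ be the column-wise normalized ground truth loading matrices and $\hat{\mathbf{A}}$, $\hat{\mathbf{B}}$, $\hat{\mathbf{C}}$ their estimated counterparts. Then the ACP is defined per [51] as

$$\mathrm{ACP} = \max_{\mathbf{P}} \operatorname{tr}\left( \left( (\mathbf{A}^{T}\hat{\mathbf{A}}) * (\mathbf{B}^{T}\hat{\mathbf{B}}) * (\mathbf{C}^{T}\hat{\mathbf{C}}) \right) \mathbf{P} \right) \tag{15}$$

where $\mathbf{P}$ is a permutation matrix accounting for the ambiguity of the ordering of the solutions [16] and $\operatorname{tr}(\mathbf{X})$ indicates the trace of $\mathbf{X}$.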
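A direct way to evaluate Eq. (15) for small $R$ is to enumerate all permutations. The sketch below does this in NumPy; the function names are hypothetical, the estimated factors are normalized column-wise inside the function (an assumption), and the brute-force search over $R!$ orderings is only practical for small ranks.

```python
import numpy as np
from itertools import permutations

def normalize_columns(M):
    """Scale every column to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

def congruence_product(A, B, C, A_hat, B_hat, C_hat):
    """max over permutations P of tr(((A^T A_hat) * (B^T B_hat) * (C^T C_hat)) P)."""
    A, B, C, A_hat, B_hat, C_hat = map(normalize_columns, (A, B, C, A_hat, B_hat, C_hat))
    S = (A.T @ A_hat) * (B.T @ B_hat) * (C.T @ C_hat)  # R x R congruence products
    R = S.shape[0]
    # tr(S P) picks one entry per row and column of S; try every such pairing.
    return max(sum(S[r, p[r]] for r in range(R)) for p in permutations(range(R)))

rng = np.random.default_rng(0)
A, B, C = rng.random((6, 3)), rng.random((5, 3)), rng.random((4, 3))
perm = [2, 0, 1]
# Estimates that equal the truth up to column order and scale should score R = 3.
score = congruence_product(A, B, C, 2 * A[:, perm], 3 * B[:, perm], C[:, perm])
```

The value returned here is the trace in Eq. (15) as written; dividing it by $R$ would give a per-component average. For larger ranks, a linear assignment solver would find the same maximum without enumerating permutations.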

We evaluated the ACP of the solutions obtained from both ALS and SRSCPD as a function of $R$ for SNR = 10. For each $R$, we ran 100 Monte Carlo trials and boxplots of ACP were generated. For each simulated tensor, we repeated ALS $M$ times, where $M$ = 1, 2, and 5, each time using a different random initialization. The final solution was selected as that which has the lowest cost. We also box-plotted the Frobenius norm error as shown in Eq. 10 for each trial. Additionally, we recorded the run time for each of the methods. We then repeated the above study, but instead of varying $R$ we conducted the experiment as a function of SNR with $R = 5$.

C. Application to In-Vivo SEEG Dataset

We performed retrospective analysis of patient data collected under an Institutional Review Board approved protocol for SEEG evaluation and monitoring in the Epilepsy Center, Cleveland Clinic, OH, USA. The SEEG evaluation performs invasive pre-surgical electrophysiological mapping for patients who have pharmaco-resistant focal epilepsy. For each patient, the implantation was performed using multi-lead depth electrodes, with each electrode comprising typically 10 contacts spaced a few millimeters apart (AdTech, Racine, Wisconsin; Integra, Plainsboro, New Jersey; or PMT, Chanhassen, Minnesota). The electrode locations were determined after a multidisciplinary patient management conference where the hypotheses about the epileptogenic zone were drawn based on available noninvasive data: clinical history, video EEG, MRI, PET, ictal SPECT and MEG. The electrodes were implanted according to the Talairach stereotactic method using orthogonal or oblique trajectories [52]. The implantation schemes are shown in Table I. The SEEG signals were recorded using a common reference on a Nihon Kohden EEG system with a sampling rate of 1000 Hz.

TABLE I:

Summary of the Patient Data

| Subject ID | 1 | 2 |
|---|---|---|
| # of Channels | 69 | 113 |
| Epilepsy Type | Posterior cingulate epilepsy | Left fronto-parietal |
| Data Segment Time | 09:23:00–09:33:00; Same Day: 13:22:02–13:32:02 | 21:50:30–22:00:30; Next Day: 08:35:20–08:45:20 |
| Implantation Scheme | (graphic: nihms-1529898-t0002.jpg) | (graphic: nihms-1529898-t0003.jpg) |

We chose two 10-minute data segments a minimum of 4 hours apart (see Table I) for each patient. The segments were selected using annotated video of the patients for periods of physical inactivity (e.g. reading, watching TV). Using co-registration of the post-implant X-ray CT to the patients’ MR image, we selected the subset of the SEEG contacts that were in gray matter.

For each data segment we applied the MWT with center frequency 1 Hz and time resolution FWHM of 2 seconds in a linearly spaced frequency range of 1 to 100 Hz with interval 1 Hz. We computed the squared magnitude of the wavelet coefficients and temporally down-sampled the resulting envelope data in each frequency band by a factor of 5, resulting in a new envelope sampling rate of 200 Hz. We performed a flattening of the power spectrum to compensate for its “1/f” characteristics to emphasize the higher frequency components. The resulting data were then represented as a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, where $I$ is the number of channels (see Table I), $J = 120{,}000$ is the number of time points and $K = 100$ is the number of frequency bins.

We applied SRSCPD to each tensor with three additional constraints: non-negativity on all three modes due to the non-negative squared magnitude of the wavelet coefficients; a sparsity constraint on the spatial (channel) mode as we assumed that in each network only a small set of channels would be involved; and a smoothness constraint on the spectral mode reflecting the limited frequency resolution of the MWT. Let $\mathbf{A} \in \mathbb{R}^{I \times R}$, $\mathbf{B} \in \mathbb{R}^{J \times R}$, $\mathbf{C} \in \mathbb{R}^{K \times R}$ be the loading matrices for each of the three modes. Then in the sub-problems of the ALS, we use the following cost functions:

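$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \|\mathbf{X}_{(1)} - \mathbf{A}(\mathbf{C} \odot \mathbf{B})^{T}\|_{F}^{2} + \mu_1 \sum_{i} \|\mathbf{A}_{\cdot i}\|_{l_1}, \quad \text{s.t. } \mathbf{A} \geq 0 \tag{16}$$

$$\hat{\mathbf{B}} = \arg\min_{\mathbf{B}} \|\mathbf{X}_{(2)} - \mathbf{B}(\mathbf{C} \odot \mathbf{A})^{T}\|_{F}^{2}, \quad \text{s.t. } \mathbf{B} \geq 0 \tag{17}$$

$$\hat{\mathbf{C}} = \arg\min_{\mathbf{C}} \|\mathbf{X}_{(3)} - \mathbf{C}(\mathbf{B} \odot \mathbf{A})^{T}\|_{F}^{2} + \mu_3 \sum_{k} \|\nabla \mathbf{C}_{\cdot k}\|_{l_2}^{2}, \quad \text{s.t. } \mathbf{C} \geq 0 \tag{18}$$

where $\mathbf{X}_{\cdot i}$ denotes the $i$th column of $\mathbf{X}$; $\mu_1$ and $\mu_3$ are the regularization parameters, which were set to 0.2 empirically; $l_1$ and $l_2$ denote the $l_1$ norm and $l_2$ norm, respectively; and $\nabla$ is the finite difference operator on the columns of $\mathbf{C}$. We solved each of the convex sub-problems (16) – (18) using Auslender and Teboulle’s single-projection algorithm [53] in the TFOCS toolbox [54].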
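To illustrate how a sub-problem such as (16) can be solved, the sketch below uses a basic proximal-gradient iteration. This is a swapped-in technique chosen for brevity and is not the Auslender and Teboulle algorithm or the TFOCS toolbox used in the study; the function names, the zero starting point, the fixed iteration count and the step size $1/L$ with $L = 2\lambda_{\max}(\mathbf{Z}^{T}\mathbf{Z})$ are assumptions of the example, where $\mathbf{Z}$ stands for $\mathbf{C} \odot \mathbf{B}$.

```python
import numpy as np

def objective(X1, Z, A, mu):
    """Cost of Eq. (16) for a non-negative A: squared Frobenius fit plus the l1 penalty."""
    return np.linalg.norm(X1 - A @ Z.T) ** 2 + mu * np.abs(A).sum()

def sparse_nonneg_ls(X1, Z, mu, n_iter=500):
    """Proximal gradient for min_A ||X1 - A Z^T||_F^2 + mu * sum|A|, subject to A >= 0."""
    G = Z.T @ Z                      # R x R Gram matrix
    XZ = X1 @ Z                      # I x R cross term
    step = 1.0 / (2.0 * np.linalg.eigvalsh(G)[-1])  # 1 / Lipschitz constant of the gradient
    A = np.zeros((X1.shape[0], Z.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * (A @ G - XZ)    # gradient of the squared-error term
        # Prox of (l1 penalty + non-negativity): shift by step * mu, then clip at zero.
        A = np.maximum(A - step * (grad + mu), 0.0)
    return A

rng = np.random.default_rng(0)
Z = rng.random((12, 3))              # stands in for the Khatri-Rao product of C and B
A_true = rng.random((5, 3))
X1 = A_true @ Z.T
A_hat = sparse_nonneg_ls(X1, Z, mu=0.2)
```

Sub-problem (17) corresponds to the same iteration with `mu=0`. Sub-problem (18) would add the gradient of the quadratic smoothness term to `grad` and keep only the clipping step.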

Finally, the rank of the tensor was estimated based on the decomposition results using CORCONDIA [45]. This rank metric uses the fact that the trilinearity of components starts decreasing in the case of over-factoring (the number of fitted components is greater than the actual rank).
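One common way to compute a core-consistency value is sketched below, expressed as a fraction so that 1.0 means perfectly trilinear. This follows the general recipe of comparing a least-squares Tucker core with an ideal superdiagonal core and is an assumption about the form of the metric, not a transcription of the implementation used here. It also assumes the component scales are absorbed into the loading matrices and that the loading matrices have full column rank.

```python
import numpy as np

def corcondia(X, A, B, C):
    """Core consistency as a fraction: 1 - ||G - T||^2 / R, with T the superdiagonal core."""
    R = A.shape[1]
    # Least-squares core for fixed factors: multiply each mode by the factor's pseudo-inverse.
    G = np.einsum("ijk,pi,qj,rk->pqr", X,
                  np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C))
    T = np.zeros((R, R, R))
    idx = np.arange(R)
    T[idx, idx, idx] = 1.0           # ideal CP core: ones on the superdiagonal
    return 1.0 - np.sum((G - T) ** 2) / R

rng = np.random.default_rng(0)
A, B, C = rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2))
X = np.einsum("ir,jr,kr->ijk", A, B, C)   # exactly trilinear rank-2 tensor
value = corcondia(X, A, B, C)             # close to 1.0 for the true factors
```

In a rank scan, the value is computed for the fitted factors at each candidate rank, and the rank is chosen before the value drops.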

IV. Results

A. Simulation

Fig. 2 shows performance of ALS vs SRSCPD as a function of rank R. For small R all results are similar. However, for larger ranks we see that the ALS results are strongly dependent on initialization and that performance for M = 5 is significantly better than for M = 2 and M = 1. SRSCPD benefits from using the results of the lower rank as an initialization, resulting in overall improved performance (higher median ACP) relative to all three versions of ALS.

Figure 2:

Simulation results. Boxplots of ACP over 100 Monte Carlo trials are shown as a function of R. M denotes the number of random initializations when using the original ALS algorithm.

Fig. 3 shows the corresponding Frobenius norm error. As with Fig. 2, ALS with M = 1 and 2 shows larger error than M = 5, with the difference increasing with rank. In contrast to the ACP metric in Fig. 2, the error for ALS with M = 5 in Fig. 3 is very similar to that for SRSCPD and sometimes smaller for higher ranks. Closer examination of these results revealed that in cases where this occurs, ALS fails to find one or more of the weaker components that SRSCPD does find. Instead, part of the noise in the data is fit to one of the tensor components. This in turn leads to a lower squared error in the fit even though the extracted components are a poorer fit to the ground truth as measured with ACP.

Figure 3:

Simulation results. Boxplots of the Frobenius norm error over 100 Monte Carlo trials are shown as a function of R. M denotes the number of random initializations when using the original ALS algorithm.

Fig. 4 shows the run time as a function of R (the run time was measured using MATLAB on a Dell Precision T3610 computer with an Intel Xeon E5-1650 v2 CPU). As expected, the ratios of the run time among the ALS methods are approximately proportional to M, the number of different initializations. The cost of SRSCPD is significantly lower than that for ALS with M = 2 and 5 restarts. As the rank increases (R > 4), the cost for SRSCPD is even lower than that for ALS without restart, M = 1. The reason for this is that the warm start in SRSCPD produces a better initialization that not only results in improved performance (Fig. 2) but also faster convergence of the ALS sub-problems as shown in Fig. S1 in the supplemental material.

Fig. 5 shows that as the SNR increases, ACP also improves for all methods. SRSCPD shows generally similar performance to ALS with M = 5 restarts and is substantially better than results for M = 1. However, for lower SNRs, the performance of ALS with M = 5 restarts is superior to SRSCPD.

Figure 5:

Simulation results. Boxplots of the ACP over 100 Monte Carlo trials are shown as a function of SNR. M denotes the number of random initializations when using the original ALS algorithm.

B. In-Vivo SEEG Dataset

1). Estimation of Rank

Fig. 6 shows plots of the CORCONDIA rank metric as a function of R for the two sessions for both subjects. Per recommendations in [45], rank should be chosen so that the CORCONDIA value is higher than 0.9. Based on the plots in Fig. 6, we selected R = 3 and 4 for the two sessions of Subject 1 and R = 5 and 3 for the two sessions of Subject 2.

Figure 6:

The CORCONDIA rank metric is shown as a function of rank R for two sessions of both subjects.

2). Intra-subject Network Comparison

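We found corresponding consistent components (CCs) across the two sessions as the pair of components that had the largest product of spatial congruence and spectral congruence, i.e. for each component $(\mathbf{a}_i^{1}, \mathbf{b}_i^{1}, \mathbf{c}_i^{1})$ in the first session, we found $(\mathbf{a}_j^{2}, \mathbf{b}_j^{2}, \mathbf{c}_j^{2})$ in the second session such that

$$j = \arg\max_{j} \operatorname{tr}\left( \left( {\mathbf{a}_i^{1}}^{T} \mathbf{a}_j^{2} \right) * \left( {\mathbf{c}_i^{1}}^{T} \mathbf{c}_j^{2} \right) \right), \quad i = 1, \ldots, R \tag{19}$$

where $\mathbf{a}_i^{k}, \mathbf{b}_i^{k}, \mathbf{c}_i^{k}$ represent the $i$th spatial, temporal and spectral component of the $k$th ($k = 1, 2$) session, respectively.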
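The matching rule in Eq. (19) amounts to a row-wise argmax over a matrix of congruence products. A minimal NumPy sketch follows; the function name, the column normalization inside the function and the synthetic test data are assumptions made for illustration.

```python
import numpy as np

def match_components(A1, C1, A2, C2):
    """For each session-1 component i, return the session-2 index j maximizing
    (spatial congruence) x (spectral congruence), together with that product."""
    unit = lambda M: M / np.linalg.norm(M, axis=0, keepdims=True)
    S = (unit(A1).T @ unit(A2)) * (unit(C1).T @ unit(C2))  # R1 x R2 congruence products
    j = np.argmax(S, axis=1)
    return j, S[np.arange(S.shape[0]), j]

rng = np.random.default_rng(0)
A1, C1 = rng.random((20, 3)), rng.random((100, 3))   # session 1: 3 components
order = [1, 2, 0]
# Session 2: the same three components in a different order, plus one unrelated component.
A2 = np.hstack([A1[:, order], rng.random((20, 1))])
C2 = np.hstack([C1[:, order], rng.random((100, 1))])

j, products = match_components(A1, C1, A2, C2)
```

The temporal mode is left out of the product because temporal congruence across sessions is not expected, and the two sessions may have different ranks. Pairs whose product falls below a chosen threshold can then be treated as mismatched.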

Note that we do not expect temporal congruence across sessions. For Subject 1, we found three CCs with large congruence product (>0.6). Fig. 7 shows two of these components; the third is included in the supplemental material, Fig. S2. The congruence was 0.892 (spatial mode) and 0.997 (spectral mode) for component (a) and 0.984 (spatial mode) and 0.967 (spectral mode) for (b).

Figure 7:

Two consistent components for Subject 1 show: (a) locations consistent with activity in the default mode network within the alpha band and (b) Motor activity in the beta frequency range. For each pair of consistent components, we show modes for session 1 in red and session 2 in blue in the top row. From left to right: The channel (spatial) mode shows the activated channels that participate in each network; The time (temporal) mode shows the dynamic variations of each network (only the first 10 seconds is shown for better visualization); The spectral mode shows the frequency-dependent component of the tensor. In the bottom row the left and middle sub-figures show the spatial distribution of the activated channels mapped onto the subject’s smoothed cortical surface. For visualization purposes, a contact or channel is defined as activated if the value of the (normalized) channel mode at that contact exceeds a threshold of 0.05 in both sessions. The right sub-figure shows the Welch power spectrum of the temporal mode.

The first network for Subject 1 includes contacts in angular gyrus, mid-temporal gyrus, and precuneus, all consistent with activation in the Default Mode Network (DMN) [55]. Electrodes were not present in this subject for other regions that are typically included in the DMN, such as the superior frontal gyrus (SFG). The spectrum for this component is dominated by alpha rhythms, which is consistent with studies of the DMN in the EEG/MEG literature [56]–[58]. The second network we found for this subject contains contacts mainly in the somatosensory and motor cortex with a peak in the beta band, which is consistent with sensorimotor rhythms as reported in the EEG/ECoG literature [59], [60]: beta activity appears most strongly in both motor and somatosensory cortex in movement preparation periods and steady contraction periods following a movement. These subjects were monitored during periods of natural inactivity when we might expect some amount of motor activity, for example page-turning while reading.

The temporal mode shows the power envelope of the SEEG signals, i.e. the dynamic variation in power across time of the spectral model. The bottom right sub-figure shows the power spectrum of this envelope estimated using the Welch method [61] after high-pass filtering with a cutoff frequency of 0.02 Hz to remove DC drift. For both components, the power spectral density peaks at a frequency of approximately 0.1 Hz (0.05–0.15 Hz), which is similar to the dominant frequency found in resting fMRI BOLD (Blood Oxygen Level Dependent) oscillations [62].
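As an illustration of the envelope analysis described above, the following Python sketch high-pass filters a temporal mode and estimates its power spectrum with Welch's method. The sampling rate, filter order, segment length, and the synthetic envelope are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch


def envelope_spectrum(envelope, fs, cutoff_hz=0.02, segment_s=100.0):
    """High-pass filter a power envelope, then estimate its PSD (Welch)."""
    # Butterworth high-pass in second-order sections (order 2 is an assumption).
    sos = butter(2, cutoff_hz, btype="highpass", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase distortion.
    filtered = sosfiltfilt(sos, envelope)
    # Segment length sets the frequency resolution (1 / segment_s Hz).
    nperseg = min(len(filtered), int(segment_s * fs))
    freqs, psd = welch(filtered, fs=fs, nperseg=nperseg)
    return freqs, psd


# Synthetic envelope (hypothetical): slow drift + 0.1 Hz oscillation + noise.
fs = 10.0  # Hz, assumed sampling rate of the temporal mode
t = np.arange(0, 600, 1 / fs)
rng = np.random.default_rng(0)
envelope = (1.0 + 0.002 * t
            + 0.5 * np.sin(2 * np.pi * 0.1 * t)
            + 0.1 * rng.standard_normal(t.size))

freqs, psd = envelope_spectrum(envelope, fs)
peak_hz = freqs[np.argmax(psd)]  # frequency at which the PSD is largest
print(f"PSD peak at {peak_hz:.3f} Hz")
```

With real data, `envelope` would be one column of the temporal factor and `fs` the rate at which that factor is sampled; the segment length trades frequency resolution against variance of the estimate.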

For Subject 2, we also found three CCs. Fig. 8 shows two of the three; the remaining one is shown in Fig. S3. Again, the similarity between sessions was high, with congruence 0.652 (spatial mode) and 0.959 (spectral mode) for the DMN and 0.799 (spatial mode) and 0.957 (spectral mode) for the somatomotor network. In this subject there are electrodes in the SFG, unlike the first subject, and we now observe that the DMN does indeed include SFG. As with the first subject, we see a second strong component in the somatosensory and motor cortex. However, unlike the first subject, the signal in this case is predominantly mu rather than beta. Again, this is consistent with periods of natural inactivity, where mu rhythms are typically observed in somatomotor cortex in parallel with alpha activity in the visual cortex during resting [63].
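To make the session-to-session comparison concrete, the sketch below computes congruence between component modes of two sessions and uses the product of spatial and spectral congruence to pair components. It assumes that congruence is the normalized inner product (Tucker's congruence coefficient) and that the congruence product is the elementwise product of the spatial and spectral congruence matrices; the exact definition is given by the paper's Eq. 19, which is not reproduced here. All function and variable names are hypothetical.

```python
import numpy as np


def pairwise_congruence(X, Y):
    """Congruence (cosine similarity) between every column of X and of Y."""
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=0, keepdims=True)
    return Xn.T @ Yn  # entry (i, j): column i of X vs. column j of Y


def congruence_product(A1, C1, A2, C2):
    """Spatial x spectral congruence for all component pairs.

    A1, A2: channels x R spatial modes for sessions 1 and 2.
    C1, C2: frequencies x R spectral modes for sessions 1 and 2.
    """
    return pairwise_congruence(A1, A2) * pairwise_congruence(C1, C2)


# Synthetic example: session 2 is a permuted, slightly perturbed copy.
rng = np.random.default_rng(0)
A1, C1 = rng.random((40, 3)), rng.random((25, 3))
perm = [2, 0, 1]  # column j of session 2 comes from column perm[j] of session 1
A2 = A1[:, perm] + 0.05 * rng.random((40, 3))
C2 = C1[:, perm] + 0.05 * rng.random((25, 3))

P = congruence_product(A1, C1, A2, C2)
best_match = P.argmax(axis=0)  # best session-1 partner for each session-2 component
print(best_match, P.max(axis=0).round(3))
```

A pair would then be accepted as a consistent component only if its product exceeds a chosen threshold; the threshold itself is a design choice not fixed by this sketch.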

Figure 8:

Two consistent components for Subject 2: (a) default mode network in alpha frequency and (b) motor network centered on the mu range. Details as for Fig. 6.

3). Inter-subject network comparison

When comparing the results between Subject 1 and Subject 2, we found that in both cases the DMN activity is dominated by alpha activity while the motor network is predominantly beta or mu, despite the fact that the two subjects had different electrode implantation schemes and different locations of their epileptogenic zones (see Table II).

4). Artifact Detection

SRSCPD can be used not only to identify brain networks but also to detect artifacts. For example, in the supplemental material, Fig. S4 shows a component that was mismatched (i.e. low spatial and spectral congruence) between the two sessions for Subject 1. Similarly, Fig. S5 shows one of the components that was mismatched between sessions for Subject 2. These mismatched components are likely artifacts, as they mostly contain a burst or bursts of activity on a very limited number of channels and the time courses do not look physiological in nature.

5). Comparison to Results Using ALS Algorithm

We also applied the traditional ALS algorithm to the same datasets and compared the components obtained from the two methods. Using the same rank selection criterion, ALS did not find as many functionally distinct networks as SRSCPD did in each individual session. With the ALS algorithm, the rank was estimated to be R = 2 and 3 for the two sessions of Subject 1 and R = 3 and 3 for the two sessions of Subject 2, as shown in Fig. S6. Moreover, we found only one CC (a DMN shown in Fig. S7) between the two sessions in one of the subjects. No other CCs were found, as the maximum congruence product (Eq. 19) was less than 0.15 between all other pairs of components. These results show that ALS is not as robust as the SRSCPD algorithm, especially when SNR is low. See supplemental materials (Fig. S6–S16) for details.

V. Conclusion

We have described a novel framework for decomposition of electrophysiological data using a third-order tensor. Our SRSCPD approach is based on the original ALS algorithm, using a warm-start in a rank-recursive search to both improve the quality of results and reduce computation cost relative to conventional ALS with multi-start. The SRSCPD framework is scalable to large datasets due to its use of the warm start. We have shown its application to SEEG data in two subjects with epilepsy and found two consistent brain networks across different sessions of recordings several hours apart and in two different subjects. In contrast, ALS found only one consistent network (the default mode) in one subject. This consistency shows promise for the use of SRSCPD for robust identification of spontaneous brain network activity from invasively recorded EEG in individual subjects.
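The warm-start idea summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SRSCPD implementation: it uses plain unconstrained ALS, and it builds the rank-(r+1) starting point by appending a rank-1 fit of the current residual to the rank-r solution, which is one plausible warm start among several. The function names and the synthetic tensor are hypothetical.

```python
import numpy as np


def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (I x R) and Y (J x R)."""
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])


def unfold(T, mode):
    """Mode-n unfolding; remaining axes are flattened in C order."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def reconstruct(factors):
    """Sum of rank-1 terms defined by factor matrices A, B, C."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def cp_als(T, factors, n_iter=200):
    """Plain ALS refinement of a CP model starting from `factors`."""
    A, B, C = (F.copy() for F in factors)
    for _ in range(n_iter):
        # Each update is a linear least-squares solve with the others fixed.
        A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C


def rank_recursive_cp(T, max_rank, seed=0):
    """Fit ranks 1..max_rank, warm-starting each rank from the previous one."""
    rng = np.random.default_rng(seed)
    solutions, factors = {}, None
    for r in range(1, max_rank + 1):
        residual = T if factors is None else T - reconstruct(factors)
        # Rank-1 fit of the residual supplies the extra component (assumption).
        new = cp_als(residual, [rng.random((n, 1)) for n in T.shape], n_iter=50)
        init = new if factors is None else [np.hstack([F, f]) for F, f in zip(factors, new)]
        factors = cp_als(T, init)
        solutions[r] = factors
    return solutions


# Synthetic noise-free rank-3 tensor (channel x time x frequency), illustrative only.
rng = np.random.default_rng(1)
true = [rng.standard_normal((n, 3)) for n in (12, 40, 10)]
T = reconstruct(true)
solutions = rank_recursive_cp(T, max_rank=3)
rel_err = {r: np.linalg.norm(T - reconstruct(f)) / np.linalg.norm(T)
           for r, f in solutions.items()}
print(rel_err)
```

Because every ALS update is a least-squares solve, the fit error cannot increase from one rank to the next in this scheme, and only a single start per rank is needed rather than many random restarts.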

Supplementary Material

TBME-01704-2017-R1-supplimental

Acknowledgments

This work was supported in part by the National Institutes of Health under grants R01-NS074980, R01-EB026299 and R01-NS089212.

Contributor Information

Jian Li, Signal and Image Processing Institute, University of Southern California, Los Angeles, CA, USA.

Justin P. Haldar, Signal and Image Processing Institute, University of Southern California, Los Angeles, CA, USA.

John C. Mosher, Department of Neurology, University of Texas Health Science Center at Houston, Houston, TX, USA.

Dileep R. Nair, Epilepsy Center, Cleveland Clinic Neurological Institute, Cleveland, OH, USA.

Jorge A. Gonzalez-Martinez, Epilepsy Center, Cleveland Clinic Neurological Institute, Cleveland, OH, USA.

Richard M. Leahy, Signal and Image Processing Institute, University of Southern California, Los Angeles, CA, USA.

References

  • [1] Friston KJ, “Functional and Effective Connectivity: A Review,” Brain Connect., 1, no. 1, pp. 13–36, 2011.
  • [2] Rabinovich MI, Friston KJ, and Varona P, Principles of Brain Dynamics: Global State Interactions. MIT Press, 2012.
  • [3] Chang C and Glover GH, “Time-frequency dynamics of resting-state brain connectivity measured with fMRI,” Neuroimage, 50, no. 1, pp. 81–98, 2010.
  • [4] Handwerker DA, Roopchansingh V, Gonzalez-Castillo J, and Bandettini PA, “Periodic changes in fMRI connectivity,” Neuroimage, 63, no. 3, pp. 1712–1719, 2012.
  • [5] Hutchison RM et al., “Dynamic functional connectivity: Promise, issues, and interpretations,” Neuroimage, 80, pp. 360–378, 2013.
  • [6] Preti MG, Bolton TA, and Van De Ville D, “The dynamic functional connectome: State-of-the-art and perspectives,” Neuroimage, 160, pp. 41–54, 2017.
  • [7] Smith SM et al., “Temporally-independent functional modes of spontaneous brain activity,” Proc. Natl. Acad. Sci., 109, no. 8, pp. 3131–3136, 2012.
  • [8] Beckmann CF and Smith SM, “Probabilistic Independent Component Analysis for Functional Magnetic Resonance Imaging,” IEEE Trans. Med. Imaging, 23, no. 2, pp. 137–152, 2004.
  • [9] Calhoun VD and Adali T, “Multisubject independent component analysis of fMRI: A decade of intrinsic networks, default mode, and neurodiagnostic discovery,” IEEE Rev. Biomed. Eng., 5, pp. 60–73, 2012.
  • [10] Friston KJ, Frith CD, Liddle PF, and Frackowiak RSJ, “Functional Connectivity: The Principal-Component Analysis of Large (PET) Data Sets,” J. Cereb. Blood Flow Metab., 13, no. 1, pp. 5–14, 1993.
  • [11] Karahanoglu FI and Van De Ville D, “Transient brain activity disentangles fMRI resting-state dynamics in terms of spatially and temporally overlapping networks,” Nat. Commun., 6, p. 7751, 2015.
  • [12] Hitchcock FL, “The Expression of a Tensor or a Polyadic as a Sum of Products,” J. Math. Phys., 6, no. 1–4, pp. 164–189, 1927.
  • [13] Kiers HAL, “Towards a standardized notation and terminology in multiway analysis,” J. Chemom., 14, no. 3, pp. 105–122, 2000.
  • [14] Kolda TG and Bader BW, “Tensor Decompositions and Applications,” SIAM Rev., 51, no. 3, pp. 455–500, 2009.
  • [15] Cichocki A et al., “Tensor decompositions for signal processing applications: From two-way to multiway component analysis,” IEEE Signal Process. Mag., 32, no. 2, pp. 145–163, 2015.
  • [16] Harshman RA, “Foundations of the PARAFAC procedure: Models and conditions for an ‘explanatory’ multimodal factor analysis,” UCLA Work. Pap. Phonetics, 16, no. 10, pp. 1–84, 1970.
  • [17] Carroll JD and Chang JJ, “Analysis of individual differences in multidimensional scaling via an n-way generalization of ‘Eckart-Young’ decomposition,” Psychometrika, 35, no. 3, pp. 283–319, 1970.
  • [18] Kruskal JB, “Rank, decomposition, and uniqueness for 3-way and N-way arrays,” Multiway Data Analysis, pp. 7–18, 1989.
  • [19] Sidiropoulos ND and Bro R, “On the uniqueness of multilinear decomposition of N-way arrays,” J. Chemom., 14, no. 3, pp. 229–239, 2000.
  • [20] Faber NM, Bro R, and Hopke PK, “Recent developments in CANDECOMP/PARAFAC algorithms: A critical review,” Chemom. Intell. Lab. Syst., 65, no. 1, pp. 119–137, 2003.
  • [21] Tomasi G and Bro R, “A comparison of algorithms for fitting the PARAFAC model,” Comput. Stat. Data Anal., 50, no. 7, pp. 1700–1734, 2006.
  • [22] Cong F, Lin QH, Kuang LD, Gong XF, Astikainen P, and Ristaniemi T, “Tensor decomposition of EEG signals: A brief review,” J. Neurosci. Methods, 248, pp. 59–69, 2015.
  • [23] Mocks J, “Topographic components model for event-related potentials and some biophysical considerations,” IEEE Trans. Biomed. Eng., 35, no. 6, pp. 482–484, 1988.
  • [24] Miwakeichi F, Martinez-Montes E, Valdes-Sosa PA, Nishiyama N, Mizuhara H, and Yamaguchi Y, “Decomposing EEG data into space-time-frequency components using Parallel Factor Analysis,” Neuroimage, 22, no. 3, pp. 1035–1045, 2004.
  • [25] Acar E, Aykut-Bingol C, Bingol H, Bro R, and Yener B, “Multiway analysis of epilepsy tensors,” Bioinformatics, 23, no. 13, pp. 10–18, 2007.
  • [26] De Vos M et al., “Canonical decomposition of ictal scalp EEG reliably detects the seizure onset zone,” Neuroimage, 37, no. 3, pp. 844–854, 2007.
  • [27] Deburchgraeve W et al., “Neonatal seizure localization using PARAFAC decomposition,” Clin. Neurophysiol., 120, no. 10, pp. 1787–1796, 2009.
  • [28] Lee H, Kim Y-D, Cichocki A, and Choi S, “Nonnegative Tensor Factorization for Continuous EEG Classification,” Int. J. Neural Syst., 17, no. 04, pp. 305–317, 2007.
  • [29] Vanderperren K et al., “Single trial ERP reading based on parallel factor analysis,” Psychophysiology, 50, no. 1, pp. 97–110, 2013.
  • [30] Morup M, Hansen LK, Herrmann CS, Parnas J, and Arnfred SM, “Parallel Factor Analysis as an exploratory tool for wavelet transformed event-related EEG,” Neuroimage, 29, no. 3, pp. 938–947, 2006.
  • [31] Andersen AH and Rayens WS, “Structure-seeking multilinear methods for the analysis of fMRI data,” Neuroimage, 22, no. 2, pp. 728–739, 2004.
  • [32] Beckmann CF and Smith SM, “Tensorial extensions of independent component analysis for multisubject FMRI analysis,” Neuroimage, 25, no. 1, pp. 294–311, 2005.
  • [33] Damoiseaux JS et al., “Consistent resting-state networks across healthy subjects,” Proc. Natl. Acad. Sci., 103, no. 37, pp. 13848–13853, 2006.
  • [34] Morup M, Hansen LK, Parnas J, and Arnfred SM, “Decomposing the time-frequency representation of EEG using nonnegative matrix and multi-way factorization,” Tech. Univ. Denmark Tech. Rep., 2006.
  • [35] Barnathan M, Megalooikonomou V, Faloutsos C, Faro S, and Mohamed FB, “TWave: High-order analysis of functional MRI,” Neuroimage, 58, no. 2, pp. 537–548, 2011.
  • [36] Rajih M, Comon P, and Harshman RA, “Enhanced Line Search: A Novel Method to Accelerate PARAFAC,” SIAM J. Matrix Anal. Appl., 30, no. 3, pp. 1128–1147, 2008.
  • [37] Navasca C, De Lathauwer L, and Kindermann S, “Swamp reducing technique for tensor decomposition,” in European Signal Processing Conference, 2008, pp. 1–5.
  • [38] Haldar JP and Hernando D, “Rank-constrained solutions to linear matrix equations using powerfactorization,” IEEE Signal Process. Lett., 16, no. 7, pp. 584–587, 2009.
  • [39] Jain P, Netrapalli P, and Sanghavi S, “Low-rank Matrix Completion using Alternating Minimization,” Proc. forty-fifth Annu. ACM Symp. Theory Comput., pp. 665–674, 2012.
  • [40] Kolda TG, “Orthogonal Tensor Decompositions,” SIAM J. Matrix Anal. Appl., 23, no. 1, pp. 243–255, 2001.
  • [41] Smilde A, Bro R, and Geladi P, Multi-Way Analysis with Applications in the Chemical Sciences. 2004.
  • [42] Kolda TG, Bader BW, and Kenny JP, “Higher-order web link analysis using multilinear algebra,” in Proceedings - IEEE International Conference on Data Mining, ICDM, 2005, pp. 242–249.
  • [43] Zhang T and Golub GH, “Rank-One Approximation to High Order Tensors,” SIAM J. Matrix Anal. Appl., 23, no. 2, pp. 534–550, 2001.
  • [44] Hastad J, “Tensor rank is NP-complete,” J. Algorithms, 11, no. 4, pp. 644–654, 1990.
  • [45] Bro R and Kiers HAL, “A new efficient method for determining the number of components in PARAFAC models,” J. Chemom., 17, no. 5, pp. 274–286, 2003.
  • [46] Timmerman ME and Kiers HAL, “Three-mode principal components analysis: Choosing the numbers of components and sensitivity to local optima,” Br. J. Math. Stat. Psychol., 53, no. 1, pp. 1–16, 2000.
  • [47] Morup M and Hansen LK, “Automatic relevance determination for multi-way models,” J. Chemom., 23, no. 7–8, pp. 352–363, 2009.
  • [48] Bancaud J et al., La stéréoencéphalographie dans l’épilepsie: Informations neuro-physio-pathologiques apportées par l’investigation fonctionnelle stéréotaxique. Paris: Masson, 1965.
  • [49] Tadel F, Baillet S, Mosher JC, Pantazis D, and Leahy RM, “Brainstorm: A user-friendly application for MEG/EEG analysis,” Comput. Intell. Neurosci., 2011, p. 8, 2011.
  • [50] Lim L, “Optimal solutions to non-negative PARAFAC / multilinear NMF always exist,” in Workshop on Tensor Decompositions and Applications, Centre International de rencontres Mathématiques, Luminy, France, 2005.
  • [51] Tomasi G and Bro R, “PARAFAC and missing values,” Chemom. Intell. Lab. Syst., 75, no. 2, pp. 163–180, 2005.
  • [52] Gonzalez-Martinez J et al., “Stereotactic placement of depth electrodes in medically intractable epilepsy,” J. Neurosurg., 120, no. 3, pp. 639–644, March 2014.
  • [53] Auslender A and Teboulle M, “Interior Gradient and Proximal Methods for Convex and Conic Optimization,” SIAM J. Optim., 16, no. 3, pp. 697–725, 2006.
  • [54] Becker SR, Candès EJ, and Grant MC, “Templates for convex cone problems with applications to sparse signal recovery,” Math. Program. Comput., 3, no. 3, pp. 165–218, 2011.
  • [55] Buckner RL, Andrews-Hanna JR, and Schacter DL, “The brain’s default network: Anatomy, function, and relevance to disease,” Ann. N. Y. Acad. Sci., 1124, pp. 1–38, 2008.
  • [56] Schomer DL and Lopes da Silva F, Niedermeyer’s Electroencephalography: basic principles, clinical applications, and related fields. 2011.
  • [57] Goldman RI, Stern JM, Engel J, and Cohen MS, “Simultaneous EEG and fMRI of the alpha rhythm,” Neuroreport, 13, no. 18, pp. 2487–2492, 2002.
  • [58] Brookes MJ et al., “Investigating the electrophysiological basis of resting state networks using magnetoencephalography,” Proc. Natl. Acad. Sci., 108, no. 40, pp. 16783–16788, 2011.
  • [59] Baker SN, “Oscillatory interactions between sensorimotor cortex and the periphery,” Curr. Opin. Neurobiol., 17, no. 6, pp. 649–655, 2007.
  • [60] Zhang Y, Chen Y, Bressler SL, and Ding M, “Response preparation and inhibition: The role of the cortical sensorimotor beta rhythm,” Neuroscience, 156, no. 1, pp. 238–246, 2008.
  • [61] Welch PD, “The Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms,” IEEE Trans. Audio Electroacoust., 15, no. 2, pp. 70–73, 1967.
  • [62] Biswal B, Zerrin Yetkin F, Haughton VM, and Hyde JS, “Functional Connectivity in the Motor Cortex of Resting Human Brain Using Echo-Planar MRI,” Magn. Reson. Med., 34, no. 4, pp. 537–541, 1995.
  • [63] Steriade M, “Cellular Substrates of Brain Rhythms,” Basic principles, clinical applications, and related fields, pp. 27–62, 1980.
