2023 Aug 15;277:120218. doi: 10.1016/j.neuroimage.2023.120218

Identifying good practices for detecting inter-regional linear functional connectivity from EEG

Franziska Pellegrini a,b, Arnaud Delorme c, Vadim Nikulin d, Stefan Haufe e,f,a,b,1
PMCID: PMC10374983  PMID: 37307866

Highlights

  • Functional connectivity (FC) estimation from EEG varies with analysis pipelines.

  • Based on this simulation study, we recommend LCMV for source projection.

  • Principal component analysis (PCA) is a good strategy to aggregate regional activity.

  • The multivariate interaction measure performs well in estimating undirected FC.

  • Time-reversed Granger Causality performs well in estimating directed FC.

Keywords: Electroencephalography, Inter-regional functional connectivity, Simulation, Source reconstruction, Linearly-constrained minimum variance beamforming, Multivariate interaction measure, Time-reversed granger causality

Abstract

Aggregating voxel-level statistical dependencies between multivariate time series is an important intermediate step when characterising functional connectivity (FC) between larger brain regions. However, there are numerous ways in which voxel-level data can be aggregated into inter-regional FC, and the advantages of each of these approaches are currently unclear.

In this study we generate ground-truth data and compare the performances of various pipelines that estimate directed and undirected linear phase-to-phase FC between regions. We test the ability of several existing and novel FC analysis pipelines to identify the true regions within which connectivity was simulated. We test various inverse modelling algorithms, strategies to aggregate time series within regions, and connectivity metrics. Furthermore, we investigate the influence of the number of interactions, the signal-to-noise ratio, the noise mix, the interaction time delay, and the number of active sources per region on the ability of detecting phase-to-phase FC.

Throughout all simulated scenarios, lowest performance is obtained with pipelines involving the absolute value of coherency. Further, the combination of dynamic imaging of coherent sources (DICS) beamforming with directed FC metrics that aggregate information across multiple frequencies leads to unsatisfactory results. Pipelines that show promising results with our simulated pseudo-EEG data involve the following steps: (1) Source projection using the linearly-constrained minimum variance (LCMV) beamformer. (2) Principal component analysis (PCA) using the same fixed number of components within every region. (3) Calculation of the multivariate interaction measure (MIM) for every region pair to assess undirected phase-to-phase FC, or calculation of time-reversed Granger Causality (TRGC) to assess directed phase-to-phase FC. We formulate recommendations based on these results that may increase the validity of future experimental connectivity studies.

We further introduce the free ROIconnect plugin for the EEGLAB toolbox that includes the recommended methods and pipelines that are presented here. We show an exemplary application of the best performing pipeline to the analysis of EEG data recorded during motor imagery.

1. Introduction

In recent years, the field of functional neuroimaging has seen a shift from the mere localization of brain activity towards assessing interaction patterns between functionally segregated and specialized brain regions (Friston, 2011, Schoffelen, Gross, 2019). Functional connectivity (FC), in contrast to structural connectivity, expresses a statistical dependency between two or more neuronal time series. It has been proposed that FC reflects inter-areal brain communication (Fries, 2015). Moreover, empirical FC estimates have been linked to various cognitive functions (Schoffelen and Gross, 2019) and show pathological alterations in many neurological diseases like Parkinson’s Disease, Alzheimer’s Disease, and epilepsy (Van Diessen et al., 2015).

Electroencephalography (EEG) and Magnetoencephalography (MEG) are suitable tools for recording neural activity non-invasively with high temporal resolution. Pipelines for analysing inter-regional FC from M/EEG recordings typically consist of a series of processing steps: artifact cleaning, source projection, aggregation of signals within regions of interests (ROIs), and, finally, FC estimation. At each step, researchers can choose between a huge selection of processing methods, where every decision has the potential to crucially affect the final result of an analysis and its interpretation (Colclough, Woolrich, Tewarie, Brookes, Quinn, Smith, 2016, Mahjoory, Nikulin, Botrel, Linkenkaer-Hansen, Fato, Haufe, 2017, Wang, Bénar, Quilichini, Friston, Jirsa, Bernard, 2014). This not only complicates the comparison of results from different FC studies, it also raises the question: which pipelines are suitable for reliable source-level FC detection from M/EEG?

In the absence of a robust ground truth on information flow patterns in the human brain, computer simulations are a straightforward way to address such questions (Ewald et al., 2012). Indeed, numerous works have aimed to validate parts or aspects of M/EEG FC methodologies by employing simulated activity. Several studies have focused on assessing the accuracy of different inverse solutions (Allouch, Yochum, Kabbara, Duprez, Khalil, Wendling, Hassan, Modolo, 2022, Anzolin, Presti, Van De Steen, Astolfi, Haufe, Marinazzo, 2019, Bradley, Yao, Dewald, Richter, 2016, Castaño Candamil, Höhne, Martínez-Vargas, An, Castellanos-Domínguez, Haufe, 2015, Grova, Daunizeau, Lina, Bénar, Benali, Gotman, 2006, Halder, Talwar, Jaiswal, Banerjee, 2019, Hashemi, Cai, Kutyniok, Müller, Nagarajan, Haufe, 2021, Haufe, Nikulin, Ziehe, Müller, Nolte, 2008, Haufe, Tomioka, Dickhaus, Sannelli, Blankertz, Nolte, Müller, 2011, Hincapié, Kujala, Mattout, Pascarella, Daligault, Delpuech, Mery, Cosmelli, Jerbi, 2017, Jaiswal, Nenonen, Stenroos, Gramfort, Dalal, Westner, Litvak, Mosher, Schoffelen, Witton, et al., 2020). Others have tested the performance of different FC metrics (Allouch, Yochum, Kabbara, Duprez, Khalil, Wendling, Hassan, Modolo, 2022, Anzolin, Presti, Van De Steen, Astolfi, Haufe, Marinazzo, 2019, Astolfi, Cincotti, Mattia, Marciani, Baccala, de Vico Fallani, Salinari, Ursino, Zavaglia, Ding, et al., 2007, Haufe, Nikulin, Müller, Nolte, 2013, Silfverhuth, Hintsala, Kortelainen, Seppänen, 2012, Sommariva, Sorrentino, Piana, Pizzella, Marzetti, 2019); however, not always on source-reconstructed data exhibiting realistic levels of source leakage.

Many studies aim at aggregating FC within physiologically defined ROIs (Basti, Nili, Hauk, Marzetti, Henson, 2020, Idaji, Zhang, Stephani, Nolte, Mueller, Villringer, Nikulin, 2021, Palva, Monto, Kulashekhar, Palva, 2010, Palva, Kulashekhar, Hämäläinen, Palva, 2011, Schoffelen, Hultén, Lam, Marquand, Uddén, Hagoort, 2017, Supp, Schlögl, Trujillo-Barreto, Müller, Gruber, 2007). This approach has various advantages. First, it is computationally more tractable (both memory- and time-wise) than the computation of FC between many pairs of individual sources, and it can avoid numerical instabilities for FC metrics that require full-rank signals. Second, interpreting or even visualizing FC between thousands of separate sources is almost impossible. Third, statistical testing is far easier due to a much reduced number of multiple comparisons. And, fourth, across-subject statistical analyses are eased by working on a standardized set of regions rather than in individual anatomical spaces lacking a common set of source locations.

There have been various suggestions on how to reduce the signal dimensionality within ROIs. While some approaches focus on selecting one source for each ROI that best represents the activity of all sources in it (Ghumare, Schrooten, Vandenberghe, Dupont, 2018, Hillebrand, Barnes, Bosboom, Berendse, Stam, 2012, Perinelli, Assecondi, Tagliabue, Mazza, 2022), others involve some kind of averaging or weighted averaging over all source time series of a ROI (Korhonen, Palva, Palva, 2014, Palva, Monto, Kulashekhar, Palva, 2010, Palva, Kulashekhar, Hämäläinen, Palva, 2011). This approach can be made more general by using the strongest principal component (PC) of all sources of a ROI as a representative time series of that ROI (Basti, Nili, Hauk, Marzetti, Henson, 2020, Ghumare, Schrooten, Vandenberghe, Dupont, 2018, Hillebrand, Barnes, Bosboom, Berendse, Stam, 2012, Rubega, Carboni, Seeber, Pascucci, Tourbier, Toscano, Van Mierlo, Hagmann, Plomp, Vulliemoz, et al., 2019, Supp, Schlögl, Trujillo-Barreto, Müller, Gruber, 2007). The assumption behind this is that the projection of the data that captures the highest amount of variance within a ROI (its strongest PCs) also reflects the connectivity structure of that ROI best. While most works use only the first PC per region, the use of multiple components has also been suggested (e.g. Schoffelen et al., 2017). For this approach, the subsequent FC estimation is usually calculated between pairs of multivariate time series. Another approach, used for example in Schoffelen et al. (2017), is to apply a multivariate FC metric (here, a multivariate extension of Granger causality, Barrett et al., 2010) to the first C PCs of each pair of ROIs. Comparable undirected metrics are the multivariate interaction measure (MIM) and the maximized imaginary coherency (MIC) (Basti, Nili, Hauk, Marzetti, Henson, 2020, Ewald, Marzetti, Zappasodi, Meinecke, Nolte, 2012), which are already in use for source-to-source FC estimation (e.g. D’Andrea et al., 2019). These are promising approaches towards more reliable FC estimation. But their virtue in the context of inter-regional FC estimation is still unclear. Moreover, a comprehensive approach evaluating entire data analysis pipelines rather than individual steps is still lacking (see Haufe, Ewald, 2019, Mahjoory, Nikulin, Botrel, Linkenkaer-Hansen, Fato, Haufe, 2017).

Consequently, this work addresses the following questions: First, which pipelines are promising candidates for inferring phase-to-phase FC? Second, which pipelines are promising candidates for inferring the directionality of an interaction? And, most importantly, which pipelines are not suitable to detect FC from data that is corrupted by signal mixing? In addition, we investigate how the number of PCs per ROI affects FC estimation. Finally, we evaluate how the performance of detecting ground-truth interactions varies depending on crucial data parameters like the signal-to-noise ratio (SNR), the number of ground-truth interactions, the noise composition, and the length of the interaction delay. All pipelines are tested within an EEG signal simulation framework that builds on our prior work (Haufe and Ewald, 2019). Note that we focus here on 1:1-phase-to-phase coupling with non-zero time delay, which is the most commonly studied type of FC. Other coupling types including phase–amplitude, amplitude–amplitude, phase–frequency, frequency–frequency, and amplitude–frequency coupling (e.g., Jirsa and Müller, 2013) are not studied here. Further note that we do not intend to propose a realistic model of EEG data or the whole brain. Rather, we aim to identify metrics and pipelines that can accurately reconstruct ROI-level functional connectivity (FC) in the presence of signal mixing, which heavily affects popular metrics used to infer directed and undirected linear FC. That is, we don’t address the question of whether networks estimated using FC metrics provide an accurate depiction of actual brain networks.

The best-performing methods and pipelines identified in this study are implemented in the free ROIconnect plugin for the EEGLAB toolbox. We describe the functionality of ROIconnect and apply it to investigate EEG phase-to-phase FC during left and right hand motor imagery.

2. Methods

2.1. Data generation

We generate time series at a sampling rate of 100 Hz with a recording length of three minutes (Nt=100·60·3=18000 samples). For spectral analyses, we epoch the data into Ne=90 segments of T=200 samples (2 seconds) length.

Ground-truth activity of interacting sources (c.f. Fig. 1a) is generated as random white noise filtered in the alpha band (8 to 12 Hz). Throughout, we use zero-phase forward and reverse second-order digital band-pass Butterworth filters. The interaction between two regions is modeled as unidirectional from the sending region to the receiving region. This is ensured by defining the activity at the receiving region to be an exact copy of the activity at the sending region with a certain time delay (see Section 3). Additionally, pink (1/f scaled) background noise is added to the sending and receiving regions independently. More specifically, both the ground-truth signal and the pink background noise are first normalized to have unit norm in the interacting frequency band. To this end, every interacting ground-truth signal time series $g_x \in \mathbb{R}^{N_t}$ at region $x$ is divided by its 2-norm: $g_x^n = g_x / \lVert g_x \rVert_2$. Every pink background noise time series $p_x \in \mathbb{R}^{N_t}$ is filtered in the interacting frequency band to obtain $p_x^{8\text{-}12\,\mathrm{Hz}} \in \mathbb{R}^{N_t}$. The unfiltered noise time series is then divided by the 2-norm of its filtered version: $p_x^n = p_x / \lVert p_x^{8\text{-}12\,\mathrm{Hz}} \rVert_2$. Subsequently, a weighted sum of the normalized signal time series and the normalized noise time series is calculated:

$$s_x = \theta\, g_x^n + (1-\theta)\, p_x^n \in \mathbb{R}^{N_t} \tag{1}$$

The result is called the (interacting) signal (Fig. 1b). The parameter $\theta$ takes values between 0 and 1 and defines the source-level SNR in decibel (dB): $\mathrm{SNR}_\theta = 20 \log_{10}\!\left(\frac{\theta}{1-\theta}\right)$. The source-level SNR is set to 3.5 dB ($\theta = 0.6$). The transposed column vectors of all $2N_I$ signal time series $s$ form the signal sources $\tilde{J}_I \in \mathbb{R}^{2N_I \times N_t}$, with $N_I$ region pairs containing the $2N_I$ interacting signals.
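The generation of one pair of interacting signals can be sketched in Python as follows. This is a minimal illustration of Eq. (1) and not the code used in the study: the delay of 5 samples, the spectral-shaping approach to pink noise, and all function names are assumptions made for this example, and `butter(2, ...)` is only one possible reading of "second-order" band-pass filter.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs, n_t = 100, 18000      # sampling rate (Hz) and number of samples, as in the text
band = (8.0, 12.0)        # interacting frequency band (Hz)
theta = 0.6               # mixing weight from the text
delay = 5                 # hypothetical delay in samples (not taken from the text)
rng = np.random.default_rng(0)

def bandpass(x, lo, hi, fs, order=2):
    """Zero-phase (forward and reverse) Butterworth band-pass filter."""
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

def pink_noise(n, rng):
    """Assumed construction: scale the spectrum of white noise so power falls as 1/f."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                       # avoid division by zero at DC
    return np.fft.irfft(spec / np.sqrt(f), n)

def mix(g, p, theta):
    """Eq. (1): weighted sum of normalized signal and normalized pink noise."""
    g_n = g / np.linalg.norm(g)
    p_n = p / np.linalg.norm(bandpass(p, *band, fs))
    return theta * g_n + (1 - theta) * p_n

# The receiver is an exact copy of the sender, shifted by `delay` samples.
g_full = bandpass(rng.standard_normal(n_t + delay), *band, fs)
g_send, g_recv = g_full[delay:], g_full[:-delay]

s_send = mix(g_send, pink_noise(n_t, rng), theta)
s_recv = mix(g_recv, pink_noise(n_t, rng), theta)

snr_db = 20 * np.log10(theta / (1 - theta))   # source-level SNR in dB
```

With θ = 0.6 the ratio θ/(1−θ) is 1.5, which gives the SNR of roughly 3.5 dB quoted above.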

Fig. 1.

Example of simulated data in time and frequency domain. (a) Ground-truth activity at two interacting sources was generated as random white noise filtered in the alpha band (8 to 12 Hz). Left: one-second window of data in the time domain. Right: power spectral density (PSD). (b) Two interacting signals, generated as a mixture of the ground-truth activity and pink background noise ($\mathrm{SNR}_\theta$ = 3.5 dB). Left: one-second window of data in the time domain. Right: PSD. (c) Brain noise, generated as random pink noise without additional activity in the alpha band (shown is the activity of an exemplary non-interacting source). Left: one-second window of data in the time domain. Right: PSD. (d) PSD of sensor-level activity, generated by mixing white sensor noise, the interacting signals, and the brain noise at the sensor level (SNR = 3.5 dB). (e) PSD of reconstructed source-level activity. Shown are PSDs of the first principal component of all 68 regions.

In contrast, activity of a non-interacting source at region $y$, $b_y \in \mathbb{R}^{N_t}$, referred to as brain noise (Fig. 1c), is generated using random pink noise only, without additional activity in the alpha band. The transposed column vectors of all $R - 2N_I$ brain noise time series $b$ form the brain noise sources $\tilde{J}_b \in \mathbb{R}^{(R-2N_I) \times N_t}$, with $R$ denoting the number of regions.

We use a surface-based source model with Nv=1895 dipolar sources placed in the cortical gray matter. Regions are defined according to the Desikan-Killiany atlas (Desikan et al., 2006), which is a surface-based atlas with R=68 cortical regions. Depending on the number of interacting voxels (see Experiment 6, Section 3), one or two time series per region are generated. Every ground-truth time series is placed in a randomly selected source location within a region, so that every region contains the same number of ground-truth time series. The NI region pairs containing the 2NI interacting signals are chosen randomly, and all other regions contain time series with brain noise.

In the next step, source activity is projected to sensor space by using a physical forward model of the electrical current flow in the head, summarized by a leadfield matrix. The leadfield describes the signal measured at the sensors for a given source current density. It is a function of the head geometry and the electrical conductivities of different tissues in the head. The template leadfield is obtained from a boundary element method (BEM) head model of the ICBM152 anatomical head template, which is a non-linear average of the magnetic resonance (MR) images of 152 healthy subjects (Mazziotta et al., 1995). We use Brainstorm (Tadel et al., 2011) and openMEEG (Gramfort et al., 2010) software to generate the headmodel and leadfield. $N_s = 97$ sensors are placed on the scalp following the standard BrainProducts ActiCap97 channel setup. Note that the spatial orientation of all simulated dipolar sources is chosen to be perpendicular to the cortex surface, so that the three spatial components that define the dipole orientation are summarized into one. This assumption implies a scalar leadfield $L_s \in \mathbb{R}^{N_s \times N_v}$. We denote the columns of $L_s$ that correspond to the interacting sources by $L_I \in \mathbb{R}^{N_s \times 2N_I}$ and those corresponding to the brain noise sources by $L_b \in \mathbb{R}^{N_s \times (R-2N_I)}$. Signal sources $\tilde{J}_I$ and brain noise sources $\tilde{J}_b$ are then separately projected to sensor space:

$$\tilde{Q}_I = L_I \tilde{J}_I \tag{2}$$
$$\tilde{Q}_b = L_b \tilde{J}_b, \tag{3}$$

with $\tilde{Q}_I$ and $\tilde{Q}_b \in \mathbb{R}^{N_s \times N_t}$.

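At sensor level, we mix the different signal and noise components. We generate white sensor noise $\tilde{Q}_s \in \mathbb{R}^{N_s \times N_t}$ with equal variance at all sensors. The multivariate sensor-space time series corresponding to all three signal components (brain noise, interacting signals, and sensor noise) are divided by their Frobenius norms with respect to the interacting frequency band (8-12 Hz):

$$\tilde{Q}_I^n = \frac{\tilde{Q}_I}{\lVert \tilde{Q}_I^{8\text{-}12\,\mathrm{Hz}} \rVert_2}, \tag{4}$$
$$\tilde{Q}_b^n = \frac{\tilde{Q}_b}{\lVert \tilde{Q}_b^{8\text{-}12\,\mathrm{Hz}} \rVert_2}, \tag{5}$$
$$\tilde{Q}_s^n = \frac{\tilde{Q}_s}{\lVert \tilde{Q}_s^{8\text{-}12\,\mathrm{Hz}} \rVert_2}, \tag{6}$$

with $\tilde{Q}_I^{8\text{-}12\,\mathrm{Hz}}$, $\tilde{Q}_b^{8\text{-}12\,\mathrm{Hz}}$ and $\tilde{Q}_s^{8\text{-}12\,\mathrm{Hz}} \in \mathbb{R}^{N_s \times N_t}$. Then the three components are combined as follows: first, we add brain noise and sensor noise with a specific brain noise-to-sensor noise ratio (BSR) to obtain the total noise $\tilde{Q}_n$ and normalize it with respect to the interacting frequency band:

$$\tilde{Q}_n = \theta_{bsr}\, \tilde{Q}_b^n + (1-\theta_{bsr})\, \tilde{Q}_s^n \tag{7}$$
$$\tilde{Q}_n^n = \frac{\tilde{Q}_n}{\lVert \tilde{Q}_n^{8\text{-}12\,\mathrm{Hz}} \rVert_2}. \tag{8}$$

The default BSR value is set to 0 dB, i.e., $\theta_{bsr} = 0.5$. Second, we sum up signal and total noise with a specific global (sensor-level) SNR:

$$\tilde{Q} = \theta_{snr}\, \tilde{Q}_I^n + (1-\theta_{snr})\, \tilde{Q}_n^n \tag{9}$$
$$\tilde{Q}^n = \frac{\tilde{Q}}{\lVert \tilde{Q}^{8\text{-}12\,\mathrm{Hz}} \rVert_2} \tag{10}$$

The default SNR value is set to 3.5 dB, i.e., $\theta_{snr} = 0.6$. An example of the power-spectral density of the resulting activity on sensor level is illustrated in Fig. 1d. As a last step, we high-pass filter the generated sensor data with a cutoff of 1 Hz.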
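The two-stage mixing of Eqs. (4) to (10) can be sketched as below. This is an illustrative reading of the equations, not the authors' implementation: the function names `band_norm` and `mix_sensor` are hypothetical, the inputs are random placeholders rather than leadfield-projected sources, and the final 1 Hz high-pass filter is omitted.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def band_norm(Q, fs=100, band=(8.0, 12.0)):
    """Frobenius norm of Q after zero-phase band-pass filtering along time."""
    b, a = butter(2, band, btype="bandpass", fs=fs)
    return np.linalg.norm(filtfilt(b, a, Q, axis=-1))

def mix_sensor(Q_I, Q_b, Q_s, theta_bsr=0.5, theta_snr=0.6):
    """Combine signal, brain noise and sensor noise following Eqs. (4)-(10)."""
    Q_I, Q_b, Q_s = (Q / band_norm(Q) for Q in (Q_I, Q_b, Q_s))   # Eqs. (4)-(6)
    Q_n = theta_bsr * Q_b + (1 - theta_bsr) * Q_s                 # Eq. (7)
    Q_n = Q_n / band_norm(Q_n)                                    # Eq. (8)
    Q = theta_snr * Q_I + (1 - theta_snr) * Q_n                   # Eq. (9)
    return Q / band_norm(Q)                                       # Eq. (10)

# Toy usage with random placeholder data (8 sensors, 2000 samples).
rng = np.random.default_rng(0)
Q_I, Q_b, Q_s = rng.standard_normal((3, 8, 2000))
Q = mix_sensor(Q_I, Q_b, Q_s)

bsr_db = 20 * np.log10(0.5 / (1 - 0.5))   # default BSR in dB
```

Normalizing after each stage keeps the in-band norm of every intermediate component at one, so that the weights θ alone determine the BSR and SNR.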

2.2. Source reconstruction

We test four different inverse solutions for source reconstruction: ‘exact’ low-resolution electromagnetic tomography (eLORETA), linearly-constrained minimum variance beamforming (LCMV), dynamic imaging of coherent sources (DICS), and Champagne. Inverse source reconstructions are based on the same leadfield used to simulate the signals. Full 3D currents are estimated for each source dipole. That is, prior information about the dipoles’ orientation is not used. A normal direction could in principle be estimated from the reconstructed cortical surface mesh (which we used here for signal generation); however, such estimation is considered to be rather unstable, since we do not have a good estimate of the cortical surface orientation in practice. The aggregation of the three spatial dimensions is discussed in Section 2.3.

‘Exact’ low-resolution electromagnetic tomography

The starting point to solve the source localization problem is the linear forward model $\tilde{Q} = L_v \tilde{J}$, where $\tilde{Q} \in \mathbb{R}^{N_s \times N_t}$ stands for the sensor measurements, $\tilde{J} \in \mathbb{R}^{3N_v \times N_t}$ is the vector-valued activity of the dipolar brain sources to be recovered, and $L_v \in \mathbb{R}^{N_s \times 3N_v}$ is the vector-valued linear leadfield matrix that maps the electrical activity from source to sensor level. Here, the factor 3 in $3N_v$ stands for the three spatial dimensions that together define the dipole orientation of the source activity. Solving this equation for $\tilde{J}$ is an ill-posed problem since the number of brain sources $N_v$ is much larger than the number of measurement sensors $N_s$. Therefore, eLORETA imposes the constraint of spatially smooth current density distributions (Pascual-Marqui, 2007, Pascual-Marqui, Lehmann, Koukkou, Kochi, Anderer, Saletu, Tanaka, Hirata, John, Prichep, Biscay-Lirio, Kinoshita, 2011). Briefly, eLORETA uses a weighted minimum norm criterion to estimate the source distribution:

$$\hat{J} = \arg\min_{\tilde{J}} \left[ \lVert \tilde{Q} - L_v \tilde{J} \rVert^2 + a\, \tilde{J}^\top W \tilde{J} \right], \tag{11}$$

where $a \geq 0$ denotes a regularization parameter, and $W$ is a block-diagonal symmetric weight matrix:

$$W = \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_{N_v} \end{bmatrix} \in \mathbb{R}^{3N_v \times 3N_v}, \tag{12}$$

where $0$ is the 3×3 zero matrix and $W_v$ the 3×3 weight matrix at the $v$-th voxel defined in Equation (15). The solution of Equation (11) is given by:

$$\hat{J} = W^{-1} L_v^\top \left( L_v W^{-1} L_v^\top + aK \right)^{\dagger} \tilde{Q} = P^{E\top} \tilde{Q}, \tag{13}$$

where $K \in \mathbb{R}^{N_s \times N_s}$ is a centering matrix re-referencing the leadfield and sensor measurements to the common-average reference, $A^{\dagger}$ is the Moore-Penrose pseudo-inverse of a matrix $A$, and $P^E \in \mathbb{R}^{N_s \times 3N_v}$ is the eLORETA inverse filter. eLORETA then first computes

$$M = \left( L_v W^{-1} L_v^\top + aK \right)^{\dagger} \tag{14}$$

and then, for $v = 1, \ldots, N_v$, calculates weights

$$W_v = \left[ L_v^{v\top} M L_v^{v} \right]^{1/2}, \tag{15}$$

with $L_v^v \in \mathbb{R}^{N_s \times 3}$ denoting the leadfield for a single source location. It then iterates Equations (14) and (15) until convergence and uses the final weights to calculate $\hat{J}$. eLORETA has been shown to outperform other linear solutions in localization precision (Allouch, Yochum, Kabbara, Duprez, Khalil, Wendling, Hassan, Modolo, 2022, Halder, Talwar, Jaiswal, Banerjee, 2019, Pascual-Marqui, 2007).
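The fixed-point iteration between Eqs. (14) and (15) can be sketched as follows. This is a simplified illustration under several assumptions: the leadfield is a random placeholder, the regularization value `a`, the iteration limit and the tolerance are arbitrary, the leadfield is not re-referenced beforehand, and the names `eloreta_filter` and `sqrtm_psd` are hypothetical.

```python
import numpy as np

def sqrtm_psd(A):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def eloreta_filter(L, a=0.05, n_iter=100, tol=1e-6):
    """L: leadfield of shape (n_sensors, n_voxels, 3).
    Returns the filter P (n_sensors, 3*n_voxels) and the weights W (n_voxels, 3, 3)."""
    n_s, n_v, _ = L.shape
    K = np.eye(n_s) - np.ones((n_s, n_s)) / n_s      # centering matrix
    L_flat = L.reshape(n_s, 3 * n_v)                 # columns ordered voxel by voxel

    def step(W):
        # L W^{-1}, computed block-wise for the block-diagonal W
        LW = np.einsum("svi,vij->svj", L, np.linalg.inv(W)).reshape(n_s, -1)
        M = np.linalg.pinv(LW @ L_flat.T + a * K)    # Eq. (14)
        return LW, M

    W = np.stack([np.eye(3)] * n_v)                  # start from identity weights
    for _ in range(n_iter):
        _, M = step(W)
        W_new = np.stack([sqrtm_psd(L[:, v].T @ M @ L[:, v]) for v in range(n_v)])  # Eq. (15)
        converged = np.max(np.abs(W_new - W)) < tol
        W = W_new
        if converged:
            break
    LW, M = step(W)
    P = M @ LW            # so that P.T @ Q equals W^{-1} L^T M Q, as in Eq. (13)
    return P, W

# Toy usage: 16 sensors, 40 voxels, 50 time samples of random data.
rng = np.random.default_rng(0)
L = rng.standard_normal((16, 40, 3))
P, W = eloreta_filter(L)
J_hat = P.T @ rng.standard_normal((16, 50))
```

Because $W$ is block-diagonal, its inverse and the weight update can be computed voxel by voxel, which keeps each iteration cheap even for many sources.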

In this study, we choose the regularization parameter based on the best result in a five-fold spatial cross-validation (Hashemi et al., 2021) with fifteen candidate parameters taken from a logarithmically spaced range between $0.01 \cdot \mathrm{Tr}(\mathrm{Cov}_{\tilde{Q}})$ and $\mathrm{Tr}(\mathrm{Cov}_{\tilde{Q}})$, where $\mathrm{Tr}(A)$ denotes the trace of a matrix $A$ and $\mathrm{Cov}_{\tilde{Q}} \in \mathbb{C}^{N_s \times N_s}$ denotes the sample covariance matrix of the sensor-space data.

Linearly-constrained minimum variance beamforming

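The LCMV (Van Veen et al., 1997) filter $P^L \in \mathbb{R}^{N_s \times 3N_v}$ belongs to the class of beamformers. It estimates source activity separately for every source location. While LCMV maximizes source activity originating from the target location, it suppresses noise and other source contributions. Let $L_v^v \in \mathbb{R}^{N_s \times 3}$ and $P_v^L \in \mathbb{R}^{N_s \times 3}$ denote the leadfield and projection matrix for a single source location, respectively. The LCMV projection filter minimizes the total variance of the source-projected signal across the three dipole dimensions:

$$P_v^L = \arg\min_{P_v} \mathrm{Tr}\left( P_v^\top\, \mathrm{Cov}_{\tilde{Q}}\, P_v \right) \tag{16}$$

under the unit-gain constraint

$$P_v^\top L_v^v = I_{3 \times 3}. \tag{17}$$

The source estimate $\hat{J}_v \in \mathbb{R}^{3 \times N_t}$ at the $v$-th voxel is given by

$$\hat{J}_v = \left[ \left( L_v^{v\top}\, \mathrm{Cov}_{\tilde{Q}}^{-1}\, L_v^v \right)^{-1} L_v^{v\top}\, \mathrm{Cov}_{\tilde{Q}}^{-1} \right] \tilde{Q} = P_v^{L\top} \tilde{Q}. \tag{18}$$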

Previous simulations indicated that LCMV overall shows a higher connectivity reconstruction accuracy than eLORETA but is more strongly affected by low SNR (Anzolin et al., 2019). We show a power spectrum of exemplary LCMV-reconstructed source activity in Fig. 1e.
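A direct transcription of Eq. (18) into NumPy might look as follows. It is a sketch on random placeholder data; the optional diagonal loading parameter `reg` is an assumption added for numerical convenience and is not described in the text, and the function name `lcmv_filter` is hypothetical.

```python
import numpy as np

def lcmv_filter(L, cov, reg=0.0):
    """L: leadfield (n_sensors, n_voxels, 3); cov: sensor covariance (n_sensors, n_sensors).
    Returns one (n_sensors, 3) filter per voxel, stacked as (n_sensors, n_voxels, 3)."""
    n_s, n_v, _ = L.shape
    cov = cov + reg * np.trace(cov) / n_s * np.eye(n_s)   # optional diagonal loading (assumption)
    cov_inv = np.linalg.inv(cov)
    P = np.empty_like(L)
    for v in range(n_v):
        L_v = L[:, v]                                     # (n_sensors, 3)
        CiL = cov_inv @ L_v                               # Cov^{-1} L
        P[:, v] = CiL @ np.linalg.inv(L_v.T @ CiL)        # transpose of the bracket in Eq. (18)
    return P

# Toy usage: 16 sensors, 10 voxels, 2000 samples of random placeholder data.
rng = np.random.default_rng(0)
L = rng.standard_normal((16, 10, 3))
Q = rng.standard_normal((16, 2000))
P = lcmv_filter(L, np.cov(Q))

J_hat_0 = P[:, 0].T @ Q        # 3-dimensional source estimate at the first voxel
gain_0 = P[:, 0].T @ L[:, 0]   # should be the 3x3 identity by Eq. (17)
```

The product `gain_0` makes the unit-gain constraint of Eq. (17) explicit: activity from the target location passes through the filter unchanged.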

Dynamic imaging of coherent sources

DICS (Gross et al., 2001) is the frequency-domain equivalent of LCMV. In contrast to LCMV, DICS estimates spatial filters separately for each spectral frequency. The DICS filter $P^D$ is evaluated for a given frequency $f$ using the real part of the sensor-level cross-spectral density matrix $S_Q$:

$$P_v^{D\top}(f) = \left( L_v^{v\top} S_Q(f)^{-1} L_v^v \right)^{-1} L_v^{v\top} S_Q(f)^{-1} \tag{19}$$

with

$$S_Q(f) = \left\langle q(f,e)\, q^{*}(f,e) \right\rangle_e \in \mathbb{C}^{N_s \times N_s}, \tag{20}$$

where $(\cdot)^{*}$ denotes complex conjugation and $q(f,e)$ denotes the Fourier transform of the sensor measurements $\tilde{q}(t,e)$. That is, the time-domain sensor signal $\tilde{Q}$ is cut into $N_e$ epochs of $T$ time samples to derive $\tilde{q}(t,e)$, then multiplied with a Hanning window of length $T$, and Fourier-transformed epoch by epoch to derive $q(f,e)$.

The beamformer filter $P^D(f) = [P_1^D(f), \ldots, P_{N_v}^D(f)]$ can then be used to project the sensor cross-spectrum to source space:

$$S_J(f) = P^{D\top}(f)\, S_Q(f)\, P^D(f) \in \mathbb{C}^{3N_v \times 3N_v}. \tag{21}$$
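For a single frequency, Eqs. (19) and (21) can be sketched as below. The cross-spectrum here is a random Hermitian placeholder rather than one estimated from data, and the function name `dics_filter` is hypothetical; the sketch assumes the filter is stored with sensors along the rows.

```python
import numpy as np

def dics_filter(L, S_f):
    """L: leadfield (n_sensors, n_voxels, 3); S_f: complex cross-spectrum at one
    frequency, shape (n_sensors, n_sensors). Returns P of shape (n_sensors, 3*n_voxels)."""
    S_inv = np.linalg.inv(S_f.real)            # the filter uses the real part only
    blocks = []
    for v in range(L.shape[1]):
        L_v = L[:, v]
        SiL = S_inv @ L_v
        blocks.append(SiL @ np.linalg.inv(L_v.T @ SiL))   # Eq. (19), stored untransposed
    return np.hstack(blocks)

# Toy usage with a random Hermitian positive-definite "cross-spectrum".
rng = np.random.default_rng(0)
n_s, n_v = 16, 10
L = rng.standard_normal((n_s, n_v, 3))
X = rng.standard_normal((n_s, 200)) + 1j * rng.standard_normal((n_s, 200))
S_f = X @ X.conj().T / 200

P = dics_filter(L, S_f)
S_J = P.T @ S_f @ P                            # Eq. (21): source-level cross-spectrum
```

In a full analysis this computation would be repeated for every frequency bin, which is what distinguishes DICS from the single broadband filter of LCMV.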

Based on previous literature described above, we hypothesize that the beamformer solutions (LCMV and DICS) perform better than eLORETA when used in combination with undirected FC measures. However, since directed FC measures need to aggregate information across frequencies, we hypothesize that the estimation of such measures might be negatively affected by DICS source reconstruction. Concretely, we expect that DICS’ ability to optimize SNR per frequency and, thereby, to reconstruct different sources for each frequency can be counterproductive in cases where in fact the same pairs of sources are interacting at multiple frequencies. In contrast, we expect that LCMV, which reconstructs a single set of sources by optimizing the SNR across the whole frequency spectrum, would yield more consistent source cross-spectra and, therefore, better directed FC estimates than DICS.

Champagne

Champagne (Wipf et al., 2010) uses hierarchical sparse Bayesian inference for inverse modelling. Specifically, it imposes a zero-mean Gaussian prior independently for each source voxel. The prior source covariance is given by

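$$\Gamma = \begin{bmatrix} \Gamma_1 & 0 & \cdots & 0 \\ 0 & \Gamma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Gamma_{N_v} \end{bmatrix} \in \mathbb{R}^{3N_v \times 3N_v}, \tag{22}$$

where $\Gamma_v$ is the 3×3 covariance of the $v$-th voxel. Here we use a Champagne variant that models each $\Gamma_v$ as a full positive-definite matrix

$$\Gamma_v = \begin{bmatrix} \gamma_{v,1} & \gamma_{v,4} & \gamma_{v,5} \\ \gamma_{v,4} & \gamma_{v,2} & \gamma_{v,6} \\ \gamma_{v,5} & \gamma_{v,6} & \gamma_{v,3} \end{bmatrix} \tag{23}$$

with six parameters. The prior source variances and covariances in $\Gamma$ are treated as model hyperparameters and are optimized in an iterative way. For any given choice of $\Gamma$, the posterior distribution of the source activity is given by (Wipf et al., 2010):

$$p(\tilde{J} \mid \tilde{Q}, \gamma) = \prod_{t=1}^{N_t} \mathcal{N}\left( \hat{j}(t), \Sigma_j \right), \quad \text{where} \tag{24}$$
$$\hat{j}(t) = \Gamma L_v^\top (\Sigma_q)^{-1}\, \tilde{q}(t) = P^{C\top} \tilde{q}(t) \tag{25}$$
$$\Sigma_j = \Gamma - \Gamma L_v^\top (\Sigma_q)^{-1} L_v \Gamma \tag{26}$$
$$\Sigma_q = \sigma^2 I + L_v \Gamma L_v^\top, \tag{27}$$

and where $\sigma^2$ denotes a homoscedastic sensor noise variance parameter. The posterior parameters $\hat{j}(t)$ and $\Sigma_j$ are then used to obtain the next estimate of $\gamma$ by minimizing the negative log model evidence (Bayesian Type-II likelihood):

$$L^{II}(\gamma) = -\log p(\tilde{Q} \mid \gamma) = \frac{1}{N_t} \sum_{t=1}^{N_t} \tilde{q}(t)^\top \Sigma_q^{-1}\, \tilde{q}(t) + \log \lvert \Sigma_q \rvert. \tag{28}$$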

This process is repeated until convergence. Importantly, the majority of source variance parameters converges to zero in the course of the optimization, so that the reconstructed source distribution becomes sparse.
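The posterior quantities of Eqs. (25) to (28) for one fixed choice of $\Gamma$ can be evaluated as in the sketch below. It does not implement the hyperparameter update, so it is not a Champagne implementation; the leadfield, the prior blocks, the noise variance and the data are random placeholders chosen for illustration.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
n_s, n_v, sigma2 = 12, 5, 0.1                 # toy sizes and noise variance (illustrative)
L = rng.standard_normal((n_s, 3 * n_v))       # placeholder leadfield, not a head model

# Block-diagonal prior covariance: one random positive-definite 3x3 block per voxel.
blocks = []
for _ in range(n_v):
    A = rng.standard_normal((3, 3))
    blocks.append(A @ A.T + 0.1 * np.eye(3))
Gamma = block_diag(*blocks)                   # Eq. (22)

Sigma_q = sigma2 * np.eye(n_s) + L @ Gamma @ L.T     # Eq. (27)
Sigma_q_inv = np.linalg.inv(Sigma_q)
P_C_T = Gamma @ L.T @ Sigma_q_inv                    # transposed filter in Eq. (25)
Sigma_j = Gamma - P_C_T @ L @ Gamma                  # Eq. (26)

Q = rng.standard_normal((n_s, 100))                  # placeholder sensor data
J_hat = P_C_T @ Q                                    # posterior mean for every sample

# Negative log model evidence of Eq. (28), up to additive constants.
quad = np.einsum("st,su,ut->t", Q, Sigma_q_inv, Q)
loss = quad.mean() + np.linalg.slogdet(Sigma_q)[1]
```

A useful sanity check is that `Sigma_j` agrees with the equivalent expression $(\Gamma^{-1} + L_v^\top L_v / \sigma^2)^{-1}$, which follows from the matrix inversion lemma.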

In the original Champagne version, a baseline or control measurement is used to estimate noise covariance in sensor data. Since baseline data are not available in our study, we use a homoscedastic noise model in which all sensors are assumed to be perturbed by uncorrelated Gaussian white noise with equal variance, and estimate the shared variance parameter using five-fold spatial cross-validation (Hashemi et al., 2021). Again, fifteen candidate parameters are taken from a logarithmically spaced range between $0.01 \cdot \mathrm{Tr}(\mathrm{Cov}_{\tilde{Q}})$ and $\mathrm{Tr}(\mathrm{Cov}_{\tilde{Q}})$.

2.3. Dimensionality reduction

To aggregate time series of multiple sources within a region, an intuitive approach would be to take the mean across sources within each spatial dimension. However, this approach has two disadvantages: First, it assumes a high homogeneity within all voxels of a pre-defined region, which is not always given. Second, it does not offer a solution for aggregating the three spatial dimensions, since averaging across these might lead to cancellations due to different polarities.

Principal component analysis

An alternative approach is to reduce the dimensionality of multiple time series by employing a singular value decomposition (SVD) or, equivalently, principal component analysis (PCA), and to subsequently only select the C strongest PCs accounting for most of the variance within a region for further processing. Let $\tilde{J}_r \in \mathbb{R}^{N_t \times 3R}$ denote the reconstructed broad-band source time courses of $R$ dipolar sources within a single region $r$ after mean subtraction (in this subsection, $R$ denotes the number of sources within the region rather than the number of regions). The covariance matrix $\mathrm{Cov}_r = \frac{\tilde{J}_r^\top \tilde{J}_r}{N_t - 1} \in \mathbb{R}^{3R \times 3R}$ is a symmetric matrix that can be diagonalized as

$$\mathrm{Cov}_r = V B V^\top, \tag{29}$$

where $B \in \mathbb{R}^{3R \times 3R}$ is a diagonal matrix containing the eigenvalues $\lambda_v$ (variances) of the PCs, which are, without loss of generality, assumed to be given in descending order, and $V \in \mathbb{R}^{3R \times 3R}$ is a matrix of corresponding eigenvectors in which each column contains one eigenvector. The $j$th PC can then be found in the $j$th column of $\tilde{J}_r V$.

In practice, the PCs are calculated using an SVD of the zero-mean data matrix $\tilde{J}_r$ as

$$\tilde{J}_r = U D V^\top. \tag{30}$$

Using the ‘economy version’ of the SVD, $U \in \mathbb{R}^{N_t \times 3R}$ is a matrix of orthonormal PC time courses, $D \in \mathbb{R}^{3R \times 3R}$ is a diagonal matrix of corresponding singular values, and $V \in \mathbb{R}^{3R \times 3R}$ is the matrix of eigenvectors (or, equivalently, singular vectors) defined above. Note that the squares of the diagonal elements of $D$, divided by $N_t - 1$, are identical to the variances of the corresponding PCs (eigenvalues of $\mathrm{Cov}_r$). Each squared singular value, normalized by the sum of all squared singular values, thus corresponds to the fraction of variance explained by the corresponding PC. We will use this property for the two VARPC pipelines (Section 2.5).

Comparing PCA and SVD, one can easily see that

$$\mathrm{Cov}_r = \frac{V D U^\top U D V^\top}{N_t - 1} = V \frac{D^2}{N_t - 1} V^\top, \tag{31}$$

and $\lambda_v = \frac{d_v^2}{N_t - 1}$. Thus, the PCs can also be calculated with SVD:

$$\tilde{J}_r V = U D V^\top V = U D. \tag{32}$$

To reduce the dimensionality of the voxel data within one region, we keep only the strongest C PCs, i.e., the columns of UD that correspond to the largest eigenvalues. For a more extensive overview of the relationship between SVD and PCA, we refer to Wall et al. (2003). Note that in this study, we applied SVD on the time-domain source signals J˜r for most of the pipelines. However, we applied PCA on the real part of the source-level cross-spectrum, summed across frequencies, in case of DICS. For the ease of reading, we will stick to PCA terminology for all pipelines in the following.
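The relationship between PCA and SVD in Eqs. (29) to (32) can be checked numerically with the following sketch. The data are random placeholders standing in for the source time courses of one region, and the function name `region_pcs` is hypothetical.

```python
import numpy as np

def region_pcs(J_r, n_comp):
    """J_r: source time courses of one region, shape (n_samples, n_dims).
    Returns the n_comp strongest PCs, all PC variances, and the eigenvector matrix V."""
    J_r = J_r - J_r.mean(axis=0)                        # mean subtraction
    U, d, Vt = np.linalg.svd(J_r, full_matrices=False)  # economy SVD, Eq. (30)
    pcs = U[:, :n_comp] * d[:n_comp]                    # first columns of UD, Eq. (32)
    variances = d ** 2 / (J_r.shape[0] - 1)             # eigenvalues of the covariance
    return pcs, variances, Vt.T

# Toy usage: 4 sources with 3 orientations each, 500 samples, mixed to be correlated.
rng = np.random.default_rng(0)
J_r = rng.standard_normal((500, 12)) @ rng.standard_normal((12, 12))
pcs, variances, V = region_pcs(J_r, n_comp=3)

explained = variances / variances.sum()                 # fraction of variance per PC
cov_eigvals = np.linalg.eigvalsh(np.cov(J_r, rowvar=False))[::-1]   # descending order
```

`np.linalg.svd` returns the singular values in descending order, so the first columns of `UD` are already the strongest PCs.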

It has been popular in the literature (Basti, Nili, Hauk, Marzetti, Henson, 2020, Friston, Rotshtein, Geng, Sterzer, Henson, 2006) to select only the first PC for every region and subsequently employ a univariate FC measure for further processing. We describe this approach further in Section 2.5, pipeline FIXPC1.

2.4. Connectivity metrics

There are numerous approaches to estimate FC (Schoffelen and Gross, 2019). One key distinction can be made between FC metrics that measure undirected (symmetric) interactions between signals and those that also measure the direction of FC.

It has been shown that the estimation of both undirected and directed FC from M/EEG recordings is complicated by the presence of mixed noise and signal sources (Bastos, Schoffelen, 2016, Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004, Schaworonkow, Nikulin, 2021, Wang, Lobier, Siebenhühner, Puoliväli, Palva, Palva, 2018). Due to volume conduction in the brain, signal sources from all parts of the brain superimpose at each M/EEG sensor. Projecting the sensor signals to source space can help disentangling separate signal sources. However, a signal reconstructed at a specific source voxel may still contain contributions from other sources in its vicinity. This phenomenon is called source leakage (Schoffelen and Gross, 2009).

Volume conduction and source leakage can lead to spurious FC despite the absence of genuine interactions (Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004). To overcome this problem, robust FC metrics have been developed (Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004, Nolte, Ziehe, Nikulin, Schlögl, Krämer, Brismar, Müller, 2008, Winkler, Panknin, Bartz, Müller, Haufe, 2016). Robustness is here referred to as the property of an FC measure to converge to zero in the limit of infinite data when the observed data are just instantaneous mixtures of independent sources (Nolte et al., 2004). Robust FC metrics use that spurious interactions due to signal mixing are instantaneous, while physiological interactions impose a small time delay. Robust FC metrics are therefore only sensitive to statistical dependencies with a non-zero time delay while eliminating zero-delay contributions.

We here test six different FC measures, four to detect undirected FC (coherence, iCOH, MIC, and MIM), and two measures that estimate the direction of interaction between two sources (multivariate GC and TRGC). This selection includes four robust FC metrics (c.f. Section 1) and two non-robust ones (coherence and GC). Based on the literature described above, we hypothesize that robust metrics will perform better than non-robust metrics. Please note that all tested FC metrics are frequency-resolved. That is, all metrics output an Nroi×Nroi×Nfreq tensor that contains the estimated FC for all region pairs at all frequencies. However, since we expect the interaction to be located in the interacting frequency band between 8 and 12 Hz (see Section 2.1), we select only those frequency bins within this band and average the FC scores across them. As a result, we obtain an Nroi×Nroi matrix.
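The reduction from the frequency-resolved tensor to a single matrix can be sketched as below. The FC values are random placeholders, and treating both band edges (8 and 12 Hz) as inclusive is an assumption of this example.

```python
import numpy as np

fs, T = 100, 200                              # sampling rate (Hz) and epoch length (samples)
freqs = np.arange(T // 2 + 1) * fs / T        # 0, 0.5, ..., 50 Hz
n_roi = 68

rng = np.random.default_rng(0)
fc = rng.random((n_roi, n_roi, freqs.size))   # placeholder FC tensor (regions x regions x frequencies)

band = (freqs >= 8) & (freqs <= 12)           # bins inside the interacting band (edges inclusive)
fc_band = fc[:, :, band].mean(axis=-1)        # average across the selected bins
```

With a frequency resolution of 0.5 Hz, this selection covers nine bins.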

All tested FC metrics are derived from the cross-spectrum. Let $\tilde{x}(t,e) \in \mathbb{R}^{K}$ and $\tilde{y}(t,e) \in \mathbb{R}^{L}$ be two multivariate time series, where $t \in \{1, \ldots, T\}$ indexes samples within epochs of 2 seconds length and $e$ indexes epochs. Often, $K = L = 3$ represents the three dipole orientations of two reconstructed current sources. In other cases, $K$ and $L$ denote the numbers of retained data dimensions of two brain regions after dimensionality reduction (e.g., PCA). These time-domain data are then multiplied with a Hanning window and Fourier transformed into $x(f,e)$ and $y(f,e)$, where $f \in \{0, 0.5, \ldots, 50\}$ Hz indexes frequencies. The joint cross-spectrum is then computed from the Fourier-transformed data as

$$S_{[xy]}(f) = \begin{bmatrix} S_{xx}(f) & S_{xy}(f) \\ S_{yx}(f) & S_{yy}(f) \end{bmatrix} \in \mathbb{C}^{(K+L) \times (K+L)}, \tag{33}$$

where $S_{xy}(f) = \langle x(f,e)\, y^{*}(f,e) \rangle_{e} \in \mathbb{C}^{K \times L}$, with $\langle \cdot \rangle_{e}$ denoting the average across epochs and $(\cdot)^{*}$ the conjugate transpose.
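As a rough illustration of this estimate, the following NumPy sketch tapers each epoch with a Hanning window, applies the FFT, and averages the outer products across epochs. The function name `cross_spectrum`, the array layout, and the random toy data are assumptions; the sampling rate of 100 Hz is inferred from 2-second epochs with $T = 200$ samples.

```python
import numpy as np

def cross_spectrum(data, fs):
    """Epoch-averaged cross-spectrum.

    data : real array, shape (n_channels, n_samples, n_epochs)
    fs   : sampling rate in Hz
    Returns (freqs, S) with S of shape (n_freqs, n_channels, n_channels).
    """
    n_chan, n_samp, n_epochs = data.shape
    window = np.hanning(n_samp)[None, :, None]       # Hanning taper per epoch
    spec = np.fft.rfft(data * window, axis=1)        # (n_chan, n_freqs, n_epochs)
    freqs = np.fft.rfftfreq(n_samp, d=1.0 / fs)
    # S[f, i, j] = mean over epochs of spec_i(f, e) * conj(spec_j(f, e))
    S = np.einsum("ife,jfe->fij", spec, spec.conj()) / n_epochs
    return freqs, S

rng = np.random.default_rng(0)
K, L = 3, 3
data = rng.standard_normal((K + L, 200, 50))   # toy data: 50 epochs of 2 s at 100 Hz
freqs, S = cross_spectrum(data, fs=100.0)
S_xx, S_xy = S[:, :K, :K], S[:, :K, K:]        # blocks of the joint cross-spectrum
```

With 200 samples per epoch at 100 Hz, the frequency axis runs from 0 to 50 Hz in steps of 0.5 Hz, matching the grid used in the text.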

Coherence and imaginary part of coherency

(Absolute) coherence (COH) and iCOH are measures of the synchronicity of two time series. Both coherence and iCOH are derived from the complex-valued coherency, which is a generalization of correlation in the frequency domain. As such, coherency quantifies the linear relationship between two time series at a specific frequency. Its phase expresses the average phase difference between the two time series, whereas its absolute value expresses the stability of the phase difference.

Complex-valued coherency $C_{xy} \in \mathbb{C}^{K \times L}$ is the normalized cross-spectrum (Nunez et al., 1997):

$$C_{xy}(f) = \frac{S_{xy}(f)}{\left(S_{xx}(f)\, S_{yy}(f)\right)^{1/2}}. \tag{34}$$

Based on the terminology of Nolte et al. (2004), we define coherence as the absolute value of coherency: $\mathrm{COH}_{xy}(f) = |C_{xy}(f)| \in \mathbb{R}^{K \times L}$, where $|\cdot|$ denotes the absolute value. Coherence captures both zero-delay and non-zero-delay synchronization between two time series. This can be problematic in the context of M/EEG measurements, where substantial zero-delay synchronization can be introduced by signal spread due to volume conduction or source leakage in the absence of genuine interactions between distinct brain areas (Nolte et al., 2004). In contrast, the imaginary part of coherency is a robust FC measure since it is only non-zero for interactions with a phase delay different from multiples of $\pi$ (Nolte et al., 2004). Here, we use the absolute value of the imaginary part of coherency, $\mathrm{iCOH}_{xy}(f) = |C^{I}_{xy}(f)| \in \mathbb{R}^{K \times L}$, as a measure of synchronization strength, where $C^{I}$ denotes the imaginary part of $C$.

Note that both coherence and iCOH are not designed to aggregate FC between two multivariate time series into one FC score. A single FC score can be obtained by taking the average across all elements of COHxy or iCOHxy, respectively.
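The difference between the two measures can be seen in a toy example. The sketch below works directly on hypothetical Fourier coefficients (one value per epoch) and assumes that the normalization in the coherency formula is applied element-wise using the spectral powers on the diagonals; the helper names `blocks` and `coherency` are ours.

```python
import numpy as np

def blocks(x, y):
    """Cross-spectral blocks from Fourier coefficients
    x (K, n_epochs) and y (L, n_epochs) at a single frequency."""
    n = x.shape[1]
    return x @ x.conj().T / n, x @ y.conj().T / n, y @ y.conj().T / n

def coherency(S_xx, S_xy, S_yy):
    """Element-wise coherency; normalization by channel powers (assumption)."""
    power_x = np.real(np.diag(S_xx))
    power_y = np.real(np.diag(S_yy))
    return S_xy / np.sqrt(np.outer(power_x, power_y))

rng = np.random.default_rng(1)
x = rng.standard_normal((1, 100)) + 1j * rng.standard_normal((1, 100))
y_lagged = x * np.exp(-1j * np.pi / 2)   # quarter-cycle phase lag
y_copy = x.copy()                         # zero-lag copy, as produced by signal spread

C_lag = coherency(*blocks(x, y_lagged))
C_copy = coherency(*blocks(x, y_copy))
# Single FC scores: average across all K x L elements
coh_lag, icoh_lag = np.abs(C_lag).mean(), np.abs(C_lag.imag).mean()
coh_copy, icoh_copy = np.abs(C_copy).mean(), np.abs(C_copy.imag).mean()
```

Coherence is maximal in both cases, whereas iCOH is maximal only for the lagged signal and vanishes for the zero-lag copy.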

Multivariate interaction measure and maximized imaginary coherency

The multivariate interaction measure (MIM) and maximized imaginary coherency (MIC, Ewald et al., 2012) are multivariate generalizations of iCOH and are therefore also robust against source leakage.

MIM is defined as follows:

$$\mathrm{MIM}_{xy}(f) = \operatorname{Tr}\left[ \left(C^{R}_{xx}(f)\right)^{-1} C^{I}_{xy}(f) \left(C^{R}_{yy}(f)\right)^{-1} \left(C^{I}_{xy}(f)\right)^{\top} \right], \tag{35}$$

where $C^{R}$ denotes the real part of $C$. In contrast, MIC aims at maximizing iCOH between the two multivariate time series. That is, MIC finds projections from two multi-dimensional spaces to two one-dimensional spaces such that iCOH between the projected signals becomes maximal:

$$\mathrm{MIC}_{xy}(f) = \max_{a,b} \left( \frac{a^{\top} \tilde{S}^{I}_{xy}(f)\, b}{|a|\,|b|} \right), \tag{36}$$

where $\tilde{S}$ is a whitened version of the cross-spectrum $S$ (Ewald et al., 2012), and where $a \in \mathbb{R}^{K \times 1}$ and $b \in \mathbb{R}^{L \times 1}$ are projection weight vectors corresponding to the subspaces, or regions, of $x$ and $y$, respectively. Note that, while the imaginary part itself can be positive or negative, flipping the sign of either $a$ or $b$ will also flip the sign of the imaginary part. Thus, without loss of generality, maximization of Eq. (36) will find the imaginary part with the strongest magnitude.

All undirected FC metrics (COH, iCOH, MIC, and MIM) are bounded between 0 and 1.
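Both multivariate measures can be computed from one whitened matrix. The following sketch assumes whitening with the inverse square roots of the real parts of the within-region cross-spectra; MIM is then the squared Frobenius norm of the whitened imaginary part and MIC its largest singular value. The function names and toy data are hypothetical, and the cross-spectrum is used in place of coherency (the trace is unchanged by a per-channel rescaling).

```python
import numpy as np

def _inv_sqrt(M):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T

def mim_mic(S_xx, S_xy, S_yy):
    """MIM and MIC at one frequency from cross-spectral blocks."""
    E = _inv_sqrt(S_xx.real) @ S_xy.imag @ _inv_sqrt(S_yy.real)
    mim = np.sum(E ** 2)                          # equals trace(E @ E.T)
    mic = np.linalg.svd(E, compute_uv=False)[0]   # maximum over projections a, b
    return mim, mic

rng = np.random.default_rng(2)
n_epochs, K = 200, 3
x = rng.standard_normal((K, n_epochs)) + 1j * rng.standard_normal((K, n_epochs))
noise = rng.standard_normal((K, n_epochs)) + 1j * rng.standard_normal((K, n_epochs))
y = x * np.exp(-1j * np.pi / 4) + 0.5 * noise     # phase-lagged, noisy copy of x

S_xx = x @ x.conj().T / n_epochs
S_xy = x @ y.conj().T / n_epochs
S_yy = y @ y.conj().T / n_epochs
mim, mic = mim_mic(S_xx, S_xy, S_yy)
```

Since the squared singular values of the whitened matrix sum to MIM, the squared MIC can never exceed MIM.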

Multivariate Granger causality and time-reversed Granger causality

Granger Causality (GC) defines directed interactions between time series using a predictability argument (Bressler and Seth, 2011; Granger, 1969). Considering two univariate time series $\tilde{x}(t)$ and $\tilde{y}(t)$, we say that $\tilde{y}$ Granger-causes $\tilde{x}$ if the past of $\tilde{y}$ improves the prediction of the present of $\tilde{x}$ above and beyond what we could predict from the past of $\tilde{x}$ alone. That is, GC does not only assess the existence of a connection but also estimates the direction of that connection. We here use a spectrally resolved multivariate extension of GC (Barnett and Seth, 2014; Barrett et al., 2010; Geweke, 1982), which allows us to estimate Granger-causal influences between groups of variables at individual frequencies. There are multiple strategies to arrive at spectral Granger causality estimates. Here, we follow recommendations made in Barnett et al. (2018), Barnett and Seth (2014, 2015), and Faes et al. (2017) that ensure stable and unbiased estimates, and use Matlab code provided by the respective authors.

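We first transform the joint cross-spectrum into an autocovariance sequence $G_{[xy]}(p) \in \mathbb{R}^{(K+L) \times (K+L)}$ with lags $p \in \{0, 1, \ldots, N_P\}$, $N_P = 20$, using the inverse Fourier transform. The autocovariance sequence is further used to estimate the parameters $A(p) \in \mathbb{R}^{(K+L) \times (K+L)}$, $p \in \{1, \ldots, N_P\}$, and $\Sigma = \mathrm{Cov}_t[\epsilon(t)] \in \mathbb{R}^{(K+L) \times (K+L)}$ of a linear autoregressive model

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \sum_{p=1}^{N_P} A(p) \begin{bmatrix} x(t-p) \\ y(t-p) \end{bmatrix} + \epsilon(t) \tag{37}$$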

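of order $N_P$ using Whittle's algorithm (Barnett and Seth, 2014; Whittle, 1963). Autoregressive model parameters are next converted into a state-space representation $(\bar{A}, \bar{C}, \bar{K}, \bar{\Sigma})$ corresponding to the model

$$z(t+1) = \bar{A}\, z(t) + \bar{K}\, \varepsilon(t) \tag{38}$$
$$\begin{bmatrix} \bar{x}(t) \\ \bar{y}(t) \end{bmatrix} = \bar{C}\, z(t) + \varepsilon(t), \tag{39}$$

using the method of Aoki and Havenner (1991), where $\bar{x}(t) = [x(t), x(t-1), \ldots, x(t-N_P)]$ and $\bar{y}(t) = [y(t), y(t-1), \ldots, y(t-N_P)]$ are temporal embeddings of order $N_P$, $z(t) \in \mathbb{R}^{(K+L)N_P}$ and $\varepsilon(t) \in \mathbb{R}^{(K+L)N_P}$ are unobserved variables, and all parameters are $(K+L)N_P \times (K+L)N_P$ matrices. Subsequently, the transfer function $H(z) \equiv I + \bar{C}(I - \bar{A}z)^{-1}\bar{K}z \in \mathbb{C}^{(K+L)N_P \times (K+L)N_P}$ of a moving-average representation

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = H(z) \cdot \varepsilon(t) \tag{40}$$

of the observations is derived, where $I \in \mathbb{R}^{(K+L)N_P \times (K+L)N_P}$ denotes the identity matrix and where $z = e^{-i 4 \pi f / T}$ for a vector of frequencies $f \in \{0\,\text{Hz}, 0.5\,\text{Hz}, \ldots, 50\,\text{Hz}\}$, $T = 200$, and a factorization of the joint cross-spectrum is obtained as $S_{[xy]}(f) = H(f)\, \bar{\Sigma}\, H^{*}(f)$ (Barnett and Seth, 2015). Frequency-dependent Granger scores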

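$$F_{x \to y}(f) = \log \frac{\left| S_{yy}(f) \right|}{\left| S_{yy}(f) - H_{yx}(f)\, \bar{\Sigma}_{xx|y}\, H^{*}_{yx}(f) \right|} \tag{41}$$

and (analogously) $F_{y \to x}(f)$ are then calculated, where $H(f)$ and $\bar{\Sigma}$ are partitioned in the same way as $S(f)$, where $\bar{\Sigma}_{xx|y} := \bar{\Sigma}_{xx} - \bar{\Sigma}_{xy} \bar{\Sigma}_{yy}^{-1} \bar{\Sigma}_{yx}$ denotes a partial covariance matrix, and where $|\cdot|$ here denotes the matrix determinant (Barnett and Seth, 2015). Finally, differences

$$F^{\text{net}}_{x \to y}(f) := F_{x \to y}(f) - F_{y \to x}(f) \tag{42}$$

and $F^{\text{net}}_{y \to x}(f) = -F^{\text{net}}_{x \to y}(f)$ summarizing the net information flow between the multivariate time series $\tilde{x}(t)$ and $\tilde{y}(t)$ are calculated (Winkler et al., 2016).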

Just like coherence, GC is not robust, i.e., it can deliver spurious results for mixtures of independent sources as a result of volume conduction or source leakage (e.g., Haufe et al., 2012, 2013). This can easily be seen by considering a single source that spreads into two measurement channels, which are superimposed by distinct noise terms. In that case, both channels will mutually improve each other's prediction in the sense of GC (Haufe and Ewald, 2019). This problem is overcome by a robust version of GC, time-reversed GC (TRGC), which introduces a test on the temporal order of the time series. That is, TRGC estimates the directed information flow once on the original time series and once on a time-reversed version of the time series. If GC is reduced or even reversed when the temporal order of the time series is reversed, it is likely that the effect is not an artifact coming from volume conduction (Haufe et al., 2012, 2013; Vinck et al., 2015; Winkler et al., 2016). Formally, multivariate spectral GC as introduced above can be evaluated on the time-reversed data by fitting the autoregressive model in Eq. (37) on the transposed autocovariance sequence $G^{\text{TR}}_{[xy]}(p) = G^{\top}_{[xy]}(p)$, $p \in \{0, 1, \ldots, N_P\}$. This yields net GC scores $F^{\text{TRnet}}_{x \to y}(f)$ for the time-reversed data, which are subtracted from the net scores obtained for the original (forward) data to yield the final time-reversed GC scores:

$$F^{\text{TRGC}}_{x \to y}(f) := F^{\text{net}}_{x \to y}(f) - F^{\text{TRnet}}_{x \to y}(f) \tag{43}$$

and (analogously) $F^{\text{TRGC}}_{y \to x}(f) := F^{\text{net}}_{y \to x}(f) - F^{\text{TRnet}}_{y \to x}(f) = -F^{\text{TRGC}}_{x \to y}(f)$.
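The logic of the time-reversal test can be illustrated with a deliberately simplified stand-in. The sketch below is not the spectral, multivariate state-space estimator described above: it uses a time-domain, univariate Granger score based on ordinary least-squares autoregression and reverses the time series directly instead of transposing the autocovariance sequence. All function names, the model order, and the simulated data are assumptions.

```python
import numpy as np

def _resid_var(target, predictors, order):
    """Residual variance of a least-squares prediction of `target`
    from `order` past samples of every series in `predictors`."""
    n = target.size
    lagged = [p[order - k: n - k] for p in predictors for k in range(1, order + 1)]
    X = np.column_stack(lagged)
    y = target[order:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

def gc(src, dst, order):
    """Time-domain Granger score src -> dst (log ratio of residual variances)."""
    restricted = _resid_var(dst, [dst], order)
    full = _resid_var(dst, [dst, src], order)
    return np.log(restricted / full)

def trgc(x, y, order=5):
    """Net GC on the original data minus net GC on the time-reversed data."""
    net = gc(x, y, order) - gc(y, x, order)
    net_tr = gc(x[::-1], y[::-1], order) - gc(y[::-1], x[::-1], order)
    return net - net_tr

rng = np.random.default_rng(0)
n = 5000
x = rng.standard_normal(n)
y = np.zeros(n)
y[2:] = 0.8 * x[:-2]                       # x drives y with a delay of two samples
y += 0.5 * rng.standard_normal(n)
score_xy = trgc(x, y)
score_yx = trgc(y, x)
```

For a genuine delayed influence from x to y, reversing time flips the apparent direction, so the difference of the two net scores is positive for x to y and, by construction, exactly negated for y to x.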

2.5. Pipelines

In the following section, we describe the processing pipelines that were tested. All pipelines take the sensor measurements $\tilde{Q}$ as input. Then all pipelines calculate and apply an inverse model $P$ to project sensor data to source level. From there, we aggregate voxel activity within regions by employing PCA and estimate inter-regional FC with the various FC metrics described above. We describe several strategies of combining PCA with the calculation of FC in the following subsections. This step results in an $N_{\text{roi}} \times N_{\text{roi}} \times N_{\text{freq}}$ FC tensor, which is then averaged across the frequency bins within the interaction frequency band (8–12 Hz). The output of all pipelines is one connectivity score for every region combination. As an example, we describe the processing for the calculation of FC between two regions X and Y.

Pipelines FIXPC1 to FIXPC6: Fixed number of principal components

The first six pipelines use PCA dimensionality reduction. Afterwards, depending on the pipeline, a fixed number C of either one, two, three, four, five, or six strongest PCs are selected for further processing. Then, FC is calculated: in case of univariate measures (i.e., coherence and iCOH), we first calculate FC scores between all PC combinations of the two regions X and Y and then average across all pairwise FC scores. In case of multivariate FC measures, we directly calculate a single FC score between the PCs of region X and those of region Y. This approach has been used previously (e.g. Schoffelen et al., 2017).
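A minimal sketch of the within-region dimensionality reduction is given below. It assumes that PCA is applied to the mean-centered time series of all voxels and dipole orientations of one region and that the strongest components are those with the largest variance; the function name `region_pcs` and the toy data are hypothetical.

```python
import numpy as np

def region_pcs(voxel_ts, n_pcs):
    """PCA of one region.

    voxel_ts : (n_samples, n_dims) source time series, n_dims = voxels x orientations
    Returns the (n_samples, n_pcs) time series of the strongest components and
    the fraction of variance explained by every component.
    """
    centered = voxel_ts - voxel_ts.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return centered @ Vt[:n_pcs].T, explained

rng = np.random.default_rng(3)
latent = rng.standard_normal((1000, 2))                 # two underlying processes
mixing = rng.standard_normal((2, 30))                   # spread over 30 dimensions
ts = latent @ mixing + 0.1 * rng.standard_normal((1000, 30))
pcs, explained = region_pcs(ts, n_pcs=3)                # FIXPC3-style selection
```

The returned component time series are mutually uncorrelated and can be passed directly to a multivariate FC metric, or pairwise to a univariate one.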

Pipelines VARPC90 and VARPC99: Variable numbers of principal components

Pipelines VARPC90 and VARPC99 are equivalent to the FIXPC pipelines, with the difference that we do not select the same fixed number of PCs for every region. Instead, we select the number of PCs such that at least 90% (VARPC90) or 99% (VARPC99) of the variance in each ROI is preserved (cf. Section 2.3). Thus, an individual number of PCs is chosen for each region. FC is then calculated analogously to pipelines FIXPC1 to FIXPC6. The idea of selecting the number of PCs such that a pre-defined fraction of the variance is retained has been used in previous literature (e.g. Gómez-Herrero et al., 2008).
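The variance-based selection rule can be written compactly. In this sketch, `explained` is assumed to hold the per-component variance fractions of one region in descending order; the function name and the example values are made up for illustration.

```python
import numpy as np

def n_pcs_for_variance(explained, fraction):
    """Smallest number of leading PCs whose cumulative variance reaches `fraction`."""
    cumulative = np.cumsum(explained)
    # Small tolerance guards against floating-point round-off in the cumulative sum
    n = int(np.searchsorted(cumulative, fraction - 1e-12) + 1)
    return min(n, len(explained))

explained = np.array([0.70, 0.15, 0.08, 0.05, 0.02])   # hypothetical variance fractions
n90 = n_pcs_for_variance(explained, 0.90)              # VARPC90-style threshold
n99 = n_pcs_for_variance(explained, 0.99)              # VARPC99-style threshold
```

In this example, three components are needed to retain 90% of the variance and all five to retain 99%, which shows how the stricter threshold can inflate the number of retained dimensions.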

Pipeline MEANFC: Mean first FC second

In this pipeline, the time series of all voxels within one region are averaged separately for the three orthogonal dipole orientations. Then, for univariate FC measures, FC is calculated between all 3*3 dimension combinations of the 3D-time series of region X and region Y. Afterwards, the average of these nine FC scores is taken. Multivariate FC measures are directly calculated between the 3D time series.

Pipeline CENTRAL: Central voxel pick

In this pipeline, we select only the central voxel of each region for further processing. The central voxel of a region is defined as the voxel whose average Euclidean distance to all other voxels in the region is minimal. To calculate the FC score between the 3D time series of the central voxel of region X and the 3D time series of the central voxel of region Y, we proceed analogously to pipeline MEANFC: in case of univariate FC measures, the FC score for all combinations of dipole orientations is calculated and then averaged. In case of multivariate FC measures, only one FC score is calculated between the two 3D time series. Selecting the time series of the central voxel as the representative time series for the region has been done in previous studies (Perinelli et al., 2022).
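The definition of the central voxel translates directly into code. The sketch below assumes that voxel positions are available as an array of 3D coordinates and that a region contains at least two voxels; the function name and the toy coordinates are hypothetical.

```python
import numpy as np

def central_voxel(positions):
    """Index of the voxel with minimal mean Euclidean distance to all others.

    positions : (n_voxels, 3) array of voxel coordinates, n_voxels >= 2
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=2)                  # pairwise distances
    mean_dist = dist.sum(axis=1) / (positions.shape[0] - 1)
    return int(np.argmin(mean_dist))

# Toy region: five voxels on a line, one of them far away from the rest
positions = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [10, 0, 0]], dtype=float)
idx = central_voxel(positions)
```

Here the third voxel (index 2) is selected, since its summed distance to the other voxels is the smallest.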

Pipeline FCMEAN: FC first mean second

In pipeline FCMEAN, the multivariate FC between each 3D voxel time series of region X and each 3D voxel time series of region Y is calculated first. That is, if $R_X$ is the number of voxels in region X and $R_Y$ is the number of voxels in region Y, $R_X \cdot R_Y$ FC scores for all voxel combinations are calculated. To obtain a single FC score between region X and region Y, we then average all $R_X \cdot R_Y$ FC scores. Due to computational and time constraints, we test this pipeline only for MIM and MIC. This approach has also been used in the literature before (Babiloni et al., 2018).

Pipeline TRUEVOX: True voxel pick

This pipeline is used as a baseline. Here we select the voxel for further processing that indeed contains the activity of the given ROI—i.e. the ground-truth voxel (see Section 2.1). All further processing is analogous to pipeline CENTRAL. In configurations with two active voxels per region (see Section 3, Experiment 6), FC scores are calculated for 2*3*3 voxel- and dipole orientation combinations.

2.6. Performance evaluation

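We use a rank-based evaluation metric to assess the performance of the pipelines. All processing pipelines result in one FC score for every region–region combination. To evaluate the performance of a pipeline, we first sort all FC scores in descending order and retrieve the ranks $r \in \mathbb{R}^{N_I}$ of the ground-truth interactions, with $N_I \in \{1, 2, 3, 4, 5\}$ denoting the number of ground-truth interactions. Based on this rank vector, we calculate the raw percentile rank ($\mathrm{PR}'$):

$$\mathrm{PR}' = \frac{1}{N_I} \sum_{i=1}^{N_I} \left( 1 - \frac{r_i}{F} \right), \tag{44}$$

with $F$ denoting the total number of FC scores. $\mathrm{PR}'$ is then normalized to the perfect-skill ($\mathrm{PR}_{\text{ps}}$) and no-skill ($\mathrm{PR}_{\text{ns}}$) cases, so that the resulting PR is defined between 0 and 1:

$$\mathrm{PR}_{\text{ps}} = \frac{1}{N_I} \sum_{i=1}^{N_I} \left( 1 - \frac{i}{F} \right) \tag{45}$$
$$\mathrm{PR}_{\text{ns}} = \frac{1}{N_I} \sum_{i=1}^{N_I} \left( 1 - \frac{F - i + 1}{F} \right) \tag{46}$$
$$\mathrm{PR} = \frac{\mathrm{PR}' - \mathrm{PR}_{\text{ns}}}{\mathrm{PR}_{\text{ps}} - \mathrm{PR}_{\text{ns}}}. \tag{47}$$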

We report all PR values rounded to the second decimal. In case of the phase-based FC metrics, the PR is calculated on the original FC scores. In case of GC and TRGC, we separately evaluate each pipeline’s interaction detection ability, and its ability to determine the direction of the interaction. For evaluating the detection, we calculate the PR on the absolute values of the FC scores, whereas for evaluating the directionality determination performance, we calculate the PR only on the positive FC scores. Note that this is sufficient for the anti-symmetric directed FC measures used here.
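The normalized percentile rank can be computed as in the following sketch. It assumes that the FC scores of all region pairs are given as a flat vector, that rank 1 is assigned to the highest score, and that ties are broken by a stable sort; the function name and the toy scores are hypothetical.

```python
import numpy as np

def percentile_rank(fc_scores, true_idx):
    """Normalized percentile rank of the ground-truth pairs.

    fc_scores : 1-D array with one FC score per region pair (F entries)
    true_idx  : indices of the ground-truth interactions within fc_scores
    """
    scores = np.asarray(fc_scores, dtype=float)
    F = scores.size
    n_i = len(true_idx)
    order = np.argsort(-scores, kind="stable")      # descending order
    ranks = np.empty(F, dtype=int)
    ranks[order] = np.arange(1, F + 1)              # rank 1 = highest score
    pr_raw = np.mean(1 - ranks[list(true_idx)] / F)
    i = np.arange(1, n_i + 1)
    pr_ps = np.mean(1 - i / F)                      # true pairs occupy the top ranks
    pr_ns = np.mean(1 - (F - i + 1) / F)            # true pairs occupy the bottom ranks
    return (pr_raw - pr_ns) / (pr_ps - pr_ns)

scores = np.array([0.9, 0.8, 0.3, 0.5, 0.1, 0.4, 0.2, 0.7, 0.6, 0.05])
pr_best = percentile_rank(scores, [0, 1])    # ground truth = two highest scores
pr_worst = percentile_rank(scores, [4, 9])   # ground truth = two lowest scores
```

The two toy cases reproduce the end points of the scale: a PR of 1 when the ground-truth pairs receive the highest scores and a PR of 0 when they receive the lowest.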

2.7. Statistical assessment

In Experiment 1C, we provide a suggestion on how to statistically assess the presence of FC. Here, we obtain p-values by testing against a surrogate distribution consistent with the null hypothesis of zero interaction between all region pairs. The 10,000 samples of the surrogate distribution are drawn by shuffling epochs relative to each other when computing the cross-spectrum. More specifically, we calculate the cross-spectrum between the time series of one region and the shuffled time series of another region with the Welch method, where the diagonal entries of the cross-spectrum (spectral powers) are obtained without shuffling. From the shuffled cross-spectrum, MIM is calculated. We obtain p-values by counting the number of shuffled MIM-samples that are higher than the true MIM score and dividing this number by the total number of samples in the null distribution. FDR-correction (α-level = 0.05) is used on the upper triangle of the region–region p-value matrix to set a significance threshold.
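The shuffling test and the subsequent correction can be sketched as follows. Several simplifications are assumptions made for brevity: the absolute imaginary part of coherency between two univariate signals stands in for MIM, only 500 instead of 10,000 shuffles are drawn, and the FDR correction is implemented as the Benjamini-Hochberg procedure, which the text does not specify. Permuting the epochs of one signal leaves its spectral power unchanged, in line with the unshuffled diagonal entries described above.

```python
import numpy as np

def icoh(x, y):
    """|Im(coherency)| from Fourier coefficients x, y of shape (n_epochs,)."""
    s_xy = np.mean(x * y.conj())
    norm = np.sqrt(np.mean(np.abs(x) ** 2) * np.mean(np.abs(y) ** 2))
    return abs((s_xy / norm).imag)

def shuffle_pvalue(x, y, n_shuffles, rng):
    """Fraction of epoch-shuffled scores exceeding the observed score."""
    observed = icoh(x, y)
    null = np.array([icoh(x, y[rng.permutation(y.size)]) for _ in range(n_shuffles)])
    return np.sum(null > observed) / n_shuffles

def fdr_bh(pvals, alpha=0.05):
    """Benjamini-Hochberg procedure: boolean mask of rejected null hypotheses."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

rng = np.random.default_rng(4)
n_epochs = 100
x = rng.standard_normal(n_epochs) + 1j * rng.standard_normal(n_epochs)
noise = rng.standard_normal(n_epochs) + 1j * rng.standard_normal(n_epochs)
y = x * np.exp(-1j * np.pi / 2) + 0.3 * noise        # lagged interaction partner
p_interacting = shuffle_pvalue(x, y, n_shuffles=500, rng=rng)
significant = fdr_bh([0.001, 0.8, 0.02, 0.5])
```

In a full analysis, one p-value would be computed per entry of the upper triangle of the region-by-region matrix before applying the correction.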

2.8. ROIconnect Toolbox

Based on our experimental results (see Section 3), we identified a set of recommended methods and pipelines. These have been implemented in a Matlab toolbox and are made available as a plugin to the free EEGLAB package (see footnote 2). This toolbox also contains code for analyzing spectral power in EEG source space, and for visualizing power and FC results in source space. A comprehensive description of the functionality and usage of the toolbox is provided in Appendix A. Moreover, an exemplary application of the toolbox to the analysis of a real EEG dataset is provided in Section 4.

3. Experiments and results

We conducted a set of experiments to assess the influence of the different pipeline parameters on the reconstruction of ground-truth region-to-region FC. We describe the general experimental setting in Fig. 2. Each experiment consisted of the following steps: (1) Signal generation. (2) Source projection. (3) Dimensionality reduction within regions. (4) Functional connectivity estimation. (5) Performance evaluation. Each experiment was carried out 100 times (= iterations). If not indicated otherwise, all experiments had the following default setting:

  • LCMV inverse solution

  • SNR = 3.5 dB

  • BSR = 0 dB

  • number of interactions = 2

  • time delay of the interaction = 50 to 200 ms

  • number of generated sources per region = 1

Fig. 2.

Experimental setup. Every experiment consisted of five consecutive steps: (1) Signal generation. (2) Source projection. (3) Dimensionality reduction within regions. (4) Functional connectivity estimation. (5) Performance evaluation. Every experiment was carried out 100 times.

If not stated otherwise, the following parameters were drawn randomly in each iteration: ground-truth interacting (seed and target) regions (two distinct regions uniformly drawn between 1 and $N_{\text{roi}}$), ground-truth active voxel(s) within regions (uniformly drawn between 1 and $R_{\text{roi}}$), and time delay (uniformly drawn between 50 and 200 ms). Furthermore, brain noise and sensor noise, as well as the signal, were generated based on (filtered) random white noise processes as described above.

Fig. 3 to Fig. 11 show the results of Experiments 1–6. In addition, all main results are summarized in Table 1. All figures (plotting code adapted from Allen et al., 2019) follow the same scheme: in every subplot, the 100 dots on the right side mark the performance, i.e. the PR, measured in each of the 100 iterations. On the left, a smooth kernel estimate of the data density is shown. The red and black lines represent the mean and median PR of the experiment, respectively, and the boxcar marks the 2.5th and 97.5th percentiles. Please note that the Y-axis is scaled logarithmically in all plots. We tested differences between pipeline performances with a one-sided Wilcoxon signed-rank test. Please note that a p-value $p_{A,B}$ corresponds to a one-sided test for $B > A$.

Fig. 3.

Comparison of different functional connectivity metrics (Experiment 1A). Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentiles.

Fig. 11.

Performance when two active sources per region are simulated (Experiment 6). (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median, respectively. The boxcar marks the 2.5th and 97.5th percentile.

Table 1.

Summary of the results of Experiments 1 to 6. A pipeline including robust multivariate FC metrics like MIM or TRGC, a PCA with a fixed number of selected components, and LCMV source reconstruction yields the best performance.

| Exp. | Tested parameter | Result |
|------|------------------|--------|
| 1A | FC metric | MIM/TRGC yield best performance. |
| 1B | Pipelines | Fixed PC+FC yield best performance. |
| 2 | Inverse solution | LCMV yields best performance. |
| 3A | SNR | The higher the better. |
| 3B | BSR | The less sensor noise the better. |
| 4 | Number of interactions | The lower the better. |
| 5 | Short interaction delays | Longer delays yield better performance. |
| 6 | Two active sources | Overall lower performance. Peak performance at three to four PCs. |

Matlab code to reproduce all experiments is provided under the link given in footnote 3.

3.1. Experiment 1

Experiment 1A

In Experiment 1A, we evaluated the performance of different FC metrics in detecting the ground-truth interactions. The ability to detect FC was tested for coherence, iCOH, MIC, MIM, GC, and TRGC. The ability to detect the correct direction of the interaction was tested for GC and TRGC (see Section 2.4).

In Fig. 3, we show the performances of different FC metrics. We see that MIM, MIC and TRGC (detection) all have a mean PR of over 0.97 and clearly outperform the other measures in detecting the ground-truth FC. The non-robust metrics coherence (mean PR = 0.59) and GC (mean PR = 0.95) detect the ground-truth interactions less reliably ($p_{\text{coherence,MIM}} < 10^{-4}$; $p_{\text{GC,MIM}} = 0.0040$). When comparing GC and TRGC in their ability to infer the direction of the interaction, TRGC (mean PR = 0.98) outperforms GC (mean PR = 0.96; $p_{\text{GC,TRGC}} < 10^{-4}$).

Experiment 1B

In Experiment 1B, we tested the influence of different strategies of dimensionality reduction within regions. In Fig. 4, we show the comparison for MIM (interaction detection) and TRGC (directionality determination). For MIM, we observe that the FIXPC pipelines show a better performance than most of the other pipelines. Within the FIXPC pipelines, the pipelines with two, three, or four PCs perform best (all mean PR = 0.99, $p_{\text{FIXPC5,FIXPC3}} < 10^{-4}$). Only the TRUEVOX (baseline) pipeline, which uses ground-truth information on voxel locations, shows a higher performance, as expected (mean PR = 1.00; $p_{\text{FIXPC3,TRUEVOX}} < 10^{-4}$). The two VARPC pipelines show a substantially reduced performance (mean PR = 0.96 and mean PR = 0.73, respectively; both $p_{\text{VARPC,FIXPC3}} < 10^{-4}$). The MEANFC and CENTRAL pipelines (mean PR = 0.98 and mean PR = 0.96, respectively) also show reduced performance in comparison to the FIXPC3 pipeline (both $p < 10^{-4}$). The FCMEAN pipeline (mean PR = 0.97) also did not perform as well as the FIXPC3 pipeline ($p < 10^{-4}$) while taking much longer to compute (FIXPC3 < 1 h, FCMEAN = 32 h, single core, allocated memory: 16 GB).

Fig. 4.

Comparison of different pipelines (Experiment 1B). (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentile.

In terms of directionality estimation using TRGC, the outcome is similar. Again, the TRUEVOX pipeline shows perfect performance (mean PR = 1.00). The FIXPC pipelines also exhibit very high performances (FIXPC4: mean PR = 0.99). Notably, in contrast to the results obtained with MIM, the VARPC90 pipeline also achieves competitive performance (mean PR = 0.99, $p_{\text{VARPC90,FIXPC3}} = 0.0235$). Please see Figure S1 to compare computation times of all pipelines.

We show the full matrix of all combinations of FC metrics and dimensionality reduction pipelines in Supplementary Figure S2. However, for all further experiments, we report performances only for MIM (interaction detection) and TRGC (directionality determination) since they performed best in Experiment 1A, and we focus on the FIXPC3 pipeline due to the high performance observed in Experiment 1B.

Experiment 1C

To explore how to statistically assess the presence of FC, we performed an additional experiment for a specific setting (SNR = 3.5 dB, one interaction between region 11 and region 49, BSR = 0 dB, LCMV filter, dimensionality reduction to three PCs, FC metric = MIM). Here, we obtained p-values by testing against a surrogate distribution consistent with the null hypothesis of zero interaction between all region pairs. In Fig. 5, we contrast the ground-truth ROI-to-ROI connectome with the estimated FC per region combination as well as the $-\log_{10}(p)$ values “surviving” the FDR-correction for this experiment. While in the ground-truth connectome only the ground-truth region combination shows a high MIM score, the reconstructed source-level connectome also contains some high MIM scores for region combinations other than the ground truth. Still, the ground-truth region combination in this setting achieves the second-highest MIM score (PR = 0.9996). However, in Fig. 5c, we see that testing the statistical significance with a shuffling test results in a substantial number of significant false positive interactions in the vicinity of the simulated interacting region pair. We discuss this result in Section 5.

Fig. 5.

Comparison of the ground-truth ROI-to-ROI connectome with the estimated functional connectivity per region combination and the $-\log_{10}(p)$ values after FDR-correction for a single experiment. A ground-truth interaction is modeled between region 11 and region 49.

3.2. Experiment 2

Experiment 2A

In Experiment 2, we tested the influence of the type of inverse solution on the pipelines' performance. In Fig. 6, we show the comparison between eLORETA, LCMV, DICS, and Champagne. We observe that the two beamformer solutions and Champagne clearly outperform eLORETA (mean PR = 0.65; Fig. 6a) in detecting undirected connectivity (all $p < 10^{-4}$). While DICS, LCMV and Champagne all show very good performances, we see a slight advantage of LCMV (mean PR = 0.99) in comparison to Champagne (mean PR = 0.97, $p_{\text{Champagne,LCMV}} = 0.0013$). We do not observe a significant difference between DICS and LCMV ($p_{\text{DICS,LCMV}} = 0.2805$).

Fig. 6.

Comparison of different inverse solutions (Experiment 2). (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentile.

In terms of directionality determination (Fig. 6b), the picture is different: while LCMV (mean PR = 0.98) leads to accurate directionality estimates, DICS fails to detect the direction of the ground-truth interaction in a high number of experiments (mean PR = 0.28, $p_{\text{DICS,LCMV}} < 10^{-4}$). eLORETA also shows a reduced overall performance (mean PR = 0.69, $p_{\text{eLORETA,LCMV}} < 10^{-4}$). Champagne shows decent performance (mean PR = 0.99), which is, however, lower than that of LCMV ($p_{\text{Champagne,LCMV}} < 10^{-4}$).

The differences in computation times of the different inverse solutions are also remarkable. While LCMV (2 sec) and DICS (178 sec) are fast to compute, eLORETA (388 sec) and Champagne (3747 sec) take much longer to compute as a cross-validation scheme to set the regularization parameter is implemented for both. Setting the regularization parameter to a default value would drastically reduce computation time for eLORETA and Champagne, but would also decrease performance (results not shown).

Experiment 2B

To investigate further why eLORETA performs considerably less well than LCMV in our experiments, we generated ground-truth activity with an interaction between one seed voxel in the left frontal cortex and one target voxel in the left precentral cortex. We then again generated sensor data as described in Section 2.1 and applied pipeline FIXPC1 to calculate regional MIM scores. In Supplementary Figure S3, we show the resulting power maps, as well as seed MIM scores and target MIM scores for data projected with eLORETA and LCMV, respectively. The advantage of LCMV is clearly visible: while both power and MIM in the eLORETA condition are spread out to other regions, LCMV is able to localize the ground-truth power and connectivity very precisely.

Experiment 2C

Does LCMV only perform so well in our experiment because our experimental setup artificially favors it? In the following additional analysis, we investigated whether LCMV still has an advantage over eLORETA when multiple pairs of correlated sources are present. More specifically, we here simulated two pairs of interacting sources where the time courses of the second source pair were identical to those of the first source pair. Results are presented in Fig. 7. Please note that in this case, the cross-interactions between the seed and target regions were also evaluated as ground-truth interactions. We see that, while eLORETA is not much affected by the correlated sources setup, LCMV has a decreased reconstruction performance according to both MIM and TRGC. However, LCMV still performs better than eLORETA even in this setup ($p_{\text{eLORETA,LCMV}} < 10^{-4}$).

Fig. 7.

Performance observed for two perfectly correlated source pairs. (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median, respectively. The boxcar marks the 2.5th and 97.5th percentile.

3.3. Experiment 3

In real-world EEG measurements, data are to a certain extent corrupted by noise, e.g. from irrelevant brain sources, or by noise sources from the outside. In Experiment 3, we investigated the effect of SNR and BSR on FC estimation performance. In Fig. 8a and 8b, we show the performance of the FIXPC3 pipeline for SNRs of -7.4 dB, 3.5 dB and 19.1 dB. For both MIM (Fig. 8a) and TRGC (Fig. 8b), we observe decreased performances for decreased SNRs, as expected. For an SNR of 19.1 dB, nearly all experiments show a perfect detection of ground-truth interactions (mean PR > 0.99).

Fig. 8.

FC estimation performance depends on the signal-to-noise ratio and brain noise-to-sensor noise ratio (Experiment 3). (a/c) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b/d) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentile.

Is FC detection more impaired by pink brain noise or white sensor noise? In Experiment 3B, we tested the performance for BSR environments of 100% sensor noise, 25% brain noise, 50% brain noise, 75% brain noise, and 100% brain noise. In Fig. 8c and 8d, we show the performances for different BSRs. We observe a slightly better performance for signals contaminated more strongly by correlated brain noise than by white sensor noise (mean MIM PR for 100% brain noise > 0.99) compared to the opposite case (mean MIM PR for 0% brain noise = 0.97).

Note that in Experiments 1 to 3, for better comparison between the experimental conditions and to avoid variation due to random factors besides the experimental variation, we used the same generated data within an iteration in every experiment and only varied the tested condition.

3.4. Experiment 4

While we focused on a very simple scenario with only two interacting region pairs so far, real brain activity likely involves multiple interacting sources. To increase the complexity in our setup, we compared performances for different numbers of interacting region pairs in Experiment 4. As expected, Fig. 9 clearly shows that more simultaneous true interactions lead to a decreased ability to reliably detect them. While the detection is nearly perfect for one interaction (mean MIM PR > 0.99; mean TRGC PR > 0.99), the performance is much reduced for five interactions (mean MIM PR = 0.91; mean TRGC PR = 0.93). This applies to both MIM and TRGC. Please note, however, that despite using a normalized version of the PR (see Section 2.6), the PR metric is not perfectly comparable for different numbers of true interactions. That is, when calculating the PR on randomly drawn data, the PR distribution is close to uniform when only one interaction is assumed, but shows a normal distribution with increasing kurtosis for higher numbers of interactions. However, the mean of the distribution equals 0.5 for all assumed numbers of interactions.

Fig. 9.

FC reconstruction performance depends on the number of true interactions (Experiment 4). (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentile.

3.5. Experiment 5

While it is not entirely clear how large interaction delays in the brain can be, they likely range between 2 and 100 ms, depending not only on physical wiring, but also on cognitive factors (see Section 5). In Experiment 5, we evaluated to which degree the performance drops when regions interact with shorter time delays of 2, 4, 6, 8, and 10 ms. While the performance for the MIM metric is already quite impaired for a delay of 10 ms (mean PR = 0.90), performance drops drastically for 4 ms (mean PR = 0.73) and 2 ms (mean PR = 0.60) (Fig. 10a). Detecting the direction of the interaction with TRGC is already much more difficult at a true delay of 10 ms (mean PR = 0.73) and is further reduced for a delay of 2 ms (mean PR = 0.56; Fig. 10b).

Fig. 10.

Performance for very small interaction delays and the default delay (Experiment 5). (a) Undirected FC reconstruction performance achieved using the multivariate interaction measure (MIM). (b) Directed FC reconstruction performance achieved using time-reversed Granger causality. Red and black lines indicate the mean and median percentile rank (PR), respectively. The boxcar marks the 2.5th and 97.5th percentile.

3.6. Experiment 6

In our previous experiments, the FIXPC pipelines with two to four PCs showed the best performance. But the ‘optimal’ number of PCs likely depends on the number of (interacting and non-interacting) signals in the brain as well as their relative strengths. To verify that the optimal number of PCs depends on the number of true sources, we increased the number of active voxels per region to two in Experiment 6. We then simulated two bivariate interactions between two different source pairs originating from the same regions. We show the results for pipelines FIXPC1 to FIXPC6 in Fig. 11. Interestingly, we here see that pipelines FIXPC3 (mean MIM PR = 0.99; mean TRGC PR = 0.99) and FIXPC4 (mean MIM PR = 0.99; mean TRGC PR = 0.99) perform clearly better than FIXPC1 (mean MIM PR = 0.89; mean TRGC PR = 0.93) or FIXPC6 (mean MIM PR = 0.98; mean TRGC PR = 0.98). Based on these results, we confirm that the optimal number of fixed PCs increases with the number of independently active processes within one region (see Section 5 for further discussion).

4. Exploratory analysis of functional connectivity in left vs. right motor imagery

To illustrate how the recommended analysis pipeline can be used to analyse real EEG data, we show an exploratory analysis of power and FC in left and right motor imagery. In the Berlin arm of the so-called VitalBCI study (Blankertz, Sannelli, Halder, Hammer, Kübler, Müller, Curio, Dickhaus, 2010, Sannelli, Vidaurre, Müller, Blankertz, 2019), 39 subjects conducted an experiment in which they imagined a movement with either the left or the right hand (Motor Imagery Calibration set; MI-Cb 1–3). Each trial consisted of a visual stimulus showing a fixation cross superimposed with an arrow indicating the task for the trial (i.e., left or right motor imagery). After 4 sec, the stimulus disappeared, and the screen stayed black for 2 sec. Every subject conducted 75 left and 75 right motor imagery trials. During the experiment, EEG data were recorded with a 119-channel whole-head EEG system with a sampling rate of 1000 Hz. For this study, we used a standard 90-channel whole-head subset of these channels. For our analysis, we selected only the 26 subjects for whom previous studies have reported that the left vs. right motor imagery conditions could be well separated using statistical and machine learning techniques (‘Category I’ in Sannelli et al., 2019). Further experimental details are provided in Blankertz et al. (2010); Sannelli et al. (2019).

We filtered the data (1 Hz high-pass filter, 48–52 Hz notch filter, and 45 Hz low-pass filter, all zero-phase forward and reverse second-order digital Butterworth filters), and then sub-sampled them to 100 Hz. We then rejected artifactual channels based on visual inspection of the power spectrum and the topographical distribution of alpha power (between zero and five per participant, mean 1.19 channels) and interpolated them (spherical scalp spline interpolation). A leadfield was computed using the template head model Colin27_5003_Standard-10-5-Cap339 that is already part of the EEGLAB toolbox. We then epoched the data from 1 to 3 sec after stimulus onset and separated left from right motor imagery trials.

We used the pop_roi_activity function of the newly developed ROIconnect plugin for EEGLAB to calculate an LCMV source projection filter, apply it to the sensor data, and calculate region-wise power (see Appendix A for a more detailed description). We then normalized the power with respect to the total power between 3 and 7 Hz as well as 15 and 40 Hz, and averaged it across frequencies between 8 and 13 Hz. The statistical significance of the differences between right- and left-hand motor imagery power was assessed with a paired t-test in every region. In Supplementary Figure S4, we show the negative log10-transformed p-values, multiplied with the sign of the t-statistic. As expected, the results show a clear lateralization for the activation of the motor areas.
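The normalization and the signed log-transform described above can be sketched as follows. This is a minimal Python illustration, not the ROIconnect or MATLAB code used in the study. The function names, the inclusive band edges, and the choice to divide each alpha-band bin by the summed power of the flanking bands are assumptions made for illustration only.

```python
import math


def relative_alpha_power(freqs, power,
                         alpha=(8.0, 13.0),
                         flanks=((3.0, 7.0), (15.0, 40.0))):
    """Mean alpha-band power relative to the total power in the flanking bands.

    freqs and power are equally long sequences describing one region's spectrum.
    """
    # Reference power: sum over all bins inside any flanking band (assumption).
    ref = sum(p for f, p in zip(freqs, power)
              if any(lo <= f <= hi for lo, hi in flanks))
    # Normalize every alpha-band bin, then average across the band.
    alpha_vals = [p / ref for f, p in zip(freqs, power)
                  if alpha[0] <= f <= alpha[1]]
    return sum(alpha_vals) / len(alpha_vals)


def signed_neg_log10_p(p_value, t_stat):
    """Negative log10 p-value carrying the sign of the t-statistic."""
    return math.copysign(-math.log10(p_value), t_stat)


if __name__ == "__main__":
    freqs = [float(f) for f in range(1, 41)]   # 1..40 Hz in 1 Hz steps
    power = [1.0 / f for f in freqs]           # toy 1/f spectrum
    print(relative_alpha_power(freqs, power))
    print(signed_neg_log10_p(0.01, -2.3))      # negative: effect in the other direction
```

The signed transform keeps the direction of the effect visible in a single map, so that lateralization appears as opposite signs over the two hemispheres.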

To estimate inter-regional FC, we used the pop_roi_connect function to calculate MIM based on the three strongest PCs of every region. Again, MIM was averaged across frequencies between 8 and 13 Hz. To reduce the region-by-region MIM matrix to a vector of net MIM scores, we summed up all MIM estimates across one region dimension.

Analogous to our statistical evaluation of simulated data, described in Experiment 1C, we assessed the statistical significance of the net FC of each region against the null hypothesis of zero net interaction separately for each of the two motor imagery conditions. Specifically, we first calculated the true MIM score between all region pairs in all subjects. Then, we generated a null distribution of 1000 shuffled MIM scores for every region combination in every subject. Subsequently, the true and shuffled net MIM scores were calculated by averaging across one of the region dimensions. To obtain p-values, we compared the true MIM of every region and subject to the respective null distribution. To aggregate the p-values across subjects, we applied Stouffer’s method (see, e.g., Dowding and Haufe, 2018). Finally, FDR-correction (α-level = 0.05) was used to correct for multiple comparisons. We show the negative log10-transformed p-values in Figs. 12a and 12b.
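The three statistical steps above (permutation p-values, aggregation across subjects with Stouffer's method, and FDR correction) can be sketched in a few lines of Python. This is an illustrative sketch rather than the MATLAB code used for the analysis. The function names are hypothetical, the "+1" correction in the permutation p-value is a common convention that the text does not specify, and the Benjamini-Hochberg procedure is assumed as the FDR method.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def permutation_p(true_score, null_scores):
    """One-sided p-value of a true score against shuffled scores."""
    exceed = sum(s >= true_score for s in null_scores)
    return (exceed + 1) / (len(null_scores) + 1)


def stouffer(p_values, eps=1e-12):
    """Combine one-sided p-values across subjects with Stouffer's method."""
    # Clamp to the open interval (0, 1) so the inverse CDF is defined.
    z = [_NORM.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps)) for p in p_values]
    z_comb = sum(z) / math.sqrt(len(z))
    return 1.0 - _NORM.cdf(z_comb)


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean rejection decisions under Benjamini-Hochberg FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank  # largest rank that passes its threshold
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        reject[idx] = rank <= k_max
    return reject


if __name__ == "__main__":
    # Toy example: one region, four subjects, then four regions.
    subject_p = [0.04, 0.10, 0.03, 0.20]
    print(stouffer(subject_p))
    print(benjamini_hochberg([0.001, 0.8, 0.02, 0.04]))
```

Stouffer's method rewards consistent moderate evidence across subjects, so a region can become significant at the group level even if no single subject reaches significance on its own.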

Fig. 12.

Results of the exploratory analysis of functional connectivity in left and right hand motor imagery tasks.

Additionally, we assessed the statistical difference between the net MIM scores of the left- vs. right-hand motor imagery condition by again using a paired t-test for every region. In Fig. 12c, we show the negative log10-transformed p-values, multiplied with the sign of the t-statistic. Again, as expected, the results show a lateralization for the undirected net FC of the motor areas.

Matlab code for the analyses presented in this section is provided at https://github.com/fpellegrini/MotorImag.

5. Discussion

Estimating functional connectivity between brain regions from reconstructed EEG sources is a promising research area that has generated a number of important results (e.g. Babiloni, Del Percio, Lizio, Noce, Lopez, Soricelli, Ferri, Nobili, Arnaldi, Famà, et al., 2018, Hipp, Engel, Siegel, 2011, Schoffelen, Hultén, Lam, Marquand, Uddén, Hagoort, 2017). However, respective analysis pipelines consist of a number of subsequent steps for which multiple modeling choices exist and can typically be justified. In order to identify accurate and reliable analysis pipelines, simulation studies with ground-truth data can be highly informative. However, most existing simulation studies do not evaluate complete pipelines but focus on single steps. In particular, various published studies assume the locations of the interacting sources to be known a-priori, while, in practice, they have to be estimated as well. To this end, it has become widespread to aggregate voxel-level source activity within regions of an atlas before conducting FC analyses across regions. Multiple ways to conduct this dimensionality reduction step have been proposed, which have not yet been systematically compared using simulations. The main focus of our study was thus to identify those EEG processing pipelines from a set of common approaches that can detect ground-truth inter-regional FC most accurately. For the scenario modelled in this study, we observe that a pipeline consisting of an LCMV source projection, PCA dimensionality reduction, the selection of a fixed number of principal components for each ROI, and a robust FC metric like MIM or TRGC results in the most reliable detection of ground-truth FC (see Table 1). Consistent with results reported in Anzolin et al. (2019), LCMV consistently yielded higher FC reconstruction performance than eLORETA. Thus, we here answer the question that Mahjoory et al. (2017) left open, namely which source reconstruction technique is most suitable for EEG FC estimation. 
Our results are also in line with a larger body of studies that highlighted the advantages of robust FC metrics compared to non-robust ones (e.g. Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004, Schoffelen, Gross, 2019, Vinck, Huurdeman, Bosman, Fries, Battaglia, Pennartz, Tiesinga, 2015, Winkler, Panknin, Bartz, Müller, Haufe, 2016).

Inverse solutions

For some inverse solutions, the choice of the regularization parameter has been shown to influence the accuracy of source reconstruction (Hashemi, Cai, Kutyniok, Müller, Nagarajan, Haufe, 2021, Hincapié, Kujala, Mattout, Daligault, Delpuech, Mery, Cosmelli, Jerbi, 2016). While the parameter is of little importance for methods like LCMV and DICS, which are fitted separately to each source and thus solve low-dimensional optimization problems, it should be carefully chosen for full inverse solutions like Champagne and eLORETA, which estimate the activity at each source voxel within a single model. To avoid a performance drop due to unsuitable regularization parameter choice in eLORETA and Champagne, we used the spatial cross-validation method described in (Habermehl, Steinbrink, Müller, Haufe, 2014, Hashemi, Cai, Kutyniok, Müller, Nagarajan, Haufe, 2021). This method automatically sets the parameter based on the data at hand and has been shown to improve the source reconstruction (Hashemi et al., 2021).

As hypothesized, DICS resulted in poor directionality determination performance, while LCMV in combination with TRGC performed well. This can be explained by the difference between LCMV and DICS: while LCMV estimates the inverse solution in the time domain, DICS estimates the source projection for every frequency separately (Gross et al., 2001). This can lead to inconsistencies across frequencies. Since directionality estimation requires the aggregation of phase information across multiple frequencies, such inconsistencies may lead to a failure to detect true interactions and their directionalities. Therefore, we recommend avoiding DICS source reconstruction when analysing directed FC. For undirected FC measures, this seems to be less of a problem. Still, in our simulation, LCMV consistently performed (even if only slightly) better than DICS. This can be explained by the lower effective number of data samples that are available to DICS at each individual frequency compared to LCMV, which uses data from the entire frequency spectrum. However, there may be cases when using DICS could result in more accurate localization. For example, this could be the case when the noise has a dominant frequency that is different from that of the signal.

Robust functional connectivity metrics

In this study, we observed a strong benefit of using robust FC metrics over non-robust metrics in detecting genuine neuronal interactions. Overall, the performance of coherence is highly impaired by the volume conduction effect (see Figure 3, cf. Nolte et al., 2004). The TRGC metric performed well for determining the interaction direction, and also satisfactorily for detecting interactions. However, the computation time for calculating TRGC exceeds that of MIM by far. Thus, we recommend using MIM to detect undirected FC in case the direction of the effect is not of relevance. If TRGC is calculated for estimating the direction of interactions, the absolute value of TRGC can be used to detect interactions as well.

Interestingly, GC without time reversal did not perform much worse than TRGC. This is in line with previous results (Winkler et al., 2016) demonstrating that the calculation of net GC values already provides a certain robustification against volume conduction artifacts. Concretely, it has been shown that net GC is more robust to mixed noise than standard GC, although not as robust as TRGC (Winkler et al., 2016). We generally recommend using robust FC metrics like iCOH, MIM/MIC, or TRGC.

Aggregation within regions

When comparing different processing pipelines, we found that employing an SVD/PCA and selecting a fixed number of components for further processing performs better than selecting a variable number of components in every ROI. When further investigating this effect, we found that, for MIM and MIC, the final connectivity score of the VARPC pipelines was positively correlated with the number of voxels of the two ROIs concerned (90%: MIM: r=0.50, MIC: r=0.32; 99%: MIM: r=0.70, MIC: r=0.41). This indicates that the flexible number of PCs leads to a bias in MIM and MIC depending on the size of the two involved ROIs. This could be expected, as the degrees of freedom for fitting MIM and MIC scale linearly with the number of voxels within a pair of regions. These implicit or explicit model parameters can be tuned to maximize the FC of the projected data, which may lead to over-fitting. For finite data, this leads to a systematic overestimation of FC, to such a degree that the estimated FC correlates with the number of voxels. Although TRGC is a multivariate technique as well, similar behavior was not observed for it. Here it is likely that a potential bias due to the signal dimensionalities would cancel out when taking differences between the two interaction directions as well as between original and time-reversed data.

An interesting and so far unsolved question is how many fixed components should be chosen for further processing. In Experiment 6, we observed a clear performance peak around three to four components (Fig. 11). In the default version with only one active source per ROI, we saw a similar pattern, but not as pronounced as in Experiment 6. This points towards a data-dependent optimal number of components. Future work should investigate how this parameter can be optimized based on the data at hand.

Short time delays

In Experiment 5, we investigated to what extent the performance drops when the true interaction occurs with a very small time delay of 2 to 10 msec, which might be a realistic range for a number of neural interaction phenomena in the brain. Precise data on the typical order of the times within which macroscopic neural ensembles exchange information are, however, hard to obtain, as these transmission times depend not only on the physical wiring but also on cognitive factors that are not straightforward to model. Previous work has shown that delays can range from 2 to 100 msec, depending on the distance and number of synapses between two nodes (e.g. Fries, 2005, Miocinovic, de Hemptinne, Chen, Isbaine, Willie, Ostrem, Starr, 2018, Oswal, Beudel, Zrinzo, Limousin, Hariz, Foltynie, Litvak, Brown, 2016, Shouno, Tachibana, Nambu, Doya, 2017). For example, Oswal et al. (2016) studied interaction delays between the subthalamic nucleus and the motor cortex and found interaction delays of 20 to 46 msec. The satisfactory performance observed in our study for undirected FC at delays of 8 and 10 msec may therefore be of particular importance for clinical scientists that aim at investigating such long-range interactions. Note that the range of delays that can be detected with robust connectivity metrics strongly depends on the frequency band in which the interaction takes place. If the delay is very short compared to the base frequency of the interaction, then the phase difference it induces is close to either 0 or ±π, making it less and less distinguishable from a pure volume conduction effect as it approaches these limits. In addition, the directionality of an interaction can only be resolved by analyzing multiple frequencies. 
Here, wider interaction bands lead to better reconstructions of the directionality of interactions with shorter delays, whereas higher frequency resolutions (that is, longer data segments) lead to better reconstructions of the directionality of interactions with longer delays. Here, we have demonstrated that alpha-band interactions with physiologically plausible transmission delays can be detected at 0.5 Hz frequency resolution, depending on the underlying SNR as well as additional modeling assumptions (see Limitations below).
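The dependence on the frequency band follows from the phase shift that a pure time delay τ induces at frequency f, namely φ = 2πfτ. The sketch below (plain Python, with hypothetical function names) tabulates this phase shift and |sin φ|, which under the idealized assumption of a noise-free, purely delayed coupling is the fraction of the coherency that falls into its imaginary part and is therefore visible to metrics that are robust to volume conduction.

```python
import math


def phase_lag(freq_hz, delay_s):
    """Phase shift (radians) induced by a pure time delay at one frequency."""
    return 2.0 * math.pi * freq_hz * delay_s


def imag_fraction(freq_hz, delay_s):
    """|sin(phase lag)|: share of coherency in the imaginary part (idealized)."""
    return abs(math.sin(phase_lag(freq_hz, delay_s)))


if __name__ == "__main__":
    f = 10.0  # alpha-band centre frequency in Hz (illustrative choice)
    for delay_ms in (2, 4, 6, 8, 10, 25, 50):
        tau = delay_ms / 1000.0
        print(f"{delay_ms:3d} ms  phase = {phase_lag(f, tau):.3f} rad  "
              f"|sin| = {imag_fraction(f, tau):.3f}")
```

At 10 Hz, a 2 ms delay produces a phase shift of only about 0.13 rad, whereas a 25 ms delay produces π/2 and a 50 ms delay produces π, where the interaction again becomes indistinguishable from instantaneous mixing in this idealized picture.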

Statistical assessment

The goal of this study was to evaluate data analysis pipelines to assess FC. However, we excluded the assessment of any subsequent statistical evaluation of FC, which is not straightforward to investigate in simulation studies. First, in a simulation setting, we are free to choose the two factors that influence the statistical power of a test, namely SNR and sample size. Determining realistic ranges for both in the context of EEG FC estimation is challenging but critical. Second, due to source leakage, we must expect (tiny) spill-over effects from interacting to non-interacting region pairs, an effect termed “ghost interactions” (Palva et al., 2018). As a result, these ghost interactions will inevitably become statistically significant for any source pair at high enough SNRs and sample sizes, an effect that can also be seen in Fig. 5c. For these reasons, we here assessed the effect sizes of FC metrics instead of their statistical significance, and focused on evaluating the performance of different FC estimation pipelines relative to one another rather than on their absolute performance. However, future studies should go one step further by systematically assessing statistical maps derived from connectomes using our results as building blocks.

Limitations

While this study investigates a large range of processing pipelines, phase-to-phase FC metrics, and data parameters, it is far from being exhaustive. Other works have shown that many other parameters like channel density (Song et al., 2015), the location of interacting sources (Anzolin et al., 2019), data length (Astolfi, Cincotti, Mattia, Marciani, Baccala, de Vico Fallani, Salinari, Ursino, Zavaglia, Ding, et al., 2007, Liuzzi, Gascoyne, Tewarie, Barratt, Boto, Brookes, 2017, Sommariva, Sorrentino, Piana, Pizzella, Marzetti, 2019, Van Diessen, Numan, Van Dellen, Van Der Kooi, Boersma, Hofman, Van Lutterveld, Van Dijk, Van Straaten, Hillebrand, et al., 2015), referencing (Chella, Pizzella, Zappasodi, Marzetti, 2016, Huang, Zhang, Cui, Yang, He, Liu, Yin, 2017, Van Diessen, Numan, Van Dellen, Van Der Kooi, Boersma, Hofman, Van Lutterveld, Van Dijk, Van Straaten, Hillebrand, et al., 2015), and co-registration (Liuzzi et al., 2017) can influence FC detection. Besides, we here used the same head model for generating the sensor data and estimating the inverse solution. However, we expect worse performance when the head model has to be estimated, and previous work has shown that the quality of head model estimation also influences FC detection (Mahjoory et al., 2017). Likewise, there exist many other inverse solutions, like MNE, wMNE, LORETA, sLORETA, and MSP, just to name a few. Further, there also exist other types of dimensionality reduction techniques. For example, some works selected the source with the highest power within a region or the source that showed the highest correlation to the time series of other sources in the ROI to be representative for all time series of the ROI (Ghumare, Schrooten, Vandenberghe, Dupont, 2018, Hillebrand, Barnes, Bosboom, Berendse, Stam, 2012). 
Others have presented a procedure of optimizing a weighting scheme before averaging all time series within a ROI (Palva, Monto, Kulashekhar, Palva, 2010, Palva, Kulashekhar, Hämäläinen, Palva, 2011).

We also did not investigate the effect of the number of epochs and the epoch length in this study. It has been shown that the number of epochs can introduce a bias for certain connectivity metrics (Vinck et al., 2010). This is the case for connectivity metrics that yield positive values only, like (absolute) coherence, the absolute value of the imaginary part of coherency, MIM, or MIC. For these metrics, for a fixed epoch length, a lower number of epochs will systematically lead to higher values of estimated connectivity, even under the null hypothesis of no interaction. This is due to the higher variance of the estimates for lower sample sizes, which turns into a positive bias when the absolute value is taken. Further, Fraschini et al. (2016) argued that the epoch length may also have an influence on FC estimation, where shorter epochs were found to introduce a positive bias on FC when the number of epochs was held constant. As a result, we recommend using a fixed number and length of epochs throughout a single experiment. This is of particular importance when the goal is to compare different groups or experimental conditions.
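The positive bias under the null hypothesis can be reproduced with a toy simulation. The Python sketch below is a deliberately simplified stand-in and not an implementation of coherence, MIM, or MIC: each epoch contributes a unit-amplitude cross-spectral term with a uniformly random phase (that is, no true coupling), and the metric is the absolute value of the average across epochs. Function names and the number of repetitions are hypothetical.

```python
import cmath
import math
import random


def null_abs_metric(n_epochs, rng):
    """Absolute value of the epoch-averaged random phasor (no true coupling)."""
    total = sum(cmath.exp(1j * rng.uniform(-math.pi, math.pi))
                for _ in range(n_epochs))
    return abs(total / n_epochs)


def mean_bias(n_epochs, n_repeats=2000, seed=0):
    """Average value of the absolute-valued metric under the null hypothesis."""
    rng = random.Random(seed)
    return sum(null_abs_metric(n_epochs, rng)
               for _ in range(n_repeats)) / n_repeats


if __name__ == "__main__":
    for n in (10, 50, 100, 500):
        # For this toy model the expected value is roughly sqrt(pi / (4 n)).
        print(n, round(mean_bias(n), 3), round(math.sqrt(math.pi / (4 * n)), 3))
```

Although the underlying complex average is unbiased around zero, its absolute value is not, and the offset shrinks only with the square root of the number of epochs. Two conditions with different epoch counts would therefore differ in such a metric even without any true difference in connectivity.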

As the set of coupling mechanisms and corresponding FC metrics that have been proposed is huge, we deliberately constrained our analysis here to phase-phase coupling using a selection of metrics that have previously been shown to be robust to mixing artifacts (Ewald, Marzetti, Zappasodi, Meinecke, Nolte, 2012, Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004). In contrast, non-robust metrics have been shown to be prone to the spurious discovery of interactions (Bastos, Schoffelen, 2016, Brunner, Billinger, Seeber, Mullen, Makeig, 2016, Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004, Van de Steen, Faes, Karahan, Songsiri, Valdes-Sosa, Marinazzo, 2019). This was confirmed here again for absolute coherence and GC. For a detailed overview of the taxonomy of FC metrics we refer to the works of Bastos and Schoffelen (2016); Marzetti et al. (2019); Schoffelen and Gross (2019). Our results were obtained for intra-frequency phase–phase coupling, and we make no claims about non-linear interaction metrics quantifying phase–amplitude or amplitude–amplitude coupling within or across frequencies (Colclough, Brookes, Smith, Woolrich, 2015, De Pasquale, Della Penna, Snyder, Lewis, Mantini, Marzetti, Belardinelli, Ciancetta, Pizzella, Romani, et al., 2010, Hipp, Hawellek, Corbetta, Siegel, Engel, 2012). Nevertheless, we expect that measures that are robust to volume conduction would also be required for these FC types to obtain optimal performance.

A further limitation of simulation studies in general is that assumptions need to be made that are hard, if not impossible, to confirm. Here, our goal was to generate pseudo-EEG data comprising realistic effects of volume conduction using a physical model of a human head. In terms of the generated time series, we focused on alpha-band oscillations as carriers of the modeled interactions. By adding pink brain noise, uniformly distributed across the entire brain, as well as white sensor noise, we obtained simulated sensor-space EEG data that resemble real data in crucial aspects such as spectral peaks and the general 1/f shape of the power spectrum. On the other hand, numerous additional assumptions were made regarding the linear dynamics of the interacting sources, the conception of the interaction as a pure and fixed time delay, the focus on an interaction in the alpha band, the number of interactions, the signal-to-noise ratio, and the stationarity of all signal and noise sources. Several of these experimental variables were systematically varied to provide a comprehensive picture of the performance of each pipeline in a wide range of scenarios. The ranking of the pipelines’ performances was robust in all tested scenarios. However, a remaining question is how realistic the individual studied parameter choices are. Our simulated environment resembles a setting of task-related (ongoing) activity with few dominant active and interacting sources, as opposed to a resting-state setting with numerous equally active and interacting sources. Hincapié et al. (2017) showed that connectivity estimation pipelines including beamformers perform well for point-like sources, whereas for extended cortical patches, MNE source estimation was found to be more accurate. In this study, we simulated point-like sources, which could lead to an overestimation of beamformer performance. 
Considering that FC analyses are predominantly performed on ongoing (e.g., resting-state) activity rather than averaged data, the assumptions of only a few interacting source pairs standing out against non-interacting background sources with relatively high SNR can certainly be questioned. However, these assumptions were made here for the practical purpose of enabling a comparison between approaches rather than with the ambition of claiming real-world validity.

Future simulation studies should nevertheless strive to further increase the realism of the generated pseudo-EEG signals. In this regard, Anzolin et al. (2021) presented a toolbox that mimics typical EEG artifacts like eye blinks. We restricted ourselves here to using artificial time series designed to exhibit the specific properties assessed by the studied FC metrics; that is, time-delayed linear dynamics. In contrast, biologically inspired models such as the models implemented within The Virtual Brain toolbox (TVB; Sanz Leon et al., 2013) provide a richer portfolio of non-linear dynamics and thus are alternative ground-truth models, specifically when the goal is to validate non-linear FC metrics. The COALIA model (Bensaid et al., 2019), for example, has been used to mimic network activity in epilepsy for the purpose of validating FC estimates (Allouch et al., 2022). Further studies used the same model family to study the effect of parameters such as electrode density on FC estimates (Allouch, Kabbara, Duprez, Khalil, Modolo, Hassan, 2023, Tabbal, Kabbara, Yochum, Khalil, Hassan, Benquet, 2022). Similarly, Jirsa and Müller (2013) have used TVB to evaluate metrics of cross-frequency coupling. Overall, these studies provide complementary evidence that is largely aligned with our results, for example with respect to the superiority of robust connectivity metrics. The plausibility of several assumptions made by neural mass models has also recently been questioned (Pathak et al., 2022). Nevertheless, such models hold great promise as validation tools in the future.

Note in this respect that it was not our intention to propose a realistic model of EEG data or even the whole brain but simply to generate data that would allow us to test how well ROI-level FC can be reconstructed in the presence of volume conduction/source leakage. The types of FC we are interested in here (directed and undirected linear FC) have been widely studied, and popular metrics to infer these types of FC are known to be heavily affected by volume conduction (Haufe, Nikulin, Müller, Nolte, 2013, Nolte, Bai, Wheaton, Mari, Vorbach, Hallett, 2004). Hence, it was our intention to identify metrics and pipelines that have a high chance of reconstructing FC on the ROI level when signals are mapped to the EEG and back by realistic forward and inverse models. We deliberately do not address the question of whether networks estimated using FC metrics provide a correct depiction of actual brain networks.

As a further limitation, our simulations are to some extent restricted to EEG data. However, it can be expected that, qualitatively, the results of this paper could be transferred to MEG data. MEG analyses also suffer from the source leakage problem (Colclough, Woolrich, Tewarie, Brookes, Quinn, Smith, 2016, Pizzella, Marzetti, Della Penna, de Pasquale, Zappasodi, Romani, 2014) and benefit from disentangling signal sources with source reconstruction (Marzetti, Basti, Chella, D’Andrea, Syrjälä, Pizzella, 2019, Schoffelen, Gross, 2019). Moreover, the same FC metrics are typically used in EEG and MEG analyses (Schoffelen, Gross, 2009, Schoffelen, Gross, 2019). Nevertheless, differences exist, which would be worth studying. In contrast to EEG, which records secondary neuronal return currents, MEG records the magnetic field that is induced by electrical activity and arises in a circular field around an electric current (Hämäläinen et al., 1993). Therefore, MEG cannot record radial neuronal currents (Huang et al., 2007). This must be taken into account when estimating the inverse solution from the leadfield, i.e. it is advised to reduce the rank of the forward model from three to two by applying an SVD at each source location (Westner et al., 2021).

We here provide a simulation framework that is openly accessible to the community. Individual pipeline steps, but also simulated data, can easily be replaced by other variants, following a plug-and-play principle. As such, we encourage readers to test aspects of the pipelines, other data, and other FC metrics not considered here.

6. Conclusion

This work compared an extensive set of data analysis pipelines for the purpose of extracting directed and undirected functional connectivity between predefined brain regions from simulated EEG data. While several individual steps of such pipelines have been benchmarked in previous studies, we focused specifically on the problem of aggregating source-reconstructed data into region-level time courses and, ultimately, region-to-region connectivity matrices. Thereby, we close a gap in the current literature evaluating FC estimation approaches. We show that using non-robust FC metrics greatly reduces the ability to correctly detect ground-truth FC. Further, in our simulated pseudo-EEG data, the use of the eLORETA inverse solution also leads to worse FC detection performance than beamformers. Moreover, the use of inverse solutions that are frequency-specific, such as DICS, may hamper the correct identification of the directionality of interactions. Finally, unequal dimensionalities of signals at different ROIs may bias certain connectivity measures, such as MIC and MIM, degrading their ability to distinguish true interactions from a noise floor. Thus, dimensionality reduction techniques should be applied such that the number of retained signal components is the same for all regions. We expect that avoiding these pitfalls may enhance the correct interpretation and comparability of results of future connectivity investigations. FC pipelines that show promising results with our simulated pseudo-EEG data consist of beamformer or Champagne source reconstruction, aggregation of time series within ROIs using a fixed number of strongest PCs, and a robust FC metric like MIM or TRGC. To which scenarios these results can be generalized remains to be shown in further studies. In practice, low SNR, high numbers of interactions, and small interaction delays may, however, reduce the performance even of the best performing pipelines.

Data and code availability

The code for the simulation can be found here: https://github.com/fpellegrini/FCsim. The code for the ROIconnect plugin can be found here: https://github.com/sccn/roiconnect. And the code for the minimal real data example here: https://github.com/fpellegrini/MotorImag. Data of the real data example are available upon request.

CRediT authorship contribution statement

Franziska Pellegrini: Methodology, Software, Investigation, Writing – original draft, Visualization. Arnaud Delorme: Validation, Software, Writing – review & editing. Vadim Nikulin: Methodology, Writing – review & editing, Supervision. Stefan Haufe: Conceptualization, Methodology, Validation, Investigation, Resources, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This project was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 758985) and through Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Project-ID 424778381 TRR 295. We thank Tien Dung Nguyen for contributing to the development of the ROIconnect plugin. The computations for this work were partly run on the open Neuroscience Gateway cluster (Sivagnanam et al., 2013).

Appendix A. ROIconnect toolbox

ROIconnect is a freely available open-source plugin to the popular MATLAB-based open-source toolbox EEGLAB for EEG data analysis. It adds the functionality of calculating region-wise power and inter-regional FC on the source level. Moreover, it provides functions to visualize power and FC. All functions can be accessed through the EEGLAB GUI or the command line. ROIconnect uses core EEGLAB functions for importing and preprocessing EEG data, and for calculating the leadfield and source model. We refer users to other EEGLAB functions to preprocess data before applying ROIconnect functions. The ROIconnect plugin can be downloaded from GitHub (https://github.com/sccn/roiconnect) or installed via the EEGLAB GUI extension manager.

Key features

The features of ROIconnect are implemented in three main functions: pop_roi_activity, pop_roi_connect, and pop_roi_connectplot.

pop_roi_activity takes an EEG struct containing EEG sensor activity, a pointer to a headmodel and a source model, the atlas name, and the number of PCs for dimensionality reduction as input. It then calculates a source projection filter (default: LCMV) and applies it to the sensor data. Power is then calculated with the Welch method for every frequency on the voxel time series and then summed across voxels within regions. The result is saved in EEG.roi.source_roi_power. To estimate region-wise FC, the pop_roi_activity function reduces the dimensionality of the time series of every region by employing a PCA and selecting the strongest PCs (as defined in the input) for every region. The resulting time series are then stored in EEG.roi.source_roi_data.
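The aggregation step described above can be illustrated with a minimal NumPy sketch. This is not the ROIconnect implementation (which is written in MATLAB); the function names `aggregate_region` and `aggregate_all_regions`, the array layout, and the toy atlas labels are assumptions made purely for illustration. The sketch keeps the same number of strongest PCs for every region, which is the property the main text recommends.

```python
import numpy as np

def aggregate_region(voxel_ts, n_pcs):
    """Reduce a (n_samples, n_signals) array to its n_pcs strongest PCs."""
    if voxel_ts.shape[1] < n_pcs:
        raise ValueError("region has fewer signals than requested PCs")
    centered = voxel_ts - voxel_ts.mean(axis=0, keepdims=True)
    # SVD of the centered data: the columns of u * s are the PC time
    # courses, ordered from largest to smallest explained variance.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def aggregate_all_regions(source_ts, region_labels, n_pcs=3):
    """Keep the same number of PCs in every region (hypothetical helper)."""
    region_labels = np.asarray(region_labels)
    return {
        region: aggregate_region(source_ts[:, region_labels == region], n_pcs)
        for region in np.unique(region_labels).tolist()
    }

# Toy data: 1000 samples of 12 source signals assigned to 3 regions
# of unequal size (labels are an illustrative stand-in for an atlas).
rng = np.random.default_rng(0)
source_ts = rng.standard_normal((1000, 12))
region_labels = [0] * 3 + [1] * 4 + [2] * 5
roi_data = aggregate_all_regions(source_ts, region_labels, n_pcs=3)
```

Because every region ends up with the same number of columns, region pairs of different anatomical size enter a multivariate FC metric with equal dimensionality.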

pop_roi_connect calculates FC between regions. It builds on the output of pop_roi_activity. That is, it takes the EEG struct as input, as well as the name of the FC metrics that should be calculated. The function calculates all FC metrics in a frequency-resolved way. That is, the output contains FC scores for every region–region–frequency combination. To avoid biases due to different data lengths, pop_roi_connect estimates FC for time windows (‘snippets’) of 60 sec length (default), which subsequently can be averaged (default) or used as input for later statistical analyses. The snippet length can be flexibly adjusted by the user. The output of this function is stored under the name of the respective FC metric under EEG.roi.
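The snippet-based estimation can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the ROIconnect code: the function names, the 2 s Hann-windowed non-overlapping segments, and the choice of the absolute imaginary part of coherency as a simple bivariate stand-in for the FC metric are all hypothetical choices made for this sketch. Only the idea of splitting the recording into fixed-length snippets (60 s by default) and optionally averaging the per-snippet scores is taken from the text.

```python
import numpy as np

def cross_spectrum(x, y, seg_len):
    """Cross-spectrum averaged over non-overlapping Hann-windowed segments."""
    n_seg = len(x) // seg_len
    window = np.hanning(seg_len)
    xs = x[: n_seg * seg_len].reshape(n_seg, seg_len) * window
    ys = y[: n_seg * seg_len].reshape(n_seg, seg_len) * window
    fx = np.fft.rfft(xs, axis=1)
    fy = np.fft.rfft(ys, axis=1)
    return (fx * np.conj(fy)).mean(axis=0)

def abs_imag_coherency(x, y, seg_len):
    """Absolute imaginary part of coherency for every frequency bin."""
    sxy = cross_spectrum(x, y, seg_len)
    sxx = cross_spectrum(x, x, seg_len).real
    syy = cross_spectrum(y, y, seg_len).real
    return np.abs(np.imag(sxy / np.sqrt(sxx * syy)))

def snippet_fc(x, y, fs, snippet_sec=60, seg_sec=2, average=True):
    """Estimate FC per fixed-length snippet; optionally average snippets."""
    snip_len = int(snippet_sec * fs)
    seg_len = int(seg_sec * fs)
    n_snip = len(x) // snip_len
    if n_snip < 1:
        raise ValueError("recording is shorter than one snippet")
    scores = np.array([
        abs_imag_coherency(x[i * snip_len:(i + 1) * snip_len],
                           y[i * snip_len:(i + 1) * snip_len], seg_len)
        for i in range(n_snip)
    ])
    return scores.mean(axis=0) if average else scores

# Toy example: y is a noisy copy of x delayed by 2 samples (20 ms).
rng = np.random.default_rng(1)
fs = 100
x = rng.standard_normal(180 * fs)
y = np.roll(x, 2) + rng.standard_normal(x.size)
fc = snippet_fc(x, y, fs)                       # one score per frequency
freqs = np.fft.rfftfreq(2 * fs, d=1 / fs)       # matching frequency axis
```

Since every snippet contains the same number of samples, each per-snippet estimate is based on the same amount of data regardless of the total recording length; passing `average=False` returns the per-snippet scores for later statistical analysis.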

The pop_roi_connectplot function enables visualizing power and FC in the following modes:

  • Power as region-wise bar plot.

  • Power as source-level cortical surface topography.

  • FC as region-by-region matrix.

  • Net FC, that is, the mean FC from all regions to all regions, as cortical surface topography.

  • Seed FC, that is, the FC of a seed region to all other regions, as cortical surface topography.

For plotting, a specific frequency or frequency band can be chosen by the user. For matrix representations, it is also possible to just plot one of the hemispheres or only regions belonging to specific brain lobes.
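The net FC and seed FC summaries listed above, together with the selection of a frequency band, can be written down in a few lines. The following NumPy sketch is illustrative only and does not reproduce the plugin's plotting code: the function names, the `(n_freqs, n_regions, n_regions)` array layout, and the choice to exclude the diagonal when averaging are assumptions.

```python
import numpy as np

def band_average(fc_freq, freqs, band):
    """Average a (n_freqs, n_regions, n_regions) FC array over a band."""
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return fc_freq[mask].mean(axis=0)

def net_fc(fc):
    """Mean FC of each region to all other regions (diagonal excluded)."""
    fc = np.asarray(fc, dtype=float)
    off_diag = fc.copy()
    np.fill_diagonal(off_diag, 0.0)
    return off_diag.sum(axis=1) / (fc.shape[0] - 1)

def seed_fc(fc, seed):
    """FC of one seed region to every region (one row of the matrix)."""
    return np.asarray(fc, dtype=float)[seed]

# Toy symmetric FC matrix for three regions.
fc = np.array([[0.0, 1.0, 2.0],
               [1.0, 0.0, 3.0],
               [2.0, 3.0, 0.0]])
print(net_fc(fc))      # [1.5 2.  2.5]
print(seed_fc(fc, 0))  # [0. 1. 2.]
```

Each resulting vector holds one value per region and could then be mapped onto the cortical surface for display.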

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.neuroimage.2023.120218.

Appendix B. Supplementary materials

Supplementary Data S1

Supplementary Raw Research Data. This is open data under the CC BY license http://creativecommons.org/licenses/by/4.0/

Data availability

Data will be made available on request.

References

  1. Allen M., Poggiali D., Whitaker K., Marshall T.R., Kievit R.A. Raincloud plots: a multi-platform tool for robust data visualization. Wellcome Open Res. 2019;4 doi: 10.12688/wellcomeopenres.15191.1.
  2. Allouch S., Kabbara A., Duprez J., Khalil M., Modolo J., Hassan M. Effect of channel density, inverse solutions and connectivity measures on EEG resting-state networks reconstruction: a simulation study. Neuroimage. 2023;271:120006. doi: 10.1016/j.neuroimage.2023.120006.
  3. Allouch S., Yochum M., Kabbara A., Duprez J., Khalil M., Wendling F., Hassan M., Modolo J. Mean-field modeling of brain-scale dynamics for the evaluation of EEG source-space networks. Brain Topogr. 2022;35(1):54–65. doi: 10.1007/s10548-021-00859-9.
  4. Anzolin A., Presti P., Van De Steen F., Astolfi L., Haufe S., Marinazzo D. Quantifying the effect of demixing approaches on directed connectivity estimated between reconstructed EEG sources. Brain Topogr. 2019;32(4):655–674. doi: 10.1007/s10548-019-00705-z.
  5. Anzolin A., Toppi J., Petti M., Cincotti F., Astolfi L. SEED-G: Simulated EEG data generator for testing connectivity algorithms. Sensors. 2021;21(11):3632. doi: 10.3390/s21113632.
  6. Aoki M., Havenner A. State space modeling of multiple time series. Econom. Rev. 1991;10(1):1–59.
  7. Astolfi L., Cincotti F., Mattia D., Marciani M.G., Baccala L.A., de Vico Fallani F., Salinari S., Ursino M., Zavaglia M., Ding L., et al. Comparison of different cortical connectivity estimators for high-resolution EEG recordings. Hum. Brain Mapp. 2007;28(2):143–157. doi: 10.1002/hbm.20263.
  8. Babiloni C., Del Percio C., Lizio R., Noce G., Lopez S., Soricelli A., Ferri R., Nobili F., Arnaldi D., Famà F., et al. Abnormalities of resting-state functional cortical connectivity in patients with dementia due to Alzheimer’s and Lewy body diseases: an EEG study. Neurobiol. Aging. 2018;65:18–40. doi: 10.1016/j.neurobiolaging.2017.12.023.
  9. Barnett L., Barrett A.B., Seth A.K. Solved problems for Granger causality in neuroscience: a response to Stokes and Purdon. Neuroimage. 2018;178:744–748. doi: 10.1016/j.neuroimage.2018.05.067.
  10. Barnett L., Seth A.K. The MVGC multivariate Granger causality toolbox: a new approach to Granger-causal inference. J. Neurosci. Methods. 2014;223:50–68. doi: 10.1016/j.jneumeth.2013.10.018.
  11. Barnett L., Seth A.K. Granger causality for state-space models. Phys. Rev. E. 2015;91(4):040101. doi: 10.1103/PhysRevE.91.040101.
  12. Barrett A.B., Barnett L., Seth A.K. Multivariate Granger causality and generalized variance. Phys. Rev. E. 2010;81(4):041907. doi: 10.1103/PhysRevE.81.041907.
  13. Basti A., Nili H., Hauk O., Marzetti L., Henson R.N. Multi-dimensional connectivity: a conceptual and mathematical review. Neuroimage. 2020:117179. doi: 10.1016/j.neuroimage.2020.117179.
  14. Bastos A.M., Schoffelen J.-M. A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Front. Syst. Neurosci. 2016;9:175. doi: 10.3389/fnsys.2015.00175.
  15. Bensaid S., Modolo J., Merlet I., Wendling F., Benquet P. Coalia: a computational model of human EEG for consciousness research. Front. Syst. Neurosci. 2019;13:59. doi: 10.3389/fnsys.2019.00059.
  16. Blankertz B., Sannelli C., Halder S., Hammer E.M., Kübler A., Müller K.-R., Curio G., Dickhaus T. Neurophysiological predictor of SMR-based BCI performance. Neuroimage. 2010;51(4):1303–1309. doi: 10.1016/j.neuroimage.2010.03.022.
  17. Bradley A., Yao J., Dewald J., Richter C.-P. Evaluation of electroencephalography source localization algorithms with multiple cortical sources. PLoS ONE. 2016;11(1):e0147266. doi: 10.1371/journal.pone.0147266.
  18. Bressler S.L., Seth A.K. Wiener–Granger causality: a well established methodology. Neuroimage. 2011;58(2):323–329. doi: 10.1016/j.neuroimage.2010.02.059.
  19. Brunner C., Billinger M., Seeber M., Mullen T.R., Makeig S. Volume conduction influences scalp-based connectivity estimates. Front. Comput. Neurosci. 2016;10:121. doi: 10.3389/fncom.2016.00121.
  20. Castaño Candamil S., Höhne J., Martínez-Vargas J.-D., An X.-W., Castellanos-Domínguez G., Haufe S. Solving the EEG inverse problem based on space–time–frequency structured sparsity constraints. Neuroimage. 2015;118:598–612. doi: 10.1016/j.neuroimage.2015.05.052.
  21. Chella F., Pizzella V., Zappasodi F., Marzetti L. Impact of the reference choice on scalp EEG connectivity estimation. J. Neural Eng. 2016;13(3):036016. doi: 10.1088/1741-2560/13/3/036016.
  22. Colclough G.L., Brookes M.J., Smith S.M., Woolrich M.W. A symmetric multivariate leakage correction for MEG connectomes. Neuroimage. 2015;117:439–448. doi: 10.1016/j.neuroimage.2015.03.071.
  23. Colclough G.L., Woolrich M.W., Tewarie P.K., Brookes M.J., Quinn A.J., Smith S.M. How reliable are MEG resting-state connectivity metrics? Neuroimage. 2016;138:284–293. doi: 10.1016/j.neuroimage.2016.05.070.
  24. D’Andrea A., Chella F., Marshall T.R., Pizzella V., Romani G.L., Jensen O., Marzetti L. Alpha and alpha-beta phase synchronization mediate the recruitment of the visuospatial attention network through the superior longitudinal fasciculus. Neuroimage. 2019;188:722–732. doi: 10.1016/j.neuroimage.2018.12.056.
  25. De Pasquale F., Della Penna S., Snyder A.Z., Lewis C., Mantini D., Marzetti L., Belardinelli P., Ciancetta L., Pizzella V., Romani G.L., et al. Temporal dynamics of spontaneous MEG activity in brain networks. Proc. Natl. Acad. Sci. 2010;107(13):6040–6045. doi: 10.1073/pnas.0913863107.
  26. Desikan R.S., Ségonne F., Fischl B., Quinn B.T., Dickerson B.C., Blacker D., Buckner R.L., Dale A.M., Maguire R.P., Hyman B.T., et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 2006;31(3):968–980. doi: 10.1016/j.neuroimage.2006.01.021.
  27. Dowding I., Haufe S. Powerful statistical inference for nested data using sufficient summary statistics. Front. Hum. Neurosci. 2018;12:103. doi: 10.3389/fnhum.2018.00103.
  28. Ewald A., Marzetti L., Zappasodi F., Meinecke F.C., Nolte G. Estimating true brain connectivity from EEG/MEG data invariant to linear and static transformations in sensor space. Neuroimage. 2012;60(1):476–488. doi: 10.1016/j.neuroimage.2011.11.084.
  29. Faes L., Stramaglia S., Marinazzo D. On the interpretability and computational reliability of frequency-domain Granger causality. F1000Res. 2017;6 doi: 10.12688/f1000research.12694.1.
  30. Fraschini M., Demuru M., Crobe A., Marrosu F., Stam C.J., Hillebrand A. The effect of epoch length on estimated EEG functional connectivity and brain network organisation. J. Neural Eng. 2016;13(3):036015. doi: 10.1088/1741-2560/13/3/036015.
  31. Fries P. A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn. Sci. (Regul. Ed.) 2005;9(10):474–480. doi: 10.1016/j.tics.2005.08.011.
  32. Fries P. Rhythms for cognition: communication through coherence. Neuron. 2015;88(1):220–235. doi: 10.1016/j.neuron.2015.09.034.
  33. Friston K.J. Functional and effective connectivity: a review. Brain Connect. 2011;1(1):13–36. doi: 10.1089/brain.2011.0008.
  34. Friston K.J., Rotshtein P., Geng J.J., Sterzer P., Henson R.N. A critique of functional localisers. Neuroimage. 2006;30(4):1077–1087. doi: 10.1016/j.neuroimage.2005.08.012.
  35. Geweke J. Measurement of linear dependence and feedback between multiple time series. J. Am. Stat. Assoc. 1982;77(378):304–313.
  36. Ghumare E.G., Schrooten M., Vandenberghe R., Dupont P. A time-varying connectivity analysis from distributed EEG sources: a simulation study. Brain Topogr. 2018;31(5):721–737. doi: 10.1007/s10548-018-0621-3.
  37. Gómez-Herrero G., Atienza M., Egiazarian K., Cantero J.L. Measuring directional coupling between EEG sources. Neuroimage. 2008;43(3):497–508. doi: 10.1016/j.neuroimage.2008.07.032.
  38. Gramfort A., Papadopoulo T., Olivi E., Clerc M. OpenMEEG: opensource software for quasistatic bioelectromagnetics. Biomed. Eng. Online. 2010;9(1):1–20. doi: 10.1186/1475-925X-9-45.
  39. Granger C.W.J. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: J. Econom. Soc. 1969:424–438.
  40. Gross J., Kujala J., Hämäläinen M., Timmermann L., Schnitzler A., Salmelin R. Dynamic imaging of coherent sources: studying neural interactions in the human brain. Proc. Natl. Acad. Sci. 2001;98(2):694–699. doi: 10.1073/pnas.98.2.694.
  41. Grova C., Daunizeau J., Lina J.-M., Bénar C.G., Benali H., Gotman J. Evaluation of EEG localization methods using realistic simulations of interictal spikes. Neuroimage. 2006;29(3):734–753. doi: 10.1016/j.neuroimage.2005.08.053.
  42. Habermehl C., Steinbrink J.M., Müller K.-R., Haufe S. Optimizing the regularization for image reconstruction of cerebral diffuse optical tomography. J. Biomed. Opt. 2014;19(9):096006. doi: 10.1117/1.JBO.19.9.096006.
  43. Halder T., Talwar S., Jaiswal A.K., Banerjee A. Quantitative evaluation in estimating sources underlying brain oscillations using current source density methods and beamformer approaches. eNeuro. 2019;6(4) doi: 10.1523/ENEURO.0170-19.2019.
  44. Hämäläinen M., Hari R., Ilmoniemi R.J., Knuutila J., Lounasmaa O.V. Magnetoencephalography – theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev. Mod. Phys. 1993;65(2):413.
  45. Hashemi A., Cai C., Kutyniok G., Müller K.-R., Nagarajan S.S., Haufe S. Unification of sparse Bayesian learning algorithms for electromagnetic brain imaging with the majorization minimization framework. bioRxiv. 2021 doi: 10.1016/j.neuroimage.2021.118309.
  46. Haufe S., Ewald A. A simulation framework for benchmarking EEG-based brain connectivity estimation methodologies. Brain Topogr. 2019;32(4):625–642. doi: 10.1007/s10548-016-0498-y.
  47. Haufe S., Nikulin V.V., Müller K.-R., Nolte G. A critical assessment of connectivity measures for EEG data: a simulation study. Neuroimage. 2013;64:120–133. doi: 10.1016/j.neuroimage.2012.09.036.
  48. Haufe S., Nikulin V.V., Nolte G. International Conference on Latent Variable Analysis and Signal Separation. Springer; 2012. Alleviating the influence of weak data asymmetries on Granger-causal analyses; pp. 25–33.
  49. Haufe S., Nikulin V.V., Ziehe A., Müller K.-R., Nolte G. Combining sparsity and rotational invariance in EEG/MEG source reconstruction. Neuroimage. 2008;42(2):726–738. doi: 10.1016/j.neuroimage.2008.04.246.
  50. Haufe S., Tomioka R., Dickhaus T., Sannelli C., Blankertz B., Nolte G., Müller K.-R. Large-scale EEG/MEG source localization with spatial flexibility. Neuroimage. 2011;54(2):851–859. doi: 10.1016/j.neuroimage.2010.09.003.
  51. Hillebrand A., Barnes G.R., Bosboom J.L., Berendse H.W., Stam C.J. Frequency-dependent functional connectivity within resting-state networks: an atlas-based MEG beamformer solution. Neuroimage. 2012;59(4):3909–3921. doi: 10.1016/j.neuroimage.2011.11.005.
  52. Hincapié A.-S., Kujala J., Mattout J., Daligault S., Delpuech C., Mery D., Cosmelli D., Jerbi K. MEG connectivity and power detections with minimum norm estimates require different regularization parameters. Comput. Intell. Neurosci. 2016;2016 doi: 10.1155/2016/3979547.
  53. Hincapié A.-S., Kujala J., Mattout J., Pascarella A., Daligault S., Delpuech C., Mery D., Cosmelli D., Jerbi K. The impact of MEG source reconstruction method on source-space connectivity estimation: a comparison between minimum-norm solution and beamforming. Neuroimage. 2017;156:29–42. doi: 10.1016/j.neuroimage.2017.04.038.
  54. Hipp J.F., Engel A.K., Siegel M. Oscillatory synchronization in large-scale cortical networks predicts perception. Neuron. 2011;69(2):387–396. doi: 10.1016/j.neuron.2010.12.027.
  55. Hipp J.F., Hawellek D.J., Corbetta M., Siegel M., Engel A.K. Large-scale cortical correlation structure of spontaneous oscillatory activity. Nat. Neurosci. 2012;15(6):884–890. doi: 10.1038/nn.3101.
  56. Huang M.-X., Song T., Hagler D.J., Jr, Podgorny I., Jousmaki V., Cui L., Gaa K., Harrington D.L., Dale A.M., Lee R.R., et al. A novel integrated MEG and EEG analysis method for dipolar sources. Neuroimage. 2007;37(3):731–748. doi: 10.1016/j.neuroimage.2007.06.002.
  57. Huang Y., Zhang J., Cui Y., Yang G., He L., Liu Q., Yin G. How different EEG references influence sensor level functional connectivity graphs. Front. Neurosci. 2017;11:368. doi: 10.3389/fnins.2017.00368.
  58. Idaji M.J., Zhang J., Stephani T., Nolte G., Mueller K.-R., Villringer A., Nikulin V. Harmoni: a method for eliminating spurious interactions due to the harmonic components in neuronal data. bioRxiv. 2021 doi: 10.1016/j.neuroimage.2022.119053.
  59. Jaiswal A., Nenonen J., Stenroos M., Gramfort A., Dalal S.S., Westner B.U., Litvak V., Mosher J.C., Schoffelen J.-M., Witton C., et al. Comparison of beamformer implementations for MEG source localization. Neuroimage. 2020;216:116797. doi: 10.1016/j.neuroimage.2020.116797.
  60. Jirsa V., Müller V. Cross-frequency coupling in real and virtual brain networks. Front. Comput. Neurosci. 2013;7:78. doi: 10.3389/fncom.2013.00078.
  61. Korhonen O., Palva S., Palva J.M. Sparse weightings for collapsing inverse solutions to cortical parcellations optimize M/EEG source reconstruction accuracy. J. Neurosci. Methods. 2014;226:147–160. doi: 10.1016/j.jneumeth.2014.01.031.
  62. Liuzzi L., Gascoyne L.E., Tewarie P.K., Barratt E.L., Boto E., Brookes M.J. Optimising experimental design for MEG resting state functional connectivity measurement. Neuroimage. 2017;155:565–576. doi: 10.1016/j.neuroimage.2016.11.064.
  63. Mahjoory K., Nikulin V.V., Botrel L., Linkenkaer-Hansen K., Fato M.M., Haufe S. Consistency of EEG source localization and connectivity estimates. Neuroimage. 2017;152:590–601. doi: 10.1016/j.neuroimage.2017.02.076.
  64. Marzetti L., Basti A., Chella F., D’Andrea A., Syrjälä J., Pizzella V. Brain functional connectivity through phase coupling of neuronal oscillations: a perspective from magnetoencephalography. Front. Neurosci. 2019;13:964. doi: 10.3389/fnins.2019.00964.
  65. Mazziotta J.C., Toga A.W., Evans A., Fox P., Lancaster J., et al. A probabilistic atlas of the human brain: theory and rationale for its development. Neuroimage. 1995;2(2):89–101. doi: 10.1006/nimg.1995.1012.
  66. Miocinovic S., de Hemptinne C., Chen W., Isbaine F., Willie J.T., Ostrem J.L., Starr P.A. Cortical potentials evoked by subthalamic stimulation demonstrate a short latency hyperdirect pathway in humans. J. Neurosci. 2018;38(43):9129–9141. doi: 10.1523/JNEUROSCI.1327-18.2018.
  67. Nolte G., Bai O., Wheaton L., Mari Z., Vorbach S., Hallett M. Identifying true brain interaction from EEG data using the imaginary part of coherency. Clin. Neurophysiol. 2004;115(10):2292–2307. doi: 10.1016/j.clinph.2004.04.029.
  68. Nolte G., Ziehe A., Nikulin V.V., Schlögl A., Krämer N., Brismar T., Müller K.-R. Robustly estimating the flow direction of information in complex physical systems. Phys. Rev. Lett. 2008;100(23):234101. doi: 10.1103/PhysRevLett.100.234101.
  69. Nunez P.L., Srinivasan R., Westdorp A.F., Wijesinghe R.S., Tucker D.M., Silberstein R.B., Cadusch P.J. EEG coherency: I: statistics, reference electrode, volume conduction, Laplacians, cortical imaging, and interpretation at multiple scales. Electroencephalogr. Clin. Neurophysiol. 1997;103(5):499–515. doi: 10.1016/s0013-4694(97)00066-7.
  70. Oswal A., Beudel M., Zrinzo L., Limousin P., Hariz M., Foltynie T., Litvak V., Brown P. Deep brain stimulation modulates synchrony within spatially and spectrally distinct resting state networks in Parkinson’s disease. Brain. 2016;139(5):1482–1496. doi: 10.1093/brain/aww048.
  71. Palva J.M., Monto S., Kulashekhar S., Palva S. Neuronal synchrony reveals working memory networks and predicts individual memory capacity. Proc. Natl. Acad. Sci. 2010;107(16):7580–7585. doi: 10.1073/pnas.0913113107.
  72. Palva J.M., Wang S.H., Palva S., Zhigalov A., Monto S., Brookes M.J., Schoffelen J.-M., Jerbi K. Ghost interactions in MEG/EEG source space: a note of caution on inter-areal coupling measures. Neuroimage. 2018;173:632–643. doi: 10.1016/j.neuroimage.2018.02.032.
  73. Palva S., Kulashekhar S., Hämäläinen M., Palva J.M. Localization of cortical phase and amplitude dynamics during visual working memory encoding and retention. J. Neurosci. 2011;31(13):5013–5025. doi: 10.1523/JNEUROSCI.5592-10.2011.
  74. Pascual-Marqui R.D. Discrete, 3D distributed, linear imaging methods of electric neuronal activity. Part 1: exact, zero error localization. arXiv preprint arXiv:0710.3341. 2007
  75. Pascual-Marqui R.D., Lehmann D., Koukkou M., Kochi K., Anderer P., Saletu B., Tanaka H., Hirata K., John E.R., Prichep L., Biscay-Lirio R., Kinoshita T. Assessing interactions in the brain with exact low-resolution electromagnetic tomography. Philos. Trans. A Math. Phys. Eng. Sci. 2011;369:3768–3784. doi: 10.1098/rsta.2011.0081.
  76. Pathak A., Roy D., Banerjee A. Whole-brain network models: from physics to bedside. Front. Comput. Neurosci. 2022;16 doi: 10.3389/fncom.2022.866517.
  77. Perinelli A., Assecondi S., Tagliabue C.F., Mazza V. Power shift and connectivity changes in healthy aging during resting-state EEG. Neuroimage. 2022:119247. doi: 10.1016/j.neuroimage.2022.119247.
  78. Pizzella V., Marzetti L., Della Penna S., de Pasquale F., Zappasodi F., Romani G.L. Magnetoencephalography in the study of brain dynamics. Funct. Neurol. 2014;29(4):241.
  79. Rubega M., Carboni M., Seeber M., Pascucci D., Tourbier S., Toscano G., Van Mierlo P., Hagmann P., Plomp G., Vulliemoz S., et al. Estimating EEG source dipole orientation based on singular-value decomposition for connectivity analysis. Brain Topogr. 2019;32(4):704–719. doi: 10.1007/s10548-018-0691-2.
  80. Sannelli C., Vidaurre C., Müller K.-R., Blankertz B. A large scale screening study with a SMR-based BCI: categorization of BCI users and differences in their SMR activity. PLoS ONE. 2019;14(1):e0207351. doi: 10.1371/journal.pone.0207351.
  81. Sanz Leon P., Knock S.A., Woodman M.M., Domide L., Mersmann J., McIntosh A.R., Jirsa V. The virtual brain: a simulator of primate brain network dynamics. Front. Neuroinform. 2013;7:10. doi: 10.3389/fninf.2013.00010.
  82. Schaworonkow N., Nikulin V.V. Is sensor space analysis good enough? Spatial patterns as a tool for assessing spatial mixing of EEG/MEG rhythms. bioRxiv. 2021 doi: 10.1016/j.neuroimage.2022.119093.
  83. Schoffelen J.-M., Gross J. Source connectivity analysis with MEG and EEG. Hum. Brain Mapp. 2009;30(6):1857–1865. doi: 10.1002/hbm.20745.
  84. Schoffelen J.-M., Gross J. Studying dynamic neural interactions with MEG. Magnetoencephalography: from signals to dynamic cortical networks. 2019:519–541.
  85. Schoffelen J.-M., Hultén A., Lam N., Marquand A.F., Uddén J., Hagoort P. Frequency-specific directed interactions in the human brain network for language. Proc. Natl. Acad. Sci. 2017;114(30):8083–8088. doi: 10.1073/pnas.1703155114.
  86. Shouno O., Tachibana Y., Nambu A., Doya K. Computational model of recurrent subthalamo-pallidal circuit for generation of parkinsonian oscillations. Front. Neuroanat. 2017;11:21. doi: 10.3389/fnana.2017.00021.
  87. Silfverhuth M.J., Hintsala H., Kortelainen J., Seppänen T. Experimental comparison of connectivity measures with simulated EEG signals. Med. Biol. Eng. Comput. 2012;50(7):683–688. doi: 10.1007/s11517-012-0911-y.
  88. Sivagnanam S., Majumdar A., Yoshimoto K., Astakhov V., Bandrowski A.E., Martone M.E., Carnevale N.T., et al. Introducing the neuroscience gateway. IWSG. 2013;993:0. doi: 10.1002/cpe.3283.
  89. Sommariva S., Sorrentino A., Piana M., Pizzella V., Marzetti L. A comparative study of the robustness of frequency-domain connectivity measures to finite data length. Brain Topogr. 2019;32(4):675–695. doi: 10.1007/s10548-017-0609-4.
  90. Song J., Davey C., Poulsen C., Luu P., Turovets S., Anderson E., Li K., Tucker D. EEG source localization: sensor density and head surface coverage. J. Neurosci. Methods. 2015;256:9–21. doi: 10.1016/j.jneumeth.2015.08.015.
  91. Van de Steen F., Faes L., Karahan E., Songsiri J., Valdes-Sosa P.A., Marinazzo D. Critical comments on EEG sensor space dynamical connectivity analysis. Brain Topogr. 2019;32:643–654. doi: 10.1007/s10548-016-0538-7.
  92. Supp G.G., Schlögl A., Trujillo-Barreto N., Müller M.M., Gruber T. Directed cortical information flow during human object recognition: analyzing induced EEG gamma-band responses in brain’s source space. PLoS ONE. 2007;2(8):e684. doi: 10.1371/journal.pone.0000684.
  93. Tabbal J., Kabbara A., Yochum M., Khalil M., Hassan M., Benquet P. Assessing HD-EEG functional connectivity states using a human brain computational model. J. Neural Eng. 2022;19(5):056032. doi: 10.1088/1741-2552/ac954f.
  94. Tadel F., Baillet S., Mosher J.C., Pantazis D., Leahy R.M. Brainstorm: a user-friendly application for MEG/EEG analysis. Comput. Intell. Neurosci. 2011;2011 doi: 10.1155/2011/879716.
  95. Van Diessen E., Numan T., Van Dellen E., Van Der Kooi A.W., Boersma M., Hofman D., Van Lutterveld R., Van Dijk B.W., Van Straaten E., Hillebrand A., et al. Opportunities and methodological challenges in EEG and MEG resting state functional brain network research. Clin. Neurophysiol. 2015;126(8):1468–1481. doi: 10.1016/j.clinph.2014.11.018.
  96. Van Veen B.D., Van Drongelen W., Yuchtman M., Suzuki A. Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans. Biomed. Eng. 1997;44(9):867–880. doi: 10.1109/10.623056.
  97. Vinck M., Huurdeman L., Bosman C.A., Fries P., Battaglia F.P., Pennartz C.M.A., Tiesinga P.H. How to detect the Granger-causal flow direction in the presence of additive noise? Neuroimage. 2015;108:301–318. doi: 10.1016/j.neuroimage.2014.12.017.
  98. Vinck M., van Wingerden M., Womelsdorf T., Fries P., Pennartz C.M.A. The pairwise phase consistency: a bias-free measure of rhythmic neuronal synchronization. Neuroimage. 2010;51(1):112–122. doi: 10.1016/j.neuroimage.2010.01.073.
  99. Wall M.E., Rechtsteiner A., Rocha L.M. A practical approach to microarray data analysis. Springer; 2003. Singular Value Decomposition and Principal Component Analysis; pp. 91–109.
  100. Wang H.E., Bénar C.G., Quilichini P.P., Friston K.J., Jirsa V.K., Bernard C. A systematic framework for functional connectivity measures. Front. Neurosci. 2014;8:405. doi: 10.3389/fnins.2014.00405.
  101. Wang S.H., Lobier M., Siebenhühner F., Puoliväli T., Palva S., Palva J.M. Hyperedge bundling: a practical solution to spurious interactions in MEG/EEG source connectivity analyses. Neuroimage. 2018;173:610–622. doi: 10.1016/j.neuroimage.2018.01.056.
  102. Westner B.U., Dalal S.S., Gramfort A., Litvak V., Mosher J.C., Oostenveld R., Schoffelen J.-M. A unified view on beamformers for M/EEG source reconstruction. Neuroimage. 2021:118789. doi: 10.1016/j.neuroimage.2021.118789.
  103. Whittle P. On the fitting of multivariate autoregressions, and the approximate canonical factorization of a spectral density matrix. Biometrika. 1963;50(1–2):129–134.
  104. Winkler I., Panknin D., Bartz D., Müller K.-R., Haufe S. Validity of time reversal for testing Granger causality. IEEE Trans. Signal Process. 2016;64(11):2746–2760.
  105. Wipf D.P., Owen J.P., Attias H.T., Sekihara K., Nagarajan S.S. Robust Bayesian estimation of the location, orientation, and time course of multiple correlated neural sources using MEG. Neuroimage. 2010;49(1):641–655. doi: 10.1016/j.neuroimage.2009.06.083.
