Oscillating neural circuits: Phase, amplitude, and the complex normal distribution

Konrad N URBAN; Heejong BONG; Josue ORELLANA; Robert E KASS

doi:10.1002/cjs.11790

Author manuscript; available in PMC: 2024 Sep 1.

Published in final edited form as: Can J Stat. 2023 Jul 22;51(3):824–851. doi: 10.1002/cjs.11790

PMCID: PMC11223177 NIHMSID: NIHMS1995735 PMID: 38974813

Abstract

Multiple oscillating time series are typically analyzed in the frequency domain, where coherence is usually said to represent the magnitude of the correlation between two signals at a particular frequency. The correlation being referenced is complex-valued and is similar to the real-valued Pearson correlation in some ways but not others. We discuss the dependence among oscillating series in the context of the multivariate complex normal distribution, which plays a role for vectors of complex random variables analogous to the usual multivariate normal distribution for vectors of real-valued random variables. We emphasize special cases that are valuable for the neural data we are interested in and provide new variations on existing results. We then introduce a complex latent variable model for narrowly band-pass-filtered signals at some frequency, and show that the resulting maximum likelihood estimate produces a latent coherence that is equivalent to the magnitude of the complex canonical correlation at the given frequency. We also derive an equivalence between partial coherence and the magnitude of complex partial correlation, at a given frequency. Our theoretical framework leads to interpretable results for an interesting multivariate dataset from the Allen Institute for Brain Science.

Keywords: Coherence, complex normal distribution, latent variable model, oscillations. MSC: Primary 62H20; Secondary 62P10

1. INTRODUCTION

Oscillations in neural circuits have been observed under a variety of circumstances and have provoked much speculation about their physiological function (Buzsaki & Draguhn, 2004; Fries, 2005). In the past 15 years, the role of oscillations at particular frequencies has been the subject of considerable experimental investigation, including causal manipulation (Cardin et al., 2009). Of particular interest is the intriguing possibility that oscillations facilitate purposeful communication across distinct parts of the brain, such as when an organism must retrieve and hold items from memory or direct its visual attention to a particular location (Miller, Lundqvist & Bastos, 2018; Schmidt et al., 2019). This has led to the idea that alterations of circuit oscillations could indicate brain dysfunction (Mathalon & Sohal, 2015).

From a statistical perspective, regardless of their mechanistic function, neural oscillations can be considered useful indicators of coordinated activity across brain regions. A short snippet of data, typical of those we have analyzed, is shown in the top panel of Figure 1 together with a band-pass-filtered version. In the bottom panel are band-pass-filtered series for two trials, where at many points in time the amplitudes, phases, or both are different. When series from two electrodes (in different parts of the brain) are considered, the two phases may tend to shift forward or backward together across trials, which indicates coordinated activity. Similarly, there may be trial-to-trial correlation between the amplitudes.

Figure 1: — Three seconds of local field potential (LFP) data filtered at 6.5Hz with a window from 6 to 7Hz. In the top figure, we plot both the raw LFP signal, which consists of a noisy oscillation around 6.5Hz, and the filtered signal, which removes the noise. In the bottom figure, we plot filtered data from the same electrode across two trials. At many points in time, the two trials have different phases or amplitudes.

To quantify this form of association, an obvious question is whether it might be advantageous to consider phase and amplitude together, as defining complex numbers, and model association using multivariate complex distributions. In addition, data such as those in Figure 1 often come from many recordings in each brain region that somehow must be combined. We investigate the properties of the multivariate complex normal distribution with the goal of analyzing interactions among multiple groups of oscillating time series.

A related conceptual concern comes from the standard interpretation of coherence, which is the starting point for much frequency-based analysis of co-dependence. Under the Cramér (or Cramér–Khintchine) representation of a bivariate stationary time series, coherence is usually said to be the magnitude of the time series correlation at a particular frequency (e.g., Shumway & Stoffer, 2017, Section 4.6). The difficulty with this interpretation is that “correlation” here refers to the complex correlation for complex random variables, which is analogous to the Pearson correlation for real-valued variables in certain respects, but not others.

This article reviews and reformulates ideas drawn from the literature and then provides several new results, including a summary of interacting groups of oscillating multivariate time series using a time-domain rendering of latent coherence, which turns out to be the magnitude of a complex-valued canonical correlation. Although we have been motivated by the analysis of neural data, we believe this article will be of general statistical interest as it concerns a basic topic in time series analysis. We also hope it represents a suitable tribute to Nancy Reid, whose work has often aimed to advance statistics through conceptual clarification and consolidation.

2. BACKGROUND AND SUMMARY

We are interested in the covariation of both phase and amplitude in two or more time series. The data we analyze in Section 6 come from experiments during which animals are shown a visual stimulus repeatedly, across many trials, while neural activity is recorded from electrodes inserted into the brain. Repeated measurement across many trials is typical of neurophysiological data, and these repetitions are helpful in dealing with the striking nonstationarity present in neural recordings (see, e.g., Kass et al., 2018). In addition, neurophysiological experiments often collect data from multiple electrodes embedded in multiple brain regions.

From oscillating series such as those in the top panel of Figure 1, the band-pass-filtered version can be obtained by Fourier analytic methods (though it can also be obtained using wavelets). As long as the band in the filtering is sufficiently narrow (surrounding some particular frequency), at any point in time the phase and amplitude of the real-valued, filtered time series can be recovered using the Hilbert transform (see the Appendix) under an assumption of local stationarity (Ombao & Van Bellegem, 2008). A phase and an amplitude define a complex number, and the Hilbert transform converts a real-valued signal into the corresponding complex-valued signal. For a pair of repeatedly observed complex-valued variables, correlation is more complicated than in the real case: covariance becomes complex-valued and there are two forms of linear covariation, called covariance and pseudo-covariance. Complex covariance is defined for two complex-valued random variables $X_{1}$ and $X_{2}$ as

cov (X_{1}, X_{2}) = E {(X_{1} - E X_{1}) (\bar{X_{2} - E X_{2}})},

where $\bar{x}$ is the conjugate of a complex number $x$ . Note that $cov (X_{1}, X_{2})$ is a complex number. Complex pseudo-covariance is defined as

pcov (X_{1}, X_{2}) = E {(X_{1} - E X_{1}) (X_{2} - E X_{2})} .

The variance–covariance and pseudo-variance–covariance matrices for a complex random vector $X$ are defined analogously to the case of complex random variables, using the Hermitian and transpose operators, respectively:

var (X) = E {(X - E X) {(X - E X)}^{H}} and pvar (X) = E {(X - E X) {(X - E X)}^{⊤}},

where ${(X - E X)}^{H} = {(\bar{X - E X})}^{⊤}$ . When the pseudo-covariance is zero, a complex random vector is called proper.
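The two definitions differ only in whether the second factor is conjugated. The following is a minimal sketch of their sample versions for two scalar complex variables observed across trials; the function names `complex_cov` and `complex_pcov` are hypothetical, and dividing by the number of trials (rather than one fewer) is an assumption made for simplicity.

```python
def complex_cov(x1, x2):
    """Sample complex covariance: mean of (x1 - m1) * conj(x2 - m2)."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    return sum((a - m1) * (b - m2).conjugate() for a, b in zip(x1, x2)) / n


def complex_pcov(x1, x2):
    """Sample pseudo-covariance: the same product without the conjugate."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    return sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / n


# Toy data (two "trials"), chosen only so the arithmetic is easy to check.
x1 = [1 + 1j, -1 - 1j]
x2 = [1j, -1j]
print(complex_cov(x1, x2))   # (1-1j)
print(complex_pcov(x1, x2))  # (-1+1j)
```

For a proper variable, the pseudo-covariance computed this way should be close to zero when many trials are available.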

In addition to complex correlation, dependence between phase angles can be measured through phase-locking value (PLV) (Lachaux et al., 1999) which, as we show in Section 3.2.2, can be viewed as an analogue of correlation for angular random variables. For more than two phases, a recently developed class of models called torus graphs provides a thorough rendering of multivariate phase dependence (Klein et al., 2020). A torus graph is any member of the full exponential family with means and cross-products (interaction terms) on a multidimensional torus, which is the natural home for multivariate circular data because the product of circles is a torus. Thus, torus graphs are probabilistic graphical models and represent an analogue to Gaussian graphical models for circular (angular) data.

Phase coupling is also analyzed using coherence, the magnitude of coherency, which is a frequency-domain measure that has a form resembling correlation in terms of spectral densities and cross-spectral densities. One way to understand how coherence becomes complex correlation at a particular frequency is to consider the result of filtering signals with a complex band-pass filter in a band $(ω_{0} - δ, ω_{0} + δ)$ for some small $δ$ . Ombao & Van Bellegem (2008) noted that the resulting narrow-band coherence, which they called “band coherence,” is the correlation between the filtered signals. We can effectively pass to the limit as $δ \to 0$ to get the infinitesimal version that appears in the Cramér–Khintchine representation, displayed in Section 4.1. In a similar vein, in Section 4.1 we also note that a single-frequency bivariate process with stochastic amplitudes and phases yields coherence as the magnitude of complex correlation.

Coherence has been used to understand conditional dependence relationships between time series. Dahlhaus (2000) studied so-called coherence graphs for real-valued time series and showed how conditional dependence may be inferred from the partial coherence between these signals. Partial coherence arises by inverting and rescaling a matrix of coherency values for $d$ time series, and represents a form of conditional dependence between signals (Ombao & Pinto, 2022). Tugnait (2019a,b) analyzed conditional dependence between signals with arbitrary spectra by observing that the limiting distribution of a certain class of transformations of real-valued time series takes the form of a complex normal distribution.

Our setting, in which we observe multiple signals oscillating at the same frequency, differs from those previously studied in Dahlhaus (2000) and Tugnait (2019a,b) because our signals are oscillating only at a single frequency. In Section 4, we provide a comprehensive overview of how, for our setting, complex correlation and coherency are equivalent when we assume complex normality. Similarly, we show that complex partial correlation and partial coherency are equivalent, and we discuss how these concepts are directly tied to the conditional dependence between the observed oscillating time series. Thus, by modelling band-pass-filtered data with the complex normal distribution, we obtain simple interpretations of estimated correlation and partial correlation matrices in terms of coherence and partial coherence.

Our interest is in recordings from multiple electrodes embedded in several brain regions. To take advantage of the multiple sources of activity observed from each region, we developed a novel complex normal latent variable model in which each of several latent variables represents activity in one of the regions (at a particular point in time). Our goal is to estimate the dependence among these latent variables. We specify the model in Section 5 and apply it to data in Section 6. The data we analyze are local field potentials (LFPs), which are voltage recordings from electrodes inserted into the brain; they are low-pass-filtered (smoothed) and typically down-sampled to 1kHz so that each second of data has 1000 observations. LFPs represent bulk activity near the electrode (roughly within 150–200 μm) involving large numbers of neurons (Buzsáki, Anastassiou & Koch, 2012; Einevoll et al., 2013; Pesaran et al., 2018). Our theoretical results enable us to obtain a simple interpretation of data from the Allen Institute for Brain Science, Seattle, WA. We examined three recordings in each of six regions of the visual cortex in response to a visual stimulus and estimated latent correlation and partial correlation matrices after band-pass-filtering at 6.5Hz to isolate the theta rhythm. There are many large latent correlations of the theta rhythm oscillations between regions (which, as we show, may be considered large values of latent coherence). However, the unique associations, that is, the latent partial correlations, between these regions are modest in size with the exception of the unique associations between AM and PM, as well as between DG and CA1, which exhibit latent partial correlations (latent partial coherence) close to 1.

Although some of the theorems in this article may be unsurprising to experts on the complex normal distribution and frequency-based analysis, they all represent at least novel extensions and reformulations, and several results are entirely new. In Section 3.3.3, we discuss the conditional distribution of the angles given the amplitudes in the polar coordinate representation of the complex normal distribution. Navarro, Frellsen & Turner (2017) showed that the multivariate generalized von Mises (mGvM) distribution can be viewed as a conditional distribution of the $2 d$ -dimensional, real-valued multivariate normal distribution when all the amplitudes are equal to 1. Here, we change the setting of this result to the complex normal distribution, generalize the result to arbitrary amplitudes, and show how the conditional distribution changes when certain relevant restrictions are made on the parameters of the complex normal distribution. These results (appearing in Theorems 2 and 3 and Corollary 4) form new characterizations of torus graphs. In Section 4 we provide a general treatment of partial correlation, conditional correlation, and conditional dependence for the complex normal distribution. These results consolidate and reformulate what was previously available in the literature (see Section 4 for specific citations). The new results in Theorem 12 and Corollary 13 show that maximum likelihood estimation in our latent variable model produces estimates that are equivalent to complex-valued versions of canonical correlation. The proofs of the theorems and corollaries we state are provided in the Supplementary Material Section S1.

3. DEPENDENCE OF COMPLEX-VALUED RANDOM VARIABLES

In this section, we first present the data-generating setting we are attempting to model. We observe complex-valued random vectors, and our goal is to measure associations between their entries. After introducing this setting, we mostly ignore the fact that we are dealing with oscillating signals, but we will return to this important point in Section 4 when we study the relationship between coherence, complex correlation, and the complex normal distribution. This important correspondence is the basis for the analysis presented in this paper.

We first review various measures of pairwise association between complex-valued random variables, including complex correlation, PLV, and amplitude correlation. We then discuss two multivariate models for association between complex-valued random vectors: the complex normal distribution, which is based on the linear association between complex random vectors, and the torus graph distribution, which is a multivariate model for phase. In Section 3.3.3, we show some new results that describe how the torus graph distribution arises by conditioning angles on amplitudes in the complex normal distribution.

3.1. Setting: Repeated Observations of Oscillating Signals

In this article, we assume we observe $d$ time series that are oscillating at some frequency $ω_{0}$ on each of many repeated trials. We can model the data as a $d$ -dimensional random vector $Y (t)$ , and assume these vectors are i.i.d. across trials. Throughout most of this article, we omit the trial number from our notation. We can first apply a local band-pass filter to the raw vector $Y (t)$ around the frequency $ω_{0}$ at time $t$ , and then apply the Hilbert transform to obtain a complex-valued random vector $X (t)$ having components that are functions of the phases and amplitudes of the oscillations: $X_{i} (t) = R_{i} (t) \exp {ı Θ_{i} (t)}$ for $R_{i} (t) \in (0, \infty)$ and $Θ_{i} (t) \in [- π, π)$ for components $i \in [d]$ . Here and everywhere else in the article, $[d] = {i \in ℕ : 1 \leq i \leq d}$ and $ı$ represents the imaginary unit.
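The filter-then-Hilbert step can be illustrated on a toy signal. The sketch below is not the pipeline used for the data in this article: it assumes a frequency-domain shortcut in which only the positive-frequency DFT bins inside the band are kept and doubled, which combines an ideal band-pass filter with the analytic-signal construction. The function names and the bin limits `k_lo`, `k_hi` (with $0 < k_{lo} \leq k_{hi} < n / 2$) are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform; adequate for short toy signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def narrowband_analytic(y, k_lo, k_hi):
    """Keep positive-frequency bins k_lo..k_hi, double them, and invert the DFT."""
    n = len(y)
    spec = dft(y)
    kept = [2 * spec[k] if k_lo <= k <= k_hi else 0 for k in range(n)]
    return [sum(kept[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


# Toy signal: an oscillation at bin 4 (amplitude 2, phase 0.5) plus one at bin 10.
n = 64
y = [2.0 * math.cos(2 * math.pi * 4 * t / n + 0.5)
     + 0.3 * math.cos(2 * math.pi * 10 * t / n) for t in range(n)]

x = narrowband_analytic(y, 3, 5)       # complex-valued signal X(t)
amplitude = [abs(z) for z in x]        # R(t)
phase = [cmath.phase(z) for z in x]    # Theta(t); cmath.phase returns values in (-pi, pi]
```

On this toy input the recovered amplitude is constant at about 2 and the phase at $t = 0$ is about 0.5, because both components sit exactly on DFT bins. Real recordings would call for a proper filter design and attention to edge effects.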

3.2. Pairwise Association Between Complex-Valued Random Variables

3.2.1. Complex Correlation

Linear association between complex-valued random variables is measured by complex correlation. We defined both covariance and pseudo-covariance between complex random variables and random vectors in Section 2. The corresponding complex correlations are

corr (X_{1}, X_{2}) = \frac{cov (X_{1}, X_{2})}{\sqrt{var (X_{1}) var (X_{2})}} and pcorr (X_{1}, X_{2}) = \frac{pcov (X_{1}, X_{2})}{\sqrt{var (X_{1}) var (X_{2})}} .

3.2.2. Angular Association: Phase-Locking Value

We can study the association between two phases by considering their representations as unit-length complex random variables $X_{1} = e^{ı Θ_{1}}$ and $X_{2} = e^{ı Θ_{2}}$ . The set of such pairs is the product of two circles, a two-dimensional torus.

The strength of the association between two angular random variables $Θ_{1}$ , $Θ_{2}$ can be measured using PLV (Lachaux et al., 1999). When we have repeated angular observations $θ_{1}^{n}$ , $θ_{2}^{n}$ across $N$ trials, we can define

PLV = \frac{1}{N} | \sum_{n = 1}^{N} \exp {ı (θ_{1}^{n} - θ_{2}^{n})} |,

which is widely used to measure the angular dependence between oscillating signals (Lepage & Vijayan, 2017).
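The PLV formula translates directly into code. This is a minimal sketch with a hypothetical function name `plv`; it assumes the two phase sequences are given in radians and are aligned trial by trial.

```python
import cmath
import math


def plv(theta1, theta2):
    """Phase-locking value: modulus of the mean unit phasor of the phase differences."""
    n = len(theta1)
    return abs(sum(cmath.exp(1j * (a - b)) for a, b in zip(theta1, theta2))) / n


# A constant lag of 0.5 rad on every trial gives perfect locking.
locked = plv([0.1, 1.0, 2.0, -3.0], [0.6, 1.5, 2.5, -2.5])

# Phase differences spread evenly around the circle cancel out.
spread = plv([0.0, 0.0, 0.0, 0.0], [0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

Here `locked` is 1 up to rounding and `spread` is 0 up to rounding, which shows that PLV depends only on how consistent the phase difference is across trials, not on its value.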

To better understand PLV, we provide a theoretical analysis of the components of phase-based association. Suppose that $c_{i} = E {\cos (Θ_{i})}$ and $s_{i} = E {\sin (Θ_{i})}$ for $i \in {1, 2}$ . The first moment of $X_{i}$ is

E (X_{i}) = E {\cos (Θ_{i})} + ı E {\sin (Θ_{i})} = c_{i} + ı s_{i} = r_{i} e^{ı θ_{i}},

where $r_{i} = {(c_{i}^{2} + s_{i}^{2})}^{1 / 2}$ and $θ_{i}$ is the angle of the vector ( $c_{i}$ , $s_{i}$ ) relative to (1,0) for $i \in {1, 2}$ . The complex covariance of $X_{1}$ and $X_{2}$ is

cov (X_{1}, X_{2}) = E (X_{1} {\bar{X}}_{2}) - E (X_{1}) E ({\bar{X}}_{2}) = E {e^{ı (Θ_{1} - Θ_{2})}} - r_{1} r_{2} e^{ı (θ_{1} - θ_{2})},

and the pseudo-covariance of $X_{1}$ and $X_{2}$ is

pcov (X_{1}, X_{2}) = E (X_{1} X_{2}) - E (X_{1}) E (X_{2}) = E {e^{ı (Θ_{1} + Θ_{2})}} - r_{1} r_{2} e^{ı (θ_{1} + θ_{2})} .

Covariance and pseudo-covariance, respectively, represent rotational and reflectional association between $X_{1}$ and $X_{2}$ . Rotational covariance measures clockwise–clockwise association, and reflectional covariance measures clockwise–anticlockwise association. Both types of association are shown in Figure 2. The amplitude of the rotational covariance controls the width of the yellow high-probability band, and the phase of this complex number specifies the shift of this band, which is seen along the diagonal of the Cartesian plane; the band shifts in direction from bottom-right toward top-left (Figure 2a). The same notion applies to the reflectional covariance magnitude, but the band is positioned on the other diagonal (Figure 2b). The presence of both types of covariation is possible and can lead to concentrated marginals. However, in our experience with phases extracted from LFP neural data, we have almost always observed exclusively rotational dependence. In addition, the marginals are close to uniform. If $r_{1} = r_{2} = 0$ (e.g., if the $Θ_{i}$ are uniform), then

cov (X_{1}, X_{2}) = E {e^{ı (Θ_{1} - Θ_{2})}} and corr (X_{1}, X_{2}) = E {e^{ı (Θ_{1} - Θ_{2})}} .

The quantity $E {e^{ı (Θ_{1} - Θ_{2})}}$ is the theoretical counterpart of PLV. Thus, in the absence of reflectional covariation, if $r_{1} = r_{2} = 0$ , then $E {e^{ı (Θ_{1} - Θ_{2})}}$ is the analogue of Pearson correlation for angular random variables.

Figure 2: — The torus is the natural domain for a pair of circular random variables. Illustration of both types of circular covariance in a torus graph bivariate density, with uniform marginal densities, plotted side by side on a two-dimensional torus and a Cartesian plane. (a) Positive rotational dependence. (b) Negative reflectional dependence. The figure has been adapted from a figure in Klein et al. (2020).

3.2.3. Amplitude Correlation

For two amplitudes $R_{i}$ , $R_{j} \in (0, \infty)$ , the simplest way to characterize amplitude correlation is through the ordinary Pearson correlation of real-valued random variables $corr (R_{i}, R_{j})$ . Other approaches involve calculating the correlation between the log of the amplitudes or between the square of the amplitudes (Nolte et al., 2020).

3.3. Multivariate Models

3.3.1. The Complex Normal Distribution

A complex random vector $X \in ℂ^{d}$ is said to be complex normal (CN) if its real and imaginary parts are jointly multivariate normal (Andersen et al., 1995). Suppose that $(ℜ e X, ℑ m X) \sim 𝓝 (μ, Σ)$ , where

μ = (\begin{matrix} μ_{1} \\ μ_{2} \end{matrix}) and Σ = (\begin{matrix} Σ_{11} & Σ_{12} \\ Σ_{12}^{⊤} & Σ_{22} \end{matrix}),

so that $μ_{1} = E (ℜ e X)$ , $μ_{2} = E (ℑ m X)$ , $Σ_{11} = var (ℜ e X)$ , $Σ_{22} = var (ℑ m X)$ , and $Σ_{12} = cov (ℜ e X, ℑ m X)$ . Then

m = μ_{1} + ı μ_{2}, Γ = (Σ_{11} + Σ_{22}) + ı (Σ_{12}^{⊤} - Σ_{12}), and C = (Σ_{11} - Σ_{22}) + ı (Σ_{12} + Σ_{12}^{⊤}),

giving a bijection between ( $m$ , $Γ$ , $C$ ) and ( $μ$ , $Σ$ ). Thus, the complex normal distribution is well defined by the parameter set ( $m$ , $Γ$ , $C$ ). We denote the distribution by $X \sim 𝓒 𝓝 (m, Γ, C)$ . Recall that the distribution of $X$ is said to be proper if the pseudo-covariance matrix $C$ is zero. Further, $X$ is circularly symmetric, meaning component-wise circularly symmetric so that its pdf satisfies $p (X) = p (e^{ı α} X)$ for all $α \in ℝ$ , if and only if $X$ is proper and has mean zero (Adali, Schreier & Scharf, 2011).
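The scalar case $d = 1$ makes the bijection easy to check by hand. In the worked example below, the lowercase symbols $σ_{11}$ , $σ_{22}$ , and $σ_{12}$ are notation introduced here for the variances of the real and imaginary parts and their covariance; they do not appear elsewhere in the article.

```latex
% Scalar case d = 1: sigma_11 = var(Re X), sigma_22 = var(Im X), sigma_12 = cov(Re X, Im X).
\Gamma = \sigma_{11} + \sigma_{22},
\qquad
C = (\sigma_{11} - \sigma_{22}) + \imath \, 2 \sigma_{12},

% Inverting the map recovers the real-valued parameters.
\sigma_{11} = \tfrac{1}{2} (\Gamma + \Re C),
\qquad
\sigma_{22} = \tfrac{1}{2} (\Gamma - \Re C),
\qquad
\sigma_{12} = \tfrac{1}{2} \Im C .
```

In this case $C = 0$ holds exactly when $σ_{11} = σ_{22}$ and $σ_{12} = 0$ , so a scalar complex normal variable is proper precisely when its real and imaginary parts are uncorrelated with equal variances.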

Circularly symmetric, proper, and unrestricted CN distributions form full regular exponential families. This ensures that the maximum likelihood estimator, which involves first- and second-order moment statistics, is sufficient. In addition, these families form probabilistic graphical models so that conditional independence is easily characterized. We return to this in Theorem 8 and Corollary 9. We provide an abbreviated version of the exponential family result here as Theorem 1; for full details, see the Supplementary Material Section S1.

Theorem 1.

Each of the following families forms a full and regular exponential family: (i) the family of CN distributions represented by $𝓒 𝓝 (μ, Γ, C)$ ; (ii) the family of proper CN distributions $𝓒 𝓝 (μ, Γ, 0)$ ; and (iii) the family of circularly symmetric CN distributions $𝓒 𝓝 (0, Γ, 0)$ .

3.3.2. Torus Graphs

Multivariate models for angular random variables are most naturally defined on the torus rather than on Euclidean space. Such a model is given in Klein et al. (2020). Here, we briefly review its construction.

To define an exponential family on a torus with a given mean and covariance structure, first- and second-order sufficient statistics are needed. Using two-dimensional rectangular coordinates (involving cosines and sines), the first-order sufficient statistics are $U_{1} = (\cos Θ_{1}, \sin Θ_{1})$ and $U_{2} = (\cos Θ_{2}, \sin Θ_{2})$ . Second-order behaviour is summarized by

U_{1} U_{2}^{⊤} = (\begin{matrix} \cos Θ_{1} \cos Θ_{2} & \cos Θ_{1} \sin Θ_{2} \\ \sin Θ_{1} \cos Θ_{2} & \sin Θ_{1} \sin Θ_{2} \end{matrix}) .

We can then write a natural exponential family density for $θ = (θ_{1}, θ_{2})$ with a canonical parameter $η$ consisting of $η_{i} \in ℝ^{2}$ and $η_{i j} \in ℝ^{4}$ for $i$ , $j \in {1, 2}$ as

p (θ; η) \propto \exp {η_{1}^{⊤} (\begin{matrix} \cos θ_{1} \\ \sin θ_{1} \end{matrix}) + η_{2}^{⊤} (\begin{matrix} \cos θ_{2} \\ \sin θ_{2} \end{matrix}) + η_{12}^{⊤} (\begin{matrix} \cos θ_{1} \cos θ_{2} \\ \cos θ_{1} \sin θ_{2} \\ \sin θ_{1} \cos θ_{2} \\ \sin θ_{1} \sin θ_{2} \end{matrix})}, 0 \leq θ_{1}, θ_{2} \leq 2 π .

(1)

For $θ = (θ_{1}, \dots, θ_{d})$ , (1) extends to an exponential family on a $d$ -dimensional torus as

p (θ; η) \propto \exp {\sum_{j = 1}^{d} η_{j}^{⊤} (\begin{matrix} \cos θ_{j} \\ \sin θ_{j} \end{matrix}) + \sum_{i < j} η_{i j}^{⊤} (\begin{matrix} \cos θ_{i} \cos θ_{j} \\ \cos θ_{i} \sin θ_{j} \\ \sin θ_{i} \cos θ_{j} \\ \sin θ_{i} \sin θ_{j} \end{matrix})}, 0 \leq θ_{i} \leq 2 π .

(2)

Based on simple trigonometric identities, Klein et al. (2020) reparameterized this family into a more interpretable form

p (θ; η (ϕ)) \propto \exp {\sum_{i = 1}^{d} ϕ_{i}^{⊤} (\begin{matrix} \cos θ_{i} \\ \sin θ_{i} \end{matrix}) + \sum_{i < j} ϕ_{i j}^{⊤} (\begin{matrix} \cos (θ_{i} - θ_{j}) \\ \sin (θ_{i} - θ_{j}) \\ \cos (θ_{i} + θ_{j}) \\ \sin (θ_{i} + θ_{j}) \end{matrix})},

(3)

which uses the first- and second-order statistics seen in the definitions of complex covariance. Klein et al. (2020) then defined a $d$ -dimensional torus graph (TG) to be any member of the family of distributions specified by (2) or (3). If $Θ$ is distributed according to $p (θ; η)$ , then we say that $Θ \sim 𝓣 𝓖 (η)$ ; likewise, if $Θ$ is distributed according to $p (θ; η (ϕ))$ , then we say that $Θ \sim 𝓣 𝓖 (η (ϕ))$ .
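To make the parameterization in (3) concrete, the sketch below evaluates the exponent of the torus graph density, that is, the log-density up to the normalizing constant. The function name and the data layout (a list of 2-vectors for the $ϕ_{i}$ and a dictionary keyed by index pairs with $i < j$ for the $ϕ_{i j}$ ) are hypothetical choices made for illustration.

```python
import math


def torus_graph_log_kernel(theta, phi, phi_pair):
    """Exponent of (3): marginal terms plus pairwise terms over pairs i < j."""
    total = 0.0
    for i, t in enumerate(theta):
        total += phi[i][0] * math.cos(t) + phi[i][1] * math.sin(t)
    for (i, j), p in phi_pair.items():
        diff = theta[i] - theta[j]   # rotational terms use the phase difference
        summ = theta[i] + theta[j]   # reflectional terms use the phase sum
        total += (p[0] * math.cos(diff) + p[1] * math.sin(diff)
                  + p[2] * math.cos(summ) + p[3] * math.sin(summ))
    return total


# Two angles, no marginal concentration, purely rotational coupling of strength 1.5.
phi = [(0.0, 0.0), (0.0, 0.0)]
phi_pair = {(0, 1): (1.5, 0.0, 0.0, 0.0)}

aligned = torus_graph_log_kernel([0.7, 0.7], phi, phi_pair)
opposed = torus_graph_log_kernel([0.7, 0.7 + math.pi], phi, phi_pair)
```

With this choice of parameters the exponent is 1.5 when the two phases coincide and about -1.5 when they are half a cycle apart, so the density concentrates along the diagonal of the torus.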

Because torus graphs form exponential families, a pair of random variables $X_{i}$ and $X_{j}$ will be conditionally independent given all other variables if and only if the four elements in the pairwise interaction parameter $ϕ_{i j}$ are zero. Thus, torus graphs define probabilistic graphical models.

Torus graphs can uncover conditional dependence relationships, meaning pairwise dependence edges that are still present after conditioning on the rest of the random variables in the model. Conditional dependence is obscured with simple correlation-type measures such as PLV because for each edge, they consider only two random variables (see Figure 3, which displays a conditional independence graph from Klein et al. (2020)).

Figure 3: — Torus graphs can recover conditional dependence graphs when PLV fails in simulated data; see Klein et al. (2020) for details. The figure has been adapted from one in Klein et al. (2020).

There are three parameter groups in (3): marginal concentrations, rotational covariance, and reflectional covariance. Klein et al. (2020) showed that the submodel with only rotational covariance parameters (which corresponds to a proper distribution when the statistics $U_{1}$ and $U_{2}$ are considered to be complex variables) does a good job of fitting phase angles extracted from neural LFP data. In addition, they described how several alternative families of distributions can be seen as special cases of torus graphs. They then applied their torus graph model to characterize a network graph of interactions among recordings from four brain regions during a memory task.

Although torus graphs provide interesting exponential families that could have been defined long ago, they would have been irrelevant to data analysis before recent practical developments used by Klein et al. (2020) to estimate the parameters of the model.

3.3.3. Characterization of Torus Graphs Using the Complex Normal Distribution

We consider the polar coordinate representation of a random vector with a complex normal distribution and prove that the conditional distribution of the phases, given the amplitudes, forms a torus graph. We provide a general statement and then offer additional theorems in the proper and circularly symmetric cases.

Suppose that $R_{i} \in [0, \infty)$ and $Θ_{i} \in [- π, π)$ are the amplitude and phase of $X_{i}$ for $i \in [d]$ . In other words, ( $R_{i}$ , $Θ_{i}$ ) is the polar coordinate representation of $X_{i}$ . We consider the conditional distribution of the phases given the amplitudes. Under certain parameter restrictions on a complex normal distribution, this conditional distribution forms a torus graph.

Theorem 2.

Let $X \in ℂ^{d}$ be a complex normal random vector such that $(ℜ e X, ℑ m X) \sim 𝓝 (μ, Σ)$ . Let $Ω = Σ^{- 1}$ be the inverse covariance matrix. If $Ω_{i i} = Ω_{i + d, i + d}$ and $Ω_{i, i + d} = 0$ for $i \in [d]$ , then $Θ ∣ {R = r} \sim 𝓣 𝓖 (η)$ , where $η_{i} = r_{i} ({(Ω μ)}_{i}, {(Ω μ)}_{i + d})$ and $η_{i j} = - r_{i} r_{j} (Ω_{i, j}, Ω_{i, j + d}, Ω_{i + d, j}, Ω_{i + d, j + d}) / 2$ for $i$ , $j \in [d]$ .

In Theorem 2, the components of the conditioning vector $r$ appear as multipliers in the natural parameters but the graph is the same for every vector $r$ .

There exists another distribution in the literature, known as the mGvM distribution, which is a more general version of the TG distribution that includes all second-moment terms (Navarro, Frellsen & Turner, 2017). Navarro et al. showed that, given a $2 d$ -dimensional normal distribution analogous to the complex normal distribution, $Θ ∣ {R = 1}$ is an mGvM distribution without requiring any restrictions on the parameters of the original normal distribution. In the Supplementary Material Section S1, we generalize this result for any conditional distribution $Θ ∣ {R = r}$ and then leverage it to prove Theorem 2 (see the Supplementary Material Section S1.2).

We can obtain a more precise characterization of the conditional distribution under the parameter restrictions $C = 0$ and $m = 0$ .

Theorem 3.

If $X \sim 𝓒 𝓝 (m, Γ, 0)$ , so that the complex normal distribution of $X$ is proper, then $Θ ∣ {R = r} \sim 𝓣 𝓖 (η (ϕ))$ as in (3), where $ϕ_{i j, 3} = ϕ_{i j, 4} = 0$ and $ϕ_{i j, k}$ denotes the $k$ th component of $ϕ_{i j}$ .

Corollary 4.

If $X \sim 𝓒 𝓝 (0, Γ, 0)$ , so that the complex normal distribution of $X$ is circularly symmetric, then $Θ ∣ {R = r} \sim 𝓣 𝓖 (η (ϕ))$ with the same restrictions as in Theorem 3 and with the additional restriction that $ϕ_{i} = 0$ for all $i \in [d]$ .

These theorems say that when the complex normal distribution is proper, the resulting torus graph family has only rotational dependence. The absence of reflectional dependence is intuitive from the definition in Section 3.3.2 of reflectional covariance as pseudo-covariance. Under circular symmetry, because $ϕ_{i} = 0$ , the conditional distribution $Θ_{i} ∣ {R = r}$ is uniform over $[0, 2 π)$ and, additionally, $Θ_{i}$ is marginally uniform. We repeat that the combination of uniformity and only rotational dependence is a particularly important special case, partly because this is the case in which PLV becomes a circular analogue to Pearson correlation. From a theoretical perspective, according to an argument given by Picinbono (1994), stationary band-pass-filtered signals with sufficiently narrow bands are proper. In the context of the particular neural data application reported here, we provide empirical evidence for circularity in Figure S2, which shows that the sample covariance matrix has much larger entries than the sample pseudo-covariance matrix.

4. COHERENCY AND THE COMPLEX NORMAL DISTRIBUTION

Assume we have a $d$ -dimensional time series for which every dimension contains an oscillation at a given frequency. As discussed previously, these oscillations can be extracted by band-pass-filtering. Spectral dependence and coherency were analyzed using the complex normal distribution by Tugnait (2019a,b), who considered a more general setting in which the spectrum of the signals may be distributed across numerous frequencies. However, when band-pass-filtered signals all oscillate at the same frequency, we get simpler and more interpretable results, which we leverage using our latent variable model in Section 5.

The key to our approach is a well-known result that coherency is a form of complex correlation for stationary signals oscillating at a particular frequency (Ombao & Van Bellegem, 2008). For single-frequency signals, it further turns out that partial coherency, which is used to study conditional dependence among time series, is equivalent to partial correlation (Dahlhaus, 2000; Ombao & Pinto, 2022). For the complex normal distribution, we provide a systematic study of complex conditional correlation, complex partial correlation, and conditional independence. We also review the relationship between complex correlation and other measures of pairwise dependence.

4.1. Coherency and Pairwise Complex Correlation

Suppose we have stationary signals $X_{1} (t)$ and $X_{2} (t)$ on $t \in [0, 1]$ , which may be complex-valued. Assume they have auto-covariance functions $Σ_{i i} (t) = cov {X_{i} (t_{0} + t), X_{i} (t_{0})}$ for $i \in {1, 2}$ , and a cross-covariance function $Σ_{12} (t) = cov {X_{1} (t_{0} + t), X_{2} (t_{0})}$ . These definitions do not depend on $t_{0}$ because of stationarity. For $ω \in [- 0.5, 0.5]$ , the spectrum and cross-spectrum of the signals are

f_{i i} (ω) = \int_{0}^{1} Σ_{i i} (t) e^{- ι 2 π ω t} d t and f_{12} (ω) = \int_{0}^{1} Σ_{12} (t) e^{- ι 2 π ω t} d t,

and the coherency is

τ_{12} (ω) = \frac{f_{12} (ω)}{\sqrt{f_{11} (ω) f_{22} (ω)}} .

The coherency is complex-valued. Coherence is the magnitude of the coherency. The Cramér–Khintchine decomposition (see Ch. 3 of Brémaud, 2014) is

(\begin{matrix} X_{1} (t) \\ X_{2} (t) \end{matrix}) = \int_{- 0.5}^{0.5} \exp (ı 2 π ω t) d (\begin{matrix} Y_{1} (ω) \\ Y_{2} (ω) \end{matrix}),

where ${(Y_{1} (ω), Y_{2} (ω))}^{⊤}$ is a bivariate orthogonal-increment random process. In this case, the coherency between $X_{1}$ and $X_{2}$ at the frequency $ω_{0}$ is often considered as the complex-valued correlation coefficient between the infinitesimal increments of $Y_{1}$ and $Y_{2}$ at $ω_{0}$ , which is to say

τ_{12} (ω_{0}) = corr {d Y_{1} (ω_{0}), d Y_{2} (ω_{0})} .

A quick way to see that this characterization makes sense is to consider two time series oscillating at a single frequency $ω_{0}$ , i.e.,

X_{1} (t) = R_{1} \exp {ι (Θ_{1} + 2 π ω_{0} t)} and X_{2} (t) = R_{2} \exp {ι (Θ_{2} + 2 π ω_{0} t)},

where $R_{i} \in [0, \infty)$ and $Θ_{i} \in [- π, π)$ are random variables, which we think of as representing the trial-specific amplitude and phase of $X_{i}$ , respectively, for $i \in {1, 2}$ . Then the auto-covariance and cross-covariance kernels are $Σ_{i i} (t) = var (R_{i} e^{ι Θ_{i}}) e^{ı 2 π ω_{0} t}$ and $Σ_{12} (t) = cov (R_{1} e^{ι Θ_{1}}, R_{2} e^{ι Θ_{2}}) e^{ι 2 π ω_{0} t}$ . The spectrum and cross-spectrum at the frequency $ω_{0}$ are

f_{i i} (ω_{0}) = \int_{0}^{1} var (R_{i} e^{ι Θ_{i}}) e^{ι 2 π ω_{0} t} e^{- ι 2 π ω_{0} t} d t = var (R_{i} e^{ι Θ_{i}}), i \in {1, 2}

(4)

and

f_{12} (ω_{0}) = \int_{0}^{1} cov (R_{1} e^{ι Θ_{1}}, R_{2} e^{ι Θ_{2}}) e^{ι 2 π ω_{0} t} e^{- ι 2 π ω_{0} t} d t = cov (R_{1} e^{ι Θ_{1}}, R_{2} e^{ι Θ_{2}}),

(5)

respectively. In this case, the coherency is the complex-valued correlation between $X_{1} (t)$ and $X_{2} (t)$ for every $t$ :

τ_{12} (ω_{0}) = \frac{f_{12} (ω_{0})}{\sqrt{f_{11} (ω_{0}) f_{22} (ω_{0})}} = corr (R_{1} e^{ι Θ_{1}}, R_{2} e^{ι Θ_{2}}) = corr {X_{1} (t), X_{2} (t)} .

As a generalization of the single-frequency case, if a pair of signals are band-pass-filtered over a small frequency band, it is possible to define “band coherency” over a narrow band of frequencies. In this case, the band coherency of the filtered signals is equal to the complex correlation of the filtered signals (Ombao & Van Bellegem, 2008).
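The single-frequency argument above can be checked numerically. The following is a minimal sketch, assuming simulated trial-specific amplitudes and phases; the distributions, the frequency value, and the helper name `complex_corr` are illustrative choices and not taken from the paper. It uses the convention cov(x, y) = E[(x − Ex) conj(y − Ey)] and shows that the complex correlation between the two signals is the same at any time point, so that its magnitude can be read as the coherence at $ω_{0}$.

```python
import numpy as np

def complex_corr(x, y):
    """Sample complex correlation with cov(x, y) = E[(x - Ex) conj(y - Ey)]."""
    xc = x - x.mean()
    yc = y - y.mean()
    num = np.mean(xc * np.conj(yc))
    return num / np.sqrt(np.mean(np.abs(xc) ** 2) * np.mean(np.abs(yc) ** 2))

rng = np.random.default_rng(0)
n_trials, omega0 = 5000, 0.1  # hypothetical values chosen for illustration

# Trial-specific amplitudes and phases; phase 2 is phase 1 plus a noisy lag of 0.7 rad.
r1 = rng.gamma(4.0, 0.5, n_trials)
r2 = 0.5 * r1 + rng.gamma(4.0, 0.25, n_trials)
theta1 = rng.uniform(-np.pi, np.pi, n_trials)
theta2 = theta1 + 0.7 + 0.5 * rng.standard_normal(n_trials)

def signals(t):
    """Evaluate both single-frequency signals at time t, across all trials."""
    x1 = r1 * np.exp(1j * (theta1 + 2 * np.pi * omega0 * t))
    x2 = r2 * np.exp(1j * (theta2 + 2 * np.pi * omega0 * t))
    return x1, x2

coherency_t0 = complex_corr(*signals(0.0))
coherency_t1 = complex_corr(*signals(0.37))
coherence = abs(coherency_t0)  # coherence is the magnitude of the coherency
```

Because both signals are rotated by the same factor $e^{ι 2 π ω_{0} t}$, the rotation cancels in the correlation, which is why `coherency_t0` and `coherency_t1` agree.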

4.2. Partial Correlation, Conditional Correlation, and Conditional Independence in the Complex Normal Distribution

In the case of real-valued random variables, conditional correlation and partial correlation have been studied extensively (Baba, Shibata & Sibuya, 2004). For such real-valued random variables, conditional correlation and partial correlation are distinct quantities, and conditional dependence between random variables need not imply that either quantity is nonzero. However, real random variables that are jointly normal have the following special properties: (i) conditional correlation and partial correlation are equal, and (ii) zero conditional correlation (or partial correlation) is equivalent to conditional independence.

In the complex case, the relationship between these quantities has been partially studied in Tugnait (2019a,b) and Andersen et al. (1995). Andersen et al. (1995) define conditional covariance for the proper complex normal distribution and show how it is related to various entries in the complex precision matrix. They further relate conditional dependence to conditional covariation for the proper complex normal distribution. Tugnait (2019a,b) discusses how to assess conditional dependence for proper and improper complex normal distributions.

In this section, we define partial correlation for complex random vectors in a manner analogous to the real case. We describe the differences between partial correlation and conditional correlation for complex random variables, and we provide results relating partial correlation and conditional independence, even in the improper complex normal case.

In what follows, assume we have some complex-valued random vector $X \in ℂ^{d}$ with finite first and second moments. For any subset $𝓘 \subset [d]$ , let $X_{𝓘}$ denote the random vector given by the elements of $X$ whose indices are in $𝓘$ (in order). Let ${\tilde{X}}_{𝓘} \in ℂ^{2 | 𝓘 |}$ be the concatenation of $X_{𝓘}$ and its conjugate, i.e., ${\tilde{X}}_{𝓘} = (X_{𝓘}, {\bar{X}}_{𝓘})$ . Let $𝓛 ({\tilde{X}}_{𝓘}) = {α + β^{H} {\tilde{X}}_{𝓘} : α \in ℂ, β \in ℂ^{2 | 𝓘 |}}$ be the linear space spanned by ${\tilde{X}}_{𝓘}$ and, for every $i \in [d]$ , let ${proj}_{𝓛 ({\tilde{X}}_{𝓘})} (X_{i}) = {argmin}_{Y \in 𝓛 ({\tilde{X}}_{𝓘})} E {{(Y - X_{i})}^{H} (Y - X_{i})}$ be the projection of $X_{i}$ onto that space.

Definition 5.

For $i$ , $j \in [d]$ , the partial correlation between $X_{i}$ and $X_{j}$ is

ρ (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = corr {X_{i} - {proj}_{𝓛 ({\tilde{X}}_{[d] ∖ {i, j}})} (X_{i}), X_{j} - {proj}_{𝓛 ({\tilde{X}}_{[d] ∖ {i, j}})} (X_{j})}

and the partial pseudo-correlation between $X_{i}$ and $X_{j}$ is

\bar{ρ} (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = pcorr {X_{i} - {proj}_{𝓛 ({\tilde{X}}_{[d] ∖ {i, j}})} (X_{i}), X_{j} - {proj}_{𝓛 ({\tilde{X}}_{[d] ∖ {i, j}})} (X_{j})} .

The conditional correlation is

corr (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = \frac{cov (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}})}{\sqrt{var (X_{i} ∣ X_{[d] ∖ {i, j}}) var (X_{j} ∣ X_{[d] ∖ {i, j}})}} .

The conditional pseudo-correlation is defined analogously by replacing cov with pcov.

We start by stating a correspondence between partial correlation and entries in a transformed precision matrix. The results in this section are proven in the Supplementary Material Section S1.

Theorem 6.

For any complex-valued random vector $X$ , let $\tilde{Γ} = var (\tilde{X})$ . Denote by diag( $A$ ) the diagonal matrix, with diagonal entries the same as $A$ . Let

P = diag {({\tilde{Γ}}^{- 1})}^{- 1 / 2} {\tilde{Γ}}^{- 1} diag {({\tilde{Γ}}^{- 1})}^{- 1 / 2} .

Then $ρ (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = P_{i, j}$ and $\bar{ρ} (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = P_{i, j + d}$ .

Although conditional correlation and partial correlation may be different in general (Baba, Shibata & Sibuya, 2004), they coincide in the complex normal case.

Theorem 7.

If $X \sim 𝓒 𝓝 (m, Γ, C)$ , then, for $i$ , $j \in [d]$

ρ (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = corr (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}})

and

\bar{ρ} (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = pcorr (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) .

In the real-valued case, if the partial correlation is zero for jointly normal random variables, then the two variables are conditionally independent. Because in the complex case we have to consider both types of second-order association, we must consider both partial correlation and partial pseudo-correlation. For the following theorems, given random variables $A$ , $B$ , and $C$ , we write $A ⫫ B ∣ C$ if $A$ and $B$ are conditionally independent given $C$ .

Theorem 8.

If $X \sim 𝓒 𝓝 (m, Γ, C)$ , then, $ρ (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = 0$ and $\bar{ρ} (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = 0$ if and only if $X_{i} ⫫ X_{j} ∣ X_{[d] ∖ {i, j}}$ .

A corollary of this theorem allows us to drop the reliance on the pseudo-correlation in the proper complex normal case since such a correlation must be zero.

Corollary 9.

If $X \sim 𝓒 𝓝 (m, Γ, 0)$ , then $ρ (X_{i}, X_{j} ∣ X_{[d] ∖ {i, j}}) = 0$ if and only if $X_{i} ⫫ X_{j} ∣ X_{[d] ∖ {i, j}}$ .

By applying Theorems 6 and 8, we get the following corollary:

Corollary 10.

If $X \sim 𝓒 𝓝 (m, Γ, C)$ , then $X_{i} ⫫ X_{j} ∣ X_{[d] ∖ {i, j}}$ if and only if ${\tilde{Γ}}_{i j}^{- 1} = 0$ and ${\tilde{Γ}}_{i, j + d}^{- 1} = 0$ . In addition, if $C = 0$ , then $X_{i}$ , $X_{j}$ are conditionally independent if and only if $Γ_{i j}^{- 1} = 0$ for $Γ = var (X)$ .
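As an illustration of the proper case ( $C = 0$ ), the sketch below simulates a proper complex normal vector whose precision matrix is tridiagonal, computes the scaled precision matrix of Theorem 6 from the sample covariance, and compares it with the correlation of least-squares residuals. The helper names, dimension, and parameter values are assumptions made for this example. The comparison is made on magnitudes only, since sign and conjugation conventions for the scaled precision matrix differ across references.

```python
import numpy as np

def sample_proper_cn(cov, n, rng):
    """Draw n samples (rows) from the proper complex normal CN(0, cov, 0)."""
    d = cov.shape[0]
    chol = np.linalg.cholesky(cov)  # cov = chol @ chol^H
    w = (rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))) / np.sqrt(2)
    return w @ chol.T

def scaled_precision(cov):
    """diag(K)^{-1/2} K diag(K)^{-1/2} with K = cov^{-1}."""
    prec = np.linalg.inv(cov)
    s = 1.0 / np.sqrt(np.real(np.diag(prec)))
    return prec * np.outer(s, s)

def residual_corr(x, i, j):
    """Correlation of X_i and X_j after projecting out the remaining coordinates."""
    rest = [k for k in range(x.shape[1]) if k not in (i, j)]
    a = x[:, rest]
    coef = np.linalg.lstsq(a, x[:, [i, j]], rcond=None)[0]
    res = x[:, [i, j]] - a @ coef
    ri, rj = res[:, 0], res[:, 1]
    return np.mean(ri * np.conj(rj)) / np.sqrt(np.mean(np.abs(ri) ** 2) * np.mean(np.abs(rj) ** 2))

rng = np.random.default_rng(1)
d = 4
off = 0.6 * np.exp(0.5j)  # hypothetical off-diagonal precision entry
true_prec = 2.0 * np.eye(d, dtype=complex)
for k in range(d - 1):
    true_prec[k, k + 1] = off
    true_prec[k + 1, k] = np.conj(off)
true_cov = np.linalg.inv(true_prec)
true_cov = (true_cov + true_cov.conj().T) / 2  # remove round-off asymmetry

x = sample_proper_cn(true_cov, 20000, rng)
sample_cov = x.T @ x.conj() / x.shape[0]  # entry (i, j) estimates E[X_i conj(X_j)]
p_hat = scaled_precision(sample_cov)
```

In this simulation coordinates 0 and 2 are conditionally independent given the others, so `abs(p_hat[0, 2])` should be close to zero, while `abs(p_hat[0, 1])` should be close to 0.3 and should match `abs(residual_corr(x, 0, 1))`.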

4.3. Partial Coherency and Partial Correlation

We now leverage the results from Section 4.2 to show a correspondence between partial coherency, a modification of coherency to account for conditional dependence, and partial correlation. Partial coherency was developed to better differentiate unique pairwise associations from associations that are shared with one or more other observed variables (Ombao & Pinto, 2022). Assume we have $d$ signals $X_{1} (t), \dots, X_{d} (t)$ on $t \in [0, 1]$ . Let $Σ_{i j} (t) = cov {X_{i} (t + t_{0}), X_{j} (t_{0})}$ , and define $f_{i j} (ω)$ as in Section 4.1. Consider the matrix $f (ω)$ formed by assembling the entries of $f_{i j} (ω)$ . Let $diag {f (ω)}$ be the matrix containing the diagonal elements of $f (ω)$ and zero on the off-diagonals. Let

g (ω) = diag {[{f (ω)}^{- 1}]}^{- 1 / 2} {f (ω)}^{- 1} diag {[{f (ω)}^{- 1}]}^{- 1 / 2} .

We call $g {(ω)}_{i j}$ the partial coherency between signals $i$ and $j$ . If $X_{1} (t), \dots, X_{d} (t)$ are all oscillating signals at some frequency $ω_{0}$ , we obtain the following theorem.

Theorem 11.

If $X_{k} (t) = R_{k} \exp {ι (Θ_{k} + 2 π ω_{0} t)}$ for all $k \in [d]$ and $pcorr {X_{i} (0), X_{j} (0)} = 0$ , then

g {(ω_{0})}_{i j} = ρ {X_{i} (0), X_{j} (0) ∣ X_{[d] ∖ {i, j}} (0)} .

This theorem follows from observing that, by our results in Section 4.1, $Σ_{i j} (0) = f {(ω_{0})}_{i j}$ for such signals. We can then apply Theorem 6 and, observing that the upper-right block of $\tilde{Γ} = cov (\tilde{X})$ is zero because $pcorr {X_{i} (0), X_{j} (0)} = 0$ , we obtain the desired result.

Researchers have previously studied partial coherency as a way to construct conditional independence graphs for time series (Dahlhaus, 2000). For instance, Tugnait (2019a,b) estimates conditional independence between time series by first constructing a sufficient statistic that accounts for the oscillations in a time series at all observable frequencies and then, after observing that the limiting distribution of this statistic is a complex normal random variable, deriving conditional independence tests based on this limiting distribution. As we have seen, because we are dealing with signals oscillating at a single frequency, we can estimate partial coherency simply by computing the partial correlation between the observed variables. We will use this to interpret our latent-variable model. The Supplementary Material Section S2 details various simple examples in which PLV, amplitude correlation, and coherence are compared.

4.4. Alternative Measures of Pairwise Association Between Oscillating Signals

Thus far, we have focused on coherence as a measure of association for oscillating signals. As described in Section 3, other measures exist for studying the dependence between oscillating signals. In particular, we described PLV, which many investigators use as an alternative to coherence. Since PLV was first introduced, questions have arisen over whether to use PLV, coherence, or amplitude correlation to measure connectivity between oscillatory neural signals (e.g., Lachaux et al., 1999; Srinath & Ray, 2014; Lowet et al., 2016; Lepage & Vijayan, 2017). Our intention here is to review briefly some of these arguments to understand how coherence is situated among these other possible measures of association.

Clearly, coherence depends on both the phase and amplitude of oscillatory signals. Dependence on amplitude, however, has invited criticism of coherence as a pure representation of the degree of synchrony among the phases of oscillatory signals. PLV was introduced in part to overcome the perceived limitations of this dependence (Lachaux et al., 1999). Later work showed that, particularly in nonstationary settings, estimators of coherence are not well behaved and can fail to accurately represent synchrony (Lowet et al., 2016). Other authors have criticized coherence on the grounds that it can be biased by amplitude correlation (Srinath & Ray, 2014). However, some investigators have responded to these criticisms by pointing out that coherence may up-weight trials with larger amplitude oscillations, where information about phase is likely to be stronger (Lepage & Vijayan, 2017). Thus, the inclusion of amplitude information may be beneficial.

Further, some researchers have analyzed the relationship between PLV, amplitude correlation, and coherence when particular parametric models are assumed for data. For instance, Aydore, Pantazis & Leahy (2013) found that when data are distributed according to the complex normal distribution, PLV can be written as a function of coherence, which indicates that coherence provides all the information that PLV does. Further, Aydore, Pantazis & Leahy (2013) found that both the von Mises distribution and Gaussian models are effective for studying phase associations in LFP data, and argued that PLV and cross-correlation capture the same information. More generally, Nolte et al. (2020) made the remark that, because cross-correlation provides all possible information about the second-order statistical properties of circularly symmetric, bivariate, complex normal random variables, any pairwise measure of association can be written as a function of cross-correlation (and thus coherency). Nolte et al. found that PLV and related phase-based coupling statistics, as well as amplitude–amplitude correlational statistics, can be written as functions of coherence for complex normal random variables.

Thus, if the data are assumed to be proper and complex-normal, then complex correlation, and thus coherence, provides the most detailed measure of association available. For the oscillating neural signals that are the focus of this work, the complex normal model appears to be a reasonable approximation of the data, and so we focus on coherence as our measure of pairwise association since coherence may provide much of the information contained within other measures of association.
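For concreteness, the three measures discussed in this section can be computed side by side on simulated data. The following is a minimal sketch, assuming a zero-mean, proper, bivariate complex normal pair with a hypothetical complex correlation of $0.6 e^{0.4 ι}$ ; the function names are illustrative and not part of the paper's code.

```python
import numpy as np

def plv(x1, x2):
    """Phase-locking value: magnitude of the mean unit phasor of the phase difference."""
    return np.abs(np.mean(np.exp(1j * (np.angle(x1) - np.angle(x2)))))

def coherence(x1, x2):
    """Magnitude of the complex correlation (zero-mean signals assumed)."""
    num = np.mean(x1 * np.conj(x2))
    return np.abs(num) / np.sqrt(np.mean(np.abs(x1) ** 2) * np.mean(np.abs(x2) ** 2))

def amplitude_corr(x1, x2):
    """Pearson correlation between the two amplitude series."""
    return np.corrcoef(np.abs(x1), np.abs(x2))[0, 1]

rng = np.random.default_rng(2)
rho = 0.6 * np.exp(0.4j)  # hypothetical complex correlation
cov = np.array([[1.0, rho], [np.conj(rho), 1.0]])
chol = np.linalg.cholesky(cov)
w = (rng.standard_normal((20000, 2)) + 1j * rng.standard_normal((20000, 2))) / np.sqrt(2)
x = w @ chol.T  # rows are draws from CN(0, cov, 0)

measures = {
    "coherence": coherence(x[:, 0], x[:, 1]),
    "plv": plv(x[:, 0], x[:, 1]),
    "amplitude_corr": amplitude_corr(x[:, 0], x[:, 1]),
}
```

Under this model all three quantities are determined by the single complex correlation `rho`, which is the sense in which coherence carries the information contained in the other two.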

5. A COMPLEX-NORMAL, LATENT-VARIABLE MODEL

5.1. Model

We assume that we have recordings from $K$ brain regions, and that the activity of each brain region is recorded by $d_{k}$ electrodes $(1 \leq k \leq K)$ over $N$ repeated experimental trials. We assume that each electrode records an oscillating signal at some frequency, and we band-pass-filter the data to extract this oscillating signal as a preprocessing step to obtain a complex number that represents the phase and amplitude of the oscillation. For each trial $n \in [N]$ , the observation $X_{k}^{n}$ from brain region $k$ is a $d_{k}$ -dimensional complex-valued variable. We further assume that each $X_{k}^{n}$ is driven by a univariate latent factor $Z_{k}^{n}$ , which is also complex-valued, giving the model

X_{k}^{n} = β_{k} Z_{k}^{n} + ϵ_{k}^{n}, ϵ_{k}^{n} \sim 𝓒 𝓝 (0, Φ_{k}, 0),

(6)

Z^{n} = (Z_{1}^{n}, \dots, Z_{K}^{n}) \sim 𝓒 𝓝 (0, Γ, 0),

(7)

where $β_{k} \in ℂ^{d_{k}}$ is a complex-valued factor loading, $Γ \in ℂ^{K \times K}$ is the latent covariance matrix, which is assumed to be positive definite, and $ϵ_{k}^{n} \in ℂ^{d_{k}}$ is a region-specific noise with the covariance parameter $Φ_{k} \in ℂ^{d_{k} \times d_{k}}$ , which also is assumed to be positive definite.

Using basic properties of the complex normal distribution, the marginal distribution of the vector $X^{n} = (X_{1}^{n}, \dots, X_{K}^{n})$ of all the observed signals is

X^{n} \sim 𝓒 𝓝 (0, (\begin{matrix} β_{1} β_{1}^{H} Γ_{11} + Φ_{1} & β_{1} β_{2}^{H} Γ_{12} & β_{1} β_{3}^{H} Γ_{13} & \dots \\ β_{2} β_{1}^{H} Γ_{21} & β_{2} β_{2}^{H} Γ_{22} + Φ_{2} & β_{2} β_{3}^{H} Γ_{23} & \dots \\ β_{3} β_{1}^{H} Γ_{31} & β_{3} β_{2}^{H} Γ_{32} & β_{3} β_{3}^{H} Γ_{33} + Φ_{3} & \dots \\ \dots & \dots & \dots & \dots \end{matrix}), 0) .

(8)

This representation shows that the model is nonidentifiable without additional constraints. In particular, $β_{k}$ can be scaled by an arbitrary nonzero complex number, and the corresponding entries in $Γ$ scaled by the inverse conjugate of that number, with no change to the marginal distribution. Additionally, given an arbitrary real value $a \in ℝ$ , we have that $β_{k} β_{k}^{H} (Γ_{k k} + a) + (Φ_{k} - a β_{k} β_{k}^{H}) = β_{k} β_{k}^{H} Γ_{k k} + Φ_{k}$ , which indicates that, as long as $Φ_{k} - a β_{k} β_{k}^{H}$ remains positive semidefinite (PSD), we can add $a$ to $Γ_{k k}$ and subtract $a β_{k} β_{k}^{H}$ from $Φ_{k}$ and retain the same marginal distribution.

Thus, to ensure the identifiability of the parameters, we require that

1. $β_{k}^{H} β_{k} = 1$ and $ℑ m β_{k, 1} = 0$ , where $β_{k, 1}$ is the first component of $β_{k}$ , and
2. $\sup {a \geq 0 : Φ_{k} - a β_{k} β_{k}^{H} \geq 0} = 0$ .
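The scaling ambiguity and the first constraint can be illustrated numerically. The sketch below is not the paper's implementation: the function names `marginal_cov` and `normalize`, the region dimensions, and the random parameter values are all hypothetical. It assembles the block covariance in (8), rescales each loading so that $β_{k}^{H} β_{k} = 1$ and $ℑ m β_{k, 1} = 0$ , compensates in $Γ$ , and leaves the marginal covariance unchanged. The second constraint, on $Φ_{k}$ , is not addressed here.

```python
import numpy as np

def marginal_cov(betas, gamma, phis):
    """Assemble the covariance in (8): blocks beta_k beta_l^H Gamma_kl, plus Phi_k when k = l."""
    dims = [len(b) for b in betas]
    offs = np.concatenate([[0], np.cumsum(dims)])
    cov = np.zeros((offs[-1], offs[-1]), dtype=complex)
    for k, bk in enumerate(betas):
        for l, bl in enumerate(betas):
            block = np.outer(bk, np.conj(bl)) * gamma[k, l]
            if k == l:
                block = block + phis[k]
            cov[offs[k]:offs[k + 1], offs[l]:offs[l + 1]] = block
    return cov

def normalize(betas, gamma):
    """Rescale so that beta_k has unit norm and a real first entry; compensate in Gamma."""
    c = np.array([np.exp(-1j * np.angle(b[0])) / np.linalg.norm(b) for b in betas])
    new_betas = [ck * b for ck, b in zip(c, betas)]
    # Entry (k, l) is divided by c_k * conj(c_l), so beta_k beta_l^H Gamma_kl is preserved.
    new_gamma = gamma / np.outer(c, np.conj(c))
    return new_betas, new_gamma

rng = np.random.default_rng(4)
dims = [2, 3, 2]  # hypothetical numbers of electrodes per region
betas = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for d in dims]
a = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
gamma = a @ a.conj().T + np.eye(3)  # Hermitian positive definite
phis = [np.eye(d, dtype=complex) for d in dims]

new_betas, new_gamma = normalize(betas, gamma)
cov_before = marginal_cov(betas, gamma, phis)
cov_after = marginal_cov(new_betas, new_gamma, phis)
```

Here `cov_before` and `cov_after` coincide, so the two parameter sets cannot be distinguished from data; the constraint simply picks one representative.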

To better understand the model, we show how it can produce maximum likelihood estimates that are related to the solutions for a complex extension of canonical correlation analysis (CCA; Hotelling, 1936). Bong et al. (2023) found that a real-valued, latent-variable model, which bears some similarity to the complex-valued, latent-variable model described here, could be viewed as a form of probabilistic CCA, first described in Bach & Jordan (2005). Because CCA has a distribution-free definition, the correspondence we show below with an analogue of CCA for complex-valued random vectors provides a similar distribution-free motivation for the estimation procedure we derive here.

If we observe a pair of real random vectors $Y_{1} \in ℝ^{d_{1}}$ and $Y_{2} \in ℝ^{d_{2}}$ , CCA finds weights $w_{1}$ , $w_{2}$ (where $w_{1} \in ℝ^{d_{1}}$ , $w_{2} \in ℝ^{d_{2}}$ ) that maximize $| corr (w_{1}^{⊤} Y_{1}, w_{2}^{⊤} Y_{2}) |$ . Kettenring (1971) extended CCA to multiple vector observations. For $(Y_{1} \in ℝ^{d_{1}}, \dots, Y_{K} \in ℝ^{d_{K}})$ , this so-called “multiset CCA” finds weights $w_{1} \in ℝ^{d_{1}}, \dots, w_{K} \in ℝ^{d_{K}}$ maximizing a notion of the cross-correlation among $(w_{1}^{⊤} Y_{1}, \dots, w_{K}^{⊤} Y_{K})$ . Multiset CCA easily extends to complex-valued random vectors, $X_{1} \in ℂ^{d_{1}} \dots, X_{K} \in ℂ^{d_{K}}$ . In particular, we can write the optimization problem solved by complex multiset CCA as

\arg \min_{w_{k} \in ℂ^{d_{k}} : k \in [K], var (w_{k}^{H} X_{k}) = 1} \det {var (w_{1}^{H} X_{1}, \dots, w_{K}^{H} X_{K})} .

(9)

The solution weights ${w_{k, c c} : k \in [K]}$ that achieve the optimum of (9) are called canonical weights, and the resulting correlation matrix $P_{c c} = corr (w_{1, c c}^{H} X_{1}, \dots, w_{K, c c}^{H} X_{K})$ is called a canonical correlation matrix. The sample estimates ${\hat{w}}_{k, c c}$ and ${\hat{P}}_{c c}$ are obtained by replacing var in (9) and corr in the definition of $P_{c c}$ with the sample versions $\hat{var}$ and $\hat{corr}$ based on the observed signals $(X_{1}^{n}, \dots, X_{K}^{n})$ during experimental trials $n \in [N]$ . We now state a theorem describing the equivalence between the canonical weights, the canonical correlation matrix, and the maximum likelihood estimator of the parameters in the model given in (6) and (7) under the given identifiability constraints. A proof is given in the Appendix.

Theorem 12.

Suppose that ${\hat{β}}_{k}$ , $k \in [K]$ , and $\hat{Γ}$ are the maximum likelihood estimators of the parameters in (6) and (7) under the given identifiability constraints, labelled as 1 and 2 following (8), based on $N$ observed tuples $(X_{1}^{n}, \dots, X_{K}^{n})$ . We have the following equivalence between the maximum likelihood estimators, the canonical weights, and the canonical correlation matrix:

{\hat{β}}_{k} = \frac{\hat{var} (X_{k}) {\hat{w}}_{k, c c}}{{‖ \hat{var} (X_{k}) {\hat{w}}_{k, c c} ‖}_{2}} and diag ({\hat{Γ}}_{k k}^{- 1 / 2}) \hat{Γ} diag ({\hat{Γ}}_{k k}^{- 1 / 2}) = {\hat{P}}_{c c}

for $k \in [K]$ , where $diag ({\hat{Γ}}_{k k}^{- 1 / 2})$ is the diagonal matrix with entries ${\hat{Γ}}_{k k}^{- 1 / 2}$ .

As a corollary of the theorem, if $K = 2$ , then our model is equivalent to the pairwise CCA for two complex random vectors $X_{1} \in ℂ^{d_{1}}$ and $X_{2} \in ℂ^{d_{2}}$ . To demonstrate this equivalence, let us write

w_{1, c c}, w_{2, c c} = \arg \max_{w_{1} \in ℂ^{d_{1}}, w_{2} \in ℂ^{d_{2}}} | corr (w_{1}^{H} X_{1}, w_{2}^{H} X_{2}) |,

let $ρ_{c c}$ be the resulting canonical correlation coefficient, and let ${\hat{w}}_{1, c c}$ , ${\hat{w}}_{2, c c}$ , and ${\hat{ρ}}_{c c}$ be the corresponding sample estimators. We can then state the following corollary.

Corollary 13.

If ${\hat{β}}_{1}$ , ${\hat{β}}_{2}$ , and $\hat{Γ}$ are the maximum likelihood estimators of (6) and (7) under the given identifiability constraints, then they are related to the weights and canonical correlation values through

{\hat{β}}_{1} = \frac{\hat{var} (X_{1}) {\hat{w}}_{1, c c}}{{‖ \hat{var} (X_{1}) {\hat{w}}_{1, c c} ‖}_{2}}, {\hat{β}}_{2} = \frac{\hat{var} (X_{2}) {\hat{w}}_{2, c c}}{{‖ \hat{var} (X_{2}) {\hat{w}}_{2, c c} ‖}_{2}}, and \frac{{\hat{Γ}}_{12}}{\sqrt{{\hat{Γ}}_{11} {\hat{Γ}}_{22}}} = {\hat{ρ}}_{c c} .
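For the two-region case, the sample canonical weights and canonical correlation can be computed with a whitening step followed by a singular value decomposition. This is a standard route to pairwise CCA, sketched here for zero-mean complex data; it is not taken from the paper's code, and the function names, simulated loadings, and noise level are assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(m):
    """Inverse square root of a Hermitian positive-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

def complex_cca(x1, x2):
    """First canonical pair for zero-mean complex data; rows are trials."""
    n = x1.shape[0]
    s11 = x1.T @ x1.conj() / n  # estimates E[X1 X1^H]
    s22 = x2.T @ x2.conj() / n
    s12 = x1.T @ x2.conj() / n  # estimates E[X1 X2^H]
    a1, a2 = inv_sqrt(s11), inv_sqrt(s22)
    u, sv, vh = np.linalg.svd(a1 @ s12 @ a2)
    w1 = a1 @ u[:, 0]
    w2 = a2 @ vh[0].conj()
    return w1, w2, sv[0]

rng = np.random.default_rng(3)
n = 2000
gamma = np.array([[1.0, 0.7 * np.exp(0.3j)], [0.7 * np.exp(-0.3j), 1.0]])  # hypothetical
w = (rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2))) / np.sqrt(2)
z = w @ np.linalg.cholesky(gamma).T  # latent factors, rows are trials

beta1 = np.array([0.8, 0.5 + 0.3j, -0.2j])  # hypothetical loadings
beta2 = np.array([0.6, 0.8j])

def noise(d):
    return 0.5 * (rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))) / np.sqrt(2)

x1 = z[:, [0]] * beta1 + noise(3)
x2 = z[:, [1]] * beta2 + noise(2)

w1, w2, rho_hat = complex_cca(x1, x2)
p1, p2 = x1 @ w1.conj(), x2 @ w2.conj()  # projections w^H X for every trial
```

The projections `p1` and `p2` have unit sample variance, and the magnitude of their sample correlation equals `rho_hat`, the sample canonical correlation.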

In the general case with more than two latent factors, fitting the marginal likelihood directly is an intractable optimization problem. Instead, we fit the model using the expectation maximization (EM) algorithm; we describe our fitting procedure in the Appendix. In Figure S1, we provide evidence that, for simulated datasets similar to the real dataset we analyze in Section 6, the estimates provided by the EM fitting procedure converge to the true parameters as the size of the dataset increases. Code for performing inference and analysis with our model is available at https://github.com/urbkn7/latent_cn.

In applying Theorem 12 to the data in Section 6, we first band-pass-filter the data so that the magnitude of the complex correlation becomes the coherence (as discussed in Section 4.1). We also decompose the covariance matrix to get partial coherence (Section 4.3).

5.2. Inference

We study the strength of the associations among the estimated latent factors in our model. To do so, we develop a parametric bootstrap procedure to test for conditional dependence between brain areas. Recall from Section 4 that for vectors distributed according to the complex normal distribution, zero entries in the latent precision matrix correspond to conditional independence relationships between the associated dimensions of the random vector. Therefore, for two regions with indices $r_{1}$ , $r_{2} \in [K]$ , we want to test the null hypothesis $𝓗_{0} : Γ_{r_{1}, r_{2}}^{- 1} = 0$ against the alternative hypothesis $𝓗_{a} : Γ_{r_{1}, r_{2}}^{- 1} \neq 0$ .

To employ the parametric bootstrap, we first estimate the parameters of the model on the original dataset. Let $Θ = {{β_{k} : k \in [K]}, {Φ_{k} : k \in [K]}, Γ}$ , and let $𝓛 (Θ; 𝓓)$ denote the likelihood of the parameters under the model given by (8) for the dataset $𝓓 = {X^{n} : n \in [N]}$ . We first estimate $Θ$ with the EM algorithm to obtain $\hat{Θ} (𝓓)$ , our approximation of the maximum likelihood estimator for the dataset $𝓓$ . Then, we optimize the likelihood with a modified EM algorithm under the additional constraint that $Γ_{r_{1}, r_{2}}^{- 1} = 0$ (see the Appendix for details on how we solve this optimization problem) to obtain ${\hat{Θ}}_{0} (𝓓)$ . We then simulate datasets $𝓓^{(b)} \sim P_{{\hat{Θ}}_{0}}$ , where $1 \leq b \leq B$ and $B$ is the total number of bootstrap datasets. Let

λ (𝓓) = \frac{𝓛 {{\hat{Θ}}_{0} (𝓓); 𝓓}}{𝓛 {\hat{Θ} (𝓓); 𝓓}} .

We can observe the quantile of the likelihood ratio statistic for the original data, $λ (𝓓)$ , relative to the set of likelihood ratio statistics for the bootstrapped datasets ${λ (𝓓^{(b)}) : b \in [B]}$ to obtain a $p$ -value. We can also obtain confidence intervals for parameters of interest by employing the ordinary nonparametric bootstrap. The parameters we are most interested in are contained in the latent correlation and conditional correlation matrices. We have the latent correlation matrix $S = diag {(Γ)}^{- 1 / 2} Γ diag {(Γ)}^{- 1 / 2}$ and the conditional correlation matrix $D = diag {(Γ^{- 1})}^{- 1 / 2} Γ^{- 1} diag {(Γ^{- 1})}^{- 1 / 2}$ .
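The final steps of this procedure are simple to express in code. The sketch below computes $S$ and $D$ from a latent covariance matrix and converts a set of bootstrap likelihood ratios into a $p$ -value. It assumes that the statistics $λ (𝓓)$ and $λ (𝓓^{(b)})$ have already been obtained from the unconstrained and constrained fits, which are not reproduced here. The add-one correction in `bootstrap_p_value` is a common convention adopted for this example and is not stated in the text; all function names and numerical values are hypothetical.

```python
import numpy as np

def scaled(m):
    """diag(m)^{-1/2} m diag(m)^{-1/2} for a Hermitian positive-definite matrix m."""
    s = 1.0 / np.sqrt(np.real(np.diag(m)))
    return m * np.outer(s, s)

def latent_corr(gamma):
    """Latent correlation matrix S."""
    return scaled(gamma)

def latent_partial_corr(gamma):
    """Conditional (partial) correlation matrix D, built from the precision matrix."""
    return scaled(np.linalg.inv(gamma))

def bootstrap_p_value(lam_obs, lam_boot):
    """Share of bootstrap likelihood ratios that are at most the observed ratio.

    Since lambda = L(constrained) / L(unconstrained) <= 1, small values count
    as evidence against the null. The +1 terms are an assumed convention.
    """
    lam_boot = np.asarray(lam_boot)
    return (1 + np.sum(lam_boot <= lam_obs)) / (1 + lam_boot.size)

# Hypothetical three-region example in which regions 0 and 2 are conditionally independent.
prec = np.array([[2.0, 0.6j, 0.0], [-0.6j, 2.0, 0.5], [0.0, 0.5, 2.0]])
gamma = np.linalg.inv(prec)
S = latent_corr(gamma)
D = latent_partial_corr(gamma)
p_small = bootstrap_p_value(0.1, [0.5, 0.6, 0.9])
p_large = bootstrap_p_value(0.95, [0.5, 0.6, 0.9])
```

In this toy example `abs(S[0, 2])` is nonzero even though `D[0, 2]` vanishes, which mirrors the contrast between marginal and partial correlations that the test is designed to detect.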

6. DATA ANALYSIS

We apply the techniques we have discussed so far to a dataset of LFP data from the Allen Institute (Siegle et al., 2021). In the experiment, six electrode probes were simultaneously inserted into the mouse brain, with each probe targeting some area of the visual cortex but also recording other regions. During the experiment, the mice were presented with a variety of visual stimuli; here, we focus on presentations of drifting grating stimuli, which appear as bars moving across a screen in the mouse’s field of view. In the interest of utilizing as much data as possible, we ignore differences in the direction and size of the stimuli presented. This gives us 630 trials.

The experimenters marked every electrode with the anatomical brain area in which it resided during the experiment. We observed an oscillation at 6.5 Hz in many of these areas. Therefore, we band-pass-filtered the LFP signal and then used the Hilbert transform to recover the analytic signal. We selected a single time point, 2 s into the trial, which occurs well after the stimulus disappears from the screen. Selecting this time point removes the influence of trial-locked changes in brain activity. In addition, we used five electrodes from each of the six regions for which we have data to form the set of time series we analyze.
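The preprocessing step can be sketched as follows. This example replaces the band-pass filter and Hilbert transform with a single ideal (brick-wall) mask in the Fourier domain, which is a simplification chosen so that the example needs only NumPy; it is not the filter used in the analysis. The sampling rate, band edges, and synthetic test signal are all hypothetical.

```python
import numpy as np

def narrowband_analytic(x, fs, f_lo, f_hi):
    """Band-limit x to [f_lo, f_hi] Hz and return the complex analytic signal.

    Keeps only the positive frequencies inside the band and doubles them,
    which combines band-pass-filtering and the Hilbert-transform step.
    """
    n = x.shape[-1]
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spec = np.fft.fft(x, axis=-1)
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    return np.fft.ifft(np.where(keep, 2.0 * spec, 0.0), axis=-1)

fs = 1000.0                      # hypothetical sampling rate in Hz
t = np.arange(4000) / fs         # a 4 s trial
# Synthetic "LFP": a 6.5 Hz oscillation (amplitude 1.5, phase 0.8) plus a 20 Hz component.
x = 1.5 * np.cos(2 * np.pi * 6.5 * t + 0.8) + 0.7 * np.cos(2 * np.pi * 20.0 * t)

z = narrowband_analytic(x, fs, 5.5, 7.5)  # hypothetical band around 6.5 Hz
idx = int(2.0 * fs)                       # the sample 2 s into the trial
amplitude, phase = np.abs(z[idx]), np.angle(z[idx])
```

The single complex number `z[idx]` carries the amplitude and phase of the oscillation at the chosen time point, and one such number per electrode and trial forms the observation vector used by the model.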

We then applied the latent variable model to this dataset. For all analyses, we used one latent variable per brain area analyzed. Each brain area has multiple electrodes embedded, which record the LFP signals; we select five electrodes from each region, and the signals from these electrodes are the observed variables in our model. In Figure 4, we show data from a single probe passing through four visual regions (AM, PM, V1, and LM) and two hippocampal regions (CA1 and DG). We display the empirical correlation matrices of the observed data as well as the latent correlation matrices and partial correlation matrices estimated using our model. These matrices of latent correlation and partial correlations are of interest since, as discussed in Section 4, in our setting they are equivalent to the coherency and partial coherency between regions. While the estimated latent marginal correlations are relatively large between many areas, the estimated partial correlations (the $| {\hat{D}}_{i j} |$ values) are strongest between two pairs of areas: between AM and PM as well as between CA1 and DG.

Figure 4: An example of an application of the latent variable model to LFP data from the Allen Institute. The entries in the correlation matrices are arranged consecutively according to their vertical position in the inserted probe. The sample correlation for the real part of the data and the sample correlation between the real and imaginary parts of the data are denoted by $corr (ℜ X)$ and $corr (ℜ X, ℑ X)$ . The estimated latent correlation matrix is denoted by $\hat{S}$ , and the estimated latent partial correlation matrix is denoted by $\hat{D}$ . The notations $| \hat{S} |$ and $| \hat{D} |$ denote the absolute value of each entry in the corresponding matrices. Electrodes are labelled by the anatomical region in which they resided during the experiment. We observe that while the estimated latent correlation matrix $\hat{S}$ has large values between many regions, the estimated latent partial correlation primarily has large values between two pairs of regions: between DG and CA1 as well as between AM and PM.

We perform the parametric bootstrap test discussed in Section 5.2 to study the significance of the entries in the latent partial correlation matrix. In Figure 5, we plot the results of this analysis. We observe significant partial correlations between the majority of pairs of areas at 6.5 Hz.

Figure 5: Results of parametric bootstrap test performed on all pairs of regions to test if a significant conditional correlation exists between each pair of areas. Numbers inside the boxes are likelihood ratio statistics testing the hypothesis that $| D_{i j} | > 0$ . The colour of each box represents the results of significance tests for three values of $α$ , the false-positive rate, where each significance test is done with a Bonferroni correction for multiple comparisons. The level $α = 0.001$ is the smallest we are able to test given our simulation settings. The likelihood ratio values indicate that some pairs would be significant at much smaller values of $α$ . There are significant partial correlations between most pairs of areas at $α = 0.001$ , except between DG and AM as well as between V1 and CA1.

To further understand how strong the correlations between these areas are, we report bootstrap confidence intervals for the significantly nonzero parameters of the correlation and partial correlation matrices in Figure 6. For nearly all pairs of regions, the partial correlations are of modest size and are substantially smaller than the corresponding marginal correlations. However, the pairs PM–AM and DG–CA1 have large partial correlations close to 1. Also of interest are the patterns of association for the LM area. Note that the correlation between LM and CA1 is larger than that between LM and any other visual region, suggesting that LM has stronger oscillatory associations at 6.5 Hz with hippocampal regions than with other visual regions, despite LM itself being a visual region.

Figure 6: Point estimates and confidence intervals of correlations (blue) and partial correlations (orange) for all pairs of regions with significantly nonzero latent partial correlations. All confidence intervals have 95% coverage. The partial correlations are smaller than the corresponding marginal correlations, with the exception of the large partial correlations between PM and AM and between DG and CA1.

7. DISCUSSION

We began with two separate goals, and ended by merging them. On the one hand, we wanted to see whether the multivariate complex normal distribution might be useful in analyzing dependence among groups of oscillating time series. On the other hand, we wished to better understand complex correlation. Our investigation, built on existing literature concerning dependence among complex random variables and complex normal variables in particular, uncovered several interesting relationships that were either not known previously or not spelled out clearly. For complex random vectors, we reviewed the distinction between partial correlation and conditional correlation (Baba, Shibata & Sibuya, 2004), which are equivalent for real, normal random vectors, and we showed that the complex partial correlation and conditional correlation also coincide under complex normality. We phrased many of our theorems in terms of the general form of complex normality, including both covariance and pseudo-covariance, and then specialized to the proper case where pseudo-covariances vanish. We showed that in the proper case, pairwise conditional independence coincides with zero partial correlation, as it does for real multivariate normal distributions, and that partial coherency may be considered a partial complex correlation at a given frequency. These facts are important for analytic interpretation, as we demonstrated in our real data example.

The scientific backdrop for our work on the analysis of neural data is one of the most pressing problems in the application of statistics to neurophysiology. Specifically, the problem lies in identifying coordinated activity across two or more regions of the brain based on recordings of multiple time series that are highly nonstationary but are repeated across experimental trials. When there are many repetitions of multiple recorded values in two regions at a single time point, CCA provides a solution to the problem of determining their dependence. In the two-region case, we showed that maximum likelihood estimation for our latent variable model produces a form of latent coherence which is equivalent to the magnitude of the complex canonical correlation. In the multi-region case, maximum likelihood estimation for our model produces a generalized canonical correlation.

The latent coherencies and partial coherencies from our latent-variable model are not computed in the frequency domain but are obtained instead in the time domain with band-pass-filtering in a narrow band. For this we leaned on a key observation by Ombao & Van Bellegem (2008) that “band coherence” (a version of coherence written in terms of integrals over the band) could be interpreted as the magnitude of complex correlation of two signals. We used our results for the complex normal distribution to interpret the complex correlations and partial correlations we found in the data.

Part of understanding coherence requires a comparison to PLV, which can be generalized to the multivariate setting with torus graphs. The original motivation for PLV was discomfort with the dependence of coherence on amplitude (Lachaux et al., 1999). Like PLV, torus graphs ignore amplitude variation and any possible cross-covariation between amplitudes and phases. Thus, torus graphs might be considered models of covariation among phases after marginalizing over amplitudes. The complex normal results characterizing torus graphs as conditional distributions after conditioning on amplitudes show that even though the graphical structure (the conditional independence structure) of torus graphs does not depend on amplitude, calibration of the magnitude of interaction effects apparently depends on the amplitudes. When the amplitudes are roughly constant, coherence and PLV provide essentially equivalent results (see the Supplementary Material Section S2 and Lepage & Vijayan, 2017).
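The relationship between coherence and PLV can be checked numerically. The sketch below is only illustrative: it treats coherence as the magnitude of the sample complex correlation of zero-mean complex observations across trials, and the simulated phases and amplitudes (and the function names `coherence` and `plv`) are assumptions made for the example, not quantities from our data.

```python
import numpy as np

def coherence(z1, z2):
    """Magnitude of the sample complex correlation (zero-mean signals assumed)."""
    num = np.abs(np.sum(z1 * np.conj(z2)))
    den = np.sqrt(np.sum(np.abs(z1) ** 2) * np.sum(np.abs(z2) ** 2))
    return num / den

def plv(z1, z2):
    """Phase-locking value: uses only the phase differences, ignoring amplitudes."""
    return np.abs(np.mean(np.exp(1j * (np.angle(z1) - np.angle(z2)))))

rng = np.random.default_rng(0)
n_trials = 500
theta1 = rng.uniform(-np.pi, np.pi, n_trials)
theta2 = theta1 + 0.5 * rng.standard_normal(n_trials)   # noisy phase locking

# Constant amplitudes: coherence and PLV agree.
z1_const = 2.0 * np.exp(1j * theta1)
z2_const = 3.0 * np.exp(1j * theta2)

# Trial-varying amplitudes: the two measures generally differ.
z1_var = rng.gamma(2.0, 1.0, n_trials) * np.exp(1j * theta1)
z2_var = rng.gamma(2.0, 1.0, n_trials) * np.exp(1j * theta2)

print(coherence(z1_const, z2_const), plv(z1_const, z2_const))
print(coherence(z1_var, z2_var), plv(z1_var, z2_var))
```

PLV is unchanged between the two cases because it depends only on the phases, whereas coherence weights each trial by the product of the amplitudes.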

The results we found in the data are striking and intriguing. There are large partial coherences in two pairs of areas, one involving higher visual areas (AM and PM, the anterior and posterior parts of the medial visual area, which are further downstream than primary visual cortex, V1) and the other involving the hippocampus (CA1, a sub-region of the cornu ammonis, and DG, the dentate gyrus). Both pairs involve areas that are contiguous, but that alone does not explain their interaction because all of these areas share anatomical boundaries with some other areas. In the case of CA1 and DG, these are the only hippocampal areas in the data. The coherence between the AM and PM areas could indicate close collaboration in neural processing and may be worth further investigation. In addition, we observed stronger coherences between LM and hippocampal regions than between LM and other visual regions, which could also be a subject of further investigation.

Often, there are dozens of time series in each region, and latent-variable models are attractive ways to reduce dimensionality for examining cross-region interactions. In an unpublished work (Orellana & Kass, 2023), a hierarchical model based on torus graphs has been used to describe large numbers of phase measurements made in each of several brain areas. This hierarchical structure has the advantage of reducing the total number of parameters, which results in better statistical inference when the amount of data is small relative to its dimensionality. To assess time-varying amplitude interaction, Bong et al. (2023) developed a time-series generalization of a factor analysis model with one latent factor for each region. They allowed for nonstationarity and included all relevant time-lagged cross-correlations. They first showed how the model leads to a time-series generalization of probabilistic CCA, based on multiset CCA (analogous to Theorem 12 here). Because of nonstationarity, each combination of a time point in one region and a time point in another region could have a unique correlation across trials, which made the covariance matrix have a large number of free parameters. The authors adapted sparse estimation methods to solve the high-dimensional inference problem and showed how it produced interpretable and interesting results when applied to their data. Closely related work can be found in Bong et al. (2020).

It would be straightforward to apply the approach here at many time points, but modelling time lags, as is done in Bong et al. (2023), would require additional work. It would also be possible to estimate partial coherence, even at a single point, using a latent, multivariate time series model, but this would require either a very high dimensional formulation along the lines of Bong et al. (2023) or a specific time series model, both of which present their own challenges.

To simplify interpretation, it was important for us to assume that the complex normal distributions were proper. In our data, this seemed to be a reasonable assumption, and in all the data we have examined using torus graphs, reflectional dependence is, similarly, either absent or difficult to detect. Perhaps future uses of the complex normal distribution, along the lines outlined here, will reveal situations in which pseudo-covariance needs to be considered. The substantial additional complication of such cases would likely present new challenges, but we hope the framework we have summarized here would provide a useful starting point.

Supplementary Material

UrbanEtAl-Supplementary: NIHMS1995735-supplement-UrbanEtAl-Supplementary.pdf (1.1 MB, PDF)

ACKNOWLEDGEMENTS

This work was supported in part by the National Institute of Mental Health (grant RO1 MH064537). Urban was supported in part by the National Institute on Drug Abuse (grant 5T90DA022762).

APPENDIX

A NOTE ON THE HILBERT TRANSFORM

Here we provide an overview of the Hilbert transform, which is commonly used to recover phase and amplitude from oscillating signals (Cohen, 2014). Let $X (t) = ℜ e X (t) + ı ℑ m X (t)$ be the output of a complex signal that has been filtered in a band $(ω_{0} - δ, ω_{0} + δ)$.

We only observe $ℜ e X (t)$, and the problem is to recover $X (t)$, which is possible when $δ$ is sufficiently small. The Hilbert transform operates on $ℜ e X (t)$ to produce $ℑ m X (t)$ according to

ℑ m X (t) = p. v. \int \frac{ℜ e X (s)}{π (t - s)} d s, \qquad (A1)

where the integral is over the domain of $t$ . The notation $p. v. \int f (t) d t$ denotes the Cauchy principal value; this formula is given in numerous sources (e.g., Pandey, 2011), but because we have not seen a concise derivation, we provide one here.

Let us write the Fourier transforms of $X (t)$ and its complex conjugate $\bar{X} (t)$ as $𝓕 (X)$ and $𝓕 (\bar{X})$ , and the evaluation of such transforms at a frequency $ω$ by $𝓕 (X) (ω)$ , etc. From the general relations

ℜ e X (t) = \frac{1}{2} {X (t) + \bar{X} (t)} and ℑ m X (t) = - \frac{ı}{2} {X (t) - \bar{X} (t)},

we have the Fourier transforms

𝓕 (ℜ e X) = \frac{1}{2} {𝓕 (X) + 𝓕 (\bar{X})} and 𝓕 (ℑ m X) = - \frac{ı}{2} {𝓕 (X) - 𝓕 (\bar{X})} .

Note that $𝓕 (X) (ω)$ is, from the band-pass filtering, concentrated around $ω_{0}$, and $𝓕 (\bar{X}) (ω)$ is concentrated around $- ω_{0}$. Therefore, when $ω > 0$ the formula above gives $𝓕 (ℜ e X) (ω) = 𝓕 (X) (ω) / 2$ and, similarly, when $ω < 0$, $𝓕 (ℜ e X) (ω) = 𝓕 (\bar{X}) (ω) / 2$. Thus, using $sgn (ω)$ to denote the sign of $ω$, when we multiply $𝓕 (ℜ e X) (ω)$ by $- ı sgn (ω)$, we get

{- ı sgn (ω)} 𝓕 (ℜ e X) (ω) = 𝓕 (ℑ m X) (ω) .

As a result, the convolution $ℜ e X (t) * 𝓕^{- 1} {- ı sgn (ω)}$ satisfies

ℜ e X (t) * 𝓕^{- 1} {- ı sgn (ω)} = ℑ m X (t) .

Inserting the formula for $𝓕^{- 1} {- ı sgn (ω)}$, which is the kernel $1 / (π t)$ understood in the principal-value sense, into the definition of convolution produces (A1), giving the Hilbert transform of $ℜ e X (t)$. The resulting $X (t)$ is called the analytic signal.
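The frequency-domain step in this derivation, multiplying the Fourier transform of the real part by $- ı sgn (ω)$, can be sketched numerically with a discrete Fourier transform. This is a minimal illustration, not the preprocessing code used for our data: the helper name `hilbert_imag`, the sampling rate, and the test sinusoid are assumptions, and the discrete version is exact here only because the window contains a whole number of cycles.

```python
import numpy as np

def hilbert_imag(x_real):
    """Hilbert transform of a real signal via multiplication by -i*sgn(omega)."""
    n = len(x_real)
    freqs = np.fft.fftfreq(n)                  # signed frequencies (cycles per sample)
    spectrum = np.fft.fft(x_real)
    return np.real(np.fft.ifft(-1j * np.sign(freqs) * spectrum))

fs = 1000.0                                    # assumed sampling rate in Hz
t = np.arange(2000) / fs                       # 2 s window: 13 full cycles at 6.5 Hz
f0 = 6.5
x_real = np.cos(2 * np.pi * f0 * t + 0.3)      # observed real part

x_imag = hilbert_imag(x_real)                  # recovered imaginary part
analytic = x_real + 1j * x_imag                # analytic signal
amplitude = np.abs(analytic)
phase = np.angle(analytic)
```

For this pure sinusoid the recovered imaginary part is the corresponding sine wave, so the amplitude is constant and the phase advances linearly in time.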

FITTING PROCEDURE FOR THE LATENT FACTOR MODEL IN SECTION 5

In this section, we discuss the procedure for fitting the latent factor model (Eqs. 6 and 7) in Section 5 to the data ${X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N}$ from $N$ independent repeated observations. The procedure provides estimates of the model parameters $Γ$, $β_{k}$, and $Φ_{k}$ using the EM algorithm (Dempster, Laird & Rubin, 1977). The EM algorithm starts with initial estimates for the parameters, denoted by $Γ^{(0)}$, $β_{k}^{(0)}$, and $Φ_{k}^{(0)}$, and iteratively updates the estimates to optimize the likelihood through alternating E-steps and M-steps, which we describe next. We denote the estimates after the $r$th update by $Γ^{(r)}$, $β_{k}^{(r)}$, and $Φ_{k}^{(r)}$.

At the $(r + 1)$th iteration, the E-step calculates the sufficient statistics of the latent factors ${Z_{1}^{n}, \dots, Z_{K}^{n}}$ conditional on the observed data ${X_{1}^{n}, \dots, X_{K}^{n}}$ and the parameter estimates $Γ^{(r)}$, $β_{k}^{(r)}$, and $Φ_{k}^{(r)}$ after the $r$th iteration. The full joint probability density of the observed data and the latent factors for a single observation (suppressing the index $n$), given the $r$th parameter estimates, is

p (X, Z; Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)}) \propto \frac{1}{\det (Γ^{(r)})} \exp (- Z^{H} {Γ^{(r)}}^{- 1} Z) \times \prod_{k = 1}^{K} (\frac{1}{\det (Φ_{k}^{(r)})} \exp {- {(X_{k} - β_{k}^{(r)} Z_{k})}^{H} {Φ_{k}^{(r)}}^{- 1} (X_{k} - β_{k}^{(r)} Z_{k})}),

where $Z = (Z_{1}, \dots, Z_{K})$ is the concatenation of the latent factors. Further,

E (Z ∣ X, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)}) = {({Γ^{(r)}}^{- 1} + D^{(r)})}^{- 1} V^{(r)},

var (Z ∣ X, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)}) = {({Γ^{(r)}}^{- 1} + D^{(r)})}^{- 1}, and

pvar (Z ∣ X, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)}) = O_{K},

where $D^{(r)}$ is a diagonal matrix with entries $D_{k k}^{(r)} = β_{k}^{(r) H} {Φ_{k}^{(r)}}^{- 1} β_{k}^{(r)}$ , $V^{(r)}$ is a vector of elements $V_{k}^{(r)} = β_{k}^{(r) H} {Φ_{k}^{(r)}}^{- 1} X_{k}$ , and $O_{K}$ is the $K \times K$ zero matrix. For brevity, we denote $E (Z^{n} ∣ X^{n}, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)})$ and $var (Z^{n} ∣ X^{n}, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)})$ by $E^{n (r)}$ and ${var}^{(r)}$ . (We note that ${var}^{(r)}$ is invariant over $n \in [N]$ .)
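The E-step formulas above translate directly into a few lines of linear algebra. The following is a minimal sketch, assuming the data for region $k$ are stored as an $N \times d_{k}$ complex array; the function name `e_step`, the helper `random_hpd`, and the randomly generated parameters are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def e_step(X, Gamma, betas, Phis):
    """Posterior mean E (N x K) and covariance var (K x K) of the latent factors.

    X is a list of K arrays, X[k] having shape (N, d_k).
    """
    K = len(betas)
    N = X[0].shape[0]
    D = np.zeros((K, K), dtype=complex)
    V = np.zeros((N, K), dtype=complex)
    for k in range(K):
        a = np.linalg.solve(Phis[k], betas[k])   # Phi_k^{-1} beta_k
        D[k, k] = np.vdot(betas[k], a)           # beta_k^H Phi_k^{-1} beta_k
        V[:, k] = X[k] @ np.conj(a)              # beta_k^H Phi_k^{-1} X_k^n for every n
    var = np.linalg.inv(np.linalg.inv(Gamma) + D)
    E = V @ var.T                                # row n holds var @ V^n
    return E, var

# Hypothetical parameters and data, for illustration only.
rng = np.random.default_rng(1)
K, d, N = 3, 4, 5

def random_hpd(m):
    A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return A @ A.conj().T + m * np.eye(m)        # Hermitian positive definite

Gamma = random_hpd(K)
betas = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for _ in range(K)]
Phis = [random_hpd(d) for _ in range(K)]
X = [rng.standard_normal((N, d)) + 1j * rng.standard_normal((N, d)) for _ in range(K)]

E, var = e_step(X, Gamma, betas, Phis)
```

The posterior covariance is computed once because it does not depend on $n$; only the posterior means differ across observations.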

In the M-step, we find the parameters maximizing the expectation of the full log-likelihood function with respect to both ${X^{n}}_{n}$ and ${Z^{n}}_{n}$ conditional on ${X_{1}^{n}, \dots, X_{K}^{n}}_{n}$, $Γ^{(r)}$, $β_{k}^{(r)}$, and $Φ_{k}^{(r)}$. The conditional expectation of the full log-likelihood function for the model is

E (ℓ (Γ, β_{k}, Φ_{k}; {X^{n}, Z^{n}}_{n}) ∣ X, Γ^{(r)}, β_{k}^{(r)}, Φ_{k}^{(r)}) \overset{+ C}{=} \sum_{n = 1}^{N} [- ln det (Γ) - tr {Γ^{- 1} ({var}^{(r)} + E^{n (r)} E^{n (r) H})} + \sum_{k = 1}^{K} {- ln det (Φ_{k}) - {(X_{k}^{n})}^{H} Φ_{k}^{- 1} X_{k}^{n} + {\bar{E}}_{k}^{n (r)} β_{k}^{H} Φ_{k}^{- 1} X_{k}^{n} + {(X_{k}^{n})}^{H} Φ_{k}^{- 1} β_{k} E_{k}^{n (r)} - β_{k}^{H} Φ_{k}^{- 1} β_{k} ({var}_{k k}^{(r)} + E_{k}^{n (r)} {\bar{E}}_{k}^{n (r)})}] .

The parameters are complex-valued, so we use Wirtinger calculus to take derivatives. Using formulas from Adali, Schreier & Scharf (2011), we take derivatives and set them equal to zero to achieve the update steps

Γ^{(r + 1)} = {var}^{(r)} + \frac{1}{N} \sum_{n = 1}^{N} E^{n (r)} E^{n (r) H},

β_{k}^{(r + 1)} = \frac{\sum_{n} X_{k}^{n} {\bar{E}}_{k}^{n (r)}}{\sum_{n} ({var}_{k k}^{(r)} + E_{k}^{n (r)} {\bar{E}}_{k}^{n (r)})}, and

Φ_{k}^{(r + 1)} = {var}_{k k}^{(r)} β_{k}^{(r + 1)} {(β_{k}^{(r + 1)})}^{H} + \frac{1}{N} \sum_{n = 1}^{N} (X_{k}^{n} - β_{k}^{(r + 1)} E_{k}^{n (r)}) {(X_{k}^{n} - β_{k}^{(r + 1)} E_{k}^{n (r)})}^{H} .
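The three update equations can be sketched as follows, again assuming $N \times d_{k}$ complex data arrays and taking the E-step outputs as inputs. The function name `m_step` and the noise-free toy data are illustrative assumptions; the toy check uses the fact that, with no noise and exactly known latent factors, the updates should return the loadings exactly.

```python
import numpy as np

def m_step(X, E, var):
    """One M-step update given posterior means E (N x K) and covariance var (K x K)."""
    N, K = E.shape
    Gamma = var + (E.T @ E.conj()) / N                  # var + mean of E^n E^{nH}
    betas, Phis = [], []
    for k in range(K):
        e = E[:, k]
        vkk = np.real(var[k, k])
        beta = (X[k].T @ e.conj()) / (N * vkk + np.sum(np.abs(e) ** 2))
        resid = X[k] - np.outer(e, beta)                # X_k^n - beta_k E_k^n, row-wise
        Phi = vkk * np.outer(beta, beta.conj()) + (resid.T @ resid.conj()) / N
        betas.append(beta)
        Phis.append(Phi)
    return Gamma, betas, Phis

# Noise-free toy check with hypothetical loadings and known latent factors.
rng = np.random.default_rng(2)
N, K, d = 50, 2, 3
Z = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
betas_true = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for _ in range(K)]
X = [np.outer(Z[:, k], betas_true[k]) for k in range(K)]

Gamma_new, betas_new, Phis_new = m_step(X, Z, np.zeros((K, K)))
```

These updates do not yet enforce the identifiability constraints, which are imposed in a separate step.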

We also have the identifiability constraints $β_{k}^{H} β_{k} = 1$, $ℑ m β_{k} (1) = 0$, and $\sup {a \geq 0 : Φ_{k} - a β_{k} β_{k}^{H} ⪰ 0} = 0$. To produce estimates that satisfy these constraints, we utilize the following procedure. First, we adjust the elements of $\hat{Γ}$ and ${\hat{β}}_{k}$ so that $β_{k}^{H} β_{k} = 1$ and $ℑ m β_{k} (1) = 0$; this can be accomplished by multiplying the rows and columns in $\hat{Γ}$ and the ${\hat{β}}_{k}$ vectors by a complex-valued scalar. To ensure that $\sup {a \geq 0 : Φ_{k} - a β_{k} β_{k}^{H} ⪰ 0} = 0$, we solve a convex optimization problem by maximizing $a$ under the constraint that ${\hat{Φ}}_{k} - a {\hat{β}}_{k} {\hat{β}}_{k}^{H} ⪰ 0$. We then subtract $a {\hat{β}}_{k} {\hat{β}}_{k}^{H}$ from ${\hat{Φ}}_{k}$ and add $a$ to ${\hat{Γ}}_{k k}$.
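A sketch of this normalization is given below. It departs from the text in one respect that should be read as an assumption: instead of calling a numerical convex solver, it uses the closed form $a = 1 / (β_{k}^{H} Φ_{k}^{- 1} β_{k})$, which gives the largest feasible $a$ when $Φ_{k}$ is positive definite. The function name `apply_constraints` and the random inputs are hypothetical.

```python
import numpy as np

def apply_constraints(Gamma, betas, Phis):
    """Rescale (Gamma, beta_k, Phi_k) to satisfy the identifiability constraints.

    Assumes every Phi_k is Hermitian positive definite.
    """
    K = len(betas)
    # Scalar c_k carrying the norm of beta_k and the phase of its first entry.
    c = np.array([np.linalg.norm(b) * np.exp(1j * np.angle(b[0])) for b in betas])
    betas_n = [b / ck for b, ck in zip(betas, c)]        # unit norm, real first entry
    Gamma_n = Gamma * np.outer(c, c.conj())              # rescale rows and columns
    Phis_n = []
    for k in range(K):
        b = betas_n[k]
        a = 1.0 / np.real(np.vdot(b, np.linalg.solve(Phis[k], b)))
        Phis_n.append(Phis[k] - a * np.outer(b, b.conj()))   # move variance a ...
        Gamma_n[k, k] += a                                   # ... into the latent factor
    return Gamma_n, betas_n, Phis_n

# Hypothetical unconstrained estimates, for illustration only.
rng = np.random.default_rng(5)
K, d = 2, 3

def random_hpd(m):
    A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return A @ A.conj().T + m * np.eye(m)

Gamma = random_hpd(K)
betas = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for _ in range(K)]
Phis = [random_hpd(d) for _ in range(K)]

Gamma_n, betas_n, Phis_n = apply_constraints(Gamma, betas, Phis)
```

The marginal covariance blocks $β_{k} Γ_{k ℓ} β_{ℓ}^{H} + Φ_{k} δ_{k ℓ}$ are unchanged by these adjustments, so the likelihood is unaffected.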

We use a procedure based on the method of moments to obtain starting values for the EM algorithm, a strategy that has been shown theoretically to often result in near-optimal parameter estimates (Balakrishnan, Wainwright & Yu, 2017). To do so, we start by computing the sample covariance matrix over the data. Then, we estimate the parameter vectors $β_{k}$ by examining the submatrices given by $\hat{cov} (X_{k}^{n}, X_{i}^{n})$ . In particular, because under the correct model specification, we have that $cov (X_{k}^{n}, X_{i}^{n}) = Γ_{k i} β_{k} β_{i}^{H}$ , we can take any column of $\hat{cov} (X_{k}^{n}, X_{i}^{n})$ and normalize it according to the identifiability constraints to obtain an estimate of $β_{k}$ . We do this and average over all the corresponding columns to get an overall estimate of $β_{k}$ . Once $β_{i}$ and $β_{j}$ have been estimated, it is then straightforward to obtain estimates of $Γ_{i j}$ by dividing the entries in $\hat{cov} (X_{i}^{n}, X_{j}^{n})$ by those in $β_{i} β_{j}^{H}$ .
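The moment-based initialization can be sketched as follows. Two details are assumptions of this sketch rather than statements about our implementation: the phase is fixed by making the first entry of each $β_{k}$ real and nonnegative, and $Γ_{i j}$ is recovered by the projection $β_{i}^{H} \hat{cov} (X_{i}, X_{j}) β_{j}$ (valid because $β_{k}^{H} β_{k} = 1$) rather than by entrywise division, which can be unstable when entries of $β_{i} β_{j}^{H}$ are near zero. The names `normalize_beta` and `init_from_moments` are hypothetical, and only the off-diagonal entries of $Γ$ are filled in.

```python
import numpy as np

def normalize_beta(v):
    """Scale to unit norm and rotate so the first entry is real and nonnegative."""
    v = v / np.linalg.norm(v)
    return v * np.exp(-1j * np.angle(v[0]))

def init_from_moments(S_blocks):
    """Starting values from cross-covariance blocks S_blocks[k][i] = cov(X_k, X_i)."""
    K = len(S_blocks)
    betas = []
    for k in range(K):
        # Every column of cov(X_k, X_i), i != k, is proportional to beta_k.
        cols = [normalize_beta(S_blocks[k][i][:, j])
                for i in range(K) if i != k
                for j in range(S_blocks[k][i].shape[1])]
        betas.append(normalize_beta(np.mean(cols, axis=0)))
    Gamma = np.zeros((K, K), dtype=complex)   # diagonal is left unset (zero) here
    for i in range(K):
        for j in range(K):
            if i != j:
                Gamma[i, j] = np.vdot(betas[i], S_blocks[i][j] @ betas[j])
    return betas, Gamma

# Population covariance blocks built from hypothetical true parameters.
rng = np.random.default_rng(4)
K, d = 3, 4
betas_true = [normalize_beta(rng.standard_normal(d) + 1j * rng.standard_normal(d))
              for _ in range(K)]
A = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
Gamma_true = A @ A.conj().T + np.eye(K)
S_blocks = [[Gamma_true[k, i] * np.outer(betas_true[k], betas_true[i].conj())
             for i in range(K)] for k in range(K)]

betas0, Gamma0 = init_from_moments(S_blocks)
```

With sample covariances in place of the population blocks used here, the recovered values are only approximate and serve as starting points for EM.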

Now we briefly address our strategy for parameter estimation under a null hypothesis, as is done in the inference method we introduce. Under the null hypothesis, we are assuming that some entry of the latent precision matrix, $Γ_{i j}^{- 1}$ , is zero. To satisfy this assumption, we need to modify our estimation procedure. Observe that the conditional expectation of the log-likelihood depends on $Γ$ through

- ln det Γ - tr {Γ^{- 1} ({var}^{(r)} + E^{n (r)} E^{n (r) H})} .

Thus, we need to optimize this expression over $Γ$ under the constraints that $Γ_{i j}^{- 1} = 0$ while ensuring that $Γ$ remains a Hermitian PSD matrix. Fortunately, this is a convex optimization problem, and while there is no explicit expression for the optimal $Γ$ under these constraints, we can solve this problem numerically using a convex optimization package. Then, we can estimate the remaining variables using the explicit formulas given above.
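To make the convexity explicit, it may help to restate the problem in terms of the latent precision matrix. The following is a sketch of that reformulation; the symbols $\Theta$ and $M^{(r)}$ are introduced here only for illustration and do not appear elsewhere in the paper.

```latex
\hat{\Theta}
  = \operatorname*{arg\,max}_{\substack{\Theta = \Theta^{H},\ \Theta \succeq 0 \\ \Theta_{ij} = \Theta_{ji} = 0}}
    \Bigl\{ \ln\det\Theta - \operatorname{tr}\bigl(\Theta\, M^{(r)}\bigr) \Bigr\},
\qquad
M^{(r)} = \mathrm{var}^{(r)} + \frac{1}{N}\sum_{n=1}^{N} E^{n(r)} E^{n(r)H},
\qquad
\Gamma^{(r+1)} = \hat{\Theta}^{-1}.
```

The objective is concave in $\Theta$ because $\ln\det$ is concave on positive definite matrices and the trace term is linear, and the null-hypothesis constraint is linear in $\Theta$. Without the zero constraint the maximizer is $\Theta = (M^{(r)})^{-1}$, which recovers the unconstrained update for $Γ^{(r + 1)}$ given above.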

PROOF OF THEOREM 12

Let $u_{k} = S_{k k}^{- 1 / 2} β_{k} Γ_{k k}^{1 / 2}$ and $Ψ_{k} = S_{k k}^{- 1 / 2} Φ_{k} S_{k k}^{- 1 / 2}$ , where $S_{k ℓ} = β_{k} Γ_{k ℓ} β_{ℓ}^{H} + Φ_{k} δ_{k ℓ}$ is a submatrix of the marginal covariance matrix $S$ of $(X_{1}, \dots, X_{K})$ and $δ_{k ℓ} = 1$ if $k = ℓ$ and $δ_{k ℓ} = 0$ otherwise, for $k$ , $ℓ \in [K]$ . Then the second identifiability constraint can be rewritten as $u_{k}^{H} u_{k} = β_{k}^{H} S_{k k}^{- 1} β_{k} Γ_{k k} = 1$ and

u_{k}^{H} Ψ_{k} u_{k} = Γ_{k k}^{1 / 2} β_{k}^{H} S_{k k}^{- 1} Φ_{k} S_{k k}^{- 1} β_{k} Γ_{k k}^{1 / 2} = Γ_{k k}^{1 / 2} β_{k}^{H} S_{k k}^{- 1} (S_{k k} - β_{k} Γ_{k k} β_{k}^{H}) S_{k k}^{- 1} β_{k} Γ_{k k}^{1 / 2} = β_{k}^{H} S_{k k}^{- 1} β_{k} Γ_{k k} - {(β_{k}^{H} S_{k k}^{- 1} β_{k} Γ_{k k})}^{2} = 1 - 1^{2} = 0, k \in [K] .

That is, $u_{k}$ is orthogonal to $Ψ_{k}$. Denoting the block diagonal matrix of ${S_{k k} : k \in [K]}$ by $V$, we have that $R = V^{- 1 / 2} S V^{- 1 / 2}$ has submatrices

R_{k ℓ} = S_{k k}^{- 1 / 2} S_{k ℓ} S_{ℓ ℓ}^{- 1 / 2} = u_{k} Γ_{k k}^{- 1 / 2} Γ_{k ℓ} Γ_{ℓ ℓ}^{- 1 / 2} u_{ℓ}^{H} + Ψ_{k} δ_{k ℓ} .

Because of the orthogonality between $u_{k}$ and $Ψ_{k}$, the calculation of $\det (R)$ and $R^{- 1}$ is straightforward: $\det (R) = \prod_{k} pdet (Ψ_{k}) / \det (Ω)$ and $Q = R^{- 1}$ consists of submatrices

Q_{k ℓ} = u_{k} Ω_{k ℓ} u_{ℓ}^{H} + Ψ_{k}^{+} δ_{k ℓ},

where $Ω = diag (Γ_{k k}^{1 / 2}) Γ^{- 1} diag (Γ_{k k}^{1 / 2})$ is the inverse correlation matrix and $pdet (A)$ and $A^{+}$ are the pseudo determinant and Moore–Penrose pseudo-inverse of a PSD matrix $A$ . Notice that $Ψ_{k} = I - u_{k} u_{k}^{H} = Ψ_{k}^{+}$ and hence $pdet (Ψ_{k}) = 1$ . In turn, the negative log-likelihood under the model (Eqs. 6 and 7) of the parameter set ${Γ} \cup {β_{k}, Φ_{k} : k \in [K]}$ with respect to the observed time series ${X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N}$ is

nll (Γ, β_{k}, Φ_{k}; {X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N}) = ln det (S) + tr (S^{- 1} \hat{S}) = - ln det (Ω) + \sum_{k} ln pdet (Ψ_{k}) + \sum_{k} ln det (S_{k k}) + tr (Ω \hat{P}) + \sum_{k} tr (Ψ_{k}^{+} S_{k k}^{- 1 / 2} {\hat{S}}_{k k} S_{k k}^{- 1 / 2}) = - ln det (Ω) + tr (Ω \hat{P}) + \sum_{k} [ln det (S_{k k}) + tr {(S_{k k}^{- 1} - w_{k} w_{k}^{H}) {\hat{S}}_{k k}}],

where

\hat{P} = \hat{var} [(w_{k}^{H} X_{k} : k \in [K])], {\hat{S}}_{k ℓ} = \hat{cov} [X_{k}, X_{ℓ}], w_{k} = S_{k k}^{- 1 / 2} u_{k},

and $\hat{var}$ and $\hat{cov}$ indicate the sample variance and covariance operators, for $k$, $ℓ \in [K]$. The maximum likelihood estimator minimizes $nll (θ; {X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N})$ with respect to $S_{k k}^{- 1}$ subject to $w_{k}^{H} S_{k k} w_{k} = 1$. That is,

\nabla_{S_{k k}^{- 1}} nll = S_{k k} - {\hat{S}}_{k k} = S_{k k} w_{k} λ_{k} w_{k}^{H} S_{k k},

for some $λ_{k} \in ℝ$ and for all $k \in [K]$ . Because $w_{k}^{H} S_{k k} w_{k} = 1$ ,

1 - w_{k}^{H} {\hat{S}}_{k k} w_{k} = w_{k}^{H} S_{k k} w_{k} - w_{k}^{H} {\hat{S}}_{k k} w_{k} = w_{k}^{H} S_{k k} w_{k} λ_{k} w_{k}^{H} S_{k k} w_{k} = λ_{k} .

Therefore, the two terms $ln det (S_{k k})$ and $tr ((S_{k k}^{- 1} - w_{k} w_{k}^{H}) {\hat{S}}_{k k})$ may be rewritten as

ln det (S_{k k}) = - ln (1 - λ_{k}) + ln det ({\hat{S}}_{k k})

and

tr ((S_{k k}^{- 1} - w_{k} w_{k}^{H}) {\hat{S}}_{k k}) = tr ((S_{k k}^{- 1} - w_{k} w_{k}^{H}) (S_{k k} - S_{k k} w_{k} λ_{k} w_{k}^{H} S_{k k})) = d_{k} - 1.

The maximum likelihood estimation problem then reduces to minimizing

nll (Ω, w_{k}, λ_{k}; {X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N}) = - ln det (Ω) - \sum_{k} \ln (1 - λ_{k}) + tr (Ω \hat{P})

with the restriction that $diag (Ω^{- 1}) = 1$. Let $w_{k}^{'} = w_{k} / \sqrt{1 - λ_{k}}$, $Ω^{'} = diag (\sqrt{1 - λ_{k}}) Ω diag (\sqrt{1 - λ_{k}})$, and ${\hat{P}}^{'} = \hat{var} [(w_{k}^{' H} X_{k} : k \in [K])]$. The negative log-likelihood can be rewritten as

nll (Ω^{'}, w_{k}^{'}; {X_{1}^{n}, \dots, X_{K}^{n}}_{n = 1, \dots, N}) = - ln det (Ω^{'}) + tr (Ω^{'} {\hat{P}}^{'}),

which is minimized when $Ω^{'} = {\hat{P}}^{' - 1}$, given that $w_{k}^{'}$ is fixed for $k \in [K]$. Thus, maximum likelihood estimation is equivalent to finding $w_{k}^{'}$ minimizing $ln det ({\hat{P}}^{'})$ under ${w_{k}^{'}}^{H} {\hat{S}}_{k k} w_{k}^{'} = 1$ for $k \in [K]$, which is the GENVAR procedure of Kettenring (1971).
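For $K = 2$ the criterion can be checked numerically: under the unit-variance constraints, ${\hat{P}}^{'}$ is a $2 \times 2$ matrix with unit diagonal, so its determinant is one minus the squared magnitude of the sample correlation between the two projections, and it is minimized by the first pair of complex canonical weights. The sketch below illustrates this with simulated data; the simulation settings and the names `inv_sqrt`, `cnoise`, and `genvar_criterion` are assumptions made for the example.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a Hermitian positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

rng = np.random.default_rng(3)
N, d1, d2 = 2000, 3, 4

def cnoise(shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Two simulated regions sharing one complex latent factor z.
z = cnoise(N)
X1 = np.outer(z, cnoise(d1)) + cnoise((N, d1))
X2 = np.outer(z, cnoise(d2)) + cnoise((N, d2))
X1 = X1 - X1.mean(axis=0)
X2 = X2 - X2.mean(axis=0)

S11 = X1.T @ X1.conj() / N                      # sample covariance blocks
S22 = X2.T @ X2.conj() / N
S12 = X1.T @ X2.conj() / N

def genvar_criterion(w1, w2):
    """det of the sample covariance of the projections (w1^H X1, w2^H X2)."""
    y = np.column_stack([X1 @ w1.conj(), X2 @ w2.conj()])
    P = y.T @ y.conj() / N
    return np.real(np.linalg.det(P))

# Complex CCA through the SVD of the whitened cross-covariance.
U, s, Vh = np.linalg.svd(inv_sqrt(S11) @ S12 @ inv_sqrt(S22))
rho = s[0]                                      # first canonical correlation magnitude
w1 = inv_sqrt(S11) @ U[:, 0]
w2 = inv_sqrt(S22) @ Vh[0].conj()
genvar = genvar_criterion(w1, w2)
print(genvar, 1 - rho ** 2)
```

Any other pair of weights satisfying the unit-variance constraints gives a determinant at least as large as the one obtained from the canonical weights.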

Footnotes

Additional Supporting Information may be found in the online version of this article at the publisher’s website.

REFERENCES

Adali T, Schreier PJ, & Scharf LL (2011). Complex-valued signal processing: The proper way to deal with impropriety. IEEE Transactions on Signal Processing, 59(11), 5101–5125.
Andersen HH, Hojbjerre M, Sorensen D, & Eriksen PS (1995). Linear and Graphical Models: For the Multivariate Complex Normal Distribution, Springer, New York.
Aydore S, Pantazis D, & Leahy RM (2013). A note on the phase locking value and its properties. Neuroimage, 74, 231–244.
Baba K, Shibata R, & Sibuya M (2004). Partial correlation and conditional correlation as measures of conditional independence. Australian & New Zealand Journal of Statistics, 46(4), 657–664.
Bach FR & Jordan MI (2005). A Probabilistic Interpretation of Canonical Correlation Analysis, Technical Report 688, University of California, Berkeley.
Balakrishnan S, Wainwright MJ, & Yu B (2017). Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics, 45(1), 77–120.
Bong H, Liu Z, Ren Z, Smith M, Ventura V, & Kass RE (2020). Latent dynamic factor analysis of high-dimensional neural recordings. In Advances in Neural Information Processing Systems, Vol. 33, Neural Information Processing Systems Foundation Inc., San Diego, 16446–16456.
Bong H, Ventura V, Yttri EA, Smith MA, & Kass RE (2023). Cross-population amplitude coupling in high-dimensional oscillatory neural time series. arXiv preprint, arXiv:2105.03508.
Brémaud P (2014). Fourier Analysis and Stochastic Processes, Springer, Cham, Switzerland.
Buzsáki G, Anastassiou CA, & Koch C (2012). The origin of extracellular fields and currents—EEG, ECoG, LFP and spikes. Nature Reviews Neuroscience, 13(6), 407–420.
Buzsaki G & Draguhn A (2004). Neuronal oscillations in cortical networks. Science, 304(5679), 1926–1929.
Cardin JA, Carlén M, Meletis K, Knoblich U, Zhang F, Deisseroth K, Tsai L-H, & Moore CI (2009). Driving fast-spiking cells induces gamma rhythm and controls sensory responses. Nature, 459(7247), 663–667.
Cohen MX (2014). Analyzing Neural Time Series Data: Theory and Practice, MIT Press, Cambridge, MA.
Dahlhaus R (2000). Graphical interaction models for multivariate time series. Metrika, 51(2), 157–172.
Dempster AP, Laird NM, & Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39, 1–22.
Einevoll GT, Kayser C, Logothetis NK, & Panzeri S (2013). Modelling and analysis of local field potentials for studying the function of cortical circuits. Nature Reviews Neuroscience, 14(11), 770–785.
Fries P (2005). A mechanism for cognitive dynamics: Neuronal communication through neuronal coherence. Trends in Cognitive Sciences, 9(10), 474–480.
Hotelling H (1936). Relations between two sets of variates. Biometrika, 28(3–4), 321–377.
Kass RE, Amari S-I, Arai K, Brown EN, Diekman CO, Diesmann M, Doiron B et al. (2018). Computational neuroscience: Mathematical and statistical perspectives. Annual Review of Statistics and its Application, 5, 183–214.
Kettenring JR (1971). Canonical analysis of several sets of variables. Biometrika, 58(3), 433–451.
Klein N, Orellana J, Brincat SL, Miller EK, & Kass RE (2020). Torus graphs for multivariate phase coupling analysis. The Annals of Applied Statistics, 14(2), 635–660.
Lachaux J-P, Rodriguez E, Martinerie J, & Varela FJ (1999). Measuring phase synchrony in brain signals. Human Brain Mapping, 8(4), 194–208.
Lepage KQ & Vijayan S (2017). The relationship between coherence and the phase-locking value. Journal of Theoretical Biology, 435, 106–109.
Lowet E, Roberts MJ, Bonizzi P, Karel J, & De Weerd P (2016). Quantifying neural oscillatory synchronization: A comparison between spectral coherence and phase-locking value approaches. PloS One, 11(1), e0146443.
Mathalon DH & Sohal VS (2015). Neural oscillations and synchrony in brain dysfunction and neuropsychiatric disorders: It’s about time. Journal of the American Medical Association Psychiatry, 72(8), 840–844.
Miller EK, Lundqvist M, & Bastos AM (2018). Working memory 2.0. Neuron, 100(2), 463–475.
Navarro A, Frellsen J, & Turner R (2017). The multivariate generalised von Mises distribution: Inference and applications. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31, Association for the Advancement of Artificial Intelligence, Washington, DC.
Nolte G, Galindo-Leon E, Li Z, Liu X, & Engel AK (2020). Mathematical relations between measures of brain connectivity estimated from electrophysiological recordings for Gaussian distributed data. Frontiers in Neuroscience, 14, 577574. https://www.frontiersin.org/articles/10.3389/fnins.2020.577574/
Ombao H & Pinto M (2022). Spectral dependence. Econometrics and Statistics. doi:10.1016/j.ecosta.2022.10.005
Ombao H & Van Bellegem S (2008). Evolutionary coherence of nonstationary signals. IEEE Transactions on Signal Processing, 56(6), 2259–2266.
Orellana J & Kass RE (2023). Latent Torus Graphs for Dense Recordings and Cross-Region Phase Coupling Analysis (unpublished manuscript).
Pandey JN (2011). The Hilbert Transform of Schwartz Distributions and Applications, Wiley, New York.
Pesaran B, Vinck M, Einevoll GT, Sirota A, Fries P, Siegel M, Truccolo W, Schroeder CE, & Srinivasan R (2018). Investigating large-scale brain dynamics using field potential recordings: Analysis and interpretation. Nature Neuroscience, 21(7), 903–919.
Picinbono B (1994). On circularity. IEEE Transactions on Signal Processing, 42(12), 3473–3482.
Schmidt R, Ruiz MH, Kilavik BE, Lundqvist M, Starr PA, & Aron AR (2019). Beta oscillations in working memory, executive control of movement and thought, and sensorimotor function. Journal of Neuroscience, 39(42), 8231–8238.
Shumway RH & Stoffer DS (2017). Time Series Analysis and its Applications: With R Examples, Springer, Cham, Switzerland.
Siegle JH, Jia X, Durand S, Gale S, Bennett C, Graddis N, Heller G et al. (2021). Survey of spiking in the mouse visual system reveals functional hierarchy. Nature, 592(7852), 86–92.
Srinath R & Ray S (2014). Effect of amplitude correlations on coherence in the local field potential. Journal of Neurophysiology, 112(4), 741–751.
Tugnait JK (2019a). Edge exclusion tests for graphical model selection: Complex Gaussian vectors and time series. IEEE Transactions on Signal Processing, 67(19), 5062–5077.
Tugnait JK (2019b). Edge exclusion tests for improper complex Gaussian graphical model selection. IEEE Transactions on Signal Processing, 67(13), 3547–3560.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

UrbanEtAl-Supplementary

NIHMS1995735-supplement-UrbanEtAl-Supplementary.pdf^{(1.1MB, pdf)}

[R1] Adali T., Schreier PJ., & Scharf LL. (2011). Complex-valued signal processing: The proper way to deal with impropriety. IEEE Transactions on Signal Processing, 59(11), 5101–5125. [Google Scholar]

[R2] Andersen HH, Hojbjerre M, Sorensen D, & Eriksen PS (1995). Linear and Graphical Models: For the Multivariate Complex Normal Distribution, Springer, New York. [Google Scholar]

[R3] Aydore S, Pantazis D, & Leahy RM (2013). A note on the phase locking value and its properties. Neuroimage, 74, 231–244. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] Baba K, Shibata R, & Sibuya M (2004). Partial correlation and conditional correlation as measures of conditional independence. Australian & New Zealand Journal of Statistics, 46(4), 657–664. [Google Scholar]

[R5] Bach FR & Jordan MI (2005). A Probabilistic Interpretation of Canonical Correlation Analysis, Technical Report 688, University of California, Berkeley. [Google Scholar]

[R6] Balakrishnan S, Wainwright MJ, & Yu B (2017). Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics, 45(1), 77–120. [Google Scholar]

[R7] Bong H, Liu Z, Ren Z, Smith M, Ventura V, & Kass RE (2020). Latent dynamic factor analysis of high-dimensional neural recordings. In Advances in Neural Information Processing Systems, Vol. 33, Neural Information Processing Systems Foundation Inc., San Diego, 16446–16456. [PMC free article] [PubMed] [Google Scholar]

[R8] Bong H, Ventura V, Yttri EA, Smith MA & Kass RE (2023). Cross-population amplitude coupling in high-dimensional oscillatory neural time series. arXiv preprint, arXiv:2105.03508.

[R9] Brémaud P (2014). Fourier Analysis and Stochastic Processes, Springer, Cham, Switzerland. [Google Scholar]

[R10] Buzsáki G, Anastassiou CA, & Koch C (2012). The origin of extracellular fields and currents—EEG, ECoG, LFP and spikes. Nature Reviews Neuroscience, 13(6), 407–420. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] Buzsaki G & Draguhn A (2004). Neuronal oscillations in cortical networks. Science, 304(5679), 1926–1929. [DOI] [PubMed] [Google Scholar]

[R12] Cardin JA, Carlén M, Meletis K, Knoblich U, Zhang F, Deisseroth K, Tsai L-H, & Moore CI (2009). Driving fast-spiking cells induces gamma rhythm and controls sensory responses. Nature, 459(7247), 663–667. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] Cohen MX (2014). Analyzing Neural Time Series Data: Theory and Practice, MIT Press, Cambridge, MA. [Google Scholar]

[R14] Dahlhaus R (2000). Graphical interaction models for multivariate time series. Metrika, 51(2), 157–172. [Google Scholar]

[R15] Dempster AP, Laird NM, & Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39, 1–22. [Google Scholar]

[R16] Einevoll GT, Kayser C, Logothetis NK, & Panzeri S (2013). Modelling and analysis of local field potentials for studying the function of cortical circuits. Nature Reviews Neuroscience, 14(11), 770–785. [DOI] [PubMed] [Google Scholar]

[R17] Fries P (2005). A mechanism for cognitive dynamics: Neuronal communication through neuronal coherence. Trends in Cognitive Sciences, 9(10), 474–480. [DOI] [PubMed] [Google Scholar]

[R18] Hotelling H (1936). Relations between two sets of variates. Biometrika, 28(3–4), 321–377. [Google Scholar]

[R19] Kass RE, Amari S-I, Arai K, Brown EN, Diekman CO, Diesmann M, Doiron B et al. (2018). Computational neuroscience: Mathematical and statistical perspectives. Annual Review of Statistics and its Application, 5, 183–214. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R20] Kettenring JR (1971). Canonical analysis of several sets of variables. Biometrika, 58(3), 433–451. [Google Scholar]

[R21] Klein N, Orellana J, Brincat SL, Miller EK, & Kass RE (2020). Torus graphs for multivariate phase coupling analysis. The Annals of Applied Statistics, 14(2), 635–660. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R22] Lachaux J-P, Rodriguez E, Martinerie J, & Varela FJ (1999). Measuring phase synchrony in brain signals. Human Brain Mapping, 8(4), 194–208. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R23] Lepage KQ & Vijayan S (2017). The relationship between coherence and the phase-locking value. Journal of Theoretical Biology, 435, 106–109. [DOI] [PubMed] [Google Scholar]

[R24] Lowet E, Roberts MJ, Bonizzi P, Karel J, & De Weerd P (2016). Quantifying neural oscillatory synchronization: A comparison between spectral coherence and phase-locking value approaches. PloS One, 11(1), e0146443. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] Mathalon DH & Sohal VS (2015). Neural oscillations and synchrony in brain dysfunction and neuropsychiatric disorders: It’s about time. Journal of the American Medical Association Psychiatry, 72(8), 840–844. [DOI] [PubMed] [Google Scholar]

[R26] Miller EK, Lundqvist M, & Bastos AM (2018). Working memory 2.0. Neuron, 100(2), 463–475.

[R27] Navarro A, Frellsen J, & Turner R (2017). The multivariate generalised von Mises distribution: Inference and applications. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31, Association for the Advancement of Artificial Intelligence, Washington, DC.

[R28] Nolte G, Galindo-Leon E, Li Z, Liu X, & Engel AK (2020). Mathematical relations between measures of brain connectivity estimated from electrophysiological recordings for Gaussian distributed data. Frontiers in Neuroscience, 14, 577574. https://www.frontiersin.org/articles/10.3389/fnins.2020.577574/

[R29] Ombao H & Pinto M (2022). Spectral dependence. Econometrics and Statistics. doi:10.1016/j.ecosta.2022.10.005

[R30] Ombao H & Van Bellegem S (2008). Evolutionary coherence of nonstationary signals. IEEE Transactions on Signal Processing, 56(6), 2259–2266.

[R31] Orellana J & Kass RE (2023). Latent Torus Graphs for Dense Recordings and Cross-Region Phase Coupling Analysis (unpublished manuscript).

[R32] Pandey JN (2011). The Hilbert Transform of Schwartz Distributions and Applications, Wiley, New York.

[R33] Pesaran B, Vinck M, Einevoll GT, Sirota A, Fries P, Siegel M, Truccolo W, Schroeder CE, & Srinivasan R (2018). Investigating large-scale brain dynamics using field potential recordings: Analysis and interpretation. Nature Neuroscience, 21(7), 903–919.

[R34] Picinbono B (1994). On circularity. IEEE Transactions on Signal Processing, 42(12), 3473–3482.

[R35] Schmidt R, Ruiz MH, Kilavik BE, Lundqvist M, Starr PA, & Aron AR (2019). Beta oscillations in working memory, executive control of movement and thought, and sensorimotor function. Journal of Neuroscience, 39(42), 8231–8238.

[R36] Shumway RH & Stoffer DS (2017). Time Series Analysis and its Applications: With R Examples, Springer, Cham, Switzerland.

[R37] Siegle JH, Jia X, Durand S, Gale S, Bennett C, Graddis N, Heller G et al. (2021). Survey of spiking in the mouse visual system reveals functional hierarchy. Nature, 592(7852), 86–92.

[R38] Srinath R & Ray S (2014). Effect of amplitude correlations on coherence in the local field potential. Journal of Neurophysiology, 112(4), 741–751.

[R39] Tugnait JK (2019a). Edge exclusion tests for graphical model selection: Complex Gaussian vectors and time series. IEEE Transactions on Signal Processing, 67(19), 5062–5077.

[R40] Tugnait JK (2019b). Edge exclusion tests for improper complex Gaussian graphical model selection. IEEE Transactions on Signal Processing, 67(13), 3547–3560.
