Summary
Neural computations underlying cognition and behavior rely on the coordination of neural activity across multiple brain areas. Understanding how brain areas interact to process information or generate behavior is thus a central question in neuroscience. Here we provide an overview of statistical approaches for characterizing dependencies in multi-region spike train recordings. We focus on two classes of models in particular: regression-based models and shared latent variable models. Regression-based models describe interactions in terms of a directed transformation of information from one region to another. Shared latent variable models, on the other hand, seek to describe interactions in terms of sources that capture common fluctuations in spiking activity across regions. We discuss the advantages and limitations of each of these approaches and future directions for the field. We intend this review to serve as an introduction to statistical methods for multi-region modeling, for computational neuroscientists and experimentalists alike.
Introduction
The idea that distinct areas of the brain support different cognitive and behavioral functions dates back to the early 19th century [1]. During this time, there was debate as to whether various cortical regions had specific functionality or instead participated in all cognitive and psychological functions [2, 3]. These competing ideas remain a topic of study today [4], and despite significant progress in understanding brain function over the past two centuries, the extent to which different brain areas support distinct functions and convey different information to other brain areas remains largely unknown.
In the past decade, advances in neural recording technologies have provided exciting new opportunities to study how distinct brain regions interact. In particular, advances in wide-field calcium imaging [5–7] and high-throughput neural probes [8] allow for simultaneous recording of neural activity from hundreds to thousands of neurons in multiple brain regions (Figure 1A). Recent work has made substantial progress in characterizing neural population codes within individual brain regions [9–15]. The challenge remains to extend these models to study how populations interact. This necessitates both adding to existing statistical tools and developing new tools to extract accurate and interpretable descriptions of interactions from multi-region neural recordings.
Figure 1: Approaches to characterizing neural activity from simultaneously recorded brain regions. A. Schematic for simultaneous electrophysiological recording from two separate brain regions. B. Regression models are directional statistical descriptions that seek to characterize the firing of one brain region using the activity of the other region. C. Latent variable models characterize shared structure underlying response fluctuations in multiple brain regions.
A simple approach to analyzing multi-region datasets is to compute descriptive statistics that summarize the statistical relationships between regions. For example, an extensive literature has focused on using the cross-correlogram or cross-covariance between spike trains to make inferences about connectivity [16–19]. In recent work, covariance-based measures have been used to identify functional sub-networks across areas of rodent visual cortex [5, 15, 20]. However, moment-based methods are typically limited by the lack of an explicit statistical model for neural responses, and may lack statistical power for identifying weak interactions between areas. In this review, we therefore focus on model-based approaches, which provide explicit descriptions of the dependencies within and between brain areas.
Here we provide an overview of two modern approaches for characterizing dependencies in multi-region electrophysiological recordings. We begin by discussing regression-based approaches, which seek to build predictive models for neural activity in one region using neural activity in other regions (Figure 1B). We then discuss latent variable models, which characterize the structure of multi-region data in terms of shared low-dimensional time series (Figure 1C). We conclude by proposing future directions and ways of extending existing models to help further uncover the ways distinct brain regions interact.
Regression-based approaches
Regression models provide one framework for characterizing statistical dependencies between brain areas using data from multi-area recordings. The basic idea is to fit a model that predicts activity in one brain area using the activity from one or more other brain areas (Figure 1B). Models of this form contain a set of linear weights that specify how spikes from a group of input neurons modulate the firing rate of an output neuron, with Gaussian [21] or Poisson observation noise and a time-lag between regions [22, 23]. The inferred regression weights in this class of models provide a measure of statistical dependency, often referred to as functional connectivity, between areas [22, 24]. When the regression model in question is a time-lagged linear-Gaussian “least-squares” regression, the resulting dependency measure is known as Granger causality [25, 26]. When the model consists of a linear mapping followed by a point-wise nonlinearity and Poisson observations, the method is known as the Poisson generalized linear model (GLM). Recent work has extended this framework to allow for nonlinear dependencies between neurons using deep neural networks [27].
For this review, we focus on the Poisson GLM, which provides a simple yet powerful regression modeling framework for spike train data. The single-neuron Poisson GLM describes how the spike rate of a single neuron depends on external covariates such as visual stimuli, spatial position, or theta oscillation phase [28–32]. So-called “auto-regressive” Poisson GLMs include weights on the spike history of one or more neurons, allowing them to characterize dependencies between neurons in multi-region recordings [22–24, 33–39].
Formally, a Poisson GLM describes each neuron by an instantaneous firing rate, also known as the conditional intensity, denoted λt, which defines the probability of observing a spike in a small time bin at time t. This firing rate is given by the projection of an input vector xt onto the linear weights w, transformed by a rectifying nonlinear function f:
\lambda_t = f(\mathbf{w}^\top \mathbf{x}_t + b)        (1)
where b is a constant offset or bias term. Common choices of nonlinearity f include the exponential function and the soft-rectified or “softplus” function, f(x) = log(1 + e^x), both of which ensure that firing rates are non-negative. The spike count yt in a time bin of size Δ has a conditionally Poisson distribution:
y_t \mid \lambda_t \sim \mathrm{Poiss}(\lambda_t \Delta)        (2)
although other count distributions (e.g., Bernoulli, binomial, negative binomial) have also been considered [40–43].
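To make this concrete, here is a minimal Python sketch that simulates data from the single-neuron model of Eqs. (1)–(2) and fits the weights by maximum likelihood. The softplus nonlinearity, toy dimensions, and variable names are our own illustrative choices, not taken from any of the cited papers:

```python
import numpy as np
from scipy.optimize import minimize

def softplus(u):
    # numerically stable f(u) = log(1 + exp(u))
    return np.logaddexp(0.0, u)

def neg_log_lik(params, X, y, dt):
    # negative Poisson log-likelihood (up to the constant log y! term):
    # -sum_t [ y_t log(lambda_t * dt) - lambda_t * dt ]
    w, b = params[:-1], params[-1]
    rate = softplus(X @ w + b)                        # lambda_t of Eq. (1)
    return -np.sum(y * np.log(rate * dt + 1e-12) - rate * dt)

# simulate from the model with arbitrary ground-truth weights
rng = np.random.default_rng(0)
T, d, dt = 5000, 10, 0.01
X = rng.normal(size=(T, d))                           # covariates x_t
w_true, b_true = rng.normal(size=d), -1.0
y = rng.poisson(softplus(X @ w_true + b_true) * dt)   # Eq. (2)

# maximum-likelihood fit
res = minimize(neg_log_lik, np.zeros(d + 1), args=(X, y, dt))
w_hat, b_hat = res.x[:-1], res.x[-1]
```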
To model statistical dependencies between neurons, the spike history of other neurons can be included as a regressor. In this case the firing rate of neuron j can be re-written as:
\lambda^{(j)}_t = f\Big(\mathbf{w}_j^\top \mathbf{x}_t + b_j + \mathbf{h}_j^\top \mathbf{y}^{(j)}_{\mathrm{hist},t} + \sum_{i \neq j} \mathbf{c}_{ij}^\top \mathbf{y}^{(i)}_{\mathrm{hist},t}\Big)        (3)
where wj and bj are the stimulus weights and offset for neuron j, hj denotes the “self-coupling” spike-history filter for neuron j, cij denotes a “coupling” filter from neuron i to neuron j, and y^(i)_hist,t denotes a vector representing the spike history of neuron i at time t. This history vector is typically defined as the binned spike counts from time bins (t − τ, …, t − 1), for some number of time lags τ before the current time bin.
When defined this way, it can be seen that the coupled Poisson GLM is actually a collection of single-neuron models, one for each neuron in the population (schematized for a single neuron in Fig. 2). Each neuron is characterized by the set of filters that determine its inputs. Fitting the coupled Poisson GLM to multi-neuron data therefore involves fitting a separate GLM to each neuron in the population, which can be carried out in parallel.
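A minimal sketch of this per-neuron fitting strategy is shown below, using scikit-learn's PoissonRegressor (which assumes an exponential nonlinearity and an L2 penalty rather than the softplus model above). The random spike counts are placeholders for real binned data, and all names and dimensions are our own:

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

def history_matrix(spikes, n_lags):
    """Stack lagged spike counts for every neuron: (T, N) -> (T, N * n_lags).
    Columns [(lag-1)*N : lag*N] hold the counts from `lag` bins in the past."""
    T, N = spikes.shape
    H = np.zeros((T, N * n_lags))
    for lag in range(1, n_lags + 1):
        H[lag:, (lag - 1) * N:lag * N] = spikes[:-lag]
    return H

# placeholder spike counts; replace with real binned data from two regions
rng = np.random.default_rng(1)
spikes_A = rng.poisson(0.2, size=(2000, 15))   # upstream region A
spikes_B = rng.poisson(0.2, size=(2000, 10))   # downstream region B

X = history_matrix(spikes_A, n_lags=5)         # coupling regressors from A
# one independent GLM per downstream neuron; trivially parallelizable
models = [PoissonRegressor(alpha=1e-3).fit(X, spikes_B[:, j])
          for j in range(spikes_B.shape[1])]
coupling = np.stack([m.coef_.reshape(5, 15) for m in models])  # (nB, lags, nA)
```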
Figure 2: The coupled Poisson GLM is a regression model for capturing dependencies in multi-region spike train recordings. To predict spike trains in region B using spike trains from region A, the model contains “coupling” filters cij that capture statistical dependencies between previous spikes from neuron i and the spike rate of neuron j at time t. It may also contain filters capturing dependencies on external stimuli, and on the spike history of other neurons in region B (not shown). The full Poisson GLM for a single brain region contains an independent set of filters for every spike train, meaning that it is really a collection of independent models, one for every neuron in downstream population B. One can similarly construct a Poisson GLM to predict spike trains in region A from the spikes in region B.
One way to characterize statistical dependencies with the Poisson GLM is to analyze the coupling filters between neurons in different brain regions. The strength of dependencies between regions can be assessed by computing the predictive performance of models fit with and without coupling filters between regions [22]. A simple metric is the difference in cross-validated log-likelihood, given by log P(Ytest|θcoupled) − log P(Ytest|θuncoupled), where Ytest represents held-out test data not used for fitting, and θcoupled and θuncoupled represent the fitted parameters of the coupled and uncoupled models, respectively. This quantity has units of bits when the logarithm is taken in base 2, and can be divided by the number of seconds or number of samples to obtain units of bits/s or bits/sample [31, 42]. Another way to use GLMs to characterize dependencies between brain regions is to directly analyze the shape and amplitude of the fitted coupling filters. The peak amplitude of each coupling filter provides a summary of coupling strength, while the sign of the peak indicates whether the effective dependencies are excitatory or suppressive [38].
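As a sketch, the bits/s computation might look as follows, assuming the predicted test-set rates of both models have already been computed (the log y! term is omitted because it cancels in the difference; function names are our own):

```python
import numpy as np

def poisson_log_lik(y, rate, dt):
    # Poisson log-likelihood summed over bins; the log(y!) constant is
    # dropped because it cancels when comparing two models on the same data
    return np.sum(y * np.log(rate * dt) - rate * dt)

def coupling_gain_bits_per_s(y_test, rate_coupled, rate_uncoupled, dt):
    """Cross-validated log-likelihood gain from coupling, in bits/s."""
    gain_nats = (poisson_log_lik(y_test, rate_coupled, dt)
                 - poisson_log_lik(y_test, rate_uncoupled, dt))
    duration_s = len(y_test) * dt          # total test duration in seconds
    return gain_nats / np.log(2) / duration_s
```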
One general advantage of regression-based approaches is the ability to characterize directed dependencies between brain regions. Because we regress the activity in each brain area against the time-lagged activity from other areas, we can identify asymmetric relationships between areas. For example, we might find that spikes in area A predict increased spiking in area B, whereas spikes in area B predict decreased firing in area A. However, it is important to emphasize that the dependencies identified by GLMs are statistical, not causal, and are susceptible to omitted-variable bias [44]. Thus, for example, the dependencies between two brain regions identified by a GLM analysis may in fact reflect shared input, arriving with different time lags, from a third, unobserved brain region [45, 46].
Another advantage of GLMs is the ability to parallelize over neurons when fitting, since the full model is simply a collection of single-neuron models. This confers computational advantages over latent variable models (see next section), which must be fit jointly to all neurons at once. However, the total number of coupling filters in a GLM grows quadratically with the number of neurons, which can lead to overfitting and problems of interpretation in large neural populations. Sparsity-inducing priors, which drive some coupling filters to zero, can help mitigate both problems; these priors reduce the number of model parameters while preserving the most significant connections between neurons [35, 38, 47].
Latent variable modeling approaches
An alternative to regression-based approaches is to model dependencies between brain regions using shared latent variables. Here, the standard approach is to model response fluctuations in a large population of neurons as arising from a small number of hidden or ‘latent’ sources. These models are typically undirected: rather than separating interactions into distinct feedforward and feedback components, they describe dependencies between brain regions in terms of shared modulation by common unobserved sources.
In recent years, latent variable models have become a popular approach for analyzing structure in single-region datasets. Typically, the time-course of the latent variable is either assumed to follow a linear dynamical system (LDS) [45, 48–53], or is simply assumed to be smooth in time using a Gaussian process (GP) prior [43, 54–60]. Both LDS and GP latent variable models have been augmented and modified for different purposes, and here we will highlight how each form can be extended to understand multi-region data.
The key distinction between LDS and GP latent variable models lies in the assumed prior over the latent variable. In LDS models, the latent vector xt evolves according to linear dynamics with Gaussian noise:
\mathbf{x}_{t+1} = A\,\mathbf{x}_t + \boldsymbol{\epsilon}_t, \quad \boldsymbol{\epsilon}_t \sim \mathcal{N}(0, Q)        (4)
where A describes the discrete-time linear dynamics. Although the assumption of linear dynamics is almost certainly too restrictive for real neural circuits, the noise component ϵt, often referred to as “innovations” noise, can be viewed as accounting for both true noise and model mismatch. One notable extension of this model, the switching linear dynamical system (sLDS), allows for discrete switching between different linear dynamics matrices, which can approximate nonlinear dynamics in the same way that a series of line segments can approximate a curve [61–65].
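As an illustration, sampling a latent trajectory from the LDS prior of Eq. (4) takes only a few lines; the rotational dynamics matrix below is an arbitrary choice made to produce smooth, stable latents:

```python
import numpy as np

def sample_lds(A, Q, T, rng):
    """Draw a latent trajectory x_1, ..., x_T from Eq. (4):
    x_{t+1} = A x_t + eps_t, with eps_t ~ N(0, Q)."""
    p = A.shape[0]
    x = np.zeros((T, p))
    for t in range(1, T):
        x[t] = A @ x[t - 1] + rng.multivariate_normal(np.zeros(p), Q)
    return x

rng = np.random.default_rng(2)
theta = 0.1                                    # slow rotation -> smooth latents
A = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])   # stable dynamics
x = sample_lds(A, Q=0.01 * np.eye(2), T=500, rng=rng)
```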
An alternative to LDS models is to describe the latent variable with a Gaussian process (GP), which makes no assumption about the dynamics beyond smoothness in time. A GP defines a joint Gaussian distribution over a latent x(t) (a single element of the latent vector xt) at all time points t ∈ [0, T]. The correlations of this latent process are specified by a kernel function:
k(t, t') = \mathrm{cov}\big(x(t),\, x(t')\big)        (5)
which determines the a priori correlation between the latent variable at any pair of times t and t′. A common choice of kernel function is the ‘Gaussian’ or ‘radial basis function’ (RBF) covariance:
k(t, t') = \exp\!\big(-(t - t')^2 / (2\ell^2)\big)        (6)
which is governed by a single hyperparameter ℓ, known as the “length scale”, that determines the degree of smoothness. In a classic model known as Gaussian process factor analysis (GPFA), each element of the latent vector xt is assumed to have an independent zero-mean GP prior, each with its own length scale [54].
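The GPFA prior is equally simple to simulate. The sketch below builds the RBF covariance of Eq. (6) on a time grid and draws independent latents with different length scales (the jitter term is a standard numerical fix, and the particular length-scale values are arbitrary):

```python
import numpy as np

def rbf_kernel(ts, length_scale):
    """Eq. (6): k(t, t') = exp(-(t - t')^2 / (2 * ell^2))."""
    diffs = ts[:, None] - ts[None, :]
    return np.exp(-diffs**2 / (2 * length_scale**2))

ts = np.arange(500) * 0.01          # time grid in seconds
rng = np.random.default_rng(3)
# one independent zero-mean GP per latent, each with its own length scale
latents = np.stack([
    rng.multivariate_normal(
        np.zeros(len(ts)),
        rbf_kernel(ts, ell) + 1e-6 * np.eye(len(ts)))   # jitter for stability
    for ell in (0.05, 0.2, 0.5)])   # (3, T) array of latent trajectories
```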
In both LDS and GP latent variable models, the probability distribution over neural data is defined in terms of a noisy projection of the low-dimensional latent variable to the space of high-dimensional neural activity. The simplest case is to assume a linear mapping, as found in a class of so-called “factor models”. If we assume Gaussian additive noise with covariance R, we obtain [54]:
\mathbf{y}_t \mid \mathbf{x}_t \sim \mathcal{N}(W \mathbf{x}_t,\; R)        (7)
where yt is the set of n neurons’ spike counts at time t, xt is the p-element latent vector at time t, and W is an (n × p) matrix of loading weights. Each column of W describes how a single latent affects the population spiking activity, and each row describes how a single neuron’s activity depends on the different latents contained in xt.
A better model for spike count data is to assume Poisson noise after transformation by a fixed nonlinearity. In this case, the factor model can be written:
\mathbf{y}_t \mid \mathbf{x}_t \sim \mathrm{Poiss}\big(\Delta\, f(W \mathbf{x}_t)\big)        (8)
where, as before, Δ denotes the time bin width and f(·) is a fixed non-negative output function [43, 57, 64]. Gaussian models are tractable and simple to fit because the log-likelihood can be evaluated in closed form. Poisson models, by contrast, may be more accurate for spike count data, particularly when spike rates are low, but require approximate inference methods to fit [43, 57, 64, 66, 67]. Recent work has introduced latent variable models with flexible nonlinear mappings between latent variables and observations [60, 66–69], but for the purposes of this review we focus on models with linear mappings.
To extend the two latent variable modeling approaches described above to multi-region data, we first consider a recording from two brain regions, denoted A and B. We can perform inference on an LDS or GP factor model by concatenating data from the two regions, and then examining the inferred loadings matrix W. In the standard inference case, the model will uncover latent structure agnostic to the multi-region nature of the data: the latent factors will reflect the dimensions of shared variability, across time, for the entire population. One way to isolate per-region latent structure is to constrain the loadings matrix W to have block structure, with one subset of latents mapping only to region A and another subset mapping only to region B. The model, shown here for the Poisson observation case, takes the following form:
\begin{bmatrix} \mathbf{y}^A \\ \mathbf{y}^B \end{bmatrix} \sim \mathrm{Poiss}\left(\Delta\, f\left(\begin{bmatrix} W_A & 0 \\ 0 & W_B \end{bmatrix}\begin{bmatrix} \mathbf{x}^A \\ \mathbf{x}^B \end{bmatrix}\right)\right)        (9)
where x^A and x^B denote the latents corresponding to regions A and B, W_A and W_B are the respective loadings for each region, and for simplicity we have dropped indexing on time.
This model is equivalent to fitting two entirely separate models, one for region A and another for region B, with no shared structure between them. We wish to contrast the above model with one that includes both shared and independent latent variables:
\begin{bmatrix} \mathbf{y}^A \\ \mathbf{y}^B \end{bmatrix} \sim \mathrm{Poiss}\left(\Delta\, f\left(\begin{bmatrix} W_A & 0 & W^A_S \\ 0 & W_B & W^B_S \end{bmatrix}\begin{bmatrix} \mathbf{x}^A \\ \mathbf{x}^B \\ \mathbf{x}^S \end{bmatrix}\right)\right)        (10)
Here x^S refers to latent variables that are shared between regions, while x^A and x^B denote region-specific latent variables. This block structure in the loadings matrix W allows us to extract shared and independent sources of variability using either GP [70, 71] or LDS [65] priors over the latent variables. We show a schematic for this model in Figure 3: a loadings matrix with independent and shared components maps independent and shared latents to a set of firing rates, which drive Poisson spiking in populations of neurons in each brain region.
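A generative sketch of this model is given below. For brevity the GP/LDS latents are replaced by a cheap smoothed-noise stand-in, and all dimensions and scalings are illustrative rather than drawn from [65, 70, 71]:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(4)
T, dt = 500, 0.01
nA, nB = 30, 25                      # neurons in regions A and B
pA, pB, pS = 2, 2, 1                 # region-specific and shared latent dims

def smooth_latents(p, T, rng, tau=20.0):
    # cheap stand-in for GP/LDS latents: temporally smoothed white noise,
    # roughly rescaled to keep the variance of order one
    return gaussian_filter1d(rng.normal(size=(T, p)), tau, axis=0) * np.sqrt(tau)

xA, xB, xS = (smooth_latents(p, T, rng) for p in (pA, pB, pS))

# block-structured loadings matrix W of Eq. (10):
#   [ W_A   0    W_S^A ]
#   [  0   W_B   W_S^B ]
W = np.zeros((nA + nB, pA + pB + pS))
W[:nA, :pA] = rng.normal(size=(nA, pA))            # W_A  (region A only)
W[nA:, pA:pA + pB] = rng.normal(size=(nB, pB))     # W_B  (region B only)
W[:, pA + pB:] = rng.normal(size=(nA + nB, pS))    # shared loadings

x = np.concatenate([xA, xB, xS], axis=1)           # (T, pA + pB + pS)
rates = np.logaddexp(0.0, x @ W.T)                 # softplus nonlinearity f
spikes = rng.poisson(rates * dt)                   # (T, nA + nB) spike counts
```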
Figure 3: The multi-region Poisson GPFA (Gaussian-process factor analysis) model contains a block-structured loadings matrix, with latent variables local to each region, and “shared” latents that capture shared variability across regions. Latent activity is passed through a rectifying nonlinearity f to obtain firing rates, which drive spiking via a Poisson process.
These shared latent variable models for multi-region data are structurally related to canonical correlation analysis (CCA) [72], which has previously been used in neuroscience settings for multi-region data [4]. In particular, the LVMs outlined above relate closely to statistical variants of CCA known in the machine learning community as probabilistic CCA [73] and Bayesian CCA (BCCA) [74]. Though probabilistic CCA and BCCA do not include neuroscience-specific model features such as point-wise nonlinearities, Poisson observations, and GP priors, they are factor-analytic models that identify shared low-dimensional subspaces mapping to distinct sets of observations [73]. In the case of BCCA, these models, like the LVMs described above, express region-specific and shared factors through a block-structured loadings matrix [74, 75]. Ongoing work in the machine learning community on probabilistic CCA and BCCA may have useful implications for the neuroscience community. In particular, recent developments incorporating sparsity priors over the block loadings matrix may be well suited to neuroscience contexts [75].
Implementing LVMs with block loadings structure is an exciting direction for multi-region analysis. Preliminary results suggest that models of this sort can incorporate time delays to identify latents that represent feedforward and feedback communication between brain regions [71]. Other work has isolated trial-varying latent structure that is shared across regions during repeated presentations of a visual stimulus [70]. However, a technical challenge remains in properly identifying the dimensionality of the shared and independent subspaces, as an exhaustive search over all possibilities is computationally cumbersome. Greedy approaches, which add dimensions one at a time when doing so improves cross-validation performance, are a good starting point [70]. An exciting possibility would be to impose an explicit regularization over the loadings matrix, similar to the approaches used in [75], such as an automatic relevance determination prior or nuclear norm regularization. These more sophisticated statistical methods would isolate within- and across-region subspaces without the need to directly specify the block structure of the loadings matrix.
Hybrid approaches: low-dimensional regression models
So far we have considered regression models, which seek to explain activity in one brain area as a function of activity in another, and latent variable models, which seek to explain activity in multiple brain areas using a small number of components or factors. It is of course natural to seek to combine the strengths of these two approaches: models that can explain activity in one brain area in terms of a small number of components arising from another brain area. Dimensionality-reduction methods have had great success in extracting interpretable features from high-dimensional neural recordings [9]. Combining dimensionality reduction with regression thus provides a powerful approach for extracting more interpretable statistical interactions between brain regions. Although we have emphasized models with Poisson noise up to now, this section will focus primarily on models with Gaussian noise.
A simple approach to low-dimensional regression for multi-region data is to work in two stages: first perform dimensionality reduction on the upstream region, and then use regression on this low dimensional representation to predict downstream activity. When using principal components analysis (PCA) for dimensionality reduction followed by ordinary least squares regression, this technique is known as principal components regression (PCR). Kaufman et al. [76] used PCR to analyze the relationship between neural activity in pre-motor (PMd) and primary motor cortex (M1) during movement preparation, arguing that PMd activity falls primarily in the null space of the linear mapping to M1. Semedo et al. [21] used a similar approach, substituting factor analysis for PCA, to analyze dependencies in neural activity between V1 and V2, concluding that principal axes of V1 variability were not the dimensions that best predicted activity in downstream area V2.
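A two-stage sketch of this idea, using scikit-learn for both stages, is shown below; the wrapper function and toy data are our own illustration, and substituting FactorAnalysis for PCA gives a variant closer to the approach of [21]:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def pcr_predict(YA_train, YB_train, YA_test, n_components=5):
    """Principal components regression: reduce upstream region A to a
    low-dimensional representation, then regress downstream region B on it."""
    dimred = PCA(n_components=n_components).fit(YA_train)
    reg = LinearRegression().fit(dimred.transform(YA_train), YB_train)
    return reg.predict(dimred.transform(YA_test))

# toy usage with random data standing in for binned population responses
rng = np.random.default_rng(6)
YA, YB = rng.normal(size=(800, 40)), rng.normal(size=(800, 30))
YB_hat = pcr_predict(YA[:600], YB[:600], YA[600:])
```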
An alternate approach to low-dimensional regression modeling is reduced rank regression (RRR), which unifies dimensionality reduction and regression in a single model. The RRR model explicitly imposes a rank constraint on the parameters of a regression model governing the mapping from one brain region to another. For linear regression with Gaussian noise, the RRR objective is
\hat{W} = \arg\min_{W \,:\, \mathrm{rank}(W) \le r} \big\| Y_B - Y_A W \big\|_F^2        (11)
where YA denotes a T × m matrix of responses from m area-A neurons over T time bins, YB denotes a T × n matrix of responses from n downstream area-B neurons, and W is an m × n matrix of regression weights of rank at most r.
Remarkably, there is a closed-form solution to this optimization problem. It can be obtained using the top r right singular vectors of Y_A Ŵ_OLS, where Ŵ_OLS = (Y_A^⊤ Y_A)^{-1} Y_A^⊤ Y_B is the matrix of ordinary least squares regression weights. If we assemble these singular vectors into a matrix V, the RRR solution is simply Ŵ_RRR = Ŵ_OLS V V^⊤. The reduced-rank regression model can of course be extended to the setting of Poisson GLMs, although in this case fitting requires numerical optimization of the log-likelihood, as there is no longer a closed-form singular-vector solution.
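The following NumPy sketch implements this closed-form solution and checks it on simulated low-rank data; the toy dimensions and variable names are our own:

```python
import numpy as np

def reduced_rank_regression(Y_A, Y_B, rank):
    """Closed-form RRR: OLS fit, then projection onto the top-`rank`
    right singular vectors of the fitted values Y_A @ W_ols."""
    W_ols, *_ = np.linalg.lstsq(Y_A, Y_B, rcond=None)     # (m, n) OLS weights
    _, _, Vt = np.linalg.svd(Y_A @ W_ols, full_matrices=False)
    V = Vt[:rank].T                                       # (n, rank)
    return W_ols @ V @ V.T                                # rank-constrained W

# toy check: a rank-2 ground truth should be recovered up to noise
rng = np.random.default_rng(5)
Y_A = rng.normal(size=(1000, 20))
W_true = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))   # rank 2
Y_B = Y_A @ W_true + 0.1 * rng.normal(size=(1000, 15))
W_hat = reduced_rank_regression(Y_A, Y_B, rank=2)
```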
The rank constraint in the reduced-rank regression model can be seen as imposing a bottleneck for information transmission between brain areas. For example, if we use a rank-2 RRR model, the interactions between brain areas are confined to a 2-dimensional subspace, regardless of the total number of latent dimensions in each area. Recent work from Semedo et al. [21] coined the term ‘communication subspace’ for this phenomenon, and showed that the dependencies between neural activity in visual areas V1 and V2 are indeed low-dimensional relative to the dimensionality of activity within each brain region.
The rank constraint in RRR also provides a form of regularization by reducing the number of parameters and allowing for parameter sharing across neurons. The RRR model may therefore achieve better generalization performance than a (full-rank) GLM with all-to-all coupling, due to reduced overfitting. More importantly, RRR allows for direct estimation of the communication subspace from one population to another, without requiring separate steps for dimensionality reduction and regression. In addition to being more accurate, the RRR model is also more interpretable, because it provides a concise summary of the statistical dependencies between areas.
Challenges and opportunities
Recent advances in technology have ushered in a new era of multi-region optical and electrophysiological recording. These techniques hold great potential for deciphering the complex flow of information within neural circuits. However, the challenges posed by analyzing such datasets highlight the need for innovative new statistical tools and approaches. As we move forward with the development of such tools, we propose that future work be guided by the principles of interpretable structure, scalability, and statistical efficiency.
Thinking carefully about interpretable statistical structure is important given the large number of possible modeling choices for multi-region data. When developing a model for multi-region data, scientists should consider what the model fits might tell us in the context of a particular hypothesis or set of hypotheses of interest. Successful examples include multi-region models built to understand signal and noise representations [70], alignment of activity subspaces [21], feedforward and feedback signals [71], and functional coupling properties across and within regions [23]. In each of these cases, a particular neuroscience question or network property motivates the development of the model, and the resulting fits provide insight into the network property of interest. Developing or using a multi-region model without first thinking about how to interpret it may be a useful mathematical exercise, but a good fit to neural data will not on its own provide insight into multi-region neural activity.
Scalability of statistical models for multi-region data is vital for future work. High-throughput imaging and electrophysiology preparations will soon routinely record from thousands of neurons simultaneously. The estimation procedures used to fit the models described here must contend not only with an expanding volume of data, but also with the exploding number of parameters that accompanies it. In fully-coupled regression models, for example, the number of coupling parameters grows quadratically with the number of cells. One promising solution is to rely on approximate modeling methods [38]. Another, which applies more generally to population-level modeling, is to explicitly reduce the number of degrees of freedom of the model, for instance by applying regression and latent variable techniques synergistically.
We are especially excited about possibilities for combining latent variable models and regression models, and the potential for such combined models to reveal new insights into the flow of information between and within brain regions. Model-based targeted dimensionality reduction [77, 78] represents one recent method that combines a regression model with explicit low-dimensional constraints on population activity. Bayesian latent structure discovery [79] is another approach that merges regression and latent variable models. Both of these approaches model population-level activity by sharing degrees of freedom among cells while leaving enough flexibility to identify functionally significant, low-dimensional latent structure. These approaches have yet to be used in the multi-region setting, but we expect their extension to it to be straightforward.
Conclusion
The development of statistical analysis tools for multi-region data is in its infancy. We believe that a conceptual understanding of information processing in the brain depends on appropriate characterization of population-level activity [80]. As multi-region datasets become increasingly common, there will be greater need for analytical tools and appropriate statistical summaries for each region as well as for inter-region interactions. Here, we have described two general methodological approaches for understanding multi-region data that are newly active areas of research, and we have suggested directions for future work.
Acknowledgments
SLK was supported by NIH grant F32MH115445-03. JWP was supported by grants from the Simons Collaboration on the Global Brain (SCGB AWD543027), the NIH BRAIN initiative (NS104899 and R01EB026946), and a U19 NIH-NINDS BRAIN Initiative Award (5U19NS104648).
References
- [1]. Gall Franz Joseph. On the functions of the brain and of each of its parts: With observations on the possibility of determining the instincts, propensities, and talents, or the moral and intellectual dispositions of men and animals, by the configuration of the brain and head, volume 1. Marsh, Capen & Lyon, 1835.
- [2]. Gross Charles G. A hole in the head: more tales in the history of neuroscience. MIT Press, 2012.
- [3]. Clarke Edwin and O’Malley Charles Donald. The human brain and spinal cord: A historical study illustrated by writings from antiquity to the twentieth century. Norman Publishing, 1996.
- [4]. Steinmetz Nicholas A, Zatka-Haas Peter, Carandini Matteo, and Harris Kenneth D. Distributed coding of choice, action and engagement across the mouse brain. Nature, 576(7786):266–273, 2019.
- [5]. Stirman Jeffrey N, Smith Ikuko T, Kudenov Michael W, and Smith Spencer L. Wide field-of-view, multi-region, two-photon imaging of neuronal activity in the mammalian brain. Nature Biotechnology, 34(8):857, 2016.
- [6]. Stosiek Christoph, Garaschuk Olga, Holthoff Knut, and Konnerth Arthur. In vivo two-photon calcium imaging of neuronal networks. Proceedings of the National Academy of Sciences, 100(12):7319–7324, 2003.
- [7]. Tian Lin, Hires S Andrew, Mao Tianyi, Huber Daniel, Chiappe M Eugenia, Chalasani Sreekanth H, Petreanu Leopoldo, Akerboom Jasper, McKinney Sean A, Schreiter Eric R, et al. Imaging neural activity in worms, flies and mice with improved GCaMP calcium indicators. Nature Methods, 6(12):875, 2009.
- [8]. Jun James J, Steinmetz Nicholas A, Siegle Joshua H, Denman Daniel J, Bauza Marius, Barbarits Brian, Lee Albert K, Anastassiou Costas A, Andrei Alexandru, Aydın Çağatay, et al. Fully integrated silicon probes for high-density recording of neural activity. Nature, 551(7679):232–236, 2017.
- [9]. Cunningham John P and Yu Byron M. Dimensionality reduction for large-scale neural recordings. Nature Neuroscience, 17(11):1500–1509, 2014.
- [10]. Moreno-Bote Rubén, Beck Jeffrey, Kanitscheider Ingmar, Pitkow Xaq, Latham Peter, and Pouget Alexandre. Information-limiting correlations. Nature Neuroscience, 17(10):1410–1417, October 2014.
- [11]. Tkačik Gašper, Marre Olivier, Amodei Dario, Schneidman Elad, Bialek William, and Berry Michael J II. Searching for collective behavior in a large network of sensory neurons. PLoS Computational Biology, 10(1):e1003408, 2014.
- [12]. Stringer Carsen, Pachitariu Marius, Steinmetz Nicholas, Carandini Matteo, and Harris Kenneth D. High-dimensional geometry of population responses in visual cortex. Nature, 2019.
- [13]. Stringer Carsen, Michaelos Michalis, and Pachitariu Marius. High precision coding in visual cortex. bioRxiv, 679324, 2019.
- [14]. Rumyantsev Oleg I, Lecoq Jérôme A, Hernandez Oscar, Zhang Yanping, Savall Joan, Chrapkiewicz Radosław, Li Jane, Zeng Hongkui, Ganguli Surya, and Schnitzer Mark J. Fundamental bounds on the fidelity of sensory cortical coding. Nature, 580(7801):100–105, 2020.
- [15]. de Vries Saskia EJ, Lecoq Jerome A, Buice Michael A, Groblewski Peter A, Ocker Gabriel K, Oliver Michael, Feng David, Cain Nicholas, Ledochowitsch Peter, Millman Daniel, et al. A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. Nature Neuroscience, 23(1):138–151, 2020.
- [16]. Moore George P, Segundo Jose P, Perkel Donald H, and Levitan Herbert. Statistical signs of synaptic interaction in neurons. Biophysical Journal, 10(9):876, 1970.
- [17]. Allum JHJ, Hepp-Reymond M-C, and Gysin R. Cross-correlation analysis of interneuronal connectivity in the motor cortex of the monkey. Brain Research, 231(2):325–334, 1982.
- [18]. Toyama K, Kimura M, and Tanaka K. Cross-correlation analysis of interneuronal connectivity in cat visual cortex. Journal of Neurophysiology, 46(2):191–201, 1981.
- [19]. De Blasi Stefano, Ciba Manuel, Bahmer Andreas, and Thielemann Christiane. Total spiking probability edges: A cross-correlation based method for effective connectivity estimation of cortical spiking neurons. Journal of Neuroscience Methods, 312:169–181, 2019.
- [20]. Yu Yiyi, Stirman Jeffrey N., Dorsett Christopher R., and Smith Spencer L. Mesoscale correlation structure with single cell resolution during visual coding. bioRxiv, 2019.
- [21]. Semedo João D, Zandvakili Amin, Machens Christian K, Yu Byron M, and Kohn Adam. Cortical areas interact through a communication subspace. Neuron, 102(1):249–259, 2019. ** Uses regression-based approaches in combination with dimensionality reduction techniques to demonstrate that V1 and V2 interact via a “communication subspace”. In particular, the authors demonstrate that low-dimensional mappings of V1 activity best predict V2 activity, and that these dimensions do not align with the principal dimensions of V1.
- [22]. Yates Jacob L, Park Il Memming, Katz Leor N, Pillow Jonathan W, and Huk Alexander C. Functional dissection of signal and noise in MT and LIP during decision-making. Nature Neuroscience, 20(9):1285, 2017. * Fit multi-region Poisson GLMs with coupling filters to investigate functional connections between MT and LIP neurons. The authors show that trial-by-trial activity in LIP does not depend on MT activity.
- [23]. Hart Eric and Huk Alexander C. Recurrent circuit dynamics underlie persistent activity in the macaque frontoparietal network. eLife, 9:e52460, 2020. * Use a generalized linear model (GLM) to identify functional coupling between activity in LIP and FEF (parietal and frontal) areas. The authors used smooth temporal basis functions in their GLM to reduce the dimensionality of the coupling filters. Model fits show strong long-timescale recurrent excitation between LIP and FEF.
- [24]. Okatan M, Wilson M, and Brown E. Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity. Neural Computation, 17:1927–1961, 2005.
- [25]. Quinn Christopher, Coleman Todd, Kiyavash Negar, and Hatsopoulos Nicholas. Estimating the directed information to infer causal relationships in ensemble neural spike train recordings. Journal of Computational Neuroscience, 30:17–44, 2011.
- [26]. Seth Anil K, Barrett Adam B, and Barnett Lionel. Granger causality analysis in neuroscience and neuroimaging. Journal of Neuroscience, 35(8):3293–3297, 2015.
- [27]. Moore BJ, Berger T, and Song D. Validation of a convolutional neural network model for spike transformation using a generalized linear model. In 2020 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 3236–3239, 2020.
- [28]. Brillinger DR. Maximum likelihood analysis of spike trains of interacting nerve cells. Biological Cybernetics, 59(3):189–200, 1988.
- [29]. Paninski Liam. Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15(4):243–262, 2004.
- [30]. Truccolo Wilson, Eden Uri T, Fellows Matthew R, Donoghue John P, and Brown Emery N. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2):1074–1089, 2005.
- [31]. Pillow Jonathan W, Shlens Jonathon, Paninski Liam, Sher Alexander, Litke Alan M, Chichilnisky EJ, and Simoncelli Eero P. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207):995–999, 2008.
- [32]. Hardcastle Kiah, Maheswaranathan Niru, Ganguli Surya, and Giocomo Lisa M. A multiplexed, heterogeneous, and adaptive code for navigation in medial entorhinal cortex. Neuron, 94(2):375–387, 2017.
- [33]. Brown Emery N., Kass Robert E., and Mitra Partha P. Multiple neural spike train data analysis: state-of-the-art and future challenges. Nature Neuroscience, 7(5):456–461, May 2004.
- [34]. Ventura V, Cai C, and Kass RE. Trial-to-trial variability and its effect on time-varying dependency between two neurons. Journal of Neurophysiology, 94(4):2928–2939, 2005.
- [35]. Stevenson Ian H, Rebesco James M, Hatsopoulos Nicholas G, Haga Zach, Miller Lee E, and Kording Konrad P. Bayesian inference of functional connectivity and network structure from spikes. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17(3):203–213, 2008.
- [36]. Rikhye Rajeev V, Gilra Aditya, and Halassa Michael M. Thalamic regulation of switching between cortical representations enables cognitive flexibility. Nature Neuroscience, 21(12):1753–1763, 2018.
- [37]. Perich Matthew G, Gallego Juan A, and Miller Lee E. A neural population mechanism for rapid learning. Neuron, 100(4):964–976, 2018.
- [38]. Zoltowski David and Pillow Jonathan W. Scaling the Poisson GLM to massive neural datasets through polynomial approximations. In Advances in Neural Information Processing Systems, pages 3517–3527, 2018. * Develop scalable methods for fitting Poisson GLMs, and apply them to fit a fully-coupled GLM with sparse priors on connection weights to simultaneous recordings of 800 neurons.
- [39]. Dowling Matthew, Zhao Yuan, and Park Il Memming. Non-parametric generalized linear model, 2020.
- [40]. Pillow Jonathan W and Scott James. Fully Bayesian inference for neural models with negative-binomial spiking. In Advances in Neural Information Processing Systems, pages 1898–1906, 2012.
- [41]. Goris RLT, Movshon JA, and Simoncelli EP. Partitioning neuronal variability. Nature Neuroscience, 17(6):858–865, 2014.
- [42]. Williamson Ross S., Sahani Maneesh, and Pillow Jonathan W. The equivalence of information-theoretic and likelihood-based methods for neural dimensionality reduction. PLoS Computational Biology, 11(4):e1004141, 2015.
- [43]. Keeley Stephen L., Zoltowski David M., Yu Yiyi, Yates Jacob L., Smith Spencer L., and Pillow Jonathan W. Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations. arXiv preprint arXiv:1906.03318, 2019.
- [44]. Stevenson Ian H. Omitted variable bias in GLMs of neural spiking activity. Neural Computation, 30(12):3227–3258, 2018.
- [45]. Macke Jakob H, Buesing Lars, Cunningham John P, Yu Byron M, Shenoy Krishna V, and Sahani Maneesh. Empirical models of spiking in neural populations. In Advances in Neural Information Processing Systems, pages 1350–1358, 2011.
- [46]. Vidne Michael, Ahmadian Yashar, Shlens Jonathon, Pillow Jonathan W, Kulkarni Jayant, Litke Alan M, Chichilnisky EJ, Simoncelli Eero, and Paninski Liam. Modeling the impact of common noise inputs on the network activity of retinal ganglion cells. Journal of Computational Neuroscience, 33(1):97–121, 2012.
- [47]. Gerwinn Sebastian, Macke Jakob H., and Bethge Matthias. Bayesian inference for generalized linear models for spiking neurons. Frontiers in Computational Neuroscience, 2010.
- [48]. Paninski L, Ahmadian Y, Ferreira Daniel G., Koyama S, Rad Kamiar R., Vidne M, Vogelstein J, and Wu W. A new look at state-space models for neural data. Journal of Computational Neuroscience, 29(1–2):107–126, 2010.
- [49]. Buesing L, Macke JH, and Sahani M. Spectral learning of linear dynamics from generalised-linear observations with application to neural population data. In Advances in Neural Information Processing Systems, pages 1682–1690, 2012.
- [50]. Archer EW, Koster U, Pillow JW, and Macke JH. Low-dimensional models of neural population activity in sensory cortical circuits. In Advances in Neural Information Processing Systems, pages 343–351, 2014.
- [51]. Macke JH, Buesing L, and Sahani M. Estimating state and parameters in state space models of spike trains. Advanced State Space Methods for Neural and Clinical Data, page 137, 2015.
- [52]. Gao Yuanjun, Busing Lars, Shenoy Krishna V, and Cunningham John P. High-dimensional neural spike train analysis with generalized count linear dynamical systems. In Advances in Neural Information Processing Systems, pages 2044–2052, 2015.
- [53]. Kao Jonathan C, Nuyujukian Paul, Ryu Stephen I, Churchland Mark M, Cunningham John P, and Shenoy Krishna V. Single-trial dynamics of motor cortex and their applications to brain-machine interfaces. Nature Communications, 6, 2015.
- [54]. Yu Byron M, Cunningham John P, Santhanam Gopal, Ryu Stephen I, Shenoy Krishna V, and Sahani Maneesh. Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity. In Advances in Neural Information Processing Systems, pages 1881–1888, 2009.
- [55]. Pfau David, Pnevmatikakis Eftychios A, and Paninski Liam. Robust learning of low-dimensional dynamics from large neural ensembles. In Advances in Neural Information Processing Systems, pages 2391–2399, 2013.
- [56]. Nam Hooram. Poisson extension of Gaussian process factor analysis for modeling spiking neural populations. Master’s thesis, Department of Neural Computation and Behaviour, Max Planck Institute for Biological Cybernetics, Tübingen, August 2015.
- [57]. Zhao Yuan and Park Il Memming. Variational latent Gaussian process for recovering single-trial dynamics from population spike trains. Neural Computation, 29(5):1293–1316, 2017.
- [58]. Duncker Lea and Sahani Maneesh. Temporal alignment and latent Gaussian process factor inference in population spike trains. In Advances in Neural Information Processing Systems, pages 10445–10455, 2018.
- [59]. Zhao Yuan, Yates Jacob Lachenmyer, Levi Aaron Joseph, Huk Alexander Christopher, and Park Il Memming. Stimulus-choice (mis)alignment in primate MT cortex. bioRxiv, 2019.
- [60]. Wu Anqi, Roy Nicholas A, Keeley Stephen, and Pillow Jonathan W. Gaussian process based nonlinear latent structure discovery in multivariate spike train data. In Advances in Neural Information Processing Systems, pages 3496–3505, 2017.
- [61]. Linderman Scott, Johnson Matthew, Miller Andrew, Adams Ryan, Blei David, and Paninski Liam. Bayesian learning and inference in recurrent switching linear dynamical systems. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 914–922, 2017.
- [62]. Nassar Josue, Linderman Scott, Bugallo Monica, and Park Il Memming. Tree-structured recurrent switching linear dynamical systems for multi-scale modeling. In International Conference on Learning Representations (ICLR), 2019.
- [63]. Linderman Scott W, Nichols Annika LA, Blei David M, Zimmer Manuel, and Paninski Liam. Hierarchical recurrent state space models reveal discrete and continuous dynamics of neural activity in C. elegans. bioRxiv, 621540, 2019.
- [64]. Zoltowski David M, Pillow Jonathan W, and Linderman Scott W. Unifying and generalizing models of neural dynamics during decision-making. arXiv preprint arXiv:2001.04571, 2020.
- [65]. Glaser Joshua I, Whiteway Matthew R, Cunningham John P, Paninski Liam, and Linderman Scott W. Recurrent switching dynamical systems models for multiple interacting neural populations. bioRxiv, 2020.
- [66]. Archer Evan, Park Il Memming, Buesing Lars, Cunningham John, and Paninski Liam. Black box variational inference for state space models. arXiv preprint arXiv:1511.07367, 2015.
- [67]. Gao Y, Archer EW, Paninski L, and Cunningham JP. Linear dynamical neural population models through nonlinear embeddings. In Advances in Neural Information Processing Systems, pages 163–171, 2016.
- [68]. Ainsworth Samuel K, Foti Nicholas J, Lee Adrian KC, and Fox Emily B. oi-VAE: Output interpretable VAEs for nonlinear group factor analysis. In International Conference on Machine Learning, pages 119–128, 2018.
- [69]. Pandarinath Chethan, O’Shea Daniel J., Collins Jasmine, Jozefowicz Rafal, Stavisky Sergey D., Kao Jonathan C., Trautmann Eric M., Kaufman Matthew T., Ryu Stephen I., Hochberg Leigh R., Henderson Jaimie M., Shenoy Krishna V., Abbott LF, and Sussillo David. Inferring single-trial neural population dynamics using sequential auto-encoders. Nature Methods, 15(10):805–815, 2018.
- [70]. Keeley Stephen L, Aoi Mikio C, Yu Yiyi, Smith Spencer LaVere, and Pillow Jonathan W. Identifying signal and noise structure in neural population activity with Gaussian process factor models. bioRxiv, 2020. ** Find latent structure in multi-region recordings that is shared across trials (signal) and trial-specific (noise). Using cross-validation methods, the authors find trial-by-trial latent structure that is shared across brain regions. These results somewhat disagree with the findings of [22], though the data in [70] are from rodent visual cortex.
- [71]. Gokcen Evren, Semedo Joao, Zandvakili Amin, Machens Christian, Kohn Adam, and Yu Byron. Dissecting feedforward and feedback interactions between populations of neurons. In Cosyne Abstracts, Denver, CO, 2020. * Use a Gaussian process latent variable model with temporal delays to understand inter-region communication in visual cortex. The authors identify distinct feedforward and feedback latent structures across visual cortical areas by associating these structures with particular delay values learned in the latent variable model. They call this model Delayed Latents Across Groups (DLAG).
- [72]. Hotelling Harold. Simplified calculation of principal components. Psychometrika, 1(1):27–35, 1936.
- [73]. Bach Francis R and Jordan Michael I. A probabilistic interpretation of canonical correlation analysis. Technical report, University of California, Berkeley, 2005.
- [74]. Klami Arto, Virtanen Seppo, and Kaski Samuel. Bayesian canonical correlation analysis. Journal of Machine Learning Research, 14(Apr):965–1003, 2013.
- [75]. Zhao Shiwen, Gao Chuan, Mukherjee Sayan, and Engelhardt Barbara E. Bayesian group factor analysis with structured sparsity. Journal of Machine Learning Research, 17(1):6868–6914, 2016.
- [76]. Kaufman Matthew T, Churchland Mark M, Ryu Stephen I, and Shenoy Krishna V. Cortical activity in the null space: permitting preparation without movement. Nature Neuroscience, 17(3):440–448, 2014.
- [77]. Aoi Mikio and Pillow Jonathan W. Model-based targeted dimensionality reduction for neuronal population data. In Advances in Neural Information Processing Systems, pages 6690–6699, 2018.
- [78]. Aoi Mikio C., Mante Valerio, and Pillow Jonathan W. Prefrontal cortex exhibits multidimensional dynamic encoding during decision-making. Nature Neuroscience, October 2020.
- [79]. Linderman Scott, Adams Ryan P, and Pillow Jonathan W. Bayesian latent structure discovery from multi-neuron recordings. In Advances in Neural Information Processing Systems, pages 2002–2010, 2016.
- [80]. Saxena Shreya and Cunningham John P. Towards the neural population doctrine. Current Opinion in Neurobiology, 55:103–111, 2019.
