
This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2023 Sep 12: arXiv:2309.06402v1 [Version 1]

Expressive dynamics models with nonlinear injective readouts enable reliable recovery of latent features from neural activity

Christopher Versteeg 1, Andrew R Sedler 1,2, Jonathan D McCart 1,2, Chethan Pandarinath 1,2
PMCID: PMC10516113  PMID: 37744459

Abstract

The advent of large-scale neural recordings has enabled new approaches that aim to discover the computational mechanisms of neural circuits by understanding the rules that govern how their state evolves over time. While these neural dynamics cannot be directly measured, they can typically be approximated by low-dimensional models in a latent space. How these models represent the mapping from latent space to neural space can affect the interpretability of the latent representation. We show that typical choices for this mapping (e.g., linear or MLP) often lack the property of injectivity, meaning that changes in latent state are not obligated to affect activity in the neural space. During training, non-injective readouts incentivize the invention of dynamics that misrepresent the underlying system and the computation it performs. Combining our injective Flow readout with prior work on interpretable latent dynamics models, we created the Ordinary Differential equations autoencoder with Injective Nonlinear readout (ODIN), which learns to capture latent dynamical systems that are nonlinearly embedded into observed neural activity via an approximately injective nonlinear mapping. We show that ODIN can recover nonlinearly embedded systems from simulated neural activity, even when the nature of the system and embedding are unknown. Additionally, we show that ODIN enables the unsupervised recovery of underlying dynamical features (e.g., fixed points) and embedding geometry. When applied to biological neural recordings, ODIN can reconstruct neural activity with comparable accuracy to previous state-of-the-art methods while using substantially fewer latent dimensions. Overall, ODIN’s accuracy in recovering ground-truth latent features and ability to accurately reconstruct neural activity with low dimensionality make it a promising method for distilling interpretable dynamics that can help explain neural computation.

1. Introduction

Recent evidence has shown that when artificial recurrent neural networks are trained to perform tasks, the rules that govern how the internal activity evolves over time (i.e., the network dynamics) can provide insight into how the network performs the underlying computation [1–4]. Given the conceptual similarities between artificial neural networks and biological neural circuits, it may be possible to apply these same dynamical analyses to brain activity to gain insight into how neural circuits perform complex sensory, cognitive, and motor processes [5–7]. However, unlike in artificial networks, we cannot easily interrogate the dynamics of biological neural circuits and must first estimate them from observed neural activity.

Fortunately, advances in recording technology have dramatically increased the number of neurons that can be simultaneously recorded, providing ample data for novel population-level analyses of neural activity [8–10]. In these datasets, the activity of hundreds or thousands of neurons can often be captured by relatively low-dimensional subspaces [11], orders of magnitude smaller than the total number of neurons. Neural activity in these latent spaces seems to evolve according to consistent sets of rules (i.e., latent dynamics) [12, 6]. Assuming no external inputs, these rules can be expressed mathematically as:

z_{t+1} = z_t + f(z_t) (1)
y_t = exp(g(z_t)) (2)
x_t ~ Poisson(y_t) (3)

where z_t ∈ R^D represents the latent state at time t, f(·): R^D → R^D is the vector field governing the dynamical system, y_t ∈ R^N denotes the firing rates of the N neurons, g(·): R^D → R^N maps latent activity into log-firing rates, and x_t ∈ R^N denotes the observed spike counts at time t, assuming the spiking activity follows a Poisson distribution with time-varying rates given at each moment t by y_t.
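
Eqs. (1)–(3) can be simulated directly. The following is a minimal numpy sketch, using a toy 2D linear rotation for f and a linear g purely for illustration (the paper's actual f is the Arneodo system and its g is nonlinear):

```python
import numpy as np

rng = np.random.default_rng(0)

D, N, T = 2, 12, 100
W = rng.uniform(-0.5, 0.5, size=(N, D))   # toy linear embedding g(z) = W z

def f(z):
    # Toy latent dynamics: a slow rotation (stand-in for a real system).
    A = np.array([[0.0, -0.1], [0.1, 0.0]])
    return A @ z

z = np.zeros((T, D))
z[0] = [1.0, 0.0]
for t in range(T - 1):
    z[t + 1] = z[t] + f(z[t])   # Eq. (1): discrete-time latent update

y = np.exp(z @ W.T)             # Eq. (2): firing rates from log-rates
x = rng.poisson(y)              # Eq. (3): observed spike counts
```

The modeling problem described below is the inverse of this generative process: given only x, recover estimates of z, f, and g.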

Unfortunately, any latent system can be equivalently described by many combinations of dynamics f and embeddings g, which makes the search for a unique latent system futile. However, versions of a latent system’s dynamics f and embedding g that are less complex and use fewer latent dimensions can be easier to interpret than alternative representations that are more complex and/or higher-dimensional. Models of latent dynamics that can discover simple and low-dimensional representations will make it easier to link latent dynamics to neural computation.

A popular approach to estimate neural dynamics [13–15] is to use neural population dynamics models (NPDMs), which model neural activity as a latent dynamical system embedded into neural activity. We refer to the components of an NPDM that learn the dynamics and embedding as the generator fˆ and the readout gˆ, respectively. When modeling neural activity, the generator and readout are jointly trained to infer firing rates yˆ that maximize the likelihood of the observed neural activity x.

Using NPDMs to estimate underlying dynamics and embedding implicitly assumes that good reconstruction performance (i.e., xˆ ≈ x) implies interpretable estimates of the underlying system (i.e., zˆ ≈ z, fˆ ≈ f, gˆ ≈ g). However, recent work has shown that when the state dimensionality of the generator Dˆ is larger than a system’s latent dimensionality D, high reconstruction performance may actually correspond to estimates of the latent system that are overly complex or misleading and therefore harder to interpret [15]. At present, reconstruction performance is seemingly an unreliable indicator for the interpretability of the learned dynamics.

This vulnerability to learning overly complex latent features might emerge from the fact that, without constraints on the readout gˆ, changes in the latent state are not obligated to have an effect on predicted neural activity. Thus, NPDMs can be rewarded for inventing latent activity that boosts reconstruction performance, even if that latent activity has no direct correspondence to neural activity. A potential solution is to make gˆ injective, which obligates all latent activity to affect neural reconstruction. This would penalize any latent activity that is not reflected in the observed neural activity, thereby putting pressure on the generator fˆ and readout gˆ to learn a more interpretable (i.e., simpler and lower dimensional) representation of the underlying system.

In addition, most previously used readouts gˆ were not expressive enough to model diverse mappings from latent space to neural space, assuming the embedding g to be a relatively simple (often linear) transformation (though there are exceptions [16–18]). Capturing nonlinear embeddings is important because neural activity often lives on a lower-dimensional manifold that is nonlinearly embedded into the higher-dimensional neural space [7]. Therefore, assumptions of linearity are likely to prevent NPDMs from capturing dynamics in their simplest and lowest-dimensional form, making them less interpretable than the latent features learned by NPDMs that can approximate these nonlinearities.

To address these challenges, we propose a novel architecture called the Ordinary Differential equation autoencoder with Injective Nonlinear readout (ODIN), which implements fˆ using a Neural ODE (NODE [19]) and gˆ using a network inspired by invertible ResNets [20–22, 19, 23]. ODIN approximates an injective nonlinear mapping between latent states and neural activity, obligating all latent state variance to appear in the predicted neural activity and penalizing the model for using excessively complex or high-dimensional dynamics to model the underlying system. On synthetic data, ODIN learns representations of the latent system that are more interpretable, with simpler and lower-dimensional latent activity and dynamical features (e.g., fixed points) than alternative readouts. ODIN’s interpretability is also more robust to overestimates of latent dimensionality, and ODIN can recover the nonlinear embedding of synthetic data that evolves on a simulated manifold. When applied to neural activity from a monkey performing a reaching task with obstacles, ODIN reconstructs neural activity comparably to state-of-the-art recurrent neural network (RNN)-based models while requiring far fewer latent state dimensions. In summary, ODIN estimates interpretable latent features from synthetic data and has high reconstruction performance on biological neural recordings, making it a promising tool for understanding how the brain performs computation.

2. Related Work

Many previous models have attempted to understand neural activity through the lens of neural dynamics. Early efforts limited model complexity by constraining both fˆ and gˆ to be linear [24–26]. While these models were relatively straightforward to analyze, they often failed to adequately explain neural activity patterns [27].

Other approaches increased the expressiveness of the modeled dynamics fˆ. RNNs can learn to approximate complex nonlinear dynamics, and have been shown to substantially outperform linear dynamics models in reconstructing neural activity [27]. Unfortunately, RNNs implicitly couple the capacity of the model to the latent state dimensionality, meaning their ability to model complex dynamics relies on having a high-dimensional latent state. In contrast, NODEs can model arbitrarily complex dynamics of embedded dynamical systems at the dimensionality of the system [19, 15]. On synthetic data, NODEs have been shown to recover dynamics more accurately than RNN-based methods [28, 15]. Unlike our approach, however, previous NODE-based models used a linear readout gˆ that lacks injectivity. This can make the accuracy of estimated latent activity vulnerable to overestimates of the latent dimensionality (i.e., when Dˆ>D) and/or fail to capture potential nonlinearities in the embedding g.

Early efforts to allow greater flexibility in gˆ preserved linearity in fˆ, using feed-forward neural networks to nonlinearly embed linear dynamical systems in high-dimensional neural firing rates [16]. More recently, models have used Gaussian processes to approximate nonlinear mappings from latent state to neural firing with tuning curves [17]. Other models have combined nonlinear dynamics models and nonlinear embeddings for applications in behavioral tracking [29] and neural reconstruction [18]. Additional approaches extend these methods to incorporate alternative noise models that may better reflect the underlying firing properties of neurons [16, 30]. While nonlinear, the readouts of these models lacked injectivity in their mapping from latent activity to neural activity.

Many alternative models seek to capture interpretable latent features of a system from observations. One popular approach uses a sparsity penalty on a high-dimensional basis set to derive a sparse symbolic estimate of the governing equations for the system [31]. However, it is unclear whether such a sparse symbolic representation is necessarily a benefit when modeling dynamics in the brain. Another recent model uses contrastive loss and auxiliary behavioral variables to learn low-dimensional representations of latent activity [32]. This approach does not have an explicit dynamics model, however, and so is not amenable to the dynamical analyses performed in this manuscript.

Normalizing flows – a type of invertible neural network – have recently become a staple for generative modeling and density estimation [20, 23]. Some latent variable models have used invertible networks to approximate the mapping from the latent space to neural activity [33] or for generative models of visual cortex activity [34]. To allow this mapping to change dimensionality between the latent space and neural activity, some of these models used a zero-padding procedure similar to the padding used in this manuscript (see Section 3.3.1), which makes the transformation injective rather than invertible [33, 23]. However, these previous approaches did not have explicit dynamics models, making our study, to our knowledge, the first to test whether injective readouts can improve the interpretability of neural population dynamics models.

3. Methods

3.1. Synthetic Neural Data

To determine whether different models can distill an interpretable latent system from observed population activity, we first used reference datasets that were generated using simple ground-truth dynamics f and embedding g. Our synthetic test cases emulate the empirical properties of neural systems, specifically low-dimensional latent dynamics observed through noisy spiking activity [13, 35–37]. We sampled latent trajectories from the Arneodo system (f, D=3) and nonlinearly embedded these trajectories into neural activity via an embedding g. We consider models that can recover the dynamics f and embedding g used to generate these data as providing an interpretable description of the latent system and its relation to the neural activity. Additional detail on data generation, models, and metrics can be found in the Supplementary Material.

Unless otherwise noted, we generated activations for N neurons (N=12) by projecting the simulated latent trajectories Z through a 3×N matrix whose columns were random encoding vectors with elements sampled from a uniform distribution U[−0.5, 0.5] (Fig. 1A, left). We standardized these activations to have zero mean and unit variance and applied a different scaled sigmoid function to each neuron, yielding a matrix of non-negative time-varying firing rates Y. The scaling of each sigmoid function was evenly spaced on a logarithmic scale between 10^0.2 and 10. This process created a diverse set of activation functions ranging from quasi-linear to nearly step-function-like behavior (Fig. 1A, Activation Functions). For one experiment, we used the standard linear-exponential activation function, as described in previous work [15], instead of the scaled sigmoid.
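
The generation pipeline above can be sketched in a few lines of numpy. The exact parameterization of the scaled sigmoid (scale applied inside the sigmoid) and the use of unit-amplitude rates are our assumptions; only the encoding-vector distribution, standardization, and log-spaced scales come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

T, D, N = 1000, 3, 12
Z = rng.standard_normal((T, D))           # stand-in for Arneodo latent trajectories
E = rng.uniform(-0.5, 0.5, size=(D, N))   # columns are random encoding vectors

act = Z @ E                                         # activations, T x N
act = (act - act.mean(axis=0)) / act.std(axis=0)    # zero mean, unit variance

# Per-neuron sigmoid scales, evenly spaced on a log scale from 10^0.2 to 10.
scales = np.logspace(0.2, 1.0, N)
Y = 1.0 / (1.0 + np.exp(-scales * act))    # non-negative firing rates
X = rng.poisson(Y)                         # spike counts (inhomogeneous Poisson)
```

Small scales give nearly linear activation functions over the standardized range, while a scale of 10 saturates quickly, giving step-function-like behavior.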

Figure 1: A) Synthetic neural data generation (left to right). Trajectories from the Arneodo system are projected onto random encoding vectors to compute activations at each timepoint. A scaled sigmoid nonlinearity is applied to convert the activations into firing rates. B) Zero-padded latent dynamics (green) are reversibly warped into higher-dimensional neural activity space (blue). C) The Flow readout maps from latent space to neural space by applying a sequence of K small updates (parameterized by an MLP, bottom). The reverse pass of the Flow maps from neural space to latent space and is implemented by serial subtraction of updates from the same MLP.

We simulated spiking activity X by sampling from inhomogeneous Poisson processes with time-varying rate parameters equal to the firing rates Y of the simulated neurons (Fig. 1A, right). We randomly split 70-point segments of these trials into training and validation datasets (training and validation proportions were 0.8 and 0.2, respectively).

3.2. Biological Neural Data

We evaluated how well our model could reconstruct biological neural activity on a well-characterized dataset [38] included in the Neural Latents Benchmark (NLB) [27]. This dataset is composed of single-unit recordings from primary and pre-motor cortices of a monkey performing a visually-guided reaching task with obstacles, referred to as the Maze task. Trials were trimmed to the window [−250, 350 ms] relative to movement onset, and spiking activity was binned at 20 ms. To compare the reconstruction performance of our model directly against the benchmark, we split the neural activity into held-in and held-out neurons, comprising 137 and 35 neurons, respectively, using the same sets of neurons as were used to assess models for the NLB leaderboard.

3.3. Model Architecture

We used three sequential autoencoder (SAE) variants in this study, with the main difference being the choice of readout module, gˆ(·). In brief, a sequence of binned spike counts x_{1:T} was passed through a bidirectional GRU encoder, whose final hidden states were converted to an initial condition zˆ_0 via a mapping ϕ(·). A modified NODE generator unrolled the initial condition into time-varying latent states zˆ_{1:T}. These were subsequently mapped to inferred rates via the readout gˆ(·) ∈ {Linear, MLP, Flow}. All models were trained for a fixed number of epochs to infer firing rates yˆ_{1:T} that minimize the negative Poisson log-likelihood of the observed spikes x_{1:T}.

h_T = [h_fwd; h_bwd] = BiGRU(x_{1:T}) (4)
zˆ_0 = ϕ(h_T) (5)
zˆ_{t+1} = zˆ_t + α·MLP(zˆ_t) (6)
yˆ_t = exp(gˆ(zˆ_t)) (7)

For models with Linear and MLP readouts, ϕ(·) was a linear map to R^Dˆ. For models with Flow readouts, ϕ(·) was a linear map to R^N followed by the reverse pass of the Flow (see Section 3.3.1). We unrolled the NODE using Euler’s method with a fixed step size equal to the bin width and trained using standard backpropagation for efficiency. A scaling factor (α=0.1) was applied to the output of the NODE’s MLP to stabilize the dynamics during early training. Readouts were implemented as either a single linear layer (Linear), an MLP with two 150-unit ReLU hidden layers (MLP), or a Flow readout (Flow), which contains an MLP with two 150-unit ReLU hidden layers. We refer to these three models as Linear-NODE, MLP-NODE, and ODIN, respectively.
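
Eqs. (4)–(7) can be sketched as a compact PyTorch module. This is a minimal illustration, not the authors' implementation: the 150-unit ReLU hidden layers and α=0.1 come from the text, while the encoder width and other details are illustrative assumptions. The Linear readout variant is shown:

```python
import torch
import torch.nn as nn

class LinearNODE(nn.Module):
    """Minimal sketch of the SAE with a NODE generator and Linear readout."""
    def __init__(self, n_neurons=12, d_latent=3, d_enc=64, alpha=0.1):
        super().__init__()
        self.encoder = nn.GRU(n_neurons, d_enc, batch_first=True,
                              bidirectional=True)
        self.phi = nn.Linear(2 * d_enc, d_latent)          # Eq. (5): h_T -> z_0
        self.vector_field = nn.Sequential(                 # the NODE's MLP
            nn.Linear(d_latent, 150), nn.ReLU(), nn.Linear(150, d_latent))
        self.readout = nn.Linear(d_latent, n_neurons)      # Linear g-hat
        self.alpha = alpha

    def forward(self, x):                                  # x: (batch, T, N)
        _, h = self.encoder(x)                             # h: (2, batch, d_enc)
        z = self.phi(torch.cat([h[0], h[1]], dim=-1))      # initial condition
        latents = []
        for _ in range(x.shape[1]):                        # Euler unroll, Eq. (6)
            latents.append(z)
            z = z + self.alpha * self.vector_field(z)
        z_seq = torch.stack(latents, dim=1)                # (batch, T, d_latent)
        return torch.exp(self.readout(z_seq))              # Eq. (7): rates

model = LinearNODE()
spikes = torch.poisson(torch.ones(4, 70, 12))              # fake binned counts
rates = model(spikes)
```

Training would minimize the Poisson NLL between inferred rates and observed spikes, e.g. `torch.nn.PoissonNLLLoss(log_input=False)(rates, spikes)`.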

3.3.1. Flow Readout

The Flow readout resembles a simplified invertible ResNet [23]. Flow learns a vector field that can reversibly transform data between latent and neural representations (Figure 1B). The Flow readout has three steps: first, we increase the dimensionality of the latent activity zˆ_t to match that of the neural activity by padding the latent state with zeros. This corresponds to an initial estimate of the log-firing rates, log yˆ_{t,0}. Note that zero-padding makes our mapping injective rather than fully invertible (see [23, 33]). The Flow network then uses an MLP to iteratively refine log yˆ_{t,k} over K steps (K=20), after which we apply an exponential to produce the final firing rate predictions, yˆ_t. A scaling factor (β=0.1) was applied to the output of the Flow’s MLP, which prevents the embedding from becoming unstable during the early training period.

log yˆ_{t,0} = [zˆ_t, 0]^T (8)
log yˆ_{t,k+1} = log yˆ_{t,k} + β·MLP(log yˆ_{t,k}) (9)
gˆ(zˆ_t) = log yˆ_{t,K} = log yˆ_t (10)

We also use a reverse pass of the Flow to transform the output of the encoders to initial conditions in the latent space via ϕ(·), approximating the inverse function gˆ⁻¹. Our method subtracts the output of the MLP from the state rather than adding it as in the forward mode (Fig. 1C), a simplified version of the fixed-point iteration procedure described in [23]. We then trim the excess dimensions to recover zˆ ∈ R^Dˆ (in effect, removing the zero-padding dimensions).

log yˆ_{t,k-1} = log yˆ_{t,k} − β·MLP(log yˆ_{t,k}) (11)
gˆ⁻¹(log yˆ_t) = [log yˆ_{t,0,1}, …, log yˆ_{t,0,Dˆ}]^T = zˆ_t (12)
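
A PyTorch sketch of the Flow forward and reverse passes (Eqs. 8–12), with K=20 and β=0.1 from the text. The untrained MLP here is only to illustrate shapes, and the subtraction-based reverse is, as noted, an approximation to the true inverse:

```python
import torch
import torch.nn as nn

class FlowReadout(nn.Module):
    """Sketch of the Flow readout: zero-pad, then K additive refinements."""
    def __init__(self, d_latent=3, n_neurons=12, k_steps=20, beta=0.1):
        super().__init__()
        self.d_latent, self.n_neurons = d_latent, n_neurons
        self.k_steps, self.beta = k_steps, beta
        self.mlp = nn.Sequential(
            nn.Linear(n_neurons, 150), nn.ReLU(), nn.Linear(150, n_neurons))

    def forward(self, z):                        # z: (..., d_latent)
        pad = z.new_zeros(*z.shape[:-1], self.n_neurons - self.d_latent)
        log_y = torch.cat([z, pad], dim=-1)      # Eq. (8): zero-padding
        for _ in range(self.k_steps):            # Eq. (9): additive updates
            log_y = log_y + self.beta * self.mlp(log_y)
        return log_y                             # Eq. (10): predicted log-rates

    def reverse(self, log_y):                    # Eqs. (11)-(12): approx. inverse
        for _ in range(self.k_steps):
            log_y = log_y - self.beta * self.mlp(log_y)
        return log_y[..., :self.d_latent]        # trim the padded dimensions

flow = FlowReadout()
z = torch.randn(5, 3)
log_y = flow(z)
z_rec = flow.reverse(log_y)
```

Because each step adds a small (β-scaled) update, the composed map is close to invertible as long as the MLP's updates stay small relative to its inputs, which motivates the Lipschitz condition discussed below.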

The Flow mapping is only guaranteed to be injective if changes in the output of the MLP are sufficiently small relative to changes in the input (i.e., the MLP has a Lipschitz constant strictly less than 1) [23]. The model can be made fully injective either by restricting the weights of the MLP (e.g., via spectral normalization [39]) or by using a variable step-size ODE solver that prevents crossing trajectories (e.g., continuous normalizing flows [19]). In practice, we found that using a moderate number of steps allows Flow to preserve approximate injectivity of the readout at all tested dimensionalities (Supp. Fig. S2).

3.4. Metrics and characterization of dynamics

We assessed model performance in five domains: 1) reconstruction performance, 2) latent accuracy, 3) dynamical accuracy, 4) embedding accuracy, and 5) readout injectivity. All metrics were evaluated on validation data. Critically, on biological data without a ground-truth system, only the reconstruction performance and readout injectivity can be assessed, since all the other metrics rely on full observability of the underlying system. Therefore, we need models for which good performance on the observable metrics (reconstruction, injectivity) implies good performance on the unobservable metrics (latent, dynamical, and embedding accuracy).

Reconstruction performance for the synthetic data was assessed using two key metrics. The first, spike negative log-likelihood (Spike NLL), was defined as the Poisson NLL employed during model training. The second, Rate R2, was the coefficient of determination between the inferred and true firing rates, averaged across neurons. We used Spike NLL to assess how well the inferred rates explain the spiking activity, while Rate R2 reflects the model’s ability to find the true firing rates. These metrics quantify how well the model captures the embedded system’s dynamics (i.e., that fˆ and gˆ capture the system described by f and g), but give no indication of the interpretability of the learned latent representation (i.e., whether the learned fˆ and gˆ are simple and low-dimensional).
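
These two metrics can be sketched as follows. We drop the constant log(x!) term from the Poisson NLL, as is standard during training; whether the reported metric includes it is not specified in the text:

```python
import numpy as np

def spike_nll(rates, spikes):
    # Mean Poisson NLL; the constant log(x!) term is dropped (assumption).
    return float(np.mean(rates - spikes * np.log(rates)))

def rate_r2(rates_true, rates_hat):
    # Coefficient of determination per neuron, averaged across neurons.
    ss_res = ((rates_true - rates_hat) ** 2).sum(axis=0)
    ss_tot = ((rates_true - rates_true.mean(axis=0)) ** 2).sum(axis=0)
    return float(np.mean(1.0 - ss_res / ss_tot))

rng = np.random.default_rng(0)
y_true = rng.uniform(0.5, 2.0, size=(200, 12))     # ground-truth rates
x_obs = rng.poisson(y_true)                        # observed spike counts
```

Spike NLL is computable on real data (it only needs spikes), whereas Rate R2 requires the ground-truth rates, which is why it is only available for synthetic datasets.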

For the biological neural data, we measured model performance using two metrics from the Neural Latents Benchmark (NLB) [27], co-smoothing bits-per-spike (co-bps) and velocity decoding performance on predicted firing rates (Vel R2). co-bps is a measure of reconstruction performance that quantifies how well the model predicts the spiking of the held-out neurons, while Vel R2 quantifies how well the denoised rates can predict the monkey’s hand velocity during the reach. We have no way to directly assess embedding, latent, or dynamical accuracy because they are unobserved in most biological datasets.

To determine whether a model’s inferred latent activity contains features that are not in the simulated latent activity, we used a previously published metric called the State R2 [15]. State R2 is defined as the coefficient of determination R2 of a linear regression from simulated latent trajectories z to the inferred latent trajectories zˆ. State R2 will be low if the inferred latent trajectories contain features that cannot be explained by an affine transformation of the true latent trajectories. Importantly, State R2 alone cannot ensure latent accuracy. This is because a model can achieve high State R2 trivially if the inferred latent activity zˆ is a low-dimensional projection of the simulated activity z. Therefore, only models that have both good reconstruction performance (Spike NLL, Rate R2) and State R2 can be said to accurately reflect the simulated latent dynamics without extra features that make the model harder to interpret (i.e., zˆ ≈ z).
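
State R2 amounts to an ordinary least-squares affine fit from z to zˆ; a sketch (pooling sums of squares across latent dimensions is our assumption about the multi-output aggregation):

```python
import numpy as np

def state_r2(z_true, z_hat):
    # R^2 of an affine regression from true latents z to inferred latents.
    Z = np.column_stack([z_true, np.ones(len(z_true))])   # add an intercept
    coef, *_ = np.linalg.lstsq(Z, z_hat, rcond=None)
    resid = z_hat - Z @ coef
    ss_res = (resid ** 2).sum()
    ss_tot = ((z_hat - z_hat.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
z = rng.standard_normal((500, 3))
# Any affine transform of z is explained perfectly, so State R2 is ~1.
z_hat = z @ rng.standard_normal((3, 3)) + 0.5
```

Note the direction of the regression (true → inferred): extra invented dimensions in zˆ lower the score, while a lossy projection of z does not, which is exactly why the metric must be paired with reconstruction performance.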

As a direct comparison of the estimated dynamics fˆ to the simulated dynamics f, we extracted the fixed-point (FP) structure from our trained models and compared it to the FP structure of the underlying system. We used previously published FP-finding techniques [40] to identify regions of the generator’s dynamics where the magnitude of the vector field was close to zero, calling this set of locations the putative FPs. We linearized the dynamics around the FPs and computed the eigenvalues of the Jacobian of fˆ to characterize each FP. Capturing FP location and character gives an indication of how closely the estimated dynamics resemble the simulated dynamics (i.e., fˆ ≈ f).
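
The FP search can be sketched as gradient descent on the squared speed q(z) = |fˆ(z)|² from many initial states. This toy example uses a linear system with a known spiral fixed point at the origin; it is a simplified stand-in for the procedure of [40], not the authors' exact implementation:

```python
import torch

def find_fixed_points(f, z0, iters=500, lr=0.05):
    # Gradient descent on q(z) = |f(z)|^2 from a batch of initial states.
    z = z0.clone().requires_grad_(True)
    for _ in range(iters):
        q = (f(z) ** 2).sum()
        (grad,) = torch.autograd.grad(q, z)
        z = (z - lr * grad).detach().requires_grad_(True)
    return z.detach()

# Toy dynamics with one fixed point at the origin and rotation around it.
A = torch.tensor([[-0.1, -1.0], [1.0, -0.1]])
f = lambda z: z @ A.T

fps = find_fixed_points(f, torch.randn(16, 2))

# Linearize at the FP: for this linear f the Jacobian is just A; its complex
# eigenvalue pair (real part -0.1, imaginary part ±1.0) marks a spiral sink.
eigvals = torch.linalg.eigvals(A)
```

For a learned nonlinear fˆ, the Jacobian at each candidate FP would instead be computed by automatic differentiation, and its eigenvalues used to classify the FP as attractive, repulsive, or rotational.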

To determine how well our embedding gˆ captures the simulated embedding g, we projected the encoding vectors used to generate the synthetic neural activity from the ground-truth system into our model’s latent space, using the same affine transformation from ground-truth latent activity to inferred latent activity that was used to compute State R2. We projected the inferred latent activity onto each neuron’s affine-transformed encoding vector to find the predicted activation of each synthetic neuron. We then related the predicted firing rates of each neuron to its corresponding activations to derive an estimate of each neuron’s activation function. Because the inferred latent activity is arbitrarily scaled/translated relative to the true latent activity, we fit an affine transformation from the predicted activation function to the ground-truth activation function. The coefficient of determination R2 of this fit quantifies how well our models were able to recover the synthetic warping applied to each neuron (i.e., gˆ ≈ g).

We compared the injectivity of the Flow readout to Linear and MLP readouts using effective rank [41] and cycle-consistency, respectively. Effective rank quantifies the number of significant singular values in a Linear readout, while cycle-consistency quantifies how well the inferred latent activity zˆ can be recovered from the predicted log-firing rates logyˆ.
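Effective rank, as defined in [41], is the exponential of the entropy of the normalized singular value distribution; a readout whose weight matrix spans fewer directions than its nominal shape therefore scores low:

```python
import numpy as np

def effective_rank(W):
    # Effective rank [41]: exp of the entropy of normalized singular values.
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
# A 12 x 10 readout that only spans 3 directions scores <= 3, even though
# its nominal shape would allow rank 10.
W_low = rng.standard_normal((12, 3)) @ rng.standard_normal((3, 10))
W_full = rng.standard_normal((12, 10))
```

Applied to a trained Linear readout, a low effective rank means some latent dimensions are (nearly) annihilated by the readout, i.e., the mapping is effectively non-injective.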

4. Results

4.1. Finding interpretable latent activity across state dimensionalities with ODIN

As the latent dimensionality D is unknown for biological datasets, we wanted to test how robust each model was to the choice of state dimensionality Dˆ. We trained Linear-NODE, MLP-NODE, and ODIN (Fig 2A) to reconstruct synthetic neural activity from the Arneodo system [42] and compared reconstruction performance (i.e., Spike NLL and Rate R2) and latent recovery (i.e., State R2) as functions of the dimensionality Dˆ of the state space. We trained 5 different random seeds for each of the 3 model types and 5 state dimensionalities (75 total models; model hyperparameters in Supp. Table 1, representative hyperparameter sweeps in Supp. Fig. S1).

Figure 2: ODIN recovers latent activity more accurately than alternative models and is robust to overestimates of latent dimensionality. A) Diagram of models tested, including Linear-NODE (green), MLP-NODE (orange), and ODIN (red). B) Inferred latent activity of a representative model at each state dimensionality Dˆ. True latent activity (affine-transformed to overlay inferred latent activity) shown in light blue. C) Model metrics as a function of Dˆ. Shaded areas represent one standard deviation around the mean. Dashed vertical line indicates Dˆ=3. Top: Spike NLL; Middle: Rate R2; Bottom: State R2.

First, we observed that latent activity inferred by the Linear-NODE did not closely resemble the simulated latent activity, with all tested dimensionalities performing worse than either ODIN or the MLP-NODE at Dˆ=3 (Fig 2B,C; mean State R2 = 0.70 for Linear-NODE vs. 0.89 and 0.93 for MLP-NODE and ODIN, respectively). We also found that the Linear-NODE required many more dimensions to reach peak reconstruction performance (Fig 2C, Rate R2). These results demonstrate that models unable to account for nonlinear embeddings are vulnerable to learning more complex and higher-dimensional dynamics than models with nonlinear readouts.

Next, we compared ODIN to MLP-NODE and found that at the correct dimensionality (Dˆ=3), these models had similar performance for both reconstruction and latent recovery. However, as the dimensionality increased beyond the true dimensionality (Dˆ>3), the latent recovery of the MLP-NODE degraded rapidly while ODIN’s latent recovery remained high (Fig 2C, as Dˆ>3). As the true latent dimensionality D is usually unknown, NPDMs with non-injective readouts (like MLPs) may be predisposed to learning misleading latent activity that can make it more difficult to interpret biological datasets.

4.2. Common readouts learn non-injective mappings from latent activity to firing rates

We then sought to assess the injectivity of different readouts. First, we used effective rank [41] to quantify the injectivity of our Linear readouts. We trained 5 Linear-NODE models at a range of state dimensionalities (Dˆ=3,5,8,10) to reconstruct simulated neural activity from Arneodo that was linearly embedded into 12D neural space. We found that while reconstruction performance was optimal when Dˆ>3 (Supp. Fig. S3), the effective rank of these best-reconstructing models never exceeded 4 (mean erank = 3.74 at Dˆ=10). This means that for the largest Linear-NODE models, around 6 of 10 latent dimensions had no effect on the reconstructed log-rates. The fact that Linear readouts learn mappings with low effective rank, coupled with their improved reconstruction performance when Dˆ>3, suggests that Linear readouts exploit non-injectivity to improve reconstruction at the expense of latent accuracy.

Next, we used a cycle-consistency metric to show that MLP readouts also have a tendency to become non-injective. Cycle consistency quantifies how well the inputs to a function can be recovered from the function’s outputs. We trained a separate MLP to predict the inferred latents zˆ from the predicted log-firing rates log yˆ for the 10D MLP-NODE and ODIN models shown in Figure 2. We found that the cycle consistency of the ODIN model was consistently higher than that of the MLP-NODE (Fig. 3B, Noise Level = 0). Models might also learn to compress latent activity into arbitrarily small firing rate changes while remaining technically injective, a failure mode that standard cycle consistency could miss. To address this concern, we added Gaussian noise to the log-firing rates log yˆ and tried to recover the inferred latent activity from these noise-corrupted log-rates. Consistent with ODIN’s bias towards injectivity, we found that ODIN’s cycle consistency was more robust to the addition of noise than the MLP-NODE’s (Fig. 3B, Noise Level > 0).

Figure 3: Linear- and MLP-NODEs tend towards non-injectivity. A) Effective rank of the Linear readout as a function of state dimensionality Dˆ. Each point represents one randomly instantiated model. B) Cycle-consistency R2 for ODIN and MLP-NODE as a function of noise corruption.

To demonstrate that injectivity was the critical feature that allowed ODIN to outperform other models, we tested an alternative injective readout, an Invertible Neural Network (INN). The INN implementation differs substantially from Flow’s, but the two share the property of injectivity. We found that INN-NODE qualitatively reproduced ODIN’s performance in Figure 2C (Supp. Fig. S4), suggesting that injectivity is the critical feature for recovering interpretable latent activity. We describe the advantages of ODIN over INN-NODE in the Supplementary Material.

4.3. Recovering fixed point structure with ODIN

A common method to examine how well dynamics models capture the underlying dynamics from synthetic data is to compare the character and structure of the inferred fixed points (FPs) to the FPs of the ground-truth system [15]. At a high level, FPs enable a concise description of the dynamics in a small region of state space around the FP, and can collectively provide a qualitative picture of the overall dynamical landscape. To obtain a set of candidate FPs, we searched the latent space for points at which the magnitude of the vector field fˆ is minimized (as in [1, 40]). We computed the eigenvalues of the Jacobian of fˆ at each FP location. The real and imaginary components of these eigenvalues identify each FP as attractive, repulsive, etc.

We found that 3D ODIN models and 3D Linear-NODEs were both able to recover three fixed points that generally matched the locations of the three fixed points of the Arneodo system (Fig 4A). However, while ODIN was also able to capture the eigenspectra of all three FPs (Fig. 4B, red ×), the Linear-NODE failed to capture the rotational dynamics of the central FP (Fig 4B, middle column, green +). Both models were able to approximately recover the eigenspectra of the outermost FPs of the system (Fig. 4B, left and right columns). We found that the MLP-NODE was also able to find FPs with similar accuracy to ODIN at 3D. These results show that an inability to model the nonlinear embedding can lead to impoverished estimates of the underlying dynamics fˆ.

Figure 4: ODIN recovers fixed point properties accurately at the correct dimensionality. A,B) Representative latent activity and fixed points from the true (blue, ∘), ODIN (red, ×), and Linear-NODE (green, +) systems. Each fixed point is labeled with reference to C. C) Plots of the real vs. imaginary parts of the eigenvalues of the Jacobian evaluated at each fixed point. The unit circle in the complex plane (black curve) marks the boundary between attractive and repulsive behavior (the attractive and repulsive sides of the boundary are indicated by inset).

4.4. Recovering simulated activation functions with ODIN

While obtaining interpretable dynamics is our primary goal, models that allow unsupervised recovery of the embedding geometry may provide additional insight into the computations performed by the neural system [43, 7]. For this section, we considered a representative model from each readout class with the correct number of latent dimensions (D=3). We applied an affine transformation to map the ground-truth encoding vectors into the modeled latent space and computed the projection of the modeled latent activity onto the affine-transformed encoding vectors (Fig. 5A). From this projection, we derived an estimate of each neuron's activation function and compared it to the ground-truth activation function.
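A minimal NumPy sketch of this analysis is below. It is illustrative only: `z_model`, `rates_model`, and `w_true` are placeholders for the modeled latents, the model's predicted rates, and the ground-truth encoding vectors, and the affine map is fit here by simple least squares.

```python
import numpy as np

def fit_affine(z_model, z_true):
    """Least-squares affine map (A, b) such that z_model @ A + b ~= z_true."""
    X = np.hstack([z_model, np.ones((len(z_model), 1))])
    W, *_ = np.linalg.lstsq(X, z_true, rcond=None)
    return W[:-1], W[-1]  # A: (D x D), b: (D,)

def estimate_activation_curve(z_model, rates_model, w_true, A, b, neuron):
    """For one neuron: project the affine-mapped latents onto its encoding
    vector to get a scalar 'activation' per time point, then pair it with
    the model's predicted rate. Sorting by activation traces out the
    model's implied activation function for that neuron."""
    act = (z_model @ A + b) @ w_true[neuron]
    order = np.argsort(act)
    return act[order], rates_model[order, neuron]
```

The resulting (activation, rate) curve can then be fit against the ground-truth activation function and scored per neuron with R2, as in Fig. 5C.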

Figure 5:

ODIN can recover nonlinear activation functions of neurons. A) True encoding vectors (numbered lines over the true latent activity, blue) were affine-transformed into a representative model's latent space. B) Inferred activation functions for two example neurons (columns), color-coded by readout type (Linear-NODE = green, MLP-NODE = orange, ODIN = red, True = black). Plots show predicted firing rate vs. activation for the selected neuron. C) Comparison of the R2 values of the fits across all neurons for models with Dˆ=3.

We found, as expected, that the Linear-NODE was unable to approximate the sigmoidal activation functions of individual neurons (Fig. 5B, green). In contrast, both ODIN and the MLP-NODE captured activation functions ranging from nearly linear to step-like (Fig. 5B, red, orange). Across all simulated neurons for models with D=3, ODIN estimated the activation functions of individual neurons more accurately than both the Linear- and MLP-NODEs (Fig. 5C; two-sided paired t-test, p < 1e-10 for ODIN vs. each), suggesting that ODIN's injectivity enables more accurate estimation of nonlinear embeddings.
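The statistical comparison above amounts to a per-neuron R2 followed by a paired test. A sketch using SciPy (illustrative; `y_true` and `y_pred` below are hypothetical stand-ins for ground-truth and estimated activation-function values, with columns indexing neurons):

```python
import numpy as np
from scipy import stats

def r2_per_neuron(y_true, y_pred):
    """R^2 of each neuron's estimated activation function
    (arrays are samples x neurons)."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

def compare_readouts(r2_a, r2_b):
    """Two-sided paired t-test across neurons: both models are evaluated
    on the same neurons, so each neuron serves as its own control."""
    t, p = stats.ttest_rel(r2_a, r2_b)
    return t, p
```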

4.5. Modeling motor cortical activity with ODIN

To validate ODIN's ability to fit neural activity from a biological neural circuit, we applied ODIN to the Maze dataset from the Neural Latents Benchmark, which comprises recordings from the motor and premotor cortices of a monkey performing a reaching task (Fig. 6A). After performing hyperparameter sweeps across regularization parameters and network size (Supp. Table 2), we trained a set of ODIN and Linear-NODE models to reconstruct the neural activity across a range of state dimensionalities Dˆ. We visualized the top 3 PCs of the condition-averaged latent trajectories and the predicted single-neuron firing rates for example models from each readout type. We found no visually obvious differences in the inferred latent trajectories (Fig. 6B), but when we computed condition-averaged peri-stimulus time histograms (PSTHs) of single-neuron firing rates, ODIN typically produced firing rate estimates that more closely resembled the empirical PSTHs than those from the Linear-NODE (Fig. 6C).
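Condition-averaged PSTHs of the inferred rates are straightforward to compute. In this sketch (illustrative, not our analysis code), `rates` is a hypothetical trials × time × neurons array of inferred firing rates and `condition_ids` labels each trial's reach condition:

```python
import numpy as np

def condition_averaged_psths(rates, condition_ids):
    """Average inferred single-trial rates within each task condition,
    returning the condition labels and a
    (conditions x time x neurons) array of PSTHs."""
    conds = np.unique(condition_ids)
    psths = np.stack([rates[condition_ids == c].mean(axis=0)
                      for c in conds])
    return conds, psths
```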

Figure 6:

ODIN can reconstruct cortical activity with low-dimensional dynamics. A) Top: Schematic of the task [38]. Bottom: example hand trajectories and condition-averaged firing rates aligned to movement onset. B) Example condition-averaged latent activity from ODIN and Linear-NODE models applied to neural activity recorded during the Maze task. C) Example single-neuron peri-stimulus time histograms for ODIN and Linear-NODE models across conditions. D) Effect of latent state dimensionality Dˆ on reconstruction (top, co-bps) and decoding (bottom, Vel R2) performance. Points show the mean and shading the standard deviation of 5 randomly initialized ODIN and Linear-NODE models at each Dˆ. GPFA was a single run; AutoLFADS was the best-performing model from an adaptive hyperparameter search. Horizontal lines represent peak performance of AutoLFADS with Dˆ=100.

Without access to the ground-truth dynamics f and embedding g that generated these biological data, the dimensionality required to reconstruct the neural activity was our primary measure of interpretability. We computed co-bps (a measure of reconstruction performance on held-out neurons) for each model and found that 10D ODIN models substantially outperformed Linear-NODE models, even when the Linear-NODE had more than twice as many dimensions (10D ODIN: 0.333 vs. 25D Linear-NODE: 0.287). This suggests that ODIN's injective nonlinear readout substantially reduces the state dimensionality required to capture the data, relative to a simple linear readout.
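For reference, the bits-per-spike metric compares a model's Poisson log-likelihood of the spikes to that of a per-neuron mean-rate null model; co-bps applies this to held-out neurons during co-smoothing [27]. A minimal NumPy sketch (illustrative; not the benchmark's exact evaluation code):

```python
import numpy as np

def poisson_ll(spikes, rates, eps=1e-9):
    """Poisson log-likelihood of spike counts under predicted rates,
    dropping the log(k!) term, which cancels in the difference below."""
    return float((spikes * np.log(rates + eps) - rates).sum())

def bits_per_spike(spikes, rates):
    """(LL_model - LL_null) / (total spikes * log 2), where the null model
    predicts each neuron's mean rate at every trial and time step.
    Arrays are trials x time x neurons."""
    null = np.broadcast_to(spikes.mean(axis=(0, 1)), spikes.shape)
    return (poisson_ll(spikes, rates) - poisson_ll(spikes, null)) / (
        spikes.sum() * np.log(2))
```

A model that predicts nothing beyond each neuron's mean rate scores 0 bits per spike; better rate predictions score positively.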

We also compared ODIN to alternative models, including AutoLFADS, GPFA, and MLP-NODE [27], at the same state dimensionalities. Trained AutoLFADS and GPFA models had lower co-bps at all tested state dimensionalities; in particular, co-bps was substantially higher for 10D ODIN than for the 10D AutoLFADS or GPFA models (0.333 vs. 0.237 and 0.204, respectively). As expected, the MLP-NODE (not shown) performed similarly to ODIN; however, without a known state dimensionality, the MLP readout may incentivize the MLP-NODE to invent latent activity that is not reflected in the dataset. Of note, increasing AutoLFADS to a very high state dimensionality (Dˆ=100) allowed it to outperform ODIN in co-bps. However, as we showed in Figures 2 and 3, improved reconstruction performance often comes at the expense of accuracy in latent recovery. Together, these results suggest that ODIN is effective at reducing the state dimensionality needed for good neural reconstruction, which may yield more interpretable latent representations than alternative models.

5. Discussion

Dynamics models have had great success in reproducing neural activity patterns and relating brain activity to behavior [44, 27, 45]. However, it has been difficult to use these models to investigate neural computation directly. If neural population models could be trusted to find interpretable representations of latent dynamics, then recent techniques that can uncover computation in artificial networks could help to explain computations in the brain [1, 40, 46]. In this work, we created a new model called ODIN that can overcome major barriers to learning interpretable latent dynamical systems. By combining Neural ODE generators and approximately injective nonlinear readouts, ODIN offers significant advantages over the current state-of-the-art, including lower latent dimensionality, simpler latent activity that is robust to the choice of latent dimensionality, and the ability to model arbitrary nonlinear activation functions.

Circuits in the brain are densely interconnected, so a primary limitation of this work is that ODIN does not yet account for inputs to the system arriving from areas that are not directly modeled. Thus, ODIN can currently model the dynamics of a given population of neurons only as an autonomous system. Inferring inputs is difficult because of the ambiguity between inputs and internal dynamics in driving the state of the system. While some RNN-based models have methods for input inference [44], more work is needed to develop solutions for NODE-based models. Injective readouts are an important step toward addressing the fundamental difficulties of input inference, as models without injective readouts can be incentivized to imagine latent features that are actually the result of inputs.

Interpretable dynamics derived from neural population recordings could answer critical scientific questions about the brain and help improve brain-machine interface technology. A potential negative consequence is that human neural interfaces combined with an understanding of neural computation might make it possible and profitable to develop strategies that are effective at influencing behavior. Future researchers should focus on applications of this research that are scientific and medical rather than commercial or political.

Supplementary Material

Acknowledgements

The authors would like to acknowledge Timothy D. Kim and Carlos Brody for helpful discussions that further developed the ideas in this manuscript.

This work was supported by NSF NCS 1835364, NIH-NINDS/OD DP2NS127291, NIH BRAIN/NIDA RF1 DA055667, and the Alfred P. Sloan Foundation (CP), NIH BRAIN/NINDS F32 RFA-MH-23-110 (CV), the Simons Foundation as part of the Simons-Emory International Consortium on Motor Control (CP, CV), and NSF Graduate Research Fellowship DGE-2039655 (ARS).

References

  • [1].Sussillo David and Barak Omri. Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Computation, 25(3):626–649, March 2013. ISSN 1530–888X. doi: 10.1162/NECO_a_00409. [DOI] [PubMed] [Google Scholar]
  • [2].Mante Valerio, Sussillo David, Shenoy Krishna, and Newsome William. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature, 503:78–84, November 2013. doi: 10.1038/nature12742. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].Remington Evan D., Narain Devika, Hosseini Eghbal A., and Jazayeri Mehrdad. Flexible Sensorimotor Computations through Rapid Reconfiguration of Cortical Dynamics. Neuron, 98 (5):1005–1019.e5, June 2018. ISSN 1097–4199. doi: 10.1016/j.neuron.2018.05.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [4].Maheswaranathan Niru, Williams Alex, Golub Matthew, Ganguli Surya, and Sussillo David. Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper/2019/hash/d921c3c762b1522c475ac8fc0811bb0f-Abstract.html. [PMC free article] [PubMed] [Google Scholar]
  • [5].Vyas Saurabh, Golub Matthew D., Sussillo David, and Shenoy Krishna V.. Computation Through Neural Population Dynamics. Annual Review of Neuroscience, 43(1):249–275, 2020. doi: 10.1146/annurev-neuro-092619-094115. URL 10.1146/annurev-neuro-092619-094115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [6].Shenoy Krishna V., Sahani Maneesh, and Churchland Mark M.. Cortical control of arm movements: a dynamical systems perspective. Annual Review of Neuroscience, 36:337–359, July 2013. ISSN 1545–4126. doi: 10.1146/annurev-neuro-062111-150509. [DOI] [PubMed] [Google Scholar]
  • [7].Jazayeri Mehrdad and Ostojic Srdjan. Interpreting neural computations by examining intrinsic and embedding dimensionality of neural activity. Technical Report arXiv:2107.04084, arXiv, August 2021. URL http://arxiv.org/abs/2107.04084 arXiv:2107.04084 [q-bio] type: article. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Stevenson Ian H. and Kording Konrad P.. How advances in neural recording affect data analysis. Nature Neuroscience, 14(2):139–142, February 2011. ISSN 1546–1726. doi: 10.1038/nn.2731. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [9].Steinmetz Nicholas A., Aydin Cagatay, Lebedeva Anna, Okun Michael, Pachitariu Marius, Bauza Marius, Beau Maxime, Bhagat Jai, Böhm Claudia, Broux Martijn, Chen Susu, Colonell Jennifer, Gardner Richard J., Karsh Bill, Kloosterman Fabian, Kostadinov Dimitar, Mora-Lopez Carolina, O'Callaghan John, Park Junchol, Putzeys Jan, Sauerbrei Britton, van Daal Rik J. J., Vollan Abraham Z., Wang Shiwei, Welkenhuysen Marleen, Ye Zhiwen, Dudman Joshua T., Dutta Barundeb, Hantman Adam W., Harris Kenneth D., Lee Albert K., Moser Edvard I., O'Keefe John, Renart Alfonso, Svoboda Karel, Häusser Michael, Haesler Sebastian, Carandini Matteo, and Harris Timothy D.. Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings. Science, 372(6539), April 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].Demas Jeffrey, Manley Jason, Tejera Frank, Barber Kevin, Kim Hyewon, Martínez Traub Francisca, Chen Brandon, and Vaziri Alipasha. High-speed, cortex-wide volumetric recording of neuroactivity at cellular resolution using light beads microscopy. Nature Methods, 18(9): 1103–1111, September 2021. ISSN 1548–7105. doi: 10.1038/s41592-021-01239-8. URL https://www.nature.com/articles/s41592-021-01239-8. Number: 9 Publisher: Nature Publishing Group. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Gao Peiran and Ganguli Surya. On simplicity and complexity in the brave new world of large-scale neuroscience. Current Opinion in Neurobiology, 32:148–155, June 2015. ISSN 0959–4388. doi: 10.1016/J.CONB.2015.04.003. URL https://www.sciencedirect.com/science/article/pii/S0959438815000768, Publisher: Elsevier Current Trends. [DOI] [PubMed] [Google Scholar]
  • [12].Duncker Lea and Sahani Maneesh. Dynamics on the manifold: Identifying computational dynamical activity from neural population recordings. Current Opinion in Neurobiology, 70: 163–170, October 2021. ISSN 0959–4388. doi: 10.1016/j.conb.2021.10.014. URL https://www.sciencedirect.com/science/article/pii/S0959438821001264. [DOI] [PubMed] [Google Scholar]
  • [13].Sussillo David, Jozefowicz Rafal, Abbott L. F., and Pandarinath Chethan. LFADS - Latent Factor Analysis via Dynamical Systems. Technical Report arXiv:1608.06315, arXiv, August 2016. URL http://arxiv.org/abs/1608.06315 arXiv:1608.06315 [cs, q-bio, stat] type: article. [Google Scholar]
  • [14].Schimel Marine, Kao Ta-Chu, Jensen Kristopher T., and Hennequin Guillaume. iLQR-VAE : control-based learning of input-driven dynamics with applications to neural data. Technical report, bioRxiv, October 2021. URL https://www.biorxiv.org/content/10.1101/2021.10.07.463540v1. Section: New Results Type: article. [Google Scholar]
  • [15].Sedler Andrew R., Versteeg Christopher, and Pandarinath Chethan. Expressive architectures enhance interpretability of dynamics-based neural population models, February 2023. URL http://arxiv.org/abs/2212.03771, arXiv:2212.03771 [cs, q-bio]. [DOI] [PMC free article] [PubMed]
  • [16].Gao Yuanjun, Archer Evan, Paninski Liam, and Cunningham John P.. Linear dynamical neural population models through nonlinear embeddings. Technical Report arXiv:1605.08454, arXiv, October 2016. URL http://arxiv.org/abs/1605.08454. arXiv:1605.08454 [q-bio, stat] type: article. [Google Scholar]
  • [17].Wu Anqi, Roy Nicholas A., Keeley Stephen, and Pillow Jonathan W. Gaussian process based nonlinear latent structure discovery in multivariate spike train data. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://papers.nips.cc/paper_files/paper/2017/hash/b3b4d2dbedc99fe843fd3dedb02f086f-Abstract.html. [PMC free article] [PubMed] [Google Scholar]
  • [18].Zhao Yuan and Park Memming Il. Variational Online Learning of Neural Dynamics. Frontiers in Computational Neuroscience, 14, 2020. ISSN 1662–5188. URL https://www.frontiersin.org/article/10.3389/fncom.2020.00071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [19].Chen Ricky T. Q., Rubanova Yulia, Bettencourt Jesse, and Duvenaud David. Neural Ordinary Differential Equations. Technical Report arXiv:1806.07366, arXiv, December 2019. URL http://arxiv.org/abs/1806.07366. arXiv:1806.07366 [cs, stat] type: article. [Google Scholar]
  • [20].Dinh Laurent, Krueger David, and Bengio Yoshua. Nice: Non-linear independent components estimation. arXiv preprint arXiv: 1410.8516, 2014. [Google Scholar]
  • [21].Kingma Durk P and Dhariwal Prafulla. Glow: Generative flow with invertible 1×1 convolutions. Advances in neural information processing systems, 31, 2018. [Google Scholar]
  • [22].Ardizzone Lynton, Kruse Jakob, Wirkert Sebastian, Rahner Daniel, Pellegrini Eric W., Klessen Ralf S., Maier-Hein Lena, Rother Carsten, and Köthe Ullrich. Analyzing Inverse Problems with Invertible Neural Networks. Technical Report arXiv:1808.04730, arXiv, February 2019. URL http://arxiv.org/abs/1808.04730. arXiv:1808.04730 [cs, stat] type: article. [Google Scholar]
  • [23].Behrmann Jens, Grathwohl Will, Chen Ricky T. Q., Duvenaud David, and Jacobsen Joern-Henrik. Invertible Residual Networks. In Proceedings of the 36th International Conference on Machine Learning, pages 573–582. PMLR, May 2019. URL https://proceedings.mlr.press/v97/behrmann19a.html. ISSN: 2640–3498. [Google Scholar]
  • [24].Macke Jakob H, Buesing Lars, Cunningham John P, Yu Byron M, Shenoy Krishna V, and Sahani Maneesh. Empirical models of spiking in neural populations. In Advances in Neural Information Processing Systems, volume 24. Curran Associates, Inc., 2011. URL https://papers.nips.cc/paper/2011/hash/7143d7fbadfa4693b9eec507d9d37443-Abstract.html. [Google Scholar]
  • [25].Archer Evan, Park Il Memming, Buesing Lars, Cunningham John, and Paninski Liam. Black box variational inference for state space models, November 2015. URL http://arxiv.org/abs/1511.07367, arXiv:1511.07367 [stat].
  • [26].Pfau David, Pnevmatikakis Eftychios A, and Paninski Liam. Robust learning of low-dimensional dynamics from large neural ensembles. In Advances in Neural Information Processing Systems, volume 26. Curran Associates, Inc., 2013. URL https://papers.nips.cc/paper_files/paper/2013/hash/47a658229eb2368a99f1d032c8848542-Abstract.html [Google Scholar]
  • [27].Pei Felix, Ye Joel, Zoltowski David, Wu Anqi, Chowdhury Raeed H., Sohn Hansem, O’Doherty Joseph E., Shenoy Krishna V., Kaufman Matthew T., Churchland Mark, Jazayeri Mehrdad, Miller Lee E., Pillow Jonathan, Park Memming Il, Dyer Eva L., and Pandarinath Chethan. Neural Latents Benchmark ‘21: Evaluating latent variable models of neural population activity. Technical Report arXiv:2109.04463, arXiv, January 2022. URL http://arxiv.org/abs/2109.04463 arXiv:2109.04463 [cs, q-bio] type: article. [Google Scholar]
  • [28].Kim Timothy D, Luo Thomas Z, Pillow Jonathan W, and Brody Carlos. Inferring latent dynamics underlying neural population activity via neural differential equations. In International Conference on Machine Learning, pages 5551–5561. PMLR, 2021. [Google Scholar]
  • [29].Johnson Matthew J., Duvenaud David, Wiltschko Alexander B., Datta Sandeep R., and Adams Ryan P.. Composing graphical models with neural networks for structured representations and fast inference, July 2017. URL http://arxiv.org/abs/1603.06277. arXiv:1603.06277 [stat].
  • [30].Stevenson Ian H.. Flexible models for spike count data with both over- and under-dispersion. Journal of Computational Neuroscience, 41(1):29–43, August 2016. ISSN 1573–6873. doi: 10.1007/s10827-016-0603-y. URL 10.1007/s10827-016-0603-y. [DOI] [PubMed] [Google Scholar]
  • [31].Brunton Steven L, Proctor Joshua L, and Kutz J Nathan. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the national academy of sciences, 113(15):3932–3937, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [32].Schneider Steffen, Lee Jin Hwa, and Mathis Mackenzie Weygandt. Learnable latent embeddings for joint behavioural and neural analysis. Nature, 617(7960):360–368, May 2023. ISSN 1476–4687. doi: 10.1038/s41586-023-06031-6. URL https://www.nature.com/articles/s41586-023-06031-6. Number: 7960 Publisher: Nature Publishing Group. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [33].Zhou Ding and Wei Xue-Xin. Learning identifiable and interpretable latent models of high-dimensional neural activity using pi-VAE, November 2020. URL http://arxiv.org/abs/2011.04798 arXiv:2011.04798 [cs, q-bio, stat].
  • [34].Bashiri Mohammad, Walker Edgar, Lurz Konstantin-Klemens, Jagadish Akshay, Muhammad Taliah, Ding Zhiwei, Ding Zhuokun, Tolias Andreas, and Sinz Fabian. A flow-based latent state generative model of neural population responses to natural images. In Advances in Neural Information Processing Systems, volume 34, pages 15801–15815. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/paper/2021/hash/84a529a92de322be42dd3365afd54f91-Abstract.html. [Google Scholar]
  • [35].Smith Jimmy T. H., Linderman Scott W., and Sussillo David. Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems. Technical Report arXiv:2111.01256, arXiv, November 2021. URL http://arxiv.org/abs/2111.01256. arXiv:2111.01256 [cs] type: article. [Google Scholar]
  • [36].Hurwitz Cole, Srivastava Akash, Xu Kai, Jude Justin, Perich Matthew, Miller Lee, and Hennig Matthias. Targeted Neural Dynamical Modeling. In Advances in Neural Information Processing Systems, volume 34, pages 29379–29392. Curran Associates, Inc., 2021. URL https://papers.nips.cc/paper_files/paper/2021/hash/f5cfbc876972bd0d031c8abc37344c28-Abstract.html. [Google Scholar]
  • [37].Jensen Kristopher, Kao Ta-Chu, Stone Jasmine, and Hennequin Guillaume. Scalable Bayesian GPFA with automatic relevance determination and discrete noise models. In Advances in Neural Information Processing Systems, volume 34, pages 10613–10626. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/paper/2021/hash/58238e9ae2dd305d79c2ebc8c1883422-Abstract.html. [Google Scholar]
  • [38].Churchland Mark M., Cunningham John P., Kaufman Matthew T., Ryu Stephen I., and Shenoy Krishna V.. Cortical preparatory activity: representation of movement or first cog in a dynamical machine? Neuron, 68(3):387–400, November 2010. ISSN 1097–4199. doi: 10.1016/j.neuron.2010.09.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [39].Miyato Takeru, Kataoka Toshiki, Koyama Masanori, and Yoshida Yuichi. Spectral Normalization for Generative Adversarial Networks, February 2018. URL http://arxiv.org/abs/1802.05957, arXiv:1802.05957 [cs, stat].
  • [40].Golub Matthew D. and Sussillo David. Fixedpointfinder: A tensorflow toolbox for identifying and characterizing fixed points in recurrent neural networks. Journal of Open Source Software, 3(31):1003, 2018. doi: 10.21105/joss.01003. URL 10.21105/joss.01003 [DOI] [Google Scholar]
  • [41].Roy Olivier and Vetterli Martin. The Effective Rank: a Measure of Effective Dimensionality. European Association for Signal Processing, 2007. [Google Scholar]
  • [42].Arneodo A, Coullet P, and Tresser C. Occurrence of strange attractors in three-dimensional Volterra equations. Physics Letters A, 79(4):259–263, October 1980. ISSN 0375–9601. doi: 10.1016/0375-9601(80)90342-4. URL https://www.sciencedirect.com/science/article/pii/0375960180903424. [DOI] [Google Scholar]
  • [43].Gardner Richard J., Hermansen Erik, Pachitariu Marius, Burak Yoram, Baas Nils A., Dunn Benjamin A., Moser May-Britt, and Moser Edvard I.. Toroidal topology of population activity in grid cells. Technical report, bioRxiv, February 2021. URL https://www.biorxiv.org/content/10.1101/2021.02.25.432776v1. Section: New Results Type: article. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [44].Pandarinath Chethan, O’Shea Daniel J., Collins Jasmine, Jozefowicz Rafal, Stavisky Sergey D., Kao Jonathan C., Trautmann Eric M., Kaufman Matthew T., Ryu Stephen I., Hochberg Leigh R., Henderson Jaimie M., Shenoy Krishna V., Abbott L. F., and Sussillo David. Inferring single-trial neural population dynamics using sequential auto-encoders. Nature Methods, 15(10):805–815, October 2018. ISSN 1548–7105. doi: 10.1038/s41592-018-0109-9. URL https://www.nature.com/articles/s41592-018-0109-9. Number: 10 Publisher: Nature Publishing Group. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [45].Smith Jimmy T. H., Warrington Andrew, and Linderman Scott W.. Simplified State Space Layers for Sequence Modeling, March 2023. URL http://arxiv.org/abs/2208.04933. arXiv:2208.04933 [cs].
  • [46].Driscoll Laura, Shenoy Krishna, and Sussillo David. Flexible multitask computation in recurrent networks utilizes shared dynamical motifs, August 2022. URL https://www.biorxiv.org/content/10.1101/2022.08.15.503870v1. Pages: 2022.08.15.503870 Section: New Results. [DOI] [PMC free article] [PubMed]
  • [47].Gilpin William. Chaos as an interpretable benchmark for forecasting and data-driven modelling. Advances in Neural Information Processing Systems, 2021. URL http://arxiv.org/abs/2110.05266 [Google Scholar]
  • [48].Maynard Edwin M., Nordhausen Craig T., and Normann Richard A.. The Utah Intracortical Electrode Array: A recording structure for potential brain-computer interfaces. Electroencephalography and Clinical Neurophysiology, 102(3):228–239, March 1997. ISSN 0013–4694. doi: 10.1016/S0013-4694(96)95176-0. URL https://www.sciencedirect.com/science/article/pii/S0013469496951760 [DOI] [PubMed] [Google Scholar]
  • [49].Rübel Oliver, Tritt Andrew, Ly Ryan, Dichter Benjamin K., Ghosh Satrajit, Niu Lawrence, Baker Pamela, Soltesz Ivan, Ng Lydia, Svoboda Karel, Frank Loren, and Bouchard Kristofer E.. The Neurodata Without Borders ecosystem for neurophysiological data science. eLife, 11: e78362, October 2022. ISSN 2050–084X. doi: 10.7554/eLife.78362. URL 10.7554/eLife.78362. Publisher: eLife Sciences Publications, Ltd. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [50].Keshtkaran Mohammad Reza, Sedler Andrew R., Chowdhury Raeed H., Tandon Raghav, Basrai Diya, Nguyen Sarah L., Sohn Hansem, Jazayeri Mehrdad, Miller Lee E., and Pandarinath Chethan. A large-scale neural network training framework for generalized estimation of single-trial population dynamics. Nature Methods, 19(12):1572–1577, December 2022. ISSN 1548–7105. doi: 10.1038/s41592-022-01675-0. URL https://www.nature.com/articles/s41592-022-01675-0. Number: 12 Publisher: Nature Publishing Group. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [51].Willett Francis R., Avansino Donald T., Hochberg Leigh R., Henderson Jaimie M., and Shenoy Krishna V.. High-performance brain-to-text communication via handwriting. Nature, 593(7858):249–254, May 2021. ISSN 0028–0836. doi: 10.1038/s41586-021-03506-2. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8163299/. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [52].Paszke Adam, Gross Sam, Massa Francisco, Lerer Adam, Bradbury James, Chanan Gregory, Killeen Trevor, Lin Zeming, Gimelshein Natalia, Antiga Luca, Desmaison Alban, Köpf Andreas, Yang Edward, DeVito Zach, Raison Martin, Tejani Alykhan, Chilamkurthy Sasank, Steiner Benoit, Fang Lu, Bai Junjie, and Chintala Soumith. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Technical Report arXiv:1912.01703, arXiv, December 2019. URL http://arxiv.org/abs/1912.01703, arXiv:1912.01703 [cs, stat] type: article. [Google Scholar]
  • [53].Liaw Richard, Liang Eric, Nishihara Robert, Moritz Philipp, Gonzalez Joseph E, and Stoica Ion. Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118, 2018. [Google Scholar]
  • [54].Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., and Duchesnay E.. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011. [Google Scholar]
