Abstract
The population vector is a linear decoder for an ensemble of neurons whose response properties are nonlinear functions of the input vector. Previous analyses of this decoder, however, appear to have overlooked the fact that the population vector can also be used to estimate functions of the input vector. We explore the use of the singular value decomposition to delineate the set of functions that are linearly decodable from a given population of noisy neurons.
Keywords: Population codes, Singular value decomposition, Principal components
Many sensory systems utilize a large number of neurons to make measurements on inherently low-dimensional systems. Examples include the ~1000 neurons in the cricket cercal system that measure horizontal wind velocity [2], and the much larger hair-cell population in the mammalian otolith organs that senses linear acceleration of the head, a three-dimensional vector [1]. Although the responses of the individual neurons are highly nonlinear and heterogeneous, a linear weighted sum of these nonlinear measures often provides an exceptionally precise estimate of the underlying physical input. Linear or “population vector” decoding is typically used to explore how neural systems might implement a communication channel; that is, how downstream populations might reconstruct the value of an encoded physical variable by specific synaptic weighting and summing of afferent sensory neuronal signals [4]. However, many downstream populations may be more concerned with extracting transformations of the sensory input. Using the general framework we present below, such transformations can also be computed by linear decoding.
The approach taken here is to view the decoded population vector as a linear combination of projections of the high-dimensional state vector of neuronal activities onto a special set of axes. Combinations of projections along other axes provide measures of nonlinear functions, such as the magnitude of the input vector or bilinear forms like the vector cross product. We discuss below how to determine the range of functions that can be robustly extracted in this manner.
Let x be a D-dimensional random vector representing the input to a set of N neurons with response curves $a_i(x)$, $i = 1, \dots, N$. The response curves are a heterogeneous set of nonlinear, scalar-valued functions, and the input is physically restricted to a finite range R (for D = 1, e.g., a bounded interval). For simplicity, we assume that x is uniformly distributed on R.
We specify a set of N orthonormal vectors $\hat{e}_1, \dots, \hat{e}_N$ (the columns of the N × N identity matrix) to serve as a basis (axes) for the population’s state space. In these coordinates, the state vector for a given input x is
$$a(x) = \sum_{i=1}^{N} a_i(x)\,\hat{e}_i \tag{1}$$
Thus, the neuronal population {ai} maps the range R onto a surface a(x) in the N-dimensional state space, parameterized by x. Noise in the neural encoding process (e.g. jitter in spike times) introduces a cloud of uncertainty around each state point.
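For concreteness, the construction of this state path is easy to simulate. The following Python/NumPy sketch assumes a toy population of sigmoidal response curves with randomly drawn thresholds and gains on the range R = [−1, 1]; all of these choices are illustrative assumptions, not parameters taken from this paper.

```python
import numpy as np

# Toy population: N heterogeneous sigmoidal neurons encoding a scalar
# input x on a finite range R = [-1, 1] (illustrative assumptions).
rng = np.random.default_rng(0)
N = 20
thresholds = rng.uniform(-1.0, 1.0, N)  # where each neuron starts to respond
gains = rng.uniform(2.0, 8.0, N)        # steepness of each response curve

def a(x):
    """State vector a(x): the N tuning-curve responses to input(s) x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return 1.0 / (1.0 + np.exp(-gains * (x[:, None] - thresholds)))

xs = np.linspace(-1.0, 1.0, 201)  # sample grid on R
A = a(xs)                          # shape (201, N): the state path a(x)
print(A.shape)
```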
Next, let the desired transformation f(x) be a vector in a generalized vector space F with inner product
$$\langle f, g \rangle = \int_R f(x)\,g(x)\,dx \tag{2}$$
For example, we may take F to be the set of continuous functions over R.
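Numerically, this inner product can be approximated by quadrature over a sampling grid on R. A minimal sketch (the grid, range, and trapezoid-rule weights are our illustrative choices):

```python
import numpy as np

# Inner product of eq. (2), <f, g> = integral of f(x) g(x) over R,
# approximated by the trapezoid rule on a uniform grid over R = [-1, 1].
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
w = np.full(xs.size, dx)
w[[0, -1]] = dx / 2  # trapezoid-rule endpoint weights

def inner(f_vals, g_vals):
    """<f, g> for functions sampled on the grid xs."""
    return np.sum(w * f_vals * g_vals)

# Example: <x, x> = integral of x^2 over [-1, 1] = 2/3.
print(inner(xs, xs))  # ~0.6667
```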
Finally, we specify how the desired transformation is to be extracted from the neural activities by defining a decoding rule
$$\hat{f}(x) = \sum_{i=1}^{N} \phi_i\, a_i(x) \tag{3}$$
where the $\phi_i$ are the decoding vectors (or, in the case D = 1, decoding weights). Thus, the extracted approximation to f(x) will be a linear combination of the neural response functions. The set of all possible such combinations defines a vector subspace $A = \operatorname{span}\{a_1(x), \dots, a_N(x)\} \subseteq F$. We seek an optimal projector ϕ for projecting points along a(x) from the state space onto A.
Having set the stage, the precise statement and solution of our problem becomes geometrically clear (see Fig. 1a):
Fig. 1.
(a) Projection theorem geometry. (b) State space rotation.
Find the function $\hat{f}(x)$ which best approximates f(x); that is, the vector in A which minimizes the norm of the approximation error $\varepsilon = \lVert f - \hat{f} \rVert$.
According to the well-known Projection Theorem, a unique solution exists, and it is simply the orthogonal projection of f(x) onto A. Thus, the error will be orthogonal to the subspace spanned by the neural response functions; i.e.,
$$\langle f - \hat{f},\; a_j \rangle = 0 \tag{4}$$
for all j = 1, …, N. Solving, we obtain
$$\phi_i = \sum_{j=1}^{N} \left(\Gamma^{-1}\right)_{ij}\, \langle f, a_j \rangle \tag{5}$$
where
$$\Gamma_{ij} = \langle a_i, a_j \rangle = \int_R a_i(x)\,a_j(x)\,dx \tag{6}$$
is the Gram matrix, whose entries represent the correlations between the activities of neuron pairs, averaged over the input range R.
The Gram matrix is usually ill-conditioned, so its inverse cannot be computed directly. One remedy is to add statistically independent Gaussian noise to the neuronal responses [4]; instead, we use the pseudoinverse based on the singular value decomposition (SVD) [3, Chapter 2], because it reveals much about which functions can be extracted from a population’s activities.
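The following sketch implements eqs. (5) and (6) for the toy sigmoidal population introduced above, illustrating both the ill-conditioning of the Gram matrix and the SVD-based pseudoinverse decode; the target f(x) = x corresponds to classical population-vector decoding (all parameters remain illustrative assumptions):

```python
import numpy as np

# Rebuild the toy sigmoidal population (illustrative assumptions).
rng = np.random.default_rng(0)
N = 20
thresholds = rng.uniform(-1.0, 1.0, N)
gains = rng.uniform(2.0, 8.0, N)
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
A = 1.0 / (1.0 + np.exp(-gains * (xs[:, None] - thresholds)))  # state path

# Gram matrix of eq. (6), Gamma[i, j] = <a_i, a_j>, via a Riemann sum.
Gamma = A.T @ A * dx
print(np.linalg.cond(Gamma))      # typically huge: Gamma is ill-conditioned

# Optimal decoders of eq. (5) for the target f(x) = x, using the
# SVD-based pseudoinverse, which discards tiny singular values,
# instead of a direct inverse.
f = xs
b = A.T @ f * dx                  # b[j] = <f, a_j>
phi = np.linalg.pinv(Gamma) @ b   # eq. (5)
f_hat = A @ phi                   # decoded approximation, eq. (3)
print(np.max(np.abs(f_hat - f)))  # residual error should be small
```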
Using the SVD, we decompose the Gram matrix as
$$\Gamma = U S U^{T} \tag{7}$$
where $S = \operatorname{diag}(s_1, s_2, \dots, s_N)$ is the diagonal matrix of singular values in order of decreasing magnitude, and U is a rotation matrix whose columns are the corresponding eigenvectors (Γ is symmetric, so its singular vectors and its eigenvectors coincide). The columns $\{u_1, u_2, \dots, u_N\}$ of U define a new coordinate system for the state space.
In the rotated coordinates, the state vector a(x) and the orthogonal projector ϕ become $\chi(x) = U^{T} a(x)$ and $\psi = U^{T} \phi$, and the function approximation can be reexpressed as
$$\hat{f}(x) = \sum_{i=1}^{N} \psi_i\, \chi_i(x) \tag{8}$$
Each of the curves χi(x) represents the projection of the rotated state path χ(x) onto the coordinate axis ui. As before, $\hat{f}(x)$ is a linear combination of functions spanning the subspace A.
Crucial differences distinguish the new and old representations. First, in the original coordinates the ordering of the axes is arbitrary, whereas projections along the rotated axes ui are naturally ordered by the singular values of the Gram matrix: each singular value measures the second moment of the state path χ(x) along its associated axis. Second, the spanning set {χi(x)} is now orthogonal:
$$\langle \chi_i, \chi_j \rangle = s_i\,\delta_{ij},$$
which means that the χi(x)’s contribute uncorrelated information to the sum in (8), in decreasing order of importance from i = 1 to N.
Fig. 1b illustrates these points for a hypothetical case with D = 1 and N = 2. Before rotation, neither axis can be neglected in representing points on the curve a(x). After rotation, the projection along axis u1 clearly captures most of the information. Accordingly, truncating the expansion (8) introduces only small changes in the error ε; indeed, noise in the neuronal responses destroys information along axes with small singular values, so little is lost by discarding them.
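These properties of the rotated representation, the orthogonality relation and the effect of truncating the expansion (8), can be checked numerically. A sketch, again using the illustrative sigmoidal population (and exploiting the decoupled normal equations, $\psi_i = \langle f, \chi_i \rangle / s_i$):

```python
import numpy as np

# Toy sigmoidal population again (illustrative assumptions).
rng = np.random.default_rng(0)
N = 20
thresholds = rng.uniform(-1.0, 1.0, N)
gains = rng.uniform(2.0, 8.0, N)
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
A = 1.0 / (1.0 + np.exp(-gains * (xs[:, None] - thresholds)))

# SVD of the symmetric Gram matrix, eq. (7): Gamma = U S U^T.
Gamma = A.T @ A * dx
U, s, _ = np.linalg.svd(Gamma)

# Rotated coordinates: chi_i(x) is the projection of a(x) onto u_i.
chi = A @ U                        # column i holds chi_i sampled on xs

# Orthogonality check: <chi_i, chi_j> = s_i * delta_ij.
G_chi = chi.T @ chi * dx
print(np.allclose(G_chi, np.diag(s)))    # True

# Truncation: keep only the k leading terms of the expansion (8);
# the normal equations decouple, so psi_i = <f, chi_i> / s_i.
f = xs
k = 5
psi = (chi[:, :k].T @ f * dx) / s[:k]
f_hat = chi[:, :k] @ psi
print(np.max(np.abs(f_hat - f)))   # error already small for modest k
```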
By examining the first several χ functions (i.e., the robust projections of χ(x)) we gain qualitative information about the nature and range of functions supported by the population.
We first examine a population with monotonic, saturating response functions that encode a scalar x (see Fig. 2a, top). One well-studied example is the neural population that encodes horizontal eye position in the neural integrator [5].
Fig. 2.
Top: Monotonic, broadly tuned population. Bottom: Gaussian-tuned population. (a) Tuning curves, (b) first five χi(x)’s, (c) decoding results for f(x) = x, (d) decoding results for a localized function f(x). Dotted line is the desired function; solid line is the actual result. For (a), (c), and (d), N = 20. For (b), N = 200.
In Fig. 2b (top) we plot the five χi(x) functions corresponding to the largest singular values of the Gram matrix. We see that this population supports an approximation basis which looks roughly like a “rotated” set of orthogonal polynomials over the input range.
As expected, linear transformations are well supported by this population, as are functions well approximated by low-order polynomials (Fig. 2c). We emphasize low order because projections onto later axes are small and therefore sensitive to neuronal noise. For this population, the magnitude of the singular values decreases approximately exponentially with increasing polynomial order (not shown). The number of χi(x)’s with significant singular values, and hence the range of transformations extractable from the population’s neural activities, can be expanded by increasing the number of neurons or, perhaps more interestingly, by adjusting the parameters governing the encoding of x.
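In a simulation such as the sketches above, this spectrum can be inspected directly; the decay rate observed depends, of course, on our assumed tuning parameters:

```python
import numpy as np

# Toy sigmoidal population (illustrative assumptions, as above).
rng = np.random.default_rng(0)
N = 20
thresholds = rng.uniform(-1.0, 1.0, N)
gains = rng.uniform(2.0, 8.0, N)
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
A = 1.0 / (1.0 + np.exp(-gains * (xs[:, None] - thresholds)))

# Normalized singular-value spectrum of the Gram matrix: a roughly
# linear decrease on the log10 scale indicates near-exponential decay.
s = np.linalg.svd(A.T @ A * dx, compute_uv=False)
print(np.log10(s / s[0]))
```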
Note that this analysis fails if the nonlinear responses of the neurons in the population are homogeneous: in that case, any linear weighted sum is simply a scalar multiple of the shared response function, so no new transformations can be extracted. This suggests that the diversity of real neuronal responses is computationally important.
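This failure mode is visible in the Gram matrix itself: identical response curves make every entry of Γ the same inner product, so Γ has rank one. A minimal sketch (parameters illustrative):

```python
import numpy as np

# Homogeneous population: N identical copies of one sigmoidal curve.
N = 20
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
a0 = 1.0 / (1.0 + np.exp(-4.0 * xs))   # one shared tuning curve
A = np.tile(a0[:, None], (1, N))        # all N columns identical

# Numerical rank of the Gram matrix: prints 1, so every weighted sum
# of the population's responses is proportional to a0(x).
s = np.linalg.svd(A.T @ A * dx, compute_uv=False)
print(np.sum(s > 1e-10 * s[0]))
```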
Next, for comparison, we consider a population of neurons having Gaussian tuning curves (see Fig. 2a, bottom), modeled after ensembles frequently encountered in mammalian cortex. Again we plot the first five projections of χ(x) along the eigenvectors of this population’s Gram matrix (Fig. 2b, bottom).
Several striking differences stand out. First, note the absence of a linear term. Predictably, attempts to extract linear transformations from this population are less successful (Fig. 2c). However, the localized nature of these χi(x)’s renders them a reasonable basis for approximating localized functions (Fig. 2d).
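This contrast can be reproduced with a toy Gaussian-tuned population; the centers, widths, and target functions below are our illustrative assumptions rather than the paper’s simulation settings:

```python
import numpy as np

# Toy Gaussian-tuned population (illustrative assumptions).
N = 20
centers = np.linspace(-1.0, 1.0, N)
sigma = 0.15
xs = np.linspace(-1.0, 1.0, 201)
dx = xs[1] - xs[0]
A = np.exp(-((xs[:, None] - centers) ** 2) / (2 * sigma**2))

Gamma = A.T @ A * dx
pinvG = np.linalg.pinv(Gamma)

def decode(f):
    """Least-squares linear decode of target f, eqs. (3) and (5)."""
    return A @ (pinvG @ (A.T @ f * dx))

f_lin = xs                                          # linear target
f_loc = np.exp(-((xs - 0.3) ** 2) / (2 * 0.2**2))   # localized target
print(np.max(np.abs(decode(f_lin) - f_lin)))  # expect the larger error,
                                               # concentrated near the edges
print(np.max(np.abs(decode(f_loc) - f_loc)))  # localized target fares better
```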
Thus, through the lens of the rotated state space, the differences between the response functions of these two populations are seen to imply qualitatively different computational capabilities.
We have described how the population vector can be generalized to compute functions of input variables with linear decoding. The class of functions that can be decoded with significant robustness to noise depends on the form and diversity of neuronal responses in the population code. We are presently applying this observation to examine the properties of neuron populations in several primary sensory systems to obtain insight into the form of nonlinear transformations they can efficiently supply to downstream neurons.
References
- [1]. Angelaki DE, McHenry MQ, Dickman JD, Newlands SD, Hess BJM, Computation of inertial motion: neural strategies to resolve ambiguous otolith information, J. Neurosci. 19 (1999) 316–327.
- [2]. Jacobs GA, Theunissen FE, Extraction of sensory parameters from a neural map by primary sensory interneurons, J. Neurosci. 20 (2000) 2934–2943.
- [3]. Press WH, Teukolsky SA, Vetterling WT, Flannery BP, Numerical Recipes in C, Cambridge University Press, New York, 1992.
- [4]. Salinas E, Abbott LF, Vector reconstruction from firing rates, J. Comput. Neurosci. 1 (1994) 89–107.
- [5]. Seung HS, Lee DD, Reis BY, Tank DW, Stability of the memory of eye position in a recurrent network of conductance-based model neurons, Neuron 26 (2000) 259–271.


