A Kalman filtering approach to the representation of kinematic quantities by the hippocampal-entorhinal complex

Graham Wordsworth Osborn

doi:10.1007/s11571-010-9115-z

2010 Jun 8;4(4):315–335. doi: 10.1007/s11571-010-9115-z


PMCID: PMC2974095 PMID: 22132041

Abstract

Several regions of the brain which represent kinematic quantities are grouped under a single state-estimator framework. A theoretic effort is made to predict the activity of each cell population as a function of time using a simple state estimator (the Kalman filter). Three brain regions are considered in detail: the parietal cortex (reaching cells), the hippocampus (place cells and head-direction cells), and the entorhinal cortex (grid cells). For the reaching cell and place cell examples, we compute the perceived probability distributions of objects in the environment as a function of the observations. For the grid cell example, we show that the elastic behavior of the grids observed in experiments arises naturally from the Kalman filter. To our knowledge, the application of a tensor Kalman filter to grid cells is completely novel.

Keywords: Hippocampus, Place cell, Entorhinal, Grid cell, State estimation, Kalman filter

Introduction

The encoding of space and time coordinates is a basic problem which the brain must solve. Multiple (perhaps most) brain regions are involved in this problem. At a basic level, primary sensory afferents must relay a topographic map from the mechanoreceptor sites to the primary somatosensory cortices. At a more abstract level, association cortices must build and update geometrically complex models of the environment to direct accurate behavior. Specific anatomic regions have been selected by experimentalists to study space–time reasoning. “Arm-reaching” neurons in the parietal cortex of primates have been used to study how visual cues are transformed from eye coordinates to arm coordinates (Ferraina et al. 1997). The inferior colliculus of the bat and ferret has been used as a model for time-interval coding (Ehrlich et al. 1997). The hippocampal-entorhinal complex (HEC) and its neighboring regions, in particular, contain a rich array of cells tuned to kinematic variables. These include place cells in the hippocampus, head-direction cells in the postsubiculum, and grid cells in the entorhinal cortex of rats (O’Keefe and Dostrovsky 1971; Taube et al. 1990; Hafting et al. 2005).

In each of the experimental models mentioned, computational models have been devised to predict how cells will fire when the animal is exposed to a stimulus oriented in space–time. Models for predicting cell activity for the reaching problem have been proposed by Cisek (2006) and Pouget and Snyder (2000). Particularly successful place cell models have been proposed by O’Keefe and Burgess (1996), Barry et al. (2006), and Balakrishnan et al. (1999); countless others exist. Notable models of entorhinal grids have been constructed by Fuhs and Touretzky (2006), Rolls et al. (2006), and Blair et al. (2007). Extensions of the theta phase precession theory for place cells and grid cells were given by Igarashi et al. (2007), Baker and Olds (2007), Wagatsuma and Yamaguchi (2007), and Takahashi et al. (2009). It is puzzling to note that the models mentioned differ from one brain region to the next, although the stimuli and task in question may be indistinguishable. Take the example of an animal reaching for an object in the environment. According to the parietal view, we might say that the animal transforms the object’s image on its retina from retinal to hand coordinates to facilitate reaching; each step in this transformation has been documented in the parietal cortex. From the hippocampal standpoint, we might argue that the hippocampus has a place cell anchored to the same object which “informs” the animal of its location. It follows that two simultaneous representations of the same object must exist. We would like to pose the question: which representation determines the animal’s reaching behavior? The natural answer is that both representations contribute; i.e., the two brain regions are working on the same calculation together. If this is the case, however, why does one need two different mathematical models to perform one calculation?

The simplest answer to this problem would be to suppose that all space–time calculations in the brain are carried out according to a set of common rules. If one could determine these rules, it would be unnecessary to build a new mathematical model to explain how each new brain region calculates space–time relations. Instead, one would have to figure out “which part” of the overall calculation the region in question was doing.

In this paper, we have attempted to group several of the brain functions just described under a common set of rules. The rules adopted correspond to those used by engineers to deal with the analogous problems in robotics (Gelb 1984; Smith et al. 1990). These rules have already been applied to the hippocampus in several important papers (Bousquet et al. 1999; Szita and Lorincz 2004). The author’s goal is to show that the same scheme can be applied with equal success to any brain region which represents space–time; we will outline this procedure in detail for the parietal cortex (reaching cells) and entorhinal cortex (grid cells). We will also revisit place cells with a somewhat different interpretation than that of Bousquet et al. (1999).

Assumptions of the Kalman Filter

The set of rules utilized in this paper will be the state update equations of the linear Kalman filter. This filter is useful for computations and experiments due to its simplicity. However, we do not wish to create the impression that the unity between the several problems presented would vanish if a more generalized set of rules were used. To this end, we will briefly review the series of assumptions which connect the Kalman filter to the basic rules of probability. Analogous arguments for fuzzy sets could be made by replacing probabilities with membership functions.

Given a sample space containing various events, we can define the unions and intersections of these events. A particular type of sample space arises if we consider a particular collection of mutually exclusive “states” at multiple time points. We will refer to the states contained within a given time point as a “frame.” In this type of space, the only non-zero conditional probabilities are between states at non-identical time points. If we further constrain the conditional probabilities to only involve states at adjacent time points (the Markov property), we obtain a Markov chain.

A Markov chain of particular value to the neuroscientist is the hidden Markov model (HMM), which is used to incorporate the effect of observations on the probability distribution of a hidden stochastic process. Given the initial state vector, the state transition probabilities, and the state emission probabilities, the probability of a given observation sequence in an HMM can be written as:
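For reference, the standard textbook form of this likelihood is sketched below. The notation is an assumption and need not match the author's: π is the initial state vector, a_ij are the state transition probabilities, b_j(o_t) are the state emission probabilities, and the sum runs over all hidden state sequences q_1, …, q_T.

```latex
P(o_1,\dots,o_T) \;=\; \sum_{q_1,\dots,q_T} \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t)
```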

A key assumption in the HMM is that the observations are independent of one another and of states outside the corresponding time point. This assumption is hard to justify experimentally. We find this to be a major limitation of the HMM, but the complexity of alternative formulations without this assumption appears prohibitive at this time.
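As a minimal sketch of how such an observation-sequence probability can be evaluated, the forward recursion below is compared against brute-force enumeration of all hidden paths. The two-state model and its numerical values (`pi`, `A`, `B`, `obs`) are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Probability of an observation sequence via the forward recursion."""
    alpha = pi * B[:, obs[0]]              # joint prob. of first observation and state
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # propagate one step, then weight by emission
    return float(alpha.sum())

def brute_force_likelihood(pi, A, B, obs):
    """Same probability, summing explicitly over every hidden state sequence."""
    n_states = len(pi)
    total = 0.0
    for path in itertools.product(range(n_states), repeat=len(obs)):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total

# Hypothetical two-state, two-symbol model.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])
obs = [0, 1, 1]

p_forward = forward_likelihood(pi, A, B, obs)
p_brute = brute_force_likelihood(pi, A, B, obs)
```

Both routines rely on the independence assumption discussed above: each emission depends only on the hidden state at the same time point.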

Special Markov chains can be obtained by restricting the permitted types of state transitions. If the states considered are a lattice of cubes within a volume of space, we might permit only transitions between neighboring cubes. A random walk has this property. Letting these cubes shrink to infinitesimal dimensions, we obtain a random walk for continuous variables (a diffusion process). Nonlinear filters such as the Kolmogorov and Zakai equations, or linear filters such as the Kalman and Wiener filters, fall into this category (Gelb 1984).
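The passage from a lattice random walk to a diffusion process can be illustrated numerically. The sketch below assumes a one-dimensional lattice and a symmetric walk (steps of −1 or +1 with probability 1/2 each); it computes the exact position distribution by repeated convolution and shows that the variance grows linearly with the number of steps, as it does for a diffusion.

```python
import numpy as np

def random_walk_distribution(n_steps):
    """Exact distribution of a symmetric nearest-neighbour walk on a 1D lattice."""
    step = np.array([0.5, 0.0, 0.5])       # move -1 or +1, never stay in place
    dist = np.array([1.0])                 # walker starts at the origin with certainty
    for _ in range(n_steps):
        dist = np.convolve(dist, step)     # one transition between neighbouring sites
    positions = np.arange(-n_steps, n_steps + 1)
    return positions, dist

def variance(positions, dist):
    """Variance of a discrete distribution over lattice positions."""
    mean = np.sum(positions * dist)
    return float(np.sum((positions - mean) ** 2 * dist))

positions, dist = random_walk_distribution(10)
var_10 = variance(positions, dist)
```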

We require the analogue of Eq. 1a for continuous variables to generate the Kalman filter. For the discrete-time case, the system transitions (or “control inputs”) and state emission probabilities (or “observations”) may be considered separately. For the system evolution model x(k) at time-step k, we take the linear equation:

$$x(k) = a(k)\,x(k-1) + b(k)\,u(k) + w(k)$$

Here a(k) and b(k) are model parameters which we will set equal to unity, u(k) is a control input variable, and w(k) indicates additive random noise. Similarly, for the linear observation model z(k) we let:

$$z(k) = c(k)\,x(k) + v(k)$$

Here c(k) is a model parameter which, again, will be set equal to unity. The term v(k) denotes additive random noise. The random variables w(k) and v(k) will be assumed continuous and normally distributed, with covariance matrices Q(k) and R(k), respectively.

Further discussion of these state equations may be found in Gelb (1984) or Dissanayake et al. (2001).
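A minimal sketch of one predict/update cycle for the scalar case with all model parameters set to unity is given below. The function and variable names (`kalman_step`, `x_est`, `U`) are illustrative choices rather than notation from the references; Q and R denote the variances of the control-input and observation noise.

```python
def kalman_step(x_est, U, u, z, Q, R):
    """One predict/update cycle of a scalar Kalman filter with a = b = c = 1."""
    # Predict: apply the control input u; process noise inflates the variance by Q.
    x_pred = x_est + u
    U_pred = U + Q
    # Update: blend prediction and observation z, weighted by the Kalman gain.
    K = U_pred / (U_pred + R)
    x_new = x_pred + K * (z - x_pred)
    U_new = (1.0 - K) * U_pred
    return x_new, U_new

# Hypothetical numbers: start at 0 with variance 1, move by 1, then observe 1.2.
x_new, U_new = kalman_step(x_est=0.0, U=1.0, u=1.0, z=1.2, Q=0.5, R=0.5)
```

Note that the variance update does not depend on the observed value z, only on Q and R; this is why the uncertainty can be analysed separately from the position estimate.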

An important concept relating to the Kalman filter is the “frame.” As mentioned previously, by a frame we shall mean the probabilities of all states at a particular time point. Physically, we can relate a particular “state of knowledge” by the observer to each time point. In the diagrams provided throughout the paper, separate frames are represented by separate icons.

The uncertainty between frames

It is possible to classify state estimation problems by the number of frames of type B (brain) and W (external world) of a system. Namely, we can permit each subset to have one or multiple frames. This scheme yields 4 possibilities for B/W combinations: 1/1, 1/N, N/1, N/N. These combinations correspond to the 4 situations encountered by a brain when it represents the motion of extended bodies. We will list, for each, an experimental example:

1/1—the problem of representing one’s body position in space (modeled as point-like) relative to a perfectly known (or “very familiar”) environment. The example of place cells will be used.
1/N—the problem of representing body position in space relative to an imperfectly-known environment, or representing the relative positions of objects in the environment. Place cells, time cells, and head-direction cells are examples.
N/1—the problem of representing the extended body of a subject (where uncertainties between sensors may exist). The reaching problem (studied in the parietal cortex) will be addressed.
N/N—a generalized case of the preceding problems. We will not discuss it.

We will now examine each of these cases in detail. The reader is strongly encouraged to review Appendix Eqs. 31–34 at this time for a detailed overview of the methods used. Owing to the space required for a full description of these methods, we chose to omit them from the body of the text.

Body position relative to a perfectly known environment

General results

The directed acyclic graph (DAG) for the estimation problem 1/1 is shown in Fig. 1. This problem represents the most basic form of the Kalman filter; we will try to gain intuition for its properties in this section. A subject makes observations of a single non-subject frame (we will call it an object) while the subject performs a series of motions through space. The relative uncertainty U in the positions of the subject S and the object O, calculated according to Appendix Eqs. 31 and 33, is given in Eq. 34:

Here Q_k and R_k represent the state-evolution and observation covariances, respectively, at time step k. The corresponding position estimates can be calculated according to the Kalman filter equations as shown in (34). Since the uncertainty expressions govern the state evolution, however, we will only give U explicitly.

Fig. 1 — A subject makes alternating observations (with covariances R_k) and control inputs (with covariances Q_k). Evaluating the circuit resistance between the subject at time step k and the object yields Eq. 2a

The expression 2a agrees with our intuition about how “certain” our subject (say, a rat) is of the position of a particular object relative to its body. If the rat makes a perfect observation R_k = 0, then we find U = R_k = 0. Similarly, if a very inaccurate observation R_k → ∞ is made, it follows that U_n = U_n−1; the new observation makes no impact on the rat’s state estimate.
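These limiting cases can be checked with a short sketch. It assumes that Eq. 2a takes the standard scalar Kalman form, in which the prior uncertainty plus the latest control-input variance (U + Q) combines with the observation variance R like two resistors in parallel. The helper names `parallel` and `uncertainty` are hypothetical.

```python
import math

def parallel(a, b):
    """Combine two variances like resistors in parallel (handles 0 and infinity)."""
    if a == 0.0 or b == 0.0:
        return 0.0
    if math.isinf(a):
        return b
    if math.isinf(b):
        return a
    return a * b / (a + b)

def uncertainty(U0, Qs, Rs):
    """Iterate U <- (U + Q_k) || R_k over paired control inputs and observations."""
    U = U0
    for Q, R in zip(Qs, Rs):
        U = parallel(U + Q, R)
    return U

U_perfect = uncertainty(1.0, [0.5], [0.0])        # perfect observation
U_useless = uncertainty(1.0, [0.5], [math.inf])   # uninformative observation
U_typical = uncertainty(1.0, [0.5], [0.5])        # finite observation noise
```

A perfect observation drives the uncertainty to zero, while an uninformative one leaves the pre-observation uncertainty (here 1.0 + 0.5) unchanged.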

In general, a rat will never be “completely certain” or “completely uncertain” of its position in a familiar environment. Rather, the rat’s uncertainty will tend to fluctuate within some finite range between the 2 extremes when the number of observations becomes large. We would like to estimate this range. To do this, we begin by setting all the terms Q_i and R_i equal in 2a. We then take the limit of 2a when k → ∞:

If we let Q_max, R_max denote the poorest (least precise) observation and the poorest control input, and let Q_min, R_min equal the best values for these quantities, we can insert these values in 2b to get a “maximum range” for the uncertainty. This is:

We see that U must remain finite and nonzero if Q_max, R_max and Q_min, R_min are all finite.
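The finite range can be made concrete under the same assumption as before, namely that the uncertainty evolves as U ← (U + Q)R / (U + Q + R) with constant Q and R. The fixed point of this map is the positive root of U² + QU − QR = 0; Eq. 2b may be written in a different but equivalent form. The sketch below compares the closed-form root with direct iteration, using hypothetical values.

```python
import math

def steady_state_uncertainty(Q, R):
    """Positive root of U^2 + Q*U - Q*R = 0, the fixed point of the recursion."""
    return (-Q + math.sqrt(Q * Q + 4.0 * Q * R)) / 2.0

def iterate_uncertainty(U0, Q, R, n_steps):
    """Apply U <- (U + Q) R / (U + Q + R) repeatedly from the starting value U0."""
    U = U0
    for _ in range(n_steps):
        U = (U + Q) * R / (U + Q + R)
    return U

U_closed = steady_state_uncertainty(0.5, 0.5)
U_iterated = iterate_uncertainty(10.0, 0.5, 0.5, 200)

# Bounds obtained from the best and the poorest (hypothetical) noise levels.
U_low = steady_state_uncertainty(0.1, 0.1)
U_high = steady_state_uncertainty(1.0, 1.0)
```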

If the relative positions of all objects in an extended environment are known with certainty (see IIIB2), then Eq. 2a holds for the uncertainty of the rat with respect to any object in the room. That is:

In other words: if all the objects O₁, O₂, … O_n are equally “observable,” i.e.

the rat gets the same information from observing different objects as it does from observing only one object! We may as well call this common uncertainty “the uncertainty of the rat with respect to the room.” This is precisely the relation which hippocampal “place” cells are believed to represent. Consequently, we should be able to make predictions about place cell fields using our model. We shall do this in the next section.

Let us consider other special cases of Eq. 2a. In the limit, if the rat could drive U to zero, a precise point in the rat’s body could be represented with respect to the world (to be discussed further for the case N/1). This implies that place cell fields can (in principle) become arbitrarily point-like in space, provided observations become suitably accurate. The best way for the rat to do this would be to minimize its internal sources of error Q; for instance, it might choose to remain roughly stationary. Setting Q = 0 in Eq. 2a, we have:

$$\frac{1}{U} = \sum_{i=1}^{k} \frac{1}{R_i}$$

For k roughly identical observations R this reduces to U = R/k.

Conversely, imagine that a rat at a particular moment knows its position with uncertainty U = U₀. If we then deprive the rat of further landmark information (perhaps by turning off the lights), Eq. 2a gives (with R_i → ∞ and k control inputs):

$$U = U_0 + \sum_{i=1}^{k} Q_i$$

This situation represents “pure path integration” by the rat, and results in progressive enlargement of the uncertainty ellipsoid (Dissanayake et al. 2001). If we measured the summed activity of a population of place cells at a particular time (we assume the distribution is Gaussian), we might expect the distribution’s volume to increase in this case. We are not aware of experiments which have demonstrated such a phenomenon, although experiments on blind rats have been conducted (see, for instance, Save et al. 1998). This implies that the “position estimate of the rat with respect to a particular landmark at a particular time” and a “place field” are not equivalent quantities; we discuss a more likely relationship between them next.

Predictions for place cell fields

Thus far, we have demonstrated that a state-estimator satisfies our intuitive notions regarding that quantity which may be called “the uncertainty of the subject with respect to a landmark.” We now need to show that this quantity actually agrees with what is known about place cells in the HEC. To make this leap, we will need to assume a relationship between the firing-rate F of a place cell population (or PCP) X at time t and the probability that the rat is at the position x:

Here N(μ, σ²) denotes a normal distribution with mean μ and variance σ². To begin with, we need to take into account that experiments have typically considered the sum of all action potentials for a given PCP over some time interval. As we know, the rat’s state estimate (according to the Kalman filter) will generally not agree with the rat’s true position unless the rat knows its position relative to all landmarks with perfect certainty. This implies that a given PCP will not achieve peak firing at precisely the same location with each visit. Thus, the “firing field” of a PCP which is measured experimentally is actually the sum of multiple, distinct position estimates made at various times:

Thus, a given place field (according to the usual experimental definition) will consist not of a single normal distribution, but rather a sum of such distributions.
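The idea of a measured field as a sum of per-visit estimates can be sketched as follows. The one-dimensional setting, the function names, and the three hypothetical visits (means and variances) are assumptions made for illustration only.

```python
import numpy as np

def gaussian(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2, evaluated on x."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def firing_field(x, means, variances):
    """Sum of per-visit Gaussian position estimates evaluated on the grid x."""
    field = np.zeros_like(x, dtype=float)
    for mu, s2 in zip(means, variances):
        field += gaussian(x, mu, s2)
    return field

# Three hypothetical visits whose state estimates peak at slightly different places.
x = np.linspace(-10.0, 10.0, 4001)
field = firing_field(x, means=[-0.3, 0.1, 0.4], variances=[0.2, 0.3, 0.25])
area = float(np.sum(field) * (x[1] - x[0]))   # approximately the number of visits
```

Because the per-visit means differ, the summed field is broader than any single estimate and need not be Gaussian.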

We now wish to predict some of the place cell fields observed by O’Keefe and Burgess (1996). This study examined how place fields changed when a familiar rectangular enclosure of dimensions l × w was deformed to a new configuration l′ × w′. To initiate the Kalman filter, we will make the following assumptions:

A1 The only landmarks are the corners of the enclosure. The positions of the corners serve to specify the dimensions of the enclosure completely.

A2 The rat possesses no memory for observations made after the deformation. This is equivalent to the requirement that Q → ∞ in Eq. 2a for all control inputs after the deformation. The consequence of this requirement is that the rat’s state estimate at any moment will only depend on the information available at that moment and the information acquired before the deformation.

A3 The estimates for the length l and width w of the enclosure are uncorrelated.

With these assumptions, we initiate the Kalman filter. At any moment the rat will observe a certain combination of corners. If the observed corners have a configuration identical to that of the undeformed enclosure, the PCP will fire identically in the deformed/undeformed rooms (relative to the observed landmarks). Conversely, if the relative orientations of the observed corners have changed, the rat will need to generate some new estimate of its position relative to them. Since there are only 4 corners total, we can easily tabulate all possible observations. We must keep in mind that the state estimates we write down are actually averages, since each individual observation is subject to random error.

Let us take first a square box w × w which is deformed to a rectangle w × l, as shown in Fig. 2 (top panel). Tabulating all possible observations, we obtain 3 possible state estimates.

Table 1.

Three possible state estimates are generated from all possible corner combinations

Corners observed	State estimate
None	None
a	1
b	1
c	2
d	2
ab	1
ac	3
ad	3
bc	3
bd	3
cd	2
abc	3
abd	3
acd	3
bcd	3
abcd	3
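The tabulation in Table 1 follows a simple rule, which can be sketched as code. The rule is inferred from the table itself and is an assumption: views containing only corners a and b give estimate 1, views containing only c and d give estimate 2, and any view mixing the two pairs gives estimate 3.

```python
from itertools import combinations

def state_estimate(observed):
    """Map a set of observed corners to the state-estimate label of Table 1."""
    observed = set(observed)
    if not observed:
        return None                    # no corners seen, no estimate
    if observed <= {"a", "b"}:
        return 1                       # only the a-b pair is visible
    if observed <= {"c", "d"}:
        return 2                       # only the c-d pair is visible
    return 3                           # view spans both pairs

# Enumerate all 16 subsets of the four corners.
table = {
    "".join(combo): state_estimate(combo)
    for r in range(5)
    for combo in combinations("abcd", r)
}
```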


Fig. 2 — A rectangular enclosure l × w is stretched to new dimensions l′ × w. Three different views of corner combinations are possible; only one takes into account the new dimensions. The color red denotes positions where the place cell in the top panel will continue to fire in the bottom panel. See Table 1 and Eq. 4b

Thus, if the “place field” in the square box was Gaussian, the field in the rectangle will be (setting 1 = ab, 2 = cd, 3 = abcd):

This three-peaked field is shown in Fig. 2. We write the labels for the rectangle in Fig. 2 (lower panel) in matrix form:

We would expect the Gaussians in brackets (Eq. 4a) to be of greater amplitude since their variances will be smaller (i.e., the PCP will fire more vigorously for corner configurations which are not deformed than for those which have undergone deformation). This result is in agreement with the experimental results obtained by O’Keefe and Burgess; in fact, it also accounts for the observed peak associated with state estimate 3, which did not appear in the model they developed (O’Keefe and Burgess 1996). It is worthy of note that if we permitted the rat some post-deformation memory, the peaks associated with state estimates 1 and 2 would gradually migrate toward the peak associated with state estimate 3. This accounts for the “blurring” observed between the 3 peaks in Fig. 2a (upper right panel) of O’Keefe and Burgess’s paper (1996).

We give the corresponding expression for a rectangle l × w which is deformed to a rectangle l′ × w′. For convenience, we label the distinct Gaussian terms with respect to the corners:

Here we have set Inline graphic for brevity. Again, we see that the terms in brackets correspond to the peaks predicted by the model of O’Keefe and Burgess; these peaks correspond to corner views which appear identical. This place field is shown in Fig. 3. We give the corresponding matrix form of the state estimates for the large square (as in the lower panel of Fig. 3):

The reader can verify that all other corner combinations yield one of the state estimates listed here. In their paper, O’Keefe and Burgess (1996) presented the bracketed terms of P(x) in the form:

Although these results are promising, we were unable to predict some of the firing-fields found by O’Keefe and Burgess (1996) by inspection alone. This difficulty is not unexpected, as the fields should be highly dependent on the particular observation programmes (which were not available to the author). We would propose the method of robotic simulation (pioneered by Burgess et al. 1997) as best suited to this challenge: if one could record the rat’s observations as a function of time and then enact the same observation programme using a robot, it might be possible to duplicate the fields with great precision. For numerical simulations of time-dependent place field migration using the Kalman filter, the reader may refer to the study by Bousquet et al. (1999).

Fig. 3 — A rectangular enclosure l × w is stretched to new dimensions l′ × w′. Nine different views of corner combinations are possible; five take into account the new dimensions. See Eq. 5b

The reader will note that we are not the first to posit a relationship between place fields and Kalman filtering. A previous effort of this kind was made by Balakrishnan et al. (1999). The formulation utilized by these authors differed from ours in that they considered a system state vector which included each PCP field explicitly. This differs from our interpretation, where the collective activity of the PCP is identical to the rat’s state estimate of its position at a given time; as such, place fields need not appear explicitly in the state vector. We cannot see a way to predict the non-Gaussian geometry of place fields using the scheme suggested by this group (Balakrishnan et al. 1999). Nevertheless, their work represents a major advance in our understanding of the HEC as a state-estimator.

Body position relative to an imperfectly-known environment

We can consider 2 types of uncertainties for the problem 1/N. First, we have the uncertainty of the observer S with respect to all the objects O₁, O₂, … O_n, which we will call the interaction uncertainty:

Conversely, we can define the uncertainty of the objects independently of the observer, which we will refer to as the mapping uncertainty:

We will find U_int and U_map first for the most general (uncertain) case and then for the limiting (certain) case.

General case

Upper bound for uncertainty

For the general case of the uncertainty U_12…n between N frames, we can only establish an upper bound for the determinant according to the formula (Gersho and Gray 1992):

Writing the uncertainty for N frames evaluated with respect to the frame i, we have:

Here U_ij denotes the relative uncertainty of the frames i and j. Using the general formula 7, we can write the upper bound for the interaction uncertainty as:

That is, the subject’s uncertainty with respect to all the objects cannot exceed the product of the individual subject-object uncertainties. Similarly, we can write the upper bound for the mapping uncertainty as:

That is, the uncertainty of all the objects with respect to one another cannot exceed the product of all pairings with the reference object. The formulas 8 and 9 provide a powerful means of approximation when the covariance matrix becomes large and difficult to evaluate.
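A bound of this kind can be checked numerically. The sketch below assumes that the general formula 7 is Hadamard's inequality, which states that the determinant of a positive-definite covariance matrix cannot exceed the product of its diagonal entries. The matrix is a hypothetical covariance of three correlated relative positions.

```python
import numpy as np

def hadamard_bound(C):
    """Return (det C, product of the diagonal entries) for a covariance matrix C."""
    return float(np.linalg.det(C)), float(np.prod(np.diag(C)))

# Hypothetical covariance, built from a Cholesky-like factor so that it is
# symmetric positive definite by construction.
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.2, 0.0],
              [0.3, 0.4, 0.9]])
C = L @ L.T
det_C, bound = hadamard_bound(C)
```

Evaluating the product of diagonal terms requires only the pairwise uncertainties, which is what makes the bound useful when the full determinant is expensive.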

The uncertainties for zero process noise

If we set the process noise Q equal to zero, we can find the uncertainty for N objects explicitly. We refer to Fig. 4 (top panel). We can see that the circuit for multiple observations of each object can be drawn in a form identical to that for one observation of each object (bottom panel). We will find the uncertainties for 3 objects; the results can be easily extended. To simplify the expressions, we will set:

We find the uncertainty between the 3 objects and the observer to be:

This is the interaction uncertainty U_int defined previously. Conversely, we can evaluate the uncertainty between the 3 objects:

This is the mapping uncertainty U_map. Extending these formulas to 4 objects gives:

Fig. 4 — (Top panel) A subject makes observations (with a separate covariance for each of objects 1, 2, and 3; R_k for object 1) and control inputs (with covariances all zero); numbers denote time steps. (Bottom panel) This circuit can be rearranged to yield a simpler one; the result is Eq. 10


The quantity U_int, as expected, goes to zero when a perfect observation of any object is made (when the relative uncertainty between the subject and one object becomes zero). As noted in A3, to preserve the information about the remaining N − 1 objects we must drop the zero terms from the determinant. For instance, a perfect observation of one of 4 objects will require the transition:

By contrast, the quantity U_map does not go to zero following a perfect observation. To make this quantity go to zero, we must make the relative uncertainty between 2 of the objects zero. This highlights the fact that U_map is an observer-independent quantity.

We can illustrate the objectivity of U_map in another way. Suppose that, after observing three objects as before, the observer executes a control input Q (Fig. 5). The interaction uncertainty will now be:

The mapping uncertainty U_map will, however, remain unchanged. This reflects the fact that the control input Q has not changed the observer’s knowledge about the relative position of the three objects. Note also that U_map did not increase; this is a general property of U_map we will explore in the next section.
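The qualitative behavior described here can be reproduced with an independent one-dimensional construction. It is a sketch under stated assumptions and not the closed forms of Eqs. 10 to 13: each object is summarized by a single effective observation variance, uncertainties are measured by determinants, and a control input adds the same random shift (variance Q) to every subject-object relative position. All function names are hypothetical.

```python
import numpy as np

def relative_covariance(R, Q=0.0):
    """Covariance of object positions relative to the observer (1D).

    R: effective observation variance per object.
    Q: variance of one control input executed after all observations.
    """
    R = np.asarray(R, dtype=float)
    n = len(R)
    # Independent observations give a diagonal covariance; a later control
    # input shifts every relative position by the same random amount.
    return np.diag(R) + Q * np.ones((n, n))

def interaction_uncertainty(R, Q=0.0):
    """Determinant of the subject-object covariance."""
    return float(np.linalg.det(relative_covariance(R, Q)))

def mapping_uncertainty(R, Q=0.0):
    """Determinant of the covariance of objects 2..n relative to object 1."""
    C = relative_covariance(R, Q)
    n = C.shape[0]
    D = np.hstack([-np.ones((n - 1, 1)), np.eye(n - 1)])   # rows form o_j - o_1
    return float(np.linalg.det(D @ C @ D.T))

R = [1.0, 2.0, 3.0]
U_int_before = interaction_uncertainty(R)
U_int_after = interaction_uncertainty(R, Q=0.5)
U_map_before = mapping_uncertainty(R)
U_map_after = mapping_uncertainty(R, Q=0.5)
```

In this construction the control input raises the interaction uncertainty but cancels out of every object-to-object difference, so the mapping uncertainty is unchanged. A perfect observation of one object (R = 0 for that object) sends the interaction uncertainty to zero while the mapping uncertainty stays positive.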

Fig. 5 — The bottom panel of Fig. 4 is simplified and redrawn on the left. We then add a control input which transitions the observer from state 1 to state 2. See Eq. 13

Limiting case

Convergence and measurement conditions for the mapping uncertainty

Dissanayake et al. (2001) proved for the quantity U_map the following 2 properties, valid for arbitrary observations and control inputs in a static environment:

1. U_map is non-increasing.
2. If the number of observations n for each object increases without bound, then U_map approaches zero.

That is:

Since we might choose to evaluate the particular uncertainty U^(p)_map for any subset p of objects in our overall DAG, Dissanayake’s result can be applied to this subset as well. This allows us to “condense” the frames of p into a single frame:

The property (2) enables us to convert any multi-object observation problem into a single-object observation problem by achieving convergence in the sense 14. Once convergence between objects has been reached, Eq. 2a is sufficient to describe the subsequent Kalman filter dynamics. The asymptotic approach of the mapping uncertainty to zero was nicely demonstrated by Dissanayake et al. (2001) with an autonomous motor vehicle.
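Both properties can be illustrated (not proved) with a small covariance recursion. The sketch assumes a one-dimensional world with state [s, o1, o2], process noise acting on the subject s only, and relative observations of each object in every round; the parameter values are hypothetical. The tracked quantity is the variance of o2 − o1, a one-dimensional stand-in for the mapping uncertainty.

```python
import numpy as np

def simulate_map_uncertainty(n_rounds, Q=0.2, R=1.0, prior=100.0):
    """Track var(o2 - o1) in a 1D Kalman filter with state [s, o1, o2]."""
    P = np.diag([0.0, prior, prior])        # subject starts at the origin
    H = np.array([[-1.0, 1.0, 0.0],         # observation of o1 - s
                  [-1.0, 0.0, 1.0]])        # observation of o2 - s
    d = np.array([0.0, -1.0, 1.0])          # selects the difference o2 - o1
    history = []
    for _ in range(n_rounds):
        P[0, 0] += Q                        # control input disturbs only s
        for h in H:                         # sequential scalar updates
            S = h @ P @ h + R               # innovation variance
            K = P @ h / S                   # Kalman gain
            P = P - np.outer(K, h @ P)      # covariance update
        history.append(float(d @ P @ d))
    return history

history = simulate_map_uncertainty(200)
```

The control input never touches the object block of the covariance, and each observation can only reduce it, so the sequence decreases steadily toward zero even though the subject's own uncertainty is replenished in every round.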

Equation 14 also permits us to state the conditions for which an experimenter can expect to measure a kinematic tuning in the brain (note that we use the terms “tuning” and “convergence” interchangeably here). These conditions are:

1. Subject observability. If the world W is characterized by n continuous quantities q_i, a brain B may (in principle) acquire a tuning to any coordinate element of the form:
(15a)
Here we assume that a cell population’s tuning corresponds to the probability P(q₁, q₂, …q_n) of the coordinates q₁, q₂, …q_n for some physical process. Thus, the cell population’s tuning (or firing rate) dT for the coordinate element would be given by an expression of the form:
Since the quantities q_i are characteristic of the physical world and not necessarily of the brain, however, only the m quantities (m ≤ n) which have converged according to 14 will appear in 15a. Thus, the converged tuning of B is given by:
(15b)
2. Experimenter observability. Of the m converged quantities, only the s quantities (s ≤ m) which are expressly measured by the experimenter will appear in 15b. Thus, the observed tuning of B is given by:
(15c)

Condition (2) implies that the number of measured tunings of a cell cannot be greater than its true number of tunings. We now apply these conditions to the arbitrary motion of bodies in space (their kinematics).

Kinematic convergence

Space–time. The most general tuning which the brain may have for kinematic quantities includes the time coordinate, position coordinates, and angular orientation of the body in question:
(16)
Note that the “tuning element” 16 is not merely a spatial volume element: it specifies not just position, but the complete kinematics of an oriented point in space and time. That is, 16 expresses convergence not for a static spatial distribution, but for a static event. By a static event, we mean an event which occurs identically (the relative space and time intervals are the same) for each trial. This definition agrees with our intuition; we know we can “become familiar” not only with consistent spatial distributions (positions of objects in a room) but also consistent events (the coordinates of a friend’s face each time he smiles). Nevertheless, the fact that the observer’s observations for a given trial of the static event are constrained requires us to prove that convergence is possible over multiple trials; we do this in Appendix Eq. 35.

It would be possible to verify the tuning 16 experimentally by measuring the appropriate cortical region while an animal observed the relative orientations of 2 objects, with one chosen as the origin and the other as the point in phase space. This experiment would be somewhat more challenging to perform than the usual “place cell” experiments, given the large number of degrees of freedom. To the author’s knowledge, a simulation of space–time convergence has not yet been performed. Its existence is entirely confirmed by Dissanayake’s result and our proof; however, the time course for convergence will generally be much longer than that for a static space.
Less general tunings. We now consider cells which satisfy only a portion of the tunings in 16; that is, only some of the variables satisfy the condition (2). We will outline measurement conditions 15c such that

It is easy to see that an infinite number of such “partial” tunings exist; this follows from the fact that an infinite number of “planes” of dimension N − 1 can be drawn in a “volume” of dimension N. We will consider only a few experimentally relevant examples here.

1. “Time” cells

Let us imagine some static event in which only the time coordinate converges. A good example would be an auditory pattern, where the only parameters of interest are the particular characteristics of the sound (e.g., intensity and pitch) and the times at which they occur. We could easily design an experiment exactly analogous to the usual “place field” experiments to test whether a rat’s “place” cells can also tune to time intervals. We would merely need to play the same auditory pattern (maybe a musical recording) repeatedly to a rat while measuring the same cortical cells. Instead of averaging firing-rates over all visits to a particular spatial location, we would average them over all trials of a particular note in the pattern. We would expect to find that different notes in the composition would be represented by different cells, just as in the spatial case. It is known already that cells in the inferior colliculus of the bat and ferret show time-interval tuning (Ehrlich et al. 1997); if the HEC serves a global memory role we should expect temporal tuning there as well.
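The proposed analysis, averaging a cell's firing rate over all trials of a given note, can be sketched as follows. The spike times, note onsets, and window length are hypothetical values invented for illustration, and `note_tuning` is not an established routine.

```python
import numpy as np

def note_tuning(spike_times_per_trial, note_onsets, window):
    """Mean firing rate (spikes/s) in a window after each note, averaged over trials."""
    rates = np.zeros(len(note_onsets))
    for spikes in spike_times_per_trial:
        spikes = np.asarray(spikes, dtype=float)
        for i, onset in enumerate(note_onsets):
            # Count spikes falling in [onset, onset + window) for this trial.
            count = np.sum((spikes >= onset) & (spikes < onset + window))
            rates[i] += count / window
    return rates / len(spike_times_per_trial)

# Hypothetical data: three notes at 0.0, 0.5 and 1.0 s within each trial;
# the recorded cell fires mostly after the second note.
note_onsets = [0.0, 0.5, 1.0]
trials = [[0.52, 0.55, 0.61],
          [0.05, 0.53, 0.58],
          [0.56, 0.60, 1.30]]
rates = note_tuning(trials, note_onsets, window=0.2)
preferred_note = int(np.argmax(rates))
```

This is the temporal counterpart of a place-field map: the note index plays the role of the spatial bin, and a cell with a "time field" shows a peak at one note.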

2. “Place” cells

We can also imagine cells with a tuning consisting only of the spatial variables P(x, y, z). These are exemplified by the well-known “place” cells already discussed; it was recently verified by Jeffery et al. (2008) that place cell tunings extend to 3 dimensions. It is also known that there exist cells with approximately 2-dimensional tuning; entorhinal grid cells appear to obey this property (Jeffery et al. 2008). We see that in the context of the present theory, it is meaningless to speak of cells which represent strictly 2D or 3D space; rather, we speak of the degree of convergence (as expressed by 5) which a particular cortical region exhibits for a particular coordinate. The only reason entorhinal grid cells appear to represent the “2D space” of an explored surface is that the rat’s observation behavior selects objects on the floor as the most useful landmarks. If the rat were to explore a box in zero gravity (let’s assume the rat is still “stuck” to the walls), grid cells would likely adopt the normal coordinate vectors n_x, n_y of a given wall as the “most natural” coordinate tuning. Similarly, a rat exploring a curved surface would probably utilize normal coordinates in its entorhinal map.

Of course, we must remember that the observed “place” cells represent only the relationship between 2 particular frames: the rat and a landmark, summarized as <s, o₁> (where s denotes the subject and o₁ denotes the landmark). Since, from what has already been said, we should have cells that can represent any frame combination of the form <s, o₁, …o_n>, to find the other types of place cells we merely need to construct situations where 2 non-rat frames execute relative motion. For example, we could place an object in the box with the rat and compare the object’s position relative to the box with the rat’s cortical activity; we should find that the rat possesses “place” cells representing the object’s position (rather than its own). Specifically, we would expect the positions of the rat and the object to appear as 2 attractor bumps within the PCP.

3. “Head-direction” cells

Another possible partial tuning is that of the angular variables P(θ, ϕ). These are exemplified by the “head-direction” cells (Taube et al. 1990). All the comments made previously for “place” cells pertain to them as well.

The extended body (or extended brain)

We now need to address how state estimation can be carried out when the sensory apparatus of the observer consists of multiple frames. In doing so, we will resolve a paradox relating to “place” cell representation: which specific part of the rat’s body is represented by the place cell population? A similar issue arises in the study of the “reaching problem” in primates, since the brain appears to represent the various frames involved (hand, subject, and object) in various coordinate systems (Ferraina et al. 1997). We will focus on the case where the various sensors make measurements of a single non-observer frame (the problem N/1); this result can easily be extended to other cases.

We can solve this type of problem by considering the various sensors as “point observers” which relay their observations to one another. As an example of this approach, consider the “reaching problem” illustrated in Fig. 6. We need to evaluate the uncertainty between the hand and the object U_ho, which is:

Here R_o denotes the uncertainty of the object as viewed by the eye, R_h1 denotes the uncertainty of the hand as viewed by the eye, and R_h2 represents the uncertainty of the hand relative to the eye as measured by the proprioceptive (non-visual) system connecting the two.

Fig. 6 — The reaching problem at a single time point. R_o is the uncertainty of the object image position on the retina, R_h1 is the uncertainty of the hand image position on the retina, and R_h2 is the uncertainty of the hand relative to the eye through proprioception

We could transform between any 2 frames within the body in this way. In other words, a “place” cell population represents not a specific part of the body relative to an external landmark, but rather all parts of the body relative to the landmark with varying degrees of uncertainty.
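As a minimal numerical sketch of the point-observer idea, suppose (purely as an assumption made here for illustration, and not necessarily the form used in the text) that all errors are independent, scalar, and Gaussian. The visual and proprioceptive estimates of the hand can then be fused by inverse-variance weighting, and the object uncertainty added to the result. The function names and the numerical values are hypothetical.

```python
def fuse(var_a, var_b):
    """Variance after combining two independent estimates of one quantity."""
    return 1.0 / (1.0 / var_a + 1.0 / var_b)

def hand_object_uncertainty(r_o, r_h1, r_h2):
    """Assumed form: object variance plus fused hand variance, both relative to the eye."""
    return r_o + fuse(r_h1, r_h2)

# Illustrative variances (arbitrary units): object seen by the eye, hand seen
# by the eye, and hand located relative to the eye through proprioception.
r_o, r_h1, r_h2 = 1.0, 2.0, 2.0
u_ho = hand_object_uncertainty(r_o, r_h1, r_h2)
```

Under these assumptions the fused hand variance is never larger than the smaller of the two hand variances, so adding a second sensory channel can only sharpen the hand-object estimate.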

We can use the same approach to extend our Eq. 2a (the problem 1/1) to the case of an extended observer. The DAG for an observer consisting of 2 sensors is shown in Fig. 7. Control inputs to the hand and eye are assumed (for simplicity) to be independently generated, as shown. We cannot write down a general expression for the uncertainty U_ho; however, if we assume the control inputs Q_h → ∞, we can write:

Here we have neglected the index e for the observations R_e and control inputs Q_e in the second term to simplify the notation; U_he is the hand-eye uncertainty and U_eo is the eye-object uncertainty. We can see that the second term is simply Eq. 2a. Thus, the uncertainty between the hand and the target object is simply the point-observer uncertainty U_eo plus an additive term U_he under this approximation.

Fig. 7 — The reaching problem over multiple time points. In Eq. 18, we let Q_h approach infinity; Q_e and R_o are simply written as Q and R (k is the time step index) in Eq. 18

The principle outlined in this section can be easily tested in the HEC, preferably using primates. We might provide a monkey with a chessboard-sized “environment” containing various “landmarks” for spatial reference; the monkey would be encouraged to reach for objects placed at these various landmarks (analogous to “exploration” of the environment). We would then expect to find “place” cells which fired only when the monkey’s hand entered a particular part of this miniature environment. Thus, we find that place cells (hippocampus) and reaching cells (parietal cortex) encode very similar quantities.

Metric estimation

It was recently discovered that a population of cells in the HEC provides a spatial metric for the rat’s environment (Hafting et al. 2005). This metric is known as the “entorhinal grid.” More remarkably, it has also been shown that these grids are deformable; that is, the metric is not always Euclidean (Barry et al. 2007). In this section, we will show that the same estimation methods used in the previous section can be used to predict the dynamics of these grids. First, however, we must define some basic concepts of measure.

Preliminaries

Measures

Most of the day-to-day judgments we make about space intervals and time intervals are approximate. This imprecision arises from the fact that we usually do not possess precisely calibrated clocks or measuring-rods to establish the intervals in question. For example, a subject would usually not pull out a tape-measure to determine whether a parking-space was wide enough for his car. Instead, he would rely on comparisons of the desired interval with cues in the environment to make the judgment. This means that any physical relation in the environment might be employed as a measure.

It is equally clear from daily experience, however, that not all measures are created equal. If, for instance, a subject knew that the clocks in a particular building were notoriously unreliable, he would be better advised to consult a (more reliable) wristwatch if he wished to know the “time.” Examining this example a bit more carefully, we see that the establishment of a particular clock or measuring-rod as a “reliable” measure must always involve its comparison with another specified, reliable measure (preferably on multiple occasions). To use the previous example, if by comparing the clocks in a particular building with the watch on multiple occasions the subject should find them to consistently agree, he might as well simply consult the building clocks to determine the “time.”

In fact, we see that this procedure for establishing reliable measures is precisely analogous to Dissanayake’s convergence result, expressed in Eq. 14. We shall therefore use 14 to formally define what we shall mean by a standard of measure:

Definition: A standard of measure (or SOM) is one which has converged, in the sense of 14, relative to another SOM.

Thus, we see that physical relations which are both consistent and frequently observed will provide the “ideal” benchmarks for comparison. We also see that, if a particular measure has not converged, its use introduces error into the conclusions derived. This error can be quantified using the usual methods of state estimation.

The reader should note that, from here on, we will refer only to space in our discussion (for simplicity); the arguments all apply to time as well.

The subjective definition of space using measures

By displacing a physical object (or measure) through space, we may characterize the distance relations in any part of that space completely. That is, by a series of displacements of a given ruler, we can assign a distance between any 2 points within the volume under consideration. This collection of distance relations is called the metric for the space. If the measures used to define the metric are perfect, the metric estimate B (for “brain”) corresponds perfectly to the true metric W (for “world”) of the physical space. Since we live in an approximately Euclidean world, W is always Euclidean unless we define it otherwise (for instance, by “deforming” the environment from metric W to metric W′). Conversely, if some of the measures are imperfect, the metrics B and W will generally not coincide. This means that B will generally not be Euclidean.

It is important to realize that the metric estimate B, like all the subjective quantities (internal to the brain) we have discussed in this paper, is defined exclusively in terms of the subject’s observations. The subject’s metric B is infinitely uncertain for any part of the true metric W which the subject has not measured using a measure. By making sufficiently detailed and concise measurements (using a SOM) of some part of W, on the other hand, the subject can make B approximate W as closely as he might desire.

We will assume that the metric estimate B corresponds exactly to the tessellating pattern of the entorhinal grids.

The metric tensor and generalized Kalman equations

Mathematically, there is no difference between “metric state estimation” and the ordinary state estimation procedures we employed in earlier sections of the paper. The distinction between the two cases instead arises from our choice of which parts of the system will be subjected to state estimation: in ordinary state estimation we assume an unbiased set of rulers to be provided (and not subject to state estimation), whereas in metric state estimation we do not make this assumption. In practice, however, it is useful to introduce a formal (mathematical) distinction as well. Since the metric intervals of interest are part of a continuum, they are more conveniently represented by utilizing the metric tensor g_ik. This entails a slightly different derivation of the Kalman state update equations (see “Extension of the vector Kalman filter to the metric tensor”) which leaves their basic structure intact. One can visualize the transition between the vector and tensor Kalman equations as analogous to the transition between a large number of springs and a continuum body in the theory of elasticity.

The metric tensor enables us to describe the infinitesimal distance relations at any given point of space. Since a subject cannot perform an infinite number of measurements, we must keep in mind that the subject’s state estimate can only approximate a continuum space. This approximation will improve as the number of measurements becomes large (as in the cases to be discussed here).
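To make the role of the metric tensor concrete, the sketch below sums the standard line element ds² = g_ik dx_i dx_k over the segments of a path, which mirrors the finite number of measurements available to a subject. The function `path_length`, the two example metrics, and the path are hypothetical choices for illustration; the metric is assumed constant over each segment and is evaluated at the segment midpoint.

```python
import math

def path_length(points, metric):
    """Length of a 2D polyline under a position-dependent metric tensor.

    metric(x, y) returns the 2x2 tensor g as nested lists; it is evaluated
    at the midpoint of each straight segment (a piecewise approximation).
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = (x1 - x0, y1 - y0)
        g = metric((x0 + x1) / 2, (y0 + y1) / 2)
        ds2 = sum(g[i][k] * dx[i] * dx[k] for i in range(2) for k in range(2))
        total += math.sqrt(ds2)
    return total

def euclidean(x, y):
    return [[1.0, 0.0], [0.0, 1.0]]

def stretched(x, y):
    # Hypothetical non-Euclidean estimate: lengths along x count double.
    return [[4.0, 0.0], [0.0, 1.0]]

path = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
len_w = path_length(path, euclidean)   # 3 + 4 = 7
len_b = path_length(path, stretched)   # 6 + 4 = 10
```

The same physical path thus receives different lengths under the two metrics, and refining the polyline brings the sum closer to the continuum value.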

The transition of the Kalman equations from a vector to a metric description has a useful consequence: the “probability energy” u_ik P_iklm u_lm has the same form as the elastic energy of a deformed body (see “Extension of the vector Kalman filter to the metric tensor”). If one overlooks the fact that “action at a distance” is prohibited in the linear elasticity theory (see “The relationship between the “probability” energy F and the energy for a linearly elastic body”), this analogy leads us to a useful mental picture of how “metric deformations” occur. Namely, we can picture each stage of the state estimation process as corresponding to the stretching or relaxing of a continuous elastic medium. The elastic analogy for the Kalman filter is crucial in that it parallels the “stretching” and “compression” properties observed in entorhinal grids; we will examine this more closely with the Barry et al. (2007) experiment.
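The quadratic form u_ik P_iklm u_lm can be evaluated directly for a small example. The sketch below assumes, for illustration only, that P_iklm takes the textbook isotropic form λ δ_ik δ_lm + μ (δ_il δ_km + δ_im δ_kl) of linear elasticity; the parameters `lam` and `mu` and their values are hypothetical and are not estimates taken from the paper.

```python
def isotropic_tensor(lam, mu, dim=2):
    """P_iklm = lam*d_ik*d_lm + mu*(d_il*d_km + d_im*d_kl) as nested lists."""
    d = lambda a, b: 1.0 if a == b else 0.0   # Kronecker delta
    r = range(dim)
    return [[[[lam * d(i, k) * d(l, m) + mu * (d(i, l) * d(k, m) + d(i, m) * d(k, l))
               for m in r] for l in r] for k in r] for i in r]

def quadratic_energy(u, P):
    """Full contraction u_ik P_iklm u_lm over all four indices."""
    r = range(len(u))
    return sum(u[i][k] * P[i][k][l][m] * u[l][m]
               for i in r for k in r for l in r for m in r)

P = isotropic_tensor(lam=1.0, mu=0.5)
stretch = [[0.1, 0.0], [0.0, 0.0]]   # uniaxial stretch along x
shear = [[0.0, 0.1], [0.1, 0.0]]     # pure shear
e_stretch = quadratic_energy(stretch, P)   # (lam + 2*mu) * 0.1**2
e_shear = quadratic_energy(shear, P)       # 4 * mu * 0.1**2
```

For this isotropic form the contraction reduces to λ (u_ii)² + 2μ u_ik u_ik, so a pure shear contributes through μ alone while a uniaxial stretch involves both parameters.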

Experience-dependent re-scaling of entorhinal grids

Simple example

Before applying the Kalman equations directly to the experiment by Barry et al. (2007), we will consider the simplest possible example of a “metric estimation cycle.” We take the situation shown in Fig. 8.

Fig. 8 — (*Top panel*) A ruler is noted to stretch from length X to X′. According to the ruler’s reading, however, its length is still X. To refute this, a second ruler is needed for comparison. If the second ruler is error-prone, we should take a weighted average of X and X′ (X < L < X′). (*Bottom panel*) The same process in continuous time

Phase 1

We imagine a rat observes a single object, a ruler X. Since no other objects exist in our simplified example, this ruler defines the rat’s concept of distance. This can be seen mathematically by noting that the rat’s estimate of the length l of the ruler X must be derived by measuring it relative to a set of measuring rods X, Y, … yielding length values (for the ruler X) x, y, … and associated errors σ_x, σ_y, … according to the general Kalman update formula 2a (which we shall simplify by setting the Q_i equal to zero):

In words, this equation says that the distance we denote “the length of the ruler X” is a weighted average of the lengths of X with respect to ruler X, the ruler Y, and so on. If only ruler X is available:

Thus, no matter what physical deformation the solitary ruler X should undergo, the rat will have no means for recognizing that any deformation has taken place. This fact, which runs counter to our experience, is of fundamental importance for understanding why entorhinal grids deform as they do.
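The weighted average described in words above can be sketched as an inverse-variance average. This is an illustration under stated assumptions: the σ values are treated as variances of independent readings, and the function name `fused_length` and all numbers are hypothetical.

```python
def fused_length(readings):
    """Inverse-variance weighted average of (length, variance) pairs."""
    weights = [1.0 / var for _, var in readings]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, readings))
    return weighted_sum / sum(weights)

x = 1.0  # reading of ruler X against itself (arbitrary units)

# Only ruler X available: the estimate equals x whatever variance is assumed.
alone_small_var = fused_length([(x, 0.2)])
alone_large_var = fused_length([(x, 5.0)])

# Two rulers with unequal variances: the more reliable reading dominates.
two_rulers = fused_length([(1.0, 0.1), (2.0, 0.3)])
```

With a single ruler the weight cancels out of the ratio, which is the numerical counterpart of the statement that the rat has no means of recognizing a deformation of the solitary ruler.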

Phase 2

The ruler X is now deformed to some new, physically different ruler X′ of length x′. Since the rat has no benchmark for comparison, it perceives no change in X′ from the structure of X; rather, its internal definition of length is deformed by a corresponding amount such that the perceived length of the ruler is unchanged:

To the experimenter, it will appear that the rat’s metric has instantaneously “stretched” upon observing the stretched ruler.

Phase 3

As the rat begins to explore its environment, it will discover other sensory cues. Mathematically, we might say that the rat discovers a new ruler Y with which to compare the ruler X′. Using our general equation, we derive the rat’s new estimate l₃ of the length of ruler X′ by comparison of the two rulers:

If (as a special case) ruler Y is believed to be perfectly accurate (σ_y = 0):

That is, the rat discards the length of ruler X′ relative to itself and instead considers only the length of X′ relative to Y. If Y is in fact perfectly accurate (i.e., its length is always the same when it is observed), then:

As long as no perfect rulers exist, however, the rat’s estimate of the length of X′ can only approach its true value x′ asymptotically (on average). For instance, if all rulers have the same variance σ, then our general expression reduces to:

Since x is the biased length of X′, we see that x < l < x′. This limiting behavior of the length estimate at (or near) its true value corresponds with our common-sense notion that lengths in an environment rich in sensory cues can be determined to a high level of accuracy. In this manner, we return to Phase 1.
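The asymptotic approach of Phase 3 can be illustrated with a scalar Kalman measurement update applied repeatedly with Q = 0. The sketch rests on simplifying assumptions: every ruler has the same variance σ, each new comparison is replaced by its mean value x′ (so the run is noise-free and shows only the average behavior), and the numerical values are hypothetical.

```python
def kalman_update(est, var, obs, obs_var):
    """One scalar Kalman measurement update with no process noise (Q = 0)."""
    gain = var / (var + obs_var)
    return est + gain * (obs - est), (1.0 - gain) * var

x_biased = 1.0   # length of X' relative to itself (the biased value x)
x_true = 1.5     # true length x' as reported, on average, by other rulers
sigma = 0.1      # common variance assumed for every ruler

est, var = x_biased, sigma
history = []
for _ in range(20):
    # Each pass stands for the comparison of X' with one more ruler.
    est, var = kalman_update(est, var, x_true, sigma)
    history.append(est)
```

After n comparisons the estimate equals (x + n x′)/(n + 1) in this setting, so it rises monotonically from x toward x′ without reaching it, in agreement with the inequality x < l < x′.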

Simulation of the Barry et al. (2007) experiment

We now consider the experiment of Barry et al. (2007) in detail. In these experiments, a rectangular box was deformed from a configuration l × w to a configuration l′ × w′. For simplicity, we consider the case where a square l × w (with l = w) is deformed to a rectangle l′ × w′, with l′ > w′. As in our earlier discussion of place cells, we will make certain assumptions.

A1 The rat’s estimates of l and w are uncorrelated.

A2 The corners of the enclosure provide the only landmarks for the rat. All state estimates by the rat are assumed rectangular.

A3 The structure constants E(u_iku_lm) have a form identical to those for a linearly elastic, isotropic body (see “The special case of an isotropic body” and “The relationship between the “probability” energy F and the energy for a linearly elastic body”).