Science Advances. 2024 Jul 31;10(31):eadm8470. doi: 10.1126/sciadv.adm8470

Space is a latent sequence: A theory of the hippocampus

Rajkumar Vasudeva Raju 1, J Swaroop Guntupalli 1, Guangyao Zhou 1, Carter Wendelken 1, Miguel Lázaro-Gredilla 1, Dileep George 1,*
PMCID: PMC11290523  PMID: 39083616

Abstract

Fascinating phenomena such as landmark vector cells and splitter cells are frequently discovered in the hippocampus. Without a unifying principle, each experiment seemingly uncovers new anomalies or coding types. Here, we provide a unifying principle that the mental representation of space is an emergent property of latent higher-order sequence learning. Treating space as a sequence resolves numerous phenomena and suggests that the place field mapping methodology that interprets sequential neuronal responses in Euclidean terms might itself be a source of anomalies. Our model, clone-structured causal graph (CSCG), employs higher-order graph scaffolding to learn latent representations by mapping aliased egocentric sensory inputs to unique contexts. Learning to compress sequential and episodic experiences using CSCGs yields allocentric cognitive maps that are suitable for planning, introspection, consolidation, and abstraction. By explicating the role of Euclidean place field mapping and demonstrating how latent sequential representations unify myriad observed phenomena, our work positions the hippocampus in a sequence-centric paradigm, challenging the prevailing space-centric view.


Mental representations of space emerge from sequence learning on egocentric sensory inputs.

INTRODUCTION

The hippocampus is known for its role in episodic memory, map-like spatial representations, relational inference, and fast learning—a seemingly disparate set of requirements. Simultaneously, hippocampal cells are categorized into a wide variety of types based on their firing patterns, ranging from place cells and splitter cells to time cells, lap cells, and event-specific representations, and they exhibit a variety of remapping phenomena in response to environmental changes (1). These phenomena often get characterized using Euclidean spatial concepts such as object vector cells (2), landmark vector cells (3), and distance coding (3, 4), without a coherent underlying explanation, and their relationship to other phenomena such as splitter cells (5–7) and event-specific representations (8) remains unresolved. Could these divergent requirements and myriad phenomena be explained using a simple set of principles that are computationally grounded, implemented, and easy to understand? Here, we show that treating space as a sequence can resolve many of the divergent phenomena ascribed to spatial mapping and help clarify the connections between spatial, temporal, abstract, and relational representations in the hippocampal complex.

Treating space as a sequence is a necessity for humans and other animals because they lack a global positioning system that enables direct sensing of location coordinates. Additionally, their actions are not with respect to a global coordinate frame as determined by a compass. Consequently, they need to acquire and abstract the concepts of locations and space from purely egocentric sensory-motor experience (9). However, sensations from the world are aliased (10, 11) and do not convey locations directly. In other words, identical sensations can occur at multiple locations or in different sequential contexts. Further complicating matters, an animal’s actions are relative to its body rather than to a global coordinate frame. To develop internal space-like maps from these aliased sensations and egocentric actions (as illustrated in the sketch in Fig. 1A), the learning agent has to appropriately split or merge sensations based on sequential contexts (Fig. 1B) (12, 13). Our model, clone-structured causal graph (CSCG), tackles this problem by learning different latent states (called clones) to represent the same observation in different sequential contexts (14–16), merging or splitting the latent states as necessary. In CSCGs, allocentric “spatial” representations naturally arise from higher-order sequence learning on egocentric sensory and motor inputs, without making any Euclidean assumptions, and without having locations as an input. An organism or an agent can use a CSCG for navigation, foraging, context recognition, and shortcut planning without having to explicitly compute place fields or having to decode locations.

Fig. 1. Clone-structured cognitive graph.


(A) Learning cognitive maps from sequential sensory observations is challenging because observations do not identify locations uniquely. (B) The cognitive map learning problem can be understood as learning a latent graph from observations emitted at every node, where two different nodes can emit the same observation. The challenge is to learn context-specific representations that will disambiguate sensory observations in the latent space. The observation D occurs in three different contexts in sequences ADE (purple), BDF (green), and CDG (orange) from the environment, a distinction that is not represented in a first-order Markov model. Two of these contexts (purple and green) correspond to the same latent state, and the third (orange) to a different latent state. Cloning D into multiple latent states allows for flexible merging and splitting of contexts as appropriate. (C) The cloning structure of dynamic Markov coding can be incorporated in an HMM with a structured emission matrix, the cloned HMM. CSCG extends cloned HMMs by including actions. (D) CSCG learns an allocentric map from aliased egocentric local observations in a 2D room with uniform interiors even with long runs of the same observation (i). (ii) Each unique sensation, indexed by color, is attached to a set of latent states (clones) through the emission matrix. Through learning of the transition matrix, these clones learn to represent different temporal contexts of that sensation. (iii) Learned transition graph among clones. Each clone’s color represents the observation it is attached to. The red arrows highlight the observations corresponding to the clones in the top-left corner of the graph. (iv) Clone activations as the agent navigates the room can be used to compute their place fields, which reveal the spatial locations they represent.

Our model suggests that place field maps need to be interpreted carefully because they overlay sequential responses onto Euclidean maps. Directly characterizing the place field maps in terms of spatial and Euclidean concepts could be a source of anomalies since the underlying phenomena are inherently sequential and dynamic (17). In contrast, CSCG explicates how the learning of sequential contexts gives rise to spatial representations that an agent can use to drive behavior without explicitly representing location coordinates. CSCGs predict when place fields are expected to change in response to visible or invisible environmental changes and when they are not, resolving a variety of phenomena with a simple principle.

Model

We consider experimental setups where an agent moves around in an environment and receives local sensations that are aliased in the sense that they do not correspond uniquely to locations in the environment, and the actions of the agent are relative to its current orientation and not in a global frame. The environment need not be Euclidean. The agent makes no Euclidean assumptions and does not have access to a map of the environment. If the sensations from the environment are vectors (for example, visual patterns) in a continuous space, they are discretized using a vector quantizer. From a sequence of discretized observations and actions, both of which could be egocentric, an agent has to discover the latent topology of its environment to vicariously evaluate different options for navigation. This is a difficult problem due to the aliasing of the observations, the egocentric action space, and the lack of Euclidean assumptions.
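As an illustrative sketch (not the paper’s implementation), a continuous observation vector can be discretized by nearest-centroid lookup; the centroids and data below are toy stand-ins:

```python
import numpy as np

def quantize(observations, centroids):
    """Map each continuous observation vector to the index of its
    nearest centroid (Euclidean distance), yielding discrete symbols."""
    # distances: shape (num_obs, num_centroids)
    d = np.linalg.norm(observations[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy example: 3 centroids in 2D, one observation near each.
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
obs = np.array([[0.1, -0.2], [4.9, 5.1], [0.2, 4.8]])
symbols = quantize(obs, centroids)  # -> array([0, 1, 2])
```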

This can be formulated as the problem of learning a latent graph from aliased observations at its nodes. An agent performs a sequence of actions a1, …, aN (with each an ∈ {1, …, Nactions}) in an environment G. As a result of each action, it receives an observation, obtaining the stream x1, …, xN (with each xn ∈ ℝd or {1, …, Nobs} for continuous and discrete observations, respectively). The goal of learning is to recover the topology of the environment G from sequences of actions and observations.

Concretely, an environment is defined by a directed multigraph G ≡ {V, E} with latent nodes V ≡ {v1, …, vNnodes} and latent edges E ≡ {e1, …, eNedges}. At each time step n, the agent exists at a latent node and receives the observation xn. The node is labeled by the discretized observation yn (when the observations are already discrete, yn = xn). Multiple latent nodes can have the same label, so the observation does not directly identify a node. When an agent at latent node vi executes an action a, it will transition to vj with probability P(vj ∣ vi, a). Whenever this probability is larger than 0, an associated directed edge from vi to vj is introduced in the graph, labeled with the corresponding action and probability. Note that this means that the graph can contain multiple edges with the same starting and ending node, but labeled with different actions. This is what makes G a multidigraph and not a simple graph. For consistency, all edges originating from the same node and labeled with the same action must have their probabilities sum up to 1.
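A minimal sketch of this definition on a toy environment (all sizes and values are hypothetical): the transition tensor is normalized so that, for each node and action, the outgoing edge probabilities sum to 1, and a label array encodes the aliased observations:

```python
import numpy as np

# Toy multidigraph environment: transition[a, i, j] = P(v_j | v_i, action a).
num_nodes, num_actions = 4, 2
rng = np.random.default_rng(0)
transition = rng.random((num_actions, num_nodes, num_nodes))
transition /= transition.sum(axis=2, keepdims=True)  # normalize per (action, node)

# Labels map each latent node to its observation symbol.
# Nodes 1 and 2 emit the same observation -> aliasing.
labels = np.array([0, 1, 1, 2])

# The consistency constraint: edges from one node under one action sum to 1.
assert np.allclose(transition.sum(axis=2), 1.0)
```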

The above definitions result in a precise, action-conditional probabilistic model for sequences. Using zn to represent the unobserved node at time step n, and adding a simple per-node policy P(an ∣ zn) to also model the actions, results in the CSCG model. To extend CSCG to continuous observations, we introduced a variable yn between the hidden state zn and the observation xn. The joint distribution of a sequence of observations and actions is

$$p(\mathbf{x}, \mathbf{a}) = \sum_{\mathbf{z}} \sum_{\mathbf{y}} P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n, a_n)\, P(a_n \mid z_n) \prod_{n=1}^{N} p(x_n \mid y_n)\, P(y_n \mid z_n) \qquad (1)$$

depicted as a probabilistic graphical model in Fig. 1C. We use the following shorthand for a sequence of actions: a ≡ {a1, …, aN} (x, y, z are similarly defined). The transition probabilities are fully parameterized through an action-conditional transition tensor T with elements Tijk = P(zn+1 = k ∣ zn = j, an = i). In this formulation, the observation model p(xn ∣ yn) = 𝒩(xn ∣ μyn, σ²I) is parameterized as an isotropic Gaussian with variance σ² and mean μyn, which is the centroid associated with the discrete emission yn. The emission model is parameterized by an emission matrix E with elements Eij = P(yn = j ∣ zn = i). When there are only a finite number of observations, or the observations are actually discrete, the observation model becomes deterministic and we can set yn = xn. This exactly recovers the discrete CSCG from (15).
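For the discrete case (yn = xn), the factorization in Eq. 1 can be checked by brute-force enumeration over hidden paths on a toy model; all parameters below are illustrative stand-ins, not a trained model:

```python
import itertools
import numpy as np

# Toy discrete CSCG: 3 hidden states, 2 actions, 2 observations.
num_z, num_a, num_y = 3, 2, 2
rng = np.random.default_rng(1)
T = rng.random((num_a, num_z, num_z))
T /= T.sum(axis=2, keepdims=True)               # P(z' | z, a)
policy = rng.random((num_z, num_a))
policy /= policy.sum(axis=1, keepdims=True)     # P(a | z)
E = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # deterministic P(y | z); states 1, 2 are clones
pi = np.full(num_z, 1.0 / num_z)                # P(z1)

def joint(x, a):
    """p(x, a) summed over all hidden paths, per Eq. 1 (discrete case,
    with N observations and N-1 actions)."""
    N = len(x)
    total = 0.0
    for z in itertools.product(range(num_z), repeat=N):
        p = pi[z[0]]
        for n in range(N - 1):
            p *= T[a[n], z[n], z[n + 1]] * policy[z[n], a[n]]
        for n in range(N):
            p *= E[z[n], x[n]]
        total += p
    return total

p = joint(x=[0, 1, 1], a=[0, 1])
```

Because the factors are all normalized, summing `joint` over every observation and action sequence of a fixed length returns 1, which is a handy sanity check on the factorization.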

In a CSCG, by design, multiple hidden states share the same emission: If states i and j are clones of the same emission, then p(y ∣ z = i) = p(y ∣ z = j). We define C(y) as the set of clones of the emission y. Further, in a CSCG, the emissions are deterministic: p(y ∣ z = i) = 1 if i ∈ C(y) and 0 otherwise. Simply put, each hidden state maps to only a single emission. This clone structure can be used to further simplify the joint distribution

$$p(\mathbf{x}, \mathbf{a}) = \sum_{z_1 \in C(y_1)} \cdots \sum_{z_N \in C(y_N)} P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n, a_n)\, P(a_n \mid z_n) \prod_{n=1}^{N} p(x_n \mid y_n) \qquad (2)$$

Observe that if we remove the policy P(an ∣ zn) from Eq. 2, we are left with the conditional model p(x ∣ a). This action-conditional setting corresponds to an (action-conditional) hidden Markov model (HMM) in which the emission matrix is fixed and determined by the cloning structure, which improves the model’s learnability (15, 18). The clone structure introduces a sparsity pattern in the emission matrix, which is computationally advantageous for both learning and inference (15). In addition, the model supports causal semantics (19) and learning from interventions (20, 21). A CSCG’s learned transition tensor can be represented by a directed multigraph, and reusing this learned transition structure along with the cloning structure to remap to a new environment can be considered as learning using soft interventions (21).
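The clone-structured sparsity of the emission matrix can be sketched as follows; the block-partition layout and helper names are assumptions for illustration:

```python
import numpy as np

def clone_emission_matrix(num_obs, clones_per_obs):
    """Build the deterministic clone-structured emission matrix E, with
    E[i, j] = P(y = j | z = i) = 1 iff hidden state i is a clone of obs j."""
    num_states = num_obs * clones_per_obs
    E = np.zeros((num_states, num_obs))
    for j in range(num_obs):
        E[j * clones_per_obs:(j + 1) * clones_per_obs, j] = 1.0
    return E

def clones_of(y, clones_per_obs):
    """C(y): indices of the hidden states that are clones of observation y."""
    return list(range(y * clones_per_obs, (y + 1) * clones_per_obs))

E = clone_emission_matrix(num_obs=3, clones_per_obs=2)
# Each hidden state emits exactly one observation, so every row sums to 1
# and each row has a single nonzero entry (the sparsity pattern).
assert np.allclose(E.sum(axis=1), 1.0)
```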

For a given sequence of actions, the learned transition tensor P(zn+1 ∣ zn, an) encodes a distribution over sequences, establishing a connection between observed temporal sequences and arbitrary hidden (not necessarily Euclidean) topologies G of the environment. CSCG achieves this by having an overcomplete latent space where the latent states vastly outnumber the visible discrete observations, with the hidden states partitioned into disjoint sets, each set corresponding to a visible observation, and the members of each set being the clones of the corresponding visible observation. This kind of latent structure gives the flexibility to split the same observation in different sequential contexts or merge different temporal contexts into the same latent state (12), as shown in Fig. 1, while mitigating the local minima problem associated with learning latent space models (15).

Learning the latent topology represented by P(zn+1 ∣ zn, an) is achieved using expectation maximization (EM), which maximizes the likelihood of the model using a local update mechanism. The tensor P(zn+1 ∣ zn, an) is initialized randomly, and the number of latent states allocated is the “capacity” of the model. Typically, the model is allocated more capacity than is needed to represent the environment, and the learning algorithm can use this excess capacity to represent the splits and merges (Fig. 1B) in the latent space needed to model the observed sequences. Such overparameterization also aids learning by avoiding local minima traps. The random initialization can be thought of as a superposition of all possible latent graphs, and maximum likelihood learning as a smooth parameterization of the topology learning problem. The splitting and merging of latent states shown conceptually in Fig. 1B is achieved by the re-weighting of connections by the EM updates, without a change in the number of neurons. EM updates smoothly push a randomly initialized P(zn+1 ∣ zn, an) toward a superposition of multiple copies of the true graph G of the environment. This superposition can be further consolidated by running a greedy version of EM, called Viterbi EM, which collapses the superposition into a sparse tensor, pruning the connections of unnecessary clones. (See the Supplementary Materials for more details.)
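A simplified sketch of one EM update for the action-conditional transition tensor, assuming fixed, deterministic clone emissions; the smoothing constant, toy dimensions, and data are illustrative choices, not the paper’s exact procedure:

```python
import numpy as np

def forward_backward(x, a, T, E, pi):
    """E-step for an action-conditional HMM with fixed emissions:
    returns expected transition counts per action."""
    N, S = len(x), T.shape[1]
    alpha = np.zeros((N, S))
    beta = np.zeros((N, S))
    alpha[0] = pi * E[:, x[0]]
    for n in range(1, N):
        alpha[n] = (alpha[n - 1] @ T[a[n - 1]]) * E[:, x[n]]
    beta[-1] = 1.0
    for n in range(N - 2, -1, -1):
        beta[n] = T[a[n]] @ (E[:, x[n + 1]] * beta[n + 1])
    counts = np.zeros_like(T)
    for n in range(N - 1):
        xi = alpha[n][:, None] * T[a[n]] * (E[:, x[n + 1]] * beta[n + 1])[None, :]
        counts[a[n]] += xi / xi.sum()  # posterior pairwise probabilities
    return counts

def em_step(x, a, T, E, pi):
    """M-step: renormalize expected counts into an updated transition tensor."""
    counts = forward_backward(x, a, T, E, pi) + 1e-12  # smoothing for unused clones
    return counts / counts.sum(axis=2, keepdims=True)

def log_likelihood(x, a, T, E, pi):
    alpha = pi * E[:, x[0]]
    for n in range(1, len(x)):
        alpha = (alpha @ T[a[n - 1]]) * E[:, x[n]]
    return np.log(alpha.sum())

# Toy run: 2 observations with 2 clones each (4 states), 2 actions.
rng = np.random.default_rng(0)
E = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])  # clone-structured emissions
T = rng.random((2, 4, 4))
T /= T.sum(axis=2, keepdims=True)
pi = np.full(4, 0.25)
x = [0, 1, 0, 0, 1, 1, 0, 1]
a = [0, 1, 0, 1, 0, 1, 0]
T_new = em_step(x, a, T, E, pi)
```

Each EM step re-weights transitions toward those that explain the observed sequence, which is the "smooth" topology learning described above; the likelihood of the data is non-decreasing across steps.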

Our work focuses on the computational and algorithmic aspects of sequence learning in the brain and not on the actual implementation in neuronal networks. However, we do believe that inference and learning in CSCG can be achieved using biologically plausible mechanisms (15, 22). The clones in a CSCG may be represented by an assembly of neurons. Message-passing inference in CSCG is computationally cheap and biologically plausible using simple integrate-and-fire neurons (23). The EM algorithm that is used for learning is a local update mechanism analogous to spike timing–dependent plasticity (24, 25). See (15) for a circuit representing a potential biological implementation of the inference updates in CSCG.

RESULTS

We tested the CSCG model in a variety of experimental settings. The first set of experiments investigated the ability of a CSCG to learn latent topologies from perceptually aliased observation sequences, the ability to represent maps of multiple environments in the same model, and the ability to transitively stitch global maps from temporally disjoint but spatially overlapping experiences. Furthermore, we investigated the ability of the model to use previously acquired structural knowledge to guide behavior in unfamiliar environments. All these properties are important for the performance of an animal. The second set of experiments investigated CSCG’s ability to reproduce and explain a broad set of well-known experimental phenomena from the hippocampus (see Table 1). These phenomena can be broadly divided into spatial, geometry-related, and landmark-related remapping (3, 26–28), phenomena with both spatial and temporal components (8, 27), and place field repetition, distortion, and changes with respect to environmental connectivity (29–31). In addition, we performed a set of experiments that serve as testable predictions for CSCG’s ability to explain the mechanisms underlying hippocampal phenomena.

Table 1. List of experiments, their observed phenomena, and related publications.

Experiment | Phenomena | Publications
Geometry changes | Place field remaps as determined by geometry | O’Keefe and Burgess (27)
Visual cue rotation | Place field rotates with cue card | Muller and Kubie (26)
Barrier addition | Place field disruption near barrier | Muller and Kubie (26)
Landmark vector cells | Place field remaps w.r.t. a landmark | Deshmukh and Knierim (3)
Linear track | Place field remaps w.r.t. start and end of the track | Sheehan et al. (28)
Directional place fields | Place field remapping is sensitive to movement direction | O’Keefe and Burgess (27)
Laps on a track | Event-specific rate remapping and lap cells | Sun et al. (8)
Four connected rooms | Place fields are unaffected by closed doors | Duvelle et al. (31)
Two identical rooms | Place fields are repeated in two identical rooms | Fuhs et al. (29)
Hairpin maze | Direction-specific repetition of place fields | Derdikman et al. (30)
Room size expansion | Place fields expand or stretch based on location w.r.t. boundaries | Tanni et al. (43)

CSCG can construct maps from aliased egocentric observations in diverse environments

CSCGs are successful in learning latent topologies in a variety of environments, including two-dimensional (2D) and 3D simulated environments (Figs. 1D and 2A), from purely sequential aliased random walk observations. In the uniform room example in Fig. 1D, the agent received egocentric visual observations quantized through a vector quantizer and took egocentric actions, with four possible heading directions in each location. The visible input to the agent depended on its location as well as its head direction. Learning in CSCG discovered the latent headings and locations and represented them using separate clones (Fig. 1Diii). Each node in the graph (Fig. 1Diii) corresponds to a clone, and its color represents the local observation it is attached to. Note that the learning of the transition graph discovered four clones per spatial location; this corresponds to the four possible headings that an agent can be in (see the Supplementary Materials for more details). CSCGs are also able to correctly learn the topology of more complicated simulated 3D environments from sequential aliased egocentric observations, as illustrated in Fig. 2A.

Fig. 2. CSCGs learn diverse latent topologies, transitively stitch them, and transfer structure to new environments.


(A) CSCG learning in an example 3D environment. The agent navigates the environment with egocentric actions and gets observations as RGB images. The images are passed through a vector quantizer to obtain cluster indices, which are used as observations for training a CSCG. The learned transition graph reflects the topology of the environment, which is a square room with a plus-shaped barrier in the center. (B) An agent experiences an environment composed of four connected 3D rooms in disjoint sequential episodes. Each episode has only partial coverage of the locations in the composite room. CSCG learning stitches together the disjoint experiences into a coherent global map. (C) An agent experiences multiple sequential episodes sampled from four nonoverlapping 3D rooms. In this case, CSCG learning correctly learns separate maps for each maze. (D) The learned transition graph of a CSCG trained on one environment can be considered a reusable schema. Given partial experience in an unfamiliar environment, a differently colored variation of the four-room environment used in (B), the previously learned joint transition model can be used to identify the agent’s location within it and rapidly navigate around obstacles to find the shortest path to the goal (here, to return to the starting location).

While each clone in the transition graph in Fig. 1Diii responds “bottom-up” to the local sensation indicated by its color, that sensation needs to occur in the latent sequential context specified by the transition graph. By representing sequential contexts in the latent space, these clones come to represent variables like location and heading that are not directly sensed. An experimenter can obtain the place field of a clone by creating a map representing the arena that the agent is moving in, and marking and accumulating the instantaneous activities of the clone at the present ground-truth location of the agent on that map. Examples of such place fields are shown in Fig. 1Div. The clones in Fig. 1Diii are also head direction sensitive, which corresponds well with the observation in (32) that place fields show head direction sensitivity when they are mapped conditioned on head direction. See fig. S9 for examples of place fields and their head direction sensitivity, consistent with contemporary observations about view sensitivity of place fields (32–35). While the place field can give rise to the interpretation that the clone is responding to that particular location, this is purely an interpretive convenience for the experimenter. The agent itself has no mechanism by which it can derive a place field from the activity of its neurons. As we show in the next section, the agent does not need to compute place fields to locate itself, nor decode locations from the clones to make navigation decisions.
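The experimenter-side place field computation described above could be sketched as follows, assuming the experimenter has access to ground-truth (row, column) positions on a grid; the toy walk is hypothetical:

```python
import numpy as np

def place_field(clone_activations, positions, grid_shape):
    """Accumulate a clone's instantaneous activation at the agent's
    ground-truth (row, col) position, then normalize by visit counts."""
    field = np.zeros(grid_shape)
    visits = np.zeros(grid_shape)
    for act, (r, c) in zip(clone_activations, positions):
        field[r, c] += act
        visits[r, c] += 1
    # Average activation per visit; unvisited cells stay 0.
    return np.divide(field, visits, out=np.zeros_like(field), where=visits > 0)

# Toy walk on a 2x2 grid: the clone fires only when the agent is at (0, 1).
acts = [0.0, 1.0, 0.0, 1.0]
pos = [(0, 0), (0, 1), (1, 1), (0, 1)]
field = place_field(acts, pos, (2, 2))  # field[0, 1] == 1.0, all else 0
```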

CSCGs make complex latent transitive inferences during learning and represent the learned information in a form that enables further transitive inference (36). When different overlapping sections of an environment are exposed to the agent in disjoint episodes, CSCGs learn the underlying map that stitches together the whole environment (Fig. 2B), including the global loop closures. When environments are truly disjoint, CSCGs learn to separate the maps and simultaneously represent multiple maps in memory without being explicitly instructed about map boundaries during training (Fig. 2C). The appropriate map can then be recalled as hidden state inference (37) and used to guide behavior.

Replay-based planning and schema-based transfer enable shortcut inference in dynamic settings

A behaving agent can keep track of its state as the most likely clone given past observations, without having to invoke any concepts about space or place fields. If the agent intends to navigate to a previously remembered goal based on a visual sensation, the action sequences that achieve this can be directly inferred from the latent graph. By re-activating the remembered clone corresponding to the previously encountered visual sensation, and propagating messages forward and backward from the clone corresponding to the current location, the agent can infer an action sequence from the current state to the target. Such message passing–based planning in CSCGs involves forward and backward sweeps from the current state, and is akin to replays in the hippocampus (38, 39). See (15) for a visualization of the replay dynamics in CSCG. While recurrent neural networks or transformers have the representational power, and could arguably be trained to predict the next observation as well as CSCGs, their latent space is not structured like a graph. A striking advantage of CSCG in comparison to such models is that learned maps can be quickly reconfigured to reflect changes in the environment. When a previously passable route is blocked, the corresponding structural modification can be made in the latent graph, and message passing–based inference will use this updated information about the environment to navigate around obstacles (15).
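As a simplified stand-in for the message passing-based planning described above, a breadth-first search over a deterministic version of the learned clone graph recovers the shortest action sequence; the dictionary encoding and toy corridor are hypothetical conveniences, not the paper’s representation:

```python
from collections import deque

def plan(transitions, start, goal):
    """BFS over a deterministic clone graph: transitions[(state, action)] -> state.
    Returns the shortest action sequence from start to goal, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if s == goal:
            # Walk back through parents to reconstruct the action sequence.
            actions = []
            while parent[s] is not None:
                s, a = parent[s]
                actions.append(a)
            return actions[::-1]
        for (state, action), nxt in transitions.items():
            if state == s and nxt not in parent:
                parent[nxt] = (s, action)
                queue.append(nxt)
    return None  # goal unreachable

# Toy graph: a 3-state corridor with 'R' (right) and 'L' (left) moves.
T = {(0, 'R'): 1, (1, 'R'): 2, (2, 'L'): 1, (1, 'L'): 0}
path = plan(T, 0, 2)  # -> ['R', 'R']
```

When a route is blocked, deleting the corresponding edge from `transitions` and re-planning immediately routes around the obstacle, mirroring the graph reconfiguration described above.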

CSCGs can also transfer prior knowledge to new environments and infer shortcut paths through unobserved locations by treating the learned transition graph as a schema (40–42) and learning just the emission matrix. To demonstrate this ability, we first trained a CSCG using aliased observations from random walks in an environment composed of four connected 3D rooms. Next, we placed the agent in an unfamiliar room with the same structure (test environment in Fig. 2D), but with variations in wall colors and lighting conditions. As the agent walks in the new room, we keep the transition matrix of the CSCG fixed and update the emission matrix with the EM algorithm. Just from the partial experience of walking through three of the four rooms, the CSCG is able to infer the shortest path to the start location through the previously unvisited room. Notably, the environment is not a 2D plane amenable to Euclidean vector navigation: The inferred shortest path involved an elevated platform accessed via a ramp as shown in Fig. 2D, where the agent had to first navigate away from the goal (in a Euclidean sense) to access the ramp. Even with partial knowledge of an environment, an agent can vicariously evaluate the sequence of actions to be taken to reach a destination by reusing the CSCG’s transition graph from a similar, previously experienced environment. This experiment shows the potential of CSCG to use its learned structures as schemas, although it required the experimenter’s intervention to update only the emission matrix while keeping the transition matrix fixed. It points toward a more general model that could automatically and simultaneously update multiple emission and transition matrices, and dynamically choose which of these models to use for behavior.
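The schema-transfer procedure (freeze the transition tensor, re-estimate only the emissions) could be sketched with an emission-only EM loop; the dimensions, data, and uninformative initialization below are toy stand-ins:

```python
import numpy as np

def relearn_emissions(x, a, T, E_init, pi, iters=10):
    """Schema transfer sketch: keep the transition tensor T fixed and
    re-estimate only the emission matrix E from a walk in a new room."""
    E = E_init.copy()
    N, S = len(x), T.shape[1]
    for _ in range(iters):
        # E-step: forward-backward with fixed T to get state posteriors gamma.
        alpha = np.zeros((N, S))
        beta = np.zeros((N, S))
        alpha[0] = pi * E[:, x[0]]
        for n in range(1, N):
            alpha[n] = (alpha[n - 1] @ T[a[n - 1]]) * E[:, x[n]]
        beta[-1] = 1.0
        for n in range(N - 2, -1, -1):
            beta[n] = T[a[n]] @ (E[:, x[n + 1]] * beta[n + 1])
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step on emissions only: expected observation counts per state.
        E = np.full_like(E_init, 1e-12)
        for n in range(N):
            E[:, x[n]] += gamma[n]
        E /= E.sum(axis=1, keepdims=True)
    return E

# Toy run: 4 hidden states, 2 actions, 3 observation symbols.
rng = np.random.default_rng(2)
T = rng.random((2, 4, 4))
T /= T.sum(axis=2, keepdims=True)
pi = np.full(4, 0.25)
E0 = np.full((4, 3), 1.0 / 3.0)  # uninformative initial emissions in the new room
x = [0, 2, 1, 0, 2, 2, 1, 0]
a = [0, 1, 1, 0, 0, 1, 0]
E_new = relearn_emissions(x, a, T, E0, pi)
```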

Remapping due to changes in overall geometry, visual cues, or landmarks can be explained using sequence learning

Changing the interpretation of place fields from explicitly representing spatial locations to representing the sequential context in which a sensation occurs explains a wide variety of place cell remapping phenomena. In the transition graph in Fig. 1Diii, each state should be interpreted not as responding to a specific location in the room, but as responding to the specific sequence of observations leading up to that location. As we demonstrate, the sequential interpretation of spatial representations can explain a variety of remapping phenomena driven by changes in geometry, visual cues, transparent or opaque barriers, landmarks, distances to start or end locations, etc.

The classic experiments on geometry change–driven remapping (27) can be explained using CSCGs as follows: Changing the geometry of a room changes the locations where similar sequential contexts will be observed. In these experiments, place fields that developed while the rat trained in an arena remapped in a geometry-dependent manner when the arena was elongated or widened. We demonstrate this by first training a CSCG on a small square (SS) room (size 9 × 9) with uniform interior and observing the place field changes of clones in test rooms that varied in size along the two dimensions (see Fig. 3A). The activations of clones in a CSCG represent the posterior distribution over latent states given the past sequence of observations. As described earlier, the specific sequential context in which clones activate can be interpreted as coding for location. Since the interiors of a uniform room have undifferentiated local sequential context, the responses of clones in the center will be anchored with respect to the boundaries because of the relative uniqueness of the observations there. When navigating an elongated room using the CSCG learned from the smaller room, the internal states will reliably signal end-of-room states when the agent is near the boundaries of the new room. This effectively creates two loci for sequential contexts. The same clone that fired in the sequential context corresponding to a specific location in the original room will now fire at two different locations due to the splitting of the sequential contexts in the elongated room, as reflected in the remapped responses of clones 161 and 748 in Fig. 3A. In contrast, the response of clone 314 does not remap and remains the same in all four rooms. This is because this neuron’s sensory input already includes part of the boundary, and also because the sequence it represents has shorter undifferentiated segments from the boundary, making it strongly anchored. 
Although these results were originally characterized as boundary vector coding, our results show that the major findings of (27) can be explained using sequence representation without using geometric concepts. As we describe later, the sequence perspective also naturally explains the temporal dependence of the remapped place fields. Note that remapping relies on changes in the environment being superposed on a previously learned graph without any new learning. Of course, with further training in the new environment, the remapping will diminish because new place fields representing the new environment will develop with more experience in that environment.

Fig. 3. CSCG reproduces several place field remapping phenomena.


(A) A CSCG was first trained in a small square (SS) room (i). (ii) Place fields were computed in the SS, horizontal rectangle (HR), vertical rectangle (VR), and large square (LS) rooms. Notably, clone 314 maintained a consistent place field anchored to the top-left corner, while clones 161 and 748 exhibited field splitting when the room is elongated along the horizontal and vertical axes, respectively. (iii) The CSCG also replicated directional place fields reported in (27). (B) When trained in a circular room with a cue, the CSCG demonstrated that place fields also rotate when the cue is rotated. For most clones, the place fields disappear when a barrier is introduced. (C) Trained in a rectangular layout with a landmark, the CSCG exhibited place fields with two components in the modified layout: one at the original location and another at a vector displacement from the new landmark location. (D) A CSCG was trained on a linear track using outbound and inbound walks. Place fields were computed using trials with different starting positions for the outbound trajectories and different end positions for the inbound trajectories. Most clones coded distance from the start box, while others were anchored to the end box. (E) A CSCG was trained in a rectangular maze similar to (8), with training trials comprising three laps followed by a reward at the end. During test trials, the reward was shown at the end of four laps. Place fields computed with training trials show that there are different clones that are maximally active for different laps. In test trials, lap three clones are substantially active in both the third and fourth laps, reflecting the absence of a reward in the third lap.

The classic Muller and Kubie experiments (26) showing a variety of remapping phenomena can also be explained using CSCG, which we illustrate in Fig. 3B. In these experiments, rats were trained in a circular arena with a cue card placed on the wall. Researchers found a variety of remapping phenomena with respect to rotation of the cue card and the introduction of opaque or transparent barriers. To investigate these phenomena, we first trained a CSCG in a circular arena with a cue card at the 12 o’clock position. In this environment, the differentiated sequential contexts will develop with reference to the cue card. When we computed place fields with this CSCG in an arena where the cue was rotated, the place fields also rotated accordingly because they are always referenced to the context and not the absolute location. Placing a barrier in the arena has two effects that destroy the place fields of some clones. One effect is that the barrier prevents the agent from taking some trajectories that are important for revealing the relevant sequential contexts for some clones. The second is that the presence of the barrier can change the visual sensation in its vicinity. Both these effects combine to explain why a place field is disrupted when a barrier is placed through its center but is unaffected when the barrier is far away.

CSCGs also explain why place cells can be seen as encoding a vector relationship to local landmarks (3). Just like cue cards, or boundaries, landmark objects placed in an environment act as disambiguating contexts with respect to which sensations at other locations are encoded. Thus, when a landmark is moved, some of the sequential contexts also move in reference to that landmark. We illustrate this landmark vector remapping phenomenon in Fig. 3C. We first trained a CSCG in a rectangular layout with a landmark on one side of the room. We computed place fields in this layout as well as a modified version in which the landmark was moved to a different location. In the modified layout, the place fields now have two components—one at the same location as in the original layout, and the second at the same relative displacement from the new location of the landmark.

In more recent experiments (28), rats were trained on outbound and inbound traversals on a linear track that could be changed in length. Responses to the appropriate sequential contexts in a CSCG naturally explain the remapping of place fields observed as the track length varies. To demonstrate this, we first trained a CSCG on a linear track of length 18 steps using both outbound (left to right) and inbound (right to left) walks. We then computed place fields separately on outbound and inbound trajectories for various track lengths (Fig. 3D). We observed that most clones coded for distance from the starting position. The place fields gradually widened with distance from the starting position, reflecting the growing uncertainty in the distance traveled. There were also clones anchored to the end point of the trajectories.

Sequence representation can explain puzzling phenomena that mix spatial and temporal effects

Sequential contexts naturally explain the direction sensitivity of place field remapping reported in (27). When the room is elongated, some place fields that were unimodal in the original room remapped to produce two peaks, corresponding to two subcomponents in the elongated room. It was observed that these peaks were direction sensitive: The left subcomponent was active during rightward travel and vice versa. We tested the CSCG for the same effects using the same settings as in Fig. 3A, by plotting the fields conditioned on the direction of travel. In the horizontal rectangle room, rightward and leftward trajectories of the agent strongly activated the left and right peaks of the place field, respectively, as shown in Fig. 3Aiii. This is because only one of the sequential contexts that activate a clone occurs in a directional walk, which is a natural consequence of representing locations using sequential contexts. In contrast, a purely geometric model like the boundary vector model (41) does not offer an explanation for the direction sensitivity of place field remapping.

CSCG can also explain recently discovered phenomena like event-specific rate remapping (ESR) cells (8), which signal a combination of location and lap number for different laps around a maze, without postulating special coding mechanisms. Figure 3E shows a similar setting to an experiment in (8) where a rat runs multiple laps in a looping rectangular track before receiving a reward. We trained a CSCG on trials comprising three laps of a rectangular track with a reward state at the end of the third lap. A CSCG exposed to the sequence of observations from such trials learned to distinguish the laps and to predict the reward at the end of the third lap, without the help of any explicit lap-boundary markers in the training sequence. This is reflected in the place fields of the clones for the training trials (left panel in Fig. 3E)—each clone is maximally active for an observation when it occurs in its specific lap. However, each clone also shows weak activations when its corresponding observation is encountered in other laps, a signature of ESR. This occurs naturally in the CSCG due to probability smoothing and the inference dynamics. CSCGs can also explain the remapping of ESR cells. We computed place fields on test trials comprising four laps, instead of three, in which the reward was at the end of the fourth lap. The lap three clones were strongly activated in both the third and the fourth laps, reflecting the change when the reward state is reached (right panel in Fig. 3E).

CSCG can predict what kinds of environmental changes lead to remapping

CSCGs show that environmental connectivity changes need not lead to place field remapping even when the agents’ behavior shows adaptation to the change, a phenomenon that researchers found puzzling. In (31), rats ran in a four-room maze where the doors connecting the rooms could be selectively locked to change the connectivity of the arena. The rat’s behavior reflected that it recognized the connectivity changes of the environment, but the place fields did not remap in response to these connectivity changes. The authors found this lack of remapping puzzling and argued that place cells do not encode a topological map. However, CSCGs show that place cells can encode global location in their activations, global topology in the cell-to-cell connectivity, and still not show remapping in response to the manipulations in (31).

To demonstrate this, we trained a CSCG using a random walk in an environment comprising four square rooms that are connected by two-way doors, similar to the experimental setting in (31). Each room had visual cues that distinguished it from the other rooms. CSCG learned the global topology of the arena in the transition matrix, and the activation of clones corresponded to locations, as in previous experiments (Fig. 4A, top row). We then tested for two environmental modifications used in (31): (i) One door was locked both ways effectively creating a blockade, and (ii) all doors were locked in one way allowing only an anti-clockwise direction of traversal in the environment. The corresponding modifications were made in the CSCG transition matrix by modifying the connections appropriately, and planning routes in this modified CSCG corresponded to the reported successful navigation. We then computed place fields using the appropriately modified CSCGs paired with the arena connectivity changes, and compared these to the fields from the original CSCG in the original arena. In Fig. 4A, we show that the place fields were the same across all three settings, consistent with the observations in (31).

Fig. 4. CSCG reproduces various observations about place cells such as place field repetition, size, and shape variations.

Fig. 4.

(A) A CSCG trained in a layout with interconnected square rooms showed consistent place fields across settings with locked doors. (B) A CSCG was first trained on a layout comprising two visually identical rooms in the same orientation connected by a corridor. Place fields were computed in a layout where the two rooms were rotated such that their orientations differed by 180°. We also considered a second modification, where we introduced an asymmetry in the connection between the two rooms. In all three layouts, we observed place field repetition across the two rooms, in contrast to the findings in (30), where place field repetition was observed only in the same orientation layout. (C) However, when we retrained the CSCG in the layout with asymmetric connectivity, we observed that the place field repetition disappeared. (D) CSCG reproduces direction-dependent place field repetition in a hairpin maze as observed in (30). (E) We trained CSCGs on square rooms with uniform interiors of three different sizes. We observed that place fields at the edge and center of the room elongate and enlarge as the room size increases, while place fields anchored to a corner remain consistent across sizes. (F) A CSCG trained on the checkerboard room has more expanded place fields compared to one trained on a room with a random pattern. (G) In a room elongation experiment, we initially trained a CSCG on a rectangular room with local landmarks on one side. Place fields were computed in both the training room and an elongated room where landmarks remained in the same position. Place fields anchored to the landmarks remained consistent in both layouts, while those farther from the landmarks expanded in the elongated layout.

The reason for the lack of remapping can be understood by realizing that the connectivity change blocked paths without any change in the visual cues. The blocked path affected only a few of the potential sequences that were responsible for that place field, a change that is too small to be reflected in the aggregated sequential responses. However, the connectivity change can still lead to large changes in behavior, for example, in navigation between the two rooms. In CSCGs, those changes will be reflected in the replay messages used for planning and in the computed shortest paths.
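This dissociation can be illustrated with a small sketch. The code below is our own illustration, not the authors' implementation: the toy ring graph, the "door" edge, the threshold, and all function names are hypothetical. It collapses an action-augmented transition tensor into a latent adjacency graph, locks one door edge in the model, and recomputes the shortest path:

```python
import numpy as np
from collections import deque

def transitions_to_adjacency(T, thresh=1e-3):
    """Collapse an action-augmented transition tensor T[a, i, j] into a
    directed adjacency matrix over latent states (clones)."""
    return T.max(axis=0) > thresh

def shortest_path(adj, start, goal):
    """Breadth-first search over the latent graph, a stand-in for the
    replay-message planning described in the text."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in map(int, np.flatnonzero(adj[node])):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # goal unreachable after the blockade

# Toy latent graph: a ring of 6 clones plus one "door" edge 0 -> 3.
n = 6
T = np.zeros((1, n, n))
for i in range(n):
    T[0, i, (i + 1) % n] = 1.0
T[0, 0, 3] = 1.0

adj = transitions_to_adjacency(T)
print(shortest_path(adj, 0, 3))          # via the door: [0, 3]

adj_locked = adj.copy()
adj_locked[0, 3] = False                 # lock the door in the model only
print(shortest_path(adj_locked, 0, 3))   # detour: [0, 1, 2, 3]
```

Place fields aggregated from clone activations need not change between the two graphs; only the planned routes differ, which parallels the behavioral dissociation discussed above.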

While the explicit latent graph representation of CSCG allows for rapid structural modification of an agent's internal model in response to environmental changes, the changes themselves were made manually by the experimenter in the current results. Making these modifications automatically and dynamically in response to surprise would result in a more integrated model and is an area for future research.

Sequence learning explains place field repetition, size, and shape variations

Place fields distort along the boundaries and increase in size systematically toward the center of an empty arena (43). In very elongated rooms, place fields have multiple lobes. In some settings, place fields are known to repeat in identical rooms (29, 44). While all these phenomena appear to be spatial, CSCGs provide cogent explanations for these in terms of sequence learning: All of them result from state aliasing due to the difficulty in creating different latent states for temporal contexts that are identical for a large number of steps.

To demonstrate place field repetition in visually identical environments, we trained a CSCG in a layout comprising two visually identical rooms in the same orientation and connected by a corridor, as shown in Fig. 4B, similar to the setting in (29). Place fields computed in this layout show repetition, i.e., clones are active at the same location in both rooms. We also considered a layout in which the two rooms were abutted by rotating them such that their orientations differed by 180°. In (29), it was reported that place field repetition disappears in the different orientation setting. This was attributed to the rats potentially being able to maintain their inertial angular orientation. With CSCGs, in the absence of an external “compass,” we observe that place field repetition persists in the modified layout, even after the introduction of an asymmetric connection between the two rooms. However, when the CSCG was retrained on the different orientation setting with an asymmetric connection between the rooms, it was able to partially split contexts in the two rooms. This resulted in unique place fields for most clones, as shown in Fig. 4C. If the sensory input to CSCG is augmented with an external head direction input, then the different orientation setting results in unique place fields in CSCGs, similar to what is observed in (29).

In Fig. 4D, we reproduce the direction-dependent place field repetition reported in (31). We trained a CSCG on a hairpin maze, with distinct end markers and a cue to distinguish top and bottom walls, using left to right (LR) and right to left (RL) walks. Place fields computed from this CSCG using only LR or RL walks reveal direction-dependent place field repetition, as shown in Fig. 4D. For example, clone 42 is activated at the same location in multiple segments of the maze, but only in the LR traversal. The top and bottom walls of the maze have different observations, which provides the CSCG enough context to disambiguate the two directions of travel. However, for each direction of traversal, the observations are the same in all segments of the maze, except the ends, resulting in the repetition of place fields.

To study the effect of room size on place fields (43), we trained three different CSCGs on square rooms with uniform interiors, of side lengths 7, 9, and 11. As an agent moves away from the boundaries to the center of an empty room, different sequential trajectories start to look the same, making it difficult for the learning algorithm to split the contexts into different clones. This results in the same clone representing more contexts than it would in the periphery of the room where contexts can be easily distinguished. In place field mapping, this will appear as an enlargement of the place fields in the center of the room (“Center” column in Fig. 4E). Similarly, the observations along the edge of a room might not all develop into distinct clones, resulting in multiple observations along the edge being aliased into the same clone. This aliasing, due to the repetition of the same evidence along the edge, will appear as an elongation of the place field (“Edge” column in Fig. 4E).

Place field size expansion (43) in an empty arena happens for the same reason as place field repetition in two identical iso-oriented rooms. Both can be explained by the inability of the model to split very long-term temporal contexts into distinct latent clones with the given amount of training. (Of course, longer training will partially overcome this problem, which is observed in animals as well.) In that sense, larger place fields are the same as place field repetition, just happening in adjacent locations.

Testable predictions made by CSCGs

CSCGs can also make experimentally testable predictions for yet-to-be-observed phenomena. One such prediction is the following: What controls how place fields change is not the rate of visual change but the uniqueness of the visual context. To demonstrate this, we trained two CSCGs on square rooms with checkerboard and random patterns on the floor, respectively. We observed that the place fields in the checkerboard room were more expanded, as shown in Fig. 4F. This is because the same context repeats throughout the interior of the room, making it difficult for the learning algorithm to split the contexts into different clones.

CSCGs provide a mechanistic explanation for when and why place fields remap globally or partially. The answer: Place cell responses are driven by their sequential contexts, and changes that substantially affect the sequential context of a neuron are what determine when and how its field will remap. In the abstract, sequential context for a sensory observation can be thought of as the history of sensory observations that can predict it. The CSCG model makes this abstract definition concrete and measurable: Sequential context for a latent state is the set of temporal trajectories of latent states that lead to it. Any change that makes the same sequential context occur in different parts of the room will result in that field partially appearing in the new place. The organization and specificity of local context driving the responses of a cell will have a notable impact on its remapping. A cell that is tuned to sequences in the middle of a uniform room will have its place fields anchored by the boundaries that are relatively more unique, causing the field to remap when the boundaries are moved. However, if the cell had some other local cues, for example, markings on the floor, that would provide it a unique sequential context, then the cell’s field will not remap when the boundary is moved. In Fig. 4G, clone 15 and clone 128 are two cells from the CSCG trained in the training layout. When the room is elongated, the place field of clone 128 expands. This is because the local sequential context for clone 128 was anchored by the cyan landmark on the left and the boundary on the right. These partial contexts occur in two different places in the elongated room. In contrast, the local sequential context for clone 15 is anchored by the blue landmark on its left and the cyan landmark on its right, and those did not change when the room was elongated.
This means that, locally, clone 15 will see the same sequential contexts after room elongation, resulting in a lack of remapping in its place field.
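This definition of sequential context can be made concrete with a minimal sketch (ours, using a hypothetical toy transition matrix): the latent trajectories of length k that lead into a given clone can be read directly off a learned transition matrix.

```python
import numpy as np

def contexts_leading_to(T, clone, k):
    """Enumerate latent trajectories (z_{n-k}, ..., z_n = clone) that have
    nonzero probability under the learned transition matrix T[i, j]."""
    paths = [[clone]]
    for _ in range(k):
        # extend each path one step backward through possible predecessors
        paths = [[int(prev)] + p
                 for p in paths
                 for prev in np.flatnonzero(T[:, p[0]])]
    return paths

# Toy latent graph: two routes (0 -> 1 -> 3 and 2 -> 1 -> 3) converge on
# clone 3, so clone 3 is reached from two distinct sequential contexts.
T = np.zeros((4, 4))
T[0, 1] = T[2, 1] = T[1, 3] = 1.0
print(contexts_leading_to(T, clone=3, k=2))  # → [[0, 1, 3], [2, 1, 3]]
```

A change to the environment remaps a clone's field exactly when it alters or relocates the trajectories this enumeration returns.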

DISCUSSION

The discovery of place cells is a striking success of hippocampus research, and place field mapping has served as a valuable tool in revealing the representational properties of neurons in the hippocampus. However, anomalies have been accumulating against the simple view that place cells represent just locations (8, 9, 32, 45, 46). Place fields distort around boundaries and split along trajectories. They are direction sensitive (32, 34) and can even represent the lap count in running loops (8). In some cases, insertion of a barrier through a place field clearly disrupts it, suggesting that fields are related to the connectivity of the underlying environment (26). Yet, place fields can remain unchanged when the connectivity of the environment changes without any visible cues (31). If the environmental change is not reflected in place fields, how are the rats able to change their behavior in the new environment? In summary, many of these questions about the role of place cells—what they represent, how those representations are learned, how they are used, and how they change with respect to environmental manipulations—remain unanswered in the location-centric description of hippocampal neurons. In contrast, the sequence-centric paradigm (45) we develop in this paper resolves these anomalies by re-interpreting space as a sequence. Whereas the Kantian view takes space and time as a priori, our work offers computational support for the Leibnizian view that only sequential ordering needs to be a priori, opening up interesting questions regarding innate representations and core knowledge for cognitive science and artificial intelligence (AI).

Our current work required substantial advances over the previous work that introduced the basic CSCG model (15). The previous work was entirely in an allocentric setting that hardcoded action semantics in a global coordinate system, which is insufficient to address how spatial representations emerge from egocentric actions and sensations. Moreover, George et al. (15) dealt only with idealized environments with discrete ordinal sensations. Additionally, George et al. (15) did not realize how the Euclidean place field mapping methodology is related to the different reported phenomena. Generalizing to an egocentric setting and to continuous high-dimensional signals from visual perception in 2D and 3D environments with arbitrary topologies enabled the coverage of experiments and led to the central insights of our current work: Place is a sequence, and the place field mapping methodology is a source of anomalies. By improving the model to learn from realistic data in complex 3D environments, ideas in this paper have also become relevant for AI, addressing longstanding questions about episodic experience, memory, cognitive maps, and planning.

While CSCG draws upon many past and contemporary models of hippocampus (47), it is markedly different in many aspects. In contrast to temporal context models (48) that accumulate sequential context in the observation space, the sequential representation in CSCG is in the latent space, giving it the ability to model more complex and long-duration temporal dependencies. The ability of CSCG to represent locations as sequences crucially depends upon having a latent representation. Although successor representations (49) can model temporal relations, they are not directly applicable in the aliased settings we consider here. Since the successor representation assumes that sensations directly correspond to locations, it cannot explain how spatial representation emerges from egocentric sensations in the partially observable settings that animals encounter in the real world. The memory compression model (50) focuses on compressing instantaneous inputs by exploiting correlations between them. However, their experiments are in a fully observable environment where the locations can be uniquely determined directly from the sensations, and sequence learning is not part of the model. Combined with a discretization step, this model could play the role of a more sophisticated vector quantizer that can feed into a model like CSCG.

Contemporary work on Tolman-Eichenbaum machines (TEMs) (51) has many similarities to CSCG in inspiration. However, unlike CSCG, TEMs do not learn latent graphs in aliased settings like ours. Instead, TEMs focus on learning general transitivity rules applicable to a single graph from multiple noisy realizations of that graph. Moreover, TEMs do not deal with multiple graphs at the same time (52), nor do they perform latent transitive stitching. In the context of learning spatial representations, TEMs have so far been demonstrated only in allocentric settings with a global coordinate system, and they rely on the hard-coded loop-closing semantics of allocentric actions (e.g., north-east-south-west closes the loop) to learn the representations, which does not explain how spatial representations can arise from egocentric sensations and actions. A TEM is formulated purely as a predictive model, and its internal representation does not learn a modifiable graph that corresponds to the environment. Therefore, a TEM does not have the same ability as a CSCG to deal with dynamic environments quickly by changing its graph connectivity or to form hierarchies through community detection (53) on the latent graph as demonstrated in (15).

Unlike other computational models of place fields, CSCGs do not use grid fields to learn place fields and still explain varied remapping phenomena. Recent experimental evidence suggests that grid cells are not necessary for the learning (54, 55) or continued functioning (55) of place cells. If grid cell outputs are available, CSCG can use them as additional sensations. This would speed up learning in the middle portions of empty arenas where unique sensations are not available, and it would also help stabilize the place fields away from the boundaries or other landmark cues (56, 57), consistent with the idea of grid cells providing an optional scaffolding for place cells (58).

All the phenomena replicated in this work were robust without the need for careful handcrafting on the part of the experimenter. Given sufficient capacity in the number of clones, CSCG learning was robust and converged to the ground-truth graph of the environment from random initializations. Only one experiment—the lap running experiment—required multiple random initializations to recover the ground-truth graph, and we suspect that it is due to the extreme degeneracy of observations in the repeated laps. Moreover, since all the neurons and their connections in the model are observed, and since their representations are directly interpretable in relation to the environment, the relevant neurons for analysis can be identified directly by inspection rather than through sampling and population-level analysis. However, our setting can also be a good test bed for studying sampling effects and population-level metrics. This is left for future work. Given that the graph for an arena is learned faithfully, locations can be decoded from the current most likely hidden state, and this decoding is robust to noisy or missing observations or actions.

Another under-explored property of the CSCG is its ability to learn multiple maps of different fidelities and task contexts in the same model, and to dynamically switch between them based on context, just like switching between the maps of different arenas. The learning dynamics of EM usually result in a latent representation that is a superposition of multiple graphs, which we then consolidated into a single graph using Viterbi-EM. Having multiple maps of the same arena (59) is compatible with the CSCG representation and could provide explanations for more phenomena, such as the simultaneous presence of unidirectional and bidirectional place fields in linear tracks (60). This is left for future work.

The most important message from our work is that many of the diverse fascinating hippocampal phenomena might be artifacts of Euclidean place field mapping. Hippocampal cells are usually interpreted by plotting their responses onto a 2D map corresponding to the environment, collapsing the sequential responses into a static place field. Characterizing place field maps in terms of Euclidean concepts is akin to characterizing the effects rather than the underlying causes and might be the source of new phenomena. Often, these new phenomena are explained away by invoking familiar, but ultimately unsatisfactory, answers like distributed coding or mixed selectivity. These answers are unsatisfactory because instead of answering the questions they just shift the questions elsewhere. Our experiments show that phenomena that look extremely different—for example, place field expansion in a uniform room and event-specific responses in lap running—can have the same underlying explanation, which can be understood through the sequence learning model. By maintaining the same underlying model and altering the environments to align with neuroscience experiments, we demonstrate how distinct phenomena emerge solely from those environment changes through the application of the same learning and inference mechanisms. Rather than treating place field fidelity as something the hippocampus is trying to achieve, our examples serve to demonstrate why place field mapping should be treated as just a visualization tool that the experimenter has at their disposal. We hope that this opens up a new avenue of exploration that takes us away from the familiar questions centered on encoding and decoding locations.

Much remains to be explored on this new path we have struck out on. To demonstrate the viability of our theory, we selected a subset of the results from each of the experiments that we judged to be adequately representative. However, more can be done along this path. For example, we did not test the model for remapping in the presence of new objects as in (3), although the model offers a potential explanation via probabilistic inference. We have only briefly touched upon replay-based planning and schemas (61), and both can be expanded in future research. Place fields in animals develop with considerably less experience than the amount required in our experiments, and schemas could be crucial for explaining this gap by giving a mechanism for abstracting and transferring prior experience. Our work can also be expanded in the direction of active learning and inference (62, 63). Reward mechanisms can be layered on top of CSCG. CSCGs have the ability for temporal abstractions via community detection (53) on the underlying graphs, an idea worthy of more exploration. Our current models are learned using random walks. More efficient exploration techniques can potentially be developed as active learning on CSCGs. We hope that our work gives a concrete tool that would help hippocampal researchers think beyond the place field paradigm.

METHODS

Expectation-maximization learning of CSCGs

Cloned HMMs, first introduced in (14), are a sparse restriction of overcomplete HMMs (18) that can overcome many of the training shortcomings of dynamic Markov coding (64). Similar to HMMs, cloned HMMs assume the observed data x ≡ {x_1, …, x_N} are generated from a hidden process z ≡ {z_1, …, z_N} that obeys the Markovian property

P(x, z) = P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n) \prod_{n=1}^{N} P(x_n \mid z_n)    (3)

Here, P(z_1) is the initial hidden state distribution, P(z_{n+1} ∣ z_n) is the transition probability from z_n to z_{n+1}, and P(x_n ∣ z_n) is the probability of emitting x_n from the hidden state z_n.

In contrast to HMMs, cloned HMMs assume that each hidden state maps deterministically to a single observation. Further, cloned HMMs allow multiple hidden states to emit the same observation. All the hidden states that emit the same observation are called the clones of that observation.

CSCGs build on top of cloned HMMs by augmenting the model with the actions of an agent. In this section, we first review the expectation-maximization learning of cloned HMMs, before describing the learning of CSCGs for both discrete and continuous observations.

Expectation-maximization learning of cloned HMMs

The standard algorithm to train HMMs is the expectation-maximization (EM) algorithm (65), which in this context is known as the Baum-Welch algorithm. The Baum-Welch algorithm uses forward and backward message passing to compute posterior marginals of the latent variables given a sequence of observations/emissions during the expectation step. Learning a cloned HMM using the Baum-Welch algorithm requires a few simple modifications: The sparsity of the emission matrix can be exploited to only use small blocks of the transition matrix both in the expectation (E) and maximization (M) steps.

Learning a cloned HMM requires optimizing the vector of prior probabilities π, with π_k = P(z_1 = k), and the transition matrix T, with T_{ij} = P(z_{n+1} = j ∣ z_n = i). To this end, we assume that the hidden states are indexed such that all the clones of the first emission appear first, all the clones of the second emission appear next, etc. Let N_obs be the total number of emitted symbols. The transition matrix T can then be broken down into smaller submatrices T(u, v) indexed by emissions u, v ∈ {1, …, N_obs}. The submatrix T(u, v) contains the transition probabilities P(z_{n+1} ∣ z_n) for z_n ∈ C(u) and z_{n+1} ∈ C(v), where C(u) and C(v) correspond to the clones of emissions u and v, respectively.

The standard Baum-Welch equations can then be expressed in a simpler form in the case of cloned HMMs. The E-step recursively computes the forward and backward probabilities and then updates the posterior probabilities. The M-step updates the transition matrix via row normalization.

E-step

\alpha(1) = \pi(x_1), \quad \alpha(n+1) = T(x_n, x_{n+1})^\top \alpha(n)    (4)
\beta(N) = \mathbf{1}(x_N), \quad \beta(n) = T(x_n, x_{n+1})\, \beta(n+1)    (5)
\xi_{uv}(n) = \frac{\alpha(n) \circ T(u, v) \circ \beta(n+1)^\top}{\alpha(n)^\top T(u, v)\, \beta(n+1)}    (6)
\gamma(n) = \frac{\alpha(n) \circ \beta(n)}{\alpha(n)^\top \beta(n)}    (7)

M-step

\pi(x_1) = \gamma(1)    (8)
T(u, v) = \frac{\sum_{n=1}^{N-1} \xi_{uv}(n)}{\sum_{v'=1}^{N_{\mathrm{obs}}} \sum_{n=1}^{N-1} \xi_{uv'}(n)}    (9)

where ∘ and ⊘ denote the element-wise product and division, respectively (with broadcasting where needed). All vectors are M × 1 column vectors, where M is the number of clones per emission. We use a constant number of clones per emission for simplicity here, but the number of clones can be selected independently per emission. π(x_1) refers to the portion of the prior probability vector corresponding to the clones of the emission x_1. Similarly, 1(x_N) is an all-ones vector with length equal to the number of clones of emission x_N. Cloned HMMs exploit the sparsity pattern in the emission matrix when performing training updates and inference, and achieve considerable computational savings when compared with HMMs.
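The block-structured recursions above (Eqs. 4 to 9) can be sketched in numpy. This is our illustrative reimplementation, not the authors' code: it normalizes messages per step for numerical stability, uses a toy aliased sequence, and fixes M clones per symbol; all names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def cloned_hmm_em(x, n_obs, M, n_iter=50, pseudocount=1e-10):
    """Baum-Welch for a cloned HMM, exploiting the clone-block structure:
    hidden states are ordered so clones of symbol u occupy the block
    u*M:(u+1)*M, and only the submatrix T(x_n, x_{n+1}) is touched at
    each step (Eqs. 4 to 9)."""
    S = n_obs * M
    blk = lambda u: slice(u * M, (u + 1) * M)
    T = rng.random((S, S))
    T /= T.sum(axis=1, keepdims=True)
    pi = np.full(S, 1.0 / S)
    N = len(x)
    for _ in range(n_iter):
        # E-step: forward/backward messages over clone blocks (Eqs. 4, 5)
        alpha = np.zeros((N, M))
        beta = np.zeros((N, M))
        alpha[0] = pi[blk(x[0])]
        for n in range(N - 1):
            a = alpha[n] @ T[blk(x[n]), blk(x[n + 1])]
            alpha[n + 1] = a / a.sum()          # normalized for stability
        beta[N - 1] = 1.0
        for n in range(N - 2, -1, -1):
            b = T[blk(x[n]), blk(x[n + 1])] @ beta[n + 1]
            beta[n] = b / b.sum()
        # expected transition counts, i.e., the xi statistics (Eq. 6)
        counts = np.zeros((S, S))
        for n in range(N - 1):
            xi = np.outer(alpha[n], beta[n + 1]) * T[blk(x[n]), blk(x[n + 1])]
            counts[blk(x[n]), blk(x[n + 1])] += xi / xi.sum()
        # M-step: row-normalize smoothed counts (Eq. 9 with a pseudocount)
        counts += pseudocount
        T = counts / counts.sum(axis=1, keepdims=True)
        gamma0 = alpha[0] * beta[0]
        pi = np.zeros(S)
        pi[blk(x[0])] = gamma0 / gamma0.sum()   # Eq. 8
    return pi, T

# Toy aliased sequence: symbol 1 occurs in two contexts (0->1->2 and
# 3->1->4); two clones per symbol suffice to disambiguate them.
x = [0, 1, 2, 3, 1, 4] * 30
pi, T = cloned_hmm_em(x, n_obs=5, M=2)
```

With M = 2, EM can assign distinct clones of symbol 1 to the two contexts, the same mechanism behind the splitter-like responses described in the Results.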

CSCGs: Action-augmented cloned HMMs

CSCGs are an extension of cloned HMMs in which an action happens at every time step (conditional on the current hidden state) and the hidden state of the next time step depends not only on the current hidden state but also on the current action. The joint probability density function of the observations and the actions is given by

P(x, a) = \sum_{z_1 \in C(x_1)} \cdots \sum_{z_N \in C(x_N)} P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n, a_n)\, P(a_n \mid z_n)    (10)

and the standard cloned HMM can be recovered by integrating out the actions.

We group the actions with the next hidden state to remove loops and create a chain that is amenable to exact inference (i.e., similar to a standard HMM, we can exactly compute the required posterior marginal distribution of the latent variables given a sequence of observations). We can write the action-conditioned joint distribution as

P(x \mid a) = \sum_{z_1 \in C(x_1)} \cdots \sum_{z_N \in C(x_N)} P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n, a_n)    (11)

Learning a CSCG requires optimizing the vector of prior probabilities π, with π_k = P(z_1 = k), and the action-augmented transition tensor T, with T_{ijk} = P(z_{n+1} = k ∣ z_n = j, a_n = i). Similar to cloned HMMs, we can break the action-augmented transition tensor T into smaller submatrices T(u, v, w), indexed by u ∈ {1, …, N_actions} and v, w ∈ {1, …, N_obs}. The submatrix T(u, v, w) contains the transition probabilities P(z_{n+1} ∣ z_n, a_n = u) for z_n ∈ C(v), z_{n+1} ∈ C(w), where C(v) and C(w) correspond to the clones of emissions v and w, respectively. All the previous considerations about cloned HMMs apply to CSCGs, and the EM equations for learning are also very similar:

E-step

$$\alpha(1) = \pi(x_1), \qquad \alpha(n+1) = T(a_n, x_n, x_{n+1})^\top \alpha(n) \tag{12}$$
$$\beta(N) = \mathbf{1}(x_N), \qquad \beta(n) = T(a_n, x_n, x_{n+1})\, \beta(n+1) \tag{13}$$
$$\xi_{uvw}(n) = \frac{\left(\alpha(n)\, \beta(n+1)^\top\right) \circ T(u, v, w)}{\alpha(n)^\top T(u, v, w)\, \beta(n+1)}, \qquad u = a_n \tag{14}$$
$$\gamma(n) = \frac{\alpha(n) \circ \beta(n)}{\alpha(n)^\top \beta(n)} \tag{15}$$

M-step

$$\pi(x_1) = \gamma(1) \tag{16}$$
$$T(u, v, w) = \frac{\sum_{n=1}^{N-1} \xi_{uvw}(n)}{\sum_{w'=1}^{N_{\text{obs}}} \sum_{n=1}^{N-1} \xi_{uvw'}(n)} \tag{17}$$

In (15), it was observed that the convergence of EM for learning the parameters of a CSCG can be improved with a smoothing parameter called the pseudocount. The pseudocount is a small constant added to the accumulated count statistics $\sum_{n} \xi_{uvw}(n)$, which ensures that every transition under every action has nonzero probability, so the model never assigns zero probability to any sequence of observations at test time.
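One full EM iteration (Eqs. 12 to 17, with the pseudocount applied in the M-step) can be sketched as follows. This is an illustrative NumPy implementation under our own conventions (contiguous clone blocks, a constant M clones per emission, actions observed at every step), not the authors' released code.

```python
import numpy as np

def cscg_em_step(T, pi, xs, actions, M, pseudocount=1e-3):
    """One EM iteration for a CSCG. T: (n_actions, Z, Z) action-conditioned
    transition tensor; pi: (Z,) prior; xs, actions: observed sequences."""
    N = len(xs)
    sl = lambda v: slice(v * M, (v + 1) * M)
    # E-step: normalized forward/backward messages on clone blocks (Eqs. 12-13)
    alphas = [pi[sl(xs[0])].copy()]
    alphas[0] /= alphas[0].sum()
    for n in range(N - 1):
        a = T[actions[n], sl(xs[n]), sl(xs[n + 1])].T @ alphas[n]
        alphas.append(a / a.sum())
    betas = [None] * N
    betas[N - 1] = np.ones(M)
    for n in range(N - 2, -1, -1):
        b = T[actions[n], sl(xs[n]), sl(xs[n + 1])] @ betas[n + 1]
        betas[n] = b / b.sum()
    # Expected transition counts xi (Eq. 14), accumulated per (action, v, w)
    counts = np.zeros_like(T)
    for n in range(N - 1):
        blk = T[actions[n], sl(xs[n]), sl(xs[n + 1])]
        xi = np.outer(alphas[n], betas[n + 1]) * blk
        counts[actions[n], sl(xs[n]), sl(xs[n + 1])] += xi / xi.sum()
    # M-step with pseudocount smoothing (Eq. 17)
    counts += pseudocount
    return counts / counts.sum(axis=2, keepdims=True)
```

In practice, this step is iterated until the log likelihood stops improving.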

Learning the emission matrix with transitions fixed

With a CSCG, transfer learning between different environments can be accomplished by keeping its transition probabilities T fixed and learning the emission matrix E associated with its nodes in the new environment. If we additionally know that the new environment preserves the emission structure, we can restrict the learning of E so that all rows of E corresponding to the same observation in the original environment share the same parameters.

The EM algorithm can be used to learn the emission matrix as follows. The E-step recursively computes the forward and backward probabilities and then updates the posterior probabilities. The M-step updates only the emission matrix.

E-step

$$\tilde\alpha(n) = T(a_{n-1})^\top \alpha(n-1) \circ E(x_n), \qquad p_\alpha(n) = \sum_{k=1}^{|Z|} \tilde\alpha_k(n), \qquad \alpha(n) = \tilde\alpha(n)/p_\alpha(n) \tag{18}$$
$$\tilde\beta(n) = T(a_n)\left(\beta(n+1) \circ E(x_{n+1})\right), \qquad p_\beta(n) = \sum_{k=1}^{|Z|} \tilde\beta_k(n), \qquad \beta(n) = \tilde\beta(n)/p_\beta(n) \tag{19}$$
$$\gamma(n) = \frac{\alpha(n) \circ \beta(n)}{\alpha(n)^\top \beta(n)} \tag{20}$$

M-step

$$E(j) = \frac{\sum_{n=1}^{N} \mathbb{1}[x_n = j]\, \gamma(n)}{\sum_{n=1}^{N} \gamma(n)} \tag{21}$$

Note that the updates here involve ∣Z∣ × 1 vectors, where ∣Z∣ is the total number of hidden states in the model. $T(a_n)$ is the transition matrix for action $a_n$ and is of size ∣Z∣ × ∣Z∣, $E(x_n)$ is the column of the emission matrix corresponding to the emission $x_n$, and $\mathbb{1}[x_n = j]$ is an indicator function. The forward message is initialized as $\tilde\alpha(1) = \pi \circ E(x_1)$, and the backward message is initialized as a vector of all 1s. The emission matrix can be initialized randomly or with equal probabilities for all observations.

When the clone structure is to be preserved, the γ(n) term in the E-step is modified as follows. For each observation j ∈ {1, …, Nobs}, we set the posterior probability for all clones of j to be the same

$$\bar\gamma_k(n) = \sum_{l \in C(j)} \gamma_l(n) \qquad \forall\, k \in C(j) \tag{22}$$
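The emission-only EM loop of Eqs. 18 to 21 can be sketched as below. This is our own illustrative NumPy version (the clone-tying of Eq. 22 is omitted for brevity), assuming the full ∣Z∣-state messages described in the text.

```python
import numpy as np

def learn_emissions(T, pi, E, xs, actions, n_iters=10):
    """EM updates for the emission matrix E with transitions T held fixed.
    T: (n_actions, Z, Z); pi: (Z,); E: (Z, n_obs) row-stochastic."""
    Z, n_obs = E.shape
    N = len(xs)
    xs = np.asarray(xs)
    for _ in range(n_iters):
        # E-step: normalized forward/backward messages (Eqs. 18-19)
        alpha = np.zeros((N, Z))
        beta = np.zeros((N, Z))
        a = pi * E[:, xs[0]]
        alpha[0] = a / a.sum()
        for n in range(1, N):
            a = (T[actions[n - 1]].T @ alpha[n - 1]) * E[:, xs[n]]
            alpha[n] = a / a.sum()
        beta[N - 1] = 1.0
        for n in range(N - 2, -1, -1):
            b = T[actions[n]] @ (beta[n + 1] * E[:, xs[n + 1]])
            beta[n] = b / b.sum()
        gamma = alpha * beta                       # posterior marginals (Eq. 20)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: update only E (Eq. 21); small epsilon guards empty states
        denom = gamma.sum(axis=0) + 1e-12
        for j in range(n_obs):
            E[:, j] = gamma[xs == j].sum(axis=0) / denom
    return E
```

Each row of the learned E remains a valid distribution over observations.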

Learning the continuous CSCG

To extend CSCG to continuous observations, we introduced a new variable yn between the hidden state zn and the observation xn. To recap, the joint distribution of a sequence of actions and continuous observations is

$$p(x, a) = \sum_{z} \sum_{y} P(z_1) \prod_{n=1}^{N-1} P(z_{n+1} \mid z_n, a_n)\, P(a_n \mid z_n) \prod_{n=1}^{N} p(x_n \mid y_n)\, P(y_n \mid z_n) \tag{23}$$

where the observation model $p(x_n \mid y_n) = \mathcal{N}(x_n \mid \mu_{y_n}, \sigma^2 I)$ is parameterized as an isotropic Gaussian with variance $\sigma^2$ and mean $\mu_{y_n}$, the centroid associated with the discrete emission $y_n$. The emission model is parameterized by an emission matrix $E$ with elements $E_{ij} = P(y_n = j \mid z_n = i)$.

In a simple case, if we have a finite number of potential observations, we can allocate a centroid to each of them and set σ2 → 0. When the observations are discrete, the observation model becomes deterministic and we can set y = x. This exactly recovers the discrete CSCG from (15).

When the observation at each time step n is indeed a continuous vector xn (e.g., an image), we want to maximize the following log likelihood

$$\log p(x \mid a; T, \mu) = \log \sum_{z, y} P(z_1) \prod_{n=2}^{N} P(z_n \mid z_{n-1}, a_{n-1}) \prod_{n=1}^{N} P(y_n \mid z_n)\, \mathcal{N}(x_n \mid \mu_{y_n}, \sigma^2 I) \tag{24}$$

with respect to (w.r.t.) $T$, where $T_{ijk} = P(z_n = k \mid z_{n-1} = j, a_{n-1} = i)$, and $P(y_n \mid z_n)$ is 1 for $z_n \in C(y_n)$ and 0 otherwise.

Parameter optimization can be done via EM, where both the expectation and maximization steps are exact. Rather than learning both T and μ simultaneously, we find it simpler to proceed in two steps.

In the first step, we fix the prior and the transitions to be uniform, $P(z_1 = k) = T_{ijk} = 1/|Z|$, where $|Z|$ is the number of states in the hidden space, and learn $\mu$ only. The problem thus simplifies to maximizing

$$\begin{aligned} \log p(x \mid a; T = T_{\text{uniform}}, \mu) &= -N \log |Z| + \log \sum_{z, y} \prod_{n=1}^{N} P(y_n \mid z_n)\, \mathcal{N}(x_n \mid \mu_{y_n}, \sigma^2 I) \\ &= -N \log |Z| + \sum_{n=1}^{N} \log \sum_{y_n = 1}^{K} n_C(y_n)\, \mathcal{N}(x_n \mid \mu_{y_n}, \sigma^2 I) \\ &= \sum_{n=1}^{N} \log \sum_{k=1}^{K} \frac{n_C(k)}{|Z|}\, \mathcal{N}(x_n \mid \hat\mu_k, \sigma^2 I) \end{aligned} \tag{25}$$

w.r.t. $\mu$. The number of clones of a given centroid $\hat\mu_k$ is denoted by $n_C(k)$ (thus, $\sum_{k=1}^{K} n_C(k) = |Z|$). In the last equality, we collect all the Gaussians that are known to have the same mean (i.e., all the clones with the same centroid). The computation is more efficient, since it no longer scales with the total number of hidden states $|Z|$ but only with the number of distinct means $K$.

The astute reader will recognize Eq. 25 as the log likelihood of an isotropic Gaussian mixture model, which can be optimized greedily using K-means. The centroids $\hat\mu_k$ are the cluster centers, and $n_C(k)/|Z|$ corresponds to the prior probability of each center. In other words, to maximize $\log p(x \mid a; T = T_{\text{uniform}}, \mu)$ w.r.t. $\mu$, we simply run K-means on the input data $x$ (temporal ordering becomes irrelevant) and then assign to each centroid a number of clones proportional to the prior probability of that cluster. When using K-means, the value of $\sigma^2$ does not affect the optimization of the centroids. Once the centroids have been chosen, one can set $\sigma^2$ from the hard K-means assignments, for which the maximum likelihood estimate is closed form and simply matches the average distortion. Directly optimizing Eq. 25 w.r.t. $\mu$ and $\sigma^2$ using EM is also possible (and, in principle, more precise) but is more computationally expensive.

In the second step, we keep μ fixed as per the previous step and learn T by maximizing Eq. 24 w.r.t. T using EM. While it would be possible to learn μ and T simultaneously in the same EM loop, we find that proceeding in these two steps results in faster convergence without considerably deteriorating the quality of the found local optimum.

For learning T, we can apply here the same idea that converts EM for Gaussian mixture modeling into K-means. Instead of considering all the centroids as partially responsible (66) for each sample $x_n$, we assign all the responsibility to the dominating Gaussian. This is often a very precise approximation, since the dominating Gaussian typically takes most of the responsibility. In turn, this means that at each time step, only the clones of the centroid that is closest (in the Euclidean sense) can be active. With this change, the cost per learning iteration scales as $\mathcal{O}(N n_C^2)$ instead of $\mathcal{O}(N |Z|^2)$, where we have assumed a constant number of clones $n_C = n_C(k)$ for all centroids for ease of comparison. The observations then become effectively discrete, and learning T reduces to the procedure described for the standard CSCG.

Putting it all together, the final learning procedure can be summarized as follows:

1) Run K-means on the training data for a given number of centroids K.

2) Assign to each centroid a number of clones proportional to the prior probability of that centroid, as obtained by K-means.

3) Vector-quantize the data (train and test) by assigning to each data point the label of the closest centroid.

4) Learn the action-conditional transition matrix as if this was a discrete CSCG.
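Steps 1 to 3 of this recipe can be sketched as below. This is an illustrative NumPy version under our own conventions (a deterministic farthest-point initialization and plain Lloyd iterations stand in for whatever K-means variant one prefers); the returned labels are then fed to the discrete CSCG learner of step 4.

```python
import numpy as np

def quantize_with_clones(X, K, Z, n_iters=50):
    """Run K-means on data X (N, D), allocate clones per centroid in
    proportion to cluster priors, and vector-quantize the data.
    Returns (centroids, clone counts, discrete labels)."""
    X = np.asarray(X, dtype=float)
    # Step 1: K-means with deterministic farthest-point initialization
    centroids = [X[0]]
    for _ in range(1, K):
        d = np.min([((X - c) ** 2).sum(1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iters):  # Lloyd's algorithm
        d = ((X[:, None, :] - centroids[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(K):
            pts = X[labels == k]
            if len(pts):
                centroids[k] = pts.mean(0)
    # Step 2: clones proportional to cluster prior probabilities
    prior = np.bincount(labels, minlength=K) / len(X)
    n_clones = np.maximum(1, np.round(prior * Z)).astype(int)
    # Step 3: the argmin labels are the vector-quantized observations
    return centroids, n_clones, labels
```

On held-out data, quantization reuses the same centroids (nearest-centroid assignment), so train and test share one discrete alphabet.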

CSCG learning stages

In our experiments, CSCG learning is typically performed in two stages: (i) EM learning and (ii) Viterbi learning. We first learn the model using the EM algorithm, as described previously, allocating a sufficient number of clones. The EM learning stage is always followed by a Viterbi training stage (67). Viterbi training can be considered hard-decision EM, which uses max-product message passing instead of sum-product: it iteratively (i) applies the Viterbi decoding algorithm to obtain the most likely latent state sequence and (ii) uses that sequence to re-estimate the model parameters.
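One hard-EM (Viterbi training) iteration can be sketched as follows. This is our own toy NumPy version under the same conventions as before (contiguous clone blocks, constant M clones per emission); a real implementation would decode many sequences and accumulate counts across them.

```python
import numpy as np

def viterbi_train_step(T, pi, xs, actions, M, pseudocount=1e-3):
    """Max-product decode of the most likely clone sequence, then
    re-estimate T from hard transition counts along that single path."""
    N = len(xs)
    sl = lambda v: slice(v * M, (v + 1) * M)
    # Max-product forward pass in log space, with backpointers
    delta = np.log(pi[sl(xs[0])] + 1e-30)
    back = []
    for n in range(N - 1):
        blk = np.log(T[actions[n], sl(xs[n]), sl(xs[n + 1])] + 1e-30)
        scores = delta[:, None] + blk      # (source clone, dest clone)
        back.append(scores.argmax(0))
        delta = scores.max(0)
    # Backtrace the most likely local clone indices
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    path = path[::-1]
    # Hard M-step: count decoded transitions, smooth, renormalize
    counts = np.full_like(T, pseudocount)
    for n in range(N - 1):
        counts[actions[n], xs[n] * M + path[n],
               xs[n + 1] * M + path[n + 1]] += 1
    return counts / counts.sum(axis=2, keepdims=True)
```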

Soft EM in the first stage can result in multiple copies of the same graphs being learned in an interconnected and weighted manner, whereas the hard-EM (Viterbi) stage consolidates those copies. Evidence for coexistence of multiple maps representing the same environment using shared cells has been discovered in the mouse hippocampus (59).

Computing place fields of clones in CSCGs

Given a sequence of observations and actions, we define the activation of clone i at time n in a CSCG as the following marginal posterior probability

$$\rho_i(n) = P(z_n = i \mid x_1, \ldots, x_n, a_1, \ldots, a_{n-1}) \tag{26}$$

Since the CSCG model (with the action an−1 and hidden state zn collapsed in a single variable) forms a chain, inference on it using message passing is exact. The marginal posterior distribution can be computed at each time step as follows

$$\bar\rho(n) = T(a_{n-1})^\top \rho(n-1) \circ (E \tilde{x}_n), \qquad \rho(n) = \frac{\bar\rho(n)}{\sum_{i=1}^{N_{\text{clones}}} \bar\rho_i(n)} \tag{27}$$

where $\bar\rho(n)$ is the unnormalized distribution, $T(a_n)$ is the transition matrix corresponding to action $a_n$, ∘ denotes element-wise product, $E$ is the clone-structured emission matrix, and $\tilde{x}_n$ is a one-hot encoding of the observation $x_n$. The unnormalized marginal distribution at the first time step is computed as $\bar\rho(1) = \pi \circ (E \tilde{x}_1)$, where $\pi$ is the vector of prior probabilities.

While computing place fields of a CSCG in a test environment, the agent might encounter previously unseen observations. To account for new observations in test environments, we use a Hamming-distance-based emission matrix $E^{(\text{Ham})}$ in Eq. 27, constructed as follows. For clone $i$ that maps to observation $k$ in the clone-structured emission matrix

$$E^{(\text{Ham})}_{ij} \propto \exp\left\{ -\frac{[d_{\text{Ham}}(k, j)]^2}{2\sigma_E^2} \right\} \tag{28}$$

where $d_{\text{Ham}}(k, j)$ is the Hamming distance between observation patches $k$ and $j$, and $\sigma_E$ is a factor that controls the variance of the emission distribution.
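The construction of Eq. 28 can be sketched as below; a minimal NumPy version assuming binary patch vectors (our own encoding, for illustration). Each row is normalized so it remains a distribution over observations.

```python
import numpy as np

def hamming_emission(clone_obs, patches, sigma_E=1.0):
    """Hamming-distance emission matrix (Eq. 28).
    clone_obs[i]: index of the training observation that clone i maps to.
    patches[j]: binary patch vector for observation j."""
    patches = np.asarray(patches)
    # Pairwise Hamming distances between all observation patches
    d = (patches[:, None, :] != patches[None, :, :]).sum(-1)
    # Row i uses the distances from clone i's training observation
    E = np.exp(-d[np.asarray(clone_obs)] ** 2 / (2 * sigma_E ** 2))
    E /= E.sum(axis=1, keepdims=True)  # normalize over observations
    return E
```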

We also consider settings where the observations are noisy or uncertain. In such cases, $\tilde{x}_n$ can be a distribution over all possible observations. A concrete example is when there is uncertainty in the agent's position. In this case, we model the observation vector as a linear combination of observations from a window centered at the agent's ground-truth position $(r_n, c_n)$

$$\tilde{x}_n = \sum_{\Delta r = -W}^{W} \sum_{\Delta c = -W}^{W} X(r_n + \Delta r, c_n + \Delta c) \exp\left( -\frac{\Delta r^2 + \Delta c^2}{2\sigma_x^2} \right) \tag{29}$$

where $W$ is the window size, $X(r_n + \Delta r, c_n + \Delta c)$ is a one-hot vector corresponding to the observation at position $(r_n + \Delta r, c_n + \Delta c)$, and $\sigma_x$ controls the spatial uncertainty. We normalize $\tilde{x}_n$ so that its elements sum to 1.
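Eq. 29 amounts to a Gaussian-weighted histogram of the observation indices in a window around the agent; a small sketch (ours, with window clipping at grid borders as one plausible boundary choice):

```python
import numpy as np

def soft_observation(X, r, c, W=1, sigma_x=1.0):
    """Uncertainty-weighted observation vector (Eq. 29).
    X[r, c]: integer observation index at each grid position."""
    n_obs = X.max() + 1
    x = np.zeros(n_obs)
    R, C = X.shape
    for dr in range(-W, W + 1):
        for dc in range(-W, W + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < R and 0 <= cc < C:  # clip the window at borders
                x[X[rr, cc]] += np.exp(-(dr ** 2 + dc ** 2)
                                       / (2 * sigma_x ** 2))
    return x / x.sum()  # normalize to sum to 1
```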

To compute place fields in a given environment, we first obtain activations of the clones from Ntrials random walks of the agent, each of length Nlen time steps. We can then use the agent’s ground-truth spatial information to compute the average activation of each clone i at each spatial location in the environment, thus obtaining its place field

$$R_i(r, c) = \frac{\sum_{k=1}^{N_{\text{trials}}} \sum_{n=1}^{N_{\text{len}}} \rho_i^{(k)}(n)\, \mathbb{I}[r_n^{(k)} = r, c_n^{(k)} = c]}{\sum_{k=1}^{N_{\text{trials}}} \sum_{n=1}^{N_{\text{len}}} \mathbb{I}[r_n^{(k)} = r, c_n^{(k)} = c]} \tag{30}$$

where $\rho_i^{(k)}(n)$ and $(r_n^{(k)}, c_n^{(k)})$ are the activation of clone $i$ and the agent's ground-truth position, respectively, at time step $n$ in trial $k$, and

$$\mathbb{I}[r_n^{(k)} = r, c_n^{(k)} = c] = \begin{cases} 1, & \text{if } r_n^{(k)} = r \text{ and } c_n^{(k)} = c \\ 0, & \text{otherwise} \end{cases} \tag{31}$$

is an indicator function.
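The averaging of Eq. 30 can be sketched as follows; a minimal NumPy version (ours), assuming each trial supplies the clone activations and ground-truth positions from one random walk:

```python
import numpy as np

def place_fields(trials, n_clones, shape):
    """Average clone activations by ground-truth position (Eq. 30).
    trials: list of (rho, pos) pairs, where rho is (N_len, n_clones)
    and pos is (N_len, 2) rows of (row, col) coordinates.
    Returns R of shape (n_clones, rows, cols)."""
    num = np.zeros((n_clones,) + shape)
    den = np.zeros(shape)
    for rho, pos in trials:
        for n in range(len(rho)):
            r, c = pos[n]
            num[:, r, c] += rho[n]   # accumulate activations per cell
            den[r, c] += 1           # count visits (Eq. 31 indicator)
    return num / np.maximum(den, 1)  # unvisited cells stay at 0
```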

Acknowledgments

We thank M. Botvinick, K. Stachenfeld, D. Kumaran, C. Blundell, M. Shanahan, and D. Hassabis for critically reading this manuscript and for insightful discussions.

Funding: The authors acknowledge that they received no funding in support of this research.

Author contributions: Conceptualization: D.G. Methodology: M.L.-G., R.V.R., J.S.G., and D.G. Investigation: R.V.R., J.S.G., C.W., M.L.-G., and D.G. Visualization: R.V.R. and D.G. Supervision: M.L.-G. and D.G. Writing—original draft: R.V.R., J.S.G., M.L.-G., and D.G. Writing—review and editing: R.V.R., J.S.G., G.Z., and D.G.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.

Supplementary Materials

This PDF file includes:

Supplementary Text

Figs. S1 to S19

sciadv.adm8470_sm.pdf (12.9MB, pdf)

REFERENCES AND NOTES

1. Kubie J. L., Levy E. R. J., Fenton A. A., Is hippocampal remapping the physiological basis for context? Hippocampus 30, 851–864 (2020).
2. Bicanski A., Burgess N., Neuronal vector coding in spatial cognition. Nat. Rev. Neurosci. 21, 453–470 (2020).
3. Deshmukh S. S., Knierim J. J., Influence of local objects on hippocampal representations: Landmark vectors and memory. Hippocampus 23, 253–267 (2013).
4. Sarel A., Finkelstein A., Las L., Ulanovsky N., Vectorial representation of spatial goals in the hippocampus of bats. Science 355, 176–180 (2017).
5. P. A. Dudchenko, E. R. Wood, "Splitter cells: Hippocampal place cells whose firing is modulated by where the animal is going or where it has been" in Space, Time and Memory in the Hippocampal Formation, D. Derdikman, J. J. Knierim, Eds. (Springer Vienna, 2014), pp. 253–272.
6. Ainge J. A., Tamosiunaite M., Woergoetter F., Dudchenko P. A., Hippocampal CA1 place cells encode intended destination on a maze with multiple choice points. J. Neurosci. 27, 9769–9779 (2007).
7. Ainge J. A., van der Meer M. A. A., Langston R. F., Wood E. R., Exploring the role of context-dependent hippocampal activity in spatial alternation behavior. Hippocampus 17, 988–1002 (2007).
8. Sun C., Yang W., Martin J., Tonegawa S., Hippocampal neurons represent events as transferable units of experience. Nat. Neurosci. 23, 651–663 (2020).
9. Buzsáki G., Llinás R., Space and time in the brain. Science 358, 482–485 (2017).
10. Whitehead S. D., Ballard D. H., Learning to perceive and act by trial and error. Mach. Learn. 7, 45–83 (1991).
11. L. Chrisman, AAAI (1992), vol. 1992, pp. 183–188.
12. Niv Y., Learning task-state representations. Nat. Neurosci. 22, 1544–1553 (2019).
13. Plitt M. H., Giocomo L. M., Experience-dependent contextual codes in the hippocampus. Nat. Neurosci. 24, 705–714 (2021).
14. A. Dedieu, N. Gothoskar, S. Swingle, W. Lehrach, M. Lázaro-Gredilla, D. George, Learning higher-order sequential structure with cloned HMMs. arXiv:1905.00507 [stat.ML] (2019).
15. George D., Rikhye R. V., Gothoskar N., Guntupalli J. S., Dedieu A., Lázaro-Gredilla M., Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps. Nat. Commun. 12, 2392 (2021).
16. Eichenbaum H., Hippocampus: Cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120 (2004).
17. Warren W. H., Non-Euclidean navigation. J. Exp. Biol. 222, jeb187971 (2019).
18. V. Sharan, S. M. Kakade, P. S. Liang, G. Valiant, "Learning overcomplete HMMs" in Advances in Neural Information Processing Systems (2017), pp. 940–949.
19. J. Pearl, Causality (Cambridge Univ. Press, ed. 2, 2013).
20. J. Peters, D. Janzing, B. Schölkopf, Elements of Causal Inference: Foundations and Learning Algorithms (The MIT Press, 2017).
21. D. Eaton, K. Murphy, Artificial Intelligence and Statistics (PMLR, 2007), pp. 107–114.
22. Parr T., Markovic D., Kiebel S. J., Friston K. J., Neuronal message passing using mean-field, Bethe, and marginal approximations. Sci. Rep. 9, 1889 (2019).
23. Rao R. P., Bayesian computation in recurrent neural circuits. Neural Comput. 16, 1–38 (2004).
24. Nessler B., Pfeiffer M., Buesing L., Maass W., Bayesian computation emerges in generic cortical microcircuits through spike-timing-dependent plasticity. PLOS Comput. Biol. 9, e1003037 (2013).
25. Nessler B., Pfeiffer M., Maass W., STDP enables spiking neurons to detect hidden causes of their inputs. Adv. Neural Inf. Process. Syst. 22, 1357–1365 (2009).
26. Muller R. U., Kubie J. L., The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. J. Neurosci. 7, 1951–1968 (1987).
27. O'Keefe J., Burgess N., Geometric determinants of the place fields of hippocampal neurons. Nature 381, 425–428 (1996).
28. D. J. Sheehan, S. Charczynski, B. A. Fordyce, M. E. Hasselmo, M. W. Howard, A compressed representation of spatial distance in the rodent hippocampus. bioRxiv 2021.02.15.431306 [Preprint] (2021). https://doi.org/10.1101/2021.02.15.431306.
29. Fuhs M. C., VanRhoads S. R., Casale A. E., McNaughton B., Touretzky D. S., Influence of path integration versus environmental orientation on place cell remapping between visually identical environments. J. Neurophysiol. 94, 2603–2616 (2005).
30. Derdikman D., Whitlock J. R., Tsao A., Fyhn M., Hafting T., Moser M.-B., Moser E. I., Fragmentation of grid cell maps in a multicompartment environment. Nat. Neurosci. 12, 1325–1332 (2009).
31. Duvelle É., Grieves R. M., Liu A., Jedidi-Ayoub S., Holeniewska J., Harris A., Nyberg N., Donnarumma F., Lefort J. M., Jeffery K. J., Summerfield C., Pezzulo G., Spiers H. J., Hippocampal place cells encode global location but not connectivity in a complex space. Curr. Biol. 31, 1221–1233.e9 (2021).
32. Acharya L., Aghajan Z. M., Vuong C., Moore J. J., Mehta M. R., Causal influence of visual cues on hippocampal directional selectivity. Cell 164, 197–207 (2016).
33. Jercog P. E., Ahmadian Y., Woodruff C., Deb-Sen R., Abbott L. F., Kandel E. R., Heading direction with respect to a reference point modulates place-cell activity. Nat. Commun. 10, 2333 (2019).
34. Moore J. J., Cushman J. D., Acharya L., Popeney B., Mehta M. R., Linking hippocampal multiplexed tuning, Hebbian plasticity and navigation. Nature 599, 442–448 (2021).
35. Muller R. U., Bostock E., Taube J. S., Kubie J. L., On the directional firing properties of hippocampal place cells. J. Neurosci. 14, 7235–7251 (1994).
36. Eichenbaum H., Dudchenko P., Wood E., Shapiro M., Tanila H., The hippocampus, memory, and place cells: Is it spatial memory or a memory space? Neuron 23, 209–226 (1999).
37. Sanders H., Wilson M. A., Gershman S. J., Hippocampal remapping as hidden state inference. eLife 9, e51140 (2020).
38. Ólafsdóttir H. F., Bush D., Barry C., The role of hippocampal replay in memory and planning. Curr. Biol. 28, R37–R50 (2018).
39. Jadhav S. P., Kemere C., German P. W., Frank L. M., Awake hippocampal sharp-wave ripples support spatial memory. Science 336, 1454–1458 (2012).
40. Baraduc P., Duhamel J.-R., Wirth S., Schema cells in the macaque hippocampus. Science 363, 635–639 (2019).
41. Barry C., Lever C., Hayman R., Hartley T., Burton S., O'Keefe J., Jeffery K., Burgess N., The boundary vector cell model of place cell firing and spatial memory. Rev. Neurosci. 17, 71–97 (2006).
42. Tang W., Shin J. D., Jadhav S. P., Geometric transformation of cognitive maps for generalization across hippocampal-prefrontal circuits. Cell Rep. 42, 112246 (2023).
43. Tanni S., De Cothi W., Barry C., State transitions in the statistically stable place cell population correspond to rate of perceptual change. Curr. Biol. 32, 3505–3514.e7 (2022).
44. Skaggs W. E., McNaughton B. L., Spatial firing properties of hippocampal CA1 populations in an environment containing two visually identical regions. J. Neurosci. 18, 8455–8466 (1998).
45. Buzsáki G., Tingley D., Space and time: The hippocampus as a sequence generator. Trends Cogn. Sci. 22, 853–869 (2018).
46. Ranganath C., Hsieh L.-T., The hippocampus: A special place for time. Ann. N. Y. Acad. Sci. 1369, 93–110 (2016).
47. B. Uria, B. Ibarz, A. Banino, V. Zambaldi, D. Kumaran, D. Hassabis, C. Barry, C. Blundell, A model of egocentric to allocentric understanding in mammalian brains. bioRxiv 2020.11.11.378141 [Preprint] (2022). https://doi.org/10.1101/2020.11.11.378141.
48. Howard M. W., Kahana M. J., A distributed representation of temporal context. J. Math. Psychol. 46, 269–299 (2002).
49. Stachenfeld K. L., Botvinick M. M., Gershman S. J., The hippocampus as a predictive map. Nat. Neurosci. 20, 1643–1653 (2017).
50. Benna M. K., Fusi S., Place cells may simply be memory cells: Memory compression leads to spatial tuning and history dependence. Proc. Natl. Acad. Sci. U.S.A. 118, e2018422118 (2021).
51. Whittington J., Muller T., Mark S., Barry C., Behrens T., Generalisation of structural knowledge in the hippocampal-entorhinal system. Adv. Neural Inf. Process. Syst. 31 (2018).
52. Sanders H., Wilson M., Klukas M., Sharma S., Fiete I., Efficient inference in structured spaces. Cell 183, 1147–1148 (2020).
53. Schapiro A. C., Turk-Browne N. B., Norman K. A., Botvinick M. M., Statistical learning of temporal community structure in the hippocampus. Hippocampus 26, 3–8 (2016).
54. Tan H. M., Wills T. J., Cacucci F., The development of spatial and memory circuits in the rat. Wiley Interdiscip. Rev. Cogn. Sci. 8, e1424 (2017).
55. Brandon M. P., Koenig J., Leutgeb J. K., Leutgeb S., New and distinct hippocampal place codes are generated in a new environment during septal inactivation. Neuron 82, 789–796 (2014).
56. Mallory C. S., Hardcastle K., Bant J. S., Giocomo L. M., Grid scale drives the scale and long-term stability of place maps. Nat. Neurosci. 21, 270–282 (2018).
57. Muessig L., Hauser J., Wills T. J., Cacucci F., A developmental switch in place cell accuracy coincides with grid cell maturation. Neuron 86, 1167–1173 (2015).
58. Mulders D., Yim M. Y., Lee J. S., Lee A. K., Taillefumier T., Fiete I. R., A structured scaffold underlies activity in the hippocampus. bioRxiv 2021.11.20.469406 [Preprint] (2021). https://doi.org/10.1101/2021.11.20.469406.
59. Sheintuch L., Geva N., Baumer H., Rechavi Y., Rubin A., Ziv Y., Multiple maps of the same spatial context can stably coexist in the mouse hippocampus. Curr. Biol. 30, 1467–1476.e6 (2020).
60. Battaglia F. P., Sutherland G. R., McNaughton B. L., Local sensory cues and place cell directionality: Additional evidence of prospective coding in the hippocampus. J. Neurosci. 24, 4541–4550 (2004).
61. Farzanfar D., Spiers H. J., Moscovitch M., Rosenbaum R. S., From cognitive maps to spatial schemas. Nat. Rev. Neurosci. 24, 63–79 (2023).
62. Pezzulo G., Cartoni E., Rigoli F., Pio-Lopez L., Friston K., Active inference, epistemic value, and vicarious trial and error. Learn. Mem. 23, 322–338 (2016).
63. Kaplan R., Friston K. J., Planning and navigation as active inference. Biol. Cybern. 112, 323–343 (2018).
64. Cormack G. V., Horspool R. N. S., Data compression using dynamic Markov modelling. Comput. J. 30, 541–550 (1987).
65. Wu C. J., On the convergence properties of the EM algorithm. Ann. Stat. 11, 95–103 (1983).
66. Verbeek J. J., Vlassis N., Kröse B., Efficient greedy learning of Gaussian mixture models. Neural Comput. 15, 469–485 (2003).
67. Jelinek F., Continuous speech recognition by statistical methods. Proc. IEEE 64, 532–556 (1976).
68. Kamada T., Kawai S., An algorithm for drawing general undirected graphs. Inf. Process. Lett. 31, 7–15 (1989).
69. C. Beattie, J. Z. Leibo, D. Teplyashin, T. Ward, M. Wainwright, H. Kuttler, A. Lefrancq, S. Green, V. Valdes, A. Sadik, J. Schrittwieser, K. Anderson, S. York, M. Cant, A. Cain, A. Bolton, S. Gaffney, H. King, D. Hassabis, S. Legg, S. Petersen, DeepMind Lab. arXiv:1612.03801 (2016).
