Abstract
Experiments with rodents demonstrate that visual cues play an important role in the control of hippocampal place cells and spatial navigation. Nevertheless, rats may also rely on auditory, olfactory and somatosensory stimuli for orientation. It is also known that rats can track odors or self-generated scent marks to find a food source. Here we model odor-supported place cells using a simple feed-forward network and analyze the impact of olfactory cues on place cell formation and spatial navigation. The obtained place cells are used to solve a goal navigation task by a novel mechanism based on self-marking by odor patches combined with a Q-learning algorithm. We also analyze the impact of place cell remapping on goal-directed behavior when switching between two environments. We emphasize the importance of olfactory cues in place cell formation and show that the use of environmental and self-generated olfactory cues, together with a mixed navigation strategy, improves goal-directed navigation.
Keywords: Self-marking navigation, Reinforcement learning, Q-learning, Place cell directionality, Remapping
Introduction
Place cells are principal neurons in the hippocampus which respond maximally when the animal is in a specific location in an environment. They were discovered in the rat hippocampus by O’Keefe and Dostrovsky in 1971 (O’Keefe and Dostrovsky 1971; O’Keefe and Nadel 1978) and have been investigated in numerous studies (for reviews see Eichenbaum et al. 1999; Hölscher 2003). Place fields (PF) form from environmental cues and play an important role in spatial navigation. Cells with properties similar to those of rat place cells have also been found in humans using extracellular recordings from epileptic children (Ekstrom et al. 2003). Thus, the formation of PFs and their influence on navigation remain an important experimental and theoretical question. In particular, little is known about how different sensory cues contribute to PF formation and spatial navigation. The goal of the first part of this study is therefore to investigate how PFs are formed under visual as well as olfactory influences. In the second part, we address the question of how PFs can be used in navigation, and compare this to olfactory navigation based on self-laid scent marks.
PF formation and their relations to other hippocampal subsystems
Different models have been proposed for hippocampal place cell formation including Gaussian functions (O’Keefe and Burgess 1996; Touretzky and Redish 1996; Hartley et al. 2000; Foster et al. 2000), back-propagation algorithm (Shapiro and Hetherington 1993), auto-associative memory (Recce and Harris 1996), competitive learning (Sharp 1991; Brown and Sharp 1995), neural architecture based on landmark recognition (Gaussier et al. 2002), neuronal plasticity (Arleo and Gerstner 2000; Arleo et al. 2004; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005), independent component analysis (Takács and Lőrincz 2006; Franzius et al. 2007), self organizing map (Chokshi et al. 2003; Ollington and Vamplew 2004) or Kalman filter (Bousquet et al. 1998; Balakrishnan et al. 1999). None of these, however, addresses the question of how multiple sensory inputs might affect PF formation. Experiments with rodents demonstrate that visual cues play an important role for the control of place cells (Muller and Kubie 1987; Knierim et al. 1995; Collett et al. 1986; O’Keefe and Speakman 1987; Maaswinkel and Whishaw 1999; Dudchenko 2001). On the other hand, in the absence of visual cues rats can rely on other cues such as olfactory, auditory or somatosensory stimuli (Hill and Best 1981; Carvell and Simons 1990; Maaswinkel and Whishaw 1999; Wallace et al. 2002a). Thus, it seems reasonable to consider the influence of such cues also on the formation of PFs. This view is supported by the observation that PFs become unstable when olfactory cues are removed, suggesting that olfactory cues are important in the formation and stability of PFs (Markus et al. 1994; Save et al. 2000).
Other types of cells related to hippocampal place cells and spatial navigation are head direction cells and grid cells. Head direction cells are found in many brain areas including the postsubiculum, the thalamus, the lateral mammillary nucleus, the dorsal tegmental nucleus, and the striatum (Taube et al. 1990a, b; Muller et al. 1996; Knierim et al. 1998). Head direction cells respond maximally when the animal’s head is oriented in a preferred direction in the horizontal plane. Like place cells, head direction cells are under the control of distal stimuli, and have different preferred directions in different environments. Experimental data suggest that the head direction cell system may orient the place cell system (Jeffery and O’Keefe 1999; Calton et al. 2003; Yoganarasimha and Knierim 2005).
Grid cells are found in the entorhinal cortex (Hafting et al. 2005; Sargolini et al. 2006; Barry et al. 2007). Like place cells, grid cells fire strongly when an animal is in specific locations in an environment, but they differ from place cells in that they have multi-peak firing fields which are organized into a hexagonal grid. It has been suggested that grid cells may make associations between places and events, which are needed for the formation of memories (Hafting et al. 2005).
Navigation guided by PFs and other influences
Many experimental studies have been performed on goal-directed learning in rodents (Barnes et al. 1980; Morris 1984; Prados and Trobalon 1998; Lavenex and Schenk 1998; Maaswinkel and Whishaw 1999; Wallace et al. 2002a; Etienne and Jeffery 2004; Jeffery et al. 2003; Hines and Whishaw 2005). Navigation models based on place cells usually address goal learning by using reinforcement learning algorithms (Arleo and Gerstner 2000; Arleo et al. 2004; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005), where the place cell representation is based on a combination of visual information and information provided by head direction cells or path integration.
Path integration has been considered by many researchers as evidence for an additional mechanism when navigating in the absence of visual cues (for a review see Etienne and Jeffery 2004). Experimental data suggest that grid cells may be related to the path integration system (Hafting et al. 2005; Sargolini et al. 2006; McNaughton et al. 2006). However, Save et al. (2000) have shown that path integration alone is not sufficient to maintain stable receptive fields of place cells when rats navigate in the dark. Without additional cues, path integration leads to an accumulation of errors in direction and distance, and it thus needs to be reset through position information from stable cues (Etienne et al. 1996, 2004). Strösslin et al. (2005) claim that their model is able to work in the dark based on self-motion cues (visual cues together with path integration were used), yet it is unclear how the model can succeed if the visual cues used for recalibration are not available while navigating for a longer time in the dark.
Thus, for navigation in natural environments it seems reasonable to consider other sensory inputs, and it is known from the literature that rodents can form spatial representations based on olfactory cues and use this information for spatial orientation and navigation (Tomlinson and Johnston 1991; Lavenex and Schenk 1995, 1996, 1998). Experiments show that rats can track odors or self-generated scent marks to find a food source (Wallace et al. 2002a, 2003). To accommodate these findings, we propose a novel navigation mechanism based on self-marking by odor patches combined with a Q-learning algorithm based on (multi-sensory formed) place cells in order to improve spatial navigation.
Studies show that rats use visual and/or olfactory cues when available, and that such allothetic cues dominate over path integration information (idiothetic components) (Maaswinkel and Whishaw 1999; Whishaw et al. 2001). Therefore, the focus of the current study is on place cell formation and spatial navigation in a cue-rich, illuminated environment, where path integration would be extraneous.
Another interesting consideration concerns the question how navigation is affected by remapping. It is known that PFs change very quickly when the rat is confronted with a new environment and that many PFs will re-obtain their former properties as soon as the animal returns to the initial environment (Muller and Kubie 1987; Wilson and McNaughton 1993; Shapiro et al. 1997; Tanila et al. 1997; Knierim et al. 1995, 1998). It is, however, an unresolved question how remapping affects navigation and navigation (re-)learning (Jeffery et al. 2003).
Specific questions addressed
In this study, we concentrate on the impact of olfactory cues on place cell formation and on goal navigation learning in different environments. We focus on the following three questions:
What is the contribution of olfactory cues to the formation of place cells and goal navigation?
Can goal navigation based on place cells be improved by additional navigation mechanisms?
How does the remapping of PFs influence goal navigation when switching between different environments?
The paper is organized as follows. First we describe the sensory inputs and the model system. Then we present different goal navigation strategies and thereafter we show the results of place cell analysis, and a comparison of the presented navigation algorithms. Finally, we discuss our results and relate them to other studies and biological data.
Methods
Sensory inputs
We use a square box with dimensions of 10000×10000 points where the walls of the arena are marked by different landmarks (see Fig. 1(a)). Visual and olfactory cues are used as allothetic inputs to the place cells in our model. As visual input, we use the perpendicular distances from the rat’s position to all four walls, similar to many other models which use distances to walls or landmarks (Sharp 1991; Recce and Harris 1996; O’Keefe and Burgess 1996; Touretzky and Redish 1996; Hartley et al. 2000; Ollington and Vamplew 2004). Let us define one visual input per wall as a function of the position, where x and y denote the position in the environment and k = 1...4 indexes the visual inputs related to the four walls of the arena. In our model the rat has a view-field of 180 degrees (real rats have a wider field of view), which means that the rat can see only the walls which are ahead, but cannot see what is behind. The distance to a non-visible wall is predicted by taking the last estimate of the distance to that wall from when it was visible. This can be described by the following recurrent equation:
where j denotes the index of the non-visible wall, and t denotes the time in steps. Note that if the rat is moving along a linear trajectory away from a non-visible wall then the error of the estimate of this wall accumulates over time. The estimate is re-calibrated as soon as the wall becomes visible again.
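The view-field rule and the recurrent estimate can be illustrated with a minimal Python sketch. It is not the original implementation: the function names, the wall ordering, and the visibility criterion (a wall counts as visible when the foot of its perpendicular lies within the 180-degree field ahead of the rat) are assumptions made for illustration, and sensory noise is omitted.

```python
import math

L = 10000.0  # arena side length in points (from the text)

# Unit normals pointing from the arena interior toward each wall.
# Assumed ordering: 0 = left (x=0), 1 = right (x=L), 2 = bottom (y=0), 3 = top (y=L).
WALL_NORMALS = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]


def true_distances(x, y):
    """Perpendicular distances from (x, y) to the four walls."""
    return [x, L - x, y, L - y]


def visible_walls(heading):
    """Walls lying in the 180-degree half-plane ahead of the rat.

    Assumption: a wall is visible when the heading has a positive
    component along the wall's outward normal.
    """
    hx, hy = math.cos(heading), math.sin(heading)
    return [hx * nx + hy * ny > 1e-9 for nx, ny in WALL_NORMALS]


def update_visual_input(prev, x, y, heading):
    """Return the four distance inputs at the current time step.

    Visible walls get the measured distance; non-visible walls keep the
    estimate from the previous step (the recurrent rule in the text).
    """
    dist = true_distances(x, y)
    seen = visible_walls(heading)
    return [d if s else p for d, s, p in zip(dist, seen, prev)]
```

For a rat heading east, only the estimate for the right wall is refreshed in this sketch; the stale estimate for the left wall drifts away from the true distance until that wall is seen again, which is the error accumulation described above.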
We also use four different odors as an additional input to the place cells. Four examples of odors are shown in Fig. 1(b), where each box represents a different odor with a different source location in the environment. We model our odors at the ground level (2D space) by the following Gaussian functions:
where x and y denote the position in the environment, k = 1...4 is the number of the odor sources, and a = 0.01 is the scaling factor. The variables denote the coordinates of the center (maximum intensity) of the odor source and are given as follows: , , and . Values and are randomly drawn from a Gaussian distribution with zero mean and a standard deviation of 100. Note, that here we model static odors that do not change during different runs of the same experiment but differ across experiments. The rat can smell the odors locally, and it does not sense the direction of the odor source. Noise is also added to the visual sensory inputs, assuming that the rat makes larger errors in the estimation of long distances. Similarly the rat makes larger errors in estimating odors with low intensity and smaller errors for odors with high intensity. This is given by the following equations:
where and are random values from a uniform distribution within the interval [-1;1]. Note, that both visual and olfactory inputs are normalized and bounded within the interval [0;1], where L = 10000 points is the size of the environment, and is the maximal intensity of the k-th odor source.
Place cell model
We model place cells by using a simple feed-forward network with an input and an output layer as shown in Fig. 1(c). At the input layer we have sensory inputs received from visual and olfactory stimuli. The network is fully connected: every neuron in the input layer is connected to every neuron in the output layer via connection weights W_i = [w_{i,1} ... w_{i,n}], where i = 1...N, N = 500 is the total number of place cells and n is the number of sensory inputs (n = 4 if only visual cues are used and n = 8 if both visual and olfactory cues are used). Weights are initialized randomly by a function f_z:
where z is a random number from a uniform distribution within the interval [0;1], m = 0.5 and σ = 0.2. The distribution of initial weights is plotted in Fig. 1(d). We have chosen such a distribution for the reason that if the weights are initialized according to a uniform distribution then all PF centers are located around the center of the environment and we do not obtain PFs close to the walls of the environment. In our model weights are basis vectors, which are used to compute firing rates of place cells (see equation below) where we start with random initialization of basis vectors. By employing competitive learning, cells become tuned to a specific input, which leads to the spatial selectivity of the place cells.
The firing rate of place cell i is expressed by a Gaussian function (similar to O’Keefe and Burgess 1996; Hartley et al. 2000) and is computed as follows:
where σ_f = 0.07 defines the width of the PF, n is the dimension of the input space, and the norm is the Euclidean distance. Weights of our neural network are modified according to a winner-takes-all mechanism where we change only the weights of the best matching unit β_t:
Weights of the winner neuron β_t are changed according to the following equation:
where 0 < μ ≪ 1 is the rate factor.
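A minimal sketch of one competitive-learning step is given below, assuming NumPy. The exact normalisation of the Gaussian firing rate by n and the update form (moving the winner's weights toward the current input) are assumptions in the spirit of standard winner-takes-all learning; the function names are hypothetical.

```python
import numpy as np

SIGMA_F = 0.07  # place-field width (from the text)
MU = 0.01       # rate factor; the text only requires 0 < mu << 1


def firing_rates(W, x):
    """Gaussian firing rate of every cell for the input vector x.

    Assumption: r_i = exp(-||x - W_i||^2 / (2 * n * SIGMA_F**2)).
    The way n enters the exponent is a guess, not taken from the paper.
    """
    n = W.shape[1]
    d2 = np.sum((W - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * n * SIGMA_F ** 2))


def competitive_step(W, x, mu=MU):
    """Move only the best matching unit toward x and return its index."""
    winner = int(np.argmax(firing_rates(W, x)))
    W[winner] += mu * (x - W[winner])  # all other rows stay unchanged
    return winner
```

Repeating this step along a random exploration trajectory tunes each winning cell to a region of the input space, which is what gives the cells their spatial selectivity.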
Navigation strategies
Closed loop context
Before presenting the details of navigation strategies, we stress that we are dealing with a closed loop system (Fig. 2(a)). We create place cells from allothetic visual and olfactory cues. Place cells are connected to motor neurons, which produce certain motor actions. The rat has to learn appropriate motor actions, which eventually lead to the food source. As a consequence, sensory inputs as well as place cells are affected whenever the rat navigates in the environment, thus closing the loop as shown in Fig. 2(a).
Goal navigation task
The rat has to learn to navigate from its home location to the goal, i.e. the food source. The rat can use the allothetic visual and olfactory cues described above but it cannot see or smell the food source (similar to the Morris water-maze task; Morris 1984). The rat gets a reward only when it approaches the goal location. The setup for such a spatial task is shown in Fig. 2(b). We use the same discrete environment (square box) as described above, where we have different landmarks on all four walls (see Fig. 1(a)). The home location of the rat is in the bottom-left corner, 1000 points from both walls, and is marked by a gray dot. The dimensions of the food source, marked by a square, are 2000×2000 points and it is located 3000 points from the left wall and 2000 points from the upper wall. At the beginning, the rat explores the environment randomly and finds the goal just by chance (dashed line), whereas after a few learning runs the rat finds a more or less direct path to the food source. Whenever the rat finds the food location we start a new run from the start position (home location). A maximum of 200 steps is allowed for one run, with a step size in the range of 400-600 points. In our model, in most cases (80%) the rat finds the goal in fewer than 200 steps during the first run, so the rat has enough time to find the goal even when navigating randomly. Another reason for the 200 step limit is related to the frustration phenomenon observed in animals, where creatures return to “home-base” if the goal is not found within an expected time (Eilam and Golani 1989; Whishaw et al. 2001; Wallace et al. 2002b; Hines and Whishaw 2005; Nemati and Whishaw 2007).
Q-learning with function approximation
As a first approach we apply reinforcement learning (Sutton and Barto 1998) as used by other studies on hippocampus-based navigation (Arleo and Gerstner 2000; Arleo et al. 2004; Foster et al. 2000; Strösslin et al. 2005; Krichmar et al. 2005). Here we employ a version of Q-learning with function approximation similar to Reynolds (2002). The algorithm is implemented by a two layer neural network (see Fig. 2(c)) where we have place cells as inputs to the network. The place cells are connected to motor neurons representing eight directional cells: north (N), north-east (NE), east (E), south-east (SE), south (S), south-west (SW), west (W) and north-west (NW). The actual direction of movement is determined by the maximum Q-value of the eight possible directions averaged over all cells, which are firing at the present location, with additional noise. For example the horizontal movements W or E are given by the following simple equations:
where Δs = 500 is the step size, η_x and η_y are random values from a uniform distribution within the interval [-1;1], and b = 100 is the amplitude of the noise. Here we use the minus sign for the W direction and the plus sign for the E direction. Similarly, for SW or NE we have:
and the equivalent for the other directions. The rat makes a random movement whenever Q-values are zero at the present location. In this case, the rat keeps the direction of the movement with a probability of 1 − p_r, whereas with probability p_r = 0.25 it randomly takes a new direction. When Q-values are non-zero we use a standard RL strategy with exploration and exploitation: the direction of movement is chosen according to the learned Q-values most of the time (exploitation probability 1 − p_e), and a random move is made with exploration probability p_e = 0.1.
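The action selection and the noisy movement step can be sketched in Python as follows. The constants are those given in the text; the function names are hypothetical, and the scaling of diagonal steps by 1/√2 (so that the step length stays close to Δs) is an assumption, since the original expression for diagonal moves is not reproduced here.

```python
import math
import random

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
# Direction vectors before normalisation; y grows toward north.
UNIT = {
    "N": (0, 1), "NE": (1, 1), "E": (1, 0), "SE": (1, -1),
    "S": (0, -1), "SW": (-1, -1), "W": (-1, 0), "NW": (-1, 1),
}
DELTA_S = 500.0  # step size
B = 100.0        # noise amplitude
P_R = 0.25       # probability of picking a new direction in the random walk
P_E = 0.1        # exploration probability


def choose_direction(q_values, prev_direction, rng=random):
    """Pick one of eight directions from the averaged Q-values."""
    if all(q == 0 for q in q_values):
        # Nothing learned here yet: persistent random walk.
        if prev_direction is not None and rng.random() >= P_R:
            return prev_direction
        return rng.choice(DIRECTIONS)
    if rng.random() < P_E:
        return rng.choice(DIRECTIONS)  # exploration
    best = max(range(len(DIRECTIONS)), key=lambda k: q_values[k])
    return DIRECTIONS[best]            # exploitation


def move(x, y, direction, rng=random):
    """One noisy step; diagonal steps are normalised (assumption)."""
    ux, uy = UNIT[direction]
    norm = math.hypot(ux, uy)
    x_new = x + DELTA_S * ux / norm + B * rng.uniform(-1, 1)
    y_new = y + DELTA_S * uy / norm + B * rng.uniform(-1, 1)
    return x_new, y_new
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.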
As mentioned before, the learning mechanism from place cells to motor cells is a version of Q-learning with function approximation. Let us define our basis functions Φ_i as a function of the firing rate r_t of the place cell i at the time step t:
Here, i = 1...N, where N = 500 is the total number of place cells. Note that we discretize the space representation provided by the place cells prior to the goal navigation learning in order to reduce the amount of noise in the PF system, since low firing rates give larger errors in position estimation compared to the real position of the rat in the environment. By using binary cells we still get different PF sizes and we preserve the directionality of place cells.
We define the action-value function by the following equation:
where Θ_{i,a} is the weight from the i-th place cell to the motor action a. In the given equation we sum over all basis functions, but at a specific location within the environment only a specific subset of basis functions will be non-zero. We use an averaging Q-learning rule according to Reynolds (2002), where we update the weights of the actually taken action a_t at the time step t according to the following learning rule:
where α = 0.7 is the learning rate, γ = 0.7 is the discount factor and R is the reward. The reward R_t is non-zero only when the rat reaches the goal location.
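The following NumPy sketch shows one way to implement averaging Q-learning over binary place cell features. It is an illustration under stated assumptions rather than the original implementation: the binarisation threshold of 0.5, the convention that the temporal-difference error is applied to every active cell of the taken action, and the function names are all hypothetical; the reward is passed in by the caller (for instance 1 at the goal and 0 elsewhere).

```python
import numpy as np

ALPHA = 0.7  # learning rate (from the text)
GAMMA = 0.7  # discount factor (from the text)


def binary_features(rates, threshold=0.5):
    """Binarise firing rates; the threshold value is an assumption."""
    return (np.asarray(rates) > threshold).astype(float)


def q_values(theta, phi):
    """Q-values of all actions, averaged over the active place cells."""
    active = phi.sum()
    if active == 0:
        return np.zeros(theta.shape[1])
    return phi @ theta / active


def q_update(theta, phi, action, reward, phi_next, terminal):
    """One Q-learning step on the weights of the action actually taken.

    theta has shape (N, 8): one weight per place cell and motor action.
    Returns the temporal-difference error.
    """
    target = reward
    if not terminal:
        target += GAMMA * q_values(theta, phi_next).max()
    td_error = target - q_values(theta, phi)[action]
    theta[:, action] += ALPHA * td_error * phi  # only active cells change
    return td_error
```

Because the Q-value is an average over active cells, adding α times the error to each active weight shifts the Q-value at that location by exactly α times the error.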
Self-marking navigation
The second approach in our study is to use navigation based on self-generated odor marks, where the rat follows the self-laid scent marks to find the food source. Whenever it does not smell anything locally, the rat explores the environment randomly, keeping the direction of its movement. Note that the rat can smell only within a given radius of 600 points, which corresponds to the maximum step size. At the beginning, the rat finds the food source by moving randomly and marks it with a small amount of scent. In the following runs, when the rat approaches the previously laid scent mark within a distance at which it can smell it, the rat marks its own location, goes directly to the perceived scent mark and re-marks it with another small amount of scent. The whole navigational process can be defined as follows. The rat marks the location of the food source, or re-marks the current location if it smells another scent mark or marks ahead, by
where u defines the self-laid odor patches in the environment, x, y define the coordinates of the position within the environment and Δu = 0.005. The locations which have strong smell, i.e. , are not re-marked any more. The rat goes directly to the location marked by the scent mark which has the strongest smell according to
otherwise it makes a random movement as explained above. It is worth noting that the given method propagates scent-marks backwards from the location of the reward as in reinforcement learning, but here we do not have predefined features. Instead, we create them “on the fly”, and we do not directly memorize action values associated to states, where a state is defined by the rat’s position in the environment x,y. In our model self-laid scent marks are modeled by little “drops” which are less intense relative to the environmental odors which may have very strong odor sources and diffuse within the environment. Self-generated odor marks can be smelled and distinguished by the rat only locally within a relatively small radius (in our case within one step size).
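A minimal Python sketch of one self-marking step is shown below. The smelling radius and Δu are taken from the text; the saturation level `U_MAX`, the dictionary representation of scent marks keyed by position, and the function names are assumptions for illustration only.

```python
import math

SMELL_RADIUS = 600.0  # the rat smells marks only within this distance
DELTA_U = 0.005       # amount of scent added per marking
U_MAX = 0.05          # hypothetical saturation level; not given in the text


def smelled_marks(marks, pos):
    """Marks within smelling range, excluding one at the rat's own position."""
    return {p: u for p, u in marks.items()
            if 0.0 < math.dist(p, pos) <= SMELL_RADIUS}


def deposit(marks, pos):
    """Add a small amount of scent unless the spot already smells strongly."""
    if marks.get(pos, 0.0) < U_MAX:
        marks[pos] = marks.get(pos, 0.0) + DELTA_U


def self_marking_step(marks, pos, at_food):
    """Return the next position, or None when a random move is required."""
    if at_food:
        deposit(marks, pos)   # mark the food location
        return None           # the run ends; the next one starts from home
    nearby = smelled_marks(marks, pos)
    if not nearby:
        return None           # nothing smelled: explore randomly
    deposit(marks, pos)                    # mark the current location
    target = max(nearby, key=nearby.get)   # strongest perceived mark
    deposit(marks, target)                 # re-mark it on arrival
    return target
```

Since marks closer to the food are re-marked more often, the strongest mark within range tends to lie toward the goal, so repeatedly moving to it follows the scent gradient.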
Combining Q-learning with self-marking navigation
The third and the last approach is a combination of the two previously described methods. In this case the rat marks the location only if it smells another scent mark/marks and the normalized maximum Q-value at this location obtained by using the first method has reached a given threshold of λ = 1.5:
The action in the combined strategy is taken by the following rule. If the rat does not smell any scent mark within the given radius then it takes an action according to the Q-values, otherwise the rat follows the scent gradient. By using this type of navigation the rat develops Q-values and lays scent marks at the same time.
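The decision logic of the combined strategy can be summarised in a few lines of Python. This is only a sketch: the function name and return convention are hypothetical, and the normalized maximum Q-value is taken as an input because its normalisation is not reproduced here.

```python
LAMBDA = 1.5  # threshold on the normalized maximum Q-value (from the text)


def combined_decision(smells_mark, normalized_max_q):
    """Return (action_source, should_mark) for the combined strategy.

    smells_mark:      True if a scent mark lies within the smelling radius.
    normalized_max_q: normalized maximum Q-value at the current location,
                      computed by the caller.
    """
    action_source = "scent_gradient" if smells_mark else "q_values"
    should_mark = smells_mark and normalized_max_q >= LAMBDA
    return action_source, should_mark
```

Marking is thus gated by both conditions, whereas the choice between following the scent gradient and following the Q-values depends only on whether a mark is smelled.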
Remapping and navigation
It is known from the literature that PFs can change in firing rate, position, shape, or turn on/off when the animal is exposed to different environments, a phenomenon which is called remapping (Muller and Kubie 1987; Wilson and McNaughton 1993; Shapiro et al. 1997; Tanila et al. 1997; Knierim et al. 1995; Knierim et al. 1998). Fundamental changes occur within 5-10 minutes of exploration in a new environment, whereas the firing rate can change even within the first second (Wilson and McNaughton 1993). In this study we also investigate how remapping of place cells affects the goal navigation task when the rat switches between different environments. We compare different navigation strategies with respect to a change of environmental cues, as well as to a change of the goal location.
To look at the remapping of place cells, we first let the rat explore randomly the whole environment “A” for 5000 time steps. Environment “A” contains visual and olfactory cues as shown in Fig. 3, as already used in the previously described experiments. Afterward the rat is exposed to another environment, “B”, for 5000 time steps (see panels a and b). In our model we use the same visual landmarks and the same odors for both environments “A” and “B”. In order to change the environment we switch the landmarks and change the locations of odor sources. Landmarks are used by the rat in order to distinguish between the four walls and to estimate distance to them. When we switch landmarks the rat gets different estimates of distances to the walls marked by the same landmark when being at the same position in the environments “A” and “B”. The rat also gets different odor intensity at the same position in the environment “A” compared to the environment “B”. After exploration in the environment “B” the rat was moved back to the familiar environment “A”.
To compare Q-learning based on PFs obtained from combined visual and olfactory stimuli with the combination of Q-learning with the navigation based on self-generated odor marks we perform different sets of experiments. In the first set of experiments, we switch between two environments “A” and “B”, changing only environmental cues and keeping the location of goal unchanged (see Fig. 3(a)). In the second set of experiments, we switch between the environment “A” and “C”, and in “C” the environmental cues as well as the location of the food source are changed.
Results
Place cell analysis
Examples of PFs after random exploration over 5000 time steps are presented in Fig. 4. PFs obtained when using visual or olfactory cues alone are shown in panels a and b. PFs obtained from both visual and olfactory cues are shown in panel c. Here we show only selected PFs which have a maximum firing rate r > 0.5. Resulting PFs are localized, can differ in size and firing rate, and are similar to real PFs. For examples of PFs obtained from the rodent hippocampus see Wilson and McNaughton (1993), O’Keefe (1999).
The distribution of firing rates is shown in Fig. 5(b), where we have fewer cells with a high firing rate than cells with a low firing rate, which resembles experimental data (Hartley et al. 2000). Some of the cells which are silent in a specific environment become active when moved to the other environment (see Fig. 11(a)). PF centers from a single experiment (location of maximum firing rate within the field) are shown in Fig. 5(a), where circles represent centers of PFs with a low firing rate (r ≤ 0.5) and dots those with a high firing rate (r > 0.5). We observed that cells with low firing rate are distributed around the center of the environment (similar to Gaussian distribution, panel c) whereas cells with high firing rate are evenly distributed within the whole environment (see panel d). The latter cells will drive the learning in the goal navigation task (see Section 3.3).
Before looking at the comparison of goal navigation strategies we would like to investigate the contribution of the olfactory input to place cell formation. This influence can be assessed by measuring the directionality of place cells. For this investigation, we let the rat explore the environment randomly as shown in Fig. 5(e) for 5000 time steps (development phase). We used a relatively low rate factor (μ = 0.01) to develop connection weights between the input and the output layer (see Fig. 1(c)), because weights oscillate and do not converge when a high rate factor (μ = 0.1) is used, and this does not lead to the final stabilization of place cells. For a comparison of weight development for different rate factors see Fig. 5(g). After the development phase we let the rat move in the environment for another 5000 time steps to create test data. To evaluate the directionality of place cells we looked at the locations which had been passed by the rat in different directions. We say that a cell is omnidirectional, i.e. independent of the movement direction, if at a given location the cell fires with its highest firing rate regardless of the direction in which the location is crossed. Averaged results of 20 experiments are presented in Fig. 5(f), where we compare the directionality of place cells obtained from visual cues alone with that obtained from both visual and olfactory stimuli. The white bars show the control case, with place cell directionality before the development phase (i.e. before learning). We obtain more omnidirectional cells when we use combined stimuli compared to visual stimuli alone, and more omnidirectional cells develop during the development phase compared to the control case. The improvement in omnidirectionality when using olfactory cues can be explained by the fact that perception of olfactory cues is direction independent whereas perception of visual cues depends on local views.
Note that the view-field influences the directionality of PFs. The larger the view-field, the fewer directional cells are obtained. Since rats do not have an omnidirectional view, we would still obtain more directional cells from visual information alone than from combined stimuli (visual and olfactory cues) or olfactory cues alone. Our results on place cell directionality are qualitatively similar to the experimental data of Battaglia et al. (2004). For further discussion on place cell directionality see Section 4.
Goal navigation
Comparison of different navigation strategies
Before presenting the statistical analysis of different navigation strategies, we compare the strategies by showing examples of single experiments. An example of navigation by using Q-learning based on place cells formed from combined visual and olfactory cues is presented in Fig. 6(a) and (b). Trajectories of the rat’s paths obtained from 30 runs are shown in panel a, and the number of steps needed to reach the goal versus the number of runs is plotted in panel b. The rat found a more or less straight path to the goal after seven trials. Results for self-marking navigation are shown in Fig. 6(c–e). Trajectories of the rat’s paths obtained from 60 runs are presented in panel c. The environment with self-laid scent marks (marked as dots) is shown in panel d, where the dot’s size is proportional to the strength of the scent mark. The rat follows the scent gradient to find the food source. The number of steps needed to reach the goal versus the number of runs is plotted in panel e; after 56 runs the rat had generated a trail of scent marks leading from the home location to the food source (see the last four trajectories in panel c). One example of navigation with combined strategies is shown in Fig. 6(f–h), where trajectories of the rat’s paths obtained from 30 runs are presented in panel f and the number of steps needed to reach the goal versus the number of runs in panel h. In this experiment the rat found a more or less straight path to the food source after only five runs. From the given example we can see that scent marks (panel g) are laid only along the way to the food source, whereas in the previous example of self-marking navigation scent marks (see Fig. 6(d)) are spread out widely through the environment.
Results obtained from single experiments using different navigation strategies to find a goal from a random start position are presented in Fig. 7, where in every trial the rat was placed randomly within the environment. In panel a we show results from Q-learning navigation based on place cells formed from both visual and olfactory input. A vector field representation of learned Q-values after 100 runs is shown, where each vector represents the cumulative direction of movement from the corresponding location. The vector field was calculated according to the following procedure. A 20×20 grid was used to define specific points in the environment. For each intersection point of the grid, the subset of place cells which fire at that point was found. Average Q-values for the eight directions were calculated for the corresponding subset of place cells. The resulting movement direction vector was computed from the obtained average Q-values for each intersection point of the grid. In panel b we show the resulting map of self-laid scent marks (marked by dots) from self-marking navigation after 200 runs. Here we use more runs since self-marking navigation converges more slowly than Q-learning (see Fig. 9(b)). When starting from random positions, the rat creates a map of a tree-like structure of scent marks, where it chooses the closest branch and then follows the gradient of scent marks, leading to the goal. Results of combined navigation are presented in panel c, where we show the vector field of learned actions (left) and the corresponding map of scent marks (right) after 100 runs. As expected we obtained similar results to those of self-marking and Q-learning navigation (see panels a and b). In general, we observed that when starting from the same location the rat creates one main trail of scent marks, whereas when starting from a random location the rat creates tree-like structures of scent marks with several main branches.
Also, the rat creates more scent marks when using pure self-marking navigation compared to the combined strategy.
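The vector field procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the firing threshold used to decide which cells "fire" at a grid point, the ordering of the eight directions, and the use of a Q-weighted sum of unit direction vectors as the "resulting movement direction vector" are our assumptions, and the function name `q_vector_field` is hypothetical.

```python
import numpy as np

# Eight movement directions as unit vectors (assumed ordering: E, NE, N, NW, W, SW, S, SE).
ANGLES = np.arange(8) * np.pi / 4
DIRECTIONS = np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)  # shape (8, 2)


def q_vector_field(rates, q_values, threshold=0.5):
    """Return one movement vector per grid point.

    rates:     (n_points, n_cells) firing rate of every place cell at every grid point
    q_values:  (n_cells, 8) learned Q-value of every cell for every direction
    threshold: assumed firing-rate cutoff that defines the active subset of cells
    """
    active = rates > threshold                   # subset of cells firing at each point
    counts = active.sum(axis=1, keepdims=True)   # number of active cells per point
    summed = active.astype(float) @ q_values     # summed Q-values per direction
    # Average Q-values over the active subset; points without active cells stay at zero.
    mean_q = np.divide(summed, counts, out=np.zeros_like(summed), where=counts > 0)
    # Assumed combination rule: Q-weighted sum of the eight unit direction vectors.
    return mean_q @ DIRECTIONS                   # shape (n_points, 2)
```

For a 20×20 grid, `rates` would hold 400 rows, one per intersection point, and the returned array can be passed directly to a quiver-style plot.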
We also investigated the performance of self-marking navigation in the environment with multiple targets. For this experiment we used an environment with two food sources as shown in Fig. 8(a), where in one case the rat always started to search for food from the same start position (home location) and in the other case the rat was placed at a random position. Results of a single experiment for self-marking navigation when always starting from the home location are shown in Fig. 8(b), where we show a map of self-laid scent marks after 200 runs. In the beginning the rat back-propagates scent marks from both goal locations, where at the end it creates a stronger trail of scent marks, which leads to only one of two food sources (see left and right sub-panels). When starting from a random location (panel c), the rat creates a map of scent marks with a tree-like structure similar to the case with one food source (see Fig. 7(b)). Here we obtain two trees of scent marks where each leads to one corresponding food source. Results of combined navigation are presented in Fig. 8(d, e). As expected, when starting from the home location (panel d), the rat marks only one route. Note, as opposed to self-marking navigation (panel b), the rat back-propagates scent marks only from one of the two food sources. Results for combined navigation when starting from a random location are shown in panel e where we show the vector field of learned actions (left sub-panel) and the corresponding map of scent marks (right sub-panel). As opposed to self-marking navigation (panel c), the rat creates only one tree of scent marks, where all direction vectors point to the marked food source. This is due to the fact that in combined navigation the rat marks only the locations where Q-values are relatively high. As soon as the rat finds one of the two goals, it goes to that goal location more often and propagates scent marks backwards (similarly to the results presented in panel d). 
We also observed that if one of the two food sources is located significantly closer to the home location than the other, the rat in most cases finds the closer food source. This is because the rat propagates scent marks backwards from the food sources to the home location, and scent marks from the closer food source reach the home location earlier than those from the food source which is further away. In general, we observed that the rat learns a unique route which leads to one of the two targets, and only in the case of pure self-marking navigation when starting from random locations does the rat create routes to both targets.
In the following paragraph we statistically determine the effectiveness of different stimuli for the goal navigation task and compare the previously described navigation strategies. The task for the rat was to find a route from the home location to the food source as shown in Fig. 2(b). In Fig. 9 results from four cases are shown: VQ) Place cells based on visual cues alone are used for goal navigation by using Q-learning; VOQ) Similar to the case VQ, but here cells are created from a combination of visual and olfactory cues; S) Self-marking navigation based on odor patches where the rat follows self-laid scent marks to find a food source; VOQS) Combined navigation where the rat marks its location only if the Q-values (obtained by VOQ) at this location have reached a given threshold (for details see Section 2). The average number of steps needed to find the goal versus number of runs obtained from 200 experiments is shown for each case in panel b. We obtained faster convergence when both visual and olfactory cues are used as compared to visual stimuli alone (see VQ, VOQ). This can be explained by the observation that cells formed from combined stimuli are less directional than those formed from visual cues alone. Note that if all place cells in the system are directional, then actions must be learned for every movement direction of the animal at every specific location in the environment. For instance, if the rat learns the direction to the goal from a specific location with a certain movement direction (e.g. north), then the rat will not know the direction to the goal from the same location when crossing this location with a different movement direction (e.g. east), since the place cells will not fire when moving along this different direction. 
With an omni-directional place cell system, actions are learned for a specific location independently of the movement direction of the animal (the same actions for all movement directions at a specific location), which consequently makes learning faster. Self-marking navigation alone (S) converges much more slowly than Q-learning based on PCs obtained from combined stimuli (VOQ), whereas the combination of self-marking navigation with Q-learning (VOQS) is faster than Q-learning alone (VOQ). Note that the number of steps needed to reach the goal when using Q-learning (VQ/VOQ) is larger on average than that for self-marking navigation (S) or the combined method (VOQS). This is because we use an RL strategy with exploration and exploitation, where the rat tries random directions hoping to find a better path. This sometimes leads to a loss of track and long path trajectories, which on average shifts the curve up. In self-marking navigation or with the combined method the rat does not explore the environment anymore as it now follows self-laid scent marks. We also compared self-marking navigation (S) with the combined method (VOQS) in a task where, after learning of the spatial task, the self-generated marks were “cleaned” (i.e. u(x,y) = 0). Results are presented in Fig. 9(c). As expected, the rat has to relearn the path to the goal from scratch when using self-marking navigation alone, whereas the combined strategy allows the rat to use learned Q-values (or in other words, to navigate using allothetic visual and olfactory cues) whenever self-generated scent marks are not available anymore, and it re-marks the path again. The small peak with a decay after “cleaning” (see case VOQS) is a result of the previously discussed exploratory behavior.
Hierarchical input preference in spatial navigation
In the presented combined strategy scent trails are used by the rat to find a goal after learning. However, this kind of strategy is inconsistent with biological findings. Maaswinkel and Whishaw (1999) showed that rats use visual cues for spatial navigation if they are available. If visual cues are not available, the rats rely on self-generated odor cues. To address this problem we modified our combined navigation strategy by adding hierarchical input preference to the model. At the beginning the rat uses both environmental cues and self-marking cues (combined strategy) in order to speed up learning as described above. This differs from the previous version in that the rat stops laying and following scent marks as soon as the trail of scent marks reaches the home location, whereas Q-values are still left modifiable. Furthermore, the rat prefers environmental cues (i.e. navigation based on Q-values) if they are available; if not, the rat follows previously generated scent marks. Here we use a combined strategy (Q-learning with self-marking navigation) for learning as it makes learning faster, and only later on do we use the hierarchical input preference for navigation. During learning, Q-values as well as odor marks are generated, where initially the Q-value development dominates the learning and guides the placing of the odor patches, since the rat lays a scent mark only if the normalized maximum Q-value at this location has reached a given threshold. As we would associate the Q-system with landmarks, we find that during learning we are, due to Q-dominance, compatible with Maaswinkel and Whishaw (1999). Note that if we started with the hierarchical input preference from the beginning, this would lead to slower convergence, since the rat would learn the route based on landmarks alone (without self-generated odor marks), and this would lead to the results obtained by using the Q-learning algorithm alone. 
After learning the model allows distinguishing between different input preferences. To demonstrate such a behavior we have performed two different experiments. In the first experiment we flipped the self-generated scent marks after learning along the diagonal of the box in a way that the scent trail does not lead to the goal anymore (see left and right panels in Fig. 10(a, b)), where environmental cues were left unaffected. In the second experiment we removed all environmental cues (visual and olfactory) after learning and left scent trail unaffected. Two examples of single results from the first experiment are shown in Fig. 10(a, b), where in the left sub-panel we show the scent trail and the rat’s trajectory at the end of learning and in the right sub-panel we show three trajectories of consecutive runs after scent marks were flipped. We found that the rat takes a correct route to the goal using environmental cues. We also noticed that the route is along the trail of scent marks that were produced during learning, which means that the rat has created two similar representations of route to the goal, where one is based on environmental cues and the other based on self-laid scent marks. After learning, the rat prefers environmental cues, so the rat’s performance remains unaffected when we flip scent marks. Statistics for 200 experiments are presented in panel c. We show the average number of steps needed to find a goal versus number of runs, where after 49 runs we flipped the scent trail. This analysis shows that the rat finds a path using combined navigation after approximately 20 runs, on average. After learning, the rat switches to the navigation based on environmental cues, and we observe an upwards curve shift due to the exploration and exploitation strategy of the Q-learning. As expected, the rat’s performance is not affected after scent marks were flipped since the rat prefers environmental cues after learning. 
Statistics for the second experiment are presented in Fig. 10(d), where we can see that as soon as environmental cues are unavailable (i.e. removed) the rat follows the trail of scent marks which leads to the food source. Lack of exploration in this case leads to the noise-free flat line after run 49. Our modified model captures properties of hierarchical input preference similar to those observed in animals (Maaswinkel and Whishaw 1999). For further discussion and relation to biological data see Section 4.
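The decision logic of the hierarchical input preference can be summarized by the following sketch. It is a simplified illustration, not the exact implementation: exploration is omitted, the representation of sensed scent as one value per neighboring direction is an assumption, normalization by a global maximum Q-value is an assumption, and the function names `choose_direction` and `should_mark` as well as the default threshold are hypothetical.

```python
import numpy as np


def choose_direction(q_values, scent_neighbors, env_cues_available):
    """Pick one of eight movement directions after learning.

    q_values:        (8,) Q-values at the current location (environmental cues)
    scent_neighbors: (8,) scent strength sensed in each neighboring direction
    """
    if env_cues_available:
        # Preferred input: navigation based on Q-values (landmarks).
        return int(np.argmax(q_values))
    if np.any(scent_neighbors > 0):
        # Fallback input: follow the gradient of self-laid scent marks.
        return int(np.argmax(scent_neighbors))
    return None  # neither cue is available; the caller has to explore


def should_mark(q_values, q_max_global, threshold=0.5):
    """During learning, lay a scent mark only where the normalized maximum Q-value is high."""
    normalized = np.max(q_values) / q_max_global if q_max_global > 0 else 0.0
    return bool(normalized >= threshold)
```

With environmental cues present, a flipped scent trail has no effect on the chosen direction, which corresponds to the first experiment; with the cues removed, the choice is driven by the scent trail alone, which corresponds to the second.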
Remapping
Remapping of PFs
The resulting PFs of a remapping experiment when switching between environments “A” and “B” are shown in Fig. 11(a), with the same selected 100 of the total 500 place cells shown for each case. As expected, we can see that PFs of cells can change their firing rate, position, or shape, or turn on/off. Note that there are also cells which do not change their properties between the two environments. The average distribution of change in maximal firing rates of PFs between environments “A” and “B” in 100 experiments is shown in Fig. 11(b). Note that we show change in firing rates of PFs only for cells with maximum firing rate r > 0.5, which are the cells that actually drive Q-learning. Positive values mean that cells increased their firing rate or turned on when moving the rat from environment “A” to “B”, and vice versa. The distribution of changes in the positions of PFs (only with maximum firing rate r > 0.5) is presented in Fig. 11(c), where we plot the average distance between PF centers (given by the location of the maximal firing in the PF) in environment “A” versus “B”. Place cells, as expected, display their original fields when the rat is returned from “B” back to “A” (see Fig. 11(a)).
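The two remapping measures described above (change in maximal firing rate and shift of the PF center) can be computed as in the following sketch. The array layout, the function name `remapping_stats`, and the exact inclusion rules are assumptions: here a cell enters the rate-change statistic if its peak exceeds r > 0.5 in either environment, and the position-shift statistic only if it exceeds the criterion in both.

```python
import numpy as np


def remapping_stats(rates_a, rates_b, positions, r_min=0.5):
    """Compare place fields of the same cells in two environments.

    rates_a, rates_b: (n_cells, n_positions) firing-rate maps in environments A and B
    positions:        (n_positions, 2) coordinates of the sampled locations
    """
    max_a = rates_a.max(axis=1)
    max_b = rates_b.max(axis=1)

    # Change in maximal firing rate; positive means the cell fired more (or turned on) in B.
    strong = (max_a > r_min) | (max_b > r_min)
    rate_change = (max_b - max_a)[strong]

    # PF center = location of maximal firing; shift = distance between centers in A and B.
    both = (max_a > r_min) & (max_b > r_min)
    centers_a = positions[rates_a.argmax(axis=1)]
    centers_b = positions[rates_b.argmax(axis=1)]
    shift = np.linalg.norm(centers_b - centers_a, axis=1)[both]
    return rate_change, shift
```

Histograms of `rate_change` and `shift`, averaged over repeated experiments, would correspond to the kind of distributions shown in Fig. 11(b) and (c).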
Remapping and goal navigation
In the following subsection we present results on spatial navigation with respect to the remapping of PFs when switching between two different environments. For the environmental setup see Fig. 3. The results of goal navigation while switching between environments “A” and “B” are shown in Fig. 11(d–g), where the average number of steps needed to find the food source is plotted versus number of runs for 200 experiments. Navigation results obtained by using Q-learning based on PCs obtained from visual and olfactory stimuli (VOQ) are presented in panel d, and results of the combined method (VOQS) are shown in panel e. Note that here we used a combined strategy without hierarchical input preference, i.e. the rat would still follow a scent trail after learning. We can see that by using both navigation strategies the rat can learn to find the goal in two environments “A” and “B”, whenever the location of the food source is the same in both environments, and it goes directly to the goal after returning to the previous environment. It is worthwhile to note that in our model we do not introduce unfamiliar cues to the rat in the new environment, but we just “fool” the rat by switching visual cues and changing the position and shape of olfactory cues. That is why we also observe that the rat uses some information (i.e. learned Q-values) from the previous environment, and it does not have to relearn from scratch when moved to the new environment. In panel d, for comparison, we show the control case where in environments “A” and “B” we initialize Q-values randomly from a uniform distribution within the interval [0, 1]. The results for goal navigation while switching between environments “A” and “C” (the location of the goal is also changed) for the cases VOQ and VOQS are presented in Fig. 11(f, g), respectively. Here we found that the rat has to relearn the food location all the time, even if returned to the previously visited environment. 
However, by employing the combined strategy (see panel g), the rat can easily find the food source in both environments even if the location of the goal is changed, because the rat just follows the trail of scent marks. Note that if we used the combined strategy with hierarchical input preference we would have obtained results similar to the case VOQ (see panel f), since after learning the rat would prefer environmental cues and navigate according to Q-values. In general, we observed that the rat can learn both environments when the location of the goal is unchanged but has to relearn the route in case of changes in both environmental cues and location of the goal. For further discussion on remapping results see Section 4.
Discussion
In the following we compare our place cell model and goal navigation strategies with other approaches. We also discuss our results in relation to biological data.
A starting point for this study was experimental data which show that olfactory cues play an important role for the stability of PFs (Markus et al. 1994; Save et al. 2000) and navigation of rodents (Tomlinson and Johnston 1991; Lavenex and Schenk 1995, 1996, 1998; Wallace et al. 2002a, 2003). We have for the first time, to our knowledge, implemented an odor supported place cell model and applied it to goal navigation learning. Based on self-marking behavior in rodents (Harley and Martin 1999), we proposed a novel navigation mechanism which allows better performance in goal directed navigation. We predict that the use of environmental odor cues improves omni-directionality of place cells, which as a consequence results in faster goal directed learning, whereas the use of self-generated scent marks results in even faster learning and could serve as additional information for path finding when environmental cues are not available.
Place cell model
We modeled place cells from visual and olfactory cues using a feed-forward network based on radial basis functions (RBF). Here we used an abstract model excluding interactions between hippocampal layers. This is justified as we did not focus on the place cell model itself but rather on the contribution of sensory inputs to the formation of place cells and on the utilization of place cells in spatial navigation. Our approach is similar to the model of O’Keefe and Burgess (1996) or Hartley et al. (2000), but we use n-dimensional RBFs instead of calculating the thresholded sum of the Gaussian tuning curves of the rat’s distance from each box wall (O’Keefe and Burgess 1996). Our model differs from the augmented model of Hartley et al. (2000), where the firing rate of a place cell is modeled as the thresholded sum of boundary vector cells (BVCs). The response of a BVC is the product of two Gaussian tuning curves, where one is a function of the distance from the rat to the wall and the second is a function of the rat’s head direction (Hartley et al. 2000). In these models, the amplitude and the width of the PF depend on the distance to the wall: the larger the distance, the lower the amplitude and the broader the field, and vice versa. In our model we keep the width of the PF, σ_f, fixed, and the obtained PFs vary in shape and amplitude because of the combination of different sensory inputs (see Fig. 4(c)). We use a winner-takes-all mechanism for PF formation, which means that we do not change weights of neighbor neurons as in self-organizing map (SOM) approaches (Chokshi et al. 2003; Ollington and Vamplew 2004), as there are no obvious topographical relations between the positions of the PFs and the anatomical locations of the place cells relative to each other within the hippocampus (O’Keefe 1999).
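To make the contrast with SOM approaches concrete, the following sketch shows a Gaussian RBF layer with a fixed width σ_f and a winner-takes-all update in which only the best-matching cell is adapted. It is an illustration under assumptions rather than the model's actual equations: the Gaussian form, the learning rate `lr`, the update rule that moves the winner's center toward the input, and the function names are all hypothetical.

```python
import numpy as np


def firing_rates(x, centers, sigma_f):
    """Gaussian RBF response of every cell to the n-dimensional sensory input x."""
    d2 = np.sum((centers - x) ** 2, axis=1)        # squared distance to each center
    return np.exp(-d2 / (2.0 * sigma_f ** 2))      # fixed width sigma_f for all cells


def wta_update(x, centers, sigma_f, lr=0.1):
    """Adapt only the winning cell; neighbors are left unchanged (unlike a SOM)."""
    winner = int(np.argmax(firing_rates(x, centers, sigma_f)))
    updated = centers.copy()
    updated[winner] += lr * (x - centers[winner])  # move the winner toward the input
    return updated, winner
```

Because `x` concatenates all sensory inputs (for example visual and olfactory features), the spatial shape and amplitude of the resulting fields depend on how those inputs vary across the environment, even though σ_f itself is fixed.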
In several studies (Arleo and Gerstner 2000; Arleo et al. 2004; Sheynikhovich et al. 2005; Strösslin et al. 2005) self-motion cues have been used as an additional input to hippocampus to create place cells. The disadvantage of self-motion cues is that path integration leads to an accumulation of errors in direction and distance, and needs to be re-calibrated according to position estimation from stable cues (Etienne et al. 1996, 2004). Save et al. (2000) have shown that path integration alone is insufficient to maintain the stability of PFs. If visual or olfactory sensory cues are available then these cues dominate over path integration information (Maaswinkel and Whishaw 1999; Whishaw et al. 2001). In contrast to other models we use odor cues as an additional input to form place cells. For the sake of simplicity we model static odors. Models of dynamic odors are quite complex and include many parameters (Boeker et al. 2000). By using static odors we ignore odor patch development, and effects that might be induced by changes of odors in time. Here we concentrate only on an odor function as a reference cue that is sensed unambiguously by the rat, as opposed to visual cues, which might be mismatched, misinterpreted or not seen at all. Obtained PFs capture similar properties to those that were found in the rats’ hippocampus (Muller and Kubie 1987; Muller et al. 1994; Wilson and McNaughton 1993; O’Keefe 1999).
Place cells tend to be less directional when navigating in an open environment as compared to navigation where the rat is forced to move along a specific direction (McNaughton et al. 1983; Muller et al. 1994; Markus et al. 1995). These properties have also been captured by the models of Sharp (1991) and Brunel and Trullier (1998). In this study, we have investigated the contribution of olfactory input to the directionality of place cells. From our analysis, we found that if olfactory cues are available for the formation of place cells, more omnidirectional fields develop. This agrees with observations of PFs by Battaglia et al. (2004) on cue-rich and cue-poor linear tracks. The proportion of omnidirectional cells over total spatially selective cells was ≈ 43% in a cue-rich environment vs. ≈ 30% in a cue-poor environment. We obtained more omnidirectional cells because cells tend to be more directional in eight-arm mazes or T-mazes compared to open environments (Muller et al. 1994; Markus et al. 1995). Our results support the notion that place cell directionality should influence goal directed behavior, as we obtained better performance in a goal navigation task when using place cells formed from combined visual and olfactory stimuli than when using place cells formed from visual cues alone.
Goal navigation learning
In the second part of our study we presented different navigation strategies and compared them in a goal navigation task and in a remapping situation. Goal navigation based on place cells has previously been addressed by implementing reinforcement learning algorithms (Arleo and Gerstner 2000; Arleo et al. 2004; Foster et al. 2000; Strösslin et al. 2005; Sheynikhovich et al. 2005; Krichmar et al. 2005). We presented a new navigation mechanism that combined Q-learning with navigation based on self-generated odor patches in order to achieve better performance in goal directed navigation. Our approach differs from that of Russell (1995), who developed a robotic system where the robot is able to lay an odor trail on the ground and to follow the trail afterward. In his approach the robot does not use odor marking to find a goal, whereas in our approach the rat lays scent marks in order to find a goal and to create a trail which leads to the food source. The proposed mechanism, based on self-marking, propagates scent marks backwards from the location of the reward as in reinforcement learning, but here we do not have predefined features, but rather create them “on the fly”, and we do not directly memorize action values associated with states. The mechanism of RBF-like features created on-line in action learning was used in several other studies (Kretchmar and Anderson 1997; Atkeson et al. 1997). The method of updating odor marks resembles a TD(0) approach with function approximation (Sutton and Barto 1998), where the weights towards the value function are increased if the following states have high values. The update rule in our study is different from the one used in TD. Here, updates of odor marks are made by a fixed amount based on the binary decision whether some odor is sensed at the current location or not.
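The difference from a TD update can be illustrated with a minimal sketch of the marking rule on a discretized environment. The grid representation of the scent map u, the sensing radius, the increment `delta`, the treatment of the food source as an odor that can be sensed within the same radius, and the function name `mark_if_odor_sensed` are all assumptions made for illustration only.

```python
import numpy as np


def mark_if_odor_sensed(u, pos, goal, delta=0.1, radius=1):
    """Deposit a fixed amount of scent at pos if any odor is sensed there.

    u:    2D array of scent-mark strengths (modified in place)
    pos:  (row, col) of the rat
    goal: (row, col) of the food source, assumed to emit a sensable odor
    """
    r, c = pos
    r0, r1 = max(r - radius, 0), min(r + radius + 1, u.shape[0])
    c0, c1 = max(c - radius, 0), min(c + radius + 1, u.shape[1])
    near_goal = max(abs(r - goal[0]), abs(c - goal[1])) <= radius
    senses_mark = bool(np.any(u[r0:r1, c0:c1] > 0))
    if near_goal or senses_mark:   # binary decision: is some odor sensed or not?
        u[r, c] += delta           # fixed-size update, independent of any value difference
        return True
    return False
```

Repeated visits grow the marked region outwards from the food source, so the trail propagates backwards towards the start location without storing action values for states.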
Experimental data show that rats perform better in cue-rich environments compared to cue-poor environments. Barnes et al. (1980) showed that if all of the extra-maze cues surrounding a circular maze were removed, rats made many more errors finding a goal location. Morris (1984) demonstrated that rats performed worse when he obscured some of the cues around the water maze by pulling the curtains 1/4 of the way around. When he obscured all of the extra-maze cues by pulling the curtains fully around, the rats performed very badly. Prados and Trobalon (1998) showed that rats could learn the platform location in a water maze if 4 or 2 extra-maze cues were available, but they were much worse if only 1 cue was present. We addressed these findings by testing the performance of our model rat with and without olfactory input, where we observed that the model rat performed significantly better with both visual and olfactory cues compared to visual stimuli alone.
The experiments of Maaswinkel and Whishaw (1999) suggest that rats have a hierarchical preference in using sensory cues. In their experiments, rats ignored distortion in self-motion cues when they were moved to a new starting position, or ignored distortion in odor cues (scent marks) when the apparatus was rotated, suggesting that visual cues dominate over other cues whenever they are available. However, when blindfolded, the rats still performed well, suggesting that they were using odor cues when available, and path integration when odor cues were disrupted. To address these findings we modified our combined navigation strategy by adding an input preference component where the rat uses both environmental and self-generated cues for learning. After learning the rat prefers environmental cues if they are available and uses self-generated olfactory cues when visual cues are not available. By using such a modified strategy, we have demonstrated that the model rat succeeds in faster goal directed learning, showing unaffected performance when environmental cues are changed. This is supported by the finding that the rat can find a goal when the scent trail is distorted or removed, or can find the route to the goal using self-laid odor cues when environmental cues are unavailable.
Remapping and goal navigation
The results for goal navigation with respect to remapping of place cells show that the rat can learn to find a goal in two environments, “A” and “B”, by using Q-learning or combined navigation when the location of the goal is unchanged, but environmental cues are switched. Note that the rat can learn both environments only as long as different, partially overlapping subsets of place cells fire in the environments “A” and “B”, i.e. most of the cells which do not fire in environment “A” fire in environment “B”. In case of cue rotation the rat would need to relearn the task all the time if the location of the goal is not rotated together with the landmarks, because in both environments the same subset of place cells would be used. This is equivalent to leaving the environment the same, but changing the location of the goal. In the Morris water-maze experiment (Morris 1981) the rat also has to relearn the location of the platform whenever it is moved to another location. When environments are substantially different and the cells remap, in our experiments the rat can easily find the food source in both environments, even if the location of the goal is changed, by employing the combined strategy, because the rat can use the trail of scent marks.
Our model predicts that the remapping of PFs would disrupt a previously learned route to a goal. The closest empirical data addressing this prediction is a study by Jeffery et al. (2003), who examined the relationship between remapping and performance of a spatial navigation task. In their experiment, rats were trained to search for a food source in a black box, and subsequently tested in a white box. Jeffery et al. (2003) found that place cells re-mapped between the two boxes, and although the rats were slightly worse in the second environment, they still performed well. This finding suggests that, although the place cells may encode spatial contexts, they do not directly guide behavior. One difference between the experimental situation of Jeffery et al. (2003) and that of the current model is that in the experimental situation there were no landmarks within the square apparatus. Instead, rats relied on spatial landmarks - posters on the curtains surrounding the apparatus - for orientation. So, in the Jeffery et al. (2003) experiment, unlike in our model, cues outside the immediate environment were the only way in which the animal could distinguish the correct corner. The results of Yoganarasimha and Knierim (2005) suggest that head direction cells are influenced by distal landmarks, whereas some place cells are influenced by local landmarks. Thus it may be that the Jeffery et al. (2003) task was one that could not be solved using place cells, because there was no way of distinguishing one corner of the apparatus from the other because there were no local cues available within the square. Rats may have used a non-place cell representation - such as the head direction cell system - to solve the task. Had there been local cues inside the square enclosure and no cues outside the enclosure, a stronger link between remapping and disrupted navigation may have been observed. An acknowledged difficulty with this account, however, is that Jeffery et al. (2003) also show that this task is impaired by lesions of the hippocampus.
Predictions and suggested experiments
Present experimental studies on spatial learning in cue-rich and cue-poor environments are still based on visual cues alone (Barnes et al. 1980; Morris 1984; Prados and Trobalon 1998). They also test the performance of the rat after learning. It would thus be interesting to test whether real animals would learn the task faster in environments with additional olfactory cues compared to visual stimuli alone, as our model predicts.
Experiments on self-marking behavior in the process of learning would be useful to prove or disprove the proposed setup and the hypothesis that self-marking behavior speeds up learning.
In the Jeffery et al. (2003) experiment on place cell remapping and goal navigation, it may be that the task was one that could not be solved using place cells, because there was no way of distinguishing one corner of the apparatus from the other, as there were no local cues available within the square. It would be interesting to perform more experiments in order to test whether remapping of place cells influences goal directed learning or not, as our model predicts.
By using a combined strategy with hierarchical input preference the model rat creates two representations of the route to the goal: one is based on environmental cues while the other is based on self-generated scent marks. Our model predicts that in case of remapping, when the goal in two environments is at different locations, the rat would fail when moved back to the previous environment since it would prefer environmental cues. We would hypothesize that the rat could use the scent trail in the next trial after it fails to find a goal when using environmental cues. Experiments to test this hypothesis would also be of great interest.
Acknowledgement
We thank Alexander Wolf for helpful comments.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Footnotes
RBF – radial basis function.
An erratum to this article can be found online at 10.1007/s10827-010-0216-9
Contributor Information
Tomas Kulvicius, Email: tomas@bccn-goettingen.de.
Minija Tamosiunaite, Email: m.tamosiunaite@if.vdu.lt.
Florentin Wörgötter, Email: worgott@bccn-goettingen.de.
References
- Arleo A., Gerstner W. Spatial cognition and neuro-mimetic navigation: A model of hippocampal place cell activity. Biological Cybernetics. 2000;83(3):287–299. doi: 10.1007/s004220000171. [DOI] [PubMed] [Google Scholar]
- Arleo A., Smeraldi F., Gerstner W. Cognitive navigation based on nonuniform Gabor space sampling, unsupervised growing networks, and reinforcement learning. IEEE Transactions on Neural Networks. 2004;15(3):639–652. doi: 10.1109/TNN.2004.826221. [DOI] [PubMed] [Google Scholar]
- Atkeson C. G., Moore A. W., Schaal S. Locally weighted learning for control. Artificial Intelligence Review. 1997;11(1):75–113. doi: 10.1023/A:1006511328852. [DOI] [Google Scholar]
- Balakrishnan K., Bousquet O., Honavar V. Spatial learning and localization in animals: A computational model and its implications for mobile robots. Adaptive Behavior. 1999;7(2):137–216. doi: 10.1177/105971239900700203. [DOI] [Google Scholar]
- Barnes C., Nadel L., Honig W. Spatial memory deficit in senescent rats. Canadian Journal of Psychology. 1980;34:29–39. doi: 10.1037/h0081022. [DOI] [PubMed] [Google Scholar]
- Barry C., Hayman R., Burgess N., Jeffery K. J. Experience-dependent rescaling of entorhinal grids. Nature Neuroscience. 2007;10(6):682–684. doi: 10.1038/nn1905. [DOI] [PubMed] [Google Scholar]
- Battaglia F. P., Sutherland G. R., McNaughton B. L. Local sensory cues and place cell directionality: Additional evidence of prospective coding in the hippocampus. The Journal of Neuroscience. 2004;24(19):4541–4550. doi: 10.1523/JNEUROSCI.4896-03.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Diekmann B., Griebel M., et al. The modelling of odour dispersion with time-resolved models. Agrartechnische Forschung. 2000;4:84–89. [Google Scholar]
- Bousquet, O., Balakrishnan, K., & Honavar, V. (1998). Is the hippocampus a Kalman filter? In Proceedings of the pacific symposium on biocomputing (pp. 655–666).
- Brown M. A., Sharp P. E. Simulation of spatial learning in the Morris water maze by a neural network model of the hippocampal formation and nucleus accumbens. Hippocampus. 1995;5(3):171–188. doi: 10.1002/hipo.450050304.
- Brunel N., Trullier O. Plasticity of directional place fields in a model of rodent CA3. Hippocampus. 1998;8:651–665. doi: 10.1002/(SICI)1098-1063(1998)8:6<651::AID-HIPO8>3.0.CO;2-L.
- Calton J., Stackman R., Goodridge J., Archey W., Dudchenko P., Taube J., et al. Hippocampal place cell instability after lesions of the head direction cell network. The Journal of Neuroscience. 2003;23:9719–9731. doi: 10.1523/JNEUROSCI.23-30-09719.2003.
- Carvell G. E., Simons D. J. Biometric analyses of vibrissal tactile discrimination in the rat. The Journal of Neuroscience. 1990;10(8):2638–2648. doi: 10.1523/JNEUROSCI.10-08-02638.1990.
- Chokshi, K., Wermter, S., & Weber, C. (2003). Learning localisation based on landmarks using self-organisation. In ICANN (pp. 504–514).
- Collett T. S., Cartwright B. A., Smith B. A. Landmark learning and visuo-spatial memories in gerbils. Journal of Comparative Physiology. 1986;158(6):835–851. doi: 10.1007/BF01324825.
- Dudchenko P. A. How do animals actually solve the T maze? Behavioral Neuroscience. 2001;115:850–860. doi: 10.1037/0735-7044.115.4.850.
- Eichenbaum H., Dudchenko P., Wood E., Shapiro M., Tanila H. The hippocampus, memory, and place cells: Is it spatial memory or a memory space? Neuron. 1999;23(2):209–226. doi: 10.1016/S0896-6273(00)80773-4.
- Eilam D., Golani I. Home base behavior of rats (Rattus norvegicus) exploring a novel environment. Behavioural Brain Research. 1989;34:199–211. doi: 10.1016/S0166-4328(89)80102-0.
- Ekstrom A. D., Kahana M. J., Caplan J. B., Fields T. A., Isham E. A., Newman E. L., et al. Cellular networks underlying human spatial navigation. Nature. 2003;425(6954):184–188. doi: 10.1038/nature01964.
- Etienne A. S., Jeffery K. J. Path integration in mammals. Hippocampus. 2004;14(2):180–192. doi: 10.1002/hipo.10173.
- Etienne A. S., Maurer R., Boulens V., Levy A., Rowe T. Resetting the path integrator: A basic condition for route-based navigation. The Journal of Experimental Biology. 2004;207(Pt 9):1491–1508. doi: 10.1242/jeb.00906.
- Etienne A. S., Maurer R., Seguinot V. Path integration in mammals and its interaction with visual landmarks. The Journal of Experimental Biology. 1996;199(Pt 1):201–209. doi: 10.1242/jeb.199.1.201.
- Foster D. J., Morris R. G., Dayan P. A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus. 2000;10(1):1–16. doi: 10.1002/(SICI)1098-1063(2000)10:1<1::AID-HIPO1>3.0.CO;2-1.
- Franzius M., Vollgraf R., Wiskott L. From grids to places. Journal of Computational Neuroscience. 2007;22(3):297–299. doi: 10.1007/s10827-006-0013-7.
- Gaussier P., Revel A., Banquet J. P., Babeau V. From view cells and place cells to cognitive map learning: Processing stages of the hippocampal system. Biological Cybernetics. 2002;86(1):15–28. doi: 10.1007/s004220100269.
- Hafting T., Fyhn M., Molden S., Moser M. B., Moser E. I. Microstructure of a spatial map in the entorhinal cortex. Nature. 2005;436:801–806. doi: 10.1038/nature03721.
- Harley C. W., Martin G. M. Open field motor patterns and object marking, but not object sniffing, are altered by ibotenate lesions of the hippocampus. Neurobiology of Learning and Memory. 1999;72(3):202–214. doi: 10.1006/nlme.1998.3902.
- Hartley T., Burgess N., Lever C., Cacucci F., O’Keefe J. Modeling place fields in terms of the cortical inputs to the hippocampus. Hippocampus. 2000;10(4):369–379. doi: 10.1002/1098-1063(2000)10:4<369::AID-HIPO3>3.0.CO;2-0.
- Hill A. J., Best P. J. Effects of deafness and blindness on the spatial correlates of hippocampal unit activity in the rat. Experimental Neurology. 1981;74(1):204–217. doi: 10.1016/0014-4886(81)90159-X.
- Hines D. J., Whishaw I. Q. Home bases formed to visual cues but not to self-movement (dead reckoning) cues in exploring hippocampectomized rats. European Journal of Neuroscience. 2005;22:2363–2375. doi: 10.1111/j.1460-9568.2005.04412.x.
- Hölscher C. Time, space and hippocampal functions. Reviews in the Neurosciences. 2003;14(3):253–284. doi: 10.1515/revneuro.2003.14.3.253.
- Jeffery K., Gilbert A., Burton S., Strudwick A. Preserved performance in a hippocampal-dependent spatial task despite complete place cell remapping. Hippocampus. 2003;13:175–189. doi: 10.1002/hipo.10047.
- Jeffery K., O’Keefe J. Learned interaction of visual and idiothetic cues in the control of place field orientation. Experimental Brain Research. 1999;127:151–161. doi: 10.1007/s002210050785.
- Knierim J. J., Kudrimoti H. S., McNaughton B. L. Place cells, head direction cells, and the learning of landmark stability. The Journal of Neuroscience. 1995;15(3 Pt 1):1648–1659. doi: 10.1523/JNEUROSCI.15-03-01648.1995.
- Knierim J. J., Kudrimoti H. S., McNaughton B. L. Interaction between idiothetic cues and external landmarks in the control of place cells and head direction cells. Journal of Neurophysiology. 1998;80:425–446. doi: 10.1152/jn.1998.80.1.425.
- Kretchmar, R., & Anderson, C. (1997). Comparison of cmacs and radial basis functions for local function approximators in reinforcement learning. In Proceedings of the IEEE international conference on neural networks (pp. 834–837). Houston, TX.
- Krichmar J. L., Seth A. K., Nitz D. A., Fleischer J. G., Edelman G. M. Spatial navigation and causal analysis in a brain-based device modeling cortical-hippocampal interactions. Neuroinformatics. 2005;3(3):197–221. doi: 10.1385/NI:3:3:197.
- Lavenex P., Schenk F. Influence of local environmental olfactory cues on place learning in rats. Physiology & Behavior. 1995;58(6):1059–1066. doi: 10.1016/0031-9384(95)02002-0.
- Lavenex P., Schenk F. Integration of olfactory information in a spatial representation enabling accurate arm choice in the radial arm maze. Learning & Memory. 1996;2(6):299–319. doi: 10.1101/lm.2.6.299.
- Lavenex P., Schenk F. Olfactory traces and spatial learning in rats. Animal Behaviour. 1998;56(5):1129–1136. doi: 10.1006/anbe.1998.0873.
- Maaswinkel H., Whishaw I. Q. Homing with locale, taxon, and dead reckoning strategies by foraging rats: Sensory hierarchy in spatial navigation. Behavioural Brain Research. 1999;99:143–152. doi: 10.1016/S0166-4328(98)00100-4.
- Markus E., Barnes C., McNaughton B., Gladden V., Skaggs W. Spatial information content and reliability of hippocampal CA1 neurons: Effects of visual input. Hippocampus. 1994;4:410–421. doi: 10.1002/hipo.450040404.
- Markus E. J., Qin Y. L., Leonard B., Skaggs W. E., McNaughton B. L., Barnes C. A. Interactions between location and task affect the spatial and directional firing of hippocampal neurons. The Journal of Neuroscience. 1995;15(11):7079–7094. doi: 10.1523/JNEUROSCI.15-11-07079.1995.
- McNaughton B. L., Barnes C. A., O’Keefe J. The contributions of position, direction, and velocity to single unit activity in the hippocampus of freely-moving rats. Experimental Brain Research. 1983;52(1):41–49. doi: 10.1007/BF00237147.
- McNaughton B., Battaglia F., Jensen O., Moser E., Moser M. Path integration and the neural basis of the ‘cognitive map’. Nature Reviews Neuroscience. 2006;7:663–678. doi: 10.1038/nrn1932.
- Morris R. Spatial localization does not require the presence of local cues. Learning and Motivation. 1981;12:239–260. doi: 10.1016/0023-9690(81)90020-5.
- Morris R. Developments of a water-maze procedure for studying spatial learning in the rat. Journal of Neuroscience Methods. 1984;11(1):47–60. doi: 10.1016/0165-0270(84)90007-4.
- Muller R. U., Bostock E., Taube J. S., Kubie J. L. On the directional firing properties of hippocampal place cells. The Journal of Neuroscience. 1994;14(12):7235–7251. doi: 10.1523/JNEUROSCI.14-12-07235.1994.
- Muller R. U., Kubie J. L. The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. The Journal of Neuroscience. 1987;7(7):1951–1968. doi: 10.1523/JNEUROSCI.07-07-01951.1987.
- Muller R. U., Ranck J. B., Taube J. S. Head direction cells: Properties and functional significance. Current Opinion in Neurobiology. 1996;6(2):196–206. doi: 10.1016/S0959-4388(96)80073-0.
- Nemati F., Whishaw I. Q. The point of entry contributes to the organization of exploratory behaviour of rats on an open field: An example of spontaneous episodic memory. Behavioural Brain Research. 2007;182:119–128. doi: 10.1016/j.bbr.2007.05.016.
- O’Keefe J. Do hippocampal pyramidal cells signal non-spatial as well as spatial information? Hippocampus. 1999;9:352–365. doi: 10.1002/(SICI)1098-1063(1999)9:4<352::AID-HIPO3>3.0.CO;2-1.
- O’Keefe J., Burgess N. Geometric determinants of the place fields of hippocampal neurons. Nature. 1996;381(6581):425–428. doi: 10.1038/381425a0.
- O’Keefe J., Dostrovsky J. The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat. Brain Research. 1971;34(1):171–175. doi: 10.1016/0006-8993(71)90358-1.
- O’Keefe J., Nadel L. The hippocampus as a cognitive map. Oxford: Oxford University Press; 1978.
- O’Keefe J., Speakman A. Single unit activity in the rat hippocampus during a spatial memory task. Experimental Brain Research. 1987;68(1):1–27. doi: 10.1007/BF00255230.
- Ollington, R., & Vamplew, P. (2004). Learning place cells from sonar data. In AISAT2004: International conference on artificial intelligence in science and technology (pp. 126–131).
- Prados J., Trobalon J. The location of an invisible goal requires the presence of at least two landmarks. Psychobiology. 1998;26:42–48.
- Recce M., Harris K. D. Memory for places: A navigational model in support of Marr’s theory of hippocampal function. Hippocampus. 1996;6(6):735–748. doi: 10.1002/(SICI)1098-1063(1996)6:6<735::AID-HIPO15>3.0.CO;2-1.
- Reynolds, S. I. (2002). The stability of general discounted reinforcement learning with linear function approximation. In UK workshop on computational intelligence (UKCI-02) (pp. 139–146). Birmingham, UK.
- Russell R. A. Laying and sensing odor markings as a strategy for assisting mobile robot navigation tasks. Robotics & Automation Magazine, IEEE. 1995;2(3):3–9. doi: 10.1109/100.414920.
- Sargolini F., Fyhn M., Hafting T., McNaughton B., Witter M. P., Moser E. I., et al. Conjunctive representation of position, direction, and velocity in entorhinal cortex. Science. 2006;312:758–762. doi: 10.1126/science.1125572.
- Save E., Nerad L., Poucet B. Contribution of multiple sensory information to place field stability in hippocampal place cells. Hippocampus. 2000;10(1):64–76. doi: 10.1002/(SICI)1098-1063(2000)10:1<64::AID-HIPO7>3.0.CO;2-Y.
- Shapiro M. L., Hetherington P. A. A simple network model simulates hippocampal place fields: Parametric analyses and physiological predictions. Behavioral Neuroscience. 1993;107:34–50. doi: 10.1037/0735-7044.107.1.34.
- Shapiro M. L., Tanila H., Eichenbaum H. Cues that hippocampal place cells encode: Dynamic and hierarchical representation of local and distal stimuli. Hippocampus. 1997;7:624–642. doi: 10.1002/(SICI)1098-1063(1997)7:6<624::AID-HIPO5>3.0.CO;2-E.
- Sharp P. E. Computer simulation of hippocampal place cells. Psychobiology. 1991;19(2):103–115.
- Sheynikhovich, D., Chavarriaga, R., Strösslin, T., & Gerstner, W. (2005). Spatial representation and navigation in a bio-inspired robot. In Biomimetic neural learning for intelligent robots: Intelligent systems, cognitive robotics, and neuroscience (pp. 245–264).
- Strösslin T., Sheynikhovich D., Chavarriaga R., Gerstner W. Robust self-localisation and navigation based on hippocampal place cells. Neural Networks. 2005;18(9):1125–1140. doi: 10.1016/j.neunet.2005.08.012.
- Sutton R., Barto A. Reinforcement learning: An introduction. Cambridge, MA: MIT Press; 1998.
- Takács B., Lőrincz A. Independent component analysis forms place cells in realistic robot simulations. Neurocomputing. 2006;69:1249–1252. doi: 10.1016/j.neucom.2005.12.086.
- Tanila H., Shapiro M. L., Eichenbaum H. Discordance of spatial representations in ensembles of hippocampal place cells. Hippocampus. 1997;7:613–623. doi: 10.1002/(SICI)1098-1063(1997)7:6<613::AID-HIPO4>3.0.CO;2-F.
- Taube J. S., Muller R. U., Ranck J. B. J. Head direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis. The Journal of Neuroscience. 1990;10:420–435. doi: 10.1523/JNEUROSCI.10-02-00420.1990.
- Taube J. S., Muller R. U., Ranck J. B. J. Head direction cells recorded from the postsubiculum in freely moving rats. II. Effects of environmental manipulations. The Journal of Neuroscience. 1990;10:436–447. doi: 10.1523/JNEUROSCI.10-02-00436.1990.
- Tomlinson W. T., Johnston T. D. Hamsters remember spatial information derived from olfactory cues. Animal Learning and Behavior. 1991;19:185–190.
- Touretzky D. S., Redish A. D. Theory of rodent navigation based on interacting representations of space. Hippocampus. 1996;6(3):247–270. doi: 10.1002/(SICI)1098-1063(1996)6:3<247::AID-HIPO4>3.0.CO;2-K.
- Wallace D. G., Gorny B., Whishaw I. Q. Rats can track odors, other rats, and themselves: Implications for the study of spatial behavior. Behavioural Brain Research. 2002;131(1–2):185–192. doi: 10.1016/S0166-4328(01)00384-9.
- Wallace D. G., Hines D. J., Whishaw I. Q. Quantification of a single exploratory trip reveals hippocampal formation mediated dead reckoning. Journal of Neuroscience Methods. 2002;113:131–145. doi: 10.1016/S0165-0270(01)00489-7.
- Wallace D. G., Kolb B., Whishaw I. Q. Odor tracking in rats with orbital frontal lesions. Behavioral Neuroscience. 2003;117(3):616–620. doi: 10.1037/0735-7044.117.3.616.
- Whishaw I. Q., Hines D. J., Wallace D. G. Dead reckoning (path integration) requires the hippocampal formation: Evidence from spontaneous exploration and spatial learning tasks in light (allothetic) and dark (idiothetic) tests. Behavioural Brain Research. 2001;127(1–2):49–69. doi: 10.1016/S0166-4328(01)00359-X.
- Wilson M. A., McNaughton B. L. Dynamics of the hippocampal ensemble code for space. Science. 1993;261(5124):1055–1058. doi: 10.1126/science.8351520.
- Yoganarasimha D., Knierim J. Coupling between place cells and head direction cells during relative translations and rotations of distal landmarks. Experimental Brain Research. 2005;160:344–359. doi: 10.1007/s00221-004-2016-9.