Author manuscript; available in PMC: 2021 Feb 19.
Published in final edited form as: Neuron. 2019 Dec 18;105(4):725–741.e8. doi: 10.1016/j.neuron.2019.11.022

Dorsal and ventral hippocampal sharp-wave ripples activate distinct nucleus accumbens networks

Marielena Sosa 1, Hannah R Joo 1, Loren M Frank 1,2,*
PMCID: PMC7035181  NIHMSID: NIHMS1545114  PMID: 31864947

Summary

Memories of positive experiences link places, events, and reward outcomes. These memories recruit interactions between the hippocampus and nucleus accumbens (NAc). Both dorsal and ventral hippocampus (dH and vH) project to the NAc, but it remains unknown whether dH and vH act in concert or separately to engage NAc representations related to space and reward. We recorded simultaneously from the dH, vH, and NAc of rats during an appetitive spatial task and focused on hippocampal sharp-wave ripples (SWRs) to identify times of memory reactivation across brain regions. Here we show that dH and vH awake SWRs occur asynchronously and activate distinct and opposing patterns of NAc spiking. Only NAc neurons activated during dH SWRs were tuned to task- and reward-related information. These temporally and anatomically separable hippocampal-NAc interactions point to distinct channels of mnemonic processing in the NAc, with the dH-NAc channel specialized for spatial task and reward information.

eTOC blurb:

Using simultaneous multi-region recordings in rats, Sosa et al. reveal distinct networks of NAc neurons engaged during dorsal versus ventral hippocampal memory processes. NAc neurons encoding spatial and reward information are activated only by the dorsal, but not ventral, hippocampus.

Introduction

Episodic memories integrate diverse aspects of experience, such as places, events, context, and reward. The hippocampus is critical for these memories and coordinates mnemonic processing across downstream brain regions (Buzsaki and Moser, 2013; Sosa et al., 2016; Eichenbaum, 2017). Within the hippocampus, different aspects of experience are thought to be preferentially processed in different subdivisions, with the dorsal hippocampus (dH) specialized for precise spatial representations and the ventral hippocampus (vH) specialized for contextual and emotional representations (Moser and Moser, 1998; Fanselow and Dong, 2010; Royer et al., 2010; Komorowski et al., 2013; Strange et al., 2014; Ciocchi et al., 2015; Jimenez et al., 2018).

Memories linking space and reward are thought to depend on hippocampal communication with the nucleus accumbens (NAc), a striatal region that represents reward and the value of chosen actions (Ito et al., 2008; Humphries and Prescott, 2010; Pennartz et al., 2011; Chersi and Burgess, 2015). The most prominent anatomical projections from the hippocampus to the NAc arise from the vH. Manipulations of this pathway can drive or block expression of spatial-reward memories (Britt et al., 2012; Riaz et al., 2017; LeGates et al., 2018), and vH neurons that project to the NAc are modulated at locations associated with reward (Ciocchi et al., 2015). These findings have suggested a role for the vH-NAc pathway in processing information related to locations and rewards. Interestingly, the dH also projects to the NAc, albeit much more sparsely (Brog et al., 1993; Li et al., 2018; Trouche et al., 2019), and optogenetic inhibition of projections from dorsal CA1 to the NAc impairs recall of a spatial-reward association (Trouche et al., 2019).

Thus, both dH and vH have been linked to spatial-reward memory, but these links are based primarily on manipulations of entire pathways or structures that drive neural activity patterns not seen in the intact system. Under normal conditions, coordinated firing between dH and NAc neurons is expressed during spatially-guided appetitive behaviors (van der Meer and Redish, 2011; Lansink et al., 2016; Sjulson et al., 2018; Trouche et al., 2019), but whether neural activity patterns in the vH and NAc are coordinated during behavior has not yet been explored. Moreover, it is unclear whether NAc representations of space and reward are linked to vH, dH, or both.

Addressing these issues requires identifying specific NAc neurons that are engaged during dH- or vH-specific information processing. Hippocampal sharp-wave ripples (SWRs) are discrete events that are well suited for this identification. These high-frequency (150-250 Hz) oscillations occur during sleep and awake immobility and coincide with the sequential reactivation of place cell ensembles (Foster and Wilson, 2006; Buzsaki, 2015; Joo and Frank, 2018). SWRs also engage extrahippocampal structures (Pennartz et al., 2004; Ji and Wilson, 2007; Gomperts et al., 2015; Jadhav et al., 2016; Girardeau et al., 2017; Rothschild et al., 2017; Yu et al., 2017), thus providing a mechanism for time-compressed memory retrieval across the brain.

SWRs can be detected in both dH and vH, and during sleep these events can occasionally propagate along the entire dorsoventral axis (Patel et al., 2013). However, the relationship between dH SWRs (dSWRs) and vH SWRs (vSWRs) during waking has not been explored. Previous work has also reported the activation of NAc neurons during dH SWRs in sleep and found that these neurons tended to fire near reward sites during task performance (Lansink et al., 2008; Lansink et al., 2009; Sjulson et al., 2018). Whether NAc neurons are engaged during awake dSWRs or during either awake or sleep vSWRs remains unknown. It is also unknown whether dSWRs and vSWRs engage similar or different NAc populations.

Here we report that dSWRs and vSWRs occur asynchronously in the awake state and engage largely distinct subpopulations of NAc neurons. Surprisingly, when individual neurons were modulated during both types of SWRs, they were most often activated during dSWRs and suppressed during vSWRs or vice-versa. We also found that only dSWR-activated NAc neurons encoded information related to reward history and progression along spatial paths to goals, indicating that dH and vH coordinate distinct neural representations in the NAc. These circuit dynamics could provide a substrate for the independent storage and retrieval of distinct aspects of experience.

Results

Temporal asynchrony of awake dH and vH SWRs during a spatial memory task

Identifying the nature of coordination between the dH-NAc and vH-NAc pathways requires a simultaneous survey of all three regions. We therefore recorded from dH, vH, and NAc using chronically implanted tetrode arrays in rats (Figures 1A and S1), in the context of both a dynamic spatial memory task and interleaved sleep periods in a separate rest box (Figure 1B). We utilized a “Multiple-W” task (Singer and Frank, 2009) in which a rat must first learn which three of six maze arms are rewarded and alternate between them to receive liquid food reward on each correct well visit (Figure 1C). Learning this alternation requires hippocampal-dependent memory (Kim and Frank, 2009) as well as the association of specific locations with reward, a process thought to involve the hippocampal-NAc circuit (Ito et al., 2008; Humphries and Prescott, 2010; Pennartz et al., 2011; Chersi and Burgess, 2015). Once the animal acquired the first alternation sequence to ~80% correct, we introduced a new sequence (Figure 1C), requiring the animal to transfer the alternation rule to a new set of reward locations. Sequences were switched across task epochs for the remainder of the experiment (Figure 1B) to promote adaptive, spatially-guided reward-seeking behavior.

Figure 1. Awake dH and vH SWRs occur asynchronously.

(A) Tetrodes targeting NAc, dH, and vH in the rat brain.

(B) Behavior on an example “Switch” day, with Multiple-W task epochs flanked by sleep epochs in a separate rest box. On “Acquisition” days, the same sequence was repeated in each task epoch.

(C) The Multiple-W task. Alternation from the center to the outer wells (yellow circles) of the “W” yielded liquid food reward on each correct well visit. Expanded section depicts 4 consecutive correct trials. After acquisition of one sequence (A or B), the second sequence was introduced on the first Switch day.

(D) Example dSWRs and vSWRs during immobility at a reward well (Rat 4). Top: raw (1-400 Hz) and ripple-filtered (150-250 Hz) local field potential for two tetrodes each in dorsal CA1 (dCA1) and ventral CA1 (vCA1). Shaded regions highlight detected dSWRs (pink) and vSWRs (blue). Bottom: speed of the animal.

(E) Cross-correlation histogram (CCH) between onset times of dSWRs and vSWRs across animals (mean ± s.e.m., n=5 rats). Top: normalized by the number of dSWRs, signifying the fraction of dSWRs with a co-occurring vSWR in each time bin. Magenta lines in zoomed-in inset depict the average shuffle distribution across animals (mean ± 95% confidence intervals), for illustration purposes only. Bottom: z-scored relative to shuffled vSWR onset times.

(F) CCH (mean ± s.e.m.) of awake SWRs between tetrode pairs in dCA1 (left, n=5 rats) or vCA1 (right, n=3 rats with >1 tetrode in vCA1). Top: normalized by the number of SWRs on one tetrode. Bottom: z-scored relative to shuffle.

See also Figures S1 and S2.

This task engages dSWRs, particularly during reward consumption (Singer and Frank, 2009), but the occurrence of vSWRs during awake behavior had not been previously described. We therefore examined awake SWRs detected in dH or vH during periods of immobility on the task, which occurred primarily at the reward wells (Figure 1D). Both dSWRs and vSWRs showed the expected spectral properties (Figure S2A-B) and increases in multiunit activity (Patel et al., 2013; Buzsaki, 2015) in CA1 as well as in CA3 (Figure S2C-D), indicating strong, transient activation of the local hippocampal networks. Also consistent with previous characterization in sleep (Patel et al., 2013), awake vSWRs occurred more frequently but were of smaller amplitude and shorter duration than dSWRs (Figure S2E-G).

Strikingly, despite the existence of dorsoventral connectivity within the hippocampus (Witter, 2007; van Strien et al., 2009) and observations of occasional synchrony between dSWRs and vSWRs during sleep (Patel et al., 2013), dSWRs and vSWRs occurred asynchronously during awake immobility on the task (Figure 1D-E). Only ~3.7% of dSWRs occurred within 50 ms of a vSWR, which was no more than expected by chance (Figure 1E), indicating temporally separable dH and vH outputs to downstream brain areas during awake SWRs. By contrast, there was prominent synchrony between pairs of recording sites within dH or within vH (Figure 1F).
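The coincidence measure and the count-normalized CCH described above can be illustrated with a minimal sketch. It assumes SWR onset times are available as lists of seconds; the function names, the 50 ms bin width, and the ±0.5 s lag range are illustrative assumptions rather than the analysis parameters used in the study, and the shuffle-based z-scoring is not shown.

```python
from bisect import bisect_left

def fraction_coincident(d_onsets, v_onsets, window=0.05):
    """Fraction of dSWR onsets with at least one vSWR onset within +/- window s."""
    v_sorted = sorted(v_onsets)
    hits = 0
    for t in d_onsets:
        # First vSWR onset at or after (t - window); a hit if it is also <= t + window.
        i = bisect_left(v_sorted, t - window)
        if i < len(v_sorted) and v_sorted[i] <= t + window:
            hits += 1
    return hits / len(d_onsets) if d_onsets else 0.0

def cross_correlogram(d_onsets, v_onsets, bin_width=0.05, max_lag=0.5):
    """vSWR onset counts at each lag from dSWR onsets, divided by the number of dSWRs."""
    n_bins = int(round(2 * max_lag / bin_width))
    counts = [0] * n_bins
    for t in d_onsets:
        for u in v_onsets:
            lag = u - t
            if -max_lag <= lag < max_lag:
                # Clamp the index to guard against floating-point edge cases.
                idx = min(int((lag + max_lag) // bin_width), n_bins - 1)
                counts[idx] += 1
    return [c / len(d_onsets) for c in counts]

# Hypothetical onset times (s), for illustration only.
d = [1.0, 2.0, 3.0, 4.0]
v = [1.02, 2.3, 10.0]
print(fraction_coincident(d, v))
print(cross_correlogram(d, v))
```

A chance level could then be estimated by recomputing the same quantities after shuffling the vSWR onset times, which is the comparison the z-scored CCH summarizes.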

Distinct modulation of dSWRs and vSWRs by novelty and reward

dSWRs and vSWRs were also differentially modulated by novelty and reward. We examined SWR occurrence on rewarded and error trials (Figure S3A) as a function of the animals’ behavioral performance (Figures 2A and S4), as improving performance parallels decreasing novelty and increasing familiarity with the task and environment. We first confirmed previous findings (Singer and Frank, 2009; Ambrose et al., 2016) that novel and rewarding experiences strongly enhance the rate of dSWRs (Figures 2B and S3B). Given the strong anatomical projections from the vH to limbic brain areas involved in reward processing (Fanselow and Dong, 2010; Strange et al., 2014), we expected a similar pattern of enhancement for vSWRs.

Figure 2. Awake dSWRs and vSWRs are differently modulated by reward and novelty.

(A) Example Multiple-W behavior (Rat 5) for acquisition and 2 full switch days (see Figure S4 for complete behavior), shown as the probability (mode ± 90% confidence interval) that the rat is making an accurate choice on each trial according to Sequence A (orange) or B (green). Colored bars above the plot indicate the rewarded sequence, grey vertical lines mark epoch boundaries, black triangles mark the start of each day, horizontal dotted line indicates chance performance (0.167).

(B and C) dSWR rate (B) and vSWR rate (C) on rewarded vs. error trials (well visits). Each point is the mean SWR rate across animals ± s.e.m. within learning stage (n=4 rats in Acquisition stages 0.7-0.8, >0.8; n=5 rats all other stages; see STAR Methods). Learning stages are defined by each animal’s probability of performing the rewarded sequence correctly and are used here as a proxy for novelty. SWR rate is shown for the second sequence only when it was rewarded. In (B), all stages except the first are p<0.05 between rewarded and error trials (Wilcoxon rank-sum test).

(D and E) Timing of dSWRs (D) and vSWRs (E) relative to nosepoke, by behavioral stage. Gold line indicates time of actual or expected reward delivery, 2 s after nosepoke at the well. Top: mean speed in each behavioral stage (repeated from D to E); note that the rat’s head takes ~1 s to fully decelerate. Below: mean SWR rate across animals in 200 ms bins for rewarded and error trials, during the first 100 trials of Multiple-W behavior (“Novel”), trials occurring at >0.6 probability correct on the first sequence (“Late Acquisition”), and during all Switch epochs (“All switch”). Empty bins indicate too little data to calculate an SWR rate (see STAR Methods).

See also Figures S3 and S4.

Instead, vSWRs maintained a similar rate on rewarded and error trials and were not enhanced during early novelty (Figures 2C and S3C). The onset time of dSWRs and vSWRs also differed relative to arrival at the reward wells. We utilized a 2-second delay between nosepoke and reward delivery to separate the time of immobility from the time of reward. While dSWRs shifted later following initial learning to begin after receipt of reward (Figure 2D), vSWRs were detected as soon as the animal stopped moving at all stages of learning (Figure 2E). Together, these results indicate that dSWRs and vSWRs are differently regulated by novelty and reward.

NAc subpopulations are oppositely modulated during dSWRs and vSWRs

The temporal separation and distinct modulation patterns of awake dSWRs and vSWRs provided the opportunity to determine whether these events differentially engaged the NAc. To sample the respective target regions of the sparse dH projection and the much more prominent vH projection, we recorded from both the NAc core and shell (Figure S1D). We classified NAc single units into putative medium-sized spiny neurons (MSNs) and fast-spiking interneurons (FSIs) based on firing rate and waveform properties (Figure S5A) and examined their activity aligned to the times of awake dSWRs and vSWRs.

We found that 51% of MSNs significantly changed their firing rates around the times of dSWRs and/or vSWRs. Strikingly, the observed firing rate changes were often opposite for dSWRs and vSWRs, such that 10.6% of cells were significantly dSWR-activated and vSWR-suppressed (D+V−) or dSWR-suppressed and vSWR-activated (D−V+) (Figure 3A-B). This bidirectional modulation indicates that SWRs from dH and vH have opposing influences on the same neurons. Crucially, the fraction of oppositely modulated cells was significantly larger than would be expected by total chance overlap of independent dSWR- (D+, D−) and vSWR-modulated (V+, V−) subgroups, while the fraction of co-positively modulated cells (3.2% D+V+) was not greater than chance. In addition, many cells were significantly modulated during only dSWRs or vSWRs but not both (Figure 3B). Across the full population of MSNs (Figures 3C and S6A), we found a significant anti-correlation in SWR modulation (Figure 3D), demonstrating that dSWRs and vSWRs are consistently associated with opposite activity changes at the level of individual NAc MSNs. We also noted that MSN activity changes predominantly followed dH or vH neuronal activation during SWRs, consistent with hippocampus to NAc information flow (Figure S6B).
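The chance-overlap comparison can be made concrete with a short sketch. If dSWR and vSWR modulation were independent, the expected fraction of oppositely modulated cells would be P(D+)·P(V−) + P(D−)·P(V+). The code below computes that expectation and a one-sample z-test for a proportion; the exact form of the z-test used in the study is not specified here, so this variant is an assumption, and the cell counts are hypothetical.

```python
import math

def expected_opposing_fraction(n_total, n_d_pos, n_d_neg, n_v_pos, n_v_neg):
    """Expected fraction of D+V- or D-V+ cells if D and V modulation are independent."""
    p = lambda k: k / n_total
    return p(n_d_pos) * p(n_v_neg) + p(n_d_neg) * p(n_v_pos)

def proportion_z_test(p_obs, p_exp, n):
    """One-sample z-test of an observed proportion against an expected one.

    Returns (z, one-sided p-value for p_obs > p_exp). Assumed test form.
    """
    se = math.sqrt(p_exp * (1 - p_exp) / n)
    z = (p_obs - p_exp) / se
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_one_sided

# Hypothetical counts, not the counts reported in Figure 3B.
exp_frac = expected_opposing_fraction(100, n_d_pos=20, n_d_neg=10, n_v_pos=10, n_v_neg=30)
z, p = proportion_z_test(p_obs=0.14, p_exp=exp_frac, n=100)
print(exp_frac, z, p)
```

An observed fraction well above the independence expectation, as in this toy case, is what distinguishes systematic opposing modulation from incidental overlap of two modulated subgroups.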

Figure 3. Opposite patterns of NAc modulation during awake dH vs. vH SWRs.

(A) Examples of NAc MSNs showing significant SWR modulation (p<0.01, shuffle test). Spike rasters and peri-event time histograms (PETHs) are aligned to the onset of dSWRs (left within cell, pink line) or vSWRs (right within cell, blue line). Horizontal lines separate task epochs in which each cell was isolated. Categories at the top indicate directions of significant modulation.

(B) Proportions of significantly SWR-modulated NAc MSNs. Top: fractions modulated during dSWRs only (D only), vSWRs only (V only), Both, or Neither dSWRs nor vSWRs, regardless of modulation direction (cell counts in white, out of 502 MSNs from 5 rats). Significantly more cells are modulated during Both than would be expected by chance overlap of dSWR- and vSWR-modulated cells (***p=5.44×10−4, z-test for proportions). Bottom: directional modulation of MSNs (cell counts next to each bar). The fractions of D+V− cells alone and total “opposing” cells (gradient bar, D+V− and D−V+) are higher than would be expected by chance (**p=0.0017 and ***p=6.12×10−4, respectively, z-tests for proportions).

(C) NAc MSN population shows opposing modulation during dSWRs vs. vSWRs. Left: dSWR-aligned z-scored PETHs for each MSN ordered by its modulation amplitude (mean z-scored firing rate in the 200 ms following SWR onset). Right: vSWR-aligned z-scored PETHs for the same ordered MSNs shown on the left.

(D) Anti-correlation (Pearson’s) between dSWR and vSWR modulation amplitudes of MSNs. Points represent single cells, dotted line represents a linear fit ± 95% confidence intervals.

(E) Examples of single NAc FSIs showing significant SWR modulation (p<0.01, shuffle test). Format and modulation categories as in (A), top row.

(F) Proportions of significantly SWR-modulated FSIs after exclusion of potential duplicates from recording the same cell across days, which could bias this small population of cells (see STAR Methods). Fractions are out of 13 FSIs from 5 rats. Similar to (B).

(G) NAc FSI population shows opposing modulation during dSWRs vs. vSWRs. Similar to (C).

(H) Anti-correlation between dSWR and vSWR modulation amplitudes of FSIs. Similar to (D).

See also Figures S5 and S6.

We confirmed that this opposite modulation could not be explained by the temporal proximity of dSWRs and vSWRs by excluding the small number of dSWRs and vSWRs that occurred within 250 ms of one another (Figure S6E-G). We also verified that our results held when we applied a conservative criterion (see STAR Methods) to ensure that each cell was included only once (Figure S6H-J), accounting for the possibility of recording the same neurons across days.

The same patterns of modulation were seen when we examined FSIs (Figure 3E), applying the same criterion to ensure that each cell was included only once. The majority of FSIs were SWR-modulated (85%), and we identified FSIs that were D+V−, D−V+, or only D+ or V+, but none that were D+V+ (Figure 3F). The FSI population as a whole also showed anti-correlated modulation during dSWRs versus vSWRs (Figures 3G-H and S6C-D).
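The modulation amplitude used to order and correlate cells (mean z-scored firing rate in the 200 ms following SWR onset) can be sketched as follows. The sketch assumes a PETH has already been computed for each cell, and it z-scores against a pre-SWR baseline window; that baseline choice, the function name, and the example values are assumptions made for illustration.

```python
from statistics import mean, pstdev

def modulation_amplitude(rates, bin_starts, baseline=(-0.5, 0.0), response=(0.0, 0.2)):
    """Mean z-scored PETH rate in the response window after SWR onset.

    rates: firing rate (Hz) per PETH bin.
    bin_starts: start time (s) of each bin relative to SWR onset.
    The baseline window used for z-scoring is an assumption of this sketch.
    """
    base = [r for r, t in zip(rates, bin_starts) if baseline[0] <= t < baseline[1]]
    mu, sd = mean(base), pstdev(base)
    if sd == 0:
        return 0.0  # flat baseline: treat the cell as unmodulated in this sketch
    resp = [(r - mu) / sd for r, t in zip(rates, bin_starts)
            if response[0] <= t < response[1]]
    return mean(resp)

# Hypothetical PETHs for one cell, 100 ms bins.
bin_starts = [-0.4, -0.3, -0.2, -0.1, 0.0, 0.1]
d_aligned = [1, 3, 1, 3, 6, 6]   # activated after dSWR onset
v_aligned = [1, 3, 1, 3, 0, 0]   # suppressed after vSWR onset
print(modulation_amplitude(d_aligned, bin_starts))
print(modulation_amplitude(v_aligned, bin_starts))
```

Computing one such amplitude per cell for dSWRs and one for vSWRs yields the paired values whose correlation across the population is summarized in Figure 3D and 3H.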

For both MSNs and FSIs, SWR-modulation was anatomically distributed in a pattern consistent with reported dH and vH projections to the NAc (Brog et al., 1993; Humphries and Prescott, 2010; Britt et al., 2012; Strange et al., 2014; Trouche et al., 2019), with V+ neurons present mostly in the medial shell and parts of the core and D+ neurons restricted to the core and lateral shell (Figure S5C-D). Together, these findings reveal that dSWRs and vSWRs engage largely distinct subpopulations of multiple cell types in the NAc, and when these populations overlap, their modulation is opposite for dSWRs versus vSWRs.

Distinct task firing patterns in MSNs activated during dSWRs versus vSWRs

Given previous observations implicating the vH-NAc pathway in spatial-reward associations (Britt et al., 2012; Ciocchi et al., 2015; LeGates et al., 2018), we expected that V+ NAc neurons would show patterns of spiking consistent with encoding information about spatial locations and their relationship to reward. Instead, we found that only D+ NAc neurons expressed reliable and robust representations related to spatial locations between reward sites.

To examine the dSWR- and vSWR-activated populations independently, we grouped together all cells that were D+ (only D+ or D+V−) and separately, all cells that were V+ (only V+ or D−V+), excluding the small number of D+V+ cells. We then examined the firing patterns of each population on the six rewarded trajectories across the two alternation sequences of the task (Figure 4A, top), as a function of both time (Figure 4A, bottom) and distance along each trajectory.

Figure 4. Selective encoding of task-related information in the dH-NAc network.

(A) The 6 rewarded task trajectories, defined by start and end reward well and by left (top) or right (middle) movement between wells. Bottom: Example trajectory split into time spent at the start well, from nosepoke to when the animal turns around (“well,” excluding spikes during SWRs), and time spent moving between wells, from turnaround to next nosepoke (“path”).

(B) Example D+ and V+ MSN firing patterns across trajectories, color coded according to (A). Cell numbers do not correspond to previous figures. Top: firing as a function of normalized trial time. Grey vertical line marks turnaround separating the well and path period. Bottom: firing as a function of linearized position (one-dimensional distance from start to end well) during movement. Note that well times fall into the first and last position bins of the path since there is no position change at the well. X-axis for each plot covers 220 cm. Each cell’s r2 across all possible pairs of trajectories is shown in upper right.

(C) Occupancy-normalized spatial firing for the example cells shown in (B), across the whole day. Color scale indicates maximum spatial firing rate in Hz.

(D) D+ MSN firing patterns for each trajectory, shown in cartoons. Firing rates are normalized to each cell’s maximum as calculated by normalized trial time (top) or linearized position (bottom). Cells are sorted by the bin of their peak firing on the third trajectory (light green) in normalized time. White line in top row marks turnaround; white dots in bottom row indicate junctions of the maze between vertical and horizontal segments, with position on the x-axis normalized. W = well, P = path. n = 85 cells sufficiently active on all 6 trajectories, predominantly in the Switch phase of the task.

(E) V+ MSN firing patterns, format as in (D) (n=22 cells sufficiently active across all 6 trajectories).

(F) Mean r2 across trajectories by SWR-modulation category, in normalized trial time. Circles are individual cells, boxes show interquartile range, horizontal lines mark the median, triangles mark the 95% confidence interval of the median, whiskers mark non-outlier extremes. N (n=226 cells) vs. D+ (n=154 cells): ***p=5.45×10−19; D+ vs. V+ (n=42 cells): ***p=6.33×10−9. All tests in F-L are Wilcoxon rank-sum tests with Bonferroni correction for multiple comparisons, setting significance level at p<0.017.

(G) Mean r2 across trajectories, in linearized position. N (n=220 cells) vs. D+ (n=152 cells): ***p=1.47×10−14; D+ vs. V+ (n=40 cells): ***p=7.19×10−9. Boxes as in (F).

(H) Mean r2 trial-by-trial, in normalized trial time. N (n=211 cells) vs. D+ (n=151 cells): ***p=2.35×10−12; D+ vs. V+ (n=40 cells): ***p=5.09×10−7.

(I) Mean r2 trial-by-trial, in linearized position. N (n=194 cells) vs. D+ (n=147 cells): ***p=4.70×10−5; D+ vs. V+ (n=38 cells): p=0.024.

(J) Mean firing rate on the path. N (n=226 cells) vs. D+ (n=154 cells): ***p=1.57×10−12; D+ vs. V+ (n=42 cells): ***p=5.70×10−6.

(K) Mean firing rate at the well outside of SWRs, same cells as in (J). N vs. D+: ***p=4.94×10−22; D+ vs. V+: ***p=3.69×10−5; V+ vs. N: p=0.024.

(L) Error (summed squared residual) from modeling D+ MSN spike trains by linear position (pos) or movement variables (speed, acceleration, angular speed, angular acceleration). Error is normalized to the position model within each cell for comparison. Crosses indicate outliers, notches indicate 95% confidence interval of the median. ***p<10−50, n=154 cells.

See also Figure S7.

We found that D+ MSNs tended to fire very similarly across distinct trajectories. We quantified firing similarity as a mean coefficient of determination (r2) across all trajectory pairs on which a given cell was active. D+ MSNs with high r2 values were “tuned” to the same relative point of progression through each trajectory in both time and distance (examples in Figure 4B, left), regardless of actual spatial location or egocentric movement direction (Figure S7A). This firing yielded a two-dimensional spatial rate map (Figure 4C, left) that resembles the path equivalence observed in dH place cells in geometrically repetitive environments (Singer et al., 2010), consistent with D+ cells receiving dH input. Across D+ cells, the preferred trajectory stage varied but was consistent across trajectories for a given cell, such that D+ population activity spanned the full extent of each trajectory. We also observed an abundance of D+ cells tuned to either the departure from reward wells (turnaround) or to the latter half of the path leading to the next reward well (Figure 4D), suggesting a preferential representation of trajectory initiation and final approach to reward. As a population, D+ MSNs showed significantly higher firing similarity across trajectories compared to both the V+ MSNs and unmodulated (N) MSNs (Figures 4F-G and S7E-F).
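As an illustration of the firing similarity metric, the sketch below computes a mean r2 across all pairs of trajectory firing curves for one cell. It assumes that r2 is the squared Pearson correlation between binned firing rate curves and that curves with no variance contribute zero; both are assumptions of this sketch, and the example curves are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def mean_pairwise_r2(curves):
    """Mean squared correlation over all pairs of trajectory firing curves."""
    r2 = [pearson_r(a, b) ** 2 for a, b in combinations(curves, 2)]
    return sum(r2) / len(r2)

# Hypothetical firing curves (Hz per normalized bin) on three trajectories.
curves = [
    [0, 1, 2, 1],
    [0, 2, 4, 2],   # same shape as the first, scaled
    [1, 1, 0, 2],   # different shape
]
print(mean_pairwise_r2(curves))
```

Because the correlation is insensitive to overall rate scaling, a cell that fires at the same relative stage of every trajectory scores near 1 even if its peak rate differs between trajectories.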

By contrast, many V+ MSNs showed low rate, sparse firing patterns that were largely uncorrelated across distinct trajectories (Figure 4B-C, right). Only a small minority of V+ cells displayed some reliability across trajectories, and these were most often tuned to departures from reward wells (Figure 4E). Importantly, the sparse firing of V+ cells did not mirror the broad spatial representations seen in the vH, where a single cell can be active across a large fraction of an environment (Kjelstrup et al., 2008; Royer et al., 2010; Komorowski et al., 2013; Keinath et al., 2014; Ciocchi et al., 2015). We computed each cell’s two-dimensional spatial coverage and found that V+ cells did not cover a larger total proportion of the environment than the D+ cells or N cells (Figures 4C and S7B). Instead, individual V+ cells showed diffuse spiking that could be preferential to the path or well (Figure S7C) or a specific direction (Figure S7A), but the timing and location of this firing along the trajectory was typically unreliable across trials, and correlations in trial-to-trial firing were lower than those of the D+ population (Figure 4H-I). This indicates that the V+ population lacks consistent encoding of either spatial information or trial progression. Moreover, we found no evidence for consistent relationships between V+ firing and task-relevant variables such as the rewarded alternation sequence (Figure S7D), accuracy of the upcoming trial, preference for a specific trajectory or maze segment, or the behavioral switch between sequences (data not shown).

Furthermore, the D+ population had much higher mean firing rates on both the path and well components, suggesting greater task engagement overall (Figures 4J-K and S7G-H). Importantly, these higher firing rates could not account for the observed differences in firing similarity across trajectories (Figure S7I-J). Thus, in the context of our task, D+ cells (and D+V+ cells, Figure S7L) are much more active and express clear task-related firing properties that are not evident in cells that are V+ (and not D+).

We confirmed that the preferential encoding seen in the D+ population for relative location on each trajectory was unlikely to be accounted for by tuning to stereotyped movement variables at those locations. We computed the residuals of firing rate and movement variables on individual trials relative to the mean across trials (see STAR Methods), which reveals whether fluctuations in firing rate correlate with fluctuations in kinematic variables such as running speed. We then excluded any NAc cells that showed significant correlations between firing rate and movement residuals, and found that the high similarity of D+ firing patterns across trajectories remained (shown for speed in Figure S7K). We also asked whether individual D+ firing patterns could be better explained by location or movement parameters. We explicitly modeled the spike trains of D+ MSNs as a function of relative linear position along each trajectory, running speed, acceleration, angular speed (of the animal’s head), and angular acceleration. Across the population, D+ MSN spiking was best predicted by relative linear position and not by movement covariates (Figure 4L).
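The residual analysis can be sketched as follows: subtract the across-trial mean from each trial's firing rate and speed profile, then correlate the pooled residuals. The sketch assumes both variables are binned identically on every trial; the pooling across bins, the function names, and the example data are assumptions made for illustration (the actual procedure is described in STAR Methods).

```python
import math

def residuals(trials):
    """Subtract the across-trial mean in each bin from every trial."""
    n_trials, n_bins = len(trials), len(trials[0])
    means = [sum(t[b] for t in trials) / n_trials for b in range(n_bins)]
    return [[t[b] - means[b] for b in range(n_bins)] for t in trials]

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def residual_correlation(rate_trials, speed_trials):
    """Correlate trial-to-trial fluctuations in firing rate with those in speed."""
    rate_res = [v for row in residuals(rate_trials) for v in row]
    speed_res = [v for row in residuals(speed_trials) for v in row]
    return pearson_r(rate_res, speed_res)

# Hypothetical data: 2 trials x 2 bins of firing rate (Hz) and speed (cm/s).
rate = [[1, 5], [3, 9]]
speed = [[10, 20], [14, 28]]
print(residual_correlation(rate, speed))
```

Removing the mean profile first means that a cell and the animal's speed can both peak at the same location without producing a correlation; only shared trial-to-trial fluctuations count as evidence of movement tuning.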

dSWR-activated MSNs uniquely encode reward history

We next investigated whether D+ or V+ cells would preferentially signal reward-related information. In particular, we aimed to test the longstanding hypothesis that vH would most strongly engage valence-related representations downstream (Fanselow and Dong, 2010; Strange et al., 2014; Ciocchi et al., 2015; Riaz et al., 2017), including reward information in the NAc.

Contrary to this hypothesis, we observed a strong and differential effect of reward history only on D+ MSNs. We computed a reward history preference index for each MSN from the difference in its mean firing rate curve on paths following a rewarded well visit versus an error visit. We found that the D+ population fired more on paths following reward than following error, demonstrating a clear reward history preference that was not seen in the V+ or N populations (Figures 5A-B and S7M-N). This preference persisted when we controlled for the effects of running speed on firing rate (Figure S7O) and for upcoming choice and reward expectation (Figure S7P). Overall, ~21% of D+ cells exhibited a significant firing rate increase on paths following reward compared to error. While many of these cells were tuned to the turnaround from the well (Cell 1, Figure 5A), this D+ subpopulation covered the full extent of a given path (Figure S7Q). This pattern implies that the reward history signal persists until the next path is complete.
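A reward history preference index and its permutation test can be sketched as below. The text states only that the index is derived from the difference in firing on paths following reward versus error; the normalized difference (R − E)/(R + E) of per-path mean rates used here, along with the function names and example values, is an assumption of this sketch and not necessarily the definition used in the study.

```python
import random

def preference_index(after_reward, after_error):
    """Normalized rate difference in [-1, 1]; positive means reward-preferring.

    Inputs are per-path mean firing rates (Hz). Assumed form of the index.
    """
    r = sum(after_reward) / len(after_reward)
    e = sum(after_error) / len(after_error)
    return (r - e) / (r + e) if (r + e) > 0 else 0.0

def permutation_p(after_reward, after_error, n_perm=2000, seed=0):
    """Two-sided permutation p-value from shuffling the reward/error labels."""
    rng = random.Random(seed)
    observed = abs(preference_index(after_reward, after_error))
    pooled = list(after_reward) + list(after_error)
    n_r = len(after_reward)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(preference_index(pooled[:n_r], pooled[n_r:])) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical per-path mean firing rates (Hz) for one cell.
after_reward = [10, 11, 12, 10, 11, 12, 10, 11]
after_error = [1, 2, 1, 2, 1, 2, 1, 2]
print(preference_index(after_reward, after_error))
print(permutation_p(after_reward, after_error))
```

Shuffling the outcome labels preserves each cell's overall rate distribution, so the test asks only whether the split by previous outcome matters.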

Figure 5. Selective encoding of past reward outcome on the path in the dH-NAc network.

(A) Example path firing patterns of D+ and V+ MSNs as a function of normalized time and split by reward history (outcome of the previous trial). Top: firing rate (mean ± s.e.m. across trials) on all paths following a reward (teal) vs. an error (brown). Cell numbers do not correspond to previous figures. Reward history preference: Cell 1: 0.52, Cell 2: 0.15, Cell 3: 0.018 (p=0.81), Cell 4: −0.26 (p=0.053). *p<0.05, **p<0.01 (permutation test). Bottom: faded lines indicate speed profiles of individual paths following reward and error, thick lines indicate mean speeds.

(B) Reward history preference by SWR-modulation category. Filled circles indicate significantly reward-preferring (>0) or error-preferring (<0) cells, open circles indicate non-significant cells. The D+ population (n=159 cells) is significantly shifted positive of zero (p=6.00×10−11, one-tailed signed-rank test). N (n=235 cells) vs. D+: ***p=1.29×10−4; D+ vs. V+ (n=45 cells): **p=0.003 (Wilcoxon rank-sum tests).

(C) Example well firing patterns of D+ and V+ MSNs as a function of normalized time on rewarded vs. error well visits (mean ± s.e.m. across trials). Format as in (A). Gold vertical line marks actual or expected reward delivery time. Reward vs. error index: Cell 5: 0.65, Cell 6: −0.69, Cell 7: 0.21, Cell 8: −0.48. *p<0.05, ***p<0.001 (permutation test within the time period flanked by dotted grey lines, when both rewarded and error mean speeds are <2 cm/s).

(D) Reward vs. error index during normalized well time, by SWR-modulation category. Similar to (B). N n=188 cells, D+ n=131 cells, V+ n=33 cells.

(E) Example well firing patterns of D+ and V+ MSNs as a function of time since nosepoke. Format as in (C). Reward vs. error index: Cell 9: 0.60 (p=0.093), Cell 10: −0.69, Cell 11: 0.77 (p=0.22), Cell 12: −0.48. *p<0.05, ***p<0.001 (permutation test in the 2-4 s window).

(F) Reward vs. error index during 2 s following reward delivery time, by SWR-modulation category (N n=196 cells, D+ n=147 cells, V+ n=37 cells). The D+ population is significantly shifted negative of zero by this metric (p=9.05×10^−7, one-tailed signed-rank test). D+ vs. N: ***p=1.85×10^−5 (Wilcoxon rank-sum test).

See also Figure S7.

At the same time, we were surprised to find an overall lack of enhanced firing during receipt of reward at the reward sites, given previous work suggesting reward-site-specificity for NAc neurons activated during dSWRs in sleep (Lansink et al., 2008; Lansink et al., 2009). While we found individual examples of MSNs that had higher firing rates during rewarded as opposed to unrewarded times at the wells in both the D+ and V+ populations, we also found cells that showed higher firing during errors (Figure 5C,E). Importantly, neither the D+ nor V+ populations were enriched for cells showing reward-specific firing at the wells (Figure 5D,F). When examined in absolute time, the D+ neurons tended to fire more when reward was not delivered on error trials (Figure 5F), likely because of their activity preceding turns away from the reward wells. These findings suggest that D+ neurons multiplex a signal of past reward outcome with their encoding of trajectory features rather than encoding reward receipt per se.

D+ and V+ MSNs comprise distinct neuronal networks

If the D+ and V+ physiological subtypes reflect distinct anatomical networks, we would expect them to show coordinated spiking activity within each population but not across populations. We therefore examined spike cross-correlations between NAc cells outside of SWRs at zero-lag. During movement on the task, pairs of D+ MSNs showed a stronger tendency to be coactive than D+/V+ pairs (Figure 6A,C), which more often showed negative cross-correlations (Figure 6D), indicating that D+ and V+ cells are typically active at different times. Strong co-firing was not seen for V+ MSN pairs as compared to D+/V+ pairs (Figure 6B-C), perhaps because of the overall low activity levels of V+ neurons in our task (Figure 4J-K) and because we had so few co-recorded V+ cells (10 pairs). Importantly, the enhanced positive correlations of D+/D+ pairs could not be explained by their higher firing rates (Figure S8A) or by firing rate correlations with running speed (Figure S8B). These findings suggest that the D+ MSN population constitutes a specific coordinated network distinct from the V+ MSN population.

Figure 6. Distinct coordination of spiking in the dH-NAc vs. vH-NAc networks.

(A) Spike cross-correlations during movement (mean at 0 ±10 ms, z-scored) between pairs of MSNs. Left: pairs of D+ MSNs (D+/D+, n=272 pairs) vs. pairs of D+ and V+ MSNs (D+/V+, n=146 pairs). The D+/D+ distribution is significantly shifted to the right of the D+/V+ distribution (***p=1.00×10^−6, Wilcoxon rank-sum test).

(B) Spike cross-correlations during movement between pairs of V+ MSNs (V+/V+, n=10 pairs) vs. D+/V+ pairs (distribution repeated from A). In (A) and (B), only z-scores up to 25 are shown for clarity.

(C and D) Fraction of cell pairs exhibiting positive (C) or negative (D) z-scored spike cross-correlations during movement. (C) D+/D+ vs. D+/V+: ***p=1.11×10^−9; D+/D+ vs. V+/V+: p=0.094. (D) D+/D+ vs. D+/V+: ***p=1.04×10^−6; D+/D+ vs. V+/V+: p=0.13 (z-tests for proportions).

(E) Histogram of peak theta spike phase for D+ MSNs with significant theta phase modulation (n=118). Scale indicates fraction of cells. Arrow direction indicates mean phase preference of all D+ MSNs (213.73 degrees; arrow length in arbitrary units).

(F) Phase-locking strength and peak spike phase (arrow direction) of individual D+ MSNs. Arrow lengths and scale on grid indicate the phase concentration parameter, kappa.

(G) Histogram of peak theta spike phase for V+ MSNs with significant theta phase modulation (n=27). Similar to (E). Mean V+ phase preference: 308.43 degrees. Mean phase preference of D+ and V+ cells is offset by 94.70 degrees (p=0.0123, permutation test).

(H) Phase-locking strength (arrow length in kappa) and peak spike phase (arrow direction) of individual V+ MSNs.

(I-K) Spike coactivity during movement (cross-correlation z-score at zero-lag) vs. coactivity z-score during awake SWRs. Lines represent linear fit with 95% confidence intervals for populations with a significant rho. (I) D+/D+ MSN pairs (same as in A), with SWR coactivity calculated during dSWRs (Spearman’s rho=0.39, p=2.67×10^−11). (J) Pairs of D+ MSNs and dH (dCA1) pyramidal cells, with SWR coactivity calculated during dSWRs (n=988 pairs, rho=0.11, p=3.72×10^−4). (K) V+/V+ MSNs (same as in B), with SWR coactivity calculated during vSWRs (rho=−0.31, p=0.39).

See also Figure S8.

We next examined whether the uncorrelated firing of D+ and V+ MSNs during movement may be partly explained by their spike organization on different phases of the hippocampal theta rhythm. Theta oscillations (~5-11 Hz) travel as a wave across the dorsoventral axis of the hippocampus (Lubenov and Siapas, 2009), such that theta is offset by ~180 degrees at the dorsal and ventral poles (Patel et al., 2012). This observation predicts that NAc subpopulations which are activated during dSWRs or vSWRs would likewise be distinctly activated in relation to dH or vH theta. We first verified that dH and vH theta-locked multiunit activity peaks were offset by ~156 degrees with variability according to anatomical location (Figure S8C-D), consistent with previous work (Patel et al., 2012). Using dH theta phase as the reference for NAc spiking, we then determined the spike phase preference of significantly theta-modulated D+ and V+ MSNs. We found a significant difference (~95 degrees) between the mean phase preferences of the D+ and V+ populations (Figures 6E-H and S8E-F), suggesting that these networks are coordinated asynchronously during movement. We note, however, that because NAc cells are known to phase precess across theta cycles (van der Meer and Redish, 2011), there is likely to be substantial variability across theta phase at the level of individual cells.
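The comparison of mean phase preferences above depends on circular rather than linear arithmetic. The following is a minimal Python sketch of that arithmetic, not the Matlab code used in this study; the function names `circular_mean_deg` and `circular_diff_deg` are hypothetical and chosen only for illustration.

```python
import math


def circular_mean_deg(phases_deg):
    """Circular mean of a list of phases (degrees), returned in [0, 360)."""
    # Average the unit vectors rather than the raw angles, so that
    # 350 and 10 degrees average to ~0 degrees instead of 180.
    s = sum(math.sin(math.radians(p)) for p in phases_deg)
    c = sum(math.cos(math.radians(p)) for p in phases_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def circular_diff_deg(a_deg, b_deg):
    """Signed circular difference a - b, wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


# Mean phase preferences reported in Figure 6 (V+ and D+ populations).
offset = circular_diff_deg(308.43, 213.73)
```

Applied to the population means reported in Figure 6E and 6G, the wrapped difference reproduces the stated offset of 94.70 degrees.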

Co-firing during behavior is thought to drive plasticity which is then expressed during subsequent SWRs (Buzsaki, 2015). Consistent with this, for both pairs of D+ MSNs (Figure 6I) and for pairs of D+ MSNs and dH pyramidal cells (Figure S5B; Figure 6J), co-firing during movement predicted pairwise coactivity during SWRs. These coactivity patterns are consistent with coordinated reactivation across brain regions during dSWRs and mirror SWR reactivation patterns in hippocampus (O'Neill et al., 2008; Karlsson and Frank, 2009) and across hippocampus and prefrontal cortex (Jadhav et al., 2016; Tang et al., 2017). We had too few V+ pairs and too few single units in vH to quantify inter-regional reactivation. Nevertheless, all but one V+/V+ pair showed a positive coactivity z-score during vSWRs (Figure 6K), suggesting the presence of co-reactivation of V+ MSNs.

Patterns of SWR-modulation and network activity are maintained during sleep

Finally, we asked whether D+ and V+ neurons constitute separate networks across both waking task performance and sleep. During sleep, we observed greater synchrony between dSWRs and vSWRs than during wake (example in Figure 7A). While synchronous SWRs occurred more often than expected from a shuffle of SWR times, they still comprised a small minority of events, with only ~6.7% of dSWRs occurring within 50 ms of a vSWR (Figure 7B). This degree of synchrony was substantially smaller than the synchrony observed within dH or within vH (Figure 7C), consistent with previous results (Patel et al., 2013). Nevertheless, the contrast between modest synchrony in sleep and strong asynchrony in wake was apparent in each animal (Figure 7D), despite small anatomical differences in recording sites (Figure S1C,E-F).
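The synchrony measure quoted above (the fraction of dSWRs occurring within 50 ms of a vSWR) can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the study's Matlab analysis: the function name `fraction_within` is hypothetical, event times are taken to be detection times in seconds, and the window is treated as symmetric (±50 ms).

```python
from bisect import bisect_left


def fraction_within(ref_times, other_times, window_s=0.05):
    """Fraction of reference events with at least one 'other' event
    inside +/- window_s seconds (assumed symmetric window)."""
    if not ref_times:
        return 0.0
    other = sorted(other_times)
    n_hits = 0
    for t in ref_times:
        # First 'other' event at or after the start of the window.
        i = bisect_left(other, t - window_s)
        if i < len(other) and other[i] <= t + window_s:
            n_hits += 1
    return n_hits / len(ref_times)


# Toy example: only the first dSWR has a vSWR within 50 ms.
toy_dswr = [1.0, 2.0, 3.0, 4.0]
toy_vswr = [1.02, 3.5]
toy_fraction = fraction_within(toy_dswr, toy_vswr)
```

A shuffle control of the kind described in the text would repeat the same computation on jittered or permuted event times; that step is not shown here.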

Figure 7. Increased synchrony between dSWRs and vSWRs in sleep.

(A) Example of dCA1 and vCA1 SWRs during sleep (Rat 4). Shaded regions highlight detected dSWRs (pink) and vSWRs (blue).

(B) CCH between sleep vSWRs vs. dSWRs across animals (mean ± s.e.m., n=5 rats). Top: normalized by the number of dSWRs; magenta lines in zoomed-in inset depict the average shuffle distribution across animals (mean ± 95% confidence intervals), for illustration purposes only. Bottom: z-scored relative to shuffled vSWR onset times.

(C) Mean CCH (± s.e.m.) for sleep SWRs between tetrodes within dCA1 (top, n=5 rats) or vCA1 (bottom, n=3 rats with >1 tetrode in vCA1), normalized by the number of SWRs on one tetrode.

(D) Z-scored CCH (relative to shuffle) of vSWRs vs. dSWRs in wake (top) vs. sleep (bottom) in each animal. Error bars indicate s.e.m. across days (n=15-19 days).

When we examined only isolated dSWRs or vSWRs (separated by >250 ms), we found that 30% of MSNs were significantly modulated during either dSWRs, vSWRs, or both, and that this modulation was predominantly positive. Notably, although the proportion of single MSNs showing opposing modulation during dSWRs versus vSWRs (1.6%) was smaller than in wake, it was again greater than chance (Figure 8A). Furthermore, the population-level anti-correlation of MSN activity during dSWRs versus vSWRs remained apparent (Figure 8B-C). A majority of FSIs (~62%) were again modulated during either dSWRs or vSWRs (Figure 8D). While this anti-correlation was no longer significant for FSIs (Figure 8E-F), the absence of a relationship between dSWR and vSWR modulation suggests that dSWR- and vSWR-engaged FSIs are also activated separately in sleep. Together, these findings suggest that D+ and V+ neurons comprise largely distinct NAc networks, coordinated in opposition during SWRs from dH and vH across behavioral states.

Figure 8. Hippocampal-NAc network patterns are maintained during sleep.

(A) Proportions of NAc MSNs showing significant modulation during asynchronous dSWRs and vSWRs in sleep, similar to Figure 3B. Top: fractions of modulated MSNs regardless of modulation direction (out of 1241 MSNs from 5 rats). Significantly more cells are modulated during Both than would be expected by chance (***p=1.50×10^−4, z-test for proportions). Bottom: directional modulation of NAc MSNs. The fraction of D−V+ cells alone and total “opposing” cells (gradient bar, D+V− and D−V+) are higher than would be expected by chance (*p=0.012 and *p=0.037, respectively, z-tests for proportions).

(B) NAc MSN population shows opposing modulation during asynchronous dSWRs and vSWRs in sleep. Similar to Figure 3C.

(C) Anti-correlation between dSWR and vSWR modulation amplitudes of MSNs. Similar to Figure 3D.

(D) Similar to (A), but for fractions of FSIs showing significant modulation during asynchronous SWRs in sleep, after removal of potential duplicate cells. Fractions are out of 13 FSIs from 5 rats.

(E) Similar to (B) but for FSI population in sleep.

(F) Similar to (C) but for FSI population in sleep.

(G) Pearson’s correlation of SWR-modulation direction and amplitude of NAc MSNs from awake immobility on the task to sleep, for dSWRs (left) and vSWRs (right). Points represent single cells active in both wake and sleep (n=368 cells), lines represent linear fits ± 95% confidence intervals.

(H) Spearman’s correlation of spike coactivity z-score during awake dSWRs vs. during asynchronous sleep dSWRs, for pairs of NAc MSNs (SWR modulation category defined during wake) and dCA1 pyramidal cells. Lines represent linear fits ± 95% confidence intervals. Left: D+/dH (n=565 pairs). Center: V+/dH (n=120 pairs). Right: N/dH (n=690 pairs).

We hypothesized that if SWR modulation reflects network-level connectivity between dH, vH, and NAc subpopulations, then on average, individual NAc neurons should respond similarly during SWRs in wake and sleep. Indeed, for both dSWRs and vSWRs, the modulation amplitude and direction of NAc MSNs was positively correlated from wake to sleep (Figure 8G). Moreover, D+ MSNs showed strong co-firing outside of SWRs in sleep, with significantly more positively correlated pairs as compared to D+/V+ pairs (Figure S8G-H). Finally, we asked whether patterns of inter-regional reactivation between pairs of MSNs and dH pyramidal cells during dSWRs were preserved from wake to sleep. We found that the coactivity of D+/dH pairs during awake dSWRs predicted the strength of their coactivity during sleep dSWRs, albeit weakly. This relationship was not observed for either V+/dH or N/dH cell pairs (Figure 8H). These results suggest that functionally connected hippocampal and D+ MSNs preferentially reactivate together in both wake and sleep, and that previously reported NAc cells reactivated during dorsal hippocampal replay in sleep (Lansink et al., 2009) are likely to be D+ cells in wake.

Discussion

Our findings demonstrate that dSWRs and vSWRs occur asynchronously during waking and activate distinct subpopulations in the NAc during both wake and sleep. Contrary to our initial hypotheses, vSWRs are not modulated by reward, and V+ (vSWR-activated) MSNs show no consistent tuning to spatial locations, progression through a trial, or reward history. By contrast, dSWRs are reward modulated, and D+ (dSWR-activated) MSNs show strong encoding of information related to both the spatial progression through a trial and past reward. These findings establish that SWR-related communication in the dH-NAc and vH-NAc pathways occurs at separate moments in time and engages NAc networks with distinct representations.

An absence of spatial and reward representations in the vH-NAc network

Our expectation of reward-related activity in the vH-NAc network was based on three main types of prior evidence. First, the vH has long been associated with the valence components of episodic memory (Moser and Moser, 1998; Fanselow and Dong, 2010; Strange et al., 2014) and projects heavily to the NAc, which is associated with reward and reward prediction (Pennartz et al., 1994; Humphries and Prescott, 2010). Second, manipulation of the vH-NAc circuit can alter spatial reward-seeking behavior (Floresco et al., 1997; Britt et al., 2012; Riaz et al., 2017; LeGates et al., 2018). Third, as some NAc-projecting vH neurons show modulation at reward sites (Ciocchi et al., 2015), we expected a similar response in vH-associated NAc neurons. Instead, we found that in the Multiple-W task, V+ NAc neurons were much less active than D+ neurons and lacked the task-relevant representations seen in D+ neurons of path progression and reward history.

What could explain the discrepancy between these physiological findings and results of previous studies? First, in addition to the non-physiological nature of most manipulations, it is often assumed that manipulations of the targeted pathway do not affect other pathways. Our finding that the vH-NAc and dH-NAc pathways can act in opposition during SWRs suggests that stimulating or inactivating one pathway would influence activity in the other, making it difficult to assign a unique function to one pathway. Second, optogenetic activation of multiple glutamatergic inputs to the NAc can be positively reinforcing (Britt et al., 2012), and thus activation of the vH-NAc pathway does not necessarily drive behavior specific to vH inputs to the NAc. Additionally, optogenetic inhibition of the vH-NAc pathway impairs recall of a social reward rather than a food reward (LeGates et al., 2018), raising the possibility that different reward types could differentially recruit vH-NAc sub-circuits. Finally, while reward-site-related activity has been observed in a subset of ventral CA1 neurons (Ciocchi et al., 2015), there are no prior reports, to our knowledge, of NAc cells that respond to the vH and encode reward or position. Ventral CA1 pyramidal neurons have been shown to arborize to multiple downstream regions (Dougherty et al., 2012; Ciocchi et al., 2015), such that vH firing patterns will not necessarily be recapitulated in their NAc targets.

Our findings are, however, consistent with a recent report that vH is suppressed during effortful goal-directed behavior (Yoshida et al., 2019). The observed low activity levels of V+ MSNs suggest that they receive minimal excitation from vH during our goal-directed task. Suppression of the vH would be highest on the paths of our task, where the spatial and reward-related signals in D+ MSNs were observed, and thus could explain the absence of these signals in V+ neurons.

We note, however, that our findings do not preclude the involvement of the vH-NAc network in other tasks, such as those that rely on discrimination between environments defined by contextual cues (Komorowski et al., 2013; Riaz et al., 2017). A growing body of work examining vH and its projections also suggests a specialization for aversive experiences and anxiety (Bannerman et al., 2004; Adhikari et al., 2011; Kheirbek et al., 2013; Ciocchi et al., 2015; Padilla-Coreano et al., 2016; Jimenez et al., 2018). The vH-NAc network could perhaps be specialized for variables present but not immediately relevant to our task, such as associations between overall context and emotional state. Finally, as the NAc is remarkably heterogeneous (Carelli, 2002; Castro and Bruchas, 2019), the vH may engage other representation types in subregions of the NAc not sampled here.

Spatial-reward memory in the dH-NAc network

The spatial path and reward history representations of D+ NAc neurons are consistent with prior studies suggesting that dH-NAc communication links spatial paths to reward (Berke et al., 2004; Lansink et al., 2009; van der Meer and Redish, 2011; Sjulson et al., 2018). Recent work demonstrated that direct dH recruitment of NAc ensembles is indeed necessary for spatial-reward memory in a conditioned place preference assay (Trouche et al., 2019). Our findings complement these results in several important ways.

First, we demonstrated that coordinated reactivation between dH pyramidal cells and NAc cells is present during awake dSWRs and recruits a specific NAc network. This reactivation may contribute to the active storage of associations during the experience as well as to the retrieval of associations for decision-making processes (Joo and Frank, 2018). We also found that NAc MSNs can be inhibited during SWRs. This inhibition is likely mediated by lateral connections with other MSNs or by local FSIs, consistent with our observation that FSIs are SWR-modulated and previous observations of FSI activation from hippocampal inputs (Trouche et al., 2019).

Second, we found that individual D+ neurons are active at “path equivalent” (Frank et al., 2000; Singer et al., 2010) locations on multiple trajectories, and can thus be understood as encoding progression along spatial paths between reward sites. We propose that D+ MSN firing patterns correspond to goal-directed actions that the animal learns to repeat at specific locations on each spatial path. For instance, the turnaround from the reward well is the first in a set of goal-directed actions to approach the next well and occurs at the same relative location on each path. Such patterns that generalize across task elements rooted in space have been reported throughout the striatum (Lavoie and Mizumori, 1994; Mulder et al., 2004; Berke et al., 2009; van der Meer et al., 2010; Lansink et al., 2012), but they have not been previously linked to dSWR activation. Moreover, these D+ MSN firing patterns mirror those seen in the subset of medial prefrontal cortical neurons activated during dSWRs (Yu et al., 2018). Our findings are therefore consistent with a role for dSWRs in binding discrete spatial sequences to generalized goal-directed action sequences across the brain.

Furthermore, we found that NAc representations activated during awake dSWRs are not restricted to reward sites. Based on previous work (Lansink et al., 2008; Lansink et al., 2009), we expected that NAc cells encoding receipt of reward or the reward location itself would preferentially activate during awake dSWRs. Instead, we found that D+ neurons do not reliably encode the delivery, consumption, or location of reward, but are modulated by past receipt of reward. Importantly, our task included a 2-second delay between the animal’s arrival at a reward site and reward delivery, allowing us to separate location and reward-delivery signals. In addition, as dSWRs are modulated by reward themselves and activate NAc neurons, some fraction of reward-specific spikes previously reported likely occurred during awake dSWRs (or vSWRs).

Storage and retrieval of different aspects of experience

We propose that the opposition between the dH-NAc and vH-NAc networks during SWRs is well suited to support the processing of different aspects of experience at different times. Given the evidence for functional divergence across dH and vH, we speculate that awake dSWRs and vSWRs send information about largely distinct features of experience to separate downstream circuits for memory storage, facilitating the flexible retrieval of specific information in the future. The temporal asynchrony of dSWRs and vSWRs could thus keep those features separate as they are stored and/or retrieved during pauses in behavior, as only some of that information may be relevant to the task at hand. The mechanism of this separation remains to be studied; as we did not observe evidence of direct reciprocal inhibition between D+ and V+ NAc neurons, their opposite SWR activation may be regulated by processes upstream in the hippocampus or cortex. We also found that, on average, D+ and V+ NAc MSNs are maximally activated at distinct phases of hippocampal theta. Separation of dH and vH outputs to downstream structures may therefore persist during movement. Conversely, during sleep, the greater synchrony of dSWRs and vSWRs may reflect the consolidation of a more complete memory.

STAR Methods

LEAD CONTACT AND MATERIALS AVAILABILITY

Further information and requests for resources and materials should be directed to and will be fulfilled by the Lead Contact, Loren Frank (loren@phy.ucsf.edu).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Animals

All procedures were in accordance with guidelines from the University of California San Francisco Institutional Animal Care and Use Committee and US National Institutes of Health. Male Long-Evans rats (Charles River, RRID:RGD_2308852) were pair-housed with a 12-hour light/dark cycle (lights on 6 am – 6 pm) and had ad libitum access to food (standard rat chow) until the beginning of food restriction, when they were single-housed. For this study, we used five male rats (500-650 g, 5-8 months old).

METHOD DETAILS

Implants and behavior

Animals were food restricted to 85% of their baseline weight and pre-trained to run back and forth on a 1 m long linear track for liquid reward (evaporated milk plus 5% sucrose), delivered automatically from reward wells at the ends of the track. Animals were incrementally introduced to a delay between well entry (nosepoke) and reward delivery of up to 2 seconds. After animals learned to alternate consistently for at least ~30 well visits per 5 min (4-6 days), they were switched back to an ad libitum diet and then surgically implanted with microdrive arrays.

Each microdrive housed a maximum of 28 independently movable tetrodes in a custom 3D-printed drive body (PolyJetHD Blue, Stratasys Ltd.) cemented to 3 stainless steel cannulae at fixed relative positions, targeting NAc vertically (8-12 tetrodes) and dH (6-7 tetrodes) and vH (9-13 tetrodes) at a 12° angle from vertical (tilted mediolaterally). NAc and dH tetrodes were made of 12.7 μm-diameter nichrome (Sandvik), while vH tetrodes were made of 12.7 μm nichrome, 12.7 μm tungsten, or 20 μm tungsten (California Fine Wire). Tetrode ends were plated with gold to a final impedance of ~240-350 kOhms. The microdrive was stereotaxically implanted over the right hemisphere such that the center of each cannula was targeted to the following coordinates relative to the animal’s Bregma: dH: AP −3.9-4.0 mm, ML +1.7 mm; vH: AP −5.6-5.7 mm, ML +4.0 mm; NAc: AP +1.3-1.4 mm, ML +1.3 mm (Rat 1 vH: AP −5.75, ML +4.1, oval-shaped cannula). The approximate AP/ML spread of tetrodes in each area was defined by the inner radius of each cannula as follows: dH: ±0.49 mm, vH: ±0.87 mm, NAc: ±0.60 mm. A ground screw was inserted in the skull above the right cerebellum as a global reference.

While animals recovered from surgery, tetrodes were manually adjusted over ~2-3 weeks to their target depths relative to brain surface (dCA1: ~2.2-3.3 mm, 12° angle; vCA1: ~7.0-8.3 mm, 12° angle; NAc: ~5.4-7.5 mm, 0° angle), using electrophysiological landmarks such as unit density and SWR amplitude. Each rat was then food restricted again and re-trained on the linear track for 4-6 days with neural recording (not analyzed in this study). Animals were then introduced to the Multiple-W task (Figure 1B-C), a version of which has been described previously (Singer and Frank, 2009). Tetrodes were sometimes advanced a small amount after the conclusion of the day’s recording on a case-by-case basis to improve cell yield.

The Multiple-W track consisted of six 76 cm arms spaced ~36 cm apart at their midpoints, with 3 cm high walls, connected to a “back” which extended past the first and sixth arm by 14 cm on each side (to mimic the availability of a right and left turn from these arms), and elevated 76 cm off the ground. On each day, the animal experienced three 20 min “run” (task) epochs on the track flanked by four 20-45 min sleep epochs in a separate high-walled rest box; only in rare cases (2 epochs each for Rats 1-2, 1 epoch each for Rats 3-4) were there four run epochs. The track was separated from the experimenter by an opaque black curtain, and the white walls of the room were marked with black distal cues of various shapes. Each arm contained a visually identical reward well connected to milk tubing, and milk was run through each well at the beginning of the day to create similar olfactory cues in all wells.

In each run epoch, the animal was placed at the back of the center arm of the rewarded sequence and was required by trial-and-error to find the 3 rewarded wells and figure out the alternation sequence between them, Sequence A (SA) or Sequence B (SB). Trials are defined as well visits. A visit to the center well of the sequence (well 3 in SA, well 4 in SB) was rewarded if the animal came from any other well. If a center visit was the first of the epoch or followed an error to a non-sequence arm, the animal could initiate an “outer” well visit to either of the center-adjacent wells to get reward. If a center visit followed a visit to a center-adjacent well, the animal had to then visit the opposite center-adjacent well, requiring hippocampal-dependent memory of the previous trial (Kim and Frank, 2009). For example, a correct series of trials for SA would be 3-2-3-4-3-2; for SB, 2-1-2-3-2-1. Consecutive visits to the same well were counted as errors, such that chance performance was defined as 1 out of 6 (0.167). The nosepoke at each well was detected by an infrared beam break, which automatically triggered liquid reward delivery (105 μL evaporated milk plus 5% sucrose) via a syringe pump (New Era Pump Systems, Inc.) after a 2 second delay, and the animal’s departure from the well was self-paced.
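The reward contingencies described above can be summarized in a short Python sketch. It is a hypothetical illustration, not the task-control software: the function name `score_visits` is invented, and two cases the text does not spell out are resolved by assumption (an outer-well visit is scored as correct only when it directly follows a center visit, and the first center visit of an epoch is scored as correct).

```python
def score_visits(visits, center):
    """Score a series of well visits for one sequence.

    visits: list of well numbers (1-6) in the order visited.
    center: center well of the rewarded sequence (3 for SA, 2 for SB
            in the examples given in the text).
    Returns a list of booleans (True = rewarded/correct).
    """
    outers = (center - 1, center + 1)
    results = []
    prev = None       # previously visited well
    required = None   # outer well that must follow the last center visit
    for well in visits:
        if well == prev:
            ok = False  # consecutive visits to the same well are errors
        elif well == center:
            ok = True   # center is rewarded when coming from any other well
            if prev in outers:
                # Must alternate: next outer visit is the opposite one.
                required = outers[1] if prev == outers[0] else outers[0]
            else:
                # First visit, or return from a non-sequence arm:
                # either outer well may be chosen next.
                required = None
        elif well in outers:
            # ASSUMPTION: outer visits count only directly after center.
            ok = prev == center and (required is None or well == required)
        else:
            ok = False  # non-sequence arm
        results.append(ok)
        prev = well
    return results


example_sa = score_visits([3, 2, 3, 4, 3, 2], center=3)
```

Under these assumptions, both example series from the text (3-2-3-4-3-2 with center 3, and 2-1-2-3-2-1 with center 2) are scored as fully correct, whereas returning to the same outer well after a center visit is scored as an error.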

During the “Acquisition” phase (5-9 days), the same sequence was rewarded on every epoch: 3 animals acquired SA and 2 animals acquired SB. When the animal achieved greater than 80% correct performance on the Acquisition sequence for at least 1 epoch (assessed in real time as an epoch average), the novel sequence was introduced in the second epoch of the first “Switch” day. Only Rat 1 failed to reach 80% correct but was advanced to the Switch phase after achieving >75% correct and one full week of training; this rat was thus excluded from the 70-80% and >80% Acquisition performance bins in Figure 2B-C. In the Switch phase, the rewarded sequence was switched on each run epoch, such that the starting sequence of each Switch day was also alternated (8-10 days).

Data collection and processing

Spiking, local field potential (LFP), position video, and reward well digital inputs and outputs were collected using the NSpike data acquisition system (L.M.F. and J. MacArthur, Harvard Instrumentation Design Laboratory). For Rats 1-3, LFP data were collected at 1500 Hz sampling rate and digitally filtered online at 1-400 Hz (2-pole Bessel for high- and low-pass). Spikes were sampled at 30 kHz and saved as snippets of each waveform, filtered at 600-6000 Hz for hippocampus and 600-6000 Hz (2 rats) or 300-6000 Hz (1 rat) for NAc. For Rats 4-5, LFP and spikes were collected continuously at 30 kHz and filtered online at 1-6000 Hz, with post-hoc filtering applied in Matlab to extract LFP and spike waveforms using the same parameters as above (300-6000 Hz for NAc spikes). Subsets of spike data were collected as snippets in these animals to verify our post-hoc filtering. Note that negative voltages are displayed upward (e.g. Figure 1D). All LFP and spikes were collected relative to local references lacking spiking activity, which were themselves referenced to cerebellar ground: for dH tetrodes, the reference tetrode was located in corpus callosum (4 rats) or deep cortex with no units (1 rat); for vH, in ventral corpus callosum (1 rat) or in white matter at the ventrolateral edge of the midbrain (internal capsule or optic tract; 4 rats); for NAc, typically in corpus callosum, the lateral ventricle, or anterior commissure. Overhead video of the track, collected at 30 frames/s, allowed us to track the animal’s position via an array of infrared diodes attached to the top of the headstage, a few cm above the rat’s head.

Spike sorting was performed using a combination of manual clustering in Matclust (M. Karlsson; Rats 1-3) and automated sorting with manual curation in Mountainsort (Chung et al., 2017) (Flatiron Institute; all data for Rats 4-5, individual days for Rats 1-3). Cells were clustered within epochs but tracked across all run and sleep epochs for which they could be isolated; with Mountainsort, this was done with a drift-tracking extension of the core pipeline and manual merging as needed (Chung et al., 2019). In Matclust, clustering was performed in amplitude and principal component space, and only well-separated units with clear refractory periods in their ISI distributions were accepted. In Mountainsort, we generally accepted clusters with isolation score >0.96, noise overlap <0.03 (median isolation score ~0.995, median noise overlap ~0.002), and clear separation from other clusters in amplitude and principal component space. The similarity of cluster quality between Mountainsort and Matclust was verified manually on a subset of the data and has been extensively verified in previous work (Chung et al., 2017). The same pattern of SWR-modulation of NAc cells was observed within each animal (data not shown), indicating that our results were not due to unit clustering in certain animals.

Histology

At the conclusion of the experiment, animals were anesthetized with isoflurane and small electrolytic lesions were made at the end of each tetrode to mark recording locations (30 μA of positive current for 3 seconds, on 2-4 channels of the tetrode). The animal recovered for 24 hours to allow gliosis and was then euthanized with pentobarbital and perfused transcardially with PBS followed by 4% paraformaldehyde in 0.1M PB. The brain was post-fixed in 4% paraformaldehyde, 0.1M PB in situ for at least 24 hours, followed by removal of the tetrodes and cryoprotection in 30% sucrose in PBS. Brains were embedded in OCT compound and sectioned coronally at 50 μm thickness. Tissue was either Nissl stained using cresyl violet, or for a subset of dH sections, immunostained for RGS14, a marker of CA2, using previously described methods (Kay et al., 2016).

To reconstruct recording sites (Figure S1), evenly spaced plates from the Paxinos and Watson Rat Atlas (2007), which is based on Wistar rats, were stretched and modified to align to representative sections from each brain region, using landmarks such as the ventricles, corpus callosum, and hippocampal pyramidal layers as guides. These modified plates were then treated as atlases to align the remaining sections and recording sites across animals.

Data analysis

All analyses were performed using custom code written in Matlab (Mathworks).

Behavioral analysis

The animals’ task performance was analyzed using a state space algorithm (Smith et al., 2004) which estimates the probability that the animal is performing accurately according to Sequence A or B on each trial. This algorithm provides 90% confidence intervals which reveal when the animal is performing one sequence significantly better than the other. All trials in the Acquisition phase were analyzed together and background probability was set at chance (0.167), so that behavior of all animals could be compared from a similar starting point. During the Switch phase, each epoch was estimated independently with an unspecified background probability to get the most accurate representation of the animals’ behavior; this means that occasionally the behavioral state could “jump” at an epoch boundary. The mode of the probability distribution was used to assign trials into performance stages for the SWR rate analysis in Figure 2B-C, which yielded a different number of trials per stage from each animal.

SWR detection

SWRs were detected in dCA1 and vCA1 in 4 rats (Rats 2-5), and dCA1 and ventral CA3 in 1 rat (Rat 1), using methods described previously (Kay et al., 2016). Briefly, each tetrode’s LFP was filtered to the ripple band at 150-250 Hz, the ripple amplitude was squared, summed across tetrodes (3 per animal in dH, only 1 per animal in vH, as this was the minimum number present in all animals), and smoothed with a 4 ms s.d., 32 ms wide Gaussian kernel. We then took the square root of this trace as the power envelope to detect excursions greater than 2 s.d. of the mean power within an epoch, lasting at least 15 ms. Tetrodes for detection were chosen based on ripple band power and proximity to the center of the pyramidal layer. The SWR start time (when the envelope first crosses the threshold) was used as the event detection time. For spiking modulation, SWR characterization, and SWR cross-correlation analyses, we excluded SWRs that occurred within 0.5 s of a previous SWR (i.e. chained SWRs). SWRs were only included for all analyses if detected at head speeds <4 cm/s.
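The thresholding step described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: it assumes the smoothed power envelope has already been computed, interprets the threshold as the epoch mean plus 2 s.d., and uses a hypothetical function name (`detect_events`) and sampling-rate argument.

```python
from statistics import mean, pstdev

def detect_events(envelope, fs_hz, n_sd=2.0, min_dur_s=0.015):
    """Return (start, end) sample indices of envelope excursions above
    mean + n_sd * s.d. that last at least min_dur_s (end is exclusive)."""
    threshold = mean(envelope) + n_sd * pstdev(envelope)
    min_samples = int(round(min_dur_s * fs_hz))
    events = []
    start = None
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i                      # first threshold crossing
        elif value <= threshold and start is not None:
            if i - start >= min_samples:   # keep only long-enough excursions
                events.append((start, i))
            start = None
    # Handle an excursion that is still ongoing at the end of the trace.
    if start is not None and len(envelope) - start >= min_samples:
        events.append((start, len(envelope)))
    return events
```

The start index of each returned pair corresponds to the event detection time used in the text; exclusion of chained events and the head-speed criterion would be applied afterwards.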

As a control, we also detected “noise ripples,” events in the 150-250 Hz band that exceeded a 2 s.d. threshold on our reference tetrodes for dH and vH (which were not in the hippocampus). These events are highly unlikely to be SWRs, but instead may reflect muscle artifacts or other high-frequency noise. For all analyses of SWRs other than NAc spiking (to include the maximum number of NAc spikes), we excluded SWRs with start times occurring within 100 ms of a “noise ripple” on the local reference.

Behavioral state definitions

During run epochs, periods of “immobility” were defined as times with a head movement speed <4 cm/s calculated as the derivative of the smoothed position data from the headstage-mounted diodes. We defined “sleep” as periods of immobility in sleep epochs that occurred >60 s after any movement at >4 cm/s. To calculate overall sleep SWR rate in Figure S2, NREM sleep periods were defined by exclusion of REM sleep as defined previously (Kay et al., 2016). Specifically, REM periods were detected as times when the ratio of Hilbert amplitudes of theta (5–11 Hz) to delta (1–4 Hz), referenced to cerebellar ground, exceeded a per-animal threshold of 1.4-1.7 for at least 10 s.

Characterization of SWR properties

To characterize the spectral properties of dSWRs and vSWRs, multi-tapered spectrograms of the raw LFP triggered on SWR start times were generated using the Chronux toolbox (mtspecgramtrigc, sliding 100 ms window with 10 ms overlap, bandwidth 2-300 Hz), and z-scored to the mean power in each epoch before averaging across epochs and days. To approximate the peak ripple frequency, a slice of this spectrogram was taken at the time of peak ripple power per animal. For the remaining properties described in Figure S2: we defined SWR amplitude as the minimum threshold in s.d. that would be required to detect the event (see above). SWR duration is the time between first threshold crossing and return of the envelope to the mean. Mean epoch SWR rate was calculated for all immobility periods in run epochs and all NREM sleep periods during sleep epochs.

SWR cross-correlation

Cross-correlations between vSWRs and dSWRs were performed within day, using dSWRs as the reference, in 50 ms bins up to 0.5 s lag, and were normalized to the number of dSWRs in each day before averaging across days and animals. To create a z-scored version of the cross-correlation histogram, vSWR event times were circularly shuffled 1000 times within immobility periods (by a random amount up to ±half the mean immobility period length) to create 1000 shuffled histograms. The real cross-correlation values were z-scored relative to the distribution of shuffled values within each bin, such that a z-score of 0 indicates that the real data is no different than the mean of the shuffles. For cross-correlations within dH or vH, SWRs were detected on individual tetrodes.
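A minimal sketch of the shuffle-based z-scoring, under simplifying assumptions: event times are in seconds, the shuffle is a single circular shift of all target events within one recording of length `duration_s` (rather than within individual immobility periods), and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def cross_corr_hist(ref_times, target_times, bin_s=0.05, max_lag_s=0.5):
    """Histogram of (target - reference) lags, normalized by the number
    of reference events."""
    n_bins = int(round(2 * max_lag_s / bin_s))
    counts = [0] * n_bins
    for r in ref_times:
        for t in target_times:
            lag = t - r
            if -max_lag_s <= lag < max_lag_s:
                b = min(n_bins - 1, int((lag + max_lag_s) // bin_s))
                counts[b] += 1
    return [c / len(ref_times) for c in counts]

def shuffle_zscore(ref_times, target_times, duration_s, max_shift_s,
                   n_shuffles=1000, seed=0):
    """Z-score each bin of the real histogram against circularly
    shifted versions of the target event train."""
    rng = random.Random(seed)
    real = cross_corr_hist(ref_times, target_times)
    shuffled = []
    for _ in range(n_shuffles):
        shift = rng.uniform(-max_shift_s, max_shift_s)
        moved = [(t + shift) % duration_s for t in target_times]
        shuffled.append(cross_corr_hist(ref_times, moved))
    z = []
    for b, value in enumerate(real):
        column = [s[b] for s in shuffled]
        sd = pstdev(column)
        # A bin with no variance across shuffles is reported as 0.
        z.append((value - mean(column)) / sd if sd > 0 else 0.0)
    return z
```

A z-score near 0 in a bin means the observed coincidence rate is indistinguishable from the shuffle mean at that lag.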

SWR rate relative to reward and novelty

In Figure 2B-C, we calculated the rate of SWR events per time spent immobile after reward delivery time on individual rewarded or error trials, and then averaged those rates across trials in each learning stage from each animal. Trials with less than 1 s spent immobile were excluded. In Figure S3C-D, this rate was averaged across trials within each run epoch for each animal. In Figure 2D-E, we calculated SWR rate in 200 ms bins from 0 to 5 s after nosepoke. We subsampled rewarded and error trials based on speed by excluding any trial where the animal spent more than 5 position samples (150 ms) moving faster than 4 cm/s (allowing for some jitter of head position), from 1.5 s after nosepoke to the end of the 5 s window. As the animal’s retreat from the reward well is self-paced, this greatly reduced the number of included error trials and focused exclusively on error trials when the animal waited at the well beyond the expected reward delivery time. We also excluded any bins that were not below the speed threshold, as SWRs could not be detected in these bins according to our criteria. SWR rate per bin was then calculated per the number of included bins in each animal, and we required at least 2 s total of data per bin (10 accepted bins) to calculate a rate across animals.

Unit inclusion

Only units firing at least 100 spikes in a given epoch were included in the current study (865 total NAc units in run, 1678 units in sleep, from all five rats across days). Additional inclusion criteria were applied per analysis.

Putative cell type classification

NAc single units were classified similarly to methods described previously (Berke, 2008; Atallah et al., 2014), using mean firing rate, mean waveform peak width at half-maximum, mean waveform trough width at half-minimum, and ISI distribution. These values were averaged across epochs when a cell was present in multiple epochs within a day. When plotted, mean firing rate and waveform features generated distinguishable clusters (Figure S5A), the boundaries of which were defined as follows: fast-spiking interneurons (FSIs): firing rate >3 Hz, peak width <0.2 ms, and a ratio of trough width to peak width (TPR) <2.7 (TPR was estimated by k-means clustering and was more reliable than exact trough width for FSIs); tonically-active neurons (TANs): <5% of ISIs less than 10 ms, a median ISI >100 ms, and peak width and trough width above the 95th percentile for the remainder of the units; unclassified units had low TPR and/or narrow trough widths (<0.2 or 0.3 ms) but firing rates <2 Hz; all other units were considered putative medium-sized spiny neurons (MSNs). Only MSNs and FSIs are included in the current study.

Hippocampal units were also classified according to mean firing rate and peak and trough width. Putative interneurons were defined as having firing rates >5 Hz, peak width <0.2 ms and trough width <0.3 ms. All other non-unclassified units were considered putative pyramidal cells.

SWR-triggered spiking activity

For all analyses of SWR-aligned spiking, we created SWR-onset-triggered rasters (1 ms bins) in a 1 s peri-SWR window. From this raster, the mean firing rate was smoothed with a 10 ms s.d., 80 ms wide Gaussian kernel to generate a peri-event time histogram (PETH). For analyses based on z-scored firing rates (e.g. Figure 3C,G), the raster was padded with a 100x repetition of its start and end values, smoothed, unpadded, and z-scored to the pre-SWR period −500 to 0 ms.
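The pad, smooth, unpad, and z-score sequence can be illustrated with a short sketch. It assumes 1 ms bins, so that a 10 ms s.d., 80 ms wide kernel becomes a symmetric 81-tap kernel; the function names are hypothetical and the baseline is taken as the first `n_baseline_bins` bins of the window.

```python
import math
from statistics import mean, pstdev

def gaussian_kernel(sd_bins, width_bins):
    """Normalized Gaussian kernel with width_bins + 1 symmetric taps."""
    half = width_bins // 2
    weights = [math.exp(-0.5 * (k / sd_bins) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_padded(rate, kernel, pad=100):
    """Pad both ends by repeating the edge values, convolve, then return
    only the samples that correspond to the original trace."""
    half = len(kernel) // 2
    assert pad >= half, "padding must cover half the kernel width"
    padded = [rate[0]] * pad + list(rate) + [rate[-1]] * pad
    out = []
    for i in range(pad, pad + len(rate)):
        out.append(sum(kernel[j] * padded[i - half + j] for j in range(len(kernel))))
    return out

def zscore_to_baseline(rate, n_baseline_bins):
    """Z-score the whole trace to the mean and s.d. of its baseline bins.
    The baseline s.d. must be greater than zero."""
    base = rate[:n_baseline_bins]
    mu, sd = mean(base), pstdev(base)
    return [(r - mu) / sd for r in rate]
```

Padding with repeated edge values keeps the smoothed trace from decaying toward zero at the window boundaries, which would otherwise bias the baseline estimate.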

For multiunit activity (MUA) analysis in dH and vH, we thresholded all spike events at 40 μV on tetrodes with clear multiunit firing in the pyramidal layer. In Rats 4 and 5, MUA was extracted by post-hoc thresholding of the 600-6000 Hz filtered LFP. SWR-triggered MUA spike counts were summed across tetrodes and then divided by the total time per bin to calculate a mean firing rate per animal.

To detect significant SWR-modulation of NAc cells, we followed a procedure described previously (Jadhav et al., 2016). Briefly, for each cell, we circularly shuffled each SWR-triggered spike train by a random amount up to ±0.5 s to generate 5000 shuffled PETHs. We then calculated the summed squared difference of the real PETH relative to the mean of the shuffles in a 0-200 ms window post SWR-onset, and compared it to the same value for each shuffle relative to the mean of the shuffles. Significance at p<0.05 indicates that the real modulation exceeded 95% of the shuffles. The direction of modulation was defined from a modulation index, calculated as the mean firing rate in the 0-200 ms window minus the mean baseline firing rate from −500 to −100 ms, divided by the mean baseline firing rate. The sign of this index was used to assign cells as significantly positively or negatively SWR-modulated.
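The shuffle test can be sketched as below. This is an illustrative simplification rather than the published procedure: PETHs are left unsmoothed, the raster is a list of per-SWR binary spike trains of equal length, the default shuffle count is reduced for speed, and the function names are hypothetical.

```python
import random

def peth(raster):
    """Mean spike count per bin across SWR-triggered trials."""
    n_trials = len(raster)
    return [sum(trial[b] for trial in raster) / n_trials
            for b in range(len(raster[0]))]

def circ_shift(trial, k):
    """Circularly shift one trial by k bins."""
    k %= len(trial)
    return trial[-k:] + trial[:-k] if k else list(trial)

def modulation_p_value(raster, window_bins, n_shuffles=500, seed=0):
    """Fraction of shuffled PETHs whose summed squared deviation from the
    shuffle mean (within window_bins) is at least as large as the real one."""
    rng = random.Random(seed)
    n_bins = len(raster[0])
    shuffled = [peth([circ_shift(t, rng.randrange(n_bins)) for t in raster])
                for _ in range(n_shuffles)]
    mean_shuffle = [sum(s[b] for s in shuffled) / n_shuffles for b in range(n_bins)]

    def ssd(p):
        return sum((p[b] - mean_shuffle[b]) ** 2 for b in window_bins)

    real = ssd(peth(raster))
    return sum(1 for s in shuffled if ssd(s) >= real) / n_shuffles

def modulation_index(response_rate, baseline_rate):
    """(response - baseline) / baseline; the sign gives the direction."""
    return (response_rate - baseline_rate) / baseline_rate
```

Because each trial is shifted independently, the shuffles preserve every trial's spike count while destroying any consistent alignment to SWR onset.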

To categorize cells according to both dSWRs and vSWRs, we only included cells that fired at least 50 spikes in the peri-event rasters for both types of SWRs. Cells were subsequently categorized according to their significance and direction as unmodulated (Neither, N), dSWR-significant only (D only), vSWR-significant only (V only), significant during both (Both), dSWR-activated (D+), dSWR-suppressed (D−), vSWR-activated (V+), vSWR-suppressed (V−), or combinations of these: D+V+, D+V−, D−V+, or D−V−. In both wake and sleep, we observed more dSWR- and vSWR-modulated cells than the chance level of 5%. To assess the significance of the “both” modulation categories, we compared each fraction to the chance overlap of our empirical fractions of dSWR- and vSWR-modulated cells using a nonparametric z-test for proportions. We defined “modulation amplitude” as the mean z-scored firing rate of each cell (relative to the pre-SWR period −500 to 0 ms) in the 0-200 ms window following SWR onset.

Potential duplicate cell control

To control for the possibility that cells stably recorded on the same tetrode across days could have been counted more than once and could influence any of our results, we excluded potential duplicate cells based on waveform similarity (Schmitzer-Torbert and Redish, 2004). We first established a waveform correlation threshold based on cells recorded on different tetrodes on the same day, which are different cells by definition. For each pair of cells, we aligned their mean waveforms at the peak (on the maximum channel) of one of the cells and calculated a Pearson’s correlation coefficient on each channel (channel 1 of cell A was compared to channel 1 of cell B, and so on). In cases where the waveform snippets were different lengths (due to different spike extraction in Matclust as compared to Mountainsort), we aligned the snippets at their peaks and padded the edges with zeroes as needed. The resulting r values for each channel were then averaged to establish a mean r for that pair. The 95th percentile of r values in this different-cell distribution, 0.979 for wake and 0.980 for sleep, was taken as the threshold for waveform correlation. Next, if a tetrode was moved ≥78 μm across days (Berke, 2008), we considered the newly acquired cells to be “unique.” If a tetrode was moved less than 78 μm between days, we computed the mean r for all pairs of cells on that tetrode across all previous days of similar depth. This could exclude cells that disappeared and “came back” across multiple days, even though this scenario would seem to be unlikely. Cell pairs with a mean r greater than the threshold were tagged as potential duplicates. We first kept cells from the day with the most cells on that tetrode (randomly selected if multiple days tied for maximum cell count). If a given potential duplicate cell had not yet been kept, one instance of that cell was randomly selected to keep. 
Cells present in both wake and sleep were classified as potential duplicates based on their r in wake. We note that this system will result in some false positive exclusions and false negative inclusions; different MSNs can have highly similar waveforms even though they are different cells (false positive), and waveforms can change dramatically from day to day even for the same cell due to changes in cell health or relative position of the tetrode (false negative). However, applying this conservative control did not change any of our main results.
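The waveform comparison at the core of this control can be sketched as follows. The sketch assumes each cell is represented as a list of per-channel mean waveforms that are already peak-aligned and of equal length; the function names are hypothetical, and the default threshold is the wake value reported above.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_waveform_r(cell_a, cell_b):
    """Average the channel-wise correlations (channel 1 vs. channel 1, etc.)."""
    return mean(pearson_r(a, b) for a, b in zip(cell_a, cell_b))

def is_potential_duplicate(cell_a, cell_b, threshold=0.979):
    """Tag a pair as a potential duplicate if its mean r exceeds the threshold."""
    return mean_waveform_r(cell_a, cell_b) > threshold
```

Because Pearson's r is insensitive to amplitude scaling, two units with the same waveform shape but different amplitudes would still be tagged, which is consistent with the conservative intent of the control.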

NAc neuron task firing

We analyzed trajectory firing patterns using two methods: normalized trial time and linearized position. In the normalized trial time method, each trial was split into normalized progression of time spent at the well (from nosepoke to when the animal turns around; “well”) and time spent moving along the path between wells (from turnaround to next nosepoke; “path”). The turnaround time was detected by a >4 cm movement in the x-direction, a change in head direction of >0.25 radians (~14°), and a speed of >2 cm/s. Additionally, we required that the animal had moved away from the well in the y-direction one second in the future, otherwise the turnaround time was incremented. Path and well time were divided into bins of 0.5% of the total completion time. Firing rate was calculated by dividing the number of spikes in each bin by the bin width in seconds on that particular trial (excluding spikes during either dSWRs or vSWRs), smoothing the rate with a 5 bin (2.5%) s.d., 40 bin (20%) wide Gaussian kernel, and then averaging across trials of the same trajectory type (defined by start and end well). We further attempted to control for variation in the animal’s behavior on individual trials in three ways: by only calculating mean trajectory rates when there were at least 3 trials on that trajectory; by performing a pairwise speed profile correlation across trials and only accepting trials that fell at or above the 25th percentile of speed similarity values; and by only accepting trials with a duration at or below the 75th percentile of the trial length distribution. These methods excluded trials that were long, slow, or had many stops.

In the linearized position method, we projected the animal’s 2D position to a line connecting each junction and endpoint of the maze, generating a linearized position relative to the start of each trajectory defined by start and end well. Each trajectory thus contained a specific set of maze segments, and we again controlled for behavioral variation by only accepting trials where the animal deviated ≤12 linear cm onto segments not included in the current trajectory (this allowed for small “head swings” onto neighboring segments). Only data during movement >4 cm/s were included. From the set of included trials on each trajectory, we calculated a firing rate per time spent moving (occupancy) in each linear position bin of 2 cm, smoothed it with a 4 cm s.d., 20 cm wide Gaussian kernel, and calculated the mean rate on that trajectory within day. Bins with <100 ms total occupancy were excluded. Trajectories missing more than 5 bins (as a result of diode occlusion or low time occupancy) were excluded from firing similarity analysis (5 or fewer missing bins were interpolated), and linearized distance was normalized before pairwise correlation across trajectories (below).

To assess the firing similarity of a given cell across trajectories that differ in spatial location and direction, we focused on the 6 rewarded trajectories (across SA and SB) depicted in Figure 4A. We calculated the coefficient of determination between the mean firing profiles of each pair of trajectories (as a function of normalized trial time or linearized position), and then took the mean r² across pairs. We controlled for the effect of firing rate by matching cells in the V+ population to D+ and N cells with the closest firing rates, generating subsampled D+ and N populations. Note that the variety of behavioral controls applied to both methods excluded slightly different numbers of cells, depending on whether the cells were active on enough trials that passed our criteria to compute an r².

Trial-by-trial correlation was performed with the same controls for behavioral variability as described above. Specifically, we correlated successive pairs of individual trials (minimum 10 traversals per included trajectory) to get a mean r² for each trajectory, and then took the mean r² across trajectories. In the linearized position version, this was done with firing rate in 4 cm bins, smoothed with a 4 cm s.d., 20 cm wide Gaussian. A larger bin size was used to account for lower time occupancy in any given bin on a single trial, and we excluded bins with <30 ms occupancy (~1 position sample).

We explored a variety of additional task-related firing parameters to characterize NAc MSNs. Left/right trajectory directionality was calculated as the absolute area between leftward and rightward trajectory firing rate curves on the same maze segments in linearized position, divided by their sum (values closer to 1 indicate a stronger preference in one direction, either left or right). For two-dimensional (2D) spatial coverage, we first generated an occupancy-normalized firing rate map of each cell in each task epoch, in 1 cm² bins smoothed with a symmetric 2D Gaussian (4 cm s.d.). Coverage was defined as the fraction of the area with >5% of non-zero occupancy where the cell fired >10% of its peak spatial firing rate; coverage was then averaged per cell across epochs. Path vs. well preference was calculated from each cell’s mean path and well firing rates (excluding SWR times) across trajectories in normalized trial time, as (path − well)/(path + well), such that values greater than zero indicate path preference and values less than zero indicate well preference. To assess preference for a specific alternation sequence on Switch days (Sequence B vs. A), we calculated the mean firing rate on the path from all trials following a switch to the newly rewarded sequence. The “switch” trial from SA to SB, for example, was defined as the first trial of the longest contiguous stretch of trials in which the lower confidence bound of the animal’s SB performance (see behavioral analysis) exceeded the upper confidence bound of the SA performance. The sequence preference index was calculated as (FR_SB − FR_SA)/(FR_SB + FR_SA), where FR_SA and FR_SB are the mean path firing rates during Sequence A and Sequence B, respectively. Only cells that were active in at least one successful switch epoch per sequence were included. When cells were active in all 3 epochs, the firing rates from the first and third epochs (the same sequence) were averaged.

Analysis of movement covariates

To determine whether the firing patterns of D+ cells could be explained by movement variables, we performed two independent analyses. In the first analysis, we took advantage of the fact that if a cell is significantly modulated by kinematic variables, then fluctuations in its firing rate will be significantly correlated with fluctuations in those kinematic variables. To measure these fluctuations, we first calculated mean and individual trial firing rates on each of the 6 included trajectories from the linearized position firing field (2 cm bins, as described above). We likewise calculated the mean and individual trial running speed, running acceleration, angular speed (of the animal’s head), and angular acceleration in the same bins of 2 cm on each trajectory. We excluded trajectories with fewer than 3 trials and bins where the expected (mean) firing rate was <0.2 Hz. We then calculated the residuals between each trial and the mean for each of these variables (Yu et al., 2018), and concatenated residuals across the whole day to compute a Spearman’s correlation between the firing rate and movement residuals in each valid spatial bin. Cells showing significant correlations with movement residuals were excluded from population analysis as a control (e.g. Figures S7K and S8B, excluding cells correlated with running speed residuals). Correlations with acceleration, angular speed, and angular acceleration likewise had no effect on our results (data not shown).

In the second analysis, for D+ cells that fired at least 50 spikes during movement >4 cm/s, we generated 5 explicit models of each cell’s spike train during movement, according to (1) linear position along the 6 included trajectories in cm, (2) running speed, (3) running acceleration, (4) angular speed, and (5) angular acceleration. To generate these models, we first created slightly smoothed tuning curves for each variable, which were (1) the linearized position firing field of the cell on each trajectory, or (2-5) the occupancy-normalized firing rate of the cell in bins according to each movement variable (e.g. 1 cm/s bins for speeds 4 to 60 cm/s). We then predicted the firing rate of the cell using the tuning curves in each 33 ms time bin across the day (the video sampling rate). For instance, if at time step t the animal was moving at 10 cm/s, and the cell had a mean epoch firing rate of 4 Hz at 10 cm/s, we predicted the firing rate at time t to be 4 Hz. The predicted firing rate curves were then integrated with a trapezoidal function to produce the predicted spike train in 500 ms bins. We quantified the error of each model as the summed squared (SS) residuals from the true spike train, and compared them by normalizing to the SS residuals of the linear position model. The advantage of this method (as compared to a generalized linear model) is that it does not assume a linear relationship between firing rate and any of the independent variables.

NAc neuron reward and reward history firing

To examine reward history preference, we calculated firing rate on the path in normalized trial time, using the same methods as above (but smoothed with a 1.5% s.d., 12% wide Gaussian kernel), now comparing all paths (regardless of trajectory) following a rewarded well visit or an error well visit. We required a minimum of 2 s (the delay between nosepoke and reward delivery) to be spent at the well for a trial to be included, as this is the minimum time at which the animal would know if the trial was rewarded. We only included cells for which at least 3 rewarded and 3 error trials passed our speed profile and trial length controls. The reward history preference was calculated from the mean firing rate curves as (post-reward − post-error)/(post-reward + post-error). Significance of reward preference (>0) vs. error preference (<0) was calculated with a permutation test from the set of rewarded and error trials. To ask whether fluctuations in running speed on the path on rewarded vs. error trials could account for reward history preference, we removed cells that showed a significant positive or negative correlation between speed residuals and firing rate residuals. Similar to the analysis of movement covariates above, we calculated the residual between the speed on each individual trial (as a function of normalized time, 0.5% bins) from the mean speed taken across rewarded and error trials, as well as the residual between the firing rate on each trial and the mean firing rate. We separately summed the speed residuals and firing rate residuals for each trial, and calculated the Spearman’s correlation of the summed residuals across trials. This analysis thus asks whether, on average, increases in speed across the path predict increases in firing rate.

To calculate reward vs. error preference at the wells (based on current reward or error), we used two methods. In the first method (Figure 5C-D), we calculated firing rate on rewarded and error well visits as a function of normalized time at the wells, excluding SWR spikes, again requiring a minimum dwell time of 2 s. Specifically, we separately normalized the time from nosepoke to reward delivery (2 s) and from delivery to turnaround in 1% bins each, such that expected delivery time would be aligned across rewarded and error trials. The mean firing rate curve for the whole well period was then smoothed with a 3 bin s.d., 24 bin wide Gaussian. We additionally applied a pairwise speed profile correlation to only include trials that fell at or above the 25th percentile of speed similarity values, and only included cells for which at least 3 rewarded and 3 error trials met the above criteria. We then calculated a reward vs. error index per cell from the mean firing rate curves as (reward − error)/(reward + error), exclusively in time bins for which the mean speed on both rewarded and error trials was <2 cm/s. Significance of reward preference (>0) or error preference (<0) was calculated with a permutation test.

In the second method for reward preference at the wells (Figure 5E-F), we calculated rewarded and error firing rates as a function of true time from nosepoke (0 to 4 s, 100 ms bins; smoothed with a 100 ms s.d., 800 ms wide Gaussian). We then computed a reward vs. error index from the mean firing rate curves post-reward-delivery (2 to 4 s, excluding SWR spikes) as (reward − error)/(reward + error). We again controlled for speed by excluding any trial where the animal spent more than 5 position samples (150 ms) moving faster than 4 cm/s, and required at least 5 included trials of both types to compute an index. Significance was again assessed with a permutation test in the 2-4 s window.
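The reward vs. error index and its permutation test can be sketched as follows. The text does not specify the permutation scheme, so this sketch assumes a two-sided test in which trial labels are shuffled between rewarded and error trials; each trial is reduced to a single mean firing rate, and the function names are hypothetical.

```python
import random
from statistics import mean

def contrast_index(a, b):
    """(a - b) / (a + b); positive values indicate a preference for a."""
    return (a - b) / (a + b)

def permutation_p(reward_rates, error_rates, n_perm=2000, seed=0):
    """Return the observed index and the fraction of label permutations
    whose absolute index is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = contrast_index(mean(reward_rates), mean(error_rates))
    pooled = list(reward_rates) + list(error_rates)
    n_reward = len(reward_rates)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign trial labels at random
        permuted = contrast_index(mean(pooled[:n_reward]), mean(pooled[n_reward:]))
        if abs(permuted) >= abs(observed):
            count += 1
    return observed, count / n_perm
```

The same contrast form, (a − b)/(a + b), underlies the path vs. well, sequence, and reward history indices, so the helper applies to each of them with different inputs.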

Spike cross-correlations

Spike cross-correlations between pairs of cells were calculated in 10 ms bins at up to 0.5 s lag. Each CCH was first normalized by the square root of the product of the number of spikes from each cell. To z-score the CCH of each cell pair, one of the spike trains was circularly shuffled 1000 times (by a random amount up to ±half the mean immobility period length) to create 1000 shuffled CCHs. Each real and shuffled CCH was smoothed with a 20 ms s.d., 160 ms wide Gaussian. The real cross-correlation values were then z-scored relative to the distribution of shuffled values within each bin. We averaged the cross-correlation z-score ±10 ms around 0 to get an approximate “zero-lag” value. To control for the higher firing rates of D+ cells, we subsampled D+/D+ pairs to match the activity levels of D+/V+ pairs, using the mean firing rate of the pair (during all movement) and the difference in firing rates of the two cells in the pair (so that the pairs would have a similar mean and variance). We similarly subsampled D+/N pairs to match the activity of D+/D+ pairs.

Theta modulation of spiking

Theta phase was extracted by the Hilbert transform of the 5-11 Hz filtered LFP referenced to ground. Because the difference between dH and vH theta phase varies according to anatomical location (Patel et al., 2012), we established a common reference for theta phase within each animal from the dH theta rhythm. Using theta from one of the dCA1 ripple detection tetrodes, we assigned phase 0 as the peak of local multiunit activity within epoch (by shifting theta phase by the mean offset between the LFP trough and peak MUA). To verify an offset between dH and vH theta activity, we measured the phase difference between dH and vH theta-locked multiunit activity peaks rather than the phase shift of the LFP oscillation, as LFP theta phase can vary greatly with recording distance from the pyramidal cell layer (Lubenov and Siapas, 2009). We found the difference between the phases of peak dH and vH MUA within day and took the mean phase difference across days, relative to vH (i.e. θD - θV, where θ is the peak phase of multiunit spikes). MUA was detected on the same tetrodes used for SWR detection. To find the phase preference of NAc cells, we first tested for significant phase-locking to dH theta using the Rayleigh test for uniformity on cells that fired at least 50 spikes during movement, restricting the analysis to spikes occurring during movement and outside SWRs. We then measured the peak spike phase and mean spike phase for each cell. Peak phase was computed as the maximum of a spike-phase histogram in bins of π/1200, smoothed with a π/12 s.d. Gaussian (same method as for dH and vH MUA). Mean phase is simply the circular mean of all spike phases. Theta modulation strength of each cell was defined by kappa (the concentration parameter; CircStat toolbox for Matlab). 
The difference between the phase preference of the D+ and V+ populations was tested for significance with a permutation test, in which permuted values were the shortest circular differences between the phase preferences of all permuted populations (10,000 permutations).
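The circular statistics used here can be sketched in a few lines. This is not the CircStat code itself: the Rayleigh p-value uses a standard closed-form approximation (attributed to Zar), and the function names are hypothetical. Phases are in radians.

```python
import math

def circular_mean(phases):
    """Circular mean of a set of phases, in (-pi, pi]."""
    s = sum(math.sin(p) for p in phases)
    c = sum(math.cos(p) for p in phases)
    return math.atan2(s, c)

def resultant_length(phases):
    """Mean resultant length: 0 for uniform phases, 1 for identical phases."""
    s = sum(math.sin(p) for p in phases)
    c = sum(math.cos(p) for p in phases)
    return math.hypot(s, c) / len(phases)

def rayleigh_p(phases):
    """Approximate p-value of the Rayleigh test for non-uniformity."""
    n = len(phases)
    big_r = n * resultant_length(phases)
    return math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))

def circ_diff(a, b):
    """Shortest signed circular difference a - b, in (-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))
```

In a permutation test of the kind described, `circ_diff` would be applied to the mean phases of the two relabeled populations on each permutation and compared with the observed difference.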

SWR coactivity

We quantified pairwise cell reactivation during SWRs using a coactivity z-score as previously described (Cheng and Frank, 2008; Singer and Frank, 2009), which measures how likely two cells are to spike together normalized by how often each one spikes independently during SWRs. Specifically, we counted the number of awake dSWRs (for D+/D+ and NAc/dCA1 pairs) or vSWRs (for V+/V+ pairs) within a day during which each cell spiked at least once, where the boundaries of each SWR were defined by the 2 s.d. threshold (see SWR detection). Because reactivation events can span SWRs in a chain (Davidson et al., 2009), we included chained SWR events (although exclusion of chains did not change the results). For both awake and sleep SWRs, we limited analysis to asynchronous dSWRs and vSWRs that occurred greater than 250 ms apart. From the set of SWR events, the observed coincidence of spiking was calculated as a z-score:

$$z = \frac{n_{AB} - \dfrac{n_A n_B}{N}}{\sqrt{\dfrac{n_A n_B (N - n_A)(N - n_B)}{N^2 (N - 1)}}}$$

where N is the total number of SWR events, nA is the number of events in which cell A spiked, nB is the number of events in which cell B spiked, and nAB is the number of events in which both cells spiked.
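As a worked illustration, the coactivity z-score can be computed directly from the four counts. The sketch assumes the standard form of this statistic, in which the observed number of co-active events is compared with the mean and variance expected if the two cells spiked independently across events; the function name is hypothetical.

```python
import math

def coactivity_z(n_events, n_a, n_b, n_ab):
    """Z-score of observed co-activity (n_ab) relative to independence.

    n_events: total number of SWR events (N)
    n_a, n_b: events in which cell A or cell B spiked
    n_ab:     events in which both cells spiked
    """
    expected = n_a * n_b / n_events
    variance = (n_a * n_b * (n_events - n_a) * (n_events - n_b)
                / (n_events ** 2 * (n_events - 1)))
    return (n_ab - expected) / math.sqrt(variance)
```

For example, with 100 events in which cell A spiked in 20 and cell B in 30, independence predicts 6 co-active events, so observing 6 gives z = 0, while observing 12 gives a z-score of about 3.26.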

QUANTIFICATION AND STATISTICAL ANALYSIS

No statistical methods were used to predetermine sample size. The minimum number of required animals was established beforehand as four or more, in line with similar studies in which this number yields data with sufficient statistical power. All statistical tests were non-parametric and two-sided unless otherwise specified. Exact p-values, n-values with units, and statistical tests are reported in the figure legends or figures when applicable. Statistical significance was set at p<0.05 unless otherwise specified.

DATA AND CODE AVAILABILITY

Data and custom code are available from the authors upon reasonable request.

Highlights:

  • Dorsal and ventral hippocampal awake SWRs occur at different times

  • dH and vH SWRs modulate individual NAc neurons in opposite ways

  • dH (but not vH) SWRs activate NAc neurons encoding spatial paths and past reward

  • Distinct dH- and vH-coordinated NAc networks persist in movement and sleep

Acknowledgements

We thank J. Berke, H. Fields, M. Kheirbek, M. Brainard, A. Gillespie, A. Joshi, A. Comrie, M. Coulter, J. Yu, and K. Kay for comments on an earlier version of the manuscript; members of the Frank laboratory for useful discussions; K. Kay for contributing key analysis code; J. Chung and J. Magland for development of drift tracking in Mountainsort; and I. Grossrubatscher, V. Kharazia, and E. Miller for technical assistance. This work was supported by Howard Hughes Medical Institute (L.M.F.), Simons Collaboration for the Global Brain Grants 521921 and 542981 (L.M.F.), NIMH Ruth L. Kirschstein NRSA F31MH111214 (M.S.) and F30MH115582 (H.R.J.), and NIGMS MSTP Grant T32GM007618 (H.R.J.).

Footnotes

Declaration of interests

The authors declare no competing interests.

References

  1. Adhikari A, Topiwala MA, and Gordon JA (2011). Single units in the medial prefrontal cortex with anxiety-related firing patterns are preferentially influenced by ventral hippocampal activity. Neuron 71, 898–910.
  2. Ambrose RE, Pfeiffer BE, and Foster DJ (2016). Reverse Replay of Hippocampal Place Cells Is Uniquely Modulated by Changing Reward. Neuron 91, 1124–1136.
  3. Atallah HE, McCool AD, Howe MW, and Graybiel AM (2014). Neurons in the ventral striatum exhibit cell-type-specific representations of outcome during learning. Neuron 82, 1145–1156.
  4. Bannerman DM, Rawlins JN, McHugh SB, Deacon RM, Yee BK, Bast T, Zhang WN, Pothuizen HH, and Feldon J (2004). Regional dissociations within the hippocampus--memory and anxiety. Neurosci Biobehav Rev 28, 273–283.
  5. Berke JD (2008). Uncoordinated firing rate changes of striatal fast-spiking interneurons during behavioral task performance. J Neurosci 28, 10075–10080.
  6. Berke JD, Breck JT, and Eichenbaum H (2009). Striatal versus hippocampal representations during win-stay maze performance. J Neurophysiol 101, 1575–1587.
  7. Berke JD, Okatan M, Skurski J, and Eichenbaum HB (2004). Oscillatory entrainment of striatal neurons in freely moving rats. Neuron 43, 883–896.
  8. Britt JP, Benaliouad F, McDevitt RA, Stuber GD, Wise RA, and Bonci A (2012). Synaptic and behavioral profile of multiple glutamatergic inputs to the nucleus accumbens. Neuron 76, 790–803.
  9. Brog JS, Salyapongse A, Deutch AY, and Zahm DS (1993). The patterns of afferent innervation of the core and shell in the "accumbens" part of the rat ventral striatum: immunohistochemical detection of retrogradely transported fluoro-gold. J Comp Neurol 338, 255–278.
  10. Buzsaki G (2015). Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and planning. Hippocampus 25, 1073–1188.
  11. Buzsaki G, and Moser EI (2013). Memory, navigation and theta rhythm in the hippocampal-entorhinal system. Nat Neurosci 16, 130–138.
  12. Carelli RM (2002). The nucleus accumbens and reward: neurophysiological investigations in behaving animals. Behav Cogn Neurosci Rev 1, 281–296.
  13. Castro DC, and Bruchas MR (2019). A Motivational and Neuropeptidergic Hub: Anatomical and Functional Diversity within the Nucleus Accumbens Shell. Neuron 102, 529–552.
  14. Cheng S, and Frank LM (2008). New experiences enhance coordinated neural activity in the hippocampus. Neuron 57, 303–313.
  15. Chersi F, and Burgess N (2015). The Cognitive Architecture of Spatial Navigation: Hippocampal and Striatal Contributions. Neuron 88, 64–77.
  16. Chung JE, Joo HR, Fan JL, Liu DF, Barnett AH, Chen S, Geaghan-Breiner C, Karlsson MP, Karlsson M, Lee KY, et al. (2019). High-Density, Long-Lasting, and Multi-region Electrophysiological Recordings Using Polymer Electrode Arrays. Neuron 101, 21–31.e5.
  17. Chung JE, Magland JF, Barnett AH, Tolosa VM, Tooker AC, Lee KY, Shah KG, Felix SH, Frank LM, and Greengard LF (2017). A Fully Automated Approach to Spike Sorting. Neuron 95, 1381–1394.e6.
  18. Ciocchi S, Passecker J, Malagon-Vina H, Mikus N, and Klausberger T (2015). Brain computation. Selective information routing by ventral hippocampal CA1 projection neurons. Science 348, 560–563.
  19. Davidson TJ, Kloosterman F, and Wilson MA (2009). Hippocampal replay of extended experience. Neuron 63, 497–507.
  20. Dougherty KA, Islam T, and Johnston D (2012). Intrinsic excitability of CA1 pyramidal neurones from the rat dorsal and ventral hippocampus. J Physiol 590, 5707–5722.
  21. Eichenbaum H (2017). On the Integration of Space, Time, and Memory. Neuron 95, 1007–1018.
  22. Fanselow MS, and Dong HW (2010). Are the dorsal and ventral hippocampus functionally distinct structures? Neuron 65, 7–19.
  23. Floresco SB, Seamans JK, and Phillips AG (1997). Selective roles for hippocampal, prefrontal cortical, and ventral striatal circuits in radial-arm maze tasks with or without a delay. J Neurosci 17, 1880–1890.
  24. Foster DJ, and Wilson MA (2006). Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature 440, 680–683.
  25. Frank LM, Brown EN, and Wilson MA (2000). Trajectory encoding in the hippocampus and entorhinal cortex. Neuron 27, 169–178.
  26. Girardeau G, Inema I, and Buzsaki G (2017). Reactivations of emotional memory in the hippocampus-amygdala system during sleep. Nat Neurosci 20, 1634–1642.
  27. Gomperts SN, Kloosterman F, and Wilson MA (2015). VTA neurons coordinate with the hippocampal reactivation of spatial experience. eLife 4.
  28. Humphries MD, and Prescott TJ (2010). The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. Prog Neurobiol 90, 385–417.
  29. Ito R, Robbins TW, Pennartz CM, and Everitt BJ (2008). Functional interaction between the hippocampus and nucleus accumbens shell is necessary for the acquisition of appetitive spatial context conditioning. J Neurosci 28, 6950–6959.
  30. Jadhav SP, Rothschild G, Roumis DK, and Frank LM (2016). Coordinated Excitation and Inhibition of Prefrontal Ensembles during Awake Hippocampal Sharp-Wave Ripple Events. Neuron 90, 113–127.
  31. Ji D, and Wilson MA (2007). Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat Neurosci 10, 100–107.
  32. Jimenez JC, Su K, Goldberg AR, Luna VM, Biane JS, Ordek G, Zhou P, Ong SK, Wright MA, Zweifel L, et al. (2018). Anxiety Cells in a Hippocampal-Hypothalamic Circuit. Neuron 97, 670–683.
  33. Joo HR, and Frank LM (2018). The hippocampal sharp wave-ripple in memory retrieval for immediate use and consolidation. Nat Rev Neurosci 19, 744–757.
  34. Karlsson MP, and Frank LM (2009). Awake replay of remote experiences in the hippocampus. Nat Neurosci 12, 913–918.
  35. Kay K, Sosa M, Chung JE, Karlsson MP, Larkin MC, and Frank LM (2016). A hippocampal network for spatial coding during immobility and sleep. Nature 531, 185–190.
  36. Keinath AT, Wang ME, Wann EG, Yuan RK, Dudman JT, and Muzzio IA (2014). Precise spatial coding is preserved along the longitudinal hippocampal axis. Hippocampus 24, 1533–1548.
  37. Kheirbek MA, Drew LJ, Burghardt NS, Costantini DO, Tannenholz L, Ahmari SE, Zeng H, Fenton AA, and Hen R (2013). Differential control of learning and anxiety along the dorsoventral axis of the dentate gyrus. Neuron 77, 955–968.
  38. Kim SM, and Frank LM (2009). Hippocampal lesions impair rapid learning of a continuous spatial alternation task. PLoS ONE 4, e5494.
  39. Kjelstrup KB, Solstad T, Brun VH, Hafting T, Leutgeb S, Witter MP, Moser EI, and Moser MB (2008). Finite scale of spatial representation in the hippocampus. Science 321, 140–143.
  40. Komorowski RW, Garcia CG, Wilson A, Hattori S, Howard MW, and Eichenbaum H (2013). Ventral hippocampal neurons are shaped by experience to represent behaviorally relevant contexts. J Neurosci 33, 8079–8087.
  41. Lansink CS, Goltstein PM, Lankelma JV, Joosten RN, McNaughton BL, and Pennartz CM (2008). Preferential reactivation of motivationally relevant information in the ventral striatum. J Neurosci 28, 6372–6382.
  42. Lansink CS, Goltstein PM, Lankelma JV, McNaughton BL, and Pennartz CM (2009). Hippocampus leads ventral striatum in replay of place-reward information. PLoS Biol 7, e1000173.
  43. Lansink CS, Jackson JC, Lankelma JV, Ito R, Robbins TW, Everitt BJ, and Pennartz CM (2012). Reward cues in space: commonalities and differences in neural coding by hippocampal and ventral striatal ensembles. J Neurosci 32, 12444–12459.
  44. Lansink CS, Meijer GT, Lankelma JV, Vinck MA, Jackson JC, and Pennartz CM (2016). Reward Expectancy Strengthens CA1 Theta and Beta Band Synchronization and Hippocampal-Ventral Striatal Coupling. J Neurosci 36, 10598–10610.
  45. Lavoie AM, and Mizumori SJ (1994). Spatial, movement- and reward-sensitive discharge by medial ventral striatum neurons of rats. Brain Res 638, 157–168.
  46. LeGates TA, Kvarta MD, Tooley JR, Francis TC, Lobo MK, Creed MC, and Thompson SM (2018). Reward behaviour is regulated by the strength of hippocampus-nucleus accumbens synapses. Nature 564, 258–262.
  47. Li Z, Chen Z, Fan G, Li A, Yuan J, and Xu T (2018). Cell-Type-Specific Afferent Innervation of the Nucleus Accumbens Core and Shell. Front Neuroanat 12, 84.
  48. Lubenov EV, and Siapas AG (2009). Hippocampal theta oscillations are travelling waves. Nature 459, 534–539.
  49. Moser MB, and Moser EI (1998). Functional differentiation in the hippocampus. Hippocampus 8, 608–619.
  50. Mulder AB, Tabuchi E, and Wiener SI (2004). Neurons in hippocampal afferent zones of rat striatum parse routes into multi-pace segments during maze navigation. Eur J Neurosci 19, 1923–1932.
  51. O'Neill J, Senior TJ, Allen K, Huxter JR, and Csicsvari J (2008). Reactivation of experience-dependent cell assembly patterns in the hippocampus. Nat Neurosci 11, 209–215.
  52. Padilla-Coreano N, Bolkan SS, Pierce GM, Blackman DR, Hardin WD, Garcia-Garcia AL, Spellman TJ, and Gordon JA (2016). Direct Ventral Hippocampal-Prefrontal Input Is Required for Anxiety-Related Neural Activity and Behavior. Neuron 89, 857–866.
  53. Patel J, Fujisawa S, Berenyi A, Royer S, and Buzsaki G (2012). Traveling theta waves along the entire septotemporal axis of the hippocampus. Neuron 75, 410–417.
  54. Patel J, Schomburg EW, Berenyi A, Fujisawa S, and Buzsaki G (2013). Local generation and propagation of ripples along the septotemporal axis of the hippocampus. J Neurosci 33, 17029–17041.
  55. Pennartz CM, Groenewegen HJ, and Lopes da Silva FH (1994). The nucleus accumbens as a complex of functionally distinct neuronal ensembles: an integration of behavioural, electrophysiological and anatomical data. Prog Neurobiol 42, 719–761.
  56. Pennartz CM, Ito R, Verschure PF, Battaglia FP, and Robbins TW (2011). The hippocampal-striatal axis in learning, prediction and goal-directed behavior. Trends Neurosci 34, 548–559.
  57. Pennartz CM, Lee E, Verheul J, Lipa P, Barnes CA, and McNaughton BL (2004). The ventral striatum in off-line processing: ensemble reactivation during sleep and modulation by hippocampal ripples. J Neurosci 24, 6446–6456.
  58. Riaz S, Schumacher A, Sivagurunathan S, Van Der Meer M, and Ito R (2017). Ventral, but not dorsal, hippocampus inactivation impairs reward memory expression and retrieval in contexts defined by proximal cues. Hippocampus 27, 822–836.
  59. Rothschild G, Eban E, and Frank LM (2017). A cortical-hippocampal-cortical loop of information processing during memory consolidation. Nat Neurosci 20, 251–259.
  60. Royer S, Sirota A, Patel J, and Buzsaki G (2010). Distinct representations and theta dynamics in dorsal and ventral hippocampus. J Neurosci 30, 1777–1787.
  61. Schmitzer-Torbert N, and Redish AD (2004). Neuronal activity in the rodent dorsal striatum in sequential navigation: separation of spatial and reward responses on the multiple T task. J Neurophysiol 91, 2259–2272.
  62. Singer AC, and Frank LM (2009). Rewarded outcomes enhance reactivation of experience in the hippocampus. Neuron 64, 910–921.
  63. Singer AC, Karlsson MP, Nathe AR, Carr MF, and Frank LM (2010). Experience-dependent development of coordinated hippocampal spatial activity representing the similarity of related locations. J Neurosci 30, 11586–11604.
  64. Sjulson L, Peyrache A, Cumpelik A, Cassataro D, and Buzsaki G (2018). Cocaine Place Conditioning Strengthens Location-Specific Hippocampal Coupling to the Nucleus Accumbens. Neuron 98, 926–934.
  65. Smith AC, Frank LM, Wirth S, Yanike M, Hu D, Kubota Y, Graybiel AM, Suzuki WA, and Brown EN (2004). Dynamic analysis of learning in behavioral experiments. J Neurosci 24, 447–461.
  66. Sosa M, Gillespie AK, and Frank LM (2016). Neural Activity Patterns Underlying Spatial Coding in the Hippocampus. Current Topics in Behavioral Neurosciences 37, 43–100.
  67. Strange BA, Witter MP, Lein ES, and Moser EI (2014). Functional organization of the hippocampal longitudinal axis. Nat Rev Neurosci 15, 655–669.
  68. Tang W, Shin JD, Frank LM, and Jadhav SP (2017). Hippocampal-Prefrontal Reactivation during Learning Is Stronger in Awake Compared with Sleep States. J Neurosci 37, 11789–11805.
  69. Trouche S, Koren V, Doig NM, Ellender TJ, El-Gaby M, Lopes-Dos-Santos V, Reeve HM, Perestenko PV, Garas FN, Magill PJ, et al. (2019). A Hippocampus-Accumbens Tripartite Neuronal Motif Guides Appetitive Memory in Space. Cell 176, 1393–1406.
  70. van der Meer MA, Johnson A, Schmitzer-Torbert NC, and Redish AD (2010). Triple dissociation of information processing in dorsal striatum, ventral striatum, and hippocampus on a learned spatial decision task. Neuron 67, 25–32.
  71. van der Meer MA, and Redish AD (2011). Theta phase precession in rat ventral striatum links place and reward information. J Neurosci 31, 2843–2854.
  72. van Strien NM, Cappaert NL, and Witter MP (2009). The anatomy of memory: an interactive overview of the parahippocampal-hippocampal network. Nat Rev Neurosci 10, 272–282.
  73. Witter MP (2007). Intrinsic and extrinsic wiring of CA3: indications for connectional heterogeneity. Learn Mem 14, 705–713.
  74. Yoshida K, Drew MR, Mimura M, and Tanaka KF (2019). Serotonin-mediated inhibition of ventral hippocampus is required for sustained goal-directed behavior. Nat Neurosci 22, 770–777.
  75. Yu JY, Kay K, Liu DF, Grossrubatscher I, Loback A, Sosa M, Chung JE, Karlsson MP, Larkin MC, and Frank LM (2017). Distinct hippocampal-cortical memory representations for experiences associated with movement versus immobility. eLife 6.
  76. Yu JY, Liu DF, Loback A, Grossrubatscher I, and Frank LM (2018). Specific hippocampal representations are linked to generalized cortical representations in memory. Nat Commun 9, 2209.
