Author manuscript; available in PMC: 2022 Oct 24.
Published in final edited form as: Hippocampus. 2021 Jun 9;31(10):1051–1067. doi: 10.1002/hipo.23367

Disrupting the medial prefrontal cortex with DREADDs alters hippocampal sharp-wave ripples and their associated cognitive processes

Brandy Schmidt 1, A David Redish 1,*
PMCID: PMC9590403  NIHMSID: NIHMS1843523  PMID: 34107138

Abstract

The hippocampus and medial prefrontal cortex (mPFC) interact during a myriad of cognitive processes including decision-making and long-term memory consolidation. Exactly how the mPFC and hippocampus interact during goal-directed decision-making remains to be fully elucidated. During periods of rest, bursts of high-frequency oscillations, termed sharp-wave ripples (SWRs), appear in the local field potential. Impairing SWRs on the maze or during post-learning rest can interfere with memory-guided decision-making and memory consolidation. We hypothesize that the hippocampus and mPFC bidirectionally interact during SWRs to support memory consolidation and decision-making. Rats were trained on the neuroeconomic spatial decision-making task, Restaurant Row, to make serial stay-skip decisions where the amount of effort (delay to reward) varied upon entry to each restaurant. Hippocampal cells and SWRs were recorded in rats with the mPFC transduced with inhibitory DREADDs. We found that disrupting the mPFC impaired consolidating SWRs in the hippocampus. Hippocampal SWR rates depended on the internalized value of the reward (derived from individual flavor preferences), a parameter important in decision-making, and disrupting the mPFC changed this relationship. Additionally, we found a dissociation among on-maze SWRs that depended on whether they occurred while the rat was anticipating food reward or during post-reward consumption.

Keywords: decision-making, spatial navigation, Restaurant Row, neuroeconomic, foraging, consolidation

Introduction

The hippocampus and medial prefrontal cortex (mPFC) interact to support memory consolidation and decision-making. Lesion (Churchwell & Kesner, 2011; Spellman et al., 2015; Wang & Cai, 2006) and electrophysiological studies (Shin et al., 2019) have shown that the prelimbic cortex supports hippocampal goal-related cognitive processes. Bursts of high-frequency oscillations (180–220 Hz), known as sharp-wave ripples (SWRs), appear in the local field potential of the hippocampus in rats during learning and maze exploration, quiescent times on a maze, and post-maze rest (Buzsáki et al., 1983; Buzsáki, 2015; Dupret et al., 2010; Joo & Frank, 2018; Lee & Wilson, 2002; O’Keefe & Nadel, 1978; Pfeiffer & Foster, 2013). During SWRs, hippocampal place cells (O’Keefe & Dostrovsky, 1971; O’Keefe & Nadel, 1978; Redish, 1999) fire in a temporally compressed manner that recapitulates recent behavioral events (Foster & Wilson, 2006; Karlsson & Frank, 2008; O’Neill et al., 2008). Post-learning SWRs are believed to facilitate the consolidation of information by strengthening the connections within the hippocampus and between the hippocampus and cortical areas, such as the mPFC (Sutherland & McNaughton, 2000). Disrupting SWRs during sleep impairs spatial learning (Ego-Stengel & Wilson, 2010; Girardeau et al., 2009). In contrast, SWRs exhibited when the rat is performing a cognitive task on the maze are believed to help facilitate planning (Carr et al., 2011; Roumis & Frank, 2015). Disrupting awake SWRs impairs memory-guided decision making (Jadhav et al., 2012), and artificially prolonging spontaneous SWRs with optogenetic stimulation improves memory performance (Fernández-Ruiz et al., 2019). Taken together, these studies suggest that SWRs facilitate different cognitive processes (Joo & Frank, 2018).

The mPFC receives mono-synaptic input from the hippocampus, but only sends input indirectly to the hippocampus (Hoover & Vertes, 2007; Vertes et al., 2007). As such, interactions between SWRs and the mPFC have traditionally been examined by manipulating SWRs or hippocampal projections to the mPFC. However, recent studies suggest that the mPFC may have more control over hippocampal physiology than previously realized (Helfrich et al., 2019; Maingret et al., 2016; Preston & Eichenbaum, 2013; Wang et al., 2015). Recent studies suggest that the mPFC could play a role in triggering hippocampal SWRs during planning (Shin & Jadhav, 2016; Yu & Frank, 2015).

The neuroeconomic spatial decision-making task, Restaurant Row, requires the rat to encounter a series of stay/go decisions for different flavored food rewards (Schmidt et al., 2019; Steiner & Redish, 2014; Sweis et al., 2018). Based on a rat’s willingness to wait out the delay we can measure individual preferences for specific flavors (plain, cherry, banana, chocolate) and in the 4 × 20 task (see below), reward size (1 vs. 3 pellets). Post-reward SWR rates increase with reward size (Ambrose et al., 2016; Singer & Frank, 2009; Sosa et al., 2020); however, it is not clear how subjective reward value affects SWR rates. In the current study, the mPFC of rats was transduced with inhibitory (hM4Di) DREADDs and the rats were given daily injections of vehicle (VEH) or CNO (clozapine-N-oxide). Local field potentials and neural ensembles from the hippocampus were recorded while DREADD-transduced rats performed the Restaurant Row task under VEH and CNO conditions. We hypothesized that disrupting the mPFC would alter hippocampal SWRs, and that in addition to being affected by reward size, SWRs are also modulated by offer value, including both cost (as delay to reward) and reward preferences. Furthermore, given the differences seen between on- and off-maze SWRs (Joo & Frank, 2018), we examined the extent to which SWRs differed between times while the rat was waiting for a reward and times while the rat lingered at the reward site after having just received reward.

Methods

DREADDs transfection

Seven male Brown-Norway rats aged 10–14 months at the start of the experiment were used in this study. Rats were maintained above their 80% free-feeding weight. Prior to training on the task, the mPFC of each rat was transfected with mCherry-tagged AAV8-CaMKIIα-hM4Di virus (UNC Vector Core, Chapel Hill, NC) under isoflurane anesthesia. The virus was injected bilaterally into the prelimbic cortex. We infused a total of 4 μL of 3.4 × 10¹² mol/mL titer at a rate of 200 nL/min into each site (Pump 11 Elite, Harvard Apparatus). The injector (28 GA cannula) was left in place for an additional 5 minutes to minimize diffusion up the injector tract. mPFC coordinates for the infusion were 3.0 mm A/P, 0.7 mm M/L, and 3.6 mm D/V. Viral surgeries were conducted under Biological Safety Level 1 practices and procedures as identified by the University of Minnesota’s Institutional Biological Safety Committee and all animal-related procedures were approved by the University of Minnesota Institutional Animal Care and Use Committee (IACUC).

Pretraining

Following DREADDs transfection surgery and 3 days of recovery, rats received twice-daily training sessions lasting 30 minutes each (see Fig. S1a). Training began with 5 days of habituation to the environment. Delays in this phase remained a constant 1 second at all feeder sites. After 5 days of habituation, the randomized list of delays presented to animals was expanded to 1–2, 1–3, 1–4, and 1–5 second delays on 4 following days (i.e. on day 6 of training each restaurant had a random delay between 1 and 2 seconds, on day 7 each restaurant had a random delay between 1 and 3 seconds, etc; Fig. S1a). Rats then received 10 days of training on which delays were randomly selected from a uniform distribution between 1 and 30 seconds. After this 19-day sequence of twice-daily, 30-minute sessions, rats transitioned to training on once-daily, 60-minute sessions in which they could encounter delays between 1 and 30s. These sessions were presented over a minimum of 5 days until performance was deemed stable (30+ laps). On the rare occasion the rat did not eat enough food to maintain their weight they would be post-fed to keep them above their 80% free feeding weight. At this point, five rats received hyperdrive implantation surgery.

Hyperdrive Implantation surgery

After initial task training, the rats underwent a triple bundle, 24 moveable tetrode, four reference, hyperdrive implantation targeting the prelimbic (A/P 3.0; M/L 0.7; 3 tetrodes), ventral striatum (VSTR; A/P 1.2; M/L 2.4; 5 tetrodes), and hippocampus (A/P −3.0; M/L 4.0; 16 tetrodes). Because cell yields were small, prelimbic cortex and ventral striatal data were not analyzed in this paper. Tetrodes were lowered daily until they hit the mPFC and the pyramidal cell layer of CA1. Theta was recorded from the hippocampal fissure from two of the four references, and the other two references were placed in the corpus callosum. Daily recordings were taken while the rat was resting before behavior (5 min, Pre), while on the maze (60 or 80 min, Maze), and resting after behavior (5 min, Post; see Fig. 1b).

Figure 1: Disrupting the mPFC with DREADDs impaired off-line hippocampal SWR rates.

a) Schematic of the daily paradigm. On Restaurant Row, rats are required to make serial stay/skip choices for different flavored food rewards (color reflects flavor: white, plain; yellow, banana; pink, cherry; brown, chocolate). When a rat entered a restaurant/zone (demarcated by the dashed lines), a delay counted down reflecting how long the rat needed to wait to receive the food reward (1–30 sec). The rat could wait out the delay to receive the food reward or skip the current restaurant and proceed to the next. The mPFC was transfected with inhibitory DREADDs (Schmidt et al., 2019) and rats were given daily injections of VEH or CNO before recording. b) Neuronal ensembles and local field potentials in the hippocampus were recorded during three behavioral epochs: 5-minute pre-maze record (Pre), on the maze, and 5-minute post-maze record (Post). c) SWR rates were examined on VEH and CNO days during the Pre and Post recordings. SWR rates increased from the Pre-maze to the Post-maze sessions. Disrupting the mPFC with CNO impaired this effect. Though SWR rates increased from the Pre-maze to Post-maze on CNO days, the Pre-maze to Post-maze increase was significantly larger on VEH days than on CNO days. Boxplot center mark depicts the median (red line), and bottom and top edges represent the first and third quartiles. Whiskers extend to extreme data points not considered outliers. Different colors represent different rats. Diamonds = VEH days, circles = CNO days. ** p < 0.01; *** p < 0.001.

Injection sequence

After the hyperdrive surgery, rats were retrained daily on the maze as tetrodes were lowered. A 20-day injection sequence (see Fig. S1a) followed once the tetrodes had reached their respective areas. Experimenters were blind to the identity of the solution (VEH or CNO) injected on any given day. Experimental and control conditions were presented in matched pairs in pseudorandomized order, controlling for first-order sequence effects. The rats were given CNO (5 mg/kg, s.c.) or VEH 20 minutes before testing. CNO (NIMH Chemical Synthesis and Drug Supply Program) was dissolved in dimethylsulfoxide (DMSO; Fisher Scientific, Pittsburgh, PA) and 0.9% saline to yield a DMSO concentration of 10%. VEH injections also contained 10% DMSO. The blind was broken once all the data were collected.

The Restaurant Row task

Restaurant Row is a neuroeconomic spatial decision-making task in which rats make serial stay/skip choices for different flavors of food reward in a naturalistic foraging paradigm (Schmidt et al., 2019; Steiner & Redish, 2014; Sweis et al., 2018) (Fig. 1a). The Restaurant Row task enables direct measures of value from the flavor preferences revealed by individual rats. Rats were trained to run clockwise around a circular maze (approximately one meter in diameter) with four evenly spaced spokes; at the end of each spoke was one of four differently flavored rewards (“restaurants”: plain, chocolate, banana, cherry). At each restaurant, a feeder (MedAssociates, St. Albans, VT) dispensed two 45-mg food pellets of the given flavor (Research Diets, New Brunswick, NJ). Flavor locations remained constant throughout training. As the rat entered a restaurant perimeter, a tone sounded, where the pitch of the tone indicated the required delay remaining before food would be delivered. The maze was evenly divided up into four quadrants to demarcate restaurant entry and exit (Fig. 1a dashed lines indicate restaurant quadrant). Delays were randomized (uniform distribution) between 1–30 seconds on each entry, so the rat did not know what the cost (delay) would be until entering the restaurant. Longer delays were indicated by higher frequency tones and counted down every second in decreasing 250 Hz steps. Rats can discriminate between longer and shorter delays as revealed by their individual thresholds for different flavors which were different across rats but consistent within rats (Schmidt et al., 2019; Steiner & Redish, 2014; Sweis et al., 2018). If the rat left the restaurant before the countdown completed (skipped), the tone stopped, the offer was rescinded, and the rat was required to proceed to the next restaurant. Restaurants were primed in serial order, forcing the rats to run the maze in a clockwise direction. 
Rats had one hour to gather their food for the day (seven days a week), making this an economic task in which rats had a time budget of 60 minutes to forage for food. Daily recordings were taken while the rat was resting before the task (5 min, Pre), while on the maze (Maze), and resting after the task (5 min, Post). Behavioral data for these rats on the Restaurant Row task have been previously reported in Schmidt et al. (2019).

4 × 20 Task

After 20 days of the VEH/CNO injection sequence (see above) on the Restaurant Row task, rats were trained on the 4 × 20 task for 8 days (Steiner & Redish, 2014) (Fig. S1b). In this variant, rats were given four daily 20-minute epochs to complete the Restaurant Row task (80 minutes total), but instead of receiving 2 pellets at each restaurant, the rats received 3 pellets at one restaurant and 1 pellet at the other three restaurants. Rats were removed from the maze and returned to their resting pot to mark the end of each 20-minute session. In each of the four 20-minute epochs, a different restaurant dispensed 3 pellets while the other three restaurants dispensed 1 pellet. As with the Restaurant Row task, daily recordings were taken while the rat was resting before the task (5 min, Pre), while on the maze (four 20-min sessions; Maze), and resting after the task (5 min, Post).

Behavioral and Neurophysiological data collection

An overhead camera recorded the rat’s position via light-emitting diodes mounted to the hyperdrive. Data were recorded with a Cheetah Digital Lynx SX system (Neuralynx). The task was controlled by software written in-house in MATLAB R2012a (The Mathworks, Natick, MA) using video tracked and time-stamped from the Neuralynx Digital Lynx system.

Sharp Wave Ripple Detection

SWRs were detected as described in Jackson et al. (2006). SWRs were examined on local field potentials (LFP) from the one hippocampal tetrode with the best cell yield for each rat. The LFP was bandpass filtered from 180–220 Hz. The amplitude of each filtered signal was found via the Hilbert transform. The distribution of log-transformed average amplitude was used to find events > 1 s.d. above the mean power (similar results were found using 2 and 3 s.d. above the mean). Events were also required to meet a criterion of movement speed less than 5 cm/sec and a theta/delta ratio less than 1 s.d. above mean power.
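The thresholding step can be sketched as follows. This is a minimal illustration in Python, not the in-house MATLAB code used in the study: it assumes that the ripple-band amplitude envelope (the output of the band-pass filter and Hilbert transform) and the running speed have already been computed on a shared sampling grid, and it omits the theta/delta criterion. The names `detect_events`, `envelope`, `speed`, `n_sd`, and `max_speed` are hypothetical.

```python
import math
from statistics import mean, stdev


def detect_events(envelope, speed, n_sd=1.0, max_speed=5.0):
    """Return (start, stop) sample-index pairs of candidate SWR events.

    envelope  : ripple-band amplitude per sample (assumed precomputed, > 0)
    speed     : running speed per sample, in cm/s
    n_sd      : threshold in s.d. above the mean of the log amplitude
    max_speed : samples at or above this speed are rejected
    The stop index is exclusive.
    """
    log_amp = [math.log(a) for a in envelope]
    threshold = mean(log_amp) + n_sd * stdev(log_amp)

    events, start = [], None
    for i, (amp, v) in enumerate(zip(log_amp, speed)):
        candidate = amp > threshold and v < max_speed
        if candidate and start is None:
            start = i                      # event begins
        elif not candidate and start is not None:
            events.append((start, i))      # event ends
            start = None
    if start is not None:                  # event runs to end of record
        events.append((start, len(log_amp)))
    return events
```

Raising `n_sd` to 2 or 3 corresponds to the stricter thresholds mentioned above.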

Sharp Wave Ripple Rate

SWR rate was calculated as the number of SWRs seen on a tetrode divided by the time spent in a condition, providing a per second rate for SWRs. The SWR rate was measured for the Waiting epoch as the rat waited out the delay to receive the food reward. To examine the SWR rates for post-reward epochs, the SWR rate was calculated from the feeder fire to the start of the next restaurant entry.
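As a worked illustration of this definition, the sketch below counts the events that fall inside a set of epoch intervals and divides by the total time in those intervals. It is an assumption-laden example, not the study's code: `swr_rate`, `event_times`, and `epoch_intervals` are hypothetical names, and events are represented by a single timestamp each.

```python
def swr_rate(event_times, epoch_intervals):
    """SWRs per second within a condition.

    event_times     : one timestamp (s) per detected SWR
    epoch_intervals : list of (start, stop) times (s) that make up the
                      condition, e.g. all Waiting periods in a session
    """
    total_time = sum(stop - start for start, stop in epoch_intervals)
    if total_time <= 0:
        raise ValueError("condition has no duration")
    n_events = sum(
        1
        for t in event_times
        if any(start <= t < stop for start, stop in epoch_intervals)
    )
    return n_events / total_time
```

For the post-reward measure, each interval would run from feeder fire to the next restaurant entry.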

Bayesian Decoding

For each SWR that met criteria (see above), the represented path in space was determined using a one-step Bayesian decoding method (Brown et al., 1998; Zhang et al., 1998). Bayesian decoding was done as described in Wikenheiser & Redish (2013). Given spike counts from each cell in the ensemble, the posterior probability of the ensemble was computed representing each position in space. Spatial information was decoded in each SWR using 40 ms time windows with a uniform spatial prior, resulting in a posterior probability distribution across 64 spatial bins. Posterior distributions were normalized to sum to one, and we calculated the decoded position for each time step as the circular mean of the posterior probability distribution. Only time steps with at least one spike were decoded. To examine the decoding for all 4 restaurants we categorized and rotated the maze in relation to the rat’s current restaurant and calculated the summed posterior probability for the Current, Next, Opposite, and Previous restaurants for each 40 ms bin in a SWR.
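The decoding step can be illustrated with a short sketch. It assumes independent Poisson firing and a uniform spatial prior, the standard form of one-step Bayesian decoding (Zhang et al., 1998); the function names, the small rate floor, and the toy tuning curves are assumptions made for the example and are not taken from the study's code.

```python
import math


def decode_window(spike_counts, tuning_curves, tau=0.04):
    """Posterior over spatial bins for one time window.

    spike_counts  : spikes per cell in the window
    tuning_curves : tuning_curves[i][x] = firing rate (Hz) of cell i in bin x
    tau           : window length in seconds (40 ms here)
    With a uniform prior, log P(x | n) = sum_i [n_i log f_i(x) - tau f_i(x)]
    up to a constant that does not depend on x.
    """
    n_bins = len(tuning_curves[0])
    log_post = []
    for x in range(n_bins):
        lp = 0.0
        for n, curve in zip(spike_counts, tuning_curves):
            rate = max(curve[x], 1e-6)     # floor avoids log(0); an assumption
            lp += n * math.log(rate) - tau * rate
        log_post.append(lp)
    peak = max(log_post)                   # subtract peak for numerical stability
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    return [p / total for p in post]       # normalized to sum to one


def circular_mean_bin(posterior):
    """Decoded position as the circular mean of the posterior, in bin units."""
    n_bins = len(posterior)
    s = sum(p * math.sin(2 * math.pi * x / n_bins) for x, p in enumerate(posterior))
    c = sum(p * math.cos(2 * math.pi * x / n_bins) for x, p in enumerate(posterior))
    angle = math.atan2(s, c) % (2 * math.pi)
    return angle * n_bins / (2 * math.pi)
```

Windows with no spikes would be skipped before calling `decode_window`, matching the criterion above, and summing the posterior over the bins belonging to each restaurant gives the per-restaurant probability.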

Delay threshold

To quantify the subjective value of each of the four flavors presented to each rat, we fit a Heaviside step function to the stay/skip decisions as a function of the presented delays by least-squares (Steiner & Redish, 2014). Separate step functions were fit to the data for each restaurant, rat, and session. The delay at which the function predicted an equal likelihood of stay and skip for a given flavor was defined as the threshold for that flavor and provided a measure of the subjective value of the flavor for the rat for the session. The mean threshold across flavors for a session provided the rat’s overall willingness to wait for food of any kind.
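A least-squares fit of a step function can be sketched as a search over candidate thresholds. The example below is one possible implementation under stated assumptions, not the authors' fitting routine: stays are coded as 1 and skips as 0, the model predicts a stay whenever the delay is below the threshold, and candidates are placed midway between adjacent observed delays. `fit_threshold` is a hypothetical name.

```python
def fit_threshold(delays, stayed):
    """Least-squares Heaviside fit to stay/skip choices.

    delays : offered delay (s) on each restaurant entry
    stayed : 1 if the rat waited out the delay, 0 if it skipped
    Returns the threshold with the smallest sum of squared errors; ties go
    to the lowest candidate.
    """
    unique = sorted(set(delays))
    candidates = [unique[0] - 0.5]                       # "always skip"
    candidates += [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    candidates.append(unique[-1] + 0.5)                  # "always stay"

    def sse(theta):
        return sum(
            (choice - (1 if delay < theta else 0)) ** 2
            for delay, choice in zip(delays, stayed)
        )

    return min(candidates, key=sse)
```

Running this once per restaurant, rat, and session and averaging the four resulting thresholds would give the session-level willingness to wait described above.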

Rate of reinforcement

To quantify the effectiveness of the rats’ decision-making, we calculated the overall session reward accumulation for all food. We obtained this measure by summing the total number of pellets that a rat obtained in a session. Higher rates imply that rats made objectively better choices, whereas lower rates imply that rats made objectively less advantageous choices.

Vicarious trial-and-error

When making difficult decisions, rats often pause and orient back and forth, a behavior termed “vicarious trial-and-error” (VTE; Muenzinger & Gentry, 1931; Muenzinger, 1938; Redish, 2016; Tolman, 1938). To quantify VTE, we calculated the integrated absolute head angle velocity (IdPhi) in the first 3 seconds of zone entry (Papale et al., 2012; Schmidt et al., 2013, 2019). The IdPhi values were subsequently Z-scored for each rat across all zone entries made in each drug condition. Large values of Z(IdPhi) (> 1) corresponded to trajectories that qualitatively matched the pause-and-orient description of VTE, whereas low values (< 1) corresponded to smooth passes through the zone. In the case of behavior-only rats (n = 2), position was tracked with backpack-mounted LEDs; in the case of recording rats (n = 5), position was tracked from LEDs mounted to recording head stages.
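The IdPhi measure can be sketched as below. This is a simplified illustration: it assumes that the orientation of motion is computed from successive tracked positions with a plain first difference, whereas the study derives velocities with an adaptive-windowing estimator, and it uses the sample standard deviation for the Z-score. `idphi` and `zscore` are hypothetical names.

```python
import math
from statistics import mean, stdev


def idphi(xs, ys):
    """Integrated absolute change in orientation of motion (radians).

    xs, ys : tracked positions over the first seconds after zone entry
    """
    points = list(zip(xs, ys))
    # Orientation of each movement step (a zero-length step yields 0.0).
    angles = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    total = 0.0
    for a0, a1 in zip(angles, angles[1:]):
        # Wrap the change into [-pi, pi) so crossing +/-pi is not counted as a full turn.
        d = (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
        total += abs(d)
    return total


def zscore(values):
    """Z-score a list of IdPhi values (e.g. all zone entries of one rat)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]
```

A smooth pass yields a value near zero, while a pause-and-reorient trajectory accumulates a large total.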

Running Speed

We computed the running speed as the change in x and y position (dx, dy) using an adaptive windowing of best-fit velocity vectors (Janabi-Sharifi et al., 2000).

Statistics and General Data Analyses

Data analyses were performed in MATLAB (MathWorks, Natick, MA). Two-tailed tests for normally distributed data and nonparametric tests for non-normally distributed data were used for statistical comparisons unless otherwise noted. Student t-tests, Wilcoxon signed rank tests of significance, n-way ANOVA, and RMANOVA were used, and a Tukey-Kramer test for multiple comparisons was used when appropriate.

Perfusion/Histology

After the end of the experiment, current (100 mA, 10 s) was passed through the electrodes to verify tetrode locations. Three days later, rats were overdosed with sodium pentobarbital (150 mg/kg, Nembutal) and perfused intracardially with formalin. Their brains were transferred to a 30% (wt/vol) sucrose-formalin solution, sectioned on a cryostat, and stained with immunofluorescence for DREADDs or cresyl violet. Immunofluorescence staining was conducted as described in Dong et al. (2010). For further details on the histological methods from these rats see Schmidt et al. (2019). Tetrode placements were visually confirmed with cresyl violet stained sections.

Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Code Availability

In-house code used during the current study is available from the corresponding author upon reasonable request.

Results

Disrupting the mPFC impaired SWR rates during consolidation times

The hippocampus and mPFC support the consolidation of long-term memories necessary to support goal-directed decision-making (Eichenbaum, 2017; Foster, 2017; Ito et al., 2015; Redish, 2016; Sutherland & McNaughton, 2000; Tang & Jadhav, 2019; Yu & Frank, 2015). Recent studies have suggested that the mPFC may have control over the retrieval of hippocampal-dependent contextual memories (Navawongse & Eichenbaum, 2013; Place et al., 2016). If the mPFC controls the recruitment of hippocampal representations, we hypothesize that disrupting the mPFC may affect hippocampal SWRs.

The rate of SWRs increased after running on the maze (Pre → Post) (Repeated-Measures ANOVA (n=sessions): main effect of Epoch F(1,49) = 130.3, p = 2.1e−15; Fig. 1c), replicating previous results (Ambrose et al., 2016; Dupret et al., 2010; Eschenko et al., 2008; Joo & Frank, 2018; Kudrimoti et al., 1999). However, disrupting the mPFC with DREADDs significantly reduced SWR rates on CNO days (Repeated-Measures ANOVA (n=sessions): main effect of Condition F(1,49) = 8.36, p = 0.0057 and a Condition*Epoch interaction F(1,49) = 10.9, p = 0.0018).

Not all on-maze SWRs are created equal

SWRs are not only seen during quiescent rest before and after a training session; SWRs are also seen during quiescent periods on the maze (Kudrimoti et al., 1999; Pfeiffer & Foster, 2013; Roumis & Frank, 2015). These on-maze SWRs decode non-local information (Davidson et al., 2009; Gupta et al., 2010; Jensen & Lisman, 2000; Pfeiffer & Foster, 2013) and have been suggested to be involved in planning (Davidson et al., 2009; Jensen & Lisman, 2000; Pfeiffer & Foster, 2013; Shin et al., 2019) and learning (Carr et al., 2011; Ego-Stengel & Wilson, 2010; Foster, 2017; Jadhav et al., 2012; Joo & Frank, 2018; Singer et al., 2013). On the Restaurant Row task, rats show two quiescent periods, one while waiting for/anticipating the reward (“Waiting”), and a second after having received the reward (“Lingering”) (Fig. 2a). Note that the Waiting period is before reward receipt and the rat can still decide to leave the restaurant, and thus may be involved in planning or decision-making. The Lingering period is after reward, and thus may be more plausibly involved in consolidating the recently completed decisions. Therefore, we hypothesized that SWRs might be different depending upon whether the rat was waiting for a reward or had just received it.

Figure 2: Differences between SWRs on the Waiting vs. Lingering epochs.

a) SWRs on the maze were measured during two different non-ambulatory times: while waiting out the delay to receive the food reward (“Waiting” epoch) and after eating the food reward (“Lingering” epoch). b) SWR rates were significantly higher on the Lingering epoch, compared to the Waiting epoch. CNO had no effect on the SWR rate for either epoch. c) The percentage of SWRs where at least one cell fired was measured for both the Waiting and Lingering epochs. A larger proportion of SWR events had no measured cells firing during the Waiting epoch than during the Lingering epoch. Boxplot center mark depicts the median (red line), and bottom and top edges represent the first and third quartiles. Whiskers extend to extreme data points not considered outliers. Different colors represent different rats. Diamonds = VEH days, circles = CNO days. *** p < 0.001.

SWR rates were higher after receiving food reward (“Lingering”) than while waiting for food reward (“Waiting”) (Repeated-Measures ANOVA (n=sessions): main effect of Epoch F(1,49) = 290.87, p = 3e−22; Fig. 2b), replicating previous results that SWR rates increase after reward. Unlike off-maze SWRs, disrupting the mPFC with DREADDs had no effect on SWR rates during either the Waiting or Lingering epochs on the maze (Repeated-Measures ANOVA (n=sessions): no effect of Condition F(1,49) = 1.62, p = 0.21 or Condition*Epoch interaction F(1,49) = 2.31, p = 0.13; Fig. 2b).

In order to explore the information content of each SWR, we measured the number of cells that fired per SWR event during Waiting and Lingering epochs (Fig. 2c). We ran a multi-factor ANOVA examining the probability that at least one cell fired during the SWR, with Epoch and Condition as variables. Lingering epoch SWRs had a higher probability of having cells fire (ANOVA (n = sessions), main effect of Epoch F(1,162) = 56.79, p = 3e−12). However, CNO had no measurable effect (ANOVA (n = sessions), no main effect of Condition F(1,162) = 1.28, p = 0.26).

SWRs reflect different cognitive processes such as learning and planning (Pfeiffer & Foster, 2013), and both on and off the maze SWRs may facilitate these different processes (Roumis & Frank, 2015; Wikenheiser & Redish, 2013). Disrupting the mPFC diminished the increase in SWR events typically seen during the post-learning epoch, without affecting SWR events on the maze, supporting the theory that SWRs serve different neural processes on and off the maze. In order to assess this, we examined where hippocampal representations predominantly decoded to during SWRs that occurred during these different maze epochs.

Disrupting the mPFC increased non-local decoding

The four different restaurants of the Restaurant Row task allow for the analysis of planning (what should I do?) vs. memory (what did I just do?). Applying Bayesian decoding methods (see Methods) to the neural firing patterns during SWRs revealed that SWRs during the Waiting and Lingering epochs decoded to each restaurant differently. We ran a Repeated-Measures ANOVA with the summed posterior probability of each Restaurant (Current, Next, Opposite, Previous) and Epoch (Waiting vs. Lingering) as variables. Though Waiting and Lingering SWR hippocampal ensembles predominantly decoded the Current restaurant (RM-ANOVA main effect of Restaurant (n = sessions): F(3,66) = 69.70, p = 2e−20; corrected for multiple comparisons, Current vs. Next, Opp, Pre all p < 0.0006; Fig. 3a/b), Waiting SWR non-local representations suggest that hippocampal ensembles were more likely about planning what to do next than about what they just did (multiple comparisons: Next vs. Previous; p = 0.003), while Lingering epoch SWR representations failed to reach significance (multiple comparisons: Next vs. Previous; p = 0.059). Hippocampal ensembles during the Lingering epochs decoded more to the local restaurant, but the Waiting epoch decoded more to non-local restaurants (RM-ANOVA Restaurant*Epoch interaction: F(3,66) = 25.18, p = 5e−11).

Figure 3: SWRs during the Waiting epoch decoded to other Restaurants while SWRs during the Lingering epoch decoded to the current Restaurant.

a/b) Bayesian decoding was used on hippocampal ensembles during SWR events to estimate the rat’s spatial location (see Methods). Decoded SWR events are shown for a) Waiting and b) Lingering SWRs on VEH days in relation to the rat’s current restaurant. Though rats primarily decoded to their current restaurant, non-local spatial decoding (decoding to the Next, Opposite, and Previous restaurants) was greater during the Waiting epoch than the Lingering epoch. Decoded SWR events are shown for c) Waiting and d) Lingering SWRs on CNO days. Though rats primarily decoded to their current restaurant on CNO days, we found a small increase in non-local representations of space on both Waiting and Lingering epochs. Again, Bayesian decoding was used on hippocampal ensembles during SWR events to estimate the rat’s spatial location across the e) amount of time waited for food reward during the Waiting epoch and f) the amount of time waited after receiving food reward during the Lingering epoch. Same analyses as e/f but for CNO g) Waiting and h) Lingering SWRs. i) The Waiting SWR rates measured as a function of time waited showed a linear relationship for the first 6 seconds and plateaued thereafter. j) Lingering SWR rates as a function of time spent lingering showed an immediate peak and sharp drop thereafter. Curr = current restaurant, Next = next restaurant, Opp = opposite restaurant, Prev = previous restaurant

Disrupting the mPFC with CNO altered the local/non-local decoding pattern seen in Waiting epoch VEH days, while maintaining the Lingering patterns (RM-ANOVA main effect of Restaurant (n = sessions): F(3,66) = 32.7 p = 7e−13; Restaurant*Epoch interaction: F(3,66) = 21.18, p = 1.3e−09; Fig. 3c/d). Multiple comparisons revealed that the Current restaurant decoding during the Waiting epoch was not greater than the other three restaurants (only showing significance compared to the Opposite restaurant, p = 0.006).

In order to examine whether this decoding pattern was consistent, we measured how the decoding changed across time waited. Waiting SWR events initially decoded overwhelmingly to the Current restaurant, but decoded more evenly to all four restaurants as the rat waited out the delay (Fig. 3e/g). Unlike the Waiting epoch, Lingering SWR events initially decoded all four restaurants, but then dramatically shifted predominantly to the Current restaurant (Fig. 3f/h). Due to fewer samples, time waited during Lingering after 30 seconds was not examined.

We similarly measured the SWR rate across time waited. Waiting SWR rates were low as the rat started its wait, then gradually increased and plateaued (Fig. 3i). In contrast, Lingering SWR rates peaked around five seconds after feeder fire and then greatly decreased (Fig. 3j).

As expected, SWRs predominantly decoded to their Current restaurant location. Waiting SWRs showed more forward information, while Lingering SWRs showed more local information. Impairing the mPFC disrupted representations in Waiting SWRs, increasing their non-local decoding. We interpret decoding to the Previous location as remembering/thinking of the previous restaurant (“what did I just do?”) and decoding to the Next restaurant as thinking/planning about approaching the next restaurant (“what should I do?”). VEH days showed that the Next restaurant was decoded to more than the Previous restaurants, something we did not see on CNO days, suggesting that the mPFC affects hippocampal planning SWRs.

SWR rates reflect value and flavor preference

On the Restaurant Row task, rats show flavor preferences, revealed by individual delay thresholds, the delay at which a rat was equally likely to stay or skip (Schmidt et al., 2019; Steiner & Redish, 2014; Sweis et al., 2018). Given that SWR rates are modulated by reward size, we predicted that SWR rates would increase with reward preferences and offer value. For each rat, we ranked the four flavors in order of preference: Most, More, Less, Least. During the Waiting epoch, SWR rates tracked flavor preferences (Fig. 4a). SWR rates were lowest at the Least preferred restaurant and highest at the Most preferred restaurant (Repeated Measures ANOVA (n = sessions): main effect of Rank, F(3,147) = 15.98, p = 4.8e−09; corrected for multiple comparisons: Least = Less, Least & Less < More, Least & Less < Most, all p < 0.0001; Fig. 4a). Disrupting the mPFC reduced the SWR rate/flavor preference relationship, particularly for the preferred restaurants (Condition*Rank interaction F(3,147) = 2.4, p = 0.071). In contrast, Lingering SWR rates did not reflect flavor preferences (Fig. 4b).
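The ranking step can be illustrated with a short sketch. It assumes that flavors are ordered by their fitted delay thresholds, with a higher threshold meaning a stronger preference; the function name and the threshold values in the usage note are hypothetical.

```python
def rank_flavors(thresholds):
    """Map preference labels to flavors, ordered by delay threshold.

    thresholds : dict of flavor -> delay threshold (s) for one rat
    Assumes exactly four flavors and that a higher threshold means the rat
    was willing to wait longer, i.e. a stronger preference.
    """
    labels = ["Most", "More", "Less", "Least"]
    ordered = sorted(thresholds, key=thresholds.get, reverse=True)
    return dict(zip(labels, ordered))
```

For example, thresholds of 20, 15, 12, and 8 s for chocolate, cherry, banana, and plain would rank chocolate as Most and plain as Least.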

Figure 4: Waiting SWRs are modulated by flavor preference and value.

a) The four restaurants (banana, chocolate, plain, and cherry) were ranked for each rat in order of flavor preference (Most = 1st, More = 2nd, Less = 3rd, Least = 4th). SWR rates tracked with flavor preferences on the VEH Waiting epoch, higher for the most preferred flavors and lower for the least preferred flavors (Most > Less & Least***; More > Less & Least*). This relationship was disrupted on the most preferred restaurants on CNO days (Most > More & Least *). b) Lingering SWR rates did not track with flavor preference; SWR rates were mostly uniform across flavors. c) Individual trials were designated a Good Value (delay lower than threshold), Difficult Decision (close enough to threshold that it is not obvious whether the rat should take the deal), or a Bad Value (delay higher than threshold). Waiting epoch SWR rates were linearly correlated with trial value (VEH/CNO: Bad Deals < Difficult & Good***, Difficult < Good***). d) Lingering SWR rates, in contrast, did not show an effect of Trial Value, though CNO SWR rates were lower on Good deals than Difficult deals. e) Measuring Waiting SWR rates across the spectrum of trial values showed a positive linear relationship: SWR rates increased as trial value increased. f) Measuring Lingering SWR rates across the spectrum of trial values revealed a small negative correlation. + p = 0.0507, * p < 0.05, ** p < 0.01, *** p < 0.001.

When approaching the zone, the rat can encounter a delay much greater than the individual threshold for that restaurant (a Bad deal), a delay near threshold (a Difficult decision), or a delay much smaller than threshold (a Good deal). Given the strong relationship found between flavor preference and SWR rate, we asked if SWR rates had this same relationship with trial value (Value = Threshold − Delay). Waiting SWR rates increased with trial value (main effect of Trial Deal (n = sessions): F(2, 96) = 93.25, p < 3e−23). SWR rates were lowest for Bad deals and highest for Good deals (Fig. 4c; corrected for multiple comparisons, Bad < Difficult < Good, all p < 0.001). Interestingly, only Good deal SWR rates trended lower on CNO days (no main effect of Condition: F(1,48) = 0.29, p = 0.59; Trial Deal*Condition interaction: F(2,96) = 2.76, p = 0.068). We did not see this same relationship across Bad to Good deals during Lingering SWRs (Fig. 4d); instead, Lingering SWR rates decreased across the trial deals (main effect of Trial Deal: F(2, 64) = 4.26, p = 0.018). Disrupting the mPFC with CNO trended toward reducing Lingering SWR rates overall (main effect of Condition: F(1, 32) = 4.12, p = 0.0507).
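As a minimal sketch of the trial-value definition (Value = Threshold − Delay) and the three deal categories: the width of the "Difficult" band around threshold is not stated in this section, so `margin_s` below is a hypothetical parameter chosen only for illustration, as are the function names.

```python
def trial_value(threshold_s, delay_s):
    """Trial value in seconds: positive when the offered delay is below threshold."""
    return threshold_s - delay_s


def classify_deal(value_s, margin_s=2.0):
    """Label a trial as a Good, Difficult, or Bad deal.

    margin_s is a hypothetical half-width (seconds) of the near-threshold
    band; the criterion actually used in the study is not given here.
    """
    if value_s > margin_s:
        return "Good"
    if value_s < -margin_s:
        return "Bad"
    return "Difficult"


# A rat with a 15 s threshold at this restaurant is offered three delays.
labels = [classify_deal(trial_value(15.0, d)) for d in (5.0, 14.0, 28.0)]
```

Because value is expressed relative to each rat's own threshold for each restaurant, the same absolute delay can be a Good deal at a preferred restaurant and a Bad deal at a less preferred one.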

Upon examining SWR rates across the distribution of Trial Values, we found a linear relationship between SWR rate and trial value in the Waiting SWRs (Fig. 4e; Pearson’s r = 0.19, p < 1e−100). This positive correlation was limited to the Waiting epoch; the Lingering epoch showed a small but significant negative correlation between trial value and SWR rate (Fig. 4f; Pearson’s r = −0.07, p < 3.1e−15).
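For readers less familiar with the statistic, the following is a minimal, dependency-free sketch of Pearson's r and of the standard conversion from r to a t statistic with n − 2 degrees of freedom. It is included only to show why a small correlation can carry a very small p-value when the number of trials is large; the function names are illustrative and this is not the authors' analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```

For a fixed r, the t statistic grows with the square root of n, so per-trial correlations computed over many trials can be highly significant while explaining only a few percent of the variance.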

The 4 × 20 Task

In order to examine the relation of SWRs to changes in reward sizes, learning and consolidation, rats were tested on a new task, the 4 × 20 variant of Restaurant Row (see Methods; Steiner & Redish, 2014; Fig. 5a). On the 4 × 20 task, rats were given four 20-minute blocks; in each block, one restaurant dispensed 3 pellets and the other three dispensed 1 pellet. Each of the four restaurants was the 3-pellet restaurant for one of the four daily blocks.

Figure 5: The 4 × 20 variant of Restaurant Row included 4 daily, 20-minute sub-sessions.

a) After completing the Restaurant Row task, rats were trained on the 4 × 20 variant. During each 20-minute sub-session one restaurant dispensed 3 pellets and the other three restaurants dispensed 1 pellet. A different flavor restaurant became the 3-pellet restaurant for each of the four daily sub-sessions (brown = chocolate, black = plain, yellow = banana, pink = cherry). b–e) SWR rates were compared between the over-trained Restaurant Row task and the novel 4 × 20 task. SWR rates were higher after the maze run (Pre → Post). These off-maze (Pre/Post) SWR rates increased on the 4 × 20 task from the Restaurant Row task for both b) VEH and d) CNO days. d) Disrupting the mPFC with CNO reduced SWR rates. Examining on-maze SWR rates revealed the same increase on the 4 × 20 task on both c) VEH and e) CNO days, but no overall effect of CNO. Boxplot center mark depicts the median (red line), and top and bottom edges represent first and third quartiles. Whiskers extend to extreme data points not considered outliers. Diamonds = VEH days, circles = CNO days. Different colors represent different rats. * p < 0.05, ** p < 0.01, *** p < 0.001.

We have previously shown that mPFC disruption improved behavior in these rats on the standard Restaurant Row task by increasing the rate of reinforcement and lowering thresholds (Schmidt et al., 2019). Consistent with previous results (Schmidt et al., 2019), rats earned more pellets on CNO days on the 4 × 20 task (Wilcoxon signed rank test p = 5e−06; Fig. S2a). However, unlike on the standard Restaurant Row task, compromising the mPFC with CNO did not measurably alter their willingness to wait for food (threshold; Wilcoxon signed rank test p = 0.60; Fig. S2b). This difference was possibly mediated by the learning component of the 4 × 20 task, in contrast to the standard Restaurant Row task, on which the rats were overtrained.

We measured two behavioral variables of deliberative planning: the reaction time to skip a trial (hesitation time) and the probability of Vicarious Trial and Error behavior (pVTE; Schmidt et al., 2019). On the 4 × 20 task, CNO reduced the rat’s hesitation time (Wilcoxon signed rank test (n = sessions) p = 3e−07; Fig. S2c) and the probability of VTE behavior (pVTE) (Wilcoxon signed rank test (n = sessions) p = 0.034; Fig. S2d).
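The VEH versus CNO comparisons above use the Wilcoxon signed rank test. The sketch below shows how that test works on paired samples, using the large-sample normal approximation without tie or continuity corrections. How sessions were paired in the study is not described in this section, so the pairing, the function name, and the example numbers are assumptions for illustration only.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, w_minus, p_value). No tie or continuity correction is
    applied to the variance, so the p-value is approximate.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Under the null hypothesis, w_plus is approximately normal.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


# Hypothetical paired hesitation times (s), VEH vs. CNO.
veh = [3.1, 2.8, 3.5, 2.9, 3.3]
cno = [2.4, 2.5, 2.6, 2.2, 2.1]
w_plus, w_minus, p = wilcoxon_signed_rank(veh, cno)
```

The test uses only the signs and ranks of the paired differences, which makes it a reasonable choice for skewed measures such as reaction times.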

As noted above, the 4 × 20 task allows for the comparison of flavor preference and reward size. In order to examine flavor value preference between the four different restaurants, we measured the lingering time at the reward site after consumption. Both rats and mice linger longer after reward consumption for more preferred rewards (Sweis et al., 2018). Similarly, on 4 × 20, mPFC disruption reduced the time rats spent at the feeder after eating the reward (Wilcoxon signed rank test (n = sessions) p = 2e−05; Fig. S2e).

Consistent with rats being less likely to sit and wait at a restaurant under mPFC disruption (shorter hesitation and lingering times), they also ran faster under mPFC disruption (Wilcoxon signed rank test (n = sessions) p < 1.0e−09; Fig. S2f). We ran a general linear model with pVTE, thresholds, lingering time, hesitation time, and drug condition as explanatory variables of the rate of reinforcement. Lingering time had the most significant effect on rate of reinforcement (β = −1.7, t = −22, p = 5.1e−55). Threshold was the only other variable found to have a significant effect on rate of reinforcement (β = −0.46, t = −2.41, p = 0.017).

Taken together, most of the mPFC disruption behavioral results from the Restaurant Row task were replicated on the 4 × 20 task; the only incongruity was thresholds. mPFC disruption had no detectable effect on thresholds, which could be a result of changes in pellet sizes (3- or 1- pellets vs 2-pellets) between the two tasks (4 × 20 vs. Restaurant Row, respectively) or due to the amount of training on each task (newly learned vs. overtrained).

SWR increased on the novel 4 × 20 task

SWR rates increase during novelty (Cheng & Frank, 2008; Eschenko et al., 2008; Karlsson & Frank, 2008; O’Neill et al., 2008); therefore, we predicted that SWR rates would increase on the 4 × 20 task. We ran an ANOVA on SWR rate with Condition (VEH vs. CNO), Task (Restaurant Row vs. 4 × 20), and Epoch (Pre-maze rest vs. Post-maze rest) as variables (n = sessions). SWR rates significantly increased on the 4 × 20 task in general (main effect of Task: F(1,261) = 4.6, p = 0.033), but significantly decreased on CNO days (main effect of Condition: F(1,261) = 4.1, p = 0.04; Fig. 5b/d), and significantly increased on the Post-maze rest (main effect of Epoch: F(1,261) = 32, p < 0.001).

To examine SWRs on the maze, we ran an ANOVA on SWR rate with Condition (VEH vs. CNO) and Task (Restaurant Row vs. 4 × 20) as variables (n = sessions). On the maze we found a significant increase in SWR rates on the novel 4 × 20 task (main effect of Task: F(1,130) = 5.1, p = 0.025), but no effect of Condition (no main effect of Condition: F(1,130) = 0.14, p = 0.71; Fig. 5c/e). As expected, training rats on a novel task resulted in an increase in SWRs. We found this effect on SWRs both on the maze and during the off-maze rest. Interestingly, the off-maze increase in SWR rates due to novelty was primarily driven by an increased rate of SWRs during the Pre-maze rest (Pre-Restaurant Row vs. Pre-4 × 20; t-test: t(132) = −3.2, p = 0.002) rather than by an increase across both rest epochs (Post-Restaurant Row vs. Post-4 × 20; t-test: t(132) = −0.25, p = 0.80).

The mPFC is necessary to anticipate an increase in reward

On the Restaurant Row task, rats show consistent flavor preferences for each restaurant, revealed by individual delay thresholds (the delay at which a rat was equally likely to stay or skip; Steiner & Redish, 2014). The 4 × 20 task allowed for the comparison of different reward sizes in each restaurant. We examined thresholds for the 1-pellet and 3-pellet sessions within the same restaurant across the four, daily sessions (Fig. 6a).

Figure 6: Rats showed behavioral and electrophysiological differences with increased reward value.

a) The 4 × 20 task allowed for the direct comparison of each restaurant when it dispensed 3-pellets (unfilled circle) and 1-pellet (filled circle). (Grey circles represent restaurants/pellet configurations not examined in the current figure). b) A rat’s willingness to wait for food reward (threshold) tracked with reward size on VEH days (left), but not CNO days (right). c) Post-reward lingering time was greater on higher reward restaurants on both VEH and CNO days. d) Waiting SWR rates were higher for 3-pellet restaurants on both VEH and CNO days. e) Lingering SWR rates also tracked with reward size on VEH and CNO days. VEH = blue circle, CNO = red circle. ** p < 0.01; *** p < 0.001.

Rats wait longer for the 3-pellet rewards than the 1-pellet rewards on the 4 × 20 task (Steiner & Redish, 2014). We replicated this result on VEH days (1-pellet vs. 3-pellet restaurant thresholds (n = sessions): t(74) = −2.97, p = 0.004, CI [−5.0, −1.0]; Fig. 6b left). This effect required learning, as there were no differences in thresholds between 1-pellet and 3-pellet restaurants on their first few days of the 4 × 20 task (1st day: p = 0.51, 2nd day: p = 0.44), but there was on the last few days (3rd day: p = 0.032; Last day: p = 0.001; Fig. S3a top row). Disrupting the mPFC impaired the rats’ ability to anticipate the increase in reward size (1-pellet vs. 3-pellet restaurant thresholds: t(74) = −1.38, p = 0.17, CI [−3.4, 0.62]; Fig. 6b right). Additionally, mPFC disruption impaired the rat’s ability to recognize the increase in pellet reward (1st day: p = 0.59; 2nd day: p = 0.48; 3rd day: p = 0.33; Last day: p = 0.019; Fig. S3a bottom row).

Disrupting the mPFC impaired the rat’s ability to anticipate the difference in pellet size. We hypothesize that this could result from 1) the rat’s inability to recognize the difference between 1 and 3 pellets, 2) an impaired ability to remember the difference in reward size, or 3) an inability to link the memory to the actions taken.

Post-reward evaluation can be measured in the amount of time the rat lingers at a restaurant post consumption. Animals linger longer after more preferred flavors (Sweis et al., 2018). Similar to threshold measures, rats lingered longer after receiving 3 pellets of reward rather than 1 pellet (1-pellet vs. 3-pellet restaurant lingering time VEH: t(74) = −7.29, p = 2.8e−10, CI [−9.1, −5.2]; CNO: t(74) = −9.65, p = 9.9e−15, CI [−7.2, −4.7]; Fig. 6c). This implies that rats were still able to recognize the difference in reward size after mPFC disruption. Examining the lingering time across training revealed that this effect did not require learning (Fig. S3b).

Disrupting the mPFC did not affect all behavioral consequences of reward preference equally. Rats had higher thresholds for higher-value rewards, but only with an intact mPFC; disrupting the mPFC impaired the rats’ ability to anticipate the larger reward at 3-pellet restaurants. Nonetheless, CNO left post-reward evaluation intact: rats recognized the difference between 1 and 3 pellets under both VEH and CNO conditions. This leaves open the question of whether mPFC disruption impaired the ability to remember the difference in reward size and/or the ability to link the memory to the mPFC action-selection system.

Waiting SWR rates reflect the anticipation of higher value reward

SWR rates increase with reward size (Ambrose et al., 2016; Singer & Frank, 2009; Sosa et al., 2020). We predicted that SWR rates would be higher for 3-pellet restaurants vs. 1-pellet restaurants if the animal could recognize and remember the difference. However, the differences seen between Waiting and Lingering epochs suggest that the effect of reward size on SWR rates during post-reward evaluation may not hold during reward anticipation.

On the 4 × 20 task, Waiting SWR rates tracked reward size (1-pellet vs. 3-pellet restaurant SWR rates (n = sessions) VEH: t(50) = −2.87, p = 0.006, CI [−0.16, −0.029]; CNO: t(50) = −3.66, p = 6.1e−04, CI [−0.087, −0.025]; Fig. 6d). On VEH days, thresholds and SWR rates were higher for larger reward restaurants. On CNO days, even when rats failed to show an increase in threshold for larger reward restaurants, they still showed increased SWR rates during the Waiting period. This implies that the difference in thresholds was not due to an impaired ability to remember the reward size, as the SWR rate during waiting still increased with reward size.

The Lingering epoch also demonstrated this effect. SWR rates were greater for 3-pellet restaurants than 1-pellet restaurants (1-pellet vs. 3-pellet restaurant SWR rates VEH: t(50) = −6.24, p = 9.2e−08, CI [−0.23, −0.12]; CNO: t(50) = −9.79, p = 3.3e−13, CI [−0.28, −0.19]; Fig. 6e).

The memory of value is carried over from sub-session to sub-session

Rats waited longer for 3-pellet rewards (Fig. 6b). SWR rates increased on 3-pellet restaurants. This implies that thresholds and SWR rates reflected expected value or reward. If the rats remember the increased pellet reward, the increased thresholds and SWR rates on 3-pellet restaurants should carry over to the next within-day sub-session, even though the reward value decreased back to 1 pellet in the subsequent sub-session (Fig. 7a). In order to test this, SWR rates were compared between restaurants categorized as a “1-pellet restaurant” (a restaurant that had only dispensed 1 pellet up to and including the examined sub-session) and as a “previous 3-pellet restaurant” (a restaurant that had dispensed 3 pellets in a previous sub-session of that day but was then dispensing 1 pellet again). This categorization allowed us to track behavioral and physiological variables across sub-sessions and to determine whether the memory of getting 3 pellets at a restaurant persisted, by comparing these restaurants to others that had only ever dispensed 1 pellet that day.
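The categorization described above can be sketched as follows, given the order in which the restaurants served as the 3-pellet restaurant across a day's sub-sessions. The function name, label strings, and example order are illustrative assumptions; the sketch also assumes each restaurant is the 3-pellet restaurant exactly once per day, as the task description states.

```python
def label_restaurants(big_reward_order):
    """Label every restaurant in every sub-session of one day.

    big_reward_order lists, per sub-session, which restaurant dispensed
    3 pellets. Returns one dict per sub-session mapping restaurant name to
    "3-pellet", "previous 3-pellet", or "1-pellet".
    """
    restaurants = sorted(set(big_reward_order))
    had_three = set()  # restaurants that already dispensed 3 pellets today
    labels = []
    for current_big in big_reward_order:
        block = {}
        for name in restaurants:
            if name == current_big:
                block[name] = "3-pellet"
            elif name in had_three:
                block[name] = "previous 3-pellet"
            else:
                block[name] = "1-pellet"
        labels.append(block)
        had_three.add(current_big)
    return labels


# Hypothetical order of the 3-pellet restaurant across four sub-sessions.
day = label_restaurants(["chocolate", "plain", "banana", "cherry"])
```

Under this scheme no restaurant is a "previous 3-pellet" restaurant in the first sub-session and none is still a plain "1-pellet" restaurant in the last, so the two categories are sampled unevenly across the day.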

Figure 7: Rats showed behavioral and electrophysiological differences to changes in reward value.

a) In addition to measuring when a restaurant increased from 1 to 3 pellets, we also measured different variables after the restaurant dispensed 3 pellets (“previous 3”) (i.e., currently dispensing 1 pellet (filled circle), but dispensed 3 pellets on a previous sub-session of that day (unfilled circle)). b) The rats waited about as long for 1-pellet restaurants as for previous 3-pellet restaurants on both VEH (left) and CNO days (right). c) Interestingly, post-reward lingering revealed that previous 3-pellet restaurants were valued more than 1-pellet restaurants, as the rats lingered longer at these restaurants post reward consumption. d) SWR rates during the Waiting epoch were greater for previous 3-pellet restaurants on both VEH (left) and CNO (right) days. e) SWR rates during the Lingering epoch did not differentiate between previous 3-pellet restaurants and 1-pellet restaurants on VEH days (left), though they did on CNO days (right). VEH = blue circle, CNO = red circle. * p < 0.05; ** p < 0.01; *** p < 0.001.

In anticipation of the reward, rats valued 1-pellet and previous 3-pellet restaurants similarly (VEH: t(49) = −0.85, p = 0.40, CI [−3.2, 1.3]; CNO: t(49) = 1.20, p = 0.24, CI [−1.0, 4.0]; Fig. 7b). Interestingly, during the post-reward evaluation, rats lingered longer at the previous 3-pellet restaurants than 1-pellet restaurants (VEH: t(49) = −3.02, p = 0.004, CI [−29, −0.58]; CNO: t(49) = −2.15, p = 0.037, CI [−1.5, −0.05]; Fig. 7c). This implies that the rats did remember that these restaurants had provided 3 pellets in the previous sessions.

Going from a 3-pellet restaurant back to a 1-pellet restaurant (“previous 3-pellet”) resulted in an increase in SWR rates during the Waiting period (VEH: t(33) = −4.79, p = 3.4e−05, CI [−0.12, −0.046]; CNO: t(33) = −2.83, p = 0.008, CI [−0.12, −0.019]; Fig. 7d). This increase in SWR reward value representation from the previous sub-session carried over to the current sub-session, despite the fact that the reward size received was only 1-pellet, implying that the rats remembered the anticipated reward.

Given that Lingering SWR rates were affected by reward size (Fig. 6e), we predicted that Lingering SWR rates would also differentiate between previous 3-pellet and 1-pellet restaurants. That was not the case under VEH; SWR rates during the Lingering epochs were similar for both 1-pellet and previous 3-pellet restaurants (t(33) = −0.73, p = 0.47, CI [−0.062, 0.029]; Fig. 7e left). However, SWR rates increased on previous 3-pellet restaurants on CNO days (t(33) = −2.86, p = 0.007, CI [−0.11, −0.019]; Fig. 7e right), implying that the rats were surprised by the change.

Given that the previous 3-pellet restaurants only dispensed 1 pellet, the higher Waiting SWR rates at previous 3-pellet restaurants than at 1-pellet restaurants likely reflected changed expectations from the memory of receiving 3 pellets, an effect that carried over from sub-session to sub-session. This argues against our second hypothesis as to why CNO disrupted the rat’s ability to recognize the increase in pellet reward: CNO did not impair the ability to remember the difference in reward size. Unlike in the Waiting epoch, we did not see this effect in the Lingering epoch on VEH days, further supporting the hypothesis that SWRs on the maze reflected different cognitive processes.

How does the memory of the previous 3-pellet restaurant compare to the restaurant when it dispensed 3 pellets (Fig. 8a)? There were no significant differences in threshold between 3-pellet and previous 3-pellet restaurants on VEH days (t(74) = 1.59, p = 0.12, CI [−0.45, 4.0]; Fig. 8b left). In contrast, on CNO days, 3-pellet restaurant thresholds showed a small but significant increase over previous 3-pellet restaurant thresholds (t(74) = 2.16, p = 0.034, CI [0.18, 4.5]; Fig. 8b right). Though previous 3-pellet restaurants had a higher post-reward valuation than 1-pellet restaurants (Fig. 7c), they were not valued as highly as 3-pellet restaurants. Rats lingered longer at 3-pellet restaurants than at previous 3-pellet restaurants (VEH: t(74) = 8.34, p = 2.9e−12, CI [3.1, 5.1]; CNO: t(74) = 8.42, p = 2.1e−12, CI [3.2, 5.1]; Fig. 8c). This implies that the Lingering time reflected the gradient of reward value, with 1-pellet being evaluated as least rewarding, previous 3-pellet evaluated as more rewarding, and 3-pellet being evaluated as most rewarding.

Figure 8: Rats showed different behavioral and electrophysiological responses to decreased reward value.

a) The 4 × 20 task allowed for the direct comparison of a restaurant on the sub-sessions after it dispensed 3 pellets (“previous 3”; unfilled circle) and that same restaurant when it had dispensed 3 pellets (filled circle). b) Interestingly, the rats waited as long for the previous 3-pellet restaurants as they did when the same restaurant dispensed 3 pellets on VEH (left), but not CNO, days (right). c) Though rats were willing to wait as long for previous 3-pellet restaurants, post-reward lingering revealed that 3-pellet restaurants were valued more. d) SWR rates during the Waiting epoch did not differentiate between previous 3-pellet restaurants and 3-pellet restaurants on either VEH or CNO days. e) SWR rates during the Lingering epoch were greater for the 3-pellet restaurants than the previous 3-pellet restaurants on both VEH and CNO days. VEH = blue circle, CNO = red circle. * p < 0.05; *** p < 0.001.

If the increase in reward size was reflected in SWR rates and this increase is carried over from sub-session to sub-session, then SWR rates on 3-pellet restaurants should remain high even when the reward drops back down to 1-pellet on the subsequent sub-sessions (i.e., becomes a previous 3-pellet restaurant). This was the case – Waiting SWR rates were similar between previous 3-pellet restaurants and 3-pellet restaurants (VEH: t(50) = 1.1, p = 0.26, CI [−0.015, 0.054]; CNO: t(50) = −0.74, p = 0.46, CI [−0.047, 0.022]; Fig. 8d). Consistent with the previous comparisons, Lingering SWR rates were greater for 3-pellet restaurants than previous 3-pellet restaurants (VEH: t(50) = 8.24, p = 6.9e−11, CI [0.14, 0.23]; CNO: t(50) = 7.24, p = 2.53e−09, CI [0.12, 0.20]; Fig. 8e).

mPFC disruption impaired the rat’s ability to recognize the changes in pellet reward size. We hypothesized that this could result from either 1) the rat’s inability to recognize the difference between 1 and 3 pellets, 2) an impaired ability to remember the difference in reward size, or 3) an inability to link the memory to action-selection. The rats were able to accurately evaluate larger rewards post-consumption. The rats showed increased SWR rates for larger rewards, suggesting that their ability to remember the difference in reward size was intact. Taken together, this leaves open the third hypothesis to account for disruption of mPFC on this task. We suspect that the memory of the larger reward within the hippocampus was intact, but by disrupting the mPFC this memory was not reaching or accessible to the action-selection system, likely mediated by the mPFC.

Discussion

During post-learning rest, SWRs are hypothesized to facilitate a process of consolidation by recapitulating behaviorally relevant information in a coordinated manner with neo-cortical unit firing, delta oscillations, and sleep spindles (Battaglia et al., 2004; Maingret et al., 2016; Siapas & Wilson, 1998; Tang et al., 2017). In contrast, during awake behavior, SWRs are hypothesized to facilitate planning (Jadhav et al., 2012; Ólafsdóttir et al., 2017; Pfeiffer & Foster, 2013; Shin et al., 2019) and value-learning functions (Ambrose et al., 2016; Ólafsdóttir et al., 2017; Shin et al., 2019). Our data show that 1) disrupting the mPFC with DREADDs impaired post-learning SWR rates, 2) disrupting the mPFC with DREADDs altered SWRs differently depending upon whether the rat was waiting for a reward or had just received it, and 3) SWRs were modulated by offer value, including both cost (as delay to reward) and reward preferences, and the memory of the offer value carried over from daily sub-session to sub-session.

Disrupting the mPFC with DREADDs diminished the post-learning increase in SWR rates typically seen after learning/decision-making. Disrupting post-task SWRs affects learning and retention of novel tasks (Ego-Stengel & Wilson, 2010; Girardeau et al., 2009). Global and neuronal mPFC activity correlates with SWRs during these post-task SWRs (Battaglia et al., 2004; Euston et al., 2007; Maingret et al., 2016; Siapas & Wilson, 1998; Sirota et al., 2003; Wierzynski et al., 2009). Our data suggest that the mPFC has a causal functionality in generating these SWRs; disrupting mPFC diminished the number and rate of SWRs emitted during post-task rest.

Internally disrupting hippocampal SWRs impairs spatial memory (Ego-Stengel & Wilson, 2010; Girardeau et al., 2009; Jadhav et al., 2012) and artificially prolonging SWRs improves memory (Fernández-Ruiz et al., 2019). Ours is the first study to show that internally disrupting the mPFC impairs hippocampal SWRs. Previous studies have shown that SWR rates increase during novelty (Cheng & Frank, 2008; Eschenko et al., 2008; Karlsson & Frank, 2008; O’Neill et al., 2008) and after reward receipt (Ambrose et al., 2016; Singer & Frank, 2009). Taken together, these studies imply that the mPFC may facilitate the post-learning/reward increase in SWR rate.

SWR rates increase during novelty (Cheng & Frank, 2008; Eschenko et al., 2008; Karlsson & Frank, 2008; O’Neill et al., 2008), results we replicated on the 4 × 20 task. Interestingly, the increase seen off the maze was primarily driven by the pre-maze rest epoch. This implies that novelty increases SWR rates in anticipation of learning new contingencies each day. Post-maze rest SWR rates between the Restaurant Row and 4 × 20 tasks were comparable. This could potentially be due to a ceiling effect, as SWRs during post-learning rest are already increased compared to pre-learning rest.

SWRs on the maze and SWRs during rest are believed to support planning and consolidation, respectively, though it is possible these two functions are two sides of the same coin (Joo & Frank, 2018). Even on the maze, during tasks, representational differences have been found, both in the representational component of SWRs themselves (Carey et al., 2019; Ólafsdóttir et al., 2017) and in the correlation between hippocampal and mPFC activity (Jadhav et al., 2016; Shin et al., 2019). Though fewer in number than SWRs after receiving food reward, SWRs during the anticipation of food reward (Waiting epoch) showed more non-local decoding than those emitted after having received the food reward (Lingering epoch). This was particularly prominent for the Next restaurant, suggesting that these anticipatory SWRs were more related to planning (comparing Current and Next restaurants) than consolidation (which we would expect would entail representations of Previous restaurants). Not only did we find a dissociation between SWRs as a rat was anticipating food reward vs. after having received food reward, we found that mPFC disruption affected SWRs differently at these two times, with a larger effect on Lingering (post-reward) than Waiting (pre-reward) SWRs, suggesting a diversity in how mPFC causally impacted different SWRs.

Previous studies have reported that SWR rate tracks with reward size (Ambrose et al., 2016; Singer & Frank, 2009; Sosa et al., 2020). We replicated these results, and furthermore, found that this increase in SWR rate for greater reward size restaurants carried over from session to session in the 4 × 20 variant. As rats were waiting out the delay for food reward (Waiting), restaurants that previously provided 3 pellets (but currently provided 1 pellet) showed higher SWR rates than when these same restaurants had only ever dispensed 1-pellet (Fig. 7d). This effect was seen only while the rats were waiting out the delay for food reward; SWR rates during the lingering session (post reward delivery) did not show this SWR rate carry-over effect within the daily sessions. Previous studies have shown that stabilization of a memory trace is contingent upon the memory salience (Salvetti et al., 2014) and increased SWR rates correlate with memory performance (Norman et al., 2019). Taken together, these data suggest that the mPFC may play a role in stabilizing salient events that will later be used for future goal-directed decision-making.

Our data provide evidence for a causal role of mPFC in hippocampal SWR emission rates and SWR representations. However, as the current study did not apply CNO to a cohort of non-DREADD rats, we cannot rule out that the results stemmed from a general CNO effect. Importantly, however, we found that mPFC disruption did not affect all hippocampal SWRs equally. This implies that differences seen between SWRs at different behavioral times likely facilitate different cognitive processes. SWRs while waiting for food are likely involved in anticipation and planning (Jadhav et al., 2012; Ólafsdóttir et al., 2017; Pfeiffer & Foster, 2013; Shin et al., 2019), whereas SWRs while lingering after reward and during post-task rest are likely involved in consolidation and learning (de Lavilléon et al., 2015; Girardeau et al., 2009; Wikenheiser & Redish, 2013). Each depends differently on the mPFC-hippocampal interaction.

Supplementary Material

SFig 1

Figure S1: Diagram of the training, surgery, and injection sequence. a) Rats were initially trained twice daily on the Restaurant Row task (see Methods). Upon learning the task, the rats underwent a triple-bundle hyperdrive implantation. After recovery, the rats were retrained once daily on the Restaurant Row task while tetrodes were moved to their respective locations. When ready, the rats underwent a 20-day CNO/VEH injection sequence. b) After completing the Restaurant Row task, the rats were then trained on the 4 × 20 variant for eight additional days (see Methods).

SFig 2

Figure S2: Disrupting the mPFC with DREADDs increased food reward and decreased deliberation. a) Rats earned more food on CNO days than VEH days. b) Disrupting the mPFC with DREADDs had no effect on thresholds. c) Rats decreased their hesitation time (time to skip a trial) on CNO days. d) The probability of vicarious trial-and-error behavior (pVTE, a behavioral correlate of decision-making) decreased on CNO days. e) Post-reward lingering time decreased on CNO days. f) Rats ran faster on CNO days. Boxplot center mark depicts the median (red line), and top and bottom edges represent first and third quartiles. Whiskers extend to extreme data points not considered outliers. Different colors represent different rats. Diamonds = VEH days, circles = CNO days. * p < 0.05; *** p < 0.001.

SFig 3

Figure S3: Higher thresholds for larger rewards are learned and require the mPFC. a/b) The 8 days of training were divided between VEH (top row) and CNO days (bottom row). a) Thresholds (willingness to wait) for 1-pellet and 3-pellet restaurants are plotted in relation to flavor preference. Thresholds are higher for larger value reward, but this relationship developed over the course of training on VEH days. In contrast, rats treated 1- and 3-pellet restaurants similarly, for longer, on CNO days. (yellow-least, pink-less, black-more, white-most). b) Post-reward evaluation (Lingering time) for 1-pellet and 3-pellet restaurants is plotted in relation to flavor preference. Unlike thresholds, post-reward evaluation is not learned over the course of training and is not affected by CNO. (yellow-least, pink-less, black-more, white-most). * p < 0.05; *** p < 0.001.

Supplemental Text

Acknowledgments

The authors would like to thank Dr. Bryan Roth, Dr. Daniel Urban and the UNC Vector Core for help with viruses, DREADDs set-up, and troubleshooting. Technical assistance was provided by Kelsey Seeland, Christopher Boldt, and Ayaka Sheehan. Financial support for this work was provided by NIDA grant DA030672 (ADR), NIMH grant MH080318 (ADR), as well as a diversity supplement NIDA DA030672S1 (BJS), funding from the Society for Neuroscience Scholars Program (BJS), and an NIH NRSA fellowship DA038392 (BJS).

Associated Data

Supplementary Materials

SFig 1

Figure S1: Diagram of the training, surgery, and injection sequence. a) Rats were initially trained twice daily on the Restaurant Row task (see Methods). After learning the task, the rats underwent a triple-bundle hyperdrive implantation. After recovery, the rats were retrained once daily on the Restaurant Row task while tetrodes were moved to their respective locations. When ready, the rats underwent a 20-day CNO/VEH injection sequence. b) After completing the Restaurant Row task, the rats were trained on the 4 × 20 variant for eight additional days (see Methods).

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.
