Abstract
Most reinforcement learning models assume that the reward signal arrives after the activity that led to the reward, placing constraints on the possible underlying cellular mechanisms. Here we show that dopamine, a positive reinforcement signal, can retroactively convert hippocampal timing-dependent synaptic depression into potentiation. This effect requires functional NMDA receptors and is mediated in part through the activation of the cAMP/PKA cascade. Collectively, our results support the idea that reward-related signaling can act on a pre-established synaptic eligibility trace, thereby associating specific experiences with behaviorally distant, rewarding outcomes. This finding identifies a biologically plausible mechanism for solving the ‘distal reward problem’.
DOI: http://dx.doi.org/10.7554/eLife.09685.001
Research organism: mouse
eLife digest
To help someone learn a new task, we might give them a reward after they have performed well. However, these rewards tend to be given several seconds or minutes after the behavior they are supposed to promote. Therefore, it is unclear how the rewards affect the brain and help accelerate the learning process.
Information is processed and sent around the brain by networks of cells called neurons. These networks are constantly remodeled because learning changes the connections—called synapses—that neighboring neurons signal across. Synapses can be strengthened so that signals are sent across them more easily in the future. Synapses can also be weakened, making it harder for the neurons to subsequently communicate.
A chemical called dopamine is often produced in the brain when a reward is received. If dopamine is present in a synapse whilst a neuron is signaling to its neighbor, it can affect how effectively this communication occurs. Brzosko et al. have now investigated whether dopamine can also change the synapses if it is applied after signaling has already happened.
The strengthening or weakening of synapses can be triggered by electrically stimulating the neurons on either side of a synapse at particular times. Brzosko et al. did this to neurons in slices of mouse brain, and then applied dopamine to the neurons. The results suggest that dopamine can reverse synaptic weakening and can even cause the synapses to strengthen. However, the dopamine had to be applied immediately after stimulation to be able to strengthen the synapse. The next challenge is to establish whether this change in synaptic strength is responsible for the change in behavior.
Introduction
Spike timing-dependent plasticity (STDP) is a physiologically relevant form of Hebbian learning (Caporale and Dan, 2008). In its classic form, STDP depends on the order and precise timing of presynaptic and postsynaptic spikes: pre-before-post spike pairings induce timing-dependent long-term potentiation (t-LTP), whereas post-before-pre pairings induce timing-dependent long-term depression (t-LTD) (Markram et al., 1997; Bi and Poo, 1998). However, the quantitative rules of STDP are profoundly influenced by neuromodulators (Seol et al., 2007; Pawlak et al., 2010), including dopamine (DA) (Zhang et al., 2009; Edelmann and Lessmann, 2011; Yang and Dani, 2014). Although it is well established that reward-motivated behavior depends on the activity of DA neurons (Schultz et al., 1997; Suri and Schultz, 1999; Pan et al., 2005), the mechanisms that associate specific experiences with rewarding outcomes, which typically occur after a delay, are not well understood. This is referred to as the distal reward problem (Hull, 1943). To address this problem, here, we examined whether DA modulates STDP not only when applied during, but also—more importantly—when applied after the pairing event.
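As an illustration of the classic STDP rule described above, the following minimal sketch implements the textbook exponential timing window. The amplitudes and time constants are hypothetical placeholders chosen for readability, not values measured in this study, and the function name is ours.

```python
import math

# Hypothetical parameters for illustration only; not taken from this study.
A_PLUS = 1.0          # peak potentiation (arbitrary units)
A_MINUS = 0.5         # peak depression (arbitrary units)
TAU_PLUS_MS = 20.0    # decay constant of the potentiation window
TAU_MINUS_MS = 20.0   # decay constant of the depression window


def stdp_weight_change(delta_t_ms: float) -> float:
    """Classic exponential STDP window.

    delta_t_ms is the postsynaptic spike time minus the presynaptic time.
    Positive values (pre-before-post) give potentiation; negative values
    (post-before-pre) give depression.
    """
    if delta_t_ms > 0:
        return A_PLUS * math.exp(-delta_t_ms / TAU_PLUS_MS)
    if delta_t_ms < 0:
        return -A_MINUS * math.exp(delta_t_ms / TAU_MINUS_MS)
    return 0.0


for dt in (10.0, -10.0, -20.0):
    print(f"dt = {dt:+.0f} ms -> dw = {stdp_weight_change(dt):+.3f}")
```

This sketch captures only the baseline rule; it does not model any neuromodulatory influence on the window.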
Results and discussion
We first sought to corroborate the shape of the STDP induction curve by varying the time interval between the presynaptic and postsynaptic activity (Δt; Figure 1A,B,G). To this end, we monitored excitatory postsynaptic potentials (EPSPs) that were evoked by extracellular stimulation of the Schaffer-collateral-CA1 pathway during whole-cell recordings of CA1 pyramidal cells in mouse horizontal slices (postnatal days 12–18; ‘Materials and methods’). Plasticity was induced in current clamp mode using an induction protocol that involved 100 pairings of a single EPSP followed by a single postsynaptic spike (t-LTP; Figure 1A) or a single postsynaptic spike followed by a single EPSP (t-LTD; Figure 1B) at 0.2 Hz. Consistent with previous studies (Bi and Poo, 1998; Zhang et al., 2009; Edelmann and Lessmann, 2011), the pre-before-post pairing protocol with Δt = +10 ms induced t-LTP (182 ± 14%; t(4) = 5.6, p = 0.0049 vs 100%, n = 5; Figure 1A) and the post-before-pre pairing protocol with Δt = −20 ms induced t-LTD (61 ± 9%; t(4) = 4.4, p = 0.0121 vs 100%, n = 5; Figure 1B). Surprisingly, however, we found that the post-before-pre pairing protocol with Δt = −10 ms, instead of inducing t-LTD, elicited robust t-LTP (202 ± 21%; t(6) = 5.0, p = 0.0025 vs 100%, n = 7; Figure 1C,D,G). This contrasts with previous reports from hippocampal cultures (Bi and Poo, 1998; Zhang et al., 2009) and acute slices (Edelmann and Lessmann, 2011; Yang and Dani, 2014), where post-before-pre pairing protocols never elicited synaptic potentiation in baseline conditions. Given that DA has been found to widen the time window for the induction of t-LTP (Zhang et al., 2009; Yang and Dani, 2014), we wanted to assess whether endogenous DA could be responsible for the potentiation observed with the post-before-pre pairing under our experimental conditions. Therefore, we repeated this set of experiments using the post-before-pre pairing protocol with Δt = −10 ms in the presence of DA receptor (DAR) antagonists.
Indeed, combined application of the D1/D5 receptor antagonist SCH23390 (10 μM) and D2-like receptor antagonist sulpiride (50 μM) from the start of the recordings prevented t-LTP and enabled t-LTD instead (72 ± 8%; t(5) = 3.6, p = 0.0160 vs 100%, n = 6; Figure 1C,D,G), rendering the STDP induction curve similar to that observed in hippocampal cultures (Bi and Poo, 1998; Zhang et al., 2009). These results suggest a modulatory action of endogenous DA, presumably released during the pairing protocol (Frey et al., 1990; Yang and Dani, 2014), which resulted in the changed polarity of plasticity at narrow negative spike-timing intervals.
Next, we wanted to examine whether the application of exogenous DA during pairing at negative spike-timing intervals facilitates synaptic potentiation. Indeed, the post-before-pre pairing protocol with Δt = −20 ms, which elicited robust t-LTD in control condition (74 ± 9%, t(8) = 3.0, p = 0.0165 vs 100%, n = 9; Figure 1E–G), induced significant t-LTP when exogenous DA (20 μM) was bath-applied for 10–12 min from 2 min before and during the post-before-pre pairings in interleaved experiments (144 ± 12%; t(5) = 3.7, p = 0.0148 vs 100%, n = 6; Figure 1E–G). Therefore, in accordance with previous findings (Zhang et al., 2009; Yang and Dani, 2014), the presence of DA during the coordinated spiking activity widens the spike time interval for induction of t-LTP.
A crucial aspect of reinforcement learning models is the ability of the reinforcing signal (DA) to strengthen active synapses, even when it arrives after the activity (Sutton and Barto, 1981; Izhikevich, 2007). To test this hypothesis experimentally, we applied DA after the t-LTD induction protocol. Exogenous DA (20 μM) added to the perfusion system for 10–12 min starting within 1 min after the post-before-pre pairing protocol with Δt = −20 ms converted t-LTD into t-LTP (169 ± 16%, t(5) = 4.3, p = 0.0078 vs 100%, n = 6; Figure 2A,B). This implies that DA can have a retroactive effect allowing negative spike pairings to induce t-LTP. The specificity of this DAergic conversion of STDP was assessed using DAR antagonists. Indeed, combined application of DAR antagonists, SCH23390 (10 μM) and sulpiride (50 μM), prevented DA-induced conversion of t-LTD into t-LTP, resulting in significant t-LTD instead (63 ± 12%, t(5) = 3.0, p = 0.0289 vs 100%, n = 6, Δt = −20 ms). Thus, the conversion of t-LTD into t-LTP was due to specific DAR activation. Importantly, when the test pathway was not stimulated following the pairing protocol until after DA washout (stimulation resumed 15 min after pairing), robust t-LTD was induced (61 ± 5%, t(5) = 8.1, p = 0.0005 vs 100%, n = 6; Figure 2A,B). The effect of DA was, therefore, activity dependent, demonstrating that the reinforcing signal is capable of acting specifically on the active inputs. To exclude the possibility that DA by itself could potentiate the test pathway, control experiments with ongoing synaptic stimulation over 60 min at 0.2 Hz, but without pairing with postsynaptic action potentials, were performed. Consistent with earlier reports (Otmakhova and Lisman, 1999), DA had no significant effect on the basal Schaffer-collateral transmission (110 ± 7%, t(7) = 1.4, p = 0.2169 vs 100%, n = 8; Figure 2A,B).
Subsequently, we wanted to determine whether the observed DA-induced conversion of t-LTD into t-LTP depends on the timing of DA application following the post-before-pre protocol (Figure 3A–D). We found that delayed application of DA (10 or 30 min after t-LTD pairing protocol) failed to convert t-LTD into t-LTP. Application of DA 10 min after the post-before-pre pairing caused a reversal of t-LTD back to baseline (94 ± 9%, t(11) = 0.7, p = 0.5230 vs 100%, n = 12; Figure 3B,D), whereas application of DA 30 min after the post-before-pre pairing failed to influence t-LTD altogether (59 ± 12%, t(5) = 3.4, p = 0.0200 vs 100%, n = 6; Figure 3C,D). Whilst it has previously been reported that DA applied after the induction of low frequency stimulation-induced LTD can reduce the magnitude of synaptic depression (Mockett et al., 2007), our data demonstrate, for the first time to our knowledge, that DA can change the polarity of STDP when acting within a short time window following the induction protocol.
Finally, we aimed to explore the possible mechanisms underlying the DA-induced conversion of t-LTD into t-LTP. Both hippocampal t-LTP and t-LTD (Edelmann and Lessmann, 2011; Yang and Dani, 2014), as well as the modulation of STDP by DA (Zhang et al., 2009), require functional NMDA receptors. We, therefore, asked whether the DA-induced conversion of t-LTD into t-LTP is also NMDA receptor dependent. Application of the NMDA receptor antagonist d-2-amino-5-phosphonopentanoic acid (d-AP5, 50 μM) after the post-before-pre pairing protocol did not by itself affect the development of t-LTD (57 ± 12%, t(5) = 3.6, p = 0.0156 vs 100%, n = 6, Δt = −20 ms; Figure 4A,D). Nevertheless, DA in the presence of d-AP5 reversed t-LTD back to baseline, albeit failing to convert t-LTD into t-LTP (105 ± 12%, t(5) = 0.4, p = 0.6898 vs 100%, n = 6, Δt = −20 ms; Figure 4A,D). This suggests an important dissociation between two mechanisms involved in the DA-induced conversion of t-LTD into t-LTP, namely a reversal of synaptic depression (de-depression) and synaptic potentiation, of which only the latter is NMDA receptor dependent.
To investigate the intracellular signaling mechanisms involved, we initially set out to establish the DAR subtype associated with the DA-induced conversion of t-LTD into t-LTP. Even though combined application of D1-like and D2-like receptor antagonists completely blocked the DA effect (Figure 4—figure supplement 1A,D), application of either D1-like or D2-like receptor antagonist alone only partially prevented the conversion of t-LTD into t-LTP (SCH23390: 131 ± 16%; t(6) = 1.4, p = 0.2277 vs DA (t-LTP); t(6) = 3.6, p = 0.0113 vs control (t-LTD); t(6) = 1.9, p = 0.0994 vs 100%; n = 7. Sulpiride: 94 ± 13%; t(6) = 3.2, p = 0.0228 vs DA (t-LTP); t(6) = 1.5, p = 0.1932 vs control (t-LTD); t(6) = 0.5, p = 0.6435 vs 100%; n = 7; Figure 4—figure supplement 1B–D). This suggests that both receptor subtypes might mediate the retroactive effect of DA on STDP. Given the astounding complexity of D2-like receptor pharmacology (Neve et al., 2004), we first wanted to evaluate the possible involvement of D1-like receptor-activated cAMP/PKA signaling cascade in the DA-induced conversion of t-LTD into t-LTP. D1/D5 receptor stimulation leads to the activation of adenylyl cyclase (AC) and subsequent increase in cyclic adenosine monophosphate (cAMP) and protein kinase A (PKA) activation (Greengard et al., 1999; Neve et al., 2004). We found that the general AC activator, forskolin (50 μM), applied for 10–12 min immediately after the pairing protocol with Δt = −20 ms resulted in robust conversion of t-LTD into t-LTP (167 ± 17%, t(6) = 3.9, p = 0.0078 vs 100%, n = 7; Figure 4B,D). Notably, forskolin (50 μM) applied in the absence of the pairing protocol had no significant effect on baseline EPSPs (83 ± 12%, t(6) = 1.4, p = 0.2046 vs 100%, n = 7; Figure 4—figure supplement 2).
The forskolin-induced conversion of t-LTD into t-LTP was also NMDA receptor dependent since bath application of d-AP5 (50 μM) 1 min before forskolin treatment prevented synaptic potentiation from developing, although synaptic depression was reversed (102 ± 11%, t(5) = 0.2, p = 0.8616 vs 100%, n = 6; Figure 4B,D). This result suggests that the cAMP cascade works either upstream of or in parallel to NMDA receptor activation for the development of potentiation to occur. Downstream of cAMP, PKA is involved because the PKA inhibitor, H-89 (20 µM), blocked the DA-induced conversion of t-LTD into t-LTP, revealing significant t-LTD (76 ± 5%, t(6) = 5.0, p = 0.0025 vs 100%, n = 7; Figure 4C,D). Taken together, these results imply that the DA-induced conversion of t-LTD into t-LTP involves activation of the cAMP/PKA signaling cascade, which closely mimics the effects of DA (Figure 4A vs Figure 4B). Although D2-like receptors are typically associated with the inhibition of AC (Neve et al., 2004), there is also evidence that D2-like receptor stimulation can potentiate AC activity (Glass and Felder, 1997; Watts and Neve, 1997). Therefore, while the possibility that D2-like receptors contribute to the conversion of t-LTD into t-LTP via a different signaling cascade cannot be excluded, it is tempting to suggest that the DA-induced conversion of t-LTD into t-LTP is mediated primarily via the cAMP/PKA pathway. Hence, based on our results, we propose that the stimulation of postsynaptic (Figure 5Ai) or presynaptic (Figure 5Aii) DARs activates the cAMP/PKA pathway, which, via two cellular mechanisms (de-depression and potentiation), leads to the retroactive conversion of t-LTD into t-LTP.
Figure 5. Proposed mechanisms underlying the DA-induced conversion of t-LTD into t-LTP.
(A) Schematic diagram depicting core components of the proposed cellular mechanisms underlying the DA-induced conversion of t-LTD into t-LTP (de-depression and potentiation). (Ai) Model based on postsynaptic NMDAR-dependent potentiation (Bi and Poo, 1998; Caporale and Dan, 2008; Zhang et al., 2009; Edelmann and Lessmann, 2011; Yang and Dani, 2014) and metabotropic glutamate receptor (mGluR)-dependent depression (Otani and Connor, 1998; Kemp and Bashir, 1999; Huber et al., 2000). De-depression (red, left): Activation of G protein-coupled D1/D5 receptors stimulates AC, increasing cAMP and activating PKA (Greengard et al., 1999; Neve et al., 2004), which, via phosphorylation of I-1 (Ingebritsen and Cohen, 1983), reverses the PP1-induced dephosphorylation of synaptic AMPARs (Lee et al., 2000; Mockett et al., 2007). Potentiation (red, right): PKA activation enhances NMDAR function (Westphal et al., 1999; Chen and Roche, 2007). (Aii) Model based on presynaptic depression (Bolshakov and Siegelbaum, 1994; Siegel et al., 1994; Oliet et al., 1997; Charton et al., 1999; Watabe et al., 2002; Jourdain et al., 2007). De-depression (red, left): Activation of presynaptic DA receptors stimulates AC, increasing cAMP and activating PKA (Greengard et al., 1999; Neve et al., 2004), which reverses the calcineurin-dependent presynaptic depression. Potentiation (red, right): as in Ai. Arrow indicates activation/phosphorylation, blunt-ended line indicates inhibition/dephosphorylation.
Abbreviations: AMPAR, AMPA-type glutamate receptor; NMDAR, NMDA-type glutamate receptor; mGluR1/5, group I metabotropic glutamate receptor; DAR, dopamine receptor; AC, adenylate cyclase; cAMP, cyclic adenosine monophosphate; PKA, protein kinase A; I-1, inhibitor 1; PP1, protein phosphatase 1; PLC, phospholipase C; IP3, inositol 1,4,5-trisphosphate; ER, endoplasmic reticulum; DAG, diacylglycerol; eCB, endocannabinoid; CB1R, cannabinoid receptor type 1; CN, calcineurin; CaMKII, calcium-calmodulin-dependent protein kinase II. (B) Schematic diagram of synaptic and behavioral timescales in reward learning. During Exploration, the activity-dependent modification of synaptic strength due to spike timing-dependent plasticity (STDP) depends on the coordinated spiking between presynaptic and postsynaptic neurons on a millisecond time scale. Post-before-pre pairing leads to synaptic depression that develops gradually on a scale of minutes. When Reward, signaled via dopamine, follows Exploration with a Delay of seconds to minutes, synaptic depression is converted into potentiation.
The functional implications of our finding depend on the presynaptic source of DA in the hippocampus, which remains controversial because of the apparent discrepancy between DAergic terminals and receptors (Scatton et al., 1980; Gasbarri et al., 1997; Lisman and Grace, 2005). On one hand, it has been argued that noradrenergic terminals from the locus coeruleus, which mediates arousal and the optimization of behavioral performance (Aston-Jones and Cohen, 2005; Chamberlain and Robbins, 2013), may provide a major source of DA release (Smith and Greene, 2012). On the other hand, hippocampal pyramidal cell assemblies are directly affected by concurrent activity in midbrain DAergic neurons (McNamara et al., 2014), which have been linked to reward-seeking behavior (Schultz et al., 1997) and appetitive stimuli (Mirenowicz and Schultz, 1996; Fiorillo et al., 2013). This latter finding (McNamara et al., 2014) not only supports the hypothesis that hippocampal DA is relevant for reward processing, but is also consistent with our result that the activation of DAergic receptors during coordinated spiking activity changes the functional outcome of STDP (Figure 1C–G).
Previous studies examining reinforcement learning at the level of synaptic plasticity have shown that neuromodulators can affect timing-dependent plasticity in locust (Cassenaer and Laurent, 2012) and spine structural plasticity in striatal medium spiny neurons in mice (Yagishita et al., 2014) when acting within a delay time window of 1 s. While this narrow temporal detection window may be important in the striatum, it cannot account for the experimental evidence from behavioral studies of response acquisition with an extended reinforcement delay in rats and pigeons (Lattal and Gleeson, 1990; Sutphin et al., 1998), rhesus monkeys (Galuska and Woods, 2005), and humans (Okouchi, 2009). In contrast, our finding demonstrates that DA can modulate STDP in the CA1 with a reinforcement delay of at least 1 min (Figure 5B). Such an extended reinforcement delay is likely to be particularly important in hippocampus-dependent learning during spatial exploration.
In conclusion, our work demonstrates a retroactive effect of DA on STDP—converting t-LTD into t-LTP. This effect is mediated at least in part through the activation of the cAMP/PKA cascade and requires the activation of synaptic NMDA receptors. This in turn suggests that the conversion can only occur at synapses that are re-activated following the initial pairing event. Interestingly, it has been reported that hippocampal reactivation events (sharp wave ripples) increase in frequency following reward (Singer and Frank, 2009; Atherton et al., 2015). Thus, in behaving animals, the conditions for the conversion of depression into potentiation might occur during reward-related sharp wave ripple activity. Together, these findings support the concept of a slowly decaying synaptic eligibility trace that is committed to memory by the occurrence of reward and provide a possible mechanism for associating specific experiences with behaviorally distant, rewarding outcomes in animals (Sutton and Barto, 1981; Suri and Schultz, 1999; Pan et al., 2005; Izhikevich, 2007; Harnett et al., 2009), including humans (Dunsmoor et al., 2015).
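The notion of a decaying eligibility trace can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the trace decays exponentially and that the final synaptic outcome interpolates between a depressed and a potentiated level according to how much trace remains when DA arrives. The exponential form, the time constant and both plateau levels are hypothetical choices; they are not fitted to the data reported here.

```python
import math

# All three constants are hypothetical and chosen only to make the
# qualitative pattern visible; they are not fitted to the reported data.
LTD_LEVEL_PCT = 60.0    # assumed outcome when DA arrives after the trace is gone
LTP_LEVEL_PCT = 170.0   # assumed outcome when DA arrives with the trace intact
TRACE_TAU_MIN = 8.0     # assumed decay time constant of the trace (minutes)


def trace_remaining(delay_min: float) -> float:
    """Fraction of the eligibility trace left after delay_min minutes."""
    return math.exp(-delay_min / TRACE_TAU_MIN)


def predicted_epsp_pct(delay_min: float) -> float:
    """Toy prediction of EPSP slope (% of baseline) for a given DA delay."""
    remaining = trace_remaining(delay_min)
    return LTD_LEVEL_PCT + (LTP_LEVEL_PCT - LTD_LEVEL_PCT) * remaining


for delay in (1.0, 10.0, 30.0):
    print(f"DA after {delay:>4.0f} min -> {predicted_epsp_pct(delay):.0f}% of baseline")
```

With these assumed values the toy model gives potentiation for a short delay, a near-baseline outcome for an intermediate delay and depression for a long delay, which is the qualitative ordering described in the text. Other decay shapes would serve equally well as illustrations.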
Materials and methods
Animals
Wild-type mice (C57BL/6; postnatal days 12–18; from Harlan, Bicester, UK or Central Animal Facility, Physiological Laboratory, Cambridge University) of both sexes were housed on a 12-hr light/dark cycle at 19–23 °C, with water and food ad libitum. Experimental procedures and animal use were in accordance with the animal care guidelines of the UK Animals (Scientific Procedures) Act 1986 under personal and project licenses held by the authors. Caution was taken to minimize stress and the number of animals used in the experiments.
Slice preparation
Mice were anesthetized with isoflurane and decapitated. The brain was rapidly removed, glued to the stage of a vibrating microtome (Leica VT 1200S, Leica Biosystems, Wetzlar, Germany) and immersed in ice-cold artificial cerebrospinal fluid (ACSF) containing the following (mM): 126 NaCl, 3 KCl, 26.4 NaHCO3, 1.25 NaH2PO4, 2 MgSO4, 2 CaCl2, and 10 glucose. The ACSF solution, with pH adjusted to 7.2 and osmolarity to 270–290 mOsm l−1, was continuously bubbled with carbogen gas (95% O2/5% CO2). The brain was sectioned into 350-μm-thick horizontal slices. The slices were incubated in ACSF at room temperature in a submerged-style storage chamber for at least 1 hr. For recordings (1–7 hr after slicing), individual slices were transferred to an immersion-type recording chamber, perfused with ACSF (2 ml min−1) at 24–26 °C.
Electrophysiology
Whole-cell recordings
Whole-cell patch-clamp recordings were performed on CA1 pyramidal neurons (located adjacent to the stratum oriens). For stimulation of Schaffer collaterals, a monopolar stimulation electrode was placed in the stratum radiatum of the CA1 subfield. The hippocampal subfields were visually identified using infrared differential interference contrast (DIC) microscopy. Patch pipettes (resistance: 4–8 MΩ) were made from borosilicate glass capillaries (0.68 mm inner diameter, 1.2 mm outer diameter), pulled using a P-97 Flaming/Brown micropipette puller (Sutter Instruments Co., Novato, California, USA). The internal solution of patch pipettes was (mM) 110 potassium gluconate, 4 NaCl, 40 HEPES, 2 ATP-Mg, 0.3 GTP (pH adjusted with 1 M KOH to 7.2, and osmolarity with ddH2O to 270 mOsm l−1). The liquid junction potential was not corrected for. Cells were accepted for experiment only if the resting membrane potential at the start of the recording was between −55 and −70 mV. Membrane potential was held at −70 mV throughout further recording by direct current application via the recording electrode. At the beginning of each recording all cells were tested for regular spiking responses to positive current steps—characteristic of pyramidal neurons.
Stimulation protocol
EPSPs of amplitude between 3 and 8 mV were evoked at 0.2 Hz by adjusting the magnitude of direct current pulses (stimulus duration 50 μs, intensity 100 μA–1 mA). After a stable EPSP baseline period of at least 10 min, STDP was induced by repeated pairings of single presynaptic EPSP evoked by stimulation of Schaffer collaterals and single postsynaptic action potential elicited with the minimum somatic current pulse (1–1.8 nA, 3 ms) via the recording electrode. Pairings were repeated 100 times at 0.2 Hz. Spike-timing intervals (∆t in ms) were measured between the onset of the EPSP and the onset of the action potential. The EPSPs were monitored for at least 40 min after the end of the pairing protocol. Presynaptic stimulation frequency remained constant throughout the experiment.
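The timing of the induction protocol can be written out explicitly. The following minimal sketch generates the event times for 100 pairings at 0.2 Hz for a given Δt. It assumes, for simplicity, that the presynaptic stimulus time stands in for EPSP onset and that the postsynaptic event time stands in for action potential onset, ignoring synaptic latency and the delay between current injection and spike; the function name is hypothetical.

```python
from typing import List, Tuple


def pairing_schedule(delta_t_ms: float,
                     n_pairings: int = 100,
                     rate_hz: float = 0.2) -> List[Tuple[float, float]]:
    """Return (presynaptic_time_s, postsynaptic_time_s) for each pairing.

    Times are relative to the first presynaptic stimulus.
    delta_t_ms > 0 means pre-before-post; delta_t_ms < 0 means post-before-pre.
    """
    interval_s = 1.0 / rate_hz
    schedule = []
    for i in range(n_pairings):
        pre_s = i * interval_s
        post_s = pre_s + delta_t_ms / 1000.0
        schedule.append((pre_s, post_s))
    return schedule


schedule = pairing_schedule(delta_t_ms=-20.0)
first_pre, first_post = schedule[0]
last_pre, last_post = schedule[-1]
print(f"{len(schedule)} pairings, last presynaptic stimulus at {last_pre:.0f} s")
```

For a post-before-pre protocol the postsynaptic time of the first pairing is negative, because it precedes the first presynaptic stimulus that defines time zero.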
Data acquisition and data analysis
Voltage signals were low-pass filtered at 2 kHz using an Axon Multiclamp 700B amplifier (Molecular Devices, Sunnyvale, California, USA). Data were acquired at 5 kHz via an ITC18 interface board (Instrutech, Port Washington, New York, USA), transmitting to a Dell computer running the Igor Pro software (WaveMetrics, Lake Oswego, Oregon, USA). All experiments were carried out in the current clamp (‘bridge’) mode. Series resistance was monitored (10–15 MΩ) and compensated for by adjusting the bridge balance. Data were discarded if series resistance changed by more than 30%. Data were analyzed using Igor Pro. EPSP slopes were measured on the rising phase of the EPSP as a linear fit between the time points corresponding to 25–30% and 70–75% of the peak amplitude. For statistical analysis, the mean EPSP slope per minute of the recording was calculated from 12 consecutive sweeps and normalized to the baseline. Normalized EPSP slopes from the last 5 min of the baseline (immediately before pairing) and from the last 5 min of the recording (35–40 min or 55–60 min after pairing) were averaged. The magnitude of plasticity, as an indicator of synaptic change, was defined as the average EPSP slope after pairing expressed as a percentage of the average EPSP slope during baseline.
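The plasticity measure described above reduces to a short calculation. The sketch below is a minimal Python illustration (the analysis itself was carried out in Igor Pro); it bins per-sweep EPSP slopes into per-minute means of 12 sweeps and expresses the last 5 min after pairing as a percentage of the last 5 min of baseline. Function names and the synthetic slope values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence

SWEEPS_PER_MIN = 12  # 0.2 Hz stimulation gives one sweep every 5 s
WINDOW_MIN = 5       # last 5 min of the baseline and of the recording


def per_minute_means(slopes: Sequence[float]) -> List[float]:
    """Average consecutive groups of 12 sweeps into one value per minute."""
    n_minutes = len(slopes) // SWEEPS_PER_MIN
    return [
        mean(slopes[i * SWEEPS_PER_MIN:(i + 1) * SWEEPS_PER_MIN])
        for i in range(n_minutes)
    ]


def plasticity_magnitude(baseline_slopes: Sequence[float],
                         post_slopes: Sequence[float]) -> float:
    """EPSP slope after pairing as a percentage of the baseline slope."""
    baseline = per_minute_means(baseline_slopes)
    post = per_minute_means(post_slopes)
    if len(baseline) < WINDOW_MIN or len(post) < WINDOW_MIN:
        raise ValueError("need at least 5 full minutes in each epoch")
    baseline_level = mean(baseline[-WINDOW_MIN:])
    post_level = mean(post[-WINDOW_MIN:])
    return 100.0 * post_level / baseline_level


# Synthetic, noise-free example (hypothetical slopes in mV/ms):
# 10 min of baseline followed by 40 min of post-pairing recording.
baseline_sweeps = [2.0] * (10 * SWEEPS_PER_MIN)
post_sweeps = [1.2] * (40 * SWEEPS_PER_MIN)
magnitude = plasticity_magnitude(baseline_sweeps, post_sweeps)
print(f"{magnitude:.0f}% of baseline")
```

Because the result is a ratio of two averages, normalizing every per-minute value to the baseline beforehand does not change it; the explicit normalization step is therefore omitted here.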
Drugs
The following drugs were used: dopamine hydrochloride 20 μM, forskolin 50 μM, d-AP5 50 μM, SCH23390 hydrochloride 10 μM, sulpiride 50 μM, H-89 20 µM. All drugs (purchased from Sigma–Aldrich, Dorset, United Kingdom; Tocris Bioscience, Bristol, United Kingdom; or Abcam, Cambridge, United Kingdom) were bath-applied through the perfusion system by dilution of concentrated stock solutions (prepared in water or DMSO) in ACSF.
Statistical analysis
Statistical comparisons were made using one-sample two-tailed or paired two-tailed Student's t-test, with a significance level of α = 0.05. Data are presented as mean ± s.e.m. Significance levels are indicated by *p < 0.05, **p < 0.01, ***p < 0.001.
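For readers who wish to reproduce the one-sample comparison against 100%, the following sketch computes the mean, s.e.m., t statistic and two-tailed p-value using only the Python standard library. The p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is our choice for a dependency-free illustration rather than the procedure used for the published analysis; the example data are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # probability mass between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def one_sample_t(values: Sequence[float],
                 mu0: float = 100.0) -> Tuple[float, float, float, int, float]:
    """Return (mean, s.e.m., t, df, two-tailed p) for a test of mean == mu0."""
    n = len(values)
    m = mean(values)
    sem = stdev(values) / math.sqrt(n)
    t_stat = (m - mu0) / sem
    df = n - 1
    return m, sem, t_stat, df, two_tailed_p(t_stat, df)


# Hypothetical plasticity magnitudes (% of baseline) from six cells.
example = [55.0, 70.0, 62.0, 48.0, 75.0, 60.0]
m, sem, t_stat, df, p = one_sample_t(example)
print(f"{m:.0f} ± {sem:.0f}%; t({df}) = {abs(t_stat):.1f}, p = {p:.4f} vs 100%")
```

In practice a statistics package would normally be used for the p-value; the explicit integration is shown only to make the relation between t, the degrees of freedom and p transparent.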
Acknowledgements
This research was supported by a studentship from the Medical Research Council (UK) to ZB, the School of Biological Sciences at the University of Cambridge, and the Wellcome Trust (WS).
Funding Statement
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Funding Information
This paper was supported by the following grants:
Medical Research Council (MRC) Graduate studentship to Zuzanna Brzosko.
University of Cambridge Student research support fund to Zuzanna Brzosko, Wolfram Schultz, Ole Paulsen.
Additional information
Competing interests
WS: Reviewing editor, eLife.
The other authors declare that no competing interests exist.
Author contributions
ZB, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article.
WS, Conception and design, Drafting or revising the article.
OP, Conception and design, Drafting or revising the article.
Ethics
Animal experimentation: Experimental procedures and animal use were in accordance with the animal care guidelines of the UK Animals (Scientific Procedures) Act 1986 under Home Office personal license PIL- ICB486697 and project license PPL80/2440 held by the authors. Caution was taken to minimize stress and the number of animals used in experiments.
References
- Aston-Jones G, Cohen JD. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annual Review of Neuroscience. 2005;28:403–450. doi: 10.1146/annurev.neuro.28.061604.135709.
- Atherton LA, Dupret D, Mellor JR. Memory trace replay: the shaping of memory consolidation by neuromodulation. Trends in Neurosciences. 2015;38:560–570. doi: 10.1016/j.tins.2015.07.004.
- Bi GQ, Poo MM. Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. The Journal of Neuroscience. 1998;18:10464–10472. doi: 10.1523/JNEUROSCI.18-24-10464.1998.
- Bolshakov VY, Siegelbaum SA. Postsynaptic induction and presynaptic expression of hippocampal long-term depression. Science. 1994;264:1148–1152. doi: 10.1126/science.7909958.
- Caporale N, Dan Y. Spike timing-dependent plasticity: a Hebbian learning rule. Annual Review of Neuroscience. 2008;31:25–46. doi: 10.1146/annurev.neuro.31.060407.125639.
- Cassenaer S, Laurent G. Conditional modulation of spike-timing-dependent plasticity for olfactory learning. Nature. 2012;482:47–52. doi: 10.1038/nature10776.
- Chamberlain SR, Robbins TW. Noradrenergic modulation of cognition: therapeutic implications. Journal of Psychopharmacology / British Association for Psychopharmacology. 2013;27:694–718. doi: 10.1177/0269881113480988.
- Charton JP, Herkert M, Becker CM, Schroder H. Cellular and subcellular localization of the 2B-subunit of the NMDA receptor in the adult rat telencephalon. Brain Research. 1999;816:609–617. doi: 10.1016/S0006-8993(98)01243-8.
- Chen BS, Roche KW. Regulation of NMDA receptors by phosphorylation. Neuropharmacology. 2007;53:362–368. doi: 10.1016/j.neuropharm.2007.05.018.
- Dunsmoor JE, Murty VP, Davachi L, Phelps EA. Emotional learning selectively and retroactively strengthens memories for related events. Nature. 2015;520:345–348. doi: 10.1038/nature14106.
- Edelmann E, Lessmann V. Dopamine modulates spike timing-dependent plasticity and action potential properties in CA1 pyramidal neurons of acute rat hippocampal slices. Frontiers in Synaptic Neuroscience. 2011;3:6. doi: 10.3389/fnsyn.2011.00006.
- Fiorillo CD, Song MR, Yun SR. Multiphasic temporal dynamics in responses of midbrain dopamine neurons to appetitive and aversive stimuli. The Journal of Neuroscience. 2013;33:4710–4725. doi: 10.1523/JNEUROSCI.3883-12.2013.
- Frey U, Schroeder H, Matthies H. Dopaminergic antagonists prevent long-term maintenance of posttetanic LTP in the CA1 region of rat hippocampal slices. Brain Research. 1990;522:69–75. doi: 10.1016/0006-8993(90)91578-5.
- Galuska CM, Woods JH. Acquisition of cocaine self-administration with unsignaled delayed reinforcement in rhesus monkeys. Journal of the Experimental Analysis of Behavior. 2005;84:269–280. doi: 10.1901/jeab.2005.99-04.
- Gasbarri A, Sulli A, Packard MG. The dopaminergic mesencephalic projections to the hippocampal formation in the rat. Progress in Neuro-psychopharmacology & Biological Psychiatry. 1997;21:1–22. doi: 10.1016/S0278-5846(96)00157-1.
- Glass M, Felder CC. Concurrent stimulation of cannabinoid CB1 and dopamine D2 receptors augments cAMP accumulation in striatal neurons: evidence for a Gs linkage to the CB1 receptor. The Journal of Neuroscience. 1997;17:5327–5333. doi: 10.1124/jpet.108.145425.
- Greengard P, Allen PB, Nairn AC. Beyond the dopamine receptor: the DARPP-32/protein phosphatase-1 cascade. Neuron. 1999;23:435–447. doi: 10.1016/S0896-6273(00)80798-9.
- Harnett MT, Bernier BE, Ahn KC, Morikawa H. Burst-timing-dependent plasticity of NMDA receptor-mediated transmission in midbrain dopamine neurons. Neuron. 2009;62:826–838. doi: 10.1016/j.neuron.2009.05.011.
- Huber KM, Kayser MS, Bear MF. Role for rapid dendritic protein synthesis in hippocampal mGluR-dependent long-term depression. Science. 2000;288:1254–1256. doi: 10.1126/science.288.5469.1254.
- Hull CL. Principles of behavior: an introduction to behavior theory. Oxford, England: Appleton Century; 1943.
- Ingebritsen TS, Cohen P. Protein phosphatases: properties and role in cellular regulation. Science. 1983;221:331–338. doi: 10.1126/science.6306765.
- Izhikevich EM. Solving the distal reward problem through linkage of STDP and dopamine signalling. Cerebral Cortex. 2007;17:2443–2452. doi: 10.1093/cercor/bhl152.
- Jourdain P, Bergersen LH, Bhaukaurally K, Bezzi P, Santello M, Domercq M, Matute C, Tonello F, Gundersen V, Volterra A. Glutamate exocytosis from astrocytes controls synaptic strength. Nature Neuroscience. 2007;10:331–339. doi: 10.1038/nn1849.
- Kemp N, Bashir ZI. Induction of LTD in the adult hippocampus by the synaptic activation of AMPA/kainate and metabotropic glutamate receptors. Neuropharmacology. 1999;38:495–504. doi: 10.1016/S0028-3908(98)00222-6.
- Lattal KA, Gleeson S. Response acquisition with delayed reinforcement. Journal of Experimental Psychology. 1990;16:27–39. doi: 10.1037/0097-7403.16.1.27.
- Lee HK, Barbarosie M, Kameyama K, Bear MF, Huganir RL. Regulation of distinct AMPA receptor phosphorylation sites during bidirectional synaptic plasticity. Nature. 2000;405:955–959. doi: 10.1038/35016089.
- Lisman JE, Grace AA. The hippocampal-VTA loop: controlling the entry of information into long-term memory. Neuron. 2005;46:703–713. doi: 10.1016/j.neuron.2005.05.002.
- Markram H, Lubke J, Frotscher M, Sakmann B. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science. 1997;275:213–215. doi: 10.1126/science.275.5297.213.
- McNamara CG, Tejero-Cantero Á, Trouche S, Campo-Urriza N, Dupret D. Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nature Neuroscience. 2014;17:1658–1660. doi: 10.1038/nn.3843.
- Mirenowicz J, Schultz W. Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature. 1996;379:449–451. doi: 10.1038/379449a0. [DOI] [PubMed] [Google Scholar]
- Mockett BG, Guévremont D, Williams JM, Abraham WC. Dopamine D1/D5 receptor activation reverses NMDA receptor-dependent long-term depression in rat hippocampus. The Journal of Neuroscience. 2007;27:2918–2926. doi: 10.1523/JNEUROSCI.0838-06.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Neve KA, Seamans JK, Trantham-Davidson H. Dopamine receptor signalling. Journal of Receptor and Signal Transduction Research. 2004;24:165–205. doi: 10.1081/RRS-200029981. [DOI] [PubMed] [Google Scholar]
- Okouchi H. Response acquisition by humans with delayed reinforcement. Journal of the Experimental Analysis of Behavior. 2009;91:377–390. doi: 10.1901/jeab.2009.91-377. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Oliet SHR, Malenka RC, Nicoll RA. Two distinct forms of long-term depression coexist in CA1 hippocampal pyramidal cells. Neuron. 1997;18:1294–1297. doi: 10.1016/S0896-6273(00)80336-0. [DOI] [PubMed] [Google Scholar]
- Otani S, Connor JA. Requirement of rapid Ca2+ entry and synaptic activation of metabotropic glutamate receptors for the induction of long-term depression in adult rat hippocampus. The Journal of Physiology. 1998;511:761–770. doi: 10.1111/j.1469-7793.1998.761bg.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Otmakhova NA, Lisman JE. Dopamine selectively inhibits the direct cortical pathway to the CA1 hippocampal region. The Journal of Neuroscience. 1999;19:1437–1445. doi: 10.1523/JNEUROSCI.19-04-01437.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pan W-X, Schmidt R, Wickens JR, Hyland BI. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. The Journal of Neuroscience. 2005;25:6235–6242. doi: 10.1523/JNEUROSCI.1478-05.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pawlak V, Wickens JR, Kirkwood A, Kerr JND. Timing is not everything: neuromodulation opens the STDP gate. Frontiers in Synaptic Neuroscience. 2010;2:146. doi: 10.3389/fnsyn.2010.00146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Scatton B, Simon H, Le Moal M, Bischoff S. Origin of dopaminergic innervation of the rat hippocampal formation. Neuroscience Letters. 1980;18:125–131. doi: 10.1016/0304-3940(80)90314-6. [DOI] [PubMed] [Google Scholar]
- Schultz W, Dayan P, Montague RR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593. [DOI] [PubMed] [Google Scholar]
- Seol GH, Ziburkus J, Huang S, Song L, Kim IT, Takamiya K, Huganir RL, Lee HK, Kirkwood A. Neuromodulators control the polarity of spike-timing-dependent synaptic plasticity. Neuron. 2007;55:919–929. doi: 10.1016/j.neuron.2007.08.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Siegel SJ, Brose N, Janssen WG, Gasic GP, Jahn R, Heinemann SF, Morrison JH. Regional, cellular, and ultrastructural distribution of N-methyl-d-aspartate receptor subunit 1 in monkey hippocampus. Proceedings of the National Academy of Sciences of USA. 1994;91:564–568. doi: 10.1073/pnas.91.2.564. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Singer AS, Frank LM. Rewarded outcomes enhance reactivation of experience in the hippocampus. Neuron. 2009;64:910–921. doi: 10.1016/j.neuron.2009.11.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith CC, Greene RW. CNS dopamine transmission mediated by noradrenergic innervation. The Journal of Neuroscience. 2012;32:6072–6080. doi: 10.1523/JNEUROSCI.6486-11.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Suri R, Schultz W. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience. 1999;91:871–890. doi: 10.1016/S0306-4522(98)00697-6. [DOI] [PubMed] [Google Scholar]
- Sutphin G, Byrne T, Poling A. Response acquisition with delayed reinforcement: a comparison of two-lever procedures. Journal of the Experimental Analysis of Behavior. 1998;69:17–28. doi: 10.1901/jeab.1998.69-17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sutton RS, Barto AG. Toward a modern theory of adaptive networks: expectation and prediction. Psychological Review. 1981;88:135–170. doi: 10.1037/0033-295X.88.2.135. [DOI] [PubMed] [Google Scholar]
- Watabe AM, Carlisle HJ, O'Dell TJ. Postsynaptic induction and presynaptic expression of group 1 mGluR-dependent LTD in the hippocampal CA1 region. Journal of Neurophysiology. 2002;87:1395–1403. doi: 10.1152/jn.00723.2001. [DOI] [PubMed] [Google Scholar]
- Watts VJ, Neve KA. Activation of type II adenylate cyclase by D2 and D4 but not D3 dopamine receptors. Molecular Pharmacology. 1997;52:181–186. doi: 10.1124/mol.52.2.181. [DOI] [PubMed] [Google Scholar]
- Westphal RS, Tavalin SJ, Lin JW, Alto NM, Fraser IDC, Langeberg LK, Sheng M, Scott JD. Regulation of NMDA receptors by an associated phosphatase-kinase signaling complex. Science. 1999;285:93–96. doi: 10.1126/science.285.5424.93. [DOI] [PubMed] [Google Scholar]
- Yagishita S, Hayashi-Tkagi A, Ellis-Davies GCR, Urakubo H, Ishii S, Kasai H. A critical time window for dopamine actions on the structural plasticity of dendritic spines. Science. 2014;345:1616–1620. doi: 10.1126/science.1255514. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yang K, Dani JA. Dopamine D1 and D5 receptors modulate spike timing-dependent plasticity at medial perforant path to dentate granule cell synapses. The Journal of Neuroscience. 2014;34:15888–15897. doi: 10.1523/JNEUROSCI.2400-14.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang JC, Lau PM, Bi GQ. Gain in sensitivity and loss in temporal contrast of STDP by dopaminergic modulation at hippocampal synapses. Proceedings of the National Academy of Sciences of USA. 2009;106:13028–13033. doi: 10.1073/pnas.0900546106. [DOI] [PMC free article] [PubMed] [Google Scholar]