Animals use the neurotransmitter dopamine to encode the relationship between their responses and reward. Reinforcement learning theory (1) successfully explains the role of phasic bursts of dopamine in terms of future reward maximization. Yet, dopamine clearly plays other roles in shaping behavior that have no obvious relationship to reinforcement learning, including modulating the rate at which our subjective sense of time advances relative to real time. On page 1273 of this issue, Soares et al. (2) closely examine the role of dopamine in mice performing a task in which they keep track of the time between two events and make decisions about this temporal duration. The results suggest the need to reassess the leading theory of dopamine function in timing—the dopamine clock hypothesis (3). They may also help explain empirical phenomena that challenge the reinforcement learning account of dopamine function.
Reinforcement learning theory posits that reward prediction errors inform animals about which behaviors to engage in so as to maximize future reward. Time is the key factor in the experiments that have linked reinforcement learning to dopamine (4). These experiments demonstrate that behavior is shaped by bursts of activity (“phasic activity”) in midbrain dopamine-secreting neurons. When a reward occurs at an unexpected time, a positive reward prediction error is generated (increased dopamine release signals that “things are better than expected”). Consequently, communication from cortical neurons to neurons in the striatum is altered to change behavior in ways that maximize reward. Conversely, a negative reward prediction error signaled by a decrease in dopamine release occurs when a reward is not provided when expected. In addition, if delivery of a reward reliably occurs at the predicted time, it produces no burst of dopamine release at that moment, but instead begins to produce a phasic dopamine signal in response to the earliest reward-predicting cue. Thus, future choices can be made so as to obtain the cue, and thereafter the reward. Eventually this process yields a temporally extended chain of cues and behaviors terminating in reward. However, time itself has typically been of less interest to reinforcement learning-dopamine researchers than the mechanisms that link these elements into chains of expected sensory inputs and planned motor outputs.
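The mechanics of this account can be made concrete with a minimal temporal-difference sketch in the spirit of (1). Everything in it (the ten-step cue-to-reward chain, the learning rate, the number of training episodes) is an assumption chosen for illustration rather than a detail of any experiment discussed here; it simply shows the prediction error vanishing at a fully predicted reward time and appearing instead at an unpredicted cue.

```python
# Minimal tabular TD(0) sketch of a reward prediction error (RPE).  The ten-step
# cue-to-reward chain, learning rate, and episode count are assumptions made for
# illustration; this is the generic textbook computation, not a model of any
# specific experiment discussed here.
import numpy as np

n_steps = 10        # time steps from cue onset (t = 0) to reward (t = n_steps - 1)
gamma = 1.0         # no temporal discounting, for simplicity
alpha = 0.1         # learning rate
episodes = 500

V = np.zeros(n_steps + 1)   # value of each time step, plus a terminal state

for _ in range(episodes):
    for t in range(n_steps):
        reward = 1.0 if t == n_steps - 1 else 0.0       # reward arrives at the predicted time
        delta = reward + gamma * V[t + 1] - V[t]        # prediction error: the "dopamine" signal
        V[t] += alpha * delta

# After training, the fully predicted reward no longer generates an error,
# while an unpredicted cue onset (jumping from a value of 0 to V[0]) does.
rpe_at_cue = 0.0 + gamma * V[0] - 0.0
rpe_at_reward = 1.0 + gamma * V[n_steps] - V[n_steps - 1]
print(f"RPE at cue onset ~ {rpe_at_cue:.2f}, RPE at predicted reward ~ {rpe_at_reward:.2f}")
```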
Yet, dopamine plays a key role in interval timing. The dopamine clock hypothesis holds that increased dopamine release speeds up an animal’s subjective sense of time—its internal clock. For example, rats treated with amphetamine, which enhances dopamine release, respond earlier than when they are tested without the drug. Curiously, a simple prediction of the dopamine clock hypothesis would seem to be that time doesn’t fly, but rather crawls, when you’re having fun. Unexpectedly pleasurable events boost dopamine release, which should cause your internal clock to run faster. Your subjective sense of time in that case grows faster than time itself, so that short intervals seem longer than they are. The dopamine clock hypothesis reconciles this counterintuitive prediction with the familiar experience that time flies by adding an assumption about attention: When things are good, attention to time is reduced, such that intervals seem shorter than they are (5).
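The logic behind earlier responding under a faster clock can be seen in a toy pacemaker-accumulator calculation. This is a sketch of the general idea only; the tick rates and the 10 s interval below are hypothetical values, not parameters drawn from (3) or (5).

```python
# A toy pacemaker-accumulator reading of the dopamine clock hypothesis.  The tick
# rates and the 10 s interval are hypothetical values for illustration, not
# parameters from refs. (3) or (5).
def subjective_duration(real_seconds: float, clock_hz: float) -> float:
    """Ticks accumulated over a real interval at a given pacemaker (clock) rate."""
    return real_seconds * clock_hz

# An animal trained at 5 ticks/s learns to respond after 50 ticks, i.e. at 10 real seconds.
criterion_ticks = subjective_duration(10.0, clock_hz=5.0)          # 50 ticks

# A drug that speeds the clock to 6 ticks/s reaches that criterion sooner in real time...
earlier_response_time = criterion_ticks / 6.0                       # ~8.3 s: responses come earlier

# ...and makes a fixed 10 s interval read as longer on the learned (baseline) scale.
perceived_seconds = subjective_duration(10.0, clock_hz=6.0) / 5.0   # 12 "subjective seconds"

print(round(earlier_response_time, 1), perceived_seconds)
```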
To clarify the role of dopamine in interval timing, Soares et al. investigated midbrain dopamine neuron activity in the substantia nigra pars compacta (SNc) of mice performing a timing task. They presented mice with two brief tones, and trained them to classify the interval between the tones as shorter or longer than a standard criterion. They then observed calcium influx into dopaminergic SNc neurons, which signals activity. Consistent with standard reinforcement learning theory, the authors observed bursts of activity in dopamine-synthesizing neurons that were locked to the second tone, reflecting the probability of an upcoming reward. This probability was greatest when the intertone duration was much shorter or much longer than the criterion—i.e., when the duration could be easily discriminated from the intermediate, criterion duration.
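The link between discriminability and reward probability can be illustrated with a small Monte Carlo sketch in which a noisy subjective estimate of each interval is compared to the criterion. The durations, criterion, and noise level below are placeholders chosen for illustration, not the values used by Soares et al.

```python
# Monte Carlo sketch of why reward probability tracks discriminability.  The six
# durations, the 1.5 s criterion, and the noise level are placeholders chosen
# for illustration; they are not the values used by Soares et al.
import numpy as np

rng = np.random.default_rng(0)
criterion = 1.5
durations = np.array([0.6, 1.05, 1.38, 1.62, 1.95, 2.4])   # hypothetical intertone intervals (s)
noise_sd = 0.3                                              # noise in the subjective estimate (s)

# Each trial: compare a noisy subjective estimate of the interval to the criterion.
estimates = durations[:, None] + rng.normal(0.0, noise_sd, size=(len(durations), 10_000))
chose_long = estimates > criterion
correct = np.where(durations[:, None] > criterion, chose_long, ~chose_long)

for d, acc in zip(durations, correct.mean(axis=1)):
    print(f"{d:.2f} s -> P(correct, hence reward) ~ {acc:.2f}")   # highest at the extreme durations
```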
Reward prediction errors are only prediction errors to the extent that they are surprising, however. An animal’s surprise about the arrival time of the second tone ought, therefore, to modulate the reward prediction error, and it did. The intertone interval was picked from a set of six durations, so that a mouse’s surprise about the second tone should become smaller as the intertone interval becomes longer—a longer wait for the second tone makes its imminent arrival more likely. The animal’s surprise ought therefore to decrease to 0 at the maximum intertone duration. Soares et al. demonstrate that dopamine bursts locked to the second tone were modulated by both reward probability and temporal surprise. Further, they found ramping dopaminergic neuron activity during the intertone interval that declined in the same way as the surprise function.
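This surprise function follows from the hazard rate of the second tone, that is, the probability that the tone arrives now given that it has not yet arrived. The sketch below uses a hypothetical set of six equally likely durations (the actual stimulus values may differ) to show the surprise falling to zero at the longest duration.

```python
# Sketch of the surprise computation as a hazard rate.  The six equally likely
# durations below are hypothetical stand-ins for the actual stimulus set.
import numpy as np

durations = np.array([0.6, 1.05, 1.38, 1.62, 1.95, 2.4])   # possible intertone intervals (s)
p = np.full(len(durations), 1.0 / len(durations))           # each duration equally probable

# Hazard: probability the second tone arrives at this duration, given that it
# has not arrived at any earlier one.
survival = 1.0 - np.concatenate(([0.0], np.cumsum(p)[:-1]))
hazard = p / survival

# One simple way to express surprise about the tone's arrival time.
surprise = 1.0 - hazard
for d, s in zip(durations, surprise):
    print(f"{d:.2f} s -> surprise {s:.2f}")   # declines to 0 at the maximum duration
```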
A faster subjective sense of time should produce a more rapidly declining surprise function, by definition. However, this seems to conflict with the dopamine clock hypothesis: Faster clocks should lead to more rapid declines in dopamine, but declines in dopamine should slow down the clock. This apparent inconsistency may rest on a lack of knowledge about how (6–8) or where (9–11) time is represented in the brain. Does a subjective time estimate computed outside the SNc drive the surprise function (12)? Or does the midbrain dopamine surprise signal drive a timer outside the SNc, as the dopamine clock hypothesis suggests? Soares et al. observed that lower intertone dopaminergic neuron activity correlated with a faster clock, not a slower clock, in contrast to the predictions of the dopamine clock hypothesis. Purely observational methods cannot establish causality, so the authors also used optogenetic methods that allow precise timing of perturbations to manipulate dopamine neuron activity on a subset of timing trials. Their results were unambiguous: dopaminergic neuron activity did indeed modulate internal clock speed, supporting a causal role for dopamine in regulating the subjective sense of time. However, rather than speeding up the internal clock, increased dopaminergic neuron activity slowed it down.
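Why should a faster subjective clock, by itself, steepen the real-time decline of the surprise function? The sketch below makes that first step explicit; the particular surprise curve and clock speeds are invented for illustration and carry no empirical weight.

```python
# Illustration (not data) of the first claim above: if the learned surprise curve
# is indexed by subjective elapsed time k * t, a faster clock (larger k) traverses
# the same curve in less real time, so surprise declines faster in real time.
import numpy as np

def surprise(subjective_t):
    """A smooth stand-in for a learned surprise curve over subjective time (arbitrary units)."""
    return np.clip(1.0 - subjective_t / 2.4, 0.0, 1.0)

real_t = np.linspace(0.0, 2.4, 7)
for k, label in [(1.0, "baseline clock"), (1.5, "faster clock  ")]:
    print(label, np.round(surprise(k * real_t), 2))   # the faster clock hits 0 sooner in real time
```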
How can these results be reconciled with the large body of data on dopamine’s role in subjective time estimation? One possibility is that, in contrast to the spatially and cell type-specific targeting of Soares et al., systemic dopamine manipulation alters cortical timing processes (13) in a different direction and to a greater degree than the SNc-striatum effects.
Although phasic dopamine release seems to fit the reinforcement learning story, tonic dopamine release seems to do something rather different from encoding a reward prediction error. Instead, it appears to control overall response vigor or motivation (14). These findings can be accommodated within a theory in which tonic and phasic dopamine represent different kinds of information. Yet findings of “quasi-tonic” ramping signals within individual trials of nontiming tasks (15) pose a challenge even to this compromise theory. Soares et al. similarly see “quasi-tonic” surprise signals that challenge the dopamine clock hypothesis. These results suggest that upward- and downward-ramping dopamine signals may prove essential to unifying the reinforcement learning and interval-timing literatures, resolving their internal inconsistencies, and clarifying dopamine’s true role in shaping behavior.
REFERENCES
1. Montague PR et al., J. Neurosci. 16, 1936 (1996).
2. Soares S et al., Science 354, 1273 (2016).
3. Meck WH, J. Exp. Psychol. Anim. Behav. Process. 9, 171 (1983).
4. Hollerman JR, Schultz W, Nat. Neurosci. 1, 304 (1998).
5. Lake JI, Meck WH, Neuropsychologia 51, 284 (2013).
6. Laje R, Buonomano DV, Nat. Neurosci. 16, 925 (2013).
7. Matell M, Meck WH, Cogn. Brain Res. 21, 139 (2004).
8. Simen P et al., Timing Time Percept. 1, 159 (2013).
9. Chubykin AA et al., Neuron 77, 723 (2013).
10. Wiener M et al., NeuroImage 49, 1728 (2010).
11. Jazayeri M, Shadlen MN, Curr. Biol. 25, 2599 (2015).
12. Takahashi YK et al., Neuron 91, 182 (2016).
13. Parker KL et al., J. Neurosci. 34, 16774 (2014).
14. Niv Y et al., Psychopharmacology (Berl.) 191, 507 (2007).
15. Howe MW et al., Nature 500, 575 (2013).