Figure - PMC

. 2018 May 8;2:2398212818772179. doi: 10.1177/2398212818772179

© The Author(s) 2018

This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Figure 4. (a) Model circuit for the control of dopaminergic Now Print signals in response to unexpected rewards. Cortical inputs (I_i), activated by conditioned stimuli, learn to excite the SNc via a multi-stage pathway from the ventral striatum (S) to the ventral pallidum and then on to the PPTN (P) and the SNc (D). The inputs I_i excite the ventral striatum via adaptive weights W_iS, and the ventral striatum excites the PPTN via double inhibition through the ventral pallidum, with strength W_SP. When the PPTN activity exceeds a threshold GP, it excites the SNc with strength W_PD. The striosomes, which contain an adaptive spectral timing mechanism (x_ij, G_ij, Y_ij, Z_ij), learn to generate adaptively timed signals that inhibit reward-related activation of the SNc. Primary reward signals (I_R) from the lateral hypothalamus both excite the PPTN directly (with strength W_RP) and act as training signals to the ventral striatum S (with strength W_RS), where they train the weights W_iS. Arrowheads denote excitatory pathways, circles denote inhibitory pathways, and hemidiscs denote synapses at which learning occurs. Thick pathways denote dopaminergic signals. Reprinted with permission from Brown et al. (1999).

(b) Dopamine cell firing patterns. Left: data. Right: model simulation, showing model spikes and underlying membrane potential. (A) In naive monkeys, the dopamine cells fire a phasic burst when an unpredicted primary reward R occurs, for example when the monkey unexpectedly receives a burst of apple juice. (B) As the animal learns to expect the apple juice that reliably follows a sensory cue (conditioned stimulus, CS) at a fixed time interval, the phasic dopamine burst disappears at the expected time of reward, and a new burst appears at the time of the reward-predicting CS. (C) After learning, if the animal fails to receive reward at the expected time, a phasic depression, or dip, in dopamine cell firing occurs. Thus, these cells reflect an adaptively timed expectation of reward that cancels the expected reward at the expected time. The data are reprinted with permission from Schultz et al. (1997). The model simulations are reprinted with permission from Brown et al. (1999).

(c) Dopamine cell firing patterns. Left: data. Right: model simulation, showing model spikes and underlying membrane potential. (A) The dopamine cells learn to fire in response to the earliest consistent predictor of reward. When CS2 (instruction) consistently precedes the original CS (trigger) by a fixed interval, the dopamine cells learn to fire only in response to CS2. Data reprinted with permission from Schultz et al. (1993). (B) During training, the cell fires weakly in response to both the CS and reward. Data reprinted with permission from Ljungberg et al. (1992). (C) Temporal variability in reward occurrence: when reward is received later than predicted, a depression occurs at the time of predicted reward, followed by a phasic burst at the time of actual reward. (D) If reward occurs earlier than predicted, a phasic burst occurs at the time of actual reward. No depression follows, since the CS is released from working memory. Data in C and D reprinted with permission from Hollerman and Schultz (1998). (E) When there is random variability in the timing of primary reward across trials (e.g. when the reward depends on an operant response to the CS), the striosomal cells produce a Mexican Hat depression on either side of the dopamine spike. Data reprinted with permission from Schultz et al. (1993). Model simulation reprinted with permission from Brown et al. (1999).
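
The qualitative firing patterns described in panels (b) and (c) can be illustrated with a minimal Python sketch. This is not the Brown et al. (1999) model, which uses continuous-time dynamics and a spectral timing mechanism; it is a discrete-time toy that assumes unit-sized signals and treats the net dopamine response as excitatory drive (learned CS drive plus primary reward drive) minus a striosomal inhibition delivered only at the learned reward time. The function name, parameters, time steps and magnitudes are all hypothetical choices made for illustration.

```python
def dopamine_trace(n_steps, cs_time, reward_time, expected_time, w_cs, w_timed):
    """Toy net dopamine signal per time step (positive = burst, negative = dip).

    Illustrative assumptions, not taken from the source model:
    - w_cs:    learned excitatory drive produced by the CS
    - w_timed: learned striosomal inhibition delivered at expected_time
    - primary reward contributes a fixed excitation of 1.0
    """
    trace = []
    cs_in_memory = False  # the timed inhibition is only available while the CS is held
    for t in range(n_steps):
        if t == cs_time:
            cs_in_memory = True
        excitation = 0.0
        if t == cs_time:
            excitation += w_cs
        reward_now = reward_time is not None and t == reward_time
        if reward_now:
            excitation += 1.0
        # Adaptively timed inhibition arrives only at the learned reward time.
        inhibition = w_timed if (cs_in_memory and t == expected_time) else 0.0
        trace.append(excitation - inhibition)
        if reward_now:
            cs_in_memory = False  # reward delivery releases the CS from working memory
    return trace


# CS at step 2; reward is learned to arrive at step 8.
naive = dopamine_trace(12, 2, 8, 8, w_cs=0.0, w_timed=0.0)    # burst at reward only
learned = dopamine_trace(12, 2, 8, 8, w_cs=1.0, w_timed=1.0)  # burst moves to the CS
omitted = dopamine_trace(12, 2, None, 8, w_cs=1.0, w_timed=1.0)  # dip at expected time
late = dopamine_trace(12, 2, 10, 8, w_cs=1.0, w_timed=1.0)    # dip, then burst
early = dopamine_trace(12, 2, 6, 8, w_cs=1.0, w_timed=1.0)    # burst, no later dip

for name, trace in [("naive", naive), ("learned", learned), ("omitted", omitted),
                    ("late", late), ("early", early)]:
    print(name, trace)
```

In this sketch the early-reward case produces no dip only because the `cs_in_memory` flag is cleared on reward delivery, which mirrors the caption's statement that the CS is released from working memory. The weak responses during training and the Mexican Hat depression in panel (c) are not modeled here.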