Author manuscript; available in PMC: 2009 Sep 4.
Published in final edited form as: Nat Neurosci. 2008 Nov 30;12(1):77–84. doi: 10.1038/nn.2233

Representation of negative motivational value in the primate lateral habenula

Masayuki Matsumoto 1, Okihide Hikosaka 1
PMCID: PMC2737828  NIHMSID: NIHMS140472  PMID: 19043410

Abstract

An action may lead to a reward or punishment. Therefore, an appropriate action needs to be chosen based on the values of both expected rewards and expected punishments. To understand the underlying neural mechanisms we conditioned monkeys using a Pavlovian procedure with two distinct contexts: one in which rewards were available, and another in which punishments were feared. We found that the population of lateral habenula neurons was most strongly excited by a conditioned stimulus associated with the most unpleasant event in each context: the absence of the reward or the presence of the punishment. The population of lateral habenula neurons was also excited by the punishment itself and inhibited by the reward itself, especially when they were less predictable. These results suggest that the lateral habenula has the potential to adaptively control both reward-seeking and punishment-avoidance behaviors, presumably through its projections to dopaminergic and serotonergic systems.


Making an appropriate choice of action requires computing the value for each action based on both expected rewards and expected punishments. A straightforward way to perform this computation would be to have neurons that represent both kinds of values. But are there such neurons in the brain?

Human imaging studies have reported that activity in several brain areas represents the values of both rewards and punishments1–4. However, it is possible that distinct types of single neurons in these areas represent the values separately for rewards and punishments. The only way to answer this question is to analyze the activity of single neurons while an animal behaves in the expectation of rewards and punishments. To date, several unit-recording studies have found value-coding neurons in several brain areas5–11. However, most of these studies were based on the manipulation of reward properties (i.e., size and/or probability). A small number of studies used both rewards and punishments as possible outcomes7, 12–14.

The lateral habenula, a brain structure located in the epithalamus, is in a good position to represent emotional and motivational events. It receives inputs from forebrain limbic regions15, 16 and projects to midbrain structures, such as the substantia nigra pars compacta and ventral tegmental area which contain dopamine neurons, and the raphe nuclei which contain serotonin neurons17. Thus, the lateral habenula could control the monoaminergic (especially dopaminergic and serotonergic) systems which influence emotion and motivation18–20. Indeed, electrical stimulation of the lateral habenula inhibits dopamine21, 22 and serotonin neurons23. Consistent with this view, the lateral habenula has been implicated in many emotional and cognitive functions including anxiety, stress, pain, learning and attention24, 25. In a recent study, we showed that neurons in the lateral habenula respond to rewards and sensory stimuli predicting rewards, and that they send these reward-related signals to dopamine neurons in the substantia nigra by inhibiting them26. However, this study did not test whether lateral habenula neurons respond to punishments or sensory stimuli predicting punishments.

To investigate how lateral habenula neurons respond to punishments and their predictors as well as rewards and their predictors, we recorded the activity of lateral habenula neurons while monkeys were conditioned in a Pavlovian procedure with two distinct contexts: one in which rewards were available, and another in which punishments were feared. We found that many lateral habenula neurons responded differentially to visual stimuli that indicated rewarding and aversive events and did so in a context-dependent manner.

RESULTS

We conditioned two monkeys using a Pavlovian procedure with an appetitive unconditioned stimulus (US) (liquid reward) and an aversive US (airpuff directed at the face). This Pavlovian procedure consisted of two blocks of trials, a reward block (Fig. 1a) and a punishment block (Fig. 1b). In the reward block, three conditioned stimuli (CSs) were associated with reward with 100%, 50% and 0% probability, respectively. In the punishment block, three CSs were associated with airpuff with 100%, 50% and 0% probability, respectively. Thus, this Pavlovian procedure had two distinct contexts. Each trial of each block started after the presentation of a timing cue (central small spot) on the screen. After 1 s, the timing cue disappeared and one of the three CSs was presented pseudo-randomly. After 1.5 s, the CS disappeared and the US was delivered. In addition to the cued trials, uncued trials were included in which a reward alone (free reward) was delivered during the reward block and an airpuff alone (free airpuff) was delivered during the punishment block. Each block consisted of 42 trials and was repeated twice or more. Notably, the same visual stimulus (blue square) was used as the CS associated with no-outcome in both blocks, though this CS would be unpleasant in the reward block but pleasant in the punishment block. We monitored anticipatory licking (a type of approach behavior) and blinking (a type of avoidance behavior) of the monkeys during CS presentation. These behavioral data suggested that the monkeys discriminated between the CSs (Supplementary note B and Supplementary Fig. 1 online).

Figure 1.

Pavlovian procedure with two distinct contexts. (a) Reward block. (b) Punishment block.

Response of lateral habenula neurons to CS

We recorded single unit activity from 72 lateral habenula neurons (45 in monkey N and 27 in monkey D) using the Pavlovian procedure. These neurons were estimated to be in the lateral habenula by their physiological properties and MRI (see Methods), and their localization was confirmed histologically (Supplementary note C and Supplementary Fig. 3 online). We first show the activity of an example neuron aligned by CS onset (Fig. 2a and b). The activity decreased after the appearance of 100% reward CS (the CS associated with reward with 100% probability) and 50% reward CS, and increased after the appearance of 0% reward CS in the reward block (Fig. 2a). The magnitude of the inhibition decreased and reversed to an excitation as the reward probability decreased. In contrast, the activity increased after the appearance of 100% airpuff CS and 50% airpuff CS, and decreased after the appearance of 0% airpuff CS in the punishment block (Fig. 2b). The magnitude of the excitation decreased and reversed to an inhibition as the airpuff probability decreased. Notably, the same blue square associated with no-outcome elicited an excitation in the reward block but inhibition in the punishment block.

Figure 2.

Responses of lateral habenula neurons to CSs. (a) Activity of an example neuron during the reward block. Rasters and spike density functions (SDFs) are aligned by CS onset and shown for 100% reward CS, 50% reward CS, and 0% reward CS. (b) Activity of the same neuron in a during the punishment block. Rasters and SDFs are shown for 100% airpuff CS, 50% airpuff CS, and 0% airpuff CS. (c) Averaged activity of the 49 neurons during the reward block. SDFs are shown for 100% reward CS (dark red), 50% reward CS (light red), and 0% reward CS (gray). Gray area indicates the period that was used to analyze CS response. (d) Averaged activity of the 49 neurons during the punishment block. SDFs are shown for 100% airpuff CS (dark blue), 50% airpuff CS (light blue), and 0% airpuff CS (gray).

In order to characterize the responses to the CSs (hereafter called CS responses), we performed a one-way analysis of variance (ANOVA) across the six conditions (i.e., 100%, 50% and 0% reward CSs, and 100%, 50% and 0% airpuff CSs) for each neuron. By this analysis, 49 of the 72 neurons showed significantly differential CS responses across the conditions (P < 0.05, one-way ANOVA). The averaged activity of these neurons showed the strongest excitation to 0% reward CS in the reward block (Fig. 2c) and 100% airpuff CS in the punishment block (Fig. 2d). These excitatory responses were graded by the reward probability and airpuff probability in the opposite directions.
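The per-neuron test described above can be sketched as follows. This is a minimal illustration of a one-way ANOVA F statistic across six conditions, not the authors' analysis code; the function name `one_way_anova_f` and the firing-rate values are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variability of the condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Trial-to-trial variability within each condition.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical CS responses (spikes/s relative to baseline) of one neuron,
# one list per condition; the numbers are invented for illustration.
cs_responses = {
    "100% reward CS":  [-7.0, -5.5, -6.0, -8.0],
    "50% reward CS":   [-2.0, -3.5, -1.0, -2.5],
    "0% reward CS":    [9.0, 11.0, 8.5, 10.0],
    "100% airpuff CS": [10.0, 9.0, 12.0, 11.5],
    "50% airpuff CS":  [4.0, 3.0, 5.5, 4.5],
    "0% airpuff CS":   [-1.0, -2.0, 0.5, -1.5],
}
f_stat, df_b, df_w = one_way_anova_f(list(cs_responses.values()))
```

Converting F to a P value requires the F distribution with (df_between, df_within) degrees of freedom; a statistics library routine such as `scipy.stats.f_oneway` is one option for that step.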

These results suggest that the CS responses of lateral habenula neurons were modulated by the motivational valence assigned to the CSs. We thus plotted, in Fig. 3a, the averaged magnitude of the CS responses according to the objective value of the outcomes. Since the objective value of a future reward is determined by the multiplicative product of its magnitude and its probability27 and since we fixed the reward magnitude, the objective reward value should be scaled according to its probability, as shown on the positive side in Fig. 3a. It is then natural to scale negative values in the same manner, now on the negative side in Fig. 3a. In support of this assumption, the frequency of approach behavior (anticipatory licking) increased as the positive value increased, while the frequency of avoidance behavior (anticipatory blinking) increased as the negative value increased (Supplementary Fig. 1 online).
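The scaling described above can be written out as a short sketch. It assumes a unit magnitude for both reward and airpuff, so that value reduces to signed probability; the function name `objective_value` and the unit scale are illustrative assumptions, not quantities taken from the paper.

```python
def objective_value(probability, magnitude=1.0, aversive=False):
    """Objective value as magnitude x probability, negative for aversive outcomes."""
    value = probability * magnitude
    if aversive and value != 0:
        return -value
    return value

# Six CS conditions with an assumed unit magnitude for both outcomes.
cs_values = {
    "100% reward CS": objective_value(1.0),
    "50% reward CS": objective_value(0.5),
    "0% reward CS": objective_value(0.0),
    "0% airpuff CS": objective_value(0.0, aversive=True),
    "50% airpuff CS": objective_value(0.5, aversive=True),
    "100% airpuff CS": objective_value(1.0, aversive=True),
}
```

Under this scaling, 0% reward CS and 0% airpuff CS share the same objective value of zero.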

Figure 3.

Relation between objective value and CS response. (a) Averaged magnitude of the CS response of the 49 neurons plotted against the objective value of outcome for the reward block (black) and the punishment block (gray). Filled symbols indicate a significant deviation from zero (P < 0.05, Wilcoxon signed-rank test). Double asterisks indicate a significant difference between two CS responses (P < 0.01, Wilcoxon signed-rank test). Error bars indicate s.e.m. (b) Correlation coefficients of the 49 neurons between objective value and CS response. The abscissa indicates correlation coefficient between reward value and CS response. The ordinate indicates correlation coefficient between airpuff value and CS response. Cyan, dark blue and magenta plots indicate neurons with statistically significant correlation between reward value and CS response, between airpuff value and CS response, and both of them, respectively (P < 0.05). White plots, no significance. The marginal histograms show the distribution of correlation coefficients. Black bars indicate neurons with statistically significant correlation (P < 0.05). White bars, no significance. (c) Distributions of the linearity indices of the 49 neurons. Black bars indicate the distribution of linearity indices in the reward block. Gray bars indicate the distribution in the punishment block.

As shown in Fig. 3a, the averaged magnitude of the CS response increased as the objective value decreased in both the reward block (black line) and the punishment block (gray line). To examine whether such a response pattern was achieved by single lateral habenula neurons, we calculated the correlation coefficient between CS response and objective value for each neuron, separately for the reward block (abscissa in Fig. 3b) and the punishment block (ordinate in Fig. 3b). Many neurons showed a significant negative correlation in the reward block (n = 35) and the punishment block (n = 30) (P < 0.05). Of these, 23 neurons showed a significant negative correlation in both of them (P < 0.05). The mean correlation coefficient was significantly smaller than zero in both blocks (P < 0.01, Wilcoxon signed-rank test). These results indicate that many individual neurons increased their CS responses as the objective value decreased in both blocks.
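A correlation of this kind can be sketched for one neuron as below. The section does not state whether the coefficient was a Pearson coefficient or whether it was computed across trials or condition means; this sketch assumes a Pearson coefficient across trials, and the function name `pearson_r` and all data values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical reward-block trials of one neuron: objective value of the CS
# on each trial and the CS response (spikes/s relative to baseline).
reward_values = [1.0, 1.0, 0.5, 0.5, 0.0, 0.0]
reward_responses = [-8.0, -6.0, -3.0, -1.0, 9.0, 11.0]
r_reward = pearson_r(reward_values, reward_responses)
```

A negative `r_reward` corresponds to a neuron whose CS response grows as the objective value falls, which is the pattern reported for most neurons in both blocks.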

However, the relation between the CS response and the objective value appears somewhat different between the reward and punishment blocks. In the punishment block the averaged CS response linearly increased as the objective value decreased (gray line in Fig. 3a). In the reward block the increase in the averaged CS response was larger between 0% and 50% reward CSs than between 50% and 100% reward CSs (black line in Fig. 3a). To statistically analyze this trend, we calculated a linearity index (see Methods) for each neuron, separately for the reward and punishment blocks, and its distribution is shown in Fig. 3c. Briefly, the linearity index is positive if the response to 50% CS is larger than the average of the responses to 0% CS and 100% CS, and negative if the response to 50% CS is smaller than the average. In the punishment block (gray bars), the mean of the linearity indices was not significantly different from zero (P > 0.05, Wilcoxon signed-rank test), indicating that the CS response linearly increased as the objective value decreased. In the reward block (black bars), however, the mean of the linearity indices was significantly smaller than zero (P < 0.01, Wilcoxon signed-rank test), indicating that the CS response increased abruptly from 50% reward CS to 0% reward CS. This non-linearity is reminiscent of the change in the number of blinks in the reward block (Supplementary Fig. 1c and d online). If the number of blinks is related to the unpleasantness, these results may suggest that lateral habenula neurons preferentially represent unpleasant events (e.g., 0% reward CS) rather than non-unpleasant events (e.g., 50% reward CS and 100% reward CS).
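The sign convention of the linearity index can be illustrated with a minimal sketch. The exact formula is given in the Methods and is not reproduced in this section, so the unnormalized difference used here is an assumption that only preserves the stated sign behavior; the function name `linearity_index` and the response values are hypothetical.

```python
def linearity_index(resp_0, resp_50, resp_100):
    """Assumed, unnormalized index: response to 50% CS minus the mean of the
    responses to 0% CS and 100% CS. Positive, zero, or negative as described
    in the text; the published index may be scaled differently."""
    return resp_50 - (resp_0 + resp_100) / 2.0

# Hypothetical mean CS responses (spikes/s relative to baseline).
# Reward block: excitation appears abruptly at 0% reward CS.
li_reward = linearity_index(resp_0=10.0, resp_50=-2.0, resp_100=-6.0)
# Punishment block: response changes in equal steps with airpuff probability.
li_punish = linearity_index(resp_0=-4.0, resp_50=2.0, resp_100=8.0)
```

With these invented numbers the reward-block index is negative and the punishment-block index is zero, matching the direction of the population result.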

Another notable feature of the relation between CS response and objective value is the interruption between the reward and punishment blocks. Although the objective values of 0% reward CS and 0% airpuff CS were identical, the response to 0% reward CS was significantly larger than the response to 0% airpuff CS (P < 0.01, Wilcoxon signed-rank test) (Fig. 3a). This interruption indicates the context-dependency of the CS response and leads to important consequences. That is, lateral habenula neurons, on average, were most strongly and equally excited by a CS associated with the most unpleasant event in each context, regardless of whether the event was the absence of reward (0% reward CS) or the presence of airpuff (100% airpuff CS) (Fig. 3a). This appears to correspond to the well-known relativity of subjective values28.

The relativity of the CS response was accomplished by single lateral habenula neurons, because there was a clear correlation in single cellular responses between the most unpleasant events in the two contexts: the response to 0% reward CS and the response to 100% airpuff CS for individual neurons (r = 0.906, P < 0.01) (Supplementary note D and Supplementary Fig. 4a online). The same tendency was observed for the most pleasant events in the two contexts: the response to 100% reward CS and the response to 0% airpuff CS (r = 0.729, P < 0.01) (Supplementary note D and Supplementary Fig. 4b online).

Because the physical properties of 0% reward CS and 0% airpuff CS were identical, these CSs could only be distinguished by the block context (reward block or punishment block). We thus examined how the differential responses to 0% reward CS and 0% airpuff CS developed after the block context was changed. We plotted, in Fig. 4, the averaged responses to 0% reward CS (black line) and 0% airpuff CS (gray line) against the number of preceding trials (excluding 0% reward CS and 0% airpuff CS trials) in a given block. The responses at trial zero reflected the previous context, but then changed and reached a plateau after the second or third trials.

Figure 4.

Changes in the averaged responses of the 49 neurons to 0% reward CS (black) and 0% airpuff CS (gray) after the block context was reversed. The abscissa indicates the number of preceding trials (excluding 0% reward CS and 0% airpuff CS trials) in a given block. When either 0% reward CS or 0% airpuff CS was presented on the first trial after block change, the neuronal response was included in the data at trial zero. Error bars indicate s.e.m.

Response of lateral habenula neurons to US and US omission

Many lateral habenula neurons also responded to the USs. In Fig. 5a and b the activity of an example neuron is aligned by US onset. The neuron showed phasic responses to the USs (hereafter called US responses): an inhibition to free reward (Fig. 5a) and an excitation to free airpuff (Fig. 5b). However, the US responses were strongly modulated by the preceding CSs. The inhibitory response to reward disappeared when the reward was completely predictable by following 100% reward CS (100% reward), and decreased when the reward was partially predictable by following 50% reward CS (50% reward). The excitatory response to airpuff decreased when the airpuff was completely predictable by following 100% airpuff CS (100% airpuff) or partially predictable by following 50% airpuff CS (50% airpuff).

Figure 5.

Responses of lateral habenula neurons to USs. (a) Activity of an example neuron during the reward block. Rasters and SDFs are aligned by reward onset and shown for 100% reward, 50% reward, and free reward. (b) Activity of the same neuron in a during the punishment block. Rasters and SDFs are aligned by airpuff onset and shown for 100% airpuff, 50% airpuff, and free airpuff. (c) Averaged activity of the 51 neurons showing a significant response to at least one of 100% reward (dark red), 50% reward (light red), and free reward (gray) (P < 0.05, Wilcoxon signed-rank test). Gray area indicates the period that was used to analyze US response. (d) Averaged activity of the 60 neurons showing a significant response to at least one of 100% airpuff (dark blue), 50% airpuff (light blue), and free airpuff (gray) (P < 0.05, Wilcoxon signed-rank test).

These US responses were commonly found in lateral habenula neurons. To investigate the response to reward, we analyzed the activity of 51 neurons with a significant response to at least one of 100% reward, 50% reward or free reward (P < 0.05, Wilcoxon signed-rank test). The averaged activity was strongly inhibited by free reward (Fig. 5c). This inhibitory response was decreased by the preceding 50% reward CS and diminished by the preceding 100% reward CS. To investigate the response to airpuff, we analyzed the activity of 60 neurons with a significant response to at least one of 100% airpuff, 50% airpuff or free airpuff (P < 0.05, Wilcoxon signed-rank test). The averaged activity was strongly excited by free airpuff (Fig. 5d). This excitatory response was decreased by the preceding CSs.

The omission of the US sometimes evoked an opposite response (hereafter called US omission response). In Fig. 6a and b, we show the US omission responses of the same neuron shown in Fig. 5a and b. The neuron showed an excitation when reward was partially predicted by 50% reward CS but did not occur (50% reward omission), although it showed neither excitation nor inhibition when reward did not occur as predicted by 0% reward CS (0% reward omission) (Fig. 6a). On the other hand, this neuron did not show a clear response when airpuff was partially predicted by 50% airpuff CS but did not occur (50% airpuff omission) or when airpuff did not occur as predicted by 0% airpuff CS (0% airpuff omission) (Fig. 6b).

Figure 6.

Responses of lateral habenula neurons to US omission. (a) Activity of the same neuron in Figures 5a and b during the reward block. Rasters and SDFs are aligned by CS offset which occurred simultaneously with reward onset in rewarded trials, and shown for 50% reward omission and 0% reward omission. (b) Activity of the same neuron in a during the punishment block. Rasters and SDFs are aligned by CS offset which occurred simultaneously with airpuff onset in airpuff trials, and shown for 50% airpuff omission and 0% airpuff omission. (c) Averaged activity of the 51 neurons for 50% reward omission (light red) and 0% reward omission (gray). Gray area indicates the period that was used to analyze US omission response. (d) Averaged activity of the 60 neurons for 50% airpuff omission (light blue) and 0% airpuff omission (gray).

The population of lateral habenula neurons showed a response pattern similar to that of the example neuron (Fig. 6c and d). The averaged activity was excited by 50% reward omission but not by 0% reward omission (Fig. 6c). On the other hand, the averaged activity did not change in response to 50% airpuff omission or 0% airpuff omission (Fig. 6d).

The profiles of the US and US omission responses show an interesting parallel with a ‘prediction error signal’ which indicates a discrepancy between predicted and actual values of outcomes. The averaged magnitude of the US and US omission responses is sorted by prediction errors for reward (Fig. 7a) and airpuff (Fig. 7b). Here, a positive value of the prediction error indicates that the outcome was better (more appetitive or less aversive) than predicted by the preceding CS; a negative value indicates that the outcome was worse (less appetitive or more aversive) than predicted by the CS. The results indicate that the averaged magnitudes of the responses increased as the prediction error became more negative for both reward and airpuff.
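The sign convention for the prediction error can be made concrete with a small sketch. It assumes unit outcome magnitude, treats the CS-signalled probability as the predicted value, and treats uncued (free) outcomes as having a predicted probability of zero; these scalings and the function name `prediction_error` are illustrative assumptions rather than the paper's stated definitions.

```python
def prediction_error(p_us, us_delivered, aversive=False):
    """Actual minus predicted outcome value.

    p_us: US probability signalled by the CS (0.0 assumed for uncued trials).
    us_delivered: True if the US occurred on this trial.
    aversive: True for airpuff, so that delivery carries negative value.
    """
    sign = -1.0 if aversive else 1.0
    outcome_value = sign * (1.0 if us_delivered else 0.0)
    predicted_value = sign * p_us
    return outcome_value - predicted_value

# Trial types in the reward block and the punishment block.
reward_pe = {
    "free reward": prediction_error(0.0, True),
    "50% reward": prediction_error(0.5, True),
    "100% reward": prediction_error(1.0, True),
    "50% reward omission": prediction_error(0.5, False),
    "0% reward omission": prediction_error(0.0, False),
}
airpuff_pe = {
    "free airpuff": prediction_error(0.0, True, aversive=True),
    "50% airpuff": prediction_error(0.5, True, aversive=True),
    "100% airpuff": prediction_error(1.0, True, aversive=True),
    "50% airpuff omission": prediction_error(0.5, False, aversive=True),
    "0% airpuff omission": prediction_error(0.0, False, aversive=True),
}
```

In this scheme a free reward and an omitted 50% airpuff are better than predicted (positive error), while an omitted 50% reward and a free airpuff are worse than predicted (negative error).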

Figure 7.

Relation between prediction error and US and US omission responses. (a) Averaged magnitude of the US and US omission responses of the 51 neurons plotted against prediction error for reward. Filled symbols indicate a significant deviation from zero (P < 0.05, Wilcoxon signed-rank test). Double asterisks indicate a significant difference between two responses (P < 0.01, Wilcoxon signed-rank test). Error bars indicate s.e.m. (b) Averaged magnitude of the US and US omission responses of the 60 neurons plotted against prediction error for airpuff. Conventions are the same as a. (c) Correlation coefficients of all 72 neurons between prediction error and US and US omission responses. The abscissa indicates the correlation coefficient between reward prediction error and US and US omission responses. The ordinate indicates the correlation coefficient between airpuff prediction error and US and US omission responses. Cyan, dark blue and magenta plots indicate neurons with statistically significant correlation between reward prediction error and US and US omission responses, between airpuff prediction error and US and US omission responses, and both of them, respectively (P < 0.05). White plots, no significance. The marginal histograms show the distribution of correlation coefficients. Black bars indicate neurons with statistically significant correlation (P < 0.05). White bars, no significance.

To examine whether this response pattern was achieved by single lateral habenula neurons, we calculated the correlation coefficient between US and US omission responses and prediction error for all 72 neurons, separately for reward (abscissa in Fig. 7c) and airpuff (ordinate in Fig. 7c). Many neurons showed a significant negative correlation for reward (34/72 neurons) and airpuff (17/72 neurons) (P < 0.05). The mean correlation coefficient was significantly smaller than zero for both of them (P < 0.01, Wilcoxon signed-rank test). Of these, 10 neurons showed a significant negative correlation for both reward and airpuff (P < 0.05). The frequency of neurons showing a significant negative correlation for both reward and airpuff (i.e., 10/72) was not significantly different from the frequency expected by chance level under the assumption that the negative correlations for reward and airpuff happened independently (chi-square test for independence, P > 0.05). These results indicate that lateral habenula neurons, as a population, increased their activity as the prediction error became more negative for both reward and airpuff. However, this was not necessarily true for individual neurons.
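The independence check can be worked through from the counts given above (34, 17 and 10 of 72 neurons). The sketch below uses the standard 2 × 2 Pearson chi-square statistic without continuity correction; whether the authors applied a correction is not stated, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

n_total = 72
n_reward = 34    # significant negative correlation for reward
n_airpuff = 17   # significant negative correlation for airpuff
n_both = 10      # significant for both

# Cells of the 2 x 2 table derived from the reported counts.
reward_only = n_reward - n_both                           # 24
airpuff_only = n_airpuff - n_both                         # 7
neither = n_total - n_both - reward_only - airpuff_only   # 31

# Number of "both" neurons expected if the two properties were independent.
expected_both = n_reward * n_airpuff / n_total
chi2 = chi_square_2x2(n_both, reward_only, airpuff_only, neither)

# Critical value of the chi-square distribution with 1 degree of freedom
# at alpha = 0.05.
CRITICAL_0_05_DF1 = 3.841
independent_not_rejected = chi2 < CRITICAL_0_05_DF1
```

About 8 neurons would be expected to fall in the "both" cell by chance, and the statistic stays well below the critical value, in line with the reported P > 0.05.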

Relationship between CS and US responses

Next, we examined the relationship between the CS and US responses. The scatter plot of Fig. 8a compares the responses to 100% reward CS and free reward for each neuron. A majority of lateral habenula neurons responded to both the reward CS and the reward itself in the same directions (mostly inhibition). Of the 72 neurons, 20 neurons showed significant CS and US responses in the same direction, while 7 neurons showed significant CS and US responses in opposite directions (P < 0.05, Wilcoxon signed-rank test). The scatter plot of Fig. 8b compares the responses to 100% airpuff CS and free airpuff. The responses to the airpuff CS and the responses to the airpuff itself were often expressed by separate groups of neurons: of the 72 neurons, 13 showed significant CS and US responses in the same directions, 18 showed a significant response to only 100% airpuff CS, and 20 showed a significant response to only free airpuff (P < 0.05, Wilcoxon signed-rank test). Thus, although lateral habenula neurons as a population responded both to the CSs and the USs in the same directions, it was not necessarily true for individual neurons. This might suggest that the CS and US responses of lateral habenula neurons are mediated by afferents from different brain areas.

Figure 8.

Comparison between CS response and US response. (a) Comparison between the response to 100% reward CS and the response to free reward for all 72 neurons. Dark blue, cyan and magenta dots indicate neurons with statistically significant responses to free reward, 100% reward CS, and both of them, respectively (P < 0.05, Wilcoxon signed-rank test). (b) Comparison between the response to 100% airpuff CS and the response to free airpuff for all 72 neurons. Dark blue, cyan and magenta dots indicate neurons with statistically significant responses to free airpuff, 100% airpuff CS, and both of them, respectively (P < 0.05, Wilcoxon signed-rank test).

Effect of eye position and movement on neural responses

On most of the trials the monkeys fixated their gaze on the central cue (timing cue) before the CS was presented, even though the central eye fixation was not required. On a small number of trials (1% in monkey N, 4% in monkey D), however, the eye position was away from the central cue when the CS was presented (i.e., out of a central eye window: ± 2.5 × 2.5 deg). Therefore, it is possible that the responses of lateral habenula neurons to CSs, USs and US omissions were influenced by the variation of eye position. To test this possibility we re-analyzed the entire dataset based on trials that were selected using several criteria of eye fixation during the presentation of the CSs (Supplementary notes E and F, and Supplementary Fig. 5 online). We found that the effects of eye position were small and did not affect the main results.

DISCUSSION

Using a Pavlovian procedure with two distinct contexts, we showed that neurons in the lateral habenula responded to motivational events and their predictors, such that their response magnitude was inversely correlated with the associated values. Their population responses were graded both for reward-based values and for punishment-based values.

Notably, the CS responses of lateral habenula neurons were context-dependent. This context-dependency had two remarkable consequences. First, lateral habenula neurons were most strongly and equally excited by a CS associated with the most unpleasant event in a given context, namely, the absence of reward in the reward block and the presence of airpuff in the punishment block. Behavioral studies using blocking procedures have suggested that these kinds of unpleasant experiences are processed by the same neural mechanism29. Therefore, the lateral habenula may be part of the unified mechanism implied by the behavioral studies. Second, the expectation of no-outcome induced different responses in lateral habenula neurons depending on the context: excitation in the reward block vs. no net response in the punishment block. This appears to correspond to the departure of subjective value from objective value, which we experience in everyday life28.

These profiles of the CS response suggest that the lateral habenula is a unified neural mechanism to represent negative motivational values induced by both rewards and punishments. However, the value coding by lateral habenula neurons was somewhat different between rewards and punishments. Whereas the CS response linearly increased as the objective value decreased in the punishment block, its changes in the reward block were less linear (Fig. 3a). These results suggest that lateral habenula neurons represent unpleasant events more precisely than pleasant events.

The US responses of lateral habenula neurons were also context-dependent and were modulated by prediction errors for both rewards and punishments. Reward prediction error is thought to be crucial for learning of goal-directed behaviors30. Its neural correlate has been found in different brain areas. The most striking among them is midbrain dopamine neurons31, 32. However, there has been no report on prediction error coding for punishments. Our results indicated that the US responses of lateral habenula neurons were modulated by the prediction error for punishments. Thus, the US responses of lateral habenula neurons to airpuff were weaker when the airpuff was fully expected (i.e., 100% airpuff) than when it was partially expected (i.e., 50% airpuff). The effect of expectation on neuronal responses was also confirmed in Supplementary note G and Supplementary Fig. 6 online. However, the punishment prediction error coding by lateral habenula neurons was not perfect. On average they were still excited by airpuff even when it was fully expected (i.e., 100% airpuff) and did not show a significant response to the omission of expected airpuff (i.e., 50% airpuff omission). This may suggest that lateral habenula neurons preferentially respond to negative motivational events (e.g., 100% airpuff) rather than positive motivational events (e.g., 50% airpuff omission).

In addition to the lateral habenula, other brain areas are considered to represent rewards and punishments. A previous study12 showed that midbrain dopamine neurons respond to airpuff-predicting stimuli, but rather inconsistently, unlike lateral habenula neurons. One hypothesis may be that positive motivational values are preferentially represented by dopamine neurons whereas negative motivational values are preferentially represented by lateral habenula neurons. However, the inconsistency of the airpuff-related responses in dopamine neurons could be due to the fact that in their experiment the monkeys were able to avoid the airpuff by acting quickly. Another area that is likely to represent both rewards and punishments is the amygdala. Both reward- and punishment-related values are represented by different groups of amygdala neurons7. Many of them respond differently to reward and airpuff themselves, and these responses are frequently modulated by prediction33. Possible functional relationships between the lateral habenula and the amygdala will be an important issue, although no direct connection has been shown between them.

The value signals in the lateral habenula would be useful for controlling both reward-seeking behaviors and punishment-avoidance behaviors. These functions might be mediated, at least partly, by dopamine and serotonin neurons which have been implicated in learning and motivation of goal-directed behaviors18–20, 34–36. Indeed, several lines of evidence have suggested that the lateral habenula exerts inhibitory control over dopamine21, 22, 37 and serotonin neurons23, 38. Although both dopamine neurons and serotonin neurons encode reward-related signals, albeit in different manners39, how they encode punishment-related signals remains debatable. To understand the function of the value signals in the lateral habenula, it is important to elucidate how these signals are processed in dopamine and serotonin neurons.

METHODS

Animals

Two adult rhesus monkeys (Macaca mulatta; monkey N, female, 6.0 kg; monkey D, male, 11.0 kg) were used for the experiments. All procedures for animal care and experimentation were approved by the Institute Animal Care and Use Committee and complied with the Public Health Service Policy on the humane care and use of laboratory animals. See Supplementary note A for detailed experimental procedure.

Behavioral task

The monkeys were trained in a Pavlovian procedure which consisted of two blocks of trials, a reward block (Fig. 1a) and a punishment block (Fig. 1b). In the reward block, three conditioned stimuli (CSs) (red circle, green cross and blue square for monkey N; yellow ring, cyan triangle and blue square for monkey D) were associated with a liquid reward as an unconditioned stimulus (US) with 100%, 50% and 0% probability, respectively. In the punishment block, three CSs (yellow ring, cyan triangle and blue square for monkey N; red circle, green cross and blue square for monkey D) were associated with an airpuff directed at the monkey’s face as a US with 100%, 50% and 0% probability, respectively. The sizes of these CSs were 8.6 × 8.6 to 10 × 10 deg. The liquid reward was delivered through a spout which was positioned in front of the monkey’s mouth. The airpuff (20–30 psi) was delivered through a narrow tube placed 6–7 cm from the face. In both blocks, each trial started with the presentation of a timing cue (size, 2.6 × 2.6 deg) on the screen (the monkeys were not required to fixate it). After 1 s, the timing cue disappeared and one of the three CSs was presented pseudo-randomly. After 1.5 s, the CS disappeared and the US was delivered. In addition to the cued trials, uncued trials were included in which a reward alone (free reward) was delivered during the reward block or an airpuff alone (free airpuff) was delivered during the punishment block. All trials were presented with a random inter-trial interval (ITI) that averaged 5 s (3–7 s) for monkey N and 4.5 s (3–6 s) for monkey D. One block consisted of 42 trials with fixed proportions of trial types (100%: 12 trials, 50%: 12 trials, 0%: 12 trials, uncued: 6 trials). For 50% trials, the CS was followed by the US on 6 trials and was not followed by the US on the other 6 trials. The block changed without any external cue. For each neuron we collected data by repeating the reward and punishment blocks twice or more.
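
The block composition described above can be illustrated with a short sketch. This is not the task-control software used in the experiments; the names `BLOCK_COMPOSITION` and `make_block` are hypothetical, and a plain shuffle is assumed as a stand-in for the pseudo-random ordering, whose exact constraints are not specified here.

```python
import random

# Trial counts per block as stated in the text (42 trials in total).
# Keys are (trial type, whether the US is delivered).
BLOCK_COMPOSITION = {
    ("100%", True): 12,    # CS always followed by the US
    ("50%", True): 6,      # 50% CS followed by the US
    ("50%", False): 6,     # 50% CS not followed by the US
    ("0%", False): 12,     # CS never followed by the US
    ("uncued", True): 6,   # free reward or free airpuff, no CS
}

def make_block(rng):
    """Return one block as a shuffled list of (trial_type, us_delivered)."""
    trials = [key for key, n in BLOCK_COMPOSITION.items() for _ in range(n)]
    # Assumption: a plain shuffle stands in for the "pseudo-random" order.
    rng.shuffle(trials)
    return trials

block = make_block(random.Random(0))
```

The same composition applies to the reward block and the punishment block; only the identity of the US and of the CS images differs.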

We monitored licking and blinking of the monkeys. To monitor licking, we attached a strain gauge to the spout which was positioned in front of the monkey’s mouth, and measured strains of the spout caused by licking. To monitor blinking, a magnetic search coil technique was used40. A small Teflon-coated stainless steel wire (< 5 mm diameter, 5 or 6 turns) was taped to an eyelid. Eye closure was identified by the vertical component of the eyelid coil signal.

Localization of the lateral habenula

We used the same technique to localize the lateral habenula as in our previous study26. We estimated the position of the lateral habenula by obtaining MRIs (4.7T, Bruker, Germany) based on the coordinates of the recording chamber whose inner walls were visualized with an enhancer (betadine ointment). On MRIs parallel to the recording chamber, the habenulae appeared as two round structures located about 4 mm anterior to the superior colliculi. Then, the localization of the lateral habenula was achieved by electrophysiological recording and verified by histological examination at the end of the experiments. As shown in our previous study26, the firing patterns and spike shapes of lateral habenula neurons were distinctly different from those of neurons in the surrounding thalamic area [mediodorsal thalamus (MD)]. Lateral habenula neurons fired tonically with relatively high background rates, whereas MD neurons exhibited irregular and bursty firing with lower background rates and their action potentials were much broader than those of lateral habenula neurons. Furthermore, most of the lateral habenula neurons, but none of the MD neurons, were sensitive to reward outcome.

Data analysis

We analyzed anticipatory licking, anticipatory blinking and neuronal activity during the Pavlovian procedure.

To evaluate the frequency and strength of anticipatory licking, the strain gauge signal was used. We first calculated the velocity of the strain of the spout. Then we integrated the absolute velocity during CS presentation for each trial. This integrated velocity becomes larger if the monkeys more frequently and strongly lick the spout. We defined this value as the magnitude of anticipatory licking in the trial. The magnitude was normalized by the following formula, normalized magnitude = (X – Min) / (Max – Min), where X is the magnitude of anticipatory licking in the trial, Max is the maximum magnitude in the recording session, and Min is the minimum magnitude in the recording session.
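
As a minimal sketch of this computation, assuming the strain gauge signal is available as a list of evenly sampled values covering the CS period, the magnitude and its normalization could be written as follows. The function names and the sampling interval `dt` are illustrative assumptions, not part of the original analysis code.

```python
def licking_magnitude(strain, dt):
    """Integrate the absolute velocity of the strain signal over one CS period.

    strain: evenly sampled strain gauge values during CS presentation
    dt: sampling interval (assumed; any consistent time unit)
    """
    velocities = [(b - a) / dt for a, b in zip(strain, strain[1:])]
    return sum(abs(v) * dt for v in velocities)

def normalize(magnitudes):
    """Apply (X - Min) / (Max - Min) across all trials of a recording session."""
    lo, hi = min(magnitudes), max(magnitudes)
    if hi == lo:
        raise ValueError("all magnitudes are equal; normalization is undefined")
    return [(x - lo) / (hi - lo) for x in magnitudes]
```

With this normalization, the trial with the weakest licking in a session maps to 0 and the trial with the strongest licking maps to 1.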

To count the number of anticipatory blinks during CS presentation, the vertical component of the eyelid signal was used. We first calculated the downward velocity of eyelid movement. We set a threshold and counted how many times the velocity crossed the threshold during CS presentation for each trial. This count was defined as the number of anticipatory blinks in the trial.
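
The threshold-crossing count can be sketched as below. The sign convention (larger values meaning a more closed eyelid), the sampling interval and the threshold value are assumptions made for illustration; the text does not specify them.

```python
def count_blinks(eyelid_position, dt, threshold):
    """Count how many times the downward eyelid velocity rises above a threshold.

    Assumption: larger position values mean a more closed eyelid, so a
    downward (closing) movement has positive velocity.
    """
    velocity = [(b - a) / dt for a, b in zip(eyelid_position, eyelid_position[1:])]
    count = 0
    above = False
    for v in velocity:
        if v >= threshold and not above:
            count += 1  # velocity has just crossed the threshold from below
        above = v >= threshold
    return count
```

Each sustained closing movement is counted once, because a new crossing is registered only after the velocity has dropped back below the threshold.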

In analyses of neuronal activity, responses to each CS were defined as the discharge rate during 150–400 ms after CS onset minus the background discharge rate during the 250 ms before CS onset. Responses to reward and reward omission were defined as the discharge rate during 200–500 ms after reward onset minus the background discharge rate during the 250 ms before reward onset. Responses to airpuff and airpuff omission were defined as the discharge rate during 50–150 ms after airpuff onset minus the background discharge rate during the 250 ms before airpuff onset. These time windows were determined based on the averaged activity of lateral habenula neurons. Specifically, we set the time windows such that they include major parts of the excitatory and inhibitory responses of lateral habenula neurons.
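
The window definitions above translate into a simple baseline-subtracted rate. The sketch below assumes spike times are expressed in milliseconds relative to the onset of the event of interest; the dictionary and function names are hypothetical. Omission responses use the same windows as the corresponding outcome.

```python
# Response windows (ms after event onset) as given in the text.
WINDOWS_MS = {"cs": (150, 400), "reward": (200, 500), "airpuff": (50, 150)}
BASELINE_MS = 250  # background window: the 250 ms before event onset

def rate(spike_times_ms, start, end):
    """Discharge rate in spikes/s within the half-open window [start, end)."""
    n = sum(1 for t in spike_times_ms if start <= t < end)
    return n / ((end - start) / 1000.0)

def response(spike_times_ms, event):
    """Rate in the response window minus the background rate."""
    start, end = WINDOWS_MS[event]
    return rate(spike_times_ms, start, end) - rate(spike_times_ms, -BASELINE_MS, 0)
```

A positive value indicates excitation relative to background and a negative value indicates inhibition.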

Because the 0% reward and 0% airpuff CSs were physically identical, they could only be distinguished by the block context (reward block or punishment block). Therefore, to analyze responses to 0% reward and 0% airpuff CSs, we excluded all 0% reward and 0% airpuff CSs that were presented before the block context could be known, that is, before the block's first presentation of a 100% CS, 50% CS or free outcome.
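
This exclusion rule can be expressed as a single pass over the trials of a block. The following is a sketch under the assumption that each trial is labeled with one of the strings "100%", "50%", "0%" or "uncued"; the labels and the function name are illustrative.

```python
def usable_zero_cs_trials(block_trials):
    """Return indices of 0% CS trials presented after the block context is known.

    block_trials: trial-type labels in presentation order for one block,
    each one of "100%", "50%", "0%" or "uncued" (labels assumed here).
    """
    context_known = False
    keep = []
    for i, trial_type in enumerate(block_trials):
        if trial_type == "0%":
            if context_known:
                keep.append(i)
        else:
            # A 100% CS, 50% CS or free outcome reveals the block context.
            context_known = True
    return keep
```

For example, in the sequence 0%, 0%, 50%, 0%, uncued, 0%, only the fourth and sixth trials would be retained.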

To examine the linearity between the objective value and the magnitude of the CS response, we calculated a linearity index for each neuron, separately for the reward and punishment blocks. The linearity index was calculated by the following equation:

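$$\text{Linearity index} = \frac{R_{50\mathrm{CS}} - \left(R_{100\mathrm{CS}} + R_{0\mathrm{CS}}\right)/2}{\left|R_{100\mathrm{CS}} - R_{0\mathrm{CS}}\right|}$$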

where R100CS, R50CS and R0CS indicate the response magnitudes for 100% CS, 50% CS and 0% CS, respectively. The linearity index is zero if the relation is perfectly linear (i.e., if the response to 50% CS is equal to the average of the responses to 0% CS and 100% CS). It is positive if the response to 50% CS is larger than the 0%–100% average, and negative if the response to 50% CS is smaller than the 0%–100% average.
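
A small worked sketch of the linearity index follows. The function name and the example response values are hypothetical and are chosen only to illustrate the sign of the index.

```python
def linearity_index(r100, r50, r0):
    """Deviation of the 50% CS response from the midpoint of the 0% and
    100% CS responses, scaled by the absolute 100%-0% difference."""
    denom = abs(r100 - r0)
    if denom == 0:
        raise ValueError("index is undefined when R100CS equals R0CS")
    return (r50 - (r100 + r0) / 2) / denom

# Illustrative values (spikes/s), not data from the study:
linear = linearity_index(10.0, 5.0, 0.0)   # 50% response at the midpoint
above = linearity_index(10.0, 8.0, 0.0)    # 50% response above the midpoint
```

Because the denominator is an absolute value, the sign of the index depends only on whether the 50% CS response lies above or below the midpoint, regardless of whether the neuron is excited or inhibited by the 100% CS.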

To calculate spike density functions (SDFs), each spike was replaced by a Gaussian curve (σ = 10 ms).
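
A minimal sketch of this smoothing, assuming spike times in milliseconds and a 1 ms sampling grid (the grid and the function name are assumptions), is given below. Each spike contributes a unit-area Gaussian, so the resulting function is expressed in spikes/s.

```python
import math

def spike_density(spike_times_ms, t_start, t_end, sigma_ms=10.0):
    """Return the SDF in spikes/s sampled every 1 ms over [t_start, t_end)."""
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))  # unit-area Gaussian
    sdf = []
    for t in range(t_start, t_end):
        per_ms = sum(
            norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
            for s in spike_times_ms
        )
        sdf.append(per_ms * 1000.0)  # convert from spikes/ms to spikes/s
    return sdf

sdf = spike_density([0], -100, 100)
```

Averaging such single-trial functions across trials gives the mean SDF for a condition.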

Histology

After the end of the recording session in monkey N, we selected representative locations for electrode penetrations into the lateral habenula. When typical single- or multi-unit activities were recorded, we made electrolytic microlesions at the recording sites (12 µA and 30 s). Then, monkey N was deeply anesthetized with an overdose of pentobarbital sodium, and perfused with 10 % formaldehyde. The brain was blocked and equilibrated with 10 % sucrose. Frozen sections were cut every 50 µm in coronal plane. The sections were stained with cresyl violet.

ACKNOWLEDGEMENTS

We thank S. Hong, M. Yasuda and E. Bromberg-Martin for valuable discussion, and M.K. Smith, J.W. McClurkin, A.M. Nichols, T.W. Ruffner, A.V. Hays and L.P. Jensen for technical assistance. This research was supported by the Intramural Research Program at the National Institutes of Health, National Eye Institute.

Footnotes

COMPETING INTERESTS STATEMENT

The authors declare that they have no competing financial interests.

REFERENCES

  • 1. Delgado MR, Nystrom LE, Fissell C, Noll DC, Fiez JA. Tracking the hemodynamic responses to reward and punishment in the striatum. J. Neurophysiol. 2000;84:3072–3077. doi: 10.1152/jn.2000.84.6.3072.
  • 2. O'Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nat. Neurosci. 2001;4:95–102. doi: 10.1038/82959.
  • 3. Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron. 2001;30:619–639. doi: 10.1016/s0896-6273(01)00303-8.
  • 4. Nieuwenhuis S, et al. Activity in human reward-sensitive brain areas is strongly context dependent. Neuroimage. 2005;25:1302–1309. doi: 10.1016/j.neuroimage.2004.12.043.
  • 5. Tobler PN, Fiorillo CD, Schultz W. Adaptive coding of reward value by dopamine neurons. Science. 2005;307:1642–1645. doi: 10.1126/science.1105370.
  • 6. Sugrue LP, Corrado GS, Newsome WT. Matching behavior and the representation of value in the parietal cortex. Science. 2004;304:1782–1787. doi: 10.1126/science.1094765.
  • 7. Paton JJ, Belova MA, Morrison SE, Salzman CD. The primate amygdala represents the positive and negative value of visual stimuli during learning. Nature. 2006;439:865–870. doi: 10.1038/nature04490.
  • 8. Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
  • 9. Samejima K, Ueda Y, Doya K, Kimura M. Representation of action-specific reward values in the striatum. Science. 2005;310:1337–1340. doi: 10.1126/science.1115270.
  • 10. Sallet J, et al. Expectations, gains, and losses in the anterior cingulate cortex. Cogn. Affect. Behav. Neurosci. 2007;7:327–336. doi: 10.3758/cabn.7.4.327.
  • 11. Lau B, Glimcher PW. Value representations in the primate striatum during matching behavior. Neuron. 2008;58:451–463. doi: 10.1016/j.neuron.2008.02.021.
  • 12. Mirenowicz J, Schultz W. Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature. 1996;379:449–451. doi: 10.1038/379449a0.
  • 13. Yamada H, Matsumoto N, Kimura M. Tonically active neurons in the primate caudate nucleus and putamen differentially encode instructed motivational outcomes of action. J. Neurosci. 2004;24:3500–3510. doi: 10.1523/JNEUROSCI.0068-04.2004.
  • 14. Kobayashi S, et al. Influences of rewarding and aversive outcomes on activity in macaque lateral prefrontal cortex. Neuron. 2006;51:861–870. doi: 10.1016/j.neuron.2006.08.031.
  • 15. Herkenham M, Nauta WJ. Afferent connections of the habenular nuclei in the rat. A horseradish peroxidase study, with a note on the fiber-of-passage problem. J. Comp. Neurol. 1977;173:123–146. doi: 10.1002/cne.901730107.
  • 16. Parent A, Gravel S, Boucher R. The origin of forebrain afferents to the habenula in rat, cat and monkey. Brain Res. Bull. 1981;6:23–38. doi: 10.1016/s0361-9230(81)80066-4.
  • 17. Herkenham M, Nauta WJ. Efferent connections of the habenular nuclei in the rat. J. Comp. Neurol. 1979;187:19–47. doi: 10.1002/cne.901870103.
  • 18. Wise RA. Dopamine, learning and motivation. Nat. Rev. Neurosci. 2004;5:483–494. doi: 10.1038/nrn1406.
  • 19. Hikosaka O, Nakamura K, Nakahara H. Basal ganglia orient eyes to reward. J. Neurophysiol. 2006;95:567–584. doi: 10.1152/jn.00458.2005.
  • 20. Cools R, Roberts AC, Robbins TW. Serotoninergic regulation of emotional and behavioural control processes. Trends Cogn. Sci. 2008;12:31–40. doi: 10.1016/j.tics.2007.10.011.
  • 21. Christoph GR, Leonzio RJ, Wilcox KS. Stimulation of the lateral habenula inhibits dopamine-containing neurons in the substantia nigra and ventral tegmental area of the rat. J. Neurosci. 1986;6:613–619. doi: 10.1523/JNEUROSCI.06-03-00613.1986.
  • 22. Ji H, Shepard PD. Lateral habenula stimulation inhibits rat midbrain dopamine neurons through a GABA(A) receptor-mediated mechanism. J. Neurosci. 2007;27:6923–6930. doi: 10.1523/JNEUROSCI.0958-07.2007.
  • 23. Wang RY, Aghajanian GK. Physiological evidence for habenula as major link between forebrain and midbrain raphe. Science. 1977;197:89–91. doi: 10.1126/science.194312.
  • 24. Sutherland RJ. The dorsal diencephalic conduction system: a review of the anatomy and functions of the habenular complex. Neurosci. Biobehav. Rev. 1982;6:1–13. doi: 10.1016/0149-7634(82)90003-3.
  • 25. Lecourtier L, Kelly PH. A conductor hidden in the orchestra? Role of the habenular complex in monoamine transmission and cognition. Neurosci. Biobehav. Rev. 2007;31:658–672. doi: 10.1016/j.neubiorev.2007.01.004.
  • 26. Matsumoto M, Hikosaka O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature. 2007;447:1111–1115. doi: 10.1038/nature05860.
  • 27. Glimcher PW. Indeterminacy in brain and behavior. Annu. Rev. Psychol. 2005;56:25–56. doi: 10.1146/annurev.psych.55.090902.141429.
  • 28. Solomon RL, Corbit JD. An opponent-process theory of motivation. I. Temporal dynamics of affect. Psychol. Rev. 1974;81:119–145. doi: 10.1037/h0036128.
  • 29. Seymour B, Singer T, Dolan R. The neurobiology of punishment. Nat. Rev. Neurosci. 2007;8:300–311. doi: 10.1038/nrn2119.
  • 30. Schultz W, Dickinson A. Neuronal coding of prediction errors. Annu. Rev. Neurosci. 2000;23:473–500. doi: 10.1146/annurev.neuro.23.1.473.
  • 31. Schultz W. Predictive reward signal of dopamine neurons. J. Neurophysiol. 1998;80:1–27. doi: 10.1152/jn.1998.80.1.1.
  • 32. Nakahara H, Itoh H, Kawagoe R, Takikawa Y, Hikosaka O. Dopamine neurons can represent context-dependent prediction error. Neuron. 2004;41:269–280. doi: 10.1016/s0896-6273(03)00869-9.
  • 33. Belova MA, Paton JJ, Morrison SE, Salzman CD. Expectation modulates neural responses to pleasant and aversive stimuli in primate amygdala. Neuron. 2007;55:970–984. doi: 10.1016/j.neuron.2007.08.004.
  • 34. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
  • 35. Montague PR, Dayan P, Sejnowski TJ. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 1996;16:1936–1947. doi: 10.1523/JNEUROSCI.16-05-01936.1996.
  • 36. Doya K. Metalearning and neuromodulation. Neural Netw. 2002;15:495–506. doi: 10.1016/s0893-6080(02)00044-8.
  • 37. Lecourtier L, Defrancesco A, Moghaddam B. Differential tonic influence of lateral habenula on prefrontal cortex and nucleus accumbens dopamine release. Eur. J. Neurosci. 2008;27:1755–1762. doi: 10.1111/j.1460-9568.2008.06130.x.
  • 38. Yang LM, Hu B, Xia YH, Zhang BL, Zhao H. Lateral habenula lesions improve the behavioral response in depressed rats via increasing the serotonin level in dorsal raphe nucleus. Behav. Brain Res. 2008;188:84–90. doi: 10.1016/j.bbr.2007.10.022.
  • 39. Nakamura K, Matsumoto M, Hikosaka O. Reward-dependent modulation of neuronal activity in the primate dorsal raphe nucleus. J. Neurosci. 2008;28:5331–5343. doi: 10.1523/JNEUROSCI.0021-08.2008.
  • 40. Gandhi NJ, Bonadonna DK. Temporal interactions of air-puff-evoked blinks and saccadic eye movements: insights into motor preparation. J. Neurophysiol. 2005;93:1718–1729. doi: 10.1152/jn.00854.2004.
