Abstract
Single neurons in the ventral striatum of primates carry signals that are related to reward and motivation. When monkeys performed a task requiring one to three bar release trials to be completed successfully before a reward was given, they seemed more motivated as the rewarded trials approached; they responded more quickly and accurately. When the monkeys were cued as to the progress of the schedule, 89 out of 150 ventral striatal neurons responded in at least one part of the task: (1) at the onset of the visual cue, (2) near the time of bar release, and/or (3) near the time of reward delivery. When the cue signaled progress through the schedule, the neuronal activity was related to the progress through the schedule. For example, one large group of these neurons responded in the first trial of every schedule, another large group responded in trials other than the first of a schedule, and a third large group responded in the first trial of schedules longer than one. Thus, these neurons coded the state of the cue, i.e., the neurons carried the information about how the monkey was progressing through the task. The differential activity disappeared on the first trial after randomizing the relation of the cue to the schedule. Considering the anatomical loop structure that includes ventral striatum and prefrontal cortex, we suggest that the ventral striatum might be part of a circuit that supports keeping track of progress through learned behavioral sequences that, when successfully completed, lead to reward.
Keywords: ventral striatum, motivation, schedule length, reward, visual cue, information analysis, macaque monkey, neurophysiology
The ventral striatum seems to play an important role in motivation and reward-related behavior (Schultz et al., 1995). It is part of a circuit that includes the anterior cingulate cortex, globus pallidus, ventral pallidum, substantia nigra, and mediodorsal nucleus of thalamus (Alexander et al., 1986). It also receives connections from orbitofrontal cortex (Haber et al., 1996), parts of the temporal lobe cortex (Van Hoesen et al., 1981), and the amygdala (Russchen et al., 1985), which suggest that it could play a role in planning future behavior and providing emotional content to it.
Single neurons in the ventral striatum of trained primates carry signals that are strongly related to reward and motivation. Two types of neural activity have been reported thus far in trained monkeys: (1) responses to the delivery of the primary reward (Apicella et al., 1991; Bowman et al., 1996) and (2) responses that anticipate certain task events, especially in rewarded trials (Schultz et al., 1992).
Before Bowman et al. (1996), the tasks used to study ventral striatal neuronal responses had several events but single trials. By contrast, Bowman et al. (1996) used a task in which the monkey was required to complete several successful trials before a reward was given. They reported that approximately one-third of the ventral striatal neurons showed reward-related neural activity. The monkeys behaved differently and ventral striatal neurons responded differently during trials in which a cue indicated there would be a reward relative to those trials in which the cue indicated there would be no reward. Behaviorally, the monkeys seemed more motivated in the rewarded trials. Bowman et al. (1996) concentrated on the reward-related activity of the neurons during this task. However, they pointed out that some neurons responded in other phases of the task. To investigate more thoroughly how the responses of these neurons relate to other events in this task, we compared the responses when the monkey was cued as to whether or not it would receive a reward (i.e., cues predicted proximity of reward) with the responses when the monkey received the same cues but a random reinforcement schedule (i.e., cues did not predict proximity of reward). When the cue was meaningful, the neural activity coded the state of the cue, i.e., the neurons carried the information needed to know how the monkey was progressing through the task. These effects disappeared immediately (for these very experienced monkeys) when the meaning of the cue was removed, showing that the effects are the result of the monkey associating the meaning of the cue with the task. We suggest that the ventral striatum might be part of a circuit that supports keeping track of progress through learned behavioral sequences that, when successfully completed, lead to reward.
MATERIALS AND METHODS
Animal preparation. Behavioral and single-unit data were collected from two young adult (5–9 kg) monkeys (Macaca mulatta). Both monkeys were initially trained to fixate a small target spot to obtain a fluid reward (Wurtz, 1969). After this training, a cylinder for microelectrode recording and a head holder were fixed to the skull during an aseptic surgical procedure performed under isoflurane anesthesia. The head holder allowed the head to be fixed in the standard stereotaxic position during the experiments. Scleral magnetic search coils for measuring eye movements were implanted (Robinson, 1963; Judge et al., 1980). Electrophysiological recording sessions generally began a week after surgery.
Behavioral paradigms and visual stimuli. The behavioral paradigms and visual stimuli used in the present study are similar to those of Bowman et al. (1996) (Fig. 1). Visual stimuli were presented on a computer video monitor subtending 10.5° of visual angle in front of the animal. In each trial, a white cue, which will be described below, was present at the top of the computer video screen, and a small white fixation spot (0.07°) appeared in the center of the screen. Then, after the monkey touched the bar in the chair and fixated the fixation point and after at least 400 msec, a red Wait signal (0.2°) appeared around the fixation point. After a randomly selected Wait time (400, 600, 800, 1000, or 1200 msec), the red Wait signal changed to become the green Go signal, indicating that the monkey could release the bar to earn a liquid reward. If the monkey responded within 1 sec, the target turned blue (OK signal), signaling the monkey that the trial had been completed correctly. The target then disappeared. If the monkey responded in <200 msec after the Go signal, we counted this as an anticipatory error. The target disappeared, and the trial was terminated immediately. If the monkey did not respond within 1 sec after the onset of the Go signal, we counted this trial as a late error.
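The trial outcome rules described above can be summarized in a short sketch. This is only an illustration, not the experimental control software: the function name, the constant names, and the handling of a trial with no release (treated here as a late error) are assumptions made for the example.

```python
def classify_trial(reaction_time_ms):
    """Classify a bar release by its latency from Go-signal onset.

    reaction_time_ms: latency in msec, or None if the bar was never released
    (the None convention is an assumption of this sketch).
    """
    ANTICIPATION_LIMIT_MS = 200   # releases faster than this are anticipatory errors
    RESPONSE_LIMIT_MS = 1000      # releases must occur within 1 sec of the Go signal

    if reaction_time_ms is None or reaction_time_ms > RESPONSE_LIMIT_MS:
        return "late_error"
    if reaction_time_ms < ANTICIPATION_LIMIT_MS:
        return "anticipatory_error"
    return "correct"
```

For example, a release 413 msec after the Go signal would be scored as correct, whereas a release at 150 msec would be scored as an anticipatory error.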
Initially, each correct trial was rewarded by delivering a drop of juice at a randomly chosen time beginning 250–350 msec after the target turned blue. When the monkeys completed >80% of the trials correctly, a cued multiple-ratio reinforcement schedule was introduced. The monkeys were required to complete randomly interleaved ratio schedules of one, two, or three correct trials to obtain a reward. The brightness of the rectangular cue (10.5 × 0.26°) at the top of the screen varied from black to white in direct proportion to the schedule fraction. The schedule fraction, defined as (trial number)/(schedule length), quantified the progress toward the rewarded trial; the six possible values were 1/3, 2/3, 3/3, 1/2, 2/2, and 1/1. The brightness of the schedule fraction cue was changed at the onset of the intertrial interval so that the monkeys could interpret the meaning of the cue before responding to the target in the forthcoming trial. We call this the brightening paradigm, because the brightness of the cue increased along with the progress of the schedule. The luminance of the brightest cue and black level were 9.8 and 0.8 cd/m², respectively, when using a 19 inch monitor, and 65 and 0.05 cd/m², respectively, when using a 14 inch monitor.
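As an illustration of how the six task states and the cue brightness relate, the following sketch enumerates the schedule fractions for schedules of one to three trials. The function names are hypothetical, and the linear mapping from schedule fraction to luminance is an assumption; the text states only that brightness was proportional to the schedule fraction, not how luminance was calibrated.

```python
from fractions import Fraction

def schedule_states(max_length=3):
    """List (trial number, schedule length) pairs for schedules of 1..max_length trials."""
    return [(t, n) for n in range(1, max_length + 1) for t in range(1, n + 1)]

def is_rewarded(trial, length):
    """A reward is delivered only on the last trial of a schedule."""
    return trial == length

def cue_luminance(trial, length, black=0.8, white=9.8):
    """Luminance in cd/m^2, ASSUMING a linear map from schedule fraction.

    The defaults are the black level and brightest cue reported for the
    19 inch monitor.
    """
    frac = Fraction(trial, length)
    return black + float(frac) * (white - black)

states = schedule_states()
distinct_brightnesses = {Fraction(t, n) for t, n in states}
```

Under these assumptions there are six task states but only four distinct cue brightnesses (1/3, 1/2, 2/3, and 1), because the 3/3, 2/2, and 1/1 states share the brightest cue.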
The monkeys had to complete each schedule before beginning a new one, no matter how many errors they made. On correct trials in which no reward was delivered, the reward apparatus was activated with the delivery valve turned off (sham reward) so that the auditory stimulation was the same as in the rewarded trials.
In each daily session, we first recorded single-unit activity in a block of trials in which the cue brightness was tied to the trial sequence (brightening paradigm). If the neuron was still electrically well isolated, we then recorded its activity in a block of trials in which the cue brightness was not related to the trial sequence (random paradigm). The monkeys behaved differently in the two blocks (Bowman et al., 1996; see Results).
In these behavioral tasks, the items that changed across trials are the schedule fraction cue and whether a reward is delivered. All of the other sensory conditions and all of the motor conditions are the same in every trial. Using this design, we can study how the schedule fraction cue is associated with the neural responses.
Recording technique. Single units were recorded while the monkeys performed the task. A hydraulic microdrive was mounted on the recording cylinder, and tungsten microelectrodes with an impedance of 0.8–1.3 MΩ (MicroProbe, Clarksburg, MD) were used through a stainless steel guide tube. Experimental control and data collection were performed by a Hewlett-Packard Vectra 486/33, using a real-time data acquisition program (Hayes et al., 1982) adapted for the QNX operating system. Single units were discriminated according to spike shape and amplitude by calculating principal components using an IBM personal computer-compatible microcomputer (Abeles and Goldstein, 1977; Gawne and Richmond, 1993).
All of the experimental procedures described here were approved by the Animal Care and Use Committee of the National Institute of Mental Health and were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
Statistical analysis. The mean reaction time (onset of Go signal to bar release) and correct rates were calculated for each combination of the schedule fraction and the cue brightness in the brightening and random paradigms. The single-unit activity was displayed as raster plots and spike density functions (Richmond and Optican, 1987). A spike density function, which is an estimate of spike probability over time, was constructed for each individual response by replacing each spike with a Gaussian pulse, σ = 5 msec (i.e., convolving the spike train with a Gaussian pulse). These were averaged at each millisecond.
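The spike density construction can be sketched as follows. This is a minimal illustration rather than the original analysis code: the function names are hypothetical, spike times are assumed to be given in msec from the start of the analysis window, and the scaling to spikes/sec is a presentation choice made for this example.

```python
import math

def spike_density(spike_times_ms, duration_ms, sigma_ms=5.0):
    """Spike density (spikes/sec) at each millisecond of one trial.

    Each spike is replaced by a unit-area Gaussian pulse with SD sigma_ms.
    """
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    density = []
    for t in range(duration_ms):
        total = sum(math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
                    for s in spike_times_ms)
        density.append(1000.0 * norm * total)  # per-msec value scaled to spikes/sec
    return density

def mean_density(trials, duration_ms, sigma_ms=5.0):
    """Average the single-trial densities at each millisecond."""
    per_trial = [spike_density(tr, duration_ms, sigma_ms) for tr in trials]
    return [sum(col) / len(per_trial) for col in zip(*per_trial)]
```

Because each Gaussian pulse has unit area, the integral of a single-trial density over time recovers the spike count, so smoothing redistributes spikes in time without changing their number.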
To quantify the neural responses, we measured the firing frequencies during selected time periods. To select the time period for the phasic response, we found the schedule fraction with the peak average response in the brightening paradigm (Fig. 2). We then looked for the minimum firing frequency in the period between 200 msec before the appearance of the cue and the peak and in the period between the peak and 1000 msec after the appearance of the cue. This defined the time period over which to quantify the neuronal activity (Fig. 2). Then we compared the firing frequency between the phasic responses and the spontaneous firing frequency that was measured during 200 msec before the onset of the cue. For many neurons, we also measured the firing frequency in the random paradigm and compared it with the firing frequency in the schedule fraction that had the same cue brightness in the brightening paradigm. For the bar release-related neurons, the search ranged between 200 msec before and 500 msec after the bar release, and for the reward-related neurons, the search ranged between 200 msec before and 750 msec after the activation of the reward apparatus.
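The window selection procedure for cue-related responses might be sketched like this. The function names and the dictionary layout are hypothetical, the trace is assumed to be a mean spike density sampled in 1-msec bins over the whole search range, and tie-breaking (the earliest minimum is taken) is an assumption not specified in the text.

```python
def peak_condition(traces):
    """Pick the schedule fraction whose mean trace has the highest peak.

    traces: dict mapping a schedule-fraction label to its mean density trace.
    """
    return max(traces, key=lambda k: max(traces[k]))

def response_window(density, t_start_ms=-200):
    """Return (onset_ms, offset_ms) bracketing the peak of a mean density trace.

    density[i] is the firing rate at time t_start_ms + i relative to cue onset;
    the trace is assumed to span the search range (e.g. -200 to +1000 msec).
    """
    peak = max(range(len(density)), key=density.__getitem__)
    # Minimum between the start of the search range and the peak
    before = min(range(0, peak + 1), key=density.__getitem__)
    # Minimum between the peak and the end of the search range
    after = min(range(peak, len(density)), key=density.__getitem__)
    return t_start_ms + before, t_start_ms + after
```

The same function could be applied to bar release-related or reward-related neurons by supplying a trace aligned to the corresponding event and spanning the search range given for that event.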
For bar release-related neurons, the response did not disappear even in the random paradigm (see Results). Therefore, we quantified the modulation strength by first averaging the responses across all schedule fractions and then calculating the ratio of the response in each cue condition to this average, i.e.:

$$M_j = \frac{A_j}{\frac{1}{6}\sum_{i=1}^{6} A_i}$$

where M_j is the relative activity measure for schedule fraction j, A_j is the activity for schedule fraction j, and the summation is over all of the schedule fractions, i = 1 … 6.
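A minimal sketch of this relative activity measure, with each response divided by the mean response across the schedule fractions, is given below. The function name and the example firing rates are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_activity(activity):
    """Map each schedule fraction j to M_j = A_j / mean_i(A_i)."""
    mean_activity = sum(activity.values()) / len(activity)
    return {j: a / mean_activity for j, a in activity.items()}

# Hypothetical mean firing rates (spikes/sec) for one neuron
example = {"1/3": 12.0, "1/2": 12.0, "2/3": 9.0,
           "3/3": 9.0, "2/2": 9.0, "1/1": 9.0}
ratios = relative_activity(example)
```

With these made-up rates the mean is 10 spikes/sec, so the ratios are 1.2 for the 1/3 and 1/2 conditions and 0.9 for the others. A neuron with no modulation across conditions would give ratios of 1 throughout.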
Information analysis. We wished to determine whether the neuronal responses could be decoded to identify the experimental conditions. In the past we have used an information theoretical analysis to make similar assessments (Kjaer et al., 1994; Bowman et al., 1996). In this approach, each neuron is considered as a transmission channel carrying information about the experimental conditions via its event-related responses in the task. Transmitted information quantifies the discriminability of the experimental conditions based on the responses. This discriminability is in turn a function of the means and variabilities of the responses across different experimental conditions. Intuitively, the less the distributions of the responses elicited by different experimental conditions overlap, the greater the transmitted information. The amount of information regarding the condition is the entropy:

$$H(C) = -\sum_{c} P(c)\,\log_2 P(c)$$
The transmitted information is the entropy before receiving a message, H(C), minus the residual entropy (or uncertainty) after receiving a message, H(C|R), that is, the conditional entropy or the entropy given the response. This can be rewritten as:

$$I(C;R) = H(C) - H(C|R) = \left\langle \sum_{c} P(c|r)\,\log_2 \frac{P(c|r)}{P(c)} \right\rangle_r$$

where I(C;R) is the information transmitted about the conditions C by the responses R, the sum runs over each condition c, and the angle brackets denote the average over the responses r. P(c|r) is the probability of c given r, i.e., the conditional probability of the condition being selected on the basis of the response. P(c) is the a priori probability of the condition, which is known in the experiment. The average of P(c|r) log2[P(c|r)/P(c)] over all of the responses r is the expected or average information about condition c. Obtaining an accurate estimate of the transmitted information, I(C;R), requires an accurate estimate of P(c|r). We have done this using a neural network to perform a nominal regression of the experimental condition on the neural response. The analysis performed by the neural network is similar to logistic regression (Kjaer et al., 1994). Here, the spike count was used as the response representation. The spike counts from one or more neurons were used as inputs to the neural network, and the six task states were used as target outputs in the analyses; for each trial, the target for the schedule fraction presented was set to one and the others were set to zero. Getting an unbiased value for P(c|r) when the data set has a limited number of responses r for each condition c can be problematic. Recent work shows that the neural network method accurately estimates the conditional probabilities (Golomb et al., 1997).
In the extreme, if the network could perfectly categorize each schedule fraction using the neuronal response, i.e., if there were no noise and all responses were distinct, the amount of information transmitted in the spike train would be equal to the amount required to encode the experimental conditions. Estimates of transmitted information always depend on the stimulus set and the response representation and cannot be taken to be absolute. However, for the conditions under study at any time, transmitted information provides a quantitative assessment of the use of a response for estimating the experimental condition. The amount of transmitted information is usually expressed in bits. Because there were almost equal numbers of trials, the maximum amount of information that could have been transmitted about the task state (1/3, 2/3, 3/3, 1/2, 2/2, and 1/1), that is, the entropy, is log2 6 or 2.58 bits. The spike count was used as the neuronal code r.
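The entropy and transmitted information can be illustrated with a short sketch. It does not reproduce the neural network used to estimate the conditional probabilities; instead it takes the posterior probabilities P(c|r) as given inputs, one list per observed response, and averages over them. The function names and that input layout are assumptions of this example.

```python
import math

def entropy_bits(p_c):
    """H(C) = -sum_c P(c) log2 P(c), in bits."""
    return -sum(p * math.log2(p) for p in p_c if p > 0)

def transmitted_information(p_c, posteriors):
    """Average over responses r of sum_c P(c|r) log2[P(c|r) / P(c)].

    p_c: a priori probability of each condition.
    posteriors: one list of P(c|r) per observed response (each sums to 1).
    """
    total = 0.0
    for post in posteriors:
        total += sum(q * math.log2(q / p)
                     for q, p in zip(post, p_c) if q > 0)
    return total / len(posteriors)

prior = [1.0 / 6.0] * 6                      # six equally likely task states
max_bits = entropy_bits(prior)               # log2(6)

# Perfect decoding: each response identifies exactly one state
perfect = [[1.0 if i == j else 0.0 for i in range(6)] for j in range(6)]
# Uninformative responses: the posterior equals the prior
uninformative = [prior] * 6
```

In this sketch, perfectly distinct responses transmit the full log2 6 bits, whereas responses whose posteriors equal the prior transmit zero bits, which brackets the range within which the measured values fall.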
Histology. During the last few recording sessions with the first monkey, small electrolytic lesions were made at a few recording sites by passing 5 μA of current for 30 sec. At the end of the experiments, this monkey was given a lethal dose of pentobarbital (75 mg/kg) and was transcardially perfused with saline followed by 10% formalin. Fifty micrometer frozen sections were cut, mounted on microscope slides, and stained for histology. Comparing the recording tracks, penetration records, and electrolytic marking lesions confirmed that all of the units described in this paper were located in the ventral striatum. Figure 3 shows guide tube tracks at the top of all three sections, indicating that guide tubes were placed at those locations. The middle section shows a small electrolytic marking lesion at a location where responsive units were recorded during the task.
We used magnetic resonance (MR) imaging to confirm that our recordings were taken from the same region in the other monkey (Saunders et al., 1990). The MR sections from the second monkey were matched to the histological sections from the first monkey. The depths of recording sites were calculated from the lengths of the guide tubes and electrodes. The recording depths were confirmed further by calculating backward from the depth at which the electrode struck the dura at the bottom of some penetrations. The sampling of neurons in both monkeys was approximately uniform across the anterior-posterior range represented by the sections in Figure 3.
RESULTS
Behavioral results
As reported previously (Bowman et al., 1996), the mean reaction time to bar release decreased as the brightness of the cue approached the level that signaled a forthcoming reward in the brightening paradigm (Fig. 4A), suggesting that the monkeys were more motivated. When we randomized the schedule fraction cue (random paradigm) with respect to the reward, the reaction time was short and relatively constant in all trials (Fig. 4A). It has long been known that when animals perform tasks using variable-ratio reward schedules, the rate of responding becomes nearly constant (Mackintosh, 1983). Thus, the results we see here in the random paradigm suggest that the monkey treats the random paradigm as a task with a variable-ratio reward schedule.
When the data are sorted by the cue brightness rather than by the schedule fraction in the random paradigm, we see that the monkeys responded a little faster when the cue was brighter. This effect is significantly smaller (p < 0.01; t test) than the effect of the cue when it is meaningful (Fig. 4B).
The percentage of correct trials increased as the monkey progressed through the schedule toward the rewarded trial in the brightening paradigm (Fig. 5A). In the random paradigm, the percentage of correct trials was high and relatively constant in all trials (Fig. 5A). Although there seems to be a slight trend toward better performance when the cue was brighter, the effect does not reach significance (Fig. 5B).
These changes in reaction time and percentage correct behavioral responses across paradigms indicate that the schedule fraction cue has meaning for the monkeys in the brightening paradigm. Thus, the results might be interpreted as an indication of the level of effort or motivation of the monkeys.
Neural response
We recorded from 150 neurons in the ventral striatum of two monkeys. There were 89 neurons that responded to one or more phases of this task. Some neurons responded to the appearance of the schedule fraction cue, some responded to the bar release or Go signal, and others responded to the OK signal or reward dispensing (Fig. 6). There were some neurons that showed phasic responses in relation to more than one task event.
Figure 7 shows the neural responses of one neuron that responded to all three task phases. Although we only found two neurons responding in all three phases, the responses are typical for the response seen at each phase, i.e., cue-related, bar release-related, and reward-related. On the time scale used here, all of the response types had clear phasic components. The cue-related responses, seen in this example for the 2/3, 3/3, and 2/2 schedule fractions, ended before any other event, including the appearance of the red Wait signal (Fig. 7A). The bar release-related responses began just before bar release occurred, as seen in this example (Fig. 7B), or otherwise started after bar release (see below). The reward-related responses also frequently started before the reward was delivered, as shown here (Fig. 7C). Both the bar release- and reward-related responses were clearly largest in the 3/3, 2/2, and 1/1 conditions for this neuron, i.e., in trials when a reward was forthcoming. There was also some response to bar release in the other conditions, a frequently seen result (see below).
Cue-related neurons
Forty-seven out of the 150 neurons showed phasic responses when the schedule fraction cue appeared in some schedule fractions and not in others. A phasic response was considered significant when the mean response in the time period selected, as described in Materials and Methods and shown in Figure 2, was different from the background activity taken from the intertrial interval (p < 0.05; t test). The phasic response related to the cue lasted <1 sec, averaging 927 ± 30 msec (± SE; n = 44). This phasic component always ended before the bar release occurred. The neuron shown in Figure 8A responded to the appearance of the cue when the schedule fraction was 2/3, 3/3, and 2/2 in the brightening paradigm. Thus, this neuron responded in trials other than the first one in a schedule and failed to respond when the cue appeared in the random paradigm (Fig. 8B).
The responses illustrated in Figure 8A ended rather abruptly, raising the question of whether the responses ended in relation to one of the other task events, especially the appearance of the red Wait signal. For 36 out of 47 cue-related neurons, the response ended before the onset of the Wait signal (average time from cue onset to Wait signal was 1270 msec). Of the remaining 10 neurons, four had responses that returned to the spontaneous firing level within 200 msec after the Wait signal, and six had responses that gradually declined, lasting >300 msec before returning to the spontaneous firing level. Thus, most of the cue-related neurons had phasic responses that ended before other task events occurred.
The cue-related neurons fell into five main groups (Table 1) of neurons that responded (1) in the middle of or at the end of the multiple trial schedules (n = 16) (Fig. 9A); (2) in the first trials of all schedules (n = 13) (Fig. 9B); (3) in only the first trials of multiple trial schedules (n = 6) (Fig. 9C); (4) in all of the rewarded trials (n = 3); and (5) only in the last trial of the multiple trial schedules (n = 3). Two neurons responded in all schedule fractions except 1/1; one neuron responded in 1/3, 1/2, 2/3, and 3/3; one neuron responded in 1/3, 2/3, and 3/3; one neuron responded in 1/3 and 2/3; and one neuron responded only in 2/2.
Table 1.
| Group | 1/3 | 1/2 | 2/3 | 3/3 | 2/2 | 1/1 | n |
|---|---|---|---|---|---|---|---|
| (1) |  |  | X | X | X |  | 16 |
| (2) | X | X |  |  |  | X | 13 |
| (3) | X | X |  |  |  |  | 6 |
| (4) |  |  |  | X | X | X | 3 |
| (5) |  |  |  | X | X |  | 3 |
X indicates the schedule fraction in which the neuron responded to the appearance of the schedule fraction cue. Forty-one out of 47 neurons that responded to the appearance of the schedule fraction cue fell into five main groups. The numbers of neurons in each category are shown under n.
For 23 of the cue-related neurons, we also recorded while the monkey performed in the random paradigm, i.e., when the cue had no predictive value. During the random paradigm, none of these neurons showed activity that was significantly different (Kruskal–Wallis, p = 0.07) from the background activity during the 200 msec before the trial onset. Thus, the responses that were seen in the brightening paradigm disappeared in the random paradigm, showing the associative nature of the relation between the cue and the neural response.
We identified significant responses during the brightening paradigm by comparing the responses to the background activity during the brightening paradigm. However, as described above, the phasic cue-related responses seen in the brightening paradigm disappeared, even at the same cue brightness in the random paradigm. We calculated how much larger the phasic responses were by forming a ratio: (significant response in the brightening paradigm)/(response to the cue of same brightness in the random paradigm). The ratios for the 1/3, 1/2, 2/3, 3/3, 2/2, and 1/1 cues were 2.0 ± 0.39 (mean ± SE; n = 11), 1.8 ± 0.28 (n = 11), 3.5 ± 1.1 (n = 8), 3.1 ± 0.76 (n = 13), 3.7 ± 1.0 (n = 13), and 1.6 ± 0.37 (n = 9), respectively. These ratios show that the cue-related phasic responses were especially large during the 2/3, 3/3, and 2/2 schedule fractions in the brightening paradigm, showing that the responses are largest after the first trial of a schedule.
Neurons responding at the time of bar release
There were 41 neurons that showed phasic responses at the time of bar release or the Go signal. Figure 10 shows the neural responses of one of these neurons, aligned at the time of bar release. This neuron responded in all of the schedule fractions, and the response onset preceded the onset of the bar release. The response was largest in the first trial of multiple trial schedules, i.e., 1/3 and 1/2 (Fig. 10A). When we randomized the cue sequence, these large responses decreased (Fig. 10B) but did not disappear. The overall activity also fell for this neuron in the random paradigm.
There were also five main response groups for these neurons. However, two of them were different than those seen for the cue-related neurons (Table 2 vs Table 1). The differences are that there were no neurons for the previous groups 2 and 5; two new groups, one containing neurons that showed large responses in all of the nonrewarded trials (1/3, 1/2, and 2/3) and another containing neurons that showed large responses in all of the trials, were found; and the largest group, group 4, which contained neurons that showed large responses in all of the rewarded trials, was 15 of 41 neurons. Figure 7B shows an example from a group 4 neuron.
Table 2.
| Group | 1/3 | 1/2 | 2/3 | 3/3 | 2/2 | 1/1 | n |
|---|---|---|---|---|---|---|---|
| (1) |  |  | X | X | X |  | 3 |
| (2) | X | X |  |  |  | X | 0 |
| (3) | X | X |  |  |  |  | 2 |
| (4) |  |  |  | X | X | X | 15 |
| (5) |  |  |  | X | X |  | 0 |
| (6) | X | X | X |  |  |  | 8 |
| (7) | X | X | X | X | X | X | 11 |
X indicates the schedule fraction in which the neuron responded. Thirty-nine out of 41 neurons fell into five main groups. The numbers of neurons in each category are shown under n.
Unlike the cue-related neurons, the bar release-related neurons showed activity in both the brightening and random paradigms. The difference in activity between the brightening and random paradigms was that the activity in the brightening paradigm was larger in some schedule fractions and smaller in others, whereas the activity in the random paradigm was almost the same for all cue brightnesses. To quantify the differential activity seen in the brightening paradigm, we formed a relative measure: (activity in one schedule fraction)/(average activity across all schedule fractions). This measure gives the activity relative to the mean across all conditions (see Materials and Methods). When the activity went up, the ratios were 1.14 ± 0.06 (mean ± SE; n = 21), 1.15 ± 0.08 (n = 21), 1.20 ± 0.06 (n = 22), 1.17 ± 0.04 (n = 28), 1.10 ± 0.04 (n = 28), and 1.20 ± 0.07 (n = 25) for 1/3, 1/2, 2/3, 3/3, 2/2, and 1/1 conditions, respectively. When the activity went down, the ratios were 0.76 ± 0.07 (n = 17), 0.79 ± 0.06 (n = 17), 0.80 ± 0.05 (n = 16), 0.64 ± 0.06 (n = 10), 0.59 ± 0.06 (n = 10), and 0.69 ± 0.06 (n = 13), for the same schedule fractions, respectively. For the random paradigm, these same ratios were 0.95 ± 0.03 (n = 19), 1.04 ± 0.04 (n = 19), 1.02 ± 0.02 (n = 19), and 0.97 ± 0.05 (n = 19) for the 1/3, 1/2, 2/3, and 1/1 cue brightnesses, respectively. Thus, in the brightening paradigm, the responses were substantially different from the expected value, whereas in the random paradigm, the ratios were basically equal to 1, showing that there was no modulation.
Neurons responding at the time of activation of the reward apparatus
There were 24 neurons that responded at the time of activation of the reward apparatus. Figure 11A shows the neural responses of one of these neurons in the brightening paradigm. The response is aligned to the onset of the activation of the reward apparatus. This neuron showed large neural responses at the time the reward was dispensed in the rewarded trials, 3/3, 2/2, and 1/1; the response preceded the onset of activation of the reward apparatus. In the random paradigm, the responses increased in the nonrewarded trials, indicating that these neurons were closely related to the behavior of the monkey; they responded as if a reward was expected on every trial (Fig. 11B).
For 15 neurons, the response preceded the onset of activation of the reward apparatus. For nine neurons, the response began at the time of or after activation of the reward apparatus.
The response types are summarized in Table 3 using the same group classification given in Tables 1 and 2. Fourteen out of the 24 neurons responded in the rewarded trial, group 4. There were no neurons responding for groups 1, 2, 3, and 5. One neuron responded in all schedule fractions except 1/1, and one neuron responded in 1/2 and 2/3. These reward-related neurons responding at the time of activation of the reward apparatus are probably similar to those reported by others (Apicella et al., 1991; Schultz et al., 1992; Bowman et al., 1996).
Table 3.
| Group | 1/3 | 1/2 | 2/3 | 3/3 | 2/2 | 1/1 | n |
|---|---|---|---|---|---|---|---|
| (1) |  |  | X | X | X |  | 0 |
| (2) | X | X |  |  |  | X | 0 |
| (3) | X | X |  |  |  |  | 0 |
| (4) |  |  |  | X | X | X | 14 |
| (5) |  |  |  | X | X |  | 0 |
| (6) | X | X | X |  |  |  | 4 |
| (7) | X | X | X | X | X | X | 4 |
X indicates the schedule fraction in which the neuron responded. Twenty-two out of 24 neurons fell into three main groups. The numbers of neurons in each category are shown under n.
On average, the reward-related responses that showed a significant increase were 1.6 ± 0.07 (mean ± SE; n = 8), 1.6 ± 0.42 (n = 8), 1.6 ± 0.36 (n = 8), 2.5 ± 0.61 (n = 18), 2.6 ± 0.59 (n = 18), and 2.5 ± 0.48 (n = 18) times larger than the responses in the random condition for the 1/3, 1/2, 2/3, 3/3, 2/2, and 1/1 schedule fractions, respectively. Here, the responses were greatest in the trials when a reward was forthcoming, as compared with the cue-related responses, which tended to be largest in any trial except the first in a series.
The relation between change in reaction time and neural response
The differences between the results in the brightening and random paradigms show that the relation of the response to the cue is associatively formed. We investigated how quickly the association is gated.
When the behavioral paradigm changed, the monkeys had no explicit cues. Thus, the monkeys would work until they discovered that the cue was no longer related to the schedule. By the time we began the unit recording, the monkeys had many weeks of experience with the task change, and they almost never made more than the one unavoidable error.
We compared how quickly the behavior and neural activity changed after the shift from the brightening to the random paradigms. For the behavior, we compared the reaction times in the last trial of the brightening paradigm in the 1/3 condition, the condition in which the monkey had the longest reaction time, with the second trial in the random paradigm. This test tends to underestimate the difference slightly because, occasionally, when the brightest cue appears with a reward by chance in the first trial after the switch to the random paradigm, this first trial in the random condition still seems to be valid. The reaction times before the switch (median, 413 msec) were significantly longer than were the latter ones (median, 343 msec) (Wilcoxon signed-rank test; p < 0.01). For the neural activity, the spike counts during 800 msec from the onset of the cue in the last trial of the brightening paradigm in the neurons that were responsive to the task states (median, six spikes) were significantly larger than were those in the second trial of the random paradigm (median, two spikes) (Wilcoxon signed-rank test; p < 0.05). These results show that the change in neuronal responses paralleled the change in the behavioral state. We cannot conclude that these neuronal responses at the onset of the cue directly drive the motor activity. It seems more likely that these signals influence subsequent processing that is closer to the motor output.
The information of the responses of cue-related neurons about the schedule fraction
It is clear that the cue-related neurons encode information about the schedule. We wanted to know whether the monkey could unambiguously determine what the schedule fraction is by using only these neurons. To study this issue, we performed an information theoretical analysis. The states 3/3, 2/2, and 1/1 all correspond to the schedule fraction of 1. However, we treat them independently because there were cue-related neurons that responded in only some of these three states. We calculated the information about all six task states (1/3, 1/2, 2/3, 3/3, 2/2, and 1/1) using the spike count during the 100–800 msec epoch after the onset of the cue as the response code. The information was 0.267 ± 0.183 bits (mean ± SD; n = 47) (Fig. 12). This calculation is for the current state, regardless of its predictability given previous states, i.e., given the response without its history.
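To make the quantity concrete, the following is a minimal sketch of a plug-in (frequency-count) estimate of the information that a spike count carries about the six task states. It is an illustration under stated assumptions, not the estimation procedure of this study (see Materials and Methods): the function name `mutual_information_bits`, the equal number of trials per state, and the toy spike counts are all hypothetical, and a plug-in estimate is biased upward when the number of trials is limited.

```python
import math
from collections import Counter

def mutual_information_bits(states, counts):
    """Plug-in estimate of I(state; spike count) in bits.

    `states` and `counts` are equal-length sequences, one entry per trial.
    """
    n = len(states)
    joint = Counter(zip(states, counts))
    n_state = Counter(states)
    n_count = Counter(counts)
    info = 0.0
    for (s, r), c in joint.items():
        # p(s, r) * log2( p(s, r) / (p(s) * p(r)) )
        info += (c / n) * math.log2(c * n / (n_state[s] * n_count[r]))
    return info

STATES = ["1/3", "1/2", "2/3", "3/3", "2/2", "1/1"]

# Hypothetical noiseless neuron: 10 trials per state, 6 spikes in the first
# trial of every schedule and 2 spikes otherwise (invented numbers).
states = [s for s in STATES for _ in range(10)]
first_trial = {"1/3", "1/2", "1/1"}
counts = [6 if s in first_trial else 2 for s in states]

info = mutual_information_bits(states, counts)
print(f"information = {info:.3f} bits")
```

A noiseless neuron that splits the six equally likely states into two sets of three carries 1 bit, which is well above the measured mean of 0.267 bits; trial-to-trial variability in real spike counts reduces the information.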
In information theory, information from independent channels adds. Because there are six task states, the a priori uncertainty is 2.58 bits (see Materials and Methods). Therefore, only 10 independent neurons (10 × 0.267 bits) would be required to differentiate among the six task states. However, from inspection, it is not clear how independent these groups of cue-related neurons are. Therefore, we also calculated the information using the spike counts of one or two neurons in each group (total, 5 or 10 neurons) taken together, even though they were not recorded at the same time. When we chose the neurons carrying the largest amount of information in each group, the information carried by the five neurons was 1.21 ± 0.20 bits, somewhat less than the 1.34 bits that would be found if these were completely independent. When we chose the neurons carrying the smallest amount of information in each group, the information carried by the five neurons was 0.62 ± 0.09 bits. When we chose the two neurons with the largest and second largest amount of information in each group (total, 10 neurons), the information carried by the 10 neurons was 1.35 ± 0.33 bits. Because adding more neurons from each group does not increase the transmitted information substantially, this information theoretical analysis indicates that the neurons we assigned to the groups via inspection are very similar in their information processing, thus supporting our categorization. Although the information calculation showed that this discrimination among the states could be done with as few as 10 independent neurons, the neurons we have recorded, when considered together, do not seem to reach the level of signal independence needed to reach the a priori uncertainty of 2.58 bits. It is possible that neurons in as yet undiscovered other classes or in other brain areas are used to solve this problem.
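The arithmetic behind these figures can be checked with a short sketch. It assumes, as the text does, that the six task states are equally likely and that information from independent neurons simply adds; the variable names are illustrative only.

```python
import math

n_states = 6
mean_bits_per_neuron = 0.267  # mean information of the 47 cue-related neurons

# A priori uncertainty for six equally likely task states.
prior_bits = math.log2(n_states)

# Smallest number of fully independent "average" neurons whose summed
# information reaches the a priori uncertainty.
neurons_needed = math.ceil(prior_bits / mean_bits_per_neuron)

# Information expected from five fully independent average neurons.
independent_five = 5 * mean_bits_per_neuron

# Fraction of the a priori uncertainty recovered by the best 10 neurons.
fraction_best_ten = 1.35 / prior_bits

print(f"a priori uncertainty: {prior_bits:.2f} bits")
print(f"independent neurons needed: {neurons_needed}")
print(f"five independent neurons: {independent_five:.3f} bits")
print(f"best 10 neurons recover {fraction_best_ten:.0%} of the uncertainty")
```

The measured 1.21 bits for the best five neurons falls below the independent-channel sum, and doubling the sample to 10 neurons adds only 0.14 bits, which is the basis for concluding that neurons within a group carry largely overlapping information.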
DISCUSSION
The new finding reported here is that a majority of the neurons in the monkey ventral striatum respond to different parts of a task in which one or more scheduled trials must be completed before a reward is delivered. The neurons carry signals that show which schedule is in effect and where the current trial is in the schedule. A large proportion of the neurons respond in schedules requiring more than one trial. Each neuron can also be placed in one or more of three categories: (1) neurons that respond at the onset of the cue, (2) neurons that respond near the time of bar release, and (3) neurons that respond near the time that the reward is dispensed. These are similar to categories seen by others in ventral and dorsal striatum (Hikosaka et al., 1989; Apicella et al., 1991; Schultz et al., 1992). Here we concentrate on the effects related to the number of scheduled trials.
Relation to other studies
The reward-related neurons are easiest to compare to other studies. In our study, the majority of reward-related neurons responded in every rewarded trial (compare Table 3). A smaller but significant number of reward-related neurons responded in correct, but nonjuice-rewarded, trials. Some neurons responded in every successfully completed trial whether or not a reward was delivered. These three groups are similar to reward-related neurons seen by others in dorsal and ventral striatum (Hikosaka et al., 1989; Apicella et al., 1991; Schultz et al., 1992; Bowman et al., 1996).
It is more difficult to compare the other two categories, bar release-related and cue-related neurons, to other studies. The striking aspect of the results here is the relation to schedule. Some neurons responded in schedules having only one trial. These can be interpreted as predicting the reward and seem most similar to the neurons reported in other studies. Many neurons responded in schedules with more than one trial. Presumably neurons such as these would not have been activated in previous studies using single-trial schedules. The higher percentage of responsive neurons seen in the population here [>50% vs ∼30% for Hikosaka et al. (1989) in dorsal striatum and 7% for Apicella et al. (1991) and 14% for Schultz et al. (1992) in ventral striatum] seems to be related to the schedule cuing used here.
Ventral and dorsal striatal neurons respond when predictable events will occur in the future (Hikosaka et al., 1989; Schultz et al., 1992, 1995). These neurons respond only when the reward is predictable; the same was true here. Thus, all of these researchers conclude that these neurons carry predictive signals. Here, we extend the idea about predictive signals. In our task, a large number of neurons were recruited by longer schedules (length more than one). Thus, these neurons show specific activity about the parts of the schedule, not about the reward itself.
Relation to scheduling
The cue-related neurons are not directly related to reward expectation. If we assume the response types are related only to the reward expectation, the responses should occur in the trials in which the reward is forthcoming (fourth line in Table 1 or the fourth and seventh lines in Tables 2, 3). However, these cue-related neurons code the meaning of the cue more finely. The largest group responds in trials other than the first of a schedule, a “keep going” signal (compare Table 1, group 1). The second most common category responds in the first trial of all of the schedules (compare Table 1, group 2), and the third most common category responds in the first trial of schedules longer than one (compare Table 1, group 3). Thus, they robustly code for situations in which the schedule must continue for more than one trial, rather than directly predicting reward.
What can the animal know using these cue-related neurons? The amount of information available about the cue from the best two neurons in each category was only approximately one-half (1.35 bits) of the amount needed (2.58 bits) to decode unambiguously the meaning of the cue. If, instead, we consider the task as start, continue, and reward, examination of Table 1 shows that these neurons can solve the problem completely.
Clearly, the cue-related neurons can provide effective signals about progress through a schedule. It would take only a small generalization for these neurons to signal progress through any sequence of epochs in which intermediate goals and rewards can be identified. In addition, the ventral striatal neurons may also be viewed as encoding motivational and emotional states associated with the cues, a possibility that is not incompatible with the first. Both possibilities are consistent with our findings in the brightening versus random paradigms. The activities of the cue-related neurons disappeared in the random paradigm, showing that their responses arise from associative learning of the meaning of the cue. Also, when we compared the change in reaction time to bar release with the neural responses of cue-related neurons from the brightening and random paradigms, the change in neuronal responses paralleled the change in behavioral responses. Although this could be considered as a code for the motor command, it seems more likely that these neurons are in the circuit that codes the meaning of the schedule fraction cue and provides the information about the progress of the schedule for neurons that are related to motivation and motor output.
Brown and Bowman (1995) and Bowman and Brown (1996) have conducted ablation experiments in rats using similar tasks. In one study, they cued the animal about the size of the reward (Brown and Bowman, 1995). Normal animals had shorter reaction times when the cue indicated a bigger reward. Bilateral ablations of the nucleus accumbens did not affect reaction-time performance or learning in this task. In a preliminary report of a subsequent study, they showed that normal animals stopped lever pressing late in a testing session in a progressive-ratio schedule. Animals with bilateral nucleus accumbens lesions continued to respond far longer than did control animals (Bowman and Brown, 1996). This latter result suggests that normal animals have an internal signal in the ventral striatum that codes the schedule length and affects the motivation of the animal. The cue-related neurons in the ventral striatum could carry the needed signal and could be part of a neural circuit that is related to motivation.
Relation to behavioral response
Some neurons responded near the time of the bar release movement. The most frequently found were those responding at approximately the time of bar release in rewarded trials, and the next most frequently found were those responding at approximately the time of bar release in all trials. These are similar to the neurons reported by Hikosaka et al. (1989) in the dorsal striatum and Schultz et al. (1992) in the ventral striatum. A significant number also responded in all nonrewarded trials or in any but the first trial when the schedule was longer than one. These neurons do not carry a simple motor or premotor signal because the response became significantly smaller when the brightness of the cue was randomized with respect to the schedule.
Models of ventral striatal function
The finding that many cue-related and bar-related neurons differentiate among the states in the brightening paradigm is quite striking. How does this relate to the role of the striatum in behavior? The ventral striatum is thought to be within a processing loop that includes the anterior cingulate cortex, the internal segment of the globus pallidus, the ventral pallidum, rostrodorsal substantia nigra, and posterior medial portions of the medial dorsal nucleus of the thalamus (Alexander et al., 1986). The orbital prefrontal cortex, the amygdala, and other parts of the medial temporal lobe also project to the ventral striatum (VanHoesen et al., 1981; Russchen et al., 1985; Haber et al., 1995). Thus, the ventral striatum is well-placed to take part in planning and maintaining behavior in response to emotionally significant stimuli. By having signals that keep track of progress through sequences of behavior, the ventral striatum seems to have signals useful for measuring progress through a previously set plan. The results presented here show that the population of ventral striatal neurons keeps an internal model by coding the place in the schedule for long sequences of behavior that ultimately lead to reward.
Using behavioral data, Everitt et al. (1991) and Everitt and Robbins (1992) concluded that the ventral striatum is important for linking cues with their reinforcement value. The cue-related neurons seem to be involved in a stage that is before the translation of a motivational signal and related only to keeping track of the sequence. The behavioral results (Bowman and Brown, 1996) taken with our single-unit studies suggest that the ventral striatum is involved in the normal pacing of activity, which may include delaying behavior when the reward value is not large enough to provide the drive for sequenced, and perhaps costly, behavior. We wonder whether the ventral striatum is important for the planning and persistence that it takes to keep working when reward can only be achieved via stepwise progression.
Footnotes
This work was supported by the National Institute of Mental Health Intramural Research Program and by the Agency of Industrial Science and Technology, Ministry of International Trade and Industry, Japan. We thank Dr. Mortimer Mishkin for his encouragement and support. We thank Drs. Kenji Kawano and Elizabeth Murray for their reading and discussion of this manuscript and Dr. Takao Oishi for the photograph of the histological section. B.J.R. and T.G.A. express warm appreciation to Dr. Steven Paul (Eli Lilly Company), who, as scientific director of the National Institute of Mental Health, provided support and encouragement for developing this line of work.
Correspondence should be addressed to Dr. B. J. Richmond, National Institute of Mental Health, Building 49, Room 1B80, Bethesda, MD 20892-4415.
Dr. Aigner’s present address: National Institute on Drug Abuse, Division of Basic Research, Parklawn Building 10A19, 5600 Fishers Lane, Rockville, MD 20857.
REFERENCES
- 1. Abeles M, Goldstein MH. Multiple spike train analysis. Proc IEEE. 1977;65:762–773.
- 2. Alexander GE, DeLong MR, Strick PL. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu Rev Neurosci. 1986;9:357–381. doi: 10.1146/annurev.ne.09.030186.002041.
- 3. Apicella P, Ljungberg T, Scarnati E, Schultz W. Responses to reward in monkey dorsal and ventral striatum. Exp Brain Res. 1991;85:491–500. doi: 10.1007/BF00231732.
- 4. Bowman EM, Brown VJ. Lesions of the rat ventral striatum change performance in a progressive fixed-ratio schedule of reinforcement without affecting reaction times when visual cues indicate reward cost. Soc Neurosci Abstr. 1996;22:446.
- 5. Bowman EM, Aigner TG, Richmond BJ. Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. J Neurophysiol. 1996;75:1061–1073. doi: 10.1152/jn.1996.75.3.1061.
- 6. Brown VJ, Bowman EM. Discriminative cues indicating reward magnitude continue to determine reaction time of rats following lesions of the nucleus accumbens. Eur J Neurosci. 1995;7:2479–2485. doi: 10.1111/j.1460-9568.1995.tb01046.x.
- 7. Everitt BJ, Robbins TW. Amygdala-ventral striatal interactions and reward-related processes. In: Aggleton JP, editor. The amygdala: neurobiological aspects of emotion, memory, and mental dysfunction. Wiley-Liss; New York: 1992. pp. 401–429.
- 8. Everitt BJ, Morris KA, O’Brien A, Robbins TW. The basolateral amygdala-ventral striatal system and conditioned place preference: further evidence of limbic striatal interactions underlying reward-related processes. Neuroscience. 1991;42:1–18. doi: 10.1016/0306-4522(91)90145-e.
- 9. Gawne TJ, Richmond BJ. How independent are the messages carried by adjacent inferior temporal cortical neurons? J Neurosci. 1993;13:2758–2771. doi: 10.1523/JNEUROSCI.13-07-02758.1993.
- 10. Golomb D, Hertz J, Panzeri S, Richmond B, Treves A. How well can we estimate the information carried in neuronal responses from limited samples? Neural Comput. 1997;9:649–665. doi: 10.1162/neco.1997.9.3.649.
- 11. Haber SN, Kunishio K, Mizobuchi M, Lynd-Balta E. The orbital and medial prefrontal circuit through the primate basal ganglia. J Neurosci. 1995;15:4851–4867. doi: 10.1523/JNEUROSCI.15-07-04851.1995.
- 12. Hays AV, Richmond BJ, Optican LM. A UNIX-based multiple process system for real-time data acquisition and control. WESCON Conf Proc. 1982;2:1–10.
- 13. Hikosaka O, Sakamoto M, Usui S. Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J Neurophysiol. 1989;61:814–832. doi: 10.1152/jn.1989.61.4.814.
- 14. Judge SJ, Richmond BJ, Chu FC. Implantation of magnetic search coils for measuring eye position: an improved method. Vision Res. 1980;20:535–538. doi: 10.1016/0042-6989(80)90128-5.
- 15. Kjaer TW, Hertz JA, Richmond BJ. Decoding cortical neuronal spike signals: network models, information estimation and spatial tuning. J Comput Neurosci. 1994;1:109–139. doi: 10.1007/BF00962721.
- 16. Mackintosh NJ. Conditioning and associative learning. Clarendon; Oxford: 1983.
- 17. Richmond BJ, Optican LM. Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. II. Quantification of response waveform. J Neurophysiol. 1987;57:147–161. doi: 10.1152/jn.1987.57.1.147.
- 18. Robinson DA. A method of measuring eye movements using a scleral search coil in a magnetic field. IEEE Trans Biomed Eng. 1963;10:137–145. doi: 10.1109/tbmel.1963.4322822.
- 19. Russchen FT, Bakst I, Amaral DG, Price JL. The amygdalostriatal projections in the monkey. An anterograde tracing study. Brain Res. 1985;329:241–257. doi: 10.1016/0006-8993(85)90530-x.
- 20. Saunders RC, Aigner TG, Frank JA. Magnetic resonance imaging of the rhesus monkey brain: use for stereotactic neurosurgery. Exp Brain Res. 1990;81:443–446. doi: 10.1007/BF00228139.
- 21. Schultz W, Apicella P, Scarnati E, Ljungberg T. Neuronal activity in monkey ventral striatum related to the expectation of reward. J Neurosci. 1992;12:4595–4610. doi: 10.1523/JNEUROSCI.12-12-04595.1992.
- 22. Schultz W, Apicella P, Romo R, Scarnati E. Context-dependent activity in primate striatum reflecting past and future behavioral events. In: Houk JC, Davis JL, Beiser DG, editors. Models of information processing in the basal ganglia. MIT; Cambridge, MA: 1995. pp. 11–28.
- 23. VanHoesen GW, Yeterian EH, Lavizzo-Mourney R. Widespread corticostriate projections from temporal cortex of the rhesus monkey. J Comp Neurol. 1981;199:205–219. doi: 10.1002/cne.901990205.
- 24. Wurtz RH. Visual receptive fields of striate cortex neurons in awake monkeys. J Neurophysiol. 1969;32:727–742. doi: 10.1152/jn.1969.32.5.727.