PLOS Biology. 2021 Jul 1;19(7):e3001055. doi: 10.1371/journal.pbio.3001055

D1- and D2-like receptors differentially mediate the effects of dopaminergic transmission on cost–benefit evaluation and motivation in monkeys

Yukiko Hori 1, Yuji Nagai 1, Koki Mimura 1, Tetsuya Suhara 1, Makoto Higuchi 1, Sebastien Bouret 2, Takafumi Minamimoto 1,*
Editor: Matthew F S Rushworth 3
PMCID: PMC8248602  PMID: 34197448

Abstract

It has been widely accepted that dopamine (DA) plays a major role in motivation, yet the specific contribution of DA signaling at D1-like receptor (D1R) and D2-like receptor (D2R) to cost–benefit trade-off remains unclear. Here, by combining pharmacological manipulation of DA receptors (DARs) and positron emission tomography (PET) imaging, we assessed the relationship between the degree of D1R/D2R blockade and changes in benefit- and cost-based motivation for goal-directed behavior of macaque monkeys. We found that the degree of blockade of either D1R or D2R was associated with a reduction of the positive impact of reward amount and an increase in delay discounting. Workload discounting was selectively increased by D2R antagonism. In addition, blocking both D1R and D2R had a synergistic effect on delay discounting but an antagonistic effect on workload discounting. These results provide fundamental insight into the distinct mechanisms of DA action in the regulation of benefit- and cost-based motivation, which have important implications for motivational alterations in both neurological and psychiatric disorders.


Using quantitatively controlled pharmacological manipulations, this study teases apart the role of D1- and D2-like dopamine receptors in motivation and goal-directed behavior in monkeys, revealing complementary roles of two dopamine receptor subtypes in the computation of the cost/benefit trade-off to guide action.

Introduction

In our daily lives, we routinely determine whether to engage or disengage in an action according to its benefits and costs: The expected value of benefits (i.e., rewards) has a positive influence, while the cost necessary to earn the expected reward (e.g., delay, risk, or effort) decreases the impact of reward value [1–3]. Arguably, the dopamine (DA) system plays a central role in motivation, which adjusts behavior as a function of expected costs and benefits. Phasic firing of midbrain DA neurons positively scales with the magnitude of future rewards and negatively scales with risk or time delay to reward [4–11]. In addition, several studies demonstrated that DA neurotransmission was causally involved in regulation of behavior based on expected costs and benefits [12–18]. In patients suffering from depression, schizophrenia, or Parkinson disease (PD), the alteration of DA transmission is frequently associated with various pathological impairments of motivation such as anergia, fatigue, psychomotor retardation, and apathy [14,19–21]. DA signaling is mediated at postsynaptic sites by 2 classes of DA receptors (DARs), the D1-like receptor (D1R) and the D2-like receptor (D2R), and both classes are thought to be involved in the regulation of motivation [22,23].

However, the specific mechanisms through which DA contributes to motivation based on cost–benefit trade-off remain unclear. For example, in tasks where animals must exert a higher force to obtain a bigger reward, blockade of either D1R or D2R shifts preferences toward less effort, and thus less reward, suggesting a role of DA in effort [24–29]. On the other hand, since DA activity shows little sensitivity to information about effort when it is decoupled from reward, it has been proposed that DA is strongly involved in adjusting motivation based on expected benefits (reward availability) rather than on expected energetic costs (effort) [9,30,31]. Note that this apparent controversy might be related to the difficulty of interpreting results from experiments where the nature of costs and benefits was not clearly identified and isolated [11].

To understand the role of DA in motivation, it is critical to identify not only the pattern of DA activity and release across costs and benefits, but also the action of DA on DARs [17]. However, the relative implication of distinct receptor subtypes in specific aspects of the cost–benefit trade-off in motivation also remains under debate. For example, systemic administration of D1R or D2R antagonist was shown to increase preference for small immediate rewards over larger, delayed rewards [25,32–34]. Some of these studies, however, have also shown that blockade of D1R [34] or D2R [33] has no effect on delay cost. These and other previous behavioral pharmacology studies have compared the effect of DAR blockade according to the antagonist dose–response relationship for each DAR subtype. However, since different antagonists display distinct pharmacological properties (e.g., target affinity, brain permeability, and biostability), it is difficult to accurately predict the effects on their target receptors in vivo. Therefore, to describe the role of DARs in motivational processes beyond a simple dose–response relationship, it seems essential to measure receptor occupancy after antagonist administration. Indeed, positron emission tomography (PET) studies of patients have shown that in vivo D2R occupancy is a reliable predictor of clinical and side effects of antipsychotic drugs [35,36]. Similarly, receptor occupancy following D2R antagonist administration, and its relationship with the behavioral effects, has been measured in rats and monkeys [37–39].

In the present study, we aimed to quantify and directly compare the roles of DA signaling via D1R and D2R in motivation based on costs and benefits in macaque monkeys. For this purpose, we manipulated DA transmission by systemic injections of specific antagonists for D1R and D2R and assessed the degree of DAR occupancy using in vivo PET imaging with selective radioligands. The effects of this quantitatively controlled DAR blockade on benefit- and cost-based motivation were evaluated in 2 sets of behavioral experiments. First, we quantified the effects of DAR blockade on the incentive impact of reward prediction, namely the relationship between predicted reward amount and the motivation to perform a goal-directed task. Second, to assess the effect of DAR blockade on 2 types of costs, workload and delay, we used a similar behavioral task with a fixed amount of reward in which one of the two costs was imposed, allowing us to estimate the negative impact of cost as the steepness of reward discounting (i.e., workload and delay discounting). Based on our data, D1R and D2R have similar roles in the incentive impact of reward prediction and in delay discounting, whereas workload discounting is exclusively related to D2R manipulation.

Results

PET measurement of D1R/D2R occupancy following systemic antagonist administration

To establish appropriate antagonist doses and experimental timing, we measured the degree of receptor blockade (i.e., receptor occupancy) following systemic administration of DAR antagonists. We performed PET imaging with selective radioligands for D1R ([11C]SCH23390) and D2R ([11C]raclopride) in a total of 4 monkeys (3 for each) in the awake condition, both at baseline (without drug administration) and following antagonist administration. We quantified specific radioligand binding using a simplified reference tissue model with the cerebellum as the reference region.

For D1R measurement, high radiotracer binding was seen in the striatum at baseline (Fig 1A, baseline). PET scans were obtained after pretreatment with non-radiolabeled SCH23390 as the D1R antagonist at different doses (10, 30, 50, and 100 μg/kg), demonstrating that specific tracer binding was diminished in a dose-dependent manner (Fig 1A). We performed a volume of interest (VOI)-based analysis quantifying the reduction of specific binding from baseline, which was homogeneous across several brain regions within a blocking condition (S1 Fig). We defined receptor occupancy as the degree of reduction of specific binding using the values from the striatal VOI, since they appeared to be the most reliable (see Materials and methods) [40]. In 3 monkeys, we measured the relationship between D1R occupancy and the dose of SCH23390, which was approximated by a Hill function (Fig 1C and Eq 4). We found that treatment with SCH23390 at doses of 100 and 30 μg/kg corresponded to 81% and 57% of D1R occupancy, respectively.
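The dose-occupancy relationship can be illustrated with a short sketch. Eq 4 is not reproduced in this section, so the code below assumes the standard Hill form, occupancy = dose^n / (dose^n + ED50^n), and solves for the two parameters from the two occupancy values quoted above. The function names and the resulting ED50 and n are illustrative interpolations, not the authors' fitted values.

```python
import math

def hill_occupancy(dose, ed50, n):
    """Fractional occupancy under an ASSUMED standard Hill function."""
    return dose**n / (dose**n + ed50**n)

def hill_from_two_points(d1, occ1, d2, occ2):
    """Solve for (ed50, n) from two (dose, occupancy) pairs.

    Uses the logit form of the Hill function:
        ln(occ / (1 - occ)) = n * (ln(dose) - ln(ed50))
    """
    logit1 = math.log(occ1 / (1.0 - occ1))
    logit2 = math.log(occ2 / (1.0 - occ2))
    n = (logit1 - logit2) / (math.log(d1) - math.log(d2))
    ed50 = d1 / math.exp(logit1 / n)
    return ed50, n

# Occupancies reported in the text for SCH23390: 100 ug/kg -> 81%, 30 ug/kg -> 57%.
ed50, n = hill_from_two_points(100.0, 0.81, 30.0, 0.57)

# Interpolated occupancy at the four tested doses (illustration only).
for dose in (10.0, 30.0, 50.0, 100.0):
    print(f"{dose:5.0f} ug/kg -> occupancy {hill_occupancy(dose, ed50, n):.2f}")
```

Two points determine the two parameters exactly, so this curve passes through both reported values by construction; a fit to all measured doses, as in Fig 1C, would generally give somewhat different parameters.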

Fig 1. D1R and D2R occupancy measured by PET.


(A) Representative horizontal MR (left) and parametric PET images showing specific binding (BPND) of [11C]SCH23390 at baseline and following drug treatment with SCH23390 (10, 30, 50, or 100 μg/kg, i.m.). (B) Representative horizontal MR (left) and parametric PET images showing specific binding (BPND) of [11C]raclopride at baseline and on days 0 to 7 after injection with haloperidol (10 μg/kg, i.m.). Color scale indicates BPND (regional binding potential relative to non-displaceable radioligand). (C) Occupancy of D1R measured at striatal ROI is plotted against the dose of SCH23390. Three of 4 doses were examined in each monkey. (D) Occupancy of D2R measured at striatal ROI is plotted against the day after haloperidol injection. Dotted curves in C and D are the best fit of Eqs 4 and 5, respectively. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; i.m., intramuscular; MR, magnetic resonance; PET, positron emission tomography; ROI, region of interest.

Haloperidol was used for D2R antagonism. Unlike SCH23390, which was rapidly washed from the brain within a few hours, a single dose of haloperidol treatment was expected to show persistent D2R occupancy for the following several days as described in humans and mice [41,42], providing the opportunity to test different occupancy conditions. The baseline [11C]raclopride PET image showed the highest radiotracer binding in the striatum (Fig 1B, baseline). As expected, striatal binding was diminished not only just after pretreatment with haloperidol (10 μg/kg, intramuscular [i.m.]), but also on post-haloperidol day 2 (Fig 1B, day 2). Binding had returned to the baseline level by day 7 (Fig 1B, day 7). We measured D2R occupancy on days 0, 1, 2, 3, and 7 after a single haloperidol injection in 3 monkeys. An exponential decay function approximated the relationship between D2R occupancy and post-haloperidol days (Eq 5); a single injection of haloperidol yielded 78% and 48% of D2R occupancy on days 0 and 1, respectively (Fig 1D).
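The time course of D2R occupancy can be sketched in the same spirit. Eq 5 is not shown in this section, so the code assumes a single exponential without offset, occupancy(t) = occupancy(0) · exp(−rate · t), anchored on the day 0 and day 1 values quoted above. The function names are hypothetical and the derived rate is an interpolation, not the published fit.

```python
import math

def decay_rate(occ_day0, occ_day1):
    """Rate constant (per day) of an ASSUMED single-exponential decay."""
    return math.log(occ_day0 / occ_day1)

def d2r_occupancy(day, occ_day0, rate):
    """Predicted fractional D2R occupancy `day` days after injection."""
    return occ_day0 * math.exp(-rate * day)

# Occupancies reported in the text: 78% on day 0, 48% on day 1.
rate = decay_rate(0.78, 0.48)
half_life_days = math.log(2.0) / rate

for day in (0, 1, 2, 3, 7):
    print(f"day {day}: occupancy {d2r_occupancy(day, 0.78, rate):.2f}")
print(f"occupancy half-life: {half_life_days:.2f} days")
```

Under this assumption the predicted occupancy on day 7 is only a few percent, in line with the observation that binding had returned to the baseline level by then.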

Effects of D1R and D2R blockade on behavior

We next quantified the effects of DAR blockade on behavior using a total of 3 monkeys not used in the PET occupancy study (monkeys KN, ST, and MP; the first 2 for incentive and all 3 for cost-based motivation). Our goal here was to study the influence of D1R and D2R manipulation on how monkeys adjusted their behavior based on expected benefits (reward size) or expected costs (delay or workload). We used tasks where reward could be obtained by performing a simple action (releasing a bar). In each version of the task, we manipulated costs (delay or workload) or benefits (reward size), such that distinct trials corresponded to different levels of costs or benefits. At the beginning of each trial, a visual cue provided information about the current cost and benefit, so that monkeys could adjust their behavior accordingly. We evaluated motivational processes by using computational modeling to measure the impact of incentive or costs on 2 behavioral measures: refusal rate (whether monkeys accepted or refused to perform the offered option; see below) and reaction time (RT; how quickly they responded).

Effects of D1R and D2R blockade on benefit-based motivation

To assess the effect of blockade of D1R and D2R on benefit-based motivation, we tested 2 monkeys with a reward-size task (Fig 2A). In every trial of this task, the monkeys were required to release a bar when a visual target changed from red to green to get a liquid reward. A visual cue indicated the amount of reward (1, 2, 4, or 8 drops) at the beginning of each trial (Fig 2A). All monkeys had been trained to perform basic color discrimination trials in the cued multi-trial reward schedule task [43] for more than 3 months. As in previous experiments using a single option presentation, the action was very easy, and monkeys could not fail if they actually tried to release the bar on time (the error rate is indeed much lower in the absence of information about costs and benefits) [2,44]. As in those previous experiments manipulating information regarding costs and benefits, failures (either releasing the bar too early or too late) were usually observed in small reward trials and/or close to the end of daily sessions. Therefore, they were regarded as trials in which the monkeys refused to release the bar, presumably because they were not sufficiently motivated to correctly release the bar (i.e., refusal) [2]. Hence, the refusal rate provided a reliable measure of the influence of motivation on behavior [9,45–48]. We had previously shown that the refusal rate (E) was inversely related to reward size (R), which had been formulated with a single free parameter a [2] (Fig 2B),

E = 1/(aR)    (1)
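As a minimal sketch of how the single parameter a in Eq 1 can be estimated, note that with x = 1/R the model becomes a line through the origin, E = c·x with c = 1/a, so ordinary least squares has a closed form. The function names, the treatment of E as a fraction, and the data below are all hypothetical (generated from a made-up a = 8); the authors' actual fitting procedure is described in their Materials and methods and is not reproduced here.

```python
def refusal_rate(reward_size, a):
    """Eq 1: refusal rate E as an inverse function of reward size R."""
    return 1.0 / (a * reward_size)

def fit_incentive_impact(reward_sizes, refusal_rates):
    """Least-squares estimate of a in E = 1/(a*R).

    With x = 1/R the model is E = c*x (c = 1/a), and the least-squares
    slope through the origin is c = sum(E*x) / sum(x*x).
    """
    xs = [1.0 / r for r in reward_sizes]
    c = sum(e * x for e, x in zip(refusal_rates, xs)) / sum(x * x for x in xs)
    return 1.0 / c

reward_sizes = [1, 2, 4, 8]  # drops, as in the reward-size task
# Hypothetical, noise-free refusal rates generated from a made-up a = 8.
hypothetical_rates = [refusal_rate(r, 8.0) for r in reward_sizes]
a_hat = fit_incentive_impact(reward_sizes, hypothetical_rates)
print(f"estimated a = {a_hat:.2f}")
```

A larger a means a lower refusal rate at every reward size, which is why a is read as the incentive impact of reward.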

Fig 2. D1R/D2R blockade increased refusal rates in reward-size task.


(A) Reward-size task. Left: sequence of events during a trial. Right: association between visual cues and reward size. (B) Schematic illustration of inverse function between refusal rate and reward size. (C) Schematic illustration of 2 explanatory models of decrease in motivation. Left: increase in refusal rate (i.e., decrease in motivation) in relation to reward size caused by decrease in incentive impact (a). Right: an alternative model explaining increase in refusal rate irrespective of reward size. (D, E) Behavioral data under D1R and D2R blockade, respectively. Refusal rates (mean ± SEM) as a function of reward size for monkeys KN (top) and ST (bottom). Dotted curves are the best-fit inverse function (S1 Table). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. CON, control; D1R, D1-like receptor; D2R, D2-like receptor.

In agreement with these previous studies, both monkeys exhibited the inverse relationship in the nontreatment condition (Fig 2D and 2E, control).

For D1R blockade, the monkeys were tested with the task 15 minutes after a systemic injection of SCH23390 (10, 30, 50, and 100 μg/kg) or vehicle as control. D1R blockade increased the refusal rates, particularly in smaller reward-size trials (Fig 2D). We considered whether this increase was due to a reduction in the incentive impact of reward or a decrease in motivation irrespective of reward size. These factors can be captured by a decrease in parameter a of the inverse function and by the addition of an intercept e, respectively (Fig 2C). To quantify the increases in refusal rate, we compared 4 models while considering these 2 factors as random effects: model #1, random effect on a; model #2, random effect on a with fixed e; model #3, fixed a with random effect on e; model #4, random effect on both a and e (see S1 Table). For both monkeys, the increases in refusal rate were explained by a decrease in parameter a due to the treatment, while the inverse relation with reward size was maintained (Fig 2D and S1 Table; model #2 for monkey KN and model #1 for ST). We then assessed changes in parameter a, which indicates the incentive impact of reward size. As shown in Fig 3A, normalized a decreased as the dose of SCH23390 was increased to 30 or 50 μg/kg, but then it increased at the highest dose (100 μg/kg) for monkey KN, and less clearly so for monkey ST (Fig 3A, left).
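The two candidate explanations make different predictions across reward sizes, which the following sketch makes concrete. It assumes the intercept model has the additive form E = 1/(aR) + e, and all parameter values are made up for illustration; the mixed-effects model comparison itself (S1 Table) is not reproduced.

```python
def refusal_incentive(reward_size, a):
    """Refusal rate when only the incentive impact a matters (Eq 1)."""
    return 1.0 / (a * reward_size)

def refusal_offset(reward_size, a, e):
    """ASSUMED alternative: a reward-independent intercept e is added."""
    return 1.0 / (a * reward_size) + e

reward_sizes = [1, 2, 4, 8]
a_control = 10.0  # hypothetical control value of a

control = [refusal_incentive(r, a_control) for r in reward_sizes]
halved_a = [refusal_incentive(r, a_control / 2.0) for r in reward_sizes]
with_offset = [refusal_offset(r, a_control, 0.05) for r in reward_sizes]

# Lower a: refusal rates are multiplied by a constant factor, so the
# absolute increase is largest for the smallest reward.
ratio = [t / c for t, c in zip(halved_a, control)]
# Intercept e: refusal rates shift by the same amount at every reward size.
shift = [t - c for t, c in zip(with_offset, control)]

print("ratio under halved a:", [round(x, 3) for x in ratio])
print("shift under offset e:", [round(x, 3) for x in shift])
```

An increase in refusal rate concentrated in small-reward trials is therefore the signature of a reduced a, whereas a uniform upward shift would point to the intercept model.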

Fig 3. Effect of D1R/D2R blockade on incentive impact of reward size.


(A) Bars indicate normalized incentive impact (a) for each treatment condition under D1R blockade for monkeys KN and ST. The value was normalized by the value of the control condition. (B) Same as A, but for D2R blockade. (C) Relationship between incentive impact and occupancy for D1R (blue) and D2R blockade (red). Thick curves indicate LOESS of individual data (filled circles and triangles for monkeys KN and ST, respectively). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; LOESS, locally weighted smoothing.

For D2R blockade, the monkeys were tested with the task 15 minutes after a single injection of haloperidol (10 μg/kg, i.m., day 0), and they were then successively tested on the following days 1, 2, 3, 4, and 7. We also found an increase in refusal rates for D2R blockade in both monkeys: The refusal rates were highest on the day of haloperidol injection, after which they decreased over days (Fig 2E). Similar to the D1R blockade, the increases in refusal rate due to D2R blockade were explained solely by a decrease of parameter a according to the days following the treatment for both monkeys (Fig 2E and S1 Table; model #1 for both monkeys KN and ST). Our model-based analysis revealed that a decreased by about 40% on the day of haloperidol injection and the following 3 days as compared to control and then recovered to almost the control level by day 7 (Fig 3B).

To compare the effects of D1R and D2R blockade directly, we plotted changes in incentive impact against the degree of blockade, which was normalized across 3 monkeys (Fig 3C). In both D1R and D2R blockade experiments, a declined as occupancy increased; it gradually declined as D1R occupancy increased, but then increased at the highest occupancy, presenting a U-shaped tendency, whereas it steeply declined until 20% D2R occupancy and then continued to decrease slightly until 80% occupancy (Fig 3C). At 20% to 80% occupancy, the incentive impacts for D2R blockade stayed lower than those for D1R, suggesting a stronger sensitivity of incentive impact to D2R blockade.

We sought to verify that the effect of D2R antagonism was not specific to haloperidol and also to validate the comparison between D1R and D2R in terms of receptor occupancy. We examined the behavioral effect of another D2R antagonist, raclopride, at a dose yielding about 50% receptor occupancy (10 μg/kg, i.m.; S2A Fig). Following administration of this dose of raclopride in one monkey, refusal rates increased, which was explained by the inverse function with a = 5.2 (drop⁻¹), a value very similar to that observed at 50% D2R occupancy with haloperidol [a = 5.4 (drop⁻¹), day 1; S2B Fig]. Thus, the reduction of incentive impact (captured by a decrease in parameter a) was clearly associated with the degree of D2 receptor blockade regardless of the antagonist used.

Effects of D1R and D2R blockade on response speed

To evaluate the extent to which the influence of DAR manipulation in the reward-size task could affect another behavioral measure through a single motivational process, we examined RT modulations across trials. Consistent with previous studies using systemic administration of D1R or D2R antagonists (e.g., [49]), DAR blockade prolonged RTs in a treatment-dependent manner. For D1R blockade, RTs were increased according to the antagonist dose (2-way ANOVA, main effect of treatment, p < 1.0 × 10⁻¹³ for both monkeys, e.g., S3A–S3C Fig, see details in legend). D1R antagonism also tended to increase the proportion of late release (2-way ANOVA, main effect of treatment, p = 0.08, monkey KN; p = 0.0038, monkey ST, e.g., S3D Fig). A simple account of these effects of D1R manipulation on RT is that the modulations in RT across conditions are caused by changes in motivation, such that the positive impact of reward on behavior affects both whether monkeys perform the action (refusal rate) and how quickly they respond (RT). We reasoned that, if this were the case, then the intersession variability in RT and refusal rate should be correlated. A session-by-session analysis revealed that there was indeed a significant linear relationship between refusal rates and RTs in both monkeys, even when the treatment conditions were changed (Fig 4A and S2 Table). D2R blockade also prolonged RTs (main effect of treatment, p < 1.0 × 10⁻⁴, e.g., S3E–S3G Fig). D2R blockade did not change the refusal patterns (i.e., too early or late release) (2-way ANOVA, treatment, p = 0.31, e.g., S3H Fig). As with the case of D1R, there was a linear relationship between refusal rates and RTs across D2R antagonism sessions, in which treatment had no discernible effect on the steepness of the slope (Fig 4B and S2 Table).
Collectively, these results indicate that refusal rate and RT were similarly affected by DAR manipulation, in line with the concept that DAR affects a central process (motivation), which controls the influence of expected reward on both action selection and execution.

Fig 4. Relationship between refusal rate and RT in reward-size task.


(A) Relationship between refusal rate and average RT for each reward size, session by session, for D1R blocking in monkeys KN and ST. Colors indicate treatment condition. (B) Same as A, but for D2R blocking. Note that a simple linear regression (white line, model #1 in S2 Table) was selected as the best-fit model to explain the data. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; RT, reaction time.

Little influence of D1R or D2R blockade on hedonic impact of reward

The behavioral data shown above suggest that blockade of DAR attenuates the incentive effect of reward on behavior. To evaluate the impact of DAR blockade on other aspects of motivation and reward processing, we also examined the effect of DAR manipulation on hedonic processes, i.e., on how pleasant reward consumption was. In line with previous experiments in rodents [15,50], we did not find any effect of treatment with D1R or D2R antagonist on overall intake or sucrose preference in either of the 2 monkeys tested (S4A Fig; see legend). We also assessed blood osmolality, a physiological index of dehydration and thirst drive [51], before and after the preference test. Again, DAR treatment had no significant influence on overall osmolality or recovery of osmolality (rehydration) (S4B Fig; see legend). These results suggest that DAR blockade has little influence on the hedonic impact of reward. They also support the notion that the increased refusal rate was not directly due to a reduction of thirst drive.

In short, these results indicate that both D1R and D2R are involved in incentive motivation, i.e., in the positive influence of the expected reward size on behavior (refusal rate and RT), but not in the hedonic impact of reward. We next examined the influence of DAR manipulation on cost processing.

Differential effects of D1R and D2R blockades on workload and delay discounting

The trade-off between the reward and costs of obtaining the reward affects decision-making as well as motivation. Both humans and animals have the tendency to prefer immediate, smaller rewards over larger, but delayed rewards. The preference can be predicted by discounting the reward’s intrinsic value by the duration of the expected delay, an effect designated as “delay discounting” [52,53]. Discounting of the reward value also occurs in proportion to the predicted effort needed to obtain the rewards, an effect called “effort discounting” [54]. Delay and effort discounting are typically measured in choice tasks, providing the relative impact of costs on reward in decision-making. Previously, we measured the discounting effect of these costs on outcome value by quantifying the relation between the amount of expected cost and the change in operant, reward-directed behavior [55].

In this study, we used the same procedure to assess the effect of selective DAR blockade on cost-based motivation. For this purpose, we used a work/delay task (Fig 5A), where the basic features were the same as those in the reward-size task. There were 2 trial types. In the work trials, the monkeys had to perform 0, 1, or 2 additional instrumental actions to obtain a fixed amount of reward, and the cost (workload) scaled with the number of trials to perform. In the delay trials, after the monkeys correctly performed one instrumental trial, a reward was delivered 0 to 7 seconds later, such that the cost (delay) scaled with the time between action and reward delivery. Note that here, as in most natural conditions, greater workload is inherently associated with longer delays. Thus, in an attempt to isolate the effort component, we adjusted the delay to reward in delay trials based on the duration of the corresponding workload trials: Since the timing of the trials was matched between workload and delay trials, they differed only in the number of actions and therefore in the amount of effort. At the beginning of each trial, the cost (workload or delay) was indicated by a visual cue that lasted throughout the trial. As with the reward-size task, we used computational modeling to quantify the influence of cost information on behavior. We have shown that the monkeys exhibited linear relationships between refusal rate (E) and remaining costs (CU) for both work and delay trials, as follows:

E = k·CU + E0    (2)

Fig 5. Differential effects of D1R and D2R blockade on cost-based motivational valuation.


(A) The work/delay task. The sequence of events (left) and relationships between visual cues and trial schedule in the work trials (right top 3 rows) or delay duration in the delay trials (right bottom 3 rows) are shown. CU denotes the remaining (arbitrary) cost unit to get a reward, i.e., either remaining workload to perform trial(s) or remaining delay periods. (B) Schematic illustration of an explanatory model of increases in refusal rate by increasing cost sensitivity (k). (C) Effects of D1R blockade. Representative relationships between refusal rates (monkey KN; mean ± SEM) and remaining costs for workload (green) and delay trials (black). Saline control (Control) and moderate (30 μg/kg; MO) and high D1R occupancy treatment conditions (100 μg/kg; HO) are shown. Green and black lines are the best-fit lines for work and delay trials in model #1 in S3 Table, respectively. (D) Effects of D2R blockade. Nontreatment control (Control), moderate (1 day after haloperidol; MO) and high D2R occupancy treatment conditions (day of haloperidol; HO) are shown. Other conventions are the same as in C. (E) Comparison of effects between D1R and D2R blockade on delay discounting parameter (kd). Bars and symbols indicate mean and individual data, respectively. (F) Comparison of effects between D1R and D2R blockade on workload discounting parameter (kw). Asterisks represent significant difference (* p < 0.05, 1-way ANOVA with post hoc Tukey HSD test). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; HO, high occupancy; HSD, honestly significant difference; MO, moderate occupancy.

where k is a coefficient and E0 is an intercept [55] (Fig 5B). By extension of the reasoning and formulation used for the reward-size task (Eq 1), this linear effect suggests that the reward value is hyperbolically discounted by cost, where the coefficient k corresponds to the discounting factor.
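The step from Eq 1 to Eq 2 can be written out explicitly. The derivation below is a sketch under two assumptions that are not spelled out in this section: the refusal rate is inversely related to a discounted value V in the same way it is related to R in Eq 1, and V follows a hyperbolic form with a discounting parameter written here as k' (a symbol introduced only for this illustration).

```latex
\begin{align}
E &= \frac{1}{aV}, \qquad V = \frac{R}{1 + k'\,CU} \\
E &= \frac{1 + k'\,CU}{aR}
   = \underbrace{\frac{k'}{aR}}_{k}\,CU + \underbrace{\frac{1}{aR}}_{E_0}
\end{align}
```

With a fixed reward amount R, as in the work/delay task, the slope k of Eq 2 is proportional to the hyperbolic parameter k', so a steeper line corresponds to stronger discounting.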

We tested 3 monkeys (monkeys KN, MP, and ST) and measured refusal rate to infer delay and workload discounting. We confirmed that refusal rates of control condition increased as the remaining cost increased (e.g., Fig 5C, control). Fig 5B illustrates our hypothesis that DAR blockade increases cost sensitivity (i.e., discounting factor, k), which appears as an increase in refusal rate relative to remaining cost.
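A minimal sketch of estimating k and E0 in Eq 2 for one trial type and one condition is given below, using ordinary least squares on refusal rate against remaining cost. The function name, cost levels, and refusal rates are hypothetical, and this single-line fit stands in for, but does not reproduce, the linear mixed models reported in S3 Table.

```python
def fit_cost_line(cost_units, refusal_rates):
    """Ordinary least-squares fit of E = k*CU + E0; returns (k, e0)."""
    n = len(cost_units)
    mean_c = sum(cost_units) / n
    mean_e = sum(refusal_rates) / n
    sxx = sum((c - mean_c) ** 2 for c in cost_units)
    sxy = sum((c - mean_c) * (e - mean_e)
              for c, e in zip(cost_units, refusal_rates))
    k = sxy / sxx
    e0 = mean_e - k * mean_c
    return k, e0

# Hypothetical cost levels and noise-free refusal rates (made-up k and E0).
cost_units = [0, 1, 2]
control_rates = [0.02 + 0.04 * c for c in cost_units]
blocked_rates = [0.02 + 0.10 * c for c in cost_units]

k_control, e0_control = fit_cost_line(cost_units, control_rates)
k_blocked, e0_blocked = fit_cost_line(cost_units, blocked_rates)
print(f"control: k = {k_control:.3f}, E0 = {e0_control:.3f}")
print(f"blocked: k = {k_blocked:.3f}, E0 = {e0_blocked:.3f}")
```

In this toy example, the hypothesized effect of DAR blockade appears as a larger slope k with an unchanged intercept E0.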

To compare the effect of D1R versus D2R antagonism on cost sensitivity at the same degree of receptor blockade, we assessed the performance of the monkeys under 2 comparable levels of DAR occupancy for D1R and D2R: 50% occupancy (called “moderate occupancy” or MO) and 80% occupancy (“high occupancy” or HO). We also measured performance in the absence of treatment as control. According to the occupancy study (Fig 1), MO and HO conditions corresponded to pretreatment with 30 and 100 μg/kg of SCH23390 for D1R and 1 day after and the day of haloperidol treatment for D2R, respectively. Linear mixed models (LMMs) analysis verified the assumption that DAR blockade increased delay and workload discounting independently without considering the random effect of treatment condition or subject (Fig 5C and 5D and S3 Table; see Materials and methods). We found that delay discounting was significantly increased according to the degree of DAR blockade irrespective of receptor subtype (D1, F(2, 4) = 36.9, p = 0.0026; D2, F(2, 4) = 41.4, p = 0.0021; Fig 5E). Workload discounting (kw), on the other hand, was specifically increased by D2R blockade in an occupancy-dependent manner (1-way ANOVA, main effect of occupancy; D1, F(2, 4) = 0.125, p = 0.89; D2, F(2, 4) = 243.2, p = 6.6 × 10⁻⁵; Fig 5F).
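The reported p-values can be cross-checked from the F statistics alone. When the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(−df2/2), so no statistics library is needed. The function name below is hypothetical; the F values are those quoted in the text.

```python
def f_survival_df1_2(f_value, df2):
    """Upper-tail probability P(F > f_value) for an F(2, df2) distribution.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)

# F(2, 4) statistics reported in the text.
reported = {
    "delay, D1R": 36.9,
    "delay, D2R": 41.4,
    "workload, D1R": 0.125,
    "workload, D2R": 243.2,
}

for label, f_value in reported.items():
    print(f"{label}: F(2, 4) = {f_value} -> p = {f_survival_df1_2(f_value, 4):.2g}")
```

The values obtained this way agree with the reported p-values to the precision given in the text.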

In line with what we found in the reward-size task, D1R blockade did not have any significant effect on the linear relation between refusal rate and RT in either trial type (Fig 6A). Thus, the influence of D1R manipulation on behavior could readily be accounted for by a single variable, which affects both RT and refusal rate. By contrast, D2R blockade produced an occupancy-dependent increase in the steepness of the linear relation between RT and refusal rate in workload trials, but not in delay trials (Fig 6B). Thus, D2R manipulation had a distinct influence on RT and refusal rate in workload trials, suggesting that it was acting on behavior through a distinct motivational process such as overcoming effort costs (see Discussion).

Fig 6. Relationship between refusal rate and RT in work/delay task.


(A) Relationship between refusal rate and average RT for each remaining cost, session by session, for D1R blocking in delay and workload trials. Data are plotted individually for monkeys KN, MP, and ST, in order from top to bottom. Colors indicate treatment condition. Thick lines indicate linear regression lines. (B) Same as A, but for D2R blocking. Note that for the data in workload trials under D2R treatment, a linear model with random effect of condition (model #4 in S4 Table) was chosen as the best model to explain the data, whereas for the other data, a simple linear regression model (model #1, without any random effect, or model #2, with random effect of subject) was selected. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. CON, control; D1R, D1-like receptor; D2R, D2-like receptor; HO, high occupancy; MO, moderate occupancy; RT, reaction time.

Joint influences of D1R and D2R blockades on motivation

Considering the direct and indirect striatal output pathways where neurons exclusively express D1R and D2R, respectively, and the potential functional opposition between these pathways [56], we examined the effect of joint blockade of D1R and D2R. To facilitate the comparison of the influence of 2 receptors, we examined the behavioral effects of both D1R and D2R blockades at the same occupancy level. After treatment with both SCH23390 (100 μg/kg) and haloperidol (10 μg/kg), seemingly achieving approximately 80% of occupancy for both subtypes (Fig 1C and 1D), all monkeys virtually stopped performing the task: They only performed 1% to 13% of the trials compared to control conditions. When we treated the monkeys with SCH23390 (30 μg/kg) on the day following that of haloperidol injection (i.e., both D1R and D2R assumed to be occupied at approximately 50%), the monkeys had higher refusal rates in delay trials than control (Fig 7A, D1R+D2R block), such that discounting factor (kd) became significantly higher than in control conditions (p < 0.05, Tukey honestly significant difference [HSD] test; Fig 7B, delay). By contrast, this simultaneous D1R and D2R blockade appeared to attenuate the effect of D2R antagonism on workload: The refusal rates in work trials were not as high as in D2R blockade alone (Fig 7A), and the difference in workload discounting factor (kw) between treated and control or baseline conditions disappeared (p > 0.05; Fig 7B, workload). A similar tendency of counterbalancing influences of D1R and D2R blockade was also seen in the motivation for minimum cost trials (E0) (Fig 7B). These results suggest that blocking both receptor subtypes tends to induce a synergistic effect on delay discounting, while their effects on workload discounting cancel each other out.

Fig 7. Effect of both D1R and D2R blockades on cost evaluation for motivation.


(A) Representative relationship between refusal rates (in monkey KN; mean ± SEM) and remaining costs for workload (green) and delay trials (black). (B) Best-fit parameters, workload discounting (kw), delay discounting (kd), and intercept (E0) are plotted for each treatment condition. Bars and symbols indicate mean and individual data, respectively. D1R+D2R indicates the data obtained under both D1R and D2R blockades at MO, while D1R and D2R blockades at HO resulted in almost no correct performance (see text). All parameters are derived from the best fit of model #1 in S3 Table. Asterisks represent significant difference (*p < 0.05, 1-way ANOVA with post hoc Tukey HSD test). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; HO, high occupancy; HSD, honestly significant difference; MO, moderate occupancy.

Finally, since workload trials revealed a potential specific action of D2R, with a dissociation between refusal rate and RT effects, we examined the joint influence of D1R and D2R manipulations on the relation between these 2 behavioral measures. As shown in S5B Fig, when D1R and D2R were simultaneously blocked, the relationship between RT and refusal rate in work trials became closer to that in the control condition than to that under D2R antagonism alone, consistent with their impacts on refusal rate (Fig 7B). Therefore, even if D1R antagonist alone had little effect on workload sensitivity, it may be able to counteract the effect of D2R treatment under these conditions.

Discussion

Combining the PET occupancy study and pharmacological manipulation of D1R and D2R with quantitative measurement of motivation in monkeys, the current study demonstrated dissociable roles of the DA transmissions via D1R and D2R in the computation of the cost–benefits trade-off to guide action. To the best of our knowledge, this is the first study to directly compare the contribution of DA D1R and D2R along with the degree of receptor blockade. Using model-based analysis, we showed that DAR blockade had a clear quantitative effect on the sensitivity of animals to information about potential costs and benefits, without any qualitative effect on the way monkeys integrated costs and benefits and adjusted their behavior. We showed that blockade of D1R or D2R reduced the incentive impact of reward as the degree of DAR blockade increased, and the incentive impact was more sensitive to the D2R blockade than the D1R blockade at lower occupancy. In cost-discounting experiments, we could dissociate the relation between each DAR type and workload versus delay discounting: Workload discounting was increased exclusively by D2R antagonism, whereas delay discounting was increased by DAR blockade irrespective of receptor subtype. When both D1R and D2R were blocked simultaneously, the effects were synergistic and strengthened for delay discounting, while the effects were antagonistic and diminished for workload discounting. These results suggest that the action of DA is similar between incentive motivation and temporal discounting, but different for workload discounting.

DA controls the incentive effect of expected reward amount

Previous pharmacological studies have shown that DAR blockade decreased the speed of action and/or probability of engagement behavior [22,23]. However, previous studies did not measure the effect of DAR blockade on incentive motivation in multiple rewarding conditions, and, therefore, data describing the quantitative relationship among DAR stimulation, reward, and motivation are not available. In the present study, we used a behavioral paradigm that enabled us to formulate and quantify the relationship between reward and motivation [2] (Fig 2). Our finding, a reduction of incentive impact due to DAR antagonism (Fig 3), is in line with the incentive salience theory, that is, DA transmission attributes salience to incentive cue to promote goal-directed action [12]. The lack of effect of DA manipulation on satiety and spontaneous water consumption is consistent with previous studies in rodents [57,58]. Our results are also compatible with the idea that DA manipulation mainly influences incentive processes (influence of reward on action) but does not cause a general change of reward processing, which includes hedonic processes (evaluation itself, pleasure associated with consuming reward) [59,60], although further experiments would be necessary to address that point directly [12].

Our model-based analysis indicates that DAR blockade only had a quantitative influence (reduction of incentive impact of reward) without changing the qualitative relationship between reward size and behavior. This is in marked contrast to the reported effects of inactivation of brain areas receiving massive DA inputs, including the orbitofrontal cortex, rostromedial caudate nucleus, and ventral pallidum. Indeed, in experiments using nearly identical tasks and analysis, inactivation or ablation of these regions produced a qualitative change in the relationship between reward size and behavior (more specifically, a violation of the inverse relationship between reward size and refusal rates) [47,48,61]. Thus, the influence of DAR cannot be understood as a simple permissive or activating effect on target regions. The specificity of the DAR functional role is further supported by the subtle but significant difference between the behavioral consequences of blocking D1R versus D2R. By combining a direct measure of DAR occupancy and quantitative behavioral assessment, the present study demonstrates that the incentive impact of reward is more sensitive to D2R blockade than to D1R blockade, especially at a lower degree of occupancy (Fig 3C). Moreover, the relationship between occupancy and incentive impact was monotonic for D2R, but tended to be U-shaped for D1R. Although this U-shaped effect of D1R blockade was inferred solely based on the refusal rate of 2 monkeys without statistical support at the population level and was not found in the RT data, such non-monotonic effects have been repeatedly reported. For example, working memory performance and related neural activity in the prefrontal cortex take the form of an “inverted-U” shaped curve, where too little or too much D1R activation impairs cognitive performance [62–64].
As for the mechanisms underlying the distinct functional relation between the behavioral effects of D1R versus D2R blockade, it is tempting to speculate that this is related to a difference in their distribution, their affinity, and the resulting relation with phasic versus tonic DA action. Indeed, DA affinity for D2R is approximately 100 times higher than that for D1R [65]. This is directly in line with the stronger effect of D2R antagonists at low occupancy levels. Moreover, in the striatum, a basal DA concentration of approximately 5 to 10 nM is sufficient to constantly stimulate D2R. Using available biological data, a recent simulation study showed that the striatal DA concentration produced by the tonic activity of DA neurons (approximately 40 nM) would occupy 75% of D2R but only 3.5% of D1R [66]. Thus, blockade of D2R at low occupancy may interfere with tonic DA signaling, whereas D1R occupancy would only be related to phasic DA action, i.e., when transient but massive DA release occurs (e.g., in response to critical information about reward). We acknowledge that this remains very hypothetical, but irrespective of the underlying mechanisms, our data clearly support the idea that DA action on D1R versus D2R exerts distinct actions on their multiple targets to enhance incentive motivation.
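As a rough consistency check on the figures quoted above, the following sketch backs out the dissociation constants implied by the stated occupancies (75% of D2R and 3.5% of D1R at approximately 40 nM). It assumes simple single-site equilibrium binding, occupancy = C / (C + Kd), which is an assumption made here for illustration; the cited simulation [66] may rest on a more detailed model. The function name is hypothetical.

```python
def implied_kd(concentration_nm, fractional_occupancy):
    """Kd (nM) implied by occupancy = C / (C + Kd).

    Assumes single-site equilibrium binding (an illustrative assumption,
    not necessarily the model used in the cited simulation study).
    """
    return concentration_nm * (1.0 - fractional_occupancy) / fractional_occupancy


# Values quoted in the text: tonic DA of approximately 40 nM,
# occupying 75% of D2R and 3.5% of D1R.
kd_d2 = implied_kd(40.0, 0.75)
kd_d1 = implied_kd(40.0, 0.035)
affinity_ratio = kd_d1 / kd_d2

print(f"implied Kd for D2R: {kd_d2:.1f} nM")
print(f"implied Kd for D1R: {kd_d1:.1f} nM")
print(f"D2R/D1R affinity ratio: {affinity_ratio:.1f}")
```

Under this simplification, the implied affinity ratio is on the order of 80-fold, which is the same order of magnitude as the approximately 100-fold difference cited from [65].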

DA transmission via D1R and D2R distinctively controls cost-based motivational process

Although many rodent studies have demonstrated that attenuation of DA transmission alters not only benefit but also cost-related decision-making, the exact contribution of D1R and D2R remains elusive. For example, reduced willingness to exert physical effort to receive higher reward was similarly found following D1R and D2R antagonism in many rodent studies [24,26,59], while it was observed exclusively by D2 antagonism in other studies [21,25]. This inconsistency may have 2 reasons. First, previous studies usually investigated the effect of antagonism on D1R and D2R along with a relative pharmacological concentration (e.g., low and high doses). In the present study, PET-assessed DAR manipulation allowed us to directly compare the behavioral effect between D1R and D2R with an objective reference, namely occupancy (i.e., approximately 50% and approximately 80% occupancy). Second, the exact nature of the cost (effort versus delay) has sometimes been difficult to identify, and effort manipulation is often strongly correlated with reward manipulation (typically when the amount of reward earned is instrumentally related to the amount of effort exerted; see [10]). Here, using a task manipulating forthcoming workload independently from reward value, we demonstrated that blockade of D2R, but not D1R, increased workload discounting in an occupancy-dependent manner while maintaining linearity (Fig 5F).

Delay discounting and impulsivity (the tendency associated with excessive delay discounting) are also thought to be linked to the DA system [67,68]. Systemic administration of D1R or D2R antagonist increases preference for immediate small rewards, rather than larger and delayed rewards [25,32–34]. However, some of these studies also showed negative effects of D1R [34] or D2R blockade [33] on impulsivity. These inconsistencies may be attributed to the differences in behavioral paradigms or drugs (and doses) used. Our PET-assessed DAR manipulation demonstrated that blockade of D1R and D2R at the same occupancy level (approximately 50% and approximately 80%) similarly increased delay discounting (Fig 5E), suggesting that DA transmission continuously adjusts delay discounting at the postsynaptic site. This observation is in good accord with the previous finding that increasing DA transmission decreases temporal discounting, e.g., amphetamine or methylphenidate increased the tendency to choose long-delay options for larger rewards [32–34,69,70].

In sharp contrast to incentive motivation and delay discounting, which involve both D1R and D2R, the following 3 observations illustrate a unique mechanism of DA action on workload discounting through D2R only. First, workload discounting increased only with D2R antagonism (Fig 5F). Second, the occupancy-dependent effect of D2R antagonism on the decision–response time relationship was only seen in workload trials (Fig 6). Third, D1R and D2R had a synergistic effect in the delay discounting trials, but an antagonistic effect in the workload discounting trials (Fig 7). These results extend previous studies demonstrating increased effort discounting by D2R blockade [25,71]. Besides, our observation that blocking D2R increased refusal rates without slowing down responses (Fig 6B) emphasizes the role of DA in effort-based decision-making and supports the notion that DA activation allows overcoming effort costs [21]. This is in apparent contrast to neurophysiological and voltammetry studies that show a lack of sensitivity of DA release to effort, but it supports the idea that DA function requires the integration of receptor action on top of neuronal activity and release patterns [10,17].

This differential relation between DA and delay versus workload might be related to the differential expression of these receptors in the direct versus indirect striatopallidal pathway, where the striatal neurons exclusively express D1R and D2R, respectively [72]. Opposing functions between these pathways have been proposed: Activity of the direct pathway (D1R) neurons reflects positive rewarding events promoting movement, whereas activity of the indirect pathway (D2R) neurons is related to negative values mediating aversion or inhibiting movements [56,73,74] (but see [75]). DA increases the excitability of direct pathway neurons, and this effect was reduced by D1R antagonism, decreasing motor output. DA reduces the responsiveness of indirect pathway neurons via D2R [72], and blockade of D2R would increase the activity, reducing motor output via decreased thalamocortical drive [76]. Interestingly, a neural network model has been proposed by considering these opposing DA functions of direct/indirect circuit embedded in reinforcement learning framework, successfully explaining the enhancement of effort cost due to D2R blockade [77]. This scenario might also explain our finding of a synergistic effect of simultaneous D1R and D2R blockade on delay discounting. Further work would be necessary to clarify this hypothesis, including the dynamic relation with tonic versus phasic DA release, but altogether, these data strongly support the concept that distinct neurobiological processes underlie benefits (reward availability) and costs (energy expenditure).

Limitations of the current study

Several limitations of the current study and areas for further research should be noted. First, there were relatively large individual differences in the estimated D2R occupancy by haloperidol among the 3 monkeys (Fig 1D), which could reflect individual variance of haloperidol metabolism and/or elimination. However, the time course of the recovery of D2R occupancy was relatively consistent across subjects, being in line with that of behavioral change. Haloperidol induced long-lasting occupancy for several days, thereby potentially causing unexpected long-term changes, such as synaptic plasticity. Although we cannot eliminate the potential effects of plastic change, a comparable behavioral impact was also observed after raclopride administration, which would induce short-term occupancy (S2 Fig), supporting the view that blockade of DAR reduced motivation in an occupancy-dependent manner. Second, because the antagonists were administered systemically, the current study could not determine which brain area(s) is responsible for antagonist-induced alterations of benefit- and cost-based motivation. While our data support the notion that distinct neural networks are involved in workload and delay discounting, further study (e.g., local infusion of DA antagonist) is needed to identify the locus of the effects, generalizing our findings to unravel the circuit and molecular mechanism of motivation. We should also note that the current study does not address dynamic learning paradigms, and, therefore, our findings cannot be directly generalized to the function of the DA system in learning. Despite these limitations, the current study provides unique insights into the role of the DA system in the motivational process.

Conclusions

In summary, the present study demonstrates a dissociation between the functional role of DA transmission via D1R and D2R in benefit- and cost-based motivational processing. DA transmissions via D1R and D2R modulate both the incentive impact of reward size and the negative influence of delay. By contrast, workload discounting is regulated exclusively via D2R, since D1R blockade alone had no apparent effect. In addition, D1R and D2R had synergistic roles in delay discounting but opposite roles in workload discounting. These dissociations indicate different underlying mechanisms of DA on motivation, which can be attributed to differential involvement of the direct and indirect striatofugal pathways. Together, our findings add an important aspect to our current knowledge concerning the role of DA signaling in motivation based on the trade-off between costs and benefits, thus providing an advanced framework for understanding the pathophysiology of psychiatric disorders.

Materials and methods

Ethics statement

All surgical and experimental procedures were approved by the Animal Care and Use Committee of the National Institutes for Quantum and Radiological Science and Technology (#09–1035) and were in accordance with the Institute of Laboratory Animal Research Guide for the Care and Use of Laboratory Animals.

Subjects

A total of 9 male adult macaque monkeys (8 Rhesus and 1 Japanese; 4.6 to 7.7 kg) were used in this study. All monkeys were individually housed. Food was available ad libitum, and motivation was controlled by restricting access to fluid to experimental sessions, when water was delivered as a reward for performing the task. Animals received water supplementation whenever necessary (e.g., if they could not obtain enough water during experiments), and they had free access to water whenever testing was interrupted for more than a week. For environmental enrichment, play objects and/or small foods (fruits, nuts, and vegetables) were provided daily in the home cage.

Drug treatment

All drugs used in this study, SCH23390 (Sigma-Aldrich, St. Louis, MO), haloperidol (Dainippon Sumitomo Pharma, Japan), and raclopride (Sigma-Aldrich), were dissolved or diluted in 0.9% saline solution and injected intramuscularly (i.m.). Animals were pretreated with an injection of SCH23390 (10, 30, 50, or 100 μg/kg), haloperidol (10 μg/kg), or raclopride (10 or 30 μg/kg) 15 minutes before the beginning of behavioral testing or PET scan. In behavioral testing, saline was injected as a vehicle control by the same procedure as the drug treatment. The administered volume was 1 mL across all experiments with each monkey.

Surgery

Four monkeys underwent surgery to implant a head-hold device for the PET study using aseptic techniques [78]. We monitored body temperature, heart rate, SpO2, and tidal CO2 throughout all surgical procedures. Monkeys were immobilized by i.m. injection of ketamine (5 to 10 mg per kg) and xylazine (0.2 to 0.5 mg per kg) and intubated with an endotracheal tube. Anesthesia was maintained with isoflurane (1% to 3%, to effect). The head-hold device was secured with plastic screws and dental cement over the skull. After surgery, prophylactic antibiotics and analgesics were administered. The monkeys were habituated to sit in a primate chair with their heads fixed for approximately 30 minutes for more than 2 weeks.

PET procedure and occupancy measurement

Four monkeys were used in the measurement. PET measurements were performed with 2 PET ligands: [11C]SCH23390 (for studying D1R binding) and [11C]raclopride (for studying D2R binding). The injected radioactivities of [11C]SCH23390 and [11C]raclopride were 91.7 ± 6.0 MBq (mean ± SD) and 87.0 ± 4.9 MBq, respectively. Specific radioactivities of [11C]SCH23390 and [11C]raclopride at the time of injection were 86.2 ± 40.6 GBq/μmol and 138.2 ± 70.1 GBq/μmol, respectively. All PET scans were performed using an SHR-7700 PET scanner (Hamamatsu Photonics, Japan) with the monkeys conscious and seated in a chair. After transmission scans for attenuation correction using a 68Ge–68Ga source, a dynamic scan in three-dimensional (3D) acquisition mode was performed for 60 minutes ([11C]SCH23390) or 90 minutes ([11C]raclopride). The ligands were injected via crural vein as a single bolus at start of the scan. All emission data were reconstructed with a 4.0-mm Colsher filter. Tissue radioactive concentrations were obtained from VOIs placed on several brain regions where DARs are relatively abundant: caudate nucleus, putamen, nucleus accumbens (NAcc), thalamus, hippocampus, amygdala, parietal cortex, principal sulcus (PS), dorsolateral prefrontal cortex (dlPFC), and ventrolateral prefrontal cortex (vlPFC), as well as the cerebellum (as reference region). Each VOI was defined on individual T1-weighted axial magnetic resonance (MR) images (EXCELART/VG Pianissimo at 1.0 tesla, Toshiba, Japan) that were co-registered with PET images using PMOD image analysis software (PMOD Technologies, Switzerland). Regional radioactivity of each VOI was calculated for each frame and plotted against time. Regional binding potentials relative to non-displaceable radioligands (BPND) of D1R and D2R were estimated with a simplified reference tissue model on VOI and voxel-by-voxel bases [79–81]. The monkeys were scanned with and without drug treatment condition on different days.

Occupancy levels were determined from the degree of reduction (%) of BPND by antagonists [82]. DAR occupancy was estimated as follows:

$$\mathrm{Occupancy}(\%) = \left(1 - \frac{BP_{ND,\,\mathrm{Treatment}}}{BP_{ND,\,\mathrm{Baseline}}}\right) \times 100 \tag{3}$$

where BPND Baseline and BPND Treatment are BPND measured without (baseline) and with an antagonist, respectively. Relationship between D1R occupancy (D1Occ) and dose of SCH23390 (Dose) was estimated with 50% effective dose (ED50) as follows:

$$\mathrm{D1Occ}(\%) = \frac{100 \times \mathrm{Dose}}{ED_{50} + \mathrm{Dose}} \tag{4}$$

Relationship between D2R occupancy (D2Occ) and days after haloperidol injection was estimated using the level at day 0 with a decay constant (λ) as follows:

$$\mathrm{D2Occ}(\%) = \mathrm{Occ}_{\mathrm{Day0}} \times e^{-\lambda \cdot \mathrm{Day}} \tag{5}$$
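The three relations above (Eqs 3 to 5) can be sketched as small functions. This is only an illustration: the function names are hypothetical, and the numerical inputs below are made-up values, not measurements from this study.

```python
import math


def occupancy_percent(bp_baseline, bp_treatment):
    """Eq 3: percent reduction of BP_ND relative to baseline."""
    return (1.0 - bp_treatment / bp_baseline) * 100.0


def d1_occupancy(dose, ed50):
    """Eq 4: D1R occupancy (%) as a saturating function of SCH23390 dose."""
    return 100.0 * dose / (ed50 + dose)


def d2_occupancy(occ_day0, decay_constant, day):
    """Eq 5: exponential decay of D2R occupancy (%) over days after injection."""
    return occ_day0 * math.exp(-decay_constant * day)


# Hypothetical inputs for illustration only (not data from the study).
print(occupancy_percent(bp_baseline=2.0, bp_treatment=0.5))
print(d1_occupancy(dose=30.0, ed50=30.0))
print(d2_occupancy(occ_day0=80.0, decay_constant=math.log(2), day=1))
```

By construction, a dose equal to ED50 gives 50% occupancy in Eq 4, and a decay constant of ln 2 per day halves the occupancy each day in Eq 5.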

Behavioral tasks and testing procedures

Three monkeys (ST, 6.4 kg; KN, 6.3 kg; MP, 7.3 kg) were used for the behavioral study. For all behavioral training and testing, each monkey sat in a primate chair inside a sound-attenuated dark room. Visual stimuli were presented on a computer video monitor in front of the monkey. Behavioral control and data acquisition were performed using the REX program. Neurobehavioral Systems Presentation software was used to display visual stimuli (Neurobehavioral Systems, www.neurobs.com). We used 2 types of behavioral tasks, reward-size task and work/delay task, as described previously [2,51]. Both tasks consisted of color discrimination trials (see Figs 2A and 5A). Each trial began when the monkey touched a bar mounted at the front of the chair. The monkey was required to release the bar between 200 and 1,000 ms after a red spot (wait signal) turned green (go signal). In correctly performed trials, the spot then turned blue (correct signal). A visual cue was presented at the beginning of each color discrimination trial (500 ms before the red spot appears).

In the reward-size task, a reward of 1, 2, 4, or 8 drops of water (1 drop = approximately 0.1 mL) was delivered immediately after the blue signal. Each reward size was selected randomly with equal probability. The visual cue presented at the beginning of the trial indicated the number of drops for reward (Fig 2A).

In the work/delay task, a water reward (approximately 0.25 mL) was delivered after each correct signal immediately or after an additional 1 or 2 instrumental trials (work trial) or after a delay period (delay trials). The visual cue indicated the combination of the trial type and requirement to obtain a reward (Fig 5A). Pattern cues indicated the delay trials with the timing of reward delivery after a correct performance: either immediately (0.3 seconds, 0.2 to 0.4 seconds; mean, range), short delay (3.6 seconds, 3.0 to 4.2 seconds), or long delay (7.2 seconds, 6.0 to 8.4 seconds). Gray scale cues indicated work trials with the number of trials the monkey would have to perform to obtain a reward. We set the delay durations to be equivalent to the duration for 1 or 2 color discrimination trials, so that we could directly compare the cost of 1 or 2 arbitrary units (cost unit; CU).

If the monkey released the bar before the green target appeared or within 200 ms after the green target appeared or failed to respond within 1 second after the green target appeared, we regarded the trial as a “refusal trial”; all visual stimuli disappeared, the trial was terminated immediately, and after the 1-second intertrial interval, the trial was repeated. Our behavioral measurement for the motivational value of outcome was the proportion of refusal trials. Before each testing session, the monkeys were subject to approximately 22 hours of water restriction in their home cage. Each session continued until the monkey would no longer initiate a new trial (usually less than 100 minutes).
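The refusal criterion described above can be sketched as follows. The function names and the example release times are hypothetical, and treating the 200-ms and 1,000-ms boundaries as inclusive is an assumption, since the text does not specify how exact boundary values were handled.

```python
def classify_release(release_ms):
    """Classify a bar release relative to the onset of the green target.

    release_ms: release time in ms after the green target appeared
                (negative if released before it appeared), or None if the
                monkey did not release the bar at all.
    Boundaries (200 and 1000 ms) are treated as correct: an assumption.
    """
    if release_ms is None or release_ms > 1000:
        return "late"
    if release_ms < 200:
        return "early"
    return "correct"


def refusal_rate(release_times_ms):
    """Proportion of trials that were refused (early or late release)."""
    outcomes = [classify_release(t) for t in release_times_ms]
    refusals = sum(outcome != "correct" for outcome in outcomes)
    return refusals / len(outcomes)


# Hypothetical release times (ms) for eight trials of one trial type.
example = [350, 150, -50, None, 600, 1200, 450, 800]
print(refusal_rate(example))
```

In this made-up example, four of the eight trials fall outside the response window, so the refusal rate for that trial type is 0.5.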

Before this experiment, all monkeys had been trained to perform color discrimination trials in cued multi-trial reward schedule task [43] for more than 3 months. The monkeys were tested with the reward-size task and work/delay task for more than 2 months as training to become familiar with the cueing condition.

Each monkey was tested from Monday to Friday. Treatment with SCH23390 was performed every 4 or 5 days. On other days without SCH23390, sessions with saline (1 mL) treatment were analyzed as control sessions. Haloperidol was given every 2 or 3 weeks on Monday or Tuesday, because D2R occupancy persisted for several days after a single dose of haloperidol treatment (Fig 1D). The days before haloperidol treatment were analyzed as control sessions. Each dose of SCH23390 or a single dose of haloperidol was tested 4 or 5 times with the reward-size task and at least 3 times with the work/delay task per each animal.

Sucrose preference test

Two monkeys (RO, 5.8 kg; KY, 5.6 kg) were used for the sucrose preference test. The test was performed in their home cages once a week. In advance of the test, water access was prevented for 22 hours. The monkeys were injected with SCH23390 (30 μg/kg), haloperidol (10 μg/kg), or saline 15 minutes before the sucrose preference test. Two bottles containing either 1.5% sucrose solution or tap water were set into bottle holders in the home cage, and the monkeys were allowed to freely consume fluids for 2 hours. The total amount of sucrose (SW) and tap water (TW) intake was measured and calculated as sucrose preference index (SP) as follows: SP = (SW − TW) / (SW + TW). The position of sucrose and tap water bottles (right or left toward the front panel of the home cage) was counterbalanced across sessions and monkeys. Drugs or saline were injected alternately once a week. We also measured the osmolality level in blood samples (1 mL) obtained immediately before and after each testing session.
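The preference index defined above ranges from −1 (only tap water consumed) to +1 (only sucrose solution consumed). A minimal sketch, with a hypothetical function name and made-up intake volumes:

```python
def sucrose_preference(sucrose_intake, water_intake):
    """Sucrose preference index: SP = (SW - TW) / (SW + TW)."""
    total = sucrose_intake + water_intake
    if total == 0:
        # The index is undefined when nothing was consumed.
        raise ValueError("no fluid consumed; preference index is undefined")
    return (sucrose_intake - water_intake) / total


# Hypothetical intake volumes in mL (not data from the study).
print(sucrose_preference(sucrose_intake=150.0, water_intake=50.0))
```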

Behavioral data analysis

All data and statistical analyses were performed using the R statistical computing environment (www.r-project.org). The average error rate for each trial type was calculated for each daily session, with the error rates in each trial type being defined as the number of error trials divided by the total number of trials of that given type. The monkeys sometimes made many errors at the beginning of the daily session, probably due to high motivation/impatience; we excluded the data until the first successful trial in these cases. A trial was considered an error trial if the monkey released the bar either before or within 200 ms after the appearance of the green target (early release) or failed to respond within 1 second after the green target (late release). We did not distinguish between the 2 types of errors and used their sum except for the error pattern analysis. We performed repeated-measures ANOVAs to test the effect of treatment × reward size (for data in reward-size task) on RT and on late release rate (i.e., error pattern). Post hoc comparisons were performed using Tukey HSD test, and a priori statistical significance was set at α = 0.05.

We used the refusal rates to estimate the level of motivation because the refusal rates of these tasks (E) are inversely related to the value for action [2]. In the reward-size task, we used the inverse function (Eq 1). We fitted the data to LMMs [83], in which the random effects across DAR blockade conditions on parameter a and/or intercept e (Fig 2C) were nested. Model selection was based on Bayesian information criterion (BIC), an estimator of in-sample prediction error for the nested models (S1 Table). Using the selected model, the parameter a was estimated individually and then normalized by the value in nontreated condition (control, CON) (Fig 3A and 3B).
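To illustrate how the incentive-impact parameter a could be estimated and normalized, the sketch below fits an inverse function by ordinary least squares. Eq 1 is not reproduced in this section, so the form E = 1/(aR) + e is an assumption made here, and the plain least-squares fit is a simplification that ignores the random effects of the LMMs described above. The function name and data are hypothetical.

```python
def fit_inverse(reward_sizes, refusal_rates):
    """Least-squares fit of E = 1/(a*R) + e (assumed form of Eq 1).

    Substituting x = 1/R makes the model linear: E = (1/a) * x + e.
    Returns (a, e).
    """
    xs = [1.0 / r for r in reward_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_e = sum(refusal_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxe = sum((x - mean_x) * (e - mean_e) for x, e in zip(xs, refusal_rates))
    slope = sxe / sxx  # slope equals 1/a
    return 1.0 / slope, mean_e - slope * mean_x


# Hypothetical refusal rates for 1, 2, 4, and 8 drops (not study data).
rewards = [1, 2, 4, 8]
a_control, e_control = fit_inverse(rewards, [0.22, 0.12, 0.07, 0.045])
a_treated, e_treated = fit_inverse(rewards, [0.42, 0.22, 0.12, 0.07])

# Normalized incentive impact: treated value relative to control (CON).
print(a_treated / a_control)
```

With these made-up values, the treated condition has half the incentive impact of the control condition, which is the kind of ratio the normalization by the CON value expresses.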

In the work/delay task, we used linear models to estimate the effect of remaining cost, i.e., workloads and delay, as described previously [55],

$$E_w = k_w \, CU + E_0 \tag{6}$$
$$E_d = k_d \, CU + E_0 \tag{7}$$

where Ew and Ed are the error rates, and kw and kd are workload discounting and delay discounting parameters, respectively. CU is the number of remaining CUs, and E0 is the intercept. We used LMMs to estimate the effect of DAR blockade on the discounting parameters. We imposed the constraint that the intercept (E0) has the same value across trials and assumed the base statistical model in which the random effects of the 2 trial types (delay and workload) affect the regression coefficients independently. Four models were nested to consider the presence or absence of random effects, random effects of treatment conditions, and subjects (S3 Table). The best model was selected based on BIC for the entire data set, which is the sum of the regression results for each unit faceted by individual and/or treatment condition. For example, model #1 was fit to a total of 18 data sets (3 monkeys × 3 treatment conditions [CON, MO, and HO] × 2 subtypes [D1R and D2R]), and then BIC was calculated by the sum of each fitting. Modeling was performed with the lme4 package in R, and the parameters (e.g., kw and kd) were estimated from the model. We performed 1-way ANOVAs to test the significance of the effect of treatment on discounting parameters with post hoc Tukey HSD test.
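The shared-intercept constraint in Eqs 6 and 7 can be illustrated with a plain least-squares fit for a single data set (one monkey and one treatment condition). This is a simplified sketch, not the lme4 mixed-model analysis used in the study: it has no random effects, and the function name and data are hypothetical.

```python
def fit_shared_intercept(work, delay):
    """Least-squares fit of E_w = k_w*CU + E_0 and E_d = k_d*CU + E_0.

    work, delay: lists of (remaining_cost_units, refusal_rate) pairs.
    The two trial types have separate slopes but one common intercept.
    Each list must include at least one trial with CU > 0.
    Returns (k_w, k_d, e_0).
    """
    def sums(points):
        s_c = sum(c for c, _ in points)        # sum of CU
        s_cc = sum(c * c for c, _ in points)   # sum of CU^2
        s_ce = sum(c * e for c, e in points)   # sum of CU * E
        s_e = sum(e for _, e in points)        # sum of E
        return s_c, s_cc, s_ce, s_e

    w_c, w_cc, w_ce, w_e = sums(work)
    d_c, d_cc, d_ce, d_e = sums(delay)
    n = len(work) + len(delay)

    # Eliminate k_w and k_d from the normal equations, then solve for E_0.
    denom = n - w_c ** 2 / w_cc - d_c ** 2 / d_cc
    e_0 = (w_e + d_e - w_c * w_ce / w_cc - d_c * d_ce / d_cc) / denom
    k_w = (w_ce - e_0 * w_c) / w_cc
    k_d = (d_ce - e_0 * d_c) / d_cc
    return k_w, k_d, e_0


# Hypothetical refusal rates at 0, 1, and 2 remaining cost units.
work_trials = [(0, 0.05), (1, 0.15), (2, 0.25)]
delay_trials = [(0, 0.05), (1, 0.25), (2, 0.45)]
k_w, k_d, e_0 = fit_shared_intercept(work_trials, delay_trials)
print(k_w, k_d, e_0)
```

For these noise-free example points, the fit recovers the generating values (k_w = 0.1, k_d = 0.2, E_0 = 0.05) up to floating-point error; with real data, the discounting factors would then be compared across treatment conditions.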

LMMs were also applied for the correlation analysis between refusal rate (E) and reaction time (Rt) (Figs 4 and 6 and S5), where 4 statistical models were nested to take into account the presence or absence of random effects of subjects and treatment conditions, and the best-fit model was selected based on BIC (S2, S4 and S5 Tables).

Supporting information

S1 Table. Model comparison for the effect of DAR blockade on refusal rates in reward-size task (for Fig 2).

a(cond) and e(cond) indicate the random effects of DAR blocking treatment conditions on parameters a and e, respectively. BIC is a relative measure of quality for the models (#1–4). ΔBIC denotes difference from minimum BIC. BIC, Bayesian information criterion; DAR, DA receptor.

(PDF)

S2 Table. Model comparison for the effect of DAR blockade on the relationship between refusal rate and RT in reward-size task (for Fig 4).

(Rt|*) indicates random effects on regression parameters. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. DAR, DA receptor.

(PDF)

S3 Table. Model comparison for the effect of DAR blockade on refusal rates in work/delay task (for Fig 5).

CU and E0 indicate remaining cost and intercept, respectively. (0+ CU|*) and (CU|*) indicate random effects on regression coefficient alone or on both regression coefficient and intercept (E0), respectively. E, refusal rate; type, trial type (delay or work); cond, treatment condition (CON, MO, and HO for D1R and D2R blocking); monkey, subject. CON, control; D1R, D1-like receptor; D2R, D2-like receptor; DAR, DA receptor; HO, high occupancy; MO, moderate occupancy.

(PDF)

S4 Table. Model comparison for the effect of DAR blockade on the relationship between refusal rate and RT in work/delay task (for Fig 6).

(Rt|*) indicates random effects on regression parameters. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. DAR, DA receptor.

(PDF)

S5 Table. Model comparison for the effect of both D1R and D2R blockades on the relationship between refusal rate and RT in work/delay task (for S5 Fig).

(Rt|*) indicates random effects on regression coefficient. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. D1R, D1-like receptor; D2R, D2-like receptor.

(PDF)

S1 Fig. Occupancy estimation.

Example of occupancy estimation based on modified Lassen plot of [11C]SCH23390 PET data obtained from monkey DO. Colored dots represent the relationship between decreased specific binding [i.e., BPND (baseline)–BPND (blocking)] and baseline [BPND (baseline)] for each brain region under each blocking condition (indexed by color). Occupancy was determined as a proportion of reduced specific binding to baseline, which corresponds to the slope of linear regression. In this case, D1 occupancy was 80%, 78%, 67%, and 26% for 100, 50, 30, and 10 μg/kg doses, respectively. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. PET, positron emission tomography.

(PDF)
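The occupancy estimate described in the S1 Fig caption can be sketched as a linear regression of the decrease in specific binding on baseline binding across regions, with the slope read out as occupancy. This is a minimal sketch, not the authors' analysis code: the regional BPND values are hypothetical, the function name is invented for illustration, and fitting a free intercept (rather than forcing the line through the origin) is an assumption.

```python
import numpy as np


def estimate_occupancy(bp_baseline, bp_blocking):
    """Return (slope, intercept) of the regression of
    [BPND(baseline) - BPND(blocking)] on BPND(baseline) across regions.
    The slope is interpreted as fractional receptor occupancy."""
    bp_baseline = np.asarray(bp_baseline, dtype=float)
    bp_blocking = np.asarray(bp_blocking, dtype=float)
    delta = bp_baseline - bp_blocking
    slope, intercept = np.polyfit(bp_baseline, delta, 1)
    return slope, intercept


# Hypothetical regional BPND values (one entry per brain region).
baseline = np.array([2.0, 1.5, 1.0, 0.5])
# Simulate a blocking scan in which 67% of specific binding is lost in every region.
blocking = baseline * (1.0 - 0.67)

occupancy, intercept = estimate_occupancy(baseline, blocking)
print(f"estimated occupancy: {occupancy:.0%}")
```

With real data the points scatter around the line, so the slope summarizes occupancy across regions instead of relying on any single region.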

S2 Fig. Comparable effects of D2R antagonism between raclopride and haloperidol at similar occupancy.

(A) Occupancy of D2R measured at striatal ROI is plotted against dose of raclopride. (B) Error rates as a function of reward size for control (black) and after injection of raclopride (10 μg/kg, i.m.; left side) and haloperidol (10 μg/kg, i.m.; right side) in monkey KN are plotted. Dotted curves are best-fit inverse functions (model #1 in S1 Table). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D2R, D2-like receptor; ROI, region of interest.

(PDF)

S3 Fig. Effect of D1R/D2R blockade on RT and error pattern.

(A, B) Cumulative distribution of RT for control and D1R blockade conditions in drop-1 and drop-8 trials, respectively. (C) Mean RT as a function of reward size for control and D1R blockade conditions. Two-way ANOVA, reward × condition; main effect of condition, F(4, 164) = 109.8, p < 1.0 × 10⁻¹⁵; main effect of reward, F(3, 164) = 111.0, p < 10⁻¹⁵; interaction, F(12, 164) = 4.7, p < 1.0 × 10⁻⁵. (D) Late release rate (mean ± SEM) as a function of reward size for control and D1R blockade conditions. Two-way ANOVA, reward × condition; main effect of condition, F(4, 163) = 18.6, p < 1.0 × 10⁻¹¹; main effect of reward, F(3, 163) = 9.8, p < 10⁻⁵; interaction, F(12, 163) = 1.0, p = 0.4. (E–H) Same as (A–D), but for D2R blockade. RT: main effect of condition, F(6, 92) = 7.2, p < 1.0 × 10⁻⁵; main effect of reward, F(3, 92) = 81.9, p < 10⁻¹⁵; interaction, F(18, 164) = 0.6, p = 0.65. Late release rate: main effect of condition, F(6, 90) = 3.5, p = 0.0038; main effect of reward, F(3, 90) = 19.2, p < 10⁻⁹; interaction, F(18, 90) = 1.4, p = 0.14. * significantly different from control, p < 0.05, post hoc Tukey HSD. Data were obtained from monkey ST. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; HSD, honestly significant difference; RT, reaction time.

(PDF)

S4 Fig. Little influence of DAR blockade on sucrose preference and blood osmolality.

(A) Sucrose preference index after administration of saline (Control), SCH23390 (30 μg/kg; D1), and haloperidol (10 μg/kg; D2, day 0), respectively. There was no significant effect of DAR blockade on overall intake (1-way ANOVA, treatment, monkey KY, F(2, 8) = 1.26, p = 0.33; monkey RO, F(2, 14) = 2.01, p = 0.17) or sucrose preference (1-way ANOVA; treatment, monkey KY, F(2, 8) = 1.62, p = 0.26; monkey RO, F(2, 8) = 1.38, p = 0.31). (B) Blood osmolality measured in serum samples obtained before (pre) and after (post) sucrose test. There was no significant impact of DAR blockade (2-way ANOVA, monkey KY, main effect of treatment, F(2, 10) = 4.0, p = 0.056; pre-post, F(1, 10) = 93.83, p = 2.1 × 10⁻⁶; interaction, F(2, 10) = 0.74, p = 0.50; monkey RO, treatment, F(2, 20) = 1.22, p = 0.32; pre-post, F(1, 20) = 40.8, p = 3.1 × 10⁻⁶; interaction, F(2, 20) = 0.13, p = 0.88). Filled circles and shades indicate median and raw data points, while horizontal bars indicate SD. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. DAR, DA receptor.

(PDF)

S5 Fig. Effect of both D1R and D2R blockades on the relationship between refusal rate and RT.

(A) Relationship between refusal rate and average RT for each reward size, plotted session by session, for D2 blocking and D1+D2 blocking in delay trials. Data are plotted individually for monkeys KN, MP, and ST, in order from top to bottom. Colors indicate treatment condition. Thick lines indicate linear regression lines (model #1 in S5 Table). (B) Same as (A), but for workload trials. Note that for the data in workload trials, a multiple linear model with random effect of condition (model #3 in S5 Table) was chosen as the best model to explain the data, where the steepness of the slope under D1+D2 treatment was the same as that of control. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; RT, reaction time.

(PDF)

Acknowledgments

We thank R. Suma, T. Okauchi, Y. Sugii, R. Yamaguchi, Y. Matsuda, and J. Kamei for their technical assistance, and K. Oyama for discussion. We also thank Dr. M-R. Zhang and his colleagues at the Department of Radiopharmaceuticals Development, QST, for producing the radioligands. A Japanese monkey used in this study was provided by the National Bio-Resource Project “Japanese Monkeys” of MEXT, Japan.

Abbreviations

3D, three-dimensional
BIC, Bayesian information criterion
CON, control
CU, cost unit
D1R, D1-like receptor
D2R, D2-like receptor
DA, dopamine
DAR, DA receptor
dlPFC, dorsolateral prefrontal cortex
HO, high occupancy
HSD, honestly significant difference
i.m., intramuscular
LMM, linear mixed model
MO, moderate occupancy
MR, magnetic resonance
NAcc, nucleus accumbens
PD, Parkinson disease
PET, positron emission tomography
RT, reaction time
vlPFC, ventrolateral prefrontal cortex
VOI, volume of interest

Data Availability

All data presented in this paper have been posted on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR.

Funding Statement

This research was supported in part by KAKENHI Grants JP26282221, JP26120733, JP18H04037, and JP20H05955 from the Japan Society for the Promotion of Science (JSPS) (http://www.jsps.go.jp/english/index.html) to TM, and by the Japan Agency for Medical Research and Development (AMED) (https://www.amed.go.jp/en/index.html) Grant Number JP20dm0107094 to TS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Berridge KC. Motivation concepts in behavioral neuroscience. Physiol Behav. 2004;81(2):179–209. doi: 10.1016/j.physbeh.2004.02.004
  • 2. Minamimoto T, La Camera G, Richmond BJ. Measuring and modeling the interaction among reward size, delay to reward, and satiation level on motivation in monkeys. J Neurophysiol. 2009;101(1):437–47. doi: 10.1152/jn.90959.2008
  • 3. Pessiglione M, Vinckier F, Bouret S, Daunizeau J, Le Bouc R. Why not try harder? Computational approach to motivation deficits in neuro-psychiatric diseases. Brain. 2018;141(3):629–50. doi: 10.1093/brain/awx278
  • 4. Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 2010;68(5):815–34. doi: 10.1016/j.neuron.2010.11.022
  • 5. Fiorillo CD, Tobler PN, Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science. 2003;299(5614):1898–902. doi: 10.1126/science.1077349
  • 6. Kobayashi S, Schultz W. Influence of reward delays on responses of dopamine neurons. J Neurosci. 2008;28(31):7837–46. doi: 10.1523/JNEUROSCI.1600-08.2008
  • 7. Ravel S, Richmond BJ. Dopamine neuronal responses in monkeys performing visually cued reward schedules. Eur J Neurosci. 2006;24(1):277–90. doi: 10.1111/j.1460-9568.2006.04905.x
  • 8. Tobler PN, Fiorillo CD, Schultz W. Adaptive coding of reward value by dopamine neurons. Science. 2005;307(5715):1642–5. doi: 10.1126/science.1105370
  • 9. Varazzani C, San-Galli A, Gilardeau S, Bouret S. Noradrenaline and dopamine neurons in the reward/effort trade-off: A direct electrophysiological comparison in behaving monkeys. J Neurosci. 2015;35(20):7866–77. doi: 10.1523/JNEUROSCI.0454-15.2015
  • 10. Walton ME, Bouret S. What is the relationship between dopamine and effort? Trends Neurosci. 2019;42(2):79–91. doi: 10.1016/j.tins.2018.10.001
  • 11. Stauffer WR, Lak A, Bossaerts P, Schultz W. Economic choices reveal probability distortion in macaque monkeys. J Neurosci. 2015;35(7):3146–54. doi: 10.1523/JNEUROSCI.3653-14.2015
  • 12. Berridge KC. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl). 2007;191(3):391–431. doi: 10.1007/s00213-006-0578-x
  • 13. Le Bouc R, Rigoux L, Schmidt L, Degos B, Welter ML, Vidailhet M, et al. Computational dissection of dopamine motor and motivational functions in humans. J Neurosci. 2016;36(25):6623–33. doi: 10.1523/JNEUROSCI.3078-15.2016
  • 14. Muhammed K, Manohar S, Ben Yehuda M, Chong TT, Tofaris G, Lennox G, et al. Reward sensitivity deficits modulated by dopamine are associated with apathy in Parkinson’s disease. Brain. 2016;139(Pt 10):2706–21. doi: 10.1093/brain/aww188
  • 15. Salamone JD, Correa M. Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine. Behav Brain Res. 2002;137(1–2):3–25. doi: 10.1016/s0166-4328(02)00282-6
  • 16. Walton ME, Kennerley SW, Bannerman DM, Phillips PE, Rushworth MF. Weighing up the benefits of work: behavioral and neural analyses of effort-related decision making. Neural Netw. 2006;19(8):1302–14. doi: 10.1016/j.neunet.2006.03.005
  • 17. Berke JD. What does dopamine mean? Nat Neurosci. 2018;21(6):787–93. doi: 10.1038/s41593-018-0152-y
  • 18. Chong TT, Bonnelle V, Manohar S, Veromann KR, Muhammed K, Tofaris GK, et al. Dopamine enhances willingness to exert effort for reward in Parkinson’s disease. Cortex. 2015;69:40–6. doi: 10.1016/j.cortex.2015.04.003
  • 19. Buyukdura JS, McClintock SM, Croarkin PE. Psychomotor retardation in depression: Biological underpinnings, measurement, and treatment. Prog Neuropsychopharmacol Biol Psychiatry. 2011;35(2):395–409. doi: 10.1016/j.pnpbp.2010.10.019
  • 20. Demyttenaere K, De Fruyt J, Stahl SM. The many faces of fatigue in major depressive disorder. Int J Neuropsychopharmacol. 2005;8(1):93–105. doi: 10.1017/S1461145704004729
  • 21. Salamone JD, Yohn SE, Lopez-Cruz L, San Miguel N, Correa M. Activational and effort-related aspects of motivation: neural mechanisms and implications for psychopathology. Brain. 2016;139(Pt 5):1325–47. doi: 10.1093/brain/aww050
  • 22. Choi WY, Morvan C, Balsam PD, Horvitz JC. Dopamine D1 and D2 antagonist effects on response likelihood and duration. Behav Neurosci. 2009;123(6):1279–87. doi: 10.1037/a0017702
  • 23. Pattij T, Janssen MC, Vanderschuren LJ, Schoffelmeer AN, van Gaalen MM. Involvement of dopamine D1 and D2 receptors in the nucleus accumbens core and shell in inhibitory response control. Psychopharmacology (Berl). 2007;191(3):587–98. doi: 10.1007/s00213-006-0533-x
  • 24. Bardgett ME, Depenbrock M, Downs N, Points M, Green L. Dopamine modulates effort-based decision making in rats. Behav Neurosci. 2009;123(2):242–51. doi: 10.1037/a0014625
  • 25. Denk F, Walton ME, Jennings KA, Sharp T, Rushworth MF, Bannerman DM. Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort. Psychopharmacology (Berl). 2005;179(3):587–96. doi: 10.1007/s00213-004-2059-4
  • 26. Hosking JG, Floresco SB, Winstanley CA. Dopamine antagonism decreases willingness to expend physical, but not cognitive, effort: a comparison of two rodent cost/benefit decision-making tasks. Neuropsychopharmacology. 2015;40(4):1005–15. doi: 10.1038/npp.2014.285
  • 27. Salamone JD. Dopamine, effort, and decision making: theoretical comment on Bardgett et al. (2009). Behav Neurosci. 2009;123(2):463–7. doi: 10.1037/a0015381
  • 28. Salamone JD, Correa M. The mysterious motivational functions of mesolimbic dopamine. Neuron. 2012;76(3):470–85. doi: 10.1016/j.neuron.2012.10.021
  • 29. Yohn SE, Santerre JL, Nunes EJ, Kozak R, Podurgiel SJ, Correa M, et al. The role of dopamine D1 receptor transmission in effort-related choice behavior: Effects of D1 agonists. Pharmacol Biochem Behav. 2015;135:217–26. doi: 10.1016/j.pbb.2015.05.003
  • 30. Gan JO, Walton ME, Phillips PE. Dissociable cost and benefit encoding of future rewards by mesolimbic dopamine. Nat Neurosci. 2010;13(1):25–7. doi: 10.1038/nn.2460
  • 31. Pasquereau B, Turner RS. Limited encoding of effort by dopamine neurons in a cost-benefit trade-off task. J Neurosci. 2013;33(19):8288–300. doi: 10.1523/JNEUROSCI.4619-12.2013
  • 32. Cardinal RN, Robbins TW, Everitt BJ. The effects of d-amphetamine, chlordiazepoxide, alpha-flupenthixol and behavioural manipulations on choice of signalled and unsignalled delayed reinforcement in rats. Psychopharmacology (Berl). 2000;152(4):362–75. doi: 10.1007/s002130000536
  • 33. van Gaalen MM, van Koten R, Schoffelmeer ANM, Vanderschuren LJMJ. Critical involvement of dopaminergic neurotransmission in impulsive decision making. Biol Psychiatry. 2006;60(1):66–73. doi: 10.1016/j.biopsych.2005.06.005
  • 34. Wade TR, de Wit H, Richards JB. Effects of dopaminergic drugs on delayed reward as a measure of impulsive behavior in rats. Psychopharmacology (Berl). 2000;150(1):90–101. doi: 10.1007/s002130000402
  • 35. Farde L, Nordström A, Wiesel F, Pauli S, Halldin C, Sedvall G. Positron emission tomographic analysis of central D1 and D2 dopamine receptor occupancy in patients treated with classical neuroleptics and clozapine. Relation to extrapyramidal side effects. Arch Gen Psychiatry. 1992;49(7):538. doi: 10.1001/archpsyc.1992.01820070032005
  • 36. Kapur S, Zipursky R, Jones C, Remington G, Houle S. Relationship between dopamine D(2) occupancy, clinical response, and side effects: a double-blind PET study of first-episode schizophrenia. Am J Psychiatry. 2000;157(4):514–20. doi: 10.1176/appi.ajp.157.4.514
  • 37. Takano A, Suhara T, Maeda J, Ando K, Okauchi T, Obayashi S, et al. Relation between cortical dopamine D(2) receptor occupancy and suppression of conditioned avoidance response in non-human primate. Psychiatry Clin Neurosci. 2004;58(3):330–2. doi: 10.1111/j.1440-1819.2004.01240.x
  • 38. Wadenberg ML, Kapur S, Soliman A, Jones C, Vaccarino F. Dopamine D2 receptor occupancy predicts catalepsy and the suppression of conditioned avoidance response behavior in rats. Psychopharmacology (Berl). 2000;150(4):422–9. doi: 10.1007/s002130000466
  • 39. Wadenberg ML, Soliman A, VanderSpek SC, Kapur S. Dopamine D(2) receptor occupancy is a common mechanism underlying animal models of antipsychotics and their clinical effects. Neuropsychopharmacology. 2001;25(5):633–41. doi: 10.1016/S0893-133X(01)00261-5
  • 40. Halldin C, Gulyas B, Farde L. PET studies with carbon-11 radioligands in neuropsychopharmacological drug development. Curr Pharm Des. 2001;7(18):1907–29. doi: 10.2174/1381612013396871
  • 41. Ishiwata K, Kawamura K, Kobayashi T, Matsuno K. Sigma1 and dopamine D2 receptor occupancy in the mouse brain after a single administration of haloperidol and two dopamine D2-like receptor ligands. Nucl Med Biol. 2003;30(4):429–34. doi: 10.1016/s0969-8051(03)00003-9
  • 42. Nordstrom AL, Farde L, Halldin C. Time course of D2-dopamine receptor occupancy examined by PET after single oral doses of haloperidol. Psychopharmacology (Berl). 1992;106(4):433–8. doi: 10.1007/BF02244811
  • 43. Bowman EM, Aigner TG, Richmond BJ. Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. J Neurophysiol. 1996;75(3):1061–73. doi: 10.1152/jn.1996.75.3.1061
  • 44. Shidara M, Richmond BJ. Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science. 2002;296(5573):1709–11. doi: 10.1126/science.1069504
  • 45. Bouret S, Richmond BJ. Sensitivity of locus ceruleus neurons to reward value for goal-directed actions. J Neurosci. 2015;35(9):4005–14. doi: 10.1523/JNEUROSCI.4553-14.2015
  • 46. Eldridge MA, Lerchner W, Saunders RC, Kaneko H, Krausz KW, Gonzalez FJ, et al. Chemogenetic disconnection of monkey orbitofrontal and rhinal cortex reversibly disrupts reward value. Nat Neurosci. 2016;19(1):37–9. doi: 10.1038/nn.4192
  • 47. Fujimoto A, Hori Y, Nagai Y, Kikuchi E, Oyama K, Suhara T, et al. Signaling incentive and drive in the primate ventral pallidum for motivational control of goal-directed action. J Neurosci. 2019;39(10):1793–804. doi: 10.1523/JNEUROSCI.2399-18.2018
  • 48. Nagai Y, Kikuchi E, Lerchner W, Inoue K, Ji B, Eldridge MAG, et al. PET imaging-guided chemogenetic silencing reveals a critical role of primate rostromedial caudate in reward evaluation. Nat Commun. 2016;7:13605. doi: 10.1038/ncomms13605
  • 49. Weed MR, Gold LH. The effects of dopaminergic agents on reaction time in rhesus monkeys. Psychopharmacology (Berl). 1998;137(1):33–42. doi: 10.1007/s002130050590
  • 50. Salamone JD, Steinpreis RE, McCullough LD, Smith P, Grebel D, Mahan K. Haloperidol and nucleus accumbens dopamine depletion suppress lever pressing for food but increase free food consumption in a novel food choice procedure. Psychopharmacology (Berl). 1991;104(4):515–21. doi: 10.1007/BF02245659
  • 51. Minamimoto T, Yamada H, Hori Y, Suhara T. Hydration level is an internal variable for computing motivation to obtain water rewards in monkeys. Exp Brain Res. 2012;218(4):609–18. doi: 10.1007/s00221-012-3054-3
  • 52. Ainslie GW. Impulse control in pigeons. J Exp Anal Behav. 1974;21(3):485–9. doi: 10.1901/jeab.1974.21-485
  • 53. Mazur JE. Fixed and variable ratios and delays: further tests of an equivalence rule. J Exp Psychol Anim Behav Process. 1986;12(2):116–24.
  • 54. Kivetz R. The effects of effort and intrinsic motivation on risky choice. Mar Sci. 2003;22(4):477–502. doi: 10.1287/mksc.22.4.477.24911
  • 55. Minamimoto T, Hori Y, Richmond BJ. Is working more costly than waiting in monkeys? PLoS ONE. 2012;7(11):e48434. doi: 10.1371/journal.pone.0048434
  • 56. Kravitz AV, Freeze BS, Parker PRL, Kay K, Thwin MT, Deisseroth K, et al. Regulation of parkinsonian motor behaviours by optogenetic control of basal ganglia circuitry. Nature. 2010;466(7306):622–6. doi: 10.1038/nature09159
  • 57. Nunes EJ, Randall PA, Hart EE, Freeland C, Yohn SE, Baqi Y, et al. Effort-related motivational effects of the VMAT-2 inhibitor tetrabenazine: implications for animal models of the motivational symptoms of depression. J Neurosci. 2013;33(49):19120–30. doi: 10.1523/JNEUROSCI.2730-13.2013
  • 58. Yang JH, Presby RE, Jarvie AA, Rotolo RA, Fitch RH, Correa M, et al. Pharmacological studies of effort-related decision making using mouse touchscreen procedures: effects of dopamine antagonism do not resemble reinforcer devaluation by removal of food restriction. Psychopharmacology (Berl). 2020;237(1):33–43. doi: 10.1007/s00213-019-05343-8
  • 59. Randall PA, Lee CA, Nunes EJ, Yohn SE, Nowak V, Khan B, et al. The VMAT-2 inhibitor tetrabenazine affects effort-related decision making in a progressive ratio/chow feeding choice task: reversal with antidepressant drugs. PLoS ONE. 2014;9(6):e99320. doi: 10.1371/journal.pone.0099320
  • 60. Randall PA, Pardo M, Nunes EJ, Lopez Cruz L, Vemuri VK, Makriyannis A, et al. Dopaminergic modulation of effort-related choice behavior as assessed by a progressive ratio chow feeding choice task: pharmacological studies and the role of individual differences. PLoS ONE. 2012;7(10):e47934. doi: 10.1371/journal.pone.0047934
  • 61. Simmons JM, Minamimoto T, Murray EA, Richmond BJ. Selective ablations reveal that orbital and lateral prefrontal cortex play different roles in estimating predicted reward value. J Neurosci. 2010;30(47):15878–87. doi: 10.1523/JNEUROSCI.1802-10.2010
  • 62. Callicott JH, Mattay VS, Bertolino A, Finn K, Coppola R, Frank JA, et al. Physiological characteristics of capacity constraints in working memory as revealed by functional MRI. Cereb Cortex. 1999;9(1):20–6. doi: 10.1093/cercor/9.1.20
  • 63. Vijayraghavan S, Wang M, Birnbaum SG, Williams GV, Arnsten AFT. Inverted-U dopamine D1 receptor actions on prefrontal neurons engaged in working memory. Nat Neurosci. 2007;10(3):376–84. doi: 10.1038/nn1846
  • 64. Zahrt J, Taylor JR, Mathew RG, Arnsten AF. Supranormal stimulation of D1 dopamine receptors in the rodent prefrontal cortex impairs spatial working memory performance. J Neurosci. 1997;17(21):8528–35. doi: 10.1523/JNEUROSCI.17-21-08528.1997
  • 65. Richfield EK, Penney JB, Young AB. Anatomical and affinity state comparisons between dopamine D1 and D2 receptors in the rat central nervous system. Neuroscience. 1989;30(3):767–77. doi: 10.1016/0306-4522(89)90168-1
  • 66. Dreyer JK, Herrik KF, Berg RW, Hounsgaard JD. Influence of phasic and tonic dopamine release on receptor activation. J Neurosci. 2010;30(42):14273–83. doi: 10.1523/JNEUROSCI.1894-10.2010
  • 67. Joutsa J, Voon V, Johansson J, Niemela S, Bergman J, Kaasinen V. Dopaminergic function and intertemporal choice. Transl Psychiatry. 2015;5:e491. doi: 10.1038/tp.2014.133
  • 68. Pine A, Shiner T, Seymour B, Dolan RJ. Dopamine, time, and impulsivity in humans. J Neurosci. 2010;30(26):8888–96. doi: 10.1523/JNEUROSCI.6028-09.2010
  • 69. Evenden JL, Ryan CN. The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology (Berl). 1996;128(2):161–70. doi: 10.1007/s002130050121
  • 70. Rajala AZ, Jenison RL, Populin LC. Decision making: effects of methylphenidate on temporal discounting in nonhuman primates. J Neurophysiol. 2015;114(1):70–9. doi: 10.1152/jn.00278.2015
  • 71. Wang S, Hu SH, Shi Y, Li BM. The roles of the anterior cingulate cortex and its dopamine receptors in self-paced cost-benefit decision making in rats. Learn Behav. 2017;45(1):89–99. doi: 10.3758/s13420-016-0243-0
  • 72. Gerfen CR, Surmeier DJ. Modulation of striatal projection systems by dopamine. Annu Rev Neurosci. 2011;34:441–66. doi: 10.1146/annurev-neuro-061010-113641
  • 73. Hikida T, Kimura K, Wada N, Funabiki K, Nakanishi S. Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior. Neuron. 2010;66(6):896–907. doi: 10.1016/j.neuron.2010.05.011
  • 74. Nonomura S, Nishizawa K, Sakai Y, Kawaguchi Y, Kato S, Uchigashima M, et al. Monitoring and Updating of Action Selection for Goal-Directed Behavior through the Striatal Direct and Indirect Pathways. Neuron. 2018;99(6):1302–14.e5. doi: 10.1016/j.neuron.2018.08.002
  • 75. Kupchik YM, Brown RM, Heinsbroek JA, Lobo MK, Schwartz DJ, Kalivas PW. Coding the direct/indirect pathways by D1 and D2 receptors is not valid for accumbens projections. Nat Neurosci. 2015;18(9):1230–2. doi: 10.1038/nn.4068
  • 76. Lee B, Groman S, London ED, Jentsch JD. Dopamine D2/D3 receptors play a specific role in the reversal of a learned visual discrimination in monkeys. Neuropsychopharmacology. 2007;32(10):2125–34. doi: 10.1038/sj.npp.1301337
  • 77. Collins AG, Frank MJ. Opponent actor learning (OpAL): modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive. Psychol Rev. 2014;121(3):337–66. doi: 10.1037/a0037015
  • 78. Nagai Y, Minamimoto T, Ando K, Obayashi S, Ito H, Ito N, et al. Correlation between decreased motor activity and dopaminergic degeneration in the ventrolateral putamen in monkeys receiving repeated MPTP administrations: a positron emission tomography study. Neurosci Res. 2012;73(1):61–7. doi: 10.1016/j.neures.2012.02.007
  • 79. Gunn RN, Lammertsma AA, Hume SP, Cunningham VJ. Parametric imaging of ligand-receptor binding in PET using a simplified reference region model. Neuroimage. 1997;6(4):279–87. doi: 10.1006/nimg.1997.0303
  • 80. Jucaite A, Odano I, Olsson H, Pauli S, Halldin C, Farde L. Quantitative analyses of regional [11C]PE2I binding to the dopamine transporter in the human brain: a PET study. Eur J Nucl Med Mol Imaging. 2006;33(6):657–68. doi: 10.1007/s00259-005-0027-9
  • 81. Lammertsma AA, Hume SP. Simplified reference tissue model for PET receptor studies. Neuroimage. 1996;4(3 Pt 1):153–8. doi: 10.1006/nimg.1996.0066
  • 82. Cunningham VJ, Rabiner EA, Slifstein M, Laruelle M, Gunn RN. Measuring drug occupancy in the absence of a reference region: the Lassen plot re-visited. J Cereb Blood Flow Metab. 2010;30(1):46–50. doi: 10.1038/jcbfm.2009.190
  • 83. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67(1):1–48. doi: 10.18637/jss.v067.i01

Decision Letter 0

Lucas Smith

10 Nov 2020

Dear Dr Minamimoto,

Thank you for submitting your manuscript entitled "Differential Contribution of Dopaminergic Transmission at D1- and D2-like Receptors to Cost/Benefit Evaluation for Motivation in Monkeys" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff as well as by an academic editor with relevant expertise and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Please re-submit your manuscript within two working days, i.e. by Nov 12 2020 11:59PM.

Login to Editorial Manager here: https://www.editorialmanager.com/pbiology

During resubmission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF when you re-submit.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Given the disruptions resulting from the ongoing COVID-19 pandemic, please expect delays in the editorial process. We apologize in advance for any inconvenience caused and will do our best to minimize impact as far as possible.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Lucas Smith, Ph.D.,

Associate Editor

PLOS Biology

Decision Letter 1

Lucas Smith

16 Dec 2020

Dear Dr Minamimoto,

Thank you very much for submitting your manuscript "Differential Contribution of Dopaminergic Transmission at D1- and D2-like Receptors to Cost/Benefit Evaluation for Motivation in Monkeys" for consideration as a Research Article at PLOS Biology. Your manuscript has been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by several independent reviewers.

As you will see from their detailed responses (below), all of the reviewers are enthusiastic about the overall potential of this study. However, the reviewers do ask that you provide some additional clarifications of your ideas, and provide a better grounding of these findings and mechanistic suggestions within the broader literature across species. Additionally, Reviewer 2 raises some technical concerns, particularly around the ANOVA analyses, D2 occupancy estimates, and ability to rule out a motor impairment.

In light of the reviews, we will not be able to accept the current version of the manuscript, but we would welcome re-submission of a much-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent for further evaluation by the reviewers.

We expect to receive your revised manuscript within 3 months.

Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension. At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may end consideration of the manuscript at PLOS Biology.

**IMPORTANT - SUBMITTING YOUR REVISION**

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point by point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type.

*Re-submission Checklist*

When you are ready to resubmit your revised manuscript, please refer to this re-submission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

*Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

*Blot and Gel Data Policy*

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Lucas Smith, Ph.D.,

Associate Editor,

lsmith@plos.org,

PLOS Biology

*****************************************************

REVIEWS:

Reviewer's Responses to Questions

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: John D Salamone

Reviewer #1: This is the first study to directly compare the contribution of the dopamine D1R and D2R by showing the degree of binding to the striatum and other brain structures. I think the series of results that the authors presented are very convincing and the experimental procedure of the authors is straightforward and excellent. I have the following few concerns. If the authors revise the manuscript considering the following comments, the paper will be much improved. The task paradigm and the obtained change in behaviors are very convincing because of the established procedure of the authors' series of experiments. However, for the readers who are not familiar with the task procedures, it is better to add a detailed explanation of the task and add further interpretation of the results. For example, each assignment and the experimental process is accurately described, but the conclusions are drawn from them, and their interpretation is not emphasized. As a result, each assignment's significance and its importance is often not conveyed to the reader.

Major comments:

(1) Regarding the data in Figure 1, it is very important to examine the degree of D1R and D2R block by examining the occupancy with PET, and I think it is clean data. However, I fear that long-term plastic changes may be a side effect, especially in D2R, because of the long-lasting occupancy. The possibility of mixed plastic changes caused by long-term blocks needs to be discussed in detail.

(2) On the relationship between reaction time and refusal rate, at L201, the authors mentioned, "D1R blockade also influences response speed, probably due to slowing cognitive processing". If the motivation itself is decreased by D1R block, it may directly affect reaction time and refusal rate. In this case, it is not necessarily related to whether there is some disorder to cognitive processing or not. The interpretation of this behavior, especially the relationship between RT and the refusal rate, needs to be discussed in detail.

(3) "Relative reward" in L213-215 is an abrupt concept that needs to be explained; what is the purpose of the Sucrose preference test?

(4) About L300-302, the authors introduced the concepts of "delay discounting" and "workload discounting." Although the relationship between them was carefully discussed in the Discussion, I would appreciate it if the authors added conceptual details when they introduce them (i.e., in the Results). At the least, the authors should add "why" they focused on these parameters.

(5) About the relationship between delay and workload discounting, I have naïve questions. First, can we directly compare the two in a paradigm? And how are those two concepts distinguished in monkeys?

(6) The circuitry changes due to the D1R and D2R blocks are discussed in the Discussion. Of course, there are many difficulties with circuit changes, but is there anything added from the changes seen in PET that we can add to the Discussion?

Reviewer #2: The authors present a pharmacological study addressing the roles of D1 and D2 receptor stimulation on how monkeys process reward incentives and discount effort or waiting time while they do variations of a task to collect rewards. The authors tested the animals either after receiving saline or different doses of SCH23390 to test the involvement of D1 receptors. For the D2 receptors, they administered a fixed dose of haloperidol and tested the animals on different days over a period of 8 days. This was done instead of administering different doses because haloperidol has a long half-life and stays long enough in the system to be able to test the animals with different effective doses.

What appears to be the key feature of the study is that the authors used PET to quantify the receptor occupancy achieved by the doses used in the behavioral experiments. Furthermore, they tested the behavioral impact of different levels of occupancy, effectively doing a dose/response curve for blockade of D1 and D2 receptors.

These are definitely valuable data on a hot topic in systems neurosciences. The question of the relative contribution of D1 and D2 receptors to the dopamine modulation of incentive processing and motivation is timely, and I think that the paper has potential to contribute to clarifying it. However, the manuscript in its current form is far from satisfactory. The presentation is rather poor and requires lots of suppositions from the reader. Thus, there are some aspects when I am not sure if the current analyses are well grounded. Yet in some instances, the authors make unwarranted claims.

- Presentation - even though the details of the methods are given after the results section, it should be possible to know what the authors did while reading the results section. Details such as the number of monkeys for each experiment should be stated straight away before giving any specific results. Similarly, the authors should state: that the monkeys are not the same for the PET and the pharmacological experiments; the gist of the PET analysis (at least name the method used for analysis; whether the analysis is ROI or voxel-based, which ROIs do the results refer to); whether the quantification of receptor occupancy for SCH23390 was done in different sessions; how many (or range of) months of training; the models that are tested to ultimately infer the parameter a in the first experiment should be spelled out so that it is possible to know what the authors are testing before looking at the supplement; define what is meant by moderate dose and high dose in the workload and time discounting study and how these doses were derived; and that the reward size in the discounting study is constant.

- It was unclear how the ANOVAs were performed and how the data from the two or three individual monkeys are treated. Given that there are only 2 or 3 monkeys and that it is impossible to make any random effects inference, I would suggest that the authors analyze each individual separately. This is the most honest way to present the results. Conclusions about the effects of the drugs on monkeys in general are not possible, but this does not mean that these data are not very valuable as human studies cannot provide the value occupancy information and test the range of doses. However, the data need to be interpreted correctly. Furthermore, ANOVAs need to be followed up with post hoc tests in order to be able to interpret interactions.

- Related to the previous point, I do not believe that the authors can say that the effects of D1 antagonism are quadratic when they have evidence for a quadratic effect in one monkey and evidence for saturating monotonic effects in the other monkey. I would only dare to make inferences when the effects are consistent across tested monkeys. And even in this case, inference will need to be tentative as far as random effects cannot be tested. Similarly, the authors overinterpret the mean effects of combining antagonists as the effects do not seem to be consistent across monkeys.

- Whereas the estimates of occupancy based on administered dose of SCH23390 appear to be very robust, the estimates of occupancy for haloperidol are much more uncertain, probably reflecting individual differences in the metabolism and elimination of haloperidol. This is important given that occupancy was not measured for the animals that took part in the behavioral experiments. This limitation should be acknowledged and discussed.

- At multiple points in the manuscript the authors present their ideas and results without enough consideration of previous evidence. Some examples: the authors omit a paper by Gao et al in Nature Neuroscience providing evidence that dopamine neurons are not sensitive to effort costs; Husain's lab in Oxford have some nice studies quantifying the effects of effort discounting and its modulation by dopamine; Berke's lab in Seattle have provided very important insights on the differential roles of dopamine in learning and performance.

- What do the authors mean by "However, the previous studies did not address the quantitative effect of DAR blockade on incentive motivation; more specifically, there was a lack of experimental data to model the causal relationship among DAR stimulation, reward, and motivation"? Certainly, there are previous studies showing that D1 and D2 blockade affects incentive motivation.

- Michael Frank's models of the direct and indirect pathway suggest a division of labor between D1 and D2 receptors whereby D1 and D2 affect learning from positive and negative reward prediction errors, respectively, but also have different impacts on choice performance and motivation, the latter being strongly controlled by D2 receptors. See, for instance, the excellent review by Collins and Frank in Psychological Review. The authors should discuss this model and how it bears on their results.

- On page 12, how can you rule out that the effects are related to motor rather than cognitive impairment? Related to that, the sentence "Thus, the effect of D2 manipulation on value-based decision was relatively independent from the effects on cognitive or motor speed itself" is not appropriate. Motor speed is what is measured in the experiment; cognitive or motor impairments are hypothesized mechanisms. They should not be mixed up and equated in one and the same sentence.

- On page 29, the authors state that 3 monkeys, whose codes are provided, were used for the behavioral experiments. This seems to apply to the discounting experiment, but according to the results section and figures, 2 different monkeys were used for the incentive experiment.

- In figure 1, some data points seem to be missing. Does this imply that not all doses were tested in all subjects?

- In the introduction, regarding the sentence "Blockade of either D1R or D2R biases animals' choices in tasks manipulating the cost/benefits trade-off": these biases should be spelled out.

- Why was a head hold device implanted?

Reviewer #3: Overall, this was an interesting and potentially very useful manuscript for the field, focusing on the effects of DA antagonists on several distinct aspects of motivation. The behavioral paradigms are elegant, and the use of monkeys is critical, because most of the animal studies in this area have been with rodents, and monkeys potentially provide a bridge between the rodent and human literatures. Moreover, the receptor occupancy data are a very important addition. As one can see from my comments, this manuscript stimulated a lot of thinking. My specific comments are listed below.

Comments:

In the first paragraph of the introduction, the authors state "For motivational value computation, the expected value of benefits (i.e., rewards) has a positive influence, while the cost necessary to earn the expected reward has a negative impact and discounts the net value of reward". Although one sees wording like this in the literature, it is actually misleading. As written, this statement implies that the reward itself is devalued (i.e., "…net value of reward"). However, when considering what is involved in cost/benefit analyses such as this, the net value is actually the relative value of the whole complex activity related to the reinforcer itself plus the instrumental response with all its associated costs. This issue also was evident in the literature on response/reinforcement matching, going back to the 1960s and 70s. While a reinforcement value parameter was determined from response/reinforcement matching studies of variable interval responding, it was evident to some researchers that this parameter was not the value of the reward per se, but rather, the value of the whole activity involving the instrumental response and the delivery of the reinforcer (i.e., the entire sequence of behavior that includes the response, its associated biases, and the reinforcer; Williams 1988). Indeed, the focus of the present article on the lack of effect of DA antagonism on relative reward value highlights the importance of these subtle distinctions in wording and terminology.

On a related note, rodent studies with parallel experiments on intake of and preference for the reinforcers used in cost/benefit procedures have shown that DAergic manipulations did not affect reinforcer intake or preference (Nunes et al. 2013; Yang et al. 2020). Moreover, in studies involving the progressive ratio/chow feeding choice task in rats, which essentially functions as a discounting task involving increasing ratio requirements, it has been shown that D1 and D2 receptor antagonism produce effects that are very different from the effect produced by reinforcer devaluation (Randall et al. 2012, 2014). These findings are consistent with the conclusion of the authors that DA receptor blockade did not produce a general alteration of reinforcement value.

There have been a large number of rodent studies showing that DA D1 antagonism altered decision making based upon physical effort (Cousins et al. 1994; Nowend et al. 2001; Salamone et al. 2002; Sink et al. 2008; Randall et al. 2014; Yohn et al. 2015; Hosking et al. 2015). The authors seem to suggest that these effects may be solely due to actions on incentive motivation, rather than something to do with work load per se. However, I wonder if there is a possibility that there are species differences; i.e., perhaps the role of direct vs. indirect pathways and hence the effects of D1 antagonism are different in rodents vs. primates. Also, the authors should clarify if the implication of their line of thinking is that blunting of incentive motivation leads to an apparent lack of exertion of effort due to a reduced propensity for action, or if something else is going on.

One of the implications of these studies, taken as a whole, is that incentive motivation, exertion of effort, delay discounting, and reward value are dissociable from each other. This is consistent with much of the published literature, and is touched upon in the discussion, but is worth emphasizing more because it is an important point. In addition, I would like to see a bit more emphasis on what the authors mean by incentive motivation, in terms of behavior theory.

Minor Comments:

P 4, "DAR blockades" should be "DAR blockade"

P 16, "following D1 and D2 antagonisms" should be "following D1 and D2 antagonism"

Decision Letter 2

Lucas Smith

18 May 2021

Dear Dr Minamimoto,

Thank you for submitting your revised Research Article entitled "Differential Contribution of Dopaminergic Transmission at D1- and D2-like Receptors to Cost/Benefit Evaluation for Motivation in Monkeys" for publication in PLOS Biology. I have now obtained advice from the original reviewers and have discussed their comments with the Academic Editor. 

The reviews of your manuscript are appended below. As you will see, all three reviewers feel that you have satisfactorily addressed their previous comments. However reviewer 2 has noted a few lingering minor issues with the manuscript which we think can be addressed via textual changes.

Based on the reviews, we will probably accept this manuscript for publication, provided you satisfactorily address the remaining points raised by the reviewers. **IMPORTANT:** Please also make sure to address the following data and other policy-related requests outlined here:

1) ETHICS REQUEST: Please provide information about the housing and environmental enrichment used for your monkeys, including whether they were group housed. For experiments requiring surgery, please describe the steps taken to minimize suffering, including use of anesthesia.

2) DATA REQUEST: Thank you for providing the data underlying each figure. Please add a sentence to each figure legend (including supplementary figures), noting where the underlying data can be found. For example, you could say "the data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR."

3) I have discussed the title of your manuscript with my colleagues, and we wonder if it might be streamlined a bit to make it more accessible. For example, we suggest the following title: "D1- and D2-like Receptors differentially mediate the effects of dopaminergic transmission on Cost/Benefit Evaluation and Motivation in Monkeys"

4) BLURB: Please provide a blurb which (if accepted) will be included in our weekly and monthly Electronic Table of Contents, sent out to readers of PLOS Biology, and may be used to promote your article in social media. The blurb should be about 30-40 words long and is subject to editorial changes. It should, without exaggeration, entice people to read your manuscript. It should not be redundant with the title and should not contain acronyms or abbreviations. For examples, view our author guidelines: https://journals.plos.org/plosbiology/s/revising-your-manuscript#loc-blurb

5) I noticed a typo in figure 5 (there are two panels labeled 5D).

6) As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

We expect to receive your revised manuscript within two weeks.

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

-  a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list

-  a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable)

-  a track-changes file indicating any changes that you have made to the manuscript. 

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information  

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*Early Version*

Please note that an uncorrected proof of your manuscript will be published online ahead of the final version, unless you opted out when submitting your manuscript. If, for any reason, you do not want an earlier version of your manuscript published online, uncheck the box. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please do not hesitate to contact me should you have any questions.

Sincerely,

Lucas Smith, Ph.D.,

Associate Editor,

lsmith@plos.org,

PLOS Biology

------------------------------------------------------------------------

Reviewer remarks:

Reviewer #1: The new manuscripts and the replies are satisfactory.

Reviewer #2: The authors have successfully addressed my comments. The manuscript is much clearer in terms of the novelty of the matched work and delay tasks and the analysis made. There are only some minor issues remaining.

I am still not convinced about the quadratic relationship between D1 blockade and incentive motivation based on the data of two monkeys presented and I would suggest that this statement is further toned down. Beyond saying that the relationship tends to be U-shaped in the discussion, it needs to be clear that the relationship between D1 blockade and refusal rate is inferred solely based on 2 monkeys and that there are no statistics supporting its existence for this behavior at the population level. In line 180-181 the authors could say "went clearly up for monkey KN but less clearly so for monkey ST".

In the new analysis, the authors show that refusal rates and RT are correlated, possibly both reflecting motivation. If this is the case, and the effects of D1 blockade on motivation are quadratic, shouldn't the effects of SCH23390 on RT also be quadratic?

What do the authors have in mind when suggesting (on line 334-335) that the effects of D2 blockade on workload are through a distinct motivational process? I find the dissociation in RT/refusal rate correlations after different levels of D2 blockade very intriguing and I believe it deserved further discussion.

Reviewer #3 (John Salamone): The authors did a very good job addressing the comments.

Decision Letter 3

Lucas Smith

27 May 2021

Dear Dr Minamimoto,

On behalf of my colleagues and the Academic Editor, Matthew Rushworth, I am pleased to say that we can in principle offer to publish your Research Article "D1- and D2-like Receptors Differentially Mediate the Effects of Dopaminergic Transmission on Cost/Benefit Evaluation and Motivation in Monkeys" in PLOS Biology, provided you address any remaining formatting and reporting issues. These will be detailed in an email that will follow this letter and that you will usually receive within 2-3 business days, during which time no action is required from you. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have made the required changes.

As one last minor point, while going through your revised manuscript, we noticed a minor typo on line 497 ('empathize' should be 'emphasize'). You can fix this while addressing the formatting and reporting requests, to come.

Please take a minute to log into Editorial Manager at http://www.editorialmanager.com/pbiology/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Biology. 

Sincerely, 

Lucas Smith, Ph.D. 

Associate Editor 

PLOS Biology

Associated Data


    Supplementary Materials

    S1 Table. Model comparison for the effect of DAR blockade on refusal rates in reward-size task (for Fig 2).

    a(cond) and e(cond) indicate the random effects of DAR blocking treatment conditions on parameters a and e, respectively. BIC is a relative measure of quality for the models (#1–4). ΔBIC denotes difference from minimum BIC. BIC, Bayesian information criterion; DAR, DA receptor.

    (PDF)
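As an illustration of how the ΔBIC column in a table of this kind can be read, the following minimal sketch computes BIC from the standard definition (k ln n minus twice the maximized log-likelihood) and then subtracts the minimum across models. The log-likelihoods, parameter counts, number of observations, and model names are hypothetical placeholders, not values from S1 Table, and the source does not state which software or exact formula was used.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def delta_bic(bic_values):
    """Difference of each model's BIC from the minimum BIC in the set."""
    best = min(bic_values.values())
    return {name: value - best for name, value in bic_values.items()}

# Hypothetical fits (log-likelihood, number of free parameters); not study data.
fits = {"model1": (-120.0, 2), "model2": (-118.5, 3)}
n_obs = 40  # hypothetical number of observations

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in fits.items()}
deltas = delta_bic(scores)

for name in sorted(deltas, key=deltas.get):
    print(f"{name}: BIC = {scores[name]:.2f}, dBIC = {deltas[name]:.2f}")
```

The model with ΔBIC = 0 is the preferred one in the set. In this toy case the extra parameter of the second model does not improve the likelihood enough to offset the penalty.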

    S2 Table. Model comparison for the effect of DAR blockade on the relationship between refusal rate and RT in reward-size task (for Fig 4).

    (Rt|*) indicates random effects on regression parameters. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. DAR, DA receptor.

    (PDF)

    S3 Table. Model comparison for the effect of DAR blockade on refusal rates in work/delay task (for Fig 5).

    CU and E0 indicate remaining cost and intercept, respectively. (0+ CU|*) and (CU|*) indicate random effects on both regression coefficient and intercept (E0) or on regression coefficient alone, respectively. E, refusal rate; type, trial type (delay or work); cond, treatment condition (CON, MO, and HO for D1R and D2R blocking); monkey, subject. CON, control; D1R, D1-like receptor; D2R, D2-like receptor; DAR, DA receptor; HO, high occupancy; MO, moderate occupancy.

    (PDF)

    S4 Table. Model comparison for the effect of DAR blockade on the relationship between refusal rate and RT in work/delay task (for Fig 6).

    (Rt|*) indicates random effects on regression parameters. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. DAR, DA receptor.

    (PDF)

    S5 Table. Model comparison for the effect of both D1R and D2R blockades on the relationship between refusal rate and RT in work/delay task (for S5 Fig).

    (Rt|*) indicates random effects on regression coefficient. E, refusal rate; Rt, reaction time; cond, treatment condition; monkey, subject. D1R, D1-like receptor; D2R, D2-like receptor.

    (PDF)

    S1 Fig. Occupancy estimation.

    Example of occupancy estimation based on modified Lassen plot of [11C]SCH23390 PET data obtained from monkey DO. Colored dots represent the relationship between decreased specific binding [i.e., BPND (baseline)–BPND (blocking)] and baseline [BPND (baseline)] for each brain region under each blocking condition (indexed by color). Occupancy was determined as a proportion of reduced specific binding to baseline, which corresponds to the slope of linear regression. In this case, D1 occupancy was 80%, 78%, 67%, and 26% for 100, 50, 30, and 10 μg/kg doses, respectively. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. PET, positron emission tomography.

    (PDF)
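The occupancy estimate described in the S1 Fig legend can be sketched as follows: for each brain region, the reduction in specific binding (baseline minus blocking BPND) is regressed on baseline BPND, and the slope is taken as occupancy. The regional values below are hypothetical, and fitting the line through the origin is an assumption of this sketch; the legend specifies only that occupancy corresponds to the slope of a linear regression.

```python
import numpy as np

def estimate_occupancy(bp_baseline, bp_blocking):
    """Slope of (baseline - blocking) regressed on baseline BP_ND.

    Assumption of this sketch: the line is constrained through the origin,
    i.e. reduced binding = occupancy * baseline binding.
    """
    x = np.asarray(bp_baseline, dtype=float)
    y = x - np.asarray(bp_blocking, dtype=float)  # decrease in specific binding
    # Closed-form least-squares slope for y = occupancy * x.
    return float(np.dot(x, y) / np.dot(x, x))

# Hypothetical regional BP_ND values (one entry per brain region); not study data.
baseline = [2.0, 1.5, 1.0, 0.5]
blocking = [0.4, 0.3, 0.2, 0.1]  # each region retains 20% of its baseline binding

occupancy = estimate_occupancy(baseline, blocking)
print(f"Estimated occupancy: {occupancy:.2f}")
```

Because every hypothetical region loses the same fraction of its binding, the points fall on a single line and the slope recovers that fraction. With real data the regions scatter around the line, and the slope summarizes occupancy across regions.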

    S2 Fig. Comparable effects of D2R antagonism between raclopride and haloperidol at similar occupancy.

    (A) Occupancy of D2R measured at striatal ROI is plotted against dose of raclopride. (B) Error rates as a function of reward size for control (black) and after injection of raclopride (10 μg/kg, i.m. left side) and haloperidol (10 μg/kg, i.m. right side) in monkey KN are plotted. Dotted curves are best-fit inverse function (model #1 in S1 Table). The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D2R, D2-like receptor; ROI, region of interest.

    (PDF)

    S3 Fig. Effect of D1R/D2R blockade on RT and error pattern.

    (A, B) Cumulative distribution of RT for control and D1R blockade conditions in drop-1 and drop-8 trials, respectively. (C) Mean RT as function of reward size for control and D1R blockade conditions. Two-way ANOVA, reward × condition; main effect of condition, F(4, 164) = 109.8, p < 1.0 × 10^−15; main effect of reward, F(3, 164) = 111.0, p < 10^−15; interaction, F(12, 164) = 4.7, p < 1.0 × 10^−5. (D) Late release rate (mean ± SEM) as function of reward size for control and D1R blockade conditions. Two-way ANOVA, reward × condition; main effect of condition, F(4, 163) = 18.6, p < 1.0 × 10^−11; main effect of reward, F(3, 163) = 9.8, p < 10^−5; interaction, F(12, 163) = 1.0, p = 0.4. (E–H) Same as (A–D), but for D2R blockade. RT; main effect of condition, F(6, 92) = 7.2, p < 1.0 × 10^−5; main effect of reward, F(3, 92) = 81.9, p < 10^−15; interaction, F(18, 164) = 0.6, p = 0.65. Late release rate; main effect of condition, F(6, 90) = 3.5, p = 0.0038; main effect of reward, F(3, 90) = 19.2, p < 10^−9; interaction, F(18, 90) = 1.4, p = 0.14. * significantly different from control, p < 0.05 post hoc Tukey HSD. Data were obtained from monkey ST. The data underlying this figure can be found on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; HSD, honestly significant difference; RT, reaction time.

    (PDF)

    S4 Fig. Little influence of DAR blockade on sucrose preference and blood osmolality.

    (A) Sucrose preference index after administration of saline (Control), SCH23390 (30 μg/kg; D1), or haloperidol (10 μg/kg; D2, day 0). There was no significant effect of DAR blockade on overall intake (1-way ANOVA, treatment; monkey KY, F(2, 8) = 1.26, p = 0.33; monkey RO, F(2, 14) = 2.01, p = 0.17) or on sucrose preference (1-way ANOVA, treatment; monkey KY, F(2, 8) = 1.62, p = 0.26; monkey RO, F(2, 8) = 1.38, p = 0.31). (B) Blood osmolality measured in serum samples obtained before (pre) and after (post) the sucrose test. There was no significant impact of DAR blockade (2-way ANOVA; monkey KY, main effect of treatment, F(2, 10) = 4.0, p = 0.056; pre-post, F(1, 10) = 93.83, p = 2.1 × 10^−6; interaction, F(2, 10) = 0.74, p = 0.50; monkey RO, main effect of treatment, F(2, 20) = 1.22, p = 0.32; pre-post, F(1, 20) = 40.8, p = 3.1 × 10^−6; interaction, F(2, 20) = 0.13, p = 0.88). Filled circles and shades indicate median and raw data points, while horizontal bars indicate SD. The data underlying this figure can be found in the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. DAR, DA receptor.

    (PDF)

    S5 Fig. Effect of both D1R and D2R blockades on the relationship between refusal rate and RT.

    (A) Session-by-session relationship between refusal rate and average RT for each reward size under D2 blockade and D1+D2 blockade in delay trials. Data are plotted individually for monkeys KN, MP, and ST, from top to bottom. Colors indicate treatment condition. Thick lines indicate linear regression lines (model #1 in S5 Table). (B) Same as (A), but for workload trials. Note that for workload trials, a multiple linear model with a random effect of condition (model #3 in S5 Table) was selected as the best model to explain the data; in this model, the steepness of the slope under D1+D2 treatment was the same as that of control. The data underlying this figure can be found in the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR. D1R, D1-like receptor; D2R, D2-like receptor; RT, reaction time.

    (PDF)

    Data Availability Statement

    All data presented in this paper have been posted on the following public repository: https://github.com/minamimoto-lab/2021-Hori-DAR.


    Articles from PLoS Biology are provided here courtesy of PLOS
