Abstract
The present review article summarizes and expands upon the discussions that were initiated during a meeting of the Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia (CNTRICS; http://cntrics.ucdavis.edu). A major goal of the CNTRICS meeting was to identify experimental procedures and measures that can be used in laboratory animals to assess psychological constructs that are related to the psychopathology of schizophrenia. The issues discussed in this review reflect the deliberations of the Motivation Working Group of the CNTRICS meeting, which included most of the authors of this article as well as additional participants. After receiving task nominations from the general research community, this working group was asked to identify experimental procedures in laboratory animals that can assess aspects of reinforcement learning and motivation that may be relevant for research on the negative symptoms of schizophrenia, as well as other disorders characterized by deficits in reinforcement learning and motivation. The tasks described here that assess reinforcement learning are the Autoshaping Task, Probabilistic Reward Learning Tasks, and the Response Bias Probabilistic Reward Task. The tasks described here that assess motivation are Outcome Devaluation and Contingency Degradation Tasks and Effort-Based Tasks. In addition to describing such methods and procedures, the present article provides a working vocabulary for research and theory in this field, as well as an industry perspective about how such tasks may be used in drug discovery. It is hoped that this review can aid investigators who are conducting research in this complex area, promote translational studies by highlighting shared research goals and fostering a common vocabulary across basic and clinical fields, and facilitate the development of medications for the treatment of symptoms mediated by reinforcement learning and motivational deficits.
Keywords: reinforcement, reward, motivation, learning, cognition
This review manuscript is a result of discussions initiated during a meeting of the Cognitive Neuroscience Treatment Research to Improve Cognition in Schizophrenia (CNTRICS; http://cntrics.ucdavis.edu/; accessed August 19, 2013) and discussions that continued among the co-authors of this manuscript. The purpose of this CNTRICS meeting was to identify experimental procedures and measures that can be used in laboratory animals to assess psychological constructs with relevance to the psychopathology of schizophrenia. The Motivation Working Group of this CNTRICS meeting, which included most of the authors of this article as well as additional participants, was asked to identify experimental procedures in laboratory animals that assess reinforcement learning and motivation, two constructs that are highly relevant to the negative symptoms cluster of schizophrenia (Andreasen, 1982). Tasks nominated by the scientific community were discussed, evaluated, and selected by the Motivation Working Group of CNTRICS based on: (i) relevance to the assessment of reinforcement learning and/or motivation; (ii) extent of prior work validating the construct validity of these tests (i.e., their ability to measure what they purport to measure); and (iii) translational potential. The motivation group agreed to focus on tasks that assess the most fundamental processes involved in Reinforcement Learning and Motivation.
The objectives of this review article are to provide descriptions of methods that can be used to study these constructs in experimental animals, review how these procedures are implemented, and discuss how the results are interpreted. In addition, the validity of the measures provided by these procedures and what is known about the neural substrates of the processes involved in performing these tasks are discussed. The tasks discussed here are the ones that were nominated to the Motivation Working Group of CNTRICS and that, after detailed discussion, were selected as assessing processes and constructs relevant to the negative symptoms of schizophrenia. This is not necessarily an exhaustive list, and it is recognized that additional tasks exist or can be designed to assess various aspects of the negative symptoms of schizophrenia. For example, although motivation research includes studies of both aversive and appetitive motivation, the present review focuses on appetitive learning only. Furthermore, we highlight tasks that assess fundamental processes in a relatively simple fashion. Additional, more complicated tasks may also prove useful and could have high relevance for modeling the complexity of everyday life challenges faced by patients. Nevertheless, the assessment of basic reinforcement learning and motivational processes will facilitate the analysis of the neuropathological changes in these processes that lead to the negative symptoms. Finally, this review article provides a pharmaceutical industry perspective about how such experimental procedures may be applied to drug discovery efforts for the treatment of the negative symptoms of schizophrenia.
It should be emphasized that the constructs discussed here have relevance to multiple neuropsychiatric disorders characterized by alterations in reinforcement learning and motivational functions, such as major depression, neurodegenerative diseases (e.g., Parkinsonism, Alzheimer’s disease), dementias, and ageing, in addition to having relevance to the negative symptoms of schizophrenia (Salamone et al., 2007; Der-Avakian and Markou, 2012).
Reinforcement Learning was defined by CNTRICS as “Acquired behavior as a function of both positive and negative reinforcers, including the ability to: (a) associate previously neutral stimuli with value, as in Pavlovian conditioning; (b) rapidly modify behavior as a function of changing reinforcement contingencies; and (c) slowly integrate over multiple reinforcement experiences to determine behaviors that are optimal in the long run despite environmental uncertainty”. Motivation was defined during the discussion as those processes that modulate the direction and activation (i.e., initiation, persistence, speed, or exertion of effort) of behavior in relation to significant external and internal stimuli. As revealed by the above definitions, both reinforcement learning and motivation are multifaceted processes (Salamone, 2007; Ward et al., 2012).
Several procedures, including an Autoshaping Task, Probabilistic Reward Learning, and the Response Bias Probabilistic Reward Task, are described here as procedures to assess Reinforcement Learning. Although motivation is involved in the execution of almost all experimental procedures, Effort-Based Tasks and the Outcome Devaluation and Contingency Degradation Tasks provide the opportunity to parse various components of motivation from each other.
Clinical and Preclinical Terminology
Much clinical and preclinical research has examined reinforcement learning and motivational processes in healthy humans, patients, and laboratory animals. The clinical and preclinical literature indicates that there are several different terms used to describe experimental findings in animals and humans. Some of these terms are not defined in a consistent or clear manner. Moreover, little effort has been invested in linking the preclinical and clinical realms, and therefore, terminology varies widely. These discrepancies in terminology have led to some confusion and a lack of successful communication. Thus, we attempt here to briefly relate the various terms used by clinicians, clinical researchers, and animal researchers in the hopes of enhancing the translational value of work on reward and motivation to both preclinical and clinical researchers concerned with these domains of function.
Clinical Terms
The most frequently used term to describe reduced behavioral activation/output in the clinical literature is fatigue, understood as encompassing mental or central fatigue in addition to physical or motor exhaustion (Chaudhuri and Behan, 2004). This term is used for many indications, including schizophrenia, depression, and muscular and neurodegenerative disorders, despite the lack of a strict and precise definition. Although the term fatigue is used frequently in other domains (e.g., Parkinson’s disease; cf. Friedman et al., 2010), it does not appear in the scales common in schizophrenia research (Andreasen, 1981; Kay et al., 1987).
A frequently used term in the clinical realm, and sometimes in preclinical work as well, is apathy. This term is most frequently used to refer to motivational deficits in schizophrenia, dysthymia, depression, stroke, progressive supranuclear palsy, and neurodegeneration, particularly in Huntington’s disease (Ishizaki and Mimura, 2011; van Reekum et al., 2005). Apathy is defined as “diminished goal-oriented behavior and cognition, and a diminished emotional connection to goal-directed behavior” (Marin, 1991); thus, this term describes a type of motivational dysfunction relevant to the focus of this review (Clarke et al., 2011; Oakeshott et al., 2012).
Anhedonia, a term coined by the French psychologist Ribot (Ribot, 1896), is also used in both clinical and preclinical domains. Anhedonia refers to the reduced experience of or inability to experience pleasure during reward delivery. The term anhedonia has been used at times to refer to not only the experience but also the pursuit of pleasure, thus leading to confusion in the literature. This extension of the definition of the term anhedonia to both of these aspects of reinforced behavior (i.e., both the experience of pleasure and the pursuit of rewards) is undesirable because of references to two different psychological processes that are mediated by dissociable, albeit overlapping, neural circuits (Berridge and Robinson, 1998; Der-Avakian and Markou, 2012) and because it leads to confusion in the field. Here, we restrict our use of the term anhedonia to the emotional reaction to reward as it is being experienced.
Avolition, understood as a reduction of the ability to initiate and maintain goal-directed behavior, is a term used in schizophrenia research and diagnosis as part of the negative symptom cluster. Indeed, the Scale for the Assessment of Negative Symptoms (SANS), one of the most commonly used scales in the clinic, considers an avolition-apathy domain as distinct from the other negative symptom clusters, namely anhedonia, alogia, affective flattening, asociality, and attentional impairment (Andreasen, 1981, 1982). The Positive and Negative Syndrome Scale (PANSS; Kay et al., 1987), another widely used scale, uses the term apathy in the context of social withdrawal, which is included in the negative symptom cluster, and “disturbance of volition,” which is included in a general pathology subscale. In the current paper, we restrict our definition of avolition to the processes that modulate the initiation and maintenance of goal-directed behavior.
Anergia, which is defined as “lack of perceived energy,” is used to describe the lack of physical activity without ascribing such inactivity to any particular process, although it is included in the avolition/apathy cluster in the SANS (Andreasen and Olsen, 1982). Psychomotor retardation tends to connote a slowing or reduction of motor activity in general, although it is also used to reflect a reduction of the speed of processing incoming stimuli, resulting in a slow motor response. It is difficult to empirically separate its two components, the motor and the mental slowness (bradyphrenia). Motor retardation appears in the PANSS as part of the general pathology scale (Kay et al., 1987). These clinical terms correspond most closely to what we call the activational aspects of motivation later in this paper.
Preclinical Terms and Their Relationship to the Clinical Terms
There does not appear to be a discrepancy in either the preclinical or clinical domain about reinforcement learning despite the fact that there are many aspects of reinforcement learning. However, the term motivation presents more challenges because motivation is the result of many subprocesses. Motivated behavior takes place in phases, and one distinction seen in the literature is between the sequences of behaviors that bring the experimental animal subject into physical proximity with the goal object (e.g., the reward or reinforcer) or increase the chances that the goal object will be delivered (anticipatory, appetitive, preparatory, approach, or seeking behavior) versus the direct interaction with the motivational stimulus or goal object (i.e., consummatory or taking behavior; Craig, 1917; Andreasen and Olsen, 1982). Note that the anticipatory behavior itself is regulated by multiple subprocesses, including those that underlie the representation of the value of a goal, the assessment of the effort needed to obtain the goal, and the cost-benefit analysis that leads to action (Salamone and Correa, 2002).
Another classic distinction in the literature is between directional aspects of motivation (i.e., the fact that behavior is directed toward or away from stimuli) versus activational aspects (i.e., the fact that motivated behavior is characterized by a high degree of activity, persistence, or effort; Cofer and Appley, 1964; Salamone, 1988). These dimensions of motivation are not mutually exclusive and often are used in concert in the literature (Berridge and Robinson, 1998; Brebion et al., 2000; Salamone, 2007). Nevertheless, these components are also dissociable experimentally (see below). Although the terms used in the human clinical literature and preclinical animal studies are sometimes at variance, there are also examples of consistency and overlap. The term liking is similar to terms such as hedonic impact or positive valence (Smith et al., 2011). Thus, anhedonia could be said to reflect a condition of diminished liking. Similarly, clinical conditions, such as fatigue, apathy, psychomotor retardation, and anergia, are thought to reflect blunted behavioral activation (Salamone, 2007, 2010). Furthermore, studies of reinforcement learning are conducted both in humans and in other animals, often using analogous procedures. Thus, it is possible to review the literature on preclinical studies of motivation and reinforcement learning in a manner that is highly relevant for the understanding of motivational dysfunctions in schizophrenia.
Reinforcement Learning Tasks
There is an extensive literature that suggests that schizophrenia patients have deficits in reinforcement learning (e.g., Yilmaz et al., 2012; Weiler et al., 2009; Farkas et al., 2008; for review, see Gold et al., 2008), although the basis for these deficits may be varied in different subgroups of patients (Farkas et al., 2008). Importantly, reinforcement learning deficits are associated with the expression of negative symptoms and may indeed be an important contributor to the etiology of these symptoms (e.g., Yilmaz et al., 2012). Learning deficits have been reported in both classical (Pavlovian) conditioning procedures (e.g., Hofer et al., 2001; Dowd and Barch, 2012) and instrumental or operant learning tasks (e.g., Weiler et al., 2009).
The three animal procedures that were selected to be described here are the Autoshaping Task that allows the assessment of Pavlovian (classical) conditioning processes and the Probabilistic Learning Task and Response Bias Probabilistic Learning Task that involve operant conditioning.
Autoshaping Task
The Autoshaping Task allows the assessment of the construct of Pavlovian (classical) conditioning processes. Autoshaping was discovered when researchers shaped pigeons to peck an illuminated response key and found that merely preceding the delivery of reinforcement with illumination of the key was sufficient to induce pigeons to peck at the illuminated key. In other words, key-pecking behavior was “shaped” without actually making the reward contingent on pecking, hence the name autoshaping (Brown and Jenkins, 1968). In autoshaping, repeated conditioned stimulus (CS)–unconditioned stimulus (US) pairing gives rise to Pavlovian approach behavior that develops across trials, allowing the assessment and comparison of animals’ reinforcement learning rates. In rodents, there are several computer-automated procedures with which to study autoshaping, primarily involving operant conditioning chambers equipped with levers or computer monitors fitted with touch-sensitive screens. In the operant conditioning chambers, first a CS, usually a light near or within a lever or the extension of a retractable lever into the chamber, is followed by reinforcement, and lever contacts are the recorded measure of interest. The task is best conducted as a discriminative conditioning procedure, using a CS+ (e.g., left lever) and a CS− (e.g., right lever) to provide a control for overall levels of responding that are assessed as responding on the CS− manipulandum. After the CS+ presentation, reinforcement is delivered, while no reinforcement is delivered after the CS− presentation. In the touchscreen method (Bussey et al., 1997; see Figure 1A), which is also a discriminative conditioning procedure, a stimulus (white rectangle) is shown on either the left or right of the screen for (usually) 10 s (a detailed protocol is provided in Horner et al., 2013). In both of these methods, over trials the animal learns that the CS+ predicts reward and makes increasingly more responses to the CS+. 
Responses to the CS− typically stabilize at a low level. Data from the touchscreen method using C57BL/6 mice are shown in Figure 1B. It is also possible to measure, concurrently with stimulus approaches, approaches toward the location of the reinforcer delivery during the CS presentation. In this way, “sign tracking” (i.e., approaches toward the CS+) can be dissociated from “goal tracking” (i.e., approaches toward the reinforcer; Flagel et al., 2007; Danna and Elmer, 2010).
The neural circuitry underlying autoshaping has been strongly implicated in the neuropathology of schizophrenia. Specifically, structures within the mesolimbic dopamine system, as well as elements of the prefrontal cortex, are critical for normal autoshaping behavior. Neurotransmitter systems relevant to schizophrenia, including the dopaminergic and glutamatergic systems, are also involved in performance of this task.
Dopamine depletion in the nucleus accumbens impairs autoshaping acquisition and also performance of a pre-operatively learnt discrimination (Parkinson et al., 2002; Dalley et al., 2002). The core, but not the shell, region of the nucleus accumbens is particularly important for the acquisition of autoshaping (Parkinson et al., 1999). In addition, excitotoxic lesions of the nucleus accumbens core region impair a previously acquired association (Cardinal et al., 2002). Studies involving drug infusions into the nucleus accumbens core during the acquisition of a lever-based autoshaping task revealed that the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA)/kainate receptor antagonist LY293558 disrupted discriminated approach performance but not acquisition, whereas the N-methyl-d-aspartate (NMDA) receptor antagonist AP-5 impaired acquisition but did not interfere with performance of a previously learned approach (Di Ciano et al., 2001). The dopamine D1/D2 receptor antagonist α-flupenthixol decreased approaches to the CS+ during both acquisition and performance (Di Ciano et al., 2001). A particularly interesting finding is the observation that in rats performing the touch screen version of the task, injections into the nucleus accumbens of the dopamine D1 (SCH 23390) or NMDA (AP-5) receptor antagonists after each daily session of touch screen autoshaping impaired acquisition (Dalley et al., 2005). D2 receptor antagonism (sulpiride) and amphetamine infusion had no effect in this procedure (Dalley et al., 2005). These findings suggest that D1 and NMDA receptors are specifically involved in the consolidation of Pavlovian learning. In contrast to the circuits that include the nucleus accumbens, the dorsal striatum-prefrontal cortex circuit does not appear to be critically involved in autoshaping (Christakou et al., 2001).
The prefrontal cortex appears to be necessary for the acquisition of discriminated approach responses. The orbitofrontal cortex is necessary for acquisition but is not necessary once the discrimination has been learned (Chudasama and Robbins, 2003). Similarly, lesions of the post-genual anterior cingulate have been shown to impair acquisition (Bussey et al., 1997). Additionally, Parkinson et al. (2000b) used a disconnection lesion procedure to show that the nucleus accumbens core and anterior cingulate compose part of a corticostriatal circuit involved in autoshaping. Lesions of the medial prefrontal cortex (including the prelimbic, infralimbic, and pre-genual anterior cingulate cortices) do not substantially affect autoshaping, nor do lesions of the posterior cingulate (Bussey et al., 1997).
Lesions of the subthalamic nucleus (STN) impair autoshaping (Winstanley et al., 2005), consistent with other reports that STN lesions impair sign-tracking (Uslaner et al., 2008). Furthermore, lesions of the pedunculopontine tegmental nucleus (PPTg), a brainstem nucleus that has been implicated in schizophrenia (Yeomans, 1995), impair autoshaping (Inglis et al., 2000), perhaps by altering attentional control or sensory gating, functions thought to be affected in schizophrenia (Nuechterlein et al., 2009; Luck and Gold, 2008).
Other structures involved in autoshaping include the central but not basolateral nucleus of the amygdala (Parkinson et al., 2000a), whereas lesions of the hippocampus may facilitate autoshaping (see below; Ito et al., 2005).
Systemic pharmacological studies also underscore the utility of the autoshaping procedure for preclinical studies relevant to schizophrenia patients. The nonselective dopamine receptor agonist apomorphine impaired autoshaping (Dalley et al., 2002). Furthermore, the atypical antipsychotic olanzapine and typical antipsychotic haloperidol disrupted conditioned approach to the reward-predictive cue (sign-tracking), but neither drug disrupted conditioned approach to the reward (goal-tracking; Danna and Elmer, 2010). As mentioned above, a D1 receptor antagonist but not a D2 receptor antagonist impaired autoshaping when infused into the nucleus accumbens post-session (Dalley et al., 2005).
A few studies have shown that certain manipulations can enhance autoshaping. For example, administration of the 5-HT1A receptor agonist 8-OH-DPAT improved the consolidation of autoshaping (Meneses and Hong, 1994). Furthermore, intracerebroventricular (ICV) infusions of the serotonergic neurotoxin 5,7-dihydroxytryptamine increased the speed and number of responses made in autoshaping (Winstanley et al., 2004); however, it cannot be argued that these effects represent an “improvement” in task performance, at least not in terms of discrimination learning rate. Interestingly, a more convincing demonstration of increased responding and what appears to be an increase in learning rate was found after excitotoxic hippocampal lesions (Ito et al., 2005).
There are conditioning paradigms available for human testing that involve at least some of the same brain regions implicated in the rodent autoshaping task (e.g., Bechara et al., 1995; Phelps et al., 2001; Delgado et al., 2006). Studies in healthy humans that have examined Pavlovian autoshaping indicate response patterns similar to those seen in experimental animals (e.g., Pithers, 1985; Wilcove and Miller, 1984). Although the results of such studies are not in agreement with regard to whether the mechanisms that underlie the behavior in humans and experimental animals are the same, the findings suggest that the development and validation of an autoshaping procedure for humans is feasible.
To summarize, there is now a substantial amount of data on the neurobiology of autoshaping, particularly in the touch screen version of the task, that shows that the task can dissociate the functions of different systems, including neurotransmitter systems and receptor subtypes, and subregions within the same subcortical nuclei. Furthermore, both decrements and increments in performance can be detected. The structures and systems that underlie autoshaping are highly relevant to schizophrenia, as are the psychological processes thought to be tapped by this task. However, whether schizophrenia patients exhibit deficits in Pavlovian (classical) conditioning is unclear. Lubow (2009) argued that the conditioned eyeblink response is not impaired in schizophrenia. However, Romaniuk et al. (2010) reported that patients with schizophrenia showed abnormal activation of the amygdala, midbrain, and ventral striatum during conditioning. Indeed, in the patient group, the activation of midbrain structures correlated with the severity of delusional symptoms. Nevertheless, the use of conditioning paradigms, such as the autoshaping task, will allow researchers to explore this important question. Classical conditioning is a fundamental type of learning, deficits in which could profoundly affect performance in a variety of learning tasks. Classical conditioning is not independently assessed in other reinforcement learning tasks.
Probabilistic Reward Learning Task
Reinforcement learning is generally defined as the modification of behavior based on past experience of the positive and/or negative consequences of particular predictive events (stimuli or actions). Reinforcement learning also refers to computational theories that quantitatively describe how optimal action selection emerges based on internal assignment of value to choice alternatives, where values are derived from reward and/or punishment expectancies built through prior feedback. As discussed below, there is considerable evidence that implicates frontostriatal circuitry and dopaminergic activity in reinforcement learning, neural substrates known to be dysregulated in schizophrenia and other neuropsychiatric disorders. Moreover, recent studies reporting deficits in probabilistic learning in several of these disorders (Frank et al., 2004; Waltz et al., 2007, 2011; Gold et al., 2012; Whitmer et al., 2012; Weiler et al., 2009; Heerey et al., 2008) have led to growing interest in the development of appropriate preclinical animal procedures in this domain.
The majority of probabilistic reinforcement learning paradigms employ an instrumental two-alternative forced choice procedure. Subjects are presented with a series of trials and on every trial are required to select one of two possible response options. Both response options may be associated with a positive or negative reinforcing outcome, but on any given trial, delivery of the reinforcer is uncertain. The odds of outcome delivery for each response option are determined based on a specified probability distribution that is unknown to the subject. Thus, choice options might result in reinforcement but sometimes lead to spurious null or negative feedback. The goal of the subject is to learn to make choices that maximize positive outcomes (and/or minimize negative outcomes) over the course of the session.
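The trial structure described above can be made concrete with a brief simulation. The sketch below is illustrative only: the delta-rule value update, the learning rate alpha, and the softmax inverse temperature beta are standard reinforcement learning conventions, not parameters taken from any of the tasks reviewed here.

```python
import math
import random

def simulate_probabilistic_choice(p_reward=(0.8, 0.2), n_trials=500,
                                  alpha=0.1, beta=3.0, seed=0):
    """Delta-rule learner performing a two-alternative probabilistic choice task."""
    rng = random.Random(seed)
    values = [0.0, 0.0]  # learned value of each response option
    choices = []
    for _ in range(n_trials):
        # Softmax choice rule: probability of selecting option 0
        p0 = 1.0 / (1.0 + math.exp(-beta * (values[0] - values[1])))
        choice = 0 if rng.random() < p0 else 1
        # Reinforcer delivery is probabilistic, so a choice sometimes
        # yields spurious null feedback even for the richer option
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        # Update only the chosen option toward the obtained outcome
        values[choice] += alpha * (reward - values[choice])
        choices.append(choice)
    return values, choices
```

Across trials, the learned values approach the underlying reward probabilities and choices concentrate on the more profitable option, mirroring the behavior expected of intact subjects in these tasks.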
Several versions of instrumental probabilistic reinforcement learning tasks have been developed for rodents. Brunswik (1939) first implemented the basic procedure in an elevated T-maze, while more recently reinforcement learning with multiple alternatives has been studied in the nine-hole box (Bari et al., 2010) and operant lever chambers (Hiraoka, 1984). In all of these task variants, rodents learn to preferentially select the more profitable response option as the number of trials increases. Two-alternative probabilistic learning tasks are also widely used in humans and nonhuman primates. However, these tasks typically involve response choices defined by visual stimuli rather than by the spatial locations that predominate in rodent versions. The simplest procedures are similar in structure to rodent tasks; that is, subjects choose between two stimulus alternatives associated with a high or low (e.g., 80:20%) probability of reward (Chamberlain et al., 2006; Kasanova et al., 2011). Other procedures have extended the basic two-choice paradigm to promote the reliance on implicit reinforcement learning processes. For example, in the probabilistic classification task, subjects are asked to predict a binary outcome (e.g., sun or rain) based on a complex set of visual cues, where individual cues are each probabilistically associated with the outcomes (Knowlton et al., 1994). Over trials, normal subjects progressively learn to select the outcome more frequently associated with a specific cue pattern. More recently, Frank et al. (2004) developed a probabilistic selection task in an attempt to better distinguish between learning that results from positive versus negative feedback. During task acquisition, one of three pairs of visual stimuli (e.g., Hiragana characters) that vary in reward probabilities (80:20%, 70:30%, and 60:40%) is randomly presented in each trial. 
The subjects are trained for up to 360 trials to gain exposure to these initial pairings and learn to select the more profitable options. To assess the contribution of positive or negative feedback, a transfer/probe phase is implemented, in which novel combinations of stimuli are presented, but no feedback is given. The extent to which the 80% stimulus option is chosen more frequently than its novel pairings (70%, 60%, 40%, 30%) provides an index of positive feedback learning. Conversely, avoidance of the 20% option in the same novel pairings is a marker for negative feedback learning.
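The transfer-phase logic can be expressed as a simple scoring rule. In the hypothetical sketch below, each probe trial is recorded as the pair of stimulus reward probabilities shown together with the probability label of the chosen stimulus; the data format and function name are illustrative and are not drawn from the original task code.

```python
def feedback_learning_indices(probe_choices):
    """Score transfer-phase probe trials from a probabilistic selection task.

    probe_choices: list of (options, chosen) tuples, where options is the
    pair of stimulus reward probabilities shown on a probe trial and chosen
    is the reward probability of the stimulus the subject selected.
    """
    # "Choose A" trials: the 80% stimulus paired with anything but its
    # original 20% partner; choosing it indexes positive feedback learning
    choose_a = [c == 0.8 for opts, c in probe_choices
                if 0.8 in opts and 0.2 not in opts]
    # "Avoid B" trials: the 20% stimulus paired with anything but the 80%
    # stimulus; avoiding it indexes negative feedback learning
    avoid_b = [c != 0.2 for opts, c in probe_choices
               if 0.2 in opts and 0.8 not in opts]
    pos = sum(choose_a) / len(choose_a) if choose_a else float("nan")
    neg = sum(avoid_b) / len(avoid_b) if avoid_b else float("nan")
    return pos, neg
```

A dissociation between the two indices (e.g., intact avoidance of the 20% stimulus alongside impaired choice of the 80% stimulus) is the pattern of interest in patient studies.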
Recently, Mar and colleagues developed a novel rodent analogue of the probabilistic selection task in the touchscreen apparatus (Trecker et al., 2012). In the basic procedure, rats are trained to initiate trials in which a pair of distinct visual stimuli are presented on the touchscreen. The first novel pair of stimuli is presented as a standard two-alternative visual discrimination (100:0%) such that touching the profitable stimulus always results in a food pellet reward, and touching the unprofitable stimulus leads only to a 5-s timeout period. After the animals learn to adopt a choice preference for the profitable stimulus (~4-8 sessions of up to 200 trials each), the subjects are sequentially exposed to five novel stimulus pairs, with each pair presented for six consecutive sessions before the next stimuli are introduced. The reward probability ratios for each stimulus pair are 100:0%, 90:10%, 80:20%, 70:30%, and 50:50%, with the order of presentation counterbalanced across animals to help control for possible carry-over effects. As expected based on reinforcement learning theory, rats show a diminishing choice preference for the more profitable option as the difference between profitable and unprofitable probabilities decreases. As an important control, no significant choice preference or stimulus bias for the 50:50% pair was seen (Trecker et al., 2012). In the next phase, the five stimulus pairs are presented, interleaved randomly across trials within the same session, analogous to the human version. Two sets of probe trials are then implemented within the interleaved sessions to assess positive and negative feedback learning. First, all five pairs are presented in novel combinations (each of the 40 possible pairs presented once per session for four sessions, without reinforcement) such that the stimulus associated with higher reward probabilities can be compared with all stimuli that have lower reward probabilities and vice versa. 
Further information is also obtained by examining choice preferences in relation to the anchoring pairs of 100:0% and 50:50%. Then, all 10 stimuli from the original pairs are combined with a novel stimulus (10 possible pairs presented four times per session, without reinforcement) such that biases toward selecting more profitable stimuli and avoiding unprofitable stimuli can be examined. The application of these probes to a sample of control Lister hooded rats suggested that, similar to the human version, both positive and negative feedback learning is taking place during probabilistic discrimination. Moreover, the avoidance of previously unprofitable stimuli was significantly greater than approach to previously profitable stimuli when these stimuli were paired with a novel stimulus.
This analogue of the probabilistic selection task offers a promising translational tool for examining probabilistic reinforcement learning in rodents and potentially primates. The task engages similar stimulus and response modalities as the human version and enables assessment of identical outcome measures. This task also adds two control choice conditions (100:0% and 50:50%), which are an important innovation for better interpretation of the task results. The task presently requires approximately 50 training sessions from initial box introduction to the end of probe trial testing, but the overall training time has been compressed by giving two sessions per day. Further refinements (e.g., curtailing the task to examine only a single probability) may render the basic task more suitable for higher throughput testing and optimize it for drug discovery/development. Moreover, the procedure also lends itself well to the assessment of other constructs, such as probabilistic reversal learning, providing a further index of executive function (Gilmour et al., 2012). As this task has only recently been developed, there are as yet no psychometric data concerning test-retest reliability, nor data examining construct validity or sensitivity to pharmacological agents.
There is a growing body of empirical evidence examining the neural bases of reinforcement learning. Ventral and medial areas of the prefrontal cortex have been implicated in the encoding of the value of a stimulus and/or expected rewards in human functional magnetic resonance imaging studies (Kable and Glimcher, 2007; Hare et al., 2008; Chib et al., 2009; FitzGerald et al., 2009; Levy et al., 2010), as well as in electrophysiological studies in nonhuman primates (Padoa-Schioppa and Assad, 2006, 2008; Kennerley et al., 2011; Padoa-Schioppa, 2009) and rats (Takahashi et al., 2011). Across these species, the medial regions of the orbitofrontal cortex have been implicated in the processing of reward outcome value (Arana et al., 2003; Kringelbach, 2005; Grabenhorst and Rolls, 2009; Rudebeck and Murray, 2011; Noonan et al., 2010; Gourley et al., 2010; Mar et al., 2011), whereas null or negative error-related feedback has been linked to the function of the anterior cingulate cortex (Bellebaum et al., 2010; Carter et al., 1998, 1999; Ito et al., 2003; Bryden et al., 2011). A large amount of theoretical work suggests that the difference between such expected value and actual value signals (the prediction error) is requisite to reinforcement learning, and evidence indicates that one of the signals encoded by the phasic activity of midbrain dopamine neurons is the reward prediction error (Montague et al., 1996; Schultz et al., 1997; for reviews, see Glimcher, 2011; Schultz, 2010). Phasic dopamine signaling has been proposed to modify synaptic plasticity within the frontal cortex and basal ganglia, such that stored expected values can be updated and/or adjusted (Wickens et al., 1996; Surmeier et al., 2009; Sheynikhovich et al., 2011).
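The prediction-error account sketched above can be summarized by the standard delta rule, in which the expected value is nudged toward each observed outcome in proportion to the prediction error. The following minimal sketch (the learning rate is an illustrative assumption) shows how the error shrinks as the value estimate converges on the delivered reward:

```python
def rescorla_wagner(rewards, alpha=0.2, v0=0.0):
    """Return the sequence of (prediction error, updated value) pairs
    produced by a simple delta-rule learner over a list of rewards."""
    v = v0
    history = []
    for r in rewards:
        delta = r - v        # prediction error: actual minus expected value
        v = v + alpha * delta  # value update scaled by the learning rate
        history.append((delta, v))
    return history
```

With a constant reward of 1.0, the first prediction error equals 1.0 and successive errors shrink geometrically, mirroring the diminishing phasic dopamine response to a fully predicted reward.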
While much of the work examining the neural substrates of reinforcement learning has used Pavlovian learning paradigms or instrumental tasks, in which reward timing or magnitude are manipulated to generate prediction errors, relatively less work has examined the neural substrates of instrumental probabilistic reinforcement learning per se. Existing evidence from probabilistic classification and probabilistic learning tasks most consistently implicates roles for the basal ganglia and mesolimbic dopamine system (Shohamy et al., 2008; Maia and Frank, 2011). Empirical and computational modeling work on the probabilistic selection task in humans has suggested that dopamine may contribute to reinforcing both “Go” (learning to select actions with a high reward probability) as well as “No Go” (learning to avoid or suppress actions with low reward probability) learning (Frank et al., 2004). Unmedicated Parkinson’s disease patients (who exhibit depleted striatal dopamine levels) have been observed to learn more from negative than positive outcome feedback, whereas patients on medication (generally exhibiting increased striatal dopamine levels compared with unmedicated patients) show the opposite pattern (Frank et al., 2004; Palminteri et al., 2009). These learning biases are correlated with medication-induced increased sensitivity to positive prediction errors and reduced sensitivity to negative prediction errors in the ventral and dorsolateral striatum (Voon et al., 2010).
Recent studies have also associated genetic factors related to dopamine function with probabilistic learning. A triple dissociation was found for three polymorphisms that affect distinct aspects of dopamine function on probabilistic selection task performance. DARPP-32, linked with D1 dopamine receptor function in the striatum, was associated with choosing stimuli having higher (versus lower) probabilities of reward. C957T, which affects dopamine D2 receptor mRNA translation and stability and striatal postsynaptic D2 receptor density, was associated with the avoidance of stimuli having lower probabilities of reward. Val/Met, which affects the levels of the COMT enzyme and dopamine in the prefrontal cortex, was associated with trial-to-trial lose-shift strategies (Frank et al., 2007). However, identification of the genetic factors and neural circuits that contribute to reinforcement learning must also be tempered by the recognition that reinforcement learning processes appear to be intertwined with higher cognitive functions, such as working memory (Collins and Frank, 2012).
The primary measures of interest in probabilistic reinforcement tasks are the learning rate and asymptotic choice performance. However, as computational models attest, there are numerous factors or mechanisms within reinforcement tasks that may impact a subject’s learning and choice performance and influence interpretation of the results. A subject’s ability to associate and/or assign value to a stimulus or choice-response option may influence learning rates and has long been hypothesized to depend on or be modulated by attentional mechanisms (Mackintosh, 1975; Pearce and Hall, 1980; Dickinson, 1981). The ability to reliably detect and encode the differences between actual and expected outcomes (prediction errors) is widely considered to be the main engine of reinforcement learning (Rangel et al., 2008; Kable and Glimcher, 2009). As described briefly above, a subject’s valuation or weighting of positive or negative outcomes may differentially contribute to reinforcement learning. The extent to which prior reinforcement events are remembered or discounted may also affect individual choice patterns (e.g., a subject who only considers feedback from their last choice might engage in relatively more win-stay lose-shift behavior; Shimp, 1976; Williams, 1991; Collins and Frank, 2012). Furthermore, a subject’s capacity to update memories when outcome value or availability changes will also influence choice, as will susceptibility to bias. Finally, a subject’s propensity to adopt certain response strategies (e.g., exploitation versus exploration), use prediction error information (e.g., model-based or model-free reinforcement learning), or engage other cognitive systems (e.g., implicit or explicit) may further impact their trial-by-trial choices (Daw et al., 2006; Fu and Anderson, 2008; Doll et al., 2012). Such factors and strategic differences should thus be carefully considered when assessing the validity and translation of probabilistic reinforcement learning tasks. 
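One of the strategic factors mentioned above, the balance between exploitation and exploration, is commonly modeled with a softmax choice rule. The sketch below is illustrative only; the inverse-temperature parameter beta is an assumption rather than a value from any cited study. Low beta yields near-random (exploratory) choice, while high beta concentrates choice on the highest-valued option (exploitation):

```python
import math

def softmax_probs(values, beta):
    """Choice probabilities proportional to exp(beta * value).
    beta = 0 gives uniform (fully exploratory) choice; large beta
    approaches deterministic selection of the best option."""
    exps = [math.exp(beta * v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

Fitting such a rule to trial-by-trial choices is one way computational models separate a subject's learned values from their response strategy.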
The next section suggests some methods for assessing sensitivity to changing reinforcement contingencies.
Response-bias probabilistic reward task
This task provides a measure of reward responsiveness in terms of how reinforcement history and current reinforcement contingencies affect future actions and specifically the pursuit of rewards. The task has both a probabilistic reinforcement learning component and a signal-detection component. However, the interest is neither solely in probabilistic reinforcement learning, which is best assessed with the probabilistic learning tasks described above, nor in the accuracy of signal detection. An important difference between the classic probabilistic learning procedures and the presently described task is that the two stimuli presented are difficult to discriminate, in addition to being differentially and only partially reinforced. One stimulus is designated as the rich stimulus (i.e., the more frequently reinforced), and the other stimulus is the lean stimulus (i.e., the less frequently reinforced; the stimuli are counterbalanced across subjects). By making the two stimuli ambiguous, the procedure allows healthy control subjects to develop a response bias toward the more richly reinforced stimulus. The development of this response bias is often accompanied by differences in response accuracy for the two stimuli because the subjects tend to respond as though they have detected the rich stimulus even in cases when the lean stimulus was presented. The degree of development of this response bias is a measure of sensitivity to prior reinforcement contingencies (i.e., reward responsiveness) and presumably reflects the subject’s pursuit of reinforcers based on these contingencies.
This task was originally developed by Pizzagalli and colleagues (adapted from Tripp and Alsop, 1999) for use in human subjects in order to provide an objective laboratory-based measure of reward responsiveness that is not based on subjective self-reports (Pizzagalli et al., 2005). In the human version of the task, the subject is presented on a computer screen with a cartoon face that lacks a mouth. On each discrete trial, one of two mouths appears very briefly, and the subject has to press a key on the keyboard to indicate whether s/he saw a long (e.g., 13 mm) or a short (e.g., 11.5 mm) mouth. To allow the response bias to develop, correct responses are only partially reinforced (~40% of the time), with the mouth length designated as the rich stimulus being reinforced three times more frequently than the mouth length designated as the lean stimulus (Pizzagalli et al., 2005; a review of the human version of this task is provided in another CNTRICS paper; see Ragland et al., 2008). Over the duration of a single test session, healthy human subjects develop a response bias, expressed as more responses on the key that signifies the detection of the rich stimulus than the key associated with the lean stimulus. The development of this bias is reflected in the progressively increased accuracy for the rich stimulus and progressively decreased accuracy for the lean stimulus over the duration of the test session (Pizzagalli et al., 2005). The construct validity of the response bias measure as a measure of decreased reward responsiveness has been demonstrated by data showing that depressed inpatients (Vrieze et al., 2013a) and outpatients (Pizzagalli et al., 2008), as well as college student subjects who report increased depressive symptoms (as measured in the Beck Depression Inventory scale; score ≥ 16), failed to develop the response bias seen in healthy subjects (Pizzagalli et al., 2005). 
Of note, in a recent study, blunted response bias was greatest in major depressive disorder inpatients who reported elevated anhedonic symptoms, and reduced response bias predicted the chronicity of major depressive disorder diagnosis after 8 weeks of naturalistic treatment (Vrieze et al., 2013a). This pattern of results suggests that depressed patients, who are usually characterized by anhedonia and amotivation (American Psychiatric Association, 2000), and subjects with high Beck Inventory “melancholic” subscores and negative affect in the Positive and Negative Affect Scale (PANAS-NA; Watson et al., 1988) are relatively less affected by the differences in reinforcement associated with the different options than controls (Huys et al., 2013). Thus, the task parameters, procedures, and measures, as well as the studies in individuals with high depression scores or depression diagnosis, support the conclusion that the response bias probabilistic reward task provides a measure of reward responsiveness that may be useful for studying dysfunctions in how reinforcement contingencies influence the pursuit of rewards.
A study by Gold and colleagues indicated that medicated schizophrenia patients exhibited no deficits in the response bias probabilistic reward task (Heerey et al., 2008). However, the fact that these were medicated “stable outpatients” with schizophrenia and the fact that the smoking status of the study participants was not assessed introduce limitations and potential confounds that need to be addressed in future investigations. Smoking status at the time of the test is relevant because it has been hypothesized that the high smoking rates of psychiatric populations may reflect attempts to medicate untreated depression-like negative symptoms (Markou et al., 1998); that is, both the prescribed antipsychotic medications and the high smoking rates of the schizophrenia subjects may have alleviated deficits in the patients who participated in this study. Relevant to the above discussion are recent findings showing that the decreased development of response bias in this task was associated with increased levels of nicotine dependence in schizophrenia patients but not in control subjects (Ahnallen et al., 2012). Furthermore, nicotine increased response bias in healthy subjects (Barr et al., 2008). Taken together, these data suggest that nicotine in tobacco smoke may normalize the responsiveness to reinforcement in schizophrenia patients, as measured by the response bias probabilistic reward task. Future investigations may address the question of whether response bias measured in this task is affected in unmedicated non-tobacco-smoking schizophrenia patients as it is in depressed patients. If this is not the case in schizophrenia patients, then this task provides the opportunity to study differences in reward processing between depressed patients and schizophrenia patients with high levels of negative symptoms. 
In this regard, the rat analogue of the response bias probabilistic reward task will provide a valuable tool in the investigation of the neurobiology of deficits in reward responsiveness that may be seen in some, but not all, neuropsychiatric populations characterized by reinforcement learning and motivational deficits.
Toward that goal, Markou, Pizzagalli, and colleagues have been working on developing a rat version of the response-bias probabilistic reward task (Der-Avakian et al., under review). Virtually identical procedures and parameters between the human and rat versions of the task have been used so that the processes measured in the rat task may be homologous and not just analogous to those measured in the human task. In the rat version of the task, rats are gradually trained in a discrete-trial tone duration discrimination task procedure (called a bisection procedure in the literature on temporal discrimination). A trial starts with the presentation of one of two tones, one being a long duration tone and the other being a short duration tone; all of the other parameters of the tone (e.g., frequency and volume) are identical between the two tones. After the tone presentation, the two levers that were previously retracted extend into the box, and the rat may emit a single response on either lever. When the rat responds on the correct lever associated with the previously presented tone duration (location associated with a particular tone duration is counterbalanced across subjects), a food reinforcer is delivered. After either a correct or incorrect response (or after a limited period without any response), the levers are retracted, and after a variable intertrial interval, another trial is initiated by the presentation of another tone. Once the duration discrimination is learned, the rats are allowed to become accustomed to a partial reinforcement contingency that is the same for both types of stimuli. On the test day, which is identical in its parameters to the single test session to which the human subjects are exposed, all parameters of the task remain the same except that the two stimuli (i) become difficult to discriminate (i.e., the long and short durations are similar) and (ii) are differentially reinforced. 
That is, one stimulus is defined as the rich stimulus and is reinforced 60% of the time, whereas the other lean stimulus is reinforced 20% of the time, resulting in correct responses for rich stimuli being reinforced three times more frequently than correct responses for lean stimuli, exactly as in the human task. The measure of response bias (b) is operationally defined by the following formula:
b = ½ log₁₀ [(Rich correct × Lean incorrect) / (Rich incorrect × Lean correct)]  [Eq. 1]
Rich correct is the number of correct responses for the rich stimulus, Lean incorrect is the number of incorrect responses for the lean stimulus, Rich incorrect is the number of incorrect responses for the rich stimulus, and Lean correct is the number of correct responses for the lean stimulus.
Higher b scores reflect increased response bias toward the stimulus associated with greater reinforcement probability, whereas lower scores reflect decreased response bias toward the same stimulus. Although response bias is the primary measure of interest derived from this task, the same data are used to calculate additional measures, such as (i) discriminability, defined as the ability to perceptually discriminate between the two stimuli, and (ii) accuracy for the rich and lean stimuli, defined as the percent correct responses for the rich and lean stimuli, respectively. The following formula is used to calculate discriminability (d), with lower numbers reflecting lower discriminability:
d = ½ log₁₀ [(Rich correct × Lean correct) / (Rich incorrect × Lean incorrect)]  [Eq. 2]
Accuracy is of interest because it provides an analysis of the pattern of responding that results in changes in overall response bias. For example, an increased response bias is typically reflected by increased accuracy for the rich stimulus compared with the lean stimulus. Discriminability is of interest because it provides a measure of performance, which can vary depending on the pattern of responding as well. For example, blunted response bias may correspond with high discriminability if the accuracy for both stimuli is equally increased over the duration of the session or low discriminability if the accuracy for both stimuli is equally decreased over the duration of the session. As demonstrated by these examples, response bias may be independent of discriminability.
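Under the signal-detection conventions used for this task (cf. Pizzagalli et al., 2005), both measures can be computed directly from the four response counts defined above. The sketch below follows those standard formulas; note that in practice a small constant (e.g., 0.5) is often added to each cell to avoid taking the logarithm of zero, a refinement omitted here for clarity:

```python
import math

def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """b: positive values indicate a bias toward the rich stimulus,
    negative values a bias toward the lean stimulus."""
    return 0.5 * math.log10((rich_correct * lean_incorrect)
                            / (rich_incorrect * lean_correct))

def discriminability(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """d: lower values reflect poorer perceptual discrimination
    of the two stimuli, independent of any response bias."""
    return 0.5 * math.log10((rich_correct * lean_correct)
                            / (rich_incorrect * lean_incorrect))
```

For example, a subject who is equally accurate on both stimuli shows zero bias regardless of how well the stimuli are discriminated, whereas a subject who is more accurate on the rich than the lean stimulus yields a positive b, illustrating the independence of the two measures discussed above.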
The development of this response bias probabilistic reward task in rats is very recent, and thus only the effects of a few manipulations have been compared between humans and animal subjects. A recent study showed that nicotine withdrawal in both rats and humans resulted in decreased response bias in the task (Pergadia et al., in preparation). Furthermore, a single low dose of the dopamine D2/D3 receptor agonist pramipexole, which presumably acts on presynaptic receptors to decrease dopamine output, decreased response bias in both humans (Pizzagalli et al., 2008) and rats (Der-Avakian et al., under review). The decreased response bias induced by pramipexole in humans was correlated with decreased event-related potential (ERP) activation in the dorsal anterior cingulate cortex (Santesso et al., 2009), an area implicated in the representation of reinforcement value (Rushworth et al., 2007; Bussey et al., 1997). In contrast to the effects of pramipexole, pharmacological manipulations that increase dopamine transmission, that is, nicotine administration in humans (Barr et al., 2008) or rats (Pergadia et al., in preparation) and amphetamine administration in rats (Der-Avakian et al., 2013), increased response bias compared with vehicle-treated controls. These parallel observations in rats and humans provide the first demonstrations that this task has translational potential. As more such data accrue, the task will be ripe for drug discovery efforts aimed at treating reinforcement learning deficits and uncovering the neurobiological bases of such abnormalities.
At this stage of task development, it appears prudent to exclude subjects that do not perform the task as intended. For example, data from human subjects with extremely fast or slow reaction times throughout the task may be eliminated because such reaction times suggest that the subject was not appropriately attending to the task. Similarly, rats are excluded if an insufficient number of responses for either stimulus is emitted during the test session (e.g., many omissions or responses only on one lever for both tones), thus preventing the reinforcement of correct rich vs. lean responses at a 3:1 ratio. Although inherent biases toward one of the two stimuli/levers before the actual test are sometimes present in rats, such biases are controlled for by including a covariate during analyses of test data that is a measure of the degree of bias toward one lever/stimulus during the most recent training session when stimuli are equally reinforced. Furthermore, although there is good test-retest reliability in humans who perform the task (Pizzagalli et al., 2005; Santesso et al., 2009; Ragland et al., 2008), it has not been determined yet whether there is good test-retest reliability in rats, although so far retesting of the subjects has been possible. Such a feature of the task is very desirable because repeated-measures experimental designs are powerful and would allow one to derive more data from subjects that need to be trained for a couple of months before the test is implemented. Finally, there is a need to implement manipulations hypothesized to induce neuropathology implicated in schizophrenia and/or other psychiatric disorders characterized by reward and motivational deficits and assess the effects of such manipulations on the measures provided by the response bias probabilistic reward task.
Although no functional imaging investigations have been conducted in subjects performing the response bias probabilistic reward task, studies in humans and experimental animals have reported brain areas that are activated by the presentation of conditioned stimuli, during preparation to approach appetitive stimuli, and during response selection based on expected outcomes; these areas are highly likely to be involved in performance of this task. Specifically, nonhuman primate and rat studies indicated that the orbitofrontal cortex (OFC; Roesch and Olson, 2004; Feierstein et al., 2006) and dorsolateral prefrontal cortex (dlPFC; Dias et al., 1996; Kobayashi et al., 2002; Tsujimoto and Sawaguchi, 2004; Wallis and Miller, 2003) contribute to making and evaluating goal-directed decisions, even in situations in which the available information is imperfect, as is the case in the response bias probabilistic reward task. The OFC has also been implicated in such tasks (Kepecs et al., 2008), and electroencephalographic studies have also indicated that activation within the left dorsolateral and ventromedial prefrontal regions is associated with appetitively motivated behaviors (Pizzagalli et al., 2011; Vrieze et al., 2013b).
Finally, genetic factors also appear to modulate the effects of manipulations on the development of response bias in this task. In human subjects, an acute mild stress manipulation induced deficits in the development of the response bias, which were more pronounced in subjects who expressed homozygosity for the A allele at the rs12938031 position of the corticotropin-releasing hormone receptor type 1 gene (CRHR1; Bogdan et al., 2011). Similarly, self-reported perceived stress is associated with decreased response bias in human subjects who carry the S or LG allele of the 5-HTTLPR/rs25531 serotonin transporter gene (Nikolova et al., 2012). Consistent with these results, self-perceived professional success among psychiatrically healthy subjects predicted increased response bias in carriers of the Val/Val allele of the COMT/rs4680 genotype that is associated with increased phasic dopamine signaling (Goetz et al., 2013). In rat subjects, a psychosocial stressor involving social defeat resulted in decreased response bias in the task (Der-Avakian, Pizzagalli, and Markou, unpublished observations).
In summary, the response bias probabilistic reward task has been established in the rat in a way that is almost identical to the human version of the task to enhance the translational value of both human and rat tasks. The validation of this task has begun and has so far provided consistent data between humans and rats. The unique aspect of this task is that it assesses whether reinforcement contingencies are integrated in a way that guides the future pursuit of rewards.
Tasks that Assess the Construct of Motivation
It is important to note that all tasks selected to assess the construct of motivation involve reinforcement learning during their initial acquisition. However, in designing and using these tasks, investigators attempt to derive measures that are most relevant to the construct of motivation unconfounded by learning deficits. Ways of achieving this goal include using relatively simple learning tasks in which learning deficits are unlikely to reveal themselves and/or by assessing motivation after asymptotic performance has been achieved by the subjects in the task.
There are many component aspects of motivation that are essential for goal-directed action. Motivation is influenced by an animal’s representation of the value of future rewards, its representation of the cost of obtaining them, and the computation that discounts the value of the reward by the effort and delay costs of working to obtain it. Patients may be affected by any or all of these aspects of motivation. Consequently, it is important to analyze the component processes in patients and then, where relevant, in animal models.
Schizophrenia patients have a deficit in the capacity to represent the value of future positive outcomes based on their past experiences (Barch and Dowd, 2010). Given the intuitive relationship between hedonia and motivation, it is surprising that several research groups have found a dissociation between hedonic reaction to rewarding stimuli and motivated behavior in patients with schizophrenia (Barch and Dowd, 2010; Cohen and Minor, 2010; Gard et al., 2007; Gold et al., 2008; Heerey and Gold, 2007; Kring et al., 2011). The majority of the current literature shows intact hedonic reaction to rewarding stimuli but impaired incentive motivation in schizophrenia patients, although a recent study reported a difference in the reaction to emotional stimuli in patients and controls (Strauss and Herbener, 2011). Specifically, patients have a deficit in representing the value of future outcomes (Gard et al., 2007; Gold et al., 2008) and are less likely than controls to respond preferentially for highly rewarding alternatives over less rewarding ones (Kasanova et al., 2011). Thus, at least one contributor to lowered motivation in patients may be an impaired ability to represent and update the value of future outcomes.
Outcome Devaluation and Contingency Degradation Tasks
One method for assessing sensitivity to the value of future outcomes is the outcome devaluation procedure. This protocol is based on the idea that post-conditioning changes in the value of an outcome should lead subjects to alter the behavior that produces that outcome. For example, if an animal learns that a bar press produces pellets and then the pellets are devalued by being paired with an illness-inducing agent, then the animal becomes less likely to make a response that had previously produced pellets if its behavior is guided by the current value of that outcome (Yin et al., 2005; Corbit and Balleine, 2005). A typical experiment of this sort teaches the subjects to make two different responses to produce two distinctive outcomes. For example, in one session each day, the left bar might produce pellets, and in a second daily session the right bar might produce sucrose. After instrumental training, one of the outcomes is devalued. In this example, half the animals might be satiated on pellets and half on sucrose. After satiation, the animals are given a choice test in which both levers are present but neither produces any rewards. To the extent that behavior is guided by current outcome values, working on the lever associated with the devalued outcome should be suppressed relative to the non-devalued lever. Such results have been reported in rats (Balleine and Dickinson, 1992), mice (Crombag et al., 2010; Hilario et al., 2007), monkeys (West et al., 2011), and humans (Klossek et al., 2008; Valentin et al., 2007). Thus, the outcome devaluation task is suggested as an excellent translational task for assessing the capacity to update the value of future reinforcers.
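The logic of the devaluation test can be captured in a few lines: a goal-directed (model-based) agent reads the value of each action from the current value of its associated outcome, so devaluing one outcome immediately suppresses responding on the corresponding lever without any new instrumental learning. The action-outcome mapping and the numerical values below are illustrative assumptions, not data from the cited experiments.

```python
# Learned action-outcome contingencies from instrumental training
# (hypothetical mapping for illustration).
ACTION_OUTCOME = {"left_lever": "pellets", "right_lever": "sucrose"}

def action_values(outcome_values):
    """Goal-directed readout: each action is worth whatever its
    associated outcome is currently worth."""
    return {action: outcome_values[outcome]
            for action, outcome in ACTION_OUTCOME.items()}

# Before devaluation, both outcomes are equally valued.
before = action_values({"pellets": 1.0, "sucrose": 1.0})
# Satiation on pellets devalues that outcome; the value of the lever
# that earned pellets drops without further instrumental training.
after = action_values({"pellets": 0.1, "sucrose": 1.0})
```

A purely habitual (model-free) agent, by contrast, would carry cached action values forward unchanged into the extinction test, which is why devaluation sensitivity is taken as a marker of goal-directed control.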
While the neural underpinnings of impaired anticipatory motivation in schizophrenia are not precisely known, there appears to be homologous circuitry in humans and rodents that regulates these processes. One interesting convergence is that there appear to be distinct neurobiological substrates underlying anticipatory motivation and hedonia (Kringelbach and Berridge, 2010). For example, dopamine signaling has been shown to be critical for anticipatory motivation (Balci et al., 2010; Cagniard et al., 2006; Salamone et al., 2007) but is relatively uninvolved in hedonic reaction to reward (Berridge et al., 2010; Peciña et al., 2003). Thus, to the extent that altered dopamine signaling is an important part of the pathophysiology of schizophrenia, one might anticipate changed incentive motivation but relatively intact hedonic reactions as described above (see Ward et al., 2012).
Studies with human subjects have documented a similar sensitivity of behavior to the current value of an outcome as that found in studies with nonhuman animals. The devaluation of primary rewards (Hogarth et al., 2012; Valentin et al., 2007) as well as secondary rewards, such as stimuli associated with money (de Wit et al., 2012; Liljeholm et al., 2012), results in the differential selection of actions associated with outcomes that have not been devalued. These procedures have been adapted for use in children (Klossek et al., 2011) and in psychiatric populations (Gillan et al., 2011). In addition, there is a remarkable similarity in the brain structures and networks that are involved in the selection of goal-directed action (Balleine and O’Doherty, 2010; de Wit et al., 2012; Valentin et al., 2007).
Effort-based Tasks
As mentioned above, the activation of goal-directed action requires a computation of whether the value of a particular goal is worth the effort expended to obtain that outcome. The observation that motivated behaviors have an energetic or activational component is a consistent feature of the literature in psychology, psychiatry, and neurology over the last several decades. Motivational stimuli not only serve to direct actions to particular outcomes; they also activate or invigorate behavior. These activational aspects of motivation are highly adaptive because organisms must overcome work-related constraints or obstacles to gain access to significant stimuli, either by foraging over large distances in the wild or lever pressing or climbing barriers in a laboratory. The vigor or persistence of work output in stimulus-seeking behavior is widely seen as a fundamental aspect of motivation. Furthermore, organisms must make effort-related decisions based on cost/benefit analyses, allocating behavioral resources into goal-directed behaviors based on differential assessments of motivational value and response costs (Salamone et al., 2007, 2009, 2012). These activational aspects of motivation are widely studied in behavioral neuroscience, and they are clinically relevant also. As discussed above, clinicians have come to emphasize the importance of motivational symptoms related to effort expenditure, such as psychomotor slowing, apathy, and anergia in major depression, fatigue in parkinsonism and multiple sclerosis, and avolition in schizophrenia (Demyttenaere et al., 2005; Oakeshott et al., 2012; Salamone et al., 2006, 2010; Ward et al., 2011; Treadway and Zald, 2011). Moreover, it has been argued that many people with psychopathologies have fundamental deficits in reward seeking, exertion of effort, and effort-related decision making that do not simply depend on any problems that they may have with experiencing pleasure (Treadway and Zald, 2011). 
For these reasons, the establishment of effort-based behavioral tasks in animals can be a critical component of the development of preclinical models of motivational symptoms that are relevant for human psychopathology.
A number of behavioral tasks have been used to assess effort-related motivational processes in animals. One such procedure is the progressive-ratio schedule (i.e., a schedule in which the number of lever presses required per reinforcer gradually increases). As the ratio requirement increases, the animals reach a point at which they cease responding, which is generally known as a breakpoint. Although changes in progressive-ratio breakpoints are sometimes interpreted only in terms of “reward value,” progressive-ratio breakpoints clearly reflect more than just alterations in the appetitive motivational properties of a reinforcing stimulus. For example, changing the kinetic requirements of the instrumental response by increasing the height of the lever decreases progressive-ratio breakpoints (Skjoldager et al., 1993; Schmelzeis and Mittleman, 1996). Thus, despite the fact that some researchers have maintained that the breakpoint provides a direct measure of the appetitive motivational characteristics of a stimulus, it is, as discussed in a classic review by Stewart (1974), most directly a measure of how much work the organism will do to obtain access to that stimulus. Fundamentally, a progressive-ratio breakpoint is an outcome that results from effort-related decision-making processes. The organism is making a cost/benefit decision about whether or not to respond, based partly on the value of the reinforcer but also on the work-related response costs and time constraints imposed by the ratio schedule (Salamone, 2006). Progressive-ratio responding has been used to assess motivational impairments related to schizophrenia; striatal-specific increases in dopamine D2 receptor expression in mice led to decreases in progressive-ratio responding for food reinforcement that were generally unrelated to changes in appetite and other nonspecific effects (Drew et al., 2007; Simpson et al., 2011). 
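The cost/benefit logic behind a breakpoint can be illustrated with a minimal sketch, assuming a hypothetical agent that keeps responding only while the reinforcer’s value covers the cumulative response cost of the current ratio (the function, parameters, and numbers here are illustrative, not taken from the studies cited above):

```python
def pr_breakpoint(reward_value, cost_per_press, ratios):
    """Last ratio completed by an agent that responds only while the
    reinforcer's value covers the total response cost of the ratio."""
    last_completed = 0
    for ratio in ratios:
        if cost_per_press * ratio <= reward_value:
            last_completed = ratio
        else:
            break  # ceases responding: this is the breakpoint
    return last_completed

# Hypothetical progressive-ratio schedule: requirement doubles per reinforcer
schedule = [2 ** i for i in range(10)]  # 1, 2, 4, ..., 512
baseline = pr_breakpoint(reward_value=10.0, cost_per_press=0.25, ratios=schedule)
raised_lever = pr_breakpoint(reward_value=10.0, cost_per_press=0.5, ratios=schedule)
```

In this toy model, raising `cost_per_press` (as when lever height is increased) lowers the breakpoint even though `reward_value` is unchanged, capturing the point that breakpoints index how much work the organism will do, not reward value alone.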
Recent studies with a progressive-ratio/chow feeding choice task have shown that the dopamine receptor antagonist haloperidol suppresses food-reinforced progressive ratio responding and lowers breakpoints but nevertheless leaves the consumption of a concurrently available but less preferred food source intact (Randall et al., 2012). The actions of haloperidol on this task differed markedly from those produced by reinforcer devaluation (pre-feeding) and an appetite suppressant drug (the cannabinoid CB1 inverse agonist AM251; Randall et al., 2012). Moreover, high levels of progressive-ratio output were associated with increased expression of phosphorylated DARPP-32 (Thr34) in the nucleus accumbens core (Randall et al., 2012).
Another way of controlling work requirements in an operant schedule is to vary the fixed-ratio (FR) requirement across different schedules. In untreated animals, the overall relationship between ratio size (i.e., the number of lever presses required per reinforcer) and response rate is inverted-U-shaped. Up to a point, as the ratio requirement increases, animals adjust to this challenge by increasing response output. However, if the ratio requirement is high enough (i.e., if the cost is too high), then the animal reaches a point at which the requirement of additional responses actually suppresses responding. This pattern of results is known as ratio strain and is analogous to a breakpoint on the progressive-ratio schedule. For example, Aberman and Salamone (1999) studied a range of ratio schedules (FR1, 4, 16, and 64) to assess the effects of nucleus accumbens dopamine depletions. FR1 performance was unaffected by dopamine depletion, and FR4 responding was only transiently and mildly suppressed; however, responding on the schedules with large ratio requirements (i.e., FR16 and FR64) was severely impaired. In fact, dopamine-depleted rats that lever pressed on the FR64 schedule showed significantly fewer responses than those performing on the FR16 schedule. In behavioral economic terms, this pattern can be described as reflecting a change in the elasticity of the demand for food reinforcement (Salamone et al., 2009).
One of the drawbacks of using ratio schedules such as FR or progressive-ratio is that, as the ratio level increases, there is a corresponding increase in reinforcement intermittency because of the lengthening time required to complete the ratio. One way of controlling for this confound is to use tandem variable interval-fixed ratio (VI FR) schedules. Basically, these are interval schedules that have a ratio requirement attached to the interval. Comparing the effect of a manipulation on performance on a VI schedule with a FR1 attached versus the effects of the same manipulation on a comparable VI schedule with a higher ratio attached (FR5 or FR10) allows one to control for the time interval elapsed and independently vary the ratio requirement. These schedules have been used to demonstrate that the effects of nucleus accumbens dopamine depletions or adenosine receptor antagonism are greater with increasing ratios, even when one controls for the time interval requirement (Correa et al., 2002; Mingote et al., 2005, 2008). Another way of controlling for the influence of time intervals is to compare the effects of a manipulation on progressive-ratio responding with effects on a progressive-interval schedule (e.g., Wakabayashi et al., 2004; Ward et al., 2011).
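The tandem VI FR logic can be sketched with a small deterministic simulator (illustrative code, not an actual experimental control program; fixed interval lengths stand in for the variable-interval component for clarity):

```python
def reinforcers_earned(press_times, intervals, fr):
    """Tandem (VI FR) schedule: after each interval elapses, the next
    `fr` presses produce one reinforcer.  Fixed `intervals` stand in
    for the variable-interval component; times are in seconds."""
    earned = 0
    i = 0                       # index of the current interval
    armed_at = intervals[0]     # time at which the interval has elapsed
    needed = fr                 # presses still required after arming
    for t in press_times:
        if i >= len(intervals):
            break               # programmed intervals exhausted
        if t >= armed_at:
            needed -= 1
            if needed == 0:
                earned += 1
                i += 1
                if i < len(intervals):
                    armed_at = t + intervals[i]  # next interval starts now
                needed = fr
    return earned

# One press per second for 20 s; same intervals, different attached ratios
presses = list(range(1, 21))
low_ratio = reinforcers_earned(presses, [5, 5, 5], fr=2)   # 3 reinforcers
high_ratio = reinforcers_earned(presses, [5, 5, 5], fr=5)  # 2 reinforcers
```

Holding the press rate and interval lengths constant while raising the attached FR reduces the reinforcers earned, which is the sense in which these schedules vary the ratio requirement independently of the time interval requirement.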
As noted above, animals must continually make effort-related choices that involve assessments of work-related response costs and the potential benefits of responding. Tests of effort-related choice behavior (or effort-related decision making) generally involve tasks in which animals have choices between high effort/high reward and low effort/low reward options. There are several ways of assessing effort-related choice behavior in rodents. One of the procedures that has been used to assess effort-related choice behavior is a concurrent lever pressing/chow feeding task, which offers rodents the option of either lever pressing to obtain a relatively preferred food (e.g., high carbohydrate pellets; usually obtained by lever pressing on an FR5 schedule), or approaching and consuming a less preferred food (lab chow) that is concurrently available in the chamber (Salamone et al., 1991). Extensively trained rats under baseline or control conditions typically get most of their food by lever pressing and consume only small quantities of chow. Several dopamine receptor antagonists with different patterns of selectivity for the various dopamine receptors, including cis-flupenthixol, haloperidol, raclopride, eticlopride, SCH 23390, SKF83566, and ecopipam, all decreased lever pressing for food but substantially increased the intake of the concurrently available chow (Salamone et al., 1991, 1996, 2002; Cousins et al., 1994; Koch et al., 2000; Sink et al., 2008; Worden et al., 2009). Moreover, dopamine transporter (DAT) knockdown mice show the opposite pattern; they display increases in lever pressing and decreases in chow intake (Cagniard et al., 2006). The use of this task for assessing effort-related choice behavior has been validated in several studies. 
For example, the low dose of haloperidol that produced the shift from lever pressing to chow intake (0.1 mg/kg) did not affect total food intake or alter preference between these two specific foods in free-feeding choice tests (Salamone et al., 1991). Although dopamine receptor antagonists have been shown to reduce FR5 lever pressing and increase chow intake, appetite suppressants from different classes, including amphetamines (Cousins et al., 1994), fenfluramine (Salamone et al., 2002), and cannabinoid CB1 receptor antagonists (Sink et al., 2008), did not increase chow intake at doses that suppressed lever pressing. Similarly, pre-feeding to reduce food motivation suppressed both lever pressing and chow intake (Salamone et al., 1991). The attachment of higher-ratio requirements (up to FR20) in the absence of any drug treatments caused rats to shift from lever pressing to chow intake (Salamone et al., 1997), indicating that this task is sensitive to work load. The concurrent FR/chow intake task also has been used to assess a model of motivational impairments in schizophrenia. Increases in dopamine D2 receptor expression in mice were shown to decrease lever pressing and decrease chow intake (Ward et al., 2012).
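The dissociation that validates this task — an effort-related shift toward the free alternative versus general suppression of food motivation — can be expressed as a toy cost/benefit rule (all values, costs, and the threshold are hypothetical):

```python
def choice_pattern(pellet_value, chow_value, effort_cost, threshold=0.5):
    """Toy cost/benefit rule for the concurrent FR5/chow task: pursue
    whichever option has the higher net value, or neither if both fall
    below a minimal motivation threshold."""
    lever_net = pellet_value - effort_cost   # preferred food minus work cost
    if max(lever_net, chow_value) < threshold:
        return "neither"                     # e.g., pre-feeding suppresses both
    return "lever" if lever_net > chow_value else "chow"

# Baseline: preferred pellets are worth the FR5 work
assert choice_pattern(pellet_value=5, chow_value=2, effort_cost=1) == "lever"
# Raised effort cost (one way to model dopamine antagonism): shift to free chow
assert choice_pattern(pellet_value=5, chow_value=2, effort_cost=4) == "chow"
# Devaluing both foods (modeling pre-feeding): both behaviors suppressed
assert choice_pattern(pellet_value=0.4, chow_value=0.3, effort_cost=0.2) == "neither"
```

The two manipulations produce different signatures under this rule — a shift between options versus abandonment of both — mirroring the behavioral dissociation between dopamine antagonists and appetite suppressants described above.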
A T-maze barrier choice procedure has been developed to assess the effects of drug or lesion manipulations on effort-related decision making in rodents (Salamone et al., 1994). In this task, the two choice arms of the maze can have different reinforcement densities (e.g., 4 vs. 2 food pellets, or 4 vs. 0), and under some conditions a large barrier is placed in the arm with the higher density of food reinforcement to present the animal with an effort-related challenge. Administration of the dopamine receptor antagonist haloperidol and nucleus accumbens dopamine depletions dramatically affect choice behavior when the high-density arm (4 pellets) has the barrier in position, and the arm without the barrier contains an alternative food source (2 pellets). Dopamine depletions or antagonism decrease the choice of the high density arm and increase the choice of the low density arm (Salamone et al., 1994; Cousins et al., 1996; Denk et al., 2005; Mott et al., 2009). The results of these T-maze studies in rodents, together with the findings from the operant concurrent choice studies reviewed above, indicate that low doses of dopamine receptor antagonists and nucleus accumbens dopamine depletions cause animals to reallocate their instrumental response selection based on the response requirements of the task and select lower effort alternatives for obtaining reinforcers. Like the operant concurrent choice task, the T-maze task for measuring effort-based choice behavior also has undergone considerable behavioral validation and evaluation (Salamone et al., 1994; Cousins et al., 1996; Van den Bos et al., 2006). Although rats treated with dopamine receptor antagonists or nucleus accumbens dopamine depletions are slower than those tested under control conditions, it does not appear as though the choice deficit is secondary to a latency deficit (Salamone et al., 1994; Bardgett et al., 2009). 
For example, although the increases in latency induced by nucleus accumbens dopamine depletions show rapid post-surgical recovery, the alteration in choice is much more persistent (Salamone et al., 1994). In addition, drug-induced effects on latency and arm choice in the T-maze are pharmacologically dissociable (Bardgett et al., 2009). When no barrier is placed in the arm with the high reinforcement density, rats mostly choose that arm, and neither haloperidol nor nucleus accumbens dopamine depletion alters their response choice (Salamone et al., 1994). When the arm with the barrier contained four pellets, but the other arm contained no pellets, rats with nucleus accumbens dopamine depletions were very slow but still managed to choose the high-density arm, climb the barrier, and consume the pellets (Cousins et al., 1996). In a recent T-maze choice study with mice, it was confirmed that haloperidol reduced the choice of the arm with the barrier, and it also was demonstrated that haloperidol had no effect on choice when both arms had a barrier in place (Pardo et al., 2012). Thus, dopaminergic manipulations do not alter the preference for the high density of reinforcement over the lower density and do not affect discrimination or memory processes related to arm preference.
Recent experiments have used effort discounting procedures to study the effects of dopaminergic manipulations. A T-maze effort discounting task was developed by Bardgett et al. (2009). With this task, the amount of food in the high-density arm of the maze was diminished in each trial in which the rats selected that arm (i.e., an “adjusting-amount” discounting variant of the T-maze procedure that allows for the determination of an indifference point for each rat). Administration of either the dopamine D1 receptor family antagonist SCH 23390 or the D2 receptor family antagonist haloperidol altered effort discounting, making it more likely that rats would choose the arm with the smaller reward. Administration of amphetamine blocked the effects of SCH 23390 and haloperidol and also biased rats toward choosing the high-reward/high-cost arm. Floresco et al. (2008) studied the effects of dopaminergic and glutamatergic drugs on both effort (i.e., ratio) and delay discounting using operant procedures. The dopamine receptor antagonist haloperidol altered effort discounting, even when the effects of time delay were controlled for (Floresco et al., 2008).
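The adjusting-amount logic can be sketched as follows, assuming a hypothetical agent that values an arm as its reward amount minus a scalar effort cost (all numbers are illustrative, not from Bardgett et al.):

```python
def indifference_point(high_start, low_amount, effort_cost, step=0.5, max_trials=50):
    """Adjusting-amount T-maze sketch: the high-effort arm's payoff
    shrinks by `step` each time it is chosen; the amount at which the
    agent switches to the low-effort arm is the indifference point."""
    amount = high_start
    for _ in range(max_trials):
        if amount - effort_cost > low_amount:  # high-effort arm still worth it
            amount -= step                     # payoff adjusts downward
        else:
            break                              # agent switches: indifference reached
    return amount

baseline = indifference_point(high_start=4.0, low_amount=2.0, effort_cost=1.0)
drugged = indifference_point(high_start=4.0, low_amount=2.0, effort_cost=1.5)
```

A manipulation modeled as raising the subjective effort cost yields a higher indifference point: the barrier arm must hold more food before it is chosen, which corresponds to the drug-induced bias toward the smaller-reward arm described above.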
Although much of the early work using the FR5/chow feeding and T-maze barrier choice tasks has focused on dopaminergic manipulations, it is clear that nucleus accumbens dopamine is a component of a broader circuitry involving multiple brain areas and transmitters. Several different behavioral tasks have been used by multiple laboratories to characterize the involvement of several brain areas, including the basolateral amygdala, prefrontal/anterior cingulate cortex, nucleus accumbens, and ventral pallidum, in the exertion of effort and effort-related decision making (Salamone et al., 1994; Walton et al., 2006; Denk et al., 2005; Schweimer and Hauber, 2006; Van den Bos et al., 2006; Floresco and Ghods-Sharifi, 2007; Mingote et al., 2008; Hauber and Sommer, 2009; Mott et al., 2009). Moreover, the effort-related effects of D2 receptor antagonism on the FR5/chow feeding and T-maze tasks are attenuated by co-administration of adenosine A2A receptor antagonists (Farrar et al., 2007; Mott et al., 2009; Salamone et al., 2009, 2012; Pardo et al., 2012) and by adenosine A2A receptor knockout (Pardo et al., 2012).
Recent studies have focused on effort-related decision making in humans (e.g., Croxson et al., 2009; Kurniawan et al., 2011). Treadway et al. (2009) developed a human version of the rodent tasks described above (i.e., the Effort-Expenditure for Rewards Task [EEfRT]), in which people must choose between a high-effort/high-monetary-reward option vs. a low-effort/low-reward option. Using this task, it was shown that individual differences in the exertion of effort in humans were associated with an imaging marker of striatal dopamine transmission (Treadway et al., 2012b). In addition, amphetamine enhanced the willingness of people to exert effort to obtain rewards, particularly when reward probability was low, but did not alter the effects of reward magnitude on the willingness to exert effort (Wardle et al., 2011). Furthermore, decreased selection of high-effort/high-reward options was seen in patients with major depression (Treadway et al., 2012a) and also in schizophrenia patients, particularly those with substantial negative symptoms (Gold et al., 2013).
In their recent review of the literature on the neural bases of avolition and anhedonia, Barch and Dowd (2010) described a network involving the basal ganglia, OFC, anterior cingulate cortex, and dlPFC that underlies the learning and performance mechanisms of goal-directed action and is the likely substrate for clinical avolition/anhedonia. There are homologous regions in animals that mediate similar behavioral functions. The basal ganglia and dopamine systems play a key role in reinforcement learning and the modulation of incentive motivation (Balleine and O’Doherty, 2010; Smith et al., 2011; Schultz, 2007) and the willingness to expend effort to obtain a goal (Salamone and Correa, 2009). The interplay between the medial and lateral striatum is important in determining the extent of the influence of the current value of goals over action (Balleine and O’Doherty, 2010). The OFC plays a key role in updating and maintaining value representations (Schoenbaum et al., 2009; Kringelbach and Berridge, 2010; O’Doherty, 2007), as does the amygdala (Balleine and Killcross, 2006; Morrison and Salzman, 2010; Savage and Ramos, 2009). Furthermore, the OFC and anterior cingulate cortex appear to be important in the computation of cost/benefit values that underlie the initiation and maintenance of action (Rushworth et al., 2011). In addition, the PFC (Grabenhorst and Rolls, 2011; Rushworth et al., 2011; Savine and Braver, 2010; Wunderlich et al., 2011) appears to be involved in the construction, selection, and execution of specific action plans based on value representations.
ISSUES RELATED TO DRUG DISCOVERY
Drug discovery efforts need to do one thing successfully: predict how pharmacological agents will affect the quality of life of patients. Achieving this goal in the context of learning or motivational deficits in schizophrenia is no doubt challenging, but it is nevertheless an increasingly attractive and viable proposition. After a prolonged period during which positive symptom management dominated both clinical and discovery fields, the negative and cognitive aspects of schizophrenia, including constructs of reward and motivation, have become more of a focus for the pharmaceutical industry. One reason for this focus is that without improvements in motivational processes, even compliance with standard-of-care antipsychotic regimens for the control of positive symptoms remains a major obstacle to the maintenance of efficacy. A second reason has been the recent drive for strategic reemphasis of preclinical biological investigations in schizophrenia, as exemplified by the National Institute of Mental Health-driven Research Domain Criteria (RDoC) initiative (Insel et al., 2010). The proposal here is that greater insights will be gained into the nature of psychiatric disorders by focusing more on the dimensions of underlying neurobiology than on the expressed symptoms themselves. To this end, the RDoC framework includes the domain of “positive valence,” which clearly overlaps with the conceptual space addressed by the present review article. It is therefore timely to describe and appraise preclinical behavioral tests that assess reinforcement learning and motivational constructs for the purposes of drug discovery.
Standard-of-care antipsychotics are often “double-edged” when it comes to effects on learning and motivational processes. Continued use of many currently available antipsychotic medications can result in the development of severe side effects, including sedation, extrapyramidal side effects, and antipsychotic-induced parkinsonism, which not only can become intolerable but also contribute to the development of motivational deficits as a secondary symptom or an aggravation of primary negative symptoms. Alternatively, it is also possible that a successful remediation of psychotic symptoms can positively impact learning and motivation processes in some patients. In this sense, a preclinical discovery scientist might be just as interested in the ability of a compound to exacerbate existing learning and motivational deficits as in its ability to treat them. As a result, it would be preferable for behavioral tests to be free of floor/ceiling effects that might prohibit the detection of drug effects in one direction or another. The notion of the existence of primary negative symptoms, central to the schizophrenia disease process itself, and secondary negative symptoms, which represent comorbid or iatrogenic sequelae, complicates expectations of what an animal model of learning and motivation deficits should and can deliver and how it might translate into clinical experience. As researchers begin to assess the validity of animal models of schizophrenia in some of the tasks described in this article, it should be anticipated that drug effects on reinforcement learning and motivation constructs will be complex, multivariate, and potentially bidirectional. To this end, it is advantageous that at least the autoshaping tasks described earlier can potentially detect both increases and decreases in performance.
Practically, preclinical drug discovery efforts are constrained by financial, temporal, and scientific pressures (Brunner et al., 2011). As a result, particular types of tests tend to be favored, such as (i) those that maximize the efficiency of animal usage (i.e., more than one study can be conducted in the same animals where drug effects can be detected with manageable sample sizes), (ii) those that maximize confidence in the findings (i.e., parameters are measured objectively and/or blindly, drug and animal model effects are as consistent as possible over repeated studies, and procedures are amenable to incorporation of biomarkers), and (iii) those that minimize the time taken to generate a decision-making result. In reality, different companies and different scientists will have different levels of tolerance here, and difficult compromises often have to be made between all of these factors. However, if a test fares particularly badly on any of these criteria, then it is not likely to be broadly adopted by the drug discovery community, which needs to reach general conclusions about predictive validity for the human population.
The nominated tasks in this article represent theoretically viable starting points for “discovery-friendly” experimentation, although practical aspects of each could be improved to better fit drug discovery efforts. For example, all of the tasks described here suffer from a relative lack of published studies that describe the effects of systemically administered compounds; thus, it is difficult to draw firm conclusions about the robustness of the effects of pharmacological manipulations. In addition, very little work has assessed “classic” animal models of schizophrenia in this context or whether novel manipulations that impair reinforcement learning and motivation will be required. Reinforcement learning paradigms are normally “one-chance” tests in animals because of the limited stimulus sets available in operant testing chambers and would therefore represent a high resource demand in a drug discovery environment. Touchscreens may provide a clear advantage here because an almost limitless set of visual stimulus shapes can be generated; thus, it may be possible to repeatedly assess reinforcement learning in the same animal with novel stimulus pairs. Where the training time of the tasks is lengthy, such as the touchscreen probabilistic response learning assay, it would be advantageous if asymptotic response performance also provided a meaningful index of function because asymptotic levels of performance would allow for repeated drug studies within the same batch of animals. The probabilistic response learning task and response bias probabilistic task are complex tasks that can involve changing contingencies over several sessions, which can complicate the timing of drug administrations, particularly in cases where the assessment of the effects of chronic compound administration is of interest. Acute drug studies, a common approach in drug discovery settings, are made much more straightforward if the complete dataset can be collected within one fairly short test session.
Many of the operant chamber effort-based tasks, such as progressive-ratio or concurrent lever pressing/chow feeding, actually have this advantage of short, rapid test sessions, although the temporal aspects of such tests might become problematic with compounds with rapid pharmacokinetics. Furthermore, in a drug discovery setting, hand-run tasks, such as maze-based barrier or discounting tasks, would always be a last resort, to be used only if equivalent data cannot be obtained in a more automated manner. Finally and perhaps most importantly, although all of the experimental animal tasks described herein have analogous tasks in humans, these tasks vary in the degree to which they have been validated as being analogous across species and providing comparable data after similar manipulations. The latter is an area where research efforts need to be focused.
In conclusion, basic behavioral neuroscience research has provided several tasks that assess specific reinforcement learning and motivational constructs with potential relevance to the negative symptoms of schizophrenia. There is also a rich literature on the neurobiology of reinforcement learning and motivation. However, because the industry has only recently focused on the negative symptoms of schizophrenia, these tasks have not yet been optimized for drug discovery efforts in this field. Furthermore, the potential usefulness of these tasks in this arena remains to be shown. Hopefully, the rich literature and the long-term interest and work of behavioral neuroscientists in this field will facilitate the rapid adoption of these tests in drug discovery programs for the treatment of the reinforcement learning and motivational deficits that characterize schizophrenia patients.
Conclusions
This review article focused on the description of some appropriate behavioral tasks with which to measure reinforcement learning and motivational constructs preclinically. It should be emphasized, however, that using these tasks to assess drug effects in normal healthy animals will most likely be inadequate for the prediction of therapeutic utility in clinical disease states. Animal models of schizophrenia, or at least disruptor models of the constructs in question, will have to be employed. This issue raises an interesting question about the specificity of effects and commonality of endpoints between such models. Deficits in learning and motivational processes in isolation do not define schizophrenia, nor are they unique to this syndrome. The spectrum of negative symptoms in schizophrenia comprises reduced affective responses, poverty of speech, and social withdrawal, among other effects. This constellation of symptoms may reflect potentially distinct neurobiological domains, including affective flattening, alogia, asociality, anhedonia, and avolition, although the exact groupings and symptoms included in these domains depend on the scales and analyses utilized. To take the example of the domain of apathy, which is considered one of the negative symptoms of schizophrenia, it is also prevalent in dysthymia, active and remitted depression, stroke, Parkinson’s disease, progressive supranuclear palsy, Huntington’s disease, and dementias, such as Alzheimer’s disease, vascular dementia, and frontotemporal dementia. Nevertheless, any means by which a reinforcement learning or motivational deficit can be induced (e.g., via a manipulation that induces a depression-like state in animals) may still inform about the behavioral construct as it relates to schizophrenia. This approach opens great opportunities for preclinical researchers to utilize disparate models in parallel to demonstrate the convergent validity of drug effects. 
However, such an approach would clearly depend on the establishment of strong biomarker strategies to determine etiological validity. That is, even when a notionally “unbiased” animal model is used, biomarkers would be needed to establish whether the reinforcement learning and motivational deficit state observed is mediated by a neurophysiological mechanism equivalent to that operating in schizophrenia itself. Thus, it would be valuable if all of the tasks nominated in this article were appraised not only in standard animal models of schizophrenia but also in a broader context of other types of models that impact reward and motivational processes in a relevant manner.
All of the animal tasks described in this article have translational potential, and most of these tasks, such as the probabilistic learning task, the response bias probabilistic learning tasks, and some effort-related tasks, already have corresponding identical or at least analogous tasks in humans. In addition, there is a rich and extensive literature on the neurobiology of reinforcement learning and motivation because this has been an area of intense research focus by behavioral neuroscientists since the beginning of the 20th century. Thus, there is a great opportunity to build on this research tradition. The tasks described in this review have the potential to be further developed and validated in the context of the study of the negative symptoms of schizophrenia and as such (i) contribute to the investigation of the neuropathology that mediates the negative symptoms of schizophrenia and (ii) provide a means of discovering new therapeutic approaches for these symptoms that are currently not well treated.
Highlights.
Clarification of terms used in clinical and animal literature pertaining to negative symptoms of schizophrenia
Description of experimental animal tasks to assess reinforcement learning and motivation
Drug discovery perspective on issues relevant to the treatment of negative symptoms of schizophrenia
Acknowledgements
This work was supported by NIH grants R01MH62527 to AM, R01MH094966 and R01MH078023 to JS, and R01MH068073 to PB. TJB and ACM received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 115008, the resources of which are composed of an EFPIA in-kind contribution and a financial contribution from the European Union’s Seventh Framework Programme (FP7/2007-2013). The authors thank Mr. Michael Arends for outstanding editorial assistance, Drs. Andre Der-Avakian and Diego Pizzagalli for input on this manuscript, and Campden Instruments for Figure 1.
Financial Disclosures
AM has received contract research support from Bristol-Myers Squibb, Forest Laboratories, and Astra-Zeneca and honoraria/consulting fees from AbbVie Company during the past 3 years. JS has received research support from Merck Serono, Pfizer, and Roche in the last 3 years. TB consults for Campden Instruments. DB is an employee of PsychoGenics, Inc. GG is an employee of Eli Lilly & Co. Ltd. The remaining authors have no disclosures.
References
- Aberman JE, Salamone JD. Nucleus accumbens dopamine depletions make rats more sensitive to high ratio requirements but do not impair primary food reinforcement. Neuroscience. 1999;92:545–552. doi: 10.1016/s0306-4522(99)00004-4.
- Ahnallen CG, Liverant GI, Gregor KL, Kamholz BW, Levitt JJ, Gulliver SB, Pizzagalli DA, Koneru VK, Kaplan GB. The relationship between reward-based learning and nicotine dependence in smokers with schizophrenia. Psychiatry Res. 2012;196:9–14. doi: 10.1016/j.psychres.2011.09.011.
- American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th ed., text revision. American Psychiatric Association; Washington, DC: 2000.
- Andreasen NC. Scale for the Assessment of Negative Symptoms (SANS). University of Iowa; Iowa City: 1981.
- Andreasen NC. Negative symptoms in schizophrenia: definition and reliability. Arch. Gen. Psychiatry. 1982;39:784–788. doi: 10.1001/archpsyc.1982.04290070020005.
- Andreasen NC, Olsen S. Negative v positive schizophrenia: definition and validation. Arch. Gen. Psychiatry. 1982;39:789–794. doi: 10.1001/archpsyc.1982.04290070025006.
- Arana FS, Parkinson JA, Hinton E, Holland AJ, Owen AM, Roberts AC. Dissociable contributions of the human amygdala and orbitofrontal cortex to incentive motivation and goal selection. J. Neurosci. 2003;23:9632–9638. doi: 10.1523/JNEUROSCI.23-29-09632.2003.
- Balci F, Ludvig EA, Abner R, Zhuang X, Poon P, Brunner D. Motivational effects on interval timing in dopamine transporter (DAT) knockdown mice. Brain Res. 2010;1325:89–99. doi: 10.1016/j.brainres.2010.02.034.
- Balleine BW, O’Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology. 2010;35:48–69. doi: 10.1038/npp.2009.131.
- Balleine B, Dickinson A. Signalling and incentive processes in instrumental reinforcer devaluation. Q. J. Exp. Psychol. B Comp. Physiol. Psychol. 1992;45:285–301.
- Balleine BW, Killcross S. Parallel incentive processing: an integrated view of amygdala function. Trends Neurosci. 2006;29:272–279. doi: 10.1016/j.tins.2006.03.002.
- Barch DM, Dowd EC. Goal representations and motivational drive in schizophrenia: the role of prefrontal-striatal interactions. Schizophr. Bull. 2010;36:919–934. doi: 10.1093/schbul/sbq068.
- Bardgett ME, Depenbrock M, Downs N, Points M, Green L. Dopamine modulates effort-based decision making in rats. Behav. Neurosci. 2009;123:242–251. doi: 10.1037/a0014625.
- Bari A, Theobald DE, Caprioli D, Mar AC, Aidoo-Micah A, Dalley JW, Robbins TW. Serotonin modulates sensitivity to reward and negative feedback in a probabilistic reversal learning task in rats. Neuropsychopharmacology. 2010;35:1290–1301. doi: 10.1038/npp.2009.233.
- Barr RS, Pizzagalli DA, Culhane MA, Goff DC, Evins AE. A single dose of nicotine enhances reward responsiveness in nonsmokers: implications for development of dependence. Biol. Psychiatry. 2008;63:1061–1065. doi: 10.1016/j.biopsych.2007.09.015.
- Bechara A, Tranel D, Damasio H, Adolphs R, Rockland C, Damasio AR. Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science. 1995;269:1115–1118. doi: 10.1126/science.7652558.
- Bellebaum C, Polezzi D, Daum I. It is less than you expected: the feedback-related negativity reflects violations of reward magnitude expectations. Neuropsychologia. 2010;48:3343–3350. doi: 10.1016/j.neuropsychologia.2010.07.023. [DOI] [PubMed] [Google Scholar]
- Berridge KC, Ho CY, Richard JM, DiFeliceantonio AG. The tempted brain eats: pleasure and desire circuits in obesity and eating disorders. Brain Res. 2010;1350:43–64. doi: 10.1016/j.brainres.2010.04.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Berridge KC, Robinson TE. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res. Brain Res. Rev. 1998;28:309–369. doi: 10.1016/s0165-0173(98)00019-8. [DOI] [PubMed] [Google Scholar]
- Bogdan R, Santesso DL, Fagerness J, Perlis RH, Pizzagalli DA. Corticotropin-releasing hormone receptor type 1 (CRHR1) genetic variation and stress interact to influence reward learning. J. Neurosci. 2011;31:13246–13254. doi: 10.1523/JNEUROSCI.2661-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brébion G, Amador X, Smith M, Malaspina D, Sharif Z, Gorman JM. Depression, psychomotor retardation, negative symptoms, and memory in schizophrenia. Neuropsychiatry Neuropsychol. Behav. Neurol. 2000;13:177–183. [PubMed] [Google Scholar]
- Brown PL, Jenkins HM. Auto-shaping of the pigeon’s key-peck. J. Exp. Anal. Behav. 1968;11:1–8. doi: 10.1901/jeab.1968.11-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brunner D, Balci F, Ludvig EA. Comparative psychology and the grand challenge of drug discovery in psychiatry and neurodegeneration. Behav. Processes. 2011;89:187–195. doi: 10.1016/j.beproc.2011.10.011. [DOI] [PubMed] [Google Scholar]
- Brunswik E. Probability as a determiner of rat behavior. J. Exp. Psychol. 1939;25:175–197. [Google Scholar]
- Bryden DW, Johnson EE, Tobia SC, Kashtelyan V, Roesch MR. Attention for learning signals in anterior cingulate cortex. J. Neurosci. 2011;31:18266–18274. doi: 10.1523/JNEUROSCI.4715-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bussey TJ, Everitt BJ, Robbins TW. Dissociable effects of cingulate and medial frontal cortex lesions on stimulus-reward learning using a novel Pavlovian autoshaping procedure for the rat: implications for the neurobiology of emotion. Behav. Neurosci. 1997;111:908–919. doi: 10.1037//0735-7044.111.5.908. [DOI] [PubMed] [Google Scholar]
- Cagniard B, Balsam PD, Brunner D, Zhuang X. Mice with chronically elevated dopamine exhibit enhanced motivation, but not learning, for a food reward. Neuropsychopharmacology. 2006;31:1362–1370. doi: 10.1038/sj.npp.1300966. [DOI] [PubMed] [Google Scholar]
- Cardinal RN, Parkinson JA, Lachenal G, Halkerston KM, Rudarakanchana N, Hall J, Morrison CH, Howes SR, Robbins TW, Everitt BJ. Effects of selective excitotoxic lesions of the nucleus accumbens core, anterior cingulate cortex, and central nucleus of the amygdala on autoshaping performance in rats. Behav. Neurosci. 2002;116:553–567. doi: 10.1037//0735-7044.116.4.553. [DOI] [PubMed] [Google Scholar]
- Carter CS, Botvinick MM, Cohen JD. The contribution of the anterior cingulate cortex to executive processes in cognition. Rev. Neurosci. 1999;10:49–57. doi: 10.1515/revneuro.1999.10.1.49. [DOI] [PubMed] [Google Scholar]
- Carter CS, Braver TS, Barch DM, Botvinick MM, Noll D, Cohen JD. Anterior cingulate cortex, error detection, and the online monitoring of performance. Science. 1998;280:747–749. doi: 10.1126/science.280.5364.747. [DOI] [PubMed] [Google Scholar]
- Chamberlain SR, Müller U, Blackwell AD, Clark L, Robbins TW, Sahakian BJ. Neurochemical modulation of response inhibition and probabilistic learning in humans. Science. 2006;311:861–863. doi: 10.1126/science.1121218. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chaudhuri A, Behan PO. Fatigue in neurological disorders. Lancet. 2004;363:978–988. doi: 10.1016/S0140-6736(04)15794-2. [DOI] [PubMed] [Google Scholar]
- Chib VS, Rangel A, Shimojo S, O’Doherty JP. Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J. Neurosci. 2009;29:12315–12320. doi: 10.1523/JNEUROSCI.2575-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Christakou A, Robbins TW, Everitt BJ. Functional disconnection of a prefrontal cortical-dorsal striatal system disrupts choice reaction time performance: implications for attentional function. Behav. Neurosci. 2001;115:812–825. doi: 10.1037//0735-7044.115.4.812. [DOI] [PubMed] [Google Scholar]
- Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to Pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J. Neurosci. 2003;23:8771–8780. doi: 10.1523/JNEUROSCI.23-25-08771.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Clarke DE, Ko JY, Kuhl EA, van Reekum R, Salvador R, Marin RS. Are the available apathy measures reliable and valid? A review of the psychometric evidence. J. Psychosom. Res. 2011;70:73–97. doi: 10.1016/j.jpsychores.2010.01.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cofer CN, Appley MH. Motivation: Theory and Research. John Wiley; New York: 1964. [Google Scholar]
- Cohen AS, Minor KS. Emotional experience in patients with schizophrenia revisited: meta-analysis of laboratory studies. Schizophr. Bull. 2010;36:143–150. doi: 10.1093/schbul/sbn061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Collins AG, Frank MJ. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. Eur. J. Neurosci. 2012;35:1024–1035. doi: 10.1111/j.1460-9568.2011.07980.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Corbit LH, Balleine BW. Double dissociation of basolateral and central amygdala lesions on the general and outcome-specific forms of pavlovian-instrumental transfer. J. Neurosci. 2005;25:962–970. doi: 10.1523/JNEUROSCI.4507-04.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Correa M, Carlson BB, Wisniecki A, Salamone JD. Nucleus accumbens dopamine and work requirements on interval schedules. Behav. Brain Res. 2002;137:179–187. doi: 10.1016/s0166-4328(02)00292-9. [DOI] [PubMed] [Google Scholar]
- Cousins MS, Atherton A, Turner L, Salamone JD. Nucleus accumbens dopamine depletions alter relative response allocation in a T-maze cost/benefit task. Behav. Brain Res. 1996;74:189–197. doi: 10.1016/0166-4328(95)00151-4. [DOI] [PubMed] [Google Scholar]
- Cousins MS, Wei W, Salamone JD. Pharmacological characterization of performance on a concurrent lever pressing/feeding choice procedure: effects of dopamine antagonist, cholinomimetic, sedative and stimulant drugs. Psychopharmacology. 1994;116:529–537. doi: 10.1007/BF02247489. [DOI] [PubMed] [Google Scholar]
- Craig W. Appetites and aversions as constituents of instincts. Proc. Natl Acad. Sci. U. S. A. 1917;3:685–688. doi: 10.1073/pnas.3.12.685. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Crombag HS, Johnson AW, Zimmer AM, Zimmer A, Holland PC. Deficits in sensory-specific devaluation task performance following genetic deletions of cannabinoid (CB1) receptor. Learn. Mem. 2010;17:18–22. doi: 10.1101/lm.1610510. [DOI] [PubMed] [Google Scholar]
- Croxson PL, Walton ME, O’Reilly JX, Behrens TE, Rushworth MF. Effort-based cost-benefit valuation and the human brain. J. Neurosci. 2009;29:4531–4541. doi: 10.1523/JNEUROSCI.4515-08.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dalley JW, Chudasama Y, Theobald DE, Pettifer CL, Fletcher CM, Robbins TW. Nucleus accumbens dopamine and discriminated approach learning: interactive effects of 6-hydroxydopamine lesions and systemic apomorphine administration. Psychopharmacology. 2002;161:425–433. doi: 10.1007/s00213-002-1078-2. [DOI] [PubMed] [Google Scholar]
- Dalley JW, Lääne K, Theobald DE, Armstrong HC, Corlett PR, Chudasama Y, Robbins TW. Time-limited modulation of appetitive Pavlovian memory by D1 and NMDA receptors in the nucleus accumbens. Proc. Natl Acad. Sci. U. S. A. 2005;102:6189–6194. doi: 10.1073/pnas.0502080102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Danna CL, Elmer GI. Disruption of conditioned reward association by typical and atypical antipsychotics. Pharmacol. Biochem. Behav. 2010;96:40–47. doi: 10.1016/j.pbb.2010.04.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006;441:876–879. doi: 10.1038/nature04766. [DOI] [PMC free article] [PubMed] [Google Scholar]
- de Wit S, Watson P, Harsay HA, Cohen MX, van de Vijver I, Ridderinkhof KR. Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control. J. Neurosci. 2012;32:12066–12075. doi: 10.1523/JNEUROSCI.1088-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Delgado MR, Olsson A, Phelps EA. Extending animal models of fear conditioning to humans. Biol. Psychol. 2006;73:39–48. doi: 10.1016/j.biopsycho.2006.01.006. [DOI] [PubMed] [Google Scholar]
- Demyttenaere K, De Fruyt J, Stahl SM. The many faces of fatigue in major depressive disorder. Int. J. Neuropsychopharmacol. 2005;8:93–105. doi: 10.1017/S1461145704004729. [DOI] [PubMed] [Google Scholar]
- Denk F, Walton ME, Jennings KA, Sharp T, Rushworth MF, Bannerman DM. Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort. Psychopharmacology. 2005;179:587–596. doi: 10.1007/s00213-004-2059-4. [DOI] [PubMed] [Google Scholar]
- Der-Avakian A, D’Souza DA, Pizzagalli DA, Markou A. Assessment of reward responsiveness in the response bias probabilistic reward task in rats: implications for cross-species translational research. Transl. Psychiatry. 2013 doi: 10.1038/tp.2013.74. in press. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Der-Avakian A, Markou A. The neurobiology of anhedonia and other reward-related deficits. Trends Neurosci. 2012;35:68–77. doi: 10.1016/j.tins.2011.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Di Ciano P, Cardinal RN, Cowell RA, Little SJ, Everitt BJ. Differential involvement of NMDA, AMPA/kainate, and dopamine receptors in the nucleus accumbens core in the acquisition and performance of Pavlovian approach behavior. J. Neurosci. 2001;21:9471–9477. doi: 10.1523/JNEUROSCI.21-23-09471.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dias R, Robbins TW, Roberts AC. Dissociation in prefrontal cortex of affective and attentional shifts. Nature. 1996;380:69–72. doi: 10.1038/380069a0. [DOI] [PubMed] [Google Scholar]
- Dickinson A. Conditioning and associative learning. Br. Med. Bull. 1981;37:165–168. doi: 10.1093/oxfordjournals.bmb.a071695. [DOI] [PubMed] [Google Scholar]
- Doll BB, Simon DA, Daw ND. The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 2012;22:1075–1081. doi: 10.1016/j.conb.2012.08.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dowd EC, Barch DM. Pavlovian reward prediction and receipt in schizophrenia: relationship to anhedonia. PLoS One. 2012;7:e35622. doi: 10.1371/journal.pone.0035622. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Drew MR, Simpson EH, Kellendonk C, Herzberg WG, Lipatova O, Fairhurst S, Kandel ER, Malapani C, Balsam PD. Transient overexpression of striatal D2 receptors impairs operant motivation and interval timing. J. Neurosci. 2007;27:7731–7739. doi: 10.1523/JNEUROSCI.1736-07.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Farkas M, Polgar P, Kelemen O, Rethelyi J, Bitter I, Myers CE, Gluck MA, Keri S. Associative learning in deficit and nondeficit schizophrenia. Neuroreport. 2008;19:55–58. doi: 10.1097/WNR.0b013e3282f2dff6. [DOI] [PubMed] [Google Scholar]
- Farrar AM, Pereira M, Velasco F, Hockemeyer J, Muller CE, Salamone JD. Adenosine A2A receptor antagonism reverses the effects of dopamine receptor antagonism on instrumental output and effort-related choice in the rat: implications for studies of psychomotor slowing. Psychopharmacology. 2007;191:579–586. doi: 10.1007/s00213-006-0554-5. [DOI] [PubMed] [Google Scholar]
- Feierstein CE, Quirk MC, Uchida N, Sosulski DL, Mainen ZF. Representation of spatial goals in rat orbitofrontal cortex. Neuron. 2006;51:495–507. doi: 10.1016/j.neuron.2006.06.032. [DOI] [PubMed] [Google Scholar]
- FitzGerald THB, Seymour B, Dolan RJ. The role of human orbitofrontal cortex in value comparison for incommensurable objects. J. Neurosci. 2009;29:8388–8395. doi: 10.1523/JNEUROSCI.0717-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flagel SB, Watson SJ, Robinson TE, Akil H. Individual differences in the propensity to approach signals vs goals promote different adaptations in the dopamine system of rats. Psychopharmacology. 2007;191:599–607. doi: 10.1007/s00213-006-0535-8. [DOI] [PubMed] [Google Scholar]
- Floresco SB, Ghods-Sharifi S. Amygdala-prefrontal cortical circuitry regulates effort-based decision making. Cereb. Cortex. 2007;17:251–260. doi: 10.1093/cercor/bhj143. [DOI] [PubMed] [Google Scholar]
- Floresco SB, Tse MT, Ghods-Sharifi S. Dopaminergic and glutamatergic regulation of effort- and delay-based decision making. Neuropsychopharmacology. 2008;33:1966–1979. doi: 10.1038/sj.npp.1301565. [DOI] [PubMed] [Google Scholar]
- Frank MJ. Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J. Cogn. Neurosci. 2005;17:51–72. doi: 10.1162/0898929052880093. [DOI] [PubMed] [Google Scholar]
- Frank MJ, Moustafa AA, Haughey HM, Curran T, Hutchison KE. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc. Natl Acad. Sci. U. S. A. 2007;104:16311–16316. doi: 10.1073/pnas.0706111104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Frank MJ, Seeberger LC, O’Reilly RC. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science. 2004;306:1940–1943. doi: 10.1126/science.1102941. [DOI] [PubMed] [Google Scholar]
- Friedman JH, Alves G, Hagell P, Marinus J, Marsh L, Martinez-Martin P, Goetz CG, Poewe W, Rascol O, Sampaio C, Stebbins G, Schrag A. Fatigue rating scales critique and recommendations by the Movement Disorders Society task force on rating scales for Parkinson’s disease. Mov. Disord. 2010;25:805–822. doi: 10.1002/mds.22989. [DOI] [PubMed] [Google Scholar]
- Fu WT, Anderson JR. Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes. Psychol. Res. 2008;72:321–330. doi: 10.1007/s00426-007-0113-7. [DOI] [PubMed] [Google Scholar]
- Gard DE, Kring AM, Gard MG, Horan WP, Green MF. Anhedonia in schizophrenia: distinctions between anticipatory and consummatory pleasure. Schizophr. Res. 2007;93:253–260. doi: 10.1016/j.schres.2007.03.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gillan CM, Papmeyer M, Morein-Zamir S, Sahakian BJ, Fineberg NA, Robbins TW, de Wit S. Disruption in the balance between goal-directed behavior and habit learning in obsessive-compulsive disorder. Am J Psychiatry. 2011;168:718–726. doi: 10.1176/appi.ajp.2011.10071062. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gilmour G, Arguello A, Bari A, Brown VJ, Carter C, Floresco SB, Jentsch JD, Tait DS, Young JW, Robbins TW. Measuring the construct of executive control in schizophrenia: defining and validating translational animal paradigms for discovery research. Neurosci. Biobehav. Rev. 2012 doi: 10.1016/j.neubiorev.2012.04.006. in press. [DOI] [PubMed] [Google Scholar]
- Glimcher PW. Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc. Natl Acad. Sci. U. S. A. 2011;108(Suppl. 3):15647–15654. doi: 10.1073/pnas.1014269108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goetz EL, Hariri AR, Pizzagalli DA, Strauman TJ. Genetic moderation of the association between regulatory focus and reward responsiveness: a proof-of-concept study. Biol. Mood Anxiety Disord. 2013;3:3. doi: 10.1186/2045-5380-3-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gold JM, Strauss GP, Waltz JA, Robinson BM, Brown JK, Frank MJ. Negative symptoms of schizophrenia are associated with abnormal effort-cost computations. Biol. Psychiatry. 2013;74:130–136. doi: 10.1016/j.biopsych.2012.12.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gold JM, Waltz JA, Matveeva TM, Kasanova Z, Strauss GP, Herbener ES, Collins AGE, Frank MJ. Negative symptoms and the failure to represent the expected reward value of actions: behavioral and computational modeling evidence. Arch. Gen. Psychiatry. 2012;69:129–138. doi: 10.1001/archgenpsychiatry.2011.1269. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gold JM, Waltz JA, Prentice KJ, Morris SE, Heerey EA. Reward processing in schizophrenia: a deficit in the representation of value. Schizophr. Bull. 2008;34:835–847. doi: 10.1093/schbul/sbn068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable regulation of instrumental action within mouse prefrontal cortex. Eur. J. Neurosci. 2010;32:1726–1734. doi: 10.1111/j.1460-9568.2010.07438.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grabenhorst F, Rolls ET. Different representations of relative and absolute subjective value in the human brain. Neuroimage. 2009;48:258–268. doi: 10.1016/j.neuroimage.2009.06.045. [DOI] [PubMed] [Google Scholar]
- Grabenhorst F, Rolls ET. Value, pleasure and choice in the ventral prefrontal cortex. Trends Cogn. Sci. 2011;15:56–67. doi: 10.1016/j.tics.2010.12.004. [DOI] [PubMed] [Google Scholar]
- Hare TA, O’Doherty J, Camerer CF, Schultz W, Rangel A. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 2008;28:5623–5630. doi: 10.1523/JNEUROSCI.1309-08.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hauber W, Sommer S. Prefrontostriatal circuitry regulates effort-related decision making. Cereb. Cortex. 2009;19:2240–2247. doi: 10.1093/cercor/bhn241. [DOI] [PubMed] [Google Scholar]
- Heerey EA, Bell-Warren KR, Gold JM. Decision-making impairments in the context of intact reward sensitivity in schizophrenia. Biol. Psychiatry. 2008;64:62–69. doi: 10.1016/j.biopsych.2008.02.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Heerey EA, Gold JM. Patients with schizophrenia demonstrate dissociation between affective experience and motivated behavior. J. Abnorm. Psychol. 2007;116:268–278. doi: 10.1037/0021-843X.116.2.268. [DOI] [PubMed] [Google Scholar]
- Hilario MR, Clouse E, Yin HH, Costa RM. Endocannabinoid signaling is critical for habit formation. Front. Integr. Neurosci. 2007;1:6. doi: 10.3389/neuro.07.006.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hiraoka K. Discrete-trial probability learning in rats: effects of local contingencies of reinforcement. Anim. Learn. Behav. 1984;12:343–349. [Google Scholar]
- Hofer E, Doby D, Anderer P, Dantendorfer K. Impaired conditional discrimination learning in schizophrenia. Schizophr. Res. 2001;51:127–136. doi: 10.1016/s0920-9964(00)00118-3. [DOI] [PubMed] [Google Scholar]
- Hogarth L, Attwood AS, Bate HA, Munafò MR. Acute alcohol impairs human goal-directed action. Biol. Psychol. 2012;90:154–160. doi: 10.1016/j.biopsycho.2012.02.016. [DOI] [PubMed] [Google Scholar]
- Horner AE, Heath CJ, Hvoslef-Eide M, Kent BA, Kim CH, Nilsson S, Alsio J, Oomen CA, Holmes A, Saksida LM, Bussey TJ. The touchscreen operant platform for testing learning and memory in rats and mice. Nat. Protoc. 2013 doi: 10.1038/nprot.2013.122. in press. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huys QJM, Pizzagalli DA, Bogdan R, Dayan P. Mapping anhedonia onto reinforcement learning: a behavioural meta-analysis. Biol. Mood Anxiety Disord. 2013;3:12. doi: 10.1186/2045-5380-3-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Inglis WL, Olmstead MC, Robbins TW. Pedunculopontine tegmental nucleus lesions impair stimulus-reward learning in autoshaping and conditioned reinforcement paradigms. Behav. Neurosci. 2000;114:285–294. doi: 10.1037//0735-7044.114.2.285. [DOI] [PubMed] [Google Scholar]
- Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Sanislow C, Wang P. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am. J. Psychiatry. 2010;167:748–751. doi: 10.1176/appi.ajp.2010.09091379. [DOI] [PubMed] [Google Scholar]
- Ishizaki J, Mimura M. Dysthymia and apathy: diagnosis and treatment. Depress. Res. Treat. 2011;2011:893905. doi: 10.1155/2011/893905. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ito R, Everitt BJ, Robbins TW. The hippocampus and appetitive Pavlovian conditioning: effects of excitotoxic hippocampal lesions on conditioned locomotor activity and autoshaping. Hippocampus. 2005;15:713–721. doi: 10.1002/hipo.20094. [DOI] [PubMed] [Google Scholar]
- Ito S, Stuphorn V, Brown JW, Schall JD. Performance monitoring by the anterior cingulate cortex during saccade countermanding. Science. 2003;302:120–122. doi: 10.1126/science.1087847. [DOI] [PubMed] [Google Scholar]
- Kable JW, Glimcher PW. The neural correlates of subjective value during intertemporal choice. Nat. Neurosci. 2007;10:1625–1633. doi: 10.1038/nn2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kable JW, Glimcher PW. The neurobiology of decision: consensus and controversy. Neuron. 2009;63:733–745. doi: 10.1016/j.neuron.2009.09.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kasanova Z, Waltz JA, Strauss GP, Frank MJ, Gold JM. Optimizing vs. matching: response strategy in a probabilistic learning task is associated with negative symptoms of schizophrenia. Schizophr. Res. 2011;127:215–222. doi: 10.1016/j.schres.2010.12.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kay SR, Fiszbein A, Opler LA. The Positive and Negative Syndrome Scale (PANSS) for schizophrenia. Schizophr. Bull. 1987;13:261–276. doi: 10.1093/schbul/13.2.261. [DOI] [PubMed] [Google Scholar]
- Kennerley SW, Behrens TEJ, Wallis JD. Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat. Neurosci. 2011;14:1581–1589. doi: 10.1038/nn.2961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kepecs A, Uchida N, Zariwala HA, Mainen ZF. Neural correlates, computation and behavioural impact of decision confidence. Nature. 2008;455:227–231. doi: 10.1038/nature07200. [DOI] [PubMed] [Google Scholar]
- Klossek UM, Yu S, Dickinson A. Choice and goal-directed behavior in preschool children. Learn Behav. 2011;39:350–357. doi: 10.3758/s13420-011-0030-x. [DOI] [PubMed] [Google Scholar]
- Klossek UMH, Russel J, Dickinson A. The control of instrumental action following outcome devaluation in young children aged between 1 and 4 years. J. Exp. Psychol. Gen. 2008;137:39–51. doi: 10.1037/0096-3445.137.1.39. [DOI] [PubMed] [Google Scholar]
- Knowlton BJ, Squire LR, Gluck MA. Probabilistic classification learning in amnesia. Learn. Mem. 1994;1:106–120. [PubMed] [Google Scholar]
- Kobayashi S, Lauwereyns J, Koizumi M, Sakagami M, Hikosaka O. Influence of reward expectation on visuospatial processing in Macaque lateral prefrontal cortex. J. Neurophysiol. 2002;87:1488–1498. doi: 10.1152/jn.00472.2001. [DOI] [PubMed] [Google Scholar]
- Koch M, Schmid A, Schnitzler HU. Role of nucleus accumbens dopamine D1 and D2 receptors in instrumental and Pavlovian paradigms of conditioned reward. Psychopharmacology. 2000;152:67–73. doi: 10.1007/s002130000505. [DOI] [PubMed] [Google Scholar]
- Kring AM, Germans Gard M, Gard DE. Emotion deficits in schizophrenia: timing matters. J. Abnorm. Psychol. 2011;120:79–87. doi: 10.1037/a0021402. [DOI] [PubMed] [Google Scholar]
- Kringelbach ML. The human orbitofrontal cortex: linking reward to hedonic experience. Nat. Rev. Neurosci. 2005;6:691–702. doi: 10.1038/nrn1747. [DOI] [PubMed] [Google Scholar]
- Kringelbach ML, Berridge KC. The functional neuroanatomy of pleasure and happiness. Discov. Med. 2010;9:579–587. [PMC free article] [PubMed] [Google Scholar]
- Kurniawan IT, Guitart-Masip M, Dolan RJ. Dopamine and effort-based decision making. Front. Neurosci. 2011;5:81. doi: 10.3389/fnins.2011.00081. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Levy I, Snell J, Nelson AJ, Rustichini A, Glimcher PW. Neural representation of subjective value under risk and ambiguity. J. Neurophysiol. 2010;103:1036–1047. doi: 10.1152/jn.00853.2009. [DOI] [PubMed] [Google Scholar]
- Liljeholm M, Molloy CJ, O’Doherty JP. Dissociable brain systems mediate vicarious learning of stimulus-response and action-outcome contingencies. J. Neurosci. 2012;32:9878–9886. doi: 10.1523/JNEUROSCI.0548-12.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lubow RE. Classical eyeblink conditioning and schizophrenia: a short review. Behav. Brain Res. 2009;202:1–4. doi: 10.1016/j.bbr.2009.03.006. [DOI] [PubMed] [Google Scholar]
- Luck SJ, Gold JM. The construct of attention in schizophrenia. Biol. Psychiatry. 2008;64:34–39. doi: 10.1016/j.biopsych.2008.02.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mackintosh NJ. A theory of attention: variations in the associability of stimuli with reinforcement. Psychol. Rev. 1975;82:276–298. [Google Scholar]
- Maia TV, Frank MJ. From reinforcement learning models to psychiatric and neurological disorders. Nat. Neurosci. 2011;14:154–162. doi: 10.1038/nn.2723. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mar AC, Walker AL, Theobald DE, Eagle DM, Robbins TW. Dissociable effects of lesions to orbitofrontal cortex subregions on impulsive choice in the rat. J. Neurosci. 2011;31:6398–6404. doi: 10.1523/JNEUROSCI.6620-10.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Marin RS. Apathy: a neuropsychiatric syndrome. J. Neuropsychiatry Clin. Neurosci. 1991;3:243–254. doi: 10.1176/jnp.3.3.243. [DOI] [PubMed] [Google Scholar]
- Markou A, Kosten TR, Koob GF. Neurobiological similarities in depression and drug dependence: a self-medication hypothesis. Neuropsychopharmacology. 1998;18:135–174. doi: 10.1016/S0893-133X(97)00113-9. [DOI] [PubMed] [Google Scholar]
- Meneses A, Hong E. Modification of 8-OH-DPAT effects on learning by manipulation of the assay conditions. Behav. Neural Biol. 1994;61:29–35. doi: 10.1016/s0163-1047(05)80041-x. [DOI] [PubMed] [Google Scholar]
- Mingote S, Font L, Farrar AM, Vontell R, Worden LT, Stopper CM, Port RG, Sink KS, Bunce JG, Chrobak JJ, Salamone JD. Nucleus accumbens adenosine A2A receptors regulate exertion of effort by acting on the ventral striatopallidal pathway. J. Neurosci. 2008;28:9037–9046. doi: 10.1523/JNEUROSCI.1525-08.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mingote S, Weber SM, Ishiwari K, Correa M, Salamone JD. Ratio and time requirements on operant schedules: effort-related effects of nucleus accumbens dopamine depletions. Eur. J. Neurosci. 2005;21:1749–1757. doi: 10.1111/j.1460-9568.2005.03972.x. [DOI] [PubMed] [Google Scholar]
- Montague PR, Dayan P, Sejnowski TJ. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 1996;16:1936–1947. doi: 10.1523/JNEUROSCI.16-05-01936.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morrison SE, Salzman CD. Re-valuing the amygdala. Curr. Opin. Neurobiol. 2010;20:221–230. doi: 10.1016/j.conb.2010.02.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mott AM, Nunes EJ, Collins LE, Port RG, Sink KS, Hockemeyer J, Müller CE, Salamone JD. The adenosine A2A antagonist MSX-3 reverses the effects of the dopamine antagonist haloperidol on effort-related decision making in a T-maze cost/benefit procedure. Psychopharmacology. 2009;204:103–112. doi: 10.1007/s00213-008-1441-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nikolova Y, Bogdan R, Pizzagalli DA. Perception of a naturalistic stressor interacts with 5-HTTLPR/rs25531 genotype and gender to impact reward responsiveness. Neuropsychobiology. 2012;65:45–54. doi: 10.1159/000329105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Noonan MP, Walton ME, Behrens TEJ, Sallet J, Buckley MJ, Rushworth MFS. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc. Natl Acad. Sci. U. S. A. 2010;107:20547–20552. doi: 10.1073/pnas.1012246107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nuechterlein KH, Luck SJ, Lustig C, Sarter M. CNTRICS final task selection: control of attention. Schizophr. Bull. 2009;35:182–196. doi: 10.1093/schbul/sbn158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- O’Doherty JP. Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards, and choices. Ann. N. Y. Acad. Sci. 2007;1121:254–272. doi: 10.1196/annals.1401.036. [DOI] [PubMed] [Google Scholar]
- Oakeshott S, Port R, Cummins-Sutphen J, Berger J, Watson-Johnson J, Ramboz S, Paterson N, Kwak S, Howland D, Brunner D. A mixed fixed ratio/progressive ratio procedure reveals an apathy phenotype in the BAC HD and the zQ175 KI mouse models of Huntington’s disease. PLoS Curr. Huntington Dis. 2012 Apr 25; doi: 10.1371/4f972cffe82c0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Oakeshott S, Port RG, Cummins-Sutphen J, Watson-Johnson J, Ramboz S, Park L, Howland D, Brunner D. HD mouse models reveal clear deficits in learning to perform a simple instrumental response. PLoS Curr. 2011;3:RRN1281. doi: 10.1371/currents.RRN1281.
- Padoa-Schioppa C. Range-adapting representation of economic value in the orbitofrontal cortex. J. Neurosci. 2009;29:14004–14014. doi: 10.1523/JNEUROSCI.3751-09.2009.
- Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
- Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat. Neurosci. 2008;11:95–102. doi: 10.1038/nn2020.
- Palminteri S, Lebreton M, Worbe Y, Grabli D, Hartmann A, Pessiglione M. Pharmacological modulation of subliminal learning in Parkinson’s and Tourette’s syndromes. Proc. Natl Acad. Sci. U. S. A. 2009;106:19179–19184. doi: 10.1073/pnas.0904035106.
- Pardo M, Lopez-Cruz L, Valverde O, Ledent C, Baqi Y, Müller CE, Salamone JD. Adenosine A2A receptor antagonism and genetic deletion attenuate the effects of dopamine D2 antagonism on effort-based decision making in mice. Neuropharmacology. 2012;62:2068–2077. doi: 10.1016/j.neuropharm.2011.12.033.
- Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, Rudarakanchana N, Halkerston KM, Robbins TW, Everitt BJ. Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav. Brain Res. 2002;137:149–163. doi: 10.1016/s0166-4328(02)00291-7.
- Parkinson JA, Olmstead MC, Burns LH, Robbins TW, Everitt BJ. Dissociation in effects of lesions of the nucleus accumbens core and shell on appetitive Pavlovian approach behavior and the potentiation of conditioned reinforcement and locomotor activity by D-amphetamine. J. Neurosci. 1999;19:2401–2411. doi: 10.1523/JNEUROSCI.19-06-02401.1999.
- Parkinson JA, Robbins TW, Everitt BJ. Dissociable roles of the central and basolateral amygdala in appetitive emotional learning. Eur. J. Neurosci. 2000a;12:405–413. doi: 10.1046/j.1460-9568.2000.00960.x.
- Parkinson JA, Willoughby PJ, Robbins TW, Everitt BJ. Disconnection of the anterior cingulate cortex and nucleus accumbens core impairs Pavlovian approach behavior: further evidence for limbic cortical-ventral striatopallidal systems. Behav. Neurosci. 2000b;114:42–63.
- Pearce JM, Hall G. A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol. Rev. 1980;87:532–552.
- Peciña S, Cagniard B, Berridge KC, Aldridge JW, Zhuang X. Hyperdopaminergic mutant mice have higher “wanting” but not “liking” for sweet rewards. J. Neurosci. 2003;23:9395–9402. doi: 10.1523/JNEUROSCI.23-28-09395.2003.
- Phelps EA, O’Connor KJ, Gatenby JC, Gore JC, Grillon C, Davis M. Activation of the left amygdala to a cognitive representation of fear. Nat. Neurosci. 2001;4:437–441. doi: 10.1038/86110.
- Pithers RT. The roles of event contingencies and reinforcement in human autoshaping and omission responding. Learn. Motiv. 1985;16:210–237.
- Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M. Reduced hedonic capacity in major depressive disorder: evidence from a probabilistic reward task. J. Psychiatr. Res. 2008;43:76–87. doi: 10.1016/j.jpsychires.2008.03.001.
- Pizzagalli DA, Jahn AL, O’Shea JP. Toward an objective characterization of an anhedonic phenotype: a signal-detection approach. Biol. Psychiatry. 2005;57:319–327. doi: 10.1016/j.biopsych.2004.11.026.
- Pizzagalli DA, Sherwood RJ, Henriques JB, Davidson RJ. Frontal brain asymmetry and reward responsiveness: a source-localization study. Psychol. Sci. 2005;16:805–813. doi: 10.1111/j.1467-9280.2005.01618.x.
- Ragland JD, Cools R, Frank M, Pizzagalli DA, Preston A, Ranganath C, Wagner AD. CNTRICS final task selection: long-term memory. Schizophr. Bull. 2008;35:197–212. doi: 10.1093/schbul/sbn134.
- Randall PA, Pardo M, Nunes EJ, López Cruz L, Vemuri VK, Makriyannis A, Baqi Y, Muller CE, Correa M, Salamone JD. Dopaminergic modulation of effort-related choice behavior as assessed by a progressive ratio chow feeding choice task: pharmacological studies and the role of individual differences. PLoS One. 2012;7:e47934. doi: 10.1371/journal.pone.0047934.
- Rangel A, Camerer C, Montague PR. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 2008;9:545–556. doi: 10.1038/nrn2357.
- Ribot T. La Psychologie des Sentiments. F. Alcan; Paris: 1896.
- Roesch MR, Olson CR. Neuronal activity related to reward value and motivation in primate frontal cortex. Science. 2004;304:307–310. doi: 10.1126/science.1093223.
- Romaniuk L, Honey GD, King JR, Whalley HC, McIntosh AM, Levita L, Hughes M, Johnstone EC, Day M, Lawrie SM, Hall J. Midbrain activation during Pavlovian conditioning and delusional symptoms in schizophrenia. Arch. Gen. Psychiatry. 2010;67:1246–1254. doi: 10.1001/archgenpsychiatry.2010.169.
- Rudebeck PH, Murray EA. Balkanizing the primate orbitofrontal cortex: distinct subregions for comparing and contrasting values. Ann. N. Y. Acad. Sci. 2011;1239:1–13. doi: 10.1111/j.1749-6632.2011.06267.x.
- Rushworth MFS, Buckley MJ, Behrens TEJ, Walton ME, Bannerman DM. Functional organization of the medial prefrontal cortex. Curr. Opin. Neurobiol. 2007;17:220–227. doi: 10.1016/j.conb.2007.03.001.
- Rushworth MFS, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron. 2011;70:1054–1069. doi: 10.1016/j.neuron.2011.05.014.
- Salamone JD. Dopaminergic involvement in activational aspects of motivation: effects of haloperidol on schedule-induced activity, feeding, and foraging in rats. Psychobiology. 1988;16:196–206.
- Salamone JD. Will the last person who uses the term “reward” please turn out the lights? Comments on processes related to reinforcement, learning, motivation, and effort. Addict. Biol. 2006;11:43–44. doi: 10.1111/j.1369-1600.2006.00011.x.
- Salamone JD. Functions of mesolimbic dopamine: changing concepts and shifting paradigms. Psychopharmacology. 2007;191:389. doi: 10.1007/s00213-006-0623-9.
- Salamone JD. Preladenant, a novel adenosine A2A receptor antagonist for the potential treatment of parkinsonism and other disorders. IDrugs. 2010;13:723–731.
- Salamone JD, Correa M. Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine. Behav. Brain Res. 2002;137:3–25. doi: 10.1016/s0166-4328(02)00282-6.
- Salamone JD, Arizzi M, Sandoval MD, Cervone KM, Aberman JE. Dopamine antagonists alter response allocation but do not suppress appetite for food in rats: contrast between the effects of SKF 83566, raclopride and fenfluramine on a concurrent choice task. Psychopharmacology. 2002;160:371–380. doi: 10.1007/s00213-001-0994-x.
- Salamone JD, Correa M. Dopamine/adenosine interactions involved in effort-related aspects of food motivation. Appetite. 2009;53:422–425. doi: 10.1016/j.appet.2009.07.018.
- Salamone JD, Correa M, Farrar A, Mingote SM. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology. 2007;191:461–482. doi: 10.1007/s00213-006-0668-9.
- Salamone JD, Correa M, Farrar AM, Nunes EJ, Collins LE. Role of dopamine-adenosine interactions in the brain circuitry regulating effort-related decision making: insights into pathological aspects of motivation. Future Neurol. 2010;5:377–392.
- Salamone JD, Correa M, Farrar AM, Nunes EJ, Pardo M. Dopamine, behavioral economics, and effort. Front. Behav. Neurosci. 2009;3:13. doi: 10.3389/neuro.08.013.2009.
- Salamone JD, Correa M, Mingote SM, Weber SM, Farrar AM. Nucleus accumbens dopamine and the forebrain circuitry involved in behavioral activation and effort-related decision making: implications for understanding anergia and psychomotor slowing in depression. Curr. Psychiatr. Rev. 2006;2:267–280.
- Salamone JD, Correa M, Nunes EJ, Randall PA, Pardo M. The behavioral pharmacology of effort-related choice behavior: dopamine, adenosine and beyond. J. Exp. Anal. Behav. 2012;97:125–146. doi: 10.1901/jeab.2012.97-125.
- Salamone JD, Cousins MS, Bucher S. Anhedonia or anergia? Effects of haloperidol and nucleus accumbens dopamine depletion on instrumental response selection in a T-maze cost/benefit procedure. Behav. Brain Res. 1994;65:221–229. doi: 10.1016/0166-4328(94)90108-2.
- Salamone JD, Cousins MS, Maio C, Champion M, Turski T, Kovach J. Different behavioral effects of haloperidol, clozapine and thioridazine in a concurrent lever pressing and feeding procedure. Psychopharmacology. 1996;125:105–112. doi: 10.1007/BF02249408.
- Salamone JD, Cousins MS, Snyder BJ. Behavioral functions of nucleus accumbens dopamine: empirical and conceptual problems with the anhedonia hypothesis. Neurosci. Biobehav. Rev. 1997;21:341–359. doi: 10.1016/s0149-7634(96)00017-6.
- Salamone JD, Steinpreis RE, McCullough LD, Smith P, Grebel D, Mahan K. Haloperidol and nucleus accumbens dopamine depletion suppress lever pressing for food but increase free food consumption in a novel food choice procedure. Psychopharmacology. 1991;104:515–521. doi: 10.1007/BF02245659.
- Santesso DL, Evins AE, Frank MJ, Schetter EC, Bogdan R, Pizzagalli DA. Single dose of a dopamine agonist impairs reinforcement learning in humans: evidence from event-related potentials and computational modeling of striatal-cortical function. Hum. Brain Map. 2009;30:1963–1976. doi: 10.1002/hbm.20642.
- Savage LM, Ramos RL. Reward expectation alters learning and memory: the impact of the amygdala on appetitive-driven behaviors. Behav. Brain Res. 2009;198:1–12. doi: 10.1016/j.bbr.2008.10.028.
- Savine AC, Braver TS. Motivated cognitive control: reward incentives modulate preparatory neural activity during task-switching. J. Neurosci. 2010;30:10294–10305. doi: 10.1523/JNEUROSCI.2052-10.2010. [Retracted]
- Schmelzeis MC, Mittleman G. The hippocampus and reward: effects of hippocampal lesions on progressive-ratio responding. Behav. Neurosci. 1996;110:1049–1066. doi: 10.1037//0735-7044.110.5.1049.
- Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat. Rev. Neurosci. 2009;10:885–892. doi: 10.1038/nrn2753.
- Schultz W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 2007;30:259–288. doi: 10.1146/annurev.neuro.28.061604.135722.
- Schultz W. Dopamine signals for reward value and risk: basic and recent data. Behav. Brain Funct. 2010;6:24. doi: 10.1186/1744-9081-6-24.
- Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
- Schweimer J, Hauber W. Dopamine D1 receptors in the anterior cingulate cortex regulate effort-related decision making. Learn. Mem. 2006;13:777–782. doi: 10.1101/lm.409306.
- Sheynikhovich D, Otani S, Arleo A. The role of tonic and phasic dopamine for long-term synaptic plasticity in the prefrontal cortex: a computational model. J. Physiol. (Paris) 2011;105:45–52. doi: 10.1016/j.jphysparis.2011.08.001.
- Shimp CP. Short-term memory in the pigeon: the previously reinforced response. J. Exp. Anal. Behav. 1976;26:487–493. doi: 10.1901/jeab.1976.26-487.
- Shohamy D, Myers CE, Kalanithi J, Gluck MA. Basal ganglia and dopamine contributions to probabilistic category learning. Neurosci. Biobehav. Rev. 2008;32:219–236. doi: 10.1016/j.neubiorev.2007.07.008.
- Simpson EH, Kellendonk C, Ward RD, Richards V, Lipatova O, Fairhurst S, Kandel ER, Balsam PD. Pharmacologic rescue of motivational deficit in an animal model of the negative symptoms of schizophrenia. Biol. Psychiatry. 2011;69:928–935. doi: 10.1016/j.biopsych.2011.01.012.
- Sink KS, Vemuri VK, Olszewska T, Makriyannis A, Salamone JD. Cannabinoid CB1 antagonists and dopamine antagonists produce different effects on a task involving response allocation and effort-related choice in food-seeking behavior. Psychopharmacology. 2008;196:565–574. doi: 10.1007/s00213-007-0988-4.
- Skjoldager P, Pierre PJ, Mittleman G. Reinforcer magnitude and progressive ratio responding in the rat: effects of increased effort, prefeeding, and extinction. Learn. Motiv. 1993;24:303–343.
- Smith KS, Berridge KC, Aldridge JW. Disentangling pleasure from incentive salience and learning signals in brain reward circuitry. Proc. Natl Acad. Sci. U. S. A. 2011;108:E255–E264. doi: 10.1073/pnas.1101920108.
- Stewart WJ. Progressive reinforcement schedules: a review and evaluation. Aust. J. Psychol. 1974;27:9–22.
- Strauss GP, Herbener ES. Patterns of emotional experience in schizophrenia: differences in emotional response to visual stimuli are associated with clinical presentation and functional outcome. Schizophr. Res. 2011;128:117–123. doi: 10.1016/j.schres.2011.01.010.
- Surmeier DJ, Plotkin J, Shen W. Dopamine and synaptic plasticity in dorsal striatal circuits controlling action selection. Curr. Opin. Neurobiol. 2009;19:621–628. doi: 10.1016/j.conb.2009.10.003.
- Takahashi YK, Roesch MR, Wilson RC, Toreson K, O’Donnell P, Niv Y, Schoenbaum G. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat. Neurosci. 2011;14:1590–1597. doi: 10.1038/nn.2957.
- Treadway MT, Buckholtz JW, Schwartzman AN, Lambert WE, Zald DH. Worth the ‘EEfRT’? The effort expenditure for rewards task as an objective measure of motivation and anhedonia. PLoS One. 2009;4:e6598. doi: 10.1371/journal.pone.0006598.
- Treadway MT, Bossaller NA, Shelton RC, Zald DH. Effort-based decision-making in major depressive disorder: a translational model of motivational anhedonia. J. Abnorm. Psychol. 2012a;121:553–558. doi: 10.1037/a0028813.
- Treadway MT, Buckholtz JW, Cowan RL, Woodward ND, Li R, Ansari MS, Baldwin RM, Schwartzman AN, Kessler RM, Zald DH. Dopaminergic mechanisms of individual differences in human effort-based decision-making. J. Neurosci. 2012b;32:6170–6176. doi: 10.1523/JNEUROSCI.6459-11.2012.
- Treadway MT, Zald DH. Reconsidering anhedonia in depression: lessons from translational neuroscience. Neurosci. Biobehav. Rev. 2011;35:537–555. doi: 10.1016/j.neubiorev.2010.06.006.
- Trecker A, Robbins TW, Mar AC. Development of a visual-guided probabilistic selection task for rats. In: Proceedings of Measuring Behaviour; Utrecht, Netherlands. 2012. pp. 482–483.
- Tripp G, Alsop B. Sensitivity to reward frequency in boys with attention deficit hyperactivity disorder. J. Clin. Child Psychol. 1999;28:366–375. doi: 10.1207/S15374424jccp280309.
- Tsujimoto S, Sawaguchi T. Neuronal representation of response-outcome in the primate prefrontal cortex. Cereb. Cortex. 2004;14:47–55. doi: 10.1093/cercor/bhg090.
- Uslaner JM, Dell’Orco JM, Pevzner A, Robinson TE. The influence of subthalamic nucleus lesions on sign-tracking to stimuli paired with food and drug rewards: facilitation of incentive salience attribution? Neuropsychopharmacology. 2008;33:2352–2361. doi: 10.1038/sj.npp.1301653.
- Valentin VV, Dickinson A, O’Doherty JP. Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 2007;27:4019–4026. doi: 10.1523/JNEUROSCI.0564-07.2007.
- Van den Bos R, van der Harst J, Jonkman S, Schilders M, Spruijt B. Rats assess costs and benefits according to an internal standard. Behav. Brain Res. 2006;171:350–354. doi: 10.1016/j.bbr.2006.03.035.
- van Reekum R, Stuss DT, Ostrander L. Apathy: why care? J. Neuropsychiatry Clin. Neurosci. 2005;17:7–19. doi: 10.1176/jnp.17.1.7.
- Voon V, Pessiglione M, Brezing C, Gallea C, Fernandez HH, Dolan RJ, Hallett M. Mechanisms underlying dopamine-mediated reward bias in compulsive behaviors. Neuron. 2010;65:135–142. doi: 10.1016/j.neuron.2009.12.027.
- Vrieze E, Pizzagalli DA, Demyttenaere K, Hompes T, Sienaert P, de Boer P, Schmidt M, Claes S. Reduced reward learning predicts outcome in major depressive disorder. Biol. Psychiatry. 2013a;73:639–645. doi: 10.1016/j.biopsych.2012.10.014.
- Vrieze E, Ceccarini J, Pizzagalli DA, Bormans G, Vandenbulcke M, Demyttenaere K, Van Laere K, Claes S. Measuring extrastriatal dopamine release during a reward learning task. Hum. Brain Map. 2013b;34:575–586. doi: 10.1002/hbm.21456.
- Wakabayashi KT, Fields HL, Nicola SM. Dissociation of the role of nucleus accumbens dopamine in responding to reward-predictive cues and waiting for reward. Behav. Brain Res. 2004;154:19–30. doi: 10.1016/j.bbr.2004.01.013.
- Wallis JD, Miller EK. Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur. J. Neurosci. 2003;18:2069–2081. doi: 10.1046/j.1460-9568.2003.02922.x.
- Walton ME, Kennerley SW, Bannerman DM, Phillips PE, Rushworth MF. Weighing up the benefits of work: behavioral and neural analyses of effort-related decision making. Neural Netw. 2006;19:1302–1314. doi: 10.1016/j.neunet.2006.03.005.
- Waltz JA, Frank MJ, Robinson BM, Gold JM. Selective reinforcement learning deficits in schizophrenia support predictions from computational models of striatal-cortical dysfunction. Biol. Psychiatry. 2007;62:756–764. doi: 10.1016/j.biopsych.2006.09.042.
- Waltz JA, Frank MJ, Wiecki TV, Gold JM. Altered probabilistic learning and response biases in schizophrenia: behavioral evidence and neurocomputational modeling. Neuropsychology. 2011;25:86–97. doi: 10.1037/a0020882.
- Ward RD, Simpson EH, Kandel ER, Balsam PD. Modeling motivational deficits in mouse models of schizophrenia: behavior analysis as a guide for neuroscience. Behav. Processes. 2011;87:149–156. doi: 10.1016/j.beproc.2011.02.004.
- Ward RD, Simpson EH, Richards VL, Deo G, Taylor K, Glendinning JI, Kandel ER, Balsam PD. Dissociation of hedonic reaction to reward and incentive motivation in an animal model of the negative symptoms of schizophrenia. Neuropsychopharmacology. 2012;37:1699–1707. doi: 10.1038/npp.2012.15.
- Wardle MC, Treadway MT, Mayo LM, Zald DH, de Wit H. Amping up effort: effects of d-amphetamine on human effort-based decision-making. J. Neurosci. 2011;31:16597–16602. doi: 10.1523/JNEUROSCI.4387-11.2011.
- Watson D, Clark LA, Tellegen A. Development and validation of brief measures of positive and negative affect: the PANAS scales. J. Pers. Soc. Psychol. 1988;54:1063–1070. doi: 10.1037//0022-3514.54.6.1063.
- Weiler JA, Bellebaum C, Brune M, Juckel G, Daum I. Impairment of probabilistic reward-based learning in schizophrenia. Neuropsychology. 2009;23:571–580. doi: 10.1037/a0016166.
- West EA, DesJardin JT, Gale K, Malkova L. Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. J. Neurosci. 2011;31:15128–15135. doi: 10.1523/JNEUROSCI.3295-11.2011.
- Whitmer AJ, Frank MJ, Gotlib IH. Sensitivity to reward and punishment in major depressive disorder: effects of rumination and of single versus multiple experiences. Cogn. Emot. 2012;26:1475–1485. doi: 10.1080/02699931.2012.682973.
- Wickens JR, Begg AJ, Arbuthnott GW. Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro. Neuroscience. 1996;70:1–5. doi: 10.1016/0306-4522(95)00436-m.
- Wilcove WG, Miller JC. CS-UCS presentations and a lever: human autoshaping. J. Exp. Psychol. 1974;103:868–877. doi: 10.1037/h0037388.
- Williams BA. Choice as a function of local versus molar reinforcement contingencies. J. Exp. Anal. Behav. 1991;56:455–473. doi: 10.1901/jeab.1991.56-455.
- Winstanley CA, Baunez C, Theobald DE, Robbins TW. Lesions to the subthalamic nucleus decrease impulsive choice but impair autoshaping in rats: the importance of the basal ganglia in Pavlovian conditioning and impulse control. Eur. J. Neurosci. 2005;21:3107–3116. doi: 10.1111/j.1460-9568.2005.04143.x.
- Winstanley CA, Dalley JW, Theobald DE, Robbins TW. Fractionating impulsivity: contrasting effects of central 5-HT depletion on different measures of impulsive behavior. Neuropsychopharmacology. 2004;29:1331–1343. doi: 10.1038/sj.npp.1300434.
- Worden LT, Shahriari M, Farrar AM, Sink KS, Hockemeyer J, Müller C, Salamone JD. The adenosine A2A antagonist MSX-3 reverses the effort-related effects of dopamine blockade: differential interaction with D1 and D2 family antagonists. Psychopharmacology. 2009;203:489–499. doi: 10.1007/s00213-008-1396-0.
- Wunderlich K, Beierholm UR, Bossaerts P, O’Doherty JP. The human prefrontal cortex mediates integration of potential causes behind observed outcomes. J. Neurophysiol. 2011;106:1558–1569. doi: 10.1152/jn.01051.2010.
- Yeomans JS. Role of tegmental cholinergic neurons in dopaminergic activation, antimuscarinic psychosis and schizophrenia. Neuropsychopharmacology. 1995;12:3–16. doi: 10.1038/sj.npp.1380235.
- Yilmaz A, Simsek F, Gonul AS. Reduced reward-related probability learning in schizophrenia patients. Neuropsychiatr. Dis. Treat. 2012;8:27–34. doi: 10.2147/NDT.S26243.
- Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning. Eur. J. Neurosci. 2005;22:513–523. doi: 10.1111/j.1460-9568.2005.04218.x.