Abstract
The discrimination reversal paradigm is commonly used to measure a subject's ability to adapt their behavior according to changes in stimulus-reward contingencies. Human functional neuroimaging studies have demonstrated activations in the lateral orbitofrontal cortex (OFC) and the inferior frontal gyrus (IFG) in subjects performing this task. Excitotoxic lesions of analogous regions in marmosets have revealed, however, that while the OFC is indeed critical for reversal learning, ventrolateral prefrontal cortex (VLPFC) (analogous to IFG) is not, contributing instead to higher order processing, such as that required in attentional set-shifting and strategy transfer. One major difference between the marmoset and human studies has been the level of experience subjects received in reversal learning, being far greater in the latter. Since exposure to repeated contingency changes, as occurs in serial reversal learning, is likely to trigger the development of higher order, rule based strategies, we hypothesised a critical role of the marmoset VLPFC in performance of a serial reversal learning paradigm. After extensive training in reversal learning, marmosets received an excitotoxic lesion of the VLPFC, OFC or a sham control procedure. In agreement with our prediction, VLPFC lesioned animals were impaired in performing a series of discrimination reversals, post-surgery, but only when novel visual stimuli were introduced. In contrast, OFC lesioned animals were impaired regardless of whether the visual stimuli were the same or different from those used during pre-surgery training. Together, these data demonstrate the heterogeneous but inter-related involvement of primate OFC and VLPFC in the performance of serial reversal learning.
Keywords: OFC, VLPFC, reversal learning, marmoset, primate, lesion
Introduction
A hallmark of human cognition is its remarkable flexibility. Cognitive flexibility enables rapid behavioral adaptation to changing internal states and environmental circumstances. A biological substrate of this flexibility has been localized within the prefrontal cortex (PFC). Lesion studies in non-human primates have demonstrated that distinct regions of the PFC carry out independent, but complementary, forms of cognitive flexibility processing. The capacity to reverse affective associations for specific stimuli (discrimination reversal learning) is mediated by regions within the OFC of marmosets (Dias et al., 1996). In contrast, regions within the VLPFC (proposed to be cytoarchitectonically similar to the VLPFC of rhesus monkeys and the IFG of humans; Burman and Rosa, 2009) are implicated in higher order processes such as attentional set shifting and strategy transfer (Dias et al., 1996; Wallis et al., 2001). Conversely, lesions of OFC do not affect set-shifting and strategy transfer while lesions of VLPFC do not affect reversal of stimulus-reward associations (Dias et al., 1996, 1997; Wallis et al., 2001). A similar double dissociation of lesion effects has since been shown within the PFC of rats (Birrell and Brown, 2000; McAlonan and Brown, 2003) and mice (Bissonette et al., 2008).
In humans however, although functional imaging studies have consistently identified activity in the IFG associated with shifting attentional sets (Hampshire and Owen, 2006), activity associated with reversing stimulus-reward associations has been identified both in the lateral OFC and the IFG (Cools et al., 2002; Fellows and Farah, 2003; O'Doherty et al., 2003; Hornak et al., 2004; Budhani et al., 2007). Only when attentional set-shifting and reversal learning are directly compared within the same study has reversal learning been associated selectively with activations in the OFC and not IFG (Hampshire and Owen, 2006).
One possible explanation for why significant activation during reversal learning has been reported in the IFG in human imaging studies, whilst lesions of a similar region in marmosets are without effect, is a difference in experimental design. Whereas lesioned marmosets are naïve to changes in stimulus-reward contingencies, human neuroimaging investigations have usually employed paradigms where the reinforcement contingencies are reversed repeatedly to obtain sufficient numbers of reversal events for analysis. Since exposure to multiple reversals may well result in the development of an attentional learning set, or higher-order, rule based strategy (e.g. if not A then B) (Mackintosh and Little, 1969; Murray and Gaffan, 2006), the lateral IFG may be recruited, explaining the activation of this region in reversal learning in human neuroimaging studies. To address this issue, the present study directly compared the effects of excitotoxic lesions of the OFC and VLPFC on reversal learning in marmosets that, prior to the lesion, had already developed a reversal learning set. It was hypothesized that, in contrast to previous findings showing that VLPFC lesions had no effects on reversal performance in naïve animals, the same lesions would impair reversal learning in animals whose performance was dependent upon a learning set established prior to surgery. The contribution of the OFC in such circumstances remained to be determined.
Material and Methods
Subjects and housing
Ten experimentally naïve common marmosets (Callithrix jacchus), five females, five males, bred on site at the University of Cambridge Marmoset Breeding Colony were housed in pairs. All monkeys were fed 20 g of MP.E1 primate diet (Special Diet Services/SDS) and two pieces of carrot 5 days per week after the daily behavioral testing session, with simultaneous access to water for 2 h. At weekends, they received fruit, rusk, malt loaf, eggs, treats, and marmoset jelly (SDS), and had ad libitum access to water. Their cages contained a variety of environmental enrichment aids that were regularly varied, and all procedures were performed in accordance with the UK Animals (Scientific Procedures) Act 1986.
Apparatus
Behavioral testing took place in a specially designed, sound-attenuated box in a dark room. The animal was positioned in a clear, plastic transport box, one side of which was removed to reveal a colour computer monitor (Samsung). The marmoset reached through a series of vertical metal bars to touch stimuli presented on the monitor, and these responses were detected by an array of infrared beams (Intasolve, Interact 415) attached to the screen. Banana milkshake (Nestlé) which served as a reward was delivered to a centrally placed licker for 5 s. Presentation of reward was associated with a 2 kHz tone played through a loudspeaker located at the back of the testing box and was dependent on the marmoset licking the licker. The test chamber was lit with a 3 W bulb. The stimuli presented on the monitor were abstract, multicoloured visual patterns (32 mm wide × 50 mm high; 12 cm apart from the centre of the stimuli) that were displayed to the left and right of the central spout. The stimuli were presented using the Whisker control system (Cardinal, 2001) running Monkey Cantab [designed by Roberts and Robbins; version 3.6 (Cardinal, 2007)], which also controlled the apparatus and recorded responding.
Behavioral training
All monkeys were trained initially to enter a clear plastic transport box for marshmallow reward and familiarized with the testing apparatus. Monkeys then received the following sequence of training: familiarization with a milkshake reward, learning a tone–reward contingency, and responding on the touch-screen until they were reliably and accurately making 30 responses or more to a square stimulus presented to the left and right of the licker in 20 min [for full experimental details, see Roberts et al. (1988)]. After preliminary behavioral training, the marmosets were tested in the experimental procedures.
Serial reversal learning
As described previously (Clarke et al., 2004), this consisted of two-choice discriminations composed of abstract, colored patterns (see Fig. 1). For all discriminations, a pair of stimuli was presented to the left and right of the center of the screen. A response to the correct stimulus resulted in the incorrect stimulus disappearing from the screen, and the onset of a 5 s tone that signaled the availability of 5 s of reinforcement. Failure to collect the reward was scored as a missed reinforcement. After a response to the incorrect stimulus, both stimuli disappeared from the screen and the house light was extinguished for a 5 s timeout period. The intertrial interval was 3 s and, within a session, the stimuli were presented equally to the left and right sides of the screen. Each monkey was presented with 30 trials per day, 5 d per week. The reward contingency reversed after attaining a criterion of 6 correct responses in a row in the immediately preceding session. After each reversal, the previously correct stimulus became incorrect and the previously incorrect stimulus became correct. If a monkey showed a significant side bias (10 consecutive responses to one side), a rolling correction procedure was implemented whereby the correct stimulus was presented on the non-preferred side until the monkey had made a total of three correct responses.
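The two run-length rules described above (6 consecutive correct responses to trigger a reversal, 10 consecutive responses to one side to trigger the correction procedure) can be illustrated with a minimal Python sketch. The function names and the list-based session encoding are hypothetical and introduced only for illustration; they are not part of the Monkey Cantab software.

```python
def reached_criterion(outcomes, run_length=6):
    """Return True if a session contains `run_length` correct responses in a row.

    `outcomes` is a hypothetical per-trial encoding: True = correct, False = error.
    """
    streak = 0
    for correct in outcomes:
        # A correct response extends the streak; an error resets it.
        streak = streak + 1 if correct else 0
        if streak >= run_length:
            return True
    return False


def has_side_bias(sides, run_length=10):
    """Return True if `run_length` consecutive responses went to the same side.

    `sides` is a hypothetical per-trial encoding, e.g. "L" or "R".
    """
    streak = 0
    previous = None
    for side in sides:
        # Same side as the previous trial extends the run; otherwise start a new run.
        streak = streak + 1 if side == previous else 1
        previous = side
        if streak >= run_length:
            return True
    return False
```

In this sketch a 30-trial session would be passed as a list of 30 outcomes, and the contingency would be reversed in the following session whenever `reached_criterion` returns True.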
Experimental design and behavioral measures
The experiment was divided into 3 phases: (1) pre-surgery acquisition of stable reversal performance; (2) post-surgery maintenance of pre-surgery serial reversal performance; and (3) post-surgery acquisition of stable reversal performance with novel stimuli. The main measure of the monkeys' performance on the visual discriminations was the total number of errors made before achieving the criterion of 6 consecutive, correct trials within one session. To achieve the criterion of stable pre-surgery reversal performance the animals had to make fewer than 20 errors in completing a reversal, and the total number of errors for that reversal had to lie within the confidence intervals of the preceding 3 reversals. This ensured that all marmosets had developed a stable level of reversal performance prior to the lesion. Once this had been achieved the animals were separated into 3 groups and underwent an OFC lesion (n = 3), a VLPFC lesion (n = 3), or a sham operation (n = 4). After 10 days of recovery, the animals received a further series of reversals until they regained their pre-surgery reversal performance. They then received a novel discrimination using a new pair of visual stimuli and were subsequently tested on a series of reversals until they had regained pre-surgical levels of reversal performance. For both of these post-surgery series of reversals the criterion for passing each series was performing 2 consecutive reversals, in which the numbers of errors made were within the upper confidence interval of the last 3 pre-surgery reversals (used previously to establish the pre-surgery criterion).
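The pre-surgery stability criterion can be sketched in Python as follows. The text does not state the confidence level or how the interval was computed, so this sketch assumes a two-sided 95% confidence interval for the mean of the 3 preceding reversals based on the t distribution (critical value 4.303 for 2 degrees of freedom). The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed 95% critical value of Student's t for df = 2.
# ASSUMPTION: the confidence level and method are not given in the text.
T_CRIT_DF2 = 4.303


def confidence_interval(previous_errors):
    """Return (lower, upper) bounds of the assumed 95% CI for the mean."""
    centre = mean(previous_errors)
    half_width = T_CRIT_DF2 * stdev(previous_errors) / sqrt(len(previous_errors))
    return centre - half_width, centre + half_width


def is_stable(current_errors, previous_errors, max_errors=20):
    """Check the two conditions: fewer than `max_errors` errors, and an error
    count lying within the interval computed from the preceding 3 reversals."""
    if len(previous_errors) != 3:
        raise ValueError("the criterion is defined over exactly 3 preceding reversals")
    lower, upper = confidence_interval(previous_errors)
    return current_errors < max_errors and lower <= current_errors <= upper
```

For the post-surgery series, the same interval would be computed once from the last 3 pre-surgery reversals and only its upper bound would be applied, to 2 consecutive reversals.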
In addition, signal detection theory (Macmillan and Creelman, 1991) was used to establish subjects' ability to discriminate correct from incorrect stimuli independently of any side bias that might have been present. The discrimination measure d′ and the bias measure c were calculated and the normal cumulative distribution function (CDF) compared with the criterion values of a two-tailed Z test (each tail p = 0.05) to determine the classification of each 30 trial session as perseveration, chance or learning (including correction trials). Sessions in which CDF(d′) < 0.05 were classified as perseverative; sessions in which CDF(d′) > 0.95 were classified as learning, and sessions in which 0.05 ≤ CDF(d′) ≤ 0.95 were classified as chance (Clarke et al., 2004). Errors during perseverative sessions were considered perseverative errors. Days on which subjects attained criterion were not included in the analysis.
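The session classification can be illustrated with a short Python sketch. The thresholds (0.05 and 0.95) are those given above. How hits and false alarms were tallied, and how d′ was scaled before the CDF was applied, is not specified in the text, so `d_prime_and_bias` uses the standard yes/no signal detection formulas purely as an assumption, and all function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def d_prime_and_bias(hit_rate, false_alarm_rate):
    """Return (d', c) from the standard yes/no formulas.

    ASSUMPTION: d' = z(H) - z(F) and c = -0.5 * (z(H) + z(F)).
    Rates must lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


def classify_session(d_prime, lower=0.05, upper=0.95):
    """Classify one session from the normal CDF of its d' value."""
    p = _STD_NORMAL.cdf(d_prime)
    if p < lower:
        return "perseveration"
    if p > upper:
        return "learning"
    return "chance"
```

Under this scheme a session with d′ near zero is classed as chance, a strongly negative d′ (responding predominantly to the previously rewarded stimulus) as perseveration, and a strongly positive d′ as learning.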
Statistics
The behavioral results were subjected to ANOVA using SPSS version 16. ANOVA models are in the form A3 × (B2 × S) or A3 × (B2 × C2 × S), where A is a between-subject factor with three levels (OFC/VLPFC/control group), B is a within-subject factor of experimental stage with two levels (maintenance/novel) and C is a within-subject factor of feedback with two levels (win shift/lose shift); S represents subjects (Keppel, 1991). Where raw data exhibited heterogeneity of variance, they were transformed appropriately (see Howell, 1997). Data violating the normality assumptions of ANOVA were analysed using the non-parametric Kruskal-Wallis H test. Post hoc comparisons were made using Fisher's protected least significant difference (LSD) test. This test was chosen as it is the most powerful when the analysis of variance involves three levels of a factor (Howell, 1997). The post hoc analysis of within-group differences in the probability of shifting after an error and after a rewarded trial was made using paired t-tests.
Surgical procedure
Marmosets were premedicated with ketamine hydrochloride (0.1 ml of a 100 mg/ml solution, i.m.; Pfizer, Kent, UK) and a prophylactic analgesic (Norocarp; 0.03 ml of 50 mg/ml carprofen, s.c.; Pfizer), before being intubated and maintained on isoflurane gas anaesthetic (flow rate: 2.5% isoflurane in 0.2 l/min O2; Novartis Animal Health UK, Herts, UK). Animals were then placed in a stereotaxic frame (David Kopf Instruments, Tujunga, CA) with incisor and zygoma bars specially adapted for the marmoset. Anatomically defined lesions were achieved using stereotaxic injections of quinolinic acid (Sigma) in 0.01 M phosphate buffer at carefully defined coordinates (see Table 1), which were individually adjusted where necessary in situ to take into account individual differences in brain size, as described previously (Dias et al., 1996). Injections were made at a rate of 0.04 μl/20 s through a steel cannula attached to a 2 μl (OFC lesions) or 10 μl (VLPFC lesions) Hamilton syringe (Precision Sampling Co., Baton Rouge, LA). Sham operated controls (2 VLPFC and 2 OFC) underwent identical surgical procedures with the toxin omitted from the infusate. Dexamethasone phosphate (0.2 ml i.m.; Fauling Pharmaceuticals plc, Warwicks, UK) was administered on completion of surgery to prevent tissue inflammation. The analgesic Metacam (meloxicam, 0.1 ml of a 1.5 mg/ml oral suspension; Boehringer Ingelheim, Germany) was given every 24 hours for 3 days post-operatively for further pain relief. Animals were returned to their home cage and had ad libitum access to water and supplementary diet during a recovery period of at least 10 days.
Table 1.
AP (mm) | LM (mm) | Position of cannula from base of the skull (mm) | Angle of injection arm (°) | Volume injected (μl) |
---|---|---|---|---|
OFC lesion | ||||
+16.75 | ±2.5 | 0.7 | 0 | 0.5 |
+16.75 | ±4.8 | 0.7 | 0 | 0.5 |
+17.75 | ±2.0 | 0.7 | 0 | 0.5 |
+17.75 | ±4.8 | 0.7 | 0 | 0.5 |
+18.50 | ±2.0 | 0.7 | 0 | 0.6 |
+18.50 | ±4.0 | 0.7 | 0 | 0.5 |
VLPFC lesion | ||||
+16.00 | ±6.2 | 0.9 | 10 | 1.0 |
+16.75 | ±5.9 | 1.0 | 8 | 1.6 |
+16.75 | ±5.9 | 1.5 | 8 | 1.6 |
+17.50 | ±5.6 | 1.0 | 8 | 1.2 |
+18.25 | ±5.3 | 1.0 | 8 | 1.5 |
+19.00 | ±4.6 | 0.7 | 8 | 0.8 |
+20.00 | ±3.0 | 1.7 | 0 | 0.5 |
Post mortem lesion assessment
All monkeys were humanely killed with Euthatal (1 ml of a 200 mg/ml solution, pentobarbital sodium; Merial Animal Health; i.p.) before being perfused transcardially with 500 ml of 0.1 M PBS, followed by 500 ml of 4% paraformaldehyde fixative over 10 min. The entire brain was then removed and placed in further paraformaldehyde overnight before being transferred to a 30% sucrose solution for at least 48 h. For verification of lesions, coronal sections (60 μm) of the brain were cut using a freezing microtome and cell bodies stained using Cresyl Fast Violet. The sections were viewed under a Leitz DMRD microscope and lesioned areas were defined by the presence of major neuronal loss, often with marked gliosis (See Fig. 2). For each animal, areas with cell loss were schematized onto drawings of standard marmoset brain coronal sections, and composite diagrams were then made to illustrate the extent of overlap between lesions.
Results
Lesion Assessment
The schematic representations of the extent of the lesions seen in all monkeys in the VLPFC and OFC groups, along with photomicrographs of histological material, are shown in Fig. 2. Figs. 2A and B illustrate those regions of the brain that were consistently lesioned in three, two or one of the marmosets within each lesion group. In all cases, the intention was to create discrete lesions of the target structures that did not incur damage to either fibers of passage or extra-target tissue.
VLPFC
Lesions of the VLPFC extended from the caudal extent of the frontal pole to the rostral extent of the lateral ventricles. According to the cytoarchitectonic map of Burman and Rosa (2009), the region consistently lesioned in all three marmosets (illustrated by the black shading in Fig. 2A) was area 12l, apart from its very posterior sector. Neuronal cell loss throughout this region was almost complete, as exemplified in Figs. 2C and D. In contrast, adjacent areas, 12m and 12o, on the lateral orbital surface were only damaged variably in one or two of the marmosets (Fig. 2A, section AP 15.5). The posterior half of area 46 was also variably damaged, being damaged bilaterally in only one of the marmosets (Fig. 2A, section AP 17.5). There was no damage to adjacent OFC (see Figs. 2C and D).
OFC
Lesions of the OFC were focused on the antero-medial half of the orbital surface, extending from just caudal of the frontal pole to rostral of the anterior limit of the lateral ventricles (see Figs. 2B, E and F). Neuronal cell loss was very extensive with only a few neurons remaining (See Figs. 2E and F). In all three marmosets neuronal degeneration was restricted to areas 11m, 13b and the very anterior extent of 13m (Burman and Rosa, 2009), sparing the more posterior aspect of 13b in one marmoset (Fig. 2B, AP section 16.5). The majority of area 13, including 13a and most of 13m was only damaged in one of the three marmosets (Fig. 2B, AP section 14.5), the other two either having no damage or damage to the anterior half only (Fig. 2B, AP section 15.5).
Behavioral Assessment
Pre-operative serial discrimination learning
All animals significantly improved their initial reversal performance, taking between 10 and 64 reversals to reach a stable level of performance with each reversal being performed in 20 errors or under. It can be seen in Fig. 3A that the numbers of errors made to learn a reversal were far fewer in the final four reversals (criterion stage) compared to the first four reversals (initial stage). There was no difference, however, between groups. A two way ANOVA with factors of lesion (control, VLPFC, OFC) and pre-surgery stage (initial, criterion) revealed a significant effect of stage (F(1,7) = 23.92, p = 0.002), but not of lesion (F < 1, NS) or of the stage × lesion interaction (F < 1, NS). In addition, there was no difference between groups in the overall number of reversals taken to reach stable reversal performance (Fig. 3B): control (42.0 ± 11.2), VLPFC (40.7 ± 7.3) and OFC (31.3 ± 16.4; ANOVA F < 1, NS).
Post-operative performance
As demonstrated in Fig. 4A, at the maintenance stage OFC lesioned animals required more reversals to regain stable reversal performance than the control or VLPFC lesioned groups. There was no difference between VLPFC lesioned and control groups. At the novel discrimination stage, both groups of lesioned animals required more reversals to acquire stable reversal performance compared with the control group. There was no difference between VLPFC and OFC lesioned groups. A two way ANOVA with factors of lesion (control, VLPFC, OFC) and experimental stage (maintenance, novel) revealed significant effects of lesion (F(2,7) = 11.2, p = 0.007), and experimental stage (F(1,7)= 15.7, p = 0.005), and a significant stage × lesion interaction (F(2,7) = 4.9, p = 0.046). Post hoc analysis using the LSD test revealed significant (p ≤ 0.05) differences in the number of reversals required to attain criterion at the maintenance stage, between OFC lesioned and control, but not between VLPFC lesioned and control groups or VLPFC and OFC lesioned groups. At the novel discrimination stage there were significant differences between controls and both VLPFC and OFC lesioned groups (p ≤ 0.05 and p ≤ 0.01 respectively) but not between VLPFC and OFC lesioned groups.
The same pattern was observed in the numbers of errors at each stage (Fig. 4B). A two way ANOVA with factors of lesion (control, VLPFC, OFC) and experimental stage (maintenance, novel) on log-transformed data revealed significant effects of experimental stage (F(1,7) = 6.91, p = 0.034) and of lesion (F(2,7) = 12.02, p = 0.005), and a significant stage × lesion interaction (F(2,7) = 4.81, p = 0.048). Post hoc analysis using the LSD test revealed a significant difference in the number of errors to criterion at the maintenance stage between OFC lesioned and control groups (p ≤ 0.05), but not between VLPFC lesioned and control groups or VLPFC and OFC lesioned groups. At the novel discrimination stage there were significant differences between controls and both VLPFC and OFC lesioned groups (p ≤ 0.05 and p ≤ 0.001 respectively) but not between VLPFC and OFC lesioned groups.
To further explore the nature of the observed deficits, additional analyses were undertaken to determine whether any differences in the impaired serial reversal performance of VLPFC and OFC monkeys could be revealed. The responsivity of all three groups of monkeys to positive and negative feedback was investigated on all trials across all reversals of both maintenance and novel discrimination stages. Probabilities of shifting responding to the other stimulus on trial X were calculated according to whether the response on trial X-1 was rewarded (P[shift/win]) or not rewarded (P[shift/loss]), where P[shift/win] + P[stay/win] = 1 and P[shift/loss] + P[stay/loss] = 1 (Clarke et al., 2008). Figure 5 demonstrates the probability of shifting after an error or correct response in control, VLPFC and OFC lesioned animals. There was no overall difference between lesioned and control groups in their likelihood of shifting after an error but the OFC lesioned group were more likely to shift after a correct response. In addition, whereas controls were more likely to shift following an error than following a correct response, this was not the case for the VLPFC and OFC lesioned groups, which showed instead an equal likelihood of shifting after either feedback. Repeated measures ANOVA of the arcsine-transformed probability data [lesion3 × (feedback2 × experimental stage2)] revealed a significant effect of feedback (F(1,7) = 9.27, p = 0.019) and a significant lesion × feedback interaction (F(2,7) = 13.08, p = 0.004). To explore the lesion × feedback interaction at the level of the lesion, an analysis of the simple main effects of lesion for each type of feedback, collapsed across both post-surgery experimental stages, was undertaken. This showed lesion differences in the probability of shifting after a rewarded trial (shift given correct response; F(2,9) = 6.8, p = 0.023) but no difference in the probability of shifting after an error (shift given incorrect response; F < 1, NS).
Post hoc analysis using the LSD test revealed significant differences between OFC lesioned and control groups (p ≤ 0.01) in the probability of shifting after a correct response. In contrast, the VLPFC group did not differ from either controls or OFC lesioned animals. Further within-groups analysis, using paired t-tests to investigate the effect of feedback, revealed that only control animals showed a significantly higher probability of shifting after an error, compared to a correct response [t(3) = 6.75, p = 0.007]. No such difference in shifting after an error and correct response was seen in the VLPFC and OFC lesioned groups [t(2), NS].
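The feedback-sensitivity measures defined above, P[shift/win] and P[shift/loss], can be computed from a trial sequence as in the following Python sketch. The per-trial encoding and function names are hypothetical, and the sketch treats the supplied trials as one continuous sequence. The exact form of the arcsine transformation is not given in the text; arcsin(√p) is used here as an assumption.

```python
from math import asin, sqrt


def shift_probabilities(choices, rewarded):
    """Return (P[shift/win], P[shift/loss]) for one trial sequence.

    `choices`  : stimulus chosen on each trial (hypothetical labels, e.g. "A"/"B").
    `rewarded` : True if that trial was rewarded, False otherwise.
    A probability is None when no trial of that feedback type preceded another trial.
    """
    # For each feedback type on trial X-1, count [shifts on trial X, total transitions].
    counts = {True: [0, 0], False: [0, 0]}
    for prev_choice, prev_rewarded, choice in zip(choices, rewarded, choices[1:]):
        counts[prev_rewarded][1] += 1
        if choice != prev_choice:
            counts[prev_rewarded][0] += 1

    def ratio(shifts, total):
        return shifts / total if total else None

    return ratio(*counts[True]), ratio(*counts[False])


def arcsine_transform(p):
    """Variance-stabilising transform for a proportion (ASSUMED form: arcsin(sqrt(p)))."""
    return asin(sqrt(p))
```

Because staying and shifting are the only two options, P[stay/win] and P[stay/loss] follow as one minus the corresponding shift probability.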
Since we have shown previously that lesions of the OFC produced deficits in reversal learning that were primarily of a perseverative nature, i.e. due to repeated responding to the previously-rewarded stimulus, we also determined the type of errors (perseverative, chance and learning) that were made in the present study. This analysis revealed that only OFC lesioned monkeys displayed perseverative errors. A Kruskal-Wallis test was used to evaluate differences among the three experimental groups in their proportion of perseverative errors. This revealed a significant difference between the experimental groups (controls and VLPFC: 0; OFC: 7.76±1.41) at the novel discrimination stage, Chi-square: 8.67, df = 2, p = 0.013, but not at the maintenance stage (controls and VLPFC: 0; OFC: 5.42±5.42), Chi-square: 2.33, df = 2, NS.
Whilst there was variation in the lesion extent between animals, this did not correlate with behavioral performance in either OFC or VLPFC lesioned groups.
Discussion
The present study compared effects of excitotoxic lesions of the VLPFC and OFC on serial discrimination reversal performance in marmosets trained on reversal learning prior to lesioning. VLPFC lesioned animals exhibited no difficulty in maintaining pre-surgery levels of reversal performance when the pair of discriminative stimuli used in the discrimination task were the same, pre- and post-surgery. However, they did display impaired reversal performance, compared to controls, when a novel pair of visual stimuli were introduced. In contrast, monkeys with OFC lesions were slower to regain pre-surgery levels of reversal performance regardless of whether the discriminative stimuli were the same or different from those used pre-surgery. Closer examination of reversal performance revealed that only marmosets in the control group were more likely to shift their response away from a previously non-rewarded stimulus on the subsequent trial, compared to a previously rewarded stimulus. Moreover, compared to controls, the OFC lesioned group made significantly more responses away from a stimulus if it had been rewarded on the previous trial. In addition, only OFC lesioned animals displayed perseverative responding to the previously rewarded stimulus following a reversal, an effect that was only apparent however when novel stimuli were introduced. These findings demonstrate that, depending upon the level of prior task experience, distinct regions of the PFC are recruited during visual discrimination serial reversal performance.
Experience with multiple reversals is likely to lead to the development of learning sets or rules that animals can use to guide responding and optimize performance (Browning et al., 2007; Wilson and Gaffan, 2008). These may take the form of learning (i) a ‘win-stay, lose-shift strategy’ (Mackintosh and Little, 1969), (ii) to attend to task relevant features, i.e. form/shape of the stimuli rather than their spatial location (Sutherland and Mackintosh, 1971) and (iii) to lay down prospective memories (Murray and Gaffan, 2006). One or a combination of these processes may contribute to improved reversal performance. Their relative contribution may account for differences between species such as marmoset and macaque in the speed of developing learning sets and final level of performance. A likely explanation for the observed effects of VLPFC lesions in the present study is that the transfer of such rules from one reversal learning context to another i.e. introduction of novel visual stimuli, is dependent on the VLPFC. The efficient pattern of response selection of lateral PFC lesioned animals on the familiar, serial discrimination reversal, rules out any account of the impairment in terms of simply failing to inhibit a pre-potent response tendency, disruption of action-outcome learning or inability to hold information “on line”. In addition, it suggests that such rules, once implemented within a given context, are not dependent upon the VLPFC. Instead, such rules are instantiated in distinct circuits. For example, once the relevant features in the task have been identified by the VLPFC, i.e. an attentional set has been formed, the changing association of those specific features with reward, across reversals, may then be tracked within orbitofrontal circuitry (see below) that includes the caudate nucleus, a region also shown to contribute to reversal learning (Rogers et al., 2000; Clarke et al., 2008; Cools et al., 2009).
The proposed role of the VLPFC in the application of rules to new contexts is in accord with the results from other studies in our laboratory showing that VLPFC is involved in higher-order set shifting and strategy transfer. Thus, a VLPFC lesion impairs attentional set-shifting, in which marmosets have to shift from one perceptual dimension of the visual stimuli to another, i.e. shapes to lines, in order to solve a series of visual discrimination problems (Dias et al., 1996). In addition, VLPFC lesions also impair performance on a detour-reaching task when marmosets are required to transfer, to a transparent box, a detour reaching strategy previously acquired on an opaque box (Wallis et al., 2001). Consistent with the cytoarchitectonic comparability of the VLPFC in marmosets with the VLPFC in rhesus monkeys (Burman and Rosa, 2009), lesions of the VLPFC (Baxter et al., 2009) but not OFC (Baxter et al., 2007) or DLPFC (Baxter et al., 2008), impair performance of rhesus monkeys on a preoperatively learned strategy implementation task that requires them to alternate choices between two categories of visual stimuli in order to earn rewards at an optimal rate. Such lesions also disrupt conditional learning between stimuli and actions (Bussey et al., 2001), which has also been proposed to reflect a failure of strategy implementation (Rushworth et al., 2008). Thus, in all these examples the animal is required to represent and implement behavioral rules or strategies that use higher-order information, e.g. categorical, to guide behavior. It is this contribution of the VLPFC to serial reversal learning that likely explains the activation of this region during reversal learning in functional neuroimaging studies in humans (Cools et al., 2002; Budhani et al., 2007).
Moreover, it may also be relevant to our understanding of the failure of strategy implementation in patients with obsessive compulsive disorder (Chamberlain et al., 2006) since these patients, and their first degree relatives, show grey matter abnormalities within the IFG (Chamberlain et al., 2008).
In contrast to the selective role of the VLPFC in reversal learning, an intact OFC seems to be necessary for reversal performance regardless of the animal's prior experience. These findings extend previous results of OFC lesion-induced reversal impairments (Dias et al., 1997; Chudasama and Robbins, 2003; McAlonan and Brown, 2003; Izquierdo et al., 2004; Boulougouris et al., 2007; Schoenbaum et al., 2009) by showing that (i) the impairment is apparent despite extensive reversal experience and (ii) whilst the deficit may weaken with continued experience (Boulougouris et al., 2007), it can be reinstated by changing the visual discriminanda. It is noteworthy that the apparently exclusively perseverative nature of the deficit seen following OFC lesions in marmosets naïve to reversal learning (Dias et al., 1996, 1997; Clarke et al., 2008) was not seen in the present study when animals repeatedly reversed between stimuli on which they had been extensively trained prior to surgery. The most likely explanation for the lack of perseveration in the latter case is that, at the time of the reversal, both stimuli were equally likely to elicit a response, since both had an extensive history of being rewarded (pre- and post-surgery). Thus, we suggest that the overall reward strength that had accrued to each stimulus overshadowed any advantage gained by one of the stimuli from being rewarded more recently than the other. This was not the case, though, when novel stimuli were introduced. Since neither stimulus had a strong history of being rewarded, the most recently rewarded stimulus was more likely to induce a prepotent response bias following contingency reversal. Under these circumstances, perseveration was still apparent in the OFC-lesioned animals. It cannot be determined in the present study whether the perseveration was due to enhanced avoidance of the previously unrewarded stimulus or enhanced approach to the previously rewarded stimulus.
We would suggest, however, that such perseverative responding, when seen, is indicative of a loss of top-down control, resulting in the expression of prepotent withdrawal or approach biases.
A number of hypotheses have been proposed to explain the role of the OFC in reversal learning, including response inhibition (Fuster, 1997), rapid, flexible encoding of associative information (Rolls, 1996) and, more recently, the signaling of outcome expectancies (Schoenbaum et al., 2009). There is considerable evidence from both electrophysiological (Tremblay and Schultz, 1999; Schoenbaum et al., 2007) and lesion (Gallagher et al., 1999; Izquierdo et al., 2004) studies that the OFC is critical for encoding the incentive value of outcomes predicted by Pavlovian cues. However, recent findings that lesions restricted to orbital areas 11/13 fail to affect reversal learning (Kazama and Bachevalier, 2009), whilst impairing performance on the reinforcer devaluation test (in which changes in the incentive value of a predicted outcome guide responding), suggest that the coding of incentive value by the OFC is not critical for rapid reversal learning. More likely, OFC involvement in representing the contingent relationship between cues and their outcomes (Ostlund and Balleine, 2007) underlies rapid reversal learning, particularly given that the latter involves a change in the contingencies between the cue and its outcome, rather than in the value of the outcome per se. Ultimately, whether the OFC lesion-induced reversal deficit is the result of a loss of a single function or multiple functions remains to be determined (see Clarke and Roberts, in press, for a detailed consideration of this issue).
To conclude, the findings reported in this study demonstrate that the serial reversal-learning task makes simultaneous demands on distinct functions attributed, respectively, to the OFC and VLPFC. The rapid reversal of responses between stimuli, as a consequence of changes in the reward contingencies, is dependent upon the OFC, likely as a consequence of the OFC's contribution to contingency (Ostlund and Balleine, 2007) and working memory processes (Walton et al., 2010). Repeated reversals allow for the development of rules/strategies to guide behavior and optimize performance, the implementation of which, across contexts, is dependent upon the VLPFC.
Acknowledgements
This work was supported by a Wellcome Trust programme grant, Ref No 089589/Z/09/Z (to T.W.R., B.J. Everitt, A.C.R., and B.J. Sahakian), and conducted within the Cambridge University Behavioural and Clinical Neuroscience Institute. We thank Mercedes Arroyo for preparation of histological material, and Adrian Newman for graphical support.
References
- Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Orbital prefrontal cortex is required for object-in-place scene memory but not performance of a strategy implementation task. J Neurosci. 2007;27:11327–11333. doi: 10.1523/JNEUROSCI.3369-07.2007.
- Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Dorsolateral prefrontal lesions do not impair tests of scene learning and decision-making that require frontal-temporal interaction. Eur J Neurosci. 2008;28:491–499. doi: 10.1111/j.1460-9568.2008.06353.x.
- Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Ventrolateral prefrontal cortex is required for performance of a strategy implementation task but not reinforcer devaluation effects in rhesus monkeys. Eur J Neurosci. 2009;29:2049–2059. doi: 10.1111/j.1460-9568.2009.06740.x.
- Birrell JM, Brown VJ. Medial frontal cortex mediates perceptual attentional set shifting in the rat. J Neurosci. 2000;20:4320–4324. doi: 10.1523/JNEUROSCI.20-11-04320.2000.
- Bissonette GB, Martins GJ, Franz TM, Harper ES, Schoenbaum G, Powell EM. Double dissociation of the effects of medial and orbital prefrontal cortical lesions on attentional and affective shifts in mice. J Neurosci. 2008;28:11124–11130. doi: 10.1523/JNEUROSCI.2820-08.2008.
- Boulougouris V, Dalley JW, Robbins TW. Effects of orbitofrontal, infralimbic and prelimbic cortical lesions on serial spatial reversal learning in the rat. Behav Brain Res. 2007;179:219–228. doi: 10.1016/j.bbr.2007.02.005.
- Browning PG, Easton A, Gaffan D. Frontal-temporal disconnection abolishes object discrimination learning set in macaque monkeys. Cereb Cortex. 2007;17:859–864. doi: 10.1093/cercor/bhk039.
- Budhani S, Marsh AA, Pine DS, Blair RJ. Neural correlates of response reversal: considering acquisition. Neuroimage. 2007;34:1754–1765. doi: 10.1016/j.neuroimage.2006.08.060.
- Burman KJ, Rosa MG. Architectural subdivisions of medial and orbital frontal cortices in the marmoset monkey (Callithrix jacchus). J Comp Neurol. 2009;514:11–29. doi: 10.1002/cne.21976.
- Bussey TJ, Wise SP, Murray EA. The role of ventral and orbital prefrontal cortex in conditional visuomotor learning and strategy use in rhesus monkeys (Macaca mulatta). Behav Neurosci. 2001;115:971–982. doi: 10.1037//0735-7044.115.5.971.
- Cardinal RN. Whisker (version 2.11). Cambridge University Technical Services Ltd; Cambridge, UK: 2001.
- Cardinal RN. Monkey Cantab (version 3.6). Cambridge University Technical Services Ltd; Cambridge, UK: 2007.
- Chamberlain SR, Blackwell AD, Fineberg NA, Robbins TW, Sahakian BJ. Strategy implementation in obsessive-compulsive disorder and trichotillomania. Psychol Med. 2006;36:91–97. doi: 10.1017/S0033291705006124.
- Chamberlain SR, Menzies L, Hampshire A, Suckling J, Fineberg NA, del Campo N, Aitken M, Craig K, Owen AM, Bullmore ET, Robbins TW, Sahakian BJ. Orbitofrontal dysfunction in patients with obsessive-compulsive disorder and their unaffected relatives. Science. 2008;321:421–422. doi: 10.1126/science.1154433.
- Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J Neurosci. 2003;23:8771–8780. doi: 10.1523/JNEUROSCI.23-25-08771.2003.
- Clarke HF, Roberts AC. Reversal learning in fronto-striatal circuits: a functional, autonomic and neurochemical analysis. Attention and Performance XXII; in press.
- Clarke HF, Robbins TW, Roberts AC. Lesions of the medial striatum in monkeys produce perseverative impairments during reversal learning similar to those produced by lesions of the orbitofrontal cortex. J Neurosci. 2008;28:10972–10982. doi: 10.1523/JNEUROSCI.1521-08.2008.
- Clarke HF, Dalley JW, Crofts HS, Robbins TW, Roberts AC. Cognitive inflexibility after prefrontal serotonin depletion. Science. 2004;304:878–880. doi: 10.1126/science.1094987.
- Cools R, Clark L, Owen AM, Robbins TW. Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci. 2002;22:4563–4567. doi: 10.1523/JNEUROSCI.22-11-04563.2002.
- Cools R, Frank MJ, Gibbs SE, Miyakawa A, Jagust W, D'Esposito M. Striatal dopamine predicts outcome-specific reversal learning and its sensitivity to dopaminergic drug administration. J Neurosci. 2009;29:1538–1543. doi: 10.1523/JNEUROSCI.4467-08.2009.
- Dias R, Robbins TW, Roberts AC. Dissociation in prefrontal cortex of affective and attentional shifts. Nature. 1996;380:69–72. doi: 10.1038/380069a0.
- Dias R, Robbins TW, Roberts AC. Dissociable forms of inhibitory control within prefrontal cortex with an analog of the Wisconsin Card Sort Test: restriction to novel situations and independence from “on-line” processing. J Neurosci. 1997;17:9285–9297. doi: 10.1523/JNEUROSCI.17-23-09285.1997.
- Fellows LK, Farah MJ. Ventromedial frontal cortex mediates affective shifting in humans: evidence from a reversal learning paradigm. Brain. 2003;126:1830–1837. doi: 10.1093/brain/awg180.
- Fuster JM. The Prefrontal Cortex: Anatomy, Physiology, and Neuropsychology of the Frontal Lobe. 3rd Edition. Philadelphia: Lippincott-Raven; 1997.
- Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
- Hampshire A, Owen AM. Fractionating attentional control using event-related fMRI. Cereb Cortex. 2006;16:1679–1689. doi: 10.1093/cercor/bhj116.
- Hornak J, O'Doherty J, Bramham J, Rolls ET, Morris RG, Bullock PR, Polkey CE. Reward-related reversal learning after surgical excisions in orbito-frontal or dorsolateral prefrontal cortex in humans. J Cogn Neurosci. 2004;16:463–478. doi: 10.1162/089892904322926791.
- Howell DC. Statistical Methods for Psychology. Fourth Edition. Belmont, California: Wadsworth; 1997.
- Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci. 2004;24:7540–7548. doi: 10.1523/JNEUROSCI.1921-04.2004.
- Kazama A, Bachevalier J. Selective aspiration or neurotoxic lesions of orbital frontal areas 11 and 13 spared monkeys' performance on the object discrimination reversal task. J Neurosci. 2009;29:2794–2804. doi: 10.1523/JNEUROSCI.4655-08.2009.
- Mackintosh NJ, Little L. Selective attention and response strategies as factors in serial reversal learning. Canadian Journal of Psychology. 1969;23(5):335–346.
- Macmillan NA, Creelman CD. Detection Theory: a user's guide. Cambridge University Press; 1991.
- McAlonan K, Brown VJ. Orbital prefrontal cortex mediates reversal learning and not attentional set shifting in the rat. Behav Brain Res. 2003;146:97–103. doi: 10.1016/j.bbr.2003.09.019.
- Murray EA, Gaffan D. Prospective memory in the formation of learning sets by rhesus monkeys (Macaca mulatta). J Exp Psychol Anim Behav Process. 2006;32:87–90. doi: 10.1037/0097-7403.32.1.87.
- O'Doherty J, Critchley H, Deichmann R, Dolan RJ. Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J Neurosci. 2003;23:7931–7939. doi: 10.1523/JNEUROSCI.23-21-07931.2003.
- Ostlund SB, Balleine BW. The contribution of orbitofrontal cortex to action selection. Ann N Y Acad Sci. 2007;1121:174–192. doi: 10.1196/annals.1401.033.
- Roberts AC, Robbins TW, Everitt BJ. The effects of intradimensional and extradimensional shifts on visual discrimination learning in humans and non-human primates. Q J Exp Psychol B. 1988;40:321–341.
- Rogers RD, Andrews TC, Grasby PM, Brooks DJ, Robbins TW. Contrasting cortical and subcortical activations produced by attentional-set shifting and reversal learning in humans. J Cogn Neurosci. 2000;12:142–162. doi: 10.1162/089892900561931.
- Rolls ET. The orbitofrontal cortex. Philos Trans R Soc Lond B Biol Sci. 1996;351:1433–1443; discussion 1443–1444. doi: 10.1098/rstb.1996.0128.
- Rushworth MF, Croxson PL, Buckley MJ, Walton ME. Ventrolateral and Medial Frontal Contributions to Decision-Making and Action Selection. In: Bunge SA, Wallis JD, editors. Neuroscience of Rule-Guided Behavior. Oxford University Press; New York: 2008. pp. 129–157.
- Schoenbaum G, Saddoris MP, Stalnaker TA. Reconciling the roles of orbitofrontal cortex in reversal learning and the encoding of outcome expectancies. Ann N Y Acad Sci. 2007;1121:320–335. doi: 10.1196/annals.1401.001.
- Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci. 2009;10:885–892. doi: 10.1038/nrn2753.
- Sutherland NS, Mackintosh NJ. Mechanisms of animal discrimination learning. Academic Press; New York: 1971.
- Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525.
- Wallis JD, Dias R, Robbins TW, Roberts AC. Dissociable contributions of the orbitofrontal and lateral prefrontal cortex of the marmoset to performance on a detour reaching task. Eur J Neurosci. 2001;13:1797–1808. doi: 10.1046/j.0953-816x.2001.01546.x.
- Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027.
- Wilson CR, Gaffan D. Prefrontal-inferotemporal interaction is not always necessary for reversal learning. J Neurosci. 2008;28:5529–5538. doi: 10.1523/JNEUROSCI.0952-08.2008.