The Journal of Neuroscience. 2010 Oct 27;30(43):14552–14559. doi: 10.1523/JNEUROSCI.2631-10.2010

Differential Contributions of the Primate Ventrolateral Prefrontal and Orbitofrontal Cortex to Serial Reversal Learning

Rafal Rygula 1,2, Susannah C Walker 1,2, Hannah F Clarke 1,2, Trevor W Robbins 1,2, Angela C Roberts 2,3
PMCID: PMC3044865  EMSID: UKMS33089  PMID: 20980613

Abstract

The discrimination reversal paradigm is commonly used to measure a subject's ability to adapt their behavior according to changes in stimulus–reward contingencies. Human functional neuroimaging studies have demonstrated activations in the lateral orbitofrontal cortex (OFC) and the inferior frontal gyrus (IFG) in subjects performing this task. Excitotoxic lesions of analogous regions in marmosets have revealed, however, that although the OFC is indeed critical for reversal learning, ventrolateral prefrontal cortex (VLPFC) (analogous to IFG) is not, contributing instead to higher order processing, such as that required in attentional set-shifting and strategy transfer. One major difference between the marmoset and human studies has been the level of training subjects received in reversal learning, being far greater in the latter. Since exposure to repeated contingency changes, as occurs in serial reversal learning, is likely to trigger the development of higher order, rule-based strategies, we hypothesized a critical role of the marmoset VLPFC in performance of a serial reversal learning paradigm. After extensive training in reversal learning, marmosets received an excitotoxic lesion of the VLPFC, OFC, or a sham control procedure. In agreement with our prediction, postsurgery, VLPFC lesioned animals were impaired in performing a series of discrimination reversals, but only when novel visual stimuli were introduced. In contrast, OFC lesioned animals were impaired regardless of whether the visual stimuli were the same or different from those used during presurgery training. Together, these data demonstrate the heterogeneous but interrelated involvement of primate OFC and VLPFC in the performance of serial reversal learning.

Introduction

A hallmark of human cognition is its remarkable flexibility. Cognitive flexibility enables rapid behavioral adaptation to changing internal states and environmental circumstances. A biological substrate of this flexibility has been localized within the prefrontal cortex (PFC). Lesion studies in nonhuman primates have demonstrated that distinct regions of the PFC carry out independent, but complementary forms of cognitive flexibility processing. The capacity to reverse affective associations for specific stimuli (discrimination reversal learning) is mediated by regions within the orbitofrontal cortex (OFC) of marmosets (Dias et al., 1996). In contrast, regions within the ventrolateral prefrontal cortex [VLPFC; proposed to be cytoarchitectonically similar to the VLPFC of rhesus monkeys and the inferior frontal gyrus (IFG) of humans (Burman and Rosa, 2009)] are implicated in higher order processes such as attentional set shifting and strategy transfer (Dias et al., 1996; Wallis et al., 2001). Conversely, lesions of OFC do not affect set shifting or strategy transfer, and lesions of VLPFC do not affect reversal of stimulus–reward associations (Dias et al., 1996, 1997; Wallis et al., 2001). A similar double dissociation of lesion effects has since been shown within the PFC of rats (Birrell and Brown, 2000; McAlonan and Brown, 2003) and mice (Bissonette et al., 2008).

In humans, however, although functional imaging studies have consistently identified activity in the IFG associated with shifting attentional sets (Hampshire and Owen, 2006), activity associated with reversing stimulus–reward associations has been identified both in the lateral OFC and the IFG (Cools et al., 2002; Fellows and Farah, 2003; O'Doherty et al., 2003; Hornak et al., 2004; Budhani et al., 2007). Only when attentional set shifting and reversal learning are directly compared within the same study has reversal learning been associated selectively with activations in the OFC and not IFG (Hampshire and Owen, 2006).

One possible explanation for why significant activation during reversal learning has been reported in the IFG in human imaging studies, though lesions of a similar region in marmosets are without effect, is a difference in experimental design. Whereas lesioned marmosets are naive to changes in stimulus–reward contingencies, human neuroimaging investigations have usually used paradigms where the reinforcement contingencies are reversed repeatedly to obtain sufficient numbers of reversal events for analysis. Since exposure to multiple reversals may well result in the development of an attentional learning set, or higher-order, rule-based strategy (e.g., if not A then B) (Mackintosh and Little, 1969; Murray and Gaffan, 2006), this may act to recruit the IFG, explaining the activation of this region in reversal learning in human neuroimaging studies. To address this issue, the present study directly compared the effects of excitotoxic lesions of the OFC and VLPFC on reversal learning in marmosets that, before the lesion, had already developed a reversal learning set. It was hypothesized that, in contrast to previous findings showing that VLPFC lesions had no effects on reversal performance in naive animals, the same lesions would impair reversal learning in animals whose performance was dependent upon a learning set established before surgery. The contribution of the OFC in such circumstances remained to be determined.

Materials and Methods

Subjects and housing.

Ten experimentally naive common marmosets (Callithrix jacchus), five females, five males, bred on-site at the University of Cambridge Marmoset Breeding Colony, were housed in pairs. All monkeys were fed 20 g of MP.E1 primate diet (Special Diet Services/SDS) and two pieces of carrot 5 d per week after the daily behavioral testing session, with simultaneous access to water for 2 h. On weekends, they received fruit, rusk, malt loaf, eggs, treats, and marmoset jelly (SDS), and had ad libitum access to water. Their cages contained a variety of environmental enrichment aids that were regularly varied, and all procedures were performed in accordance with the UK Animals (Scientific Procedures) Act 1986.

Apparatus.

Behavioral testing took place in a specially designed, sound-attenuated box in a dark room. The animal was positioned in a clear, plastic transport box, one side of which was removed to reveal a color computer monitor (Samsung). The marmoset reached through a series of vertical metal bars to touch stimuli presented on the monitor, and these responses were detected by an array of infrared beams (Intasolve, Interact 415) attached to the screen. Banana milkshake (Nestlé), which served as a reward, was delivered to a centrally placed licker for 5 s. Presentation of reward was associated with a 2 kHz tone played through a loudspeaker located at the back of the testing box and was dependent on the marmoset licking the licker. The test chamber was lit with a 3 W bulb. The stimuli presented on the monitor were abstract, multicoloured visual patterns (32 mm wide × 50 mm high; 12 cm apart from the center of the stimuli) that were displayed to the left and right of the central spout. The stimuli were presented using the Whisker control system (Cardinal, 2001) running Monkey Cantab [designed by Roberts and Robbins; version 3.6 (Cardinal, 2007)], which also controlled the apparatus and recorded responding.

Behavioral training.

All monkeys were trained initially to enter a clear plastic transport box for marshmallow reward and were familiarized with the testing apparatus. Monkeys then received the following sequence of training: familiarization with a milkshake reward, learning a tone–reward contingency, and responding on the touch-screen until they were reliably and accurately making 30 responses or more to a square stimulus presented to the left and right of the licker in 20 min. For full experimental details, see Roberts et al. (1988). After preliminary behavioral training, the marmosets were tested in the experimental procedures.

Serial reversal learning.

As described previously (Clarke et al., 2004), this consisted of two-choice discriminations composed of abstract, colored patterns (Fig. 1). For all discriminations, a pair of stimuli was presented to the left and right of the center of the screen. A response to the correct stimulus resulted in the incorrect stimulus disappearing from the screen and the onset of a 5 s tone that signaled the availability of 5 s of reinforcement. Failure to collect the reward was scored as a missed reinforcement. After a response to the incorrect stimulus, both stimuli disappeared from the screen and the house light was extinguished for a 5 s timeout period. The intertrial interval was 3 s and, within a session, the stimuli were presented equally to the left and right sides of the screen. Each monkey was presented with 30 trials per day, 5 d per week. The reward contingency reversed after attaining a criterion of six correct responses in a row in the immediately preceding session. After each reversal, the previously correct stimulus became incorrect and the previously incorrect stimulus became correct. If a monkey showed a significant side bias (10 consecutive responses to one side), a rolling correction procedure was implemented whereby the correct stimulus was presented on the nonpreferred side until the monkey had made a total of three correct responses.
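
The criterion and side-bias rules described above can be sketched in Python as follows. This is a minimal illustration only, not the Monkey Cantab implementation: the function names and the session format (one list of booleans per daily session) are hypothetical, and counting errors up to the trial on which the run of six is completed is an assumption about how errors to criterion were tallied.

```python
from typing import Sequence

CRITERION_RUN = 6   # six consecutive correct responses within one session (from the text)
SIDE_BIAS_RUN = 10  # ten consecutive responses to one side (from the text)


def session_meets_criterion(outcomes: Sequence[bool], run_length: int = CRITERION_RUN) -> bool:
    """Return True if a session contains `run_length` correct responses in a row."""
    streak = 0
    for correct in outcomes:
        streak = streak + 1 if correct else 0
        if streak >= run_length:
            return True
    return False


def errors_to_criterion(sessions: Sequence[Sequence[bool]], run_length: int = CRITERION_RUN) -> int:
    """Count errors over successive sessions until the criterion run is completed.

    Assumption: the streak restarts at the beginning of every session, because
    the criterion must be met within a single session.
    """
    errors = 0
    for outcomes in sessions:
        streak = 0
        for correct in outcomes:
            if correct:
                streak += 1
                if streak >= run_length:
                    return errors
            else:
                streak = 0
                errors += 1
    raise ValueError("criterion was not reached in the supplied sessions")


def has_side_bias(sides: Sequence[str], run_length: int = SIDE_BIAS_RUN) -> bool:
    """Return True if `run_length` consecutive responses were made to the same side."""
    streak = 0
    previous = None
    for side in sides:
        streak = streak + 1 if side == previous else 1
        previous = side
        if streak >= run_length:
            return True
    return False
```

For example, `errors_to_criterion` applied to a first session with two errors and a second session with one error followed by six correct responses would return three errors for that reversal.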

Figure 1.

Experimental schedule. Stimulus exemplars used for the various stages of the serial reversal experiment. The rewarded and unrewarded stimuli on each discrimination are indicated by “+” and “−”, respectively.

Experimental design and behavioral measures.

The experiment was divided into three phases: presurgery acquisition of stable reversal performance, postsurgery maintenance of presurgery serial reversal performance, and postsurgery acquisition of stable reversal performance with novel stimuli. The main measure of the monkeys' performance on the visual discriminations was the total number of errors made before achieving the criterion of six consecutive, correct trials within one session. To achieve the criterion of stable presurgery reversal performance, the animals had to make <20 errors in completing a reversal, and the total number of errors for that reversal had to lie within the confidence intervals of the preceding three reversals. This ensured that all marmosets had developed a stable level of reversal performance before the lesion. Once this had been achieved, the animals were separated into three groups and underwent an OFC lesion (n = 3), a VLPFC lesion (n = 3), or a sham operation (n = 4). After 10 d of recovery, the animals received a further series of reversals until they regained their presurgery reversal performance. They then received a novel discrimination using a new pair of visual stimuli and were subsequently tested on a series of reversals until they had regained presurgical levels of reversal performance. For both of these postsurgery series of reversals, the criterion for passing each series was performing two consecutive reversals, in which the numbers of errors made were within the upper confidence interval of the last three presurgery reversals (used previously to establish the presurgery criterion).

In addition, signal detection theory (Macmillan and Creelman, 1991) was used to establish subjects' ability to discriminate correct from incorrect stimuli independently of any side bias that might have been present. The discrimination measure, d′, and the bias measure, c, were calculated and the normal cumulative distribution function (CDF) compared with the criterion values of a two-tailed Z test (each tail, p = 0.05) to determine the classification of each of the 30-trial sessions as perseveration, chance, or learning (including correction trials). Sessions in which CDF(d′) < 0.05 were classified as perseverative, sessions in which CDF(d′) > 0.95 were classified as learning, and sessions in which 0.05 ≤ CDF(d′) ≤ 0.95 were classified as chance (Clarke et al., 2004). Errors during perseverative sessions were considered perseverative errors. Days on which subjects attained criterion were not included in the analysis.
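
The session classification can be illustrated with a short Python sketch using the standard signal detection formulas, d′ = z(H) − z(F) and c = −[z(H) + z(F)]/2. Several details are assumptions rather than the authors' exact procedure: a "hit" is taken to be a response to one side when the correct stimulus is on that side and a "false alarm" a response to that side when the correct stimulus is on the other; a log-linear correction is applied so that rates of 0 or 1 do not give infinite z scores; and the standard normal CDF is applied directly to d′. All function names are hypothetical.

```python
from statistics import NormalDist
from typing import Tuple

_STD_NORMAL = NormalDist()  # mean 0, SD 1


def _corrected_rate(count: int, total: int) -> float:
    """Log-linear correction (an assumption) keeping the rate strictly inside (0, 1)."""
    return (count + 0.5) / (total + 1.0)


def dprime_and_bias(hits: int, signal_trials: int,
                    false_alarms: int, noise_trials: int) -> Tuple[float, float]:
    """Return (d', c) from hit and false-alarm counts."""
    z_hit = _STD_NORMAL.inv_cdf(_corrected_rate(hits, signal_trials))
    z_fa = _STD_NORMAL.inv_cdf(_corrected_rate(false_alarms, noise_trials))
    d_prime = z_hit - z_fa
    bias_c = -0.5 * (z_hit + z_fa)
    return d_prime, bias_c


def classify_session(d_prime: float, lower: float = 0.05, upper: float = 0.95) -> str:
    """Classify a session from CDF(d') using the thresholds given in the text."""
    p = _STD_NORMAL.cdf(d_prime)
    if p < lower:
        return "perseveration"
    if p > upper:
        return "learning"
    return "chance"
```

Under these assumptions, an animal that always responds to one side regardless of stimulus yields equal hit and false-alarm rates, so d′ is zero (chance) while c departs from zero, which is how the two measures separate discrimination from side bias.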

Statistics.

The behavioral results were subjected to ANOVA using SPSS version 16. ANOVA models are in the form A3 × (B2 × S) or A3 × (B2 × C2 × S), where A is a between-subject factor with three levels (OFC/VLPFC/control group), B is a within-subject factor of experimental stage with two levels (maintenance/novel), C is a within-subject factor of feedback with two levels (win shift/lose shift), and S represents subjects (Keppel, 1991). Where raw data exhibited heterogeneity of variance, they were transformed appropriately (Howell, 1997). The data violating the normality assumptions of ANOVA were analyzed using the nonparametric Kruskal–Wallis H test. The post hoc comparisons were made using Fisher's protected least significant difference (LSD) test. This test was chosen as it is the most powerful when the ANOVA involves three levels of a factor (Howell, 1997). The post hoc analysis of within-group differences in the probability of shifting after an error and after a rewarded trial was made using paired t tests.

Surgical procedure.

Marmosets were premedicated with ketamine hydrochloride (0.1 ml of a 100 mg/ml solution, i.m.; Pfizer) and a prophylactic analgesic (Norocarp; 0.03 ml of 50 mg/ml carprofen, s.c.; Pfizer), before being intubated and maintained on isoflurane gas anesthetic (flow rate, 2.5% isoflurane in 0.2 l/min O2; Novartis Animal Health). Animals were then placed in a stereotaxic frame (David Kopf Instruments) with incisor and zygoma bars specially adapted for the marmoset. Anatomically defined lesions were achieved using stereotaxic injections of quinolinic acid (Sigma) in 0.01 M phosphate buffer at carefully defined coordinates (Table 1), which were individually adjusted where necessary in situ to take into account individual differences in brain size, as described previously (Dias et al., 1996). Injections were made at a rate of 0.04 μl/20 s through a steel cannula attached to a 2 μl (OFC lesions) or 10 μl (VLPFC lesions) Hamilton syringe (Precision Sampling). Sham-operated controls (two VLPFC and two OFC) underwent identical surgical procedures with the toxin omitted from the infusate. Dexamethasone phosphate (0.2 ml i.m.; Fauling Pharmaceuticals) was administered on completion of surgery to prevent tissue inflammation. The analgesic Metacam (meloxicam, 0.1 ml of a 1.5 mg/ml oral suspension; Boehringer Ingelheim) was given every 24 h for 3 d postoperatively for further pain relief. Animals were returned to their home cage and had ad libitum access to water and supplementary diet during a recovery period of at least 10 d.

Table 1.

OFC and VLPFC lesion parameters, including the stereotaxic coordinates of each injection (based on interaural plane) and the injection volume

AP (mm)  LM (mm)  Position of cannula from base of the skull (mm)  Angle of injection arm (°)  Volume injected (μl)
OFC lesion
+16.75   ±2.5     0.7                                              0                           0.5
+16.75   ±4.8     0.7                                              0                           0.5
+17.75   ±2.0     0.7                                              0                           0.5
+17.75   ±4.8     0.7                                              0                           0.5
+18.50   ±2.0     0.7                                              0                           0.6
+18.50   ±4.0     0.7                                              0                           0.5
VLPFC lesion
+16.00   ±6.2     0.9                                              10                          1.0
+16.75   ±5.9     1.0                                              8                           1.6
+16.75   ±5.9     1.5                                              8                           1.6
+17.50   ±5.6     1.0                                              8                           1.2
+18.25   ±5.3     1.0                                              8                           1.5
+19.00   ±4.6     0.7                                              8                           0.8
+20.00   ±3.0     1.7                                              0                           0.5

AP, Anterior-posterior; LM, lateral-medial.

Postmortem lesion assessment.

All monkeys were humanely killed with Euthatal (1 ml of a 200 mg/ml solution, pentobarbital sodium, i.p.; Merial Animal Health) before being perfused transcardially with 500 ml of 0.1 M PBS, followed by 500 ml of 4% paraformaldehyde fixative over 10 min. The entire brain was then removed and placed in further paraformaldehyde overnight before being transferred to a 30% sucrose solution for at least 48 h. For verification of lesions, coronal sections (60 μm) of the brain were cut using a freezing microtome and cell bodies stained using Cresyl Fast Violet. The sections were viewed under a Leitz DMRD microscope and lesioned areas were defined by the presence of major neuronal loss, often with marked gliosis. For each animal, areas with cell loss were schematized onto drawings of standard marmoset brain coronal sections, and composite diagrams were then made to illustrate the extent of overlap between lesions.

Results

Lesion assessment

The schematic representations of the extent of the lesions seen in all monkeys in the VLPFC and OFC groups, along with photomicrographs of histological material, are shown in Figure 2. Figure 2, A and B, illustrates those regions of the brain that were consistently lesioned in three, two, or one of the marmosets within each lesion group. In all cases, the intention was to create discrete lesions of the target structures that did not incur damage to either fibers of passage or extra-target tissue.

Figure 2.

A, B, Schematic coronal sections taken through the frontal lobe of the marmoset monkey indicating the extent of VLPFC and OFC excitotoxic lesions. The decreasing shades of gray indicate regions that were lesioned in all three monkeys (black), any two monkeys (dark gray), and one monkey (light gray). The coronal sections on the right illustrate cytoarchitectonic regions identified within the prefrontal cortex. Figure adapted from Burman and Rosa (2009). C–F, Low- and high-power photomicrographs of a coronal section midway within the prefrontal cortex of a representative VLPFC and OFC lesion. The dashed lines indicate the boundaries of the lesions and the asterisks in the low-power photomicrographs of C and E represent the same location as in the high-power photomicrographs of D and F, respectively. Note the wide expanse of intact OFC in C (VLPFC lesion) compared with almost complete absence of neural tissue in the lesioned OFC in E. Similarly, note the large expanse of VLPFC in E (OFC lesion) compared with the lesioned VLPFC in C. Comparison of cellular architecture on either side of the boundary line in the high-power photomicrograph of D reveals how the lesioned VLPFC region on the left contains primarily glia with very few remaining neurons, in contrast to the typical five to six layered pattern of neurons in the neighboring intact OFC. The converse pattern is seen when comparing cellular architecture on either side of the boundary line in F. The VLPFC on the left is composed of layers of neurons whereas the lesioned OFC on the right is considerably shrunken and composed primarily of glia.

VLPFC

Lesions of the VLPFC extended from the caudal extent of the frontal pole to the rostral extent of the lateral ventricles. According to the cytoarchitectonic map of Burman and Rosa (2009), the region consistently lesioned in all three marmosets was area 12l (Fig. 2A, black shading), apart from its very posterior sector. Neuronal cell loss throughout this region was almost complete, as exemplified in Figure 2, C and D. In contrast, adjacent areas, 12m and 12o, on the lateral orbital surface were only damaged variably in one or two of the marmosets (section AP 15.5) (Fig. 2A). The posterior half of area 46 was also variably damaged, being damaged bilaterally in only one of the marmosets (section 17.5) (Fig. 2A). There was no damage to adjacent OFC (Fig. 2C,D).

OFC

Lesions of the OFC were focused on the anteromedial half of the orbital surface, extending from just caudal of the frontal pole to rostral of the anterior limit of the lateral ventricles (Fig. 2B,E,F). Neuronal cell loss was very extensive, with only a few neurons remaining (Fig. 2E,F). In all three marmosets, neuronal degeneration was restricted to areas 11m, 13b, and the very anterior extent of 13m (Burman and Rosa, 2009), sparing the more posterior aspect of 13b in one marmoset (Fig. 2B, AP section 16.5). The majority of area 13, including 13a and most of 13m, was only damaged in one of the three marmosets (Fig. 2B, AP section 14.5), the other two either having no damage or damage to the anterior half only (Fig. 2B, AP section 15.5).

Behavioral assessment

Preoperative serial discrimination learning

All animals significantly improved their initial reversal performance, taking between 10 and 64 reversals to reach a stable level of performance, with each reversal being performed in 20 errors or under. As can be seen in Figure 3A, the numbers of errors made before learning a reversal were far fewer in the final four reversals (criterion stage) compared with the first four reversals (initial stage). There was no difference between groups. A two-way ANOVA with factors of lesion (control, VLPFC, OFC) and presurgery stage (initial, criterion) revealed significant effects of stage (F(1,7) = 23.92, p = 0.002) but not lesion [F < 1, not significant (NS)], or stage × lesion interaction (F < 1, NS). In addition, there was no difference in the overall number of reversals taken to reach stable reversal performance (Fig. 3B) between groups (control, 42.0 ± 11.2; VLPFC, 40.7 ± 7.3; OFC, 31.3 ± 16.4; ANOVA, F < 1, NS).

Figure 3.

Development of the learning set presurgery. A, Initial performance versus criterion. The bars represent the mean number of errors made by each experimental group of animals during four initial and four final presurgery criterion reversals. Scattered symbols represent the performance of individual animals within groups. SED, SE of the difference of the means. B, Learning curves. Lines represent learning curves of the individual animals in each experimental group. For clarity, the mean number of errors across blocks of 10 reversals is presented. It should be noted that for each animal, the final block represents <10 reversals, dependent upon the number of reversals an individual marmoset performed before reaching criterion. Numbers in parenthesis represent the total number of reversals for each marmoset. The single point in block 1 represents an animal in the control group that reached criterion after the first 10 reversals.

Postoperative performance

As demonstrated in Figure 4A, at the maintenance stage, OFC-lesioned animals required more reversals to regain stable reversal performance than the control or VLPFC-lesioned groups. There was no difference between VLPFC-lesioned and control groups. At the novel discrimination stage, both groups of lesioned animals required more reversals to acquire stable reversal performance compared with the control group. There was no difference between VLPFC- and OFC-lesioned groups. A two-way ANOVA with factors of lesion (control, VLPFC, OFC) and experimental stage (maintenance, novel) revealed significant effects of lesion (F(2,7) = 11.2, p = 0.007) and experimental stage (F(1,7) = 15.7, p = 0.005), and a significant stage × lesion interaction (F(2,7) = 4.9, p = 0.046). Post hoc analysis using the LSD test revealed significant (p ≤ 0.05) differences in the number of reversals required to attain criterion at the maintenance stage between OFC-lesioned and control groups, but not between VLPFC-lesioned and control groups or VLPFC- and OFC-lesioned groups. At the novel discrimination stage, there were significant differences between controls and both VLPFC- and OFC-lesioned groups (p ≤ 0.05 and p ≤ 0.01, respectively) but not between VLPFC- and OFC-lesioned groups.

Figure 4.

Serial discrimination reversal performance postsurgery. A, The bars represent total numbers of reversals (mean for each group) performed by animals before reaching stable reversal performance during both postsurgery stages (maintenance and novel discrimination). B, The bars represent total numbers of errors (mean for each group, log transformed) during both postsurgery stages (maintenance and novel). Scattered symbols represent the performance of individual animals within groups. *p < 0.05, **p < 0.01, ***p < 0.001 compared with control group (LSD post hoc test). SED, SE of the difference of the means.

The same pattern was observed in the numbers of errors at each stage (Fig. 4B). A two-way ANOVA with factors of lesion (control, VLPFC, OFC) and experimental stage (maintenance, novel) on log-transformed data revealed significant effects of experimental stage (F(1,7) = 6.91, p = 0.034) and of lesion (F(2,7) = 12.02, p = 0.005), and a significant stage × lesion interaction (F(2,7) = 4.81, p = 0.048). Post hoc analysis using the LSD test revealed a significant difference in the number of errors to criterion at the maintenance stage between OFC-lesioned and control groups (p ≤ 0.05), but not between VLPFC-lesioned and control groups or VLPFC- and OFC-lesioned groups. At the novel discrimination stage, there were significant differences between controls and both VLPFC- and OFC-lesioned groups (p ≤ 0.05 and p ≤ 0.001, respectively), but not between VLPFC- and OFC-lesioned groups.

To further explore the nature of the observed deficits, additional analyses were undertaken to determine whether any differences in the impaired serial reversal performance of VLPFC- and OFC-lesioned monkeys could be revealed. The responsivity of all three groups of monkeys to positive and negative feedback was investigated on all trials across all reversals of both maintenance and novel discrimination stages. Probabilities (P) of shifting responding to the other stimulus on trial X were calculated according to whether the response on trial X-1 was rewarded (P[shift/win]) or not rewarded (P[shift/loss]), where P[shift/win] + P[stay/win] = 1 and P[shift/loss] + P[stay/loss] = 1 (Clarke et al., 2008). Figure 5 demonstrates the probability of shifting after an error or correct response in control and VLPFC- and OFC-lesioned animals. There was no overall difference between lesioned and control groups in their likelihood of shifting after an error, but the OFC-lesioned animals were more likely to shift after a correct response. In addition, whereas controls were more likely to shift following an error than following a correct response, this was not the case for the VLPFC- and OFC-lesioned groups, which showed instead an equal likelihood of shifting after either feedback. Repeated-measures ANOVA of the arcsine-transformed probability data [lesion3 × (feedback2 × experimental stage2)] revealed a significant effect of feedback (F(1,7) = 9.27, p = 0.019) and a significant lesion × feedback interaction (F(2,7) = 13.08, p = 0.004). To explore the lesion × feedback interaction at the level of the lesion, an analysis of the simple main effects of lesion for each type of feedback, collapsed across both postsurgery experimental stages, was undertaken.
This showed lesion differences in the probability of shifting after a rewarded trial (shift given correct response, F(2,9) = 6.8, p = 0.023) but no difference in the probability of shifting after an error (shift given incorrect response, F < 1, NS). Post hoc analysis using the LSD test revealed significant differences between OFC-lesioned and control groups (p ≤ 0.01) in the probability of shifting after a correct response. In contrast, the VLPFC group did not differ from either controls or OFC-lesioned animals. Further within-groups analysis, using paired t tests to investigate the effect of feedback, revealed that only control animals showed a significantly higher probability of shifting after an error compared with a correct response (t(3) = 6.75, p = 0.007). No such difference in shifting after an error and correct response was seen in the VLPFC- and OFC-lesioned groups (t(2) = NS).
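
The feedback-sensitivity measures P[shift/win] and P[shift/loss] can be computed from a trial sequence as in the following Python sketch. It is illustrative only: the function names and input format are hypothetical, trials are treated as one continuous sequence, and the arcsine transform is assumed to be the common arcsin(√p) form, which the text does not specify.

```python
import math
from typing import Hashable, Optional, Sequence, Tuple


def shift_probabilities(
    choices: Sequence[Hashable], rewarded: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (P[shift/win], P[shift/loss]) for one sequence of trials.

    choices[i] is the stimulus chosen on trial i and rewarded[i] says whether
    that choice was rewarded. A shift on trial X means the stimulus chosen
    differs from the one chosen on trial X-1. A probability is None when no
    trial followed that kind of feedback.
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")
    shifts = {True: 0, False: 0}  # keyed by whether the previous trial was rewarded
    totals = {True: 0, False: 0}
    for i in range(1, len(choices)):
        prev_rewarded = bool(rewarded[i - 1])
        totals[prev_rewarded] += 1
        if choices[i] != choices[i - 1]:
            shifts[prev_rewarded] += 1
    p_shift_win = shifts[True] / totals[True] if totals[True] else None
    p_shift_loss = shifts[False] / totals[False] if totals[False] else None
    return p_shift_win, p_shift_loss


def arcsine_transform(p: float) -> float:
    """Variance-stabilizing transform arcsin(sqrt(p)); the exact variant used is an assumption."""
    return math.asin(math.sqrt(p))
```

An animal following a pure win-stay, lose-shift strategy would give P[shift/win] = 0 and P[shift/loss] = 1; the corresponding stay probabilities follow as 1 minus each value.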

Figure 5.

The mean probabilities of monkeys shifting their responding to the other stimulus after making an incorrect response (and therefore not receiving reward) or a correct response and receiving reward. Scattered symbols represent the performance of individual animals within groups, **p < 0.01 compared with control group (LSD post hoc test) or **p < 0.01 comparison of shifting after correct and incorrect responses (paired t test). SED, SE of the difference of the means.

Because we have shown previously (Dias et al., 1996; Clarke et al., 2008) that lesions of the OFC produced deficits in reversal learning that were primarily of a perseverative nature, i.e., due to repeated responding to the previously rewarded stimulus, we also determined the type of errors (perseverative, chance, and learning) that were made in the present study. This analysis revealed that only OFC-lesioned monkeys displayed perseverative errors. A Kruskal–Wallis test was used to evaluate differences among the three experimental groups in their proportion of perseverative errors. This revealed a significant difference between the experimental groups (controls and VLPFC, 0; OFC, 7.76 ± 1.41) at the novel discrimination stage (χ2 = 8.67; df = 2, p = 0.013), but not at the maintenance stage (controls and VLPFC: 0; OFC: 5.42 ± 5.42; χ2 = 2.33, df = 2, NS).

Although there was variation in the lesion extent between animals, this did not correlate with behavioral performance in either the OFC- or VLPFC-lesioned groups.

Discussion

The present study compared effects of excitotoxic lesions of the VLPFC and OFC on serial discrimination reversal performance in marmosets trained on reversal learning before lesioning. VLPFC-lesioned animals exhibited no difficulty in maintaining presurgery levels of reversal performance when the pair of discriminative stimuli used in the discrimination task were the same presurgery and postsurgery. However, they did display impaired reversal performance, compared with controls, when a novel pair of visual stimuli were introduced. In contrast, monkeys with OFC lesions were slower to regain presurgery levels of reversal performance regardless of whether the discriminative stimuli were the same or different from those used presurgery. Closer examination of reversal performance revealed that only marmosets in the control group were likely to shift their response away from a previously nonrewarded stimulus on the subsequent trial compared with a previously rewarded stimulus. Moreover, compared with controls, the OFC-lesioned group made significantly more responses away from a stimulus if it had been rewarded on the previous trial. In addition, only OFC-lesioned animals displayed perseverative responding to the previously rewarded stimulus following a reversal, an effect that was only apparent when novel stimuli were introduced. These findings demonstrate that, depending upon the level of prior task experience, distinct regions of the PFC are recruited during visual discrimination serial reversal performance.

Experience with multiple reversals is likely to lead to the development of learning sets or rules that animals can use to guide responding and optimize performance (Browning et al., 2007; Wilson and Gaffan, 2008). These may take the form of learning a win–stay, lose–shift strategy (Mackintosh and Little, 1969); attending to task relevant features, i.e., form/shape of the stimuli rather than their spatial location (Sutherland and Mackintosh, 1971); and laying down prospective memories (Murray and Gaffan, 2006). One or a combination of these processes may contribute to improved reversal performance. Their relative contribution may account for differences between species (such as marmoset and macaque) in the speed of developing learning sets and final level of performance. A likely explanation for the observed effects of VLPFC lesions in the present study is that the transfer of such rules from one reversal learning context to another, i.e., introduction of novel visual stimuli, is dependent on the VLPFC. The efficient pattern of response selection of lateral PFC-lesioned animals on the familiar, serial discrimination reversal rules out any account of the impairment in terms of simply failing to inhibit a prepotent response tendency, disruption of action–outcome learning, or inability to retain information. In addition, it suggests that such rules, once implemented within a given context, are not dependent upon the VLPFC. Instead, such rules are instantiated in distinct circuits. For example, once the relevant features of the task have been identified by the VLPFC, i.e., an attentional set has been formed, the changing association of those specific features with reward, across reversals, may then be tracked within orbitofrontal circuitry (see below), which includes the caudate nucleus, a region also shown to contribute to reversal learning (Rogers et al., 2000; Clarke et al., 2008; Cools et al., 2009).

The proposed role of the VLPFC in the application of rules to new contexts is in accord with the results from other studies in our laboratory showing that VLPFC is involved in higher-order set shifting and strategy transfer. Thus, a VLPFC lesion impairs attentional set shifting, in which marmosets have to shift from one perceptual dimension of the visual stimuli to another, i.e., shapes to lines, to solve a series of visual discrimination problems (Dias et al., 1996). In addition, VLPFC lesions impair performance on a detour-reaching task when marmosets are required to transfer a detour-reaching strategy previously acquired on an opaque box to a transparent box (Wallis et al., 2001). Consistent with the cytoarchitectonic comparability of the VLPFC in marmosets with the VLPFC in rhesus monkeys (Burman and Rosa, 2009), lesions of the VLPFC (Baxter et al., 2009), but not OFC (Baxter et al., 2007) or DLPFC (Baxter et al., 2008), impair performance of rhesus monkeys on a preoperatively learned strategy implementation task that requires them to alternate choices between two categories of visual stimuli to earn rewards at an optimal rate. Such lesions also disrupt conditional learning between stimuli and actions (Bussey et al., 2001), which has also been proposed to reflect a failure of strategy implementation (Rushworth et al., 2008). Thus, in all these examples, the animal is required to represent and implement behavioral rules or strategies that use higher-order information, e.g., categorical, to guide behavior. It is this contribution of the VLPFC to serial reversal learning that likely explains the activation of this region during reversal learning in functional neuroimaging studies in humans (Cools et al., 2002; Budhani et al., 2007). Moreover, it may also be relevant to our understanding of the failure of strategy implementation in patients with obsessive-compulsive disorder (Chamberlain et al., 2006), since these patients, and their first-degree relatives, show gray matter abnormalities within the IFG (Chamberlain et al., 2008).

In contrast to the selective role of the VLPFC in reversal learning, an intact OFC seems to be necessary for reversal performance regardless of the animal's prior experience. These findings extend previous results of OFC lesion-induced reversal impairments (Dias et al., 1997; Chudasama and Robbins, 2003; McAlonan and Brown, 2003; Izquierdo et al., 2004; Boulougouris et al., 2007; Schoenbaum et al., 2009) by showing that the impairment is apparent despite extensive reversal experience and that, although the deficit may weaken with continued experience (Boulougouris et al., 2007), it can be reinstated by changing the visual discriminanda. It is noteworthy that the apparently exclusively perseverative nature of the deficit seen following OFC lesions in marmosets naive to reversal learning (Dias et al., 1996, 1997; Clarke et al., 2008) was not seen in the present study when repeatedly reversing between stimuli on which the animals had been extensively trained before surgery. The most likely explanation for the lack of perseveration in the latter is that, at the time of the reversal, both stimuli were equally likely to elicit a response, since both stimuli had received an extensive history of being rewarded (presurgery and postsurgery). Thus, we suggest that the overall reward strength that had accrued to each stimulus overshadowed any advantage gained by one of the stimuli from being rewarded more recently than the other. This was not the case, though, when novel stimuli were introduced. Since neither stimulus had a strong history of being rewarded, the most recently rewarded stimulus was more likely to induce a prepotent response bias following contingency reversal. Under these circumstances, perseveration was still apparent in the OFC-lesioned animals. It cannot be determined in the present study whether the perseveration was due to enhanced avoidance of the previously unrewarded stimulus or enhanced approach to the previously rewarded stimulus. We would suggest, however, that such perseverative responding, when seen, is indicative of a loss of top-down control, resulting in the expression of prepotent withdrawal or approach biases (Clarke and Roberts, 2010).

A number of hypotheses have been proposed to explain the role of the OFC in reversal learning, including response inhibition (Fuster, 1997); rapid, flexible encoding of associative information (Rolls, 1996); and, more recently, the signaling of outcome expectancies (Schoenbaum et al., 2009). There is considerable evidence from both electrophysiological (Tremblay and Schultz, 1999; Schoenbaum et al., 2007) and lesion (Gallagher et al., 1999; Izquierdo et al., 2004) studies that the OFC is critical for encoding the incentive value of outcomes predicted by Pavlovian cues. However, recent findings that lesions restricted to orbital areas 11/13 fail to affect reversal learning (Kazama and Bachevalier, 2009) while impairing performance on a reinforcer devaluation test (in which changes in the incentive value of a predicted outcome guide responding) suggest that the coding of incentive value by the OFC is not critical for rapid reversal learning. More likely, OFC involvement in representing the contingent relationship between cues and their outcomes (Ostlund and Balleine, 2007) underlies rapid reversal learning, particularly given that the latter involves a change in the contingencies between the cue and its outcome, rather than in the value of the outcome per se. Ultimately, whether the OFC lesion-induced reversal deficit is the result of a loss of a single function or multiple functions remains to be determined (for detailed consideration of this issue, see Clarke and Roberts, 2010).

To conclude, the findings reported in this study demonstrate that the serial reversal-learning task makes simultaneous demands on distinct functions attributed, respectively, to the OFC and VLPFC. The rapid reversal of responses between stimuli, as a consequence of changes in the reward contingencies, is dependent upon the OFC, likely as a consequence of the OFC's contribution to contingency (Ostlund and Balleine, 2007) and working memory processes (Walton et al., 2010). Repeated reversals allow for the development of rules/strategies to guide behavior and optimize performance, the implementation of which, across contexts, is dependent upon the VLPFC.

Footnotes

This work was supported by Wellcome Trust Programme Grant 089589/Z/09/Z (to T.W.R., B.J. Everitt, A.C.R., and B.J. Sahakian) and conducted within the Cambridge University Behavioural and Clinical Neuroscience Institute. We thank Mercedes Arroyo for preparation of histological material, and Adrian Newman for graphical support.

References

  1. Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Orbital prefrontal cortex is required for object-in-place scene memory but not performance of a strategy implementation task. J Neurosci. 2007;27:11327–11333. doi: 10.1523/JNEUROSCI.3369-07.2007.
  2. Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Dorsolateral prefrontal lesions do not impair tests of scene learning and decision-making that require frontal-temporal interaction. Eur J Neurosci. 2008;28:491–499. doi: 10.1111/j.1460-9568.2008.06353.x.
  3. Baxter MG, Gaffan D, Kyriazis DA, Mitchell AS. Ventrolateral prefrontal cortex is required for performance of a strategy implementation task but not reinforcer devaluation effects in rhesus monkeys. Eur J Neurosci. 2009;29:2049–2059. doi: 10.1111/j.1460-9568.2009.06740.x.
  4. Birrell JM, Brown VJ. Medial frontal cortex mediates perceptual attentional set shifting in the rat. J Neurosci. 2000;20:4320–4324. doi: 10.1523/JNEUROSCI.20-11-04320.2000.
  5. Bissonette GB, Martins GJ, Franz TM, Harper ES, Schoenbaum G, Powell EM. Double dissociation of the effects of medial and orbital prefrontal cortical lesions on attentional and affective shifts in mice. J Neurosci. 2008;28:11124–11130. doi: 10.1523/JNEUROSCI.2820-08.2008.
  6. Boulougouris V, Dalley JW, Robbins TW. Effects of orbitofrontal, infralimbic and prelimbic cortical lesions on serial spatial reversal learning in the rat. Behav Brain Res. 2007;179:219–228. doi: 10.1016/j.bbr.2007.02.005.
  7. Browning PG, Easton A, Gaffan D. Frontal-temporal disconnection abolishes object discrimination learning set in macaque monkeys. Cereb Cortex. 2007;17:859–864. doi: 10.1093/cercor/bhk039.
  8. Budhani S, Marsh AA, Pine DS, Blair RJ. Neural correlates of response reversal: considering acquisition. Neuroimage. 2007;34:1754–1765. doi: 10.1016/j.neuroimage.2006.08.060.
  9. Burman KJ, Rosa MG. Architectural subdivisions of medial and orbital frontal cortices in the marmoset monkey (Callithrix jacchus). J Comp Neurol. 2009;514:11–29. doi: 10.1002/cne.21976.
  10. Bussey TJ, Wise SP, Murray EA. The role of ventral and orbital prefrontal cortex in conditional visuomotor learning and strategy use in rhesus monkeys (Macaca mulatta). Behav Neurosci. 2001;115:971–982. doi: 10.1037//0735-7044.115.5.971.
  11. Cardinal RN. Whisker (version 2.11). Cambridge, UK: Cambridge University Technical Services; 2001.
  12. Cardinal RN. Monkey Cantab (version 3.6). Cambridge, UK: Cambridge University Technical Services; 2007.
  13. Chamberlain SR, Blackwell AD, Fineberg NA, Robbins TW, Sahakian BJ. Strategy implementation in obsessive-compulsive disorder and trichotillomania. Psychol Med. 2006;36:91–97. doi: 10.1017/S0033291705006124.
  14. Chamberlain SR, Menzies L, Hampshire A, Suckling J, Fineberg NA, del Campo N, Aitken M, Craig K, Owen AM, Bullmore ET, Robbins TW, Sahakian BJ. Orbitofrontal dysfunction in patients with obsessive-compulsive disorder and their unaffected relatives. Science. 2008;321:421–422. doi: 10.1126/science.1154433.
  15. Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J Neurosci. 2003;23:8771–8780. doi: 10.1523/JNEUROSCI.23-25-08771.2003.
  16. Clarke HF, Roberts AC. Reversal learning in fronto-striatal circuits: a functional, autonomic and neurochemical analysis. In: Delgado M, Phelps EA, Robbins TW, editors. Attention and performance XXII. Decision making, affect, and learning. Oxford: Oxford UP; 2010. in press.
  17. Clarke HF, Dalley JW, Crofts HS, Robbins TW, Roberts AC. Cognitive inflexibility after prefrontal serotonin depletion. Science. 2004;304:878–880. doi: 10.1126/science.1094987.
  18. Clarke HF, Robbins TW, Roberts AC. Lesions of the medial striatum in monkeys produce perseverative impairments during reversal learning similar to those produced by lesions of the orbitofrontal cortex. J Neurosci. 2008;28:10972–10982. doi: 10.1523/JNEUROSCI.1521-08.2008.
  19. Cools R, Clark L, Owen AM, Robbins TW. Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci. 2002;22:4563–4567. doi: 10.1523/JNEUROSCI.22-11-04563.2002.
  20. Cools R, Frank MJ, Gibbs SE, Miyakawa A, Jagust W, D'Esposito M. Striatal dopamine predicts outcome-specific reversal learning and its sensitivity to dopaminergic drug administration. J Neurosci. 2009;29:1538–1543. doi: 10.1523/JNEUROSCI.4467-08.2009.
  21. Dias R, Robbins TW, Roberts AC. Dissociation in prefrontal cortex of affective and attentional shifts. Nature. 1996;380:69–72. doi: 10.1038/380069a0.
  22. Dias R, Robbins TW, Roberts AC. Dissociable forms of inhibitory control within prefrontal cortex with an analog of the Wisconsin Card Sort Test: restriction to novel situations and independence from “on-line” processing. J Neurosci. 1997;17:9285–9297. doi: 10.1523/JNEUROSCI.17-23-09285.1997.
  23. Fellows LK, Farah MJ. Ventromedial frontal cortex mediates affective shifting in humans: evidence from a reversal learning paradigm. Brain. 2003;126:1830–1837. doi: 10.1093/brain/awg180.
  24. Fuster JM. The prefrontal cortex: anatomy, physiology, and neuropsychology of the frontal lobe. 3rd Ed. Philadelphia: Lippincott-Raven; 1997.
  25. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
  26. Hampshire A, Owen AM. Fractionating attentional control using event-related fMRI. Cereb Cortex. 2006;16:1679–1689. doi: 10.1093/cercor/bhj116.
  27. Hornak J, O'Doherty J, Bramham J, Rolls ET, Morris RG, Bullock PR, Polkey CE. Reward-related reversal learning after surgical excisions in orbito-frontal or dorsolateral prefrontal cortex in humans. J Cogn Neurosci. 2004;16:463–478. doi: 10.1162/089892904322926791.
  28. Howell DC. Statistical methods for psychology. 4th Ed. Belmont, California: Wadsworth; 1997.
  29. Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci. 2004;24:7540–7548. doi: 10.1523/JNEUROSCI.1921-04.2004.
  30. Kazama A, Bachevalier J. Selective aspiration or neurotoxic lesions of orbital frontal areas 11 and 13 spared monkeys' performance on the object discrimination reversal task. J Neurosci. 2009;29:2794–2804. doi: 10.1523/JNEUROSCI.4655-08.2009.
  31. Mackintosh NJ, Little L. Selective attention and response strategies as factors in serial reversal learning. Can J Psychol. 1969;23:335–346.
  32. Macmillan NA, Creelman CD. Detection theory: a user's guide. Cambridge, UK: Cambridge UP; 1991.
  33. McAlonan K, Brown VJ. Orbital prefrontal cortex mediates reversal learning and not attentional set shifting in the rat. Behav Brain Res. 2003;146:97–103. doi: 10.1016/j.bbr.2003.09.019.
  34. Murray EA, Gaffan D. Prospective memory in the formation of learning sets by rhesus monkeys (Macaca mulatta). J Exp Psychol Anim Behav Process. 2006;32:87–90. doi: 10.1037/0097-7403.32.1.87.
  35. O'Doherty J, Critchley H, Deichmann R, Dolan RJ. Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J Neurosci. 2003;23:7931–7939. doi: 10.1523/JNEUROSCI.23-21-07931.2003.
  36. Ostlund SB, Balleine BW. The contribution of orbitofrontal cortex to action selection. Ann N Y Acad Sci. 2007;1121:174–192. doi: 10.1196/annals.1401.033.
  37. Roberts AC, Robbins TW, Everitt BJ. The effects of intradimensional and extradimensional shifts on visual discrimination learning in humans and non-human primates. Q J Exp Psychol B. 1988;40:321–341.
  38. Rogers RD, Andrews TC, Grasby PM, Brooks DJ, Robbins TW. Contrasting cortical and subcortical activations produced by attentional-set shifting and reversal learning in humans. J Cogn Neurosci. 2000;12:142–162. doi: 10.1162/089892900561931.
  39. Rolls ET. The orbitofrontal cortex. Philos Trans R Soc Lond B Biol Sci. 1996;351:1433–1443. doi: 10.1098/rstb.1996.0128.
  40. Rushworth MF, Croxson PL, Buckley MJ, Walton ME. Ventrolateral and medial frontal contributions to decision-making and action selection. In: Bunge SA, Wallis JD, editors. Neuroscience of rule-guided behavior. New York: Oxford UP; 2008. pp. 129–157.
  41. Schoenbaum G, Saddoris MP, Stalnaker TA. Reconciling the roles of orbitofrontal cortex in reversal learning and the encoding of outcome expectancies. Ann N Y Acad Sci. 2007;1121:320–335. doi: 10.1196/annals.1401.001.
  42. Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci. 2009;10:885–892. doi: 10.1038/nrn2753.
  43. Sutherland NS, Mackintosh NJ. Mechanisms of animal discrimination learning. New York: Academic; 1971.
  44. Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525.
  45. Wallis JD, Dias R, Robbins TW, Roberts AC. Dissociable contributions of the orbitofrontal and lateral prefrontal cortex of the marmoset to performance on a detour reaching task. Eur J Neurosci. 2001;13:1797–1808. doi: 10.1046/j.0953-816x.2001.01546.x.
  46. Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027.
  47. Wilson CR, Gaffan D. Prefrontal-inferotemporal interaction is not always necessary for reversal learning. J Neurosci. 2008;28:5529–5538. doi: 10.1523/JNEUROSCI.0952-08.2008.

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
