Abstract
Background
Patients with alcohol dependence (AD) and pathological gambling (PG) are characterized by dysfunctional reward processing and their ability to adapt to alterations of reward contingencies is impaired. However, most neurocognitive tasks investigating reward processing involve a complex mix of elements, such as working memory, immediate and delayed rewards, and risk-taking. As a consequence, it is not clear whether contingency learning is altered in AD or PG. Therefore, the current study aimed to examine performance in a deterministic contingency learning task, investigating discrimination, reversal, and extinction learning.
Methods
Thirty-three alcohol-dependent patients (ADs), 28 pathological gamblers (PGs), and 18 healthy controls (HCs) performed a contingency learning task in which they learned stimulus–reward associations that were first reversed and later extinguished while receiving deterministic feedback throughout. Accumulated points, number of perseverative errors and trials required to reach a criterion in each learning phase were compared between groups using nonparametric Kruskal–Wallis rank-sum tests. Regression analyses were performed to compare learning curves.
Results
PGs and ADs did not differ from HCs in discrimination, reversal, or extinction learning on the nonparametric tests. Regression analyses, however, showed differences in the initial speed of learning: PGs learned significantly faster than ADs during discrimination learning, and both PGs and ADs learned more slowly than HCs in the reversal and extinction phases of the task.
Conclusions
Learning rates for reversal and extinction were slower in ADs and PGs than in HCs, suggesting that reversing and extinguishing learned contingencies requires more effort in ADs and PGs. This implies diminished flexibility to overcome previously learned contingencies.
Keywords: Reversal Learning, Extinction Learning, Alcohol Dependence, Pathological Gambling, Orbitofrontal Cortex
The ability to appropriately process reward and punishment is crucial for adaptive behavior in a constantly changing environment. Intact contingency learning abilities can be seen as the basis for reliable long-term decision-making processes in real life: stimulus–response–outcome associations must be identified and learned appropriately to know which behavior will result in the most rewarding and least damaging outcome. At the same time, these processes must be flexible enough to adapt when reward contingencies in the environment change.
Reward and punishment processing are compromised in patients with substance use disorders (SUDs) compared to healthy subjects (De Ruiter et al., 2009; for a review, see Diekhof et al., 2008). Specifically, patients show a motivational bias toward the drug of abuse and related stimuli (Goldstein and Volkow, 2002; Wrase et al., 2002), while at the same time being less sensitive to drug-unrelated rewards (Goldstein et al., 2007; Martin-Soelch et al., 2001). Pathological gambling (PG) is classified as an addiction in the DSM-5 and has also been associated with compromised reward processing (Van Holst et al., 2010). Sensitivity to punishment in particular seems to be reduced in pathological gamblers (PGs; De Ruiter et al., 2009; Reuter et al., 2005). These impairments in patients with SUDs and PG also reflect the dysfunctional reward processing of affected individuals in real life ultimately leading to continued substance use or persistent gambling despite serious negative consequences in terms of health and social functioning. This indicates a lack of flexibility to learn or unlearn new reward contingencies (to be distinguished from a more general cognitive flexibility to change problem-solving strategies when needed, as measured for example by the Wisconsin Card Sorting Test; Heaton, 1981) which is essential for adaptive functioning.
Contingency learning has discrimination learning at its basis. In discrimination learning, reward values of certain stimuli are learned. After the discrimination learning phase, cognitive flexibility in response to changing contingencies can be indexed by 2 abilities within contingency learning: reversal learning and extinction learning (Itami and Uno, 2002). Discrimination learning, reversal learning, and extinction learning can be studied in simple visual decision-making tasks, in which the reward values of certain stimuli are learned. Respondents hereby learn when to react and when to refrain from reacting to gain reward and avoid punishment, respectively. When the contingency rules have been sufficiently acquired (discrimination learning), there are 2 main ways in which reward contingencies can be altered. The first is to change contingencies such that responses to previously rewarding stimuli are punished and responses to previously punishing stimuli are rewarded (reversal learning). Alternatively, stimuli and responses can abruptly fail to deliver reward altogether, forcing the respondent to refrain from any type of reaction. This last aspect of contingency learning is called extinction learning.
Response perseveration in people with SUD and PG has generally been studied with probabilistic discrimination learning tasks (De Ruiter et al., 2009), Go/No-Go tasks with probabilistic cueing (Fillmore and Rush, 2006), or tasks involving gambling elements (Bechara et al., 2001; Goudriaan et al., 2005; Leeman and Potenza, 2012). In all these tasks, information on reward contingencies must be integrated over a period of time to perform successfully. In probabilistic discrimination learning, responses to stimuli are followed by feedback that is correct most, but not all, of the time. One trial per stimulus is therefore not sufficient to learn its reward value. Clearly, this process requires more complex cognitive processing, including working memory, processing of immediate and delayed rewards, and weighing risks and benefits. To study pure contingency learning, a simpler, deterministic task is needed. Itami and Uno (2002) developed a task in which responses are followed by accurate feedback throughout, relieving working memory from integrating probabilistic information during the task. This task is used in the current study. Reversal and extinction learning deficits with a deterministic feedback setup have also been associated with deficient orbitofrontal cortex (OFC) functioning (Fellows and Farah, 2003).
Importantly, disruptions in discrimination, reversal, or extinction learning can elucidate the neural basis of the conditions under study. The OFC in particular has been frequently implicated in cognitive flexibility in contingency learning (De Ruiter et al., 2009; Rolls, 2004; Tsuchida et al., 2010). A large number of human lesion studies have shown that lesions to the OFC are associated with increased perseverative responding to previously rewarded stimuli following extinction or reversal of reinforcement contingencies (Bechara et al., 2001; Hornak et al., 2004; Rolls et al., 1994).
Interestingly, OFC dysfunctions have been implicated in SUDs and PG (Cavedini et al., 2002; London et al., 2000; Van Holst et al., 2012; Volkow and Fowler, 2000). Specifically, drug exposure may cause alterations in neuronal activity in the OFC and lead to impaired performance on orbitofrontal-dependent learning tasks (Schoenbaum and Shaham, 2008). In line with this reasoning, response perseveration to previously rewarding stimuli in reversal and extinction learning and similar tasks has been shown in patients with cocaine dependence (Ersche et al., 2008, 2011), patients with alcohol dependence (AD; Bechara et al., 2001; Goudriaan et al., 2005), as well as people with a family history of alcoholism (Giancola et al., 1993) and prenatal exposure to alcohol (Kodituwakku et al., 2001), and in people with PG (De Ruiter et al., 2009; Goudriaan et al., 2005; for a review, see Leeman and Potenza, 2012).
In conclusion, there is a lack of knowledge on contingency learning in AD and PG, specifically with regard to deterministic feedback (Izquierdo and Jentsch, 2012). In the current study, we therefore examine discrimination, reversal, and extinction learning in alcohol-dependent patients (ADs) and PGs in comparison with healthy controls (HCs) while reducing working memory load during the task by implementing a visual discrimination learning task with deterministic feedback. We expected ADs and PGs to show impaired reversal and extinction learning performance as compared to HCs due to OFC dysfunction in these disorders as well as evidence of maladaptive responding to reward and punishment and inflexibility following contingency changes in some of our previous studies (De Ruiter et al., 2009; Goudriaan et al., 2005).
Materials and Methods
Participants
The study sample consisted of 28 treatment-seeking PGs, 34 treatment-seeking, abstinent ADs, and 19 HCs. Only male participants were included as treatment-seeking PGs are mainly male. Participants were between the ages of 19 and 59 (M = 40.03; SD = 10.71). PGs and ADs were recruited from Dutch addiction treatment centers, and HCs were recruited through advertisements in local newspapers. Procedures were approved by the ethical review board of the Academic Medical Center, and written informed consent was provided by all participants.
DSM-IV criteria for PG were assessed with section T of the Diagnostic Interview Schedule for DSM-IV (Robins et al., 1995). In addition, the South Oaks Gambling Screen (SOGS; Lesieur and Blume, 1987) was administered, the main inclusion criterion for PG being a score of 5 or higher. PG was an exclusion criterion for ADs and HCs.
DSM-IV-TR criteria for alcohol abuse or dependence were assessed with section J of the Dutch version of the Composite International Diagnostic Interview (World Health Organization, 1997). In addition, AD severity was assessed with the Alcohol Use Disorders Identification Test (AUDIT; Bush et al., 1998). ADs had been abstinent for a minimum of 2 weeks. Alcohol dependence or abuse was an exclusion criterion for PGs and HCs.
Exclusion criteria for all groups were brain trauma, lifetime diagnosis of schizophrenia or psychotic episodes, 12-month diagnosis of manic disorder, substance dependence or abuse other than AD in the alcohol-dependent group, obsessive–compulsive disorder or posttraumatic stress disorder, treatment in the last 12 months for neurological disorders or mental disorders other than those under study, and use of psychotropic medication. In addition, urine tests for alcohol, amphetamines, benzodiazepines, opioids, or cocaine had to be negative.
Intelligence (Wechsler Adult Intelligence Scale-Revised [WAIS-R]; Wechsler, 1981), depression severity (Beck Depression Inventory [BDI]; Beck et al., 1996), impulsivity (Barratt Impulsivity Scale [BIS-11]; Patton et al., 1995), and number of cigarettes smoked per day were assessed as potential confounders.
Task Description
General Procedure
Stimuli consisted of 4 rectangles containing different colored patterns (approx. 20° visual angle horizontally, 12° visual angle vertically) which appeared 1 at a time on a computer screen in a randomized order (Figure 1). Participants were instructed that they were to react to some, but not all, patterns by pressing the space bar and that the goal of the task was to find out which patterns required a reaction and which did not. They were also informed that correct reactions (i.e., pressing the space bar on the correct patterns and refraining from pressing the space bar on the incorrect patterns) would yield 1 point, whereas incorrect reactions would lead to subtraction of 1 point from the running total. Participants were informed that 10 Eurocents would be paid for every gained point at the end of the task. Further instructions stated that the rules could unexpectedly change at any given moment. Trials began with the presentation of a fixation cross for 500 ms. Stimuli were presented for 2,000 ms or until a response was given. After every trial, written feedback (“correct”/“incorrect”) and an indication of whether a point had been won or lost were presented on the screen (1,500 ms). Participants were instructed to use this feedback to learn which stimuli did or did not require a response. All participants performed a practice block, consisting of a different set of 3 stimuli, which were each presented repeatedly until a correct response was given.
Fig. 1. Example trials in the discrimination (A), reversal (B), and extinction (C) phases.
Discrimination Learning
The discrimination phase directly followed the practice phase. Of the 4 stimuli, 2 required a response and 2 did not. The criterion for successfully completing the discrimination phase was giving 9 correct responses within 10 consecutive trials. Hence, the number of trials presented was not equal for all participants, but depended on their speed of learning. If the criterion had not been reached within 120 trials, the discrimination phase was automatically ended (this, however, did not occur, as all participants reached criterion in <120 trials). Example trials are shown in Fig. 1A.
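The stopping rule described above can be illustrated with a minimal sketch. It assumes that the criterion is checked over a sliding window of the most recent 10 trials and that a window shorter than 10 trials may already satisfy it; the text does not specify either detail, and the function name and example data are hypothetical.

```python
from typing import Optional, Sequence


def trials_to_criterion(
    correct: Sequence[bool],
    needed: int = 9,
    window: int = 10,
    max_trials: int = 120,
) -> Optional[int]:
    """Return the 1-based trial at which `needed` correct responses fall
    within the most recent `window` trials, or None if this never happens
    within `max_trials` (the phase would then end automatically)."""
    for end in range(1, min(len(correct), max_trials) + 1):
        start = max(0, end - window)  # assumed sliding window
        if sum(correct[start:end]) >= needed:
            return end
    return None


# Hypothetical participant: three early errors, then consistently correct.
responses = [False, True, False, False] + [True] * 20
print(trials_to_criterion(responses))  # 13
```

Under these assumptions, the same function applies to the reversal phase, which uses the same 9-of-10 criterion.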
Reversal Learning
After reaching criterion in the discrimination phase, reward contingencies were reversed without warning. Responding to the previously correct stimuli was penalized, whereas responding to the previously incorrect stimuli was rewarded with 1 point. Testing was discontinued upon reaching the criterion of 9 correct trials of 10 consecutive trials. For 5 participants, reversal learning was ended automatically as they did not succeed in reaching criterion within 120 trials. Example trials are shown in Fig. 1B.
Extinction Learning
Following a 10-minute break, during which questionnaires were administered, participants were subjected to a second discrimination phase. Reward contingencies were identical to those previously learned in the reversal phase, so this phase did not require respondents to learn unfamiliar contingencies. When a criterion of 9 correct responses in 10 consecutive trials had been reached, the extinction phase began. No stimulus required a response, so pressing the space bar to any stimulus was penalized with the loss of 1 point. The task was ended after 15 consecutive correct extinction trials. All participants reached criterion within 120 trials. Example trials are shown in Fig. 1C.
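The contingencies of the three phases and the deterministic feedback rule can be summarized in a short sketch. The stimulus labels and the assignment of particular stimuli to the response set are hypothetical; only the structure (2 of 4 stimuli require a response, the sets swap at reversal, and no stimulus requires a response in extinction) follows the description above.

```python
from typing import Dict, FrozenSet

# Hypothetical labels for the 4 colored patterns.
STIMULI = ("A", "B", "C", "D")

# Stimuli that require a space-bar press in each phase (assumed assignment).
GO_STIMULI: Dict[str, FrozenSet[str]] = {
    "discrimination": frozenset({"A", "B"}),
    "reversal": frozenset({"C", "D"}),  # contingencies swapped
    "extinction": frozenset(),          # no stimulus requires a response
}


def feedback(phase: str, stimulus: str, pressed: bool) -> int:
    """Deterministic feedback: +1 for a correct reaction, -1 otherwise."""
    should_press = stimulus in GO_STIMULI[phase]
    return 1 if pressed == should_press else -1


# Pressing to a formerly rewarded stimulus after reversal is a
# perseverative (commission) error and costs 1 point.
print(feedback("reversal", "A", pressed=True))  # -1
```

Because feedback is deterministic, a single trial per stimulus is in principle enough to infer its current value, which is the property that distinguishes this task from probabilistic designs.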
Statistical Analyses
Statistical analyses of behavioral data were performed for the first discrimination phase (Fig. 1A), the reversal phase (Fig. 1B), and the extinction phase (Fig. 1C). The second discrimination phase, which preceded the extinction phase, was not analyzed because its contingencies were identical to those of the reversal phase directly before it. Data were analyzed using R software for statistical computing (R Core Team, 2012). There were no missing data.
Main variables of interest were number of errors and number of trials until the criterion was reached per phase (Itami and Uno, 2002). Errors were defined as commission errors (reactions to punishing stimuli), which are equivalent to perseveration errors in reversal and extinction. Due to skewed data, nonparametric Kruskal–Wallis rank-sum tests were performed on these variables as well as on sample characteristics. Log-transformation or centralizing did not succeed in normalizing most data for parametric analyses.
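For readers unfamiliar with the test, the following sketch computes the Kruskal–Wallis statistic for 3 groups in plain Python. It is an illustration only and not the R code used in the study; the function names and example data are hypothetical. With 3 groups the reference distribution is chi-square with 2 degrees of freedom, whose upper-tail probability has the closed form exp(−H/2).

```python
import math
from typing import Dict, List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_3(a, b, c) -> Tuple[float, float]:
    """Return (H, p) for exactly three groups (df = 2).
    Assumes the pooled data are not all identical."""
    groups = [list(a), list(b), list(c)]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard correction for tied observations.
    counts: Dict[float, int] = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    h /= 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h, math.exp(-h / 2)  # chi-square survival function for df = 2


# Hypothetical error counts for three small groups.
h, p = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

The same closed form can be used to check reported values: a statistic of 2.27 with 2 degrees of freedom corresponds to exp(−2.27/2) ≈ 0.32.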
Effect sizes for nonsignificant results of these analyses were consistently very small (η2 < 0.05) and are therefore not reported.
Linear regressions were performed to compare the learning processes (i.e., mean score as a function of trial number) between the groups. Due to the adaptive nature of the task, there was a large variability in the number of trials required until criterion was reached. Over half of the participants completed each phase within 40 trials, with increasingly fewer participants accounting for the shape of the learning curve up to the maximum possible trial (120). This leads to unsystematic distortions caused by a small sample size at high trial numbers (e.g., N = 5 in trials 100 to 120 of the reversal phase) and increasingly drastic changes in the curve due to single participants reaching criterion and dropping out. Hence, only the first sections of each phase were included in the analyses. Cutoff points were set to the trial number per phase at which a maximum of 60% of all participants had completed the task. This cutoff was chosen to ensure that at no stage in the analysis fewer than 30 participants accounted for the data and that the adjusted R2 of each analysis did not fall below 0.80. A high R2 value is a reasonable requirement as we assumed trial number and group to be the principal predictors of mean score. This resulted in a cutoff at trial 24 (inclusive) for the discrimination phase, trial 29 for the reversal phase, and trial 35 for the extinction phase. The results of these analyses therefore reflect the first phases of discrimination, reversal, and extinction learning.
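One way to read the cutoff rule is sketched below: the cutoff is the last trial at which no more than 60% of participants have reached criterion. Whether a participant who reaches criterion on a given trial counts as "completed" at that trial is an assumption here, and the function name and example data are hypothetical.

```python
from typing import Sequence


def cutoff_trial(completion_trials: Sequence[int], max_fraction: float = 0.60) -> int:
    """Return the last trial number at which at most `max_fraction` of
    participants have completed the phase (completion counted inclusively)."""
    n = len(completion_trials)
    cutoff = 0
    for t in range(1, max(completion_trials) + 1):
        done = sum(1 for c in completion_trials if c <= t)
        if done / n <= max_fraction:
            cutoff = t
        else:
            break  # the completed fraction never decreases, so stop here
    return cutoff


# Hypothetical trials-to-criterion for 10 participants.
example = [10, 12, 15, 20, 25, 30, 40, 60, 90, 120]
print(cutoff_trial(example))  # 39
```

With 79 participants, a 60% ceiling leaves at least 32 participants contributing data at every analyzed trial, which is consistent with the stated minimum of 30.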
For each phase, a stepwise forward model selection procedure was implemented. With this method, the continuous predictor trial number and categorical predictor group are successively added to a linear model, the saturated model consisting of both additive and interactive effects of both predictors. Successive models are compared with likelihood ratio tests, and the model providing the best fit to the data is retained as the most suitable model.
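The sequence of nested models can be written out as follows. The notation is introduced here for illustration and is not taken from the original analysis: $\bar{y}_{g,t}$ denotes the mean score of group $g$ at trial $t$, HCs are assumed to be the reference group ($\gamma_{\mathrm{HC}} = \delta_{\mathrm{HC}} = 0$), and $\Delta p$ is the number of parameters added by the fuller model.

```latex
\begin{align*}
M_1:&\quad \bar{y}_{g,t} = \beta_0 + \beta_1 t + \varepsilon_{g,t} \\
M_2:&\quad \bar{y}_{g,t} = \beta_0 + \beta_1 t + \gamma_g + \varepsilon_{g,t} \\
M_3:&\quad \bar{y}_{g,t} = \beta_0 + \beta_1 t + \gamma_g + \delta_g t + \varepsilon_{g,t} \\[4pt]
\Lambda &= -2\left(\ell_{\text{reduced}} - \ell_{\text{full}}\right) \;\sim\; \chi^2_{\Delta p}
\end{align*}
```

Here $\ell$ is the maximized log-likelihood and the chi-square distribution holds asymptotically under the reduced model. In this parameterization, a group difference in learning speed corresponds to a nonzero interaction coefficient $\delta_g$, that is, a difference in slopes between groups.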
Results
Sample Characteristics
Per group, participants who made errors exceeding 2 standard deviations from the mean were excluded from further analyses. This resulted in the exclusion of 1 HC and 1 alcohol-dependent patient, leaving 18 HCs, 28 PGs, and 33 ADs for analyses.
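The exclusion rule can be sketched as follows, under the assumptions that the sample standard deviation is used, that the rule is applied two-sidedly, and that it is applied to one error count per participant within each group; the text does not specify these details, and the identifiers and data are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict


def exclude_outliers(errors_by_id: Dict[str, float], n_sd: float = 2.0) -> Dict[str, float]:
    """Keep participants whose error count lies within `n_sd` sample
    standard deviations of their group's mean."""
    values = list(errors_by_id.values())
    m, s = mean(values), stdev(values)
    return {pid: e for pid, e in errors_by_id.items() if abs(e - m) <= n_sd * s}


# Hypothetical error counts for one group; "p8" is an extreme value.
group = {"p1": 3, "p2": 4, "p3": 5, "p4": 4, "p5": 3, "p6": 5, "p7": 4, "p8": 30}
kept = exclude_outliers(group)
print(sorted(kept))  # p1 to p7 remain; p8 is excluded
```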
Sample characteristics are presented in Table 1. Groups differed significantly in age (ADs being older than both HCs and PGs) and depression severity as measured by the BDI (PGs being the most depressed and HCs the least depressed). Neither age nor BDI scores correlated significantly with task performance (Spearman's ρ = −0.22 to 0.40, all ps > 0.05) and were therefore not entered into further analyses as potential confounding factors. Groups also differed on the mean number of cigarettes smoked per day (with HCs smoking less than both ADs and PGs). However, smoking behavior was not significantly correlated with task performance within any group (Spearman's ρ = −0.32 to 0.38, all ps > 0.05), and therefore, smoking was not included as a potential confounder in further analyses. As expected, groups differed significantly in the extent of gambling- and drinking-related problems, as measured by SOGS and AUDIT, respectively. Number of years since problem drinking or problem gambling began did not differ between ADs (M = 10.1, SD = 8.8) and PGs (M = 8.93, SD = 8.99) and did not relate to task performance (Spearman's ρ = −0.05 to 0.31, all ps > 0.05). It was therefore not included in further analyses.
Table 1.
Means, Standard Deviations, Test Statistics, and Statistical Significance of Sample Characteristics as a Function of Group
| | HCs | | PGs | | ADs | | Test statistics | | |
|---|---|---|---|---|---|---|---|---|---|
| Sample characteristics | M | SD | M | SD | M | SD | χ2 | df | p-Value |
| N | 18 | | 28 | | 33 | | | | |
| Age (years) | 39.1 | 10.5 | 36.6 | 12.0 | 43.8 | 8.5 | 6.24 | 2 | 0.04 |
| Cigarettes/d | 3.94 | 6.86 | 10.41 | 10.96 | 14.73 | 14.45 | 10.12 | 2 | <0.001 |
| WAIS-R score | 14.7 | 4.3 | 13.1 | 3.4 | 12.8 | 3.9 | 2.51 | 2 | 0.28 |
| BDI | 5.1 | 6.4 | 13.5 | 8.5 | 8.4 | 6.7 | 10.54 | 2 | 0.005 |
| BIS-11 | 69.7 | 2.6 | 70.9 | 3.5 | 71.1 | 5.7 | 1.09 | 2 | 0.58 |
| SOGS | 0.11 | 0.32 | 10.61 | 3.15 | 0.15 | 0.36 | 64.87 | 2 | <0.001 |
| AUDIT | 5.5 | 3.6 | 5.6 | 5.2 | 27.1 | 7.1 | 46.70 | 2 | <0.001 |
HCs, healthy controls; PGs, pathological gamblers; ADs, alcohol-dependent patients; M, mean; SD, standard deviation; χ2, Kruskal–Wallis test statistic; WAIS-R, Wechsler Adult Intelligence Scale-Revised; BDI, Beck Depression Inventory; BIS-11, Barratt Impulsivity Scale; SOGS, South Oaks Gambling Screen; AUDIT, Alcohol Use Disorders Identification Test.
Discrimination Learning
The average running score increased with the number of trials (as seen in a significant positive linear effect of trial number on score in all groups in all phases, all ps < 0.001), indicating that overall learning took place.
The number of commission errors until criterion was reached did not significantly differ between groups, χ2(2) = 2.27, p = 0.32. The number of trials needed until reaching criterion did not differ between groups, χ2(2) = 2.65, p = 0.27. Means and standard deviations can be found in Tables 2 and 3.
Table 2.
Means, Standard Deviations, Test Statistics, and Statistical Significance of Commission Errors/Perseveration Errors Made in Each Phase Per Group
| | HCs (N = 18) | | PGs (N = 28) | | ADs (N = 33) | | Test statistics | | |
|---|---|---|---|---|---|---|---|---|---|
| Phase | M | SD | M | SD | M | SD | χ2 | df | p-Value |
| Discrimination learning | 4.7 | 5.4 | 5.3 | 6.3 | 4.5 | 3.0 | 2.27 | 2 | 0.32 |
| Reversal | 6.2 | 6.8 | 10.3 | 11.8 | 7.9 | 8.8 | 0.61 | 2 | 0.74 |
| Extinction | 5.6 | 3.0 | 6.1 | 4.4 | 6.4 | 3.6 | 0.95 | 2 | 0.62 |
HCs, healthy controls; PGs, pathological gamblers; ADs, alcohol-dependent patients; χ2, Kruskal–Wallis test statistic.
Table 3.
Means, Standard Deviations, Test Statistics, and Statistical Significance of Trials Until Criterion in Each Phase Per Group
| | HCs (N = 18) | | PGs (N = 28) | | ADs (N = 33) | | Test statistics | | |
|---|---|---|---|---|---|---|---|---|---|
| Phase | M | SD | M | SD | M | SD | χ2 | df | p-Value |
| Discrimination 1 | 25.6 | 23.3 | 26.1 | 21.0 | 25.6 | 11.7 | 2.65 | 2 | 0.27 |
| Reversal | 28.2 | 20.7 | 42.2 | 37.4 | 36.1 | 29.9 | 1.02 | 2 | 0.60 |
| Extinction | 31.8 | 8.2 | 31.3 | 11.1 | 32.9 | 9.8 | 0.81 | 2 | 0.67 |
HCs, healthy controls; PGs, pathological gamblers; ADs, alcohol-dependent patients; χ2, Kruskal–Wallis test statistic.
A regression analysis was performed on the first 24 trials of the discrimination phase to determine learning progression in the 3 groups. Figure 2 shows the learning curves in HCs, PGs, and ADs. A stepwise forward model selection procedure entering first trial number and then group as predictors of mean score revealed that a model with an interactive trial by group term provided a significantly better fit to the data than a model with trial number alone, p = 0.03, indicating a difference in slopes between the groups (adjusted R2 = 0.80). On closer inspection, this was due to a steeper slope (i.e., faster learning) in PGs than in ADs, p < 0.01. No other differences were significant.
Fig. 2. Learning curves in the first 24 trials of discrimination.
Reversal Learning
HCs, PGs, and ADs did not differ significantly in the number of perseverative errors made before reaching criterion, χ2(2) = 0.61, p = 0.74, nor in the number of trials required to reach criterion, χ2(2) = 1.02, p = 0.60. Means and standard deviations can be found in Tables 2 and 3.
Initial reversal learning was further analyzed in the first 29 trials administered in this condition. Figure 3 shows the learning curves for HCs, PGs, and ADs. Linear regression analyses on mean score revealed that the best fit of the data was provided by a model including an interactive term for number of trials and group (adjusted R2 = 0.86). This was due to faster learning in HCs than in both PGs and ADs, all ps < 0.05. Similar learning in PGs and ADs was reflected in the lack of significant difference between the group × trial interaction terms for PGs and ADs, p = 0.49.
Fig. 3. Learning curves in the first 29 trials of reversal.
Extinction Learning
Participants of all 3 groups made a similar number of perseverative errors until reaching criterion in extinction, χ2(2) = 0.95, p = 0.62, and did not differ in the number of trials until reaching criterion, χ2(2) = 0.81, p = 0.67. Means and standard deviations can be found in Tables 2 and 3.
Regression analyses were performed on the first 35 trials administered in this condition. Figure 4 shows the learning curves in the extinction phase for HCs, PGs, and ADs. A stepwise procedure for linear regression model selection revealed that a model including an interactive term for trial number and group provided the best fit to the data (adjusted R2 = 0.97), reflecting a significant trial number by group interaction. Learning was significantly faster in HCs than in both PGs and ADs, and the slope was steeper in ADs than in PGs, all ps < 0.01.
Fig. 4. Learning curves in the first 35 trials of extinction.
Correlations
Nonparametric Spearman's correlations were computed between BIS-11 questionnaire data and number of trials until reaching criterion and number of errors in reversal and extinction (t_rev, e_rev, t_ext, and e_ext, respectively) for each group separately. In PGs, impulsivity (as measured by BIS-11) was moderately positively correlated with number of perseverative errors in extinction, Spearman's ρ = 0.39, p < 0.01. No significant correlations were found in ADs or HCs. No significant correlations were present between the dependent variables and AUDIT scores in the alcohol-dependent group or between the dependent variables and SOGS scores in the PG group, indicating that addiction severity was not related to contingency learning performance.
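As an illustration of the rank-based correlation used here, Spearman's ρ can be computed as the Pearson correlation of average ranks. The sketch below is not the code used in the study, and the function names and example values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1  # first 1-based position of v
        count = sorted_vals.count(v)      # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation computed on the ranks of x and y.
    Assumes neither variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical impulsivity scores and perseverative error counts.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho, 2))  # 0.8
```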
Discussion
The goal of this study was to determine whether aspects of contingency learning, including discrimination, reversal, and extinction learning, are impaired in ADs and PGs. We used a deterministic discrimination learning task in which previously learned reward contingencies were altered without warning. We expected ADs and PGs to show impaired reversal and extinction learning performance as compared to HCs based on previous evidence of maladaptive reward processing and inflexibility following changes of reinforcement contingencies in AD and PG (De Ruiter et al., 2009; Goudriaan et al., 2005), as well as the presence of OFC dysfunctions in patients with these disorders (Van Holst et al., 2012; Volkow and Fowler, 2000). The current deterministic discrimination learning task was designed to minimize working memory load so that contingency learning capacity could be investigated without this interference. Thus, results from this study give insight into the contingency learning aspects of discrimination learning, reversal learning, and extinction learning in ADs and PGs compared to HCs.
Contrary to our expectations, ADs and PGs did not significantly differ from HCs with regard to total accumulated points, number of perseverative errors, or trials until criterion performance in either reversal or extinction phases. However, the learning curves in the reversal and extinction phases were steeper in HCs than in ADs and PGs, indicating a more efficient learning process in HCs.
There are a number of possible reasons for the lack of significant group differences in the reversal and extinction phases regarding total accumulated points, errors, and number of trials in our study. First, it is interesting to note that the descriptive data showed a tendency for group mean differences in the reversal task, particularly with regard to trials needed to reach criterion performance (PGs > ADs > HCs). The large variability within groups is a possible explanation for the lack of group differences in the total accumulated points, errors, and number of trials. Interestingly, this tendency was not observed in the extinction phase. Second, the task setup in our study was relatively simple in comparison with a number of previous studies. We chose a deterministic setup based on an attention-deficit hyperactivity disorder study by Itami and Uno (2002). Our choice was also based on pilot data, suggesting that alcohol-dependent participants could not successfully learn when a probabilistic design was used, and on an earlier study in PG showing that with probabilistic feedback, PGs were unable to successfully reverse their response strategy (De Ruiter et al., 2009). Probabilistic feedback setups require contradictory information (i.e., receiving both reward and punishment for the same behavior) to be maintained in working memory and integrated over a number of trials to result in a probabilistic judgment. They also encourage perseverative responding after contingency reversal (Cools et al., 2002) by rewarding incorrect responses in a minority of trials. In the task setup chosen in our study, working memory load was substantially lowered by the choice of stimuli, which consisted of only 4 easily distinguishable patterns (by comparison, Goudriaan et al., 2005, used 8 different 2-digit numbers).
Working memory deficits have been implicated in alcohol and substance use disorders (Ambrose et al., 2001; Bechara and Martin, 2004) as well as in PG (Leiserson and Pihl, 2007) and may bear upon response perseveration previously observed in these groups. It is possible that when working memory is minimally challenged, both ADs and PGs are not significantly impaired in reversal and extinction learning.
As to the early learning advantage of HCs in reversal and extinction compared to ADs and PGs found in the regression analyses, this is in line with our prior expectations of group differences. In contrast to the nonparametric tests, which assess overall performance on the task, the regressions provide information on how fast participants adapt to task demands in each phase. It is therefore possible that learning rates initially differ between groups, before eventually leveling out so that task performance ultimately does not differ significantly between groups. Unfortunately, the adaptive nature of the task did not allow us to perform reliable regressions on the entire data set. Nevertheless, the analyses performed on the early learning phases within the reversal and extinction parts of the task reveal noteworthy results. Whereas neither patient group differed from HCs in learning pace during initial discrimination, HCs were faster at adapting to new reward contingencies (reversal learning). This result converges with previous findings that HCs perform better after contingency changes, but not during initial discrimination learning, compared to subjects with SUDs (Ersche et al., 2008) and to subjects with damage to the OFC (Rolls et al., 1994). However, in the current study, the faster learning in HCs compared to ADs and PGs was too subtle to be reflected in significant differences in the number of errors or the number of trials needed to reach criterion performance. It is therefore possible that differences between HCs and patients indeed exist in reversal and extinction learning, but that these differences are strongly dependent on task difficulty and additional working memory demands. Processing of probabilistic information may pose a particular problem for ADs and PGs, accounting for the difficulty in adapting behavior in everyday life (where reward contingencies are not always as consistent and clear-cut as in our experiment).
Yet, when there is no uncertainty that reward contingencies have been altered—as in deterministic feedback setups—ADs and PGs have less difficulty in learning the new set of rules. Note, however, the correlation in PGs between impulsivity and number of perseverative errors in extinction: particularly impulsive gamblers seem less able to refrain from any reaction, even when it becomes apparent that each response is punished. Elevated impulsivity has commonly been implicated in PG (Castellani and Rugle, 1995; Moran, 1970) and could be a further driving force behind various decision-making deficits.
In summary, we conclude that reversal and extinction learning are not severely affected in AD and PG when learning conditions are straightforward, but that learning rates are somewhat slower, indicating that unlearning reinforcement contingencies takes more effort in ADs and PGs compared to HCs. We propose that task difficulty, particularly the complexity of the feedback setup, contributes substantially to differences between healthy individuals on the one hand and PGs and ADs on the other. Systematically manipulating task difficulty, for example, the degree of probabilistic feedback information provided, would help determine more specifically where difficulties lie in PG and AD. Additionally, physiological measures could serve to detect differences in processes that are too subtle to manifest themselves on a behavioral level. Future research should also reflect on the fact that variability within clinical groups can be substantial; it is therefore plausible that factors inherent to particular subgroups of clinical samples (e.g., enhanced impulsivity) have different effects on reversal and extinction performance. Finally, given this heterogeneity in clinical samples, it is important also to consider factors which can influence cognitive functioning over time, such as clinical treatment. A limitation of the current study is that treatment duration was not assessed; research in this field would be well advised to control for this variable in the future.
Sudden alterations of established reward contingencies in the environment are a disturbance to any system. Our resourcefulness is demonstrated by the efficiency with which we adapt to these alterations. Seemingly, even in severe psychological disorders such as AD and PG, this ability is not overly impaired when the learning conditions are plain and reliable. It is, however, the confrontation with contradictory guidelines (rare rewards for dysfunctional behavior) that leads to perseveration or relapse. A further limitation of the current study is that the relatively simple setup with four different stimuli may have led to lower power to detect differences in the overall number of errors and the number of trials needed to attain the criteria. Future research should focus on where exactly the threshold in task complexity lies that discriminates healthy from compromised reward processing.
References
- Ambrose ML, Bowden SC, Whelan G. Working memory impairments in alcohol-dependent participants without clinical amnesia. Alcohol Clin Exp Res. 2001;25:185–191.
- Bechara A, Dolan S, Denburg N, Hindes A, Anderson SW, Nathan PE. Decision-making deficits, linked to a dysfunctional ventromedial prefrontal cortex, revealed in alcohol and stimulant abusers. Neuropsychologia. 2001;39:276–289. doi: 10.1016/s0028-3932(00)00136-6.
- Bechara A, Martin EM. Impaired decision making related to working memory deficits in individuals with substance addictions. Neuropsychology. 2004;18:152–162. doi: 10.1037/0894-4105.18.1.152.
- Beck AT, Steer RA, Ball R, Ranieri WF. Comparison of Beck Depression Inventories -IA and -II in psychiatric outpatients. J Pers Assess. 1996;67:588–597. doi: 10.1207/s15327752jpa6703_13.
- Bush K, Kivlahan DR, McDonell MB, Fihn SD, Bradley KA. The AUDIT alcohol consumption questions (AUDIT-C): an effective brief screening test for problem drinking. Arch Intern Med. 1998;158:1789–1795. doi: 10.1001/archinte.158.16.1789.
- Castellani B, Rugle L. A comparison of pathological gamblers to alcoholics and cocaine misusers on impulsivity, sensation seeking, and craving. Int J Addict. 1995;30:275–289. doi: 10.3109/10826089509048726.
- Cavedini P, Riboldi G, Keller R, D'Annucci A, Bellodi L. Frontal lobe dysfunction in pathological gambling patients. Biol Psychiatry. 2002;51:334–341. doi: 10.1016/s0006-3223(01)01227-6.
- Cools R, Clark L, Owen AM, Robbins TW. Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci. 2002;22:4563–4567. doi: 10.1523/JNEUROSCI.22-11-04563.2002.
- De Ruiter MB, Veltman DJ, Goudriaan AE, Oosterlaan J, Sjoerds Z, van den Brink W. Response perseveration and ventral prefrontal sensitivity to reward and punishment in male problem gamblers and smokers. Neuropsychopharmacology. 2009;34:1027–1038. doi: 10.1038/npp.2008.175.
- Diekhof EK, Falkai P, Gruber O. Functional neuroimaging of reward processing and decision-making: a review of aberrant motivational and affective processing in addiction and mood disorders. Brain Res Rev. 2008;59:164–184. doi: 10.1016/j.brainresrev.2008.07.004.
- Ersche KD, Roiser JP, Abbott S, Craig KJ, Müller U, Suckling J, Ooi C, Shabbir SS, Clark L, Sahakian BJ, Fineberg NA, Merlo-Pich EV, Robbins TW, Bullmore ET. Response perseveration in stimulant dependence is associated with striatal dysfunction and can be ameliorated by a D2/3 receptor agonist. Biol Psychiatry. 2011;70:754–762. doi: 10.1016/j.biopsych.2011.06.033.
- Ersche KD, Roiser JP, Robbins TW, Sahakian BJ. Chronic cocaine but not chronic amphetamine use is associated with perseverative responding in humans. Psychopharmacology. 2008;197:421–431. doi: 10.1007/s00213-007-1051-1.
- Fellows LK, Farah MJ. Ventromedial frontal cortex mediates affective shifting in humans: evidence from a reversal learning paradigm. Brain. 2003;126:1830–1837. doi: 10.1093/brain/awg180.
- Fillmore MT, Rush CR. Polydrug abusers display impaired discrimination-reversal learning in a model of behavioural control. J Psychopharmacol. 2006;20:24–31. doi: 10.1177/0269881105057000.
- Giancola PR, Peterson JB, Pihl RO. Risk for alcoholism, antisocial behaviour, and response perseveration. J Clin Psychol. 1993;49:423–428. doi: 10.1002/1097-4679(199305)49:3<423::aid-jclp2270490317>3.0.co;2-1.
- Goldstein RZ, Alia-Klein N, Tomasi D, Zhang L, Cottone L, Maloney T, Telang F, Caparelli E, Chang L, Ernst T, Samaras D, Squires NK, Volkow N. Decreased prefrontal cortical sensitivity to monetary reward is associated with impaired motivation and self-control in cocaine addiction. Am J Psychiatry. 2007;164:43–51. doi: 10.1176/appi.ajp.164.1.43.
- Goldstein RZ, Volkow ND. Drug addiction and its underlying neurobiological basis: neuroimaging evidence for the involvement of the frontal cortex. Am J Psychiatry. 2002;159:1642–1652. doi: 10.1176/appi.ajp.159.10.1642.
- Goudriaan AE, Oosterlaan J, de Beurs E, van den Brink W. Decision making in pathological gambling: a comparison between pathological gamblers, alcohol dependents, persons with Tourette syndrome, and normal controls. Brain Res Cogn Brain Res. 2005;23:137–151. doi: 10.1016/j.cogbrainres.2005.01.017.
- Heaton RK. Odessa: Psychological Assessment Resources; 1981. Wisconsin Card Sorting Test Manual.
- Hornak J, O'Doherty J, Bramham J, Rolls ET, Morris RG, Bullock PR, Polkey CE. Reward-related reversal learning after surgical excisions in orbito-frontal or dorsolateral prefrontal cortex in humans. J Cogn Neurosci. 2004;16:463–478. doi: 10.1162/089892904322926791.
- Itami S, Uno H. Orbitofrontal cortex dysfunction in attention-deficit hyperactivity disorder revealed by reversal and extinction tasks. Neuroreport. 2002;13:2453–2457. doi: 10.1097/00001756-200212200-00016.
- Izquierdo A, Jentsch JD. Reversal learning as a measure of impulsive and compulsive behaviour in addiction. Psychopharmacology. 2012;219:607–620. doi: 10.1007/s00213-011-2579-7.
- Kodituwakku PW, May PA, Clericuzio CL, Weers D. Emotion-related learning in individuals prenatally exposed to alcohol: an investigation of the relation between set shifting, extinction of responses, and behaviour. Neuropsychologia. 2001;39:699–708. doi: 10.1016/s0028-3932(01)00002-1.
- Leeman RF, Potenza MN. Similarities and differences between pathological gambling and substance use disorders: a focus on impulsivity and compulsivity. Psychopharmacology. 2012;219:469–490. doi: 10.1007/s00213-011-2550-7.
- Leiserson V, Pihl RO. Reward-sensitivity, inhibition of reward-seeking, and dorsolateral prefrontal working memory function in problem gamblers not in treatment. J Gambl Stud. 2007;23:435–455. doi: 10.1007/s10899-007-9065-5.
- Lesieur HR, Blume SB. The South Oaks Gambling Screen (SOGS): a new instrument for the identification of pathological gamblers. Am J Psychiatry. 1987;144:1184–1188. doi: 10.1176/ajp.144.9.1184.
- London ED, Ernst M, Grant S, Bonson K, Weinstein A. Orbitofrontal cortex and human drug abuse: functional imaging. Cereb Cortex. 2000;10:334–342. doi: 10.1093/cercor/10.3.334.
- Martin-Soelch C, Chevalley AF, Künig G, Missimer J, Magyar S, Mino A, Schutz W, Leenders KL. Changes in reward-induced brain activation in opiate addicts. Eur J Neurosci. 2001;14:1360–1368. doi: 10.1046/j.0953-816x.2001.01753.x.
- Moran E. Varieties of pathological gambling. Br J Psychiatry. 1970;116:593–597. doi: 10.1192/bjp.116.535.593.
- Patton JH, Stanford MS, Barratt ES. Factor structure of the Barratt Impulsiveness Scale. J Clin Psychol. 1995;51:768–774. doi: 10.1002/1097-4679(199511)51:6<768::aid-jclp2270510607>3.0.co;2-1.
- R Core Team. Vienna, Austria: R Foundation for Statistical Computing; 2012. R: A Language and Environment for Statistical Computing.
- Reuter J, Raedler T, Rose M, Hand I, Gläscher J, Büchel C. Pathological gambling is linked to reduced activation of the mesolimbic reward system. Nat Neurosci. 2005;8:147–148. doi: 10.1038/nn1378.
- Robins LN, Cottler L, Bucholz K, Compton W. St. Louis, MO: Washington University; 1995. The Diagnostic Interview Schedule, Version IV.
- Rolls ET. The functions of the orbitofrontal cortex. Brain Cogn. 2004;55:11–29. doi: 10.1016/S0278-2626(03)00277-X.
- Rolls ET, Hornak J, Wade D, McGrath J. Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. J Neurol Neurosurg Psychiatry. 1994;57:1518–1524. doi: 10.1136/jnnp.57.12.1518.
- Schoenbaum G, Shaham Y. The role of orbitofrontal cortex in drug addiction: a review of preclinical studies. Biol Psychiatry. 2008;63:256–262. doi: 10.1016/j.biopsych.2007.06.003.
- Tsuchida A, Doll BB, Fellows LK. Beyond reversal: a critical role for human orbitofrontal cortex in flexible learning from probabilistic feedback. J Neurosci. 2010;30:16868–16875. doi: 10.1523/JNEUROSCI.1958-10.2010.
- Van Holst RJ, van den Brink W, Veltman DJ, Goudriaan AE. Brain imaging studies in pathological gambling. Curr Psychiatry Rep. 2010;12:418–425. doi: 10.1007/s11920-010-0141-7.
- Van Holst RJ, Veltman DJ, Büchel C, van den Brink W, Goudriaan AE. Distorted expectancy coding in problem gambling: is the addictive in the anticipation? Biol Psychiatry. 2012;71:741–748. doi: 10.1016/j.biopsych.2011.12.030.
- Volkow ND, Fowler JS. Addiction, a disease of compulsion and drive: involvement of the orbitofrontal cortex. Cereb Cortex. 2000;10:318–325. doi: 10.1093/cercor/10.3.318.
- Wechsler D. San Antonio: The Psychological Corp; 1981. WAIS-R Manual.
- World Health Organization. Geneva: World Health Organization; 1997. Composite International Diagnostic Interview—Version 2.1.
- Wrase J, Grüsser SM, Klein S, Diener C, Hermann D, Flor H, Mann K, Braus DF, Heinz A. Development of alcohol-associated cues and cue-induced brain activation in alcoholics. Eur Psychiatry. 2002;17:287–291. doi: 10.1016/s0924-9338(02)00676-4.

