Author manuscript; available in PMC: 2016 Mar 1.
Published in final edited form as: Neuropsychopharmacology. 2015 Jul 15;41(4):940–948. doi: 10.1038/npp.2015.208

Biases in the explore-exploit tradeoff in addictions: the role of avoidance of uncertainty

Laurel S Morris 1, Kwangyeol Baek 2, Prantik Kundu 2,3, Neil A Harrison 4, Michael J Frank 5, Valerie Voon 1,2,6,7
PMCID: PMC4650253  EMSID: EMS64108  PMID: 26174598

Abstract

We focus on exploratory decisions across disorders of compulsivity, a potential dimensional construct for the classification of mental disorders. Behaviours associated with the pathological use of alcohol or food, in alcohol use disorders (AUD) or binge-eating disorder (BED), suggest a disturbance in explore-exploit decision-making, whereby strategic exploratory decisions in an attempt to improve long-term outcomes may diminish in favour of more repetitive or exploitatory choices. We compare exploration versus exploitation across disorders of natural (obesity with and without BED) and drug rewards (AUD). We separately acquired resting state functional MRI data using a novel multi-echo planar imaging sequence and independent components analysis from healthy individuals to assess the neural correlates underlying exploration. Participants with AUD showed reduced exploratory behaviour across gain and loss environments, leading to lower-yielding exploitatory choices. Obese subjects with and without BED did not differ from healthy volunteers, but when compared to each other or to AUD subjects, BED subjects had enhanced exploratory behaviours, particularly in the loss domain. All subject groups had decreased exploration or greater uncertainty avoidance to losses compared to rewards. More exploratory decisions in the context of reward were associated with frontal polar and ventral striatal connectivity. For losses, exploration was associated with frontal polar and precuneus connectivity. We further implicate the relevance and dimensionality of constructs of compulsivity across disorders of both natural and drug rewards.

Introduction

Tricky decisions arise almost daily, from the mundane (should I try something new for lunch today?) to the more exotic (should I move to a different city?). To navigate a dynamic world, individuals must adapt behaviour and consider the trade-off between exploring an uncertain environment for the potential to improve beyond the status quo and exploiting known reward sources, in the hope of maintaining optimal decision-making. Behaviours associated with the pathological use of alcohol or food, in alcohol use disorders (AUD) or binge-eating disorder (BED), might suggest a disturbance in explore-exploit decision-making, whereby strategic exploratory decisions in an attempt to improve long-term outcomes may diminish in favour of more repetitive or exploitatory choices. Here we aim to further characterize the trade-off between exploring the uncertain and exploiting the known in these groups.

Faced with an explore-exploit dilemma, one may initially randomly sample the environment and gradually reduce the probability of choosing each action with increasing outcome knowledge. However, stochastic choice rules of this kind describe only random exploration and do not take into account the amount of information that could be gained by sampling an unknown choice. Instead, choices may be directed by the amount of information gained by an exploratory choice1-3. Within this framework, the level of certainty that a choice will engender a better than expected outcome will influence exploratory choice. Using a temporal utility decision-making task, a recent study provided support for this assumption; the inclusion of an uncertainty term in computational modeling of trial-by-trial choices provided a superior description of exploratory choice2. Thus, behavioural measures that are not accounted for by positive and negative prediction error updating can instead be explained as exploratory adjustments towards uncertainty1,4.

At a neural level, the frontopolar cortex (FPC) and intraparietal sulcus have been implicated in exploratory behaviours5. With widespread cortical and subcortical anatomical and functional connectivity6, the FPC sits at the top of a hierarchical behavioural control system, evaluating heterogeneous inputs for reward-related cognitive task integration in the pursuit of an advanced behavioural goal7-9. Activity in FPC increases and decreases with exploratory and exploitative decisions, respectively5. In line with the role of uncertainty in driving exploratory choice, the lateral FPC has been shown to track the relative uncertainty of choices when exploratory choices are made, preferentially in those subjects who use an uncertainty-guided exploration strategy1,4. Striatal dopamine function, marked by functional polymorphisms in dopaminergic genes, has also been associated with exploitative decision-making by modulating learning from positive and negative prediction errors2.

We focus on exploratory decisions across disorders of compulsivity, a potential dimensional construct for the classification of mental disorders in line with recent Research Domain Criteria (RDoC) strategies10. Compulsivity can be described as repetitions of deleterious choices, which remain insensitive to changes in outcome contingencies and occur despite negative consequences11,12. An outstanding question is to what extent exploratory choices are altered in disorders of compulsivity. We have recently shown that binge-eating, a compulsive pattern of food intake, presents similar behavioural characteristics to drug taking disorders including greater risk-taking for rewards13 and greater reliance on habitual learning strategies12. Binge-eating behaviour provides a means of distinguishing crucial subtypes within obesity.

With a task previously shown to elicit uncertainty-driven exploratory decision-making behaviour in humans1,2,4,14, we compare, on a behavioural level, exploration versus exploitation across disorders of natural (obesity with and without binge-eating disorder) and drug rewards (alcohol use disorders). We expect a trans-pathological marker of reduced strategic uncertainty-driven exploratory behaviours compared to healthy volunteers. We separately acquired resting state functional MRI (rsfMRI) data from healthy individuals to assess the neural correlates underlying exploration. We use a novel multi-echo planar imaging sequence and independent components analysis (ME-ICA) to separate blood oxygen level dependent (BOLD) from non-BOLD activity. This acquisition and analysis greatly enhance signal-to-noise ratios compared to traditional single-echo sequences, thus allowing higher spatial resolution15. We focus on the connectivity of the FPC and hypothesize that connectivity with ventral striatum (reward valuation) and inferior parietal cortex (action implementation) is associated with exploratory behaviours in the context of reward. We secondarily assess exploration in the context of loss, expecting a similar network including FPC and inferior parietal cortex.

Methods

Participants

We recruited healthy volunteers (HV) from community and University-based advertisements in the East Anglia region, United Kingdom. The recruitment strategy for patient groups has been reported elsewhere16. For all patient groups primary diagnoses were confirmed by a psychiatrist using the Diagnostic and Statistical Manual of Mental Disorders, Version IV criteria for substance dependence or Research Diagnostic Criteria for BED17. Written informed consent was obtained and the study was approved by the University of Cambridge Research Ethics Committee. The same subjects completed the behavioural task outside of the scanner and underwent the rsfMRI scan. For further information see Supplemental Materials.

Task

We used a task previously shown to elicit exploratory decision-making behaviour in humans1,2. Participants viewed a clock arm that rotated at 5 seconds per revolution (Figure 1). Participants were instructed to press the space bar before a full turn of the arm to win and were informed that the time at which the arm was stopped would determine how much money was won. The outcome (£0-£200) was revealed for 1 second followed by an inter-trial interval of 300 ms. There were 40 trials per condition. An early key press did not affect the total time of the task, and subjects were instructed to stop the clock at different times to maximize their potential winnings.
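The timing figures above can be turned into a rough estimate of how long one condition takes. The sketch below assumes that every trial occupies the full 5-second revolution regardless of when the key is pressed, which is one reading of the statement that an early key press did not affect total task time; the function names are illustrative and not taken from the task code.

```python
# Timing constants taken from the task description.
REVOLUTION_S = 5.0         # one full turn of the clock arm
FEEDBACK_S = 1.0           # outcome display
ITI_S = 0.3                # inter-trial interval (300 ms)
TRIALS_PER_CONDITION = 40


def trial_duration_s() -> float:
    """Duration of one trial, assumed independent of the response time."""
    return REVOLUTION_S + FEEDBACK_S + ITI_S


def condition_duration_s(n_trials: int = TRIALS_PER_CONDITION) -> float:
    """Total duration of one condition under the fixed-trial-length assumption."""
    return n_trials * trial_duration_s()


if __name__ == "__main__":
    print(f"one trial: {trial_duration_s():.1f} s")
    print(f"one condition: {condition_duration_s():.1f} s")
```

Under this assumption, responding early gains no time, so response time can be chosen purely on the basis of expected outcome and uncertainty.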

Figure 1.


Exploratory behaviour in disorders of natural and drug rewards. A. Participants viewed a rotating clock and were instructed to stop the clock in order to win money or avoid losing money. The time at which the clock was stopped determined how much was won or lost. Exploratory choices are those that had not been previously sampled. B. Explore-exploit index (represented in units of milliseconds per unit standard deviation of the belief distributions) in alcohol use disorders (AUD) was lower than matched healthy volunteers (HV) (group effect; p=0.003). C. Comparing obese subjects with binge-eating disorder (BED) and without (Obese) with matched HV revealed no group differences. Comparing BED and Obese revealed a group difference (p=0.04) when controlled for age, gender and smoking status. D. Comparing current smokers and non-smokers in healthy volunteers revealed a group × valence interaction (p=0.03).

In the previously described task, outcomes varied in probability and magnitude as a function of response time (RT) such that expected value increased, decreased or remained constant with increasing response times. In the current version of the task, we used only the conditions in which expected value was constant across the whole clock (but with different outcome frequencies and magnitudes), as these engender the most exploratory decisions. The increasing and decreasing conditions were replaced with a duplicate set of constant expected value conditions (both CEV and CEVReverse), but for which the outcomes were losses instead of gains. This allows us to assess whether participants use the same uncertainty-driven exploration strategy in the domain of losses, whether they are more averse to uncertainty in that case and whether compulsive individuals show any difference not only in exploration but in its modulation by valence. Exploratory choices are those made for clock arm positions (coarsely, fast vs. slow portions of the clock) for which reward outcomes were more uncertain given previous samples1. The relationship between the probability of winning or losing, the outcome magnitude and clock position was random and hence was not associated with learning. Further task details and the computational model are reported in Supplemental Materials.
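To make the idea of uncertainty-driven exploration concrete, the following is a minimal sketch, not the model reported in the Supplemental Materials. It assumes the general form used in the prior work cited above: a belief distribution is kept for fast and for slow responses, and response time is shifted towards the more uncertain class in proportion to the difference in their standard deviations. The Beta-distribution beliefs, the class `BetaBelief` and the function `explore_shift_ms` are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta belief that a response class yields a better-than-expected outcome."""
    a: float = 1.0  # pseudo-count of better-than-expected outcomes
    b: float = 1.0  # pseudo-count of worse-than-expected outcomes

    def update(self, better_than_expected: bool) -> None:
        if better_than_expected:
            self.a += 1.0
        else:
            self.b += 1.0

    def sd(self) -> float:
        """Standard deviation of a Beta(a, b) distribution."""
        n = self.a + self.b
        return math.sqrt(self.a * self.b / (n * n * (n + 1.0)))


def explore_shift_ms(epsilon: float, fast: BetaBelief, slow: BetaBelief) -> float:
    """RT shift in ms; epsilon is in ms per unit SD of the belief distributions.

    With positive epsilon the shift points towards the more uncertain class
    (positive = slow down); negative epsilon reverses it (uncertainty avoidance).
    """
    return epsilon * (slow.sd() - fast.sd())


if __name__ == "__main__":
    fast, slow = BetaBelief(), BetaBelief()
    for i in range(10):          # fast responses sampled often, slow never
        fast.update(i % 2 == 0)
    print(explore_shift_ms(1000.0, fast, slow))   # explorer: shifts towards slow
    print(explore_shift_ms(-1000.0, fast, slow))  # uncertainty-averse: shifts towards fast
```

In this reading, the exploration index compared between groups plays the role of `epsilon`: a lower or negative value means that choices are drawn away from the less-sampled part of the clock.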

The model parameters were inspected for normality of distribution using the Shapiro-Wilk test. For the exploration parameter, each group was compared to their own matched HV and assessed using mixed measures ANOVA with within-subject factor of Valence (gain, loss) and between-subject factor of Group. The BED and Obese subjects were also directly compared. Data that were skewed (learning rates, ρ) were analyzed using Mann-Whitney U tests.
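For readers unfamiliar with the rank-based comparison used for skewed parameters, the sketch below computes the Mann-Whitney U statistic directly from its pairwise definition. It is illustrative only: it returns the statistic without a p-value, the helper name `mann_whitney_u` is an assumption, and the numbers are made up rather than taken from the study. An actual analysis would normally rely on a statistics package.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (xi, yj) in which xi exceeds yj; ties count as 0.5.
    The two statistics always sum to len(x) * len(y).
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


if __name__ == "__main__":
    group_a = [0.02, 0.05, 0.11, 0.40]   # made-up skewed values
    group_b = [0.01, 0.03, 0.04, 0.09]
    print(mann_whitney_u(group_a, group_b))
```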

Resting state functional MRI

We employed a novel multi-echo planar sequence and independent components analysis (ME-ICA) in which BOLD signals were identified as independent components having linear TE-dependent signal change and non-BOLD signals were identified as TE-independent components15. Spatial smoothing was conducted with a Gaussian kernel (full width half maximum = 6 mm). CONN-fMRI Functional Connectivity toolbox19 for Statistical Parametric Mapping SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/) was used for functional connectivity analysis. A strictly defined region of interest (ROI) for the frontal polar cortex was used, based on strong a priori hypotheses5, to compute ROI-to-voxel connectivity maps. These maps were entered into second level correlation analysis with exploration behavioural measures, using cluster extent threshold correction20 calculated at 15 voxels at p<0.001 whole brain uncorrected, which corrects for multiple comparisons at p<0.05 assuming an individual-voxel Type I error of p=0.01. Further details are reported in Supplemental Materials.
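The distinction between TE-dependent and TE-independent signal change can be illustrated with a toy fit. The sketch below assumes a single component whose percent signal change has been measured at several echo times, and compares a model that is linear in TE (through the origin) with a model that is constant across TE. It only illustrates the principle: the published ME-ICA procedure uses its own component-level statistics, and the echo times and function names here are hypothetical.

```python
def fit_te_models(te_ms, pct_change):
    """Fit two one-parameter least-squares models across echo times.

    BOLD-like:     change = slope * TE   (linear in TE, through the origin)
    non-BOLD-like: change = constant     (independent of TE)
    Returns (sse_bold, sse_nonbold), the residual sums of squares.
    """
    slope = sum(t * c for t, c in zip(te_ms, pct_change)) / sum(t * t for t in te_ms)
    const = sum(pct_change) / len(pct_change)
    sse_bold = sum((c - slope * t) ** 2 for t, c in zip(te_ms, pct_change))
    sse_nonbold = sum((c - const) ** 2 for c in pct_change)
    return sse_bold, sse_nonbold


def looks_bold(te_ms, pct_change) -> bool:
    """True when the TE-dependent model fits better than the TE-independent one."""
    sse_bold, sse_nonbold = fit_te_models(te_ms, pct_change)
    return sse_bold < sse_nonbold


if __name__ == "__main__":
    echo_times = [12.0, 28.0, 44.0, 60.0]            # hypothetical echo times in ms
    scaling = [0.02 * t for t in echo_times]         # grows with TE
    flat = [1.0, 1.0, 1.0, 1.0]                      # same at every TE
    print(looks_bold(echo_times, scaling), looks_bold(echo_times, flat))
```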

Results

The subject characteristics have been previously reported12,13. Thirty-two AUD subjects (weeks abstinent 16.62 (SD 16.72); years of dependence 13.67 (SD 9.40); Units/day 27.28 (SD 13.95), on the following medications (acamprosate 2; disulfiram 1)), 31 Obese with BED and 30 Obese without BED were matched with their own age- and gender-matched HV (N=55 for each group). AUD and Obese with BED had higher depression scores compared to healthy volunteers. Obese with and without BED had higher body mass index and Obese with BED had higher Binge Eating Scale (BES) scores.

Behavioural characterization of explore-exploit dilemma across disorders of natural and drug rewards

The data from 1 healthy volunteer and 1 AUD were removed as they were greater than 3 SD above the group mean. The Exploration indices for gain and for loss were square root transformed.
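The exclusion rule described above can be sketched as follows. This is an illustration under stated assumptions: the mean and sample standard deviation are computed within a group with the candidate outlier included, and only values above the cutoff are dropped. The function name and the example values are hypothetical and are not study data.

```python
import statistics


def exclude_high_outliers(values, n_sd=3.0):
    """Drop values lying more than n_sd sample SDs above the group mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    cutoff = mean + n_sd * sd
    return [v for v in values if v <= cutoff]


if __name__ == "__main__":
    group = [1.0] * 10 + [2.0] * 9 + [100.0]   # made-up values with one extreme score
    kept = exclude_high_outliers(group)
    print(len(group), "->", len(kept))
```

Note that with very small groups no single value can exceed the mean by 3 SD when the SD includes that value, so the rule only becomes effective at moderate sample sizes.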

Exploration indices were compared between gain and loss separately for each subject group using repeated measures ANOVA with smoking status as a covariate of no interest. Greater exploratory behaviour in the context of gain compared to loss was observed in healthy volunteers (F(1,94)=511.77, p<0.001), AUD (F(1,29)=178.99, p<0.001), Obese subjects (F(1,28)=109.17, p<0.001) and in BED (F(1,29)=72.10, p<0.001), supporting the interpretation that subjects are averse to uncertainty in the context of losses, possibly for fear that their exploratory choices could yield yet worse outcomes.

In the AUD comparison to HV, there was a main Valence effect (F(1,84)=36.00, p<0.001) and a Group effect (F(1,84)=6.69, p=0.003) in which AUD subjects had lower Exploration indices compared to HV with no interaction effect (F(1,84)=0.032, p=0.858) (Figure 1). With the addition of smoking status as a covariate of no interest, the Group effect remained significant (p=0.035).

In the BED comparison to HV, there was a main Valence effect (F(1,84)=4187.31, p<0.001) and no Group (F(1,84)=0.46, p=0.499) or interaction effect (F(1,84)=1.50, p=0.224). In the Obese comparison to HV, there was a main Valence effect (F(1,83)=4105.23, p<0.001) and no Group (F(1,83)=2.00, p=0.161) or interaction effect (F(1,83)=0.17, p=0.683). We then compared the BED and Obese subjects which showed a trend towards a Group difference (F(1,56)=3.47, p=0.068). With the addition of age, gender and smoking status as covariates of no interest, we show a main Valence effect (F(1,56)=58.39, p<0.001) and main Group effect (F(1,56)=4.60, p=0.037) in which Obese subjects had lower exploration indices than BED. There was a trend towards an interaction between Group × Valence (F(1,56)=3.48, p=0.068). Posthoc analysis revealed significant differences between groups in the loss (p=0.041) but not gain condition (p=0.405).

We also compared AUD to BED subjects with age, gender and smoking status as covariates of no interest, showing a main Group effect (F(1,54)=9.19, p=0.004) in which AUD subjects were less exploratory than BED subjects; a main Valence effect (F(1,54)=50.94, p<0.001); and a Group × Valence interaction (F(1,54)=8.60, p=0.005). Posthoc testing revealed a significant Group difference in the Loss domain only (p=0.003) in which BED subjects were more exploratory compared to AUD subjects.

On an exploratory basis, we examined the influence of smoking status in healthy volunteers. We identified 13 current smokers and 83 current non-smokers and compared these using mixed measures ANOVA. There was a main Valence effect (F(1,94)=511.77, p<0.001) and a Group × Valence interaction (F(1,94)=5.02, p=0.027) in which smokers made more exploratory choices under gain and fewer exploratory choices under loss compared to non-smokers (Figure 1). There was no main Group effect (F(1,94)=1.76, p=0.187).

The other parameter fits were also compared between AUD and HV and between Obese subjects with and without BED. There were no differences in the other parameters (Table 2, Figure S1). There were no correlations between the Exploration indices and measures of alcohol severity, BMI or BES.

Table 2.

Best fitting model parameters and model fit

| Group | K | λ | α Gain | α Loss | ρ | ν | SSE |
|---|---|---|---|---|---|---|---|
| HV (AUD) | 1016.95 (356.61) | 0.35 (0.09) | 0.15 (0.33) | 0.08 (0.20) | 257.66 (416.14) | 0.35 (0.09) | 5646.64 (742.33) |
| AUD | 1007.80 (385.15) | 0.34 (0.11) | 0.11 (0.22) | 0.02 (0.03) | 362.32 (646.31) | 0.38 (0.10) | 5441.61 (585.11) |
| t | 0.12 | 0.80 | | | −0.93 | | 1.399 |
| p-value | 0.907 | 0.426 | 0.496* | 0.128* | 0.354 | 0.271* | 0.166 |
| Obese | 1227.34 (376.54) | 0.30 (0.10) | 0.19 (0.37) | 0.13 (0.25) | 545.12 (729.82) | 0.36 (0.10) | 5424.07 (780.23) |
| BED | 1064.25 (463.31) | 0.33 (0.10) | 0.15 (0.33) | 0.12 (0.33) | 271.84 (486.92) | 0.39 (0.13) | 5420.22 (1058.00) |
| t | −1.55 | 1.09 | | | −1.78 | | −0.02 |
| p-value | 0.127 | 0.280 | 0.476* | 0.133* | 0.080 | 0.130* | 0.987 |

Abbreviations: HV = healthy volunteers; AUD = alcohol use disorder; BED = binge eating disorder; SSE = summed square of residuals (model fit)

* Mann-Whitney test

Frontal polar cortex connectivity and exploration

Of the participants that completed the task, thirty-seven healthy volunteers (20 male; mean age 35, SD 15; verbal IQ 115, SD 11) underwent resting state fMRI with a multi-echo resting state sequence. This acquisition and analysis greatly enhance signal-to-noise ratios compared to traditional techniques and provide enhanced spatial resolution based on robust physical principles15. The explore/exploit task was performed outside the scanner. The frontal polar cortex (FPC) was carefully defined and used as a seed. Connectivity was quantified by calculating Pearson correlation coefficients between activity within the seed and the whole brain, producing seed-to-voxel whole-brain connectivity maps. These maps were then correlated with the behavioural measure of exploration. Age was included as a covariate of no interest.
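The two-stage logic described above (a seed-to-voxel correlation within each subject, then a correlation of connectivity with behaviour across subjects) can be sketched in a few lines. This is a simplified illustration rather than the toolbox pipeline: it omits preprocessing and the age covariate, and the function names and toy data are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def seed_to_voxel_map(seed_ts, voxel_ts):
    """First level: one correlation per voxel between seed and voxel time series."""
    return [pearson_r(seed_ts, ts) for ts in voxel_ts]


def brain_behaviour_r(connectivity_per_subject, exploration_per_subject):
    """Second level: across subjects, correlate one voxel's connectivity with exploration."""
    return pearson_r(connectivity_per_subject, exploration_per_subject)


if __name__ == "__main__":
    seed = [1.0, 2.0, 3.0, 4.0]                       # toy seed time series
    voxels = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0], [1.0, 3.0, 2.0, 4.0]]
    print(seed_to_voxel_map(seed, voxels))
```

In the study design, the second-level step is repeated at every voxel, which is why the resulting map then needs a correction for multiple comparisons.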

Cluster-extent threshold analysis20 (calculated at 15 voxels at p<0.001 whole-brain uncorrected, correcting for multiple comparisons at p<0.05 assuming an individual-voxel Type I error of p=0.01) revealed that exploration in the context of reward was positively correlated with FPC and ventral striatal connectivity (peak coordinates x y z = −22, 21, −10 mm; Cluster size = 32; Z = 4.38, Figure 2). In the context of loss, greater exploration was positively correlated with greater FPC and precuneus connectivity (peak x y z = −1, −41, 42 mm; Cluster size = 24; Z = 3.61).
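A cluster-extent threshold keeps only groups of contiguous supra-threshold voxels that reach a minimum size. The sketch below shows the filtering step alone, assuming that the voxel-wise threshold has already been applied and that voxels count as contiguous when they share a face; both the connectivity criterion and the function name are assumptions, and the derivation of the 15-voxel extent itself is not reproduced here.

```python
from collections import deque

# Face-sharing neighbours in a 3D grid (an assumed connectivity criterion).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def clusters_above_extent(mask, min_voxels=15):
    """Return clusters (sets of voxel indices) with at least min_voxels members.

    mask: iterable of (x, y, z) indices that survived the voxel-wise threshold.
    """
    unvisited = set(mask)
    kept = []
    while unvisited:
        start = unvisited.pop()
        cluster = {start}
        queue = deque([start])
        while queue:                      # breadth-first flood fill
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in unvisited:
                    unvisited.remove(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_voxels:
            kept.append(cluster)
    return kept


if __name__ == "__main__":
    big = {(x, 0, 0) for x in range(20)}       # 20 contiguous voxels
    small = {(x, 10, 10) for x in range(5)}    # 5 contiguous voxels
    print([len(c) for c in clusters_above_extent(big | small)])
```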

Figure 2.


Neural correlates of exploratory decisions in healthy volunteers. Left, frontal polar cortex seed. Seed-to-whole brain connectivity maps were correlated with exploratory behaviours in the context of reward and loss. Middle, regions whose functional connectivity with frontal polar cortex correlated with exploratory behaviour. Right, parameter estimates extracted for illustrated peak coordinates are correlated with the exploratory behaviour.

Finally we map connectivity of the FPC to the whole brain. At whole-brain FWE p<0.05 we find that FPC is functionally connected with a network including dorsolateral prefrontal cortex, precuneus, inferior parietal and subcortically, the ventral striatum (Figure 3, Table 3).

Figure 3.


Frontal polar cortex connectivity. A frontal polar cortex seed was correlated with the whole brain to produce seed-to-voxel functional connectivity maps. The connectivity map is displayed at p=0.001 uncorrected for illustration.

Table 3.

Statistics of frontal polar cortex and whole brain connectivity.

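| Region | p(FWE-corr) | Cluster size | Z | x | y | z |
|---|---|---|---|---|---|---|
| Frontal Cortex (including medial PFC and anterior cingulate) | <0.001 | 10095 | >8 | −29 | 66 | 7 |
| | | | >8 | 34 | 63 | −3 |
| | | | >8 | 27 | 66 | 2 |
| Parietal Cortex | <0.001 | 1205 | >8 | 48 | −58 | 49 |
| | <0.001 | 896 | >8 | −41 | −62 | 53 |
| | | | >8 | −43 | −60 | 42 |
| Cerebellum | <0.001 | 1280 | >8 | −43 | −67 | −38 |
| | | | 7.49 | −13 | −81 | −28 |
| | | | 6.32 | −22 | −83 | −26 |
| Dorsolateral PFC | <0.001 | 1060 | >8 | −45 | 26 | 44 |
| | | | >8 | −24 | 31 | 53 |
| | | | 6.78 | −34 | 17 | 56 |
| Posterior Cingulate (including precuneus) | <0.001 | 1482 | >8 | −1 | −41 | 35 |
| | | | 7.77 | 1 | −30 | 39 |
| | | | 7.08 | −1 | −69 | 42 |
| Temporal Cortex | <0.001 | 483 | 7.63 | 69 | −16 | −14 |
| | | | 7.61 | 66 | −34 | −12 |
| | <0.001 | 149 | 6.26 | −66 | −39 | −17 |
| | 0.018 | 2 | 4.9 | −62 | −20 | −7 |
| Anterior Insula | <0.001 | 23 | 5.36 | 41 | 19 | −10 |
| | | | 4.96 | 34 | 24 | −10 |
| Ventral Striatum | 0.013 | 3 | 5.04 | −10 | 17 | 0 |
| | 0.018 | 2 | 4.97 | 13 | 17 | 4 |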

Abbreviations: PFC, prefrontal cortex; p(FWE-corr), whole brain (p<0.05) family wise error corrected p value; Z, Z-score; xyz, peak voxel coordinates.

Discussion

We employed a choice task previously used to demonstrate strategic exploratory decision-making behaviour in healthy humans1,2. All groups show a conserved effect of valence such that exploration was higher in the reward domain compared to the loss domain. Indeed, in the loss domains subjects showed a consistently negative exploration parameter, meaning that they were averse to uncertainty when there was some prospect of losing even more. These findings potentially reflect the asymmetrical influence of gains and losses on choice behaviour21 imposed by the strength of loss aversion as a consistent mediator of choice22.

Exploratory behaviour in subjects with AUD was reduced across gain and loss environments, in favour of more repetitive or exploitative choices. Obese subjects with and without binge-eating disorder (BED) did not differ from healthy volunteers in their exploratory choices. However, when the two obese groups were compared to each other, BED subjects showed greater exploratory behaviour than those without BED. There was a trend towards a Group × Valence interaction driven by greater exploratory behaviours to losses in BED subjects compared to those without BED. Similarly, BED subjects had greater exploratory behaviours, particularly to losses, compared to AUD subjects. Furthermore, we investigated the influence of smoking in healthy volunteers on a pilot basis: current smokers showed an enhancement of the influence of valence with greater exploration to gain outcomes and less exploration to loss outcomes compared to non-smokers. Exploratory behaviour in healthy volunteers was associated with an underlying network including frontal polar cortex (FPC) and ventral striatal connectivity in the context of reward and FPC and precuneus for losses.

Compared to healthy volunteers, AUD subjects had restricted exploratory behaviours and were more likely to avoid uncertainty across both gain and loss stimulus-outcomes in a task that is independent of learning. AUD subjects have been shown to have abnormalities in decision making under ambiguity or uncertainty as measured using the Iowa Gambling Task23,24. Our findings extend these results to suggest either intolerance / avoidance of uncertainty, or a reduced tendency to use a controlled strategy that searches for uncertain outcomes so as to maximize information gain. The current findings of reduced exploration in an unknown environment dovetail with findings suggesting that the effects of alcohol are selective for uncertainty-related anxiety rather than certainty-related fear25, the former being hypothesized to drive the negative-reinforcement cycle of alcohol use26. An alternate explanation may be that changes in outcome sensitivity, rather than uncertainty avoidance, engender reluctance to explore. However, decreased sensitivity to outcome may be more likely to manifest as greater exploration to sample further stimulus outcome contingencies. Although we do not explicitly measure the role of novelty, decreased exploration may relate to the possible presence of novel environments. Ethanol withdrawal in rodents indeed causes reduced exploration of brightly lit chambers27.

Furthermore, like healthy volunteers, AUD subjects had decreased exploratory behaviours to losses compared to gains suggesting sensitivity to their differential influences. Current smokers also have an enhancement of this differential effect of valence with greater exploratory behaviours to gains and the opposite to losses relative to non-smokers. The enhancement in exploration for gains is in line with enhanced reward sensitivity related to nicotine use34. This finding invites the suggestion that participants who are more likely to explore the potential hedonic benefits of smoking are those that become smokers. The findings in the loss domain suggest a potential role for enhanced loss aversion in smokers with greater avoidance of uncertainty in a loss context, perhaps facilitating sustained smoking in the presence of perceived small losses associated with immediate health consequences, rather than explore alternative strategies that would require giving up smoking for potentially other (e.g. social) losses. Although losses in the form of social and health cost are difficult to model, the secondary reinforcer of money can act as a proxy. These findings in AUD and smokers may be consistent with the negative reinforcement model of addiction32,33 whereby a negative context may drive exploitative repetitive behaviours to avoid losses. Reduced exploration, or more repetitive choices, in the face of losses is consistent with theories that neuroadaptive systems driving aversive states lead to repetitive drug-seeking behaviours26. Indeed, negative affect in smokers is associated with craving severity38. Together with the current findings, this may explain how particular environmental influences (i.e. negative outcomes in the form of financial, social or health losses) may facilitate the repetition of behaviours with certain, known outcomes, such as pathological drinking and smoking behaviours. 
Although these findings are intriguing, we caution that the findings in smokers are preliminary as the sample size of current users is small, and we cannot rule out an impact of nicotine etc. on exploration rather than the other direction of causality.

That subjects display reduced exploration for losses contrasts with the observation of enhanced ambiguity seeking in the face of losses in healthy humans28,29. However, this discrepancy is also similar to the observation of ambiguity aversion in the face of gains, despite exploration towards uncertain options in that case. The main difference is that in a learning task, choosing an ambiguous option can serve to reduce subsequent ambiguity, i.e. exploration drives learning. In the case of losses, it is thus perhaps surprising that subjects do not seek uncertain options to reduce subsequent ambiguity. In addition, the current study deals with explicit and experienced uncertainty rather than hypothetical ambiguity. The effect of valence on risky choice has been shown to be reversed when choices are either experience or description–based, with the former reducing risk-seeking for losses30 consistent with our findings. Furthermore, there may be at least two strategies for approaching an explore-exploit dilemma: choice biased towards information seeking; and random exploratory decisions involving chance31 and perhaps subjects adopted a strategy to simply increase random choices in the case of losses rather than rely on uncertainty.

Our findings show decreased exploration in Obese subjects without BED as compared to with BED suggesting differences as a function of greater avoidance of uncertainty. BED subjects appear to be more biased towards exploratory behaviour but particularly in the context of losses and not to gains, that is, the opposite profile from smokers. These findings are similarly evident in the comparison of AUD and BED subjects in which BED have greater exploratory behaviours and particularly in the loss domain. This dissociation of valence coincides with previous work showing that BED subjects demonstrate greater risk taking for high probability losses only13 possibly suggesting less of an influence of loss aversion. These findings suggest differences between AUD and BED subjects particularly in the loss domain. Whether the distinct rewards of choice (natural or drug) are responsible for causing increased or decreased exploration in the face of loss or whether they are a product of an inherent attraction or aversion to exploration, remains a question for future studies. The suggestion that neuroadaptive negative reinforcement systems are initiated or propagated by excessive reward system activation32, may explain the current finding of heightened sensitivity to losses in smokers and individuals with AUD, but not in BED, whereby nicotine and alcohol hijack the reward system to a greater degree than food. Moreover, we note that the negative consequences of binge eating on weight gain are far more immediate than those of smoking, which are perceived to be delayed and subject to potential quitting.

Our findings further highlight a role for an intrinsic network of FPC connectivity in exploration biases. The FPC sits at the outermost periphery of the hierarchical prefrontal control regions7,8, being well poised to mediate higher level strategic switches rather than behavioural sequence control. Accumulating evidence suggests that through interactions with social/emotional network (orbitofrontal cortex, amygdala), cognitive network (dorsolateral prefrontal cortex) and default mode network (precuneus, anterior cingulate cortex)6, the FPC orchestrates more flexible and self-relevant behavioural control in the pursuit of optimal decision-making8. We show that FPC and ventral striatal connectivity is associated with exploration in the context of a rewarding environment. This coincides with the notion that the FPC coordinates voluntary and adaptive switching based on uncertainty and expected value1,5. Exploration may depend on the probability that an explored choice will provide a better outcome than expected based on previous experiences (a positive prediction error)2. It is thus possible that the FPC-VS connectivity implies a reward value assignment to the potential for exploring. This would not be expected in the context of losses because the value of exploring is only to reduce loss values rather than provide a positive outcome.

We also show that FPC and precuneus connectivity positively correlates with exploration in the loss domain. Although the precuneus has been traditionally associated with integration of visuo-spatial imagery39, converging evidence suggests a role in integration of external and self-relevant information40. Furthermore, goal-directed hand movements41 and voluntary attentional shifts between targets, even in the absence of an overt motor response42, are mediated by the precuneus. Functional links between FPC and the default mode network6 support its role in processing internal rather than external generation of information7 to guide future-focused43 decision-making. The current findings suggest that while assignment of perceived agency to actions and encoding and organizing of intentions is mediated by the precuneus, it may interact with the FPC6 which in turn processes internally-generated goals for behavioural control9,43. Further evidence of the role of the precuneus in exploratory choices comes from studies of foraging behaviour. Humans may alternate between economic decisions and choices governed by sequential ‘engage or search elsewhere’ foraging choices44. Foraging choices (compared to decisions between two options) have been associated with activations in the precuneus extending to posterior cingulate cortex (PCC)44 and PCC seems to be sensitive to riskier compared to safer choices45. That this region is associated with riskier choices suggests why it may be associated with exploratory choices for losses rather than rewards.

Although recent evidence implicates both FPC and inferior parietal cortex in exploratory choices5,46, we did not find significant correlations for inferior parietal cortex. In a previous study, activity in both FPC and the inferior parietal sulcus correlated with the ratio between an unchosen and chosen action probability, or the relative unchosen probability46. However, the inferior parietal sulcus was only recruited when a switch in choice occurred46. Therefore, the FPC seems to track information accumulation relevant to switching to an alternate choice –here to reduce uncertainty – but engages the parietal cortex immediately prior to switching, which implements the switch itself. In line with this hypothesis, a recent study examining negative outcomes implicated the inferior parietal cortex in encoding actions and outcome objects but a more medial region, similar to that implicated in the current study, in encoding the action × object interaction reflecting the appropriate or inappropriate action47.

Our findings suggest biases in exploratory behaviours in the context of an uncertain environment across the misuse of drug and natural rewards. We also highlight the conserved effect of valence on exploration across groups with enhanced uncertainty avoidance to losses possibly reflecting an interaction with underlying loss aversion tendencies. While we do not currently examine the neural correlates of exploration in the pathological groups, we build upon the understanding of the role of the frontal polar cortex in guiding higher order and flexible decision-making, illustrating the possible means through which it coordinates behavioural processes in healthy volunteers. Together, the findings further the characterization of overlapping disorders of natural and drug rewards by maintaining the use of dimensional facets of compulsivity.

Supplementary Material

Figure S1. Histogram plots of parameter fits in alcohol use disorders. Histograms are plotted for alcohol use disorders (red) and healthy volunteers (blue) for exploration indices for gain (top left) and loss (bottom left), κ (top right) and λ (bottom right).

Table 1. Subject characteristics.

| Measure | AUD | HV - AUD | T | P | Obese BED | HV | T | P | Obese control | HV | T | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N | 32 | 55 | | | 31 | 55 | | | 30 | 55 | | |
| Age | 41.29 (11.38) | 42.15 (11.91) | 0.330 | 0.742 | 42.51 (8.92) | 43.18 (10.31) | 0.303 | 0.762 | 44.06 (9.70) | 42.94 (9.57) | 0.513 | 0.609 |
| Males (N) | 19 | 32 | | | 14 | 25 | | | 19 | 34 | | |
| IQ | 114.11 (6.72) | 115.49 (6.33) | 0.959 | 0.340 | 115.95 (6.67) | 114.52 (6.73) | 0.949 | 0.345 | 115.18 (6.45) | 114.71 (6.83) | 0.309 | 0.758 |
| BDI | 11.92 (9.33) | 5.24 (5.75) | 4.147 | <0.001 | 13.49 (7.13) | 5.48 (5.69) | 5.706 | <0.001 | 6.96 (5.92) | 5.21 (5.13) | 1.422 | 0.159 |
| BMI | | | | | 34.68 (5.49) | 22.18 (2.59) | 14.334 | <0.001 | 32.72 (3.41) | 23.14 (2.88) | 13.72 | <0.001 |
| BES | | | | | 24.70 (7.56) | 6.57 (6.92) | 11.282 | <0.001 | 8.67 (7.08) | 6.98 (7.14) | 1.045 | 0.299 |

Abbreviations: AUD = alcohol use disorder; HV = healthy volunteer; BED = binge eating disorder; N = number of participants; BDI = Beck Depression Inventory; BMI = body mass index; BES = Binge Eating Scale
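
As a reading aid for the T column of Table 1, the sketch below recomputes one statistic from the tabulated summary values. It rests on assumptions that the table itself does not state: that each cell is a mean with its standard deviation in parentheses, that the listed N applies to every measure, and that groups were compared with an independent-samples, pooled-variance (Student's) t-test. The function name `pooled_t` is hypothetical and introduced only for illustration.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (|t|, df) for a pooled-variance two-sample t-test from summary statistics.

    Assumption: inputs are group means, standard deviations and sample sizes;
    the test type is assumed, not taken from the manuscript.
    """
    df = n_a + n_b - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return abs(mean_a - mean_b) / se, df


# Age row, AUD versus HV, using the values listed in Table 1.
t_age, df_age = pooled_t(41.29, 11.38, 32, 42.15, 11.91, 55)
print(f"Age, AUD vs HV: t({df_age}) = {t_age:.3f}")
```

Under these assumptions the recomputed age statistic is approximately 0.33, in line with the tabulated value. Small discrepancies on other rows would be expected, because the tabulated means and standard deviations are rounded.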

Acknowledgments

Funding and Disclosure

The study was funded by the Wellcome Trust Fellowship grant for VV (093705/Z/10/Z) and Cambridge NIHR Biomedical Research Centre. VV and NAH are Wellcome Trust (WT) intermediate Clinical Fellows. LSM is in receipt of an MRC studentship. The BCNI is supported by a WT and MRC grant. MF is funded by NIMH and NSF grants and is a consultant for Hoffmann-La Roche pharmaceuticals. The remaining authors declare no competing financial interests.

References

  • 1. Badre D, Doll BB, Long NM, Frank MJ. Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron. 2012;73:595–607. doi:10.1016/j.neuron.2011.12.025.
  • 2. Frank MJ, Doll BB, Oas-Terpstra J, Moreno F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nature neuroscience. 2009;12:1062–1068. doi:10.1038/nn.2342.
  • 3. Dayan P, Sejnowski TJ. Exploration bonuses and dual control. Mach Learn. 1996;25:5–22. doi:10.1023/A:1018357105171.
  • 4. Cavanagh JF, Figueroa CM, Cohen MX, Frank MJ. Frontal Theta Reflects Uncertainty and Unexpectedness during Exploration and Exploitation. Cerebral cortex. 2012;22:2575–2586. doi:10.1093/cercor/bhr332.
  • 5. Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006;441:876–879. doi:10.1038/nature04766.
  • 6. Liu H, et al. Connectivity-based parcellation of the human frontal pole with diffusion tensor imaging. The Journal of neuroscience: the official journal of the Society for Neuroscience. 2013;33:6782–6790. doi:10.1523/JNEUROSCI.4882-12.2013.
  • 7. Christoff K, Gabrieli JDE. The frontopolar cortex and human cognition: Evidence for a rostrocaudal hierarchical organization within the human prefrontal cortex. Psychobiology. 2000;28:168–186.
  • 8. Koechlin E, Hyafil A. Anterior prefrontal function and the limits of human decision-making. Science. 2007;318:594–598. doi:10.1126/science.1142995.
  • 9. Ramnani N, Owen AM. Anterior prefrontal cortex: insights into function from anatomy and neuroimaging. Nature reviews. Neuroscience. 2004;5:184–194. doi:10.1038/nrn1343.
  • 10. Insel T, et al. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167:748–751. doi:10.1176/appi.ajp.2010.09091379.
  • 11. Robbins TW, Gillan CM, Smith DG, de Wit S, Ersche KD. Neurocognitive endophenotypes of impulsivity and compulsivity: towards dimensional psychiatry. Trends in cognitive sciences. 2012;16:81–91. doi:10.1016/j.tics.2011.11.009.
  • 12. Voon V, et al. Disorders of compulsivity: a common bias towards learning habits. Molecular psychiatry. 2014. doi:10.1038/mp.2014.44.
  • 13. Voon V, et al. Risk-Taking in Disorders of Natural and Drug Rewards: Neural Correlates and Effects of Probability, Valence, and Magnitude. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology. 2014. doi:10.1038/npp.2014.242.
  • 14. Kayser AS, Mitchell JM, Weinstein D, Frank MJ. Dopamine, locus of control, and the exploration-exploitation tradeoff. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology. 2015;40:454–462. doi:10.1038/npp.2014.193.
  • 15. Kundu P, Inati SJ, Evans JW, Luh WM, Bandettini PA. Differentiating BOLD and non-BOLD signals in fMRI time series using multi-echo EPI. NeuroImage. 2012;60:1759–1770. doi:10.1016/j.neuroimage.2011.12.028.
  • 16. Voon V, et al. Measuring “waiting” impulsivity in substance addictions and binge eating disorder in a novel analogue of rodent serial reaction time task. Biological psychiatry. 2014;75:148–155. doi:10.1016/j.biopsych.2013.05.013.
  • 17. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (4th ed., text rev.). American Psychiatric Association; 2000.
  • 18. Kundu P, et al. Integrated strategy for improving functional connectivity mapping using multiecho fMRI. Proceedings of the National Academy of Sciences of the United States of America. 2013;110:16187–16192. doi:10.1073/pnas.1301725110.
  • 19. Whitfield-Gabrieli S, Nieto-Castanon A. Conn: a functional connectivity toolbox for correlated and anticorrelated brain networks. Brain connectivity. 2012;2:125–141. doi:10.1089/brain.2012.0073.
  • 20. Slotnick SD, Moo LR, Segal JB, Hart J Jr. Distinct prefrontal cortex activity associated with item memory and source memory for visual shapes. Brain research. Cognitive brain research. 2003;17:75–82. doi:10.1016/s0926-6410(03)00082-x.
  • 21. Kahneman D, Tversky A. Prospect Theory - Analysis of Decision under Risk. Econometrica. 1979;47:263–291. doi:10.2307/1914185.
  • 22. Tom SM, Fox CR, Trepel C, Poldrack RA. The neural basis of loss aversion in decision-making under risk. Science. 2007;315:515–518. doi:10.1126/science.1134239.
  • 23. Goudriaan AE, Oosterlaan J, de Beurs E, van den Brink W. Decision making in pathological gambling: A comparison between pathological gamblers, alcohol dependents, persons with Tourette syndrome, and normal controls. Cognitive Brain Res. 2005;23:137–151. doi:10.1016/j.cogbrainres.2005.01.017.
  • 24. Bechara A, et al. Decision-making deficits, linked to a dysfunctional ventromedial prefrontal cortex, revealed in alcohol and stimulant abusers. Neuropsychologia. 2001;39:376–389. doi:10.1016/s0028-3932(00)00136-6.
  • 25. Hefner KR, Curtin JJ. Alcohol stress response dampening: selective reduction of anxiety in the face of uncertain threat. Journal of psychopharmacology. 2012;26:232–244. doi:10.1177/0269881111416691.
  • 26. Edwards S, Koob GF. Neurobiology of dysregulated motivational systems in drug addiction. Future neurology. 2010;5:393–401. doi:10.2217/fnl.10.14.
  • 27. Hascoet M, Bourin M, Nic Dhonnchadha BA. The mouse light-dark paradigm: a review. Progress in neuro-psychopharmacology & biological psychiatry. 2001;25:141–166. doi:10.1016/s0278-5846(00)00151-2.
  • 28. Ho JLY, Keller LR, Keltyka P. Effects of outcome and probabilistic ambiguity on managerial choices. J Risk Uncertainty. 2002;24:47–74. doi:10.1023/A:1013277310399.
  • 29. Chakravarty S, Roy J. Recursive expected utility and the separation of attitudes towards risk and ambiguity: an experimental study. Theor Decis. 2009;66:199–228. doi:10.1007/s11238-008-9112-4.
  • 30. Ludvig EA, Spetch ML. Of Black Swans and Tossed Coins: Is the Description-Experience Gap in Risky Choice Limited to Rare Events? PloS one. 2011;6:e20262. doi:10.1371/journal.pone.0020262.
  • 31. Wilson RC, Geana A, White JM, Ludvig EA, Cohen JD. Humans Use Directed and Random Exploration to Solve the Explore-Exploit Dilemma. J Exp Psychol Gen. 2014;143:2074–2081. doi:10.1037/a0038199.
  • 32. Koob GF. Negative reinforcement in drug addiction: the darkness within. Curr Opin Neurobiol. 2013;23:559–563. doi:10.1016/j.conb.2013.03.011.
  • 33. Koob GF, Le Moal M. Plasticity of reward neurocircuitry and the ‘dark side’ of drug addiction. Nature neuroscience. 2005;8:1442–1444. doi:10.1038/nn1105-1442.
  • 34. Rose EJ, et al. Acute Nicotine Differentially Impacts Anticipatory Valence- and Magnitude-Related Striatal Activity. Biological psychiatry. 2013;73:280–288. doi:10.1016/j.biopsych.2012.06.034.
  • 35. Martin LE, Cox LS, Brooks WM, Savage CR. Winning and losing: differences in reward and punishment sensitivity between smokers and nonsmokers. Brain Behav. 2014;4:915–924. doi:10.1002/brb3.285.
  • 36. Aron AR. The neural basis of inhibition in cognitive control. The Neuroscientist: a review journal bringing neurobiology, neurology and psychiatry. 2007;13:214–228. doi:10.1177/1073858407299288.
  • 37. Aron AR, Fletcher PC, Bullmore ET, Sahakian BJ, Robbins TW. Stop-signal inhibition disrupted by damage to right inferior frontal gyrus in humans. Nature neuroscience. 2003;6:115–116. doi:10.1038/nn1003.
  • 38. Robinson JD, et al. A Multimodal Approach to Assessing the Impact of Nicotine Dependence, Nicotine Abstinence, and Craving on Negative Affect in Smokers. Exp Clin Psychopharm. 2011;19:40–52. doi:10.1037/a0022114.
  • 39. Selemon LD, Goldman-Rakic PS. Common cortical and subcortical targets of the dorsolateral prefrontal and posterior parietal cortices in the rhesus monkey: evidence for a distributed neural network subserving spatially guided behavior. The Journal of neuroscience: the official journal of the Society for Neuroscience. 1988;8:4049–4068. doi:10.1523/JNEUROSCI.08-11-04049.1988.
  • 40. Cavanna AE, Trimble MR. The precuneus: a review of its functional anatomy and behavioural correlates. Brain: a journal of neurology. 2006;129:564–583. doi:10.1093/brain/awl004.
  • 41. Karnath HO, Perenin MT. Cortical control of visually guided reaching: evidence from patients with optic ataxia. Cerebral cortex. 2005;15:1561–1569. doi:10.1093/cercor/bhi034.
  • 42. Culham JC, et al. Cortical fMRI activation produced by attentive tracking of moving targets. Journal of neurophysiology. 1998;80:2657–2670. doi:10.1152/jn.1998.80.5.2657.
  • 43. Okuda J, et al. Thinking of the future and past: The roles of the frontal pole and the medial temporal lobes. Neuroimage. 2003;19:1369–1380. doi:10.1016/s1053-8119(03)00179-4.
  • 44. Kolling N, Behrens TEJ, Mars RB, Rushworth MFS. Neural Mechanisms of Foraging. Science. 2012;336:95–98. doi:10.1126/science.1216930.
  • 45. Kolling N, Wittmann M, Rushworth MFS. Multiple Neural Mechanisms of Decision Making and Their Competition under Changing Risk Pressure. Neuron. 2014;81:1190–1202. doi:10.1016/j.neuron.2014.01.033.
  • 46. Boorman ED, Behrens TE, Woolrich MW, Rushworth MF. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron. 2009;62:733–743. doi:10.1016/j.neuron.2009.05.014.
  • 47. Morrison I, Tipper SP, Fenton-Adams WL, Bach P. “Feeling” others’ painful actions: the sensorimotor integration of pain and action information. Human brain mapping. 2013;34:1982–1998. doi:10.1002/hbm.22040.
