eLife. 2024 Dec 9;13:e80979. doi: 10.7554/eLife.80979

Efficient value synthesis in the orbitofrontal cortex explains how loss aversion adapts to the ranges of gain and loss prospects

Jules Brochard 1,2,3, Jean Daunizeau 1,2,3
Editors: Thorsten Kahnt4, Michael J Frank5
PMCID: PMC11627503  PMID: 39652465

Abstract

Is irrational behavior the incidental outcome of biological constraints imposed on neural information processing? In this work, we consider the paradigmatic case of gamble decisions, where gamble values integrate prospective gains and losses. Under the assumption that neurons have a limited firing response range, we show that mitigating the ensuing information loss within artificial neural networks that synthesize value involves a specific form of self-organized plasticity. We demonstrate that the ensuing efficient value synthesis mechanism induces value range adaptation. We also reveal how the ranges of prospective gains and/or losses eventually determine both the behavioral sensitivity to gains and losses and the information content of the network. We test these predictions on two fMRI datasets from the OpenNeuro.org initiative that probe gamble decision-making but differ in terms of the range of gain prospects. First, we show that people’s loss aversion eventually adapts to the range of gain prospects they are exposed to. Second, we show that the strength with which the orbitofrontal cortex (in particular: Brodmann area 11) encodes gains and expected value also depends upon the range of gain prospects. Third, we show that, when fitted to participants’ gambling choices, self-organizing artificial neural networks generalize across gain range contexts and predict the geometry of information content within the orbitofrontal cortex. Our results demonstrate how self-organizing plasticity aiming at mitigating information loss induced by neurons’ limited response range may result in value range adaptation, eventually yielding irrational behavior.

Research organism: Human

Introduction

Why do we maintain unrealistic expectations or engage in irresponsible conduct? Perhaps one of the most substantial and ubiquitous violations of rationality is people’s sensitivity to modifications and/or manipulations of contextual factors that are irrelevant to the decision problem (Kahneman, 2011; Seymour and McClure, 2008). A prominent example is that people’s attitude towards risk depends upon whether alternative choice options are framed either in terms of losses or in terms of gains (Kahneman and Tversky, 2012). More generally, many forms of irrational behavior stem from people’s relative (as opposed to absolute) perception of value, that is: value is perceived in relation to a contextual reference point (Seymour and McClure, 2008; Srivastava and Schrater, 2011). Because it provides a mechanistic interpretation of such relative/context-dependent decision processes, range adaptation in value-sensitive neurons is currently under intense scrutiny (Louie et al., 2013; Rangel and Clithero, 2012; Rigoli et al., 2016; Rustichini et al., 2017; Steverson et al., 2019). Neural range adaptation was first observed in the brain’s perceptual system: neurons in the retina normalize their response to incoming light in their receptive field w.r.t. the illumination context, such that output firing rates span the variation range of surrounding light intensities (see Carandini and Heeger, 2011 for a review). Importantly, this neural mechanism provides a principled explanation for some forms of context-dependent visual illusions (May and Zhaoping, 2016; Pooresmaeili et al., 2013; Troscianko and Osorio, 2023). The underlying assumption here is that perceptual neurons transmit the information that they receive, i.e., a neuron’s input is the physical quantity that is signaled to the brain (e.g. light intensity within a certain frequency band), whereas the neuron’s output is the percept (e.g. perceived amount of red).
In turn, range adaptation (to a neuron’s input signal) directly induces perceptual context-dependent effects. But neural range adaptation may not be a glitch in the brain’s perceptual system. Rather, it may be understood as the brain’s best attempt to produce optimal information processing, given its own hard-wired biological constraints. This is the perspective afforded by efficient coding: light-sensitive neurons should adapt their firing properties to mitigate visual information loss resulting from their limited firing range (Brenner et al., 2000; Laughlin, 1981; Valerio and Navarro, 2003; Wark et al., 2007). In other words, range adaptation would improve the average reliability of neural information transmission, at the cost of inducing visual illusions in some circumstances.

Range adaptation was later evidenced in neural value coding, i.e., value-sensitive neurons were shown to normalize their response w.r.t. the set of alternative options within a given choice context and/or to the recent history of experienced/prospective rewards (Louie et al., 2013; Rangel and Clithero, 2012). In particular, gradual range adaptation effects have been the focus of intense research over the past decade, because they hold the promise of explaining persistent forms of irrational behavior. In line with the existing literature on value processing in the brain, they have been repeatedly documented in non-human primates, mostly using electrophysiological recordings in the orbitofrontal cortex or OFC (Conen and Padoa-Schioppa, 2019; Kobayashi et al., 2010; Padoa-Schioppa, 2009; Tremblay and Schultz, 1999; Yamada et al., 2018), though similar effects have been demonstrated in the anterior cingulate cortex (Cai and Padoa-Schioppa, 2012) and the amygdala (Bermudez and Schultz, 2010; Saez et al., 2017). Although comparatively sparser, neural evidence for gradual value normalization in the human OFC and ventral striatum also exists (Burke et al., 2016; Cox and Kable, 2014; Elliott et al., 2008). Importantly, when included in computational models of value-based decision-making, efficient coding in value-sensitive neurons partially explains specific forms of irrational behavior away (Polanía et al., 2019; Zimmermann et al., 2018).

Having said this, the neurophysiological bases of range adaptation in value-sensitive neurons are virtually unknown and their behavioral consequences are debated (Khaw et al., 2017; Rustichini et al., 2017). For example, that overt preferences do not shift along with the observed changes in the value-sensitivity of OFC neurons is puzzling. A possibility is that range adaptation in OFC neurons may be ‘undone’ by Hebbian-like plasticity mechanisms that fine-tune the synaptic efficacy of downstream ‘value comparison’ neurons (Padoa-Schioppa and Rustichini, 2014; Rustichini et al., 2017). Implicit in this reasoning is the assumption that option values are typically considered as input signals to value-sensitive OFC neurons, which then transmit this information to downstream decision systems, in analogy to the transmission of light-intensity information by neurons in the retina (Louie and Glimcher, 2012). But another possibility is that value coding in OFC neurons departs from the logic of efficient information transmission in the visual system (Burke et al., 2016; Conen and Padoa-Schioppa, 2019). For example, OFC neurons may be constructing (as opposed to receiving) value signals, out of input signals conveying information about possibly conflicting decision-relevant attributes (Lim et al., 2013; O’Doherty et al., 2021; Pessiglione and Daunizeau, 2021; Raghuraman and Padoa-Schioppa, 2014). We refer to this as value synthesis. In what follows, we consider the paradigmatic case of risky decisions, which require integrating attributes such as prospective gains and losses. Here, value synthesis implies weighing prospective gains and losses, such that the ensuing subjective value effectively arbitrates between pro- versus anti-gamble behavioral tendencies.
Our working assumption is twofold: (i) attribute-integration units in the OFC receive idiosyncratic mixtures of signals from attribute-specific units, and (ii) value is read out from integration units using a dedicated population code. This enables us to extend existing models of efficient coding to the case of value synthesis. Under mild conditions regarding units’ response properties, we show that a simple form of self-organized plasticity between attribute-specific and attribute-integration units can mitigate information loss induced by the limited firing range of attribute-integration units. Under such an efficient value synthesis scenario, OFC neurons would not adapt to the range of (output) values; rather, they would adapt to the range of their native input signals. Importantly, the ensuing neural and behavioral consequences depend upon how the underlying self-organized plasticity mechanism modifies the shape of integration units’ receptive fields over the spanned gain/loss domain. In this work, we derive the self-organized plasticity rule that operates efficient value synthesis, highlight its neural and behavioral consequences, and test the ensuing quantitative predictions against behavioral and neural data.

In particular, we show how the ranges of value-relevant attributes (i.e. here: prospective gains and losses) eventually determine the geometry of information encoded in the population of integration units, as well as the landscape of output value signals over the spanned domain of attributes. This is important, because the latter drives people’s loss aversion, i.e., their tendency to overweigh prospective losses over prospective gains when considering whether to accept or reject risky gambles. We then perform an entirely novel re-analysis of two independent fMRI datasets, which are made available in the context of the https://openneuro.org/ initiative (Poldrack et al., 2013). In both studies, participants are asked to accept or reject a series of gambles, but the two studies differ w.r.t. the range of prospective gains. First, we test the neural and behavioral predictions of efficient value synthesis. In particular, we provide evidence that people’s loss aversion progressively adapts to the range of prospective gains. We also evaluate the neural predictions of the efficient value synthesis scenario by quantifying the geometry of information in five subregions of the OFC. Finally, we fit the artificial neural network models to people’s gambling choices and show that only when endowed with the self-organized plasticity mechanism for efficient value synthesis do they generalize across gain range contexts (i.e. across participants’ groups) and predict (out-of-sample) the full geometry of information content within the OFC.

Results

Efficient value synthesis: Computational mechanism and model predictions

We consider that value synthesis is operated by neural networks composed of two layers (see Methods): (i) an attribute-specific layer further divided into two sets of units that differ in terms of their inputs (either trial-by-trial gains or losses), and (ii) an integration layer receiving outputs from both attribute-specific units (see Figure 1A below).

Figure 1. Efficient value synthesis.

(A) Schematic structure of the artificial neural network (ANN). Trial-by-trial prospective gains (G) and losses (L) first enter attribute-specific units (white circles), which then send their outputs to integration units (gray circles). The outputs of these units are then combined to yield trial-by-trial gamble values, using a linear population code. (B) The impact of self-organized plasticity on integration units’ response. Integration units receive a weighted mixture (v) of attribute units’ outputs (x), and their firing response is a sigmoidal mapping z=f(v) of their input. In principle, inputs to integration units (black dots) may sample the saturating range of their sigmoidal activation function (black vertical bar on the y-axis). However, self-organized plasticity modifies the connection strengths between attribute and integration units, such that the ensuing inputs to integration units (red dots) eventually fall within the responsive range of the activation function (red vertical bar on the y-axis).

Figure 1—figure supplement 1. Summary of sigmoidal integration units’ response profiles.

Top row: the receptive field of each artificial neural network (ANN) integration unit (color code: from blue -minimal output response- to yellow -maximal output response-) is displayed as a function of prospective losses (x-axis) and gains (y-axis). Middle row: the units’ output response (y-axis) is plotted against the gamble’s expected value (x-axis). Black dots show the possibly different response magnitudes of integration units for a given expected value (EV) (which may be obtained by different combinations of prospective gains and losses). Bottom row: ANN value readout weights (y-axis), ANN’s readout value profile (color code) displayed as a function of prospective losses (x-axis) and gains (y-axis), ANN’s readout value (y-axis) plotted as a function of EV (x-axis), and the ANN readout value’s sensitivity to prospective gains (ωG) and losses (ωL).
Figure 1—figure supplement 2. Efficiency of value synthesis.

(A) The expected log steepness of units’ activation functions (y-axis) is plotted before (black) and after self-organized plasticity for both the symmetrical (red) and asymmetrical (blue) gain/loss domains. (B) The stable choices’ rate (y-axis) is plotted as a function of neural noise variance (x-axis), before (black dots and dotted lines) and after (plain lines) self-organized plasticity for both the symmetrical (red) and asymmetrical (blue) gain/loss domains.
Figure 1—figure supplement 3. Summary of integration units’ response profiles after self-organized plasticity.

Same format as Figure 1. Top row: the black boxes depict the (‘wide’) domain of prospective gains and losses that are spanned during efficient integration. Middle row: the gray shaded areas depict the range of expected values (EVs) that corresponds to the spanned (‘wide’) domain of prospective gains and losses.
Figure 1—figure supplement 4. Analysis of the information content within the artificial neural network’s (ANN’s) integration layer.

Upper panels: representational dissimilarity matrices after self-organized plasticity has modified the integration layer’s receptive fields (color code: from blue -minimal dissimilarity- to yellow -maximal dissimilarity-) are displayed as a function of either prospective gains, prospective losses, or expected value (EV) (from left to right, 10 bins, x-axis, and y-axis). The black boxes show the spanned range of either gains, losses or EV. Lower panels: average neural dissimilarity (y-axis) is plotted as a function of absolute difference in prospective gains, prospective losses, or EV (from left to right, x-axis), both before (black) and after (red) self-organized plasticity has modified the integration layer’s receptive fields.

By assumption, the gamble’s value $V_t$ at trial $t$ is read out from the response pattern in the integration layer using a linear population code (Pessiglione and Daunizeau, 2021):

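$$
\begin{cases}
V_t = \sum_k w^{(k)} z_t^{(k)} \\
z_t^{(k)} = f_z^{(k)}\left(\upsilon_t^{(k)}\right) \\
\upsilon_t^{(k)} = \sum_{i,j} C^{(i,j,k)} x_t^{(i,j)} \\
x_t^{(i,j)} = f_x^{(j)}\left(u_t^{(i)}\right)
\end{cases}
\qquad (1)
$$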

where $z_t^{(k)}$ (respectively, $x_t^{(i,j)}$) is the output response of the $k$th integration unit (respectively, $j$th unit within the $i$th attribute sublayer), $w^{(k)}$ is the corresponding value readout weight in the population code, $f_z^{(k)}$ is its input-output activation function, $\upsilon_t^{(k)}$ is the input signal to the $k$th integration unit (which is made of a weighted mixture of attribute units’ outputs), $C^{(i,j,k)}$ is its connection strength with the $j$th unit of the $i$th attribute sublayer, and $u_t^{(i)}$ is the input signal to the $i$th attribute sublayer (i.e. either gain or loss information) at trial $t$. Importantly, we assume that units have a bounded response range, i.e., their firing rate cannot exceed some predefined physiological limit. Recall that the main mechanistic constraint that acts on a neuron’s firing rate is its action potential’s refractory period, which depends upon how long it takes ion channels to complete a whole voltage activate-deactivate cycle (a few milliseconds). Typically, even fast-spiking neocortical neurons cannot fire at a frequency higher than about 500 or 600 Hz (Wang et al., 2016). In our context, this can be simply modeled using saturating (more precisely: sigmoidal) input-output activation functions. In turn, the receptive field of integration units over the bidimensional domain of prospective gains and losses can exhibit arbitrary idiosyncratic shapes (see e.g. Figure 1—figure supplement 1).
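The two-layer readout of Equation 1 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the logistic form of the activation functions, the unit counts `J` and `K`, the slope and center parameters, and the random connection strengths are all hypothetical choices, not the settings used in the paper.

```python
import numpy as np

def sigmoid(v, slope=1.0, center=0.0):
    """Bounded (sigmoidal) activation; the output lies strictly in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-slope * (v - center)))

def forward(u, C, w, slope_x, center_x, slope_z):
    """One trial of Equation 1.

    u        : (2,)  attribute inputs, u[0] = gain, u[1] = loss
    C        : (2, J, K) connection strengths C[i, j, k]
    w        : (K,)  value readout weights
    slope_x, center_x : (J,) attribute-unit parameters (assumed logistic)
    slope_z  : (K,)  integration-unit slopes (assumed logistic)
    """
    # x[i, j] = f_x^(j)(u[i]): each attribute sublayer has J units
    x = sigmoid(u[:, None], slope_x[None, :], center_x[None, :])
    # v[k] = sum_{i,j} C[i, j, k] * x[i, j]
    v = np.einsum("ijk,ij->k", C, x)
    # z[k] = f_z^(k)(v[k])
    z = sigmoid(v, slope_z)
    # V = sum_k w[k] * z[k]: linear population code
    V = w @ z
    return V, z, v, x

# Hypothetical network: 4 units per attribute sublayer, 6 integration units
rng = np.random.default_rng(0)
J, K = 4, 6
C = rng.normal(size=(2, J, K))
w = rng.normal(size=K)
slope_x = np.ones(J)
center_x = np.linspace(0.2, 0.8, J)
slope_z = np.ones(K)

# Gains and losses rescaled to [0, 1] for illustration
V, z, v, x = forward(np.array([0.7, 0.3]), C, w, slope_x, center_x, slope_z)
```

Because every unit's output is confined to (0, 1), inputs `v` that are large in magnitude push `z` toward its bounds, which is the saturation problem the plasticity rule is meant to address.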

In principle, artificial neural networks (ANNs) described by Equation 1 can be trained to output almost any subjective value landscape over the bidimensional domain spanned by prospective gains and losses. In particular, they can be trained to output actions’ expected value (EV), which would yield a neural implementation of normative decision-making. For example, consider a gamble decision that entails a 50/50 chance of either earning an amount G of money or losing an amount L. In this case, the gamble’s EV is simply the sum of prospective gains and losses, weighted by their occurrence probability, i.e., $EV = \frac{1}{2}G - \frac{1}{2}L$. Training the network to operate this kind of (linear) value synthesis is trivial. If the units’ responses were perfectly reliable (i.e. noiseless), then this would eventually yield rational behavioral responses. But, due to the limited response range of neural units, even small amounts of noise on the integration layer may induce a strong loss of information on value. This happens when inputs to integration units fall outside their responsive range, because it saturates the output contrast. By analogy with efficient coding in the brain’s perceptual systems, efficient value synthesis would rely upon some form of unsupervised adaptation mechanism to mitigate such information loss. For ANNs that obey Equation 1, one can show that the following self-organized plasticity rule operates efficient value synthesis (see Methods section):

$$
\Delta C_t^{(i,j,k)} = \alpha(1-\beta)\,\Delta C_{t-1}^{(i,j,k)} + \alpha\beta\,\sigma_z^{(k)}\left(1-2z_t^{(k)}\right)x_t^{(i,j)} \qquad (2)
$$

where $\Delta C_t^{(i,j,k)}$ is the trial-by-trial change in connection strength between attribute and integration units and $\sigma_z^{(k)}$ is the slope parameter of the $k$th integration unit’s activation function. Here, α and β determine both the magnitude and temporal scale at which self-organized plasticity unfolds (see Methods section). Equation 2 modifies the network connections such that inputs to integration units (i.e. $\upsilon_t^{(k)}$ in Equation 1) fall within the responsive range of their activation function, which maximizes the output variability induced by attribute inputs (see Figure 1B). This eventually improves the behavioral resilience of the system to neural noise (see Figure 1—figure supplement 2). The plasticity rule in Equation 2 is ‘self-organized’ in the sense that it does not require any teaching or feedback signal. It is also ‘local,’ in that a connection only changes in response to (the recent history of) the outputs of the corresponding pair of connected units. Finally, it is ‘anti-Hebbian,’ in that connections tend to weaken when units co-activate. The exact form of self-organized plasticity that operates efficient value synthesis depends upon the units’ activation function, but these properties generalize to any nonlinear activation function that is continuous and monotonically increasing.
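A minimal sketch of the update in Equation 2 may help make its ‘local’ and ‘anti-Hebbian’ character concrete. It assumes a logistic activation centered at zero for integration units, reads the rule as a momentum-like recursion on the connection change, and uses hypothetical values for α, β, the network size, and the initial connections; attribute outputs are drawn at random rather than computed from actual gains and losses.

```python
import numpy as np

def plasticity_step(dC_prev, x, z, slope_z, alpha, beta):
    """One application of the self-organized rule (Equation 2).

    dC_prev : (I, J, K) previous change in connection strengths
    x       : (I, J)    attribute-unit outputs on the current trial
    z       : (K,)      integration-unit outputs on the current trial
    slope_z : (K,)      slopes of the integration units' activation functions
    """
    # Local term: depends only on the pre-synaptic output x[i, j] and the
    # post-synaptic output z[k]. It is negative when z[k] > 0.5 and x[i, j] > 0,
    # so co-active units see their connection weaken (anti-Hebbian).
    drive = x[:, :, None] * (slope_z * (1.0 - 2.0 * z))[None, None, :]
    return alpha * (1.0 - beta) * dC_prev + alpha * beta * drive

def sigmoid(v, slope=1.0):
    # Assumed activation: logistic, centered at zero
    return 1.0 / (1.0 + np.exp(-slope * v))

rng = np.random.default_rng(1)
I, J, K = 2, 4, 6                 # hypothetical network size
slope_z = np.ones(K)
C = np.full((I, J, K), 1.5)       # strong initial connections: units start saturated
dC = np.zeros_like(C)
alpha, beta = 0.9, 0.1            # hypothetical plasticity parameters

for t in range(256):
    x = rng.uniform(0.0, 1.0, size=(I, J))             # stand-in attribute outputs
    z = sigmoid(np.einsum("ijk,ij->k", C, x), slope_z)
    dC = plasticity_step(dC, x, z, slope_z, alpha, beta)
    C = C + dC                                         # no teaching signal is used
```

With `alpha = 0` or `beta = 0` and a zero initial change, `dC` stays at zero on every trial, which corresponds to the non-adaptive case.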

Setting α=0 or β=0 yields non-adaptive value synthesis, whereby the value readout is independent of the history of spanned prospective gains and losses. In what follows, we refer to this as static value synthesis. Otherwise (i.e. when α>0 and β>0), the self-organized plasticity mechanism in Equation 2 progressively modifies the shape of integration units’ receptive fields over the spanned gain/loss domain (see Figure 1—figure supplement 3). This has three main notable consequences.

First, this eventually induces apparent value range adaptation, i.e., the average response of integration units settles between two (almost) invariant bounds which systematically map onto the gambles’ EV extrema over the spanned gain/loss domain. In addition, despite potentially strong nonlinearities in the receptive fields of integration units, the average relationship between their activity and gambles’ EV is almost linear. Interestingly, this apparent value range adaptation effect comes in two variants, depending upon the sign of the correlation between units’ activity and gambles’ EV.

Figure 2 exemplifies the apparent value range adaptation effect. This is an average of over 1000 Monte-Carlo simulations, where we randomized the trained connections of ANNs that operate (either efficient or static) value synthesis prior to exposing them to four different series of 256 decision trials made of prospective gains and losses with a predefined range. More precisely, we considered two ranges (either narrow or wide) for both prospective gains and losses, and exposed the ANNs to each of the 2×2 range combinations. By chance, some units become less sensitive to prospective gains than to prospective losses: those will tend to show a negative correlation with EV (‘-EV-units’). Interestingly, the responses of ‘+EV-units’ and ‘-EV-units’ shown on panels A and B are reminiscent of value range adaptation effects evidenced using electrophysiological recordings of OFC neuron activity (Conen and Padoa-Schioppa, 2019; Padoa-Schioppa, 2009). In particular, one can see that the slope of the relationship between EV and integration units’ activity only depends upon the EV range, and not upon the actual bounds of the spanned domain of EVs. However, this value range adaptation effect is only apparent, in the sense that integration units do not respond to value: they respond to prospective gains and losses. This is important, because the underlying plasticity mechanism reacts to the relative range of spanned gains and losses, which is partially orthogonal to the induced range of EVs. One thus needs to consider the shape of the spanned domain of decision-relevant attributes to properly understand the neural and behavioral consequences of efficient value synthesis.

Figure 2. Apparent value range adaptation.

(A) The average units’ response (y-axis) to pairs of prospective gains and losses that fall within predefined expected value (EV) bins (x-axis) is shown, while artificial neural networks (ANNs) that operate efficient value synthesis are exposed to four different spanned gain/loss domains (blue: narrow ranges of gains and losses, violet: wide ranges of gains and losses, red: narrow gain range and wide loss range, yellow: wide gain range and narrow loss range, see main text). Only integration units that correlate positively with EV are shown. (B) Integration units that correlate negatively with EV, same format as panel (A). (C, D) Same format as panels (A and B), for ANNs that operate static value synthesis.

Second, self-organized plasticity changes the information content within the integration layer. To show this, we measure the neural dissimilarity between response patterns within the integration layer for any pair of decision trials, and then quantify its change when pairwise differences in either prospective gains, losses or EV vary. We define the ‘neural encoding strength’ of gains, losses, or EV in terms of the gradient of neural dissimilarity per unit of absolute difference in gains, losses, or EV, respectively (see Figure 1—figure supplement 4). In brief, the neural encoding strength of gains (resp. losses) decreases when the spanned range of gains (resp. losses) increases. This is also true for EV, whose neural encoding strength in the integration layer decreases when either the range of gains or losses increases.
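The ‘neural encoding strength’ defined above can be sketched as the slope of a regression of pairwise neural dissimilarity onto pairwise absolute attribute difference. This is a sketch under assumptions: Euclidean distance as the dissimilarity measure and an ordinary least-squares slope as the gradient estimate are illustrative choices, and `encoding_strength` is a hypothetical helper name.

```python
import numpy as np

def encoding_strength(Z, a):
    """Slope of pairwise neural dissimilarity against |a_s - a_t|.

    Z : (T, K) integration-layer response patterns, one row per trial
    a : (T,)   trial-by-trial variable of interest (gains, losses, or EV)
    """
    iu = np.triu_indices(len(a), k=1)                    # each trial pair once
    # Assumed dissimilarity: Euclidean distance between response patterns
    diss = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)[iu]
    dist = np.abs(a[:, None] - a[None, :])[iu]
    slope, _intercept = np.polyfit(dist, diss, 1)        # least-squares line
    return slope

# Toy check: patterns that vary linearly with a along the direction (3, 4)
a = np.linspace(0.0, 1.0, 20)
Z = a[:, None] * np.array([3.0, 4.0])[None, :]
strength = encoding_strength(Z, a)   # equals the norm of (3, 4), i.e. 5
```

In this toy case the dissimilarity is exactly proportional to the attribute difference, so the slope recovers the norm of the coding direction; saturated units would flatten this relationship and lower the slope.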

Third, this modifies the relative sensitivity of readout value to prospective gains and losses. We quantify the sensitivity of readout value w.r.t. its constituent attributes in terms of its gradient per unit of prospective gains and losses. In brief, the sensitivity of the ANN’s readout value signal to prospective gains (resp. losses) decreases when the spanned range of gains (resp. losses) increases. In turn, behavioral loss aversion (as defined by the ratio between loss and gain sensitivities) increases (resp. decreases) when the range of gains (resp. losses) increases. This is important, because this suggests how people’s behavior will deviate from classical decision theory and exhibit irrational context-dependency effects.
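The readout sensitivities and the resulting loss aversion can be illustrated as follows. This sketch assumes that the gradient of the readout value is summarized by the best-fitting plane over the spanned gain/loss domain, and it reports loss aversion as the log ratio of loss to gain sensitivity so that zero means loss neutrality; both choices, and the function names, are illustrative assumptions.

```python
import numpy as np

def readout_sensitivities(G, L, V):
    """Least-squares sensitivities of readout value V to gains and losses.

    Fits V ~ c0 + c1 * G + c2 * L and returns (omega_G, omega_L) = (c1, -c2),
    so that both sensitivities are positive when value rises with gains
    and falls with losses.
    """
    X = np.column_stack([np.ones_like(G, dtype=float), G, L])
    coef, *_ = np.linalg.lstsq(X, V, rcond=None)
    return coef[1], -coef[2]

def loss_aversion(omega_G, omega_L):
    """Log ratio of loss to gain sensitivity (0 = loss-neutral)."""
    return np.log(omega_L / omega_G)

# Toy value landscape in which losses weigh twice as much as gains
G, L = [m.ravel() for m in np.meshgrid(np.linspace(0, 1, 16), np.linspace(0, 1, 16))]
V = 0.5 * G - 1.0 * L
omega_G, omega_L = readout_sensitivities(G, L, V)
la = loss_aversion(omega_G, omega_L)   # log(2): positive loss aversion
```

Under this convention, a drop in gain sensitivity with loss sensitivity held fixed raises the log ratio, which matches the direction of the effect described for widening gain ranges.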

Figure 3 below summarizes the impact of the shape of the spanned bidimensional gain/loss domain. This is an average of over 1000 Monte-Carlo simulations, where the trained ANN’s connections are randomized prior to self-organizing according to Equation 2 in response to a series of 256 decision trials made of prospective gains and losses with a predefined range, which is systematically varied.

Figure 3. Impact of spanned ranges of gains and losses.

(A) The neural encoding strength of gains (color code: from blue -minimal encoding strength- to yellow -maximal encoding strength-) is shown as a function of the spanned range of losses (x-axis, range increases from left to right) and gains (y-axis, range increases from bottom to top). Note that the maximal range of prospective gains and losses is arbitrarily set to unity. (B) Neural encoding strength of losses, same format. (C) Neural encoding strength of EV, same format. (D) Behavioral sensitivity to gains, same format. (E) Behavioral sensitivity to losses, same format. (F) Behavioral loss aversion, same format.

Strikingly, the behavior will exhibit positive (respectively, negative) loss aversion when the context is favorable (respectively, unfavorable), i.e., when the spanned range of gains is greater (respectively, smaller) than that of losses (Figure 3F). Also, behavioral loss aversion is expected to be neutral when the spanned ranges of gains and losses are comparable (symmetrical gain/loss domain).

In addition to the main effects described above, one can see that there is a cross-attribute spillover effect, such that the behavioral and neural sensitivities to prospective gains (respectively, losses) also decrease when the spanned range of losses (respectively, gains) increases (Figure 3A, B, D and E). The magnitude of these spillover effects is comparatively weaker and may thus be more difficult to detect in an empirical setting. In fact, the only quantity that is similarly impacted by the ranges of gains and losses is the neural encoding strength of EV (Figure 3C). This effect is partly driven by changes in the neural sensitivity to gains and losses (Figure 3A and B), which constrains the availability of information on EV within the integration layer. But it also derives from the distortion of the readout value profile (Figure 3D and E), which weakens the statistical relationship between EV and integration units’ activity patterns. Both are consequences of the changes in integration units’ receptive fields that are induced by self-organized plasticity (see Figure 1—figure supplements 1 and 3). This eventually translates into an apparent value range adaptation phenomenon that generalizes the univariate effect reported in Figure 2.

Note that all these range adaptation effects actually unfold over time, as the network progressively self-organizes in response to prospective gains and losses. Importantly, however, numerical simulations show that the ensuing dynamics of neural and behavioral sensitivities converge, i.e., they eventually reach a steady-state. The convergence rate is governed by the parameter β, whereas the overall magnitude of these context-dependency effects is determined by the parameter α. Although the exact setting of the plasticity magnitude and rate parameters does modify the global magnitude of neural and behavioral sensitivity changes, the results shown in Figure 3 are representative of the impact of the spanned gain/loss domain’s shape.

Model-free analysis of the NARPS dataset

We now present our re-analysis of the NARPS dataset (Botvinik-Nezer et al., 2020b). This dataset includes two studies, each of which is composed of a group of 54 participants who made a series of risky decisions. On each trial, a gamble was presented, entailing a 50/50 chance of gaining an amount G of money or losing an amount L. As in Tom et al., 2007, participants were asked whether they would accept or reject each gamble presented to them. In the first study (hereafter referred to as the ‘narrow range’ group), participants decided on gambles made of gain and loss levels that were sampled from the same range (G and L independently varied between 5$ and 20$). In the second study (hereafter: the ‘wide range’ group), gain levels scaled to double the loss levels (L varied between 5$ and 20$, and G independently varied between 10$ and 40$). Importantly, both groups experienced the exact same range of losses. In both studies, all 256 possible combinations of gains and losses were presented across trials (see Methods section). Importantly, the gambles’ outcomes were not revealed until the end of the experiment.
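The two gamble sets and the range of expected values they share can be enumerated directly. This sketch assumes 16 evenly spaced levels per attribute (steps of 1 between 5 and 20, steps of 2 between 10 and 40), which is one way to obtain the 256 combinations mentioned above; the exact levels are not stated in this section.

```python
import numpy as np

# Assumed levels: 16 per attribute, so that 16 x 16 = 256 gambles per group
losses = np.arange(5, 21)            # 5, 6, ..., 20
gains_narrow = np.arange(5, 21)      # same range as losses
gains_wide = np.arange(10, 41, 2)    # 10, 12, ..., 40 (double the loss levels)

def gamble_grid(gains, losses):
    """All gain/loss combinations and the EV of each 50/50 gamble."""
    G, L = np.meshgrid(gains, losses, indexing="ij")
    G, L = G.ravel(), L.ravel()
    return G, L, (G - L) / 2.0       # EV = G/2 - L/2

G_n, L_n, ev_narrow = gamble_grid(gains_narrow, losses)
G_w, L_w, ev_wide = gamble_grid(gains_wide, losses)

# EV range common to both groups: overlap of the two EV intervals
common_lo = max(ev_narrow.min(), ev_wide.min())   # -5.0
common_hi = min(ev_narrow.max(), ev_wide.max())   # 7.5
```

Under these assumed levels, the narrow-range gambles span EVs from -7.5 to 7.5 and the wide-range gambles from -5 to 17.5, so the comparison within the common EV range covers gambles with EV between -5 and 7.5.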

To begin with, we ask whether peoples’ loss aversion exhibits range adaptation, as predicted by efficient value synthesis. In our context, this implies that (i) peoples’ gambling rate should depend upon the gain range context (even within the EV range common to both groups), (ii) peoples’ behavioral sensitivity to gains should be higher in the narrow gain range group than in the wide gain range group, (iii) within-group averages of loss aversion should be initially similar and then progressively diverge as time unfolds, and (iv) participants from the wide gain range group should eventually exhibit strong loss aversion while participants from the narrow gain range group should be loss-neutral (cf. symmetrical gain/loss domain). Figure 4 below summarizes the results of behavioral data analyses that aim at testing these predictions.

Figure 4. Does people’s loss aversion exhibit range adaptation?

(A) The group-average probability of gamble acceptance (y-axis) is plotted against deciles of gambles’ expected value (EV=(G–L)/2, x-axis), in both groups (red: narrow range, blue: wide range). Dots show raw data (error bars depict s.e.m.), and plain lines show predicted data under a logistic regression model (see main text). The gray-shaded area highlights the range of expected values that is common to both groups. (B) Estimates of gamble bias (w0) as well as sensitivity to gains (wG) and losses (wL) for both groups, under the logistic model (same color code as panel A, error bars depict s.e.m.). (C) Average loss aversion (y-axis) is plotted for both groups (same color code, error bars depict s.e.m.). (D) Temporal dynamics of group-average loss aversion (log wL/wG, y-axis, same color code) are plotted against time (a.u., x-axis). Shaded areas depict s.e.m.

Figure 4—figure supplement 1. Postdiction error and out-of-sample predictions of the logistic model.

(A) ‘Postdiction’ error (y-axis) is plotted against deciles of gambles’ expected value (x-axis), in both groups (red: narrow range, blue: wide range). The gray-shaded area highlights the range of expected values that is common to both groups. (B) The probability of gamble acceptance (y-axis) is plotted against the gambles’ expected value (x-axis), in both groups (red: narrow range, blue: wide range). Dots show raw data (error bars depict s.e.m.), and plain and dashed lines show postdictions and out-of-sample predictions, respectively. (C) People’s gambling rate within the common expected value (EV) range (y-axis) is plotted as a function of estimated loss aversion (x-axis), for each group of subjects. (D) People’s gambling rate within the common EV range (y-axis) is plotted as a function of estimated gambling bias (x-axis), same format.

Overall, the average gambling rate of people from the wide range group (65±2%) is much higher than that of people from the narrow range group (44±2%), and the group difference is significant (p<10⁻⁴, F=44.8, dof=101). This is of course expected, given that people from the wide range group are exposed to gambles with higher value on average. However, and most importantly, within the range of EV that is common to both groups, people from the wide range group are less likely to gamble than people from the narrow range group (Figure 4A). Here, the average gambling rate of people from the wide range group is 41±3%, whereas it is 54±3% for people in the narrow range group, and this group difference is significant (p=0.003, F=9.2). This difference is likely due to the context in which people made these decisions, which is more favorable (higher gain prospects on average) in the wide range group. This is the hallmark of context-dependency effects.
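As a worked illustration of the common EV range used in this comparison, the following minimal sketch computes EV=(G–L)/2 for two gamble designs, derives the EV window that both designs share, and evaluates a gambling rate within that window. The gain and loss grids are assumptions made for illustration only (narrow design: gains and losses from 5 to 20; wide design: gains from 10 to 40, losses from 5 to 20), since this section does not state them, and all function names are hypothetical.

```python
def expected_value(gain, loss):
    """EV of a 50/50 gamble, as defined in the Figure 4 caption: (G - L) / 2."""
    return (gain - loss) / 2.0


def ev_range(gains, losses):
    """Smallest and largest EV over all gain/loss pairs of a design."""
    evs = [expected_value(g, l) for g in gains for l in losses]
    return min(evs), max(evs)


def common_window(range_a, range_b):
    """Overlap of two (min, max) EV ranges."""
    return max(range_a[0], range_b[0]), min(range_a[1], range_b[1])


def gambling_rate(trials, window):
    """Fraction of accepted gambles among trials whose EV falls in `window`.

    `trials` is a list of (gain, loss, accepted) tuples, with accepted in {0, 1}.
    """
    lo, hi = window
    kept = [acc for g, l, acc in trials if lo <= expected_value(g, l) <= hi]
    return sum(kept) / len(kept) if kept else float("nan")


# Assumed (illustrative) designs, not taken from this section.
narrow = ev_range(range(5, 21), range(5, 21))    # gains 5..20, losses 5..20
wide = ev_range(range(10, 41, 2), range(5, 21))  # gains 10..40, losses 5..20
window = common_window(narrow, wide)

# Toy trials (gain, loss, accepted); the third one lies outside the window.
toy_trials = [(12, 10, 1), (10, 18, 0), (40, 5, 1), (14, 14, 0)]
rate = gambling_rate(toy_trials, window)
```

Under these assumed designs, the shared window runs from EV=-5 to EV=7.5, and only trials inside it enter the group comparison of gambling rates.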

From Figure 4A, it seems that the variation in peoples’ tendency to accept risky gambles approximately spans the range of gambles’ values that they are exposed to. Under the efficient value synthesis scenario, this apparent value range adaptation effect is due to context-dependent changes in loss aversion. To investigate this effect, we first performed a within-subject logistic regression of trial-by-trial choice data onto gains and losses (including an intercept, see Methods section). In terms of balanced accuracy, this regression correctly explains 91±0.7% (respectively, 87±0.8%) of individual choices in the narrow (respectively, wide) range group (cf. plain lines in Figure 4A). A random-effects analysis on regression weight estimates shows that all regression weights are significant at the group level (all p<10⁻³), except for the intercept parameters (narrow range: p=0.26, wide range: p=0.14). This implies that peoples’ gambling behavior exhibits no systematic bias above and beyond the effects of prospective gains and losses. Regarding group differences, this analysis also failed to identify a group difference in the constant gambling bias (w0: p=0.82, F=0.05). However, peoples’ sensitivity to gains is significantly higher in the narrow range group than in the wide range group (wG: p=0.022, F=17.5), and this difference falls just short of significance for loss sensitivity (wL: p=0.068, F=3.4). This means that increasing the range of gain prospects decreases peoples’ sensitivity to gains (and maybe to losses as well, though to a lesser extent; see Figure 4B).
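The within-subject regression described above can be sketched as follows. This is a minimal illustration, not the estimation scheme used in the paper (which is described in the Methods): it assumes the sign convention P(accept)=sigmoid(w0 + wG·G − wL·L), fits the three weights by Newton-Raphson with a small ridge penalty (an assumption added for numerical stability), and scores the fit with balanced accuracy. The simulated participant and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def solve(A, b):
    """Solve the small linear system A x = b (Gaussian elimination, pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_logistic(gains, losses, choices, ridge=1e-3, n_iter=25):
    """Newton-Raphson fit of P(accept) = sigmoid(w0 + wG*G - wL*L)."""
    X = [[1.0, g, -l] for g, l in zip(gains, losses)]
    w = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [-ridge * wj for wj in w]  # gradient of penalized log-likelihood
        hess = [[ridge if i == j else 0.0 for j in range(3)] for i in range(3)]
        for x, y in zip(X, choices):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            for i in range(3):
                grad[i] += (y - p) * x[i]
                for j in range(3):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        step = solve(hess, grad)
        w = [wj + sj for wj, sj in zip(w, step)]
    return w  # [w0, wG, wL]


def balanced_accuracy(w, gains, losses, choices):
    """Mean of the hit rates for accepted and rejected gambles."""
    hits, counts = {0: 0, 1: 0}, {0: 0, 1: 0}
    for g, l, y in zip(gains, losses, choices):
        pred = 1 if w[0] + w[1] * g - w[2] * l > 0 else 0
        counts[y] += 1
        hits[y] += int(pred == y)
    return 0.5 * (hits[0] / counts[0] + hits[1] / counts[1])


# Simulated participant with hypothetical weights (w0, wG, wL).
random.seed(0)
true_w = (0.0, 0.3, 0.3)
gains = [random.randint(5, 20) for _ in range(256)]
losses = [random.randint(5, 20) for _ in range(256)]
choices = [int(random.random() < sigmoid(true_w[0] + true_w[1] * g - true_w[2] * l))
           for g, l in zip(gains, losses)]
w_hat = fit_logistic(gains, losses, choices)
accuracy = balanced_accuracy(w_hat, gains, losses, choices)
```

Balanced accuracy is used here because it is not inflated when a participant accepts (or rejects) most gambles, which matters when comparing groups with different overall gambling rates.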

We then derived indices of individual loss aversion, which we define as the log-transformed ratio of loss sensitivity to gain sensitivity, i.e., log(wL/wG) (Tom et al., 2007). This definition is not confounded by possible behavioral temperature differences between groups of participants. Mean loss aversion indices are shown in Figure 4C. We found that people from the wide range group exhibit significant loss aversion (mean loss aversion index=0.41, sem=0.05, p<10⁻⁴) whereas people from the narrow range group do not (mean loss aversion index=0.037, sem=0.05, p=0.46), and the ensuing group difference is significant (p=0.0031, F=9.2). Importantly, inter-individual differences in loss aversion explain the observed inter-individual differences in peoples’ gambling rate within the common EV range across all participants (p<10⁻⁴, F=39.9, see Figure 4—figure supplement 1).
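The following short sketch illustrates why the index log(wL/wG) is insensitive to behavioral temperature: multiplying both sensitivities by the same positive factor (here an arbitrary, hypothetical inverse temperature of 5) leaves the log-ratio unchanged. The numerical weights are illustrative only.

```python
import math


def loss_aversion_index(w_gain, w_loss):
    """log(wL / wG): 0 means loss neutrality, positive values mean loss aversion."""
    if w_gain <= 0 or w_loss <= 0:
        raise ValueError("both sensitivities must be positive for the log-ratio")
    return math.log(w_loss / w_gain)


# Hypothetical sensitivities for one participant.
index = loss_aversion_index(0.2, 0.3)

# Same participant with choices five times less noisy (both weights scaled).
index_rescaled = loss_aversion_index(0.2 * 5.0, 0.3 * 5.0)

# Ratio wL/wG implied by an index of 0.41.
implied_ratio = math.exp(0.41)
```

On this scale, an index of 0.41 corresponds to losses weighing about 1.5 times as much as gains, whereas an index near zero corresponds to equal weights.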

But is this loss aversion difference due to inter-individual trait differences, or did it grow over time as people were exposed to more gambles? To address this question, we repeated the within-subject logistic regression, this time on consecutive chunks of 16 trials (see Methods section). The resulting temporal dynamics of loss aversion are shown in Figure 4D. We found no significant time-by-group interaction (p=0.43, F=0.61), which is why we report separate (instantaneous) group comparisons. At the start of the experiment (first 16 trials), loss aversion is significant in both groups (wide range: mean loss aversion index=0.26, sem=0.1, p=0.027; narrow range: mean loss aversion index=0.25, sem=0.1, p=0.039), and there is no significant difference between groups (p=0.96, F=0.003). However, as time unfolds, loss aversion in the two groups tends to diverge: the difference between groups starts becoming significant after 32 trials (p=0.005, F=13.0) and stays significant thereafter (all p<0.05) except for two chunks of trials (p=0.12 and p=0.054). At the end of the experiment (last 16 trials), loss aversion is significant in participants of the wide range group (mean loss aversion index=0.41, sem=0.07, p<10⁻⁴) but not in the narrow range group (mean loss aversion index=0.07, sem=0.06, p=0.24), and the group difference is significant (p=0.00042, F=13.3).
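The chunk-wise analysis can be sketched as a loop over consecutive, non-overlapping blocks of 16 trials. In the sketch below, `fit` is a hypothetical callable that returns (w0, wG, wL) for one block; the actual estimator is described in the Methods, and with only 16 trials per block some form of regularization or prior is presumably needed, which is an assumption here. A stub estimator is used so that the example runs on its own.

```python
import math


def chunked_loss_aversion(gains, losses, choices, fit, chunk_size=16):
    """Loss aversion index log(wL/wG) on consecutive, non-overlapping chunks.

    `fit` is any callable returning (w0, wG, wL) for one chunk of trials
    (hypothetical interface). Trailing trials that do not fill a whole
    chunk are dropped; chunks with non-positive weights yield NaN.
    """
    indices = []
    for start in range(0, len(choices) - chunk_size + 1, chunk_size):
        stop = start + chunk_size
        _, w_gain, w_loss = fit(gains[start:stop], losses[start:stop],
                                choices[start:stop])
        valid = w_gain > 0 and w_loss > 0
        indices.append(math.log(w_loss / w_gain) if valid else float("nan"))
    return indices


def stub_fit(gains, losses, choices):
    """Placeholder estimator so that the example runs; returns fixed weights."""
    return 0.0, 0.2, 0.3


# 70 toy trials give four complete chunks of 16 (the last 6 trials are dropped).
dynamics = chunked_loss_aversion(list(range(70)), list(range(70)),
                                 [0, 1] * 35, stub_fit)
```

Each entry of `dynamics` is one time point of the kind of curve shown in Figure 4D; group comparisons are then run separately at each time point.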

Those results validate the behavioral predictions of the efficient value synthesis scenario. We now wish to test its neural predictions, namely: (i) EV, as well as prospective gains and losses, should be encoded in neural activity patterns within the OFC, (ii) the neural encoding strength of prospective gains should be higher in the narrow gain range group than in the wide gain range group, (iii) the neural encoding strength of prospective losses should be equivalent in both groups (up to cross-attribute spillover effects), and (iv) the neural encoding strength of EV should be higher in the narrow gain range group than in the wide gain range group.

We thus extracted the multivariate trial-by-trial BOLD response in five OFC subregions: the lateral and medial parts of Brodmann area 11, Brodmann area 13, Brodmann area 14, and Brodmann area 32 (see Figure 11 in the Methods section). After correcting for between-session and temporal autocorrelation confounding effects (see Methods), we derived the ROI-specific representational dissimilarity matrices and measured the neural encoding strengths of gains (Figure 5), losses (Figure 6), and EV (Figure 7). The corresponding RDMs are shown in Figure 5—figure supplement 1, Figure 6—figure supplement 1, and Figure 7—figure supplement 1, respectively. For the sake of completeness, the results of standard univariate fMRI data analyses can also be inspected in Figure 7—figure supplement 2.
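To make the notion of neural encoding strength concrete, the sketch below relates trial-by-trial neural dissimilarity to trial-by-trial absolute differences in an attribute (here, prospective gains). It rests on several assumptions that are not taken from the paper: neural dissimilarity is the Euclidean distance between multivoxel patterns, and encoding strength is the Pearson correlation between neural and attribute distances across trial pairs. The simulated patterns and all names are hypothetical, and no confound correction is included.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise(items, distance):
    """Distances for all unordered trial pairs (upper triangle of an RDM)."""
    return [distance(a, b) for a, b in combinations(items, 2)]


def encoding_strength(patterns, attribute):
    """Correlation between neural and attribute distances across trial pairs."""
    neural = pairwise(patterns, math.dist)  # assumed dissimilarity measure
    model = pairwise(attribute, lambda a, b: abs(a - b))
    return pearson(neural, model)


# Simulated region with 10 voxels and 40 trials (illustrative values only).
random.seed(1)
n_trials, n_voxels = 40, 10
gains = [random.uniform(5, 20) for _ in range(n_trials)]
voxel_gain = [0.2 * (v + 1) for v in range(n_voxels)]  # hypothetical tuning

# Patterns that scale with gains, versus patterns made of pure noise.
coding = [[w * g + random.gauss(0, 1) for w in voxel_gain] for g in gains]
noise = [[random.gauss(0, 1) for _ in range(n_voxels)] for _ in gains]

strength_coding = encoding_strength(coding, gains)
strength_noise = encoding_strength(noise, gains)
```

In this toy setting, the region whose patterns scale with gains yields a much higher encoding strength than the noise-only region, which is the kind of contrast summarized in the lower panels of Figures 5 to 7.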

Figure 5. Neural encoding of prospective gains.

The upper panels show the trial-by-trial neural dissimilarity (y-axis) plotted as a function of trial-by-trial absolute difference in prospective gains (x-axis), for both groups of participants (red: narrow range, blue: wide range), and within each subregion of the OFC. The lower panels show the ensuing neural encoding strength of gains (y-axis), for both groups (same color code), and within each subregion of the OFC. The dotted line indicates the y-axis origin. Errorbars depict s.e.m., and p-values are uncorrected for multiple comparisons.

Figure 5—figure supplement 1. Gain-representational dissimilarity matrices (RDMs).

Average trial-by-trial neural dissimilarity is shown, having binned trials according to prospective gains (average gains within bins increase from left to right and from bottom to top). Upper row: wide range group, lower row: narrow range group. Each column shows an OFC subregion (from left to right: medial part of BA13, lateral part of BA11, rostral part of BA14, medial part of BA11, rostral part of BA32). The color code is the same for all panels.

Figure 6. Neural encoding of prospective losses.

Same format as Figure 5.

Figure 6—figure supplement 1. Loss-representational dissimilarity matrices (RDMs).

Average trial-by-trial neural dissimilarity is plotted, having binned trials according to prospective losses. Same format as Figure 5—figure supplement 1 (and the same color code).

Figure 7. Neural encoding of expected value (EV).

Same format as Figure 5.

Figure 7—figure supplement 1. Expected value (EV)-representational dissimilarity matrices (RDMs).

Average trial-by-trial neural dissimilarity is plotted, having binned trials according to EVs. Same format as Figure 5—figure supplement 1 (and the same color code).

Figure 7—figure supplement 2. Univariate fMRI analyses.

First row: the average BOLD response (y-axis) is plotted as a function of prospective gains (x-axis), for both groups of participants (red: narrow group, blue: wide group). Second row: same as first row, for prospective losses. Third row: GLM parameter estimates (y-axis) for gains and losses. Fourth row: regression parameter estimates for expected value (EV). Each column shows an OFC subregion (from left to right: medial part of BA13, lateral part of BA11, rostral part of BA14, medial part of BA11, rostral part of BA32). The dashed black lines depict the y-axis origin.

Of course, prospective gain distances for the wide range group extend beyond those of the narrow range group. One can see that, in all subregions of the OFC, neural dissimilarity tends to increase when the absolute difference in prospective gains increases, though this gradient typically attenuates for extreme gain differences. The lateral part of Brodmann area 11 is the only OFC subregion that exhibits a significant encoding of prospective gains in both groups of participants (wide range: p=0.0051, narrow range: p<10⁻³), as well as a significantly higher encoding strength of gains in the narrow range group than in the wide range group (p=0.032).

Overall, it seems that prospective losses are encoded less strongly in OFC neural activity than prospective gains (though typically about four times stronger in magnitude). Nevertheless, the lateral part of Brodmann area 11 still exhibits a significant encoding of prospective losses in both groups of participants (wide range: p=0.0049, narrow range: p=0.0056), without a significant group difference (p=0.59).

The pattern of neural encoding of EV is globally similar to that of prospective gains. Importantly, the lateral part of Brodmann area 11 is the only OFC subregion that exhibits a significant encoding of EV in both groups of participants (wide range: p=0.0061, narrow range: p<10⁻³), as well as a significantly higher encoding strength of EV in the narrow range group than in the wide range group (p=0.0012).

Note that ignoring the fMRI confounding effects does not qualitatively alter the results, though it tends to bury the signal within structured noise, which dampens statistical significance.

In brief, qualitative predictions of the efficient value synthesis scenario at both the behavioral and neural levels have been confirmed (at least in the lateral part of Brodmann area 11). We will now provide further quantitative evidence that efficient value synthesis in the OFC can explain range adaptation of loss aversion.

Model-based analysis of the NARPS dataset

As can be seen from Equations 1 and 2, quantitative predictions from the efficient value synthesis scenario actually depend upon model parameters that may vary across individuals. For example, differences in the connectivity matrix C(i,j,k) (at the start of the experiment) and/or value readout weights w(k) can, in principle, account for a broad range of inter-individual differences in gambling behavior (and, possibly, in the neural encoding strength of prospective losses and gains). This raises the question: can the observed neural and behavioral differences between groups be explained by inter-individual differences in static value synthesis, without caring about self-organized plasticity?

To address this question, we fit the ANN model of value synthesis, with and without self-organized plasticity, to each participant’s series of gamble decisions (given the corresponding prospective gains and losses). In what follows, we refer to the models’ predictions about fitted behavioral data as models’ postdiction. We then perform counterfactual model simulations: for each subject-specific fitted model, we simulate the trial-by-trial gamble decisions that would have been observed, had this subject/model been exposed to the sequence of prospective gains and losses that each subject of the other group was exposed to (see Methods). That is, we ask what an ANN trained on the gambling decisions of a participant in the narrow gain range group would predict when exposed to the trial-by-trial series of gains and losses of participants from the wide gain range group (and reciprocally). These out-of-sample predictions provide a strong test of the model’s generalization ability. Figure 8 below shows both postdiction and out-of-sample predictions of the two ANN model variants (static versus efficient value synthesis).

Figure 8. Efficient value synthesis: Artificial neural network (ANN) analysis of behavioral data.

(A) The probability of gamble acceptance under plastic ANNs (y-axis) is plotted against gambles’ expected value (EV) (x-axis), in both groups (red: narrow range, blue: wide range). Dots show raw data (error bars depict s.e.m.). Plain and dashed lines show postdiction and out-of-sample predictions (see main text), respectively. The gray-shaded area highlights the range of expected values that is common to both groups. (B) Same as panel A, for static ANN. (C) Postdicted loss aversion (y-axis) is plotted as a function of time (x-axis) for both groups (same color code), under the efficient value synthesis model. (D) Same as panel C, for static value synthesis.

Figure 8—figure supplement 1. Artificial neural networks’ (ANNs’) postdiction error rate.

(A) The rate of postdiction error (y-axis) under the efficient value synthesis scenario is plotted against gambles’ expected value (x-axis), for both groups (red: narrow range, blue: wide range). (B) Static value synthesis, same format.

Figure 8—figure supplement 2. Self-organized plasticity parameter estimates.

(A) Empirical distribution of plasticity magnitude, for both groups (red: narrow range, blue: wide range). (B) Empirical distribution of plasticity rate. (C) Observed loss aversion (y-axis) is plotted as a function of plasticity magnitude (x-axis, log scale), for both groups. Each dot is a participant; plain lines depict the best-fitting linear trend. (D) Observed loss aversion as a function of plasticity rate (log scale).

Unsurprisingly, both model postdictions accurately describe the qualitative group difference in gambling behavior. In addition, both candidate ANNs perform similarly to the logistic model, in terms of both percentage of explained variance (wide range group: 65.8±2.5%, narrow range group: 75.1±1.9%) and balanced fit accuracy (wide range group: 87.3±0.9%, narrow range group: 92.1±0.7%). We note that both ANN models yield postdiction error rates that are similar to the logistic model, in that they are maximal for hard decisions, i.e., when EV lies around zero (see Figure 8—figure supplement 1). In addition, under the efficient value synthesis scenario, empirical distributions of self-organized plasticity parameter estimates are comparable across both groups of participants (see Figure 8—figure supplement 2). But did ANN models capture a mechanism that faithfully generalizes to different gain range contexts (i.e. across groups)? First, static ANNs do not yield accurate out-of-sample predictions. This is expected, because static ANNs cannot exhibit range adaptation. Thus, they leave gambling behavior unchanged within the common range of expected values, and simply extrapolate postdicted behavior outside that range (as is the case for the logistic model, see Figure 4—figure supplement 1). In other terms, within the range of expected values that is common to both groups, static ANNs wrongly predict that the gambling rate should be higher in the wide range group than in the narrow range group (mean gambling rate group difference=9.7%). The situation is quite different for plastic ANNs, which yield more accurate out-of-sample predictions of peoples’ risk attitudes within the common range of expected values. In particular, plastic ANNs correctly predict that gambling rate should be lower in the wide-range group than in the narrow-range group (mean gambling rate group difference=−6.2%, p=0.038, F=1.8). 
We then measured the absolute out-of-sample prediction error of both plastic and static ANN models, for both participant groups. We found that it was significantly greater for static than for plastic ANN models (wide range group: p=0.013, F=9.45, narrow range group: p=0.032, F=6.43).
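The counterfactual logic of these out-of-sample predictions can be sketched with a deliberately simple stand-in. The code below does not implement the ANNs of Equations 1 and 2: it uses a static logistic model (an assumption made for illustration) and exposes each set of fitted weights to the gamble sequences of the other group, scoring the predicted gambling rate within a common EV window. The weights, sequences, and EV window are hypothetical.

```python
import math


def accept_probability(weights, gain, loss):
    """Static stand-in model: P(accept) = sigmoid(w0 + wG*G - wL*L)."""
    w0, w_gain, w_loss = weights
    return 1.0 / (1.0 + math.exp(-(w0 + w_gain * gain - w_loss * loss)))


def counterfactual_rate(weights, gambles, window):
    """Mean predicted acceptance for the gambles whose EV lies in `window`."""
    lo, hi = window
    probs = [accept_probability(weights, g, l)
             for g, l in gambles if lo <= (g - l) / 2.0 <= hi]
    return sum(probs) / len(probs)


def cross_group_predictions(fitted_weights, other_group_sequences, window):
    """Expose each fitted model to every gamble sequence of the other group."""
    return [[counterfactual_rate(w, seq, window) for seq in other_group_sequences]
            for w in fitted_weights]


# Hypothetical weights fitted on two narrow-range participants.
narrow_models = [(0.0, 0.30, 0.30), (0.2, 0.25, 0.35)]

# Hypothetical (gain, loss) sequences seen by two wide-range participants.
wide_sequences = [[(10, 5), (20, 10), (40, 20), (12, 18)],
                  [(14, 14), (30, 6), (12, 10), (10, 20)]]

window = (-5.0, 7.5)  # assumed common EV window
predictions = cross_group_predictions(narrow_models, wide_sequences, window)

# A static model gives the same answer for a gamble whatever surrounds it.
alone = counterfactual_rate(narrow_models[0], [(12, 10)], window)
embedded = counterfactual_rate(narrow_models[0], [(12, 10), (40, 5)], window)
```

The last two lines illustrate why a static model cannot exhibit range adaptation: its prediction for a given gamble is unaffected by the other gambles in the sequence. A plastic model would instead have to process each sequence in order, because its connectivity depends on the past history of prospective gains and losses.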

We also quantified postdicted loss aversion dynamics under both types of models (Figure 8C and D). One can see that ANNs that operate efficient value synthesis do exhibit realistic loss aversion dynamics, whereby both groups are initially comparable and then progressively spread apart as time unfolds (and the impact of the range of prospective gains accumulates). Note that this systematic dynamical change in peoples’ behavior is the information that plastic ANNs exploit to calibrate both the magnitude and the rate of self-organized plasticity, which reacts to the past history of prospective gains and losses. This does not hold, however, for ANNs that operate static value synthesis, which overlook dynamical changes and attempt to explain gambling choices in terms of an idiosyncratic value landscape. Note that, under the efficient value synthesis scenario, the dynamics of self-organized plasticity are determined by magnitude (α in Equation 2) and rate (β in Equation 2) parameters. Accordingly, inter-individual differences in fitted plasticity magnitudes (but not plasticity rates) significantly correlate with inter-individual differences in behavioral loss aversion indices (narrow range: p=0.024, wide range: p<10⁻³, see Figure 8—figure supplement 2). Taken together, these results suggest that the self-organized plasticity mechanism in Equation 2 is necessary to capture the context-dependency of peoples’ loss aversion.

We now aim to evaluate the neurophysiological validity of fitted ANN models of value synthesis. To address this question, we ask whether the activity patterns in ANN models that were fitted to each participant’s gambling choices resemble the corresponding within-subject fMRI activity patterns in the OFC. We approach this problem using representational similarity analysis (RSA) within each subregion of the OFC. This allows us to compare the trial-by-trial multivariate activity patterns of candidate ANNs with those of fMRI signals in the OFC, without any additional ANN parameter adjustment. In brief, we compute four types of within-subject Representational Dissimilarity Matrices or RDMs (see Methods): (i) full trial-by-trial RDMs, (ii) gain-RDMs, where trials have been binned according to prospective gains, (iii) loss-RDMs, and (iv) EV-RDMs. We then measure the correlation between ANN-based and fMRI-based RDMs, for each OFC subregion and each participant. We then test for the statistical significance of this correlation within each group of participants (one-sample t-tests on Fisher-transformed within-subject correlations). Figure 9 below summarizes the RSA results in terms of the group-average RDM correlations, for both plastic and static ANNs.
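The group-level RSA test described above can be sketched in three steps: correlate the off-diagonal entries of the ANN-based and fMRI-based RDMs within each participant, apply the Fisher transform, and compute a one-sample t statistic across participants. The sketch assumes Pearson correlation over the upper triangle of each RDM (the exact correlation measure is specified in the Methods, not here), uses toy matrices and hypothetical per-subject correlations, and stops at the t statistic; a p-value would be obtained from the Student t distribution, e.g. with `scipy.stats.ttest_1samp`.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(rdm):
    """Off-diagonal entries above the diagonal of a square, symmetric RDM."""
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]


def rdm_correlation(rdm_a, rdm_b):
    """Similarity between two RDMs (assumed here: Pearson correlation)."""
    return pearson(upper_triangle(rdm_a), upper_triangle(rdm_b))


def fisher_z(r, eps=1e-12):
    """Fisher transform; r is clipped to keep atanh finite at r = +/-1."""
    r = max(min(r, 1.0 - eps), -1.0 + eps)
    return math.atanh(r)


def one_sample_t(values):
    """t statistic and degrees of freedom for H0: mean(values) == 0."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n)), n - 1


# Toy 3-condition RDMs for one participant.
ann_rdm = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
fmri_rdm = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
r = rdm_correlation(ann_rdm, fmri_rdm)

# Hypothetical within-subject RDM correlations for a group of five.
subject_correlations = [0.10, 0.25, 0.05, 0.30, 0.15]
t_value, dof = one_sample_t([fisher_z(c) for c in subject_correlations])
```

The Fisher transform is applied before the t-test because raw correlation coefficients are bounded and their sampling distribution is skewed, whereas the transformed values are closer to normal.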

Figure 9. Efficient value synthesis: Representational similarity analysis (RSA) results.

Within each panel, the correlation between artificial neural network (ANN)-based and fMRI-based representational dissimilarity matrices (RDMs) (y-axis) is shown for both groups of participants (red: narrow range group, blue: wide range group), and both ANN variants (left: plastic ANN, right: static ANN). Errorbars depict within-group s.e.m., and p-values are uncorrected for multiple comparisons. Each column shows the representational similarity analysis (RSA) results of a given OFC subregion from left to right: BA13 (medial), BA11 (lateral), BA14 (rostral), BA11 (medial), and BA32 (rostral). Each row shows one type of RDM (from top to bottom: trial-by-trial RDMs, gain-RDMs, loss-RDMs, and expected value (EV)-RDMs).

Figure 9—figure supplement 1. Response profile diversity in the integration units of HP-artificial neural networks (HP-ANNs).

The average proportion of HP-ANNs’ integration units (y-axis) that show a significant correlation with choice, chosen value, and offer value (or none of these) is plotted for both groups (blue: wide range, red: narrow range). Error bars depict s.e.m. (across participants). Horizontal gray bars show the normalized frequency of ‘offer value,’ ‘chosen value,’ and ‘choice’ cells detected at OFC neurons’ response peak (see Figure 4 in Padoa-Schioppa and Assad, 2006).

Intriguingly, both plastic and static ANN variants yield trial-by-trial activity patterns that significantly correlate with fMRI activity patterns in almost all OFC subregions (upper panels in Figure 9). This suggests that raw fMRI estimates of trial-by-trial dynamics of neural activity are not reliable enough to reveal the functional segregation of OFC subregions. This is not the case, however, when considering gain/loss/EV-RDMs. In brief, irrespective of the ANN model variant, no OFC subregion reaches statistical significance in both groups, for all types of RDM. Nevertheless, the lateral part of Brodmann area 11 almost meets this criterion, in that all RSA analyses are significant except for loss-RDMs in the narrow range group of participants, for both ANN variants (plastic ANN: p=0.22, static ANN: p=0.20). Given the anatomical specificity of this result, this is strong evidence that ANNs that operate value synthesis (whether plastic or static) provide a reasonably realistic prediction of the representational geometry of OFC neurons within the lateral part of Brodmann area 11. We note that, irrespective of the type of RDM considered, nowhere in the OFC is the comparison between the two model variants statistically significant.

Interestingly, ANNs that operate efficient value synthesis also reproduce other known features of value-coding neurons in the OFC. Recall that OFC neurons are notoriously diverse in their response profile, but a consistent finding is that, in the context of value-based decision-making, they can be classified in terms of so-called ‘choice cells,’ ‘chosen value cells,’ and ‘offer value cells’ (Padoa-Schioppa and Assad, 2006; Padoa-Schioppa and Assad, 2008). Given that this can be considered a pre-requisite for any computational model of value coding in the OFC, we asked whether plastic ANNs reproduce this known property of OFC neurons. For each subject, we thus tested whether the response of integration units correlates (across trials) with choice, chosen value, and/or gamble value, where value is defined as the weighted sum of gains and losses (according to the static logistic model parameter estimates). The results of this analysis are shown in Figure 9—figure supplement 1: in brief, plastic ANNs do exhibit this type of apparent coding variability, and predicted category proportions are qualitatively comparable to those reported in the existing literature. This provides additional neurobiological validity to ANN models of efficient value synthesis in the OFC.

Finally, we show that other adaptation models (in particular: efficient coding at the level of decision attributes) cannot explain neural data on value range adaptation. This is summarized in Figure 10 below. Although they predict qualitatively similar behavioral range adaptation effects (see Figure 10—figure supplement 1), they do not predict value range adaptation in the ANN’s integration layer (Figure 10A, B). They also predict that increasing the range of prospective gains should increase the neural encoding strengths of gains within the integration layer (Figure 10C), which is at odds with the empirical data that we report here (Figures 5 and 6). Moreover, when fitted on participants’ behavioral choices, they do not generalize well across gain range contexts (Figure 10—figure supplement 2), and their RSA results are less convincing (even in the lateral part of Brodmann area 11, see Figure 10—figure supplement 3). The mathematical derivation of such models, as well as the analysis of their predictions, are summarized in the Supplementary material.

Figure 10. Main divergent predictions of the efficient coding of attributes scenario.

(A) The average units’ response (y-axis) to pairs of prospective gains and losses that fall within predefined expected value (EV) bins (x-axis) is shown, while ANNs that operate efficient value synthesis are exposed to four different spanned gain/loss domains (blue: narrow ranges of gains and losses, violet: wide ranges of gains and losses, red: narrow gain range and wide loss range, yellow: wide gain range and narrow loss range, see main text). Only integration units that correlate positively with EV are shown. (B) Integration units that correlate negatively with EV, same format as panel (A). (C) The neural encoding strength of gains (color code: from blue -low encoding strength- to yellow -high encoding strength-) is shown as a function of the spanned range of losses (x-axis, range increases from left to right) and gains (y-axis, range increases from bottom to top). (D) Neural encoding strength of losses, same format.

Figure 10—figure supplement 1. Efficient coding of attributes: impact of spanned ranges of gains and losses.

(A) The neural encoding strength of gains (color code: from blue -minimal encoding strength- to yellow -maximal encoding strength-) is shown as a function of the spanned range of losses (x-axis, range increases from left to right) and gains (y-axis, range increases from bottom to top). (B) Neural encoding strength of losses, same format. (C) Neural encoding strength of expected value (EV), same format. (D) Behavioral sensitivity to gains, same format. (E) Behavioral sensitivity to losses, same format. (F) Behavioral loss aversion, same format.

Figure 10—figure supplement 2. Efficient coding of attributes: Artificial neural network (ANN) analysis of behavioral data.

(A) The probability of gamble acceptance (y-axis) is plotted against the gambles’ expected value (EV) (x-axis), in both groups (red: narrow range, blue: wide range). Dots show raw data (error bars depict s.e.m.). Plain and dashed lines show postdictions and out-of-sample predictions (see main text), respectively. The gray-shaded area highlights the range of expected values that is common to both groups. (B) Postdicted loss aversion (y-axis) is plotted as a function of time (x-axis) for both groups (same color code).

Figure 10—figure supplement 3. Efficient coding of attributes: Representational similarity analysis (RSA) results.

Within each panel, the correlation between artificial neural network (ANN)-based and fMRI-based representational dissimilarity matrices (RDMs) (y-axis) is shown for both groups of participants (red: narrow range, blue: wide range). Error bars depict within-group s.e.m. Each column shows the representational similarity analysis (RSA) results of a given OFC subregion from left to right: BA13 (medial), BA11 (lateral), BA14 (rostral), BA11 (medial), and BA32 (rostral). Each row shows one type of RDM (from top to bottom: full-RDMs, gain-RDMs, loss-RDMs, and expected value (EV)-RDMs).

Taken together, these behavioral and neural analyses provide converging evidence that self-organized plasticity that operates efficient value synthesis in the lateral part of BA11 is a likely explanation for range adaptation of loss aversion.

Discussion

In this work, we investigate the neural range adaptation mechanism in OFC neurons that underlies the irrational context-dependency of value-based decisions. We focus on risky decisions, where value needs to be constructed out of primitive decision attributes (here: prospective gains and losses). This eventually disambiguates the neural and behavioral implications of candidate computational scenarios for range adaptation. We show that a specific form of self-organized plasticity between attribute-specific and attribute-integration neurons best predicts (out-of-sample) both context-dependent behavioral biases and range adaptation in OFC neurons.

The processing of reward signals in OFC neurons is known to exhibit range adaptation (Conen and Padoa-Schioppa, 2019; Louie and Glimcher, 2012; Padoa-Schioppa, 2009; Rangel and Clithero, 2012). The typical explanation is that OFC neurons adapt their output firing properties to match the recent history of values (Polanía et al., 2019). Implicit in this reasoning is the idea that OFC neurons are receiving value signals, which they are transmitting to downstream decision systems (Padoa-Schioppa and Rustichini, 2014; Rustichini et al., 2017). However, this assumption is at odds with the notion that OFC neurons are rather constructing value from input signals about decision-relevant attributes (O’Doherty et al., 2021; Pessiglione and Daunizeau, 2021; Raghuraman and Padoa-Schioppa, 2014). An important contribution of this work is to show that such a value synthesis scenario is compatible with known value range adaptation effects in OFC neurons (Figure 2). In particular, our results suggest that value range adaptation may be the byproduct of self-organized plasticity that aims at mitigating information loss induced by limited neural response ranges. At the behavioral level, this scenario predicts that peoples’ sensitivity to decision attributes inversely scales with the range of each decision attribute. In the context of gamble decisions, this implies that loss aversion follows the ratio of spanned ranges of gain w.r.t. losses (Figure 3). This systematic dependence of peoples’ loss aversion on the relative ranges of spanned gains and losses has already been documented (Rakow et al., 2020). However, when considering behavioral data alone, the interpretative power of this kind of experimental design is limited (Williams et al., 2021).

In fact, the same behavioral pattern can be predicted under simpler efficient coding scenarios that operate at the level of attribute-specific layers (see the section on ‘efficient coding of gains and losses’ in the Supplementary materials). Interestingly, these models make neural predictions that are distinct from the efficient value synthesis scenario. In particular, ANNs that operate efficient coding of attributes do not exhibit value range adaptation effects in integration units (see Figure 10A, B). Also, they predict that the neural encoding strength of gains in integration units increases when the spanned range of gains increases (see Figure 10C). This clearly goes against the results of our model-free analyses of fMRI data in the OFC. The distinction between these two scenarios (i.e. efficient coding of attributes versus efficient value synthesis) is important, because it may confound the relationship between range adaptation in OFC neurons and its behavioral consequences. In this sense, our results complement and extend previous computational modeling studies that focus on the behavioural impact of range adaptation in attribute-specific units (Soltani et al., 2012). In principle, the two mechanisms may coexist. Importantly, however, range adaptation within the attribute layers does not obviate the need for range adaptation within the integration layer. This is simply because each integration unit receives an arbitrary mixture of inputs. More generally, within a hierarchical system relying on units equipped with saturating activation functions, efficient information processing requires range adaptation at all levels of the hierarchy. Having said this, many other candidate neural mechanisms may, in principle, compete or interact with the self-organized plasticity that we have considered here, eventually crystalizing or destabilizing plastic changes. 
This is the case for, e.g., Hebbian and homeostatic plasticities, which are known to induce slow neural hysteretic effects (Fox and Stryker, 2017; Pezzulo et al., 2015; Toyoizumi et al., 2014; Turrigiano, 2017). Recent theoretical arguments also suggest that flexible attribute integration in OFC neurons may necessitate plastic changes in the synaptic gain of upstream attribute-specific neurons (O’Doherty et al., 2021). More precisely, the wiring between attribute-specific and attribute-integration neurons should self-organize according to the contextual relevance of attributes. The extent to which the properties of these or similar kinds of neurophysiological mechanisms may explain contextual dependence and/or irrational behavioral responses is an open and challenging issue.

At the neural level, it is reassuring to see that fMRI patterns of activity in the lateral part of Brodmann area 11 strongly resemble the quantitative predictions of plastic ANN models. One might find it disappointing that these predictions turn out not to be verified in Brodmann area 32, owing to the known value encoding within the ventromedial PFC (see, e.g. Clairis and Pessiglione, 2022; Lopez-Persem et al., 2020). In fact, there is an ongoing debate regarding the relative contribution of OFC subregions w.r.t. value processing. For example, lateral, but not medial, OFC may host representations of attributes that presumably compose value judgements (Suzuki et al., 2017). Although this clearly aligns with our model-free fMRI data analysis results, we do not claim that the evidence we provide regarding the anatomical location of value synthesis generalizes beyond decision contexts that probe peoples’ loss aversion. In fact, our main claim is about whether and how efficient value synthesis operates within the OFC, as opposed to which specific subregion of the OFC drives the adaptation of loss aversion and/or related behavioral processes.

Having said this, we note that ANN integration units do exhibit response profiles that are reminiscent of typical OFC neurons' electrophysiological activity during value-based decision-making. For example, they reproduce the diversity of coding that has been repeatedly observed in OFC neurons during value-based decision-making (‘offer value cells,’ ‘chosen value cells,’ and ‘choice cells;’ Padoa-Schioppa and Assad, 2006, Padoa-Schioppa and Assad, 2008). This is summarized in Figure 9—figure supplement 1. We see this as a non-specific byproduct of the mixed selectivity of integration units, which exhibit arbitrarily complex and idiosyncratic receptive fields (see Figure 1—figure supplements 1 and 3). More importantly, integration units also exhibit the known properties of value range adaptation in these same neurons (Figure 2). Intriguingly, however, value range adaptation in ‘offer value cells’ had been observed without any significant behavioral preference change. Under the assumption that preferences between offers derive from the direct comparison of output signals from distinct ‘offer value cells,’ this is surprising. To solve this puzzle, later theoretical work proposed that value range adaptation is somehow ‘undone’ downstream of value coding in the OFC (Padoa-Schioppa and Rustichini, 2014). In our context, this would suggest that readout weights (w_k in Equation 2) would compensate for value-related adaptation, effectively thwarting the behavioral consequences of self-organized plasticity between attribute and integration layers. However, this reasoning critically relies upon the assumed computational role of ‘offer value cells.’ In fact, this puzzle may effectively dissolve under other scenarios of how ‘offer value cells’ contribute to decision-making. Recall that this null result was obtained in a decision context where choice options were characterized in terms of the type of offer (i.e. juices that differ w.r.t. palatability), whose quantity was systematically varied. Here, value synthesis would effectively aggregate two attributes, namely palatability and quantity. Under this view, ‘offer value cells’ simply are integration units that show a certain form of mixed selectivity, whereby units’ sensitivity to quantity strongly depends upon palatability. At this point, one needs to consider candidate scenarios of how the OFC may operate value synthesis for multiple options in a choice set. A possibility is that the OFC is automatically computing the value of the option that is currently under the attentional focus (Lebreton et al., 2009; Lopez-Persem et al., 2020), while storing the value of previously attended options within an orthogonal population code (Pessiglione and Daunizeau, 2021). In principle, this implies that the OFC is wired such that it can handle arbitrary switches in attentional locus without compromising the integration of option-specific attributes. In this scenario, integration units (including those that look like ‘offer value cells’) would adapt to the range of all incoming attribute signals, irrespective of which option in the choice set is currently attended. In turn, ‘offer value cells’ would look like they are only partially adapting to the value range of a given offer type (Burke et al., 2016; Conen and Padoa-Schioppa, 2019). More importantly, to the extent that between-attribute spillover effects are negligible, changes in the range of offer quantities would distort the readout value profile along the quantity dimension without altering the palatability dimension. This would effectively leave the relative preference between offer types unchanged. Of course, this is only one candidate scenario among many. Nevertheless, we would still argue that the behavioral consequences of range adaptation in ‘offer value cells’ actually depend upon their underlying computational role.

Now, whether this sort of ANN model produces ‘realistic’ electrophysiological activity profiles beyond this kind of empirical observation is questionable. The reason is twofold. First, they are agnostic w.r.t. within-trial temporal dynamics. Second, there is some level of arbitrariness in the modeling assumptions (e.g. ANN structural constraints) that cannot be finessed using either behavioral or neuroimaging data. What we argue is robust in these ANN models is the information content that they carry, which is distributed over the activity profiles of their artificial neural unit layers. This is the main reason why we resort to variants of RSA analyses for comparing their predictions to multivariate fMRI activity patterns.

At this point, let us comment on a seemingly innocuous neural modeling assumption: namely, that units’ input-output activation functions are saturating. This was motivated by the fact that neurons’ firing rate cannot exceed some predefined physiological limit (see, e.g. Wang et al., 2016). Under the framework of efficient coding, such response range limitation is eventually what creates the need for range adaptation. This is because information loss mostly follows from inputs reaching the saturating domain of units’ activation functions. However, one may wonder how robust our efficient value synthesis scenario is to deviations from this assumption. Analytical derivations show that other monotonic (and bounded) activation functions would yield very similar self-organized plasticity rules. This means that our results would generalize to any monotonic activation function. However, it turns out that efficient value synthesis yields unstable self-organized plasticity dynamics under non-monotonic (e.g. Gaussian or bell-shaped) activation functions. To understand this, recall that the self-organized plasticity rule derives from aligning the connectivity change with the gradient of information loss w.r.t. connection strengths. This gradient explodes when inputs fall within domains where the derivative of the activation function approaches zero. This unavoidably happens with non-monotonic activation functions because the plasticity mechanism eventually focuses the weighted inputs within the vicinity of their mode. In other terms, one may argue that only monotonic activation functions are compatible with the efficient value synthesis scenario.

Now, how generalizable is the neural mechanism we disclose here? We argue that self-organized plasticity may explain many forms of persistent irrational behavioral changes, through gradual range adaptation effects in OFC neurons. We note that, in our context, these changes seem to unfold over several minutes (Figure 4D), which is consistent with the fastest time scale of long-term potentiation/depression (Abraham, 2003). However, we contend that the evidence we provide here is insufficient to establish whether these changes remain stable over longer periods and whether they can be overcome by explicit instructions or intensive training (Cicchini et al., 2012). A related issue is whether similar plasticity mechanisms may explain virtually instantaneous range adaptation in value-coding neurons (Louie et al., 2015), eventually driving behavioral phenomena such as the framing effect. Here, we speculate that the framing of decisions may automatically trigger contextual expectations regarding expected gain and/or loss ranges, which may induce fast plastic changes within value-constructing networks through, e.g., short-term potentiation (Fiebig and Lansner, 2017).

That the brain’s biology is to blame for all kinds of cognitive and/or behavioral flaws is not a novel idea (Buschman et al., 2011; Marois and Ivanoff, 2005; Miller and Buschman, 2015; Ramsey et al., 2004). However, providing neuroscientific evidence that a hard-wired biological constraint shapes and/or distorts the way the brain processes information is not an easy task. This is because whether the brain deviates from how it should process a piece of information is virtually unknown. This is particularly true for value-guided decision-making, which relates to subjective assessments of preferences rather than objective processing of decision-relevant evidence (Rangel et al., 2008). Nevertheless, value-guided decision-making is known to exhibit many irrational biases, the neurocognitive explanations of which have been the focus of intense research over the past decades. From a methodological standpoint, our main contribution is to show how to leverage computational models (in particular: ANNs) to test hypotheses regarding neurophysiological mechanisms that may constrain or distort behaviorally relevant information processing. On the one hand, we retain the simplicity of established ‘model-based’ fMRI approaches (Borst et al., 2011; O’Doherty et al., 2007), which proceed by cross-validating the identification of hidden computational determinants of behavior with neural data. On the other hand, our dual ANN/RSA approach enables us to quantify the statistical evidence for neurophysiological mechanisms that are difficult, if not impossible, to include in computational models that are defined at Marr’s algorithmic level (McClamrock, 1991), e.g., normative models of behavior (as derived from, e.g. learning or decision theories) and/or cognitive extensions thereof. Self-organized plasticity between attribute-specific and attribute-integration units is a paradigmatic example of what we mean here.
More generally, hard-wired biological mechanisms or constraints may not always be instrumental to the cognitive process of interest. In turn, it may be challenging to account for incidental biological disturbances of neural information processing, when described at the algorithmic level. A possibility here is to conceive of these disturbances as some form of random noise that perturbs cognitive computations (Drugowitsch et al., 2016; Wyart and Koechlin, 2016). In contrast, we rather suggest relying on computational models that solve a well-defined computational problem (here: constructing the gambles’ subjective value from prospective gains and losses) but operate at the neural level. Accounting for possibly incidental, biological constraints and/or hard-wired mechanisms then enables comparing quantitative/deterministic scenarios for sub-optimal disturbances of covert cognitive processes of interest.

Methods

Artificial neural network models of value synthesis

Artificial neural networks or ANNs decompose a possibly complex form of information processing in terms of a combination of very simple computations performed by connected ‘units,’ which are mathematical abstractions of neurons. Here, we take inspiration from a growing number of studies that use ANNs as mechanistic models of neural information processing (Güçlü and van Gerven, 2015; Kietzmann et al., 2017; Kietzmann et al., 2019; Kriegeskorte and Golan, 2019), with the added requirement that they eventually explain (possibly irrational) behavioural data.

In abstract terms, any decision can be thought of as a cognitive process that transforms some input information \(u = (u^{(1)}, u^{(2)}, \ldots, u^{(n_u)})\) into a behavioral output response \(r\). Here, participants have to accept or reject a risky gamble composed of a 50% chance of obtaining a gain \(G\) and a 50% chance of experiencing a loss \(L\), i.e., \(u\) is composed of \(n_u = 2\) input attributes: \(u = (G, L)\). Under an ANN model of such decisions, people’s behavioral response is the output of a neural network that processes the attributes \(u\), i.e., \(r \approx g_{ANN}(u, \vartheta)\), where \(\vartheta\) are unknown ANN parameters and \(g_{ANN}(\cdot)\) is the ANN’s input-output transformation function. So-called ‘shallow’ ANNs effectively reduce \(g_{ANN}(\cdot)\) to a combination of neural units organized in a single hidden layer. In what follows, we rather rely on (moderately) deep ANNs with two hidden layers: namely, an attribute-specific layer (which is itself decomposed into gain-specific and loss-specific layers) and an integration layer (which receives inputs from both attribute-specific layers). The units that compose the latter then collectively determine gamble decisions by integrating prospective gains and losses.

We assume that each attribute \(u_t^{(i)}\) is encoded into the activity of neurons \(x_t^{(i,1)}, x_t^{(i,2)}, \ldots, x_t^{(i,j)}, \ldots, x_t^{(i,n_x)}\) of its dedicated ‘attribute-specific layer,’ where \(n_x\) is the number of attribute-specific neurons per attribute. What we mean here is that the neuron \(j\) in the attribute-specific layer \(i\) responds to \(u_t^{(i)}\) as follows:

\[ x_t^{(i,j)} = f_x^{(i,j)}\left(u_t^{(i)}, \theta^{(i,j)}\right) \tag{3} \]

where \(f(\cdot)\) is the activation function of neural units that compose the ANN’s attribute-specific layer:

\[ f(u, \theta) = \frac{1}{1 + \exp\left(\frac{\mu - u}{\sigma}\right)} \tag{4} \]

which yields a sigmoidal transform of inputs. Critically, such activation functions are bounded, i.e., we assume that neural units cannot fire beyond a certain rate. In neocortical neurons, the main mechanistic constraint that acts on firing rate is the action potential’s refractory period, which depends upon how long it takes ion channels to complete a whole voltage activate-deactivate cycle (about a few milliseconds). Typically, even fast-spiking neocortical neurons cannot fire at a frequency higher than about 500 or 600 Hz (Wang et al., 2016). As we will see, this type of response saturation is a critical component of range adaptation.

The parameters \(\theta^{(i,j)} = \{\mu^{(i,j)}, \sigma^{(i,j)}\}\) in Equation 3 capture idiosyncratic properties of the neuron \(j\) in the input layer \(i\) (e.g. its firing rate threshold \(\mu^{(i,j)}\) and the slope parameter \(\sigma^{(i,j)}\)).
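To make the role of response saturation concrete, the sigmoidal transform of Equation 4 can be sketched in a few lines of NumPy. This is only an illustration: the threshold, slope, and input values below are hypothetical and are not taken from the fitted models.

```python
import numpy as np

def sigmoid(u, mu, sigma):
    """Equation 4: bounded response with firing threshold mu and slope parameter sigma."""
    return 1.0 / (1.0 + np.exp((mu - u) / sigma))

# Hypothetical unit (values chosen for illustration only).
mu, sigma = 10.0, 2.0
inputs = np.array([0.0, 8.0, 10.0, 12.0, 30.0, 40.0])
responses = sigmoid(inputs, mu, sigma)

# Inputs far above threshold yield nearly identical outputs (saturation),
# whereas inputs near threshold remain well separated.
gap_saturated = responses[5] - responses[4]    # inputs 40 vs 30
gap_responsive = responses[3] - responses[1]   # inputs 12 vs 8
```

In this toy case, a 10-unit difference between two large inputs is almost invisible in the output, while a 4-unit difference around the threshold produces a large response contrast.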

Collectively, the activity vector \([x_t^{(i,j)}]_{j=1,\ldots,n_x}\) forms a multivariate representation of attribute \(u_t^{(i)}\) in the form of a population code (Ebitz and Hayden, 2021). Then the output of the attribute-specific layers is passed to the ‘integration layer’ \([z_t^{(1)}, z_t^{(2)}, \ldots, z_t^{(k)}, \ldots, z_t^{(n_z)}]\), i.e., the neuron \(k\) of the integration layer responds to \([x_t^{(i,j)}]_{j=1,\ldots,n_x}^{i=1,\ldots,n_u}\) as follows:

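\[ \begin{cases} z_t^{(k)} = f_z^{(k)}\left(\upsilon_t^{(k)}, \phi^{(k)}\right) \\ \upsilon_t^{(k)} = \sum_{i=1}^{n_u} \sum_{j=1}^{n_x} C^{(i,j,k)} x_t^{(i,j)} \end{cases} \tag{5} \]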

where C(i,j,k) is the connection weight from the neuron j in the attribute-specific layer i to the neuron k of the integration layer, ϕ(k) capture idiosyncratic properties of the integration neuron k (i.e. its firing rate threshold and slope), and υt(k) are the inputs of integration units.

Collectively, integration neurons form a representation of decision value in the form of a population code, i.e., the gambles’ subjective value Vt at the time or trial t is read out from the integration layer as follows:

\[ V_t = \sum_{k=1}^{n_z} W^{(k)} z_t^{(k)} \tag{6} \]

where W(k) are the population readout weights.

Taken together, Equations 3–6 define the end-to-end ANN’s transformation of prospective gains and losses into decision value \(V_t = V(u_t, \vartheta)\):

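\[ V(u_t, \vartheta) \triangleq \sum_{k=1}^{n_z} W^{(k)} f_z\left( \sum_{i=1}^{n_u} \sum_{j=1}^{n_x} C^{(i,j,k)} f_x\left(u_t^{(i)}, \theta^{(i,j)}\right), \phi^{(k)} \right) \tag{7} \]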

where \(\vartheta\) lumps all ANN parameters together, i.e., \(\vartheta \triangleq \{W, C, \theta, \phi, \upsilon\}\). This is what we coin value synthesis. A schematic summary of the ANN’s double-layer structure is shown in panel A of Figure 1.
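As a minimal sketch of the forward pass defined by Equations 3 to 7, the following NumPy code maps one gamble onto a readout value. The function name `value_synthesis`, the layer sizes, and the random parameter values are illustrative assumptions; they do not correspond to the fitted networks reported in this work.

```python
import numpy as np

def sigmoid(v, mu, sigma):
    # Sigmoidal activation of Equation 4.
    return 1.0 / (1.0 + np.exp((mu - v) / sigma))

def value_synthesis(u, theta_mu, theta_sigma, C, phi_mu, phi_sigma, W):
    """Map attributes u (n_u,) onto a scalar value (Equations 3-7).

    theta_mu, theta_sigma: (n_u, n_x); C: (n_u, n_x, n_z);
    phi_mu, phi_sigma, W: (n_z,).
    """
    x = sigmoid(u[:, None], theta_mu, theta_sigma)   # Eq. 3: attribute-specific units
    v = np.einsum("ijk,ij->k", C, x)                 # Eq. 5: inputs to integration units
    z = sigmoid(v, phi_mu, phi_sigma)                # Eq. 5: integration unit responses
    return W @ z, z, x                               # Eq. 6: population readout

# Hypothetical network: 2 attributes, 4 units per attribute, 3 integration units.
rng = np.random.default_rng(0)
n_u, n_x, n_z = 2, 4, 3
params = dict(
    theta_mu=rng.uniform(5, 20, (n_u, n_x)),
    theta_sigma=np.full((n_u, n_x), 3.0),
    C=rng.normal(0, 1, (n_u, n_x, n_z)),
    phi_mu=np.zeros(n_z),
    phi_sigma=np.ones(n_z),
    W=rng.normal(0, 1, n_z),
)
V, z, x = value_synthesis(np.array([12.0, 8.0]), **params)  # gain = 12, loss = 8
```

The `einsum` call implements the double sum over attributes and attribute-specific units in Equation 5.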

Efficient value synthesis and self-organized plasticity

We start with the premise that the brain system that integrates value-relevant option features (here: prospective gains and losses) to construct value signals may be doing this under neural noise, which degrades the information about value. In particular, the limited range of physiological responses of neural units that perform this integration induces some information loss on value signals. This is because, when inputs to integration units fall too far away from their firing threshold (say outside a ±22σ range), activation functions saturate, i.e., they produce non-discriminable outputs (close to 0 or 1). In this context, efficient value synthesis refers to the idea that neural networks that perform the integration of prospective gains and losses to construct value may adapt their response properties to mitigate information loss, hence the ‘efficiency’ of value synthesis. We now sketch how efficient value synthesis can be achieved within ANNs whose 2-layer structure is described in Equations 3-6.

In the presence of neural noise, the ANN’s readout value \(\tilde{V}_t\) of a gamble made of a pair \((G_t, L_t)\) of prospective gain and loss is given by (in lieu of Equation 6):

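\[ \begin{cases} \tilde{V}_t = \sum_k w^{(k)} \tilde{z}_t^{(k)} \\ \tilde{z}_t^{(k)} = z^{(k)}(G_t, L_t) + \eta_t^{(k)} \end{cases} \tag{8} \]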

where ηt(k) is some (uncontrollable) neural noise that competes with the ‘utile’ component z(k)(Gt,Lt) that is given in Equation 5.

Equation 8 can serve to measure the information loss IL that is induced by neural noise under units’ limited response range:

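\[ IL = -MI(\tilde{z}, z) \xrightarrow{\eta \to 0} K - H[z] = K - H[\upsilon] - \sum_k E\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] \tag{9} \]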

Equation 9 states that the information loss increases when the mutual information between the noisy responses of integration units and their ‘utile’ (i.e. noiseless) component decreases. Here, MI is Shannon’s mutual information, K is a constant, H is Shannon’s entropy, and E is the standard expectation operator. The right-hand term in Equation 9 arises at the small noise limit (Nadal, 1994), and the expectation is taken under the distribution of integration units’ inputs \(\upsilon\). The last term in Equation 9 is simply the average steepness (in log space) of units’ activation functions. Importantly, Equation 9 holds irrespective of the type of nonlinearity of ANN units’ activation functions.
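The step from \(H[z]\) to \(H[\upsilon]\) in Equation 9 can be read as the standard change-of-variables rule for differential entropy. As a brief sketch, assuming that each integration unit applies a smooth and strictly monotonic activation function to its own input only:

```latex
\begin{aligned}
p_z(z) &= p_\upsilon(\upsilon) \prod_k \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right|^{-1} \\
H[z] &= -E\left[\ln p_z(z)\right]
      = H[\upsilon] + \sum_k E\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right]
\end{aligned}
```

Under this assumption, the output entropy grows with the average log-steepness of the activation functions over the input distribution, which is why flat (saturated) response domains are costly in terms of information.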

The entropy \(H[\upsilon]\) has no closed-form expression, but can be given a multivariate Gaussian approximation, i.e., \(H[\upsilon] \approx \ln\left|C S C^T\right|/2 + K'\), where \(S = E[x x^T]\) is the covariance matrix of the output of the ANN’s first layer and \(K'\) is a constant. In principle, this approximation works because, when the size of the network grows, the central limit theorem implies that the distribution of integration units’ inputs \(\upsilon\) will tend towards normality. The robustness of this approximation has been established in the context of undercomplete ICA (Porrill and Stone, 1998).

Efficient value synthesis can then be simply achieved by modifying the connectivity matrix C to decrease the information loss IL, i.e., along the direction of the information loss gradient:

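\[ \Delta C = -\alpha \frac{\partial IL}{\partial C} = \alpha \frac{\partial H[\upsilon]}{\partial C} + \alpha \sum_k \frac{\partial}{\partial C} E\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] \tag{10} \]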

where α controls the magnitude of the gradient-following step and the first term in the right-hand side of Equation 10 is given by:

\[ \frac{\partial H[\upsilon]}{\partial C} \approx \underbrace{\left(C S C^T\right)^{-1}}_{\text{nonlocal}} C S \tag{11} \]

The matrix multiplier in the right-hand side of Equation 11 is non-local, i.e., the gradient \(\partial H[\upsilon] / \partial C^{(i,j,k)}\) depends upon all connection weights in the network. This is unrealistic for biological systems, and we thus drop this term in the remainder of this manuscript. In turn, Equation 10 can be approximated as a collection of local changes to the connectivity matrix:

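\[ \Delta C^{(i,j,k)} \approx \alpha \frac{\partial}{\partial C^{(i,j,k)}} E\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] \tag{12} \]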

We will see that Equation 12 only involves the output response z(k) and x(i,j) of the pair of attribute and integration units that are connected through C(i,j,k). Equation 12 implies that efficient integration will tend to change the distribution of inputs υ(k) to each integration unit such that they span the range where the steepness of its activation function is maximal. Focusing inputs to the responsive range of integration units’ activation functions then maximizes the output variability induced by attribute inputs. This makes sense, since this is expected to yield maximal contrast over the response outputs of integration units.

But Equation 12 still requires a last modification to derive a realistic self-organized plasticity rule for efficient value synthesis. This is because self-organized plasticity is a dynamical process, which reacts to recent network activity, as trials and/or time unfolds.

Note that the expectation in Equation 12 is taken under the distribution of prospective gains and losses, and can, therefore, be defined as a sample average over trial-by-trial gamble attributes. If the underlying distribution is non-stationary, then E can be estimated at trial or time t using a simple weighted moving average operator \(\hat{E}_t[\cdot]\):

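\[ \hat{E}_t\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] = \beta \sum_{t'=1}^{t} (1-\beta)^{t-t'} \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon_{t'}^{(k)}} \right| = (1-\beta)\, \hat{E}_{t-1}\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] + \beta \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon_t^{(k)}} \right| \tag{13} \]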

where β (note: 0<β<1) controls the exponential decay of past samples’ weights in the moving average operator.
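The equivalence between the weighted-sum and recursive forms of Equation 13 can be checked numerically. The sketch below assumes that the moving average starts from an estimate of zero before the first trial; the sample values and the value of β are arbitrary stand-ins for trial-wise log-steepness terms.

```python
import numpy as np

def ema_direct(a, beta):
    """Weighted-sum form of Equation 13, evaluated at the last trial t = len(a)."""
    t = len(a)
    weights = beta * (1.0 - beta) ** (t - np.arange(1, t + 1))
    return float(weights @ a)

def ema_recursive(a, beta):
    """Recursive form of Equation 13, starting from a zero estimate before trial 1."""
    estimate = 0.0
    for a_t in a:
        estimate = (1.0 - beta) * estimate + beta * a_t
    return estimate

rng = np.random.default_rng(1)
samples = rng.normal(size=50)   # stand-in for trial-wise ln|df/dv| values
beta = 0.2
difference = abs(ema_direct(samples, beta) - ema_recursive(samples, beta))
```

The recursive form only needs the previous estimate and the current sample, which is what makes a trial-by-trial (online) plasticity rule possible.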

Let \(\Delta C_t^{(i,j,k)}\) be the change of connectivity at trial or time t. Replacing the expectation in Equation 12 with the moving average operator \(\hat{E}_t[\cdot]\) in Equation 13 now yields:

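\[ \Delta C_t^{(i,j,k)} \approx \alpha \frac{\partial}{\partial C^{(i,j,k)}} \hat{E}_t\left[ \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \right] = \alpha (1-\beta)\, \Delta C_{t-1}^{(i,j,k)} + \alpha \beta \frac{\partial}{\partial C^{(i,j,k)}} \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon_t^{(k)}} \right| \tag{14} \]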

where the local gradient can be written as:

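\[ \frac{\partial}{\partial C^{(i,j,k)}} \ln \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| = \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right|^{-1} \frac{\partial}{\partial \upsilon^{(k)}} \left| \frac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| \frac{\partial \upsilon^{(k)}}{\partial C^{(i,j,k)}} \tag{15} \]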

Under sigmoidal activation functions, then:

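\[ \begin{cases} \left| \dfrac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| = \dfrac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} = \dfrac{f_z^{(k)} \left(1 - f_z^{(k)}\right)}{\sigma_z^{(k)}} \\[2ex] \dfrac{\partial}{\partial \upsilon^{(k)}} \left| \dfrac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \right| = \dfrac{\partial^2 f_z^{(k)}}{\partial \upsilon^{(k)2}} = \dfrac{1}{\sigma_z^{(k)}} \dfrac{\partial f_z^{(k)}}{\partial \upsilon^{(k)}} \left(1 - 2 f_z^{(k)}\right) \end{cases} \tag{16} \]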

Replacing Equation 16 into Equations 14-15 then yields:

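\[ \Delta C_t^{(i,j,k)} = \alpha (1-\beta)\, \Delta C_{t-1}^{(i,j,k)} + \frac{\alpha \beta}{\sigma_z^{(k)}} \left(1 - 2 z_t^{(k)}\right) x_t^{(i,j)} \tag{17} \]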

which only depends upon the output response of connected pairs of attribute-specific and attribute-integration units.

Note that accounting for the nonlocal component of Equations 10-11 would require inserting the correction term \(\alpha \beta \left[ \left(\upsilon_t \upsilon_t^T\right)^{-1} \upsilon_t x_t^T \right]_{ij}\) in the right-hand side of Equation 17. In our experience, its magnitude is typically small when compared to the Hebbian term in Equation 17. In turn, this term can be neglected without altering the main properties of efficient value synthesis.

Equation 17 states that efficient value synthesis can be operated by local, history-dependent, self-organized plasticity within the network. The plasticity in Equation 17 is ‘self-organized’ in the sense that it does not require any teaching or feedback signal. In this context, β determines the adaptation rate of the network’s connectivity to changes in the distribution of prospective gains and losses. Importantly, the anti-Hebbian component of self-organized plasticity generalizes to any nonlinear activation function that is continuous and monotonically increasing. This is not the case, however, for non-monotonic activation functions (e.g. pseudo-gaussian activation functions).
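The local rule of Equation 17 can be sketched as follows. This is an illustrative implementation under two assumptions that are not spelled out above: the connectivity is updated additively after each trial (C becomes C + ΔC), and the toy network contains a single connection whose initial strength drives the integration unit into saturation. All names and parameter values are hypothetical.

```python
import numpy as np

def sigmoid(v, mu, sigma):
    # Sigmoidal activation of Equation 4.
    return 1.0 / (1.0 + np.exp((mu - v) / sigma))

def plasticity_step(C, dC_prev, x, phi_mu, phi_sigma, alpha, beta):
    """One application of Equation 17.

    C, dC_prev: (n_u, n_x, n_z); x: (n_u, n_x) attribute-layer outputs.
    """
    v = np.einsum("ijk,ij->k", C, x)                 # inputs to integration units
    z = sigmoid(v, phi_mu, phi_sigma)                # integration unit responses
    # Local term: depends only on the pre-synaptic x and post-synaptic z.
    local = x[:, :, None] * ((1.0 - 2.0 * z) / phi_sigma)[None, None, :]
    dC = alpha * (1.0 - beta) * dC_prev + alpha * beta * local
    return C + dC, dC, z                             # assumed additive update

# Toy case: one connection, constant pre-synaptic activity, saturating start.
C = np.full((1, 1, 1), 5.0)
dC = np.zeros_like(C)
x = np.ones((1, 1))
phi_mu, phi_sigma = np.zeros(1), np.ones(1)
v_start = float(C[0, 0, 0] * x[0, 0])
for _ in range(200):
    C, dC, z = plasticity_step(C, dC, x, phi_mu, phi_sigma, alpha=0.5, beta=0.5)
v_end = float(C[0, 0, 0] * x[0, 0])
```

In this toy case, the weighted input moves from the saturating domain towards the unit's threshold, where the activation function is steepest, without any teaching or feedback signal.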

In summary, as long as ANN units have monotonically increasing activation functions, efficient value synthesis can be implemented through some self-organized plasticity rule of the form given in Equation 17. It turns out that the self-organized plasticity rule in Equation 17 essentially modifies the integration units’ receptive fields, i.e., their pattern of response to a given pair of prospective gain and loss. This has two main consequences: it changes the information content of the network, and it distorts the readout value. We unpack these two phenomena using numerical simulations, which we report in the first two sections of the Supplementary materials. At this point, we simply note that the effect of self-organized plasticity on both the readout value profile and the information content within the integration layer depends upon the ranges of prospective gains and losses that the ANN is exposed to. The neural and behavioral impacts of the shape of the spanned domain of gains and losses are summarized in Figures 2 and 3 of the Results section.

Behavioral and fMRI data: Experimental paradigm

In this work, we perform a re-analysis of the NARPS dataset (Botvinik-Nezer et al., 2019; Botvinik-Nezer et al., 2020a), which is openly available on OpenNeuro (https://openneuro.org/; Poldrack et al., 2013). This dataset includes two studies, each of which comprises a group of 54 participants who made a series of 256 decisions about risky gambles. On each trial, a gamble was presented, entailing a 50/50 chance of gaining an amount G of money or losing an amount L. As in Tom et al., 2007, participants were asked to evaluate whether or not they would like to play each of the gambles presented to them (strongly accept, weakly accept, weakly reject, or strongly reject). They were told that, at the end of the experiment, four trials would be selected at random: for those trials in which they had accepted the corresponding gamble, the outcome would be decided with a coin toss, and for the other ones (if any) the gamble would not be played. In the first study (hereafter: the ‘narrow range’ group), participants decided on gambles made of gain and loss levels that were sampled from within the same range (G and L varied between $5 and $20). In the second study (hereafter: the ‘wide range’ group), gain levels were scaled to double the loss levels (L varied between $5 and $20, and G varied between $10 and $40). In both studies, all 16×16=256 possible combinations of gains and losses were presented across trials, which were separated by 7 s on average with some random jitter (min 6, max 10).

MRI scanning was performed on a 3T Siemens Prisma scanner. High-resolution T1-weighted structural images were acquired using a magnetization-prepared rapid gradient echo (MPRAGE) pulse sequence with the following parameters: TR=2530 ms, TE=2.99 ms, FA=7°, FOV=224 × 224 mm, resolution=1 × 1 × 1 mm. Whole-brain fMRI data were acquired using echo-planar imaging with a multi-band acceleration factor of 4 and parallel imaging factor (iPAT) of 2, TR=1000 ms, TE=30 ms, flip angle=68 degrees, and an in-plane resolution of 2 × 2 mm. Slices were tilted 30 degrees off the anterior commissure-posterior commissure line to reduce the frontal signal dropout, with a slice thickness of 2 mm and a gap of 0.4 mm between slices to cover the entire brain. See https://www.narps.info/analysis.html#protocol for more details.

Extraction of trial-by-trial BOLD responses within OFC subregions

In the Results section, we focus on five subregions of the OFC: namely, the lateral and medial parts of Brodmann area 11, Brodmann area 13, Brodmann area 14, and the subgenual part of Brodmann area 32. This parcellation is based on anatomical masks in standard MNI coordinates obtained from the BRAINNETOME atlas (https://atlas.brainnetome.org/, Fan et al., 2016). As can be seen in Figure 11 below, these areas tile the entire OFC, except its most rostro-lateral part (which is Brodmann area 12).

Figure 11. Anatomical masks of OFC subregions.

Pink: medial part of BA11, yellow: lateral part of BA11, green: BA13, blue: BA14, red: BA32. Dark and light colors correspond to left and right hemispheric analogous regions, respectively.

The standard MNI coordinates of each subregion barycenter are given in Table 1 below.

Table 1. Anatomical coordinates of OFC subregions’ barycenter.

| Anatomical region | Barycenter coordinates (left) | Barycenter coordinates (right) |
| --- | --- | --- |
| Medial part of BA11 | 68,166,125 | 114,164,126 |
| Lateral part of BA11 | 81,146,127 | 100,149,127 |
| BA13 | 81,146,127 | 100,149,127 |
| BA14 | 84,182,114 | 97,176,114 |
| Subgenual part of BA32 | 87,167,110 | 96,169,101 |

To balance the statistical power across OFC subregions, we then removed the voxels that fall outside a 200-voxel sphere centered on the barycenter of the masks. This procedure yielded spherical ROIs with similar sizes across all ROIs.

fMRI data were preprocessed using SPM (https://www.fil.ion.ucl.ac.uk/spm/), following standard realignment and movement correction guidelines. Note that we excluded five participants from the narrow range group because the misalignment between functional and anatomical scans could not be corrected. In each ROI, we regressed trial-by-trial activations with SPM through a GLM that included one stick regressor for each trial (at the time of the gamble presentation onset), which was convolved with the canonical HRF. To account for variations in hemodynamic delays, we added the basis function set induced by the HRF temporal derivative. To correct for movement artifacts, we also included the six head movement regressors and their squared values as covariates of no interest. We then extracted the 256 trial-wise regression coefficients in each voxel of each ROI. Next, we removed potential between-session confounding effects by projecting the ensuing trial series onto the null space of a categorical session-encoding design matrix. This effectively provided a BOLD trial series \(Y_{fMRI}\) that is deconvolved from the hemodynamic response function (Dale, 1999) and corrected for standard confounding effects. No spatial smoothing was applied to preserve information buried in spatial fMRI activity patterns. Finally, we concatenated the corrected multivariate fMRI activity patterns of left and right analogous ROIs, eventually yielding five OFC subregions.
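The projection onto the null space of a session-encoding design matrix can be sketched with a residual-forming matrix. The code below is a NumPy illustration rather than the SPM-based pipeline used here; the function name, the number of sessions, and the simulated data are assumptions made for the example.

```python
import numpy as np

def remove_session_effects(Y, session_ids):
    """Project trial series Y (n_trials, n_voxels) onto the null space of a
    categorical (one-hot) session design matrix."""
    sessions = np.unique(session_ids)
    X = (session_ids[:, None] == sessions[None, :]).astype(float)  # (n_trials, n_sessions)
    R = np.eye(len(session_ids)) - X @ np.linalg.pinv(X)           # residual-forming matrix
    return R @ Y

rng = np.random.default_rng(2)
session_ids = np.repeat(np.arange(4), 64)               # hypothetical: 4 sessions of 64 trials
Y = rng.normal(size=(256, 10)) + session_ids[:, None]   # session-specific offsets
Y_clean = remove_session_effects(Y, session_ids)
session_means = np.array([Y_clean[session_ids == s].mean(axis=0) for s in range(4)])
```

With a one-hot session design, this projection amounts to removing each session's mean from every voxel's trial series.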

Model-free analysis of behavioral data

First, we describe peoples’ behavior in terms of the probability of gambling given the gamble’s expected value EV=0.5*(G-L), where G and L are the gamble’s prospective gain and loss, respectively. For each participant, we binned trials according to deciles of EV, and measured the rate of gamble acceptance (Figure 2, upper-left panel).
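A minimal sketch of this binning step is given below, assuming that trials are assigned to bins delimited by the empirical deciles of EV. The function name is hypothetical and the simulated choices follow a deterministic toy rule (accept whenever the gain exceeds the loss), not participants' data.

```python
import numpy as np

def acceptance_by_ev_decile(gain, loss, accept):
    """Rate of gamble acceptance within each decile bin of EV = 0.5 * (G - L)."""
    ev = 0.5 * (gain - loss)
    edges = np.quantile(ev, np.linspace(0.0, 1.0, 11))   # 11 edges delimit 10 bins
    bins = np.clip(np.searchsorted(edges, ev, side="right") - 1, 0, 9)
    return np.array([accept[bins == b].mean() for b in range(10)])

# Toy design: all 16 x 16 combinations of gains and losses between 5 and 20.
G, L = np.meshgrid(np.arange(5.0, 21.0), np.arange(5.0, 21.0))
gain, loss = G.ravel(), L.ravel()
accept = (gain > loss).astype(float)   # toy rule: accept when EV is positive
rates = acceptance_by_ev_decile(gain, loss, accept)
```

Because EV takes discrete values in this design, bins are not exactly equally populated; this sketch does not attempt to break ties.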

Second, we regressed peoples’ decision to gamble onto gains and losses. Within each participant, we fit the following logistic regression model: \(p(g_i) = s(w_0 + w_G G_i - w_L L_i)\), where \(g_i\) is the binary gamble decision at trial \(i\), \(G_i\) and \(L_i\) are the prospective gain and loss of trial \(i\), \(w_0\) is the intercept or gambling bias, \(w_G\) and \(w_L\) are the sensitivity to gains and losses, respectively, and \(s(\cdot)\) is the standard sigmoid mapping. Note that logistic model parameter estimates can be recombined to measure peoples’ loss aversion (\(\log(w_L/w_G)\)). We then report within-subject parameter estimates at the group-level for random effect analyses (see Figure 3 of the Results section). The logistic model can also be used to perform counterfactual model simulations. For each subject, we use the corresponding fitted parameters to evaluate the trial-by-trial probability of gamble acceptance that would have been observed, had this subject/model been exposed to the sequence of prospective gains and losses that each subject of the other group was exposed to. It turns out that such out-of-sample predictions of peoples’ behaviour are (expectedly) inaccurate. More precisely, such logistic regression cannot predict the observed group-difference in peoples’ gambling rate (see Figure 4—figure supplement 1).
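The logistic model and the loss aversion index can be illustrated on simulated choices. The sketch below uses a plain Newton (iteratively reweighted least squares) fit written in NumPy, which is one standard way of obtaining maximum-likelihood estimates and is not necessarily the estimation scheme used here; the simulated agent and its parameter values are hypothetical.

```python
import numpy as np

def fit_logistic(gain, loss, accept, n_iter=25):
    """Maximum-likelihood fit of p(accept) = s(w0 + wG*G - wL*L) by Newton's method."""
    X = np.column_stack([np.ones(len(gain)), gain, -loss])
    w = np.zeros(3)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (accept - p)                       # gradient of the log-likelihood
        hess = X.T @ (X * (p * (1.0 - p))[:, None])     # negative Hessian
        w = w + np.linalg.solve(hess, grad)
    return w  # (w0, wG, wL)

# Simulated agent on a 'wide range' design (G: 10-40, L: 5-20), repeated 20 times.
rng = np.random.default_rng(3)
G, L = np.meshgrid(np.arange(10.0, 41.0, 2.0), np.arange(5.0, 21.0, 1.0))
gain, loss = np.tile(G.ravel(), 20), np.tile(L.ravel(), 20)
true_w0, true_wG, true_wL = 0.0, 0.2, 0.4               # hypothetical sensitivities
p_accept = 1.0 / (1.0 + np.exp(-(true_w0 + true_wG * gain - true_wL * loss)))
accept = (rng.random(gain.size) < p_accept).astype(float)

w0, wG, wL = fit_logistic(gain, loss, accept)
loss_aversion = np.log(wL / wG)                         # loss aversion index
```

Here the simulated agent weighs losses twice as much as gains, so the recovered index should lie close to log(2). The same sliding-window and counterfactual analyses described in this section would reuse such a fit on subsets of trials or on another group's gamble sequence.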

Third, we performed a sliding window analysis: decisions were first partitioned into chunks of 16 consecutive trials each, which were then regressed against corresponding gains and losses using the same logistic model as above. From this, we obtain a set of logistic parameter estimates (intercept and sensitivity to gains and losses) per temporal window, per subject. Temporal changes in the ensuing loss aversion index can thus be followed as time unfolds (Figure 2D).

RSA

Each ANN model of value synthesis makes specific trial-by-trial predictions of activity patterns within the integration layer that can be compared to multivariate fMRI signals in each OFC subregion. This enables us to evaluate the neurophysiological validity of candidate models. Here, we have chosen to rely on representational similarity analysis or RSA (Diedrichsen and Kriegeskorte, 2017; Friston et al., 2019; Kriegeskorte et al., 2008). In brief, RSA consists of evaluating the statistical resemblance between model-based and data-based representational dissimilarity matrices or RDMs, which we derive as follows. Let \(Y\) be the \(n_y \times n_t\) multivariate time series of (modeled or empirical) neural activity, where \(n_y\) and \(n_t\) are the number of units and trials, respectively. Note that, for model predictions, ‘units’ mean artificial elementary units in ANNs, whereas they mean voxels in a given ROI for fMRI data. First, we derive the \(n_t \times n_t\) raw RDM \(D_Y = [D_Y^{(t,t')}]\), where the matrix element \(D_Y^{(t,t')}\) measures the dissimilarity of neural patterns of activity between trial \(t\) and trial \(t'\): \(D_Y^{(t,t')} = 1 - \mathrm{corr}(Y_t, Y_{t'})\). By construction, these RDMs are invariant to affine transformations of activity patterns. In particular, this implies that the ensuing RSA is orthogonal to univariate analyses that rely on the mean activity within OFC subregions.
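The raw RDM and its invariance to affine transformations of activity patterns can be illustrated with a short NumPy sketch. The data below are random numbers standing in for unit-by-trial activity; the function name is hypothetical.

```python
import numpy as np

def rdm(Y):
    """Raw RDM from Y (n_units, n_trials): 1 - Pearson correlation between
    the activity patterns of each pair of trials."""
    return 1.0 - np.corrcoef(Y.T)   # rows of Y.T are trial-wise patterns

rng = np.random.default_rng(4)
Y = rng.normal(size=(30, 12))       # 30 hypothetical units (or voxels), 12 trials
D = rdm(Y)
D_affine = rdm(3.0 * Y + 5.0)       # rescaled and shifted activity patterns
```

Because the correlation is computed across units within each trial pair, adding a constant to (or rescaling) all units leaves the RDM unchanged, which is why the mean activity of a region does not drive this measure.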

Second, we correct the raw RDM for autocorrelation confounds. To do this, we remove the average neural dissimilarity for each possible delay between trial pairs from the raw RDM. Note that this correction does not confound the existing relationship between neural dissimilarity and prospective gains and losses or EV, because these are randomized across trials.
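A minimal sketch of this correction is given below, assuming a symmetric RDM whose rows and columns follow the temporal order of trials. The function name is hypothetical.

```python
def remove_lag_means(D):
    """Subtract, from each off-diagonal element, the mean dissimilarity of
    all trial pairs separated by the same delay |t - s|.

    Assumes D is symmetric and ordered by trial time.
    """
    nt = len(D)
    corrected = [row[:] for row in D]
    for lag in range(1, nt):
        pairs = [(t, t + lag) for t in range(nt - lag)]
        mean_lag = sum(D[t][s] for t, s in pairs) / len(pairs)
        for t, s in pairs:
            corrected[t][s] = D[t][s] - mean_lag
            corrected[s][t] = D[s][t] - mean_lag
    return corrected
```

After this step, the average corrected dissimilarity is zero for every delay, so that any remaining structure cannot be attributed to the temporal proximity of trials.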

When quantifying the neural encoding strength of prospective gains and losses, we simply regress the vectorized lower-left triangular part of the RDM against Euclidean distances in gains and in losses concurrently (having included a constant term). This measures the gradient of neural dissimilarity per unit of gains and losses. We quantify the neural encoding strength of EV similarly (using a distinct regression analysis, to prevent regressor collinearities).
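For the single-regressor case (EV), this regression reduces to a simple least-squares slope, which can be sketched as follows. The function names are hypothetical, and the concurrent regression on gains and losses would require a multiple regression that is not shown here.

```python
def lower_triangle(M):
    """Vectorize the strictly lower-left triangular part of a square matrix."""
    return [M[t][s] for t in range(len(M)) for s in range(t)]

def slope_with_intercept(x, y):
    """Ordinary least-squares slope of y on x, with a constant term."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def ev_encoding_strength(D, ev):
    """Gradient of neural dissimilarity per unit of absolute EV difference."""
    n = len(ev)
    dist = [[abs(ev[t] - ev[s]) for s in range(n)] for t in range(n)]
    return slope_with_intercept(lower_triangle(dist), lower_triangle(D))
```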

Finally, we measure the statistical similarity of $D_Y^{ANN}$ and $D_Y^{fMRI}$, where $D_Y^{ANN}$ is derived from activity patterns of the ANNs’ integration layer and $D_Y^{fMRI}$ is derived from HRF-deconvolved multi-voxel fMRI trial series in each ROI, in terms of the Pearson correlation coefficient $\rho$ between the vectorized lower-left triangular parts of both RDMs. We then assess the group-level statistical significance of RDMs' correlations using one-sample t-tests on the group mean of Fisher-transformed RDM correlation coefficients $\rho$. Note that ANN-RSA summary statistics (such as RDM correlation coefficients) do not favor more complex ANNs (i.e. ANNs with more parameters, such as plastic ANNs). This is because, once fitted to behavioural data, ANNs produce activity patterns that have no degree of freedom whatsoever when they enter RDM derivations. In particular, this means that static ANNs can a priori show a greater RDM correlation than plastic ANNs. In turn, this enables a simple yet unbiased statistical procedure for comparing candidate ANN models. Importantly, this procedure is immune to arbitrary modeling choices such as the total number of units in ANN models.
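The group-level test can be sketched as follows: each participant's RDM correlation is Fisher-transformed, and the group mean is tested against zero. The correlation values below are made up for illustration, and the sketch only returns the t-statistic (the p-value would be read from a Student distribution with n − 1 degrees of freedom).

```python
import math

def fisher_z(r):
    """Fisher transformation of a correlation coefficient."""
    return math.atanh(r)

def one_sample_t(values):
    """t-statistic of the sample mean against zero (n - 1 degrees of freedom)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)

# Made-up per-subject RDM correlations (illustration only).
rho = [0.05, 0.12, -0.02, 0.08, 0.10]
t_stat = one_sample_t([fisher_z(r) for r in rho])
```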

Appendix 1

Supplementary material

How efficient is efficient value synthesis

In what follows, we unpack the impact of efficient value synthesis using an exemplar numerical simulation of an ANN operating value synthesis (as described in Equations 3 to 6). For the sake of clarity, we will focus on a toy example made of 8 input units (4 for the gain sublayer and 4 for the loss sublayer) and 4 integration units. We assume that the network has been set to construct a readout value that is close to the objective gamble’s EV, as defined by decision theory. We thus initialize the ANN parameters (including connectivity weights) using a random perturbation of ideal population codes, and then train the ANN to output EV over a unitary range of prospective gains and losses.

As a reference point, we estimate the ensuing units’ receptive fields, in terms of the units’ response $z(k)$ output to any $(G,L)$ pair of admissible prospective gains and losses. Note that we can associate a value to each point in that space ($EV = \frac{1}{2}G - \frac{1}{2}L$). This enables us to characterize the relationship between EV and each integration unit in terms of their average response to the subset of pairs of prospective gain and loss that lie along iso-value lines. Note that this relationship is statistical in essence (as opposed to causal), since units respond to prospective gains and losses (as opposed to EV). Nevertheless, this relationship may be strong (in a statistical sense) because the network has been trained to signal value (which is constructed as a weighted sum of integration units’ responses). Figure 1—figure supplement 1 below summarizes this analysis.

One can see that each integration unit has a receptive field that spans a specific range of prospective gains and losses. In turn, it exhibits an idiosyncratic statistical relationship to EV. Note that these relationships are slightly ambiguous (variations of response outputs within EV bins). This is because units typically respond nonlinearly to prospective gains and losses. Nevertheless, despite those multiple nonlinearities, the ANN’s readout value profile is clearly linear in the prospective gains and losses that compose gambles. Finally, the ANN readout value’s sensitivities to prospective gains and losses match their theoretical values (namely: 0.5), which simply means that the network has been correctly trained.

We then simulate self-organized plasticity according to Equation 17 over two restricted domains of prospective gains and losses: (i) the ‘narrow’ domain is symmetrical and such that the spanned range of gains is the same as that of losses, and (ii) the ‘wide’ domain is asymmetrical and such that the spanned range of gains is twice that of losses. Finally, we quantify the efficiency of value synthesis, both before and after self-organized plasticity has reshaped the system’s connectivity, in terms of the expected log steepness of units’ activation functions (Equation 9) and in terms of the system’s resilience to neural noise. We measure the latter by reading out the noisy value response of the network to gain/loss pairs spanning the corresponding domain, having added Gaussian neural noise to integration units’ activity patterns with variances ranging from 1/256 to 2 (Equation 8), and quantifying the rate of ‘stable choices,’ i.e., gamble decisions that are identical to those taken without neural noise.
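The noise resilience measure can be sketched as follows. This is a minimal illustration rather than the simulation code used here: it assumes that a gamble is accepted when the readout value is positive, and the function names, the number of noise samples and the random seed are hypothetical.

```python
import random

def readout(z, w):
    """Readout value: weighted sum of integration units' responses."""
    return sum(wi * zi for wi, zi in zip(w, z))

def stable_choice_rate(responses, w, noise_var, n_rep=200, seed=0):
    """Fraction of noisy decisions identical to the noise-free decisions.

    `responses` holds one list of integration-unit responses per gamble.
    A gamble is accepted when its readout value is positive (assumption).
    """
    rng = random.Random(seed)
    sd = noise_var ** 0.5
    stable, total = 0, 0
    for z in responses:
        clean = readout(z, w) > 0
        for _ in range(n_rep):
            # Gaussian neural noise added to each integration unit's response.
            noisy_z = [zi + rng.gauss(0.0, sd) for zi in z]
            stable += (readout(noisy_z, w) > 0) == clean
            total += 1
    return stable / total
```

Evaluating this rate for noise variances between 1/256 and 2 yields a resilience profile that can be compared before and after plasticity.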

The results of this analysis are summarized in Figure 1—figure supplement 2 below.

One can see that self-organized plasticity has increased the average steepness of units’ activation functions. This is reassuring, since it was derived to operate a gradient ascent on this metric. More importantly, self-organized plasticity tends to increase the system’s resilience to neural noise on integration units.

What does self-organized plasticity do to the network?

To address this question, we first reproduce the analysis of Figure 1—figure supplement 1, this time after self-organized plasticity has modified the network connectivity, while being exposed to the ‘wide’ domain above (see Figure 1—figure supplement 3 below).

Comparing Figure 1—figure supplement 1 and Figure 1—figure supplement 3 shows that the receptive fields of integration units have been altered, even beyond the domain of prospective gains and losses that were spanned during efficient integration (except, maybe, for unit #3, which underwent very small changes in its connectivity to the feature layer of the network). In particular, units #1 and #2 are now mostly sensitive to losses, whereas unit #4 has become mostly sensitive to gains. In turn, this eventually distorted the readout value profile. Importantly, the readout value profile now shows, within the spanned domain of prospective gains and losses, distorted sensitivities to gains and losses. More precisely, the sensitivity to losses $\omega_L$ is now about twice that of gains $\omega_G$. Self-organized plasticity over this domain of prospective gains and losses would thus eventually yield loss aversion.

One can also see that self-organized plasticity has changed the statistical relationship between integration units’ responses and gamble EVs. In brief, all integration units now show stronger response variations across the spanned range of EVs, i.e., inputs to integration units now tend to span the non-saturating range of their activation function (this is what is measured in the left panel of Figure 1—figure supplement 2). In addition, this relationship also tends to become more ambiguous, i.e., there are stronger variations of response outputs within EV bins. This is due to the induced distortion of the units’ receptive fields. Nevertheless, self-organized plasticity results in an apparent phenomenon of (partial) adaptation to the range of EVs. Practically speaking, should one attempt to detect those integration units that show a significant relationship with EV, one would conclude that self-organized plasticity seems to have ‘recruited’ integration units (that would otherwise show no strong covariation with EV). But here again, this phenomenon is only apparent, because EV is not an input to the network.

Now, we ask whether and how self-organized plasticity changed the information content within the integration layer. To do this, we rely upon RSA, which can be reproduced in an empirical setting where neural populations are sampled over an arbitrary domain of prospective gains and losses. We first measure the neural dissimilarity $d(i,j,k,l)$ of each possible pairwise combination of gains and losses, in terms of the correlation (across integration units) between the corresponding response patterns, i.e.: $d(i,j,k,l) = 1 - \mathrm{corr}[z(G_i,L_j), z(G_k,L_l)]$. By construction, this measure of dissimilarity is bounded between 0 (when the response patterns are colinear) and 2 (when they are anti-correlated). We then construct the RDM for EV as follows. Recall that each pair of prospective gain and loss belongs to a given EV bin. The RDM element that corresponds to a pair $\{EV_m, EV_n\}$ of EV bins is estimated as the average dissimilarity $d(i,j,k,l)$ over all combinations for which $(G_i - L_j)/2 \in EV_m$ and $(G_k - L_l)/2 \in EV_n$. We also construct RDMs for prospective gains and losses similarly. If the network contains information about a given variable, then the corresponding RDM should show a diagonal pattern, such that the neural dissimilarity increases with the absolute difference between elements of the pair. Accordingly, we define the neural encoding strength of EV (resp., prospective gains or losses) in terms of the gradient of neural dissimilarity per unit of absolute difference of EV (resp., prospective gains or losses). This analysis can be reproduced after self-organized plasticity has modified the integration layer’s receptive fields, while exposing the ANN to the wide (asymmetrical) domain above. Figure 1—figure supplement 4 below summarizes the results of these analyses.

In brief, self-organized plasticity strongly strengthened the encoding of all variables, i.e., the ANN’s integration layer contains more information about EV and prospective gains/losses than before self-organized plasticity. Interestingly, although this network has been trained to perform value synthesis, the encoding of EVs after self-organized plasticity is relatively weaker than before (smaller variations of dissimilarity along directions orthogonal to iso-EV lines), at least when compared to prospective gains and losses. This is because changes in integration units’ receptive fields eventually reduced the redundancy between integration units, which now tend to decompose the array of feature units’ outputs into their independent sources of variations (here: prospective gains and losses). Nevertheless, as one can check in Figure 1—figure supplement 4, applying the readout weights on the integration units’ response still yields a reasonable readout value profile (that exhibits loss aversion).

Logistic regression of behavioral data: Postdiction and out-of-sample predictions

The logistic regression analysis of peoples’ sensitivity to gains and losses provides a trial-by-trial prediction of gamble acceptance, for each pair of prospective gain and loss. But are hard decisions (i.e. when EV is close to zero) as well explained as easy decisions (e.g., when there is a strong incentive to gamble)? To address this question, we binned trials according to EV deciles (see Methods), and measured the rate of the logistic model’s postdiction error (see Figure 4—figure supplement 1A).

Note: hereafter, we refer to ‘postdiction’ as model predictions on data that was used to fit the model’s parameters. In contrast, ‘out-of-sample predictions’ are model-proper predictions on yet unseen data.
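The binning of postdiction errors by EV deciles can be sketched as follows. This is a minimal illustration, assuming that a trial counts as an error when thresholding the fitted acceptance probability at 0.5 does not reproduce the observed decision, and that there are at least 10 trials. The function name is hypothetical.

```python
def decile_error_rates(ev, decisions, p_accept):
    """Postdiction error rate within each EV decile.

    ev[i]        : expected value of the gamble at trial i
    decisions[i] : observed decision (1 = accept, 0 = reject)
    p_accept[i]  : fitted probability of gamble acceptance
    """
    order = sorted(range(len(ev)), key=lambda i: ev[i])
    n = len(order)
    rates = []
    for d in range(10):
        idx = order[d * n // 10:(d + 1) * n // 10]
        errors = sum((p_accept[i] > 0.5) != bool(decisions[i]) for i in idx)
        rates.append(errors / len(idx))
    return rates
```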

One can see that the logistic regression model achieves relatively similar postdiction error profiles in both groups. In particular, easy decisions (either low or high EV) exhibit much lower postdiction errors than difficult decisions (EV around zero).

We also performed counterfactual model simulations: for each subject, we simulated the trial-by-trial gamble acceptances that would have been observed, under the logistic model, had this subject/model been exposed to the sequence of prospective gains and losses that each subject of the other group was exposed to (see Figure 4—figure supplement 1B). Those out-of-sample predictions are inaccurate: within the EV range that is common to both groups, they wrongly predict that the gambling rate should be higher in the wide-range group than in the narrow-range group. This is not surprising, because predictions of the logistic model are entirely determined by the pair of prospective gain and loss (and cannot exhibit contextual effects). In other words, for a given gamble (defined in terms of its constituent gain and loss), the logistic model makes an out-of-sample prediction that corresponds to its equivalent postdiction (or is an extrapolation of it, if performed outside the EV range it was originally trained with).

Finally, we asked whether inter-individual differences in gambling rate (within the common EV range) were better explained in terms of inter-individual differences in loss aversion (i.e. the ratio $w_L/w_G$, where $w_G$ and $w_L$ are estimates of behavioral sensitivity to gains and losses, respectively) or in terms of gambling bias ($w_0$). This is summarized in Figure 4—figure supplement 1C, D. In brief, loss aversion negatively correlates with gambling rate within the common EV range (wide gain range group: r=0.59, $p<10^{-3}$, narrow gain range group: r=0.26, p=0.067, both groups together: r=0.53, $p<10^{-3}$).

Representational dissimilarity matrices within OFC subregions

In the main text, we summarize the information content within OFC subregions in terms of the gradients of neural dissimilarity per unit of prospective gains, losses, or EV. For the sake of completeness, we report below the underlying representational dissimilarity matrices, where trials have been binned according to prospective gains (gain-RDMs, Figure 5—figure supplement 1), losses (loss-RDMs, Figure 6—figure supplement 1), or EV (EV-RDMs, Figure 7—figure supplement 1).

Note that a strong neural encoding of gains would correspond to strong variations in neural dissimilarity along directions orthogonal to iso-distance lines. Such RDMs would show a clear bandwise-diagonal structure, where neural dissimilarity would increase when moving away from the main diagonal. One can see that gain-RDMs within the lateral part of BA11 are the closest to this ideal situation.

Here again, loss-RDMs within the lateral part of BA11 show the strongest encoding of prospective losses. Interestingly, the loss-RDM of BA11 for the narrow gain range group seems to exhibit a block structure, such that trials seem to be partitioned into two classes according to whether they lie either below or above the median loss. This suggests that the encoding of losses is not, strictly speaking, linear. Nevertheless, this does not confound the analysis of encoding strength which we report in the main text.

Except in the lateral part of BA11, all EV-RDMs exhibit antidiagonal elements with weak neural dissimilarity. This means that multivariate fMRI patterns of trials that correspond to either very low or very high EV are similar to each other. This is the reason why the average neural dissimilarity tends to eventually decrease for extreme absolute EV distances (Figure 7 of the main text).

Univariate analysis of fMRI data

In the main text, we report the results of multivariate (RSA) analyses of fMRI data, in five distinct subregions of the orbitofrontal cortex. For completeness, we also performed univariate analyses, having summarized the activity within OFC subregions in terms of the average trial-by-trial BOLD response over voxels (see Methods section for parametric estimation of trial-by-trial BOLD responses), and corrected the ensuing univariate responses for between-session confounding effects.

Here, we rely on two distinct general linear models or GLMs (Friston et al., 1995). All models incorporated only one event per trial, which was the onset of the gamble presentation, convolved with a canonical hemodynamic response function, as well as with its temporal derivative (Hopfinger et al., 2000). In the first GLM, we used two parametric modulations of the trial epoch regressor: namely prospective gains (G) and losses (L). We regressed the average trial-by-trial BOLD response against prospective gains and losses concurrently, using a GLM including an offset (constant term). The ensuing GLM parameter estimates measure the within-subjects gradient of BOLD responses per unit of prospective gains and losses. In the second GLM, we used only one parametric modulation, i.e. the gamble’s expected value EV=0.5*(G-L). Here, the ensuing GLM parameter estimates measure the within-subject’s gradient of BOLD response per unit of EV. Note that gain, loss and EV regressors were mean-centered but not rescaled to allow for a proper between-group comparison of neural sensitivity to these factors. For both regression analyses, we reported the ensuing parameter estimates at the group level and performed within-group and between-group statistical significance tests using standard one-sample and two-sample F-tests, respectively. The results of these analyses are summarized in Figure 7—figure supplement 2 below.

One can see that EV is significantly encoded in both groups of subjects in subregions BA14 and BA32. At the very least, the latter result survives the correction for multiple comparisons across OFC subregions. This reproduces established univariate results regarding the encoding of value in the fMRI literature. However, although some regions also show a significant encoding of prospective losses, nowhere in the OFC is there a significant encoding of both gains and losses, in both groups of subjects. This is strikingly different from the multivariate fMRI analyses results summarized in Figures 4–6 of the main text, which clearly exhibit stronger statistical power. We note that, by construction, these two analyses are orthogonal to each other. This is because we chose to measure neural dissimilarity in terms of Pearson correlations across voxels, which are invariant under isotropic (within OFC subregions) affine transformations of voxel-wise trial series.

Bayesian priors on ANNs’ parameters

We fit each candidate ANN to observed trial-by-trial gamble decision sequences using a dedicated Bayesian approach, which requires setting specific prior distributions on model parameters. These prior distributions are summarized in Appendix 1—table 1 below.

Appendix 1—table 1. Parameters' priors for biologically constrained artificial neural networks (ANNs).

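| Parameter | Distribution | Rationale |
|---|---|---|
| Firing rate threshold | $\mu \sim N\left(\frac{j}{n+1}, 1\right)$ | Regular tiling of inputs |
| Activation function slope | $\sigma = e^{\theta}$ with $\theta \sim N(0,1)$ | Partially overlapping tiling of inputs |
| Initial connectivity | $c(i,j,k) \sim N\left(\mathrm{sign}(i - 1/2), 1\right)$ | EV population code |
| Value readout weights | $w(k) \sim N(1,1)$ | EV population code |
| Plasticity magnitude | $\alpha = e^{\theta}$ with $\theta \sim N(0,1)$ | Non-informative prior |
| Plasticity rate | $\beta = \frac{1}{1+e^{-\theta}}$ with $\theta \sim N(0,1)$ | Non-informative prior |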

All parameter notations are defined in the Methods section of the main text. Note that we performed all the analyses using the VBA academic freeware (Daunizeau et al., 2014). Although this toolbox only handles Gaussian prior distributions, native (Gaussian) VBA parameters can be passed through arbitrary mappings prior to entering model computations. This enables VBA to enforce any required constraint (see e.g. Daunizeau, 2017). This is the case here for the slopes of activation functions, as well as for the self-organized plasticity magnitude and rate parameters. In Appendix 1—table 1, θ denotes VBA native parameters: they are given Gaussian prior distributions, and then passed through the appropriate nonlinear mapping to enforce positivity or bounding constraints.
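The principle of passing unconstrained (Gaussian) parameters through a mapping to enforce constraints can be illustrated with the following generic Python sketch. It does not use or reproduce the VBA toolbox interface; the function names and the example value are hypothetical.

```python
import math

def positive(theta):
    """Exponential mapping: enforces a positivity constraint."""
    return math.exp(theta)

def unit_interval(theta):
    """Sigmoid mapping: enforces a bounding constraint between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-theta))

theta = -1.3                  # made-up native (Gaussian) parameter value
sigma = positive(theta)       # e.g. a slope or a plasticity magnitude
beta = unit_interval(theta)   # e.g. a plasticity rate
```

Whatever the value of the native parameter, the mapped quantities remain within their admissible domain, so that a Gaussian prior on the native parameter induces a valid prior on the constrained parameter.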

In addition to the prior distributions for ANN parameters given in Appendix 1—table 1 above, we set the number of attribute-specific and attribute-integration units to nx=4 (per sublayer) and nz=8, respectively. We also rescaled the ANN’s inputs (i.e. prospective gains and losses) such that they lie within the unit interval. To ensure an unbiased comparison between decision attributes and/or between groups of participants, this rescaling is invariant across decision attributes and groups, and such that the upper bound of the unit interval corresponds to 150% of the maximum prospective gain or loss (here: 40$). This also prevents artifactual ceiling effects in the population code of the attribute layer (whose units with the highest firing rate thresholds always remain weakly activated).
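The input rescaling can be sketched as follows, assuming that a null amount maps onto the lower bound of the unit interval. The function and argument names are hypothetical.

```python
def rescale(amount, max_amount=40.0, headroom=1.5):
    """Map a monetary amount onto the unit interval.

    The upper bound of the unit interval corresponds to `headroom` times
    the maximum prospective gain or loss; the same mapping is used for
    gains and losses, and for both groups.
    """
    return amount / (headroom * max_amount)
```

Under this mapping, the maximum prospective gain or loss falls at two thirds of the unit interval, which leaves headroom for the attribute units with the highest firing rate thresholds.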

ANNs’ behavioral postdiction and prediction accuracy

We fit the static and plastic ANN models to each subject’s sequence of gamble decisions. In what follows, we provide summary statistics of our ANN-based behavioral data analyses.

In addition to the balanced accuracy scores given in the main text, Appendix 1—table 2 below gives the group average percentage of explained behavioral variance (R2) and its standard deviation (across participants) for each model (including the logistic regression model, for comparison purposes), and each group.

Appendix 1—table 2. Mean R2 and its standard deviation for each model, for both groups.

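| Model | Narrow range group (mean) | Narrow range group (std) | Wide range group (mean) | Wide range group (std) |
|---|---|---|---|---|
| Logistic | 0.74 | 0.02 | 0.66 | 0.02 |
| Static ANN | 0.73 | 0.02 | 0.62 | 0.03 |
| Plastic ANN | 0.75 | 0.02 | 0.66 | 0.02 |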

In brief, all models achieve similar fit accuracies, in both groups, and no pairwise comparison between models reaches statistical significance.

But do the models yield similar types of postdiction errors? To address this question, we bin trials according to EV deciles, prior to averaging the rate of postdiction error across subjects. Figure 8—figure supplement 1 below summarizes the accuracy of behavioral postdiction as a function of gambles’ expected value, for all ANNs.

As was the case for the logistic regression model, ANNs exhibit high fit accuracy for easy decisions (extreme EVs) and lower fit accuracy for hard decisions (EV around 0).

Parameter estimates of plastic ANNs

Of particular interest in our ANN-based analysis of behavioral data are the parameters that control the self-organized plasticity rule of efficient value synthesis (plasticity magnitude α and rate β parameters). Figure 8—figure supplement 2 below shows the empirical histograms of parameter estimates, for both groups of participants.

Overall, empirical distributions of parameter estimates are qualitatively similar in both groups of participants. When comparing groups of participants with respect to either α or β parameter estimates, nothing reaches statistical significance.

We also asked whether fitted plasticity parameters explained inter-individual differences in observed loss aversion, as measured using the logistic model (see Figure 8—figure supplement 2C, D). Nothing really stands out for plasticity rates, but the situation is quite different for plasticity magnitudes. Within the wide gain range group, there is a significant correlation between plasticity magnitudes and loss aversion across subjects (r=0.47, $p<10^{-3}$), while this correlation is negative within the narrow gain range group (r=−0.32, p=0.024). That the sign of this correlation is different in both groups makes sense, given that the effect of self-organized plasticity is to decrease loss aversion within the narrow gain range group, whereas, if anything, it tends to increase it within the wide range group (see Figure 4D in the main text).

On the diversity of response profiles in ANNs’ integration units

OFC neurons are notoriously diverse in their response profile, but a consistent finding is that, in the context of value-based decision-making, they can be classified in terms of so-called ‘choice cells,’ ‘chosen value cells’, and ‘offer value cells’ (Padoa-Schioppa and Assad, 2006; Padoa-Schioppa and Assad, 2008). Given that this can be considered a pre-requisite for any computational model of value integration in the OFC, we asked whether ANNs reproduce this known property of OFC neurons.

For each subject, we thus tested whether the response of integration units correlates (across trials) with choice, chosen value, and/or gamble value, where value is defined as the weighted sum of gains and losses (according to the static logistic model parameter estimates). Each integration unit is then classified according to the variable it correlates most with (but none of these labels is assigned if the best correlation is not significant). In brief, ‘choice units’ show response outputs that vary in a quasi-categorical manner with value (i.e. that discriminate trials in which the value of gambling is either positive or negative), ‘chosen value units’ exhibit a ReLU-like relationship with value (null when the value of gambling is negative, and increasing with value when it is positive), and ‘offer value units’ show a quasi-linear relationship with value. The results of this analysis are summarized in Figure 9—figure supplement 1 below.
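This classification can be sketched as follows. Two simplifications are assumed here and are not part of the analysis described above: the absolute correlation is used to rank candidate variables, and a fixed threshold on the correlation (the hypothetical `r_min` argument) stands in for a proper significance test.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0 when either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)

def classify_unit(response, value, accepted, r_min=0.5):
    """Label a unit by the variable its trial-by-trial response correlates most with.

    value[i]    : subjective value of gambling at trial i
    accepted[i] : True if the gamble was accepted at trial i
    Returns None if the best absolute correlation is below r_min.
    """
    templates = {
        "choice": [1.0 if a else 0.0 for a in accepted],
        "chosen value": [v if a else 0.0 for v, a in zip(value, accepted)],
        "offer value": list(value),
    }
    scores = {label: abs(pearson(response, t)) for label, t in templates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= r_min else None
```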

In previous electrophysiological experiments, the detection rate of ‘offer value,’ ‘chosen value,’ and ‘choice’ cells within OFC neurons depends upon the delay to stimulus onset. This is because the detection analysis is typically performed at each time point within a given peristimulus time window. We thus extracted the frequency of ‘offer value,’ ‘chosen value,’ and ‘choice’ cells detected at OFC neurons’ response peak, i.e., about 300 msec after stimulus onset (see Figure 4 in Padoa-Schioppa and Assad, 2006). For comparison purposes, we then normalized these estimated frequencies to remove non-responsive OFC neurons (see horizontal grey bars in Figure 9—figure supplement 1). We find that, although HP-ANN’s integration units were not at all designed to encode these quantities, they eventually reproduce the known response variability observed in OFC neurons (although few units are eventually classified as ‘choice cells’ in the narrow gain range group). This means that our ANN models reproduce the known diversity of response profiles within OFC neurons. Importantly, the response profiles are almost identical for both groups. In particular, we find that about 34% of integration units are classified as ‘chosen value’ cells. This is interesting, because the computational role of these units is in fact exactly the same as that of units that are classified as ‘offer value’ cells: together, they form a population code for the subjective value of gambling. In other words, this classification is not directly relevant for guessing the underlying computational role of integration units.

Value synthesis under efficient coding of gains and losses

In the main text, we assumed that neural noise would be acting on the output of integration units. But it may also be acting on its inputs, or equivalently on the outputs of the attribute units:

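$$\begin{cases} \upsilon_t(k) = \sum_{i=1}^{n_u} \sum_{j=1}^{n_x} C(i,j,k)\, \tilde{x}_t(i,j) \\ \tilde{x}_t(i,j) = x_t(i,j)\left(u_t(i)\right) + \eta_t(i,j) \end{cases} \quad \text{(A1)}$$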

where $\eta_t(i,j)$ is some (uncontrollable) neural noise that competes with the ‘utile’ component $x_t(i,j)\left(u_t(i)\right)$ of the responses of attribute units.

Similarly to Equation 9, this also induces an information loss IL:

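$$IL = -MI(\tilde{x}, x) \xrightarrow[\eta \to 0]{} K - H[u] - \sum_{i}\sum_{j} E\left[\ln\left|\frac{\partial f_x(i,j)}{\partial u(i)}\right|\right] \quad \text{(A2)}$$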

In our framework, we do not consider how brain systems upstream of the attribute layers extract gain and loss information from visual stimuli and project it onto attribute layers. Rather, we assume that prospective gains and losses are encoded into population codes within attribute layers. Nevertheless, we can still model efficient coding at the level of attribute units. More precisely, efficient coding of gains and losses can then be achieved by modifying the response properties of attribute units to decrease the information loss IL in Equation A2, i.e.:

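$$\Delta\theta(i,j) = -\alpha \frac{\partial IL}{\partial \theta(i,j)} = \alpha \frac{\partial}{\partial \theta(i,j)} E\left[\ln\left|\frac{\partial f_x(i,j)}{\partial u(i)}\right|\right] \quad \text{(A3)}$$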

where θ(i,j) is the 2 × 1 vector of location and scale parameters of attribute units’ activation functions.

This is but a proxy for the impact of self-organized plasticity upstream of attribute layers. Note that there is no nonlocal term in Equation A3, because the entropy term $H[u]$ is beyond the network’s control.

Let $\Delta\theta_t(i,j)$ be the change of location and scale parameters at trial or time $t$. Similarly to efficient value synthesis, an online implementation of Equation A3 is operated as follows:

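$$\Delta\theta_t(i,j) \approx \alpha(1-\beta)\,\Delta\theta_{t-1}(i,j) + \alpha\beta \frac{\partial}{\partial \theta(i,j)} \ln\left|\frac{\partial f_x(i,j)}{\partial u_t(i)}\right| \quad \text{(A4)}$$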

where $\beta$ (note: $0<\beta<1$) controls the exponential decay of past samples’ weights in the moving average operator (Equations 13-14 in the main text).
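The online update of Equation A4 can be illustrated numerically as follows. This is a sketch under stated assumptions and not the implementation used here: the attribute unit is assumed to be a sigmoid of the form x = s((u − μ)/σ), the gradient of the log steepness is obtained by finite differences rather than analytically, parameters are assumed to be updated as θ_t = θ_{t−1} + Δθ_t, and the positivity of σ is not enforced. All function names and default values are hypothetical.

```python
import math

def log_steepness(u, mu, sigma):
    """ln|dx/du| for an assumed sigmoid unit x = s((u - mu) / sigma)."""
    x = 1.0 / (1.0 + math.exp(-(u - mu) / sigma))
    return math.log(x * (1.0 - x) / sigma)

def gradient(u, mu, sigma, eps=1e-6):
    """Central finite-difference gradient of log_steepness w.r.t. (mu, sigma)."""
    g_mu = (log_steepness(u, mu + eps, sigma)
            - log_steepness(u, mu - eps, sigma)) / (2 * eps)
    g_sigma = (log_steepness(u, mu, sigma + eps)
               - log_steepness(u, mu, sigma - eps)) / (2 * eps)
    return g_mu, g_sigma

def online_step(u, mu, sigma, d_mu, d_sigma, alpha=0.05, beta=0.5):
    """One update: delta_t = alpha*(1-beta)*delta_{t-1} + alpha*beta*grad_t."""
    g_mu, g_sigma = gradient(u, mu, sigma)
    d_mu = alpha * (1.0 - beta) * d_mu + alpha * beta * g_mu
    d_sigma = alpha * (1.0 - beta) * d_sigma + alpha * beta * g_sigma
    return mu + d_mu, sigma + d_sigma, d_mu, d_sigma

# Made-up input and initial parameters (illustration only).
mu1, sigma1, d_mu1, d_sigma1 = online_step(0.8, 0.2, 0.5, 0.0, 0.0)
```

In this toy setting, a single step moves the unit's threshold toward the current input, which increases the local steepness of its activation function.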

If the ANN is equipped with sigmoid activation functions, then simple analytical derivations show that Equation A4 reduces to the following update rule for location and scale parameters of attribute units:

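$$\begin{cases} \Delta\mu_{x,t}(i,j) = \alpha(1-\beta)\,\Delta\mu_{x,t-1}(i,j) - \frac{\alpha\beta}{\sigma_{x,t}(i,j)}\left(1 - 2x_t(i,j)\right) \\ \Delta\sigma_{x,t}(i,j) = \alpha(1-\beta)\,\Delta\sigma_{x,t-1}(i,j) - \frac{\alpha\beta}{2\sigma_{x,t}(i,j)^2}\left(1 - 2x_t(i,j)\right)\left(u_t(i) - \mu_{x,t}(i,j)\right) \end{cases} \quad \text{(A5)}$$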

Equation A5 is the equivalent of Equation 2 of the main text, i.e., it describes how attribute units within the network should modify their response properties to operate efficient coding of attribute inputs. In what follows, we simply refer to Equation A5 as efficient coding of attributes. Note that Equation A5 would also modify the receptive fields of integration units within the ANN. In principle, it could thus yield effects that are qualitatively similar to those of efficient value integration.

But what are the neural and behavioral impacts of this efficient coding mechanism? To address this question, we reproduced the same analyses as for efficient value synthesis.

First, we asked whether Equation A5 would produce apparent value range adaptation (Figure 2 in the main text) in integration units. We randomized the trained connections of ANNs that operate value synthesis prior to exposing them to four different series of 256 decision trials made of prospective gains and losses with a predefined range. As before, we considered two ranges (either narrow or wide) for both prospective gains and losses, and exposed the ANNs to each of the 2 × 2 range combinations. We then averaged the activity of integration units, after having binned trials into EV deciles. We repeated this procedure 1000 times, and Figure 10 in the main text summarizes the results of this analysis. One can see that the lower and upper limits of units’ mean responses are tied to the bounds of the spanned EV range, as is the case for static value synthesis (see Figure 2C and D in the main text). In turn, the slope of the relationship between EV and integration units’ mean responses does not seem to vary strongly with the range of spanned EVs. In other words, the efficient coding mechanism in Equation A5 does not induce apparent value range adaptation in the univariate response of integration units.

Second, we asked whether and how the shape of the spanned gain/loss domain modifies the neural and behavioral sensitivities to prospective gains and losses. We thus performed the same series of Monte-Carlo simulations as for efficient value synthesis. We simulated the response of the ANN, modified according to Equation A5 to operate efficient coding of attributes, to a series of 256 gambles, after having randomized the trained connectivity within the network. We systematically varied the spanned range of prospective gains and losses, and measured the behavioral sensitivities to gains and losses (in terms of the gradient of the readout value per unit of gain or loss) and the neural sensitivity to EV (in terms of the gradient of the integration layer’s neural dissimilarity per unit of EV). We repeated this procedure 1000 times, and Figure 10—figure supplement 1 below summarizes the results of this analysis.

In brief, the behavioral impact of efficient coding is qualitatively similar to that of efficient value synthesis. More precisely, the behavioral sensitivity to gains (resp., losses) decreases as the spanned range of gains (resp., losses) increases (Figure 10—figure supplement 1D, E). Note that cross-attribute spillover effects are also present here, and to a greater extent than under efficient value synthesis. In addition, efficient coding of attributes induces a similar effect on behavioral loss aversion, which follows the ratio of the spanned range of gains relative to the spanned range of losses (Figure 10—figure supplement 1F). This implies that behavioral observations alone would not disambiguate efficient value synthesis (as described in Equation 2 in the main text) from efficient coding of attributes (as described in Equation A5).

Also, the neural encoding strength of EV in the integration layer is similarly impacted by the shape of the spanned gain/loss domain. That is, the neural encoding strength of EV decreases when the spanned range of either gains or losses increases. Thus, this effect does not disambiguate efficient value synthesis from efficient coding of attributes. However, the neural sensitivities to gains and losses do not react to the shape of the spanned gain/loss domain as they do under the efficient value synthesis scenario (see Figure 10C, D in the main text). The difference here is twofold. First, the cross-attribute spillover effect dominates, i.e., the most salient effect is that the neural sensitivity to gains (resp., losses) increases when the spanned range of losses (resp., gains) increases. Second, if anything, the within-attribute effect seems to be opposite to that of efficient value synthesis. That is, the neural sensitivity to gains (resp., losses) increases when the spanned range of gains (resp., losses) increases. At the very least, this holds for small to intermediate ranges of gains and losses; the effect reverses for extreme ranges of gains and losses (most likely because of artefactual ceiling effects). This is interesting because it is clearly at odds with the neural predictions of the efficient value synthesis scenario. Importantly, this directly contradicts the fMRI data sampled in the lateral part of Brodmann area 11 (see Figure 5 in the main text).

We then reproduced the same model-based analyses as with the efficient value synthesis scenario.

First, we fitted the ANN model equipped with efficient coding of attributes to each participant’s trial-by-trial gamble decisions, and extracted counterfactual out-of-sample behavioral predictions by exposing the fitted ANNs to the gamble series of the other group. We also performed the sliding window analysis to investigate the temporal dynamics of loss aversion, as captured by the scenario of efficient coding of attributes. The results of these analyses are summarized in Figure 10—figure supplement 2 below.

In brief, behavioral postdictions under the scenario of efficient coding of attributes are as accurate as under the efficient value synthesis scenario. In particular, the postdicted dynamics of loss aversion exhibit the same qualitative properties (compare with Figure 7C). However, out-of-sample behavioral predictions are not as convincing: although the model does predict a behavioral change, predicted gamble rates are still higher for the wide gain range group (mean gambling rate=0.58±0.02) than for the narrow gain range group (mean gambling rate=0.48±0.02). This observation is partially confirmed when comparing the absolute out-of-sample prediction error of both plastic ANN models. Specifically, we found that out-of-sample behavioral predictions under efficient coding of attributes were significantly less accurate than under efficient value synthesis for the wide range group (p=0.006, F=12.8), but not for the narrow range group (p=0.31, F=1.15).

Second, we reproduced the same RSA analyses as before, i.e., we evaluated the similarity between the ANN model and fMRI data sampled in the same five OFC subregions. The results of these analyses are summarized in Figure 10—figure supplement 3 below.
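For readers unfamiliar with RSA, the generic logic of comparing a model RDM with a data RDM can be sketched as follows. This is a simplified illustration and not the pipeline used in this study: the choice of Euclidean distance, the Pearson correlation between upper triangles, the function names, and the toy patterns are all assumptions made for the example.

```python
import math

def rdm(patterns):
    """Representational dissimilarity matrix: pairwise Euclidean distances
    between condition-wise activity patterns (one list of numbers per condition)."""
    n = len(patterns)
    return [[math.dist(patterns[i], patterns[j]) for j in range(n)] for i in range(n)]

def upper_triangle(m):
    """Off-diagonal upper-triangular entries of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm_similarity(patterns_a, patterns_b):
    """Correlation between the dissimilarity structures of two pattern sets."""
    return pearson(upper_triangle(rdm(patterns_a)), upper_triangle(rdm(patterns_b)))

# Toy example: "model" patterns and a rescaled copy standing in for "data".
model_patterns = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
data_patterns = [[2.0 * a for a in p] for p in model_patterns]
similarity = rdm_similarity(model_patterns, data_patterns)
```

Because the toy data are a rescaled copy of the model patterns, their dissimilarity structures are proportional and the similarity is close to 1. With real data, the model patterns would come from the ANN's integration layer and the data patterns from multivoxel fMRI responses in a given OFC subregion.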

No OFC subregion reaches statistical significance in both groups, for all types of RDMs. Importantly, this also holds for the lateral part of Brodmann area 11 (only 5 out of 8 tests are significant). We note that, irrespective of the type of RDM considered, nowhere in the OFC is the comparison between both plastic models statistically significant.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Jean Daunizeau, Email: jean.daunizeau@gmail.com.

Thorsten Kahnt, National Institute on Drug Abuse Intramural Research Program, United States.

Michael J Frank, Brown University, United States.

Funding Information

This paper was supported by the following grant:

  • Agence Nationale de la Recherche ANR-20-CE37-0006 to Jean Daunizeau.

Additional information

Competing interests

No competing interests declared.

Author contributions

Resources, Data curation, Software, Formal analysis, Visualization, Methodology, Writing – original draft.

Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Methodology, Writing – original draft, Project administration.

Additional files

MDAR checklist

Data availability

All data analysed during this study are openly available from the https://openneuro.org/ website (https://doi.org/10.18112/openneuro.ds001734.v1.0.5). All the modelling and analysis code is available as part of the academic freeware VBA (https://github.com/MBB-team/VBA-toolbox/, Rigoux et al., 2023), which is under a GNU open-source license.

The following previously published dataset was used:

Botvinik-Nezer R, Iwanir R, Poldrack RA, Schonberg T. 2020. NARPS. OpenNeuro.

References

  1. Abraham WC. How long will long-term potentiation last? Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2003;358:735–744. doi: 10.1098/rstb.2002.1222. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Bermudez MA, Schultz W. Responses of amygdala neurons to positive reward-predicting stimuli depend on background reward (contingency) rather than stimulus-reward pairing (contiguity). Journal of Neurophysiology. 2010;103:1158–1170. doi: 10.1152/jn.00933.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Borst JP, Taatgen NA, van Rijn H. Using a symbolic process model as input for model-based fMRI analysis: locating the neural correlates of problem state replacements. NeuroImage. 2011;58:137–147. doi: 10.1016/j.neuroimage.2011.05.084. [DOI] [PubMed] [Google Scholar]
  4. Botvinik-Nezer R, Iwanir R, Holzmeister F, Huber J, Johannesson M, Kirchler M, Dreber A, Camerer CF, Poldrack RA, Schonberg T. fMRI data of mixed gambles from the Neuroimaging Analysis Replication and Prediction Study. Scientific Data. 2019;6:106. doi: 10.1038/s41597-019-0113-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Botvinik-Nezer R, Holzmeister F, Camerer CF, Dreber A, Huber J, Johannesson M, Kirchler M, Iwanir R, Mumford JA, Adcock RA, Avesani P, Baczkowski BM, Bajracharya A, Bakst L, Ball S, Barilari M, Bault N, Beaton D, Beitner J, Benoit RG, Berkers R, Bhanji JP, Biswal BB, Bobadilla-Suarez S, Bortolini T, Bottenhorn KL, Bowring A, Braem S, Brooks HR, Brudner EG, Calderon CB, Camilleri JA, Castrellon JJ, Cecchetti L, Cieslik EC, Cole ZJ, Collignon O, Cox RW, Cunningham WA, Czoschke S, Dadi K, Davis CP, Luca AD, Delgado MR, Demetriou L, Dennison JB, Di X, Dickie EW, Dobryakova E, Donnat CL, Dukart J, Duncan NW, Durnez J, Eed A, Eickhoff SB, Erhart A, Fontanesi L, Fricke GM, Fu S, Galván A, Gau R, Genon S, Glatard T, Glerean E, Goeman JJ, Golowin SAE, González-García C, Gorgolewski KJ, Grady CL, Green MA, Guassi Moreira JF, Guest O, Hakimi S, Hamilton JP, Hancock R, Handjaras G, Harry BB, Hawco C, Herholz P, Herman G, Heunis S, Hoffstaedter F, Hogeveen J, Holmes S, Hu CP, Huettel SA, Hughes ME, Iacovella V, Iordan AD, Isager PM, Isik AI, Jahn A, Johnson MR, Johnstone T, Joseph MJE, Juliano AC, Kable JW, Kassinopoulos M, Koba C, Kong XZ, Koscik TR, Kucukboyaci NE, Kuhl BA, Kupek S, Laird AR, Lamm C, Langner R, Lauharatanahirun N, Lee H, Lee S, Leemans A, Leo A, Lesage E, Li F, Li MYC, Lim PC, Lintz EN, Liphardt SW, Losecaat Vermeer AB, Love BC, Mack ML, Malpica N, Marins T, Maumet C, McDonald K, McGuire JT, Melero H, Méndez Leal AS, Meyer B, Meyer KN, Mihai G, Mitsis GD, Moll J, Nielson DM, Nilsonne G, Notter MP, Olivetti E, Onicas AI, Papale P, Patil KR, Peelle JE, Pérez A, Pischedda D, Poline JB, Prystauka Y, Ray S, Reuter-Lorenz PA, Reynolds RC, Ricciardi E, Rieck JR, Rodriguez-Thompson AM, Romyn A, Salo T, Samanez-Larkin GR, Sanz-Morales E, Schlichting ML, Schultz DH, Shen Q, Sheridan MA, Silvers JA, Skagerlund K, Smith A, Smith DV, Sokol-Hessner P, Steinkamp SR, Tashjian SM, Thirion B, Thorp JN, Tinghög G, Tisdall L, Tompson SH, Toro-Serey C, Torre Tresols JJ, 
Tozzi L, Truong V, Turella L, van ’t Veer AE, Verguts T, Vettel JM, Vijayarajah S, Vo K, Wall MB, Weeda WD, Weis S, White DJ, Wisniewski D, Xifra-Porxas A, Yearling EA, Yoon S, Yuan R, Yuen KSL, Zhang L, Zhang X, Zosky JE, Nichols TE, Poldrack RA, Schonberg T. Variability in the analysis of a single neuroimaging dataset by many teams. Nature. 2020a;582:84–88. doi: 10.1038/s41586-020-2314-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Botvinik-Nezer R, Holzmeister F, Camerer CF, Dreber A, Huber J, Johannesson M, Kirchler M, Iwanir R, Mumford JA, Adcock RA, Avesani P, Baczkowski BM, Bajracharya A, Bakst L, Ball S, Barilari M, Bault N, Beaton D, Beitner J, Benoit RG, Berkers R, Bhanji JP, Biswal BB, Bobadilla-Suarez S, Bortolini T, Bottenhorn KL, Bowring A, Braem S, Brooks HR, Brudner EG, Calderon CB, Camilleri JA, Castrellon JJ, Cecchetti L, Cieslik EC, Cole ZJ, Collignon O, Cox RW, Cunningham WA, Czoschke S, Dadi K, Davis CP, Luca AD, Delgado MR, Demetriou L, Dennison JB, Di X, Dickie EW, Dobryakova E, Donnat CL, Dukart J, Duncan NW, Durnez J, Eed A, Eickhoff SB, Erhart A, Fontanesi L, Fricke GM, Fu S, Galván A, Gau R, Genon S, Glatard T, Glerean E, Goeman JJ, Golowin SAE, González-García C, Gorgolewski KJ, Grady CL, Green MA, Guassi Moreira JF, Guest O, Hakimi S, Hamilton JP, Hancock R, Handjaras G, Harry BB, Hawco C, Herholz P, Herman G, Heunis S, Hoffstaedter F, Hogeveen J, Holmes S, Hu CP, Huettel SA, Hughes ME, Iacovella V, Iordan AD, Isager PM, Isik AI, Jahn A, Johnson MR, Johnstone T, Joseph MJE, Juliano AC, Kable JW, Kassinopoulos M, Koba C, Kong XZ, Koscik TR, Kucukboyaci NE, Kuhl BA, Kupek S, Laird AR, Lamm C, Langner R, Lauharatanahirun N, Lee H, Lee S, Leemans A, Leo A, Lesage E, Li F, Li MYC, Lim PC, Lintz EN, Liphardt SW, Losecaat Vermeer AB, Love BC, Mack ML, Malpica N, Marins T, Maumet C, McDonald K, McGuire JT, Melero H, Méndez Leal AS, Meyer B, Meyer KN, Mihai G, Mitsis GD, Moll J, Nielson DM, Nilsonne G, Notter MP, Olivetti E, Onicas AI, Papale P, Patil KR, Peelle JE, Pérez A, Pischedda D, Poline JB, Prystauka Y, Ray S, Reuter-Lorenz PA, Reynolds RC, Ricciardi E, Rieck JR, Rodriguez-Thompson AM, Romyn A, Salo T, Samanez-Larkin GR, Sanz-Morales E, Schlichting ML, Schultz DH, Shen Q, Sheridan MA, Silvers JA, Skagerlund K, Smith A, Smith DV, Sokol-Hessner P, Steinkamp SR, Tashjian SM, Thirion B, Thorp JN, Tinghög G, Tisdall L, Tompson SH, Toro-Serey C, Torre Tresols JJ, 
Tozzi L, Truong V, Turella L, van ‘t Veer AE, Verguts T, Vettel JM, Vijayarajah S, Vo K, Wall MB, Weeda WD, Weis S, White DJ, Wisniewski D, Xifra-Porxas A, Yearling EA, Yoon S, Yuan R, Yuen KSL, Zhang L, Zhang X, Zosky JE, Nichols TE, Poldrack RA, Schonberg T. Variability in the analysis of a single neuroimaging dataset by many teams. bioRxiv. 2020b doi: 10.1101/843193. [DOI] [PMC free article] [PubMed]
  7. Brenner N, Bialek W, de Ruyter van Steveninck R. Adaptive rescaling maximizes information transmission. Neuron. 2000;26:695–702. doi: 10.1016/s0896-6273(00)81205-2. [DOI] [PubMed] [Google Scholar]
  8. Burke CJ, Baddeley M, Tobler PN, Schultz W. Partial adaptation of obtained and observed value signals preserves information about gains and losses. The Journal of Neuroscience. 2016;36:10016–10025. doi: 10.1523/JNEUROSCI.0487-16.2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Buschman TJ, Siegel M, Roy JE, Miller EK. Neural substrates of cognitive capacity limitations. PNAS. 2011;108:11252–11255. doi: 10.1073/pnas.1104666108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Cai X, Padoa-Schioppa C. Neuronal encoding of subjective value in dorsal and ventral anterior cingulate cortex. The Journal of Neuroscience. 2012;32:3791–3808. doi: 10.1523/JNEUROSCI.3864-11.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Carandini M, Heeger DJ. Normalization as a canonical neural computation. Nature Reviews. Neuroscience. 2011;13:51–62. doi: 10.1038/nrn3136. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Cicchini GM, Arrighi R, Cecchetti L, Giusti M, Burr DC. Optimal encoding of interval timing in expert percussionists. The Journal of Neuroscience. 2012;32:1056–1060. doi: 10.1523/JNEUROSCI.3411-11.2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Clairis N, Pessiglione M. Value, confidence, deliberation: a functional partition of the medial prefrontal cortex demonstrated across rating and choice tasks. The Journal of Neuroscience. 2022;42:5580–5592. doi: 10.1523/JNEUROSCI.1795-21.2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Conen KE, Padoa-Schioppa C. Partial adaptation to the value range in the macaque orbitofrontal cortex. The Journal of Neuroscience. 2019;39:3498–3513. doi: 10.1523/JNEUROSCI.2279-18.2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Cox KM, Kable JW. BOLD subjective value signals exhibit robust range adaptation. The Journal of Neuroscience. 2014;34:16533–16543. doi: 10.1523/JNEUROSCI.3927-14.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Dale AM. Optimal experimental design for event-related fMRI. Human Brain Mapping. 1999;8:109–114. doi: 10.1002/(SICI)1097-0193(1999)8:2/3<109::AID-HBM7>3.0.CO;2-W. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Daunizeau J, Adam V, Rigoux L. VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data. PLOS Computational Biology. 2014;10:e1003441. doi: 10.1371/journal.pcbi.1003441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Daunizeau J. On parameters transformations for emulating sparse priors using variational laplace inference. arXiv. 2017 http://arxiv.org/abs/1703.07168
  19. Diedrichsen J, Kriegeskorte N. Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLOS Computational Biology. 2017;13:e1005508. doi: 10.1371/journal.pcbi.1005508. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Drugowitsch J, Wyart V, Devauchelle AD, Koechlin E. Computational precision of mental inference as critical source of human choice suboptimality. Neuron. 2016;92:1398–1411. doi: 10.1016/j.neuron.2016.11.005. [DOI] [PubMed] [Google Scholar]
  21. Ebitz RB, Hayden BY. The population doctrine in cognitive neuroscience. Neuron. 2021;109:3055–3068. doi: 10.1016/j.neuron.2021.07.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Elliott R, Agnew Z, Deakin JFW. Medial orbitofrontal cortex codes relative rather than absolute value of financial rewards in humans. The European Journal of Neuroscience. 2008;27:2213–2218. doi: 10.1111/j.1460-9568.2008.06202.x. [DOI] [PubMed] [Google Scholar]
  23. Fan L, Li H, Zhuo J, Zhang Y, Wang J, Chen L, Yang Z, Chu C, Xie S, Laird AR, Fox PT, Eickhoff SB, Yu C, Jiang T. The human brainnetome atlas: a new brain atlas based on connectional architecture. Cerebral Cortex. 2016;26:3508–3526. doi: 10.1093/cercor/bhw157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Fiebig F, Lansner A. A spiking working memory model based on hebbian short-term potentiation. The Journal of Neuroscience. 2017;37:83–96. doi: 10.1523/JNEUROSCI.1989-16.2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Fox K, Stryker M. Integrating Hebbian and homeostatic plasticity: introduction. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2017;372:20160413. doi: 10.1098/rstb.2016.0413. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Friston KJ, Holmes AP, Poline JB, Grasby PJ, Williams SCR, Frackowiak RSJ, Turner R. Analysis of fMRI time-series revisited. NeuroImage. 1995;2:45–53. doi: 10.1006/nimg.1995.1007. [DOI] [PubMed] [Google Scholar]
  27. Friston KJ, Diedrichsen J, Holmes E, Zeidman P. Variational representational similarity analysis. NeuroImage. 2019;201:115986. doi: 10.1016/j.neuroimage.2019.06.064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Güçlü U, van Gerven MAJ. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. The Journal of Neuroscience. 2015;35:10005–10014. doi: 10.1523/JNEUROSCI.5023-14.2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Hopfinger JB, Büchel C, Holmes AP, Friston KJ. A study of analysis parameters that influence the sensitivity of event-related fMRI analyses. NeuroImage. 2000;11:326–333. doi: 10.1006/nimg.2000.0549. [DOI] [PubMed] [Google Scholar]
  30. Kahneman D. Thinking, Fast and Slow. Macmillan; 2011. [Google Scholar]
  31. Kahneman D, Tversky A. In: Handbook of the Fundamentals of Financial Decision Making. Kahneman D, Tversky A, editors. World Scientific; 2012. Prospect theory: an analysis of decision under risk; pp. 99–127. [DOI] [Google Scholar]
  32. Khaw MW, Glimcher PW, Louie K. Normalized value coding explains dynamic adaptation in the human valuation process. PNAS. 2017;114:12696–12701. doi: 10.1073/pnas.1715293114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Kietzmann TC, McClure P, Kriegeskorte N. Deep neural networks in computational neuroscience. bioRxiv. 2017 doi: 10.1101/133504. [DOI] [PMC free article] [PubMed]
  34. Kietzmann TC, Spoerer CJ, Sörensen LKA, Cichy RM, Hauk O, Kriegeskorte N. Recurrence is required to capture the representational dynamics of the human visual system. PNAS. 2019;116:21854–21863. doi: 10.1073/pnas.1905544116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Kobayashi S, Pinto de Carvalho O, Schultz W. Adaptation of reward sensitivity in orbitofrontal neurons. The Journal of Neuroscience. 2010;30:534–544. doi: 10.1523/JNEUROSCI.4009-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Kriegeskorte N, Mur M, Bandettini P. Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. 2008;2:4. doi: 10.3389/neuro.06.004.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Kriegeskorte N, Golan T. Neural network models and deep learning. Current Biology. 2019;29:R231–R236. doi: 10.1016/j.cub.2019.02.034. [DOI] [PubMed] [Google Scholar]
  38. Laughlin S. A simple coding procedure enhances a neuron’s information capacity. Zeitschrift Fur Naturforschung. Section C, Biosciences. 1981;36:910–912. [PubMed] [Google Scholar]
  39. Lebreton M, Jorge S, Michel V, Thirion B, Pessiglione M. An automatic valuation system in the human brain: evidence from functional neuroimaging. Neuron. 2009;64:431–439. doi: 10.1016/j.neuron.2009.09.040. [DOI] [PubMed] [Google Scholar]
  40. Lim SL, O’Doherty JP, Rangel A. Stimulus value signals in ventromedial PFC reflect the integration of attribute value signals computed in fusiform gyrus and posterior superior temporal gyrus. The Journal of Neuroscience. 2013;33:8729–8741. doi: 10.1523/JNEUROSCI.4809-12.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Lopez-Persem A, Bastin J, Petton M, Abitbol R, Lehongre K, Adam C, Navarro V, Rheims S, Kahane P, Domenech P, Pessiglione M. Four core properties of the human brain valuation system demonstrated in intracranial signals. Nature Neuroscience. 2020;23:664–675. doi: 10.1038/s41593-020-0615-9. [DOI] [PubMed] [Google Scholar]
  42. Louie K, Glimcher PW. Efficient coding and the neural representation of value. Annals of the New York Academy of Sciences. 2012;1251:13–32. doi: 10.1111/j.1749-6632.2012.06496.x. [DOI] [PubMed] [Google Scholar]
  43. Louie K, Khaw MW, Glimcher PW. Normalization is a general neural mechanism for context-dependent decision making. PNAS. 2013;110:6139–6144. doi: 10.1073/pnas.1217854110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Louie K, Glimcher PW, Webb R. Adaptive neural coding: from biological to behavioral decision-making. Current Opinion in Behavioral Sciences. 2015;5:91–99. doi: 10.1016/j.cobeha.2015.08.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Marois R, Ivanoff J. Capacity limits of information processing in the brain. Trends in Cognitive Sciences. 2005;9:296–305. doi: 10.1016/j.tics.2005.04.010. [DOI] [PubMed] [Google Scholar]
  46. May KA, Zhaoping L. Efficient coding theory predicts a tilt aftereffect from viewing untilted patterns. Current Biology. 2016;26:1571–1576. doi: 10.1016/j.cub.2016.04.037. [DOI] [PubMed] [Google Scholar]
  47. McClamrock R. Marr’s three levels: A re-evaluation. Minds and Machines. 1991;1:185–196. doi: 10.1007/BF00361036. [DOI] [Google Scholar]
  48. Miller EK, Buschman TJ. Working memory capacity: limits on the bandwidth of cognition. Daedalus. 2015;144:112–122. doi: 10.1162/DAED_a_00320. [DOI] [Google Scholar]
  49. Nadal J. Non Linear Neurons in the Low Noise Limit: A Factorial Code Maximizes Information Transfer. Semantic Scholar; 1994. [DOI] [Google Scholar]
  50. O’Doherty JP, Hampton A, Kim H. Model-based fMRI and its application to reward learning and decision making. Annals of the New York Academy of Sciences. 2007;1104:35–53. doi: 10.1196/annals.1390.022. [DOI] [PubMed] [Google Scholar]
  51. O’Doherty JP, Rutishauser U, Iigaya K. The hierarchical construction of value. Current Opinion in Behavioral Sciences. 2021;41:71–77. doi: 10.1016/j.cobeha.2021.03.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nature Neuroscience. 2008;11:95–102. doi: 10.1038/nn2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Padoa-Schioppa C. Range-adapting representation of economic value in the orbitofrontal cortex. The Journal of Neuroscience. 2009;29:14004–14014. doi: 10.1523/JNEUROSCI.3751-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Padoa-Schioppa C, Rustichini A. Rational attention and adaptive coding: a puzzle and a solution. The American Economic Review. 2014;104:507–513. doi: 10.1257/aer.104.5.507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Pessiglione M, Daunizeau J. Bridging across functional models: The OFC as a value-making neural network. Behavioral Neuroscience. 2021;135:277–290. doi: 10.1037/bne0000464. [DOI] [PubMed] [Google Scholar]
  57. Pezzulo G, Rigoli F, Friston K. Active Inference, homeostatic regulation and adaptive behavioural control. Progress in Neurobiology. 2015;134:17–35. doi: 10.1016/j.pneurobio.2015.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Polanía R, Woodford M, Ruff CC. Efficient coding of subjective value. Nature Neuroscience. 2019;22:134–142. doi: 10.1038/s41593-018-0292-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Poldrack RA, Barch DM, Mitchell JP, Wager TD, Wagner AD, Devlin JT, Cumba C, Koyejo O, Milham MP. Toward open sharing of task-based fMRI data: the OpenfMRI project. Frontiers in Neuroinformatics. 2013;7:12. doi: 10.3389/fninf.2013.00012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Pooresmaeili A, Arrighi R, Biagi L, Morrone MC. Blood oxygen level-dependent activation of the primary visual cortex predicts size adaptation illusion. The Journal of Neuroscience. 2013;33:15999–16008. doi: 10.1523/JNEUROSCI.1770-13.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Porrill J, Stone J. Undercomplete Independent Component Analysis for Signal Separation and Dimension Reduction. Semantic Scholar; 1998. [Google Scholar]
  62. Raghuraman AP, Padoa-Schioppa C. Integration of multiple determinants in the neuronal computation of economic values. The Journal of Neuroscience. 2014;34:11583–11603. doi: 10.1523/JNEUROSCI.1235-14.2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Rakow T, Cheung NY, Restelli C. Losing my loss aversion: the effects of current and past environment on the relative sensitivity to losses and gains. Psychonomic Bulletin & Review. 2020;27:1333–1340. doi: 10.3758/s13423-020-01775-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Ramsey NF, Jansma JM, Jager G, Van Raalten T, Kahn RS. Neurophysiological factors in human information processing capacity. Brain. 2004;127:517–525. doi: 10.1093/brain/awh060. [DOI] [PubMed] [Google Scholar]
  65. Rangel A, Camerer C, Montague PR. A framework for studying the neurobiology of value-based decision making. Nature Reviews. Neuroscience. 2008;9:545–556. doi: 10.1038/nrn2357. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Rangel A, Clithero JA. Value normalization in decision making: theory and evidence. Current Opinion in Neurobiology. 2012;22:970–981. doi: 10.1016/j.conb.2012.07.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Rigoli F, Friston KJ, Dolan RJ. Neural processes mediating contextual influences on human choice behaviour. Nature Communications. 2016;7:12416. doi: 10.1038/ncomms12416. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Rigoux L, Daunizeau J, Adam V. VBA-toolbox. 5899497. GitHub. 2023. https://github.com/MBB-team/VBA-toolbox/
  69. Rustichini A, Conen KE, Cai X, Padoa-Schioppa C. Optimal coding and neuronal adaptation in economic decisions. Nature Communications. 2017;8:1208. doi: 10.1038/s41467-017-01373-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Saez RA, Saez A, Paton JJ, Lau B, Salzman CD. Distinct roles for the amygdala and orbitofrontal cortex in representing the relative amount of expected reward. Neuron. 2017;95:70–77. doi: 10.1016/j.neuron.2017.06.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Seymour B, McClure SM. Anchors, scales and the relative coding of value in the brain. Current Opinion in Neurobiology. 2008;18:173–178. doi: 10.1016/j.conb.2008.07.010. [DOI] [PubMed] [Google Scholar]
  72. Soltani A, De Martino B, Camerer C. A range-normalization model of context-dependent choice: A new model and evidence. PLOS Computational Biology. 2012;8:e1002607. doi: 10.1371/journal.pcbi.1002607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Srivastava N, Schrater P. A value-relativistic decision theory predicts known biases in human preferences. Proceedings of the Annual Meeting of the Cognitive Science Society. 2011. [Google Scholar]
  74. Steverson K, Brandenburger A, Glimcher P. Choice-theoretic foundations of the divisive normalization model. Journal of Economic Behavior & Organization. 2019;164:148–165. doi: 10.1016/j.jebo.2019.05.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Suzuki S, Cross L, O’Doherty JP. Elucidating the underlying components of food valuation in the human orbitofrontal cortex. Nature Neuroscience. 2017;20:1780–1786. doi: 10.1038/s41593-017-0008-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Tom SM, Fox CR, Trepel C, Poldrack RA. The neural basis of loss aversion in decision-making under risk. Science. 2007;315:515–518. doi: 10.1126/science.1134239. [DOI] [PubMed] [Google Scholar]
  77. Toyoizumi T, Kaneko M, Stryker MP, Miller KD. Modeling the dynamic interaction of Hebbian and homeostatic plasticity. Neuron. 2014;84:497–510. doi: 10.1016/j.neuron.2014.09.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature. 1999;398:704–708. doi: 10.1038/19525. [DOI] [PubMed] [Google Scholar]
  79. Troscianko J, Osorio D. A model of colour appearance based on efficient coding of natural images. PLOS Computational Biology. 2023;19:e1011117. doi: 10.1371/journal.pcbi.1011117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Turrigiano GG. The dialectic of Hebb and homeostasis. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2017;372:20160258. doi: 10.1098/rstb.2016.0258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Valerio R, Navarro R. Optimal coding through divisive normalization models of V1 neurons. Network. 2003;14:579–593. [PubMed] [Google Scholar]
  82. Wang B, Ke W, Guang J, Chen G, Yin L, Deng S, He Q, Liu Y, He T, Zheng R, Jiang Y, Zhang X, Li T, Luan G, Lu HD, Zhang M, Zhang X, Shu Y. Firing frequency maxima of fast-spiking neurons in human, monkey, and mouse neocortex. Frontiers in Cellular Neuroscience. 2016;10:239. doi: 10.3389/fncel.2016.00239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Wark B, Lundstrom BN, Fairhall A. Sensory adaptation. Current Opinion in Neurobiology. 2007;17:423–429. doi: 10.1016/j.conb.2007.07.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Williams TB, Burke CJ, Nebe S, Preuschoff K, Fehr E, Tobler PN. Testing models at the neural level reveals how the brain computes subjective value. PNAS. 2021;118:e2106237118. doi: 10.1073/pnas.2106237118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Wyart V, Koechlin E. Choice variability and suboptimality in uncertain environments. Current Opinion in Behavioral Sciences. 2016;11:109–115. doi: 10.1016/j.cobeha.2016.07.003. [DOI] [Google Scholar]
  86. Yamada H, Louie K, Tymula A, Glimcher PW. Free choice shapes normalized value signals in medial orbitofrontal cortex. Nature Communications. 2018;9:162. doi: 10.1038/s41467-017-02614-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Zimmermann J, Glimcher PW, Louie K. Multiple timescales of normalized value coding underlie adaptive choice behavior. Nature Communications. 2018;9:3206. doi: 10.1038/s41467-018-05507-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

Thorsten Kahnt

This valuable manuscript proposes a neural network mechanism for range adaptation for value-based decision making. The authors present solid evidence for the proposed mechanism.

Decision letter

Editor: Thorsten Kahnt

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Synaptic plasticity in the orbitofrontal cortex explains how risk attitude adapts to the range of risk prospects" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Michael Frank as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) As you can see in the individual comments, all reviewers thought that your paper addresses an important topic. However, a central concern raised by all reviewers was that a substantial part of the model performance is driven by the activation function – yet, throughout the manuscript, you mainly discuss the role of Hebbian plasticity and largely ignore the effects of the activation function. There was agreement among reviewers that this is not warranted and that your manuscript requires substantial reframing, clarification, justification, and discussion. Reviewers would expect an adequately revised manuscript to look quite different from the current version, with an equal focus on all factors that allow the model to account for the observed changes across different ranges.

2) There are additional comments in the individual critiques that would be important to address, specifically regarding aspects of the analysis and interpretation.

Reviewer #1 (Recommendations for the authors):

1) Please explain on page 4 of the introduction why Hebbian plasticity should lead to spill-over effects. This is not very intuitive.

2) Interpretation of the null findings in the fMRI data (page 4 of results) is problematic because it is unclear whether they reflect a true null effect or a lack of sensitivity. Although this is true for all null results, it is particularly problematic for re-analyses, as the study was not designed or powered to test this question. It would be best to remove these results from the paper.

3) There is a fundamental difference between gaussian and sigmoidal activation functions. It would be important to include an adequate discussion of the assumptions and implications of these different functions in the main text.

4) The out-of-range predictions of the best model (Figure 3, lower-right panel, HP-ANN (gauss)) are not very convincing when considering the entire EV range. Model performance should be compared for the entire EV range, not just the common range. What does this mean for the proposed mechanism?

5) The paper focuses on Hebbian plasticity as a mechanism for context effects but judging from Figure 3, the choice of activation function has a comparable effect. Indeed, HP does almost nothing for models with sigmoidal activation functions, and HP only improves the out-of-range prediction for the common EV range but does very little for the uncommon range. A more balanced presentation that also discusses the type of activation function as a mechanism of adaptation would be important.

6) Are the RSA results in Figure 4 based on the ANNs with gaussian activation function? It would be important to show RSA results for all 4 ANNs in Figure 4, so it is possible to compare the results across models. Also, please add labels to the plot axes.

7) The proportion of offer and chosen value neurons shown in Figure 6 is opposite to what has been reported in the OFC of non-human primates. It would be good to discuss this discrepancy. Also, are the same proportions found for all 4 ANNs?

8) Figure 4: the spheres shown for BA11 are in the posterior medial rather than the lateral OFC. Please double-check the anatomical location of these ROIs. Are they really in the lateral OFC? Also, it would be good to provide center coordinates for the ROIs. In general, it would be important to better describe how the ROIs were generated. What were the search terms used in NeuroQuery? Also, NeuroQuery generates meta-analytic activation maps, not maps of anatomical structures. It would be better to use actual anatomical ROIs for fMRI data analysis.

Reviewer #2 (Recommendations for the authors):

1) Range adaptation is not shown directly in ANN units. Specific questions:

a) How do ANN units respond across different input values? How variable are these response patterns across units?

b) How do these response patterns vary with range?

c) Do these patterns (at the individual unit or mean level) resemble neuronal data from the orbitofrontal cortex (OFC)? Should it be expected to?

d) The example units that are shown (Figure 7) seem potentially different from previously reported neuronal data from OFC. For example, the dynamic range of the model covers only a narrow subset of the input space and varies substantially across conditions, whereas firing rates in OFC neurons tend to span the full range of possible values, and firing rates change only slightly between conditions [ref,ref]. Is this discrepancy just a side effect of showing results in terms of loss units rather than expected value? How should we interpret this apparent discrepancy?

e) ANN unit responses are compared to neuron classes observed in OFC (Figure 6). What does the mean ANN unit in each category look like, and how does this compare to the OFC responses referenced?

2) Modeling results focus on Hebbian networks with gaussian activation functions. Specific questions:

a) Is there a physiological motivation for the model, or is this primarily for mathematical convenience?

b) Do the main qualitative results (range effect on risk sensitivity) require a non-monotonic activation function?

c) How do the properties of the response functions (saturation and (non)-monotonicity) affect responses of ANN units?

d) There are visible differences between model behavior for Hebbian networks with sigmoidal and gaussian activation functions. How should these differences be interpreted? Does this lead to any predictions or constraints on what physiological implementation is consistent with this algorithm?

3) Out-of-sample predicted choices in the Hebbian model with gaussian activation function seem unintuitive for more extreme parts of the value range. E.g. for the out-of-sample wide range predictions, the model seems to over-predict risk aversion to the point that choice probabilities saturate at ~0.6 for increasingly high-value options. For the narrow range, out-of-sample prediction behavior even appears to be non-monotonic, leading to increased acceptance of extremely high loss options.

a) How should this be interpreted?

b) Does this depend on model parameters like update rate or covariance threshold?

c) In these parts of the range, how does the Hebbian model compare to alternate models, such as the other ANNs or logistic regression?

4) The relationship between neural responses and ANN activity relies on representational similarity analysis (RSA). However, significant RDM correlations could arise as a byproduct of the fact that both ANN output and BOLD activity in select regions correlate with behavioral choice patterns. Is there evidence that the correlation between Hebbian ANNs and BOLD activation reflects more than average choice patterns?

5) The authors make the strong claim that this model is the mechanistic explanation for adaptation in orbitofrontal cortex, but there is little comparison with previous models. Divisive normalization and other forms of adaptation to the value range are discarded based on a qualitative argument from the behavioral data. However, given that the Hebbian ANNs also produce some counterintuitive behavioral predictions, it is not obvious that they are better at accounting for observed patterns in neuronal adaptation or behavior. Addressing the following questions could clarify whether there is an argument for Hebbian ANNs over alternate mechanisms of adaptation:

a) How do behavioral predictions and RSA results from HP-ANNs compare quantitatively to results from other models of adaptation, including models of adaptation at the input stage?

b) Can Hebbian ANNs account for other previously observed patterns of behavior across different value ranges, such as stability of relative values across ranges in two-option choices and range-dependent decoy effects []?

c) Are there specific predictions that arise from Hebbian networks that could be tested in later work and used to differentiate between competing models?

Reviewer #3 (Recommendations for the authors):

1) The paper assumes loss aversion as the primary behavioral factor, which is fine. However, it may be worth briefly mentioning the limitation that with the present experimental design it is impossible to dissociate loss aversion from risk aversion (see e.g. Williams et al., 2021, PNAS).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Efficient value synthesis in the orbitofrontal cortex: how does peoples' risk attitude adapts to the range of risk prospects?" for further consideration by eLife. Your revised article has been evaluated by Michael Frank (Senior Editor) and a Reviewing Editor.

All reviewers agreed that the revised manuscript has been improved. However, given the revised manuscript is essentially a fundamentally new manuscript, there is a new set of comments that would need to be addressed, as outlined below.

Reviewer #1 (Recommendations for the authors):

I appreciate the authors' effort and dedication in re-working the manuscript. The revised manuscript is a fundamentally new and improved paper with new conclusions. I believe it could make an important contribution to the field and I remain enthusiastic. However, given most of the manuscript has changed, I have new comments that I think should be addressed.

1) In general, the manuscript is quite long with extensive supplementary analyses. I believe the paper could be streamlined by highlighting the most important aspects and reducing the extent to which details are discussed in the main text.

2) All relevant information to understand the plots should be embedded within the figure rather than just the figure legends. It is cumbersome for readers to constantly have to consult the legends to understand what is shown in the figure. For instance, there is no label for the different colored plots (red/black, red/blue) or line styles in ANY of the figures. Also, some legends (Figure 4) refer to a color code, but this code is not provided. Moreover, there are no axis ticks and/or axis tick labels in some of the panels in Figures 3, 5, 6, 7, 9, 10 (top row), 12 (top row), 13 (top row), S4, S5, S6, S7, S10, and S12. Several figures don't include a color bar (e.g., Figures S5-S7). Please carefully revise all figures. Note that this point was already raised in the previous round of reviews, but it was not addressed.

3) Figure 3D – Loss aversion over time: Instead of running separate between-group comparisons for each time point, it would be more appropriate to run a single two-way ANOVA with within-subject factor time and between-subject factor group. A significant group-by-time interaction would support the conclusion that loss aversion diverges between the two groups across time.

4) Why was the lateral OFC (area 47/12) not included here, given that work by Suzuki et al. 2017 suggests that lateral OFC represents attribute-specific values?

5) It is not fully clear what is plotted in Figures 5-7. Are these the averages across all RDM cells with a certain δ G/L/EV in Figures S5-S7? Does this include the δ G/L/EV = 0? How is neural coding strength defined in the lower rows? Tick labels would have helped.

6) Figure 9 shows that subject-specific ANNs correlate with subject-specific RDMs. To claim that these models capture individual patterns of OFC activity, it would be important to show that these correlations exceed those with group-level ANNs. Moreover, to claim specificity for plastic ANNs, it would be necessary to show superiority of predictions from plastic vs static ANNs.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Efficient value synthesis in the orbitofrontal cortex explains how loss aversion adapts to the ranges of gain and loss prospects" for further consideration by eLife. Your revised article has been evaluated by Michael Frank (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below.

Reviewer #1 (Recommendations for the authors):

The authors have addressed my comments. I think the manuscript makes an interesting contribution to the field.

Reviewer #2 (Recommendations for the authors):

Overall the proposed model presents an interesting possible explanation for types of context-dependent loss aversion. The manuscript has improved over the course of revision and will be a worthwhile contribution to the literature. I have a few remaining comments that would help improve the clarity and accessibility of the results if they can be addressed before publication, but these are relatively minor.

1) While the figures have been substantially improved, several are still missing a description of the colors in the figure or legend, and instead have the placeholder phrase "(color code)" in the figure legend.

2) I appreciate your response to my previous comment about "undoing" adaptation (R2 Comment 3), but it is not clear to me in your response whether you are describing the computational role of "offer value" units in your model specifically, or just giving a hypothetical scenario. If I understand right, your model produces choice via a comparison of Vt for two options, and the "offer value" units are part of the integration layer (i.e. an input to Vt rather than the signal being compared directly). Is the idea that this would lead to stable preferences even without "undoing" adaptation downstream? Or would your model predict that preferences do shift in response to "offer value" adaptation, and you suspect that past studies may not have been able to see it? Or are you just trying to say that there are several hypothetical possibilities, and in the specific task you are modeling it is not necessary to modify the weights? (As an aside, I also disagree with the argument that Rustichini et al. are interpreting a null result as evidence of absence – they start by predicting how preferences would change if choices arose from a simple comparison of offer value firing rates, then show that actual choice behavior does not match this prediction.)

3) In line 456 you discuss the spatial specificity of results, but unless I'm missing something this doesn't involve a direct comparison between regions. It may be worth reducing this claim.

Reviewer #3 (Recommendations for the authors):

Thank you for a responsive revision, I have no further points other than that the paper would still benefit from careful spell-checking.

eLife. 2024 Dec 9;13:e80979. doi: 10.7554/eLife.80979.sa2

Author response


Essential revisions:

1) As you can see in the individual comments, all reviewers thought that your paper addresses an important topic. However, a central concern raised by all reviewers was that a substantial part of the model performance is driven by the activation function – yet, throughout the manuscript, you mainly discuss the role of Hebbian plasticity and largely ignore the effects of the activation function. There was agreement among reviewers that this is not warranted and that your manuscript requires substantial reframing, clarification, justification, and discussion. Reviewers would expect an adequately revised manuscript to look quite different from the current version, with an equal focus on all factors that allow the model to account for the observed changes across different ranges.

2) There are additional comments in the individual critiques that would be important to address, specifically regarding aspects of the analysis and interpretation.

First of all, we would like to thank you for your insightful and constructive criticism. Some of your comments helped us understand that our previous interpretations were unsatisfactory, and we eventually changed our minds on many important aspects of our work. In particular, extensive post hoc numerical simulations made us realize that Hebbian plasticity, as described in the previous version of the manuscript, does not induce value range adaptation in most contexts. This meant that our previous results were somewhat illusory, in the sense that they would not generalize to other types of datasets. We thus started from scratch and derived a novel computational framework, which is grounded in formal models of efficient coding. We will detail this below. The resulting manuscript is entirely different in its content, and we hope to have significantly improved the quality of our contribution.

Second, we would like to apologize for the time it took us to revise this manuscript. The reason for this delay is twofold. First, the first author (Jules Brochard) moved to a new lab for his postdoc and then eventually quit academia, which meant that we could only progress very slowly. Second, we made so many changes, on both the computational modelling and data analysis sides, that this revision required a lot of time to complete. Nevertheless, we hope that the resulting work will satisfy most of your concerns. We reiterate that this would not have been possible without you making us think again about our work.

Before we respond to your comments point by point below, we would like to summarize our revisions. We believe this is necessary to clarify some of our responses.

As we mentioned above, raw Hebbian plasticity turns out not to be a good model for range adaptation. We realized this when attempting to address one of your questions, about why Hebbian plasticity would induce range adaptation. We first tried to derive analytical results regarding the impact of Hebbian plasticity: this did not work. We then resorted to numerical simulations on a wide range of conditions, but this eventually demonstrated that Hebbian plasticity does not yield value range adaptation. More precisely, it only does so in very specific settings of the ANN internal connectivity, and these settings cannot be summarized in a simple manner and/or justified from first principles. So what does this imply for our previous analyses? When fitting the plastic ANNs to empirical data, we unknowingly identified idiosyncratic variants of these settings, eventually extracting “Hebbian explanations” for range adaptation. But we now know that these explanations were, at best, anecdotal: they would not generalize.

This was a rather disappointing realization. At this point, we wondered how to move forward. We thus reversed the logic of our reasoning and asked: what sort of change in the ANN structure would eventually yield range adaptation? We were looking for a computational principle that would suggest why neurons would range-adapt, i.e. why this would be adaptive. We took inspiration from the theoretical literature on efficient coding, which highlighted that range adaptation is the mechanism by which neurons minimize the information loss that is induced by their limited firing range. However, existing efficient coding models had been derived for perceptual brain systems, where neurons transmit the information that they receive. Put simply, the idea here is that a neuron’s input is the physical quantity that is signaled to the brain (e.g., light intensity within a certain frequency band), whereas the neuron’s output is the percept (e.g., perceived amount of red). In turn, range adaptation (to a neuron’s input signal) directly induces perceptual context-dependent effects. This is not the case in our context: value is the outcome of an integration mechanism over multiple (and possibly conflicting) pieces of decision-relevant information. We thus extended existing efficient coding models to ANNs that operate such value synthesis. More precisely, we show that a simple form of self-organized plasticity between the ANN’s attribute-specific and attribute-integration layers does mitigate the information loss induced by the limited firing range of neural units. In what follows, we refer to this as efficient value synthesis.

As you will see, this type of self-organized plasticity shares with Hebbian plasticity its simple form, i.e. connections progressively change as a function of the output responses of pairwise connected units (see Equation 2 in the revised manuscript, as well as its mathematical derivation in the revised Methods section). However, it is not, strictly speaking, Hebbian (i.e. it does not reinforce connections that yield co-activation of source and target units). If anything, it is similar to (though simpler than) the ANN training rules underlying infomax variants of independent component analysis or ICA, which are essentially anti-Hebbian (Bell & Sejnowski, 1995; Nadal, 1994). Importantly, and for the same reason as in ICA network models, it turns out that this type of self-organized plasticity is incompatible with Gaussian activation functions. When revising this work, we thus focused on sigmoidal activation functions.
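For readers unfamiliar with this family of rules, the following is a minimal sketch of the classic single-unit infomax update of Bell & Sejnowski (1995) for one sigmoidal unit. It is not Equation 2 of the revised manuscript, and the function name, learning rate, number of steps and input ranges are all assumptions chosen for illustration. It only shows how an output-driven, anti-Hebbian-like rule can rescale a bounded unit to the range of its inputs.

```python
import math
import random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def train_infomax_unit(input_range, n_steps=40000, lr=0.005, seed=0):
    """Single sigmoidal unit y = sigmoid(w * x + w0), trained with the
    Bell & Sejnowski (1995) infomax rule on inputs drawn uniformly from
    input_range. All settings here are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = input_range
    w, w0 = 1.0, 0.0
    for _ in range(n_steps):
        x = rng.uniform(lo, hi)
        y = sigmoid(w * x + w0)
        # The x * (1 - 2y) term is anti-Hebbian: it weakens w when input and
        # output are both high; the 1 / w term prevents the slope from vanishing.
        w += lr * (1.0 / w + x * (1.0 - 2.0 * y))
        w0 += lr * (1.0 - 2.0 * y)
    return w, w0

# Hypothetical narrow and wide input ranges (arbitrary units).
w_narrow, w0_narrow = train_infomax_unit((0.0, 2.0))
w_wide, w0_wide = train_infomax_unit((0.0, 8.0))
print("slope, narrow range:", round(w_narrow, 2))
print("slope, wide range:  ", round(w_wide, 2))
```

Under these assumptions, the unit exposed to the wider range should settle on a shallower slope, i.e. it spreads its limited output range over a wider input domain and becomes less sensitive to a unit change in input.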

In the revised manuscript, we first summarize the mathematical derivation of efficient value synthesis and highlight its neural and behavioral consequences (in terms of the sensitivity to decision-relevant attributes, i.e. here: prospective gains and losses). Importantly, we show that efficient value synthesis induces value range adaptation in a wide range of contexts. We then test these neural and behavioral predictions using model-free data analyses in five subregions of the OFC. Finally, we reperform our previous model-based analyses using ANNs with and without self-organized plasticity, and show that only the former do yield accurate out-of-sample predictions of neural and behavioral data. Note that we also compare this model to a simpler range adaptation model, which operates at the level of attributes.

These changes eventually translated into significant modifications in all sections of the manuscript (Intro, Results, Methods, Discussion and Supplementary Materials). In fact, we had to replace most of the content of the previous version of the manuscript. This implies that some of the reviewers’ comments and questions may not be relevant anymore.

Nevertheless, we tried to address each one of them below, while referring to the novel computational framework that we propose here. In any case, we believe our work has been significantly strengthened, and we thank the reviewers once again for their insightful comments and constructive criticism.

Reviewer #1 (Recommendations for the authors):

1) Please explain on page 4 of the introduction why Hebbian plasticity should lead to spill-over effects. This is not very intuitive.

This is a fair point. In brief, numerical simulations show that Hebbian plasticity neither induces value range adaptation nor leads to spillover effects. This is not the case for efficient value synthesis. However, we have toned down the importance of spillover effects, because we do not think they are a reliable signature of range adaptation in integration units. We understood this by deriving an efficient coding model that operates at the level of attributes: this model also predicts cross-attribute spillover effects. More precisely, what discriminates the two models is the direction of the within-attribute effects, whereas the cross-attribute spillover effects are similar.

Note that the mathematical derivation of the model of efficient coding of attributes, the summary of its neural and behavioral predictions, as well as the results of its related model-based data analyses are reported in the revised Supplementary Materials.

2) Interpretation of the null findings in the fMRI data (page 4 of results) is problematic because it is unclear whether they reflect a true null effect or a lack of sensitivity. Although this is true for all null results, it is particularly problematic for re-analyses, as the study was not designed or powered to test this question. It would be best to remove these results from the paper.

You are referring here to the null findings of univariate fMRI data analyses. We have now moved these results to the Supplementary Materials.

3) There is a fundamental difference between gaussian and sigmoidal activation functions. It would be important to include an adequate discussion of the assumptions and implications of these different functions in the main text.

Sigmoidal activation functions are a simple summary of the known physiological neural response to an electrical input: the firing rate of neurons is known to be both lower and upper bounded. In our previous manuscript, we also considered Gaussian activation functions. In brief, they are a phenomenological modelling assumption that captures neural receptive fields that span a bounded subregion of stimulus properties (e.g. spatial field of view in V1 neurons). We note that a Gaussian activation function can be understood as the physiological output of a neuron that is reciprocally coupled with an inhibitory unit (where both have sigmoidal activation functions).

In any case, we do not consider Gaussian activation functions anymore in the revised manuscript. This is because, under Gaussian activation functions, the self-organized plasticity mechanism that mitigates information loss yields unstable dynamics. We comment on this point in the revised Discussion section (lines 536-561).
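The qualitative difference between the two families of activation functions can be made concrete with a small sketch. The parameterization below (unit slope, zero center, unit width) is an assumption made for illustration only and does not correspond to fitted values from either version of the manuscript.

```python
import math

def sigmoid(u):
    """Bounded and monotonic: output saturates at 0 and 1."""
    return 1.0 / (1.0 + math.exp(-u))

def gaussian(u, center=0.0, width=1.0):
    """Bounded but non-monotonic: output peaks at `center` and decays on both sides."""
    return math.exp(-0.5 * ((u - center) / width) ** 2)

def is_monotonic_increasing(values):
    return all(b >= a for a, b in zip(values, values[1:]))

# Hypothetical input grid from -4 to 4 in steps of 0.5.
inputs = [0.5 * i for i in range(-8, 9)]
sigmoid_out = [sigmoid(u) for u in inputs]
gaussian_out = [gaussian(u) for u in inputs]

print("sigmoid monotonic: ", is_monotonic_increasing(sigmoid_out))
print("gaussian monotonic:", is_monotonic_increasing(gaussian_out))
```

With a sigmoidal unit, a larger input never produces a smaller output, so two inputs are confused only near saturation. With a Gaussian unit, two inputs on either side of the center can produce the same output.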

4) The out-of-range predictions of the best model (Figure 3, lower-right panel, HP-ANN (gauss)) are not very convincing when considering the entire EV range. Model performance should be compared for the entire EV range, not just the common range. What does this mean for the proposed mechanism?

In the previous version of the manuscript, out-of-sample predictions of ANNs equipped with Hebbian plasticity were indeed inaccurate outside the common EV range. In particular, they exhibited a non-monotonic relationship between EV and gambling rate. This is not the case anymore when relying on ANNs endowed with self-organized plasticity that operates efficient value synthesis under sigmoidal activation functions (see Figure 7 in the revised manuscript). Of course, the ensuing behavioral out-of-sample predictions are still not perfect. More precisely, they tend to slightly under-predict the observed context-dependency effect on people’s risk attitude. Nevertheless, the accuracy of our out-of-sample predictions has clearly improved. In addition, we now provide complementary analytical results and model-free evidence (see below) that strengthen our computational claims.

5) The paper focuses on Hebbian plasticity as a mechanism for context effects but judging from Figure 3, the choice of activation function has a comparable effect. Indeed, HP does almost nothing for models with sigmoidal activation functions, and HP only improves the out-of-range prediction for the common EV range but does very little for the uncommon range. A more balanced presentation that also discusses the type of activation function as a mechanism of adaptation would be important.

You are right: in our previous version of the manuscript, Hebbian plasticity could only capture temporal range adaptation in people’s risk attitude when combined with Gaussian activation functions. Let us reiterate that, given that Hebbian plasticity only yields range adaptation under specific ANN connectivity patterns, this should be considered as anecdotal evidence for this mechanism. Since we do not consider Gaussian activation functions anymore, this comment is now irrelevant. Nevertheless, for the sake of completeness, we would like to clarify our previous analyses.

In brief, neither Gaussian nor sigmoidal activation functions can induce temporal range adaptation effects by themselves: those can only be the outcome of dynamical mechanisms that change the network’s response over time (as is the case for efficient coding of attributes or efficient value synthesis). This is why we did not consider the form of the units’ activation functions as a potential cause for temporal range adaptation. If anything, it should be considered a potential moderator of the impact of dynamical mechanisms such as plastic changes in the ANN’s internal connectivity.

6) Are the RSA results in Figure 4 based on the ANNs with gaussian activation function? It would be important to show RSA results for all 4 ANNs in Figure 4, so it is possible to compare the results across models. Also, please add labels to the plot axes.

Again, we have abandoned Gaussian activation functions in the revised manuscript. Nevertheless, we do now report the results of both out-of-sample behavioral and neural predictions for all models. For RSA, the results of the efficient value synthesis (resp., efficient coding of attributes) scenario are summarized in Figure 8 of the main text (resp., Figure S12 of the Supplementary Materials).

We note that we have extended our previous RSA analyses. In addition to the quantification of trial-by-trial similarity between ANNs’ integration layer and fMRI activity patterns, we now also measure their similarity on gain-, loss- and EV-dependent RDMs (the latter are derived by binning trials according to either gain, loss or EV). We did this because we could obtain quantitative predictions regarding the related information content within the integration layer of ANNs operating efficient value synthesis (or efficient coding of attributes). This eventually provides more opportunities for evaluating the evidence strength for or against our candidate computational models. In particular, this allows us to perform model-free data analyses, which are designed to test between-group differences in the neural encoding strength of gains, losses and EVs.
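The logic of an attribute-dependent RDM can be sketched as follows. This is a toy illustration under stated assumptions, not the analysis pipeline of the manuscript: the simulated gain range, the number of trials, units and bins, the sigmoidal tuning curves, the noise level, the use of 1 minus Pearson correlation as a dissimilarity and of Spearman correlation to compare RDMs are all hypothetical choices, and every function name is made up for this example.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    """Spearman correlation (no tie correction; adequate for continuous toy data)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(a), ranks(b))

def binned_rdm(patterns, attribute, n_bins):
    """Average trial patterns within equal-width attribute bins, then return
    the matrix of 1 - Pearson r between bin-averaged patterns.
    Assumes every bin contains at least one trial."""
    lo, hi = min(attribute), max(attribute)
    width = (hi - lo) / n_bins
    sums = [[0.0] * len(patterns[0]) for _ in range(n_bins)]
    counts = [0] * n_bins
    for pattern, value in zip(patterns, attribute):
        b = min(int((value - lo) / width), n_bins - 1)
        counts[b] += 1
        for k, p in enumerate(pattern):
            sums[b][k] += p
    means = [[s / c for s in row] for row, c in zip(sums, counts)]
    return [[1.0 - pearson(means[i], means[j]) for j in range(n_bins)]
            for i in range(n_bins)]

def upper_triangle(rdm):
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]

# Toy data: hypothetical gains and sigmoidal tuning curves.
rng = random.Random(0)
n_trials, n_units, n_bins = 300, 20, 5
gains = [rng.uniform(10.0, 40.0) for _ in range(n_trials)]
centers = [rng.uniform(10.0, 40.0) for _ in range(n_units)]

def tuning(g):
    return [1.0 / (1.0 + math.exp(-(g - c) / 5.0)) for c in centers]

model_patterns = [tuning(g) for g in gains]                      # stands in for an ANN layer
brain_patterns = [[r + rng.gauss(0.0, 0.1) for r in tuning(g)]   # stands in for noisy voxel data
                  for g in gains]

model_rdm = binned_rdm(model_patterns, gains, n_bins)
brain_rdm = binned_rdm(brain_patterns, gains, n_bins)
similarity = spearman(upper_triangle(model_rdm), upper_triangle(brain_rdm))
print("gain-dependent RDM similarity:", round(similarity, 2))
```

The same function can be applied with losses or EVs as the binning attribute, which yields one RDM per attribute for both the model layer and the measured activity patterns.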

7) The proportion of offer and chosen value neurons shown in Figure 6 is opposite to what has been reported in the OFC of non-human primates. It would be good to discuss this discrepancy. Also, are the same proportions found for all 4 ANNs?

This comment refers to our previous post hoc analysis of ANN integration units, which showed that ANNs equipped with Hebbian plasticity reproduced the known diversity of coding properties in OFC neurons. In the revised version of our manuscript, we have reperformed this analysis (this time with ANNs operating efficient value synthesis), and it yields qualitatively similar results. Nevertheless, we have decided to tone down this point, and have moved these results to the Supplementary Materials.

8) Figure 4: the spheres shown for BA11 are in the posterior medial rather than the lateral OFC. Please double-check the anatomical location of these ROIs. Are they really in the lateral OFC? Also, it would be good to provide center coordinates for the ROIs. In general, it would be important to better describe how the ROIs were generated. What were the search terms used in NeuroQuery? Also, NeuroQuery generates meta-analytic activation maps, not maps of anatomical structures. It would be better to use actual anatomical ROIs for fMRI data analysis.

We agree that our previous ROI partition of the OFC was somewhat arbitrary. In particular, although the functional definition of the vmPFC is established in the fMRI community, its anatomical definition remains vague. We have now re-performed the RSA analyses based upon standard Brodmann areas of the OFC, in particular: BA11 (split into its lateral and medial parts), BA13, BA14 and BA32. This parcellation is based upon masks obtained from the BRAINNETOME atlas (https://atlas.brainnetome.org/): it tiles the entire OFC, except its most lateral part (which is BA12). We now describe in full detail how the ROIs were obtained from BRAINNETOME anatomical masks. We also provide the ROI barycenter coordinates in a table (see revised Methods section).

Reviewer #2 (Recommendations for the authors):

1) Range adaptation is not shown directly in ANN units. Specific questions:

We now include detailed analyses of ANN numerical simulations with and without self-organized plasticity. We describe these analyses in the Methods and Results sections of the revised manuscript. In brief, they allow us to address all the concerns below, and more. We now summarize the results of these analyses when answering your comments:

a) How do ANN units respond across different input values? How variable are these response patterns across units?

As a reminder of the overall structure of the ANN, the first layer is divided into two sublayers, which receive prospective gains and prospective losses, respectively. Units of the first layer (so-called “attribute units”) send their outputs to units of a second layer (so-called “integration units”). Subjective value is read out from the output activity of this second layer.

In brief, attribute sublayers form a population code of their respective inputs, i.e. their units are selective for either prospective gains or prospective losses, depending on the sublayer to which they belong. In contrast, integration units exhibit mixed selectivity, with heterogeneous response profiles across units. We refer the reviewer to Figures 10 and 12 of the revised manuscript for representative examples of integration units’ receptive fields (over the gain/loss domain).
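The two-layer structure described above can be sketched as a simple forward pass. This is a schematic under stated assumptions, not the fitted model: the number of units, the sigmoidal tuning of attribute units around hypothetical centers, the random weights and the linear readout are all illustrative choices, and the names `attribute_sublayer`, `forward` and `params` are made up for this example.

```python
import math
import random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def attribute_sublayer(x, centers, slope=1.0):
    """Population code of one attribute: each unit responds to the same
    input around its own (hypothetical) center."""
    return [sigmoid(slope * (x - c)) for c in centers]

def forward(gain, loss, params):
    # First layer: two sublayers, each selective for a single attribute.
    gain_units = attribute_sublayer(gain, params["gain_centers"])
    loss_units = attribute_sublayer(loss, params["loss_centers"])
    attribute_units = gain_units + loss_units
    # Second layer: each integration unit mixes both attributes.
    integration_units = [
        sigmoid(sum(w * a for w, a in zip(row, attribute_units)) + b)
        for row, b in zip(params["weights"], params["biases"])
    ]
    # Subjective value is read out from the integration layer (linear readout assumed).
    value = sum(r * z for r, z in zip(params["readout"], integration_units))
    return gain_units, loss_units, integration_units, value

rng = random.Random(1)
n_attr, n_integ = 4, 6
params = {
    "gain_centers": [2.0, 4.0, 6.0, 8.0],
    "loss_centers": [2.0, 4.0, 6.0, 8.0],
    "weights": [[rng.gauss(0.0, 1.0) for _ in range(2 * n_attr)] for _ in range(n_integ)],
    "biases": [rng.gauss(0.0, 0.5) for _ in range(n_integ)],
    "readout": [rng.gauss(0.0, 1.0) for _ in range(n_integ)],
}

# Same gain, two different losses (arbitrary units).
g_a, l_a, z_a, v_a = forward(gain=6.0, loss=3.0, params=params)
g_b, l_b, z_b, v_b = forward(gain=6.0, loss=7.0, params=params)
print("gain units unchanged:  ", g_a == g_b)
print("integration units move:", z_a != z_b)
```

In this sketch, changing the loss leaves the gain sublayer untouched but alters the integration units, which is what selectivity in the first layer and mixed selectivity in the second layer amount to.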

b) How do these response patterns vary with range?

Under the efficient value synthesis scenario, attribute units do not change their response patterns with the range of either gains or losses. However, self-organized plasticity eventually modifies the receptive field of integration units. This modification tends to blur the tiling of the domain that is spanned by prospective gain and loss inputs. More importantly, one can show that this eventually modifies the information content within the ANN’s integration layer in a systematic manner. In particular, the encoding strength of prospective gains (resp., losses) decreases when the spanned range of gains (resp., losses) increases.

c) Do these patterns (at the individual unit or mean level) resemble neuronal data from the orbitofrontal cortex (OFC)? Should it be expected to?

Qualitatively speaking, integration units do exhibit response profiles that are reminiscent of typical OFC neurons’ electrophysiological activity during value-based decision making. For example, they reproduce the diversity of coding that has been repeatedly observed in OFC neurons during value-based decision making (cf., “offer value cells”, “chosen value cells”, and “choice cells”, see Figure S8 of the revised Supplementary Materials). More importantly, integration units also exhibit the known properties of value range adaptation in these same neurons (see Figure 2 in the revised Results section).

Now, whether this sort of ANN model produces “realistic” electrophysiological activity profiles beyond this kind of statistical relationship is questionable. The reason is twofold. First, they are agnostic w.r.t. within-trial temporal dynamics. Second, there is some level of arbitrariness in the modelling assumptions (cf., e.g., structural constraints) that cannot be finessed using either behavioral or neuroimaging data. What we argue is robust in these ANN models is the information content that they carry, which is distributed over the activity profiles of their artificial neural unit layers. This is the main reason why we resort to variants of RSA for comparing their predictions to multivariate fMRI activity patterns.

We now comment on these issues in the revised Discussion section (lines 521-535).

d) The example units that are shown (Figure 7) seem potentially different from previously reported neuronal data from OFC. For example, the dynamic range of the model covers only a narrow subset of the input space and varies substantially across conditions, whereas firing rates in OFC neurons tend to span the full range of possible values, and firing rates change only slightly between conditions [ref,ref]. Is this discrepancy just a side effect of showing results in terms of loss units rather than expected value? How should we interpret this apparent discrepancy?

This point is directly related to the above comment regarding whether one should expect the ANN units to resemble neuronal data from OFC. We note that Figure 7 (in the previous version of the manuscript) was meant as a schematic depiction of the Hebbian plasticity mechanism under a strawman “population code” scenario. In our revised manuscript, we have now replaced it with graphical summaries of numerical simulations of ANNs. As you will see, representative ANN units exhibit receptive fields that have a rather arbitrary shape (i.e., with mixed gain/loss selectivity; see Figures 10 and 12 of the revised manuscript). However, the induced statistical relationship between integration units’ output response and value is much simpler (in particular, it tends to be rather monotonic). In addition, this relationship typically spans the full range of values, with firing rates that show small changes across conditions (cf. Figure 2).

e) ANN unit responses are compared to neuron classes observed in OFC (Figure 6). What does the mean ANN unit in each category look like, and how does this compare to the OFC responses referenced?

In our context, “choice units” show response outputs that vary in a quasi-categorical manner with value (i.e. that discriminate trials in which the value of gambling is either positive or negative), “chosen value units” exhibit a ReLU-like relationship with value (null when the value of gambling is negative, and increasing with value when it is positive), and “offer value units” show a quasi-linear relationship with value.
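The three qualitative profiles described above can be illustrated with idealized response functions of the value of gambling. These functions are only caricatures for the reader's intuition: the function names, the steep slope used for the quasi-categorical case, and the normalized value axis are assumptions, and the actual ANN units only approximate these shapes.

```python
import numpy as np

def choice_unit(value, slope=20.0):
    # Quasi-categorical: a steep sigmoid that discriminates positive from negative value.
    # The slope is a hypothetical parameter chosen for illustration.
    return 1.0 / (1.0 + np.exp(-slope * value))

def chosen_value_unit(value):
    # ReLU-like: null when the value of gambling is negative, increasing when it is positive.
    return np.maximum(0.0, value)

def offer_value_unit(value, gain=1.0, offset=0.0):
    # Quasi-linear relationship with value.
    return gain * value + offset

values = np.linspace(-1.0, 1.0, 5)  # normalized value of gambling (assumed scale)
profiles = {
    "choice": choice_unit(values),
    "chosen value": chosen_value_unit(values),
    "offer value": offer_value_unit(values),
}
```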

2) Modeling results focus on Hebbian networks with gaussian activation functions. Specific questions:

a) Is there a physiological motivation for the model, or is this primarily for mathematical convenience?

In brief, the Gaussian activation functions that we used in the previous version of this manuscript were a mathematical convenience. Having said this, a Gaussian activation function can be understood as the physiological output of an excitatory unit that is reciprocally coupled with an inhibitory unit (where both have sigmoidal activation functions).

But we have abandoned this type of activation function anyway…

b) Do the main qualitative results (range effect on risk sensitivity) require a non-monotonic activation function?

We now provide extensive theoretical analyses of adaptation effects induced by self-organized plasticity within the network. In principle, these results would generalize to any monotonic activation function. Importantly, this would not hold for non-monotonic activation functions, because those would induce unstable plasticity dynamics. This is the reason why we now only focus on sigmoidal activation functions.

c) How do the properties of the response functions (saturation and (non)-monotonicity) affect responses of ANN units?

Intuitively, units with sigmoidal activation functions will be generally more active than units with Gaussian activation functions. This is because their effective receptive field is not upper-bounded. For the same reason, they will exhibit less functional specificity than units with Gaussian activation functions. But again, this is now irrelevant.

d) There are visible differences between model behavior for Hebbian networks with sigmoidal and gaussian activation functions. How should these differences be interpreted? Does this lead to any predictions or constraints on what physiological implementation is consistent with this algorithm?

Let us first respond in the frame of our previous computational model and analyses. In brief, we had a priori expected that both types of units, when equipped with Hebbian plasticity, would exhibit range adaptation effects. We had thus hoped to show that the ability of the Hebbian mechanism to yield accurate out-of-sample choice predictions would not depend upon the type of underlying activation functions. This turned out not to be the case. Note that we did not believe that there was a physiological lesson to be learned here. Rather, we thought that this difference may be an unavoidable artefact of our data analysis procedure.

Recall that fitted ANN models can generalize across groups (or contexts) if and only if they neither underfit nor overfit the choice data (and hence accurately discriminate temporal changes in gambling value that are due to adaptation from unsystematic choice variations). Although we had no a priori reason to expect that ANNs with sigmoidal activation functions would be more prone to underfitting or overfitting, it is reasonable to think that the tendency to underfit/overfit choice data may depend upon the type of units’ activation function.

Now, although we do not use Gaussian activation functions in the revised manuscript, this point still deserves a further comment. As we highlighted before, efficient value synthesis yields unstable plasticity dynamics under non-monotonic (e.g. Gaussian) activation functions. To understand this, recall that the self-organized plasticity rule derives from aligning the connectivity change with the gradient of information loss w.r.t. connection strengths. This gradient explodes when inputs fall within domains where the derivative of the activation function approaches zero. This unavoidably happens with non-monotonic activation functions because the plasticity mechanism eventually focuses the weighted inputs within the vicinity of their mode. In other terms, one may argue that only monotonic activation functions are compatible with the efficient value synthesis scenario. We note that this kind of issue was already highlighted in computational studies of network models of ICA (Bell & Sejnowski, 1995; Nadal, 1994). We now comment on this in the revised Discussion section (cf. lines 536-551).
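The argument above can be made concrete in the simplest possible setting. The following is a worked sketch for a single unit with a scalar input and a scalar weight, using the classical infomax objective of Bell & Sejnowski (1995) as a stand-in; it is not the plasticity rule derived in the manuscript, and the symbols $f$, $u$, $w$ and $x$ are introduced here for illustration only.

```latex
% Single unit: y = f(u), with u = w x (scalar input x, scalar weight w).
% Infomax objective (Bell & Sejnowski, 1995): maximize E[ log |dy/dx| ].
\begin{align}
\log\left|\frac{\partial y}{\partial x}\right|
  &= \log|w| + \log\left|f'(u)\right| \\
\frac{\partial}{\partial w}\log\left|\frac{\partial y}{\partial x}\right|
  &= \frac{1}{w} + x\,\frac{f''(u)}{f'(u)} \\
\text{sigmoid: } f(u)=\frac{1}{1+e^{-u}}
  &\;\Rightarrow\; \frac{f''(u)}{f'(u)} = 1-2f(u) \in (-1,1) \\
\text{Gaussian: } f(u)=e^{-u^{2}/2}
  &\;\Rightarrow\; \frac{f''(u)}{f'(u)} = \frac{1}{u}-u \;\to\; \pm\infty \quad \text{as } u \to 0
\end{align}
```

In this toy case, the activation-dependent term of the gradient stays bounded for the sigmoid, whereas for the Gaussian it diverges as the weighted input approaches the mode ($u = 0$), where $f'(u) = 0$.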

3) Out-of-sample predicted choices in the Hebbian model with gaussian activation function seem unintuitive for more extreme parts of the value range. E.g. for the out-of-sample wide range predictions, the model seems to over-predict risk aversion to the point that choice probabilities saturate at ~0.6 for increasingly high-value options. For the narrow range, out-of-sample prediction behavior even appears to be non-monotonic, leading to increased acceptance of extremely high loss options.

a) How should this be interpreted?

This was an artefact of our previous data analysis procedure. Recall that ANNs effectively are a nonlinear mapping between their inputs (here: prospective gains and losses) and their outputs (here: gambling choices). This mapping was made of a mixture of basis functions, whose receptive fields typically tile the gain/loss domain over which the ANN is trained. This means that, without additional constraints, this mapping generalizes poorly outside the range of gain and loss inputs on which it was trained. Having said this, our modified model shows much better-behaved out-of-sample predictions…
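The generalization issue described above can be reproduced in a toy one-dimensional setting. The sketch below fits a mixture of Gaussian basis functions that tile a training range, then queries it far outside that range. The linear target, the number of basis functions, their width, and all function names are assumptions chosen for illustration; this is not the model or the data of the manuscript.

```python
import numpy as np

def rbf_features(x, centers, width):
    # Gaussian basis functions whose receptive fields tile the training domain.
    x = np.asarray(x, dtype=float)[:, None]
    return np.exp(-0.5 * ((x - centers[None, :]) / width) ** 2)

centers = np.linspace(0.0, 1.0, 10)   # tiling of the (assumed) training range [0, 1]
width = 0.1
x_train = np.linspace(0.0, 1.0, 50)
y_train = x_train.copy()              # toy target that keeps increasing with its input

# Least-squares fit of the mixture weights.
weights, *_ = np.linalg.lstsq(rbf_features(x_train, centers, width), y_train, rcond=None)

def predict(x):
    return rbf_features(x, centers, width) @ weights

in_range_rmse = np.sqrt(np.mean((predict(x_train) - y_train) ** 2))
out_of_range = predict([3.0])[0]      # query far outside the training range
```

Within the training range the fit follows the target, but far outside it every basis function is essentially silent, so the prediction collapses towards zero instead of continuing to increase.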

b) Does this depend on model parameters like update rate or covariance threshold?

No, this artefact did not depend upon model parameters.

c) In these parts of the range, how does the Hebbian model compare to alternate models, such as the other ANNs or logistic regression?

In this part of the range, logistic regression or ANNs with sigmoidal activation functions did generally better than ANNs with Gaussian activation functions. But this is not the case with the efficient value scenario (which relies upon sigmoidal activation functions).

4) The relationship between neural responses and ANN activity relies on representational similarity analysis (RSA). However, significant RDM correlations could arise as a byproduct of the fact that both ANN output and BOLD activity in select regions correlate with behavioral choice patterns. Is there evidence that the correlation between Hebbian ANNs and BOLD activation reflects more than average choice patterns?

This is a fair point: you are asking whether the similarity between the model and neural data may not simply be driven by the similarity between the model and choice data (e.g., a model that would not fit choice data at all would be very unlikely to be similar to neural data). We do agree that this is in principle possible here. For example, if, for some reason, static ANNs underfit choice data (at least when compared to plastic ANNs), then they might have a better chance to look like trial-by-trial fMRI signals in the OFC. A first hint here comes from Figure S3 in our revised manuscript: one can see that plastic and static ANNs yield very similar choice postdiction error rates. Therefore, they are unlikely to differ in terms of their behavioral explanatory power.

But our model-free fMRI data analyses provide, in our opinion, a stronger counterargument here. In brief, only ANNs that operate efficient value synthesis do predict the observed change in the information content induced by the difference in spanned gain range. In particular, the information content about prospective gains and losses (as opposed to integrated value) is conditionally independent from choice data. Note that ANNs that operate efficient coding of attributes do not make accurate neural predictions (cf. Figures S10 and S12 in the Supplementary Materials).

5) The authors make the strong claim that this model is the mechanistic explanation for adaptation in orbitofrontal cortex, but there is little comparison with previous models. Divisive normalization and other forms of adaptation to the value range are discarded based on a qualitative argument from the behavioral data. However, given that the Hebbian ANNs also produce some counterintuitive behavioral predictions, it is not obvious that they are better at accounting for observed patterns in neuronal adaptation or behavior. Addressing the following questions could clarify whether there is an argument for Hebbian ANNs over alternate mechanisms of adaptation:

a. How do behavioral predictions and RSA results from HP-ANNs compare quantitatively to results from other models of adaptation, including models of adaptation at the input stage?

This is an important point. In the revised version of the manuscript, we now consider a model of adaptation at the level of attributes. Its mathematical derivation, its neural and behavioral predictions, as well as the ensuing model-based data analyses are reported in the revised Supplementary Materials (cf. “value synthesis under efficient coding of gains and losses”). At the behavioral level, it makes predictions that are similar to the efficient value synthesis scenario. At the neural level however, it (somewhat surprisingly) exhibits distinct properties. In particular, it does not induce value range adaptation in integration units (see Figure S9).

Moreover, it predicts that the neural encoding strength of gains (resp., losses) increases when the range of spanned gains (resp., losses) increases. This clearly contradicts our model-free analyses of fMRI data in the OFC.

Regarding divisive normalization, we do not see how it would provide a simple alternative explanation for this dataset. We acknowledge that divisive normalization provides an elegant explanation for specific forms of instantaneous context-dependency effects. For example, when deciding between three options, divisive normalization would predict irrational decoy effects (Louie et al., 2013; Steverson et al., 2019). But these kinds of models are not standard explanations for temporal range adaptation effects. To our knowledge, there exists only one variant of divisive normalization models that aims at capturing temporal range adaptation (Zimmermann et al., 2018). The model is very complex and relies on coupling slow and fast winner-take-all networks. Critically, it treats option values as inputs, and does not consider situations where subjective value has to be constructed from the integration of multiple decision attributes. We believe that it is beyond the scope of this work to extend this kind of model to networks that operate value synthesis.

b) Can Hebbian ANNs account for other previously observed patterns of behavior across different value ranges, such as stability of relative values across ranges in two-option choices and range-dependent decoy effects []?

Addressing this sort of issue would require extending the model to situations in which two (or three) option values are simultaneously represented and/or compared by the network. These additional mechanisms are beyond the scope of the current work (but we will be addressing these points in subsequent publications).

c) Are there specific predictions that arise from Hebbian networks that could be tested in later work and used to differentiate between competing models?

One possibility here is to target, using invasive neurophysiological approaches, the mechanisms that underlie plasticity (e.g., LTP and LTD). For example, specifically suppressing LTP and/or LTD (in value-coding OFC neurons) should prevent or distort value range adaptation. One could then compare choice behavior immediately after exposing subjects to (high or low) ranges of gains and/or losses, with and without LTP/LTD suppression.

Reviewer #3 (Recommendations for the authors):

1) The paper assumes loss aversion as the primary behavioral factor, which is fine. However, it may be worth briefly mentioning the limitation that with the present experimental design it is impossible to dissociate loss aversion from risk aversion (see e.g. Williams et al., 2021, PNAS).

This is a fair point and we have included a comment on this in the revised Discussion section.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

All reviewers agreed that the revised manuscript has been improved. However, given the revised manuscript is essentially a fundamentally new manuscript, there is a new set of comments that would need to be addressed, as outlined below.

First of all, we thank the Editor and reviewers for providing us with another opportunity to improve our work. We have tried to address each comment (see our response below). In particular, we included a few additional results reports, modified most figures, moved some material from the Methods section into the Supplementary Material, and included novel discussion points in the Discussion section. We hope that you will agree with these changes.

Reviewer #1 (Recommendations for the authors):

I appreciate the authors' effort and dedication in re-working the manuscript. The revised manuscript is a fundamentally new and improved paper with new conclusions. I believe it could make an important contribution to the field and I remain enthusiastic. However, given most of the manuscript has changed, I have new comments that I think should be addressed.

Thank you very much for your positive appreciation of our work.

1) In general, the manuscript is quite long with extensive supplementary analyses. I believe the paper could be streamlined by highlighting the most important aspects and reducing the extent to which details are discussed in the main text.

We understand your comment. However, we found it difficult to remove material from the main text without harming the readability of the manuscript. In the hope of reconciling these two imperatives, we have moved some of the content of the Methods section to the Supplementary Materials (see also our response to point #12 of reviewer #2). We hope that you will find our revised manuscript concise enough.

2) All relevant information to understand the plots should be embedded within the figure rather than just the figure legends. It is cumbersome for readers to constantly have to consult the legends to understand what is shown in the figure. For instance, there is no label for the different colored plots (red/black, red/blue) or line styles in ANY of the figures. Also, some legends (Figure 4) refer to a color code, but this code is not provided. Moreover, there are no axis ticks and/or axis tick labels in some of the panels in Figures 3, 5, 6, 7, 9, 10 (top row), 12 (top row), 13 (top row), S4, S5, S6, S7, S10, and S12. Several figures don't include a color bar (e.g., Figure S5-7). Please carefully revise all figures. Note that this point was already raised in the previous round of reviews, but it was not addressed.

We have gone through all the manuscript’s Figures and modified them to improve readability as much as possible. In particular, we have now inserted self-contained axis ticks, colorbars and legends whenever they were missing. Note that most Figure numbers have changed, because we had to tie Figures in the Supplementary Materials to Figures in the main text.

3) Figure 3D – Loss aversion over time: Instead of running a separate between group comparisons for each time-point, it would be more appropriate to run a single two-way ANOVA with within-subject factor time and between-subject factor group. A significant group-by-time interaction would support the conclusion that loss aversion diverges between the two groups across time.

This is a fair suggestion. We tried this but found no significant time-by-group interaction (p=0.43, F=0.61, R2=0.6%), which is why we report separate group comparisons. We now also report this null finding prior to reporting separate group comparisons.

4) Why was the lateral OFC (area 47/12) not included here, given that work by Suzuki et al. 2017 suggests that lateral OFC represents attribute-specific values?

In brief, we had no particular reason for not including area 47/12. We simply considered OFC subregions that were close neighbors to (the typical fMRI definition of) the ventromedial PFC. This effectively disqualified BA12/47, which is positioned on the most lateral part of the OFC. Having said this, we do not think that including BA47/12 is critical for our empirical demonstration. This is because fMRI activity patterns in BA11 already validate the model’s predictions (and we essentially rely on other OFC subregions as control ROIs). Given that inserting this additional ROI would mean adding more material to the main text and induce further delays in the revision of this work, we thus decided not to do it. We hope you understand and agree with us.

5) It is not fully clear what is plotted in Figure 5-7. Are these the averages across all RDM cells with a certain δ G/L/EV in Figures S5-S7? Does this include the δ G/L/EV = 0? How is neural coding strength defined in the lower rows? Tick labels would have helped.

Yes, upper panels in Figures 5 to 7 show the group mean of within-subject RDMs binned by differences in variables of interest (i.e.: G, L or EV). And yes, the group mean of G/L/EV-specific RDMs can be eyeballed in the Supplementary Material (Figure 5 —figure supplement 1, Figure 6 —figure supplement 1 and Figure 7 —figure supplement 1, respectively). In our previous manuscript, the y-axis limits were kept identical across OFC subregions, but this was not apparent because we had removed tick labels. We have included them all now.
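The binning of RDM cells by differences in a variable of interest can be sketched as follows. This is a minimal illustration under stated assumptions: the dissimilarity measure (one minus the Pearson correlation between trial-wise activity patterns), the equal-width bins, the exclusion of diagonal cells, and the function names `rdm` and `bin_rdm_by_difference` are all hypothetical choices, not necessarily those of the manuscript.

```python
import numpy as np

def rdm(patterns):
    # patterns: (n_trials, n_voxels). Assumed dissimilarity: 1 - Pearson correlation.
    return 1.0 - np.corrcoef(patterns)

def bin_rdm_by_difference(rdm_matrix, variable, n_bins=4):
    """Average RDM cells within bins of |difference| in a trial-wise variable (e.g. gain)."""
    diff = np.abs(variable[:, None] - variable[None, :])
    upper = np.triu_indices_from(rdm_matrix, k=1)  # off-diagonal cells only (assumption)
    d, r = diff[upper], rdm_matrix[upper]
    edges = np.linspace(0.0, d.max(), n_bins + 1)
    bin_index = np.digitize(d, edges[1:-1])        # values in 0 .. n_bins - 1
    return np.array([r[bin_index == b].mean() for b in range(n_bins)])

# Synthetic example: two "subjects" with random activity patterns.
rng = np.random.default_rng(1)
gains = np.arange(8, dtype=float)                  # hypothetical trial-wise gains
subject_curves = [
    bin_rdm_by_difference(rdm(rng.standard_normal((8, 20))), gains)
    for _ in range(2)
]
group_mean = np.mean(subject_curves, axis=0)       # group mean of within-subject binned RDMs
```

With real data, the random patterns would be replaced by trial-wise fMRI activity patterns within a given ROI, and the same binning would be repeated for losses and expected value.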

6) Figure 9 shows that subject-specific ANNs correlate with subject-specific RDMs. To claim that these models capture individual patterns of OFC activity, it would be important to show that these correlations exceed those with group-level ANNs. Moreover, to claim specificity for plastic ANNs, it would be necessary to show superiority of predictions from plastic vs static ANNs.

These are fair comments. We tried both suggestions but failed to detect significant differences (both for within-subject versus group correlations and for plastic versus static ANN variants). We have thus modified our results report to tone down these claims. In brief, we simply use this analysis to provide evidence that ANNs that operate value synthesis (whether plastic or static) do yield reasonably realistic predictions regarding fMRI activity patterns within the OFC.

Note that, for the sake of completeness, we modified Figure 9 to enable readers to eyeball and compare the RSA results of both plastic and static models.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Reviewer #2 (Recommendations for the authors):

Overall the proposed model presents an interesting possible explanation for types of context-dependent loss aversion. The manuscript has improved over the course of revision and will be a worthwhile contribution to the literature. I have a few remaining comments that would help improve the clarity and accessibility of the results if they can be addressed before publication, but these are relatively minor.

1) While the figures have been substantially improved, several are still missing a description of the colors in the figure or legend, and instead have the placeholder phrase "(color code)" in the figure legend.

We have now modified all relevant figure legends to provide an explicit description of the color code (e.g., for Figure 3: “[…] A: the neural encoding strength of gains (color code: from blue -minimal encoding strength- to yellow -maximal encoding strength-) is shown as a function of the spanned range of losses (x-axis, range increases from left to right) and gains (y-axis, range increases from bottom to top). […]”).

2) I appreciate your response to my previous comment about "undoing" adaptation (R2 Comment 3), but it is not clear to me in your response whether you are describing the computational role of "offer value" units in your model specifically, or just giving a hypothetical scenario. If I understand right, your model produces choice via a comparison of Vt for two options, and the "offer value" units are part of the integration layer (i.e. an input to Vt rather than the signal being compared directly). Is the idea that this would lead to stable preferences even without "undoing" adaptation downstream? Or would your model predict that preferences do shift in response to "offer value" adaptation, and you suspect that past studies may not have been able to see it? Or are you just trying to say that there are several hypothetical possibilities, and in the specific task you are modeling it is not necessary to modify the weights? (As an aside, I also disagree with the argument that Rustichini et al. are interpreting a null result as evidence of absence – they start by predicting how preferences would change if choices arose from a simple comparison of offer value firing rates, then show that actual choice behavior does not match this prediction.)

In brief, we were arguing (i) that Rustichini et al.’s argument is statistically unsound, and (ii) that a variant of our model would actually predict no behavioral change despite apparent value range adaptation in “offer value cells”. We take the latter as a relevant counter-example for how puzzling the result was in the first place. Now, since we believe that point (ii) is the most important, we have dropped our former statistical criticism in the revised manuscript. In addition, we have modified this paragraph to clarify our reasoning as much as possible:

Intriguingly however, value range adaptation in “offer value cells” had been observed without any significant behavioral preference change. Under the assumption that preferences between offers derive from the direct comparison of output signals from “offer value cells”, this is surprising. To solve this puzzle, later theoretical work proposed that value range adaptation is somehow “undone” downstream of value coding in the OFC (Padoa-Schioppa & Rustichini, 2014). In our context, this would suggest that readout weights (w^(k) in Equation 2) would compensate for value-related adaptation, effectively thwarting the behavioral consequences of self-organized plasticity between attribute and integration layers. However, this reasoning critically relies upon the assumed computational role of “offer value cells”. In fact, this puzzle may effectively dissolve under other scenarios of how “offer value cells” contribute to decision making. Recall that this null result was obtained in a decision context where choice options were characterized in terms of the type of offer (i.e. juices that differ w.r.t. palatability), whose quantity was systematically varied. Here, value synthesis would effectively aggregate two attributes, namely palatability and quantity. Under this view, “offer value cells” simply are integration units that show a certain form of mixed selectivity, whereby units’ sensitivity to quantity strongly depends upon palatability. At this point, one needs to consider candidate scenarios of how the OFC may operate value synthesis for multiple options in a choice set. A possibility is that the OFC is automatically computing the value of the option that is currently under the attentional focus (Lebreton et al., 2009; Lopez-Persem et al., 2020), while storing the value of previously attended options within an orthogonal population code (Pessiglione & Daunizeau, 2021).
In principle, this implies that the OFC is wired such that it can handle arbitrary switches in attentional locus without compromising the integration of option-specific attributes. In this scenario, integration units (including those that look like “offer value cells”) would adapt to the range of all incoming attribute signals, irrespective of which option in the choice set is currently attended. In turn, “offer value cells” would look like they are only partially adapting to the value range of a given offer type (Burke et al., 2016; Conen & Padoa-Schioppa, 2019). More importantly, to the extent that between-attribute spillover effects are negligible, changes in the range of offer quantities would distort the readout value profile along the quantity dimension without altering the palatability dimension. This would effectively leave the relative preference between offer types unchanged. Of course, this is only one candidate scenario among many. Nevertheless, we would still argue that the behavioral consequences of range adaptation in “offer value cells” actually depend upon their underlying computational role.

3) In line 456 you discuss the spatial specificity of results, but unless I'm missing something this doesn't involve a direct comparison between regions. It may be worth reducing this claim.

You are right. To make sure our claims regarding anatomical specificity are not over-interpreted, we have modified the relevant paragraph in the Discussion section as follows:

Although this clearly aligns with our model-free fMRI data analysis results, we do not claim that the evidence we provide regarding the anatomical location of value synthesis generalizes beyond decision contexts that probe people’s loss aversion. In fact, our main claim is about whether and how efficient value synthesis operates within the OFC, as opposed to which specific subregion of the OFC drives the adaptation of loss aversion and/or related behavioral processes.

Associated Data


    Data Citations

    1. Botvinik-Nezer R, Iwanir R, Poldrack RA, Schonberg T. 2020. NARPS. OpenNeuro. [DOI]

    Supplementary Materials

    MDAR checklist

    Data Availability Statement

    All data analysed during this study are openly available from the https://openneuro.org/ website (https://doi.org/10.18112/openneuro.ds001734.v1.0.5). All the modelling and analysis code is available as part of the academic freeware VBA (https://github.com/MBB-team/VBA-toolbox/, Rigoux et al., 2023), which is under a GNU open-source license.

    The following previously published dataset was used:

    Botvinik-Nezer R, Iwanir R, Poldrack RA, Schonberg T. 2020. NARPS. OpenNeuro.

