Behavioral results. (A) The top panel shows subjects' mean probability of accepting an offer as a function of the number of offers already seen (1–9) and the number of offers already rejected (0–8) in a trial, split by offer value (3, 5, 7). The color scale runs from blue (p = 0) to red (p = 1). Only behavior from periods of a trial in which the offer distribution is uniform is plotted, so any penalty for prematurely accepting a high-value offer has not yet been instantiated. Compared with an optimal model in which choice is dictated by correctly inferring long-term value (middle panel), subjects under-accept value 3 offers and over-accept value 7 offers at the start of trials (top panel; group mean data, n = 23). This discrepancy is resolved by a model in which immediate and long-term values trade off for behavioral control (lower panel); the lower panel shows the choices predicted by the trade-off model using mean group parameter fits (n = 23). The yellow boxes in the middle panel mark the offers for which immediate and long-term values are maximally decoupled, and on which all fMRI analyses are centered. For display purposes, cells with fewer than 15 data points in total across subjects were discarded.
(B) Model comparison showed that the best-fitting model, as indicated by the lowest iBIC score, assigns each offer value (3, 5, 7) a separate parameter governing how much weight is placed on immediate versus long-term value in the associated trade-off (3 trade). Alternatives included a model in which a single parameter governs the trade-off (1 trade), a model dependent on optimally inferring long-term value (optimal), and a model driven purely by immediate value (immediate). The number of free parameters for each model is given in brackets.
(C) Pairwise scatter plots show individually fit trade-off parameters (c1; see the Materials & methods section) from the winning model for 3- versus 5-token offers, 3- versus 7-token offers, and 5- versus 7-token offers. A trade-off value closer to 0 indicates that behavior is predominantly driven by immediate value, whereas a value closer to 1 indicates that behavior is predominantly driven by long-term value. Each circle represents one participant.
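
To make the interpretation of the trade-off parameter concrete, the following is a minimal sketch of one way a weight between 0 and 1 could blend immediate and long-term value into an acceptance probability. It is an illustration only, not the formulation given in the Materials & methods section: the convex combination, the logistic choice rule, the inverse temperature `beta`, and all function and argument names are assumptions introduced here.

```python
import math


def blended_value(immediate, long_term, c):
    """Hypothetical convex mix of two value signals.

    c = 0 -> value is purely immediate; c = 1 -> purely long-term.
    Both inputs are assumed to be the value of accepting relative to rejecting.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("trade-off parameter c must lie in [0, 1]")
    return (1.0 - c) * immediate + c * long_term


def p_accept(immediate, long_term, c, beta=1.0):
    """Assumed logistic choice rule applied to the blended value."""
    v = blended_value(immediate, long_term, c)
    return 1.0 / (1.0 + math.exp(-beta * v))


# An offer that looks good now but is poor in the long run (made-up numbers):
for c in (0.0, 0.5, 1.0):
    print(c, round(p_accept(immediate=2.0, long_term=-1.0, c=c), 3))
```

Under these assumptions, an offer with high immediate but low long-term value is accepted less often as the weight moves from 0 toward 1, which is the pattern the axes of the scatter plots are meant to capture.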