Optimal policy (h-5), i.e., value difference between foraging and waiting according to the optimal policy with a horizon of five days. |
Ideally, participants should minimize the probability of starvation after five days. The optimal policy specifies the probabilities with which participants should forage (or wait) given the current internal state and the current time step. Because the optimal policy relies on taking the “true” maximum over the two choice options, it prescribes either waiting or foraging (or is indifferent between the two). We therefore use the continuous value difference between foraging and waiting as a predictor of participants’ choice, RT, and fMRI data. The optimal policy can also be calculated for a horizon other than the five days incentivized in our task. Such horizons, notably a horizon of only one step (h-1), are not normative in our task (see Supplementary Fig. 10 for the prescriptions according to different horizons). |
Choice uncertainty: probability of foraging success |
Cases in which the prescriptions of the employed heuristic policy are closer to 0 (i.e., waiting) or 1 (i.e., foraging) are less uncertain than cases in which the prescriptions lie in-between. We used the mean parameter estimates of the behavioral sample to derive the relevant logistic function (cf. Supplementary Fig. 3c). The derivative of this logistic function is used to index choice uncertainty. |
Choice uncertainty: optimal policy (h-5) |
In analogy to the choice uncertainty of the employed heuristic, the optimal policy can entail more or less choice uncertainty. In some cases, the absolute value difference between foraging and waiting is small (i.e., it matters little which option is chosen). In other cases, the value difference clearly indicates that foraging or waiting should be chosen. As for the choice uncertainty of the heuristic, the derivative of the logistic function obtained from the mean parameter estimates of the behavioral sample is used (cf. Supplementary Fig. 3d). |
Discrepancies, i.e., absolute differences in the prescriptions of the two policies |
The optimal policy and any heuristic policy make prescriptions about whether foraging or waiting should be chosen (according to logistic functions that relate the respective decision variables to choices). In some cases, optimal and heuristic policies make quite distinct prescriptions (high discrepancy), whereas in others they make quite similar prescriptions (low discrepancy). |
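
The choice-uncertainty and discrepancy predictors described above can be illustrated with a minimal sketch. It assumes a two-parameter logistic function (intercept and slope) relating a decision variable to the probability of foraging. The function names, the parameterization, and the numerical values are hypothetical and are not taken from the fitted models.

```python
import math

def logistic(x, intercept, slope):
    """Prescribed probability of foraging for decision variable x.

    x is either the probability of foraging success (heuristic policy)
    or the value difference between foraging and waiting (optimal policy).
    intercept and slope are hypothetical stand-ins for fitted parameters.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def choice_uncertainty(x, intercept, slope):
    """Derivative of the logistic function with respect to x.

    Equals slope * p * (1 - p). For a positive slope it is largest where
    the prescription is 0.5 and approaches 0 as the prescription
    approaches 0 (waiting) or 1 (foraging).
    """
    p = logistic(x, intercept, slope)
    return slope * p * (1.0 - p)

def discrepancy(p_optimal, p_heuristic):
    """Absolute difference between the prescriptions of the two policies."""
    return abs(p_optimal - p_heuristic)

# Illustrative values only (assumed, not estimated from data).
p_opt = logistic(0.8, intercept=0.0, slope=3.0)     # optimal-policy prescription
p_heur = logistic(-0.2, intercept=0.0, slope=3.0)   # heuristic-policy prescription
print(choice_uncertainty(0.0, intercept=0.0, slope=3.0))
print(discrepancy(p_opt, p_heur))
```

In this sketch the derivative carries the sign of the slope, so a negative slope would yield negative values. Whether the analysis uses the signed or the absolute derivative is not stated here.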