Task setup, model predictions, and behavioral data. A, Task setup. Participants chose between two fractal stimuli, each of which led probabilistically to one of two second-stage sets. B, Model predictions and observed behavior. Left, If choice were completely controlled by the model-based system, the first-stage choice predominantly associated with the rewarded second-stage choice would be reinforced. Middle, If choice were completely controlled by the model-free system, the probability of repeating the first-stage choice on the subsequent trial (stay probability) would be a function of reward delivery alone, regardless of which transition occurred. Right, Data averaged over all subjects show signatures of both systems. The analysis of stay probabilities revealed a significant main effect of reward delivery (i.e., the model-free signature) as well as an interaction between reward delivery and transition (i.e., the model-based signature). C, Individual variability in reliance on the model-based system. Subjects are sorted in descending order by the reward-by-transition interaction effect, which is an index of model-based control in the task. In half of the participants, the hallmark of model-based control is clearly observable; the other half show no evidence of reliance on the model-based strategy. Insets, Mean stay probabilities as a function of reward and transition. Bottom left inset, Data from the median-split half of individuals with a large reward-by-transition effect. Top right inset, Data from the median-split half of individuals with a small reward-by-transition effect. Error bars indicate SEM.
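As an illustration of the two signatures described in the caption, the following minimal sketch computes stay probabilities per reward-by-transition condition and derives a reward main effect and a reward-by-transition interaction as simple differences of means. It is not the authors' statistical analysis; the function names, the input encoding (`choices`, `rewards`, `common`), and the toy sequences are all assumptions made for this example.

```python
from collections import defaultdict


def stay_probabilities(choices, rewards, common):
    """Return P(stay) for each (rewarded, common_transition) condition.

    Hypothetical encoding (not from the source):
      choices[t]: first-stage choice on trial t (0 or 1)
      rewards[t]: True if trial t was rewarded
      common[t]:  True if trial t followed the common transition
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [stays, total]
    for t in range(len(choices) - 1):
        key = (bool(rewards[t]), bool(common[t]))
        counts[key][0] += int(choices[t + 1] == choices[t])
        counts[key][1] += 1
    return {k: stays / total for k, (stays, total) in counts.items()}


def reward_main_effect(p):
    """Mean stay probability after reward minus after no reward."""
    rewarded = p[(True, True)] + p[(True, False)]
    unrewarded = p[(False, True)] + p[(False, False)]
    return (rewarded - unrewarded) / 2


def reward_by_transition(p):
    """Difference of differences: reward effect after common vs. rare."""
    return (p[(True, True)] - p[(True, False)]) - (
        p[(False, True)] - p[(False, False)]
    )


def simulate_choices(rewards, common, stay_rule):
    """Generate first-stage choices from a deterministic stay rule (toy agent)."""
    choices = [0]
    for r, c in zip(rewards[:-1], common[:-1]):
        prev = choices[-1]
        choices.append(prev if stay_rule(r, c) else 1 - prev)
    return choices


# Toy sequences covering all four conditions (assumed data, for illustration).
rewards = [True, True, False, False, True]
common = [True, False, True, False, True]

# Idealized model-free agent: stays whenever the previous trial was rewarded.
mf = stay_probabilities(
    simulate_choices(rewards, common, lambda r, c: r), rewards, common
)
# Idealized model-based agent: stays after rewarded-common or unrewarded-rare.
mb = stay_probabilities(
    simulate_choices(rewards, common, lambda r, c: r == c), rewards, common
)

print(reward_main_effect(mf), reward_by_transition(mf))
print(reward_main_effect(mb), reward_by_transition(mb))
```

In this idealized setting the model-free agent shows only a reward main effect and the model-based agent shows only the interaction; real data, as in the Right panel, would be expected to fall between these extremes.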