Fig. 3.
Study 2: state transition learning rate and model-based behavior in the two-step task. (a) A high state transition learning rate can produce behavior resembling compromised model-based control. The plots show the proportion of trials on which a participant (real or simulated) repeated the action chosen on the preceding trial (‘Stay probability’), conditioned on whether that prior action was followed by a common or rare state transition and whether it was rewarded. The behavioral signature of model-basedness is shown in the top right plot (Simulated Data: MB Typical). The deviation of Daw et al.'s real data (top left plot) from this signature was taken to reflect reduced use of intact transition matrix knowledge when making decisions, relative to a competing model-free system. The bottom row depicts simulations of the ISTL algorithm, which gradually learns the transition matrix from experienced state prediction errors. The bottom left plot shows that the same qualitative pattern found in Daw et al.'s empirical data (top left) can emerge from a fast transition learning rate, even in the absence of a putative model-free system. (b) Model-basedness as a function of transition learning rate and model-based β in empirical data. Model-basedness quantifies the degree to which participants complied with the behavioral signature of model-based choice shown in panel (a) (e.g., switching after a rare transition was rewarded; see Methods). The top subplot shows that model-basedness covaries inversely with state transition learning rate, whereas the bottom subplot shows the positive relationship between model-basedness and model-based β. Both were generated using a subsample of participants with high (>2.5) model-based control. (c) Model-basedness as a function of transition learning rate and model-based β in simulated data.
As a posterior predictive check, we simulated data using the winning ISTL model and the best-fitting parameters; the simulations reproduced the effects observed in the empirical data shown in (b). (d) Regression weights of the computational parameters as predictors of model-basedness.
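The stay probabilities plotted in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `stay_probabilities`, the input layout (one entry per trial), and the toy data are assumptions made for illustration only. For each trial after the first, it checks whether the action repeats the previous one and groups the result by the previous trial's transition type and reward.

```python
from collections import defaultdict


def stay_probabilities(actions, transitions, rewards):
    """Return P(stay) for each (transition, rewarded) condition.

    actions     -- action chosen on each trial (hypothetical coding)
    transitions -- "common" or "rare" for each trial
    rewards     -- truthy if the trial was rewarded
    """
    # condition -> [number of stays, number of trials]
    counts = defaultdict(lambda: [0, 0])
    for t in range(1, len(actions)):
        # Condition is defined by what happened on the PREVIOUS trial.
        cond = (transitions[t - 1], bool(rewards[t - 1]))
        counts[cond][0] += int(actions[t] == actions[t - 1])
        counts[cond][1] += 1
    return {cond: stays / total for cond, (stays, total) in counts.items()}


# Toy data, invented purely to exercise the function.
toy_actions = [0, 0, 1, 1, 0]
toy_transitions = ["common", "rare", "common", "rare", "common"]
toy_rewards = [1, 1, 0, 0, 1]
toy_result = stay_probabilities(toy_actions, toy_transitions, toy_rewards)
```

Conditions that never occur in the data are simply absent from the returned dictionary, so a caller plotting all four bars would need to handle missing keys.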