Correct and exploratory response rates in control (blue), transfer (orange), and open (green) episodes, plotted against the number of trials following episode onset (data from Experiment 2). Lines with shaded areas (mean ± S.E.M.): participants' performance. Lines with error bars (mean ± S.E.M.): performance predicted by the fitted PROBE model in every trial according to the actual history of participants' responses. Left, context-exploiting participants: correct responses increased and exploratory responses vanished faster in control than in transfer episodes (Wilcoxon tests, both zs > 2.4, ps < 0.015) and faster in transfer than in open episodes (Wilcoxon tests, both zs > 3.1, ps < 0.002). Middle, outcome-exploiting participants: performance was similar in control and transfer episodes (correct and exploratory responses: Wilcoxon tests, both zs < 1.4, ps > 0.15), but correct responses increased and exploratory responses vanished faster in transfer than in open episodes (Wilcoxon tests, both zs > 2.3, ps < 0.023). Right, exploring participants: performance was similar in control, transfer, and open episodes (correct and exploratory responses: Friedman tests, both χ² < 5.3, ps > 0.07). Note that in open episodes, exploring participants adjusted faster than either group of exploiting participants (correct responses: both ts > 3.0, ps < 0.004). See Table S2 for fitted model parameters in each group.