a Left: task setting and complexity algorithm for reward delivery (see text). Right: tree structure of the task and reward distribution. b Typical trajectories in the training (T) and complexity (C) conditions. c Increase of the success rate over sessions in the complexity setting. Mice improved their performance over the first sessions (c01 versus c05, T = 223.5, p = 0.015, Wilcoxon test), then reached a plateau (c05 versus c10, t(25) = −0.43, p = 0.670, paired t-test) close to the theoretical 75% success rate of random selection (c10, t(25) = −1.87, p = 0.073, single-sample t-test). The shaded area represents the 95% confidence interval. Inset: linear regressions of the performance increase for individual mice (gray lines) and the average progress (blue line). d Increase of behavioral complexity over sessions. The NLZcomp measure of complexity increased at the beginning (training versus c01, T = 52, p = 0.0009, Wilcoxon test; c01 versus c05, t(26) = −2.67, p = 0.012, paired t-test) before reaching a plateau (c05 versus c10, T = 171, p = 0.909, Wilcoxon test). The average complexity reached by the animals is lower than 1 (c10, t(25) = −9.34, p = 10⁻⁹, single-sample t-test), the value that corresponds to the complexity of random sequences. The RQA ENT entropy-based measure of complexity decreased over sessions (training versus c01, t(26) = 2.81, p = 0.009, paired t-test; c01 versus c05, T = 92, p = 0.019, Wilcoxon test; c05 versus c10, T = 116, p = 0.13, Wilcoxon test). The rate of U-turns increased over sessions (training versus c01, t(26) = −2.21, p = 0.036, and c01 versus c05, t(26) = −3.07, p = 0.004, paired t-tests; c05 versus c10, T = 75, p = 0.010, Wilcoxon test). Error bars represent 95% confidence intervals. e Correlation between individual success rate and the complexity of the mice's sequences. Also noteworthy is the decrease in data dispersion in session c10 compared to c01. N = 27 in all sessions except c10, where N = 26.
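As a reading aid for the U-turn rate reported in panel d, the following is a minimal sketch of how such a rate could be computed from a sequence of visited locations. It assumes that a U-turn means returning to the location visited immediately before the current one; this definition, the function name `u_turn_rate`, and the location labels are illustrative assumptions and are not taken from the caption.

```python
from typing import Sequence


def u_turn_rate(visits: Sequence[str]) -> float:
    """Fraction of moves that go back to the previously visited location.

    Assumed definition (not from the caption): the move from visits[i]
    to visits[i + 1] counts as a U-turn when visits[i + 1] == visits[i - 1].
    """
    # The first move has no preceding location, so it cannot be scored.
    scored_moves = range(1, len(visits) - 1)
    if len(scored_moves) == 0:
        raise ValueError("need at least three visits to score a move")
    u_turns = sum(1 for i in scored_moves if visits[i + 1] == visits[i - 1])
    return u_turns / len(scored_moves)


# Hypothetical sequences over three locations labeled A, B, C.
circular = "ABCABC"      # always moves on to the third location
back_and_forth = "ABABAB"  # always returns to the previous location
mixed = "ABACBC"

print(u_turn_rate(circular), u_turn_rate(back_and_forth), u_turn_rate(mixed))
```

Under this assumed definition, a purely circular trajectory scores 0 and a strict back-and-forth trajectory scores 1, so a rising rate over sessions would indicate fewer circular runs.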