2023 Feb 14;12:e64978. doi: 10.7554/eLife.64978

Figure 2. Rats do not greedily maximize instantaneous reward rate during learning.

(a) Reaction time (blue) and error rate (pink) for an example subject (rat AL14) across 23 sessions. (b) Learning trajectory of an individual subject (rat AL14) in speed-accuracy space. Color map indicates training time. Optimal performance curve (OPC) in blue. (c) Maximum instantaneous reward rate (iRR) opportunity cost (see Methods) for an individual subject (rat AL14). (d) Mean reaction time (blue) and error rate (pink) for n=26 rats during learning. Sessions across subjects were transformed into normalized sessions, averaged, and binned to show learning across 10 bins. Normalized training time allows averaging across subjects with different learning rates (see Methods). (e) Learning trajectory of n=26 rats in speed-accuracy space. Color map and OPC as in b. (f) Maximum iRR opportunity cost of rats in e throughout learning. Errors reflect within-subject session SEMs for a and b and across-subject session SEMs for d, e, and f.
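The normalization and binning described for panel d can be illustrated with a minimal sketch. It shows one plausible reading only; the procedure actually used is defined in the paper's Methods. The function names (`normalize_and_bin`, `average_across_subjects`) and the choice to place each session at the midpoint of its slot in normalized time are assumptions made for illustration.

```python
from statistics import mean

def normalize_and_bin(sessions, n_bins=10):
    """Bin one subject's per-session values on a normalized time axis.

    Assumption: session i of n sits at normalized time (i + 0.5) / n,
    and values landing in the same bin are averaged. Integer arithmetic
    is used for the bin index to avoid floating-point edge cases.
    """
    n = len(sessions)
    bins = [[] for _ in range(n_bins)]
    for i, value in enumerate(sessions):
        idx = ((2 * i + 1) * n_bins) // (2 * n)  # floor(t * n_bins)
        bins[idx].append(value)
    # Bins with no sessions (subject had fewer sessions than bins) are None.
    return [mean(b) if b else None for b in bins]

def average_across_subjects(per_subject, n_bins=10):
    """Average the binned curves of several subjects, bin by bin."""
    binned = [normalize_and_bin(s, n_bins) for s in per_subject]
    averaged = []
    for k in range(n_bins):
        values = [b[k] for b in binned if b[k] is not None]
        averaged.append(mean(values) if values else None)
    return averaged
```

With this scheme, a subject that needed 10 sessions and one that needed 20 both contribute exactly one value per bin, so subjects with different learning rates can be averaged on a common axis.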

Figure 2—figure supplement 1. Comparison of training regimes.

(a) ‘Canonical only’: rats trained to asymptotic performance with only front-view image of each of the two stimuli. ‘Size and rotation’: rats first shown front-view image of stimuli. After reaching criterion (accuracy=0.7), size staircased. Following criterion, rotation staircased. Upon criterion, stimuli randomly drawn across size and rotation. (b) Learning trajectory in speed-accuracy space over normalized training time for rats trained with the ‘size and rotation’ (left panel) and the ‘canonical only’ training regimes (right panel). (c) Average location in speed-accuracy space for 10 sessions after asymptotic performance for individual rats in both training regimes, as in b. (d) Mean accuracy over learning (left panel) and for 5 sessions after asymptotic performance (right panel) for rats trained with the ‘size and rotation’ (n=26) and the ‘canonical only’ (n=8) training regimes. (e) Mean reaction time. (f) Mean fraction max iRR. (g) Mean total trials per session. (h) Mean voluntary intertrial interval up to 500 ms after error trials. (i) Mean fraction ignored trials. All errors are SEM. Significance in right panels of d–i determined by Wilcoxon rank-sum test with p<0.05.