2021 Aug 16;12:4942. doi: 10.1038/s41467-021-25123-3

Fig. 9. Linear RL explains Pavlovian-instrumental transfer.

a The task testing outcome-specific PIT consists of three phases: a Pavlovian training phase, an instrumental training phase, and the PIT test. Outcomes 1 and 2 are both rewarding. During the PIT test, the two stimuli are presented in succession, and the “same” response is the one whose associated outcome matches the outcome associated with the presented stimulus, e.g., Response 1 chosen following presentation of Stimulus 1. The other response is “different.” b, c Data adapted from Corbit et al. (ref. 84) when rats are hungry (b) and simulated behavior of the model (c). The model learns the default policy during the Pavlovian phase, which biases performance during the PIT test. d, e Outcome-specific PIT persists even when rats are sated on both outcomes (d). The model shows the same behavior (e) because default state probabilities learned during Pavlovian training influence responses even in the absence of reward. Data in b and d are adapted from Corbit et al. (ref. 84) and show the mean and standard error of the mean (n = 16 independent samples). Source data are provided as a Source Data file.
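The caption's claim that learned default probabilities bias responding even without reward can be illustrated with a toy calculation. The sketch below is not the authors' simulation. It assumes the standard linearly solvable form in which the decision policy is the default policy reweighted by exponentiated values and renormalized. The function name `linear_rl_policy`, the scaling parameter `lam`, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_rl_policy(default_probs, values, lam=1.0):
    """Reweight a default policy by exp(value / lam) and renormalize.

    Assumed form: pi(s') is proportional to pi_default(s') * exp(v(s') / lam).
    """
    default_probs = np.asarray(default_probs, dtype=float)
    values = np.asarray(values, dtype=float)
    weights = default_probs * np.exp(values / lam)
    return weights / weights.sum()

# Hypothetical default probabilities over [Outcome 1, Outcome 2] after
# Stimulus 1, standing in for what Pavlovian training might produce.
default_after_s1 = [0.7, 0.3]

# Hungry: both outcomes equally rewarding (made-up value of 1.0 each).
hungry = linear_rl_policy(default_after_s1, values=[1.0, 1.0])

# Sated: neither outcome is rewarding (value 0.0 each).
sated = linear_rl_policy(default_after_s1, values=[0.0, 0.0])

print("hungry:", hungry)
print("sated: ", sated)
```

In this toy setting the two outcomes have equal value in both conditions, so the exponential terms cancel and the preference for the "same" outcome comes entirely from the default probabilities. The sketch captures only that relative bias; it does not model overall response rates or the training phases.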