Figure 2.
A graphical Bayesian approach for parameter estimation of the reinforcement learning model for the dynamic reward task. In this graphical model, nodes represent variables of interest and arrows indicate dependencies among these variables; shaded nodes represent observed variables. Specifically, Ri, j−1 is the reward received by subject i on trial j − 1, and Chi, j is the observed choice of subject i on trial j. The parameters αi and βi represent the learning rate and choice perseveration for subject i, respectively. Each subject-level parameter was assumed to be drawn from a normally distributed group-level population with its own mean and standard deviation. In our implementation, μα and σα were each assigned a non-informative uniform prior between 0 and 1. For β, μβ was assigned a uniform prior between 0 and 10, and σβ was assigned a uniform prior between 0 and 5.
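The generative structure described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a standard two-armed Rescorla-Wagner update with a softmax choice rule, clips the subject-level draws to plausible ranges, and uses hypothetical names (`simulate_subject`, `n_subjects`) throughout; the paper's exact likelihood and truncation choices may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Group-level priors, as stated in the caption:
mu_alpha = rng.uniform(0, 1)      # uniform(0, 1) prior on group mean of alpha
sigma_alpha = rng.uniform(0, 1)   # uniform(0, 1) prior on group SD of alpha
mu_beta = rng.uniform(0, 10)      # uniform(0, 10) prior on group mean of beta
sigma_beta = rng.uniform(0, 5)    # uniform(0, 5) prior on group SD of beta

n_subjects = 5
# Subject-level parameters drawn from the normally distributed
# group-level population; clipping to valid ranges is an assumption
# of this sketch, not taken from the caption.
alpha = np.clip(rng.normal(mu_alpha, sigma_alpha, n_subjects), 0.0, 1.0)
beta = np.clip(rng.normal(mu_beta, sigma_beta, n_subjects), 0.0, None)

def simulate_subject(alpha_i, beta_i, rewards, rng):
    """Simulate choices Ch[i, j] given rewards R[i, j-1] under an
    assumed two-armed Rescorla-Wagner learner with softmax choice."""
    q = np.zeros(2)
    choices = []
    for r in rewards:
        p = np.exp(beta_i * q) / np.exp(beta_i * q).sum()
        c = rng.choice(2, p=p)
        choices.append(c)
        q[c] += alpha_i * (r - q[c])  # prediction-error update
    return np.array(choices)

rewards = rng.integers(0, 2, size=20).astype(float)
ch = simulate_subject(alpha[0], beta[0], rewards, rng)
```

In a full hierarchical fit, these priors and the choice likelihood would be handed to an MCMC sampler (e.g., JAGS, Stan, or PyMC) rather than simulated forward as done here.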
