2012 Oct 30;6:87. doi: 10.3389/fncom.2012.00087

Figure 2.


(A) Best-fit average learning curve (10,000 simulation runs) during the pre-reversal phase for the Actor-Critic gating algorithm (N = 4, α = 0.14, T = 1.38, and λ = 0.93), compared with the average animal learning curve (±1 standard deviation). Shading indicates ±1 standard deviation for the simulations. (B) Minimum KL divergences for Actor-Critic for different values of n (10 runs of the optimization algorithm for each n). (C) Average learning curves (10,000 simulation runs) for Actor-Critic corresponding to the best-fit parameters found for different values of n. (D) Best-fit average learning curve (10,000 simulation runs) during the pre-reversal phase for the SARSA gating algorithm (N = 5, α = 0.31, T = 0.13, and λ = 0.03). (E) Minimum KL divergences for SARSA for different values of n (10 runs of the optimization algorithm for each n). (F) Average learning curves (10,000 simulation runs) for SARSA corresponding to the best-fit parameters found for different values of n.
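
As a rough illustration of the quantities named in this caption, the sketch below averages many simulated learning curves, forms a ±1 standard deviation band, and evaluates a generic discrete KL divergence. It is a minimal sketch under stated assumptions, not the authors' procedure: `toy_run` is a hypothetical stand-in for one simulation run, the population standard deviation is an arbitrary choice, and the caption does not say which distributions the KL divergence was computed between, so `kl_divergence` is only the textbook form.

```python
import math
import random
from statistics import mean, pstdev


def average_curve(runs):
    """Return per-session (means, SDs) across runs.

    `runs` is a list of equal-length lists, one accuracy value per session.
    Population SD is used here; this is an assumption, not taken from the caption.
    """
    sessions = list(zip(*runs))
    return [mean(s) for s in sessions], [pstdev(s) for s in sessions]


def kl_divergence(p, q, eps=1e-12):
    """Textbook discrete KL divergence D(p || q) for two probability vectors."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def toy_run(n_sessions, rate, rng):
    """Hypothetical stand-in for one simulation run: a noisy saturating curve."""
    return [
        min(1.0, max(0.0, 1 - 0.5 * math.exp(-rate * t) + rng.gauss(0, 0.05)))
        for t in range(n_sessions)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
runs = [toy_run(20, 0.3, rng) for _ in range(1000)]
mean_curve, sd_curve = average_curve(runs)

# Bounds of the shaded band: mean minus and plus one standard deviation.
lower = [m - s for m, s in zip(mean_curve, sd_curve)]
upper = [m + s for m, s in zip(mean_curve, sd_curve)]
```

In a fit like the one described, a divergence of this kind would be minimized over the model parameters, with the optimizer restarted several times to reduce the risk of reporting a local minimum.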