2022 Jan 12;18(1):e1009634. doi: 10.1371/journal.pcbi.1009634

Fig 4. How MF forgetting influences gain estimation.

(A) Estimated gain as a function of the difference between the model-estimated MB Q-value and the agent’s current MF Q-value, Q̂MB − QMF, for varying degrees of MF forgetting, ϕMF. The dashed grey lines show the x- and y-intercepts. Note that the estimated gain is negative whenever the model-generated Q̂MB estimates are worse than the current MF Q-values. (B) Current MF Q-values for the optimal and sub-optimal actions with varying MF forgetting rate, coloured as in (A). The horizontal solid black bar is the average reward experienced so far, towards which the MF values tend. The true Q-value for each action is shown in dashed black.
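The tendency described in (B), MF values drifting towards the average reward experienced so far, can be illustrated with a minimal sketch. The update rule below (each Q-value moves a fraction phi of the way towards the mean reward per step) is an assumed parameterisation chosen for illustration, not necessarily the one used in the paper; the function name `forget_towards_mean` and all numerical values are hypothetical.

```python
def forget_towards_mean(q_values, mean_reward, phi):
    """Move each Q-value a fraction `phi` of the way towards `mean_reward`.

    Assumed convention: phi = 0 means no forgetting, phi = 1 means the
    values collapse onto the mean reward in a single step.
    """
    return [q + phi * (mean_reward - q) for q in q_values]


# Hypothetical MF values for the optimal and sub-optimal actions.
q_mf = [1.0, 0.2]
# Hypothetical average reward experienced so far.
mean_reward = 0.5

for phi in (0.0, 0.1, 0.5):
    q = list(q_mf)
    for _ in range(10):  # apply ten forgetting steps
        q = forget_towards_mean(q, mean_reward, phi)
    print(phi, [round(v, 3) for v in q])
```

Under this assumed rule, larger forgetting rates pull both actions' MF values closer to the average reward, which shrinks the gap between the optimal and sub-optimal actions.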