2016 Jan;20(1):15–24. doi: 10.1016/j.tics.2015.07.010

Figure I.

Probabilistic Kalman-Filter [80] Models of the Environment. The reward outcome $r_t$ at time-step $t$ is sampled from a normal distribution whose mean $v_t^s$ is specific to the current state $s$. (A) For a particular state, depicted here, changes in the mean follow a random walk with normally distributed noise. (B) A general environmental factor affects multiple states. At each time-step $t$, a general factor $g_t$ is sampled from a normal distribution whose mean is zero, and is then added to multiple state means ($v_t^s$). (C) Changes in reward follow an underlying momentum. The mean reward $v_t^s$ of a state is sampled from a normal distribution whose mean is the sum of the previous mean $v_{t-1}^s$ and the current momentum $m_t$. Changes in momentum follow a random walk.
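
The three generative models in this caption can be illustrated with a short simulation. The following is a minimal sketch, not the authors' implementation: the function names, the standard-deviation parameters (`drift_sd`, `factor_sd`, `momentum_sd`, `mean_sd`, `reward_sd`), the starting values, and the seeds are all hypothetical choices made for illustration. For panel B it additionally assumes that the general factor shifts each state mean persistently, which the caption does not state explicitly.

```python
import random


def simulate_random_walk(n_steps, drift_sd=0.1, reward_sd=1.0, v0=0.0, seed=0):
    """Panel A: one state's mean reward drifts as a Gaussian random walk."""
    rng = random.Random(seed)
    v, means, rewards = v0, [], []
    for _ in range(n_steps):
        v = v + rng.gauss(0.0, drift_sd)         # mean takes a random-walk step
        means.append(v)
        rewards.append(rng.gauss(v, reward_sd))  # reward sampled around the mean
    return means, rewards


def simulate_general_factor(n_steps, v0=(0.0, 1.0, 2.0), factor_sd=0.5,
                            reward_sd=1.0, seed=0):
    """Panel B: one zero-mean factor g_t is added to every state's mean.

    Assumption (not in the caption): the shift persists from step to step.
    """
    rng = random.Random(seed)
    means = list(v0)
    mean_history, reward_history = [], []
    for _ in range(n_steps):
        g = rng.gauss(0.0, factor_sd)            # shared factor, mean zero
        means = [v + g for v in means]           # same shift for all states
        mean_history.append(list(means))
        reward_history.append([rng.gauss(v, reward_sd) for v in means])
    return mean_history, reward_history


def simulate_momentum(n_steps, momentum_sd=0.05, mean_sd=0.1, reward_sd=1.0,
                      v0=0.0, m0=0.0, seed=0):
    """Panel C: momentum random-walks; the mean is drawn around v_prev + m."""
    rng = random.Random(seed)
    v, m = v0, m0
    means, momenta, rewards = [], [], []
    for _ in range(n_steps):
        m = m + rng.gauss(0.0, momentum_sd)      # momentum takes a random-walk step
        v = rng.gauss(v + m, mean_sd)            # new mean centred on v_prev + m
        momenta.append(m)
        means.append(v)
        rewards.append(rng.gauss(v, reward_sd))  # reward sampled around the mean
    return means, momenta, rewards


if __name__ == "__main__":
    means, momenta, rewards = simulate_momentum(n_steps=5)
    for t, (v, m, r) in enumerate(zip(means, momenta, rewards), start=1):
        print(f"t={t}  momentum={m:+.3f}  mean={v:+.3f}  reward={r:+.3f}")
```

Setting the noise parameters to zero makes each model deterministic, which is a convenient sanity check: in panel C, for example, a fixed positive momentum then produces a mean that rises by the same amount at every step.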