2011 May 12;7(5):e1001133. doi: 10.1371/journal.pcbi.1001133

Figure 7. The grid-world task.

Average latency in reaching the reward state and standard deviations over [inline graphic] runs for the neuronal network model with optimized parameters [inline graphic], [inline graphic], [inline graphic] and reward [inline graphic] (red curve) and the corresponding classical discrete-time algorithmic TD[inline graphic] implementation with [inline graphic], [inline graphic], [inline graphic] and reward [inline graphic] (blue curve). Each data point shows the average latency over [inline graphic] successive trials. Inset: grid-world environment consisting of [inline graphic] states. Only the state marked with an asterisk is rewarded. In each state the agent (A) can choose between [inline graphic] directions (indicated by the arrows). Once the rewarded state has been found, the agent is moved randomly to a new starting position.
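
The task described in the caption can be illustrated with a minimal Python sketch of a discrete-time grid-world learner. It is not the authors' implementation. The numerical values in the caption did not survive extraction, so every constant here is an assumption chosen only for illustration: the 5x5 grid, the four movement directions, the position of the rewarded state, and the learning parameters. The choice of a tabular actor-critic with a one-step temporal-difference error and softmax action selection is also assumed, as is the rule that a move off the grid leaves the agent in place. The function names (`step`, `choose_action`, `run`, `block_means`) are hypothetical.

```python
import math
import random

# All constants are illustrative assumptions, not values from the figure.
SIZE = 5                                        # assumed 5x5 grid
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # assumed four directions
GOAL = (2, 2)                                   # assumed rewarded state
ALPHA, BETA, GAMMA, REWARD = 0.4, 0.3, 0.9, 1.0  # assumed parameters


def step(state, action):
    """Move one cell; a move that would leave the grid keeps the agent in place."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < SIZE and 0 <= new_col < SIZE:
        return (new_row, new_col)
    return state


def choose_action(prefs, rng):
    """Sample an action index from a softmax over the action preferences."""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]
    return rng.choices(range(len(prefs)), weights=weights)[0]


def run(n_trials=200, seed=0, max_steps=10_000):
    """Return the latency (steps to reach the rewarded state) of each trial."""
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    value = {s: 0.0 for s in states}                    # critic: state values
    prefs = {s: [0.0] * len(ACTIONS) for s in states}   # actor: preferences
    starts = [s for s in states if s != GOAL]
    latencies = []
    for _ in range(n_trials):
        state = rng.choice(starts)   # random new starting position
        steps = 0
        while state != GOAL and steps < max_steps:
            action = choose_action(prefs[state], rng)
            nxt = step(state, action)
            reward = REWARD if nxt == GOAL else 0.0
            # One-step TD error; the rewarded state is treated as terminal.
            future = 0.0 if nxt == GOAL else GAMMA * value[nxt]
            delta = reward + future - value[state]
            value[state] += ALPHA * delta
            prefs[state][action] += BETA * delta
            state = nxt
            steps += 1
        latencies.append(steps)
    return latencies


def block_means(latencies, block):
    """Average latency over successive blocks of `block` trials."""
    return [sum(latencies[i:i + block]) / block
            for i in range(0, len(latencies) - block + 1, block)]
```

Calling `block_means(run(), 20)` gives one average latency per block of 20 successive trials, analogous to the data points in the figure. Averaging such curves over several seeds would give the mean and standard deviation over runs.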