(a) The agents are subject to the same learning paradigm as in Figure 4. Learning is successful only when the integral of the STDP window is positive. (ai) Learning windows with negative, zero and positive integrals. Parameters and color scheme are as specified at the top. (aii) Learning curve, shown as the percentage of successful simulations over successive trials (trials 1–20; 1000 simulations). Only agents using a learning rule with a net positive integral of the STDP window learn successfully. (b) None of the STDP learning rules successfully learns a displaced reward location. (bi) The percentage of visits to the previous reward area is low only for agents with a negative integral of the STDP window (trials 21–40; 1000 simulations), because unlearning occurred during the first phase of the experiments. (bii) Agents with a positive integral of the STDP window only partially learn the new reward location and do not effectively unlearn the previous one (as shown in bi). Agents with a negative integral of the STDP window unlearn both the old and the new reward areas. Shaded areas (aii, bi and bii) represent the 95% confidence interval of the sample mean.
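The caption's central quantity is the integral of the STDP window, whose sign separates the learning rules in (ai). As a minimal illustration of how that sign arises, the sketch below uses a standard double-exponential STDP window; the functional form and all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are assumptions for illustration, not the values used in the figure.

```python
import numpy as np

# Hypothetical double-exponential STDP window (illustrative parameters, not
# the paper's): W(dt) = A+ * exp(-dt/tau+) for dt > 0 (pre before post,
# potentiation) and W(dt) = -A- * exp(dt/tau-) for dt < 0 (depression).
def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau_plus=20.0, tau_minus=20.0):
    dt = np.asarray(dt, dtype=float)
    return np.where(dt > 0,
                    a_plus * np.exp(-dt / tau_plus),
                    -a_minus * np.exp(dt / tau_minus))

def window_integral(a_plus, a_minus, tau_plus, tau_minus):
    # Closed form of the integral over all dt: A+ * tau+  -  A- * tau-.
    # Its sign depends only on the balance between the two lobes.
    return a_plus * tau_plus - a_minus * tau_minus

# Three parameter sets giving negative, zero and positive integrals,
# mirroring the three windows in panel (ai).
for a_plus, a_minus in [(1.0, 1.2), (1.0, 1.0), (1.2, 1.0)]:
    print(a_plus, a_minus, window_integral(a_plus, a_minus, 20.0, 20.0))
```

With equal time constants, the integral's sign is set entirely by the ratio of the potentiation and depression amplitudes, which is one simple way to construct the three window types compared in the figure.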