Optimal learning rates and number of synaptic states for environments with different volatility. The baiting probabilities change at different rates in the three plots (from left to right, the number of trials per block is s = 10, 100, 1000). Each plot shows the overall performance of the simulated network (grayscale) as a function of the learning rate α (αr = αn = α) and the number of synaptic states m. The performance is the harvesting efficiency, defined as the average number of rewards received per trial divided by the total reward rate. The optimal parameter region always lies at a relatively small number of synaptic states (m < 10), even in the case of stable environments (right). Parameters: T = 0.05, γ = 0 and rL + rR = 0.35.
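
As an illustration of the performance measure used in the caption, the harvesting efficiency can be computed from a sequence of per-trial outcomes as sketched below. This is a minimal sketch, not the authors' simulation code: the function name `harvesting_efficiency` and the 0/1 encoding of rewards are hypothetical, and the total reward rate is assumed here to be rL + rR (0.35 in the caption).

```python
def harvesting_efficiency(rewards, total_reward_rate):
    """Average reward per trial divided by the total reward rate.

    rewards: sequence of per-trial outcomes (1 = rewarded, 0 = not rewarded);
             this encoding is an assumption made for the example.
    total_reward_rate: assumed to be r_L + r_R, the summed baiting rates.
    """
    if len(rewards) == 0:
        raise ValueError("rewards must contain at least one trial")
    if total_reward_rate <= 0:
        raise ValueError("total_reward_rate must be positive")
    mean_reward_per_trial = sum(rewards) / len(rewards)
    return mean_reward_per_trial / total_reward_rate


# Hypothetical outcomes for 10 trials (3 rewarded), with r_L + r_R = 0.35.
example_rewards = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
efficiency = harvesting_efficiency(example_rewards, 0.35)
```

Under these assumptions, a value near 1 would mean that the network collects rewards at close to the rate at which they are made available.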