2009 Feb 20;100(3):249–260. doi: 10.1007/s00422-009-0295-8

Fig. 7.

Statistics for other types of learning methods. PS = path straightening (our method). We used reward type D, no punishment, and a discount factor of γ = 0.7 in all cases except the “fully discounted” case, where γ = 0.