Table 2.
Performance metrics for goal-oriented tasks: move-to-goal steps, time elapsed, mean reward, and standard deviation of rewards.
Move-to-Goal Steps | Time Elapsed (s) | Mean Reward | Std. of Rewards |
---|---|---|---|
10,000 | 55.449 | −0.877 | 0.447 |
20,000 | 85.770 | −0.093 | 0.787 |
30,000 | 117.238 | 0.323 | 0.857 |
40,000 | 148.027 | 0.691 | 0.599 |
50,000 | 179.527 | 0.578 | 0.645 |
60,000 | 210.429 | 0.704 | 0.539 |
70,000 | 241.473 | 0.707 | 0.552 |
80,000 | 272.897 | 0.680 | 0.578 |
90,000 | 304.707 | 0.708 | 0.569 |
100,000 | 336.163 | 0.863 | 0.432 |
200,000 | 656.214 | 0.983 | 0.129 |
300,000 | 982.363 | 0.989 | 0.146 |
400,000 | 1319.801 | 0.990 | 0.121 |
500,000 | 1671.575 | 1.000 | 0.000 |
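
As a reading aid, the step counts and elapsed times in Table 2 can be turned into an approximate throughput (steps per second) between consecutive checkpoints. The sketch below is illustrative only: it assumes the elapsed-time column is cumulative wall-clock time since the start of the run, and the names `checkpoints` and `interval_throughput` are hypothetical, not taken from the source.

```python
# (steps, elapsed seconds) pairs copied from Table 2.
# Assumption: elapsed time is cumulative since the start of the run.
checkpoints = [
    (10_000, 55.449),
    (20_000, 85.770),
    (30_000, 117.238),
    (40_000, 148.027),
    (50_000, 179.527),
    (60_000, 210.429),
    (70_000, 241.473),
    (80_000, 272.897),
    (90_000, 304.707),
    (100_000, 336.163),
    (200_000, 656.214),
    (300_000, 982.363),
    (400_000, 1319.801),
    (500_000, 1671.575),
]


def interval_throughput(rows):
    """Return steps per second between each pair of consecutive checkpoints."""
    rates = []
    for (s0, t0), (s1, t1) in zip(rows, rows[1:]):
        rates.append((s1 - s0) / (t1 - t0))
    return rates


rates = interval_throughput(checkpoints)

if __name__ == "__main__":
    # Print one line per interval, labelled by the checkpoint that ends it.
    for (steps, _), rate in zip(checkpoints[1:], rates):
        print(f"up to {steps:>7,} steps: {rate:6.1f} steps/s")
```

Under that assumption, the per-interval rates make it easy to see whether the cost per step stays roughly constant across the run or drifts as the step count grows.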