Author manuscript; available in PMC: 2010 Dec 1.
Published in final edited form as: Cognition. 2009 May 8;113(3):293–313. doi: 10.1016/j.cognition.2009.03.013

Figure 1

Panel A: The payout function for the Farming on Mars task. The horizontal axis is the number of choices, out of the last ten, in which the Long-Term robot was selected. The vertical axis is the number of oxygen units generated by choosing one of the robots on a trial. The two diagonal lines show the reward associated with each robot in each task state. By design, the Short-Term robot yields the higher immediate reward in every state, but the best long-term strategy is to choose the Long-Term robot exclusively, because selecting the Short-Term robot moves the state to the left, whereas selecting the Long-Term robot moves the state to the right. Panel B: Two potential representations of the state structure of the task. States are depicted as black circles. In the top panel, the problem consists of a single state. In this representation, trials that differ from one another in their available rewards are aliased together. In the bottom panel, eleven distinct states are shown. Actions (such as selecting the Short-Term robot) push the system into an adjacent state. In this case, states correspond directly to the positions along the horizontal axis in panel A and better capture the underlying structure of the task.
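
The dynamics described in this caption can be illustrated with a minimal simulation sketch. The reward values here (an intercept of 10, a slope of 5 per state, and a Short-Term bonus of 20) are hypothetical numbers chosen only to reproduce the qualitative shape of two parallel rising lines; they are not the payoffs used in the experiment. The function names `reward` and `simulate` are likewise illustrative.

```python
from collections import deque

WINDOW = 10  # the state counts Long-Term choices among the last ten trials


def reward(action, state):
    """Oxygen units for an action in a given state (0..10).

    Hypothetical payoffs: both lines rise with the state, and the
    Short-Term robot sits a fixed amount above the Long-Term robot.
    """
    base = 10 + 5 * state                      # assumed slope and intercept
    return base + (20 if action == "short" else 0)  # assumed Short-Term bonus


def simulate(policy, n_trials=100):
    """Total reward earned by a policy that maps state -> action."""
    # History of recent choices: 1 = Long-Term, 0 = Short-Term.
    history = deque([0] * WINDOW, maxlen=WINDOW)
    total = 0
    for _ in range(n_trials):
        state = sum(history)                   # position on the horizontal axis
        action = policy(state)
        total += reward(action, state)
        # Long-Term choices push the state right; Short-Term choices
        # let it drift left as older Long-Term choices leave the window.
        history.append(1 if action == "long" else 0)
    return total


always_long = simulate(lambda state: "long")
always_short = simulate(lambda state: "short")
```

Under these assumed payoffs, the Short-Term robot pays more in every one of the eleven states, yet the policy that always selects the Long-Term robot accumulates more oxygen over 100 trials, because it climbs to the rightmost state and stays there. A learner using the single-state representation in panel B cannot see this, since it averages rewards across states that the eleven-state representation keeps apart.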