Sample screen events for two phases as seen by a human subject. In acquisition phase 1 (right), the subject sees 3 colored doors and chooses one by clicking it with the mouse. In this case, the subject chooses pink and gets the reward. Had the subject chosen one of the other 2 colors, the door would have been locked, the order of the colors would have been shuffled, and the subject would have had to choose again. After choosing the correct door 5 times consecutively, the subject advances to phase 2 (left). Here the subject sees 3 differently colored doors. If the subject chooses a wrong door, it is again locked. If the subject chooses the correct door, the first room becomes visible in the distance and the subject is then moved to that room, where the previously learned correct door must be remembered to gain the reward. After 5 consecutive successful 2-room navigations, the chain is lengthened to 3 and finally 4 rooms. If the subject learns the 4-room navigation, a probe phase is started. In each room, one of the incorrect door colors is replaced with a door color that is correct in another room. In the illustration above, for example, the yellow in the first room could be replaced with blue. The subject then has to navigate through the rooms 6 further times, choosing the colors in the same order as in the acquisition phases to gain the reward.
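The advancement rule described in this caption can be sketched as a small state machine. This is a minimal sketch, not the software used in the study: the names `Progress` and `update` are hypothetical, and it assumes that a run counts toward the criterion only if it contained no wrong door choice and that any error resets the streak, which the caption does not state explicitly.

```python
from dataclasses import dataclass

CRITERION = 5    # consecutive correct runs needed to advance (from the caption)
MAX_ROOMS = 4    # longest chain before the probe phase (from the caption)
PROBE_RUNS = 6   # number of probe navigations (from the caption)


@dataclass(frozen=True)
class Progress:
    """Hypothetical record of where a subject is in the procedure."""
    rooms: int = 1                     # current chain length (phase 1 = 1 room)
    streak: int = 0                    # consecutive error-free runs at this length
    probe: bool = False                # True once the probe phase has started
    probe_runs_left: int = PROBE_RUNS  # probe navigations still to be run


def update(p: Progress, error_free: bool) -> Progress:
    """Return the subject's state after one completed run.

    Assumption: a run with any wrong door choice resets the streak.
    """
    if p.probe:
        # Probe runs are simply counted down; there is no further advancement.
        return Progress(p.rooms, p.streak, True, max(p.probe_runs_left - 1, 0))
    streak = p.streak + 1 if error_free else 0
    if streak < CRITERION:
        return Progress(p.rooms, streak)
    if p.rooms < MAX_ROOMS:
        # Criterion reached: lengthen the chain by one room.
        return Progress(p.rooms + 1, 0)
    # Criterion reached on the full chain: start the probe phase.
    return Progress(p.rooms, 0, probe=True)
```

Under these assumptions, a subject who never errs would need 5 runs at each of the 4 chain lengths before the probe phase begins, followed by the 6 probe navigations.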