General schematic of the two-stage model-based learning task. In the first stage, participants choose between two gray boxes, each identified by a Tibetan character. The chosen box determines the probabilities of transitioning to one of two second-stage states, red or blue. In this example, each box leads to its preferred state (red or blue) with 70% probability and to the other state with the remaining 30%. In the second stage, participants choose between two boxes (again identified by Tibetan characters) and either receive a reward or not. Each second-stage box has its own reward probability, which changes over the course of the experiment.
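
The trial structure described in the schematic can be illustrated with a minimal simulation sketch. Only the 70%/30% transition split comes from the caption. The mapping of first-stage boxes to preferred states, the starting reward probabilities, and the drift rule (a clipped Gaussian random walk with hypothetical parameters `sd`, `lo`, and `hi`) are assumptions made for illustration and are not taken from the task as actually implemented.

```python
import random

COMMON_P = 0.7  # from the caption: 70% to the preferred state, 30% to the other

# Assumption: first-stage box 0 prefers the red state, box 1 prefers blue.
PREFERRED_STATE = {0: "red", 1: "blue"}


def transition(first_choice, rng):
    """Sample the second-stage state reached after a first-stage choice."""
    preferred = PREFERRED_STATE[first_choice]
    other = "blue" if preferred == "red" else "red"
    return preferred if rng.random() < COMMON_P else other


def run_trial(first_choice, second_choice, reward_probs, rng):
    """Simulate one trial; returns (second-stage state, rewarded?)."""
    state = transition(first_choice, rng)
    p_reward = reward_probs[state][second_choice]
    rewarded = rng.random() < p_reward
    return state, rewarded


def drift(reward_probs, rng, sd=0.025, lo=0.25, hi=0.75):
    """Assumed drift rule: Gaussian random walk, clipped to [lo, hi]."""
    return {
        state: [min(hi, max(lo, p + rng.gauss(0.0, sd))) for p in probs]
        for state, probs in reward_probs.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical starting reward probabilities for the four second-stage boxes.
    reward_probs = {"red": [0.6, 0.4], "blue": [0.3, 0.7]}
    for trial in range(5):
        first = rng.randrange(2)   # random agent, for illustration only
        second = rng.randrange(2)
        state, rewarded = run_trial(first, second, reward_probs, rng)
        print(trial, first, state, second, rewarded)
        reward_probs = drift(reward_probs, rng)
```

Calling `drift` after every trial makes the reward probabilities change slowly, so a simulated agent has to keep tracking which second-stage box is currently the better option.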