Figure 6.
A comparison between the human results and the best-fit model predictions for each of the three learning models. Using the best-fit parameters for each subject, we computed the predicted probability of selecting the Long-term option on trial t given that subject's response history for all trials up to trial t − 1. For presentation purposes, the proportion of Long-term responses for human participants and the predicted probabilities of Long-term responses for the models were smoothed using a sliding window of 15 trials. Note that although the Softmax model does not predict that participants will choose the Long-term option more often, when linked to individual choice histories it can end up slightly favoring that option. Overall, the Q-learning network model provides the best account across all five conditions.
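The 15-trial smoothing mentioned in the caption can be illustrated with a minimal sketch. It assumes an unweighted mean over each run of 15 consecutive trials; the caption does not say whether the window is centered or trailing, or how the edges are handled, so here only full windows are kept. The function name `sliding_window_mean` and the example data are hypothetical and are not taken from the study.

```python
import numpy as np

def sliding_window_mean(values, window=15):
    """Return the mean of each run of `window` consecutive trials.

    Assumption: only full windows are kept, so the output has
    len(values) - window + 1 entries.
    """
    values = np.asarray(values, dtype=float)
    if values.size < window:
        raise ValueError("need at least `window` trials")
    kernel = np.ones(window) / window  # equal weight for every trial in the window
    return np.convolve(values, kernel, mode="valid")

# Hypothetical data, for illustration only:
# 1 = Long-term option chosen on that trial, 0 = any other choice.
choices = np.array([0, 1] * 20)
# Hypothetical per-trial model probabilities of a Long-term response.
predicted = np.full(40, 0.6)

smoothed_human = sliding_window_mean(choices)    # proportion of Long-term responses
smoothed_model = sliding_window_mean(predicted)  # smoothed predicted probability
```

Applying the same window to the binary choices and to the per-trial predicted probabilities puts both curves on a common scale, which is what makes them directly comparable in a plot.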