Table 2. Varying values (default value)

| Learner parameters | | | Task parameters | |
|---|---|---|---|---|
| † | | ‡ | Target amplitude (units of …) | Reward criterion (R(t) = 1 if …) |
| 1 | 1 | 0 | 0 | Random: 50% of trials |
| 4 | 4 | 0.1 | 2 | Adaptive (median): if EP < target, … ≤ EP ≤ target + 1; if EP within fixed reward zone, target – 1 ≤ EP ≤ target + 1; if EP > target, target – 1 ≤ EP ≤ … |
| 9 | 16 | 0.15 | 4 | Adaptive (mean): if EP < target, … ≤ EP ≤ target + 1; if EP within fixed reward zone, target – 1 ≤ EP ≤ target + 1; if EP > target, target – 1 ≤ EP ≤ … |
| 16 | 36 | 0.2 | 6 | Fixed: target – 1 ≤ EP ≤ target + 1 |
| 25 | 64 | 1 | 8 | Fixed with lower target fraction (target fraction = 2): target – 1 ≤ EP ≤ target + 1 |
† The input is equal to the exploratory variance used following non-successful trials in the models of Cashaback19 and Therrien16. Using a variability control function (see Eq. 10, Fig. 3), it defines the two exploratory variances in the Therrien18 model and a whole range of variances in the Dhawale19 model.
‡ The values 0.1, 0.15 and 0.2 are not used in the Therrien16 and Therrien18 models, as their learning parameter is fixed at 1 (Eq. 4).
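For concreteness, the reward criteria in the last column of Table 2 can be written as a single function. The sketch below is illustrative, not code from the paper: the function name `reward`, the `history` argument, and the use of the median or mean of recent endpoints as the adaptive bound (the bound is elided in the table) are assumptions.

```python
import random
import statistics

def reward(ep, target, criterion, history=None):
    """Return R(t) for one trial: 1 if rewarded, 0 otherwise.

    ep        -- endpoint (EP) of the current trial
    target    -- target amplitude (Table 2, fourth column)
    criterion -- 'random', 'fixed', 'adaptive_median', or 'adaptive_mean'
    history   -- recent endpoints; only the adaptive criteria use it
                 (hypothetical: the exact adaptive bound is elided in
                 Table 2, a median/mean of past endpoints is assumed)
    """
    if criterion == 'random':
        return int(random.random() < 0.5)       # reward on 50% of trials
    if target - 1 <= ep <= target + 1:          # within the fixed reward zone
        return 1
    if criterion == 'fixed':
        return 0                                # fixed zone only
    stat = statistics.median if criterion == 'adaptive_median' else statistics.mean
    bound = stat(history)                       # assumed adaptive bound
    if ep < target:                             # short of the target: lower edge adapts
        return int(bound <= ep <= target + 1)
    return int(target - 1 <= ep <= bound)       # beyond the target: upper edge adapts
```

Note that the "fixed with lower target fraction" condition in the last row uses the same reward zone as "fixed"; only the task's target fraction differs, so the same `'fixed'` branch applies.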
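Footnote † refers to a variability control function (Eq. 10) that turns the single input variance into one, two, or a whole range of exploratory variances. Since Eq. 10 is not reproduced here, the sketch below only illustrates that idea under assumed forms: the 50% reduction after success and the linear dependence on reward rate are placeholders, not the models' actual functions.

```python
def exploratory_variance(sigma2_input, rewards, graded=False):
    """Map reward history to an exploratory variance.

    sigma2_input -- the input variance varied in Table 2 (dagger column);
                    in the Cashaback19 and Therrien16 models it is used
                    directly after non-successful trials
    rewards      -- recent R(t) values (0 or 1), most recent last
    graded       -- False: two variances, as in Therrien18;
                    True: a Dhawale19-like continuum of variances
                    (hypothetical flag; Eq. 10 defines the real mapping)
    """
    if not graded:
        # Two exploratory variances: the full input variance after failure,
        # a reduced one after success (the 0.5 factor is an assumption)
        return sigma2_input if rewards[-1] == 0 else 0.5 * sigma2_input
    # Graded control: variance shrinks as the recent reward rate rises
    # (an assumed linear form standing in for Eq. 10)
    reward_rate = sum(rewards) / len(rewards)
    return sigma2_input * (1.0 - reward_rate)
```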