Table 2. Varying values (default value)

| Learner parameters | | | Task parameters | |
|---|---|---|---|---|
| † | | ‡ | Target amplitude (units of …) | Reward criterion (R(t) = 1 if …) |
| 1 | 1 | 0 | 0 | Random: 50% of trials |
| 4 | 4 | 0.1 | 2 | Adaptive (median): if EP < target, … ≤ EP ≤ target + 1; if EP within fixed reward zone, target – 1 ≤ EP ≤ target + 1; if EP > target, target – 1 ≤ EP ≤ … |
| 9 | 16 | 0.15 | 4 | Adaptive (mean): if EP < target, … ≤ EP ≤ target + 1; if EP within fixed reward zone, target – 1 ≤ EP ≤ target + 1; if EP > target, target – 1 ≤ EP ≤ … |
| 16 | 36 | 0.2 | 6 | Fixed: target – 1 ≤ EP ≤ target + 1 |
| 25 | 64 | 1 | 8 | Fixed with lower target fraction (target fraction = 2): target – 1 ≤ EP ≤ target + 1 |
† The input is equal to the exploratory variance used following non-successful trials in the models of Cashaback19 and Therrien16. Using a variability control function (see Eq. 10, Fig. 3), it defines the two exploratory variances in the Therrien18 model and a whole range of variances in the Dhawale19 model.
‡ The values 0.1, 0.15 and 0.2 are not used in the Therrien16 and Therrien18 models, as their learning parameter is fixed at 1 (Eq. 4).
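For concreteness, the reward criteria in the last column of Table 2 can be written as a single function. The sketch below is illustrative, not code from the paper: the function name `reward`, the `history` argument, and the use of the median or mean of recent endpoints as the adaptive bound (the bound is elided in the table) are assumptions.

```python
import random
import statistics

def reward(ep, target, criterion, history=None):
    """Return R(t) for one trial: 1 if rewarded, 0 otherwise.

    ep        -- endpoint (EP) of the current trial
    target    -- target amplitude (Table 2, fourth column)
    criterion -- 'random', 'fixed', 'adaptive_median', or 'adaptive_mean'
    history   -- recent endpoints; only the adaptive criteria use it
                 (hypothetical: the exact adaptive bound is elided in
                 Table 2, a median/mean of past endpoints is assumed)
    """
    if criterion == 'random':
        return int(random.random() < 0.5)       # reward on 50% of trials
    if target - 1 <= ep <= target + 1:          # within the fixed reward zone
        return 1
    if criterion == 'fixed':
        return 0                                # fixed zone only
    stat = statistics.median if criterion == 'adaptive_median' else statistics.mean
    bound = stat(history)                       # assumed adaptive bound
    if ep < target:                             # short of the target: lower edge adapts
        return int(bound <= ep <= target + 1)
    return int(target - 1 <= ep <= bound)       # beyond the target: upper edge adapts
```

Note that the "fixed with lower target fraction" condition in the last row uses the same reward zone as "fixed"; only the task's target fraction differs, so the same `'fixed'` branch applies.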
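Footnote † refers to a variability control function (Eq. 10) that turns the single input variance into one, two, or a whole range of exploratory variances. Since Eq. 10 is not reproduced here, the sketch below only illustrates that idea under assumed forms: the 50% reduction after success and the linear dependence on reward rate are placeholders, not the models' actual functions.

```python
def exploratory_variance(sigma2_input, rewards, graded=False):
    """Map reward history to an exploratory variance.

    sigma2_input -- the input variance varied in Table 2 (dagger column);
                    in the Cashaback19 and Therrien16 models it is used
                    directly after non-successful trials
    rewards      -- recent R(t) values (0 or 1), most recent last
    graded       -- False: two variances, as in Therrien18;
                    True: a Dhawale19-like continuum of variances
                    (hypothetical flag; Eq. 10 defines the real mapping)
    """
    if not graded:
        # Two exploratory variances: the full input variance after failure,
        # a reduced one after success (the 0.5 factor is an assumption)
        return sigma2_input if rewards[-1] == 0 else 0.5 * sigma2_input
    # Graded control: variance shrinks as the recent reward rate rises
    # (an assumed linear form standing in for Eq. 10)
    reward_rate = sum(rewards) / len(rewards)
    return sigma2_input * (1.0 - reward_rate)
```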