2011 Apr 6;31(14):5504–5511. doi: 10.1523/JNEUROSCI.6316-10.2011

Table 2.

Qualities of behavioral fits of both models

	Direct actor	Q-learning
−LL	107.6	114.3
Pseudo-R²	0.3534	0.3131
Number of parameters	4	5
BIC	118.6	128.0

The Q-learning and policy-gradient (direct actor) models were fit to each of 20 subjects individually. Reported values are averages across subjects. −LL, Negative log likelihood; BIC, Bayesian information criterion.
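For readers who want to see how the rows of the table relate to one another, the following is a minimal sketch, not the authors' fitting code. It assumes that pseudo-R² is defined as 1 − (−LL)/(−LL of a chance model), that BIC is expressed on the negative log likelihood scale as −LL + (k/2) ln n, and that each subject made n = 240 two-alternative choices. The trial count and the number of options are not stated in this table; they are hypothetical values chosen here because they approximately reproduce the tabulated figures. The function names are illustrative only.

```python
import math


def pseudo_r2(neg_ll, n_trials, n_options=2):
    """Pseudo-R^2 relative to a chance model that picks uniformly at random."""
    chance_neg_ll = n_trials * math.log(n_options)
    return 1.0 - neg_ll / chance_neg_ll


def bic(neg_ll, n_params, n_trials):
    """BIC on the negative log likelihood scale: -LL + (k / 2) * ln(n)."""
    return neg_ll + 0.5 * n_params * math.log(n_trials)


# ASSUMPTION: 240 binary choices per subject (not given in the table).
N_TRIALS = 240

# (average -LL, number of free parameters), taken from Table 2.
models = {
    "Direct actor": (107.6, 4),
    "Q-learning": (114.3, 5),
}

results = {
    name: (pseudo_r2(neg_ll, N_TRIALS), bic(neg_ll, k, N_TRIALS))
    for name, (neg_ll, k) in models.items()
}

for name, (r2, score) in results.items():
    print(f"{name}: pseudo-R2 = {r2:.3f}, BIC = {score:.1f}")
```

Under these assumptions the direct actor model has the lower BIC, which matches the ordering in the table: it attains the better likelihood with one fewer free parameter. Small discrepancies in the fourth decimal place of pseudo-R² are expected, because the table reports averages of per-subject quantities rather than quantities computed from averaged likelihoods.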