2016 Nov 30;36(48):12228–12242. doi: 10.1523/JNEUROSCI.0763-16.2016

Table 1.

Symbols used in the equations and values that were used for the parameters in the simulations

| Variable | Description | Value (if applicable) |
| --- | --- | --- |
| Ns | No. of spatial units | 2 |
| Ne | No. of units in recurrent network (autoencoder) | 490 |
| Nm | No. of place cells | 980 |
| Nv | No. of cortical units (layer 1) | 980 |
| Nh | No. of cortical units (layer 2) | 300 |
| β | Bout length | Variable (see text) |
| tb | Time elapsed within bout | [0, β] |
| Rt | Reward administered at time t | {0, 1} |
| σϵ | Within-bout variance | 0 (none) or 0.003 |
| σφ | Interbout variance | 0 (none) or 0.11 |
| σf | Place cell breadth | 0.16 |
| ϕ | Sampled reward location (sudden shift) | N(μ, σφ) bound to [0, 1] |
| ϵ | Incremental shift | N(μ, σϵ) |
| lt | Reward location at time t | [0, 1] |
| xt | Agent location at time t | [0, 1] |
| si | Place cell centerfield | [0, 1] |
| s | Spatial cell activation vector | |
| e(k) | Recurrent network (autoencoder) layer k activation vector | |
| m | Place cell activation vector (memory) | |
| mE | Episodic output | |
| mS | Schematic output | |
| mO | Combined episodic/schematic output | |
| mR | Output from replay event | |
| m̃(xt+1 \| ai) | Predicted output given action ai | |
| V | Cortex layer 1 activation vector | |
| H | Cortex layer 2 activation vector | |
| WSE-AE | Spatial encoder to autoencoder weights | |
| WAE-AE | Autoencoder recurrent weights | |
| WAE-PC | Autoencoder to place cell weights | |
| WCTX | Cortical weights | |
| | Agent speed | 0.04 |
| at | Action taken at time t | ∈ {N, NW, W, SW, S, SE, E, NE} |
| ai | Possible action at time t | ∈ {N, NW, W, SW, S, SE, E, NE} |
| αt | Policy unit (episodic, schematic) at time t | [0, 1] |
| αRt | Policy unit (random) at time t | [0, 1] |
| δt | Temporal difference error | |
| γ | Temporal difference discounting factor | 0.95 |
| λ | Learning rate (autoencoder) | δt 0.1 |
| | Learning rate (cortex) | 0.00001 |
| | Learning rate (place cells, actor) | 0.0075 |
| | Learning rate (place cells, critic) | 0.04 |
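
For readers who want to reuse these values, the fixed parameters in Table 1 could be collected in one configuration object, as in the minimal Python sketch below. The names `SimParams` and `sample_reward_location` and all field names are hypothetical and do not come from the paper. The sketch also makes three assumptions that the table does not confirm: that locations are two-dimensional (one coordinate per spatial unit), that σφ is passed as the spread argument of a Gaussian sampler, and that "bound to [0, 1]" means clipping.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class SimParams:
    """Fixed values from Table 1 (field names are illustrative only)."""
    n_spatial: int = 2               # Ns
    n_autoencoder: int = 490         # Ne
    n_place_cells: int = 980         # Nm
    n_cortex_layer1: int = 980       # Nv
    n_cortex_layer2: int = 300       # Nh
    sigma_within_bout: float = 0.003  # sigma_epsilon; set to 0 for "none"
    sigma_interbout: float = 0.11     # sigma_phi; set to 0 for "none"
    place_cell_breadth: float = 0.16  # sigma_f
    agent_speed: float = 0.04
    gamma: float = 0.95              # temporal difference discounting factor
    lr_autoencoder: float = 0.1
    lr_cortex: float = 0.00001
    lr_actor: float = 0.0075
    lr_critic: float = 0.04


def sample_reward_location(mu, sigma, rng):
    """Draw one Gaussian sample per coordinate of mu, clipped to [0, 1].

    Assumption: sigma is used directly as the Gaussian spread, and
    "bound to [0, 1]" is implemented as clipping.
    """
    return tuple(min(1.0, max(0.0, rng.gauss(m, sigma))) for m in mu)


params = SimParams()
rng = random.Random(0)  # seeded only to make the sketch repeatable
new_location = sample_reward_location((0.5, 0.5), params.sigma_interbout, rng)
```

Passing a spread of 0 returns the mean unchanged, which corresponds to the "0 (none)" setting listed for the variance parameters.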