2022 Oct 22;22(21):8085. doi: 10.3390/s22218085
G Generator network
C Critic network
θ_G, θ_R Parameters of the generator and the discriminator
α, β_1, β_2 Parameters of the Adam optimizer
S_t State captured by the agent at time-slot t
A_t Possible actions taken by the agent at time-slot t
R_t Reward returned to the agent at time-slot t
P_a State transition probability matrix
γ Discount factor, where 0<γ<1
τ Bellman operator
X Original data
M Replay memory dataset
z Random noise vector
λ Penalty coefficient