Anomaly Detection in Industrial IoT Using Distributional Reinforcement Learning and Generative Adversarial Networks

2022 Oct 22;22(21):8085. doi: 10.3390/s22218085

G	Generator network
C	Critic network
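$θ_{G}, θ_{R}$	Parameters of the generator and the discriminator, respectively
$α, β_{1}, β_{2}$	Hyperparameters of the Adam optimizer (learning rate and exponential decay rates of the moment estimates)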
$S_{t}$	State captured by the agent at time slot $t$
$A_{t}$	Possible actions taken by the agent at time slot $t$
$R_{t}$	Reward returned to the agent at time slot $t$
$P_{a}$	State transition probability matrix
$γ$	Discount factor, where $0 < γ < 1$
$τ$	Bellman operator
X	Original data
M	Replay memory data set
z	Random noise vector
$λ$	Penalty coefficient