
Table 9. DRL in economics

| Article | Aim of study | Specific approach | Benchmark methods for comparison | Superiority of the proposed method |
|---|---|---|---|---|
| (Chakole & Kurhekar, 2020) | Make trading decisions | Deep Q-learning | Decision Tree strategy, Buy-and-Hold strategy | Outperforms on several economic indicators: accumulated return, maximum drawdown, average daily return, average annual return, skewness, kurtosis, Sharpe ratio, and standard deviation |
| (Zhou et al., 2020b) | Derive optimal power flow | DRL (PPO with IL) | IL, PPO | Performs better in accuracy and running time |
| (Qiu et al., 2020) | Price electric vehicles | PDDPG | Q-learning, DQN, DDPG | Better performance in standard deviation, learning pace, flexibility, and computational time |
| (Sattarov et al., 2020) | Recommend cryptocurrency trading points | Deep neural model of DRL | Double-cross strategy, swing trading, scalping trading | Best performance in number of actions and quality of trading |
| (Uddin et al., 2020) | Estimate the impact of COVID-19 on the spread of infection, quality of life, resource use, and the economy | DQN, DDPG | Random, Q-learning, SARSA | Performs better in terms of best rewards and best policy |

Note: DRL (Deep Reinforcement Learning), DQN (Deep Q-Network), DDPG (Deep Deterministic Policy Gradient), PDDPG (Prioritized Deep Deterministic Policy Gradient), PPO (Proximal Policy Optimization), IL (Imitation Learning), SARSA (State-Action-Reward-State-Action).
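
The last column above scores each method with a common battery of financial indicators (accumulated return, maximum drawdown, Sharpe ratio, and so on). As a point of reference, the sketch below shows one conventional way to compute these indicators from a series of daily returns. It is not taken from any of the cited papers; the function name `evaluate_strategy`, the zero default risk-free rate, and the 252-trading-day annualization factor are illustrative assumptions.

```python
import numpy as np

def evaluate_strategy(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Compute the economic indicators cited in Table 9 from a series of
    simple daily returns. Formulas are standard textbook definitions; the
    risk-free rate and annualization factor are assumptions, not values
    from the cited studies."""
    r = np.asarray(daily_returns, dtype=float)
    wealth = np.cumprod(1.0 + r)                 # growth of one unit of capital
    running_peak = np.maximum.accumulate(wealth)
    drawdown = wealth / running_peak - 1.0       # <= 0 at every step
    mu, sigma = r.mean(), r.std(ddof=1)
    centred = r - mu
    return {
        "accumulated_return": wealth[-1] - 1.0,
        "average_daily_return": mu,
        "average_annual_return": (1.0 + mu) ** periods_per_year - 1.0,
        "standard_deviation": sigma,
        "skewness": (centred ** 3).mean() / sigma ** 3,
        "kurtosis": (centred ** 4).mean() / sigma ** 4 - 3.0,  # excess kurtosis
        "sharpe_ratio": ((mu - risk_free_rate / periods_per_year) / sigma
                         * np.sqrt(periods_per_year)),
        "maximum_drawdown": drawdown.min(),
    }

# A random return series stands in for an agent's realized trading returns.
rng = np.random.default_rng(0)
for name, value in evaluate_strategy(rng.normal(5e-4, 0.01, 1000)).items():
    print(f"{name}: {value:.4f}")
```

In the cited studies these statistics are computed on the agent's out-of-sample trading returns and compared against the benchmark strategies listed in the fourth column.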