2020 Oct 19;20(20):5911. doi: 10.3390/s20205911
MAMMDP Multi-Arm Manipulator Markov Decision Process
SAC Soft Actor-Critic
HER Hindsight Experience Replay
AI Artificial Intelligence
FMMs Fast Marching Methods
PRM Probabilistic Road Map
RRT Rapidly-exploring Random Trees
DNN Deep Neural Network
TD3 Twin Delayed Deep Deterministic Policy Gradient
MDP Markov Decision Process
DOF Degree of Freedom
OBB Oriented Bounding Boxes
DQN Deep Q-Network
DPG Deterministic Policy Gradient
DDPG Deep Deterministic Policy Gradient
A3C Asynchronous Advantage Actor-Critic
TRPO Trust Region Policy Optimization
MPO Maximum a Posteriori Policy Optimisation
D4PG Distributed Distributional Deep Deterministic Policy Gradient
KL Kullback-Leibler