Two-Layer Edge Intelligence for Task Offloading and Computing Capacity Allocation with UAV Assistance in Vehicular Networks

Sensors (Basel). 2024 Mar 14;24(6):1863. doi: 10.3390/s24061863

Algorithm 1 DQN algorithm.

1: Initialize replay memory pool $M$
2: Initialize neural network weight $\theta$ and target weight $\theta' = \theta$
3: for episode $i = 1, \ldots, M$ do
4:   for $t = 1, \ldots, T$ do
5:     Obtain state $s_{t}$ from the environment
6:     Randomly select an action $a_{t}$ (exploration), or otherwise set $a_{t} = \arg\max_{a} Q(s_{t}, a; \theta)$ (exploitation)
7:     Observe the reward $R_{t}$ for action $a_{t}$ and obtain the next state $s_{t+1}$
8:     Store transition $(s_{t}, a_{t}, R_{t}, s_{t+1})$ in memory $M$
9:     Randomly sample a mini-batch $\tilde{M}$ from $M$
10:    Update the evaluation network by a gradient-descent step on the loss function, using $\nabla_{\theta_{i}} LF(\theta_{i})$
11:    Every $C$ steps, update the target network: $\theta' = \theta$
12:   end for
13: end for
14: Output: offloading decision strategy $a_{t}$.
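
The steps of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: a linear function over one-hot state features stands in for the neural network so that the example needs no external libraries, the loss is taken to be the squared temporal-difference error commonly used with DQN, and the environment `toy_env_step`, the helper names, and all hyperparameter values (`GAMMA`, `LR`, `EPSILON`, `BATCH`, `C`, `CAPACITY`) are hypothetical.

```python
import random

# All sizes and hyperparameters below are assumed values, not taken from the paper.
N_STATES, N_ACTIONS = 3, 2           # hypothetical: 3 load levels; 0 = local, 1 = offload
GAMMA, LR, EPSILON = 0.9, 0.1, 0.1   # discount factor, learning rate, exploration rate
BATCH, C, CAPACITY = 8, 20, 500      # mini-batch size, target update period, memory size


def features(s):
    """One-hot encoding of a discrete state."""
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]


def q_values(theta, s):
    """Q(s, a; theta) for every action, using a linear stand-in for the network."""
    x = features(s)
    return [sum(w * xi for w, xi in zip(theta[a], x)) for a in range(N_ACTIONS)]


def toy_env_step(s, a, rng):
    """Hypothetical environment: offloading is rewarded only under high load (s == 2)."""
    best_action = 1 if s == 2 else 0
    reward = 1.0 if a == best_action else -1.0
    return reward, rng.randrange(N_STATES)


def train(episodes=200, horizon=20, seed=0):
    rng = random.Random(seed)
    memory = []                                            # step 1: replay memory
    theta = [[0.0] * N_STATES for _ in range(N_ACTIONS)]   # step 2: evaluation weights
    theta_target = [row[:] for row in theta]               # step 2: target weights
    step = 0
    for _ in range(episodes):                              # step 3
        s = rng.randrange(N_STATES)                        # step 5: initial state
        for _ in range(horizon):                           # step 4
            if rng.random() < EPSILON:                     # step 6: explore
                a = rng.randrange(N_ACTIONS)
            else:                                          # step 6: exploit (arg max)
                q = q_values(theta, s)
                a = q.index(max(q))
            r, s_next = toy_env_step(s, a, rng)            # step 7
            memory.append((s, a, r, s_next))               # step 8
            if len(memory) > CAPACITY:
                memory.pop(0)                              # drop the oldest transition
            batch = rng.sample(memory, min(BATCH, len(memory)))  # step 9
            for bs, ba, br, bs_next in batch:              # step 10
                target = br + GAMMA * max(q_values(theta_target, bs_next))
                td_error = target - q_values(theta, bs)[ba]
                x = features(bs)
                for i in range(N_STATES):
                    # Gradient-descent step on 0.5 * td_error ** 2 with respect to theta[ba]
                    theta[ba][i] += LR * td_error * x[i]
            step += 1
            if step % C == 0:                              # step 11
                theta_target = [row[:] for row in theta]
            s = s_next
    return theta


def greedy_policy(theta):
    """Step 14: the greedy action for each state under the learned Q-function."""
    policy = []
    for s in range(N_STATES):
        q = q_values(theta, s)
        policy.append(q.index(max(q)))
    return policy
```

Calling `greedy_policy(train())` returns one action index per toy state. In a setting closer to the paper's, the linear approximator would be replaced by a neural network and `toy_env_step` by a model of the vehicular offloading environment. Keeping `theta_target` fixed between the periodic copies is what separates the target network from the evaluation network in step 11.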