Algorithm 1: DQN-based dynamic SFC deployment algorithm
Input: The underlying network state $s_t$ and the set of dynamically arriving SFC requests $\{r_1, r_2, \dots, r_m\}$.
Output: Dynamic SFC deployment policy $\pi_1$.
1: Initialize the action-value function $Q(s_t, a; \theta)$, where $\theta$ are randomly generated neural network weights.
2: Initialize the target action-value function $\hat{Q}(s_t, a; \theta^-)$, where $\theta^- = \theta$.
3: Initialize the experience replay pool $D$ with capacity $N$.
4: for episode in range(EPISODES):
5:   Generate a new collection of SFCs.
6:   Initialize state $s$.
7:   for step in range(STEPS):
8:     Select the nodes that satisfy the resource and delay requirements.
9:     Among the qualifying nodes, select the $m$ nodes closest to the last deployed node and add them to the candidate set $\Phi$.
10:    With probability $\varepsilon$, select a random action $a_t \in \Phi$.
11:    Otherwise, select the action $a_t = \arg\max_{a \in \Phi} Q(s_t, a; \theta)$.
12:    Execute action $a_t$ and observe reward $r_t$.
13:    Store the transition $e_t = (s_t, a_t, r_t, s_{t+1})$ in $D$.
14:    Sample a random minibatch of transitions $(s_j, a_j, r_j, s_{j+1})$ from $D$.
15:    Set $y_j = \begin{cases} r_j, & \text{if step } j \text{ ends the episode} \\ r_j + \gamma \max_{a'} \hat{Q}(s_{j+1}, a'; \theta^-), & \text{otherwise.} \end{cases}$
16:    Perform a gradient descent step on $(y_j - Q(s_j, a_j; \theta))^2$ with respect to the network parameters $\theta$.
17:    Every $C$ steps, reset $\hat{Q} = Q$.
18:   End.
19: End.
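
For readers who want to see how the pieces of Algorithm 1 fit together, the following is a minimal PyTorch sketch of the same training loop. Everything concrete in it is a placeholder assumption: `DummySFCEnv`, the state and action dimensions, the reward, and all hyperparameter values are hypothetical, since the paper's actual network model, feasibility checks, and reward function are not given in this excerpt. The sketch only illustrates the DQN mechanics the algorithm relies on: $\varepsilon$-greedy selection restricted to the candidate set $\Phi$ (via Q-value masking), experience replay, and the periodic target-network update.

```python
import random
from collections import deque

import torch
import torch.nn as nn

# Hypothetical sizes and hyperparameters (not specified in this excerpt).
STATE_DIM, N_NODES = 16, 10
EPISODES, STEPS = 200, 50
CAPACITY_N, BATCH, GAMMA, EPS, C = 10_000, 32, 0.99, 0.1, 100


class DummySFCEnv:
    """Stand-in for the SFC deployment environment. The real model's
    resource/delay checks and reward are not reproduced here; this stub
    only lets the sketch run end to end."""

    def reset(self):
        # Lines 5-6: generate a new SFC collection, return the initial state.
        return torch.randn(STATE_DIM)

    def candidates(self, state):
        # Lines 8-9: nodes meeting resource/delay limits, nearest m kept (Phi).
        return random.sample(range(N_NODES), random.randint(1, N_NODES))

    def step(self, action):
        # Line 12: deploy on the chosen node, return (s_{t+1}, r_t, done).
        return torch.randn(STATE_DIM), random.random(), random.random() < 0.05


def make_qnet():
    # Lines 1-2: action-value network, one Q-value per physical node.
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                         nn.Linear(64, N_NODES))


q_net, target_net = make_qnet(), make_qnet()
target_net.load_state_dict(q_net.state_dict())        # theta^- = theta
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=CAPACITY_N)                     # experience pool D


def select_action(state, phi):
    # Lines 10-11: epsilon-greedy restricted to the candidate set Phi.
    if random.random() < EPS:
        return random.choice(phi)
    with torch.no_grad():
        q = q_net(state)
    mask = torch.full((N_NODES,), float("-inf"))      # exclude nodes outside Phi
    mask[phi] = 0.0
    return int((q + mask).argmax())


env, step_count = DummySFCEnv(), 0
for episode in range(EPISODES):
    state = env.reset()
    for _ in range(STEPS):
        action = select_action(state, env.candidates(state))
        next_state, reward, done = env.step(action)
        replay.append((state, action, reward, next_state, done))   # line 13
        state = next_state
        if len(replay) >= BATCH:
            batch = random.sample(list(replay), BATCH)             # line 14
            s, a, r, s2, d = zip(*batch)
            s, s2 = torch.stack(s), torch.stack(s2)
            a = torch.tensor(a)
            r = torch.tensor(r, dtype=torch.float32)
            d = torch.tensor(d, dtype=torch.float32)
            with torch.no_grad():                                  # line 15
                y = r + GAMMA * (1.0 - d) * target_net(s2).max(dim=1).values
            q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(q_sa, y)                 # line 16
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        step_count += 1
        if step_count % C == 0:                                    # line 17
            target_net.load_state_dict(q_net.state_dict())
        if done:
            break
```

The `(1.0 - d)` factor in the target computation is the code-level form of the piecewise definition in line 15: for terminal transitions the bootstrapped term is zeroed out and the target reduces to $r_j$. Masking infeasible nodes with $-\infty$ before the argmax is one common way to realize the restricted maximization $a \in \Phi$ of line 11.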