| Algorithm 1: DQN-based dynamic SFC deployment algorithm |
| --- |
| Input: The underlying network state and the set of dynamically arriving SFC requests. |
| Output: Dynamic SFC deployment policy. |
| 1: Initialize the action-value function $Q(s, a; \theta)$, where $\theta$ are the randomly generated neural network weights. |
| 2: Initialize the target action-value function $\hat{Q}(s, a; \theta^{-})$, where $\theta^{-} = \theta$. |
| 3: Initialize the experience pool $D$ with memory $N$. |
| 4: for episode in range(EPISODES): |
| 5: Generate a new collection of SFCs. |
| 6: Initialize state $s_1$. |
| 7: for step in range(STEPS): |
| 8: Select the nodes that satisfy the resource and delay requirements. |
| 9: Among the nodes that satisfy the deployment requirements, select the $m$ nodes closest to the last deployed node and add them to the candidate set. |
| 10: With probability $\varepsilon$, select an action $a_t$ at random. |
| 11: Otherwise, select the action $a_t = \arg\max_{a} Q(s_t, a; \theta)$. |
| 12: Execute action $a_t$ and observe reward $r_t$. |
| 13: Store transition $(s_t, a_t, r_t, s_{t+1})$ in $D$. |
| 14: Sample a random minibatch of transitions $(s_j, a_j, r_j, s_{j+1})$ from $D$. |
| 15: Set $y_j = r_j$ if the episode terminates at step $j+1$; otherwise set $y_j = r_j + \gamma \max_{a'} \hat{Q}(s_{j+1}, a'; \theta^{-})$. |
| 16: Perform a gradient descent step on $(y_j - Q(s_j, a_j; \theta))^2$ with respect to the network parameters $\theta$. |
| 17: Every $C$ steps, reset $\hat{Q} = Q$. |
| 18: End. |
| 19: End. |
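
The inner loop of Algorithm 1 (steps 8 to 17) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration and is not taken from the original implementation: a lookup table stands in for the neural network, "closest" is measured by a hypothetical per-node distance value, feasibility is reduced to one CPU check and one distance bound, and all function and variable names (`candidate_nodes`, `select_action`, `td_target`, `train_step`, `sync_target`) are invented.

```python
import random
from collections import deque


def candidate_nodes(cpu_free, dist_from_last, cpu_demand, max_dist, m):
    """Steps 8-9: keep feasible nodes, then the m closest to the last deployed node.

    Assumption: feasibility is a CPU check plus a distance bound standing in
    for the delay requirement.
    """
    feasible = [n for n in cpu_free
                if cpu_free[n] >= cpu_demand and dist_from_last[n] <= max_dist]
    return sorted(feasible, key=lambda n: dist_from_last[n])[:m]


def select_action(q_row, candidates, epsilon, rng):
    """Steps 10-11: epsilon-greedy choice restricted to the candidate set."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda a: q_row[a])


def td_target(reward, next_q_row, done, gamma):
    """Step 15: bootstrap from the target values unless the episode ended."""
    return reward if done else reward + gamma * max(next_q_row)


def train_step(q, q_target, batch, gamma, lr):
    """Step 16, with a table update in place of a network gradient step."""
    for s, a, r, s_next, done in batch:
        y = td_target(r, q_target[s_next], done, gamma)
        # Gradient descent on 0.5 * (y - Q[s][a])^2 with respect to Q[s][a].
        q[s][a] += lr * (y - q[s][a])


def sync_target(q, q_target):
    """Step 17: copy the online values into the target table."""
    for s in q:
        q_target[s] = list(q[s])


# Toy usage: 2 abstract states, 4 substrate nodes (one action per node).
rng = random.Random(0)
q = {s: [0.0] * 4 for s in range(2)}
q_target = {s: list(row) for s, row in q.items()}
replay = deque(maxlen=1000)          # experience pool D with memory N = 1000

cpu_free = {0: 4, 1: 1, 2: 8, 3: 6}  # hypothetical free CPU per node
dist = {0: 2, 1: 1, 2: 3, 3: 9}      # hypothetical distance to last deployed node

cands = candidate_nodes(cpu_free, dist, cpu_demand=2, max_dist=5, m=2)
action = select_action(q[0], cands, epsilon=0.1, rng=rng)
replay.append((0, action, 1.0, 1, False))  # (s, a, r, s_next, done), made-up reward
batch = rng.sample(list(replay), k=min(len(replay), 32))
train_step(q, q_target, batch, gamma=0.9, lr=0.1)
sync_target(q, q_target)             # in practice only every C steps
```

Restricting the greedy choice to the candidate set keeps the agent from picking infeasible nodes. Whether the maximum in the target of step 15 should be restricted in the same way is not stated in the algorithm, so this sketch takes it over all actions.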