| Algorithm 1: Our proposed Q-learning algorithm for the UAV path planning problem. |
| --- |
| Input: Source location, destination location, and solution space |
| Output: Optimal path for the UAV from source to destination |
| 1: Initialize the Q-table; |
| 2: for each episode do |
| 3: Set a random state from the state set; |
| 4: while the current state is not the destination do |
| 5: for each action available in the current state do |
| 6: Determine the location the agent would reach by taking the action |
| 7: Calculate the distance from that location to the target location |
| 8: Choose the action that corresponds to the smallest of the calculated distances |
| 9: Choose the action that makes the agent move closer to the target location |
| 10: end for |
| 11: Perform the chosen action and receive a penalty or reward |
| 12: Update the Q-table |
| 13: end while |
| 14: end for |
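
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the square grid, the four-move action set, the Euclidean distance, the reward values (+100 at the destination, -1 per step), the hyperparameters `alpha` and `gamma`, and the rule of breaking distance ties by the larger Q-value are all hypothetical choices that the listing does not specify.

```python
import math
import random

# Hypothetical 4-connected grid moves; the source does not list the action set.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state, action, size):
    """Return the cell reached from `state` by `action`, clamped to the grid."""
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), size - 1)
    y = min(max(state[1] + dy, 0), size - 1)
    return (x, y)


def guided_action(q, state, goal, size):
    """Steps 5-10: keep the actions whose next cell is nearest the goal,
    then break ties by the larger Q-value (tie-break rule is an assumption)."""
    dist = {a: math.dist(step(state, a, size), goal) for a in ACTIONS}
    best = min(dist.values())
    nearest = [a for a in ACTIONS if dist[a] == best]
    return max(nearest, key=lambda a: q[(state, a)])


def train(size, goal, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Run the episode loop of Algorithm 1 and return the learned Q-table."""
    rng = random.Random(seed)
    states = [(x, y) for x in range(size) for y in range(size)]
    q = {(s, a): 0.0 for s in states for a in ACTIONS}        # step 1
    for _ in range(episodes):                                  # step 2
        state = rng.choice(states)                             # step 3
        while state != goal:                                   # step 4
            action = guided_action(q, state, goal, size)       # steps 5-10
            nxt = step(state, action, size)                    # step 11
            reward = 100.0 if nxt == goal else -1.0            # assumed rewards
            # No bootstrapping from the terminal (destination) state.
            future = 0.0 if nxt == goal else max(q[(nxt, a)] for a in ACTIONS)
            # Step 12: standard Q-learning update.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q


def extract_path(q, source, goal, size):
    """Follow the same distance-guided rule from source to destination."""
    path = [source]
    while path[-1] != goal:
        action = guided_action(q, path[-1], goal, size)
        path.append(step(path[-1], action, size))
    return path


q_table = train(size=6, goal=(5, 5))
path = extract_path(q_table, source=(0, 0), goal=(5, 5), size=6)
print(path)
```

On an obstacle-free grid, the distance-guided choice always moves the agent one cell closer to the destination, so every episode terminates and the extracted path has the minimum number of moves. In this sketch the Q-values only break ties between equally close moves; handling obstacles or no-fly zones would need an additional penalty term and an exploration rule, neither of which is shown here.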