Optimization of On-Demand Shared Autonomous Vehicle Deployments Utilizing Reinforcement Learning

Sensors (Basel). 2022 Oct 29;22(21):8317. doi: 10.3390/s22218317

Algorithm 1. Q-learning algorithm
Initialize Q-table of size (states, actions)
Choose reward discount, learning rate, and exploration rate
Do
    Choose a random number between 0 and 1
    If the random number is less than the exploration rate
        Choose a random action
    Else
        Choose the action with the maximum Q-value
    Perform the chosen action
    Observe the reward and the next state
    Update the Q entry based on the previously defined Q-update
Until reward threshold is achieved
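
The loop in Algorithm 1 can be illustrated with a minimal Python sketch. This is not the authors' implementation. The environment (`chain_step`, a five-state chain), the hyperparameter values, and the stopping rule (average return over the last `window` episodes) are hypothetical choices made for illustration only. The update in `q_update` assumes the standard tabular form Q(s, a) ← Q(s, a) + α[r + γ max Q(s′, a′) − Q(s, a)], which may differ in detail from the Q-update defined earlier in the paper.

```python
import random
from collections import deque


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Assumed standard tabular update: Q += alpha * (r + gamma * max Q' - Q)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    best = max(q[state])
    # Break ties at random so all-zero rows are not biased toward action 0.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def chain_step(state, action, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.

    Each move costs -1; reaching the last state pays +10 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (10.0 if done else -1.0), done


def train(n_states=5, n_actions=2, alpha=0.5, gamma=0.9, epsilon=0.2,
          reward_threshold=5.0, window=10, max_episodes=500, max_steps=100,
          seed=0):
    """Run Q-learning until the recent average return reaches the threshold."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]  # Q-table (states, actions)
    recent = deque(maxlen=window)  # returns of the most recent episodes
    for episode in range(1, max_episodes + 1):
        state, episode_return = 0, 0.0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = chain_step(state, action, n_states)
            # The terminal row is never updated, so its max stays at 0.
            q_update(q, state, action, reward, next_state, alpha, gamma)
            state, episode_return = next_state, episode_return + reward
            if done:
                break
        recent.append(episode_return)
        # Assumed stopping rule: average recent return reaches the threshold.
        if len(recent) == window and sum(recent) / window >= reward_threshold:
            break
    return q, episode


if __name__ == "__main__":
    q_table, episodes = train()
    print(f"stopped after {episodes} episodes")
    for s, row in enumerate(q_table):
        print(s, [round(v, 2) for v in row])
```

The cap `max_episodes` is a safeguard added in this sketch so that the loop terminates even if the threshold is never reached; Algorithm 1 itself specifies only the reward-threshold condition.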