Author manuscript; available in PMC: 2009 Sep 22.

Published in final edited form as: J Artif Intell Res. 2008 Jul 1;32(2):663–704.

Algorithm 3.1. Generic Online Algorithm.

1:	Function OnlinePOMDPSolver()
	Static: b_c: The current belief state of the agent.
	        T: An AND-OR tree representing the current search tree.
	        D: Expansion depth.
	        L: A lower bound on V^*.
	        U: An upper bound on V^*.
2:	b_c ← b₀
3:	Initialize T to contain only b_c at the root
4:	while not ExecutionTerminated() do
5:	    while not PlanningTerminated() do
6:	        b^* ← ChooseNextNodeToExpand()
7:	        Expand(b^*, D)
8:	        UpdateAncestors(b^*)
9:	    end while
10:	    Execute best action â for b_c
11:	    Perceive a new observation z
12:	    b_c ← τ(b_c, â, z)
13:	    Update tree T so that b_c is the new root
14:	end while
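
The interleaving of planning and execution in Algorithm 3.1 can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the two-state "tiger" model, its probabilities and rewards, and all function names (`trans`, `obs`, `reward`, `tau`, `q_value`, `online_pomdp_solver`) are hypothetical and chosen only for illustration. The planning phase is simplified to a fixed-depth lookahead rebuilt from the current belief at every step, standing in for the incremental ChooseNextNodeToExpand / Expand / UpdateAncestors loop, and fringe nodes are valued at 0 as a placeholder for the bounds L and U.

```python
import random

# Hypothetical two-state "tiger" POMDP, used only for illustration.
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
OBSERVATIONS = ["hear-left", "hear-right"]
GAMMA = 0.95  # assumed discount factor


def trans(s, a, s2):
    """T(s, a, s'): listening keeps the state; opening a door resets it uniformly."""
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5


def obs(s2, a, z):
    """O(s', a, z): listening is 85% accurate; other actions give no information."""
    if a == "listen":
        return 0.85 if z.split("-")[1] == s2.split("-")[1] else 0.15
    return 0.5


def reward(s, a):
    """R(s, a): small cost to listen, large penalty for opening the tiger's door."""
    if a == "listen":
        return -1.0
    return -100.0 if a.split("-")[1] == s.split("-")[1] else 10.0


def obs_prob(b, a, z):
    """Pr(z | b, a), which is also the normalizer of the belief update."""
    return sum(obs(s2, a, z) * sum(trans(s, a, s2) * b[s] for s in STATES)
               for s2 in STATES)


def tau(b, a, z):
    """Belief update tau(b, a, z) by Bayes' rule (line 12 of the algorithm)."""
    norm = obs_prob(b, a, z)
    return {s2: obs(s2, a, z) * sum(trans(s, a, s2) * b[s] for s in STATES) / norm
            for s2 in STATES}


def q_value(b, a, depth):
    """Depth-limited lookahead value of action a at belief b; fringe value is 0."""
    value = sum(b[s] * reward(s, a) for s in STATES)
    if depth > 1:
        for z in OBSERVATIONS:
            p = obs_prob(b, a, z)
            if p > 0.0:
                nxt = tau(b, a, z)
                value += GAMMA * p * max(q_value(nxt, a2, depth - 1) for a2 in ACTIONS)
    return value


def online_pomdp_solver(b0, true_state, depth=2, steps=5, rng=None):
    """Alternate planning and execution; returns a list of (action, observation, belief)."""
    rng = rng or random.Random(0)
    b_c = dict(b0)                      # line 2: b_c <- b_0
    history = []
    for _ in range(steps):              # stands in for ExecutionTerminated()
        # Planning phase (lines 5-9), simplified to a full depth-`depth` lookahead.
        a_hat = max(ACTIONS, key=lambda a: q_value(b_c, a, depth))
        # Lines 10-11: act in a simulated environment and perceive an observation.
        true_state = rng.choices(
            STATES, weights=[trans(true_state, a_hat, s2) for s2 in STATES])[0]
        z = rng.choices(
            OBSERVATIONS, weights=[obs(true_state, a_hat, o) for o in OBSERVATIONS])[0]
        b_c = tau(b_c, a_hat, z)        # line 12: belief update
        history.append((a_hat, z, b_c))
    return history
```

Calling `online_pomdp_solver({"tiger-left": 0.5, "tiger-right": 0.5}, "tiger-left")` runs five plan-act-observe cycles from a uniform initial belief. Unlike line 13 of the algorithm, this sketch discards the search tree after each step instead of re-rooting it at the new belief, which keeps the code short at the cost of repeating work that the tree-reuse step is designed to avoid.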