
Table 1. POMDP active learning approach.

Active Learning with Bayes Risk
• Sample POMDPs from a prior distribution.
• Complete a task, choosing actions based on Bayes risk (see the sketch after this table):
  – Use the POMDP samples to compute the action with minimal Bayes risk (Section 4.1).
  – If the risk exceeds a given threshold ξ, perform a meta-query (Section 4.1).
  – Update each POMDP sample's belief based on the observation received (Section 4.2).
• Once a task is completed, update the prior (Section 4.2):
  – Use a kernel incorporating the action-observation history to propagate POMDP samples.
  – Weight the POMDP samples according to the meta-query history.
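
The action-selection step can be made concrete with a short sketch. This is not the paper's implementation; it assumes a hypothetical interface in which each sampled model exposes a `q_value(belief, action)` method and each sample carries its own belief and posterior weight:

```python
def select_action(models, beliefs, weights, actions, xi):
    """Bayes-risk action selection over sampled POMDPs (hypothetical interface).

    models  : sampled POMDP models, each with a q_value(belief, action) method
    beliefs : one belief vector per sampled model
    weights : posterior weight of each sample (assumed to sum to 1)
    xi      : risk threshold beyond which a meta-query is triggered
    """
    # Risk of action a: expected loss, over the model samples, of taking a
    # instead of each model's own optimal action. The risk is always <= 0.
    risks = {}
    for a in actions:
        loss = 0.0
        for m, b, w in zip(models, beliefs, weights):
            q_best = max(m.q_value(b, a2) for a2 in actions)
            loss += w * (m.q_value(b, a) - q_best)
        risks[a] = loss

    a_star = max(risks, key=risks.get)   # action with minimal expected loss
    if risks[a_star] < -xi:              # no action is safe enough: ask the expert
        return "meta_query", risks[a_star]
    return a_star, risks[a_star]
```

An action is executed only when its expected loss relative to each sample's own optimal action stays within ξ of zero; otherwise the agent falls back to a meta-query, as in the second sub-step above.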

Performance and termination bounds are provided in Sections 4.3 and 4.4.
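
For completeness, here is a sketch of the per-sample updates referenced in the table (Section 4.2). The belief update is the standard POMDP filter; the reweighting shown is a simplified stand-in for the kernel-based posterior update, assuming each sample stores its transition and observation models as numpy arrays (`T[a]` of shape |S|×|S|, `O[a]` of shape |S|×|O|):

```python
import numpy as np

def update_belief(T, O, b, a, o):
    """Standard POMDP belief update for one sampled model.

    T[a][s, s'] = P(s' | s, a);  O[a][s', o] = P(o | s', a);  b is a
    distribution over states.
    """
    b_next = O[a][:, o] * (T[a].T @ b)   # predict with T, correct with O
    return b_next / b_next.sum()

def reweight(weights, likelihoods):
    """Reweight samples by how well each explains the meta-query answers.

    likelihoods[i] = P(expert answers | model i); a simplification of the
    kernel-based prior update described in the table.
    """
    w = np.asarray(weights) * np.asarray(likelihoods)
    return w / w.sum()
```

Each sampled model maintains its own belief with `update_belief` during the task; after the task, `reweight` shifts posterior mass toward samples whose optimal actions agreed with the expert's meta-query answers.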