Author manuscript; available in PMC: 2018 Jan 1.

Published in final edited form as: Adv Neural Inf Process Syst. 2017 Dec;30:5973–5981.

Nonlinear baseline reward g, in a scenario with 2 nonzero actions and a reward function based on collected HeartSteps data. Cumulative regret is shown for the proposed Action-Centered approach compared with a baseline contextual bandit; the median is computed over 100 random trials.