Author manuscript; available in PMC: 2014 Dec 23.

Published in final edited form as: Biometrika. 2014 Oct 20;101(4):831–847. doi: 10.1093/biomet/asu043

Fig. 3. A scenario where IQ-learning achieves a large gain in value over Q-learning. The constant C determines the second-stage treatment effect size, from no treatment effects (C = 0) to large effects (C = 2). In the left panel, (X₁, C) pairs where linear Q-learning agrees and disagrees with the true first-stage rule are shown in dark and light gray, respectively, where X₁ is a normally distributed first-stage covariate. The right panel shows the average proportion of optimal value attained by the normal IQ-learning estimator, nonparametric IQ-learning estimator, support vector regression Q-learning, and linear Q-learning regimes, drawn as gray lines with squares, gray lines with circles, light gray lines, and black lines, respectively.