Author manuscript; available in PMC: 2019 Aug 22.
Published in final edited form as: J Mach Learn Res. 2019;20:48.

Table 2:

Agreement between the estimated optimal decision rules yielded by different methods, and between each rule and the observed treatment. OWL-Logit: OWL using logistic loss; EARL-logit: EARL using logistic loss; EARL-exp: EARL using exponential loss; EARL-hinge: EARL using hinge loss; EARL-sqhinge: EARL using squared hinge loss; QL: Q-learning.

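| | OWL-Logit | EARL-logit | EARL-exp | EARL-hinge | EARL-sqhinge | QL |
|---|---|---|---|---|---|---|
| OWL-Logit | 1 | 0.642 | 0.821 | 0.639 | 0.639 | 0.577 |
| EARL-logit | | 1 | 0.588 | 0.996 | 0.996 | 0.920 |
| EARL-exp | | | 1 | 0.591 | 0.591 | 0.529 |
| EARL-hinge | | | | 1 | 1 | 0.916 |
| EARL-sqhinge | | | | | 1 | 0.916 |
| QL | | | | | | 1 |
| Observed | 0.193 | 0.449 | 0.117 | 0.453 | 0.453 | 0.507 |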
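
For readers who wish to build a table of this kind, the following is a minimal sketch. It assumes that "agreement" means the simple proportion of subjects for whom two rules recommend the same treatment; the excerpt does not state the exact measure, so this is an assumption. The function names `agreement` and `agreement_table` and the toy treatment vectors are hypothetical, chosen for illustration, and are not data from the study.

```python
from itertools import combinations


def agreement(rule_a, rule_b):
    """Proportion of subjects assigned the same treatment by both rules.

    Assumption: agreement is the plain matching proportion, which the
    source table does not explicitly confirm.
    """
    if len(rule_a) != len(rule_b):
        raise ValueError("both rules must cover the same subjects")
    if len(rule_a) == 0:
        raise ValueError("at least one subject is required")
    matches = sum(a == b for a, b in zip(rule_a, rule_b))
    return matches / len(rule_a)


def agreement_table(rules):
    """Upper-triangular agreement table keyed by (row name, column name)."""
    names = list(rules)
    # Every rule agrees perfectly with itself (the diagonal of the table).
    table = {(name, name): 1.0 for name in names}
    # Agreement is symmetric, so each unordered pair is computed once.
    for a, b in combinations(names, 2):
        table[(a, b)] = agreement(rules[a], rules[b])
    return table


# Hypothetical treatment recommendations coded as +1 / -1 for five subjects.
toy_rules = {
    "rule_A": [1, 1, -1, -1, 1],
    "rule_B": [1, -1, -1, -1, 1],
    "observed": [-1, -1, -1, 1, 1],
}

table = agreement_table(toy_rules)
for (row, col), value in table.items():
    print(f"{row:>8} vs {col:<8} {value:.3f}")
```

Because the measure is symmetric, only the upper triangle needs to be stored, which matches the layout of the table above.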