Table 6: Parametric value estimates for V-learning applied to type 1 diabetes data.
Action space | Basis | γ = 0.7 | γ = 0.8 | γ = 0.9 |
---|---|---|---|---|
Binary | Linear | −6.20 | −9.35 | −15.99 |
Binary | Polynomial | −3.91 | −9.03 | −17.50 |
Binary | Gaussian | −3.44 | −13.09 | −25.52 |
Multiple | Linear | −6.47 | −9.92 | −0.49 |
Multiple | Polynomial | −2.44 | −6.80 | −14.48 |
Multiple | Gaussian | −8.45 | −3.58 | −21.18 |
Observational policy | | −6.77 | −11.28 | −21.79 |