Author manuscript; available in PMC: 2024 Mar 2.
Published in final edited form as: Proc Mach Learn Res. 2023 Apr;206:6245–6262.

Table 2: Single Behavior Policy With Biased Timing

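| Method | Low dosage | Optimal dosage | High dosage |
|--------|--------------|----------------|--------------|
| CTDT | 356.7 ± 1.3 | 379.4 ± 0.1 | 328.0 ± 11.7 |
| DT | 331.4 ± 42.5 | 382.5 ± 38.5 | 325.1 ± 5.8 |
| BC | 27.1 ± 11.2 | 375.5 ± 67.8 | 318.8 ± 32.3 |
| BCQ | 96.1 ± 6.0 | 182.2 ± 5.7 | 239.9 ± 89.4 |
| CQL | 159.7 ± 24.5 | 160.2 ± 11.5 | 155.1 ± 24.4 |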