Author manuscript; available in PMC: 2024 May 3.
Published in final edited form as: Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1–15.

Table 4:

Class-averaged results on Task 3 over the 7 behaviors of interest (mean ± standard deviation over 5 runs). The average F1 score in parentheses is the score obtained with threshold tuning. See the appendix for per-class results.

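| Method | Data Used During Training: Task 1 (train split) | Data Used During Training: Task 3 (train split) | Data Used During Training: Unlabeled Set | Average F1 | MAP |
|---|---|---|---|---|---|
| Baseline | | | | 0.338 ± 0.004 | 0.317 ± 0.005 |
| Baseline w/ task prog | | | | 0.328 ± 0.009 | 0.320 ± 0.009 |
| MABe 2021 Task 3 Top-1 | | | | 0.319 ± 0.025 (0.363 ± 0.020) | 0.352 ± 0.023 |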
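
The parenthesized F1 value in the table reflects threshold tuning, and all entries are reported as mean ± standard deviation over runs. The source does not describe the tuning procedure, so the following is only a minimal sketch of one common approach: sweeping a grid of decision thresholds for a single behavior and keeping the one with the highest F1. The function names, the threshold grid, and the use of the population standard deviation are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def f1_score_binary(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when there are no true positives."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))


def tune_threshold(y_true, scores, candidates=None):
    """Return (threshold, F1) for the candidate threshold with the highest F1.

    The default grid (0.05 to 0.95 in 19 steps) is an assumption.
    Ties are resolved in favor of the lowest threshold.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    scores = np.asarray(scores, dtype=float)
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        f1 = f1_score_binary(y_true, scores >= t)
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1


def summarize_runs(values):
    """Mean and population standard deviation (ddof=0, an assumption) over runs."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std())
```

In practice the threshold would be chosen on held-out validation data for each behavior, the per-behavior F1 scores averaged across the classes, and `summarize_runs` applied to the per-run averages.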