Author manuscript; available in PMC: 2024 May 3.

Published in final edited form as: Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1–15.

Table 2:

Class-averaged results on Task 1 (attack, investigation, mount), reported as mean ± standard deviation over 5 runs. See the appendix for per-class results. The “All Tasks” column indicates that the model was jointly trained on all three tasks, and “all splits” indicates that both the labeled train set and trajectories from the unlabeled test set were used.

Method	Training data: Task 1 (train split)	Training data: Unlabeled Set	Training data: All Tasks (all splits)	Average F1	MAP
Baseline	✓			.793 ± .011	.856 ± .010
Baseline w/ task prog	✓	✓		.829 ± .004	.889 ± .004
MABe 2021 Task 1 Top-1	✓	✓	✓	.864 ± .011	.914 ± .009
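
As an illustration of how an entry in the Average F1 column could be assembled, the following is a minimal sketch that averages per-class F1 over the three behavior classes and then summarizes several runs as mean ± standard deviation. The function names, the toy labels (including an "other" background class), and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the evaluation code behind the table.

```python
from statistics import mean, stdev

# Behavior classes averaged in the table caption.
CLASSES = ("attack", "investigation", "mount")


def f1_score(y_true, y_pred, label):
    """Per-class F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def class_averaged_f1(y_true, y_pred, labels=CLASSES):
    """Unweighted mean of per-class F1 over the given labels."""
    return mean(f1_score(y_true, y_pred, c) for c in labels)


def summarize_runs(scores):
    """Mean and sample standard deviation over runs (assumption: sample, not population)."""
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Hypothetical frame-level labels for a single run.
    y_true = ["attack", "investigation", "mount", "other", "attack", "investigation"]
    y_pred = ["attack", "investigation", "mount", "attack", "other", "mount"]
    print(f"class-averaged F1: {class_averaged_f1(y_true, y_pred):.3f}")

    # Hypothetical per-run scores, only to show the "mean ± std" formatting.
    m, s = summarize_runs([0.5, 0.7])
    print(f"{m:.3f} ± {s:.3f}")
```

The average here is unweighted across classes, so a rare class such as attack counts as much as a frequent one. Whether the table uses exactly this averaging is not stated in the caption.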