Real-Time Online Goal Recognition in Continuous Domains via Deep Reinforcement Learning

2023 Oct 4;25(10):1415. doi: 10.3390/e25101415

Algorithm 2 Online Infer most likely Goal for the Observations

Require: T_{π}(G): state S and action A spaces in the continuous domain, and policy evaluation networks Q_{π_{g}}

Require: G: a set of candidate goals

Require: O: an observation sequence O = 〈s_{0}, a_{0}, s_{1}, a_{1}, \dots〉
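The inputs listed above can be made concrete with a small sketch. The steps of the algorithm are not shown here, so the scoring rule below (sum each goal's Q-values over the observed state-action pairs, then apply a softmax) is an assumption chosen for illustration and is not necessarily the paper's method. The names `goal_posterior`, `make_toy_q`, `g_left` and `g_right` are hypothetical, and the toy Q-functions stand in for trained policy evaluation networks Q_{π_{g}}.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

State = Tuple[float, ...]
Action = Tuple[float, ...]
# Stand-in for a trained policy evaluation network Q_{pi_g}(s, a).
QFunction = Callable[[State, Action], float]


def goal_posterior(
    q_functions: Dict[str, QFunction],
    observations: Sequence[Tuple[State, Action]],
) -> Dict[str, float]:
    """Score each candidate goal by summed Q-values, then softmax-normalise.

    Assumption: the summed-Q scoring rule is illustrative only.
    """
    scores = {
        goal: sum(q(s, a) for s, a in observations)
        for goal, q in q_functions.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {goal: math.exp(score - top) for goal, score in scores.items()}
    total = sum(weights.values())
    return {goal: w / total for goal, w in weights.items()}


def make_toy_q(goal_xy: State) -> QFunction:
    """Hypothetical Q-function: negative distance to the goal after the move."""
    def q(state: State, action: Action) -> float:
        next_state = tuple(s + a for s, a in zip(state, action))
        return -math.dist(next_state, goal_xy)
    return q


# G: a set of candidate goals (hypothetical 2-D positions).
candidate_goals = {"g_left": (-5.0, 0.0), "g_right": (5.0, 0.0)}
q_functions = {name: make_toy_q(xy) for name, xy in candidate_goals.items()}

# O = <s_0, a_0, s_1, a_1, ...> stored as (state, action) pairs.
observations = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (1.0, 0.0))]

# Online use: re-evaluate the posterior after each new observation arrives.
for t in range(1, len(observations) + 1):
    posterior = goal_posterior(q_functions, observations[:t])
    most_likely = max(posterior, key=posterior.get)
    print(t, most_likely, posterior)
```

Storing O as (state, action) pairs lets the estimate be updated each time a new pair arrives, which is what makes the recognition online. In this toy setup the agent moves in the positive x direction, so the estimate favours `g_right` from the first observation onward.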