2021 Apr 21;15:662181. doi: 10.3389/fnbot.2021.662181

Figure 1.

Model components and behavior. The user model comprises a generative model (1) that gives the probability of drawing a particular n-dimensional action q_k at instant k. The action q_k contributes to a smooth trajectory of control signals s_k for the interface (2). The interface map computes a vector p_k of dimension m < n and returns it to the user as feedback on the smooth control action s_k (3). The user model then assigns a reward to the action based on that feedback (4). After each iteration, the map recursively updates its parameters, with learning rate γ, based on the distribution of the observed user commands s_k. The user updates its generative model, with learning rate η, once feedback is received, reinforcing the generated smooth action according to its reward.
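The loop described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the caption does not give the update equations, so every rule below is an assumption. The generative model is taken to be a Gaussian around a mean `mu`, the smoothing is a first-order low-pass filter with a hypothetical coefficient `alpha`, the map update is Oja's subspace rule (a standard online stand-in for a recursively updated linear map), and the reward is a hypothetical closeness-to-target score. The names `coadaptation_step`, `target`, `sigma`, and `alpha` are invented for the example.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def coadaptation_step(mu, s_prev, W, target, eta, gamma, rng,
                      sigma=0.5, alpha=0.8):
    """One iteration of the user/interface loop, steps (1)-(4) plus updates.

    mu     : mean of the user's generative model (length n), ASSUMED Gaussian
    s_prev : previous smooth control signal (length n)
    W      : interface map, m rows of length n, with m < n
    target : hypothetical m-dimensional goal used to score the feedback
    eta    : user learning rate; gamma : map learning rate
    """
    n = len(mu)
    # (1) Draw an n-dimensional action q_k from the generative model.
    q = [rng.gauss(mu[i], sigma) for i in range(n)]
    # (2) Smooth it into a control signal s_k (ASSUMED low-pass filter).
    s = [alpha * s_prev[i] + (1.0 - alpha) * q[i] for i in range(n)]
    # (3) The interface maps s_k to an m-dimensional feedback vector p_k.
    p = [dot(row, s) for row in W]
    # (4) The user assigns a reward to the feedback (HYPOTHETICAL rule:
    #     closer to the target gives a reward nearer to 1).
    err2 = sum((pi - ti) ** 2 for pi, ti in zip(p, target))
    reward = 1.0 / (1.0 + err2)
    # Map update with rate gamma (ASSUMED: Oja's subspace rule,
    # W += gamma * (p s^T - p p^T W), driven by the observed command s_k).
    recon = [sum(p[k] * W[k][j] for k in range(len(W))) for j in range(n)]
    W_new = [[W[i][j] + gamma * p[i] * (s[j] - recon[j]) for j in range(n)]
             for i in range(len(W))]
    # User update with rate eta (ASSUMED: shift the mean toward the
    # smooth action in proportion to its reward).
    mu_new = [mu[i] + eta * reward * (s[i] - mu[i]) for i in range(n)]
    return mu_new, s, W_new, p, reward


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 4, 2
    mu, s = [0.0] * n, [0.0] * n
    W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    for k in range(100):
        mu, s, W, p, r = coadaptation_step(mu, s, W, [1.0, -1.0],
                                           eta=0.1, gamma=0.01, rng=rng)
    print(p, r)
```

Setting `gamma` to zero freezes the map so that only the user adapts, and setting `eta` to zero freezes the user so that only the map adapts, which makes it easy to compare one-sided learning against coadaptation in this toy setting.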