eLife. 2020 May 19;9:e50654. doi: 10.7554/eLife.50654

Figure 5. Two-level Hierarchical Gaussian Filter for continuous inputs.

(A) Schematic of the two-level HGF, which models how an agent infers a hidden state in the environment (a random variable), $x_1$, as well as its rate of change over time ($x_2$, environmental volatility). Beliefs about these two hierarchically related hidden states ($x_1$, $x_2$) at trial $k$ are updated by the sensory input for that trial ($u^{k}$, the observed feedback scores in our study) via prediction errors (PEs). The states $x_1$ and $x_2$ are continuous variables evolving as coupled Gaussian random walks, where the step size (variance) of the random walk depends on a set of parameters (shown in yellow boxes). The lowest level is coupled to the level above through the variance of the random walk: $x_1^{k} \sim \mathcal{N}\left(x_1^{k-1}, \exp(\kappa x_2^{k-1} + \omega_1)\right)$. The posterior distribution of beliefs about these states is fully determined by the sufficient statistics $\mu_i$ (mean) and $\sigma_i$ (variance) for levels $i = 1, 2$. The equations describing how expectations ($\mu_i$) change from trial $k-1$ to $k$ are Equation 6 and Equation 10. The response model generates the most probable response, $y^{k}$, according to the current beliefs, and is modulated by the response model parameters $\beta_0, \beta_1, \beta_2, \zeta$. In the winning model, the response parameter was the change between trial $k-1$ and $k$ in the degree of temporal variability across keystrokes: $y^{k} = \Delta\mathrm{cvIKI}_{\mathrm{trial}}^{k}$, normalized to the range 0–1. (B, C) Example of belief trajectories (mean, variance) associated with the two levels of the HGF for continuous inputs. Panel (C) displays the expectation on the first level, $\mu_1^{k}$, which represents an individual's expectation (posterior mean) of the true reward values for the trial, $x_1^{k}$. Black dots represent the trial-wise input (feedback scores, $u^{k}$). Panel (B) shows the trial-by-trial beliefs about log-volatility, $x_2^{k}$, determined by the expectation $\mu_2^{k}$ and the associated variance. Shaded areas denote the variance or estimation uncertainty on that level. (D) Illustration of the performance measure used as the response in the winning model, $y^{k} = \Delta\mathrm{cvIKI}_{\mathrm{trial}}^{k}$.
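
The coupled random walks described in panel (A) can be illustrated with a short simulation of the generative model. This is a minimal sketch, not the authors' code: the function name `simulate_two_level_hgf`, the default parameter values, and the observation precision `pi_u` are hypothetical, and the step variance of the second level is assumed to be constant at exp(ω2), which the caption does not state explicitly.

```python
import numpy as np

def simulate_two_level_hgf(n_trials, kappa=1.0, omega1=-4.0, omega2=-6.0,
                           x1_0=0.1, x2_0=1.0, pi_u=10.0, seed=0):
    """Sample hidden states x1, x2 and inputs u from two coupled Gaussian random walks."""
    rng = np.random.default_rng(seed)
    x1 = np.empty(n_trials + 1)
    x2 = np.empty(n_trials + 1)
    x1[0], x2[0] = x1_0, x2_0
    for k in range(1, n_trials + 1):
        # Level 2 (log-volatility): random walk with step variance exp(omega2).
        # ASSUMPTION: constant step variance at the top level.
        x2[k] = rng.normal(x2[k - 1], np.sqrt(np.exp(omega2)))
        # Level 1: step variance is set by the level above, as in the caption:
        # x1[k] ~ N(x1[k-1], exp(kappa * x2[k-1] + omega1)).
        step_var = np.exp(kappa * x2[k - 1] + omega1)
        x1[k] = rng.normal(x1[k - 1], np.sqrt(step_var))
    # Inputs: noisy observations of x1 with precision pi_u (ASSUMPTION).
    u = rng.normal(x1[1:], np.sqrt(1.0 / pi_u))
    return x1, x2, u

x1, x2, u = simulate_two_level_hgf(n_trials=100)
```

Because the variance of the lower-level walk is exp(κ·x2 + ω1), raising either x2 or ω1 makes x1 move in larger steps from trial to trial, which is the sense in which x2 acts as volatility.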

Figure 5—figure supplement 1. Trial-by-trial belief trajectories for simulated performances.

All belief trajectories were generated using the prior values on the HGF parameters shown in Table 1. We simulated performances in six agents by changing the trial-to-trial difference in IKI values across keystroke positions, which led to different trajectories of feedback scores (A) and $\mathrm{cvIKI}_{\mathrm{trial}}$ (B). We started with a pattern of IKI values of [0.2, 0.6, 0.2, 0.6, 0.2, 0.6, 0.2] s and iteratively prolonged the inter-keystroke interval at positions 2, 4, and 6. This increased the temporal difference between IKI values, the vector norm of the total IKI pattern, and the cvIKI value across keystroke positions within the trial, termed $\mathrm{cvIKI}_{\mathrm{trial}}$. In the plot, steeper and shallower slopes of change across trials in $\mathrm{cvIKI}_{\mathrm{trial}}$ and the associated feedback scores are denoted by green and pink lines, respectively. Lighter colors denote smoother trial-by-trial transitions in $\mathrm{cvIKI}_{\mathrm{trial}}$ values, whereas darker colors indicate noisier trial-by-trial changes in this measure, representing an agent whose behavioral strategy varies more from trial to trial. (C, D) Expectation on reward and log-volatility, and (E, F) the associated variance or estimation uncertainty. (G, H) Precision-weighted prediction error on reward, $\epsilon_1$, and volatility, $\epsilon_2$. A steeper slope of change in feedback scores and $\mathrm{cvIKI}_{\mathrm{trial}}$ was associated with higher log-volatility estimates and reduced uncertainty about volatility, $\sigma_2$. For a fixed slope, increasing levels of noise in the trajectories of the feedback scores and $\mathrm{cvIKI}_{\mathrm{trial}}$ also contributed to higher volatility estimates and reduced $\sigma_2$. Thus, agents either (i) introducing more fluctuations in behavior from trial to trial or (ii) observing a faster rise in scores had a higher expectation of volatility and lower uncertainty about volatility.
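
The effect of prolonging the intervals at positions 2, 4, and 6 can be checked numerically. The sketch below assumes that cvIKI is the coefficient of variation (sample standard deviation divided by the mean) of the IKI values within a trial, and it uses a hypothetical increment of 0.1 s per step; neither detail is specified in the caption. The helper names `cv_iki_trial` and `prolong` are illustrative only.

```python
import numpy as np

def cv_iki_trial(iki):
    """Coefficient of variation of the inter-keystroke intervals within one trial.

    ASSUMPTION: sample standard deviation (ddof=1) divided by the mean.
    """
    iki = np.asarray(iki, dtype=float)
    return iki.std(ddof=1) / iki.mean()

# Starting IKI pattern from the caption, in seconds.
base = np.array([0.2, 0.6, 0.2, 0.6, 0.2, 0.6, 0.2])
# Keystroke positions 2, 4, and 6 as zero-based indices.
long_positions = [1, 3, 5]

def prolong(pattern, extra):
    """Return a copy of the pattern with `extra` seconds added at positions 2, 4, 6."""
    out = pattern.copy()
    out[long_positions] += extra
    return out

# HYPOTHETICAL step size: lengthen the long intervals by 0.1 s per iteration.
patterns = [prolong(base, 0.1 * step) for step in range(4)]
cvs = [cv_iki_trial(p) for p in patterns]
norms = [float(np.linalg.norm(p)) for p in patterns]
```

Under these assumptions both `cvs` and `norms` grow with each iteration, matching the caption's statement that the manipulation increases the vector norm and the within-trial cvIKI together.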
Figure 5—figure supplement 2. Simulated trial-by-trial belief trajectories in an ideal learner.

Simulated trial-by-trial trajectories of the posterior means of belief distributions in an ideal learner with different values of $\omega_1$ (A, B) or $\omega_2$ (C, D). All trajectories were simulated with identical input scores and parameters, except for $\omega_1$ (A, B) or $\omega_2$ (C, D): $\mu_1^{(0)} = 0.1$, $\mu_2^{(0)} = 1$, $\sigma_1^{(0)} = \sigma_2^{(0)} = \log(1)$, $\kappa = 1$, $\pi_u = 1/35$ (precision of input). (A) Increasing values of $\omega_1$ (largest value $\omega_1 = -1$ in this example) led to a more pronounced general reduction in the estimate of log-volatility across trials, $\log(\mu_2)$. (B) The time series of the expectation on reward did not vary noticeably as a function of $\omega_1$. (C) Increasing values of $\omega_2$ (largest value $\omega_2 = 0$ in this example) triggered more phasic trial-by-trial changes in the log-volatility estimate, $\log(\mu_2)$, in response to prediction errors at the lower level (PEs about reward, indicated by sharp changes in the trajectories of reward expectations in panel [D]). Increasing values of $\omega_2$ correspond to higher uncertainty in the prediction on that level (see Equation 13 in 'Materials and methods'). (D) Same as panel (B), but for varying values of $\omega_2$. Expectations on reward did not change considerably as a function of $\omega_2$ in this example.
Figure 5—figure supplement 3. β coefficients of the winning response model.

(A–C) Mean (and SEM) values of the $\beta$ coefficients that explain the performance measure in trial $k$ as a linear function of (i) a constant value ($\Delta\mathrm{cvIKI}_{\mathrm{trial}}^{k}$) and (ii) the precision-weighted prediction errors (pwPEs) on the previous trial $k-1$: the pwPE concerning reward ($\epsilon_1^{k-1}$) and the pwPE concerning volatility ($\epsilon_2^{k-1}$). The performance measure in the winning model, $\Delta\mathrm{cvIKI}_{\mathrm{trial}}^{k}$, was the change in the degree of temporal variability across keystroke positions from trial $k-1$ to $k$. The $\beta$ values are plotted separately for each control and experimental group. The best response model was obtained using Random Effects Bayesian Model Comparison (BMC) in a set of two families of response models, followed by BMC within the winning family (see main text). The noise parameter $\zeta$ did not differ significantly between groups ($P > 0.05$), indicating that the model's predicted responses fit the observed responses comparably well in each group. (A, B) There were no significant between-group differences in the $\beta_0$ or $\beta_1$ coefficients ($P > 0.05$). (C) Control participants had a positive $\beta_2$ coefficient that was significantly higher than in anx1 and anx2 participants ($P_{\mathrm{FDR}} < 0.05$, denoted by the horizontal lines and black asterisks). This implies that in control participants an increase in $\epsilon_2$ (a larger update in the expectation of volatility) contributed to a greater change in the relevant performance measure on the following trial, whereas it led to a decrease in anx1 and anx2 participants.
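
The linear form of the response model described above can be sketched as follows. This is an illustrative reading of the caption rather than the authors' implementation: the function name `predict_response` and the example values are hypothetical, and the noise term governed by ζ is left out so that only the deterministic prediction is shown.

```python
import numpy as np

def predict_response(eps1, eps2, beta0, beta1, beta2):
    """Predicted response on trial k from the pwPEs on trial k-1.

    y_hat[k] = beta0 + beta1 * eps1[k-1] + beta2 * eps2[k-1]
    The first trial has no predecessor, so its prediction is NaN.
    """
    eps1 = np.asarray(eps1, dtype=float)
    eps2 = np.asarray(eps2, dtype=float)
    y_hat = np.full(eps1.shape, np.nan)
    y_hat[1:] = beta0 + beta1 * eps1[:-1] + beta2 * eps2[:-1]
    return y_hat

# HYPOTHETICAL pwPE trajectories, chosen only to show the sign of the beta2 effect.
eps1 = [0.0, 0.1, -0.2, 0.05]
eps2 = [0.0, 1.0, 0.0, 2.0]

y_positive_beta2 = predict_response(eps1, eps2, beta0=0.0, beta1=0.0, beta2=0.5)
y_negative_beta2 = predict_response(eps1, eps2, beta0=0.0, beta1=0.0, beta2=-0.5)
```

With a positive β2, a large ϵ2 on trial k-1 raises the predicted change in performance on trial k; with a negative β2, the same ϵ2 lowers it, which is the group contrast the caption reports.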
Figure 5—figure supplement 4. Example in one control participant of the association between pwPEs and performance.

Example in one control participant of the association between pwPEs relating to volatility and subsequent changes in performance. (A, B) Illustration of the trajectory of the pwPE relating to volatility on trial $k-1$, $\epsilon_2^{k-1}$. Right panels are an enlarged display of a section of the corresponding left panel. (C, D) Trajectory of the expectation on log-volatility, $\mu_2^{k-1}$. (E, F) Performance measure in the winning response model, $\Delta\mathrm{cvIKI}_{\mathrm{trial}}^{k}$, representing the change from trial $k-1$ to $k$ in the task-relevant performance variable associated with reward, $\mathrm{cvIKI}_{\mathrm{trial}}^{k}$. Green circles denote trials with large values of $\epsilon_2$ that were followed by an increment in the performance measure (larger behavioral change). Red circles mark trials with large $\epsilon_2$ values that led to smaller changes in the performance variable on the subsequent trial. This figure illustrates the effect of positive $\beta_2$ coefficients in the response model in control participants, linking large $\epsilon_2$ values to large changes in behavior on the next trial.
Figure 5—figure supplement 5. Example in one anx1 participant of the association between pwPEs and performance.

Example in one anx1 participant of the association between pwPEs relating to volatility and subsequent changes in performance. Same as Figure 5—figure supplement 4, but for one participant from the anx1 group. This figure illustrates the effect of negative $\beta_2$ coefficients in the response model in anx1 and anx2 participants, linking large $\epsilon_2$ values to smaller changes in behavior on the next trial.
Figure 5—figure supplement 6. Grand-average trialwise residuals.

Grand-average trialwise residuals resulting from the difference between the observed responses and the responses predicted by the HGF. (A–C) The trialwise residuals in each control and experimental group are shown as mean and SEM (shaded areas). The winning response model used the trial-by-trial change in $\mathrm{cvIKI}_{\mathrm{trial}}$ as its response variable, which is related to the temporal variability of IKI across keystroke positions in a trial. There were no systematic differences in the model fits across groups ($P > 0.05$ for between-group differences in the mean residual, after averaging across trials; cont: 0.0004 [0.00095]; anx1: -0.001 [0.0013]; anx2: 0.0003 [0.0003]). In the additional control experiment, we also found no significant differences between groups in the mean residual values (mean residual values per group: cont: -0.0003 [0.00047]; anx3: 0.0023 [0.0016]).
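
The residual summary described above could be computed along the following lines. This is a sketch under stated assumptions: responses are taken to be arranged as a participants-by-trials array, the SEM is computed across participants, and the function name `residual_summary` and the example numbers are hypothetical rather than taken from the study.

```python
import numpy as np

def residual_summary(observed, predicted):
    """Summarize residuals for one group.

    observed, predicted: arrays of shape (n_participants, n_trials).
    Returns the trialwise mean and SEM across participants, plus the group
    mean and SEM of each participant's trial-averaged residual.
    """
    resid = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    n = resid.shape[0]
    # Trial-by-trial grand average and SEM across participants (ASSUMED layout).
    trial_mean = resid.mean(axis=0)
    trial_sem = resid.std(axis=0, ddof=1) / np.sqrt(n)
    # One mean residual per participant, then mean and SEM across the group.
    per_participant = resid.mean(axis=1)
    group_mean = per_participant.mean()
    group_sem = per_participant.std(ddof=1) / np.sqrt(n)
    return trial_mean, trial_sem, group_mean, group_sem

# HYPOTHETICAL data: three participants, two trials, predictions of zero.
observed = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
predicted = np.zeros_like(observed)
trial_mean, trial_sem, group_mean, group_sem = residual_summary(observed, predicted)
```

A group mean residual close to zero relative to its SEM, as in the values reported in the caption, suggests that the model does not systematically over- or under-predict the responses in that group.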