Abstract
Humans spend a lifetime learning, storing and refining a repertoire of motor memories. For example, through experience, we become proficient at manipulating a large range of objects with distinct dynamical properties. However, it is unknown what principle underlies how our continuous stream of sensorimotor experience is segmented into separate memories and how we adapt and use this growing repertoire. Here we develop a theory of motor learning based on the key principle that memory creation, updating and expression are all controlled by a single computation – contextual inference. Our theory reveals that adaptation can arise both by creating and updating memories (proper learning) and by changing how existing memories are differentially expressed (apparent learning). This insight allows us to account for key features of motor learning that had no unified explanation: spontaneous recovery1, savings2, anterograde interference3, how environmental consistency affects learning rate4,5 and the distinction between explicit and implicit learning6. Critically, our theory also predicts novel phenomena – evoked recovery and context-dependent single-trial learning – which we confirm experimentally. These results suggest that contextual inference, rather than classical single-context mechanisms1,4,7–9, is the key principle underlying how a diverse set of experiences is reflected in our motor behaviour.
Throughout our lives, we experience different contexts, in which the environment exhibits distinct dynamical properties, such as when manipulating different objects or walking on different surfaces. Although it has been recognised that the brain maintains multiple motor memories appropriate for these contexts10,11, classical theories of motor learning have focused on how the brain adapts to a single type of environmental dynamics1,7,8. However, with multiple memories come new computational challenges: the brain must decide when to create new memories12 and how much to express and update each of them for each movement we make. These operations, their governing principles and their consequences for motor learning remain poorly understood. Here, we propose a unifying principle – contextual inference – that specifies how sensory cues and state feedback affect memory creation, expression and updating. We show that contextual inference is the core feature that underlies a range of fundamental aspects of motor learning that were previously explained by a number of distinct and often heuristic processes.
COIN: a model of contextual inference
In order to formalise the role of contextual inference in motor learning, we developed the COIN (COntextual INference) model, a principled nonparametric Bayesian model of motor learning (see Methods). The COIN model is based on an internal model that specifies the learner’s assumptions about how the environment generates their sensory observations (Fig. 1a, Extended Data Fig. 1a). Motor learning corresponds to online Bayesian inference under this generative model (Fig. 1b, Extended Data Fig. 1b). For this, the COIN model jointly infers contexts, their transitions, their dynamical and sensory properties, and the current state of each context, such that each motor memory stores the learner’s inferences about a different context (for validation, see Extended Data Fig. 2a–b). The major challenge in motor learning is that neither contexts nor their transitions come labelled, and thus the learner needs to continually infer which context they are in based on a continuous stream of experience.
The result of contextual inference is a posterior distribution expressing the probability with which each known context, or a yet-unknown novel context, is currently active (Fig. 1b, top row). In turn, contextual inference determines memory creation, expression and updating (Fig. 1b, numbered arrows). Fig. 1c–f (and Extended Data Fig. 1c–e) illustrates this in a simulation of the COIN model (parameters in Extended Data Fig. 3) when handling objects of varying weights. For determining the current motor command (Fig. 1e), rather than selecting a single memory to be expressed11,12, the state associated with each memory (Fig. 1d) is expressed commensurate with the probability of the corresponding context under the posterior, computed after observing the sensory cue but before movement (‘predicted probability’; Fig. 1b, arrow 1; Fig. 1f1). After movement, the ‘responsibility’ of each known context as well as of a novel, yet-unknown context is computed as their posterior probability given both the cue and the resultant state feedback. A new memory is created flexibly, whenever the responsibility of a novel context becomes high (Fig. 1b, arrow 2; Fig. 1f2). Critically, context responsibilities also scale the updating of the previously existing memories and any newly created memory (Fig. 1b, arrows 3; Fig. 1f3, red and pink arrows respectively showing how high and low responsibility for the red context speeds up and slows down the updating of its state, Fig. 1d). Finally, these responsibilities are used to compute the predicted context probabilities on the next time step (Fig. 1f1).
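The flow of one trial – predicted probabilities, responsibilities, responsibility-scaled updates – can be sketched in code. This is a deliberately simplified, single-sample Python illustration (the paper's analyses used MATLAB): the fixed novel-context likelihood, the scalar Kalman-style gain and the function name are our assumptions, not the particle-based inference described in Methods.

```python
import numpy as np

def coin_trial(pred_prob, states, state_var, obs_var, y, novel_lik=0.05):
    """One simplified COIN-style trial (illustrative sketch, not the full model).

    pred_prob : predicted probabilities of each known context plus, as the
                last entry, a yet-unknown novel context (sums to 1).
    states    : current state estimate (memory) of each known context.
    Returns the context responsibilities and the responsibility-scaled
    updates applied to every memory.
    """
    # Likelihood of the state feedback y under each known context
    total_var = state_var + obs_var
    lik = np.exp(-0.5 * (y - states) ** 2 / total_var) / np.sqrt(2 * np.pi * total_var)
    lik = np.append(lik, novel_lik)      # broad, flat likelihood for a novel context

    # Responsibilities: posterior over contexts given prediction and feedback
    resp = pred_prob * lik
    resp /= resp.sum()

    # Every memory is updated, with the update scaled by its responsibility
    gain = state_var / total_var
    new_states = states + resp[:-1] * gain * (y - states)
    return resp, new_states
```

A high responsibility for the novel context (last entry of `resp`) is the signal on which a new memory would be created; a high responsibility for a known context speeds up the updating of its state, mirroring Fig. 1f3.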
In summary, the COIN model proposes that contextual inference is core to motor learning. In particular, unlike in traditional models of learning, adaptation to a change in the environment (e.g. Fig. 1e, blue and cyan arrows) can arise from two distinct and interacting mechanisms. First, in line with classical notions of learning, proper learning constitutes the creation and updating of memories (the inferred states of known contexts; Fig. 1d, blue arrow). Second, apparent learning occurs due to the updating of the predicted context probabilities (Fig. 1f1, cyan arrow), thereby altering the extent to which existing memories are ultimately expressed in behaviour.
Apparent learning underlies memory recovery
As an ideal litmus test of the contributions of contextual inference to memory creation and expression (Fig. 1b, arrows 1–2), we revisited a widely-used motor learning paradigm. In this paradigm (Fig. 2a and b, top left), participants learn a perturbation P+ applied by a robotic interface while reaching to a target. Adaptation is assessed using occasional channel trials, Pc, which remove movement errors and measure the forces participants use to counteract the perturbation (Fig. 2a, see Methods for details). Exposure to P+ is followed by brief exposure to the opposite perturbation, P−, bringing adaptation back to near baseline. Finally, a series of channel trials is administered. As in previous studies1, our participants showed the intriguing feature of spontaneous recovery in this phase (Fig. 2c): a transient re-expression of P+ adaptation, rather than a simple decay towards baseline.
Although this paradigm has no explicit sensory cues, according to our theory, contextual inference plays an important role. When simulated for this paradigm (Fig. 2b), the COIN model starts with a memory appropriate for moving in the absence of a perturbation (P0, blue; Fig. 2b, bottom left) and creates new memories for the P+ (red) and P− (orange) perturbations. Spontaneous recovery arises due to the dynamics of contextual inference. As P+ has been experienced in most trials, it is quickly inferred to be active with a high probability during the channel-trial phase (Fig. 2b, top right). Therefore, as its state has not yet decayed (Fig. 2b, bottom left), the memory of P+ is transiently expressed in the participant’s motor output (Fig. 2b, bottom right). This mechanism is fundamentally different from that of a classical, single-context model of motor learning, the dual-rate model1. There, motor output is determined by a combination of individual memories that update at different rates (fast and slow) but whose expression does not change over time. Thus the dynamics of adaptation is solely determined by the dynamics of memory updating, i.e. proper learning. In contrast, in the COIN model, changes in motor output can occur without updating any individual memory, simply because contextual inference changes the extent to which existing memories are expressed, i.e. apparent learning. This mechanism allows the COIN model to account robustly for spontaneous recovery (Extended Data Fig. 4a), including elevated or reduced levels when the P+ phase is extended13 (Extended Data Fig. 5a–j) or when P− is experienced prior to the P+ phase14 (Extended Data Fig. 5k–o), respectively.
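For contrast, the dual-rate account of spontaneous recovery can be reproduced in a few lines. This Python sketch is illustrative: the retention factors and learning rates are typical values from the literature, not the fits estimated in this paper, and on channel trials the movement error is simply clamped to zero.

```python
import numpy as np

def dual_rate_sim(n_plus=125, n_minus=15, n_clamp=150,
                  A_f=0.59, B_f=0.21, A_s=0.992, B_s=0.02):
    """Dual-rate model: fast and slow states, fixed (summed) expression.
    Channel trials clamp error to zero, so both states simply decay."""
    xf = xs = 0.0
    net = []
    for p in [1.0] * n_plus + [-1.0] * n_minus:   # P+ then P- exposure
        e = p - (xf + xs)                          # movement error
        xf = A_f * xf + B_f * e
        xs = A_s * xs + B_s * e
        net.append(xf + xs)
    for _ in range(n_clamp):                       # channel-trial phase
        xf *= A_f
        xs *= A_s
        net.append(xf + xs)
    return np.array(net)
```

In the channel phase, the negative fast state decays quickly while the still-positive slow state decays slowly, so their sum transiently rebounds towards P+ before decaying – recovery and decay are governed by the same fixed dynamics, which is the property the evoked recovery experiment targets.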
In order to distinguish between proper and apparent learning as the main mechanism underlying spontaneous recovery, we designed a novel ‘evoked recovery’ paradigm (similar to the reinstatement paradigm in classical conditioning15) in which sensorimotor evidence clearly indicates that a change in context has occurred. For this, two early trials in the channel-trial phase of the spontaneous recovery paradigm were replaced with P+ (‘evoker’) trials (Fig. 2d, top left, akin to trigger trials in visuomotor learning11). In this case, the COIN model predicts a strong and long-lasting recovery of P+-adapted behaviour (Fig. 2d, bottom right; Extended Data Fig. 4b), primarily due to the inference that the P+ context is now active (Fig. 2d, top right, red) and the gradual decay of the P+ state over subsequent channel trials (Fig. 2d, bottom left, red). In addition, our mathematical analysis suggested that evoked as well as spontaneous recovery are inherent features of the COIN model (Suppl. Inf. and Extended Data Fig. 6a–c). In contrast, the dual-rate model only predicts a transient recovery that rapidly decays due to the same underlying adaptation process with fast dynamics governing both recovery and decay (Extended Data Fig. 6d).
In line with COIN model predictions, participants showed a strong evoked recovery in response to the P+ trials (Fig. 2e). This recovery lasted for the duration of the experiment, defying models that predict a simple exponential decay to baseline4,11,16 (Extended Data Fig. 6e and Extended Data Table 1). We fit the COIN and dual-rate models to individual participants’ data in both experiments (Fig. 2c & e). The COIN model fit the data accurately, but the dual-rate model (and its multi-rate extensions, Extended Data Fig. 6d) showed a qualitative mismatch in the time course of decay of evoked recovery (insets in Fig. 2c & e). Formal model comparison provided strong support for the COIN model overall (Δ group-level BIC of 302.6 and 394.1 nats for the spontaneous and evoked recovery groups, respectively) and for the majority of participants (6 out of 8 for each experiment; individual fits shown in Extended Data Fig. 6f, Extended Data Fig. 2c–e).
The COIN model explains memory recovery by creating a new memory only when existing memories cannot account for a perturbation, such as on the abrupt introduction of P+ and P−, but not when a new perturbation is introduced gradually. This explains why deadaptation is slower following the removal of a gradually (vs. abruptly) introduced perturbation17 (Extended Data Fig. 5p–s).
Memory updating depends on contextual inference
In the COIN model, contextual inference also controls how each existing memory is updated, that is, proper learning (Fig. 1b, arrows 3). In the COIN model, all memories are updated, with the updates scaled by their respective inferred responsibilities (Fig. 1f3). This contrasts with models that only update a single memory11,12 or update multiple memories independently of context1,18. To test this prediction, we examined the extent to which memories for two contexts were updated when we modulated their responsibilities by controlling the sensory cue and state feedback – the two observations that determine context responsibilities (Fig. 1b).
In many natural scenarios, sensory cues and state feedback provide consistent evidence about context (e.g. larger cups are heavier), and thus context responsibilities are approximately all-or-none (Fig. 1f3). Thus to test for graded memory updating, we created conflicts between cues and state feedback (akin to a light, large cup). Specifically, participants experienced an extensive training phase designed to form separate memories for two contexts associated with a distinct cue (target location) and perturbation (Fig. 3a; context 1 = P+₁ and context 2 = P−₂, with sub- and superscript specifying sensory cue and perturbation sign, respectively). These contexts switched randomly (with probability 0.5; Fig. 3b). As expected19, participants formed separate memories for each context and expressed them appropriately based on the sensory cues (Extended Data Fig. 7a). In a subsequent test phase, we studied the updating of one of the memories, that associated with context 1, in response to exposure to a single trial of a potentially conflicting cue-feedback combination. To quantify single-trial learning for the memory associated with context 1, we assessed the adaptation of this memory using channel trials with the appropriate cue (cue 1) both before and after an exposure trial (Fig. 3c). The change in adaptation from the first to last channel trial of this ‘triplet’ (channel-exposure-channel) reflects single-trial learning in response to the exposure trial4,5. To bring adaptation back close to baseline before each triplet, we used sequences of washout trials, pairing P0 with the sensory cues (P0₁ and P0₂).
The COIN model predicted that the responsibility of context 1, and hence the updating of the corresponding memory (as reflected in single-trial learning; Fig. 3d, column 2, Extended Data Fig. 4c), should exhibit a graded pattern that arises over training (Extended Data Fig. 7b): it should be greatest when the cue and state feedback on the exposure trial both provide evidence of context 1 (P+₁ exposure trial), least when both provide evidence for context 2 (P−₂ exposure trial) and intermediate when the two sources of evidence are in conflict (P−₁ and P+₂ exposure trials; see also Suppl. Inf. and Extended Data Fig. 7c–d for an analytical approximation). Comparing the two conditions with intermediate updating, due to the cues being paired with P0 in the washout trials, we also expected the cue to have a weaker effect than the perturbation and therefore less updating of the memory for context 1 following exposure with P−₁ than with P+₂.
The pattern of single-trial learning in pre- and post-training confirmed the COIN model’s qualitative predictions (Fig. 3d, column 1). Prior to training, there was no significant difference in single-trial learning across exposure conditions (two-way repeated-measures ANOVA, F1,23 = 2.40, p = 0.135 for cue, F1,23 = 0.97, p = 0.335 for perturbation). After learning, single-trial learning showed a gradation across conditions with a significant modulatory effect for both the cue and the perturbation (F1,23 = 10.35, p = 3.82 × 10−3 for cue, F1,23 = 21.16, p = 1.26 × 10−4 for perturbation, with no significant interaction, F1,23 = 0.64, p = 0.432; Extended Data Fig. 7e). The modulatory effects of the cue and the perturbation were not confined to separate subsets of participants (Fisher’s exact test, odds ratio = 1.0, p = 1.00, see Methods and Extended Data Fig. 7f). After fitting to the data, the COIN model also accounted quantitatively for how single-trial learning changed during the training phase (Extended Data Fig. 7b). Taken together, the pattern of single-trial learning shows the gradation in memory updating (at an individual participant-level) predicted by the COIN model, with multiple memories updated in proportion to their responsibilities.
Apparent changes in learning rate
The COIN model also suggested an alternative account of classical results about apparent changes in learning rate under a variety of conditions. Fig. 4 shows three paradigms (column 1) with experimental data (column 2). What is common in all these cases is that the empirical finding of trial-to-trial changes in adaptation has been interpreted as proper learning, i.e. changes to existing memories (states). Thus differences between the magnitudes of these changes have been interpreted as differences in learning rate. For example, savings (Fig. 4a) refers to the phenomenon that learning the same perturbation a second time (even after washout) is faster than the first time1,2,20,21. In anterograde interference (Fig. 4b) learning a perturbation (P−) is slower if an opposite perturbation (P+) has been learned previously3, with the amount of interference increasing with the length of experience of the first perturbation. The persistence of the environment has also been shown to affect single-trial learning (Fig. 4c)4,5: more consistent environments lead to increased levels of single-trial learning.
The COIN model suggests that changes in adaptation can occur without proper learning, simply through apparent learning, that is by changing the way existing memories are expressed (Fig. 1d–f, blue vs. cyan arrows). Therefore, apparent changes in learning rate in these paradigms may be due to changes in memory expression rather than changes in memory updating. To test this hypothesis, we simulated the COIN model using the parameters obtained by fitting each of the 40 participants in our experiments (Extended Data Fig. 3), thus providing parameter-free predictions. The COIN model reproduced the pattern of adaptation and single-trial learning seen in these paradigms (Fig. 4 and Extended Data Fig. 8, column 3; Extended Data Fig. 4d–f). Crucially, differences in the apparent learning rate were not driven by differences in either the proper learning rate (Kalman gain, see Methods) or the underlying state (column 4). Instead, they were driven by changes in contextual inference (column 5). For example, according to the COIN model, in savings P+ is expected with higher probability during the second exposure after having experienced it during the first exposure. Similarly, anterograde interference arises as more extended experience with P+ makes it less probable that a transition to other contexts (i.e. P−) will occur. Finally, more (less) consistent environments lead to higher (lower) probabilities with which contexts are predicted to persist to the next trial, leading to more (less) memory expression, as reflected in single-trial learning. More generally, our analysis of the COIN model indicated that single-trial learning can be expressed mathematically as a mixture of two processes that both depend on contextual inference (see Suppl. Inf. and Extended Data Fig. 7c–d) and each of which can be dissected by the appropriate experimental manipulation: proper learning (as studied in Fig. 3) and apparent learning (as studied in Fig. 4c).
Cognitive mechanisms in contextual inference
In addition to providing a comprehensive account of the phenomenology of motor learning, the COIN model also suggests how specific cognitive mechanisms contribute to the underlying computations. For example, associating working memory with the maintenance of the currently estimated context probabilities explains how a working memory task can effectively lead to evoked recovery in a modified version of the spontaneous recovery paradigm22 (see Suppl. Inf. and Extended Data Fig. 9a–d). Furthermore, identifying explicit and implicit forms of visuomotor learning with inferences in the model about state (i.e. estimate of visuomotor rotation) versus a bias parameter (i.e. sensory recalibration between the proprioceptive and visual locations of the hand), respectively, explains the complex time courses of these components of learning23–25 (see Suppl. Inf. and Extended Data Fig. 9e–l).
Discussion
The COIN model puts the problem of learning a repertoire of memories — rather than a single motor memory — centre stage. Once this more general problem is considered, contextual inference becomes a key computation that unifies seemingly disparate data sets. By partitioning motor learning into two fundamentally different processes, contextual inference (Fig. 1b, top row) and state inference (Fig. 1b, bottom rows), the COIN model provides a principled framework for studying the neural bases of learning motor repertoires (see Suppl. Inf.).
In contrast to the COIN model, previous theories of motor learning typically did not have a notion of context1,4,18. In the few cases in which contextual motor learning was considered within a principled probabilistic framework11,16,26, the generative models underlying learning did not incorporate fundamental properties of the environment (e.g. context transitions, cues or state dynamics) that are critical for explaining a number of learning phenomena. Consequently, previous models can only account for a subset of the data sets we model (Extended Data Table 1), which they were often hand-tailored to address.
There are deep analogies between the context-dependence of learning in the motor system and other learning systems, both in terms of their phenomenologies and the computational problems they are trying to solve12,27–30. However, there is one important conceptual issue that has been absent from work on contextual learning in other domains that our work has brought to the fore – the distinction between proper learning and apparent learning. We have shown that many features of motor learning arise not from the updating of existing memories (proper learning) but from changes in the extent to which existing memories are expressed (apparent learning). This distinction, and the role of contextual inference in both proper and apparent learning, is likely to be relevant to all forms of learning in which experience can be usefully broken down into discrete contexts – in the motor system and beyond.
Methods
Here, we provide an overview of the methods. For full details see Suppl. Inf.
Participants
Forty right-handed, neurologically-healthy participants (18 males and 22 females; age 27.7 ± 5.6 yr, mean ± s.d.) participated in two experiments, which had been approved by the Cambridge Psychology Research Ethics Committee and the Columbia University IRB (AAAR9148). All participants provided written informed consent.
Experimental apparatus
Experiments were performed using a vBOT planar robotic manipulandum with virtual-reality system and air table31. Participants grasped the handle of the manipulandum with their right hand while their forearm was supported on an air sled and moved their hand in the horizontal plane.
The manipulandum controlled a virtual “object” that was displayed centred on the hand and translated with hand movements as participants made repeated movements from a home position to a target located 12 cm distally in the sagittal direction.
On each trial, the vBOT could either generate no forces (P0, null field), a velocity-dependent curl force field (P+ or P− perturbation depending on the direction of the field) or a force channel (Pc, channel trials). For the curl force field, the force generated on the hand was given by
(1)   $\begin{bmatrix} F_x \\ F_y \end{bmatrix} = \begin{bmatrix} 0 & b \\ -b & 0 \end{bmatrix} \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}$

where Fx and Fy are the forces and ẋ and ẏ the velocities at the handle in the x (transverse) and y (sagittal) directions, respectively. The gain b was set to ±15 N·s·m−1, with the sign specifying the direction of the curl field (counterclockwise or clockwise, which were assigned to P+ and P−, counterbalanced across participants). On channel trials, the hand was constrained to move along a straight line to the target by simulating channel walls on each side of the straight line as stiff springs (3,000 N·m−1) with damping (140 N·s·m−1)32,33.
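In code, the curl-field force law of Eq. (1) is a one-liner. A Python sketch (the original experiments ran on the vBOT's own control software; the function name is ours):

```python
import numpy as np

def curl_field_force(velocity, b=15.0):
    """Velocity-dependent curl field, Eq. (1): Fx = b*vy, Fy = -b*vx.
    b is the field gain in N*s/m; its sign sets the field direction
    (clockwise vs counterclockwise)."""
    vx, vy = velocity
    return np.array([b * vy, -b * vx])
```

Because the force is always perpendicular to the velocity, the field does no work along the direction of motion but pushes the hand laterally off a straight path.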
Experiment 1: spontaneous and evoked recovery
Sixteen participants were assigned to either a spontaneous (n = 8) or evoked (n = 8) recovery group. The virtual object controlled by participants was simply a cursor.
Participants in the spontaneous recovery group performed a version of the standard spontaneous recovery paradigm1. A pre-exposure phase (50 trials) with a null field (P0) was followed by an exposure phase (125 trials) with P+. Participants then underwent a counter-exposure phase of 15 trials with the opposite perturbation (P−). This was followed by a channel-trial phase (150 channel trials, Pc). In the pre-exposure and exposure phases, to assess adaptation, each block of 10 trials had one channel trial (Pc) in a random location (not the first). A 45 s rest break was given after trial 60 of the exposure phase, followed by an additional 5 P+ trials prepended to the next block.
The evoked recovery group experienced the identical paradigm to the spontaneous recovery group except that the 3rd and 4th trials of the channel-trial phase were replaced with P+ trials (Fig. 2d).
Experiment 2: memory updating
Twenty-four participants performed the memory updating experiment. The paradigm is based on the control point experiment described in Ref. 19 in which perturbations P+₁, P−₁, P0₁, P+₂, P−₂ and P0₂ are presented with one of two possible sensory cues (different control points on a rectangular virtual object, denoted by subscripts). The experiment consisted of a pre-training, training and post-training phase. In the pre-training and post-training phases, participants performed blocks of trials consisting of a variable number (8, 10 or 12 in the pre-training phase and 2, 4 or 6 in the post-training phase) of washout trials (an equal number of P0₁ and P0₂ in a pseudorandom order) followed by 1 of 4 possible ‘triplets’. Each triplet consisted of 2 channel trials (both with cue 1, Pc₁) bracketing a cue-perturbation ‘exposure’ trial (P+₁, P−₁, P+₂ or P−₂, see main text and Fig. 3c). Each of the 4 triplet types was experienced once every 4 blocks, using pseudorandom permutations, with a total of 16 blocks in the pre-training phase and 32 blocks in the post-training phase.
In the training phase (Fig. 3b), participants performed 24 blocks, each consisting of 62–70 trials. The key feature of each block was that 32 force-field trials (an equal number of P+₁ and P−₂ in a pseudorandom order) were followed by 2 triplets (with exposure trials of P+₁ and P−₂). Each triplet was preceded by a variable number of washout trials (an equal number of P0₁ and P0₂ in a pseudorandom order) to bring adaptation back close to baseline. For full details of the block structure see Suppl. Inf.
The control point assigned to sensory cue 1 (used on all triplet channel trials) and sensory cue 2 was counterbalanced across participants as was the direction of force field assigned to P+ and P−.
Data analysis
On each channel trial, we linearly regressed the time series of actual forces generated by participants into the channel wall against the ideal forces that would fully compensate for the forces on a force-field trial1. The offset of the regression was constrained to zero, and we used the slope as our (dimensionless) measure of adaptation.
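The adaptation measure described above can be computed directly. An illustrative Python sketch (the original analyses were performed in MATLAB; the function name is ours):

```python
import numpy as np

def adaptation_index(actual_force, ideal_force):
    """Zero-intercept regression of the forces produced into the channel wall
    against the ideal fully-compensatory forces; the slope is the
    (dimensionless) adaptation measure."""
    actual = np.asarray(actual_force, dtype=float)
    ideal = np.asarray(ideal_force, dtype=float)
    # Least-squares slope with the offset constrained to zero
    return float(ideal @ actual / (ideal @ ideal))
```

A value of 1 indicates full compensation for the field, 0 indicates no compensation, and negative values indicate forces in the direction of the field.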
To identify changes in single-trial learning between triplets in the memory updating experiment, two-way repeated-measures ANOVAs were performed with factors of cue (2 levels: cue 1 and cue 2) and perturbation (2 levels: P+ and P−). To test whether the modulatory effects of cue and perturbation were confined to separate subsets of participants, we quantified the effect of each by computing, on an individual-participant basis, the following contrasts in single-trial learning: the sum over cue-1 exposure conditions minus the sum over cue-2 exposure conditions (cue effect) and the sum over P+ exposure conditions minus the sum over P− exposure conditions (perturbation effect). We then split participants into 2×2 groups based on whether each effect was below or above the median of each effect and performed a Fisher’s exact test on the resulting 2×2 histogram (see Suppl. Inf. for details).
All statistical tests were two-sided with significance set to p < 0.05. Data analysis was performed using MATLAB R2020a.
COIN generative model
Fig. 1a shows the graphical model for the generative model. At each time step t = 1, …, T there is a discrete latent variable (the context) ct ∈ {1, …, ∞} that evolves as a Markov process:
(2)   $c_t \mid c_{t-1} = j \sim \Pi_j$

where $\Pi$ is the transition probability matrix and $\Pi_j$ is its jth row containing the transition probabilities from context j to each context k (including itself). In principle, there are an infinite number of rows and columns in this matrix. However, in practice, generation and inference can both be accomplished using finite-sized matrices by placing a nonparametric prior on the matrix (see below).
Each context j is associated with a continuous (scalar) latent variable (the state, e.g. the strength of a force field) that evolves according to its own linear-Gaussian dynamics independently of all other states:
(3)   $x_t^{(j)} = a^{(j)} x_{t-1}^{(j)} + d^{(j)} + w_t^{(j)}, \qquad w_t^{(j)} \sim \mathcal{N}(0, \sigma_q^2)$

where a(j) and d(j) are the context-specific state retention factor and drift, respectively, and $\sigma_q^2$ is the variance of the process noise (shared across contexts). Each state is assumed to have existed for long enough that its prior for the first time it is observed is its stationary distribution:
(4)   $x_1^{(j)} \sim \mathcal{N}\!\left( \dfrac{d^{(j)}}{1 - a^{(j)}},\ \dfrac{\sigma_q^2}{1 - (a^{(j)})^2} \right)$
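The state dynamics in Eqs (3) and (4) are easy to verify numerically. A Python sketch (parameter values chosen purely for illustration):

```python
import numpy as np

def simulate_state(a, d, sigma_q, T, seed=0):
    """Simulate one context's state under Eq. (3), initialised from the
    stationary distribution of Eq. (4)."""
    rng = np.random.default_rng(seed)
    x = np.empty(T)
    # Stationary mean d/(1-a) and variance sigma_q^2/(1-a^2)
    x[0] = rng.normal(d / (1 - a), np.sqrt(sigma_q**2 / (1 - a**2)))
    for t in range(1, T):
        x[t] = a * x[t - 1] + d + rng.normal(0.0, sigma_q)
    return x
```

Because the chain starts at stationarity, its empirical mean and variance match d/(1−a) and σq²/(1−a²) from the outset, which is what the stationary prior in Eq. (4) encodes.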
At each time step, a continuous (scalar) observation yt (the state feedback) is emitted from the state associated with the current context:
(5)   $y_t = x_t^{(c_t)} + v_t, \qquad v_t \sim \mathcal{N}(0, \sigma_r^2)$

where $\sigma_r^2$ is the variance of the observation noise (also shared across contexts).
In addition to the state feedback, a discrete observation (the sensory cue) qt ∈ {1, …, ∞} is also emitted. The distribution of sensory cues depends on the current context:
(6)   $q_t \mid c_t = j \sim \Phi_j$

where $\Phi$ is the cue probability matrix (which, in principle, is also doubly infinite in size but can be treated as finite in practice) and $\Phi_j$ is its jth row containing the probability of each cue k in context j.
In order to make this infinite-dimensional switching state-space model well-defined, we place hierarchical Dirichlet process priors34 on the transition and cue probability matrices. The transition probability matrix is generated in two steps (Extended Data Fig. 1a). First, an infinite set of global probabilities for transitioning into each context (‘global transition probabilities’) is generated by sampling from a GEM (Griffiths, Engen and McCloskey) distribution:
(7)   $\beta \sim \mathrm{GEM}(\gamma)$

where 0 ≤ βj ≤ 1 and $\sum_{j=1}^{\infty} \beta_j = 1$, as required for a set of probabilities. The global transition probabilities decay exponentially as a function of j in expectation, with the hyperparameter γ controlling the rate of decay and thus the effective number of contexts: large γ implies a large number of small-probability contexts (slow decay from a relatively small initial probability), whereas small γ implies a smaller number of relatively large-probability contexts (fast decay from a relatively large initial probability).
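A draw from the GEM distribution can be generated by the standard truncated stick-breaking construction. An illustrative Python sketch (the truncation level n_max is our assumption; the infinite tail is simply discarded):

```python
import numpy as np

def sample_gem(gamma, n_max, seed=0):
    """Truncated stick-breaking draw from GEM(gamma):
    v_j ~ Beta(1, gamma),  beta_j = v_j * prod_{i<j} (1 - v_i)."""
    rng = np.random.default_rng(seed)
    v = rng.beta(1.0, gamma, size=n_max)
    stick_left = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * stick_left
```

In expectation βj = γ^(j−1)/(1+γ)^j, so the first weight averages 1/(1+γ) and larger γ spreads mass over more, smaller-probability contexts, matching the description above.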
Second, for each context (row of the transition probability matrix), an infinite set of local (context-specific) probabilities for transitioning into each context (‘local transition probabilities’) are generated via a ‘sticky’ variant35 of the Dirichlet process (DP):
(8)   $\Pi_j \sim \mathrm{DP}\!\left( \alpha + \kappa,\ \dfrac{\alpha \beta + \kappa \delta_j}{\alpha + \kappa} \right)$

where 0 ≤ πjk ≤ 1 and $\sum_{k=1}^{\infty} \pi_{jk} = 1$, as required for a set of probabilities, and δj is an infinite-dimensional one-hot vector with the jth element set to 1 and all other elements set to 0. The mean (base) distribution of the Dirichlet process is (αβ + κδj)/(α + κ), with large α + κ reducing variability around this mean (for a tutorial on the Dirichlet process see Ref. 36). Thus the concentration parameter α controls the resemblance of local transition probabilities to the global transition probabilities β. The self-transition bias parameter κ > 0 controls the resemblance of local transition probabilities to δj (i.e. a certain self-transition, ct = ct−1 = j). This self-transition bias expresses the fact that a context often persists for several time steps before switching (i.e. that contexts are ‘sticky’), such as when an object is manipulated for an extended period of time.
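Restricted to a finite set of K contexts, a draw from this sticky Dirichlet process reduces to an ordinary Dirichlet distribution per row, which makes the roles of α, κ and β easy to see. An illustrative Python sketch (finite truncation is our simplifying assumption):

```python
import numpy as np

def sample_sticky_transition_matrix(beta, alpha, kappa, seed=0):
    """Finite-K sketch of Eq. (8): row j of the transition matrix is
    Dirichlet-distributed with concentration alpha*beta + kappa*delta_j,
    so kappa inflates the diagonal (self-transitions) and alpha pulls
    every row towards the shared global probabilities beta."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    K = len(beta)
    Pi = np.empty((K, K))
    for j in range(K):
        Pi[j] = rng.dirichlet(alpha * beta + kappa * np.eye(K)[j])
    return Pi
```

With a large κ the sampled rows are strongly diagonal (contexts persist), while a large α makes all rows closely resemble β, capturing the dependency between rows discussed below.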
Note that the rows of the transition probability matrix are dependent as their expected values (the base distributions of the corresponding Dirichlet processes) contain a shared term, the global transition distribution β. This dependency, controlled by α, captures the intuitive notion that contexts that are common in general (i.e. have a large global transition probability) will be transitioned to frequently from all contexts.
The cue probability matrix is generated using an analogous (non-sticky) hierarchical construction:
(9)   $\beta^e \sim \mathrm{GEM}(\gamma^e), \qquad \Phi_j \sim \mathrm{DP}(\alpha^e, \beta^e)$
where γe determines the distribution of the global cue probabilities βe, and αe determines the across-context variability of local cue probabilities around the global cue probabilities.
In order to allow full Bayesian inference over the parameters governing the state dynamics ω(j) = {a(j), d(j)}, we also place a prior on these parameters. For this, we use a bivariate normal distribution (truncated for a(j) between 0 and 1):

[a(j), d(j)]⊤ ~ N(μ, Σ)    (10)

where μ = [μa 0]⊤ and Σ = diag(σa², σd²) is a diagonal covariance matrix. Here we have set the prior mean of d(j) to zero under the assumption that positive and negative drifts are equally probable.
Inference in the COIN model
At each time step t = 1, …, T, the goal of inference is to compute the joint posterior distribution of all quantities that are not directly observed by the learner: the current context ct, the current state of each context xt(j), the parameters governing the state dynamics in each context ω(j), the context transition parameters (global β and local Π transition probabilities) and the cue emission parameters (global βe and local Φ cue probabilities), based on the sequences of state feedback y1:τ and sensory cue observations q1:τ′ made until times τ and τ′, respectively (with τ and τ′ each being either t or t − 1, see below). In principle, this posterior is fully determined by the generative model defined in the previous section and can be obtained in a sequential manner by recursively propagating (‘filtering’) the joint posterior from one time point to the next after each new set of observations is made. As exact inference is infeasible, we use a sequential Monte Carlo method known as particle learning that computes an approximation to this filtered posterior37,38. We extensively validated the accuracy of this method (Extended Data Fig. 2a–b). The details of the inference method are given in Suppl. Inf. Here we only describe how the approximate posterior is used to obtain the main model-derived quantities plotted in the paper.
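To illustrate the general idea of sequential Monte Carlo filtering (not the particle-learning algorithm itself, which additionally carries sufficient statistics for parameters and contexts), here is a minimal bootstrap particle filter for a single linear-Gaussian state; all names and parameter values are illustrative:

```python
import numpy as np

def bootstrap_filter(y, a, sigma_q, sigma_r, n_particles, rng):
    """Generic bootstrap particle filter for
    x_t = a * x_{t-1} + q_t,  y_t = x_t + r_t (Gaussian noises)."""
    x = np.zeros(n_particles)  # particles start at the prior mean
    means = []
    for obs in y:
        # propagate particles through the state dynamics
        x = a * x + rng.normal(0.0, sigma_q, n_particles)
        # weight particles by the likelihood of the new observation
        logw = -0.5 * ((obs - x) / sigma_r) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))  # filtered posterior mean estimate
        # resample to avoid weight degeneracy
        x = x[rng.choice(n_particles, n_particles, p=w)]
    return np.array(means)

rng = np.random.default_rng(0)
y = np.ones(60)  # a constant perturbation of magnitude 1
est = bootstrap_filter(y, a=0.99, sigma_q=0.1, sigma_r=0.3,
                       n_particles=500, rng=rng)
```

The estimate starts near the prior (zero) and converges towards the perturbation over trials, mirroring how filtered beliefs are propagated from one time step to the next.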
The predicted probability of context j ∈ {1, …, J,∅}, where J is the number of known contexts and ∅ is the novel context, on trial t (computed after observing the cue but before observing the state feedback; Fig. 1f1 and corresponding panels in later figures) is
p(ct = j | qt, …) = ∫ p(ct = j | qt, Θt\ct, …) p(Θt\ct | qt, …) dΘt\ct    (11)
where Θt\ct denotes the set Θt excluding ct and … represents all observations before time t (as in Fig. 1). The responsibility of context j on trial t (computed after observing both the cue and the state feedback; Fig. 1f2−3 and corresponding panels in later figures) is
p(ct = j | qt, yt, …) ∝ p(yt | ct = j, qt, …) p(ct = j | qt, …)    (12)
The predicted state distribution for context j on trial t (computed before observing the state feedback; Fig. 1d and corresponding panels in later figures) is
p(xt(j) | qt, …) = ∫ p(xt(j) | Θt\xt(j), …) p(Θt\xt(j) | qt, …) dΘt\xt(j)    (13)
where Θt\xt(j) denotes the set Θt excluding xt(j). The mean of this distribution can be shown to evolve across trials (see Suppl. Inf.) as
x̂t+1(j) = â(j) (x̂t(j) + p(ct = j | qt, yt, …) kt(j) et(j)) + d̂(j)    (14)
where â(j) and d̂(j) denote the expected values of a(j) and d(j) with respect to their posterior distributions, et(j) = yt − x̂t(j) is the prediction error for context j and kt(j) corresponds to the ‘Kalman gain’ for context j, which we plot in Fig. 4. Note that this update is scaled by the context’s responsibility p(ct = j | qt, yt, …), which underlies the effect of contextual inference on memory updating (arrows 3 in Fig. 1b).
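The structure of this update can be sketched as follows; the exact placement of the responsibility and drift terms follows our reading of the description above rather than the paper's derivation, and the function name is ours:

```python
def update_context_mean(x_hat, a_hat, d_hat, k, resp, y):
    """One trial of a context-specific mean update in the spirit of Eq. 14:
    the Kalman correction is scaled by the context's responsibility, then
    propagated through the inferred retention a_hat and drift d_hat."""
    e = y - x_hat                      # prediction error for this context
    corrected = x_hat + resp * k * e   # responsibility-weighted Kalman step
    return a_hat * corrected + d_hat   # deterministic retention/drift propagation
```

With zero responsibility the memory is merely propagated (protected from the current error); with full responsibility it receives the full Kalman correction.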
The ‘overall’ predicted state distribution on trial t (i.e. the predicted state distribution of the currently active context, whose identity the learner cannot know with certainty; purple distribution in Fig. 1e and corresponding panels in later figures) is computed by integrating out the context from Eq. 13 using the predicted probabilities from Eq. 11 (arrow 1 in Fig. 1b):
p(xt | qt, …) = ∑j p(ct = j | qt, …) p(xt(j) | qt, …)    (15)
The motor output ut of the learner (cyan line in Fig. 1e and corresponding panels in later figures) is the mean of this predicted state distribution:
ut = E[xt | qt, …] = ∑j p(ct = j | qt, …) x̂t(j)    (16)
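Eqs. 15–16 amount to a responsibility-weighted mixture: the motor output is the predicted-probability-weighted average of the context-specific state means. A toy numerical example (all values invented for illustration):

```python
import numpy as np

# Predicted probabilities over contexts (including a novel context) and the
# mean of each context's predicted state distribution.
pred_prob = np.array([0.7, 0.2, 0.1])
state_means = np.array([1.0, -1.0, 0.0])

# The overall prediction marginalises over contexts; the motor output is
# the mean of the resulting mixture (Eq. 16).
u = np.sum(pred_prob * state_means)
```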
Applying the COIN model to experimental data
Applying the COIN model to experimental data required solving two additional challenges. First, participants’ state feedback observations are hidden from the perspective of the experimenter, as they are noisy realisations of the ‘true’ underlying states (Eq. 5). To appropriately account for our uncertainty about the state feedback participants actually observed, we computed the distribution of COIN model inferences by integrating over the possible sequences of state feedback observations y1:T given the sequence of true states (experimentally applied perturbations)39. Specifically, on each trial, the true state was assigned a value of 0 (null-field trials), +1 (P+ perturbation trials) or −1 (P− perturbation trials), and yt was assumed to be distributed around the true state with the i.i.d. zero-mean Gaussian observation noise of Eq. 5, except on channel trials (Pc), where we treated yt as unobserved, as the state (the magnitude of a force field) was not observed by the participants on those trials. Note that the distribution of state feedback given the true state shares the same parameters as those underlying the COIN model inferences as it is self-consistently defined by the generative model. All figures showing COIN model inferences applied to experimental data (i.e. all but Fig. 1) show the quantities described in the previous section after the state feedback has been integrated out (Fig. 1d–f shows COIN model inferences conditioned on the state feedback sequence shown in Fig. 1c).
Second, real participants’ behaviour can always be subject to influences not explicitly included in the COIN model. In order to account for these uncontrolled and unmodelled factors, we introduced a phenomenological ‘motor noise’ component that related the motor output ut of the COIN model (Eq. 16) to the experimentally measured adaptation at via i.i.d. zero-mean Gaussian noise:
at = ut + εt,    εt ~ N(0, σm²)    (17)
where σm is the standard deviation of the motor noise.
Model fitting and model comparison
In Experiments 1 and 2, we fit the parameters of the COIN model ϑ to participants’ data by maximising the data log likelihood using Bayesian adaptive direct search (BADS)40. In Experiment 1, ϑ = {σq, μa, σa, σd, α, ρ, σm}, where
ρ = κ / (α + κ)    (18)
is the normalised self-transition bias parameter. In Experiment 2, which included sensory cues, the additional parameter αe was also fit. In Experiment 1, we also fit two-state (dual-rate) and three-state state-space models to the data of individual participants by minimising the mean squared error using MATLAB’s fmincon and BADS. In all cases, optimisation was performed from 30 random initial parameter settings (see Suppl. Inf.).
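A crude stand-in for this fitting pipeline, with a random-restart local search in place of BADS and a one-parameter Gaussian likelihood (Eq. 17) for concreteness; the function names, search settings and synthetic data are ours:

```python
import numpy as np

def neg_log_likelihood(params, u_model, a_data):
    """Negative log likelihood of measured adaptation given model output
    under Eq. 17; here params = [sigma_m] and u_model is precomputed."""
    sigma_m = params[0]
    resid = a_data - u_model
    return 0.5 * np.sum(resid**2 / sigma_m**2 + np.log(2 * np.pi * sigma_m**2))

def fit_random_restarts(loss, bounds, n_restarts, n_iters, rng):
    """Crude stand-in for BADS: hill-climbing from several random starts."""
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    best_x, best_f = None, np.inf
    for _ in range(n_restarts):
        x = np.array([rng.uniform(l, h) for l, h in bounds])
        f = loss(x)
        for _ in range(n_iters):
            cand = np.clip(x + rng.normal(0.0, 0.05, len(x)), lo, hi)
            fc = loss(cand)
            if fc < f:           # accept only improvements
                x, f = cand, fc
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

rng = np.random.default_rng(0)
u_model = np.zeros(1000)                 # model output (flat, for illustration)
a_data = rng.normal(0.0, 0.2, 1000)      # synthetic adaptation measurements
best, _ = fit_random_restarts(
    lambda p: neg_log_likelihood(p, u_model, a_data),
    bounds=[(0.01, 1.0)], n_restarts=5, n_iters=200, rng=rng)
```

The recovered σm should approach the maximum-likelihood value, the root-mean-square residual of the synthetic data.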
To perform model comparison for individual participants, we calculated the Bayesian information criterion (BIC). A BIC difference of greater than 4.6 nats (a Bayes factor of greater than 10) is considered to provide strong evidence in favour of the model with the lower BIC value41. To perform model comparison at the group level, we calculated the group-level BIC, which is the sum of BICs over individuals42.
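The BIC computations can be made concrete as follows (a BIC difference of 2 ln 10 ≈ 4.6 nats corresponds to a Bayes factor of 10); the helper name and example numbers are ours:

```python
import numpy as np

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * np.log(n_obs) - 2.0 * log_lik

# Group-level BIC is the sum of individual BICs (illustrative values).
group_bic = sum(bic(ll, k, n)
                for ll, k, n in [(-100.0, 7, 500), (-120.0, 7, 500)])
```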
Parameter and model recovery
We used the parameters from the fits of the COIN and dual-rate models to the data of each participant in the spontaneous and evoked recovery experiments to generate 10 synthetic data sets per model class (COIN and dual-rate) for each participant from the corresponding experiment. In the dual-rate model, the only source of variability across the different synthetic data sets for a given participant was motor noise. In contrast, for the COIN model, sensory noise provided another source of variability in addition to motor noise. We fit both model classes to each synthetic data set as we did with real data (see above).
For parameter recovery (Extended Data Fig. 2c), we compared the COIN model parameters that were used to generate the synthetic data (‘true’ parameters) with the COIN model parameters fit to these synthetic data sets (‘recovered’ parameters).
For model recovery (Extended Data Fig. 2d–e), we examined the proportion of times the difference in BIC between the COIN and dual-rate fits favoured the true model class that generated the data.
Simulating existing data sets
We performed COIN model simulations on a diverse set of existing data in Fig. 4 (and similarly in Extended Data Figs. 5, 8 and 9) in a purely cross-validated manner: we used model parameters fitted to participants in our own experiments to make predictions for experiments conducted in other laboratories using other paradigms.
The paradigms in Fig. 4 and Extended Data Fig. 8 were simulated using the 40 sets of parameters fit to our individual participants’ data from both experiments. One hundred simulations (each conditioned on a different noisy state feedback sequence) were performed for each parameter set. The results shown are based on the average of all of these simulations.
The paradigms in Extended Data Fig. 5a–o and Extended Data Fig. 9 were variations of the standard spontaneous recovery paradigm. Therefore, we simulated these paradigms (as well as the paradigm in Extended Data Fig. 5p–s) using the parameters fit to the average spontaneous and evoked recovery data sets. One hundred simulations (each conditioned on a different noisy state feedback sequence) were performed. The results shown are based on the average of these simulations.
Modelling working memory
A working memory task performed after the last P− trial of a spontaneous recovery paradigm has been shown to interfere with spontaneous recovery, producing an effect that is reminiscent of evoked recovery on the first Pc trial (Extended Data Fig. 9a, Ref. 22). We modelled the effect of the working memory task as selectively abolishing the (working) memory of the responsibilities on the last P− trial (Extended Data Fig. 9b–d). This means that on the first Pc trial, the predicted probabilities are based on the expected context frequencies (the stationary probabilities).
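The stationary probabilities invoked here can be computed as the left eigenvector of the context transition matrix with unit eigenvalue; a minimal sketch with an invented 2 × 2 example (the function name is ours):

```python
import numpy as np

def stationary_distribution(Pi):
    """Stationary probabilities of a row-stochastic transition matrix:
    the left eigenvector of Pi with eigenvalue 1, normalised to sum to one."""
    vals, vecs = np.linalg.eig(Pi.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

# Invented example: a sticky context 1 and a less sticky context 2.
Pi = np.array([[0.9, 0.1],
               [0.5, 0.5]])
p_stationary = stationary_distribution(Pi)
```

In the modelled working-memory manipulation, predicted probabilities on the first Pc trial would fall back on such stationary (expected-frequency) probabilities rather than on the abolished previous-trial responsibilities.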
Modelling visuomotor learning and its explicit and implicit components
In visuomotor rotation experiments, the cursor moves in a different direction to the hand (which is occluded from vision). Hence, visuomotor rotations introduce a discrepancy between the location of the hand as sensed by vision and proprioception. To model this discrepancy, we include a context-specific bias parameter in the state feedback (Eq. 5):
yt = xt + b(ct) + vt    (19)
To support Bayesian inference, we place a normal distribution prior over this parameter:
b(j) ~ N(μb, σb²)    (20)
We set μb to zero based on the assumption that positive and negative biases are equally probable and σb to 1/70 by hand to match the empirical data in Extended Data Fig. 9e. We extend and modify the inference algorithm accordingly (see Suppl. Inf.).
On each trial, the state feedback was assigned a value of 0 (no rotation trials), +1 (P+ rotation trials) or −1 (P− rotation trials) plus the i.i.d. zero-mean Gaussian observation noise of Eq. 5. Visual error-clamp trials (Pc) were modelled in the same way as channel trials (i.e. with state feedback unobserved). Adaptation was modelled as the mean of the predicted state feedback distribution (Extended Data Fig. 5q and Extended Data Fig. 9f, dashed pink) plus Gaussian motor noise.
We also modelled an experiment in which an explicit judgement of the perturbation is obtained on every trial, and the implicit component is taken as the difference between adaptation and the explicit judgement23. We hypothesised that participants have explicit access to the state representing their belief about the visuomotor rotation but do not have access to the bias in the state feedback, which is therefore implicit. Hence, we mapped the state of the context with the highest responsibility on the previous trial (Extended Data Fig. 9h, black line) onto the explicit component and the average bias across contexts weighted by the predicted probabilities (Extended Data Fig. 9j, cyan line) onto the implicit component. Adaptation is then, by definition, the sum of these two components (Extended Data Fig. 9e, solid pink) plus Gaussian motor noise. See Suppl. Inf. for full details.
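The mapping described above can be sketched numerically; all values are invented for illustration and the variable names are ours:

```python
import numpy as np

resp_prev = np.array([0.8, 0.2])   # responsibilities on the previous trial
pred_prob = np.array([0.7, 0.3])   # predicted probabilities on this trial
states = np.array([0.9, -0.1])     # context states (rotation beliefs)
biases = np.array([0.3, 0.0])      # context-specific state-feedback biases

# Explicit: state of the context with the highest previous-trial responsibility.
explicit = states[np.argmax(resp_prev)]
# Implicit: average bias across contexts, weighted by predicted probabilities.
implicit = np.sum(pred_prob * biases)
# Adaptation is, by definition, their sum (plus motor noise, omitted here).
adaptation = explicit + implicit
```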
Data availability
All experimental data are publicly available at the Dryad repository (https://doi.org/10.5061/dryad.m63xsj42r). The data include the raw kinematics and force profiles of individual participants on all trials as well as the adaptation measures used to generate the experimental data shown in Fig. 2c,e and Fig. 3d.
Code availability
Code for the COIN model is available at GitHub (https://github.com/jamesheald/COIN).
Extended Data
Extended Data Table 1 |. Comparison of the COIN model to other models.
Models are grouped into single-context models (dual-rate; memory of errors) and multiple-context models (source of errors; winner-take-all; DP-KF; MOSAIC; COIN). Parenthesised letters refer to the notes below; ✔ = accounts for the phenomenon, ✘ = does not, a bare letter = partial account.

| | dual-rate, Smith et al.1 | memory of errors, Herzfeld et al.4 | source of errors, Berniker & Körding16 | winner-take-all, Oh & Schweighofer11 | DP-KF, Gershman et al.12 | MOSAIC, Haruno et al.26 | COIN |
|---|---|---|---|---|---|---|---|
| spontaneous recovery | ✔ | ✘ (a) | ✘ (b) | ✘ (b) | ✘ (c) | ✘ (d) | ✔ |
| evoked recovery | ✘ (e) | ✘ (e) | (f) | (f) | (f) | ✘ (d) | ✔ |
| memory updating | ✘ (g) | ✘ (g) | ✘ (g) | ✘ (h) | ✘ (g,h) | ✔ | ✔ |
| savings after full washout | ✘ (i) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| anterograde interference | ✔ | ✘ (a) | ✘ (b) | ✘ (b) | ✔ | ✘ (j) | ✔ |
| environmental consistency | ✘ (i) | ✔ | ✘ (b) | ✘ (b) | ✘ (k) | ✔ | ✔ |
| explicit/implicit learning | (m) | ✘ (l) | ✘ (l) | ✘ (l) | ✘ (l) | ✘ (l) | ✔ |
Spontaneous recovery, the gradual re-expression of P+ in the channel-trial phase (Fig. 2c), requires a single-context model to have multiple states that decay on different time scales or a multiple-context model that can change the expression of memories in a gradual manner based on the amount of experience with each context. Therefore, single-context models that have a single state (a), or multiple-context models that do not learn context transition probabilities (b) or do not have state dynamics (d), do not show spontaneous recovery. Models that learn transition probabilities but do not represent uncertainty about the previous context (c) (the ‘local’ approximation in DP-KF) can either include a self-transition bias or not. With a self-transition bias, the expression of memories changes in an abrupt manner (akin to evoked recovery) when, in the channel-trial phase, the belief about the previous context changes (e.g. from P− to P+), and thus such models fail to explain the gradual nature of spontaneous recovery. Without a self-transition bias, the change in expression of memories is gradual, based on updated context counts, but occurs too slowly relative to the time scale on which spontaneous recovery rises.
Evoked recovery, the rapid re-expression of the memory of P+ in the channel-trial phase (Fig. 2e) that does not simply decay exponentially to baseline (Extended Data Fig. 6e), requires a model to be able to switch between different memories based on state feedback. Therefore, single-context models (e) that cannot switch between memories are unable to show the evoked recovery pattern seen in the data. Multiple-context models with memories that decay exponentially to zero in the absence of observations (f) (as during channel trials) can only partially explain evoked recovery, showing the initial evocation but not the subsequent change in adaptation over the channel-trial phase. Models with no state decay (d) cannot explain evoked recovery.
Memory updating requires a model to update memories in a graded fashion and to use sensory cues to compute these graded updates. Therefore, models that have no concept of sensory cues (g), or multiple-context models that update only the state of the most probable context in an all-or-none manner (h), do not show graded memory updating.
Savings after full washout, i.e. faster learning during re-exposure than during initial exposure, requires a single-context model to increase its learning rate or a multiple-context model to protect its memories from washout and/or learn context transition probabilities. Therefore, single-context models with fixed learning rates (i) do not show savings.
Anterograde interference, in which increasing exposure to P+ leads to slower subsequent adaptation to P−, requires a single-context model to learn on multiple time scales or a multiple-context model to learn transition probabilities that generalise across contexts. Therefore, single-context models with a single state (a), or multiple-context models that either do not learn transition probabilities (b) or learn local transition probabilities independently for each row of the transition probability matrix (j), do not show anterograde interference.
Environmental consistency, the increase/decrease in single-trial learning for slowly/rapidly switching environments, requires a model either to adapt its learning rate or to learn local transition probabilities based on context transition counts. Therefore, single-context models with fixed learning rates (i), or multiple-context models that either do not learn transition probabilities (b) or learn non-local transition probabilities based only on context counts (k), do not show the effects of environmental consistency on single-trial learning.
Explicit and implicit learning, the decomposition of visuomotor learning into explicit and implicit components, requires a model to have elements that can be mapped onto these components. For most models, there is no clear way to do so (l). It has been suggested that the fast and slow processes of the dual-rate model correspond to the explicit and implicit components of learning, respectively. However, in a spontaneous recovery paradigm, this mapping only holds during initial exposure and fails to account for the time course of the implicit component during the counter-exposure and channel-trial phases (m) (see Suppl. Inf.).
Supplementary Material
Acknowledgements
This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 726090 to M.L.), the Wellcome Trust (Investigator Awards 212262/Z/18/Z to M.L. and 097803/Z/11/Z to D.M.W.), the Royal Society (Noreen Murray Professorship in Neurobiology to D.M.W.), the National Institutes of Health (R01NS117699 and U19NS104649 to D.M.W.) and the Engineering and Physical Sciences Research Council (studentship to J.B.H.). For the purpose of open access, the authors have applied a CC-BY public copyright licence to any author accepted manuscript version arising from this submission. We thank J. Ingram for technical support and G. Hennequin for discussions.
Footnotes
Competing interests
The authors have no competing interests.
References
- 1. Smith MA, Ghazizadeh A & Shadmehr R. Interacting adaptive processes with different timescales underlie short-term motor learning. PLoS Biol. 4, e179 (2006).
- 2. Kitago T, Ryan S, Mazzoni P, Krakauer JW & Haith AM. Unlearning versus savings in visuomotor adaptation: comparing effects of washout, passage of time, and removal of errors on motor memory. Front. Hum. Neurosci. 7, 307 (2013).
- 3. Sing GC & Smith MA. Reduction in learning rates associated with anterograde interference results from interactions between different timescales in motor adaptation. PLoS Comput. Biol. 6, e1000893 (2010).
- 4. Herzfeld DJ, Vaswani PA, Marko MK & Shadmehr R. A memory of errors in sensorimotor learning. Science 345, 1349–1353 (2014).
- 5. Gonzalez Castro LN, Hadjiosif AM, Hemphill MA & Smith MA. Environmental consistency determines the rate of motor adaptation. Curr. Biol. 24, 1050–1061 (2014).
- 6. McDougle SD et al. Credit assignment in movement-dependent reinforcement learning. Proc. Natl. Acad. Sci. 113, 6797–6802 (2016).
- 7. Donchin O, Francis JT & Shadmehr R. Quantifying generalization from trial-by-trial behavior of adaptive systems that learn with basis functions: theory and experiments in human motor control. J. Neurosci. 23, 9032–9045 (2003).
- 8. Thoroughman KA & Shadmehr R. Learning of action through adaptive combination of motor primitives. Nature 407, 742–747 (2000).
- 9. Shadmehr R, Smith MA & Krakauer JW. Error correction, sensory prediction, and adaptation in motor control. Ann. Rev. Neurosci. 33, 89–108 (2010).
- 10. Wolpert DM & Kawato M. Multiple paired forward and inverse models for motor control. Neural Netw. 11, 1317–1329 (1998).
- 11. Oh Y & Schweighofer N. Minimizing precision-weighted sensory prediction errors via memory formation and switching in motor adaptation. J. Neurosci. 9237–9250 (2019).
- 12. Gershman SJ, Radulescu A, Norman KA & Niv Y. Statistical computations underlying the dynamics of memory updating. PLoS Comput. Biol. 10, e1003939 (2014).
- 13. Hulst T et al. Cerebellar degeneration reduces memory resilience after extended training. bioRxiv (2020).
- 14. Pekny SE, Criscimagna-Hemminger SE & Shadmehr R. Protection and expression of human motor memories. J. Neurosci. 31, 13829–13839 (2011).
- 15. Rescorla RA & Heth CD. Reinstatement of fear to an extinguished conditioned stimulus. J. Exp. Psychol.: Animal Behavior Processes 1, 88–96 (1975).
- 16. Berniker M & Körding K. Estimating the sources of motor errors for adaptation and generalization. Nat. Neurosci. 11, 1454–1461 (2008).
- 17. Taylor JA, Wojaczynski GJ & Ivry RB. Trial-by-trial analysis of intermanual transfer during visuomotor adaptation. J. Neurophysiol. 106, 3157–3172 (2011).
- 18. Körding KP, Tenenbaum JB & Shadmehr R. The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci. 10, 779–786 (2007).
- 19. Heald JB, Ingram JN, Flanagan JR & Wolpert DM. Multiple motor memories are learned to control different points on a tool. Nat. Hum. Behav. 2, 300–311 (2018).
- 20. Coltman SK, Cashaback JGA & Gribble PL. Both fast and slow learning processes contribute to savings following sensorimotor adaptation. J. Neurophysiol. 121, 1575–1583 (2019).
- 21. Huang VS, Haith A, Mazzoni P & Krakauer JW. Rethinking motor learning and savings in adaptation paradigms: model-free memory for successful actions combines with internal models. Neuron 70, 787–801 (2011).
- 22. Keisler A & Shadmehr R. A shared resource between declarative memory and motor memory. J. Neurosci. 30, 14817–14823 (2010).
- 23. McDougle SD, Ivry RB & Taylor JA. Taking aim at the cognitive side of learning in sensorimotor adaptation tasks. Trends Cogn. Sci. 20, 535–544 (2016).
- 24. McDougle SD, Bond KM & Taylor JA. Explicit and implicit processes constitute the fast and slow processes of sensorimotor learning. J. Neurosci. 35, 9568–9579 (2015).
- 25. Miyamoto YR, Wang S & Smith MA. Implicit adaptation compensates for erratic explicit strategy in human motor learning. Nat. Neurosci. 23, 443–455 (2020).
- 26. Haruno M, Wolpert DM & Kawato M. MOSAIC model for sensorimotor learning and control. Neural Comput. 13, 2201–2220 (2001).
- 27. Gershman SJ, Blei DM & Niv Y. Context, learning, and extinction. Psychol. Rev. 117, 197–209 (2010).
- 28. Sanders H, Wilson MA & Gershman SJ. Hippocampal remapping as hidden state inference. eLife 9, e51140 (2020).
- 29. Collins A & Koechlin E. Reasoning, learning, and creativity: frontal lobe function and human decision-making. PLoS Biol. 10, e1001293 (2012).
- 30. Collins AGE & Frank MJ. Cognitive control over learning: creating, clustering, and generalizing task-set structure. Psychol. Rev. 120, 190–229 (2013).
- 31. Howard IS, Ingram JN & Wolpert DM. A modular planar robotic manipulandum with end-point torque control. J. Neurosci. Methods 181, 199–211 (2009).
- 32. Milner TE & Franklin DW. Impedance control and internal model use during the initial stage of adaptation to novel dynamics in humans. J. Physiol. 567, 651–664 (2005).
- 33. Scheidt RA, Reinkensmeyer DJ, Conditt MA, Rymer WZ & Mussa-Ivaldi FA. Persistence of motor adaptation during constrained, multi-joint, arm movements. J. Neurophysiol. 84, 853–862 (2000).
- 34. Teh YW, Jordan MI, Beal MJ & Blei DM. Hierarchical Dirichlet processes. J. Amer. Stat. Assoc. 101, 1566–1581 (2006).
- 35. Fox EB, Sudderth EB, Jordan MI & Willsky AS. An HDP-HMM for systems with state persistence. In Proc. 25th Int. Conf. Machine Learning, 312–319 (2008).
- 36. Teh YW. Dirichlet processes. In Encyclopedia of Machine Learning (Springer, 2010).
- 37. Carvalho CM, Johannes MS, Lopes HF & Polson NG. Particle learning and smoothing. Stat. Sci. 25, 88–106 (2010).
- 38. Bernardo J et al. Particle learning for sequential Bayesian computation. In Bayesian Statistics 9, 317 (2011).
- 39. Houlsby N et al. Cognitive tomography reveals complex, task-independent mental representations. Curr. Biol. 23, 2169–2175 (2013).
- 40. Acerbi L & Ma WJ. Practical Bayesian optimization for model fitting with Bayesian adaptive direct search. In Adv. Neural Inf. Proc. Sys., 1836–1846 (2017).
- 41. Jeffreys H. The Theory of Probability (OUP Oxford, 1998).
- 42. Li J, Wang ZJ, Palmer SJ & McKeown MJ. Dynamic Bayesian network modeling of fMRI: a comparison of group-analysis methods. Neuroimage 41, 398–407 (2008).