Spatiotemporal dissociation of fMRI activity in the caudate nucleus underlies human de novo motor skill learning

Yera Choi; Emily Yunha Shin; Sungshin Kim

doi:10.1073/pnas.2003963117

2020 Sep 8;117(38):23886–23897. doi: 10.1073/pnas.2003963117

PMCID: PMC7519330 PMID: 32900934

Significance

Numerous real-world motor skills require learning arbitrary relationships between actions and their consequences from scratch. However, little is understood about the neural signatures of de novo motor learning and associated individual variability. In a longitudinal fMRI experiment, where participants learned to control a cursor by moving fingers, we found a gradual transition of performance-related activity from the head to tail of the caudate nucleus. This finding reflects the flexible and stable reward representations in the head and tail, respectively. Additionally, intrinsic cortico-caudate connectivity predicted better learners with weaker head–prefrontal and stronger tail–sensorimotor interactions. The present study provides unprecedented insight into de novo motor learning, which may contribute to the understanding of motor-related disorders, and infant learning.

Keywords: de novo motor skill learning, caudate nucleus, spatiotemporal dissociation, cortico-caudate interactions, fMRI

Abstract

Motor skill learning involves a complex process of generating novel movement patterns guided by evaluative feedback, such as a reward. Previous literature has suggested anteroposteriorly separated circuits in the striatum to be implicated in early goal-directed and later automatic stages of motor skill learning, respectively. However, the involvement of these circuits has not been well elucidated in human de novo motor skill learning, which requires learning arbitrary action–outcome associations and value-based action selection. To investigate this issue, we conducted a human functional MRI (fMRI) experiment in which participants learned to control a computer cursor by manipulating their right fingers. We discovered a double dissociation of fMRI activity in the anterior and posterior caudate nucleus, which was associated with performance in the early and late learning stages. Moreover, cognitive and sensorimotor cortico-caudate interactions predicted individual learning performance. Our results suggest parallel cortico-caudate networks operating in different stages of human de novo motor skill learning.

Motor skill learning is a complex process of acquiring new patterns of movements to achieve more accurate and faster performance for motor tasks through repetitive practice (1). It is an adaptive mechanism crucial for the survival and well-being of all animals, as it enables more efficient motor behaviors that may lead to favorable outcomes, such as greater rewards per unit time (2).

Existing literature has revealed that the dynamic interplay of multiple brain regions—encompassing the frontoparietal cortices, cerebellum, and basal ganglia—is required for the acquisition of new motor skills (3, 4). The cortico-basal ganglia circuit is of particular interest, as many anatomical findings (5, 6), animal studies (7–10), and human functional magnetic resonance imaging (fMRI) studies (4, 11–17) have shown that distinct patterns of interaction between the structures in this circuit arise in different learning stages. Specifically, the evidence suggests that a transition from the anterior associative to the posterior sensorimotor (in rodents, from the dorsomedial to the dorsolateral) regions in the striatum occurs during learning.

Notably, many studies that have reported the differential involvement of the separate cortico-basal ganglia circuits during motor skill learning adopted sequence-learning tasks (11–15, 17, 18), in which participants practiced repetitive sequences of well-learned discrete actions such as button press. Although such sequence-learning paradigms have elucidated many essential aspects of motor learning, they may provide limited explanations for the acquisition of continuous motor skills involving explorative action selections (1, 19).

Meanwhile, relatively few studies employed de novo motor learning paradigms, in which participants learned arbitrary associations between stimuli and required actions (20–23). Accordingly, there remains much to be elucidated about the neural basis of de novo motor learning. In this type of learning, action selection is sensitive to evaluative feedback such as reward, and, thus, the basal ganglia circuits processing reward are particularly important, as implied by studies in patients with motor-related neurodegenerative disorders (24). Surprisingly, however, the direct neural signatures of reward-based motor learning have been rarely investigated in human experiments, although a few behavioral and modeling studies discussed them (25–27).

Consistent with the idea of the functional dissociation in the basal ganglia for motor learning, recent nonhuman primate studies have found that the anterior and posterior regions of the caudate nucleus, respectively, modulate early flexible and later stable values arbitrarily associated with visual stimuli (28, 29). Such modulation of changing values would be crucial for the selection of appropriate actions based on their values, as well as for the transition from early goal-directed to later automatic behavior.

The present study thus aimed to investigate whether the subregions of the human caudate nucleus serve distinct functions in different stages of continuous de novo motor skill learning based on evaluative feedback. We conducted a series of behavioral and fMRI experiments, in which participants learned to control an on-screen cursor via a magnetic resonance imaging (MRI)-compatible data-glove interface. Hand postures recorded by 14 sensors of the data glove were linearly mapped onto cursor locations.

As a result, we found a clear double dissociation of performance-related fMRI activities in the caudate nucleus, which declined in the anterior region but increased in the posterior region from the early to late learning stage. Furthermore, we also found that the intrinsic functional interactions between the caudate nucleus and cortical regions were predictive of individual learning performance. In sum, our findings suggest that the caudate nucleus serves as the primary locus for de novo motor skill learning in humans.

Results

Successful Learning of De Novo Motor Skills.

Thirty participants completed the study, which included two fMRI sessions separated by five behavioral training sessions outside the MRI scanner (Fig. 1). In the first and second fMRI sessions, participants performed the main task of reaching four on-screen targets with a cursor by moving their right fingers, while learning two different mappings between the cursor and finger movements. During the behavioral training sessions, participants were trained only for the first mapping that they learned in the first fMRI session (i.e., the trained mapping). By contrasting the trained and untrained mappings that shared an identical target sequence, we intended to control for nonspecific effects on learning, such as prolonged exposure to the task. In addition to the main task, participants also performed a secondary task of moving the cursor along a continuous path with the trained mapping at the end of the training session. This path-following task was designed to encourage participants to learn the generalized mapping between the cursor position and hand posture, rather than to merely memorize the hand postures corresponding to the four targets (SI Appendix, Fig. S1 A and B).

Fig. 1. — Overview of the experiment design. (A) Overview of the de novo motor learning task. Participants learned to control a cursor (white crosshair) to reach a target (gray rectangular grid cell with a yellow crosshair at its center) on a 5 × 5 grid by manipulating their right fingers, using a data glove measuring degrees of freedom with 14 sensors on finger flexures and abductions. The 14-dimensional vector representing hand posture was linearly mapped onto the two-dimensional position of the cursor. When the cursor was on a target grid cell, the color of the target cell was changed to red. Participants were instructed to stay on the target as long as possible and try to get as close as possible to the center of the target. The upper right depicts the four target locations. (B) Visit schedule of the entire experiment. Visit 1 (fMRI 1): The familiarization and calibration phases determined and inspected the cursor-hand posture mapping created from random finger movements. An fMRI session composed of a resting-state fMRI scan, a localizer scan to identify regions implicated in random finger movements, and seven runs of the main-task fMRI scan including both of the trained (green; runs 1 to 3 and 7) and untrained (gray; runs 4 to 6) mappings followed. Visits 2 to 6 (Behav 1 to 5): Behavioral training sessions only on the trained mapping (three main-task runs, two path-following runs) were conducted outside the MRI scanner. Visit 7 (fMRI 2): Seven runs of the main-task fMRI, including both trained and untrained mappings, were conducted. PF, path-following task; rs-fMRI, resting-state fMRI.
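The linear posture-to-cursor mapping described in the caption can be illustrated with a minimal sketch. The function name `posture_to_cursor`, the matrix `A`, and its random values are hypothetical; in the study, the mapping was created from each participant's random finger movements during calibration, and those values are not reproduced here.

```python
import numpy as np

def posture_to_cursor(posture, mapping):
    """Linearly map a 14-D hand posture vector onto a 2-D cursor position."""
    posture = np.asarray(posture, dtype=float)
    mapping = np.asarray(mapping, dtype=float)
    if mapping.shape != (2, 14) or posture.shape != (14,):
        raise ValueError("expected a 2 x 14 mapping and a 14-D posture")
    return mapping @ posture

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 14))      # hypothetical mapping matrix (illustration only)
posture = rng.uniform(size=14)    # hypothetical sensor readings
cursor = posture_to_cursor(posture, A)
print(cursor.shape)  # (2,)
```

Because the map reduces 14 dimensions to 2, many different hand postures produce the same cursor position, which is part of what participants must resolve through practice.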

As demonstrated by the cursor trajectories of a representative participant (Fig. 2A), the participants acquired the ability to maneuver the cursor to the targets, and this ability improved significantly from the first to the second fMRI session. All participants showed significant improvement of performance in terms of the straightness of cursor trajectories (Fig. 2B) and the success rate (Fig. 2C and see SI Appendix, Fig. S2 for the individual participants), primarily for the trained mapping. The success rate, which was calculated as the fraction of time during which the cursor stayed on the targets and was used as the major indicator of performance, tended to increase throughout the entire experiment. This increase suggests that the participants successfully learned to reach the targets quickly and stably for the trained mapping. Between-session forgetting, which was observed as a drop of task performance at the beginning of each session, tended to decrease across sessions as in motor adaptation tasks (30). Participants also showed improved trained-mapping performance for the path-following task, which implemented an adaptive design in which better performance led to faster completion. The time to complete the path-following runs significantly declined from the first to the last training session (P < 10⁻⁵, Wilcoxon signed-rank test; SI Appendix, Fig. S1C), suggesting successful learning of the cursor-posture mapping.
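As a minimal sketch of the success-rate measure defined above (the fraction of time the cursor stays on the target), the following assumes uniformly sampled cursor positions and an axis-aligned rectangular target cell. The function name, the coordinate units, and the sample values are hypothetical.

```python
import numpy as np

def success_rate(cursor_xy, cell_min, cell_max):
    """Fraction of samples in which the cursor lies inside the target cell."""
    cursor_xy = np.asarray(cursor_xy, dtype=float)
    inside = np.all((cursor_xy >= cell_min) & (cursor_xy <= cell_max), axis=1)
    return float(inside.mean())

# Four hypothetical cursor samples; two of them fall inside the target cell.
samples = np.array([[0.1, 0.1], [0.5, 0.5], [0.55, 0.45], [0.9, 0.9]])
rate = success_rate(samples,
                    cell_min=np.array([0.4, 0.4]),
                    cell_max=np.array([0.6, 0.6]))
print(rate)  # 0.5
```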

In the case of the untrained mapping, to which participants were exposed only during the two fMRI sessions, the learning curves showed patterns similar to those of the first two sessions with the trained mapping. Indeed, there were no significant differences in the amount of learning between the first sessions with the two mappings (i.e., the trained- and untrained-mapping sessions during the first fMRI; P = 0.20, Wilcoxon signed-rank test), as well as between the second sessions with the mappings (i.e., the trained-mapping session during the first behavioral training and the untrained-mapping session during the second fMRI; P = 0.80, Wilcoxon signed-rank test) (SI Appendix, Fig. S3).

Spatiotemporal Dissociation of the fMRI Activity Related to Motor Skill Learning.

Seeking to reach and stay on targets, the participants learned to select and execute a series of continuous actions leading to higher success rates, through the trial-and-error exploration of the action–outcome mapping. In this respect, the success rates associated with different actions can be considered as “action values.” A high success rate at one time point would indicate that the cursor had reached the target and needed to be maintained at the current location and that the actions related to the current outcome should be associated with a high value or reward. Accordingly, the success rate would be positively correlated with the reward-processing activity, the pattern of which would change in different learning stages. We thus performed whole-brain voxel-wise general linear model (GLM) analyses with a parametric regressor modulated by the time-varying success rate using the data from the two fMRI sessions to delineate the regions involved in value or reward processing in the early and later stages (SI Appendix, Figs. S4 and S11).
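The idea of a success-rate parametric regressor can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the double-gamma response shape, the mean-centering of the modulator, the repetition time `tr`, and all function names are hypothetical choices commonly made in fMRI analysis.

```python
import numpy as np
from math import gamma

def canonical_hrf(tr, duration=32.0):
    """Double-gamma haemodynamic response sampled every `tr` seconds (assumed shape)."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / gamma(6)           # gamma density, shape 6
    undershoot = t ** 15 * np.exp(-t) / gamma(16)   # gamma density, shape 16
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()

def parametric_regressors(task_on, success, tr):
    """Return the main task regressor and its success-rate parametric modulator."""
    task_on = np.asarray(task_on, dtype=float)
    success = np.asarray(success, dtype=float)
    # Mean-center the modulator within task periods so it captures only
    # variation around the average task response.
    centered = np.where(task_on > 0, success - success[task_on > 0].mean(), 0.0)
    hrf = canonical_hrf(tr)
    n = task_on.size
    main = np.convolve(task_on, hrf)[:n]
    modulated = np.convolve(task_on * centered, hrf)[:n]
    return main, modulated

tr = 2.0  # hypothetical repetition time in seconds
task_on = np.r_[np.zeros(5), np.ones(20), np.zeros(5)]
success = np.r_[np.zeros(5), np.linspace(0.0, 1.0, 20), np.zeros(5)]
main, modulated = parametric_regressors(task_on, success, tr)
```

In a GLM, the coefficient fitted to `modulated` at each voxel would then index how strongly the signal tracks the success rate, over and above the mean task response captured by `main`.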

Whole-brain analyses.

The analyses identified distinctive patterns of fMRI activity modulated by the success rate in the early vs. late learning stages (Fig. 3 and SI Appendix, Tables S1 and S2). During early learning, robust positive success-rate modulation was observed in the striatum, including the nucleus accumbens (NAc), anterior regions of the putamen (AP), and caudate nucleus (CDh), reflecting their roles in goal-directed behaviors (Fig. 3, Upper). These regions were no longer involved in the late learning stage; within the striatum, only the bilateral posterior caudate nucleus (CDt) was positively related to the success-rate modulation (Fig. 3, Lower). In the cortical regions, the modulatory activity globally decreased in the cognitive-attentional network, including the supramarginal gyrus, insula, and superior parietal cortex (31, 32), and the ventral visual network (SI Appendix, Fig. S5 and Table S1). These changes potentially reflect suppressed activity related to habitual responses and increasing neural efficiency as learning proceeded (3, 12, 17, 33). Interestingly, the left superior frontal cortex or frontal pole cortex (FPC), ventromedial prefrontal cortex (VMPFC), and precuneus/middle cingulate cortex appeared to be involved in both early and late learning, potentially due to their functions commonly required throughout the learning stages (Fig. 3 and SI Appendix, Tables S1 and S2).

Region-of-interest analyses in the striatum and VMPFC.

To closely examine the performance-related fMRI activity in reward-processing regions across learning stages, we conducted region-of-interest (ROI) analyses primarily in the striatum, including the subregions of the caudate nucleus (head, body, and tail), putamen, and NAc. As there exists evidence of the learning-induced transition of activity from the anterodorsal to posteroventral regions of the putamen (11), subdivided ROIs of the putamen were defined accordingly (SI Appendix, Fig. S6). For the cortical region, we included the VMPFC due to its established role in value representation and reward prediction (34–36).

We first investigated whether the success-modulated fMRI activity would show changes across learning stages differentially in the subregions of the caudate nucleus as well as in other ROIs, using the trained-mapping data. For each caudate subregion, the left and right ROIs were combined, as their respective activities did not differ from each other (corrected P > 0.61 for all subregions, pairwise t test). The patterns of success-modulated activity indeed differed by the regions [two-way ROI × learning stage repeated-measures permutation ANOVA, F(6,174) = 12.96, P = 10⁻⁴, η_p² = 0.31], by learning stage [F(1,29) = 5.26, P = 0.027, η_p² = 0.15], and also by the interaction of the regions and learning stages [F(6,174) = 21.16, P = 10⁻⁴, η_p² = 0.42] (Fig. 4 and SI Appendix, Fig. S6).
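The reported effect sizes are consistent with the standard identity that recovers partial eta squared from an F statistic and its degrees of freedom. The sketch below applies that identity to the values in the text; whether the original analysis computed η_p² in exactly this way is an assumption, and the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

print(round(partial_eta_squared(12.96, 6, 174), 2))  # 0.31 (ROI)
print(round(partial_eta_squared(5.26, 1, 29), 2))    # 0.15 (learning stage)
print(round(partial_eta_squared(21.16, 6, 174), 2))  # 0.42 (ROI x learning stage)
```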

Post hoc analyses revealed the hypothesized anteroposterior transition in the caudate nucleus (Fig. 4, Lower). From the early to the late stage of learning, the success-modulated activity decreased in the caudate head [T(29) = 3.56, corrected P = 0.0069, Cohen’s d = 0.65], yet increased in the caudate tail [T(29) = 4.43, corrected P < 10⁻³, Cohen’s d = 0.81]. No such change was observed in the caudate body [T(29) = 0.45, corrected P = 1.00, Cohen’s d = 0.082] (Fig. 4).

A general tendency of decreasing success-rate modulation was observed in other striatal regions (SI Appendix, Fig. S6), including the anterior putamen [T(29) = 5.06, corrected P < 10⁻³, Cohen’s d = 0.92] and NAc [T(29) = 6.00, corrected P < 10⁻⁴, Cohen’s d = 1.09]. However, the posterior putamen [T(29) = 2.33, corrected P = 0.17, Cohen’s d = 0.43] and VMPFC [T(29) = 0.41, corrected P = 1.00, Cohen’s d = 0.075] showed no significant changes.
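The Cohen's d values reported in the post hoc tests match the paired-samples relation d = t / √n with n = 30 participants. The short sketch below checks this for three of the reported statistics. That the effect sizes were computed this way is an inference from the numbers, not something stated in the text, and the function name is illustrative.

```python
from math import sqrt

def cohens_d_paired(t_value, n):
    """Effect size for a paired comparison, recovered from the t statistic."""
    return t_value / sqrt(n)

n = 30  # participants; T(29) implies n - 1 = 29 degrees of freedom
print(round(cohens_d_paired(3.56, n), 2))  # 0.65 (caudate head)
print(round(cohens_d_paired(4.43, n), 2))  # 0.81 (caudate tail)
print(round(cohens_d_paired(5.06, n), 2))  # 0.92 (anterior putamen)
```

The T values are reported as magnitudes, so the direction of each change (decrease in the head, increase in the tail) comes from the text rather than from the sign of d.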

These results appear to provide intriguing insights regarding the differential involvement of the regions in reward-based de novo motor learning. First, although a previous study using a motor sequence-learning task had reported an anteroposterior transition of learning-related activity in the putamen (11), the locus of such clear transition was the caudate nucleus in the current study (SI Appendix, Fig. S6). Second, the NAc, a part of the ventral striatum that had been rarely reported in motor learning studies (37), showed significant positive success modulation in the early stage, which diminished in the later stage. Third, the success-modulated activity in the VMPFC was significantly positive in both early and late learning stages (Fig. 3 and SI Appendix, Fig. S6D and Table S1, “Common” section). In line with this outcome, previous studies had shown that the VMPFC represents not only the values associated with stimuli, but also the learned values of actions (35, 38), and that it is involved in reward processing regardless of learning stages, probably as a value comparator (39).

The activity changes were learning-induced and specific to the trained mapping.

We examined the possibility that the observed changes in fMRI activity were not due to learning of the mappings, but due to nonspecific effects, such as prolonged exposure to the same visual stimuli. We thus conducted identical analyses with the data from the untrained mapping, which shared the same visual stimuli with the trained mapping, yet implemented an altered cursor-posture mapping. In the untrained mapping, no between-stage differences in the activity were observed [F(1,29) = 0.37, P = 0.56, η_p² = 0.013], although there were significant differences due to the regions [F(6,174) = 27.42, P = 10⁻⁴, η_p² = 0.49] and the interaction between the regions and learning stages [F(6,174) = 2.51, P = 0.023, η_p² = 0.080] (Fig. 4 and SI Appendix, Fig. S6).

Post hoc analyses found that, in the early stage, the activity for the untrained mapping was not different from that for the trained mapping (corrected P > 0.30 for all ROIs) (Fig. 4 and SI Appendix, Fig. S6). This was not surprising, as the learning curves did not differ between the mappings (P = 0.20, Wilcoxon signed-rank test; SI Appendix, Fig. S3). However, in the late stage, the fMRI activity for the untrained mapping remained significantly higher than that for the trained mapping in the caudate head [T(29) = 2.60, P = 0.015] and NAc [T(29) = 4.40, P < 10⁻³], but was lower [T(29) = 2.07, P = 0.048] in the caudate tail, as we expected from the absence of behavioral training. There was no such effect in the VMPFC [T(29) = 0.41, P = 0.68] (Fig. 4 and SI Appendix, Fig. S6).

Cortico-Caudate Functional Interactions Predict Individual Learning Performance.

Following our observation of the robust learning-induced anteroposterior transition of fMRI activity in the caudate nucleus, we sought to address whether the respective interactions within the two separate cortico-caudate loops—the anterior cognitive and posterior sensorimotor loops—could predict individual learning performance (Fig. 5). To do so, we used independently defined cortical ROIs, including the bilateral dorsolateral prefrontal cortex (DLPFC) and left motor/somatosensory cortex (M1/S1) (SI Appendix, Fig. S7). These two regions have been well established to have anatomical and functional connections to the caudate nucleus and operate, respectively, in early associative learning and late sensorimotor control (5, 40, 41). Then, we performed functional connectivity analyses using the resting-state fMRI data acquired prior to the first main-task fMRI session (Fig. 1B). Specifically, we tested the connectivity between the caudate head/body and DLPFC for the anterior cognitive loop and that between the caudate tail and left M1/S1 for the posterior sensorimotor loop.

Fig. 5. — Resting-state fMRI connectivity predictive of individual learning performance. A schematic view of the seed ROIs (head, body, and tail) in the bilateral caudate nucleus (yellow/brown) and the independently defined cortical ROIs in the left M1/S1 (blue) and bilateral DLPFC (green) is shown. The relationships between the cortico-caudate intrinsic functional connectivity and the overall success rate are shown in *Lower*. (A) Caudate head–DLPFC. (B) Caudate body–DLPFC. (C) Caudate tail–L M1/S1. For each caudate subregion, the ROIs for the left and right hemispheres were used as separate seeds to calculate the functional connectivity, and the resulting connectivities were then averaged together. The gray shades indicate 95% CI. Pearson correlation coefficients (r) and uncorrected P values are presented. L, left; R, right.

As the most robust measure of the learning performance, we focused on the overall success rate of the trained-mapping main-task trials in all fMRI and behavioral training sessions. The connectivity between the caudate heads and the DLPFC was negatively related to the overall success rate (Fig. 5A; R = −0.57, P = 0.0012; L-head: R = −0.52, P = 0.0036, corrected P = 0.014; R-head: R = −0.59, P < 10⁻³, corrected P = 0.003). A similar relationship was found in the caudate bodies (Fig. 5B; R = −0.49, P = 0.0069, corrected P = 0.027, L-body: R = −0.45, P = 0.013, corrected P = 0.052; R-body: R = −0.50, P = 0.0063, corrected P = 0.025). Interestingly, in contrast, the connectivity between the caudate tails and the left M1/S1 showed significant positive correlations with the overall success rate (Fig. 5C; R = 0.49, P = 0.0074, L-tail: R = 0.48, P = 0.0089, corrected P = 0.018; R-tail: R = 0.44, P = 0.018, corrected P = 0.036). Taken together, these results support dissociable processing of the cognitive and sensorimotor loops via the caudate nucleus in de novo motor skill learning.
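The two-step logic of this analysis (a per-participant seed-to-cortex connectivity value, then a correlation of those values with performance across participants) can be sketched with synthetic data. The Fisher z-transform, the number of volumes, the random time series, and all function names are assumptions made for illustration. Averaging the left- and right-seed connectivities follows the description in the Fig. 5 caption.

```python
import numpy as np

def fisher_z_connectivity(seed_ts, target_ts):
    """Pearson correlation of two ROI time series, Fisher z-transformed."""
    r = np.corrcoef(seed_ts, target_ts)[0, 1]
    return float(np.arctanh(r))

def bilateral_connectivity(left_seed_ts, right_seed_ts, target_ts):
    """Average of the connectivities computed separately for left and right seeds."""
    return 0.5 * (fisher_z_connectivity(left_seed_ts, target_ts)
                  + fisher_z_connectivity(right_seed_ts, target_ts))

def across_participant_r(connectivity, performance):
    """Pearson correlation between connectivity and performance across participants."""
    return float(np.corrcoef(connectivity, performance)[0, 1])

rng = np.random.default_rng(1)
n_participants, n_volumes = 30, 200  # 30 participants as in the study; volume count is hypothetical
connectivity = np.empty(n_participants)
for i in range(n_participants):
    target = rng.normal(size=n_volumes)          # synthetic cortical ROI signal
    left = target + rng.normal(size=n_volumes)   # synthetic left-seed signal
    right = target + rng.normal(size=n_volumes)  # synthetic right-seed signal
    connectivity[i] = bilateral_connectivity(left, right, target)

performance = rng.uniform(size=n_participants)   # synthetic overall success rates
r = across_participant_r(connectivity, performance)
```

With real data, a negative `r` for the head–DLPFC pair and a positive `r` for the tail–M1/S1 pair would correspond to the pattern reported above.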

When tested with the learning rate, which was adopted as an alternative measure of learning performance, we found a significant positive relationship only between the caudate tails and the left M1/S1 (R = 0.43, P = 0.018). Interestingly, this relationship was significant only for the ipsilateral caudate tail (R = 0.52, P = 0.0037, corrected P = 0.0073), and not for the contralateral caudate tail (R = 0.29, P = 0.12, corrected P = 0.23).

As the presence of the visual corticostriatal loop through the caudate tail is well known (42), we also examined the interactions between the caudate tail and the visual cortex. The connectivity between the caudate tail and the visual regions, which would have been responsive to the target presentation, was not related to the overall success rate (R = −0.30, P = 0.12). However, the relationship with the overall success rate was significant for the left caudate tail (R = −0.44, P = 0.017, corrected P = 0.034). Additionally, the connectivity between the bilateral caudate tail and the visual regions was marginally related to the learning rate (R = −0.37, P = 0.050). Overall, the connectivity in the visual corticostriatal loop was not as robust as those in the cognitive and sensorimotor loops in predicting learning performance (SI Appendix, Fig. S8). Although this result needs to be interpreted with caution, it may imply that strong intrinsic functional connectivity in the visual cortico-caudate loop indicates considerable and/or lingering dependence on visual feedback, which would negatively affect the learning performance.

To investigate whether the aforementioned connectivities were distinctively related to the learning performance in different stages, we conducted identical analyses separately for the early and late stages. We found no significant relationship for any of the ROIs and performance measures (corrected P > 0.05). One possible explanation for this null finding would be the use of single-stage performance data, which might not be sufficient to assess the learning performance. Lastly, in addition to the hypothesis-driven ROI analysis, we also performed exploratory whole-brain voxel-wise correlation analysis for each seed ROI in the caudate nucleus and the measures of learning performance (SI Appendix, Fig. S9 and Table S3).

Localization of Hand Movement-Related Regions.

A GLM analysis on the data from an independent localizer scan identified regions significantly activated by random finger movements. Seven clusters (highly stringent voxel-wise threshold of P < 10⁻⁵ and a cluster size larger than 150 voxels) were defined in the bilateral pre/postcentral gyrus, left posterior putamen, right cerebellum (lobules IV, V, and VIII), SMA, and thalamus (SI Appendix, Fig. S3B and Table S4). All of these regions are implicated in hand movement and have been shown to exhibit impairments in activation and connectivity under conditions such as focal hand dystonia (43).

In our experiment, it should be noted that participants were required to stop moving their fingers once they reached a target until the next target appeared to obtain higher success rates. Due to this experiment design, the amount of movement was highly collinear with the success rate used for the GLM analysis (R = −0.70 ± 0.01, mean ± SE), indicating a greater amount of movement in the early stage of learning. Thus, we analyzed the localizer data to examine whether finger movements were sufficient to induce the dissociative activities in the caudate nucleus, as observed in the preceding analyses. Patterns of activity similar to those of success-modulated activities were observed neither in the caudate head (i.e., greater activity with increased finger movements in the early stage) nor in the caudate tail (i.e., greater activity with paused or reduced finger movements in the late stage) (P > 0.8 for both, one-sample, one-tailed Wilcoxon signed-rank test). This result demonstrates that the double dissociation of fMRI activities in the caudate nucleus is less likely due to movement per se, but, rather, due to the learned values of motor actions as we hypothesized.

Discussion

The current study investigated the role of the human caudate nucleus in de novo motor skill learning. The task using the data-glove interface, which was adopted in an fMRI study, allowed participants to learn a completely new motor skill from scratch. Specifically, they learned the mapping from the visual feedback in a low-dimensional outcome space (two-dimensional [2D] cursor position) to the motor commands in a high-dimensional action space (14-dimensional [14D] hand posture) as they pursued positive feedback. In this respect, the task may more closely resemble the complexity and diversity of motor skills in the real world than other tasks that have been frequently implemented in laboratory settings. Hence, the current study offers a unique opportunity to track the neural changes underlying the emergence and development of the continuous motor skill, evolving from the early action-selection level to the late action-execution level (1). It may also interest those studying the neuroscience of development, as de novo motor skill learning would occur incessantly during infant and child development. Notably, the complexity and relative difficulty of the current task, compared with simple sequence-learning tasks adopted in previous studies, contribute to elucidating the distinct role of the caudate nucleus in the learning of action–outcome associations and value-based action selection essential for de novo skill learning.

Learning-Induced Spatiotemporal Dissociation in the Caudate Nucleus.

We discovered a robust learning-induced double dissociation of fMRI activities in the subregions of the caudate nucleus, as the success-rate modulation increased in the posterior region, but decreased in the anterior region. Our results are a demonstration of the learning-induced anterior-to-posterior transition of fMRI activity in the caudate nucleus when humans acquire a novel motor skill from scratch over an extended period of time. They are in line with many human fMRI and nonhuman primate studies suggesting the parallel operation of the anterior cognitive and posterior sensorimotor loops in the basal ganglia in the early and late stages of sequence learning (4, 7, 11, 14–17, 44, 45).

However, we did not find a similar transition in the putamen, as reported in a previous human fMRI study using a sequence-learning task (11). Indeed, a critical difference from this previous study is that the learning-induced effect was specific to the caudate nucleus, and not to the putamen. Accumulating evidence has supported distinct roles of the caudate nucleus and putamen, as the former is involved in action selection based on the values of action–outcome associations, while the latter is primarily implicated in learning of stimulus-action mappings, or habit formation (46). The unique nature of our task (de novo learning of a relatively complex motor skill) might have accentuated this discrepancy in the functionality of the two structures. As the participants were required to continuously track the cursor movement for successful performance, they had to depend on the visual feedback, even in the late learning stage. This complexity of our task might have substantially compromised the development of automatic or habitual performance, which would induce a less pronounced involvement of the putamen. This would not be the case for the simple sequence-learning task in the preceding study (11).

Existing studies on the functional connections between the caudate tail and the visual cortex appear to further support this interpretation (29, 42, 47). Although further investigation is warranted, our results may partially suggest that stronger visual-caudate tail functional connectivity hinders skillful performance. This may be interpreted as the evidence of lingering dependence on visual feedback, even in the advanced stage of learning, which would likely lead to less automaticity and suboptimal performance. For these reasons, we hypothesize that our task, which required continuous visual feedback due to its considerable complexity, elicited greater engagement of the visual cortico-caudate interaction, unlike typical sequence-learning tasks, and induced the differential engagement of the caudate nucleus and putamen.

Recently, a series of nonhuman primate studies identified distinct sets of dopaminergic neurons innervating the anterior and posterior regions of the caudate nucleus as the potential neurobiological underpinnings of the respective modulation of early flexible and later stable values for the learning of arbitrary object–value associations, which is named “object skill” (2, 28, 29, 48). In these studies, monkeys learned to make saccades to multiple visual objects with higher chances of receiving a liquid reward. The object-value contingency was consistent in the stable condition, but was frequently reversed in the flexible condition. The neurons in the caudate head were found to be sensitive to immediate-reward outcomes in the flexible condition, responding more strongly to high-value objects. Interestingly, after extensive training in the stable condition, the neurons in the caudate tail responded to the stably high-valued objects, even in the absence of reward, as monkeys continued to show automatic gaze to the objects. These findings have significantly advanced the understanding of the role of the parallel basal ganglia circuits in mediating goal-directed and automatic processes for object skill. However, there remains much to be elucidated about the respective roles of these circuits for the “action skill” (i.e., the learning of arbitrary action–outcome associations). Here, we have demonstrated that successful learning of actions arbitrarily associated with higher values (i.e., greater success rates) was closely linked to the activities encoding “action values” in the caudate nucleus, which could be similarly dissociated for the early-flexible and late-stable learning stages. These results thus provide evidence in humans supporting the hypothesis that the object skill and the action skill share common neurobiological mechanisms involving the parallel circuits of the caudate nucleus.

Our results may also be interpreted in relation to the role of the caudate nucleus in well-studied category learning, since the current task involves feedback-based learning of associations between different hand postures and distinct target positions. It has been shown that impairments of the caudate head, which are typically observed in patients with Parkinson’s disease, may lead to deficits in rule-based and explicit category learning (49, 50). A few fMRI studies have also demonstrated that the activities in the body and tail of the caudate are associated with improved category learning (51, 52). However, it should also be noted that, in our experiment, participants learned the general mapping between hand postures and cursor positions, rather than simple associations between specific hand postures and target positions, as suggested by the learning-induced increase in the straightness of movement (Fig. 2 A and B). Moreover, participants generalized the motor skill to untrained targets presented in the secondary “path-following tasks,” in which adjacent untrained targets had to be reached sequentially along with the main-task targets. This generalization ability, the hallmark of implicitly acquired motor learning, was shown to be attributable to the continuous cursor feedback in similar tasks (23, 53). Specifically, without the feedback, participants would simply learn the hand postures associated with the target position, not the mapping between them. Investigating the neural mechanisms of generalization by manipulating the visual feedback would be an interesting topic for a future study.

Dissociable Roles of the Cognitive and Sensorimotor Loops.

The resting-state functional connectivity analysis further revealed distinct relationships of the cognitive and sensorimotor cortico-caudate loops with learning performance. We hypothesized that the cognitive loop would play an important role in implementing cognitive strategies in the early stage, while the sensorimotor loop would be implicated in the late stage with more efficient movement and higher performance. If an individual continues to rely heavily on conscious cognitive processes, even in the later stage of learning, desired motor behavior may be delayed or hindered, which would, in turn, negatively affect performance (54–56). Conversely, if attention is divided by dual tasks with the main motor task being away from cognitive processes, automatic skill performance may improve (57). In line with these predictions, we found that participants with higher learning performance showed weaker intrinsic connectivity between the caudate head/body and DLPFC, but stronger intrinsic connectivity between the caudate tail and M1/S1. This result is supported by previous rodent and human fMRI studies that showed that greater disengagement of the associative loop predicted higher learning performance and proposed parallel, but dissociable, activity dynamics of the cognitive and sensorimotor loops (8, 58).

However, the role of the cognitive loop in our task might be different from that in the previous sequence-learning tasks, as the participants did not explicitly learn the spatial and motor sequences, but instead learned the mapping between them (59). Indeed, the target sequence was identical between the trained and untrained mappings, and, thus, learning was not related to a specific spatial sequence of the target. Accordingly, the spatial and motor information were not clearly distinct in our task and would have been concurrently processed (8), while the participants learned to transform the spatial coordinate (i.e., cursor feedback) to the motor coordinate (i.e., hand posture). For these reasons, in our task, the role of the cognitive loop would be more related to strategic action selection for a higher reward through exploration or exploitation, rather than spatial information processing as suggested by a previous model (59). The cognitive loop might also have been involved in other high-level learning components, such as attention and working memory retained for the reward history associated with a series of performed actions.

On the other hand, the positive relationship between the functional connectivity in the sensorimotor loop and higher learning performance suggests that the learned associations between actions and outcomes would be encoded as long-term memories in the sensorimotor loop (48), much like the internal model in the cerebellum (60). In addition to the role of the caudate tail in the sensorimotor loop, its interaction with the visual region appears to affect the skill performance, at least to some extent. Although the relationship was not as robust as in the cognitive loop, greater disengagement in the visual corticostriatal loop via the caudate tail (left only) predicted higher learning performance. The opposite roles played by the caudate tail in the sensorimotor loop and in the visual loop imply a learning-induced dissociation between these loops, which is also supported by a previous study using a motor sequence-learning task (58).

Furthermore, other modeling and fMRI studies have suggested a competitive interaction between the “fast” goal-directed and the “slow” habitual processes (30, 61–63). Future studies would still be needed to elucidate the dynamic interactions between learning processes occurring at different—possibly multiple—time scales, as suggested by motor adaptation studies (30, 64) and recent review papers (29, 48).

Linking Reward-Based Mechanisms to Motor Learning.

It should be noted that the current experiment provided feedback indicating current success (red-lighted target) or failure (unlighted target) without monetary reward for a successful performance. Providing a monetary reward after a good performance has been shown to enhance skill consolidation (65) and even affect unconscious motivation (66), with heightened activation in the ventral striatum. Interestingly, we found highly significant success-modulated activity in the ventral striatum (NAc) during the early stage, which supports its role in motor learning without explicit monetary reward, a role that has rarely been studied (67, 68). We speculate that, in complex de novo learning of an entirely new controller with considerable difficulty, positive feedback following successfully executed actions may be intrinsically considered more rewarding and, accordingly, elicit strong neural activities in the ventral striatum (68). In contrast, many previous motor sequence-learning and motor adaptation experiments had adopted relatively simple tasks, which would have yielded lower intrinsic rewards for successful learning. Thus, in these experiments, the ventral striatum might not have responded as strongly to positive feedback, unless the value of the feedback was significantly heightened by the introduction of monetary reward. Interestingly, there has been a study suggesting that the involvement of the ventral striatum tends to be more pronounced for a more challenging task condition, particularly in older adults (67), which appears to be in accordance with our interpretation. It would be intriguing for future work to investigate whether and to what extent the difficulty of a motor task modulates the ventral striatal activity.

The success-modulated activity in the VMPFC indicates that the region also contributes to the reward-based mechanism involved in motor learning and that the “action values” are represented in this region (35, 38). However, in contrast to the ventral striatum, which has demonstrated involvement specific to the early learning stage (69), the VMPFC appears to play a significant role during the entire course of learning (39). This difference might be attributable to the dissociable roles of the VMPFC and the ventral striatum in reward processing, which have been suggested to be, respectively, associated with action values and their prediction errors (36). According to this hypothesis, the activity in the ventral striatum would be high in the early learning stage, as a relatively lower reward rate is expected (i.e., positive reward prediction error), but would decrease in the late learning stage, as a relatively higher reward rate is expected. A future study needs to be conducted to test this hypothesis by employing a computational model that reliably assesses predicted rewards and their errors. In sum, the current study provides valuable insights into how the reward-based mechanisms, which have been extensively studied in different contexts (34–36, 38, 39, 61, 70), can be explored in the context of motor learning, through the rare experimental evidence substantiating the role of the ventral striatum and VMPFC in motor learning (37).

Limitations of the Current Study.

There are several limitations of the current study. First, delineating the neural activity in the caudate tail by using fMRI has been a challenge, due to its narrow and curved structure, proximity to the ventricles, and ensuing partial volume effects (42). The current study attempted to bypass this issue at least partially by performing manual segmentation and using a high-resolution subcortical atlas. Nevertheless, the signals from the caudate tail are likely to be affected by partial volume effects and signals from nearby ventricles, which would, in turn, affect the current results, at least to some degree. Thus, the current results regarding the caudate tail should be interpreted with caution. Yet, it should also be noted that evidence from animal studies, especially nonhuman primate studies, corroborates the presence of spatiotemporally distinct circuits in the caudate nucleus.

Second, we could not completely rule out the possibility that the reduced amount of movement, rather than learning, might have exerted a considerable influence on the results. Nevertheless, the absence of significant effects for the untrained mapping appears to preclude the possibility that the current findings are simply attributable to prolonged exposure to the task. Furthermore, null findings from the localizer task, which was performed without goal-directed movement, strongly suggest that the observed fMRI activities were not solely due to movement per se. Instead, these activities are more likely due to learning of goal-directed movement while maximizing utility, or reward per effort and time. Future studies with more deliberate experiment designs would be necessary to dissociate movement from its learned utility and delineate the pure learning-induced activity in the caudate nucleus.

Third, the current study does not provide a more theoretical explanation for possible mechanisms of de novo motor skill learning. While most model-based fMRI studies in reinforcement learning incorporated decision-making tasks with discrete state and choice spaces (34–36, 38, 70), the current study implemented a motor learning task that requires learners to make decisions continuously in high-dimensional state and choice spaces. Thus, learners were more likely to employ policy-based methods, directly mapping states to advantageous actions (71, 72), instead of learning all of the values of state–action pairs in a continuous space. One important contribution of our study is to bridge the extensively studied reward-based mechanisms to motor learning.

Finally, the current study also does not provide causal neurobiological accounts for the anterior-to-posterior transition in the caudate nucleus and cortico-caudate interactions during motor skill learning. Future studies combining fMRI and noninvasive brain stimulation targeting selectively cognitive (73) and sensorimotor (74) networks may enhance our understanding of the intricate neural mechanisms underlying de novo motor skill learning.

Methods

Participants.

Forty-three neurologically healthy young adults were enrolled in the current study. A total of 30 participants (12 females; mean age = 23.2 y; range = 19 to 30 y) completed all fMRI and behavioral experiment sessions and were included in the analyses. Among the 13 participants (three females; mean age = 22.2 y; range = 18 to 28 y) who did not complete the entire experiment, three were excluded due to technical problems with data acquisition, and six dropped out due to light-headedness (n = 5) or severe fatigue (n = 1) during extensive fMRI sessions lasting ∼80 min. Four participants were additionally excluded due to unexpected scheduling conflicts (n = 3) or loss of contact (n = 1).

All participants were right-handed, according to a modified version of the Edinburgh Handedness Inventory (75). They had normal or corrected-to-normal vision and provided written informed consent. This study was approved by Sungkyunkwan University Institutional Review Board.

Task Procedure.

We designed a task-based fMRI experiment for learning a complex motor skill using an MR-compatible data glove (14 Ultra, 5DT Technologies), based on a behavioral experiment implementing a data-glove paradigm (23, 53). The data glove measured finger flexures excluding the proximal one (two sensors per finger) and abductions between fingers (four sensors), for a total of 14 sensors. The 14D vector (h) representing the hand posture, measured by the 14 sensors, was linearly mapped onto the 2D position of the cursor (p) on the screen using the equation (23, 53)

\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_{x,1} & a_{x,2} & \cdots & a_{x,13} & a_{x,14} \\ a_{y,1} & a_{y,2} & \cdots & a_{y,13} & a_{y,14} \end{bmatrix} \begin{bmatrix} h_{1} & h_{2} & \cdots & h_{13} & h_{14} \end{bmatrix}^{T} + \begin{bmatrix} x_{0} \\ y_{0} \end{bmatrix},

i.e., p = Ah + p₀, where the mapping matrix A and the offset p₀ were determined from the calibration phase in the first fMRI session. The time-series data were sampled from the 14 sensors at 60 Hz, and the data from each sensor were smoothed by using an exponentially weighted moving average of 20 samples to reduce the intrinsic noise of the data-glove system. The smoothed data were used to construct the 14D vector h. For illustrative purposes, we included a demonstration of finger movements and corresponding cursor movements in Movie S1.
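The mapping and smoothing steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the smoothing factor alpha = 2 / (span + 1) with span = 20 is only one possible reading of "exponentially weighted moving average of 20 samples," since the exact weighting is not stated.

```python
from typing import List, Sequence


def ewma(samples: Sequence[float], span: int = 20) -> List[float]:
    """Exponentially weighted moving average of one sensor channel.

    Assumption: alpha = 2 / (span + 1); the text only states "20 samples".
    """
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    for x in samples:
        if not smoothed:
            smoothed.append(float(x))  # seed with the first sample
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def cursor_position(A: Sequence[Sequence[float]],
                    h: Sequence[float],
                    p0: Sequence[float]) -> List[float]:
    """Return p = A h + p0 for a 2 x 14 matrix A and a 14D posture h."""
    return [sum(a * hi for a, hi in zip(row, h)) + offset
            for row, offset in zip(A, p0)]
```

In use, each of the 14 channels would be smoothed separately, and the most recent smoothed values would form h before calling `cursor_position`.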

Stimuli were generated by using a MATLAB toolbox Cogent 2000 (http://www.vislab.ucl.ac.uk/cogent.php) and were projected onto a custom-made viewing screen. Participants wore the data glove on their right hand and placed the hand in a comfortable position, lying supine in the scanner and viewing the screen via a mirror. They were unable to see their hands throughout the experiment. Foam pads were applied to all participants to minimize head motions during the experiment.

Experiment Design.

Participants completed two fMRI sessions and five behavioral training sessions between the fMRI sessions. The overall experiment typically lasted 3 wk (range = 8 to 43 d, mean ± SD = 23.4 ± 8.7 d) and was completed within 45 d. The sessions for each participant were carefully scheduled so that the duration between the first behavioral training session and the second fMRI session would not exceed 15 d (range = 6 to 15 d, mean ± SD = 9.9 ± 2.7 d).

Visit 1.

Familiarization and calibration.

On the first visit, participants underwent a familiarization phase before fMRI scanning. Participants wore the data glove on their right hands and were instructed to move their fingers freely for 2 min. A bar graph visualizing the real-time variance of each of the 14 sensors was presented, to encourage participants to explore different movements. Then, principal component analysis using the covariance matrix of the acquired time series from the 14 sensors was performed on the data. The first two principal components were used to construct the mapping matrix A, and the offset p₀ was determined such that the mean hand posture was mapped to the center of the screen. Once the mapping was determined, the participants were instructed to reach all 25 targets of the 5 × 5 grid to ensure that all cells were reachable (Fig. 1). In the descriptions to follow, each cell is referred to by its index number, which was determined by numbering the cells row by row, from left to right within each row and from the top row to the bottom row. Accordingly, we refer to the cell at the top left corner (1, 1) as cell 1 and the one located right below it (2, 1) as cell 6.
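The calibration step can be illustrated with a small pure-Python sketch. It is not the authors' code: all function names are hypothetical, power iteration with deflation stands in for a library eigendecomposition, the screen center is assumed to be the origin (0, 0), and any scaling of A to screen units is omitted because it is not described in this section.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def mean_and_covariance(data: Matrix) -> Tuple[List[float], Matrix]:
    """Column means and sample covariance of an (n_samples x n_sensors) array."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in data) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def leading_eigenvector(cov: Matrix, iters: int = 500) -> Tuple[float, List[float]]:
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    d = len(cov)
    v = [float(i + 1) for i in range(d)]  # arbitrary deterministic start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v


def calibrate(data: Matrix) -> Tuple[Matrix, List[float]]:
    """Return (A, p0): rows of A are the first two PCs; p0 centers the mean posture."""
    mean, cov = mean_and_covariance(data)
    lam1, pc1 = leading_eigenvector(cov)
    d = len(cov)
    # Deflation: subtract the first component so the next pass finds the second.
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    _, pc2 = leading_eigenvector(deflated)
    A = [pc1, pc2]
    # Choose p0 so that A @ mean + p0 = (0, 0), the assumed screen center.
    p0 = [-sum(a * m for a, m in zip(row, mean)) for row in A]
    return A, p0
```

With real glove data, `data` would hold the 2 min of free finger movement (one row per sample, 14 columns), and the returned A would be a 2 × 14 matrix.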

Resting-state fMRI scanning.

One run of an 8-min-long resting-state fMRI scan was acquired prior to task-based fMRI scans. Participants were instructed to keep their eyes open, maintain fixation on a cross presented in the middle of the screen, and refrain from focusing on any particular thought.

Localizer scanning.

Localizer scanning was performed to define a region related to random finger movements. Participants were instructed to freely move their right fingers at natural speed when the text “Move” was presented and to stop when the text “Stop” was presented. Each “Move” or “Stop” condition lasted for 1 min, and a total of four “Move”–“Stop” pairs were conducted. Using the finger-movement data of the last two “Move” blocks, we recalibrated the mapping matrix A and the offset p₀, as in the familiarization phase. We also confirmed that all 25 grid cells on the 5 × 5 grid could be reasonably reached by finger movements (Fig. 1).

First main-task fMRI session.

For each trial, a target appeared for 5 s as a gray grid cell with a yellow crosshair at its center in one of the four corner cells on the 5 × 5 grid (cell 1, top left; cell 5, top right; cell 21, bottom left; and cell 25, bottom right). The cursor was displayed as a white crosshair. When the cursor reached the target grid cell, the color of the target cell changed to red. Holding a static posture kept the cursor at approximately the same location. The task was to place the cursor on the target cell as quickly and accurately as possible and to maintain the cursor on the target. Participants were also instructed to move the cursor in as straight a line as possible when moving between targets.

There were seven runs in the first fMRI session, and each run consisted of 96 movements (eight blocks of 12 movements) from the current location to the target. The target sequence in each block was ordered as cells 1–5–25–21–1–25–5–21–25–1–21–5–1 and repeated for all eight blocks. In runs 1 to 3, the previously obtained mapping matrix A, which would be presented as the mapping to be learned during the following behavioral training sessions, was implemented. In runs 4 to 6, the two rows of the mapping matrix A were swapped so that the cursor positions were flipped about the 45-degree diagonal line. Then, in run 7, the original mapping A was restored.
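Two properties of this design can be checked with a short sketch: the 13-cell block sequence contains each of the 12 ordered pairs of corner cells exactly once, and swapping the two rows of A exchanges the x and y cursor coordinates, which is a reflection about the diagonal. The helper names are hypothetical, and the offset p₀ is assumed to be zero (screen center at the origin) for the reflection check.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

CORNERS = (1, 5, 21, 25)
BLOCK_SEQUENCE = [1, 5, 25, 21, 1, 25, 5, 21, 25, 1, 21, 5, 1]


def movements(sequence: Sequence[int]) -> List[Tuple[int, int]]:
    """Consecutive (start, target) pairs in a target sequence."""
    return list(zip(sequence[:-1], sequence[1:]))


def swap_rows(A: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap the x and y rows of a 2 x N mapping matrix."""
    return [list(A[1]), list(A[0])]


def apply_mapping(A: Sequence[Sequence[float]],
                  h: Sequence[float],
                  p0: Sequence[float] = (0.0, 0.0)) -> List[float]:
    """Cursor position p = A h + p0 (p0 assumed zero by default)."""
    return [sum(a * hi for a, hi in zip(row, h)) + off
            for row, off in zip(A, p0)]


# Each block covers every ordered corner pair exactly once (12 movements).
covers_all_pairs = set(movements(BLOCK_SEQUENCE)) == set(permutations(CORNERS, 2))
```

For any posture h, `apply_mapping(swap_rows(A), h)` returns the coordinates of `apply_mapping(A, h)` in reversed order.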

Visits 2 to 6.

Behavioral training sessions.

Participants performed five behavioral training sessions on separate days, each lasting about 40 min. The mapping implemented in the behavioral training sessions was the same as the initial mapping used in runs 1, 2, 3, and 7 of the first fMRI session (i.e., the “trained mapping”).

Each session was composed of five runs. The first three runs presented the task that was identical to the task presented during the fMRI sessions (i.e., the “main task”), with targets appearing at the four corner cells of the grid. In the last two runs, participants were presented with the “path-following task,” in which they had to reach not just the four corner cells, but also the “in-between” cells that constituted the shortest path between each pair of the corner cells (illustrated in SI Appendix, Fig. S1). Thus, instead of simply reaching cell 1 and then cell 5, participants had to reach cells 1, 2, 3, 4, and 5 consecutively. In each trial, the current target was presented along with the four upcoming targets, so the participants were aware of the path to follow in advance. Since the order of the corner-cell target sequence was the same as in the main-task blocks, the exact target sequence in each path-following block was cells 1–2–3–4–5–10–15–20–25–24–23–22–21–16–11–6–1–7–13–19–25–20–15–10–5–9–13–17–21–22–23–24–25–19–13–7–1–6–11–16–21–17–13–9–5–4–3–2–1 (a total of 49 targets) (SI Appendix, Fig. S1B). As each path-following run contained eight such blocks, each behavioral training session included a total of 16 path-following blocks (two runs of eight blocks). If the cursor stayed on the current target cell for 100 ms or did not reach it at all within 5,000 ms, the target was moved to the next cell in the sequence. Due to this adaptive design, the time to complete the path-following runs differed between individual participants. The purpose of these runs was to encourage participants to generalize the learning of the cursor-posture mapping so that they would not merely rely on the cognitive and discrete knowledge of specific hand postures corresponding to the four corner cells.
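The 49-target path-following sequence follows mechanically from the corner-cell sequence and the row-by-row cell numbering. The sketch below regenerates it; the function names are hypothetical, and the stepping rule (one cell per step toward the goal along rows, columns, or diagonals) is an assumption that happens to reproduce the sequence listed above.

```python
from typing import List, Tuple

GRID = 5
CORNER_SEQUENCE = [1, 5, 25, 21, 1, 25, 5, 21, 25, 1, 21, 5, 1]


def cell_to_rc(cell: int) -> Tuple[int, int]:
    """Zero-based (row, col) of a 1-based, row-by-row cell index."""
    return divmod(cell - 1, GRID)


def rc_to_cell(row: int, col: int) -> int:
    """1-based cell index of a zero-based (row, col) position."""
    return row * GRID + col + 1


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def path_between(start: int, end: int) -> List[int]:
    """Cells visited after `start`, stepping one cell at a time toward `end`."""
    r, c = cell_to_rc(start)
    r_end, c_end = cell_to_rc(end)
    path = []
    while (r, c) != (r_end, c_end):
        r += sign(r_end - r)
        c += sign(c_end - c)
        path.append(rc_to_cell(r, c))
    return path


def path_following_sequence(corners: List[int]) -> List[int]:
    """Expand a corner-cell sequence into the full cell-by-cell sequence."""
    sequence = [corners[0]]
    for start, end in zip(corners[:-1], corners[1:]):
        sequence.extend(path_between(start, end))
    return sequence


def should_advance(dwell_ms: float, elapsed_ms: float) -> bool:
    """Target moves on after a 100-ms dwell or a 5,000-ms timeout."""
    return dwell_ms >= 100 or elapsed_ms >= 5000
```

Calling `path_following_sequence(CORNER_SEQUENCE)` yields 49 cells: 12 movements of four steps each, plus the starting cell.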

Visit 7.

Second main-task fMRI session.

On the seventh visit, participants underwent the second fMRI session, which followed a procedure nearly identical to that of the first fMRI session. There were no familiarization and calibration phases. For the majority of participants, the second fMRI session was performed within 24 h after the last behavioral training session. Otherwise, participants practiced the main task for a few minutes before the fMRI session.

Behavioral Data Analysis.

All statistical analyses and visualization were performed by using MATLAB (Versions R2015b and R2018a, MathWorks), Python (Version 3.6), and R (Version 3.5.3). To obtain measures of learning performance, we calculated the success rates and aspect ratios of the cursor trajectories from two fMRI sessions and five behavioral training sessions.

Success rate.

A trial-by-trial success rate was calculated as the proportion of time during which the cursor was on the target (i.e., the target was lit in red). As each task block was designed to consist of all of the 12 possible paths between the four targets and the same target sequence was repeated across blocks, we averaged the success rate in each block and estimated the learning rate (see SI Appendix, Fig. S1 for the details). In addition, we also calculated the overall success rate for each mapping by averaging the block-by-block success rates from all fMRI and behavioral training sessions and used it as the individual participants’ learning performance.
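As a minimal sketch of these three levels of averaging, assume each trial is stored as a sequence of booleans (one per sample) marking whether the cursor was on the target. The data layout and function names are hypothetical.

```python
from typing import Sequence


def trial_success_rate(on_target: Sequence[bool]) -> float:
    """Fraction of samples in one trial with the cursor on the target."""
    return sum(on_target) / len(on_target)


def block_success_rate(trials: Sequence[Sequence[bool]]) -> float:
    """Mean of the trial-wise success rates in one block."""
    rates = [trial_success_rate(trial) for trial in trials]
    return sum(rates) / len(rates)


def overall_success_rate(blocks: Sequence[Sequence[Sequence[bool]]]) -> float:
    """Mean of the block-wise success rates across all sessions of one mapping."""
    rates = [block_success_rate(block) for block in blocks]
    return sum(rates) / len(rates)
```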

Learning rate.

To calculate the individual learning rate, we fitted an exponential model to the block-by-block success rate (S) for the trained mapping from all fMRI and behavioral training sessions (SI Appendix, Fig. S2):

S(t) = A (1 - e^{-Bt}) + C.

In the equation, t denotes the training-block number and B denotes the learning rate; A (a scalar amplitude, distinct from the mapping matrix A) and C are the remaining fitted parameters.
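The fitting procedure is not specified here, so the following is only one way such a fit could be done, using the standard library alone. For each candidate B on a grid, the model is linear in A and C, which are then obtained by ordinary least squares; the B with the smallest squared error is kept. The grid range, the block numbering starting at t = 1, and the function names are all assumptions. Note that under this model S(0) = C and the asymptote is A + C.

```python
import math
from typing import List, Optional, Sequence, Tuple


def exponential_model(t: float, A: float, B: float, C: float) -> float:
    """S(t) = A * (1 - exp(-B * t)) + C."""
    return A * (1.0 - math.exp(-B * t)) + C


def fit_exponential(success: Sequence[float],
                    b_grid: Optional[Sequence[float]] = None
                    ) -> Tuple[float, float, float]:
    """Return (A, B, C) minimizing squared error over a grid of B values."""
    n = len(success)
    t = list(range(1, n + 1))          # assumed block numbering: 1, 2, ..., n
    if b_grid is None:
        b_grid = [0.001 * k for k in range(1, 2001)]   # assumed search range
    mean_s = sum(success) / n
    best = None
    for B in b_grid:
        x = [1.0 - math.exp(-B * ti) for ti in t]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue
        # Closed-form least squares for S = A * x + C at this fixed B.
        A = sum((xi - mean_x) * (si - mean_s) for xi, si in zip(x, success)) / sxx
        C = mean_s - A * mean_x
        sse = sum((A * xi + C - si) ** 2 for xi, si in zip(x, success))
        if best is None or sse < best[0]:
            best = (sse, A, B, C)
    if best is None:
        raise ValueError("no valid B value in the grid")
    return best[1], best[2], best[3]
```

A gradient-based nonlinear least-squares routine would serve equally well; the grid search is used here only to keep the example dependency-free.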

Aspect ratio.

For each of the trajectories between targets made by participants, we calculated its aspect ratio to find whether participants simply memorized the hand postures corresponding to targets or learned to control the cursor. To calculate the aspect ratio, we first estimated the maximum perpendicular distance between the straight line connecting the start to end points and the in-between points on the actual cursor trajectory for each trial (23). Then, the distance was normalized with the length of the straight line.
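The aspect-ratio computation can be written compactly. In this sketch the straight line is taken to run from the first to the last sample of the trajectory, which is an assumption about how the start and end points were defined; the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def aspect_ratio(trajectory: Sequence[Point]) -> float:
    """Maximum perpendicular deviation from the start-end line, divided by its length."""
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("start and end points coincide")
    # |cross product| / length is the perpendicular distance of a point to the line.
    max_deviation = max(abs(dx * (y - y0) - dy * (x - x0)) for x, y in trajectory) / length
    return max_deviation / length
```

A perfectly straight trajectory gives 0, and larger values indicate more curved or meandering paths.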

Time-varying amount of movement.

We calculated the time-varying amount of movement to examine whether it was collinear with the time-varying success rate used for the GLM analyses. For each of 14 sensors, we calculated the displacement (i.e., absolute change) for every 1 s and defined their average as the movement amount. For each of the three runs from the trained mapping, we calculated Pearson’s correlation coefficients between the time-varying amount of movement and the success rate.
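A possible implementation of this collinearity check is sketched below. "Displacement for every 1 s" is interpreted here as the sum of absolute sample-to-sample changes within each 1-s bin; the alternative reading (the absolute difference between the start and end of each bin) would require only a small change. Function names and the data layout (one list of sensor values per sample) are hypothetical.

```python
from typing import List, Sequence


def movement_amount(frames: Sequence[Sequence[float]], rate_hz: int = 60) -> List[float]:
    """Per-second movement: summed absolute sensor change, averaged over sensors."""
    n_bins = (len(frames) - 1) // rate_hz
    n_sensors = len(frames[0])
    amounts = []
    for b in range(n_bins):
        start = b * rate_hz
        per_sensor = [
            sum(abs(frames[t + 1][s] - frames[t][s])
                for t in range(start, start + rate_hz))
            for s in range(n_sensors)
        ]
        amounts.append(sum(per_sensor) / n_sensors)
    return amounts


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5
```

The per-second movement series would then be passed to `pearson_r` together with the per-second success rate of the same run.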

Duration of the experiment.

The duration of the entire experiment in days, which might be considered as a potential confounding factor in the analyses of performance measures, did not show significant correlations with the overall success rate (R = 0.084, P = 0.66). It was thus not included as a covariate in the main analyses.

The 3-T MRI Acquisition.

We acquired fMRI data using a 3-T Siemens Magnetom Prisma scanner with a 64-channel head coil. Functional scans were acquired by using an echo planar imaging (EPI) sequence with the following parameters: 1,096 volumes (1,113 volumes for resting-state fMRI); repetition time (TR) = 460 ms; echo time (TE) = 27.20 ms; flip angle (FA) = 44°; field of view (FOV) = 220 × 220 mm; matrix = 82 × 82 × 56 voxels; 56 axial slices; and slice thickness = 2.7 mm. For anatomical reference, a whole-brain T1-weighted anatomical scan was performed by using a magnetization-prepared rapid acquisition with gradient echo (MPRAGE) sequence with the following parameters: TR = 2,400 ms; TE = 2.34 ms; FA = 8°; FOV = 224 × 224 mm; matrix = 320 × 224 × 320 voxels; 224 axial slices; and slice thickness = 0.7 mm. Before the functional scans, two EPI images with opposite-phase encoding directions (posterior-to-anterior and anterior-to-posterior) were acquired for subsequent distortion correction, with the following parameters: TR = 7,220 ms; TE = 73 ms; FA = 90°; FOV = 220 × 220 mm; matrix = 82 × 82 × 56 voxels; 56 axial slices; and slice thickness = 2.7 mm.

fMRI Data Analysis.

Analyses of fMRI data were performed by using AFNI (Analysis of Functional NeuroImages, NIH; https://afni.nimh.nih.gov), MATLAB (Versions R2015b and R2018a), FreeSurfer (Version 6.0.0; http://surfer.nmr.mgh.harvard.edu), Python (Version 3.6), and R (Version 3.5.3).

Preprocessing.

Anatomical and functional image data were preprocessed by using AFNI. Task-based and resting-state functional images were first corrected for slice-time acquisition and realigned to adjust for motion-related artifacts. Then, retrospective distortion correction was performed by using a field map calculated from the aforementioned two EPI images with opposite-phase encoding directions. The corrected images were spatially registered to the anatomical data, transformed into the Montreal Neurological Institute (MNI) template space, and resampled into 2.68-mm-cube voxels. All images were spatially smoothed with a Gaussian kernel of 4 × 4 × 4-mm full-width at half-maximum, and the time series were scaled to have a mean of 100 and a range of 0 to 200.
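The final scaling step can be illustrated for a single voxel. The sketch assumes the common convention of dividing by the voxel's temporal mean, multiplying by 100, and clipping to the interval 0 to 200; whether the authors' pipeline used exactly this rule is not stated here, and the function name is hypothetical.

```python
from typing import List, Sequence


def scale_time_series(values: Sequence[float], cap: float = 200.0) -> List[float]:
    """Scale one voxel's time series to percent of its mean, clipped to [0, cap]."""
    mean = sum(values) / len(values)
    if mean <= 0.0:
        return [0.0] * len(values)     # voxels outside the brain carry no signal
    return [min(cap, max(0.0, 100.0 * v / mean)) for v in values]
```

After this step, regression coefficients can be read approximately as percent signal change. The mean is exactly 100 only when no sample is clipped.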

Whole-brain voxel-wise GLM analysis.

To identify regions responding to “success” (i.e., reaching a target), we designed a parametric regressor used in a subject-level GLM analysis (AFNI’s 3dDeconvolve function). Specifically, the time-varying success rate in 1-s-long time bins, defined as the proportion of time during which the cursor stayed on the target grid cell and turned it red, was used as a parameter modulating pulse regressors placed at the middle of the time bins. The success rate was then convolved with a gamma function modeling a canonical hemodynamic response function (HRF). To balance the difference in the overall success rate between the early and late learning stages, its mean value was removed with the “stim_times_AM2” option in the 3dDeconvolve function (see the modeled responses modulating success rate in SI Appendix, Figs. S4 and S11). To better account for potential deviations from the HRF, a two-parameter Statistical Parametric Mapping gamma variate basis function with temporal derivatives (using 3dDeconvolve with the “SPMG2” option for a basis function) was adopted for the main GLM results (76). For regressors of noninterest, we included six regressors estimating rigid-body head motion and five regressors for each run modeling up to fourth-order polynomial trends in the fMRI data. The volumes associated with excessive head motion (defined as those with a displacement greater than 0.4 mm; mean ± SD = 0.6 ± 1.0%, range = 0.0 to 5.4% of the entire volumes) were excluded from the analysis. The respective amounts of average head motion in the resting-state and task-based fMRI runs are shown in SI Appendix, Fig. S10.
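The construction of a mean-centered, success-modulated regressor can be sketched as follows. This is a conceptual illustration and not a reimplementation of 3dDeconvolve: the function names are hypothetical, the single gamma-variate shape (exponent 8.6, time constant 0.547 s, peak normalized to 1) is an assumed stand-in for the canonical HRF, and the SPMG2 basis used for the main results is not reproduced.

```python
import math
from typing import List, Sequence


def binned_success_rate(on_target: Sequence[int], rate_hz: int = 60) -> List[float]:
    """Proportion of on-target samples in consecutive 1-s bins."""
    n_bins = len(on_target) // rate_hz
    return [sum(on_target[b * rate_hz:(b + 1) * rate_hz]) / rate_hz
            for b in range(n_bins)]


def mean_centered(values: Sequence[float]) -> List[float]:
    """Remove the mean so the modulator is orthogonal to a constant response."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def gamma_hrf(t: float, p: float = 8.6, q: float = 0.547) -> float:
    """Gamma variate t**p * exp(-t / q), scaled to peak 1 at t = p * q (assumed shape)."""
    if t <= 0.0:
        return 0.0
    peak = p * q
    return (t / peak) ** p * math.exp(-(t - peak) / q)


def modulated_regressor(modulator: Sequence[float], n_vols: int,
                        tr: float = 0.46, bin_s: float = 1.0) -> List[float]:
    """Sum of HRFs placed at bin midpoints, each weighted by the modulator."""
    regressor = []
    for v in range(n_vols):
        t_vol = v * tr
        total = 0.0
        for b, amplitude in enumerate(modulator):
            onset = (b + 0.5) * bin_s          # pulse at the middle of the bin
            total += amplitude * gamma_hrf(t_vol - onset)
        regressor.append(total)
    return regressor
```

In an amplitude-modulated design of this kind, the mean-centered regressor is typically entered alongside an unmodulated regressor with the same onsets, so that the modulated term captures only variation around the average response.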

For each of the two fMRI sessions (“early” and “late” learning stages), the GLM analyses were performed separately for trained (fMRI runs 1 to 3) and untrained (fMRI runs 4 to 6) mappings. An additional fMRI run (run 7) was performed with the trained mapping, but we did not analyze the data in this study. Then, the regression coefficients of the parametric regressors were taken to group-level whole-brain voxel-wise paired t tests (AFNI’s 3dttest++ function) between the early and late stages. The voxel-wise threshold was P < 10⁻³, and the criterion of 40 suprathreshold voxels was determined by a conservative nonparametric method of randomization and permutation to provide a cluster-wise corrected threshold of P < 0.05 within the whole-brain group mask (AFNI’s 3dttest++ function with “-Clustsim” option). Notably, the results shown in Fig. 3 and SI Appendix, Table S1 were at much more stringent significance levels than the threshold (77). The commonly significant clusters in both “early” and “late” stages (voxel-wise P < 10⁻³, >40 voxels) were identified as the regions commonly activated in both stages and listed in SI Appendix, Table S1, “Common” section.

ROI analysis.

Based on the GLM results and previous literature on reward processing, the bilateral VMPFC, caudate head/body/tail, anterior/posterior putamen, and NAc were chosen as ROIs. The manually segmented caudate ROIs were used for the main analyses (SI Appendix), and the VMPFC ROIs were defined by using an atlas provided in AFNI (78). All other ROIs were generated by using the Reinforcement Learning Atlas (79). The putamen ROIs were divided into the anterior (Y > −0.56) and posterior (Y < −3.25) regions, with a 1-voxel gap between them to reduce partial volume effects, as suggested by the literature (80) (SI Appendix, Fig. S6). We confirmed that the defined anterior and posterior putamen ROIs included the main foci of activation for the early and late stages, respectively, as reported previously (11).

For each ROI, the average beta estimates from the GLM analyses of success-rate modulation were extracted by using AFNI’s 3dmaskave function. The extracted data were then subjected to a two-way repeated-measures permutation ANOVA, with the region and learning stage (early vs. late) as within-subject factors. For all these analyses, the effect sizes were estimated by using partial eta-squared values, and subsequent post hoc pairwise t tests were performed with the Holm–Bonferroni adjustment to correct for multiple comparisons. In addition, to assess the effects of the learning stage and mapping (trained vs. untrained), a two-way repeated-measures permutation ANOVA and subsequent post hoc t tests were also performed. To account for deviations from normality and homogeneity of variances in the fMRI activities, permutation ANOVA and Wilcoxon signed-rank test were adopted throughout the analysis.
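The Holm–Bonferroni step-down adjustment used for the post hoc tests is simple enough to show directly. The sketch below is a generic implementation of the standard procedure with a hypothetical function name; it is not taken from the authors' analysis scripts.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # ascending p-values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below earlier ones.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A comparison is declared significant when its adjusted p-value falls below the chosen alpha level.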

Resting-state functional connectivity analysis.

After initial preprocessing, the resting-state fMRI data were further processed to control for white matter (WM) signals according to the following procedures. First, each participant’s WM mask was created from automatic segmentation performed by FreeSurfer’s recon-all pipeline. Then, the resting-state fMRI time courses were detrended by using fourth-order polynomial regressors and bandpass filtering (0.01 to 0.1 Hz) (AFNI’s 3dTproject function). To avoid spurious correlation, we regressed out the first five principal components of signals from the WM mask (81). However, we did not perform global signal regression to avoid introducing artifactual anticorrelations (82). Importantly, signals from the ventricles were also not regressed out, in order to minimize the signal loss from the caudate tail, which is an ROI located immediately next to the ventricles.

The resulting residual images were then used for the seed-based functional connectivity analysis. The seed ROIs were defined as the manually segmented individual ROIs of the caudate nucleus (bilateral head, body, and tail) (SI Appendix, SI Methods). We also independently defined two cortical ROIs: one in the bilateral DLPFC for the anterior cognitive loop and the other in M1/S1 for the posterior sensorimotor loop.

The bilateral DLPFC ROI was defined as a mask obtained from Neurosynth, a large-scale meta-analytic fMRI database (https://neurosynth.org/; accessed on September 23, 2019), using the term “dorsolateral prefrontal,” which retrieved 1,049 studies and 36,216 activations (SI Appendix, Fig. S7A). We then selected the two most significant clusters in the right and left prefrontal regions and combined them into a single ROI. For the ROI in the left M1/S1, we performed a whole-brain GLM analysis of the localizer fMRI data contrasting the “Move” and “Stop” conditions; a second-level group analysis using AFNI’s 3dttest++ function then identified the most significant positive cluster in the left M1/S1 region (SI Appendix, Fig. S7B). The other six significant clusters related to finger movements are listed in SI Appendix, Table S4. We applied a highly stringent voxel-wise threshold of P < 10⁻⁵ and required a cluster size larger than 150 voxels to define the discrete ROIs more clearly.
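The combination of a voxel-wise threshold and a minimum cluster extent can be illustrated with a toy example. The sketch below is not AFNI's clustering code. It assumes face-touching (six-neighbor) connectivity and treats the extent as "at least `min_voxels`", both of which are assumptions, and the function name `find_clusters` and the miniature P map are hypothetical.

```python
from collections import deque

# Face-touching neighbors in a 3D voxel grid (an assumed connectivity rule).
FACE_NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_clusters(p_map, p_thresh, min_voxels):
    """Return connected sets of voxels with p < p_thresh and size >= min_voxels.

    p_map maps (x, y, z) voxel coordinates to voxel-wise p-values.
    """
    supra = {v for v, p in p_map.items() if p < p_thresh}
    seen = set()
    clusters = []
    for start in sorted(supra):
        if start in seen:
            continue
        # Breadth-first search over suprathreshold neighbors.
        cluster = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in FACE_NEIGHBORS:
                nb = (x + dx, y + dy, z + dz)
                if nb in supra and nb not in seen:
                    seen.add(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_voxels:
            clusters.append(cluster)
    return clusters


# Toy 4 x 4 x 1 map: one three-voxel blob and one isolated voxel survive the
# voxel-wise threshold, but only the blob meets the extent criterion.
p_map = {(x, y, 0): 0.5 for x in range(4) for y in range(4)}
for voxel in [(0, 0, 0), (0, 1, 0), (1, 1, 0)]:
    p_map[voxel] = 1e-6
p_map[(3, 3, 0)] = 1e-6

clusters = find_clusters(p_map, p_thresh=1e-5, min_voxels=3)
```

In practice the extent threshold is far larger (150 voxels here), and a cluster-wise corrected P value additionally requires a null distribution of cluster sizes, which this sketch does not estimate.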

In addition to the ROIs in the left M1/S1 and in the DLPFC, we also defined a region in the visual cortex to further test whether the interaction in the visual loop via the caudate tail accounts for the learning performance. Specifically, we identified a region evoked by visual stimuli using the data from the late stage for the trained mapping, in which the activity in the caudate tail significantly increased (Fig. 3). We performed a whole-brain GLM analysis with a regressor encoding the target onsets, which was convolved with a canonical hemodynamic response (AFNI’s 3dDeconvolve function). Then, the second-level group analysis defined the significant positive cluster in the bilateral visual cortex using AFNI’s 3dttest++ function with a voxel-wise threshold of P < 10⁻³ and 40 suprathreshold voxels determining a cluster-wise corrected P < 0.05 (SI Appendix, Fig. S7C).

For the cognitive loop, we tested the connectivity between the anterior regions of the caudate nucleus (bilateral caudate heads and bodies) and the DLPFC. For the sensorimotor loop, we tested the connectivity between the bilateral caudate tails and the left M1/S1, which was most significantly related to contralateral right finger movements. Finally, for the visual loop, we tested the connectivity between the bilateral caudate tails and regions in the visual cortex evoked by the visual stimuli.

For each participant, we calculated Pearson’s correlation coefficients between the mean time series extracted from each seed and the residual signals from all other voxels, both within each of the three cortical ROIs (DLPFC, M1/S1, and visual cortex) and across the whole brain. The correlation coefficients were converted to Z values by using Fisher’s transformation (AFNI’s 3dTcorr1D function with “-pearson -Fisher” options). Finally, we concatenated the resulting Z maps of the 29 participants (after excluding one outlier) and correlated them with the individual learning performance. The excluded participant had experienced technical issues that affected the performance during the behavioral training sessions and showed exceptionally low success rates during these sessions (in the lower 0.3% of the performance distribution; shown as the 29th participant in SI Appendix, Fig. S2).
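The two levels of this analysis can be sketched for a single seed-target pair: a within-participant Pearson correlation converted with Fisher's transformation, followed by an across-participant correlation between the resulting Z values and learning performance. All numbers below are made up for illustration, and the helper names `pearson_r` and `fisher_z` are hypothetical. The actual analysis operated on voxel-wise maps rather than single values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's r-to-z transformation, z = atanh(r)."""
    return math.atanh(r)


# First level: one participant, one seed time series and one target time series.
seed = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9]
target = [0.2, 0.5, 0.1, 0.7, 0.7, 1.0]
z = fisher_z(pearson_r(seed, target))

# Second level: hypothetical Z values from five participants, correlated with
# hypothetical individual success rates.
z_per_participant = [0.21, 0.35, 0.10, 0.48, 0.30]
success_rate = [0.55, 0.70, 0.40, 0.85, 0.60]
r_group = pearson_r(z_per_participant, success_rate)
```

The Fisher transformation makes the sampling distribution of the correlation coefficient approximately normal, which is why Z values rather than raw r values are carried to the group level.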

The following tests were conducted for the correlation analyses: 1) four tests between the bilateral caudate heads/bodies and the DLPFC, 2) two tests between the bilateral caudate tails and the left M1/S1, and 3) two tests between the bilateral caudate tails and the ROI in the visual cortex. To correct for multiple comparisons, we reported Bonferroni-corrected P values for these tests. In addition, the same analysis was performed for all of the voxels in the whole brain with a voxel-wise threshold of P < 10⁻³, and 40 suprathreshold voxels determining a cluster-wise corrected P < 0.05. The results are summarized in SI Appendix, Fig. S9 and Table S3.
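For reference, a Bonferroni correction multiplies each uncorrected P value by the number of tests and caps the result at 1. The sketch below counts the tests listed above (4 + 2 + 2). Whether the correction was applied across all eight tests or within each loop is not stated here, so the choice of `n_tests` is an assumption, and the example P values are hypothetical.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted p-values; n_tests defaults to the number of inputs."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Number of correlation tests per loop, as enumerated in the text.
tests_per_loop = {"cognitive": 4, "sensorimotor": 2, "visual": 2}
n_total = sum(tests_per_loop.values())

# Hypothetical uncorrected p-values for the two sensorimotor-loop tests.
p_raw = [0.004, 0.02]
p_across_all = bonferroni(p_raw, n_tests=n_total)                       # assumes one family of 8
p_within_loop = bonferroni(p_raw, n_tests=tests_per_loop["sensorimotor"])  # assumes a family of 2
```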

Supplementary Material

Supplementary File

pnas.2003963117.sapp.pdf (56.3 MB, pdf)

Supplementary File

Video file (346.7 KB, mov)

Acknowledgments

We thank Dr. Seong-Gi Kim, director of the Center for Neuroscience Imaging Research (CNIR), for his scientific comments and administrative support; In-Gyu Choi and Hohyun Kang for producing the supplementary movie; Seung-Yeon Lee for her work in manual segmentation of the caudate nucleus; Hyeji Kim for assistance with the manual segmentation; Boohee Choi for her technical assistance in the MRI experiment; and Drs. Dongho Kim and Hyoung F. Kim for their scientific comments. Neuroimaging was performed at the CNIR, located at Sungkyunkwan University, Korea. This work was supported by Young Science Fellowship IBS-R015-Y1 and IBS-R015-D1 from the Institute for Basic Science, Korea.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2003963117/-/DCSupplemental.

Data Availability.

The behavioral and MRI data are available via the Open Science Framework (OSF) database at https://osf.io/rmn63/.

References

1. Diedrichsen J., Kornysheva K., Motor skill learning between selection and execution. Trends Cogn. Sci. 19, 227–233 (2015).
2. Hikosaka O., Yamamoto S., Yasuda M., Kim H. F., Why skill matters. Trends Cogn. Sci. 17, 434–441 (2013).
3. Dayan E., Cohen L. G., Neuroplasticity subserving motor skill learning. Neuron 72, 443–454 (2011).
4. Doyon J., Penhune V., Ungerleider L. G., Distinct contribution of the cortico-striatal and cortico-cerebellar systems to motor skill learning. Neuropsychologia 41, 252–262 (2003).
5. Haber S. N., Knutson B., The reward circuit: Linking primate anatomy and human imaging. Neuropsychopharmacology 35, 4–26 (2010).
6. de Wit S. et al., Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control. J. Neurosci. 32, 12066–12075 (2012).
7. Miyachi S., Hikosaka O., Miyashita K., Kárádi Z., Rand M. K., Differential roles of monkey striatum in learning of sequential hand movement. Exp. Brain Res. 115, 1–5 (1997).
8. Kupferschmidt D. A., Juczewski K., Cui G., Johnson K. A., Lovinger D. M., Parallel, but dissociable, processing in discrete corticostriatal inputs encodes skill learning. Neuron 96, 476–489.e5 (2017).
9. Miyachi S., Hikosaka O., Lu X., Differential activation of monkey striatal neurons in the early and late stages of procedural learning. Exp. Brain Res. 146, 122–126 (2002).
10. Thorn C. A., Atallah H., Howe M., Graybiel A. M., Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66, 781–795 (2010).
11. Lehéricy S. et al., Distinct basal ganglia territories are engaged in early and advanced motor sequence learning. Proc. Natl. Acad. Sci. U.S.A. 102, 12566–12571 (2005).
12. Poldrack R. A. et al., The neural correlates of motor skill automaticity. J. Neurosci. 25, 5356–5364 (2005).
13. Jankowski J., Scheef L., Hüppe C., Boecker H., Distinct striatal regions for planning and executing novel and automated movement sequences. Neuroimage 44, 1369–1379 (2009).
14. Bapi R. S., Miyapuram K. P., Graydon F. X., Doya K., fMRI investigation of cortical and subcortical networks in the learning of abstract and effector-specific representations of motor sequences. Neuroimage 32, 714–727 (2006).
15. Doyon J. et al., Role of the striatum, cerebellum, and frontal lobes in the learning of a visuomotor sequence. Brain Cogn. 34, 218–245 (1997).
16. Floyer-Lea A., Matthews P. M., Changing brain networks for visuomotor control with increased movement automaticity. J. Neurophysiol. 92, 2405–2412 (2004).
17. Ungerleider L. G., Doyon J., Karni A., Imaging brain plasticity during motor skill learning. Neurobiol. Learn. Mem. 78, 553–564 (2002).
18. Wymbs N. F., Bassett D. S., Mucha P. J., Porter M. A., Grafton S. T., Differential recruitment of the sensorimotor putamen and frontoparietal cortex during motor chunking in humans. Neuron 74, 936–946 (2012).
19. Krakauer J. W., Hadjiosif A. M., Xu J., Wong A. L., Haith A. M., Motor learning. Compr. Physiol. 9, 613–663 (2019).
20. Telgen S., Parvin D., Diedrichsen J., Mirror reversal and visual rotation are learned and consolidated via separate mechanisms: Recalibrating or learning de novo? J. Neurosci. 34, 13768–13779 (2014).
21. Radhakrishnan S. M., Baker S. N., Jackson A., Learning a novel myoelectric-controlled interface task. J. Neurophysiol. 100, 2397–2408 (2008).
22. Johansson R. S. et al., How a lateralized brain supports symmetrical bimanual tasks. PLoS Biol. 4, e158 (2006).
23. Ranganathan R., Adewuyi A., Mussa-Ivaldi F. A., Learning to be lazy: Exploiting redundancy in a novel task to minimize movement-related effort. J. Neurosci. 33, 2754–2760 (2013).
24. Gutierrez-Garralda J. M. et al., The effect of Parkinson’s disease and Huntington’s disease on human visuomotor learning. Eur. J. Neurosci. 38, 2933–2940 (2013).
25. Izawa J., Shadmehr R., Learning from sensory and reward prediction errors during motor adaptation. PLoS Comput. Biol. 7, e1002012 (2011).
26. Therrien A. S., Wolpert D. M., Bastian A. J., Effective reinforcement learning following cerebellar damage requires a balance between exploration and motor noise. Brain 139, 101–114 (2016).
27. Abe M. et al., Reward improves long-term retention of a motor memory through induction of offline memory gains. Curr. Biol. 21, 557–562 (2011).
28. Kim H. F., Ghazizadeh A., Hikosaka O., Dopamine neurons encoding long-term memory of object value for habitual behavior. Cell 163, 1165–1175 (2015).
29. Kim H. F., Hikosaka O., Parallel basal ganglia circuits for voluntary and automatic behaviour to reach rewards. Brain 138, 1776–1800 (2015).
30. Kim S., Ogawa K., Lv J., Schweighofer N., Imamizu H., Neural substrates related to motor memory with multiple timescales in sensorimotor adaptation. PLoS Biol. 13, e1002312 (2015).
31. Steele C. J., Penhune V. B., Specific increases within global decreases: A functional magnetic resonance imaging investigation of five days of motor sequence learning. J. Neurosci. 30, 8332–8341 (2010).
32. Puttemans V., Wenderoth N., Swinnen S. P., Changes in brain activation during the acquisition of a multifrequency bimanual coordination task: From the cognitive stage to advanced levels of automaticity. J. Neurosci. 25, 4270–4278 (2005).
33. Toni I., Krams M., Turner R., Passingham R. E., The time course of changes during motor sequence learning: A whole-brain fMRI study. Neuroimage 8, 50–61 (1998).
34. O’Doherty J. P., Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Curr. Opin. Neurobiol. 14, 769–776 (2004).
35. Gläscher J., Hampton A. N., O’Doherty J. P., Determining a role for ventromedial prefrontal cortex in encoding action-based value signals during reward-related decision making. Cereb. Cortex 19, 483–495 (2009).
36. Hare T. A., O’Doherty J., Camerer C. F., Schultz W., Rangel A., Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28, 5623–5630 (2008).
37. Penhune V. B., Steele C. J., Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behav. Brain Res. 226, 579–591 (2012).
38. Wunderlich K., Rangel A., O’Doherty J. P., Neural computations underlying action-based decision making in the human brain. Proc. Natl. Acad. Sci. U.S.A. 106, 17199–17204 (2009).
39. Wunderlich K., Dayan P., Dolan R. J., Mapping value based planning and extensively trained choice in the human brain. Nat. Neurosci. 15, 786–791 (2012).
40. Jarbo K., Verstynen T. D., Converging structural and functional connectivity of orbitofrontal, dorsolateral prefrontal, and posterior parietal cortex in the human striatum. J. Neurosci. 35, 3865–3878 (2015).
41. Marquand A. F., Haak K. V., Beckmann C. F., Functional corticostriatal connection topographies predict goal directed behaviour in humans. Nat. Hum. Behav. 1, 0146 (2017).
42. Seger C. A., The visual corticostriatal loop through the tail of the caudate: Circuitry and function. Front. Syst. Neurosci. 7, 104 (2013).
43. Moore R. D., Gallea C., Horovitz S. G., Hallett M., Individuated finger control in focal hand dystonia: An fMRI study. Neuroimage 61, 823–831 (2012).
44. Jueptner M., Frith C. D., Brooks D. J., Frackowiak R. S. J., Passingham R. E., Anatomy of motor learning. II. Subcortical structures and learning by trial and error. J. Neurophysiol. 77, 1325–1337 (1997).
45. Kim H. F., Hikosaka O., Distinct basal ganglia circuits controlling behaviors guided by flexible and stable values. Neuron 79, 1001–1010 (2013).
46. Grahn J. A., Parkinson J. A., Owen A. M., The cognitive functions of the caudate nucleus. Prog. Neurobiol. 86, 141–155 (2008).
47. Lopez-Paniagua D., Seger C. A., Interactions within and between corticostriatal loops during component processes of category learning. J. Cogn. Neurosci. 23, 3068–3083 (2011).
48. Hikosaka O. et al., Multiple neuronal circuits for variable object-action choices based on short- and long-term memories. Proc. Natl. Acad. Sci. U.S.A. 116, 26313–26320 (2019).
49. Ashby F. G., Ell S. W., The neurobiology of human category learning. Trends Cogn. Sci. 5, 204–210 (2001).
50. Ashby F. G., Noble S., Filoteo J. V., Waldron E. M., Ell S. W., Category learning deficits in Parkinson’s disease. Neuropsychology 17, 115–124 (2003).
51. Seger C. A., Cincotta C. M., The roles of the caudate nucleus in human classification learning. J. Neurosci. 25, 2941–2951 (2005).
52. Nomura E. M. et al., Neural correlates of rule-based and information-integration visual category learning. Cereb. Cortex 17, 37–43 (2007).
53. Liu X., Scheidt R. A., Contributions of online visual feedback to the learning and generalization of novel finger coordination patterns. J. Neurophysiol. 99, 2546–2557 (2008).
54. Mazzoni P., Krakauer J. W., An implicit plan overrides an explicit strategy during visuomotor adaptation. J. Neurosci. 26, 3642–3645 (2006).
55. Beilock S. L., Carr T. H., On the fragility of skilled performance: What governs choking under pressure? J. Exp. Psychol. Gen. 130, 701–725 (2001).
56. Collins A. G., Frank M. J., Cognitive control over learning: Creating, clustering, and generalizing task-set structure. Psychol. Rev. 120, 190–229 (2013).
57. Beilock S. L., Carr T. H., MacMahon C., Starkes J. L., When paying attention becomes counterproductive: Impact of divided versus skill-focused attention on novice and experienced performance of sensorimotor skills. J. Exp. Psychol. Appl. 8, 6–16 (2002).
58. Bassett D. S., Yang M., Wymbs N. F., Grafton S. T., Learning-induced autonomy of sensorimotor systems. Nat. Neurosci. 18, 744–751 (2015).
59. Hikosaka O. et al., Parallel neural networks for learning sequential procedures. Trends Neurosci. 22, 464–471 (1999).
60. Imamizu H. et al., Human cerebellar activity reflecting an acquired internal model of a new tool. Nature 403, 192–195 (2000).
61. Daw N. D., Niv Y., Dayan P., Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).
62. Lee J. Y., Schweighofer N., Dual adaptation supports a parallel architecture of motor memory. J. Neurosci. 29, 10396–10404 (2009).
63. Smith M. A., Ghazizadeh A., Shadmehr R., Interacting adaptive processes with different timescales underlie short-term motor learning. PLoS Biol. 4, e179 (2006).
64. Kording K. P., Tenenbaum J. B., Shadmehr R., The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci. 10, 779–786 (2007).
65. Widmer M., Ziegler N., Held J., Luft A., Lutz K., “Rewarding feedback promotes motor skill consolidation via striatal activity” in Progress in Brain Research, Studer B., Knecht S., Eds. (Elsevier, Amsterdam, Netherlands, 2016), Vol. 229, pp. 303–323.
66. Pessiglione M. et al., How the brain translates money into force: A neuroimaging study of subliminal motivation. Science 316, 904–906 (2007).
67. Chalavi S. et al., Anatomy of subcortical structures predicts age-related differences in skill acquisition. Cereb. Cortex 28, 459–473 (2018).
68. Lutz K., Pedroni A., Nadig K., Luechinger R., Jäncke L., The rewarding value of good motor performance in the context of monetary incentives. Neuropsychologia 50, 1739–1747 (2012).
69. Erickson K. I. et al., Striatal volume predicts level of video game skill acquisition. Cereb. Cortex 20, 2522–2530 (2010).
70. O’Doherty J. et al., Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454 (2004).
71. Sutton R. M., McAllester D., Singh S., Mansour Y., “Policy gradient methods for reinforcement learning with function approximation” in NIPS’99 Proceedings of the 12th International Conference on Neural Information Processing Systems, Solla S. A., Leen T. K., Müller K., Eds. (MIT Press, Cambridge, MA, 2000), pp. 1057–1063.
72. Li J., Daw N. D., Signals in human striatum are appropriate for policy update rather than value prediction. J. Neurosci. 31, 5504–5511 (2011).
73. Dayan E., Herszage J., Laor-Maayany R., Sharon H., Censor N., Neuromodulation of reinforced skill learning reveals the causal function of prefrontal cortex. Hum. Brain Mapp. 39, 4724–4732 (2018).
74. Censor N., Horovitz S. G., Cohen L. G., Interference with existing memories alters offline intrinsic functional brain connectivity. Neuron 81, 69–76 (2014).
75. Oldfield R. C., The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia 9, 97–113 (1971).
76. Henson R. N. A., Price C. J., Rugg M. D., Turner R., Friston K. J., Detecting latency differences in event-related BOLD responses: Application to words versus nonwords and initial versus repeated face presentations. Neuroimage 15, 83–97 (2002).
77. Cox R. W., Chen G., Glen D. R., Reynolds R. C., Taylor P. A., FMRI clustering in AFNI: False-positive rates redux. Brain Connect. 7, 152–171 (2017).
78. Mackey S., Petrides M., Architecture and morphology of the human ventromedial prefrontal cortex. Eur. J. Neurosci. 40, 2777–2796 (2014).
79. Pauli W. M., Nili A. N., Tyszka J. M., A high-resolution probabilistic in vivo atlas of human subcortical brain nuclei. Sci. Data 5, 180063 (2018).
80. Aarts E. et al., Aberrant reward processing in Parkinson’s disease is associated with dopamine cell loss. Neuroimage 59, 3339–3346 (2012).
81. Chai X. J., Castañón A. N., Ongür D., Whitfield-Gabrieli S., Anticorrelations in resting state networks without global signal regression. Neuroimage 59, 1420–1428 (2012).
82. Murphy K., Birn R. M., Handwerker D. A., Jones T. B., Bandettini P. A., The impact of global signal regression on resting state correlations: Are anti-correlated networks introduced? Neuroimage 44, 893–905 (2009).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary File

pnas.2003963117.sapp.pdf^{(56.3MB, pdf)}

Supplementary File

Download video file^{(346.7KB, mov)}

Data Availability Statement

The behavioral and MRI data are available via the Open Science Framework (OSF) database at https://osf.io/rmn63/.

[r1] 1.Diedrichsen J., Kornysheva K., Motor skill learning between selection and execution. Trends Cogn. Sci. 19, 227–233 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r2] 2.Hikosaka O., Yamamoto S., Yasuda M., Kim H. F., Why skill matters. Trends Cogn. Sci. 17, 434–441 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r3] 3.Dayan E., Cohen L. G., Neuroplasticity subserving motor skill learning. Neuron 72, 443–454 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r4] 4.Doyon J., Penhune V., Ungerleider L. G., Distinct contribution of the cortico-striatal and cortico-cerebellar systems to motor skill learning. Neuropsychologia 41, 252–262 (2003). [DOI] [PubMed] [Google Scholar]

[r5] 5.Haber S. N., Knutson B., The reward circuit: Linking primate anatomy and human imaging. Neuropsychopharmacology 35, 4–26 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r6] 6.de Wit S. et al., Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control. J. Neurosci. 32, 12066–12075 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r7] 7.Miyachi S., Hikosaka O., Miyashita K., Kárádi Z., Rand M. K., Differential roles of monkey striatum in learning of sequential hand movement. Exp. Brain Res. 115, 1–5 (1997). [DOI] [PubMed] [Google Scholar]

[r8] 8.Kupferschmidt D. A., Juczewski K., Cui G., Johnson K. A., Lovinger D. M., Parallel, but dissociable, processing in discrete corticostriatal inputs encodes Skill learning. Neuron 96, 476–489.e5 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r9] 9.Miyachi S., Hikosaka O., Lu X., Differential activation of monkey striatal neurons in the early and late stages of procedural learning. Exp. Brain Res. 146, 122–126 (2002). [DOI] [PubMed] [Google Scholar]

[r10] 10.Thorn C. A., Atallah H., Howe M., Graybiel A. M., Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66, 781–795 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r11] 11.Lehéricy S. et al., Distinct basal ganglia territories are engaged in early and advanced motor sequence learning. Proc. Natl. Acad. Sci. U.S.A. 102, 12566–12571 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r12] 12.Poldrack R. A. et al., The neural correlates of motor skill automaticity. J. Neurosci. 25, 5356–5364 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r13] 13.Jankowski J., Scheef L., Hüppe C., Boecker H., Distinct striatal regions for planning and executing novel and automated movement sequences. Neuroimage 44, 1369–1379 (2009). [DOI] [PubMed] [Google Scholar]

[r14] 14.Bapi R. S., Miyapuram K. P., Graydon F. X., Doya K., fMRI investigation of cortical and subcortical networks in the learning of abstract and effector-specific representations of motor sequences. Neuroimage 32, 714–727 (2006). [DOI] [PubMed] [Google Scholar]

[r15] 15.Doyon J. et al., Role of the striatum, cerebellum, and frontal lobes in the learning of a visuomotor sequence. Brain Cogn. 34, 218–245 (1997). [DOI] [PubMed] [Google Scholar]

[r16] 16.Floyer-Lea A., Matthews P. M., Changing brain networks for visuomotor control with increased movement automaticity. J. Neurophysiol. 92, 2405–2412 (2004). [DOI] [PubMed] [Google Scholar]

[r17] 17.Ungerleider L. G., Doyon J., Karni A., Imaging brain plasticity during motor skill learning. Neurobiol. Learn. Mem. 78, 553–564 (2002). [DOI] [PubMed] [Google Scholar]

[r18] 18.Wymbs N. F., Bassett D. S., Mucha P. J., Porter M. A., Grafton S. T., Differential recruitment of the sensorimotor putamen and frontoparietal cortex during motor chunking in humans. Neuron 74, 936–946 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r19] 19.Krakauer J. W., Hadjiosif A. M., Xu J., Wong A. L., Haith A. M., Motor learning. Compr. Physiol. 9, 613–663 (2019). [DOI] [PubMed] [Google Scholar]

[r20] 20.Telgen S., Parvin D., Diedrichsen J., Mirror reversal and visual rotation are learned and consolidated via separate mechanisms: Recalibrating or learning de novo? J. Neurosci. 34, 13768–13779 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r21] 21.Radhakrishnan S. M., Baker S. N., Jackson A., Learning a novel myoelectric-controlled interface task. J. Neurophysiol. 100, 2397–2408 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r22] 22.Johansson R. S. et al., How a lateralized brain supports symmetrical bimanual tasks. PLoS Biol. 4, e158 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r23] 23.Ranganathan R., Adewuyi A., Mussa-Ivaldi F. A., Learning to be lazy: Exploiting redundancy in a novel task to minimize movement-related effort. J. Neurosci. 33, 2754–2760 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r24] 24.Gutierrez-Garralda J. M. et al., The effect of Parkinson’s disease and Huntington’s disease on human visuomotor learning. Eur. J. Neurosci. 38, 2933–2940 (2013). [DOI] [PubMed] [Google Scholar]

[r25] 25.Izawa J., Shadmehr R., Learning from sensory and reward prediction errors during motor adaptation. PLoS Comput. Biol. 7, e1002012 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r26] 26.Therrien A. S., Wolpert D. M., Bastian A. J., Effective reinforcement learning following cerebellar damage requires a balance between exploration and motor noise. Brain 139, 101–114 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r27] 27.Abe M. et al., Reward improves long-term retention of a motor memory through induction of offline memory gains. Curr. Biol. 21, 557–562 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r28] 28.Kim H. F., Ghazizadeh A., Hikosaka O., Dopamine neurons encoding long-term memory of object value for habitual behavior. Cell 163, 1165–1175 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r29] 29.Kim H. F., Hikosaka O., Parallel basal ganglia circuits for voluntary and automatic behaviour to reach rewards. Brain 138, 1776–1800 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r30] 30.Kim S., Ogawa K., Lv J., Schweighofer N., Imamizu H., Neural substrates related to motor memory with multiple timescales in sensorimotor adaptation. PLoS Biol. 13, e1002312 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r31] 31.Steele C. J., Penhune V. B., Specific increases within global decreases: A functional magnetic resonance imaging investigation of five days of motor sequence learning. J. Neurosci. 30, 8332–8341 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r32] 32.Puttemans V., Wenderoth N., Swinnen S. P., Changes in brain activation during the acquisition of a multifrequency bimanual coordination task: From the cognitive stage to advanced levels of automaticity. J. Neurosci. 25, 4270–4278 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]

[r33] 33.Toni I., Krams M., Turner R., Passingham R. E., The time course of changes during motor sequence learning: A whole-brain fMRI study. Neuroimage 8, 50–61 (1998). [DOI] [PubMed] [Google Scholar]

[r34] 34. O’Doherty J. P., Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Curr. Opin. Neurobiol. 14, 769–776 (2004).

[r35] 35. Gläscher J., Hampton A. N., O’Doherty J. P., Determining a role for ventromedial prefrontal cortex in encoding action-based value signals during reward-related decision making. Cereb. Cortex 19, 483–495 (2009).

[r36] 36. Hare T. A., O’Doherty J., Camerer C. F., Schultz W., Rangel A., Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28, 5623–5630 (2008).

[r37] 37. Penhune V. B., Steele C. J., Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behav. Brain Res. 226, 579–591 (2012).

[r38] 38. Wunderlich K., Rangel A., O’Doherty J. P., Neural computations underlying action-based decision making in the human brain. Proc. Natl. Acad. Sci. U.S.A. 106, 17199–17204 (2009).

[r39] 39. Wunderlich K., Dayan P., Dolan R. J., Mapping value based planning and extensively trained choice in the human brain. Nat. Neurosci. 15, 786–791 (2012).

[r40] 40. Jarbo K., Verstynen T. D., Converging structural and functional connectivity of orbitofrontal, dorsolateral prefrontal, and posterior parietal cortex in the human striatum. J. Neurosci. 35, 3865–3878 (2015).

[r41] 41. Marquand A. F., Haak K. V., Beckmann C. F., Functional corticostriatal connection topographies predict goal directed behaviour in humans. Nat. Hum. Behav. 1, 0146 (2017).

[r42] 42. Seger C. A., The visual corticostriatal loop through the tail of the caudate: Circuitry and function. Front. Syst. Neurosci. 7, 104 (2013).

[r43] 43. Moore R. D., Gallea C., Horovitz S. G., Hallett M., Individuated finger control in focal hand dystonia: An fMRI study. Neuroimage 61, 823–831 (2012).

[r44] 44. Jueptner M., Frith C. D., Brooks D. J., Frackowiak R. S. J., Passingham R. E., Anatomy of motor learning. II. Subcortical structures and learning by trial and error. J. Neurophysiol. 77, 1325–1337 (1997).

[r45] 45. Kim H. F., Hikosaka O., Distinct basal ganglia circuits controlling behaviors guided by flexible and stable values. Neuron 79, 1001–1010 (2013).

[r46] 46. Grahn J. A., Parkinson J. A., Owen A. M., The cognitive functions of the caudate nucleus. Prog. Neurobiol. 86, 141–155 (2008).

[r47] 47. Lopez-Paniagua D., Seger C. A., Interactions within and between corticostriatal loops during component processes of category learning. J. Cogn. Neurosci. 23, 3068–3083 (2011).

[r48] 48. Hikosaka O. et al., Multiple neuronal circuits for variable object-action choices based on short- and long-term memories. Proc. Natl. Acad. Sci. U.S.A. 116, 26313–26320 (2019).

[r49] 49. Ashby F. G., Ell S. W., The neurobiology of human category learning. Trends Cogn. Sci. 5, 204–210 (2001).

[r50] 50. Ashby F. G., Noble S., Filoteo J. V., Waldron E. M., Ell S. W., Category learning deficits in Parkinson’s disease. Neuropsychology 17, 115–124 (2003).

[r51] 51. Seger C. A., Cincotta C. M., The roles of the caudate nucleus in human classification learning. J. Neurosci. 25, 2941–2951 (2005).

[r52] 52. Nomura E. M. et al., Neural correlates of rule-based and information-integration visual category learning. Cereb. Cortex 17, 37–43 (2007).

[r53] 53. Liu X., Scheidt R. A., Contributions of online visual feedback to the learning and generalization of novel finger coordination patterns. J. Neurophysiol. 99, 2546–2557 (2008).

[r54] 54. Mazzoni P., Krakauer J. W., An implicit plan overrides an explicit strategy during visuomotor adaptation. J. Neurosci. 26, 3642–3645 (2006).

[r55] 55. Beilock S. L., Carr T. H., On the fragility of skilled performance: What governs choking under pressure? J. Exp. Psychol. Gen. 130, 701–725 (2001).

[r56] 56. Collins A. G., Frank M. J., Cognitive control over learning: Creating, clustering, and generalizing task-set structure. Psychol. Rev. 120, 190–229 (2013).

[r57] 57. Beilock S. L., Carr T. H., MacMahon C., Starkes J. L., When paying attention becomes counterproductive: Impact of divided versus skill-focused attention on novice and experienced performance of sensorimotor skills. J. Exp. Psychol. Appl. 8, 6–16 (2002).

[r58] 58. Bassett D. S., Yang M., Wymbs N. F., Grafton S. T., Learning-induced autonomy of sensorimotor systems. Nat. Neurosci. 18, 744–751 (2015).

[r59] 59. Hikosaka O. et al., Parallel neural networks for learning sequential procedures. Trends Neurosci. 22, 464–471 (1999).

[r60] 60. Imamizu H. et al., Human cerebellar activity reflecting an acquired internal model of a new tool. Nature 403, 192–195 (2000).

[r61] 61. Daw N. D., Niv Y., Dayan P., Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).

[r62] 62. Lee J. Y., Schweighofer N., Dual adaptation supports a parallel architecture of motor memory. J. Neurosci. 29, 10396–10404 (2009).

[r63] 63. Smith M. A., Ghazizadeh A., Shadmehr R., Interacting adaptive processes with different timescales underlie short-term motor learning. PLoS Biol. 4, e179 (2006).

[r64] 64. Kording K. P., Tenenbaum J. B., Shadmehr R., The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci. 10, 779–786 (2007).

[r65] 65. Widmer M., Ziegler N., Held J., Luft A., Lutz K., “Rewarding feedback promotes motor skill consolidation via striatal activity” in Progress in Brain Research, Studer B., Knecht S., Eds. (Elsevier, Amsterdam, Netherlands, 2016), Vol. 229, pp. 303–323.

[r66] 66. Pessiglione M. et al., How the brain translates money into force: A neuroimaging study of subliminal motivation. Science 316, 904–906 (2007).

[r67] 67. Chalavi S. et al., Anatomy of subcortical structures predicts age-related differences in skill acquisition. Cereb. Cortex 28, 459–473 (2018).

[r68] 68. Lutz K., Pedroni A., Nadig K., Luechinger R., Jäncke L., The rewarding value of good motor performance in the context of monetary incentives. Neuropsychologia 50, 1739–1747 (2012).

[r69] 69. Erickson K. I. et al., Striatal volume predicts level of video game skill acquisition. Cereb. Cortex 20, 2522–2530 (2010).

[r70] 70. O’Doherty J. et al., Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454 (2004).

[r71] 71. Sutton R. S., McAllester D., Singh S., Mansour Y., “Policy gradient methods for reinforcement learning with function approximation” in NIPS’99 Proceedings of the 12th International Conference on Neural Information Processing Systems, Solla S. A., Leen T. K., Müller K., Eds. (MIT Press, Cambridge, MA, 2000), pp. 1057–1063.

[r72] 72. Li J., Daw N. D., Signals in human striatum are appropriate for policy update rather than value prediction. J. Neurosci. 31, 5504–5511 (2011).

[r73] 73. Dayan E., Herszage J., Laor-Maayany R., Sharon H., Censor N., Neuromodulation of reinforced skill learning reveals the causal function of prefrontal cortex. Hum. Brain Mapp. 39, 4724–4732 (2018).

[r74] 74. Censor N., Horovitz S. G., Cohen L. G., Interference with existing memories alters offline intrinsic functional brain connectivity. Neuron 81, 69–76 (2014).

[r75] 75. Oldfield R. C., The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia 9, 97–113 (1971).

[r76] 76. Henson R. N. A., Price C. J., Rugg M. D., Turner R., Friston K. J., Detecting latency differences in event-related BOLD responses: Application to words versus nonwords and initial versus repeated face presentations. Neuroimage 15, 83–97 (2002).

[r77] 77. Cox R. W., Chen G., Glen D. R., Reynolds R. C., Taylor P. A., FMRI clustering in AFNI: False-positive rates redux. Brain Connect. 7, 152–171 (2017).

[r78] 78. Mackey S., Petrides M., Architecture and morphology of the human ventromedial prefrontal cortex. Eur. J. Neurosci. 40, 2777–2796 (2014).

[r79] 79. Pauli W. M., Nili A. N., Tyszka J. M., A high-resolution probabilistic in vivo atlas of human subcortical brain nuclei. Sci. Data 5, 180063 (2018).

[r80] 80. Aarts E. et al., Aberrant reward processing in Parkinson’s disease is associated with dopamine cell loss. Neuroimage 59, 3339–3346 (2012).

[r81] 81. Chai X. J., Castañón A. N., Öngür D., Whitfield-Gabrieli S., Anticorrelations in resting state networks without global signal regression. Neuroimage 59, 1420–1428 (2012).

[r82] 82. Murphy K., Birn R. M., Handwerker D. A., Jones T. B., Bandettini P. A., The impact of global signal regression on resting state correlations: Are anti-correlated networks introduced? Neuroimage 44, 893–905 (2009).
