Abstract
Despite recent progress in our understanding of sensorimotor integration in speech learning, a comprehensive framework to investigate its neural basis at behaviorally relevant timescales is lacking. Structural and functional imaging studies in humans have helped us identify brain networks that support speech but fail to capture the precise spatiotemporal coordination within these networks that takes place during speech learning. Here we use neuronal oscillations to investigate interactions within speech motor networks in a paradigm of speech motor adaptation under altered feedback, with continuous recording of EEG, in which subjects adapted to a real-time auditory perturbation of a target vowel sound. As subjects adapted to the task, concurrent changes in theta-gamma phase coherence were observed during speech planning at several distinct scalp regions, consistent with the establishment of a feedforward map. In particular, there was an increase in coherence over the central region and a decrease over the fronto-temporal regions, revealing a redistribution of coherence over an interacting network of brain regions that could be a general feature of error-based motor learning. Our findings have implications for understanding the neural basis of speech motor learning and could elucidate how transient breakdown of neuronal communication within speech networks relates to speech disorders.
Keywords: speech motor learning, neural oscillations, feedforward map
As we learn to speak, the auditory, somatosensory and motor areas of our brain operate in parallel to perceive speech stimuli, detect feedback errors and generate the motor commands necessary to achieve speech goals (Guenther 2006; Perkell 2012; Tourville et al. 2008). While this process is believed to involve coordinated activity over several distributed brain areas (Guenther 2006; Hickok et al. 2011; Scott and Wise 2004), the neural mechanisms underlying their precise spatiotemporal coordination remain elusive. The oscillatory dynamics inherent in the brain's electrical activity provide a suitable means to study the neural communication within brain networks (Golfinopoulos et al. 2010; Varela et al. 2011) at timescales relevant for speech. In this study, we provide evidence that brain oscillation patterns change as a result of speech motor adaptation, enabling us to probe the neural basis of learning at finer temporal resolutions.
In recent years, the role of neuronal oscillations and cross-frequency phase coupling, particularly in the theta (3–8 Hz) and gamma (30–55 Hz) bands, has been demonstrated in a variety of cognitive and sensorimotor tasks in humans involving working memory (Herrmann et al. 2010), spatial navigation (Burke et al. 2013), perceptual grouping (Gray et al. 1989), visuomotor integration (Roelfsema et al. 1997) and adaptation (Perfetti et al. 2011). Similar phase coupling has been shown for working memory tasks in monkeys (Lee et al. 2005) and for associative learning in rats (Tort et al. 2009). Another study in monkeys has shown that phase coupling across frequency bands in EEG signals and local field potentials is a significant predictor of multiunit activity in neurons (Whittingstall and Logothetis 2009), suggesting that phase coherence in neural oscillations plays a critical role in reorganizing neural circuits (Schack et al. 2002). Based on these results, it has been proposed that communication within brain networks during behavior is facilitated by synchronous neuronal oscillations in the theta and the gamma bands (Fries 2005). While the role of neuronal oscillations has been explored in speech perception (Arnal et al. 2011; Giraud and Poeppel 2012; Schroeder et al. 2008), no such studies have been done in the context of speech motor learning.
We studied neural oscillations in speech motor adaptation using an altered auditory feedback paradigm (Houde and Jordan 1998; Jones and Munhall 2005; Purcell and Munhall 2006a, 2006b; Villacorta et al. 2007) in which subjects repeated a target vowel sound (as in “HEAD”) while feedback was perturbed in real time. EEG signals were continuously recorded as subjects adapted to the task by shifting their production over the course of training, and we measured changes in phase coherence over the scalp between the theta and the gamma bands (Cohen 2008). Changes in phase coherence that correlated significantly with the amount of adaptation were observed at distinct regions over the scalp, increasing over the central region while decreasing over the bilateral fronto-temporal areas, revealing an interacting speech motor network that supports learning. Moreover, these changes were found primarily during speech planning, suggesting the emergence of a feedforward map as subjects adapted to the task. Our results possibly reflect an important mechanism, based on the redistribution of phase coherence, that underlies error-based motor learning and, to the best of our knowledge, provide the first direct evidence of the role of neural oscillations in speech motor learning.
METHODS
Subjects
Seventeen male subjects between the ages of 19 and 30 yr (21.7 ± 0.6 yr) participated in this study. All subjects were native English speakers, and none had any known history of hearing or speech disorders. The Northwestern University Research Ethics Board approved all experimental procedures, and written, informed consent was obtained from all of the participants.
Experimental Setup and Task
All experiments were conducted in a soundproof booth. The speech motor task involved repeating aloud the target word “HEAD” displayed on a computer monitor under continuous recording of EEG. The experiment consisted of 8 baseline blocks followed by 12 training blocks and 10 after-effect blocks of 12 trials each. There was a pause of 1–2 min between blocks and 2.5 s between trials. In the baseline and after-effect phases, subjects received normal auditory feedback, while in the training phase auditory feedback was perturbed by shifting the first two formant frequencies of the target vowel /æ/ toward /I/ (Fig. 1A). These two vowels differ primarily in their height, which is reflected in their first formant frequencies. The formant shifts were estimated for each subject during a screening phase prior to the experiment, when the vowel space was mapped out. On average, the first and the second formant frequencies were shifted by −148.1 Hz and 183.9 Hz, respectively. The intensity of the feedback signal played back to participants was adjusted to 80 dB to minimize air-borne unaltered auditory feedback. We also delivered 60-dB masking noise through the headphones to minimize any bone-conducted unaltered feedback.
Neural activity was assessed during baseline and at early and late training phases that consisted of the first and last 30% of the training trials. The after-effect phase was not included for the analysis of phase coherence. A time window extending from 300 ms before to 300 ms after the voice onset was selected for the analyses.
Altered Feedback
Vowel formant frequencies were altered in real time during speech production following the methods of the altered auditory feedback paradigm (Jones and Munhall 2005; Purcell and Munhall 2006a, 2006b). The LabVIEW real-time language implemented in the National Instruments PXI system estimates the formant frequencies using the Burg algorithm and updates the linear predictive coding filter coefficients of the speech signal at a rate of 10 kHz. The participant's voice was recorded at 10 kHz to obtain offline estimates of the formant frequencies.
Acoustical Analyses and Adaptation
The first and second formant frequencies were extracted for each spoken word using PRAAT and customized Matlab routines. For each subject, the formant frequencies were normalized relative to their baseline mean. It should be noted that the shifts in the second formant frequency were much smaller (∼10% of the F2 value of head) than the shifts in F1 (∼25%), and the F2 compensations were similar across all subjects. Therefore, to assess speech motor learning, we focused only on the first formant frequency. Compensation in the first formant frequency was measured over the course of training, and statistical significance of adaptation was assessed with a t-test between the early and late training phases (α < 0.01). A regression line was then fitted over all the training trials, and the slope was taken as a measure of adaptation, where a positive slope significantly different from zero implies adaptation to the speech motor task. According to this metric, nine subjects adapted to the speech motor task.
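As an illustration of the adaptation measure described above, the following minimal Python sketch fits a regression line to baseline-normalized F1 values over the training trials and compares the first and last 30% of those trials. It is not the original Matlab routine: the function name, the ratio-based normalization and the choice of an independent-samples t-test are assumptions made for this example.

```python
import numpy as np
from scipy import stats


def adaptation_measure(f1_training_hz, f1_baseline_mean_hz, alpha=0.01):
    """Slope-based adaptation statistics for one subject (illustrative only)."""
    # Assumption: "normalized relative to the baseline mean" is taken as a ratio.
    f1 = np.asarray(f1_training_hz, dtype=float) / f1_baseline_mean_hz
    trials = np.arange(f1.size)

    # Regression line over all training trials; its slope is the adaptation measure.
    fit = stats.linregress(trials, f1)

    # Early vs. late training: first and last 30% of the training trials.
    n_edge = int(round(0.3 * f1.size))
    _, p_phase = stats.ttest_ind(f1[-n_edge:], f1[:n_edge])

    return {
        "slope": fit.slope,
        "slope_p": fit.pvalue,
        "late_vs_early_p": p_phase,
        # Adapted: positive slope that differs significantly from zero.
        "adapted": bool(fit.slope > 0 and fit.pvalue < alpha),
    }
```

With 12 training blocks of 12 trials, `f1_training_hz` would hold 144 values per subject, and the early and late phases would each contain 43 trials.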
EEG Acquisition
EEG data were obtained at a sampling rate of 512 Hz using a 64-channel active Brainvision system. The electrodes were mounted on an elastic cap using the standard 10–20 system of electrode placement, and electrical impedances of the scalp electrodes were kept below 10 kΩ. Only the scalp electrodes above the sensory and motor regions supporting the speech motor task were selected, and, therefore, electrodes over the occipital region were excluded. Electrodes over the extreme temporal and frontal region were also excluded to minimize movement artifacts, and the remaining 38 electrodes indicated by gray circles in Fig. 1D were analyzed. Participants were instructed to minimize their eye blinks and head movements during word production. The brief pauses between trials and blocks were inserted to avoid fatigue and muscle tension while minimizing head movements.
The real-time LabVIEW system for altering auditory feedback delivers a transistor-transistor logic pulse at the detection of voice onset (Fig. 1F) to provide a time stamp for the EEG signal. This allows extraction of event-related potentials (ERPs) to examine changes in neural oscillations following speech motor adaptation.
Analysis of Neural Oscillations
Filtering.
The EEG signals were extracted using the Matlab-based EEGLAB toolbox and band-pass filtered offline between 0.75 and 55 Hz using a second-order Butterworth filter. All trial ERP epochs were then time aligned to voice onset, extended from 900 ms before to 900 ms after voice onset, and were re-referenced to electrode AFz. The average of the prevoicing part of the signal was subtracted before conducting further analyses.
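The filtering and epoching steps can be sketched in Python as follows. This is a hedged illustration and not the EEGLAB pipeline used in the study: the function names, the zero-phase (forward-backward) filtering and the `ref_channel` index standing in for AFz are assumptions. Note also that `butter` with `order=2` in band-pass mode yields a fourth-order filter, so the exact filter design may differ from the original.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 512  # Hz, EEG sampling rate reported in the text


def bandpass(data, low_hz, high_hz, fs=FS, order=2):
    """Zero-phase Butterworth band-pass along the last axis (assumed design)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)


def epoch_around_onsets(eeg, onset_samples, ref_channel, fs=FS, half_width_s=0.9):
    """eeg: (n_channels, n_samples). Returns (n_trials, n_channels, n_times)."""
    half = int(round(half_width_s * fs))
    filtered = bandpass(eeg, 0.75, 55.0, fs)
    # Re-reference every channel to the chosen reference electrode.
    rereferenced = filtered - filtered[ref_channel]

    epochs = []
    for onset in onset_samples:
        if onset - half < 0 or onset + half > eeg.shape[1]:
            continue  # skip onsets too close to the recording edges
        segment = rereferenced[:, onset - half:onset + half]
        # Subtract the mean of the prevoicing part of each channel.
        baseline = segment[:, :half].mean(axis=1, keepdims=True)
        epochs.append(segment - baseline)
    return np.stack(epochs)
```

Here `onset_samples` would come from the voice-onset pulses recorded alongside the EEG. The function expects at least one onset far enough from the recording edges.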
Artifact rejection.
Stereotypical artifacts arising from eye movements, head movement and muscular activity were removed by implementing the following steps. Epochs in which the scalp voltage at any of the electrode locations exceeded 50 μV were excluded from further analysis. As a basis for further artifact rejection, aberrant temporal patterns and large negative kurtosis were detected, and muscle artifacts were eliminated by detecting spectral peaks that coincided with muscle activation and by techniques based on independent component analysis (Olbrich et al. 2011). The variance of the detrended signals before voice onset was also compared with that after voice onset, where the effect of muscle artifacts, if any, would be more pronounced, but no significant differences were seen.
Computation of phase coherence.
ERPs within the theta (3–8 Hz) and the gamma (30–50 Hz) bands were analyzed for the purpose of computing signal power and phase coherence. For each trial epoch, an overlapping 800-ms time-window that was slid by 10 ms in each time step was used, and, for the frequency bands, a 3-Hz frequency window with 1-Hz step was used. The EEG signal of the gamma band (upper frequency band) was first band-pass filtered before obtaining power time series by squaring the amplitude of its Hilbert transformation (Cohen 2008; Perfetti et al. 2011). The power time series of the upper frequency was then used to compute its instantaneous phase. The EEG signal was also band-pass filtered in the theta band (lower frequency band), and the instantaneous phase of the theta band signal was obtained from the angle of its Hilbert transformation. Phase coherence between the signals of the two frequency bands was then computed for a given time window by taking the difference between their respective phase time series. The average phase coherence can vary from 0 to 1, where 0 means no synchronization and 1 indicates complete synchronization.
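The computation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the code used in the study. The gamma power time series is band-pass filtered in the theta band before its phase is taken, and coherence is computed as the length of the mean resultant vector of the phase difference, which lies between 0 and 1. Both choices, the filter order and all function names are assumptions. The window length and step follow the 800-ms window and 10-ms step in the text, rounded to whole samples.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def bandpass(x, band_hz, fs, order=2):
    """Zero-phase Butterworth band-pass (filter design is an assumption)."""
    sos = butter(order, band_hz, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def theta_gamma_phase_difference(x, fs, theta=(3.0, 8.0), gamma=(30.0, 50.0)):
    """Phase of theta minus phase of the gamma power envelope, per sample."""
    theta_phase = np.angle(hilbert(bandpass(x, theta, fs)))
    # Gamma power: squared amplitude of the Hilbert transform of the gamma band.
    gamma_power = np.abs(hilbert(bandpass(x, gamma, fs))) ** 2
    # Assumption: the power series is theta-filtered before extracting its phase.
    power_phase = np.angle(hilbert(bandpass(gamma_power, theta, fs)))
    return theta_phase - power_phase


def sliding_coherence(x, fs, window_s=0.8, step_s=0.01):
    """Coherence in overlapping windows: |mean(exp(i * phase difference))|."""
    dphi = theta_gamma_phase_difference(x, fs)
    win = int(round(window_s * fs))
    step = max(1, int(round(step_s * fs)))
    starts = range(0, dphi.size - win + 1, step)
    return np.array([np.abs(np.mean(np.exp(1j * dphi[s:s + win]))) for s in starts])
```

On a synthetic signal whose 40-Hz amplitude is modulated by a 6-Hz rhythm, the returned values should stay close to 1. For short real epochs, filter edge effects near the epoch boundaries would need separate handling.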
Pairwise phase coherence.
For a given pair of electrodes, we also computed pairwise phase coherence by calculating phase coherence as described above, using the theta band signal of one electrode with the gamma band signal of the other. In this way each pair generated two measures of phase coherence. To obtain the strength of pairwise phase coherence, we averaged over the gamma band (30–50 Hz) and over a time window of 300 ms during speech planning (defined below); this strength was then used to generate a directed graph of the phase coherence network. By following the methods of statistical bootstrapping (below), we identified the links that showed significant coherence changes at early and late training relative to baseline.
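A directed coherence matrix of this kind can be sketched in Python as shown below, where entry (i, j) pairs the theta phase at electrode i with the phase of the gamma power at electrode j. This is an illustrative sketch under assumptions, not the original analysis: the function names, the filter design, the theta-filtering of the gamma power series and the resultant-vector definition of coherence are all assumed here, and the optional `window` slice merely stands in for the 300-ms planning window.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def _bandpass(x, band_hz, fs, order=2):
    """Zero-phase Butterworth band-pass along the time axis (assumed design)."""
    sos = butter(order, band_hz, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=-1)


def directed_coherence_matrix(epoch, fs, theta=(3.0, 8.0), gamma=(30.0, 50.0),
                              window=slice(None)):
    """epoch: (n_channels, n_times). Returns C with C[i, j] in [0, 1]:
    theta phase at channel i vs. gamma-power phase at channel j."""
    theta_phase = np.angle(hilbert(_bandpass(epoch, theta, fs), axis=-1))
    gamma_power = np.abs(hilbert(_bandpass(epoch, gamma, fs), axis=-1)) ** 2
    power_phase = np.angle(hilbert(_bandpass(gamma_power, theta, fs), axis=-1))

    # Unit phasors restricted to the analysis window.
    e_theta = np.exp(1j * theta_phase[:, window])
    e_power = np.exp(1j * power_phase[:, window])
    # Mean over time of exp(i * (theta_i - power_j)) for every ordered pair.
    return np.abs(e_theta @ e_power.conj().T) / e_theta.shape[1]
```

Because C[i, j] and C[j, i] use different bands at each electrode, the matrix is generally asymmetric, which is what yields two links per electrode pair.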
Region of interest selection.
For the purpose of conducting our coherence analyses, we focused on the electrodes that showed significant differences in theta and gamma power and amplitude based on the bootstrap procedures described below.
Statistical bootstrapping.
Coherence maps and scalp topographic maps were obtained using t-scores. Statistical significance was assessed using bootstrap sampling with replacement, which allows correction for family-wise error. For each electrode and at a given time-frequency window, we computed t-scores between the coherence estimates at two experimental phases, using all trials across all subjects. The trials for the two conditions were randomly shuffled to generate bootstrap samples from which a t-score between the coherence estimates was calculated for each time-frequency window. The maximum t-score over the entire time-frequency range was taken for each bootstrap sample, and the procedure was repeated 4,000 times to generate a distribution of maximum statistics. The 95th percentile of this distribution (corresponding to α = 0.05) was taken as the critical t-score, and time-frequency regions exceeding this critical value were considered to show a significant difference between the two phases.
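The maximum-statistic procedure can be illustrated with the following Python sketch. It is an approximation under stated assumptions rather than the original implementation: trials from both conditions are pooled and resampled with replacement (setting `replace=False` gives a plain label permutation instead), a Welch-type t-score is used, and the maximum is taken over signed t-scores. The function names and array layout are hypothetical.

```python
import numpy as np


def t_map(a, b):
    """Welch-type t-score per bin; a, b: (n_trials, n_freqs, n_times)."""
    se = np.sqrt(a.var(axis=0, ddof=1) / a.shape[0] + b.var(axis=0, ddof=1) / b.shape[0])
    return (a.mean(axis=0) - b.mean(axis=0)) / se


def max_stat_threshold(a, b, n_resamples=4000, alpha=0.05, replace=True, seed=0):
    """Critical t-score from the distribution of the maximum t over all bins."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([a, b], axis=0)
    n_a, n_total = a.shape[0], pooled.shape[0]

    max_t = np.empty(n_resamples)
    for k in range(n_resamples):
        # Shuffle condition membership by resampling trial indices.
        idx = rng.choice(n_total, size=n_total, replace=replace)
        max_t[k] = t_map(pooled[idx[:n_a]], pooled[idx[n_a:]]).max()

    critical = np.quantile(max_t, 1.0 - alpha)
    # Bins whose observed t-score exceeds the family-wise critical value.
    return critical, t_map(a, b) > critical
```

Taking the maximum over the whole time-frequency range in each resample is what controls the family-wise error: a bin is flagged only if its t-score exceeds what the largest chance value would reach in 95% of resamples.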
Scalp topographic map.
We computed instantaneous signal powers in the theta and gamma bands by squaring the amplitude of their Hilbert transformation. For each trial, the power time series was normalized by the total power. We then obtained t-scores between the instantaneous power estimates at two conditions at all electrode locations, and the resulting t-scores were plotted on the scalp. Similar to the procedure described above, the trials from two given conditions were randomly shuffled before generating bootstrap samples from which a t-score for each electrode was calculated. These samples provided the distribution from which the critical t-score at α = 0.01 was obtained. The critical t-score was then used to generate bootstrapped topographic maps.
Statistical Analyses
For each electrode and for each subject, we obtained coherence estimates over the entire time-frequency range using windows of 8 ms by 8 Hz. We then conducted repeated-measures ANOVA, followed by Tukey's honestly significant difference post hoc tests, to assess statistical significance.
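For reference, the F statistic of a one-way repeated-measures ANOVA across the three experimental phases can be computed as in the Python sketch below. This is a generic textbook computation offered as an illustration, not the statistical software used in the study. The function name is hypothetical, sphericity is not checked, and the Tukey post hoc tests are not included.

```python
import numpy as np
from scipy import stats


def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: (n_subjects, n_conditions), e.g. columns for baseline, early and
    late training. Returns (F, (df_condition, df_error), p).
    """
    data = np.asarray(data, dtype=float)
    n_subj, n_cond = data.shape
    grand = data.mean()

    # Partition the total sum of squares into condition, subject and error terms.
    ss_cond = n_subj * np.sum((data.mean(axis=0) - grand) ** 2)
    ss_subj = n_cond * np.sum((data.mean(axis=1) - grand) ** 2)
    ss_error = np.sum((data - grand) ** 2) - ss_cond - ss_subj

    df_cond = n_cond - 1
    df_error = (n_cond - 1) * (n_subj - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    p_value = stats.f.sf(f_value, df_cond, df_error)
    return f_value, (df_cond, df_error), p_value
```

For a small example with three subjects and three conditions, `rm_anova_oneway([[1, 2, 3], [2, 3, 5], [3, 4, 4]])` gives F(2,4) = 9.0.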
RESULTS
The experimental setup is shown in Fig. 1. Subjects repeated the word “HEAD” under continuous recording of EEG. Their auditory feedback was altered in real time by shifting the first formant frequency of the spoken utterances (red trace, Fig. 1A) downward (white trace, Fig. 1A). Figure 1B shows the normalized first formant frequency from a representative subject. The experiment began with a baseline phase under normal auditory feedback (red), followed by a training phase under altered auditory feedback (green). Subjects compensated for the auditory perturbation by progressively shifting the first formant frequency upwards (black). The experiment ended with an after-effect phase in which auditory feedback returned to normal (blue). We used the slope of the regression line fitted to the training data as a measure of adaptation, according to which subjects learned the task by variable amounts. The data averaged across the nine subjects who learned the task (Fig. 1C) show an upward shift (significant positive slope indicates adaptation) in the production of the first formant frequency under altered auditory feedback (red, baseline; black, training). To assess phase coherence, only three phases of the speech motor task (baseline, early training and late training) were analyzed (indicated by gray bar in Fig. 1C). The objective in this study was to observe the neural changes accompanying motor speech adaptation, which can only be assessed directly by comparing training trials with baseline trials. The after-effect, being mainly a washout effect, seems to persist much longer without reaching the baseline level and was excluded from further analyses. A repeated-measures ANOVA followed by post hoc tests (Tukey's honestly significant difference) revealed a significant training effect [F(2,24) = 15.93, P < 0.00004], with the produced formant frequency at late training being significantly higher than at both the baseline and early training phases.
Figure 1D shows the scalp electrodes from which EEG activity was recorded during the speech motor task that included electrodes over the auditory and motor areas of the brain that are believed to support learning (Guenther 2006; Hickok and Poeppel 2007). The ERPs extracted from the EEG signals were aligned at the voice onset, and representative ERPs at electrode location Cz are shown in Fig. 1E (red, baseline; black, early training; blue, late training). Our primary goal was to identify the neural processes that support motor adaptation, and hence the present analyses were aimed specifically at identifying the factors that facilitate it. The neural bases of adaptability and the factors that impede it remain to be investigated.
To compute phase coherence, we first looked at ERPs across subjects extending 300 ms before and after the onset of voice and identified regions of interest (ROIs) over the scalp showing significant changes in theta or gamma power (P < 0.01) at early and late training relative to baseline. For the purpose of our analyses, the 300-ms window before the voice onset was defined as the phase of speech planning, and the window after voice onset as the phase of speech production. We converted the power differences into t-scores (see methods) and, using statistical bootstrapping methods, obtained the scalp power topography showing electrode locations with significant power changes (Fig. 2A, top panel for gamma band; bottom panel for theta band). The increase in gamma power was mostly observed over a window spanning 150 ms before and after the voice onset and over a small region of the scalp that includes the fronto-temporal and centro-parietal areas of the brain. At early training, gamma power changes occurred mostly after the onset of voice (production), whereas at late training the changes were observed primarily before the voice onset (planning), suggesting the emergence of a feedforward mechanism over the course of motor learning. The theta power changes were confined largely to the period from 100 ms before to 250 ms after the voice onset. The electrode locations obtained from the scalp power topography provided us with the ROIs for further coherence analyses.
As power changes accompanied adaptation, we wanted to know whether adaptation also altered phase coherence patterns. To get a glimpse into global coherence changes as subjects learned, a map of scalp coherence topography (Fig. 2B) was obtained by computing the average difference in coherence at early and late training relative to baseline over the entire periods of speech planning and production. Coherence changes not only differed over the course of training (early vs. late) but also showed distinct patterns during planning and production. More specifically, at the late training phase there was an increase in coherence over the central region and a decrease over the bilateral fronto-temporal regions of the scalp during speech planning. Similarly, during production, a significant increase in the central region was accompanied by a decrease in left fronto-temporal regions at late training. Although scalp topography showed coherence changes during both speech planning and production, further analyses at the level of individual electrodes (described below) revealed significance only during speech planning.
To carry out a more detailed analysis of coherence changes, we computed the coherence spectrogram for each ROI electrode. Figure 3A shows the evolution of phase coherence at three representative electrode locations: Cz, Fc5 and Fc6. Notice the progressive buildup of coherence at Cz and a corresponding decline at Fc5 and Fc6 over the course of training during speech planning. A repeated-measures ANOVA (Fig. 3B) revealed a significant training effect only for a subset of ROI electrodes at specific times and frequency bands (black rectangles): Cz [F(2,24) = 6.85, P < 0.01], Fc3 [F(2,24) = 3.63, P < 0.05], Fc5 [F(2,24) = 9.5, P < 0.001] and Fc6 [F(2,24) = 3.5, P < 0.05]. Moreover, these changes were observed only during speech planning, suggesting the establishment of a feedforward map as subjects adapted to the task. Electrode Cz over the central region showed an increase in coherence while bilateral fronto-temporal electrodes Fc3, Fc5 and Fc6 showed a decline, and, interestingly, the increase at Cz preceded the decline by about 100 ms. In contrast, no significant changes in phase coherence were observed during early or late training in any electrode for the subjects who did not learn the task (Fig. 3C). These results imply that the redistribution of phase coherence over an interacting brain network not only accompanies motor speech adaptation but is a direct consequence of it (see also Fig. 5 below).
An additional bootstrap analysis conducted for all the ROI electrodes independently confirmed significant coherence changes relative to baseline for the same four electrodes and in the same time-frequency windows as identified above (black rectangles, Fig. 4A). The coherence changes were observed only during speech planning and at late training phase alone, lending further support to the idea that a feedforward map was progressively established with speech motor adaptation.
We then asked how the neural activity revealed by power and coherence changes was tied to behavior. A correlation analysis was performed between the amount of adaptation and phase coherence and power in theta and gamma bands, and we found that phase coherence, but not power, was a significant predictor of motor adaptation. Significant correlations were observed only at two electrode locations, Cz (r = 0.7, P < 0.03) and Fc5 (r = −0.69, P < 0.03) (Fig. 4B), between the degree of adaptation and the phase coherence at late training. Consistent with the results of an increased coherence at Cz, the coherence at this location was positively correlated with the amount of adaptation, while the decrease at Fc5 was negatively correlated. Furthermore, significant correlations were observed only during speech planning at the same time-frequency windows that were identified previously.
Our analysis of phase coherence revealed a network of brain regions that supports motor speech adaptation. To understand the interactions within the network, two types of analyses were performed. For the time-frequency windows identified above, we computed how coherence changes relative to baseline are correlated for each electrode pair. Figure 5A shows the network graphs of pairwise correlation at early and late training relative to baseline in which significant correlations are marked in color. At late training, the coherence change at Cz relative to the baseline correlated negatively with that at Fc5 (also present at early training), and the electrodes at Fc5 and Fc6 showed a positive correlation. Although activities at Fc3 and Fc6 were found to be correlated at early training, the correlation disappeared by the end of the training. To further reveal pairwise interactions within the network, we computed the theta-gamma phase coherence for each electrode pair (see methods) and analyzed coherence changes during speech planning at early and late training relative to baseline (significant changes are marked by color links in Fig. 5B). For each electrode pair, there are two links. For example the clockwise link between Fc6 and Cz denotes the coherence between the theta band at Fc6 and the gamma band at Cz, and vice versa. Note that, by late training, there are significant increases in coherence between Fc6 and Cz and between Fc6 and Fc3 and a decrease between Cz and Fc6. The pairwise coherence changes at early training are quite similar to patterns observed at late training. These results demonstrate that coherence changes within the speech motor network consist of interactions at two distinct regions, between the central and the fronto-temporal regions and between the bilateral fronto-temporal areas, which together accompany the process of adaptation. 
The findings, together with the pairwise correlation analyses, support the idea that motor adaptation drives a redistribution of phase coherence mediated by local coherence changes within the speech motor network.
DISCUSSION
We studied the relationship between speech motor adaptation and phase coherence between theta and gamma bands in EEG signals. We found motor adaptation to be accompanied by significant changes in both power and phase coherence across several scalp regions in the speech planning phase of the task. While theta and gamma power increased with motor adaptation, the phase coherence progressively declined over bilateral fronto-temporal areas at electrode locations Fc3, Fc5 and Fc6 and increased over the central region at Cz as subjects learned. The largest increase in phase coherence was seen at Cz and the largest decrease at Fc5, and the activity at these locations was also strongly correlated with the amount of adaptation. Moreover, following learning, strong correlations were observed between activities over Fc5 and Fc6, and between Cz and the left fronto-temporal electrode Fc5, suggesting that the concurrent changes in phase coherence in these regions form an interacting network that underlies speech motor adaptation.
Our findings revealed that an increase in coherence at the centroparietal region was concurrent with a decrease at the bilateral fronto-temporal regions of the scalp, which lie above parts of the speech motor network containing sensorimotor areas, the superior temporal gyrus and the inferior frontal gyrus (Guenther 2006; Tourville et al. 2008). The observed correlation between activities at Cz and Fc5 suggests that activity over the sensorimotor areas of the brain increases at the expense of the fronto-temporal areas, which manifests as a redistribution of coherence with speech motor adaptation. Under the assumption that coherence over a brain area measures the degree of engagement of local networks (Mazzoni et al. 2010), these results imply that with motor adaptation sensorimotor areas become more engaged, with an accompanying disengagement of the fronto-temporal areas during speech planning. This shift in dependence was possibly driven by feedback errors, causing the speech motor network to reorganize and establish a feedforward map as subjects learn. In further support of the idea that coherence changes reflect the establishment of a feedforward map over the course of learning, significant changes in phase coherence were observed only at late training and were seen only during speech planning before the onset of voice. This feedforward map in turn generates updated motor commands to match the auditory goals at the end of the training. Such a mechanism of redistribution of phase coherence could be a general feature of error-based motor learning and adaptation and capture the establishment of feedforward maps associated with the process of adaptation. It is, however, necessary to determine whether the same phenomena also operate at the source level before identifying the redistribution of phase coherence as a general mechanism of motor learning.
Do the phase coherence changes reflect a neurophysiological correlate of learning or of adaptation? In our study, subjects were given a clearly defined goal, to speak aloud the word “HEAD” that was presented on the computer display, and subjects met the target gradually over the course of several trials. Similar to altered feedback paradigms of motor adaptation involving human arm movement, there is evidence for generalization, retention and interference, phenomena that are considered to be hallmarks of learning (Brashers-Krug et al. 1996; Mattar and Ostry 2007). Evidence of after-effect and context specificity observed in the speech motor task could also imply learning as opposed to simply adaptation (Rochet-Capellan et al. 2012) in the context of the present experimental paradigm.
In addition to the single-electrode analyses, pairwise phase coherence was computed in an attempt to capture the information flow within the network at the scalp level. Strong correlations between activities at Cz and at Fc5 and between Fc5 and Fc6 suggested that motor adaptation is supported by two subnetworks reflecting distinct but related mechanisms: processing of feedback error and formation of a feedforward map. Moreover, the correlation between phase coherence and adaptation was found to be significant only at the two electrode locations that emerged at the end of the training: at Cz the correlation was positive, while at Fc5 it was negative. Interestingly, the changes in phase coherence with the amount of adaptation were also the greatest at these two locations, with the largest increase being at Cz and the largest decrease at Fc5. It is thus possible that activity at these two locations modulates the overall neuronal oscillation patterns that underlie speech motor adaptation. Furthermore, the coherence change at the central region Cz preceded those of the bilateral fronto-temporal electrodes by about 100 ms, implying that the feedforward map that emerges at the end of motor training is possibly driven by activity over the sensorimotor areas.
Our results suggest a direct relationship between speech motor adaptation and phase coherence. It could be argued that the changes in phase coherence were brought about as a result of mere repetition of the same target word during the speech motor task. Such repeated utterance could induce changes in phase coherence due to perceptual changes unrelated to motor learning (Cooper and Lauritsen 1974). However, this possibility is ruled out because the strong correlations observed between phase coherence and adaptation measure at electrode locations Cz and Fc5 suggest that the repetition of target word does not drive the observed changes in phase coherence. Moreover, as noted above, subjects who failed to adapt to the task did not show any significant changes in coherence, implying that changes in phase coherence are induced by adaptation alone. It could also be argued that the changes were driven by anticipation of movement rather than a reorganization of the motor map. The changes in the phase coherence were seen in two distinct scalp regions, and they were in opposite directions: increasing in the regions over the motor cortex and decreasing over the temporal regions. This suggests that the observed changes were not merely anticipatory, which were likely to be restricted to motor regions alone. Furthermore, the coherence changes manifested only at the end of training, whereas anticipatory adjustments would come into effect during early training trials as well.
The sources that drive activity in the phase coherence network underlying speech motor adaptation remain to be identified. Source-level coherence analyses that combine structural MRI with EEG will provide further insight into the mechanisms of speech motor adaptation and learning. Resolving the network at the source level will further elucidate the neural processes underlying adaptation and learning at high spatial and temporal resolution.
The manipulations of altered auditory feedback involved not only formant shifts, which varied across subjects, but also masking noise and loudness level, which were constant across all subjects. It is possible that the speech motor adaptation observed here arose from a complex interaction across all these levels of acoustical manipulation. However, since the only covariate during the training task was the formant shift, it seems to be the primary factor driving the observed learning differences.
It can also be argued that adaptation differences arose solely from individual differences in speech perception. It is generally recognized that there is a close link between speech perception and production, and over the last several years it has become increasingly clear that speech motor learning also alters auditory perception (Nasir and Ostry 2009; Shiller et al. 2009). Studies of human arm movement have shown that perceptual learning affects motor performance, which is also likely to be true in speech (Darainy et al. 2013; Ostry et al. 2010). It is therefore plausible that adaptation and learning differences arise from perceptual variance across individuals in addition to their motor performance.
The phase coherence analyses presented here look only at the theta and gamma bands but can be extended to other frequency bands, such as beta (14–20 Hz), mu (8–14 Hz), and high-gamma (>70 Hz), which have also been implicated in motor learning (Cheyne et al. 2008; Engel and Fries 2010). Such a detailed picture of phase coherence across different signal bands will not only add to the growing body of literature that attempts to unravel the neural bases of speech motor learning but also elucidate the role of neural oscillations and cross-frequency coupling in motor learning in general. Understanding the neural correlates of sensorimotor learning in speech at a timescale that bears directly on behavior is likely to be an important tool for the diagnosis and treatment of speech learning disorders. It remains to be seen what neural factors other than coherence may contribute to learning and adaptation differences and, in particular, why some subjects learn while others fail to do so.
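As a rough illustration of how a cross-frequency phase coherence analysis might be extended across band pairs, the sketch below uses one common formulation: the phase of the slow band is compared with the slow-band phase of the fast band's amplitude envelope. This is an assumption made for the example and may differ from the measure used in this study. The beta range follows the text above; the theta and gamma band edges, the sampling rate, the synthetic signals, and the names `BANDS`, `bandpass`, and `cross_frequency_coherence` are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Band edges in Hz. Beta follows the range quoted in the text; the theta
# and gamma edges are assumptions chosen for illustration.
BANDS = {"theta": (4.0, 8.0), "beta": (14.0, 20.0), "gamma": (30.0, 50.0)}


def bandpass(x, fs, lo, hi, order=4):
    """Zero-phase Butterworth band-pass filter."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def cross_frequency_coherence(x, fs, slow, fast):
    """Locking between the slow-band phase and the slow-band phase of the
    fast-band amplitude envelope (0 = none, 1 = perfect)."""
    slow_phase = np.angle(hilbert(bandpass(x, fs, *BANDS[slow])))
    envelope = np.abs(hilbert(bandpass(x, fs, *BANDS[fast])))
    env_phase = np.angle(hilbert(bandpass(envelope, fs, *BANDS[slow])))
    return float(np.abs(np.mean(np.exp(1j * (slow_phase - env_phase)))))


# Synthetic signals only; these are not experimental data.
rng = np.random.default_rng(1)
fs = 500.0                                  # assumed sampling rate (Hz)
t = np.arange(0, 8.0, 1 / fs)
theta = np.sin(2 * np.pi * 6 * t)

# 40 Hz activity whose amplitude follows the 6 Hz rhythm.
coupled = theta + 0.5 * (1 + theta) * np.sin(2 * np.pi * 40 * t)
coupled += 0.2 * rng.standard_normal(t.size)
# 40 Hz activity with constant amplitude (no coupling).
uncoupled = theta + 0.5 * np.sin(2 * np.pi * 40 * t)
uncoupled += 0.2 * rng.standard_normal(t.size)

# Evaluate several slow/fast band pairs on the coupled signal.
results = {
    (slow, fast): cross_frequency_coherence(coupled, fs, slow, fast)
    for slow, fast in [("theta", "gamma"), ("theta", "beta")]
}
baseline = cross_frequency_coherence(uncoupled, fs, "theta", "gamma")
```

Adding a band pair only requires a new entry in `BANDS`, provided the sampling rate is more than twice the upper band edge.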
In future work, we will seek to identify the neural bases of speech learnability and the behavioral factors that differentiate learners from nonlearners, and in particular how differences in baseline neural activity contribute to learning differences.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
AUTHOR CONTRIBUTIONS
Author contributions: R.S. and S.M.N. conception and design of research; R.S. performed experiments; R.S. analyzed data; R.S. and S.M.N. interpreted results of experiments; R.S. prepared figures; R.S. drafted manuscript; R.S. and S.M.N. edited and revised manuscript; R.S. and S.M.N. approved final version of manuscript.
REFERENCES
- Arnal LH, Wyart V, Giraud AL. Transitions in neural oscillations reflect prediction errors generated in audiovisual speech. Nat Neurosci 14: 797–801, 2011.
- Brashers-Krug T, Shadmehr R, Bizzi E. Consolidation in human motor memory. Nature 382: 252–255, 1996.
- Burke JF, Zaghloul KA, Jacobs J, Williams RB, Sperling MR, Sharan AD, Kahana MJ. Synchronous and asynchronous theta and gamma activity during episodic memory formation. J Neurosci 33: 292–304, 2013.
- Cheyne D, Bells S, Ferrari P, Gaetz W, Bostan AC. Self-paced movements induce high-frequency gamma oscillations in primary motor cortex. Neuroimage 42: 332–342, 2008.
- Cohen MX. Assessing transient cross-frequency coupling in EEG data. J Neurosci Methods 168: 494–499, 2008.
- Cooper WE, Lauritsen MR. Feature processing in the perception and production of speech. Nature 252: 121–123, 1974.
- Darainy M, Vahdat S, Ostry DJ. Perceptual learning in sensorimotor adaptation. J Neurophysiol 110: 2152–2162, 2013.
- Engel AK, Fries P. Beta-band oscillations–signalling the status quo? Curr Opin Neurobiol 20: 156–165, 2010.
- Fries P. A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci 9: 474–480, 2005.
- Giraud AL, Poeppel D. Cortical oscillations and speech processing: emerging computational principles and operations. Nat Neurosci 15: 511–517, 2012.
- Golfinopoulos E, Tourville JA, Guenther FH. The integration of large-scale neural network modeling and functional brain imaging in speech motor control. Neuroimage 52: 862–874, 2010.
- Gray CM, König P, Engel AK, Singer W. Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature 338: 334–337, 1989.
- Guenther FH. Cortical interactions underlying the production of speech sounds. J Commun Disord 39: 350–365, 2006.
- Herrmann CS, Fründ I, Lenz D. Human gamma-band activity: a review on cognitive and behavioral correlates and network models. Neurosci Biobehav Rev 34: 981–992, 2010.
- Hickok G, Houde J, Rong F. Sensorimotor integration in speech processing: computational basis and neural organization. Neuron 69: 407–422, 2011.
- Hickok G, Poeppel D. The cortical organization of speech processing. Nat Rev Neurosci 8: 393–402, 2007.
- Houde JF, Jordan MI. Sensorimotor adaptation in speech production. Science 279: 1213–1216, 1998.
- Jones JA, Munhall KG. Remapping auditory-motor representations in voice production. Curr Biol 15: 1768–1772, 2005.
- Lee H, Simpson GV, Logothetis NK, Rainer G. Phase locking of single neuron activity to theta oscillations during working memory in monkey extrastriate visual cortex. Neuron 45: 147–156, 2005.
- Mattar AA, Ostry DJ. Modifiability of generalization in dynamics learning. J Neurophysiol 98: 3321–3329, 2007.
- Mazzoni A, Whittingstall K, Brunel N, Logothetis NK, Panzeri S. Understanding the relationships between spike rate and delta/gamma frequency bands of LFPs and EEGs using a local cortical network model. Neuroimage 52: 956–972, 2010.
- Nasir SM, Ostry DJ. Auditory plasticity and speech motor learning. Proc Natl Acad Sci U S A 106: 20470–20475, 2009.
- Olbrich S, Jödicke J, Sander C, Himmerich H, Hegerl U. ICA-based muscle artefact correction of EEG data: what is muscle and what is brain? Neuroimage 54: 1–3, 2011.
- Ostry DJ, Darainy M, Mattar AA, Wong J, Gribble PL. Somatosensory plasticity and motor learning. J Neurosci 30: 5384–5393, 2010.
- Perfetti B, Moisello C, Landsness EC, Kvint S, Lanzafame S, Onofrj M, Di Rocco A, Tononi G, Ghilardi MF. Modulation of gamma and theta spectral amplitude and phase synchronization is associated with the development of visuo-motor learning. J Neurosci 31: 14810–14819, 2011.
- Perkell J. Movement goals and feedback and feedforward control mechanisms in speech production. J Neurolinguistics 25: 382–407, 2012.
- Purcell DW, Munhall KG. Adaptive control of vowel formant frequency: evidence from real-time formant manipulation. J Acoust Soc Am 120: 966–977, 2006a.
- Purcell DW, Munhall KG. Compensation following real-time manipulation of formants in isolated vowels. J Acoust Soc Am 119: 2288–2297, 2006b.
- Rochet-Capellan A, Richer L, Ostry DJ. Nonhomogeneous transfer reveals specificity in speech motor learning. J Neurophysiol 107: 1711–1717, 2012.
- Roelfsema PR, Engel AK, König P, Singer W. Visuomotor integration is associated with zero time-lag synchronization among cortical areas. Nature 385: 157–161, 1997.
- Schack B, Vath N, Petsche H, Geissler HG, Möller E. Phase-coupling of theta-gamma EEG rhythms during short-term memory processing. Int J Psychophysiol 44: 143–163, 2002.
- Schroeder CE, Lakatos P, Kajikawa Y, Partan S, Puce A. Neuronal oscillations and visual amplification of speech. Trends Cogn Sci 12: 106–113, 2008.
- Scott SK, Wise RJS. The functional neuroanatomy of prelexical processing in speech perception. Cognition 92: 13–45, 2004.
- Shiller DM, Sato M, Gracco VL, Baum SR. Perceptual recalibration of speech sounds following speech motor learning. J Acoust Soc Am 125: 1103–1113, 2009.
- Tort AB, Komorowski RW, Manns JR, Kopell NJ, Eichenbaum H. Theta-gamma coupling increases during the learning of item-context associations. Proc Natl Acad Sci U S A 106: 20942–20947, 2009.
- Tourville JA, Reilly KJ, Guenther FH. Neural mechanisms underlying auditory feedback control of speech. Neuroimage 39: 1429–1443, 2008.
- Varela F, Lachaux JP, Rodriguez E, Martinerie J. The brainweb: phase synchronization and large-scale integration. Nat Rev Neurosci 2: 229–239, 2011.
- Villacorta VM, Perkell JS, Guenther FH. Sensorimotor adaptation to feedback perturbations of vowel acoustics and its relation to perception. J Acoust Soc Am 122: 2306–2319, 2007.
- Whittingstall K, Logothetis NK. Frequency-band coupling in surface EEG reflects spiking activity in monkey visual cortex. Neuron 64: 281–289, 2009.