The Journal of Neuroscience. 2018 Jun 13;38(24):5620–5631. doi: 10.1523/JNEUROSCI.3480-17.2018

Subthalamic Nucleus Neurons Differentially Encode Early and Late Aspects of Speech Production

Witold J Lipski 1, Ahmad Alhourani 1, Tara Pirnia 1, Peter W Jones 1, Christina Dastolfo-Hromack 1, Leah B Helou 2, Donald J Crammond 1, Susan Shaiman 3, Michael W Dickey 3, Lori L Holt 6, Robert S Turner 2,5, Julie A Fiez 4,5, R Mark Richardson 1,2,5
PMCID: PMC6001034  PMID: 29789378

Abstract

Basal ganglia-thalamocortical loops mediate all motor behavior, yet little detail is known about the role of basal ganglia nuclei in speech production. Using intracranial recording during deep brain stimulation surgery in humans with Parkinson's disease, we tested the hypothesis that the firing rate of subthalamic nucleus neurons is modulated in sync with motor execution aspects of speech. Nearly half of 79 unit recordings exhibited firing-rate modulation during a syllable reading task across 12 subjects (male and female). Trial-to-trial timing of changes in subthalamic neuronal activity, relative to cue onset versus production onset, revealed that locking to cue presentation was associated more with units that decreased firing rate, whereas locking to speech onset was associated more with units that increased firing rate. These unique data indicate that subthalamic activity is dynamic during the production of speech, reflecting temporally-dependent inhibition and excitation of separate populations of subthalamic neurons.

SIGNIFICANCE STATEMENT The basal ganglia are widely assumed to participate in speech production, yet no prior studies have reported detailed examination of speech-related activity in basal ganglia nuclei. Using microelectrode recordings from the subthalamic nucleus during a single-syllable reading task, in awake humans undergoing deep brain stimulation implantation surgery, we show that the firing rate of subthalamic nucleus neurons is modulated in response to motor execution aspects of speech. These results are the first to establish a role for subthalamic nucleus neurons in encoding of aspects of speech production, and they lay the groundwork for launching a modern subfield to explore basal ganglia function in human speech.

Keywords: deep-brain stimulation, microelectrode recording, single neuron, speech, subthalamic nucleus

Introduction

Producing speech is the most complex of human motor behaviors, requiring dynamic interaction between multiple brain regions. The segregated loop organization of basal ganglia-thalamocortical circuits suggests that the basal ganglia, including the subthalamic nucleus (STN), play a critical role in speech production. This concept is supported additionally by observations that impairments in speech production are common features of basal ganglia-associated degenerative disorders including Parkinson's disease, and that other disorders in speech production (e.g., stuttering) are associated with abnormalities in basal ganglia activity (Alm, 2004; Giraud et al., 2008; Toyomura et al., 2015). Additionally, an extensive body of work in songbirds implicates avian homologs of the basal ganglia (Doupe and Kuhl, 1999), including a homolog of the STN, in the learning and production of vocalizations (Jiao et al., 2000). Many prominent models of speech production nonetheless virtually ignore the basal ganglia (Hickok, 2012), as few studies have examined speech-related neural activity in these subcortical nuclei directly (Ziegler and Ackermann, 2017).

Electrophysiological recordings obtained during the implantation of leads for deep-brain stimulation (DBS) represent the only clinically-indicated opportunity to measure neural activity directly from the basal ganglia in awake, behaving human subjects. Previous reports of STN unit activity, however, have been limited to only a single preliminary, qualitative analysis of speech production-related changes in STN firing rates (Watson and Montgomery, 2006). Thus, recording from STN neurons during speech production is a unique opportunity to test hypotheses about the role of this region in the control of complex motor function, where the basal ganglia have alternately been hypothesized to participate in action selection, movement gain and motor learning (Desmurget and Turner, 2010).

To begin to define the role of the STN in speech production more clearly, we established an intraoperative protocol for microelectrode recording during a task that required subjects to read aloud single syllables displayed on a computer screen. We then examined trial-to-trial timing of changes in STN unit activity relative to either the visual presentation of single syllables or the onset of speech production. Such time-locking is considered evidence for an underlying functional linkage between the behavioral event and the linked neural discharge (Seal and Commenges, 1985; Anderson and Turner, 1991; DiCarlo and Maunsell, 2005). Our results suggest that aspects of speech production are encoded in the STN through the inhibition and excitation of functionally segregated neurons.

Materials and Methods

Subjects.

Subjects were 12 movement disorders patients (10 male) undergoing awake DBS surgery for Parkinson's disease. Unified Parkinson's disease rating scale (UPDRS) testing was administered by a neurologist within 4 months before DBS surgery. Ten of 12 subjects underwent bilateral DBS implantation (left lead inserted first), whereas two underwent unilateral implantation (one left). All subjects underwent overnight withdrawal from their dopaminergic medication before surgery. All participants provided written, informed consent in accordance with a protocol approved by the Institutional Review Board of the University of Pittsburgh (IRB Protocol #PRO13110420). In our practice, lead implantation is undertaken using a Leksell frame, with the patient in a semi-sitting position, and occurs first on the left side (for bilateral cases). To minimize strain on patients, these subjects were not offered research participation on the second (right brain) side. One subject underwent unilateral right-sided implantation.

Electrophysiological recordings.

Unit recordings were performed using the Neuro-Omega recording system and Parylene-insulated, microphonics-free tungsten microelectrodes (Alpha Omega). Microelectrode impedances ranged from 200 to 600 kΩ. Targeting of the dorsolateral STN and microelectrode recording (MER) were performed using a standard combination of indirect (starting AC-PC coordinates of x = ±12, y = −3, z = −4) and direct (visualization of the STN in the z = −4 plane of a T2-weighted scan obtained on a 3-tesla MRI scanner) targeting (Starr et al., 2003). For each subject, two to three simultaneous microelectrode recording passes were made, starting at 15 mm above the surgical target with manual advance of the microdrive in 0.1 mm steps, using a center, and posterior and/or medial trajectories, with center-to-center spacing of 2 mm in a standard cross-shaped Ben-Gun array. Microelectrode signals were bandpass filtered at 0.075 Hz to 10 kHz and digitized at 44 kHz (Neuro Omega, Alpha Omega).

Speech task.

The speech task was performed during pauses in the microelectrode recording portion of DBS lead implantation in which stable units were detected. Visual stimuli were created using MATLAB software (MathWorks) and Psychophysics Toolbox extensions. Subjects were asked to read, in a normal manner, a consonant-vowel-consonant (CVC) syllable presented in white text on an otherwise dark computer screen. Each trial was initiated manually by the experimenter, beginning with presentation of a green fixation cross at the center of the screen (0–250 ms), followed by a variable time delay (500–1000 ms) during which the screen remained dark. At the end of the delay, text denoting a unique CVC syllable appeared on the screen and remained visible until the subject completed their naming response. A white fixation cross was displayed on the screen during the intertrial interval (ITI; Fig. 1A). Subjects were instructed to respond as soon as the word cue appeared. The CVC stimuli were drawn from prior behavioral work (Moore et al., 2017), and were matched along a number of dimensions, including phoneme recurrence, number of letters, phonological neighborhood density, orthographic neighborhood, and mean bigram frequency. Stimulus lists contained an equal portion of CVC words and non-words, and were composed of consonants drawn from a set of seven early-developing or seven late-developing consonant phonemes.

Figure 1.

Speech task and representative spoken and neural responses. A, Intraoperative syllable speech task. Subjects were asked to read aloud words presented on a computer screen. Each trial consisted of a sequence beginning with the fixation cross turning green for 250 ms, followed by a variable delay black screen (500–1000 ms), and followed by a unique CVC syllable cue appearing on the screen until the response was recorded. A white fixation cross appeared during the intertrial interval. B, An example audio spectrogram time-aligned to the onset of a subject's utterance of the syllable “loath”. The time (in seconds) of cue presentation is indicated by the solid vertical line, and the response onset and offsets are indicated by dotted lines. C, A single unit recording from the subject's STN, showing an increase in firing during speech. Red hash marks indicate timing of detected spike waveforms from the background activity. D, Overlay of 50 spike waveforms from the single unit shown in C. Scatterplots of the first two principal components (E; principal component1 and principal component2), as well as the first principal component and spike timestamp (F), showing clear separation of single-unit spike waveforms (red) corresponding to the example shown in C from background (blue).

Audio recordings.

Speech output was recorded using an omnidirectional microphone (8 subjects: Audio-Technica, model ATR3350iS, frequency response 50–18,000 Hz; 4 subjects: PreSonus, model PRM1 Precision Flat Frequency Mic, frequency response 20–20,000 Hz) oriented at an angle of ∼45° and a distance of ∼8 cm to the subject's mouth. In the four cases where the PreSonus PRM1 microphone was used, a Zoom H6 digital recorder was used to digitize the audio at 96 kHz. In all cases, the audio signal was split out to a Grapevine Neural Interface Processor, where it was digitized at 30 kHz. The audio signal was synchronized with the neural recordings and with visual cue events using digital pulses delivered via a USB data acquisition unit (Measurement Computing, model USB-1208FS).

Task performance.

The audio signal was segmented into trials and responses were coded by a speech-language pathologist using a custom-designed graphical user interface implemented in MATLAB. The response epoch for each trial was defined to start at cue presentation and end at the start of the ITI. The audio signal within each response epoch was coded as follows: (1) production onset was identified, (2) production offset was identified, and (3) the phonetic content was identified. Only trials that met the following criteria were included for further analyses: (1) the subject's entire response could clearly be identified within the response epoch, (2) the time from cue presentation to production onset (production latency) was less than the mean production latency (1.2 s) plus 3 SD (0.93 s) for all subjects (threshold = 4.0 s), (3) the duration of the response was less than the mean response duration (0.60 s) plus 3 SD (0.20 s) for all subjects (threshold = 1.19 s), and (4) the subject's response was a CVC or CV syllable and was composed of phonemes within the target set or the included mismatch set. Of 2200 total trials, 150 (6.8%) were rejected from further analysis on the basis of these response criteria. In 11 of the rejected trials, no response was recorded. In 608 trials (139 of which were rejected), the response did not match the target.
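The two timing criteria above are simple mean-plus-3-SD cutoffs. The following is a minimal sketch of that arithmetic, using the rounded values quoted in the text; the function names are hypothetical and the original coding was done in a custom MATLAB interface. With the rounded inputs the duration cutoff evaluates to 1.20 s, whereas the text reports 1.19 s, presumably from unrounded values.

```python
def upper_cutoff(mean_s, sd_s, n_sd=3.0):
    """Exclusion threshold: mean plus n_sd standard deviations (seconds)."""
    return mean_s + n_sd * sd_s

def keep_trial(latency_s, duration_s, latency_cutoff_s, duration_cutoff_s):
    """True when both production latency and response duration are below cutoff."""
    return latency_s < latency_cutoff_s and duration_s < duration_cutoff_s

# Rounded values quoted in the text (assumed inputs, not the raw data).
latency_cutoff = upper_cutoff(1.2, 0.93)    # about 4.0 s
duration_cutoff = upper_cutoff(0.60, 0.20)  # about 1.2 s

print(keep_trial(1.10, 0.60, latency_cutoff, duration_cutoff))  # typical trial
print(keep_trial(4.50, 0.60, latency_cutoff, duration_cutoff))  # latency too long
```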

Spike sorting.

Microelectrode recording data were imported into off-line spike-sorting software (Plexon). A 4-pole Butterworth high-pass filter with a cutoff frequency of 200 Hz was applied to the microelectrode recording signal and waveforms were detected by setting a negative threshold at an amplitude equal to ∼3 times the SD of the voltage signal; single- and multi-unit action potentials were then discriminated using principal components analysis. The results were graded according to the quality and stability of the spike sorting over the duration of the recording. An assignment of “A-sort” was given only to spike clusters that could be discriminated from background activity throughout the duration of a recording, and whose spikes were not strongly modulated by cardiac rate (Fig. 1C–F). A-sorts were further subdivided into single- and multi-unit subcategories. A cluster qualified as a single unit (SU) if it: (1) was clearly separated in principal component space from other clusters associated with background activity and other units, (2) contained spike waveforms with a unimodal distribution in principal component space, and (3) displayed a refractory period of at least 3 ms in its interspike interval distribution (Starr et al., 2003; Schrock et al., 2009). For some SU recordings, the location of the principal component cluster drifted gradually during the period of the recording, likely due to a shift of the brain relative to the electrode. Other A-sorts were classified as multi-unit (MU) recordings because the principal component cluster appeared to include waveforms from multiple units, forming multimodal principal component distributions that could not be clearly separated on short time scales, or that failed to obey the 3 ms refractory period in their interspike interval distribution. An assignment of “B-sort” was given to spike clusters that violated the above criteria due to the presence of a non-uniform or rapid (5 s time scale) shift of the waveform cluster in principal component space, or due to incomplete separation of the spike cluster from the cluster associated with background noise.

STN unit baseline activity.

Baseline spike rates were estimated by averaging across trials the spike rates during the baseline epoch, defined as the 1 s portion of the ITI preceding cue presentation. Because the firing rate of MU recordings depends on the number of neurons contributing to the spike population and thus is difficult to interpret, we calculated baseline firing rates only for SU recordings.

STN unit activity during speech.

We used two different estimates of unit activity to test for task-related changes in neuronal spike rate. We tested for task-related increases using a spike density function (SDF), which is a direct representation of a unit's mean instantaneous firing rate. We tested for task-related decreases using a function that reflects a unit's mean interspike interval (ISI), which scales with the reciprocal of instantaneous spike rate. This approach was chosen to avoid potential under-sensitivity for the detection of decreases in firing in SDFs due to floor effects (Alexander and Crutcher, 1990a). To construct an SDF, spike time stamps were rounded to 1 ms. The resulting time series was then convolved with a Gaussian kernel (σ = 25 ms). The ISI time series was computed from the 1 kHz binned time stamp time series by taking the value of the current ISI at each millisecond time point:

$$\mathrm{ISI}(t) = ts_{i+1} - ts_i \tag{1}$$

for $t$ between $ts_i$ and $ts_{i+1}$, where $ts$ is the set of consecutive time stamps for that spike population.
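The two activity estimates described above can be sketched as follows. This is an illustrative pure-Python version, not the authors' MATLAB code: the function names, the truncation of the Gaussian kernel at ±4σ, and the half-open interval convention for assigning an ISI to each millisecond are assumptions made for the example.

```python
import math

def gaussian_kernel(sigma_ms=25, half_width_sd=4):
    """Unit-area Gaussian sampled at 1 ms, truncated at +/- 4 sigma (assumed)."""
    half = int(half_width_sd * sigma_ms)
    weights = [math.exp(-0.5 * (k / sigma_ms) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def spike_density(spike_times_s, duration_s, sigma_ms=25):
    """Spike density function in spikes/s at 1 ms resolution."""
    n = int(round(duration_s * 1000))
    kernel = gaussian_kernel(sigma_ms)
    half = len(kernel) // 2
    sdf = [0.0] * n
    for t in spike_times_s:
        center = int(round(t * 1000))  # round the time stamp to 1 ms
        for k, w in enumerate(kernel):
            j = center + k - half
            if 0 <= j < n:
                sdf[j] += w * 1000.0  # per-ms density converted to spikes/s
    return sdf

def isi_series(spike_times_s, duration_s):
    """Value of the current ISI (s) at each ms; NaN outside the spike train."""
    n = int(round(duration_s * 1000))
    isi = [float("nan")] * n
    ts = sorted(spike_times_s)
    for a, b in zip(ts, ts[1:]):
        start, stop = int(round(a * 1000)), int(round(b * 1000))
        for j in range(max(start, 0), min(stop, n)):
            isi[j] = b - a  # constant between consecutive time stamps
    return isi

# Synthetic 20 spikes/s regular train (illustrative data only).
spikes = [0.05 * k for k in range(1, 40)]
sdf = spike_density(spikes, 2.0)
isi = isi_series(spikes, 2.0)
```

A decrease in firing lengthens the ISI trace without a lower bound, which is why the ISI series is less prone than the SDF to floor effects when baseline rates are low.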

Across-trial means of the SDF and ISI functions were constructed aligned on two epochs-of-interest: (1) from cue presentation to 0.5 s after the mean production onset for that session (aligned on cue presentation, termed the cue epoch), and (2) from the mean time of cue presentation to 0.5 s after production onset (aligned on production onset, termed the production epoch). A baseline period for each trial was defined as the 1 s portion of the ITI preceding cue presentation, and the trial-wise mean SDF and ISI functions during this epoch served as baselines against which the test epochs were compared. Baseline firing rates for each unit were defined as the mean of discharge rate during the baseline period across trials.

A unit was considered to have significantly elevated firing during a given epoch if the mean spike density within that test epoch exceeded a threshold level for at least 100 ms. The threshold was defined as the upper 5% of a normal distribution with a mean and σ of the baseline mean SDF, Bonferroni corrected for multiple comparisons (where the number of independent observations was considered to be the duration of the epoch of interest divided by the width of the Gaussian kernel, 50 ms). Similarly, a unit was considered to have significantly reduced firing within a given epoch if the mean ISI time series exceeded a threshold ISI value for at least 100 ms. The threshold ISI value was defined as the upper 5% tail of a normal distribution with a mean and σ of the baseline mean ISI time series, Bonferroni corrected for multiple comparisons (where the number of independent observations was the mean number of ISIs within the epoch of interest).
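The threshold test described above can be sketched in a few lines. This is a hedged illustration under stated assumptions: the series are sampled at 1 ms, the baseline σ is taken as the SD across time of the baseline mean trace, and the function names are hypothetical. The same run-length check would apply to the ISI series, with the number of comparisons set to the mean number of ISIs in the epoch.

```python
from statistics import NormalDist, mean, stdev

def bonferroni_threshold(baseline, n_comparisons, alpha=0.05):
    """Upper-tail cutoff of a normal fit to baseline, Bonferroni corrected."""
    dist = NormalDist(mean(baseline), stdev(baseline))
    return dist.inv_cdf(1.0 - alpha / max(1, n_comparisons))

def exceeds_for(series, threshold, min_run_ms=100):
    """True if the series stays above threshold for min_run_ms consecutive samples."""
    run = 0
    for value in series:
        run = run + 1 if value > threshold else 0
        if run >= min_run_ms:
            return True
    return False

# Illustrative baseline mean SDF (spikes/s) with small fluctuations around 20.
baseline = [20 + (i % 5 - 2) for i in range(1000)]
epoch_ms = 1500
# Independent observations for the SDF: epoch duration / 50 ms kernel width.
threshold = bonferroni_threshold(baseline, n_comparisons=round(epoch_ms / 50))

long_rise = [20.0] * 500 + [30.0] * 120 + [20.0] * 880
short_rise = [20.0] * 500 + [30.0] * 50 + [20.0] * 950
print(exceeds_for(long_rise, threshold), exceeds_for(short_rise, threshold))
```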

Speech onset- and cue-locking.

For all units with significant changes in mean firing, we sought to determine whether the timing of these responses was more closely locked to the presentation of the cue or to the onset of the production, by examining the trial-to-trial relationship between production latencies and neuronal response onsets. First, response onsets were estimated for individual trials. The trial-to-trial timing of an increase in firing was estimated by searching for bursts. The spiking pattern within each trial (after cue presentation) was examined to find a sequence of at least 3 spikes with the highest Poisson surprise (PS) burst index. For a given sequence of n spikes within time interval T, the PS burst index was based on the probability of encountering n or more spikes within time interval T, given a Poisson-distributed spike train with a discharge rate r:

$$\mathrm{PS}_{\mathrm{burst}} = -\log\left(e^{-rT}\sum_{i=n}^{\infty}\frac{(rT)^{i}}{i!}\right) \tag{2}$$

Similarly, the trial-to-trial timing of a decrease in firing was estimated by searching for pauses. The PS Pause index was based on the probability of encountering n or fewer spikes within time interval T, in a Poisson-distributed spike train:

$$\mathrm{PS}_{\mathrm{pause}} = -\log\left(e^{-rT}\sum_{i=0}^{n}\frac{(rT)^{i}}{i!}\right) \tag{3}$$

For both increase and decrease indices, r was estimated separately for each trial as the discharge rate across the entire trial, and T was the duration of the trial. Only trials with burst or pause sequences whose PS indices exceeded those found in that trial's baseline epoch were considered for further analysis. For each trial, the onset time of the PS Burst (for units with significant excitatory responses) or PS Pause (for units with significant inhibitory responses) spike sequences was defined as the neuronal response increase or decrease onset, respectively.
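The burst and pause indices can be sketched as the negative log of a Poisson tail probability. The version below is illustrative only: it assumes the natural logarithm, counts n as the number of spikes in the candidate sequence with T as the time from its first to its last spike, and uses an exhaustive search over sequences; the function names are hypothetical and the comparison against the baseline epoch is omitted.

```python
import math

def poisson_lower_tail(n, mu):
    """P(N <= n) for N ~ Poisson(mu)."""
    if n < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def poisson_upper_tail(n, mu):
    """P(N >= n), summed directly so that small tails keep their precision."""
    if n <= 0:
        return 1.0
    if mu <= 0:
        return 0.0
    term = math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))
    total, k = 0.0, n
    while term > total * 1e-17:
        total += term
        k += 1
        term *= mu / k
    return min(total, 1.0)

def ps_burst(n, T, r):
    """Surprise of seeing n or more spikes in T at rate r."""
    p = poisson_upper_tail(n, r * T)
    return math.inf if p == 0.0 else -math.log(p)

def ps_pause(n, T, r):
    """Surprise of seeing n or fewer spikes in T at rate r."""
    p = poisson_lower_tail(n, r * T)
    return math.inf if p == 0.0 else -math.log(p)

def best_burst(spike_times_s, trial_duration_s, min_spikes=3):
    """Return (PS, onset) of the most surprising run of >= min_spikes spikes."""
    ts = sorted(spike_times_s)
    r = len(ts) / trial_duration_s  # rate estimated over the whole trial
    best_ps, best_onset = -math.inf, None
    for i in range(len(ts)):
        for j in range(i + min_spikes - 1, len(ts)):
            T = ts[j] - ts[i]
            if T <= 0:
                continue
            ps = ps_burst(j - i + 1, T, r)
            if ps > best_ps:
                best_ps, best_onset = ps, ts[i]
    return best_ps, best_onset

# Illustrative trial: sparse firing with a 5-spike burst starting at 1.00 s.
trial = [0.1, 0.5, 0.9, 1.00, 1.01, 1.02, 1.03, 1.04, 1.5, 1.9]
ps, onset = best_burst(trial, 2.0)
```

The onset of the winning sequence is what the text calls the neuronal response onset for that trial.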

Next, two intervals were correlated (Spearman rank correlation, MATLAB function corr) with the production latency across trials for each unit: (1) the interval between cue presentation and the neuronal response onset (neuronal response latency), and (2) the interval between the neuronal response onset and production onset (neuronal response to production interval). A unit's response was considered to be temporally-locked to: (1) cue onset, if a significant change in activity during the cue epoch was observed, and the corresponding neuronal response to production interval was correlated (p < 0.05) with production latency (Fig. 2A,B); or (2) the onset of speech, if significant change in activity in the production epoch was observed, and the corresponding neuronal response latency was correlated (p < 0.05) with production latency (Fig. 2C,D). If both correlations were significant, then the unit's response was considered to be both cue- and production-locked, i.e., its activity was temporally associated with both events.
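The locking rule above reduces to two rank correlations per unit. The sketch below is an assumption-laden illustration: Spearman's ρ is computed from average ranks, but the p value comes from a seeded permutation test rather than from MATLAB's corr; the interval is signed as response onset minus production onset (negative when the response precedes speech); and the additional requirement of a significant change in the matching epoch is left out. All names are hypothetical.

```python
import random

def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def spearman(x, y, n_perm=2000, seed=0):
    """Spearman rho with a two-sided permutation p value (assumed method)."""
    rx, ry = ranks(x), ranks(y)
    rho = pearson(rx, ry)
    rng = random.Random(seed)
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(rho) - 1e-12:
            hits += 1
    return rho, (hits + 1) / (n_perm + 1)

def classify_locking(production_latency, response_onset, alpha=0.05):
    """Both inputs are per-trial times in seconds relative to cue presentation."""
    response_to_production = [o - p for o, p in zip(response_onset, production_latency)]
    _, p_cue = spearman(response_to_production, production_latency)
    _, p_prod = spearman(response_onset, production_latency)
    cue, prod = p_cue < alpha, p_prod < alpha
    if cue and prod:
        return "both"
    return "cue" if cue else "production" if prod else "neither"

# Synthetic trials (illustrative): latencies spread from 0.8 to about 1.5 s.
latency = [0.8 + 0.035 * i for i in range(20)]
jitter = [0.01 * ((7 * i) % 5) for i in range(20)]
cue_locked_onsets = [0.4 + j for j in jitter]                     # fixed after cue
speech_locked_onsets = [p - 0.2 + j for p, j in zip(latency, jitter)]  # fixed before speech
```

A constant response latency leaves the response-to-production interval to absorb all of the latency variation, which is why that correlation marks cue-locking rather than speech-locking.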

Figure 2.

Schematic illustrating cue- and speech production-locking neuronal response types. A, Hypothetical example of cue-aligned trials, illustrating a constant neuronal response latency with varying speech production latencies. B, Corresponding correlation schematic showing that a significant correlation between neuronal response to production onset interval and speech production latency indicates cue-locking. C, Hypothetical example of cue-aligned trials, illustrating a constant neuronal response to production onset interval with varying speech production latencies. D, Corresponding correlation schematic showing that a significant correlation between the neuronal response latency and speech production latency indicates speech-locking.

Analysis of speech volume.

Relative speech volume was computed based on the audio recording corresponding to the subject's response (speech) and the audio corresponding to the baseline epoch (baseline). The ratio of the speech to baseline root-mean-square (RMS) amplitudes was represented as a decibel statistic for each trial:

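$$\text{Relative speech volume (dB)} = 20\log_{10}\left(\frac{\mathrm{RMS}_{\mathrm{speech}}}{\mathrm{RMS}_{\mathrm{baseline}}}\right) \tag{4}$$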

For all units with a significant speech-related modulation of firing, the relative speech volume was then correlated (Spearman rank correlation, MATLAB function corr) across trials with the mean firing rate during speech, i.e., between speech onset and speech offset for each trial. Because the timing of the firing rate modulation varied between units and between trials, an additional analysis was performed to examine the correlation between relative speech volume and the mean burst firing rate (for increase-type responses) or mean pause firing rate (for decrease-type responses; see Speech onset- and cue-locking). For each type of firing rate measure, the firing rate was z-scored against the baseline firing rate (within each trial) before computing the correlation.
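As a small worked example of the volume measure, the sketch below converts an RMS amplitude ratio to decibels. The factor of 20 is the standard convention for amplitude (rather than power) ratios and is assumed here; the function names and sample values are illustrative.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of an audio segment."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def relative_volume_db(speech, baseline):
    """Speech level relative to the baseline epoch, in dB (amplitude ratio)."""
    return 20.0 * math.log10(rms(speech) / rms(baseline))

# Illustrative segments: speech amplitude ten times the baseline amplitude.
baseline_audio = [0.01, -0.01] * 100
speech_audio = [0.1, -0.1] * 100
volume_db = relative_volume_db(speech_audio, baseline_audio)  # close to 20 dB
```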

Anatomical localization of recordings.

Anatomical locations of microelectrode recordings were expressed in terms of the microelectrode recording-defined STN boundaries along each electrode trajectory. Thus, each microelectrode recording location was identified by its relative position within the Ben-Gun orientation (central, posterior, or medial) and the percentage depth through the STN within that trajectory (with 0% representing the ventral STN boundary and 100% representing the dorsal STN boundary). In addition, electrode localization was carried out using the Lead-DBS toolbox (Horn and Kühn, 2015). Preoperative and postoperative magnetic resonance images were coregistered and normalized to Montreal Neurological Institute (MNI) space. MNI locations of DBS lead placements were determined from postoperative images, and intraoperative microelectrode locations were calculated based on their position relative to final lead placement. To test whether unit responses recorded within the STN were anatomically segregated according to their speech-related response types and locking types, linear discriminant analysis was used to classify units based on MNI coordinates (MATLAB function fitcdiscr). Tenfold cross-validation was used to estimate classification accuracy.

Analysis of spike isolation and stability.

To quantify the sort quality of STN units, two different measures were adapted from a method by Joshua et al. (2007): signal-to-noise ratio (SNR) and isolation score (IS). SNR was defined as follows:

$$\mathrm{SNR} = \frac{\text{peak-to-peak}}{\mathrm{Noise}} \tag{5}$$

where peak-to-peak indicates the signal amplitude, or difference between the minimum and maximum of the average spike waveform, and Noise is the SD of the concatenated residuals (spike waveforms minus average spike waveform; Joshua et al., 2007). Isolation score is an estimate of the probability that a given individual spike waveform (typically 66 samples, i.e., 1.5 ms long) belongs to the assigned spike cluster rather than the noise cluster (Joshua et al., 2007). Clusters for each candidate single unit and for noise (all other waveforms from the same recording) were defined in the first two dimensions of a principal components analysis (Plexon Offline Sorter). Our measure of the similarity of waveforms within a cluster was based on the Euclidean distance d(X,Y) between raw waveforms X and Y, both from the same cluster:

$$\mathrm{Similarity}(X,Y) = \exp\left(-\frac{d(X,Y)\,\lambda}{d_0}\right) \tag{6}$$

normalized according to the average Euclidean distance between spikes in the spike cluster, d0, and a gain constant, λ (equal to 10; Joshua et al., 2007). That similarity index was then normalized according to the mean similarity between within-cluster waveforms X and all other waveforms Z (e.g., waveforms from other spikes and noise):

$$P_X(Y) = \frac{\mathrm{Similarity}(X,Y)}{\sum_{Z \neq X}\mathrm{Similarity}(X,Z)} \tag{7}$$

Importantly, to consistently characterize this quantity across units, we chose to modify the method by Joshua et al. (2007) by selecting an equal number of waveforms in the spike and noise clusters for each unit whenever possible. Thus, if a sort resulted in a greater number of noise waveforms than spike waveforms, the noise cluster was estimated by randomly subsampling noise waveforms to match the number of spike waveforms (random subsampling was performed using MATLAB function randperm, using a uniform distribution). If, on the other hand, the number of spike waveforms (Nspike) was greater than the number of noise waveforms (Nnoise), the normalization term in the similarity index was adjusted to weight spike and noise waveforms equally:

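$$P_X(Y) = \frac{\mathrm{Similarity}(X,Y)}{\sum_{Z \in \text{spike},\, Z \neq X}\mathrm{Similarity}(X,Z) + \frac{N_{\mathrm{spike}}}{N_{\mathrm{noise}}}\sum_{Z \in \text{noise}}\mathrm{Similarity}(X,Z)} \tag{8}$$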

Summing the similarity index over all waveforms in the spike cluster results in a measure of how close waveform X is to the spike cluster compared with the noise cluster:

$$P(X) = \sum_{Y \in \text{spike cluster},\, Y \neq X} P_X(Y) \tag{9}$$

Equal weighting of the normalization term in Equations 7 and 8 thus ensures that a P(X) value of 0.5 indicates that waveform X is equidistant from the spike and noise clusters. Finally, the isolation score is computed by taking the mean value of the above measure in the spike cluster:

$$\mathrm{IS} = \frac{1}{N_{\mathrm{spike}}}\sum_{X \in \text{spike cluster}} P(X) \tag{10}$$
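The two sort-quality measures can be illustrated with a short sketch. It is written under explicit assumptions: SNR is taken as the plain ratio of peak-to-peak amplitude to the SD of the residuals, the spike and noise clusters are assumed to have already been matched in size (so the unequal-size adjustment is not implemented), λ = 10 as stated in the text, and all function names and waveforms are hypothetical.

```python
import math

def snr(waveforms):
    """Peak-to-peak of the mean waveform over the SD of concatenated residuals."""
    avg = [sum(col) / len(col) for col in zip(*waveforms)]
    residuals = [s - m for w in waveforms for s, m in zip(w, avg)]
    mu = sum(residuals) / len(residuals)
    noise_sd = math.sqrt(sum((x - mu) ** 2 for x in residuals) / (len(residuals) - 1))
    return (max(avg) - min(avg)) / noise_sd

def euclid(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def isolation_score(spikes, noise, lam=10.0):
    """Mean probability that a spike waveform belongs to its own cluster.

    Assumes len(spikes) == len(noise), e.g. after random subsampling.
    """
    n = len(spikes)
    pairs = [euclid(spikes[i], spikes[j]) for i in range(n) for j in range(i + 1, n)]
    d0 = sum(pairs) / len(pairs)  # average within-cluster distance

    def similarity(x, y):
        return math.exp(-euclid(x, y) * lam / d0)

    total = 0.0
    for i, x in enumerate(spikes):
        within = sum(similarity(x, y) for j, y in enumerate(spikes) if j != i)
        outside = sum(similarity(x, z) for z in noise)
        total += within / (within + outside)  # P(X): share of similarity in-cluster
    return total / n

# Illustrative three-sample waveforms: a tight spike cluster far from the noise.
spike_wf = [[0.0, 1.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.9, 0.0], [0.05, 1.0, 0.0]]
noise_wf = [[5.0, 5.0, 5.0], [5.0, 5.1, 5.0], [5.0, 4.9, 5.0], [5.05, 5.0, 5.0]]
well_isolated = isolation_score(spike_wf, noise_wf)

# Interleaved one-sample "waveforms": spikes sit closer to noise than to each other.
poorly_isolated = isolation_score([[0.0], [1.0], [2.0], [3.0]],
                                  [[0.5], [1.5], [2.5], [3.5]])
unit_snr = snr([[0.0, 1.0, -1.0, 0.0], [0.0, 1.1, -0.9, 0.0], [0.0, 0.9, -1.1, 0.0]])
```

Because similarity decays exponentially with distance scaled by λ/d0, a spike's score is dominated by its nearest neighbours, so a few nearby noise waveforms are enough to pull the isolation score down.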

SNRs and isolation scores were computed for all single- and multi-units, including all spikes during speech task performance. To further assess spike stability during speech, these measures were then calculated separately for spikes recorded during the baseline epoch and speech epoch (1 s following production onset). For each unit, baseline and speech epoch spikes were pooled across trials. We then used a permutation testing procedure to determine whether the difference between baseline and speech measures of SNR and isolation score was greater than expected by chance. To determine the null distribution of the test statistic, the difference between baseline and speech measures of isolation, we generated 1000 surrogate statistics by randomly selecting “baseline” waveforms and “speech” waveforms from all waveforms detected during the baseline and speech epochs.
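The permutation procedure described above can be sketched generically. In this illustration the statistic is passed in as a function (in the study it would be SNR or isolation score computed on waveforms; a simple mean of scalars stands in here), the test is two-sided, and the p value uses the common (count + 1)/(n + 1) estimate; these choices and the names are assumptions.

```python
import random

def permutation_test(baseline, speech, statistic, n_perm=1000, seed=0):
    """Compare statistic(speech) - statistic(baseline) with shuffled surrogates."""
    rng = random.Random(seed)
    observed = statistic(speech) - statistic(baseline)
    pooled = list(baseline) + list(speech)
    n_base = len(baseline)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of "baseline" and "speech"
        surrogate = statistic(pooled[n_base:]) - statistic(pooled[:n_base])
        if abs(surrogate) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

def mean_value(values):
    return sum(values) / len(values)

# Illustrative scalars standing in for per-waveform measurements.
shifted = permutation_test([0.1 * i for i in range(20)],
                           [5.0 + 0.1 * i for i in range(20)], mean_value)
similar = permutation_test([float(i) for i in range(0, 40, 2)],
                           [float(i) for i in range(1, 40, 2)], mean_value)
```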

To assess the degree to which quantitative measures of isolation predicted unit type (single- vs multi-unit) and unit sort quality (A- vs B-sort), linear discriminant analysis was used to classify units based on SNR and isolation score measures (MATLAB function fitcdiscr). Tenfold cross-validation was used to estimate classification accuracy.

Results

Subject demographics are summarized in Tables 1 and 2. Twelve subjects each performed between one and four blocks of 60 trials during single-unit recording sessions (median 2.5 blocks, mean 160 trials). An average 6.5 ± 1.9% of trials were excluded from analysis due to incorrect responses. Across subjects, the mean latency to the onset of a production was 1.10 ± 0.31 s, and the mean duration of speech was 0.605 ± 0.175 s. A subject's mean production latency correlated significantly with the subject's speech UPDRS subscore (Spearman ρ = 0.72, p = 0.02). This correlation failed to reach significance for speech duration (Spearman ρ = −0.09, p = 0.8) or the fraction of trials with incorrect responses (Spearman ρ = −0.62, p = 0.06). The subjects' total UPDRS score was not correlated with any of these task measures (production latency: Spearman ρ = 0.22, p = 0.5; speech duration: Spearman ρ = −0.23, p = 0.5; percentage correct: Spearman ρ = 0.22, p = 0.5).

Table 1.

Subject characteristics

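| Subject | Age | Sex | Handedness | UPDRS III off score, speech | UPDRS III off score, total | Recorded hemisphere | No. of units recorded | No. of sessions | Mean production latency, s | Production latency SE, s | Mean speech duration, s | Speech duration SE, s | Correct trials, % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 60 | Male | R | NR | 53 | L | 6 | 2 | 1.5 | 0.005 | 0.777 | 0.005 | 89 |
| 2 | 68 | Male | R | 1 | 47 | L | 15 | 4 | 1.01 | 0.016 | 0.578 | 0.016 | 95 |
| 3 | 47 | Female | R | NR | NR | R | 2 | 3 | 0.91 | 0.010 | 0.623 | 0.01 | 100 |
| 4 | 60 | Male | R | 0 | 31 | L | 1 | 2 | 0.77 | 0.020 | 0.642 | 0.02 | 98 |
| 5 | 68 | Male | L | 1 | 50 | L | 6 | 2 | 0.75 | 0.039 | 0.592 | 0.039 | 97 |
| 6 | 56 | Male | R | 1 | 46 | L | 5 | 2 | 0.86 | 0.009 | 0.467 | 0.009 | 98 |
| 7 | 82 | Male | R | 2 | 36 | L | 11 | 3 | 1.99 | 0.007 | 0.442 | 0.007 | 76 |
| 8 | 66 | Male | R | 0 | 46 | L | 8 | 4 | 0.67 | 0.007 | 0.61 | 0.007 | 97 |
| 9 | 66 | Male | R | 2 | 45 | L | 8 | 2 | 1.13 | 0.023 | 0.745 | 0.023 | 93 |
| 10 | 71 | Female | R | 1 | 24 | L | 6 | 3 | 1.26 | 0.016 | 0.796 | 0.016 | 86 |
| 11 | 77 | Male | R | 1 | 27 | L | 2 | 2 | 1.33 | 0.012 | 0.539 | 0.012 | 96 |
| 12 | 59 | Male | R | 1 | 40 | L | 9 | 3 | 1.04 | 0.004 | 0.452 | 0.004 | 96 |
| Mean ± SE | 65.0 ± 2.7 | | | 1.0 ± 0.2 | 40.4 ± 2.9 | | 6.6 ± 1.2 | 2.7 ± 0.2 | 1.10 ± 0.11 | 0.014 ± 0.003 | 0.605 ± 0.035 | 0.014 ± 0.003 | 93 ± 2 |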

Demographic, recording, and speech performance characteristics. NR, Not recorded.

Table 2.

Additional subject characteristics

| Subject | Tremor dominance | Voice or respiratory complaints | Hearing complaints |
|---|---|---|---|
| 1 | Bilateral (greater on L) | None | None |
| 2 | NR | None | None |
| 3 | L | None | None |
| 4 | NR | None | None |
| 5 | Bilateral (greater on R) | None | None |
| 6 | L | None | None |
| 7 | NR | None | None |
| 8 | NR | Dysphonia; atrophy of the bilateral true vocal fold; hypophonic speech related to parkinsonism and atrophy | None |
| 9 | No | NR | Yes |
| 10 | NR | Vocal fold atrophy, dysphonia, dysphagia | None |
| 11 | Yes | Vocal fold atrophy, dysphonia | Bilateral hearing aids |
| 12 | No | None | None |

Side of tremor dominance and presence of voice or hearing complaints.

A total of 45 neuronal recordings met the criteria for A-sorts (22 single-unit, 23 multi-unit recordings). Thirty-four additional recordings met criteria for B-sorts (3 single-unit, 31 multi-unit recordings). The mean baseline firing rates were not significantly different between A- and B-sort single units (21.8 ± 3.2 spikes/s vs 27.3 ± 7.1 spikes/s; mean ± SE, p = 0.55, unpaired t test), and were consistent with data reported previously from the human STN (Rodriguez-Oroz et al., 2001; Abosch et al., 2002; Starr et al., 2003; Theodosopoulos et al., 2003; Romanelli et al., 2004; Schrock et al., 2009).

Spike sort quality was quantified for all units using SNR and isolation score measures. Isolation scores were significantly different between single- and multi-units (single-unit median = 0.97, interquartile range (IQR) = 0.06; multi-unit median = 0.86, IQR = 0.15; p = 8.6 × 10⁻⁹, Wilcoxon rank sum test), and between A- and B-sorts (A-sort median = 0.93, IQR = 0.11; B-sort median = 0.86, IQR = 0.19; p = 3.1 × 10⁻⁵, Wilcoxon rank sum test). Similarly, SNRs were significantly different between single- and multi-units (single-unit median = 9.8, IQR = 2.2; multi-unit median = 5.4, IQR = 1.5; p = 6.3 × 10⁻¹¹, Wilcoxon rank sum test), and between A- and B-sorts (A-sort median = 7.7, IQR = 4.3; B-sort median = 5.3, IQR = 1.7; p = 8.5 × 10⁻⁵, Wilcoxon rank sum test). Based on these two measures, a linear discriminant analysis classifier could distinguish between single- and multi-units with 85.0 ± 0.5% accuracy (significantly greater than chance, 66.6%, p = 6.6 × 10⁻¹⁶, unpaired t test), and between A- and B-sorts with 67.2 ± 4.4% accuracy (significantly greater than chance, 54.3%, p = 6.5 × 10⁻⁶, unpaired t test).

Overall, a high percentage of units demonstrated a speech-related change in firing. Twenty-two units exhibited significant increases in firing rate, 13 units showed significant decreases, and 7 units showed a mixed increase/decrease response during the production epoch. The proportion of units exhibiting these speech-related changes did not depend on sort quality (A- or B-sorts) or on unit type (single- or multi-unit; Table 3). Figure 3A–C shows examples of these unit response categories. Although there was an overall significant difference in the proportions of neurons in the four response categories (increase, decrease, mixed, and nonresponse, χ² = 25.8608, p = 1.02 × 10⁻⁵), there was no significant difference between the proportion of increase-type and decrease-type units (χ² = 2.3, p = 0.13). The prevalence of speech-responsive units did not relate to the subjects' symptom severity, and the proportion of units recorded from each subject that showed increase, decrease, cue-locking or speech-locking response types was not correlated with the speech subscore or total UPDRS score (Table 4). Increase- and decrease-type single-units were not differentiated statistically by baseline firing rates (increase-type firing rate = 20.6 ± 6.4 spikes/s; decrease-type firing rate = 34.4 ± 10.5 spikes/s; p = 0.27, unpaired t test). The mean latency of neuronal responses (defined as onset of the first significant change relative to production onset; Fig. 3D–F) also was similar between increase and decrease response types (−0.23 ± 0.07 s and −0.20 ± 0.14 s, respectively, p = 0.87, unpaired t test). In the one participant with right STN recordings (2 multi-units), no speech-related responses were found.

Table 3.

Unit type and sort quality do not determine response type

                                        Total   No response   Increase   Decrease   Mixed
A-sort units                            45      19            13         10         3
B-sort units                            34      18            9          3          4
χ²                                      -       0.89          0.06       2.53       0.62
p                                       -       0.34          0.81       0.11       0.43
Single-unit                             25      10            6          5          4
Multi-unit                              54      27            16         8          3
χ²                                      -       0.69          0.27       0.33       2.3
p                                       -       0.41          0.60       0.56       0.13
Total                                   79      37            22         13         7
Firing rate, % of baseline (mean ± SE)  -       -             178 ± 15   68 ± 3     203 ± 30; 62 ± 7

Figure 3.

STN neuronal firing is modulated during speech. Examples of A-sort single-unit neuronal responses during speech showing (A) increases, (B) decreases, and (C) mixed responses in firing rate, aligned to production onset (t = 0). Spike rasters across trials are shown in A–C, top, and mean firing rate (A, C) or mean ISI (B) is shown on the bottom. Diamonds labeled with a “c” indicate mean time of cue presentation; diamonds labeled with an “e” indicate mean speech end; dashed error bars indicate the corresponding SDs. D–F, Raster plots illustrating the timing of firing rate responses across the population of unit recordings. Each row represents a unit's significant changes relative to baseline, during a time segment surrounding production onset. The time scale is normalized across units from 0.5 s before the mean cue onset until 0.5 s after the mean end of speech.

Table 4.

Subject symptom severity is not correlated with unit speech response types

                                 Proportion of units by response type
                                 Increase-type   Decrease-type   Speech-locked   Cue-locked
Correlation with speech UPDRS
    Spearman ρ                   0.28            0.08            0.39            0.45
    p value                      0.44            0.82            0.27            0.20
Correlation with total UPDRS
    Spearman ρ                   0.04            −0.03           −0.51           −0.30
    p value                      0.92            0.92            0.11            0.36
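The correlations in Table 4 are Spearman rank correlations between a per-subject clinical score and a per-subject proportion of units. A minimal sketch of that statistic follows, computed as the Pearson correlation of average ranks so that tied scores are handled. The per-subject values below are hypothetical placeholders, not the study's data, and the helper names are invented for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

# Hypothetical per-subject values for 12 subjects (placeholders only).
speech_updrs = [1, 2, 1, 0, 3, 2, 1, 2, 0, 1, 3, 2]
prop_increase = [0.2, 0.5, 0.0, 0.3, 0.4, 0.1, 0.6, 0.3, 0.2, 0.0, 0.5, 0.4]
print(f"rho = {spearman_rho(speech_updrs, prop_increase):.2f}")
```

A significance test for ρ (for example by permutation of one variable) would be a separate step and is not shown.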

Response types were observed to be differentially associated with speech onset- and cue-locking. Among 29 units with significant increases in firing rate during the production epoch, the responses were preferentially time-locked to production (41%), with a minority time-locked to cue onset (7%) or to both cue and production onset (7%; Fig. 4). In contrast, among 20 units with significant decreases in firing rate, 40% were time-locked to cue onset, whereas only 15% had responses time-locked to production onset, and no responses were time-locked to both cue and production onset (Fig. 5). Again, the proportion of responses showing speech- versus cue-locking firing changes did not depend on sort quality or on unit type (Table 5). A χ² test was used to verify that increase-type neural responses were more likely to be time-locked to the production onset than were decreases (χ² = 3.89, p = 0.049), whereas decrease-type responses were more likely to be time-locked to cue onset (χ² = 7.99, p = 0.0047; Table 6). The mean latency of neuronal responses (defined as the mean neuronal response latency across trials) was shorter for cue-locked responses (0.76 ± 0.12 s) than for speech-locked responses (1.08 ± 0.08 s; p = 0.039, unpaired t test). The mean neuronal response to production interval (defined as the mean neuronal response to production interval across trials) was also greater in magnitude for cue-locked responses (−0.48 ± 0.12 s) than for speech-locked responses (−0.15 ± 0.6 s; p = 0.011, unpaired t test).
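The logic behind the locking classification can be sketched with simulated trials. If a response is locked to production onset, its latency from the cue should track the trial's production latency while the response-to-production interval stays roughly constant; if it is locked to the cue, the reverse holds. The sketch below uses a Pearson correlation and a fixed cutoff of 0.5, both of which are assumptions made for illustration (the function names and trial values are also hypothetical); the study's actual criterion was a significance test on the per-trial correlations.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def classify_locking(response_latency, production_latency, threshold=0.5):
    """Classify a unit as locked to 'production', 'cue', 'both', or 'neither'.

    response_latency: per-trial time from cue to neuronal response onset (s).
    production_latency: per-trial time from cue to speech onset (s).
    threshold: hypothetical |r| cutoff standing in for a significance test.
    """
    # Response-to-production interval; negative when the response leads speech.
    interval = [r - p for r, p in zip(response_latency, production_latency)]
    r_latency = pearson(response_latency, production_latency)
    r_interval = pearson(interval, production_latency)
    to_production = abs(r_latency) > threshold
    to_cue = abs(r_interval) > threshold
    if to_production and to_cue:
        return "both"
    if to_production:
        return "production"
    if to_cue:
        return "cue"
    return "neither"

# Simulated trials (not study data): production latencies and small timing jitter.
production = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
jitter = [0.02, -0.02, -0.01, 0.01, 0.01, -0.01, -0.02, 0.02]
speech_locked = [p - 0.15 + j for p, j in zip(production, jitter)]  # leads speech by ~0.15 s
cue_locked = [0.5 + j for j in jitter]                              # follows cue by ~0.5 s

print(classify_locking(speech_locked, production))
print(classify_locking(cue_locked, production))
```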

Figure 4.

STN neuronal firing increases are primarily speech-locked. A, An example of an A-sort single unit whose firing rate increase is locked to production onset. Spike raster (top) and mean firing rate (bottom) aligned to cue presentation. Significant spike bursts are shaded for each trial according to their PS index. Trials are sorted by speech production latency; speech production onset for each trial is indicated in green. B, The time interval between cue presentation and burst onset (neuronal response latency) and between burst onset and production onset (neuronal response to production interval) for each trial is correlated against production latency. C, Summary of correlation analyses for all unit recordings with increase-type responses, showing 12/29 responses locked to production onset (red circles), 2/29 responses locked to cue presentation (blue circles), and 2/29 responses locked to both cue and production onset (black circles). Open circles in C indicate B-sorts.

Figure 5.

STN neuronal firing decreases are primarily cue-locked. A, An example of an A-sort multi-unit whose firing rate decrease is locked to cue presentation. Spike raster (top) and mean firing rate (bottom) aligned to cue presentation. Significant decreases in firing rate (pauses) are shaded for each trial according to their PS index. Trials are sorted by speech production latency; speech production onset for each trial is indicated in green. B, The time interval between cue presentation and pause onset (neuronal response latency) and between pause onset and production onset (neuronal response to production interval) for each trial is correlated against production latency. C, Summary of correlation analyses for all unit recordings with decrease-type responses, showing 3/20 responses locked to production onset (red circles), 8/20 responses locked to cue presentation (blue circles), and none locked to both cue presentation and production onset (black circles). Open circles in C indicate B-sorts.

Table 5.

Unit type and sort quality do not determine locking type

               Total   Locked to cue   Locked to production onset   Locked to both
A-sort units   29      5               9                            2
B-sort units   20      5               6                            0
χ²             -       0.44            0.006                        1.44
p              -       0.51            0.94                         0.23
Single-unit    19      3               5                            1
Multi-unit     30      7               10                           1
χ²             -       0.41            0.27                         0.11
p              -       0.52            0.60                         0.74
Total          49      10              15                           2

Table 6.

Dissociation between cue-locking decreases and speech-locking increases of firing

                          Total   Locked to cue (%)   Locked to production onset (%)   Locked to both (%)
Increase-type responses   29      2 (7)               12 (41)                          2 (7)
Decrease-type responses   20      8 (40)              3 (15)                           0 (0)
χ²                        -       7.99                3.89                             1.44
p                         -       0.0047              0.049                            0.23
Total                     49      10                  15                               2
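The χ² values in Table 6 can be approximately reproduced from the counts alone. The sketch below treats each locking category as a 2 × 2 table (locked versus not locked, for increase- versus decrease-type responses) and computes a Pearson χ² without continuity correction. That choice of test variant is an assumption, adopted because it matches the reported cue-locked and both-locked statistics to two decimals; the production-locked statistic comes out near 3.88 rather than the reported 3.89.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Counts from Table 6 as (locked, not locked) for increase- then decrease-type responses.
cue_stat, cue_p = chi_square_2x2(2, 27, 8, 12)
prod_stat, prod_p = chi_square_2x2(12, 17, 3, 17)
both_stat, both_p = chi_square_2x2(2, 27, 0, 20)

for name, stat, p in (("cue", cue_stat, cue_p),
                      ("production", prod_stat, prod_p),
                      ("both", both_stat, both_p)):
    print(f"{name}: chi2 = {stat:.2f}, p = {p:.4f}")
```

With expected counts this small in some cells (for example, only two responses locked to both events), an exact test would be a reasonable alternative.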

Encoding of speech duration was not prevalent in recorded STN units. The duration of the neural response had a significant correlation with the duration of speech production (Spearman correlation, p < 0.05) in only 2 of 29 units with increase-type responses, and in only 1 of 20 neurons with decrease-type responses. These proportions were not significantly different from zero (Fisher's exact test, p = 0.49 for increase-type responses, p = 1.0 for decrease-type responses), indicating that they are too small to be estimated statistically from this experiment.
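The text does not state what the observed proportions were compared against. One reading that happens to reproduce both reported p values is a two-sided Fisher's exact test against a hypothetical group of equal size containing no duration-correlated units (2/29 versus 0/29, and 1/20 versus 0/20). The sketch below implements that reading; it is an inference from the numbers, not a description of the authors' procedure, and the function name is invented for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row holds x of the col1 "successes".
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_observed * tolerance)
    return min(1.0, p_value)

# Observed counts (correlated, not correlated) versus an assumed all-zero comparison group.
p_increase = fisher_exact_two_sided(2, 27, 0, 29)
p_decrease = fisher_exact_two_sided(1, 19, 0, 20)
print(f"increase-type: p = {p_increase:.2f}; decrease-type: p = {p_decrease:.2f}")
```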

Evidence for encoding the volume of speech was found in a small number of decrease-type STN units. When firing rate during speech was examined for each trial, none of the 29 increase-type responses and 1 of 23 decrease-type responses showed a significant correlation (ρ = −0.27, p = 0.04) with relative speech volume. Similarly, when mean burst or pause firing rate was examined, none of the 29 increase-type responses and 2 of 23 decrease-type responses (2 subjects) showed a significant correlation (ρ = −0.42, −0.30; p = 0.020, 0.025, respectively) with relative speech volume across trials.

We did not find evidence for topographical organization of response types. Unit recording locations were analyzed based on the recording trajectory (center, 23 units, average span 4.7 ± 0.5 mm; posterior, 29 units, average span 5.2 ± 0.5 mm; or medial, 27 units, average span 5.6 ± 0.6 mm), and the recording depth, relative to the microelectrode recording-defined boundaries of the STN within each trajectory. There was no significant difference in STN recording depth between speech response types (excitatory, inhibitory, mixed, no response; Fig. 6; Kruskal–Wallis test, central trajectory χ² = 7.2, p = 0.066; posterior trajectory χ² = 6.2, p = 0.10; medial trajectory χ² = 7.2, p = 0.066). There was also no significant difference in STN recording depth between locking response types (production onset-locked, cue-locked, locked to both events, no locking) in any of the recording trajectories (Kruskal–Wallis test, central trajectory χ² = 5.6, p = 0.13; posterior trajectory χ² = 3.9, p = 0.27; medial trajectory χ² = 2.1, p = 0.35). Collapsing the recording depths across trajectories did not reveal significant differences between response types. Microelectrode recording locations additionally were normalized to MNI space, allowing for group-level analysis within a common coordinate system (Fig. 7). Linear discriminant analysis was used to model speech-related response types and locking types of units based on their MNI coordinates, to test whether speech-related responses are anatomically segregated within the sampled region of the STN. The classification accuracy of this model was not higher than expected by chance.

Figure 6.

Anatomical distribution of speech responses in the STN. Unit locations are represented according to the recording trajectory and recording depth relative to electrophysiology-defined STN boundaries (0% corresponds to the ventral STN border and 100% corresponds to the dorsal STN border). Box plots represent the median and IQR of recording depths within each response category. NR = no response.

Figure 7.

Anatomical distribution of STN microelectrode unit recordings in MNI space. A, Speech-related unit response types and (B) locking types were not segregated in normalized anatomical coordinates.

Finally, we tested for potential influences of recording stability by comparing single-unit isolation between baseline and speech epochs, for each unit. Overall, 23/79 units showed a small but significant change between baseline and speech isolation scores (7 decreases, 16 increases, 3.8 ± 0.7% mean magnitude change from baseline; p < 0.05, permutation testing). Similarly, a significant change between baseline and speech SNRs was observed in 25/79 units (14 decreases, 11 increases, 9.5 ± 1.0% mean magnitude change from baseline; p < 0.05, permutation testing). However, the specific change in spike isolation measure was not consistently related to the speech-related modulation in firing. Specifically, among 22 units with increase-type responses, 4 showed decreases, and 5 showed increases between baseline and speech isolation scores (4.1 ± 1.3% mean magnitude change from baseline; p < 0.05, permutation testing), whereas 7 showed decreases and 3 showed increases between baseline and speech SNRs (6.6 ± 1.2% mean magnitude change from baseline; p < 0.05, permutation testing). Similarly, among 13 units with decrease-type responses, 2 showed decreases, and 2 showed increases between baseline and speech isolation scores (4.6 ± 2.6% mean magnitude change from baseline; p < 0.05, permutation testing), whereas none showed a decrease and 1 showed an increase between baseline and speech SNRs (18% mean change from baseline; p < 0.05, permutation testing). Among seven units with mixed-type responses, none showed a decrease, and three showed increases between baseline and speech isolation scores (2.5 ± 1.3% mean change from baseline; p < 0.05, permutation testing), whereas none showed a decrease, and one showed an increase between baseline and speech SNRs (8% mean change from baseline; p < 0.05, permutation testing).

Discussion

We found that both phasic increases and decreases in the discharge rate of STN neurons accompany the production of speech. In this study, subjects read aloud syllables presented on a computer screen, a behavioral paradigm that requires a series of neural events beginning from processing the visual cue to activating motor commands for the vocal organ. Neural events that occur early in this series, such as processing of the visual cue and forming a phonological plan, might be expected to be time-locked to cue presentation. Events that occur later in the series, such as forming and executing the motor speech plan, might be expected to be time-locked to speech output. We showed that decrease-type responses are predominantly locked to cue presentation and increase-type STN responses are predominantly locked to the onset of speech. These findings suggest that STN inhibition may be associated with early, cognitive aspects of speech production, while STN excitation may be associated with later, motor aspects of speech production.

The extent to which speech-related activity in the STN may reflect lower-order movement-related activity, akin to results from studies involving simple limb movements, versus higher-order functions has important implications. Although kinematic aspects of speech production often improve following DBS (Pinto et al., 2004; De Gaspari et al., 2006; Parsons et al., 2006; Mikos et al., 2011), a decrease in verbal fluency is the most common cognitive side effect of STN-DBS, with specific deficits in lexical and grammatical processing having been observed, albeit inconsistently across studies (Phillips et al., 2012). The observation of increases in firing rates associated with speech onset is expected, in the context of previous studies of limb movement-related activity. In STN recordings from both human subjects and nonhuman primates, firing rate increases comprise 75–93% of movement-related responses during active and passive limb movements (Wichmann et al., 1994; Rodriguez-Oroz et al., 2001; Abosch et al., 2002; Starr et al., 2003; Theodosopoulos et al., 2003; Romanelli et al., 2004; Schrock et al., 2009). We found that nearly one-half of increase-type responses in our study were locked to the onset of speech, indicating that motor aspects of speech production are encoded in STN activity. A significantly smaller proportion of increase-type responses was locked to cue presentation (7%) and to both cue presentation and speech production onset (7%), with remaining responses not clearly associated with either event.

In contrast, we observed that early stages of speech production may involve the inhibition of STN neurons. We found that a large proportion (40%) of decrease-type responses were locked to cue presentation, with cue-locked responses occurring at significantly lower latencies relative to cue presentation, compared with speech-locked responses. A smaller proportion (15%) of decrease-type responses were locked to speech production onset, with remaining responses not clearly associated with either event. Although minority populations of neurons with movement-related firing-rate decreases have been reported previously (Wichmann et al., 1994; Schrock et al., 2009; Lipski et al., 2017) and active movements have been associated with a higher proportion of decrease-type responses in the STN (Lipski et al., 2017), it is remarkable that such a high percentage of decrease-type responses was observed in the present study. Interestingly, and in contrast to our results, a marked reduction of STN activity was reported to be associated with the onset of speech production in the only previous report of STN unit activity recorded during speech production (Watson and Montgomery, 2006), although that study was largely descriptive in nature, limiting comparisons to our data. Although other investigators have shown correlations of STN single unit firing rates and rhythms to premotor functions, such as the encoding of difficulty level of a choice task (Zaghloul et al., 2012; Zavala et al., 2016), cue-locked decreases in firing were not reported. Our data do suggest that, compared with limb movement, speech may involve a different balance of activation and suppression in the STN, and that modulation of this balance may occur at the single neuron level before speech onset.

This study was not designed to determine whether early, cue-locked STN modulation of activity reflects responses to the presented stimulus (i.e., reading) versus other aspects of preparing to speak. Although it is important to note that cortical activation of motor commands, as well as adjustments in the chest wall, laryngeal and articulatory musculature, occur well before the acoustic signal is realized, and in a time-locked manner, we relied upon the acoustic output as a simple and noninvasive landmark for exploring timing relationships (Bouchard et al., 2013). Direct measurements of respiratory or articulatory kinematics, however, are indicated for future studies, to more clearly understand behavioral correlates of STN speech-related activity. Whether similar STN responses would be observed with nonspeech-related engagement of the same musculature also is an open question. Notably, our findings are based on data collected in patients with Parkinson's disease, and it was not possible to determine the extent to which the dynamics of corticosubthalamic coupling described reflect physiological versus pathophysiological basal ganglia function. Nonetheless, the prevalence of speech-responsive units did not relate to the subjects' symptom severity, as measured by UPDRS.

The STN functions within the basal ganglia thalamocortical circuit primarily by way of glutamatergic inputs to the GABAergic output neurons of the globus pallidus internus and substantia nigra pars reticulata. The firing rate model of basal ganglia function posits that increases in STN activity may have a suppressive effect on basal ganglia-recipient circuits, whereas decreases may be facilitatory. This balance of basal ganglia-mediated activation and suppression has been understood most frequently in terms of either selecting and focusing motor actions (Mink, 1996; Redgrave et al., 2010), or modulating their gain over time (Alexander and Crutcher, 1990b; Nambu et al., 2000, 2002; Nambu, 2005; Turner and Desmurget, 2010; Thura and Cisek, 2017). Proponents of an action selection hypothesis have proposed that the STN participates in a response inhibition function to reduce premature action when multiple competing responses are possible (Frank, 2006). Our findings of suppressed STN firing locked to speech cues and increased STN firing locked to speech production, however, are not consistent with action selection-related functions of the STN. Similarly, Ziegler and Ackermann (2017) recently compiled extensive evidence in support of the idea that, for well learned adult speech, basal ganglia circuits play key roles in the emotional/motivational modulation of speech (i.e., in prosody) but not in the selection and sequencing of articulatory gestures.

Speech-related phasic increases in the STN likely are a result of excitatory inputs and decreases likely a result of inhibitory inputs. The major excitatory input into the STN comes from the neocortex via the basal ganglia hyperdirect pathway (Nambu et al., 2002), which forms glutamatergic synapses onto distal dendrites of STN projection neurons (Künzle, 1978; Romansky et al., 1979; Kitai and Deniau, 1981; Romansky and Usunoff, 1987). The primate STN receives direct projections from broadly distributed cortical areas including primary motor cortex, premotor cortex, supplementary motor area, dorsolateral prefrontal, anterior cingulate, and inferior frontal cortex (Afsharpour, 1985; Parent and Hazrati, 1995; Nambu et al., 1997; Haynes and Haber, 2013). A primary form of inhibitory drive arises from GABAergic projections to the STN from the external segment of the globus pallidus via the indirect basal ganglia pathway (Nauta and Mehler, 1966; Romansky and Usunoff, 1987; Bell et al., 1995; Sato et al., 2000). Thus, it is possible that speech onset-locked increased firing-rate responses (STN excitation) could be mediated via the hyperdirect pathway, whereas cue-locked inhibitory responses during speech could be mediated via the indirect pathway. These findings also can be interpreted in the context of the GODIVA model (Bohland et al., 2010) of speech production. This model posits a dual role for the basal ganglia, participating in two processes that may be correlated with cue presentation and speech production in our task: (1) a planning loop that is involved in generating a phonological sequence corresponding to the target word, and (2) a motor loop that releases the planned speech sounds for motor execution.

This study did not examine interhemispheric differences in speech-related STN activity, as recordings were performed in the left STN in 11/12 subjects. Similarly, language laterality was not specifically assessed, and the current cohort is skewed toward right-handed individuals (Table 1). There are good reasons to expect both the left and right STN will exhibit speech-related responses, since speech-related potentials are represented in a bilateral fashion (Grözinger et al., 1980) and functional neuroimaging studies have demonstrated robust activation of the precentral and postcentral gyri bilaterally during overt speech production (Turkeltaub et al., 2002; Guenther and Hickok, 2015). Interestingly, clinical outcome studies on speech and STN DBS have suggested that left STN stimulation has a greater impact on speech production compared with right-sided stimulation (Aldridge et al., 2016); thus, future experiments designed to examine bilateral responses in individual patients are needed to address questions of the impact of language laterality.

It is important to consider that respiratory kinematics and articulatory movements may change intracranial pressure, potentially transiently affecting unit recording quality. To examine the possibility of these transient changes affecting our assessment of speech-related physiological modulation of STN neuron firing, we tested unit isolation measures following the onset of speech relative to baseline. We found that SNR and isolation score were significantly altered in 25 and 23 of 79 units, respectively. Importantly, however, the magnitude of change in isolation measures was small, and the direction of change was not predictive of speech-related response type. Given that intraoperatively recorded human single-unit activity seldom is completely stable across time, the isolation measure difference between baseline and speech likely reflects ongoing fluctuations in isolation rather than specific effects of speech. Small changes in isolation during speech may also be attributed to modulation of background population activity during speech, which affects isolation of sorted units.

In summary, our results demonstrate that STN neurons comprise separate functional populations whose activity during speech production can be differentiated by the timing and direction of firing rate changes. The extent to which these functional groupings may be specific to speech versus common to complex motor function is an important question for future work, in light of conflicting theories of the role of the STN, and that of the basal ganglia as a whole, in motor behaviors. Our ongoing studies aim to examine the granularity of STN functional encoding in speech production and to verify the specificity of these findings to speech.

Footnotes

This work was supported by NINDS U01NS098969 (R.M.R.), the Hamot Health Foundation (R.M.R.), and a University of Pittsburgh Brain Institute NeuroDiscovery Pilot Research Award (R.M.R.). We thank Danielle Corson and Jim Sweat for expert intraoperative clinical assistance.

The authors declare no competing financial interests.

References

  1. Abosch A, Hutchison WD, Saint-Cyr JA, Dostrovsky JO, Lozano AM (2002) Movement-related neurons of the subthalamic nucleus in patients with Parkinson disease. J Neurosurg 97:1167–1172. 10.3171/jns.2002.97.5.1167 [DOI] [PubMed] [Google Scholar]
  2. Afsharpour S. (1985) Topographical projections of the cerebral cortex to the subthalamic nucleus. J Comp Neurol 236:14–28. 10.1002/cne.902360103 [DOI] [PubMed] [Google Scholar]
  3. Aldridge D, Theodoros D, Angwin A, Vogel AP (2016) Speech outcomes in Parkinson's disease after subthalamic nucleus deep brain stimulation: a systematic review. Parkinsonism Relat Disord 33:3–11. 10.1016/j.parkreldis.2016.09.022 [DOI] [PubMed] [Google Scholar]
  4. Alexander GE, Crutcher MD (1990a) Preparation for movement: neural representations of intended direction in three motor areas of the monkey. J Neurophysiol 64:133–150. 10.1152/jn.1990.64.1.133 [DOI] [PubMed] [Google Scholar]
  5. Alexander GE, Crutcher MD (1990b) Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci 13:266–271. 10.1016/0166-2236(90)90107-L [DOI] [PubMed] [Google Scholar]
  6. Alm PA. (2004) Stuttering and the basal ganglia circuits: a critical review of possible relations. J Commun Disord 37:325–369. 10.1016/j.jcomdis.2004.03.001 [DOI] [PubMed] [Google Scholar]
  7. Anderson ME, Turner RS (1991) A quantitative analysis of pallidal discharge during targeted reaching movement in the monkey. Exp Brain Res 86:623–632. [DOI] [PubMed] [Google Scholar]
  8. Bell K, Churchill L, Kalivas PW (1995) GABAergic projection from the ventral pallidum and globus pallidus to the subthalamic nucleus. Synapse 20:10–18. 10.1002/syn.890200103 [DOI] [PubMed] [Google Scholar]
  9. Bohland JW, Bullock D, Guenther FH (2010) Neural representations and mechanisms for the performance of simple speech sequences. J Cogn Neurosci 22:1504–1529. 10.1162/jocn.2009.21306 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bouchard KE, Mesgarani N, Johnson K, Chang EF (2013) Functional organization of human sensorimotor cortex for speech articulation. Nature 495:327–332. 10.1038/nature11911 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. De Gaspari D, Siri C, Di Gioia M, Antonini A, Isella V, Pizzolato A, Landi A, Vergani F, Gaini SM, Appollonio IM, Pezzoli G (2006) Clinical correlates and cognitive underpinnings of verbal fluency impairment after chronic subthalamic stimulation in Parkinson's disease. Parkinsonism Relat Disord 12:289–295. 10.1016/j.parkreldis.2006.01.001 [DOI] [PubMed] [Google Scholar]
  12. Desmurget M, Turner RS (2010) Motor sequences and the basal ganglia: kinematics, not habits. J Neurosci 30:7685–7690. 10.1523/JNEUROSCI.0163-10.2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. DiCarlo JJ, Maunsell JH (2005) Using neuronal latency to determine sensory-motor processing pathways in reaction time tasks. J Neurophysiol 93:2974–2986. 10.1152/jn.00508.2004 [DOI] [PubMed] [Google Scholar]
  14. Doupe AJ, Kuhl PK (1999) Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci 22:567–631. 10.1146/annurev.neuro.22.1.567 [DOI] [PubMed] [Google Scholar]
  15. Frank MJ. (2006) Hold your horses: a dynamic computational role for the subthalamic nucleus in decision making. Neural Netw 19:1120–1136. 10.1016/j.neunet.2006.03.006 [DOI] [PubMed] [Google Scholar]
  16. Giraud AL, Neumann K, Bachoud-Levi AC, von Gudenberg AW, Euler HA, Lanfermann H, Preibisch C (2008) Severity of dysfluency correlates with basal ganglia activity in persistent developmental stuttering. Brain Lang 104:190–199. 10.1016/j.bandl.2007.04.005 [DOI] [PubMed] [Google Scholar]
  17. Grözinger B, Kornhuber HH, Kriebel J, Szirtes J, Westphal KT (1980) The Bereitschaftspotential preceding the act of speaking. Also an analysis of artifacts. Prog Brain Res 54:798–804. 10.1016/S0079-6123(08)61705-7 [DOI] [PubMed] [Google Scholar]
  18. Guenther FH, Hickok G (2015) Role of the auditory system in speech production. Handb Clin Neurol 129:161–175. 10.1016/B978-0-444-62630-1.00009-3 [DOI] [PubMed] [Google Scholar]
  19. Haynes WI, Haber SN (2013) The organization of prefrontal-subthalamic inputs in primates provides an anatomical substrate for both functional specificity and integration: implications for basal ganglia models and deep brain stimulation. J Neurosci 33:4804–4814. 10.1523/JNEUROSCI.4674-12.2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Hickok G. (2012) Computational neuroanatomy of speech production. Nat Rev Neurosci 13:135–145. 10.1038/nrn3158 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Horn A, Kühn AA (2015) Lead-DBS: a toolbox for deep brain stimulation electrode localizations and visualizations. Neuroimage 107:127–135. 10.1016/j.neuroimage.2014.12.002 [DOI] [PubMed] [Google Scholar]
  22. Jiao Y, Medina L, Veenman CL, Toledo C, Puelles L, Reiner A (2000) Identification of the anterior nucleus of the ansa lenticularis in birds as the homolog of the mammalian subthalamic nucleus. J Neurosci 20:6998–7010. 10.1523/JNEUROSCI.20-18-06998.2000 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Joshua M, Elias S, Levine O, Bergman H (2007) Quantifying the isolation quality of extracellularly recorded action potentials. J Neurosci Methods 163:267–282. 10.1016/j.jneumeth.2007.03.012 [DOI] [PubMed] [Google Scholar]
  24. Kitai ST, Deniau JM (1981) Cortical inputs to the subthalamus: intracellular analysis. Brain Res 214:411–415. 10.1016/0006-8993(81)91204-X [DOI] [PubMed] [Google Scholar]
  25. Künzle H. (1978) An autoradiographic analysis of the efferent connections from premotor and adjacent prefrontal regions (areas 6 and 9) in macaca fascicularis. Brain Behav Evol 15:185–234. 10.1159/000123779 [DOI] [PubMed] [Google Scholar]
  26. Lipski WJ, Wozny TA, Alhourani A, Kondylis ED, Turner RS, Crammond DJ, Richardson RM (2017) Dynamics of human subthalamic neuron phase-locking to motor and sensory cortical oscillations during movement. J Neurophysiol 118:1472–1487. 10.1152/jn.00964.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Mikos A, Bowers D, Noecker AM, McIntyre CC, Won M, Chaturvedi A, Foote KD, Okun MS (2011) Patient-specific analysis of the relationship between the volume of tissue activated during DBS and verbal fluency. Neuroimage 54:S238–S246. 10.1016/j.neuroimage.2010.03.068 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Mink JW. (1996) The basal ganglia: focused selection and inhibition of competing motor programs. Prog Neurobiol 50:381–425. 10.1016/S0301-0082(96)00042-1 [DOI] [PubMed] [Google Scholar]
  29. Moore MW, Fiez JA, Tompkins CA (2017) Consonant Age-of-Acquisition Effects in Nonword Repetition Are Not Articulatory in Nature. J Speech Lang Hear Res 60:3198–3212. 10.23641/asha.5435137 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Nambu A. (2005) A new approach to understand the pathophysiology of Parkinson's disease. J Neurol 252:IV1-IV4. 10.1007/s00415-005-4002-y [DOI] [PubMed] [Google Scholar]
  31. Nambu A, Tokuno H, Inase M, Takada M (1997) Corticosubthalamic input zones from forelimb representations of the dorsal and ventral divisions of the premotor cortex in the macaque monkey: comparison with the input zones from the primary motor cortex and the supplementary motor area. Neurosci Lett 239:13–16. 10.1016/S0304-3940(97)00877-X [DOI] [PubMed] [Google Scholar]
  32. Nambu A, Tokuno H, Hamada I, Kita H, Imanishi M, Akazawa T, Ikeuchi Y, Hasegawa N (2000) Excitatory cortical inputs to pallidal neurons via the subthalamic nucleus in the monkey. J Neurophysiol 84:289–300. 10.1152/jn.2000.84.1.289 [DOI] [PubMed] [Google Scholar]
  33. Nambu A, Tokuno H, Takada M (2002) Functional significance of the cortico-subthalamo-pallidal “hyperdirect” pathway. Neurosci Res 43:111–117. 10.1016/S0168-0102(02)00027-5 [DOI] [PubMed] [Google Scholar]
  34. Nauta WJ, Mehler WR (1966) Projections of the lentiform nucleus in the monkey. Brain Res 1:3–42. 10.1016/0006-8993(66)90103-X [DOI] [PubMed] [Google Scholar]
  35. Parent A, Hazrati LN (1995) Functional anatomy of the basal ganglia: II. The place of subthalamic nucleus and external pallidum in basal ganglia circuitry. Brain Res Brain Res Rev 20:128–154. 10.1016/0165-0173(94)00008-D [DOI] [PubMed] [Google Scholar]
  36. Parsons TD, Rogers SA, Braaten AJ, Woods SP, Tröster AI (2006) Cognitive sequelae of subthalamic nucleus deep brain stimulation in Parkinson's disease: a meta-analysis. Lancet Neurol 5:578–588. 10.1016/S1474-4422(06)70475-6 [DOI] [PubMed] [Google Scholar]
  37. Phillips L, Litcofsky KA, Pelster M, Gelfand M, Ullman MT, Charles PD (2012) Subthalamic nucleus deep brain stimulation impacts language in early Parkinson's disease. Sgambato-Faure V, ed. PLoS One 7:e42829. 10.1371/journal.pone.0042829 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Pinto S, Ozsancak C, Tripoliti E, Thobois S, Limousin-Dowsey P, Auzou P (2004) Treatments for dysarthria in Parkinson's disease. Lancet Neurol 3:547–556. 10.1016/S1474-4422(04)00854-3 [DOI] [PubMed] [Google Scholar]
  39. Redgrave P, Rodriguez M, Smith Y, Rodriguez-Oroz MC, Lehericy S, Bergman H, Agid Y, DeLong MR, Obeso JA (2010) Goal-directed and habitual control in the basal ganglia: implications for Parkinson's disease. Nat Rev Neurosci 11:760–772. 10.1038/nrn2915 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Rodriguez-Oroz MC, Rodriguez M, Guridi J, Mewes K, Chockkman V, Vitek J, DeLong MR, Obeso JA (2001) The subthalamic nucleus in Parkinson's disease: somatotopic organization and physiological characteristics. Brain 124:1777–1790. 10.1093/brain/124.9.1777 [DOI] [PubMed] [Google Scholar]
  41. Romanelli P, Heit G, Hill BC, Kraus A, Hastie T, Brontë-Stewart HM (2004) Microelectrode recording revealing a somatotopic body map in the subthalamic nucleus in humans with Parkinson disease. J Neurosurg 100:611–618. 10.3171/jns.2004.100.4.0611 [DOI] [PubMed] [Google Scholar]
  42. Romansky KV, Usunoff KG (1987) The fine structure of the subthalamic nucleus in the cat: II. Synaptic organization. Comparisons with the synaptology and afferent connections of the pallidal complex and the substantia nigra. J Hirnforsch 28:407–433. [PubMed] [Google Scholar]
  43. Romansky KV, Usunoff KG, Ivanov DP, Galabov GP (1979) Corticosubthalamic projection in the cat: an electron microscopic study. Brain Res 163:319–322. 10.1016/0006-8993(79)90359-7 [DOI] [PubMed] [Google Scholar]
  44. Sato F, Lavallée P, Lévesque M, Parent A (2000) Single-axon tracing study of neurons of the external segment of the globus pallidus in primate. J Comp Neurol 417:17–31. 10.1002/(SICI)1096-9861(20000131)417:1<17::AID-CNE2>3.0.CO;2-I [DOI] [PubMed] [Google Scholar]
  45. Schrock LE, Ostrem JL, Turner RS, Shimamoto SA, Starr PA (2009) The subthalamic nucleus in primary dystonia: single-unit discharge characteristics. J Neurophysiol 102:3740–3752. 10.1152/jn.00544.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Seal J, Commenges D (1985) A quantitative analysis of stimulus- and movement-related responses in the posterior parietal cortex of the monkey. Exp Brain Res 58:144–153. [DOI] [PubMed] [Google Scholar]
  47. Starr PA, Theodosopoulos PV, Turner R (2003) Surgery of the subthalamic nucleus: use of movement-related neuronal activity for surgical navigation. Neurosurgery 53:1146–1149; discussion 1149. 10.1227/01.NEU.0000088803.79153.05 [DOI] [PubMed] [Google Scholar]
  48. Theodosopoulos PV, Marks WJ Jr, Christine C, Starr PA (2003) Locations of movement-related cells in the human subthalamic nucleus in Parkinson's disease. Mov Disord 18:791–798. 10.1002/mds.10446 [DOI] [PubMed] [Google Scholar]
  49. Thura D, Cisek P (2017) The basal ganglia do not select reach targets but control the urgency of commitment. Neuron 95:1160–1170.e5. 10.1016/j.neuron.2017.07.039 [DOI] [PubMed] [Google Scholar]
  50. Toyomura A, Fujii T, Kuriki S (2015) Effect of an 8-week practice of externally triggered speech on basal ganglia activity of stuttering and fluent speakers. Neuroimage 109:458–468. 10.1016/j.neuroimage.2015.01.024 [DOI] [PubMed] [Google Scholar]
  51. Turkeltaub PE, Eden GF, Jones KM, Zeffiro TA (2002) Meta-analysis of the functional neuroanatomy of single-word reading: method and validation. Neuroimage 16:765–780. 10.1006/nimg.2002.1131 [DOI] [PubMed] [Google Scholar]
  52. Turner RS, Desmurget M (2010) Basal ganglia contributions to motor control: a vigorous tutor. Curr Opin Neurobiol 20:704–716. 10.1016/j.conb.2010.08.022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Watson P, Montgomery EB Jr (2006) The relationship of neuronal activity within the sensori-motor region of the subthalamic nucleus to speech. Brain Lang 97:233–240. 10.1016/j.bandl.2005.11.004 [DOI] [PubMed] [Google Scholar]
  54. Wichmann T, Bergman H, DeLong MR (1994) The primate subthalamic nucleus: I. Functional properties in intact animals. J Neurophysiol 72:494–506. 10.1152/jn.1994.72.2.494 [DOI] [PubMed] [Google Scholar]
  55. Zaghloul KA, Weidemann CT, Lega BC, Jaggi JL, Baltuch GH, Kahana MJ (2012) Neuronal activity in the human subthalamic nucleus encodes decision conflict during action selection. J Neurosci 32:2453–2460. 10.1523/JNEUROSCI.5815-11.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Zavala B, Tan H, Ashkan K, Foltynie T, Limousin P, Zrinzo L, Zaghloul K, Brown P (2016) Human subthalamic nucleus-medial frontal cortex theta phase coherence is involved in conflict and error related cortical monitoring. Neuroimage 137:178–187. 10.1016/j.neuroimage.2016.05.031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Ziegler W, Ackermann H (2017) Subcortical contributions to motor speech: phylogenetic, developmental, clinical. Trends Neurosci 40:458–468. 10.1016/j.tins.2017.06.005 [DOI] [PubMed] [Google Scholar]

Articles from The Journal of Neuroscience are provided here courtesy of Society for Neuroscience
