2025 Jun 25;230(6):105. doi: 10.1007/s00429-025-02968-8

Word onset tracking in neural responses of human basal ganglia nuclei

Arkan Al-Zubaidi 1,2, Inga M Schepers 1, Anne-Kathrin Beck 3,4, Kerstin Schwabe 2,3, Joachim Runge 3, Mahmoud Abdallat 5, Joachim K Krauss 2,3, Jochem W Rieger 1,2
PMCID: PMC12198293  PMID: 40560240

Abstract

Auditory speech is a highly relevant ecological stimulus with intrinsic temporal dynamics. At the cortical level, it is well established that neural responses contain information about the temporal dynamics of speech. Previous research further suggests that the basal ganglia also play a role in temporal processing. However, it is unknown whether neural responses in basal ganglia structures contain information about temporal speech structure. To address this question, we recorded local field potentials from the subthalamic nucleus (STN) and the globus pallidus internus (GPi) through contacts of deep brain stimulation (DBS) electrodes in patients with movement disorders (STN in Parkinson’s disease, GPi in Tourette syndrome or dystonia) while they listened to two-speaker speech streams consisting of a task-relevant and a task-irrelevant speech stream. Compared to baseline, amplitude modulations occurred in the beta-band (12–30 Hz) during listening to speech (word onset and duration) in both the STN and the GPi. Furthermore, the neural responses in both subcortical structures tracked word onsets during two-speech stream presentations, indicating that they code information about temporal speech structure. These findings provide further evidence that basal ganglia are involved in gating information about temporal speech structure relevant for task goal-directed processing.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00429-025-02968-8.

Keywords: Local field potentials, Deep brain stimulation, Envelope of beta-band oscillations, Speech processing

Introduction

Human speech is characterized by its dynamic temporal structure, which is reflected in the activity of cortical neural dynamics (Ding et al. 2015; Kayser et al. 2015; Kubanek et al. 2013; Nourski et al. 2009; Peelle et al. 2013; Golumbic et al. 2013; Zoefel et al., 2022; Tankus et al., 2012). Despite significant degradation of other features, speech remains intelligible due to its temporal structure (Shannon et al. 1995; Golumbic et al. 2012). While speech perception primarily involves cortical processes such as auditory feature extraction (in the auditory cortex) and integration with morphosyntactic information (in Broca’s and Wernicke’s areas; Fonteneau et al. 2015; Friederici 2002), there are more layers of complexity.

Beyond the cortex, the basal ganglia (BG) and their interconnected thalamus circuitry have emerged as additional players in speech processing (Lim et al. 2014; Schepers et al. 2017; Beck et al. 2020; Tankus et al. 2024a, b). Studies on patients with BG dysfunction who exhibit impairments in speech production and comprehension underscore the involvement of this circuitry in the complex domain of speech (Kempler and Lancker 2002; Volkmann et al. 1992; Silveri 2021; Tankus et al., 2019). However, the contributions of the BG to encoding information from multiple speech sources remain unclear. It is possible that the BG are involved in selecting relevant information from a complex auditory scene or in integrating information from different speakers.

The BG have been demonstrated to play a critical role in resolving competition among alternatives across motor, cognitive, and sensory domains (Albin et al. 1989; Alexander 1986; Bočková et al. 2011; Combs et al. 2015; Utter and Basso 2008; Van Schouwenburg et al. 2015; Brittain and Brown 2014; Schwartze et al. 2011; Singh et al., 2011; Beck et al. 2018), providing a basis for investigation of its involvement in speech processing. Recent insights have expanded our understanding of the BG’s contribution to temporal processing, rhythm, sub-lexical speech processing, and attentional gating (Al-Zubaidi et al. 2022; Grahn 2009; Krauzlis et al. 2014; Nozaradan et al. 2017; Schubotz 2001; Van Schouwenburg et al. 2015; Wiener et al. 2014). Pathological disturbances in time perception associated with BG disorders further underscore its role in temporal processing (Allman and Meck 2012; Schwartze et al. 2011, 2015).

This expanding body of evidence, including functional magnetic resonance imaging (fMRI) findings that demonstrate BG activation during regular temporal stimulation (Geiser et al. 2012; Grahn 2009) and electroencephalogram (EEG) studies that reveal altered tone sequence cortical processing in patients with BG lesions (Nozaradan et al. 2017; Schwartze et al. 2015), highlights the essential role of the BG in cognitive control and executive processes, particularly in manipulating sequence information (Moore et al. 2013; Macoir et al. 2013). Furthermore, the responsiveness of the BG to regular temporal stimulation in both non-musicians and musicians (Grahn and Brett 2007; Grahn 2009) underscores its importance in processing temporal structures. Therefore, it is consequential to explore whether the BG encodes information about the temporal structure of speech in order to elucidate the neural mechanisms underlying speech processing (Albin et al. 1989; Alexander 1986; Bočková et al. 2011; Combs et al. 2015; Utter and Basso 2008; Van Schouwenburg et al. 2015).

The precise role of the BG in processing temporal speech structure within complex auditory scenes, such as two-speaker speech streams, remains unclear (Lim et al. 2014), primarily due to the limited spatial and temporal resolution of non-invasive neuroimaging techniques like EEG, MEG, and fMRI. Despite these challenges, clinical studies of patients with Parkinson’s disease (PD) and other movement disorders have leveraged direct and invasive brain measurements, such as deep brain stimulation (DBS) electrodes implanted in the subthalamic nucleus (STN), the globus pallidus internus (GPi) and other structures (Lozano et al. 2019; Krauss et al. 2021). These invasive methods provide insights that complement non-invasive techniques, highlighting the BG’s role in motor control and selection processes (Klostermann et al. 2008; Beck et al. 2018; Williams et al. 2014). The connection between these findings and the BG’s involvement in speech processing has been further supported by studies on challenging auditory conditions (Barnaud et al. 2018; Skipper and Hasson 2017). The integration between motor and auditory cortices aligns with the intricate interplay between the motor cortex, responsible for motor recruitment, and the BG, known for their role in motor selection and initiation (Hikosaka et al. 2000; Nambu 2004; Turner & Desmurget 2010), and it furthermore underscores the multifaceted role of the BG, serving as a bridge between motor control mechanisms and the intricate landscape of speech fluency.

The present study investigates whether BG structures track word onsets during speech processing. Direct recordings of neural dynamics from the human STN and GPi via contacts of DBS electrodes have been shown to provide insight into the subcortical structures involved in sensory information processing (Airaksinen et al. 2011; Alam et al. 2015; Münte et al. 2017). We first acquired local field potential (LFP) recordings from the STN and the GPi of patients implanted with DBS electrodes while they listened to speech streams of two simultaneous speakers and performed a task focused on one of the speech streams. Then, word onset information from both speakers was extracted to determine each speech stream’s temporal structure. Prior research has demonstrated that word onsets facilitate lexical access and guide attention during speech processing (Astheimer and Sanders 2009; Li et al. 2014, 2017; Sanford et al. 2006). Finally, we utilized machine learning techniques to estimate the temporal response function (TRF; Crosse et al. 2016) to analyze linear temporal relations between the word onsets in the two speech streams and the neural signal. To obtain an empirical p-value for each electrode contact, a nonparametric bootstrap technique was employed to assess the statistical significance of the correlation coefficient between predicted and actual neural signals. This was achieved by resampling the stimulus feature vector with replacement 100 times for each speech stream. This data-driven approach is further supported by recent advances demonstrating that speech-related neuronal activity in subcortical structures such as the STN and thalamus can be decoded using machine learning methods, allowing for inference of speech content at the single-neuron level (Tankus et al. 2021, 2024a).

Although studies on the BG have indicated its involvement in temporal speech processing, it remains unclear whether BG structures specifically track word onsets during speech processing (Lim et al. 2014). Establishing a direct link between the neural dynamics of the STN, GPi, and the temporal structure of speech would provide compelling evidence for their role in this critical aspect of auditory processing. Therefore, the present study aims to explicitly test the hypothesis that BG structures, particularly the STN and GPi, actively contribute to tracking word onsets during speech processing.

Materials and methods

Participant information

The study was approved by the ethics committee of Hannover Medical School and informed consent was obtained from all participants. All participants were native German speakers. Nine patients suffering from PD who were implanted bilaterally in the STN participated in the experiment. One of the patients was removed from data analysis due to technical artifacts, resulting in eight STN data sets (STN1-STN8; 1 woman, age range 36–66, mean age 52, SD 11 years; seven participants were right-handed and one participant was ambidextrous). In addition, six patients suffering from either Tourette syndrome (GPi1-GPi3) or dystonia (GPi4-GPi6) who were implanted bilaterally in the GPi participated in the experiment (2 women, age range 19–59, mean age 38, SD 18 years; five participants were right-handed and one participant was left-handed). Demographic data of the individual patients are presented in Table 1.

Table 1.

Demographic data of the individual patients and overall task performance per individual. The last column refers to the percentage of correctly identified task words (collapsed across both words). PD: Parkinson’s disease; TS: Tourette syndrome

| Case (sex) | Age | Diagnosis | Target region | Handedness | LEDD/medication | Disease duration (years) | Speech/language/hearing impairments | UPDRS III (Off/On) | Correct responses (%) |
|---|---|---|---|---|---|---|---|---|---|
| STN 1 (m) | 57 | PD | Bilateral STN | R | 315 | 7 | None | 24/8 | 94 |
| STN 2 (m) | 36 | PD | Bilateral STN | R | 2360 | 8 | Hypophonic speech | 37/9 | 80 |
| STN 3 (m) | 40 | PD | Bilateral STN | R / L | 1710 | 7 | None | 53/24 | 88 |
| STN 4 (m) | 52 | PD | Bilateral STN | R | 400 | 2 | Hypophonic and slurred speech | 21/13 | 78 |
| STN 5 (f) | 51 | PD | Bilateral STN | R | 1197 | 10 | None | 26/18 | 95 |
| STN 6 (m) | 53 | PD | Bilateral STN | R | 1052 | 5 | None | 40/18 | 80 |
| STN 7 (m) | 64 | PD | Bilateral STN | R | 625 | 12 | None | 39/30 | 84 |
| STN 8 (m) | 66 | PD | Bilateral STN | R | 1137.5 | 10 | None | 33/21 | 64 |
| GPi 1 (f) | 19 | TS | Bilateral GPi | L | Aripiprazole, cannabidiol, delta-9-tetrahydrocannabinol | 2 | None | | 68 |
| GPi 2 (m) | 29 | TS | Bilateral GPi | R | Aripiprazole | 21 | None | | 53 |
| GPi 3 (m) | 19 | TS | Bilateral GPi | R | Aripiprazole | 7 | None | | 89 |
| GPi 4 (m) | 54 | Dystonia | Bilateral GPi | R | Metoprolol, amitriptyline, atorvastatin, citalopram, acetylsalicylic acid, ramipril | 5 | None | | 84 |
| GPi 5 (f) | 59 | Dystonia | Bilateral GPi | R | 250 | 13 | Severe dysarthria | | 96 |
| GPi 6 (m) | 47 | Dystonia | Bilateral GPi | R | Metoprolol, ramipril | 8 | Mild dysarthria | | 87 |

Experimental design, task, and stimuli

During the experiment, patients were seated in a chair facing a video monitor (ViewSonic VG930m, 1280 × 1024 resolution, 60 Hz frame rate) at a distance of approximately 64 cm. Sounds were presented with two loudspeakers on each side of the video monitor at a comfortable volume for the patients, ensuring comprehension. We used an adapted German version of the Coordinate Response Measure Speech Corpus (Bolia et al. 2000) to implement the two-speech stream paradigm. We have applied this paradigm previously during subcortical recordings from the centromedian-parafascicular complex (Schepers et al. 2017). The trial structure of the experimental paradigm and stimulus features (word onsets) are depicted in Fig. 1a and b, respectively. All sentences started with the expression Los geht’s (Let’s go) followed by a target word (Goethe, Lessing, Schiller, Heine). This was followed by one of four action expressions with similar meanings, such as drücke gleich (press now), and a color-number combination, such as grün und drei (green and three; Table 2). This structure resulted in sentences such as Los geht’s Goethe entscheide dich für blau und zwei (Let’s go Goethe choose blue and two). Importantly, in each experimental trial, two sentences were presented simultaneously and binaurally, one by a female speaker and one by a male speaker. Before each experimental block consisting of 24 sentence pairs, a target word (name) was presented in print on the video screen. The name indicated the target speaker that should be attended in the two-speaker speech stream because this speaker would later utter the task-relevant color and number word. The irrelevant, distracting speaker would always utter a different name (e.g., in the Goethe block, the distracting speaker always uttered Lessing). After the presentation of the sentences, an image of a keyboard (target image) appeared on the screen. This image indicated to the participant that she (or he) could now press the keyboard buttons corresponding to the color and the number word that the target speaker, who had uttered the target word, had pronounced. Each sentence contained one of four color words (green, blue, red, white) and one of four number words (two, three, four, five), and these words differed between the two speakers on each trial. Participants could start the subsequent trial by pressing the space bar. The keyboard image stayed on the computer screen until the participant pressed the space bar. A fixation cross then appeared on the screen and remained there until the next keyboard image appeared, indicating the response interval. All sentences were root-mean-square (RMS) normalized to the same volume level. Participants were presented with 96 trials in total, except for participants STN 1 and GPi 6, who were presented with 72 and 85 trials, respectively, for technical reasons.

Fig. 1.

Experimental setup and selected speech features. (a) Trial structure of the two-speech stream paradigm. On each trial, two speech streams (male and female speakers) were presented simultaneously and binaurally to the participant. Before each experimental block, a name was printed on the video screen, indicating that the speaker who utters this name will also provide task-relevant information. An image of a keyboard was presented at the end of the sentence, indicating to participants that they should now make their responses. Upon the appearance of the target image, the participant’s task was to press the buttons referring to the color (Task 1) and number (Task 2) uttered in the target speech stream. (b) The simplified stimulus features entered into the TRF analysis consisted of two binary vectors, each corresponding to a distinct speech stream (i.e., attended and unattended). These vectors comprised zeros and ones, with “ones” denoting the word onsets. The onset of the target word, that is, the speaker’s name (Goethe, Lessing, Schiller, or Heine), served as the initial point for differentiating the streams based on task relevance. For the random data analysis, the word onsets in both stimulus vectors were shuffled with the restriction that the distances between the word onsets were maintained

Table 2.

Stimulus material. Multi-speaker paradigm with task-relevant auditory and visual cues

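| | Target word | Action expression | Task 1 | | Task 2 |
|---|---|---|---|---|---|
| Los geht’s (Let’s go) | Goethe | drücke gleich (press now) | grün (green) | und (and) | zwei (two) |
| Los geht’s (Let’s go) | Schiller | entscheide dich für (choose) | rot (red) | und (and) | drei (three) |
| Los geht’s (Let’s go) | Lessing | wähle nun (choose now) | weiß (white) | und (and) | vier (four) |
| Los geht’s (Let’s go) | Heine | nimm diesmal (take this time) | blau (blue) | und (and) | fünf (five) |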

On each trial, two simultaneously spoken sentences were presented binaurally (see Fig. 1a); the two speakers were of different genders and uttered different combinations of the sentence elements. The target word was shuffled among the four blocks. The action expression and the two task words were shuffled among all sentences. English translations for the different expressions are shown in parentheses. Adapted from Schepers et al. (2017)

Electrode implantation and data acquisition

Patients were implanted bilaterally with quadripolar DBS electrodes (Medtronic 3387 or 3389, Minneapolis, MN, USA) in either the STN or GPi using computed tomography (CT)-guided stereotactic surgery and microelectrode recording. Contacts on DBS electrodes were longitudinally spaced at distances of 0.5 mm in the STN and 1.5 mm in the GPi (1.27 mm diameter, 1.5 mm length). A stereotactic head frame was attached to the patient’s head under local anesthesia. Based on stereotactic CT imaging combined with preoperative MRI, the anterior (AC) and posterior (PC) commissures were identified. Post-surgical stereotactic CT scans were performed to document the placement of DBS electrodes. Electrodes were externalized for a few days prior to implantation of the pacemaker. Details of the implantation procedure and postoperative imaging have been published elsewhere (Runge et al. 2022, 2023, 2024).

Signals were sampled continuously at a rate of 2 kHz with a Digitimer D360 (Digitimer Ltd., Welwyn Garden City, Hertfordshire, UK) and digitized through a 1401 A-D converter onto a computer using the Spike2 software (both Cambridge Electronic Design, UK). During the signal recording, a 50 Hz notch filter, a high-pass filter with a cutoff frequency of 0.5 Hz, and a low-pass filter with a cutoff frequency below the Nyquist frequency were applied. Electrode impedances were below 5 kΩ during the whole recording session.

Electrophysiological data preprocessing and analysis

The electrophysiological data of all patients were preprocessed and analyzed in MATLAB (MathWorks Inc., Natick, MA) using the open-source toolbox FieldTrip (Oostenveld et al. 2011) and customized scripts. The data concerning the minimum speech onset of the two speaker streams were examined. For automatic artifact rejection, trials were z-scored and all trials with a standard deviation ≥ 6 were removed from further analysis. This artifact criterion led to an average removal of 2.1 ± 2.4 trials (mean ± standard deviation). Of the remaining trials, only those in which participants had correctly performed the task (i.e., the two task words of the target speaker were correctly reported) were retained for further analyses. This behavioral criterion led to an overall removal of 35.39 ± 30.03 trials. Electrodes were re-referenced bipolarly to their neighboring contact (three bipolar channels per hemisphere: 0–1, 1–2, 2–3) to maximize spatial selectivity and to reduce the effects of volume conduction from distant sources (Schepers et al. 2017).
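The two preprocessing steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' FieldTrip pipeline. The function names are hypothetical, and the rejection rule is one possible reading of the criterion (a trial is dropped if any sample reaches |z| ≥ 6 after z-scoring across all trials); the exact rule is not specified in the text.

```python
import numpy as np

def bipolar_rereference(monopolar):
    """Subtract neighboring contacts of one lead (0-1, 1-2, 2-3).

    monopolar: array of shape (n_contacts, n_samples), contacts ordered 0..3.
    Returns an array of shape (n_contacts - 1, n_samples).
    """
    monopolar = np.asarray(monopolar, dtype=float)
    return monopolar[:-1] - monopolar[1:]

def reject_trials(trials, threshold=6.0):
    """Return a boolean mask of trials to keep.

    trials: array of shape (n_trials, n_samples) for one bipolar channel.
    Assumption: data are z-scored over all trials and samples, and a trial
    is rejected if any sample has |z| >= threshold.
    """
    trials = np.asarray(trials, dtype=float)
    z = (trials - trials.mean()) / trials.std()
    return np.abs(z).max(axis=1) < threshold
```

Because neighboring contacts share signals from distant sources, the subtraction removes this common component and leaves mainly locally generated activity.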

For the analysis of the neural responses, the data were transformed to time-frequency space using discrete prolate spheroidal (Slepian) sequences with 2 s of window length, followed by Fourier transformation. Four tapers were applied to a frequency window from 12 to 30 Hz. Power values (squared analytic amplitude) were calculated for each data point in the trial window. A baseline was selected from −1 to 0 s before the onset of the audio file. The average power over the baseline window was used to calculate the percent change for every data point relative to the baseline at the individual trial level. Percent change values were then averaged across trials for each time point.
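The baseline normalization can be sketched as follows. This is an illustration only: it assumes that power has already been computed as an array of shape (trials × time points), and the function name is hypothetical.

```python
import numpy as np

def percent_change(power, times, baseline=(-1.0, 0.0)):
    """Trial-wise percent change relative to baseline, averaged over trials.

    power: array of shape (n_trials, n_times) with beta-band power.
    times: array of shape (n_times,) in seconds relative to audio onset.
    baseline: (start, stop) of the baseline window in seconds.
    """
    power = np.asarray(power, dtype=float)
    times = np.asarray(times, dtype=float)
    in_baseline = (times >= baseline[0]) & (times < baseline[1])
    # One baseline value per trial, so each trial is normalized on its own.
    base = power[:, in_baseline].mean(axis=1, keepdims=True)
    pct = 100.0 * (power - base) / base
    # Average the normalized values across trials for each time point.
    return pct.mean(axis=0)
```

Normalizing each trial before averaging keeps a few high-power trials from dominating the mean.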

Time-frequency analysis

Time-frequency amplitude maps were generated for the frequency range of 3 to 100 Hz, masked at p < 0.05 (t-test to baseline, uncorrected), and examined across six electrode montages, three per hemisphere, for each participant (Fig. 2). The beta-band (12–30 Hz) exhibited the most consistent response across participants in comparison to other frequency bands, which is in line with a recent study demonstrating that the beta-band power of neural activity in the STN is associated with speech intelligibility (Avantaggiato et al. 2023). Moreover, Hovsepyan et al. (2023) emphasized the importance of beta-band oscillations in modulating prediction errors during speech processing, underscoring a top-down gating function that is crucial for complex auditory tasks. Furthermore, beta-band functional connectivity has been shown to support divided auditory attention during the processing of two simultaneous speech streams (Tóth et al. 2019) and is implicated in control processes (Engel and Fries 2010; Siegel et al. 2008). Thus, in light of these observations (Meyer 2018; Ortone et al. 2023), only the 12–30 Hz frequency range was selected for further investigation in the present study. For visualization purposes, uncorrected t-tests were used to threshold the maps shown in Fig. 2; however, all statistical analyses used for inference and interpretation were conducted using non-parametric tests based on a bootstrap technique.

Fig. 2.

Average time-frequency response power values (squared analytic amplitude) over all trials in representative STN and GPi contacts (left hemisphere, channel 2) with sustained neural responses. The frequency band from 12–30 Hz (beta-band range) was the most consistent response across participants compared to other bands. Data are averaged across trials. 0 s refers to the onset of the target word. A baseline from −500 to 0 ms before the speech stream onset, which occurred approximately 560 ms before the target word onset, was used

Word onset analysis

To determine whether the temporal structure of speech related to word onsets is represented in neural responses in the STN and GPi during sentence perception for both speech streams within a single encoding model, a regularized linear regression method was employed to estimate temporal response functions (TRFs). In this study, TRFs were used to characterize how the amplitude envelope of beta-band neural activity (12–30 Hz) encodes temporal information associated with word onsets. Beta-band oscillations were extracted from the neural signal, and their Hilbert envelope was computed to capture amplitude fluctuations over time. To ensure the robustness of the findings, TRFs were estimated separately for each participant and electrode contact using the Multivariate Temporal Response Function (mTRF) toolbox (Crosse et al. 2016), a method that predicts neural responses based on time-delayed multiple regression.
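A minimal sketch of the envelope extraction follows, using NumPy only. The filter design used in the study is not stated, so this example assumes an ideal (brick-wall) band-pass applied in the frequency domain and builds the analytic signal in the same step. The function name is hypothetical.

```python
import numpy as np

def beta_envelope(signal, fs, band=(12.0, 30.0)):
    """Amplitude envelope of the band-limited signal via the analytic signal.

    signal: 1-D array of LFP samples.
    fs: sampling rate in Hz.
    band: (low, high) pass band in Hz.
    """
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(x.size, d=1.0 / fs)
    # Keep positive frequencies inside the band and double them; all other
    # bins (including all negative frequencies) are set to zero. The inverse
    # transform is then the analytic signal of the band-passed trace.
    weights = np.where((freqs >= band[0]) & (freqs <= band[1]), 2.0, 0.0)
    analytic = np.fft.ifft(spectrum * weights)
    return np.abs(analytic)
```

For real recordings, a filter with smoother band edges would usually be preferred, since a brick-wall filter can introduce ringing.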

For the TRF estimation, two binary vectors, one for each speech stream (attended and unattended), served as stimulus feature vectors, consisting of zeros and ones. Here, “ones” marked the word onsets, beginning with the target word onset. The target word onset (Goethe, Lessing, Schiller, Heine) provided an initial point for differentiating the streams based on task relevance (see Fig. 1b for the experimental design). The TRFs were then applied as convolution kernels with a lag window of 0–700 ms to the word onset time series to predict the neural responses.
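The stimulus coding and the time-lagged regression can be sketched in NumPy as follows. This is not the mTRF toolbox code (the toolbox is written in MATLAB). The function names, the sampling rate in the usage note, and the ridge parameter are illustrative assumptions, and the neural response is assumed to be mean-centered so that no intercept is needed.

```python
import numpy as np

def onset_vector(onset_times_s, n_samples, fs):
    """Binary vector with ones at the samples where a word starts."""
    vec = np.zeros(n_samples)
    idx = np.round(np.asarray(onset_times_s, dtype=float) * fs).astype(int)
    vec[idx[(idx >= 0) & (idx < n_samples)]] = 1.0
    return vec

def lagged_design(stimuli, n_lags):
    """Stack time-shifted copies of each stimulus feature.

    stimuli: array of shape (n_samples, n_features), e.g. the attended and
    unattended onset vectors as two columns.
    Returns shape (n_samples, n_features * n_lags); the column for feature f
    and lag k is the feature delayed by k samples.
    """
    stimuli = np.asarray(stimuli, dtype=float)
    n_samples, n_features = stimuli.shape
    X = np.zeros((n_samples, n_features * n_lags))
    for f in range(n_features):
        for lag in range(n_lags):
            X[lag:, f * n_lags + lag] = stimuli[:n_samples - lag, f]
    return X

def fit_trf(X, y, ridge=1.0):
    """Ridge-regularized least squares: w = (X'X + ridge * I)^-1 X'y."""
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(X.shape[1]), X.T @ y)
```

At an assumed sampling rate of 100 Hz, a lag window of 0–700 ms corresponds to `n_lags = 71`. The fitted weights, reshaped to (features × lags), are the TRF of each speech stream, and `X @ w` is the predicted neural response.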

To evaluate the model’s performance, cross-validation and bootstrap techniques were employed. First, the stimulus feature vector was resampled with replacement 100 times for each speech stream, yielding 100 resampled vectors from which the variability of the estimate could be inferred. The bootstrap procedure was constrained so that the order of the word onsets and their intervals in the resampled data remained consistent with that in the original data. Second, for each electrode contact, the TRFs were estimated from the resampled vectors (attended and unattended) and the corresponding neural data using five-fold cross-validation. This ensured that no test data were used in the model estimation, thus avoiding any potential bias. Then, Pearson correlation coefficients were calculated between the predicted and actual neural responses across the five folds, with the resulting values averaged to assess the model’s performance. This procedure was repeated 100 times to create a bootstrap distribution around the mean correlation, utilizing the speech’s actual temporal structure (Fig. 3, red distribution).
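The cross-validated model evaluation can be sketched as follows. The example assumes a design matrix `X` (time-lagged stimulus features) and a response `y` (beta envelope). It splits the samples into five contiguous blocks, whereas the study may have split by trials. All names and the ridge value are illustrative.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def cross_validated_r(X, y, n_folds=5, ridge=1.0):
    """Mean test-set correlation of a ridge model over contiguous folds."""
    n = len(y)
    scores = []
    for test_idx in np.array_split(np.arange(n), n_folds):
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        Xtr, ytr = X[train], y[train]
        # Fit on the training folds only; the held-out fold is never seen.
        w = np.linalg.solve(Xtr.T @ Xtr + ridge * np.eye(X.shape[1]),
                            Xtr.T @ ytr)
        scores.append(pearson_r(X[test_idx] @ w, y[test_idx]))
    return float(np.mean(scores))
```

Repeating this call for each resampled stimulus vector would yield one mean correlation per repetition, which together form the bootstrap distribution described above.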

Fig. 3.

Word onsets tracking in STN and GPi. The illustration depicts exemplary significant tracking of word onsets in both STN (top) and GPi (bottom) contacts, particularly from left hemisphere channel 2. Box plots display the median, interquartile range, and range of the middle 50% of the data within the reference distribution. The x-axis represents correlation coefficients between actual and predicted neural responses, with red distributions indicating data resampled using bootstrap techniques. Blue distributions represent correlation coefficients generated from shuffled temporal structures of the speech, serving as a random reference. The y-axis represents the proportion of occurrences of correlation coefficient values within the distribution

To assess statistical significance, the stimulus feature vector was randomly resampled 500 times for each speech stream. The positions of the word onset intervals were shuffled randomly to create a randomized temporal structure of the speech. As described above, a cross-validation scheme was employed to generate a reference distribution of Pearson correlation coefficients for the randomized temporal structure, which served as a baseline for comparison (Fig. 3, blue distributions). This allowed us to estimate an empirical p-value for the mean Pearson correlation coefficient of each actual (real) distribution and thus to test whether it differed significantly from chance level.
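One way to build such a surrogate and derive an empirical p-value is sketched below. Both functions are illustrative assumptions. The shuffle permutes the order of the inter-onset intervals while keeping the first onset fixed, and the p-value is one-sided with the common +1 correction. The text does not state which convention the study used.

```python
import numpy as np

def shuffle_onset_intervals(onset_vec, rng):
    """Permute the order of inter-onset intervals of a binary onset vector.

    The set of intervals (and hence the number of onsets and the position
    of the first and last onset) is preserved; only their order changes.
    """
    onset_vec = np.asarray(onset_vec)
    idx = np.flatnonzero(onset_vec)
    if idx.size < 2:
        return onset_vec.copy()
    shuffled = rng.permutation(np.diff(idx))
    new_idx = idx[0] + np.concatenate(([0], np.cumsum(shuffled)))
    out = np.zeros_like(onset_vec)
    out[new_idx] = 1
    return out

def empirical_p(observed, null_values):
    """One-sided empirical p-value of an observed correlation."""
    null_values = np.asarray(null_values, dtype=float)
    return (1 + np.sum(null_values >= observed)) / (1 + null_values.size)
```

With 500 surrogate correlations, the smallest attainable p-value under this convention is 1/501.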

Finally, the number of individual contact electrodes with a significant empirical p-value (p < 0.05) for the 48 STN contacts (2 hemispheres * 3 contacts * 8 participants) or the 36 GPi contacts (2 hemispheres * 3 contacts * 6 participants) was calculated. This calculation determined the percentage consistency of tracking the temporal structure (word onsets) of both speech streams.

Speech stream separation analysis

In order to ascertain whether the temporal structure of attended or unattended speech related to word onsets is represented in neural responses (envelope of beta oscillation) in the STN and GPi, a linear regression model was employed again to estimate the TRFs for each speech stream separately. Subsequently, the number of individual contact electrodes with a significant empirical p-value for attended or unattended speech terms was calculated, as detailed in Sect. 2.6. The results indicated that neuronal activity in both regions can represent features of both speech streams without a consistent preference for either speaker across contacts. The full details of the results of this analysis are provided in the supplementary material.

Results

Behavioral performance

On average (Table 1, last column), the two task words were correctly identified in 81% of the trials (SD: 12%, range: 53–96%), indicating that participants attended to the speech stimuli and performed well above the chance level of 6.25% correct guesses per trial (one of 16 possible color-number combinations).
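These summary values can be reproduced from the last column of Table 1 with a few lines of Python. The variable names are illustrative, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Correct responses (%) from Table 1, STN 1-8 followed by GPi 1-6.
correct_pct = [94, 80, 88, 78, 95, 80, 84, 64, 68, 53, 89, 84, 96, 87]

# Chance level: four color words times four number words per sentence.
n_colors, n_numbers = 4, 4
chance_pct = 100.0 / (n_colors * n_numbers)

summary = {
    "mean": mean(correct_pct),    # about 81.4
    "sd": stdev(correct_pct),     # sample SD, about 12.4
    "min": min(correct_pct),
    "max": max(correct_pct),
    "chance": chance_pct,
}
print(summary)
```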

Word-onset interval distribution

We analyzed the distribution of word-onset intervals to characterize the temporal structure of the speech stimuli. These intervals exhibited substantial variability across trials (male speaker: mean = 353 ms, SD = 134 ms; female speaker: mean = 411 ms, SD = 137 ms), corresponding to several cycles of beta-band activity. The histograms of word-onset intervals across all 96 trials are presented in Figure S1 (Supplementary Material). This variability confirms that speech onsets occurred irregularly, providing a suitable structure for assessing beta-band envelope modulations in response to naturalistic input.

Time-frequency responses in the STN and GPi to continuous speech

Beta-band range responses were observed in both STN and GPi, with all participants displaying significant power changes compared to baseline in the 12–30 Hz frequency range during the presentation of the two-speech streams. The time-frequency responses for representative contacts in STN and GPi are illustrated in Fig. 2, masked at p < 0.05 (t-test to baseline, uncorrected). The power could remain constant or vary throughout the sentence presentation compared to the baseline. Some contacts showed sustained responses for the entire speech stream presentation (mean duration = 3.18 s), while others only exhibited significant power changes for a portion of the speech stimulation.

Word onsets tracking in STN and GPi

Approximately half of the contacts in the STN and GPi demonstrated significant word onset tracking (Pearson correlation coefficients) in the 5-fold cross-validation. Specifically, 28 out of 48 STN contacts (58%) and 16 out of 36 GPi contacts (44%) exhibited significant word onset tracking in comparison to the reference distribution (p < 0.05). Figure 3 shows an example of the Pearson correlation coefficient distribution of actual (real; red distribution) data and random data (blue distribution) for representative individual STN and GPi contacts. These results indicate that a majority of STN contacts and a substantial proportion of GPi contacts tracked word onsets during speech perception.
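The reported proportions follow directly from the contact counts. A short check, with illustrative variable names:

```python
# (significant contacts, total contacts) per target structure.
significant = {"STN": (28, 48), "GPi": (16, 36)}

def percent(k, n):
    """Share of k out of n, in percent."""
    return 100.0 * k / n

per_structure = {name: percent(k, n) for name, (k, n) in significant.items()}
total_sig = sum(k for k, _ in significant.values())
total_all = sum(n for _, n in significant.values())
overall = percent(total_sig, total_all)  # 44 of 84 contacts

print(per_structure, overall)
```

Across both structures, 44 of 84 contacts (about 52%) were significant, which matches the statement that approximately half of the contacts tracked word onsets.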

Discussion

Our investigation delves into the intricate role of the BG in temporal processing, specifically focusing on its contribution to encoding the temporal structure of speech. The results demonstrate that the neural responses in the STN and the GPi track word onsets during speech perception in continuous speech streams.

A substantial body of evidence from both human and animal studies has accumulated, suggesting the importance of the BG in temporal prediction and the processing of non-speech stimuli (Chiba et al. 2008; Gouvêa et al. 2015; Grahn and Rowe 2013; Kotz and Schmidt-Kassow 2015; Matell et al. 2003; Mello et al. 2015; Schwartze et al. 2015). Existing experiments utilizing non-invasive methods, such as EEG and fMRI, have touched upon the question of temporal processing in the BG (Geiser et al. 2012; Grahn 2009; Grahn and Rowe 2013; Kotz and Schmidt-Kassow 2015; Schwartze et al. 2015; Teki et al. 2011). Our study advances this understanding by employing direct electrophysiological recordings in human BG nuclei, establishing a direct link between neural responses in the beta frequency range (12–30 Hz) and the temporal structure of presented speech input. This direct relationship provides a perspective not achievable through non-invasive methods, significantly expanding our comprehension of the neural underpinnings of temporal speech processing in the human brain.

In a previous study, Gulberti et al. (2015) conducted an EEG investigation involving patients with PD and healthy controls. The results of this study indicated a reduction in cortical beta-band oscillations in patients with PD that was correlated with predictive timing information. This finding suggests that individuals with PD experience impaired predictive timing in the cortex. Subsequent DBS targeting the STN was observed to adjust these oscillations to resemble those in healthy individuals. While this suggests a role for STN beta-band activity in supporting aspects of temporal processing, the extent to which these oscillations contribute specifically to sensory stimulus processing, such as the processing of word onsets, remains unclear. On the other hand, sensory inputs play a significant role in guiding attention by predicting upcoming events, and BG-centered circuits contribute to value-based decision-making (Krauzlis et al. 2014). Building upon these findings, our study demonstrates that the neural responses of BG structures reflect the temporal structure of speech.

Further insights into the BG’s role in attentional processing are provided by studies such as that by McNab and Klingberg (2008), which demonstrated a correlation between GPi activation and the storage of relevant information in working memory. GPi, as the primary output structure of the BG, might act as a gatekeeper for attended and task-relevant stimuli, enhancing stimulus processing at the cortical level. From a temporal perspective, GPi could potentially gate relevant time windows to the cortex, thus optimizing sensory stimulus processing within those temporal windows. Furthermore, GPi inactivation studies indicate its role in influencing the speed and intensity of movements without affecting goal selection (Desmurget and Turner 2008, 2010; Thura and Cisek 2017). This proposition aligns with computational models that propose the BG’s involvement in resolving conflicts between competing streams, serving as an information-routing device among cortical regions and determining the state of cortical regions (Stocco et al. 2010).

Recent research findings offer compelling evidence that, despite the motor and oscillatory dysfunctions that are associated with Parkinson’s disease, dystonia and Tourette syndrome, neuronal data from patients with these conditions can still be utilized for language and speech processing tasks (Cai et al. 2024; Yokoi et al. 2023; Schepers et al. 2017; Beck et al. 2020). Beta-band (12–30 Hz) oscillations, which are generally enhanced in Parkinson’s disease, have been implicated in a number of cognitive tasks, including speech processing and attention (Sörös et al. 2017; Jensen and Bonnefond 2013). Although these disorders affect motor control, the capacity to perceive speech remains intact due to augmented theta and beta-band activity, which enhances auditory processing and attention, respectively (Schepers et al. 2017; Beck et al. 2020). In our multi-speaker paradigm, patients were able to recall the target word, with an average accuracy rate of 81% in their responses. In summary, in conjunction with the present findings, these studies suggest that language and speech processing remain largely intact in these disorders and that distinctive neural signatures can be leveraged to analyze speech-related cognitive functions.

Future studies should consider simultaneous recordings from subcortical and cortical regions to elucidate further the BG’s role in speech processing and its relationship with cortical temporal processing. Building on this foundation, our study underscores the importance of investigating the precise temporal features of speech that are accurately mirrored in BG neural activity to enhance our understanding of the neural circuitry underlying speech processing.

One limitation of most LFP studies, including our own, is the selection of electrode contacts (channels) for analysis (Beck et al. 2020). In the present study, no channel selection was made for the LFP. Instead, we used bootstrap and randomization techniques to examine all montages with varying electrode contacts (0–1, 1–2, 2–3). Such an analysis makes no a priori assumptions regarding the electrode contact at which task effects manifest while remaining sensitive to differences between electrode contacts and to effects across participants (Alonso-Prieto et al. 2015). Bootstrapping in this context offers several advantages over parametric approaches for estimating the reliability of brain activity (Fabiani et al. 1998). First, parametric statistics typically require strong assumptions about the data distribution. However, verifying these assumptions can be difficult, especially with limited sample sizes, where violations frequently occur. Second, data collection from patients with DBS is constrained, making it challenging to obtain large samples. Additionally, there may be inherent limitations to specific experimental paradigms or procedures.
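
To illustrate the kind of distribution-free estimate discussed above, the following is a minimal sketch of a percentile bootstrap for the mean of a small sample. It is not the analysis pipeline of this study: the function name `bootstrap_mean_ci`, the number of resamples, and the accuracy values are hypothetical and chosen only for illustration.

```python
import numpy as np

def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Draw n_boot resamples with replacement; each row is one resample.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    boot_means = values[idx].mean(axis=1)
    # The interval is read directly from the resampled means,
    # so no distributional form is assumed for the data.
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return values.mean(), lo, hi

# Hypothetical prediction accuracies, one value per bipolar montage
# (0-1, 1-2, 2-3) pooled over recordings; not data from the study.
acc = np.array([0.04, 0.06, 0.03, 0.05, 0.07, 0.02, 0.05, 0.04, 0.06])
mean_acc, ci_lo, ci_hi = bootstrap_mean_ci(acc)
print(f"mean = {mean_acc:.3f}, 95% CI = [{ci_lo:.3f}, {ci_hi:.3f}]")
```

Because every montage contributes to the resampling, no contact has to be selected beforehand, which is the property the paragraph above emphasizes.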

The variability in word-onset intervals demonstrates that the speech stimuli lacked strict rhythmicity, indicating that beta-band tracking in STN and GPi cannot be attributed to entrainment by periodic input. Instead, our findings suggest that beta-band envelope fluctuations reflect dynamic neural encoding of temporally structured speech. This interpretation is further supported by our randomization analysis, which showed no predictive power when the temporal structure of the speech was disrupted, highlighting the importance of the specific timing of word onsets in shaping neural responses. Moreover, the TRF analysis employed a lag window of 0–700 ms, aligning with the timescale of word-onset variability and with the expected latencies of cortical and subcortical neural responses. This lag structure enables the model to capture temporally specific neural tracking of speech without relying on assumptions of rhythmic entrainment.
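
The logic of a lagged encoding model combined with a randomization test can be sketched as follows. This is an illustrative simulation under stated assumptions, not the implementation used in this study: the sampling rate, the ridge parameter, the simulated response kernel, and all function names are hypothetical, and a plain closed-form ridge regression stands in for a dedicated TRF toolbox. The simulated `response` plays the role of a beta-band envelope, and word onsets are drawn with irregular intervals so that no periodic input is present.

```python
import numpy as np

def lag_matrix(stim, n_lags):
    """Column k holds the stimulus delayed by k samples (zero-padded at the start)."""
    X = np.zeros((stim.size, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[:stim.size - k]
    return X

def fit_ridge(X, y, lam):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def prediction_accuracy(stim, response, n_lags, lam=1.0):
    """Fit on the first half; return Pearson r on the held-out second half."""
    X = lag_matrix(stim, n_lags)
    half = stim.size // 2
    w = fit_ridge(X[:half], response[:half], lam)
    return np.corrcoef(X[half:] @ w, response[half:])[0, 1]

rng = np.random.default_rng(1)
fs = 100                        # assumed sampling rate in Hz
n = 60 * fs                     # 60 s of simulated data
n_lags = round(0.7 * fs) + 1    # lags covering 0-700 ms

# Irregular word onsets: intervals drawn between 0.2 and 0.6 s (assumption).
onset_idx = (np.cumsum(rng.uniform(0.2, 0.6, size=400)) * fs).astype(int)
onset_idx = onset_idx[onset_idx < n]
stim = np.zeros(n)
stim[onset_idx] = 1.0

# Simulated response: onsets convolved with a hypothetical kernel plus noise.
lags_s = np.arange(n_lags) / fs
kernel = np.exp(-((lags_s - 0.2) ** 2) / (2 * 0.05 ** 2))
response = np.convolve(stim, kernel)[:n] + rng.normal(0.0, 0.5, n)

r_obs = prediction_accuracy(stim, response, n_lags)

# Randomization: permuting the onset vector keeps the number of onsets
# but destroys their timing relative to the response.
n_perm = 100
null_r = np.array([prediction_accuracy(rng.permutation(stim), response, n_lags)
                   for _ in range(n_perm)])
p_value = (1 + np.sum(null_r >= r_obs)) / (1 + n_perm)
print(f"r_obs = {r_obs:.2f}, p = {p_value:.3f}")
```

In this sketch the model predicts held-out data only when the true onset timing is used, which mirrors the reasoning that tracking depends on the specific timing of word onsets rather than on rhythmic entrainment.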

Timing disruptions in speech-related STN activity have been previously reported in PD patients with speech disorders (Tankus et al. 2021). These alterations manifested as delayed and more variable neuronal responses, varying as a function of task (production, perception, or imagery) and response type (planning vs. feedback). The reported heterogeneity underscores the complex and context-sensitive nature of STN dynamics in speech processing. Although two PD patients had documented mild speech disorders (hypophonia), these were not clinically prominent features within the cohort, as evidenced by their percentage of correctly identified task words from two speaker streams (accuracy: 80% and 87%). Thus, combining the patients’ clinical profile with their percentage of correct responses enables investigation of STN dynamics in relatively intact speech function. Moreover, our approach differs methodologically, employing temporal response function (TRF) modeling to quantify prediction accuracy over time-lagged windows. Although Tankus et al. (2021) did not use TRF-based methods, their findings suggest that increased response variability, particularly in feedback-related neurons, could adversely affect model predictability. While longer TRF lags can accommodate slower neural responses, substantial temporal variability may still constrain model performance. This suggests a critical need for future studies linking trial-to-trial variability with modeling outcomes. Additionally, while the study by Tankus et al. (2021) focused exclusively on PD patients, our cohort included individuals with PD, dystonia, and Tourette syndrome. However, it is important to note that STN recordings in our study were obtained exclusively from PD patients; none were obtained from patients diagnosed with dystonia or Tourette syndrome. Future studies that integrate STN recordings across the latter patient populations may elucidate both convergent and divergent temporal dynamics in STN activity. This could be an important direction for research that extends beyond the scope of the present study.

Conclusion

Our study lends further support to the concept that the BG are involved in temporal processing, including complex stimuli such as speech patterns. However, the specific temporal features of speech streams mirrored in BG neural activity to enhance temporal processing in cortico-basal ganglia-thalamocortical circuits remain unknown. Our findings underscore the importance of further investigating the BG’s involvement in speech processing, emphasizing the need to also consider subcortical structures to understand the neural circuitry underlying speech processing.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (474.1KB, docx)

Acknowledgements

We thank all patients and their families whose cooperation made this research possible. We also thank Marc René Schädler and Bernd T. Meyer for phonetically characterizing the sentences. We thank Helge Ahrens for preparing the first analysis.

Abbreviations

STN

Subthalamic nucleus

GPi

Globus pallidus internus

DBS

Deep brain stimulation

LFP

Local field potential

TRF

Temporal response function

Author contributions

I.M. S., AK. B., K. S., M. A. and J. K. K. contributed to the study conception and design. Material preparation and data collection were performed by I.M. S., AK. B., K. S., M. A., J.R. and J. K. K. Data analysis was performed by A.A. and J.W. R. The first draft of the manuscript was written by A.A. and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy (EXC 2177/1, Project ID 390895286).

Data availability

No datasets were generated or analysed during the current study.

Declarations

Ethics approval

All procedures performed in the study were in accordance with the local ethics committee (Medical School of Hanover, Germany) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Arkan Al-Zubaidi, Email: arkan.al-zubaidi@uni-oldenburg.de.

Jochem W. Rieger, Email: jochem.rieger@uni-oldenburg.de

References

  1. Airaksinen K, Mäkelä JP, Taulu S, Ahonen A, Nurminen J, Schnitzler A, Pekkonen E (2011) Effects of DBS on auditory and somatosensory processing in Parkinson’s disease. Human Brain Mapping. 10.1002/hbm.21096. [DOI] [PMC free article] [PubMed]
  2. Al-Zubaidi A, Bräuer S, Holdgraf CR, Schepers IM, Rieger JW (2022) Sublexical cues affect degraded speech processing: insights from fMRI. Cerebral cortex communications, 3(1), tgac007 [DOI] [PMC free article] [PubMed]
  3. Alam M, Schwabe K, Lütjens G, Capelle HH, Manu M, von Wrangel C, Müller-Vahl K, Schrader C, Scheinichen D, Blahak C, Heissler HE, Krauss JK (2015) Comparative characterization of single cell activity in the globus pallidus internus of patients with dystonia or Tourette syndrome. J. Neural Transm. Vienna Austria 122(5):687–699. 10.1007/s00702-014-1277-0. [DOI] [PubMed]
  4. Albin RL, Young AB, Penney JB (1989) The functional anatomy of basal ganglia disorders. Trends in Neurosciences. 10.1016/0166-2236(89)90074-X. [DOI] [PubMed]
  5. Alexander G (1986) Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience. 10.1146/annurev.neuro.9.1.357. [DOI] [PubMed]
  6. Allman MJ, Meck WH (2012) Pathophysiological distortions in time perception and timed performance. In Brain. 10.1093/brain/awr210 [DOI] [PMC free article] [PubMed]
  7. Alonso-Prieto E, Pancaroglu R, Dalrymple KA, Handy T, Barton JJ, Oruc I (2015) Temporal dynamics of the face familiarity effect: bootstrap analysis of single-subject event-related potential data. Cognitive neuropsychology, 32(5), 266–282 [DOI] [PubMed]
  8. Astheimer LB, Sanders LD (2009) Listeners modulate temporally selective attention during natural speech processing. Biol Psychol 80(1):23–34. [DOI] [PMC free article] [PubMed]
  9. Avantaggiato F, Farokhniaee A, Bandini A, Palmisano C, Hanafi I, Pezzoli G, Isaias IU (2023) Intelligibility of speech in Parkinson’s disease relies on anatomically segregated subthalamic beta oscillations. Neurobiology of Disease, 185, 106239 [DOI] [PubMed]
  10. Barnaud ML, Bessière P, Diard J, Schwartz JL (2018) Reanalyzing neurocognitive data on the role of the motor system in speech perception within COSMO, a Bayesian perceptuo-motor model of speech communication. Brain and Language. 10.1016/j.bandl.2017.12.003. [DOI] [PMC free article] [PubMed]
  11. Beck AK, Lütjens G, Schwabe K, Dengler R, Krauss JK, Sandmann P (2018) Thalamic and basal ganglia regions are involved in attentional processing of behaviorally significant events: evidence from simultaneous depth and scalp EEG. Brain Struct Function 223:461–474 [DOI] [PubMed] [Google Scholar]
  12. Beck AK, Sandmann P, Dürschmid S, Schwabe K, Saryyeva A, Krauss JK (2020) Neuronal activation in the human centromedian-parafascicular complex predicts cortical responses to behaviorally significant auditory events. NeuroImage 211:116583. 10.1016/j.neuroimage.2020.116583 [DOI] [PubMed] [Google Scholar]
  13. Bočková M, Chládek J, Jurák P, Halámek J, Baláž M, Rektor I (2011) Involvement of the subthalamic nucleus and globus pallidus internus in attention. Journal of Neural Transmission. 10.1007/s00702-010-0575-4. [DOI] [PubMed]
  14. Bolia RS, Nelson WT, Ericson MA, Simpson BD (2000) A speech corpus for multitalker communications research. The Journal of the Acoustical Society of America. 10.1121/1.428288. [DOI] [PubMed]
  15. Brittain JS, Brown P (2014) Oscillations and the basal ganglia: Motor control and beyond. In NeuroImage. 10.1016/j.neuroimage.2013.05.084 [DOI] [PMC free article] [PubMed]
  16. Cai W, Young CB, Yuan R, Lee B, Ryman S, Kim J, Menon V (2024) Subthalamic nucleus–language network connectivity predicts dopaminergic modulation of speech function in Parkinson’s disease. Proceedings of the National Academy of Sciences, 121(22), e2316149121 [DOI] [PMC free article] [PubMed]
  17. Chiba A, Oshio KI, Inase M (2008) Striatal neurons encoded temporal information in duration discrimination task. Experimental Brain Research. 10.1007/s00221-008-1347-3. [DOI] [PubMed]
  18. Combs HL, Folley BS, Berry DTR, Segerstrom SC, Han DY, Anderson-Mooney AJ, Walls BD, van Horne C (2015) Cognition and Depression Following Deep Brain Stimulation of the Subthalamic Nucleus and Globus Pallidus Pars Internus in Parkinson’s Disease: A Meta-Analysis. In Neuropsychology Review. 10.1007/s11065-015-9302-0 [DOI] [PubMed]
  19. Crosse MJ, Di Liberto GM, Bednar A, Lalor EC (2016) The multivariate temporal response function (mTRF) toolbox: A MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in Human Neuroscience. 10.3389/fnhum.2016.00604. [DOI] [PMC free article] [PubMed]
  20. Desmurget M, Turner RS (2008) Testing basal ganglia motor functions through reversible inactivations in the posterior internal globus pallidus. Journal of Neurophysiology. 10.1152/jn.01010.2007. [DOI] [PMC free article] [PubMed]
  21. Desmurget M, Turner RS (2010) Motor sequences and the basal ganglia: kinematics, not habits. Journal of Neuroscience. 10.1523/JNEUROSCI.0163-10.2010. [DOI] [PMC free article] [PubMed]
  22. Ding N, Melloni L, Zhang H, Tian X, Poeppel D (2015) Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience. 10.1038/nn.4186. [DOI] [PMC free article] [PubMed]
  23. Engel AK, Fries P (2010) Beta-band oscillations—signalling the status quo? Current opinion in neurobiology, 20(2), 156–165 [DOI] [PubMed]
  24. Fabiani M, Gratton G, Corballis PM, Cheng J, Friedman D (1998) Bootstrap assessment of the reliability of maxima in surface maps of brain activity of individual subjects derived with electrophysiological and optical methods. Behavior Research Methods, Instruments, & Computers, 30(1), 78–86
  25. Fonteneau E, Bozic M, Marslen-Wilson WD (2015) Brain network connectivity during language comprehension: interacting linguistic and perceptual subsystems. Cerebral Cortex. 10.1093/cercor/bhu283. [DOI] [PMC free article] [PubMed]
  26. Friederici AD (2002) Towards a neural basis of auditory sentence processing. In Trends in Cognitive Sciences. 10.1016/S1364-6613(00)01839-8 [DOI] [PubMed]
  27. Geiser E, Notter M, Gabrieli JDE (2012) A corticostriatal neural system enhances auditory perception through temporal context processing. Journal of Neuroscience. 10.1523/JNEUROSCI.5153-11.2012. [DOI] [PMC free article] [PubMed]
  28. Golumbic EMZ, Poeppel D, Schroeder CE (2012) Temporal context in speech processing and attentional stream selection: a behavioral and neural perspective. Brain Lang 122(3):151–161 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Golumbic EMZ, Ding N, Bickel S, Lakatos P, Schevon CA, McKhann GM, Goodman RR, Emerson R, Mehta AD, Simon JZ, Poeppel D, Schroeder CE (2013) Mechanisms underlying selective neuronal tracking of attended speech at a cocktail party. Neuron. 10.1016/j.neuron.2012.12.037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Gouvêa TS, Monteiro T, Motiwala A, Soares S, Machens C, Paton JJ (2015) Striatal dynamics explain duration judgments. ELife. 10.7554/eLife.11386. [DOI] [PMC free article] [PubMed]
  31. Grahn JA (2009) The role of the basal ganglia in beat perception: neuroimaging and neuropsychological investigations. Annals of the New York Academy of Sciences. 10.1111/j.1749-6632.2009.04553.x. [DOI] [PubMed]
  32. Grahn JA, Brett M (2007) Rhythm and beat perception in motor areas of the brain. Journal of Cognitive Neuroscience. 10.1162/jocn.2007.19.5.893. [DOI] [PubMed]
  33. Grahn JA, Rowe JB (2013) Finding and feeling the musical beat: striatal dissociations between detection and prediction of regularity. Cerebral Cortex. 10.1093/cercor/bhs083. [DOI] [PMC free article] [PubMed]
  34. Gulberti A, Hamel W, Buhmann C, Boelmans K, Zittel S, Gerloff C, Westphal M, Engel AK, Schneider TR, Moll CKE (2015) Subthalamic deep brain stimulation improves auditory sensory gating deficit in Parkinson’s disease. Clin Neurophysiol. 10.1016/j.clinph.2014.06.046 [DOI] [PubMed] [Google Scholar]
  35. Hikosaka O, Takikawa Y, Kawagoe R (2000) Role of the basal ganglia in the control of purposive saccadic eye movements. Physiological Reviews 80(3):953–978. 10.1152/physrev.2000.80.3.953. [DOI] [PubMed]
  36. Hovsepyan S, Olasagasti I, Giraud AL (2023) Rhythmic modulation of prediction errors: A top-down gating role for the beta-range in speech processing. PLOS Computational Biology, 19(11), e1011595 [DOI] [PMC free article] [PubMed]
  37. Jensen O, Bonnefond M (2013) Prefrontal alpha-and beta-band oscillations are involved in rule selection. Trends in cognitive sciences, 17(1), 10–12 [DOI] [PubMed]
  38. Kayser SJ, Ince RAA, Gross J, Kayser C (2015) Irregular speech rate dissociates auditory cortical entrainment, evoked responses, and frontal alpha. Journal of Neuroscience. 10.1523/JNEUROSCI.2243-15.2015. [DOI] [PMC free article] [PubMed]
  39. Kempler D, Van Lancker D (2002) Effect of speech task on intelligibility in dysarthria: A case study of Parkinson’s disease. Brain and Language. 10.1006/brln.2001.2602. [DOI] [PubMed]
  40. Klostermann F, Ehlen F, Vesper J, Nubel K, Gross M, Marzinzik F, Curio G, Sappok T (2008) Effects of subthalamic deep brain stimulation on dysarthrophonia in Parkinson’s disease. Journal of Neurology, Neurosurgery & Psychiatry, 79(5), 522–529 [DOI] [PubMed]
  41. Kotz SA, Schmidt-Kassow M (2015) Basal ganglia contribution to rule expectancy and temporal predictability in speech. Cortex. 10.1016/j.cortex.2015.02.021. [DOI] [PubMed]
  42. Krauss JK, Lipsman N, Aziz T, Boutet A, Brown P, Chang JW, Lozano AM (2021) Technology of deep brain stimulation: current status and future directions. Nature Reviews Neurology, 17(2), 75–87 [DOI] [PMC free article] [PubMed]
  43. Krauzlis RJ, Bollimunta A, Arcizet F, Wang L (2014) Attention as an effect not a cause. In Trends in Cognitive Sciences. 10.1016/j.tics.2014.05.008 [DOI] [PMC free article] [PubMed]
  44. Kubanek J, Brunner P, Gunduz A, Poeppel D, Schalk G (2013) The tracking of speech envelope in the human cortex. PLoS ONE. 10.1371/journal.pone.0053398. [DOI] [PMC free article] [PubMed]
  45. Li X, Lu Y, Zhao H (2014) How and when predictability interacts with accentuation in temporally selective attention during speech comprehension. Neuropsychologia. 10.1016/j.neuropsychologia.2014.09.020. [DOI] [PubMed]
  46. Li X, Zhang Y, Li L, Zhao H, Du X (2017) Attention is shaped by semantic level of event-structure during speech comprehension: an electroencephalogram study. Cognitive Neurodynamics. 10.1007/s11571-017-9442-4. [DOI] [PMC free article] [PubMed]
  47. Lim S-J, Fiez JA, Holt LL (2014) How may the basal ganglia contribute to auditory categorization and speech perception? Frontiers in Neuroscience. 10.3389/fnins.2014.00230. [DOI] [PMC free article] [PubMed]
  48. Lozano AM, Lipsman N, Bergman H, Brown P, Chabardes S, Chang JW, Krauss JK (2019) Deep brain stimulation: current challenges and future directions. Nature Reviews Neurology, 15(3), 148–160 [DOI] [PMC free article] [PubMed]
  49. Macoir J, Fossard M, Mérette C, Langlois M, Chantal S, Auclair-Ouellet N (2013) The role of basal ganglia in language production: Evidence from Parkinson’s disease. Journal of Parkinson’s disease, 3(3), 393–397 [DOI] [PubMed]
  50. Matell MS, Meck WH, Nicolelis MAL (2003) Interval timing and the encoding of signal duration by ensembles of cortical and striatal neurons. Behavioral Neuroscience. 10.1037/0735-7044.117.4.760. [DOI] [PubMed]
  51. McNab F, Klingberg T (2008) Prefrontal cortex and basal ganglia control access to working memory. Nature Neuroscience. 10.1038/nn2024. [DOI] [PubMed]
  52. Mello GBM, Soares S, Paton JJ (2015) A scalable population code for time in the striatum. Current Biology. 10.1016/j.cub.2015.02.036. [DOI] [PubMed]
  53. Meyer L (2018) The neural oscillations of speech processing and language comprehension: state of the art and emerging mechanisms. European Journal of Neuroscience, 48(7), 2609–2621 [DOI] [PubMed]
  54. Moore AB, Li Z, Tyner CE, Hu X, Crosson B (2013) Bilateral basal ganglia activity in verbal working memory. Brain and language, 125(3), 316–323 [DOI] [PubMed]
  55. Münte TF, Marco-Pallares J, Bolat S, Heldmann M, Lütjens G, Nager W, Müller-Vahl K, Krauss JK (2017) The human globus pallidus internus is sensitive to rewards– Evidence from intracerebral recordings. Brain Stimulation. 10.1016/j.brs.2017.01.004. [DOI] [PubMed]
  56. Nambu A (2004) A new dynamic model of the cortico-basal ganglia loop. Progress in brain research. 143:461–466. 10.1016/S0079-6123(03)43043-4. [DOI] [PubMed]
  57. Nourski KV, Reale RA, Oya H, Kawasaki H, Kovach CK, Chen H, Howard MA, Brugge JF (2009) Temporal envelope of time-compressed speech represented in the human auditory cortex. Journal of Neuroscience. 10.1523/JNEUROSCI.3065-09.2009. [DOI] [PMC free article] [PubMed]
  58. Nozaradan S, Schwartze M, Obermeier C, Kotz SA (2017) Specific contributions of basal ganglia and cerebellum to the neural tracking of rhythm. Cortex. 10.1016/j.cortex.2017.08.015. [DOI] [PubMed]
  59. Oostenveld R, Fries P, Maris E, Schoffelen JM (2011) FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience. 10.1155/2011/156869. [DOI] [PMC free article] [PubMed]
  60. Ortone A, Vergani AA, Ahmadipour M, Mannella R, Mazzoni A (2023) Dopamine depletion leads to pathological synchronization of distinct basal ganglia loops in the beta band. PLOS Computational Biology, 19(4), e1010645 [DOI] [PMC free article] [PubMed]
  61. Peelle JE, Gross J, Davis MH (2013) Phase-locked responses to speech in human auditory cortex are enhanced during comprehension. Cerebral Cortex. 10.1093/cercor/bhs118. [DOI] [PMC free article] [PubMed]
  62. Runge J, Nagel JM, Ascencao LC, Blahak C, Kinfe TM, Schrader C, Krauss JK (2022) Are transventricular approaches associated with increased hemorrhage? A comparative study in a series of 624 deep brain stimulation surgeries. Operative Neurosurgery, 23(2), e108–e113 [DOI] [PubMed]
  63. Runge J, Nagel JM, Schrader C, Blahak C, Weigel RE, Wolf ME, Krauss JK (2023) Microelectrode recording and hemorrhage in functional neurosurgery: a comparative analysis in a consecutive series of 645 procedures. Journal of Neurosurgery, 140(5), 1442–1450 [DOI] [PubMed]
  64. Runge J, Nagel JM, Blahak C, Kinfe TM, Heissler HE, Schrader C, Krauss JK (2024) Does temporary externalization of electrodes after deep brain stimulation surgery result in a higher risk of infection? Neuromodulation: Technology at the Neural Interface, 27(3), 565–571 [DOI] [PubMed]
  65. Sanford AJS, Sanford AJ, Molle J, Emmott C (2006) Shallow processing and attention capture in written and spoken discourse. Discourse Processes. 10.1207/s15326950dp4202_2.
  66. Schepers IM, Beck AK, Bräuer S, Schwabe K, Abdallat M, Sandmann P, Dengler R, Rieger JW, Krauss JK (2017) Human centromedian-parafascicular complex signals sensory cues for goal-oriented behavior selection. NeuroImage. 10.1016/j.neuroimage.2017.03.019. [DOI] [PubMed]
  67. Schubotz RI (2001) Interval and ordinal properties of sequences are associated with distinct premotor areas. Cerebral Cortex. 10.1093/cercor/11.3.210. [DOI] [PubMed]
  68. Schwartze M, Keller PE, Patel AD, Kotz SA (2011) The impact of basal ganglia lesions on sensorimotor synchronization, spontaneous motor tempo, and the detection of tempo changes. Behavioural Brain Research. 10.1016/j.bbr.2010.09.015. [DOI] [PubMed]
  69. Schwartze M, Stockert A, Kotz SA (2015) Striatal contributions to sensory timing: Voxel-based lesion mapping of electrophysiological markers. Cortex. 10.1016/j.cortex.2015.07.016. [DOI] [PubMed]
  70. Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M (1995) Speech recognition with primarily temporal cues. Science. 10.1126/science.270.5234.303. [DOI] [PubMed]
  71. Siegel M, Donner TH, Oostenveld R, Fries P, Engel AK (2008) Neuronal synchronization along the dorsal visual pathway reflects the focus of spatial attention. Neuron. 10.1016/j.neuron.2008.09.010. [DOI] [PubMed]
  72. Silveri MC (2021) Contribution of the cerebellum and the basal ganglia to Language production: speech, word fluency, and sentence construction—evidence from pathology. The Cerebellum, 20(2), 282–294 [DOI] [PMC free article] [PubMed]
  73. Singh A, Levin J, Mehrkens JH, Bötzel K (2011) Alpha frequency modulation in the human basal ganglia is dependent on motor task. European Journal of Neuroscience, 33(5), 960–967 [DOI] [PubMed]
  74. Skipper JI, Hasson U (2017) A core speech circuit between primary motor, somatosensory, and auditory cortex: Evidence from connectivity and genetic descriptions. BioRxiv
  75. Sörös P, Doñamayor N, Wittke C, Al-Khaled M, Brüggemann N, Münte TF (2017) Increase in beta-band activity during preparation for overt speech in patients with Parkinson’s disease. Frontiers in human neuroscience, 11, 371 [DOI] [PMC free article] [PubMed]
  76. Stocco A, Lebiere C, Anderson JR (2010) Conditional routing of information to the cortex: A model of the basal ganglia’s role in cognitive coordination. Psychological review, 117(2), 541 [DOI] [PMC free article] [PubMed]
  77. Tankus A, Fried I (2019) Degradation of neuronal encoding of speech in the subthalamic nucleus in Parkinson’s disease. Neurosurgery 84(2):378–387 [DOI] [PubMed] [Google Scholar]
  78. Tankus A, Fried I, Shoham S (2012) Structured neuronal encoding and decoding of human speech features. Nat Commun 3(1):1015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Tankus A, Solomon L, Aharony Y, Faust-Socher A, Strauss I (2021) Machine learning algorithm for decoding multiple subthalamic spike trains for speech brain–machine interfaces. J Neural Eng 18(6):066021 [DOI] [PubMed] [Google Scholar]
  80. Tankus A, Rosenberg N, Ben-Hamo O, Stern E, Strauss I (2024a) Machine learning decoding of single neurons in the thalamus for speech brain-machine interfaces. Journal of neural engineering, 21(3), 036009 [DOI] [PubMed]
  81. Tankus A, Lustig-Barzelay Y, Gurevitch G, Faust-Socher A, Strauss I (2024b) Neuronal encoding of speech features in the human thalamus in Parkinson’s disease and essential tremor patients. Neurosurgery, 94(2), 307–316 [DOI] [PubMed]
  82. Teki S, Grube M, Kumar S, Griffiths TD (2011) Distinct neural substrates of duration-based and beat-based auditory timing. Journal of Neuroscience. 10.1523/JNEUROSCI.5561-10.2011. [DOI] [PMC free article] [PubMed]
  83. Thura D, Cisek P (2017) The basal ganglia do not select reach targets but control the urgency of commitment. Neuron. 10.1016/j.neuron.2017.07.039 [DOI] [PubMed] [Google Scholar]
  84. Tóth B, Farkas D, Urbán G, Szalárdy O, Orosz G, Hunyadi L, Hajdu B, Kovács A, Szabó BT, Shestopalova LB, Winkler I (2019) Attention and speech-processing related functional brain networks activated in a multi-speaker environment. PLoS ONE 14(2):e0212754. 10.1371/journal.pone.0212754. [DOI] [PMC free article] [PubMed]
  85. Turner RS, Desmurget M (2010) Basal ganglia contributions to motor control: a vigorous tutor. Current opinion in neurobiology. 20(6):704–716. 10.1016/j.conb.2010.08.022. [DOI] [PMC free article] [PubMed]
  86. Utter AA, Basso MA (2008) The basal ganglia: An overview of circuits and function. In Neuroscience and Biobehavioral Reviews. 10.1016/j.neubiorev.2006.11.003 [DOI] [PubMed]
  87. Van Schouwenburg MR, den Ouden HEM, Cools R (2015) Selective attentional enhancement and inhibition of fronto-posterior connectivity by the basal ganglia during attention switching. Cerebral Cortex. 10.1093/cercor/bht345. [DOI] [PubMed]
  88. Volkmann J, Hefter H, Lange HW, Freund HJ (1992) Impairment of temporal organization of speech in basal ganglia diseases. Brain and Language. 10.1016/0093-934X(92)90108-Q. [DOI] [PubMed]
  89. Wiener M, Lee YS, Lohoff FW, Coslett HB (2014) Individual differences in the morphometry and activation of time perception networks are influenced by dopamine genotype. NeuroImage. 10.1016/j.neuroimage.2013.11.019. [DOI] [PMC free article] [PubMed]
  90. Williams NR, Foote KD, Okun MS (2014) Subthalamic nucleus versus globus pallidus internus deep brain stimulation: translating the rematch into clinical practice. Movement Disorders Clinical Practice. 10.1002/mdc3.12004. [DOI] [PMC free article] [PubMed]
  91. Yokoi K, Iribe Y, Kitaoka N, Tsuboi T, Hiraga K, Satake Y, Katsuno M (2023) Analysis of spontaneous speech in Parkinson’s disease by natural language processing. Parkinsonism & Related Disorders, 113, 105411 [DOI] [PubMed]
  92. Zoefel B, Kösem A (2022) Neural Dynamics in Speech Processing: Properties, Specificity and Open Questions



Articles from Brain Structure & Function are provided here courtesy of Springer
