Abstract
Rhythm, as a prominent characteristic of auditory experiences such as speech and music, is known to facilitate attention, yet its contribution to working memory (WM) remains unclear. Here, human participants temporarily retained a 12-tone sequence presented rhythmically or arrhythmically in WM and performed a pitch change-detection task. Behaviorally, rhythmic tone sequences yielded accuracy comparable to arrhythmic ones but faster response times and lower response boundaries in decision-making. Electroencephalographic recordings revealed that rhythmic sequences elicited enhanced non-phase-locked beta-band (16 Hz–33 Hz) and theta-band (3 Hz–5 Hz) neural oscillations during the sensory encoding and WM retention periods, respectively. Importantly, the two-stage neural signatures were correlated with each other and contributed to behavior. As beta-band and theta-band oscillations denote the engagement of motor systems and WM maintenance, respectively, our findings imply that rhythm facilitates auditory WM through intricate oscillation-based interactions between the motor and auditory systems that support predictive attention to auditory sequences.
Supplementary Information
The online version contains supplementary material available at 10.1007/s12264-024-01289-w.
Keywords: Rhythm, Working memory, Sensorimotor, Neural oscillation, Drift diffusion model
Introduction
Rhythm, referring to the structured organization of events in time, is one of the most prominent features of auditory communication signals such as speech and music [1]. In parallel, motor activities such as walking, dancing, vocalization, and saccades, also contain rich rhythmic characteristics [2–4]. Humans possess unique capabilities to detect and spontaneously synchronize their movements to music rhythms [5, 6], an ability posited to play a fundamental role in music appreciation that presumably arose during evolution [7–11].
Rhythm has also been found to benefit attention, especially in dynamic temporal contexts, known as dynamic attention [12–14]. Targets occurring at the predicted time induced by rhythmic patterns are easily detected in both the auditory and visual domains [15–23]. Animal recordings have shown that this rhythmic facilitation arises from a neural entrainment process, whereby an internal neural oscillation is aligned with ongoing external events so that task-relevant stimuli reside in excitability phases of the oscillation to be optimally processed [24–26]. Similar effects have also been reported in noninvasive human studies [27–36]. Moreover, by applying a time-resolved behavioral measurement, recent studies have revealed direct rhythmic patterns in attentional behavioral performance [37–40], further supporting the rhythmic nature of attention. This rhythmic nature has been postulated to originate from motor systems [41–44], known as “active sensing”, in which animals actively sample information from the environment using rhythmic motor effectors [45–48]. Accordingly, the Action Simulation for Auditory Prediction (ASAP) theory posits that instead of initiating actual movements, motor areas can still facilitate sensory processing by simply simulating movements and transmitting real-time prediction signals to auditory systems [11, 49]. This view is supported by studies revealing the engagement of motor areas in tasks containing no overt movement, such as passive listening to music and speech [50–58].
Although rhythm’s role in attention has been widely studied, its function in working memory (WM) remains less explored. As a central hub linking perception, attention, decision-making, and long-term memory, WM engages a wide range of brain regions. In fact, it has long been known that WM is strongly associated with attention. Selective attention to task-relevant features during the WM encoding, maintaining, and retrieval phases enhances neural representation and facilitates memory behavior [59–61]. Attention and WM have been proposed to share control mechanisms to flexibly route information, and similar neural representations have been found to underlie information prioritization in both WM and attention [62–66]. Interestingly, WM also displays rhythmic activity as attention does. For example, theta-band neural oscillations increase in a load-dependent way during WM retention [67–69]. Consistent with this, applying rhythmic theta-band transcranial magnetic stimulation to the dorsal pathway during retention improves WM performance [70–74]. Moreover, rhythmically presented items display greater episodic memory performance [75–78] and activate the frontoparietal network [79].
Given the intertwined links between attention and WM, we hypothesized that rhythm, an organized temporal structure that contributes to attention, would also facilitate auditory WM. Moreover, since WM is known to flexibly reorganize and manipulate inputs, it might benefit from the temporal regularity conveyed in the rhythm, which would trigger the motor system to exert a predictive top-down influence on information processing and storage [80–82]. Here we combined behavior, electroencephalography (EEG) recordings, and computational modeling (the Hierarchical Drift-Diffusion Model, HDDM) to test this hypothesis and examine the underlying computational and neural mechanisms, with a focus on neuronal oscillatory signatures. We demonstrated that rhythm facilitates auditory WM performance by speeding up response time, which arises from a lower response boundary in decision-making. Crucially, this rhythm-induced WM facilitation is mediated by coordination between beta-band oscillations during sensory encoding and theta-band oscillations during WM retention, denoting motor engagement and memory maintenance, respectively. Taken together, rhythm facilitates auditory WM through intricate oscillation-based interactions presumably between the motor and auditory systems.
Materials and Methods
Participants
In Experiment 1 (behavioral experiment), 25 participants from Peking University were recruited (age range: 18–27 years, mean = 21.32 years; 11 males), and 11 reported having received amateur musical training (6.05 training years on average).
In Experiment 2 (EEG recordings), another 25 participants from Peking University were recruited (age range: 19–24 years, mean = 21.24 years; 8 males), and 11 reported having received amateur musical training (3.45 training years on average).
Musical training was not considered in our analysis as our participants were not professional musicians and the average learning time was short. Notably, none of them reported having absolute pitch. The experiments were conducted in accordance with the principles of the Declaration of Helsinki and approved by the local Ethics Committee at the School of Psychological and Cognitive Sciences, Peking University (Beijing, China). Informed consent was given by all participants before the experiment. All participants had normal hearing and reported no history of neurological or psychiatric disorders. Participants were compensated 100 Chinese Yuan after Experiment 1 and 120 Chinese Yuan after Experiment 2.
Stimuli
In Experiments 1 and 2, the auditory stimuli were 16 piano tones ranging from 196 Hz to 880 Hz, spanning 3 octaves and corresponding to G3, A3, B3, C4, D4, E4, F4, G4, A4, B4, C5, D5, E5, F5, G5, and A5. All stimuli were generated by Overture (v.5; GenieSoft Inc., Sydney, Australia). Each piano tone was sampled at 48,000 Hz and lasted 100 ms, with a dampening length and attenuation of 10 ms. Perceptual loudness was matched across all tones using Adobe Audition CC (v.2018; Adobe Inc., San Jose, USA).
In both experiments, auditory stimuli were presented binaurally via headphones (CX213; Sennheiser Inc., Weddemark, Germany) at each participant’s comfortable hearing level, using the Psychophysics-3 toolbox and additional custom scripts written for MATLAB (2021a; The MathWorks Inc., Natick, USA).
Experimental Design
In both experiments, participants performed an auditory WM task. In each trial, participants were instructed to temporarily retain in WM a sequence of 12 piano tones (target tone sequence). After a 2-s delay period, a probe tone sequence that was either the same or different (with some tones altered in pitch) was presented. The task was to report whether the probe tone sequence was identical to the target tone sequence in pitch by pressing the corresponding key (change vs no-change). Participants were instructed to respond as quickly as possible after the onset of the last tone in the probe tone sequence.
In Experiment 1, we used a 2 (Rhythm: rhythmic, arrhythmic) × 3 (Change number: 0, 2, 4) within-subject design. In Experiment 2, we used a single factor (Rhythm: rhythmic, arrhythmic) within-subject design.
The stimulus onset asynchronies (SOAs) between two adjacent tones were 350 ms in the rhythmic condition, whereas in the arrhythmic condition the SOAs were randomly drawn from a uniform distribution between 200 ms and 500 ms and summed to 3,850 ms (equal to the total duration of the target tone sequences in the rhythmic condition). For probe tone sequences, the SOAs between two adjacent tones were randomly drawn from a uniform distribution between 300 ms and 400 ms in both conditions, and all SOAs summed to 3,850 ms. Notably, the same probe sequences were used in both conditions to ensure that any behavioral differences between conditions were not due to the probe sequences themselves. Furthermore, we introduced a different SOA distribution (300 ms–400 ms) for the probe sequences so that the temporal context during retrieval was identical to neither the rhythmic nor the arrhythmic encoding condition; the temporal irregularity of the probe sequences thus fell between the two.
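As a concrete illustration, the SOA scheme above can be sketched as follows. The final rescaling step is an assumption on our part; the text only states that the arrhythmic SOAs were drawn from a 200 ms–500 ms uniform distribution and summed to 3,850 ms.

```python
import numpy as np

def make_soas(n_tones=12, rhythmic=True, total_ms=3850.0, lo=200.0, hi=500.0, rng=None):
    """Generate the 11 inter-tone SOAs (ms) for a 12-tone sequence.
    Rhythmic: fixed 350-ms SOAs (11 x 350 = 3,850 ms).
    Arrhythmic: uniform draws in [lo, hi], rescaled so the total matches
    the rhythmic duration (the rescaling is an illustrative assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    n_soas = n_tones - 1
    if rhythmic:
        return np.full(n_soas, 350.0)
    soas = rng.uniform(lo, hi, n_soas)
    return soas * (total_ms / soas.sum())  # may nudge values slightly past [lo, hi]
```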
In Experiment 1, rhythmic and arrhythmic conditions (144 trials each) both contained change-0, change-2, and change-4 trials (48 trials each). For change-0 trials, no tones were changed in the probe tone sequences. The positions at which changes could occur (numbers denote the positions of the changed tones) were (1, 7), (2, 8), (3, 9), (4, 10), (5, 11), or (6, 12) for change-2 trials, and (1, 4, 7, 10), (2, 5, 8, 11), or (3, 6, 9, 12) for change-4 trials. All changed tones within a trial were adjusted in the same direction, either higher or lower. In Experiment 2, rhythmic and arrhythmic conditions (96 trials each) both contained change-2 and change-0 trials (48 trials each), with the rules for tone changes identical to Experiment 1. All changed tones were adjusted higher or lower by two semitones (e.g., C4 adjusted to D4). C and F could not be adjusted lower, and B and E could not be adjusted higher, to keep all tones on white keys so that participants could not make judgments from tonality. Meanwhile, any two adjacent tones in a sequence could not be identical in pitch.
In both experiments, for each trial, the 12-tone sequence was generated from a random network composed of the 16 piano tones noted above (Fig. S1). Each node in the network denotes one piano tone and each line denotes a one-step transition. For each trial, a random walk in the network stipulated the order of the 12 tones in the sequence. The same tone sequences were used in the rhythmic and arrhythmic conditions.
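The random-walk generation procedure can be sketched as below. The actual network topology is given in Fig. S1; the ring network in the usage line is purely illustrative.

```python
import random

def random_walk_sequence(adjacency, length=12, rng=None):
    """Generate a tone sequence as a random walk on a tone-transition
    network: each node is a piano tone, each edge a one-step transition.
    Because edges connect distinct tones, no two adjacent tones repeat."""
    rng = rng or random.Random()
    tone = rng.choice(list(adjacency))
    seq = [tone]
    for _ in range(length - 1):
        tone = rng.choice(adjacency[tone])
        seq.append(tone)
    return seq

tones = ["G3", "A3", "B3", "C4", "D4", "E4", "F4", "G4",
         "A4", "B4", "C5", "D5", "E5", "F5", "G5", "A5"]
# Illustrative ring topology (NOT the network of Fig. S1):
ring = {t: [tones[(i - 1) % 16], tones[(i + 1) % 16]] for i, t in enumerate(tones)}
```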
In both experiments, rhythmic trials and arrhythmic trials were intermixed and randomly presented. Experiment 1 (288 trials) was divided into 4 sessions, while Experiment 2 (192 trials) was divided into 3 sessions; participants were allowed to rest after each session. Participants had a short training session of 5 rhythmic trials and 5 arrhythmic trials with feedback before the actual experiments. After the start of the experiments, no feedback was given. Participants were not informed beforehand that there would be rhythmic and arrhythmic trials, but most of them could report after the experiments that some sequences were more “regular” and some were more “irregular”.
In Experiment 1, we used the change-0 condition as the baseline and compared it separately with the change-2 and change-4 conditions. All accuracy and RT analyses and model fittings were applied to these two comparisons (Change-2, Change-4). Experiment 2 followed the same procedure, with the Change-2 comparison only.
Hierarchical Drift-Diffusion Model (HDDM)
We applied a drift-diffusion model (DDM) to further characterize the influence of rhythm on WM performance in Experiments 1 and 2. A DDM models two-choice decision-making as a noisy evidence-accumulation process with four core parameters: drift rate (v), response boundary (a), non-decision time (t), and bias (z). In this model, the two choices are represented as two boundaries separated by the distance a (the response boundary), and a drift process accumulates relative evidence over time at drift rate v. A response is initiated when the drift process crosses one of the two boundaries. In addition, t characterizes the non-decision motor process and z represents the starting point of the drift process. We used an HDDM, which applies the Markov-chain Monte-Carlo (MCMC) method to estimate the posterior probability distributions of the DDM parameters. This hierarchical Bayesian method estimates group-level and individual-participant parameters simultaneously, with individual parameter estimates constrained by the group-level distribution. All HDDM model fittings and analyses were performed using the HDDM package [83].
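To make the four parameters concrete, a minimal forward simulation of the DDM (an illustrative sketch, not the HDDM fitting procedure itself) might look like:

```python
import numpy as np

def simulate_ddm(v, a, t, z=0.5, n_trials=1000, dt=0.001, sigma=1.0, rng=None):
    """Simulate the four-parameter DDM: evidence starts at z*a, drifts at
    rate v with Gaussian noise, and a response is made when it crosses
    0 or a; non-decision time t is added to each RT."""
    rng = np.random.default_rng() if rng is None else rng
    rts, choices = [], []
    for _ in range(n_trials):
        x, time = z * a, 0.0
        while 0.0 < x < a:
            x += v * dt + sigma * np.sqrt(dt) * rng.standard_normal()
            time += dt
        rts.append(time + t)
        choices.append(int(x >= a))  # 1 = upper boundary, 0 = lower boundary
    return np.array(rts), np.array(choices)
```

Lowering a shortens the evidence-accumulation path and thus produces faster responses, the pattern the Results attribute to rhythmic presentation.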
HDDM generally assumes that different conditions are completely independent of each other and parameters are sampled from separate group priors. However, for the within-subject experimental design, different conditions are correlated and there may be individual differences in overall performance. Therefore, considering our experimental design, we used a regression HDDM to capture the within-subject effect in both experiments. In a regression HDDM, for each participant, an intercept is used to capture overall performance in one condition as a baseline, and other conditions are expressed relative to this baseline condition as slopes (Eq. 1).
yi = βi,0 · x0 + βi,1 · x1 + βi,2 · x2 (1)
where yi is the DDM parameter being modeled for participant i, and x0, x1, and x2 are dummy variables coding the conditions.
The parameter βi,0 is the intercept for the baseline condition capturing each participant’s overall performance. Parameters βi,1 and βi,2 are the slopes for the other two conditions capturing their effects relative to the baseline condition.
HDDM for Experiment 1
In Experiment 1, four dummy variables were used in regression HDDM to investigate the effects of rhythm and number of changed tones on behavioral responses (Table 1). The difference between any two conditions was calculated using the four dummy variables.
Table 1.
Dummy variables for regression HDDM in Experiment 1.
| Conditions | x0 (intercept) | x1 | x2 | x3 |
|---|---|---|---|---|
| Change 2 & arrhythmic | 1 | 0 | 0 | 0 |
| Change 4 & arrhythmic | 1 | 1 | 0 | 0 |
| Change 2 & rhythmic | 1 | 0 | 1 | 0 |
| Change 4 & rhythmic | 1 | 0 | 0 | 1 |
Dummy variables used for regression HDDM in Experiment 1. The parameter x1 corresponds to the difference between change-2 and change-4 in the arrhythmic condition. Parameters x2 and x3 correspond to the difference between rhythmic and arrhythmic conditions for change-2 and change-4 separately.
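The logic of Table 1 can be checked numerically: each condition's parameter value is the dot product of its dummy-variable row with the regression coefficients. The coefficient values below are hypothetical, purely for illustration.

```python
import numpy as np

# Rows of Table 1: [x0 (intercept), x1, x2, x3]
design = {
    "change2_arrhythmic": [1, 0, 0, 0],
    "change4_arrhythmic": [1, 1, 0, 0],
    "change2_rhythmic":   [1, 0, 1, 0],
    "change4_rhythmic":   [1, 0, 0, 1],
}
beta = np.array([2.0, -0.3, -0.2, -0.05])  # hypothetical [intercept, b1, b2, b3]
a_vals = {cond: float(np.dot(row, beta)) for cond, row in design.items()}
# The rhythmic-vs-arrhythmic effect at change-2 is exactly the x2 slope:
effect_change2 = a_vals["change2_rhythmic"] - a_vals["change2_arrhythmic"]
```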
In Experiment 1, we assumed there was no difference in z between conditions, so we used a single value of z for all conditions. We first constructed Model_vat, with v, a, and t varying freely between conditions. The results showed that a displayed the largest difference between conditions. We then constructed several models in which a, alone or together with other parameters, varied freely between conditions. Model comparison using the deviance information criterion (DIC) showed that the model with only a varying between conditions (Model_a) was the best. Table 2 shows all the models and their ΔDIC relative to Model_a. For all the models in the table, apart from the parameters that varied freely between conditions, all other parameters were fixed to a single value across conditions.
Table 2.
Model parameters and ΔDIC in Experiment 1.
| Models | Parameters varying between conditions | ΔDIC |
|---|---|---|
| Model_a | a | 0 |
| Model_va | v, a | 2716.37 |
| Model_at | a, t | 2715.61 |
| Model_vat | v, a, t | 2776.90 |
Models with different sets of parameters varying freely between conditions in Experiment 1. ΔDIC represents the increment in DIC relative to the best model.
We used RT distributions for correct and incorrect responses for model fitting (accuracy coding). Trials with RT <0.15 s or >10 s were removed [84–86]. Given the difficulty of the task, the RTs here were relatively long. We chose 10 s as the upper bound to include as much data as possible for model fitting. Moreover, other criteria such as excluding trials with RT exceeding 3 SDs for each participant did not alter the results.
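The two exclusion rules above can be sketched as follows, shown for a single participant's RTs (in seconds):

```python
import numpy as np

def exclude_trials(rts, lo=0.15, hi=10.0, sd_cut=3.0):
    """Boolean keep-masks for the two exclusion criteria in the text:
    fixed RT bounds, and a per-participant 3-SD rule."""
    rts = np.asarray(rts, float)
    within_bounds = (rts >= lo) & (rts <= hi)
    z = (rts - rts.mean()) / rts.std()
    within_sds = np.abs(z) <= sd_cut
    return within_bounds, within_sds
```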
The length of the probe tone sequence (3.85 s) was added to the RT in each trial. We used the default non-informative priors in the HDDM package for Bayesian parameter estimation. In each model, we obtained parameter estimates by generating 3 separate Markov chains of 10,000 MCMC samples each, at both the individual and group levels. For each chain, the first 2,000 samples were discarded as burn-in; the remaining 8,000 samples from each of the 3 chains were then concatenated, and group-level parameters were estimated from the posterior distributions across the resulting 24,000 samples. Model convergence was checked by visually inspecting the posterior traces for each parameter. Hypothesis testing was applied to the group posteriors by calculating the percentage of posterior samples in one condition smaller than in the other condition. This yields a P-value similar, but not equivalent, to a P-value estimated by frequentist methods [87].
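The posterior hypothesis test described above reduces to counting samples; a minimal sketch:

```python
import numpy as np

def posterior_p(post_a, post_b):
    """Fraction of (concatenated, post-burn-in) posterior samples in which
    the parameter under condition A is smaller than under condition B --
    the Bayesian 'P-value' used in the text."""
    return float(np.mean(np.asarray(post_a) < np.asarray(post_b)))
```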
HDDM for Experiment 2
In Experiment 2, two dummy variables were used to investigate the effect of rhythm on behavioral performance (Table 3).
Table 3.
Dummy variables for regression HDDM in Experiment 2.
| Conditions | x0 (intercept) | x1 |
|---|---|---|
| Arrhythmic | 1 | 0 |
| Rhythmic | 1 | 1 |
Dummy variables used for regression HDDM in Experiment 2. The parameter x1 corresponds to the difference between rhythmic and arrhythmic conditions.
In Experiment 2, we also assumed there was no difference in z between conditions. We first constructed Model_vat, with v, a, and t varying freely between conditions. The results showed that a displayed the largest difference between conditions. We then constructed several models in which a, alone or together with other parameters, varied freely between conditions. Model comparison using the deviance information criterion (DIC) showed that the model with a and t varying between conditions (Model_at) was the best. Table 4 shows all the models and their ΔDIC relative to Model_at. The methods of model fitting and hypothesis testing were the same as in Experiment 1.
Table 4.
Model parameters and ΔDIC in Experiment 2.
| Models | Parameters varying between conditions | ΔDIC |
|---|---|---|
| Model_at | a, t | 0 |
| Model_a | a | 26.64 |
| Model_va | v, a | 1126.07 |
| Model_vat | v, a, t | 2.46 |
Models with different sets of parameters varying freely between conditions in Experiment 2. ΔDIC represents the increment in DIC relative to the best model.
EEG
System
EEG data were recorded at a sampling rate of 500 Hz in a quiet, electrically shielded room using a 64-channel Brain-Vision system. This system contained a Brain-Vision recorder and two BrainAmp amplifiers (Brain Products GmbH). One reference electrode at FCz, one ground electrode at AFz, and one ocular electrode recording vertical electrooculogram (EOG) were used. The impedance of each electrode was kept below 20 kΩ during recording. All EEG data were analyzed using MNE-Python (v.1.5.1; https://mne.tools/stable/index.html).
Data Preprocessing
EEG data were preprocessed using the following procedure. First, bad channels were repaired by interpolation, the continuous raw data were filtered with a 33.3-Hz low-pass and a 1-Hz high-pass filter, and the sampling rate was reduced to 100 Hz. Second, the filtered data were submitted to independent component analysis, and components containing eye-blink and heartbeat artifacts were removed. Third, the reconstructed data were re-referenced to the average reference and segmented into encoding epochs (0 s to 3.95 s relative to the onset of the first tone of each target sequence) and maintaining epochs (0 s to 1.5 s relative to 500 ms after the onset of the last tone of each target sequence). Moreover, since the encoding and maintaining periods were identical for the Change-0 and Change-2 conditions, we pooled trials across these two conditions, separately for the Rhythmic and Arrhythmic conditions.
Event-Related Potentials
Event-related potentials were calculated by averaging encoding epochs and corrected by the mean baseline activity (−500 ms to 0 ms before each trial) in each subject.
Time-Frequency Analysis
To investigate frequency domain activity during the encoding and maintaining periods, we submitted these EEG epochs to time-frequency decomposition using a complex Morlet wavelet convolution. 15 logarithmically spaced frequencies were used within each frequency range of interest. The number of wavelet cycles was adjusted for each frequency to ensure an increase in the wavelet cycle as the frequency increased. The frequency ranges and numbers of wavelet cycles in all time-frequency analyses are listed in Table 5.
Table 5.
Frequency ranges and numbers of wavelet cycles.
| Frequency range | Number of wavelet cycles |
|---|---|
| 2 Hz-33.3 Hz | Linearly spaced from 2 to 6 |
| 16 Hz-33.3 Hz | Half of the frequency value at each frequency |
| 8 Hz-12 Hz | Half of the frequency value at each frequency |
| 3 Hz-5 Hz | Linearly spaced from 2 to 6 |
Frequency ranges for broadband, beta-band, alpha-band, and theta-band power, along with the corresponding numbers of wavelet cycles used in temporal frequency decompositions.
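A minimal NumPy sketch of the complex Morlet convolution described above (in practice, MNE-Python's `tfr_array_morlet` performs this step; the amplitude normalization here is a simplified illustration):

```python
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles):
    """Time-frequency power via complex Morlet wavelet convolution.
    For each frequency f with c cycles, the wavelet's temporal SD is
    c / (2*pi*f), so more cycles mean narrower frequency bandwidth."""
    power = np.empty((len(freqs), len(signal)))
    for i, (f, c) in enumerate(zip(freqs, n_cycles)):
        sd = c / (2 * np.pi * f)                       # temporal SD (s)
        t = np.arange(-4 * sd, 4 * sd, 1 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t - t**2 / (2 * sd**2))
        wavelet /= np.abs(wavelet).sum()               # crude amplitude norm
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

# 15 log-spaced frequencies with cycles increasing linearly from 2 to 6,
# as used for the broadband (2 Hz-33.3 Hz) analysis:
freqs = np.logspace(np.log10(2), np.log10(33.3), 15)
cycles = np.linspace(2, 6, 15)
```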
Time-frequency analysis was applied to the encoding period (0 s to 3.95 s) and maintaining period (0 s to 1.5 s), separately, using the same baseline period ranging from 0.5 s to 0.2 s before the onset of each trial.
For each participant and each electrode, the time-frequency decomposition was applied to each trial and averaged across trials for the rhythmic and arrhythmic conditions separately to obtain non-phase-locked time-frequency power, which was subsequently rescaled to decibels (dB, Eq. 2). In Eq. 2, the subscripts t and f denote time and frequency points. Note that the baseline term has no t subscript, indicating that all time points within the baseline period were averaged for each frequency. Negative power in dB implies activity lower than baseline.
dBt,f = 10 · log10(powert,f / baselinef) (2)
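The decibel rescaling amounts to a one-line, frequency-wise normalization:

```python
import numpy as np

def to_db(power_tf, baseline_f):
    """dB rescaling: power at each (time, frequency) point relative to the
    frequency-specific baseline mean. power_tf has shape (n_freqs, n_times);
    baseline_f has shape (n_freqs,) -- it varies over frequency only."""
    return 10 * np.log10(power_tf / baseline_f[:, None])
```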
Pearson Correlation and General Linear Model (GLM)
In Experiment 2, we applied a Pearson correlation analysis to investigate the relationship between neural signatures during encoding and maintaining stages. The encoding-stage signature was defined as the difference in beta-band power (16–33.3 Hz) averaged over the significant temporal (2.7 s to 2.95 s, Fig. 3D) and spatial clusters (frontal-parietal regions marked with black dots, Fig. 3E). The maintaining-stage signature was defined as the difference in theta-band power (3 Hz–5 Hz) averaged over the significant temporal (0.13 s to 0.73 s, Fig. 4B) and spatial clusters (frontal regions marked with black dots, Fig. 4C). Participants with a beta or theta power difference exceeding 3 SDs were removed.
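The correlation with the 3-SD exclusion rule can be sketched as:

```python
import numpy as np

def corr_without_outliers(x, y, sd_cut=3.0):
    """Pearson correlation after removing participants whose value on
    either measure exceeds sd_cut SDs (mirrors the exclusion rule above)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    keep = (np.abs((x - x.mean()) / x.std()) <= sd_cut) & \
           (np.abs((y - y.mean()) / y.std()) <= sd_cut)
    r = np.corrcoef(x[keep], y[keep])[0, 1]
    return r, keep
```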
Fig. 3.
Auditory event-related potentials (ERPs) and time-frequency analysis during the encoding period (Experiment 2). A Grand averaged auditory ERPs as a function of time throughout the encoding period under rhythmic (purple) and arrhythmic (green) conditions. Shaded areas indicate 95% bootstrapping confidence intervals. B Grand averaged non-phase-locked time-frequency power difference between rhythmic and arrhythmic conditions throughout the encoding period (0–3.95 s) averaged across all the EEG channels. Power is converted to decibels (dB). C Grand averaged beta-band time courses averaged over all channels as a function of time during the late encoding period (2 s–3 s after the onset of the target sequence), for rhythmic (green) and arrhythmic (orange) conditions. Black horizontal line: significant cluster (permutation cluster analysis, n = 25, P = 0.046, corrected). Shaded areas indicate the SEM. D Grand averaged topographic map of the beta-band power difference between rhythmic and arrhythmic conditions between 2.7 s and 2.95 s after onset of the target sequence (corresponding to the significant time range in Fig. 3C). Black dots: significant clusters (permutation cluster analysis, n = 25, P = 0.023, corrected).
Fig. 4.
Time-frequency analysis during the maintaining period (Experiment 2). A Grand averaged non-phase-locked time-frequency power difference between rhythmic and arrhythmic conditions throughout the maintaining period (0 s–1.5 s relative to the 500 ms after the onset of the last tones of target sequences) averaged across all the EEG channels. Power is converted to decibels (dB). B Grand averaged theta-band time courses averaged over all channels as a function of time throughout the maintaining period (0 s–1 s), for rhythmic (green) and arrhythmic (orange) conditions. Black horizontal line: significant cluster (permutation cluster analysis, n = 25, P = 0.021, corrected). Shaded areas indicate SEM. C Grand averaged topographic map of the theta-band power difference between rhythmic and arrhythmic conditions within the corresponding significant time range in Fig. 4B. Black dots: significant cluster (permutation cluster analysis, n = 25, P = 0.012, corrected).
We investigated the behavioral relevance of the two-stage neural signatures by fitting GLMs for the rhythmic and arrhythmic conditions separately. In each condition, we constructed GLMs using high beta power (20 Hz–33.3 Hz) during encoding and theta power during maintaining (3–5 Hz) as fixed-effects predictors, individual participants as a random-effects term for the intercept, and a or t as the response variable separately (Eq. 3). For beta and theta power, we averaged over the significant time periods of power difference (beta: 2.7 s–2.95 s, Fig. 3D; theta: 0.13 s–0.73 s, Fig. 4B) and all channels. All the variables were standardized before model fitting.
y = β0 + β1 · beta power + β2 · theta power + uparticipant + ε (3)
where y is a or t, and uparticipant is the random intercept for each participant.
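A simplified least-squares version of the GLM on standardized variables is sketched below; the per-participant random intercept is omitted here (in practice it requires a mixed-model fit, e.g. statsmodels' MixedLM):

```python
import numpy as np

def zscore(v):
    """Standardize a variable, as done before model fitting in the text."""
    v = np.asarray(v, float)
    return (v - v.mean()) / v.std()

def fit_glm(beta_power, theta_power, y):
    """OLS on standardized predictors: y (the DDM parameter a or t)
    regressed on encoding-stage beta power and maintaining-stage theta
    power. Simplified stand-in for Eq. 3 without the random intercept."""
    X = np.column_stack([np.ones(len(y)), zscore(beta_power), zscore(theta_power)])
    coefs, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)
    return coefs  # [intercept, beta-power slope, theta-power slope]
```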
Statistical Procedures
For the behavioral data, we calculated accuracy (defined as the percentage of correct responses) and mean RTs; only RTs of correct trials were included. In Experiment 1, repeated-measures analyses of variance (ANOVAs) were applied to analyze the effects of rhythm and change number on accuracy and RT. In Experiment 2, paired t-tests were applied to analyze the effect of rhythm on accuracy and RT. Effect sizes were measured by calculating Cohen's d. All paired t-tests and repeated-measures ANOVAs were performed using SPSS (PASW Statistics Release 22.0.0; International Business Machines Corporation, Armonk, USA) and MATLAB (2021a; The MathWorks Inc., Natick, USA).
For EEG data analysis, to assess the statistical difference between rhythmic and arrhythmic conditions while controlling for multiple comparisons, we applied cluster-based nonparametric analyses across participants [88] with MNE-Python (v.1.5.1). To generate the corresponding null distribution, we randomly shuffled the two conditions within each subject and calculated the corresponding condition difference for each of 100,000 permutations. This is equivalent to randomly flipping the sign of the power difference between conditions for each subject. The corrected P-value was estimated by comparing the observed cluster-level t-statistics to the null distribution across all permutations. Clusters with t-statistics greater than the 95th percentile were considered significant. To assess the statistical significance of the difference in power time courses, we averaged the power over all channels at each time point and applied cluster analysis in the time dimension. To assess the statistical significance of the difference in the spatial topographic map, we averaged the power over the significant time period of difference and applied cluster analysis in the spatial dimension. One-tailed tests were used, as our hypotheses regarding neural oscillations were directional.
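The sign-flipping null distribution described above can be sketched as follows, with the cluster-forming step omitted for brevity:

```python
import numpy as np

def sign_flip_p(diffs, n_perm=100000, rng=None):
    """One-tailed sign-flip permutation test on per-subject condition
    differences: randomly flip each subject's sign, recompute the group
    mean, and compare the observed mean to this null distribution."""
    diffs = np.asarray(diffs, float)
    rng = np.random.default_rng(0) if rng is None else rng
    observed = diffs.mean()
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diffs.size))
    null = (signs * diffs).mean(axis=1)
    return float(np.mean(null >= observed))
```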
Results
Rhythm Facilitates Auditory WM via Decreasing the Response Boundary of Perceptual Decision (Experiment 1)
Twenty-five participants performed an auditory sequence WM task in Experiment 1, in which each trial consisted of three phases: Encoding, Maintaining, and Retrieval (Fig. 1A). During Encoding, participants were presented with a 12-tone sequence (randomly generated from 16 piano tones; color denotes pitch) in either a rhythmic or arrhythmic manner but with the same total duration and number of tones (Fig. 1A, left, upper and lower panels; see Methods for stimulus details). After a 2-s delay period (Maintaining), a 12-tone probe sequence with a slightly jittered inter-tone interval was played, and participants reported whether the probe sequence was identical to the memorized sequence in pitch (Retrieval), by pressing the corresponding key (change vs no-change). The number of tones in the probe sequence that changed in pitch was 0 (No-change), 2 (Change-2), or 4 (Change-4) (Fig. 1A, right; small triangles denote pitch change). Note that the rhythmic and arrhythmic conditions were randomly presented; the Change-0 condition served as the baseline and was compared to the Change-2 and Change-4 conditions, separately, yielding WM performance for the Change-2 and Change-4 conditions.
Fig. 1.
Experimental paradigm and results of Experiment 1. A Participants were instructed to temporarily memorize a 12-tone sequence (randomly generated from 16 piano tones) presented in a rhythmic (upper) or arrhythmic (lower) manner (Encoding). Color denotes different pitch; only four colors are used for illustration. Spacing between consecutive notes denotes the inter-stimulus interval (ISI). After a 2-s delay period (Maintaining), a probe 12-tone sequence with 0 (upper), 2 (middle), or 4 tones (lower) altered in pitch (indicated by small triangles) was presented (Retrieval). Participants reported whether the probe sequence was identical to the target sequence in pitch by pressing corresponding keys (change vs no-change). B Accuracy (left) and RT (right) under rhythmic (black) and arrhythmic (grey) trials under Change-2 and Change-4 conditions, with Change-0 as the common baseline (repeated-measures ANOVAs, n = 25, *P <0.05, **P <0.01, ***P <0.001, n.s., no significant difference). C HDDM fitting results. The group-level posterior probability of response boundary difference between rhythmic and arrhythmic conditions for Change-2 (blue) and Change-4 conditions (green; a.u., arbitrary unit).
We conducted a 2 × 2 repeated-measures ANOVA on WM performance accuracy and RT, with rhythm (rhythmic, arrhythmic) and change number (Change 2, Change 4) as the two within-subject factors (Fig. 1B). First, we assessed the influence of task difficulty (Change 2 vs Change 4) on performance and confirmed its significant effects on both accuracy (main effect: F(1, 24) = 20.91, P <0.001, η2 = 0.47) and RT (main effect: F(1, 24) = 10.00, P = 0.004, η2 = 0.29). As expected, easier pitch-change detection (Change-4) showed better performance and a faster RT than the more difficult condition (Change-2). Interestingly, regarding the effects of rhythm on WM, we only found its impact on RT, such that the rhythmic condition was associated with a faster RT than the arrhythmic condition (F(1, 24) = 4.27, P = 0.05, η2 = 0.15), but not on performance accuracy (F(1, 24) = 2.26, P = 0.145, η2 = 0.09).
After revealing that rhythm speeds up the perceptual decision in the WM task, we next applied an HDDM as a computational model to decompose the behavioral measurements and determine which component of the perceptual-processing process is essentially modulated by rhythm and ultimately contributes to the faster RT. The DDM conceptualizes two-choice decision-making as a noisy evidence accumulation process, in which a response is initiated when the cumulative evidence crosses one of the two boundaries [89]. The HDDM uses the Bayesian method to estimate group and individual DDM parameters simultaneously, with the four major parameters v, a, t, and z [83].
We first constructed several HDDMs with different free parameters that could vary between conditions to account for observed behavioral performance (details in Methods). The best model after comparisons was the one with only a varying between conditions (Methods, Table 2). Next, we examined the best model results. Fig. 1C shows that a is lower under rhythmic conditions than arrhythmic conditions, but only for the challenging change-2 task (P = 0.006) and not for the easier change-4 task (P = 0.443).
Taken together, by assessing behavioral performance combined with computational modeling, we demonstrate that rhythmic presentation facilitates auditory WM by speeding up RT, an effect that arises from a decreased response boundary during perceptual decisions, particularly in challenging tasks.
Neural Oscillatory Mechanisms of Rhythm-induced WM Facilitation (Experiment 2)
In Experiment 2, 25 participants performed an auditory WM task similar to Experiment 1, with 64-channel EEG activity recorded (Fig. 2A). Since Experiment 1 showed that rhythm improves WM only in a more difficult task, we removed the Change-4 condition and only kept the No-change and Change-2 conditions. Again, the Change-0 condition served as the baseline to be compared to the Change-2 condition.
Fig. 2.
Experimental paradigm and behavioral results of Experiment 2. A Participants performed the same auditory WM task as Experiment 1 with 64-channel EEG activities recorded simultaneously with only Change-0 (no-change) and Change-2 conditions. Participants reported whether the 12-tone probe sequence during retrieval was identical to the target 12-tone sequence on pitch by pressing corresponding keys (Change-2 vs Change-0). B Accuracy (left) and RT (right) under rhythmic (black) and arrhythmic (grey) trials (paired t-test, n = 25, **P = 0.002, n.s., no significant difference). C HDDM fitting results. Left (blue): Group-level posterior probability of response boundary difference between rhythmic and arrhythmic conditions. Right (green): Group-level non-decision time difference between rhythmic and arrhythmic conditions (a.u., arbitrary unit).
Experiment 2 largely replicated the results of Experiment 1. As shown in Fig. 2B, rhythm facilitated RT (t24 = −3.42, P = 0.002, Cohen’s d = 0.23) but did not impact accuracy (t24 = −0.08, P = 0.934). Moreover, HDDM comparison revealed that the model with a and t differing between conditions best accounted for behaviors (Methods, Table 4). Finally, rhythm reduced the response boundary of the HDDM (P = 0.033), but unlike Experiment 1, also impacted the non-decision time (P = 0.048).
After confirming the behavioral effects, we next studied the neural mechanisms of the rhythm-induced auditory WM facilitation. Specifically, we compared the neural activity in rhythmic and arrhythmic conditions during the two key periods of the auditory WM process: encoding and maintaining. Notably, since the encoding and maintaining periods were exactly the same for Change-0 and Change-2 conditions, we combined them for the rhythmic and arrhythmic conditions.
First, the event-related potentials (ERPs) during the encoding period showed stronger responses for rhythmic than arrhythmic sequences (Fig. 3A). This is not surprising: the individual tones presented in a regular rhythm are aligned in time across trials (i.e., they share the same onsets), while the onsets of individual tones in arrhythmic sequences vary across trials, leading to stronger evoked responses in the former condition. We therefore focused on the non-phase-locked neural response during the encoding period. Specifically, an induced time-frequency analysis was applied to the EEG signals, in which the time-frequency power profile of each trial was computed separately and then averaged across trials. This induced analysis overcomes the phase-alignment issue, since it discards the phase information of each trial and instead focuses on the non-phase-locked power.
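The distinction between induced (non-phase-locked) and evoked power can be sketched as follows. This toy example (the band edges and the band-pass-plus-Hilbert-envelope method are illustrative assumptions, not the paper's exact pipeline) builds trials containing a 20-Hz burst whose phase varies randomly across trials: computing power per trial and then averaging retains the burst, while averaging the waveform first cancels it.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(data, fs, band):
    """Band-limited power via band-pass filtering and the Hilbert envelope.
    data: (n_trials, n_samples) single-channel EEG."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, data, axis=-1)
    return np.abs(hilbert(filtered, axis=-1)) ** 2

fs = 250
rng = np.random.default_rng(1)
times = np.arange(2 * fs) / fs
# Toy trials: a 20-Hz oscillation with a random phase on every trial
# (non-phase-locked), plus noise.
phases = rng.uniform(0, 2 * np.pi, size=(40, 1))
trials = (np.sin(2 * np.pi * 20 * times + phases)
          + 0.5 * rng.standard_normal((40, len(times))))

# Induced power: compute power per trial, then average across trials.
induced = band_power(trials, fs, (16, 33)).mean(axis=0)
# Evoked power: average the waveforms first, then compute power. The
# random phases cancel in the average, so this estimate misses the burst.
evoked = band_power(trials.mean(axis=0, keepdims=True), fs, (16, 33))[0]

print(induced.mean() > evoked.mean())  # induced analysis retains the burst
```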
Figure 3B plots the difference in non-phase-locked time-frequency power between rhythmic and arrhythmic conditions throughout the encoding period, averaged over all channels and all participants. The rhythmic tone sequence clearly elicited stronger induced beta-band (16–33.3 Hz) power than the arrhythmic sequence during the late encoding period. Consistent with this, the extracted beta-band power time course differed between the rhythmic and arrhythmic conditions ~2–3 s after the onset of the target sequence (Fig. 3C; permutation cluster analysis, 2.7–2.95 s, P = 0.046). The beta-band increase during the late encoding period is reasonable, since it takes time to differentiate rhythmic from arrhythmic conditions. Moreover, the topographic map of the rhythmic-arrhythmic beta-band power difference revealed significant clusters in frontal-parietal regions (Fig. 3D; permutation cluster analysis, P = 0.023), implying the neural origin of the beta-band activity. Control analyses of the alpha band (8 Hz–12 Hz) and theta band (3 Hz–5 Hz) did not show a similar effect (Fig. S2), supporting the specific role of the beta band in memory encoding.
After revealing the enhanced beta-band power during the late encoding phase, we next applied the same time-frequency analysis to the neural activity in the 1.5-s retention period. Note that since no stimulus was presented during this period, any difference between rhythmic and arrhythmic conditions could only arise from the internal memory retention process. Interestingly, the retention period showed a different time-frequency profile from the sensory encoding period. Specifically, theta-band neural activity (3 Hz–5 Hz) was stronger in rhythmic than arrhythmic conditions (Fig. 4A). Consistent with this, the theta-band power time course showed a significant difference between rhythmic and arrhythmic conditions (Fig. 4B; permutation cluster analysis, 0.13–0.73 s, P = 0.021). Furthermore, the topographic map of the rhythmic-arrhythmic theta-band power difference revealed a significant cluster in the frontal-parietal region (permutation cluster analysis, P = 0.012). Finally, control analyses of the alpha and beta bands did not reveal any difference between conditions (Fig. S2), further supporting the special role of the theta band during WM retention.
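The permutation cluster analysis applied to the power time courses can be sketched as a one-sample, sign-flip cluster test over time. The implementation below is a simplified illustration (single channel, cluster-mass statistic; the threshold, permutation count, and synthetic data are assumptions, not the study's exact settings).

```python
import numpy as np
from scipy.ndimage import label
from scipy.stats import t as t_dist

def cluster_permutation_test(diff, n_perm=1000, alpha=0.05, seed=0):
    """One-sample cluster-based permutation test over time.

    diff: (n_subjects, n_times) per-participant condition differences
    (e.g. rhythmic minus arrhythmic theta power). Returns the labeled
    time points of the observed clusters and a p-value per cluster.
    """
    rng = np.random.default_rng(seed)
    n_sub = diff.shape[0]
    t_thresh = t_dist.ppf(1 - alpha / 2, df=n_sub - 1)

    def cluster_masses(d):
        # Pointwise one-sample t statistics, thresholded and clustered
        tvals = d.mean(0) / (d.std(0, ddof=1) / np.sqrt(n_sub))
        labs, n = label(np.abs(tvals) > t_thresh)
        return labs, [np.abs(tvals[labs == i + 1]).sum() for i in range(n)]

    obs_labels, obs_masses = cluster_masses(diff)
    # Null distribution: randomly flip each participant's sign and keep
    # the largest cluster mass from every permutation.
    null = np.zeros(n_perm)
    for p in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n_sub, 1))
        _, masses = cluster_masses(diff * signs)
        null[p] = max(masses, default=0.0)
    pvals = [(null >= m).mean() for m in obs_masses]
    return obs_labels, pvals

# Synthetic demo: 20 participants, 100 time points, with an effect
# confined to one time window.
rng = np.random.default_rng(2)
diff = rng.standard_normal((20, 100))
diff[:, 40:60] += 1.0
labels, pvals = cluster_permutation_test(diff, n_perm=500)
# The injected window should emerge as a significant cluster.
```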
Taken together, Experiment 2 with EEG recordings demonstrates dissociated neural oscillatory mechanisms during the encoding and retention periods for rhythm-induced WM facilitation. As tone sequences are presented during encoding, rhythm increases beta-band neural activity that might reflect motor cortex engagement, which contributes to WM via enhancing the theta-band neural activity in the frontal-parietal region during memory maintenance.
The Encoding-Maintaining Relationship and Behavioral Correlates (Experiment 2)
Finally, we sought to examine how the two neural signatures during the two memory stages are related to each other and ultimately contribute to behavior. We defined the encoding-stage signature as the difference in beta-band power (16 Hz–33 Hz) between rhythmic and arrhythmic conditions during the encoding period, averaged over the corresponding significant spatiotemporal clusters. Similarly, the maintenance-stage signature was defined as that in terms of the theta-band (3–5 Hz) during retention. The Pearson correlation was then calculated between the encoding-stage signature and the maintenance-stage signature across participants.
As shown in Fig. 5A, the two neural signatures were positively correlated with each other (r = 0.47, P = 0.025): the stronger the beta-band power during encoding, the stronger the theta-band power during retention. Furthermore, we investigated the behavioral relevance of the two-stage neural signatures. Specifically, we constructed a GLM incorporating encoding-stage high beta-band power (20 Hz–33.3 Hz) and maintenance-stage theta-band power (3 Hz–5 Hz) as factors to predict a and t for the rhythmic and arrhythmic conditions. As shown in Fig. 5B, encoding-stage beta-band power negatively predicted a in the rhythmic condition (β = −0.36, t22 = −1.78, P = 0.045). In other words, the stronger the beta-band power during encoding, the lower the response boundary during decision-making. Neither beta-band nor theta-band power predicted t (Fig. S3).
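The GLM analysis amounts to an ordinary least-squares regression of each participant's boundary estimate on the two neural predictors. Below is a minimal sketch with synthetic data; the variable names and effect sizes are hypothetical, chosen only to mirror the reported negative beta-boundary relationship.

```python
import numpy as np
from scipy import stats

def glm_fit(X, y):
    """Ordinary least-squares GLM: y = X @ beta + error.

    Returns coefficients (intercept first), t statistics, and
    two-tailed p-values (df = n minus number of regressors).
    """
    X = np.column_stack([np.ones(len(y)), X])       # add an intercept
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - X.shape[1]
    sigma2 = resid @ resid / df
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    tvals = beta / se
    pvals = 2 * stats.t.sf(np.abs(tvals), df)
    return beta, tvals, pvals

# Toy per-participant measures standing in for the real data:
# encoding-stage beta-band power (built here to relate negatively to
# the boundary, as in the text) and retention-stage theta-band power.
rng = np.random.default_rng(3)
beta_power = rng.standard_normal(25)
theta_power = rng.standard_normal(25)
boundary = 1.5 - 0.4 * beta_power + 0.1 * rng.standard_normal(25)

X = np.column_stack([beta_power, theta_power])
coefs, tvals, pvals = glm_fit(X, boundary)
print(coefs[1] < 0 and pvals[1] < 0.05)  # beta power predicts a lower boundary
```

With 25 participants and two regressors plus an intercept, the residual degrees of freedom come out to 22, matching the t22 statistic reported in the text.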
Fig. 5.
Correlation between two-stage neural signatures and behavioral relevance. A Correlation between the encoding-stage Rhythmic-Arrhythmic beta-band power difference (x-axis) and the maintaining-stage Rhythmic-Arrhythmic theta-band power difference (y-axis) across participants, averaged within the corresponding significant spatiotemporal clusters (small insets on x- and y-axes). Each dot denotes an individual participant. Shaded areas indicate SEM (Pearson test, n = 24, r = 0.47, *P = 0.025). B A general linear model (GLM) incorporating encoding-stage beta-band power and maintenance-stage theta-band power as factors constructed to predict the response boundary. Regression coefficients of encoding-stage beta-band power (red) and maintenance-stage theta-band power (green) under rhythmic (left) and arrhythmic (right) conditions. Bars indicate the 95% confidence interval (GLM with t-test, n = 25, *P = 0.045).
Taken together, the dissociated neural mechanisms of the encoding and retention stages, instead of being independent of each other, work synergistically to facilitate WM and speed up the response by lowering the response boundary during decision-making.
Discussion
It is widely accepted that rhythm facilitates perception and attention, but little is known about its contribution to WM and the underlying computational and neural mechanisms. Here, two experiments demonstrate that tones presented rhythmically result in better WM behavior, i.e., a faster retrieval RT, which is attributable to a lower response boundary in evidence accumulation. This rhythm-induced WM facilitation is mediated by interactions between two stages of neural oscillation: the encoding-stage beta band and the maintaining-stage theta band. Taken together, temporal regularities (rhythm) facilitate WM performance by driving the sensory-motor system (i.e., beta band), which improves the efficiency of attentional allocation to stimulus processing during encoding and in turn leads to better information maintenance (i.e., theta band) during retention.
Our results show that rhythm lowers the response boundary in evidence accumulation, leading to a faster RT during WM retrieval. The response boundary represents decision conservatism, i.e., a larger value indicates a higher criterion, requiring more information to be gathered to avoid mistakes [90]. When participants are aware of their tendency to make mistakes, the response boundary increases [84, 91, 92]. Here, the lower response boundary found for rhythmic tone sequences likely reflects participants' higher confidence about their choice when retrieving information from WM. Interestingly, the response boundary was also correlated with beta-band neural activity, which indexes engagement of the motor system and top-down prediction [81, 93–95]. This implies that the involvement of the motor system driven by rhythmic stimuli lowers the threshold for evidence accumulation and facilitates response readiness [96]. Moreover, the non-decision time was also modulated by rhythm in Experiment 2. It has been shown that higher levels of stimulus noise or memory load increase non-decision time [97, 98]. Therefore, rhythm may accelerate memory access by constructing a predictive temporal context that decreases noise during encoding, as indicated by the shorter non-decision time. Overall, our results suggest that rhythm activates the sensory-motor circuit to accelerate the initiation of evidence accumulation and lower decision criteria during memory retrieval.
One key neuronal finding was the stronger beta-band neural oscillations induced by rhythmic tones during the encoding period. Beta-band oscillations, as a prominent neural signature of motor systems during resting states [99–102], are considered essential for the maintenance of sensorimotor or cognitive states [103]. Importantly, several studies have recently revealed the instrumental role of beta-band neural oscillations in temporal prediction. For example, pre-stimulus beta-band power is associated with target prediction, time estimation, and motor readiness [104–109]. Periodic modulation of beta-band power in both auditory and motor areas reflects the timing of isochronous beats and is further regulated by metrical context and top-down anticipation of tempo change [93–95]. Based on these findings, it has been posited that the motor system modulates sensory processing by conveying temporal prediction through beta-band neural oscillations [80, 81, 110]. Consistent with this view, our findings suggest that tones embedded in rhythm impose a highly predictive temporal context, which drives the motor system to generate precise temporal anticipation of forthcoming stimuli through beta-band oscillations. As noted above, the beta-band increase emerged during the late encoding period, presumably because it takes time to differentiate rhythmic from arrhythmic conditions and thus to engage attention or the motor cortex differently. The established ongoing temporal prediction optimizes attentional allocation among tones in the sequence, leading to more effective memory encoding.
In addition to beta-band neural activity during encoding, rhythm also enhances theta-band activity during WM maintenance. Theta-band oscillations are widely known to play fundamental roles in WM. For example, theta-band activity over the frontal area increases during memory retention, is parametrically modulated by WM load [67–69, 111], and has been proposed to implement top-down executive control over posterior sensory regions [112–115]. The theta band has also been found to be engaged in the coordination of sequential reactivation of individual WM items [116–119]. Our findings also coincide with previous studies showing that applying theta-band stimulation to the dorsal pathway during retention enhances WM performance [70, 73, 74, 120]. Overall, our work provides direct neural evidence that rhythm facilitates WM by increasing theta-band neural activity during retention.
Interestingly, instead of being independent of each other, the two neural signatures—encoding-stage beta-band and retention-stage theta-band activity—are correlated with each other. One interpretation is that precise temporal predictions generated by the motor cortex promote information processing and in turn increase the depth of memory encoding, which further boosts memory maintenance. This view is in line with ASAP theory positing that the motor system modulates sensory processing in the auditory cortex by transmitting timing signals via the dorsal pathway [11, 49]. Alternatively, it is also possible that the rhythm-triggered motor system directly drives WM networks and facilitates information storage. This view is consistent with previous results showing that motor systems support WM by generating internal motor traces that reinforce information representation [121, 122].
The present work can also be understood under the framework of predictive coding (PC), which emphasizes top-down predictions passing through a cortical hierarchy to sensory inputs [123–125]. According to PC theory, the ultimate aim of the brain is to minimize prediction errors and attain a fully predicted representation of the world [126]. Our findings suggest that the motor cortex is a vital hub to generate anticipatory signals which are further combined with sensory information to minimize prediction errors. Rhythmic presentation of a stimulus enhances predictive coding by reducing sensory uncertainty and promoting active interactions between bottom-up and top-down processes.
It is noteworthy that the task required participants to retain a whole tone sequence in WM, which encouraged the integration of all tones into a single stream. This is in line with previous findings that auditory stream analysis is facilitated when the target or distractor sequence is temporally regular [127–129]. Most previous studies have focused on the effect of rhythm on processing single items presented on-beat versus off-beat, or on the detection of single targets embedded in a rhythmic or arrhythmic temporal context [e.g. 19–21]. Our experiments provide additional evidence that rhythm also benefits the global perception of multiple items and facilitates information integration in WM representations. Moreover, since the long sequence tested here was well over WM capacity, participants may have used chunking or schema strategies to compress it. Here we reveal the beneficial role of rhythm in WM relative to arrhythmic conditions; it is certainly possible that rhythmic presentation contributes to WM partly through the facilitation of chunking or schema formation.
Our work has several limitations. First, we used slightly arrhythmic tone sequences as probes to avoid the influence of temporal context during retrieval. This may have required participants to manipulate their WM representations, increasing task difficulty and weakening the rhythm effect. Second, we only tested pitch WM and did not find correlations between theta-band power and behavioral performance. Finally, constrained by the low spatial resolution of EEG, we could not find direct evidence of the neural pathway connecting the motor system and WM networks, or of the specific neural signals transmitting predictions. Future research is needed to address these issues.
In conclusion, rhythm facilitates auditory WM by driving the sensory-motor system to enhance time-based attention during sensory encoding, which leads to better WM maintenance. Our work constitutes novel evidence for the beneficial role of motor systems in WM. It also provides insights for clinical applications, such as engaging motor systems to improve memory.
Supplementary Information
Below is the link to the electronic supplementary material.
Acknowledgements
This work was supported by the STI2030-Major Project (2021ZD0204100 and 2021ZD0204103) and the National Natural Science Foundation of China (31930052). We thank Haiming Yang for her suggestions on the writing of this article.
Data Availability
The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Conflict of interest
The authors declare no competing financial interests.
References
- 1. Large E. Resonating to musical rhythm: Theory and experiment. The Psychology of Time 2008: 189–231.
- 2. Hogendoorn H. Voluntary saccadic eye movements ride the attentional rhythm. J Cogn Neurosci 2016, 28: 1625–1635.
- 3. MacDougall HG, Moore ST. Marching to the beat of the same drummer: The spontaneous tempo of human locomotion. J Appl Physiol 2005, 99: 1164–1173.
- 4. Ravignani A, Dalla Bella S, Falk S, Kello CT, Noriega F, Kotz SA. Rhythm in speech and animal vocalizations: A cross-species perspective. Ann N Y Acad Sci 2019, 1453: 79–98.
- 5. Levitin DJ, Grahn JA, London J. The psychology of music: Rhythm and movement. Annu Rev Psychol 2018, 69: 51–75.
- 6. Merchant H, Grahn J, Trainor L, Rohrmeier M, Fitch WT. Finding the beat: A neural perspective across humans and non-human primates. Philos Trans R Soc Lond B Biol Sci 2015, 370: 20140093.
- 7. Criscuolo A, Schwartze M, Prado L, Ayala Y, Merchant H, Kotz SA. Macaque monkeys and humans sample temporal regularities in the acoustic environment. Prog Neurobiol 2023, 229: 102502.
- 8. Kotz SA, Ravignani A, Fitch WT. The evolution of rhythm processing. Trends Cogn Sci 2018, 22: 896–910.
- 9. Merchant H, Honing H. Are non-human primates capable of rhythmic entrainment? Evidence for the gradual audiomotor evolution hypothesis. Front Neurosci 2013, 7: 274.
- 10. Patel AD. Vocal learning as a preadaptation for the evolution of human beat perception and synchronization. Philos Trans R Soc Lond B Biol Sci 2021, 376: 20200326.
- 11. Patel AD, Iversen JR. The evolutionary neuroscience of musical beat perception: The Action Simulation for Auditory Prediction (ASAP) hypothesis. Front Syst Neurosci 2014, 8: 57.
- 12. Jones MR. Time, our lost dimension: Toward a new theory of perception, attention, and memory. Psychol Rev 1976, 83: 323–355.
- 13. Jones MR, Boltz M. Dynamic attending and responses to time. Psychol Rev 1989, 96: 459–491.
- 14. Large EW, Jones MR. The dynamics of attending: How people track time-varying events. Psychol Rev 1999, 106: 119–159.
- 15. Auksztulewicz R, Myers NE, Schnupp JW, Nobre AC. Rhythmic temporal expectation boosts neural activity by increasing neural gain. J Neurosci 2019, 39: 9806–9817.
- 16. Cason N, Schön D. Rhythmic priming enhances the phonological processing of speech. Neuropsychologia 2012, 50: 2652–2658.
- 17. Chang A, Bosnyak DJ, Trainor LJ. Rhythmicity facilitates pitch discrimination: Differential roles of low and high frequency neural oscillations. Neuroimage 2019, 198: 31–43.
- 18. Geiser E, Notter M, Gabrieli JD. A corticostriatal neural system enhances auditory perception through temporal context processing. J Neurosci 2012, 32: 6177–6182.
- 19. Jones MR, Moynihan H, MacKenzie N, Puente J. Temporal aspects of stimulus-driven attending in dynamic arrays. Psychol Sci 2002, 13: 313–319.
- 20. Mathewson KE, Fabiani M, Gratton G, Beck DM, Lleras A. Rescuing stimuli from invisibility: Inducing a momentary release from visual masking with pre-target entrainment. Cognition 2010, 115: 186–191.
- 21. Rohenkohl G, Cravo AM, Wyart V, Nobre AC. Temporal expectation improves the quality of sensory information. J Neurosci 2012, 32: 8424–8428.
- 22. Sanabria D, Capizzi M, Correa A. Rhythms that speed you up. J Exp Psychol Hum Percept Perform 2011, 37: 236–244.
- 23. Su Z, Zhou X, Wang L. Dissociated amplitude and phase effects of alpha oscillation in a nested structure of rhythm- and sequence-based temporal expectation. Cereb Cortex 2023, 33: 9741–9755.
- 24. Barczak A, O'Connell MN, McGinnis T, Ross D, Mowery T, Falchier A. Top-down, contextual entrainment of neuronal oscillations in the auditory thalamocortical circuit. Proc Natl Acad Sci U S A 2018, 115: E7605–E7614.
- 25. Lakatos P, Shah AS, Knuth KH, Ulbert I, Karmos G, Schroeder CE. An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. J Neurophysiol 2005, 94: 1904–1911.
- 26. Lakatos P, Karmos G, Mehta AD, Ulbert I, Schroeder CE. Entrainment of neuronal oscillations as a mechanism of attentional selection. Science 2008, 320: 110–113.
- 27. Calderone DJ, Lakatos P, Butler PD, Castellanos FX. Entrainment of neural oscillations as a modifiable substrate of attention. Trends Cogn Sci 2014, 18: 300–309.
- 28. Haegens S, Zion Golumbic E. Rhythmic facilitation of sensory processing: A critical review. Neurosci Biobehav Rev 2018, 86: 150–165.
- 29. Henry MJ, Herrmann B, Obleser J. Entrained neural oscillations in multiple frequency bands comodulate behavior. Proc Natl Acad Sci U S A 2014, 111: 14935–14940.
- 30. Henry MJ, Obleser J. Frequency modulation entrains slow neural oscillations and optimizes human listening behavior. Proc Natl Acad Sci U S A 2012, 109: 20095–20100.
- 31. Lakatos P, Gross J, Thut G. A new unifying account of the roles of neuronal entrainment. Curr Biol 2019, 29: R890–R905.
- 32. Obleser J, Kayser C. Neural entrainment and attentional selection in the listening brain. Trends Cogn Sci 2019, 23: 913–926.
- 33. Palva S, Palva JM. Roles of brain criticality and multiscale oscillations in temporal predictions for sensorimotor processing. Trends Neurosci 2018, 41: 729–743.
- 34. Spaak E, de Lange FP, Jensen O. Local entrainment of α oscillations by visual stimuli causes cyclic modulation of perception. J Neurosci 2014, 34: 3536–3544.
- 35. Stefanics G, Hangya B, Hernádi I, Winkler I, Lakatos P, Ulbert I. Phase entrainment of human delta oscillations can mediate the effects of expectation on reaction speed. J Neurosci 2010, 30: 13578–13585.
- 36. VanRullen R. Perceptual cycles. Trends Cogn Sci 2016, 20: 723–735.
- 37. Huang Y, Chen L, Luo H. Behavioral oscillation in priming: Competing perceptual predictions conveyed in alternating theta-band rhythms. J Neurosci 2015, 35: 2830–2837.
- 38. Jia J, Liu L, Fang F, Luo H. Sequential sampling of visual objects during sustained attention. PLoS Biol 2017, 15: e2001903.
- 39. Landau AN, Fries P. Attention samples stimuli rhythmically. Curr Biol 2012, 22: 1000–1004.
- 40. Song K, Meng M, Chen L, Zhou K, Luo H. Behavioral oscillations in attention: Rhythmic α pulses mediated through θ band. J Neurosci 2014, 34: 4837–4844.
- 41. Morillon B, Hackett TA, Kajikawa Y, Schroeder CE. Predictive motor control of sensory dynamics in auditory active sensing. Curr Opin Neurobiol 2015, 31: 230–238.
- 42. Rimmele JM, Morillon B, Poeppel D, Arnal LH. Proactive sensing of periodic and aperiodic auditory patterns. Trends Cogn Sci 2018, 22: 870–882.
- 43. Schroeder CE, Wilson DA, Radman T, Scharfman H, Lakatos P. Dynamics of active sensing and perceptual selection. Curr Opin Neurobiol 2010, 20: 172–176.
- 44. Schubotz RI. Prediction of external events with our motor system: Towards a new framework. Trends Cogn Sci 2007, 11: 211–218.
- 45. Kleinfeld D, Deschênes M, Ulanovsky N. Whisking, sniffing, and the hippocampal θ-rhythm: A tale of two oscillators. PLoS Biol 2016, 14: e1002385.
- 46. Verhagen JV, Wesson DW, Netoff TI, White JA, Wachowiak M. Sniffing controls an adaptive filter of sensory input to the olfactory bulb. Nat Neurosci 2007, 10: 631–639.
- 47. Wachowiak M. All in a sniff: Olfaction as a model for active sensing. Neuron 2011, 71: 962–973.
- 48. Wesson DW, Donahou TN, Johnson MO, Wachowiak M. Sniffing behavior of mice during performance in odor-guided tasks. Chem Senses 2008, 33: 581–596.
- 49. Cannon JJ, Patel AD. How beat perception co-opts motor neurophysiology. Trends Cogn Sci 2021, 25: 137–150.
- 50. Chen JL, Penhune VB, Zatorre RJ. Listening to musical rhythms recruits motor regions of the brain. Cereb Cortex 2008, 18: 2844–2854.
- 51. Kasdan AV, Burgess AN, Pizzagalli F, Scartozzi A, Chern A, Kotz SA, et al. Identifying a brain network for musical rhythm: A functional neuroimaging meta-analysis and systematic review. Neurosci Biobehav Rev 2022, 136: 104588.
- 52. Keitel A, Gross J, Kayser C. Perceptually relevant speech tracking in auditory and motor cortex reflects distinct linguistic features. PLoS Biol 2018, 16: e2004473.
- 53. Large EW, Roman I, Kim JC, Cannon J, Pazdera JK, Trainor LJ, et al. Dynamic models for musical rhythm perception and coordination. Front Comput Neurosci 2023, 17: 1151895.
- 54. Matthews TE, Witek MAG, Lund T, Vuust P, Penhune VB. The sensation of groove engages motor and reward networks. Neuroimage 2020, 214: 116768.
- 55. Ross JM, Balasubramaniam R. Time perception for musical rhythms: Sensorimotor perspectives on entrainment, simulation, and prediction. Front Integr Neurosci 2022, 16: 916220.
- 56. Teki S, Grube M, Kumar S, Griffiths TD. Distinct neural substrates of duration-based and beat-based auditory timing. J Neurosci 2011, 31: 3805–3812.
- 57. Todd NPM, Lee CS. The sensory-motor theory of rhythm and beat induction 20 years on: A new synthesis and future perspectives. Front Hum Neurosci 2015, 9: 444.
- 58. Zatorre RJ, Chen JL, Penhune VB. When the brain plays music: Auditory-motor interactions in music perception and production. Nat Rev Neurosci 2007, 8: 547–558.
- 59. Ravizza SM, Uitvlugt MG, Hazeltine E. Where to start? Bottom-up attention improves working memory by determining encoding order. J Exp Psychol Hum Percept Perform 2016, 42: 1959–1968.
- 60. Lim SJ, Wöstmann M, Obleser J. Selective attention to auditory memory neurally enhances perceptual precision. J Neurosci 2015, 35: 16094–16104.
- 61. Souza AS, Oberauer K. In search of the focus of attention in working memory: 13 years of the retro-cue effect. Atten Percept Psychophys 2016, 78: 1839–1860.
- 62. Peters B, Kaiser J, Rahm B, Bledowski C. Object-based attention prioritizes working memory contents at a theta rhythm. J Exp Psychol Gen 2021, 150: 1250–1256.
- 63. Panichello MF, Buschman TJ. Shared mechanisms underlie the control of working memory and attention. Nature 2021, 592: 601–605.
- 64. Gazzaley A, Nobre AC. Top-down modulation: Bridging selective attention and working memory. Trends Cogn Sci 2012, 16: 129–135.
- 65. Gresch D, Boettcher SEP, van Ede F, Nobre AC. Shifting attention between perception and working memory. Cognition 2024, 245: 105731.
- 66. van Ede F, Nobre AC. Turning attention inside out: How working memory serves behavior. Annu Rev Psychol 2023, 74: 137–165.
- 67. Brzezicka A, Kamiński J, Reed CM, Chung JM, Mamelak AN, Rutishauser U. Working memory load-related theta power decreases in dorsolateral prefrontal cortex predict individual differences in performance. J Cogn Neurosci 2019, 31: 1290–1307.
- 68. Jensen O, Tesche CD. Frontal theta activity in humans increases with memory load in a working memory task. Eur J Neurosci 2002, 15: 1395–1399.
- 69. Zakrzewska MZ, Brzezicka A. Working memory capacity as a moderator of load-related frontal midline theta variability in Sternberg task. Front Hum Neurosci 2014, 8: 399.
- 70. Albouy P, Weiss A, Baillet S, Zatorre RJ. Selective entrainment of theta oscillations in the dorsal stream causally enhances auditory working memory performance. Neuron 2017, 94: 193–206.e5.
- 71. Albouy P, Baillet S, Zatorre RJ. Driving working memory with frequency-tuned noninvasive brain stimulation. Ann N Y Acad Sci 2018, 1423: 126–137.
- 72. Hanslmayr S, Axmacher N, Inman CS. Modulating human memory via entrainment of brain oscillations. Trends Neurosci 2019, 42: 485–499.
- 73. Riddle J, Scimeca JM, Cellier D, Dhanani S, D'Esposito M. Causal evidence for a role of theta and alpha oscillations in the control of working memory. Curr Biol 2020, 30: 1748–1754.e4.
- 74. Violante IR, Li LM, Carmichael DW, Lorenz R, Leech R, Hampshire A, et al. Externally induced frontoparietal synchronization modulates network dynamics and enhances working memory performance. Elife 2017, 6: e22001.
- 75. Hickey P, Merseal H, Patel AD, Race E. Memory in time: Neural tracking of low-frequency rhythm dynamically modulates memory formation. Neuroimage 2020, 213: 116693.
- 76. Johndro H, Jacobs L, Patel AD, Race E. Temporal predictions provided by musical rhythm influence visual memory encoding. Acta Psychol 2019, 200: 102923.
- 77. Jones A, Ward EV. Rhythmic temporal structure at encoding enhances recognition memory. J Cogn Neurosci 2019, 31: 1549–1562.
- 78. Thavabalasingam S, O'Neil EB, Zeng Z, Lee ACH. Recognition memory is improved by a structured temporal framework during encoding. Front Psychol 2016, 6: 2062.
- 79.Albouy P, Martinez-Moreno ZE, Hoyer RS, Zatorre RJ, Baillet S. Supramodality of neural entrainment: Rhythmic visual stimulation causally enhances auditory working memory performance. Sci Adv 2022, 8: eabj9782. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Abbasi O, Gross J. Beta-band oscillations play an essential role in motor-auditory interactions. Hum Brain Mapp 2020, 41: 656–665. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Morillon B, Baillet S. Motor origin of temporal predictions in auditory attention. Proc Natl Acad Sci U S A 2017, 114: E8913–E8921. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Reznik D, Mukamel R. Motor output, neural states and auditory perception. Neurosci Biobehav Rev 2019, 96: 116–126. [DOI] [PubMed] [Google Scholar]
- 83.Wiecki TV, Sofer I, Frank MJ. HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python. Front Neuroinform 2013, 7: 14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Andrejević M, White JP, Feuerriegel D, Laham S, Bode S. Response time modelling reveals evidence for multiple, distinct sources of moral decision caution. Cognition 2022, 223: 105026. [DOI] [PubMed] [Google Scholar]
- 85.Ratcliff R, Tuerlinckx F. Estimating parameters of the diffusion model: Approaches to dealing with contaminant reaction times and parameter variability. Psychon Bull Rev 2002, 9: 438–481. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Son JY, Bhandari A, FeldmanHall O. Crowdsourcing punishment: Individuals reference group preferences to inform their own punitive decisions. Sci Rep 2019, 9: 11625. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Cavanagh JF, Wiecki TV, Cohen MX, Figueroa CM, Samanta J, Sherman SJ, et al. Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold. Nat Neurosci 2011, 14: 1462–1467. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Maris E, Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods 2007, 164: 177–190. [DOI] [PubMed] [Google Scholar]
- 89.Ratcliff R, Smith PL, Brown SD, McKoon G. Diffusion decision model: Current issues and history. Trends Cogn Sci 2016, 20: 260–281. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Voss A, Rothermund K, Voss J. Interpreting the parameters of the diffusion model: An empirical validation. Mem Cognit 2004, 32: 1206–1220. [DOI] [PubMed] [Google Scholar]
- 91.Dunovan K, Verstynen T. Errors in action timing and inhibition facilitate learning by tuning distinct mechanisms in the underlying decision process. J Neurosci 2019, 39: 2251–2264. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Horn SS, Bayen UJ. Modeling criterion shifts and target checking in prospective memory monitoring. J Exp Psychol Learn Mem Cogn 2015, 41: 95–117. [DOI] [PubMed] [Google Scholar]
- 93.Fujioka T, Trainor LJ, Large EW, Ross B. Internalized timing of isochronous sounds is represented in neuromagnetic β oscillations. J Neurosci 2012, 32: 1791–1802. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Fujioka T, Ross B, Trainor LJ. Beta-band oscillations represent auditory beat and its metrical hierarchy in perception and imagery. J Neurosci 2015, 35: 15187–15198. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Graber E, Fujioka T. Induced beta power modulations during isochronous auditory beats reflect intentional anticipation before gradual tempo changes. Sci Rep 2020, 10: 4207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Morillon B, Schroeder CE, Wyart V, Arnal LH. Temporal Prediction in lieu of Periodic Stimulation. J Neurosci 2016, 36: 2342–2347. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Ratcliff R, Smith PL. Perceptual discrimination in static and dynamic noise: The temporal relation between perceptual encoding and decision making. J Exp Psychol Gen 2010, 139: 70–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Sewell DK, Lilburn SD, Smith PL. Object selection costs in visual working memory: A diffusion model analysis of the focus of attention. J Exp Psychol Learn Mem Cogn 2016, 42: 1673–1693. [DOI] [PubMed] [Google Scholar]
- 99.Groppe DM, Bickel S, Keller CJ, Jain SK, Hwang ST, Harden C, et al. Dominant frequencies of resting human brain activity as measured by the electrocorticogram. Neuroimage 2013, 79: 223–233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Hillebrand A, Barnes GR, Bosboom JL, Berendse HW, Stam CJ. Frequency-dependent functional connectivity within resting-state networks: An atlas-based MEG beamformer solution. Neuroimage 2012, 59: 3909–3921. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Litvak V, Jha A, Eusebio A, Oostenveld R, Foltynie T, Limousin P, et al. Resting oscillatory cortico-subthalamic connectivity in patients with Parkinson’s disease. Brain 2011, 134: 359–374. [DOI] [PubMed] [Google Scholar]
- 102.Sugata H, Yagi K, Yazawa S, Nagase Y, Tsuruta K, Ikeda T, et al. Role of beta-band resting-state functional connectivity as a predictor of motor learning ability. Neuroimage 2020, 210: 116562. [DOI] [PubMed] [Google Scholar]
- 103.Engel AK, Fries P. Beta-band oscillations—signalling the status quo? Curr Opin Neurobiol 2010, 20: 156–165. [DOI] [PubMed] [Google Scholar]
- 104.Arnal LH, Doelling KB, Poeppel D. Delta-beta coupled oscillations underlie temporal prediction accuracy. Cereb Cortex 2015, 25: 3077–3085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Betti V, Della Penna S, de Pasquale F, Corbetta M. Spontaneous beta band rhythms in the predictive coding of natural stimuli. Neuroscientist 2021, 27: 184–201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Biau E, Kotz SA. Lower beta: A central coordinator of temporal prediction in multimodal speech. Front Hum Neurosci 2018, 12: 434. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Kononowicz TW, van Rijn H. Single trial beta oscillations index time estimation. Neuropsychologia 2015, 75: 381–389. [DOI] [PubMed] [Google Scholar]
- 108.Meijer D, Te Woerd E, Praamstra P. Timing of beta oscillatory synchronization and temporal prediction of upcoming stimuli. NeuroImage 2016, 138: 233–241. [DOI] [PubMed] [Google Scholar]
- 109.Schmidt-Kassow M, White TN, Abel C, Kaiser J. Pre-stimulus beta power varies as a function of auditory-motor synchronization and temporal predictability. Front Neurosci 2023, 17: 1128197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Morillon B, Arnal LH, Schroeder CE, Keitel A. Prominence of delta oscillatory rhythms in the motor cortex and their relevance for auditory and speech perception. Neurosci Biobehav Rev 2019, 107: 136–142. [DOI] [PubMed] [Google Scholar]
- 111.Hsieh LT, Ranganath C. Frontal midline theta oscillations during working memory maintenance and episodic encoding and retrieval. Neuroimage 2014, 85(Pt 2): 721–729. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Berger B, Griesmayr B, Minarik T, Biel AL, Pinal D, Sterr A, et al. Dynamic regulation of interregional cortical communication by slow brain oscillations during working memory. Nat Commun 2019, 10: 4242. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.de Vries IEJ, Slagter HA, Olivers CNL. Oscillatory control over representational states in working memory. Trends Cogn Sci 2020, 24: 150–162. [DOI] [PubMed] [Google Scholar]
- 114.Ratcliffe O, Shapiro K, Staresina BP. Fronto-medial theta coordinates posterior maintenance of working memory content. Curr Biol 2022, 32: 2121-2129.e3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Sauseng P, Griesmayr B, Freunberger R, Klimesch W. Control mechanisms in working memory: A possible function of EEG theta oscillations. Neurosci Biobehav Rev 2010, 34: 1015–1022. [DOI] [PubMed] [Google Scholar]
- 116.Axmacher N, Henseler MM, Jensen O, Weinreich I, Elger CE, Fell J. Cross-frequency coupling supports multi-item working memory in the human hippocampus. Proc Natl Acad Sci U S A 2010, 107: 3228–3233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Hsieh LT, Ekstrom AD, Ranganath C. Neural oscillations associated with item and temporal order maintenance in working memory. J Neurosci 2011, 31: 10803–10810. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Lisman JE, Idiart MA. Storage of 7 ± 2 short-term memories in oscillatory subcycles. Science 1995, 267: 1512–1515. [DOI] [PubMed] [Google Scholar]
- 119.Rajji TK, Zomorrodi R, Barr MS, Blumberger DM, Mulsant BH, Daskalakis ZJ. Ordering information in working memory and modulation of gamma by theta oscillations in humans. Cereb Cortex 2017, 27: 1482–1490. [DOI] [PubMed] [Google Scholar]
- 120.Nakamura-Palacios EM, Falçoni Júnior AT, Anders QS, de Paula LDSP, Zottele MZ, Ronchete CF, et al. Would frontal midline theta indicate cognitive changes induced by non-invasive brain stimulation? A mini review. Front Hum Neurosci 2023, 17: 1116890. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Liao DA, Kronemer SI, Yau JM, Desmond JE, Marvel CL. Motor system contributions to verbal and non-verbal working memory. Front Hum Neurosci 2014, 8: 753. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Marvel CL, Morgan OP, Kronemer SI. How the motor system integrates with working memory. Neurosci Biobehav Rev 2019, 102: 184–194. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Friston K. Beyond phrenology: What can neuroimaging tell us about distributed circuitry? Annu Rev Neurosci 2002, 25: 221–250. [DOI] [PubMed] [Google Scholar]
- 124.Friston K. The free-energy principle: A unified brain theory? Nat Rev Neurosci 2010, 11: 127–138. [DOI] [PubMed] [Google Scholar]
- 125.Koelsch S, Vuust P, Friston K. Predictive processes and the peculiar case of music. Trends Cogn Sci 2019, 23: 63–77. [DOI] [PubMed] [Google Scholar]
- 126.Vuust P, Witek MA. Rhythmic complexity and predictive coding: A novel approach to modeling rhythm and meter perception in music. Front Psychol 2014, 5: 1111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Andreou LV, Kashino M, Chait M. The role of temporal regularity in auditory segregation. Hear Res 2011, 280: 228–235. [DOI] [PubMed] [Google Scholar]
- 128.Bendixen A. Predictability effects in auditory scene analysis: A review. Front Neurosci 2014, 8: 60. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Devergie A, Grimault N, Tillmann B, Berthommier F. Effect of rhythmic attention on the segregation of interleaved melodies. J Acoust Soc Am 2010, 128: EL1–EL7. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.