Science Advances. 2025 Sep 17;11(38):eadz0510. doi: 10.1126/sciadv.adz0510

How musicality enhances top-down and bottom-up selective attention: Insights from precise separation of simultaneous neural responses

Cassia Low Manting 1,2,3,*, Dimitrios Pantazis 1, John Gabrieli 1, Daniel Lundqvist 3
PMCID: PMC12442866  PMID: 40961204

Abstract

Natural environments typically contain a blend of simultaneous sounds. A substantial challenge in neuroscience is identifying specific neural signals corresponding to each sound and analyzing them separately. Combining frequency tagging and machine learning, we achieved high-precision separation of neural responses to mixed melodies, classifying them by selective attention toward specific melodies. Across two magnetoencephalography datasets, individual musicality and task performance heavily influenced the attentional recruitment of cortical regions, correlating positively with top-down attention in the left parietal cortex but negatively with bottom-up attention in the right. In prefrontal areas, neural responses indicating higher sustained selective attention reflected better performance and musicality. These results suggest that musical training enhances neural mechanisms in the frontoparietal regions, boosting performance via improving top-down attention, reducing bottom-up distractions, and maintaining selective attention over time. This work establishes the effectiveness of combining frequency tagging with machine learning to capture cognitive and behavioral effects with stimulus precision, applicable to other studies involving simultaneous stimuli.


Precise separation of simultaneous neural responses reveals how musical experience shapes selective attention mechanisms.

INTRODUCTION

Analyzing neurophysiological signals generated by multiple concurrent stimuli is challenging due to the difficulty of resolving brain activity into responses to each individual stimulus. In auditory neuroscience, an effective method to achieve this is crucial as simultaneous sounds are commonly encountered in naturalistic environments, such as a cocktail party scenario or an orchestra performance.

In terms of separation precision, few techniques can rival frequency tagging, which labels neural activity with a unique frequency-based ID tag, allowing it to be identified from a mixture of other activities. Frequency tagging elicits a neural auditory steady-state response (ASSR) which phase-locks to the envelope of the driving auditory stimulus and is measurable via electrophysiological methods like magnetoencephalography (MEG). Simultaneous neural ASSRs with unique frequencies can thus be reliably isolated and extracted using power spectral density estimation techniques such as Fourier analysis (1, 2). The magnitude and spatial distribution of ASSRs over time offer a powerful, noninvasive means of characterizing distributed neural processing across the brain with exceptional precision.

Because of its unique capability to generate robust and reproducible neural signatures (2, 3), frequency tagging has found applications in both clinical (4, 5) and experimental research. In cognitive neuroscience, it has been used to investigate intermodal selective attention, demonstrating that cortical ASSRs are enhanced when attention is directed toward an auditory stimulus and away from a competing visual stimulus (6–8). Within intramodal auditory attention, ASSR enhancements have also been observed in dichotic paradigms where spatial attention shifted between the left and right ears (9–11).

However, in more complex scenarios, frequency tagging is notoriously difficult to implement, often producing steady-state responses with insufficient signal-to-noise ratio (SNR) to capture subtle cognitive or behavioral effects, such as feature-selective auditory attention. This challenge is worsened when multiple sound sources (e.g., voices/instruments) are presented simultaneously due to the suppression of ASSRs that occurs in the presence of simultaneous sounds, yielding inconsistent results (12–14). An obstacle faced by many ASSR studies seeking higher SNR is that auditory frequency tagging is extremely intolerant toward small timing inconsistencies (e.g., jitters in stimulus onset time), as out-of-phase signals can easily cancel each other out during averaging. To circumvent this limitation, we operated a specialized auditory setup that achieves near-zero lag in sound delivery (see Materials and Methods) and recorded brain activity at high temporal resolution using MEG. Moreover, although signal averaging within regions of interest (ROIs) is often implemented to improve the SNR, even small intersubject spatial variations can dilute and obscure potential effects when using such univariate approaches. To address this, we developed a unique multivariate decoding method optimized for high-precision ASSR analysis, with enhanced sensitivity to detect subtle cognitive and behavioral modulations.

Auditory training has been shown to induce neuroplastic changes that enhance auditory processing (15–17). Music training, in particular, enhances higher-order cognitive skills such as pitch perception, auditory attention, and selective listening in noisy environments, with transfer effects to domains like language processing—possibly mediated by overlapping neural networks (17–21). Prior evidence from our laboratory demonstrated that the ASSR power correlates positively with individual musicality (22), suggesting that the ASSR is sensitive to musical training. However, whether this sensitivity extends to its attentional modulation remains unknown. Leveraging the enhanced sensitivity of our multivariate decoding approach, the present study aims to characterize how musicality influences top-down and bottom-up mechanisms across the cortex, contributing to enhanced selective attention.

Here, we established the robustness of our signal separation method, with clear evidence from two experiments, showing that our algorithms excelled in their ability to (i) resolve neural signals according to their driving stimuli with exceptional precision, (ii) isolate and extract the effects of top-down and bottom-up selective attention to melody pitch over cortical regions, as well as (iii) achieve this on a continuous scale that correlated significantly with individual musicality and performance scores on selective melody listening tasks.

Overall, results demonstrate that the pattern of brain activity and direction of correlations differ significantly between top-down and bottom-up attention at both sensor and source levels, with compelling evidence highlighting the frontoparietal cortices as key drivers of these differences. The findings also indicate that musical training sustains and enhances selective auditory attention over time through frontal mechanisms, offering valuable insights into how learning and expertise optimize brain function to accomplish cognitive goals.

RESULTS

General task description

Participants listened to two simultaneous melodies (see Materials and Methods) of different pitch for a randomized duration of 10 to 15 s per block. The low-pitched melody was frequency-tagged at 39 Hz, and the high-pitched melody was frequency-tagged at 43 Hz to elicit corresponding neural responses at 39 and 43 Hz (Fig. 1).

Fig. 1. Experimental tasks and behavioral results.


(A) Melody attention task. Participants attended selectively to one of two simultaneous melodies with different pitches. The low-pitched and high-pitched melodies were frequency-tagged at 39 and 43 Hz, respectively. When the melody stopped, participants reported the final direction of pitch change for the attended melody, which was either falling, rising, or constant. Identical sounds were presented to both ears, ensuring that the melodies could only be distinguished by pitch or timing. (B) Two variations of the experimental task. Experiment I (left): Alternately overlapping melodies. Tone onsets alternate between the low-pitched and high-pitched melodies. Each tone onset induces bottom-up attention toward it, allowing the dissociation of top-down and bottom-up attention effects toward each melody. Experiment II (right): Completely overlapping melodies. Melody tones overlap completely, engaging bottom-up attention simultaneously toward both melodies. All other parameters including modulation frequencies (fm) and carrier frequencies (fc) were identical to experiment I. (C) Correlations between participant task performance and musicality. Task performance correlated positively (Pearson correlation, P < 0.001) with musicality across participants in present (Expt I and II) and past experiments (Expt 0). Musicality was measured by the Goldsmiths MSI (25, 26). Notably, the strength of correlation (Pearson r) between musicality and performance increased with task complexity. The least complex task (Expt 0, r = 0.64, N = 28) involved selectively attending to melodies that were completely separated in time and pitch. The moderately complex task (Expt I, r = 0.71, N = 28) required attending to melodies that were partly separated in time and pitch. The most complex task (Expt II, r = 0.79, N = 20) involved selectively attending to melodies that completely overlapped in time and could only be separated by pitch. Expt, experiment.

For both experiments (experiments I and II), we manipulated top-down selective attention by instructing participants to focus exclusively on either the low-pitched or high-pitched melody. In experiment I, the tone onsets alternated between the low-pitched and high-pitched melodies, drawing bottom-up attention toward each tone onset in turn. This design therefore allowed us to dissociate top-down and bottom-up attention effects toward each melody. In experiment II, the tone onsets for both melodies occurred concurrently, thereby engaging bottom-up attention simultaneously toward both melodies.

For both experiments, the mean task performance (measured by the fraction of correct responses) across participants was above the chance level of 0.33, verifying that selective attention was manipulated accordingly (experiment I: N = 28, mean = 0.70, SD = 0.25; experiment II: N = 20, mean = 0.77, SD = 0.21; two-tailed t tests, P < 0.001). Furthermore, similar to previous experiments (22–24) (see fig. S1 for a description of previous experiments), task performance correlated positively with individual musicality in the current two experiments (Pearson correlation, P < 0.001). Musicality was measured by an adapted version of the Goldsmiths musical sophistication index (MSI) (25, 26). Notably, the strength of correlation (Pearson r) between musicality and performance increased with task complexity (Fig. 1C). The least complex task (experiment 0, r = 0.64) involved selectively attending to melodies that were completely separated in time and pitch. The moderately complex task (experiment I, r = 0.71) required attending to melodies that were partly separated in time and pitch. The most complex task (experiment II, r = 0.79) involved selectively attending to melodies that completely overlapped in time and could only be separated by pitch.
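As a minimal illustration of the behavioral statistics reported above, the sketch below runs a two-tailed one-sample t test against the 0.33 chance level and a Pearson correlation between musicality and performance using SciPy. The arrays are simulated placeholders, not the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated placeholder scores (not the study data): one value per participant.
performance = rng.uniform(0.4, 1.0, size=28)   # fraction of correct responses
musicality = rng.uniform(40, 132, size=28)     # Goldsmiths MSI scores

# Two-tailed one-sample t test of performance against the 0.33 chance level.
t_stat, p_chance = stats.ttest_1samp(performance, popmean=0.33)

# Pearson correlation between musicality and task performance.
r, p_corr = stats.pearsonr(musicality, performance)
print(f"t = {t_stat:.2f} (P = {p_chance:.3g}); r = {r:.2f} (P = {p_corr:.3g})")
```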

Repeated splitting decoder classified selective attention exclusively at frequency tags

We developed a specialized decoder with heightened sensitivity for ASSRs, using a repeated splitting classification algorithm (see Fig. 2A and the “Classification of attention” section for details). The repeated splitting classifier was trained on the frequency-tagged neural activity to discriminate between conditions in which attention was directed toward either the low-pitched or high-pitched melody. The area under the curve (AUC) was computed to measure the classifier’s ability to discriminate between these attention conditions, which was expected to increase with stronger engagement of selective attention.

Fig. 2. Repeated splitting classification of selective attention at frequency tags.


We trained a classifier on frequency-tagged neural activities to discriminate between conditions where attention was directed toward the 39-Hz low-pitched or 43-Hz high-pitched melody. (A) Single-subject repeated splitting support vector machine pipeline for ASSR classification. For each condition, epochs were randomly divided into five groups and averaged to produce five evoked ASSRs. Next, the evoked ASSRs were Fourier transformed to acquire five power spectra. For any contrast between two conditions, the corresponding power spectra (five per condition) were classified via cross-validation (chance level = 0.5), obtaining the AUC which was expected to increase with higher selective attention. Subsequently, we repeated the entire process from the initial group division step 1000 times and computed the mean AUC across repetitions. The AUC was computed independently for each frequency from 4 to 45 Hz. For source analysis, the evoked ASSRs were localized to the cortical surface before Fourier transformation (step iii). Below the power spectra, MEG gradiometer topographies of a single fold input to the classifier demonstrated dominant auditory cortical power precisely tagged to 39 and 43 Hz but not at the adjacent 41 Hz. For visualization, the power at each frequency was normalized to the mean across all frequencies, and the subject grand average for one condition in experiment I is shown. Units are arbitrary (a.u.). (B and C) Peak AUC values were specifically observed at the stimulus frequency tags of 39 and 43 Hz (red vertical lines) but not at other frequencies. For each plot, the mean across participants is shown in black, with the AUC peak values at 39 and 43 Hz displayed in gray boxes above. Shading indicates SEM. (B) Experiment I (N = 25), top-down (left) and bottom-up attention (right). (C) Experiment II (N = 20).

For experiment I, peak discrimination of the classifier was observed specifically at the set stimulus frequency tags of 39 and 43 Hz but not at other frequencies. This applied to both top-down [AUC = 0.62 (39 Hz) and 0.60 (43 Hz); Fig. 2B, left] and bottom-up attention [AUC = 0.54 (39 Hz) and 0.63 (43 Hz); Fig. 2B, right]. Similarly for experiment II, peak discrimination was observed at 39 and 43 Hz [AUC = 0.62 (39 and 43 Hz), obtained from the higher AUC between the early half and late half of the tone at each frequency; Fig. 2C]. The individual tone halves, however, did not exhibit any discrimination peak at the modulation frequencies (fig. S2). These results confirm that we can reliably extract stimulus-specific neural responses at their predefined frequency tags (i.e., 39 and 43 Hz) and that these responses are sufficiently sensitive to capture the effects of selective attention.

Bottom-up attention triggers temporal processing; top-down attention recruits frontal mechanisms

We proceeded to examine how different brain regions are recruited in selective attentional processes, focusing on cortical areas that have been observed to contain ASSR sources in our previous studies (22–24). Six ROIs, namely, the orbital gyrus (OrG), superior temporal gyrus (STG), and inferior parietal lobe (IPL) at each hemisphere, were identified and demarcated with a predefined atlas [Brainnetome Atlas (27)].

Among these six regions, bottom-up attention was discriminated above the chance level of 0.5 only at the right STG, while top-down attention was discriminated above chance at all regions [P < 0.01 for all significant cases, n = 10,000, one-tailed permutation test, false discovery rate (FDR) corrected for 12 tests] except the right IPL (Fig. 3). At the left and right OrG, the classifier could discriminate top-down attention significantly better than bottom-up attention (left: P < 0.01, right: P < 0.001, n = 10,000, two-tailed permutation test, FDR corrected for six tests). These results are in agreement with current literature (28–31) asserting that bottom-up attention is triggered by lower-level automatic sensory mechanisms situated predominantly in the primary cortices such as the STG, while top-down attention recruits higher-level executive mechanisms located in the prefrontal cortex.

Fig. 3. Top-down and bottom-up selective attention across cortical regions.


Across participants (individual markers; N = 27), bottom-up attention was significantly discriminated above chance (dotted line) only at the right STG, while top-down attention was discriminated above chance at all regions except the right IPL. Significance levels above chance are marked by black asterisks beside the corresponding attention condition on the x axis. The y axis denotes the AUC and is identical for all subplots. Box plots outline the 25th to 75th percentiles of the data, with the center dot indicating the mean. At the left and right orbital gyri (OrG), top-down attention was discriminated significantly better than bottom-up attention (blue brackets and asterisks). These results support the notion that bottom-up attention is triggered by lower-level automatic sensory mechanisms situated predominantly in the primary cortices such as the STG, while top-down attention recruits higher-level executive mechanisms located in the prefrontal cortex. All P values were computed with permutation tests and FDR corrected. ***P < 0.001 and **P < 0.01. Neural activity illustrating cortical power within each ROI is displayed over the standard brain at the bottom of each subplot. For visualization, the subject grand average activity at 39 Hz was normalized to the mean power across all frequencies for a single condition. Units are arbitrary (a.u.).

Top-down and bottom-up attention show opposing correlations with musicality and performance

To examine the relationship between selective attention, musicality, and task performance, we correlated the AUC with participants’ musicality scores as well as task performance. For musicality, a positive correlation with top-down attention was observed already at sensor level in experiment I (Pearson correlation, r = 0.42, P < 0.05; fig. S3A).

We proceeded to investigate which ASSR sources underlie this correlation, with our primary analysis centered on sources in the IPL, a region that has shown correlations with musicality in our previous study (22). The left IPL exhibited a significant positive correlation between top-down attention and musicality (Pearson correlation, r = 0.49, P < 0.01; Fig. 4A, left). Further analysis of the remaining regions revealed that this correlation effect was also driven by the left OrG (Pearson correlation, r = 0.40, P < 0.05) and right STG (Pearson correlation, r = 0.39, P < 0.05), albeit to a smaller extent (fig. S3C).

Fig. 4. Correlations of selective attention with musicality and performance.


Scatterplots showing correlations between classifier AUC values against individual musicality or task performance. The AUC is plotted in the y axis for all subfigures and reflects the degree of selective attention. Neural activity illustrating cortical power within each ROI is displayed over the standard brain at the bottom of each subplot. For visualization, the subject grand average activity at 39 Hz was normalized to the mean power across all frequencies for a single condition. Units are arbitrary (a.u.). (A) In experiment I, top-down attention at the left IPL shows positive correlations with both musicality (left; Pearson correlation, r = 0.49, P < 0.01) and performance (right; Pearson correlation, r = 0.38, P < 0.05) across 27 participants. (B) Across the same participants, bottom-up attention correlated negatively with performance at the right IPL (Pearson correlation, r = −0.47, P < 0.05). (C) Similarly, in experiment II, bottom-up attention correlated negatively with performance at the right IPL and right OrG across 19 participants (Pearson correlation, r = −0.47, P < 0.05 for both regions).

For task performance, a positive correlation with top-down attention was also observed at the left IPL in experiment I (Pearson correlation, r = 0.38, P < 0.05; Fig. 4A, right). In contrast, the correlation between bottom-up attention and task performance was negative at the right IPL (Pearson correlation, r = −0.47, P < 0.05; Fig. 4B). This negative correlation, although weaker, was already observed at sensor level (Pearson correlation, r = −0.45, P < 0.05; fig. S3B). Furthermore, in experiment II, a negative correlation between bottom-up attention and task performance was observed for the early half of the tone but not the late half. This effect was also observed at the right IPL (Pearson correlation, r = −0.47, P < 0.05; Fig. 4C, left), as in experiment I, as well as at the right OrG (Pearson correlation, r = −0.47, P < 0.05; Fig. 4C, right). The early half of the tone was thought to elicit a stronger bottom-up attention effect (on both melodies) due to the perceptually salient change in pitch. In contrast, the late half, being a continuation of the same tone, would not strongly draw bottom-up attention. Hence, we infer that the negative correlation arose from the effect of bottom-up attention rather than top-down attention, which was unlikely to differ systematically between the two halves of the tone. Success in performing the experimental task requires participants to actively direct top-down selective attention toward the attended melody while inhibiting bottom-up attentional diversions toward task-irrelevant, salient pitch changes in the competing melody. Consequently, successful inhibition suppresses the effect of bottom-up attention on neural activity, leading to poorer classification. Together, the results from both experiments provide compelling evidence that enhancements in selective attention across individuals, likely from musical training, occur by enhancing top-down mechanisms while suppressing bottom-up mechanisms in the frontoparietal regions.

Musical training sustains selective attention mechanisms in the prefrontal cortex, enhancing performance

To investigate temporal dynamics of selective attention, we used a sliding time window to calculate AUC values over the 2-s tone duration in experiment II. The time of peak selective attention, corresponding to maximum AUC, appeared to follow a bimodal distribution across participants, with a separation at around 0.5 s (Fig. 5A). Participants were thus categorized into two equal groups based on their time of peak selective attention: early attendees, whose selective attention peaked before 0.5 s from tone onset, and late attendees, whose selective attention peaked after 0.5 s. We found that late attendees performed significantly better at the task than early attendees (∆ = 0.19, P < 0.05, n = 10,000, two-tailed permutation test; Fig. 5B, right). Moreover, late attendees tended to be more musical than early attendees, although the difference narrowly missed significance (∆ = 34.3, P = 0.057, n = 10,000, two-tailed permutation test; Fig. 5B, left).

Fig. 5. Musicality and performance across early and late attendees.


(A) Histogram of the time of peak selective attention across participants (N = 20). We computed the classifier AUC over the 2-s tone duration using a sliding time window and extracted the time of peak selective attention, corresponding to maximum AUC, for each participant in experiment II. The distribution appeared to be bimodal, dividing participants equally into “early” (blue) or “late” (orange) attendees depending on whether their attention peaked before or after 0.5 s, respectively. Colored lines represent normal distributions fitted to each group of attendees. (B) Musicality (left) and performance (right) comparison between early versus late attendees. Late attendees performed significantly better (∆ = 0.19, P < 0.05, n = 10,000, two-tailed permutation test) at the task than early attendees. Moreover, late attendees tended to be more musical than early attendees, although the difference narrowly missed significance (∆ = 34.3, P = 0.057, n = 10,000, two-tailed permutation test). Box plots represent the 25th to 75th percentiles of the data, with the center dot indicating the mean. (C) Correlational analysis of lateness index with musicality (left) and performance (right). For each participant, we computed a lateness index that reflects the relative strength of selective attention in the late half of the tone compared to the early half. The lateness index correlated positively with both musicality and task performance at the right OrG (colored green over the adjacent standard brain) (Pearson correlation, P < 0.01 for musicality and performance). Together, these results suggest that musical training sharpens the neural mechanisms for sustaining or improving auditory selective attention over time, particularly in the right prefrontal cortex, thereby boosting task performance.

At source level, we compared selective attention in the early half of the tone to the late half. To quantify the difference, we computed a “lateness index” for each participant, reflecting the relative strength of selective attention in the late half compared to the early half (Fig. 5C). Our findings at the right OrG revealed positive correlations between the lateness index and both musicality as well as task performance across participants (Pearson correlation, P < 0.01 for musicality and performance; Fig. 5C). Essentially, participants who had better musical skills and task performance paid higher selective attention during the late half compared to the early half of the tone. Conversely, participants who were less musical and performed poorly could not maintain selective attention throughout the tone duration. These results suggest that musical training sharpens the neural mechanisms for sustaining or improving auditory selective attention over time, particularly in the right prefrontal cortex, thereby boosting task performance.

DISCUSSION

The separation of simultaneous neural responses presents a substantial challenge to experimental neuroscience, especially when disentangling subtle cognitive and behavioral effects from complex auditory scenes. By combining the unparalleled separation precision of frequency tagging with a repeated splitting classification approach, we isolated the effect of selective attention to a target melody within a mixture of two simultaneous melodies, further demonstrating the implications of individual task performance and musicality, both at sensor and source levels. The combined findings across two experiments demonstrate that our implementation of frequency tagging in simultaneous stimuli is both reliable and robust. These results elucidated the differential recruitment of cortical regions in bottom-up and top-down attentional processes, aligning with established theories of attentional control which posit that top-down attention engages higher-level, goal-directed executive mechanisms orchestrated by prefrontal regions, while bottom-up attention is driven by automatic lower-level sensory mechanisms predominantly situated in the primary cortices (28–31). This study demonstrates successful classification of simultaneous ASSRs based on varying cognitive states and behavior, opening possibilities for highly precise separation of simultaneous neural responses with frequency tagging to study cognitive and behavioral effects.

Frequency tagging enables the identification and separation of mixed brain responses stemming from simultaneous sounds, which are ubiquitous in our everyday environment. This facilitates the use of research environments that more accurately reflect the complexity of natural soundscapes, ensuring that experimental findings are more applicable to real-life cognitive phenomena. Our approach of applying frequency tagging at full modulation depth, while eliciting stronger signals, noticeably deteriorates sound quality and naturalness. Decreasing the modulation depth mitigates the degradation in sound quality but may render the ASSRs too weak for extracting cognitive effects within realistic time constraints (1, 2). Therefore, it is imperative for future frequency tagging endeavors to work on finding an optimal balance between minimizing modulation depth and preserving sufficient signal strength for capturing desired effects. A potential limitation of this work is that replicating our auditory setup (see the “Stimulus presentation” section under Materials and Methods) to achieve near zero-lag is tedious and costly, requiring the purchase and installation of additional specialized hardware and software if not already accessible. As a more implementable alternative, we recommend recording the auditory output into a separate channel (MISC) of the MEG system to precisely track the time of stimulus delivery. Critically, researchers should take into account the Nyquist frequency requirements to capture the auditory signal with sufficiently high fidelity by setting the sampling rate to at least twice the maximum carrier frequency of the auditory stimulus.
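For concreteness, here is a minimal check of the Nyquist recommendation above, using the maximum carrier frequency of the present stimuli (523 Hz) as the example input.

```python
max_carrier_hz = 523                       # highest carrier frequency in these stimuli
min_sampling_rate_hz = 2 * max_carrier_hz  # Nyquist: sample at least twice the maximum carrier
print(f"Record the audio (MISC) channel at >= {min_sampling_rate_hz} Hz.")
```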

Our study provides compelling evidence that top-down and bottom-up attentional mechanisms are differentially influenced by musicality as well as task performance. We found a positive correlation between top-down attention and both musicality and task performance, particularly in the left IPL. This suggests that musical training enhances top-down attentional mechanisms which are crucial for goal-directed behavior and cognitive control, leading to better task performance. Conversely, bottom-up attention showed a negative correlation with task performance in the right IPL, indicating that higher sensitivity to distractions via bottom-up attentional mechanisms may impede performance. Together, these insights contribute to a more nuanced understanding of the neural substrates underlying selective attention, highlighting the distinct yet complementary roles of the left and right parietal cortices in top-down and bottom-up attention. Consistent with our hypothesis, the involvement of the parietal cortex is expected as previous studies from our laboratory and others (22, 32–34) have demonstrated its role in musical training and the ability to perform music-related tasks.

While the bilateral IPL has been linked to pitch and melody processing (35, 36), only the left IPL exhibited top-down attentional modulation in our present study. Janata et al. (37) reported a similar left-lateralized activation in the IPL when listeners focused on a target melody played by one instrument amid another. This lateralization can be explained by the asymmetric sampling in time model, which posits that the left hemisphere operates on short timescales via fast β/γ oscillations, compared to slower δ/θ activity in the right hemisphere. Because top-down attention (38, 39), prediction (40, 41), and working memory (42, 43) mechanisms engaged by our tasks modulate these fast oscillations, their effects are stronger in the left hemisphere than in the right. On the other hand, bottom-up processes driven by salient acoustic changes operate on slower timescales predominantly in the right hemisphere (44, 45).

Similar to the IPL, correlations between selective attention and task performance were also observed in the right OrG but only under the more demanding condition where the melodies fully overlapped (experiment II) and not in the partially overlapping condition (experiment I). Task difficulty modulates the engagement of higher cognitive functions such as selective attention and pitch discrimination, altering activity within frontoparietal neural networks (46–48). Thus, we postulate that the correlation effects in the OrG emerged only when task demands were sufficient to strongly engage the neural mechanisms involved, thereby exposing individual differences in aptitude. Further evidence for the right OrG’s role in musicality and selective attention ability was revealed through the analysis of temporal attention dynamics. Participants whose selective attention peaked later during the tone performed significantly better and were more musical than those who peaked earlier. Subsequent correlational analysis corroborated these results, revealing that the stronger a participant’s attention during the late tone half relative to the early tone half, the more musical and better performing that participant was. Our results corroborate longstanding literature establishing the prefrontal cortex as a center for attentional control (28, 30, 31), aligning with the hypothesis that musical training sharpens neural mechanisms for sustaining or improving selective attention over time, thereby boosting task performance.

Although our findings are compelling, the present study is not designed to establish causality between musical training and attention, so any causal interpretation should be regarded with caution. Future studies should incorporate longitudinal or interventional designs to test such causal links more reliably. Nonetheless, our interpretation aligns neatly with Patel’s OPERA (Overlap, Precision, Emotion, Repetition, Attention) hypothesis (49), which suggests that musical training sharpens selective attention via adaptive cross-domain plasticity in auditory-processing networks when five conditions are met: overlap, precision, emotion, repetition, and attention. Music practice engages frontal, temporal, and parietal neural networks (17–21, 32–34) that overlap with those recruited during selective attention to pitch and timing under distracting conditions (overlap). To stay in tune and on tempo, it pushes these networks to encode pitch and temporal cues with far finer granularity than everyday listening demands (precision). Moreover, musical activities engage these networks with strong emotions (emotion), frequent practice (repetition), and focused attention (attention). Similarly, converging evidence from speech research suggests that musical experience reshapes overlapping auditory-cognitive networks, producing cross-domain transfer effects that enhance language and speech-in-noise perception (19, 50–53). Thus, the benefits of musical training are likely to generalize to other listening situations that engage common neural mechanisms.

By tracking how the modulation of attention toward each melody varies over time with our sliding window analysis, we can infer the attentional strategies used by individual participants. Although beyond the scope of this paper, it is fascinating that we found correlations between the attentional modulation of the 39- and 43-Hz melodies over time, particularly among participants with higher musicality (fig. S4). The data also suggest that these participants tended to perform better, although the difference was not significant. These observations might indicate that musical participants use an organized strategy to manage attention effectively toward simultaneous sounds, contributing to their better performance. Using a simple index like Kendall’s tau (see the sketch after this paragraph), significant correlations between the 39- and 43-Hz attentional modulation curves emerged in only six participants (of 20). Visual inspection of the individual participant curves revealed more complex correlations with time lags between the 39- and 43-Hz curves for many other participants (fig. S5). Although outside our primary expertise, we believe that it is possible to characterize these complex correlations given a more appropriate index. A potential confound in our study design arises because pitch and fm covaried. While the classifier assesses attention effects independently within each frequency, raw AUC values should not be directly compared across the modulation frequencies 39 and 43 Hz. Because the ASSR exhibits different SNRs at 39 and 43 Hz, any cross-frequency differences in AUC could reflect this signal disparity rather than true differences in selective attention to the lower (39 Hz) or higher (43 Hz) pitch. We therefore confine our conclusions to within-frequency effects and advise against cross-frequency interpretations that could be confounded by SNR differences.
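The sketch below illustrates the simple index mentioned above: Kendall’s tau between the 39- and 43-Hz attentional modulation (AUC over time) curves of a single participant. The curves here are simulated placeholders, not the study data.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 21)
# Simulated placeholder AUC-over-time curves for one participant (not the study data).
auc_39 = 0.5 + 0.1 * np.sin(t) + rng.normal(0, 0.02, t.size)
auc_43 = 0.5 + 0.1 * np.sin(t + 0.3) + rng.normal(0, 0.02, t.size)

# Rank correlation between the two attentional modulation curves.
tau, p = kendalltau(auc_39, auc_43)
print(f"Kendall's tau = {tau:.2f}, P = {p:.3f}")
```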

Overall, this study contributes to a deeper understanding of the neural substrates underlying selective attention and highlights the potential of musical training as a tool for cognitive enhancement. Further research could expand on these findings by exploring whether instrument-specific musical training impedes attentional control away from the trained instrument tones (e.g., whether a violinist finds it more challenging to ignore a violin tone) and elucidating the mechanisms involved. In addition, biochemical techniques such as magnetic resonance spectroscopy can provide insights into the neurochemical basis, possibly involving γ-aminobutyric acid and glutamate neurotransmitters (54–58), that differentiate selective attention control between high and low performers or musicians and nonmusicians. Last, while most studies in the field focus mainly on the primary auditory cortex (6, 7, 9, 13, 59, 60), where the strongest ASSR sources are situated, our current and past works (22–24) have demonstrated that secondary sources in the frontoparietal regions can better capture cognitive effects such as selective attention and their correlation with behavioral measures (e.g., musicality and task performance). Given their relevance and usefulness, we strongly encourage future research to include these secondary ASSR sources in their analysis. Although the focus of our current work is on auditory attention, frequency tagging has been used in a diverse range of other disciplines, such as working memory (61–63), language (7), aging (64), cognitive development (65) and impairment (66), pain (67–69), Alzheimer’s disease (66), schizophrenia (70) and bipolar disorder (71, 72), as well as sensory modalities including tactile (67, 68, 73, 74) and vision (61, 63, 75, 76). Our repeated splitting machine learning algorithms can be readily applied to these different fields, enabling researchers to leverage frequency tagging’s unique ability to separate and precisely trace simultaneous neural signals back to their stimuli, at the same time extracting cognitive and behavioral effects with high sensitivity.

MATERIALS AND METHODS

Participants

Twenty-eight participants took part in experiment I (18 to 49 years, mean age = 28.4, SD = 6.2; nine females; two left-handed) and 20 participants took part in experiment II (21 to 49 years, mean age = 28.5, SD = 6.1; nine females; two left-handed). All participants were fluent in English with self-reported normal hearing and participated voluntarily. The experiment was approved by the Regional Ethics Review Board in Stockholm (Dnr: 2017/998-31/2). Both written and oral informed consents were obtained from all participants before the experiments. All participants received a monetary compensation of SEK 600 (~EUR 60).

Quantification of individual musicality

A subset of the Goldsmiths MSI self-report questionnaire (25, 26) containing 22 questions was used to estimate each participant’s level of musical sophistication. The MSI quantifies a participant’s musical skills, engagement, and behavior across multiple facets, making it ideal for testing a general population that includes both musicians and nonmusicians. A copy of the questionnaire used for this study can be found in fig. S5. To ensure relevance to the selective auditory attention task, we emphasized the perceptual ability, musical training, and singing ability subscales. The emotion subscale was excluded because our melody stimuli lack emotional content, rendering this subscale irrelevant to the task. In addition, several questions from the active engagement subcategory were omitted to prioritize musical aptitude and ability over exposure. For instance, questions such as “Music is kind of an addiction for me—I couldn’t live without it,” which do not directly affect musical ability, were excluded. This specific combination of questions was previously used in a similar selective auditory attention study (22), and the resultant MSI demonstrated strong correlations with performance. Among all participants, MSI scores ranged from 40 to 132, with a maximum possible score of 154 (meanI = 88.4, SDI = 26.8; meanII = 89.4, SDII = 28.9).

Task

For both experiments I and II, participants were presented with two simultaneous melodies of different pitch and instructed to selectively attend to either one, following a verbal cue. Each melody was composed of a series of 2-s tones and lasted for 8 to 26 s. During melody playback, participants were required to constantly follow the pitch contour of the attended melody until it stopped, at which point they reported the final direction of pitch change with a button press. The correct answer could be falling, rising, or constant, corresponding to a chance level of 33% (see Fig. 1). In each experiment, a total of 28 button responses were collected over ~10 min of MEG recording time for each participant. The fraction of correct button responses (out of 28) was computed as an index of task performance. To ensure that the sounds were separated on the basis of features (in this case, pitch and timing) rather than location (i.e., left-versus-right), the stimuli were presented identically to both ears via insert earphones (model ADU1c, KAR Audio, Finland).

Experiment I

Tone onsets alternated between the low-pitched and high-pitched melodies, beginning with either melody (order balanced across trials). Because each tone onset drew bottom-up attention toward it, this experimental design allowed us to dissociate top-down (from the cue) and bottom-up attention effects toward each melody.

Experiment II

Tone onsets for both melodies occurred concurrently, thereby engaging bottom-up attention simultaneously toward both melodies.

Stimuli

Frequency tagging

The melodies were constructed with 2-s-long sinusoidal tones that were generated and amplitude-modulated using the Ableton Live 9 software (Berlin, Germany). The carrier frequencies of the low-pitched melody tones and high-pitched melody tones were 131 to 220 Hz and 329 to 523 Hz, respectively. Two hundred fifty-six tones were used to construct all 28 melodies of each pitch. For simplicity, only tones in the C major harmonic scale were used. Sinusoidal amplitude modulation of the tone was carried out by modifying its amplitude envelope, which corresponds to its raw sound volume, with regular increases and decreases according to a sine wave (see fig. S6). The rate of modulation, or modulation frequency (fm), was 39 Hz for the low-pitched melody and 43 Hz for the high-pitched melody. The modulation frequency was centered around 40 Hz, and the modulation depth (m) was set at 100% to maximize cortical ASSR power (1).
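As a minimal sketch of the stimulus construction described above, the NumPy snippet below generates a 2-s sinusoidal tone with 100% sinusoidal amplitude modulation. The audio sampling rate and the exact modulation convention are assumptions, not parameters taken from the Ableton Live project.

```python
import numpy as np

fs = 44100                         # audio sampling rate (assumed)
dur = 2.0                          # 2-s tone
t = np.arange(int(fs * dur)) / fs

fc, fm, m = 220.0, 39.0, 1.0       # example carrier (low-pitched melody), modulation frequency, 100% depth
carrier = np.sin(2 * np.pi * fc * t)
envelope = 1.0 + m * np.sin(2 * np.pi * fm * t)   # sinusoidal amplitude envelope
tone = carrier * envelope
tone /= np.max(np.abs(tone))       # normalize to avoid clipping
```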

Stimulus presentation

The melody duration was randomized to last for 8 to 26 s each, preventing participants from predicting when it would end and thus maintaining their high levels of attention throughout playback. The volume of the high-pitched melody was reduced by 10 dB relative to the low-pitched melody to balance subjective loudness differences across frequency ranges (77). Identical stimuli were delivered to both ears through sound tubes, with the volume calibrated to ~75-dB sound pressure level per ear using a sound meter (type 2235, Brüel & Kjær, Nærum, Denmark) and slightly adjusted for individual comfort. To ensure that the stimulus was presented with less than 1-ms jitter, we used a specialized auditory setup comprising the AudioFile Stimulus Processor (Cambridge Research Systems Limited) coupled with customized scripts in Presentation (version 18.3, Neurobehavioral Systems Inc., Berkeley, CA, www.neurobs.com).

MEG data acquisition and preprocessing

MEG measurements were carried out using a 306-channel whole-scalp neuromagnetometer system (Elekta TRIUX™, Elekta Neuromag Oy, Helsinki, Finland). Data were recorded at a sampling rate of 1 kHz, bandpass filtered online between 0.1 and 330 Hz, and stored for offline analysis. Horizontal eye movements and eye blinks were monitored using horizontal and vertical bipolar electrooculography electrodes. Cardiac activity was monitored with bipolar electrocardiography electrodes attached below the left and right clavicles. Internal active shielding was enabled during MEG recordings to suppress electromagnetic artifacts from the surrounding environment. In preparation for the MEG measurement, each participant’s head shape was digitized using a Polhemus Fastrak. The participant’s head position and head movement were monitored during MEG recordings using head position indicator coils. The acquired MEG data were preprocessed using MaxFilter (78, 79) by applying temporal signal space separation (tSSS) to suppress artifacts from outside the MEG helmet and compensate for head movement during the recording, before being transformed to a default head position. The tSSS had a buffer length of 10 s and a cutoff correlation coefficient of 0.98.
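The recordings were cleaned with Elekta MaxFilter; a roughly equivalent open-source step is available in MNE-Python. The sketch below is an approximation under that assumption, reusing the 10-s tSSS buffer and 0.98 correlation cutoff; the file name, head-position input, and destination are placeholders.

```python
import mne

raw = mne.io.read_raw_fif("sub01_task_raw.fif", preload=True)  # placeholder file name

# tSSS with a 10-s buffer and a 0.98 correlation cutoff; head_pos would come from
# continuous HPI estimation if movement compensation is desired.
raw_sss = mne.preprocessing.maxwell_filter(
    raw,
    st_duration=10.0,
    st_correlation=0.98,
    head_pos=None,                 # supply cHPI-derived positions if available
    destination=(0.0, 0.0, 0.04),  # common head position (placeholder)
)
```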

MEG data analysis

Following preprocessing, subsequent MEG data analyses were done using the minimum-norm estimate (MNE)–Python software package (80). First, we applied a 1- to 50-Hz bandpass filter to the continuous MEG data. Next, eye movement and heartbeat artifacts were automatically detected from electrooculography and electrocardiography signals using independent component analysis (ICA; fastICA algorithm), validated visually, and removed for each participant. On average, we removed 4.9 components (SEM 0.2) across participants.
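A minimal MNE-Python sketch of this preprocessing step (band-pass filtering followed by ICA-based removal of ocular and cardiac components). The file name and the number of ICA components are assumptions.

```python
import mne
from mne.preprocessing import ICA

raw = mne.io.read_raw_fif("sub01_task_raw_sss.fif", preload=True)  # placeholder file name
raw.filter(l_freq=1.0, h_freq=50.0)                                # 1- to 50-Hz band-pass

ica = ICA(n_components=30, method="fastica", random_state=97)      # component count assumed
ica.fit(raw)

eog_inds, _ = ica.find_bads_eog(raw)   # components correlating with the EOG channels
ecg_inds, _ = ica.find_bads_ecg(raw)   # components correlating with the ECG channel
ica.exclude = sorted(set(eog_inds) | set(ecg_inds))
raw_clean = ica.apply(raw.copy())      # remove the artifact components
```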

Experiment I data analysis

The continuous MEG data were segmented into 1-s-long epochs from each tone onset. Since the salient tone onsets alternated between the low- and high-pitched melodies, bottom-up attention was strongly drawn toward either melody tone but never both. Thus, we assigned all epochs to two conditions based on whether bottom-up attention was directed to the low- or high-pitched melody tone. Alternatively, we also assigned all epochs to another two conditions based on whether the cue directed top-down attention to the low- or high-pitched melody. In both instances, 228 epochs were defined per condition per participant.
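As an illustration, 1-s epochs time-locked to tone onsets could be cut and labeled as follows in MNE-Python. The trigger channel name and event codes are hypothetical.

```python
import mne

raw = mne.io.read_raw_fif("sub01_task_clean_raw.fif", preload=True)  # placeholder file name
events = mne.find_events(raw, stim_channel="STI101")                 # trigger channel name assumed
event_id = {"attend_low": 1, "attend_high": 2}                       # hypothetical event codes per cue

# 1-s epochs from each tone onset, without baseline correction (to preserve the steady state).
epochs = mne.Epochs(raw, events, event_id=event_id,
                    tmin=0.0, tmax=1.0, baseline=None, preload=True)
```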

Experiment II data analysis

The first 0 to 1 s from each tone onset was defined as the early half of the tone, while the subsequent 1 to 2 s was defined as the late half of the tone, defining two conditions by time period. Subsequently, each of the two time periods was analyzed separately for top-down attention according to the cue, categorizing the epochs into attend low- or high-pitched for each of the early and late half. As a result, about 114 epochs were defined per condition per participant.

Classification of attention

Repeated splitting support vector machine

To classify attention conditions at single-subject level, we designed a specialized “repeated splitting” decoder that was highly sensitive to ASSRs. These customized algorithms were based on the scikit-learn package in Python. For each condition, epochs were randomly divided into five groups and averaged within each group to produce five evoked ASSRs. Next, the five evoked ASSRs were fast Fourier transformed (frequency resolution = 1 Hz) to acquire five power spectra. Spectral smoothing was performed using a Hann window in experiment I and a boxcar window in experiment II. For any contrast between two conditions, the corresponding power spectra (five per condition) were fed into a binary linear support vector machine with all 306 sensors as features and classified using a fivefold cross-validation method. We computed the AUC as a measure of discriminability between conditions (chance level = 0.5). Last, we repeated the entire process from the initial group division step 1000 times and computed the mean AUC across repetitions. The AUC was computed independently for each frequency from 4 to 45 Hz. Figure 2A outlines each of the steps described above.
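Below is a condensed sketch of this repeated splitting pipeline written with NumPy and scikit-learn. The array shapes, random seed, and default arguments are assumptions, and the spectral smoothing windows are omitted for brevity; it is meant as an illustration of the published pipeline, not a verbatim reproduction of the study code.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

def repeated_splitting_auc(epochs_a, epochs_b, n_groups=5, n_repeats=1000,
                           freqs=range(4, 46), sfreq=1000.0, seed=0):
    """AUC per frequency for attend-low vs. attend-high; epochs_*: (n_epochs, n_sensors, n_times) arrays."""
    rng = np.random.default_rng(seed)
    bins = np.fft.rfftfreq(epochs_a.shape[-1], d=1.0 / sfreq)
    auc = np.zeros((n_repeats, len(freqs)))
    for rep in range(n_repeats):
        X, y = [], []
        for label, ep in enumerate((epochs_a, epochs_b)):
            order = rng.permutation(len(ep))
            for grp in np.array_split(order, n_groups):            # random split into five groups
                evoked = ep[grp].mean(axis=0)                      # average -> evoked ASSR
                power = np.abs(np.fft.rfft(evoked, axis=-1)) ** 2  # power spectrum per sensor
                X.append(power)
                y.append(label)
        X, y = np.stack(X), np.array(y)
        for fi, f in enumerate(freqs):
            Xf = X[:, :, np.argmin(np.abs(bins - f))]              # sensor powers at this frequency
            cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=rep)
            scores = cross_val_score(SVC(kernel="linear"), Xf, y, cv=cv, scoring="roc_auc")
            auc[rep, fi] = scores.mean()
    return auc.mean(axis=0)  # mean AUC across repetitions, one value per frequency
```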

For experiment I, the AUC across frequency was then averaged across 25 participants (1 participant was excluded because of incomplete data collection, and another 2 participants were excluded because of excessive noise at sensor level; see fig. S7) to give the group-averaged AUC across frequency. For each participant in experiment II, the AUC across frequency was computed separately for the early and late half of the tone, before combining them by selecting the higher AUC between the two halves at each frequency. The resultant “best of two” AUC across frequency was then averaged across all 20 participants to give the group-averaged AUC across frequency.

Sliding window analysis

To investigate fluctuations in selective attention across time during experiment II, we used a sliding window approach (fast Fourier transform, window length = 1 s, step size = 50 ms, boxcar window) across the 2-s tone duration for the ASSRs, computing the power specifically at 39 and 43 Hz. Classification was performed independently for each time window, resulting in the AUC across time at 39 and 43 Hz, corresponding to the attentional modulation of the low- and high-pitched melody, respectively. For each participant, we determined the time of peak selective attention during which the AUC was maximum, regardless of which melody it corresponded to. Participants were split into two equal groups of 10 based on their time of peak selective attention, which was before 0.5 s for early attendees and after 0.5 s for late attendees. Subsequently, we compared differences in musicality and performance between the early and late attendees using a two-tailed permutation test with 10,000 shuffles.
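A minimal sketch of the sliding-window spectral step (1-s boxcar window, 50-ms step) that feeds the per-window classification; the array shape and sampling rate are assumptions. Classification at each window can then reuse the repeated splitting decoder on the per-window power values.

```python
import numpy as np

def sliding_window_power(evoked, sfreq=1000.0, win_s=1.0, step_s=0.05, freqs=(39.0, 43.0)):
    """evoked: (n_sensors, n_times) average over the 2-s tone; returns power per window at freqs."""
    n_win, n_step = int(win_s * sfreq), int(step_s * sfreq)
    starts = np.arange(0, evoked.shape[-1] - n_win + 1, n_step)
    bins = np.fft.rfftfreq(n_win, d=1.0 / sfreq)
    fidx = [int(np.argmin(np.abs(bins - f))) for f in freqs]
    power = np.empty((len(starts), evoked.shape[0], len(freqs)))
    for i, s in enumerate(starts):
        segment = evoked[:, s:s + n_win]                       # boxcar window (no taper)
        spectrum = np.abs(np.fft.rfft(segment, axis=-1)) ** 2
        power[i] = spectrum[:, fidx]
    return starts / sfreq, power                               # window start times (s), power array
```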

Structural magnetic resonance imaging acquisition and processing

Anatomical magnetic resonance imaging (MRI) was acquired using high-resolution sagittal T1-weighted three-dimensional IR-SPGR (inversion recovery spoiled gradient echo) images on a GE MR750 3-T scanner with the following pulse sequence parameters: 1-mm isotropic resolution; field of view (FoV), 240 mm by 240 mm; acquisition matrix, 240 × 240; 180 slices, 1 mm thick; bandwidth = 347 Hz/pixel; flip angle = 12°; inversion time (TI) = 400 ms; echo time (TE) = 2.4 ms; repetition time (TR) = 5.5 ms, resulting in a TR per slice of 1390 ms. Cortical reconstruction and volumetric segmentation of all participants’ MRI was performed with the FreeSurfer image analysis suite (81).

MEG source reconstruction

Source localization of the MEG signals (combined planar gradiometers and magnetometers) was performed using an MNE (82) distributed source model containing 20,484 dipolar cortical sources. The MEG signals were coregistered onto each individual participant’s anatomical MRI using previously digitized points on the head surface during MEG acquisition. On the basis of our previous studies (22–24) on ASSR sources, we demarcated six ROIs, namely, the OrG, STG, and IPL at each hemisphere, according to the Brainnetome Atlas (27).
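A skeletal MNE-Python sequence for this source localization step, assuming an "ico5" source space (20,484 sources), placeholder file names, and a diagonal noise covariance; the Brainnetome Atlas labels would be read from their own FreeSurfer annotation files rather than the default "aparc" parcellation shown here.

```python
import mne

subject, subjects_dir = "sub01", "/path/to/freesurfer_subjects"  # placeholders
evoked = mne.read_evokeds("sub01_assr-ave.fif")[0]               # placeholder evoked ASSR

src = mne.setup_source_space(subject, spacing="ico5", subjects_dir=subjects_dir)  # 20,484 sources
model = mne.make_bem_model(subject, ico=4, conductivity=(0.3,), subjects_dir=subjects_dir)
bem = mne.make_bem_solution(model)
fwd = mne.make_forward_solution(evoked.info, trans="sub01-trans.fif", src=src, bem=bem)

noise_cov = mne.make_ad_hoc_cov(evoked.info)  # diagonal stand-in; the study's covariance strategy is not detailed here
inv = mne.minimum_norm.make_inverse_operator(evoked.info, fwd, noise_cov)
stc = mne.minimum_norm.apply_inverse(evoked, inv, lambda2=1.0 / 9.0, method="MNE")

# Restrict the source estimate to one ROI label (here from the default 'aparc' parcellation).
labels = mne.read_labels_from_annot(subject, parc="aparc", subjects_dir=subjects_dir)
stc_roi = stc.in_label(labels[0])
```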

Classification of selective attention (source level)

To classify attention conditions at individual source level, we incorporated the source localization step into the repeated splitting support vector machine decoder (with identical parameters previously described in the “Classification of attention” section) as follows: For each contrast, epochs per condition were randomly divided into five groups and averaged within each group to produce five evoked ASSRs. The five evoked ASSRs were then source localized to produce five source estimations across the entire cortex. For each of the six ROIs, a subset of these sources was selected to contain only sources originating within the ROI. The data were then fast Fourier transformed. The power at 39 and 43 Hz was combined in feature space before being classified into one of two conditions, with decoding performance measured by the AUC. Last, we repeated the entire process from the initial group division step 1000 times and computed the mean AUC across repetitions. One participant was excluded because of incomplete data collection in both experiments, resulting in a final sample of 27 participants in experiment I and 19 participants in experiment II for all source-level analyses. Group-level statistics were conducted with nonparametric permutation testing (10,000 label shuffles within subjects), using the difference in group means as the test statistic. For each of the six ROIs, AUC values were compared against the binary-chance level of 0.5 in separate one-tailed permutation tests for top-down and bottom-up attention, FDR corrected over 12 tests. Significant differences in AUC between top-down and bottom-up attention at each ROI were assessed using two-tailed permutation tests, FDR corrected over six tests.
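As one concrete way to realize such nonparametric tests, the sketch below sign-flips subject-level deviations from chance to build a null distribution and then applies FDR correction across tests; this is a common stand-in and not necessarily the exact label-shuffling scheme used in the study. The per-subject AUC values are simulated placeholders.

```python
import numpy as np
from mne.stats import fdr_correction

def sign_flip_perm_test(values, chance=0.5, n_perm=10000, seed=0):
    """One-tailed test of mean(values) > chance by randomly sign-flipping deviations from chance."""
    rng = np.random.default_rng(seed)
    dev = np.asarray(values) - chance
    observed = dev.mean()
    null = np.array([(dev * rng.choice([-1.0, 1.0], size=dev.size)).mean()
                     for _ in range(n_perm)])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)

# Example with simulated per-subject AUC values for 12 ROI/condition tests (not the study data).
rng = np.random.default_rng(3)
p_vals = [sign_flip_perm_test(rng.normal(0.55, 0.05, size=27), seed=i) for i in range(12)]
reject, p_fdr = fdr_correction(p_vals, alpha=0.05)
```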

Lateness index

For experiment II, we compared the AUC between the early and late tone half to investigate fluctuations in selective attention across time at source level. This was done in place of the sliding window approach at sensor level to reduce computational costs. For each participant, we computed a lateness index using the formula (AUC_Late − AUC_Early)/(AUC_Late + AUC_Early) to quantify the relative strength of selective attention in the late compared to the early tone half. We then carried out Pearson correlation tests between the lateness index and MSI, as well as between the lateness index and task performance, to examine which brain regions exhibit relationships between lateness and these behavioral measures. We reported two-tailed P values from these tests. Data normality was assessed using the Shapiro-Wilk test at a threshold of P < 0.05 for non-normal distributions and visually confirmed through quantile-quantile plots.
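A minimal sketch of the lateness index and the associated correlation test using SciPy, including a Shapiro-Wilk normality check as described above. The per-participant arrays are simulated placeholders, not the study data.

```python
import numpy as np
from scipy.stats import pearsonr, shapiro

rng = np.random.default_rng(2)
# Simulated placeholder values per participant (not the study data).
auc_early = rng.uniform(0.45, 0.70, size=19)
auc_late = rng.uniform(0.45, 0.70, size=19)
msi = rng.uniform(40, 132, size=19)

lateness = (auc_late - auc_early) / (auc_late + auc_early)  # relative late-vs-early selective attention

# Check normality before running the Pearson correlation (two-tailed P value).
if shapiro(lateness).pvalue >= 0.05 and shapiro(msi).pvalue >= 0.05:
    r, p = pearsonr(lateness, msi)
    print(f"r = {r:.2f}, two-tailed P = {p:.3f}")
```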

Acknowledgments

We thank the Knut and Alice Wallenberg Foundation for providing the Berzelius resource at the National Supercomputer Centre which enabled the computations for source analysis.

Funding: This work was supported by the Swedish Foundation for Strategic Research (SBE 13-0115) and Knut and Alice Wallenberg Foundation (KAW2021.0329 to C.L.M.). Data for this study was collected at the Swedish National Facility for Magnetoencephalography, Karolinska Institutet, Sweden, supported by Knut and Alice Wallenberg Foundation (KAW2011.0207).

Author contributions: C.L.M. contributed to conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft, writing—review and editing, visualization, supervision, project administration, and funding acquisition. D.P. contributed to methodology, software, validation, and supervision. J.G. contributed to validation and supervision. D.L. contributed to conceptualization, methodology, writing—review and editing, supervision, project administration, and funding acquisition.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: All code is publicly available at https://github.com/cassialmt/ALTOL_ms and https://doi.org/10.5281/zenodo.16877796. All other data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.

Supplementary Materials

This PDF file includes:

Figs. S1 to S7

Table S1


