PLOS Biology
. 2026 Feb 5;24(2):e3003600. doi: 10.1371/journal.pbio.3003600

Human newborns form musical predictions based on rhythmic but not melodic structure

Roberta Bianco 1,2,*, Brigitta Tóth 3, Felix Bigand 1, Trinh Nguyen 1,4, István Sziller 5, Gábor P Háden 3, István Winkler 3, Giacomo Novembre 1
Editor: Mathew Ernest Diamond
PMCID: PMC12875487  PMID: 41642755

Abstract

The ability to anticipate rhythmic and melodic structures in music is considered a fundamental human trait, present across all cultures and predating linguistic comprehension in human development. Yet the extent to which this ability is already developed at birth remains unclear. Here, we used temporal response functions to assess rhythmic and melodic neural encoding in newborns (N = 49) exposed to classical monophonic musical pieces (real condition) and control stimuli with shuffled tones and inter-onset intervals (shuffled condition). We computationally quantified context-based rhythmic and melodic expectations and dissociated these high-level processes from low-level acoustic tracking, such as local changes in timing and pitch. We observed encoding of probabilistic rhythmic expectations in response to real but not shuffled music. This demonstrates newborns’ ability to rely on rhythmic statistical regularities to generate musical expectations. We found no evidence for the tracking of melodic information, demonstrating a downweighting of this dimension compared to the rhythmic one. This study provides neurophysiological evidence that the capacity to track statistical regularities in music is present at birth and driven by rhythm. Melodic tracking, in contrast, may receive more weight through development with exposure to signals relevant to communication, such as speech and music.


The ability to anticipate musical structure is a fundamental human trait, but whether it exists at birth is unclear. This study shows that newborns encode rhythmic expectations based on statistical regularities in real music, while melodic tracking is absent, suggesting rhythm-driven predictive processing from birth.

Introduction

Music is an increasingly compelling means for understanding the development of a wealth of neuro-cognitive processes, including those that support communication through sound patterns [1]. From the earliest stages of development, the human brain relies on multiple auditory cues to extract meaningful patterns—such as words or melodies—from the acoustic environment [2–4]. This process is facilitated by the integration of sequential information and, thereby, by the extraction of statistical patterns along temporal and spectral dimensions, such as timing and pitch [5,6]. In music, tracking of statistical patterns is largely implicit [7], allowing the brain to anticipate events or patterns that occur more frequently than others based on both recent and past contexts. Expectations, therefore, build on statistical regularities acquired in real time as the current sequence unfolds, and/or retrieved from prior exposure. This process permits listeners to recognize rhythmic (temporal) and melodic (spectral) patterns [8], as well as to anticipate when an event will occur and what it will be [9–11]. Such rhythmic and melodic expectations are the backbone of music perception and appreciation [12] and are assumed to have contributed to the evolution and development of human musicality [13–18].

Based on cross-species studies, rhythmic and melodic expectations in primate species seem to have evolved along different phylogenetic pathways. Sensitivity to rhythmic patterns was observed in nonhuman primates, suggesting deep phylogenetic roots [19–24]. In contrast, the sensitivity to melodic patterns based on pitch relations appears more variable, if not absent, in nonhuman primates and may be unique to humans within the primate lineage [19,25–27]. This observation raises an important question: are humans naturally predisposed to melodic tracking? Answering this question is challenging yet important for understanding how biological predispositions, along with cultural traits, shape the complex spectrum of human musical abilities observed worldwide [28,29].

Here, we take human newborns as a testbed for studying the human brain’s predisposition to process music, specifically its rhythmic and melodic aspects. Newborns’ auditory responses can be reliably recorded using electroencephalography (EEG) [30,31], and these responses are marginally influenced by prior exposure compared with those measured at any later developmental stage (but see [32–36]). Compelling evidence suggests that the human brain engages with sounds already in utero, as fetuses discriminate, habituate to, and memorize sounds [37]. By approximately 35 weeks of gestation, fetuses begin to respond to music with changes in heart rates and body movements [38]. What remains unclear is which specific aspect of music—namely its rhythmic or melodic structure—drives these early predispositions.

In terms of rhythm perception, EEG studies demonstrated an early neural tuning to temporal structure in the human neonatal brain, such as specialization for temporal cues in both speech [39] and nonspeech signals [40], adaptation to the presentation rate of temporal patterns [41], tracking of meter-related frequencies [42], and perception of the beat [43]. Also, studies in newborns suggest that exposure to structured temporal input, such as music, can strengthen auditory networks and scaffold later language development [44,45]. Despite this evidence, it remains unclear whether newborns use rhythmic statistical regularities beyond sound periodicities, such as transition probabilities, to form temporal expectations [46]. In terms of melodic capacities, EEG studies showed that newborns exhibit discrimination of pitch independent of timbre [47] and detection of highly surprising events, such as deviants from deterministic patterns of tones [48] or regularities in sequences of tone intervals [49,50]. These studies provide preliminary evidence for expectations based on probabilistic distributions of melodic information. Yet they tested only the two tail ends of such a presumed probabilistic distribution: very frequent versus very infrequent events, ignoring the wide range of note-by-note surprises of real music. This leaves it unclear whether newborns can form melodic expectations whilst listening to continuous naturalistic music, as observed in adults [19,51,52]. Finally, because melodic and rhythmic abilities have often been studied separately, the weights of rhythmic and melodic expectations during music processing at birth are unknown.

Here, we investigate neural tracking of expectations based on both timing and pitch structures to understand how the newborn brain weights these musical features while listening to naturalistic musical stimuli (i.e., classical piano pieces). Therefore, unlike traditional paradigms, our design directly assesses rhythmic and melodic tracking within a full, ecologically valid stimulus, rather than inferring them from detection of salient irregular sounds. Rhythmic and melodic expectations can be generated through different anticipatory mechanisms sensitive to different features of the stimulus—from surface acoustical attributes to local and global event-based probabilities. Thus, using the multivariate Temporal Response Function analysis (mTRF) [53,54], we measured how multiple features of the continuous musical stimuli—namely ‘low-level’ acoustic features and ‘high-level’ probabilistic rhythmic/melodic information—predict human newborns’ EEG responses to music. As in previous human and nonhuman primate work [19,51,52], we assessed neural encoding of J. S. Bach’s piano monophonic pieces—rich musical stimuli combining both melodic and rhythmic probabilistic structures. Based on previous findings of rhythmic but not melodic tracking in nonhuman primates [19], we hypothesized that human newborns would show a similar pattern if these abilities were inherited phylogenetically. This would imply that whilst rhythm encoding is embedded in the human brain from the outset, melodic encoding might develop more slowly with experience and behavioral relevance. Conversely, if, unlike other nonhuman primates, rhythmic sensitivity and melodic sensitivity each emerge in parallel in humans, then human newborns might already exhibit some capacity for melodic encoding, potentially comparable to rhythmic encoding, as observed in adults [19,51].

Results

An mTRF analysis was carried out to assess the neural encoding of musical expectations in humans at birth (Fig 1A). Newborns were exposed to musical melodies (real condition) and control stimuli (shuffled condition, where pitch and note timings were shuffled over time to create sequences with disrupted musical regularities). Musical melodies composed by Bach contain the regular melodic and rhythmic patterns typically found in tonal Western music. In contrast, the shuffled stimuli lack comparable predictability in pitch or timing (including a weak sense of musical beat), despite being acoustically similar (see S1 Fig; stimuli are available at https://doi.org/10.17605/OSF.IO/K758D).
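For intuition, the construction of such a shuffled control can be sketched as follows (a minimal illustration under our own assumptions, not the authors’ stimulus-generation code; the note values and seeding are arbitrary):

```python
import random

def make_shuffled_control(pitches, iois, seed=0):
    """Independently permute a melody's pitch sequence and its
    inter-onset intervals (IOIs). The marginal distributions are
    preserved (same notes, same set of intervals), but the sequential
    melodic and rhythmic regularities are destroyed."""
    rng = random.Random(seed)
    shuffled_pitches = pitches[:]
    shuffled_iois = iois[:]
    rng.shuffle(shuffled_pitches)
    rng.shuffle(shuffled_iois)
    return shuffled_pitches, shuffled_iois

def onsets_from_iois(iois, first_onset=0.0):
    """Convert N-1 IOIs into N note onset times (in seconds)."""
    onsets = [first_onset]
    for ioi in iois:
        onsets.append(onsets[-1] + ioi)
    return onsets
```

Because only the order of events is permuted, the real and shuffled stimuli remain matched in their overall pitch and interval content, consistent with the acoustic similarity reported above.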

Fig 1. Materials and methods.


(A) Experimental paradigm. We analyzed EEG data recorded from 49 sleeping human newborns while they were exposed to monophonic piano melodies composed by J. S. Bach (real condition) and control stimuli (shuffled condition). (B) Surprise and entropy. Surprise and entropy associated with each note’s timing (green, St and Et, respectively) and pitch (yellow, Sp and Ep, respectively) were estimated using an unsupervised statistical learning model trained on all stimuli. Dot plots display mean surprise and entropy associated with real and shuffled music, averaged across melodies (left panel), and separately for each melody (right panel). Error bars represent bootstrapped 95% confidence intervals (CI). See S1 Data. (C) Correlations between stimulus features. Pearson’s correlations (r values) between the stimulus features: inter-pitch-interval (IPI), inter-onset-interval (IOI), and surprise and entropy associated with timing (St and Et) and pitch (Sp and Ep). See S1 Data. (D) Analytical approach. Multivariate Temporal Response Function (mTRF) models were fit to describe the forward relationship between multiple stimulus features and the EEG signal. The full TRF model (leftmost panel) included acoustic low-level features (spectral flux, acoustic onset, IOI, and IPI) and high-level features (surprise and entropy of pitch and timing). To assess the unique contribution of each feature (or set of features) to the EEG data, we ran reduced models encompassing all variables but with the specified one being randomized in time (yet preserving the note onset times). We then calculated the difference in EEG prediction accuracy (Pearson’s correlations, r) between the reduced models and the full model (Δr). On the rightmost panel, the light blue circle denotes information of a reduced model, with the variable(s) of interest being randomized. 
The orange area indicates the unique contribution of the variable of interest that leads to an increase in the explanatory power of the full model (black circle).

To objectively assess the predictability of the experimental stimuli, probabilistic expectations were estimated based on the information-theoretic properties of the stimuli using a variable-order Markov model of statistical learning (i.e., Information Dynamics Of Music [IDyOM]; [55]). The model learns statistical patterns from sequences of discrete symbols representing different stimulus attributes, specifically concerning pitch and timing. It leverages observations from the past (long- and short-term) musical context, and it computes Shannon’s surprise (S) and entropy (E) of each note in a melody associated with pitch (Sp and Ep, respectively) and onset timing (St and Et). Surprise and entropy provide complementary characterizations of predictive processing: entropy captures the inherent uncertainty of an event, whereas surprise reflects the unexpectedness of that event given prior context. Including both predictors allows us to fully represent pitch and timing tracking, ensuring that we capture neural activity related to both aspects of musical anticipation. The estimates provided by the model confirmed that shuffled melodies were overall more unexpected than real melodies, both with respect to pitch (Sp: W = 40, p = .002; Ep: W = 40, p = .002; see Methods ‘Statistical analysis’ for details) and to timing (St: W = 35, p = .036; Et: W = 33, p = .075) (Fig 1B).
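For intuition, the two information-theoretic quantities can be sketched with a simple first-order Markov model whose counts are updated online (IDyOM itself combines variable-order long- and short-term models, so this is only a minimal illustration, not the model used in the study):

```python
import math
from collections import Counter, defaultdict

def markov_surprise_entropy(seq, alphabet=None):
    """Note-by-note Shannon surprise -log2 p(x_t | context) and entropy
    H(X_t | context) under a first-order Markov model whose counts are
    updated online (statistical learning as the sequence unfolds).
    Add-1 smoothing avoids zero probability for unseen continuations."""
    alphabet = sorted(set(seq)) if alphabet is None else list(alphabet)
    counts = defaultdict(Counter)  # counts[previous_symbol][next_symbol]
    surprise, entropy = [], []
    prev = None
    for x in seq:
        c = counts[prev]
        total = sum(c.values()) + len(alphabet)  # add-1 smoothing
        p = {a: (c[a] + 1) / total for a in alphabet}
        # Unexpectedness of the observed event given the context:
        surprise.append(-math.log2(p[x]))
        # Uncertainty about the upcoming event before it occurs:
        entropy.append(-sum(q * math.log2(q) for q in p.values()))
        counts[prev][x] += 1  # learn from the observed event
        prev = x
    return surprise, entropy
```

Applied separately to pitch symbols and to quantized IOIs, such a model yields per-note analogues of Sp/St and Ep/Et: surprise for frequently repeated transitions drops as the model accumulates evidence, whereas it stays high for sequences lacking regularities.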

We further investigated the relationship between stimulus predictability and low-level acoustic features. Note-specific surprise and entropy estimates positively correlated with low-level acoustic features such as the magnitude of the latest pitch or timing interval (i.e., inter-pitch-interval, IPI, or inter-onset-interval, IOI) (Fig 1C). Hence, in the following analyses, we assessed the unique contribution of probabilistic pitch and timing expectations, above and beyond the contribution of low-level acoustic processing (including IPI, IOI, acoustic onsets, and spectral flux) (Fig 1D). To do so, we derived single-participant TRFs by fitting multivariate lagged regression models (Fig 2A). We then estimated prediction accuracy (Pearson’s correlations, r) between the EEG signals predicted by the TRF models and the actual EEG data (averaged across all participants; ‘ground-truth EEG data’, see Methods ‘TRF analysis’), separately for each melody and EEG channel, using leave-one-melody-out cross-validation over a lag window ranging from −50 to 400 ms. To assess the unique contributions of the variables of interest to the EEG data, we trained reduced models that included a veridical representation of all variables except for the variable of interest, which was randomized (see Methods ‘TRF Analysis’). Finally, we calculated the difference in prediction accuracy (Δr) between the reduced and full models, selecting the top 25% of channels with the highest prediction accuracy in the full model across conditions (see Methods for ROI definition).
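The core of this pipeline—a time-lagged design matrix, a ridge-regularized forward model, and Pearson-r prediction accuracy—can be sketched as follows (a minimal single-channel illustration under our own assumptions, not the authors’ implementation; in practice a dedicated toolbox such as the mTRF-Toolbox is typically used):

```python
import numpy as np

def lagged_design(stim, lags):
    """Stack time-lagged copies of a (T, F) stimulus feature matrix into a
    (T, F * n_lags) design matrix (edges zero-padded). A positive lag means
    the EEG at time t is modeled from the stimulus at time t - lag."""
    T, F = stim.shape
    X = np.zeros((T, F * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(stim, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * F:(i + 1) * F] = shifted
    return X

def trf_fit(X, y, lam=1.0):
    """Ridge-regularized forward model: w = (X'X + lam*I)^(-1) X'y."""
    XtX = X.T @ X
    return np.linalg.solve(XtX + lam * np.eye(XtX.shape[0]), X.T @ y)

def prediction_accuracy(X, y, w):
    """Pearson's r between the EEG predicted by the TRF and the recorded EEG."""
    return float(np.corrcoef(X @ w, y)[0, 1])
```

Δr for a given feature is then obtained by refitting after randomizing that feature’s values across note events (preserving onset times) and subtracting the reduced model’s r from the full model’s r.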

Fig 2. Neural encoding of timing but not pitch expectations at birth.


(A) TRF full model. Ridge regression weights in time yielded by TRF for all predictors of the full model at the electrode Fz. Orange and purple colors indicate positive and negative weights, respectively. Zero on the time axis represents the note onset. (B) Unique contribution of high-level musical features. Left panel: black dots indicate EEG prediction accuracy of the full model of each infant, computed across 25% of channels with the highest prediction accuracy across conditions (best channels per infant, used in all following plots). Infants are ranked according to the prediction accuracy (r) of the full model. Central upper panel: group-average (n = 49) Δr resulting from the difference between full and reduced models assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep) computed across each infant’s best channels and plotted separately for real (red) and shuffled (gray) melodies with associated topographical maps. Central lower panel: grand-average Δrs are displayed for each melody, ranked from slower to faster tempo (see S1 Table). Error bars represent bootstrapped 95% CI. Right panel: scatter plot representing the relationship between Δr yielded by high-level musical features associated with real (x-axis) and shuffled (y-axis) melodies (each dot represents one participant). See S2 Data. (C) Unique contribution of timing and pitch-related features. Topographical maps representing group-average Δr resulting from the difference between full and reduced models separately assessing the unique contribution of high-level timing features (St and Et), high-level pitch-related features (Sp and Ep), as well as low-level timing (IOI) and pitch-related (IPI) features across conditions (left: real; right: shuffled). Dot plots represent the group-average mean Δrs for the four reduced models (dots are color coded consistently with the colors used to label the high- and low-level musical features), computed across each infant’s best channels. 
Error bars represent bootstrapped 95% CI; asterisks indicate the presence of significant main effects and interactions; ‘ns’ indicates nonsignificant effects. See S2 Data. (D) Intersubject variability. Histograms of individual Δrs for models reduced by St and Et, Sp and Ep, IOI, as well as IPI, separately for the two conditions.

Encoding of probabilistic expectations in real but not shuffled music

Fig 2A shows the weights (see Methods ‘TRF analysis’) in time yielded by TRF for all stimulus features of the full model. Fig 2B (left panel) shows that the full model—including all features—predicted the EEG data with reasonable accuracy across virtually all participants, yielding correlation values comparable to those reported in previous TRF studies [51,56]. However, intersubject variability was substantial and was not explained by gestational age [57] (Spearman ρ = 0.183, p = 0.207), perhaps reflecting the limited variability of this measure in our sample (mean gestational age 279.8 ± 6.8 days, range 257–290 days).

To what extent do probabilistic (high-level) expectations contribute to the neural signal? We tested the unique contribution of probabilistic (high-level) features derived from the IDyOM model (St, Et, Sp, and Ep) beyond low-level stimulus features (onset, spectral flux, IOI, and IPI). We thus compared the change in prediction accuracy (Δr) between the full model and the reduced model—where event-related predictors (St, Et, Sp, and Ep) were randomized both in real and shuffled music (Fig 2B, top central panel). A linear mixed effect model (see Methods ‘Statistical analysis’ for details) with the fixed factor Condition (real versus shuffled) yielded a main effect of Condition (χ2(1) = 12.065, p < .001) indicating encoding of probabilistic expectations in real but not shuffled music (real > shuffled: b = .004, SE = .001, p = .005; real > 0: b = 0.003, SE = .0007, p < .001; shuffled > 0: p = .594). These effects were not driven by any specific melody (Fig 2B, bottom central panel) and exhibited high variability across subjects (Fig 2B, right panel, see also S2A Fig for a visualization of the condition effect on Δr values across individual participants and electrodes). This analysis demonstrates that the predictable structure of real (but not shuffled) melodies allows newborns to generate musical expectations over and above mere acoustic tracking.

Timing- but not pitch-related expectations

We tested whether the encoding of probabilistic expectations was specifically driven by pitch or timing structures (Figs 2C, 2D and S2B upper panels). We thus examined the difference between the full model and a reduced model, in which either St and Et (timing probabilistic TRF model) or Sp and Ep (pitch probabilistic TRF model) were randomized. A linear mixed effect model with fixed factor Condition (real versus shuffled) and TRF model (St and Et versus Sp and Ep) yielded a main effect of condition (χ2(1) = 12.353, p < .001) and an interaction between Condition and TRF model (χ2(1) = 9.897, p = .002). For real music, paired contrasts indicated encoding of probabilistic expectations based on timing but not pitch structure (St and Et > Sp and Ep: b = .002, SE = 0.0004, p < .001; St and Et > 0: b = 0.0024, SE = .0005, p < .001; Sp and Ep > 0: p = .384), whereas for shuffled music, neither of the two dimensions yielded significant effects (St and Et > Sp and Ep: p = .856; St and Et > 0: p = .166; Sp and Ep > 0: p = .601). This analysis demonstrates that newborns track the predictable rhythmic structure of the real melodies to generate expectations. In contrast, pitch-based probabilistic expectations do not appear to emerge with statistical significance.

As a control, we ran similar analyses to test the unique contribution of expectations driven by just immediate local changes in timing and pitch, as estimated by IOI and IPI (Figs 2C, 2D and S2B lower panels). We thus examined the difference between the full model and a reduced model, in which either IOI or IPI was randomized. A linear mixed effect model with fixed factor Condition (real versus shuffled) and TRF model (IOI versus IPI) yielded a main effect of TRF model (χ2(1) = 48.225, p < .001) and an interaction between Condition and TRF model (χ2(1) = 5.032, p = .025). Paired contrasts indicated encoding of IOI but not IPI for both real (IOI > IPI: b = .001, SE = 0.0003, p = .001; IOI > 0: b = 0.0022, SE = .0005, p < .001; IPI > 0: p = .12) and shuffled music (IOI > IPI: b = .003, SE = 0.0005, p < .001; IOI > 0: b = .00026, SE = .0007, p = .002; IPI > 0: p = .705). This analysis demonstrates that (low-level) expectations based on local temporal intervals are not altered by the rhythmic structure of the music, as IOIs were similarly tracked in real and shuffled melodies. It also shows that encoding of the pitch information did not reach significance in either condition (although IPI tracking approached significance when compared to zero in the real condition). Hence, the current results do not support the tracking of either pitch probabilistic expectations or local pitch change.

Notes carrying high surprise are often preceded by relatively large IOIs, and this bias was stronger in real than in shuffled music (S3A Fig). However, the stronger tracking of probabilistic rhythmic expectations (St and Et model) in real than shuffled music cannot be explained by low-level timing alone, as St and Et regressors captured additional EEG variance beyond that explained by the preceding IOI (Fig 2C). We also conducted further control analyses. First, because IOIs were occasionally short, one could argue that consecutive ERPs may have overlapped, potentially confounding the TRF results across conditions. This concern had already been addressed in our main analysis, where the IOI preceding each event was included as a regressor in the TRF model. To further rule out this possibility, we repeated the analysis, also adding the subsequent IOI as a regressor. The results of this analysis confirmed the findings reported above (S3B Fig). Second, to rule out the possibility that some, even if not all, infants were able to generate pitch-based probabilistic expectations, we explored whether those infants generating relatively stronger predictions for timing were also generating stronger predictions for pitch. To do so, we correlated Δr across Sp and St but found no significant correlation for either real (Spearman ρ = 0.234, p = 0.105) or shuffled music (Spearman ρ = 0.209, p = 0.15).

Converging evidence from Event-Related Potentials (ERPs)

To ground the TRF results in more widely used neurophysiological responses, we examined ERP responses to a subset of musical notes, specifically those carrying the highest and lowest 20% quantiles of surprise values (High S and Low S, respectively), separately for pitch and timing (Fig 3A). The ERPs consisted of a first negative peak (termed N1) followed by two broad positive-going deflections (P1 and P2) separated by a small (second) negative-going deflection (N2). The ERP waveforms resemble those previously observed in newborns evoked by auditory stimuli [58]. Furthermore, the waveform is reminiscent of TRF’s regression weights (Fig 2A), suggesting that the TRF analysis primarily captured phase-locked auditory responses, as observed in previous studies [19,51,59].
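The quantile-split ERP analysis can be sketched as follows (a hypothetical single-channel illustration; the function name, the baseline-correction choice, and the parameter values are our own assumptions, not the authors’ code):

```python
import numpy as np

def erp_by_surprise(eeg, sfreq, onsets_s, surprise, tmin=-0.1, tmax=0.5, q=0.2):
    """Average single-channel EEG epochs around note onsets, split into the
    lowest and highest `q` quantiles of surprise. `tmin` must be negative:
    the pre-onset samples are used for baseline correction.
    Returns (erp_low, erp_high, times_in_seconds)."""
    surprise = np.asarray(surprise)
    onsets_s = np.asarray(onsets_s)
    lo_thr, hi_thr = np.quantile(surprise, [q, 1 - q])
    n0, n1 = int(tmin * sfreq), int(tmax * sfreq)
    times = np.arange(n0, n1) / sfreq

    def average_epochs(onsets):
        epochs = []
        for t in onsets:
            i = int(round(t * sfreq))
            if i + n0 >= 0 and i + n1 <= len(eeg):
                seg = eeg[i + n0:i + n1]
                # Baseline-correct using the pre-onset window:
                epochs.append(seg - seg[:-n0].mean())
        return np.mean(epochs, axis=0)

    erp_low = average_epochs(onsets_s[surprise <= lo_thr])
    erp_high = average_epochs(onsets_s[surprise >= hi_thr])
    return erp_low, erp_high, times
```

The High-minus-Low difference waveform, computed per electrode, is then the quantity evaluated over time by a cluster-corrected permutation test.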

Fig 3. Modulation of auditory event-related potentials as a function of surprise.


(A) Human newborns. Group-average (n = 49) ERPs (electrode Fz) evoked by notes carrying relatively high (continuous line) versus low (dashed line) St (green) and Sp (yellow) for real and shuffled music locked to the note onset (0 s). The amplitude of the P2 component was higher in response to notes carrying relatively high versus low temporal surprise (green) (from +.24 to +.37 s) for real (left) but not for shuffled music (right). No effect of pitch-related surprise was found. Gray windows highlight significant differences between low versus high surprise responses (cluster-corrected permutation tests over time across all electrodes). Topographies illustrate the amplitude difference between conditions in the time windows of identified clusters. See S3 and S4 Data. (B) Human adults. To assist the comparison with results of previous studies, we plotted group-average (n = 20) ERPs (electrode FCz) recorded from human adults listening to the same stimuli as the infants (reanalysis of data from [51]). Note that no shuffled stimuli were presented in this study. The amplitude of the P1-N1-P2 components was higher in response to notes associated with high than those with low temporal surprise (P1: from +.05 to +.07 s; N1: from +.11 to +.12 s; P2: from +.16 to +.21 s). Notably, human adults also exhibited sensitivity to pitch-related surprise, as indicated by enhanced P1-N1-P2 in response to notes carrying relatively high versus low surprise in pitch (Sp, yellow) (P1: from +.07 to +.09 s; N1: from +.12 to +.14 s; P2: from +.17 to +.27 s). (C) Adult Rhesus monkeys. Group-average (n = 2) ERPs (electrode FCz) recorded from Rhesus Monkeys listening to the same stimuli as the infants (reanalysis of data from [19]). The amplitude of the P1 and P2 components (P1: from +.03 to +.06 s; P2 from +.12 to +.15 s, frontal electrodes) was higher in response to notes associated with high than those with low temporal surprise for real, but not for shuffled music. 
As in newborns, pitch-related surprise did not yield any significant modulation of the EEG amplitude.

Notably, the amplitude of the two positive-going deflections was enhanced in response to temporally unexpected (High S) compared to expected (Low S) notes, reaching significance in the second peak (from +.24 to +.37 s). This was observed for real but not for shuffled music. Conversely, no significant amplitude modulation was evoked by notes with unexpected pitch. This dissociation, together with the observation that pitch- and time-related surprise values (Sp and St) are weakly correlated (Spearman ρ = .17), suggests that unexpected pitch and timing events are processed independently. These results fully align with the TRF results, confirming that newborns generate expectations based on the rhythmic rather than melodic structure of the musical stimuli. They further provide insights into a neurophysiological response, specifically a late EEG positivity, whose amplitude varies as a function of timing- but not pitch-related surprise. Together, these results warrant comparison with findings from previous studies that exposed human adults (Fig 3B) and Rhesus monkeys (Macaca mulatta, Fig 3C) to the same stimuli [19,51]; we elaborate on this comparison in the Discussion.

Discussion

We employed continuous music stimuli with a rich melodic and rhythmic “alphabet” as a testbed for investigating the neurophysiology of music encoding in newborns. We demonstrated the feasibility of using naturalistic complex stimuli, such as Western tonal music, to examine different levels of auditory processing at birth. By determining how newborns use statistical regularities in melodic and rhythmic information to process music, our findings provide key contributions to understanding auditory development and its built-in biological constraints. Specifically, while rhythmic statistical regularities embedded in musical stimuli are neurally encoded already at birth, pitch-based information does not receive the same depth of processing, whether at low or high levels of encoding. This suggests that rhythmic and melodic sensitivities do not emerge in parallel in humans, with rhythm developing earlier than melody.

TRF analyses revealed high inter-individual variability in overall neural tracking of musical stimuli (Fig 2B left panel, see also [60]), likely stemming from the high variability in morphology and latency of newborns’ auditory ERPs [61], and compatible with the notion that TRFs capture ERP-like responses [54]. Crucially, we showed that newborns track note-by-note predictability in real but not shuffled music, and that the rhythmic, not the melodic, aspect of sound sequences drove this effect. This indicates that newborns extract statistical regularities from structured contexts (real condition)—likely the rhythmic relationship between nonadjacent timing intervals—to predict upcoming events in the sequence. Conversely, musical expectations were reduced when regularities were weak or absent in random contexts (shuffled condition). As expected, local temporal information—the latency difference between two adjacent notes—was encoded while listening to both real and shuffled music (see Fig 2C, IOI reduced model), independently of whether high-order structural patterns were present. These findings align with the idea that tracking event predictability relies on the ability to extract and represent structural information from the past context [62,63]. The reduced response in the shuffled condition reflects a down-weighting of such predictability-related response when the inferred stability, or precision, of the sensory input is low, and the present information does not conform with past experience [64,65].

This finding also brings novel evidence to our understanding of human rhythmic abilities present at birth. While rhythmic skills, such as sensitivity to isochrony and beat periodicity, are well-documented in infants at 5 months [66], at birth [43], and even in preterm infants [42], evidence regarding sensitivity to context-based probabilistic expectations remains elusive [46]. Here, we offer positive evidence. Using J. S. Bach’s compositions with a variable range of IOIs, we show that newborns are not merely tracking isochrony and periodic patterns. They also process a higher-level feature, namely the probability of when the next event will occur based on a range of past different IOIs. This capacity in infants might build upon the well-documented sensitivity to isochrony and periodicity: in other words, an isochronous or periodic representation of a sequence might provide a temporal grid of predictable sound events, like a scaffold facilitating the segmentation and organization of more complex temporal and/or spectral patterns [67].

What could underlie such precocious rhythmic abilities? A potential candidate is the fetal sensory environment, which is characterized by the prominence of biological rhythms. This includes auditory stimulations (e.g., the mother’s heartbeat [4]), as well as vestibular stimulations (e.g., associated with the regular pace of maternal gait [68–70]). An alternative possibility is that newborns developed such predictive skills through exposure to musical input during gestation [71]. This hypothesis, however, appears not to be supported by our supplementary analysis, demonstrating that estimating surprise values using a model pre-trained on a large musical corpus, reflecting prior exposure, produced results similar to those obtained from a model without pre-training (S4 Fig). Thus, fast statistical learning throughout the stimulus set provides a more parsimonious explanation. This is generally consistent with the existence of an inborn automatic statistical learning mechanism for sequence processing [72], and in line with recent EEG evidence of the neonates’ ability to rapidly learn transition probabilities across different attributes of complex sounds, such as speech [73,74]. While we could not manipulate prenatal musical exposure, future research should systematically manipulate it under the hypothesis that greater musical exposure would lead to stronger neural encoding of musical expectations.

Turning to the functional significance of such precocious rhythmic abilities, we speculate that it might be key in the early development of cognition, not only as a precursor to higher-order statistical learning but also as a mechanism for orienting attention and organizing behavior in time [75]. In support of this idea, newborns can partially adapt spontaneous rhythmical behaviors, such as sucking, to external stimuli [76]; rhythmical rocking interventions on preterm infants improve orienting responses [77]; and vestibular rhythmical stimulation on preterm infants increases their adaptive breathing response, vital for organizing structured behaviors, such as feeding, early vocalizations, and interactions [78].

As opposed to rhythm, we found no evidence for neural encoding of local pitch intervals (IPI) or pitch-based probabilistic expectations (Sp) (note, however, that IPI neural tracking approached significance when compared to zero in the real condition). While this is at first surprising given past work [47–50], large variability and hardly detectable responses to pitch variations have also been highlighted previously, suggesting that pitch neural tracking in newborns requires clear-cut pitch-change manipulations [61]. Such generally weak pitch encoding may stem from the fact that fetal hearing is heavily low-pass filtered in the womb [79], resulting in substantial attenuation of pitch details during gestation and slower maturation of pitch sensitivity. This is consistent with immature frequency-specific pathways and coarse frequency tuning at birth [80], as well as immature temporal resolution of different tones (see evidence from 6-month-old infants [81]). This factor, combined with the greater complexity of our stimuli compared with previous work, might explain the limited melodic tracking we observed. Indeed, our musical stimuli (across real and shuffled melodies) comprised sequences of multiple pitches (N = 38, mean pitch height = 73.04 ± 6.32; range: 55−93, in MIDI notation), IPIs (N = 29, mean interval size = 4.66 ± 4.12; range: 0−28 semitones), and IOIs (N = 30, mean interval size = 0.19 ± 0.13; range: 0.03–2.6 s) presented at varying tempi and rhythms. While these features better approximate everyday music listening, they also pose a greater computational challenge for the neonate’s brain compared to traditional oddball paradigms using constant IOIs and large, infrequent pitch deviants. In sum, newborns’ sensitivity to pitch appears far from sufficient for appreciating musical pitch regularities, which likely emerge through maturation and enculturation.
According to this view, reports of musical memory from the womb to birth [82] are likely to rely primarily on timing rather than pitch information, a hypothesis that deserves further testing.

The dissociation observed between rhythmic and melodic statistical tracking might stem from their independent yet complementary neural implementations—respectively relying on temporal and content-based signaling along the auditory hierarchy [83–85]. This separation grants flexibility, allowing the brain to weight predictive signals by their reliability to optimize sequence perception [6]. Our findings suggest that the weighting of these two predictive processes is shaped by developmental refinement, with rudimentary pitch encoding at birth eventually becoming as robust as temporal encoding. It is also possible that these two processes are differentially vulnerable to vigilance states. According to this hypothesis, during sleep, timing is favored over pitch because it is more salient and potentially linked to survival-relevant cues [75]. This would align with EEG studies on adults suggesting that in-sleep perception and learning might be restricted to simple salient information [86] (see examples in music [87] and speech [88]). Future research should investigate whether melodic processing is modulated by sleep in newborns and whether it is similarly underweighted in sleeping adults. This would clarify whether melodic tracking is truly lacking in newborns or is a consequence of how pitch-related information is processed during sleep.

From a phylogenetic perspective, the prominent perceptual role of rhythm observed in early phases of human ontogeny might piggyback on a more ancestral, phylogenetically conserved sensitivity to rhythm (rather than melody) within the primate lineage. The ERP analysis showed greater amplitude of P1-P2 responses to temporally unexpected than expected notes but no modulation in the pitch dimension (Fig 3A). Given their similar broad frontal topography, these two peaks may reflect a single positivity with a common underlying generator (as also discussed in [58,89]). They might also represent precursors to the adult P1 and P2 components [90] (Fig 3B), possibly involving frontotemporal areas for sensory predictive processing and memory-based sequential integration [91–93]. Interestingly, the P1-P2 responses of both monkeys and human adults listening to the same stimuli presented to the newborns were also modulated by temporal surprise [19] (see the re-analyses in Fig 3B and 3C). Thus, the similarity in cortical responses to temporal but not pitch surprise across groups suggests rhythm as a primary perceptual cue in auditory sequence tracking. This does not imply that humans and monkeys generate rhythmic expectations through the same neural mechanism, even if their cortical responses are similarly modulated by temporally unexpected events. For instance, these responses might reflect the contribution of anticipatory mechanisms operating over different rhythmic features—ranging from periodicity and local temporal changes to statistical and hierarchical structures [94,95]. Supplementary analyses suggest that, in newborns, probabilistic and local temporal information explain comparable amounts of EEG variance (Figs 2C and S3), whereas in monkeys, local temporal information contributes relatively more strongly (Figure S3 in [19]).
Comparing distinct rhythmic computational models across phylogenetically close groups and as a function of exposure might shed light on the biological basis and evolutionary history of these different rhythmic capacities.

Regarding melodic expectations, the lack of significant melodic tracking observed in both human newborns and musically naïve monkeys (as opposed to human adults, Fig 3) leaves open the hypothesis that melodic sensitivity may not have emerged only in humans within the primate lineage but might potentially develop in other nonhuman primates given sufficient musical exposure. Testing this hypothesis across species could shed light on the role of experience in shaping the relative weighting of pitch- and timing-based expectations in auditory processing.

Conclusions

Overall, this study provides neurophysiological evidence that tracking rhythmic statistical regularities is a capacity present at birth, whilst melodic tracking might not be, at least with respect to naturalistic musical stimuli, such as the ones we used here. Future investigations should assess whether the observed dominance of rhythm over melody reflects state-dependent factors such as sleep or instead marks an early developmental bias that gradually shifts with experience toward the balanced sensitivity observed in adulthood.

Methods

Ethics statement

Written formal consent was obtained from the parent/guardian, and the infant’s mother could opt to be present during the recording. The study fully complied with the World Medical Association Helsinki Declaration and all applicable national laws. Approval was granted by the Hungarian Medical Research Council, Committee of Scientific and Research Ethics (ETT TUKEB), ethics approval: IV/2199-4/2020/EKU.

Participants

Sixty-four healthy full-term newborn infants (0–2 days of age; 30 male; APGAR scores 9/10) were tested at the Department of Obstetrics-Gynecology, Szent Imre Hospital, Budapest. EEG data from 6 infants were corrupted, and 9 other infants did not complete the experiment. These data were therefore not analyzed, leaving a total sample size of 49. The analyzed infants had a mean gestational age of 40 weeks (SD = 7.1 days) and a mean birthweight of 3,468.1 g (SD = 398.2 g). All newborns had normal hearing, having passed the Brainstem Evoked Response Audiometry (BERA) test.

Stimuli

The stimuli consisted of 14 monophonic piano melodies used in [19] (details in S1 Table): 10 melodies (real music) composed by Johann Sebastian Bach (previously also used in [51]) and 4 control melodies (shuffled music) created by disrupting the pitch order and timing regularities of four of the original melodies (see below). The length of the melodies varied (average duration = 158.07 s ± 24.06), and the tempo ranged from 47 to 140 bpm (average tempo = 106.5 bpm ± 34.7). The four shuffled melodies were derived from four of the real melodies, specifically selected to represent those with the highest (melodies 05 and 08) and lowest (melodies 01 and 10) temporal-onset mean surprise. This selection was motivated by evidence suggesting that music with higher timing surprise elicits stronger brain responses in humans, and aimed to balance these effects across real and shuffled music. The shuffled melodies were matched to the real melodies in terms of pitch content, average note duration, and IOIs, but their structure was disrupted in two key musical dimensions. Pitch regularities were altered by reordering the temporal sequence of the original notes. Rhythmic patterns were disrupted by creating a new set of IOIs drawn from a Gaussian distribution centered around the original mean IOI, with an added variation based on the difference between the mean and the minimum IOI. These randomly generated IOIs were then adjusted in MuseScore software (version 3.3.4.24412, https://musescore.org) to align with 16th-note quantization, preserving integer ratios. In MuseScore, the MIDI velocity (which correlates with note loudness) was standardized to a constant value of 100, and piano sound waveforms were synthesized with a 44,100 Hz sampling rate. Each melody was preceded and followed by a beep (800 Hz pure tone, linearly ramped with a 5 ms fade-in and fade-out) and a 5-s silence, following the structure: beep-silence-music-silence-beep.
The resulting audio files were converted to mono and amplitude-normalized by dividing by the standard deviation using Matlab (R2019, The MathWorks, Natick, MA, USA).
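The IOI-shuffling step described above can be sketched numerically as follows. This is a minimal sketch under stated assumptions: the exact spread of the Gaussian ("an added variation based on the difference between the mean and the minimum IOI") is interpreted here as the standard deviation, and the 16th-note quantization (done in MuseScore in the actual study) is approximated on a simple duration grid; `shuffle_iois` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def shuffle_iois(iois, tempo_bpm, seed=0):
    """Sketch of the control-stimulus IOI generation: draw new inter-onset
    intervals from a Gaussian centered on the original mean IOI (spread =
    mean - minimum IOI, an assumed interpretation), then snap each value to
    a 16th-note grid (the study did this step in MuseScore)."""
    rng = np.random.default_rng(seed)
    iois = np.asarray(iois, dtype=float)
    mu = iois.mean()
    spread = mu - iois.min()
    new = rng.normal(mu, spread, size=iois.size)
    sixteenth = 60.0 / tempo_bpm / 4        # duration of a 16th note in seconds
    quantized = np.round(new / sixteenth) * sixteenth
    return np.clip(quantized, sixteenth, None)  # keep intervals positive

iois = [0.25, 0.25, 0.5, 0.25, 1.0]         # toy IOIs in seconds
print(shuffle_iois(iois, tempo_bpm=120))    # all values are multiples of 0.125 s
```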

Information dynamics of music model

Stimuli were analyzed using the IDyOM model (https://www.marcus-pearce.com/idyom/), which predicts note-by-note unexpectedness (surprise) and uncertainty (entropy). IDyOM is a variable-order Markov model that learns statistical patterns from musical sequences. It generates probability distributions for each new note based on prior context and outputs surprise (S) and entropy (E) over time. Surprise measures the unexpectedness of an event at time ‘t’ once it has occurred. Entropy reflects the uncertainty about the event at ‘t’ before it occurs, based on the probability distribution over all potential notes given all observations prior to the event at ‘t’. The model incorporates both short-term and long-term contexts, with the short-term model trained on the current sequence and the long-term model on prior musical exposure. To simulate the statistical knowledge that the newborns would acquire through mere exposure to the stimuli, predictions were derived from a combination of short- and long-term models, with the latter trained only on the stimuli used in the experiment, i.e., via resampling (10-fold cross-validation) (in IDyOM terminology: no pretraining, ‘both+’ model configuration). IDyOM can account for many aspects of music, but here we focused on the two key dimensions that best describe monophonic piano melodies: pitch and timing. To this end, time series representing pitch and inter-onset interval ratios (using separate ‘cpitch’ and ‘ioi-ratio’ IDyOM viewpoints) were analyzed independently by IDyOM to calculate note-by-note surprise (S) and entropy (E) for both pitch (Sp, Ep) and timing (St, Et). These were then combined to determine the joint (sum) probability for each note (S, E).
To simulate the long-term statistical knowledge of music possibly acquired by infants in the womb, we ran control analyses using S and E estimates derived from an IDyOM model pre-trained on a large corpus of music, comprising 152 Canadian folk songs, 566 German folk songs from the Essen folk song collection, and 185 J. S. Bach chorale melodies (as in previous applications [19,51,96]).
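The two IDyOM outputs have standard information-theoretic definitions, which can be illustrated with a toy predictive distribution (this is only a numerical illustration of the quantities; IDyOM derives the distributions themselves from variable-order Markov statistics):

```python
import numpy as np

def surprise(p_event):
    """Surprise of an event that occurred with model probability p_event."""
    return -np.log2(p_event)

def entropy(dist):
    """Uncertainty of the predictive distribution before the event occurs."""
    dist = np.asarray(dist, dtype=float)
    dist = dist / dist.sum()
    return float(-(dist * np.log2(dist)).sum())

# Toy distribution over 4 candidate next notes
dist = [0.5, 0.25, 0.125, 0.125]
print(surprise(0.125))  # the least likely note is most surprising -> 3.0 bits
print(entropy(dist))    # uncertainty before the note occurs -> 1.75 bits
```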

Procedure

As is common in EEG studies of newborns, infants were asleep during the EEG recording and stimulus presentation. Stimuli were presented using a Maya 22 USB external soundcard and ER-2 Insert Earphones (Etymotic Research, Elk Grove Village, IL, USA) placed into the infants’ ears via ER-2 Foam Infant Ear-tips. The melodies were presented at a comfortable intensity (about 70 dB SPL). Two sets of the 14 melodies were presented, in a randomized order within each set, ensuring that each infant listened to each melody at least once, with some melodies heard twice. None of the infants heard the full set of 14 melodies twice (range: 14–25 melodies heard per infant), and melodies were repeated 1.18 times on average (SD = 1.4). The presentation was implemented in Matlab (R2014, The MathWorks, Natick, MA, USA) and Psychtoolbox (version 3.0.14). EEG was recorded throughout the stimulus presentation. The inter-stimulus interval between melodies (ISI, offset to onset) was 900–1,300 ms (random with even distribution, 1 ms steps). The experiment took 45 min overall, including both preparation and stimulation.

Data recording and preprocessing

An ActiChamp Plus amplifier with a 64-channel sponge-based electrode system (saltwater sponges and passive Ag/AgCl electrodes, R-Net) and a Brain-Vision Recorder were employed to record EEG (Brain Products GmbH, Gilching, Germany). The sampling rate was 500 Hz with a 100 Hz online low-pass filter applied. Electrodes were placed according to the International 10/10 system. The Cz channel served as the reference electrode while the ground electrode was placed on the midline of the forehead. During the recording, impedances were kept below 50 kΩ.

Data were preprocessed and analyzed in MATLAB R2019. For the analysis, we applied a fully data-driven pipeline for preprocessing EEG data, combining open-access denoising algorithms, similar to previous studies dealing with noisy EEG recordings [19,59]. The analysis used Fieldtrip [97] and EEGLAB toolboxes (http://sccn.ucsd.edu/). The continuous EEG data were bandpass filtered between 1 and 30 Hz (Butterworth filter, zero-phase, order 3), down-sampled to 100 Hz, and segmented into epochs from the onset to the offset of each melody, separately. Before re-referencing the data to the average of a set of electrodes (‘F9’, ‘F10’, ‘P9’, ‘P10’, and ‘Iz’), faulty or noisy electrodes were temporarily discarded to prevent noise contamination across electrodes. Specifically, for each electrode, the mean, standard deviation, and peak-to-peak values were calculated across time within each trial. If any of these values deviated by more than 2.75 standard deviations from the mean of other electrodes, the electrode was flagged as noisy/faulty. This process was repeated until a distribution without outliers was obtained. The data were then further denoised in EEGLAB using the Artefact Subspace Reconstruction (ASR) algorithm [98] (threshold value 5 previously validated for both adult human and monkey EEG data [19]). Eye-movement artifacts were corrected using the ICLabel algorithm in EEGLAB. After performing independent component analysis (ICA) with EEGLAB’s ‘runica’ function, independent components labeled by ICLabel as ‘eye movements’ (with > 90% likelihood) were rejected. Subsequently, electrodes that were initially excluded (due to being faulty or noisy) were interpolated by replacing their voltage with the average voltage of the (preprocessed) neighboring electrodes (18 mm distance, including 8 electrodes on average). 
If, following the above preprocessing, noisy electrodes were still automatically identified, the interpolation step was repeated (the number of such iterations varied between 1 and 2).
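The iterative noisy-electrode detection described above can be sketched as follows. This is a simplified single-trial version under stated assumptions (the study computed the statistics within each trial and the exact aggregation across trials is not shown here); `flag_noisy_electrodes` is a hypothetical helper:

```python
import numpy as np

def flag_noisy_electrodes(data, z_thresh=2.75):
    """Iteratively flag noisy electrodes in one trial: per electrode, compute
    mean, SD, and peak-to-peak across time; flag any electrode whose statistic
    deviates from the across-electrode mean by more than z_thresh SDs; repeat
    until no outliers remain."""
    good = np.ones(data.shape[0], dtype=bool)
    stats = np.column_stack([
        data.mean(axis=1),
        data.std(axis=1),
        data.max(axis=1) - data.min(axis=1),   # peak-to-peak amplitude
    ])
    changed = True
    while changed:
        changed = False
        mu = stats[good].mean(axis=0)
        sd = stats[good].std(axis=0)
        for i in np.where(good)[0]:
            if np.any(np.abs(stats[i] - mu) > z_thresh * sd):
                good[i] = False
                changed = True
    return np.flatnonzero(~good)   # indices of flagged electrodes

# Example: one electrode with grossly inflated amplitude gets flagged.
rng = np.random.default_rng(0)
data = rng.normal(0, 1, (10, 500))
data[3] *= 1000
print(flag_noisy_electrodes(data))   # -> [3]
```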

TRF analysis

We employed Temporal Response Functions (TRF) to model EEG responses to the continuous acoustic and musical features of the presented stimuli using the mTRF MATLAB toolbox [53]. Each stimulus feature (as listed below) was normalized across time for each melody, ensuring that the root mean square of each feature was 1. A forward model was run to predict the ongoing EEG response from the stimulus features, with a time lag window of −50 to +400 ms to capture EEG fluctuations related to changes in the stimulus. This time window was sufficiently large to encapsulate well-known ERP-like modulations of EEG signals that are known to drive the variance modeled by TRF. Ridge regression was used to prevent overfitting (lambda range: 10⁴ to 10⁸). TRFs were fitted to all melodies (pooled real and shuffled melodies) using leave-one-melody-out cross-validation, and the EEG time course of the left-out melody was predicted. Note that the correlation values are typically calculated between EEG signals and their predictions by considering single-participant EEG signals, which might carry much noise (especially if recorded from newborns). As such, EEG prediction correlations are variable between participants largely due to the variable SNR of the EEG signal across participants (as every prediction is correlated with a different EEG signal). To overcome this issue, we averaged all participants’ EEG timeseries data to form a single EEG ‘super-subject’ data timeseries, which we refer to as ‘ground-truth EEG’. Then, per each participant, melody, and electrode, prediction accuracy was quantified by calculating Pearson’s correlation between the predicted and ground-truth EEG data.
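The core of the forward TRF fit, a time-lagged ridge regression, can be sketched as follows. This is a minimal sketch of the operation the mTRF toolbox implements, not the authors' code: the actual analysis used leave-one-melody-out cross-validation and lambda selection over 10^4–10^8, omitted here for brevity.

```python
import numpy as np

def lagged_design(stim, lags):
    """Build a time-lagged design matrix from a (n_samples, n_features)
    stimulus: one block of columns per lag, zero-padded at the edges."""
    n, f = stim.shape
    X = np.zeros((n, f * len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j*f:(j+1)*f] = stim[:n-lag]
        else:
            X[:n+lag, j*f:(j+1)*f] = stim[-lag:]
    return X

def fit_trf(stim, eeg, fs=100, tmin=-0.05, tmax=0.4, lam=1e2):
    """Forward TRF via ridge regression: map lagged stimulus features
    (-50 to +400 ms, as in the main analysis) to the EEG."""
    lags = np.arange(int(round(tmin * fs)), int(round(tmax * fs)) + 1)
    X = lagged_design(stim, lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
    return w, X @ w   # TRF weights and predicted EEG
```

In the study, prediction accuracy was then quantified as the Pearson correlation between the predicted EEG of a left-out melody and the 'ground-truth' (participant-averaged) EEG.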

We tested the contribution of high-level probabilistic musical expectations to the predicted EEG in addition to that of the low-level acoustic features in both the timing and pitch dimensions. Feature selection was based on the approach used in [19]. We additionally tested for the contribution of local changes in timing and pitch, such as IOI and IPI (measured in ms and absolute number of semitones, respectively), as these features are to some extent correlated with surprise values in naturalistic music (e.g., relatively larger temporal or pitch deviations tend to be relatively more unexpected, particularly in structured compositions like Bach’s) [99].

Thus, we ran a full model including low-level acoustic features (acoustic onset, spectral flux, as well as IOI and IPI) and high-level probabilistic musical features, represented as impulses at the note onsets whose amplitudes were set to the pitch and onset surprise and entropy values from IDyOM (surprise pitch Sp, surprise timing St, entropy pitch Ep, and entropy timing Et). A control analysis adding the envelope and its half-rectified derivative to the low-level acoustic regressors yielded a pattern of results similar to the main analysis (see S5 Fig).

Although, as shown in Fig 1C, the correlations across regressors in our stimuli are small (<0.3) to moderate (<0.5), we used a variance partitioning approach that accounts for shared variance across regressors, allowing us to estimate their unique contributions to neural responses. We acknowledge, however, that a more direct way to assess the unique contribution of individual features is through causal manipulation, as demonstrated by the “model-matched” stimulus approach [100]. Thus, to assess the unique neural encoding of a single stimulus feature or a set of features, we subtracted the prediction accuracy of several reduced models from that of the full model (containing all features). We then analyzed the Δr values obtained for each reduced model. Note that we use the term ‘reduced’ to indicate a model in which a given predictor (or set of predictors) is temporally shuffled to estimate its unique contribution to the full model. The reduced models had the same dimensionality as the full model, but the feature(s) of interest were randomized in time while preserving the onset times. We tested five reduced models: 1) a probabilistic music model in which the high-level features (St, Et, Sp, and Ep) were randomized, to assess the overall effect on neural tracking of adding surprise and entropy estimates to the low-level features; 2) a probabilistic timing model with randomized St and Et; 3) a probabilistic pitch model with randomized Sp and Ep; 4) a local timing model with randomized IOI; and 5) a local pitch model with randomized IPI. To compare the different models, for each participant, condition, and TRF model, Δr values were averaged across the 25% of channels with the highest prediction accuracy in the full model and across the real and shuffled conditions. These values were then entered into linear mixed-effects regressions.
Note that ROIs were defined as the top 25% of electrodes showing the highest correlation values in the full model, averaged across conditions, ensuring independence from both condition (real versus shuffled) and regressor type (surprise pitch versus surprise timing). Using alternative thresholds (top 10% or 50%) yielded the same pattern of results, confirming that findings were robust to ROI definition.
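The logic of the reduced-model comparison can be sketched as follows. This is illustrative only: `fit_and_predict` stands in for the full cross-validated TRF pipeline, and the shuffling scheme shown (permuting note-level amplitudes while keeping impulse positions fixed) is our reading of the randomization procedure.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two (flattened) signals."""
    a, b = np.ravel(a), np.ravel(b)
    return float(np.corrcoef(a, b)[0, 1])

def delta_r(full_feats, eeg, shuffle_idx, fit_and_predict, seed=0):
    """Delta-r for one feature: the 'reduced' model keeps the full model's
    dimensionality but randomizes the feature of interest in time, here by
    permuting its note-level amplitudes while preserving the onset times."""
    r_full = pearson(fit_and_predict(full_feats), eeg)
    reduced = full_feats.copy()
    col = reduced[:, shuffle_idx]
    onsets = np.flatnonzero(col)                 # impulse positions kept
    rng = np.random.default_rng(seed)
    reduced[onsets, shuffle_idx] = rng.permutation(col[onsets])
    r_reduced = pearson(fit_and_predict(reduced), eeg)
    return r_full - r_reduced
```

A positive Δr indicates that the intact feature carried unique predictive information beyond the remaining regressors.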

Statistical analysis

Statistical analyses were run in R (version 4.1.3, 2022-03-10) and included nonparametric tests or linear mixed-effects models (lme4 package). All models included random effects of Infant and Melody (IDs 1–14). The fixed effects included Condition (real/shuffled) and TRF model (depending on the comparison; see Results). Statistical significance was evaluated by likelihood-ratio tests (χ²) conducted using the ‘anova’ function (stats package). Follow-up contrasts were conducted using the ‘emmeans’ package and the Tukey method to account for the increased risk of type I error resulting from multiple comparisons. Adjusted p-values were calculated to determine significant differences between conditions. A significance level of α = 0.05 was used. All linear mixed-effects models (LMMs) report fixed-effect estimates (b) along with their standard errors (SE) and t-values, with degrees of freedom estimated via Satterthwaite approximation when applicable.

When a direct test of differences was needed, the nonparametric Wilcoxon signed-rank test was used. For these, results are reported as W-values, indicating the sum of ranks of signed differences.
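For reference, a W statistic of the kind reported here can be computed as follows. This is a minimal sketch under one common convention (sum of the ranks of the positive differences); conventions differ across software (e.g., some packages report the smaller of the two rank sums), so this is illustrative rather than a reproduction of the R implementation used.

```python
import numpy as np

def wilcoxon_w(x, y):
    """Wilcoxon signed-rank W: rank the absolute paired differences
    (zeros dropped, average ranks for ties) and sum the ranks of the
    positive differences."""
    d = np.asarray(x, float) - np.asarray(y, float)
    d = d[d != 0]                          # drop zero differences
    absd = np.abs(d)
    order = absd.argsort()
    ranks = np.empty_like(absd)
    ranks[order] = np.arange(1, d.size + 1, dtype=float)
    for v in np.unique(absd):              # average ranks for tied |d|
        ranks[absd == v] = ranks[absd == v].mean()
    return float(ranks[d > 0].sum())

print(wilcoxon_w([3, 1, 2, 0], [0, 2, 0, 4]))  # ranks 3 and 2 are positive -> 5.0
```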

ERP analysis

Event-related potential (ERP) analyses were performed by segmenting the EEG data into 600 ms epochs, beginning 100 ms before the onset of each note and ending 500 ms after the onset. Epochs were baseline corrected using a 50 ms window before the note onsets, and trials deviating from the mean by more than 2.5 times the average standard deviation were rejected (3.45 ± 1.12% of the trials per subject). To assess ERP modulation as a function of note surprise, we selected the notes with the highest and lowest 20% of surprise values (high S and low S), separately for each melody, as assessed by IDyOM. For each subject, epochs were trimmed to a window of −50 to +400 ms relative to note onset and averaged by high/low S condition, separately for real and shuffled melodies. Cluster-based permutation testing [101] was used to account for multiple comparisons across adjacent time points and electrodes. Clusters of adjacent timepoints and neighboring electrodes (at least three) associated with significant (p < 0.025) differences across conditions were formed. A cluster-level threshold of p < 0.05 was applied to the t-statistic, and the Monte Carlo method (1,000 iterations) was used to estimate the null distribution of this statistic. To assist comparability with previous work, we re-analyzed the EEG data recorded from human [51] and monkey [19] adults following the same pipeline described here (both datasets are open source). Note that for the monkey data, as in the original work, clusters were identified separately for each animal (across 22 sessions) and considered significant only when exhibited by both animals (conjunction analysis).
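The epoching and baseline-correction step can be sketched as follows (a simplified sketch; artifact rejection and the surprise-based epoch selection are omitted, and `epoch_and_baseline` is a hypothetical helper):

```python
import numpy as np

def epoch_and_baseline(eeg, onsets, fs=100, tmin=-0.1, tmax=0.5, bl=0.05):
    """Cut (tmin, tmax) windows around each note onset from continuous EEG
    (n_channels x n_samples) and subtract the mean of the bl-second
    pre-onset baseline per channel."""
    pre, post = int(round(-tmin * fs)), int(round(tmax * fs))
    nbl = int(round(bl * fs))
    epochs = []
    for t in onsets:
        if t - pre < 0 or t + post > eeg.shape[1]:
            continue                      # skip incomplete epochs at the edges
        ep = eeg[:, t - pre:t + post].copy()
        ep -= ep[:, pre - nbl:pre].mean(axis=1, keepdims=True)  # baseline
        epochs.append(ep)
    return np.stack(epochs)               # (n_epochs, n_channels, n_samples)
```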

Supporting information

S1 Fig. Summary statistics (mean and variance of the amplitude envelope across frequency bands) of stimuli.

To extract the envelope associated with each frequency band, we bandpass-filtered the musical stimuli into 128 logarithmically spaced frequency bands ranging from 100 to 8,000 Hz using a gammatone filter bank. We then computed the amplitude envelope of each band as the absolute value of the Hilbert-transformed signal over time. The envelope mean (left) and variance (right) are shown as a function of frequency band. Thin lines represent individual melodies (real and shuffled), while thick lines indicate the average across melodies for Real (solid line) and shuffled (dotted line) conditions. Note that the real and shuffled conditions show comparable envelope means and variances. See S5 Data.
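The envelope extraction described here can be illustrated for a single band. This is a minimal numpy sketch of the Hilbert-envelope step (equivalent to taking the magnitude of the analytic signal, as scipy.signal.hilbert does); the actual analysis applied it to each of the 128 gammatone bands, which are not reimplemented here.

```python
import numpy as np

def hilbert_envelope(x):
    """Amplitude envelope as the magnitude of the analytic signal,
    computed via the FFT (frequency-domain Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    if n % 2 == 0:
        h[n // 2] = 1
        h[1:n // 2] = 2        # double positive frequencies
    else:
        h[1:(n + 1) // 2] = 2
    return np.abs(np.fft.ifft(X * h))

# Amplitude-modulated tone: 100 Hz carrier, 3 Hz modulation
fs = 1000
t = np.arange(fs) / fs
x = (1 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 100 * t)
env = hilbert_envelope(x)      # recovers the 3 Hz modulation envelope
```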

(TIF)

pbio.3003600.s001.tif (930.2KB, tif)
S2 Fig. Difference in prediction accuracy (Δr) between real and shuffled conditions across participants and electrodes for each reduced model.

The 2D matrices display electrodes on the Y-axis and participants on the X-axis, with colors coding the difference in Δr values (full − reduced model) between real and shuffled conditions. Note that positive values (color coded red) indicate a greater contribution of the real compared to the shuffled condition. Near-zero values (color coded white) indicate similar contributions across conditions. (A) Unique contribution of high-level musical features. Between-condition difference in Δr (full − reduced model assessing the unique contribution of high-level musical features: St, Et, Sp, and Ep). To facilitate visualization, we display the labels of 16 representative electrodes (out of 63) on the y-axis, along with their corresponding positions on the EEG cap (bottom). (B) Unique contribution of timing and pitch-related features. Between-condition difference in Δr (full and reduced models separately assessing the unique contributions of St and Et, Sp and Ep, IOI, and IPI). See S6 Data.

(TIF)

pbio.3003600.s002.tif (2.4MB, tif)
S3 Fig. (A) Length of preceding and subsequent IOIs as a function of rhythmic surprise and condition.

Notes carrying high surprise are often preceded by relatively larger IOIs. A linear mixed model predicting the preceding IOI, with factors condition (real/shuffled) and surprise level (high/low), yielded no main effect of condition (χ2(1) = 1.12, p = 0.29), but a significant main effect of surprise level (χ2(1) = 16.89, p < .001), and an interaction of condition and surprise level (χ2(1) = 18.12, p < .001). This indicates that larger IOIs generally anticipate notes carrying high surprise, more so in real than in shuffled music (left panel). Conversely, the same analysis predicting the subsequent (rather than preceding) IOI yielded a nearly significant effect of surprise level (χ²(1) = 3.52, p = 0.06) but no main effect of condition (χ²(1) = 0.008, p = 0.90) and no interaction (χ²(1) = 0.70, p = 0.40). This indicates that notes carrying high surprise tend to be followed by larger IOIs, but comparably across real and shuffled music (right panel). See S7 Data. (B) Unique contribution of St and Et is independent of preceding or subsequent IOI. To distinguish neural tracking of rhythm from spurious modulations of event-related potentials (ERPs) attributable to overlapping (i.e., temporally proximal) neural responses, we re-ran the main analysis, adding the length of the subsequent IOI as a regressor in the mTRF. We thus ran a full model with the following regressors: onset, spectral flux, inter-pitch interval, preceding IOI, subsequent IOI, Sp, Ep, St, and Et. We then computed three reduced models, each randomizing one of the following regressors: 1) St and Et, 2) preceding IOI, and 3) subsequent IOI. The results of this control analysis confirm a unique contribution of the St and Et features to the neural response beyond the contribution of the subsequent and preceding IOIs. The plot shows topographical maps representing the group-average Δr resulting from the difference between the full and the three reduced models across real and shuffled conditions.

(TIF)

pbio.3003600.s003.tif (1.5MB, tif)
S4 Fig. Control analyses.

No effects of IDyOM statistical knowledge on EEG prediction accuracy. We compared the effect of deriving surprise estimates by training IDyOM on either the experimental stimuli alone (left panel) or on the experimental stimuli plus an additional corpus of Western tonal music (right panel). For each panel, we plot the difference in EEG prediction accuracy (Δr) between the full and the reduced models (randomizing St, Et, Sp, and Ep). Dots represent the grand-average mean Δr computed across all channels and melodies (left top panel, with associated topographical maps) for real (red) and shuffled (gray) music. Error bars represent bootstrapped 95% CIs. The absence of differences in predicting neural responses between pre-trained and non-pre-trained model configurations suggests that incorporating pretraining to estimate surprise and entropy values does not enhance the prediction of EEG data. This may be due to the high correlation between the estimates derived from the two IDyOM configurations, leading to similar EEG predictive power. Additionally, it may indicate that Bach’s music contains sufficient rules and statistical regularities, allowing the model to learn these directly from the stimulus set, rendering pretraining on the large music corpus redundant for predicting brain signals. See S8 Data.

(TIF)

pbio.3003600.s004.tif (886.9KB, tif)
S5 Fig. Control analysis.

Replication of the results reported in Fig 2C, here adding the envelope and its half-wave rectified derivative as predictors in the full model. We repeated the analysis using an enriched acoustic model, adding the envelope and its half-wave rectified derivative to the acoustic regressors already used (onsets, spectral flux, IPI, and IOI). This additional analysis confirms the robustness of our results: the main findings remain overall unchanged, indicating a unique contribution of the high-level music regressors in the real but not the shuffled music condition, specifically driven by the St and Et regressors. See S9 Data.

(TIF)

pbio.3003600.s005.tif (1.1MB, tif)
S1 Table. Characteristics of the experimental stimuli.

(DOCX)

pbio.3003600.s006.docx (26KB, docx)
S1 Data. Excel file containing the numerical data values for Fig 1B and 1C.

Sheet1: average surprise and entropy values of each note, computed separately for pitch and timing, within each melody. Sheet2: values used for Pearson’s correlation between the stimulus features of all melodies: inter-pitch-interval (IPI), inter-onset-interval (IOI), and surprise and entropy associated with timing (St and Et) and pitch (Sp and Ep).

(XLSX)

pbio.3003600.s007.xlsx (11.6KB, xlsx)
S2 Data. Excel file containing the numerical data values for Fig 2B, 2C, and 2D.

Sheet1: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep). Sheet 2: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (St and Et) and pitch-related (Sp and Ep) high-level features. Sheet 3: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (IOI) and pitch-related (IPI) low-level features.

(XLSX)

pbio.3003600.s008.xlsx (138.7KB, xlsx)
S3 Data. MATLAB file containing the numerical values underlying Fig 3A, representing ERPs evoked by notes with relatively high versus low pitch surprise (Sp) for real and shuffled music, time-locked to note onset.

The data are stored as a 4D matrix with dimensions subject × condition × channel × time, where conditions 1–4 correspond to Low Sp (real), High Sp (real), Low Sp (shuffled), and High Sp (shuffled). Data can be opened using nonproprietary software such as R or Python.

(MAT)

pbio.3003600.s009.mat (4.2MB, mat)
S4 Data. MATLAB file containing the numerical values underlying Fig 3A, representing ERPs evoked by notes with relatively high versus low timing surprise (St) for real and shuffled music, time-locked to note onset.

The data are stored as a 4D matrix with dimensions subject × condition × channel × time, where conditions 1–4 correspond to Low St (real), High St (real), Low St (shuffled), and High St (shuffled). Data can be opened using nonproprietary software such as R or Python.

(MAT)

pbio.3003600.s010.mat (4.2MB, mat)
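As noted, the S3 and S4 MATLAB files can be opened with nonproprietary tools. A minimal Python sketch using `scipy.io.loadmat` is shown below; it builds a small synthetic stand-in rather than reading the real file, so the dimensions and the variable name `erp` are illustrative assumptions (check the actual variable name after loading, and note that MATLAB v7.3 files would require `h5py` instead):

```python
import numpy as np
from scipy.io import loadmat, savemat

# Synthetic stand-in with the same layout as the shared data:
# subject x condition x channel x time (sizes here are illustrative only).
erp = np.random.randn(5, 4, 32, 100)
savemat("example_erp.mat", {"erp": erp})

# Loading works the same way for the real file (e.g., pbio.3003600.s009.mat),
# assuming a pre-v7.3 MATLAB format.
data = loadmat("example_erp.mat")["erp"]

# Conditions 1-4 in the description map to Python indices 0-3:
# 0 = Low Sp (real), 1 = High Sp (real), 2 = Low Sp (shuffled), 3 = High Sp (shuffled)
high_sp_real = data[:, 1, :, :]        # all subjects, one condition
grand_avg = high_sp_real.mean(axis=0)  # grand-average ERP: channel x time
print(data.shape, grand_avg.shape)
```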
S5 Data. Excel file containing the summary statistics (mean and variance of the amplitude envelope across frequency bands) of stimuli shown in S1 Fig.

(XLSX)

pbio.3003600.s011.xlsx (62.6KB, xlsx)
S6 Data. Excel file containing the numerical values underlying S2 Fig.

The Δr values (full minus reduced model) for each of the 5 plots, corresponding to 5 different reduced models, are stored in 5 separate spreadsheets containing values for each subject × condition (real/shuffled) × electrode.

(XLSX)

pbio.3003600.s012.xlsx (416.8KB, xlsx)
S7 Data. Excel file containing the numerical data values for S3A Fig.

Surprise associated with timing, length of the preceding IOI, and lengths of the subsequent IOIs for each note of all melodies.

(XLSX)

pbio.3003600.s013.xlsx (315.6KB, xlsx)
S8 Data. Excel file containing the numerical data values for S4 Fig.

Difference of EEG prediction accuracy for each infant between the full and the reduced model (Δr values), assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep) obtained from an enculturated IDyOM model trained on an additional music corpus.

(XLSX)

pbio.3003600.s014.xlsx (28.2KB, xlsx)
S9 Data. Excel file containing the numerical data values for S5 Fig, with the full model additionally containing the envelope and its derivative.

Sheet 1: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep). Sheet 2: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (St and Et) and pitch-related (Sp and Ep) high-level features. Sheet 3: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (IOI) and pitch-related (IPI) low-level features.

(XLSX)

pbio.3003600.s015.xlsx (139.1KB, xlsx)

Acknowledgments

We thank the Brain and Machines Flagship Programme of the Italian Institute of Technology (https://www.iit.it/our-research) for its support.

Abbreviations

ASR

Artefact Subspace Reconstruction

BERA

Brainstem Evoked Response Audiometry

CI

confidence intervals

EEG

electroencephalography

ERPs

event-related potentials

ICA

independent component analysis

IDyOM

information dynamics of music

IOI

inter-onset-interval

IPI

inter-pitch-interval

LMMs

linear mixed-effects models

mTRF

multivariate temporal response function

SE

standard errors

Data Availability

All data underlying the findings described in this manuscript are fully available without restriction. The EEG data and analysis code are publicly available from the Open Science Framework (OSF) at https://doi.org/10.17605/OSF.IO/K758D. The EEG data are shared in accordance with the Continuous-event Neural Data (CND) format standard. The corresponding musical stimuli are available in the same repository under the STIMULI folder. All data used to generate the figures are included as Supporting Information files (S1–S9 Data).

Funding Statement

R.B. is funded by the European Union (MSCA, PHYLOMUSIC, 101064334, https://marie-sklodowska-curie-actions.ec.europa.eu/). G.N. and F.B. are funded by the European Research Council (ERC, MUSICOM, 948186, https://erc.europa.eu/homepage). T.N. is funded by the European Union (MSCA, SYNCON, 101105726, https://marie-sklodowska-curie-actions.ec.europa.eu/). B.T., G.P.H., and I.W. are funded by the Hungarian National Research Development and Innovation Office (ANN131305, FK139135, and K147135, respectively, https://nkfih.gov.hu/english-nkfih). The funders did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Singh M, Mehr SA. Universality, domain-specificity, and development of psychological responses to music. Nat Rev Psychol. 2023;2(6):333–46. doi: 10.1038/s44159-023-00182-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.McMullen E, Saffran JR. Music and language: a developmental comparison. Music Perception. 2004;21(3):289–311. doi: 10.1525/mp.2004.21.3.289 [DOI] [Google Scholar]
  • 3.Trehub SE, Hannon EE. Infant music perception: domain-general or domain-specific mechanisms? Cognition. 2006;100(1):73–99. doi: 10.1016/j.cognition.2005.11.006 [DOI] [PubMed] [Google Scholar]
  • 4.Webb AR, Heller HT, Benson CB, Lahav A. Mother’s voice and heartbeat sounds elicit auditory plasticity in the human brain before full gestation. Proc Natl Acad Sci U S A. 2015;112(10):3152–7. doi: 10.1073/pnas.1414924112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Winkler I, Denham SL, Nelken I. Modeling the auditory scene: predictive regularity representations and perceptual objects. Trends Cogn Sci. 2009;13(12):532–40. doi: 10.1016/j.tics.2009.09.003 [DOI] [PubMed] [Google Scholar]
  • 6.Dehaene S, Meyniel F, Wacongne C, Wang L, Pallier C. The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees. Neuron. 2015;88(1):2–19. doi: 10.1016/j.neuron.2015.09.019 [DOI] [PubMed] [Google Scholar]
  • 7.Tillmann B. Music and language perception: expectations, structural integration, and cognitive sequencing. Top Cogn Sci. 2012;4(4):568–84. doi: 10.1111/j.1756-8765.2012.01209.x [DOI] [PubMed] [Google Scholar]
  • 8.Vuust P, Heggli OA, Friston KJ, Kringelbach ML. Music in the brain. Nat Rev Neurosci. 2022;23(5):287–305. doi: 10.1038/s41583-022-00578-5 [DOI] [PubMed] [Google Scholar]
  • 9.Koelsch S, Vuust P, Friston K. Predictive processes and the peculiar case of music. Trends Cogn Sci. 2019;23(1):63–77. doi: 10.1016/j.tics.2018.10.006 [DOI] [PubMed] [Google Scholar]
  • 10.Woods KJP, McDermott JH. Schema learning for the cocktail party problem. Proc Natl Acad Sci U S A. 2018;115(14):E3313–22. doi: 10.1073/pnas.1801614115 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Kotz SA, Schwartze M. Cortical speech processing unplugged: a timely subcortico-cortical framework. Trends Cogn Sci. 2010;14(9):392–9. doi: 10.1016/j.tics.2010.06.005 [DOI] [PubMed] [Google Scholar]
  • 12.Zatorre RJ, Salimpoor VN. From perception to pleasure: music and its neural substrates. Proc Natl Acad Sci U S A. 2013;110 Suppl 2(Suppl 2):10430–7. doi: 10.1073/pnas.1301228110 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Malloch S, Trevarthen C. Musicality in infancy. In: Communicative musicality: exploring the basis of human companionship. Oxford University Press; 2009. p. 241–62. [Google Scholar]
  • 14.Trehub SE. The developmental origins of musicality. Nat Neurosci. 2003;6(7):669–73. doi: 10.1038/nn1084 [DOI] [PubMed] [Google Scholar]
  • 15.Hannon EE, Trainor LJ. Music acquisition: effects of enculturation and formal training on development. Trends Cogn Sci. 2007;11(11):466–72. doi: 10.1016/j.tics.2007.08.008 [DOI] [PubMed] [Google Scholar]
  • 16.Honing H, ten Cate C, Peretz I, Trehub SE. Without it no music: cognition, biology and evolution of musicality. Philos Trans R Soc Lond B Biol Sci. 2015;370(1664):20140088. doi: 10.1098/rstb.2014.0088 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Hauser MD, McDermott J. The evolution of the music faculty: a comparative perspective. Nat Neurosci. 2003;6(7):663–8. doi: 10.1038/nn1080 [DOI] [PubMed] [Google Scholar]
  • 18.Trainor LJ. The origins of music in auditory scene analysis and the roles of evolution and culture in musical creation. Philos Trans R Soc Lond B Biol Sci. 2015;370(1664):20140089. doi: 10.1098/rstb.2014.0089 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Bianco R, Zuk NJ, Bigand F, Quarta E, Grasso S, Arnese F, et al. Neural encoding of musical expectations in a non-human primate. Curr Biol. 2024;34(2):444-450.e5. doi: 10.1016/j.cub.2023.12.019 [DOI] [PubMed] [Google Scholar]
  • 20.Hattori Y, Tomonaga M. Rhythmic swaying induced by sound in chimpanzees (Pan troglodytes). Proc Natl Acad Sci U S A. 2020;117(2):936–42. doi: 10.1073/pnas.1910318116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Honing H, Bouwer FL, Prado L, Merchant H. Rhesus monkeys (Macaca mulatta) sense isochrony in rhythm, but not the beat: additional support for the gradual audiomotor evolution hypothesis. Front Neurosci. 2018;12:475. doi: 10.3389/fnins.2018.00475 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Selezneva E, Deike S, Knyazeva S, Scheich H, Brechmann A, Brosch M. Rhythm sensitivity in macaque monkeys. Front Syst Neurosci. 2013;7:49. doi: 10.3389/fnsys.2013.00049 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Merchant H, Grahn J, Trainor L, Rohrmeier M, Fitch WT. Finding the beat: a neural perspective across humans and non-human primates. Philos Trans R Soc Lond B Biol Sci. 2015;370(1664):20140093. doi: 10.1098/rstb.2014.0093 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Rajendran VG, Prado L, Marquez JP, Merchant H. Monkeys have rhythm. Science. 2025;390(6776):940–4. doi: 10.1126/science.adp5220 [DOI] [PubMed] [Google Scholar]
  • 25.Brosch M, Oshurkova E, Bucks C, Scheich H. Influence of tone duration and intertone interval on the discrimination of frequency contours in a macaque monkey. Neurosci Lett. 2006;406(1–2):97–101. doi: 10.1016/j.neulet.2006.07.021 [DOI] [PubMed] [Google Scholar]
  • 26.Wright AA, Rivera JJ, Hulse SH, Shyan M, Neiworth JJ. Music perception and octave generalization in rhesus monkeys. J Exp Psychol Gen. 2000;129(3):291–307. doi: 10.1037//0096-3445.129.3.291 [DOI] [PubMed] [Google Scholar]
  • 27.Norman-Haignere SV, Kanwisher N, McDermott JH, Conway BR. Divergence in the functional organization of human and macaque auditory cortex revealed by fMRI responses to harmonic tones. Nat Neurosci. 2019;22(7):1057–60. doi: 10.1038/s41593-019-0410-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Mehr SA, Singh M, Knox D, Ketter DM, Pickens-Jones D, Atwood S, et al. Universality and diversity in human song. Science. 2019;366(6468):eaax0868. doi: 10.1126/science.aax0868 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Savage PE, Brown S, Sakai E, Currie TE. Statistical universals reveal the structures and functions of human music. Proc Natl Acad Sci U S A. 2015;112(29):8987–92. doi: 10.1073/pnas.1414495112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Alho K, Sainio K, Sajaniemi N, Reinikainen K, Näätänen R. Event-related brain potential of human newborns to pitch change of an acoustic stimulus. Electroencephalogr Clin Neurophysiol. 1990;77(2):151–5. doi: 10.1016/0168-5597(90)90031-8 [DOI] [PubMed] [Google Scholar]
  • 31.Kujala T, Partanen E, Virtala P, Winkler I. Prerequisites of language acquisition in the newborn brain. Trends Neurosci. 2023;46(9):726–37. doi: 10.1016/j.tins.2023.05.011 [DOI] [PubMed] [Google Scholar]
  • 32.James DK. Fetal learning: a critical review. Infant Child Dev. 2010;19(1):45–54. doi: 10.1002/icd.653 [DOI] [Google Scholar]
  • 33.Giordano V, Alexopoulos J, Spagna A, Benavides-Varela S, Peganc K, Kothgassner OD, et al. Accent discrimination abilities during the first days of life: an fNIRS study. Brain Lang. 2021;223:105039. doi: 10.1016/j.bandl.2021.105039 [DOI] [PubMed] [Google Scholar]
  • 34.Moon C, Cooper RP, Fifer WP. Two-day-olds prefer their native language. Infant Behav Dev. 1993;16(4):495–500. doi: 10.1016/0163-6383(93)80007-u [DOI] [Google Scholar]
  • 35.Keller PE. What movement force reveals about cognitive processes in music performance. Art in motion II. Frankfurt: Peter Lang. 2012. p. 115–53. [Google Scholar]
  • 36.Lang A, Ott P, Del Giudice R, Schabus M. Memory traces formed in utero—newborns’ autonomic and neuronal responses to prenatal stimuli and the maternal voice. Brain Sci. 2020;10(11):837. doi: 10.3390/brainsci10110837 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Kisilevsky BS, Hains SMJ. Exploring the relationship between fetal heart rate and cognition. Infant Child Dev. 2010;19(1):60–75. doi: 10.1002/icd.655 [DOI] [Google Scholar]
  • 38.Kisilevsky S, Hains SMJ, Jacquet AY, Granier-Deferre C, Lecanuet JP. Maturation of fetal responses to music. Dev Sci. 2004;7(5):550–9. doi: 10.1111/j.1467-7687.2004.00379.x [DOI] [PubMed] [Google Scholar]
  • 39.Cabrera L, Gervain J. Speech perception at birth: the brain encodes fast and slow temporal information. Sci Adv. 2020;6(30):eaba7830. doi: 10.1126/sciadv.aba7830 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Telkemeyer S, Rossi S, Koch SP, Nierhaus T, Steinbrink J, Poeppel D, et al. Sensitivity of newborn auditory cortex to the temporal structure of sounds. J Neurosci. 2009;29(47):14726–33. doi: 10.1523/JNEUROSCI.1246-09.2009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Háden GP, Honing H, Török M, Winkler I. Detecting the temporal structure of sound sequences in newborn infants. Int J Psychophysiol. 2015;96(1):23–8. doi: 10.1016/j.ijpsycho.2015.02.024 [DOI] [PubMed] [Google Scholar]
  • 42.Edalati M, Wallois F, Safaie J, Ghostine G, Kongolo G, Trainor LJ, et al. Rhythm in the premature neonate brain: very early processing of auditory beat and meter. J Neurosci. 2023;43(15):2794–802. doi: 10.1523/JNEUROSCI.1100-22.2023 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Winkler I, Háden GP, Ladinig O, Sziller I, Honing H. Newborn infants detect the beat in music. Proc Natl Acad Sci U S A. 2009;106(7):2468–71. doi: 10.1073/pnas.0809035106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Lordier L, Meskaldji D-E, Grouiller F, Pittet MP, Vollenweider A, Vasung L, et al. Music in premature infants enhances high-level cognitive brain networks. Proc Natl Acad Sci U S A. 2019;116(24):12103–8. doi: 10.1073/pnas.1817536116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.François C, Teixidó M, Takerkart S, Agut T, Bosch L, Rodriguez-Fornells A. Enhanced neonatal brain responses to sung streams predict vocabulary outcomes by age 18 months. Sci Rep. 2017;7(1):12451. doi: 10.1038/s41598-017-12798-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Háden GP, Bouwer FL, Honing H, Winkler I. Beat processing in newborn infants cannot be explained by statistical learning based on transition probabilities. Cognition. 2024;243:105670. doi: 10.1016/j.cognition.2023.105670 [DOI] [PubMed] [Google Scholar]
  • 47.Háden GP, Stefanics G, Vestergaard MD, Denham SL, Sziller I, Winkler I. Timbre-independent extraction of pitch in newborn infants. Psychophysiology. 2009;46(1):69–74. doi: 10.1111/j.1469-8986.2008.00749.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Tóth B, Velősy PK, Kovács P, Háden GP, Polver S, Sziller I, et al. Auditory learning of recurrent tone sequences is present in the newborn’s brain. Neuroimage. 2023;281:120384. doi: 10.1016/j.neuroimage.2023.120384 [DOI] [PubMed] [Google Scholar]
  • 49.Stefanics G, Háden GP, Sziller I, Balázs L, Beke A, Winkler I. Newborn infants process pitch intervals. Clin Neurophysiol. 2009;120(2):304–8. doi: 10.1016/j.clinph.2008.11.020 [DOI] [PubMed] [Google Scholar]
  • 50.Carral V, Huotilainen M, Ruusuvirta T, Fellman V, Näätänen R, Escera C. A kind of auditory “primitive intelligence” already present at birth. Eur J Neurosci. 2005;21(11):3201–4. doi: 10.1111/j.1460-9568.2005.04144.x [DOI] [PubMed] [Google Scholar]
  • 51.Di Liberto GM, Pelofi C, Bianco R, Patel P, Mehta AD, Herrero JL, et al. Cortical encoding of melodic expectations in human temporal cortex. Elife. 2020;9:e51784. doi: 10.7554/eLife.51784 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Kern P, Heilbron M, de Lange FP, Spaak E. Cortical activity during naturalistic music listening reflects short-range predictions based on long-term experience. Elife. 2022;11:e80935. doi: 10.7554/eLife.80935 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Crosse MJ, Di Liberto GM, Bednar A, Lalor EC. The multivariate temporal response function (mtrf) toolbox: a matlab toolbox for relating neural signals to continuous stimuli. Front Hum Neurosci. 2016;10:604. doi: 10.3389/fnhum.2016.00604 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Lalor EC, Power AJ, Reilly RB, Foxe JJ. Resolving precise temporal processing properties of the auditory system using continuous stimuli. J Neurophysiol. 2009;102(1):349–59. doi: 10.1152/jn.90896.2008 [DOI] [PubMed] [Google Scholar]
  • 55.Pearce MTT. Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation. Ann N Y Acad Sci. 2018;1423: 378–95. doi: 10.1111/nyas.13654 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Robert AP, Cang MPV, Mercier M, Trébuchon A. PolyRNN: a time-resolved model of polyphonic musical expectations aligned with human brain responses. bioRxiv [Preprint]. 2025. [Google Scholar]
  • 57.Saadatmehr B, Edalati M, Wallois F, Ghostine G, Kongolo G, Flaten E. Auditory rhythm encoding during the last trimester of human gestation: from tracking the basic beat to tracking hierarchical nested temporal structures. J Neurosci. 2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Kushnerenko E, Ceponiene R, Balan P, Fellman V, Huotilaine M, Näätäne R. Maturation of the auditory event-related potentials during the first year of life. Neuroreport. 2002;13(1):47–51. doi: 10.1097/00001756-200201210-00014 [DOI] [PubMed] [Google Scholar]
  • 59.Bigand F, Bianco R, Abalde SF, Nguyen T, Novembre G. EEG of the dancing brain: decoding sensory, motor, and social processes during dyadic dance. J Neurosci. 2025;45(21):e2372242025. doi: 10.1523/JNEUROSCI.2372-24.2025 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Jessen S, Obleser J, Tune S. Neural tracking in infants–an analytical tool for multisensory social processing in development. Dev Cogn Neurosci. 2021;52:101034. doi: 10.1016/j.dcn.2021.101034 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Kushnerenko E, Winkler I, Horváth J, Näätänen R, Pavlov I, Fellman V, et al. Processing acoustic change and novelty in newborn infants. Eur J Neurosci. 2007;26(1):265–74. doi: 10.1111/j.1460-9568.2007.05628.x [DOI] [PubMed] [Google Scholar]
  • 62.Conway CM, Pisoni DB. Neurocognitive basis of implicit learning of sequential structure and its relation to language processing. Ann N Y Acad Sci. 2008;1145:113–31. doi: 10.1196/annals.1416.009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Schröger E, Roeber U, Coy N. Markov chains as a proxy for the predictive memory representations underlying mismatch negativity. Front Hum Neurosci. 2023;17:1249413. doi: 10.3389/fnhum.2023.1249413 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Yon D, Frith CD. Precision and the Bayesian brain. Curr Biol. 2021;31(17):R1026–32. doi: 10.1016/j.cub.2021.07.044 [DOI] [PubMed] [Google Scholar]
  • 65.Heilbron M, Chait M. Great expectations: is there evidence for predictive coding in auditory cortex? Neuroscience. 2018;389:54–73. doi: 10.1016/j.neuroscience.2017.07.061 [DOI] [PubMed] [Google Scholar]
  • 66.Lenc T, Peter V, Hooper C, Keller PE, Burnham D, Nozaradan S. Infants show enhanced neural responses to musical meter frequencies beyond low-level features. Dev Sci. 2023;26(5):e13353. doi: 10.1111/desc.13353 [DOI] [PubMed] [Google Scholar]
  • 67.Ravignani A. Isochrony, vocal learning, and the acquisition of rhythm and melody. Behav Brain Sci. 2021;44:e88. doi: 10.1017/S0140525X20001478 [DOI] [PubMed] [Google Scholar]
  • 68.Phillips-Silver J, Trainor LJ. Feeling the beat: movement influences infant rhythm perception. Science. 2005;308(5727):1430. doi: 10.1126/science.1110922 [DOI] [PubMed] [Google Scholar]
  • 69.Trainor LJ, Gao X, Lei J, Lehtovaara K, Harris LR. The primal role of the vestibular system in determining musical rhythm. Cortex. 2009;45(1):35–43. doi: 10.1016/j.cortex.2007.10.014 [DOI] [PubMed] [Google Scholar]
  • 70.Larsson M, Richter J, Ravignani A. Bipedal steps in the development of rhythmic behavior in humans. Music Sci. 2019;2. doi: 10.1177/2059204319892617 [DOI] [Google Scholar]
  • 71.Ullal-Gupta S, Vanden Bosch der Nederlanden CM, Tichko P, Lahav A, Hannon EE. Linking prenatal experience to the emerging musical mind. Front Syst Neurosci. 2013;7:48. doi: 10.3389/fnsys.2013.00048 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Santolin C, Saffran JR. Constraints on statistical learning across species. Trends Cogn Sci. 2018;22(1):52–63. doi: 10.1016/j.tics.2017.10.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Fló A, Benjamin L, Palu M, Dehaene-Lambertz G. Statistical learning beyond words in human neonates. Elife. 2025;13:RP101802. doi: 10.7554/eLife.101802 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Suppanen E, Winkler I, Kujala T, Ylinen S. More efficient formation of longer-term representations for word forms at birth can be linked to better language skills at 2 years. Dev Cogn Neurosci. 2022;55:101113. doi: 10.1016/j.dcn.2022.101113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Jones MR, Boltz M. Dynamic attending and responses to time. Psychol Rev. 1989;96(3):459–91. doi: 10.1037/0033-295x.96.3.459 [DOI] [PubMed] [Google Scholar]
  • 76.Bobin-Bègue A, Provasi J, Marks A, Pouthas V. Influence of auditory tempo on the endogenous rhythm of non-nutritive sucking. Rev Eur Psychol Appl. 2006;56: 239–45. doi: 10.1016/j.erap.2005.09.006 [DOI] [Google Scholar]
  • 77.Barnard KE, Bee HL. The impact of temporally patterned stimulation on the development of preterm infants. Child Development. 1983;54(5):1156. doi: 10.2307/1129671 [DOI] [PubMed] [Google Scholar]
  • 78.Zimmerman E, Barlow SM. The effects of vestibular stimulation rate and magnitude of acceleration on central pattern generation for chest wall kinematics in preterm infants. J Perinatol. 2012;32(8):614–20. doi: 10.1038/jp.2011.177 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Granier-Deferre C, Ribeiro A, Jacquet A-Y, Bassereau S. Near-term fetuses process temporal features of speech. Dev Sci. 2011;14(2):336–52. doi: 10.1111/j.1467-7687.2010.00978.x [DOI] [PubMed] [Google Scholar]
  • 80.Novitski N, Huotilainen M, Tervaniemi M, Näätänen R, Fellman V. Neonatal frequency discrimination in 250-4000-Hz range: electrophysiological evidence. Clin Neurophysiol. 2007;118(2):412–9. doi: 10.1016/j.clinph.2006.10.008 [DOI] [PubMed] [Google Scholar]
  • 81.Smith NA, Trainor LJ, Shore DI. The development of temporal resolution: between-channel gap detection in infants and adults. J Speech Lang Hear Res. 2006;49(5):1104–13. doi: 10.1044/1092-4388(2006/079) [DOI] [PubMed] [Google Scholar]
  • 82.Loukas S, Lordier L, Meskaldji D-E, Filippa M, Sa de Almeida J, Van De Ville D, et al. Musical memories in newborns: a resting-state functional connectivity study. Hum Brain Mapp. 2022;43(2):647–64. doi: 10.1002/hbm.25677 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Auksztulewicz R, Schwiedrzik CM, Thesen T, Doyle W, Devinsky O, Nobre AC, et al. Not all predictions are equal: “What” and “when” predictions modulate activity in auditory cortex through different mechanisms. J Neurosci. 2018;38(40):8680–93. doi: 10.1523/JNEUROSCI.0369-18.2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Musacchia G, Large EW, Schroeder CE. Thalamocortical mechanisms for integrating musical tone and rhythm. Hear Res. 2014;308:50–9. doi: 10.1016/j.heares.2013.09.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Cappotto D, Luo D, Lai HW, Peng F, Melloni L, Schnupp JWH, et al. “What” and “when” predictions modulate auditory processing in a mutually congruent manner. Front Neurosci. 2023;17:1180066. doi: 10.3389/fnins.2023.1180066 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Hoedlmoser K, Peigneux P, Rauchs G. Recent advances in memory consolidation and information processing during sleep. J Sleep Res. 2022;31(4):e13607. doi: 10.1111/jsr.13607 [DOI] [PubMed] [Google Scholar]
  • 87.Sifuentes-Ortega R, Lenc T, Nozaradan S, Peigneux P. Partially preserved processing of musical rhythms in REM but not in NREM sleep. Cereb Cortex. 2022;32(7):1508–19. doi: 10.1093/cercor/bhab303 [DOI] [PubMed] [Google Scholar]
  • 88.Makov S, Sharon O, Ding N, Ben-Shachar M, Nir Y, Zion Golumbic E. Sleep disrupts high-level speech parsing despite significant basic auditory processing. J Neurosci. 2017;37(32):7772–81. doi: 10.1523/JNEUROSCI.0168-17.2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Kushnerenko EV, Van den Bergh BRH, Winkler I. Separating acoustic deviance from novelty during the first year of life: a review of event-related potential evidence. Front Psychol. 2013;4:595. doi: 10.3389/fpsyg.2013.00595 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Nguyen T, Bigand F, Reisner S, Koul A, Bianco R, Markova G. Development of auditory and spontaneous movement responses to music over the first year of life. bioRxiv [Preprint]. 2025. doi: 10.1101/2025.05.04.649695 [Google Scholar]
  • 91.Liégeois-Chauvel C, Musolino A, Badier JM, Marquis P, Chauvel P. Evoked potentials recorded from the auditory cortex in man: evaluation and topography of the middle latency components. Electroencephalogr Clin Neurophysiol. 1994;92(3):204–14. doi: 10.1016/0168-5597(94)90064-7 [DOI] [PubMed] [Google Scholar]
  • 92.Dürschmid S, Edwards E, Reichert C, Dewar C, Hinrichs H, Heinze H-J, et al. Hierarchy of prediction errors for auditory events in human temporal and frontal cortex. Proc Natl Acad Sci U S A. 2016;113(24):6755–60. doi: 10.1073/pnas.1525030113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Mouraux A, Diukova A, Lee MC, Wise RG, Iannetti GD. A multisensory investigation of the functional significance of the “pain matrix”. Neuroimage. 2011;54(3):2237–49. doi: 10.1016/j.neuroimage.2010.09.084 [DOI] [PubMed] [Google Scholar]
  • 94.Kotz SA, Ravignani A, Fitch WT. The evolution of rhythm processing. Trends Cogn Sci. 2018;22(10):896–910. doi: 10.1016/j.tics.2018.08.002 [DOI] [PubMed] [Google Scholar]
  • 95.Rimmele JM, Morillon B, Poeppel D, Arnal LH. Proactive sensing of periodic and aperiodic auditory patterns. Trends Cogn Sci. 2018;22(10):870–82. doi: 10.1016/j.tics.2018.08.003 [DOI] [PubMed] [Google Scholar]
  • 96.Pearce MT. The construction and evaluation of statistical models of melodic structure in music perception and composition. 2005. Available from: http://webprojects.eecs.qmul.ac.uk/marcusp/papers/Pearce2005.pdf
  • 97.Oostenveld R, Fries P, Maris E, Schoffelen J-M. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intell Neurosci. 2011;2011:156869. doi: 10.1155/2011/156869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Plechawska-Wojcik M, Kaczorowska M, Zapala D. The artifact subspace reconstruction (ASR) for EEG signal correction: a comparative study. Adv Intell Syst Comput. Springer International Publishing; 2018. p. 125–35. doi: 10.1007/978-3-319-99996-8_12 [DOI] [Google Scholar]
  • 99.Quiroga-Martinez DR, Hansen NC, Højlund A, Pearce M, Brattico E, Vuust P. Decomposing neural responses to melodic surprise in musicians and non-musicians: evidence for a hierarchy of predictions in the auditory system. Neuroimage. 2020;215:116816. doi: 10.1016/j.neuroimage.2020.116816 [DOI] [PubMed] [Google Scholar]
  • 100.Norman-Haignere SV, McDermott JH. Neural responses to natural and model-matched stimuli reveal distinct computations in primary and nonprimary auditory cortex. PLoS Biol. 2018;16(12):e2005127. doi: 10.1371/journal.pbio.2005127 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Maris E, Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods. 2007;164(1):177–90. doi: 10.1016/j.jneumeth.2007.03.024 [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Christian Schnell, PhD

11 Mar 2025

Dear Dr Bianco,

Thank you for submitting your manuscript entitled "Human newborns form musical predictions based on rhythmic but not melodic structure" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff as well as by an academic editor with relevant expertise and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. After your manuscript has passed the checks it will be sent out for review. To provide the metadata for your submission, please log in to Editorial Manager (https://www.editorialmanager.com/pbiology) within two working days, i.e. by Mar 13 2025 11:59PM.

If your manuscript has been previously peer-reviewed at another journal, PLOS Biology is willing to work with those reviews in order to avoid re-starting the process. Submission of the previous reviews is entirely optional and our ability to use them effectively will depend on the willingness of the previous journal to confirm the content of the reports and share the reviewer identities. Please note that we reserve the right to invite additional reviewers if we consider that additional/independent reviewers are needed, although we aim to avoid this as far as possible. In our experience, working with previous reviews does save time.

If you would like us to consider previous reviewer reports, please edit your cover letter to let us know and include the name of the journal where the work was previously considered and the manuscript ID it was given. In addition, please upload a response to the reviews as a 'Prior Peer Review' file type, which should include the reports in full and a point-by-point reply detailing how you have or plan to address the reviewers' concerns.

During the process of completing your manuscript submission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

Decision Letter 1

Christian Schnell, PhD

24 Apr 2025

Dear Dr Bianco,

Thank you for your patience while your manuscript "Human newborns form musical predictions based on rhythmic but not melodic structure" was peer-reviewed at PLOS Biology. It has now been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by several independent reviewers.

In light of the reviews, which you will find at the end of this email, we would like to invite you to revise the work to thoroughly address the reviewers' reports.

As you will see below, the reviewers think that the study is overall well executed and provides important insights. Their main shared concerns relate to the interpretation of the results (for example, the comparisons between species and with human adults) as well as to the model.

Given the extent of revision needed, we cannot make a decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is likely to be sent for further evaluation by all or a subset of the reviewers.

In addition to these revisions, you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests shortly.

We expect to receive your revised manuscript within 3 months. Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension.

At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may withdraw it.

**IMPORTANT - SUBMITTING YOUR REVISION**

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point-by-point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Revised Article with Changes Highlighted" file type.

*Re-submission Checklist*

When you are ready to resubmit your revised manuscript, please refer to this re-submission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

*Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

*Blot and Gel Data Policy*

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

------------------------------------

REVIEWS:

Reviewer #1: This manuscript addresses an important and timely question in developmental neuroscience and music cognition: whether the human ability to anticipate musical structure is already present at birth. The authors use temporal response functions (TRFs) to probe rhythmic and melodic encoding in newborns exposed to real versus temporally shuffled musical stimuli.

The study is built on a particularly rich and valuable dataset — EEG recordings from 49 newborns, an impressive achievement in itself. The main finding — that newborns encode probabilistic rhythmic regularities in real music, but not melodic ones — is highly interesting. It provides compelling evidence that rhythm-based statistical learning is already functional at birth.

We have some concerns regarding the channel selection procedure, which should be reworked.

Moreover, the interpretative sections of the manuscript, especially the conclusion, do not fully do justice to the depth of the results. The discussion often gets sidetracked, particularly by extended considerations of sleep state, while the core implications — the precedence of rhythmic over melodic processing — are underdeveloped. Several rich interpretive avenues (e.g., prenatal experience, cognitive cost of temporal vs. spectral prediction, the nature of the "structured context") are left largely unexplored.

Finally, we are also unconvinced by the relevance of the comparison with non-human primates in this context. While it supports the analytical framework, it does not substantially advance a phylogenetic interpretation — which, in any case, does not seem to be the central focus of the study.

Major Comments

1. Channel selection procedure:

While it is understandable that neonatal EEG signals are inherently noisy, the method used for channel selection—choosing the top 25% based on correlation—appears suboptimal. In particular, it introduces a form of double-dipping in the context of the linear mixed-effects model applied to delta-r (> 0). Indeed, currently the channels are selected based on the highest r-values in the full model, thus the following statistical test comparing full vs. reduced models is not valid. A cluster-based permutation approach could offer a more statistically grounded way to assess whether models including high-level features outperform controls. Alternatively, permuting the regressors of interest 1000 times (for instance) would allow identification of channels that respond significantly, rather than relying on an arbitrary percentile threshold.
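The regressor-permutation approach suggested here can be sketched on synthetic data. This is a hypothetical illustration, not the authors' pipeline: `regressor` stands in for a feature of interest (e.g., note-level surprise) and `eeg` for one channel's signal; the null distribution is built by permuting only the regressor of interest while keeping the EEG fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-channel example (hypothetical data)
n = 500
regressor = rng.standard_normal(n)               # e.g., note-level surprise values
eeg = 0.5 * regressor + rng.standard_normal(n)   # channel with a true effect

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

r_obs = corr(regressor, eeg)

# Null distribution: permute only the regressor of interest, keep EEG fixed
n_perm = 1000
null = np.array([corr(rng.permutation(regressor), eeg) for _ in range(n_perm)])

# One-sided p-value with the standard add-one correction
p = (np.sum(null >= r_obs) + 1) / (n_perm + 1)
```

Applied per channel, this yields a significance decision for each electrode without an arbitrary percentile cutoff; a cluster-based variant would additionally pool statistics over spatially adjacent channels before permutation.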

2. Data visualization in supplementary material:

It would also be valuable to include, as supplementary material, a 3D visualization of the delta-r values—participants on the Y-axis, channels on the X-axis, and delta-r represented by color—for both conditions (music and shuffled) and both models (high-level, low-level). This would allow for a direct inspection of the TRF results without averaging over channels or participants, helping assess the robustness of the effects.

3. Acoustic model improvements:

The low-level acoustic model should be refined. Including at minimum the amplitude envelope and its half-wave rectified derivative would align better with known auditory encoding mechanisms and better capture music-related acoustic dynamics. These features have been shown to strongly modulate auditory responses in both music and speech and may coincide with tempo information in the stimuli. See, e.g., Di Liberto et al., eLife 2020 or Robert et al., bioRxiv 2024 (see below for references).

4. Rewriting the conclusion:

The current conclusion includes some redundancy and underplays the main theoretical contributions. Rather than focusing so heavily on the impact of sleep state, the authors should more directly address:

o Why rhythm may be more predictable than melody (e.g., prenatal exposure to rhythmic patterns through maternal physiology; possible cognitive prerequisites for pitch processing…). See e.g. Dehaene et al., 2015, Neuron for leads on this topic.

o What exactly distinguishes the "real" from the "shuffled" music, beyond surprise values? Could the authors provide concrete descriptions or statistical measures of this structured context? See, e.g., the work of McDermott.

o Whether newborns' predictions are based on prior exposure (e.g., in utero learning) or fast learning during the experiment. If the IDyOM model is trained on the same material as the infants hear, this distinction is essential and warrants more discussion, possibly including references on short context auditory learning in newborns.

o The study would benefit from a clearer contrast with previous work, particularly regarding the shift from detecting isolated deviants to modeling continuous, event-by-event predictions, and from using simple, sparse stimuli to rich, fast musical input — which may challenge newborns' cognitive processing capacity.

Minor Comments

1. In the introduction, the comparison between newborns and monkeys may confuse the notion of "innate" capacities, particularly as fetuses are already exposed to music and speech. A brief clarification on what "innate" is taken to mean in this context would help.

2. Relatedly, drawing on developmental literature about early temporal processing might provide a more relevant theoretical framework than uniquely the primate comparison.

3. Could the authors provide more interpretation of the correlations between stimulus features? It would help to understand to what extent they are separable in the model.

4. Consider comparing model performance (correlation values) with prior EEG TRF studies (e.g., Di Liberto et al., 2020, eLife; Robert et al., 2024, bioRxiv) to contextualize what constitutes a "reasonable" model accuracy.

5. The term "reduced model" is potentially misleading, as its size is not reduced but its meaningful content is randomized. Consider rewording to avoid confusion.

6. Figure 1D lacks the representation of the full model, making it unclear for readers.

7. Typo in Figure 2B (x10^-3 missing on the right axis).

8. Typo in Figure 2A: label should read "IPI" instead of "ITI".

9. Regarding ERPs: are unexpected pitch and unexpected timing correlated? A short clarification would be helpful.

10. Does musical tempo affect time prediction accuracy? If so, could the authors explore whether infants show a preferred tempo for prediction?

11. Procedure: why are some melodies heard twice? Could this repetition facilitate learning and influence the results?

References:

Di Liberto, G. M., Pelofi, C., Bianco, R., Patel, P., Mehta, A. D., Herrero, J. L., De Cheveigné, A., Shamma, S., & Mesgarani, N. (2020). Cortical encoding of melodic expectations in human temporal cortex. eLife, 9, e51784. https://doi.org/10.7554/eLife.51784

Robert, P., Van Cang, M. P., Mercier, M., Trébuchon, A., Bartolomei, F., Arnal, L. H., ... & Doelling, K. (2024). Multi-stream predictions in human auditory cortex during natural music listening. bioRxiv, 2024-11.

Dehaene, S., Meyniel, F., Wacongne, C., Wang, L., & Pallier, C. (2015). The Neural Representation of Sequences: From Transition Probabilities to Algebraic Patterns and Linguistic Trees. Neuron, 88(1), 2-19. https://doi.org/10.1016/j.neuron.2015.09.019

Reviewer #2: This paper presents results from a novel EEG study with sleeping newborn humans. The babies were presented with normal and shuffled excerpts of monophonic piano versions of Bach melodies. The authors consider both low level features, like inter-pitch-intervals and inter-onset-intervals, as well as "higher level" quantifications of entropy and surprisal (generated from IDyOM) when testing models for predicting infant neural responses using mTRFs. Results suggest that EEG data were accurately predicted using models that included both low and high level features. High level features added significantly to prediction accuracy but only for real and not shuffled melodies. This effect seems especially driven by high-level timing features but perhaps not high-level pitch features. ERP analyses tell a similar story - notes that are higher in temporal surprise generate a larger P1/P2-ish response than notes low in temporal surprise for real but not shuffled sequences (no such effects with pitch surprisal). The authors compare these findings to re-analyzed data from awake adults (hearing the "real" stimuli) and awake rhesus monkeys (hearing the same "real" and "shuffled" stimuli).

Overall, this is a very interesting and worthwhile study that answers some questions about early rhythm perception, raises some new interesting questions about how predictions are made about pitch vs timing in auditory stimuli, and uses compelling and cutting edge analysis techniques to explore these questions. I particularly like the use of real vs shuffled music. The authors present an interesting and thoughtful discussion.

In that discussion, they highlight something I was thinking about as I read the paper - that their comparison to adult and monkey data should be taken with a grain of salt given that the infants were the only group out of the three that were asleep. I agree that this is really important to highlight and appreciate the discussion provided. Is the "unimportance" of pitch here the result of development/experience or related to attentive/inattentive processing? I have to say, if the authors had the means to collect a sample of sleeping adults hearing both the shuffled and real stimuli, that would greatly strengthen the impact of this paper. I also realize that this was already a ton of work and is enough "as is", so if the authors choose to suggest running sleeping adults through this paradigm as a future direction (which they do), they may need to tone down some of their conclusions - for example, the entire final paragraph of the discussion.

In the introduction, it would be helpful to spend a bit more time explaining probabilistic regularities of melodies. I'm having a hard time understanding how we could possibly make these predictions about pitch without exposure to a musical system. Is the point that we see lower surprisal for pitch if the same kinds of intervals happen again and again within a sequence? This must take at least a bit of time to establish. It would be helpful to clarify this for those who may be unfamiliar with IDyOM/music theory, and it would be very helpful to link readers to your specific set of stimuli (that link to the Bach website was quite a rabbit hole). There was some discussion in the paper about how the IDyOM model using only this small set of stimuli came up with similar numbers compared to a model using a large corpus. But even still, the first time infants hear a few of these melodies, they can't be expected to already have accrued enough information about pitch regularities to make useful predictions. Melodic pitch information is mostly filtered out in the womb, much more so than timing information (something that could be discussed), so even papers showing musical memory from womb to birth likely rely on timing information. Anyway, all that to say, did you consider exploring whether infants begin to rely more on higher-level pitch patterns in the last few trials than in the first few trials?

Real and shuffled stimuli differ broadly in entropy and surprisal. Is there any argument to be made for exploring how entropy and surprisal (regardless of shuffled/real categorization or within shuffled/real categories) relates to mTRF prediction accuracy? What is the value of considering surprisal and entropy separately in this paper? Do they tell us different stories? If not, why not pick one? If yes, the difference and value of presenting both are unclear.

These results raise interesting questions about individual differences - thank you for plotting individual data on the figures. Is it worthwhile exploring whether the babies who rely on higher level features the most for timing do the same for pitch? You mention that gestational age may explain some variability here. Do you have that information? Can you test that prediction? It might also be worth citing François et al., 2017, who report that newborn infant EEG responses to continuous song predict language outcomes in toddlerhood (DOI:10.1038/s41598-017-12798-2).

Minor comments:

- Can you clarify whether the ERP results are from Fz, as shown in the figure? Why was this electrode selected? It is being compared to FCz in adults and monkeys. Is there an argument for presenting the same electrode in these figures across groups?

- It might be nice to add more discussion about primate rhythm perception/production in the intro. I was more familiar with behavioural work showing that primates are not good at predictively tapping, but learned a bit from this paper about newer work showing that beat perception may still be happening in some primates. Elaborating in the intro on how beat perception/production in humans compares to that in non-human primates would be helpful, given the phylogenetic arguments in the discussion.

- This paper appears to argue that the predictive power of the IDyOM outputs suggests that infants are attuned to statistical regularities in musical beats. These statements about statistical learning have me thinking about other work by some of the authors here, specifically Háden's 2024 paper in Cognition suggesting that newborn EEG responses to beats do NOT reflect statistical learning. How do these two pieces of work complement each other (or not)?

- Last paragraph of discussion: Social interactions do not begin at 6 months

- The links to language presented in the end of the discussion are unclear. I'd argue that speech processing usually relies on temporal cues more than pitch cues unless we are thinking about prosody. But for understanding word meaning and basic communication, why would pitch be prioritized?

Reviewer #3 (Anne Kösem): This study utilizes EEG recordings in newborns to examine whether infants can track pitch and rhythmic expectations in musical excerpts. The authors found that encoding of probabilistic rhythmic expectations occurred only in response to real music, as opposed to shuffled music. However, no evidence was observed regarding the tracking of melodic information.

I would like to express my appreciation for the clarity and quality of the manuscript. The study presents a highly relevant topic and is executed with great competence. Furthermore, the methodologies employed are robust and well-founded.

My main inquiry pertains to the interpretation of the results related to rhythmic expectations. I am curious if these findings genuinely support the existence of predictive timing mechanisms as the authors suggest. Alternatively, these results might indicate non-predictive mechanisms, potentially arising from the superposition of evoked responses that arrive at different timings. The authors report a significant effect of inter-onset interval on the mTRF, which, in my view, could signify the impact of the evoked response to the preceding sound—possibly occurring as recently as 100 ms prior. Moreover, it is conceivable that the inter-onset interval of the subsequent tone may also affect the mTRF. If, in real music, the inter-onset interval of the next tone is biased in duration under conditions of high surprise versus low surprise, this could further influence the observed results.

Decision Letter 2

Christian Schnell, PhD

15 Dec 2025

Dear Dr Bianco,

Thank you for your patience while we considered your revised manuscript "Human newborns form musical predictions based on rhythmic but not melodic structure" for publication as a Research Article at PLOS Biology. This revised version of your manuscript has been evaluated by the PLOS Biology editors, the Academic Editor and the original reviewers.

Based on the reviews, we are likely to accept this manuscript for publication, provided you satisfactorily address the remaining points raised by the reviewers. Please also make sure to address the following data and other policy-related requests:

* Please add the links to the funding agencies in the Financial Disclosure statement in the manuscript details.

* Please include the approval/license number of the ethical approval for the experiments.

* DATA POLICY:

You may be aware of the PLOS Data Policy, which requires that all data be made available without restriction: http://journals.plos.org/plosbiology/s/data-availability. For more information, please also see this editorial: http://dx.doi.org/10.1371/journal.pbio.1001797

Note that we do not require all raw data. Rather, we ask that all individual quantitative observations that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it: 2BC, S3A, S4 and S5.

NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

Please also ensure that figure legends in your manuscript include information on where the underlying data can be found, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

* CODE POLICY

Per journal policy, if you have generated any custom code during the course of this investigation, please make it available without restrictions. Please ensure that the code is sufficiently well documented and reusable, and that your Data Statement in the Editorial Manager submission system accurately describes where your code can be found. [IF APPLICABLE: As the code that you have generated to XXX is important to support the conclusions of your manuscript, its deposition is required for acceptance.]

Please note that we cannot accept sole deposition of code in GitHub, as this could be changed after publication. However, you can archive this version of your publicly available GitHub code to Zenodo. Once you do this, it will generate a DOI number, which you will need to provide in the Data Accessibility Statement (you are welcome to also provide the GitHub access information). See the process for doing this here: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

In addition to these revisions, you may need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests shortly. If you do not receive a separate email within a few days, please assume that checks have been completed, and no additional changes are required.

We expect to receive your revised manuscript within two weeks.

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

- a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list

- a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable, if not applicable please do not delete your existing 'Response to Reviewers' file.)

- a track-changes file indicating any changes that you have made to the manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://plos.org/published-peer-review-history/

*Press*

Should you, your institution's press office or the journal office choose to press release your paper, please ensure you have opted out of Early Article Posting on the submission form. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please do not hesitate to contact me should you have any questions.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

cschnell@plos.org

PLOS Biology

------------------------------------------------------------------------

Reviewer remarks:

Reviewer #1 (Benjamin Morillon): We congratulate the authors on this high-quality work.

Reviewer #2: I thank the authors for their careful and clear revisions and responses to reviewer comments. I feel the manuscripts clarity and impact is improved and I have no further comments.

Reviewer #3 (Anne Kösem): The authors have satisfactorily addressed my concern. I suggest including the response to the reviewers that addresses how larger IOIs are associated with relatively more surprising notes (in music) in the main text (for example, in the results section page 6). I believe this is an important point that is only currently indicated in the legend of Figure S3. I am referring to this particular response:

« A linear mixed model predicting the preceding IOI, with factors condition (real/shuffled) and surprise level (high/low), yielded no main effect of condition (χ2(1) = 1.12, p = 0.29), but a significant main effect of surprise level (χ2(1) = 16.89, p < .001), and an interaction of condition and surprise level (χ2(1) = 18.12, p < .001), suggesting that this bias is stronger in real than in shuffled music (Figure S3A, left panel). However, we do not believe that this constitutes an issue for our interpretation of the results because our TRF analyses included the preceding IOI as a regressor, allowing us to separate its unique contribution from that of Musical Surprise. Indeed, the effects attributed to the high-level rhythmic tracking (St and Et model) cannot be explained by low-level timing alone, as the St and Et regressors account for additional EEG variance beyond that explained by the preceding IOI »
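The nested-model comparison quoted above can be illustrated with a simplified fixed-effects sketch on synthetic data (the authors' actual analysis used a linear mixed model with random effects; variable names and effect sizes here are hypothetical). The likelihood-ratio statistic for nested Gaussian linear models reduces to n·ln(RSS_reduced/RSS_full), compared against a χ2 distribution with df equal to the number of dropped parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: preceding IOI modeled from condition and surprise level
n = 400
condition = rng.integers(0, 2, n)   # 0 = shuffled, 1 = real
surprise = rng.integers(0, 2, n)    # 0 = low, 1 = high
ioi = (0.30 + 0.05 * surprise + 0.05 * condition * surprise
       + 0.02 * rng.standard_normal(n))

def rss(X, y):
    # Residual sum of squares of an ordinary least-squares fit
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
X_full = np.column_stack([ones, condition, surprise, condition * surprise])
X_red = np.column_stack([ones, condition, surprise])  # interaction dropped

# Likelihood-ratio test for the interaction (df = 1, critical value 3.84 at p = .05)
chi2 = n * np.log(rss(X_red, ioi) / rss(X_full, ioi))
interaction_significant = chi2 > 3.84
```

In the mixed-model case, the same χ2 comparison is applied to models fitted by maximum likelihood (e.g., via lme4 or statsmodels) rather than to OLS residuals.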

Decision Letter 3

Christian Schnell, PhD

6 Jan 2026

Dear Roberta,

Happy New Year!

Thank you for the submission of your revised Research Article "Human newborns form musical predictions based on rhythmic but not melodic structure" for publication in PLOS Biology and apologies for the delay in getting back to you over the holiday period.

On behalf of my colleagues and the Academic Editor, Mathew Diamond, I am pleased to say that we can in principle accept your manuscript for publication, provided you address any remaining formatting and reporting issues. These will be detailed in an email you should receive within 2-3 business days from our colleagues in the journal operations team; no action is required from you until then. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have completed any requested changes.

Please take a minute to log into Editorial Manager at http://www.editorialmanager.com/pbiology/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have previously opted in to the early version process, we ask that you notify us immediately of any press plans so that we may opt out on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for choosing PLOS Biology for publication and supporting Open Access publishing. We look forward to publishing your study.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Summary statistics (mean and variance of the amplitude envelope across frequency bands) of stimuli.

To extract the envelope associated with each frequency band, we bandpass-filtered the musical stimuli into 128 logarithmically spaced frequency bands ranging from 100 to 8,000 Hz using a gammatone filter bank. We then computed the amplitude envelope of each band as the absolute value of the Hilbert-transformed signal over time. The envelope mean (left) and variance (right) are shown as a function of frequency band. Thin lines represent individual melodies (real and shuffled), while thick lines indicate the average across melodies for real (solid line) and shuffled (dotted line) conditions. Note that the real and shuffled conditions show comparable envelope means and variances. See S5 Data.

    (TIF)

    pbio.3003600.s001.tif (930.2KB, tif)
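The band-envelope computation described in the S1 Fig legend can be sketched as follows. This is a minimal Python illustration, not the authors' analysis code; SciPy's built-in single-band gammatone filter stands in for a full gammatone filter bank, and the test signal, sampling rate, and band count are assumptions for demonstration.

```python
# Sketch: bandpass into log-spaced gammatone bands, then take the
# Hilbert amplitude envelope of each band (as in the S1 Fig legend).
import numpy as np
from scipy.signal import gammatone, lfilter, hilbert

def band_envelopes(x, fs, n_bands=128, fmin=100.0, fmax=8000.0):
    """Return (n_bands, n_samples) amplitude envelopes and center frequencies."""
    center_freqs = np.logspace(np.log10(fmin), np.log10(fmax), n_bands)
    envs = np.empty((n_bands, len(x)))
    for i, cf in enumerate(center_freqs):
        b, a = gammatone(cf, 'iir', fs=fs)   # 4th-order IIR gammatone filter
        band = lfilter(b, a, x)              # bandpass-filtered signal
        envs[i] = np.abs(hilbert(band))      # Hilbert amplitude envelope
    return envs, center_freqs

# Per-band summary statistics, as plotted in S1 Fig (toy 440 Hz tone)
fs = 22050
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
envs, cfs = band_envelopes(tone, fs, n_bands=16)
env_mean, env_var = envs.mean(axis=1), envs.var(axis=1)
```

For a pure tone, the envelope mean peaks in the band whose center frequency is closest to the tone, which is a quick sanity check on the filter bank.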
    S2 Fig. Difference in prediction accuracy (Δr) between real and shuffled conditions across participants and electrodes for each reduced model.

The 2D matrices display electrodes on the Y-axis and participants on the X-axis, with colors coding the difference in Δr values (full minus reduced model) between real and shuffled conditions. Note that positive values (red color coded) indicate a greater contribution of the real compared to the shuffled condition. Near-zero values (white color coded) indicate similar contributions across conditions. (A) Unique contribution of high-level musical features. Between-condition difference in Δr (full minus reduced model, assessing the unique contribution of the high-level musical features St, Et, Sp, and Ep). To facilitate visualization, we display the labels of 16 representative electrodes (out of 63) on the y-axis, along with their corresponding position on the EEG cap (bottom). (B) Unique contribution of timing and pitch-related features. Between-condition difference in Δr (full and reduced models separately assessing the unique contribution of St and Et, Sp and Ep, IOI, and IPI). See S6 Data.

    (TIF)

    pbio.3003600.s002.tif (2.4MB, tif)
    S3 Fig. (A) Length of preceding and subsequent IOIs as a function of rhythmic surprise and condition.

Notes carrying high surprise are often preceded by relatively larger IOIs. A linear mixed model predicting the preceding IOI, with factors condition (real/shuffled) and surprise level (high/low), yielded no main effect of condition (χ²(1) = 1.12, p = 0.29), but a significant main effect of surprise level (χ²(1) = 16.89, p < 0.001) and an interaction of condition and surprise level (χ²(1) = 18.12, p < 0.001). This indicates that larger IOIs generally anticipate notes carrying high surprise, more so in real than in shuffled music (left panel). Conversely, the same analysis predicting the subsequent (rather than preceding) IOI yielded a nearly significant effect of surprise level (χ²(1) = 3.52, p = 0.06) but no main effect of condition (χ²(1) = 0.008, p = 0.90) and no interaction (χ²(1) = 0.70, p = 0.40). This indicates that notes carrying high surprise tend to be followed by larger IOIs, but comparably across real and shuffled music (right panel). See S7 Data. (B) Unique contribution of St and Et is independent of the preceding or subsequent IOI. To distinguish neural tracking of rhythm from spurious modulations of event-related potentials (ERPs) attributable to overlapping (i.e., temporally proximal) neural responses, we re-ran the main analysis, adding the length of the subsequent IOI as a regressor in the mTRF. We thus ran a full model with the following regressors: onset, spectral flux, inter-pitch interval, preceding IOI, subsequent IOI, Sp, Ep, St, and Et. We then computed three reduced models, each randomizing one of the following regressors: (1) St and Et, (2) preceding IOI, and (3) subsequent IOI. The results of this control analysis confirm a unique contribution of the St and Et features to the neural response beyond the contribution of the subsequent and preceding IOI. The plot shows topographical maps representing group-average Δr resulting from the difference between the full and the three reduced models across real and shuffled conditions.

    (TIF)

    pbio.3003600.s003.tif (1.5MB, tif)
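The full-versus-reduced comparison used throughout these analyses (Δr = r of the full model minus r of a model with one regressor randomized) can be illustrated with a toy example. The sketch below uses plain ridge regression on simulated data as a stand-in for the mTRF toolbox and real EEG; all variable names, lags, and data sizes are illustrative assumptions, not the authors' pipeline.

```python
# Sketch: unique contribution of a regressor as Delta r = r(full) - r(reduced),
# where the "reduced" model shuffles that regressor in time.
import numpy as np

rng = np.random.default_rng(0)

def lagged(X, lags):
    """Stack time-lagged copies of each regressor column."""
    n, d = X.shape
    out = np.zeros((n, d * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.roll(X, lag, axis=0)
        shifted[:lag] = 0                      # zero the wrapped-around samples
        out[:, j * d:(j + 1) * d] = shifted
    return out

def trf_r(X, y, lam=1.0, lags=tuple(range(10))):
    """Fit a ridge 'TRF' on the first half, return prediction r on the second."""
    Xl = lagged(X, lags)
    h = len(y) // 2
    w = np.linalg.solve(Xl[:h].T @ Xl[:h] + lam * np.eye(Xl.shape[1]),
                        Xl[:h].T @ y[:h])
    return np.corrcoef(Xl[h:] @ w, y[h:])[0, 1]

# Toy "EEG": driven by regressor 0 (at lag 2) and regressor 1 (at lag 3)
n = 2000
X = rng.standard_normal((n, 3))                # 3 regressors; #2 is irrelevant
y = np.roll(X[:, 0], 2) + 0.7 * np.roll(X[:, 1], 3) + 0.5 * rng.standard_normal(n)

r_full = trf_r(X, y)
X_red = X.copy()
X_red[:, 0] = rng.permutation(X_red[:, 0])     # reduced model: shuffle regressor 0
delta_r = r_full - trf_r(X_red, y)             # unique contribution of regressor 0
```

Because regressor 0 genuinely drives the simulated signal, shuffling it lowers prediction accuracy and yields a positive Δr, mirroring the logic of the control analyses above.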
    S4 Fig. Control analyses.

No effects of IDyOM statistical knowledge on EEG prediction accuracy. We compared the effect of deriving surprise estimates by training IDyOM on either the experimental stimuli alone (left panel) or on the experimental stimuli plus an additional corpus of Western tonal music (right panel). For each panel, we plot the difference in EEG prediction accuracy (Δr) between the full and the reduced models (randomizing St, Et, Sp, and Ep). Dots represent the grand-average mean Δr computed across all channels and melodies (left top panel, with associated topographical maps) for real (red) and shuffled (gray) music. Error bars represent bootstrapped 95% CI. The absence of differences in predicting neural responses between pretrained and non-pretrained model configurations suggests that incorporating pretraining to estimate surprise and entropy values does not enhance the prediction of EEG data. This may be due to the high correlation between the estimates derived from the two IDyOM configurations, leading to similar EEG predictive power. Additionally, it may indicate that Bach’s music contains sufficient rules and statistical regularities, allowing the model to learn these directly from the stimulus set, rendering pretraining on the large music corpus redundant for predicting brain signals. See S8 Data.

    (TIF)

    pbio.3003600.s004.tif (886.9KB, tif)
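The notion of context-based surprise that IDyOM estimates can be illustrated with a toy model. The sketch below uses a simple add-one-smoothed bigram model trained on the sequence itself (IDyOM proper combines variable-order n-gram models); the IOI sequence is a hypothetical example, not one of the stimuli.

```python
# Toy sketch of context-based surprise: surprise(event) = -log2 P(event | previous event),
# with probabilities estimated from the sequence itself (add-alpha smoothing).
import math
from collections import Counter

def bigram_surprise(seq, alpha=1.0):
    """Smoothed bigram surprise (in bits) for each event after the first."""
    vocab = sorted(set(seq))
    pair_counts = Counter(zip(seq, seq[1:]))
    ctx_counts = Counter(seq[:-1])
    surprises = []
    for prev, cur in zip(seq, seq[1:]):
        p = (pair_counts[(prev, cur)] + alpha) / (ctx_counts[prev] + alpha * len(vocab))
        surprises.append(-math.log2(p))
    return surprises

# Regular alternation -> low surprise; a final deviant IOI -> high surprise
ioi_seq = [250, 500, 250, 500, 250, 500, 250, 1000]  # IOIs in ms (hypothetical)
s = bigram_surprise(ioi_seq)
```

The rare transition (250 → 1000 ms) receives more bits of surprise than the frequent one (250 → 500 ms), which is the same asymmetry the St regressor captures at the scale of real melodies.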
    S5 Fig. Control analysis.

Replication of the results reported in Fig 2C, here adding envelope and its half-wave rectified derivative as predictors in the full model. We repeated the analysis using an enriched acoustic model, thus adding envelope and its half-wave rectified derivative to the already used acoustic regressors (onsets, spectral flux, IPI, and IOI). This additional analysis confirms the robustness of our results: the main findings remain overall unchanged, indicating a unique contribution of the high-level music regressors in the real but not the shuffled music condition, specifically driven by the St and Et regressors. See S9 Data.

    (TIF)

    pbio.3003600.s005.tif (1.1MB, tif)
    S1 Table. Characteristics of the experimental stimuli.

    (DOCX)

    pbio.3003600.s006.docx (26KB, docx)
    S1 Data. Excel file containing the numerical data values for Fig 1B and 1C.

Sheet 1: average surprise and entropy values of each note, computed separately for pitch and timing, within each melody. Sheet 2: values used for Pearson’s correlation between the stimulus features of all melodies: inter-pitch interval (IPI), inter-onset interval (IOI), and surprise and entropy associated with timing (St and Et) and pitch (Sp and Ep).

    (XLSX)

    pbio.3003600.s007.xlsx (11.6KB, xlsx)
    S2 Data. Excel file containing the numerical data values for Fig 2B, 2C, and 2D.

Sheet 1: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep). Sheet 2: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (St and Et) and pitch-related (Sp and Ep) high-level features. Sheet 3: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (IOI) and pitch-related (IPI) low-level features.

    (XLSX)

    pbio.3003600.s008.xlsx (138.7KB, xlsx)
    S3 Data. MATLAB file containing the numerical values underlying Fig 3A, representing ERPs evoked by notes with relatively high versus low pitch surprise (Sp) for real and shuffled music, time-locked to note onset.

    The data are stored as a 4D matrix with dimensions subject × condition × channel × time, where conditions 1–4 correspond to Low Sp (real), High Sp (real), Low Sp (shuffled), and High Sp (shuffled). Data can be opened using nonproprietary software such as R or Python.

    (MAT)

    pbio.3003600.s009.mat (4.2MB, mat)
    S4 Data. MATLAB file containing the numerical values underlying Fig 3A, representing ERPs evoked by notes with relatively high versus low timing surprise (St) for real and shuffled music, time-locked to note onset.

    The data are stored as a 4D matrix with dimensions subject × condition × channel × time, where conditions 1–4 correspond to Low St (real), High St (real), Low St (shuffled), and High St (shuffled). Data can be opened using nonproprietary software such as R or Python.

    (MAT)

    pbio.3003600.s010.mat (4.2MB, mat)
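As the S3 and S4 Data legends note, the 4D matrices can be read with open tools such as Python. The sketch below shows the indexing convention; the MATLAB variable name inside the file is not specified here, so a synthetic array of the stated subject × condition × channel × time shape stands in for the loaded data, and the time dimension is an illustrative size.

```python
# Sketch: reading and indexing the 4D ERP matrices from S3/S4 Data.
import numpy as np
from scipy.io import loadmat  # e.g., loadmat('pbio.3003600.s009.mat') returns a dict

n_subj, n_chan, n_time = 49, 63, 200           # 49 newborns, 63 electrodes; time size illustrative
cond_labels = ['Low Sp (real)', 'High Sp (real)',
               'Low Sp (shuffled)', 'High Sp (shuffled)']
# Stand-in for the subject x condition x channel x time matrix in the file
erp = np.zeros((n_subj, len(cond_labels), n_chan, n_time))

# Grand-average ERP (channel x time) for one condition: average over subjects
high_sp_real = erp[:, cond_labels.index('High Sp (real)')].mean(axis=0)
```

For S4 Data the same layout applies with the St (timing surprise) condition labels substituted for the Sp ones.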
    S5 Data. Excel file containing the summary statistics (mean and variance of the amplitude envelope across frequency bands) of stimuli shown in S1 Fig.

    (XLSX)

    pbio.3003600.s011.xlsx (62.6KB, xlsx)
    S6 Data. Excel file containing the numerical values underlying S2 Fig.

The Δr values (full minus reduced model) for each of the 5 plots, corresponding to 5 different reduced models, are stored in 5 different spreadsheets containing values for each subject × condition (real/shuffled) × electrode.

    (XLSX)

    pbio.3003600.s012.xlsx (416.8KB, xlsx)
    S7 Data. Excel file containing the numerical data values for S3A Fig.

Surprise associated with timing, length of the preceding IOI, and length of the subsequent IOI for each note of all melodies.

    (XLSX)

    pbio.3003600.s013.xlsx (315.6KB, xlsx)
    S8 Data. Excel file containing the numerical data values for S4 Fig.

Difference in EEG prediction accuracy between the full and the reduced model (Δr values) for each infant, assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep) obtained from an enculturated IDyOM model trained on an additional music corpus.

    (XLSX)

    pbio.3003600.s014.xlsx (28.2KB, xlsx)
    S9 Data. Excel file containing the numerical data values for S5 Fig, with the full model additionally containing envelope and its derivative.

Sheet 1: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of high-level musical features (St, Et, Sp, and Ep). Sheet 2: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (St and Et) and pitch-related (Sp and Ep) high-level features. Sheet 3: EEG prediction accuracy (r values) for each infant and melody of the full model, as well as the difference between the full and the reduced model (Δr values), assessing the unique contribution of timing (IOI) and pitch-related (IPI) low-level features.

    (XLSX)

    pbio.3003600.s015.xlsx (139.1KB, xlsx)
    Attachment

    Submitted filename: ResponseToReviewers.docx

    pbio.3003600.s018.docx (6.5MB, docx)
    Attachment

    Submitted filename: ResponseToReviewers_auresp_3.docx

    pbio.3003600.s019.docx (33.2KB, docx)

    Data Availability Statement

    All data underlying the findings described in this manuscript are fully available without restriction. The EEG data and analysis code are publicly available from the Open Science Framework (OSF) at https://doi.org/10.17605/OSF.IO/K758D. The EEG data are shared in accordance with the Continuous-event Neural Data (CND) format standard. The corresponding musical stimuli are available in the same repository under the STIMULI folder. All data used to generate the figures are included as Supporting Information files (S1–S9 Data).


