Abstract
Tension is widely regarded as a core principle in the generation of musical emotion and meaning, and it is thought to be induced by prediction during music listening. Using EEG and behavioral ratings, the current research investigated how contextual certainty affects the induction and resolution of musical tension. In the tension induction process, incongruent conditions elicited a larger EN and P600 in the ERP responses than congruent conditions, and both the P600 amplitude and the tension ratings were modulated by contextual certainty. In the tension resolution process, contextual certainty further affected the duration of the P600 and the tension ratings. For the certain conditions, tension ratings were higher, tension curves fluctuated faster, and a larger P600 was evoked in the incongruent condition than in the congruent condition. For the uncertain conditions, there was no congruency effect on behavioral ratings or tension curves, but a larger P600 was elicited in the congruent condition. These results show that contextual certainty affects both tension induction and resolution. Our findings provide a more comprehensive view of how musical prediction affects musical tension.
Keywords: Tension induction, Tension resolution, Contextual certainty, EEG
Introduction
Composers and music psychologists hold that the alternation of tension and relaxation constitutes the core principle of emotion induction (Farbood 2001; Hindemith 1937; Koelsch 2014; Lehne et al. 2013; Lerdahl and Jackendoff 1983; Schenker 1935). That is, the ebb and flow of tension is basic to musical experience and central to music theory.
In the field of music psychology, many models describe the cognitive mechanisms and neural bases of tension generation by revealing the close relationship between musical tension and prediction. Early models sought to explain the relationship between music and tension on the basis of Gestalt principles (Margulis 2005; Meyer 1956; Narmour 1990), and later models focused on the perceptual and cognitive processes underlying musical tension induction and resolution (Huron 2006; Farbood 2012; Lehne and Koelsch 2015). All these models emphasize that the effects of prediction on musical tension are reflected in both the induction and the resolution processes. In the induction process, the perception of an uncertain, conflicting, or unstable musical event triggers future-oriented predictive processes and creates a space of possible outcome events. The experience of tension is elicited by the divergence between the affective values of the predicted events (their desirability), and an unpredicted outcome can produce prediction errors and further increase tension. In the resolution process, predictions are updated according to the prediction errors and the contextual certainty. In other words, musical tension in this process is affected by predictions consisting of first-order predictions (predictions of perceptual content, i.e., surprise) and second-order predictions ascribed to the first-order predictions (predictions of their precision, i.e., certainty). Predictive coding uses the precision-weighted prediction error (first-order prediction regulated by second-order prediction) to update mental representations (Koelsch et al. 2019), implying that the outcome event can itself initiate a new tension process, inducing a succession of different tension experiences, i.e., a dynamic flow of tension and resolution.
However, previous studies have mainly addressed tension induction by prediction errors (Steinbeis et al. 2006; Steinbeis and Koelsch 2008). How do prediction errors regulated by contextual certainty affect musical tension in both the induction and the resolution processes? What kinds of cognitive operations are involved in the resolution process, and how do they affect musical tension? These questions need to be answered through empirical investigation.
Event segmentation theory (EST) (Bailey et al. 2017; Kurby and Zacks 2007; Radvansky and Zacks 2017; Stawarczyk et al. 2021; Zacks 2020) describes the cognitive processes that may be involved in the resolution process. In this theory, perceivers generate predictions by constructing an event model. If the model is accurate, it guides subsequent information processing and integration with the preceding context; if not, an event boundary is detected and a prediction model for the forthcoming event is constructed. Meanwhile, the attentional focus turns to external stimuli (Faber and D’Mello 2018; Huff et al. 2012). In the case of music processing, the cognitive processes engaged in tension resolution may involve detecting event boundaries, updating mental models in working memory, and attending to external information.
In light of the studies mentioned above, we focus on the effects of contextual certainty on musical tension in both the induction and the resolution processes, to further explore the relationship between musical tension and prediction. In the tension induction process, prediction errors can induce a sudden increase in experienced tension (Bigand 1997; Lerdahl and Krumhansl 2007; Koelsch 2012; Koelsch et al. 2008). Bianco et al. (2020) suggest that the effects of prediction errors on arousal are weighted by contextual certainty. We therefore assume that contextual certainty modulates the effect of prediction errors on tension. In the resolution process, prediction updating is triggered by the errors in order to capture the current information, and a new prediction model is constructed from the currently available perceptual information. Because prediction updating is a process that adapts to external stimuli, the tension caused by prediction errors can be resolved gradually (Koelsch et al. 2000).
In the experiment, two independent factors, congruency and contextual certainty, are manipulated. Specifically, phrases consisting of three sections are used as materials. The third section is the target section, and congruency refers to the tonal relation between the first section and the target section. The second section is used to manipulate the certainty of the congruency. Following music prediction theories, the first chord of section3 (chord7), at which prediction errors occur, is taken as the tension induction process, and the remaining two chords (chord8 and chord9) are taken as the subsequent resolution process. Behavioral tension ratings and neural responses are recorded simultaneously. It is hypothesized that, in the tension induction process, tension would increase in the incongruent conditions and that this effect would be modulated by contextual certainty. In the tension resolution process, tension would be resolved gradually and dynamically, and this process would also be modulated by contextual certainty.
Methods
Participants
A total of 29 musicians (Mage = 21.24 years, SD = 2.52, 15 females) participated in the experiment. The data of one female participant, who withdrew for medical reasons, were excluded from subsequent analyses. All participants were familiar with Western tonal music and had knowledge of tonal harmony through formal instrumental training (average 13 years; range 8–20), and they practiced for one hour or more per day. They were right-handed and had no auditory or neurological disorders. Prior to the experiment, they signed an informed consent form and were informed that they could terminate the experiment at any time. Each participant received 25 dollars for participating. This research was approved by the Ethics Committee of the Institute of Psychology, Chinese Academy of Sciences.
Stimuli
Two sequences of nine chords each were created for the experiment by a professional musician. Each sequence could be segmented into three sections according to Western music syntax, with three chords in each section. Two independent variables, each with two levels, were manipulated: contextual certainty (certain vs. uncertain) and congruency (congruent vs. incongruent). Congruency refers to the tonal relation between section1 and section3; the incongruent condition is created when their keys differ. Contextual certainty refers to the tonal relation between section1 and section2; the uncertain condition is created when their keys differ. Section3 was the target section and was in the same key (key1) across conditions. Specifically, the first chord in section3 (chord7 of the sequence) was used to investigate the tension induced by incongruency, and chord8 and chord9 were used to investigate the updating (resolution) processes that follow the induced tension.
Four versions of these sequences were created (Fig. 1). In the certain-congruent condition (CER-CON), all three sections belonged to the same key, i.e., all chords were in key1 (key1 + key1 + key1); the context preceding section3 was both locally and globally consistent, with higher certainty. In the certain-incongruent condition (CER-IN), the first two sections were in key2 and section3 was in key1 (key2 + key2 + key1), so that section3 was preceded by a context that was both locally and globally incongruent and was expected to induce a prediction error with certainty. In the uncertain-congruent condition (UN-CON), section1 was in key1, section2 in key2, and section3 in key1 (key1 + key2 + key1); section3 was preceded by a context that was locally incongruent but globally congruent (as in a nested structure) and was expected to induce a prediction error with less certainty. In the uncertain-incongruent condition (UN-IN), the three sections were all in different keys (key3 + key2 + key1); section3 was preceded by a context that was both locally and globally incongruent and was expected to induce a prediction error with uncertainty.
Fig. 1.
Examples of the chord sequences used in the experiment. Section 3 (chord7–9) was the region used for investigating the tension induction and resolution processes. Chords shown in the same color belong to the same key. In the certain-congruent condition (CER-CON), all chords were in C major (blue, key1); in the certain-incongruent condition (CER-IN), the first six chords were in Bb major (red, key2) and the last three chords in C major (blue, key1); in the uncertain-congruent condition (UN-CON), the first three and the last three chords were in C major (blue, key1) and the middle three chords in Bb major (red, key2); in the uncertain-incongruent condition (UN-IN), the first three chords were in B major (green, key3), the middle three chords in Bb major (red, key2), and the last three chords in C major (blue, key1). (Color figure online)
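The 2 × 2 design can be read directly off the key assigned to each section. The following is a minimal sketch, using the example keys from the Fig. 1 caption; the names `CONDITIONS` and `classify` are illustrative and not taken from the study materials.

```python
# Keys of (section1, section2, section3) for each condition,
# using the example keys given in the Fig. 1 caption.
CONDITIONS = {
    "CER-CON": ("C", "C", "C"),
    "CER-IN": ("Bb", "Bb", "C"),
    "UN-CON": ("C", "Bb", "C"),
    "UN-IN": ("B", "Bb", "C"),
}


def classify(keys):
    """Derive the two factor levels from the keys of the three sections."""
    section1, section2, section3 = keys
    # Certainty: do section1 and section2 share a key?
    certainty = "certain" if section1 == section2 else "uncertain"
    # Congruency: does section3 share the key of section1?
    congruency = "congruent" if section1 == section3 else "incongruent"
    return certainty, congruency


design = {name: classify(keys) for name, keys in CONDITIONS.items()}
```

Each condition label then maps to the pair of factor levels defined in the Stimuli section, and section3 is in the same key in all four conditions.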
The experimental stimuli consisted of 240 sequences. The four versions of the two sequences were first transposed to five major keys (B, C, D, G and Db), and each was presented six times in the experiment to increase the signal-to-noise ratio (2 sequences × 4 versions × 5 major keys × 6 times). The stimuli were generated as follows. Sibelius 8.3 and Cubase 5 (Steinberg Media Technologies, Hamburg, Germany) were used to generate sound files with an acoustic piano sound at a tempo of 80 beats per minute. Chord1–8 each lasted 750 ms and chord9 lasted 1500 ms. In addition, Adobe Audition 3.0 was used to standardize the loudness of each sequence to −3 dB to ensure that the velocity was constant across all chords. All stimuli were presented in six blocks, each consisting of the 4 versions of the 2 sequences in the five major keys. Within each block, the stimuli were presented in a pseudorandomized order such that no sequence or condition was immediately repeated.
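The stimulus counts and the time windows used in the analyses follow from these durations. The short sketch below recomputes them; the variable names are illustrative, and the only inputs are the numbers stated in the text.

```python
TEMPO_BPM = 80
CHORD_MS = 60_000 / TEMPO_BPM      # one beat at 80 bpm lasts 750 ms
FINAL_CHORD_MS = 1500              # chord9 is held for two beats
N_CHORDS = 9

# Onset of each chord in seconds (index 0 is chord1).
onsets_s = [i * CHORD_MS / 1000 for i in range(N_CHORDS)]
sequence_duration_s = onsets_s[-1] + FINAL_CHORD_MS / 1000

# Stimulus counts: 2 sequences x 4 versions x 5 keys x 6 repetitions.
n_stimuli = 2 * 4 * 5 * 6
stimuli_per_block = n_stimuli // 6
trials_per_condition = n_stimuli // 4

print(f"chord4 onset: {onsets_s[3]} s, chord7 onset: {onsets_s[6]} s")
print(f"sequence duration: {sequence_duration_s} s")
print(f"{n_stimuli} stimuli, {stimuli_per_block} per block, "
      f"{trials_per_condition} per condition")
```

The chord4 onset (2.25 s), the chord7 onset (4.5 s) and the sequence end (7.5 s) match the window boundaries quoted in the analysis sections, and each condition contributes 60 trials before artifact rejection.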
Procedure
The experiment was conducted in a soundproof room, and both behavioral ratings and EEG responses were recorded continuously. All sequences were divided into 6 blocks. Each block started with a mouse click, and the music sequence was presented after 2 s through in-ear headphones. Participants were asked to rate the experienced tension continuously by moving a slider within the range of 0–100 until the end of the block. A 3 s silence was inserted between chord sequences to reduce the effect of the previous stimulus on the next one. Participants completed practice trials until they were familiar with the response interface.
Data measurement and analysis
Behavioral data analysis
Behavioral ratings were recorded with PsychoPy2 at a sampling rate of 20 Hz. A time window of 1 s before the onset of chord4 (1.25–2.25 s) was used as the baseline for correction to ensure the comparability of the experimental conditions. Then, two-factor (contextual certainty × congruency) repeated-measures ANOVAs were performed separately on the epochs of chord7 (4.5–5.25 s), chord8 (5.25–6 s) and chord9 (6–7.5 s). To better describe the dynamic changes in the tension curves, growth curve analysis (Mirman 2014) was used to analyze the time course of the tension scores from the onset of chord4 to the end of the trial (2.25–10.5 s). The dynamic characteristics of the tension scores were captured with a fourth-order (quartic) orthogonal polynomial, with fixed effects of contextual certainty (Certain vs. Uncertain) and congruency (Congruent vs. Incongruent) on all time terms, and with participant and participant-by-contextual certainty-by-congruency random effects on all time terms. The normal approximation (i.e., treating the t value as a z value) was used to assess the statistical significance (p values) of the individual parameter estimates. The analyses were carried out with the lme4 package (version 1.1–21) in R version 3.6.1.
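Two of the steps above, baseline correction and the construction of orthogonal polynomial time terms, can be illustrated with a small sketch. It is written in plain Python rather than the R code used in the study; the function names are hypothetical, and the time terms are built by Gram-Schmidt orthogonalization, which should span the same space as the orthogonal polynomials commonly used in growth curve analysis, although signs and scaling may differ from a given R implementation.

```python
import math

FS_HZ = 20  # rating sampling rate reported in the text


def baseline_correct(rating, start_s=1.25, end_s=2.25, fs=FS_HZ):
    """Subtract the mean rating in the baseline window from every sample."""
    i0, i1 = round(start_s * fs), round(end_s * fs)
    baseline = sum(rating[i0:i1]) / (i1 - i0)
    return [value - baseline for value in rating]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_time_terms(n_samples, degree=4):
    """Return `degree` unit-length time terms that are mutually orthogonal
    and orthogonal to the intercept (linear, quadratic, cubic, quartic)."""
    mean_t = (n_samples - 1) / 2
    centered = [i - mean_t for i in range(n_samples)]
    basis = [[1.0] * n_samples]  # intercept column
    for power in range(1, degree + 1):
        column = [c ** power for c in centered]
        # Remove the part explained by all lower-order terms.
        for previous in basis:
            scale = dot(column, previous) / dot(previous, previous)
            column = [x - scale * p for x, p in zip(column, previous)]
        basis.append(column)
    terms = []
    for column in basis[1:]:  # drop the intercept
        norm = math.sqrt(dot(column, column))
        terms.append([x / norm for x in column])
    return terms
```

For the 2.25–10.5 s analysis window sampled at 20 Hz, `orthogonal_time_terms(165)` would give the four time predictors that enter the mixed-effects model as fixed and random slopes. Because the terms are uncorrelated, the estimate for each term can be interpreted independently of the others.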
EEG data analysis
The EEG signal was continuously recorded from 64 Ag/AgCl electrodes mounted on an elastic cap (10–20 system) and amplified with a Neuroscan Synamps amplifier. The data were sampled at 1000 Hz with a band-pass filter of 0.05–100 Hz. An electrode between FPz and Fz served as the ground. The online reference electrode was placed on the left mastoid, and the data were re-referenced offline to the algebraic average of the left and right mastoids. The horizontal electrooculogram (HEOG) was monitored by two electrodes placed at the outer canthi of the left and right eyes, and the vertical electrooculogram (VEOG) was monitored by two electrodes placed supra- and infra-orbitally at the left eye. The impedances of all electrodes were kept below 5 kΩ throughout the experiment.
ERP
EEGLAB (EEGLAB 14.1.1b, http://www.sccn.ucsd.edu/eeglab), an open-source toolbox that runs in the MATLAB environment, was used to preprocess the raw EEG data. Continuous EEG data were band-pass filtered at 0.1–30 Hz to remove slow drifts. The filtered data were divided into epochs ranging from 500 ms before the onset of chord4 to 1500 ms after the onset of chord9 (1.75–7.5 s). An independent component analysis (ICA) algorithm was used to correct trials affected by eye movements and blinks, and automatic artifact rejection (signal amplitude exceeding ±100 μV) was used to identify trials affected by muscle artifacts, electrode drift, amplifier saturation or other artifacts. Trials affected by artifacts were rejected (an average of 3% of all trials). There were 57.79 ± 2.67, 58 ± 2.39, 57.86 ± 2.68 and 57.96 ± 2.55 artifact-free trials for the CER-CON, CER-IN, UN-CON and UN-IN conditions, respectively. A time window of 500 ms before the onset of chord4 (the onset of section2) was used as the baseline for correction (1.75–2.25 s). Finally, average ERPs were computed for each participant in each condition at each electrode.
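The last three steps (amplitude-based rejection, baseline correction and averaging) can be sketched as follows. This is not the EEGLAB pipeline used in the study: filtering and ICA are omitted, the data layout (a list of trials, each a list of channels, each a list of samples in μV) is an assumption made for illustration, and the function names are hypothetical.

```python
def reject_and_baseline(epochs, fs=1000, baseline_ms=500, threshold_uv=100.0):
    """Drop trials with any sample beyond +/- threshold, then subtract the
    mean of the first `baseline_ms` of each channel from that channel."""
    n_base = round(baseline_ms * fs / 1000)
    kept = []
    for trial in epochs:
        # Reject the whole trial if any channel exceeds the threshold.
        if any(abs(v) > threshold_uv for channel in trial for v in channel):
            continue
        corrected = []
        for channel in trial:
            base = sum(channel[:n_base]) / n_base
            corrected.append([v - base for v in channel])
        kept.append(corrected)
    return kept


def average_erp(epochs):
    """Average across trials, giving one waveform per channel."""
    n_trials = len(epochs)
    return [
        [sum(samples) / n_trials for samples in zip(*channel_trials)]
        for channel_trials in zip(*epochs)
    ]
```

With epochs that start 500 ms before chord4 onset, the default `baseline_ms=500` corresponds to the 1.75–2.25 s baseline window described above.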
Statistical analysis
We used the cluster-based random permutation test to analyze the ERP data statistically (Maris and Oostenveld 2007). The permutation test was conducted within 0–3000 ms after the onset of chord7 (in steps of 1 ms) over 45 electrodes (HEO, VEO, FP1, FPZ, FP2, AF3, AF4, FT7, FT8, T7, F8, TP7, TP8, F7, F8, O1, O2 and OZ were removed). Dependent-samples t tests were conducted to compare two conditions at each data point (e.g., Certain vs. Uncertain; Congruent vs. Incongruent; CER-CON vs. UN-CON; CER-IN vs. UN-IN). Adjacent data points exceeding the preset significance level (0.05) were grouped into clusters. The sum of the t values within each cluster was taken as the cluster-level statistic. The Monte Carlo method with 1000 random draws was used to estimate the significance probability of the clusters.
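The logic of the test can be sketched for a single channel. The code below is a simplified illustration, not the implementation used in the study: it clusters over time only (no spatial adjacency between electrodes), it permutes by randomly flipping the sign of each participant's difference wave, and the default threshold of 2.052 is the assumed two-sided critical t value for df = 27 at α = 0.05. All function names are hypothetical.

```python
import math
import random


def paired_t(diffs):
    """Dependent-samples t value for one time point."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return 0.0  # degenerate point: treated as no evidence in this sketch
    return mean / math.sqrt(var / n)


def cluster_masses(t_values, threshold):
    """Return (start, stop, summed_t) for each run of adjacent points whose
    t values exceed the threshold with the same sign."""
    clusters, start, sign = [], None, 0
    for i, t in enumerate(list(t_values) + [0.0]):  # sentinel closes last run
        s = 1 if t > threshold else -1 if t < -threshold else 0
        if s != sign:
            if sign != 0:
                clusters.append((start, i, sum(t_values[start:i])))
            start, sign = i, s
    return clusters


def cluster_permutation_test(cond_a, cond_b, threshold=2.052,
                             n_perm=1000, seed=0):
    """cond_a, cond_b: one time series per participant (equal lengths).
    Returns (start, stop, cluster_mass, p) for each observed cluster."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(pa, pb)] for pa, pb in zip(cond_a, cond_b)]
    n_points = len(diffs[0])

    def t_series(rows):
        return [paired_t([row[i] for row in rows]) for i in range(n_points)]

    observed = cluster_masses(t_series(diffs), threshold)
    null_max = []
    for _ in range(n_perm):
        # Swapping condition labels within a participant flips the sign
        # of that participant's whole difference wave.
        flipped = []
        for row in diffs:
            sign = rng.choice((-1, 1))
            flipped.append([sign * d for d in row])
        masses = [abs(m) for _, _, m in cluster_masses(t_series(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    return [
        (start, stop, mass, sum(m >= abs(mass) for m in null_max) / n_perm)
        for start, stop, mass in observed
    ]
```

Comparing each observed cluster with the distribution of the largest cluster mass under permutation is what controls the error rate across all time points in this kind of test, so no separate correction per time point is needed.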
First, the main effect of contextual certainty was analyzed. The Certain (CER-CON combined with CER-IN) and Uncertain (UN-CON combined with UN-IN) conditions were computed, and permutation tests were applied to these two conditions (Certain vs. Uncertain), first in the epoch of 0–3000 ms after chord7 onset for the whole target section (4.5–7.5 s), then only in the epoch of 0–750 ms after chord7 onset (4.5–5.25 s) for the target chord. Second, the main effect of congruency was analyzed. The Congruent (CER-CON combined with UN-CON) and Incongruent (CER-IN combined with UN-IN) conditions were computed, and permutation tests were applied to these two conditions (Congruent vs. Incongruent), first in the epoch of 0–3000 ms after chord7 onset (4.5–7.5 s), then only in the epoch of 0–750 ms after chord7 onset (4.5–5.25 s). Third, the ERPs of the four experimental conditions (CER-CON, CER-IN, UN-CON and UN-IN) were computed separately, and the two-way contextual certainty × congruency interaction was examined by applying permutation tests to the difference waves in the epoch of 0–3000 ms after chord7 onset (CER-CON-minus-CER-IN vs. UN-CON-minus-UN-IN, 4.5–7.5 s). Pairwise comparisons were then computed for the periods in which this interaction was significant. To control the type I error rate, the FDR method was used to correct p values for multiple comparisons.
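The text does not state which FDR procedure was applied. Assuming the widely used Benjamini-Hochberg procedure, the adjustment of a set of p values can be sketched as follows; the function name and the example values in the note are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For instance, four hypothetical pairwise p values of 0.01, 0.04, 0.03 and 0.005 would be adjusted to approximately 0.02, 0.04, 0.04 and 0.02, so all four would remain below 0.05 after correction.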
Results
Behavioral results
The overall tension scores in the four conditions, derived from the continuous ratings, are shown in Fig. 2A. For chord8, the two-way interaction of contextual certainty and congruency was significant [F(1, 27) = 11.11, p = 0.002, partial η2 = 0.29]. Simple-effect tests showed that, in the certain condition, ratings were higher for incongruent than for congruent chords [F(1, 27) = 6.47, p = 0.017, partial η2 = 0.19], and that, in the congruent condition, ratings were higher in the uncertain than in the certain condition [F(1, 27) = 5.23, p = 0.03, partial η2 = 0.16] (Fig. 2B). For chord9, the two-way interaction of contextual certainty and congruency was significant [F(1, 27) = 13.03, p = 0.001, partial η2 = 0.33]. Simple-effect tests showed that, in the certain condition, ratings were higher for incongruent than for congruent chords [F(1, 27) = 9.83, p = 0.004, partial η2 = 0.27] (Fig. 2C). No other significant effects were found (ps > 0.05).
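The reported effect sizes can be checked against the F statistics with the standard identity partial η2 = F × df_effect / (F × df_effect + df_error). The helper below is an illustrative sketch, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported for chord8 and chord9, all with df = (1, 27).
reported_f = [11.11, 6.47, 5.23, 13.03, 9.83]
recomputed = [round(partial_eta_squared(f, 1, 27), 2) for f in reported_f]
print(recomputed)
```

The recomputed values agree with the reported partial η2 values of 0.29, 0.19, 0.16, 0.33 and 0.27, and the error degrees of freedom (27) correspond to the 28 participants retained for analysis.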
Fig. 2.
A Tension scores rated by participants from the onset of chord4 to the end of chord9. Each vertical dotted black line indicates the onset of a chord. The vertical axis shows the tension score, and the horizontal axis shows time (s); ratings were recorded every 50 ms. B Differences between conditions during chord8. C Differences between conditions during chord9. (*p < 0.05; **p < 0.01)
The dynamic features of the tension curves revealed by the growth curve analysis are shown in Fig. 3. Contextual certainty had significant effects on the linear slope term (Estimate = −1.66, SE = 0.79, p = 0.04), the quadratic term (Estimate = −1.69, SE = 0.81, p = 0.04) and the cubic term (Estimate = 1.66, SE = 0.56, p = 0.003) (uncertain conditions relative to certain conditions), capturing differences in the slopes and curvatures of these curves. Congruency had significant effects on the linear slope term (Estimate = 2.4, SE = 0.79, p = 0.003), the quadratic term (Estimate = −2.07, SE = 0.81, p = 0.01), and the quartic term (Estimate = 1.62, SE = 0.56, p = 0.004) (incongruent conditions relative to congruent conditions), indicating that congruency affected the slopes and curvatures of these curves. In addition, the interaction of contextual certainty and congruency was significant on the linear slope term (Estimate = −0.89, SE = 0.39, p = 0.03) and the quartic term (Estimate = −0.68, SE = 0.28, p = 0.01). In the incongruent condition, contextual certainty had a significant effect on the linear slope term (Estimate = −3.45, SE = 1.12, p = 0.003) and the quartic term (Estimate = −1.67, SE = 0.79, p = 0.03) (uncertain conditions relative to certain conditions) (Fig. 3B). In the certain condition, congruency had a significant effect on the linear slope term (Estimate = 4.19, SE = 1.12, p < 0.001) and the quartic term (Estimate = 2.97, SE = 0.79, p < 0.001) (incongruent conditions relative to congruent conditions) (Fig. 3C). No other significant effects were found (ps > 0.1).
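The p values above rest on the normal approximation described in the Methods, in which the ratio of estimate to standard error is treated as a z value. A minimal sketch of that computation is given below; the function name is illustrative. Because the reported estimates and standard errors are rounded, recomputed p values may differ slightly from the reported ones.

```python
import math


def normal_approx_p(estimate, se):
    """Two-sided p value, treating estimate / SE as a standard normal z."""
    z = estimate / se
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Effects of contextual certainty on the linear and cubic terms.
p_linear = normal_approx_p(-1.66, 0.79)
p_cubic = normal_approx_p(1.66, 0.56)
print(round(p_linear, 2), round(p_cubic, 3))
```

For these two terms the recomputed values round to the reported p = 0.04 and p = 0.003.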
Fig. 3.
Observed data (symbols; error bars indicate ± SE) and growth curve model fits (lines) for the effects of contextual certainty and congruency on the time course of the tension score. The zero of the x-axis indicates the onset of chord7. Numbers in different colors indicate the coefficients for the different conditions. Growth curve analysis results for (A) CER-CON (red) and UN-CON (blue); (B) CER-IN (red) and UN-IN (blue); (C) CER-CON (red) and CER-IN (blue); (D) UN-CON (red) and UN-IN (blue).
Event-related potential (ERP) results
The main effect of congruency was significant (p = 0.033; in the time window of approximately 467–685 ms after the onset of chord7); this effect was pronounced over central-posterior electrodes, and the amplitude was more positive in the incongruent than in the congruent conditions. When only chord7 was analyzed, the main effect of congruency was also significant (p = 0.01), and two significant clusters were found. Cluster 1 spanned approximately 240–345 ms after chord7 onset; this effect was pronounced over anterior electrodes, and the amplitude was more negative in the incongruent than in the congruent conditions (Fig. 4A, B). Cluster 2 spanned approximately 467–685 ms after chord7 onset; this effect was pronounced over central-posterior electrodes, and the amplitude was more positive in the incongruent than in the congruent conditions.
Fig. 4.
A Grand average ERP waveforms elicited by the congruent and incongruent conditions at the selected electrode F4 from the onset of chord7 to the end of chord9. The zero of the x-axis indicates the onset of chord7. Black solid line: congruent conditions (CON); black dotted line: incongruent conditions (IN); cyan solid line: incongruent conditions minus congruent conditions. B Results of the cluster-based permutation test. Topographies and colors reflect the spatial distribution and magnitude of the congruency effects. IN-CON denotes the incongruent condition minus the congruent condition. (Color figure online)
In addition, comparing the difference waveforms (congruent-minus-incongruent) between the certain and uncertain conditions with the cluster-based random permutation test revealed a significant difference (p = 0.009), with three clusters extending from 310 to 2341 ms after the onset of chord7 (Fig. 5A, B). Further analyses yielded the following simple effects. For the certain condition, the incongruent and congruent conditions differed significantly (p = 0.012; approximately 373–1025 ms after chord7 onset); this effect was pronounced over central-posterior electrodes, and the amplitude was more positive in the certain incongruent than in the certain congruent condition. For the uncertain condition, the incongruent and congruent conditions differed significantly (p = 0.024; approximately 310–425 ms after chord7 onset); this effect was pronounced over central-posterior electrodes, and the amplitude was more positive in the uncertain congruent than in the uncertain incongruent condition. For the congruent condition, the certain and uncertain conditions differed significantly (p = 0.023) in five clusters: cluster1, approximately 283–900 ms after chord9 onset; cluster2, approximately 736–1013 ms after chord7 onset; cluster3, approximately 456–722 ms after chord7 onset; cluster4, approximately 310–449 ms after chord7 onset; cluster5, approximately 60–261 ms after chord9 onset. All of these effects were pronounced over central-posterior electrodes, and the amplitudes were more positive in the uncertain congruent than in the certain congruent condition. For the incongruent condition, the certain and uncertain conditions differed significantly (p = 0.016; approximately 431–901 ms after chord7 onset); this effect was pronounced over central-posterior electrodes, and the amplitude was more positive in the certain incongruent than in the uncertain incongruent condition. No other effects were significant (ps > 0.1).
Fig. 5.
A Grand average ERP waveforms elicited by the four conditions at the selected electrode Pz from the onset of chord7 to the end of chord9. The zero of the x-axis indicates the onset of chord7. Blue solid line: CER-CON; blue dotted line: CER-IN; red solid line: UN-CON; red dotted line: UN-IN. B Simple-effect results of the cluster-based permutation test. Topographies and colors reflect the spatial distribution and magnitude of the effects. CER: IN-CON denotes the incongruent condition minus the congruent condition under the certain condition; UN: CON-IN denotes the congruent condition minus the incongruent condition under the uncertain condition; IN: CER-UN denotes the certain condition minus the uncertain condition under the incongruent condition; CON: UN-CER denotes the uncertain condition minus the certain condition under the congruent condition. (Color figure online)
Discussion
The current study aimed to investigate how contextual certainty affects musical tension in both the induction and the resolution processes. The major results of the experiment are outlined in Fig. 6. In the tension induction process (chord7), musical tension was affected by congruency and contextual certainty. Incongruent conditions elicited an EN and a P600 in the ERPs, and the amplitude of the P600 was affected by contextual certainty. In the tension resolution process, the behavioral ratings, the slopes and curvatures of the tension curves, and the magnitude and duration of the P600 component were modulated by contextual certainty. These results are discussed in more detail below.
Fig. 6.
The figure shows the results from the onset of chord7 to the end of chord9. A Chord7, chord8 and chord9 of a chord sequence used in the current experiment. These chords were identical across conditions. B Tension scores rated by participants from the onset of chord7. C Grand average ERP waveforms elicited by the four conditions at the selected electrode Pz from the onset of chord7 to the end of chord9. The zero of the x-axis indicates the onset of chord7.
The effect of contextual certainty in the induction process
First, compared with congruent conditions, incongruent conditions elicited an early negativity (EN) after chord7 onset. In previous literature, this component has been taken to reflect the detection of unpredictable events (Steinbeis et al. 2006); that is, listeners detected the prediction errors caused by out-of-key chords. Moreover, incongruent conditions elicited a P600. This result is in line with previous studies, which found that unpredictable chords elicited a larger P600 and a stronger tension experience during music listening (Featherstone et al. 2013; Patel et al. 1998). These authors proposed that the P600 effect reflects the conscious detection of a specific unpredictable element and the efforts of musicians to integrate the harmonic incongruity into its context. Some researchers have also used the P600 as an indicator of prediction updating processes, which are closely related to musical tension experience and include updating the prediction model and integrating new events into the representation (Brouwer et al. 2012; Delogu et al. 2017; Schumacher and Hung 2012).
Furthermore, the amplitude of the P600 was modulated by contextual certainty: the P600 was larger in the certain incongruent condition than in the certain congruent condition, but larger in the uncertain congruent condition than in the uncertain incongruent condition. A possible explanation is as follows. Unlike in the certain congruent condition, section3 in the certain incongruent condition did not return to the key of section1, resulting in prediction errors at chord7. More importantly, because the prediction error occurs in a certain context formed by section1 and section2, the precision-enhanced prediction error leads to a more difficult updating process (Koelsch et al. 2019). Therefore, a larger P600 was observed in the certain incongruent condition. Similarly, unlike in the uncertain congruent condition, section3 in the uncertain incongruent condition did not return to the key of section1, but the prediction errors in this condition were weakened by uncertainty, leading to an easier updating process; correspondingly, a smaller P600 was observed in the uncertain incongruent condition.
As observed and estimated in previous studies (Sloboda and Lehmann 2001; Krumhansl 1996), the behavioral results lag behind the ERP responses. Specifically, the effects of certainty and congruency appeared later in the behavioral results (on chord8) than in the ERP responses (on chord7); that is, the behavioral responses were delayed by about one chord (750 ms) relative to the ERP responses. During the chord8 window, an interaction of congruency and contextual certainty on tension was observed. For the certain condition, tension ratings were higher in the incongruent than in the congruent condition, whereas for the uncertain condition there was no significant difference. These results suggest that listeners had constructed a predictive model in the induction process and experienced a sudden increase in tension when prediction errors occurred. In addition, prediction errors could be weakened by uncertainty, which decreased the experienced tension. This pattern of behavioral data is in line with the findings of previous studies (Bianco et al. 2020; Quiroga-Martinez et al. 2019). For example, Bianco et al. (2020) found that the same change leads to different levels of arousal depending on whether it occurs under low or high uncertainty. Because predictions formed in predictable contexts are more explicit, violations may be detected more easily than in unpredictable contexts, leading to greater changes in the listeners’ arousal state.
The behavioral data and the ERP results reveal that, in contrast to congruent conditions (back to key1), in incongruent conditions (shifted to another key) listeners are no longer able to integrate upcoming information well on the basis of their previous prediction model. According to the EST model (Kurby and Zacks 2007), they need to update or even reconstruct their prediction models based on currently available information in order to keep listening smoothly, as reflected by the P600. Meanwhile, contextual certainty plays an important role in prediction updating, as shown by the P600 amplitude and the tension ratings.
The effect of contextual certainty in the resolution process
In the current experiment, we investigated not only the effects of contextual certainty in the induction process, but also these effects and their dynamic features in the resolution process.
We found that the duration of the P600 was mediated by contextual certainty and varied across conditions. For the congruent condition, the uncertain condition induced a larger P600 than the certain condition from 310 to 1013 ms and from 1560 to 2400 ms post chord7 onset. For the incongruent condition, the certain condition induced a larger P600 than the uncertain condition from 431 to 901 ms post chord7 onset. For the certain condition, the incongruent condition elicited a larger P600 than the congruent condition from 373 to 1025 ms post chord7 onset. For the uncertain condition, the incongruent condition elicited a smaller P600 than the congruent condition from 310 to 425 ms post chord7 onset.
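To make the duration differences easier to compare, the time windows reported above can be tabulated with a short sketch. The window boundaries and the 750 ms chord duration are taken from the text; the dictionary labels, the function name, and the assumption that chord7 occupies exactly 0 to 750 ms after its onset are illustrative.

```python
CHORD_MS = 750  # chord duration reported in the text

# Contrast label -> list of (start_ms, end_ms) windows post chord7 onset.
P600_WINDOWS = {
    "congruent: uncertain > certain": [(310, 1013), (1560, 2400)],
    "incongruent: certain > uncertain": [(431, 901)],
    "certain: incongruent > congruent": [(373, 1025)],
    "uncertain: congruent > incongruent": [(310, 425)],
}


def summarize(windows, chord_ms=CHORD_MS):
    """Return (total effect duration in ms, whether the effect outlasts chord7)."""
    total_ms = sum(end - start for start, end in windows)
    # Assumption: chord7 spans 0..chord_ms, so a later end point means
    # the effect continues after the critical chord has finished.
    outlasts_chord7 = max(end for _, end in windows) > chord_ms
    return total_ms, outlasts_chord7


for label, windows in P600_WINDOWS.items():
    total_ms, outlasts = summarize(windows)
    print(f"{label}: {total_ms} ms, outlasts chord7: {outlasts}")
```

On this tabulation, the congruency effect lasts 652 ms in the certain condition and continues past chord7, whereas it lasts 115 ms in the uncertain condition and ends within chord7.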
These results suggest that prediction updating could not be completed during the critical chord (chord7), but is instead a lasting process of gradual adaptation to external information (Barrett and Simmons 2015). As suggested by Brouwer et al. (2012; Brouwer and Hoeks 2013), the psychological mechanism behind the P600 is related to the mental representation of what is communicated (MRC). To comprehend a sentence or story smoothly, listeners construct some kind of MRC, which consists of direct linguistic input and various inferences based on world knowledge. For the current experiment, it is reasonable to speculate that, compared with the certain congruent condition, listeners in the certain incongruent condition need to update or even reconstruct their prediction model when a prediction error occurs. Therefore, a prolonged P600 effect is observed, reflecting the processes of updating working memory and constructing a new model based on currently available information. However, in the uncertain congruent condition and the uncertain incongruent condition, the duration of the P600 is shorter than in the certain conditions. This may be because contextual uncertainty weakens the effect of prediction errors and makes updating easier. Meanwhile, in the uncertain congruent condition, the tonal context shifts from uncertainty to certainty, and the second-order prediction is updated. Therefore, a P600 is induced, but the duration of the effect is short.
Behavioral rating results show that the experienced tension in the certain incongruent condition is higher than that in the certain congruent condition, and that the effect of prediction updating on tension is not transient. Growth curve analyses show significant effects of congruency and contextual certainty on the linear, quadratic, cubic and quartic terms. According to Mirman et al. (2008), these terms reflect the dynamic traits of the growth curves: the linear term reflects the overall angle of the curve, the quadratic term reflects the symmetric rise and fall rate around a central inflection point, and the cubic and quartic terms reflect the steepness of the curve around the inflection points. Together, these effects indicate that the tension curve of the incongruent condition fluctuates faster than that of the congruent condition, and reveal that listeners feel more unstable and volatile in incongruent conditions. We also found interaction effects. For the certain condition, the tension curve fluctuated faster in the incongruent condition than in the congruent condition, suggesting that listeners find the music less predictable in the certain incongruent condition than in the certain congruent condition. Together with the ERP results, the behavioral results suggest that the previous prediction model is no longer a good guide for integrating upcoming information in the certain incongruent condition, and that updating is urgently needed. In contrast, we found no significant differences between the uncertain incongruent condition and the uncertain congruent condition in behavioral ratings and tension curves. Nevertheless, combining the theoretical assumptions with the ERP results, we believe that the underlying cognitive processes are different.
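How the polynomial terms of a growth curve capture the shape of a tension curve can be sketched as follows. This is a simplified, single-curve least-squares illustration, not the analysis pipeline of the current study: growth curve analysis as described by Mirman et al. (2008) uses multilevel models, and the function names and the two synthetic curves below are hypothetical.

```python
import numpy as np


def orthogonal_time_terms(n_samples, degree=4):
    """Build orthonormal polynomial time terms (linear up to `degree`)."""
    if n_samples <= degree:
        raise ValueError("need more samples than polynomial degree")
    t = np.arange(n_samples, dtype=float)
    t -= t.mean()
    t /= np.abs(t).max()  # rescale to [-1, 1] for numerical stability
    raw = np.vander(t, degree + 1, increasing=True)  # columns t^0 .. t^degree
    q, r = np.linalg.qr(raw)
    q = q * np.sign(np.diag(r))  # give every term a positive leading coefficient
    return q[:, 1:]  # drop the constant column


def fit_tension_curve(ratings, degree=4):
    """Return [intercept, linear, quadratic, cubic, quartic] coefficients."""
    ratings = np.asarray(ratings, dtype=float)
    terms = orthogonal_time_terms(len(ratings), degree)
    design = np.column_stack([np.ones(len(ratings)), terms])
    coefs, *_ = np.linalg.lstsq(design, ratings, rcond=None)
    return coefs


if __name__ == "__main__":
    # Synthetic curves for illustration only (not data from the study).
    time = np.linspace(0.0, 1.0, 40)
    smooth = 5.0 - 2.0 * time
    volatile = 5.0 - 2.0 * time + 1.5 * np.sin(3 * np.pi * time)
    for label, curve in [("smooth", smooth), ("volatile", volatile)]:
        print(label, np.round(fit_tension_curve(curve), 3))
```

Because the time terms are mutually orthogonal, each coefficient can be read independently: a curve that only drifts downward loads on the linear term alone, while a curve that rises and falls repeatedly also loads on the higher-order terms.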
Altogether, the behavioral and ERP results suggest that contextual certainty plays an important role in the tension resolution process. The underlying cognitive processes might include updating working memory representations to continually capture changes and reconstructing a new model to guide the integration of upcoming information. Tension is gradually resolved once a suitable prediction model has been constructed. These results are in line with the hypotheses of musical prediction theories, i.e., that the effect of prediction updating on musical tension is an adaptive process, and that previous musical tension experience can affect subsequent musical tension experience (Farbood 2012; Lehne and Koelsch 2015).
Conclusion
The results of the current study show that contextual certainty plays an important role in both the tension induction and the tension resolution processes. Compared with the previous literature, these findings provide a more comprehensive view of how musical predictions affect musical tension, as well as an empirical basis for theoretical hypotheses.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 31971034).
Data availability
The datasets generated during and analyzed during the current study are available from the corresponding author on reasonable request.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- Bailey HR, Kurby CA, Sargent JQ, Zacks JM. Attentional focus affects how events are segmented and updated in narrative reading. Mem Cognit. 2017;45(6):940–955. doi: 10.3758/s13421-017-0707-2.
- Barrett LF, Simmons WK. Interoceptive predictions in the brain. Nat Rev Neurosci. 2015;16(7):419–429. doi: 10.1038/nrn3950.
- Bianco R, Ptasczynski LE, Omigie D. Pupil responses to pitch deviants reflect predictability of melodic sequences. Brain Cogn. 2020;138:103621. doi: 10.1016/j.bandc.2019.103621.
- Bigand E. Perceiving musical stability: the effect of tonal structure, rhythm, and musical expertise. J Exp Psychol Hum Percept Perform. 1997;23(3):808–822. doi: 10.1037//0096-1523.23.3.808.
- Brouwer H, Hoeks JC. A time and place for language comprehension: mapping the N400 and the P600 to a minimal cortical network. Front Hum Neurosci. 2013;7:758. doi: 10.3389/fnhum.2013.00758.
- Brouwer H, Fitz H, Hoeks J. Getting real about semantic illusions: rethinking the functional role of the P600 in language comprehension. Brain Res. 2012;1446:127–143. doi: 10.1016/j.brainres.2012.01.055.
- Delogu F, Drenhaus H, Crocker MW. On the predictability of event boundaries in discourse: an ERP investigation. Mem Cognit. 2017;46(2):315–325. doi: 10.3758/s13421-017-0766-4.
- Faber M, D’Mello SK. How the stimulus influences mind wandering in semantically rich task contexts. Cogn Res Princ Implic. 2018. doi: 10.1186/s41235-018-0129-0.
- Farbood MM (2001) A quantitative, parametric model of musical tension. Dissertation, Massachusetts Institute of Technology
- Farbood MM. A parametric, temporal model of musical tension. Music Percept. 2012;29(4):387–428. doi: 10.1525/mp.2012.29.4.387.
- Featherstone CR, Morrison CM, Waterman MG, MacGregor LJ. Semantics, syntax or neither? A case for resolution in the interpretation of N500 and P600 responses to harmonic incongruities. PLoS ONE. 2013;8(11):e76600. doi: 10.1371/journal.pone.0076600.
- Hindemith P (1937/1942) The craft of musical composition. New York, Belwin-Mills
- Huff M, Papenmeier F, Zacks JM. Visual target detection is impaired at event boundaries. Vis Cogn. 2012;20(7):848–864. doi: 10.1080/13506285.2012.705359.
- Huron D. Sweet anticipation: music and the psychology of expectation. Cambridge: Massachusetts; 2006.
- Koelsch S. Brain and music. London: Blackwell; 2012.
- Koelsch S. Brain correlates of music-evoked emotions. Nat Rev Neurosci. 2014;15(3):170. doi: 10.1038/nrn3666.
- Koelsch S, Gunter T, Friederici AD, Schröger E. Brain indices of music processing: "Nonmusicians" are musical. J Cogn Neurosci. 2000;12(3):520–541. doi: 10.1162/089892900562183.
- Koelsch S, Kilches S, Steinbeis N, Schelinski S. Effects of unexpected chords and of performer’s expression on brain responses and electrodermal activity. PLoS ONE. 2008;3(7):e2631. doi: 10.1371/journal.pone.0002631.
- Koelsch S, Vuust P, Friston K. Predictive processes and the peculiar case of music. Trends Cogn Sci. 2019;23(1):63–77. doi: 10.1016/j.tics.2018.10.006.
- Krumhansl CL. A perceptual analysis of Mozart’s Piano Sonata K. 282: segmentation, tension and musical ideas. Music Percept. 1996;13:401–432. doi: 10.1016/0042-6989(93)90228-O.
- Kurby CA, Zacks JM. Segmentation in the perception and memory of events. Trends Cogn Sci. 2007;12(2):72–79. doi: 10.1016/j.tics.2007.11.004.
- Lehne M, Koelsch S. Toward a general psychological model of tension and suspense. Front Psychol. 2015;6:79. doi: 10.3389/fpsyg.2015.00079.
- Lehne M, Rohrmeier M, Koelsch S. Tension-related activity in the orbitofrontal cortex and amygdala: an FMRI study with music. Soc Cogn Affect Neurosci. 2013;9(10):1515. doi: 10.1093/scan/nst141.
- Lerdahl F, Jackendoff RS. A generative theory of tonal music. Cambridge: Cambridge University Press; 1983.
- Lerdahl F, Krumhansl CL. Modeling tonal tension. Music Percept. 2007;24(4):329–366. doi: 10.1525/mp.2007.24.4.329.
- Margulis EH. A model of melodic expectation. Music Percept. 2005;22:663–714. doi: 10.1525/mp.2005.22.4.663.
- Maris E, Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods. 2007;164(1):177–190. doi: 10.1016/j.jneumeth.2007.03.024.
- Meyer LB. Emotion and meaning in music. Chicago: Illinois; University of Chicago Press; 1956.
- Mirman D. Growth curve analysis and visualization using R. Boca Raton: Chapman; 2014.
- Mirman D, Dixon JA, Magnuson JS. Statistical and computational models of the visual world paradigm: growth curves and individual differences. J Mem Lang. 2008;59(4):475–494. doi: 10.1016/j.jml.2007.11.006.
- Narmour E. The analysis and cognition of basic melodic structures: the implication-realization model. Chicago: Chicago University Press; 1990.
- Patel AD, Gibson E, Ratner J, Besson M, Holcomb PJ. Processing syntactic relations in language and music: an event-related potential study. J Cogn Neurosci. 1998;10(6):717–733. doi: 10.1162/089892998563121.
- Quiroga-Martinez DR, Hansen NC, Højlund A, Pearce MT, Brattico E, Vuust P. Reduced prediction error responses in high-as compared to low-uncertainty musical contexts. Cortex. 2019;120:181–200. doi: 10.1016/j.cortex.2019.06.010.
- Radvansky GA, Zacks JM. Event boundaries in memory and cognition. Curr Opin Behav Sci. 2017;17:133–140. doi: 10.1016/j.cobeha.2017.08.006.
- Schenker H (1935/1979) Free composition. Longman, New York
- Schumacher PB, Hung YC. Positional influences on information packaging: insights from topological fields in German. J Mem Lang. 2012;67:295–310. doi: 10.1016/j.jml.2012.05.006.
- Sloboda JA, Lehmann AC. Tracking performance correlates of changes in perceived intensity of emotion during different interpretations of a Chopin piano prelude. Music Percept. 2001;19:87–120. doi: 10.1525/mp.2001.19.1.87.
- Stawarczyk D, Bezdek MA, Zacks JM. Event representations and predictive processing: the role of the midline default network core. Top Cogn Sci. 2021;13(1):164–186. doi: 10.1111/tops.12450.
- Steinbeis N, Koelsch S. Shared neural resources between music and language indicate semantic processing of musical tension-resolution patterns. Cereb Cortex. 2008;18(5):1169–1178. doi: 10.1093/cercor/bhm149.
- Steinbeis N, Koelsch S, Sloboda JA. The role of harmonic expectancy violations in musical emotions: evidence from subjective, physiological, and neural responses. J Cogn Neurosci. 2006;18(8):1380–1393. doi: 10.1162/jocn.2006.18.8.1380.
- Zacks JM. Event perception and memory. Annu Rev Psychol. 2020;71:165–191. doi: 10.1146/annurev-psych-010419-051101.