2023 Sep 10;61(1):e14435. doi: 10.1111/psyp.14435

Individual prediction tendencies do not generalize across modalities

Juliane Schubert 1, Nina Suess 1, Nathan Weisz 1,2
PMCID: PMC10909557  PMID: 37691098

Abstract

Predictive processing theories, which model the brain as a “prediction machine”, explain a wide range of cognitive functions, including learning, perception and action. Furthermore, it is increasingly accepted that aberrant prediction tendencies play a crucial role in psychiatric disorders. Given this explanatory value for clinical psychiatry, prediction tendencies are often implicitly conceptualized as individual traits or as tendencies that generalize across situations. As this has not yet explicitly been shown, in the current study, we quantify to what extent the individual tendency to anticipate sensory features of high probability generalizes across modalities. Using magnetoencephalography (MEG), we recorded brain activity while participants were presented with a sequence of four different (either visual or auditory) stimuli, which changed according to predefined transitional probabilities of two entropy levels: ordered vs. random. Our results show that, on a group‐level, under conditions of low entropy, stimulus features of high probability are preactivated in the auditory but not in the visual modality. Crucially, the magnitude of the individual tendency to predict sensory events seems not to correlate between the two modalities. Furthermore, reliability statistics indicate poor internal consistency, suggesting that the measures from the different modalities are unlikely to reflect a single, common cognitive process. In sum, our findings suggest that quantification and interpretation of individual prediction tendencies cannot be generalized across modalities.

Keywords: auditory processing, inter‐individual differences, magnetoencephalography, predictive processing, visual processing

Short abstract

Our research challenges the implicit assumption of “prediction tendency” as a subject‐specific, potentially unified trait. Using magnetoencephalography (MEG), we replicate previous findings for fine‐tuned anticipatory predictions in the auditory modality. Additionally, our results clearly demonstrate significant differences in how predictive processes are engaged in the visual modality. We show that measures from the different modalities are unlikely to reflect a single, common cognitive process.

1. INTRODUCTION

In our complex sensory environments, the amount, as well as the level of ambiguity, of information entering our senses requires a system that can filter and integrate information efficiently. As such a system, our brain has to rely fully on its sensory input to infer the true states of the objects that cause this input. This inference results in the formation of a so‐called internal model, from which predictions can be drawn. Even though the assumed implementations of these predictions vary across different “predictive brain” theories (Friston, 2010; Knill & Pouget, 2004; Yon et al., 2019), most agree on the importance of predictions for perception in general. Bottom‐up‐driven sensory representations in the brain inevitably lag behind the events that caused them; compensatory mechanisms, such as the prediction of future events, are therefore highly beneficial in everyday life (e.g., for correctly locating moving objects). Indeed, such anticipation and pre‐activation of sensory input has been found for visual perception (Hogendoorn & Burkitt, 2018), as well as for language processing (Dikker & Pylkkänen, 2013). Furthermore, we recently showed that “prediction tendency”, which we defined as the tendency to anticipate auditory features of high probability before their occurrence, contributes to explaining individual differences in speech tracking (Schubert et al., 2023).

Complementary to our results, various research has pointed out that individuals seem to vary in the extent to which they rely on top‐down priors compared to bottom‐up sensory signals for perception. Crucially, these differences in handling predictions, or in overall “prediction tendency”, have been associated with clinical psychological conditions, such as autism (Sinha et al., 2014), schizophrenia (Sterzer et al., 2018), depression (Barrett et al., 2016), PTSD (Kube et al., 2020) and tinnitus (Partyka et al., 2019; Sedley et al., 2016). Given this explanatory value for clinical psychiatry, prediction tendency is often implicitly conceptualized as an individual trait or as a tendency that generalizes across situations. Indeed, in line with predictive brain theories for psychosis (Sterzer et al., 2018), it has been found that strong predictive tendencies can promote phantom perception in the auditory modality (Powers et al., 2017). Together with our finding that the aforementioned internal mechanisms are related to differences in speech tracking (Schubert et al., 2023), this strongly suggests that individual prediction tendencies are generalizable across different listening situations. However, the extent to which they generalize across different modalities remains unclear.

Furthermore, there are considerable differences between modalities (e.g., auditory vs. visual) in the predictability of sensory information and, therefore, also in the way predictions should best be applied. Visual information (e.g., a single object or a whole scene) naturally unfolds across space, and sensory inference is often drawn from spatially surrounding information. In audition, however, this is not necessarily the case. Consider speech and music, for example: their predictability mainly unfolds over time, which seems to be a core characteristic of auditory objects. It follows that, in the auditory domain, two types of predictions are usually distinguished, carrying spectral (what is going to happen) or temporal (when is it going to happen) information (Auksztulewicz et al., 2018; Wollman & Morillon, 2018). In the visual modality, on the other hand, the focus is often on object‐based (what is it) and spatial (where is it) predictions, based on dual‐stream theories of vision (Rao & Ballard, 2005). Nevertheless, a correct anticipation of sensory events should include all of these aspects (what, when and where) in both modalities, especially since predictions that generalize and complement across modalities are highly beneficial in natural settings (integrating, e.g., lip movements and acoustic speech; Crosse et al., 2016; Sumby & Pollack, 1954). If “prediction tendency” can indeed be conceptualized as an individual trait, it should be similar across situations, as well as across modalities. Complementing correlation estimates, standard measures of internal consistency should help in evaluating whether measures from the auditory and visual modality reflect a putatively common construct. In such a case, internal consistency measures should be sufficiently high (≥ 0.7; Tavakol & Dennick, 2011).
Inherent differences between auditory and visual perception, however, emphasize the importance of comparative research in the field of predictive processing.

In the current study, we recorded magnetoencephalographic (MEG) data during visual and auditory versions of an entropy modulation paradigm (see also Demarchi et al., 2019; Schubert et al., 2023) to capture individual prediction tendencies for both modalities. In this paradigm, participants were presented with a sequence of four different (either visual or auditory) stimuli, which changed according to predefined transitional probabilities of two entropy levels: ordered vs. random. In order to optimize comparability, we kept spatial predictability constant (visual stimuli were presented at screen center and auditory stimuli were presented binaurally at phantom center), as well as temporal predictability (with a fixed stimulation rate of 3 Hz), modulating only one stimulus feature in each modality. In the visual version, participants were presented with a sequence of Gabor patches that changed in orientation, whereas in the auditory version, participants listened to pure tones of different frequencies. For both modalities, individual prediction tendency was defined as the tendency to anticipate and pre‐activate stimulus features of high probability.

2. METHOD

2.1. Subjects

In total, 35 subjects (16 male, mean age = 32.5, range = 19–57) were recruited to participate in the experiment. All participants reported normal hearing and had normal or corrected‐to‐normal vision. They gave written informed consent and reported no previous neurological or psychiatric disorders. The experimental procedure was approved by the ethics committee of the University of Salzburg and was carried out in accordance with the Declaration of Helsinki. All participants received either a reimbursement of 10 € per hour or course credits for their participation.

2.2. Experimental procedure

Before the start of the experiment, participants' individual head shapes were assessed using cardinal head points (nasion and pre‐auricular points), digitized with a Polhemus Fastrak Digitiser (Polhemus), as well as around 300 points on the scalp. For every participant, MEG sessions started with a 5‐min resting‐state recording, after which the individual hearing threshold was determined using a pure tone of 1043 Hz. This was followed by four experimental blocks (two blocks per modality) of an entropy modulation paradigm. Participants started with either the auditory or the visual blocks (chosen randomly for each participant). In the auditory paradigm (see also Schubert et al., 2023), participants passively listened to sequences of four different pure tones (1: 440 Hz, 2: 587 Hz, 3: 782 Hz, 4: 1043 Hz) while watching a landscape movie. All tones were presented binaurally at equal volume for the left and right ears (i.e., at phantom center) at 40 dB above the individual hearing threshold. In the visual paradigm, a Gabor patch (spatial frequency: 0.01 cycles/pixel, sigma: 60 pixels, phase: 90°) was presented at four perceptually different orientations (1: 0°, 2: 45°, 3: 90°, 4: 135°) in the center of the screen while participants listened to podcasts from “Bayern 2—radioWissen”: “Kakao—Das braune Gold” (Cocoa—the brown gold) and “Zimt—Die Würze des Lebens” (Cinnamon—the spice of life). There was no task; participants were simply instructed to relax and to move as little as possible. To ensure temporal predictability, all stimuli were presented at a fixed stimulation rate of 3 Hz, and each stimulus presentation (i.e., pure tone or Gabor patch) lasted for 100 ms. One block consisted of a sequence of 2800 trials, totalling 5600 trials per modality. Transitional probabilities between stimuli (1, 2, 3, 4) were determined by two different entropy levels (ordered vs. random, see Figure 1a). Entropy changed pseudorandomly within each block every 700 trials.
While in an “ordered” context, certain transitions (hereinafter referred to as forward transitions, i.e., 1 → 2, 2 → 3, 3 → 4, 4 → 1) were to be expected with a high probability of 75%, self‐repetitions (e.g., 1 → 1, 2 → 2, …) were rather unlikely, with a probability of 25%. In a “random” context, however, all possible transitions (including forward transitions and self‐repetitions) were equally likely, with a probability of 25% (see Figure 1a and see also Schubert et al., 2023). Furthermore, a pseudorandom 10% of the trials (i.e., ~70 trials per stimulus, per entropy, per modality) were omitted (as in Demarchi et al., 2019). Between each of the four blocks (~15 min each), there was a short break. In total, the experiment lasted approximately 2 h per participant (including MEG preparation time). The experiment was coded and conducted with Psychtoolbox‐3 (Brainard, 1997; Kleiner et al., 2007), with an additional class‐based library (“Objective Psychophysics Toolbox”, o_ptb) on top of it (Hartmann & Weisz, 2020).
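The transitional‐probability structure described above can be sketched as a first‐order Markov chain. The following is an illustrative Python reconstruction (the original sequences were generated with Psychtoolbox‐3/o_ptb; function and variable names here are our own). It takes the description literally: in the ordered context only forward transitions (p = .75) and self‐repetitions (p = .25) occur.

```python
import numpy as np

N_STIM = 4  # four pure tones or four Gabor orientations (0-indexed here)

def transition_matrix(entropy: str) -> np.ndarray:
    """Transitional probabilities for one entropy condition.

    "ordered": forward transitions (1->2, 2->3, 3->4, 4->1) with p = .75,
    self-repetitions with p = .25; "random": all transitions with p = .25.
    """
    T = np.zeros((N_STIM, N_STIM))
    if entropy == "ordered":
        for s in range(N_STIM):
            T[s, (s + 1) % N_STIM] = 0.75  # forward transition
            T[s, s] = 0.25                 # self-repetition
    elif entropy == "random":
        T[:] = 1.0 / N_STIM                # all transitions equally likely
    else:
        raise ValueError(entropy)
    return T

def generate_sequence(entropy: str, n_trials: int, seed: int = 0) -> np.ndarray:
    """Draw a stimulus sequence from the first-order Markov chain."""
    rng = np.random.default_rng(seed)
    T = transition_matrix(entropy)
    seq = np.empty(n_trials, dtype=int)
    seq[0] = rng.integers(N_STIM)
    for t in range(1, n_trials):
        seq[t] = rng.choice(N_STIM, p=T[seq[t - 1]])
    return seq

seq = generate_sequence("ordered", 700)  # one entropy segment of 700 trials
print(np.mean(seq[1:] == (seq[:-1] + 1) % N_STIM))  # empirical forward rate, near .75
```

Stimulus omissions (a pseudorandom 10% of trials) would be applied on top of such a sequence.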

FIGURE 1.

Quantification of individual prediction tendency. (a) Participants were presented with sequences of four different stimuli. Transitional probabilities varied according to two entropy conditions (ordered vs. random). (b) Visual stimulation consisted of Gabor patches in four different orientations; auditory stimulation consisted of pure tones at four different frequencies. An LDA classifier was used to decode stimulus feature from brain activity across time, trained on (high‐probability) ordered forward‐transition trials and tested on all repetition trials. (c) Expected classifier decision values contrasting the brain's pre‐stimulus tendency to predict a forward transition between the two entropy levels. Individual prediction tendency quantification results from the summed difference between conditions (ordered > random) over pre‐stimulus time.

2.3. MEG data acquisition and preprocessing

A whole‐head MEG system (Elekta Neuromag Triux, Elekta Oy, Finland), placed within a standard, passive magnetically shielded room (AK3b, Vacuumschmelze, Germany), was used to capture magnetic brain activity with a sampling frequency of 1 kHz (hardware filters: 0.1–330 Hz). The signal was recorded with 102 magnetometers and 204 orthogonally placed planar gradiometers at 102 different positions. In a first step, a signal space separation algorithm, implemented in the Maxfilter program (version 2.2.15) provided by the MEG manufacturer, was used to clean the data of external noise and realign data from different blocks to a common standard head position. Data preprocessing was performed using MATLAB R2020b (The MathWorks, Natick, Massachusetts, USA) and the FieldTrip Toolbox (Oostenveld et al., 2010). To identify eye‐blink and heartbeat artifacts, 50 independent components were computed from filtered (0.1–100 Hz) continuous data of the first experimental block of both modalities (auditory + visual). On average, 2.6 (range = 2–5) components were removed for each subject. All data were filtered between 0.1 Hz and 30 Hz (Kaiser‐windowed finite impulse response filter) and downsampled to 100 Hz. Then, the data of each block were epoched into segments of 1200 ms (from 400 ms before stimulus onset to 800 ms after onset) for further analysis (as in Schubert et al., 2023).

2.4. Decoding analysis

Multivariate pattern analyses were carried out using the MVPA‐Light package (Treder, 2020) and were conducted separately for each modality. Prior to the classification analysis, we excluded the first 20 trials after each new entropy onset (resulting in a total of 2720 trials per entropy and modality). This decision was based on the number of trials it would take an ideal Bayesian observer to reach a posterior probability above 0.5 for a certain stimulus class in an ordered context, which was modeled using the HGF toolbox (Frässle et al., 2021).
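The logic behind this exclusion criterion can be illustrated with a much simpler ideal observer than the HGF model used in the paper: an observer that keeps Dirichlet pseudo‐counts over the four possible transitions from each stimulus and updates them trial by trial (a hypothetical stand‐in, not the authors' model). With a flat prior, only a handful of observations per stimulus are needed before the posterior probability of the forward transition exceeds 0.5, which is of the same order as the 20 excluded trials.

```python
import numpy as np

N_STIM = 4

def trials_to_confident_forward(seed: int = 0, p_forward: float = 0.75,
                                max_trials: int = 700) -> int:
    """Trial at which the posterior probability of a forward transition
    exceeds .5 for all four stimuli in an ordered context (-1 if never)."""
    rng = np.random.default_rng(seed)
    counts = np.ones((N_STIM, N_STIM))  # flat Dirichlet prior over transitions
    prev = rng.integers(N_STIM)
    for trial in range(1, max_trials + 1):
        # ordered context: forward transition with p = .75, else self-repetition
        nxt = (prev + 1) % N_STIM if rng.random() < p_forward else prev
        counts[prev, nxt] += 1
        prev = nxt
        post = counts / counts.sum(axis=1, keepdims=True)  # row-wise posteriors
        fwd = post[np.arange(N_STIM), (np.arange(N_STIM) + 1) % N_STIM]
        if np.all(fwd > 0.5):
            return trial
    return -1

print(trials_to_confident_forward())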

A multi‐class linear discriminant analysis (LDA) classifier was used to decode stimulus feature (i.e., 1–4, sound frequency or Gabor‐patch orientation) from brain activity between −0.3 and 0.3 s in a time‐resolved manner. Based on the resulting classifier decision values (i.e., d1–d4 for every test trial and time point), we calculated the individual prediction tendency within each modality (as in Schubert et al., 2023): We define individual prediction tendency as the tendency to pre‐activate stimulus features of high probability (i.e., a forward transition from one stimulus to another: 1 → 2, 2 → 3, 3 → 4, 4 → 1). In order to capture any prediction‐related neural activity, we trained the classifier exclusively on ordered forward trials. Afterwards, the classifier was tested on self‐repetition trials, providing classifier decision values for every stimulus feature, which were then transformed into corresponding transitions (e.g., d1(t) | 1(t − 1) “dval for 1 at trial t, given that 1 was presented at trial t − 1” → repetition, d2(t) | 1(t − 1) → forward, …). The tendency to represent a forward vs. repetition transition was contrasted for both ordered and random trials. Using self‐repetition trials for testing, we ensured a fair comparison between the ordered and random contexts (with an equal probability and the same preceding bottom‐up input) and that test trials were always different from training trials. To ensure that classifier decisions could not have been biased by the stimulus presentation at t − 2, ordered and random trials were matched for the preceding stimuli at t − 1 (using only repetitions from t − 1 → t) and t − 2 (using only forward transitions from t − 2 → t − 1). This resulted in a total of 84–124 (mean = 108.6) test trials in the auditory modality and 92–126 (mean = 108.77) test trials in the visual modality per subject and entropy condition (ordered trials were randomly subselected to match the number of random trials).
Thus, our time window of interest (−0.3–0 s) should not contain any carry‐over effects of preceding stimuli (note that stimulus identity can be classified until 700 ms after presentation, see Figure S1, Supplementary Material). Additionally, we also conducted a similar analysis (which can be found in more detail in Demarchi et al., 2019), where we trained the classifier on random trials and time‐generalized its performance to capture potential prestimulus representations in both conditions (see Supplementary Material). Based on the consideration that stimulus “predictions” are not necessarily the same as poststimulus representations, we focus our interpretation on the analysis approach in which the classifier has been trained on ordered trials (and was able to capture “prediction”‐specific patterns).

Thus, we quantified “prediction tendency” as the classifier's pre‐stimulus tendency toward a forward transition in an ordered context exceeding the same tendency in a random context (which can be attributed to carry‐over processing of the preceding stimulus). Using the summed difference across pre‐stimulus time, one value can then be extracted per subject, per modality.
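In code, this quantification might look as follows (a sketch with hypothetical array layouts; the actual analysis used MVPA‐Light in MATLAB, and the function and argument names below are our own):

```python
import numpy as np

def prediction_tendency(dvals_ordered, dvals_random,
                        prev_ordered, prev_random, times, n_stim=4):
    """Summed pre-stimulus difference (ordered > random) in the decision
    value assigned to the forward-transition class.

    dvals_*: (n_trials, n_stim, n_times) classifier decision values,
    prev_*:  stimulus identity at t-1 for each test trial (0-indexed),
    times:   (n_times,) time axis in seconds relative to stimulus onset.
    """
    def mean_forward_dval(dvals, prev):
        fwd_class = (prev + 1) % n_stim           # forward transition from t-1
        per_trial = dvals[np.arange(len(prev)), fwd_class, :]
        return per_trial.mean(axis=0)             # average dval time course

    diff = (mean_forward_dval(dvals_ordered, prev_ordered)
            - mean_forward_dval(dvals_random, prev_random))
    prestim = times < 0                           # -0.3 to 0 s window
    return diff[prestim].sum()                    # one value per subject/modality
```

A positive value indicates that, before stimulus onset, the classifier leaned more toward the high‐probability forward transition in the ordered than in the random context.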

2.5. Statistical analysis

The classifier's tendency toward a forward transition in an ordered context was compared to the same tendency in a random context, using a cluster‐based permutation t‐test (10,000 random permutations, a cluster alpha of 0.025 and a Monte Carlo critical alpha of 0.025) over a time window from −0.3 to 1.5 s. This contrast was calculated for each modality separately. Afterwards, the summed difference across pre‐stimulus time was extracted for each subject and modality, resulting in individual prediction tendency values. These single values were then used to calculate Pearson's correlation coefficient between auditory and visual prediction tendency. To extend our findings with classical estimates of internal consistency, we also calculated Cronbach's alpha and Spearman‐Brown reliability estimates, assuming that prediction tendency can be understood as a multi‐dimensional concept, here captured with a two‐item (auditory, visual) scale. Additionally, we calculated the reliability of predictions between modalities (Spearman‐Brown coefficient: rho = 2 * r/(1 + r); formula taken from Eisinga et al., 2013) for all possible pre‐stimulus time‐by‐time combinations. All analyses were done in MATLAB version R2021a.
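The three reliability statistics for a two‐item scale can be computed in a few lines (an illustrative Python implementation; the function name is our own):

```python
import numpy as np

def two_item_reliability(aud, vis):
    """Pearson's r, Cronbach's alpha and Spearman-Brown rho for a
    two-item (auditory, visual) scale of prediction tendency values."""
    aud = np.asarray(aud, dtype=float)
    vis = np.asarray(vis, dtype=float)
    r = np.corrcoef(aud, vis)[0, 1]
    # Cronbach's alpha for k = 2 items: k/(k-1) * (1 - sum(item var) / total var)
    k = 2
    item_var = aud.var(ddof=1) + vis.var(ddof=1)
    total_var = (aud + vis).var(ddof=1)
    alpha = k / (k - 1) * (1 - item_var / total_var)
    rho = 2 * r / (1 + r)  # Spearman-Brown (Eisinga et al., 2013)
    return r, alpha, rho

# For the reported across-modality r = .18, the Spearman-Brown coefficient is
# 2 * .18 / 1.18, i.e. about .305, in line with the .306 reported in Table 1.
```

Note that for two items the Spearman‐Brown coefficient is a monotone function of r, so the two statistics necessarily agree on the ordering of reliabilities.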

3. RESULTS

3.1. Anticipatory predictions can be found mainly in the auditory modality

To investigate anticipatory prediction, participants were presented with a sequence of four different (either visual or auditory) stimuli, which changed according to predefined transitional probabilities of two entropy levels: ordered vs. random. In a first step, we were interested in whether we would see a pre‐activation of high‐probability stimulus features in brain activity across subjects. Cluster‐based permutation testing showed an overall auditory prediction tendency (i.e., ordered > random) in two pre‐stimulus clusters, from −0.23 to −0.2 s (p = .035) and from −0.17 to −0.07 s (p = 6.9 × 10−4). For visual prediction tendency, we found only a trend suggesting prestimulus predictions in a short cluster from −0.24 to −0.22 s (p = .056). Using a different approach, in which we trained on random trials and time‐generalized classifier performance into a prestimulus window, we found a similar picture, indicating stimulus pre‐activations in the auditory but not in the visual modality (see Figure S2, Supplementary Material). These results indicate that, irrespective of individual differences, there seems to be an overall tendency to anticipate stimulus features of increased probability in the auditory modality (see Figure 2a). In the visual modality, however, the results are less conclusive, suggesting that anticipatory predictions might not be a general group phenomenon.

FIGURE 2.

(a) Time‐resolved prediction tendency for auditory and visual modalities: On a group level, there seems to be an anticipatory tendency to represent stimulus features of high probability (i.e., forward transition) in a predictable context in the auditory modality, but not in the visual modality (the y‐axis represents the difference in dvals (ordered − random) for a forward transition; the solid lines on the x‐axis indicate significant time points (based on a random permutation test using an alpha level of 0.025)). (b) Correlation of individual prediction tendency between modalities: There is no significant correlation between auditory and visual prediction tendency (the solid line represents a perfect generalization from one modality to another; the dashed line indicates the least‐squares regression line). (c) Temporal generalization of forward predictions across modalities: early anticipatory predictions in the auditory modality seem to generalize best to late anticipatory predictions in the visual modality; however, only poor reliability is indicated overall (colormap represents the Spearman‐Brown coefficient; reliability benchmark taken from Tavakol & Dennick, 2011; N = 35).

3.2. Individual prediction tendency is not correlated between modalities

In a second step, we wanted to further investigate prediction tendency as a subject‐specific, potentially unified concept. For this purpose, we looked particularly into the generalization between auditory and visual modalities. Using the summed prediction tendency values across pre‐stimulus time (−0.3 to 0 s), one value was extracted per subject, per modality. Even though individual prediction tendency values were, on average, higher for the auditory (mean = 0.076, SD = 0.164) than the visual (mean = 0.059, SD = 0.127) modality, there was no significant difference in a one‐sample t‐test (t = 0.533, 95% CI = [−0.048, 0.082]). In total, 20 subjects showed a stronger auditory prediction tendency, compared to 15 subjects with a stronger prediction tendency in the visual domain (see Figure 2b). Crucially, we found no relationship of individual prediction tendency between modalities, using Pearson's correlation (r = .18, p = .300; see Figure 2b). To test the assumption that both measures represent the same concept of individual prediction tendency, we further calculated Cronbach's alpha and Spearman‐Brown rho to estimate internal consistency (see Table 1). In order to be able to interpret the reliability across modalities, we also calculated the internal (split‐half) consistency within each modality. We found a significant positive correlation between prediction tendency estimated from one half of the trials (randomly selected) and the other half in the auditory (r = .74, p = 3.73 × 10−7) as well as in the visual modality (r = .58, p = 3.00 × 10−4). Because there seems to be some disagreement as to which reliability statistic should be used for measures comprising two items, we decided to report all three here (Eisinga et al., 2013). In an attempt to account for individual temporal dynamics of each modality, we also estimated the reliability for all possible pre‐stimulus time‐by‐time combinations.
It seems that early auditory predictions generalize best to late visual predictions, with overall low reliability (see Figure 2c). Crucially, based on guidelines from classical test theory, our results indicate that the effect is too weak to provide a reliable generalization from one modality to another (Tavakol & Dennick, 2011). In sum, these findings suggest that individual prediction tendencies do not necessarily generalize across different modalities.

TABLE 1.

Comparison of correlation coefficient, Cronbach's alpha and Spearman‐Brown coefficient.

                                    Pearson's r   Cronbach's alpha   Spearman‐Brown rho
Reliability across modalities           .18             .298                .306
Auditory (split‐half) reliability       .74             .850                .851
Visual (split‐half) reliability         .58             .710                .730

Note: Reliability was calculated treating prediction tendency as a two‐item (auditory, visual) scale.

4. DISCUSSION

Predictive processing has been found throughout the range of human cognition: in language perception and production (e.g., Pickering & Garrod, 2013), object recognition (e.g., Oliva & Torralba, 2007), motor perception and action (e.g., Shipp et al., 2013), emotional awareness (e.g., Smith et al., 2019), and decision making (e.g., Summerfield & de Lange, 2014). However, the extent to which all these predictions can be attributed to a common mechanism remains unclear. Beyond that, several conceptual and mathematical predictive brain models have been proposed to explain cognitive and perceptual pathologies in clinical neuroscience (for a review, see Smith et al., 2021). Considering that psychiatric conditions, such as depression, autism or schizophrenia, are assumed to be highly individualized and stable across time and situations, estimations of prediction tendency are likely to be interpreted as a “trait‐dependent” rather than a “state‐dependent” measure. In this regard, prediction tendency should (a) vary considerably across individuals, (b) show long‐term stability within an individual and (c) generalize across different situations. Yet research that validates this claim is scarce. In the current study, we aimed to investigate to what extent the individual tendency to anticipate sensory features of high probability generalizes across modalities. Even though aberrant predictive tendencies, especially linked to psychosis, have been proposed for auditory (Corlett et al., 2019) as well as visual perception (Adams et al., 2012), this is, to our knowledge, the first study to directly compare individual prediction tendencies between two modalities. Our results show that under conditions of low entropy, stimulus features of high probability are preactivated in the auditory modality irrespective of individual differences. In the visual modality, however, evidence for such preactivations is less clear.
Crucially, this tendency to predict sensory events seems not to correlate between the two modalities, suggesting that the quantification and interpretation of prediction tendencies cannot be generalized.

Even though we only found inconclusive evidence for anticipatory predictions in the visual modality on a group level (with the lowest p‐value of .056 not reaching significance), we argue that our focus lies on individual differences, which should be compared across modalities independent of their group‐level significance. However, we can conclude that less than 4% of total variance in visual prediction tendency can be explained by auditory prediction tendency and vice versa. We argue that this seems too weak to support the assumption that individual prediction tendency can be generalized from one modality to another. Considering that the same metric was applied in both cases, we primarily rely on standards that are also used to assess reliability, for which acceptable values usually range between 0.7 and 0.95 (Tavakol & Dennick, 2011). Additionally, it should be noted that our prediction tendency quantification was derived from the decoding of stimulus features from brain activity, a measure that is highly affected by the subject‐specific signal‐to‐noise ratio (as a result of anatomical differences or individual distributive properties, a problem often discussed when calculating source localization; Puce & Hämäläinen, 2017). It can therefore be assumed that decoding accuracies are correlated within subjects, and it should be considered that a certain proportion of the relationship is driven by individual differences in signal‐to‐noise ratio alone. In addition, it should be considered that our quantification might not be a “pure” measure of prediction tendency, which reduces internal consistency, even within a modality. However, reliability statistics indicate acceptable internal consistency within each modality (see Table 1). In sum, we support the notion that scientific evidence should not be judged solely on the basis of its significance, but that the strength of effects should be evaluated within the framework of underlying assumptions.
We therefore conclude that our estimates of reliability are too weak to allow a generalization across modalities.

Our results indicate that individuals who show a stronger auditory prediction tendency do not necessarily show a similarly strong tendency for visual predictions. Indeed, it seems that, overall, individuals do not show similar anticipatory predictions in the visual modality as they do in the auditory modality. There are different possible explanations for these findings. In the current paradigm, stimulus predictability was modulated via transitional probabilities in a sequence of events, a feature that is quite characteristic of auditory stimuli, such as music and speech, but perhaps less so of visual stimulation (ten Oever et al., 2014). To make optimal use of top‐down inferences in auditory processing, sensory information has to be integrated over (past) time (and extrapolated into the future) and, crucially, has to be accurately located within time. Although visual processing has to meet similar requirements, particularly when confronted with moving objects, the motivation for anticipatory representations might be different (e.g., to facilitate suitable motor action in time), potentially resulting in increasingly goal‐driven predictions in the visual modality (for a review, see Fiehler et al., 2019). Such differences in motivation could very well lead to modality‐specific implementations of anticipatory predictions, which might explain the weak generalization between modalities in our results.

Indeed, it seems that the extent to which an observer (vs. a listener) has to be actively engaged in the sampling of information is different between vision and audition. For example, information that gradually unfolds over time, such as a spoken sentence, can (under optimal noise conditions) be received without any active adjustments from the listener (although see, e.g., Gehmacher et al., 2022, for top‐down influence on auditory nerve activity). Brain responses to unexpected changes in auditory stimulation (the classic “mismatch negativity”), for example, have even been found during sleep (Loewy et al., 1996). Meanwhile, visual information, which is usually spread throughout space, extending beyond the foveal field of the observer, requires active effort (e.g., eye, head or body movements) to be sampled. According to some theories, however, active sampling of sensory evidence is crucial for the formation of internal models and the predictions that come from them (Friston et al., 2017). Since our paradigm was entirely passive, no beneficial gain was to be expected from focusing on the sequences. Thus, participants likely did not allocate their attentional resources to them. If the proposed link between predictions and active engagement is stronger in vision, a lack of attention and goal‐directedness could explain why we found automatic anticipatory predictions in the auditory but not in the visual modality. We suggest that future research should investigate the (potentially different) roles of attention for predictive processing in vision versus audition.

A different explanation is that statistical learning (SL), defined as "a general capacity for picking up regularities" (Siegelman et al., 2017), differs qualitatively between modalities. This individual capacity is implicit in our conceptualization of "prediction tendency", as we quantify to what extent an individual anticipates sensory features of high probability (compared to a low‐probability context). Our results are consistent with findings in that area, which suggest that SL likewise cannot be generalized across modalities (Siegelman et al., 2017). Crucially, it has been found that visual SL performance is higher (comparable to auditory SL performance for sequences) when information is distributed spatially rather than temporally (Conway & Christiansen, 2009). This finding supports our interpretation that overall prediction tendency for sequential input differs between the auditory and the visual modality. Finally, it should be noted that, in sequential SL paradigms, the observer's internal model of input probability is based on past experience of only a few trials. Individual prediction tendency might therefore not be stationary but fluctuate even within a given regularity context (see, e.g., Notaro et al., 2019). A related limitation of the current paradigm is that our quantification of prediction tendency includes the observer's ability to distinguish between an ordered and a random context. This way, strong priors that are immune to changes in contextual probability (i.e., anticipatory predictions toward a favored transition in a random context) remain undetected. An interesting future perspective would be to examine the generalizability of predictive tendencies across different regularity contexts.
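The quantification discussed here — anticipation of high‐probability features in an ordered relative to a random context, and the (poor) consistency of that index across the two modalities — can be sketched as follows. This is an illustrative reading, not the paper's exact pipeline: the variable names and the simple window averaging are assumptions, while the two‐item Spearman–Brown step‐up follows Eisinga et al. (2013).

```python
import numpy as np

def prediction_tendency(dvals_ordered, dvals_random):
    """Illustrative per-subject index: mean prestimulus classifier evidence
    (d-values) for the high-probability 'forward' stimulus in the ordered
    context minus the same quantity in the random context.

    dvals_ordered, dvals_random : arrays of decision values, e.g.
    (n_trials, n_prestimulus_timepoints). Hypothetical input format.
    """
    return float(np.mean(dvals_ordered) - np.mean(dvals_random))

def spearman_brown(r):
    """Spearman-Brown step-up reliability for a two-item 'scale'
    (here: the auditory and visual tendency measures; Eisinga et al., 2013),
    given the Pearson correlation r between the two items."""
    return 2 * r / (1 + r)

# Toy example: a subject with strong auditory but absent visual anticipation
auditory_pt = prediction_tendency(np.array([[0.4, 0.5]]), np.array([[0.1, 0.2]]))
visual_pt = prediction_tendency(np.array([[0.2, 0.2]]), np.array([[0.2, 0.2]]))
```

Under a trait account, `auditory_pt` and `visual_pt` should correlate across subjects and yield acceptable `spearman_brown` reliability; the paper's finding is that they do not.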

5. CONCLUSIONS AND IMPLICATIONS FOR FUTURE RESEARCH

Predictive processing theories, which model the brain as a "prediction machine", are able to explain a wide range of cognitive functions, including learning, perception and action. Furthermore, it is widely accepted that aberrant prediction tendencies play a crucial role in psychiatric disorders. Crucially, in clinical neuroscience, prediction tendency is often implicitly conceptualized as an individual trait or as a tendency that generalizes across situations, yet research validating this claim is scarce. In the current study, we found evidence for anticipatory predictions in the auditory but not in the visual modality. Furthermore, our results suggest that quantification and interpretation of individual prediction tendency cannot be generalized across modalities. This emphasizes the importance of further research investigating and validating the implicit assumption of "prediction tendency" as a trait, since this assumption bears concrete implications for the growing field of computational modeling in neuroscience and psychiatry. We propose that future research focus on the validation of state‐ vs. trait‐dependent assumptions in predictive processing theories, and we strongly suggest investigating the intraindividual stability of prediction tendency within and across modalities.

AUTHOR CONTRIBUTIONS

Juliane Schubert: Conceptualization; data curation; formal analysis; investigation; methodology; visualization; writing – original draft. Nina Suess: Data curation; writing – review and editing. Nathan Weisz: Conceptualization; funding acquisition; project administration; supervision; writing – review and editing.

CONFLICT OF INTEREST STATEMENT

The authors declare no competing financial interests.

Supporting information

Data S1.

PSYP-61-e14435-s001.docx (15.2KB, docx)

Figure S1. Time‐resolved decoding accuracy for auditory and visual features: sound frequency as well as Gabor patch orientation can be classified from brain activity from ~100 ms until ~700 ms after stimulus onset in a random context (shaded area indicates 95% CI, and dashed line shows the chance level of 0.25; N = 35).

PSYP-61-e14435-s004.png (553.6KB, png)

Figure S2. Tendency to represent a “forward” transition (compared to a “repetition” transition), separately for different entropy levels and modalities. On a group‐level, there seems to be an anticipatory tendency to represent stimulus features of high probability (i.e., forward transition) in a predictable context in the auditory modality, but not in the visual modality. (Note that this figure shows the same data as Figure 2a, but with separate lines for each entropy condition. The y‐axis represents the classifier dvals for a “forward” transition before subtraction (ordered − random), and the solid lines on the x‐axis indicate significant time‐points (ordered > random); N = 35).

PSYP-61-e14435-s002.png (842.8KB, png)

Figure S3. Temporal generalization of feature‐specific activations in the auditory (left) and visual (right) modality. Left: In the auditory modality there is a significant difference in the generalization from post‐ to prestimulus processing between entropy levels. This suggests that in an ordered, but not in a random, context people generate feature‐specific anticipatory predictions that resemble bottom‐up processing. Right: In the visual modality, however, we find no significant evidence for a regularity‐dependent generalization from post‐ to prestimulus activations. (The y‐axis represents classifier poststimulus training‐time, x‐axis represents classifier pre‐ and poststimulus testing‐time and the dashed‐gray line indicates the diagonal; T‐values are shown in color, and marked outlines indicate a significant cluster in the comparison ordered vs. random; N = 35).

PSYP-61-e14435-s003.png (613.5KB, png)

ACKNOWLEDGMENTS

J.S. was supported by the Austrian Science Fund (FWF; Doctoral College “Imaging the Mind”; W 1233‐B). N.S. was supported by the Austrian Science Fund, P31230 (“Audiovisual speech entrainment in deafness”) and P34237 (“Impact of face masks on speech comprehension”). Thanks to the whole research team. Special thanks to Manfred Seifter for his support in conducting the measurements.

Schubert, J. , Suess, N. , & Weisz, N. (2024). Individual prediction tendencies do not generalize across modalities. Psychophysiology, 61, e14435. 10.1111/psyp.14435

DATA AVAILABILITY STATEMENT

The raw data and the analysis code of this study are available from the corresponding author upon reasonable request.

REFERENCES

  1. Adams, R. A. , Perrinet, L. U. , & Friston, K. (2012). Smooth pursuit and visual occlusion: Active inference and oculomotor control in schizophrenia. PLoS One, 7(10), e47502. 10.1371/journal.pone.0047502 [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Auksztulewicz, R. , Schwiedrzik, C. M. , Thesen, T. , Doyle, W. , Devinsky, O. , Nobre, A. C. , Schroeder, C. E. , Friston, K. J. , & Melloni, L. (2018). Not all predictions are equal: “What” and “when” predictions modulate activity in auditory cortex through different mechanisms. The Journal of Neuroscience, 38(40), 8680–8693. 10.1523/JNEUROSCI.0369-18.2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Barrett, L. F. , Quigley, K. S. , & Hamilton, P. (2016). An active inference theory of allostasis and interoception in depression. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1708), 20160011. 10.1098/rstb.2016.0011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10(4), 433–436. 10.1163/156856897X00357 [DOI] [PubMed] [Google Scholar]
  5. Conway, C. M. , & Christiansen, M. H. (2009). Seeing and hearing in space and time: Effects of modality and presentation rate on implicit statistical learning. European Journal of Cognitive Psychology, 21(4), 561–580. 10.1080/09541440802097951 [DOI] [Google Scholar]
  6. Corlett, P. R. , Horga, G. , Fletcher, P. C. , Alderson‐Day, B. , Schmack, K. , & Powers, A. R. (2019). Hallucinations and strong priors. Trends in Cognitive Sciences, 23(2), 2. 10.1016/j.tics.2018.12.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Crosse, M. J. , Liberto, G. M. D. , & Lalor, E. C. (2016). Eye can hear clearly now: Inverse effectiveness in natural audiovisual speech processing relies on long‐term crossmodal temporal integration. Journal of Neuroscience, 36(38), 9888–9895. 10.1523/JNEUROSCI.1396-16.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Demarchi, G. , Sanchez, G. , & Weisz, N. (2019). Automatic and feature‐specific prediction‐related neural activity in the human auditory system. Nature Communications, 10(1), 3440. 10.1038/s41467-019-11440-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Dikker, S. , & Pylkkänen, L. (2013). Predicting language: MEG evidence for lexical preactivation. Brain and Language, 127(1), 55–64. 10.1016/j.bandl.2012.08.004 [DOI] [PubMed] [Google Scholar]
  10. Eisinga, R. , te Grotenhuis, M. , & Pelzer, B. (2013). The reliability of a two‐item scale: Pearson, Cronbach, or Spearman‐Brown? International Journal of Public Health, 58(4), 637–642. 10.1007/s00038-012-0416-3 [DOI] [PubMed] [Google Scholar]
  11. Fiehler, K. , Brenner, E. , & Spering, M. (2019). Prediction in goal‐directed action. Journal of Vision, 19(9), 10. 10.1167/19.9.10 [DOI] [PubMed] [Google Scholar]
  12. Frässle, S. , Aponte, E. A. , Bollmann, S. , Brodersen, K. H. , Do, C. T. , Harrison, O. K. , Harrison, S. J. , Heinzle, J. , Iglesias, S. , Kasper, L. , Lomakina, E. I. , Mathys, C. , Müller‐Schrader, M. , Pereira, I. , Petzschner, F. H. , Raman, S. , Schöbi, D. , Toussaint, B. , Weber, L. A. , … Stephan, K. E. (2021). TAPAS: An open‐source software package for translational neuromodeling and computational psychiatry. Frontiers in Psychiatry, 12, 680811. 10.3389/fpsyt.2021.680811 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Friston, K. (2010). The free‐energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. 10.1038/nrn2787 [DOI] [PubMed] [Google Scholar]
  14. Friston, K. , FitzGerald, T. , Rigoli, F. , Schwartenbeck, P. , & Pezzulo, G. (2017). Active inference: A process theory. Neural Computation, 29(1), 1–49. 10.1162/NECO_a_00912 [DOI] [PubMed] [Google Scholar]
  15. Gehmacher, Q. , Reisinger, P. , Hartmann, T. , Keintzel, T. , Rösch, S. , Schwarz, K. , & Weisz, N. (2022). Direct Cochlear recordings in humans show a theta rhythmic modulation of auditory nerve activity by selective attention. Journal of Neuroscience, 42(7), 1343–1351. 10.1523/JNEUROSCI.0665-21.2021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Hartmann, T. , & Weisz, N. (2020). An Introduction to the objective psychophysics toolbox. Frontiers in Psychology, 11, 585437. 10.3389/fpsyg.2020.585437 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Hogendoorn, H. , & Burkitt, A. N. (2018). Predictive coding of visual object position ahead of moving objects revealed by time‐resolved EEG decoding. NeuroImage, 171, 55–61. 10.1016/j.neuroimage.2017.12.063 [DOI] [PubMed] [Google Scholar]
  18. Kleiner, M. , Brainard, D. H. , Pelli, D. , Ingling, A. , Murray, R. , & Broussard, C. (2007). What's new in Psychtoolbox‐3. Perception, 36, 1–16. 10.1068/v070821 [DOI] [Google Scholar]
  19. Knill, D. C. , & Pouget, A. (2004). The Bayesian brain: The role of uncertainty in neural coding and computation. Trends in Neurosciences, 27(12), 12. 10.1016/j.tins.2004.10.007 [DOI] [PubMed] [Google Scholar]
  20. Kube, T. , Berg, M. , Kleim, B. , & Herzog, P. (2020). Rethinking post‐traumatic stress disorder—A predictive processing perspective. Neuroscience & Biobehavioral Reviews, 113, 448–460. 10.1016/j.neubiorev.2020.04.014 [DOI] [PubMed] [Google Scholar]
  21. Loewy, D. H. , Campbell, K. B. , & Bastien, C. (1996). The mismatch negativity to frequency deviant stimuli during natural sleep. Electroencephalography and Clinical Neurophysiology, 98(6), 493–501. 10.1016/0013-4694(96)95553-4 [DOI] [PubMed] [Google Scholar]
  22. Notaro, G. , van Zoest, W. , Altman, M. , Melcher, D. , & Hasson, U. (2019). Predictions as a window into learning: Anticipatory fixation offsets carry more information about environmental statistics than reactive stimulus‐responses. Journal of Vision, 19(2), 8. 10.1167/19.2.8 [DOI] [PubMed] [Google Scholar]
  23. Oliva, A. , & Torralba, A. (2007). The role of context in object recognition. Trends in Cognitive Sciences, 11(12), 520–527. 10.1016/j.tics.2007.09.009 [DOI] [PubMed] [Google Scholar]
  24. Oostenveld, R. , Fries, P. , Maris, E. , & Schoffelen, J.‐M. (2011). FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience, 2011, e156869. 10.1155/2011/156869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Partyka, M. , Demarchi, G. , Roesch, S. , Suess, N. , Sedley, W. , Schlee, W. , & Weisz, N. (2019). Phantom auditory perception (tinnitus) is characterised by stronger anticipatory auditory predictions. bioRxiv, 869842. 10.1101/869842 [DOI] [Google Scholar]
  26. Pickering, M. J. , & Garrod, S. (2013). An integrated theory of language production and comprehension. The Behavioral and Brain Sciences, 36(4), 329–347. 10.1017/S0140525X12001495 [DOI] [PubMed] [Google Scholar]
  27. Powers, A. R. , Mathys, C. , & Corlett, P. R. (2017). Pavlovian conditioning‐induced hallucinations result from overweighting of perceptual priors. Science (New York, N.Y.), 357(6351), 596–600. 10.1126/science.aan3458 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Puce, A. , & Hämäläinen, M. S. (2017). A review of issues related to data acquisition and analysis in EEG/MEG studies. Brain Sciences, 7(6), 6. 10.3390/brainsci7060058 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Rao, R. P. N. , & Ballard, D. H. (2005). CHAPTER 91—Probabilistic models of attention based on iconic representations and predictive coding. In Itti L., Rees G., & Tsotsos J. K. (Eds.), Neurobiology of attention (pp. 553–561). Academic Press. 10.1016/B978-012375731-9/50095-1 [DOI] [Google Scholar]
  30. Schubert, J. , Schmidt, F. , Gehmacher, Q. , Bresgen, A. , & Weisz, N. (2023). Cortical speech tracking is related to individual prediction tendencies. Cerebral Cortex, 33(11), 6608‐6619. 10.1093/cercor/bhac528 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Sedley, W. , Friston, K. J. , Gander, P. E. , Kumar, S. , & Griffiths, T. D. (2016). An integrative tinnitus model based on sensory precision. Trends in Neurosciences, 39(12), 799–812. 10.1016/j.tins.2016.10.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Shipp, S. , Adams, R. A. , & Friston, K. J. (2013). Reflections on agranular architecture: Predictive coding in the motor cortex. Trends in Neurosciences, 36(12), 706–716. 10.1016/j.tins.2013.09.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Siegelman, N. , Bogaerts, L. , Christiansen, M. H. , & Frost, R. (2017). Towards a theory of individual differences in statistical learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1711), 20160059. 10.1098/rstb.2016.0059 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Sinha, P. , Kjelgaard, M. M. , Gandhi, T. K. , Tsourides, K. , Cardinaux, A. L. , Pantazis, D. , Diamond, S. P. , & Held, R. M. (2014). Autism as a disorder of prediction. Proceedings of the National Academy of Sciences, 111(42), 15220–15225. 10.1073/pnas.1416797111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Smith, R. , Badcock, P. , & Friston, K. J. (2021). Recent advances in the application of predictive coding and active inference models within clinical neuroscience. Psychiatry and Clinical Neurosciences, 75(1), 3–13. 10.1111/pcn.13138 [DOI] [PubMed] [Google Scholar]
  36. Smith, R. , Parr, T. , & Friston, K. J. (2019). Simulating emotions: An active inference model of emotional state inference and emotion concept learning. Frontiers in Psychology, 10, 2844. 10.3389/fpsyg.2019.02844 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Sterzer, P. , Adams, R. A. , Fletcher, P. , Frith, C. , Lawrie, S. M. , Muckli, L. , Petrovic, P. , Uhlhaas, P. , Voss, M. , & Corlett, P. R. (2018). The predictive coding account of psychosis. Biological Psychiatry, 84(9), 634–643. 10.1016/j.biopsych.2018.05.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Sumby, W. H. , & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. The Journal of the Acoustical Society of America, 26(2), 212–215. 10.1121/1.1907309 [DOI] [Google Scholar]
  39. Summerfield, C. , & de Lange, F. P. (2014). Expectation in perceptual decision making: Neural and computational mechanisms. Nature Reviews Neuroscience, 15(11), 745–756. 10.1038/nrn3838 [DOI] [PubMed] [Google Scholar]
  40. Tavakol, M. , & Dennick, R. (2011). Making sense of Cronbach's alpha. International Journal of Medical Education, 2, 53–55. 10.5116/ijme.4dfb.8dfd [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. ten Oever, S. , Schroeder, C. E. , Poeppel, D. , van Atteveldt, N. , & Zion‐Golumbic, E. (2014). Rhythmicity and cross‐modal temporal cues facilitate detection. Neuropsychologia, 63, 43–50. 10.1016/j.neuropsychologia.2014.08.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Treder, M. S. (2020). MVPA‐light: A classification and regression toolbox for multi‐dimensional data. Frontiers in Neuroscience, 14, 289. 10.3389/fnins.2020.00289 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Wollman, I. , & Morillon, B. (2018). Organizational principles of multidimensional predictions in human auditory attention. Scientific Reports, 8(1), 1. 10.1038/s41598-018-31878-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Yon, D. , de Lange, F. P. , & Press, C. (2019). The predictive brain as a stubborn scientist. Trends in Cognitive Sciences, 23(1), 1. 10.1016/j.tics.2018.10.003 [DOI] [PubMed] [Google Scholar]
