PLOS Biology. 2020 Aug 26;18(8):e3000840. doi: 10.1371/journal.pbio.3000840

Cortical tracking of speech in noise accounts for reading strategies in children

Florian Destoky 1,*,#, Julie Bertels 1,2,#, Maxime Niesen 1,3, Vincent Wens 1,4, Marc Vander Ghinst 1, Jacqueline Leybaert 5, Marie Lallier 6, Robin A A Ince 7, Joachim Gross 7,8, Xavier De Tiège 1,4, Mathieu Bourguignon 1,5,6
Editor: Timothy D Griffiths
PMCID: PMC7478533  PMID: 32845876

Abstract

Humans’ propensity to acquire literacy relates to several factors, including the ability to understand speech in noise (SiN). Still, the nature of the relation between reading and SiN perception abilities remains poorly understood. Here, we dissect the interplay between (1) reading abilities, (2) classical behavioral predictors of reading (phonological awareness, phonological memory, and rapid automatized naming), and (3) electrophysiological markers of SiN perception in 99 elementary school children (26 with dyslexia). We demonstrate that, in typical readers, cortical representation of the phrasal content of SiN relates to the degree of development of the lexical (but not sublexical) reading strategy. In contrast, classical behavioral predictors of reading abilities and the ability to benefit from visual speech to represent the syllabic content of SiN account for global reading performance (i.e., speed and accuracy of lexical and sublexical reading). In individuals with dyslexia, we found preserved integration of visual speech information to optimize processing of syntactic information but not to sustain acoustic/phonemic processing. Finally, within children with dyslexia, measures of cortical representation of the phrasal content of SiN were negatively related to reading speed and positively related to the compromise between reading precision and reading speed, potentially owing to compensatory attentional mechanisms. These results clarify the nature of the relation between SiN perception and reading abilities in typical child readers and children with dyslexia and identify novel electrophysiological markers of emergent literacy.


Humans’ propensity to acquire literacy relates to several factors, one of which is the ability to understand speech in noise. This neuroimaging study reveals that reading abilities and neuronal traces of speech processing in noise are related in multiple specific ways.

Introduction

Acquiring literacy is tremendously important in our societies. Central for reading acquisition are adequate phonological awareness [1–3], phonological memory [4,5], and rapid automatized naming (RAN) [6–8]. The adequacy of the learning environment also plays a major role [9,10]. In particular, the presence of recurrent noise in the learning environment can substantially hinder reading acquisition [11,12]. Therefore, the ability to understand speech in noise (SiN)—which is known to differ among individuals [13,14]—should modulate the negative impact of environmental noise on reading acquisition. And indeed, the quality of brainstem responses to syllables in noise predicts reading abilities and their precursors [15]. Moreover, individuals with dyslexia often exhibit a SiN perception deficit [16,17] that is particularly apparent when the background noise is composed of speech [18]. This deficit has been hypothesized to be rooted in a deficit in phonological awareness [19,20], but contradictory reports do exist [21]. It thus remains an open question whether SiN processing abilities relate to reading because of a common dependence on classical behavioral predictors (i.e., phonological awareness, phonological memory, and RAN) or on other cognitive or neurophysiological processes specific to SiN processing. Furthermore, which aspects of reading and SiN processing abilities are related is also unexplored. Understanding these relations is especially important given that acoustic noise is ubiquitous and given how adverse dyslexia can be for the cognitive and social development of children.

Reading is a multifaceted process. Hence, it is reasonable to think that SiN processing might relate to a restricted set of aspects of reading. Following the dual-route cascaded model, reading in languages with alphabetic orthographies is supported by two separate routes: the sublexical and the lexical routes [22,23], although these routes interact according to other models of reading [24]. The sublexical route implements grapheme-to-phoneme conversion. It is used when reading unfamiliar words or pseudowords, but it is not suitable for correctly reading irregular words (e.g., yacht) or for acquiring fluent reading. Skilled reading relies on the lexical route, which supports fast recognition of the orthographic word form of familiar words. The lexical route is indispensable for reading irregular words, and it benefits the reading of regular words much more than the reading of pseudowords. Remarkably, the brain is thought to implement these two reading strategies in two distinct neural pathways, mostly in the left hemisphere [25–29].

There are also several distinct aspects of SiN processing that could relate to reading, and these can be derived from electrophysiological recordings of brain activity during connected-speech listening. When listening to connected speech, human auditory cortical activity tracks the fluctuations of the speech temporal envelope at frequencies matching the speech hierarchical linguistic structures, i.e., phrases/sentences (0.2–1.5 Hz) and words/syllables (2–8 Hz) [30–40]. Such cortical tracking of speech (CTS) is thought to be essential for speech comprehension [33,35,37,39,41–43]. Most convincingly, speech intelligibility can be enhanced by speech-matched transcranial electrical stimulation of auditory cortices [42,44]. The corresponding brain oscillations are thought to subserve the segmentation or parsing of incoming connected speech to promote speech recognition [33,34,39,41,45]. In SiN conditions, child and adult brains preferentially track the attended speech rather than the global auditory scene, though with reduced fidelity (especially in the right hemisphere) when the noise hinders comprehension [30,31,40,46–56]. Assessing CTS in noise can therefore provide objective measures of the impact of noise on the cortical representation of the different hierarchical linguistic structures of speech. Also relevant is how SiN perception is affected by noise properties. In essence, the relevant parameters for an acoustic noise in SiN conditions are the degrees of energetic and informational masking [57]. The noise is energetic when it overlaps spectrotemporally with the speech signal and is nonenergetic otherwise. The noise is informational when it is made up of other speech signals (as with multitalker babble, even in an unknown language, but not when time-reversed) and noninformational otherwise [58–60]. An energetic noise introduces physical interference, and an informational noise introduces perceptual interference. Finally, to enhance SiN processing, humans also benefit from visual information about the speaker’s articulatory mouth movements [61,62]. All these aspects of SiN perception can be captured by measures of CTS.

In this study, we investigated the relations between reading abilities, neural representations of SiN quantified with CTS, and classical behavioral predictors of reading in elementary school children. To fully characterize cortical SiN processing, we measured CTS in several types of background noises introducing different levels of energetic and informational masking and in conditions in which the face of the speaker was visible (“lips”) or not (“pics”) while talking. This study was designed to answer four major questions: (1) What aspects of cortical SiN processing and reading abilities are related in typically developing elementary school children? (2) To what extent are these relations mediated by classical behavioral predictors of reading? (3) Are these different aspects of cortical SiN processing altered in children with dyslexia in comparison with typical readers matched for age or reading level? (4) What aspects of cortical SiN processing and reading abilities are related in children with dyslexia? As preliminary steps to tackle these questions, we identify relevant features of CTS in noise and assess in a global analysis the nature of the information about reading brought by all the identified features of CTS in noise and classical behavioral predictors of reading abilities.

Results

We first report on 73 children with typical reading abilities. Then, we report on 26 children with dyslexia matched with a subsample of the 73 typical readers for age (n = 26) or reading level (n = 26). Both control groups were included to determine whether development or reading experience can explain any SiN deficits that might be uncovered [63]. Reading performance and its classical behavioral predictors were characterized in a comprehensive cognitive evaluation (Table 1). Children’s brain activity was recorded with magnetoencephalography (MEG) while they were attending to four videos of approximately 6 min each. Each video featured nine conditions: one noiseless and eight SiN conditions resulting from the combination of four types of noise with lips or pics visual inputs (Fig 1, S1 Fig, and S1 Video). The opposite- and same-gender babble noises introduced informational interference and a similar degree of energetic masking (see S1 Methods). The least- and most-energetic nonspeech noises introduced a degree of energetic masking consistent with their names but no informational interference.

Table 1. Mean and standard deviation of behavioral scores in each reading group of 26 children and comparisons (t tests) between groups.

| Behavioral measure | Children with dyslexia (mean ± SD) | Age-matched controls (mean ± SD) | Reading-level controls (mean ± SD) | Dyslexia vs age-matched: p, t(df) | Dyslexia vs reading-level: p, t(df) |
| --- | --- | --- | --- | --- | --- |
| Chronological age | 10.2 ± 1.08 | 9.97 ± 1.01 | 7.76 ± 0.60 | 0.36, 0.93 | <0.0001, 10.3 |
| Nonverbal IQ | 111 ± 11 | 114 ± 10 | 112 ± 9 | 0.30, −1.04 | 0.784, −0.28 |
| Socioeconomic status | 6.12 ± 2.44 | 6.96 ± 1.45 | 6.96 ± 2.47 | 0.14, −1.50 | 0.17, −1.40 |
| Alouette reading accuracy | 89.0 ± 5.7 | 96.2 ± 2.1 | 89.0 ± 6.46 | <0.0001, 6.07 | 0.988, 0.01 |
| Alouette reading speed | 141 ± 61 | 292 ± 91 | 138 ± 64 | <0.0001, 7.04 | 0.867, 0.17 |
| Irregular words reading (words/s) | 0.54 ± 0.33 | 1.16 ± 0.44 | 0.40 ± 0.35 | <0.0001, 5.82 | 0.15, 1.47 |
| Regular words reading (words/s) | 0.73 ± 0.41 | 1.35 ± 0.41 | 0.61 ± 0.35 | <0.0001, 5.51 | 0.29, 1.06 |
| Pseudowords reading (words/s) | 0.42 ± 0.24 | 0.78 ± 0.30 | 0.39 ± 0.21 | <0.0001, 4.88 | 0.61, 0.50 |
| Visual attention | 30.3 ± 3.74 | 32.0 ± 2.69 | 27.4 ± 4.43 | 0.070, −0.95 | 0.014, 2.53 |
| Phoneme suppression | 7.92 ± 2.15 | 9.04 ± 1.75 | 8.42 ± 1.27 | 0.046, 2.05 | 0.313, −1.02 |
| Phoneme fusion | 7.73 ± 1.59 | 9.31 ± 0.97 | 8.92 ± 1.16 | <0.0001, 4.32 | 0.003, 3.09 |
| Forward digit span | 5.08 ± 0.84 | 5.8 ± 0.69 | 5.15 ± 0.78 | 0.001, 3.41 | 0.735, −0.34 |
| Backward digit span | 3.69 ± 0.79 | 4.5 ± 1.33 | 3.38 ± 0.75 | 0.011, 2.66 | 0.156, 1.44 |
| RAN time (s) | 24.4 ± 7.84 | 20.1 ± 3.02 | 30.6 ± 7.51 | 0.013, 2.59 | 0.005, 2.91 |
| TAP mean response time (ms) | 627 ± 99.0 | 613 ± 75.4 | 667 ± 93.4 | 0.59, 0.53 | 0.07, 1.86 |
| TAP SD response time (ms) | 140 ± 45.0 | 129 ± 30.3 | 171 ± 46.7 | 0.33, 0.98 | 0.02, 2.36 |
| TAP correct responses | 15.6 ± 0.58 | 15.7 ± 0.68 | 15.3 ± 1.07 | 0.42, −0.81 | 0.11, 1.65 |
| TAP false responses | 2.15 ± 2.26 | 0.84 ± 1.28 | 1.21 ± 0.97 | 0.014, 2.54 | 0.89, 0.13 |

The number of df was 50 for all comparisons, except those involving auditory attention (TAP) scores (children with dyslexia versus controls in age, df = 49; children with dyslexia versus controls in reading level, df = 38) and socioeconomic status (children with dyslexia versus controls in age, df = 49; children with dyslexia versus controls in reading level, df = 47).

Abbreviations: df, degrees of freedom; IQ, intelligence quotient; RAN, rapid automatized naming; SD, standard deviation; TAP, test of attentional performance

Fig 1. Illustration of the experimental material used in the neuroimaging assessment.

Fig 1

Subjects viewed four videos of approximately 6 min in duration in which a different narrator (two females, two males) told a story. Each video was divided into 10 blocks to which experimental conditions were assigned. There were two blocks of the noiseless condition, and eight blocks of speech-in-noise conditions: one block for each possible combination of the four types of noise and two types of visual display. The interference introduced by the noise was either informational or not and varied in terms of degree of energetic masking. Power spectra are presented for all types of noise (colored traces) and one of the attended speeches (gray traces; here, that of a female narrator). The visual display provided visual speech information (lips) or not (pics).

For each condition, we regressed the temporal envelope of the attended speech on MEG signals with several time lags using ridge regression and cross-validation (see Methods for details) [64]. The ensuing regression model was used to reconstruct the speech temporal envelope from the recorded MEG signal. CTS was computed as the correlation between the genuine and reconstructed speech temporal envelopes. We did this for MEG and speech envelope signals filtered at 0.2–1.5 Hz (phrasal rate) [30,65] and 2–8 Hz (syllabic rate) [50,54,66,67] and for MEG sensor signals in the left and right hemispheres separately because the cortical bases of reading and SiN processing are hemispherically asymmetric [25–29,31,40].
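Purely as an illustration of this decoding approach, the following minimal sketch (hypothetical function and variable names; not the exact pipeline of the study, which used cross-validated regularization; see Methods and [64]) shows how a backward ridge model can reconstruct the speech envelope from lagged MEG signals and how CTS is then obtained as a Pearson correlation.

```python
import numpy as np

def lagged_design(meg, lags):
    """Stack time-lagged copies of each MEG channel (meg: n_times x n_channels)."""
    n_times, n_chan = meg.shape
    X = np.zeros((n_times, n_chan * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(meg, lag, axis=0)
        if lag > 0:            # zero the samples that wrapped around
            shifted[:lag] = 0
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * n_chan:(i + 1) * n_chan] = shifted
    return X

def fit_ridge_decoder(meg_train, envelope_train, lags, alpha=1.0):
    """Backward model: predict the speech envelope from lagged MEG signals."""
    X = lagged_design(meg_train, lags)
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ envelope_train)

def cortical_tracking(meg_test, envelope_test, weights, lags):
    """CTS: correlation between genuine and reconstructed speech envelopes."""
    reconstructed = lagged_design(meg_test, lags) @ weights
    return np.corrcoef(reconstructed, envelope_test)[0, 1]

# Hypothetical usage, with signals band-pass filtered beforehand at 0.2-1.5 Hz
# (phrasal) or 2-8 Hz (syllabic), separately per hemisphere:
# w = fit_ridge_decoder(meg_train, env_train, lags=range(0, 25))
# phrasal_cts = cortical_tracking(meg_test, env_test, w, lags=range(0, 25))
```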

S1 Table presents the percentage of the 73 typical readers showing statistically significant phrasal and syllabic CTS for both hemispheres and each condition. All typical readers showed significant phrasal CTS in the noiseless and nonspeech noise conditions, and most of them still did in the babble noise conditions (mean ± SD across conditions, 98.3% ± 2.1%). Most of the typical readers showed significant syllabic CTS in the noiseless and nonspeech noise conditions (93.8% ± 3.2%), and slightly fewer of them did in the babble noise conditions (80.1% ± 4.3%). This result clearly indicates that CTS can be robustly assessed at the subject level.

S1 Data provides all participants’ behavioral and CTS values on which the remainder of the results is based.

What aspects of SiN processing modulate the measures of CTS in noise?

First, we identified the main factors modulating CTS in SiN conditions. To that end, we evaluated with linear mixed-effects modeling how the normalized CTS (nCTS) in SiN conditions depends on hemisphere, noise properties, and visibility of the talker’s lips. The nCTS is a contrast between CTS in SiN (CTSSiN) and noiseless (CTSnoiseless) conditions defined as

nCTS = (CTSSiN − CTSnoiseless) / (CTSSiN + CTSnoiseless)

(see Methods for further technical details). It takes values between −1 and 1, with negative values indicating that the noise reduces CTS. Such a contrast has the advantage of being specific to SiN processing abilities by factoring out the global level of CTS in the noiseless condition. In that analysis, nCTS values were corrected (linear regression intertwined with outlier fixing) for age, time spent at school, and intelligence quotient (IQ) (see S2 Methods).
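As a minimal sketch (hypothetical names; the study’s actual correction also intertwined outlier fixing with the regression, see S2 Methods), the nCTS contrast and a simple residualization on age, time spent at school, and IQ could be computed as follows.

```python
import numpy as np

def ncts(cts_sin, cts_noiseless):
    """Normalized CTS, bounded in [-1, 1]; negative when noise reduces CTS."""
    return (cts_sin - cts_noiseless) / (cts_sin + cts_noiseless)

def residualize(values, age, school_time, iq):
    """Remove linear effects of age, time spent at school, and IQ (simplified)."""
    covariates = np.column_stack([np.ones_like(age), age, school_time, iq])
    beta, *_ = np.linalg.lstsq(covariates, values, rcond=None)
    return values - covariates @ beta
```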

Table 2 presents the final linear mixed-effects model of phrasal and syllabic nCTS, and Fig 2 illustrates underlying values.

Table 2. Factors included in the final linear mixed-effects model fit to the nCTS (independent variable) at phrasal rate and at syllabic rate.

Factors are listed in their order of inclusion.

| Factor | df | 𝒳² | p |
| --- | --- | --- | --- |
| Phrasal nCTS | | | |
| Noise | 3 | 598 | <0.0001 |
| Visual | 1 | 127 | <0.0001 |
| Hemisphere | 1 | 17.3 | <0.0001 |
| Noise × visual | 3 | 67.7 | <0.0001 |
| Noise × hemisphere | 3 | 11.0 | 0.012 |
| Syllabic nCTS | | | |
| Noise | 3 | 333 | <0.0001 |
| Visual | 1 | 21.1 | <0.0001 |
| Hemisphere | 1 | 10.5 | 0.0012 |

Abbreviation: nCTS, normalized cortical tracking of speech

Fig 2. Impact of the main fixed effects on the nCTS at phrasal (A) and syllabic rates (B).

Mean and SEM values are displayed as a function of noise properties. The four traces correspond to visual conditions with the speaker’s talking face visible (lips; black traces) and with static pictures illustrating the story (pics; gray traces), within the left (lh; connected traces) and right (rh; dashed traces) hemispheres. nCTS values are bounded between −1 and 1, with values below 0 indicating lower CTS in speech-in-noise conditions than in noiseless conditions. S2 Data contains the underlying data for this figure. lh, left hemisphere; nCTS, normalized cortical tracking of speech; rh, right hemisphere.

The pattern of how nCTS changed with the different types of noise was overall similar for phrasal and syllabic nCTS. Nonspeech noise did not substantially change CTS (nCTS was close to 0). However, babble noise resulted in a substantial reduction of CTS compared with the noiseless condition in both hemispheres and irrespective of the availability of visual speech information. That is, nCTS in babble noise conditions was roughly between −0.1 and −0.3, indicating that CTS in babble noise was 20%–50% lower than CTS in noiseless conditions (inverting the nCTS formula gives CTSSiN/CTSnoiseless = (1 + nCTS)/(1 − nCTS)).

Availability of visual speech information (lips conditions) increased the level of nCTS only in babble noise conditions for phrasal nCTS and in all noise conditions for syllabic nCTS.

Finally, the noise impacted nCTS differently in the left and right hemispheres. The phrasal nCTS was higher in the left than in the right hemisphere in babble noise conditions, whereas the reverse held for syllabic nCTS in all noise conditions.

Note that in the lips conditions, wherein participants saw the narrator’s talking face, visual cortical activity driven by articulatory mouth movements could have contributed to nCTS values. However, such visual contribution was actually negligible (see S1 Results).

In summary, CTS is mostly impacted by babble noise and is also modulated by the availability of visual speech and by the hemisphere (only in babble noise conditions for phrasal CTS and in all noise conditions for syllabic CTS). These observations guided the elaboration of eight relevant features (contrasts) of nCTS in SiN conditions (see S3 Methods): the global level of nCTS and its informational, visual, and hemispheric modulations, each computed for phrasal and syllabic nCTS. In the next sections, we unravel the associations between these features, reading abilities, and classical behavioral predictors of reading. Note the absence of circularity in this approach because features of nCTS were not selected based on their relation with behavioral scores [68]. On a technical note, seeking associations with a limited set of features of nCTS rather than with all nCTS values (32 = 4 noise conditions × 2 visual conditions × 2 hemispheres × 2 frequency ranges of interest) was necessary to avoid introducing close-to-collinear regressors in subsequent analyses and to decrease random errors on nCTS estimates.
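The exact definitions of these contrasts are given in S3 Methods. Purely to illustrate the logic, and assuming simple difference and average contrasts over a per-band nCTS array (these definitions are ours, not necessarily those of S3 Methods), the four features per frequency band could be formed along the following lines.

```python
import numpy as np

def ncts_features(ncts_band):
    """Hypothetical feature contrasts for one frequency band (phrasal or syllabic).

    ncts_band: array of shape (4, 2, 2) = noise type (least-energetic nonspeech,
    most-energetic nonspeech, opposite-gender babble, same-gender babble)
    x visual condition (pics, lips) x hemisphere (left, right).
    """
    babble, nonspeech = ncts_band[2:], ncts_band[:2]
    return {
        "global_level": ncts_band.mean(),
        "informational_modulation": babble.mean() - nonspeech.mean(),
        "visual_modulation": ncts_band[:, 1].mean() - ncts_band[:, 0].mean(),
        "hemispheric_modulation": ncts_band[:, :, 0].mean() - ncts_band[:, :, 1].mean(),
    }
```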

What is the nature of the information about reading abilities brought by measures of SiN processing and classical behavioral predictors of reading?

Having identified relevant features of cortical SiN processing, we first evaluated to what extent these features and classical behavioral predictors of reading bring information about reading abilities in a single, statistically controlled analysis. More precisely, we used partial information decomposition (PID) to dissect the information about reading abilities (target) brought by behavioral scores (first set of explanatory variables) and features of the nCTS in noise (second set of explanatory variables) [69,70]. Generally speaking, PID can reveal to what extent two sets of explanatory variables bring unique information about a target (information present in one set but not in the other), redundant information (information common to the two sets), and synergistic information (information emerging from the interaction of the two sets). Here, the target consisted of five reading scores: (1) an accuracy and (2) a speed score for the reading of a connected meaningless text (Alouette test) and scores (number of correctly read words per unit of time) for the reading of lists of (3) irregular words, (4) regular words, and (5) pseudowords. The first set of explanatory variables, i.e., the classical behavioral predictors of reading, consisted of five measures indexing phonological awareness (scores on phoneme suppression and fusion tasks), phonological memory (scores on forward and backward digit repetition), and the RAN score. The second set of explanatory variables consisted of the eight features of nCTS in SiN conditions identified in the previous subsection. Again, in that analysis, all measures were corrected for age, time spent at school, and IQ (see S2 Methods). For statistical assessment and conversion into easily interpretable z-scores, measures of information were compared to the distribution of these measures obtained after permuting reading scores across subjects (see S4 Methods).
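As a rough sketch of the PID logic only (not the estimator used in the study), the snippet below computes mutual information under a Gaussian model and decomposes it with the simple minimum-mutual-information redundancy measure; significance in the study was assessed against a permutation distribution of reading scores.

```python
import numpy as np

def gaussian_mi(x, y):
    """Mutual information (nats) between two 2D column sets under a Gaussian model."""
    det = lambda a: np.linalg.det(np.atleast_2d(np.cov(a, rowvar=False)))
    return 0.5 * np.log(det(x) * det(y) / det(np.hstack([x, y])))

def pid_two_sources(target, source_a, source_b):
    """Two-source PID using the minimum-mutual-information redundancy measure."""
    i_a = gaussian_mi(source_a, target)
    i_b = gaussian_mi(source_b, target)
    i_joint = gaussian_mi(np.hstack([source_a, source_b]), target)
    redundancy = min(i_a, i_b)
    unique_a = i_a - redundancy          # information in source_a only
    unique_b = i_b - redundancy          # information in source_b only
    synergy = i_joint - unique_a - unique_b - redundancy
    return unique_a, unique_b, redundancy, synergy

# Hypothetical usage: target = the 5 reading scores, source_a = the 5 behavioral
# predictors, source_b = the 8 nCTS features (one row per child, all corrected).
```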

Features of nCTS in noise brought significant unique information about reading abilities (unique information, z = 2.52; p = 0.013), whereas classical behavioral predictors did not (unique information, z = 1.51; p = 0.077). Both sets of explanatory variables brought significant redundant but not synergistic information about reading (redundant information, z = 4.22; p = 0.0007; synergistic information, z = 0.68; p = 0.22).

Further supporting the result that features of nCTS bring significant unique information about reading, this information measure was significantly higher than its permutation distribution in which features of nCTS (rather than reading scores) were permuted across subjects (p = 0.009), and so was the value of redundant information (p = 0.004). Of note, the unique information about reading brought by classical behavioral predictors was significantly higher when classical behavioral predictors were not permuted across subjects than when they were (p = 0.040), and so was the value of redundant information (p = 0.010).

These results indicate that the way the CTS is impacted by ambient noise relates to reading abilities in a way that is not fully explained by classical behavioral predictors of reading. Further analyses will therefore strive to identify which aspects of SiN processing and reading are related and which of these relations are mediated by classical behavioral predictors of reading.

Which features of SiN processing relate to reading abilities in a way that is not mediated by classical behavioral predictors of reading?

Having identified relevant features of cortical SiN processing, we evaluated to what extent these features bring information about reading abilities above and beyond that provided by classical behavioral predictors of reading. In practice, we identified with linear mixed-effects modeling (1) the set of classical behavioral predictors of reading that best explains reading abilities and (2) the set of features of nCTS in noise that brings additional information about reading abilities. Importantly, all measures were corrected for age, time spent at school, and IQ and were standardized. In that analysis, the type of reading score used to assess reading abilities was taken as a factor. Classical behavioral predictors of reading (five measures) were first entered as regressors before the features of nCTS in noise (eight measures) were considered as additional regressors.
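As a hypothetical illustration only (invented column names; the actual stepwise selection and software are described in the Methods), a mixed-effects fit including the regressors ultimately retained in Table 3 might look as follows.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format table: one row per child x reading-score type, with all
# measures corrected for age, time spent at school, and IQ, and standardized.
df = pd.read_csv("reading_and_ncts_long.csv")  # hypothetical file name

model = smf.mixedlm(
    "reading_z ~ ran + fwd_digit_span"
    " + vis_mod_phrasal_ncts * score_type"
    " + info_mod_phrasal_ncts * score_type",
    data=df,
    groups=df["subject"],           # random intercept per child
)
result = model.fit(reml=False)       # ML fit allows likelihood-ratio model comparison
print(result.summary())
```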

Table 3 presents the final linear mixed-effects model fit to reading scores. It shows that the RAN score and phonological memory (indexed by the forward digit span) relate to global reading abilities. It also shows that two aspects of SiN processing, the visual and informational modulations in phrasal nCTS, each explain a different part of the variance in reading abilities. Importantly, these two indices relate to reading in a way that depends on the type of reading score. These effects are illustrated with simple Pearson correlations in Table 4. The time needed to complete the RAN task was significantly negatively correlated with all reading scores. The forward digit span was significantly positively correlated with all reading scores. The visual modulation in phrasal nCTS was overall positively correlated with scores involving reading speed (Alouette speed score and regular, irregular, and pseudoword reading scores; significantly so for pseudoword reading only) but not with the Alouette accuracy score. The informational modulation in phrasal nCTS was characterized by a significant positive correlation with the score on irregular word reading only. Interestingly, the correlation with the score on pseudoword reading was not significant—and even negative.

Table 3. Regressors included in the final linear mixed-effects model fit to the five reading scores (dependent variables).

Regressors are listed in their order of inclusion.

| Regressor | df | 𝒳² | p |
| --- | --- | --- | --- |
| RAN | 1 | 15.8 | <0.0001 |
| Forward digit span | 1 | 11.1 | 0.0009 |
| Visual modulation in phrasal nCTS | 1 | 4.85 | 0.028 |
| Informational modulation in phrasal nCTS dependent on reading score | 5 | 15.6 | 0.0080 |
| Visual modulation in phrasal nCTS dependent on reading score | 4 | 11.1 | 0.026 |

Abbreviations: df, degrees of freedom; nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming

Table 4. Pearson correlation between measures of reading abilities and relevant brain and behavioral measures.

| Reading parameter | RAN | Forward digit span | Visual modulation in phrasal nCTS | Informational modulation in phrasal nCTS | Visual modulation in syllabic nCTS | Phoneme suppression | Phoneme fusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Alouette accuracy | −0.37** | 0.33** | 0.00 | 0.03 | 0.29* | 0.11 | 0.25* |
| Alouette speed | −0.41*** | 0.38*** | 0.21# | 0.08 | 0.30** | 0.31** | 0.30** |
| Irregular words | −0.35** | 0.42*** | 0.18 | 0.26* | 0.37** | 0.21# | 0.17 |
| Regular words | −0.42*** | 0.35** | 0.18 | 0.12 | 0.31** | 0.25* | 0.19 |
| Pseudowords | −0.34** | 0.30* | 0.31** | 0.07 | 0.23* | 0.21# | 0.11 |

***p < 0.001

**p < 0.01

*p < 0.05

#p < 0.1.

Abbreviations: nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming

We next sought to better understand the meaning of this last association (between the informational modulation in phrasal nCTS and irregular but not pseudoword reading). Given that different routes support reading of irregular words (lexical route) and pseudowords (sublexical route), the contrast between the corresponding standardized scores (irregular − pseudowords) indicates reading strategy. We henceforth refer to this index as the reading strategy index. Further strengthening the correlation pattern highlighted above for the informational modulation in phrasal nCTS, this latter index correlated even more strongly with the reading strategy index (r = 0.44, p < 0.0001; see Fig 3, left) than with the score on irregular word reading. This suggests that irregular and pseudoword reading scores bring synergistic information about the informational modulation in phrasal nCTS. To confirm this, we used PID to dissect the information about the informational modulation in phrasal nCTS (target) brought by irregular word reading scores (first explanatory variable) and pseudoword reading scores (second explanatory variable). This analysis revealed that the score on irregular word reading carried significant unique information about the informational modulation in phrasal nCTS (unique information, z = 4.92, p = 0.0052)—whereas the score on pseudowords did not (unique information, z = −0.21, p = 0.38)—and most interestingly, that these two reading scores carried significant synergistic but not redundant information about the informational modulation in phrasal nCTS (redundant information, z = −0.55, p = 0.63; synergistic information, z = 9.73, p = 0.0003).
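In code terms, the reading strategy index is simply a contrast of standardized scores; a minimal sketch (hypothetical variable names) of the index and its correlation with the informational modulation in phrasal nCTS:

```python
import numpy as np
from scipy import stats

def reading_strategy_index(z_irregular, z_pseudowords):
    """Contrast of standardized reading scores: lexical minus sublexical reliance."""
    return np.asarray(z_irregular) - np.asarray(z_pseudowords)

# Hypothetical usage with per-child standardized, covariate-corrected scores:
# r, p = stats.pearsonr(reading_strategy_index(z_irr, z_pw), info_mod_phrasal_ncts)
```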

Fig 3. Relation between the reading strategy index and the nCTS at phrasal rate.


Left—the informational modulation in phrasal nCTS as a function of the reading strategy index. Gray circles depict participants’ values, and a black trace is the regression line, with correlation and significance values indicated in the top-left corner. Right—the mean nCTS across visual conditions and both hemispheres for the four types of noise: least-energetic nonspeech (blue circles), most-energetic nonspeech (turquoise crosses), opposite-gender babble (red circles), and same-gender babble (pink crosses). Circles and crosses depict participants’ values, and full traces are the regression lines. Correlation and significance level for all noise conditions are indicated on the right of each plot. S3 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

Fig 3 (right panel) further illustrates that the reading strategy index was correlated with phrasal nCTS only in the babble noise conditions.

In summary, classical behavioral predictors of reading were informative about global reading abilities (similar correlation with all five measures of reading), whereas two aspects of the CTS in noise (informational and visual modulations in phrasal nCTS) related to specific aspects of reading (correlation with some but not all five measures of reading). The extent to which visual speech boosts phrasal CTS in noise was related to reading speed but not accuracy, and the ability to maintain adequate phrasal CTS in babble noise related to reading strategy (dominant reliance on the lexical rather than sublexical route).

Do other features of SiN processing or classical behavioral predictors of reading relate to reading abilities?

Above, we identified a set of brain and behavioral measures related to reading. Importantly, each measure was included because it explained a new part of the variance in reading abilities. But the first PID analysis revealed that brain and behavioral measures do carry significant redundant information. This means that some measures might have been left out if they explained variance that was already accounted for (i.e., if they provided mainly redundant information). Accordingly, we also ran the linear mixed-effects analysis with the nCTS and behavioral regressors that had not been included. This analysis identified an overall positive correlation between reading abilities and (1) the visual modulation in syllabic nCTS (𝒳²(1) = 9.74, p = 0.0018), (2) phoneme suppression (𝒳²(1) = 4.94, p = 0.026), and (3) phoneme fusion (𝒳²(1) = 4.00, p = 0.038). The corresponding Pearson correlation coefficients are presented in Table 4. A detailed PID analysis revealed that these “side” measures were redundant—and synergistic to some extent—with RAN and forward digit span but not with the visual and informational modulations in phrasal nCTS (see S2 Results, S3 Table, and S4 Table). Importantly, these results clarify why behavioral predictors of reading did not bring significant unique information about reading abilities: most of the variance in reading abilities they could explain (maximum |r| = 0.42; see Table 4) was also explained by the visual modulation in syllabic nCTS (maximum |r| = 0.37). And conversely, the visual modulation in syllabic nCTS was not retained in the final linear mixed-effects model of reading abilities for the same reason.

In summary, scores indexing phonological awareness (phoneme suppression and phoneme fusion) and the extent to which visual speech boosts syllabic CTS in noise (visual modulation in syllabic nCTS) relate to global reading abilities in a way that is mediated by the main classical behavioral predictors of reading we identified (RAN and forward digit span) but not by the visual and informational modulations in phrasal nCTS.

Does phonological awareness mediate SiN perception capacities?

Having identified three relations between various aspects of cortical SiN processing and reading, we now specifically test the hypothesis that each of these relations is mediated by phonological awareness. For that, we again relied on PID to decompose the information about reading abilities (target) brought by each identified feature of the CTS in noise (first explanatory variable) and the mean of the two scores indexing phonological awareness (second explanatory variable). Ensuing results are provided in S2 Table. In summary, phonological awareness mediated one aspect of the relation between reading and cortical SiN processing (relation with the benefit of visual speech to boost syllabic CTS in noise) but not the two others (relations involving phrasal CTS in noise).

Is SiN comprehension accounted for by the features of nCTS related to reading?

If the three features of nCTS related to reading abilities are to index relevant aspects of cortical SiN processing, we would expect them to relate directly to SiN comprehension. To test this, we correlated these features of nCTS with a comprehension score computed as the percentage of correct answers to a total of 40 yes/no forced-choice questions. Again, all variables were corrected for age, time spent at school, and IQ. All three correlations were positive, but none of them was significant (informational modulation in phrasal nCTS, r = 0.16, p = 0.17; visual modulation in phrasal nCTS, r = 0.20, p = 0.082; visual modulation in syllabic nCTS, r = 0.09, p = 0.47). The weakness of these associations could, however, be explained by ceiling effects in the comprehension score due to the comprehension questions being too simple. Indeed, 48% of the participants scored 38/40 or more.

Do relations between reading and features of nCTS translate to alterations in dyslexia?

We next evaluated whether the relations between features of nCTS and reading abilities translate to alterations in dyslexia. That analysis was conducted on a group of 26 children with dyslexia and on groups of 26 age-matched and 26 reading-level–matched typically developing children selected among the 73 children included in the first part of the study.

S5 Table presents the percentage of the 26 children of each reading group (children with dyslexia, controls in age, and controls in reading level) showing statistically significant phrasal and syllabic CTS in each condition. All children showed significant phrasal CTS in all conditions, except for one control in age who lacked significant CTS in one of the most challenging conditions (gender-matched babble noise without visual speech information). Qualitatively, fewer controls in reading level (than children with dyslexia and controls in age) showed significant syllabic CTS across conditions. Still, the percentage of significant CTS remained above 80%, except for controls in reading level in the most challenging noise conditions (gender-matched babble noises), which indicates that CTS could be robustly assessed at the subject level in all reading groups.

Based on the result that reading abilities relate to phrasal nCTS in babble noise and to the boost in nCTS brought by visual speech, we focused the comparison on the phrasal nCTS in lips and pics averaged across hemispheres and babble noise conditions (see Fig 4A). Phrasal nCTS in pics was similar in individuals with dyslexia and controls in reading level and higher in controls in age (significantly so in the comparison with children with dyslexia; marginally in the comparison with controls in reading level). In contrast, phrasal nCTS in lips was similar in all reading groups.

Fig 4. Comparison between children with dyslexia and controls in the measures of nCTS significantly related to reading abilities.


(A) Modulations involving phrasal nCTS. Displayed are the mean and SEM within groups (dyslexia, control in age, and control in reading level) of phrasal nCTS in the conditions with (lips) and without (pics) visual speech information. Values of nCTS were averaged across hemispheres and babble noise conditions for phrasal nCTS and across hemispheres and all noise conditions for syllabic nCTS. (B) Modulations involving syllabic nCTS. On the left is the visual modulation in syllabic nCTS. The right part is as in (A). S4 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

Based on the result that reading abilities relate to the visual modulation in syllabic nCTS, we focused the comparison on this index (see Fig 4B, left part). This revealed that individuals with dyslexia had significantly lower visual modulation in syllabic nCTS than age-matched but not reading-level–matched controls, the two latter groups showing similar levels of visual modulation in syllabic nCTS. To better understand the nature of this difference, we further compared between groups the syllabic nCTS in lips and pics averaged across hemispheres and noise conditions (see Fig 4B, right part). Syllabic nCTS in pics was similar in all reading groups, whereas in lips, it was similar in individuals with dyslexia and controls in reading level and higher in controls in age (significantly so for children with dyslexia; marginally for controls in reading level).

In summary, one aspect of cortical SiN processing (reliance on visual speech to boost phrasal nCTS) was not altered in dyslexia, whereas two other aspects (phrasal nCTS in babble noise and reliance on visual speech to boost syllabic nCTS) were altered in dyslexia in comparison with typical readers matched for age but not reading level. This suggests that these two latter aspects are altered as a consequence of reduced reading experience.

Are features of nCTS related to the severity of reading difficulties in dyslexia?

In S3 Results (complemented by S2 Fig), we show that our group with dyslexia was homogeneous in terms of reading profile but not in the severity of the reading deficit. This raises the important question of whether and how the reading deficit in dyslexia relates to nCTS in noise. In S4 Results (complemented by S3 Fig, S6 Table, S7 Table, and S8 Table), we answer this question with the same linear mixed-effects modeling approach used in typical readers. However, the results are best illustrated by Pearson correlations between reading scores and nCTS in babble noise conditions in pics and lips (all measures corrected for age, time spent at school, and IQ).

Most surprisingly, in children with dyslexia, phrasal nCTS in both lips and pics correlated significantly and negatively with all reading scores indexing reading speed but not with accuracy or strategy (see Fig 5 and S9 Table). That is, the higher the phrasal nCTS, the slower they read. Beyond that, S4 Results show that the informational modulation in phrasal nCTS correlated positively with the difference between reading accuracy and reading speed (r = 0.51; p = 0.0081). Syllabic nCTS in lips or pics did not correlate significantly with any of the reading scores in children with dyslexia (see S9 Table).

Fig 5. Relation between reading speed and the nCTS at phrasal rate in dyslexia.


On the x-axis is the mean of the four reading scores indexing reading speed: reading score for irregular, regular, and pseudowords and Alouette reading speed (converted to a number of words read per second). On the y-axis is the mean nCTS across babble noise conditions and both hemispheres for the two types of visual input: pics (orange) and lips (green). Circles depict participants’ values, and full traces are the regression lines. Correlation and significance level are indicated on the right. S5 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

Discussion

The main objective of this study was to fully characterize the nature of the relation between objective cortical measures of SiN processing and reading abilities in elementary school children. The results demonstrate that some cortical measures of SiN processing relate to reading performance and reading strategy. First, phrasal nCTS in babble (i.e., informational) noise relates to the ability to read irregular but not pseudowords, which in the dual-route cascaded model indicates maturation of the lexical route. Second, the ability to leverage visual speech to boost phrasal nCTS in babble noise relates to reading speed (but not accuracy). Third, the ability to leverage visual speech to boost syllabic nCTS in noise relates to global reading abilities. Fourth, classical behavioral predictors of reading abilities (RAN, phonological memory, and phonological awareness) relate to global reading performance and not strategy. Importantly, behavioral scores and the two features of phrasal CTS in babble noise explained different parts of the variance in reading abilities. Finally, the features of nCTS underlying the first and third relations uncovered in typical readers (phrasal nCTS in babble noise and visual modulation in syllabic nCTS) were significantly altered in dyslexia in comparison with age-matched but not reading-level–matched typically developing children. However, within the population with dyslexia, nCTS measures of the ability to deal with babble noise were negatively related to reading speed and positively related to the compromise between reading precision and reading speed.

Significant associations were found between reading abilities and some features of phrasal and syllabic nCTS. There is evidence that CTS at the phrasal rate (here taken as 0.2–1.5 Hz) partly reflects parsing or chunking of words, phrases, and sentences [71]. Indeed, the brain tracks phrase and sentence boundaries even when speech is devoid of prosody, but only if it is comprehensible [41], and the phase of brain oscillations below 4 Hz modulates the perception of ambiguous sentences [39]. CTS at the phrasal/sentential rate is thought to help align neural excitability with syntactic information to optimize language comprehension [38]. In contrast, CTS at the syllable rate (here taken as 2–8 Hz) is thought to reflect low-level auditory processing [71]. In light of the above, our results highlight that associations between SiN perception and reading abilities build on their shared reliance on both language processing and low-level auditory processing.

Robustness of cortical speech representation to babble noise indexes the degree of development of the lexical route

Our results indicate that an objective cortical measure of the ability to deal with babble noise relates to the maturation of the lexical route. Technically, the informational modulation in phrasal nCTS correlated significantly and positively with the reading score on irregular but not pseudowords. The reading score on irregular words indeed provided unique information about the informational modulation in nCTS. Also, the two reading scores in synergy provided some additional information about the informational modulation in nCTS. Furthermore, the result that the informational modulation in nCTS correlated more strongly with the reading strategy index than with the score on irregular words suggests that the key elements at the basis of this relation are the processes needed to read irregular words that are not needed to read pseudowords.

The relation between the degree of development of the lexical route and the level of phrasal nCTS in babble noise could be explained by a positive influence of good SiN abilities on reading acquisition. Let us take as an example the situation of being faced for the first time with a written word that is read aloud by a teacher while some classmates are making noise. SiN abilities will naturally determine the odds of hearing that word properly and hence the odds of building up the orthographic lexicon. When later reading the word alone, only children with good SiN abilities will have the opportunity to train their lexical route for that specific word. Of course, the same chain of events could be posited for the training of grapheme–phoneme correspondence. But there are many more words than phonemes and syllables, so good SiN abilities might be more important for successfully learning the correspondence between irregular words’ orthographic and phonological representations. Indeed, grapheme–phoneme correspondence is intensively trained when learning to read. Children are repeatedly exposed to examples of successful grapheme–phoneme correspondence, some with noise and some without. Accordingly, no matter what children’s SiN abilities are, they will learn the grapheme–phoneme correspondence and develop their sublexical route provided that they have adequate phonological awareness. Supporting this, phonological awareness does not predict SiN abilities in typical readers [21].

Alternatively, the relation between the ability to read irregular words (which tags the degree of development of the lexical route) and nCTS in babble noise could be mediated by the degree of maturation of the mental lexicon [72,73]. The mental lexicon integrates and binds the orthographic, semantic, and phonological representations of words. Its proper development is important for reading acquisition. Indeed, reading acquisition entails creating a new orthographic lexicon and binding it to the preexisting semantic and phonological lexicons [74]. Development of such binding (1) is indispensable for reading irregular words [75], (2) benefits the reading of regular words, and (3) does not contribute to reading pseudowords. The proper development of the mental lexicon is also important for SiN comprehension. Indeed, SiN comprehension strongly depends on lexical knowledge [21,76–78], and the level of CTS in noise relates to the listener’s level of comprehension [37,42,43]. This suggests that the robustness of CTS to babble noise depends on the level of comprehension, which in turn depends on how developed the mental lexicon is. The degree of development of the mental lexicon could therefore be the hidden factor mediating the relation between SiN perception and lexical reading ability. This is also perfectly in line with our result that altered phrasal nCTS in babble noise in dyslexia may result from reduced reading experience. In brief, reading difficulties in dyslexia would reduce reading experience, which would impair the buildup of the mental lexicon and in turn impede SiN perception. Still, future studies on the association between SiN processing and reading should include measures of the degree of development of the mental lexicon to carefully analyze the interrelation between SiN perception, reading abilities, and the degree of development of the mental lexicon.

Our results in dyslexia support the existence of a relation between reading abilities and cortical measures of the ability to deal with SiN, but they bring important nuances. First, phrasal nCTS in nonvisual babble noise conditions was altered in children with dyslexia compared with age-matched but not reading-level–matched controls, indicating that such alteration could be due to variability in reading experience. Second, within the children with dyslexia, phrasal nCTS was globally and negatively correlated with reading speed, and the informational modulation in phrasal nCTS was positively correlated with the contrast between reading accuracy and reading speed. These two relations could be explained by compensatory attentional mechanisms, whereby children with severe dyslexia developed enhanced attentional abilities underlying improved SiN abilities and more accurate—though still slower—reading (compared with children with mild dyslexia). Hence, such relations might hold only in children with dyslexia free of attentional disorder, as was the case for our participants. Also, it should be remembered that these relations were found in a relatively small sample of children with dyslexia (n = 26) and should be confirmed by future studies.

Audiovisual integration and reading abilities

We found significant relations between reading abilities and the ability to leverage visual speech to maintain phrasal and syllabic CTS in noise. Visual speech cues (articulatory mouth and facial gestures) are well known to benefit SiN comprehension [61] and CTS in noise [79–83]. Obviously, the auditory signal carries much more fine-grained information about the phonemic content of speech than the visual signal. But the effect of audiovisual speech integration is quite evident in SiN conditions, in which it affords a substantial comprehension benefit [61,62,84,85]. Mirroring this perceptual benefit, it is already well documented that phrasal and syllabic CTS in noise is boosted in adults when visual speech information is available [79–83,86–89].

We found that the visual modulation in phrasal nCTS correlated globally and positively with reading speed (significantly so for pseudowords) but not accuracy. However, our children with dyslexia (compared with both control groups) did not show any alteration in their phrasal nCTS in babble noise when visual speech was provided. Instead, they successfully relied on visual speech information to restore their phrasal CTS in babble noise (which was altered without visual speech information). In other words, reliance on lipreading to maintain appropriate phrasal CTS in babble noise appeared to act as a protective factor in our group of children with dyslexia.

We also found that the visual modulation in syllabic nCTS correlated globally and positively with reading abilities. More interestingly, our children with dyslexia (compared with both control groups) did not show any significant alteration in their syllabic nCTS in noise when visual speech was not provided. However, compared with age-matched typically developing children, they benefited significantly less from visual speech to boost syllabic CTS in noise. Instead, they behaved more like reading-level–matched typically developing children. Accordingly, our results cannot argue against the view that poor audiovisual integration in dyslexia is caused by reduced reading experience [63,90,91]. Nevertheless, the pattern of results (see Fig 4B, left) is even suggestive of an alteration in dyslexia in comparison with reading-level–matched children. More statistical power would be needed to confirm or refute this trend.

Our result that audiovisual integration abilities correlate with reading abilities is in line with the existing literature. Indeed, individuals with dyslexia benefit less from visual cues to perceive SiN than typical readers [92–96]. Audiovisual integration and reading could be altered in dyslexia simply because both rely on similar mechanisms. Indeed, reading relies on the ability to bind visual (graphemic) and auditory (phonemic) speech representations [97,98]. And according to some authors, suboptimal audiovisual integration mechanisms could reduce reading fluency [99]. Importantly, the finding that individuals with dyslexia benefit normally from visual speech to boost phrasal but not syllabic CTS in noise brings important information about the nature of the audiovisual integration deficit in dyslexia. Following the functional roles attributed to CTS, individuals with dyslexia would properly integrate visual speech information to optimize the processing of syntactic information [38] but not to support acoustic/phonemic processing [71]. This could be explained by their preserved ability to extract and integrate the temporal dynamics of visual speech but not the lip configuration [96], two aspects of audiovisual speech integration currently thought to be supported by distinct neuronal pathways [100]. This inability to rely on lip configuration to improve auditory phonemic perception in SiN conditions may be caused by a supramodal phonemic categorization deficit, as already proposed for children with specific language impairment [101]. Finally, the fact that the visual modulation in syllabic nCTS brought a limited amount of unique information about reading with respect to classical behavioral predictors of reading, but that all of them brought more information in synergy, suggests that a broad set of low-level processing abilities contributes to determining reading abilities and alterations in dyslexia [102,103].

Classical behavioral predictors related to global reading abilities

Our results confirm that classical behavioral predictors of reading (RAN, phonological memory, and metaphonological abilities) are directly related to the global reading level rather than reading strategy. We draw this conclusion because the optimal model for reading score contained a common slope for all reading subtests. This means that the model was not significantly improved by optimizing the slope for each of the five reading subtests separately. Accordingly, univariate correlation coefficients presented in Table 4 were roughly similar across the five reading scores.

Phonological memory (assessed with forward digit span) was significantly positively correlated with the global reading level. That phonological memory relates to global reading abilities rather than reading strategy is well documented [4]. Poor readers, regardless of their reading profile, typically perform poorly on phonological memory tests involving digits, letters [104,105], or words [106].

Performance on the RAN task was also related to the global reading level, in line with the existing literature [6–8,107–110]. RAN performance indeed has a moderate to strong relationship with all classical reading measures alike, including word, nonword, and text reading, as well as text comprehension [107]. It is a consistent predictor of reading fluency in various alphabetic orthographies independent of their complexity [111]. RAN performance even predicts, with similar strength, reading performance assessed 2 years later [112] with tasks tagging the lexical and sublexical routes. It is thought that RAN and reading performance correlate because they involve serial processing and oral production [110], two processes that are common to both reading routes.

Finally, phonological awareness assessed with the phoneme suppression and fusion tasks was significantly related to reading abilities. However, the information it brought about reading was smaller and essentially redundant with that brought by RAN and phonological memory. This is not surprising given that the children tested in the present study had at least 1 year of reading experience. Phonological awareness indeed plays a key role in the early stages of reading acquisition, i.e., when learning grapheme-to-phoneme conversion [113–115], and undergoes substantial maturation during that period [116].

Phonological awareness

Our results indicate that, in typical readers, phonological awareness mediates at best a part of the relation between the cortical processing of SiN and reading abilities. Indeed, the information about reading brought by phonological awareness was redundant with that brought by the visual modulation in syllabic nCTS but not with that brought by the informational and visual modulations in phrasal nCTS. This finding illustrates the importance of separating the different processes involved in SiN processing and reading when seeking associations. It also provides a potential reason why contradictory reports exist on the topic [19–21].

Nevertheless, the role of phonological awareness might have been underestimated in the present study because of a lack of sensitivity in our phonological awareness subtests. Indeed, phonological awareness tasks turned out to be too easy for older participants, leading to ceiling effects (about half of the participants reached the maximum score on phoneme fusion and suppression tasks). This could explain the weak relation observed between reading abilities and phonological awareness skills. In contrast, there was no ceiling effect for the RAN, which may explain the strong correlation between this score and reading abilities.

Further discussion

In S1 Discussion, we discuss considerations related to the fact that (1) only one acoustic signal-to-noise ratio was studied, (2) regression models to estimate CTS in a given condition were trained on all other conditions, (3) occipital sensors were included in regression models to estimate CTS, and (4) the study was conducted in French. We also discuss the potential yield of future studies in illiterate adults.

Conclusion

Overall, these results significantly further our understanding of the nature of the relation between SiN processing abilities and reading abilities. They demonstrate that cortical processing of SiN and reading abilities are related in several specific ways and that some of these relations translate into alterations in dyslexia that are attributable to reading experience. However, within children with dyslexia, these relations appeared changed or even reversed, potentially owing to compensatory attentional mechanisms. Our results also demonstrate that classical behavioral predictors of reading (including phonological awareness) mediate relations involving the processing of acoustic/phonemic but not syntactic information in natural SiN conditions. This contrasts with the classically assumed mediating role of phonological awareness. Instead, the ability to process speech syntactic content in babble noise (indexed by phrasal nCTS) could directly modulate skilled reading acquisition. Finally, the information about reading abilities brought by cortical markers of syntactic processing of SiN was complementary to that provided by classical behavioral predictors of reading. This implies that such markers of SiN processing could serve as novel electrophysiological markers of reading abilities.

Methods

Participants

In total, 73 typical readers (mean ± SD age, 8.74 ± 1.41 years; age range, 6.70–11.72 years) and 26 children with dyslexia (mean ± SD age, 10.24 ± 1.08 years; age range, 7.97–12.29 years) enrolled in elementary school took part in this experiment (see Table 1 for participants’ characteristics). Children with dyslexia had received a diagnosis of dyslexia, which implies that they had (at the time of diagnosis) at least 2 years of delay in reading acquisition that could not be explained by low IQ or by sensory or social disorders. All were native French speakers, reported being right-handed, had normal hearing according to pure-tone audiometry (hearing thresholds between 0 and 25 dB HL at 250, 500, 1,000, 2,000, 4,000, and 8,000 Hz), and had normal SiN perception as revealed by a SiN test (Lafon 30) from a French-language central auditory battery [117]. We used a French translation of the Family Affluence Scale [118] to evaluate participants’ socioeconomic level.

This study was approved by the local ethics committee (Comité d'Ethique Hospitalo-Facultaire Erasme-ULB, 021/406, Brussels, Belgium; approval number: P2017/081) and conducted according to the principles expressed in the Declaration of Helsinki. Participants were recruited mainly from local schools through flyer advertisements or from social networks. Participants and their legal representatives signed a written informed consent before participation. Participants were compensated with a gift card worth 50 euros.

Behavioral assessment

Participants underwent a comprehensive behavioral assessment intended to appraise their reading abilities and some cognitive abilities related to reading or speech perception.

Reading abilities

Children completed the word-reading tasks (regular words, irregular words, and pseudowords) of a dyslexia detection tool (ODEDYS-2 [119]) and the Alouette-R reading task [120].

For each of the word-reading tasks (regular, irregular, or pseudowords), participants had to read as rapidly and accurately as possible a list of 20 words. Each task provided a reading score computed as the number of words correctly read divided by the reading time (in seconds).

In the Alouette-R task [120], children had 3 min to read as rapidly and accurately as possible a text of 256 words. This text is composed of a succession of words that do not tell a meaningful story. This peculiarity forces children to rely solely on their reading skills and prevents them from using anticipation or inference strategies that could boost reading scores. An accuracy score was computed as the number of words correctly read divided by the total number of words read, and a speed score was computed as the number of words correctly read multiplied by the ratio of 180 s (the maximal reading time) to the effective reading time.
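To make the scoring explicit, here is a minimal sketch of these computations (Python; the function and variable names are ours rather than the authors', and the constants follow the task descriptions above):

```python
def word_reading_score(n_correct: int, reading_time_s: float) -> float:
    """ODEDYS-2 word-reading score: correctly read words per second."""
    return n_correct / reading_time_s


def alouette_scores(n_correct: int, n_read: int, reading_time_s: float,
                    max_time_s: float = 180.0) -> tuple:
    """Alouette-R accuracy and speed scores as described above."""
    accuracy = n_correct / n_read
    speed = n_correct * (max_time_s / reading_time_s)
    return accuracy, speed


# Example: a child reads 150 words in 170 s, 140 of them correctly.
acc, speed = alouette_scores(n_correct=140, n_read=150, reading_time_s=170.0)
```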

Phonological processing

The initial phoneme suppression and initial phonemes fusion tasks of the ODEDYS-2 [119] were used to assess phonological processing.

In the initial phoneme suppression task, children had to repeat orally presented words while suppressing the initial phoneme of the word (e.g., dog → og). In total, 10 words were presented, and performance was quantified as the percentage of correct responses.

In the initial phoneme fusion task, children had to combine the initial phonemes of two orally presented words to create a new word or nonword (e.g., Big & Owen → /bo/). In total, 10 pairs of words were presented, and performance was quantified as the percentage of correct responses.

RAN

We used the RAN task of the ODEDYS-2 [119]. Children had to name as rapidly and accurately as possible 25 pictures (five different pictures randomly repeated five times). Performance was quantified as the total time to complete the task, meaning that the lower the score, the better the performance.

Phonological memory

The forward and backward digit repetition task from the ODEDYS-2 [119] was used to assess phonological memory.

In the forward digit repetition task, children were asked to repeat orally presented number series in the same order as presented. The series are different at every trial. The first series contains three digits, and the size of the series is incremented by one every second trial. The task ends after a failure to repeat the two series of a given size. Forward digit span score was taken as the number of digits in the last correctly repeated series.

The backward digit repetition task is akin to the forward one. The only difference is that digit series have to be repeated in the exact reverse order (e.g., children presented 1 2 3 4 have to repeat 4 3 2 1).

Attention abilities

The bells test [121] was used to assess visual attention, and the TAP auditory attention subtest [122] was used to assess the auditory attentional level.

In the bells test, children had 2 min to find as many bells as possible on a sheet comprising 35 bells scattered among 280 visual distractors. Performance was quantified as the number of bells found divided by the time needed.

In the TAP auditory attention subtest, children had to focus their attention for 3 min 20 s on an auditory stream. Children heard a train of 200 pure-tone stimuli lasting 500 ms with a 1,000-ms stimulus-onset asynchrony. Tones alternated between high (1,073 Hz) and low (450 Hz) pitch. There were 16 occurrences in which two high- or low-pitch tones followed one another. Only in these cases did participants have to press a response button as fast as possible. A performance score was quantified as the number of correct responses, a speed score as the mean response time, and a failure score as the number of responses to tones differing in pitch from the preceding one.

Nonverbal intelligence

The brief version of the Wechsler Nonverbal (WNV) Scale of Ability [123] was used to assess nonverbal intelligence.

This assessment consisted of matrices and recognition subtests for children younger than 8 years. Older children were assessed with matrices and spatial memory subtests.

In the matrices subtest, children were presented with incomplete visual matrices and had to select the correct missing portion among four or five response options. The subtest ended when four mistakes were made in the last five trials. A raw score was taken as the number of correctly completed matrices. This raw score was converted to a T score by comparison with values provided in a table of norms.

In the recognition subtest, children had to carefully look at visual geometric designs that were presented one by one for 3 s. After each presentation, they had to identify the previously seen design among four or five response options. The subtest ended when four mistakes were made in the last five recognition trials. A raw score was taken as the number of correctly recognized drawings. This raw score was converted to a T score by comparison with values provided in a table of norms.

In the spatial memory subtest, children were presented with a board with 10 cubes spread on it and were asked to mimic the examiner’s tapping sequence. The sequences are different on every trial. The first sequence consists of tapping on two cubes, and the size of the sequences is incremented by one every second trial. The task ends after a failure to repeat two sequences of a given size. This task was performed twice, in forward and backward directions. For each direction, a raw score was taken as the number of correctly repeated sequences. Raw scores were summed and converted to a T score by comparison with values provided in a table of norms.

Total nonverbal IQ was computed as the sum of both T scores, which was compared with a table of norms, providing a total nonverbal IQ score.

Neuroimaging assessment

Stimuli

The stimuli were derived from 12 audiovisual recordings of four native French-speaking narrators (two females, three recordings per narrator) telling a story for approximately 6 min (mean ± SD, 6.0 ± 0.8 min) (for more details, see S5 Methods). Fig 1 illustrates the time course of a video stimulus. In each video, the first 5 s were kept unaltered to enable children to unambiguously identify the narrator’s voice and face that they were requested to attend to. The remainder of the video was divided into 10 consecutive blocks of equal size that were assigned to nine conditions. Two blocks were assigned to the noiseless condition, in which the audio track was kept but the video was replaced by static pictures illustrating the story (mean ± SD picture presentation time across all videos, 27.7 ± 10.8 s). The remaining eight blocks were assigned to eight conditions in which the original sound was mixed with background noise at a 3-dB signal-to-noise ratio. There were four different types of noise, and each type of noise was presented once with the original video, thereby giving access to lip-read information (lips visual conditions), and once with the static pictures illustrating the story (pics visual conditions). The different types of noise differed in the degree of energetic and informational interference they introduced [57]. Fig 1 and S1 Fig illustrate their spectral and spectrotemporal properties. The least-energetic nonspeech (i.e., noninformational) noise was a white noise high-pass filtered at 10,000 Hz. The most-energetic nonspeech noise had its spectral properties dynamically adapted to mirror those of the narrator’s voice within approximately 1 s around each time point. It was derived from the narrator’s actual audio recording by (1) Fourier transforming the sound in 2-s-long windows sliding in steps of 0.5 s, (2) replacing the phase by random numbers, (3) inverse Fourier transforming the Fourier coefficients in each window, (4) multiplying these phase-shuffled sound segments by a sine window (i.e., half a sine cycle equal to 0 at the edges and 1 in the middle), and (5) summing the contributions of the overlapping windows. The opposite-gender babble (i.e., informational) noise was a five-talker cocktail party noise recorded by individuals of gender opposite to the narrator’s (i.e., five men for female narrators). The same-gender babble noise was a five-talker cocktail party noise recorded by individuals of gender identical to the narrator’s. For both babble noises, the five individual noise components were obtained from a French audiobook database (http://www.litteratureaudio.com), normalized, and mixed linearly. The assignment of conditions to blocks was random, with the constraint that each of the first five and last five blocks contained exactly one noiseless block and one block of each type of noise, two of the noise blocks being presented with lips videos and two with pics videos. Smooth audio and video transitions between blocks were ensured with 2-s fade-ins and fade-outs. The resulting videos were grouped in three disjoint sets featuring one video of each of the narrators (total set durations: 23.0, 24.3, and 24.65 min), and there were four versions of each set that differed in the random ordering of conditions.
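For illustration, a minimal numpy sketch of the phase-shuffling procedure (steps 1–5 above) could look as follows; the window and step lengths follow the text, whereas details such as the random number generator and edge handling are our own assumptions rather than the authors' implementation:

```python
import numpy as np


def phase_shuffled_noise(audio: np.ndarray, fs: int,
                         win_s: float = 2.0, step_s: float = 0.5) -> np.ndarray:
    """Spectrally matched noise: randomize the phase of the speech signal in
    overlapping windows and recombine by overlap-add (steps 1-5 above)."""
    rng = np.random.default_rng()
    win_len, step = int(win_s * fs), int(step_s * fs)
    # Sine window: half a sine cycle, 0 at the edges and 1 in the middle.
    sine_win = np.sin(np.pi * np.arange(win_len) / (win_len - 1))
    noise = np.zeros_like(audio, dtype=float)
    for start in range(0, len(audio) - win_len + 1, step):
        segment = audio[start:start + win_len]
        spectrum = np.fft.rfft(segment)                       # (1) Fourier transform
        random_phase = np.exp(2j * np.pi * rng.random(spectrum.shape))
        shuffled = np.abs(spectrum) * random_phase            # (2) replace the phase
        segment_noise = np.fft.irfft(shuffled, n=win_len)     # (3) inverse transform
        noise[start:start + win_len] += sine_win * segment_noise  # (4)-(5) window and sum
    return noise
```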

Experimental paradigm

During the imaging session, participants lay on a bed with their head inside the MEG helmet. Their brain activity was recorded while they attended the four videos of a randomly selected set and version (one separate recording per video), presented in a random order, and finally while they were at rest (eyes open, fixation cross) for 5 min. They were instructed to watch the videos attentively, listen to the narrators’ voice while ignoring the interfering noise, and remain as still as possible. After each video, they were asked 10 simple yes/no comprehension questions. Videos were projected onto a back-projection screen placed vertically, approximately 120 cm away from the MEG helmet. The inner dimensions of the black frame were 35.2 cm (horizontal) and 28.8 cm (vertical), and the narrator’s face spanned approximately 15 cm (horizontal) and approximately 20 cm (vertical). Participants could see the screen through a mirror placed above their head. In total, the optical path from the screen to the participants’ eyes was approximately 150 cm. Sounds were delivered at 60 dB (measured at ear level) through a MEG-compatible, front-facing, flat-panel loudspeaker (Panphonics Oy, Espoo, Finland) placed approximately 1 m behind the screen.

Data acquisition

During the experimental conditions, participants’ brain activity was recorded with MEG at the CUB Hôpital Erasme. Neuromagnetic signals were recorded with a whole-scalp–covering MEG system (Triux, MEGIN) placed in a lightweight, magnetically shielded room (Maxshield, MEGIN), the characteristics of which are described elsewhere [124]. The sensor array of the MEG system comprised 306 sensors arranged in 102 triplets of one magnetometer and two orthogonal planar gradiometers. Magnetometers measure the radial component of the magnetic field, whereas planar gradiometers measure its spatial derivative in the tangential directions. MEG signals were band-pass filtered at 0.1–330 Hz and sampled at 1,000 Hz.

We used four head-position indicator coils to monitor the subjects’ head position during the experiment. Before the MEG session, we digitized the location of these coils and of at least 300 head-surface points (on scalp, nose, and face) with respect to anatomical fiducials with an electromagnetic tracker (Fastrack, Polhemus).

Finally, subjects’ high-resolution 3D T1-weighted cerebral images were acquired with a magnetic resonance imaging (MRI) scanner (MRI 1.5T, Intera, Philips) after the MEG session.

Data preprocessing

Continuous MEG data were first preprocessed off-line using the temporal signal space separation method implemented in the MaxFilter software (MaxFilter, MEGIN; correlation limit 0.9, segment length 20 s) to suppress external interferences and to correct for head movements [125,126]. To further suppress physiological artifacts, 30 independent components were evaluated from the data band-pass filtered at 0.1–25 Hz and reduced to a rank of 30 with principal component analysis. Independent components corresponding to heartbeat, eye-blink, and eye-movement artifacts were identified, and the corresponding MEG signals reconstructed by means of the mixing matrix were subtracted from the full-rank data. The number of subtracted components was 3.45 ± 1.23 (mean ± SD across subjects and recordings). Finally, time points within 1 s of remaining artifacts were marked as bad. Data were considered contaminated by artifacts when MEG amplitude exceeded 5 pT in at least one magnetometer or 1 pT/cm in at least one gradiometer.
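The authors used the proprietary MaxFilter software and an in-house ICA pipeline. Purely as an illustration, a rough open-source analogue of these two steps in MNE-Python might look as follows; the tSSS parameters, the 30 components, and the 0.1–25 Hz band follow the text, but the file name and the excluded component indices are placeholders rather than the authors' settings:

```python
import mne

raw = mne.io.read_raw_fif("subject01_task_raw.fif", preload=True)  # hypothetical file

# Temporal signal space separation with 20-s segments and a 0.9 correlation limit.
raw_sss = mne.preprocessing.maxwell_filter(raw, st_duration=20.0, st_correlation=0.9)

# ICA on 0.1-25 Hz band-passed data, reduced to 30 components.
ica = mne.preprocessing.ICA(n_components=30, random_state=97)
ica.fit(raw_sss.copy().filter(l_freq=0.1, h_freq=25.0))

# Cardiac and ocular components would be identified here (e.g., by inspection or
# with ica.find_bads_ecg / ica.find_bads_eog) before being removed.
ica.exclude = [0, 1]  # placeholder indices
raw_clean = ica.apply(raw_sss.copy())
```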

We extracted the temporal envelope of the attended speech (narrators’ voice) using the approach identified as optimal by Biesmans and colleagues [127]. Briefly, audio signals were band-pass filtered using a gammatone filter bank (15 filters centered on logarithmically spaced frequencies from 150 Hz to 4,000 Hz), and sub-band envelopes were computed using the Hilbert transform, raised to the power 0.6, and averaged across bands.
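A minimal sketch of this envelope extraction, assuming SciPy's gammatone filter design (the authors' exact filter implementation may differ):

```python
import numpy as np
from scipy.signal import gammatone, hilbert, lfilter


def speech_envelope(audio: np.ndarray, fs: int, n_bands: int = 15,
                    f_lo: float = 150.0, f_hi: float = 4000.0,
                    power: float = 0.6) -> np.ndarray:
    """Gammatone sub-band envelopes, compressed with power 0.6 and averaged."""
    center_freqs = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands)
    envelopes = []
    for fc in center_freqs:
        b, a = gammatone(fc, 'iir', fs=fs)     # 4th-order gammatone band-pass filter
        band = lfilter(b, a, audio)
        env = np.abs(hilbert(band)) ** power   # Hilbert envelope, power-law compression
        envelopes.append(env)
    # In practice the averaged envelope would typically be low-pass filtered and
    # resampled to match the MEG sampling rate before decoding.
    return np.mean(envelopes, axis=0)
```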

Accuracy of speech envelope reconstruction and normalized CTS

For each condition and participant, a global value of cortical tracking of the attended speech was evaluated for all left-hemisphere sensors at once and for all right-hemisphere sensors at once. Using the mTRF toolbox [64], we trained a decoder on MEG data to reconstruct the speech temporal envelope and estimated its Pearson correlation with the real speech temporal envelope. This correlation is often referred to as the reconstruction accuracy, and it provides a global measure of CTS. See S6 Methods for a full description of the procedure and statistical assessment. A similar approach has been used in previous studies of CTS [50,54,66,67].
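The actual analysis was performed with the mTRF toolbox in MATLAB (see S6 Methods for the training and cross-validation scheme). Purely as an illustration, the core of such a backward (decoding) model with ridge regularization can be sketched in numpy as follows; the lag range and regularization value are arbitrary placeholders:

```python
import numpy as np


def lagged_design(meg: np.ndarray, lags: range) -> np.ndarray:
    """Stack time-lagged copies of the sensor signals (time x [sensors * lags])."""
    n_t, n_ch = meg.shape
    X = np.zeros((n_t, n_ch * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(meg, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * n_ch:(i + 1) * n_ch] = shifted
    return X


def reconstruction_accuracy(meg_train, env_train, meg_test, env_test,
                            lags=range(0, 25), lam=1e3) -> float:
    """Train a ridge decoder on training data and return the Pearson correlation
    between the reconstructed and real envelopes on held-out data."""
    Xtr, Xte = lagged_design(meg_train, lags), lagged_design(meg_test, lags)
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]), Xtr.T @ env_train)
    env_hat = Xte @ w
    return np.corrcoef(env_hat, env_test)[0, 1]
```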

Based on CTS values, we derived the normalized CTS (nCTS) in SiN conditions as the following contrast between CTS in SiN (CTSSiN) and noiseless (CTSnoiseless) conditions:

nCTS = (CTS_SiN − CTS_noiseless) / (CTS_SiN + CTS_noiseless).

Such a contrast has the advantage of being specific to SiN processing abilities because it factors out the global level of CTS in the noiseless condition. However, it can be misleading when derived from negative CTS values (which may happen because CTS is an unsquared correlation value). For this reason, CTS values below a threshold of 10% of the mean CTS across all subjects, conditions, and hemispheres were set to that threshold prior to nCTS computation. Thanks to this thresholding, the nCTS index takes values between −1 and 1, with negative values indicating that the noise reduces CTS.
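A compact sketch of this thresholded contrast (the 10% floor is computed from the grand-mean CTS across subjects, conditions, and hemispheres, as stated above):

```python
import numpy as np


def ncts(cts_sin: np.ndarray, cts_noiseless: np.ndarray, all_cts: np.ndarray) -> np.ndarray:
    """Normalized CTS, with small or negative CTS values floored at 10% of the grand mean."""
    floor = 0.1 * all_cts.mean()
    a = np.maximum(cts_sin, floor)
    b = np.maximum(cts_noiseless, floor)
    return (a - b) / (a + b)
```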

PID

All behavioral and nCTS measures were corrected for IQ, age, time spent at elementary school, and outliers (see S2 Methods).

We used PID to appraise, without a priori assumptions, the relation between reading abilities, cortical measures of SiN processing, and classical behavioral predictors of reading. In general, PID decomposes the mutual information (MI) quantifying the relationship between two explanatory variables (or sets of explanatory variables) and a single target into four constituent terms: the unique information about the target available from each explanatory variable alone (one term per explanatory variable); the redundant or shared information, which is common to the two explanatory variables; and the synergistic information, which is available only when both explanatory variables are observed together (e.g., when the relationship between their values is informative about the target) [69,70,128]. PID was previously used to decompose the information brought by acoustic and visual speech signals about brain oscillatory activity [80] and to compare auditory encoding models of MEG during speech processing [128]. In our analysis, the five reading scores were used as the target, the features of nCTS as the first set of explanatory variables, and the behavioral scores as the second set of explanatory variables. PID was also used to better understand the nature of some other statistical associations we uncovered. For further details on PID, its quantification with z-scores, and its statistical assessment, see S4 Methods.
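In equation form, the two-source decomposition used here reads as follows (a standard statement of the PID identity in our own notation, not a quotation from the authors' supplementary methods):

```latex
% Two-source PID of the mutual information between the target R (reading scores)
% and the explanatory variables X1 (features of nCTS) and X2 (behavioral scores):
I(R; X_1, X_2) = \mathrm{Unq}(R; X_1) + \mathrm{Unq}(R; X_2)
               + \mathrm{Red}(R; X_1, X_2) + \mathrm{Syn}(R; X_1, X_2)
```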

Linear mixed-effects modeling of nCTS and reading values

We performed linear mixed-effects analysis with R [129] and lme4 [130] to identify how different fixed effects modulate nCTS. We started with a null model that included only a different random intercept for each subject. The model was iteratively compared with models incremented with simple fixed effects of hemisphere, noise (least-energetic nonspeech, most-energetic nonspeech, opposite-gender babble, and same-gender babble), and visual (lips versus pics) added one by one. At every step, the most significant fixed effect was retained until the addition of the remaining effects did not improve the model any further (p > 0.05). The same procedure was then repeated to refine the ensuing model with the interactions of the simple fixed effects of order 2 (e.g., hemisphere × noise) and then 3 (hemisphere × noise × visual).
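The selection itself was performed in R with lme4. Purely as an illustration of a single step of this procedure, a Python analogue comparing the null random-intercept model with a model adding one fixed effect might look as follows; the statsmodels formulas, the file name, and the likelihood-ratio test are our assumptions about how one could reproduce such a step, not the authors' code:

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Long-format table with columns 'ncts', 'subject', 'hemisphere', 'noise', 'visual'
# (the file name is a placeholder).
data = pd.read_csv("ncts_long.csv")

null = smf.mixedlm("ncts ~ 1", data, groups=data["subject"]).fit(reml=False)
with_noise = smf.mixedlm("ncts ~ C(noise)", data, groups=data["subject"]).fit(reml=False)

# Likelihood-ratio test for the added fixed effect of noise type.
lr = 2 * (with_noise.llf - null.llf)
k_diff = len(with_noise.fe_params) - len(null.fe_params)
p_value = chi2.sf(lr, df=k_diff)
```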

We followed the same approach to identify how reading abilities (five standardized scores) relate to classical behavioral predictors of reading and features of nCTS. In that analysis, we first considered a nonzero slope for the classical behavioral predictors identical across all reading scores, then a nonzero slope for the classical behavioral predictors differing across reading scores, then a nonzero slope for the features of nCTS identical across all reading scores, and finally a nonzero slope for the features of nCTS differing across reading scores.

Of note, we preferred linear mixed-effects modeling over other statistical methods for two reasons. (1) This method could identify both the factors that modulate nCTS and the regressors that explain reading scores. (2) It could simultaneously model all the reading scores and identify possible differences in correlation with the different reading scores.

Also worth noting, performing model selection with a stepwise deletion approach (i.e., starting with the full model and iteratively removing fixed effects whose removal did not significantly decrease model accuracy) yielded the exact same linear mixed-effects models.

Supporting information

S1 Methods. Assessment of the degree of energetic masking.

(DOCX)

S2 Methods. Preprocessing of brain and behavioral indices.

(DOCX)

S3 Methods. Extraction of the relevant features of nCTS.

nCTS, normalized cortical tracking of speech.

(DOCX)

S4 Methods. Partial information decomposition.

(DOCX)

S5 Methods. Recording of video stimuli.

(DOCX)

S6 Methods. Accuracy of speech envelope reconstruction.

(DOCX)

S1 Results. Contribution of visual cortical activity to nCTS.

nCTS, normalized cortical tracking of speech.

(DOCX)

S2 Results. Side measures are redundant with RAN and digit span but not with modulations in phrasal nCTS.

nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

(DOCX)

S3 Results. Reading profile and reading deficit in the group with dyslexia.

(DOCX)

S4 Results. Are features of nCTS related to the importance of reading difficulties in dyslexia?

nCTS, normalized cortical tracking of speech.

(DOCX)

S1 Discussion. Supplementary Discussion.

(DOCX)

S1 Table. Percentage of the 73 typical readers showing significant CTS at phrasal and syllabic rates in the nine different conditions.

The two values provided for the noiseless condition correspond to two arbitrary subdivisions of the noiseless data to match the amount of data for the eight noise conditions. CTS, cortical tracking of speech.

(DOCX)

S2 Table. Nature of the information about reading abilities brought by each of the three uncovered features of the CTS in noise and phonological awareness (mean of the scores for phoneme fusion and suppression).

Significant values (p < 0.05) are displayed in boldface, and marginally significant values are displayed in boldface and italicized. CTS, cortical tracking of speech.

(DOCX)

S3 Table. Nature of the information about reading brought by (1) the visual modulation in syllabic nCTS and (2) each of the four regressors included in the final model of reading abilities (informational modulation in phrasal nCTS, visual modulation in phrasal nCTS, forward digit span, and RAN).

nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

(DOCX)

S4 Table. Same as in S3 Table for metaphonological abilities.

(DOCX)

S5 Table. Percentage of the 26 children of each reading group (dyslexia, control in age, and control in reading level) showing significant CTS in at least one hemisphere at phrasal and syllabic rates in the nine different conditions.

CTS, cortical tracking of speech.

(DOCX)

S6 Table. Factors included in the final linear mixed-effects model fit to the nCTS (independent variable) at phrasal and at syllabic rates in children with dyslexia.

Factors are listed in their order of inclusion. nCTS, normalized cortical tracking of speech.

(DOCX)

S7 Table. Regressors included in the final linear mixed-effects model fit to the five reading scores (dependent variables) in children with dyslexia.

Regressors are listed in their order of inclusion.

(DOCX)

S8 Table. Pearson correlation between measures of reading abilities and relevant brain and behavioral measures in children with dyslexia.

***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

(DOCX)

S9 Table. Pearson correlation between measures of reading abilities and nCTS measures in children with dyslexia.

***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

(DOCX)

S1 Fig

Spectrogram of a 4-s excerpt of attended speech (A) and corresponding noise (B) in the range of 0–7 kHz. Wide-band spectrograms (0–20 kHz) are also presented for the attended speech and the least-energetic nonspeech noise (C) to show that noise power was confined to frequencies above 10 kHz in this latter noise condition. The zeros of the dBFS were fixed based on the attended speech spectrogram and applied to all noise spectrograms. dBFS, decibel full scale.

(TIF)

S2 Fig. Relation between reading abilities and the nCTS at phrasal rate in dyslexia.

S6 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

(TIF)

S3 Fig

Impact of the main fixed effects on the nCTS at phrasal (A) and syllabic rates (B) in children with dyslexia. All is as in Fig 2. S7 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

(TIF)

S1 Video. Exemplary video stimulus wherein static pictures were replaced by text descriptions.

(M4V)

S1 Data. Behavioral and CTS values for all participants.

CTS, cortical tracking of speech.

(XLSX)

S2 Data. Raw data underlying Fig 2.

(XLSX)

S3 Data. Raw data underlying Fig 3.

(XLSX)

S4 Data. Raw data underlying Fig 4.

(XLSX)

S5 Data. Raw data underlying Fig 5.

(XLSX)

S6 Data. Raw data underlying S2 Fig.

(XLSX)

S7 Data. Raw data underlying S3 Fig.

(XLSX)

Acknowledgments

We thank Wafae El Hammouchi, Morgane De Boeck, Konstantina Kanellou, and Pauline Delvingt for help with data acquisition.

Abbreviations

CTS, cortical tracking of speech

IQ, intelligence quotient

MEG, magnetoencephalography

nCTS, normalized CTS

PID, partial information decomposition

RAN, rapid automatized naming

SiN, speech in noise

TAP, test of attentional performance

Data Availability

The data and the code that support the findings of this study are available on the Open Science Framework at “https://osf.io/9ce5t/”. The underlying numerical data for each figure can also be found in the supporting data files.

Funding Statement

F.D., J.B. and M.B. were supported by the program Attract of Innoviris (https://innoviris.brussels/; grant number 2015-BB2B-10). J.B. was supported by a research grant from the Fonds de Soutien Marguerite-Marie Delacroix (https://www.fondsmmdelacroix.org/). R.A.A.I. was supported by the Wellcome Trust (https://wellcome.ac.uk/; grant number 214120/Z/18/Z). X.D.T. was Post-doctorate Clinical Master Specialist at the Fonds de la Recherche Scientifique (F.R.S.-FNRS, https://www.frs-fnrs.be/en/). M.B. was supported by the Spanish Ministry of Economy and Competitiveness (https://www.ciencia.gob.es/; grant number PSI2016-77175-P), and by the Marie Skłodowska-Curie Action of the European Commission (https://ec.europa.eu/research/mariecurieactions/msca-actions_en; grant number 743562). This study and the MEG project at the CUB Hôpital Erasme are financially supported by the Fonds Erasme (https://www.fondserasme.org/fondserasme_en.html; Research Convention “Les Voies du Savoir”). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Leppänen PHT, Hämäläinen JA, Guttorm TK, Eklund KM, Salminen H, Tanskanen A, et al. Infant brain responses associated with reading-related skills before school and at school age. Neurophysiol Clin. 2012;42: 35–41. 10.1016/j.neucli.2011.08.005 [DOI] [PubMed] [Google Scholar]
  • 2.Share DL, Jorm AF, Maclean R, Matthews R. Sources of individual differences in reading acquisition. Journal of Educational Psychology. 1984; 1309–1324. 10.1037//0022-0663.76.6.1309 [DOI] [Google Scholar]
  • 3.Caravolas M, Hulme C, Snowling MJ. The Foundations of Spelling Ability: Evidence from a 3-Year Longitudinal Study. Journal of Memory and Language. 2001; 751–774. 10.1006/jmla.2000.2785 [DOI] [Google Scholar]
  • 4.Muter V, Snowling M. Concurrent and Longitudinal Predictors of Reading: The Role of Metalinguistic and Short-Term Memory Skills. Reading Research Quarterly. 1998; 320–337. 10.1598/rrq.33.3.4 [DOI] [Google Scholar]
  • 5.Gathercole SE, Baddeley AD. Phonological working memory: A critical building block for reading development and vocabulary acquisition? European Journal of Psychology of Education. 1993; 259–272. 10.1007/bf03174081 [DOI] [Google Scholar]
  • 6.Manis FR, Doi LM, Bhadha B. Naming speed, phonological awareness, and orthographic knowledge in second graders. J Learn Disabil. 2000;33: 325–33, 374. 10.1177/002221940003300405 [DOI] [PubMed] [Google Scholar]
  • 7.Wimmer H, Mayringer H, Landerl K. The double-deficit hypothesis and difficulties in learning to read a regular orthography. Journal of Educational Psychology. 2000; 668–680. 10.1037//0022-0663.92.4.668 [DOI] [Google Scholar]
  • 8.Wimmer H, Mayringer H, Landerl K. Poor Reading: A Deficit in Skill-Automatization or a Phonological Deficit? Scientific Studies of Reading. 1998; 321–340. 10.1207/s1532799xssr0204_2 [DOI] [Google Scholar]
  • 9.Samuelsson S, Lundberg I. The impact of environmental factors on components of reading and dyslexia. Annals of Dyslexia. 2003; 201–217. 10.1007/s11881-003-0010-8 [DOI] [Google Scholar]
  • 10.Hooper SR, Roberts J, Sideris J, Burchinal M, Zeisel S. Longitudinal predictors of reading and math trajectories through middle school for African American versus Caucasian students across two samples. Dev Psychol. 2010;46: 1018–1029. 10.1037/a0018877 [DOI] [PubMed] [Google Scholar]
  • 11.Klatte M, Bergström K, Lachmann T. Does noise affect learning? A short review on noise effects on cognitive performance in children. Frontiers in Psychology. 2013. 10.3389/fpsyg.2013.00578 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Stockman JA. Aircraft and Road Traffic Noise and Children’s Cognition and Health: A Cross-National Study. Yearbook of Pediatrics. 2007; 69–71. 10.1016/s0084-3954(08)70038-1 [DOI] [Google Scholar]
  • 13.McDermott JH. The cocktail party problem. Current Biology. 2009; R1024–R1027. 10.1016/j.cub.2009.09.005 [DOI] [PubMed] [Google Scholar]
  • 14.Anderson S, Kraus N. Sensory-cognitive interaction in the neural encoding of speech in noise: a review. J Am Acad Audiol. 2010;21: 575–585. 10.3766/jaaa.21.9.3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.White-Schwoch T, Woodruff Carr K, Thompson EC, Anderson S, Nicol T, Bradlow AR, et al. Auditory Processing in Noise: A Preschool Biomarker for Literacy. PLoS Biol. 2015;13: e1002196 10.1371/journal.pbio.1002196 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Calcus A, Colin C, Deltenre P, Kolinsky R. Informational masking of speech in dyslexic children. The Journal of the Acoustical Society of America. 2015; EL496–EL502. 10.1121/1.4922012 [DOI] [PubMed] [Google Scholar]
  • 17.Ziegler JC, Pech-Georgel C, George F, Lorenzi C. Speech-perception-in-noise deficits in dyslexia. Developmental Science. 2009; 732–745. 10.1111/j.1467-7687.2009.00817.x [DOI] [PubMed] [Google Scholar]
  • 18.Dole M, Hoen M, Meunier F. Speech-in-noise perception deficit in adults with dyslexia: Effects of background type and listening configuration. Neuropsychologia. 2012; 1543–1552. 10.1016/j.neuropsychologia.2012.03.007 [DOI] [PubMed] [Google Scholar]
  • 19.Nittrouer S. From Ear to Cortex: A Perspective on What Clinicians Need to Understand About Speech Perception and Language Processing. Lang Speech Hear Serv Sch. 2002;33: 237–252. 10.1044/0161-1461(2002/020) [DOI] [PubMed] [Google Scholar]
  • 20.Fallon M, Trehub SE, Schneider BA. Children’s perception of speech in multitalker babble. The Journal of the Acoustical Society of America. 2000; 3023–3029. 10.1121/1.1323233 [DOI] [PubMed] [Google Scholar]
  • 21.Lewis D, Hoover B, Choi S, Stelmachowicz P. Relationship between speech perception in noise and phonological awareness skills for children with normal hearing. Ear Hear. 2010;31: 761–768. 10.1097/AUD.0b013e3181e5d188 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J. DRC: a dual route cascaded model of visual word recognition and reading aloud. Psychol Rev. 2001;108: 204–256. 10.1037/0033-295x.108.1.204 [DOI] [PubMed] [Google Scholar]
  • 23.Coltheart M, Curtis B, Atkins P, Haller M. Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review. 1993; 589–608. 10.1037/0033-295x.100.4.589 [DOI] [Google Scholar]
  • 24.Perry C, Ziegler JC, Zorzi M. Nested incremental modeling in the development of computational theories: the CDP+ model of reading aloud. Psychol Rev. 2007;114: 273–315. 10.1037/0033-295X.114.2.273 [DOI] [PubMed] [Google Scholar]
  • 25.Fiez JA, Petersen SE. Neuroimaging studies of word reading. Proceedings of the National Academy of Sciences. 1998;95: 914–921. 10.1073/pnas.95.3.914 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Turkeltaub PE, Eden GF, Jones KM, Zeffiro TA. Meta-analysis of the functional neuroanatomy of single-word reading: method and validation. Neuroimage. 2002;16: 765–780. 10.1006/nimg.2002.1131 [DOI] [PubMed] [Google Scholar]
  • 27.McCandliss BD, Cohen L, Dehaene S. The visual word form area: expertise for reading in the fusiform gyrus. Trends Cogn Sci. 2003;7: 293–299. 10.1016/s1364-6613(03)00134-7 [DOI] [PubMed] [Google Scholar]
  • 28.Dehaene S, Cohen L. The unique role of the visual word form area in reading. Trends Cogn Sci. 2011;15: 254–262. 10.1016/j.tics.2011.04.003 [DOI] [PubMed] [Google Scholar]
  • 29.Price CJ. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage. 2012;62: 816–847. 10.1016/j.neuroimage.2012.04.062 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Destoky F, Philippe M, Bertels J, Verhasselt M, Coquelet N, Vander Ghinst M, et al. Comparing the potential of MEG and EEG to uncover brain tracking of speech temporal envelope. Neuroimage. 2019;184: 201–213. 10.1016/j.neuroimage.2018.09.006 [DOI] [PubMed] [Google Scholar]
  • 31.Vander Ghinst M, Bourguignon M, Niesen M, Wens V, Hassid S, Choufani G, et al. Cortical Tracking of Speech-in-Noise Develops from Childhood to Adulthood. J Neurosci. 2019;39: 2938–2950. 10.1523/JNEUROSCI.1732-18.2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Bourguignon M, De Tiège X, Op de Beeck M, Ligot N, Paquier P, Van Bogaert P, et al. The pace of prosodic phrasing couples the listener’s cortex to the reader's voice. Human Brain Mapping. 2013; 314–326. 10.1002/hbm.21442 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Ahissar E, Nagarajan S, Ahissar M, Protopapas A, Mahncke H, Merzenich MM. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc Natl Acad Sci U S A. 2001;98: 13367–13372. 10.1073/pnas.201400998 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Gross J, Hoogenboom N, Thut G, Schyns P, Panzeri S, Belin P, et al. Speech rhythms and multiplexed oscillatory sensory coding in the human brain. PLoS Biol. 2013;11: e1001752 10.1371/journal.pbio.1001752 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Luo H, Poeppel D. Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron. 2007;54: 1001–1010. 10.1016/j.neuron.2007.06.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Molinaro N, Lizarazu M, Lallier M, Bourguignon M, Carreiras M. Out-of-synchrony speech entrainment in developmental dyslexia. Hum Brain Mapp. 2016;37: 2767–2783. 10.1002/hbm.23206 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Peelle JE, Gross J, Davis MH. Phase-locked responses to speech in human auditory cortex are enhanced during comprehension. Cereb Cortex. 2013;23: 1378–1387. 10.1093/cercor/bhs118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Meyer L, Gumbert M. Synchronization of Electrophysiological Responses with Speech Benefits Syntactic Information Processing. J Cogn Neurosci. 2018;30: 1066–1074. 10.1162/jocn_a_01236 [DOI] [PubMed] [Google Scholar]
  • 39.Meyer L, Henry MJ, Gaston P, Schmuck N, Friederici AD. Linguistic Bias Modulates Interpretation of Speech via Neural Delta-Band Oscillations. Cereb Cortex. 2017;27: 4293–4302. 10.1093/cercor/bhw228 [DOI] [PubMed] [Google Scholar]
  • 40.Vander Ghinst M, Bourguignon M, Op de Beeck M, Wens V, Marty B, Hassid S, et al. Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene. J Neurosci. 2016;36: 1596–1606. 10.1523/JNEUROSCI.1730-15.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Ding N, Melloni L, Zhang H, Tian X, Poeppel D. Cortical tracking of hierarchical linguistic structures in connected speech. Nat Neurosci. 2016;19: 158–164. 10.1038/nn.4186 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Riecke L, Formisano E, Sorger B, Başkent D, Gaudrain E. Neural Entrainment to Speech Modulates Speech Intelligibility. Curr Biol. 2018;28: 161–169.e5. 10.1016/j.cub.2017.11.033 [DOI] [PubMed] [Google Scholar]
  • 43.Vanthornhout J, Decruy L, Wouters J, Simon JZ, Francart T. Speech intelligibility predicted from neural entrainment of the speech envelope. Journal of the Association for Research in Otolaryngology. 2018. 10.1101/246660 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Wilsch A, Neuling T, Obleser J, Herrmann CS. Transcranial alternating current stimulation with speech envelopes modulates speech comprehension. Neuroimage. 2018;172: 766–774. 10.1016/j.neuroimage.2018.01.038 [DOI] [PubMed] [Google Scholar]
  • 45.Ding N, Simon JZ. Cortical entrainment to continuous speech: functional roles and interpretations. Front Hum Neurosci. 2014;8: 311 10.3389/fnhum.2014.00311 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Fuglsang SA, Dau T, Hjortkjær J. Noise-robust cortical tracking of attended speech in real-world acoustic scenes. Neuroimage. 2017;156: 435–444. 10.1016/j.neuroimage.2017.04.026 [DOI] [PubMed] [Google Scholar]
  • 47.Puschmann S, Steinkamp S, Gillich I, Mirkovic B, Debener S, Thiel CM. The Right Temporoparietal Junction Supports Speech Tracking During Selective Listening: Evidence from Concurrent EEG-fMRI. J Neurosci. 2017;37: 11505–11516. 10.1523/JNEUROSCI.1007-17.2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Rimmele JM, Golumbic EZ, Schröger E, Poeppel D. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene. Cortex. 2015; 144–154. 10.1016/j.cortex.2014.12.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Broderick MP, Anderson AJ, Di Liberto GM, Crosse MJ, Lalor EC. Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech. Curr Biol. 2018;28: 803–809.e3. 10.1016/j.cub.2018.01.080 [DOI] [PubMed] [Google Scholar]
  • 50.Ding N, Simon JZ. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc Natl Acad Sci U S A. 2012;109: 11854–11859. 10.1073/pnas.1205381109 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Ding N, Simon JZ. Adaptive temporal encoding leads to a background-insensitive cortical representation of speech. J Neurosci. 2013;33: 5728–5735. 10.1523/JNEUROSCI.5297-12.2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Horton C, D’Zmura M, Srinivasan R. Suppression of competing speech through entrainment of cortical oscillations. J Neurophysiol. 2013;109: 3082–3093. 10.1152/jn.01026.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Mesgarani N, Chang EF. Selective cortical representation of attended speaker in multi-talker speech perception. Nature. 2012;485: 233–236. 10.1038/nature11020 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.O’Sullivan JA, Power AJ, Mesgarani N, Rajaram S, Foxe JJ, Shinn-Cunningham BG, et al. Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG. Cereb Cortex. 2014;25: 1697–1706. 10.1093/cercor/bht355 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Simon JZ. The encoding of auditory objects in auditory cortex: insights from magnetoencephalography. Int J Psychophysiol. 2015;95: 184–190. 10.1016/j.ijpsycho.2014.05.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Zion-Golumbic E, Schroeder CE. Attention modulates “speech-tracking” at a cocktail party. Trends in Cognitive Sciences. 2012; 363–364. 10.1016/j.tics.2012.05.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Pollack I. Auditory informational masking. The Journal of the Acoustical Society of America. 1975; S5–S5. 10.1121/1.1995329 [DOI] [Google Scholar]
  • 58.Hoen M, Meunier F, Grataloup C-L, Pellegrino F, Grimault N, Perrin F, et al. Phonetic and lexical interferences in informational masking during speech-in-speech comprehension. Speech Communication. 2007; 905–916. 10.1016/j.specom.2007.05.008 [DOI] [Google Scholar]
  • 59.Cooke M, Garcia Lecumberri ML, Barker J. The foreign language cocktail party problem: Energetic and informational masking effects in non-native speech perception. J Acoust Soc Am. 2008;123: 414–427. 10.1121/1.2804952 [DOI] [PubMed] [Google Scholar]
  • 60.Rhebergen KS, Versfeld NJ, Dreschler WA. Release from informational masking by time reversal of native and non-native interfering speech. J Acoust Soc Am. 2005;118: 1274–1277. 10.1121/1.2000751 [DOI] [PubMed] [Google Scholar]
  • 61.Sumby WH, Pollack I. Visual Contribution to Speech Intelligibility in Noise. The Journal of the Acoustical Society of America. 1954; 212–215. 10.1121/1.1907309 [DOI] [Google Scholar]
  • 62.Schwartz J-L, Berthommier F, Savariaux C. Seeing to hear better: evidence for early audio-visual interactions in speech identification. Cognition. 2004;93: B69–78. 10.1016/j.cognition.2004.01.006 [DOI] [PubMed] [Google Scholar]
  • 63.Goswami U. Sensory theories of developmental dyslexia: three challenges for research. Nat Rev Neurosci. 2015;16: 43–54. 10.1038/nrn3836 [DOI] [PubMed] [Google Scholar]
  • 64.Crosse MJ, Di Liberto GM, Bednar A, Lalor EC. The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli. Front Hum Neurosci. 2016;10: 604 10.3389/fnhum.2016.00604 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Bourguignon M, Baart M, Kapnoula EC, Molinaro N. Lip-reading enables the brain to synthesize auditory features of unknown silent speech. J Neurosci. 2019. 10.1523/JNEUROSCI.1101-19.2019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Zion-Golumbic EM, Ding N, Bickel S, Lakatos P, Schevon CA, McKhann GM, et al. Mechanisms underlying selective neuronal tracking of attended speech at a “cocktail party.” Neuron. 2013;77: 980–991. 10.1016/j.neuron.2012.12.037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Lalor EC, Foxe JJ. Neural responses to uninterrupted natural speech can be extracted with precise temporal resolution. Eur J Neurosci. 2010;31: 189–193. 10.1111/j.1460-9568.2009.07055.x [DOI] [PubMed] [Google Scholar]
  • 68.Kriegeskorte N, Simmons WK, Bellgowan PSF, Baker CI. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci. 2009;12: 535–540. 10.1038/nn.2303 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Ince R. Measuring Multivariate Redundant Information with Pointwise Common Change in Surprisal. Entropy. 2017; 318 10.3390/e19070318 [DOI] [Google Scholar]
  • 70.Ince RAA, Giordano BL, Kayser C, Rousselet GA, Gross J, Schyns PG. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula. Human Brain Mapping. 2017;38: 1541–1573. 10.1002/hbm.23471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Molinaro N, Lizarazu M. Delta(but not theta)-band cortical entrainment involves speech-specific processing. European Journal of Neuroscience. 2018; 2642–2650. 10.1111/ejn.13811 [DOI] [PubMed] [Google Scholar]
  • 72.Allport DA, Funnell E. Components of the Mental Lexicon. Philosophical Transactions of the Royal Society B: Biological Sciences. 1981; 397–410. 10.1098/rstb.1981.0148 [DOI] [Google Scholar]
  • 73.McClelland JL, Rogers TT. The parallel distributed processing approach to semantic cognition. Nat Rev Neurosci. 2003;4: 310–322. 10.1038/nrn1076 [DOI] [PubMed] [Google Scholar]
  • 74.Ramus F. The neural basis of reading acquisition. In: Gazzaniga MS, editor. The Cognitive Neurosciences (3rd ed). 2004. pp. 815–824.
  • 75.Ricketts J, Davies R, Masterson J, Stuart M, Duff FJ. Evidence for semantic involvement in regular and exception word reading in emergent readers of English. J Exp Child Psychol. 2016;150: 330–345. 10.1016/j.jecp.2016.05.013 [DOI] [PubMed] [Google Scholar]
  • 76.Kaandorp MW, De Groot AMB, Festen JM, Smits C, Goverts ST. The influence of lexical-access ability and vocabulary knowledge on measures of speech recognition in noise. Int J Audiol. 2016;55: 157–167. 10.3109/14992027.2015.1104735 [DOI] [PubMed] [Google Scholar]
  • 77.Carroll R, Warzybok A, Kollmeier B, Ruigendijk E. Age-Related Differences in Lexical Access Relate to Speech Recognition in Noise. Front Psychol. 2016;7: 990 10.3389/fpsyg.2016.00990 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Mattys SL, Wiget L. Effects of cognitive load on speech recognition. Journal of Memory and Language. 2011; 145–160. 10.1016/j.jml.2011.04.004 [DOI] [Google Scholar]
  • 79.Golumbic EZ, Zion Golumbic E, Cogan GB, Schroeder CE, Poeppel D. Visual Input Enhances Selective Speech Envelope Tracking in Auditory Cortex at a “Cocktail Party.” Journal of Neuroscience. 2013; 1417–1426. 10.1523/JNEUROSCI.3675-12.2013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Park H, Ince RAA, Schyns PG, Thut G, Gross J. Representational interactions during audiovisual speech entrainment: Redundancy in left posterior superior temporal gyrus and synergy in left motor cortex. PLoS Biol. 2018;16: e2006558 10.1371/journal.pbio.2006558 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Park H, Kayser C, Thut G, Gross J. Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility. eLife. 2016. 10.7554/elife.14521 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Bourguignon M, Baart M, Kapnoula EC, Molinaro N. Hearing through lip-reading: the brain synthesizes features of absent speech. 10.1101/395483 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Giordano BL, Ince RAA, Gross J, Schyns PG, Panzeri S, Kayser C. Contributions of local speech encoding and functional connectivity to audio-visual speech perception. eLife. 2017. 10.7554/elife.24763 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.MacLeod A, Summerfield Q. Quantifying the contribution of vision to speech perception in noise. Br J Audiol. 1987;21: 131–141. 10.3109/03005368709077786 [DOI] [PubMed] [Google Scholar]
  • 85.Helfer KS, Freyman RL. The role of visual speech cues in reducing energetic and informational masking. J Acoust Soc Am. 2005;117: 842–849. 10.1121/1.1836832 [DOI] [PubMed] [Google Scholar]
  • 86.Crosse MJ, Di Liberto GM, Lalor EC. Eye Can Hear Clearly Now: Inverse Effectiveness in Natural Audiovisual Speech Processing Relies on Long-Term Crossmodal Temporal Integration. J Neurosci. 2016;36: 9888–9895. 10.1523/JNEUROSCI.1396-16.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Hauswald A, Lithari C, Collignon O, Leonardelli E, Weisz N. A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements. Curr Biol. 2018;28: 1453–1459.e3. 10.1016/j.cub.2018.03.044 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.O’Sullivan AE, Crosse MJ, Di Liberto GM, Lalor EC. Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading. Front Hum Neurosci. 2016;10: 679 10.3389/fnhum.2016.00679 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Crosse MJ, Lalor EC. The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech. J Neurophysiol. 2014;111: 1400–1408. 10.1152/jn.00690.2013 [DOI] [PubMed] [Google Scholar]
  • 90.Baart M, de Boer-Schellekens L, Vroomen J. Lipread-induced phonetic recalibration in dyslexia. Acta Psychol. 2012;140: 91–95. 10.1016/j.actpsy.2012.03.003 [DOI] [PubMed] [Google Scholar]
  • 91.Keetels M, Bonte M, Vroomen J. A Selective Deficit in Phonetic Recalibration by Text in Developmental Dyslexia. Front Psychol. 2018;9: 710 10.3389/fpsyg.2018.00710 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.van Laarhoven T, Keetels M, Schakel L, Vroomen J. Audio-visual speech in noise perception in dyslexia. Developmental Science. 2018; e12504 10.1111/desc.12504 [DOI] [PubMed] [Google Scholar]
  • 93.Bastien-Toniazzo M, Stroumza A, Cavé C. Audio-visual perception and integration in developmental dyslexia: An exploratory study using the McGurk effect. Curr Psychol Lett. 2010;25. [Google Scholar]
  • 94.Rüsseler J, Gerth I, Heldmann M, Münte TF. Audiovisual perception of natural speech is impaired in adult dyslexics: an ERP study. Neuroscience. 2015;287: 55–65. 10.1016/j.neuroscience.2014.12.023 [DOI] [PubMed] [Google Scholar]
  • 95.Ramirez J, Mann V. Using auditory-visual speech to probe the basis of noise-impaired consonant-vowel perception in dyslexia and auditory neuropathy. J Acoust Soc Am. 2005;118: 1122–1133. 10.1121/1.1940509 [DOI] [PubMed] [Google Scholar]
  • 96.Campbell R. The processing of audio-visual speech: empirical and neural bases. Philos Trans R Soc Lond B Biol Sci. 2008;363: 1001–1010. 10.1098/rstb.2007.2155 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.van Atteveldt N, Formisano E, Goebel R, Blomert L. Integration of Letters and Speech Sounds in the Human Brain. Neuron. 2004; 271–282. 10.1016/j.neuron.2004.06.025 [DOI] [PubMed] [Google Scholar]
  • 98.Raij T, Uutela K, Hari R. Audiovisual Integration of Letters in the Human Brain. Neuron. 2000; 617–625. 10.1016/s0896-6273(00)00138-0 [DOI] [PubMed] [Google Scholar]
  • 99.Blomert L. The neural signature of orthographic–phonological binding in successful and failing reading development. NeuroImage. 2011; 695–703. 10.1016/j.neuroimage.2010.11.003 [DOI] [PubMed] [Google Scholar]
  • 100.Bernstein LE, Liebenthal E. Neural pathways for visual speech perception. Frontiers in Neuroscience. 2014. 10.3389/fnins.2014.00386 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Leybaert J, Macchi L, Huyse A, Champoux F, Bayard C, Colin C, et al. Atypical audio-visual speech perception and McGurk effects in children with specific language impairment. Front Psychol. 2014;5: 422 10.3389/fpsyg.2014.00422 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Hood M, Conlon E. Visual and auditory temporal processing and early reading development. Dyslexia. 2004;10: 234–252. 10.1002/dys.273 [DOI] [PubMed] [Google Scholar]
  • 103.Boets B, Wouters J, van Wieringen A, De Smedt B, Ghesquière P. Modelling relations between sensory processing, speech perception, orthographic and phonological ability, and literacy achievement. Brain Lang. 2008;106: 29–40. 10.1016/j.bandl.2007.12.004 [DOI] [PubMed] [Google Scholar]
  • 104.Katz RB, Healy AF, Shankweiler D. Phonetic coding and order memory in relation to reading proficiency: A comparison of short-term memory for temporal and spatial order information. Applied Psycholinguistics. 1983; 229–250. 10.1017/s0142716400004598 [DOI] [Google Scholar]
  • 105.Shankweiler D. The speech code and learning to read. Journal of Experimental Psychology: Human Learning & Memory. 1979; 531–545. 10.1037//0278-7393.5.6.531 [DOI] [Google Scholar]
  • 106.Brady S, Shankweiler D, Mann V. Speech perception and memory coding in relation to reading ability. J Exp Child Psychol. 1983;35: 345–367. 10.1016/0022-0965(83)90087-5 [DOI] [PubMed] [Google Scholar]
  • 107.Araújo S, Reis A, Petersson KM, Faísca L. Rapid automatized naming and reading performance: A meta-analysis. Journal of Educational Psychology. 2015; 868–883. 10.1037/edu0000006 [DOI] [Google Scholar]
  • 108.Lervåg A, Hulme C. Rapid automatized naming (RAN) taps a mechanism that places constraints on the development of early reading fluency. Psychol Sci. 2009;20: 1040–1048. 10.1111/j.1467-9280.2009.02405.x [DOI] [PubMed] [Google Scholar]
  • 109.Norton ES, Wolf M. Rapid automatized naming (RAN) and reading fluency: implications for understanding and treatment of reading disabilities. Annu Rev Psychol. 2012;63: 427–452. 10.1146/annurev-psych-120710-100431 [DOI] [PubMed] [Google Scholar]
  • 110.Georgiou GK, Parrila R, Cui Y, Papadopoulos TC. Why is rapid automatized naming related to reading? J Exp Child Psychol. 2013;115: 218–225. 10.1016/j.jecp.2012.10.015 [DOI] [PubMed] [Google Scholar]
  • 111.Landerl K, Harald Freudenthaler H, Heene M, De Jong PF, Desrochers A, Manolitsis G, et al. Phonological Awareness and Rapid Automatized Naming as Longitudinal Predictors of Reading in Five Alphabetic Orthographies with Varying Degrees of Consistency. Scientific Studies of Reading. 2019; 220–234. 10.1080/10888438.2018.1510936 [DOI] [Google Scholar]
  • 112.Torgesen JK, Wagner RK, Rashotte CA, Burgess S, Hecht S. Contributions of Phonological Awareness and Rapid Automatic Naming Ability to the Growth of Word-Reading Skills in Second-to Fifth-Grade Children. Scientific Studies of Reading. 1997; 161–185. 10.1207/s1532799xssr0102_4 [DOI] [Google Scholar]
  • 113.Sprenger-Charolles L, Siegel LS, Béchennec D, Serniclaes W. Development of phonological and orthographic processing in reading aloud, in silent reading, and in spelling: a four-year longitudinal study. J Exp Child Psychol. 2003;84: 194–217. 10.1016/s0022-0965(03)00024-9 [DOI] [PubMed] [Google Scholar]
  • 114.Elhassan Z, Crewther SG, Bavin EL. The Contribution of Phonological Awareness to Reading Fluency and Its Individual Sub-skills in Readers Aged 9- to 12-years. Front Psychol. 2017;8: 533 10.3389/fpsyg.2017.00533 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Boets B, Op de Beeck HP, Vandermosten M, Scott SK, Gillebert CR, Mantini D, et al. Intact but less accessible phonetic representations in adults with dyslexia. Science. 2013;342: 1251–1254. 10.1126/science.1244333 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116.Perfetti CA, Beck I, Bell LC, Hughes C. Phonemic Knowledge and Learning to Read are Reciprocal: A Longitudinal Study of First Grade Children. Merrill Palmer Q. 1987;33: 283–319. [Google Scholar]
  • 117.Demanez L, Dony-Closon B, Lhonneux-Ledoux E, Demanez JP. Central auditory processing assessment: a French-speaking battery. Acta Otorhinolaryngol Belg. 2003;57: 275–290. [PubMed] [Google Scholar]
  • 118.Currie CE, Elton RA, Todd J, Platt S. Indicators of socioeconomic status for adolescents: the WHO Health Behaviour in School-aged Children Survey. Health Educ Res. 1997;12: 385–397. 10.1093/her/12.3.385 [DOI] [PubMed] [Google Scholar]
  • 119.Jacquier-Roux M, Valdois S, Zorman M, Lequette C, Pouget GM. Odédys. Grenoble, France: Laboratoire Cogni-Sciences; 2005.
  • 120.Lefavrais P. L’Alouette-R. Paris: Les Editions du Centre de Psychologie Appliquée; 2005.
  • 121.Gauthier L, Dehaut F, Joanette Y. Bells Test. PsycTESTS Dataset. 1989. 10.1037/t28075-000 [DOI] [Google Scholar]
  • 122.Fimm B, Zimmermann P. A test battery for attentional performance. Applied Neuropsychology of Attention Theory, Diagnosis and Rehabilitation. 2002; 110–151. [Google Scholar]
  • 123.Wechsler D, Naglieri JA. Wechsler Nonverbal Scale of Ability (WNV): Technical and interpretive manual. San Antonio: Harcourt Assessment; 2006.
  • 124.De Tiège X, Op de Beeck M, Funke M, Legros B, Parkkonen L, Goldman S, et al. Recording epileptic activity with MEG in a light-weight magnetic shield. Epilepsy Res. 2008;82: 227–231. 10.1016/j.eplepsyres.2008.08.011 [DOI] [PubMed] [Google Scholar]
  • 125.Taulu S, Simola J. Spatiotemporal signal space separation method for rejecting nearby interference in MEG measurements. Phys Med Biol. 2006;51: 1759–1768. 10.1088/0031-9155/51/7/008 [DOI] [PubMed] [Google Scholar]
  • 126.Taulu S, Simola J, Kajola M. Applications of the signal space separation method. IEEE Trans Signal Process. 2005;53: 3359–3372. 10.1109/tsp.2005.853302 [DOI] [Google Scholar]
  • 127.Biesmans W, Das N, Francart T, Bertrand A. Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario. IEEE Trans Neural Syst Rehabil Eng. 2017;25: 402–412. 10.1109/TNSRE.2016.2571900 [DOI] [PubMed] [Google Scholar]
  • 128.Daube C, Ince RAA, Gross J. Simple Acoustic Features Can Explain Phoneme-Based Predictions of Cortical Responses to Speech. Curr Biol. 2019;29: 1924–1937.e9. 10.1016/j.cub.2019.04.067 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2018. [Google Scholar]
  • 130.Bates DM, Maechler M, Bolker B. lme4: Linear mixed-effects models using S4 classes. 2011. [Google Scholar]

Decision Letter 0

Gabriel Gasque

9 Feb 2020

Dear Dr Destoky,

Thank you for submitting your manuscript entitled "Cortical tracking of speech in noise accounts for reading strategies in children" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff, as well as by an Academic Editor with relevant expertise, and I am writing to let you know that we would like to send your submission out for external peer review. Please accept my apologies for the delay in sending this decision to you.

Before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Please re-submit your manuscript within two working days, i.e. by Feb 12 2020 11:59PM.

Login to Editorial Manager here: https://www.editorialmanager.com/pbiology

During resubmission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF when you re-submit.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Gabriel Gasque, Ph.D.,

Senior Editor

PLOS Biology

Decision Letter 1

Gabriel Gasque

10 Apr 2020

Dear Dr Destoky,

Thank you very much for submitting your manuscript "Cortical tracking of speech in noise accounts for reading strategies in children" for consideration as a Research Article at PLOS Biology. Your manuscript has been evaluated by the PLOS Biology editors, by an Academic Editor with relevant expertise, and by three independent reviewers.

The reviews of your manuscript are appended below. You will see that the reviewers find the work potentially interesting. However, based on their specific comments and following discussion with the Academic Editor, I regret that we cannot accept the current version of the manuscript for publication. We remain interested in your study and we would be willing to consider resubmission of a comprehensively revised version that thoroughly addresses all the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript would be sent for further evaluation by the reviewers. Please note that a positive outcome is not guaranteed at this stage.

If you decide to revise for PLOS Biology, your revisions should address the specific points made by each reviewer. As you will see, reviewer 1 raises the fundamental issue of whether SiN tracking does actually reflect performance, reviewer 2 raises the issue of the large number of reading and MEG measures and the reproducibility of the correlations and suggests a very reasonable analysis to test on the current data set, and reviewer 3 raises issues with respect to sample characterisation. The Academic Editor thinks, and we agree, that all these concerns should be thoroughly addressed for a successful revision.

We appreciate that our and the reviewers’ requests represent a great deal of extra work, and we are therefore willing to relax our standard revision time: we expect to receive your revised manuscript within six months.

Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension --we understand these are particularly complex times. At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may end consideration of the manuscript at PLOS Biology.

**IMPORTANT - SUBMITTING YOUR REVISION**

As stated above, your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point by point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type.

*Resubmission Checklist*

When you are ready to resubmit your revised manuscript, please refer to this resubmission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

*Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

*Blot and Gel Data Policy*

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Gabriel Gasque, Ph.D.,

Senior Editor

PLOS Biology

*****************************************************

REVIEWS:

Reviewer #1: Review: PBIOLOGY-D-20-00208R1

The manuscript presents an intriguing investigation of the relationship between indicators of reading skill and cortical tracking of speech in noise in the presence of additional visual (lip-movement, 'lips') information or in its absence ('pics'). They examine tracking at two time-scales, the phrasal and the syllabic rate. They compare the modulation of tracking by visual information in the presence of informational and non-informational noise.

As a result of the number of factors investigated and the number of outcomes presented, I found the paper overly dense and the conclusions hard to evaluate. Cortical tracking (nCTS) is evaluated for each hemisphere, for 2 visual conditions, and for 4 noise conditions at 2 time-scales. These 32 values are then somehow reduced to 8 "contrasts", which are then related to an array of reading indicators. This is achieved through a combination of variable selection using LME followed by PID.

The data point to a relationship between the enhancement of cortical tracking in noise by the provision of supporting visual information and several indicators of reading performance. The authors suggest that the ability to use visual information to enhance cortical entrainment to speech is related to reading ability. One potential mechanism offered is that in noisy classroom situations, children with better ability to track speech will benefit from easier development of the lexical route to reading for words with irregular orthographies.

The results are certainly of potential interest, but I have a number of concerns at the design level, about the way the data are interpreted and about the data presentation, which I found to be muddled. Two issues in particular stand out:

1) The conception of the different types of noise employed in the study

2) Frequent unguarded statements that imply direction of causation, that cannot be determined from the data, in part because of an absence of relevant behavioural data

I shall outline the concerns I have below.

Fundamental Issues

I felt that there was at least one potentially fundamental flaw relating to the stimuli. The design of the experiment is articulated as a 2*2 parametric manipulation of informational * energetic masking. Unfortunately, it is simply not credible that masking can be non-energetic. Informational masking necessarily implies some degree of spectro-temporal coincidence of the target and the masker, in addition to the categorical coincidence (both are of the same nature as the target). A 'masker' that does not spectro-temporally overlap with a signal is not a masker, though it may have a deleterious impact on target processing for various reasons. The non-energetic non-informational condition is an interesting one, but since by design it aims to avoid masking the target, it is difficult to see how these 4 'masking' conditions truly relate to one another. It may be that the so-called non-energetic conditions are in fact distractors, but it is rather hard to determine what the conditions actually are based on the description in the methods in the main text. The supplementary materials only describe the generation of the energetic non-informational mask. More information is needed, and the terminology must be reconsidered. This has a considerable impact on the interpretation of the results. Furthermore, the difference between the masker types is not properly explored - to infer that it is the speech content of the informational conditions, and not their spectro-temporal modulation properties, that drives the effects is not easily justified. This is, of course, a classical problem in the literature. How well are the non-informational and informational conditions really matched? Would a time-reversed version of the babble provoke the same kinds of effects, or not? It may be, of course, that the authors do not want to draw specific conclusions based upon types of maskers. I would suggest that an effort be made to ensure that the rationale behind the masker design and its relationship to the hypotheses be made clearer.

A basic theoretical issue that I found somewhat disconcerting is the underlying assumption of the desirability of higher cortical tracking. It is as yet unproven whether more tracking somehow equates to better performance. Despite the community's best efforts, the causal relationship between speech tracking and speech comprehension is at best unclear.

Another serious concern is the complete absence of behavioural data on SiN. That is - without ascertaining that the differences in nCTS that are reported across groups and conditions relate to SiN perception, claims cannot be made about the relationship between entrainment and SiN comprehension ability. This concern permeates the discussion. I do not question the various relationships that are reported between the nCTS and reading indicators, but it does not seem tenable to make claims about how decreased AV fusion is responsible for impoverished SiN comprehension in dyslexic individuals. For instance, in LL. 360-361 the measures are described as an "objective cortical measure of the ability to deal with babble noise"; this does not seem acceptable - where is the evidence that this reflects the ability to deal with babble noise? Do the participants report greater subjective clarity of the target, less effortful listening, higher accuracy of report? The one behavioural measure that seems to have been gathered is explicitly not analysed. I would strongly suggest that the authors reconsider this decision and attempt to link behaviour directly related to SiN to these measures in order to make their other conclusions more sustainable.

A general question concerns the decisions to use the lexical/phonological route distinction and to take the existence of separate routes as a given. It may be the case that this applies to alphabetic languages, but it is well known that it cannot generalise to non-alphabetic languages (e.g. Chinese). Some effort should be made to acknowledge that this study focuses on an alphabetic language with a non-transparent orthography, which may represent a specific subcategory of how reading can be implemented.

The authors further analyse the differences between a dyslexic group and two non-dyslexic groups: a group of reading-level-matched children and a group of age-matched children. The outcomes of this analysis are not straightforward, but they are summarised as follows: dyslexic individuals show the same "reliance on visual speech to boost phrasal nCTS" as age-matched controls, but phrasal nCTS in babble and reliance on visual speech to boost syllabic nCTS are altered. These results could be made substantially clearer, and conclusions can only meaningfully be drawn from them if an explicit comparison between the reading-matched controls and the age-matched controls is carried out, to determine whether these effects are in any way specific to dyslexia.

Major Issues

The repeated application of the PID methodology is challenging to follow. A little more effort to explain why each of the separate analyses is carried out would be welcome, given that the method is presented as one that can provide insights into the unique contributions of a large set of variables. Naively, one would ask why all the variables of interest are not therefore handled together. Again, naively, one asks how legitimate it is to use LME to eliminate variables and then to use PID on selected variables only. To what extent could this be considered, in neuroimaging parlance, "double-dipping"? If the PID is intended only to provide qualitative insights then this is of no concern. It may be that it really is of no concern, and it would be welcome if the authors indicated this (and why).
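
For orientation, the decomposition at stake here splits the information that two predictors jointly carry about an outcome into redundant, unique, and synergistic parts. The sketch below is not the implementation used in the manuscript; it is a minimal illustration that discretises the variables and uses the minimum of the two single-predictor mutual informations as a crude redundancy measure. All variable names are hypothetical.

    import numpy as np

    def discretize(x, bins=4):
        """Map a continuous variable onto integer quantile-bin labels."""
        edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
        return np.digitize(x, edges)

    def mi_discrete(x, y):
        """Plug-in mutual information (in bits) between two discrete label arrays."""
        xs, xi = np.unique(x, return_inverse=True)
        ys, yi = np.unique(y, return_inverse=True)
        joint = np.zeros((xs.size, ys.size))
        np.add.at(joint, (xi, yi), 1)
        pxy = joint / joint.sum()
        px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
        nz = pxy > 0
        return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

    def pid_two_sources(x1, x2, y, bins=4):
        """Split I({x1,x2}; y) into redundant, unique, and synergistic parts,
        using min(I(x1;y), I(x2;y)) as a simple redundancy measure."""
        d1, d2, dy = discretize(x1, bins), discretize(x2, bins), discretize(y, bins)
        i1, i2 = mi_discrete(d1, dy), mi_discrete(d2, dy)
        i12 = mi_discrete(d1 * bins + d2, dy)  # the joint source coded as one label
        red = min(i1, i2)
        uni1, uni2 = i1 - red, i2 - red
        syn = i12 - uni1 - uni2 - red
        return {"redundant": red, "unique_x1": uni1, "unique_x2": uni2, "synergy": syn}

    # Illustrative call with two hypothetical predictors of a reading score:
    rng = np.random.default_rng(0)
    ncts_contrast = rng.standard_normal(99)
    ran_score = 0.5 * ncts_contrast + rng.standard_normal(99)
    reading_score = 0.6 * ncts_contrast + 0.3 * ran_score + rng.standard_normal(99)
    print(pid_two_sources(ncts_contrast, ran_score, reading_score))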

There is a consistent lack of clarity in what the measures are, and this should be rectified. For example, Table 3 refers to the following: "Regressors included in the final linear mixed-effects model fit to the 5 reading scores". This is presumably not the correct description, since the factors listed in the table are the 5 reading scores. What is the dependent variable in this analysis?

Methodological queries

A lot of relevant information is assumed or relegated to the supplementary materials. It would be helpful to make some of the more crucial aspects of the methodology (e.g. stimulus generation, interpretation of PID) more obvious.

Numerous references are made to correcting variables for age, time in school, IQ, and then standardising (again). What was this correction? Would it not be possible to correct by the simple expedient of including the variables in the LME model?
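
As a concrete reading of what such a correction might involve (a sketch only; the authors' exact procedure may differ, and the array names are hypothetical), each score can be regressed on the nuisance covariates and the residuals z-scored:

    import numpy as np

    def residualize_and_standardize(score, covariates):
        """Regress a score on nuisance covariates (with an intercept) and z-score the residuals."""
        X = np.column_stack([np.ones(len(score)), covariates])
        beta, *_ = np.linalg.lstsq(X, score, rcond=None)
        resid = score - X @ beta
        return (resid - resid.mean()) / resid.std(ddof=1)

    # Hypothetical usage: correct a reading score for age, months of schooling, and nonverbal IQ.
    # reading_corrected = residualize_and_standardize(
    #     reading_score, np.column_stack([age, schooling_months, nonverbal_iq]))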

Why did the model comparison procedure begin with the simplest rather than the maximal model?

Why was LME used for variable selection and not stepwise regression?
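
To make these two queries concrete, forward selection that starts from the simplest (intercept-only) mixed model can be sketched as below. This is an illustration only, not the authors' R/lme4 pipeline; the column names, grouping variable, and AIC criterion are assumptions.

    import statsmodels.formula.api as smf

    def fit_aic(formula, data, groups):
        """Fit a random-intercept mixed model by maximum likelihood and return its AIC."""
        res = smf.mixedlm(formula, data, groups=data[groups]).fit(reml=False)
        k = len(res.fe_params) + 2  # fixed effects + random-intercept variance + residual variance
        return -2 * res.llf + 2 * k

    def forward_select(data, response, candidates, groups="child"):
        """Greedily add the fixed effect that most improves AIC, starting from the simplest model."""
        selected, remaining = [], list(candidates)
        best = fit_aic(f"{response} ~ 1", data, groups)
        while remaining:
            trials = {c: fit_aic(f"{response} ~ " + " + ".join(selected + [c]), data, groups)
                      for c in remaining}
            winner = min(trials, key=trials.get)
            if trials[winner] >= best:
                break
            best = trials[winner]
            selected.append(winner)
            remaining.remove(winner)
        return selected, best

    # Hypothetical usage:
    # selected, aic = forward_select(df, "reading_score",
    #                                ["phrasal_ncts", "syllabic_ncts", "ran", "digit_span"])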

The statistics reported for the various PID analyses need further elucidation. It is not clear what the statistic is, what the degrees of freedom are, nor how the p values are derived. Consequently, the p values seem inconsistent, e.g. LL 207-208: "(redundant information =0.16; p = 0.0020; synergistic information = 0.12; p = 0.26)", but in LL.206 unique information = 0.31 corresponds to p=.10. How are we to interpret these figures? I accept that this information may be in the various existing publications on the PID, but it would be extremely helpful to be able to interpret these values in context without referring to these.

It is not made clear why hemisphere is a variable of interest in the analyses - such a large-scale division of the brain seems somewhat arbitrary and should be clearly motivated.

Reviewer #2: This manuscript presents research aimed at investigating the links between reading ability and 1) the cortical tracking of speech (as measured using MEG) and 2) classic behavioural predictors of reading in a population of schoolchildren. The authors present children with stories that either have no noise added or four different types of noise and that either are accompanied by a relevant static picture or a video of the speaker's face. They then calculate a measure of how well the MEG is tracking the speech in each of the 8 noise conditions normalized by the tracking in the no noise condition. The authors also collected a large number of measures of reading performance and a large number of measures of classical behavioural predictors. They then use linear mixed-effect modelling to explore how any of their 8 cortical tracking measures - together with their many classical behavioural predictors - might explain reading performance. Furthermore, they use Partial Information Decomposition to identify whether any of these predictors makes a unique contribution to predicting reading performance or whether it might be redundant with other predictors or whether it might combine with other predictors to make even better predictions (synergy). They find a number of relationships between cortical tracking measures and behavioural predictors. And they show that some of these relationships (but not others) apply to individuals with dyslexia.

This manuscript tackles an interesting topic and does so with a nice data set and nice experiment.

However, ultimately, I have one overarching concern that substantially dampens my enthusiasm for the work in its present form. Specifically, I could not help but worry about the robustness and replicability of the array of results we are presented with. The authors focus most of their analysis on 76 subjects. But they have 8 cortical tracking measures x 5 reading performance measures x 10 behavioral predictors (according to Table 1, but maybe only 5 in their analysis?). And, as such, I just found myself being sceptical about the results I was reading in every section. I would suggest that the authors might want to consider adding some additional analyses to reassure sceptical readers like me that the results we are seeing are likely to replicate. For example, the authors might consider permuting the labels on some of their predictors (e.g., the cortical tracking ones) and showing us that they can no longer get unique predictions from those cortical tracking measures. Or the authors might consider dividing their data in half and showing us that they consistently get the same pattern of results in both halves.
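
Both suggestions are straightforward to implement. A minimal sketch of a label-permutation test and a split-half consistency check for a single predictor and outcome might look like the following; the variable names in the usage comment are hypothetical.

    import numpy as np

    rng = np.random.default_rng(42)

    def permutation_pvalue(x, y, n_perm=5000):
        """Two-sided permutation p-value for the Pearson correlation between x and y."""
        obs = np.corrcoef(x, y)[0, 1]
        null = np.array([np.corrcoef(rng.permutation(x), y)[0, 1] for _ in range(n_perm)])
        p = (np.sum(np.abs(null) >= np.abs(obs)) + 1) / (n_perm + 1)
        return obs, p

    def split_half_consistency(x, y, n_splits=1000):
        """How often do two random halves of the sample show a correlation of the same sign?"""
        n, same_sign = len(x), 0
        for _ in range(n_splits):
            idx = rng.permutation(n)
            a, b = idx[: n // 2], idx[n // 2:]
            same_sign += np.sign(np.corrcoef(x[a], y[a])[0, 1]) == np.sign(np.corrcoef(x[b], y[b])[0, 1])
        return same_sign / n_splits

    # Hypothetical usage with a cortical tracking contrast and a lexical reading score:
    # r, p = permutation_pvalue(visual_modulation_phrasal_ncts, lexical_reading_speed)
    # consistency = split_half_consistency(visual_modulation_phrasal_ncts, lexical_reading_speed)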

Some more specific comments:

1) I think the nCTS equation should be included in the main body of the text (one plausible form is sketched after these specific comments).

2) Sorry if I missed it, but I did not see the authors discuss the fact that cortical tracking of speech will be (uninterestingly) improved by the inclusion of a video of the speaker's face because of the contribution of correlated activity from visual (i.e., occipital) sensors.

3) In line with my overarching concern above - I just found it implausible that nCTS could uniquely predict reading abilities when the classical behavioural predictors could not.

4) When you mention that "Two limitations are discussed in Supplementary Discussion", I think you should mention that they refer to limitations on only having one SNR for the stimuli and on training MEG models across all conditions and testing on each condition. Otherwise, a reader is left wondering about/searching for those limitations.

5) As I read the discussion - I could not help but wonder what the authors might expect to see in the cortical tracking of illiterate adults. Surely their cortex will reliably track speech in noise, no? Is there any literature to suggest that illiterate adults struggle more in challenging listening environments?
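
Regarding point 1, a normalized tracking index of this kind is often defined as a contrast between tracking in a given noise condition and tracking in the noiseless condition. The form below is given purely for illustration, as an assumption rather than the manuscript's definition:

    % Assumed contrast form, for illustration only; it ranges from -1 to +1.
    \[
      \mathrm{nCTS} \;=\; \frac{\mathrm{CTS}_{\mathrm{noise}} - \mathrm{CTS}_{\mathrm{noiseless}}}
                               {\mathrm{CTS}_{\mathrm{noise}} + \mathrm{CTS}_{\mathrm{noiseless}}}
    \]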

Reviewer #3: This study looked systematically at the association between cortical tracking of speech in noise and reading skill in children. The authors found that cortical tracking of the phrasal content of speech in noise is differentially related to lexical reading strategies as opposed to sublexical reading strategies. There was also evidence of differences in the cortical tracking of speech in noise of children with dyslexia, suggesting that they better integrate visual speech information to improve processing of phrasal level speech tracking, rather than syllable-level.

Major points:

This was a novel and interesting study with some clear findings and I appreciated the chance to review it. In the interest of transparency, while I have some expertise in neuroimaging, I do not have expertise in MEG specifically. However, I was able to follow the procedure and analysis and, to the best of my knowledge, the methodology appeared robust. There are some details that I am seeking clarification on in this review but, on the whole, it seems to me that enough detail is included to allow replication and scrutiny of methods. The sample size is good for a study of this nature. I have some concerns about how aspects of the data are interpreted but, in general, conclusions do not go too far beyond the findings and add value to the existing literature base in this area. The manuscript was very clear and well-written and it was a thought provoking study.

I have some concerns around the dyslexic sample, however, and I think that the manuscript needs to provide more detail about this subgroup and the analysis strategy taken. It is not clear anywhere that I could find how the dyslexic group was defined and recruited. Was it on the basis of an existing diagnosis, or of screening tests administered as part of the research project? How homogeneous was the group in terms of their reading difficulties? This is particularly important because inferences are drawn in relation to reading strategies using findings from the dyslexic group. I'm also unclear why the authors chose not to look at the relationships between CTSiN and reading skill in the children with dyslexia. I appreciate that, due to statistical power issues, they may not be able to conduct the same analysis as for the control group. However, in order to support some of the key interpretations of what CTS deficits in the dyslexic group mean that are proposed in the discussion, some idea of whether the relationships (even in terms of basic correlations) look similar seems vital to me. It would be hard to argue that CTS deficits are of importance in the dyslexic profile if they don't seem to relate to reading skill in this group. Similarly, can the authors provide details about the individual differences in CTS for the dyslexic group, as they do for the controls, i.e., what percentage show statistically significant phrasal and syllable CTS? It is important to know that this is a reliable effect in that group in order to interpret their data.

A more minor point, but one that I think permeates several findings and discussions within the manuscript, is around the role of phonological awareness and how it has been tested. The relationships between reading and the phonological awareness measures are quite weak in this dataset. The authors rightly propose that this may be due to the age of the children and that phonological awareness becomes less central as reading becomes more automated. However, it is important for the authors to acknowledge the ceiling effects in their phonological awareness tests (~90% accuracy in control children, if I've interpreted the tables correctly). If the tests are not sensitive enough, a failure to find that phonological awareness mediates CTS effects may reflect measurement insensitivity rather than the absence of mediation. I think that it is important that this is acknowledged as a possible reason why phonological awareness does not explain much of the variance in reading and why there may be no mediation effects. I think the conclusions relating to phonological processing need significant tempering because of this. In case of interest to the authors in their future work, we've found tests of spoonerisms to be more sensitive to phonological processing in these slightly older children, who tend to perform towards ceiling on phoneme deletion or fusion tasks.
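
The attenuating effect of a ceiling can be made concrete with a toy simulation (entirely hypothetical values, not the study's data): a skill that truly correlates with reading loses much of its observed correlation once most scores pile up at the test maximum.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 99

    # A latent phonological skill that genuinely correlates with reading.
    phono_latent = rng.standard_normal(n)
    reading = 0.6 * phono_latent + 0.8 * rng.standard_normal(n)

    # An easy test caps the observable score, so most children sit at the ceiling.
    ceiling = np.quantile(phono_latent, 0.35)   # roughly 65% of children at the maximum
    phono_observed = np.minimum(phono_latent, ceiling)

    print("correlation with latent skill  :", round(np.corrcoef(phono_latent, reading)[0, 1], 2))
    print("correlation with ceilinged test:", round(np.corrcoef(phono_observed, reading)[0, 1], 2))
    # The second correlation is markedly attenuated purely by the ceiling effect.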

Minor points

Line 41 - I think the authors should be cautious about claiming phrasal content of SiN relates to 'development of' lexical strategy when this is a concurrent association, not a longitudinal one. It is a little misleading.

Line 201 - is PID analysis robust to the fact that one set of variables has 5 indicators and the other has 8? Seems like this could bias the analysis, but I'm not particularly familiar with this analysis approach, so would appreciate the authors' clarification on this.

Line 220 - what does 'and further standardised' mean?

Section starting with line 215 - The table of correlations shows that visual modulation of syllable nCTS very consistently correlated with reading measures. Why doesn't this come out in the linear mixed-effects modelling? Is it because it doesn't contribute anything unique? It would be helpful for this to come through more clearly somewhere in this section.

Lines 343-345 - The first and third relations referred to here seem to concern the link between CTS and reading skill, so it doesn't seem accurate to say that these were altered in the dyslexic group, given that relationships with reading skill weren't investigated in this group.

Lines 368-369 - I'm not clear how the results in dyslexia support this relation, particularly as children with dyslexia often have more significant difficulties with pseudoword reading than irregular word reading. Can this be clarified?

Line 518 - I think it's important to state the age range of the children somewhere here. I know it's in Table 1 but it's important information that needs to be found easily.

Line 708 - A (very brief) description of what nCTS is would be beneficial here. I know it's described in the results but despite the ordering of the sections many people will read the method before the results.

Decision Letter 2

Gabriel Gasque

7 Jul 2020

Dear Dr Destoky,

Thank you for submitting your revised Research Article entitled "Cortical tracking of speech in noise accounts for reading strategies in children" for publication in PLOS Biology. We've now obtained advice from one of the original reviewers and have discussed their comments with the Academic Editor.

Based on the review, we will probably accept this manuscript for publication, assuming that you will modify the manuscript to address the remaining points raised by reviewer #2. Please also make sure to address the data and other policy-related requests noted at the end of this email.

We expect to receive your revised manuscript within two weeks. Your revisions should address the specific points made by each reviewer. In addition to the remaining revisions and before we will be able to formally accept your manuscript and consider it "in press", we also need to ensure that your article conforms to our guidelines. A member of our team will be in touch shortly with a set of requests. As we can't proceed until these requirements are met, your swift response will help prevent delays to publication.

*Copyediting*

Upon acceptance of your article, your final files will be copyedited and typeset into the final PDF. While you will have an opportunity to review these files as proofs, PLOS will only permit corrections to spelling or significant scientific errors. Therefore, please take this final revision time to assess and make any remaining major changes to your manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*Early Version*

Please note that an uncorrected proof of your manuscript will be published online ahead of the final version, unless you opted out when submitting your manuscript. If, for any reason, you do not want an earlier version of your manuscript published online, uncheck the box. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

*Submitting Your Revision*

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include a cover letter, a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable), and a track-changes file indicating any changes that you have made to the manuscript.

Please do not hesitate to contact me should you have any questions.

Sincerely,

Roli Roberts

Roland G Roberts PhD

Senior Editor

PLOS Biology

on behalf of

Gabriel Gasque, Ph.D.,

Senior Editor

PLOS Biology

------------------------------------------------------------------------

ETHICS STATEMENT:

-- Please include the full name of the IACUC/ethics committee that reviewed and approved the animal care and use protocol/permit/project license. Please also include an approval number.

-- Please include the specific national or international regulations/guidelines to which your animal care and use protocol adhered. Please note that institutional or accreditation organization guidelines (such as AAALAC) do not meet this requirement.

-- Please include information about the form of consent (written/oral) given for research involving human participants. All research involving human participants must have been approved by the authors' Institutional Review Board (IRB) or an equivalent committee, and all clinical investigation must have been conducted according to the principles expressed in the Declaration of Helsinki.

------------------------------------------------------------------------

DATA POLICY:

You may be aware of the PLOS Data Policy, which requires that all data be made available without restriction: http://journals.plos.org/plosbiology/s/data-availability. For more information, please also see this editorial: http://dx.doi.org/10.1371/journal.pbio.1001797

We note that you plan to deposit your data in OSF; please could you deposit it and send us a reviewer link or password so that we can assess it. Note that as well as the raw data, we ask that all individual numerical values that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it: Figs 2AB, 3, 4AB, 5, S2, S3AB. NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

Please also ensure that figure legends in your manuscript include information on where the underlying data can be found, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

------------------------------------------------------------------------

REVIEWER'S COMMENTS:

Reviewer #2:

Many thanks to the reviewers for their efforts in addressing my previous comments. I think the manuscript is significantly improved.

One remaining comment: I still remain skeptical about the idea of using 8 different cortical tracking measures and 5 different classical behavioral predictors of reading to account for 5 measures of reading in 73 subjects. And saying that you do so "in a single, statistically controlled analysis" doesn't ease my skepticism, I am afraid. So I will leave it as a suggestion to the authors that they might want to consider how they can try to make their approach more compelling to new readers. This could be, for example, by discussing the strengths and weaknesses of PID for detecting spurious vs. real relationships. Or by including some other analysis that would convince a new reader that the results are likely to replicate and are not simply the result of overfitting to the present dataset.

Decision Letter 3

Gabriel Gasque

12 Aug 2020

Dear Dr Destoky,

On behalf of my colleagues and the Academic Editor, Timothy D. Griffiths, I am pleased to inform you that we will be delighted to publish your Research Article in PLOS Biology.

The files will now enter our production system. You will receive a copyedited version of the manuscript, along with your figures for a final review. You will be given two business days to review and approve the copyedit. Then, within a week, you will receive a PDF proof of your typeset article. You will have two days to review the PDF and make any final corrections. If there is a chance that you'll be unavailable during the copy editing/proof review period, please provide us with contact details of one of the other authors whom you nominate to handle these stages on your behalf. This will ensure that any requested corrections reach the production department in time for publication.

Early Version

The version of your manuscript submitted at the copyedit stage will be posted online ahead of the final proof version, unless you have already opted out of the process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for submitting your manuscript to PLOS Biology and for your support of Open Access publishing. Please do not hesitate to contact me if I can provide any assistance during the production process.

Kind regards,

Vita Usova

Publication Editor,

PLOS Biology

on behalf of

Gabriel Gasque,

Senior Editor

PLOS Biology

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Methods. Assessment of the degree of energetic masking.

    (DOCX)

    S2 Methods. Preprocessing of brain and behavioral indices.

    (DOCX)

    S3 Methods. Extraction of the relevant features of nCTS.

    nCTS, normalized cortical tracking of speech.

    (DOCX)

    S4 Methods. Partial information decomposition.

    (DOCX)

    S5 Methods. Recording of video stimuli.

    (DOCX)

    S6 Methods. Accuracy of speech envelope reconstruction.

    (DOCX)

    S1 Results. Contribution of visual cortical activity to nCTS.

    nCTS, normalized cortical tracking of speech.

    (DOCX)

    S2 Results. Side measures are redundant with RAN and digit span but not with modulations in phrasal nCTS.

    nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

    (DOCX)

    S3 Results. Reading profile and reading deficit in the group with dyslexia.

    (DOCX)

    S4 Results. Are features of nCTS related to the importance of reading difficulties in dyslexia?

    nCTS, normalized cortical tracking of speech.

    (DOCX)

    S1 Discussion. Supplementary Discussion.

    (DOCX)

    S1 Table. Percentage of the 73 typical readers showing significant CTS at phrasal and syllabic rates in the nine different conditions.

    The two values provided for the noiseless condition correspond to two arbitrary subdivisions of the noiseless data to match the amount of data for the eight noise conditions. CTS, cortical tracking of speech.

    (DOCX)

    S2 Table. Nature of the information about reading abilities brought by each of the three uncovered features of the CTS in noise and phonological awareness (mean of the scores for phoneme fusion and suppression).

    Significant values (p < 0.05) are displayed in boldface, and marginally significant values are displayed in boldface and italicized. CTS, cortical tracking of speech.

    (DOCX)

    S3 Table. Nature of the information about reading brought by (1) the visual modulation in syllabic nCTS and (2) each of the four regressors included in the final model of reading abilities (informational modulation in phrasal nCTS, visual modulation in phrasal nCTS, forward digit span, and RAN).

    nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

    (DOCX)

    S4 Table. Same as in S3 Table for metaphonological abilities.

    (DOCX)

    S5 Table. Percentage of the 26 children of each reading group (dyslexia, control in age, and control in reading level) showing significant CTS in at least one hemisphere at phrasal and syllabic rates in the nine different conditions.

    CTS, cortical tracking of speech.

    (DOCX)

    S6 Table. Factors included in the final linear mixed-effects model fit to the nCTS (independent variable) at phrasal and at syllabic rates in children with dyslexia.

    Factors are listed in their order of inclusion. nCTS, normalized cortical tracking of speech.

    (DOCX)

    S7 Table. Regressors included in the final linear mixed-effects model fit to the five reading scores (dependent variables) in children with dyslexia.

    Regressors are listed in their order of inclusion.

    (DOCX)

    S8 Table. Pearson correlation between measures of reading abilities and relevant brain and behavioral measures in children with dyslexia.

    ***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

    (DOCX)

    S9 Table. Pearson correlation between measures of reading abilities and nCTS measures in children with dyslexia.

    ***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

    (DOCX)

    S1 Fig

    Spectrogram of a 4-s excerpt of attended speech (A) and corresponding noise (B) in the range of 0–7 kHz. Wide-band spectrograms (0–20 kHz) are also presented for the attended speech and the least-energetic nonspeech noise (C) to show that noise power was confined to frequencies above 10 kHz in this latter noise condition. The zeros of the dBFS were fixed based on the attended speech spectrogram and applied to all noise spectrograms. dBFS, decibel full scale.

    (TIF)

    S2 Fig. Relation between reading abilities and the nCTS at phrasal rate in dyslexia.

    S6 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

    (TIF)

    S3 Fig

    Impact of the main fixed effects on the nCTS at phrasal (A) and syllabic rates (B) in children with dyslexia. All is as in Fig 2. S7 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

    (TIF)

    S1 Video. Exemplary video stimulus wherein static pictures were replaced by text descriptions.

    (M4V)

    S1 Data. Behavioral and CTS values for all participants.

    CTS, cortical tracking of speech.

    (XLSX)

    S2 Data. Raw data underlying Fig 2.

    (XLSX)

    S3 Data. Raw data underlying Fig 3.

    (XLSX)

    S4 Data. Raw data underlying Fig 4.

    (XLSX)

    S5 Data. Raw data underlying Fig 5.

    (XLSX)

    S6 Data. Raw data underlying S2 Fig.

    (XLSX)

    S7 Data. Raw data underlying S3 Fig.

    (XLSX)

    Attachment

    Submitted filename: SiN_tracking_and_reading_rebuttal.pdf

    Attachment

    Submitted filename: SiN_tracking_and_reading_rebuttal_3.docx

    Data Availability Statement

    The data and the code that support the findings of this study are available on the Open Science Framework at “https://osf.io/9ce5t/”. The underlying numerical data for each figure can also be found in the supporting data files.

