Curr Biol. 2008 Dec 9;18(23):1855–1859. doi: 10.1016/j.cub.2008.10.052

Task-Dependent Modulation of Medial Geniculate Body Is Behaviorally Relevant for Speech Recognition

Katharina von Kriegstein 1,2,, Roy D Patterson 3, TD Griffiths 1,2
PMCID: PMC2631608  PMID: 19062286

Summary

Recent work has shown that responses in first-order sensory thalamic nuclei are modulated by cortical areas [1–5]. However, the functional role of such corticothalamic modulation and its relevance for human perception are still unclear. Here, we show in two functional magnetic resonance imaging (fMRI) studies that the neuronal response in the first-order auditory thalamus, the medial geniculate body (MGB), is increased when rapidly varying spectrotemporal features of speech sounds are processed, as compared to processing slowly varying spectrotemporal features of the same sounds. The strength of this task-dependent modulation is positively correlated with the speech recognition scores of individual subjects. These results show that task-dependent modulation of the MGB serves the processing of specific features of speech sounds and is behaviorally relevant for speech recognition. Our findings suggest that the first-order auditory thalamus is not simply a nonspecific gatekeeper controlled by attention [6]. Together with studies in nonhuman mammals [4, 5], our findings imply a mechanism in which the first-order auditory thalamus, possibly by corticothalamic modulation, reacts adaptively to features of sensory input.

Keywords: SYSNEURO

Results and Discussion

The anatomical and physiological similarity of first-order thalamic nuclei in different sensory modalities has led to the assumption that they perform general functions in the processing of sensory information [7–11]. But the functions themselves remain to be specified. Recent functional magnetic resonance imaging (fMRI) work has shown that the first-order sensory thalamus in humans is modulated by top-down attention [6]. This finding supports a picture of the first-order sensory thalamus as an early, but relatively nonspecific, gatekeeper for sensory information [6, 12]. However, electrophysiological studies in nonhuman mammals suggest a more specific role for first-order thalamic nuclei [1–5, 13, 14]. It has been proposed that thalamic receptive fields are tuned by cortical areas to optimize processing of the dynamically varying sensory input [4, 14–16]. For example, in the auditory system, corticofugal connections are thought to mediate dynamic changes in the response properties of first-order thalamic receptive fields for species-specific communication sounds [5], either directly or through the thalamic reticular nucleus (TRN). The first study reported in this paper tested whether modulation by nonspecific attention is sufficient to explain top-down modulation of the medial geniculate body (MGB) in humans (experiment 1). Note that “top-down” here refers to all types of effects (e.g., attentional, task dependent) that are not elicited directly by stimulus input (which would be “bottom-up”) [17, 18]; the term does not imply a specific neuronal mechanism (e.g., direct corticothalamic modulation).

In experiment 1, subjects performed tasks in which they had to attend either to the content (syllable task) or the level (loudness task) of speech sounds. Both tasks were one-back same/different judgments performed on sequences of syllables (see Experimental Procedures). Subjects attended to the same stimulus material for both tasks. If the modulation by attention is not specific, we would expect the amplitude of the MGB activity either to be the same for the two tasks or to be greater for the more difficult task. We localized the MGB functionally by contrasting blood oxygen level-dependent (BOLD) responses to speech sounds with responses to silence (Figure 1A and Figure S1 available online). In the group analysis (n = 16), only the response in the left MGB was significant. Within this region, we found that BOLD activity was greater for the syllable task than the loudness task (p < 0.05 family-wise error [fwe] corrected for multiple comparisons) (Figure 1B, Table S1, and Figure S2).

Figure 1. Categorical Analysis

Experiment 1 (A and B) and experiment 2 (C and D). Group statistical parametric maps are rendered on coronal sections of the group-mean normalized structural MRI volume. Plots show parameter estimates extracted from the left sensory auditory thalamus, the medial geniculate body (MGB), for experimental conditions contrasted against the silent baseline. The percent signal change refers to the difference in BOLD response relative to the global mean. y, MNI coordinate in the anterior-posterior direction.

The results of experiment 1 indicate that the task-dependent modulation of MGB cannot be explained by nonspecific attention to the speech stimulus. Behaviorally, the syllable task was easier for subjects to perform than the loudness task [Table S2; F(1,15) = 127.56, p < 0.001, n = 16]. Therefore, the syllable task should have required relatively less attention. However, we found the opposite effect: MGB activity in the syllable task was greater than in the loudness task.

In the second experiment, we tested a more specific hypothesis about the functional role of the task-dependent modulation of the MGB. Critical features for speech recognition vary rapidly over time at the rate of the phonemes (e.g., in the syllable “aba,” a transition from /a/ to /b/ to /a/) [19]. In contrast, speaker-relevant variables such as vocal-tract length or fundamental frequency are more stable over time (during normal conversation and also in our experiment). We hypothesized that adjusting receptive fields to the fast-changing speech characteristics at the phoneme level is resolved at an early processing stage and requires more modulation of the MGB than do the temporally more stable speaker characteristics.

Subjects performed the same syllable task as in experiment 1. For the control task, however, subjects had to attend to the speaker characteristics (speaker task) of the same speech sounds. Within both functionally localized MGBs (Figure 1C and Figure S1), the syllable task elicited greater BOLD activity than the speaker task (p < 0.05 fwe corrected, n = 17) (Figure 1D, Table S1, and Figure S2). Note that behavioral performance was matched for the two tasks [Table S2; F(1,16) = 0.53, p = 0.48, n = 17].

The results are consistent with our hypothesis; the increase of responses in the MGB for the speech task, as compared to the speaker task, suggests that modulation of first-order thalamic responses is involved in processing fast time-varying features at the phoneme level.

It is generally assumed that tuning of first-order thalamic receptive fields in nonhuman mammals leads to improved detection and discrimination of stimuli (although to our knowledge there is no experimental evidence for this assumption) [5]. Given this assumption, we would expect a positive correlation between behavioral performance and the amount of task-dependent modulation in the human MGB during speech recognition. We found this to be the case for both MGBs over both experiments (p < 0.05 fwe corrected, n = 33) (Figure S3). There were no significant differences in the strength of correlation between the two experiments (p = 0.6). Separate correlation analyses for the two experiments (Figures 2A and 2B and Table S3) revealed a positive correlation between performance and the difference contrast in both MGBs in experiment 1. In experiment 2, the positive correlation was significant in the left MGB only.

Figure 2. Correlation Analysis

The plots show, over subjects, the positive correlation between behavioral performance in the syllable task (percent correct in rationalized arcsine units [rau] [40]) and the BOLD signal change in the sensory auditory thalamus, the medial geniculate body (MGB) (A and B), and in the auditory midbrain, the inferior colliculus (IC) (C and D). The linear regression is shown with the 95% individual prediction interval.

Two previous fMRI studies have implicated thalamic structures in attention [20] or recognition success [21] during speech perception. However, it seems that activations were not located in sensory thalamic structures (i.e., the MGB), although this cannot be stated with certainty because of the specific acquisition and analysis techniques used in these studies. Activation maxima (in [20]: −13, −6, 12; in [21]: −0, −12, 5; Talairach coordinates) are more than 1.5 cm away from maxima reported for the MGB (e.g., in [22]: −15, −28, −5 [Talairach]; −15, −28, −8 [MNI]; see also Table S1). The distinction between primary sensory thalamus and other thalamic structures is important because only modulation of first-order nuclei, such as the MGB, indicates task-dependent modulation at an early sensory processing stage [8, 10].

Differences in brainstem evoked potentials between groups with relatively high or low language skills have been attributed to the inferior colliculus (IC) [23, 24]. Accordingly, we also examined the BOLD responses of the ICs in the current experiments (Figure S4). The ICs were not more active for the syllable task than for the control tasks (i.e., the loudness task [experiment 1] or the speaker task [experiment 2]). However, there was a positive correlation of behavioral performance with the activation difference between the syllable and control tasks in both ICs over both experiments (p < 0.05 fwe corrected, n = 33). Further analyses revealed a significant positive correlation in both ICs for experiment 1 and in the left IC for experiment 2 (Figures 2C and 2D and Table S4). This correlation of task-dependent modulation with recognition success further suggests a possible role for the IC in subjects with high language skills [23, 24].

In summary, our findings show (1) that responses in the auditory first-order thalamus during the processing of speech sounds are task dependent, (2) that the response is strongest when processing features of speech sounds at the phoneme level, and (3) that the relative amplitude of the task-dependent modulation is correlated with individual behavioral performance. This suggests that a task-dependent, behaviorally relevant feedback mechanism supports the recognition of speech sounds at the level of the sensory thalamus. In light of the massive corticothalamic connections, and their known influence on the response properties of thalamic neurons in animals [1–5, 13, 14], we speculate that the task-dependent modulation seen in the current experiments is a result of feedback from cortical areas. Such feedback could be direct (e.g., from auditory primary or association cortices) or indirect (i.e., via other structures such as the reticular nucleus) [25]. Although not the most parsimonious explanation, task-dependent modulation could also be mediated via the IC, which receives afferents from cortex and is one of the main input structures to the MGB [25, 26].

What kind of mechanism can account for these results? Experimental and theoretical accounts of brain function [27–29] emphasize the importance of an anatomical hierarchy that is organized according to the timescale of complex stimuli in the natural environment. In brief, it is assumed that levels closer to the sensory input encode faster dynamics of the stimulus than levels further away from the sensory input. In accordance with this view, the MGB (as well as the visual first-order thalamus, the LGN [30]) is tuned to higher frequencies of temporal modulation (ca. 16 Hz in the human MGB [27]) than its associated primary sensory cortical areas [27, 29, 31]. In addition, in nonhuman animals, thalamic receptive fields change dynamically in response to inputs from cortical areas [4, 5]. For humans, the optimized encoding of relatively fast dynamics, e.g., at the phoneme level, is critical for speech recognition and communication [32–34]. Here, we suggest that slower dynamics encoded by auditory cortical areas [27, 29] provide predictions about input arriving at lower levels of the temporal-anatomic hierarchy [28]. In this view, these dynamic predictions modulate the response properties of the first-order sensory thalamus to optimize the early stages of speech recognition.

Experimental Procedures

This section provides only essential information about the Experimental Procedures. For a more detailed description, see Supplemental Experimental Procedures.

Participants

Sixteen subjects were included in experiment 1 (all right-handed; eight female, eight male; aged 19–40 years; mean age 26). Seventeen subjects were included in experiment 2 (all right-handed; six female, eleven male; aged 20–37 years; mean age 26).

Stimuli

Stimuli were based on syllables recorded from a single speaker (16-bit resolution, 48-kHz sample rate) and preprocessed with level balancing and perceptual centering as described previously [35]. Experiment 1 contained 96 syllables (48 consonant-vowel, 48 vowel-consonant). Experiment 2 contained 150 vowel-consonant-vowel syllables. We used vowel-consonant-vowel syllables in experiment 2 because preliminary behavioral studies indicated that they impart the same level of task difficulty for the syllable and speaker tasks. Stimulus versions with different speaker characteristics were synthesized from the recorded speech sounds with the STRAIGHT software package [36, 37].

Experimental Design

For both experiments spoken syllables were concatenated to form syllable sequences (Supplemental Experimental Procedures—Example Stimuli). Before each sequence, participants received a visual instruction to either perform the syllable task or the control task (i.e., loudness task in experiment 1 or speaker task in experiment 2).

Both experiments included a second factor, which was the synthetic manipulation of voice characteristics. These manipulations were not of immediate interest to the main hypotheses of the current report and the results are therefore presented in the Supplemental Experimental Procedures.

Experiment 1

In experiment 1, all syllable sequences lasted 9.44 s and contained eight syllable events (680 ms stimulus, 500 ms pause). Within each sequence there were three different syllables (e.g., /ga/, /ke/, and /la/; /mu/, /mi/, and /ka/; etc.) and three different values of sound level (values differed by 9–12 dB SPL). Syllable and sound-level values were presented randomly, with restrictions on the minimum and maximum number of changes within a sequence. Changes in syllable and sound level were independent of each other. In the syllable task, subjects indicated via button press whether the current syllable was different from the previous one. In the loudness task, subjects indicated via button press whether the level of the current syllable was different from the previous one. During half of the syllable sequences, the vocal-tract length (VTL) of the speaker varied within the syllable sequence (VTL varies); during the other half, the VTL of the speaker was fixed (VTL same). Thus, we analyzed four experimental conditions: (1) syllable task, VTL varies; (2) syllable task, VTL same; (3) loudness task, VTL varies; and (4) loudness task, VTL same. The VTL values ranged from 10.6 to 21.7 cm.

Experiment 2

In experiment 2, all syllable sequences lasted 8.4 s and contained six syllable events (1100 ms stimulus, 300 ms pause). Within each sequence there were three different syllables (e.g., /aga/, /ake/, and /ala/; or /esi/, /elu/, and /ero/; etc.) and three different speakers (i.e., different VTLs or different fundamental frequencies, see below). Again, the minimum and maximum numbers of changes were restricted, and variations in syllable and speaker were independent. In the syllable task, subjects indicated via button press whether the current syllable was different from the previous one. In the speaker task, subjects indicated via button press whether the current speaker was different from the previous one. Half of the syllable sequences were spoken by speakers that differed in VTL but that had the same fundamental frequency, i.e., the same glottal pulse rate (GPR). In the remaining sequences, the speakers differed in GPR but not in VTL. Thus, the experiment had four experimental conditions: (1) syllable task, VTL varies; (2) syllable task, GPR varies; (3) speaker task, VTL varies; and (4) speaker task, GPR varies. The smallest values for the speaker manipulations were 95 Hz (GPR)/9.1 cm (VTL), and the largest values were 220 Hz (GPR)/20.3 cm (VTL). The values were chosen to simulate a speaker change, rather than a change in the voice quality of one speaker.

Both experiments also included silence conditions, which were 9.44 s in experiment 1 and 8.4 s in experiment 2. These conditions were used to locate the regions of interest (see below).

In both experiments, the order of sequences was randomized. Each sequence (with a specific stimulus combination) always occurred twice, once in the syllable task, and once in the control task.
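As an illustration, the constrained randomization used for the sequences can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the authors' stimulus code: the function name, the example change limits of 3–5, and the rejection-sampling approach are all assumptions.

```python
import random

def make_sequence(items, n_events, min_changes, max_changes, rng):
    """Draw a sequence of n_events tokens from items (e.g., three
    syllables) by rejection sampling, keeping only sequences whose
    number of one-back changes lies within [min_changes, max_changes].
    Hypothetical reconstruction; the change limits are assumptions."""
    while True:
        seq = [rng.choice(items) for _ in range(n_events)]
        changes = sum(a != b for a, b in zip(seq, seq[1:]))
        if min_changes <= changes <= max_changes:
            return seq

rng = random.Random(1)
# Experiment 1 style: eight events, three syllables, three sound
# levels; syllable and level sequences are drawn independently.
syllables = make_sequence(["ga", "ke", "la"], 8, 3, 5, rng)
levels = make_sequence(["level1", "level2", "level3"], 8, 3, 5, rng)
```

Because the syllable and level (or speaker) sequences are drawn in separate calls, their change patterns are independent of each other, as the design requires.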

Scanning Procedure

The stimuli were delivered with a custom electrostatic system at 70 dB SPL.

After each syllable sequence, functional gradient-echo planar images (EPI) were acquired (sparse imaging [38, 39]) on a 3-Tesla scanner (Siemens Allegra, Erlangen, Germany; 42 slices; −5° tilt; 2 mm slice thickness; 1 mm interslice distance; cardiac triggering).

Data Analysis

Imaging data were analyzed with statistical parametric mapping implemented in the SPM5 software (http://www.fil.ion.ucl.ac.uk/spm) (see Supplemental Experimental Procedures). Population-level inferences about BOLD signal changes between conditions of interest were based on a random-effects model that estimated the second-level t statistic at each voxel.
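The second-level random-effects statistic amounts to a voxelwise one-sample t test on the per-subject contrast images. A minimal sketch with synthetic data (array sizes and values are illustrative stand-ins; the actual analysis was performed in SPM5):

```python
import numpy as np
from scipy import stats

# One contrast value per subject and voxel (e.g., syllable task >
# control task); synthetic data standing in for contrast images.
rng = np.random.default_rng(0)
n_subjects, n_voxels = 16, 1000
contrasts = rng.normal(0.1, 1.0, (n_subjects, n_voxels))

# Random-effects inference: test at each voxel whether the group
# mean contrast differs from zero, using between-subject variance.
mean = contrasts.mean(axis=0)
sem = contrasts.std(axis=0, ddof=1) / np.sqrt(n_subjects)
t_map = mean / sem

# The same statistic via scipy's one-sample t test
t_check, _ = stats.ttest_1samp(contrasts, 0.0, axis=0)
```

Because only between-subject variability enters the denominator, the inference generalizes to the population rather than to the scanned group alone.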

Definition of Regions of Interest for the Categorical Analysis

For both experiments, we located the MGB and IC by the contrast all speech conditions > silence at the second level. For the regions of interest (ROI) definition, we used the MarsBaR toolbox (http://marsbar.sourceforge.net). The ROIs are displayed in Figure S1 (MGB) and Figure S4 (IC). The statistical maxima for the ROIs are listed in Table S1 (MGB) and Table S4 (IC).

Categorical Analysis

In the categorical analysis, the contrasts of interest were syllable task > loudness task (experiment 1) and syllable task > speaker task (experiment 2). Effects were considered significant if (1) the statistic maximum of the cluster was within the functionally defined ROI and (2) this was significant at p < 0.05, fwe corrected for multiple comparisons within the functionally defined ROI.

Definition of ROIs for the Correlation Analysis

The ROIs for the correlation analysis were defined as described above, except that the functional cluster was not derived from the contrast all speech conditions > silence, but from the functionally more specific contrast syllable task > loudness task (experiment 1) or syllable task > speaker task (experiment 2) (MGB, Figure S1 and Table S1). Because there were no significant responses for this contrast in the IC, we used the same ROIs as for the categorical analysis (Figure S4).

Correlation Analysis

In the correlation analysis, the fMRI difference signal between the syllable and control tasks (i.e., loudness or speaker task) in MGB and IC was correlated with the behavioral performance (in rationalized arcsine units [40]) in the syllable task. To estimate Pearson's correlation statistics, we extracted the parameter estimates from the region of interest at the voxel with the maximum statistic value. These values and the behavioral scores were then analyzed with SPSS 12.02. To estimate differences in correlation between the two experiments, we used a univariate ANOVA.
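The analysis involves two computational steps: transforming the behavioral scores into rationalized arcsine units and correlating them with the extracted parameter estimates. A sketch with synthetic data (the transform follows Studebaker [40]; the trial count and all numerical values below are illustrative assumptions, not the study's data):

```python
import numpy as np
from scipy import stats

def rau(correct, n):
    """Rationalized arcsine transform (Studebaker, 1985): maps a
    score of `correct` out of `n` trials to rationalized arcsine
    units, which roughly linearizes proportions near 0 and 1."""
    theta = (np.arcsin(np.sqrt(correct / (n + 1.0)))
             + np.arcsin(np.sqrt((correct + 1.0) / (n + 1.0))))
    return (146.0 / np.pi) * theta - 23.0

# Synthetic per-subject values: MGB signal difference (syllable
# minus control task) and syllable-task scores out of 60 trials.
rng = np.random.default_rng(0)
bold_diff = rng.normal(0.5, 0.2, 16)
scores = np.clip(np.round(30 + 30 * bold_diff + rng.normal(0, 2, 16)), 0, 60)
r, p = stats.pearsonr(bold_diff, rau(scores, 60))
```

A convenient property of the transform is that mid-range scores are nearly unchanged (50 of 100 correct maps to approximately 50 rau), while scores near floor and ceiling are stretched so that parametric statistics remain appropriate.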

Analysis of Behavioral Data

The behavioral data were analyzed with a repeated-measures ANOVA and post hoc paired t tests in SPSS 12.02 (Table S2). The statistical results are described in the main text and in the legend of Table S2.

Acknowledgments

This work was supported by the Wellcome Trust, VW Stiftung (I/79 783), and the UK MRC (G9900362). We thank David R. Smith for providing the stimuli. We thank Stefan Kiebel for valuable discussions and comments on the manuscript. We thank Hendrikje Nienborg for comments on an earlier version of the manuscript.

Published online: December 4, 2008

Footnotes

Supplemental Data include Supplemental Experimental Procedures, five figures, four tables, and six audio files and can be found with this article online at http://www.current-biology.com/supplemental/S0960-9822(08)01421-8.

Supplemental Data

Document S1. Supplemental Experimental Procedures, Five Figures, and Four Tables
mmc1.pdf (656.8KB, pdf)
Audio S1. Exp1_VTL_varies_v
Download audio file (220.6KB, mp3)
Audio S2. Exp1_VTL_varies_w
Download audio file (220.6KB, mp3)
Audio S3. Exp1_VTL_fixed_v
Download audio file (220.6KB, mp3)
Audio S4. Exp1_VTL_fixed_w
Download audio file (220.6KB, mp3)
Audio S5. Exp2_VTL_varies
Download audio file (173.4KB, mp3)
Audio S6. Exp2_GPR_varies
Download audio file (173.4KB, mp3)

References

  • 1.Andolina I.M., Jones H.E., Wang W., Sillito A.M. Corticothalamic feedback enhances stimulus response precision in the visual system. Proc. Natl. Acad. Sci. USA. 2007;104:1685–1690. doi: 10.1073/pnas.0609318104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Cudeiro J., Sillito A.M. Looking back: Corticothalamic feedback and early visual processing. Trends Neurosci. 2006;29:298–306. doi: 10.1016/j.tins.2006.05.002. [DOI] [PubMed] [Google Scholar]
  • 3.Ergenzinger E.R., Glasier M.M., Hahm J.O., Pons T.P. Cortically induced thalamic plasticity in the primate somatosensory system. Nat. Neurosci. 1998;1:226–229. doi: 10.1038/673. [DOI] [PubMed] [Google Scholar]
  • 4.Ghazanfar A.A., Nicolelis M.A. Feature article: the structure and function of dynamic cortical and thalamic receptive fields. Cereb. Cortex. 2001;11:183–193. doi: 10.1093/cercor/11.3.183. [DOI] [PubMed] [Google Scholar]
  • 5.Suga N., Ma X. Multiparametric corticofugal modulation and plasticity in the auditory system. Nat. Rev. Neurosci. 2003;4:783–794. doi: 10.1038/nrn1222. [DOI] [PubMed] [Google Scholar]
  • 6.O'Connor D.H., Fukui M.M., Pinsk M.A., Kastner S. Attention modulates responses in the human lateral geniculate nucleus. Nat. Neurosci. 2002;5:1203–1209. doi: 10.1038/nn957. [DOI] [PubMed] [Google Scholar]
  • 7.Adams P., Cox K. A new interpretation of thalamocortical circuitry. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2002;357:1767–1779. doi: 10.1098/rstb.2002.1164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Rouiller E.M., Welker E. A comparative analysis of the morphology of corticothalamic projections in mammals. Brain Res. Bull. 2000;53:727–741. doi: 10.1016/s0361-9230(00)00364-6. [DOI] [PubMed] [Google Scholar]
  • 9.Sherman S.M., Guillery R.W. The role of the thalamus in the flow of information to the cortex. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 2002;357:1695–1708. doi: 10.1098/rstb.2002.1161. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Sherman S.M., Guillery R.W. 2nd edn. MIT Press; Cambridge, MA: 2006. Exploring the thalamus and its role in cortical function. [Google Scholar]
  • 11.Jones E.G., Powell T.P.S. An electron microscopic study of mode of termination of cortico-thalamic fibres within sensory relay nuclei of thalamus. Proc. R. Soc. Lond. B. Biol. Sci. 1969;172:173–185. doi: 10.1098/rspb.1969.0018. [DOI] [PubMed] [Google Scholar]
  • 12.Crick F. Function of the thalamic reticular complex: The searchlight hypothesis. Proc. Natl. Acad. Sci. USA. 1984;81:4586–4590. doi: 10.1073/pnas.81.14.4586. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Murphy P.C., Sillito A.M. Corticofugal feedback influences the generation of length tuning in the visual pathway. Nature. 1987;329:727–729. doi: 10.1038/329727a0. [DOI] [PubMed] [Google Scholar]
  • 14.Sillito A.M., Cudeiro J., Jones H.E. Always returning: Feedback and sensory processing in visual cortex and thalamus. Trends Neurosci. 2006;29:307–316. doi: 10.1016/j.tins.2006.05.001. [DOI] [PubMed] [Google Scholar]
  • 15.Koch C. The action of the corticofugal pathway on sensory thalamic nuclei: A hypothesis. Neuroscience. 1987;23:399–406. doi: 10.1016/0306-4522(87)90064-9. [DOI] [PubMed] [Google Scholar]
  • 16.Villa A.E., Tetko I.V., Dutoit P., De Ribaupierre Y., de Ribaupierre F. Corticofugal modulation of functional connectivity within the auditory thalamus of rat, guinea pig and cat revealed by cooling deactivation. J. Neurosci. Methods. 1999;86:161–178. doi: 10.1016/s0165-0270(98)00164-2. [DOI] [PubMed] [Google Scholar]
  • 17.Wolfe J.M., Butcher S.J., Lee C., Hyle M. Changing your mind: On the contributions of top-down and bottom-up guidance in visual search for feature singletons. J. Exp. Psychol. Hum. Percept. Perform. 2003;29:483–502. doi: 10.1037/0096-1523.29.2.483. [DOI] [PubMed] [Google Scholar]
  • 18.Li W., Piech V., Gilbert C.D. Perceptual learning and top-down influences in primary visual cortex. Nat. Neurosci. 2004;7:651–657. doi: 10.1038/nn1255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Rosen S. Temporal information in speech: acoustic, auditory and linguistic aspects. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1992;336:367–373. doi: 10.1098/rstb.1992.0070. [DOI] [PubMed] [Google Scholar]
  • 20.Christensen T.A., Antonucci S.M., Lockwood J.L., Kittleson M., Plante E. Cortical and subcortical contributions to the attentive processing of speech. Neuroreport. 2008;19:1101–1105. doi: 10.1097/WNR.0b013e3283060a9d. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Alain C., Reinke K., McDonald K.L., Chau W., Tam F., Pacurar A., Graham S. Left thalamo-cortical network implicated in successful speech separation and identification. Neuroimage. 2005;26:592–599. doi: 10.1016/j.neuroimage.2005.02.006. [DOI] [PubMed] [Google Scholar]
  • 22.Devlin J.T., Sillery E.L., Hall D.A., Hobden P., Behrens T.E., Nunes R.G., Clare S., Matthews P.M., Moore D.R., Johansen-Berg H. Reliable identification of the auditory thalamus using multi-modal structural analyses. Neuroimage. 2006;30:1112–1120. doi: 10.1016/j.neuroimage.2005.11.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Song J.H., Banai K., Kraus N. Brainstem timing deficits in children with learning impairment may result from corticofugal origins. Audiol. Neurootol. 2008;13:335–344. doi: 10.1159/000132689. [DOI] [PubMed] [Google Scholar]
  • 24.Wong P.C., Skoe E., Russo N.M., Dees T., Kraus N. Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nat. Neurosci. 2007;10:420–422. doi: 10.1038/nn1872. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Rouiller E.M., de Ribaupierre F. Origin of afferents to physiologically defined regions of the medial geniculate body of the cat: ventral and dorsal divisions. Hear. Res. 1985;19:97–114. doi: 10.1016/0378-5955(85)90114-5. [DOI] [PubMed] [Google Scholar]
  • 26.Huffman R.F., Henson J. The descending auditory pathway and acousticomotor systems: connections with the inferior colliculus. Brain Res. Brain Res. Rev. 1990;15:295–323. doi: 10.1016/0165-0173(90)90005-9. [DOI] [PubMed] [Google Scholar]
  • 27.Giraud A.L., Lorenzi C., Ashburner J., Wable J., Johnsrude I., Frackowiak R., Kleinschmidt A. Representation of the temporal envelope of sounds in the human brain. J. Neurophysiol. 2000;84:1588–1598. doi: 10.1152/jn.2000.84.3.1588. [DOI] [PubMed] [Google Scholar]
  • 28.Kiebel S.J., Daunizeau J., Friston K.J. A hierarchy of time scales and the brain. PLoS Comput. Biol. 2008 doi: 10.1371/journal.pcbi.1000209. in press. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Wang X., Lu T., Bendor D., Bartlett E. Neural coding of temporal information in auditory thalamus and cortex. Neuroscience. 2008;154:294–303. doi: 10.1016/j.neuroscience.2008.03.065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Hicks T.P., Lee B.B., Vidyasagar T.R. The responses of cells in macaque lateral geniculate nucleus to sinusoidal gratings. J. Physiol. 1983;337:183–200. doi: 10.1113/jphysiol.1983.sp014619. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Foster K.H., Gaska J.P., Nagler M., Pollen D.A. Spatial and temporal frequency selectivity of neurones in visual cortical areas V1 and V2 of the macaque monkey. J. Physiol. 1985;365:331–363. doi: 10.1113/jphysiol.1985.sp015776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Shannon R.V., Zeng F.G., Kamath V., Wygonski J., Ekelid M. Speech recognition with primarily temporal cues. Science. 1995;270:303–304. doi: 10.1126/science.270.5234.303. [DOI] [PubMed] [Google Scholar]
  • 33.Tallal P., Piercy M. Developmental aphasia: the perception of brief vowels and extended stop consonants. Neuropsychologia. 1975;13:69–74. doi: 10.1016/0028-3932(75)90049-4. [DOI] [PubMed] [Google Scholar]
  • 34.Tallal P., Miller S.L., Bedi G., Byma G., Wang X., Nagarajan S.S., Schreiner C., Jenkins W.M., Merzenich M.M. Language comprehension in language-learning impaired children improved with acoustically modified speech. Science. 1996;271:81–84. doi: 10.1126/science.271.5245.81. [DOI] [PubMed] [Google Scholar]
  • 35.von Kriegstein K., Warren J.D., Ives D.T., Patterson R.D., Griffiths T.D. Processing the acoustic effect of size in speech sounds. Neuroimage. 2006;32:368–375. doi: 10.1016/j.neuroimage.2006.02.045. [DOI] [PubMed] [Google Scholar]
  • 36.Kawahara H., Masuda-Kasuse I., de Cheveigne A. Restructuring speech representations using pitch-adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of repetitive structure in sounds. Speech Commun. 1999;27:187–207. [Google Scholar]
  • 37.Kawahara H., Irino T. Underlying principles of a high-quality speech manipulation system STRAIGHT and its application to speech segregation. In: Divenyi P., editor. Speech separation by humans and machines. Kluwer Academic; Massachusetts: 2004. pp. 167–180. [Google Scholar]
  • 38.Belin P., Zatorre R.J., Hoge R., Evans A.C., Pike B. Event-related fMRI of the auditory cortex. Neuroimage. 1999;10:417–429. doi: 10.1006/nimg.1999.0480. [DOI] [PubMed] [Google Scholar]
  • 39.Hall D.A., Haggard M.P., Akeroyd M.A., Palmer A.R., Summerfield A.Q., Elliott M.R., Gurney E.M., Bowtell R.W. “Sparse” temporal sampling in auditory fMRI. Hum. Brain Mapp. 1999;7:213–223. doi: 10.1002/(SICI)1097-0193(1999)7:3<213::AID-HBM5>3.0.CO;2-N. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Studebaker G.A. A “rationalized” arcsine transform. J. Speech Hear. Res. 1985;28:455–462. doi: 10.1044/jshr.2803.455. [DOI] [PubMed] [Google Scholar]
