Abstract
There is little doubt that predictive coding is an important mechanism in language processing – indeed, in information processing generally. However, it is less clear whether the action system is the source of such predictions during perception. Here I summarize the computational problem with motor prediction for perceptual processes and argue instead for a dual-stream model of predictive coding.
Predictive coding is in vogue in cognitive neuroscience, probably for good reason, and we are no strangers to the idea in the domain of speech (Hickok et al. 2011; van Wassenhove et al. 2005). The current trendsetters in predictive coding are the motor control crowd, who have developed, empirically validated, and promoted the notion of internal forward models as a neural mechanism necessary for smooth, efficient motor control (Kawato 1999; Shadmehr et al. 2010; Wolpert et al. 1995). But the basic idea has been pervasive in cognitive science for decades in the form of theoretical proposals like analysis-by-synthesis (Stevens & Halle 1967) and in the form of empirical observations like priming, context and top-down effects, and the like. So Pickering & Garrod’s (P&G’s) claim that language comprehension involves prediction is nothing new. Nor is it a particularly novel claim, right or wrong, that the motor system might be involved in receptive language; it has gotten much attention in the domain of speech perception/phonemic processing, for example (Hickok et al. 2011; Rauschecker & Scott 2009; Sams et al. 2005; van Wassenhove et al. 2005; Wilson & Iacoboni 2006) and has been a component of at least some aspects of sentence processing models for decades (Crain & Fodor 1985; Frazier & Flores d’Arcais 1989; Gibson & Hickok 1993). What appears to be new here is the idea that prediction at the syntactic and semantic levels can come out of the action system rather than being part of a purely perceptual mechanism.
This is an interesting idea worth investigation, but it is important to note that there are computational reasons why motor prediction generally is an inefficient, or even maladaptive, source for predictive coding during receptive functions. Here’s the heart of the problem. The computational goal of a motor prediction in the context of action control is to increase perceptual sensitivity to deviations from prediction (because something is wrong and correction is needed) and to decrease sensitivity to accurate predictions (all is well, carry on). Hence, if motor prediction were used in the context of perception, it would tend to suppress sensitivity to that which is predicted, whereas an efficient mechanism should enhance perception. Behavioral evidence bears this out. The system is less sensitive to the perceptual effects of self-generated actions (unless there is a deviation) than to externally generated perceptual events. Some of the “reafference cancellation” effects noted by P&G are good examples: inability to self-tickle, saccadic suppression of motion percepts, and the motor-induced suppression effect measured electrophysiologically. This contrasts with nonmotor forms of prediction, what P&G referred to as the association route, which might include context effects and priming and which tend to facilitate perceptual recognition. Put simply, motor prediction decreases perceptual sensitivity to the predicted sensory event, whereas nonmotor prediction increases it. Why, then, is there so much attention on motor-based prediction?
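The contrast can be made concrete with a toy numerical sketch. Everything in it is an illustrative assumption rather than a model taken from P&G or from the work cited here: the function names, the subtractive form of the motor comparator, the multiplicative gain standing in for nonmotor (associative) prediction, and the numbers themselves are all hypothetical.

```python
def motor_response(sensory_input, predicted_input):
    """Toy forward-model comparator (assumed subtractive).

    The response is the size of the prediction error, so an accurately
    predicted input is cancelled and only deviations get through.
    """
    return abs(sensory_input - predicted_input)


def nonmotor_response(sensory_input, predicted_input, gain=0.5):
    """Toy associative/contextual prediction (assumed multiplicative gain).

    The closer the input is to what was predicted, the more the
    response to it is amplified.
    """
    match = max(0.0, 1.0 - abs(sensory_input - predicted_input))
    return sensory_input * (1.0 + gain * match)


if __name__ == "__main__":
    expected = 1.0  # hypothetical predicted sensory value
    for label, observed in [("predicted event", 1.0), ("deviant event", 1.6)]:
        print(
            label,
            "| motor:", round(motor_response(observed, expected), 2),
            "| nonmotor:", round(nonmotor_response(observed, expected), 2),
        )
```

Under these assumptions the motor comparator returns nothing for the predicted event and a nonzero response for the deviant one, whereas the nonmotor route returns a response larger than the raw input for the predicted event. That is the asymmetry described above, and nothing more should be read into the particular functional forms.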
P&G argue that there is evidence to support a role for motor prediction in language-related perceptual processes – a good reason to focus attention on a motor-based prediction process. There are problems with the evidence they cite, however. One cannot infer causation from motor activation during perception (it could be pure associative priming [Heyes 2010; Hickok 2009a]); the transcranial magnetic stimulation (TMS) evidence in speech perception tasks is likely a response bias effect (Venezia et al. 2012); and the studies showing effects of imitation training on perception do not necessarily imply that imitation is carried out during perception, which is the claim that P&G wish to make.
There is a better way to conceptualize the architecture of the system, one that flows naturally out of fairly well-established models of cortical organization (Hickok & Poeppel 2007; Milner & Goodale 1995). A dorsal stream subserves sensory-motor integration for motor control; it is a highly adaptable system (Catmur et al. 2007) that links sensory targets (objects in space, sequences of phonemes) with motor systems tuned to hit those targets under varying conditions. A ventral stream subserves the linkage between sensory inputs and conceptual memory systems; it is a more stable system designed to abstract over irrelevant sensory details. Both systems enlist predictive coding as a fundamental computational strategy (Friston et al. 2010), but each in the service of what its system is designed for computationally. Motor prediction facilitates motor behavior (but suppresses perception), and “sensory” or “ventral stream” prediction facilitates perception (Hickok 2012b).
P&G underline that their approach blurs the line between comprehension and production and thus rejects the “cognitive sandwich” view, whereas the alternative perspective just outlined might be interpreted as preserving the comprehension-production distinction. In this context, it is worth pointing out that P&G do not actually blur the distinction between the two slices of bread all that much. They are quite distinct computational and representational components, as their c and p notation attests, and they have even added some slices: an action implementation system (p), a forward production model (p-hat), a forward comprehension model (c-hat), and a perceptual system (c), each of which generates phonological, syntactic, and semantic representations – nearly a loaf of bread. They do argue, correctly in my view, and consistent with many speech scientists and motor control researchers as well as the classical aphasiologists (despite P&G’s claims to the contrary), that comprehension and production systems must interact. We make the same claims of our dorsal stream (Hickok 2012a; Hickok & Poeppel 2007). But where P&G and others – including myself (Hickok et al. 2011) – have gone wrong, in my view, is that they are trying to shoehorn a motor-control-based mechanism into a perceptual system that it was not designed to serve.
Acknowledgments
Supported by NIH grant DC009659.
References
- Catmur C, Walsh V, Heyes C. Sensorimotor learning configures the human mirror system. Current Biology. 2007;17(17):1527–31. doi: 10.1016/j.cub.2007.08.006.
- Crain S, Fodor JD. How can grammars help parsers? In: Dowty DR, Karttunen L, Zwicky AM, editors. Natural language parsing: Psychological, computational, and theoretical perspectives. Cambridge University Press; 1985. pp. 94–128.
- Frazier L, Flores d’Arcais G. Filler driven parsing: A study of gap filling in Dutch. Journal of Memory and Language. 1989;28:331–44.
- Friston KJ, Daunizeau J, Kilner J, Kiebel SJ. Action and behavior: A free-energy formulation. Biological Cybernetics. 2010;102(3):227–60. doi: 10.1007/s00422-010-0364-z.
- Gibson E, Hickok G. Sentence processing with empty categories. Language and Cognitive Processes. 1993;8:147–61.
- Heyes C. Where do mirror neurons come from? Neuroscience and Biobehavioral Reviews. 2010;34(4):575–83. doi: 10.1016/j.neubiorev.2009.11.007.
- Hickok G. Eight problems for the mirror neuron theory of action understanding in monkeys and humans. Journal of Cognitive Neuroscience. 2009a;21(7):1229–43. doi: 10.1162/jocn.2009.21189.
- Hickok G. Computational neuroanatomy of speech production. Nature Reviews Neuroscience. 2012a;13(2):135–45. doi: 10.1038/nrn3158.
- Hickok G. The cortical organization of speech processing: Feedback control and predictive coding the context of a dual-stream model. Journal of Communication Disorders. 2012b;45:393–402. doi: 10.1016/j.jcomdis.2012.06.004.
- Hickok G, Houde J, Rong F. Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron. 2011;69(3):407–22. doi: 10.1016/j.neuron.2011.01.019.
- Hickok G, Poeppel D. The cortical organization of speech processing. Nature Reviews Neuroscience. 2007;8(5):393–402. doi: 10.1038/nrn2113.
- Kawato M. Internal models for motor control and trajectory planning. Current Opinion in Neurobiology. 1999;9(6):718–27. doi: 10.1016/s0959-4388(99)00028-8.
- Milner AD, Goodale MA. The visual brain in action. Oxford University Press; 1995.
- Rauschecker JP, Scott SK. Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience. 2009;12(6):718–24. doi: 10.1038/nn.2331.
- Sams M, Mottonen R, Sihvonen T. Seeing and hearing others and oneself talk. Brain Research: Cognitive Brain Research. 2005;23(2–3):429–35. doi: 10.1016/j.cogbrainres.2004.11.006.
- Shadmehr R, Smith MA, Krakauer JW. Error correction, sensory prediction, and adaptation in motor control. Annual Review of Neuroscience. 2010;33:89–108. doi: 10.1146/annurev-neuro-060909-153135.
- Stevens KN, Halle M. Remarks on the analysis by synthesis and distinctive features. In: Wathen-Dunn W, editor. Models for the perception of speech and visual form. MIT Press; 1967. pp. 88–102.
- van Wassenhove V, Grant KW, Poeppel D. Visual speech speeds up the neural processing of auditory speech. Proceedings of the National Academy of Sciences. 2005;102(4):1181–86. doi: 10.1073/pnas.0408949102.
- Venezia JH, Saberi K, Chubb C, Hickok G. Response bias modulates the speech motor system during syllable discrimination. Frontiers in Psychology. 2012;3:article 157. doi: 10.3389/fpsyg.2012.00157.
- Wilson SM, Iacoboni M. Neural responses to non-native phonemes varying in producibility: Evidence for the sensorimotor nature of speech perception. NeuroImage. 2006;33(1):316–25. doi: 10.1016/j.neuroimage.2006.05.032.
- Wolpert DM, Ghahramani Z, Jordan MI. An internal model for sensorimotor integration. Science. 1995;269(5232):1880–82. doi: 10.1126/science.7569931.