Proc. Natl. Acad. Sci. U.S.A. 2022 Nov 30;119(49):e2217108119. doi: 10.1073/pnas.2217108119

A computational model of language comprehension unites diverse perspectives

Maryellen C MacDonald a,1

When we comprehend language, we convert the acoustic or visual signal that we perceive into an internal representation of meaning. This conversion is complex, and language scientists have worked for decades to understand how it functions. In studying language comprehension, the field has been marked by several sharp theoretical divides. One of these controversies concerns modularity of the component processes of comprehension, such as whether interpretation of a sentence’s syntactic structure proceeds independently from interpretation of meaning and other aspects of language (1), or whether comprehension involves interaction among syntax, meaning, and so on (2). A second debate concerns the relationship between comprehension processes and the temporary memory that is necessary to execute them: does this working memory stem from a dedicated temporary storage system (3), or do language processes themselves create the temporary memories they need (4)? These longstanding debates get at core questions about the nature of human language capacities, but alternative positions have been difficult to distinguish from one another in precise ways. In PNAS, Hahn et al. (5) describe a theory and implemented computational model of central pieces of language comprehension, linking disparate theoretical perspectives in both language comprehension and working memory.


Because linguistic signals arrive over time, converting signal to meaning requires integrating earlier and later portions of the signal. For perceivers to understand who did what to whom in the sentence in Fig. 1A, for example, the word dog must be linked to three verbs—the nearby word jumped and the more distant chased and caught. The perceiver must also determine that the squirrel is the object of the chasing and is not doing any jumping, chasing, or catching itself. Although the example in Fig. 1A may not seem much like everyday language, these complex sentences have been a central testing ground of theories of language comprehension, for two reasons. First, sentences with embedded clauses, such as the one in Fig. 1A, have been a cornerstone of generative linguists’ claims for the competence–performance distinction (6, 7), which holds that humans have expansive linguistic knowledge (competence) but more restricted comprehension performance because of constraints on their working memory capacity. Studying these difficult embedded sentences may therefore inform longstanding claims about human language knowledge and use. Second, as Hahn et al. note, computational models provide quantitative predictions of varying levels of comprehension difficulty across these types of sentences (8, 9), yielding insight into the nature of the signal-to-meaning computations that undergird language comprehension more generally. Three historically disparate approaches, shown in Fig. 1 B–D, have all aimed to characterize patterns of comprehension difficulty at different points of complex sentences.

Fig. 1.
(A) To understand this complex sentence, the noun dog must be linked to three verbs: jumped, chased, and caught. The curved lines show that such linkages between words can occur over short or long distances. Other linkages, such as the fact that squirrel is the object of chased, are also necessary but not shown here. (B–D) show three historical antecedents of the Hahn et al. model. (B) illustrates the Dependency Locality Theory (DLT), where comprehension difficulty reflects the distance between integrated words, such as a noun and verb. (C) illustrates Surprisal Theory, where comprehension difficulty is tied to the extent to which new input conforms to expectations based on experience with previously encountered sentence structures. (D) illustrates constraint satisfaction accounts, in which syntactic, word meaning, and other computations interact. (E) sketches components of Hahn et al.’s resource-rational model, in which a lossy working memory is optimized to best support the Surprisal component, and which makes predictions for upcoming words via rich language experience captured by the GPT-2 neural network model.

Fig. 1B shows the DLT (8), which is foundational to the treatment of memory in the Hahn et al. (5) account. In the DLT, difficulty of comprehension is tied to limitations in verbal working memory: when words that must be integrated with one another are far apart, working memory demands are higher, and comprehension is more difficult, as evidenced by slowed reading.
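For concreteness, the sketch below gives a minimal DLT-style cost computation in Python. It follows a simplified version of Gibson's metric, counting intervening discourse referents (approximated here as nouns and verbs) between linked words; the tags, dependency pairs, example sentence, and cost units are illustrative stand-ins, not the full DLT.

```python
# A minimal sketch of a DLT-style integration cost, assuming a simplified
# version of Gibson's (1998) metric: integrating a word with an earlier
# dependent costs more when more new discourse referents (here, nouns and
# verbs) intervene between the two linked words.

def dlt_integration_cost(tags, dependencies):
    """tags: part-of-speech tag per word.
    dependencies: (head_index, dependent_index) pairs.
    Returns a cost per word position (higher = predicted reading slowdown)."""
    referent = [t in ("NOUN", "VERB") for t in tags]
    cost = [0] * len(tags)
    for head, dep in dependencies:
        lo, hi = sorted((head, dep))
        # charge the later word for each new referent between the linked pair
        cost[max(head, dep)] += sum(referent[lo + 1 : hi])
    return cost

# Simplified version of the Fig. 1A example:
# "the dog that jumped the fence chased the squirrel"
tags = ["DET", "NOUN", "PRON", "VERB", "DET", "NOUN", "VERB", "DET", "NOUN"]
deps = [(3, 1), (6, 1), (6, 8)]  # dog-jumped, dog-chased, chased-squirrel
print(dlt_integration_cost(tags, deps))  # the distant dog-chased link is costliest
```

Running the toy assigns the highest cost to chased, the verb that must be linked back across the embedded clause to dog, matching the intuition that long-distance integrations slow reading.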

A second key component of the Hahn et al. model is Surprisal Theory (9, 10), shown in Fig. 1C. It ties comprehension difficulty at a particular point in a sentence to uncertainty about what input is coming next. The model predicts upcoming input as a function of what has been encountered in the sentence to that point, guided by knowledge of the likelihood of completions, as estimated via large language corpora. Surprisal Theory and the DLT differ in the role of prior experience in explanations for comprehension difficulty: Surprisal links difficulty to prior comprehension experience, in that difficult sentence regions are ones that violate expectations based on past experience, whereas the DLT posits a direct relationship between the distance separating integrated words and comprehension difficulty, independent of experience. However, they share a commitment to modularity of syntactic computations. As illustrated in Fig. 1 B and C, both models operate over abstract grammatical categories (noun, verb, etc.) without regard to word meaning.
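In practice, surprisal at a word is just the negative log probability of that word given the preceding context. The sketch below shows one common way to estimate per-token surprisal, assuming the Hugging Face transformers library and the publicly released GPT-2 weights; it is a generic estimate for illustration, not the specific pipeline of the Surprisal Theory papers.

```python
# A minimal sketch of per-token surprisal, -log2 P(token | context),
# estimated with the off-the-shelf GPT-2 model via the Hugging Face
# transformers library (assumed here; any language model would do).
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprisals(sentence):
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # predictions at position i are for the token at position i + 1
        logprobs = torch.log_softmax(model(ids).logits[0, :-1], dim=-1)
    return [
        (tokenizer.decode(int(tok)), -logprobs[i, tok].item() / math.log(2))
        for i, tok in enumerate(ids[0, 1:])
    ]

for tok, s in surprisals("The dog that jumped the fence chased the squirrel."):
    print(f"{tok!r}: {s:.2f} bits")  # higher surprisal predicts slower reading
```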

The constraint satisfaction approach (Fig. 1D) rejects isolation of syntax-level processes and argues that interpretation of sentence structure, word meaning, and other aspects of language must be jointly computed (2). Words are massively ambiguous in both meaning and their grammatical categories; for example, all the nouns in the sentence in Fig. 1A (dog, fence, and squirrel) are also used as verbs in English. DLT and Surprisal computations over grammatical category input may be more complicated if words’ grammatical categories must also be disambiguated. Words also exhibit lexico-syntactic regularities, where certain words tend to occur in certain sentence types, so that knowledge of words informs syntax interpretation and vice versa (11).
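To make the interactive idea concrete, the toy settlement loop below is loosely in the spirit of the normalized recurrence algorithm of McRae et al. (11): several probabilistic constraints jointly push candidate interpretations toward a settled state, with feedback from interpretations to constraints. The constraint values, weights, and update rule are invented for illustration, not fitted parameters from that work.

```python
# A toy constraint satisfaction loop: multiple information sources interact
# and settle on one of two interpretations of an ambiguous input (e.g., a
# verb read as a main verb vs. as part of a reduced relative clause).
import numpy as np

constraints = {                      # each source's initial support for the
    "word_meaning":   np.array([0.7, 0.3]),   # two candidate interpretations
    "syntactic_bias": np.array([0.8, 0.2]),
    "context":        np.array([0.3, 0.7]),
}
weights = {"word_meaning": 0.4, "syntactic_bias": 0.3, "context": 0.3}

acts = {k: v.copy() for k, v in constraints.items()}
for step in range(10):
    # interpretations integrate weighted evidence from every constraint
    interp = sum(weights[k] * acts[k] for k in acts)
    for k in acts:
        acts[k] = acts[k] * (1 + interp)   # feedback toward the winner
        acts[k] /= acts[k].sum()           # renormalize each constraint
    print(step, interp.round(3))           # activations settle over iterations
```

Because syntax-level biases and word-level knowledge feed back into one another on every cycle, no single constraint operates in isolation, which is precisely the property that modular accounts reject.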

Despite empirical evidence for interactive comprehension processes, there is a crucial advantage of modular syntax-level accounts like DLT and Surprisal over interactive approaches like constraint satisfaction: feasibility of computational modeling. Computational implementations of syntax-level processes in DLT and Surprisal Theory can quantify predictions for comprehension difficulty and test them against empirical data. Constraint satisfaction proponents have a more difficult computational modeling path, as they need to capture complex interactions of knowledge of words, sentences, events in the world, and so on. Existing models of interactive comprehension processes (12, 13) are limited in scope, with small vocabularies and other restrictions.

Hahn et al. (5) developed a computational model that integrates aspects of all three of the approaches in Fig. 1 B–D. To do so, they retain components of their prior work (the DLT and Surprisal Theory, Fig. 1 B and C) but make three substantial additions, sketched in Fig. 1E. First, their resource-rational account retains the working memory limitations of DLT but now incorporates parameters that make temporary memory of language input subject to loss of information, aligning the model with insights from working memory research (14) and the authors’ own prior work (15). Second, the fallibility of working memory is not a constant over all words; it is optimized to retain words likely to be useful later, to better support downstream predictions for upcoming language input. Third, the model incorporates rich statistical information emphasized by constraint satisfaction accounts, via representations from the neural network language model GPT-2 (16), which encodes vast amounts of language experience. This combination of lossy memory and rich language context goes beyond previous modular models and generates predictions for specific words in upcoming language input. The broad-coverage implemented model integrates working memory limitations with massive linguistic knowledge gleaned from language experience. Hahn et al. also present empirical work in three languages, identifying a previously unstudied lexico-syntactic constraint (embedding bias, the statistical tendency for certain nouns to be followed by an embedded clause). This work captures joint information from words and sentence structure during comprehension, a claim promoted by constraint satisfaction approaches but not previously modeled at this scale.
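The flavor of the lossy-memory idea can be sketched in a few lines. In the toy below, each context word survives in memory with some probability, surprisal for the next word is averaged over sampled memory states, and retaining a noun that signals an upcoming embedded clause lowers expected surprisal. The predictor and retention probabilities are invented stand-ins; Hahn et al. derive predictions from a trained neural model and optimize retention under a memory budget.

```python
# A toy sketch of lossy-context surprisal: expected -log2 P(next word | memory),
# averaged over noisy memory states. The predictor and retention probabilities
# are invented for illustration only.
import math
import random

def lossy_context(words, retention_prob):
    """Each context word survives in memory with its own probability."""
    return [w for w in words if random.random() < retention_prob.get(w, 0.5)]

def lossy_surprisal(next_word, context, retention_prob, predictor, samples=2000):
    """Average surprisal of next_word over sampled lossy memory states."""
    total = 0.0
    for _ in range(samples):
        p = predictor(next_word, lossy_context(context, retention_prob))
        total += -math.log2(max(p, 1e-9))
    return total / samples

def toy_predictor(next_word, memory):
    # stand-in for GPT-2: "report"-like nouns make an embedded clause likely
    if next_word == "that":
        return 0.6 if "report" in memory else 0.1
    return 0.2

ctx = "the report about the dog".split()
print(lossy_surprisal("that", ctx, {"report": 0.9}, toy_predictor))  # lower
print(lossy_surprisal("that", ctx, {"report": 0.2}, toy_predictor))  # higher
```

The two print statements illustrate the resource-rational logic: a memory policy that preferentially retains the embedding-biased noun yields lower expected surprisal at the embedded clause, so an optimized memory should keep it.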

Beyond these developments in sentence comprehension, the Hahn et al. model (5) may also have implications for debates on the nature of working memory—whether brief memories are held in a dedicated temporary store (3) or are emergent from language comprehension and other cognitive processes (17, 18). A major hurdle in testing these alternatives has been a lack of computational modeling to provide precise, quantifiable predictions of how temporary memory could emerge from language processes (19). Existing computational models of emergent working memory are limited in scope and often address memory maintenance of isolated words (20). The Hahn et al. model, however, provides a large-scale computational account of temporary memory in the service of sentence comprehension, where properties of memory maintenance are shaped by the needs of the comprehension system. In other words, Hahn et al.’s model may provide a path toward modeling verbal working memory as emergent from language comprehension processes.

The debate between dedicated-store and emergent accounts is not just a turf war of theoretical constructs; it has real consequences for investigations of human computational capacity and for diagnosis and treatment of atypical language development and impairments. For example, researchers who view working memory as an independent temporary store have suggested that low working memory capacity is a cause of language impairments (21), sometimes with recommendations of memory training exercises to improve working memory (22). Increasingly, however, researchers have recognized that verbal working memory assessments are inextricable from language use (4), and that working memory training doesn’t generalize: people get better on the training task, with minimal or no transfer to other activities (23). By showing that verbal working memory capacity is not independent of language knowledge and processes, these findings are consistent with emergent memory approaches. They also emphasize how these scientific debates have real practical applications, as treatment strategies for impairments depend on a dedicated-store vs. emergent approach to working memory. With few exceptions (8, 24), models of language comprehension have given little attention to working memory. If the Hahn et al. (5) model brings these issues to the fore and promotes greater computational rigor, that would be a welcome development for both basic science and more applied areas of language comprehension.

Acknowledgments

Author contributions

M.C.M. wrote the paper.

Competing interest

The author declares no competing interest.

Footnotes

See companion article, “A resource-rational model of human processing of recursive linguistic structure,” 10.1073/pnas.2122602119.

References

1. Frazier L., “Sentence processing: A tutorial review” in Attention and Performance XII, Coltheart M., Ed. (Lawrence Erlbaum Associates Inc, 1987), pp. 559–586.
2. MacDonald M. C., Pearlmutter N. J., Seidenberg M. S., The lexical nature of syntactic ambiguity resolution. Psychol. Rev. 101, 676–703 (1994).
3. Baddeley A., Working memory. Science 255, 556–559 (1992).
4. Schwering S. C., MacDonald M. C., Verbal working memory as emergent from language comprehension and production. Front. Hum. Neurosci. 14, 68 (2020).
5. Hahn M., Futrell R., Levy R., Gibson E., A resource-rational model of human processing of recursive linguistic structure. Proc. Natl. Acad. Sci. U.S.A. 119, e2122602119 (2022).
6. Chomsky N., Miller G. A., “Introduction to the formal analysis of natural languages” in Handbook of Mathematical Psychology, Luce R., Bush R., Galanter E., Eds. (Wiley, 1963).
7. Miller G. A., Chomsky N., “Finitary models of language users” in Handbook of Mathematical Psychology, Luce R., Bush R., Galanter E., Eds. (Wiley, 1963), pp. 419–491.
8. Gibson E., Linguistic complexity: Locality of syntactic dependencies. Cognition 68, 1–76 (1998).
9. Levy R., Expectation-based syntactic comprehension. Cognition 106, 1126–1177 (2008).
10. Hale J., Uncertainty about the rest of the sentence. Cogn. Sci. 30, 643–672 (2006).
11. McRae K., Spivey-Knowlton M. J., Tanenhaus M. K., Modeling the influence of thematic fit (and other constraints) in on-line sentence comprehension. J. Mem. Lang. 38, 283–312 (1998).
12. Chang F., Dell G. S., Bock K., Becoming syntactic. Psychol. Rev. 113, 234–272 (2006).
13. Rabovsky M., Hansen S. S., McClelland J. L., Modelling the N400 brain potential as change in a probabilistic representation of meaning. Nat. Hum. Behav. 2, 693–705 (2018).
14. Hulme C., et al., Word-frequency effects on short-term memory tasks: Evidence for a redintegration process in immediate serial recall. J. Exp. Psychol. Learn. Mem. Cogn. 23, 1217–1232 (1997).
15. Futrell R., Gibson E., Levy R. P., Lossy-context surprisal: An information-theoretic model of memory effects in sentence processing. Cogn. Sci. 44, e12814 (2020).
16. Radford A., et al., Language models are unsupervised multitask learners. OpenAI Blog (2019).
17. Cowan N., “What are the differences between long-term, short-term, and working memory?” in Progress in Brain Research, Essence of Memory, Sossin W. S., Lacaille J.-C., Castellucci V. F., Belleville S., Eds. (Elsevier, 2008), pp. 323–338.
18. Buchsbaum B. R., D’Esposito M., A sensorimotor view of verbal working memory. Cortex 112, 134–148 (2018).
19. Norris D., Short-term memory and long-term memory are still different. Psychol. Bull. 143, 992–1009 (2017).
20. Ueno T., Saito S., Rogers T. T., Lambon Ralph M. A., Lichtheim 2: Synthesizing aphasia and the neural basis of language in a neurocomputational model of the dual dorsal-ventral language pathways. Neuron 72, 385–396 (2011).
21. Gathercole S. E., Service E., Hitch G. J., Adams A.-M., Martin A. J., Phonological short-term memory and vocabulary development: Further evidence on the nature of the relationship. Appl. Cogn. Psychol. 13, 65–77 (1999).
22. Ingvalson E. M., Dhar S., Wong P. C. M., Liu H., Working memory training to improve speech perception in noise across languages. J. Acoust. Soc. Am. 137, 3477–3486 (2015).
23. Watrin L., Wilhelm O., Hülür G., Training working memory for two years – no evidence of latent transfer to intelligence. J. Exp. Psychol. Learn. Mem. Cogn. 48, 717–733 (2022).
24. Lewis R. L., Vasishth S., Van Dyke J. A., Computational principles of working memory in sentence comprehension. Trends Cogn. Sci. 10, 447–454 (2006).
