Philosophical Transactions of the Royal Society B: Biological Sciences. 2024 Aug 19;379(1911):20230149. doi: 10.1098/rstb.2023.0149

Symbol ungrounding: what the successes (and failures) of large language models reveal about human cognition

Guy Dove
PMCID: PMC11529626  PMID: 39155725

Abstract

Large language models can handle sophisticated natural language processing tasks. This raises the question of how their understanding of semantic meaning compares to that of human beings. Supporters of embodied cognition often point out that because these models are trained solely on text, their representations of semantic content are not grounded in sensorimotor experience. This paper contends that human cognition exhibits capabilities that fit with both the embodied and artificial intelligence approaches. Evidence suggests that semantic memory is partially grounded in sensorimotor systems and dependent on language-specific learning. From this perspective, large language models demonstrate the richness of language as a source of semantic information. They show how our experience with language might scaffold and extend our capacity to make sense of the world. In the context of an embodied mind, language provides access to a valuable form of ungrounded cognition.

This article is part of the theme issue ‘Minds in movement: embodied cognition in the age of artificial intelligence’.

Keywords: cognition, concepts, embodied cognition, grounded cognition, artificial intelligence, language

1. Introduction

How does the natural language processing (NLP) of contemporary artificial intelligence (AI) systems compare to that of human beings? This has become a central theoretical question in the discussion surrounding the effectiveness of large language models (LLMs). Deciphering the extent to which systems employing these models understand, or fail to understand, semantic content has become something of a cottage industry (e.g. [1,2]). The fact that such systems are trained only on text raises the possibility that the knowledge that they acquire is insufficiently grounded in experience with the physical world. Notable failures of extant models suggest that this problem may be endemic to them [3].

Given that a lack of grounding has become a widely recognized theoretical challenge for LLMs, there is reason to look to another contemporary research programme, embodied cognition, for a possible solution to this problem. Embodied cognition emphasizes the degree to which human cognition depends on experiential systems associated with bodily action and perception. The stratagem of evaluating LLMs in relation to embodiment has led to two general responses, one optimistic and the other sceptical. The optimistic approach seeks to mimic the embodied nature of human cognition by adding models trained on perceptual data or incorporating epigenetic robotics. The sceptical approach contends that the disembodied character of LLMs excludes them from having access to a rich understanding of semantic meanings.

The purpose of this essay is to move away from this dichotomous conceptual framework. I argue that aspects of human cognition fit with both the embodied and AI approaches. More specifically, there are several reasons to think that human semantic memory is both partially grounded in sensorimotor systems and dependent on language-specific learning. From this perspective, LLMs amount to an exaggerated version of an important human skill. Despite their biological implausibility and massive training input, they demonstrate the richness of language as a source of semantic information about our world. At a minimum, they serve as a proof of concept with respect to how language might extend our cognitive reach [4,5]. In other words, they show how experience with language might scaffold and extend our capacity to make sense of the world. Although my focus will primarily be on the implications that LLMs have for our understanding of human cognition, this proposed division of cognitive labour has implications for the prospects of an embodied AI.

2. Surprising successes and disappointing failures

LLMs are neural networks containing up to hundreds of billions of parameters. They are typically transformer-based models [6] trained on massive sets of human-produced text. For example, even an early LLM such as GPT-3 (Generative Pre-trained Transformer 3) was initially trained on a corpus of approximately 200 billion words, estimated to be equivalent to 20 000 years’ worth of human experience [7]. Fundamentally, what these models do is carry out a sequence prediction task. More specifically, they treat words or parts of words as individual tokens and attempt to predict the next token in a particular series. They are thus generative mathematical models of the statistical distribution of such tokens in the corpus that they have been trained on.
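The idea of next-token prediction can be illustrated with a deliberately tiny sketch. The code below is not a transformer and is not how any production LLM is built; it is a count-based bigram model, assumed here only because it makes the notion of "a model of the statistical distribution of tokens" concrete. The toy corpus, the whitespace tokenization and the function names are all hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_distribution(counts, prev):
    """Turn the raw counts for `prev` into conditional probabilities."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}


# Hypothetical toy corpus, split on whitespace into word-level tokens.
corpus = "the dog chased the cat and the cat chased the mouse".split()
model = train_bigram(corpus)

# Estimated distribution over the token that follows "the".
dist = next_token_distribution(model, "the")

# Greedy prediction: pick the most probable continuation.
prediction = max(dist, key=dist.get)
print(dist)
print(prediction)
```

An LLM replaces the lookup table with a neural network that conditions on a long preceding context rather than a single token, but the training objective has the same shape: estimate which token is likely to come next.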

The capacity of such models to generate further text in response to a given chunk of text is often described as a kind of pattern completion [8]. Despite the seemingly basic nature of the task, systems employing LLMs exhibit what appears to be extraordinarily sophisticated performance on language understanding tasks. For instance, several recent models have been shown to exhibit human-like performance on language understanding metrics specifically designed to challenge NLP systems [9–11]. These metrics include comprehension, entailment, question–answer, reasoning, recognition and summarizing tasks.

Recent research suggests that LLMs can develop intriguing degrees of competence with respect to the formal aspects of language processing [7,12]. LLMs appear to glean information about subject–verb agreement [13,14], constituent labelling [15] and filler–gap dependencies [16]. Syntactic information can be recovered from the word embeddings of BERT (Bidirectional Encoder Representations from Transformers) [17]. Despite these successes, LLMs have also exhibited disturbing patterns of failure in relation to understanding aspects of semantic meaning. For example, some systems containing LLMs have been found to struggle with negation [18–20]. LLMs can also struggle with pragmatic inferences [21], common sense reasoning [22] and understanding the relationships between properties and affordances [23].

Reviewing the literature on the language processing of LLMs, Mahowald et al. [24] draw a distinction between formal and functional types of linguistic competence: formal linguistic competence involves knowledge of the rules and distributional patterns of a natural language, and functional linguistic competence involves understanding how language applies to real-world situations. They argue that LLMs exhibit a lot of promise with respect to capturing formal linguistic competence but struggle greatly when it comes to functional linguistic competence. They attribute this divergence to the fact that real-life language use requires non-linguistic cognitive skills.

3. Embodied AI

LLMs show great potential with a diverse collection of language-related processing and production tasks, but questions remain with respect to whether they can, even in principle, achieve human-like understanding [3]. Bender & Koller offer a blunt and categorical assessment ([25, p. 5185]; italics in the original): ‘We argue that the language-modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning’. They argue that meaning involves a relation between linguistic symbols and something external to language. In particular, they identify meanings with the communicative intentions behind utterances. These intentions may involve conveying information, but they can also encompass actions and social activities. Because LLMs lack communicative intentions, they lack access to meanings. Shanahan [8, p. 5] offers a similarly sceptical take—albeit one that does not require a commitment to a specific conception of linguistic meaning: ‘A bare bones LLM doesn’t “really” know anything because all it does, at a fundamental level, is sequence prediction’. The idea is that although some of the sequences that it produces may fit the form of propositions, this does not mean that the models have any meaningful understanding of the states of affairs that these propositions capture.

The reasoning behind this scepticism echoes the symbol grounding problem [26], which arises with any computational account that characterizes cognition in terms of the relation of symbols to other symbols. This kind of functional organization leads to a symbolic merry-go-round in which cognitive processes involve going from one meaningless symbol to another. To help illustrate the nature of the problem, Harnad [26] asks us to imagine trying to learn Chinese as a new second language solely by means of a Chinese-to-Chinese dictionary. The obvious difficulty is that each word would be defined in terms of its connections to other words in the unknown language. LLMs carry out this sort of symbol-to-symbol processing on a grand scale.
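Harnad's dictionary example can be pictured as a closed graph of symbols. The sketch below is only an illustration of that picture, under the assumption that a monolingual dictionary can be modelled as a mapping from each headword to the words used in its definition. The entries are opaque placeholder tokens (hypothetical, not real Chinese words), chosen to stress that nothing in the structure points outside the symbol system.

```python
def reachable_symbols(dictionary, start):
    """Follow definitions from `start`, collecting every symbol encountered."""
    seen = set()
    frontier = [start]
    while frontier:
        word = frontier.pop()
        if word in seen:
            continue
        seen.add(word)
        # Each definition is just a list of further symbols to look up.
        frontier.extend(dictionary.get(word, []))
    return seen


# Hypothetical monolingual dictionary: every headword is defined
# only in terms of other headwords.
dictionary = {
    "w1": ["w2", "w3"],
    "w2": ["w3", "w4"],
    "w3": ["w1"],
    "w4": ["w2"],
}

visited = reachable_symbols(dictionary, "w1")

# Symbols reached that are NOT themselves dictionary entries,
# i.e. places where the lookup chain would leave the symbol system.
escapes = visited - set(dictionary)
print(sorted(visited))
print(escapes)
```

In this toy case the set of escapes is empty: however long one keeps looking words up, one only ever arrives at more entries in the same dictionary, which is the merry-go-round described above.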

Cognitive scientists have traditionally assumed that computational theories of cognition that are developed in a top–down manner will eventually connect with theories of peripheral experiential systems. In other words, the presumption has been that the cognitive system would ultimately be connected to the external world by means of links between amodal conceptual representations and those contained within functionally independent action and perception systems. Harnad [26] argues that this approach underestimates the extent of the symbol grounding problem and that a different approach is needed. He suggests that theories of cognition should incorporate intrinsically dedicated symbols that retain their connection to the world.

Embodied cognition explores this approach. It is a broad research programme that emphasizes the ways in which the neural systems associated with bodily action and perception contribute to cognition [27]. Theories of embodied cognition tend to focus on the contribution of the systems associated with processing the regularities of physical and social experiences. Within the context of semantic memory, evidence implicating affective and sensorimotor systems in cognition has led to the influential grounded cognition model [28], which holds that we think about objects and events by means of the same mechanisms that experience and interact with them. According to this model, conceptual processing relies on the reactivation of neural circuits contained within modality-specific regions of the cortex—that is, thinking involves simulations of experience.

There are good reasons to speculate that embodied cognition could help researchers develop better NLP systems. Some of the weaknesses of LLMs appear to be connected to their lack of access to suitably grounded representations. In keeping with this, some researchers have explored the possibility of combining LLMs with other networks trained on perceptual data such as visual images [29,30]. Others have explored the possibility of embedding LLMs within systems that directly interact with the world, such as robots [31]. My aim in this paper is to explore a related speculation, namely, that the sort of language-based processing exhibited by LLMs might help us recognize and overcome the theoretical shortcomings associated with a strictly embodied approach to human cognition.

4. Symbol ungrounding

The symbol grounding problem has become part of the origin story of embodied cognition because a cognitive system that relies on the redeployment of representations that are grounded in sensorimotor experience seems well-positioned to overcome this problem [32,33]. Unfortunately, linking cognition to bodily experience creates a different problem. Although embodiment holds promise with respect to symbol grounding, grounded cognitive systems face a corresponding symbol ungrounding problem: How does a system that relies on representations contained within action, perception and emotion systems acquire knowledge that requires going beyond particular experiences? Conceptual knowledge requires abstraction from perceptual and motor particulars [34].

The need for symbol ungrounding in the context of embodied cognition has received two prominent, yet relatively independent, treatments. The first originates from the theoretical framework developed by Terrence Deacon [35]. Rączaszek-Leonardi & Deacon [36] argue that the way researchers typically conceptualize the symbol grounding problem is flawed for two reasons. The first flaw is that the problem is typically formulated in a way that assumes that cognitive systems have prior access to abstract, language-like symbols. This seems implausible from a developmental perspective. The second flaw is that it fails to acknowledge the possibility that ‘non-symbolic structures also play a vital role in regulating the relationships among organisms and—crucially—the emergence of the symbols themselves’ [36, p. 232]. They propose that symbols become endowed with meaning through a process that depends on simpler semiotic relationships (associated with icons and indices) that are linked to bodily dynamics and action in the world.

The shift to an embodied perspective creates a new theoretical challenge: How does symbolic meaning emerge out of complex interactive contexts? As Rączaszek-Leonardi & Deacon [36, p. 232] put it:

In such contexts it becomes evident that the real problem is not the grounding of symbols but rather explaining the mystery of how such an embedded, embodied and situated use of signs can ever (at least partially) become liberated from the immediate reliance on the online events, thus, how they become, at least partially, ungrounded.

The emergence of symbolic cognition out of the dynamical activity associated with perceiving and acting in specific contexts requires a process of symbol ungrounding. Explaining just how this happens is a fundamental theoretical challenge for any approach that emphasizes the importance of situated, embedded and embodied action. From this perspective, language development becomes a central test case for symbolic emergence. The question at hand for researchers is how meaningful symbols emerge from the grounded interactions of children and their linguistic and social niches. Rączaszek-Leonardi & Deacon [36] suggest that children capitalize on the relations between iconic and indexical signs. Given that this approach focuses on situated bodily action, it is not surprising that they emphasize the promise of the field of epigenetic robotics. At first glance, their approach would seem to have little in common with that of contemporary NLP. Nevertheless, LLMs provide a clear example of how language-based ungrounding might occur.

The second treatment of symbol ungrounding occurs in my own work. In an early paper [37], I argue that abstract concepts pose three variants of the symbol ungrounding problem for embodied cognition. The first of these is the problem of generalization. Many abstract concepts involve generalizing away from certain experiential particulars. For instance, superordinate concepts such as MAMMAL have a broader scope than basic-level concepts such as DOG and subordinate concepts such as PUG. The second is the problem of flexibility. Abstract concepts tend to be more context-dependent than concrete concepts [38,39]. The third is the problem of disembodiment. Certain abstract concepts, such as PRIME NUMBER, QUARK and TRUTH, appear to be fundamentally divorced from sensorimotor experience [40]. More recently, I have emphasized the universality of the problem of abstraction for theories of concepts [34].

Embodied cognition struggles to explain our capacity to employ concepts that are disconnected from affective, motor and perceptual experience. Experience with language may help our brains overcome the symbol ungrounding problem [41] by offering us access to an external symbol system that has different computational properties from non-linguistic grounded cognition [4]. Some of our knowledge may be captured by means of the associative relationships between linguistic representations and other linguistic representations [5,42]. In other words, cognition may be shaped to some significant degree by our experiences with words, sentences and conversations [43].

5. A rich source of information

LLMs demonstrate how knowledge pertaining to language use can provide access to semantic content that goes beyond sensorimotor experience. Language provides an additional—yet in many ways complementary—source of information about the world. I am going to defend this hypothesis by pointing to research in two domains. First, evidence suggests that language can provide help with the acquisition and use of concrete concepts, even perceptually based ones, when normally grounded experience is unavailable. Second, there are compelling reasons to think that language-based information provides an important leg up with respect to the acquisition and use of abstract concepts. This fits naturally with the idea that language provides a means of symbol ungrounding.

(a). Colour concepts

Bender & Koller [25, p. 5192] argue that LLMs lack an understanding of linguistic meaning because meaning ‘is based on a link between linguistic form and something that is not language’. What happens when individuals lack the appropriate grounded access to something external to language? Consider the case of congenitally blind individuals and colour concepts. Clearly, such individuals lack perceptual access to the referents of these terms. This example has a clear historical precedent. Hume [44, p. 15] famously asserts:

If it happen, from a defect of the organ, that a man is not susceptible to any species of sensation, we always find that he is as little susceptible of the correspondent ideas. A blind man can form no notion of colours; a deaf man of sounds.

Although this may initially seem to be a definitional or a priori claim, it is in fact a defeasible empirical claim. Indeed, the semantic understanding of colour terms by blind individuals is a well-studied area of psycholinguistic and cognitive neuroscience research. Congenitally and early blind individuals have been shown to have sophisticated semantic knowledge of colour similarity [45] and the colour of objects [46–48].

Gubelmann [49] argues that the ability of congenitally and early blind individuals to acquire rich colour concepts shows that grounded representations are not required for language understanding, even in the context of semantic domains that involve perceptual experience. In keeping with this, a research programme for defending the sophistication of the semantic knowledge acquired by NLP systems has emerged. Pavlick [50, p. 11] explains, ‘One way to show that grounding is not necessary for learning meaning is to show that conceptual representations learned by an (ungrounded) LLM are isomorphic to grounded representations of those same concepts’. Evidence suggests that it is possible to glean a substantial amount of information about colour from the statistical co-occurrence patterns of colour words [51]. LLMs appear to be able to acquire contextual representations that correspond to the human perception of colour [52–54].
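How co-occurrence patterns could carry information about colour similarity can be shown with a minimal sketch. This is not the method used in the cited studies; it is a generic bag-of-words co-occurrence model with cosine similarity, assumed here purely for illustration. The three sentences are an invented toy corpus, and the function names are hypothetical.

```python
import math
from collections import Counter


def context_vectors(sentences, targets):
    """Count the words that share a sentence with each target word."""
    vectors = {t: Counter() for t in targets}
    for sentence in sentences:
        words = sentence.split()
        for t in targets:
            if t in words:
                vectors[t].update(w for w in words if w != t)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus; no perceptual input is involved at any point.
sentences = [
    "red fire is hot",
    "orange fire is warm",
    "blue ice is cold",
]
vectors = context_vectors(sentences, ["red", "orange", "blue"])

sim_red_orange = cosine(vectors["red"], vectors["orange"])
sim_red_blue = cosine(vectors["red"], vectors["blue"])
print(sim_red_orange, sim_red_blue)
```

In this toy corpus "red" ends up closer to "orange" than to "blue" simply because the two words keep similar linguistic company. Whether text-derived similarities of this kind line up with perceptual colour similarity at scale is the empirical question that the studies cited above address.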

None of this is to say that we do not typically acquire semantic knowledge through our direct experience of the world. It is just that the ability of ungrounded LLMs to learn about experiential phenomena such as colour suggests that language-based cognition can be an alternative route to such knowledge—particularly in the context of an otherwise grounded cognitive system. As Borghi et al. [55, p. 362] point out, the success of these models raises the possibility that ‘humans could also extract part of their knowledge and skills in a similar manner’.

(b). Abstract concepts

A diverse body of evidence suggests that language plays an important role in our ability to acquire and use abstract concepts [41,56]. Some of this evidence implicates the sort of distributional information leveraged by LLMs. For instance, hybrid models that statistically combined language-based distributional data and non-linguistic experiential data have been found to match up with observed human performance better than models that relied exclusively on one of these forms of data [57–59]. These results suggest that the ability to take advantage of the distributional information contained within natural language might benefit an otherwise grounded conceptual system.

Given the apparent importance of language to the encoding and processing of abstract concepts, we would expect LLMs to be particularly effective at handling abstract concepts. Unfortunately, this may not be the case. Liao, Chen and Du [60] examined the conceptual understanding capabilities of six different pre-trained LLMs. More specifically, they compared how these LLMs handled abstract and concrete concepts on a hypernym discovery task. They found that, across the board, the LLMs performed more poorly with abstract concepts than with concrete ones. This result is surprising when compared to the previous success of distributional semantic models.

Clearly, more research is needed to uncover just what is going on here. Several possible explanations come to mind. It could be that the task was not the right one for such a comparison. Or it could be a problem specific to LLMs. They have been shown to struggle with abstract reasoning [61]. In addition, abstract concepts tend to be more polysemous than concrete concepts [62], and polysemy has been shown to be a particular problem for LLMs [63]. Our ability to understand abstract concepts may also be more dependent on context-dependent pragmatic factors that are not captured well by LLMs [55].

6. From how-possibly to how-actually

The ability of LLMs to meet or exceed pre-established benchmarks suggests that they have achieved a moderate form of semantic competence. They thus provide a how-possibly explanation for the acquisition of some conceptual knowledge. They do this by simultaneously demonstrating the informational richness of the linguistic input and the power of deep learning architectures. This form of language-based learning fits well with certain aspects of human semantic competence, such as the ability of congenitally blind individuals to acquire rich colour concepts and our general capacity to acquire abstract concepts that go well beyond sensorimotor experience.

One might object, though, that there is an important disanalogy between the human and AI cases.¹ Unlike LLMs, humans have direct access to grounded conceptual content. Consider the acquisition of colour concepts by the congenitally blind. It is not clear that these individuals learn these concepts in the same way that LLMs do, that is, through an analysis of statistical co-occurrence patterns in language. After all, these individuals would likely have access to grounded conceptual knowledge associated with other sensory modalities. Perhaps they can leverage this knowledge in the process of acquiring colour concepts. If this is correct, then any model that relies on distributional semantics is unlikely to help in the search for a how-actually explanation.

I maintain that there are several reasons to be interested in the sort of processing contained within LLMs. First, they provide more than just a possible explanation of how semantic competence is achieved. As outlined above, they also provide a promising explanation of how formal linguistic competence is acquired. If our brains use statistical co-occurrence information to acquire this sort of competence, it seems reasonable to conjecture that they might also use it to acquire some aspects of semantic competence. In other words, LLMs enjoy some promise with respect to explanatory power.

Second, LLMs have indirect access to grounded conceptual content—it is just that this content comes from the utterances that they are trained on. These utterances are produced by grounded cognitive agents. Criticisms of the purported status of LLMs as understanders of language often turn on this observation [7,8,25]. Recall, though, that I am proposing that LLMs support a how-possibly conjecture with respect to symbol ungrounding. As this terminology suggests, this process requires access to grounded symbols. The idea is that LLMs show how such ungrounding might occur. In keeping with this, Lupyan et al. [64, p. 938] claim that ‘co-occurrence statistics can act as a kind of echo of real-world linkages and causal relationships’. The suggestion on offer is that our brains take advantage of this echo.

Finally, even though LLMs are unrealistic models of our cognition for several reasons—including the size of their training sets, the nature of their learning algorithms and the number of parameters they involve—they might still contribute to a research programme aimed at providing a how-actually explanation of human cognition. Buckner [65] argues that deep learning networks fit with a moderate form of empiricism tied to faculty psychology. If this is correct, LLMs can be viewed as an initial step in this direction.

7. Conclusion

Rather than treat LLMs as fully realized consumers of semantic knowledge or embodied cognition as a full-throated counterexample to the plausibility of AI, I have argued that lessons can be learned by putting them into conversation with each other. Many of those interested in the promise of LLMs—and by extension, of systems in which they are embedded—have been so focused on ambitious assessments of their capacity to understand language that they have tended to dismiss problems arising from a lack of effective grounding of these systems. Supporters of embodied cognition, on the other hand, have been so focused on overcoming symbol grounding problems that they have failed to adequately recognize the need for symbol ungrounding.

Reframing LLMs as an exaggeration of an important aspect of human cognition enables a kind of rapprochement between the two sides of the debate concerning the ability of LLMs to master linguistic meaning. Critics tend to argue that the outputs of systems employing them amount to little more than sophisticated collage [66] or pretending to think [24]. Supporters tend to argue that LLMs have the capacity to achieve a special sort of language-based semantic competence [49].

The suggestion explored in this essay is that our experience with language provides a means of symbol ungrounding for what is otherwise a largely embodied conceptual system. In other words, the abstractness of language-derived semantic representations is a feature rather than a bug. There are times when pretending to think in the absence of a fully grounded mechanism is useful and advantageous. To some important degree, learning a language requires faking it until one makes it. The very features of LLMs that lead people to question them as successful models of knowledge make them useful in the context of the symbol ungrounding problem.

This approach has consequences for future research into how we think and how machines might think. With respect to cognitive science, LLMs can be viewed as biologically implausible models of the way that language-based experience might help to unground a fundamentally embodied conceptual system. With respect to AI, the need for grounding provides an impetus to explore larger systems that include action- and perception-oriented representations. More generally, symbol ungrounding appears to be an important design feature of cognition, whether human or artificial.

Footnotes

1. I thank an anonymous reviewer for raising this objection.

Ethics

This work did not require ethical approval from a human subject or animal welfare committee.

Data accessibility

This article has no additional data.

Declaration of AI use

We have not used AI-assisted technologies in creating this article.

Authors’ contributions

G.D.: conceptualization, writing—original draft, writing—review and editing.

Conflict of interests

I declare I have no competing interests.

Funding

No funding has been received for this article.

References

  • 1. Dhingra S, Singh M, S.B. V, Malviya N, Gill SS. 2023. Mind meets machine: unravelling GPT-4’s cognitive psychology. BenchCouncil Trans. Bench. and Stand. Eval. 3 , 100139. ( 10.1016/j.tbench.2023.100139) [DOI] [Google Scholar]
  • 2. Rogers A, Kovaleva O, Rumshisky A. 2020. A primer in Bertology: what we know about how BERT works. Trans. Assoc. Comput. Linguist. 8 , 842–866. ( 10.1162/tacl_a_00349) [DOI] [Google Scholar]
  • 3. Marcus G, Leivada E, Murphy E. 2023. A sentence is worth a thousand pictures: can large language models understand human language. arXiv. preprint 2308.0010. ( 10.48550/arXiv.2308.00109) [DOI] [Google Scholar]
  • 4. Clark A. 2006. Language, embodiment, and the cognitive niche. Trends Cogn. Sci. 10 , 370–374. ( 10.1016/j.tics.2006.06.012) [DOI] [PubMed] [Google Scholar]
  • 5. Dove G. 2020. More than a scaffold: language is a neuroenhancement. Cogn. Neuropsychol. 37 , 288–311. ( 10.1080/02643294.2019.1637338) [DOI] [PubMed] [Google Scholar]
  • 6. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. 2017. Attention is all you need. Adv. Neural. Inf. Process. 30 , 5998–6008. ( 10.48550/arXiv.1706.03762) [DOI] [Google Scholar]
  • 7. Warstadt A, Bowman SR. 2022. What artificial neural networks can tell us about human language acquisition. arXiv. preprint. ( 10.48550/arXiv.2208.07998) [DOI] [Google Scholar]
  • 8. Shanahan M. 2023. Talking about large language models. arXiv. 2212.03551. ( 10.48550/arXiv.2212.03551) [DOI] [Google Scholar]
  • 9. Wang A, Singh A, Michael J, Hill F, Levy O, Bowman S. 2018. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proc. 2018 EMNLP Workshop Blackbox NLP, Brussels, Belgium, November 2018 (eds Linzen T, Chrupala G, Alishahi A), pp. 353–355. Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/W18-5446) [DOI] [Google Scholar]
  • 10. Wang A, Pruksachatkun Y, Nangia N, Singh A, Michael J, Hill F, Levy O, Bowman SR. 2019. Superglue: a stickier benchmark for general-purpose language understanding systems. In 33rd Conf. Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, 8–14 December 2019, pp. 3266–3280. ( 10.48550/arXiv.1905.00537) [DOI] [Google Scholar]
  • 11. Srivastava A, Rastogi A, Rao Aet al. 2022. Beyond the imitation game: quantifying and extrapolating the capabilities of language models. arXiv 2206.04615. ( 10.48550/arXiv.2206.04615) [DOI] [Google Scholar]
  • 12. Hu J, Gauthier J, Qian P, Wilcox E, Levy R. 2020. A systematic assessment of syntactic generalization in neural language models. In Proc. 58th Annual Meeting of the Assoc. for Computational Linguistics, July 2020, Online (eds Jurafsky D, Chai J, Schluter N, Tetreault J). Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/2020.acl-main.158) [DOI] [Google Scholar]
  • 13. Golberg Y. 2019. Assessing BERT’s syntactic abilities. arXiv. preprint 11901.05287v1. ( 10.48550/arXiv.1901.05287) [DOI] [Google Scholar]
  • 14. Jawahar G, Sagot B, Seddah D. 2019. What Does BERT Learn about the Structure of Language? In Proc. 57th Annual Meeting Assoc. for Computational Linguistics, Florence, Italy, July 2019 (eds Korhonen A, Traum D, Màrquez L), pp. 3651–3657. Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/P19-1356) [DOI] [Google Scholar]
  • 15. Tenney I, Xia P, Chen B, Wang A, Poliak A, McCoy RTet al. 2019. What do you learn from context? probing sentence structure in contextualized word representations. arXiv. preprint 1–17. ( 10.48550/arXiv.1905.06316) [DOI] [Google Scholar]
  • 16. Wilcox E, Levy R, Morita T, Futrell R. 2018. What do RNN Language Models Learn about Filler–Gap Dependencies? In Proc. 2018 EMNLP Workshop Blackbox NLP, Brussels, Belgium, November 2018 (eds Linzen T, Chrupała G, Alishahi A), pp. 211–221. Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/W18-5423) [DOI] [Google Scholar]
  • 17. Hewitt J, Manning CD. 2019. A structural probe for finding syntax in word representations. In Proc. 2019 Conf. North American Chapter Assoc. for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota, June 2019 (eds Burstein J, Doran C, Solorio T), pp. 4129–4138. Minneapolis, MN: Association for Computational Linguistics. ( 10.18653/v1/N19-1419) [DOI] [Google Scholar]
  • 18. Ettinger A. 2020. What BERT is not: lessons from a new suite of psycholinguistic diagnostics for language models. Trans. Assoc. Comput. Linguist. 8 , 34–48. ( 10.1162/tacl_a_00298) [DOI] [Google Scholar]
  • 19. Kassner N, Schütze H. 2020. Negated and misprimed probes for pretrained language models: birds can talk, but cannot fly. In Proc. 58th Annual Meeting Assoc. Computational Linguistics, July 2020, Online (eds Jurafsky D, Chai J, Schluter N, Tetreault J). Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/2020.acl-main.698) [DOI] [Google Scholar]
  • 20. Niven T, Kao HY. 2019. Probing neural network comprehension of natural language arguments. In Proc. 57th Annual Meeting Assoc. for Computational Linguistics, Florence, Italy, July 2019 (eds Korhonen A, Traum D, Màrquez L), pp. 4658–4664. Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/P19-1459) [DOI] [Google Scholar]
  • 21. Pandia L, Cong Y, Ettinger A. 2021. Pragmatic competence of pre-trained language models through the lens of discourse connectives. arXiv. preprint. pp. 1–13 ( 10.48550/arXiv.2109.12951) [DOI]
  • 22. Da J, Kasai J. 2019. Cracking the commonsense code: understanding commonsense reasoning aptitude of deep contextual representations. In Proc. 1st Workshop on Commonsense Inference in Natural Language Processing, November 2019 (eds Ostermann S, Zhang S, Roth M, Clark P), pp. 1–12. Hong Kong, China: Association for Computational Linguistics. ( 10.18653/v1/D19-6001) [DOI] [Google Scholar]
  • 23. Forbes M, Holtzman A, Choi Y. 2019. Do neural language representations learn physical commonsense? arXiv. preprint. pp. 1–14 ( 10.48550/arXiv.1908.02899) [DOI]
  • 24. Mahowald K, Ivanova AA, Blank IA, Kanwisher N, Tenenbaum JB, Fedorenko E. 2023. Dissociating language and thought in large language models: a cognitive perspective. arXiv. preprint. pp. 1–30 ( 10.48550/arXiv.2301.06627) [DOI] [PMC free article] [PubMed]
  • 25. Bender EM, Koller A. 2020. Climbing towards NLU: on meaning, form, and understanding in the age of data. In Proc. 58th Annual Meeting of the Association for Computational Linguistics, July 2020, Online (eds Jurafsky D, Chai J, Schluter N, Tetreault J), pp. 5185–5198. Association for Computational Linguistics. ( 10.18653/v1/2020.acl-main.463) [DOI] [Google Scholar]
  • 26. Harnad S. 1990. The symbol grounding problem. Physica D 42 , 335–346. ( 10.1016/0167-2789(90)90087-6) [DOI] [Google Scholar]
  • 27. Shapiro L. 2019. Embodied cognition, 2nd Edition. New York, NY: Routledge. ( 10.4324/9781315180380) [DOI] [Google Scholar]
  • 28. Kemmerer D. 2022. Cognitive Neuroscience of language, 2nd Edition. New York, NY: Routledge. ( 10.4324/9781138318427) [DOI] [Google Scholar]
  • 29. Lu J, Batra D, Parikh D, Lee S. 2019. ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. Adv. Neural Inf. Process. Syst. 32 , 1–11. ( 10.48550/arXiv.1908.02265) [DOI] [Google Scholar]
  • 30. Alayrac JB, Donohue J, Luc P, Miech A, Barr I, Hasson Y et al. 2022. Flamingo: a visual language model for few-shot learning. arXiv. preprint 2204.14198v2. ( 10.48550/arXiv.2204.14198) [DOI]
  • 31. Ahn M, Brohan A, Brown N, Chebotar Y, Cortes O, David B et al. 2022. Do as I can, not as I say: grounding language in robotic affordances. arXiv. preprint 2204.01691. pp. 1–34. ( 10.48550/arXiv.2204.01691) [DOI]
  • 32. Barsalou LW. 1999. Perceptual symbol systems. Behav. Brain Sci. 22 , 577–609. ( 10.1017/s0140525x99002149) [DOI] [PubMed] [Google Scholar]
  • 33. Goldstone RL, Barsalou LW. 1998. Reuniting perception and cognition. Cognition 65 , 231–262. ( 10.1016/s0010-0277(97)00047-4) [DOI] [PubMed] [Google Scholar]
  • 34. Dove G. 2023. Concepts require flexible grounding. Brain Lang. 245 , 1–9. ( 10.1016/j.bandl.2023.105322) [DOI] [PubMed] [Google Scholar]
  • 35. Deacon TW. 1997. The symbolic species: the coevolution of language and the brain. New York, NY: W. W. Norton & Company. [Google Scholar]
  • 36. Raczaszek-Leonardi J, Deacon TW. 2018. Ungrounding symbols in language development: implications for modeling emergent symbolic communication in artificial systems. In 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), Tokyo, Japan, 17–20 September 2018, pp. 232–237. ( 10.1109/DEVLRN.2018.8761016) [DOI] [Google Scholar]
  • 37. Dove G. 2016. Three symbol ungrounding problems: abstract concepts and the future of embodied cognition. Psychon. Bull. Rev. 23 , 1109–1121. ( 10.3758/s13423-015-0825-4) [DOI] [PubMed] [Google Scholar]
  • 38. Davis CP, Altmann GTM, Yee E. 2020. Situational systematicity: a role for schema in understanding the differences between abstract and concrete concepts. Cogn. Neuropsychol. 37 , 142–153. ( 10.1080/02643294.2019.1710124) [DOI] [PubMed] [Google Scholar]
  • 39. Hoffman P, Lambon Ralph MA, Rogers TT. 2013. Semantic diversity: a measure of semantic ambiguity based on variability in the contextual usage of words. Behav. Res. Methods 45 , 718–730. ( 10.3758/s13428-012-0278-x) [DOI] [PubMed] [Google Scholar]
  • 40. Borghi AM, Binkofski F. 2014. Words as social tools: an embodied view on abstract concepts. New York, NY: Springer. ( 10.1007/978-1-4614-9539-0) [DOI] [Google Scholar]
  • 41. Dove G. 2022. Abstract concepts and the embodied mind: rethinking grounded cognition. New York, NY: Oxford University Press. ( 10.1093/oso/9780190061975.001.0001) [DOI] [Google Scholar]
  • 42. Tillas A. 2015. Language as grist to the mill of cognition. Cogn. Process. 16 , 219–243. ( 10.1007/s10339-015-0656-2) [DOI] [PubMed] [Google Scholar]
  • 43. Dove G. 2023. Rethinking the role of language in embodied cognition. Phil. Trans. R. Soc. B 378 , 20210375. ( 10.1098/rstb.2021.0375) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Hume D. 2000. An enquiry concerning human understanding. (ed. Beauchamp T). New York, NY: Oxford University Press. [Google Scholar]
  • 45. Saysani A, Corballis MC, Corballis PM. 2021. Seeing colour through language: colour knowledge in the blind and sighted. Vis. Cogn. 29 , 63–71. ( 10.1080/13506285.2020.1866726) [DOI] [Google Scholar]
  • 46. Dimitrova-Radojichikj D. 2015. Concepts of colors in children with congenital blindness. J. Spec. Educ. Rehabil. 16 , 7–16. ( 10.1515/jser-2015-0001) [DOI] [Google Scholar]
  • 47. Lenci A, Baroni M, Cazzolli G, Marotta G. 2013. BLIND: a set of semantic feature norms from the congenitally blind. Behav. Res. Methods 45 , 1218–1233. ( 10.3758/s13428-013-0323-4) [DOI] [PubMed] [Google Scholar]
  • 48. Kim JS, Elli GV, Bedny M. 2019. Knowledge of animal appearance among sighted and blind adults. Proc. Natl Acad. Sci. USA 116 , 11213–11222. ( 10.1073/pnas.1900952116) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Gubelmann R. 2022. A loosely Wittgensteinian conception of the linguistic understanding of large language models like BERT, GPT-3, and ChatGPT. Grazer Philos. Stud. 99 , 485–523. ( 10.1163/18756735-00000182) [DOI] [Google Scholar]
  • 50. Pavlick E. 2023. Symbols and grounding in large language models. Phil. Trans. R. Soc. A 381 , 1–19. ( 10.1098/rsta.2022.0041) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Lewis M, Zettersten M, Lupyan G. 2019. Distributional semantics as a source of visual knowledge. Proc. Natl Acad. Sci. USA 116 , 19237–19238. ( 10.1073/pnas.1910148116) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Abdou M, Kulmizev A, Hershcovich D, Frank S, Pavlick E, Sogaard A. 2021. Can language models encode perceptual structure without grounding? a case study in color. In Proc. 25th Conf. Computational Natural Language Learning, November 2021, Online (eds Bisazza A, Abend O), pp. 109–132. Stroudsburg, PA: Association for Computational Linguistics. ( 10.18653/v1/2021.conll-1.9) [DOI] [Google Scholar]
  • 53. Patel R, Pavlick E. 2022. Mapping language models to grounded conceptual spaces. In Int. Conf. on Learning Representations. https://arxiv.org/pdf/2402.15337. [Google Scholar]
  • 54. Sogaard A. 2023. Grounding the vector space of an octopus: word meanings from raw text. Minds Mach. 33 , 33–54. ( 10.1007/s11023-023-09622-4) [DOI] [Google Scholar]
  • 55. Borghi AM, De Livio C, Mannella F, Tummolini L, Nolfi S. 2023. Exploring the prospects and challenges of large language models for language learning and production. OSF Preprints. pp. 1–13 ( 10.31219/osf.io/zw8q9) [DOI]
  • 56. Borghi AM. 2023. The freedom of words: abstractness and the power of language. Cambridge, UK: Cambridge University Press. ( 10.1017/9781108913294) [DOI] [Google Scholar]
  • 57. Andrews M, Vigliocco G, Vinson D. 2009. Integrating experiential and distributional data to learn semantic representations. Psychol. Rev. 116 , 463–498. ( 10.1037/a0016261) [DOI] [PubMed] [Google Scholar]
  • 58. Bruni E, Tran NK, Baroni M. 2014. Multimodal distributional semantics. J. Artif. Intell. Res. 49 , 1–47. ( 10.1613/jair.4135) [DOI] [Google Scholar]
  • 59. Steyvers M. 2010. Combining feature norms and text data with topic models. Acta Psychol. 133 , 234–243. ( 10.1016/j.actpsy.2009.10.010) [DOI] [PubMed] [Google Scholar]
  • 60. Liao J, Chen X, Du L. 2023. Concept understanding in large language models: an empirical study. T. P. ICLR 1–5. https://openreview.net/pdf?id=losgEaOWIL7 [Google Scholar]
  • 61. Gendron G, Bao Q, Witbrock M, Dobbie G. 2023. Language models are not abstract reasoners. arXiv. preprint. ( 10.48550/arXiv.2305.19555) [DOI]
  • 62. Dove GO. 2021. The challenges of abstract concepts. In Handbook of embodied psychology: thinking, feeling, and acting (eds Robinson M, Thomas LE), pp. 171–195. Cham, Switzerland: Springer. ( 10.1007/978-3-030-78471-3) [DOI] [Google Scholar]
  • 63. Christiansen JG, Gammelgaard ML, Sogaard A. 2023. Large language models converge toward human-like concept organization. arXiv. preprint: 2308.15047v1. pp. 1–12 ( 10.48550/arXiv.2308.15047) [DOI]
  • 64. Lupyan G, Rahman RA, Boroditsky L, Clark A. 2020. Effects of language on visual perception. Trends Cogn. Sci. 24 , 930–944. ( 10.1016/j.tics.2020.08.005) [DOI] [PubMed] [Google Scholar]
  • 65. Buckner CJ. From deep learning to rational machines: what the history of philosophy can teach us about the future of artificial intelligence. New York, NY: Oxford University Press. ( 10.1093/oso/9780197653302.001.0001) [DOI] [Google Scholar]
  • 66. Kousta ST, Vigliocco G, Vinson DP, Andrews M, Del Campo E. 2011. The representation of abstract words: why emotion matters. J. Exp. Psychol. Gen. 140 , 14–34. ( 10.1037/a0021446) [DOI] [PubMed] [Google Scholar]

Data Availability Statement

This article has no additional data.


Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society