Frontiers in Psychology
Editorial. 2016 Feb 9;7:102. doi: 10.3389/fpsyg.2016.00102

What's in a Name? The Multiple Meanings of “Chunk” and “Chunking”

Fernand Gobet 1,*, Martyn Lloyd-Kelly 1, Peter C R Lane 2
PMCID: PMC4746241  PMID: 26903910

The term chunk, denoting a unit, and the related term chunking, denoting a mechanism to construct that unit, are familiar terms within psychology and cognitive science. The Oxford English Dictionary provides several definitions for “chunk.” First, “a thick, more or less cuboidal, lump, cut off anything,” or, colloquially, “a large or substantial amount.” The Merriam-Webster dictionary provides similar definitions. OUP's Oxford Dictionary alone gives a computer-related meaning: “a section of information or data.” It is in this context, a chunk as a section of information, that the word is used within psychology and cognitive science.

In these fields, a chunk typically refers to a single unit built from several smaller elements, and chunking to the process of creating a chunk. Gobet et al. (2001, p. 236) define a chunk as “a collection of elements having strong associations with one another, but weak associations with elements within other chunks.” However, in different contexts and with different authors, these two terms are used with a variety of meanings, which are very often conflated, leading to considerable confusion. Table 1 provides a taxonomy of the main meanings of “chunk” and “chunking,” which will be used to structure this article.

Table 1. A taxonomy of the meanings of “chunk” and “chunking”.

PSYCHOLOGY
  Memory
    Deliberate chunking
    Automatic chunking
  Perception
  Motor action
    Deliberate chunking
    Automatic chunking
COGNITIVE ARCHITECTURES
  ACT-R
  Soar
  EPAM/CHREST
OTHER MEANINGS
  Computer science
  Linguistics
  Education

In orange, meanings from psychology; in green, meanings related to cognitive architectures; in blue, meanings from other fields in cognitive science.

The multiplicity of meanings of the terms “chunk” and “chunking” in the literature raises a number of questions. Do these meanings refer to the same “things”? Do they simply reflect differences in the empirical domains studied? Are the same mechanisms involved? Is learning underpinned by the same processes? As we shall see, these meanings sometimes differ in considerable ways (e.g., when referring to conscious vs. unconscious mechanisms or labeling declarative vs. procedural knowledge structures), but researchers often cite them together as if they refer to the same theoretical objects. Thus, there is the danger that, while researchers in different fields think they refer to the same theoretical concepts, they actually have different structures and mechanisms in mind. This inevitably results in lack of communication, or even worse, miscommunication.

The aim of this article is explicitly not to review the extensive literature on chunking, which would be impossible in such a short format, but to highlight some of the main meanings of the terms “chunk” and “chunking,” to discuss their commonalities and differences, and to argue that progress in our understanding of chunking will be difficult until researchers recognize these different meanings and are more precise in the way they refer to them.

Psychology and cognitive science

Memory

Gobet et al. (2001) distinguish between two main meanings of chunking with regard to memory: deliberate chunking and automatic chunking. Deliberate chunking is conscious, explicit, intermittent, goal-directed, and strategically intended to structure the material to memorize. This meaning is mostly used in the literature on short-term memory (also known as working memory) and is the meaning used by Miller (1956) when he provides the example of recoding binary digits in the decimal system. Conversely, automatic chunking is unconscious, implicit, and continuous. It deals with processes occurring in long-term memory and is the kind of chunking hypothesized to occur with experts when developing familiarity with a domain, for example in chess (Chase and Simon, 1973; Gobet and Simon, 1996). It is also the meaning used in several computational models, such as Competitive Chunking (Servan-Schreiber and Anderson, 1990), EPAM (Feigenbaum and Simon, 1984), and CHREST (Gobet and Lane, 2005).

Deliberate chunking can be further divided into several meanings, which often occur together. A first meaning is grouping. For example, (a a a b b b a a a) would be chunked in three groups: (a a a), (b b b), and (a a a). A second meaning, used for example by Cermak (1975), is equivalent to categorizing. For example, the list (apple car plane orange boat banana) can be chunked as (apple orange banana) and (car plane boat). A third meaning is recoding, for which Miller's discussion of recoding binary digits into decimal digits is a good example: 0000011110010011 can be recoded as 1939. A final meaning concerns using prior knowledge to reliably memorize material. For example, 1939 can be coded as “the start of World War II.”
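
The first three meanings of deliberate chunking can be sketched in a few lines of Python. This is only an illustration of the examples given above, not a model of memory: the function names and the category labels ("fruit", "vehicle") are assumptions introduced here.

```python
from itertools import groupby

def group_runs(items):
    """Grouping: split a sequence into runs of identical neighbours."""
    return [list(run) for _, run in groupby(items)]

def categorize(items, category_of):
    """Categorizing: collect items under the category each one belongs to."""
    chunks = {}
    for item in items:
        chunks.setdefault(category_of[item], []).append(item)
    return chunks

def recode_binary(bits):
    """Recoding: re-express a string of binary digits as a decimal number."""
    return int(bits, 2)

# Hypothetical category labels, chosen only for this illustration.
CATEGORY_OF = {
    "apple": "fruit", "orange": "fruit", "banana": "fruit",
    "car": "vehicle", "plane": "vehicle", "boat": "vehicle",
}

print(group_runs("aaabbbaaa"))
print(categorize("apple car plane orange boat banana".split(), CATEGORY_OF))
print(recode_binary("0000011110010011"))  # 1939
```

The fourth meaning, linking 1939 to “the start of World War II,” depends on the chunker's prior knowledge and is therefore not captured by a mechanical transformation of this kind.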

Practically, there are important differences between the two types of chunking. Deliberate chunking leads to chunks that are fairly easy to identify, since they are explicitly defined by the chunker and can be readily illustrated or explained. By contrast, identifying chunks created by automatic chunking is more problematic, and various methods have been designed for that purpose, such as pauses in speech and eye movements in chess (for reviews, see Gilchrist, 2015; Gobet, 2015).

These two meanings are rarely distinguished in the literature; many articles start with the first meaning, and then mention the second meaning (or vice versa), without any indication that different concepts are meant (see footnote 1). A moment of thought shows this is confusing, since three states of affairs are possible. First, both deliberate and automatic chunking are present. This is the case, for example, in many mnemonics, where the information is consciously chunked so that a long-term memory trace is created (e.g., Ericsson et al., 1980; Richman et al., 1995). Second, deliberate chunking is used in the absence of automatic chunking. For example, one might use a mnemonic to briefly memorize a phone number, without any long-term memory trace being created. Third, automatic chunking is used in the absence of deliberate chunking. This is presumably the case in many implicit-learning tasks (Berry, 1997), expertise acquisition in most fields (Gobet, 2015) and first-language acquisition (Freudenthal et al., 2007; Jones et al., 2007). For completeness' sake, one can mention a final case where memory is used but neither form of chunking is involved. This would be the case, for example, when one rehearses a phone number mechanically for a few seconds without any long-term memory encoding.

The notion of compression, where a set of elements is recoded more economically (discussed by Miller in his 1956 article), is always present in both deliberate and automatic chunking. Gradients of compression exist, however. For example, with deliberate chunking, recoding 01010101 as 85 seems to use more compression than recoding (I B M) as IBM. With automatic chunking, a set of elements of arbitrary length is recoded as a single unit: rather than storing all the elements in short-term memory (STM), only a pointer is stored that denotes a chunk in long-term memory (LTM) (Newell and Simon, 1972; Guida et al., 2012).
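
The idea of storing a pointer rather than the elements themselves can be illustrated with a toy encoder. This is a sketch under strong assumptions, not a model from the literature: LTM is represented as a plain dictionary from element sequences to labels, matching is greedy and left to right, and the names `encode_with_chunks` and `LTM` are hypothetical.

```python
def encode_with_chunks(items, ltm):
    """Replace the longest known chunk at each position with its LTM label.

    Elements that match no stored chunk occupy one STM slot each.
    """
    stm = []
    i = 0
    while i < len(items):
        # Try the longest candidate first, down to sequences of length 2.
        for length in range(len(items) - i, 1, -1):
            candidate = tuple(items[i:i + length])
            if candidate in ltm:
                stm.append(ltm[candidate])  # store only the pointer
                i += length
                break
        else:
            stm.append(items[i])  # no chunk found: store the raw element
            i += 1
    return stm

# Hypothetical LTM contents, chosen only for this illustration.
LTM = {("I", "B", "M"): "IBM", ("F", "B", "I"): "FBI"}

letters = list("IBMXFBI")
slots = encode_with_chunks(letters, LTM)
print(len(letters), "elements held in", len(slots), "slots:", slots)
```

Under these assumptions, seven letters occupy three slots, and the saving grows with the length of the chunks stored in LTM.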

Perception and motor action

In the literature on perception, chunking is sometimes used with the meaning of implicit and automatic grouping of perceptual information. Thus, following the Gestalt laws of perception, objects are grouped together based on proximity, similarity, symmetry, continuity, and closure (Koffka, 1935; Gobet, 2016). Note that, in this meaning, there is no notion of memory storage.

The literature on motor action defines chunking as the learning of a complex movement sequence consisting of movement components and has studied it in a number of domains (see Rhodes et al., 2004; Diedrichsen and Kornysheva, 2015, for reviews). These include learning movement sequences (Agam et al., 2007; Cohen and Sekuler, 2010), typing (Yamaguchi and Logan, 2014), drawing the Rey–Osterrieth complex figure (Obaidellah and Cheng, 2015), drawing electricity diagrams (Lane et al., 2001), performing the discrete sequence production task (Verwey and Abrahamse, 2012; Abrahamse et al., 2013), playing the piano (van Vugt et al., 2012), speech production (Segawa et al., 2015), and sports (Shea and Wright, 2012). Typically, it is argued that motor chunks are organized hierarchically, and the production of the motor responses associated with them is unconscious and automatic. However, some research has also investigated how consciously dividing the sequence of movements to learn might lead to better skill acquisition (Fontana et al., 2009). Finally, in line with the literature on memory, Verwey and colleagues (Abrahamse et al., 2013; Verwey et al., 2015) have proposed a Dual Processor Model which distinguishes between chunks represented in an explicit format and automatic chunks.

Cognitive architectures

The concept of a chunk is used in three leading cognitive architectures. In ACT-R (Anderson et al., 1997, 2004), a chunk is defined as a unit of declarative knowledge. (Procedural knowledge is encoded as productions.) Chunks contain an “isa” field, which indicates the category to which they belong (for example numeric, textual or visual) and additional fields encoding the knowledge within the chunk (for example: “2 + 2 = 4”). Chunks have a level of activation that is a function of how recently and frequently they have been used. In Soar (Laird et al., 1987; Newell, 1990), all knowledge is encoded as procedural knowledge, and a chunk is a production (i.e., a condition–action pair). Therefore, “chunking” in this context is the mechanism by which productions are created. Thus, confusingly, a chunk is a unit of declarative knowledge in ACT-R and a unit of procedural knowledge in Soar. With CHREST (Gobet and Lane, 2005; Lloyd-Kelly et al., 2015), a chunk refers to a node in LTM. Chunking refers to the creation of such nodes, either by adding a node to the network by discrimination, or by adding information to an existing node, by familiarization. This usage follows the tradition set by the EPAM models (Feigenbaum and Simon, 1984). Thus, it can be seen that, in these three cognitive architectures, the concept of a “chunk” has totally different meanings.
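
The contrast between the ACT-R and Soar usages can be made concrete with a minimal Python sketch. It is not code from either architecture: the class and slot names are hypothetical, and the activation formula is an assumption, namely the base-level learning equation commonly associated with ACT-R, ln(Σ t^(-d)) over the ages t of past uses, with decay d = 0.5.

```python
import math
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class DeclarativeChunk:
    """ACT-R-style chunk: a typed bundle of declarative knowledge."""
    isa: str                      # category the chunk belongs to
    slots: dict                   # additional fields holding the knowledge
    use_times: list = field(default_factory=list)  # when the chunk was used

    def activation(self, now: float, decay: float = 0.5) -> float:
        # Assumed formula: recent and frequent uses both raise activation.
        ages = [now - t for t in self.use_times if now > t]
        if not ages:
            return float("-inf")
        return math.log(sum(age ** -decay for age in ages))

@dataclass
class Production:
    """Soar-style chunk: a condition-action pair (procedural knowledge)."""
    condition: Callable[[dict], bool]
    action: Callable[[dict], dict]

    def fire(self, state: dict) -> dict:
        return self.action(state) if self.condition(state) else state

fact = DeclarativeChunk("addition-fact",
                        {"addend1": 2, "addend2": 2, "sum": 4},
                        use_times=[0.0, 8.0])
stale = DeclarativeChunk("addition-fact",
                         {"addend1": 3, "addend2": 4, "sum": 7},
                         use_times=[0.0])
rule = Production(
    condition=lambda s: s.get("goal") == "add" and "sum" not in s,
    action=lambda s: {**s, "sum": s["addend1"] + s["addend2"]},
)

print(fact.activation(now=10.0) > stale.activation(now=10.0))  # True
print(rule.fire({"goal": "add", "addend1": 2, "addend2": 2}))
```

The same word thus names a passive record that is retrieved in one architecture and an executable rule that fires in the other.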

Other uses of the term “chunk”

Although the dictionary definition refers to a computing-related meaning of chunk as a section of information or data, the term appears to be applied colloquially in computer science rather than formally. Two distinct meanings may be identified as examples: chunks as collections of information sent from one point to another, such as in distributed computing; and chunks as collections of information stored within a file.

A good example of chunks used in distributed computing is the Google File System (Ghemawat et al., 2003). Files are divided into chunks of a fixed size (64 MB) to facilitate their storage and movement between different computers (called chunk servers). Separation into chunks provides redundancy and makes it easier to balance the work done by tens of thousands of computers.
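
Dividing a file into fixed-size chunks can be sketched as follows. Only the 64 MB chunk size comes from the description above; the function names are hypothetical, and the round-robin placement is a deliberate simplification introduced here, not the placement policy of the Google File System.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the fixed size mentioned above

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Cut a byte string into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def assign_round_robin(chunks: list, servers: list) -> dict:
    """Map each chunk index to a server in turn (illustrative policy only)."""
    placement = {server: [] for server in servers}
    for index in range(len(chunks)):
        placement[servers[index % len(servers)]].append(index)
    return placement

# A tiny chunk size keeps the example readable.
pieces = split_into_chunks(b"abcdefghij", chunk_size=4)
print(pieces)
print(assign_round_robin(pieces, ["server-1", "server-2"]))
```

Only the last chunk may be shorter than the fixed size, and concatenating the chunks in order restores the original file.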

The PNG image format (http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html) uses chunks to divide the information contained within a picture into sections. For example, a header chunk holds information on the width/height of the image, the number of colors, and so on; another chunk holds the color palette for the image. The image data may be a single or multiple chunks; each chunk of image data must fit in the working buffer of the image-encoding algorithm. Chunk types are also used for user-defined extensions to the PNG format, holding specialist image information, such as copyright information, text comments, etc.
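
The chunk layout of a PNG file can be illustrated with a short Python sketch that builds a minimal 1×1 grayscale image in memory and then walks through its chunks. It relies on the layout given in the PNG specification (an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC); the helper names `make_chunk` and `iter_chunks` are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk, checking the CRC along the way."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
            raise ValueError("corrupt chunk")
        yield chunk_type, data
        pos += 12 + length

# Header chunk: width, height, bit depth, colour type, then three method bytes.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
# Image data: one scanline made of a filter byte and a single black pixel.
idat = zlib.compress(b"\x00\x00")
png = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

for chunk_type, data in iter_chunks(png):
    print(chunk_type.decode("ascii"), len(data))
```

Because every chunk announces its own length and type, a reader can skip chunk types it does not understand, which is what makes user-defined extensions possible.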

In computational linguistics, chunking (also known as “light parsing” and “shallow parsing”) refers to a technique whereby a sentence is analyzed in terms of its constituents (i.e., nouns, noun groups, verbs, etc.), without specifying their internal structure or their role in the sentence. Finally, in education, chunking refers to an elementary method of division, where successive subtractions are carried out.
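
Both usages lend themselves to a small Python sketch. The first function groups words that have already been tagged into flat noun groups, leaving their internal structure unanalyzed; the tag names (DT, JJ, NN, VB) and the rule that a noun group is any run of determiners, adjectives and nouns are simplifying assumptions. The second function divides by repeated subtraction; subtracting the largest power-of-ten multiple of the divisor at each step is one common classroom variant and is likewise an assumption.

```python
NP_TAGS = {"DT", "JJ", "NN"}  # hypothetical tag set for noun groups

def shallow_parse(tagged):
    """Group maximal runs of noun-group tags; leave other words on their own."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
            continue
        if current:
            chunks.append(("NP", current))
            current = []
        chunks.append((tag, [word]))
    if current:
        chunks.append(("NP", current))
    return chunks

def divide_by_chunking(dividend, divisor):
    """Divide by successive subtraction of multiples of the divisor."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    quotient, remainder, steps = 0, dividend, []
    while remainder >= divisor:
        multiple = 1
        while divisor * multiple * 10 <= remainder:
            multiple *= 10
        remainder -= divisor * multiple   # subtract one "chunk"
        quotient += multiple
        steps.append((divisor * multiple, remainder))
    return quotient, remainder, steps

sentence = [("the", "DT"), ("old", "JJ"), ("dog", "NN"),
            ("chased", "VB"), ("a", "DT"), ("cat", "NN")]
print(shallow_parse(sentence))
print(divide_by_chunking(132, 4))
```

Apart from the shared label, the two procedures have nothing in common, which underlines how loosely the term is applied outside psychology.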

Conclusions

As described in this article, the terms “chunk” and “chunking” have multiple meanings. Sometimes, the distinction is obvious, for example when memory chunks and action chunks are mentioned. At other times, the distinction is unclear, most notably when deliberate chunking and automatic chunking are mentioned with respect to memory. Finally, there are instances where mentioning diverse kinds of chunks as if they were referring to the same structures or mechanisms makes little sense, as is the case when ACT-R, Soar and deliberate chunking are mentioned together without further qualification.

The different meanings we have discussed in this article raise the question of whether different terms should be used. While polysemy (the use of the same term with different meanings) is common in everyday language and science, it is far from an ideal state of affairs. Synonymy (the use of different terms with the same meaning), while also common in science, is likewise problematic. Ensuring that the term “chunk” has a single meaning, at least in the context of cognitive psychology and cognitive science, would be particularly important for architects of computational chunking models, who require unambiguous definitions of concepts to facilitate the development process (Gobet et al., 2015). In an ideal world, this could be achieved by constructing an ontology. Not only would this allow for precise system specifications, but it would also allow architects from diverse academic backgrounds such as computer science, psychology and biology to communicate without ambiguity. Perhaps even more importantly, given that software development can be open-source and hence open to development by architects of different nationalities who may not share a common tongue, a precise meaning of “chunk” and other associated terms would facilitate effective system development and reduce potential friction between computational modelers.

Unfortunately, policing linguistic use is difficult, if possible at all, in science, where ideas, theories and methods are in constant flux. In fact, attempts to unify the definitions of basic concepts in psychology (e.g., De Groot, 1990) have met with little success. Thus, ambiguity may be a price to pay for the evolutionary nature of science, and this paper has limited itself to providing a taxonomy of the meanings of the terms “chunk” and “chunking.” This being said, before any understanding can be reached, it is important to first define the objects of research. In this respect, carefully specifying the intended meaning of terms as central as “chunk” and “chunking” is desirable for making progress in our understanding of human cognition.

Author contributions

FG wrote the first draft, and ML and PL contributed to the following drafts.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1. An additional source of confusion in the literature is that the term “chunk” is used to refer both to an element in STM and an element in LTM.

References

  1. Abrahamse E. L., Ruitenberg M. F. L., De Kleine E., Verwey W. B. (2013). Control of automated behaviour: insights from the discrete sequence production task. Front. Hum. Neurosci. 7:82. doi: 10.3389/fnhum.2013.00082
  2. Agam Y., Galperin H., Gold B. J., Sekuler R. (2007). Learning to imitate novel motion sequences. J. Vis. 7, 1–17. doi: 10.1167/7.5.1
  3. Anderson J. R., Bothell D., Byrne M. D., Douglass S., Lebiere C., Qin Y. L. (2004). An integrated theory of the mind. Psychol. Rev. 111, 1036–1060. doi: 10.1037/0033-295X.111.4.1036
  4. Anderson J. R., Matessa M., Lebiere C. (1997). ACT-R: a theory of higher level cognition and its relation to visual attention. Hum. Comput. Interact. 12, 439–462. doi: 10.1207/s15327051hci1204_5
  5. Berry D. C. (Ed.). (1997). How Implicit is Implicit Learning? Oxford, UK: Oxford University Press.
  6. Cermak L. S. (1975). Improving Your Memory. New York, NY: Norton.
  7. Chase W. G., Simon H. A. (1973). Perception in chess. Cogn. Psychol. 4, 55–81. doi: 10.1016/0010-0285(73)90004-2
  8. Cohen N. R., Sekuler R. (2010). Chunking and compound cueing of movement sequences: learning, retention and transfer. Percept. Motor Skills 110, 736–750. doi: 10.2466/pms.110.3.736-750
  9. De Groot A. D. (1990). Unifying psychology: its preconditions, in Recent Trends in Theoretical Psychology, Vol. 2, eds Baker W. J., Hyland M. E., van Hezewijk R., Terwee S. (New York, NY: Springer Verlag), 1–25.
  10. Diedrichsen J., Kornysheva K. (2015). Motor skill learning between selection and execution. Trends Cogn. Sci. 19, 227–233. doi: 10.1016/j.tics.2015.02.003
  11. Ericsson K. A., Chase W. G., Faloon S. (1980). Acquisition of a memory skill. Science 208, 1181–1182. doi: 10.1126/science.7375930
  12. Feigenbaum E. A., Simon H. A. (1984). EPAM-like models of recognition and learning. Cogn. Sci. 8, 305–336. doi: 10.1207/s15516709cog0804_1
  13. Fontana F. E., Mazzardo O., Furtado O., Gallagher J. D. (2009). Whole and part practice: a meta-analysis. Percept. Motor Skills 109, 517–530. doi: 10.2466/pms.109.2.517-530
  14. Freudenthal D., Pine J. M., Aguado-Orea J., Gobet F. (2007). Modelling the developmental patterning of finiteness marking in English, Dutch, German and Spanish using MOSAIC. Cogn. Sci. 31, 311–341. doi: 10.1080/15326900701221454
  15. Ghemawat S., Gobioff H., Leung S. T. (2003). The Google file system. ACM SIGOPS Oper. Syst. Rev. 37, 29–43. doi: 10.1145/1165389.945450
  16. Gilchrist A. L. (2015). How should we measure chunks? A continuing issue in chunking research and a way forward. Front. Psychol. 6:1456. doi: 10.3389/fpsyg.2015.01456
  17. Gobet F., Lane P. C. R. (2005). The CHREST architecture of cognition: listening to empirical data, in Visions of Mind, ed Davis D. (Hershey, PA: IPS), 204–224.
  18. Gobet F., Lane P. C. R., Croker S., Cheng P. C. H., Jones G., Oliver I., et al. (2001). Chunking mechanisms in human learning. Trends Cogn. Sci. 5, 236–243. doi: 10.1016/S1364-6613(00)01662-4
  19. Gobet F., Lane P. C. R., Lloyd-Kelly M. (2015). Chunks, schemata and retrieval structures: past and current computational models. Front. Psychol. 6:1785. doi: 10.3389/fpsyg.2015.01785
  20. Gobet F., Simon H. A. (1996). Templates in chess memory: a mechanism for recalling several boards. Cogn. Psychol. 31, 1–40. doi: 10.1006/cogp.1996.0011
  21. Gobet F. (2015). Understanding Expertise: A Multidisciplinary Approach. London: Palgrave.
  22. Gobet F. (2016). Entrenchment, Gestalt formation and chunking, in Entrenchment, Memory and Automaticity: The Psychology of Linguistic Knowledge and Language Learning, ed Schmid H. J. (Berlin: Gruyter Mouton).
  23. Guida A., Gobet F., Tardieu H., Nicolas S. (2012). How chunks, long-term working memory and templates offer a cognitive explanation for neuroimaging data on expertise acquisition: a two-stage framework. Brain Cogn. 79, 221–244. doi: 10.1016/j.bandc.2012.01.010
  24. Jones G., Gobet F., Pine J. M. (2007). Linking working memory and long-term memory: a computational model of the learning of new words. Dev. Sci. 10, 853–873. doi: 10.1111/j.1467-7687.2007.00638.x
  25. Koffka K. (1935). The Principles of Gestalt Psychology. New York, NY: Harcourt, Brace and World.
  26. Laird J. E., Newell A., Rosenbloom P. S. (1987). SOAR: an architecture for general intelligence. Artif. Intell. 33, 1–64. doi: 10.1016/0004-3702(87)90050-6
  27. Lane P. C. R., Cheng P. C. H., Gobet F. (2001). Learning perceptual chunks for problem decomposition, in Proceedings of the 23rd Meeting of the Cognitive Science Society (Mahwah, NJ: Erlbaum), 528–533.
  28. Lloyd-Kelly M., Gobet F., Lane P. C. R. (2015). A question of balance: the benefits of pattern-recognition when solving problems in a complex domain, in Transactions on Computational Collective Intelligence XX, eds Nguyen N. T., Kowalczyk R., Duval B., van den Herik J., Loiseau S., Filipe J. (Berlin: Springer), 259–293.
  29. Miller G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev. 63, 81–97. doi: 10.1037/h0043158
  30. Newell A., Simon H. A. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
  31. Newell A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
  32. Obaidellah U., Cheng P. (2015). The role of chunking in drawing Rey complex figure. Percept. Motor Skills 120, 535–555. doi: 10.2466/24.PMS.120v17x6
  33. Rhodes B. J., Bullock D., Verwey W. B., Averbeck B. B., Page M. P. A. (2004). Learning and production of movement sequences: behavioral, neurophysiological, and modeling perspectives. Hum. Mov. Sci. 699–746. doi: 10.1016/j.humov.2004.10.008
  34. Richman H. B., Staszewski J. J., Simon H. A. (1995). Simulation of expert memory with EPAM IV. Psychol. Rev. 102, 305–330. doi: 10.1037/0033-295X.102.2.305
  35. Segawa J. A., Tourville J. A., Beal D. S., Guenther F. H. (2015). The neural correlates of speech motor sequence learning. J. Cogn. Neurosci. 27, 819–831. doi: 10.1162/jocn_a_00737
  36. Servan-Schreiber E., Anderson J. (1990). Learning artificial grammars with competitive chunking. J. Exp. Psychol. 16, 592–608. doi: 10.1037/0278-7393.16.4.592
  37. Shea C. H., Wright D. L. (2012). The representation, production, and transfer of simple and complex movement sequences, in Skill Acquisition in Sport: Research, Theory and Practice, eds Hodges N. J., Williams A. M. (London: Routledge), 131–149.
  38. van Vugt F. T., Jabusch H.-C., Altenmüller E. (2012). Fingers phrase music differently: trial-to-trial variability in piano scale playing and auditory perception reveal motor chunking. Front. Psychol. 3:495. doi: 10.3389/fpsyg.2012.00495
  39. Verwey W. B., Abrahamse E. L. (2012). Distinct modes of executing movement sequences: reacting, associating, and chunking. Acta Psychol. 140, 274–282. doi: 10.1016/j.actpsy.2012.05.007
  40. Verwey W. B., Shea C. H., Wright D. L. (2015). A cognitive framework for explaining serial processing and sequence execution strategies. Psychon. Bull. Rev. 22, 54–77. doi: 10.3758/s13423-014-0773-4
  41. Yamaguchi M., Logan G. D. (2014). Pushing typists back on the learning curve: revealing chunking in skilled typewriting. J. Exp. Psychol. 40, 592–612. doi: 10.1037/a0033809

Articles from Frontiers in Psychology are provided here courtesy of Frontiers Media SA
