Abstract
Vocal communication systems in humans and other animals experience selection for efficiency—optimizing the benefits they convey relative to the costs of producing them. Two hallmarks of efficiency, Menzerath’s law and Zipf’s law of abbreviation, predict that longer sequences will consist of shorter elements and more frequent elements will be shorter, respectively. Here, we assessed the evidence for both laws in cetaceans by analyzing vocal sequences from 16 baleen and toothed whale species and comparing them to 51 human languages. Eleven whale species exhibit Menzerath’s law, sometimes with greater effect sizes than human speech. Two of the five whale species with categorized element types exhibit Zipf’s law of abbreviation. On average, whales also tend to shorten elements and intervals toward the end of sequences, although this varies by species. Overall, the results of this study suggest that the vocalizations of many cetacean species have undergone compression for increased efficiency in time.
Whale vocalizations follow efficiency rules seen in human language, revealing striking similarities in communication systems.
INTRODUCTION
Vocal communication is essential to survival and reproduction in many species, as it enables individuals to convey critical information related to predation, resource access, courtship, and social relationships (1). More complex signals, which vary across multiple dimensions, can encode greater amounts of information (2), and redundancy increases the likelihood of successful transmission between signalers and receivers (3). However, elaborate and sustained vocalizations carry considerable costs, including heightened predation risk (4) and increased energetic demands, sometimes up to two to eight times the resting metabolic rate in certain species (5). Consequently, vocal communication systems experience selection for efficiency (6)—optimizing the benefits they convey relative to the costs of producing them (7, 8)—a concept closely related to the “principle of least effort” in linguistics (9).
One of the simplest ways to increase efficiency is by reducing vocalization time (4). Individuals who convey the same information in less time incur lower metabolic costs (10) and are less likely to be detected by predators and potential prey (4). Vocalization time can evolve in response to factors that alter the relative costs and benefits of communication, like group size (11), as well as physical features that affect vocal production (12). Within species, vocalization time may also change over generations through cultural evolution (i.e., via social learning) (13), and within individuals during ontogeny (14) or as a flexible response to anthropogenic noise (15–18), in a way that optimizes efficiency.
In human language, efficiency is often quantified through two linguistic laws that directly relate to vocalization time: Menzerath’s law and Zipf’s law of abbreviation. Imagine a set of sequences (e.g., sentences, words, and songs), each composed of multiple elements (e.g., words, phonemes, and notes). Menzerath’s law predicts that longer sequences (e.g., songs and words) will be composed of shorter elements (e.g., notes and phonemes) (19). In other words, when production costs increase in one domain (e.g., sequence length), they decrease in another (e.g., element duration). Zipf’s law of abbreviation predicts that more frequently used elements (e.g., notes, phonemes, and words) will be shorter in duration (9). Both laws result in an overall reduction in vocalization time, and mathematical modeling work indicates that they emerge from pressure for more efficient communication (20–22).
Outside of human language, Menzerath’s law and Zipf’s law of abbreviation have been observed in an increasing number of species, including gibbons (23), African penguins (24), and house finches (25). Comparative studies assessing both Menzerath’s law and Zipf’s law of abbreviation within the same species, however, reveal an interesting discrepancy: The former is always found (23–34), whereas the latter only appears in around half of cases (23, 24, 31–34). As others have noted, this discrepancy may stem from the laws reflecting different mechanisms or constraints (29, 35).
One hypothesis for this pattern is that Menzerath’s law has primarily physical origins, driven by natural selection for a more efficient vocal apparatus. Menzerath’s law in humans appears to be stronger in spoken than in written language (22, 36), deafened canaries and zebra finches produce songs consistent with the law without hearing adult models (37), and African penguins display the law without engaging in vocal learning (24). In contrast, Zipf’s law of abbreviation may result from a more complex combination of factors (7). Physical efficiency appears to be important, as common words are shorter and have more easily articulated phoneme sequences (38), but predictability and informativeness may also play a role. Experiments with artificial languages show that Zipfian abbreviation emerges when participants are under pressure to be both informative and fast (39), speakers shorten words when their meaning is predictable from context (40), and information content may predict word length more than frequency in some conditions (41). The two laws, then, may have very different prerequisites. Menzerath’s law might arise wherever vocalizations occur in sequences, regardless of whether learning is involved. In contrast, Zipf’s law of abbreviation may require that elements form distinct categories that vary in predictability and convey meaningful information.
Communicative efficiency is relatively understudied in cetaceans. To our knowledge, Menzerath’s law has only been observed in bottlenose dolphins (31, 33), and Zipf’s law of abbreviation has only been observed in humpback whales and bottlenose dolphins (31, 33, 42). Given cetaceans’ extensive reliance on learned vocalizations for complex social behavior—from courtship in baleen whales to individual recognition and coordination in toothed whales (43)—they offer a valuable research model for efficiency in nonhuman communication. Additionally, the breadth of data on cetacean vocalizations makes it possible to conduct a meta-analysis, assessing the prevalence and strength of Menzerath’s law and Zipf’s law of abbreviation in a wide range of species using previously published datasets. To date, comprehensive meta-analyses of these two laws in vocal communication have only been done for human speech, where both appear to be statistical universals (22, 35), and birdsong, where Menzerath’s law is widespread (37) but Zipf’s law of abbreviation is quite rare (44). The aims of this meta-analysis were to (i) determine the prevalence of Menzerath’s law and Zipf’s law of abbreviation in cetaceans, and to (ii) directly compare the strength of the laws in cetaceans with spoken human language data—in other words, assess whether vocal efficiency in cetaceans is “language-like.”
In studies of Menzerath’s law in vocal communication, duration is typically measured in one of two ways: (i) from the start to the end of a sound, or (ii) from the start of one sound to the start of the next. The first method, which captures only the vocalization time and excludes pauses, is widely used for animal communication (20, 23, 26, 32, 34, 37). We refer to this as the element duration—the difference between a sound’s start and end time. The second method measures the vocalization time including the pause before the next sound. This approach has been used for marmosets (45) and bottlenose dolphins (31) and is standard for human speech (22, 36, 46), which is fairly continuous. Large spoken language corpora, such as Glissando (36), Buckeye (22), and DoReCo (46), include the small gaps between phonemes in their duration measurements. More broadly, this measure is the “go-to” for studies of rhythm in humans and animals (47). Following the rhythm literature, we refer to this as the inter-onset interval—the difference between the start of one sound and the start of the next. A couple of studies in treefrogs (27) and geladas (20) have assessed Menzerath’s law using only the pauses between sounds, to supplement analyses of element durations, but this approach is rare and will not be used in this study.
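The two duration measures defined above can be made concrete with a small sketch. It assumes each sound is annotated as a hypothetical `(start, end)` pair in seconds; the function names and toy timings are illustrative and do not come from any of the datasets.

```python
def element_durations(sounds):
    """Element duration: end time minus start time of each sound (pauses excluded)."""
    return [end - start for start, end in sounds]


def inter_onset_intervals(sounds):
    """Inter-onset interval: start of one sound to the start of the next.

    A sequence of n sounds yields n - 1 intervals, because the final sound
    has no following onset.
    """
    onsets = [start for start, _ in sounds]
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


# Toy sequence of three sounds (start, end), in seconds.
sounds = [(0.0, 0.5), (0.75, 1.0), (1.5, 2.0)]

eds = element_durations(sounds)       # sound only
iois = inter_onset_intervals(sounds)  # sound plus the pause that follows it
```

In this toy sequence the second sound is shorter than the others, yet both inter-onset intervals are equal, which shows how the two measures can diverge when pauses vary.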
In cetacean vocalizations, element durations are typically used when sequences consist of distinct notes, calls, or elements, with information thought to be encoded in acoustic features like frequency, bandwidth, and timbre (analogous to birdsong, second and third rows of Fig. 1). In contrast, inter-onset intervals are used when sequences are made up of uniform clicks or pulses, where the rhythmic timing is thought to encode information (analogous to human drumming, fourth and fifth rows of Fig. 1). It is worth noting that the latter case is quite different from human language (first row of Fig. 1), where inter-onset intervals are used because gaps between phonemes are either absent or minimal. However, regardless of the measurement used—element durations or inter-onset intervals—Menzerath’s law reflects the same underlying principle: “the greater the whole the smaller its parts” (19, 48). In other words, when longer sequences are made up of smaller components, the total vocalization time is reduced. A recent study in marmosets illustrates this concept: When individuals were rewarded for producing an increasing number of vocalizations, they maximized their vocal efficiency by reducing both the element durations and inter-onset intervals of their call sequences (45). Here, the distributions of element durations and inter-onset intervals in whale vocal sequences exhibit the same shape (fig. S1), and Menzerath’s law is only slightly different when computed from intervals in both whales (see Results) and humans (see Supplementary Text).
Fig. 1. Spectrograms of an English sentence (first row), humpback whale song (second row), killer whale call sequence (third row), Commerson’s dolphin burst pulse (fourth row), and sperm whale codas (fifth row) that were included in this study.
These recordings include all vocalization types included in Table 1. The levels of hierarchy in each vocalization are labeled with text and white bars. The element durations (in humpback whales and killer whales) span from the beginning to the end of a sound, whereas the inter-onset intervals (in Commerson’s dolphins and sperm whales) span from the beginning of a sound to the beginning of the next sound. Given the fairly continuous nature of human speech, durations of phonemes and words in the DoReCo corpus are measured from the beginning of a sound to the beginning of the next sound, identically to inter-onset intervals (see Materials and Methods for details).
RESULTS
In total, this analysis includes 610,219 elements and intervals from 65,511 sequences, 24 studies, and 16 species (see Table 1). All datasets were suitable for assessing Menzerath’s law. In contrast, Zipf’s law of abbreviation makes predictions about types of elements, and thus requires elements to be categorized into types (29). Only eight datasets in five species were suitable for assessing Zipf’s law of abbreviation (see Table 1), all of which measured element durations rather than inter-onset intervals.
Table 1. The datasets included in this analysis, with whether they are open access, the vocalization category, and whether the sequences are composed of element durations or inter-onset intervals.
All datasets were appropriate for assessing Menzerath’s law, and the subset that was also appropriate for Zipf’s law of abbreviation (ZLA) is denoted in the final column.
| Group | Species | Dataset | Open | Vocalization | Type | ZLA |
|---|---|---|---|---|---|---|
| Baleen whale | Blue whale | (55) | Yes | Songs | Elements | Yes |
| Baleen whale | Bowhead whale | (68) | No | Songs | Elements | Yes |
| Baleen whale | Common minke whale | (87) | Yes | Call sequences | Intervals | No |
| Baleen whale | Fin whale | (82) | Yes | Songs | Intervals | No |
| Baleen whale | Fin whale | (83) | Yes | Songs | Intervals | No |
| Baleen whale | Fin whale | (84) | Yes | Songs | Intervals | No |
| Baleen whale | Humpback whale | (85) | Yes | Songs | Elements | Yes |
| Baleen whale | Humpback whale | (86) | Yes | Songs | Elements | Yes |
| Baleen whale | Humpback whale | (67) | Yes | Phrases | Elements | Yes |
| Baleen whale | North Pacific right whale | (51) | No | Songs | Elements | No |
| Baleen whale | Sei whale | (88) | No | Call sequences | Elements | No |
| Baleen whale | Sei whale | (89) | Yes | Call sequences | Elements | Yes |
| Toothed whale | Bottlenose dolphin | (31) | No | Burst pulses | Intervals | No |
| Toothed whale | Commerson’s dolphin | (57) | No | Burst pulses | Intervals | No |
| Toothed whale | Heaviside’s dolphin | (58) | No | Burst pulses | Intervals | No |
| Toothed whale | Hector’s dolphin | (56) | No | Burst pulses | Intervals | No |
| Toothed whale | Killer whale | (90) | Yes | Call sequences | Elements | Yes |
| Toothed whale | Killer whale | (96) | No | Calls | Elements | Yes |
| Toothed whale | Narrow-ridged finless porpoise | (91) | No | Burst pulses | Intervals | No |
| Toothed whale | Peale’s dolphin | (92) | No | Burst pulses | Intervals | No |
| Toothed whale | Risso’s dolphin | (93) | Yes | Burst pulses | Intervals | No |
| Toothed whale | Sperm whale | (81) | Yes | Codas | Intervals | No |
| Toothed whale | Sperm whale | (94) | Yes | Codas | Intervals | No |
| Toothed whale | Sperm whale | (95) | Yes | Codas | Intervals | No |
As a comparison with the whale data, we also analyzed spoken language data from DoReCo—a corpus of ~500,000 annotated words (with phonemes) from 51 languages that focuses on small and endangered languages [(49); see “DoReCo references” in the Supplementary Materials] and has been used in previous studies of Menzerath’s law and Zipf’s law of abbreviation (35).
Menzerath’s law
The main model used to test Menzerath’s law was a linear model with the log-transformed element duration or inter-onset interval as the outcome variable, the log-transformed sequence length (i.e., number of element durations or inter-onset intervals in the sequence) as a fixed effect, and sequence ID as a varying intercept to account for the repeated measurements of durations within sequences. This model is directly derived from the Menzerath-Altmann law—a precise and more robust mathematical form of Menzerath’s law (48, 50) (see Materials and Methods for details). Some species had multiple datasets, in which case the study ID was included as a second varying intercept. Here is the main model in Wilkinson notation—standard R model syntax.
$$\log(\text{duration}) \sim \log(\text{length}) + (1 \mid \text{sequence}) \tag{1}$$
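To illustrate why a log-log model captures Menzerath’s law, the sketch below estimates only the fixed-effect slope of log duration on log sequence length, using ordinary least squares. It is a deliberate simplification: it assumes the power-law form of the Menzerath-Altmann law (duration = a × length^b) and omits the varying intercepts for sequence and study described above. The function name and toy data are hypothetical.

```python
import math


def loglog_slope(lengths, durations):
    """Ordinary least-squares slope of log(duration) on log(length).

    Simplification: no varying intercepts, so this is not the mixed model
    described in the text, only its fixed-effect core.
    """
    xs = [math.log(v) for v in lengths]
    ys = [math.log(v) for v in durations]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy data generated from duration = 2.0 * length ** -0.3 (assumed values).
lengths = [2, 4, 8, 16]
durations = [2.0 * n ** -0.3 for n in lengths]

slope = loglog_slope(lengths, durations)  # recovers the exponent b
```

A negative slope, as in this toy case, corresponds to shorter elements in longer sequences, which is the pattern Menzerath’s law predicts.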
The strength of Menzerath’s law in baleen and toothed whale species, computed using Eq. 1, can be seen in Figs. 2 and 3, respectively (see figs. S5 and S6 for the same plots with transformed axes to match the statistical model). In all baleen whale species, except for the North Pacific right whale, there is a negative relationship between sequence length and element durations or inter-onset intervals consistent with Menzerath’s law. The results are more mixed for the toothed whale species, where only five of the nine exhibit Menzerath’s law. All three dolphins in the Cephalorhynchus genus, as well as killer whales, display a neutral or positive relationship between sequence length and element durations or inter-onset intervals.
Fig. 2. The baleen whale (Mysticete) species included in the study (left), alongside the distribution of element durations or inter-onset intervals and sequence lengths (middle) and the slope of Menzerath’s law (right).
Each point in the distribution plots (middle) marks the mean element duration (ED) or inter-onset interval (IOI), but the slopes on the right were computed from the full set of elements/intervals. The bars in the slope plots (right) mark the 95% CIs around the point estimates.
Fig. 3. The toothed whale (Odontocete) species included in the study (left), alongside the distribution of element durations or inter-onset intervals and sequence lengths (middle) and the slope of Menzerath’s law (right).
Each point in the distribution plots (middle) marks the mean element duration or inter-onset interval, but the slopes on the right were computed from the full set of elements/intervals. The bars in the slope plots (right) mark the 95% CIs around the point estimates.
The North Pacific right whales have four distinct clusters of sequences in Fig. 2, which directly correspond to the four song types identified by Crance et al. (51). The strong positive relationship between sequence length and element duration appears to be driven by the distribution of these clusters. Menzerath’s law makes no predictions about different categories of sequences, but it is worth noting that when Eq. 1 is computed separately on each song type the results vary {GS1-PF estimate: −0.11, 95% confidence interval (CI): [−0.17, −0.05]; GS4-DG estimate: 0.01, 95% CI: [−0.03, 0.04]; GS3-PU estimate: −0.03, 95% CI: [−0.05, 0]; GS2-TP estimate: 0.06, 95% CI: [0.04, 0.08]}. Note that the GS* abbreviations are the North Pacific right whale song types, as named by Crance et al. (51).
For humpback and killer whales, we also assessed Menzerath’s law using data from a higher level of analysis (i.e., one step up in the structural hierarchy). In humpback whales, the length of songs negatively predicted the duration of phrases (estimate = −0.25, 95% CI: [−0.333, −0.167], using Eq. 1), similar to the pattern for notes within phrases. In killer whales, the length of call sequences negatively predicted the duration of calls (estimate = −0.043, 95% CI: [−0.082, −0.004], Eq. 1), although the situation is reversed for elements within calls (Fig. 3).
Additionally, we fit a second model that included the position of each element or inter-onset interval in the sequence as a fixed effect, following previous studies of Menzerath’s law in nonhuman animals (20, 23, 29, 33, 37, 52). Position was normalized between 0 and 1 as a function of the position of the element or interval and the length of the sequence (37). The purpose of this model was to assess whether Menzerath’s law is driven by a shortening of elements or intervals over the course of the sequence, or a tendency to begin long sequences with shorter elements or intervals.
$$\log(\text{duration}) \sim \log(\text{length}) + \text{position} + (1 \mid \text{sequence}) \tag{2}$$
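One way to normalize position between 0 and 1 is min-max scaling over the positions in a sequence. The sketch below is an assumption for illustration only: the exact function used in the study is not reproduced here, and the name `normalized_positions` is hypothetical.

```python
def normalized_positions(sequence_length):
    """Map positions 1..n onto [0, 1] with min-max scaling: (i - 1) / (n - 1).

    Assumed form, not necessarily the function used in the study.
    Sequences need at least two elements for the scaling to be defined.
    """
    if sequence_length < 2:
        raise ValueError("position scaling needs at least two elements")
    return [(i - 1) / (sequence_length - 1) for i in range(1, sequence_length + 1)]


positions = normalized_positions(5)  # first element maps to 0.0, last to 1.0
```

Scaling this way makes the position effect comparable across sequences of different lengths, because the first and last elements always map to 0 and 1.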
Figure 4 shows a direct comparison between the strength of Menzerath’s law in the whale data and the spoken human language data (i.e., phonemes within words) from the DoReCo corpus [(49); see “DoReCo references” in the Supplementary Materials], alongside the influence of the position of elements or inter-onset intervals on their duration computed using Eqs. 1 and 2. The same results for words within sentences can be seen in fig. S2. The 11 whale species that adhere to Menzerath’s law express it to at least a similar extent as the human languages, and sometimes to a much greater extent (e.g., humpback whales). The effect of the position of elements and intervals on their duration is much more variable. Human languages tend to have a positive relationship between position and inter-onset intervals, which means that intervals are lengthened as sequences progress. Whales, on the other hand, appear to shorten elements and intervals over the course of sequences (see Table 2), but this varies dramatically across species.
Fig. 4. The 95% CIs for the effect of sequence length (top; computed from Eq. 1) and position (bottom; computed from Eq. 2) on element duration and inter-onset intervals for the 16 whale species and 51 human languages.
The human language data are composed of phonemes within words. The colors correspond to the taxonomic group and whether the data are element durations or inter-onset intervals.
Table 2. The estimated effect of each predictor and interaction (indented and marked with “:”) on element durations and inter-onset intervals in sequences.
Length is the sequence length (in number of elements or intervals), position is the normalized position of each element or interval in the sequence, group is whether the species is a baleen (0) or toothed (1) whale, and type is whether the data are composed of element durations (0) or inter-onset intervals (1). Values of 2.5% and 97.5% denote the lower and upper bounds of the 95% CIs. Asterisks mark 95% CIs that do not overlap zero, interpreted here as evidence for a strong effect.
| Predictor | Effect | 2.5% | 97.5% | |
|---|---|---|---|---|
| Length | −0.341 | −0.363 | −0.318 | * |
| : Group | −0.005 | −0.028 | 0.019 | |
| : Type | 0.090 | 0.058 | 0.122 | * |
| Position | −0.067 | −0.073 | −0.061 | * |
| : Group | −0.036 | −0.040 | −0.032 | * |
| : Type | 0.062 | 0.055 | 0.069 | * |
There are several exceptions to Menzerath’s law in the human language data. Arapaho exhibits a positive effect of word length on the inter-onset intervals of phonemes (Fig. 4), and Tabasaran, Sanzhi Dargwa, Pnar, English (recorded in southern England), Yongning Na, and Cabécar show no effect of sentence length on the inter-onset intervals of words (fig. S2). These exceptions come from a wide variety of language families (e.g., Algic, Nakh-Daghestanian, Austroasiatic, Indo-European, Sino-Tibetan, and Chibchan) from North America, Europe, and Asia.
Finally, we assessed broader cross-species trends in Menzerath’s law with expanded forms of Eqs. 1 and 2 applied to all species at once—Eqs. 3 and 4 below. Interactions between length and position and the following two features were added: (i) the group the species comes from, to determine whether the effect varies between Mysticetes and Odontocetes, and (ii) the type of vocalization, to determine whether the effect is stronger for element durations or inter-onset intervals. Group and type were not added as separate fixed effects (outside of the interactions) because the z scoring of duration within species removes species differences (see Materials and Methods). Sequence and study were included as varying intercepts. The effect of sequence length on elements and intervals does not have significant phylogenetic signal ( = 0.32; = 0.45), computed using the method of Ives et al. (53) as implemented in the phytools package (2.1.1) in R (v4.3.1) (54), so phylogeny was not included in the modeling.
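$$\log(\text{duration}) \sim \log(\text{length}) + \log(\text{length}){:}\text{group} + \log(\text{length}){:}\text{type} + (1 \mid \text{sequence}) + (1 \mid \text{study}) \tag{3}$$

$$\log(\text{duration}) \sim \log(\text{length}) + \log(\text{length}){:}\text{group} + \log(\text{length}){:}\text{type} + \text{position} + \text{position}{:}\text{group} + \text{position}{:}\text{type} + (1 \mid \text{sequence}) + (1 \mid \text{study}) \tag{4}$$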
Of the two models used to assess cross-species trends, the one that included both length and position best fit the data (Eq. 4; = 1847). The results of this model can be seen in Table 2. Overall, there is a strong negative effect of sequence length on element durations and inter-onset intervals, which is consistent with Menzerath’s law. The interaction between this effect and data type is positive, suggesting that Menzerath’s law is slightly weaker when data are composed of inter-onset intervals rather than elements. Additionally, there is a negative effect of position on element durations and inter-onset intervals, indicating that elements and intervals tend to shorten as sequences progress. The interactions between position, group, and type suggest two things: toothed whales (Odontocetes) shorten later elements and intervals to a greater extent, and elements tend to get shortened more than intervals over the course of sequences. These interactions are strong enough to neutralize the effect of position in some conditions. For example, the overall effect of position on duration in a baleen whale species (Mysticete, group = 0) with interval data (type = 1) would only be −0.005 (95% CI: [−0.018, 0.008]).
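The worked example above can be reproduced by adding the main effect to the interaction terms that apply for a given group and data type. The sketch below uses the point estimates from Table 2; the dictionary and function names are hypothetical. Only point estimates combine this way: the 95% CI of a combined effect cannot be obtained by adding the tabulated bounds, so it is not computed here.

```python
# Point estimates copied from Table 2.
TABLE_2 = {
    "length": -0.341,
    "length:group": -0.005,
    "length:type": 0.090,
    "position": -0.067,
    "position:group": -0.036,
    "position:type": 0.062,
}


def combined_effect(predictor, group, data_type):
    """Main effect plus the interactions switched on by group and data type.

    group: 0 = baleen whale, 1 = toothed whale
    data_type: 0 = element durations, 1 = inter-onset intervals
    """
    return (
        TABLE_2[predictor]
        + TABLE_2[f"{predictor}:group"] * group
        + TABLE_2[f"{predictor}:type"] * data_type
    )


# Baleen whale with interval data: -0.067 + 0.062, close to zero.
baleen_intervals = combined_effect("position", group=0, data_type=1)

# Toothed whale with element data: -0.067 - 0.036, the strongest shortening.
toothed_elements = combined_effect("position", group=1, data_type=0)
```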
Zipf’s law of abbreviation
Unlike Menzerath’s law, Zipf’s law of abbreviation is a qualitative law that simply predicts that common types of elements will have shorter duration than rare ones (22). To assess Zipf’s law of abbreviation, we followed previous studies in using a lognormal model with duration as the outcome variable, count as a fixed effect, and the type of element as a varying intercept to account for the repeated measurements of durations within each type (25). Note that only duration is log-transformed in this model, but the results are qualitatively the same if count is also log-transformed. Some species had multiple datasets, in which case the study ID was included as a varying intercept.
$$\log(\text{duration}) \sim \text{count} + (1 \mid \text{type}) \tag{5}$$
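The quantities in this model can be sketched as follows: count how often each element type occurs, then regress log duration on that count. This is a simplified illustration using ordinary least squares, without the varying intercepts for element type or study. The function name, type labels, and durations are hypothetical toy values.

```python
import math
from collections import Counter


def zla_slope(elements):
    """OLS slope of log(duration) on the count of each element's type.

    elements: list of (type_label, duration) pairs.
    Simplification: omits the varying intercepts used in the text.
    """
    counts = Counter(label for label, _ in elements)
    xs = [counts[label] for label, _ in elements]
    ys = [math.log(duration) for _, duration in elements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy data: the common type "A" is short, the rare type "B" is long.
elements = [("A", 1.0), ("A", 1.0), ("A", 1.0), ("B", 2.0)]

slope = zla_slope(elements)  # negative: common types are shorter
```

A negative slope is the direction predicted by Zipf’s law of abbreviation; a slope near zero or above it would indicate no evidence for the law in that dataset.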
The strength of Zipf’s law of abbreviation in the five whale species considered, computed using Eq. 5, can be found in Fig. 5 (see fig. S8 for the same plot with transformed axes to match the statistical model). The negative relationship between element duration and count is only found in blue whales and humpback whales. Blue whales from the northeast Pacific population analyzed in this study only use two call types in sequences (A and B calls) (55), so we confirmed Zipf’s law of abbreviation in that species using a simpler lognormal model with duration as the outcome variable and whether the element is of the more common type (B calls) as a fixed effect (binary: 1/0). An element coming from the more common type negatively predicts duration (estimate = −0.208, 95% CI: [−0.284, −0.132]), supporting the result shown in Fig. 5.
Fig. 5. The whale species included in the study (left), alongside the distribution of element durations and counts (middle) and the slope of Zipf’s law of abbreviation (right).
Each point in the distribution plots (middle) marks the mean duration of elements, but the slopes on the right were computed from the full set of elements. The bars in the slope plots (right) mark the 95% CIs around the point estimates.
For humpback whales, we also assessed Zipf’s law of abbreviation using data from a higher level of analysis (i.e., one step up in the structural hierarchy). Common phrases tend to be shorter in duration (estimate = −0.086, 95% CI: [−0.124, −0.049], using Eq. 5), similar to the pattern for notes within phrases.
Figure 6 shows a direct comparison between the strength of Zipf’s law of abbreviation in the whale data and the spoken human language data (i.e., phonemes) from the DoReCo corpus [(49); see “DoReCo references” in the Supplementary Materials], computed using Eq. 5. The same results for words can be seen in fig. S7. Only humpback whales exhibit Zipf’s law of abbreviation to a similar extent as the human languages, while blue whales are much closer to neutrality.
Fig. 6. The 95% CIs for the effect of count on element duration for the five whale species and 51 human languages.
The human language data are composed of phonemes. The colors correspond to the taxonomic group and whether the data are element durations or inter-onset intervals. The effect of count on element duration was computed from Eq. 5.
We assessed cross-species trends in Zipf’s law of abbreviation with an expanded form of Eq. 5 applied to all species at once—Eq. 6 below. An interaction between count and the group the species comes from was added, to determine whether the effect varies between Mysticetes and Odontocetes. Group was not added as a separate fixed effect (outside of the interaction) because the z scoring of duration within species removes species differences (see Materials and Methods). Study was included as a varying intercept.
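$$\log(\text{duration}) \sim \text{count} + \text{count}{:}\text{group} + (1 \mid \text{type}) + (1 \mid \text{study}) \tag{6}$$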
The results of the model used to assess overall trends in Zipf’s law of abbreviation (Eq. 6) can be seen in Table 3. Overall, there is a strong negative effect of count on the duration of elements, which is consistent with Zipf’s law of abbreviation, although this effect is probably driven primarily by humpback whales and blue whales. The interaction between this effect and the group the data come from is positive, suggesting that Zipf’s law of abbreviation is weaker in Odontocetes. However, the only Odontocete species in this model is the killer whale, so this result may not generalize to a larger sample.
Table 3. The estimated effect of each predictor and interaction (indented and marked with “:”) on the duration of elements.
Count is the number of times that each type of element is found in each dataset, and group is whether the species is a baleen (0) or toothed (1) whale. Values of 2.5% and 97.5% denote the lower and upper bounds of the 95% CIs. Asterisks mark 95% CIs that do not overlap zero, interpreted here as evidence for a strong effect.
| Predictor | Effect | 2.5% | 97.5% | |
|---|---|---|---|---|
| Count | −0.135 | −0.162 | −0.109 | * |
| : Group | 0.250 | 0.216 | 0.285 | * |
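As a short worked example, the group-specific slope implied by Table 3 is the count effect plus the interaction when group = 1. The sketch below uses the tabulated point estimates only; the constant and function names are hypothetical, and no CI is derived for the combined value.

```python
# Point estimates copied from Table 3.
COUNT_EFFECT = -0.135
COUNT_BY_GROUP = 0.250


def count_slope(group):
    """Implied effect of count on duration (0 = baleen, 1 = toothed whale)."""
    return COUNT_EFFECT + COUNT_BY_GROUP * group


baleen_slope = count_slope(0)   # negative, consistent with Zipf's law of abbreviation
toothed_slope = count_slope(1)  # positive point estimate, based on killer whales only
```

Because the killer whale is the only toothed whale in this model, the positive point estimate for group = 1 reflects that single species rather than Odontocetes in general.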
DISCUSSION
The vocalizations of 11 of the 16 whale species included in this analysis adhere to Menzerath’s law, suggesting that they have undergone compression for increased efficiency in time. Among these 11 species, the strength of Menzerath’s law is comparable to, and sometimes far greater than, what is observed in spoken human language data. In Figure 4, we compared the whale sequences to phonemes within words, but the results were similar for words within sentences (fig. S2). For two species, humpback whales and killer whales, we were able to analyze sequences at two levels of analysis. Humpback whales exhibit Menzerath’s law for both notes within phrases and phrases within songs. Killer whales, on the other hand, only exhibit Menzerath’s law at the level of call sequences, as opposed to the elements comprising calls. When data from all 16 whale species are included in a single analysis, there is strong evidence for both Menzerath’s law and for an effect of position—elements and intervals tend to be shortened over the course of sequences.
Several species produce vocalizations that do not adhere to Menzerath’s law—killer whales (at the level of elements within calls), North Pacific right whales, and the three Cephalorhynchus dolphin species. The fact that killer whale vocalizations exhibit Menzerath’s law in their call sequences, but not elements within calls, suggests that the former may be the more relevant level of analysis for communication. The results from the North Pacific right whales are more puzzling. The data used in this study are from the first documented recordings of song in any right whale species (51) and are composed of four song types with fairly marked differences in sequence lengths and interval durations (see clusters in Fig. 2). When Menzerath’s law is assessed separately on each song type, two display the expected negative relationship, one displays a neutral relationship, and one displays a positive relationship between sequence length and interval duration. One speculative explanation for the mixed results in North Pacific right whales is that the songs may be in an early stage of cultural evolution, either for the first time or as part of recovery from endangerment. Crance et al. (51) found only one clear case of different animals producing the same song type, and linguistic laws may emerge from repeated cultural transmission between individuals (39). The three Cephalorhynchus species in this study—Hector’s dolphins, Commerson’s dolphins, and Heaviside’s dolphins—all produce burst pulses with both narrowband high-frequency and broadband clicks (56–58). The former are thought to be used for echolocation and cryptic communication (above the hearing range of killer whales), whereas the latter are used for long-range communication (58). One potential explanation for the absence of Menzerath’s law in Cephalorhynchus is that the crypsis of narrowband high-frequency clicks is more important than their efficiency. 
Another interesting detail is that Heaviside’s and Hector’s dolphins sometimes produce temporally patterned burst pulses with much more rhythmic variation during social interactions (56, 59). A preliminary analysis of patterned burst pulses from Heaviside’s dolphins (59) shows that they do adhere to Menzerath’s law (see Supplementary Text), suggesting that they may experience more information compression than burst pulses with more consistent intervals. Patterned burst pulses may be a good candidate for future studies of communication in Cephalorhynchus, although more documentation of these relatively rare vocalizations is needed.
On a related note, Menzerath’s law does not appear to be universal in spoken language at the level of phonemes in words (Fig. 4) or words within sentences (fig. S2), which is consistent with previous work on clauses in written sentences (60) and syllables in written words (50). Menzerath’s law in language, then, appears to be a statistical tendency rather than an absolute universal.
The shortening of elements and intervals later in sequences is an unexpected finding, as the opposite pattern is often (but not always) observed in birdsong (29, 37) and human language (50) (see Fig. 4). “Final lengthening” is a well-studied linguistic phenomenon in which vowels are lengthened right before word, phrase, and sentence boundaries (46, 61). One account for final lengthening is that it initially evolved to minimize the cost of switching from exhaling to inhaling between elements, and has subsequently been elaborated via cultural evolution to make the boundaries between elements easier to perceive (62). Both toothed and baleen whales have specialized adaptations that allow them to vocalize while holding their breath (12, 63), which may release them from the specific motor constraints that drive final lengthening.
Another explanation comes from primates, where coppery titi monkeys, eastern grey gibbons, and geladas shorten some aspects of their vocalizations over the course of sequences (elements for the first two, intervals for the third) (20, 52). Longer vocalizations are more energetically costly (64), which is probably why humans and other mammals shorten their vocalizations as they fatigue (65). Gustison et al. (20) and Clink and Lau (52) hypothesize that vocal shortening later in sequences reflects this simple energetic constraint, and that it may even explain Menzerath’s law in some species. Other work in humans and birds supports the idea that Menzerath’s law has physical origins (22, 36, 37)—a development that some have described as “liberating” after decades of debate about the origins of linguistic laws (66). Menzerath’s law in humans appears to be stronger in spoken than in written language (22, 36), deafened canaries and zebra finches produce songs consistent with the law without hearing adult models (37), and African penguins display the law without engaging in vocal learning (24). If Gustison et al. (20) and Clink and Lau (52) are correct, then the presence of vocal shortening may point to a physical origin for Menzerath’s law in whale communication.
The vocalizations of humpback whales and blue whales also adhere to Zipf’s law of abbreviation, suggesting that they have undergone additional compression for efficiency in time. Of these two species, only humpback whales exhibit Zipf’s law of abbreviation to the extent observed in spoken human language data. In Fig. 6, we compared the whale elements to phonemes, but the results are similar for words (see fig. S7). Humpback whales also exhibit Zipf’s law of abbreviation at a higher level of analysis: phrases, the sequences of notes that make up songs. When data from all five whale species are included in a single analysis, there is strong evidence for Zipf’s law of abbreviation, but this is probably driven primarily by humpback whales and blue whales. Note that Zipf’s law of abbreviation requires that elements are categorized into types, so it could only be assessed in 5 of the 16 species included in this study.
Previous studies looking at both laws in the same species indicate that Zipf’s law of abbreviation is not usually found without Menzerath’s law (23, 24, 31–34), while Menzerath’s law often exists on its own (26–30). A similar pattern appears to be present in cetaceans, where four of the five species with categorized elements exhibit Menzerath’s law, but only two of those four also exhibit Zipf’s law of abbreviation. As described in the Introduction, this discrepancy may be due to differences in the mechanisms or constraints underlying the two laws. Menzerath’s law may be rooted in physical constraints (22, 36, 37), whereas Zipf’s law of abbreviation may reflect additional pressure for informativeness and predictability (39, 41). On the basis of this logic, one might expect species that exhibit Zipf’s law of abbreviation to have more complex vocalizations with more learned content, but this is not necessarily the case in cetaceans. Humpback whales and bowhead whales both have complex, hierarchically structured songs (67, 68), but only humpback whales appear to exhibit the law. Similarly, blue whales from the northeast Pacific population exhibit the law despite producing very simple sequences composed of only two call types (55).
A variety of factors can counteract compression in vocal communication (21), including transmission range, noise, and sexual selection. In some terrestrial mammals, including bats, marmosets, and geladas, linguistic laws appear in short-range but not in long-range vocalizations (20, 69). This pattern likely reflects a trade-off (70): increased duration and redundancy enhance the transmission success of long-range signals at the expense of production cost (69). We observed no clear relationship between transmission range and the strength of Menzerath’s law or Zipf’s law of abbreviation across cetacean species, although limited data prevented us from confirming this quantitatively. It is possible that the enhanced speed and range of underwater sound (71) reduce the need for cetaceans to sacrifice efficiency for the transmissibility of long-range signals. In noisy environments, beluga whales (15), bottlenose dolphins (16), killer whales (17), and humpback whales (18) have all been observed to increase vocalization time to boost transmission success. Last, vocalization time itself can be a target of sexual selection. For instance, female humpback whales may prefer males who sing longer and more complex songs (72). Selection should lead to adaptive compromises among all of the costs and benefits associated with vocal communication (6), including those mentioned above.
The physical mechanisms of underwater vocal production in cetaceans have only recently been investigated in detail (12, 63), and there are some key differences between mysticetes and odontocetes. Mysticetes produce sound with their larynx, whereas odontocetes rely on their nasal passages (12, 63). Both groups appear to recycle air as they vocalize, but odontocetes do so more rapidly because of the smaller volume of the nasal passages (63). These differences may partly explain why odontocetes exhibit more final shortening in their vocal sequences than mysticetes. It is also unclear how life underwater might shift the relative pressures on elements and intervals. To our knowledge, there has been very little work on the metabolic costs of changing the durations of elements and intervals in cetaceans. One study conducted with two captive bottlenose dolphins found that the metabolic cost of vocalizing increased with the duration of elements, but was unaffected by the durations of the intervals between elements (10). This finding suggests that elements may be subject to more constraints than intervals, which is consistent with the result that Menzerath’s law is slightly stronger in elements, although this interpretation is necessarily speculative. Much more work needs to be done on how vocal production varies across aquatic mammals, with a focus on the differences in the physical constraints that they are subject to (71).
The findings of this study contribute to an emerging consensus that whale communication, like birdsong, shares remarkable structural parallels with human language (31, 73–78). Eleven of the 16 species analyzed exhibit Menzerath’s law, with effect sizes comparable to or even greater than those observed in human speech. Two of the five species with categorized element types adhere to Zipf’s law of abbreviation, with one having an effect size comparable to human speech. On average, whales also tend to shorten elements and intervals toward the end of sequences, although this varies by species. Overall, Menzerath’s law is widespread, while Zipf’s law of abbreviation is relatively rare. This discrepancy aligns with findings in songbirds and other animals (23, 24, 31–34, 37, 44), and may point to differences in the mechanisms driving the two laws (29, 35). Notably, humpback whales exhibit both laws across two levels of analysis: notes within phrases and phrases within songs.
A parallel analysis of human speech data from 51 languages reveals multiple exceptions to Menzerath’s law at the levels of phonemes within words and words within sentences, while Zipf’s law of abbreviation appears to be universal across these levels. Previous studies of these two laws in human speech have focused on one or two languages (22, 36) or converted spoken data from nine languages into written formats (35).
It is important to recognize that similarity in form does not necessarily imply similarity in function. Language-like structure can emerge from pressures for efficiency even when communication carries no semantic meaning (25). While some cetacean species have the capacity to convey a great deal of information (78), researchers should be wary of over-interpreting structural similarities with language (79). Further comparisons with music—which shares many language-like features without conveying semantic meaning (80)—may also help illuminate the diverse forms and functions of communication across whale species (47).
MATERIALS AND METHODS
Data
Cetacean vocal sequences have different names in different species (e.g., songs, codas, and burst pulses), and there is substantial variation in research effort across taxa, so we used a mixture of different strategies to compile a convenience sample of candidate datasets. For heavily studied species, we were able to find papers by using species-specific search term combinations like {“humpback whale” AND “song sequences”} and {“sperm whale” AND “codas”} on Google Scholar. For less represented taxa, like dolphins and porpoises, we also searched for datasets directly on repositories like Dryad, Zenodo, and Figshare. Within odontocetes (i.e., toothed whales), which produce clicks for echolocation, we only included vocalizations that have a known or hypothesized communication function (e.g., sperm whale codas and dolphin burst pulses) (58, 81).
In total, we found 44 studies that reported the durations of elements, or the intervals between elements, within vocal sequences. Fourteen of these had open data that were suitable for analysis. We emailed the corresponding authors of the remaining studies and were granted access to 10 closed datasets that were suitable for analysis. The final 24 datasets can be seen in Table 1 (31, 51, 55–58, 67, 68, 81–96). For each dataset, we analyzed sequences as they were defined by the original authors. Three of the datasets, two in humpback whales (85, 86) and one in killer whales (90), were analyzed separately because they measured the durations of higher-level units (e.g., for the humpbacks, phrases within songs rather than notes within phrases).
The phrase-level humpback whale dataset (67) was the only one that did not include the durations of individual elements or intervals in sequences. Instead, Owen et al. (67) report the sequences as strings of element categories, with a separate file that logs the durations of many different elements from each category. For this dataset, we interpolated the sequences with the median duration of each element category. Supplementary analysis with human language data suggests that interpolation with median values systematically reduces the strength of Menzerath’s law, which should lead to more conservative conclusions (fig. S4). One clear outlier was removed from one of the other humpback whale datasets (85)—a phrase with a duration of 752 s (the next longest in that category is 16.9 s, with a median of 9.3 s).
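The interpolation step described above can be illustrated with a minimal Python sketch. It assumes that sequences are stored as lists of category labels and that the separate duration log is a mapping from category to measured durations; the function name `interpolate_durations` and the toy values are hypothetical and are not taken from the original dataset or analysis code.

```python
from statistics import median


def interpolate_durations(sequences, duration_log):
    """Replace each category label with the median logged duration of that category."""
    # One median per element category, computed from the separate duration log.
    medians = {category: median(values) for category, values in duration_log.items()}
    # Every occurrence of a category receives the same (median) duration.
    return [[medians[label] for label in sequence] for sequence in sequences]


# Hypothetical toy data: two element categories and two sequences of labels.
duration_log = {"A": [1.0, 1.2, 1.4], "B": [0.5, 0.75]}
sequences = [["A", "B", "A"], ["B", "B"]]
interpolated = interpolate_durations(sequences, duration_log)
```

Because every element of a category is assigned the same value, within-category variation in duration is removed, which is one plausible reason why this procedure would weaken rather than inflate the estimated relationship.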
The phylogeny in Figs. 2 and 3 comes from a metatree of Cetacea composed of both molecular and morphological data (97). As the phylogeny was primarily for visualization purposes, we assigned three species that do not appear in the metatree to close relatives in the same genus: the narrow-ridged finless porpoise (Neophocaena asiaeorientalis) to the Indo-Pacific finless porpoise (Neophocaena phocaenoides), the Commerson’s dolphin (Cephalorhynchus commersonii) to the Chilean dolphin (Cephalorhynchus eutropia), and the Peale’s dolphin (Lagenorhynchus australis) to the white-beaked dolphin (Lagenorhynchus albirostris).
For the DoReCo spoken language dataset [(49); see “DoReCo references” in the Supplementary Materials], the only preprocessing was removing everything marked as an “exceptional speech event” (i.e., singing and disfluencies). For the main analysis, we followed Menzerath (19) in using data on phonemes within words, but the results for words within sentences can be found in fig. S2.
Element durations and inter-onset intervals were measured slightly differently across the datasets. Most element durations were measured manually from spectrograms (51, 55, 67, 68, 85, 86, 89, 90, 96), with one dataset processed semiautomatically using PAMlab (88). The approaches for measuring inter-onset intervals were more varied. Some datasets were processed manually (83, 87, 94), while others used automated methods through custom pipelines, MATLAB, or Raven Pro (31, 82, 84, 92). Several studies did not specify the method used for measuring intervals (56–58, 81, 91, 93, 95). The spoken language data from DoReCo were annotated semiautomatically using WebMAUS (46). Researchers used different strategies for automatically identifying onsets, such as when cumulative energy in the signal reached 5% (82), when maximum energy occurred within a specific frequency range (84), or when a sliding time window detected above-threshold kurtosis (31). For more details, refer to the original source for each dataset.
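As a rough illustration of the first automated criterion mentioned above (an onset placed where cumulative signal energy reaches 5%), the following Python sketch operates on a plain list of amplitude samples. It is only a sketch under stated assumptions: the function name `onset_index` is hypothetical, energy is taken to be squared amplitude, and the filtering, windowing, and calibration steps of the original pipelines are not reproduced.

```python
def onset_index(samples, fraction=0.05):
    """Return the first sample index at which cumulative energy reaches `fraction` of the total."""
    energy = [s * s for s in samples]  # assumption: energy is squared amplitude
    total = sum(energy)
    if total == 0:
        return None  # silent input: no onset can be defined
    threshold = fraction * total
    running = 0.0
    for index, value in enumerate(energy):
        running += value
        if running >= threshold:
            return index
    return None


# Hypothetical toy signal: silence, a weak lead-in, then the main pulse.
signal = [0.0, 0.0, 0.1, 1.0, 1.0, 0.5, 0.0]
onset = onset_index(signal)
```

Inter-onset intervals would then be differences between successive onset times, that is, onset indices divided by the sampling rate.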
There are two notes about the datasets used to assess Zipf’s law of abbreviation. First, only one of the two sei whale datasets separated out the calls into types (89), which is why the other is excluded (88). Second, although the two killer whale datasets focus on different levels of analysis (i.e., elements in calls versus calls in call sequences), both of them have labeled call types, which is why both are included (90, 96).
Model fitting
All models were fit using the lme4 (v1.1-35.1) package in R (v4.3.1) with the BOBYQA optimizer. To avoid the many problems associated with P values (98), we report mean estimates and 95% Wald CIs, and interpret intervals that do not overlap zero as indicating a strong effect. To enable direct comparison of fixed effects across different models, we used maximum likelihood and z-scored the sequence lengths and element or interval durations within species and languages. All reported models were manually checked for convergence.
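The within-group standardization described above can be sketched in a few lines of Python. This is an illustration rather than the analysis code, which was written in R: the function name `zscore_within` and the record layout are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within(records, group_key, value_key):
    """Add a z-scored copy of `value_key`, standardized separately within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    # Assumption: sample standard deviation; each group needs at least two distinct values.
    stats = {group: (mean(values), stdev(values)) for group, values in groups.items()}
    standardized = []
    for record in records:
        center, spread = stats[record[group_key]]
        z = (record[value_key] - center) / spread
        standardized.append({**record, value_key + "_z": z})
    return standardized


# Hypothetical toy data: durations on very different scales in two species.
records = [
    {"species": "a", "duration": 1.0},
    {"species": "a", "duration": 2.0},
    {"species": "a", "duration": 3.0},
    {"species": "b", "duration": 10.0},
    {"species": "b", "duration": 20.0},
    {"species": "b", "duration": 30.0},
]
scaled = zscore_within(records, "species", "duration")
```

After standardization, both toy species have durations of −1, 0, and 1, so slopes estimated from them are on a common scale even though the raw durations differ by an order of magnitude.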
We focus on the Menzerath-Altmann law, a precise and more robust mathematical form of Menzerath’s law (48, 50). Here is the standard form of the Menzerath-Altmann law, where $y$ is the duration of elements within a sequence composed of $x$ elements, and $a$, $b$, and $c$ are parameters controlling the shape of the relationship.
$$y = a x^{b} e^{-cx} \tag{7}$$
$c$ is usually close to 0 when this model is fit to empirical data (37), leading to a reduced model that is its most common form in contemporary linguistics (60).
$$y = a x^{b} \tag{8}$$
With some simple algebra, we can convert Eqs. 7 and 8 into linear models.
$$\log(y) = \log(a) + b \log(x) - cx \tag{9}$$
$$\log(y) = \log(a) + b \log(x) \tag{10}$$
We will use Eq. 10 to enable direct comparison with previous studies of the Menzerath-Altmann law in nonhuman animals (20, 23, 24, 29, 31, 33, 37, 52), and because the inclusion of $x$ twice in Eq. 9 leads to severe multicollinearity.
In Eq. 10, $y$ is usually the mean duration of elements within sequences, but we will use the full distribution of element durations within sequences (25). This leads to similar estimates of $a$ and $b$ in linguistic corpora, helps to avoid spurious “regression to the mean” effects (20, 99, 100), and better captures uncertainty in the models (25). We also follow other work in excluding single-element sequences (i.e., with a length of one) from the analysis, which have been shown to depart from Menzerath’s law (22, 28, 36, 50).
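The log-linear form of the law can be illustrated with a short Python sketch. Writing $y$ for element duration and $x$ for sequence length (in elements), it regresses $\log(y)$ on $\log(x)$ using every element duration rather than sequence means, and it drops single-element sequences. Note that this sketch swaps in ordinary least squares for the lme4 models used in the study and omits the z-scoring step; the function name `fit_menzerath` and the toy data are hypothetical.

```python
import math


def fit_menzerath(sequences):
    """Fit log(y) = log(a) + b*log(x) by ordinary least squares; return (a, b).

    `sequences` is a list of lists of element durations. Needs at least two
    distinct sequence lengths greater than one.
    """
    log_x, log_y = [], []
    for sequence in sequences:
        length = len(sequence)
        if length < 2:
            continue  # single-element sequences are excluded from the analysis
        for duration in sequence:
            # Each element contributes one point: the full distribution, not the mean.
            log_x.append(math.log(length))
            log_y.append(math.log(duration))
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Hypothetical toy data generated exactly from y = 2 * x**-0.5,
# plus one single-element sequence that the fit should ignore.
toy = [[2.0 * n ** -0.5] * n for n in (2, 4, 8)] + [[9.0]]
a_hat, b_hat = fit_menzerath(toy)
```

A negative estimate of $b$ corresponds to the pattern predicted by Menzerath’s law, in which longer sequences are composed of shorter elements.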
We originally planned to compare the patterns in the real data with simulated data from a null model that is thought to account for production constraints (37), as we recently did for house finch song (25), but analyses of language data suggest that it is far too conservative a null model. More details about this exploratory analysis can be found in the Supplementary Text.
Several previous studies of Menzerath’s law in nonhuman communication have also included individual identity as a varying intercept (23, 52). We did not include individual identity in the models because it is not available in any of the included datasets and is rarely included in linguistics studies (22). All details about Zipf’s law of abbreviation can be found in the Results.
Acknowledgments
We would like to thank all first authors who contributed data to this study, either directly (via personal correspondence) or indirectly (by publishing open data): L. Lewis, F. Erbs, M. Romagosa, M. Wood, P. Best, E. Schall, C. Owen, C. Martin, J. Crance, G. Macklin, A. Stepanov, M. Martin, N. Nielsen, A. Selbmann, D. Sharpe, T. Terada, P. Arranz, T. Hersh, F. Vachon, and S. Gero.
Funding: The author acknowledges that they received no funding in support of this research.
Author contributions: M.Y. conceptualized and designed the project, compiled all of the data, ran all statistical analyses, and prepared the manuscript and figures.
Competing interests: The author declares no competing interests.
Data and materials availability: The analysis code and all datasets that were made open access by the original authors can be found in the permanent Zenodo repository (https://doi.org/10.5281/zenodo.14164900) or on GitHub (https://github.com/masonyoungblood/whale_efficiency). For access to the other datasets that are not publicly available, please reach out to the original authors (see Table 1).
Supplementary Materials
This PDF file includes:
Supplementary Text
Figs. S1 to S8
Tables S1 to S6
DoReCo references
Correction (24 March 2025):
The DoReCo dataset requires that all languages analyzed be cited separately. The full reference list of DoReCo citations has been added to the end of the Supplementary Materials PDF and main text citations to the dataset have been updated for clarity in the HTML and PDF versions of the article. The original version of the Supplementary Materials is available here:
REFERENCES AND NOTES
- 1. Seyfarth R. M., Cheney D. L., Bergman T., Fischer J., Zuberbühler K., Hammerschmidt K., The central importance of information in studies of animal communication. Anim. Behav. 80, 3–8 (2010).
- 2. Fitch W. T., The evolution of speech: A comparative review. Trends Cogn. Sci. 4, 258–267 (2000).
- 3. Hebets E. A., Barron A. B., Balakrishnan C. N., Hauber M. E., Mason P. H., Hoke K. L., A systems approach to animal communication. Proc. R. Soc. B 283, 20152889 (2016).
- 4. Semple S., Hsu M. J., Agoramoorthy G., Efficiency of coding in macaque vocal communication. Biol. Lett. 6, 469–471 (2010).
- 5. Ophir A. G., Schrader S. B., Gillooly J. F., Energetic cost of calling: General constraints and species-specific differences. J. Evol. Biol. 23, 1564–1569 (2010).
- 6. Endler J. A., Some general comments on the evolution and design of animal communication systems. Philos. Trans. R. Soc. Lond. B Biol. Sci. 340, 215–225 (1993).
- 7. Gibson E., Futrell R., Piantadosi S. T., Dautriche I., Mahowald K., Bergen L., Levy R., How efficiency shapes human language. Trends Cogn. Sci. 23, 389–407 (2019).
- 8. Gruber T., Chimento M., Aplin L. M., Biro D., Efficiency fosters cumulative culture across species. Philos. Trans. R. Soc. B 377, 20200308 (2022).
- 9. G. Zipf, Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology (Addison-Wesley, 1949).
- 10. Noren D. P., Holt M. M., Dunkin R. C., Williams T. M., The metabolic cost of communicative sound production in bottlenose dolphins (Tursiops truncatus). J. Exp. Biol. 216 (Pt. 9), 1624–1629 (2013).
- 11. May-Collado L. J., Agnarsson I., Wartzok D., Phylogenetic review of tonal sound production in whales in relation to sociality. BMC Evol. Biol. 7, 136 (2007).
- 12. Elemans C. P. H., Jiang W., Jensen M. H., Pichler H., Mussman B. R., Nattestad J., Wahlberg M., Zheng X., Xue Q., Fitch W. T., Evolutionary novelties underlie sound production in baleen whales. Nature 627, 123–129 (2024).
- 13. Moseley D. L., Phillips J. N., Derryberry E. P., Luther D. A., Evidence for differing trajectories of songs in urban and rural populations. Behav. Ecol. 30, 1734–1742 (2019).
- 14. Zeh J. M., Adcock D. L., Perez-Marrufo V., Cusano D. A., Robbins J., Tackaberry J. E., Jensen F. H., Weinrich M., Friedlaender A. S., Wiley D. N., Parks S. E., Acoustic behavior of humpback whale calves on the feeding ground: Comparisons across age and implications for vocal development. PLOS ONE 19, e0303741 (2024).
- 15. Lesage V., Barrette C., Kingsley M. C. S., Sjare B., The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence river estuary, Canada. Mar. Mamm. Sci. 15, 65–84 (1999).
- 16. Sørensen P. M., Haddock A., Guarino E., Jaakkola K., McMullen C., Jensen F. H., Tyack P. L., King S. L., Anthropogenic noise impairs cooperation in bottlenose dolphins. Curr. Biol. 33, 749–754.e4 (2023).
- 17. Foote A. D., Osborne R. W., Hoelzel A. R., Whale-call response to masking boat noise. Nature 428, 910 (2004).
- 18. Fristrup K. M., Hatch L. T., Clark C. W., Variation in humpback whale (Megaptera novaeangliae) song length in relation to low-frequency sound broadcasts. J. Acoust. Soc. Am. 113, 3411–3424 (2003).
- 19. P. Menzerath, Die Architektonik Des Deutschen Wortschatzes (Dümmler, 1954).
- 20. Gustison M. L., Semple S., Ferrer-i-Cancho R., Bergman T. J., Gelada vocal sequences follow Menzerath’s linguistic law. Proc. Natl. Acad. Sci. U.S.A. 113, E2750–E2758 (2016).
- 21. Ferrer-i-Cancho R., Hernández-Fernández A., Lusseau D., Agoramoorthy G., Hsu M. J., Semple S., Compression as a universal principle of animal behavior. Cognit. Sci. 37, 1565–1578 (2013).
- 22. Torre I. G., Luque B., Lacasa L., Kello C. T., Hernández-Fernández A., On the physical origin of linguistic laws and lognormality in speech. R. Soc. Open Sci. 6, 191023 (2019).
- 23. Huang M., Ma H., Ma C., Garber P. A., Fan P., Male gibbon loud morning calls conform to Zipf’s law of brevity and Menzerath’s law: Insights into the origin of human language. Anim. Behav. 160, 145–155 (2020).
- 24. Favaro L., Gamba M., Cresta E., Fumagalli E., Bandoli F., Pilenga C., Isaja V., Mathevon N., Reby D., Do penguins’ vocal sequences conform to linguistic laws? Biol. Lett. 16, 20190589 (2020).
- 25. Youngblood M., Language-like efficiency and structure in house finch song. Proc. R. Soc. B 291, 20240250 (2024).
- 26. Clink D. J., Ahmad A. H., Klinck H., Brevity is not a universal in animal communication: Evidence for compression depends on the unit of analysis in small ape vocalizations. R. Soc. Open Sci. 7, 200151 (2020).
- 27. Deng K., He Y.-X., Wang X.-P., Wang T.-L., Wang J.-C., Chen Y.-H., Cui J.-G., Hainan frilled treefrogs’ calls partially conform to Menzerath–Altmann’s law, but oppose Zipf’s law of abbreviation. Anim. Behav. 213, 51–59 (2024).
- 28. Heesen R., Hobaiter C., Ferrer-i-Cancho R., Semple S., Linguistic laws in chimpanzee gestural communication. Proc. R. Soc. Lond. B Biol. Sci. 286, 20182900 (2019).
- 29. R. N. Lewis, A. Kwong, M. Soma, S. R. De Kort, R. T. Gilman, Java sparrow song conforms to Menzerath’s law but not Zipf’s law of abbreviation. bioRxiv 2023.12.13.571437 [Preprint] (2023). 10.1101/2023.12.13.571437.
- 30. Safryghin A., Cross C., Fallon B., Heesen R., Ferrer-i-Cancho R., Hobaiter C., Variable expression of linguistic laws in ape gesture: A case study from chimpanzee sexual solicitation. R. Soc. Open Sci. 9, 220849 (2022).
- 31. A. Stepanov, H. Zhivomirov, I. Nedelchev, P. Stateva, Bottlenose dolphins’ broadband clicks are structured for communication. bioRxiv 2023.01.11.523588 [Preprint] (2023). 10.1101/2023.01.11.523588.
- 32. Valente D., De Gregorio C., Favaro L., Friard O., Miaretsoa L., Raimondi T., Ratsimbazafy J., Torti V., Zanoli A., Giacoma C., Gamba M., Linguistic laws of brevity: Conformity in Indri indri. Anim. Cogn. 24, 897–906 (2021).
- 33. A. A. Vradi, “Dolphin communication. A quantitative linguistics approach,” thesis, Universitat Politècnica de Catalunya, Barcelona, Spain (2021).
- 34. Zhang C., Zheng Z., Lucas J. R., Wang Y., Fan X., Zhao X., Feng J., Sun C., Jiang T., Do bats’ social vocalizations conform to Zipf’s law and the Menzerath-Altmann law? iScience 27, 110401 (2024).
- 35. Stave M., Paschen L., Pellegrino F., Seifart F., Optimization of morpheme length: A cross-linguistic assessment of Zipf’s and Menzerath’s laws. Linguist Vanguard 7, 20190076 (2021).
- 36. Hernández-Fernández A., Torre I. G., Garrido J.-M., Lacasa L., Linguistic laws in speech: The case of Catalan and Spanish. Entropy 21, 1153 (2019).
- 37. James L. S., Mori C., Wada K., Sakata J. T., Phylogeny and mechanisms of shared hierarchical patterns in birdsong. Curr. Biol. 31, 2796–2808.e9 (2021).
- 38. Mahowald K., Dautriche I., Gibson E., Piantadosi S. T., Word forms are structured for efficient use. Cognit. Sci. 42, 3116–3134 (2018).
- 39. Kanwal J., Smith K., Culbertson J., Kirby S., Zipf’s law of abbreviation and the principle of least effort: Language users optimise a miniature lexicon for efficient communication. Cognition 165, 45–52 (2017).
- 40. Mahowald K., Fedorenko E., Piantadosi S. T., Gibson E., Info/information theory: Speakers choose shorter words in predictive contexts. Cognition 126, 313–318 (2013).
- 41. Piantadosi S. T., Tily H., Gibson E., Word lengths are optimized for efficient communication. Proc. Natl. Acad. Sci. U.S.A. 108, 3526–3529 (2011).
- 42. T. S. Kang, “Linguistic laws and compression in a comparative perspective: A conceptual review and phylogenetic test in mammals,” thesis, Durham University, Durham, UK (2021).
- 43. Janik V. M., Cetacean vocal learning and communication. Curr. Opin. Neurobiol. 28, 60–65 (2014).
- 44. R. T. Gilman, C. Durrant, L. Malpas, R. N. Lewis, Does Zipf’s law of abbreviation shape birdsong? bioRxiv 2023.12.06.569773 [Preprint] (2023). 10.1101/2023.12.06.569773.
- 45. Risueno-Segovia C., Dohmen D., Gultekin Y. B., Pomberger T., Hage S. R., Linguistic law-like compression strategies emerge to maximize coding efficiency in marmoset vocal communication. Proc. R. Soc. B 290, 20231503 (2023).
- 46. Paschen L., Fuchs S., Seifart F., Final lengthening and vowel length in 25 languages. J. Phon. 94, 101179 (2022).
- 47. Hersh T. A., Ravignani A., Whitehead H., Cetaceans are the next frontier for vocal rhythm research. Proc. Natl. Acad. Sci. U.S.A. 121, e2313093121 (2024).
- 48. Altmann G., Prolegomena to Menzerath’s law. Glottometrika 2, 1–10 (1980).
- 49. F. Seifart, L. Paschen, M. Stave, Language documentation reference corpus (DoReCo), version 1.2 (2022). 10.34847/nkl.7cbfq779.
- 50. Torre I. G., Dębowski Ł., Hernández-Fernández A., Can Menzerath’s law be a criterion of complexity in communication? PLOS ONE 16, e0256133 (2021).
- 51. Crance J. L., Berchok C. L., Wright D. L., Brewer A. M., Woodrich D. F., Song production by the North Pacific right whale, Eubalaena japonica. J. Acoust. Soc. Am. 145, 3467–3479 (2019).
- 52. Clink D. J., Lau A. R., Adherence to Menzerath’s law is the exception (not the rule) in three duetting primate species. R. Soc. Open Sci. 7, 201557 (2020).
- 53. Ives A. R., Midford P. E., Garland T., Within-species variation and measurement error in phylogenetic comparative methods. Syst. Biol. 56, 252–270 (2007).
- 54. Revell L. J., phytools 2.0: An updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ 12, e16505 (2024).
- 55. Lewis L. A., Calambokidis J., Stimpert A. K., Fahlbusch J., Friedlaender A. S., McKenna M. F., Mesnick S. L., Oleson E. M., Southall B. L., Szesciorka A. R., Širović A., Context-dependent variability in blue whale acoustic behaviour. R. Soc. Open Sci. 5, 180241 (2018).
- 56. Nielsen N. A., Dawson S. M., Torres Ortiz S., Wahlberg M., Martin M. J., Hector’s dolphins (Cephalorhynchus hectori) produce both narrowband high-frequency and broadband acoustic signals. J. Acoust. Soc. Am. 155, 1437–1450 (2024).
- 57. Martin M. J., Torres Ortiz S., Reyes Reyes M. V., Marino A., Iñíguez Bessega M., Wahlberg M., Commerson’s dolphins (Cephalorhynchus commersonii) can relax acoustic crypsis. Behav. Ecol. Sociobiol. 75, 100 (2021).
- 58. Martin M. J., Gridley T., Elwen S. H., Jensen F. H., Heaviside’s dolphins (Cephalorhynchus heavisidii) relax acoustic crypsis to increase communication range. Proc. R. Soc. B 285, 20181178 (2018).
- 59. Martin M. J., Elwen S. H., Kassanjee R., Gridley T., To buzz or burst-pulse? The functional role of Heaviside’s dolphin, Cephalorhynchus heavisidii, rapidly pulsed signals. Anim. Behav. 150, 273–284 (2019).
- 60. Hou R., Huang C.-R., Do H. S., Liu H., A study on correlation between Chinese sentence and constituting clauses based on the Menzerath-Altmann Law. J. Quant. Linguist. 24, 350–366 (2017).
- 61. Seifart F., Strunk J., Danielsen S., Hartmann I., Pakendorf B., Wichmann S., Witzlack-Makarevich A., Himmelmann N. P., Bickel B., The extent and degree of utterance-final word lengthening in spontaneous speech from 10 languages. Linguist Vanguard 7, 20190063 (2021).
- 62. Matzinger T., Fitch W. T., Voice modulatory cues to structure across languages and species. Philos. Trans. R. Soc. B 376, 20200393 (2021).
- 63. Madsen P. T., Siebert U., Elemans C. P. H., Toothed whales use distinct vocal registers for echolocation and communication. Science 379, 928–933 (2023).
- 64. Holt M. M., Noren D. P., Dunkin R. C., Williams T. M., Vocal performance affects metabolic rate in dolphins: Implications for animals communicating in noisy environments. J. Exp. Biol. 218 (Pt. 11), 1647–1654 (2015).
- 65. Vannoni E., McElligott A. G., Fallow bucks get hoarse: Vocal fatigue as a possible signal to conspecifics. Anim. Behav. 78, 3–10 (2009).
- 66. Benešová M., Faltýnek D., Zámečník L. H., Explain the law: When the evidence is not enough. Linguistic Front., (2021).
- 67. Owen C., Rendell L., Constantine R., Noad M. J., Allen J., Andrews O., Garrigue C., Poole M. M., Donnelly D., Hauser N., Garland E. C., Migratory convergence facilitates cultural transmission of humpback whale song. R. Soc. Open Sci. 6, (2019).
- 68. Erbs F., Van Der Schaar M., Weissenberger J., Zaugg S., André M., Contribution to unravel variability in bowhead whale songs and better understand its ecological significance. Sci. Rep. 11, 168 (2021).
- 69. Ferrer-i-Cancho R., Hernández-Fernández A., The failure of the law of brevity in two new world primates. Statistical caveats. Glottotheory 4, (2013).
- 70. Semple S., Ferrer-i-Cancho R., Gustison M. L., Linguistic laws in biology. Trends Ecol. Evol. 37, 53–66 (2022).
- 71. Ladich F., Winkler H., Acoustic communication in terrestrial and aquatic vertebrates. J. Exp. Biol. 220, 2306–2317 (2017).
- 72. Garland E. C., Allen J. A., Eichenberger F., Garrigue C., Bonneville C., Steel D., Carroll E. L., Does female choice for song complexity drive sexual selection in humpback whales? J. Acoust. Soc. Am. 154, A88 (2023).
- 73. Garland E. C., Rendell L., Lamoni L., Poole M. M., Noad M. J., Song hybridization events during revolutionary song change provide insights into cultural transmission in humpback whales. Proc. Natl. Acad. Sci. U.S.A. 114, 7822–7829 (2017).
- 74. Suzuki R., Buck J. R., Tyack P. L., Information entropy of humpback whale songs. J. Acoust. Soc. Am. 119, 1849–1866 (2006).
- 75. Pines H., Mapping the phonetic structure of humpback whale song units: Extraction, classification, and Shannon-Zipf confirmation of sixty sub-units. Proc. Mtgs. Acoust. 35, 010003 (2018).
- 76.Allen J. A., Garland E. C., Dunlop R. A., Noad M. J., Network analysis reveals underlying syntactic features in a vocally learnt mammalian display, humpback whale song. Proc. R. Soc. B 286, 20192014 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.G. Begus, R. Sprouse, A. Leban, M. Silva, S. Gero, Vowels and Diphthongs in sperm whales. OSF [Preprint] (2023). 10.31219/osf.io/285cs. [DOI]
- 78.Sharma P., Gero S., Payne R., Gruber D. F., Rus D., Torralba A., Andreas J., Contextual and combinatorial structure in sperm whale vocalisations. Nat. Commun. 15, 3617 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.L. Rendell, Are we really about to talk to whales?, The Conversation (2024). https://theconversation.com/are-we-really-about-to-talk-to-whales-229778.
- 80.Rohrmeier M., Zuidema W., Wiggins G. A., Scharff C., Principles of structure building in music, language and animal song. Philos. Trans. R. Soc. B 370, 20140097 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Hersh T. A., Gero S., Rendell L., Cantor M., Weilgart L., Amano M., Dawson S. M., Slooten E., Johnson C. M., Kerr I., Payne R., Rogan A., Antunes R., Andrews O., Ferguson E. L., Hom-Weaver C. A., Norris T. F., Barkley Y. M., Merkens K. P., Oleson E. M., Doniol-Valcroze T., Pilkington J. F., Gordon J., Fernandes M., Guerra M., Hickmott L., Whitehead H., Evidence from sperm whale clans of symbolic marking in non-human cultures. Proc. Natl. Acad. Sci. U.S.A. 119, e2201692119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Romagosa M., Nieukirk S., Cascão I., Marques T. A., Dziak R., Royer J.-Y., O’Brien J., Mellinger D. K., Pereira A., Ugalde A., Papale E., Aniceto S., Buscaino G., Rasmussen M., Matias L., Prieto R., Silva M. A., Fin whale song evolution in the North Atlantic. eLife 13, e83750 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Wood M., Širović A., Characterization of fin whale song off the Western Antarctic Peninsula. PLOS ONE 17, e0264214 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Best P., Marxer R., Paris S., Glotin H., Temporal evolution of the Mediterranean fin whale song. Sci. Rep. 12, 13565 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Schall E., Thomisch K., Boebel O., Gerlach G., Mangia Woods S., Roca I. T., Van Opzeeland I., Humpback whale song recordings suggest common feeding ground occupation by multiple populations. Sci. Rep. 11, 18806 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Schall E., Djokic D., Ross-Marsh E. C., Oña J., Denkinger J., Ernesto Baumgarten J., Rodrigues Padovese L., Rossi-Santos M. R., Carvalho Gonçalves M. I., Sousa-Lima R., Hucke-Gaete R., Elwen S., Buchan S., Gridley T., Van Opzeeland I., Song recordings suggest feeding ground sharing in Southern Hemisphere humpback whales. Sci. Rep. 12, 13924 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Martin C. R., Guazzo R. A., Helble T. A., Alongi G. C., Durbach I. N., Martin S. W., Matsuyama B. M., Henderson E. E., North Pacific minke whales call rapidly when calling conspecifics are nearby. Front. Mar. Sci. 9, 897298 (2022). [Google Scholar]
- 88.Macklin G. F., Moors-Murphy H. B., Leonard M. L., Characteristics and spatiotemporal variation of sei whale (Balaenoptera borealis) downsweeps recorded in Atlantic Canada. J. Acoust. Soc. Am. 155, 145–155 (2024). [DOI] [PubMed] [Google Scholar]
- 89.Cerchio S., Weir C. R., Mid-frequency song and low-frequency calls of sei whales in the Falkland Islands. R. Soc. Open Sci. 9, 220738 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Selbmann A., Miller P. J. O., Wensveen P. J., Svavarsson J., Samarra F. I. P., Call combination patterns in Icelandic killer whales (Orcinus orca). Sci. Rep. 13, 21771 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Terada T., Morisaka T., Wakabayashi I., Yoshioka M., Communication sounds produced by captive narrow-ridged finless porpoises (Neophocaena asiaeorientalis). J. Ethol. 40, 245–256 (2022). [Google Scholar]
- 92.Martin M. J., Ortiz S. T., Wahlberg M., Weir C. R., Peale’s dolphins (Lagenorhynchus australis) are acoustic mergers between dolphins and porpoises. J. Exp. Mar. Biol. Ecol. 572, 151977 (2024). [Google Scholar]
- 93.Arranz P., DeRuiter S. L., Stimpert A. K., Neves S., Friedlaender A. S., Goldbogen J. A., Visser F., Calambokidis J., Southall B. L., Tyack P. L., Discrimination of fast click series produced by tagged Risso’s dolphins (Grampus griseus) for echolocation or communication. J. Exp. Biol. 219 (Pt. 18), 2898–2907 (2016). [DOI] [PubMed] [Google Scholar]
- 94.Vachon F., Hersh T. A., Rendell L., Gero S., Whitehead H., Ocean nomads or island specialists? Culturally driven habitat partitioning contrasts in scale between geographically isolated sperm whale populations. R. Soc. Open Sci. 9, 211737 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Gero S., Whitehead H., Rendell L., Individual, unit and vocal clan level identity cues in sperm whale codas. R. Soc. Open Sci. 3, 150372 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Sharpe D. L., Castellote M., Wade P. R., Cornick L. A., Call types of Bigg’s killer whales (Orcinus orca) in western Alaska: Using vocal dialects to assess population structure. Bioacoustics 28, 74–99 (2019). [Google Scholar]
- 97.Lloyd G. T., Slater G. J., A total-group phylogenetic metatree for Cetacea and the importance of fossil data in diversification analyses. Syst. Biol. 70, 922–939 (2021). [DOI] [PubMed] [Google Scholar]
- 98.Amrhein V., Greenland S., McShane B., Retire statistical significance. Nature 567, 305–307 (2019). [DOI] [PubMed] [Google Scholar]
- 99.Milička J., Menzerath’s law: Is it just regression toward the mean? Glottometrics 55, 1–16 (2023). [Google Scholar]
- 100.Ferrer-i-Cancho R., Hernández-Fernández A., Baixeries J., Dębowski Ł., Mačutek J., When is Menzerath-Altmann law mathematically trivial? a new approach. Stat. Appl. Genet. Mol. Biol. 13, 633–644 (2014). [DOI] [PubMed] [Google Scholar]
- 101.Coupé C., Oh Y. M., Dediu D., Pellegrino F., Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche. Sci. Adv. 5, eaaw2594 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Rothe-Neves R., Bernardo B. M., Espesser R., Shortening tendency for syllable duration in Brazilian Portuguese utterances. J. Quant. Linguist. 25, 156–167 (2018). [Google Scholar]
- 103.D. Schusterová, J. Ščigulinská, M. Benešová, D. Faltýnek, O. Kučera, “An application of the Menzerath–Altmann law to contemporary spoken Chinese” in Menzerath-Altmann Law Applied (Univerzita Palackého v Olomouci, 2014), pp. 121–141. [Google Scholar]
- 104.Mercado E., Intra-individual variation in the songs of humpback whales suggests they are sonically searching for conspecifics. Learn. Behav. 50, 456–481 (2022). [DOI] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Supplementary Text
Figs. S1 to S8
Tables S1 to S6
DoReCo references
Correction (24 March 2025):
The DoReCo dataset requires that all languages analyzed be cited separately. The full list of DoReCo citations has been added to the end of the Supplementary Materials PDF, and main-text citations to the dataset have been updated for clarity in the HTML and PDF versions of the article. The original version of the Supplementary Materials is available here:






