PLOS One. 2025 Jun 11;20(6):e0325890. doi: 10.1371/journal.pone.0325890

AvianLexiconAtlas: A database of descriptive categories of English-language bird names around the world

Erin S Morrison 1,☯,*, Guinevere P Pandolfi 1, Stepfanie M Aguillon 2, Jarome R Ali 3, Olivia Archard 1, Daniel T Baldassarre 4, Illeana Baquero 1, Kevin F P Bennett 5, Kevin M Bonney 1, Riley Bryant 1, Rosanne M Catanach 6, Therese A Catanach 7, Ida Chavoshan 1, Sarah N Davis 8, Brooke D Goodman 4, Eric R Gulson-Castillo 9, Matthew Hack 9, Jocelyn Hudon 10, Gavin M Leighton 11, Kira M Long 12,¤, Ziqi Ma 1, Dakota E McCoy 13, J F McLaughlin 14, Gaia Rueda Moreno 1, Talia M Mota 1, Lara Noguchi 1, Ugo Nwigwe 1, Teresa Pegan 9, Kaiya L Provost 15, Shauna A Rasband 5, Jessie Frances Salter 16, Lauren C Silvernail 17, Jared A Simard 1, Heather R Skeen 18, Juliana Soto-Patiño 19, Young Ha Suh 20, Qingyue Wang 1, Matthew E Warshauer 21, Sissy Yan 1, Betsy Zalinski 1, Ziqi Zhao 1, Allison J Shultz 20,
Editor: Shoko Sugasawa 22
PMCID: PMC12157040  PMID: 40498755

Abstract

Common names of species are important for communicating with the general public. In principle, these names should provide an accessible way to engage with and identify species. The common names of species have historically been labile without standard guidelines, even within a language. Currently, there is no systematic assessment of how often common names communicate identifiable and biologically relevant characteristics about species. This is a salient issue in ornithology, where common names are used more often than scientific names for species of birds in written and spoken English, even by professional researchers. To gain a better understanding of the types of terminology used in the English-language common names of bird species, a group of 85 professional ornithologists and non-professional contributors classified unique descriptors in the common names of all recognized species of birds. In the AvianLexiconAtlas database produced by this work, each species’ common name is assigned to one of ten categories associated with aspects of avian biology, ecology, or human culture. Across 10,906 species of birds, 89% have names describing the biology of the species, while the remaining 11% of species have names derived from human cultural references, human names, or local non-English languages. Species with common names based on features of avian biology are more likely to be related to each other or be from the same geographic region. The crowdsourced data collection also revealed that many common names contain specialized or historic terminology unknown to many of the data collectors, and we include these terms in a glossary and gazetteer alongside the dataset. The AvianLexiconAtlas can be used as a quantitative resource to assess the state of terminology in English-language common names of birds. Future research using the database can shed light on historical approaches to nomenclature and how people engage with species through their names.

Introduction

Humans have observed and classified living organisms for thousands of years, across both cultures and languages [1]. The names of species can have consequences for how they are perceived by people, and this can, in turn, affect how people engage with species as part of education, research, and conservation efforts [2–6]. Constructing classification systems that delineate and name organisms can be subjective [5,7,8]. To standardize scientific animal names, the International Commission on Zoological Nomenclature (ICZN) has overseen the assignment of a unique universal scientific name for each recognized animal species since 1895. This scientific name is written as a binomial, often in Latin, and includes the genus and species names in the Linnaean taxonomy system [9]. The vernacular names of species, however, are not standardized across languages and have been labile across history, even within a single language or region [10–12]. Known as ‘common’ or, as argued by [13], ‘standard’ names, these vernacular names can communicate information about species in a way that is more accessible to a wider audience outside of the scientific community [3,14,15]. However, a common name may be unique to a single region, a species that spans cultures may take on multiple common names in the same language, and the same common name may be unofficially used for multiple species.

In ornithology, the common names of birds tend to be used more frequently in spoken and written English than their scientific binomial names, even among professionals [16]. Despite this, the English-language common names of birds are only standardized regionally by professional ornithology societies [17], and there is no universal set of rules for English-language common names [3,18–21]. Within the ornithological community, there is much reflection about what the terminology in standardized English-language common names should communicate about species [5,16,22–24]. Currently, some names directly describe characteristics of a species (e.g., Yellow-rumped Warbler, Setophaga coronata), while other names are ambiguous (e.g., Barnacle Goose, Branta leucopsis) or unrelated to the species’ biology (e.g., Wilson’s Warbler, Cardellina pusilla). To understand what English-language common names currently communicate about species of birds, it is necessary to comprehensively examine the scope and variability of the terminology used in these names.

While the history of English-language bird names has been extensively documented [25–28], there is currently no systematic resource of the types of terminology used in the English-language common names of birds. We therefore endeavored to inventory and classify the terms currently used as the specific descriptors in English-language common names of all avian species and assembled these data into a freely available database known as the AvianLexiconAtlas. In the database, the unique descriptor of each species name is assigned to one of ten distinct categories associated with aspects of avian physical traits, avian natural history, or human culture. To validate the utility of the dataset as a resource for professional ornithologists, linguists, and the general public to learn more about how humans name species, we summarized frequency differences among different categories across species, and also examined trends in the terminology associated with species’ close taxonomic relationships or shared biogeography. We use the results from these descriptive analyses to propose directions for future quantitative research using the database that could focus on the relationship these human-constructed names have with aspects of avian biology or examine historical trends in the terminology.

Additionally, since one goal of standardized common names is to be useful for the general public, data collection for this project presented a unique opportunity to involve data collectors from both within and beyond the academic ornithology community. The inclusion of undergraduate students and amateurs as data collectors was a chance to assess familiarity with the terminology used in English-language bird names. During data collection, data collectors were also asked to contribute to a glossary and a gazetteer when they encountered unfamiliar terminology. We make these resources and the dataset produced from their work available for future analyses on the state of avian nomenclature as the AvianLexiconAtlas.

Materials and methods

Ethics statement

The New York University Human Research Protection Program (HRPP) determined that the work involved in the data collection for the database did not meet the criteria for research involving human subjects per the United States’ regulations for the protection of human subjects (45 CFR 46.102(e)) [29]. This verbal and written determination was made because the focus of the research was not on collecting information about the data collectors, but only involved a crowdsourcing model where the data collectors helped categorize the meanings behind unique descriptors in bird names. As such, the data collection methods did not require HRPP review and approval.

Selection of the eBird/Clements checklist for species names

The English-language common names in the database are based on the taxonomy of 10,906 species in the 2022 eBird/Clements Checklist [30]. At the time of the data collection, there were two additional annually curated global checklists with similar, but not identical, taxonomy to the eBird/Clements Checklist. The International Ornithological Congress (IOC) Checklist 12.2 contained 11,140 species, and 86.9% of these species had the same scientific and common names in the eBird/Clements 2022 Checklist [20]. The Handbook of Birds of the World (HBW) and BirdLife International Checklist 7.0 contained 11,170 species, and 79.7% of these species had the same scientific and common names in the eBird/Clements 2022 Checklist [31]. We made the decision to use the 2022 eBird/Clements checklist based on its alignment with the comprehensive Birds of the World online database [32], the main reference source for the data collection. The eBird/Clements taxonomy used in Birds of the World incorporates the HBW and BirdLife International taxonomy but makes more conservative distinctions between species and subspecies (refer to [33] and [34] for more details).

Establishment of categories for species’ English-language common names

A majority of bird species’ English-language common names contain a unique descriptor followed by a name it shares with closely related species. For example, the unique descriptors in Common Ostrich (Struthio camelus) and Somali Ostrich (Struthio molybdophanes) are ‘Common’ and ‘Somali’, respectively, while the shared name is ‘Ostrich’. Only the unique descriptor in a species’ English-language common name, not the shared name, was categorized for the database.
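The split between unique descriptor and shared name can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the shared name is the final word of the common name; this is not the authors' procedure, the function name `split_common_name` is hypothetical, and multi-word shared names would need a lookup table of group names instead.

```python
from typing import Tuple


def split_common_name(common_name: str) -> Tuple[str, str]:
    """Split a common name into (unique descriptor, shared name).

    Illustrative assumption: the shared name is the last
    whitespace-separated word and everything before it is the descriptor.
    """
    words = common_name.split()
    if len(words) < 2:
        # A single-word name has no separate descriptor under this rule.
        return "", common_name
    return " ".join(words[:-1]), words[-1]


# The two ostrich examples from the text above.
for name in ["Common Ostrich", "Somali Ostrich"]:
    descriptor, shared = split_common_name(name)
    print(f"{name}: descriptor={descriptor!r}, shared name={shared!r}")
```

Under this simplification, only the first element of the returned pair would be passed on for categorization.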

Prior to the start of data collection, we ran a series of trial categorizations with a random sample of 30 species drawn from the eBird/Clements 2021 Checklist [35]. We started with a list of potential categories and three people independently attempted to categorize each of the species names without communicating with each other. Importantly, this group included two professional ornithologists and one undergraduate student without any formal training in ornithology. As a result of this trial, we established via consensus 10 categories for the unique descriptors that align with independent categorization for both common and scientific bird names [26,27] and encompass aspects of avian physical traits (physical trait in both sexes, male-only physical trait, female-only physical trait, size), avian natural history (behavior, geographic location, natural history), and human-centered terminology independent of the biology of the species in the English language (local language, miscellaneous, eponyms (named after a particular person)) (Table 1).

Table 1. Categories used to group unique descriptors in English common names of bird species.

| Category | Description |
|---|---|
| Avian physical traits | |
| Both sexes physical trait | Physical characteristics that are observable in both males and females of a species. Includes: size; colors and patterns of feathers, skin, beak, feet, and eyes; general terms related to plumage complexity (e.g., painted, beautiful, handsome, drab, ornate, plain). |
| Male physical trait | Physical characteristics that are only observable and clearly identifiable in males of a species. |
| Female physical trait | Physical characteristics that are only observable and clearly identifiable in females of a species. |
| Size | Relative physical size of the species (e.g., fairy, giant, greater, lesser, little, pygmy). |
| Avian natural history | |
| Behavior | Specific behavior associated with the species (e.g., vocalizations, personality characteristics such as shy or cryptic, or named after another species that has a similar behavior). |
| Geographic location | Specific geographic location. Includes: general geographic descriptors (eastern, western); regions named after local cultures or communities. |
| Natural history | Relates to habitat (e.g., mountain, grassland, highland, upland, lowland, paradise), nest type, diet of the species, smell, or nest location. |
| Human-centered terminology | |
| Local language | Phonetic English pronunciation of species name in a non-English language. |
| Miscellaneous | Descriptor that does not clearly fit into any of the other categories. |
| Person | Name of a person. |

Categorization of species common names

Data collectors.

A total of 85 people participated in the data collection for the AvianLexiconAtlas. Of these participants, 56 were pursuing undergraduate degrees (students), 7 had at least an undergraduate degree but no formal training in ornithology (amateurs), and 22 were pursuing graduate degrees or worked in post-graduate careers related to ornithology (professionals). Of the 56 undergraduate students, 50 were enrolled in an introductory biology seminar course for non-science majors at New York University during the Fall 2022 semester. All data collectors who categorized at least 100 species were eligible for authorship credit provided that they agreed to review the manuscript prior to submission for publication. Students in the course were given in-class participation credit for engaging in the data collection during two 75-minute class sessions. This participation grade counted towards 1.6% of their final grade.

Data collection.

Data collectors were split into two groups (A and B), and each group evaluated identical lists of 10,906 species. Each species name was therefore independently categorized by two different people. Each data collector only worked on one of the two datasets and was not given access to the other dataset. Given the large number of people involved in the data collection, categorizing the names of species in duplicate was built into the methodology in order to be able to validate the accuracy of each species’ category assignment and identify descriptors that were difficult to classify. Professionals, amateurs, and the six students not enrolled in the introductory biology course were randomly assigned to Dataset A or B based on the order in which they signed up for the project. The introductory biology course consisted of two sections of 25 students each, and each of these sections was assigned to a different dataset. A total of 39 individuals worked on Dataset A (28 students, 2 amateurs, and 9 professionals) and a total of 47 individuals worked on Dataset B (29 students, 5 amateurs, and 13 professionals). Data collectors signed up for 10 species at a time, partitioned by the order they were listed in the 2022 eBird/Clements Checklist. Groups of 10 species had to be categorized in sequential order and data collectors were not allowed to selectively choose groups to categorize.

To categorize the unique descriptor in a species’ name, data collectors first found the species’ account in the Birds of the World database and then determined the category assignment using the text, images, and media provided in the account. If the origin of the unique descriptor could not be determined from Birds of the World, data collectors would then conduct a broader internet search to find other potential sources. Any sources used outside of Birds of the World were recorded in the complete dataset. If the origin of the unique descriptor could not be determined based on information in Birds of the World or a subsequent internet search within 10 minutes, data collectors were asked to note that they could not identify the name within the given time limit and move on to the next species in their list.

To facilitate categorizations for the data collectors who were not familiar with common terms in ornithology, a glossary was developed for the project that included diagrams of common field markers and definitions of common ornithology descriptors of traits and colors. Participants were granted editing access to the glossary (S1 Appendix) and were asked to add any terms and definitions they had to look up during the data collection period. A gazetteer (S1 Appendix) that included descriptions of all geographic locations encountered in species’ names was also added to this glossary as a reference. In addition to these resources, data collectors were provided with two lists of eponyms documented in species’ English-language names compiled by Beolens and Watkins [25] and Bird Names For Birds [24].

Dataset reconciliation.

When the two duplicate datasets were compared, 906 out of the 10,906 species (8.3%) were assigned to different categories in the two datasets (mismatches). There were also 9 additional species that were designated as time limited by data collectors in both datasets. The majority of the 906 species that were assigned to different categories in Datasets A and B were labeled as a physical trait appearing in both sexes in one dataset and as a male-only physical trait in the other dataset (S1 Fig). This was followed by mismatches in category assignments between geographic location and life history, general descriptor and a physical trait in both sexes, and life history and behavior. These 915 species were subsequently re-categorized independently in three identical datasets (C, D, E) by 10 professional ornithologists who had also participated in the first round of data collection, each of whom was assigned to one of the three datasets. When the 915 species’ re-categorizations were compared across the three datasets, categories for 554 species (60.5%) matched across all 3 datasets and were retained for the final dataset. Categories for 324 (35.4%) of the re-categorized species matched in 2 out of 3 datasets, and the majority category was retained for these species in the final dataset. There was no category agreement across the three datasets for the remaining 37 (4.0%) re-categorized species. The final category for each of these species was jointly assigned by the lead authors of the study (ESM and AJS) based on the references provided by all those who had categorized the species in the first and second rounds.
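The two-round reconciliation logic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' workflow: the function names, the placeholder species identifiers, and the toy category assignments are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def find_mismatches(dataset_a: Dict[str, str], dataset_b: Dict[str, str]) -> List[str]:
    """Round 1: list species whose category differs between the duplicates."""
    return [sp for sp, cat in dataset_a.items() if dataset_b.get(sp) != cat]


def majority_category(votes: List[str]) -> Optional[str]:
    """Round 2: return the category assigned in at least two of three datasets.

    Returns None when all three disagree; in the study, those cases were
    decided jointly by the lead authors.
    """
    category, count = Counter(votes).most_common(1)[0]
    return category if count >= 2 else None


# Hypothetical toy assignments (placeholder species, not real data).
a = {"sp1": "Size", "sp2": "Person", "sp3": "Behavior"}
b = {"sp1": "Size", "sp2": "Geographic location", "sp3": "Natural history"}

mismatched = find_mismatches(a, b)
print(mismatched)

# Three independent re-categorizations of one mismatched species.
print(majority_category(["Person", "Person", "Geographic location"]))
print(majority_category(["Behavior", "Natural history", "Miscellaneous"]))
```

A unanimous vote and a 2-of-3 vote both return a category here; only the three-way disagreement falls through to manual resolution.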

Data validation

To validate the utility of the AvianLexiconAtlas dataset, we summarized the frequency of unique descriptors in species’ names and also examined phylogenetic and biogeographic categorization trends. All analyses were conducted in R 4.3 [36], and only considered the first word in the descriptor of species' English-language common names. The R package wordcloud2 0.2.1 [37] was used to generate visualizations of the relative frequencies of terminology used in the name descriptors of species across the dataset and across each of the three general categories.

To examine categorization trends among species across taxonomic groups in the database, we mapped the general categories assigned to each species’ name (i.e., avian physical traits, avian natural history traits, or human-centered terminology; Table 1) to a time-calibrated maximum clade credibility global avian phylogeny of 10,824 species [38]. The taxonomy in this phylogeny is based on the eBird/Clements 2021 checklist. The R package clootl 0.0.0.900 [39] was used to extract the phylogeny, and the final tree was pruned to only include the 10,775 species with common names that occur in both the 2021 and 2022 checklists (S3 Fig.). We measured phylogenetic signals for each of these categories using Fritz & Purvis’ D for binary traits (presence/absence) [40], using the package caper 1.0.3 [41] in R. D calculates the number of sister-clade differences in a binary trait for a given phylogeny [38] (S1 Table). An estimated D close to 1 represents a random distribution of a binary trait among related species on the phylogeny, while an estimated D close to 0 represents a clumped distribution of a binary trait among related species that would be expected under the Brownian motion model of evolution [40]. To test for significance, D was estimated for two different trait distributions for each trait that were simulated on the tips of the same phylogeny based on (1) randomly reshuffling the trait values and (2) trait evolution under Brownian motion. Each simulation was repeated 1,000 times.
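For readers unfamiliar with the statistic, D as defined by Fritz and Purvis [40] is usually written in the form below. The subscripts are notation chosen for this sketch: the numerator and denominator compare the observed sum of sister-clade differences with the mean sums obtained from the Brownian motion simulations (b) and the random reshuffles (r).

```latex
D = \frac{\sum d_{\mathrm{obs}} - \overline{\sum d_{\mathrm{b}}}}
         {\overline{\sum d_{\mathrm{r}}} - \overline{\sum d_{\mathrm{b}}}}
```

Written this way, an observed sum equal to the random expectation gives D = 1, and an observed sum equal to the Brownian expectation gives D = 0, which matches the interpretation given above.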

To identify any geographic categorization trends, we mapped all species names to the IOC World Bird List 13.2 [42]. We extracted the “general region” for the breeding range of each species from the range description data included in the IOC World Bird List 13.2 (S1 Appendix). The IOC World Bird List compiles range descriptions for each species based on several authoritative sources and classifies these geographic ranges into general regions, which are generally at the subcontinent level, but also include a separate classification for primarily oceanic species [43]. Species with a breeding range in more than one general region were assigned to each region. We acknowledge the importance of the non-breeding range, but the lack of standardized non-breeding range information across species prevented us from assigning species to a general region for the non-breeding range. To increase interpretability, we grouped all oceans together into a single category. We dropped species with a worldwide distribution (n = 10), species with an Antarctic distribution (n = 12; sample size too small for analyses), and 110 species that could not be mapped due to taxonomic differences, leaving a total of 12,058 species-general region assignments in this dataset. To test whether general regions varied in the proportions of general descriptor categories (avian physical traits, avian natural history traits, or human-centered terminology), we conducted a Chi-square test using the chisq.test function in R.
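The test of independence between region and descriptor category can be illustrated with a small pure-Python sketch of the Pearson chi-square statistic. This is not the authors' R code, the function name is hypothetical, and the counts below are invented toy values rather than data from the paper.

```python
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Rows are regions and columns are general descriptor categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if region and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT from the paper): 2 regions x 3 categories.
toy_counts = [
    [60, 30, 10],
    [40, 45, 15],
]
stat, dof = chi_square_independence(toy_counts)
print(f"chi-square = {stat:.2f}, df = {dof}")
```

The degrees of freedom depend only on the table shape: a table with eight regions and three categories has (8 - 1) x (3 - 1) = 14.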

Results

Database implementation

The AvianLexiconAtlas database consists of a dataset with the final consensus categorization of the unique descriptor in the English-language common name for all 10,906 species of birds in the eBird/Clements 2022 taxonomy checklist as well as the comprehensive glossary and gazetteer compiled by the contributors to the data collection (S1 Appendix). The database also includes the raw dataset that documents all category assignments for each species’ name across the duplicate datasets (A and B) from round 1 for all 10,906 species and the triplicate datasets (C, D, E) from round 2 for the 915 species without a consensus category in round 1. Downloadable CSV files of the datasets (final decisions and raw data) and PDF versions of the glossary and gazetteer are freely accessible on the AvianLexiconAtlas GitHub site (S1 Appendix, https://github.com/ajshultz/AvianLexiconAtlas). To allow for different modes of access to the database information, the GitHub site also contains read-only links to Google Sheets and Google Docs versions of both of the datasets and the Glossary and Gazetteer document, respectively. With the establishment of the protocol for name categorization, our long-term vision for the database is that it will continue to track annual revisions to taxonomy and English-language common names in the eBird/Clements taxonomy through a yearly database update.

Dataset description

Of the 10,906 species categorized, 57% are named based on avian physical traits, 32% are named based on avian natural history, and 11% are named based on human-centered terminology unrelated to the biology of the species (Fig 1; Table 2; S1 Appendix). Within avian physical traits, most species are named after traits present in both sexes, although nearly 1,000 species are named after traits only found in males, whereas only 20 species are named after traits only found in females (Fig 1; Table 2). Within avian natural history, most species are named after their geographic location compared to other natural history attributes or behavior (Fig 1; Table 2), and in human-centered terminology, most species are named after people (eponyms), compared to local language or miscellaneous categorizations (Fig 1; Table 2). There is variation across categories for the number of times a term was repeated across species, with local language having the greatest proportion of distinct terms, and size having the smallest proportion of distinct terms (Table 2).

Fig 1. Counts of English-language bird common names per descriptor category in the final dataset of 10,906 bird species.

Categories are colored by avian physical traits (blue), avian natural history (orange), and human-centered terminology unrelated to the biology of a species (green).

Table 2. Numbers and proportions of unique descriptors as common names in different categories. Calculations for the mean and median frequencies of unique descriptors only include descriptors that occur more than once among species names in each category.

| Category | General category | N species | N unique descriptors | Proportion unique descriptors | Mean frequency of unique descriptors | Median frequency of unique descriptors |
|---|---|---|---|---|---|---|
| Both sexes physical trait | Avian physical trait | 4926 | 1530 | 0.31 | 6.30 | 3 |
| Male physical trait | Avian physical trait | 933 | 517 | 0.55 | 3.62 | 3 |
| Female physical trait | Avian physical trait | 20 | 19 | 0.95 | 2.00 | 2 |
| Size | Avian physical trait | 355 | 28 | 0.08 | 22.80 | 13 |
| Geographic location | Avian natural history | 2852 | 739 | 0.26 | 7.02 | 4 |
| Natural history | Avian natural history | 455 | 196 | 0.43 | 4.32 | 2 |
| Behavior | Avian natural history | 197 | 145 | 0.74 | 3.00 | 2.5 |
| Person | Human-centered terminology | 810 | 505 | 0.62 | 3.19 | 2 |
| Local language | Human-centered terminology | 102 | 101 | 0.99 | 2.00 | 2 |
| Miscellaneous | Human-centered terminology | 256 | 112 | 0.44 | 7.26 | 4 |
| Total dataset | | 10906 | 3499 | 0.32 | 5.89 | 3 |
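The proportions in Table 2 and the general-category percentages reported in the text can be recomputed from the counts. The following Python sketch uses the species and descriptor counts transcribed from Table 2; the variable names are arbitrary and the code is an illustration, not part of the published analysis.

```python
# (category, general category, N species, N unique descriptors) from Table 2
TABLE_2 = [
    ("Both sexes physical trait", "Avian physical trait", 4926, 1530),
    ("Male physical trait", "Avian physical trait", 933, 517),
    ("Female physical trait", "Avian physical trait", 20, 19),
    ("Size", "Avian physical trait", 355, 28),
    ("Geographic location", "Avian natural history", 2852, 739),
    ("Natural history", "Avian natural history", 455, 196),
    ("Behavior", "Avian natural history", 197, 145),
    ("Person", "Human-centered terminology", 810, 505),
    ("Local language", "Human-centered terminology", 102, 101),
    ("Miscellaneous", "Human-centered terminology", 256, 112),
]

total_species = sum(n_species for _, _, n_species, _ in TABLE_2)

# Proportion of unique descriptors within each category.
for category, _, n_species, n_unique in TABLE_2:
    print(f"{category}: {n_unique / n_species:.2f}")

# Species counts summed by general category, as a share of all species.
by_general = {}
for _, general, n_species, _ in TABLE_2:
    by_general[general] = by_general.get(general, 0) + n_species

for general, n_species in by_general.items():
    print(f"{general}: {n_species} species ({100 * n_species / total_species:.0f}%)")
```

The total row of Table 2 is not the sum of the per-category descriptor counts, which is expected if the same descriptor can be assigned to different categories in different species.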

We next investigated the frequencies of the name descriptors (for this purpose, we considered the first word in the name as the descriptor) across the dataset and across categories. The five most common descriptors in the dataset are “African” (69 times), “Common” (68 times), “Lesser” (65 times), “Great” (65 times), and “Black” (64 times) (Fig 2). We note that after the size-related terms, many of the commonly used physical trait descriptors are related to plumage color (e.g., black, white-browed, spotted) (S2 Fig). The natural history-related common terms are geography related, especially regarding cardinal directions (e.g., northern, southern) (S2 Fig). The human-centered terminology has very few commonly used terms, except for the term “Common” (S2 Fig).
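Tallying first-word descriptors is a simple counting exercise. The sketch below shows one way to do it in Python with `collections.Counter`; it is not the authors' R code, and the short list of names is a hand-picked illustration (several taken from examples in the text) rather than the dataset itself.

```python
from collections import Counter


def first_word(common_name: str) -> str:
    """Return the first whitespace-separated word of a common name."""
    return common_name.split()[0]


# A small illustrative sample of names, not the full checklist.
names = [
    "Common Ostrich",
    "Somali Ostrich",
    "Common Loon",
    "Common Raven",
    "Yellow-rumped Warbler",
    "Barnacle Goose",
    "Maroon Pigeon",
]

counts = Counter(first_word(name) for name in names)

# Most frequent first words, in descending order of frequency.
for word, n in counts.most_common(3):
    print(word, n)
```

Hyphenated descriptors such as "Yellow-rumped" count as a single word under this rule, which is consistent with terms like "white-browed" appearing among the common descriptors.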

Fig 2. Word cloud of most common descriptors across the dataset.

Word sizes are scaled to the number of times they are repeated, and descriptors are colored according to their general category: avian physical trait (blue), avian natural history trait (orange), and human-centered terminology (green).

Phylogenetic and geographic categorization trends in the dataset

All three general categories of species names exhibit weak phylogenetic signal across 10,775 species (Figs 3, S3, S1 Table). Species names associated with avian physical traits (D = 0.729) and avian natural history traits (D = 0.745) have somewhat stronger phylogenetic signal than names associated with human-centered terminology (D = 0.875). Based on simulation tests of the trait distributions, the phylogenetic signal D of each category differs significantly (P < 0.001) from what would be expected under Brownian motion (D = 0), but also differs significantly (P < 0.001) from what would be expected from a random trait distribution on the phylogeny (D = 1) for each category.

Fig 3. Cladogram of 249 families of birds.

The location of each family branch was determined by a representative species in a time calibrated phylogeny of 10,775 species of birds [38]. At the tips, the shaded lines represent the proportion of species in each family with English common names associated with the general categories avian physical traits (blue), avian natural history (orange), and human-centered terminology unrelated to the biology of the species (green) (Table 1). See S3 Fig for species-specific name categories mapped to phylogenetic relationships.

The proportion of general descriptor categories is significantly different across geographic regions (Fig 4; χ2 = 940.43, df = 14, P < 0.0001). South America and Middle America show the highest proportions of species named after avian physical traits (0.74 and 0.72, respectively), and Oceans show the lowest proportion (0.26). Oceans show the highest proportion of species named after avian natural history (0.58) and Middle America and South America show the lowest (0.20 and 0.19, respectively). Africa and Oceans show the highest proportions of species named after human-centered terminology (both 0.17) and South America shows the lowest (0.06).

Fig 4. The frequency of general descriptor categories in species’ breeding ranges.

General categories are avian physical traits (blue), avian natural history (orange), and human-centered terminology unrelated to the biology of a species (green).

Discussion

The AvianLexiconAtlas contains assignments for the descriptive English-language common names of 10,906 bird species into 10 categories. In addition to the category assignments themselves, the AvianLexiconAtlas contains all references for the assignments based on information found outside of the Birds of the World database, a glossary of common terminology and a gazetteer of geographic locations used in English-language common names. We find that 89% of species are named after some aspect of their biology, whether it be their appearance, ecology, behavior, or geographic location. We establish that the most common way to name species is after a physical trait, usually a physical trait that is found in both sexes. We also find a tendency for the names of species that are related to each other or from similar geographic areas to share similar types of descriptors. The assembly of this database, which was a successful collaboration among professional ornithologists, amateurs, and students, has also led to insights about the nature of term recognition and the difficulty, in some cases, of understanding the meaning behind descriptors, including some that many professional ornithologists would have considered to be general knowledge. These initial observations establish the utility of the AvianLexiconAtlas for future research into historical and biological patterns in avian nomenclature, which we will outline here.

We find that physical traits are the most frequently applied category, and one potential explanation could be centuries of specimen-based taxonomic descriptions. Many species were originally formally described primarily by scientists working with specimens in collections, whether in museums or private collections. Specimens themselves typically only have information about physical traits and size, though well-curated metadata should also reveal geographic location, natural history, and, on occasion, behaviors. If the original historical basis of species taxonomy is specimen-based, future work using the database might investigate whether both scientific nomenclature and English-language nomenclature are biased toward descriptions from specimen-associated information.

An additional observation from the database is that the frequency with which types of physical trait descriptors are used in common names in the database corresponds to documented patterns of avian trait diversity. Among species that are named after their physical traits, the majority have common names that describe a trait that occurs in both sexes (Fig 1). This outcome aligns with documented evidence that most bird species do not display any sex differences in plumage brightness and/or pattern [44]. Most common names that do describe sex-specific traits of species, however, refer to traits that only occur in the males, and rarely identify traits unique to the females (Fig 1). One potential reason for this discrepancy is the tendency for humans to focus on more elaborate features that are more commonly found in males, which has been proposed as one explanation for the biases towards male specimens in museum collections [45]. Another explanation for the bias towards naming species after traits found only in males is that, due to intersexual selection, many males have more elaborate and faster evolving physical traits than females in dichromatic species [44,46–49]. Males of species in lineages characterized by a large degree of sexual dichromatism may have more unique and prominent traits that can be used as descriptive identifiers compared to females of the same lineage. Female plumage color does vary across species within lineages, but it tends to converge on cryptic colors and patterns that are difficult to distinguish, and thus these traits may not be as easy to use for identification markers between species [46,50–52]. Future work using the database could compare naming patterns to existing data on the degree of sexual dimorphism in species [44,46,53]. It is likely that species that are named after single-sex traits will be more sexually dimorphic than species that are named after both-sex traits, but it remains to be seen whether this pattern would hold for species named after behaviors or other potentially less sex-specific aspects of biology.

When we investigated whether the names of related species tend to be associated with the same categories, we found that avian-centered name categories have relatively stronger phylogenetic signals than human-centered name categories. This result aligns with previous work establishing that the types of traits used to name species often have phylogenetic signals, including color [54–57], song [58–60], and ecology [48,61,62]. Anecdotally, participants noticed that closely related species have similar naming schemes, particularly in species-rich groups found in the Neotropics. For example, of the 45 species of Grallaria antpittas, 22 are named after physical traits and 18 after geographic locations. This is also demonstrated by geographic trends, which show that the large majority of species in Middle and South American geographic regions are named after physical traits. Conversely, oceanic species are most commonly named after their natural history (primarily geographic location), likely reflecting that many ocean-dwelling species are difficult to tell apart by plumage or other morphological traits, but live in different geographic areas. Future work using the database could further investigate fine-scale variation in specific clades (i.e., orders or families) or geographies.

In addition to the category assignments, the use of crowdsourcing and duplicate datasets in the data collection for the database also provided an important perspective on the accessibility of English-language common names. We found that most of the mismatches between the duplicate datasets arose because, in some cases, it is difficult to distinguish whether a physical trait occurs in both sexes of a species or only in males (S1 Fig). In the Maroon Pigeon (Columba thomensis), for example, parts of the male’s plumage are described in Birds of the World as “rich maroon”, while female coloration is described as “duller… with only hints of maroon”, and this distinction is difficult to make out in the provided images [63]. For issues such as this, we had to decide the extent to which a trait would have to occur in both sexes for it to be assigned to that category, and much of this was up to personal interpretation. The decision to independently categorize the mismatched species again in triplicate was an opportunity for consensus agreement in most species, but disagreements across all three datasets still remained for some species’ names. Another issue that likely contributed to category mismatches was the use of historical and specialized English-language terminology in species names. Diagrams of the names and locations of parts of the bird were provided to all data collectors at the start of the project, but it was the terminology used to describe colors and patterns that proved to be one of the biggest issues (for example, “painted”, “festive”, or “glowing”). These types of descriptors comprise a majority of the words in the glossary that all participants were asked to contribute to when they encountered an unknown term. Similarly, the gazetteer was started during the first round of data collection when both professional and non-professional participants struggled to find information on outdated or rarely used names for geographic locations.
This raises the question of how useful specialized terms for color, pattern, and geographic location are in English-language common names, even if they are very specific descriptors, when many people do not know what they mean without further research. The English language is constantly evolving at regional and global scales [64,65] and the way colors are categorized and described can vary among people [66]. Furthermore, the diversity of color terms observed in the dataset is likely associated with how often general categories of color (white/black, red, green/yellow, blue, brown, purple/pink/orange/grey) are referenced across species [67,68]. These issues highlight the lack of consistency that is inherent in many of the trait descriptors used in common names, and could be analyzed as a type of data in and of itself in the future. For example, further examination of the types of terminology used in the 915 species’ names that required a second round of categorization, as well as the terms listed in the glossary and gazetteer, may be useful for current discussions surrounding changes to some English-language common names.

This database is the first comprehensive source for further quantitative examinations of the types of terminology used in the common names of birds, and it provides a starting point for future development of the AvianLexiconAtlas. First, the categories were intentionally designed to broadly capture distinct themes in the terminology used for English-language common names, with a particular focus on comparing how often terminology is associated with different properties of avian biology or human culture. Future expansions of the AvianLexiconAtlas database could subdivide these categories into more specific subcategories. For example, Burrowing Owls (Athene cunicularia) and Barking Owls (Ninox connivens) are both named after behaviors, but the former describes nesting habits and the latter describes vocalizations. Likewise, Golden-crowned Sparrows (Zonotrichia atricapilla) and White-throated Sparrows (Zonotrichia albicollis) are both named after physical traits (both sexes), specifically color, but the colors themselves and body parts are different. Striped Sparrows (Oriturus superciliosus), on the other hand, are named after a pattern that does not appear to be specific to any given body part. Now that the AvianLexiconAtlas has established which English-language common names are associated with avian biology, future research can provide insight into what specific types of biological descriptions are used in these common names that presumably should help to distinguish between species.

The database was created based on the eBird/Clements Checklist [30], but expanding to other checklists, especially those more regionally focused or based in other languages, could provide additional insights. For example, the species officially known as the Jamaican Spindalis (Spindalis nigricephala) in the eBird/Clements 2022 Checklist has several local Jamaican names, including Mark Head, Cashew Bird, Silver Head, Spanish Quail, and Champa Beeza [69]. Future work could investigate how approaches to English-language common names vary regionally by comparing the categorizations of species’ officially recognized English-language names in the AvianLexiconAtlas to local, or alternative, English-language names on a regional scale. Lastly, the database is currently limited to the categorizations of common names of birds in American (U.S.) English, which is the standard language of the eBird/Clements Checklist [34]. Historically, European and, later, American naturalists played outsized roles in the development of modern taxonomy that occurred alongside the expansion of Western imperialism around the world [1,16]. The relative distributions of categories for the unique descriptions of species’ English-language common names included in the database therefore mostly capture the philosophy of Western science in species nomenclature. Expanding the database and repeating the same methodology for common names in different languages, and comparing the frequency of different types of descriptors used for the same species’ names across multiple languages, would provide valuable insight into how approaches to avian nomenclature and the philosophy surrounding it vary across cultures. For example, some species might have many names in other languages in culturally diverse regions [70].

The AvianLexiconAtlas demonstrates that while common names of birds serve many purposes in English, there is a strong emphasis on descriptive characteristics associated with avian biology. The database highlights gaps in naming conventions, particularly for descriptors that are not associated with all individuals in a population or include specialized English-language terminology. We anticipate that further work using the database to investigate common group names might find very different patterns, as there are different goals for the components of a name (i.e., to tell species apart or to identify groups of species). Furthermore, future work using the database could more specifically analyze different geographic and phylogenetic trends in common names in relation to the date of the species descriptions, where the species was first described (e.g., from a museum specimen or in the wild), the person the species description is attributed to, and types of life history or physical similarities across species. Investigations such as these would be useful for disentangling the influence of human history and avian biology on these observed patterns in avian nomenclature. The basic analyses we provided to describe the database barely scratch the surface of what is possible for investigating trends in common names. Even beyond the myriad ways in which the database could be expanded, such as with additional taxonomies, languages, or groups of organisms, the AvianLexiconAtlas is a tool for researchers and amateur ornithologists alike. Researchers might associate the categorizations with other types of life history data (e.g., migratory patterns or sexual dichromatism), or conduct fine-scale analyses of specific clades or geographic locations. The amateur ornithologist might be curious about the history of a particular species name, and the AvianLexiconAtlas could help clarify its meaning or origins.
This database is a rich resource that will enable a wide variety of future work addressing the extent to which common names represent how people interact with birds today compared with historical interactions and across different cultural, biological, and regional contexts.

Supporting information

S1 Fig. Summary of mismatched category assignments in the 906 species common names assigned to different categories in Dataset A and Dataset B.

S2 Fig. Word cloud frequencies of terminology in common names.

Word cloud frequencies of (A) avian physical traits, (B) avian natural history traits, and (C) human-centered terminology. Names are scaled according to frequencies in each dataset.

S3 Fig. Cladogram of the categories assigned to the English common names of 10,775 species of birds.

Categories of the English common names identified at the tips of the branches, based on color. Cladogram adapted from [38]. The inner circle includes species names associated with the general category of avian physical traits (both sexes physical trait, male physical trait, female physical trait, size). The middle circle includes species names associated with the general category of avian natural history (behavior, geographic range, natural history), and the outer circle includes names associated with the general category of human-centered terminology unrelated to the biology of the species. See Table 1 for detailed explanations of each category.

S1 Table. Calculation of D statistic for the phylogenetic structure of categories.

Results of Fritz & Purvis’ [40] D statistic calculations for the phylogenetic structure of each of the grouped categories: physical traits, natural history, and human-centered terminology. For each grouped category, species assigned to the category were represented by a state of 1 and the remaining species assigned to other categories were represented by a state of 0. D is calculated by scaling the observed sum of sister-clade differences, $\Sigma d_{obs}$, with the mean values of the sum of sister-clade differences for 1,000 simulated trait distributions on the tips of the same phylogeny based on randomly reshuffling the trait values, $\Sigma d_{r}$, and trait evolution under Brownian motion, $\Sigma d_{b}$: $D = [\Sigma d_{obs} - \mathrm{mean}(\Sigma d_{b})] / [\mathrm{mean}(\Sigma d_{r}) - \mathrm{mean}(\Sigma d_{b})]$. An estimated D close to 1 represents a random distribution of a binary trait among related species on the phylogeny, while an estimated D close to 0 represents a clumped distribution of a binary trait among related species that would be expected under the Brownian motion model of evolution. Calculations were completed using the R package caper 1.0.3 [41].
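
As a minimal sketch of the scaling step only, the Python function below takes an observed sum of sister-clade differences and two lists of simulated sums, one from random reshuffling and one from Brownian motion, and returns D as the observed sum minus the Brownian mean, divided by the random mean minus the Brownian mean. The function name `d_statistic` and all input numbers are hypothetical and chosen for illustration; this is not the caper implementation or the authors' code, and computing the sums themselves requires the phylogeny, which is not shown here.

```python
from statistics import mean


def d_statistic(sum_obs, sums_random, sums_brownian):
    """Scale an observed sum of sister-clade differences into D.

    sum_obs       -- observed sum of sister-clade differences
    sums_random   -- sums from simulations that reshuffle trait values
    sums_brownian -- sums from simulations under Brownian motion
    (Hypothetical helper for illustration only.)
    """
    mean_random = mean(sums_random)
    mean_brownian = mean(sums_brownian)
    if mean_random == mean_brownian:
        raise ValueError("random and Brownian means are equal; D is undefined")
    return (sum_obs - mean_brownian) / (mean_random - mean_brownian)


# Made-up sums: the observed value lies halfway between the two expectations.
d = d_statistic(30.0, [38.0, 40.0, 42.0], [18.0, 20.0, 22.0])
print(d)  # 0.5
```

With these made-up inputs, an observed sum equal to the Brownian mean would give D = 0 (clumped), and one equal to the random mean would give D = 1 (random), matching the interpretation in the caption.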

S1 Appendix. AvianLexiconAtlas Database Files.

The data, glossary, and gazetteer reported in this article can be accessed at https://github.com/ajshultz/AvianLexiconAtlas.


Acknowledgments

We thank Birds of the World for providing complimentary 1-week subscriptions for the student and amateur participants. We also thank Eden Bayou, Jacob Bijou, Jakiya J. Campbell, Emily Chen, Cindy Cheng, Reema Demopoulos, Savannah Garza, Tamar Hadad, Yujing He, Mia Hejlsberg, Jikke Inia, Yoonseo Jang, Zilai Jin, Risa Kanai, Nick Kruczynski, Langrun Li, Yujia Liu, Queenie Liu, Ziqi Ma, Nyjur Majok, Cecilia Méndez, Kimi Modiri, Yazmin Munoz, Iyioluwa Okediji, Sophia Ordonez, Mohammed Osmanu, Yian Pan, Wend-manegde Pitroipa, Stephanie Salas, Rylie Shaeffer, Andrew Shafer, Paul Shen, Jesse Sivan, Ciaran Timlin, Bryant To, Alexander Valenzuela-Jones, Frances Vandervoort, Nicholas Walsh, Qingyue Wang, Ruoxi Ye, Yawei Zhang, Guicheng Zhang, Yuchong Zhu, and Brian Zou for their contributions to the data collection. Thanks to Emily Webb, Kayce Bell, Jann Vendetti, the NHMLAC Urban Nature Research Center members, and two anonymous reviewers for providing comments on drafts of the manuscript.

Data Availability

The data, glossary, and gazetteer reported in this article can be accessed at https://github.com/ajshultz/AvianLexiconAtlas.

Funding Statement

Funding was provided by the New York University Liberal Studies New Faculty Scholarship Award to E.S.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Raven PH, Berlin B, Breedlove DE. The origins of taxonomy. Science. 1971;174(4015):1210–3. doi: 10.1126/science.174.4015.1210
  • 2. Karaffa PT, Draheim MM, Parsons ECM. What’s in a name? Do species’ names impact student support for conservation? Hum Dimens Wildl. 2012;17(4):308–10. doi: 10.1080/10871209.2012.676708
  • 3. Ehmke G, Fitzsimons JA, Garnett ST. Standardising English names for Australian bird subspecies as a conservation tool. Bird Conserv Int. 2018;28(1):73–85.
  • 4. Gregg EA, Bekessy SA, Martin JK, Garrard GE. Many IUCN red list species have names that evoke negative emotions. Hum Dimens Wildl. 2020;25(5):468–77.
  • 5. Heard SB. The name of the rose (and everything else): How codes and practices in naming biological species reflect cultural identities. In: Nick IM, editor. Names naming law. Routledge; 2023.
  • 6. Heard SB, Mlynarek JJ. Naming the menagerie: creativity, culture and consequences in the formation of scientific names. Proc Biol Sci. 2023;290(2010):20231970. doi: 10.1098/rspb.2023.1970
  • 7. Bailenson JN, Shum MS, Atran S, Medin DL, Coley JD. A bird’s eye view: biological categorization and reasoning within and across cultures. Cognition. 2002;84(1):1–53. doi: 10.1016/s0010-0277(02)00011-2
  • 8. Thiele KR, Conix S, Pyle RL, Barik SK, Christidis L, Costello MJ. Towards a global list of accepted species I. Why taxonomists sometimes disagree, and why this matters. Org Divers Evol. 2021;21(4):615–22.
  • 9. Ride WDL, Cogger HG, Dupuis C, Kraus O, Minelli A, Thompson FC, et al. International code of zoological nomenclature [Internet]. 2012. [cited 2024 Nov 25]. Available from: https://www.iczn.org/the-code/the-international-code-of-zoological-nomenclature/
  • 10. McAtee WL. Longevity of bird names. Names. 1953;1(2):85–102.
  • 11. Kitson PR. Old English bird‐names. Engl Stud. 1997;78(6):481–505.
  • 12. Caramaschi U, De Sá RO, Heyer WR. Common names for the frog genus Leptodactylus (Amphibia, Anura, Leptodactylidae). Herpetol Rev. 2005;36(2):119–20.
  • 13. Crother BI. Standard language (insert language of choice) names versus common names. Herpetol Rev. 2007;38(2):143.
  • 14. Link‐Pérez MA, Dollo VH, Weber KM, Schussler EE. What’s in a name: Differential labelling of plant and animal photographs in two nationally syndicated elementary science textbook series. Int J Sci Educ. 2010;32(9):1227–42.
  • 15. García-de-Lomas J, Clavero M, García CM, Alba D, Torres JM. From Linderiella baetica to gambilusa: Involving children in conservation by giving a new species a common name. Aquat Conserv Mar Freshw Ecosyst. 2021;31(6):1543–7.
  • 16. Driver RJ, Bond AL. Towards redressing inaccurate, offensive and inappropriate common bird names. Ibis. 2021;163(4):1492–9.
  • 17. Schulenberg TS, Iliff MJ. Methods: Updating the eBird/Clements Checklist 6th Edition [Internet]. Ithaca, NY: Cornell Lab of Ornithology; 2014. [cited 2024 Nov 25]. Available from: https://www.birds.cornell.edu/clementschecklist/about/methods/
  • 18. Schodde R, Glover B, Kinsky FC, Marchant S, McGill AR, Parher SA. Recommended English names for Australian birds. Emu - Austral Ornithol. 1978;77(sup1):245–307.
  • 19. Chesser RT, Billerman SM, Burns KJ, Cicero C, Dunn JL, Hernández-Baños BE, et al. Check-list of North and Middle American birds [Internet]. American Ornithological Society; 2024. [cited 2024 Nov 25]. Available from: https://checklist.americanornithology.org/taxa/
  • 20. Gill F, Donsker D, Rasmussen P. English names: Principles [Internet]. IOC World Bird List v14.2; 2024 Aug 17 [cited 2024 Nov 25]. Available from: https://www.worldbirdnames.org/new/english-names/principles/
  • 21. Remsen JV Jr, Areta JI, Bonaccorso E, Claramunt S, Del-Rio G, Jaramillo A, et al. A classification of the bird species of South America [Internet]. Louisiana State University Museum of Natural Science; 2024 Nov 18 [cited 2024 Nov 25]. Available from: https://www.museum.lsu.edu/~Remsen/SACCBaseline.htm
  • 22. Guedes P, Alves-Martins F, Arribas JM, Chatterjee S, Santos AMC, Lewin A, et al. Eponyms have no place in 21st-century biological nomenclature. Nat Ecol Evol. 2023;7(8):1157–60. doi: 10.1038/s41559-023-02022-y
  • 23. Liu IA, Gulson-Castillo ER, Wu JX, Demery AJC, Cortes-Rodriguez N, Covino KM. Building bridges in the conversation on eponymous common names of North American birds. Ibis. 2024;166(3):1092–102.
  • 24. Bird Names For Birds [Internet]. 2020. [cited 2024 Nov 25]. Available from: https://birdnamesforbirds.wordpress.com/
  • 25. Beolens B, Watkins M. Whose bird? Common bird names and the people they commemorate. Yale University Press; 2004.
  • 26. Jobling JA. The Helm dictionary of scientific bird names: From Aalge to Zusii. London: Christopher Helm; 2010.
  • 27. Meiter GH. Bird is the word: An historical perspective on the names of North American birds. Newark, Ohio: The McDonald & Woodward Publishing Company; 2020.
  • 28. Myers S. The bird name book: A history of English bird names. Princeton, NJ: Princeton University Press; 2022.
  • 29. National Archives Code of Federal Regulations. 45 CFR part 46 - Protection of human subjects [Internet]. eCFR; 2025 Mar 12 [cited 2025 Mar 14]. Available from: https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-A/part-46
  • 30. Clements JF, Schulenberg TS, Iliff MJ, Federicks TA, Lepage D, Billerman SM, et al. The eBird/Clements checklist of birds of the world: v2022 [Internet]. Ithaca, NY: Cornell Lab of Ornithology; 2022 Oct [cited 2024 Nov 25]. Available from: https://www.birds.cornell.edu/clementschecklist/updateindex/october-2022/
  • 31. HBW and BirdLife International. Handbook of the Birds of the World and BirdLife International Digital Checklist of the Birds of the World [Internet]. 2022. [cited 2024 Nov 25]. Available from: https://datazone.birdlife.org/about-our-science/taxonomy
  • 32. Billerman SM, Keeney BK, Rodewald PG, Schulenberg TS, editors. Birds of the world [Internet]. Ithaca, NY: Cornell Laboratory of Ornithology; 2024. [cited 2024 Nov 25]. Available from: https://birdsoftheworld.org/bow/home
  • 33. Birds of the World. Taxonomy [Internet]. Ithaca, NY: Cornell Laboratory of Ornithology; 2024. [cited 2024 Nov 25]. Available from: https://birdsoftheworld.org/bow/content/taxonomy
  • 34. eBird. Bird names in eBird [Internet]. Ithaca, NY: Cornell Laboratory of Ornithology; 2024 Sep 6 [cited 2024 Nov 25]. Available from: https://support.ebird.org/en/support/solutions/articles/48000804865-bird-names-in-ebird
  • 35. Clements JF, Schulenberg TS, Iliff MJ, Federicks TA, Gerbracht JA, Lepage D, et al. The eBird/Clements checklist of birds of the world: v2021 [Internet]. Ithaca, NY: Cornell Lab of Ornithology; 2021. [cited 2024 Nov 25]. Available from: https://www.birds.cornell.edu/clementschecklist/updateindex/august-2021/
  • 36. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2023. [cited 2024 Nov 25]. Available from: https://www.r-project.org/
  • 37. Lang D, Chien G-t. wordcloud2: Create word cloud by ‘htmlwidget’ [Internet]. 2018 Jan 3 [cited 2025 Mar 14]. Available from: https://cran.r-project.org/web/packages/wordcloud2/index.html
  • 38. McTavish EJ, Gerbracht JA, Holder MT, Iliff MJ, Lepage D, Rasmussen PC, et al. A complete and dynamic tree of birds. Proc Natl Acad Sci U S A. 2025;122(18):e2409658122. doi: 10.1073/pnas.2409658122
  • 39. Miller E, McTavish EJ. clootl [Internet]. 2024. [cited 2024 Nov 25]. Available from: https://github.com/eliotmiller/clootl/tree/master
  • 40. Fritz SA, Purvis A. Selectivity in mammalian extinction risk and threat types: a new measure of phylogenetic signal strength in binary traits. Conserv Biol. 2010;24(4):1042–51. doi: 10.1111/j.1523-1739.2010.01455.x
  • 41. Orme D, Freckleton R, Thomas G, Petzoldt T, Fritz S, Isaac N, et al. caper: Comparative analyses of phylogenetics and evolution in R [Internet]. 2023. [cited 2024 Nov 25]. Available from: https://cran.r-project.org/web/packages/caper/index.html
  • 42. Gill F, Donsker D, Rasmussen P, editors. IOC World Bird List (v13.2) [Internet]. 2023. [cited 2024 Nov 25]. Available from: doi: 10.14344/IOC.ML.13.0
  • 43. Gill F, Donsker D, Rasmussen P, editors. Ranges [Internet]. IOC World Bird List v15.1; 2022 Jan 14 [cited 2025 Mar 14]. Available from: https://www.worldbirdnames.org/new/ioc-lists/range-terminology/
  • 44. Gonzalez-Voyer A, Thomas GH, Liker A, Krüger O, Komdeur J, Székely T. Sex roles in birds: Phylogenetic analyses of the influence of climate, life histories and social environment. Ecol Lett. 2022;25(3):647–60. doi: 10.1111/ele.13938
  • 45. Cooper N, Bond AL, Davis JL, Portela Miguez R, Tomsett L, Helgen KM. Sex biases in bird and mammal natural history collections. Proc Biol Sci. 2019;286(1913):20192025. doi: 10.1098/rspb.2019.2025
  • 46. Dale J, Dey CJ, Delhey K, Kempenaers B, Valcu M. The effects of life history and sexual selection on male and female plumage colouration. Nature. 2015;527(7578):367–70. doi: 10.1038/nature15509
  • 47. Delhey K. The colour of an avifauna: A quantitative analysis of the colour of Australian birds. Sci Rep. 2015;5:18514. doi: 10.1038/srep18514
  • 48. Cooney CR, Varley ZK, Nouri LO, Moody CJA, Jardine MD, Thomas GH. Sexual selection predicts the rate and direction of colour divergence in a large avian radiation. Nat Commun. 2019;10(1):1773. doi: 10.1038/s41467-019-09859-7
  • 49. Nolazco S, Delhey K, Nakagawa S, Peters A. Ornaments are equally informative in male and female birds. Nat Commun. 2022;13(1):5917. doi: 10.1038/s41467-022-33548-7
  • 50. Price JJ, Eaton MD. Reconstructing the evolution of sexual dichromatism: current color diversity does not reflect past rates of male and female change. Evolution. 2014;68(7):2026–37. doi: 10.1111/evo.12417
  • 51. Shultz AJ, Burns KJ. The role of sexual and natural selection in shaping patterns of sexual dichromatism in the largest family of songbirds (Aves: Thraupidae). Evolution. 2017;71(4):1061–74. doi: 10.1111/evo.13196
  • 52. Delhey K, Valcu M, Muck C, Dale J, Kempenaers B. Evolutionary predictors of the specific colors of birds. Proc Natl Acad Sci U S A. 2023;120(34):e2217692120. doi: 10.1073/pnas.2217692120
  • 53. Owens IPF, Hartley IR. Sexual dimorphism in birds: why are there so many different forms of dimorphism? Proc R Soc Lond B Biol Sci. 1998;265(1394):397–407.
  • 54. Stoddard MC, Prum RO. Evolution of avian plumage color in a tetrahedral color space: a phylogenetic analysis of new world buntings. Am Nat. 2008;171(6):755–76. doi: 10.1086/587526
  • 55. Shultz AJ, Burns KJ. Plumage evolution in relation to light environment in a novel clade of Neotropical tanagers. Mol Phylogenet Evol. 2013;66(1):112–25. doi: 10.1016/j.ympev.2012.09.011
  • 56. Marcondes RS, Brumfield RT. Fifty shades of brown: Macroevolution of plumage brightness in the Furnariida, a large clade of drab Neotropical passerines. Evolution. 2019;73(4):704–19. doi: 10.1111/evo.13707
  • 57. Merwin JT, Seeholzer GF, Smith BT. Macroevolutionary bursts and constraints generate a rainbow in a clade of tropical birds. BMC Evol Biol. 2020;20(1):32. doi: 10.1186/s12862-020-1577-y
  • 58. Freeman BG, Montgomery GA, Schluter D. Evolution and plasticity: Divergence of song discrimination is faster in birds with innate song than in song learners in Neotropical passerine birds. Evolution. 2017;71(9):2230–42. doi: 10.1111/evo.13311
  • 59. Mason NA, Burns KJ, Tobias JA, Claramunt S, Seddon N, Derryberry EP. Song evolution, speciation, and vocal learning in passerine birds. Evolution. 2017;71(3):786–96. doi: 10.1111/evo.13159
  • 60. Demery AJ, Burns KJ, Mason NA. Bill size, bill shape, and body size constrain bird song evolution on a macroevolutionary scale. Ornithology. 2021;138(2):ukab011.
  • 61. Phillimore AB, Freckleton RP, Orme CDL, Owens IPF. Ecology predicts large-scale patterns of phylogenetic diversification in birds. Am Nat. 2006;168(2):220–9. doi: 10.1086/505763
  • 62. Hawkins BA, Diniz-Filho JAF, Jaramillo CA, Soeller SA. Climate, niche conservatism, and the global bird diversity gradient. Am Nat. 2007;170 Suppl 2:S16–27. doi: 10.1086/519009
  • 63. Baptista LF, Trail PW, Horblit HM, Boesman PFD, Garcia EFJ. Maroon Pigeon (Columba thomensis), version 1.0. In: del Hoyo J, Elliott A, Sargatal J, Christie DA, de Juana E, editors. Birds of the World [Internet]. Ithaca, NY: Cornell Lab of Ornithology; 2020. [cited 2024 Nov 25]. Available from: doi: 10.2173/bow.marpig1.01
  • 64. Galloway N, Rose H. The “new” Englishes. In: Introducing global Englishes. Oxford, UK: Taylor & Francis Group; 2015. p. 95–123.
  • 65. McWhorter JH. Words on the move: Why English won’t- and can’t- sit still (like, literally). New York: Henry Holt and Co.; 2016.
  • 66. Lindsey DT, Brown AM. The color lexicon of American English. J Vis. 2014;14(2):17. doi: 10.1167/14.2.17
  • 67. Berlin B, Kay P. Basic color terms: their universality and evolution. Berkeley, CA: University of California Press; 1969.
  • 68. Twomey CR, Roberts G, Brainard DH, Plotkin JB. What we talk about when we talk about colors. Proc Natl Acad Sci U S A. 2021;118(39):e2109237118. doi: 10.1073/pnas.2109237118
  • 69. Raffaele H, Wiley J, Garrido O, Keith A, Raffaele J. A guide to the birds of the West Indies. Princeton, NJ: Princeton University Press; 1998.
  • 70. Sicard-Ayala AM, Jaramillo-Mejía L, Ayerbe-Quiñones F. Un ave, muchos nombres: un pluriverso. Ornitol Colomb. 2019;17(17):eC01.

Decision Letter 0

Shoko Sugasawa

31 Jan 2025

Dear Dr. Morrison,

Both reviewers commented that while they see the value of the AvianLexiconAtlas database, it is unclear whether the manuscript is meant to be a research article using the database, or a data paper which describes the database but does not test any specific research question with it. As this was submitted as a Research Article, following the traditional format of Introduction, Methods, Results and Discussion, readers would expect to read a research paper that tests specific questions. Could the authors please clarify what their intentions are (i.e. is this meant to be a research article or a data paper), and if the latter, refer to the Guidelines for Specific Study Types on the PLOS ONE website and submit it under the database category? Either way, I expect there to be a fair amount of restructuring and rewriting of the manuscript, hence the decision of major revision. Both reviewers provided a number of helpful and insightful suggestions whichever route the authors are going to take, so please refer to and consider making use of them.

Please submit your revised manuscript by Mar 17 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Shoko Sugasawa

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. You indicated that ethical approval was not necessary for your study. We understand that the framework for ethical oversight requirements for studies of this type may differ depending on the setting and we would appreciate some further clarification regarding your research. Could you please provide further details on why your study is exempt from the need for approval and confirmation from your institutional review board or research ethics committee (e.g., in the form of a letter or email correspondence) that ethics review was not necessary for this study? Please include a copy of the correspondence as an "Other" file.

3. Thank you for stating the following financial disclosure:

“Funding was provided by the New York University Liberal Studies New Faculty Scholarship Award to E.S.M”

Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If this statement is not correct you must amend it as needed.

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

Reviewer #1: This manuscript presents the ‘AvianLexiconAtlas’, a dataset of 10,906 species with their (American) English names classified into ten categories related to, e.g., the avian or cultural context of the bird. Basic descriptions of this data are presented, along with a long discussion about the coding decisions made when generating this crowd-sourced dataset. This is a really interesting dataset, and the manuscript itself is beautifully written. I only have a few technical comments (see below), which should be relatively easy to address, but I do have a broader, perhaps more philosophical point.

To my mind, this manuscript seems to be attempting several things at once, and thus both its aims and its impact are murky. If this is intended as a data paper, it does a great job, no further engagement with this paragraph required. (And, indeed, it’s possible that a subset of this author team has another manuscript in the works, but wants to get the dataset published first. Reasonable!) However, several other possible goals are perhaps attempted here.

• Is this manuscript testing hypotheses about this data? If so, what research question(s) is/are being addressed? This isn’t entirely clear to me. (More on this in the technical comments.)

• Is this manuscript presenting a case study on engaging undergraduates in the research process? If so, one would expect to see more engagement with the pedagogical literature. (Don’t read anything into this recommendation, other than that I recently saw it on BlueSky, but Stephan Lautenschlager recently published a piece of this ilk, https://doi.org/10.1186/s12052-024-00214-z )

• Is this manuscript trying to engage with the Bird Names for Birds movement? If so, I would expect to see slightly more connection to the motivation for changing bird names, and how past naming practices are being understood in modern times. (Again, don’t read too much into this recommendation, but Bond & Lavers recently had an Ibis paper doing just this, https://doi.org/10.1111/ibi.13356 )

• Is this manuscript trying to make a point about how humans relate to birds through naming conventions? If so, I would expect to see *far* more engagement with the linguistic / anthropological / ethno-ornithological literature on this topic. The work of the authors behind the EWA (Ethno-Ornithology World Atlas, https://ewatlas.net/ ) springs to mind, as does the vast literature on folk taxonomies. The discussion touches on this to a certain extent (e.g., L419-423, which are great) – I’d love to see even more of this when setting up the context for this work. Similarly, I was fascinated by the terms in the gazette that needed to be defined, such as “hoary” or “poll” – the disconnect in vocabulary between ornithologists and the general public is an excellent point!

Anyway, all of this is mostly a scope / editorial point rather than a reviewer point, but I figured it might be useful context for the authors in how an audience might read and interpret their work. Again, I want to stress that I really enjoyed reading this – it’s a cool dataset! – and if the authors have anything else planned for this topic, I will greatly look forward to reading it.

Technical comments:

L91: It’s not clear to me what “As part of these deliberations” refers to.

L101-104: Why would you predict this? Much more context is needed to understand why there might be geographic and/or taxonomic patterns towards naming conventions.

L104-106: What insights could this provide? This context is currently missing for the reader.

L147 onward: This is possibly an unpopular opinion, but I found this description of the 11-versus-10 category thing very hard to follow. It would perhaps be less confusing to either straight-up omit the mention of the 11 categories, or to say something like “Though 11 categories were initially scored, these were further refined during the data cleaning phase to a set of 10 categories [describe the 10]”.

L152: This may be rendered irrelevant by the above comment, but the use of “random” here is distracting, because of course birds do not have “random” names. They may have names that are miscellaneous, or opaque, or ‘other’, or something, but language is not ‘random’.

Paragraph beginning L165: Were all data collectors fluent speakers of English? (Obviously for an undergraduate class, participants could not be excluded based on language ability, but it’s useful context for a language-based coding task.)

Similarly, for the scorers who made assessments based on photographs, did all scorers have typical colour vision (e.g., the ability to distinguish between red and green)?

L180: Comma needed after the bracket/parenthesis.

L230: Again, why would you predict this? This needs to be spelled out somewhere (e.g., the introduction, or maybe the supplement).

L234: My understanding is that this phylogeny (and the associated R package) is currently unreviewed. I don’t think *I* care (there’s a trade-off to be had between including nearly all species in an unreviewed phylogeny and, say, losing ~10% of the species to use a published phylogeny with known biases, and I’m sympathetic to either decision), but the editor may have an opinion.

Paragraph beginning L247: Why was this done as a chi-squared test and not with a phylogenetic correction? You know that there’s a phylogenetic signal in the data, and that species naming conventions are non-independent – you should be correcting for that.

L248: How did you extract the general region? How is “general region” defined? Also, is this in the published data and I just can’t find it in the Github?

Table 2: It took me a very long time to understand “Mean times repeated if not unique” and “Median times repeated if not unique”. Please consider clarifying somewhere.

Figure 2: How was this word cloud made? (What software/package?) (Apologies if I missed this!)

L330: These differences in D values are relatively small; without confidence intervals, the reader doesn’t know how to distinguish them. I’d suggest either adding confidence intervals, or softening the phrasing to “somewhat stronger” (or something).

A lot of your discussion is really great. For example, your point about the naming of Grallaria species is a lovely insight, as is the idea that closely-related seabird species are more distinguishable by geography than by morphology. I’d, personally, love to see even more about the (highly colonial) history of how people have named birds in (American) English and how that interfaces with global patterns of, e.g., sexual dimorphism, community assembly, even if it’s just anecdotal.

L485-486: This strikes me as an odd question, given that this manuscript presents data that can, in part, answer this question. Perhaps rephrase? In particular, what scientific * question * (ornithological, linguistic, ethno-ornithological, whatever) would be addressed by conducting an investigation like this?

(Again, it almost feels like the authors are queuing up a follow-up study, with more anthropological/linguistic insights. If so, of course, that’s fine, but maybe dialing a little bit back on that and focusing a little more about the impact of * this * study would help improve the stand-alone effect of this work on the field.)

Table S2: The last three rows of Table S2 are very difficult to understand unless you dig into Fritz & Purvis 2010 and look up the definitions of things like \sum d_{obs} . Consider either omitting or explaining somewhere how D is actually calculated?

Gazette:

• There are several changes in font and font size that you might want to fix.

• You probably want to remove the note in red on the first page.

• Do you have permission to publish these figures?

• You might want to triple-check all of the political affiliations claimed in the geographic location section, and/or include a disclaimer; these in general seem excellent, but people have strong feelings about all sorts of things.

• The Perijá entry is quite obviously copied from Wikipedia – you may want to disable the links and standardize the font, if nothing else.

Reviewer #2: In this study, authors describe an AvianLexiconAtlas which contains assignments for descriptive English-language common names of 10,906 birds. Authors state that there is currently no systematic assessment of how often common names communicate identifiable and biologically relevant characteristics about species, suggesting that this is an issue in ornithology because common names are used more often than scientific names even by professional researchers. Hence, through a collaboration of professional ornithologists, amateurs, and students they classify these names into 10 categories based on avian physical traits, natural history and human-constructed terminology. They show that 89% of the birds are named after some aspect of their biology, either after a physical trait found in one or both sexes or after their natural history or behaviour. When birds are named after some aspect of their biology, name descriptors appear to be geographically and phylogenetically clustered.

The manuscript is clearly written and relatively easy to read, however, I have some concerns about what the main aim of the study is and the take home message it intends to convey.

In addition to making a clear list of achievable objectives in the introduction which should be reflected in the methods, results and discussions, I have the following feedback for consideration by the authors:

1. I struggled to identify the main biological problem addressed in the manuscript or the main biological insight gained. I can understand the fact that name descriptors based on biology are more informative and that some name descriptors are very localised and so not very helpful for generalisation. I also understand that there are debates about fair naming – what bird common names shouldn’t (or should) be. Is this analysis in light of these debates? Otherwise, if the main aim is to show how bird common names are arrived at or the unique histories of such, then the manuscript could be better focused, with that aim clearly stated, and more involving. There are some predictions outlined in the introduction and methods, like those associated with phylogenetic and geographical clustering of names (lines 100 – 104 and 230 - 231), but there isn’t a clear indication of what might be the case if these predictions were right. Moreover, it might be important to show whether there are any biological implications for common names that do not follow accepted naming conventions (if any).

2. Authors have taken time (e.g., lines 273 - 275) to explain author contributions and criteria for qualifying for co-authorship. This is useful in some ways, but I do not believe that this is relevant for the main text. It might be required by the journal to judge fair authorship, but I believe that it should be included in the author contribution section, and not in the main text. If the inclusion or exclusion criteria have an impact on the results or their interpretation, then maybe these criteria should be discussed.

3. Line 109: it may be a goal of common names that they are useful for the general public, but they must not be generalised to the extent that they cease to make sense locally, unless a species name means exactly the same thing to everyone, and I do not believe that this is the case. An interesting perspective for me would be to discuss how names differ according to context or locations (see your comment in lines 512 - 513).

Specific comments

Lines 405 – species not specious.

The term human-constructed terminology is a bit misleading because all names are human-constructed including avian centred names.

Lines 349 – 351 – your result which shows that Africa and Oceans show the highest proportions of species named after human-constructed terminology seems to have been ignored in the discussion. Isn’t it easily predictable that species described by colonial ornithologists (which may be the case for most of Africa and the Oceans) are more likely to be named after persons? You mentioned the outsized roles of western naturalists in lines 502 – 508. However, I am curious about why South America has the lowest proportion of names based on human-constructed terminologies. The timing of species description may influence how they are described. Newly described species are less likely to have been described from a museum collection, which you argue may have been the motivation for naming species based on physical traits.

Lines 417 – 501 discuss methodological considerations and participation. Down-sizing this aspect of the discussion may help make the manuscript more concise.

Good luck with the revisions.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Jun 11;20(6):e0325890. doi: 10.1371/journal.pone.0325890.r003

Author response to Decision Letter 1


19 Mar 2025

Dear Dr. Sugasawa,

Thank you and the two anonymous reviewers for the helpful feedback on our manuscript; we believe the comments have resulted in a considerably strengthened manuscript. We appreciate the opportunity to submit a revised version of the manuscript based on the reviews we received. Based on the main feedback we received from you and both reviewers, we have now reframed the manuscript to be a Research Article that solely reports on a new database, the AvianLexiconAtlas. We are no longer using the manuscript to present the results of original research that address a clearly defined research question. As part of this database description, we now include a section in the Results (lines 299-315) focusing on the implementation of the database that provides a direct link to the database hosting site and plans for long-term database maintenance and growth. We retained the analyses included in the initial submission, but now present them simply as a way to validate the utility of the database. The trends that are presented in these analyses can be used to ask future questions about the utility of these English-language descriptors in a biological context as well as provide data for further examination into the linguistic history of these descriptors. In the Discussion we use these trends as a way to propose several areas of future research using the database.

As part of the revision, we have provided further details on why this study was exempt from the need for approval from the institutional review board. In the ‘Data Collectors’ section of the Methods (lines 194-201), we now include the full ethics statement. We specifically explain that the New York University (NYU) Human Research Protection Program (HRPP) determined via verbal and written consent that the work involved in the data collection for the database did not meet the criteria involving human subjects per the United States’ regulations for the protection of human subjects (45CFR46.102(e)). This determination was made because the focus of the research was not on collecting information about the data collectors, but only involved a crowdsourcing model where the data collectors helped categorize the meanings behind unique descriptors in bird names. As such, the data collection methods did not require HRPP review and approval. We have included an official letter from the NYU Human Research Protection Program also stating this language in the revision documents. This has been submitted as an “Other” file. To further clarify that this study is a crowdsourced data collection research project associated with scientific nomenclature, and not a study on human subjects, we have changed from calling the individuals who helped with categorizing the bird names ‘participants’ and now call them ‘data collectors’ throughout the entire manuscript.

As per the journal’s request, we now have an amended Role of Funder statement, and thank the journal editorial staff for changing this on our behalf:

“Funding was provided by the New York University Liberal Studies New Faculty Scholarship Award to E.S.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

Please note that, due to the removal of the S1 Table and the S1 Figure from the revised manuscript, the numbering of the remaining Supporting Information table and figures has now been changed. This has been updated in the manuscript, and we describe why this table and figure were removed from the Supporting Information documentation in detail below. We also edited the text for further clarity and PLOS ONE style requirements.

Below is the list of specific changes incorporated in the revised manuscript:

1. Reviewer 1: L91: It’s not clear to me what “As part of these deliberations” refers to

Authors: This phrase has been deleted and in its place (lines 115-117) we have now clarified that the goal of constructing this database is to be able to understand what common names currently communicate about species in terms of the scope and variability of the terminology.

2. Reviewer 1: L101-104: Why would you predict this? Much more context is needed to understand why there might be geographic and/or taxonomic patterns towards naming conventions.

Reviewer 1: L104-106: What insights could this provide? This context is currently missing for the reader.

Reviewer 2: I struggled to identify the main biological problem addressed in the manuscript or the main biological insight gained. I can understand the fact that name descriptors based on biology are more informative and that some name descriptors are very localised and so not very helpful for generalisation. I also understand that there are debates about fair naming – what bird common names shouldn’t (or should) be. Is this analysis in light of these debates? Otherwise, if the main aim is to show how bird common names are arrived at or the unique histories of such, then the manuscript could be better focused, with that aim clearly stated, and more involving. There are some predictions outlined in the introduction and methods, like those associated with phylogenetic and geographical clustering of names (lines 100 – 104 and 230 - 231), but there isn’t a clear indication of what might be the case if these predictions were right. Moreover, it might be important to show whether there are any biological implications for common names that do not follow accepted naming conventions (if any).

Authors: The predictions in the original manuscript in L101-104 have been deleted from the introduction to align with the focus of the paper on introducing the database. Since we are no longer presenting the work as a hypothesis test, we did not believe we should retain the predictions, which we believe led to some of the confusion about what the goal of this manuscript was. Instead, the phylogenetic and biogeographic analyses introduced in the Introduction section (lines 125-132) are presented as a way to validate the utility of the dataset for future research questions, and we propose several future research topics in the Discussion as part of our initial observations about the trends we observed in the database (lines 442-453).

3. Reviewer 1: L485-486: This strikes me as an odd question, given that this manuscript presents data that can, in part, answer this question. Perhaps rephrase? In particular, what scientific * question * (ornithological, linguistic, ethno-ornithological, whatever) would be addressed by conducting an investigation like this?

Authors: We have added a line after this question to clarify its significance (lines 499-502).

4. Reviewer 1: L147 onward: This is possibly an unpopular opinion, but I found this description of the 11-versus-10 category thing very hard to follow. It would perhaps be less confusing to either straight-up omit the mention of the 11 categories, or to say something like “Though 11 categories were initially scored, these were further refined during the data cleaning phase to a set of 10 categories [describe the 10]”.

Authors: Thank you for pointing this out. We can also see how this could be confusing and, per your advice, have simplified the initial description of the categories in the Methods section (lines 171-177) and omitted the mention of the 11 initial categories. We now only present the set of 10 categories used for terminology assignments. As part of this update, we moved the more detailed descriptions of each of the 10 categories, which were previously only included in S1 Table, into Table 1. Since the S1 Table only existed to provide a comparison between the 11 initial categories and the 10 final categories, we decided to delete the S1 Table about these 11 categories from the manuscript.

5. Reviewer 1: L152: This may be rendered irrelevant by the above comment, but the use of “random” here is distracting, because of course birds do not have “random” names. They may have names that are miscellaneous, or opaque, or ‘other’, or something, but language is not ‘random’.

Authors: We have addressed this issue by removing any mention of the initial 11 categories from the manuscript.

6. Reviewer 1: Paragraph beginning L165: Were all data collectors fluent speakers of English? (Obviously for an undergraduate class, participants could not be excluded based on language ability, but it’s useful context for a language-based coding task.)

Reviewer 1: Similarly, for the scorers who made assessments based on photographs, did all scorers have typical colour vision (e.g., the ability to distinguish between red and green)?

Authors: We have now added a clarifying sentence in the ‘Data collection’ subsection (lines 207-210) of the Methods explaining that the use of duplicate datasets (in the first round of scoring) and triplicate datasets (in the second round of scoring) in the design of the data collection was established to validate the accuracy of the category assignments made by different people. We decided not to specifically address differences in English language fluency and/or color vision of the data collectors in the manuscript, because we did not ask any of the data collectors about this during the recruitment process.

7. Reviewer 1: L180: Comma needed after the bracket/parenthesis.

Authors: The comma has now been added in line 204.

8. Reviewer 1: L230: Again, why would you predict this? This needs to be spelled out somewhere (e.g., the introduction, or maybe the supplement).

Authors: Based on the decision to reframe the manuscript as a description of a new database, we have now changed the language in this section of the Methods and removed the mention of ‘predictions’ for the analyses used in the manuscript. Instead, this section is now called ‘Data validation’ (line 258) and we now describe the analyses as a way to validate the utility of the database by simply exploring trends in the dataset that was collected (lines 259-296). This goal for the analyses has also been updated in the Introduction and Discussion sections to remove the implication that we are hypothesis testing with these analyses.

9. Reviewer 1: My understanding is that this phylogeny (and the associated R package) is currently unreviewed. I don’t think *I* care (there’s a trade-off to be had between including nearly all species in an unreviewed phylogeny and, say, losing ~10% of the species to use a published phylogeny with known biases, and I’m sympathetic to either decision), but the editor may have an opinion.

Authors: The decision to use the unpublished phylogeny that is currently in review was made because it most closely aligns with the taxonomy of the species in the database, as both are based on the eBird/Clements taxonomy. This phylogeny, its documented methodology, and its associated R programs are all currently available as a preprint on bioRxiv (McTavish et al., https://www.biorxiv.org/content/10.1101/2024.05.20.595017v1). Additionally, as of the date of this letter the phylogeny has been cited by four published peer-reviewed studies:

• Barber, R.A. et al. (2024). PLOS Biol. 22(11): e3002856. https://doi.org/10.1371/journal.pbio.3002856

• Janzen, E. & Etienne, R.S. (2024). Mol Phylogenet Evol. 200: 108168. https://doi.org/10.1016/j.ympev.2024.108168

• Nussbaumer, R. et al. (2025). Divers Distrib. 31: e13935. https://doi.org/10.1111/ddi.13935

• Van Doren, B.M. et al. (2025). Curr Biol. 35(4): P898-904.E4. https://doi.org/10.1016/j.cub.2024.12.033

10. Reviewer 1: Paragraph beginning L247: Why was this done as a chi-squared test and not with a phylogenetic correction? You know that there’s a phylogenetic signal in the data, and that species naming conventions are non-independent – you should be correcting for that.

Authors: Since several species did not have unique values for breeding locations, a phylogenetic regression analysis or a linear model was not possible. We agree that there may be some tendencies for species from the same families and orders to be named in a similar manner from the same geographic region, but in this manuscript we do not look to completely explain these patterns and we believe that question is now beyond the scope of the revised manuscript. Thus, we felt that the chi-squared test is describing patterns that may intrigue readers enough to investigate it more thoroughly elsewhere.
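For readers unfamiliar with the test discussed in this exchange, the following is a minimal sketch of a Pearson chi-squared test of independence on a contingency table of naming category by region. It is not the authors' code; the function name `chi_squared_statistic` and the counts in `toy_counts` are hypothetical placeholders, not values from the AvianLexiconAtlas.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a
    rows-by-columns table of counts (assumes every row and column
    total is non-zero, so no expected count is zero)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = two name categories, columns = two regions
toy_counts = [[30, 10],
              [20, 40]]
stat, dof = chi_squared_statistic(toy_counts)
print(stat, dof)
```

The statistic treats every species as an independent observation, which is the limitation Reviewer 1 raises: it describes an association between category and region without correcting for shared ancestry.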

11. Reviewer 1: L248: How did you extract the general region? How is “general region” defined? Also, is this in the published data and I just can’t find it in the Github?

Authors: We provided more details in the “Data validation” section of the Methods to clarify what we mean by “extracted” the general region (lines 284-286). This data is compiled by the IOC World Bird List and we extracted the specific general regions for each species in the IOC World Bird List. We retained the definition for “general region” that was included in the initial submission of the manuscript, which describes the term as geographic ranges at the continent level, but we have now added a reference to the IOC World Bird List Range Terminology website so the reader can learn more details about how these designations were made. We have added the general region or regions used for the breeding range of each species from the IOC World Bird List 13.2 to the ‘total_dataset_allrounds.csv’ file posted on the Github site for the database (S1 Appendix, https://github.com/ajshultz/AvianLexiconAtlas).

12. Reviewer 1: Table 2: It took me a very long time to understand “Mean times repeated if not unique” and “Median times repeated if not unique”. Please consider clarifying somewhere.

Authors: We have added text in the legend of Table 2 that clarifies that calculations for the mean and median frequencies of unique descriptors only include descriptors that occur more than once within species names in each category. Additionally, we changed the titles of the columns associated with the mean and median times to: “Mean frequency of unique descriptors” and “Median frequency of unique descriptors”.
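The calculation described for Table 2 can be sketched roughly as follows: count how often each descriptor appears within a category, then take the mean and median over only those descriptors that occur more than once. The function name `repeated_descriptor_stats` and the example descriptor list are hypothetical and are not drawn from the database.

```python
from collections import Counter
from statistics import mean, median


def repeated_descriptor_stats(descriptors):
    """Return (mean, median) frequency over descriptors that occur
    more than once, or None if no descriptor is repeated."""
    counts = Counter(descriptors)
    # Keep only descriptors used in more than one species name
    repeated = [n for n in counts.values() if n > 1]
    if not repeated:
        return None
    return mean(repeated), median(repeated)


# Hypothetical descriptors for one category (illustration only)
example = ["red", "red", "red", "hoary", "crested", "crested", "spotted"]
print(repeated_descriptor_stats(example))
```

Descriptors that appear only once ("hoary" and "spotted" in this toy list) are excluded before averaging, which is the point the revised legend is meant to make explicit.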

13. Reviewer 1: Figure 2: How was this word cloud made? (What software/package?)

Authors: We have now identified the R software package (wordcloud2) used to produce the word clouds in the “Data validation” section of the Methods (lines 262-263).

14. Reviewer 1: L330: These differences in D values are relatively small; without confidence intervals, the reader doesn’t know how to distinguish them. I’d suggest either adding confidence intervals, or softening the phrasing to “somewhat stronger” (or something).

Authors: The wording to describe the differences in the D values has been changed to “somewhat stronger” (line 358). There are no confidence intervals as part of the calculation for D, and we agree with Reviewer 1 that it is difficult to distinguish the magnitude of difference in this case.

15. Reviewer 1: Table S2: The last three rows of Table S2 are very difficult to understand unless you dig into Fritz & Purvis 2010 and look up the definitions of things like \sum d_{obs}. Consider either omitting or explaining somewhere how D is actually calculated?

Authors: Due to the deletion of the table that was originally S1 Table from the revised manuscript, the supporting information table reporting the calculations for the D statistic that was S2 Table in the original manuscript has now been changed to S1 Table in the revised manuscript. The legend of this table has now been expanded to provide more detail on how D is actually calculated. It includes the specific equation used for D along with clarifying definitions for each of the calculations and statistics reported in the table.
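As a worked sketch of the scaling behind D: Fritz & Purvis's statistic compares the observed sum of sister-clade differences with the mean of the same sum under random reshuffling of tip states and under Brownian motion. The snippet below only performs that final scaling step on already-computed sums. The function name `d_statistic` and all numeric inputs are hypothetical; the manuscript's own calculations were done with the R package caper, not with this code.

```python
from statistics import mean


def d_statistic(sum_d_obs, sums_d_random, sums_d_brownian):
    """Scale an observed sum of sister-clade differences into D.

    D = (sum_d_obs - mean(sums_d_brownian)) / (mean(sums_d_random) - mean(sums_d_brownian))

    sums_d_random and sums_d_brownian are the sums obtained from simulated
    trait distributions (random reshuffling and Brownian motion, respectively).
    """
    mean_r = mean(sums_d_random)
    mean_b = mean(sums_d_brownian)
    if mean_r == mean_b:
        raise ValueError("random and Brownian expectations are equal; D is undefined")
    return (sum_d_obs - mean_b) / (mean_r - mean_b)


# Hypothetical sums (illustrative only): an observed value halfway between the
# Brownian expectation (3.0) and the random expectation (10.0).
print(d_statistic(6.5, [9.0, 11.0], [2.0, 4.0]))
```

An observed sum equal to the random expectation gives D = 1, and one equal to the Brownian expectation gives D = 0, which matches the interpretation of D as ranging from random to clumped.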

16. Reviewer 1:

Gazetteer:

• There are several changes in font and font size that you might want to fix.

• You probably want to remove the note in red on the first page.

• Do you have permission to publish these figures?

• You might want to triple-check all of the political affiliations claimed in the geographic location section, and/or include a disclaimer; these in general seem excellent, but people have strong feelings about all sorts of things.

• The Perijá entry is quite obviously copied from Wikipedia – you may want to disable the links an

Attachment

Submitted filename: ResponsetoReviewers_PLOSONE_031725.pdf

pone.0325890.s007.pdf (205.8KB, pdf)

Decision Letter 1

Shoko Sugasawa

30 Apr 2025

Dear Dr. Morrison,

Please submit your revised manuscript by Jun 14 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Shoko Sugasawa

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: No

**********

Reviewer #1: I thank the authors for their thorough and thoughtful revisions, which have greatly increased the readability of this manuscript. The current version of this work is very interesting, and I enjoyed the opportunity to read it.

I have just a few extremely minor comments remaining:

L74: They’re not necessarily Latin. “Latinized”? “binomial, often in Latin”? Something else?

L103: Perhaps the “gull” family of birds? Or just “gull species”?

L238: Purvis, not Pervis

L248-249: What are some other general regions, if not continents?

L537-538: This is extremely picky, but I’d recommend phrasing this sentence with a little less surprise; the fact that colour naming varies among people is an extremely well-documented anthropological phenomenon (e.g., https://doi.org/10.1073/pnas.2109237118 , or even Berlin & Kay’s “Basic Color Terms”)

Figures – as a heads-up, Manuscript Central (or whatever software this is – “Editorial Manager”??) has compressed your figures in a weird way. It’s also obliterated the mathematical formatting on Page 35.

Gazetteer:

• Extra square bracket under ‘Dimorphic’

• Extra bolding in ‘pinnated’, ‘russet’, ‘tawny’

• I don’t know what “If toponym varies from spelling indexed under, listed at end of entry” means?

Reviewer #2: Thank you for revising your manuscript and responding to the comments made on the initial draft. I have made some recommendations to the Editor, who will advise you appropriately on how the manuscript might be revised further as a data paper.

**********

If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org

PLoS One. 2025 Jun 11;20(6):e0325890. doi: 10.1371/journal.pone.0325890.r005

Author response to Decision Letter 2


15 May 2025

Dear Dr. Sugasawa,

Thank you and the two anonymous reviewers for taking the time to review our revised manuscript and providing further feedback. We appreciate the opportunity to continue to revise the manuscript, and believe we have thoroughly addressed the comments from the reviewers.

As part of the first revision, we provided further details on why this study was exempt from the need for approval from the institutional review board. As requested by the journal, we have included an Ethics statement subsection in lines 145-153 at the start of the Materials and methods section. We specifically explain that the New York University (NYU) Human Research Protection Program (HRPP) determined via verbal and written consent that the work involved in the data collection for the database did not meet the criteria involving human subjects per the United States’ regulations for the protection of human subjects (45CFR46.102(e)). This determination was made because the focus of the research was not on collecting information about the data collectors, but only involved a crowdsourcing model where the data collectors helped categorize the meanings behind unique descriptors in bird names. As such, the data collection methods did not require HRPP review and approval. We have included an official letter from the NYU Human Research Protection Program also stating this language in the revision documents. This has been submitted as an “Other” file. To further clarify that this study is a crowdsourced data collection research project associated with scientific nomenclature, and not a study on human subjects, we have changed from calling the individuals who helped with categorizing the bird names ‘participants’ and now call them ‘data collectors’ throughout the entire manuscript.

As per the journal’s request, we now have an amended Role of Funder statement, and thank the journal editorial staff for changing this on our behalf:

“Funding was provided by the New York University Liberal Studies New Faculty Scholarship Award to E.S.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

Below is the list of specific changes incorporated in the revised manuscript:

1. Reviewer 2 & Academic Editor: Reviewer 2 commented that the discussion section should clarify the database utility and prospects, with which I agree. As the current discussion is structured around the results and the constraints of the database, the significance and future ideas for the database are scattered across the discussion, making it hard to get a comprehensive and concrete idea of how this database contributes to research in biology and other fields. Please clarify the utility and future prospects of the database in the discussion -- I think that rewriting the concluding paragraph might be the least destructive way, but am open to other ways to achieve this.

Authors: We appreciate the comments by Reviewer 2 and the Academic Editor, and have revised the Discussion as suggested. We have restructured the end of the Discussion to center on the utility and future prospects of the database. We have reframed the constraints as opportunities to expand the database in the future, summarized all of the opportunities for future research directions mentioned throughout the Discussion, and ended by highlighting the significance of the AvianLexiconAtlas for both professional researchers and amateurs.

2. Reviewer 1:

L74: They’re not necessarily Latin. “Latinized”? “binomial, often in Latin”? Something else?

L103: Perhaps the “gull” family of birds? Or just “gull species”?

L238: Purvis, not Pervis

Authors: In line 100, we have edited this passage to now say that ICZN scientific names are written as a binomial, often in Latin. In line 276 we corrected the spelling error and it now says Purvis.

The topics of the lines referenced by Reviewer 1 don’t match the ones in the manuscript file we submitted in the first revision. In the case of the referenced line 103, we were not entirely sure what this was in reference to, as we don’t mention any gull species in the manuscript. We thought the comment might be referring to the use of “shared group names” to describe the part of a species’ English common name that is shared across species (for example, Ostrich) in lines 160-165 of the manuscript for the first revision. In the current version of the manuscript, we have deleted the term “group” in lines 172-177, and instead refer to the names that occur in multiple species as “shared names”. We chose not to specifically define these shared names as ‘family’ names, because the shared descriptor across species may not always correspond directly to taxonomic groups.

3. Reviewer 1: L248-249: What are some other general regions, if not continents?

Authors: In lines 288-291 we have now clarified that the general range classifications for species occur at the subcontinent level (instead of saying that the classifications were generally, but not always, at the continent level) and have added that, in addition to these regions, there is a separate classification for primarily oceanic species.

4. Reviewer 1: L537-538: This is extremely picky, but I’d recommend phrasing this sentence with a little less surprise; the fact that colour naming varies among people is an extremely well-documented anthropological phenomenon (e.g., https://doi.org/10.1073/pnas.2109237118 , or even Berlin & Kay’s “Basic Color Terms”)

Authors: Thank you for bringing this work to our attention. We have now incorporated both of these sources into the manuscript in lines 474-478 to hypothesize that the diversity of color terms observed in the dataset is likely associated with how often general categories of color are referenced across species. This section about color terminology was moved from the concluding paragraph of the Discussion to an earlier paragraph in the Discussion that provides an overview of the issues data collectors faced with unfamiliar terminology.

5. Reviewer 1: Figures – as a heads-up, Manuscript Central (or whatever software this is – “Editorial Manager”??) has compressed your figures in a weird way. It’s also obliterated the mathematical formatting on Page 35.

Authors: Thank you for bringing this to our attention. We have been in touch with the PLOS ONE editorial office and were told that “the reduction in resolution is part of the PDF building process that cannot be prevented, and the PDF version of manuscripts will always contain a compressed version of very high-resolution figures.” The editorial office advised us that all of the files are available to download in their original format in the manuscript’s file inventory. We tested this and the files and images were viewable in the form we intended.

6. Reviewer 1:

Gazetteer:

• Extra square bracket under ‘Dimorphic’

• Extra bolding in ‘pinnated’, ‘russet’, ‘tawny’

• I don’t know what “If toponym varies from spelling indexed under, listed at end of entry” means?

Authors: The extra square bracket and extra bolding in those entries have now been removed. We have now reworded the phrase “If toponym varies from spelling indexed under, listed at end of entry” that was located at the start of the Gazetteer to clarify its meaning, and it is now phrased as: “Index of specific place names. Any documented variations of the spelling for an indexed place name are listed at the end of its entry.” The new version of the Glossary and Gazetteer file has been uploaded to the GitHub site.

7. Authors: In the first round of feedback we received, there was concern that the phylogeny referenced in the manuscript had not yet been peer reviewed. In the time since we submitted the first revision, however, the phylogeny has been peer reviewed and published in PNAS:

McTavish EJ, Gerbracht JA, Holder MT, Iliff MJ, Lepage D, Rasmussen PC, et al. A complete and dynamic tree of birds. Proc Natl Acad Sci. 2025;122(18):e2409658122. https://doi.org/10.1073/pnas.2409658122

We have now updated the citation (38) for the phylogeny in the revised manuscript to its peer reviewed publication.

Thank you, again, to all of the reviewers for your continued feedback.

Sincerely,

Dr. Erin Morrison and Dr. Allison Shultz

Attachment

Submitted filename: ResponsetoReviewers2_PLOSONE_051525.pdf

pone.0325890.s008.pdf (172.8KB, pdf)

Decision Letter 2

Shoko Sugasawa

21 May 2025

AvianLexiconAtlas: A database of descriptive categories of English-language bird names around the world

PONE-D-24-54740R2

Dear Dr. Morrison,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Shoko Sugasawa

Academic Editor

PLOS ONE

Acceptance letter

Shoko Sugasawa

PONE-D-24-54740R2

PLOS ONE

Dear Dr. Morrison,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Shoko Sugasawa

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Summary of mismatched category assignments in the 906 species common names assigned to different categories in Dataset A and Dataset B.

    (TIF)

    pone.0325890.s001.tif (166.3KB, tif)
    S2 Fig. Word cloud frequencies of terminology in common names.

    Word cloud frequencies of (A) avian physical traits, (B) avian natural history traits, and (C) human-centered terminology. Names are scaled according to frequencies in each dataset.

    (TIF)

    pone.0325890.s002.tif (866.4KB, tif)
    S3 Fig. Cladogram of the categories assigned to the English common names of 10,775 species of birds.

    Categories of the English common names identified at the tips of the branches, based on color. Cladogram adapted from [38]. The inner circle includes species names associated with the general category of avian physical traits (both sexes physical trait, male physical trait, female physical trait, size). The middle circle includes species names associated with the general category of avian natural history (behavior, geographic range, natural history), and the outer circle includes names associated with the general category of human-centered terminology unrelated to the biology of the species. See Table 1 for detailed explanations of each category.

    (TIF)

    pone.0325890.s003.tif (1.2MB, tif)
    S1 Table. Calculation of D statistic for the phylogenetic structure of categories.

    Results of Fritz & Purvis’ [40] D statistic calculations for the phylogenetic structure of each of the grouped categories: physical traits, natural history, and human-centered terminology. For each grouped category, species assigned to the category were represented by a state of 1 and the remaining species assigned to other categories were represented by a state of 0. D is calculated by scaling the observed sum of sister-clade differences, $\sum d_{obs}$, with the mean values of the sum of sister-clade differences for 1,000 simulated trait distributions on the tips of the same phylogeny, based on randomly reshuffling the trait values, $\sum d_{r}$, and on trait evolution under Brownian motion, $\sum d_{b}$: $D = \left[\sum d_{obs} - \mathrm{mean}\left(\sum d_{b}\right)\right] / \left[\mathrm{mean}\left(\sum d_{r}\right) - \mathrm{mean}\left(\sum d_{b}\right)\right]$. An estimated D close to 1 represents a random distribution of a binary trait among related species on the phylogeny, while an estimated D close to 0 represents a clumped distribution of a binary trait among related species that would be expected under the Brownian motion model of evolution. Calculations were completed using the R package caper 1.0.3 [41].

    (PDF)

    pone.0325890.s004.pdf (82.9KB, pdf)
    S1 Appendix. AvianLexiconAtlas Database Files.

    The data, glossary, and gazetteer reported in this article can be accessed at https://github.com/ajshultz/AvianLexiconAtlas.

    (PDF)

    pone.0325890.s005.pdf (61.4KB, pdf)

    Data Availability Statement

    The data, glossary, and gazetteer reported in this article can be accessed at https://github.com/ajshultz/AvianLexiconAtlas.


    Articles from PLOS One are provided here courtesy of PLOS
