Abstract
Our understanding of biodiversity patterns comes primarily from described species. Here, we analyze how known biodiversity has increased across living organisms. Past research suggested that the number of new species per year peaked near 1900 and that only ~2 million species exist. We find that overall rates of species descriptions have recently accelerated, with the largest numbers of new species per year all in the past ~20 years (2000 to 2020). The largest groups grew the most quickly during this period, including animals, arthropods, insects, and beetles. However, long-term trends in rates of species descriptions were often unrelated to recent rates and current richness. For example, rates for fungi have recently increased, whereas rates for insects have not. Extrapolating these rates of species descriptions into the future requires considerable caution. Nevertheless, some intriguing patterns are suggested, such as unexpectedly high projected species numbers of plants, fungi, arachnids, malacostracan crustaceans, ray-finned fishes, and amphibians.
The number of known species on Earth is increasing rapidly, suggesting unexpectedly large numbers of many groups.
INTRODUCTION
The rate at which new species are formally described is an essential but relatively understudied aspect of biodiversity. These rates determine the number of known species on Earth and the groups to which these species belong. The projected number of species on Earth has received far more attention, with estimates of global diversity ranging from the low millions (1–4), to tens of millions (5, 6), to hundreds of millions (7–9), and to trillions (10). However, these projected species generally remain hypothetical until they are formally described, making the rate of species description of crucial importance. For example, species may be invisible to conservation efforts until they are scientifically described. Furthermore, some authors have suggested that more recently described species are more likely to be threatened with extinction than older described species (11, 12). Thus, a fundamental question is whether rates of species description are fast enough to allow Earth’s species to be described before they go extinct (3). Given this issue, some influential papers have addressed how the pace of species discovery and description might be accelerated [e.g., (13–17)], but with less focus on the rates themselves. Overall, rates of species descriptions play a crucial role in determining the future of known biodiversity.
There have been few large-scale studies on rates of species description across life, and some have been controversial. For example, Costello et al. (2, 3) concluded that the maximum rate of species descriptions occurred ~100 years ago, that the recent rate of species descriptions was ~16,000 a year (in the 2000s), and that this latter rate should allow most species to be discovered and described before they go extinct (but depending also on extinction rates). This latter conclusion was based on extrapolating recent description rates hundreds of years into the future to project overall species numbers. Mora et al. (18) criticized this study and suggested that recent rates of species descriptions were closer to ~8000 species a year instead (based on the 1990s) and that this slower rate indicated that most species would go extinct before being described [but see (19)]. However, these studies did not examine rates of species descriptions over time across major groups of organisms. Miralles et al. (20) analyzed species descriptions in major eukaryotic groups from 1950 to 2014 and concluded that there was no increase in overall description rates over time.
By contrast, there have been several studies of rates of species descriptions at smaller taxonomic scales, often focused on projecting the future number of described species in a group. These include studies within groups of vertebrates (21–22) and insects [scale insects (23) and stoneflies (24)], and for fungi (25), polychaete annelids (26), and amphipod and isopod crustaceans (27, 28). Other research has examined rates of species descriptions over time across groups but only for a specific region [i.e., Europe; (29)]. Wang et al. (30) examined taxonomic trends across seven groups of varying rank and suggested that recent description rates were faster in fungi than in other groups [see also (20)]. Costello et al. (31) documented increasing numbers of new marine species over time.
Here, we analyze rates of species description over time across living organisms. We estimate annual rates of species descriptions for major groups using the Catalogue of Life [CoL; (32)] and other databases (33, 34). We consider only unique, accepted, and extant species and exclude viruses [which are not unambiguously alive; (35)]. We use these data to address the following questions. First, how have rates of species descriptions changed over time? Did rates of species descriptions peak near 1900 and decline subsequently [suggesting that most of Earth’s species have been found; (2, 36)], have recent rates been constant (20), or have rates accelerated toward the present day? Second, how do these rates vary among taxonomic groups? For example, are groups with the most species also the ones that are growing the most rapidly? Did most groups have their maximum rate of species description in recent decades or in past centuries? Third, what might these rates and patterns of species descriptions tell us about future patterns of known species richness?
RESULTS
Across life, we found that the overall number of living, described species continues to increase rapidly, with little sign of slowing in recent decades (Fig. 1A and dataset S1; up to 2020; all datasets and code are available at https://figshare.com/s/a59c81d51846d79065de). Furthermore, the fastest rates of species description (>16,000 species/year) were all since 2015 (Fig. 1B). The maximum rate (17,044 species/year) was in 2020 (dataset S1). A previous study suggested that new species descriptions peaked near 1900 (2), but that study was not comprehensive across taxonomic groups and did not extend past the year 2000. Based on our results (Fig. 1B), it was after 2008 that rates of species descriptions overtook those in the early 1900s (the maximum rate before 2009 was in 1912). Overall, these results show that the rate of new species descriptions is still generally accelerating at the decadal timescale.
Fig. 1. Cumulative numbers and rates in new species over time across living organisms.
(A) The total (cumulative) number of species known in each year and (B) the number of species newly described each year. The analysis includes 1.9 million species from the CoL (32), including only extant, accepted names, and excluding viruses, fossil taxa, and species with unknown publication dates. Data are in dataset S1.
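As a rough illustration of how the two panels of Fig. 1 relate to each other, the sketch below tallies new species per year from a list of description years, accumulates those counts, and reports the year with the most descriptions. The records are invented toy values rather than CoL data, and the function name `yearly_and_cumulative` is hypothetical; it is not taken from the authors' code.

```python
from collections import Counter
from itertools import accumulate

def yearly_and_cumulative(description_years):
    """Return (years, new_per_year, cumulative) for a list of description years."""
    counts = Counter(description_years)
    years = list(range(min(counts), max(counts) + 1))
    new_per_year = [counts.get(y, 0) for y in years]  # analogous to Fig. 1B
    cumulative = list(accumulate(new_per_year))       # analogous to Fig. 1A
    return years, new_per_year, cumulative

# Toy records: one description year per accepted species (hypothetical values).
toy_years = [1758, 1758, 1760, 1912, 1912, 1912, 2019, 2020, 2020, 2020, 2020]
years, new_per_year, cumulative = yearly_and_cumulative(toy_years)

# Year with the largest number of new descriptions in the toy data.
peak_year = years[new_per_year.index(max(new_per_year))]
```

Years with no descriptions are kept as zeros so that the cumulative series has one value per calendar year.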
We then examined which groups underlie the overall increase in known richness over time for the past ~20 years and overall patterns in rates of descriptions over time. We tested the hypothesis that the groups increasing most rapidly are those that are already the most species rich. Specifically, we examined mean numbers of new species described per year from 2000 to 2020. We excluded 2021 to 2024 to reduce potential artifacts associated with delays in adding new species to the CoL and other databases (see Materials and Methods). We examined patterns among kingdoms, animal phyla (~70% of all known species are animals; Table 1), arthropod classes (~80% of animals are arthropods; Table 1), insect orders (~80% of arthropods are insects; Table 1), and vertebrate classes.
Table 1. Current and projected species richness for selected major groups.
We present the current richness of each group (for 2020) and projections of future richness for the year 2400. Confidence intervals (95%) on each projection are also provided. “NA” indicates that a confidence interval could not be calculated. For brevity, this table only includes the most species-rich taxa within each group. Projections for all taxa are given in dataset S6.
| Taxon | Current richness (2020) | Model-based projection | Lower 95% confidence interval | Upper 95% confidence interval |
|---|---|---|---|---|
| Kingdoms | ||||
| Animalia | 1,341,026 | 2,601,719 | 2,535,894 | 2,667,544 |
| Archaea | 523 | 842 | 781 | 903 |
| Bacteria | 14,143 | 87,279 | −62,864 | 237,421 |
| Chromista | 18,083 | 19,155 | 18,505 | 19,806 |
| Fungi | 142,422 | 307,221 | 296,865 | 317,577 |
| Plantae | 362,900 | 532,458 | 520,652 | 544,263 |
| Protozoa | 2,369 | 2,381 | 2,314 | 2,448 |
| Animal phyla | ||||
| Annelida | 15,986 | 55,799 | 53,791 | 57,807 |
| Arthropoda | 1,078,685 | 2,018,995 | 1,966,195 | 2,071,795 |
| Chordata | 74,079 | 135,249 | 130,354 | 140,143 |
| Cnidaria | 13,910 | 23,338 | 21,894 | 24,782 |
| Echinodermata | 7,389 | 7,300 | 7,210 | 7,391 |
| Mollusca | 83,112 | 94,958 | 90,814 | 99,102 |
| Nematoda | 19,289 | 20,268 | 19,842 | 20,694 |
| Platyhelminthes | 23,860 | 26,472 | 26,013 | 26,931 |
| Porifera | 9,258 | 9,148 | 8,731 | 9,566 |
| Arthropod classes | ||||
| Arachnida | 90,877 | 751,965 | 750,440 | 753,490 |
| Branchiopoda | 1,648 | 2,011 | 1,888 | 2,134 |
| Collembola | 8,468 | 10,913 | 10,762 | 11,063 |
| Copepoda | 14,218 | 20,664 | 19,952 | 21,376 |
| Diplopoda | 12,525 | 14,970 | 14,677 | 15,263 |
| Insecta | 881,474 | 1,425,879 | 1,385,319 | 1,466,438 |
| Malacostraca | 41,575 | 353,184 | NA | NA |
| Ostracoda | 7,195 | 7,201 | 6,951 | 7,451 |
| Insect orders | ||||
| Coleoptera | 248,806 | 447,444 | 432,741 | 462,146 |
| Diptera | 163,401 | 218,535 | 205,075 | 231,995 |
| Hemiptera | 94,639 | 170,167 | 170,104 | 170,230 |
| Hymenoptera | 120,658 | 224,754 | 216,118 | 233,390 |
| Lepidoptera | 159,378 | 159,638 | 157,427 | 161,850 |
| Neuroptera | 5,917 | 8,660 | 8,037 | 9,284 |
| Odonata | 6,279 | 7,943 | 7,729 | 8,157 |
| Orthoptera | 27,892 | 60,410 | 57,435 | 63,384 |
| Vertebrate groups | ||||
| Actinopterygii | 33,490 | 114,599 | 114,091 | 115,106 |
| Amphibia | 8,176 | 41,381 | 41,361 | 41,401 |
| Aves | 10,417 | 10,330 | 10,299 | 10,362 |
| Elasmobranchii | 1,260 | 7,763 | NA | NA |
| Mammalia | 5,964 | 5,801 | 5,730 | 5,873 |
| Squamata | 11,073 | 16,467 | 15,386 | 17,547 |
Within each group (kingdoms, animal phyla, arthropod classes, insect orders, and vertebrate classes), the mean annual rate of species descriptions in the past ~20 years (2000 to 2020) in each higher taxon was strongly related to its overall richness (Fig. 2 and dataset S2). Thus, those higher taxa that already had the most species were those that grew the fastest from 2000 to 2020. In each of these five groups, this relationship was strongly influenced by a single taxon with description rates that were at least twice as high as the next highest (Fig. 3 and dataset S3). Thus, among kingdoms, animals had the highest mean annual rate, as did arthropods among animal phyla, insects among arthropod classes, beetles among insect orders, and ray-finned fish (Actinopterygii) among vertebrate classes (Fig. 3). However, this strong relationship between rates and overall richness was hardly inevitable: For each of these species-rich clades, the number of species described in the past ~20 years (2000 to 2020) was still a small minority of the clade’s overall 2020 species richness (animals = 17% described from 2000 to 2020, arthropods = 16%, insects = 15%, beetles = 19%, and ray-finned fish = 21%; table S1 and datasets S2 and S3). Costello et al. (19) also suggested that taxa with more species had faster rates of species descriptions, but did not specify which taxa.
Fig. 2. The relationship between the total number of species in each group and the rate of new species descriptions in that group.
Larger groups have the fastest rates of new species descriptions. Recent rates of new descriptions are the numbers of new species described per year averaged across years from 2000 to 2020. Results are shown for kingdoms across life (n = 7), animal phyla (n = 32), arthropod classes (n = 17), insect orders (n = 26), and vertebrate classes (n = 11). Darker colored circles indicate overlapping data points. The total number of species is based on the year 2020. Regression statistics are given in the upper left corner of each graph. Mean rates for the largest taxa in each group are shown in Fig. 3. Data for each higher taxon are given in dataset S2.
Fig. 3. Mean rates of recent species descriptions for five sets of higher taxa.
Species descriptions are for the years 2000 to 2020 (data in dataset S2). Results are shown for (A) kingdoms across life, (B) phyla across animals, (C) classes across arthropods, (D) orders across insects, and (E) classes across vertebrates. For easier visualization, only the most species-rich taxa are shown within each group (data for all taxa within each group for each year are in dataset S3). For each higher taxon, a boxplot of their rates is shown including the minimum and maximum values (vertical line), the lower 25th percentile (lower edge of box), the mean (thick horizontal line), and the upper 75th percentile (upper edge of box).
We also examined long-term patterns in rates of species descriptions over time within these groups (dataset S4). Different kingdoms showed very different patterns over time (Fig. 4). Animals had peak rates in the early 1900s and 2000s, with strong declines coinciding with World Wars I and II (Fig. 4). Over the past ~50 years, rates in animals were similar from decade to decade but slowly increased over time. By contrast, there were accelerating rates in archaeans, bacteria, and fungi, with peak rates within the past 20 years. Chromista and Protozoa showed more idiosyncratic patterns, both peaking in the 1970s. Rates in plants were similar over the past 150 years. Plants had their highest rate of species descriptions in the 1750s, with a second highest peak near 1910, and a third highest peak near 2020, with increasing recent rates over the past ~70 years.
Fig. 4. Rates of species descriptions over time among kingdoms across life.
Colored lines show the number of new species described each year in each kingdom from 1757 to 2020. Data for each year are given in dataset S4.
Long-term patterns among the largest animal phyla were also highly idiosyncratic (Fig. 5 and dataset S4). The pattern in Arthropoda (insects, crustaceans, spiders, etc.) was similar to that in all animals, with similar peaks in description rates in the early 1900s and early 2000s and sharp declines coinciding with World Wars I and II. By contrast, Mollusca (snails, clams, octopi, etc.), the second largest animal phylum, showed increasing rates over the past 100 years and a strong peak in this century. Chordata (vertebrates, tunicates, and lancelets) had their fastest rates in the mid-1700s and another smaller peak in the 21st century. The marine phylum Echinodermata (sea stars, sea urchins, etc.) peaked in the early 1900s, whereas the largely parasitic phyla Nematoda (roundworms) and Platyhelminthes (flatworms) peaked around the 1970s. Porifera (sponges) peaked in the late 1800s. Annelida (earthworms, leeches, etc.) showed many peaks, with the highest in the 21st century.
Fig. 5. Rates of species descriptions over time among the largest animal phyla.
Colored lines show the number of new species described each year in each phylum from 1774 to 2020. Data for each phylum and year are given in dataset S4.
Among arthropod classes (Fig. 6 and dataset S4), insects showed their highest description rates in the early 1900s (higher than the peak in the 2000s) and sharp declines coinciding with the two world wars. By contrast, Arachnida (spiders, scorpions, mites, etc.) and Malacostraca (crabs, crayfish, shrimp, etc.) had generally increasing rates over time, with peaks in the 2000s. There were generally increasing rates in Collembola (springtails), Copepoda (copepod crustaceans), and Ostracoda (seed shrimp), whereas Branchiopoda (fairy shrimp, clam shrimp, etc.) and Diplopoda (millipedes) had their highest rates before the 1950s.
Fig. 6. Rates of species descriptions over time among the largest arthropod classes.
Colored lines show the number of new species described each year in each class from 1757 to 2020. Data for each class and year are given in dataset S4.
Within insects (Fig. 7 and dataset S4), many of the largest orders had their highest description rates in the early 1900s. For example, Diptera (flies), Hymenoptera (wasps, bees, and ants), Lepidoptera (butterflies and moths), and Neuroptera (lacewings, antlions, etc.) all had their fastest rates near 1910. By contrast, Coleoptera (beetles) and Orthoptera (crickets, grasshoppers, and locusts) had generally increasing rates over time, with their highest rates in the 2000s. Odonata (dragonflies and damselflies) had peaks near 1840, 1910, and 2010. Hemiptera (true bugs) peaked in the 2000s, but their rates generally declined since the 1970s.
Fig. 7. Rates of species descriptions over time among the largest insect orders.
Colored lines show the number of new species described each year in each order from 1758 to 2020. Data for each order and year are given in dataset S4.
Among the largest vertebrate classes (Fig. 8 and dataset S4), ray-finned fishes (Actinopterygii), amphibians, and squamates (lizards and snakes) showed an overall pattern of increasing rates, with the highest rates in the 2000s. Elasmobranchs (sharks and rays) showed a more uniform pattern over time, but also peaked in the 2000s. Birds and mammals both peaked in the mid-1700s and subsequently had generally declining rates (birds) or more constant rates (mammals).
Fig. 8. Rates of species descriptions over time among the largest vertebrate classes.
Colored lines show the number of new species described each year in each class from 1758 to 2020. Data for each class and year are given in dataset S4.
We plotted rates of species descriptions from 2000 to 2020 in figs. S1 to S5. These revealed generally accelerating recent rates in fungi and plants (fig. S1) and some animal phyla (annelids, mollusks, nematodes, and sponges; fig. S2), but not in most arthropod classes (fig. S3) or insect orders (fig. S4). Among vertebrates, recent rates were accelerating in amphibians and squamates (fig. S5). Recent rates were decreasing in some taxa, including arthropods, insects, flies (Diptera), hymenopteran insects, and birds. However, these apparent declines should be interpreted cautiously (see Discussion).
We used two approaches to estimate future known richness (for 2400) based on species descriptions over time. First, we used the best-fitting model of change in cumulative species numbers over time for each group, which generally suggested an asymptote in future species numbers (see below). Second, we used the Bayesian approach bayside (37), but only for larger groups (see Materials and Methods). All these projections should be taken with considerable caution.
For the model-based approach, we compared five models of growth in species numbers over time: linear, exponential, asymptotic, and logistic growth, and the Gompertz model. We compared models for species richness from ~1750 to 2020. For most groups and time periods (dataset S5), either the logistic or Gompertz model had the best fit [lowest Akaike Information Criterion (AIC)]. Under the logistic model, there is slower growth at the initial and final time periods, and the most rapid growth in the middle. Under the Gompertz model, there is slower growth at the final time period than at the initial time period, whereas growth is similar at both time periods under the logistic model.
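The contrast between the logistic and Gompertz shapes, and the AIC comparison, can be sketched as follows. The parameterizations are common textbook forms and the least-squares form of AIC is an assumption; neither is confirmed as the authors' implementation. The "observed" series is synthetic, and the curve parameters are fixed by hand rather than fitted.

```python
import math

def logistic(t, K, r, t0):
    """Symmetric S-curve with asymptote K and inflection at t0 (value K/2)."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def gompertz(t, K, b, c, t0):
    """Asymmetric S-curve with asymptote K; inflection occurs at K/e."""
    return K * math.exp(-b * math.exp(-c * (t - t0)))

def aic_least_squares(observed, predicted, n_curve_params):
    """AIC assuming Gaussian errors: n*ln(RSS/n) + 2k, with k = curve params + 1."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * (n_curve_params + 1)

# Synthetic cumulative counts: a logistic trend plus a small deterministic wiggle.
years = list(range(1750, 2021, 10))
observed = [logistic(t, 1000.0, 0.03, 1900.0) + 5.0 * math.sin(t) for t in years]

# Candidate 1: the logistic curve itself (parameters assumed, not estimated).
logistic_pred = [logistic(t, 1000.0, 0.03, 1900.0) for t in years]
# Candidate 2: a straight line through the first and last points.
slope = (observed[-1] - observed[0]) / (years[-1] - years[0])
linear_pred = [observed[0] + slope * (t - years[0]) for t in years]

aic_logistic = aic_least_squares(observed, logistic_pred, n_curve_params=3)
aic_linear = aic_least_squares(observed, linear_pred, n_curve_params=2)

# Logistic growth is mirror-symmetric around t0; Gompertz inflects early, at K/e.
early_gain = logistic(1860.0, 1000.0, 0.03, 1900.0) - logistic(1850.0, 1000.0, 0.03, 1900.0)
late_gain = logistic(1950.0, 1000.0, 0.03, 1900.0) - logistic(1940.0, 1000.0, 0.03, 1900.0)
gompertz_inflection = gompertz(1900.0 + math.log(2.0) / 0.03, 1000.0, 2.0, 0.03, 1900.0)
```

In a real analysis, the parameters of each candidate would be estimated from the data before AIC values are compared; the lower AIC indicates the preferred model.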
The model-based approach tended to project future described richness that was broadly similar to current known richness (Table 1, dataset S6, and figs. S6 to S10). For example, it estimated 2.6 million animal species, 2.0 million arthropods, and 1.4 million insects (note that estimates from different taxa are independent, and so, combined estimates from different lower-level taxa within a given higher taxon can exceed the estimate for the higher taxon). These latter three taxa (animals, arthropods, and insects) had their fastest rates of species descriptions around 1900 (but roughly tied with the 2000s for animals). However, some taxa were projected to have much higher richness by 2400. These were generally taxa in which rates of species descriptions were highest in this century. These included plants (532,000 projected species versus 362,000 in 2020; see Table 1 for confidence intervals), fungi (307,000 versus 142,000), bacteria (87,000 versus 14,000), annelids (56,000 versus 16,000), arachnids (752,000 versus 91,000), malacostracan crustaceans (353,000 versus 42,000), hemipteran and orthopteran insects (170,000 versus 95,000 and 60,000 versus 28,000, respectively), chordates (135,000 versus 74,000), ray-finned fishes (Actinopterygii; 114,599 versus 33,490), elasmobranchs (sharks and rays; 7763 versus 1260), amphibians (41,381 versus 8176), and squamate reptiles (lizards and snakes; 16,467 versus 11,073). In some of these groups, new species were not projected to fully asymptote by 2400 (figs. S6 to S10), including Animalia and Fungi; the animal phyla Annelida, Brachiopoda (lamp shells), and Chordata; the arthropod classes Arachnida and Malacostraca; the insect orders Coleoptera (beetles), Embioptera (webspinners), Ephemeroptera (mayflies), Mecoptera (scorpionflies), Megaloptera (dobsonflies, etc.), Orthoptera (crickets, grasshoppers, etc.), Plecoptera (stoneflies), Zoraptera (angel insects), and Zygentoma (silverfish, etc.); and the vertebrate classes Actinopterygii, Elasmobranchii, Holocephali (chimeras), Petromyzonti (lampreys), and Squamata. Thus, many more species might continue to be described in these groups past 2400. However, most other groups were projected to have future richness similar to their current described richness. For some of these groups, the projected richness for 2400 has already been exceeded in 2025.
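To make the comparison between projected and current richness concrete, the following sketch computes the ratio of the model-based projection to the 2020 richness for a subset of taxa, using values copied from Table 1. It also flags taxa whose projection falls below their 2020 count. The selection of rows and the variable names are illustrative choices, not part of the original analysis.

```python
# (current richness in 2020, model-based projection for 2400), from Table 1.
table1 = {
    "Fungi": (142_422, 307_221),
    "Plantae": (362_900, 532_458),
    "Echinodermata": (7_389, 7_300),
    "Porifera": (9_258, 9_148),
    "Arachnida": (90_877, 751_965),
    "Malacostraca": (41_575, 353_184),
    "Lepidoptera": (159_378, 159_638),
    "Amphibia": (8_176, 41_381),
    "Aves": (10_417, 10_330),
    "Mammalia": (5_964, 5_801),
}

# Ratio of projected to current richness for each taxon.
ratios = {taxon: projected / current
          for taxon, (current, projected) in table1.items()}

# Taxa whose projection for 2400 is already below their 2020 richness.
below_current = sorted(taxon for taxon, ratio in ratios.items() if ratio < 1.0)

# Taxon with the largest projected relative increase in this subset.
largest_increase = max(ratios, key=ratios.get)

for taxon, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{taxon:15s} {ratio:5.2f}x")
```

In this subset, arachnids and malacostracans are each projected at more than eight times their 2020 richness, whereas the projections for echinoderms, sponges, birds, and mammals are slightly below their 2020 counts.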
Bayside estimated that most groups will have approximately two to four times their current described richness by 2400 (table S2, dataset S7, and figs. S11 to S15). However, some bayside projections seemed unrealistic, such as the very large projected number of birds (see Discussion). Furthermore, all bayside projections suggested that richness would still be increasing in every group in 2400.
DISCUSSION
In this study, we analyze rates and patterns of new species descriptions over time across life. We found that the fastest overall rates across life have been in this century, in contrast to previous studies, which suggested that the fastest rates were ~100 years ago (2, 3). We found that rates of recent species descriptions among clades within major groups were strongly related to their current richness. Thus, the clades with the fastest mean rates from 2000 to 2020 included animals, arthropods, insects, beetles (among insect orders), and ray-finned fishes (among vertebrate classes). These clades each accumulated ~15 to 21% of their current richness just in the past ~20 years. However, rates of species descriptions over time in the past ~250 years were highly variable among groups, with some groups having rates that are still generally increasing over time (e.g., bacteria, fungi, mollusks, arachnids, malacostracan crustaceans, beetles, ray-finned fishes, amphibians, and squamates) and others with rates that have declined over time after peaking long ago, such as in the early 1900s (e.g., insects, including dipterans, hymenopterans, and lepidopterans). Last, we tentatively projected future (described) richness for these groups. Our results suggest that some higher taxa might be far more rich than suggested by their current species numbers, including plants, fungi, arachnids, malacostracan crustaceans, ray-finned fishes, elasmobranchs, squamates, and amphibians.
Limitations of the data
We point out several limitations of the data, but these limitations should not overturn our main conclusions. First, these data represent only an estimate of rates of species descriptions. The data are not truly comprehensive, because some species are not included in the CoL, and for some species, we could not determine the year in which they were described (~2%; see Materials and Methods). Furthermore, the total numbers of species given for some major insect orders were less than expected based on other sources. For example, we included a total of 248,806 species of Coleoptera (beetles) in 2020, but an older study suggested there were already 389,487 extant species in 2013 (38). Although this discrepancy is substantial, we nevertheless found that the highest recent rates of species descriptions (Fig. 3) were in animals, arthropods, insects, and beetles. Thus, adding more beetle species would only help reinforce each of these four patterns. Furthermore, many of the missing species without dates were animals, arthropods, and insects, but these included only ~15,000 species in total (see fig. S16), and these seem more likely to be older species rather than more recent species. Most importantly, the estimate of 389,487 beetle species is based partially on projections and rounding and not precise species counts, especially for the largest groups (39).
Delays between when species are described and when they are added to the CoL could also cause recent rates to appear to be declining (see Materials and Methods and fig. S17). We tried to account for such delays by using data only up to 2020. Nevertheless, despite these delays, we concluded that the fastest overall rates of species descriptions have been in the past ~20 years, with the highest overall rates in 2020, 2015, 2017, and 2016, respectively (Fig. 1B and dataset S1). Again, including additional recently described species would only reinforce this conclusion.
On the other hand, changes in rates of species descriptions per year from 2000 to 2020 might be sensitive to this source of error. Given this, we did not focus primarily on variation in rates over this time period, and focused on longer timescales instead (but see figs. S1 to S5). A potentially artefactual pattern occurs in Hymenoptera (ants, bees, and wasps), in which relatively few species per year were added since ~2008 (151 to 681), but ~721 to 1598 species per year were added in the years immediately preceding 2008. This may have contributed to the recent declines in rates of species descriptions in animals, arthropods, and insects. On the other hand, other major insect groups did not have their highest rates in recent years [Diptera (flies) and Lepidoptera (moths and butterflies)], but without showing the recent decline seen in Hymenoptera. Hymenoptera also had much faster rates near 1900 than around 2000 (suggesting that this same pattern in animals, arthropods, and insects is not explained by a post-2008 slump in Hymenoptera). Overall, we acknowledge that delays in adding species could be problematic, but we reiterate that (i) we partially accounted for this problem by excluding data from 2021 to 2024, (ii) our analysis of CoL data from 2020 suggests that this exclusion should generally be sufficient (see Materials and Methods and fig. S17), (iii) the overall number of species described in 2020 is higher than any other year, suggesting that these delays did not strongly affect 2020 or previous years, and (iv) many groups still showed overall increases in rates over time from 2000 to 2020 (e.g., plants, fungi, and squamate reptiles). A delay in adding the most recently described species to the CoL would not explain these increasing recent rates in these or other groups.
Last, our analyses were based on rates of species descriptions, but some newly described species may prove to be synonyms of existing species. For example, authors in the early 1990s estimated that ~20% of newly described species are synonyms, across diverse insect orders (40). Here, we included only species currently considered “accepted” by the CoL, but some species might still be synonyms. Analyses (41) have found that more recent species descriptions include more authors and more data and describe new species that are less likely to be synonymized [see also (31) for marine animals]. Analyses in fishes (42) and fungi (30) also suggest that synonymy rates have declined sharply over time (from ~50% to <10%).
Projecting future richness
We also used patterns of species descriptions over time to project future species richness (Table 1). These projections may be especially useful for groups that have not been the focus of many detailed projections, as insects have been (43). Nevertheless, we urge considerable caution about these estimates, for several reasons. First, bayside gave estimates of future richness that were consistently approximately two to four times the size of current richness (table S2). For some groups, this seems unlikely. In birds, <10 new species have been described per year since 2000, and description rates have sharply decreased since 2000 (fig. S5). Doubling the current number of bird species by 2400 would require rates to increase again. Under bayside, all groups showed increasing rates close to 2400 (figs. S11 to S15). This approach may fail at predicting so far into the future. Therefore, we focused on the model-based predictions (Table 1).
These model-based predictions used an initial step that selected the best-fitting model to describe change in species numbers over time. Most taxa fit models with slower growth rates initially and more recently (logistic and Gompertz). Hypothetically, apparent recent declines might reflect delays in adding species to the CoL, but many groups (such as insects) had peak description rates ~100 years ago, suggesting that lower rates in the past ~20 years are not necessarily an artifact (see above). This may explain the relatively low projection of future described insect richness (~1.4 million species), whereas many other projections converge on ~5 million undescribed species (43). The low projections may also reflect the fact that rates of species descriptions can be decoupled from actual species richness.
What is more interesting are groups in which large species numbers were projected by this conservative, model-based approach (Table 1). We discuss several of these cases here. The large number of projected fungal species (~307,000) is highly credible: Large-scale sequencing efforts suggest that there are ~6 million undescribed fungal species (44), although other recent estimates have suggested only ~2 to 3 million instead (45). However, our results suggest that, based on current rates of species descriptions, it may take many hundreds of years until all these species are described.
We projected ~532,000 plant species (Table 1). This is unexpected given some other projections of plant richness. Joppa et al. (11) suggested that there were only 10 to 20% more species of flowering plants to be described relative to the current number at that time (which was not given). There has already been a 15% increase from 2000 to 2020 (316,323 to 362,900; dataset S4), and plant description rates rapidly accelerated from 2000 to 2020 (fig. S1). The number of described plant species in 2025 [386,117 (CoL)] is already 30% higher than the 298,000 predicted by Mora et al. (4) and is rapidly approaching the ~400,000 predicted by Chapman (46). Studies by Pimm et al. (47) and Pimm and Joppa (48) proposed that there were ~450,000 plant species, and Dirzo and Raven (1) suggested there could be 500,000 angiosperms alone. We acknowledge that the rate of new species descriptions may peak and decline before reaching 500,000 species, but known plant diversity has already far exceeded some earlier projections.
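The percentages in this paragraph follow directly from the species counts quoted in the text, as the short check below shows. The helper `percent_change` is a hypothetical name introduced only for this illustration.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

# Described plant species in 2000 versus 2020 (dataset S4 values quoted in the text).
increase_2000_to_2020 = percent_change(316_323, 362_900)

# 2025 CoL count versus the 298,000 predicted by Mora et al. (4).
excess_over_mora = percent_change(298_000, 386_117)

# Species still needed to reach the ~400,000 predicted by Chapman (46).
remaining_to_chapman = 400_000 - 386_117

print(f"{increase_2000_to_2020:.1f}% increase from 2000 to 2020")
print(f"{excess_over_mora:.1f}% above the Mora et al. prediction")
print(f"{remaining_to_chapman:,} species short of the Chapman prediction")
```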
We projected (Table 1) large numbers of arachnids (752,000 species) and malacostracan crustaceans (353,000) relative to their current richness (91,000 and 42,000, respectively). These projections are broadly similar to those of Chapman (46), who projected 600,000 arachnids and 150,000 crustaceans (but without a specific methodology underlying their projections). Large numbers of arachnids are especially plausible given the potential for many animal species (especially insects) to host one or more unique mite species [review in (7)].
In vertebrates (Table 1), our results suggest that numbers of described amphibians, squamate reptiles, and ray-finned fishes might increase substantially in the future relative to current numbers (to 41,000, 16,000, and 115,000, respectively). These projections derive from the rapid rates of recent species descriptions in all three groups. Chapman (46) projected more modest increases for these groups (totals of 15,000, 10,000, and 40,000). However, there are now already ~8900 amphibian species (49), ~12,200 squamate species (50), and ~37,000 fish species [mostly ray-finned; (51)]. Thus, described squamates have already exceeded predictions from 2009 by >20%. By contrast, birds and mammals were projected to have numbers similar to their current richness (Table 1).
The projected number of ray-finned fish species (115,000 species) may seem implausibly large. However, this number is less than the mean number of new species per year from 2000 to 2020 projected forward for 400 years (yielding ~134,000 additional species). Previous studies have projected smaller numbers of undescribed fish species, including studies suggesting that only 21 to 50% of marine fishes remain undescribed (42, 52, 53) and 34 to 42% of Neotropical freshwater fishes (54).
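The comparison above is simple arithmetic and can be reproduced as a rough sketch. The only inputs are the figures quoted in the text, and the implied mean rate is back-calculated from them rather than taken from the dataset:

```python
# Figures quoted in the text.
additional_species = 134_000   # mean 2000-2020 rate carried forward 400 years
years_forward = 400
model_projection = 115_000     # model-based projection (Table 1)

# Mean number of new species per year implied by the two figures above.
implied_mean_per_year = additional_species / years_forward

# The model-based projection is smaller than the constant-rate extrapolation,
# even before adding the species that are already described.
projection_is_smaller = model_projection < additional_species

print(implied_mean_per_year, projection_is_smaller)
```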
For fish and many other groups, the number of species may be increased greatly by cryptic species. Cryptic species are species that were initially thought to be conspecific based on morphological data but were later found to be distinct from each other, typically based on phylogenetic analyses of DNA sequence data (55–57). Although the frequency of cryptic species in many groups is unclear, in insects, recent estimates suggest that each morphology-based species may contain (on average) approximately three cryptic species (8). More research is needed on the frequency of cryptic species in other groups, and on how many newly described species are cryptic. Specifically, if most newly described species in a group are cryptic, then cryptic species may not strongly influence our projections. On the other hand, if most newly described species are not cryptic, then cryptic species might lead to higher richness than our projections. A survey from 1990 to 2018 (20) suggested that most new species in 2018 were described without molecular data in plants and insects (ruling out cryptic species), whereas for other groups (fungi, protists, vertebrates, and mollusks), the majority of recent species descriptions incorporated molecular data. Surveys of new species descriptions from 2015 to 2020 (58) suggest that very few new species are described without morphological data.
We reiterate the need for caution in these long-term projections. The model-based approach could underestimate richness if apparent slowdowns in species descriptions are artifactual or temporary (59). Thus, we highlighted projections that were much larger than current richness. Conversely, we did not include putative sampling effort. Thus, we might underestimate when species descriptions will slow down over time as more species are described (and thereby overestimate overall richness). Sampling effort has been characterized based on fewer species being described per taxonomist, despite an increasing number of taxonomists: Together, these two factors may indicate a dwindling pool of new species to describe (2, 36, 60). Other authors have argued against using the number of authors on taxonomic papers as an indication of the effort needed to find new species, since newer papers with more authors may also tend to have more data (41). The increasing number of authors on taxonomic papers most likely reflects the overall increases in the number of authors per scientific paper across the sciences, rather than a phenomenon that is specific to taxonomy that necessarily indicates that most species have already been described (61, 62). Furthermore, the number of authors on a taxonomic paper may not actually reflect the number of dedicated taxonomists involved (61, 62).
We suggest that projections based on taxonomic effort might underestimate species richness. Costello et al. (2) projected future species numbers based on inferred taxonomic effort. Comparing their projections to current (2025; CoL) species numbers shows that some groups have already surpassed their projections for 2100 and beyond (i.e., total projected richness). These groups include fungi (present number = 156,000; their projected mean by 2100 = 72,700; their projected mean total = 84,400) and reptiles (present number = 12,400; mean 2100 = 10,900; mean total = 11,900). Similarly, Joppa et al. (11) predicted only a 10 to 20% increase in plant richness in the future, whereas there has already been a 15% increase since 2000 (see above). All three groups (fungi, reptiles, and plants) are ones in which rates of species descriptions still appear to be accelerating (at least from 2000 to 2020; figs. S1 and S5). Again, all of these predictions should be taken with much caution.
All of these projections are only for numbers of described species, and true species numbers might be considerably larger. Some groups that may actually be among the largest have very few described species. Protists (Chromista + Protozoa) have ~64,000 described species (CoL), but might number in the millions (63), especially apicomplexans associated with insect hosts (7). For bacteria, estimates based on putatively host-specific species project numbers in the low billions (7, 8), whereas we projected only ~87,000 described species by 2400 (Table 1). Other lines of evidence suggest that there might be trillions (10, 64), and even low estimates are in the millions (65). Rates of recent species descriptions may not be useful for estimating total species numbers in these groups. However, they do highlight that species descriptions might lag behind the actual richness of these groups to an extreme degree. New approaches for cataloging and describing this diversity may be needed, such as the Genome Taxonomy Database for prokaryotes (66). Among macroscopic organisms, many projections suggest that there are ~6 million insect species [review in (43)], whereas our projections imply that there are only ~1.4 million (Table 1). Our interpretation of this discrepancy is not that previous projections of high insect diversity are wrong. Instead, we suggest that it will take >400 years to describe a small fraction of these many millions of insect species at the current rate of species description. Thus, our results suggest that new approaches are needed to accelerate the rate of species discovery and description, even for macroscopic organisms. However, a recent review (58) estimated that most new arthropod species are still described based on morphological data alone (as are most plants).
What about extinction?
Systematists are in a race to discover and describe Earth’s species before they go extinct (3). Our projections do not include the possibility that many species may go extinct before being described [dark extinctions: (67–69)]. Current documented extinctions are low [i.e., only ~912 documented species extinctions over the past ~500 years; (70)], but these rates may be underestimated due to dark extinctions. On the other hand, current rates of species descriptions (Fig. 1B) show that considerable diversity can still be found, even after these potential dark extinctions. The problem is that it is difficult to include estimated extinction rates in our calculations if the number of species going extinct is unknown. In short, high future extinction rates could decrease rates of future species discovery, but these extinction rates are unknown and contentious (2, 18, 19).
The rate of species description in each year can be strongly dependent on other human-related factors. One of the most striking patterns in the rates of species descriptions over time is the negative effect of World Wars I and II [Fig. 1B; as previously noted in (2)]. Climate change might cause similar levels of disruption, along with the potential loss of up to ~30% of macroscopic species richness [recent reviews in (71, 72)]. On the other hand, there may be new technologies (e.g., environmental DNA) and other factors that help accelerate future rates of species discovery and description.
Summary
In this study, we have examined the rate at which new species are described over time and across life. Our results suggest that in the past ~20 years, overall rates of species descriptions have been at their fastest ever, in contrast to the idea that most species have already been described and that description rates are therefore decreasing. We find that across life, it is the largest groups that are growing most rapidly. Thus, the future will likely continue to see the current numerical dominance of animals across life, arthropods across animals, insects across arthropods, beetles among insects, and ray-finned fish among vertebrates. We also provide projections of future species richness for many groups across life and within animals. Although these projections should be taken with great caution, they suggest that there may be unexpectedly high species richness in some groups, such as plants, fungi, arachnids, malacostracan crustaceans, ray-finned fishes, amphibians, and squamates.
MATERIALS AND METHODS
Obtaining data
We downloaded all species records from the CoL website using the Catalogue of Life Data Package (ColDP) on 29 September 2024 (database version: 2024-09-25) (32). The ColDP includes several files with different types of information (see the format definitions at https://github.com/CatalogueOfLife/coldp). We also downloaded all the species records from the Global Biodiversity Information Facility (GBIF; downloaded on 29 October 2024, database version: 2023-08-28) (33) and the World Flora Online (WFO; downloaded on 23 January 2024, database version: v.2024.1) (34) websites using Darwin Core Archive format (DwC-A). These were used for cross-validating the primary published year of species in the CoL. We did not use GBIF as the primary data source because it contains large numbers of extinct, fossil species (which are not easily identifiable as such), whereas WFO only contains plants. We focused on the “NameUsage.tsv” (ColDP), “Taxon.tsv” (GBIF), and “classification.csv” (WFO) files in each database. These include the species name and the date of description.
To prepare the dataset for further analyses, we extracted all valid species and their corresponding primary published years from the three databases individually. We then determined the correct primary published year through a majoritarian vote.
The extraction of CoL data followed several steps. We first selected the 2,099,602 records in the pandas data frame (73) that had a “rank” of species and were listed as “accepted” under the column for status (i.e., we included only accepted species).
To determine the higher taxa to which each species belonged, we recursively searched the “NameUsage.tsv” file using the species’ “parentID” column. We then assigned each species to different taxonomic ranks above the genus level (i.e., kingdom, phylum, class, order, and family). Each taxonomic rank was given a different column.
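The parent lookup described above can be sketched as follows. This is an illustrative miniature, not the study's script: the record IDs, the in-memory `RECORDS` table, and the function name `higher_taxa` are assumptions, and the walk is written as a loop rather than a recursive call.

```python
from typing import Dict, Optional, Tuple

# Hypothetical miniature of NameUsage.tsv: ID -> (parentID, rank, name).
RECORDS: Dict[str, Tuple[Optional[str], str, str]] = {
    "K1": (None, "kingdom", "Animalia"),
    "P1": ("K1", "phylum", "Arthropoda"),
    "C1": ("P1", "class", "Insecta"),
    "O1": ("C1", "order", "Coleoptera"),
    "F1": ("O1", "family", "Carabidae"),
    "G1": ("F1", "genus", "Carabus"),
    "S1": ("G1", "species", "Carabus auratus"),
}

# Ranks above the genus level that receive their own column.
WANTED = ("kingdom", "phylum", "class", "order", "family")


def higher_taxa(record_id: str, records=RECORDS) -> Dict[str, str]:
    """Follow parentID links upward and collect the wanted ranks."""
    lineage: Dict[str, str] = {}
    seen = set()                      # guards against circular parent links
    parent = records[record_id][0]
    while parent is not None and parent in records and parent not in seen:
        seen.add(parent)
        next_parent, rank, name = records[parent]
        if rank in WANTED:
            lineage[rank] = name
        parent = next_parent
    return lineage


print(higher_taxa("S1"))
```

Each key of the returned dictionary would become one column of the species table.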
We excluded viruses from our dataset given that many scientists do not consider viruses to be living organisms (35). The names of virus kingdoms were used to identify virus species in our dataset and exclude them. We obtained virus-related kingdom names from the CoL website. The 11,270 viruses were excluded by matching a species’ kingdom name (including “Viruses,” which was considered a kingdom name during the higher taxa determination process) to the virus kingdom names.
We excluded extinct species because our interest is in estimating and discovering Earth’s living biodiversity, which we see as more urgent than describing species that are already extinct. Therefore, the 147,405 extinct species (almost all fossil taxa) were also excluded (species listed as “True” under the column “extinct”). Species were retained as extant when there was no value in this column. We also downloaded species records from the Paleobiology Database (PBDB; https://paleobiodb.org) and filtered out truly extinct species by setting “is_extant” to “extinct.” Then, we removed 1964 extinct species by comparing the parsed scientific name with the extinct species records in PBDB.
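The three CoL filters described so far (accepted species only, no viruses, no extinct species) can be combined in a short pandas sketch. The column names, the toy rows, and the contents of `VIRUS_KINGDOMS` are illustrative assumptions. The real ColDP headers and value encodings may differ, and the PBDB name-matching step is not shown.

```python
import pandas as pd

# Toy stand-in for the NameUsage table after higher taxa were assigned.
usage = pd.DataFrame({
    "scientificName": ["Carabus auratus", "Carabus", "Tyrannosaurus rex",
                       "Norwalk virus", "Quercus robur", "Quercus pedunculata"],
    "rank": ["species", "genus", "species", "species", "species", "species"],
    "status": ["accepted", "accepted", "accepted",
               "accepted", "accepted", "synonym"],
    "kingdom": ["Animalia", "Animalia", "Animalia",
                "Orthornavirae", "Plantae", "Plantae"],
    "extinct": [None, None, True, None, None, None],
})

# Assumed list of virus kingdom names (illustrative, not the full CoL list).
VIRUS_KINGDOMS = {"Viruses", "Orthornavirae"}

is_accepted_species = (usage["rank"] == "species") & (usage["status"] == "accepted")
is_virus = usage["kingdom"].isin(VIRUS_KINGDOMS)
# Empty values in the "extinct" column are treated as extant.
is_extinct = usage["extinct"].eq(True)

kept = usage[is_accepted_species & ~is_virus & ~is_extinct]
print(kept["scientificName"].tolist())
```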
Determining the year of the species description
For each species, the year that the species description was published was parsed from the columns: “basionymID,” “authorship,” “nameReferenceID,” and “referenceID.” First, we retrieved the species’ basionym record from the column “basionymID” and obtained the primary published year from the corresponding authorship and reference columns. If the column “basionymID” was empty, we used the column “authorship” to determine whether the species was a new combination (a new combination should have the original author names in parentheses). We then used regular expression patterns [“(\d{4})”: four consecutive digits; “(\d{4})\)”: four consecutive digits immediately followed by a closing parenthesis] to extract the correct primary published year. If no year was obtained from these steps, we retrieved the journal year from the “reference.tsv” file (extracted from “nameReferenceID,” corresponding to the nomenclatural reference). If no journal year was obtained, we attempted to determine the earliest published year from all reference records (“referenceID”), corresponding to the taxonomic reference(s) as the primary published year. We generally retrieved this information from the reference field “issued” (the date the work was issued or published) using the regular expression for four consecutive digits (e.g., extracting only the year from a date given as “2000-01-01”). We also obtained this information from the raw text from the columns “nameReferenceID” and “referenceID.”
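The two regular expressions quoted above behave as in the following sketch. The function name `primary_year` and the example authorship strings are ours. The fallback through the reference files is omitted, so this covers only the authorship-string step.

```python
import re
from typing import Optional

ANY_YEAR = re.compile(r"(\d{4})")             # four consecutive digits
YEAR_BEFORE_PAREN = re.compile(r"(\d{4})\)")  # four digits, then ")"


def primary_year(authorship: str) -> Optional[int]:
    """Prefer the year inside the parentheses (original description)."""
    match = YEAR_BEFORE_PAREN.search(authorship)
    if match is None:
        # No parenthesized year: fall back to the first four-digit run.
        match = ANY_YEAR.search(authorship)
    return int(match.group(1)) if match else None


print(primary_year("(Linnaeus, 1758) Smith, 1901"))  # new combination
print(primary_year("Linnaeus, 1758"))                # original description
print(primary_year("2000-01-01"))                    # an "issued" date
```

Checking the parenthesized pattern first is what keeps the year of a new combination from being mistaken for the year of the original description.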
Some species records in CoL had more than one reference associated with the species name. In these cases, the earliest year was chosen as the year of publication (assuming that the scientific name was published first in that year). Some journals corrected the year in which the paper was published (e.g., “2007 publ. 2008”). Following the International Code of Zoological Nomenclature Fourth Edition (ICZN; https://code.iczn.org), the formal published year is the later year (see Article 21.2 and 21.9, www.iczn.org/the-code/the-code-online/). The choice between adjacent years should have very limited impact on our overall results.
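The two tie-breaking rules in this paragraph (later year within a corrected date, earliest year across references) can be sketched as below. This is a simplification under a stated assumption: each input string contains only year tokens, whereas real citation strings may contain other four-digit numbers (e.g., page ranges) that would need to be excluded first. Both function names are hypothetical.

```python
import re
from typing import Iterable, Optional


def year_from_date_string(text: str) -> Optional[int]:
    """For a corrected date such as '2007 publ. 2008', keep the later year."""
    years = [int(y) for y in re.findall(r"\d{4}", text)]
    return max(years) if years else None


def year_across_references(date_strings: Iterable[str]) -> Optional[int]:
    """With several references for one name, keep the earliest year."""
    years = [y for y in map(year_from_date_string, date_strings) if y is not None]
    return min(years) if years else None


print(year_from_date_string("2007 publ. 2008"))
print(year_across_references(["1905", "2007 publ. 2008"]))
```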
During the process of removing duplicated species records, we noted that some species names appeared duplicated across different kingdoms. However, these were recognized as a valid name in each kingdom. Therefore, we initially unified all species names to the standardized binomial nomenclature form (genus name followed by the species, “genus species”), using a set of regular expression patterns. Then, we combined the kingdom name with the unified species name and removed the duplicated combinations of kingdoms with binomial names throughout the dataset. There were 458 duplicate species records with identical genus and species names in the same kingdom. The duplicated species were dropped, and only the first species was kept, using the function “drop_duplicates” in the pandas data frame (73).
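A minimal pandas sketch of this deduplication step follows. The toy rows, the column name `kingdom_binomial`, and the helper `to_binomial` are illustrative assumptions. The study used a set of regular expressions to standardize names, whereas this sketch simply keeps the first two words.

```python
import pandas as pd


def to_binomial(name: str) -> str:
    """Reduce a scientific name to 'Genus species' (first two words)."""
    return " ".join(name.split()[:2])


# Toy data: the same binomial used in two kingdoms, plus one true duplicate.
names = pd.DataFrame({
    "scientificName": ["Ficus variegata Blume",
                       "Ficus variegata Röding, 1798",
                       "Ficus variegata Blume"],
    "kingdom": ["Plantae", "Animalia", "Plantae"],
})

# Combine kingdom and binomial so that cross-kingdom homonyms are both kept.
names = names.assign(
    kingdom_binomial=names["kingdom"] + "_" + names["scientificName"].map(to_binomial)
)

deduplicated = names.drop_duplicates(subset="kingdom_binomial", keep="first")
print(deduplicated["kingdom_binomial"].tolist())
```

Keying on the kingdom as well as the binomial is what prevents a valid plant name from being dropped because an animal shares it.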
In preliminary analyses, the primary published year for some species (especially plants) was incorrect, because new combinations were treated as new species. To address this issue, we made several changes. We obtained the primary published year through the “basionymID” in the CoL database, which is associated with new combinations. If the “basionymID” was missing or the extraction of the year for the basionymID failed, we examined the authorship of the new species or the new combination by searching for the closing parenthesis: This parenthesis is used to display the original author(s) and primary published year. If the new combination included the original publication year of the species, specifically any four consecutive digits immediately followed by a closing parenthesis, we used this year instead of the date of the new combination. This avoided mistakenly treating the publication year extracted from the new combination as the primary published year. To test whether this resolved the issue, we randomly selected 200 species records from all plant species in the final combined dataset, using pandas [“sample(n = 200, random_state = 42)”]. We then manually verified whether the primary published year for the newly combined species was correct. Among these 200 species, 49 represented new combinations of genus and species names, and not new species. Two of these 49 species lacked a published year and were subsequently excluded from the analysis. The remaining 47 species all had the correct primary publication year. Based on this sample, it appeared that the primary published year was accurately determined when there were new combinations of genus and species names.
After these steps, the CoL dataset had 1,940,469 species records. The year of publication was obtained directly from CoL for 93.8% of the species. The year of publication remained unclear for 6.2% of the species. The extractions of the primary published year for the GBIF and WFO databases in DwC-A format were similar to CoL in ColDP format, except that the higher taxa names were already included in the GBIF and WFO databases. The GBIF dataset had 2,566,048 species records (but included many extinct, fossil species), of which 96.0% had their primary published years parsed, whereas 4.0% were missing. The WFO dataset contained 342,608 species records, of which 91.5% had their primary published years parsed, with 8.5% missing. In the end, three parsed datasets were merged using the “merge” function in pandas. The dataset from CoL served as the framework, and the scientific name and primary published year from the other two datasets were integrated into CoL through their shared column “x_sci_name_simpleRE_kingdom” (kingdom-binomial names). The final primary published year for each species was determined via a majoritarian vote among the three datasets: if only one database provided the year, that year was accepted. If two or three databases provided the year, it was accepted only if at least two had the same year. If all databases had different years for the same species, or none provided the year, the species was excluded.
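The majoritarian vote described at the end of this paragraph can be written compactly. The function name `vote_year` is hypothetical, and a missing year is represented by `None`:

```python
from collections import Counter
from typing import Optional


def vote_year(col: Optional[int], gbif: Optional[int],
              wfo: Optional[int]) -> Optional[int]:
    """Return the agreed primary published year, or None to exclude the species."""
    years = [y for y in (col, gbif, wfo) if y is not None]
    if not years:
        return None          # no database provided a year
    if len(years) == 1:
        return years[0]      # a single source is accepted as is
    year, count = Counter(years).most_common(1)[0]
    # With two or three sources, at least two must agree.
    return year if count >= 2 else None


print(vote_year(1900, None, None))   # single source
print(vote_year(1900, 1900, 1901))   # two of three agree
print(vote_year(1900, 1901, None))   # two sources disagree
```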
For Archaea and Bacteria, our primary dataset based on the CoL lacked new species past 2012, whereas GBIF included new species from 2013 to 2024. For these groups, we used all GBIF species records published after 2012. Furthermore, it was unclear whether the CoL species lists for 2012 were comprehensive or not, and so we included GBIF data for 2012 also. We then checked these species and excluded those with scientific names matching extinct species in the PBDB database and found that none were extinct. These GBIF records were then merged with our dataset by the shared column “x_sci_name_simpleRE_kingdom.”
We also found other groups in which there was an abrupt absence of new species in recent years in the CoL database (up to 2020), despite many new species per year in preceding years. These included the animal phylum Acanthocephala (no new species from 2016 to 2020), the vertebrate classes Actinopterygii (2018 to 2020) and Amphibia (2020), and the insect orders Ephemeroptera (2013 to 2020), Neuroptera (2019 to 2020), and Trichoptera (2000 to 2020). We used GBIF to fill in the years with no new species (in CoL) in these cases, as described above. In other cases, when there were no new species in CoL for a given year but adjacent years also had few or no new species, we treated the group as having zero new species for that year. However, in the cases of Elasmobranchii (2018 to 2020), Holocephali (2018 to 2020), and Myxini (2018 to 2020), we used GBIF because of the dearth of fish records in CoL after 2017, even though these groups have some recent years with no new species. We also found that the CoL was entirely lacking records for the insect orders Archaeognatha and Zygentoma, and we used GBIF data from 1753 to 2020 for these taxa. Overall, we included 9889 supplemental species from GBIF. Similarly, Copepoda (a class within Arthropoda) was absent in the initial versions of the CoL dataset we used, but was added very recently. We downloaded data for Copepoda on 27 July 2025 (database version: 2025-06-13). After data cleaning (e.g., excluding extinct species) and parsing (determining the primary published year), a total of 14,753 copepod species were added to the final dataset.
The final dataset contained 1,962,217 species, after removing species without a clear published year. Overall, 97.98% of the species in the combined dataset were successfully parsed, and only 2.02% had to be excluded because they lacked publication dates (fig. S16 and dataset S8). These 39,582 missing species were dominated by animals [17,384; consisting mostly of arthropods (15,789), and specifically insects (14,004)], plants [12,011; mostly angiosperms (9258)], and fungi (6585). Most of the missing insects were Lepidoptera (3543), Hymenoptera (3199), Hemiptera (2824), Coleoptera (2377), and Diptera (1666). Because the missing species were dominated by the largest groups, adding these species should not overturn our conclusions based on the nonmissing species.
Effects of delays in adding species to the CoL
Our initial analyses suggested that there was a sharp decline in the number of species described in the past few years (2021 to 2024). Therefore, we examined the annual count of new species published in three databases (CoL, GBIF, and WFO) from 2018 to 2024 (fig. S17). The results showed a reduction in the number of new species published each year by all databases after 2020. This could reflect a time delay between species description and their inclusion in CoL and other databases. Regardless of which database, the number of new species described in 2020 was higher compared to the previous year. Therefore, we assumed that data from 2020 and before should not be strongly affected by these delays. Furthermore, in preliminary analyses, we found that there was a similar reduction in the number of species described per year using the 2020 version of the CoL. This strongly suggested that the decline in the number of new species per year in recent years reflected delays in adding new species to this database (and presumably the other two as well).
Projecting future species richness
We used two approaches to project future numbers of described species. First, we used the “predict” function in R version 4.3.3 (74). R was used on the Montage Jintide C5218R chip platform. The “predict” function was used to forecast the number of described species in 2400. These forecasts used logistic and Gompertz models that were fitted with the “drm” function in the R package drc (75), as described in detail in the “Comparing models of growth in species numbers over time” section below. Models were fit based on the cumulative numbers of described species in each group from year 1753 to 2020. The “predict” function also calculated the 95% confidence intervals on this prediction. The past 4 years (2021 to 2024) were removed given that we found reduced numbers of species added to the databases in the past 4 years for some groups (see above). Projections were made for each year spanning from 2020 to 2400, and the total, cumulative species number was the number for 2400.
Prior to predicting future richness, we tested the fit of different models for the growth of species numbers over time (see section below). These analyses (dataset S5) showed that most groups had the best fit to a logistic or logistic-like model (i.e., Gompertz). Therefore, the number of projected species remained similar over time once they reached a “plateau” stage. A few groups showed better fit to a linear or exponential model of richness over time, but these groups generally had limited current and predicted species numbers. The only exception was the animal phylum Kinorhyncha, which showed exponential increases in richness and high projected species diversity by 2400. Kinorhyncha presently has only a few hundred species, and so projections of tens of millions of species seem highly implausible. Therefore, we did not describe these projections in the Results.
As a second approach, we used the bayside model to forecast species richness (37). Bayside is an R language script that provides a probabilistic framework for predicting future species discovery through the application of Bayesian time series regression analysis. Bayside projects the number of newly described species each year in the future, based on counts of new species for each year in the past. Bayside uses Stan (76), an advanced platform designed for efficient statistical modeling and computation.
To align with the data requirements of the bayside model, the original dataset was transformed with the following steps. A new variable, “valid_species_id” was represented by a series of unique numerical identifiers. We appended a column of sequential numbers to the original dataset to serve as the identification number for each species, given that the scientific name of each species is unique. The “year” referred to the year of publication for each newly described species.
We focused on a limited number of clades that collectively contained most living species on Earth. In a series of five separate analyses, we analyzed kingdoms across life, phyla across animals, classes across arthropods, orders across insects, and classes across vertebrates. In each analysis, each of these higher taxa (i.e., kingdoms, phyla, classes, and orders) was labeled as the “group.” For example, for Insecta, the order names within Insecta were used as the categories for the “group.”
We did not include the authorship of species (“species_authority”), although this was used in the original study in which bayside was proposed (37). We think that an increasing number of authors on each species description does not necessarily reflect decreasing species numbers within a group. Therefore, we did not perform analyses that relied on this assumption (see also Discussion). Furthermore, our data on authorship of species names only contained the surnames of authors. This could yield misleading results (e.g., large numbers of people sharing the surname “Smith” in the United States or “Li” in China). On the other hand, we acknowledge that this may have led bayside to give problematic results.
Bayside generates multiple simulated numbers of species for each year in the future. However, in some simulation replicates, it can sometimes generate extreme values that are considered outliers. Given that there were very different numbers of species among groups, the original outlier screening in bayside (which was applied across groups) was changed to screen for outliers within each group. Here, we excluded simulation replicates for each group with cumulative species counts (i.e., total number of species at 2400) that were four times higher than current species richness of the group in 2020. The use of a value of four followed from bayside’s setting to exclude outliers. However, we also tested the impact of using a value of two and a value of eight to exclude outliers. These different values had limited impact on the final projected species numbers (dataset S9). Specifically, using a value of eight gave results that were almost identical to those using four. Using a value of two yielded final species numbers that could be somewhat smaller than those using a value of four, but only by 15.6% on average (maximum of 37.5%). Overall, we concluded that the outlier screening value did not strongly affect the results.
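The within-group outlier rule can be sketched as follows. This is not bayside's code: the function name, the argument names, and the toy numbers are assumptions, and the sketch treats "four times higher" as a cutoff of four times the 2020 richness.

```python
from typing import List, Sequence


def screen_replicates(totals_2400: Sequence[float], richness_2020: float,
                      factor: float = 4.0) -> List[float]:
    """Drop simulation replicates whose cumulative total at 2400 exceeds
    `factor` times the group's richness in 2020."""
    cutoff = factor * richness_2020
    return [total for total in totals_2400 if total <= cutoff]


# Toy example for one group with 10,000 described species in 2020.
simulated = [15_000, 22_000, 39_000, 41_000, 250_000]
for factor in (2.0, 4.0, 8.0):
    print(factor, screen_replicates(simulated, 10_000, factor))
```

Applying the cutoff per group, rather than across groups, keeps a single threshold from being dominated by the largest clades.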
The cutoff time for the input data was generally set to 2020 (see above). Preliminary analyses indicated that the absence of any newly described species in a group for several years led to problems of divergent transitions and low effective sample sizes in Stan (76). Therefore, for the bayside analyses, we only included groups with relatively few years without newly described species (<30% of the years from 1758 to 2020). We also avoided groups with no new species near the cutoff year (2020). In general, this meant that we focused on relatively large groups within each major clade. These were the groups of greatest interest here.
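The first inclusion criterion above (fewer than 30% of years without new species) can be checked with a few lines. The function names and the dictionary layout (year mapped to number of new species) are illustrative assumptions, and the second criterion (no gap near 2020) is not implemented here.

```python
from typing import Dict


def zero_year_fraction(new_species_by_year: Dict[int, int],
                       start: int = 1758, end: int = 2020) -> float:
    """Fraction of years in [start, end] with no newly described species."""
    years = range(start, end + 1)
    empty = sum(1 for year in years if new_species_by_year.get(year, 0) == 0)
    return empty / len(years)


def eligible_for_bayside(new_species_by_year: Dict[int, int],
                         max_fraction: float = 0.30) -> bool:
    """True when fewer than `max_fraction` of the years are empty."""
    return zero_year_fraction(new_species_by_year) < max_fraction


# Toy groups: one described every year, one only in even-numbered years.
every_year = {year: 5 for year in range(1758, 2021)}
even_years = {year: 5 for year in range(1758, 2021, 2)}
print(eligible_for_bayside(every_year), eligible_for_bayside(even_years))
```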
We conducted analyses among kingdoms across life, among animal phyla, among arthropod and vertebrate classes, and among insect orders. The end year of the projection was set to 2400. We added some code to plot the original cumulative species number and the forecasted cumulative species number together. The final projected cumulative number of species was calculated by averaging the species counts from all simulations (1000 times) conducted in the year 2400. Confidence intervals for the 10th and 90th percentiles were calculated.
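Summarizing the retained simulation replicates amounts to a mean and two percentiles. The sketch below uses Python's standard library and toy values. The interpolation rule for percentiles is an assumption (it is the default of `statistics.quantiles`) and may differ slightly from the one used in the R script.

```python
import statistics
from typing import Dict, Sequence


def summarize_replicates(totals_2400: Sequence[float]) -> Dict[str, float]:
    """Mean and 10th/90th percentiles of cumulative species counts at 2400."""
    deciles = statistics.quantiles(totals_2400, n=10)  # nine cut points
    return {
        "mean": statistics.fmean(totals_2400),
        "p10": deciles[0],
        "p90": deciles[-1],
    }


# Toy stand-in for 1000 simulation replicates.
toy_totals = list(range(1, 1001))
summary = summarize_replicates(toy_totals)
print(summary)
```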
We note that both approaches often estimated confidence intervals on these projections that were relatively narrow, but especially the model-based approach. We caution that these narrow intervals may overestimate the confidence that one should have in these projections.
The R code used in all analyses is given in dataset S10. All Python scripts used are given in dataset S11.
Comparing models of growth in species numbers over time
We compared different growth models to describe patterns of increasing species richness over time. We compared 11 models of five types (linear, exponential, asymptotic, logistic, and Gompertz) for all of the major groups analyzed (kingdoms, animal phyla, arthropod classes, insect orders, and vertebrate classes).
Model-fitting analyses were performed in R version 4.3.3 (74). The asymptotic, logistic, and Gompertz models were tested using the “drm” function in the R package drc (75). To use the “drm” function, three parameters are required: a formula (listed below), a dataset, and the “fct” parameter. The “fct” parameter is a list with two or more elements specifying the linear or nonlinear function, the accompanying self-starter function, the names of the parameters in the nonlinear function, and (optionally) the first and second derivatives. To enable direct comparison of AIC values across all models (including linear and exponential, not available in the drc package), we incorporated another R package, statforbiology (77). This package extends the “drm” function by providing additional “fct” parameter specifications, including “DRC.linear()” for the linear model and “DRC.expoGrowth()” for the exponential model. All possible “fct” parameters provided by the drc and statforbiology packages were evaluated, respectively, and the best “fct” parameter was determined based on the one yielding the lowest AIC value for each model type. Then, the best-fit model was used for projecting the number of species by 2400 in each group. 
For the formula descriptions below, all datasets were renamed as “data,” the independent variable “year of publication” was named “x,” and the dependent variable (number of species) was named “y.” The linear model was fit with the function “drm(y ~ x, data = data, fct = DRC.linear())”; the exponential model with “drm(y ~ x, data = data, fct = DRC.expoGrowth())”; the asymptotic model with “drm(y ~ x, data = data, fct = AR.2())” or “drm(y ~ x, data = data, fct = AR.3())”; the logistic model with “drm(y ~ x, data = data, fct = L.3()),” “fct = L.4(),” or “fct = L.5()”; and the Gompertz model with “drm(y ~ x, data = data, fct = G.2()),” “fct = G.3(),” “fct = G.3u(),” or “fct = G.4().” The number following the period in the “fct” parameter indicates the number of parameters in the model formula that are estimated from the data (the remaining parameters are held at preset values). For example, for the asymptotic model formula “f(x) = c + (d − c)(1 − exp(−x/e)),” the “AR.3()” model estimates three parameters (c, d, e), whereas the “AR.2()” model estimates two (d, e). The L.3(), L.4(), and L.5() variations of the logistic model “f(x) = c + (d − c)/(1 + exp(b(x − e)))^f” estimate the parameters (b, d, e), (b, c, d, e), and (b, c, d, e, f), respectively. The G.2(), G.3(), G.3u(), and G.4() variations of the Gompertz model “f(x) = c + (d − c)(exp(−exp(b(x − e))))” estimate the parameters (b, e), (b, d, e), (b, c, e), and (b, c, d, e), respectively. Across different models, the parameters serve specific functions: Parameter b controls the rate of change or the steepness of the curve, parameters c and d define the lower and upper limits of the response, whereas parameter e signifies the position of the inflection point or midpoint, collectively determining the overall growth trend of the curve.
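To make the three curve families concrete, the formulas quoted above can be written as plain Python functions. This is an illustration of the equations only, not a reimplementation of the drc fitting routine. The default argument values stand in for the parameters that the reduced variants hold constant (our reading of the drc documentation: c = 0 for AR.2, L.3, and G.3; f = 1 for L.3 and L.4; d = 1 for G.3u; c = 0 and d = 1 for G.2), and the numbers in the example are arbitrary.

```python
import math


def asymptotic(x: float, d: float, e: float, c: float = 0.0) -> float:
    """f(x) = c + (d - c)(1 - exp(-x/e))"""
    return c + (d - c) * (1.0 - math.exp(-x / e))


def logistic(x: float, b: float, d: float, e: float,
             c: float = 0.0, f: float = 1.0) -> float:
    """f(x) = c + (d - c) / (1 + exp(b(x - e)))^f"""
    return c + (d - c) / (1.0 + math.exp(b * (x - e))) ** f


def gompertz(x: float, b: float, e: float,
             c: float = 0.0, d: float = 1.0) -> float:
    """f(x) = c + (d - c) exp(-exp(b(x - e)))"""
    return c + (d - c) * math.exp(-math.exp(b * (x - e)))


# Arbitrary example: upper limit d = 1000 species, midpoint e = 1900.
# In this parameterization a growing curve requires a negative b.
for year in (1800, 1900, 2000, 2400):
    print(year,
          round(logistic(year, b=-0.05, d=1000.0, e=1900.0), 1),
          round(gompertz(year, b=-0.05, e=1900.0, d=1000.0), 1))
```

At x = e the symmetric logistic curve (f = 1) sits halfway between c and d, whereas the Gompertz curve sits at c + (d − c)/exp(1), which is why the two models can plateau at different totals when fitted to the same data.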
The best model was selected as the one with the lowest AIC value, calculated using the built-in “AIC” function in R. All model-comparison results are given in dataset S5.
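The selection rule can be illustrated with a small worked example. The sketch below is in Python rather than R and assumes a least-squares fit with Gaussian errors, with the residual variance counted as one additional estimated parameter. It is not guaranteed to reproduce the exact values returned by the “AIC” function for a “drm” object. The data, the two candidate models, and all function names are hypothetical.

```python
import math


def gaussian_aic(observed, fitted, n_coef):
    """AIC = 2k - 2 log L for a least-squares fit with Gaussian errors.

    k = n_coef + 1 because the residual variance is also estimated.
    """
    n = len(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * (n_coef + 1) - 2 * log_lik


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical toy data: year of publication (x) and number of species (y).
x = [1900, 1920, 1940, 1960, 1980, 2000]
y = [10, 22, 29, 41, 52, 58]

# Candidate 1: constant model (one coefficient, the mean).
mean_y = sum(y) / len(y)
constant_fit = [mean_y] * len(y)

# Candidate 2: linear model (two coefficients, intercept and slope).
intercept, slope = fit_line(x, y)
linear_fit = [intercept + slope * xi for xi in x]

aic = {
    "constant": gaussian_aic(y, constant_fit, n_coef=1),
    "linear": gaussian_aic(y, linear_fit, n_coef=2),
}
best = min(aic, key=aic.get)  # model name with the lowest AIC
```

Because AIC penalizes each additional parameter, a more flexible model is preferred only when its improvement in fit outweighs the penalty, which is why variants with different numbers of estimated parameters can be compared on the same scale.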
Acknowledgments
We thank S. Edie for helpful suggestions on implementing the Bayesian approach. We thank J. Albert, M. Costello, R. Leschen, E. Miller, K. Saban, N. Stork, J. Streicher, and the reviewers for valuable comments on various versions of the manuscript. We thank T.-T. Zhang for making available to us the Bioinformatics Computing Platform at the Institute of Applied Biology, Shanxi University.
Funding:
X.L. acknowledges the National Natural Science Foundation of China (no. 32400363) and the Fundamental Research Program of Shanxi Province (no. 202403021212083). D.Y. acknowledges the National Natural Science Foundation of China (no. 32130012). L.W. acknowledges the Science Foundation of Fairy Lake Botanical Garden (no. FLSF-2024-01).
Author contributions:
Conceptualization: X.L. and J.J.W. Data curation: X.L. Formal analysis: X.L. and J.J.W. Funding acquisition: X.L., L.W., and D.Y. Investigation: X.L., L.W., and J.J.W. Methodology: X.L. and J.J.W. Project administration: X.L. and J.J.W. Resources: X.L. Software: X.L. Supervision: X.L. and J.J.W. Validation: X.L. and J.J.W. Visualization: X.L. Writing—original draft: X.L. and J.J.W. Writing—review and editing: X.L., D.Y., and J.J.W.
Competing interests:
The authors declare that they have no competing interests.
Data and materials availability:
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. The datasets and code are permanently and freely available on figshare (https://figshare.com/s/a59c81d51846d79065de).
Supplementary Materials
The PDF file includes:
Figs. S1 to S17
Tables S1 and S2
Datasets S1 to S11
REFERENCES AND NOTES
- 1.Dirzo R., Raven P. H., Global state of biodiversity and loss. Annu. Rev. Env. Resour. 28, 137–167 (2003).
- 2.Costello M. J., Wilson S., Houlding B., Predicting total global species richness using rates of species description and estimates of taxonomic effort. Syst. Biol. 61, 871–883 (2012).
- 3.Costello M. J., May R. M., Stork N. E., Can we name Earth’s species before they go extinct? Science 339, 413–416 (2013).
- 4.Mora C., Tittensor D. P., Adl S., Simpson A. G. B., Worm B., How many species are there on Earth and in the ocean? PLOS Biol. 9, e1001127 (2011).
- 5.May R. M., How many species are there on Earth? Science 241, 1441–1449 (1988).
- 6.Scheffers B. R., Joppa L. N., Pimm S. L., Laurance W. F., What we know and don’t know about Earth’s missing biodiversity. Trends Ecol. Evol. 27, 501–510 (2012).
- 7.Larsen B. B., Miller E. C., Rhodes M. K., Wiens J. J., Inordinate fondness multiplied and redistributed: The number of species on Earth and the new Pie of Life. Q. Rev. Biol. 92, 229–265 (2017).
- 8.Li X., Wiens J. J., Estimating global biodiversity: the role of cryptic insect species. Syst. Biol. 72, 391–403 (2023).
- 9.Wiens J. J., How many species are there on Earth? Progress and problems. PLOS Biol. 21, e3002388 (2023).
- 10.Locey K. J., Lennon J. T., Scaling laws predict global microbial diversity. Proc. Natl. Acad. Sci. U.S.A. 113, 5970–5975 (2016).
- 11.Joppa L. N., Roberts D. L., Pimm S. L., How many species of flowering plants are there? Proc. R. Soc. Lond. B 278, 554–559 (2011).
- 12.Liu J. J., Slik F., Zheng S., Lindenmayer D. B., Undescribed species have higher extinction risk than known species. Conserv. Lett. 15, e12876 (2022).
- 13.Godfray H. C. J., Challenges for taxonomy. Nature 417, 17–19 (2002).
- 14.Tautz D., Arctander P., Minelli A., Thomas R. H., Vogler A. P., A plea for DNA taxonomy. Trends Ecol. Evol. 18, 70–74 (2003).
- 15.Maddison D. R., Guralnick R., Hill A., Reysenbach A.-L., McDade L. A., Ramping up biodiversity discovery via online quantum contributions. Trends Ecol. Evol. 27, 72–77 (2012).
- 16.Wheeler Q. D., Knapp S., Stevenson D. W., Stevenson J., Blum S. D., Boom B. M., Borisy G. G., Buizer J. L., De Carvalho M. R., Cibrian A., Donoghue M. J., Doyle V., Gerson E. M., Graham C. H., Graves P., Graves S. J., Guralnick R. P., Hamilton A. L., Hanken J., Law W., Lipscomb D. L., Lovejoy T. E., Miller H., Miller J. S., Naeem S., Novacek M. J., Page L. M., Platnick N. I., Porter-Morgan H., Raven P. H., Solis M. A., Valdecasas A. G., Van Der Leeuw S., Vasco A., Vermeulen N., Vogel J., Walls R. L., Wilson E. O., Woolley J. B., Mapping the biosphere: Exploring species to understand the origin, organization and sustainability of biodiversity. Syst. Biodiv. 10, 1–20 (2012).
- 17.Engel M. S., Ceríaco L. M. P., Daniel G. M., Dellapé P. M., Löbl I., Marinov M., Reis R. E., Young M. T., Dubois A., Agarwal I., Pablo Lehmann A., Alvarado M., Alvarez N., Andreone F., Araujo-Vieira K., Ascher J. S., Baêta D., Baldo D., Bandeira S. A., Barden P., Barrasso D. A., Bendifallah L., Bockmann F. A., Böhme W., Borkent A., Brandão C. R. F., Busack S. D., Bybee S. M., Channing A., Chatzimanolis S., Christenhusz M. J. M., Crisci J. V., D’elía G., Da Costa L. M., Davis S. R., De Lucena C. A. S., Deuve T., Elizalde S. F., Faivovich J., Farooq H., Ferguson A. W., Gippoliti S., Gonçalves F. M. P., Gonzalez V. H., Greenbaum E., Hinojosa-Díaz I. A., Ineich I., Jiang J., Kahono S., Kury A. B., Lucinda P. H. F., Lynch J. D., Malécot V., Marques M. P., Marris J. W. M., Mckellar R. C., Mendes L. F., Nihei S. S., Nishikawa K., Ohler A., Orrico V. G. D., Ota H., Paiva J., Parrinha D., Pauwels O. S. G., Pereyra M. O., Pestana L. B., Pinheiro P. D. P., Prendini L., Prokop J., Rasmussen C., Rödel M.-O., Rodrigues M. T., Rodríguez S. M., Salatnaya H., Sampaio Í., Sánchez-García A., Shebl M. A., Santos B. S., Solórzano-Kraemer M. M., Sousa A. C. A., Stoev P., Teta P., Trape J.-F., Santos C. V.-D. D., Vasudevan K., Vink C. J., Vogel G., Wagner P., Wappler T., Ware J. L., Wedmann S., Zacharie C. K., The taxonomic impediment: A shortage of taxonomists, not the lack of technical approaches. Zool. J. Linn. Soc. 193, 381–387 (2021).
- 18.Mora C., Rollo A., Tittensor D. P., Comment on “Can We Name Earth’s Species Before They Go Extinct?”. Science 341, 237 (2013).
- 19.Costello M. J., May R. M., Stork N. E., Response to comments on “Can We Name Earth’s Species Before They Go Extinct?”. Science 341, 237 (2013).
- 20.Miralles A., Bruy T., Wolcott K., Scherz M. D., Begerow D., Beszteri B., Bonkowski M., Felden J., Gemeinholzer B., Glaw F., Glöckner F. O., Hawlitschek O., Kostadinov I., Nattkemper T. W., Printzen C., Renz J., Rybalka N., Stadler M., Weibulat T., Wilke T., Renner S. S., Vences M., Repositories of taxonomic data: where we are and what is missing. Syst. Biol. 69, 1231–1253 (2020).
- 21.Moura M. R., Jetz W., Shortfalls and opportunities in terrestrial vertebrate species discovery. Nat. Ecol. Evol. 5, 631–639 (2021).
- 22.Wilkinson B. H., Ivany L. C., Drummond C. N., Estimating vertebrate biodiversity using the tempo of taxonomy – A view from Hubbert’s peak. Biol. J. Linn. Soc. 134, 402–422 (2021).
- 23.Deng J., Li K., Chen C., Wu S., Huang X., Discovery pattern and species number of scale insects (Hemiptera: Coccoidea). PeerJ 4, e2526 (2016).
- 24.DeWalt R. E., Ower G. D., Ecosystem services, global diversity, and rate of stonefly species descriptions (Insecta: Plecoptera). Insects 10, 99 (2019).
- 25.Cunningham J. A., Padamsee M., Wilson S., Costello M. J., Fungi species description rates confirm high global diversity and suggest half remain unnamed. Front. Biogeogr. 16, e62358 (2024).
- 26.Pamungkas J., Glasby C. J., Read G. B., Wilson S. P., Costello M. J., Progress and perspectives in the discovery of polychaete worms (Annelida) of the world. Helgol. Mar. Res. 73, 4 (2019).
- 27.Arfianti T., Wilson S., Costello M. J., Progress in the discovery of amphipod crustaceans. PeerJ 6, e5187 (2018).
- 28.Hartebrodt L., Wilson S., Costello M. J., Progress in the discovery of isopods (Crustacea: Peracarida)—Is the description rate slowing down? PeerJ 11, e15984 (2023).
- 29.Fontaine B., Achterberg K. V., Alonso-Zarazaga M. A., Araujo R., Asche M., Aspöck H., Aspöck U., Audisio P., Aukema B., Bailly N., Balsamo M., Bank R. A., Belfiore C., Bogdanowicz W., Boxshall G., Burckhardt D., Chylarecki P., Deharveng L., Dubois A., Enghoff H., Fochetti R., Fontaine C., Gargominy O., Gomez Lopez M. S., Goujet D., Harvey M. S., Heller K.-G., van Helsdingen P., Hoch H., Jong Y. D., Karsholt O., Los W., Magowski W., Massard J. A., McInnes S. J., Mendes L. F., Mey E., Michelsen V., Minelli A., Nieto Nafrıa J. M., van Nieukerken E. J., Pape T., De Prins W., Ramos M., Ricci C., Roselaar C., Rota E., Segers H., Timm T., van Tol J., Bouchet P., New species in the Old World: Europe as a frontier in biodiversity exploration, a test bed for 21st century taxonomy. PLOS ONE 7, e36881 (2012).
- 30.Wang K., Kirk P. M., Yao Y.-J., Development trends in taxonomy, with special reference to fungi. J. Syst. Evol. 58, 406–412 (2020).
- 31.Costello M. J., Vanhoorne B., Appeltans W., Conservation of biodiversity through taxonomy, data publication, and collaborative infrastructures. Conserv. Biol. 29, 1094–1099 (2015).
- 32.O. Bánki, Y. Roskov, M. Döring, G. Ower, D. R. Hernández Robles, C. A. Plata Corredor, T. Stjernegaard Jeppesen, A. Örn, L. Vandepitte, D. Hobern, P. Schalk, R. E. DeWalt, K. Ma, J. Miller, T. Orrell, R. Aalbu, J. Abbott, R. Adlard, C. Aedo, E. Aescht, S. Alexander, M. Alonso-Zarazaga, B. Alvarez, G. C. Andrella, L. S. Antonietto, C. Arango, T. Artois, M. Atahuachi Burgos, S. Atkinson, J. J. Atwood, A. L. Bagnatori Sartori, N. Bailly, J. Baixeras, E. Baker, A. Balan, R. Bamber, S. Bandyopadhyay, H. Barber-James, R. Barbosa Pinto, R. Barrett, L. Bartolozzi, I. Bartsch, G. Beccaloni, C. L. Bellamy, D. Bellan-Santini, P. F. Bellinger, Y. Ben-Dov, I. Blasco-Costa, J. S. Boatwright, P. Bock, L. M. Borges, R. L. Bossard, C. Bota-Sierra, P. Bouchard, T. Borgoin, N. Boury-Esnault, G. Boxshall, C. Boyko, S. Brandao, H. Braun, R. Bray, G. Brehm, J. C. Brinda, P. D. Brock, S. L. Broich, J. Brown, S. Brown, N. Bruce, S. Brullo, A. Bruneau, L. Bush, T. Buscher, M. Blazewicz-Paskowycz, A. Cabras, S. Cairns, M. Calonje, W. Cardinal-McTeague, D. Cardoso, L. Cardoso, R. C. Castilho, I. C. Castro Silva, A. Cervantes, H. Chevillotte, L. M. Choo, K. A. Christiansen, F. Cianferoni, M. M. Cigliano, R. Clarkie, T. Cobra e Monteiro, A. Collins, J. Compton, D. Copilas-Coiocianu, L. Corbari, R. Cordeiro, K. Cortés-Hernández, M. Costello, S. Crameri, J. A. Cruz-López, P. Cárdenas, M. Daly, M. Daneliya, J.-C. Dauvin, P. Davie, C. De Broyer, S. De Grave, H. C. De Lima, J. De Prins, W. De Prins, M. De la Estrella, R. DeSalle, P. Decker, W. Decock, A. Delgado-Salinas, C. Deliry, P. M. Dellapé, J. Den Heyer, K.-D. Dijkstra, D. A. Dmitriev, M. Dohrmann, O. Dorado, F. Dorkeld, R. Downey, L. Duan, M.-C. Diaz, D. C. Eades, Catalogue of Life Checklist (Version 2024-09-25) (2024); www.catalogueoflife.org.
- 33.GBIF Backbone Taxonomy, Checklist Dataset (GBIF Secretariat, 2023); 10.15468/39omei.
- 34.World Flora Online (Version 2024.01) (WFO, 2024); www.worldfloraonline.org.
- 35.Moreira D., López-García P., Ten reasons to exclude viruses from the tree of life. Nat. Rev. Microbiol. 7, 306–311 (2009).
- 36.Costello M. J., Wilson S., Houlding B., More taxonomists describing significantly fewer species per unit effort may indicate that most species have been discovered. Syst. Biol. 62, 616–624 (2013).
- 37.Edie S. M., Smits P. D., Jablonski D., Probabilistic models of species discovery and biodiversity comparisons. Proc. Natl. Acad. Sci. U.S.A. 114, 3366–3371 (2017).
- 38.Zhang Z.-Q., Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness (Addenda 2013). Zootaxa 3703, 1–82 (2013).
- 39.Slipinski S. A., Leschen R. A. B., Lawrence J. F., Order Coleoptera Linnaeus, 1758. In: Zhang, Z.-Q. (Ed.) Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness. Zootaxa 3148, 203–206 (2011).
- 40.Gaston K. J., Mound L. A., Taxonomy, hypothesis testing and the biodiversity crisis. Proc. R. Soc. Lond. B 251, 139–142 (1993).
- 41.Sangster G., Luksenburg J. A., Declining rates of species described per taxonomist: Slowdown of progress or a side-effect of improved quality in taxonomy? Syst. Biol. 64, 144–151 (2015).
- 42.Eschmeyer W. N., Fricke R., Fong J. D., Polack D. A., Marine fish diversity: history of knowledge and discovery. Zootaxa 2525, 19–50 (2010).
- 43.Stork N. E., How many species of insects and other terrestrial arthropods are there on Earth? Annu. Rev. Entomol. 63, 31–45 (2018).
- 44.Baldrian P., Větrovsky T., Lepinay C., Kohout P., High-throughput sequencing view on the magnitude of global fungal diversity. Fungal Divers. 114, 539–547 (2022).
- 45.Niskanen T., Lücking R., Dahlberg A., Gaya E., Suz L. M., Mikryukov V., Liimatainen K., Druzhinina I., Westrip J. R. S., Mueller G. M., Martins-Cunha K., Kirk P., Tedersoo L., Antonelli A., Pushing the frontiers of biodiversity research: Unveiling the global diversity, distribution, and conservation of fungi. Annu. Rev. Env. Resour. 48, 149–176 (2023).
- 46.A. D. Chapman. Numbers of Living Species in Australia and the World (Australian Biodiversity Information Services, ed. 2, 2009).
- 47.Pimm S. L., Jenkins C. N., Abell R., Brooks R. M., Gittleman J. L., Joppa L. N., Raven P. H., Roberts C. M., Sexton J. O., The biodiversity of species and their rates of extinction, distribution, and protection. Science 344, 1246752 (2014).
- 48.Pimm S. L., Joppa L. N., How many plant species are there, where are they, and at what rate are they going extinct? Ann. Missouri Bot. Gard. 100, 170–176 (2015).
- 49.AmphibiaWeb (University of California); https://amphibiaweb.org [accessed 24 June 2024].
- 50.P. Uetz, P. Freed, R. Aguilar, F. Reyes, J. Kudera, J. Hosek, Eds., The Reptile Database; www.reptile-database.org [accessed 24 June 2024].
- 51.R. Fricke, W. N. Eschmeyer, R. Van der Laan, Eds., Eschmeyer’s Catalog of Fishes: Genera, Species, References (California Academy of Sciences, 2024); http://researcharchive.calacademy.org/research/ichthyology/catalog/fishcatmain.asp [accessed 08 August 2024].
- 52.Mora C., Tittensor D. P., Myers R. A., The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes. Proc. R. Soc. Lond. B 275, 149–155 (2008).
- 53.Appeltans W., Ahyong S. T., Anderson G., Angel M. V., Artois T., Bailly N., Bamber R., Barber A., Bartsch I., Berta A., Błażewicz-Paszkowycz M., Bock P., Boxshall G., Boyko C. B., Brandao S. N., Bray R. A., Bruce N. L., Cairns S. D., Chan T.-Y., Cheng L., Collins A. G., Cribb T., Curini-Galletti M., Dahdouh-Guebas F., Davie P. J. F., Dawson M. N., De Clerck O., Decock W., De Grave S., de Voogd N. J., Domning D. P., Emig C. C., Erséus C., Eschmeyer W., Fauchald K., Fautin D. G., Feist S. W., Fransen C. H. J. M., Furuya H., Garcia-Alvarez O., Gerken S., Gibson D., Gittenberger A., Gofas S., Gómez-Daglio L., Gordon D. P., Guiry M. D., Hernandez F., Hoeksema B. W., Hopcroft R. R., Jaume D., Kirk P., Koedam N., Koenemann S., Kolb J. B., Kristensen R. M., Kroh A., Lambert G., Lazarus D. B., Lemaitre R., Longshaw M., Lowry J., Macpherson E., Madin L. P., Mah C., Mapstone G., McLaughlin P. A., Mees J., Meland K., Messing C. G., Mills C. E., Molodtsova T. N., Mooi R., Neuhaus B., Ng P. K. L., Nielsen C., Norenburg J., Opresko D. M., Osawa M., Paulay G., Perrin W., Pilger J. F., Poore G. C. B., Pugh P., Read G. B., Reimer J. D., Rius M., Rocha R. M., Saiz-Salinas J. I., Scarabino V., Schierwater B., Schmidt-Rhaesa A., Schnabel K. E., Schotte M., Schuchert P., Schwabe E., Segers H., Self-Sullivan C., Shenkar N., Siegel V., Sterrer W., Stöhr S., Swalla B., Tasker M. L., Thuesen E. V., Timm T., Todaro M. A., Turon X., Tyler S., Uetz P., van der Land J., Vanhoorne B., van Ofwegen L. P., van Soest R. W. M., Vanaverbeke J., Walker-Smith G., Walter T. C., Warren A., Williams G. C., Wilson S. P., Costello M. J., The magnitude of global marine species diversity. Curr. Biol. 22, 2189–2202 (2012).
- 54.Reis R. E., Albert J. S., Di Dario F., Mincarone M. M., Petry P., Rocha L. A., Fish biodiversity and conservation in South America. J. Fish Biol. 89, 12–47 (2016).
- 55.Bickford D., Lohman D. J., Sodhi N. S., Ng P. K. L., Meier R., Winker K., Ingram K. K., Das I., Cryptic species as a window on diversity and conservation. Trends Ecol. Evol. 22, 148–155 (2007).
- 56.Adams M., Raadik T. A., Burridge C. P., Georges A., Global biodiversity assessment and hypercryptic species complexes: More than one species of elephant in the room? Syst. Biol. 63, 518–533 (2014).
- 57.Pérez-Ponce de León G., Poulin R., Taxonomic distribution of cryptic diversity among metazoans: not so homogeneous after all. Biol. Lett. 12, 20160371 (2016).
- 58.Ziegler A. J., Li X., Wiens J. J., The evidence for new species across the Tree of Life: morphology still rules the largest kingdoms. Bull. Soc. Syst. Biol. 4, doi.org/10.18061/bssb.v4i1.10466 (2025).
- 59.Bebber D. P., Marriott F. H. C., Gaston K. J., Harris S. A., Scotland R. W., Predicting unknown species numbers using discovery curves. Proc. R. Soc. Lond. B 274, 1651–1658 (2007).
- 60.Joppa L. N., Roberts D. L., Pimm S. L., The population ecology and social behaviour of taxonomists. Trends Ecol. Evol. 26, 551–553 (2011).
- 61.Bebber D. P., Wood J. R. I., Barker C., Scotland R. W., Author inflation masks global capacity for species discovery in flowering plants. New Phytol. 201, 700–706 (2014).
- 62.Bebber D. P., Polaszek A., Wood J. R. I., Barker C., Scotland R. W., Taxonomic capacity and author inflation. New Phytol. 202, 741–742 (2014).
- 63.Adl S. M., Leander B. S., Simpson A. G. B., Archibald J. M., Anderson O. R., Bass D., Bowser S. S., Brugerolle G., Farmer M. A., Karpov S., Kolisko M., Lane C. E., Lodge D. J., Mann D. G., Meisterfeld R., Mendoza L., Moestrup Ø., Mozley-Standridge S. E., Smirnov A. V., Spiegel F., Diversity, nomenclature, and taxonomy of protists. Syst. Biol. 56, 684–689 (2007).
- 64.Fishman F. J., Lennon J. T., Macroevolutionary constraints on global microbial diversity. Ecol. Evol. 13, e10403 (2023).
- 65.Louca S., Mazel F., Doebeli M., Parfrey L. W., A census-based estimate of Earth’s bacterial and archaeal diversity. PLOS Biol. 17, e3000106 (2019).
- 66.Parks D. H., Chuvochina M., Rinke C., Mussig A. J., Chaumeil P.-A., Hugenholtz P., GTDB: An ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 50, D785–D794 (2022).
- 67.Tedesco P. A., Bigorne R., Bogan A. E., Giam X., Jezequel C., Hugueny B., Estimating how many undescribed species have gone extinct. Conserv. Biol. 28, 1360–1370 (2014).
- 68.Boehm M. M. A., Cronk Q. C. B., Dark extinction: The problem of unknown historical extinctions. Biol. Lett. 17, 20210007 (2021).
- 69.Boyle M. J. W., Sharp A. C., Barclay M. V., Chung A. Y. C., Ewers R. M., de Rougemont G., Bonebrake T. C., Kitching R. L., Stork N. E., Ashton L. A., Tropical beetles more sensitive to impacts are less likely to be known to science. Curr. Biol. 34, R770–R771 (2024).
- 70.Saban K. E., Wiens J. J., Unpacking the extinction crisis: Rates, patterns and causes of recent extinctions in plants and animals. Proc. R. Soc. Lond. B. 292, 2025171 (2025).
- 71.Wiens J. J., Zelinka J., How many species will Earth lose to climate change? Glob. Chang. Biol. 30, e17125 (2024).
- 72.Urban M. C., Climate change extinctions. Science 386, 1123–1128 (2024).
- 73.W. McKinney, “Data structures for statistical computing in Python” in Proceedings of the 9th Python in Science Conference, S. van der Walt and J. Millman, Eds. (SciPy, 2010), pp. 56–61.
- 74.R Core Team, R: A language and environment for statistical computing (R Foundation for Statistical Computing, 2024).
- 75.Ritz C., Baty F., Streibig J. C., Gerhard D., Dose-response analysis using R. PLOS ONE 10, e0146021 (2015).
- 76.Stan Development Team, RStan: The R interface to Stan, version 2.32.6 (2024); https://mc-stan.org/.
- 77.A. Onofri, Statforbiology: Tools for data analyses in biology (2024); 10.32614/CRAN.package.statforbiology.