When most people think of paleontology, they picture the field paleontologist digging up fossils with a toothbrush and publishing descriptions of the anatomies of uncovered specimens in the trade literature. New discoveries of dinosaurs even make the national headlines. For many decades the discovery and description of a previously unknown fossil species was the calling card that gained would-be paleontologists professional acceptance. However, in the last quarter of a century or so, many of the most intriguing new results in paleontology have come not from field studies, but from the compilation and analysis of large-scale databases of fossil species. These databases have provided us with quantitative pictures of the pattern and size of mass extinction events, the rate at which new species have appeared, and crucially the number of species on the planet through time, the so-called standing diversity. In an article appearing in this issue of PNAS (1), John Alroy, Charles Marshall, and a large group of distinguished collaborators report on the creation of a new database that catalogs fossils at the level of individual collections. Preliminary analysis of this database reveals interesting results, calling into question some fundamental ideas about the history of life on Earth.
There are three principal features worthy of note in the paper by Alroy et al. First, the paper announces the creation of the new database. Second, the authors describe new methods of data analysis made possible by the database that help to eliminate biases inherent in previous studies as a result of variations in patterns of fossil preservation and collection. Third, these new methods raise doubts about the long-held belief that biodiversity has increased dramatically in the last 250 million years; it may in fact be that diversity has been roughly constant, although no firm verdict has been reached yet on this point.
Statistical analyses of species turnover and diversity in the fossil record have been dominated in the past by the work of one man, Jack Sepkoski, who from the early 1980s until his death in May 1999 worked single-handedly on the compilation of an encyclopedic database of occurrences of marine invertebrates in the fossil record, using journal publications as his primary source (2, 3). Other compilations also have been published (4, 5), but Sepkoski's database has received more attention by far than any other. Sepkoski's database was simple in structure: it recorded the first and last known occurrences in the fossil record of more than 30,000 marine invertebrate genera in about 4,000 families. Marine invertebrates have been the focus of most statistical studies, because preservation is much more reliable in marine environments and invertebrates are much more numerous than vertebrates. Time was measured in stratigraphic stages, uneven intervals defined by using a variety of geological and paleontological markers. Many features of the fossil record have been deduced from Sepkoski's data. One of the most famous is shown in Fig. 1, which is a plot of the total number of genera in the database as a function of time during the Phanerozoic, approximately the last 540 million years, from the so-called “Cambrian explosion” of metazoan diversity until the present day. The shape of this curve mirrors the accepted view of life's history on the planet: a burst of diversification in the Cambrian and Ordovician, followed by a rough plateau in diversity for about 200 million years in the latter half of the Paleozoic, until the dip in the center of the figure, which represents the massive late-Permian extinction event. Following this extinction, it appears that diversity first recovered and then increased substantially during the Mesozoic and Cenozoic, rising to a present-day level two or more times higher than any seen during the Paleozoic.
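To make concrete how a diversity curve can be read off a database that stores only first and last occurrences, the following is a minimal sketch, not Sepkoski's actual procedure. It assumes each genus is treated as present in every interval between its first and last occurrence, and that stratigraphic stages are represented by integer indices. The function name `range_through_diversity` and the toy genera are hypothetical.

```python
from typing import Dict, List, Tuple


def range_through_diversity(
    ranges: Dict[str, Tuple[int, int]], n_intervals: int
) -> List[int]:
    """Count genera present in each interval.

    `ranges` maps a genus name to (first, last) interval indices, inclusive.
    A genus is assumed to exist in every interval between the two, whether
    or not it was actually found there (an assumption of this sketch).
    """
    counts = [0] * n_intervals
    for first, last in ranges.values():
        for interval in range(first, last + 1):
            counts[interval] += 1
    return counts


# Hypothetical toy data: three genera over five intervals (0 = oldest).
toy_ranges = {
    "genus_a": (0, 3),
    "genus_b": (1, 1),
    "genus_c": (2, 4),
}
diversity = range_through_diversity(toy_ranges, 5)
print(diversity)  # [1, 2, 2, 2, 1]
```

Note that a genus known from a single specimen and one found in thousands of collections contribute identically to these counts, because the input carries no information about how often or where a genus occurs.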
Sepkoski's database, although extensive and thorough, has a number of shortcomings. In particular, it records only first and last occurrences of taxa anywhere in the world, and no other data, such as how commonly taxa occur or where. Thus very widely occurring taxa are accorded exactly the same status as ones that are found rarely. Also, by the very fact that the database is as exhaustive as possible, substantial biases are introduced. For example, it is quite feasible that the increase in diversity toward recent times seen in Fig. 1 is a result primarily of the greater volume of rock available from recent times, and the greater amount of effort that has been put into studying these rocks. A number of studies over the years have presented evidence showing that apparent diversity is closely correlated with the intensity with which different periods of geologic time have been sampled (6, 7).
The new database compiled by Alroy and coworkers (one of whom is the same Jack Sepkoski mentioned above) attempts to correct some of these problems by including more comprehensive data about fossil taxa, in particular dividing data into collections—groups of fossils recovered from specific locales by specific workers or teams—with repeated occurrences of taxa at different times and places explicitly noted. Like the database of Sepkoski, the new database focuses on marine invertebrates, and is at present incomplete—work is still continuing on the compilation. Currently it covers two time periods of about 150 million years each, one in the middle part of the Paleozoic, during the plateau seen in Fig. 1, and one from the mid-Mesozoic to the mid-Cenozoic, the central portion of the diversity increase in the right-hand part of the figure.
Because of the division of the database into collections, Alroy and coworkers have been able to compensate for biases in the intensity of sampling of different time intervals, and to some extent for varying quality of fossil preservation in their data, and so make more accurate estimates of diversity (although, as they are the first to emphasize, biases are still present). Their technique of analysis involves breaking the data down in two ways. First, they divide the data into roughly equal time intervals, more uniform in length than the intervals used by Sepkoski. Second, within each interval they attempt to choose a constant number of actual fossil specimens, as if the intensity of sampling across different times and places had been uniform, rather than widely varying as it in fact is. Unfortunately, for many of their collections only the number of taxa is recorded and not the number of specimens, so it is not possible to fix specimen number directly. Instead, they use a variety of proxy techniques to simulate uniform sampling. The simplest such technique is to take a fixed number of collections (or “lists,” as the authors call them), being careful that the ones chosen come from geographically distributed localities. This method works well if the sampling intensity is roughly the same from one collection to another. This, however, may not be the case, so they also use several other techniques that weight lists according to their length, and they report separate results for each of the different methods used. Clearly, in the absence of more detailed information about which weighting is correct, only results that are robust across different methods should be considered to have strong support.
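The idea of drawing a fixed quota of lists per interval can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. Each list is reduced to a set of taxon names, geographic spread is ignored, and the length-based variant (drawing lists until their summed lengths reach a quota) is only one possible way to weight lists by length. All function names, parameters, and data here are hypothetical.

```python
import random
from typing import List, Set


def subsample_by_lists(
    lists: List[Set[str]], quota: int, trials: int = 200, seed: int = 0
) -> float:
    """Mean number of distinct taxa in `quota` randomly drawn lists."""
    if quota > len(lists):
        raise ValueError("quota exceeds the number of lists in this interval")
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        drawn = rng.sample(lists, quota)  # draw without replacement
        total += len(set().union(*drawn))
    return total / trials


def subsample_by_occurrences(
    lists: List[Set[str]], quota: int, trials: int = 200, seed: int = 0
) -> float:
    """Mean number of distinct taxa when whole lists are drawn in random
    order until their summed lengths reach `quota` (assumed weighting)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        order = list(lists)
        rng.shuffle(order)
        seen: Set[str] = set()
        used = 0
        for taxa in order:
            if used >= quota:
                break
            seen |= taxa
            used += len(taxa)
        total += len(seen)
    return total / trials


# Hypothetical lists for a single time interval.
interval_lists = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a"}]
print(subsample_by_lists(interval_lists, quota=2))
print(subsample_by_occurrences(interval_lists, quota=4))
```

Applying the same quota to every interval is what makes the resulting counts comparable across intervals. Comparing the two functions on the same data gives a rough sense of how sensitive a result is to the choice of weighting.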
The diversity counts given by Alroy et al. are the total numbers of taxa seen across all collections sampled, taken variously either during the time intervals of study, or at the boundaries of those intervals. It is important to notice that these counts are not expected to be directly proportional to actual diversity (which is, in any case, not well defined). However, the counts should increase monotonically with increasing real diversity, and two intervals that have the same total count can be expected to have approximately equal real diversities. That is, the results are comparable between different geologic times.
To some extent, the principal new contributions of the present study are the database itself and the sampling-standardized methods for measuring diversity. However, the preliminary results also offer some interesting suggestions of what is to come in this field. The authors make a host of different observations about the results of their calculations, but perhaps the most interesting is that most of their measures of biodiversity are found to give approximately equal figures for diversity in the two time periods studied. Recall that in the curve of Fig. 1, derived from the earlier work of Sepkoski, the two periods showed very different behavior, the first having a rough plateau in diversity, the second showing a marked diversity increase. This increase is not clearly visible in the new results, suggesting that the supposed post-Paleozoic diversification of marine fauna may be merely an artifact of biases in the Sepkoski database. It should be emphasized, however, that these results are by no means final, and it is too early to draw any firm conclusions from the data.
The creation of this new database of the fossil record may well have far-reaching effects. The mere fact that most of the previous work in this area has made use of just a single source of data—the Sepkoski compilation—makes the creation of an independent database an important and worthwhile enterprise. However, the inclusion in this new database of far more detailed information on frequency of occurrence of taxa opens the way for statistically superior analyses of fossil biodiversity and other quantities, which have not been possible before. The paper appearing in this issue represents only the first effort in this direction, and we can hope to see many new and interesting results emerging as the database and the analytical methods applied to it mature.
Footnotes
See companion article on page 6261.
References
- 1. Alroy J, Marshall C R, Bambach R K, Bezusko K, Foote M, Fürsich F T, Hansen T A, Holland S M, Ivany L C, Jablonski D, et al. Proc Natl Acad Sci USA. 2001;98:6261–6266. doi: 10.1073/pnas.111144698. (First Published May 15, 2001)
- 2. Sepkoski J J, Jr. Milwaukee Pub Mus Contrib Biol Geol. 1992;83:1–156.
- 3. Sepkoski J J, Jr. Paleobiology. 1993;19:43–51. doi: 10.1017/s0094837300012306.
- 4. Benton M J, editor. The Fossil Record 2. London: Chapman and Hall; 1993.
- 5. Labandeira C C. Milwaukee Pub Mus Contrib Biol Geol. 1994;88:1–71.
- 6. Raup D M. Paleobiology. 1976;2:289–297.
- 7. Sepkoski J J, Jr, Bambach R K, Raup D M, Valentine J W. Nature (London). 1981;293:435–437.