Proceedings of the National Academy of Sciences of the United States of America. 2016 Nov 21;113(49):14079–14084. doi: 10.1073/pnas.1616804113

Large numbers of vertebrates began rapid population decline in the late 19th century

Haipeng Li a,b,1, Jinggong Xiang-Yu b,1, Guangyi Dai b, Zhili Gu b, Chen Ming b,c, Zongfeng Yang b,c, Oliver A Ryder d, Wen-Hsiung Li e,f,2, Yun-Xin Fu g,h,2, Ya-Ping Zhang a,c,g,2
PMCID: PMC5150392  PMID: 27872315

Significance

The current rate of species extinction is ∼1,000 times the background rate of extinction and is attributable to human impact, ecological and demographic fluctuations, and inbreeding due to small population sizes. The rate and the initiation date of rapid population decline (RPD) can provide important clues about the driving forces of population decline in threatened species, but they are generally unknown. We analyzed the genetic diversity data in 2,764 vertebrate species. Our population genetics modeling suggests that in many threatened vertebrate species the RPD on average began in the late 19th century, and the mean current size of threatened vertebrates is only 5% of their ancestral size. We estimated a ∼25% population decline every 10 y in threatened vertebrate species.

Keywords: vertebrate, threatened species, coalescent, rapid population decline, conservation

Abstract

Accelerated losses of biodiversity are a hallmark of the current era. Large declines of population size have been widely observed and currently 22,176 species are threatened by extinction. The time at which a threatened species began rapid population decline (RPD) and the rate of RPD provide important clues about the driving forces of population decline and anticipated extinction time. However, these parameters remain unknown for the vast majority of threatened species. Here we analyzed the genetic diversity data of nuclear and mitochondrial loci of 2,764 vertebrate species and found that the mean genetic diversity is lower in threatened species than in related nonthreatened species. Our coalescence-based modeling suggests that in many threatened species the RPD began ∼123 y ago (a 95% confidence interval of 20–260 y). This estimated date coincides with widespread industrialization and a profound change in global living ecosystems over the past two centuries. On average the population size declined by ∼25% every 10 y in a threatened species, and the population size was reduced to ∼5% of its ancestral size. Moreover, the ancestral size of threatened species was, on average, ∼22% smaller than that of nonthreatened species. Because the time period of RPD is short, the cumulative effect of RPD on genetic diversity is still not strong, so that the smaller ancestral size of threatened species may be the major cause of their reduced genetic diversity; RPD explains 24.1–37.5% of the difference in genetic diversity between threatened and nonthreatened species.


Although preservation of biodiversity is vital to a sustainable human society, rapid population decline (RPD) continues to be widespread across taxa (1–3). When RPD occurs, it is accompanied by a loss of genetic diversity. Genetic diversity is reflected in the genetic differences among individuals and is essential for populations to adapt to changing environments (4). The start date and the rate of RPD provide useful information for effective conservation of threatened species and are important for promotion of public awareness of the threat. However, these two key parameters are difficult to estimate because there are virtually no time-series data on population size over hundreds of years. For most species, the population size can be traced back only about 40 y (2). Therefore, an alternative approach is to estimate the start date and the rate of RPD using mathematical modeling.

Changes in population size over thousands of years could be inferred for a species from genome-wide DNA polymorphism data (5–7). However, it remains a formidable technical challenge to infer the event of RPD because the signal of such an event is weak in the typical time scale of observable polymorphisms (8). To overcome the limited resolution power of the genetic data from a single species, we propose an approach that draws conclusions based on the collective support from many species. The central premise of our approach is that the threat of extinction of thousands of species was primarily due to a common cause in the past that led to a significant depletion of available habitats and resources. Consequently, we were able to draw conclusions based on present-day polymorphism data from a large number of threatened species and their nonthreatened relatives. Our method is depicted in Fig. 1. Here we studied RPD in vertebrates, because vertebrates have been more extensively investigated in the past. However, our conclusions should have some generality because vertebrate species live in a wide range of ecosystems. Moreover, the proposed method is also suitable for studying nonvertebrate species.

Fig. 1.

Schematic inference on the start date and the rate of RPD under one particular demographic model. The coalescence simulations were conducted conditional on the sample sizes, the numbers of loci, the pattern of missing data, the generation times, the census sizes, the species distributions, and the years of sampling. The data were summarized as the relative difference in four genetic diversity measurements between two species groups. The species categorized as near threatened (NT) and least concern (LC) are treated as the nonthreatened species. The threatened species include those listed as critically endangered (CR), endangered (EN), and vulnerable (VU) (6). The uncategorized species include those that are listed as data deficient or have not been evaluated by the IUCN.

Results and Discussion

Data Collected.

We reviewed more than 10,000 peer-reviewed papers published in the last two and a half decades, among which ∼2,500 papers in 164 scientific journals were found to have surveyed the genetic diversity of at least one vertebrate species. The level of genetic diversity was measured with one of the following summary statistics (9): the expected and observed heterozygosity (He and Ho), the number of alleles per locus (α) at the microsatellite loci, Watterson’s θw (10), and the mean number of nucleotide differences per nucleotide site between two mitochondrial sequences (π). The collected dataset includes 2,764 vertebrate species belonging to 1,466 genera and 465 families (Fig. 2). Then, we used the International Union for Conservation of Nature (IUCN) Red List categories (3) to determine the level of extinction risk for each species, and the species were categorized into three groups: nonthreatened species (NS), threatened species (TS), and uncategorized species (Fig. 2B). The uncategorized species include (i) those that are listed as data deficient and (ii) those that have not been evaluated by the IUCN. A taxon is listed as data deficient when there is inadequate information to make an assessment of its risk of extinction (3). The uncategorized species were excluded from our analyses (Fig. 1), unless noted otherwise.
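The summary statistics named above can be illustrated with a minimal sketch. It assumes the standard textbook estimators (Nei's sample-size-corrected expected heterozygosity, Watterson's estimator per site, and mean pairwise differences per site); the function names and toy data are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter
from itertools import combinations


def expected_heterozygosity(alleles):
    """Sample-size-corrected He at one locus, from a list of sampled allele copies."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def wattersons_theta(sequences):
    """Watterson's theta per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(sequences), len(sequences[0])
    # A site is segregating if more than one state is observed in the sample.
    segregating = sum(1 for site in zip(*sequences) if len(set(site)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n / length


def nucleotide_diversity(sequences):
    """pi: mean number of pairwise nucleotide differences per site."""
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs) / length


# Toy data (hypothetical): four allele copies at one microsatellite locus,
# and four aligned D-loop fragments without gaps.
toy_alleles = [150, 150, 152, 154]
toy_seqs = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "ACGTACGT"]
print(expected_heterozygosity(toy_alleles))
print(wattersons_theta(toy_seqs), nucleotide_diversity(toy_seqs))
```

The sketch assumes gap-free, equal-length alignments, which mirrors the exclusion of sites containing insertions and deletions described in Methods.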

Fig. 2.

Categories of the 2,764 vertebrate species used in this study. (A) The number and relative proportion of the species in each taxon category. (B) IUCN Red List categories of the examined species and their relative proportions. CR, critically endangered; DD, data deficient; EN, endangered; LC, least concern; NE, not evaluated; NT, near threatened; VU, vulnerable. The threatened species include the species of critically endangered, endangered, and vulnerable, and the nonthreatened species include the near-threatened and least-concern species.

Comparison of Genetic Diversity Between Nonthreatened and Threatened Species.

Following a previous study (11), we compared the genetic diversity between nonthreatened and threatened vertebrate species using the permutation test (12). The establishment of those IUCN categories does not rely on the information of genetic diversity. Although the distributions of genetic diversity of nonthreatened and threatened species overlap (Fig. 3 A and B and SI Appendix, Fig. S1), the mean genetic diversity of nonthreatened species is significantly higher than that of related threatened species in all 16 comparisons (Fig. 3 C–E and SI Appendix, Table S1), generally agreeing with the previous finding (11). The results remain the same when we recompiled the data with different numbers of microsatellite loci (< or ≥10 loci) or different sequenced lengths of the D-loop (< or ≥500 bp) (SI Appendix, Fig. S2).
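A one-tailed permutation test on the difference in mean genetic diversity can be sketched as follows. This is a generic illustration, not the authors' implementation: the function name, the number of permutations, and the toy He values are assumptions made for the example.

```python
import random
from statistics import mean


def permutation_test_mean_diff(nonthreatened, threatened, n_perm=10_000, seed=0):
    """One-tailed test of H0: mean(NS) == mean(TS) against mean(NS) > mean(TS).

    Group labels are shuffled n_perm times; the P value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(nonthreatened) - mean(threatened)
    pooled = list(nonthreatened) + list(threatened)
    k = len(nonthreatened)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical He values for a handful of species in each group.
ns_he = [0.70, 0.72, 0.68, 0.75, 0.71, 0.69]
ts_he = [0.55, 0.58, 0.52, 0.60, 0.57, 0.54]
obs_diff, p_value = permutation_test_mean_diff(ns_he, ts_he)
print(obs_diff, p_value)
```

Shuffling labels rather than assuming a parametric distribution suits diversity statistics such as He, whose distributions overlap between groups and are bounded.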

Fig. 3.

Comparisons of genetic diversity between nonthreatened and threatened vertebrate species. (A) Empirical distributions of He on microsatellite loci. (B) Empirical distributions of Watterson’s θw calculated from sequence variation in the D-loop (control region) of mitochondrial DNA. (C) Results of the permutation test on He of microsatellite loci between nonthreatened and threatened species. (D) Results of the permutation test on the number of alleles per microsatellite locus (α) between nonthreatened and threatened species. (E) Results of the permutation test on Watterson’s θw and the mean pairwise nucleotide differences (π) in the D-loop and coding regions in the mitochondrial genome. The null hypothesis of the permutation test is that the mean genetic diversity of nonthreatened species is equal to that of threatened species. The numbers of species examined are shown on the columns, and the one-tailed P values of the test are shown above the columns. The SEM is presented as an error bar. To ensure a reliable estimation of genetic diversity for a species, we required a sample size of n ≥ 20 individuals. *P < 0.05, ***P < 0.01.

To examine whether differences in population structure can explain the reduction in genetic diversity of threatened species, we first compared the Fst values (an indicator of recent population structure estimated from microsatellite loci) between two species groups and found no significant difference (P = 0.25) (SI Appendix, Table S2). Next, we calculated the one-tailed P values of Tajima’s D (13) for the mitochondrial DNA polymorphism data, which is sensitive to ancient but not recent population structure (14). There was also no significant difference between the two species groups (P = 0.67) (SI Appendix, Table S2). Thus, population structure differences are unlikely to be the principal cause of the difference in genetic diversity between nonthreatened and threatened species.

To assess the impact of recent demographic change on genetic diversity of threatened species, we considered pairs of nonthreatened and threatened species from the same family. For each pair we calculated the ratio of the long-term effective population size ($N_e$) and the ratio of the effective population size at present $N(0)$. The observed genetic diversity provides an estimate of $N_e = \theta/(4\mu)$ for autosomes (15) or $N_e = \theta/\mu$ for mitochondrial DNA (10), where $\mu$ is the mutation rate per generation. Therefore, the ratio of $N_e$ between two species groups was estimated as $\hat{\theta}_{NS}/\hat{\theta}_{TS}$ for either autosomal or mitochondrial loci, where the subscripts NS and TS stand for nonthreatened and threatened species, respectively. Also, the ratio of $N(0)$ was approximated by the ratio of the current census size $N$. This is based on the finding that the ratio of effective to actual population size ($f = N(0)/N$) has a mean value of 0.1 (16) and is largely independent of $N$. We found that $N_{e,NS}/N_{e,TS}$ (median 1.89, the 5th and 95th percentiles 0.16 and 15.32, respectively) is remarkably smaller than $N_{NS}(0)/N_{TS}(0)$ (median 36.95, the 5th and 95th percentiles 1.9 and 3,282.9, respectively) ($P < 10^{-5}$) (Fig. 4 and SI Appendix, Tables S3 and S4). This may indicate a much larger ancestral size and an RPD across all or most of the threatened taxa.

Fig. 4.

Ratios of long-term effective population size (circles, measured as θ^NS/θ^TS) and ratios of effective population size at present (crosses, measured as NNS/NTS) between nonthreatened and threatened species for five vertebrate classes. Species group pairs are from the same family. θ^ was calculated using the D-loop of mitochondrial DNA (10) and allelic variation at microsatellite loci (15).

We suggest that the recent impacts on population size could be measured by NNS(0)/NTS(0) and normalized by θ^NS/θ^TS. Then we examined which families of species were affected the most or the least (SI Appendix, Table S5). A larger impact index indicates that one or a few threatened species in the family experienced a more severe population decline.

Demographic Models.

We used a model-based approach to quantify the RPD. One model is illustrated in Fig. 5A. The essential premise is that many threatened species began the RPD at similar times due to the increased impact of human activities and habitat losses. Specifically, the model assumes that each threatened species began an exponential decrease in size $t$ years ago, which follows a distribution with the mean equal to $\tau$, whereas nonthreatened species have maintained a constant population size (the case of nonthreatened species with nonconstant size is examined below). Naturally, the time $t$ splits the population history into two phases (Fig. 5A). We define $R = E(\theta_{w,NS}(t))/E(\theta_{w,TS}(t)) \approx E(N_{NS}(t))/E(N_{TS}(t))$, which is the ratio of the ancestral genetic diversity of nonthreatened species to that of threatened species at time $t$ and represents the difference in census size between the two species groups before RPD.

Fig. 5.

Coalescence-based modeling and analysis. (A) The two-phase model of exponential population decline for threatened species with a constant effective population size of nonthreatened species during both phases. (B) The likelihood surface obtained from the analysis. (C) The estimates based on the data from all studied species and subgroups of species. The estimates (R^, τ^) are shown by open circles, the estimates of τ conditional on R are shown in solid lines, and their 95% confidence intervals are given in dashed lines. The estimates for the case of 40% species were averaged over five random replicates. (D) The estimates based on the microsatellite data from five taxa. (E) The estimates based on the data from species with different generation times. (F) Results from species categorized as temperate and tropical zones.

To estimate the two parameters (R and τ), a numerical approximation of their likelihood L(R,τ) was obtained using the principle of rejection sampling (5, 17, 18) after representing the data as the relative differences in the mean genetic diversity between the two species groups (Methods). Then we explored each of the scenarios with the start date of RPD that follows the normal, the exponential, and the one-point (i.e., constant) distribution with the mean τ. The SD of the normal distribution was estimated as the SD of t^ among 18 well-documented mammal and bird species (SI Appendix, Table S6). The estimated τ^ is 111, 154, and 123 y, respectively. Judging from the amount of uncertainty associated with these estimates, it seems that τ^ is robust with regard to the assumption of the distribution of t. However, assuming a nonconstant t will lead to a substantial increase in computation/simulation time in subsequent analyses. Therefore, only the one-point distribution of t was implemented in the subsequent analyses.

The surface of the likelihood L(R,τ) is shown in Fig. 5B. The estimate R^ is 1.22 (Fig. 5C) with a 95% confidence interval between 1.11 and 1.35, implying that the ancestral size of threatened vertebrate species was, on average, 22% smaller than that of nonthreatened vertebrate species. As expected, R is more precise than Ne,NS/Ne,TS estimated above because the latter was estimated from the small number of species and had a very large variance. The corresponding τ^ is 123 y with a 95% confidence interval of 20–260 y, and the rate of RPD (λ^) implies a 24.5% population decline every 10 y. The results suggest that the difference in genetic diversity between the two species groups (Fig. 3) was due to the joint effect of the smaller ancestral size of threatened species and the RPD that began on average in the late 19th century. Then, the effect of RPD was quantified by comparing the simulated genetic diversity between the cases of RPD and no RPD. We found that RPD explains 24.1–37.5% of the difference in genetic diversity between the two species groups (SI Appendix, Table S7). The effect varies with different measurements of genetic diversity because the number of alleles per locus (α) and Watterson’s θw are more sensitive to RPD than the expected heterozygosity (He) and π (19, 20).

The effect of RPD on the reduction in genetic diversity is relatively weak because the time period of RPD is short and because the observed differences in genetic diversity between the two species groups were small. For example, the initial simulated heterozygosity of threatened species was 0.572, and the reduced heterozygosity after RPD was 0.558. That is, only 2.4% of the heterozygosity was lost due to RPD because the time period of RPD was only 123 y.
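The headline figures can be checked with a few lines of arithmetic. The sketch below assumes ω^ = 25 (the value set in Methods) and reads the yearly rate as the magnitude of ln(R^/ω^)/τ^; the variable names are illustrative only. Under that reading, the reported 24.5% per decade appears to correspond to ten times the yearly rate, and the exactly compounded ten-year decline comes out slightly lower.

```python
import math

R_hat = 1.22       # estimated ancestral size ratio (nonthreatened / threatened)
omega_hat = 25.0   # present-day size ratio assumed in the modeling (Methods)
tau_hat = 123.0    # estimated years since the start of RPD

# Current threatened size as a fraction of its own ancestral size.
remaining_fraction = R_hat / omega_hat

# Magnitude of the exponential decline rate per year.
rate_per_year = math.log(omega_hat / R_hat) / tau_hat

# Two ways to express the decline over 10 years.
linear_decline_10y = 10.0 * rate_per_year
compounded_decline_10y = 1.0 - math.exp(-10.0 * rate_per_year)

# Relative loss of simulated heterozygosity attributed to RPD.
het_loss = (0.572 - 0.558) / 0.572

print(f"remaining fraction: {remaining_fraction:.3f}")
print(f"10-year decline (10 x rate): {linear_decline_10y:.3f}")
print(f"10-year decline (compounded): {compounded_decline_10y:.3f}")
print(f"heterozygosity lost: {het_loss:.3f}")
```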

Comparison with Historical Data.

It has been documented that accelerated land use by humans might have caused the decrease of biodiversity since the middle of the 19th century (21) and extinction rates increased sharply over the past 200 y (22). Therefore, our estimated τ^ agrees with these two studies. τ^ also agrees with the estimated t^ for 18 well-documented species [ranging from 55 to 415 y, and mean(t^) ≈ 166 y] (SI Appendix, Table S6). The Living Planet Index (2), based on vertebrate population time series between 1970 and 2010, reported a less rapid population decline than we inferred. This may have occurred because conservation efforts had slowed down the rate of population decline in the last 40 y, or populations of nonthreatened vertebrate species were also included to track global biodiversity change in their analysis (23, 24).

Unbiasedness and Robustness of the Method.

We examined the bias of our inference method. First, the sampling effect among species was studied, and it was found that smaller subsets of data gave unbiased estimates but with large variances, as expected (Fig. 5C). Second, RPDs were simulated (10,000 replicates) with the parameter values R=1.2 and τ=120, and the obtained mean R^ and τ^ were 1.20 and 128, respectively, showing little bias (SI Appendix, Fig. S3).

The robustness of our estimates was investigated in a number of ways. Several scenarios of the ancient demography before RPD were simulated, and we found that the ancient demography has almost no impact on the estimates (SI Appendix, Fig. S4). We also found that even if the population size of species categorized as near-threatened was assumed to decrease by half since RPD the estimates were virtually unchanged (SI Appendix, Fig. S5A). We also considered the uncertainty of assessment on the genetic diversity due to small sample size, the uncertainty of assessment on the threatened status of species, the uncertainty of estimating N(0), and the uncertainty of the evolutionary model of microsatellite loci (SI Appendix, Fig. S6). We found that these factors have no visible impacts on the estimates (SI Appendix, Fig. S5 B–E). It has been found that positive and negative selection have acted on mitochondrial genomes (25, 26), which may invalidate the neutral assumption of the model. Moreover, the two short highly hypervariable regions in the D-loop region have been preferentially selected for study, and we observed an increased genetic diversity in mtDNA sequencing data when a short region (<500 bp) was sequenced. To avoid this potential bias, we conducted an analysis by completely excluding the mitochondrial dataset but we found that the estimates R^ and τ^ remained approximately the same (SI Appendix, Fig. S5F).

Our large collection of species allowed us to conduct refined inferences. First, we reanalyzed each taxon class (Fig. 5D). We found that threatened birds and fish species seem to have experienced a more recent RPD than other threatened vertebrates. Thus, the estimated τ^ should be interpreted as the average starting date of RPD among different threatened species/taxa. Interestingly, a threatened species with a longer generation time might have an earlier start of RPD (Fig. 5E). This might be due to a positive correlation between generation time and body size (27) and the possibility that a threatened species with a larger body size may be affected earlier by human activities. Second, our analysis of temperate/tropical species suggested that tropical threatened species might have been affected earlier than other threatened species (Fig. 5F), which coincides with the greater habitat loss in the tropics than in other parts of the world. However, we did not observe a large difference in the estimated τ between threatened terrestrial and aquatic species (SI Appendix, Fig. S7A) or among low/mid/high-income geographic regions (SI Appendix, Fig. S7B). Finally, because the genetic diversity could be different between different IUCN threatened categories (critically endangered, endangered, and vulnerable) (SI Appendix, Table S8), we compared the nonthreatened species with the threatened species belonging to different threatened categories. We found that the different IUCN threatened categories have almost no impact on the estimated τ^ (SI Appendix, Fig. S7C). Overall, recent RPDs were observed in all of the examined cases, which strengthened our view that RPD on average widely began in the late 19th century.

In summary, this study demonstrates a utility of genetic polymorphism data in conservation biology. The model with the two parameters R and τ (Fig. 5A) is able to capture the crucial features in the observed patterns of polymorphism, and the estimates are robust against various conditions (SI Appendix, Figs. S4 and S5). The estimated start date of RPD coincides with that of widespread industrialization and a profound change in global living ecosystems in the late 19th century, reinforcing the belief that human activity is the primary cause for recent massive extinction. R^ ranges between 1.1 and 1.3 (Fig. 5 and SI Appendix, Figs. S4, S5, and S7), indicating that the ancestral size of threatened species was 10–30% smaller than that of nonthreatened species. Overall, a species with small population size is likely more vulnerable to environmental change (4). Therefore, an important question is raised whether a relatively small ancestral size of a species is an important factor for its current threatened status. This could be investigated with the genome-wide polymorphism data from the recently initiated Genome 10K Project (28) in the future. Finally, if current conservation efforts are generally enacted when population sizes are low, the historical inferences of this study suggest that recovery may be compromised by delaying action as population sizes continue to decline.

Methods

Collecting Genetic Diversity Data.

For over a decade we carefully examined 2,475 peer-reviewed papers published in 164 scientific journals (Dataset S1) and collected DNA polymorphism data for vertebrate species living in various ecosystems. We focused on vertebrate species because available data were mostly from vertebrates. The genetic diversity in each of these species was typically assayed by one or two molecular techniques: allelic variation at microsatellite loci and sequence variation in the control (D-loop) and coding regions of the mitochondrial DNA. Therefore, the genetic diversity data presented here represent the polymorphism level in both nuclear and mitochondrial genomes.

To ensure data quality, if an inconsistency (in the sample size, the number of haplotypes, the number of alleles, or any related key information) was found in a paper we contacted the authors for their confirmation or discarded the data if we received no response. We also excluded studies using museum samples because the number of such studies is very limited. To increase nomenclatural consistency, the standard world checklists (version 2014.3) on the IUCN Red List were used (www.iucnredlist.org/technical-documents/information-sources-and-quality).

The within-species polymorphism data of 2,764 vertebrate species are presented in Dataset S1. In this dataset, there were 400 vertebrate species surveyed by more than one method, so we have 3,219 nonredundant entries in total, where each entry is composed of one to three summary statistics. If the microsatellite loci of a vertebrate species were surveyed, the sample size (n), the number of microsatellite loci, the expected heterozygosity (He), the observed heterozygosity (Ho), the mean number of alleles per locus (α), FST (29), and the year of publication were recorded and later used in simulations. If the D-loop region of the mitochondrion was sequenced in a vertebrate species, we recorded the sample size (n), the length of the sequenced region, the mean number of pairwise nucleotide differences per base pair (π) between the DNA sequences examined, Watterson’s θw per base pair (10), and the year of publication. When it was necessary and possible, we downloaded mitochondrial sequences from EMBL/GenBank and aligned them by DNASTAR, checked the alignment by eye, and then calculated π and Watterson’s θw. The haplotype frequencies were obtained from the original articles, and the sites containing insertions and deletions were excluded in our analysis. To investigate population structure, we calculated Tajima’s D (13) and its P value, as the summary statistic for the ancient population structure (14). The π and Watterson’s θw were rescaled according to the length of the sequenced fragment that was used to calculate Tajima’s D. We also collected a small dataset of genetic diversity for the coding regions of mitochondrial DNA. The collection process was similar to that described above, and we used the dataset only in the description analysis because the number of available species is small.

According to the IUCN, the threatened species (TS) include those listed as critically endangered (CR), endangered (EN), and vulnerable (VU) (3). The species categorized as near threatened (NT) and least concern (LC) are treated as the nonthreatened species (NS). The uncategorized species include those that are listed as data deficient (DD) or have not been evaluated by the IUCN. A taxon is listed as data deficient when there is inadequate information to make an assessment of its risk of extinction (3). Generally, the uncategorized species were excluded from our analyses (Fig. 1), unless stated otherwise.

Collecting Species Distribution and Generation Time Data.

The geographic distributions of 2,552 vertebrate species were retrieved from the IUCN Red List (version 2014.3) and a reptile database (www.reptile-database.org). The data are given in Dataset S1. The generation times of 3,146 vertebrate species were obtained from different published resources (Dataset S1), which formed the basis for our generation time estimates. Assume that the generation time of a species is βρ, where ρ is the sexual maturity age of the species. β is equal to the generation time divided by the sexual maturity age and is species/genus-dependent. If the generation time of a species is unknown but its age of sexual maturity is documented, its generation time was estimated using the mean β from closely related well-studied species.
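The generation-time imputation described above can be sketched as follows. The function name and the example numbers are hypothetical; only the rule itself (generation time = βρ, with β averaged over closely related well-studied species) comes from the text.

```python
from statistics import mean


def estimate_generation_time(maturity_age, relatives):
    """Impute a generation time as (mean beta of relatives) * maturity age.

    relatives: list of (generation_time, sexual_maturity_age) pairs for
    closely related, well-studied species, both in years.
    """
    betas = [gen_time / rho for gen_time, rho in relatives if rho > 0]
    if not betas:
        raise ValueError("no related species with usable data")
    return mean(betas) * maturity_age


# Hypothetical relatives: beta = 6/3 = 2.0 and 10/4 = 2.5, so mean beta = 2.25.
related = [(6.0, 3.0), (10.0, 4.0)]
print(estimate_generation_time(2.0, related))
```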

Ratio of Effective Population Sizes Between Nonthreatened and Threatened Species.

It is difficult to estimate the effective population size at a specific time of a species with fluctuating population size (5–7). However, the ratio of $N(0)$ between two species groups, each composed of a large number of related species, can be approximated by the ratio of their current census size $N$. It has been inferred that the ratio of the effective to the actual population size [$f = N(0)/N$] is of the order of 0.1 (16), and $f$ is usually independent of $N$. When both nonthreatened and threatened species groups are each composed of a large number of related species, it is reasonable to assume $E(f_{NS}) \approx E(f_{TS})$, where subscripts NS and TS represent nonthreatened and threatened species, respectively. Consequently,

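$$\omega = \frac{E(\theta_{NS}(0))}{E(\theta_{TS}(0))} = \frac{E(N_{NS}(0))}{E(N_{TS}(0))} = \frac{E(f_{NS} N_{NS})}{E(f_{TS} N_{TS})} \approx \frac{E(N_{NS})}{E(N_{TS})}, \qquad [1]$$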

where N(t) denotes the effective population size at time t (counting backward) and N the current census size.

Based on the current census of 1,868 vertebrate species obtained from IUCN, a professional book (30), and peer-reviewed literature (Dataset S1), we estimated ω^ ≈ 25 for mammals, 146 for birds, 32 for amphibians, 26 for reptiles, and 14 for fish (SI Appendix, Table S9). Therefore, we set ω^=25 in the modeling study, but we also used ω^=10 and 100 to examine the robustness of the results.

Demographic Model and Likelihood Inference of RPD.

Demographic model.

We assumed that the effective population size of a nonthreatened species remains constant. Denote the ancestral effective population size of a threatened species at time t by NTS(t), and assume that at t years ago its population size started declining exponentially (Fig. 5A); time t is counted backward. We assumed that the start date of RPD in threatened species follows the normal, the exponential, or the one-point (i.e., constant) distribution with the mean τ. Once ω=E(NNS(0))/E(NTS(0)) was estimated by Eq. 1, the demographic model was characterized by two parameters, R=E(θNS(t))/E(θTS(t)) and τ=E(t). Then, the two parameters R and τ were estimated from an analysis in the likelihood framework as described below.

Under the assumption of constant population size during the first phase and $E(f_{NS}) \approx E(f_{TS})$, the ratio of ancestral effective sizes, $R = E(N_{NS}(t))/E(N_{TS}(t))$, is approximately equal to the corresponding ratio of the expected census sizes of the nonthreatened and threatened species at the start date of RPD. Thus, R represents the ratio of census size at the start date of RPD between the two species groups. If the assumption of constant size during the first phase is invalid, R represents the long-term ratio of census size between two species groups before the start of RPD.

Assumption about sampling times.

The sampling time is likely different in different studies, but the time duration between the time of sampling and the time of publication is generally much shorter than $t$. We assumed that the sampling happened 3 y before the year of publication. In this study, the term “at present” means “2015,” so the sampling of the i-th species happened $\gamma_i = 2015 - y_i + 3$ years ago, where $y_i$ is the year of publication. The duration between the start date of RPD and the year of sampling is $t_i' = t - \gamma_i$.

Coalescence-based simulations.

The coalescence-based simulations followed the standard procedure (31, 32). To simulate the microsatellite polymorphism data for the i-th species, we (randomly) chose θi(γi)=4Ni(γi)μ from the θobs value(s) of its nonthreatened related species, where μ is the mutation rate per locus per generation, and the sampling happened γi years ago. The details are given below.

We first calculated $\theta_{obs}$ from the observed within-species polymorphism data of nonthreatened species with a reasonable sample size ($n \ge 20$) based on the stepwise mutation model (15). Then, we denoted the set of $\theta_{obs}$ values in the group of nonthreatened species related to the i-th species as $\theta_i = \{\theta_{obs,1}, \theta_{obs,2}, \ldots\}$. If $\theta_i = \emptyset$ at the genus level, $\theta_i$ would be obtained at the family level, or even at the phylum level. Then a value $\theta_s$ was (randomly) drawn from $\theta_i$. The value $\theta_s$ would be used as $\theta$ ($= 4N_{NS,i}(0)\mu = 4N_{NS,i}(\gamma_i)\mu$) if the i-th species is nonthreatened. If the i-th species is threatened, the value $\theta_s$ was rescaled according to its effective population size at the time of sampling, $N_{TS,i}(\gamma_i)$. Then, we have

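$$\theta_{TS,i}(\gamma_i) = \begin{cases} \dfrac{e^{\lambda t_i'}\,\hat{\omega}}{e^{\lambda t}}\,\theta_s, & t_i' > 0 \\[2ex] \dfrac{1}{R}\,\theta_s, & t_i' \le 0, \end{cases} \qquad [2]$$

where $\lambda$ is the rate of RPD, and a negative $t_i'$ indicates that sampling happened before the start of RPD. From the definitions of R and $\omega$ and Eq. 1, we have

$$\lambda = \ln(R\hat{\omega})/t, \qquad [3]$$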

where t was randomly sampled from the distribution described above.
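The rescaling in Eqs. 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, and the value of $\hat{\omega}$ below is an arbitrary placeholder chosen only to make the example run.

```python
import math

def decline_rate(R, omega_hat, t):
    """Eq. 3: lambda = ln(R * omega_hat) / t (negative when R * omega_hat < 1)."""
    return math.log(R * omega_hat) / t

def theta_threatened(theta_s, R, omega_hat, t, gamma_i):
    """Eq. 2: rescale theta_s for a threatened species sampled gamma_i years ago."""
    lam = decline_rate(R, omega_hat, t)
    t_prime = t - gamma_i  # time between the start of RPD and sampling
    if t_prime > 0:
        # Sampling happened after the start of RPD
        return math.exp(lam * t_prime) * omega_hat / math.exp(lam * t) * theta_s
    # Sampling happened at or before the start of RPD
    return theta_s / R

# Placeholder inputs (omega_hat and theta_s are NOT values from the paper)
R, omega_hat, t, theta_s = 1.22, 0.04, 123.0, 2.0
gamma_i = 2015 - 2005 + 3  # a study published in 2005, sampled 3 y earlier
print(decline_rate(R, omega_hat, t))
print(theta_threatened(theta_s, R, omega_hat, t, gamma_i))
```

The two branches of Eq. 2 join continuously at $t_i' = 0$ because $e^{\lambda t} = R\hat{\omega}$, which is a useful sanity check on any implementation.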

To model the heterogeneity of mutation rates among microsatellite loci, we assumed that the mutation rate for a randomly selected locus follows a lognormal distribution, with the coefficient of variation equal to 1 (33). We also assumed that the microsatellite loci are independent and are autosomal.
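As an illustration of the mutation-rate heterogeneity described above, per-locus rates can be drawn from a lognormal distribution whose coefficient of variation (CV) equals 1. For a lognormal variable, $\mathrm{CV}^2 = e^{\sigma^2} - 1$, so CV = 1 corresponds to $\sigma^2 = \ln 2$. The function name and the mean rate used here are assumptions made for the sketch, not values taken from the study.

```python
import math
import random

def sample_locus_mutation_rates(mean_rate, n_loci, cv=1.0, rng=None):
    """Draw per-locus mutation rates from a lognormal with a given mean and CV."""
    rng = rng or random.Random()
    sigma2 = math.log(1.0 + cv ** 2)           # CV^2 = exp(sigma^2) - 1
    mu_log = math.log(mean_rate) - 0.5 * sigma2  # keeps the arithmetic mean at mean_rate
    sigma = math.sqrt(sigma2)
    return [rng.lognormvariate(mu_log, sigma) for _ in range(n_loci)]

# Hypothetical mean rate of 5e-4 per locus per generation
rates = sample_locus_mutation_rates(5e-4, n_loci=10, rng=random.Random(1))
print(rates)
```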

If the evolution of the i-th species followed a nonconstant size model, time in years was transformed to units of $2N_i(0)$ generations (31, 32), where $N_i(0) = E(f)\,\mathcal{N}_i$, $E(f) = 0.1$ (16), and $\mathcal{N}_i$ is the current census size of the i-th species.

To simulate the single-nucleotide polymorphism data on the D-loop region, we followed the procedures described above with two modifications. First, $\theta_{obs}$ was estimated as Watterson’s $\theta_w$ (10), based on the within-species sequence variation in the D-loop of nonthreatened species with $n \ge 20$; the cases of $\theta_{obs} = 0$ were discarded because those estimated values imply $N(0) = 0$ or $\mu = 0$. Second, the rescaled decline time was multiplied by 4 because the effective population size of a mitochondrial locus is only one-fourth that of an autosomal locus.
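The time rescaling described above can be sketched as follows. The generation time is not specified in this section, so `generation_time_years` is a hypothetical parameter introduced only to convert years into generations; the function name and the example census size are likewise illustrative.

```python
def years_to_coalescent_units(years, census_size, generation_time_years,
                              ne_ratio=0.1, mitochondrial=False):
    """Convert years to units of 2*N(0) generations, with N(0) = E(f) * census size."""
    n_e = ne_ratio * census_size                # effective size, E(f) = 0.1
    generations = years / generation_time_years  # assumed generation time
    scaled = generations / (2.0 * n_e)
    # A mitochondrial locus has one-fourth the effective size of an autosomal locus,
    # so the same number of years spans 4 times as many coalescent units.
    return 4.0 * scaled if mitochondrial else scaled

# Hypothetical species: census size 10,000 and a 5-y generation time
print(years_to_coalescent_units(123, 10_000, 5))
print(years_to_coalescent_units(123, 10_000, 5, mitochondrial=True))
```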

Likelihood inference of RPD.

The observed genetic diversity of a species can be represented by a vector $S = \{H_e, \alpha, \theta_w, \pi\}$ in which at least one element was observed. The corresponding likelihood function is $L(R,\tau) = \prod_i P(S_i \mid R,\tau)$, where the i-th species is designated by subscript i. Although there is no exact method for computing $L(R,\tau)$, numerical approximations can be obtained by following the principle of rejection sampling (5, 17, 18) and by representing the data as the relative differences in the mean genetic diversity between nonthreatened and threatened species. The details are as follows.

For the summary statistic $H_e$ on the microsatellite loci, we denote $\Delta H_e = (\overline{H_{e,NS}} - \overline{H_{e,TS}})/\overline{H_{e,NS}}$. Denote $\Delta H_{e,simu}$ and $\Delta H_{e,obs}$ as the simulated and observed relative differences of mean $H_e$ between nonthreatened and threatened species. The likelihood function $L_{H_e}(R,\tau)$ is then estimated as a numerical approximation of $P(|\Delta H_{e,simu} - \Delta H_{e,obs}| \le \varepsilon \mid R,\tau)$, where $\varepsilon$ is a fixed tolerance. When $\varepsilon$ is very small, the computational load is very large, whereas the precision of the estimate is poor when $\varepsilon$ is large. Our experience suggests that $\varepsilon = 0.05$ works well, and the estimate of $\tau$ is not sensitive to $\varepsilon$. We also set $\varepsilon$ = 0.01, 0.1, or 0.2, and the results remained almost unchanged. The estimation procedure of $P(|\Delta H_{e,simu} - \Delta H_{e,obs}| \le \varepsilon \mid R,\tau)$ is given below. In step 1, we simulated the microsatellite polymorphic dataset using the procedure described above and calculated $\Delta H_{e,simu}$. The pattern of missing data, the sample sizes, and the number of loci were taken into account in the simulation and in the related calculation. In step 2, we introduced an indicator variable $I(H_e)$, which equals 1 if $|\Delta H_{e,simu} - \Delta H_{e,obs}| \le \varepsilon$ and 0 otherwise. In step 3, we repeated steps 1 and 2 B times. The likelihood $L_{H_e}(R,\tau)$ was then estimated by $\hat{L}_{H_e}(R,\tau) = \frac{1}{B}\sum I(H_e)$, where $B = 10^4$.
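Steps 1 to 3 amount to a Monte Carlo acceptance fraction. The sketch below shows the idea under a strong simplification: the coalescent simulation of step 1 is replaced by a hypothetical stand-in, `toy_simulator`, that draws the simulated relative difference from a normal distribution. Only the counting logic mirrors the procedure described above.

```python
import random

def rejection_likelihood(simulate_delta, delta_obs, eps=0.05, B=10_000, rng=None):
    """Estimate P(|delta_simu - delta_obs| <= eps) as the fraction of accepted draws."""
    rng = rng or random.Random()
    hits = 0
    for _ in range(B):
        delta_simu = simulate_delta(rng)          # step 1: simulate the statistic
        if abs(delta_simu - delta_obs) <= eps:    # step 2: indicator variable
            hits += 1
    return hits / B                               # step 3: average over B replicates

def toy_simulator(center, spread=0.05):
    """Hypothetical stand-in for a coalescent simulation under a given (R, tau)."""
    return lambda rng: rng.gauss(center, spread)

delta_obs = 0.10  # placeholder observed value, not from the paper
near = rejection_likelihood(toy_simulator(0.10), delta_obs, rng=random.Random(7))
far = rejection_likelihood(toy_simulator(0.40), delta_obs, rng=random.Random(7))
print(near, far)
```

Parameter values whose simulations land close to the observed statistic receive a high acceptance fraction, and values that rarely reproduce it receive a fraction near zero.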

The above procedure was applied to multiple summary statistics with only minor modifications. For the microsatellite dataset, we jointly considered $H_e$ and $\alpha$, which are not independent. Then we have $L_{micro}(R,\tau) = L_{H_e,\alpha}(R,\tau) = P(|\Delta H_{e,simu} - \Delta H_{e,obs}| \le \varepsilon,\ |\Delta\alpha_{simu} - \Delta\alpha_{obs}| \le \varepsilon \mid R,\tau)$. Similarly, we have $L_{mtDNA}(R,\tau) = L_{\theta,\pi}(R,\tau)$. Finally, we have $L(R,\tau) = L_{micro}(R,\tau)\,L_{mtDNA}(R,\tau)$.

Then, R and $\tau$ were estimated by a two-step process within a likelihood framework. The first step is to calculate the profile likelihood for R [$L_1(R) = \max_{\tau} L(R,\tau)$] and then the mean $\hat{R}$, which is the mean of the R values weighted by their relative profile likelihood values (Fig. 5 C–F). The second step is to obtain the maximum likelihood estimate $\hat{\tau}$ of $\tau$ conditional on $R = \hat{R}$. In general, if $R > 1$, which indicates that the ancestral genetic diversity of nonthreatened species is higher than that of threatened species, a small $\tau$ is needed to explain the observed low genetic diversity in the present-day threatened species. In particular, if $R \ge 1.35$, we have $\tau \le 40$ (Fig. 5C). Because it has been documented that the RPD of threatened species began at least 40 y ago (2), we set 1.35 as the upper bound of R. Moreover, it is likely that, on average, the ancestral size of nonthreatened species was larger than that of their related threatened species. Thus, we assumed $R \ge 1$.
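The two-step estimation can be illustrated on a small likelihood grid. The grid values below are invented for the example, and conditioning on $R = \hat{R}$ is approximated here by taking the grid row nearest to $\hat{R}$, which is an assumption of this sketch rather than a detail stated in the text.

```python
def estimate_R_and_tau(R_grid, tau_grid, lik):
    """Two-step estimate; lik[i][j] is L(R_grid[i], tau_grid[j])."""
    # Step 1: profile likelihood L1(R) = max over tau, then the weighted mean of R
    profile = [max(row) for row in lik]
    total = sum(profile)
    R_hat = sum(r * p for r, p in zip(R_grid, profile)) / total
    # Step 2: maximize over tau on the grid row closest to R_hat (sketch assumption)
    i_hat = min(range(len(R_grid)), key=lambda i: abs(R_grid[i] - R_hat))
    j_hat = max(range(len(tau_grid)), key=lambda j: lik[i_hat][j])
    return R_hat, tau_grid[j_hat]

# Invented likelihood surface, restricted to 1 <= R < 1.35
R_grid = [1.0, 1.1, 1.2, 1.3]
tau_grid = [50, 100, 150]
lik = [
    [0.1, 0.2, 0.1],
    [0.2, 0.4, 0.3],
    [0.3, 0.6, 0.8],
    [0.1, 0.2, 0.2],
]
print(estimate_R_and_tau(R_grid, tau_grid, lik))
```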

All of the collected species, including those with small sample size ($n < 20$), were included in the coalescence-based modeling, unless noted otherwise. This is because the modeling was conducted conditional on the values of n.

Likelihood ratio test.

To conduct the likelihood ratio test and obtain likelihood-based confidence intervals, we obtained the empirical distribution of the likelihood ratio $\zeta = \log(\max L_1/\max L_0)$ by analyzing $10^4$ simulated datasets conditional on R and $\tau$, where $L_1$ and $L_0$ are the likelihoods for the alternative and null models, respectively. An example of the empirical distribution of $\zeta$ is shown (SI Appendix, Fig. S8). In this case, the 95% critical value ($\zeta_{95\%}$) is 1.792. Note that, for a different dataset, or for different R and $\tau$, $\zeta_{95\%}$ could be different. Therefore, the corresponding empirical distribution of $\zeta$ was used to obtain $\zeta_{95\%}$ for the likelihood ratio test.
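Reading a critical value from an empirical distribution is a percentile lookup. The following sketch assumes the simulated $\zeta$ values are already available as a list; the function names are hypothetical, and the values used are placeholders rather than results from the study.

```python
import math

def empirical_critical_value(zetas, level=0.95):
    """Return the smallest simulated value with at least `level` of the draws at or below it."""
    ordered = sorted(zetas)
    k = math.ceil(level * len(ordered)) - 1
    return ordered[k]

def reject_null(zeta_obs, zetas, level=0.95):
    """Reject the null model when the observed ratio exceeds the empirical critical value."""
    return zeta_obs > empirical_critical_value(zetas, level)

# Placeholder "simulated" log-likelihood ratios: 1, 2, ..., 100
simulated = list(range(1, 101))
print(empirical_critical_value(simulated))
print(reject_null(97, simulated), reject_null(50, simulated))
```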

Effect of RPD on the Difference in Genetic Diversity Between Nonthreatened and Threatened Species.

We first conducted the coalescence-based modeling described above, conditional on $R = 1.22$ and $\tau = 0$ or 123. We used $\theta_{NS}(\tau) = 2$ (microsatellite loci, per locus) and 0.02 (mitochondrial locus, per base pair). We then compared the case of $\tau = 123$ (the estimated RPD) with the case of $\tau = 0$ (indicating no RPD) to estimate the effect of RPD on the difference in genetic diversity between nonthreatened and threatened species. We denoted $\Delta_{M,\tau} = (\overline{M_{\tau,NS}} - \overline{M_{\tau,TS}})/\overline{M_{\tau,NS}}$, where M stands for $H_e$, the mean number of alleles per locus ($\alpha$), $\pi$, or Watterson’s $\theta_w$. The effect of RPD was calculated as $(\Delta_{M,\tau=123} - \Delta_{M,\tau=0})/\Delta_{M,\tau=123}$ (SI Appendix, Table S7). In the considered model, we have $\overline{M_{\tau=123,NS}} = \overline{M_{\tau=0,NS}}$, so the effect of RPD can be simplified as $(\overline{M_{\tau=0,TS}} - \overline{M_{\tau=123,TS}})/(\overline{M_{\tau=123,NS}} - \overline{M_{\tau=123,TS}})$. In the mitochondrial case of $\tau = 0$, the values of $\Delta_{\pi,\tau=0}$ and $\Delta_{\theta_w,\tau=0}$ can be computed analytically, and they agree with the simulation results.
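The equivalence of the two expressions for the effect of RPD can be checked numerically. The diversity values below are invented placeholders, not entries from SI Appendix, Table S7, and the function names are hypothetical.

```python
def relative_difference(m_ns, m_ts):
    """Delta_M = (mean M in nonthreatened - mean M in threatened) / mean M in nonthreatened."""
    return (m_ns - m_ts) / m_ns

def rpd_effect(m_ns, m_ts_tau0, m_ts_tau123):
    """Share of the diversity difference attributable to RPD (general form)."""
    d123 = relative_difference(m_ns, m_ts_tau123)
    d0 = relative_difference(m_ns, m_ts_tau0)
    return (d123 - d0) / d123

def rpd_effect_simplified(m_ns, m_ts_tau0, m_ts_tau123):
    """Simplified form, valid when the nonthreatened mean is the same for both tau values."""
    return (m_ts_tau0 - m_ts_tau123) / (m_ns - m_ts_tau123)

# Placeholder mean heterozygosities
m_ns, m_ts_tau0, m_ts_tau123 = 0.70, 0.66, 0.60
print(rpd_effect(m_ns, m_ts_tau0, m_ts_tau123))
print(rpd_effect_simplified(m_ns, m_ts_tau0, m_ts_tau123))
```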

Robustness Analysis Under Various Demographic Models.

To examine the robustness of the estimates, we conducted reanalyses under various demographic models. The likelihood method described above is very flexible, and little modification is needed to analyze other demographic models. First, we considered a slow population decline in nonthreatened vertebrate species categorized as near-threatened (NT), based on the rationale that human activities may also affect those nonthreatened vertebrate species. We assumed that their population size declined by half [$E(N_{NT}(\tau))/E(N_{NT}(0)) = 2$]. We assumed that the populations of those near-threatened vertebrate species also started to decline at time $\tau$.

Second, we considered an ancestral population with varying size (SI Appendix, Fig. S4). Under the ancestral instantaneous expansion model (SI Appendix, Fig. S4 A and B), we assumed that $t_1$ follows a uniform distribution on [100,000, 1,000,000] (in years), and $N_2/N_1$ is uniformly distributed on [2, 5], where $N_1$ and $N_2$ are the effective population sizes before and after the expansion, respectively. For the ancestral instantaneous bottleneck model (SI Appendix, Fig. S4 C and D), we assumed that $t_0$ follows a uniform distribution on [10,000, 100,000], $t_1$ follows a uniform distribution on [100,000, 1,000,000], and $N_2/N_1$ is uniformly distributed on [2, 5], where $N_1$ is the effective population size during the bottleneck and $N_2$ is the effective population size before and after the bottleneck. Then we assigned the ancestral expansion or the ancestral bottleneck model with equal probability (0.5 vs. 0.5) to a species as its demographic scenario before the RPD.
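Drawing one ancestral scenario from the distributions above could look like the following sketch. The dictionary keys and the function name are illustrative choices; only the ranges and the 0.5 vs. 0.5 assignment come from the text.

```python
import random

def sample_ancestral_scenario(rng):
    """Draw an expansion or bottleneck scenario with equal probability."""
    scenario = {
        "t1_years": rng.uniform(100_000, 1_000_000),  # t1, in years
        "N2_over_N1": rng.uniform(2, 5),              # size ratio N2/N1
    }
    if rng.random() < 0.5:
        scenario["model"] = "expansion"
    else:
        scenario["model"] = "bottleneck"
        scenario["t0_years"] = rng.uniform(10_000, 100_000)  # t0, bottleneck model only
    return scenario

print(sample_ancestral_scenario(random.Random(3)))
```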

To ensure that the simulated genetic diversity level is equal to the observed one in a nonthreatened species with varying population size, $\theta_{NS}(0)$ was determined as follows. Based on the constant size model, $\theta$ was first estimated from the observed polymorphic data (10, 15). Then we simulated 50 random coalescent trees given the desired demographic model and denoted the mean tree length by $\bar{l}$. Then we have $\sigma = \bar{l}\,/\sum_{i=1}^{n-1}\frac{2}{i}$, and $\theta_{NS}(0) = \theta/\sigma$.
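The denominator $\sum_{i=1}^{n-1} 2/i$ is the expected total tree length under the constant-size coalescent when time is measured in units of 2N generations, so $\sigma$ is the ratio of the mean tree length under the desired model to its constant-size expectation. The sketch below simulates only constant-size trees, for which $\sigma$ should be close to 1; simulating trees under an expansion or bottleneck model is outside its scope. The function names and the value of $\theta$ are hypothetical.

```python
import random

def simulate_tree_length(n, rng):
    """Total branch length of a constant-size coalescent tree (time in 2N generations)."""
    total = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0            # coalescence rate with k lineages
        total += k * rng.expovariate(rate)  # k branches persist for the waiting time
    return total

def expected_constant_length(n):
    """Expected total tree length under constant size: sum_{i=1}^{n-1} 2/i."""
    return sum(2.0 / i for i in range(1, n))

def scaling_factor(mean_tree_length, n):
    """sigma = mean tree length / constant-size expectation."""
    return mean_tree_length / expected_constant_length(n)

rng = random.Random(5)
n, reps = 10, 20_000  # more replicates than the 50 trees in the text, to reduce noise
mean_len = sum(simulate_tree_length(n, rng) for _ in range(reps)) / reps
sigma = scaling_factor(mean_len, n)
theta = 2.0           # placeholder estimate under the constant size model
print(sigma, theta / sigma)
```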

To determine $\theta_{TS}(0)$ of a threatened species, we reconstructed a demographic model of the nonthreatened species. During the first phase, the model is the same as that of the threatened species. During the second phase, however, the population size is constant. Then $\theta_{TS}(\tau)$ was obtained by the method described above, and $\theta_{TS}(0)$ was rescaled from $\theta_{TS}(\tau)$. The effect of different sampling times was also considered, and Eq. 2 was slightly revised.

Supplementary Material

Supplementary File: pnas.1616804113.sapp.pdf (894.7 KB, PDF)

Acknowledgments

We thank Feng Gao for technical assistance; Chun Ye, Dongsheng Lu, and Xixian Ma for help during data collection; Jing Luo, Xuemei Lü, Peng Shi, and John Parsch for their comments; and Sara Barton for editorial assistance. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences Grant XDB13040800, National Natural Science Foundation of China Grant 91531306, and 973 Project Grant 2012CB316505 (all to H.L., J.X.-Y., G.D., Z.G., C.M., and Z.Y.). This work was also supported in part by Grant nos. 91231120 and 91631304 from the National Natural Science Foundation of China and an endowment from The University of Texas Health Science Center at Houston (to Y.-X.F.). Y.-P.Z. is supported in part by the Yunnan Provincial Science and Technology Department and the National Natural Science Foundation of China.

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1616804113/-/DCSupplemental.

References

  • 1. Stuart SN, et al. Status and trends of amphibian declines and extinctions worldwide. Science. 2004;306(5702):1783–1786. doi: 10.1126/science.1103538.
  • 2. WWF. Living Planet Report 2014. World Wildlife Fund; Gland, Switzerland: 2014. Available at http://wwf.panda.org/wwf_news/?231893/Living-Planet-Report-2014.
  • 3. IUCN. The IUCN Red List of Threatened Species, version 2014.2. International Union for Conservation of Nature and Natural Resources; Cambridge, UK: 2014.
  • 4. Frankham R, Ballou JD, Briscoe DA. Introduction to Conservation Genetics. Cambridge Univ Press; Cambridge, UK: 2002.
  • 5. Li H, Stephan W. Inferring the demographic history and rate of adaptive substitution in Drosophila. PLoS Genet. 2006;2(10):e166. doi: 10.1371/journal.pgen.0020166.
  • 6. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475(7357):493–496. doi: 10.1038/nature10231.
  • 7. Liu X, Fu Y-X. Exploring population size changes using SNP frequency spectra. Nat Genet. 2015;47(5):555–559. doi: 10.1038/ng.3254.
  • 8. Fu Y-X. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 1997;147(2):915–925. doi: 10.1093/genetics/147.2.915.
  • 9. Hartl DL, Clark AG. Principles of Population Genetics. Sinauer; Sunderland, MA: 1988.
  • 10. Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol. 1975;7(2):256–276. doi: 10.1016/0040-5809(75)90020-9.
  • 11. Spielman D, Brook BW, Frankham R. Most species are not driven to extinction before genetic factors impact them. Proc Natl Acad Sci USA. 2004;101(42):15261–15264. doi: 10.1073/pnas.0403809101.
  • 12. Good PI. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer; Heidelberg: 2000.
  • 13. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123(3):585–595. doi: 10.1093/genetics/123.3.585.
  • 14. Fu Y-X. New statistical tests of neutrality for DNA samples from a population. Genetics. 1996;143(1):557–570. doi: 10.1093/genetics/143.1.557.
  • 15. Xu H, Fu Y-X. Estimating effective population size or mutation rate with microsatellites. Genetics. 2004;166(1):555–563. doi: 10.1534/genetics.166.1.555.
  • 16. Frankham R. Effective population size/adult population size ratios in wildlife: A review. Genet Res. 2007;89(5-6):491–503. doi: 10.1017/S0016672308009695.
  • 17. Fu Y-X, Li W-H. Estimating the age of the common ancestor of a sample of DNA sequences. Mol Biol Evol. 1997;14(2):195–199. doi: 10.1093/oxfordjournals.molbev.a025753.
  • 18. Tavaré S, Balding DJ, Griffiths RC, Donnelly P. Inferring coalescence times from DNA sequence data. Genetics. 1997;145(2):505–518. doi: 10.1093/genetics/145.2.505.
  • 19. Nei M, Maruyama T, Chakraborty R. The bottleneck effect and genetic variability in populations. Evolution. 1975;29(1):1–10. doi: 10.1111/j.1558-5646.1975.tb00807.x.
  • 20. Tajima F. The effect of change in population size on DNA polymorphism. Genetics. 1989;123(3):597–601. doi: 10.1093/genetics/123.3.597.
  • 21. Newbold T, et al. Global effects of land use on local terrestrial biodiversity. Nature. 2015;520(7545):45–50. doi: 10.1038/nature14324.
  • 22. Ceballos G, et al. Accelerated modern human-induced species losses: Entering the sixth mass extinction. Sci Adv. 2015;1(5):e1400253. doi: 10.1126/sciadv.1400253.
  • 23. Collen B, et al. Monitoring change in vertebrate abundance: The living planet index. Conserv Biol. 2009;23(2):317–327. doi: 10.1111/j.1523-1739.2008.01117.x.
  • 24. Collen B, Nicholson E. Taking the measure of change. Science. 2014;346(6206):166–167. doi: 10.1126/science.1255772.
  • 25. Bazin E, Glémin S, Galtier N. Population size does not influence mitochondrial genetic diversity in animals. Science. 2006;312(5773):570–572. doi: 10.1126/science.1122033.
  • 26. Meiklejohn CD, Montooth KL, Rand DM. Positive and negative selection on the mitochondrial genome. Trends Genet. 2007;23(6):259–263. doi: 10.1016/j.tig.2007.03.008.
  • 27. Martin AP, Palumbi SR. Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci USA. 1993;90(9):4087–4091. doi: 10.1073/pnas.90.9.4087.
  • 28. Koepfli KP, Paten B, Genome 10K Community Scientists, O’Brien SJ. The Genome 10K Project: A way forward. Annu Rev Anim Biosci. 2015;3:57–111.
  • 29. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38(6):1358–1370. doi: 10.1111/j.1558-5646.1984.tb05657.x.
  • 30. Nowak RM. Walker’s Mammals of the World. Johns Hopkins Univ Press; Baltimore: 1999.
  • 31. Hudson RR. Gene genealogies and the coalescent process. In: Futuyma D, Antonovics J, editors. Oxford Surveys in Evolutionary Biology. Vol 7. Oxford Univ Press; New York: 1990. pp. 1–44.
  • 32. Griffiths RC, Tavaré S. Sampling theory for neutral alleles in a varying environment. Philos Trans R Soc Lond B Biol Sci. 1994;344(1310):403–410. doi: 10.1098/rstb.1994.0079.
  • 33. Renwick A, Davison L, Spratt H, King JP, Kimmel M. DNA dinucleotide evolution in humans: Fitting theory to facts. Genetics. 2001;159(2):737–747. doi: 10.1093/genetics/159.2.737.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences