Significance
The current rate of species extinction is ∼1,000 times the background rate of extinction and is attributable to human impact, ecological and demographic fluctuations, and inbreeding due to small population sizes. The rate and the initiation date of rapid population decline (RPD) can provide important clues about the driving forces of population decline in threatened species, but they are generally unknown. We analyzed the genetic diversity data in 2,764 vertebrate species. Our population genetics modeling suggests that in many threatened vertebrate species the RPD on average began in the late 19th century, and the mean current size of threatened vertebrates is only 5% of their ancestral size. We estimated a ∼25% population decline every 10 y in threatened vertebrate species.
Keywords: vertebrate, threatened species, coalescent, rapid population decline, conservation
Abstract
Accelerated losses of biodiversity are a hallmark of the current era. Large declines of population size have been widely observed and currently 22,176 species are threatened by extinction. The time at which a threatened species began rapid population decline (RPD) and the rate of RPD provide important clues about the driving forces of population decline and anticipated extinction time. However, these parameters remain unknown for the vast majority of threatened species. Here we analyzed the genetic diversity data of nuclear and mitochondrial loci of 2,764 vertebrate species and found that the mean genetic diversity is lower in threatened species than in related nonthreatened species. Our coalescence-based modeling suggests that in many threatened species the RPD began ∼123 y ago (a 95% confidence interval of 20–260 y). This estimated date coincides with widespread industrialization and a profound change in global living ecosystems over the past two centuries. On average the population size declined by ∼25% every 10 y in a threatened species, and the population size was reduced to ∼5% of its ancestral size. Moreover, the ancestral size of threatened species was, on average, ∼22% smaller than that of nonthreatened species. Because the time period of RPD is short, the cumulative effect of RPD on genetic diversity is still not strong, so that the smaller ancestral size of threatened species may be the major cause of their reduced genetic diversity; RPD explains 24.1–37.5% of the difference in genetic diversity between threatened and nonthreatened species.
Although preservation of biodiversity is vital to a sustainable human society, rapid population decline (RPD) continues to be widespread across taxa (1–3). When RPD occurs, it is accompanied by a loss of genetic diversity. Genetic diversity is reflected in the genetic differences among individuals and is essential for populations to adapt to changing environments (4). The start date and the rate of RPD provide useful information for effective conservation of threatened species and are important for promotion of public awareness of the threat. However, these two key parameters are difficult to estimate because there are virtually no time-series data on population size over hundreds of years. For most species, the population size may only be traced back to 40 y (2). Therefore, an alternative approach is to estimate the start date and the rate of RPD, using mathematical modeling.
Changes in population size over thousands of years could be inferred for a species from genome-wide DNA polymorphism data (5–7). However, it remains a formidable technical challenge to infer the event of RPD because the signal of such an event is weak in the typical time scale of observable polymorphisms (8). To overcome the limited resolution power of the genetic data from a single species, we propose an approach that draws conclusions based on the collective support from many species. The central premise of our approach is that the threat of extinction of thousands of species was primarily due to a common cause in the past that led to a significant depletion of available habitats and resources. Consequently, we were able to draw conclusions based on present-day polymorphism data from a large number of threatened species and their nonthreatened relatives. Our method is depicted in Fig. 1. Here we studied RPD in vertebrates, because vertebrates have been more extensively investigated in the past. However, our conclusions should have some generality because vertebrate species live in a wide range of ecosystems. Moreover, the proposed method is also suitable for studying nonvertebrate species.
Results and Discussion
Data Collected.
We reviewed more than 10,000 peer-reviewed papers published in the last two and a half decades, among which ∼2,500 papers in 164 scientific journals were found to have surveyed the genetic diversity of at least one vertebrate species. The level of genetic diversity was measured with one of the following summary statistics (9): the expected and observed heterozygosity ($H_e$ and $H_o$), the number of alleles per locus ($A$) at the microsatellite loci, Watterson’s $\theta$ (10), and the mean number of nucleotide differences per nucleotide site between two mitochondrial sequences ($\pi$). The collected dataset includes 2,764 vertebrate species belonging to 1,466 genera and 465 families (Fig. 2). Then, we used the International Union for Conservation of Nature (IUCN) Red List categories (3) to determine the level of extinction risk for each species, and the species were categorized into three groups: nonthreatened species (NS), threatened species (TS), and uncategorized species (Fig. 2B). The uncategorized species include (i) those that are listed as data deficient and (ii) those that have not been evaluated by the IUCN. A taxon is listed as data deficient when there is inadequate information to make an assessment of its risk of extinction (3). The uncategorized species were excluded from our analyses (Fig. 1), unless noted otherwise.
Comparison of Genetic Diversity Between Nonthreatened and Threatened Species.
Following a previous study (11), we compared the genetic diversity between nonthreatened and threatened vertebrate species using the permutation test (12). The establishment of those IUCN categories does not rely on the information of genetic diversity. Although the distributions of genetic diversity of nonthreatened and threatened species overlap (Fig. 3 A and B and SI Appendix, Fig. S1), the mean genetic diversity of nonthreatened species is significantly higher than that of related threatened species in all 16 comparisons (Fig. 3 C–E and SI Appendix, Table S1), generally agreeing with the previous finding (11). The results remain the same when we recompiled the data with different numbers of microsatellite loci ( loci) or different sequenced lengths of the D-loop ( bp) (SI Appendix, Fig. S2).
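A comparison of this kind can be sketched with a simple permutation test on the difference in group means. The sketch below is a minimal illustration and not necessarily the exact procedure of ref. 12; the function name `permutation_test` and the heterozygosity values are hypothetical.

```python
import random

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """One-sided permutation test for mean(group_a) > mean(group_b)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    count = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the mean difference.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            count += 1
    # The add-one correction keeps the estimated P value strictly positive.
    return observed, (count + 1) / (n_perm + 1)

# Hypothetical heterozygosity values, for illustration only.
ns_he = [0.71, 0.68, 0.75, 0.66, 0.73, 0.70]
ts_he = [0.58, 0.62, 0.55, 0.64, 0.60, 0.57]
diff, p_value = permutation_test(ns_he, ts_he)
```

A small `p_value` indicates that a mean difference as large as the observed one rarely arises when the threatened/nonthreatened labels are shuffled.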
To examine whether differences in population structure can explain the reduction in genetic diversity of threatened species, we first compared the Fst values (an indicator of recent population structure estimated from microsatellite loci) between the two species groups and found no significant difference (P = 0.25) (SI Appendix, Table S2). Next, we calculated the one-tailed P values of Tajima’s D (13) for the mitochondrial DNA polymorphism data, which is sensitive to ancient but not recent population structure (14). There was also no significant difference between the two species groups (P = 0.67) (SI Appendix, Table S2). Thus, population structure differences are unlikely to be the principal cause of the difference in genetic diversity between nonthreatened and threatened species.
To assess the impact of recent demographic change on genetic diversity of threatened species, we considered pairs of nonthreatened and threatened species from the same family. For each pair we calculated the ratio of the long-term effective population size () and the ratio of the effective population size at present . The observed genetic diversity provides an estimate of for autosomes (15) or for mitochondrial DNA (10), where is the mutation rate per generation. Therefore, the ratio of between two species groups was estimated as for either autosomal or mitochondrial loci, where the subscripts NS and TS stand for nonthreatened and threatened species, respectively. Also, the ratio of was approximated by the ratio of the current census size . This is based on the finding that the ratio of effective to actual population size () has a mean value of 0.1 (16) and is largely independent of . We found that (median 1.89, the 5th and 95th percentiles 0.16 and 15.32, respectively) is remarkably smaller than (median 36.95, the 5th and 95th percentiles 1.9 and 3,282.9, respectively) () (Fig. 4 and SI Appendix, Tables S3 and S4). This may indicate a much larger ancestral size and a RPD across all or most of the threatened taxa.
We suggest that the recent impacts on population size could be measured by and normalized by . Then we examined which families of species were affected the most or the least (SI Appendix, Table S5). A larger impact index indicates that one or a few threatened species in the family experienced a more severe population decline.
Demographic Models.
We used a model-based approach to quantify the RPD. One model is illustrated in Fig. 5A. The essential premise is that many threatened species began the RPD at similar times due to the increased impact of human activities and habitat losses. Specifically, the model assumes that each threatened species began an exponential decrease in size at a start date (in years before present) that follows a distribution with mean $T$, whereas nonthreatened species have maintained a constant population size (the case of nonthreatened species with nonconstant size is examined below). Naturally, the start date splits the population history into two phases (Fig. 5A). We define $R$ as the ratio of the ancestral genetic diversity of nonthreatened species to that of threatened species at the start of RPD; it represents the difference in census size between the two species groups before RPD.
To estimate the two parameters ($R$ and $T$), a numerical approximation of their likelihood was obtained using the principle of rejection sampling (5, 17, 18) after representing the data as the relative differences in the mean genetic diversity between the two species groups (Methods). Then we explored each of the scenarios in which the start date of RPD follows the normal, the exponential, or the one-point (i.e., constant) distribution with the mean $T$. The SD of the normal distribution was estimated as the SD of the documented start dates of RPD among 18 well-documented mammal and bird species (SI Appendix, Table S6). The estimated $T$ is 111, 154, and 123 y, respectively. Judging from the amount of uncertainty associated with these estimates, the estimate of $T$ appears robust to the assumed distribution of the start date. However, assuming a nonconstant start date leads to a substantial increase in computation/simulation time in subsequent analyses. Therefore, only the one-point distribution of the start date was implemented in the subsequent analyses.
The surface of the likelihood is shown in Fig. 5B. The estimate of $R$ is 1.22 (Fig. 5C) with a 95% confidence interval between 1.11 and 1.35, implying that the ancestral size of threatened vertebrate species was, on average, 22% smaller than that of nonthreatened vertebrate species. As expected, this estimate of $R$ is more precise than the ratio of long-term effective population sizes estimated above because the latter was estimated from a small number of species and had a very large variance. The corresponding $T$ is 123 y with a 95% confidence interval of 20–260 y, and the estimated rate of RPD ($r$) implies a 24.5% population decline every 10 y. The results suggest that the difference in genetic diversity between the two species groups (Fig. 3) was due to the joint effect of the smaller ancestral size of threatened species and the RPD that began on average in the late 19th century. Then, the effect of RPD was quantified by comparing the simulated genetic diversity between the cases of RPD and no RPD. We found that RPD explains 24.1–37.5% of the difference in genetic diversity between the two species groups (SI Appendix, Table S7). The effect varies with different measurements of genetic diversity because the number of alleles per locus ($A$) and Watterson’s $\theta$ are more sensitive to RPD than the expected heterozygosity ($H_e$) and $\pi$ (19, 20).
The effect of RPD on the reduction in genetic diversity is relatively weak because the time period of RPD is short and because the observed differences in genetic diversity between the two species groups were small. For example, the initial simulated heterozygosity of threatened species was 0.572, and the reduced heterozygosity after RPD was 0.558. That is, only 2.4% of the heterozygosity was lost due to RPD because the time period of RPD was only 123 y.
Comparison with Historical Data.
It has been documented that accelerated land use by humans might have caused the decrease of biodiversity since the middle of the 19th century (21) and extinction rates increased sharply over the past 200 y (22). Therefore, our estimated agrees with these two studies. also agrees with the estimated for 18 well-documented species [ranging from 55 to 415 y, and y] (SI Appendix, Table S6). The Living Planet Index (2), based on vertebrate population time series between 1970 and 2010, reported a less rapid population decline than we inferred. This may have occurred because conservation efforts had slowed down the rate of population decline in the last 40 y, or populations of nonthreatened vertebrate species were also included to track global biodiversity change in their analysis (23, 24).
Unbiasedness and Robustness of the Method.
We examined the bias of our inference method. First, the sampling effect among species was studied, and it was found that smaller subsets of data gave unbiased estimates but with large variances, as expected (Fig. 5C). Second, RPDs were simulated (10,000 replicates) with the parameter values and , and the obtained mean and were 1.20 and 128, respectively, showing little bias (SI Appendix, Fig. S3).
The robustness of our estimates was investigated in a number of ways. Several scenarios of the ancient demography before RPD were simulated, and we found that the ancient demography has almost no impact on the estimates (SI Appendix, Fig. S4). We also found that even if the population size of species categorized as near-threatened was assumed to decrease by half since RPD the estimates were virtually unchanged (SI Appendix, Fig. S5A). We also considered the uncertainty of assessment on the genetic diversity due to small sample size, the uncertainty of assessment on the threatened status of species, the uncertainty of estimating , and the uncertainty of the evolutionary model of microsatellite loci (SI Appendix, Fig. S6). We found that these factors have no visible impacts on the estimates (SI Appendix, Fig. S5 B–E). It has been found that positive and negative selection have acted on mitochondrial genomes (25, 26), which may invalidate the neutral assumption of the model. Moreover, the two short highly hypervariable regions in the D-loop region have been preferentially selected for study, and we observed an increased genetic diversity in mtDNA sequencing data when a short region (<500 bp) was sequenced. To avoid this potential bias, we conducted an analysis by completely excluding the mitochondrial dataset but we found that the estimates and remained approximately the same (SI Appendix, Fig. S5F).
Our large collection of species allowed us to conduct refined inferences. First, we reanalyzed each taxon class (Fig. 5D). We found that threatened birds and fish species seem to have experienced a more recent RPD than other threatened vertebrates. Thus, the estimated $T$ should be interpreted as the average starting date of RPD among different threatened species/taxa. Interestingly, a threatened species with a longer generation time might have an earlier start of RPD (Fig. 5E). This might be due to a positive correlation between generation time and body size (27) and the possibility that a threatened species with a larger body size may be affected earlier by human activities. Second, our analysis of temperate/tropical species suggested that tropical threatened species might have been affected earlier than other threatened species (Fig. 5F), which coincides with the greater habitat loss in the tropics than in other parts of the world. However, we did not observe a large difference in the estimated $T$ between threatened terrestrial and aquatic species (SI Appendix, Fig. S7A) or among low/mid/high-income geographic regions (SI Appendix, Fig. S7B). Finally, because the genetic diversity could be different between different IUCN threatened categories (critically endangered, endangered, and vulnerable) (SI Appendix, Table S8), we compared the nonthreatened species with the threatened species belonging to different threatened categories. We found that the different IUCN threatened categories have almost no impact on the estimated $T$ (SI Appendix, Fig. S7C). Overall, recent RPDs were observed in all of the examined cases, which strengthened our view that RPD on average widely began in the late 19th century.
In summary, this study demonstrates the utility of genetic polymorphism data in conservation biology. The model with the two parameters $R$ and $T$ (Fig. 5A) is able to capture the crucial features in the observed patterns of polymorphism, and the estimates are robust against various conditions (SI Appendix, Figs. S4 and S5). The estimated start date of RPD coincides with that of widespread industrialization and a profound change in global living ecosystems in the late 19th century, reinforcing the belief that human activity is the primary cause of recent massive extinction. $R$ ranges between 1.1 and 1.3 (Fig. 5 and SI Appendix, Figs. S4, S5, and S7), indicating that the ancestral size of threatened species was 10–30% smaller than that of nonthreatened species. Overall, a species with small population size is likely more vulnerable to environmental change (4). Therefore, an important question is whether a relatively small ancestral size of a species is an important factor for its current threatened status. This could be investigated in the future with the genome-wide polymorphism data from the recently initiated Genome 10K Project (28). Finally, if current conservation efforts are generally enacted when population sizes are low, the historical inferences of this study suggest that recovery may be compromised by delaying action as population sizes continue to decline.
Methods
Collecting Genetic Diversity Data.
For over a decade we carefully examined 2,475 peer-reviewed papers published in 164 scientific journals (Dataset S1) and collected DNA polymorphism data for vertebrate species living in various ecosystems. We focused on vertebrate species because available data were mostly from vertebrates. The genetic diversity in each of these species was typically assayed by one or two molecular techniques: allelic variation at microsatellite loci and sequence variation in the control (D-loop) and coding regions of the mitochondrial DNA. Therefore, the genetic diversity data presented here represent the polymorphism level in both nuclear and mitochondrial genomes.
To ensure data quality, if an inconsistency (in the sample size, the number of haplotypes, the number of alleles, or any related key information) was found in a paper we contacted the authors for their confirmation or discarded the data if we received no response. We also excluded studies using museum samples because the number of such studies is very limited. To increase nomenclatural consistency, the standard world checklists (version 2014.3) on the IUCN Red List were used (www.iucnredlist.org/technical-documents/information-sources-and-quality).
The within-species polymorphism data of 2,764 vertebrate species are presented in Dataset S1. In this dataset, there were 400 vertebrate species surveyed by more than one method, so we have 3,219 nonredundant entries in total, where each entry is composed of one to three summary statistics. If the microsatellite loci of a vertebrate species were surveyed, the sample size (n), the number of microsatellite loci, the expected heterozygosity ($H_e$), the observed heterozygosity ($H_o$), the mean number of alleles per locus ($A$), Fst (29), and the year of publication were recorded and later used in simulations. If the D-loop region of the mitochondrion was sequenced in a vertebrate species, we recorded the sample size (n), the length of the sequenced region, the mean number of pairwise nucleotide differences per base pair ($\pi$) between the DNA sequences examined, Watterson’s $\theta$ per base pair (10), and the year of publication. When it was necessary and possible, we downloaded mitochondrial sequences from EMBL/GenBank and aligned them by DNASTAR, checked the alignment by eye, and then calculated $\pi$ and Watterson’s $\theta$. The haplotype frequencies were obtained from the original articles, and the sites containing insertions and deletions were excluded from our analysis. To investigate population structure, we calculated Tajima’s D (13) and its P value, as the summary statistic for the ancient population structure (14). The $\pi$ and Watterson’s $\theta$ were rescaled according to the length of the sequenced fragment that was used to calculate Tajima’s D. We also collected a small dataset of genetic diversity for the coding regions of mitochondrial DNA. The collection process was similar to that described above, and we used the dataset only in the descriptive analysis because the number of available species is small.
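The two sequence-based statistics can be computed from an alignment as in the following minimal sketch, which uses the standard definitions of Watterson’s estimator (number of segregating sites divided by the harmonic number of the sample) and of nucleotide diversity (mean pairwise differences). It assumes gap-free aligned sequences of equal length; the function names and the toy sequences are hypothetical.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns in which more than one nucleotide is present."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def watterson_theta_per_site(seqs):
    """Watterson's estimator: S / a_n, divided by the sequence length."""
    n, length = len(seqs), len(seqs[0])
    a_n = sum(1.0 / i for i in range(1, n))  # harmonic number for n - 1
    return segregating_sites(seqs) / a_n / length

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs) / length

# Toy alignment of four sequences of length 10 (illustrative only).
seqs = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
theta_w = watterson_theta_per_site(seqs)
pi = nucleotide_diversity(seqs)
```

Both statistics estimate the same population parameter under neutrality and constant size, which is why their difference underlies Tajima’s D.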
According to the IUCN, the threatened species (TS) include those listed as critically endangered (CR), endangered (EN), and vulnerable (VU) (3). The species categorized as near threatened (NT) and least concern (LC) are treated as the nonthreatened species (NS). The uncategorized species include those that are listed as data deficient (DD) and those that have not been evaluated by the IUCN. A taxon is listed as data deficient when there is inadequate information to make an assessment of its risk of extinction (3). Generally, the uncategorized species were excluded from our analyses (Fig. 1), unless stated otherwise.
Collecting Species Distribution and Generation Time Data.
The geographic distributions of 2,552 vertebrate species were retrieved from the IUCN Red List (version 2014.3) and a reptile database (www.reptile-database.org). The data are given in Dataset S1. The generation times of 3,146 vertebrate species were obtained from different published resources (Dataset S1), which formed the basis for our generation time estimates. Assume that the generation time of a species is $g = c\,a$, where $a$ is the sexual maturity age of the species. The coefficient $c$ is equal to the generation time divided by the sexual maturity age and is species/genus-dependent. If the generation time of a species is unknown but its age of sexual maturity is documented, its generation time was estimated using the mean $c$ from closely related well-studied species.
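The generation-time imputation described above can be sketched as follows. The function names and the numbers for the related species are hypothetical and serve only to illustrate the calculation.

```python
def mean_ratio(relatives):
    """Mean of generation time / sexual maturity age over related species.

    `relatives` is a list of (generation_time, maturity_age) pairs in years.
    """
    ratios = [g / a for g, a in relatives]
    return sum(ratios) / len(ratios)

def estimate_generation_time(maturity_age, relatives):
    """Impute a generation time from the maturity age and the mean ratio."""
    return mean_ratio(relatives) * maturity_age

# Two hypothetical well-studied relatives: ratios 2.0 and 4.0, mean 3.0.
relatives = [(6.0, 3.0), (8.0, 2.0)]
generation_time = estimate_generation_time(2.5, relatives)
```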
Ratio of Effective Population Sizes Between Nonthreatened and Threatened Species.
It is difficult to estimate the effective population size at a specific time of a species with fluctuating population size (5–7). However, the ratio of between two species groups, each composed of a large number of related species, can be approximated by the ratio of their current census size . It has been inferred that the ratio of the effective to the actual population size [] is of the order of 0.1 (16), and is usually independent of . When both nonthreatened and threatened species groups are each composed of a large number of related species, it is reasonable to assume , where subscripts NS and TS represent nonthreatened and threatened species, respectively. Consequently,
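$$\frac{N_{e,\mathrm{NS}}(0)}{N_{e,\mathrm{TS}}(0)} \approx \frac{N_{\mathrm{NS}}}{N_{\mathrm{TS}}}$$ [1]
where $N_e(t)$ denotes the effective population size at time $t$ (counting backward) and $N$ the current census size.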
Based on the current census of 1,868 vertebrate species obtained from IUCN, a professional book (30), and peer-reviewed literature (Dataset S1), we estimated for mammals, 146 for birds, 32 for amphibians, 26 for reptiles, and 14 for fish (SI Appendix, Table S9). Therefore, we set in the modeling study, but we also used and 100 to examine the robustness of the results.
Demographic Model and Likelihood Inference of RPD.
Demographic model.
We assumed that the effective population size of a nonthreatened species remains constant. Denote the ancestral effective population size of a threatened species by $N_{e,\mathrm{TS}}(T)$, and assume that its population size started declining exponentially at the start date of RPD (Fig. 5A); time is counted backward, in years. We assumed that the start date of RPD in threatened species follows the normal, the exponential, or the one-point (i.e., constant) distribution with the mean $T$. Once the ratio of present-day effective population sizes between the two species groups was estimated by Eq. 1, the demographic model was characterized by two parameters, $R$ and $T$. Then, the two parameters $R$ and $T$ were estimated from an analysis in the likelihood framework as described below.
Under the assumption of constant population size during the first phase and an equal ratio of effective to census population size in the two species groups, we have $R = N_{\mathrm{NS}}(T)/N_{\mathrm{TS}}(T)$, where $N_{\mathrm{NS}}(T)$ and $N_{\mathrm{TS}}(T)$ are the census sizes of the nonthreatened and threatened species at the start date of RPD. Thus, $R$ represents the ratio of census size at the start date of RPD between the two species groups. If the assumption of constant size during the first phase is invalid, $R$ represents the long-term ratio of census size between the two species groups before the start of RPD.
Assumption about sampling times.
The sampling time is likely different in different studies, but the duration between the time of sampling and the time of publication is generally much shorter than $T$. We assumed that the sampling happened 3 y before the year of publication. In this study, the term “at present” means “2015,” so the sampling of the i-th species happened $t_i = 2015 - (y_i - 3)$ years ago, where $y_i$ is the year of publication. The duration between the start date of RPD and the year of sampling is $T - t_i$.
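The sampling-time convention can be written out as a short sketch. The constants follow the text (present = 2015, sampling 3 y before publication); the function names are hypothetical.

```python
PRESENT_YEAR = 2015  # "at present" in the text
SAMPLING_LAG = 3     # assumed years between sampling and publication

def years_since_sampling(publication_year):
    """Years before present at which the sample was taken."""
    return PRESENT_YEAR - (publication_year - SAMPLING_LAG)

def decline_duration(rpd_start_years_ago, publication_year):
    """Years of RPD experienced before sampling.

    A negative value means the sample predates the start of RPD.
    """
    return rpd_start_years_ago - years_since_sampling(publication_year)

# A study published in 2010 is treated as sampled in 2007, i.e., 8 y ago.
example_lag = years_since_sampling(2010)
example_duration = decline_duration(123, 2010)
```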
Coalescence-based simulations.
The coalescence-based simulations followed the standard procedure (31, 32). To simulate the microsatellite polymorphism data for the i-th species, we (randomly) chose from the value(s) of its nonthreatened related species, where is the mutation rate per locus per generation, and the sampling happened years ago. The details are given below.
We first calculated $\theta$ from the observed within-species polymorphism data of nonthreatened species with a reasonable sample size based on the stepwise mutation model (15). Then, we denoted the set of $\theta$ values in the group of nonthreatened species related to the i-th species as $\Theta_i$. If $\Theta_i$ was empty at the genus level, $\Theta_i$ would be obtained at the family level, or even at the phylum level. Then a value of $\theta$ was (randomly) drawn from $\Theta_i$. The value would be used directly if the i-th species is nonthreatened. If the i-th species is threatened, the value was rescaled according to its effective population size at the time of sampling ($t_i$ years ago), $N_{e,\mathrm{TS}}(t_i)$. Then, we have
$$N_{e,\mathrm{TS}}(t_i) = \begin{cases} N_{e,\mathrm{TS}}(T)\, e^{-r (T - t_i)}, & T - t_i > 0 \\ N_{e,\mathrm{TS}}(T), & T - t_i \le 0 \end{cases}$$ [2]
where $r$ is the rate of RPD, and a negative $T - t_i$ indicates the sampling happened before the start of RPD. From the definitions of $R$ and $r$ and Eq. 1, we have
$$r = \frac{1}{T} \ln\!\left( \frac{N_{\mathrm{NS}}}{R\, N_{\mathrm{TS}}} \right)$$ [3]
where $T$ was randomly sampled from the distribution described above.
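As a rough numerical illustration, the decline rate can be recovered from the exponential-decline assumption: if nonthreatened species are constant in size and threatened species decline at rate r for T years, the present-day size ratio between the groups equals the ancestral ratio R multiplied by exp(rT). The sketch below solves this relation for r. The census ratio of 36.95 is the median reported in Results and is used here only as an illustrative input; the value used in the actual modeling may differ, and the function names are hypothetical.

```python
import math

def decline_rate(census_ratio, ancestral_ratio, start_years_ago):
    """Solve census_ratio = ancestral_ratio * exp(r * T) for r (per year)."""
    return math.log(census_ratio / ancestral_ratio) / start_years_ago

def decline_per_decade(rate):
    """Fraction of the population lost over 10 y at exponential rate `rate`."""
    return 1.0 - math.exp(-10.0 * rate)

def size_at_sampling(ancestral_size, rate, start_years_ago, sampled_years_ago):
    """Population size at sampling; unchanged if sampled before RPD began."""
    duration = start_years_ago - sampled_years_ago
    if duration <= 0:
        return ancestral_size
    return ancestral_size * math.exp(-rate * duration)

# Illustrative inputs: R = 1.22 and T = 123 y from the text, ratio 36.95 assumed.
r = decline_rate(census_ratio=36.95, ancestral_ratio=1.22, start_years_ago=123.0)
per_decade = decline_per_decade(r)
```

With these inputs the loss per decade is of the same order as the decline reported in Results.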
To model the heterogeneity of mutation rates among microsatellite loci, we assumed that the mutation rate for a randomly selected locus follows a lognormal distribution, with the coefficient of variation equal to 1 (33). We also assumed that the microsatellite loci are independent and are autosomal.
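A lognormal distribution with a given mean and coefficient of variation (CV) can be parameterized with the standard relations σ² = ln(1 + CV²) and μ = ln(mean) − σ²/2. The sketch below draws per-locus mutation rates in this way; the mean rate of 1e-3 is a hypothetical value chosen for illustration, not one taken from the text.

```python
import math
import random

def lognormal_params(mean, cv):
    """Return (mu, sigma) of the underlying normal for a given mean and CV."""
    sigma2 = math.log(1.0 + cv * cv)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

def draw_locus_mutation_rates(mean_rate, n_loci, cv=1.0, seed=0):
    """Draw one mutation rate per locus from the lognormal distribution."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean_rate, cv)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_loci)]

# Hypothetical mean rate per locus per generation.
rates = draw_locus_mutation_rates(mean_rate=1e-3, n_loci=10)
```

With CV = 1, the variance of the underlying normal distribution is ln 2.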
If the evolution of the i-th species followed a nonconstant size model, the time in the unit of years was transformed to the unit of generations (31, 32), where , (16), and is the current census size of the i-th species.
To simulate the single-nucleotide polymorphism data on the D-loop region, we followed the procedures described above with two modifications. First, was estimated as Watterson’s (10), based on the within-species sequence variation in the D-loop of nonthreatened species with ; the cases of were discarded because those estimated values represent or . Second, the rescaled decline time was multiplied by 4 because the effective population size of a mitochondrial locus is only one-fourth that of an autosomal locus.
Likelihood inference of RPD.
The observed genetic diversity of a species can be represented by a vector in which at least one element was observed. The corresponding likelihood function is , where the i-th species is designated by subscript i. Although there is no exact method for computing , numerical approximations can be obtained by following the principle of rejection sampling (5, 17, 18) and by representing the data as the relative differences in the mean genetic diversity between nonthreatened and threatened species. The details are as follows.
For the summary statistic He on the microsatellite loci, we denote . Denote and as the simulated and observed relative difference of mean He between nonthreatened and threatened species. The likelihood function is then estimated as a numerical approximation of , where is a fixed tolerance. When is very small, the computational load is very large, whereas the precision of the estimate will be poor when is large. Our experience suggests that works well, and the estimate of is not sensitive to . We also set , 0.1, or 0.2, and the results remained almost unchanged. The estimation procedure of is given below. In step 1, we simulated the microsatellite polymorphic dataset using the procedure described above and calculated . The pattern of missing data, the information of sample size and the number of loci have been properly considered in the simulation and also in the related calculation. In step 2, we introduced an indicator variable . In step 3, we repeated steps 1 and 2 B times. The likelihood was then estimated by , where .
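The three-step estimation can be sketched generically as follows: simulate a summary statistic, record whether it falls within the tolerance of the observed value, and average the indicator over the replicates. The Gaussian toy simulator stands in for the coalescence-based simulation, and the tolerance, noise level, and replicate number are hypothetical values, not those used in the study.

```python
import random

def estimate_likelihood(simulate_delta, observed_delta, tolerance, n_reps, seed=0):
    """Approximate P(|simulated - observed| < tolerance) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        simulated = simulate_delta(rng)                   # step 1: simulate
        if abs(simulated - observed_delta) < tolerance:   # step 2: indicator
            hits += 1
    return hits / n_reps                                  # step 3: average

def toy_simulator(expected_delta, noise_sd=0.05):
    """Stand-in for the coalescent simulation under given parameter values."""
    return lambda rng: rng.gauss(expected_delta, noise_sd)

observed = 0.10
lik_near = estimate_likelihood(toy_simulator(0.10), observed, 0.02, 5000)
lik_far = estimate_likelihood(toy_simulator(0.30), observed, 0.02, 5000)
```

Parameter values whose simulations reproduce the observed relative difference more often receive a higher approximate likelihood, which is the basis for comparing points on the likelihood surface.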
The above procedure was applied to multiple summary statistics with only minor modifications. For the microsatellite dataset, we jointly considered He and , which are not independent. Then we have . Similarly, we have . Finally, we have .
Then, and are estimated by a two-step process through a likelihood framework. The first step is to calculate the profile likelihood for [] and then the mean , which is the mean of ’s weighted by the relative profile likelihood values (Fig. 5 C–F). The second step is to obtain the maximum likelihood estimate of conditional on . In general, if , which indicates that the ancestral genetic diversity of nonthreatened species is higher than that of threatened species, a small is needed to explain the observed low genetic diversity in the present-day threatened species. In particular, if , we have (Fig. 5C). Because it has been documented that the RPD of threatened species happened at least 40 y ago (2), we set 1.35 as the upper bound of . Moreover, it is likely that, on average, the ancestral size of nonthreatened species was larger than that of its related threatened species. Thus, we assumed .
All of the collected species, including those with small sample size (), were taken in the coalescence-based modeling, unless noted otherwise. This is because the modeling was conducted conditional on the values of n.
Likelihood ratio test.
To conduct the likelihood ratio test and obtain likelihood-based confidence intervals, we obtained the empirical distribution of the likelihood ratio by analyzing 104 simulated datasets conditional on and , where and are the likelihoods for the alternative and null models, respectively. An example of the empirical distribution of is shown (SI Appendix, Fig. S8). In this case, the 95% critical value () is 1.792. Note that, for a different dataset, or for different and , could be different. Therefore, the corresponding empirical distribution of was used to obtain for the likelihood ratio test.
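An empirical critical value of this kind can be sketched as the 95th percentile of the test statistics obtained from datasets simulated under the null model. The function names and the toy null statistics below are hypothetical; in practice the statistics would come from the simulated datasets.

```python
import math

def empirical_critical_value(null_statistics, level=0.95):
    """Return the empirical `level` quantile of the simulated null statistics."""
    ordered = sorted(null_statistics)
    index = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[max(index, 0)]

def reject_null(observed_statistic, null_statistics, level=0.95):
    """Reject the null model if the observed statistic exceeds the critical value."""
    return observed_statistic > empirical_critical_value(null_statistics, level)

# Toy null distribution: the integers 1..100 as stand-in statistics.
null_stats = list(range(1, 101))
critical = empirical_critical_value(null_stats)
```

Because the critical value depends on the dataset and the parameter values, it would be recomputed from a fresh set of simulations for each test.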
Effect of RPD on the Difference in Genetic Diversity Between Nonthreatened and Threatened Species.
We first conducted the coalescence-based modeling described above, conditional on and or 123. We used (microsatellite loci, per locus) and 0.02 (mitochondrial locus, per base pair). We then compared the case of (the estimated RPD) with the case of (indicating no RPD) to estimate the effect of PRD on the difference in genetic diversity between nonthreatened and threatened species. We denoted , where stands for , the mean number of alleles per locus (), , or Watterson’s . The effect of PRD was calculated as (SI Appendix, Table S7). In the considered model, we have , so the effect of RPD can be simplified as . In the mitochondrial case of , the values of and can be computed analytically, which agree with the simulation.
Robustness Analysis Under Various Demographic Models.
To examine the robustness of the estimates, we conducted reanalyses under various demographic models. The likelihood method described above is very flexible, and little modification is needed to analyze other demographic models. First, we considered a slow population decline in a nonthreatened vertebrate species categorized as near-threatened (NT) based on the rationale that human activities may also have an impact on those nonthreatened vertebrate species. We assumed that their population size declined by half . We assumed that the populations of those near-threatened vertebrate species also started to decline at time .
Second, we considered an ancestral population with varying size (SI Appendix, Fig. S4). Under the ancestral instantaneous expansion model (SI Appendix, Fig. S4 A and B), we assumed that follows a uniform distribution on [100,000, 1,000,000] (in units of years), and is uniformly distributed on [2, 5], where and are the effective population sizes before and after the expansion, respectively. For the ancestral instantaneous bottleneck model (SI Appendix, Fig. S4 C and D), we assumed that follows a uniform distribution on [10,000, 100,000], follows a uniform distribution on [100,000, 1,000,000], and is uniformly distributed on [2, 5], where is the effective population size during the bottleneck and the effective population size before and after the bottleneck. Then we assigned the ancestral expansion or the ancestral bottleneck model to each species with equal probability (0.5 vs. 0.5) as its demographic scenario before the RPD.
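The random assignment of an ancestral scenario to each species can be sketched as follows. This is a rough illustration rather than the authors' code: the function and field names are hypothetical, and the mapping of each uniform range to a particular quantity (event time, bottleneck duration, size ratio) is an assumption made here for readability.

```python
import random

def sample_ancestral_scenario(rng):
    """Draw one ancestral demographic scenario for a species.

    Expansion and bottleneck models are chosen with equal probability.
    Field names are hypothetical labels for the sampled quantities.
    """
    if rng.random() < 0.5:
        return {
            "model": "expansion",
            # Assumed: time of the expansion, in years.
            "time_years": rng.uniform(100_000, 1_000_000),
            # Assumed: ratio of the two effective population sizes.
            "size_ratio": rng.uniform(2, 5),
        }
    return {
        "model": "bottleneck",
        # Assumed: duration of the bottleneck, in years.
        "duration_years": rng.uniform(10_000, 100_000),
        # Assumed: time of the bottleneck, in years.
        "time_years": rng.uniform(100_000, 1_000_000),
        # Assumed: ratio of the two effective population sizes.
        "size_ratio": rng.uniform(2, 5),
    }

# A seeded generator keeps the draws reproducible across runs.
scenario_rng = random.Random(0)
scenarios = [sample_ancestral_scenario(scenario_rng) for _ in range(1000)]
```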
To ensure that the simulated genetic diversity level is equal to the observed one in a nonthreatened species with varying population size, was determined as follows. Based on the constant size model, was first estimated from the observed polymorphic data (10, 15). Then we simulated 50 random coalescent trees given the desired demographic model and denoted the mean tree length by . Then we have , and .
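The idea of averaging the total length of simulated coalescent trees can be sketched as below. Note the simplification: this sketch simulates trees under the standard constant-size coalescent only, whereas the text simulates them under the desired varying-size model, and the rescaling step itself is not reproduced here. Function names are hypothetical, and time is measured in coalescent units in which each pair of lineages coalesces at rate 1.

```python
import random

def simulate_total_tree_length(n, rng):
    """Total branch length of one coalescent tree for a sample of size n
    under a constant-size model (coalescent time units)."""
    total = 0.0
    for k in range(n, 1, -1):
        # With k lineages, the waiting time to the next coalescence is
        # exponential with rate k * (k - 1) / 2.
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        # All k lineages persist during this interval.
        total += k * t_k
    return total

def mean_tree_length(n, n_trees, rng):
    """Average total tree length over n_trees simulated trees."""
    return sum(simulate_total_tree_length(n, rng) for _ in range(n_trees)) / n_trees

def expected_tree_length(n):
    """Analytical expectation under constant size: 2 * sum_{i=1}^{n-1} 1/i."""
    return 2.0 * sum(1.0 / i for i in range(1, n))

tree_rng = random.Random(1)
mean_of_50_trees = mean_tree_length(10, 50, tree_rng)
```

Under constant size the simulated mean can be checked against the analytical expectation; for a varying-size model no such simple closed form is assumed here, which is one reason to rely on simulation.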
To determine of a threatened species, we reconstructed a demographic model of the nonthreatened species. During the first phase, the model is the same as that of the threatened species. During the second phase, however, the population size is constant. Then was obtained by the method described above, and was rescaled from . The effect of different sampling times was also considered, and Eq. 2 was slightly revised accordingly.
Acknowledgments
We thank Feng Gao for technical assistance; Chun Ye, Dongsheng Lu, and Xixian Ma for help during data collection; Jing Luo, Xuemei Lü, Peng Shi, and John Parsch for their comments; and Sara Barton for editorial assistance. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences Grant XDB13040800, National Natural Science Foundation of China Grant 91531306, and 973 Project Grant 2012CB316505 (all to H.L., J.X.-Y., G.D., Z.G., C.M., and Z.Y.). This work was also supported in part by Grant nos. 91231120 and 91631304 from the National Natural Science Foundation of China and an endowment from The University of Texas Health Science Center at Houston (to Y.-X.F.). Y.-P.Z. is supported in part by the Yunnan Provincial Science and Technology Department and the National Natural Science Foundation of China.
Footnotes
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1616804113/-/DCSupplemental.
References
- 1. Stuart SN, et al. Status and trends of amphibian declines and extinctions worldwide. Science. 2004;306(5702):1783–1786. doi: 10.1126/science.1103538.
- 2. WWF. Living Planet Report 2014. World Wildlife Fund; Gland, Switzerland: 2014. Available at http://wwf.panda.org/wwf_news/?231893/Living-Planet-Report-2014.
- 3. IUCN. The IUCN Red List of Threatened Species, version 2014.2. International Union for Conservation of Nature and Natural Resources; Cambridge, UK: 2014.
- 4. Frankham R, Ballou JD, Briscoe DA. Introduction to Conservation Genetics. Cambridge Univ Press; Cambridge, UK: 2002.
- 5. Li H, Stephan W. Inferring the demographic history and rate of adaptive substitution in Drosophila. PLoS Genet. 2006;2(10):e166. doi: 10.1371/journal.pgen.0020166.
- 6. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475(7357):493–496. doi: 10.1038/nature10231.
- 7. Liu X, Fu Y-X. Exploring population size changes using SNP frequency spectra. Nat Genet. 2015;47(5):555–559. doi: 10.1038/ng.3254.
- 8. Fu Y-X. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 1997;147(2):915–925. doi: 10.1093/genetics/147.2.915.
- 9. Hartl DL, Clark AG. Principles of Population Genetics. Sinauer; Sunderland, MA: 1988.
- 10. Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol. 1975;7(2):256–276. doi: 10.1016/0040-5809(75)90020-9.
- 11. Spielman D, Brook BW, Frankham R. Most species are not driven to extinction before genetic factors impact them. Proc Natl Acad Sci USA. 2004;101(42):15261–15264. doi: 10.1073/pnas.0403809101.
- 12. Good PI. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer; Heidelberg: 2000.
- 13. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123(3):585–595. doi: 10.1093/genetics/123.3.585.
- 14. Fu Y-X. New statistical tests of neutrality for DNA samples from a population. Genetics. 1996;143(1):557–570. doi: 10.1093/genetics/143.1.557.
- 15. Xu H, Fu Y-X. Estimating effective population size or mutation rate with microsatellites. Genetics. 2004;166(1):555–563. doi: 10.1534/genetics.166.1.555.
- 16. Frankham R. Effective population size/adult population size ratios in wildlife: A review. Genet Res. 2007;89(5-6):491–503. doi: 10.1017/S0016672308009695.
- 17. Fu Y-X, Li W-H. Estimating the age of the common ancestor of a sample of DNA sequences. Mol Biol Evol. 1997;14(2):195–199. doi: 10.1093/oxfordjournals.molbev.a025753.
- 18. Tavaré S, Balding DJ, Griffiths RC, Donnelly P. Inferring coalescence times from DNA sequence data. Genetics. 1997;145(2):505–518. doi: 10.1093/genetics/145.2.505.
- 19. Nei M, Maruyama T, Chakraborty R. The bottleneck effect and genetic variability in populations. Evolution. 1975;29(1):1–10. doi: 10.1111/j.1558-5646.1975.tb00807.x.
- 20. Tajima F. The effect of change in population size on DNA polymorphism. Genetics. 1989;123(3):597–601. doi: 10.1093/genetics/123.3.597.
- 21. Newbold T, et al. Global effects of land use on local terrestrial biodiversity. Nature. 2015;520(7545):45–50. doi: 10.1038/nature14324.
- 22. Ceballos G, et al. Accelerated modern human-induced species losses: Entering the sixth mass extinction. Sci Adv. 2015;1(5):e1400253. doi: 10.1126/sciadv.1400253.
- 23. Collen B, et al. Monitoring change in vertebrate abundance: the living planet index. Conserv Biol. 2009;23(2):317–327. doi: 10.1111/j.1523-1739.2008.01117.x.
- 24. Collen B, Nicholson E. Taking the measure of change. Science. 2014;346(6206):166–167. doi: 10.1126/science.1255772.
- 25. Bazin E, Glémin S, Galtier N. Population size does not influence mitochondrial genetic diversity in animals. Science. 2006;312(5773):570–572. doi: 10.1126/science.1122033.
- 26. Meiklejohn CD, Montooth KL, Rand DM. Positive and negative selection on the mitochondrial genome. Trends Genet. 2007;23(6):259–263. doi: 10.1016/j.tig.2007.03.008.
- 27. Martin AP, Palumbi SR. Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci USA. 1993;90(9):4087–4091. doi: 10.1073/pnas.90.9.4087.
- 28. Koepfli KP, Paten B, Genome 10K Community Scientists, O’Brien SJ. The Genome 10K Project: A way forward. Annu Rev Anim Biosci. 2015;3:57–111.
- 29. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38(6):1358–1370. doi: 10.1111/j.1558-5646.1984.tb05657.x.
- 30. Nowak RM. Walker’s Mammals of the World. Johns Hopkins Univ Press; Baltimore: 1999.
- 31. Hudson RR. Gene genealogies and the coalescent process. In: Futuyma D, Antonovics J, editors. Oxford Surveys in Evolutionary Biology. Vol 7. Oxford Univ Press; New York: 1990. pp. 1–44.
- 32. Griffiths RC, Tavaré S. Sampling theory for neutral alleles in a varying environment. Philos Trans R Soc Lond B Biol Sci. 1994;344(1310):403–410. doi: 10.1098/rstb.1994.0079.
- 33. Renwick A, Davison L, Spratt H, King JP, Kimmel M. DNA dinucleotide evolution in humans: Fitting theory to facts. Genetics. 2001;159(2):737–747. doi: 10.1093/genetics/159.2.737.