Abstract
The germline mutation rate determines the pace of genome evolution and is an evolving parameter itself1. However, little is known about what determines its evolution, as most studies of mutation rates have focused on single species with different methodologies2. Here we quantify germline mutation rates across vertebrates by sequencing and comparing the high-coverage genomes of 151 parent–offspring trios from 68 species of mammals, fishes, birds and reptiles. We show that the per-generation mutation rate varies among species by a factor of 40, with mutation rates being higher for males than for females in mammals and birds, but not in reptiles and fishes. The generation time, age at maturity and species-level fecundity are the key life-history traits affecting this variation among species. Furthermore, species with higher long-term effective population sizes tend to have lower mutation rates per generation, providing support for the drift barrier hypothesis3. The exceptionally high yearly mutation rates of domesticated animals, which have been continually selected on fecundity traits including shorter generation times, further support the importance of generation time in the evolution of mutation rates. Overall, our comparative analysis of pedigree-based mutation rates provides ecological insights on the mutation rate evolution in vertebrates.
Subject terms: Evolutionary genetics, Molecular evolution, Mutation, Evolutionary biology
Germline mutation rates across vertebrates are quantified by sequencing and comparing the high-coverage genomes of parent–offspring trios.
Main
Germline mutations are the proximate source of genomic innovation and inherited diseases4. Consequently, considerable effort has been spent on characterizing the molecular processes underlying these mutations and estimating germline mutation rates (GMRs). Mutations are rare events, yet the frequency at which they are introduced into genomes at each generation varies considerably across taxa, from approximately 10−11 mutations per site per generation in unicellular eukaryotes up to approximately 10−7 mutations per site per generation in multicellular eukaryotes1,5,6. Inferring the driving forces of GMR evolution has important implications for understanding the mechanisms underlying mutagenesis. Several hypotheses have been proposed to explain variation in GMRs among lineages. Some of these invoke molecular mechanisms such as DNA methylation7 or microsatellite instability8, whereas others invoke external factors such as exposure to mutagenic environments9. Other studies have argued that life-history traits might explain some of the variation both in the prevalence of mutations and in the ability to repair DNA. In particular, the generation time10 and the metabolic rate11 have been suggested to be key life-history traits that could be associated with germline mutations. From a long-term evolutionary perspective, the ‘drift barrier hypothesis’ proposes that lower mutation rates may reflect the increased efficiency of natural selection at reducing the occurrence of mutations in species with large effective population sizes3.
However, a lack of accurate and standardized GMR estimation has so far precluded testing current hypotheses of GMR evolution. Pedigree-based estimates of GMRs per generation have recently been published for a handful of vertebrate species, mainly focusing on humans and primates12–17. Furthermore, a recent comparative study of 16 mammalian species identified an effect of lifespan on somatic mutation rates inferred from the sequencing of intestinal crypts18. Nevertheless, interspecific comparisons of GMR variation remain restricted in taxonomic scope19, partly due to the difficulty of comparing GMR estimates derived using different methodologies2. For example, alternative bioinformatic pipelines used in different studies can yield GMR estimates that vary by a factor of two, even when applied to the same parent–offspring trios2. This highlights the importance of applying consistent analytical pipelines for interspecies comparisons of GMRs. We therefore generated high-depth genome sequences (average coverage of more than 67×) for 323 individuals representing 151 trios of 68 vertebrate species, including 36 mammals, 18 birds, 8 ray-finned fishes and 6 reptiles (Supplementary Table 1). We then quantified species-specific GMRs across this wide range of vertebrate taxa using consistent bioinformatics pipelines to test long-standing evolutionary hypotheses on GMR evolution.
Per-generation mutation rate variation
We first estimated the per generation GMR (µgeneration) for each trio (that is, mother, father and offspring) by comparing parental and offspring genomes (Fig. 1a, Supplementary Tables 2 and 3 and Supplementary Figs. 1–5 for details on the method). Overall, µgeneration varies by a factor of 40 across all species. On average, mutation rates per generation are higher in reptiles (average of all species 1.17 × 10−8, 95% CI of the mean = 5.34 × 10−9 to 1.80 × 10−8) and birds (average of all species 1.01 × 10−8, 95% CI of the mean = 6.10 × 10−9 to 1.42 × 10−8) than in mammals (average of all species 7.97 × 10−9, 95% CI of the mean = 7.04 × 10−9 to 8.90 × 10−9) and fishes (average of all species 5.97 × 10−9, 95% CI of the mean = 4.39 × 10−9 to 7.55 × 10−9). However, the difference among the four major classes of vertebrates is not overall statistically significant (analysis of variance (ANOVA): F = 1.86, P = 0.15). Furthermore, the amount of variation in µgeneration among species tends to be higher for birds and lower for mammals and fishes (Fig. 1a), although this variation is arguably modest given large differences in life-history traits among these species (for example, there is a 2.8 million-fold difference in the body mass of killer whales and Siamese fighting fish, and there is a 93-fold difference in the generation time between humans and Texas banded geckos).
Species with longer generation intervals are expected to have higher per-generation mutation rates due to a combination of a larger number of cell divisions in spermatogenesis and more time for DNA damage to accumulate12–14,20. For the 105 trios for which parental age was known at reproduction, we found a significant positive association between µgeneration and the average parental age at reproduction (linear regression adjusted r2 = 0.14, P = 3.9 × 10−5; Fig. 1b). This pattern is also significant for the 60 mammalian trios with known parental ages (linear regression adjusted r2 = 0.37, P = 1.6 × 10−7) and for the 32 bird trios after excluding a single outlier, the Darwin’s rhea (linear regression adjusted r2 = 0.31, P = 0.0005). Furthermore, all three of these regressions have similar positive y-intercept values on the order of approximately 0.59 × 10−8 mutations per site per generation. For the trios with known parental ages, paternal and maternal ages at conception are strongly correlated (linear regression adjusted r2 = 0.77, P < 2.2 × 10−16; Extended Data Fig. 1). However, multiple linear regression showed that the age of the father is the most significant explanatory variable (adjusted r2 = 0.15, P = 9.3 × 10−5; paternal age P = 0.018; maternal age P = 0.785). Thus, a stronger effect of paternal than maternal age on the mutation rate seems to be universal for birds and mammals due to more germline mutations accumulating throughout the life of the male.
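The regression of the per-generation rate on the average parental age can be sketched as an ordinary least-squares fit with an adjusted r2. The ages and rates below are hypothetical values chosen for illustration, not data from this study, and the function name `ols_fit` is likewise an assumption.

```python
from statistics import mean

def ols_fit(x, y):
    """Least-squares fit of y = intercept + slope * x (one predictor).

    Returns (intercept, slope, adjusted_r2).
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired observations")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted r2 penalises for the single fitted predictor.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return intercept, slope, adj_r2

# Hypothetical trios: mean parental age (years) and rate (x 1e-8 per site per generation).
ages = [1.0, 2.0, 5.0, 10.0, 20.0, 30.0]
rates = [0.60, 0.65, 0.70, 0.85, 1.00, 1.30]
intercept, slope, adj_r2 = ols_fit(ages, rates)
```

A positive intercept in such a fit corresponds to the mutations that are present regardless of parental age, which is the quantity the text compares across the three regressions.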
The specific types of de novo mutations (DNMs) observed across the 151 trios are concordant with the results of previous studies of individual species12–14,21–25, including a ratio of transitions over transversions of 2.3 (95% CI on binomial distribution = 2.2–2.5) and a high proportion (48.5%, 95% CI on binomial distribution = 46.7–50.3%) of transitions from strong base pairing to weak base pairing (C:G > T:A) across all DNMs (Supplementary Table 4). Among C:G > T:A mutations, 42.4% (95% CI on binomial distribution = 39.9–45.0%) occurred at CpG sites. The direction of mutations from one base to another (that is, the spectrum of mutation) differed significantly across vertebrate classes (χ2 = 30.0, d.f. = 15, P = 0.012; Supplementary Table 4 and Supplementary Fig. 6). We also found significant differences among vertebrate classes for A > C mutations (χ2 = 16.2, d.f. = 3, P = 0.001) and for C > A mutations (χ2 = 8.8, d.f. = 3, P = 0.032). In particular, fish species exhibit significantly fewer A > C mutations and significantly more C > A mutations than the other vertebrate classes. However, this mutation pattern does not appear to be associated with genome-wide CG content, as overall, the CG content of fishes is similar to that of mammals and birds and lower than that of reptiles (Supplementary Fig. 7). Finally, there is no significant difference between the classes of species in the percentage of all mutations located in CpG sites (χ2 = 4.3, d.f. = 3, P = 0.23), implying that high mutation rates at CpG sites are a conserved feature across vertebrates.
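The spectrum categories used above can be sketched as follows: each substitution is folded onto the strand where the reference base is A or C, giving six classes, and transitions are counted against transversions. The function names are illustrative assumptions, and the example mutations are made up.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}

def collapse(ref, alt):
    """Fold a substitution onto the strand where the reference base is A or C."""
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError("expected two different bases from ACGT")
    if ref in ("G", "T"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

def is_transition(ref, alt):
    """Transitions stay within purines (A, G) or within pyrimidines (C, T)."""
    return (ref in PURINES) == (alt in PURINES)

def ts_tv_ratio(mutations):
    """Ratio of transitions to transversions for (ref, alt) pairs."""
    ts = sum(1 for ref, alt in mutations if is_transition(ref, alt))
    tv = len(mutations) - ts
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ts / tv

# Made-up de novo mutations as (reference base, alternative base).
example = [("C", "T"), ("G", "A"), ("A", "C")]
classes = [collapse(ref, alt) for ref, alt in example]
ratio = ts_tv_ratio(example)
```

Here G > A is reported as C > T, so the strong-to-weak transition class pools both strands, as in the C:G > T:A notation.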
Variable male-driven evolution
In mammals and birds, the much larger number of germ-cell divisions per generation in the male germ line leads to the expectation of a male mutation rate bias, coined the ‘male-driven evolution hypothesis’26,27. However, very little is known about interspecific variation in the magnitude of the male-to-female ratio of the contribution of germline mutations (α). Previous studies have reported high α values in mammals (ranging from 1.0 to 20.1)28 and birds (ranging from 3.9 to 6.5)29 based on indirect estimates obtained by comparing rates of sequence divergence on the autosomes and sex chromosomes (see Extended Data Fig. 2 and Supplementary Table 5). However, other evolutionary forces can also act differently on the X chromosome and autosomes. For example, stronger natural selection on the X chromosome could lead to lower than expected divergence from the common ancestor, upwardly biasing estimates of α28. Furthermore, estimates of α derived in this way are averages over a phylogenetic branch and may thus differ from the contemporary species α. Here we directly quantified α by assigning the parental origin of the DNMs. Around 48% of all 3,034 DNMs across all of the trios could be phased to their parental origin (see Supplementary Table 6 for positions of all mutations). Owing to the relatively small number of mutations in each trio (Supplementary Table 2), we analysed male bias after taxonomically grouping the species into classes and orders (Fig. 1c).
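As a minimal sketch, α can be computed from the counts of phased paternal and maternal DNMs. The interval below uses a normal approximation on log(α), which is an assumption made for illustration; the text does not state how the reported confidence intervals were derived. The counts are hypothetical.

```python
from math import exp, sqrt

def male_bias(paternal, maternal, z=1.96):
    """Return (alpha, lower, upper) for alpha = paternal / maternal.

    The interval assumes log(alpha) is approximately normal with
    standard error sqrt(1/paternal + 1/maternal).
    """
    if paternal <= 0 or maternal <= 0:
        raise ValueError("both counts must be positive")
    alpha = paternal / maternal
    se_log = sqrt(1.0 / paternal + 1.0 / maternal)
    factor = exp(z * se_log)
    return alpha, alpha / factor, alpha * factor

# Hypothetical group: 70 DNMs phased to the father, 30 to the mother.
alpha, lower, upper = male_bias(70, 30)
```

An interval that excludes 1 indicates a sex bias in the phased mutations; with few phased DNMs per trio the interval becomes wide, which is why pooling by class or order is useful.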
Mammals showed a male bias of α = 2.3 (95% CI = 2.0–2.6). In general, our α estimates are in line with previous estimates derived for similar species based on genome alignments30,31. For example, we found that among mammals, primates have the largest male bias with α = 3.8 (95% CI = 2.6–5.7), similar to what was previously reported for several species belonging to this group12–14,21,22,32,33. Rodents have the lowest male bias among the mammals in our study, with α = 2.1 (95% CI = 1.4–3.1), consistent with a previous study based on mouse pedigrees34. This pattern can be explained by the short generation time of rodents, which leads to a smaller difference in cell divisions between the male and female germ lines35. However, the variation in α is relatively small given the variation in generation time among species (for example, between 30 years for humans and 8 months for the short-tailed opossum). Thus, an alternative hypothesis to explain the observed α would be a higher contribution of DNA damage, specifically in the male germ line for species with long generation times31.
Birds also showed an overall high male bias with α = 3.2 (95% CI = 2.5–4.1), although there is appreciable variation among different lineages. In particular, passerine birds and waterbirds (Pelecaniformes and Sphenisciformes) exhibited the largest male bias, both with α = 7.6 (95% CI = 4.3–13.5 for Passeriformes and 95% CI = 3.5–16.3 for Pelecaniformes and Sphenisciformes). High levels of male–male competition will lead to an increased amount of sperm being produced and faster sperm turnover, which would be expected to cause a higher male bias36. Indeed, many passerine birds have large cloacal protuberances37 and relatively heavy testes38, which are often used as proxies of sperm competition39. For instance, in two of the passerine species included in our study, testes represent between 1.2% (for Turdus merula) and over 2% (for Saxicola maurus) of the total body mass38. Moreover, extra-pair mating is common in many passerine birds40 as well as in penguins41, also indicating a high level of sperm competition. Overall, our results lend further support to the male-driven hypothesis in birds and mammals27.
By contrast, reptiles have a relatively small male bias with α = 1.5 (95% CI = 1.2–1.8), whereas fishes appear to have a greater proportion of mutations of maternal origin (α = 0.8), although the 95% CI overlaps 1 (95% CI = 0.5–1.4). This variation among vertebrate classes can be explained by differences in the process of gametogenesis. Although most birds and mammals produce sperm cells continuously through time42, reptiles and fishes tend to be seasonal breeders, producing sperm cells during a limited period before the mating season43–45, which will tend to reduce differences in cell division numbers between males and females, leading to more equal α. Moreover, female fishes are usually synchronous ovulators46, producing hundreds to millions of eggs at the same time followed by a proliferation of new oogonia47. This implies that females continually produce germ cells throughout their life, which would further reduce the difference in cell division number between males and females.
Species with lower sex bias also exhibited a larger proportion of shared mutations between siblings, with 12.0% (s.e. of 6.5%) of shared mutations between siblings for fish and 8.1% (s.e. of 5.3%) for reptiles compared with 1.5% (s.e. of 0.7%) for mammals and 2.2% (s.e. of 1.4%) for birds (Supplementary Table 7). An explanation for the repeated occurrence of those mutations is that they appear during the primordial germ cell specification in one of the parents48. The occurrence of primordial germ cell specification mutations is independent of parental sex. Consequently, a higher number of primordial germ cell specification mutations in some vertebrate groups could be an alternative explanation for the lower male-biased contribution to DNMs.
Yearly mutation rates
To use our results for phylogenetic dating and to compare the speed of evolution among species with different generation times, we needed estimates of yearly mutation rates. Different methods have been used in the literature to estimate yearly mutation rates. When sample sizes are small, yearly rates are commonly inferred by dividing the per-generation rate by the average age of the parents (or the generation time if parental age is unknown)49–51. However, this method implicitly assumes a constant accumulation of mutations from conception to reproduction, that is, the regression line of mutation rate on parental age should run through the origin. Our results (Fig. 1b), as well as previous studies of mice, humans and cats20,34, imply that parents always carry a minimum number of mutations in their gametes regardless of their age. This could lead to the yearly rate being overestimated for a given species if the sampled trio (or trios) had young parents compared with the average generation time for that species52. Consequently, we built a model that incorporates this mutational contribution at birth. Unfortunately, small per-species sample sizes in our dataset precluded modelling the effects of parental age separately for each species. However, we observed very similar intercepts and slopes across taxonomic groups, allowing us to fit a joint model for all species. A Poisson model explaining the number of mutations in each trio using a mutational contribution at birth and a weighted average of paternal and maternal age fits the data surprisingly well. To incorporate interspecific variation in male bias, we used the per-species fraction of paternal and maternal mutations estimated using read-backed phasing to weigh the average of the parental ages for each trio. Using this model, the number of predicted mutations matches the observed number with an overall r2 of 0.73 (mammalian r2 = 0.58, avian r2 = 0.51; Supplementary Note 1).
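The structure of such a model can be sketched as below. This is an illustration under stated assumptions, not the parametrisation of Supplementary Note 1: the per-site rate is assumed to increase linearly with the weighted parental age from a non-zero value at birth, and all names and numbers are hypothetical.

```python
from math import lgamma, log

def weighted_parental_age(paternal_age, maternal_age, paternal_fraction):
    """Average parental age weighted by the paternal share of phased DNMs."""
    return paternal_fraction * paternal_age + (1.0 - paternal_fraction) * maternal_age

def expected_dnms(weighted_age, callable_sites, rate_at_birth, rate_per_year):
    """Expected DNM count in one trio (assumed linear increase with age).

    The factor 2 accounts for the two haploid genomes of the offspring.
    """
    return 2.0 * callable_sites * (rate_at_birth + rate_per_year * weighted_age)

def poisson_log_likelihood(observed, expected):
    """Log-likelihood of observed counts under Poisson means `expected`."""
    return sum(k * log(lam) - lam - lgamma(k + 1)
               for k, lam in zip(observed, expected))

# Hypothetical trio: father 10 years, mother 6 years, 75% paternal DNMs.
age = weighted_parental_age(10.0, 6.0, 0.75)
lam = expected_dnms(age, 1e9, rate_at_birth=5e-9, rate_per_year=1e-10)
```

Fitting would then amount to choosing `rate_at_birth` and `rate_per_year` that maximise `poisson_log_likelihood` over all trios, for example with a numerical optimiser.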
The yearly rates inferred with the naive method of dividing the per-generation rate by parental age (µyearly) and the rates obtained with our model (µyearly_modelled) yielded similar results (Pearson’s correlation r2 = 0.40, P = 0.002), and for 55% of the species, µyearly falls within the 95% confidence interval of the µyearly_modelled. As expected, the estimates showed the greatest differences for those species in which the parents reproduced far from the generation time, with the model-based estimates being smaller for those species that reproduced earlier than their generation time and larger for those species that reproduced later than their generation time. For example, the pigs in our dataset reproduced at around 6 months of age, which is more than 5 years earlier than the estimated generation time of this species. Thus, µyearly = 8.64 × 10−9 mutations per site per year was potentially overestimated compared with the µyearly_modelled = 1.05 × 10−9 mutations per site per year at the generation time. Conversely, the yearly rate of the Texas banded gecko was potentially underestimated at µyearly = 3.17 × 10−9 mutations per site per year using the reproductive age of 2 years of age from our dataset, whereas the modelled rate was µyearly_modelled = 1.96 × 10−8 mutations per site per year at a generation time of between 3 and 4 months. Both the naive method and the modelled method have been used in the literature to estimate yearly rates and both have caveats owing to the underlying assumptions they require. Bearing this in mind, we decided to use µyearly_modelled for the current analysis as we believe that this measure is more representative of the yearly rate at the generation time for each species (estimated yearly rates are provided in Supplementary Table 9 for comparison).
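The direction of this bias can be illustrated with a small worked example. It assumes, for illustration only, that the per-generation rate is a non-zero contribution at birth plus a linear increase with age; the parameter values are hypothetical and are not estimates from this study.

```python
def per_generation_rate(age, rate_at_birth, rate_per_year):
    """Per-generation rate under an assumed linear increase with parental age."""
    return rate_at_birth + rate_per_year * age

def naive_yearly_rate(age, rate_at_birth, rate_per_year):
    """Naive yearly rate: per-generation rate divided by the age at reproduction."""
    return per_generation_rate(age, rate_at_birth, rate_per_year) / age

# Hypothetical species with a generation time of 5 years whose sampled
# parents reproduced at 0.5 years.
RATE_AT_BIRTH = 5e-9   # mutations per site, present regardless of age
RATE_PER_YEAR = 1e-10  # additional mutations per site per year of age

young = naive_yearly_rate(0.5, RATE_AT_BIRTH, RATE_PER_YEAR)
at_generation_time = naive_yearly_rate(5.0, RATE_AT_BIRTH, RATE_PER_YEAR)
```

Because the contribution at birth is divided by a small age, `young` is roughly nine times `at_generation_time` in this example, mirroring the overestimation described for early-reproducing trios.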
The estimated average µyearly_modelled varies more than 120-fold among species (Supplementary Note 1 and Supplementary Table 9), with the highest µyearly_modelled estimated for the Texas banded gecko at 1.96 × 10−8 mutations per site per year (95% CI = 1.23 × 10−8 to 2.83 × 10−8), whereas the lowest µyearly_modelled estimates were obtained for two bird species, the griffon vulture and the snowy owl, both with less than 0.18 × 10−9 mutations per site per year (snowy owl: µyearly_modelled = 0.16 × 10−9, 95% CI = 0.05 × 10−9 to 0.34 × 10−9; griffon vulture: µyearly_modelled = 0.17 × 10−9, 95% CI = 0.07 × 10−9 to 0.32 × 10−9). This large amount of interspecific variation is remarkable given that pedigree-based GMR estimates of individual species assessed by previous separate studies only show an approximately 16-fold variation in yearly GMRs34,51. Within primates, we observed a twofold variation across species and found a general trend for rates to be higher in the New World monkeys than in the great apes. This is consistent with previous independent estimates from primates19 and supports the ‘hominoid slowdown’ hypothesis53–56.
Next, we used µyearly_modelled to assess the strength of the association between GMRs and long-term evolutionary substitution rates. To obtain an estimate of the long-term substitution rate, we used the alignment of ultraconserved elements (UCEs), which are more likely to align among taxonomically distant species, plus 1,000 bp of flanking regions on each side of the UCE sequences, which will more closely reflect the neutral substitution rate57. We found a significant positive correlation between µyearly_modelled and the UCE substitution rate after excluding domesticated species owing to their overall much higher yearly mutation rates (see the following section; phylogenetic generalized least squares (PGLS): adjusted r2 = 0.23, P = 0.002; Fig. 2a). This pattern is especially pronounced for mammals (PGLS: adjusted r2 = 0.44, P = 0.0008), even after removing the two outliers (PGLS: adjusted r2 = 0.32, P = 0.009). We also found a significant relationship between µyearly_modelled and the long-term substitution rate inferred using whole-genome alignments (PGLS: adjusted r2 = 0.12, P = 0.02; Fig. 2b).
Life-history traits shape GMR variation
To test various hypotheses relating to the causes of GMR variation among species, we tested for associations between the modelled mutation rate per generation (µgeneration_modelled) and life-history traits including mating system (monogamy versus polygamy), maturation time, body mass, longevity, fecundity and the generation time (Supplementary Table 9). We used the µgeneration_modelled instead of the µgeneration as the former is less dependent on the age of the parents and is more representative of the rate at generation time for a given species. After taking into account phylogenetic relatedness, many of these traits are significantly associated with µgeneration_modelled, including the generation time (PGLS: adjusted r2 = 0.15, P = 0.002; Fig. 3a), the maturation time (PGLS: adjusted r2 = 0.18, P = 0.0006; Fig. 3b) and the number of offspring per generation (PGLS: adjusted r2 = 0.10, P = 0.013; Fig. 3c). Species with a higher number of offspring per generation also showed significantly lower µgeneration_modelled when considering only mammalian species (PGLS: adjusted r2 = 0.17, P = 0.011), but this relationship was not significant for birds (PGLS: adjusted r2 = −0.066, P = 0.720). Collectively, these traits explain almost 18% of the variation in µgeneration_modelled (multiple PGLS: adjusted r2 = 0.18, P = 0.004). The other life-history traits that we tested, including longevity, mating strategy and body mass, are not significantly associated with µgeneration_modelled (see Extended Data Fig. 7).
Another key parameter for species evolution is the effective population size (Ne), which impacts genetic drift and the efficacy of selection. To investigate the effect of Ne on µgeneration_modelled and to test the drift barrier hypothesis3, which predicts the evolution of higher mutation rates in species with small Ne, we calculated Ne using the pairwise sequentially Markovian coalescent method based on one randomly selected father per species. To avoid circularity, we estimated Ne based on the substitution rate calculated from the UCE alignment (Supplementary Table 9). Indeed, if Ne was estimated using the pedigree-based mutation rate, a stronger correlation might arise between Ne and the mutation rate (see Extended Data Fig. 8). We found a significant negative association between µgeneration_modelled and the harmonic mean Ne per species over the past 30,000–1,000,000 years (PGLS: adjusted r2 = 0.08, P = 0.020; Fig. 3d) as would be expected under the drift barrier hypothesis. This relationship is mainly driven by mammals (PGLS: adjusted r2 = 0.31, P = 0.0006), a signal that is also observed when using the harmonic average Ne over a smaller timescale (30,000–130,000 years; PGLS: adjusted r2 = 0.10, P = 0.04, Extended Data Fig. 8). The most appropriate timeframe used to estimate Ne depends on the evolutionary time necessary for the mutation rate to adapt to changes in Ne. However, the pairwise sequentially Markovian coalescent method cannot accurately estimate recent Ne. To overcome this limitation, we also estimated Ne as π/4μ, with nucleotide diversity (π) and the substitution rate per site per generation (μ) estimated from the UCE alignments. This results in a similar negative association between Ne and µgeneration_modelled (linear regression: adjusted r2 = 0.83, P = 2.2 × 10−16; Extended Data Fig. 9), further supporting the drift barrier hypothesis. 
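The two Ne summaries mentioned above reduce to short calculations, sketched here with hypothetical inputs: the diversity-based estimate Ne = π/4μ, and the harmonic mean of Ne values across time intervals. The function name `ne_from_diversity` is an assumption for illustration.

```python
from statistics import harmonic_mean

def ne_from_diversity(pi, mu):
    """Effective population size from nucleotide diversity: Ne = pi / (4 * mu).

    pi: nucleotide diversity per site.
    mu: substitution rate per site per generation.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    return pi / (4.0 * mu)

# Hypothetical values, not taken from the study.
ne_diversity = ne_from_diversity(pi=0.001, mu=1e-8)

# Hypothetical Ne trajectory over successive time intervals; the harmonic
# mean is dominated by the smallest values, that is, by bottlenecks.
ne_long_term = harmonic_mean([10000, 40000])
```

The harmonic mean is the natural long-term summary because periods of small population size contribute disproportionately to genetic drift.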
However, caution should be taken as Ne estimates rely on generation times inferred from contemporary observations, whereas generation times could conceivably have changed over evolutionary timescales. Furthermore, population size depends negatively on the generation time (PGLS Ne in log scale: adjusted r2 = 0.20, P = 0.0004). Therefore, a negative association between Ne and μ could potentially be driven by a large effect of the generation time on per-generation mutation rates.
High yearly rates in domesticated species
Domestication imposes strong artificial selection, recurrent genetic bottlenecks or both. Our dataset includes 22 domesticated or semi-wild species that have been bred in captivity for many generations. When using the naive method of dividing the per-generation rate by the parental age, these species show significantly higher µyearly than the non-domesticated species (PGLS: adjusted r2 = 0.13, P = 0.0015; Fig. 4a). The higher mutation rates of domesticated animals are likely due to strong artificial selection for traits such as shorter generation times. Indeed, using µyearly_modelled, we found no difference between domesticated and non-domesticated species (PGLS: adjusted r2 = 0.037, P = 0.08; Fig. 4b). Consequently, the higher yearly mutation rate observed in domesticated species is more likely to be explained by the lowering of reproductive age associated with domestication rather than by an inherent change to the mutational process caused by relaxed selection on the mutation rate due to small population sizes and bottlenecks associated with domestication58,59.
Conclusions
Here we analysed pedigree-based GMR variation in an unprecedentedly broad phylogenetic context. We showed that there is a consistent male bias in mammals and birds, whereas reptiles and fish exhibited more evenly matched contributions of DNMs between parents. This could be due to contrasting mutagenic processes, such as differences in male and female germline cell division observed in mammals and birds, or differences among species in the proportion of DNMs occurring in primordial germ cell specification versus in the parental germ lines. Our results also support the drift barrier hypothesis, as we found a negative association between the per-generation mutation rate and effective population size. Moreover, our results suggest that an appreciable proportion of the variation in the GMR can be explained by life-history traits, including maturation time and the number of offspring per generation. Our study also highlights the importance of the generation time, as illustrated by the particular case of domesticated animals, in which exceptionally high yearly mutation rate estimates can be explained by artificially induced short generation times. In addition, some of the trio samples in our study were collected from captive animals at zoos or conservation centres. These populations might have different generation times than those in the wild, which could potentially introduce biases into some of our mutation rate estimates. Future studies should focus on wild pedigree samples, which can be accessed from long-term conservation and monitoring programmes60.
Methods
Samples
Samples were collected from zoos, zoological museums, research institutes and farms from all over the world. Samples were provided by collaborators for research that was undertaken at the Natural History Museum of Denmark, permit 2020-12-7186-00733 from the Danish Ministry of Environment and Food, and when applicable, CITES Certificate of Scientific Exchange number DK003. Genomic DNA was extracted using DNeasy Blood and Tissue Kits (Qiagen) following the manufacturer’s instructions. BGIseq libraries were built in China National GeneBank (CNGB), Shenzhen, China, and whole-genome paired-end sequencing (read length 2 × 100 bp) was performed on the BGISEQ500 platform. We aimed for 60–80× raw sequence coverage per sample. A total of 68 species for which a reference genome was available were retained in the final dataset, representing 151 trios for which whole blood or other tissue material was available for DNA extraction and for which parentage had been genetically determined61. Information on the samples is provided in Supplementary Table 1.
GMR estimation
We applied a bioinformatic analysis pipeline similar to that of our previous study of rhesus macaques12. Raw reads were trimmed with SOAPnuke filter62. The mapping was conducted with BWA-MEM version 0.7.15 (ref. 63). The versions of the reference genomes for each species are provided in Supplementary Table 9. A post-mapping step removed any reads mapping to multiple regions of the genome as well as duplicated reads using Picard MarkDuplicates 2.7.1. We called variants for each individual using HaplotypeCaller in BP-RESOLUTION mode with GATK 4.0.7.0 (ref. 64). This mode returns a genotype quality and depth for all positions of the genome, not only the polymorphic sites. As recommended by GATK best practices, GenomicsDBImport combined all gVCF files per species into a single file and GenotypeGVCFs applied a joint genotyping of all samples within a given species (see Supplementary Table 3 for details of raw sequence coverage, mapping quality, and coverage after mapping and variant calling). Filtering methods similar to those in our previous study were then applied to detect DNMs12. Each trio was filtered as follows:
(1) For site filtering, the variant positions were filtered with the following parameters: QualByDepth (QD) < 2.0, FisherStrand (FS) > 20.0, RMSMappingQuality (MQ) < 40.0, MQRankSum < −2.0, MQRankSum > 4.0, ReadPosRankSum < −3.0, ReadPosRankSum > 3.0 and StrandOddsRatio (SOR) > 3.0 according to previously tested filters12.
(2) For Mendelian violations, variants that deviated from Mendelian inheritance were selected using GATK SelectVariants and refined with an R script to keep only sites in which both parents were homozygous for the reference allele (HomRef), and the offspring was heterozygous (Het).
(3) For allelic balance filter, in the case of a DNM, approximately 50% of the reads in the offspring should support the alternative allele. Our allelic balance filter cut-off was 30–70% of the reads supporting the alternative allele, similar to previous studies12,32,65,66.
(4) For depth filter (DP), only positions with a DP > 0.5 × mdepth and DP < 2 × mdepth for each individual were kept, with mdepth being the average depth of the trio. This strict DP filter minimized the effects of sequencing errors in regions of low sequencing depth and mis-mapping errors in high-coverage regions.
(5) For genotype quality filter (GQ), to ensure that only high-quality genotypes were retained for the analysis of trios, we removed all sites where one individual of the trio had a GQ < 60 (see Supplementary Fig. 2 for a comparison of various GQ thresholds on a subset of species).
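The filters listed above can be sketched as a single predicate applied to one variant site in one trio. This is an illustration, not the pipeline used in the study: the `Genotype` record, the dictionary of site annotations and the function names are assumptions, and sites with missing annotations are not handled.

```python
from dataclasses import dataclass

@dataclass
class Genotype:
    gt: str         # "HomRef", "Het" or "HomAlt"
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the alternative allele
    dp: int         # depth at the site
    gq: int         # genotype quality

def passes_site_filters(site):
    """Site-level thresholds; `site` maps annotation names to values."""
    return not (
        site["QD"] < 2.0 or site["FS"] > 20.0 or site["MQ"] < 40.0
        or site["MQRankSum"] < -2.0 or site["MQRankSum"] > 4.0
        or site["ReadPosRankSum"] < -3.0 or site["ReadPosRankSum"] > 3.0
        or site["SOR"] > 3.0
    )

def is_candidate_dnm(site, father, mother, child, mean_depth):
    """Apply the site, Mendelian, allelic balance, depth and GQ filters."""
    trio = (father, mother, child)
    if not passes_site_filters(site):
        return False
    # Mendelian violation: both parents HomRef, offspring Het.
    if not (father.gt == "HomRef" and mother.gt == "HomRef" and child.gt == "Het"):
        return False
    # Allelic balance: 30-70% of offspring reads support the alternative allele.
    total = child.ref_reads + child.alt_reads
    if total == 0 or not 0.3 <= child.alt_reads / total <= 0.7:
        return False
    # Depth: between 0.5x and 2x the average depth of the trio, for everyone.
    if not all(0.5 * mean_depth < g.dp < 2.0 * mean_depth for g in trio):
        return False
    # Genotype quality: no individual below 60.
    return all(g.gq >= 60 for g in trio)

# Hypothetical site and trio.
site = {"QD": 15.0, "FS": 1.0, "MQ": 60.0,
        "MQRankSum": 0.0, "ReadPosRankSum": 0.0, "SOR": 1.0}
father = Genotype("HomRef", 60, 0, 60, 99)
mother = Genotype("HomRef", 58, 0, 58, 99)
child = Genotype("Het", 32, 30, 62, 99)
keep = is_candidate_dnm(site, father, mother, child, mean_depth=60.0)
```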
In addition, we called variants with bcftools (version 1.2)67 in the region of the candidate DNMs and removed the sites that appeared as false-positive calls (that is, at least one parent had the same variant as the offspring or the offspring had no variant). The number of candidates discarded varied among species (Supplementary Table 2). This quality control step produced similar results to a manual check with IGV68. Moreover, calling variants with different variant callers has been shown to be an efficient method to reduce false-positive calls2. All positions of DNMs are provided in Supplementary Table 6. In addition, we showed that sample type, reference genome quality and mapping quality can affect the results on the number of candidates, the false-positive rate and false-negative rate (FNR), yet, the estimated mutation rates are not affected (Supplementary Figs. 3–5).
To estimate per-generation rates, we divided the number of candidate DNMs, excluding the apparent false-positive candidates, by the size of the callable genome (CG). A site was considered callable when it passed the same filters as the polymorphic sites, that is, when both parents were HomRef (filter 2) and the three individuals passed the depth filter (filter 4) and the genotype quality threshold (filter 5). On the sites considered callable, we applied a correction for the FNR, that is, the proportion of sites where true DNMs will not be called as such. Two methods have been used in the literature to estimate FNR: one is the simulation of mutations and the other is a correction on the filters that are not accounted for in the callable genome. As in our previous study of GMR12, we used the latter method, which is more conservative. This corrected for the remaining filters that can only be applied on polymorphic sites, such as the site filters (filter 1) and the allelic balance filter (filter 3). We estimated the proportion of sites that would be filtered away by the site filters on the parameters following a known distribution (FS, MQRankSum and ReadPosRankSum), and the expected sites filtered away by the allelic balance filter as the number of true heterozygote sites (one parent HomRef, the other parent HomAlt and their offspring Het) outside the allelic balance threshold. The mutation rate per site per generation was then estimated per trio as µgeneration = DNMs/((1 − FNR) × 2 × CG), where DNMs is the number of retained de novo mutations and CG is the number of callable sites; the factor of 2 accounts for the two haploid genomes of the diploid offspring. We estimated the 95% binomial confidence interval per species using the binconf() function in R, with the default Wilson score.
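The per-trio estimate and a Wilson score interval can be sketched as follows. The formula for the rate is the one given above; treating the corrected number of callable haploid sites as the number of binomial trials is an assumption made for illustration, since the text does not state the exact arguments passed to binconf(). The input values are hypothetical.

```python
from math import sqrt

def mutation_rate(n_dnms, fnr, callable_sites):
    """Per-site, per-generation rate: DNMs / ((1 - FNR) * 2 * CG)."""
    if not 0.0 <= fnr < 1.0 or callable_sites <= 0:
        raise ValueError("need 0 <= fnr < 1 and callable_sites > 0")
    return n_dnms / ((1.0 - fnr) * 2.0 * callable_sites)

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2.0 * trials)) / denom
    half = z * sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    return centre - half, centre + half

def rate_with_ci(n_dnms, fnr, callable_sites):
    """Rate plus a 95% Wilson interval (trials = corrected haploid sites, assumed)."""
    trials = (1.0 - fnr) * 2.0 * callable_sites
    lower, upper = wilson_interval(n_dnms, trials)
    return n_dnms / trials, lower, upper

# Hypothetical trio: 20 DNMs, FNR of 5%, 1.2 billion callable sites.
rate, lower, upper = rate_with_ci(20, 0.05, 1.2e9)
```

With these inputs the rate is close to 8.8 × 10−9 per site per generation; the interval is asymmetric around it, which is typical of the Wilson score for small counts.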
To calculate yearly rates (µyearly), we divided the per-generation rate by the average age of the parents at the time of reproduction weighted by the relative contribution of each parent (inferred with α for 105 trios) or by the generation time (for 46 trios without parental ages). The resulting µyearly estimates were averaged per species (for 29 species with multiple trios available). These yearly rates are dependent on the age of reproduction of the parents. Therefore, to calculate a yearly rate at generation time, we first modelled how the mutation rate of a trio was affected by the weighted average of the parental ages (using the paternal fraction estimated for that species as a weight). We then extended the model to fit how each species deviated from the average and used this to correct for differences between the observed reproductive age in our dataset and the expected generation time of a species (see Supplementary Note 1). With this, we estimated a new µyearly_modelled and a µgeneration_modelled that are more representative of the rate at generation time for each species.
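The unmodelled yearly rate described above can be sketched as follows. This is a minimal illustration under stated assumptions: the paternal fraction is taken directly as a number between 0 and 1 (how it is derived from α is not shown here), and the function names and example values are hypothetical. The modelled rates of Supplementary Note 1 are not reproduced.

```python
def weighted_parental_age(paternal_age, maternal_age, paternal_fraction):
    """Average parental age weighted by each parent's share of the DNMs."""
    if not 0.0 <= paternal_fraction <= 1.0:
        raise ValueError("paternal fraction must lie in [0, 1]")
    return (
        paternal_fraction * paternal_age
        + (1.0 - paternal_fraction) * maternal_age
    )


def yearly_rate_from_ages(mu_generation, paternal_age, maternal_age, paternal_fraction):
    """Per-generation rate divided by the weighted parental age (in years)."""
    age = weighted_parental_age(paternal_age, maternal_age, paternal_fraction)
    return mu_generation / age


def yearly_rate_from_generation_time(mu_generation, generation_time):
    """Fallback for trios without parental ages: divide by the generation time."""
    if generation_time <= 0:
        raise ValueError("generation time must be positive")
    return mu_generation / generation_time


# Hypothetical trio: father aged 30, mother aged 20, paternal fraction 0.8.
example_age = weighted_parental_age(30.0, 20.0, 0.8)
example_rate = yearly_rate_from_ages(1.4e-8, 30.0, 20.0, 0.8)
```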
Phylogenetic analysis
The phylogeny was built based on two sets of UCEs: 5,472 baits for 5,060 UCEs in tetrapods57 and 2,628 baits for 1,314 UCEs in acanthomorphs69. We used the Phyluce software70 to locate the probes in the reference genomes of our 68 species and of 6 additional species contained in our original dataset. We extracted a flanking region of ±1,000 bp for each probe and aligned them with MAFFT version 7.470 (ref. 71). We then created a 75% completion matrix, that is, each alignment contains at least 75% of the taxa (55 species), resulting in 63 alignments from the acanthomorph set and 2,742 alignments from the tetrapod set (all alignments are available on Figshare). A phylogenetic tree was built using IQ-TREE version 2.0.3 (ref. 72), with the appropriate substitution model inferred for each of the 2,805 alignments, a maximum likelihood tree search and 1,000 bootstrap replicates. To validate our tree, we also estimated a second tree based on a MultiZ alignment to the human genome and obtained similar results (Extended Data Fig. 9). The phylogenetic tree was calibrated to absolute time using the chronos function of the ‘ape’ package in R, with a smoothing parameter lambda of 0 and a ‘relaxed’ model73,74. The robustness of the tree was assessed by removing each calibrated node independently (see Extended Data Fig. 3). Fourteen nodes were calibrated following previously published calibrations:
Actinopterygii/Sarcopterygii: divergence time 416 million years ago (Ma), upper bound 425.4 Ma75
The first node in the Actinopterygii group: divergence time 378.2 Ma76
Sauropsida (birds and reptiles)/Synapsida (mammals): divergence time 313.4 Ma77
Archosauria (birds)/Testudines: divergence time 260 Ma78
The basal nodes of the Lepidosauria: divergence time 222.8 Ma79
First mammalian node, Eutheria/Metatheria: divergence time 160.7 Ma75
Galloanserae/Neoaves: divergence time 66 Ma77
Glires/Primates: divergence time 61.7 Ma77
Basal gekkotan node: divergence time 54 Ma80
Passeriformes/Psittaciformes: divergence time 51.81 Ma81
Cynoglossidae/Paralichthyidae: divergence time 50 Ma76
Sus scrofa/other Cetartiodactyla: divergence time 48.5 Ma77
Canidae/Arctoidea: divergence time 37.1 Ma75
Hominoidea/Cercopithecoidea: divergence time 23.5 Ma77
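The 75% completion matrix used for the UCE alignments in this section can be illustrated with a short filter. This is a sketch under assumptions: alignments are represented as a hypothetical mapping from locus name to the set of taxa present, and a plain fraction comparison is used because the exact rounding rule applied in the study is not stated here.

```python
def filter_by_completeness(alignments, n_taxa, min_fraction=0.75):
    """Keep alignments in which at least `min_fraction` of the taxa are present.

    alignments: dict mapping a locus name to the set of taxa in that alignment.
    n_taxa: total number of taxa in the dataset.
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    return {
        locus: taxa
        for locus, taxa in alignments.items()
        if len(taxa) / n_taxa >= min_fraction
    }


# Toy dataset with four taxa (names are placeholders).
toy_alignments = {
    "uce-1": {"sp_a", "sp_b", "sp_c", "sp_d"},  # 100% complete
    "uce-2": {"sp_a", "sp_b", "sp_c"},          # 75% complete
    "uce-3": {"sp_a", "sp_b"},                  # 50% complete, dropped
}
retained = filter_by_completeness(toy_alignments, n_taxa=4)
```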
Mutational spectrum and sex bias
To analyse the spectrum of mutation, we grouped the trios into higher taxonomic levels, that is, mammals, birds, fishes and reptiles. Thus, the percentages reported are based on the total candidate mutations from each group of species. We explored the genomic context of the mutations from a C or a G base to determine whether they were located in CpG sites (respectively followed by a G or preceded by a C) (see Supplementary Table 4). We phased the DNMs to their parental origin using the read-backed phasing method described previously (GitHub: https://github.com/besenbacher/POOHA)82. This method uses the read-pairs containing a DNM and another heterozygous variant to determine the parental origin of the mutation when the heterozygous variant is present in both the offspring and one of the parents. The phasing allowed us to identify parental biases in the contribution of the DNMs by grouping multiple species to increase the number of phased mutations and obtain a minimum of 30 phased mutations per taxon. From this analysis, we omitted the Egyptian rousette (Rousettus aegyptiacus), Chinese tree shrew (Tupaia belangeri), griffon vulture (Gyps fulvus), blue-throated macaw (Ara glaucogularis), snowy owl (Bubo scandiacus) and Darwin’s rhea (Rhea pennata), as these could not be grouped with another monophyletic clade. To quantify the effect of parental age, a linear regression between the per-generation mutation rate and the average parental age at the time of reproduction was implemented using the lm function in R. Multiple linear regression was also used to identify whether paternal or maternal age was the strongest predictor of the empirical mutation rate.
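The CpG context rule stated above (a mutated C followed by a G, or a mutated G preceded by a C) can be sketched as a small function. The function name and the example sequences are hypothetical; positions are 0-based indices into a reference sequence string given on the forward strand.

```python
def is_cpg_site(sequence, position):
    """Return True if the reference base at `position` lies in a CpG dinucleotide.

    A C counts when it is followed by a G; a G counts when it is preceded by a C.
    Any other reference base returns False.
    """
    if not 0 <= position < len(sequence):
        raise IndexError("position outside the sequence")
    base = sequence[position].upper()
    if base == "C":
        return position + 1 < len(sequence) and sequence[position + 1].upper() == "G"
    if base == "G":
        return position > 0 and sequence[position - 1].upper() == "C"
    return False


# In "ACGT", the C at index 1 and the G at index 2 form a CpG dinucleotide.
cpg_flags = [is_cpg_site("ACGT", i) for i in range(4)]
```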
Life-history trait analysis
We tested the effect of various life-history traits (fitted as continuous and discrete variables) on the yearly rate for each species using PGLS analysis in the R package ‘caper’83 (see Supplementary Table 9 for details about each life-history trait).
Effective population size
We used pairwise sequentially Markovian coalescent (PSMC) models to estimate the effective population size of each species84. For one randomly selected father per species, the aligned sequences in bam format were converted into fastq format using the samtools mpileup command and vcf2fq. As recommended, the minimum depth was set to one-third of the average depth of the sample and the maximum depth to twice the average. For mammals, fish and reptiles, the parameters of the PSMC were set to -N25 for the maximum number of iterations of the algorithm, -t15 as the upper limit for the time to the most recent common ancestor, -r5 for the initial θ/ρ value, and finally the atomic intervals -p of ‘4+25*2+4+6’. These parameters were used previously for PSMC analysis of various species, including primates84,85, cetaceans86, Felidae87, fishes88,89 and turtles90. For birds, we used different parameters according to the literature with -N30 -t5 -r5 (ref. 91). Finally, to simulate the history inferred by PSMC, we parameterized the generation time and the mutation rate inferred from the UCE alignment. We then explored the effect of the harmonic mean Ne over windows of 30,000 years to 1,000,000 years. We also compared the Ne estimates obtained with this method with those based on Ne = π/4μ. Nucleotide diversity (π) was calculated using ANGSD92. This approach was implemented in three consecutive steps. From the alignment files, a global estimate of the site frequency spectrum was inferred using a maximum likelihood method, then the empirical π value was estimated per site, and finally, a sliding window approach was used to estimate π for each species. We used a window size of 50 kb and a step size of 10 kb together with an average pairwise estimation of π to obtain global estimates of π.
This analysis was restricted to unrelated individuals from each species, which corresponded to the 2 unrelated parents for 55 species and to between 3 and 7 individuals for 10 species; 3 species were excluded from this analysis as the parents were first-degree relatives.
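The arithmetic behind the quantities in this section can be sketched in a few lines of Python. This is an illustration under assumptions, not the study's code: the PSMC trajectory is represented as hypothetical (duration in years, Ne) segments with constant Ne, the harmonic mean is weighted by segment duration, and μ in Ne = π/4μ is taken to be the per-generation mutation rate.

```python
def depth_bounds(mean_depth):
    """Minimum and maximum depth: one-third and twice the sample's mean depth."""
    return mean_depth / 3.0, 2.0 * mean_depth


def harmonic_mean_ne(segments):
    """Duration-weighted harmonic mean of Ne over piecewise-constant segments.

    segments: iterable of (duration_years, ne) pairs covering the time window.
    """
    segments = list(segments)
    total_time = sum(duration for duration, _ in segments)
    if total_time <= 0:
        raise ValueError("total duration must be positive")
    return total_time / sum(duration / ne for duration, ne in segments)


def ne_from_diversity(pi, mu_generation):
    """Ne = pi / (4 * mu), with mu as the per-site per-generation rate."""
    return pi / (4.0 * mu_generation)


# Hypothetical values for illustration.
min_depth, max_depth = depth_bounds(30.0)
ne_window = harmonic_mean_ne([(100.0, 1000.0), (100.0, 3000.0)])
ne_pi = ne_from_diversity(0.001, 1.0e-8)
```

The harmonic mean is pulled towards the smaller Ne value (1,500 in this toy window rather than the arithmetic mean of 2,000), which is why it is commonly used to summarize long-term effective population size.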
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-023-05752-y.
Acknowledgements
The authors thank the following contributors of samples for this study: A. Girard, C. Small, E. Couture, E. Gangloff, A. Bronikowski, F. Yu, H. Fernández, A. Carbajal Brossa, the Barcelona Zoo Biological Bank, J. Partecke, J. Judson, F. Janzen, J. Fjeldså, K. Thorup, K. Glover, L. Koren, M. Nagel, M. Fredholm, M. Liedvogel, T. Aquarium, P. Vullioud, S.-J. Luo, T. Gamble, Y. Yovel, J. Bakker, C. Bombis, T. Charlton, A. Corl, A. Foote, N. Geli, M. Guille, K. L. Hansen, W. Huizinga, M. Hunter, T. Knauf-Witzens, T. Lund Koch, S. Potier, A. Prahl, K. Robertson, C. Scala, M. Schellerup, I. Schnell, K. Vesterdorf, K. Wendelin, K. Worm and W.-z. Wang; G. Pacheco for valuable advice in the laboratory; K. Boomsma for stimulating conceptual discussions and for providing comments on the final version of this manuscript; and GenomeDK at Aarhus University for providing computational resources and support for this study. All sequencing data were generated with MGI-sequencers at the China National Genebank of BGI-Shenzhen. This project was funded by the Strategic Priority Research Programme (XDB13020000) and the International Partnership Programme (no. 152453KYSB20170002) of the Chinese Academy of Sciences, a Villum Investigator grant (no. 25900) from The Villum Foundation, and a Carlsberg Foundation Grant to G.Z. (CF16-0663). L.A.B. was supported by the Carlsberg Foundation and the Villum Foundation. M.H.S. was funded by the Novo Nordisk Foundation (NNF18OC0031004). J.I.H. was funded by the German Research Foundation (DFG) as part of the SFB TRR 212 (NC³) (project nos. 316099922 and 396774617), the priority programme “Antarctic Research with Comparative Investigations in Arctic Ice Areas” SPP 1158 (project no. 424119118) and the sequencing costs in projects scheme (project no. 497640428).
Author contributions
G.Z., M.H.S., S.B. and L.A.B. conceived this work. M.F.B., B.Q., J.I.H., Z.L., J.S.L. and C.S. provided samples for several species, as well as input into the writing and results interpretation. L.A.B., J.Z., P.L. and M.T.P.G. participated in the extraction, library preparation and sequencing. All analyses were conducted by L.A.B. with input from J.S. for the phylogenetic analysis and S.B. for the mutation rate estimation. L.A.B., G.Z., S.B., M.H.S. and J.I.H. wrote the initial draft of the manuscript with input from all co-authors. G.Z. supervised this project.
Peer review
Peer review information
Nature thanks Anne Goriely and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Whole-genome sequences of all species except humans are accessible in the National Center for Biotechnology Information under the BioProject ID PRJNA767781. The human sequences are available on request to L.A.B. and should be used only for GMR studies, based on the participant’s request. The alignments for the UCE tree are available on Figshare (10.6084/m9.figshare.19221693.v1). All animal silhouettes are from PhyloPic (http://phylopic.org/), except for the silhouette of S. scovelli, which was created by J.S. The silhouette of P. troglodytes was created by T. M. Keesey (vectorization) and T. Hisgett (photography), and the silhouette of S. harrisii was created by S. Werning; both are available under a CC-BY 3.0 licence (https://creativecommons.org/licenses/by/3.0/); the other silhouettes are available under a Public Domain Mark 1.0 licence.
Code availability
The bioinformatics pipeline to analyse the genomes and all other data analyses are available on GitHub (https://github.com/lucieabergeron/vertebrate_rate).
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Lucie A. Bergeron, Email: lucie.a.bergeron@gmail.com
Guojie Zhang, Email: guojiezhang@zju.edu.cn.
Extended data is available for this paper at 10.1038/s41586-023-05752-y.
Supplementary information
The online version contains supplementary material available at 10.1038/s41586-023-05752-y.
References
- 1. Lynch M, et al. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev. Genet. 2016;17:704–714. doi: 10.1038/nrg.2016.104.
- 2. Bergeron LA, et al. The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife. 2022;11:e73577. doi: 10.7554/eLife.73577.
- 3. Lynch M. Evolution of the mutation rate. Trends Genet. 2010;26:345–352. doi: 10.1016/j.tig.2010.05.003.
- 4. Acuna-Hidalgo R, Veltman JA, Hoischen A. New insights into the generation and role of de novo mutations in health and disease. Genome Biol. 2016;17:241. doi: 10.1186/s13059-016-1110-1.
- 5. Sturtevant AH. Essays on evolution. I. On the effects of selection on mutation rate. Q. Rev. Biol. 1937;12:464–467. doi: 10.1086/394543.
- 6. Zhang G. The mutation rate as an evolving trait. Nat. Rev. Genet. 2022;24:3. doi: 10.1038/s41576-022-00547-9.
- 7. Mugal CF, Arndt PF, Holm L, Ellegren H. Evolutionary consequences of DNA methylation on the GC content in vertebrate genomes. G3. 2015;5:441–447. doi: 10.1534/g3.114.015545.
- 8. Baer CF, Miyamoto MM, Denver DR. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nat. Rev. Genet. 2007;8:619–631. doi: 10.1038/nrg2158.
- 9. Wright SD, Ross HA, Jeanette Keeling D, McBride P, Gillman LN. Thermal energy and the rate of genetic evolution in marine fishes. Evol. Ecol. 2011;25:525–530. doi: 10.1007/s10682-010-9416-z.
- 10. Ohta T. An examination of the generation-time effect on molecular evolution. Proc. Natl Acad. Sci. USA. 1993;90:10676–10680. doi: 10.1073/pnas.90.22.10676.
- 11. Martin AP, Palumbi SR. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl Acad. Sci. USA. 1993;90:4087–4091. doi: 10.1073/pnas.90.9.4087.
- 12. Bergeron LA, et al. The germline mutational process in rhesus macaque and its implications for phylogenetic dating. Gigascience. 2021;10:giab029. doi: 10.1093/gigascience/giab029.
- 13. Wu FL, et al. A comparison of humans and baboons suggests germline mutation rates do not track cell divisions. PLoS Biol. 2020;18:e3000838. doi: 10.1371/journal.pbio.3000838.
- 14. Wang RJ, et al. Paternal age in rhesus macaques is positively associated with germline mutation accumulation but not with measures of offspring sociability. Genome Res. 2020;30:826–834. doi: 10.1101/gr.255174.119.
- 15. Campbell CR, et al. Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur. Heredity. 2021;127:233–244. doi: 10.1038/s41437-021-00446-5.
- 16. Besenbacher S, Hvilsom C, Marques-Bonet T, Mailund T, Schierup MH. Direct estimation of mutations in great apes reconciles phylogenetic dating. Nat. Ecol. Evol. 2019;3:286–292. doi: 10.1038/s41559-018-0778-x.
- 17. Thomas GWC, et al. Reproductive longevity predicts mutation rates in primates. Curr. Biol. 2018;28:3193–3197.e5. doi: 10.1016/j.cub.2018.08.050.
- 18. Cagan A, et al. Somatic mutation rates scale with lifespan across mammals. Nature. 2022;604:517–524. doi: 10.1038/s41586-022-04618-z.
- 19. Chintalapati M, Moorjani P. Evolution of the mutation rate across primates. Curr. Opin. Genet. Dev. 2020;62:58–64. doi: 10.1016/j.gde.2020.05.028.
- 20. Wang RJ, et al. De novo mutations in domestic cat are consistent with an effect of reproductive longevity on both the rate and spectrum of mutations. Mol. Biol. Evol. 2022;39:msac147. doi: 10.1093/molbev/msac147.
- 21. Venn O, et al. Strong male bias drives germline mutation in chimpanzees. Science. 2014;344:1272–1275. doi: 10.1126/science.344.6189.1272.
- 22. Jónsson H, et al. Parental influence on human germline de novo mutations in 1,548 trios from Iceland. Nature. 2017;549:519–522. doi: 10.1038/nature24018.
- 23. Tatsumoto S, et al. Direct estimation of de novo mutation rates in a chimpanzee parent-offspring trio by ultra-deep whole genome sequencing. Sci. Rep. 2017;7:13561. doi: 10.1038/s41598-017-13919-7.
- 24. Yuen RKC, et al. Genome-wide characteristics of de novo mutations in autism. npj Genomic Med. 2016;1:160271–1602710. doi: 10.1038/npjgenmed.2016.27.
- 25. Wang H, Zhu X. De novo mutations discovered in 8 Mexican American families through whole genome sequencing. BMC Proc. 2014;8:S24. doi: 10.1186/1753-6561-8-S1-S24.
- 26. Li W-H, Yi S, Makova K. Male-driven evolution. Curr. Opin. Genet. Dev. 2002;12:650–656. doi: 10.1016/S0959-437X(02)00354-4.
- 27. Miyata T, Hayashida H, Kuma K, Mitsuyasu K, Yasunaga T. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harb. Symp. Quant. Biol. 1987;52:863–867. doi: 10.1101/SQB.1987.052.01.094.
- 28. Wilson Sayres MA, Makova KD. Genome analyses substantiate male mutation bias in many species. BioEssays. 2011;33:938–945. doi: 10.1002/bies.201100091.
- 29. Ellegren H, Fridolfsson A-K. Male-driven evolution of DNA sequences in birds. Nat. Genet. 1997;17:182–184. doi: 10.1038/ng1097-182.
- 30. Sayres MAW, Venditti C, Pagel M, Makova KD. Do variations in substitution rates and male mutations bias correlate with life-history traits? A study of 32 mammalian genomes. Evolution. 2011;65:2800–2815. doi: 10.1111/j.1558-5646.2011.01337.x.
- 31. de Manuel M, Wu FL, Przeworski M. A paternal bias in germline mutation is widespread in amniotes and can arise independently of cell division numbers. eLife. 2022;11:e80008. doi: 10.7554/eLife.80008.
- 32. Francioli LC, et al. Genome-wide patterns and properties of de novo mutations in humans. Nat. Genet. 2015;47:822–826. doi: 10.1038/ng.3292.
- 33. Gao Z, et al. Overlooked roles of DNA damage and maternal age in generating human germline mutations. Proc. Natl Acad. Sci. USA. 2019;116:9491–9500. doi: 10.1073/pnas.1901259116.
- 34. Lindsay SJ, Rahbari R, Kaplanis J, Keane T, Hurles ME. Similarities and differences in patterns of germline mutation between mice and humans. Nat. Commun. 2019;10:4053. doi: 10.1038/s41467-019-12023-w.
- 35. Gibbs RA, et al. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004;428:493–520. doi: 10.1038/nature02426.
- 36. Blumenstiel JP. Sperm competition can drive a male-biased mutation rate. J. Theor. Biol. 2007;249:624–632. doi: 10.1016/j.jtbi.2007.08.023.
- 37. Birkhead TR, Briskie JV, Møller AP. Male sperm reserves and copulation frequency in birds. Behav. Ecol. Sociobiol. 1993;32:85–93. doi: 10.1007/BF00164040.
- 38. Moller AP. Sperm competition, sperm depletion, paternal care, and relative testis size in birds. Am. Nat. 1991;137:882–906. doi: 10.1086/285199.
- 39. Birkhead TR, Montgomerie R. Three decades of sperm competition in birds. Phil. Trans. R. Soc. B. 2020;375:20200208. doi: 10.1098/rstb.2020.0208.
- 40. Brouwer L, Griffith SC. Extra-pair paternity in birds. Mol. Ecol. 2019;28:4864–4882. doi: 10.1111/mec.15259.
- 41. Hunter FM, Harcourt R, Wright M, Davis LS. Strategic allocation of ejaculates by male Adélie penguins. Proc. R. Soc. Lond. B. 2000;267:1541–1545. doi: 10.1098/rspb.2000.1176.
- 42. Hamamah S, Gatti JL. Role of the ionic environment and internal pH on sperm activity. Hum. Reprod. 1998;13:20–30. doi: 10.1093/humrep/13.suppl_4.20.
- 43. Gribbins K. Reptilian spermatogenesis. Spermatogenesis. 2011;1:250–269. doi: 10.4161/spmg.1.3.18092.
- 44. Gribbins KM, Gist DH, Congdon JD. Cytological evaluation of spermatogenesis and organization of the germinal epithelium in the male slider turtle, Trachemys scripta. J. Morphol. 2003;255:337–346. doi: 10.1002/jmor.10069.
- 45. Schulz RW, et al. Spermatogenesis in fish. Gen. Comp. Endocrinol. 2010;165:390–411. doi: 10.1016/j.ygcen.2009.02.013.
- 46. Lubzens E, Young G, Bobe J, Cerdà J. Oogenesis in teleosts: how fish eggs are formed. Gen. Comp. Endocrinol. 2010;165:367–389. doi: 10.1016/j.ygcen.2009.05.022.
- 47. Jalabert B. Particularities of reproduction and oogenesis in teleost fish compared to mammals. Reprod. Nutr. Dev. 2005;45:261–279. doi: 10.1051/rnd:2005019.
- 48. Jónsson H, et al. Multiple transmissions of de novo mutations in families. Nat. Genet. 2018;50:1674–1680. doi: 10.1038/s41588-018-0259-9.
- 49. Martin HC, et al. Insights into platypus population structure and history from whole-genome sequencing. Mol. Biol. Evol. 2018;35:1238–1252. doi: 10.1093/molbev/msy041.
- 50. Smeds L, Qvarnström A, Ellegren H. Direct estimate of the rate of germline mutation in a bird. Genome Res. 2016;26:1211–1218. doi: 10.1101/gr.204669.116.
- 51. Feng C, et al. Moderate nucleotide diversity in the Atlantic herring is associated with a low mutation rate. eLife. 2017;6:e23907. doi: 10.7554/eLife.23907.
- 52. Gao Z, Wyman MJ, Sella G, Przeworski M. Interpreting the dependence of mutation rates on age and time. PLoS Biol. 2016;14:e1002355. doi: 10.1371/journal.pbio.1002355.
- 53. Goodman M. Rates of molecular evolution: the hominoid slowdown. BioEssays. 1985;3:9–14. doi: 10.1002/bies.950030104.
- 54. Moorjani P, Amorim CEG, Arndt PF, Przeworski M. Variation in the molecular clock of primates. Proc. Natl Acad. Sci. USA. 2016;113:10607–10612. doi: 10.1073/pnas.1600374113.
- 55. Scally A, Durbin R. Revising the human mutation rate: implications for understanding human evolution. Nat. Rev. Genet. 2012;13:745–753. doi: 10.1038/nrg3295.
- 56. Soojin VY. Morris Goodman’s hominoid rate slowdown: the importance of being neutral. Mol. Phylogenet. Evol. 2013;66:569–574. doi: 10.1016/j.ympev.2012.07.031.
- 57. Faircloth BC, et al. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst. Biol. 2012;61:717–726. doi: 10.1093/sysbio/sys004.
- 58. Garcia JA, Lohmueller KE. Negative linkage disequilibrium between amino acid changing variants reveals interference among deleterious mutations in the human genome. PLoS Genet. 2021;17:e1009676. doi: 10.1371/journal.pgen.1009676.
- 59. Hedrick PW, Garcia-Dorado A. Understanding inbreeding depression, purging, and genetic rescue. Trends Ecol. Evol. 2016;31:940–952. doi: 10.1016/j.tree.2016.09.005.
- 60. Bonnet T, et al. Genetic variance in fitness indicates rapid contemporary adaptive evolution in wild animals. Science. 2022;376:1012–1016. doi: 10.1126/science.abk0853.
- 61. Manichaikul A, et al. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26:2867–2873. doi: 10.1093/bioinformatics/btq559.
- 62. Chen Y, et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience. 2017;7:1–6. doi: 10.1093/gigascience/gix120.
- 63. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 64. Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv 10.1101/201178 (2018).
- 65. Kong A, et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature. 2012;488:471–475. doi: 10.1038/nature11396.
- 66. Besenbacher S, et al. Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios. Nat. Commun. 2015;6:5969. doi: 10.1038/ncomms6969.
- 67. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27:2987–2993. doi: 10.1093/bioinformatics/btr509.
- 68. Robinson JT, et al. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
- 69. Alfaro ME, et al. Explosive diversification of marine fishes at the Cretaceous–Palaeogene boundary. Nat. Ecol. Evol. 2018;2:688–696. doi: 10.1038/s41559-018-0494-6.
- 70. Faircloth BC. PHYLUCE is a software package for the analysis of conserved genomic loci. Bioinformatics. 2016;32:786–788. doi: 10.1093/bioinformatics/btv646.
- 71. Katoh K, Misawa K, Kuma KI, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–3066. doi: 10.1093/nar/gkf436.
- 72. Minh BQ, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020;37:1530–1534. doi: 10.1093/molbev/msaa015.
- 73. Sanderson MJ. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol. Biol. Evol. 2002;19:101–109. doi: 10.1093/oxfordjournals.molbev.a003974.
- 74. Kim J, Sanderson MJ. Penalized likelihood phylogenetic inference: bridging the parsimony-likelihood gap. Syst. Biol. 2008;57:665–674. doi: 10.1080/10635150802422274.
- 75. Meredith RW, et al. Impacts of the cretaceous terrestrial revolution and KPg extinction on mammal diversification. Science. 2011;334:521–524. doi: 10.1126/science.1211028.
- 76. Hughes LC, et al. Comprehensive phylogeny of ray-finned fishes (Actinopterygii) based on transcriptomic and genomic data. Proc. Natl Acad. Sci. USA. 2018;115:6249–6254. doi: 10.1073/pnas.1719358115.
- 77. Benton MJ, Donoghue PCJ. Paleontological evidence to date the tree of life. Mol. Biol. Evol. 2007;24:26–53. doi: 10.1093/molbev/msl150.
- 78. Green RE, et al. Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science. 2014;346:1254449. doi: 10.1126/science.1254449.
- 79. Sues HD, Olsen PE. Triassic vertebrates of Gondwanan aspect from the Richmond basin of Virginia. Science. 1990;249:1020–1023. doi: 10.1126/science.249.4972.1020.
- 80. Bauer AM, Böhme W, Weitschat W. An Early Eocene gecko from Baltic amber and its implications for the evolution of gecko adhesion. J. Zool. 2005;265:327–332. doi: 10.1017/S0952836904006259.
- 81.Gelabert P, et al. Evolutionary history, genomic adaptation to toxic diet, and extinction of the Carolina parakeet. Curr. Biol. 2020;30:108–114.e5. doi: 10.1016/j.cub.2019.10.066. [DOI] [PubMed] [Google Scholar]
- 82.Maretty L, et al. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference. Nature. 2017;548:87–91. doi: 10.1038/nature23264. [DOI] [PubMed] [Google Scholar]
- 83.Orme, D. et al. The caper package: comparative analysis of phylogenetics and evolution in R. R version 1.0.1 https://cran.r-project.org/package=caper (2018).
- 84.Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475:493–496. doi: 10.1038/nature10231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Schmitz J, et al. Genome sequence of the basal haplorrhine primate Tarsius syrichta reveals unusual insertions. Nat. Commun. 2016;7:12997. doi: 10.1038/ncomms12997. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Vijay N, et al. Population genomic analysis reveals contrasting demographic changes of two closely related dolphin species in the last glacial. Mol. Biol. Evol. 2018;35:2026–2033. doi: 10.1093/molbev/msy108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Liu YC, et al. Genome-wide evolutionary analysis of natural history and adaptation in the world’s tigers. Curr. Biol. 2018;28:3840–3849.e6. doi: 10.1016/j.cub.2018.09.019. [DOI] [PubMed] [Google Scholar]
- 88.Xu S, Zhao L, Xiao S, Gao T. Whole genome resequencing data for three rockfish species of Sebastes. Sci. Data. 2019;6:97. doi: 10.1038/s41597-019-0100-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Yuan Z, et al. Historical demography of common carp estimated from individuals collected from various parts of the world using the pairwise sequentially markovian coalescent approach. Genetica. 2018;146:235–241. doi: 10.1007/s10709-017-0006-7. [DOI] [PubMed] [Google Scholar]
- 90.Fitak RR, Johnsen S. Green sea turtle (Chelonia mydas) population history indicates important demographic changes near the mid-Pleistocene transition. Mar. Biol. 2018;165:110. doi: 10.1007/s00227-018-3366-3. [DOI] [Google Scholar]
- 91.Nadachowska-Brzyska K, Li C, Smeds L, Zhang G, Ellegren H. Temporal dynamics of avian populations during pleistocene revealed by whole-genome sequences. Curr. Biol. 2015;25:1375–1380. doi: 10.1016/j.cub.2015.03.047. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Korneliussen TS, Albrechtsen A, Nielsen R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics. 2014;15:356. doi: 10.1186/s12859-014-0356-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Milholland B, et al. Differences between germline and somatic mutation rates in humans and mice. Nat. Commun. 2017;8:15183. doi: 10.1038/ncomms15183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.The 1000 Genomes Project. Variation in genome-wide mutation rates within and between human families. Nat. Genet.43, 712–714 (2011). [DOI] [PMC free article] [PubMed]
- 95.Rahbari R, et al. Timing rates and spectra of human germline mutation. Nat. Genet. 2016;48:126–133. doi: 10.1038/ng.3469.
- 96.Wong WSW, et al. New observations on maternal age effect on germline de novo mutations. Nat. Commun. 2016;7:10486. doi: 10.1038/ncomms10486.
- 97.Turner TN, et al. Genomic patterns of de novo mutation in simplex autism. Cell. 2017;171:710–722.e12. doi: 10.1016/j.cell.2017.08.047.
- 98.Sasani TA, et al. Large three-generation human families reveal post-zygotic mosaicism and variability in germline mutation accumulation. eLife. 2019;8:e46922. doi: 10.7554/eLife.46922.
- 99.Kessler MD, et al. De novo mutations across 1465 diverse genomes reveal mutational insights and reductions in the Amish founder population. Proc. Natl Acad. Sci. USA. 2020;117:2560–2569. doi: 10.1073/pnas.1902766117.
- 100.Malinsky M, et al. Whole-genome sequences of Malawi cichlids reveal multiple radiations interconnected by gene flow. Nat. Ecol. Evol. 2018;2:1940–1955. doi: 10.1038/s41559-018-0717-x.
- 101.Koch EM, et al. De novo mutation rate estimation in wolves of known pedigree. Mol. Biol. Evol. 2019;36:2536–2547. doi: 10.1093/molbev/msz159.
- 102.Harland C, et al. Frequency of mosaicism points towards mutation-prone early cleavage cell divisions in cattle. Preprint at bioRxiv. 2017. doi: 10.1101/079863.
- 103.Pfeifer SP. Direct estimate of the spontaneous germ line mutation rate in African green monkeys. Evolution. 2017;71:2858–2870. doi: 10.1111/evo.13383.
Associated Data
Data Availability Statement
Whole-genome sequences of all species except humans are accessible in the National Center for Biotechnology Information under the BioProject ID PRJNA767781. The human sequences are available on request to L.A.B. and should be used only for GMR studies, based on the participant’s request. The alignments for the UCE tree are available on Figshare (10.6084/m9.figshare.19221693.v1). All animal silhouettes are from PhyloPic (http://phylopic.org/), except for the silhouette of S. scovelli, which was created by J.S. The silhouette of P. troglodytes was created by T. M. Keesey (vectorization) and T. Hisgett (photography), and that of S. harrisii was created by S. Werning; both are available under a CC-BY 3.0 licence (https://creativecommons.org/licenses/by/3.0/). The other silhouettes are available under a Public Domain Mark 1.0 licence.
The bioinformatics pipeline to analyse the genomes and all other data analyses are available on GitHub (https://github.com/lucieabergeron/vertebrate_rate).