Abstract
Spatial patterns of phylogenetic diversity (PD) are increasingly becoming relevant for conservation decisions. PD measures are based on phylogenies estimated from molecular data. This paper addresses the question of how different molecular markers impact PD spatial patterns. We first conducted a simple simulation to explore the effect of deep and shallow changes in topology (simulating variations in molecular markers), using ultrametric and non-ultrametric trees, and then used a dataset of Chilean flora with four sets of markers to assess potential differences in spatial patterns of PD ranks using different markers and types of trees. The simulation consistently showed that the difference in PD rank was lower for ultrametric trees than for phylograms. A similar trend was observed using the Chilean flora dataset, with among-markers variability in spatial patterns of the PD metrics lower for ultrametric than for non-ultrametric trees, depicted as top 2.5 and 5% hotspots. Frequency distribution of PD values differed among markers as well, with this variation less apparent for ultrametric trees. We conclude that the choice of markers impacts spatial patterns of PD, and these results vary more strongly for phylograms, suggesting that ultrametric trees are more robust to the choice of marker.
Keywords: Chile, ultrametric tree, endemism, phylogram, phylodiversity, vascular flora
1. Introduction
Spatial patterns of phylogenetic diversity (PD, [1]) and other phylogeny-based measures [2–4] have become increasingly relevant for the prioritization of conservation areas [5–7]. For a given study region and taxon set, spatial patterns of PD depend on phylogenetic tree topology and distribution of branch lengths, the richness of the communities and the geographical distribution of taxa. In general, studies on spatial patterns of PD employ available molecular data to infer the phylogeny used to generate estimates of PD. However, little is known about the effects of different choices of molecular data on PD estimates. By impacting phylogenetic inference (topology and branch lengths), different molecular markers might lead to different geographical distributions of branch lengths, thus influencing spatial patterns of PD. Indeed, in a recent study of a small lineage of Brassicaceae [8], spatial patterns of PD appeared to be influenced by the choice of molecular data. However, these results still await generalization.
Among-taxa sequence variation differs among molecular markers, thus influencing estimates of branch lengths, which in turn may affect PD estimates. In a phylogram, these effects might be more sensitive to topological changes at deeper nodes than at shallow ones. In a cladogram (a tree with all branches of the same length), these effects are proportional to the depth of the most recent common ancestor (MRCA) of all taxa in a community. Changes in spatial patterns of PD are potentially negligible if topological changes occur at shallow nodes [9,10], though phylogenetic uncertainty, most common at shallow nodes, may have effects on PD estimates [11]. On the other hand, it has been shown that, for the same taxon set and molecular markers, PD estimates based on phylograms might differ dramatically from PD estimates based on ultrametric trees [12]. Since root-tip distances in a phylogeny can vary much more across taxa in a phylogram than in an ultrametric tree, changes in topology of ultrametric trees should translate into smaller changes in PD compared to phylograms. Therefore, one should expect spatial patterns of PD to be less sensitive to the choice of molecular marker when PD estimates are based on ultrametric trees.
Here we evaluate whether different sets of molecular markers and the associated phylogenetic inferences influence the spatial distribution of PD. We aimed at addressing the following specific questions: (1) do phylogenies based on different molecular markers and same taxon set produce different spatial patterns of PD measures? (2) Is there an influence of the type of tree (phylogram versus ultrametric tree) on estimates of spatial patterns of PD measures when using phylogenies based on different molecular markers?
2. Material and methods
(a) Simulation
In order to explore the theoretical behaviour of PD with phylogenies derived from different markers and with ultrametric and non-ultrametric trees, we conducted a simple simulation in R v. 4.3.2 [13]. We first defined two tree topologies with five and 10 tips each and generated 1000 sets of random branch lengths independently over each of these topologies using compute.brlen of the ape package v. 5.7-1 [14]. Then the topology of each phylogeny with five tips was modified by swapping two closely related tips (affecting shallow nodes) and two distantly related tips (affecting deep nodes). Likewise, the topology of each phylogeny with 10 tips was modified as above but swapping four tips. Each of the resulting phylogenies with branch lengths was then ultrametrized with the penalized likelihood method [15], which uses a relaxed correlated clock model. We employed the function chronos of the package ape, setting the lambda parameter to 0.01 and leaving all remaining parameters as default, which renders ultrametric trees with a depth of 1. In total, 12 groups of 1000 phylogenies were generated, each corresponding to a combination of number of tips (five or 10), topology (original, shallow and deep) and type of tree (phylogram and ultrametric). In parallel, we defined a fixed distribution of each tip in five communities. For each community, Faith PD was calculated based on each phylogeny with the R-package picante v. 1.8.2 [16] and the communities ordered according to their PD rank. Finally, the difference between maximum and minimum PD rank values was computed for each community and group of phylogenies. The R script used for this simulation is available from the Zenodo Repository: https://doi.org/10.5281/zenodo.10524983 [17].
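The rank-difference step of the simulation can be illustrated with a minimal sketch. It is written in plain Python rather than in the R packages used for the study, it covers only the five-tip original topology as a phylogram, and it draws branch lengths from a uniform distribution, which is an assumption made here for illustration. The internal node labels, the function names and the tie-handling rule are hypothetical, and the ultrametrization step is not reproduced.

```python
import random

# Child -> parent map for the five-tip topology (((A,B),(C,D)),E) of Table 1.
# Internal node names (n1, n2, n3, root) are hypothetical labels.
TOPOLOGY = {
    "A": "n1", "B": "n1", "C": "n2", "D": "n2",
    "n1": "n3", "n2": "n3", "n3": "root", "E": "root",
}

# Tips present in each of the five communities (upper panel of Table 1, tips A-E).
COMMUNITIES = {
    1: {"A", "B"},
    2: {"A", "C", "D", "E"},
    3: {"A", "B", "D", "E"},
    4: {"B", "C", "D"},
    5: {"C", "D", "E"},
}


def faith_pd(parent, length, taxa):
    """Sum the lengths of all branches linking the given tips to the root."""
    seen = set()
    for tip in taxa:
        node = tip
        # Walk towards the root, stopping at branches already counted.
        while node in parent and node not in seen:
            seen.add(node)
            node = parent[node]
    return sum(length[node] for node in seen)


def pd_ranks(pd_by_community):
    """Rank 1 is the highest PD; tied communities share the best rank (assumption)."""
    return {
        c: 1 + sum(other > v for other in pd_by_community.values())
        for c, v in pd_by_community.items()
    }


def rank_range(n_trees=1000, seed=1):
    """Difference between maximum and minimum PD rank of each community."""
    rng = random.Random(seed)
    lowest = {c: len(COMMUNITIES) for c in COMMUNITIES}
    highest = {c: 1 for c in COMMUNITIES}
    for _ in range(n_trees):
        # One random branch length per branch (uniform on [0, 1), an assumption).
        length = {node: rng.random() for node in TOPOLOGY}
        pd = {c: faith_pd(TOPOLOGY, length, taxa) for c, taxa in COMMUNITIES.items()}
        for c, r in pd_ranks(pd).items():
            lowest[c] = min(lowest[c], r)
            highest[c] = max(highest[c], r)
    return {c: highest[c] - lowest[c] for c in COMMUNITIES}


if __name__ == "__main__":
    print(rank_range())
```

The returned values are not expected to reproduce Table 1, since the branch-length distribution and tie handling are illustrative choices.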
(b) Analysis of phylogenetic diversity metrics of the Chilean flora
We used the dataset published by Scherson et al. [18], which includes a phylogeny of 754 (87%) Chilean native vascular plant genera and an associated occurrence database (141 222 records). The phylogeny is based on seven molecular markers (18S, atpB, ITS, matK, ndhF, rbcL, trnL-trnF) each with a different coverage. In order to have a comparable dataset, we first selected the four markers with the highest coverage, namely ITS, matK, rbcL and trnL-trnF and reduced the taxon matrix and the occurrence database to those genera having all four markers available. This reduced dataset contained 220 genera and 52 097 occurrence records.
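The reduction to genera with complete marker coverage amounts to a simple filter, sketched below in Python. The genus names, accession placeholders, record fields and function names are all hypothetical and serve only to show the logic; they are not taken from the dataset.

```python
# The four markers retained in the reduced dataset.
MARKERS = ("ITS", "matK", "rbcL", "trnL-trnF")


def genera_with_all_markers(accessions, markers=MARKERS):
    """Return the genera that have a non-empty accession for every marker."""
    return {
        genus
        for genus, by_marker in accessions.items()
        if all(by_marker.get(m) for m in markers)
    }


def filter_occurrences(records, genera):
    """Keep only the occurrence records of the retained genera."""
    return [rec for rec in records if rec["genus"] in genera]


# Hypothetical inputs: placeholder accession identifiers, not real GenBank numbers.
accessions = {
    "GenusA": {"ITS": "acc01", "matK": "acc02", "rbcL": "acc03", "trnL-trnF": "acc04"},
    "GenusB": {"ITS": "acc05", "rbcL": "acc06"},
    "GenusC": {"ITS": "acc07", "matK": "acc08", "rbcL": "acc09", "trnL-trnF": ""},
}
records = [
    {"genus": "GenusA", "lat": -33.5, "lon": -70.6},
    {"genus": "GenusB", "lat": -39.8, "lon": -73.2},
]

kept = genera_with_all_markers(accessions)
kept_records = filter_occurrences(records, kept)
```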
We downloaded the sequences from GenBank in R [13] using the accession numbers from Scherson et al. [18] and the package rentrez v. 1.2.2 [19]. Each matrix was aligned with mafft v. 7.310 [20] using the default settings. Phylogenetic analyses were conducted in RAxML v. 7.7.1 [21] without any constraint, for each marker separately as well as for the whole four-marker set. The latter analysis was done with a marker-partitioned matrix and with substitution models unlinked. We used the GTR + G model in all analyses because we did not want the model to be another variable; in addition, GTR is more general than other models and is recommended by the authors of RAxML [21]. The resulting maximum-likelihood (ML) trees were used in all downstream analyses, which were carried out in R. Trees were processed in R using the package ape [14].
PD was calculated for each tree in R using the package picante [16] and a 1 × 1 degree cell grid built with the R package raster v. 2.8-19 [22]. We calculated two different PD metrics: (1) PD as described by Faith [1], corresponding to the sum of the branch lengths connecting the taxa present in a grid cell to the root of the tree. (2) Since PD is mathematically correlated to taxon richness (TR), we calculated richness-corrected PD (PDcorr), which corresponds to normalized PD minus normalized TR as described in Pio et al. [23]. This metric renders negative values when PD is lower than expected by TR and positive values when greater.
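As a worked illustration of the two metrics, the following Python sketch computes Faith PD on a toy tree and then PDcorr as normalized PD minus normalized TR. It is not the picante implementation used in the study. The toy tree, its branch lengths, the cell contents and the function names are hypothetical, and min-max scaling is assumed as the normalization.

```python
# Hypothetical toy tree: child -> (parent, branch length); the root has no entry.
# Topology (((A,B),(C,D)),E) with made-up branch lengths.
TREE = {
    "A": ("n1", 1.0), "B": ("n1", 2.0),
    "C": ("n2", 1.5), "D": ("n2", 0.5),
    "n1": ("n3", 1.0), "n2": ("n3", 2.0),
    "n3": ("root", 0.5), "E": ("root", 3.0),
}


def faith_pd(tree, taxa):
    """Sum the branch lengths connecting the given tips to the root."""
    visited = set()
    total = 0.0
    for tip in taxa:
        node = tip
        # Climb towards the root; shared branches are counted only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


def min_max(values):
    """Scale values to the 0-1 range (normalization assumed for this sketch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pd_corr(pd_values, richness):
    """Richness-corrected PD: normalized PD minus normalized taxon richness."""
    return [p - r for p, r in zip(min_max(pd_values), min_max(richness))]


# Hypothetical grid cells and the tips recorded in each.
cells = {"cell1": ["A", "B"], "cell2": ["A", "C", "E"], "cell3": ["A", "B", "C", "D"]}
pd_values = [faith_pd(TREE, taxa) for taxa in cells.values()]   # 4.5, 9.0, 8.5
richness = [len(taxa) for taxa in cells.values()]               # 2, 3, 4
corrected = pd_corr(pd_values, richness)
```

In this toy case cell2 receives a positive PDcorr (more PD than its richness would suggest) and cell3 a negative one, matching the interpretation given above.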
PD metrics were projected on a map in order to visualize spatial patterns. PD was normalized to a 0–1 range and PDcorr to negative and positive values between −1 and 1. For comparison across sites, all cells were assigned their ranking [24] of the corresponding PD metric. Following Daru et al. [25], we also considered as hotspots of PD metrics the 2.5% and 5% of the cells with the highest values. These spatial analyses were done using the R packages raster, maps v. 3.3.0 [26] and mapdata v. 2.3.0 [27].
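The hotspot definition can be sketched as follows in Python. The cell identifiers, the PD values for the two markers and the function name are hypothetical; rounding the number of hotspot cells and breaking ties at the cut-off arbitrarily are assumptions of this sketch, not details taken from the analysis.

```python
def hotspots(values, fraction):
    """Return the cells holding the top `fraction` of values (at least one cell).

    Ties at the cut-off are resolved by sort order, which is an arbitrary choice.
    """
    n_top = max(1, round(len(values) * fraction))
    ordered = sorted(values, key=values.get, reverse=True)
    return set(ordered[:n_top])


# Hypothetical normalized PD values for 40 grid cells under two markers.
pd_marker_a = {f"cell{i}": i / 39 for i in range(40)}
pd_marker_b = {f"cell{i}": ((i * 7) % 40) / 39 for i in range(40)}

top5_a = hotspots(pd_marker_a, 0.05)     # two cells out of 40
top5_b = hotspots(pd_marker_b, 0.05)
top25_a = hotspots(pd_marker_a, 0.025)   # one cell out of 40

# Cells identified as 5% hotspots under both markers.
shared = top5_a & top5_b
```

Comparing the sets obtained for different markers, as in `shared`, is one simple way to express how much the hotspot maps agree.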
All comparisons were conducted using ultrametric and non-ultrametric trees. Ultrametric trees were generated in R, scaling the tip-to-root length to one with the penalized likelihood algorithm [15] as implemented in the function chronos of the ape package. We used a value of the smoothing parameter lambda = 0.01, which assumes a relatively high among-branch rate variation of the tree.
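The property that distinguishes the two tree types can be checked directly from root-to-tip distances, as in the Python sketch below. It does not perform penalized likelihood and is not a substitute for chronos; it only tests whether a tree is ultrametric. Both toy trees, their branch lengths and the function names are hypothetical.

```python
def root_to_tip(tree, tips):
    """Distance from the root to each tip; tree maps child -> (parent, length)."""
    distances = {}
    for tip in tips:
        total, node = 0.0, tip
        while node in tree:
            parent, length = tree[node]
            total += length
            node = parent
        distances[tip] = total
    return distances


def is_ultrametric(tree, tips, tol=1e-9):
    """True when all root-to-tip distances are equal within a tolerance."""
    d = list(root_to_tip(tree, tips).values())
    return max(d) - min(d) <= tol


TIPS = ["A", "B", "C", "D", "E"]

# Hypothetical phylogram: root-to-tip distances differ among tips.
PHYLOGRAM = {
    "A": ("n1", 1.0), "B": ("n1", 2.0), "C": ("n2", 1.5), "D": ("n2", 0.5),
    "n1": ("n3", 1.0), "n2": ("n3", 2.0), "n3": ("root", 0.5), "E": ("root", 3.0),
}

# Hypothetical ultrametric tree with the same topology and a depth of 1.
ULTRAMETRIC = {
    "A": ("n1", 0.2), "B": ("n1", 0.2), "C": ("n2", 0.4), "D": ("n2", 0.4),
    "n1": ("n3", 0.3), "n2": ("n3", 0.1), "n3": ("root", 0.5), "E": ("root", 1.0),
}
```

In the phylogram the root-to-tip distances range from 2.5 to 4.0, whereas in the ultrametric tree every tip lies at distance 1 from the root.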
The R script used for these calculations and analyses is available at Zenodo (doi:10.5281/zenodo.10524983).
3. Results
The simulation consistently showed that the difference in PD rank across markers was smaller for ultrametric trees than for phylograms regardless of the number of tips (table 1), suggesting that the choice of molecular marker influences spatial distribution of PD derived from phylograms more than from ultrametric trees.
Table 1.
Results of the simulation. The upper panel shows the presence of each tip in the five theoretical communities. The lower panel shows the difference between maximum and minimum PD rank values for each community and tree type for both five-tip and 10-tip phylogenies across 1000 simulated phylogenies. The original and modified topologies are shown on the right. Differences in PD rank for ultrametric trees are highlighted in grey.
| tip presence in communities | community 1 | community 2 | community 3 | community 4 | community 5 |
|---|---|---|---|---|---|
| A | 1 | 1 | 1 | 0 | 0 |
| B | 1 | 0 | 1 | 1 | 0 |
| C | 0 | 1 | 0 | 1 | 1 |
| D | 0 | 1 | 1 | 1 | 1 |
| E | 0 | 1 | 1 | 0 | 1 |
| F | 0 | 1 | 1 | 0 | 1 |
| G | 0 | 1 | 1 | 1 | 1 |
| H | 0 | 1 | 0 | 1 | 1 |
| I | 1 | 0 | 1 | 1 | 0 |
| J | 1 | 1 | 1 | 0 | 0 |

| PD rank differences | community 1 | community 2 | community 3 | community 4 | community 5 | topologies in Newick format |
|---|---|---|---|---|---|---|
| five-tip (A–E) phylogenies | | | | | | |
| phylograms original | 3 | 3 | 3 | 4 | 3 | (((A,B),(C,D)),E); |
| ultrametric original | 0 | 1 | 1 | 0 | 0 | |
| phylograms shallow | 3 | 3 | 3 | 4 | 3 | (((C,B),(A,D)),E); |
| ultrametric shallow | 0 | 0 | 0 | 0 | 0 | |
| phylograms deep | 3 | 3 | 3 | 4 | 3 | (((E,B),(C,D)),A); |
| ultrametric deep | 2 | 1 | 1 | 1 | 1 | |
| 10-tip (A–J) phylogenies | | | | | | |
| phylograms original | 3 | 3 | 3 | 4 | 3 | ((((A,B),(C,D)),E),(F,(G,(H,(I,J))))); |
| ultrametric original | 0 | 1 | 1 | 0 | 0 | |
| phylograms shallow | 3 | 3 | 3 | 4 | 3 | ((((C,B),(A,D)),E),(F,(G,(J,(I,H))))); |
| ultrametric shallow | 0 | 0 | 0 | 0 | 0 | |
| phylograms deep | 3 | 3 | 3 | 4 | 3 | ((((E,B),(C,I)),A),(F,(G,(H,(D,J))))); |
| ultrametric deep | 2 | 1 | 1 | 2 | 2 | |
Spatial patterns of PD and PDcorr in the Chilean flora (for details see electronic supplementary material, figures S1 and S2, respectively [28]) suggest that both PD metrics are influenced by the choice of the molecular marker, as revealed by the distribution of PD hotspots (figure 1). The among-markers variability in spatial patterns of the PD metrics was lower for ultrametric than for non-ultrametric trees. These differences were accentuated when comparing PDcorr, where the use of ultrametric trees did not appear to correct the effect of the molecular marker on the distribution of hotspots (figure 1).
Figure 1.
Maps of Chile showing 2.5% (red) and 5% (yellow) PD hotspots according to different markers, tree type and PD index. (a) PD of phylograms; (b) PD of ultrametric trees; (c) PDcorr of phylograms; (d) PDcorr of ultrametric trees.
Frequency distribution of values differed among markers for both metrics (electronic supplementary material, figures S3 and S4 [28]), both for ultrametric and non-ultrametric trees, although this variation was less apparent for ultrametric trees. In other words, PD resulting from non-ultrametric trees tended to be more concentrated around some specific value than it was when PD was derived from ultrametric trees.
Comparison of PD rank values between markers for the corresponding cells indicated that the choice of molecular marker has little impact on PD for both phylograms and ultrametric trees, although the differences appeared to be lower for ultrametric trees in some marker comparisons (figure 2a). However, when comparing PDcorr, rank values of corresponding cells between markers generally differed for phylograms irrespective of marker comparison (figure 2b, upper panel). These differences in PDcorr were reduced when using ultrametric trees (figure 2b, lower panel).
Figure 2.
Comparison of PD ranks between markers. Each graph depicts the cell-to-cell comparison of PD ranks between two markers for both phylograms (blue) and ultrametric trees (orange) according to PD (a) and PDcorr (b).
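The cell-to-cell pairing of ranks underlying this kind of comparison can be sketched in Python as follows. The PD values and function names are hypothetical, ties are assumed to share the best rank, and the mean absolute rank difference is added here only as an illustrative summary; it is not a statistic reported in this study.

```python
def pd_ranks(values):
    """Rank 1 is the highest value; tied cells share the best rank (assumption)."""
    return {
        cell: 1 + sum(other > v for other in values.values())
        for cell, v in values.items()
    }


def paired_ranks(values_a, values_b):
    """Pair the ranks of each cell under two markers: (cell, rank_a, rank_b)."""
    ranks_a, ranks_b = pd_ranks(values_a), pd_ranks(values_b)
    return [(cell, ranks_a[cell], ranks_b[cell]) for cell in sorted(ranks_a) if cell in ranks_b]


def mean_abs_rank_difference(pairs):
    """Illustrative summary of disagreement between the two rankings."""
    return sum(abs(a - b) for _, a, b in pairs) / len(pairs)


# Hypothetical PD values of three cells under two markers.
pd_marker_a = {"c1": 0.9, "c2": 0.5, "c3": 0.1}
pd_marker_b = {"c1": 0.2, "c2": 0.8, "c3": 0.1}

pairs = paired_ranks(pd_marker_a, pd_marker_b)
disagreement = mean_abs_rank_difference(pairs)
```

Points lying on the diagonal of a rank-versus-rank plot correspond to pairs with equal ranks, such as c3 in this toy example.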
4. Discussion
This is the first study known to us documenting the effects of marker choice on estimates of spatial patterns of PD. Our results suggest that the choice of marker influences spatial patterns of PD metrics, even though this is less pronounced when using ultrametric trees. Elliott et al. [12] recently showed that the choice of phylogram versus ultrametric tree (chronogram) greatly influences the spatial patterns of phylodiversity metrics in several taxa. Fewer differences between the use of chronogram versus phylogram were found by Allen et al. [29] when comparing spatial patterns of PD for the vascular plants of Florida. They did, however, see some differences depending on the source of the phylogeny used, especially when using a Phylomatic tree, highlighting the importance of how branch lengths are obtained from source trees. When inferring ultrametric trees, among-branch rate variation may lead to errors in PD estimates with respect to true values, regardless of the method employed to accommodate rate variation [30]. For areas with taxa that have been less studied in phylogenetic analyses, the use of guide trees such as the Open Tree of Life (OTL) or Phylomatic is less of an option [29]. The source tree (guide trees versus trees from original sequences) upon which PD metrics are calculated therefore becomes more important.
Our results also suggest that spatial patterns of PDcorr are more affected by marker choice than those of PD (compare figures 1 and 2). This is to be expected since PD is known to be highly correlated with richness [23]; spatial patterns of PD will thus be determined, at least to some extent, by richness regardless of which phylogeny was used to calculate PD. When the effects of richness are removed in order to calculate PDcorr, the effects of the phylogeny are fully expressed and the spatial distribution of PDcorr hotspots varies dramatically among markers, especially for phylograms (figure 1c,d).
Fewer differences in spatial patterns of both PD and PDcorr resulting from the use of ultrametric trees compared to phylograms may have a biological meaning. Kling et al. [6] argue that while phylograms reflect lineage genetic, morphological or functional divergence, ultrametric trees are rather related to accumulation of evolutionary time. Accepting this argument, our results show that estimates of lineage survival time are less prone to be affected by the choice of molecular marker. Identification of areas of evolutionary dynamism or stability, as suggested by Kling et al. [6], will then differ among phylogenies inferred from different molecular markers. While we agree with Kling et al. [6] and other authors that PD estimates derived from different tree types should be analysed comparatively in PD studies [12,29], we have shown that the effects of marker choice are ameliorated when using ultrametric trees.
Most studies use a combination of molecular markers, making it difficult to separate the effect of each marker on topology and/or branch lengths. In addition, more and more phylogenies derived from next-generation sequencing (NGS) are becoming available for PD and other biodiversity metrics [31,32], in which case distinction among different markers may make no sense. Considering this, our study suggests that the use of ultrametric trees may help ameliorate the effect of marker choice, though these comparisons might be affected if a Bayesian, uncorrelated approach is employed to build an ultrametric tree, because the resulting branch length estimates will be less similar than those resulting from the PL approach. The results of this study are based on a specific geographical region. Generalizing these patterns will require replicating the analyses in other geographical areas.
Acknowledgements
We are grateful to Joel H. Nitta and two anonymous reviewers for helpful comments on an earlier version of the manuscript.
Contributor Information
Federico Luebert, Email: fluebert@u.uchile.cl.
Rosa A. Scherson, Email: rositascherson@uchile.cl.
Ethics
This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility
Data used in this study were obtained from Scherson et al.: http://dx.doi.org/10.1016/j.ympev.2017.04.021 [18], and GenBank. Code, geodata and tree-files are available from the Zenodo repository: https://zenodo.org/records/10524983 [17].
Electronic supplementary material figures S1 to S4 are provided in the electronic supplementary material [28].
Declaration of AI use
We have not used AI-assisted technologies in creating this article.
Authors' contributions
F.L.: conceptualization, formal analysis, investigation, methodology, writing—original draft, writing—review and editing; R.A.S.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, writing—original draft, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
This work was supported by Fondecyt (grant nos 1171586 and 1230598).
References
- 1. Faith DP. 1992. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1-10. (doi:10.1016/0006-3207(92)91201-3)
- 2. Butlin RK, Galindo J, Grahame JW. 2008. Sympatric, parapatric or allopatric: the most important way to classify speciation? Phil. Trans. R. Soc. B 363, 2997-3007. (doi:10.1098/rstb.2008.0076)
- 3. Scherson RA. 2018. Medidas basadas en filogenias como argumentos de selección de taxa y áreas para la conservación aplicadas a la flora nativa de Chile. In Metodologías aplicadas para la conservación de la biodiversidad en Chile (eds Pérez-Quezada JF, Rodrigo P), pp. 161-183. Santiago, Chile: Facultad de Ciencias Agronómicas, Universidad de Chile.
- 4. Winter M, Devictor V, Schweiger O. 2013. Phylogenetic diversity and nature conservation: where are we? Trends Ecol. Evol. 28, 199-204. (doi:10.1016/j.tree.2012.10.015)
- 5. Daru BH, van der Bank M, Maurin O, Yessoufou K, Schaefer H, Slingsby JA, Davies TJ. 2016. A novel phylogenetic regionalization of phytogeographical zones of southern Africa reveals their hidden evolutionary affinities. J. Biogeogr. 43, 155-166. (doi:10.1111/jbi.12619)
- 6. Kling MM, Mishler BD, Thornhill AH, Baldwin BG, Ackerly DD. 2019. Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Phil. Trans. R. Soc. B 374, 20170397. (doi:10.1098/rstb.2017.0397)
- 7. Laity T, et al. 2015. Phylodiversity to inform conservation policy: an Australian example. Sci. Total Environ. 534, 131-143. (doi:10.1016/j.scitotenv.2015.04.113)
- 8. Toro-Núñez O, Lira-Noriega A. 2020. Discordant phylogenetic endemism patterns in a recently diversified Brassicaceae lineage from the Atacama Desert: when choices in phylogenetics and species distribution information matter. J. Biogeogr. 47, 1792-1804. (doi:10.1111/jbi.13846)
- 9. Maliet O, Gascuel F, Lambert A. 2018. Ranked tree shapes, nonrandom extinctions, and the loss of phylogenetic diversity. Syst. Biol. 67, 1025-1040. (doi:10.1093/sysbio/syy030)
- 10. Swenson NG. 2009. Phylogenetic resolution and quantifying the phylogenetic diversity and dispersion of communities. PLoS ONE 4, e4390. (doi:10.1371/journal.pone.0004390)
- 11. Debastiani VJ, Bastazini VAG, Pillar VD. 2021. Phylogenetic uncertainty and the inference of patterns in community ecology and comparative studies. Oecologia 196, 633-647. (doi:10.1007/s00442-021-04972-1)
- 12. Elliott MJ, Knerr NJ, Schmidt-Lebuhn AN. 2018. Choice between phylogram and chronogram can have a dramatic impact on the location of phylogenetic diversity hotspots. J. Biogeogr. 45, 2190-2201. (doi:10.1111/jbi.13399)
- 13. R Core Team. 2021. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
- 14. Paradis E, Claude J, Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289-290. (doi:10.1093/bioinformatics/btg412)
- 15. Sanderson MJ. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol. Biol. Evol. 19, 101-109. (doi:10.1093/oxfordjournals.molbev.a003974)
- 16. Kembel SW, Cowan PD, Helmus MR, Cornwell WK, Morlon H, Ackerly DD, Blomberg SP, Webb CO. 2010. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26, 1463-1464. (doi:10.1093/bioinformatics/btq166)
- 17. Luebert F, Scherson RA. 2024. Data from: Choice of molecular marker influences spatial patterns of phylogenetic diversity. Zenodo. (https://zenodo.org/records/10524983)
- 18. Scherson RA, Thornhill AH, Urbina-Casanova R, Freyman WA, Pliscoff PA, Mishler BD. 2017. Spatial phylogenetics of the vascular flora of Chile. Mol. Phylogenet. Evol. 112, 88-95. (doi:10.1016/j.ympev.2017.04.021)
- 19. Winter D. 2017. rentrez: Entrez in R. R package version 1.2.2. See https://CRAN.R-project.org/package=rentrez.
- 20. Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059-3066. (doi:10.1093/nar/gkf436)
- 21. Stamatakis A, Hoover P, Rougemont J. 2008. A rapid bootstrap algorithm for the RAxML web servers. Syst. Biol. 57, 758-771. (doi:10.1080/10635150802429642)
- 22. Hijmans RJ. 2016. raster: Geographic data analysis and modeling. R package version 2.8-19. See https://CRAN.R-project.org/package=raster.
- 23. Pio DV, Broennimann O, Barraclough TG, Reeves G, Rebelo AG, Thuiller W, Guisan A, Salamin N. 2011. Spatial predictions of phylogenetic diversity in conservation decision making. Conserv. Biol. 25, 1229-1239. (doi:10.1111/j.1523-1739.2011.01773.x)
- 24. Cardoso P, Silva I, de Oliveira NG, Serrano ARM. 2004. Higher taxa surrogates of spider (Araneae) diversity and their efficiency in conservation. Biol. Conserv. 117, 453-459. (doi:10.1016/j.biocon.2003.08.013)
- 25. Daru BH, van der Bank M, Davies TJ. 2015. Spatial incongruence among hotspots and complementary areas of tree diversity in southern Africa. Divers. Distrib. 21, 769-780. (doi:10.1111/ddi.12290)
- 26. Becker RA, Wilks AR, Brownrigg R, Minka TP, Deckmyn A. 2017. maps: Draw Geographical Maps. R package version 3.3.0. See https://CRAN.R-project.org/package=maps.
- 27. Becker RA, Wilks AR, Brownrigg R. 2016. mapdata: Extra Map Databases. R package version 2.3.0. See https://CRAN.R-project.org/package=mapdata.
- 28. Luebert F, Scherson RA. 2024. Supplementary figures to: Choice of molecular marker influences spatial patterns of phylogenetic diversity. Figshare. (doi:10.6084/m9.figshare.c.7109006)
- 29. Allen JM, et al. 2019. Spatial phylogenetics of Florida vascular plants: the effects of calibration and uncertainty on diversity estimates. iScience 11, 57-70. (doi:10.1016/j.isci.2018.12.002)
- 30. Ritchie AM, Hua X, Cardillo M, Yaxley KJ, Dinnage R, Bromham L. 2021. Phylogenetic diversity metrics from molecular phylogenies: modelling expected degree of error under realistic rate variation. Divers. Distrib. 27, 164-178. (doi:10.1111/ddi.13179)
- 31. Ahrendsen DL, Aust SK, Roxanne Kellar P. 2016. Biodiversity assessment using next-generation sequencing: comparison of phylogenetic and functional diversity between Nebraska grasslands. Plant Syst. Evol. 302, 89-108. (doi:10.1007/s00606-015-1246-6)
- 32. Soltis DE, Soltis PS. 2016. Mobilizing and integrating big data in studies of spatial and phylogenetic patterns of biodiversity. Plant Divers. 38, 264-270. (doi:10.1016/j.pld.2016.12.001)


