Abstract
Species tree estimation from genes sampled from throughout the whole genome is challenging because of gene tree discordance, often caused by incomplete lineage sorting (ILS). Quartet-based summary methods for estimating species trees from a collection of gene trees are becoming popular due to their high accuracy and theoretical guarantees of robustness to arbitrarily high amounts of ILS. ASTRAL, the most widely used quartet-based method, aims to infer species trees by maximizing the number of quartets in the gene trees consistent with the species tree. An alternative approach is inferring quartets for all subsets of four species and amalgamating them into a coherent species tree. While summary methods can be sensitive to gene tree estimation error, quartet amalgamation offers an advantage by potentially bypassing gene tree estimation. However, greatly understudied is the choice of weighted quartet inference method and downstream effects on species tree estimations under realistic model conditions. In this study, we investigated a wide array of methods for generating weighted quartets and critically assessed their impact on species tree inference. Our study provides evidence that the careful generation and amalgamation of weighted quartets, as implemented in methods like wQFM, can lead to significantly more accurate trees than popular methods like ASTRAL, especially in the face of gene tree estimation errors.
Keywords: species tree, gene tree, gene tree discordance, incomplete lineage sorting, summary methods, quartets, quartet amalgamation
Significance.
Despite the growing awareness that quartets, weighted based on their relative importance, can improve phylogenomic analyses, the methods for generating and leveraging them have been insufficiently explored. In this study, we present the first comprehensive evaluation of a range of computational strategies for generating and utilizing weighted quartets in species tree estimation. Our findings reveal that appropriately selected weighting schemes lead to the best known accuracy in species tree inference, significantly surpassing existing methods.
Introduction
The estimation of species trees using multiple loci has become increasingly common. However, species tree estimation from multilocus data sampled from throughout the whole genome is difficult because different loci can have different phylogenetic histories, a phenomenon known as gene tree discordance which occurs due to several different biological processes, including incomplete lineage sorting (ILS), gene duplication (GD), and horizontal gene transfer (HGT) (Maddison 1997; Degnan and Rosenberg 2009). In particular, many groups of species evolve with rapid speciation events, a process that tends to create conflicts between gene trees and species trees due to ILS (Jarvis et al. 2014), which is modeled by the multispecies coalescent model (Jiao et al. 2021).
In the presence of gene tree discordance, standard methods for estimating species trees, such as concatenation (which concatenates multiple sequence alignments of different genes into a single supermatrix, which is then used to estimate the species tree), can be statistically inconsistent (Degnan et al. 2009; Roch and Steel 2015), and produce incorrect trees with high support (Kubatko and Degnan 2007). Therefore, a two-step process, where gene trees are first inferred independently from sequence data and then combined using “summary methods” to estimate species trees by summarizing the input gene trees, is becoming increasingly popular, and many of them are provably statistically consistent under ILS given sufficiently large numbers of correct gene trees (Snir and Rao 2010; Zhang 2011; Bayzid and Warnow 2012; Mirarab et al. 2014c; Reaz et al. 2014; Avni et al. 2015; Islam et al. 2020; Mahbub et al. 2021; Mim et al. 2023).
Quartet-based summary methods have gained significant attention because quartets (4-leaf unrooted gene trees) avoid the “anomaly zone”—a scenario where the most likely gene tree topology may differ from the species tree topology (Degnan and Rosenberg 2006, 2009; Degnan 2013). ASTRAL, the most widely used summary method, takes a set of gene trees to infer a species tree that maximizes the number of quartets in the gene trees that are consistent with the species tree. Another class of methods, including wQFM and wQMC (known as quartet amalgamation techniques), involves inferring individual quartets for each set of four taxa and then combining them into a cohesive species tree. A major challenge in summarizing gene trees is that when we infer gene trees, often from relatively short sequences, the estimated gene trees tend to be highly error-prone—making summary methods sensitive to gene tree estimation error. A broader impact and a clear benefit of the quartet amalgamation-based approach over ASTRAL is that it can be used outside the context of gene tree estimation. Chifman and Kubatko introduced SVDquartets (Chifman and Kubatko 2014), a popular quartet-based method, which avoids estimating trees on each locus and hence bypasses the issue of gene tree estimation error. It combines multilocus unlinked single-site data, infers the quartet trees for all subsets of four species, and then combines the set of quartet trees into a species tree using a quartet amalgamation heuristic such as QFM (Reaz et al. 2014 ) or QMC (Snir and Rao 2012). SVDquartets has been shown to be statistically consistent under the coalescent model (Wascher and Kubatko 2020). It has been implemented in the PAUP* software (Swofford 2002; Swofford and Kubatko 2023). There is ample evidence that assigning weights to quartets—where the weight of a quartet denotes the relative confidence of a particular quartet topology out of the three alternate topologies on a set of four taxa—can enhance phylogenetic analyses (Avni et al. 2015 ; Mahbub et al. 2021) despite the presence of gene tree estimation error. Zhang et al. showed that wASTRAL, the variant of ASTRAL that assigns weights to quartets based on gene tree uncertainty, outperforms the unweighted version of ASTRAL on simulated data in terms of its topology and branch support values (Zhang and Mirarab 2022). Casanellas et al. proposed a new weighting system for quartets based on algebraic and semialgebraic tools (Casanellas et al. 2023), which is specially tailored to deal with data evolving under heterogeneous rates.
Despite the popularity of quartet-based methods and the growing awareness that appropriate weighting schemes of quartets can improve phylogenomic analyses, a greatly understudied aspect is the way weighted quartets are estimated before summarizing/amalgamating them. In this study, we address this gap by computing weighted quartets using a broad range of meaningful ways, using widely used Bayesian, maximum likelihood (ML), and statistical tools, including MrBayes (Huelsenbeck and Ronquist 2001), BUCKy (Ané et al. 2007; Larget et al. 2010), RAxML (Stamatakis 2006), and SVDquartets (Chifman and Kubatko 2014). We compare these weighted quartet generation strategies that significantly differ in their underlying statistical frameworks and computational demands. We combine these various quartet distribution generation methods with popular quartet amalgamation techniques like wQFM and wQMC. We performed an extensive experimental study and compared these techniques with the leading species tree estimation methods such as ASTRAL, BUCKy, TREE-QMC (Han and Molloy 2023), and SVDquartets. Specifically, our study addresses the following research questions (RQs), with RQ1 as the primary focus.
RQ1: Propose and explore a wide range of ways to generate quartet distributions and assess their performance in terms of species tree accuracy.
RQ2: Investigate the relative performance of the most popular quartet amalgamation techniques, wQFM and wQMC, when paired with various quartet distribution generation approaches.
RQ3: Compare the relative performance of the promising approaches identified in RQ1 and RQ2 with leading quartet-based species tree estimation methods (e.g. ASTRAL, TREE-QMC, SVDquartets, and BUCKy).
RQ4: Determine if the quartet scores of species trees (defined to be the number of quartets from the set of input gene trees that agree with the species tree) estimated by statistically consistent methods are predictive of species tree accuracy under practical model conditions (i.e. limited number of gene trees and low phylogenetic signal per gene).
RQ5: Evaluate the performance of the best methods identified in RQ1–RQ3 on real biological datasets.
We systematically investigated each research question through extensive analysis on a diverse collection of simulated and biological datasets. Our results show that leveraging all possible quartet topologies, along with their respective weights, yields better results compared to considering only the dominant quartet topologies. Furthermore, instead of using a point estimate gene tree as input, e.g. BestML gene tree, utilizing a distribution of gene trees, preferably estimated using Bayesian MCMC techniques, yields far superior results. We also observed that wQFM usually outperforms ASTRAL, wASTRAL, and TREE-QMC when paired with such distributions of input trees.
Results
In this section, we present the findings related to the research questions (RQs) explored in this study. For each research question, we compare the methods that we consider most appropriate for addressing that specific question.
RQ1: Finding the Most Appropriate Quartet Distributions
In RQ1, we have four separate experiments to assess the impact of different strategies for generating quartet distributions on species tree estimation.
Experiment 1: What is the most effective approach for generating gene tree frequency (GTF)-based quartet distributions: using only dominant quartets (with or without weights) or incorporating all quartets?
Experiment 2: Should weighted quartets be derived solely from best maximum likelihood gene trees (BestML), or gene tree uncertainty be accounted for by using distributions of gene trees obtained via nonparametric bootstrapping or Bayesian sampling?
Experiment 3: Does the weighted setting of SVDquartets lead to improved phylogenies?
Experiment 4: How do ML-based and BUCKy-based quartet distributions compare to GTF-based quartet distributions?
RQ1—Experiment 1: Dominant Quartets vs all Quartets
In this experiment, we compare the effects of considering only the dominant quartets (with or without weights) as opposed to considering all possible quartet trees along with their corresponding weights (See Materials and Methods section for details). We analyzed these three different quartet distribution generation strategies (e.g. GTF-dom, GTF-dom (without weights), and GTF-all, etc.; see Table 4 in Materials and Methods section for details) using two leading weighted quartet amalgamation techniques, wQFM and wQMC.
Table 4.
Brief descriptions of the quartet generation strategies explored in this study. These quartets are amalgamated by wQFM and wQMC to estimate species trees.
| Method | Description |
|---|---|
| wQFM-GTF-all | wQFM-GTF refers to wQFM with quartets weighted based on gene tree frequencies (GTF), which is the number of occurrences of a quartet in the input gene trees. wQFM-GTF-all considers all three alternative topologies for each set of four taxa. Therefore, unlike SVDquartets and BUCKy-pop, which consider only the dominant quartet for each set of four taxa, wQFM-GTF-all computes weights for every possible four-leaf tree (and so for each of the three possible unrooted trees for every four leaves), and then combines this set of weighted quartet trees into a tree on the full set of species. |
| wQFM-GTF-dom | This method considers only the dominant quartets (i.e. the quartet topology with the highest gene tree frequency) and their weights for each set of four taxa. |
| wQFM-GTF-dom (without weights) | Similar to wQFM-GTF-dom, only the dominant quartets are taken into consideration, but weights of the quartets are ignored (i.e. each dominant quartet is assigned unit weight). |
| wQFM-GTF-BS | This approach refers to wQFM-GTF-all when run on bootstrap (BS) gene tree distributions for every gene computed by RAxML. We use nonparametric bootstrapping using RAxML to produce 200 trees for each gene. We then count GTF for all possible quartets based on all these BS trees. The motivation behind using a BS gene tree distribution instead of single best tree estimate for each gene is that a larger pool of trees is likely to augment the quartets with more meaningful weights and better address gene tree uncertainty. |
| wQFM-GTF-MB | Similar to wQFM-GTF-BS, but here the set of trees for each gene is computed using MrBayes (by performing Bayesian analysis on its MSA) instead of RAxML. MrBayes was run twice, each for 55,000 generations, and we sample a total of 200 gene trees, excluding burn-in samples. |
| wQFM-SVD-Exp | SVDquartets has two modes for generating weighted quartets: (i) exponential and (ii) reciprocal. wQFM-SVD-Exp refers to wQFM when run on weighted quartets computed by SVDquartets using the exponential weighting scheme. |
| wQFM-SVD-Rec | Similar to wQFM-SVD-Exp, but the weights are computed using the reciprocal weighting format. |
| wQFM-RAxML | It refers to wQFM with quartets weighted based on the likelihood scores computed by RAxML (using the fast quartet calculator option) given the combined alignment of the gene sequences. |
| wQFM-BUCKy-MB | wQFM with quartets weighted based on CFs, generated by BUCKy when a set of trees generated by MB is provided as input. As the concordance factor of a quartet reflects the percentage of gene trees that support it, it is a natural choice as a weight for that quartet. |
| wQFM-BUCKy-RAxML | Similar to wQFM-BUCKy-MB except that the input gene trees to BUCKy are generated by RAxML bootstrapping instead of MrBayes. |
These quartets are amalgamated by wQFM and wQMC to estimate species trees.
The performance of six methods (three quartet generation strategies coupled with both wQFM and wQMC) on the 15-taxon dataset is shown in Fig. 1, where we investigated the performance on varying gene tree estimation errors using 100 bp and 1,000 bp sequence lengths and on varying numbers of genes (100 and 1,000). Across all model conditions, except for the easiest one (1,000 bp–1,000 gt), where all methods perform very well, GTF-all is significantly better than GTF-dom—suggesting that using all quartets is better than using only the dominant ones. Moreover, as shown in previous studies (Mahbub et al. 2021), wQFM is, in general, more accurate than wQMC. Interestingly, wQMC achieved similar accuracies with dominant quartets both with and without weights. In contrast, wQFM performed better with weighted dominant quartets when the number of genes was relatively small, but performed better without weights as the number of genes increased.
Fig. 1.
RQ1 (Experiment 1): Results on the 15-taxon dataset. Comparison of performance when using all quartets with weights, as opposed to using only the dominant quartets with and without weights. We show the average RF rates averaged over 10 replicates with standard errors.
We further investigated the performance of these six approaches on the 37-taxon dataset under various model conditions with varying amounts of ILS, numbers of genes, and gene tree estimation errors (Fig. S1). Similar trends, as found on 15-taxon datasets, were observed for this dataset as well. All methods showed improved accuracy as we increased the sequence length and number of gene trees and decreased the amount of ILS. wQFM-GTF-all outperformed other methods across all model conditions. Identical patterns were observed on the 11-taxon dataset (see Fig. S2).
In light of these results, we will only consider the GTF-all strategy in the subsequent comparisons, while excluding GTF-dom. From now on, we will denote the wQFM and wQMC-based results by wQFM-GTF and wQMC-GTF.
RQ1—Experiment 2: BestML Gene Trees vs Bootstrap/Bayesian Distribution of Gene Trees
In this experiment, we compare the effects of considering the BestML tree of a gene vs a distribution of gene trees for each gene obtained through Bayesian analysis using MrBayes (GTF-MB) or through nonparametric bootstrap with RAxML (GTF-BS). All three weight generation techniques were paired with both wQFM and wQMC.
Results on the 15-taxon datasets are shown in Fig. 2. For the 100 bp (shorter sequence length) model conditions, methods using distributions of gene trees as input outperformed those using BestML gene trees. Additionally, Bayesian tree distributions resulted in more accurate species trees than ML bootstrapping trees. Notably, wQFM-GTF-MB achieved the highest accuracy across all model conditions, showing substantial improvements over other methods. As the sequence length increased to 1,000 bp, Bayesian tree distributions continued to produce the best results, whereas the RAxML bootstrapping-based approach performed worse than BestML, regardless of the number of genes.
Fig. 2.
RQ1 (Experiment 2): Results on the 15-taxon dataset. We compare methods utilizing weighted quartets generated from BestML gene trees (GTF), bootstrap distribution of gene trees (GTF-BS), and Bayesian distribution of gene trees (GTF-MB).
Similar trends were observed on the 37-taxon dataset (Fig. S3), with the exception that wQFM-GTF-BS (RAxML) consistently produced less accurate trees than wQFM-GTF, even under model conditions with shorter sequence lengths. As observed in other experiments, wQFM produced notably better trees than wQMC. Although wQFM-GTF-MB is better than wQFM-GTF across all model conditions, their difference tends to decrease as we increase the number of genes, the amount of ILS, and the sequence length. The superiority of wQFM-GTF over wQFM-GTF-BS indicates that BestML gene trees tend to produce more accurate quartet tree distributions than nonparametric bootstrapping using RAxML, supporting prior evidence that BestML produces better trees than BS (Mirarab et al. 2014b). On the other hand, the enhanced performance of wQFM-GTF-MB over wQFM-GTF suggests that Bayesian gene tree distributions produce more accurate quartet weights than BestML trees.
To further investigate this, we compared the quartet distributions computed based on, BestML, bootstrapping (BS), and MrBayes (MB) gene trees with those of true gene trees. We measure the divergence between true quartet distributions (computed from true gene trees) and different sets of quartet distributions in estimated gene trees (e.g. BestML, BS, and MB) in terms of the number of dominant quartets that differ between two quartet distributions. Note that, for a set of n taxa, there are different four-taxa sets, and thus dominant quartets. In Table 1, we show the number of dominant quartets in BestML, BS, and MB sets of gene trees that differ from the dominant quartets in true gene trees. MB trees yield the least number of mismatches in dominant quartets with respect to true dominant quartets. Similarly, the mismatches for BestML are less than BS on the model conditions where wQFM-GTF is better than wQFM-GTF-BS. These are aligned with the relative species tree accuracies of wQFM-GTF-MB, wQFM-GTF-BS, and wQFM-GTF. This supports prior evidence that BestML tends to produce better trees than BS. But we did not explore multilocus bootstrapping (MLBS).
Table 1.
Number of mismatched dominant quartets with respect to the true gene trees for different model conditions averaged over all replicates.
| Taxa | Model condition | wQFM-GTF | wQFM-GTF-BS | wQFM-GTF-MB |
|---|---|---|---|---|
| 11 | lower-ILS | 19 | 18 | 13 |
| higher-ILS | 77 | 54 | 54 | |
| 15 | 100gene-100 bp | 152 | 109 | 100 |
| 100gene-1,000 bp | 30 | 41 | 16 | |
| 1,000gene-100 bp | 73 | 66 | 52 | |
| 1,000gene-1,000bp | 6 | 26 | 1 | |
| 37 | 1×-200-50 | 2,675 | 2,258 | 1,890 |
| 1×-200-250 | 990 | 914 | 740 | |
| 1×-200-500 | 608 | 556 | 473 | |
| 1×-200-1,000 | 351 | 320 | 301 | |
| 0.5×-200-500 | 705 | 623 | 555 | |
| 2×-200-500 | 265 | 270 | 123 | |
| 1×-100-500 | 818 | 799 | 710 | |
| 1×-500-500 | 249 | 325 | 178 |
The best-performing method (i.e. the lowest number of mismatch) for each model condition has been highlighted in bold.
In the 11-taxon dataset, the difference between the methods is less prominent (Fig. S4). On higher ILS, the bestML trees perform poorly. But other than that, all methods obtain comparable results. There is not much distinction between wQFM and wQMC either, unlike the previous two datasets.
Moreover, we also explored all possible combinations between the choice of quartets (all vs dominant) paired with the choice of gene trees (bestML vs BS vs MB). For the 15-taxon dataset, considering all quartets is the optimal choice regardless of the gene trees (Fig. S5). Also, weighted dominant quartets are better than the unweighted ones when the number of genes is low (100), but opposite results are observed for 1000 genes. Both of these observations are aligned with the findings in experiment 1 (Fig. 1). Additionally, in all scenarios, the Bayesian distribution of gene trees outperformed its counterparts. Identical to the trends in Fig. 2, bestML approach is better than bootstrapping on the 1,000 bp model conditions, and worse on the 100 bp ones.
Similarly, in Fig. S6, for the 37-taxon dataset, we see results consistent with the observations in experiments 1 and 2 (Figs. S1 and S3). The only exception is the 1×-200-500 model condition, where leveraging all quartets (as opposed to only the dominant ones) does not yield the most accurate trees.
RQ1—Experiment 3: Utilizing the Unweighted and Weighted Quartets Generated by SVDquartets
In this experiment, we compared standard SVDquartets with wQFM and wQMC where they leveraged the weighted quartets generated by SVDquartets (i.e. wQFM-SVD-Exp, wQFM-SVD-Rec).
Comparisons of these methods on 15- and 37-taxon datasets are shown in Fig. 3 and Fig. S7. Results on the 11-taxon dataset are presented in Fig. S8. SVDquartets, wQFM-SVD-Rec, and wQMC-SVD-Exp showed comparable performance, but wQFM-SVD-Rec was better than SVDquartets on some conditions (e.g. 1,000-gt model condition in the 15-taxon dataset). wQFM-SVD-Exp and wQMC-SVD-Rec performed worse than others in many model conditions. On the 15-taxon dataset, all methods show improvement with an increase in the number of genes or sequence lengths (Fig. 3).
Fig. 3.
RQ1 (Experiment 3): Results on the 15-taxon dataset. Performance comparison between unweighted and weighted (both exponential and reciprocal) quartets generated by SVDquartets. The unweighted quartets are amalgamated by QFM, whereas the weighted ones are processed by both wQFM and wQMC.
Interestingly, on the 37-taxon dataset, contrary to the expected trends, the trees estimated by SVDquartets and those based on the weighted quartets generated by SVDquartets do not show notable improvements with increased sequence lengths or decreased levels of ILS. As shown in Fig. S7a, expected improvements were observed when the sequence length increased from 50 bp to 250 bp; however, no further improvement was noted beyond 250 bp. Similarly, no improvements were observed with decreased levels of ILS (Fig. S7b), and surprisingly, it produced the worst results under the lowest-ILS (2×) model condition. However, with an increase in the number of genes, the results improved as expected. It is important to note that SVDquartets was primarily designed for large genome-scale data. Therefore, the limited improvement with an increase in data on a much smaller-scale dataset is not very surprising.
Overall, wQFM-SVD-Rec performs better than other variants of SVDquartets explored in this particular experiment.
RQ1—Experiment 4: Utilizing the Weighted Quartets Generated by BUCKy and Maximum Likelihood-Based Methods
In this experiment, we compare the population tree obtained from BUCKy (BUCKy-pop) with the species trees estimated through wQFM or wQMC using weighted quartets generated by BUCKy. We obtain these weighted quartets from BUCKy using gene tree distributions generated by MB or RAxML as its input. We denote these BUCKy-MB and BUCKy-RAxML respectively.
The results on the 37-taxon datasets are presented in Fig. 4. BUCKy-MB and wQFM-GTF achieved the highest accuracy. Notably, BUCKy-MB significantly outperformed BUCKy-RAxML, demonstrating the superiority of gene trees estimated by Bayesian techniques (MrBayes) over those derived from ML-based techniques (Douady et al. 2003; Mar et al. 2005). This further supports our findings in Experiment 2. However, wQFM-BUCKy-MB and wQFM-BUCKy-RAxML performed worse than other methods, suggesting that using CFs as weights for quartets and estimating species trees from these weighted quartets using quartet amalgamation techniques may not be effective. Another key observation is that increasing sequence length, and thereby reducing gene tree estimation error, significantly reduces the error rate of BUCKy-estimated trees.
Fig. 4.
RQ1 (Experiment 4): Results on the 37-taxon dataset. Comparison of various BUCKy-based and ML-based methods. We also included wQFM-GTF and wQMC-GTF. The model conditions are identical to Fig. S3.
Similar trends were observed on 15- and 11-taxon datasets (see Figs. S9 and S10). In this experiment, we also investigated the efficacy of the weighted quartets generated by RAxML in species tree estimation. We ran wQFM on the quartets weighted based on the likelihood scores computed by RAxML. This approach (wQFM-RAxML) performed poorly (see Fig. S11).
This particular set of analyses presents us with a method that is clearly superior to others, wQFM-GTF-MB.
RQ2: Investigating the Relative Performance of Different Weighted Quartet Amalgamation Techniques
wQFM and wQMC are two leading weighted quartet amalgamation techniques. Prior studies (Mahbub et al. 2021, 2022) demonstrated that wQFM-GTF substantially outperforms wQMC-GTF. In addition to GTF, this study evaluates both wQFM and wQMC using a broad range of weighted quartet generation methods. Our extensive evaluation (as discussed in Experiments 1–4 under RQ1) supports previous findings on the relative performance of wQFM and wQMC, further establishing the superiority of wQFM. It consistently outperformed wQMC across various model conditions, with many differences being statistically significant. As a result, wQMC is excluded from further experiments in RQ3-RQ5. Additionally, Table S1 shows that trees estimated by wQFM attain “closer” quartet scores to the true tree compared to wQMC, proving that wQFM is more resistant to gene tree estimation errors. These results have been explained in more detail in the “RQ4: Are the quartet scores predictive of species tree accuracy?” section.
RQ3: Relative Performance of Different Quartet-Based Summary Methods
We compared the best weighted quartet amalgamation approaches identified in RQ1—wQFM-GTF, wQFM-GTF-MB, and wQFM-SVD-Rec—with the leading quartet-based summary methods ASTRAL, BUCKy, TREE-QMC, and SVDquartets. Both ASTRAL and its weighted variant (wASTRAL) were analyzed, with wASTRAL utilizing weights derived from branch supports. Our experiments, particularly Experiment 2 in RQ1, demonstrated that leveraging a distribution of trees for each gene, rather than a single BestML tree per gene, enhances tree accuracy. Consequently, we ran wASTRAL on BestML gene trees with supports estimated from both nonparametric bootstrapping (denoted as wASTRAL) and Bayesian MCMC sampling (wASTRAL-MB). Similarly, we also analyzed TREE-QMC and its weighted variant (Han and Molloy 2025), leveraging support values from nonparametric bootstrapping (Weighted TREE-QMC) and Bayesian MCMC sampling (Weighted TREE-QMC-MB). Thus, we used the same distributions of trees as input for wQFM-GTF-MB, wASTRAL-MB, and Weighted TREE-QMC-MB, ensuring a fair comparison. These tree distributions were utilized to generate weighted quartets for wQFM and to draw branch support on the gene trees used as input for ASTRAL.
Results on 15- and 37-taxon datasets are shown in Fig. S12 and Fig. 5, respectively. The general patterns were consistent with our expectations: for all methods, the species tree estimation accuracy was improved by increasing the number of genes and sequence lengths (i.e. decreasing the gene tree estimation error), but was reduced by increasing the amount of gene tree discordance (i.e. the amount of ILS).
Fig. 5.
RQ3: Results on the 37-taxon dataset. We compare the best methods from previous experiments: wQFM-GTF, wQFM-GTF-MB, wQFM-SVD-Rec with ASTRAL, BUCKy, and SVDquartets. ASTRAL’s weighted counterpart, configured to utilize branch supports as weights, was also analyzed. Branch supports were estimated from both nonparametric RAxML bootstrapping (wASTRAL) and Bayesian MCMC sampling (wASTRAL-MB). The settings are identical to Fig. S3.
Next, we compared the relative performance of different methods. The most significant observation is that wQFM-GTF-MB achieved the best overall performance, substantially outperforming both ASTRAL and wASTRAL with weighted TREE-QMC-MB being the close second. Utilizing MB-estimated tree distributions improves the performance of all methods. Consistent with the trend observed in Experiment 2 of RQ1, where wQFM-GTF-MB significantly outperformed wQFM-GTF, we found that the weighted versions of both ASTRAL and TREE-QMC outperformed their unweighted counterparts. However, irrespective of the set of trees used, wQFM consistently matched or exceeded the accuracy of its ASTRAL and TREE-QMC counterparts. Additionally, BUCKy-MB frequently produced more accurate trees than ASTRAL. BUCKy-MB and wQFM-GTF yield the third-best accuracy (after wQFM-GTF-MB and weighted TREE-QMC-MB), which either matched or outperformed ASTRAL.
To further investigate the relative performance of these methods and to understand why wQFM performs better than ASTRAL, we computed the quartet scores of the estimated species trees as well as the true species tree with respect to both estimated and true gene trees. These quartet scores, further analyzed in the “RQ4: are the quartet scores predictive of species tree accuracy?” section, indicate that although ASTRAL-estimated trees have lower overall tree accuracy compared to wQFM, ASTRAL achieves higher quartet scores relative to the estimated gene trees used as input. Interestingly, however, wQFM achieves higher quartet scores than ASTRAL when these scores are computed with respect to the true gene trees. Furthermore, regardless of whether true or estimated gene trees are used to compute the quartet scores, the quartet scores of wQFM-estimated trees are closer to the true quartet score (i.e. the quartet score of the true species tree) than those of ASTRAL-estimated trees. That means ASTRAL tends to “overestimate” the quartet scores on estimated gene trees and “underestimate” them when scores are computed based on true gene trees. These explain why wQFM is generally more accurate than ASTRAL and also suggest that wQFM is less susceptible to gene tree estimation errors compared to ASTRAL.
The 11-taxon dataset reveals almost identical patterns to the 37-taxon dataset (see Fig. S13). The only exception is that ASTRAL performs better than BUCKy-MB.
RQ4: Are the Quartet Scores Predictive of Species Tree Accuracy?
In this research question, we examine the relationship between the quartet scores and the RF rates of the most accurate species tree estimation approaches, namely wQFM-GTF, wQFM-GTF-MB, and ASTRAL. All these methods generate the species trees by maximizing the quartet score. The statistical consistency of estimating species trees by maximizing the quartet score criterion holds when a sufficiently large number of true gene trees (without estimation error) are available. However, in practice, the number of genes is limited, and estimated gene trees often contain errors. Consequently, the trees that optimize the quartet score may not correspond to the true species tree. As a result, quartet-based methods may “overshoot” the quartet score, returning trees with higher scores than the true quartet score, particularly when we have a limited number of estimated gene trees (with estimation errors) (Farah et al. 2021; Mahbub et al. 2022).
We examined the quartet scores of each estimated species tree as well as the true tree with respect to two sets of gene trees: the true ones and the estimated ones. We vary the number of genes and the gene tree estimation error (controlled by gene sequence lengths). As shown in Table 2, the quartet score of the true species tree is the highest with respect to true gene trees (which is aligned with the statistical consistency of quartet score). However, with respect to the estimated gene trees (with error), the true species tree may have lower quartet scores. Thus, in the presence of gene tree estimation error and limited numbers of genes, quartet-based methods may “overshoot” the quartet score as they return trees with higher quartet scores than the true quartet score. Similarly, the most accurate method wQFM-GTF-MB yields higher quartet scores than wQFM-GTF and ASTRAL with respect to true gene trees, but ASTRAL and wQFM-GTF produce higher quartet scores than wQFM-GTF-MB (although they are less accurate than wQFM-GTF-MB) when the scores are computed with respect to estimated gene trees. Thus, the quartet score wQFM-GTF-MB is closer to the true quartet score than wQFM-GTF, wQFM-GTF-BS, and ASTRAL, regardless of the gene tree set used to compute the quartet score.
Table 2.
Quartet scores of selected methods for estimated and true gene trees in different model conditions.
| Taxa | Model condition | Estimated gene trees | True gene trees | ||||||
|---|---|---|---|---|---|---|---|---|---|
| wQFM-GTF | ASTRAL | wQFM-GTF-MB | True Tree | wQFM-GTF | ASTRAL | wQFM-GTF-MB | True tree | ||
| 11 | lower-ILS | 40,412 | 40,421 | 40,341 | 40,301 | 50,737 | 50,809 | 50,997 | 51,252 |
| higher-ILS | 26,680 | 26,715 | 26,502 | 26,372 | 29,268 | 29,458 | 29,658 | 29,982 | |
| 15 | 100gene-100 bp | 69,776 | 69,933 | 69,681 | 69,307 | 82,708 | 82,158 | 83,481 | 84,634 |
| 100gene-1,000 bp | 82,129 | 82,166 | 82,115 | 82,099 | 84,437 | 84,345 | 84,563 | 84,634 | |
| 1,000gene-100 bp | 692,949 | 693,656 | 691,755 | 690,268 | 834,890 | 827,970 | 840,710 | 844,184 | |
| 1,000gene-1,000 bp | 817,993 | 818,022 | 817,937 | 817,937 | 843,791 | 843,362 | 844,184 | 844,184 | |
| 37 | 1×-200-50 | 7,523,379 | 7,525,464 | 7,522,522 | 7,517,955 | 11,707,475 | 11,688,779 | 11,714,721 | 11,744,078 |
| 1×-200-250 | 10,568,957 | 10,569,844 | 10,567,512 | 10,565,291 | 11,735,692 | 11,733,853 | 11,740,077 | 11,744,078 | |
| 1×-200-500 | 11,271,425 | 11,271,938 | 11,270,715 | 11,267,990 | 11,739,571 | 11,738,630 | 11,743,259 | 11,744,078 | |
| 1×-200-1000 | 11,586,354 | 11,586,641 | 11,586,204 | 11,584,969 | 11,743,720 | 11,743,809 | 11,744,931 | 11,744,078 | |
| 0.5×-200-500 | 10,006,593 | 10,007,425 | 10,006,778 | 10,003,929 | 10,311,452 | 10,311,630 | 10,312,310 | 10,311,622 | |
| 2×-200-500 | 11,947,020 | 11,947,266 | 11,946,771 | 11,946,371 | 12,565,347 | 12,565,281 | 12,569,707 | 12,570,342 | |
| 1×-100-500 | 5,639,605 | 5,640,131 | 5,638,477 | 5,636,883 | 5,871,208 | 5,870,844 | 5,872,890 | 5,874,698 | |
| 1×-500-500 | 28,171,681 | 28,171,698 | 28,171,282 | 28,171,385 | 29,363,114 | 29,362,939 | 29,362,109 | 29,364,013 | |
The highest scores are shown in boldface, and the scores closest to the true score have been italicized.
We visualize the quartet scores for the 15-taxon dataset in Fig. 6 to better understand this trend. This figure suggests that the expected trend that quartet scores decrease with increasing RF rates holds when the quartet scores are computed based on true gene trees (blue lines), and the true species tree has the highest quartet score. Interestingly, however, when the quartet scores are based on estimated gene trees (orange lines), the estimated trees may achieve higher quartet scores than the true species tree—meaning that quartet-based methods may “overestimate” quartet scores under challenging model conditions with limited numbers of estimated gene trees (with estimation error). However, regardless of the choice of input gene trees, more accurate methods tend to produce quartet scores that are “closer” to true quartet scores. As demonstrated in Fig. 6, quartet scores of the most accurate method wQFM-GTF-MB represented by triangles are closer to true quartet scores (filled rectangles) than other methods in both blue and orange lines. As expected, when we increase the number of genes or decrease the gene tree estimation error (by increasing sequence length)—thereby making the model conditions favorable for statistically consistent criteria like quartet score maximization—the issue with overshooting the quartet scores decreases, evident from the gentler slopes.
Fig. 6.
RQ4: Results on the 15-taxon dataset. We show the relation between quartet score and RF rate with respect to both true gene trees and estimated gene trees for different methods (wQFM-GTF, wQFM-GTF-MB, ASTRAL).
Additionally, as discussed in the “RQ2: investigating the relative performance of different weighted quartet amalgamation techniques” section, we compare the quartet scores for wQFM and wQMC in Table S1. Consistent with earlier findings, wQFM, the superior method, exhibits smaller deviations from the true quartet scores compared to wQMC. When using estimated gene trees, wQMC tends to overestimate the quartet scores, whereas for true gene trees, it underestimates them, yielding scores lower than the actual quartet score.
RQ5: Performance on Real Biological Dataset
Mammalian Dataset
Song et al. (2012) analyzed a dataset containing 447 genes across 37 mammals using MP-EST (Liu et al. 2010) and concatenation using maximum likelihood. We reanalyzed the mammalian dataset from Song et al. (2012) after removing 21 mislabeled genes (confirmed by the authors), and two other outlier genes. The placement of bats (Myotis lucifugus and Pteropus vampyrus) and tree shrews (Tupaia belangeri) were two of the questions of greatest interest, and alternative relationships have previously been reported (Janečka et al. 2007; Hu et al. 2012; Boussau et al. 2013; Kumar et al. 2013).
The trees produced by ASTRAL, BUCKy, and wQFM with different types of tree distributions are identical to each other (see Fig. S14). This tree placed tree shrews (Tupaia belangeri) as sister to Glires with high support, which is consistent to the CA-ML analyses (reported in Song et al. 2012), and bats have been placed as sister to the clade containing Cetartiodactyla, Carnivora, and Perissodactyla (which is consistent to the MP-EST analyses Mirarab et al. 2014b). The SVDquartets-estimated tree is identical to this tree except for the position of tree shrews. SVDquartets supports an alternative relationship (albeit with very low support), which has also been observed (Mirarab et al. 2014b, 2014c), that placed tree shrews as sister to Glires. With respect to the position of bats, all the trees reconstructed in this study agree with MP-EST, which placed bats as sisters to the (Cetartiodactyla, (Perissodactyla, Carnivora)) clade. The placement of tree shrews and bats is of substantial debate, and so the differential placement is of considerable interest in mammalian systematics.
Avian Dataset
We reanalyzed the avian biological dataset from Jarvis et al. (2014), which comprises 14,446 genes across 48 taxa, including exons, introns, and ultraconserved elements (UCEs). This dataset poses significant challenges due to high levels of gene tree discordance, likely driven by rapid radiation events in the evolutionary history of these species. Mahbub et al. (2021) provided a comparison between the reference tree (MP-EST with statistical binning), ASTRAL, wQFM, and wQMC. In this study, we extend this analysis to evaluate the performance of wQFM-GTF-BS and SVDquartets (see Fig. 7). Due to the computational demands of generating Bayesian distributions for each of the 14K gene trees, we were unable to include wQFM-GTF-MB and BUCKy-MB in our current analysis.
Fig. 7.
Analysis of the avian dataset. We show the trees estimated by different methods, e.g. ASTRAL, SVDquartets, wQFM, and wQFM-GTF-BS. The values on the branches of each tree denote their corresponding support values. Branch supports are computed based on quartet-based local posterior probability (multiplied by 100). All BS values are 100% except where noted.
ASTRAL, wQFM, and wQFM-GTF-BS produced reasonably good phylogenies, although neither successfully reconstructed several sub-groups that have been consistently supported in the avian phylogenomics project and other studies (Mahbub et al. 2021). Notably, both the wQFM-estimated trees showed greater congruence with the MP-EST tree (derived from binned gene trees) than the ASTRAL-estimated tree. Surprisingly, SVDquartets yielded a significantly poorer tree, failing to recover many of the established sub-groups. All three of wQFM, wQFM-GTF-BS, and ASTRAL accurately reconstructed the well-established Australaves clade, containing passeriformes, parrots, falcons, and seriemas. But SVDquartets misplaced both Seriema (with 100% support) and Falcon, thus failing to reconstruct Australaves.
All four methods failed to recover Columbea. While wQFM, wQFM-GTF-BS, and ASTRAL were able to recover the constituent clades Columbimorphae (mesite, sandgrouse, and pigeon) and Phoenicopterimorphae (flamingo and grebe), SVDquartets failed to recover any of the two. Additionally, SVDquartets was unable to resolve other key clades, such as Otidimorphae, Caprimulgimorphae, and Afroaves. Although it successfully recovered the Core Waterbird clade, the relationships within this group were not well-resolved. Interestingly, it managed to recover Cursores (crane and killdeer) with high support, a clade that all other methods failed to reconstruct.
Runtimes
We report the running time of different methods on the mammalian and avian biological datasets. All running times are wall time on a Linux machine with 64 GB RAM and an i7-10700K 3.80 GHz Octa-core processor with 16 threads.
In this study, we employed the quartet amalgamation techniques wQFM and wQMC on weighted quartets generated by a wide range of methods. The runtimes for generating these weighted quartets on the biological mammalian dataset are summarized in Table 3. For both nonparametric bootstrapping and Bayesian approaches (via MrBayes), we generated 200 trees for each gene and subsequently computed weighted quartets from these gene tree distributions. Once the weighted quartets were obtained, species tree inference using wQFM and wQMC took approximately 4 and 1 s, respectively.
Table 3.
Time taken in seconds by different methods for generating weighted quartets for the mammalian dataset containing 424 genes.
| Method | Time (in s) |
|---|---|
| SVD-Rec | 733.02 |
| SVD-Exp | 724.90 |
| GTF | 68.19 |
| GTF-BS / GTF-MB | 14,080.29 |
| BUCKy | 878.59 |
In comparison, ASTRAL required 1.24 s to infer the mammalian species tree, while SVDquartets took 718.44 s. BUCKy needed 878.59 s to infer the species tree from the gene tree distributions generated by MrBayes. The time taken by MrBayes for Bayesian analysis ranged from 66.66 s (for the shortest gene of 486 bp) to 2,587.57 s (for the longest gene of 27,360 bp).
Due to the large number of loci (14,446) on 48 species in the Avian dataset, we could not include Bayesian techniques (MrBayes and BUCKy) due to their computational demands. SVDquartets took around 3 h to analyze this dataset. ASTRAL’s running time is sensitive to the number of genes, and it takes around 12 and a half hours to construct the avian phylogeny. On the other hand, amalgamating the weighted quartets generated from bootstrap gene tree distributions or ML gene trees using wQFM takes only 37 s. However, generating weighted quartets is time-consuming, taking around 100 s to produce the weighted quartets from 200 bootstrap trees of a gene. Note that generating weighted quartets for each gene tree distribution is embarrassingly parallel.
Discussion
Highly accurate and scalable species tree inference methods, which summarize independently inferred gene trees to obtain a species tree, are sensitive to unavoidable errors introduced in the gene tree estimation step. This dilemma has sparked significant debate about the advantages of concatenation versus summary methods and has presented practical challenges to the broader adoption of summary methods over concatenation. Coestimation of gene trees and species trees (Liu 2008; Heled and Drummond 2010; Szöllősi et al. 2015) is perhaps the most accurate approach to dealing with such noise (Leaché and Rannala 2011; Bayzid and Warnow 2013). However, despite some progress (Ogilvie et al. 2017), these methods are extremely computationally intensive and do not scale to even moderately large numbers of species (Bayzid and Warnow 2012, 2013; Smith et al. 2013). Therefore, summary methods are comparatively more feasible for use on genome-scale datasets.
Quartet-based summary methods, including ASTRAL and quartet amalgamation-based methods wQFM and wQMC, have emerged as a highly accurate and scalable approach for estimating species trees from multilocus data. Prior studies suggest that weighting schemes may improve the performance of quartet-based methods. In this study, we proposed and examined various ways of generating weights for the quartets and their impact in species tree inferences. Moreover, we report on an extensive evaluation study, the relative performance of the leading quartet-based methods.
This study shows various important trends, which we summarize. The first observation is that the species trees estimated by quartet amalgamation-based methods were impacted by the choice of input quartet tree distribution (e.g. dominant or all quartets ) and that using all quartets with relative weights may better capture the gene tree uncertainty and produced more accurate species tree topologies, especially in the presence of gene tree estimation errors. This study also shows that wQFM produces substantially better trees than wQMC, and usually achieves better quartet scores. While prior studies also reported this observation (Mahbub et al. 2021, 2022), this study confirms this on a broad range of model conditions and weighting schemes.
Next, we observed that the estimated species trees were significantly impacted by the choice of input gene tree distribution (e.g. BestML or nonparametric bootstrapping (BS) or Bayesian trees). Quartet-based methods on BestML trees may produce highly accurate trees (very often better than those on BS trees) when estimated gene trees are reasonably accurate. However, using a distribution of trees (estimated using Bayesian MCMC techniques) instead of a point estimate BestML tree for each gene substantially improved the performance of wQFM (wQFM-GTF-MB). Similarly, the branch supports estimated based on Bayesian tree distributions make weighted ASTRAL and weighted TREE-QMC more accurate than ASTRAL and TREE-QMC, respectively. Thus, appropriate weighting based on gene tree distributions can reduce the incongruence often observed across different species tree inference pipelines. However, wQFM-GTF-MB was consistently better than wASTRAL and weighted TREE-QMC. So, leveraging distribution of trees for each gene (preferably with Bayesian MCMC techniques), generating weighted quartets based on them, and estimating species trees using wQFM may produce substantially better trees than ASTRAL, TREE-QMC, and their weighted counterparts.
We also examined the performance of SVDquartets and BUCKy and the impact of using quartets weighted based on SVD score (computed by SVDquartets) or concordance factors (computed by BUCKy). We found BUCKy to be more accurate than SVDquartets and to match or outperform ASTRAL in many model conditions. On the other hand, the species trees estimated by wQFM using weighted quartets produced by SVDquarets and BUCky are often reasonably accurate but were not among the best ones identified in this study.
Taken together, the results on biological and simulated data demonstrate that the wQFM can produce the best known species tree accuracy when Bayesian gene tree distributions are given as input. Notably, in many cases, the improvement of wQFM-GTF-MB over ASTRAL (when given the same MB-estimated tree distribution as input) is substantially large (Figs. S12 and 5). ASTRAL has evolved through versions and has been the most accurate and widely used method over the past decade. Therefore, any improvement over ASTRAL is a notable advancement. In this context, the substantial improvement achieved by wQFM-GTF-MB is remarkable and underscores the broader impact and a clear benefit of wQFM over ASTRAL that wQFM can be used outside the context of gene tree estimation and can take weighted quartets computed from various sources.
Taking all these observations into consideration, we make the following recommendations. First, if gene trees are well estimated, we can then use BestML trees with existing summary methods (ASTRAL, wQFM, BUCKy). However, under challenging model conditions with limited number of erroneous estimated gene trees, appropriate methods need to be chosen as their relative performance varies depending on the complexity of the dataset/model conditions. Under such practical model conditions, utilizing distributions of trees (preferably with Bayesian MCMC techniques) may result in higher species tree accuracy.
Second, we recommend considering multiple approaches to species tree estimation, such as wQFM-GTF, wQFM-GTF-MB, ASTRAL, and wASTRAL. When these analyses yield conflicting results, examining the underlying reasons for the disagreement can help identify which analysis is likely to be more reliable or indicate the need for additional data, such as more genes or taxa, or improved data quality (e.g. more accurate alignments and more accurate gene trees).
While this study addresses a broad range of challenging model conditions and considers a variety of techniques for quartet distribution generation and quartet-based species tree inferences, it has several limitations and potential areas for extension. This study is limited to small- and moderate-sized datasets, excluding large datasets due to the computational demands of Bayesian techniques such as MrBayes. Specifically, this study explored performance on a large number of genes but did not explore large numbers of taxa, nor under conditions with missing data (i.e. gene tree with missing taxa). Additionally, this study does not include coestimation techniques like BEST (Liu 2008), *BEAST (Heled and Drummond 2010) due to their high computational requirements.
This study evaluated summary methods using a single maximum likelihood (ML) tree estimate for each gene, a set of ML gene trees estimated for the bootstrap replicates of each gene, or a set of trees for each gene generated using Bayesian MCMC sampling. Future studies need to consider other approaches for leveraging distributions of trees for each gene. Specifically, investigating the performance of multilocus bootstrapping (MLBS) (Seo 2008), including resampling both sites and genes, would be an interesting research direction.
Finally, this study, like any other, did not capture all the complexities inherent in real biological data. We did not investigate scenarios where gene tree discordance may have been influenced by other biological factors, such as gene duplication and loss (Maddison 1997; Bayzid et al. 2013; Bayzid and Warnow 2018), gene flow (Leaché et al. 2014), recombination (Lanier and Knowles 2012), horizontal gene transfer, or hybridization (Nakhleh 2013). Other factors, such as deep versus shallow radiations, heterotachy, mistaken homology, and violations of the model of sequence evolution, should be adequately studied in future studies.
Materials and Methods
Species Tree Estimation Methods
In our study, we utilized leading quartet-based summary methods to estimate species trees using gene trees, multiple sequence alignments (MSAs), or a combination of both. Given that prior studies have already compared summary methods with combined analysis (CA) and established its relative performance (Bayzid and Warnow 2013; Gatesy and Springer 2014; Mirarab et al. 2014c), we chose not to include CA in our experiments. The methods included in our study are as follows:
ASTRAL and wASTRAL
ASTRAL is a statistically consistent method (when coupled with a method that consistently estimates gene trees) under the MSC model that tries to solve the Maximum Quartet Support Species Tree (MQSST) problem (Mirarab et al. 2014c). It takes a set of gene trees as input and seeks to find a species tree so that the number of induced quartets in the gene trees that are consistent with the species tree is maximized. We analyzed both unweighted ASTRAL (version 5.7.8) and weighted ASTRAL (wASTRAL) (Zhang and Mirarab 2022).
wQFM and wQMC
These are two highly accurate methods for species tree estimation that amalgamate weighted quartets. wQFM and wQMC extend the popular (unweighted) quartet amalgamation techniques QFM and QMC to a weighted setting.
SVDquartets
The SVDquartets algorithm is designed for data where each site is an independent observation from the species tree under the multispecies coalescent model, but it can also be applied to data with dependencies, such as multilocus data. For a set of four taxa (), SVDquartets assigns a score to each of the three possible quartet topologies () using algebraic statistics and singular value decomposition (SVD). The quartet topology, among the three alternative topologies with the lowest “SVD score”, is selected as the “dominant” or true topology for that quartet. These selected quartets are then amalgamated using quartet assembly techniques, such as QFM or QMC, to construct a species tree.
BUCKy
Given a distribution of trees for each gene (usually generated by MrBayes), BUCKy uses Bayesian concordance analysis to estimate the Concordance Factors (CF) of quartets. These CFs are used to estimate a concordance tree (BUCKY-con) and a population tree (BUCKY-pop). We use the population tree in our analyses, as this is provably statistically consistent. We estimated gene tree distributions using MB and using RAxML with bootstrapping as input to BUCKy, denoted by BUCKy-MB and BUCKy-RAxML, respectively. For various datasets and model conditions analyzed in this study, we ran BUCKy using a sufficiently large number of MCMC iterations to reach sufficiently low standard deviations for the concordance factors to suggest possible convergence.
TREE-QMC and weighted TREE-QMC
Like ASTRAL, TREE-QMC (Han and Molloy 2023) is a method for estimating species trees directly from gene trees. It is based on the QMC algorithm (Snir and Rao 2012), but skips the time-consuming process of enumerating all quartet trees explicitly. Additionally, we also analyzed weighted TREE-QMC (Han and Molloy 2025), which can leverage both branch lengths and support values of gene trees.
wQFM and wQMC, in general, can be used to amalgamate any set of given weighted quartets. Therefore, the broader impact and a clear benefit of wQFM and wQMC over ASTRAL and other similar summary methods is that these can be used outside the context of gene tree estimation. In this study, we have proposed and investigated a wide range of ways for computing weighted quartets (Quartet weight generation strategies section) and their impact on species tree inference when coupled with wQFM and wQMC.
Quartet Weight Generation Strategies
We utilized a wide array of weighted quartet generation techniques in this study. In addition to well-established techniques like gene tree frequency (GTF), we introduced numerous novel approaches to weight estimation. We now briefly explain all the strategies explored throughout the experiments in RQ1.
RQ1—Experiment 1: Dominant Quartets vs All Quartet
The basic idea of Combining Dominant Quartet Trees () is to take the input set of gene trees , compute a dominant quartet tree (i.e. the most frequent quartet topology) for every four species, and then combine the dominant quartet trees into a supertree on the full set of species using a preferred quartet amalgamation technique (QFM or QMC). is a statistically consistent approach when the input gene trees are accurate (Mahbub et al. 2021). We consider the dominant quartets with and without weights, where the weight of a quartet q is computed based on the number of that induce quartet topology q. These strategies are denoted by GTF-dom and GTF-dom (without weights), respectively. Another statistically consistent approach is Weighted Maximum Quartet Consistency (WMQC) problem (Mahbub et al. 2021), which computes weights for every possible quartet tree, and then combines this set of weighted quartet trees into a tree on the full set of species. Thus, it uses all quartet trees (i.e. quartets for n taxa) with relative weights and is denoted by GTF-all.
RQ1—Experiment 2: BestML Gene Trees vs Bootstrap/Bayesian Distribution of Gene Trees
In this experiment, we assess the impact of considering a distribution of trees for each gene, rather than a single tree per gene, in weighted quartet-based species tree estimation. Summary methods can utilize either a single ML tree estimate for each gene or a collection of gene trees derived from bootstrap replicates. The former approach, where only the best ML tree for each gene is used, is denoted as BestML. Methods such as wQFM-GTF and wQMC-GTF fall within this category. We seek to understand if considering a distribution of trees for each gene, as methods like BUCKy do, results in improved weighted quartet distribution and eventually more accurate species trees.
We compared the impact of using a distribution of trees from a Bayesian MCMC analysis (Huelsenbeck and Ronquist 2001) with a distribution of trees obtained through nonparametric bootstrapping (Felsenstein 1985). These two sampling methods are widely used to obtain credible sets of trees and to estimate support values for phylogenies. We used MB and RAxML to generate Bayesian MCMC and nonparametric bootstrap (BS) samples of trees, respectively. We denote this whole process of obtaining a distribution of trees through MrBayes or RAxML bootstrapping and then estimating weighted quartets through GTF as GTF-MB and GTF-BS respectively.
RQ1—Experiment 3: Utilizing the Unweighted and Weighted Quartets Generated by SVDquartets
SVDquartets generates dominant quartets based on the SVD-score and then amalgamates them using QFM or QMC in an unweighted setting. It can also generate weighted quartets in two settings: exponential and reciprocal. These weighted quartets can be used with wQFM or wQMC to estimate species trees. However, although SVDquartets is widely studied and has been compared to coalescent-based methods (Chou et al. 2015), to the best of our knowledge, the performance of the weighted settings in SVDquartets has not been previously explored. Thus, in this experiment, we combine both the exponential (SVD-exp) and reciprocal (SVD-rec) weights generated by SVDquartets with wQFM and wQMC and compare them with standard SVDquartets.
RQ1—Experiment 4: Utilizing the Weighted Quartets Generated by BUCKy and ML-Based Methods
We investigated different variations of ML and BUCKy-based methods and even some form of combinations between them. We used MrBayes and RAxML to generate gene tree distributions as input to BUCKy (denoted by BUCKy-MB and BUCKy-RAxML, respectively). BUCKy computes the concordance factors (CF) of each quartet tree. The concordance factor of a clade is the proportion of gene trees that truly contain that clade (Larget et al. 2010). In practice, as the true gene trees are unknown, methods like BUCKy estimate the CF by considering all possible gene tree topologies and their posterior probabilities. It then obtains the posterior distribution for the CF by summing the posterior probabilities of the gene tree topologies corresponding to each possible CF value. This provides a Bayesian estimate, including the posterior mean, credibility intervals, and modal values, to account for uncertainty in the data (Ané et al. 2007). Thus, these concordance factors measure the genomic support of each quartet. In addition to evaluating the BUCKy-pop tree, we used the CFs of the quartets generated by BUCKy as weights and amalgamated them using wQFM and wQMC to estimate the species tree.
Overall, we have explored a large number of combinations with different quartet generation techniques and quartet amalgamation techniques. We briefly define these methods again in Table 4 for wQFM (using the convention wQFM–quartet-generation-technique), but all of them are applicable for wQMC. Figure 8 shows a schematic diagram of our experimental study.
Fig. 8.
Schematic diagram of the experimental study pipeline. The process begins with a set of k input gene sequences, from which k ML gene trees are estimated using RAxML. In addition to ML gene trees, distributions of trees for each gene are estimated using bootstrapping (RAxML) or Bayesian methods (MrBayes). Weighted quartet distributions are then generated from these sets of gene trees, with the quartet’s weight determined by its frequency across the input trees. Furthermore, SVDquartets is used to directly infer a weighted quartet distribution from gene sequences. Finally, wQFM or wQMC amalgamates the various quartet sets to estimate species trees. For comparison, species trees are also inferred from gene trees using ASTRAL and TREE-QMC, from gene tree distributions using BUCKy, and from gene sequences using SVDquartets.
Dataset
Simulated dataset
We used previously studied simulated and real biological datasets to evaluate the performance of various ways for generating weighted quartets (Table 4), and various quartet-based summary methods. We analyzed three small to moderate size dataset: the 11-, 15-, and 37-taxon simulated datasets from Chung and Ané (2011) and Mirarab et al. (2014a). We did not include large datasets with hundreds of taxa in this study due to the computationally intensive nature of methods such as MrBayes and BUCKy, which require prohibitively long runtimes on larger datasets.
The 37-taxon mammalian simulated dataset was generated using the species tree estimated by MP-EST on the biological dataset studied in Song et al. (2012). This species tree, with branch lengths in coalescent units, served as the basis for producing a set of gene trees under the coalescent model. Consequently, the model tree exhibits an ILS level consistent with a coalescent analysis of the biological mammalian dataset, and other simulation properties reflect the biological sequences studied. Gene sequences were simulated along the true gene trees under the GTR+GAMMA substitution model with rate heterogeneity. Gene trees were then estimated from these sequences using RAxML under the same substitution model. We examined the impact of varying the number of genes (ranging from 100 to 500) and the degree of gene tree estimation error, which was controlled by adjusting the sequence length of the markers (from 50 bp to 1,000 bp). Additionally, the levels of ILS were varied by modifying all internal branch lengths in the model species tree, either doubling or halving them. This resulted in three model conditions: 1× (moderate ILS), 0.5× (high ILS), and 2× (low ILS).
We also analyzed both the high-ILS and low-ILS 11-taxon datasets from Chung and Ané (2011). These datasets vary in the number of genes and the extent of gene tree estimation error. The gene sequences for this dataset were simulated under the Jukes–Cantor (JC) model with no site-specific rate variation. The gene trees were estimated with RAxML under the GTR+GAMMA substitution model. Additionally, we examined 15-taxon datasets that also exhibit high levels of ILS and vary in both sequence lengths and the number of genes. Like the 37-taxon dataset, gene sequences were simulated under the GTR+GAMMA substitution model with rate heterogeneity. The gene trees were estimated using the same model, too. Consequently, these simulated datasets offer a broad spectrum of challenging and practical model conditions, allowing us to rigorously evaluate the performance of various methods explored in this study.
Biological dataset
We reanalyzed the 37-taxon mammalian dataset from Song et al. (2012) containing 447 genes across 37 mammals after removing 21 mislabeled genes (confirmed by the authors) and two other outlier genes. We also analyze the avian dataset by Jarvis et al. (2014) containing genome data of 48 avian species spanning most orders of birds. The avian dataset includes 14,446 loci comprising exons, introns, and UCEs.
Evaluation Criteria
On the simulated datasets, we compared the estimated trees with the model species tree using normalized Robinson–Foulds (RF) distance (Robinson and Foulds 1981). The RF distance between two trees is defined as the sum of the bipartitions (splits) present in one tree but absent in the other and vice versa. All the estimated trees in this study are binary, making false positive (FP), false negative (FN), and RF rates identical. Additionally, we compared the quartet scores (the number of quartets from the set of input gene trees that agree with the candidate species tree) of the trees estimated by different methods. For the biological dataset, we compared the estimated species trees with the established evolutionary relationships reported in the scientific literature. Multiple replicates of data for various model conditions were analyzed, and a two-sided Wilcoxon signed-rank test (with ) was performed to measure the statistical significance of the differences between methods.
Supplementary Material
Contributor Information
Navid Bin Hasan, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Avijit Biswas, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Zahin Wahab, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Mahim Mahbub, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Rezwana Reaz, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Md Shamsuzzoha Bayzid, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
Supplementary Material
Supplementary material is available at Genome Biology and Evolution online.
Data Availability
The scripts used to perform the experimental study are available at https://github.com/navidh86/quartet-inference-comparative-study. Datasets are available at DRYAD repository: https://doi.org/10.5061/dryad.wstqjq2wn.
Literature cited
- Ané C, Larget B, Baum DA, Smith SD, Rokas A. Bayesian estimation of concordance among gene trees. Mol Biol Evol. 2007:24:412–426. 10.1093/molbev/msm107. [DOI] [PubMed] [Google Scholar]
- Avni E, Cohen R, Snir S. Weighted quartets phylogenetics. Syst Biol. 2015:64:233–242. 10.1093/sysbio/syu087. [DOI] [PubMed] [Google Scholar]
- Bayzid MS, Mirarab S, Warnow T. Inferring optimal species trees under gene duplication and loss. In: Proceedings of Pacific Symposium on Biocomputing (PSB); World Scientific; 2013. Vol. 18. p. 250–261. [DOI] [PubMed]
- Bayzid MS, Warnow T. Estimating optimal species trees from incomplete gene trees under deep coalescence. J Comput Biol. 2012:19:591–605. 10.1089/cmb.2012.0037. [DOI] [PubMed] [Google Scholar]
- Bayzid MS, Warnow T. Naive binning improves phylogenomic analyses. Bioinformatics. 2013:29:2277–2284. 10.1093/bioinformatics/btt394. [DOI] [PubMed] [Google Scholar]
- Bayzid MS, Warnow T. Gene tree parsimony for incomplete gene trees: addressing true biological loss. Algorithms Mol Biol. 2018:13:1. 10.1186/s13015-017-0120-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boussau B et al. Genome-scale coestimation of species and gene trees. Genome Res. 2013:23:323–330. 10.1101/gr.141978.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Casanellas M, Fernández-Sánchez J, Garrote-López M, Sabaté-Vidales M. Designing weights for quartet-based methods when data are heterogeneous across lineages. Bull Math Biol. 2023:85:68. 10.1007/s11538-023-01167-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chifman J, Kubatko L. Quartet from SNP data under the coalescent model. Bioinformatics. 2014:30:3317–3324. 10.1093/bioinformatics/btu530. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chou J et al. A comparative study of svdquartets and other coalescent-based species tree estimation methods. BMC Genomics. 2015:16:S2. 10.1186/1471-2164-16-S10-S2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chung Y, Ané C. Comparing two Bayesian methods for gene tree/species tree reconstruction: a simulation with incomplete lineage sorting and horizontal gene transfer. Syst Biol. 2011:60:261–275. 10.1093/sysbio/syr003. [DOI] [PubMed] [Google Scholar]
- Degnan JH. Anomalous unrooted gene trees. Syst Biol. 2013:62:574–590. 10.1093/sysbio/syt023. [DOI] [PubMed] [Google Scholar]
- Degnan JH, DeGiorgio M, Bryant D, Rosenberg NA. Properties of consensus methods for inferring species trees from gene trees. Syst Biol. 2009:58:35–54. 10.1093/sysbio/syp008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Degnan JH, Rosenberg NA. Discordance of species trees with their most likely gene trees. PLoS Genet. 2006:2:762–768. 10.1371/journal.pgen.0020068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Degnan JH, Rosenberg NA. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol Evol. 2009:24(6):332–340. 10.1016/j.tree.2009.01.009. [DOI] [PubMed] [Google Scholar]
- Douady CJ, Delsuc F, Boucher Y, Doolittle WF, Douzery EJP. Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol Biol Evol. 2003:20:248–254. 10.1093/molbev/msg042. [DOI] [PubMed] [Google Scholar]
- Farah IT, Islam M, Zinat KT, Rahman AH, Bayzid S. Species tree estimation from gene trees by minimizing deep coalescence and maximizing quartet consistency: a comparative study and the presence of pseudo species tree terraces. Syst Biol. 2021:70:1213–1231. 10.1093/sysbio/syab026. [DOI] [PubMed] [Google Scholar]
- Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985:39:783–791. 10.2307/2408678. [DOI] [PubMed] [Google Scholar]
- Gatesy J, Springer MS. Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum. Mol Phylogenet Evol. 2014:80:231–266. 10.1016/j.ympev.2014.08.013. [DOI] [PubMed] [Google Scholar]
- Han Y, Molloy EK. Improving quartet graph construction for scalable and accurate species tree estimation from gene trees. Genome Res. 2023:33:1042–1052. 10.1101/gr.277629.122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Han Y, Molloy EK. Improved robustness to gene tree incompleteness, estimation errors, and systematic homology errors with weighted TREE-QMC. Syst Biol. 2025:syaf009. 10.1093/sysbio/syaf009. [DOI] [PubMed] [Google Scholar]
- Heled J, Drummond AJ. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 2010:27:570–580. 10.1093/molbev/msp274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hu J, Zhang Y, Yu L. Summary of Laurasiatheria (mammalia) phylogeny. Zool Res. 2012:33:65–74. 10.3724/SP.J.1141.2012.E05-06E65. [DOI] [PubMed] [Google Scholar]
- Huelsenbeck JP, Ronquist R. MrBayes: Bayesian inference of phylogeny. Bioinformatics. 2001:17:754–755. 10.1093/bioinformatics/17.8.754. [DOI] [PubMed] [Google Scholar]
- Islam M, Sarker K, Das T, Reaz R, Bayzid MS. Stelar: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency. BMC Genomics. 2020:21:1–13. 10.1186/s12864-020-6519-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Janečka JE et al. Molecular and genomic data identify the closest living relative of primates. Science. 2007:318:792–794. 10.1126/science.1147555. [DOI] [PubMed] [Google Scholar]
- Jarvis ED et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science. 2014:346:1320–1331. 10.1126/science.1253451. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jiao X, Flouri T, Yang Z. Multispecies coalescent and its applications to infer species phylogenies and cross-species gene flow. Natl Sci Rev. 2021:8:nwab127. 10.1093/nsr/nwab127. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kubatko LS, Degnan JH. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Syst Biol. 2007:56:17–24. 10.1080/10635150601146041. [DOI] [PubMed] [Google Scholar]
- Kumar V, Hallström BM, Janke A. Coalescent-based genome analyses resolve the early branches of the euarchontoglires. PLoS One. 2013:8:e60019. 10.1371/journal.pone.0060019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lanier HC, Knowles LL. Is recombination a problem for species-tree analyses? Syst Biol. 2012:61:691–701. 10.1093/sysbio/syr128. [DOI] [PubMed] [Google Scholar]
- Larget B, Kotha SK, Dewey CN, Ané C. BUCKy: gene tree/species tree reconciliation with the Bayesian concordance analysis. Bioinformatics. 2010:26:2910–2911. 10.1093/bioinformatics/btq539. [DOI] [PubMed] [Google Scholar]
- Leaché AD, Harris RB, Rannala B, Yang Z. The influence of gene flow on species tree estimation: a simulation study. Syst Biol. 2014:63:17–30. 10.1093/sysbio/syt049. [DOI] [PubMed] [Google Scholar]
- Leaché AD, Rannala B. The accuracy of species tree estimation under simulation: a comparison of methods. Syst Biol. 2011:60:126–137. 10.1093/sysbio/syq073. [DOI] [PubMed] [Google Scholar]
- Liu L. BEST: Bayesian estimation of species trees under the coalescent model. Bioinformatics. 2008:24:2542–2543. 10.1093/bioinformatics/btn484. [DOI] [PubMed] [Google Scholar]
- Liu L, Yu L, Edwards SV. A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol Biol. 2010:10:302. 10.1186/1471-2148-10-302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Maddison WP. Gene trees in species trees. Syst Biol. 1997:46:523–536. 10.1093/sysbio/46.3.523. [DOI] [Google Scholar]
- Mahbub M, Wahab Z, Reaz R, Rahman MS, Bayzid MS. wQFM: highly accurate genome-scale species tree estimation from weighted quartets. Bioinformatics. 2021:37:3734–3743. 10.1093/bioinformatics/btab428. [DOI] [PubMed] [Google Scholar]
- Mahbub S et al. Qt-gild: quartet based gene tree imputation using deep learning improves phylogenomic analyses despite missing data. In: International Conference on Research in Computational Molecular Biology. Springer; 2022. p. 159–176. [DOI] [PubMed]
- Mar JC, Harlow TJ, Ragan MA. Bayesian and maximum likelihood phylogenetic analyses of protein sequence data under relative branch-length differences and model violation. BMC Evol Biol. 2005:5:1–20. 10.1186/1471-2148-5-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mim SA, Zarif-Ul-Alam M, Reaz R, Bayzid MS, Rahman MS. Quartet Fiduccia–Mattheyses revisited for larger phylogenetic studies. Bioinformatics. 2023:39:btad332. 10.1093/bioinformatics/btad332. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mirarab S et al. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics. 2014c:30:i541–i548. 10.1093/bioinformatics/btu462. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mirarab S, Bayzid MS, Boussau B, Warnow T. Statistical binning enables an accurate coalescent-based estimation of the avian tree. Science. 2014a:346:1250463. 10.1126/science.1250463. [DOI] [PubMed] [Google Scholar]
- Mirarab S, Bayzid MS, Warnow T. Evaluating summary methods for multilocus species tree estimation in the presence of incomplete lineage sorting. Syst Biol. 2014b:65:366–380. 10.1093/sysbio/syu063. [DOI] [PubMed] [Google Scholar]
- Nakhleh L. Computational approaches to species phylogeny inference and gene tree reconciliation. Trends Ecol Evol. 2013:28:719–728. 10.1016/j.tree.2013.09.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ogilvie HA, Bouckaert RR, Drummond AJ. Starbeast2 brings faster species tree inference and accurate estimates of substitution rates. Mol Biol Evol. 2017:34:2101–2114. 10.1093/molbev/msx126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Reaz R, Bayzid MS, Rahman MS. Accurate phylogenetic tree reconstruction from quartets: a heuristic approach. PLoS One. 2014:9:e104008. 10.1371/journal.pone.0104008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Robinson DF, Foulds LR. Comparison of phylogenetic trees. Math Biosci. 1981:53:131–147. 10.1016/0025-5564(81)90043-2. [DOI] [Google Scholar]
- Roch S, Steel M. Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent. Theor Popul Biol. 2015:100:56–62. 10.1016/j.tpb.2014.12.005. [DOI] [PubMed] [Google Scholar]
- Seo T-K. Calculating bootstrap probabilities of phylogeny using multilocus sequence data. Mol Biol Evol. 2008:25:960–971. 10.1093/molbev/msn043. [DOI] [PubMed] [Google Scholar]
- Smith BT, Harvey MG, Faircloth BC, Glenn TC, Brumfield RT. Target capture and massively parallel sequencing of ultraconserved elements for comparative studies at shallow evolutionary time scales. Syst Biol. 2013:63:83–95. 10.1093/sysbio/syt061. [DOI] [PubMed] [Google Scholar]
- Snir S, Rao S. Quartets MaxCut: a divide and conquer quartets algorithm. IEEE/ACM Trans Comput Biol Bioinform. 2010:7:704–718. 10.1109/TCBB.2008.133. [DOI] [PubMed] [Google Scholar]
- Snir S, Rao S. Quartet maxcut: a fast algorithm for amalgamating quartet trees. Mol Phylogenet Evol. 2012:62:1–8. 10.1016/j.ympev.2011.06.021. [DOI] [PubMed] [Google Scholar]
- Song S, Liu L, Edwards SV, Wu S. Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc Natl Acad Sci U S A. 2012:109:14942–14947. 10.1073/pnas.1211733109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006:22:2688–2690. 10.1093/bioinformatics/btl446. [DOI] [PubMed] [Google Scholar]
- Swofford DL. PAUP*: phylogenetic analysis using parsimony (* and other methods). Ver. 4. Sinauer Associates; 2002. [Google Scholar]
- Swofford DL, Kubatko LS. Chapter 4 Species tree estimation using site pattern frequencies. In: Kubato LS, Knowles LL, editors. Species tree inference: a guide to methods and applications. Princeton University Press; 2023. p. 68–88. ISBN 9780691245157 2023.
- Szöllősi GJ, Tannier E, Daubin V, Boussau B. The inference of gene trees with species trees. Syst Biol. 2015:64:e42–e62. 10.1093/sysbio/syu048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wascher M, Kubatko L. Consistency of svdquartets and maximum likelihood for coalescent-based species tree estimation. Syst Biol. 2020:70:33–48. 10.1093/sysbio/syaa039. [DOI] [PubMed] [Google Scholar]
- Zhang C, Mirarab S. Weighting by gene tree uncertainty improves accuracy of quartet-based species trees. Mol Biol Evol. 2022:39:msac215. 10.1093/molbev/msac215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang L. From gene trees to species trees II: species tree inference by minimizing deep coalescence events. IEEE/ACM Trans Comp Biol Bioinf. 2011:8:1685–1691. 10.1109/TCBB.2011.83. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The scripts used to perform the experimental study are available at https://github.com/navidh86/quartet-inference-comparative-study. Datasets are available at DRYAD repository: https://doi.org/10.5061/dryad.wstqjq2wn.








