BMC Bioinformatics. 2011 May 10;12:144. doi: 10.1186/1471-2105-12-144

Table 3.

Prune results for different-sized datasets and underlying alignment methods.

Each cell shows run-time (minutes) / average agreement score.

| Method | Max nodes per sub-tree | 50 leaves | 100 leaves | 500 leaves | 1000 leaves |
|---|---|---|---|---|---|
| Pecan¹ | – | 21.9 / 0.914 | 297. / 0.879 | –ᵃ | –ᵃ |
| Prune w/Pecan | 60% | 7.26 / 0.880 | 39.2 / 0.862 | –ᵃ | –ᵃ |
| | 30% | 3.13 / 0.909 | 19.6 / 0.839 | –ᵃ | –ᵃ |
| | 15% | 7.26 / 0.912 | 13.3 / 0.878 | 125. / 0.844 | –ᵃ |
| | 7% | 4.24 / 0.909 | 13.5 / 0.849 | 29.1 / 0.907 | 122. / 0.877 |
| FSA² | – | 63.1 / 0.933 | 266. / 0.856 | –ᵃ | –ᵃ |
| Prune w/FSA | 60% | 33.8 / 0.912 | 78.9 / 0.838 | 589. / 0.871 | –ᵃ |
| | 30% | 10.5 / 0.893 | 23.8 / 0.838 | 142. / 0.879 | –ᵃ |
| | 15% | 4.25 / 0.885 | 17.1 / 0.857 | 40.8 / 0.877 | 150. / 0.861 |
| | 7% | 3.00 / 0.866 | 4.23 / 0.842 | 12.7 / 0.903 | 34.8 / 0.887 |
| MUSCLE³ | – | 55.6 / 0.905 | 138. / 0.799 | –ᵇ | –ᵇ |
| Prune w/MUSCLE | 60% | 40.7 / 0.899 | 77.9 / 0.777 | 886. / 0.862 | –ᵇ |
| | 30% | 24.7 / 0.896 | 42.8 / 0.777 | 368. / 0.883 | –ᵇ |
| | 15% | 15.1 / 0.905 | 29.1 / 0.828 | 185. / 0.899 | 440. / 0.900 |
| | 7% | 24.7 / 0.905 | 18.8 / 0.841 | 114. / 0.924 | 228 / 0.928 |
| MAFFT⁴ | – | 3.17 / 0.897 | 5.39 / 0.806 | 20.1 / 0.886 | 25.2 / 0.912 |
| SATé⁵ | – | 101. / 0.915 | 301. / 0.840 | –ᵇ | –ᵇ |

¹ Pecan was run with default parameters.

² FSA was run with the --exonerate, --anchored, --softmasked, and --fast flags.

³ MUSCLE was run with default parameters.

⁴ MAFFT was run with the --treein option.

⁵ SATé was run with the -t option but limited to two iterations; additional iterations yielded negligible gains in accuracy.

ᵃ Most of these problems could not be aligned because the aligner ran out of memory.

ᵇ Most of these problems ran for longer than 3 days and were aborted.

Run-time and average agreement score of Prune alignments on different-sized datasets. Several sets of simulated alignment problems were generated from 10-kilobase root sequences. The neutral evolution of each root sequence was simulated over species trees with 50, 100, 500, and 1000 leaves. Fifty problems were generated per tree size, for a total of two hundred test alignment problems. The agreement and run-time (in minutes) reported for each problem size are averages over the fifty simulated alignments. Each underlying alignment method (Pecan, FSA, MUSCLE) was first run on the datasets directly. Prune was then used to break each problem into sub-trees containing at most 60%, 30%, 15%, or 7% of the nodes in the entire tree, with Pecan, FSA, and MUSCLE serving as the underlying alignment methods for Prune. The largest number of stages was six, but most problems required no more than three. For comparison, we also aligned the datasets with MAFFT and SATé; to ensure a fair comparison, the true tree topology was passed to SATé (using the -t option) and to MAFFT (using the poorly documented --treein option). Some alignment algorithms could not be applied to the large problems because of very long run-times and memory exhaustion. Using Prune, we were able to apply Pecan, FSA, and MUSCLE to alignment problems much larger than they could solve on their own. Prune achieved a very large speedup with little loss of accuracy, and sometimes with an increase in accuracy.
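The decomposition step described above can be sketched in a few lines. The following is an illustrative toy, not the paper's implementation: it recursively splits a binary guide tree until every sub-tree contains at most a given fraction of the full tree's nodes, mirroring the 60%/30%/15%/7% budgets in the table. The nested-tuple tree representation and function names are assumptions made for this sketch.

```python
# Illustrative sketch of Prune-style sub-tree decomposition (NOT the
# paper's actual code). Leaves are strings; internal nodes are
# (left, right) tuples.

def count_nodes(tree):
    """Count all nodes (leaves and internal nodes) in the tree."""
    if isinstance(tree, str):
        return 1
    left, right = tree
    return 1 + count_nodes(left) + count_nodes(right)

def prune_subtrees(tree, max_nodes):
    """Split the tree into sub-trees, each with at most max_nodes nodes.
    Any sub-tree over budget is replaced by its two children, applied
    recursively; each resulting piece can then be aligned independently
    by an expensive underlying aligner (e.g. Pecan, FSA, or MUSCLE)."""
    if count_nodes(tree) <= max_nodes:
        return [tree]
    left, right = tree
    return prune_subtrees(left, max_nodes) + prune_subtrees(right, max_nodes)

# Toy guide tree over 4 species: 7 nodes total. With a 60% budget
# (floor(0.6 * 7) = 4 nodes), the root is over budget and is split
# into its two 3-node children.
tree = (("A", "B"), ("C", "D"))
total = count_nodes(tree)
parts = prune_subtrees(tree, int(0.6 * total))
print(total, len(parts))  # → 7 2
```

Smaller budgets produce more, smaller sub-problems, which is the trade-off visible in the table: decomposition cuts run-time sharply while the agreement score stays close to that of the monolithic alignment.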