Fig. 1. Outline of the analytical procedure. Stages are depicted clockwise from top left. (1) The input for the analysis is the gene trees of all protein families for a group of species, including the AD of each branch as calculated by MAD. Protein families are classified as complete or partial and as single-copy or multicopy according to the gene copy number per species. (2) Branch ADs in the gene trees supply evidence for hypothetical root partitions in the species tree; (3) these are collected in the root support matrix. The information in the root support matrix is used to identify candidates for the species root partition (including the consensus root partition, if one exists). (4) Root candidates are compared in a pairwise test on the distributions of their ADs across all gene trees. (5) If several root partitions are similarly supported by ADs, they can be analyzed in the context of a root neighborhood, in which weakly supported partitions are sequentially eliminated from the set of root partitions. (6) The remaining root partitions constitute the species LCA confidence set.
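The family classification in stage 1 can be illustrated with a minimal Python sketch. The definitions used here are assumptions, since the caption does not spell them out: a family is taken to be complete when every species in the group has at least one copy, and single-copy when no species has more than one copy. The function name `classify_family` and the toy species and family names are hypothetical.

```python
from collections import Counter


def classify_family(copies_per_species, all_species):
    """Classify one protein family by gene copy number per species.

    Assumed definitions (not stated in the caption):
      complete    = at least one copy in every species of the group
      single-copy = no species has more than one copy
    """
    present = {s for s, n in copies_per_species.items() if n > 0}
    completeness = "complete" if present == set(all_species) else "partial"
    max_copies = max(copies_per_species.values(), default=0)
    copy_class = "single-copy" if max_copies <= 1 else "multicopy"
    return f"{completeness} {copy_class}"


# Toy data: gene copy counts per species for four hypothetical families.
species = ["A", "B", "C"]
families = {
    "fam1": {"A": 1, "B": 1, "C": 1},
    "fam2": {"A": 1, "B": 1},
    "fam3": {"A": 2, "B": 1, "C": 1},
    "fam4": {"A": 3, "C": 1},
}

labels = {name: classify_family(c, species) for name, c in families.items()}
counts = Counter(labels.values())  # number of families in each class
```

In this toy input each of the four classes occurs exactly once, so `counts` gives a quick check of how the families are distributed across classes before the gene trees are analyzed.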
