Boussau et al. 10.1073/pnas.0400975101.

Supporting Information

Files in this Data Supplement:

Supporting Figure 6
Supporting Table 3
Supporting Figure 7
Supporting Table 4
Supporting Figure 8
Supporting Table 5




Supporting Figure 6A
Supporting Figure 6B

Fig. 6. Comparisons of gene-order structures for different replicons in the Rhizobiales. The figure illustrates the high degree of gene-order conservation for genes located on the main chromosomes (a) and the lack of gene-order conservation for genes located on the auxiliary chromosomes (b). The boxes indicate the location of genes listed in Table 3 that were included in the 38 protein alignments used for the phylogenetic inference shown in Fig. 2. The numbers on the axes indicate chromosome numbers, with I referring to the main chromosome and II and III to the auxiliary chromosomes. Abbreviations for species names are as described in the legends to Figs. 1 and 2.





Supporting Table 3

Table 3. List of genes used for the phylogenetic inferences shown in Fig. 2. These genes were selected on the basis that they are located in conserved segments in the Rhizobiales (Fig. 6).





Supporting Table 4

Table 4. The intercept and slope were estimated for each functional category in a plot of the number of genes versus genome size for α-proteobacteria (Fig. 1).
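The intercept and slope mentioned in this legend can be illustrated with an ordinary least-squares fit of gene count against genome size. The sketch below is only an illustration: the function name fit_line and the data values are hypothetical, and the source does not state which fitting procedure or software was used.

```python
from typing import Sequence, Tuple


def fit_line(genome_sizes: Sequence[float],
             gene_counts: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x.

    Assumption: ordinary least squares; the original analysis may have
    used a different estimator.
    """
    n = len(genome_sizes)
    if n < 2 or n != len(gene_counts):
        raise ValueError("need two equally long sequences with at least 2 points")

    mean_x = sum(genome_sizes) / n
    mean_y = sum(gene_counts) / n

    # Sum of squared deviations in x, and of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in genome_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genome_sizes, gene_counts))
    if sxx == 0:
        raise ValueError("all genome sizes are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data for one functional category (not taken from the paper):
# genome sizes in Mb and the number of genes assigned to the category.
sizes_mb = [1.0, 2.0, 4.0, 6.0]
genes_in_category = [30.0, 50.0, 90.0, 130.0]

intercept, slope = fit_line(sizes_mb, genes_in_category)
print(f"intercept = {intercept:.2f} genes, slope = {slope:.2f} genes per Mb")
```

Running the fit once per functional category would give one intercept and one slope per category, which is the layout the legend describes.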





Supporting Figure 8

Fig. 8. Reconstruction of deletions/duplications and gene-genesis events based on the α-proteobacterial tree, made separately for clusters of orthologous groups (COGs) assigned to the different replicons. Inference of gene contents was made by using the ACCTRAN option for parsimony analysis in PAUP* for the complete set of α-proteobacterial proteins (73,658 proteins in total), with penalties for duplication, deletion, and gene genesis set to 1, 1, and 2 (a) and 1, 1, and 5 (b), respectively. Numbers along branches refer to the number of duplications/losses/genesis events, respectively, for the complete set of COGs, with numbers in parentheses referring to COGs assigned to the auxiliary replicons. Numbers at nodes refer to the putative number of genes in the inferred genome at the node, with numbers in parentheses referring to COGs assigned to the auxiliary replicons. Outgroup sequences are as described for Fig. 2, but they were pruned from the tree shown here. Abbreviations for species names are as described in the legends to Figs. 1 and 2.
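The effect of the two penalty sets can be illustrated by scoring alternative histories for a single COG. The sketch below is not the PAUP* ACCTRAN procedure: the Scenario type, the function names, and the event counts are hypothetical and only show how a weighted event cost changes which history is cheapest.

```python
from typing import Dict, NamedTuple, Tuple


class Scenario(NamedTuple):
    """Event counts for one hypothetical history of a single COG."""
    duplications: int
    deletions: int
    geneses: int


def scenario_cost(s: Scenario, dup_penalty: int, del_penalty: int,
                  genesis_penalty: int) -> int:
    """Weighted sum of events: each event type is multiplied by its penalty."""
    return (s.duplications * dup_penalty
            + s.deletions * del_penalty
            + s.geneses * genesis_penalty)


def cheapest(scenarios: Dict[str, Scenario],
             penalties: Tuple[int, int, int]) -> str:
    """Name of the scenario with the lowest weighted cost."""
    return min(scenarios, key=lambda name: scenario_cost(scenarios[name], *penalties))


# Penalties (duplication, deletion, genesis) as given in the legend.
PENALTIES_A = (1, 1, 2)
PENALTIES_B = (1, 1, 5)

# Hypothetical COG found in 2 of 5 genomes. The event counts are assumed
# for illustration; real counts depend on the tree topology.
scenarios = {
    "two independent geneses": Scenario(duplications=0, deletions=0, geneses=2),
    "one ancestral genesis, three losses": Scenario(duplications=0, deletions=3, geneses=1),
}

for label, penalties in (("a", PENALTIES_A), ("b", PENALTIES_B)):
    print(label, penalties, "->", cheapest(scenarios, penalties))
```

In this toy case the penalty set 1, 1, 2 favors two independent geneses (cost 4 versus 5), whereas 1, 1, 5 favors a single ancestral genesis followed by losses (cost 8 versus 10). A higher genesis penalty therefore tends to place genes deeper in the tree.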





Supporting Table 5

Table 5. List of clusters of orthologous groups in the α-proteobacterial ancestor inferred by using clustering groups and penalty values as described for Fig. 3a.