2017 Jun 22;18(9):1559–1571. doi: 10.15252/embr.201744102

Figure EV4. Gene phylogeny of histone H3 homologs.

To find the putative orthologs of CenpA, we first aligned candidate orthologous sequences: experimentally identified centromeric H3 variants from divergent species (indicated with pink branches in this phylogeny). From this alignment, we constructed a profile HMM and performed multiple HMM searches through our local proteome database. From these searches, we selected 831 sequences belonging to the histone H3 family, aligned them and constructed the gene phylogeny presented in this figure (see also Materials and Methods). We rooted the phylogeny on the cluster that contained all of these experimentally identified centromeric H3 variants, together with some additional sequences that, based on best BLAST hits, were also likely to be orthologous to CenpA. The cluster did not contain the candidate orthologs in Toxoplasma gondii 81. We do not know whether this is due to an error in the gene phylogeny or to parallel invention of a centromeric H3 variant in this species, which would mean that it is not orthologous to CenpA. Nevertheless, we included these sequences in the orthologous group. The candidate centromeric H3 variants that are part of the CenpA cluster include sequences from all five eukaryotic supergroups: Homo sapiens 82, Saccharomyces cerevisiae 83, Drosophila melanogaster 84, Caenorhabditis elegans 85, Schizosaccharomyces pombe 86 (Opisthokonta), Dictyostelium discoideum 87 (Amoebozoa), Arabidopsis thaliana 88 (Archaeplastida), Tetrahymena thermophila 89, Plasmodium falciparum 90 (SAR), Giardia intestinalis 91 and Trichomonas vaginalis 92 (Excavata). The original gene tree in Newick format is provided (Dataset EV3).
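As a rough illustration of the cluster check described in this legend, the following Python sketch parses a small Newick string, finds the smallest clade that contains a set of reference sequences, and lists which other sequences fall inside that clade. It is not the authors' pipeline: the toy tree, all leaf labels, and the helper names `parse_newick`, `leaves` and `smallest_clade` are hypothetical, and the parser assumes a rooted tree and handles only plain Newick (no quoted labels or comments).

```python
def parse_newick(text):
    """Parse plain Newick into nested lists; leaves are strings (branch lengths dropped)."""
    text = text.strip().rstrip(";")
    pos = 0

    def read_label():
        # Read a label (and optional ":length") up to the next structural character.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos].split(":")[0]

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            pos += 1  # consume ")"
            read_label()  # discard internal label / branch length
            return children
        return read_label()

    return parse_node()


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def smallest_clade(node, targets):
    """Return the smallest subtree whose leaves include every target."""
    if not isinstance(node, str):
        for child in node:
            if targets <= leaves(child):
                return smallest_clade(child, targets)
    return node


# Hypothetical toy tree; labels are illustrative only, not from Dataset EV3.
toy_tree = (
    "(((CenH3_human:0.2,CenH3_yeast:0.3):0.1,"
    "(CenH3_plant:0.2,unlabelled_1:0.1):0.1):0.5,"
    "((H3_human:0.05,H3_plant:0.05):0.1,CenH3_toxo_candidate:0.4):0.2);"
)
known = {"CenH3_human", "CenH3_yeast", "CenH3_plant"}

tree = parse_newick(toy_tree)
clade = smallest_clade(tree, known)
extras = leaves(clade) - known  # additional sequences inside the reference cluster
outside = "CenH3_toxo_candidate" not in leaves(clade)
```

In this toy case, `extras` holds the one unlabelled sequence that clusters with the references, and `outside` records that the hypothetical candidate falls elsewhere in the tree, which mirrors the situation described for Toxoplasma gondii.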