Abstract
Photosynthetic eukaryotes and their relatives are the result of an intricate evolutionary history involving a series of plastid acquisitions through endosymbiosis, multiple reversions to heterotrophy, and sometimes total plastid losses. Among these events, one of the most debated is the emergence and diversification of the CASH lineages (Cryptophyta, Alveolata, Stramenopiles, and Haptophyta). Although they all include species bearing a complex plastid that derived from the endosymbiosis of a red alga, their phylogenetic relationships remain controversial, and the timing and number of plastid acquisitions are still undetermined. The inner metabolism of all plastids is mostly supported by nuclear-encoded proteins, and consequently, mechanisms allowing the relocation of those proteins have evolved or were recycled at each endosymbiotic event. Thus, the study of the composition and origins of those translocation machineries provides important clues for understanding how photosynthetic lineages have emerged and might be related. In CASH species, the SELMA complex, composed of about 20 proteins, is dedicated to the transport of preproteins across the periplastidial membrane, the second outermost membrane of complex red plastids. In this work, we present a comprehensive genomic survey and phylogenetic analysis of the proteins composing the SELMA complex. We confirm the presence, homology, and monophyletic origin of SELMA in the four CASH lineages and use these observations to infer a scenario for the serial transmission of secondary red plastids that differs from previous hypotheses and sheds new light on the evolution of photosynthetic eukaryotes.
Keywords: macroevolution, endosymbiosis, phylogenomics, photosynthesis
Introduction
Eukaryotes acquired the ability to carry out oxygenic photosynthesis in plastids more than 1.2 billion years ago thanks to primary endosymbiosis, namely the engulfment and assimilation of a cyanobacterium symbiont by a heterotrophic protist host (Archibald 2009; Parfrey et al. 2011). Three extant lineages diversified from this founder event: Glaucophyta, Viridiplantae, and Rhodophyta, which together compose the monophyletic supergroup Archaeplastida (Adl et al. 2012). Although the nature of the last common ancestor of the host lineage that gave rise to Archaeplastida is still undetermined, phylogenetic analyses of the reduced genomes of primary plastids have traced their common ancestry to an extinct cyanobacterium related to Gloeomargaritales (Ponce-Toledo et al. 2017). Following primary endosymbiosis, photosynthesis was repeatedly transmitted to distant phyla of eukaryotes thanks to secondary and tertiary endosymbioses, during which various hosts associated with Archaeplastida symbionts, either red or green algae. For instance, Chlorarachniophyta (Rhizaria) and Euglenida (Excavata) both carry green alga-derived plastids that originated from two independent secondary endosymbiosis events, involving symbionts related to Ulvophyceae and Prasinophyceae, respectively (Rogers et al. 2007; Hrdá et al. 2012). Likewise, Cryptophyta, Alveolata, Stramenopiles, and Haptophyta (termed hereafter CASH [Petersen et al. 2014]) include species with complex plastids related to red algae. Contrary to the green ones, the precise origins of red algae-derived complex plastids, as well as the phylogenetic and evolutionary histories of CASH lineages, are still a matter of controversy. The long prevailing “chromalveolate” hypothesis argued in favor of a single endosymbiosis from which all CASH lineages would derive.
This was mostly supported by the similarity of intracellular membrane organizations between those phyla, considered an ancestral trait, but also by the idea that endosymbioses, being complex evolutionary processes, should be rare events (Cavalier-Smith 1999; Keeling 2009). Interestingly, the same kind of phylogenetic analyses that demonstrated the independent origins of green plastids have repeatedly shown that all Rhodophyta-derived secondary plastids are monophyletic (Yoon et al. 2002; Rodríguez-Ezpeleta et al. 2005; Baurain et al. 2010), supporting the chromalveolate hypothesis. However, the accumulation of nuclear genome data across the eukaryotic diversity has offered new phylogenetic insights that systematically rejected the monophyly of chromalveolates (Baurain et al. 2010; Burki et al. 2020). Thus, there is an incongruence between a seemingly vertical inheritance of plastids in CASH phyla and the absence of direct descent relationships between the nuclear genomes of the same groups. One way to reconcile these conflicting patterns is to hypothesize that complex plastids were acquired via intricate serial endosymbioses (Bodył 2005; Sanchez-Puerta et al. 2007; Baurain et al. 2010; Petersen et al. 2014; Stiller et al. 2014; Strassert et al. 2021). In this scenario, a primordial secondary endosymbiosis gave rise to one of the four CASH phyla, which was later involved in one or several higher order endosymbioses, transmitting the complex plastid to other CASH lineages. This hypothetical evolutionary pathway is, however, still mainly built upon indirect evidence.
An emblematic feature of endosymbioses resides in the reduction of the symbiont genome and in the massive transfer of genes toward the host nucleus. These endosymbiotic gene transfers (EGT) have greatly shaped the genomes of photosynthetic eukaryotes (Martin et al. 1998). The genomes of Archaeplastida, for instance, contain a large proportion of genes of cyanobacterial origin (Timmis et al. 2004). Similarly, genomes of species with complex plastids also contain genes related to cyanobacteria as well as many eukaryotic genes that they obtained from their algal symbiont (Paper et al. 2000; Gould et al. 2006b; Li et al. 2006). A consequence of EGT is that most of the proteins engaged in plastid metabolism are encoded in the nuclear genome and must be relocated into the plastid to perform their function. Primary plastids are surrounded by two membranes which contain a protein import complex called “Tic/Toc”, for Translocators of the Inner and Outer Chloroplast membranes (Sjuts et al. 2017). Nuclear encoded plastidial preproteins carry a particular N-terminal sequence called the transit peptide that triggers their recognition and addressing to the Tic/Toc complex for translocation. This transit peptide is subsequently cleaved and mature proteins are released in the stroma or sent to the thylakoids. Complex plastids derive from photosynthetic green or red algae and are consequently surrounded by more membrane layers. Plastids of most CASH species possess four membranes, with the exception of peridinin-containing plastids of Dinophyta as well as plastids of a restricted number of species of Ochrophyta with peculiar evolutionary histories (Wetherbee et al. 2019). Those four membranes are, from the outside to the inside, the outermost membrane (OM), the periplastidial membrane (PPM), and finally the original plastid double membrane (PM) (fig. 1a). Dinophyta have lost the PPM secondarily.
In Cryptophyta, Haptophyta, and Stramenopiles, the OM is continuous with the outer nuclear membrane and populated with ribosomes, hence the plastid resides in the endoplasmic reticulum lumen (or chloroplast endoplasmic reticulum; CER; fig. 1a [Gibbs 1979]). In Alveolata, the plastid is located in the cytoplasm. Between the PPM and the PM lies the periplastidial space (PPS), corresponding to the former cytoplasm of the algal symbiont (fig. 1a). In Cryptophyta, the PPS encloses a nucleomorph (fig. 1a), which is a relic of the symbiont's nucleus (Greenwood 1974; Hibberd and Norris 1984; Maier 1992) and contains a vestigial genome composed of 300 to 450 genes mainly involved in its own replication and expression (Maier et al. 2000; Zauner et al. 2019). In summary, preproteins that have to be relocated to complex red plastids must pass through three to four membranes. In lineages with a complex plastid, nuclear encoded preproteins are prefixed by a bipartite N-terminal topogenic signal (BTS), i.e., a combination of a signal peptide and a transit peptide. The relocation of preproteins is a two-step process involving the translocation into the secretory pathway followed by the import into the plastid stroma through Tic/Toc homologs. The signal peptide and the transit peptide are cleaved after preprotein translocation into the CER and the stroma, respectively (Grossman et al. 1990; Bhaya and Grossman 1991; Sulli et al. 1999; Waller et al. 2000; Rogers et al. 2004; Gould et al. 2006a).
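The targeting logic described above can be illustrated with a minimal sketch. It assumes that two Boolean predictions are already available for each preprotein (presence of a signal peptide, and presence of a transit peptide-like region immediately downstream of it or at the N-terminus). The class, field names, and returned labels are hypothetical and do not correspond to any specific prediction tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preprotein:
    name: str
    has_signal_peptide: bool    # hypothetical predictor output
    has_transit_peptide: bool   # hypothetical predictor output


def predicted_route(p: Preprotein) -> str:
    """Return a coarse targeting category from the two N-terminal signals."""
    if p.has_signal_peptide and p.has_transit_peptide:
        # Bipartite topogenic signal: secretory pathway first, then plastid import.
        return "complex plastid (BTS)"
    if p.has_signal_peptide:
        return "secretory pathway only"
    if p.has_transit_peptide:
        return "transit peptide only"
    return "no N-terminal targeting signal"


# Toy example with made-up protein names.
examples = [
    Preprotein("toy_plastid_protein", True, True),
    Preprotein("toy_secreted_protein", True, False),
    Preprotein("toy_cytosolic_protein", False, False),
]
routes = {p.name: predicted_route(p) for p in examples}
```

Such a rule only captures the order of the two signals; it says nothing about which membrane-specific translocon handles each step.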
Fig. 1.
Composition of the SELMA transporter across CASH lineages. a) Schematic organization of a photosynthetic Cryptophyta cell depicting organelles and membranes. N, nucleus; Nm, nucleomorph; C, cytoplasm; OM, plastid outer membrane; CER, chloroplast endoplasmic reticulum; PPM, periplastid membrane; PPS, periplastidial space; PM, plastid double membrane. b) Schematic structure of the SELMA transporter, adapted from Lau et al. (2016). Functional categories are annotated using different colors (refer to panel c). Please refer to the main text for details about the name and role of SELMA constituents. c) Presence/absence, origin and phylogenetic relationships of putative SELMA components in CASH lineages. For each protein (line), a large dot indicates its existence in a lineage (column). The color of the dot corresponds to the inferred origin of the gene; dark purple: Cryptophyta nucleus, light purple: Cryptophyta nucleomorph, blue: Haptophyta, pink: Stramenopiles, brown: Alveolata. When two dots are linked with a thick line, the corresponding proteins share a common ancestor.
The nature of the transporter driving proteins across the PPM was unveiled much more recently. Its discovery resulted from the detection of a peculiar endoplasmic reticulum associated degradation (ERAD) complex encoded in the nucleomorph genome of Cryptophyta as well as in the nuclear genome of Apicomplexa and Stramenopiles (Sommer et al. 2007). ERAD is a ubiquitin-dependent pathway used by all eukaryotic cells to extract misfolded proteins from the endoplasmic reticulum back into the cytoplasm where they can be degraded by the proteasome (Meusser et al. 2005). Sommer et al. (2007) suggested that this newly found plastid-located additional copy of ERAD could have been acquired from red algae (via EGT) and tinkered to create a translocon that imports proteins across the PPM. The putative new complex was named SELMA for Symbiont-specific Erad-Like MAchinery (Hempel et al. 2009), and its structure and function were determined by in silico studies (Felsner et al. 2011; Moog et al. 2011; Stork et al. 2012) and functional experiments (Agrawal et al. 2009, 2013; Hempel et al. 2009, 2010; Lau et al. 2015; Fellows et al. 2017). Currently, SELMA is hypothesized to function using ca. 20 proteins that are, for the most part, present in sequenced genomes of red plastid-containing CASH species (supplementary table S1, Supplementary Material online, Stork et al. 2012). The discovery of this CASH-associated PPM protein translocator, likely acquired by EGT from Rhodophyta, brought new arguments into the debate of complex red plastid evolution. Indeed, it seems unlikely that the same multiprotein transporter evolved several times after independent secondary endosymbioses with different red algae. This idea is strengthened by the observation that, in Chlorarachniophyta, an evolutionarily distinct green complex plastid-containing lineage which also possesses a PPM, a PPS and a nucleomorph, there is no detectable SELMA-like complex (Hirakawa et al. 2012).
The fact that Cryptophyta SELMA components are partly encoded in the nucleomorph (thus undoubtedly inherited from secondary endosymbiosis), and partly encoded in the nucleus, prompted some authors to speculate that Cryptophyta represent an evolutionary intermediate of secondary red plastid integration (Zimorski et al. 2014; Gould et al. 2015; Cavalier-Smith 2018). Additionally, in the very restricted set of single protein phylogenies available in the literature, SELMA components of Alveolata, Stramenopiles, and Haptophyta form a clade that is a sister group to the Cryptophyta homologs (Felsner et al. 2011; Fellows et al. 2017). Accordingly, Cryptophyta ancestors were argued to be likely involved in the endosymbiotic series that gave rise to other photosynthetic CASH phyla. However, no systematic phylogenetic analysis of the entire SELMA complex was ever attempted. Such an analysis could test whether SELMA genes were indeed acquired mostly from red algae; if they were transferred between CASH phyla; and if some of those transfers might indicate the timing and order of complex endosymbioses. In this work, we attempted to reconstruct phylogenies of all described protein components of the SELMA complex. We confirm the existence and homology of SELMA in all CASH phyla and propose an interpretation of the phylogenetic tree topologies in the context of complex red plastid evolution.
Results
Data Mining and Phylogenetic Analysis of SELMA-Related Proteins
Stork et al. (2012) gathered, using sequence similarity searches, a comprehensive sequence collection of putative SELMA components from all CASH genomes available in 2012. To differentiate SELMA from ERAD proteins, they used in silico predictions of BTS as well as BLAST dissimilarity as proxy of evolutionary distance between sequences. Members of the SELMA complex can be classified into four functional categories that will be fully described thereafter: (i) the Derlin pore, (ii) the ubiquitination machinery, (iii) a Cdc48-Ufd1-Npl4 complex, and (iv) a set of accessory proteins (fig. 1b). For the sake of clarity, the following terminology will be used throughout the rest of the manuscript: genes and proteins putatively involved in ERAD will have the prefix “e_” while those potentially involved in SELMA will be prefixed “s_.” We used protein sequence identifiers from supplementary table S1, Supplementary Material online in Stork et al. as a starting set for our own data mining procedure. We assembled a custom database containing protein sequences predicted from 864 genomes and transcriptomes of species distributed across the three domains of life with an emphasis on photosynthetic eukaryotes (supplementary table S6, Supplementary Material online; a taxon sampling comparison between Stork et al. and this study is available in supplementary table S5, Supplementary Material online). We used the starting set to query our database using reciprocal BLASTP (Altschul et al. 1990) in order to retrieve as many nonredundant similar protein sequences as possible for each potential ERAD/SELMA component. We then used those collections of similar proteins to reconstruct and refine phylogenetic trees and managed to obtain informative phylogenies for 16 of the 21 putative SELMA components analyzed (supplementary table S1, Supplementary Material online). SELMA is considered to have evolved from the refunctionalization of ERAD genes of Rhodophyta obtained by EGT (Bolte et al. 2011). 
Therefore, it is expected that eukaryotes lacking a complex plastid should only have one copy of each ERAD component while photosynthetic CASH genomes should carry at least two copies: one involved in the cytosolic/endoplasmic ERAD complex and the other one in the plastid-located SELMA complex. We looked for the corresponding patterns in our trees and included in silico predictions of cellular localization to determine which paralogous group of proteins might be implicated in SELMA or in ERAD. By doing so, we could detect several classification errors in the Stork et al. study, probably because they used BLAST “distances” in lieu of phylogenetic relationships, an approach known to be unreliable (Koski and Golding 2001). An updated table of ERAD/SELMA protein references is shown in supplementary table S2, Supplementary Material online. We describe hereafter the results obtained for each functional group composing the SELMA transporter.
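The expectation stated above (one copy in plastid-lacking eukaryotes, at least two in photosynthetic CASH, with localization predictions used to tell the paralogous groups apart) can be sketched as follows. This is an illustrative toy, not the procedure actually used in this study: the clade labels are assumed to have been read off a tree beforehand, the `has_bts` flags stand for hypothetical in silico predictions, and the 0.5 threshold is an arbitrary assumption.

```python
from collections import defaultdict
from typing import NamedTuple


class Homolog(NamedTuple):
    species: str
    clade: str      # paralogous group read off a tree (hypothetical label)
    has_bts: bool   # hypothetical in silico BTS prediction for this sequence


def label_clades(homologs, min_bts_fraction=0.5):
    """Prefix each paralogous clade with 's_' (SELMA-like) or 'e_' (ERAD-like).

    A clade is called SELMA-like when at least `min_bts_fraction` of its
    members carry a predicted BTS (arbitrary threshold, for illustration).
    """
    bts_flags = defaultdict(list)
    for h in homologs:
        bts_flags[h.clade].append(h.has_bts)
    return {
        clade: "s_" if sum(flags) / len(flags) >= min_bts_fraction else "e_"
        for clade, flags in bts_flags.items()
    }


def copies_per_species(homologs):
    """Count how many distinct paralogous clades each species is found in."""
    clades = defaultdict(set)
    for h in homologs:
        clades[h.species].add(h.clade)
    return {species: len(found) for species, found in clades.items()}


# Toy input with made-up species and clade names.
toy = [
    Homolog("plastid_lacking_protist", "clade_A", False),
    Homolog("photosynthetic_cash_alga", "clade_A", False),
    Homolog("photosynthetic_cash_alga", "clade_B", True),
]
labels = label_clades(toy)
copies = copies_per_species(toy)
```

In this toy input, the plastid-lacking species appears in a single clade while the photosynthetic one appears in two, which is the pattern the text describes looking for in the trees.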
The SELMA Derlin Translocon Likely Originated by EGT from Rhodophyta
The nature of the pore through which proteins are excreted out of the ER when they are processed by ERAD is still an open debate. For some authors, Derlin transmembrane proteins, which are present in all eukaryotes, are the main constituent of this channel (Rao et al. 2021). Those proteins have been extensively studied in Saccharomyces cerevisiae, whose genome encodes two Derlin paralogs, namely Der1 (for “degradation in the ER”) and Dfm1 (for “Der1-like family member”) (Knop et al. 1996; Hitt and Wolf 2004). Analyses of yeast single mutants for each isoform indicate that Der1 is mainly implicated in the retrotranslocation of luminal proteins (ERAD-L) while Dfm1 would rather transport membrane integral proteins (ERAD-M) (Neal et al. 2018). It seems, however, that each isoform is able to compensate for the absence of the other, explaining the lack of altered growth phenotype in single mutant lines (Knop et al. 1996; Hitt and Wolf 2004). Additionally, double mutants impaired for both Derlins are able to grow, indicating that there might also be alternative ways to export misfolded proteins out of the ER (the E3 ubiquitin ligase Hrd1 is one candidate [Wu and Rapoport 2018]). Although all heterotrophic eukaryotes (and Archaeplastida) only have one pair of Derlins dedicated to ERAD, additional genes encoding Der1 and Dfm1 homologs have been detected in all photosynthetic CASH lineages (Sommer et al. 2007; Spork et al. 2009; Felsner et al. 2011). Interestingly, both are absent in the apicoplast-lacking parasite Cryptosporidium, suggesting a specific role of these proteins in the plastid (Agrawal et al. 2009). The gene coding s_Der1 was first detected in the nucleomorph genome of Guillardia theta and used to successfully complement the yeast der1 mutant, indicating a similar function (Sommer et al. 2007). s_Der1 and s_Dfm1 proteins are encoded in the nuclear genome of all other CASH and were proven to be located in the PPM in Apicomplexa and Stramenopiles (Hempel et al. 
2009; Kalanon et al. 2009; Spork et al. 2009). A deletion of the nuclear gene s_der1 in Toxoplasma gondii hinders the relocation of proteins into the apicoplast, eventually leading to the death of the parasite (Agrawal et al. 2009), directly demonstrating the role of s_Derlins in SELMA.
To our knowledge, three publications have reported phylogenetic trees of the s_Derlin proteins (Hirakawa et al. 2012; Petersen et al. 2014; Cavalier-Smith 2018). However, they suffer from poor taxon sampling or miss many appropriate orthologs. We reconstructed several widely sampled phylogenetic trees of Der1 + Dfm1, which allowed us to confirm that all CASH lineages possess both s_Der1 and s_Dfm1, but also to determine that e_Dfm1 and e_Der1 are ancient paralogs that were likely duplicated in the Last Eukaryotic Common Ancestor (LECA). A phylogenetic tree rooted at the most likely split between Dfm1 and Der1 paralogs is shown in supplementary fig. S1, Supplementary Material online. We observed that the internal topology of each paralog is unstable, probably because of long branch attraction (LBA) artifacts when including both paralogs in the same tree. Hence, we split the dataset into two subtrees (fig. 2; supplementary fig. S2 and S3, Supplementary Material online). Our phylogenetic tree of Der1 supports the inheritance of s_der1 by serial EGT (fig. 2a and supplementary fig. S2, Supplementary Material online). Similarly, s_Dfm1 proteins of Alveolata, Stramenopiles, and Haptophyta form a single monophyletic group that is sister to Rhodophyta, although with low bootstrap support (fig. 2b and supplementary fig. S3, Supplementary Material online). This indicates that s_dfm1 in these phyla likely derives from e_dfm1 of Rhodophyta and was acquired by EGT. Dfm1 homologs in Cryptophyta are, however, found as two paralogous sister clades, one being nested in the other and having a significantly longer root branch. The plastid-lacking Goniomonas sp. seems to harbor only one of the two Dfm1 paralogs. We suggest that the short-branching group corresponds to e_Dfm1 and that the long-branching one is s_Dfm1 that evolved by duplication and replaced a putative protein originally derived from red algae. We observed this pattern for other SELMA components (see below).
Fig. 2.
Unrooted condensed maximum likelihood phylogenetic trees of Der1 a) and Dfm1 b) proteins. The corresponding expanded trees are available in supplementary fig. S2 and S3, Supplementary Material online, respectively. For the sake of readability, monophyletic phyla are depicted as triangles. Ultrafast bootstrap branch support values (1,000 replicates) are indicated except when lower than 70% or as a thick branch when the value is maximum. The number of amino acid positions in the corresponding sequence alignment, the substitution model and parameters used for IQ-TREE reconstruction as well as the branch length scale are indicated at the top right of each tree. Clades corresponding to proteins putatively implicated in SELMA are shadowed in gray.
Genes Involved in the Ubiquitylation of SELMA-Translocated Proteins Have Conflicting Origins
Ubiquitin (Ub) is a protein generally composed of 76 amino acids, whose sequence is extremely conserved among eukaryotes (Swatek and Komander 2016). Ubiquitination is a posttranslational modification consisting of the covalent addition of ubiquitin moieties onto lysine residues of targeted proteins. This modification plays a pivotal role in signaling and regulation of many metabolic pathways of the eukaryotic cell (Komander and Rape 2012; Swatek and Komander 2016). During the ERAD process, ubiquitination of misfolded proteins on the cytosol side is essential to their Derlin/Hrd1 mediated extrusion as well as their relocation to the proteasome (Claessen et al. 2012; Lemus and Goder 2014). Ubiquitination is performed through the consecutive action of three enzymes: Ub activating enzyme E1, Ub conjugating enzyme E2 and Ub ligase E3. Ubiquitin activation is handled by the versatile enzyme Uba1 that uses ATP to first adenylate Ub and subsequently transfers it to one of its cysteine residues, forming an E1∼Ub complex (McGrath et al. 1991). Ubiquitin is then transferred to the active site of one of the many ubiquitin conjugating E2 enzymes. Finally, a ubiquitin protein ligase (E3) interacts with both E2∼Ub and the final protein substrate to transfer Ub to a lysine residue of the latter. The nature of the E2/E3 couple determines the specificity of ubiquitination toward the substrate and differentiates ubiquitin-using pathways. In the case of ERAD, this couple consists predominantly of Ubc7 and Hrd1, with Hrd1 catalyzing the transfer of ubiquitin to misfolded proteins at the ER luminal side, but also regulating the extrusion process by self-ubiquitination (Baldridge and Rapoport 2016). The ubiquitin moieties are then recognized by Ub-interacting chaperones and cofactors to address misfolded proteins to the proteasome.
SELMA transport across the PPM also requires the ubiquitination of precursor proteins on the PPS side. A specific Plastidial UBiquitin Like protein (PUBL) has been identified and its plastidial localization experimentally confirmed in Phaeodactylum tricornutum, Plasmodium falciparum, and Toxoplasma gondii (Sommer et al. 2007; Spork et al. 2009; Fellows et al. 2017). The specific implication of PUBL in SELMA has been demonstrated in a conditional mutant of T. gondii (Fellows et al. 2017) whose phenotype displays a decrease in plastid-located mature proteins and eventually induces the death of the parasite. PUBL proteins are longer than Ub, with extensions on the N-terminal side (comprising the BTS), but also on the C-terminal side, which eliminates the regular Ub C-terminal glycine and seems incompatible with polyubiquitination. A published phylogenetic tree of PUBL proteins, comprising conventional Ub sequences as outgroup, showed with moderate support that PUBL is shared by all photosynthetic CASH (excluding Cryptophyta) and constitutes a monophyletic group, indicating a common ancestry (Fellows et al. 2017). We could reconstruct an equivalent tree (supplementary fig. S4a, Supplementary Material online) using only the conserved positions of Ub and PUBL and with the addition of PUBL sequences of Plasmodium species that were originally excluded by Fellows et al. (2017). In our tree, those proteins are found nested within the Apicomplexa, with a very long basal branch, indicating a strong acceleration of their evolutionary rate. Although Plasmodium PUBL sequences are highly divergent, Alphafold-predicted structures can be successfully aligned with those of Toxoplasma (Jumper et al. 2021; Varadi et al. 2022) (supplementary fig. S4b, Supplementary Material online). 
Contrary to previous conclusions, PUBL-like protein homologs encoded in the nucleomorph genome can be observed, but only for some Cryptophyta species (Chroomonas mesostigmatica, Cryptomonas curvata, and Hemiselmis andersenii). In contrast, Rhodomonas abbreviata, Geminigera cryophila, and Guillardia theta share a nuclear encoded Ub homolog with an N-terminal BTS extension, suggesting that those species could have replaced a nucleomorph-encoded Ub with a nucleus-encoded plastid targeted PUBL. However, probably because of limited phylogenetic signal, our tree does not support an evolutionary relationship between nucleomorph-encoded Ub, nuclear-encoded Ub-like proteins and PUBL, the precise origin of which remains undetermined.
SELMA homologs of Uba1 have been reported on several occasions (Sommer et al. 2007; Spork et al. 2009, 2012; Felsner et al. 2011). An s_uba1 gene can be found in all photosynthetic CASH nuclear genomes and the PPS localization of s_Uba1 was demonstrated experimentally in Plasmodium (Spork et al. 2009). A phylogenetic tree of e_Uba1 and s_Uba1 proteins is available in Felsner et al. (2011) where CASH s_Uba1 sequences form a poorly supported monophyletic group that also comprises the unique red alga Cyanidioschyzon merolae. Our own phylogenetic tree of Uba1 proteins shows a more complex evolutionary history (supplementary fig. S5, Supplementary Material online). First, the tree displays evident signs of an ancient paralogy with differential losses. For instance, Rhodophyta and Glaucophyta share an ortholog that differs from the one found in Viridiplantae and Alveolata. Moreover, Haptophyta and Cryptophyta possess two copies of e_Uba1, one related to the isoform of Viridiplantae/Alveolata and the second shared with all other eukaryotes. We also observe an additional homolog of Uba1 shared by all photosynthetic CASH, except Cryptophyta, and forming a monophyletic group that branches in between the two previously described paralogous groups. Proteins of this clade have putative BTS and are likely SELMA s_Uba1 enzymes. Interestingly, this group also contains orthologous proteins of Chlorarachniophyta species, suggesting the existence of plastid-located ubiquitin using pathways in this phylum. Finally, we observe in Cryptophyta the same topology as for s_Dfm1: a specific duplication with one paralog being e_Uba1 (found also in the plastid-lacking species Goniomonas sp.), and the other one being likely s_Uba1.
The main experimental evidence of the involvement of a specific E2 gene in SELMA was published by Agrawal et al. (2013), who showed that a T. gondii mutant line affected at the TGME49_295990 locus was unable to import proteins into the apicoplast. E2 ubiquitin conjugating enzymes compose a very large family of orthologous genes, each having one or many E3 partners, and acting in different areas of cell metabolism (Michelle et al. 2009). For instance, the genome of S. cerevisiae contains 13 E2 ubc genes; C. elegans encodes 20, and A. thaliana has no fewer than 37 (Jones et al. 2002; Kraft et al. 2005; Finley et al. 2012). Unfortunately, their naming is not coherent between organisms and their classification is misleading (Michelle et al. 2009). Additionally, it is hard to determine if putative SELMA E2 enzymes that were previously detected in the literature using sequence similarity are in fact all related to the experimentally evidenced gene TGME49_295990 (Sommer et al. 2007; Spork et al. 2009, 2012; Hempel et al. 2010). To overcome this issue, we used every S. cerevisiae Ubc protein sequence to search our database for homologs and combined all nonredundant hits to produce a general tree of E2 enzymes. Figure 3a presents this general tree where each isoform group is delimited and named based either on S. cerevisiae numbering (Ubc + N) or using Arabidopsis thaliana E2 genes naming convention (Atg + N) (Kraft et al. 2005). The position of all nucleomorph-encoded Cryptophyta E2-like proteins is depicted; they fall into three families, Atg9-Ubc10/Pex4, Atg6/13-Ubc4/5, and Atg3/Ubc2, which are respectively described as involved in peroxisome biogenesis and functioning; protein degradation/anaphase-promoting complex; and the N-end rule pathway/DNA repair (Finley et al. 2012).
Two other subtrees contain proteins encoded in CASH genomes and which are predicted to be targeted to the plastid: Atg14.1 comprising a clade of Stramenopiles; and Atg7 which contains all E2 proteins previously reported as potentially involved in SELMA, including the product of TGME49_295990. Figure 3b and supplementary fig. S6, Supplementary Material online focus on this Atg7 subgroup and show that photosynthetic CASH E2-like proteins (excluding Cryptophyta) form a monophyletic group closely related to Rhodophyta, suggesting an acquisition by EGT. Additionally, we could again observe a pattern of specific duplication in Cryptophyta, with one paralog having a longer basal branch and possibly being s_Ubc.
Fig. 3.
Phylogenetic analysis of the E2 Ubiquitin conjugating (Ubc) enzymes. The number of amino acid positions in the corresponding sequence alignment, the substitution model and parameters used for IQ-TREE reconstruction as well as the branch length scale are indicated for each tree a) Maximum likelihood phylogenetic tree of the whole Ubc protein family. The tree is arbitrarily rooted. Each isoform is depicted with a different color and named using the convention for S. cerevisiae (UbcN) and A. thaliana (AtgN). The positions of nucleomorph encoded Ubc enzymes are indicated in gray. b) Unrooted condensed maximum likelihood phylogenetic tree focusing on Ubc enzymes of the Atg7 family. Monophyletic phyla are depicted as triangles. The corresponding expanded tree is available in supplementary fig. S6, Supplementary Material online. Ultrafast bootstrap branch support values (1,000 replicates) are indicated except when lower than 70% or as a thick branch when the value is maximum. Clades corresponding to proteins putatively implicated in SELMA are shadowed in gray.
The E3 component of the potential SELMA ubiquitination cascade is more elusive. In ERAD, ubiquitin transfer is performed by e_Hrd1, which also regulates the process of protein extrusion. A SELMA equivalent to Hrd1 has been repeatedly reported to exist only in Cryptophyta (Sommer et al. 2007; Spork et al. 2009; Hempel et al. 2010). We were also unable to detect any orthologs of s_hrd1 in other CASH genomes. Our attempts at reconstructing the phylogeny of Hrd1 proteins recovered nucleomorph-encoded s_Hrd1 nested in the Archaeplastida supergroup but not as a sister group to Rhodophyta (supplementary fig. S7, Supplementary Material online). Our results also invalidate the candidate s_Hrd1 protein of Emiliania huxleyi that was reported by Stork et al. (2012). The absence of s_Hrd1 in most CASH lineages is unexpected and some authors have tried to identify alternatives by screening for proteins having transmembrane domains attached to an E3-ligase module. The ptE3p protein of the diatom P. tricornutum has been proposed as an equivalent to s_Hrd1 in Stramenopiles. This protein has a verified E3-ligase activity and seems to be an actual physical component of the purified SELMA complex of P. tricornutum (Hempel et al. 2010). We gathered homologs of ptE3p and tried to reconstruct their phylogeny but all our trees showed unresolved topologies owing to weak signal (data not shown). Nevertheless, this protein seems to exist only in Stramenopiles. Therefore, if it is indeed involved in SELMA, its analysis would not provide information about the origin of the specific E3 enzyme of the putative CASH plastid common ancestor.
Although the fate of ubiquitinated proteins in ERAD is their destruction, proteins translocated through SELMA should not be degraded. Accordingly, the PPS-located and stromal proteins should be deubiquitinated to function properly. Hempel et al. (2010) queried the genome of P. tricornutum for a putative PPS located deubiquitinating enzyme and proposed one candidate that they named ptDUP. We collected homologs of the ptDUP protein encoded in CASH genomes and reconstructed the corresponding phylogenetic tree (supplementary fig. S8, Supplementary Material online). Proteins similar to this enzyme are present in most eukaryotic genomes, including S. cerevisiae (named Ubp15p and involved in the peroxisomal export machinery [Debelyy et al. 2011]) and A. thaliana (named UBP12 and 13 and involved in several regulation processes [Cui et al. 2013; Lindbäck et al. 2022]). Our tree shows the existence of a monophyletic paralogous group containing ptDUP as well as similar proteins (with predicted BTS) of Haptophyta, Stramenopiles, and Vitrella brassicaformis (Chromerida) but not of Cryptophyta nor Apicomplexa. The closest sister group to this ptDUP clade is not Rhodophyta but Haptophyta (albeit with very low support) which, if confirmed, would be in itself an interesting observation (see Discussion).
Overall, the SELMA-specific ubiquitination pathway seems to have a monophyletic ancestry in Haptophyta, Stramenopiles, and Myzozoa (Alveolata), but the initial origin of each element appears contradictory (fig. 1c). Additionally, most of the corresponding proteins in Cryptophyta seem to have further evolved from ERAD components via ancient phylum-specific duplications that are not related to proteins of the other CASH lineages. We will discuss below how these replacements might affect our analysis.
The SELMA Cdc48-Ufd1-Npl4 Complex Was Likely Acquired from Rhodophyta
Cdc48 (p97 in mammals) belongs to the AAA protein family (ATPases Associated with diverse cellular Activities), whose members consume ATP to perform mechanical actions on macromolecules (Bodnar and Rapoport 2017). More precisely, Cdc48 achieves the physical displacement of ubiquitinated proteins in various cellular processes, with a specificity that depends on the nature of its cofactors (Bays and Hampton 2002). In the case of ERAD, Cdc48 associates with the couple Ufd1 (ubiquitin fusion degradation)/Npl4 (nuclear pore localization). Together, they pull misfolded ubiquitinated proteins out of the ER and transfer them to the proteasome. A single SELMA-putative copy of Cdc48 was initially detected in the nucleomorph genome of Cryptophyta and in the nuclear genome of other CASH species (Sommer et al. 2007). Phylogenetic analyses have shown that those s_Cdc48 proteins cluster in a monophyletic group directly related to Rhodophyta (Agrawal et al. 2009; Felsner et al. 2011; Petersen et al. 2014). A deeper mining of genomic data uncovered a second copy of s_cdc48 in all CASH except in Alveolata. s_Cdc48-1 and s_Cdc48-2 are both targeted to the plastid in P. tricornutum where they physically interact (Spork et al. 2009; Lau et al. 2016). Thus, the Cdc48 complex involved in SELMA is likely a heterohexamer. Our own survey confirms the existence of both isoforms of s_Cdc48 in all photosynthetic CASH except Alveolata, which have only one. Additionally, our phylogenetic analysis demonstrates that s_Cdc48-1 and s_Cdc48-2 are paralogs that derive from an original duplication acquired from Rhodophytes (fig. 4a and supplementary fig. S9, Supplementary Material online). However, the tree topology is not compatible with an early duplication of isoforms 1 and 2 before the diversification of CASH and suggests a more complex history of lineage-specific duplications. Lau et al. (2016) described in P. tricornutum a specific motif (DDDLYS*) present at the C-terminal side of s_Cdc48-1 but absent from s_Cdc48-2. We also observe this motif on many s_Cdc48 proteins, including some encoded in the nucleomorph of Cryptophyta and in Rhodophyta. Given the topology of the tree, the presence of this motif was likely the ancestral state of s_Cdc48. Additionally, there seems to be a correlation between the pattern of presence/absence of the motif and the delimitation of isoform 1 and isoform 2 in our phylogenetic tree (marked with green Ⓜ signs in supplementary fig. S9, Supplementary Material online). However, if the tree topology reflects the true history of the gene, the loss of the motif would then be a convergence.
SELMA-specific copies of Npl4 and Ufd1 also exist in the genomes of photosynthetic CASH species. In Cryptophyta, s_npl4 is found in the nuclear genome, while s_ufd1 is encoded in the nucleomorph genome. Our phylogenetic analyses show that both genes in Cryptophyta derive from Rhodophyta genes via secondary EGT (figs. 1c, 4b and 4c, supplementary figs. S10 and S11, Supplementary Material online). Interestingly, s_Ufd1 in Stramenopiles, Apicomplexa, and Haptophyta originates from a single ancestral protein that does not seem to be related to the nucleomorph-encoded protein of Cryptophyta, but rather to e_Ufd1 (fig. 4c, supplementary fig. S11, Supplementary Material online). The case of s_Npl4 is slightly different: the corresponding gene has a common origin in Stramenopiles and Alveolata, and the group is related to Rhodophyta and Cryptophyta, indicating a possible inheritance by EGT (fig. 4b, supplementary fig. S10, Supplementary Material online). In Haptophyta, s_Npl4 is likely a specific duplication of e_Npl4, as it displays a topology similar to what we observed for several proteins of the ubiquitination pathway in Cryptophyta. 
Finally, we also analyzed two putative, PPS located, additional components of the Cdc48 complex, s_UBX and s_PUB, that might be respectively involved in its interaction with Derlins or in its regulation (Spork et al. 2009; Moog et al. 2011, 2016). s_PUB is composed of a combination of a thioredoxin domain and a PUB domain (supplementary fig. S12a, Supplementary Material online). After a thorough search for similar sequences in our database, we observed that only Stramenopiles and Chromerida (Alveolata) possess a protein with this combination of domains. The closest similar proteins are thioredoxins of the Ybbn family: they present a similar thioredoxin domain but are otherwise completely different in structure. We reconstructed a phylogenetic tree using only the portions that align (supplementary fig. S12b, Supplementary Material online). This tree shows that s_PUB sequences cluster in a monophyletic group that is nested into a group of Ybbn-like proteins of Stramenopiles. Thus, s_PUB is a specific innovation that is only shared by Chromerida and Stramenopiles, and might play a role that does not exist in other CASH. On the other hand, s_UBX is related to a widely distributed member of the UBX-like gene family that comprises Ubx3p of S. cerevisiae and PUX10 of A. thaliana, which are cofactors of Cdc48 and are respectively involved in clathrin-dependent endocytosis and in the dislocation of oleosins from lipid droplets (Farrell et al. 2015; Deruyffelaere et al. 2018). We reconstructed a phylogenetic tree of those Ubx-like proteins (supplementary fig. S13, Supplementary Material online) and observed that s_Ubx proteins are only found in Stramenopiles, forming a monophyletic group nested within Rhodophyta, suggesting a transmission by secondary EGT or HGT. If this protein is indeed involved in SELMA, it would be specific to Stramenopiles and may not be an ancestral member of the transporter that evolved initially in the secondary red plastid.
Fig. 4.
Unrooted condensed maximum likelihood phylogenetic trees of Cdc48 a), Npl4 b), and Ufd1 c) proteins. The corresponding expanded trees are available in supplementary figs. S9, S10, and S11, Supplementary Material online, respectively. Monophyletic phyla are depicted as triangles, ultrafast bootstrap branch support values (1,000 replicates) are indicated except when lower than 70% or as a thick branch when the value is maximum. The number of amino acid positions in the corresponding sequence alignment, the substitution model and parameters used for IQ-TREE reconstruction as well as the branch length scale are indicated at the top right of each tree. Clades corresponding to proteins putatively implicated in SELMA are shadowed in gray.
Accessory Proteins
ERAD is dedicated to the retrotranslocation of misfolded ER proteins. When they reach the cytosol, these proteins are tagged with ubiquitin and sent to the proteasome for degradation. Png1 and its interacting partner Rad23 are suspected to play an important role in the functional link between ERAD and the proteasome (Suzuki et al. 2001). Png1 is the dominant deglycosylating enzyme of the eukaryotic cell and is active in the cytoplasm and the nucleus. Efficient degradation of glycosylated proteins requires a trimming of their glycosidic moieties (Kim et al. 2006). Rad23 was first described as a lesion recognition factor involved in DNA repair. With its ubiquitin-interacting domains, Rad23 also plays the role of cargo for ubiquitinated proteins toward the proteasome (Dantuma et al. 2009). SELMA imported preproteins are not supposed to be degraded, and the need to immediately drive them to a putative PPS-located proteasome appears irrelevant. However, trimming preproteins that might have been accidentally glycosylated in the ER could be required for their proper functioning in the plastid. We gathered orthologs of Png1 and reconstructed a phylogenetic tree of those proteins (supplementary fig. S14, Supplementary Material online). This tree is unresolved, with phyla appearing as nonmonophyletic. However, an outlier clade composed of proteins encoded exclusively in CASH genomes, with some being putatively addressed to the reticulum can be observed. The origin of this paralog, which is likely linked to secondary endosymbioses and could be an accessory protein of SELMA, is unfortunately nontraceable. We also searched for Rad23 homologs in our database. Contrary to Png1, the phylogenetic tree of Rad23 properly assembles all major eukaryotic phyla, but does not contain a derived clade that resembles those we observed for other SELMA genes that evolved conjointly in all CASH (supplementary fig. S15, Supplementary Material online). 
Only Cryptophyta and Haptophyta have a second copy of Rad23, which was acquired from Rhodophyta in the case of Cryptophyta but duplicated from the canonical ortholog in Haptophyta. We could not detect a genuine BTS in any of the investigated s_Png1 and s_Rad23 protein sequences. Our phylogenetic data, alone, do not provide new arguments about the actual involvement of Png1 and Rad23 in the process of plastid protein targeting, which remains to be demonstrated.
The protein chaperone Hsp70 of the Ssa family, together with cochaperone Hsp40 (Ydj1 in Saccharomyces), is a post-ERAD factor that also participates in the degradation of cytoplasmic misfolded proteins. It recognizes misfolded protein domains and ensures their solubility while they are being delivered to the proteasome (Park et al. 2007; Lee et al. 2016). A PPS located Hsp70 protein was first described in studies deciphering the structure of the BTS in Cryptophyta and diatoms (Gould et al. 2006b) and was proposed to be a cofactor of SELMA but with no determined function (Stork et al. 2012). Our phylogenetic tree of Hsp70 shows that all photosynthetic CASH, except Alveolata, have a second copy of Hsp70 (supplementary fig. S16, Supplementary Material online). In Cryptophyta, s_Hsp70 is encoded in the nucleomorph and is related to e_Hsp70 of Rhodophyta. Alternatively, Haptophyta and Stramenopiles share the same s_Hsp70 paralog that appears unrelated to the nucleomorph encoded protein and whose position in the tree does not allow to infer its origin. Finally, using a systematic screening procedure in Toxoplasma, Sheiner et al. (2011) have identified a gene encoding a PPS-located protein that, when disrupted, leads to the blockage of protein import into the plastid and the death of the parasite. This protein, named PPP1 is related to the Sey1 protein of Saccharomyces that is involved in homotypic ER vesicles fusion (Anwar et al. 2012). Cryptophyta nucleomorphs encode three to four sey1-like copies (Zauner et al. 2019), each being related to a specific paralog of Rhodophyta, and three of them (including PPP1) being also present and probably acquired by EGT in CASH genomes (supplementary figs. S17a–c, Supplementary Material online). 
This observation suggests that those proteins probably have an important role in the functioning of the red plastid, and because all paralogs already existed in Rhodophyta before secondary endosymbiosis, they may not be directly involved in SELMA.
Discussion
The route taken by proteins to travel across the periplastidial membrane of complex plastids has long remained a mystery. The discovery of the SELMA transporter was a smart intuition, inspired by the detection of members of an ERAD-like complex encoded in the nucleomorph of Cryptophyta. Much experimental evidence has now proven that a multiprotein complex located in the PPM of all CASH species containing a red-alga derived plastid is responsible for the translocation of preproteins from the CER to the PPS. The requirement of preprotein ubiquitination for their effective translocation is also demonstrated and highlights the existence of ubiquitin-using pathways in secondary plastids that may extend beyond protein import. The composition of SELMA across CASH species and the origin of its components have been the subject of previous studies and are partly explained by endosymbiotic gene transfers and refunctionalization of ERAD genes from secondary red algal endosymbionts. The complexity of SELMA, its intricate evolution from ERAD and its similarity in all CASH has prompted authors to argue, on the principle of parsimony, that it must have had a single evolutionary origin (Zimorski et al. 2014; Gould et al. 2015; Cavalier-Smith 2018). If proven true, the monophyly of SELMA would greatly influence the understanding of the evolution of CASH. In this work, we evaluated the phylogeny of 21 of the demonstrated or putative components of the complex to provide a complete image of its molecular history. Although our analysis cannot provide direct phylogenetic evidence about the origin of CASH plastids, the combined observation of patterns of transmission of SELMA components between lineages is valuable in supporting or falsifying models of plastid evolution that have been proposed over the last decades.
The Composition of the SELMA Complex Is Highly Conserved
Figure 1c summarizes the presence/absence of each SELMA component in CASH genomes, the inferred origin of those components in each lineage, as well as how they relate to one another. As previously reported by Stork et al. (2012), we observe a remarkable level of conservation of the SELMA complex, both in composition and origins, with 13 proteins being present in at least 3 CASH lineages, and a majority being of red algal origin directly or indirectly. First, the two Derlins are always present, even in the more reduced genomes of Apicomplexa parasites, underlining a complementary role of each subunit in the structure or functioning of the pore complex. Secondly, within the inferred PPS-localized ubiquitination machinery, PUBL, s_Uba1 and s_Ubc are conserved, and we could show that the E2 enzyme s_Ubc evolved from the same family of ubc-like genes in all CASH (fig. 3b). The identity of the E3 enzyme is, however, still elusive. Considering the observed conservation pattern and the functional specificity of the E2/E3 couple in ubiquitin-using pathways, it seems likely that, if there is an ancestral SELMA E3 enzyme, it is neither s_Hrd1 nor ptE3p. Several publications have evidenced an endomembrane network in the PPS of Cryptophyta as well as a proteasome (Stork et al. 2012; Cavalier-Smith 2018). Also, we show that Cryptophyta have retained many Ubc homologs encoded in the nucleomorph that are usually involved in protein degradation pathways. In S. cerevisiae, Hrd1 is able to single-handedly export misfolded proteins when overexpressed in vivo or in reconstituted vesicle experiments (Wu and Rapoport 2018). In Cryptophyta, s_Hrd1 might be involved in a PPS-localized protein degradation pathway, and may even be sufficient to process a remnant ERAD-like process. On the other hand, ptE3p only exists in Stramenopiles, where it likely evolved from the duplication of another E3 enzyme. Thus, there might be an alternative SELMA E3 enzyme that we tried to identify. 
We referred to the literature to determine which E3 enzymes interact with ATG7-type Ubc in A. thaliana and Homo sapiens (Kraft et al. 2005; Markson et al. 2009). Unfortunately, none of the known interactors had homologs showing a phylogenetic topology similar to other SELMA components. Besides, the putative deubiquitination of preproteins after their translocation is supported by the presence and common origin of ptDUP in Haptophyta, Stramenopiles, and Alveolata. Its absence in Cryptophyta is surprising and could be due to the existence of an alternative enzyme for this function. Such an enzyme might be shared with other processes utilizing ubiquitin in the plastid that did not persist in other CASH. Concerning the Cdc48 complex, its four main components are conserved in CASH (Cdc48-1, Cdc48-2, Ufd1, and Npl4) and, similarly to Derlins, the preservation of both Cdc48 isoforms (except for Alveolata that lack isoform 2) indicates a possible specialization as experimentally observed by Lau et al. (2016). The two potential additional components of the Cdc48 complex detected by interaction assays (s_PUB and s_Ubx) are, however, restricted to Stramenopiles and Alveolata. Finally, three of the four putative accessory proteins that we analyzed are conserved in at least three CASH lineages. The peptide:N-glycanase s_Png1 is conserved in all photosynthetic CASH, including Chromerida but not Apicomplexa, in which glycosylation is altered (Samuelson and Robbins 2015). s_Png1 has a unique but unresolved origin in CASH, indicating that deglycosylation of preproteins after their translocation is likely an important step. Similarly, the Hsp70 chaperone is conserved, being absent only in Alveolata, and PPP1 is of red algal origin in all CASH. However, the exact function of both proteins in SELMA remains undetermined. Altogether, the composition of the SELMA complex appears very stable. 
Interestingly, we could not detect any of the SELMA components in the peridinin-plastid-containing Dinophyta transcriptomes available at the time of our analysis, underlining that the presence of the complex is bound to the existence of a PPM, and confirming its function as translocator across this membrane.
Plastids of Stramenopiles, Haptophyta, and Alveolata Have Entangled Origins
In this study, we observe that Stramenopiles, Haptophyta, and Alveolata (termed hereafter ASH) have acquired components of the SELMA transporter from mostly four genomic sources (Fig. 1c): Cryptophyta nucleomorph SELMA genes (s_der1, s_cdc48-1), Cryptophyta nuclear SELMA genes (s_ufd1), Rhodophyta nuclear ERAD genes (s_dfm1, s_Ubc, s_npl4, PPP1) and finally ERAD genes of unresolved common origin (s_Hsp70, s_PUB). Out of the 16 SELMA elements successfully analyzed, 11 are shared by ASH, and for 10 of those, phylogenetic trees show that they have acquired an ancestral gene from the same genomic source. This tends to demonstrate that SELMA in ASH is of monophyletic origin, and that their plastids might derive from a common ancestor. We also recover the monophyletic origin of ten SELMA components in Myzozoa (Alveolata) and Ochrophyta (Stramenopiles) and, for six of them, we even observe a nested relationship of Myzozoa within Stramenopiles. Together with the strikingly unique shared domain composition of s_PUB, this strongly suggests that the apicoplast derived from the plastid of Ochrophyta, as previously hypothesized (Ševčíková et al. 2015; Pietluch et al. 2024). We tried to determine if we could further resolve the position of Myzozoa within Stramenopiles by complementing single gene trees that already evidenced their nested relationship (Der1, Dfm1, Cdc48, Ubc, Ufd1, and PPP1). Starting from our final protein alignments, we added a set of SELMA proteins of Ochrophyta species that we gathered using the online blast tool of EukProt v3 (Richter et al. 2022, see Materials and Methods). The alternative phylogenetic trees for these six SELMA proteins are displayed in supplementary figs. S18 to S23, Supplementary Material online. Unfortunately, we could not detect a congruent trend regarding the position of Myzozoa within Ochrophyta. Our results, however, underline the need to revisit the origins of Myzozoa with phylogenomic approaches.
Cryptophyta Might Have Replaced Red-Alga Derived SELMA Genes by Nuclear Paralogs
Cryptophyta possess 14 of the 21 analyzed SELMA components: 6 of them are encoded by nucleomorph genes and 7 by nuclear genes. The case of the 14th component, the plastid specific ubiquitin PUBL, is peculiar, because the corresponding gene is found in the nucleomorph of some species and in the nucleus of others, illustrating the uneven gene content of the nucleomorph across Cryptophyta (Moore et al. 2012). Out of the six exclusively nucleomorph encoded SELMA proteins, five could be traced back to Rhodophyta (Fig. 1c), which is in accordance with the origin of the Cryptophyta plastid. Conversely, out of the seven nucleus encoded SELMA components, only two appear of red algal origin while three seem to have emerged from the duplication and refunctionalization of nuclear-encoded Cryptophyta ERAD genes (s_Dfm1, s_Uba1, and s_Ubc). It is important to note that none of the other CASH lineages appears to have inherited ERAD-derived s_Dfm1, s_Uba1, or s_Ubc genes from Cryptophyta. Indeed, the corresponding trees show that ASH have acquired those three genes by EGT from a common Rhodophyta source (supplementary figs. S3, S5, and S6, Supplementary Material online). There are at least two ways to interpret those phylogenies. On the one hand, host ERAD-derived s_Dfm1, s_Uba1, and s_Ubc represent the initial setup of SELMA in Cryptophyta at the time of secondary endosymbiosis. Thus, other CASH have acquired those genes by a common, independent, secondary endosymbiosis with a red alga, falsifying the possibility that their plastids initially derived from Cryptophyta. This, however, is incompatible with the nucleomorph origins of s_Der1 and s_Cdc48 in all ASH lineages. The alternative interpretation is that s_Dfm1, s_Uba1, and s_Ubc are late replacements in Cryptophyta, and were of red algal origins initially. 
Those original proteins, in congruence with nucleomorph-derived genes, could have branched as sister group to other rhodophyte-derived ASH proteins, supporting a Cryptophyta origin of SELMA, and indirectly, a common serial endosymbiotic origin of ASH plastids. Although it appears plausible that gene replacements in Cryptophyta are hiding parts of the actual evolutionary history of SELMA, it cannot be demonstrated using our data. Interestingly, s_Npl4 in Haptophyta also seems to derive from the late recycling of e_Npl4, indicating that this type of replacement might be frequent. In the next paragraph, we will try to combine congruities and inconsistencies that we can draw from our results to try to evaluate models of CASH complex plastids evolution.
A Putative New Scenario for the Origin of Complex Plastids in CASH
Over the years, several evolutionary models, differing in the number and order of endosymbioses, have been proposed to explain the emergence of CASH lineages and the origin of their plastids. In fig. 5, we depict five of those variants, among which three have been published, respectively by Pietluch et al. (2024, fig. 5a), Bodył (2009, fig. 5b), and Stiller et al. (2014, fig. 5c). We will refer to this figure throughout the remaining discussion to detail how and why some models are compatible with or rejected by our data. As discussed above, our results strongly confirm the monophyly of Ochrophyta plastids and apicoplasts. Moreover, our phylogenetic trees converge toward the conclusion that a majority of SELMA components in ASH are monophyletic, and consequently that their respective plastids likely share a common origin. Those observations contradict the model of Bodył (2009, fig. 5b), in which the apicoplast is derived from the Haptophyta plastid, as well as the model of Pietluch et al. (2024, fig. 5a), in which the Haptophyta plastid is related neither to those of Stramenopiles nor to those of Myzozoa. Figures 5c to 5e are all compatible with a monophyletic origin of Ochrophyta, Myzozoa, and Haptophyta complex plastids, but suggest completely different endosymbiosis sequences. Figure 5d, for instance, proposes that plastids of Cryptophyta and ASH lineages emerged from distinct secondary endosymbioses with red algae. Our results tend to reject this possibility, because two SELMA components (s_Der1 and s_Cdc48-1) in ASH are monophyletic EGT related to the nucleomorph of Cryptophyta (we indicate in Fig. 5 which models are incompatible with the existence of nucleomorph-derived EGT in ASH lineages). Moreover, s_Ufd1 in ASH seems to derive from the refunctionalization of a nuclear gene of Cryptophyta. 
Figures 5c and 5e are very similar: they both propose that Cryptophyta is the only lineage that emerged from a secondary endosymbiosis involving a red alga and later transferred its plastids by higher order endosymbioses. In the model of Stiller et al. (2014, fig. 5c), the host of the tertiary plastid is a stramenopile ancestor of Ochrophyta, which later transferred the plastid again to Haptophyta and Myzozoa by two quaternary endosymbioses. We propose an alternative model (Fig. 5e) in which a tertiary endosymbiosis occurred between Cryptophyta and an ancestor of Haptophyta, followed by the serial transmission of the complex plastid to Ochrophyta and finally to Myzozoa. In almost all of our phylogenetic trees, Haptophyta and Ochrophyta + Myzozoa assemble into sister clades, and we could never observe one clade being nested within the other. Consequently, we cannot determine, based on tree topologies, which of models [c] and [e] is the more likely. There is, however, a faint indication that Haptophyta might have acquired SELMA first. Indeed, the phylogeny of ptDUP (supplementary fig. S8, Supplementary Material online) indicates that this protein is likely derived from the duplication of a Haptophyta nuclear isoform (although with moderate support). We speculate that this duplication marks a transformation of the SELMA complex compared to its state in Cryptophyta (that lacks ptDUP), and that this transformation is linked to the elimination of the nucleomorph after tertiary endosymbiosis. Ancestors of Haptophyta could represent the recipients of the common ancestor of all plastids in ASH, but this proposition needs to be evaluated based on wider phylogenomic data.
Fig. 5.
Hypothetical scenarios of the evolution of complex plastids and of the SELMA complex. Panels a) to e) present schematic phylogenetic trees, each displaying an alternative scenario for the evolution of photosynthetic CASH lineages in the form of sequences of endosymbioses. Panel a) is adapted from Pietluch et al. (2024), panel b) is adapted from Bodył et al. (2009), panel c) is adapted from Stiller et al. (2014), panel d) is presented for discussion purposes, and panel e) is inspired by the present work. Endosymbiosis events are depicted by vertical lines colored depending on the symbiont lineage and labeled with a number describing the level of endosymbiosis. Hypothetical endosymbiotic transfers of nucleomorph genes as well as gains and losses of rpl36 isoforms are indicated. Acquisitions of ptDUP and s_PUB by neofunctionalization are indicated by a plus sign. The unexplained existence of the same genes due to inconsistencies in a given scenario is indicated by a question mark. n-p, nonphotosynthetic; and Nm, nucleomorph.
It is important to mention here that few of the models presented in Fig. 5 can easily account for the peculiar taxonomic distribution of the plastidial rpl36 gene in Haptophyta and Cryptophyta (Rice and Palmer 2006). The two groups share a specific isoform of plastid-encoded rpl36 (c-type) that is derived from bacteria and not related to the isoform found in Archaeplastida and Stramenopiles (p-type). Initially considered a synapomorphy proving that Cryptophyta and Haptophyta share a common ancestor, it has been later challenged by the observation that those phyla are not phylogenetically related (Burki et al. 2012). However, the distribution of rpl36 remains the indicator of a potential relationship between plastids of Haptophyta and Cryptophyta that needs to be considered. The most parsimonious way to explain this distribution is to invoke two separate secondary endosymbioses with red algae possessing one or the other isoform (as in fig. 5a). As mentioned above, our results on SELMA tend to exclude dual secondary endosymbioses, because they are not compatible with the presence of EGT from the nucleomorph in ASH genomes. All other models seem to require at least one secondary endosymbiosis with a red alga that held both isoforms followed by differential transmissions or losses. Whatever the true scenario of complex plastid evolution, the clues provided by different markers, here SELMA genes versus rpl36, are confusing, even contradictory, and it seems impossible to infer an evolutionary scenario that would be the most parsimonious for all markers.
Conclusion
The chromalveolate hypothesis, proposed by Cavalier-Smith in 1997, argued that all photosynthetic lineages having plastids of red-algal origin emerged from a single secondary endosymbiosis. One of the prevailing arguments of that model was that fully integrated endosymbioses must have been rare, because they are complex evolutionary events that require tremendous modifications of the cell structure as well as important reshufflings of genomes and metabolic networks. In the meantime, as our understanding of microbial diversity improved, it has become clear that symbioses, and even intracellular symbioses, are very common in natural ecosystems, and that many of them have remarkably shaped the course of evolution (López-García et al. 2017). Although presented 20 years ago, the first proposals challenging the chromalveolate hypothesis were long neglected before being brought back into the spotlight by the accumulation of molecular data falsifying the monophyly of chromalveolates (Bodył 2005; Sanchez-Puerta et al. 2007; Bodył et al. 2009; Baurain et al. 2010; Petersen et al. 2014; Stiller et al. 2014; Strassert et al. 2021). The idea that each photosynthetic phylum of this group could have been the result of tertiary, quaternary or even more complex serial endosymbioses, passing the same red plastid along the way, is now considered a plausible scenario. Among the many innovations that are required to transform symbionts into fully integrated organelles, the ability to address proteins encoded by nuclear genes back into the symbiont-derived compartment was crucial. Implementing such mechanisms is complex enough to consider that nonrelated endosymbiotic events are unlikely to develop the same solution by chance. Thus, the observation that all Archaeplastida use the same Tic/Toc transporter to translocate proteins into their plastids has been determinant in establishing the monophyly of primary plastids (Steiner et al. 2005). 
In this study, we provide new evidence that SELMA, which is responsible for the transport of proteins across the second outermost membrane of complex red plastids, is homologous in all CASH lineages. Moreover, we pinpoint the relative timing of subtle modifications of the composition of the complex and the horizontal transmission of those innovations between lineages. Our results support the idea of serial endosymbiosis involving the transfer of a unique red secondary plastid, which emerged in Cryptophyta, to all other CASH lineages by higher order endosymbioses. We speculate that ancestors of Haptophyta might have been the first recipient of the tertiary plastid that was later transferred to Stramenopiles and finally to Alveolata (fig. 5e). This chain of events slightly differs from what was proposed by other authors but is not contradictory with the estimated dating of endosymbioses inferred by Strassert et al. (2021). Our proposal is based on the interpretation of a collection of single gene phylogenies, which might be seen as moderately resolutive considering the evolutionary timescale of plastid endosymbioses. These interpretations are, however, constrained and reinforced by the need for conservation of elements that must interact properly to secure the important process of protein targeting. For instance, the evolution and transmission of peculiar proteins like s_PUB are valuable synapomorphies that provide solid inheritance arguments regarding complex plastids. Nonetheless, our evolutionary scenario will need to be corroborated by additional multimarker phylogenomic studies with extensive taxon sampling.
Materials and Methods
Dataset and Phylogenetic Tree Reconstruction
A custom SQL database was used to store and organize a set of about 5.5 million protein sequences predicted from 864 genomes and transcriptomes of species representing the diversity of the three domains of life, with an emphasis on photosynthetic eukaryotes. The complete list of assemblies that were used in this database is available in supplementary table S6, Supplementary Material online. The starting material of all analyses in the present work is the set of accession numbers reported in supplementary table S1, Supplementary Material online in Stork et al. (2012). Protein sequences corresponding to these accession numbers were retrieved and used to query our local custom database using reciprocal BLASTP (Altschul et al. 1990), allowing for a minimum of five rounds of iterative sequence searches with an e-value threshold of 1e−5. All hits for each reference protein were collected in a multiple sequence file and used to produce phylogenetic trees as follows: (i) sequences were aligned using mafft v7.205 with default parameters (Katoh and Toh 2010); and (ii) a custom Python script was used to trim the alignments using relaxed conditions: deletion of columns showing more than 20% gaps and pruning of sequences having a length coverage lower than 10% of the trimmed multiple sequence alignment (MSA). A first set of phylogenetic trees was reconstructed using FastTree v2.1.7 (LG + Gamma + CAT20) (Price et al. 2010). These first trees were manually inspected and pruned to remove duplicate sequences as well as subgroups of similar proteins that were not directly related to the homologous group of interest. When some species of a clade were missing one protein, or when an entire CASH phylum was absent from a tree, we tried to complement our alignments by running tblastn similarity searches on raw assemblies. The selected sequences were re-extracted from the database, realigned, and trimmed under the same conditions, and the manual inspection and elimination of distant sequences were repeated. 
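The two trimming rules described above can be illustrated with a minimal sketch. This is not the authors' script, which is not reproduced in the text: the function name `trim_alignment`, the dictionary representation of the alignment, and the treatment of both `-` and `.` as gap characters are assumptions made for illustration only. The default thresholds simply mirror the stated values (columns with more than 20% gaps removed, sequences covering less than 10% of the trimmed alignment pruned).

```python
from typing import Dict

GAP_CHARS = set("-.")  # assumption: both symbols denote gaps


def trim_alignment(
    msa: Dict[str, str],
    max_gap_fraction: float = 0.20,
    min_coverage: float = 0.10,
) -> Dict[str, str]:
    """Drop gap-rich columns, then drop sequences covering too little of what remains."""
    if not msa:
        return {}
    lengths = {len(seq) for seq in msa.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    n_seqs = len(msa)

    # Keep a column only if its gap fraction does not exceed the threshold.
    kept_columns = [
        col
        for col in range(n_cols)
        if sum(seq[col] in GAP_CHARS for seq in msa.values()) / n_seqs
        <= max_gap_fraction
    ]
    trimmed_len = len(kept_columns)
    if trimmed_len == 0:
        return {}
    trimmed = {
        name: "".join(seq[col] for col in kept_columns)
        for name, seq in msa.items()
    }

    # Keep a sequence only if residues cover enough of the trimmed alignment.
    return {
        name: seq
        for name, seq in trimmed.items()
        if sum(ch not in GAP_CHARS for ch in seq) / trimmed_len >= min_coverage
    }
```

Column filtering is applied before sequence pruning because the coverage criterion is defined relative to the trimmed alignment, not the raw one.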
The final sets of homologous protein sequences that we selected are available as Supplementary Material. For the final tree reconstructions, these sets were realigned and subjected to different trimming strategies (using trimAl [Capella-Gutiérrez et al. 2009]) depending on the size of each analyzed SELMA component, in order to keep enough informative positions and recover phylogenies in which major eukaryotic phyla were retrieved as solid clades with moderate to high bootstrap support values. Supplementary table S2, Supplementary Material online indicates the chosen strategy for each protein set. Final trees were reconstructed using IQ-TREE stable version 1.6.12, with a combination of mixture model parameters detailed in supplementary table S2, Supplementary Material online, as well as 1,000 ultrafast bootstrap replicates (Minh et al. 2013; Nguyen et al. 2014). The only variation between models used for each alignment was the number of estimated classes in mixture models (C10/C20/C40). We decided to keep the parameter that maximized the support for the root branch of all widely accepted eukaryotic phyla or superphyla. To try to refine the branching position of Myzozoa within Stramenopiles, final alignments for proteins Der1, Dfm1, Cdc48, Ubc, Ufd1, and PPP1 were complemented with sequences obtained from the EukProt database version 3 (Richter et al. 2022). For each protein, three SELMA sequences were selected in the Ochrophyta clade and used to query EukProt using the BLASTP tool provided at https://evocellbio.com/eukprot/. BLASTP searches were limited to Stramenopiles, and all hits from the three independent searches were gathered and deduplicated. Those hit sequences were aligned against the corresponding trimmed final alignment using mafft with parameters “--add” and “--keeplength.” A first set of phylogenetic trees was obtained using FastTree and used to exclude all nonorthologous proteins from the set. 
Distant ERAD proteins were eliminated from the alignments to produce smaller datasets focusing on SELMA proteins to reduce calculation times. Final alignments were used to reconstruct phylogenetic trees using IQ-TREE with model LG + C10 + F + G + I. All trees presented in this article were edited using FigTree version 1.4 (Rambaut 2009) and the vector graphics program Inkscape (https://inkscape.org/).
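As an illustration of the EukProt complementation steps, the sketch below pools hits from several independent searches, removes duplicates, and assembles (without executing) the argument lists for the subsequent alignment and tree inference. It rests on several assumptions: the helper names and file names are hypothetical, duplicates are defined by sequence identifier, and the IQ-TREE model string is the notation given in the text with spaces removed. The options shown (`--add` and `--keeplength` for mafft; `-s`, `-m`, and `-bb` for IQ-TREE 1.6) are standard options of those programs, but the exact command lines used in this study are not given in the text.

```python
from typing import Iterable, List, Tuple

Hit = Tuple[str, str]  # (sequence identifier, amino-acid sequence)


def merge_hits(searches: Iterable[Iterable[Hit]]) -> List[Hit]:
    """Pool hits from several searches, keeping the first record per identifier."""
    seen = set()
    merged: List[Hit] = []
    for hits in searches:
        for seq_id, seq in hits:
            if seq_id not in seen:
                seen.add(seq_id)
                merged.append((seq_id, seq))
    return merged


def mafft_add_command(new_fasta: str, reference_alignment: str) -> List[str]:
    """Arguments for adding sequences to an existing alignment of fixed length.

    mafft writes the result to standard output, so the caller must capture it.
    """
    return ["mafft", "--add", new_fasta, "--keeplength", reference_alignment]


def iqtree_command(
    alignment: str, model: str = "LG+C10+F+G+I", replicates: int = 1000
) -> List[str]:
    """Arguments for an IQ-TREE 1.6 run with ultrafast bootstrap replicates."""
    return ["iqtree", "-s", alignment, "-m", model, "-bb", str(replicates)]
```

Returning argument lists rather than shell strings keeps the sketch free of quoting issues; each list could be handed to `subprocess.run` once the corresponding program is installed.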
Prediction of Proteins Subcellular Localization
All nonaligned, untrimmed protein sequences used for producing final phylogenetic trees were submitted online to HECTAR^SEC (https://webtools.sb-roscoff.fr/root?tool_id=abims_hectar; Gschloessl et al. 2008) and locally to ASAFind (Gruber et al. 2015) to determine their putative subcellular localization, either the secretory pathway or the secondary plastid. Many protein sequences proved to be truncated in their N-terminal portion, in particular those generated from transcriptome data, and were therefore ignored by those detection programs.
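Because targeting predictors depend on an intact N-terminus, it can help to flag potentially truncated sequences before submission. The sketch below is only a heuristic and is not described in the text: it assumes that a sequence lacking a leading methionine is likely truncated. A leading methionine does not guarantee a complete N-terminus. The function name is hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_leading_methionine(
    proteins: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (possibly_complete, likely_truncated) sequence identifiers."""
    possibly_complete: List[str] = []
    likely_truncated: List[str] = []
    for seq_id, seq in proteins.items():
        # Heuristic only: a missing start methionine suggests N-terminal truncation.
        if seq.upper().startswith("M"):
            possibly_complete.append(seq_id)
        else:
            likely_truncated.append(seq_id)
    return possibly_complete, likely_truncated
```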
Conserved Domain Searches and AlphaFold Data
Conserved domains of ptDUP and related proteins were determined using the search engine of the Conserved Domain Database (Wang et al. 2023) hosted on the NCBI website (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). The coordinates of the detected domains were exported to manually create the maps presented in supplementary fig. S12, Supplementary Material online using the vector graphics program Inkscape (https://inkscape.org/). AlphaFold 3D model representations of PUBL proteins of Toxoplasma gondii (TGME49_223125, UniProt S8F891) and Plasmodium falciparum (PF3D7_0815700, UniProt C0H4U7) presented in supplementary fig. S4, Supplementary Material online were retrieved from the AlphaFold Protein Structure Database (Varadi et al. 2022). Alignment of those same 3D models was achieved using the online tool iCn3D (Wang et al. 2022).
Contributor Information
Rafael I Ponce-Toledo, Unité D'Ecologie Systématique et Evolution, CNRS, Université Paris-Saclay, AgroParisTech, Gif-sur-Yvette, Essonne 91190, France; Institut de Systématique, Évolution, Biodiversité, Sorbonne Université, CNRS, Museum National D’Histoire Naturelle, EPHE, Université Des Antilles, Paris, Ile de France 75005, France.
David Moreira, Unité D'Ecologie Systématique et Evolution, CNRS, Université Paris-Saclay, AgroParisTech, Gif-sur-Yvette, Essonne 91190, France.
Purificación López-García, Unité D'Ecologie Systématique et Evolution, CNRS, Université Paris-Saclay, AgroParisTech, Gif-sur-Yvette, Essonne 91190, France.
Philippe Deschamps, Unité D'Ecologie Systématique et Evolution, CNRS, Université Paris-Saclay, AgroParisTech, Gif-sur-Yvette, Essonne 91190, France.
Supplementary Material
Supplementary material is available at Molecular Biology and Evolution online.
Author Contributions
P.D. and D.M. designed the study. R.I.P.-T. conducted all phylogenetic experiments under the supervision of P.D. and D.M. All authors contributed to the writing and review of this manuscript. D.M. and P.L.-G. were supported by grants from the European Research Council (ERC Advanced grants 787904 and 101141745, respectively).
Data Availability
Supplementary figs. S1 to S23, Supplementary Material online and supplementary tables S1 to S6, Supplementary Material online are available on the publisher website. The same supplementary files, as well as multiple protein sequence alignments in FASTA format and phylogenetic trees in Newick format, are also available at https://data.deemteam.fr/
References
- Adl SM, Simpson AGB, Lane CE, Lukeš J, Bass D, Bowser SS, Brown MW, Burki F, Dunthorn M, Hampl V, et al. The revised classification of eukaryotes. J Eukaryot Microbiol. 2012:59:1–45. 10.1111/j.1550-7408.2012.00644.x.
- Agrawal S, Chung DWD, Ponts N, van Dooren GG, Prudhomme J, Brooks CF, Rodrigues EM, Tan JC, Ferdig MT, Striepen B, et al. An apicoplast localized ubiquitylation system is required for the import of nuclear-encoded plastid proteins. PLoS Pathog. 2013:9:e1003426. 10.1371/journal.ppat.1003426.
- Agrawal S, van Dooren GG, Beatty WL, Striepen B. Genetic evidence that an endosymbiont-derived endoplasmic reticulum-associated protein degradation (ERAD) system functions in import of apicoplast proteins. J Biol Chem. 2009:284(48):33683–33691. 10.1074/jbc.M109.044024.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990:215(3):403–410. 10.1016/S0022-2836(05)80360-2.
- Anwar K, Klemm RW, Condon A, Severin KN, Zhang M, Ghirlando R, Hu J, Rapoport TA, Prinz WA. The dynamin-like GTPase Sey1p mediates homotypic ER fusion in S. cerevisiae. J Cell Biol. 2012:197(2):209–217. 10.1083/jcb.201111115.
- Archibald JM. The puzzle of plastid evolution. Curr Biol. 2009:19(2):R81–R88. 10.1016/j.cub.2008.11.067.
- Baldridge RD, Rapoport TA. Autoubiquitination of the Hrd1 ligase triggers protein retrotranslocation in ERAD. Cell. 2016:166(2):394–407. 10.1016/j.cell.2016.05.048.
- Baurain D, Brinkmann H, Petersen J, Rodríguez-Ezpeleta N, Stechmann A, Demoulin V, Roger AJ, Burger G, Lang BF, Philippe H. Phylogenomic evidence for separate acquisition of plastids in cryptophytes, haptophytes, and stramenopiles. Mol Biol Evol. 2010:27(7):1698–1709. 10.1093/molbev/msq059.
- Bays NW, Hampton RY. Cdc48-Ufd1-Npl4: stuck in the middle with Ub. Curr Biol. 2002:12(10):R366–R371. 10.1016/S0960-9822(02)00862-X.
- Bhaya D, Grossman A. Targeting proteins to diatom plastids involves transport through an endoplasmic reticulum. Mol Gen Genet. 1991:229(3):400–404. 10.1007/BF00267462.
- Bodnar N, Rapoport T. Toward an understanding of the Cdc48/p97 ATPase. F1000Res. 2017:6:1318. 10.12688/f1000research.11683.1.
- Bodył A. Do plastid-related characters support the chromalveolate hypothesis? J Phycol. 2005:41(3):712–719. 10.1111/j.1529-8817.2005.00091.x.
- Bodył A, Stiller JW, Mackiewicz P. Chromalveolate plastids: direct descent or multiple endosymbioses? Trends Ecol Evol. 2009:24(3):119–121. 10.1016/j.tree.2008.11.003.
- Bolte K, Gruenheit N, Felsner G, Sommer MS, Maier UG, Hempel F. Making new out of old: recycling and modification of an ancient protein translocation system during eukaryotic evolution. Bioessays. 2011:33(5):368–376. 10.1002/bies.201100007.
- Burki F, Okamoto N, Pombert J-F, Keeling PJ. The evolutionary history of haptophytes and cryptophytes: phylogenomic evidence for separate origins. Proc Biol Sci. 2012:279(1736):2246–2254. 10.1098/rspb.2011.2301.
- Burki F, Roger AJ, Brown MW, Simpson AGB. The new tree of eukaryotes. Trends Ecol Evol. 2020:35(1):43–55. 10.1016/j.tree.2019.08.008.
- Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009:25(15):1972–1973. 10.1093/bioinformatics/btp348.
- Cavalier-Smith T. Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastids origins and the eukaryote family tree. J Eukaryot Microbiol. 1999:46(4):347–366. 10.1111/j.1550-7408.1999.tb04614.x.
- Cavalier-Smith T. Kingdom Chromista and its eight phyla: a new synthesis emphasising periplastid protein targeting, cytoskeletal and periplastid evolution, and ancient divergences. Protoplasma. 2018:255(1):297–357. 10.1007/s00709-017-1147-3.
- Claessen JHL, Kundrat L, Ploegh HL. Protein quality control in the ER: balancing the ubiquitin checkbook. Trends Cell Biol. 2012:22(1):22–32. 10.1016/j.tcb.2011.09.010.
- Cui X, Lu F, Li Y, Xue Y, Kang Y, Zhang S, Qiu Q, Cui X, Zheng S, Liu B, et al. Ubiquitin-specific proteases UBP12 and UBP13 act in circadian clock and photoperiodic flowering regulation in Arabidopsis. Plant Physiol. 2013:162(2):897–906. 10.1104/pp.112.213009.
- Dantuma NP, Heinen C, Hoogstraten D. The ubiquitin receptor Rad23: at the crossroads of nucleotide excision repair and proteasomal degradation. DNA Repair (Amst). 2009:8(4):449–460. 10.1016/j.dnarep.2009.01.005.
- Debelyy MO, Platta HW, Saffian D, Hensel A, Thoms S, Meyer HE, Warscheid B, Girzalsky W, Erdmann R. Ubp15p, a ubiquitin hydrolase associated with the peroxisomal export machinery. J Biol Chem. 2011:286(32):28223–28234. 10.1074/jbc.M111.238600.
- Deruyffelaere C, Purkrtova Z, Bouchez I, Collet B, Cacas J-L, Chardot T, Gallois J-L, D’Andrea S. PUX10 is a CDC48A adaptor protein that regulates the extraction of ubiquitinated oleosins from seed lipid droplets in Arabidopsis. Plant Cell. 2018:30(9):2116–2136. 10.1105/tpc.18.00275.
- Farrell KB, Grossman C, Di Pietro SM. New regulators of clathrin-mediated endocytosis identified in Saccharomyces cerevisiae by systematic quantitative fluorescence microscopy. Genetics. 2015:201(3):1061–1070. 10.1534/genetics.115.180729.
- Fellows JD, Cipriano MJ, Agrawal S, Striepen B. A plastid protein that evolved from ubiquitin and is required for apicoplast protein import in Toxoplasma gondii. MBio. 2017:8(3):1–18. 10.1128/mBio.00950-17.
- Felsner G, Sommer MS, Gruenheit N, Hempel F, Moog D, Zauner S, Martin W, Maier UG. ERAD components in organisms with complex red plastids suggest recruitment of a preexisting protein transport pathway for the periplastid membrane. Genome Biol Evol. 2011:3:140–150. 10.1093/gbe/evq074.
- Finley D, Ulrich HD, Sommer T, Kaiser P. The ubiquitin-proteasome system of Saccharomyces cerevisiae. Genetics. 2012:192(2):319–360. 10.1534/genetics.112.140467.
- Gibbs SP. The route of entry of cytoplasmically synthesized proteins into chloroplasts of algae possessing chloroplast ER. J Cell Sci. 1979:35(1):253–266. 10.1242/jcs.35.1.253.
- Gould SB, Maier U-G, Martin WF. Protein import and the origin of red complex plastids. Curr Biol. 2015:25(12):R515–R521. 10.1016/j.cub.2015.04.033.
- Gould SB, Sommer MS, Hadfi K, Zauner S, Kroth PG, Maier U-G. Protein targeting into the complex plastid of cryptophytes. J Mol Evol. 2006a:62(6):674–681. 10.1007/s00239-005-0099-y.
- Gould SB, Sommer MS, Kroth PG, Gile GH, Keeling PJ, Maier U-G. Nucleus-to-nucleus gene transfer and protein retargeting into a remnant cytoplasm of cryptophytes and diatoms. Mol Biol Evol. 2006b:23(12):2413–2422. 10.1093/molbev/msl113.
- Greenwood A. The Cryptophyta in relation to phylogeny and photosynthesis. In: Sanders J, Goodchild D, editors. Eighth international congress of electron microscopy. Canberra: Australian Academy of Sciences; 1974. p. 566–567.
- Grossman A, Manodori A, Snyder D. Light-harvesting proteins of diatoms: their relationship to the chlorophyll a/b binding proteins of higher plants and their mode of transport into plastids. Mol Gen Genet. 1990:224(1):91–100. 10.1007/BF00259455.
- Gruber A, Rocap G, Kroth PG, Armbrust EV, Mock T. Plastid proteome prediction for diatoms and other algae with secondary plastids of the red lineage. Plant J. 2015:81(3):519–528. 10.1111/tpj.12734.
- Gschloessl B, Guermeur Y, Cock JM. HECTAR: a method to predict subcellular targeting in heterokonts. BMC Bioinformatics. 2008:9(1):393. 10.1186/1471-2105-9-393.
- Hempel F, Bullmann L, Lau J, Zauner S, Maier UG. ERAD-derived preprotein transport across the second outermost plastid membrane of diatoms. Mol Biol Evol. 2009:26(8):1781–1790. 10.1093/molbev/msp079.
- Hempel F, Felsner G, Maier UG. New mechanistic insights into pre-protein transport across the second outermost plastid membrane of diatoms. Mol Microbiol. 2010:76(3):793–801. 10.1111/j.1365-2958.2010.07142.x.
- Hibberd DJ, Norris RE. Cytology and ultrastructure of Chlorarachnion reptans (Chlorarachniophyta divisio nova, Chlorarachniophyceae classis nova). J Phycol. 1984:20(2):310–330. 10.1111/j.0022-3646.1984.00310.x.
- Hirakawa Y, Burki F, Keeling PJ. Genome-based reconstruction of the protein import machinery in the secondary plastid of a chlorarachniophyte alga. Eukaryot Cell. 2012:11(3):324–333. 10.1128/EC.05264-11.
- Hitt R, Wolf DH. Der1p, a protein required for degradation of malfolded soluble proteins of the endoplasmic reticulum: topology and Der1-like proteins. FEMS Yeast Res. 2004:4(7):721–729. 10.1016/j.femsyr.2004.02.003.
- Hrdá Š, Fousek J, Szabová J, Hampl V, Vlček Č. The plastid genome of Eutreptiella provides a window into the process of secondary endosymbiosis of plastid in euglenids. PLoS One. 2012:7:e33746. 10.1371/journal.pone.0033746.
- Jones D, Crowe E, Stevens TA, Candido EPM. Functional and phylogenetic analysis of the ubiquitylation system in Caenorhabditis elegans: ubiquitin-conjugating enzymes, ubiquitin-activating enzymes, and ubiquitin-like proteins. Genome Biol. 2002:3(1):RESEARCH0002. 10.1186/gb-2001-3-1-research0002.
- Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021:596(7873):583–589. 10.1038/s41586-021-03819-2.
- Kalanon M, Tonkin CJ, McFadden GI. Characterization of two putative protein translocation components in the apicoplast of Plasmodium falciparum. Eukaryot Cell. 2009:8(8):1146–1154. 10.1128/EC.00061-09.
- Katoh K, Toh H. Parallelization of the MAFFT multiple sequence alignment program. Bioinformatics. 2010:26(15):1899–1900. 10.1093/bioinformatics/btq224.
- Keeling PJ. Chromalveolates and the evolution of plastids by secondary endosymbiosis. J Eukaryot Microbiol. 2009:56(1):1–8. 10.1111/j.1550-7408.2008.00371.x.
- Kim I, Ahn J, Liu C, Tanabe K, Apodaca J, Suzuki T, Rao H. The Png1-Rad23 complex regulates glycoprotein turnover. J Cell Biol. 2006:172(2):211–219. 10.1083/jcb.200507149.
- Knop M, Finger A, Braun T, Hellmuth K, Wolf DH. Der1, a novel protein specifically required for endoplasmic reticulum degradation in yeast. EMBO J. 1996:15(4):753–763. 10.1002/j.1460-2075.1996.tb00411.x.
- Komander D, Rape M. The ubiquitin code. Annu Rev Biochem. 2012:81(1):203–229. 10.1146/annurev-biochem-060310-170328.
- Koski LB, Golding GB. The closest BLAST hit is often not the nearest neighbor. J Mol Evol. 2001:52(6):540–542. 10.1007/s002390010184.
- Kraft E, Stone SL, Ma L, Su N, Gao Y, Lau O-S, Deng X-W, Callis J. Genome analysis and functional characterization of the E2 and RING-type E3 ligase ubiquitination enzymes of Arabidopsis. Plant Physiol. 2005:139(4):1597–1611. 10.1104/pp.105.067983.
- Lau JB, Stork S, Moog D, Schulz J, Maier UG. Protein-protein interactions indicate composition of a 480 kDa SELMA complex in the second outermost membrane of diatom complex plastids. Mol Microbiol. 2016:100(1):76–89. 10.1111/mmi.13302.
- Lau JB, Stork S, Moog D, Sommer MS, Maier UG. N-terminal lysines are essential for protein translocation via a modified ERAD system in complex plastids. Mol Microbiol. 2015:96(3):609–620. 10.1111/mmi.12959.
- Lee DH, Sherman MY, Goldberg AL. The requirements of yeast Hsp70 of SSA family for the ubiquitin-dependent degradation of short-lived and abnormal proteins. Biochem Biophys Res Commun. 2016:475(1):100–106. 10.1016/j.bbrc.2016.05.046.
- Lemus L, Goder V. Regulation of endoplasmic reticulum-associated protein degradation (ERAD) by ubiquitin. Cells. 2014:3(3):824–847. 10.3390/cells3030824.
- Li S, Nosenko T, Hackett JD, Bhattacharya D. Phylogenomic analysis identifies red algal genes of endosymbiotic origin in the chromalveolates. Mol Biol Evol. 2006:23(3):663–674. 10.1093/molbev/msj075.
- Lindbäck LN, Hu Y, Ackermann A, Artz O, Pedmale UV. UBP12 and UBP13 deubiquitinases destabilize the CRY2 blue light receptor to regulate Arabidopsis growth. Curr Biol. 2022:32(15):3221–3231.e6. 10.1016/j.cub.2022.05.046.
- López-García P, Eme L, Moreira D. Symbiosis in eukaryotic evolution. J Theor Biol. 2017:434:20–33. 10.1016/j.jtbi.2017.02.031.
- Maier UG. The four genomes of the alga Pyrenomonas salina (Cryptophyta). Biosystems. 1992:28(1–3):69–73. 10.1016/0303-2647(92)90009-N.
- Maier UG, Douglas SE, Cavalier-Smith T. The nucleomorph genomes of cryptophytes and chlorarachniophytes. Protist. 2000:151(2):103–109. 10.1078/1434-4610-00011.
- Markson G, Kiel C, Hyde R, Brown S, Charalabous P, Bremm A, Semple J, Woodsmith J, Duley S, Salehi-Ashtiani K, et al. Analysis of the human E2 ubiquitin conjugating enzyme protein interaction network. Genome Res. 2009:19(10):1905–1911. 10.1101/gr.093963.109.
- Martin W, Stoebe B, Goremykin V, Hapsmann S, Hasegawa M, Kowallik KV. Gene transfer to the nucleus and the evolution of chloroplasts. Nature. 1998:393(6681):162–165. 10.1038/30234.
- McGrath JP, Jentsch S, Varshavsky A. UBA 1: an essential yeast gene encoding ubiquitin-activating enzyme. EMBO J. 1991:10(1):227–236. 10.1002/j.1460-2075.1991.tb07940.x.
- Meusser B, Hirsch C, Jarosch E, Sommer T. ERAD: the long road to destruction. Nat Cell Biol. 2005:7(8):766–772. 10.1038/ncb0805-766.
- Michelle C, Vourc’h P, Mignon L, Andres CR. What was the set of ubiquitin and ubiquitin-like conjugating enzymes in the eukaryote common ancestor? J Mol Evol. 2009:68(6):616–628. 10.1007/s00239-009-9225-6.
- Minh BQ, Nguyen MAT, Von Haeseler A. Ultrafast approximation for phylogenetic bootstrap. Mol Biol Evol. 2013:30(5):1188–1195. 10.1093/molbev/mst024.
- Moog D, Stork S, Zauner S, Maier UG. In silico and in vivo investigations of proteins of a minimized eukaryotic cytoplasm. Genome Biol Evol. 2011:3:375–382. 10.1093/gbe/evr031.
- Moore CE, Curtis B, Mills T, Tanifuji G, Archibald JM. Nucleomorph genome sequence of the cryptophyte alga Chroomonas mesostigmatica CCMP1168 reveals lineage-specific gene loss and genome complexity. Genome Biol Evol. 2012:4:1162–1175. 10.1093/gbe/evs090.
- Neal S, Jaeger PA, Duttke SH, Benner C, Glass CK, Ideker T, Hampton RY. The Dfm1 derlin is required for ERAD retrotranslocation of integral membrane proteins. Mol Cell. 2018:69(2):306–320.e4. 10.1016/j.molcel.2017.12.012.
- Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2014:32(1):268–274. 10.1093/molbev/msu300.
- Deane JA, Fraunholz M, Su V, Maier U-G, Martin W, Durnford DG, McFadden GI. Evidence for nucleomorph to host nucleus gene transfer: light-harvesting complex proteins from cryptomonads and chlorarachniophytes. Protist. 2000:151(3):239–252. 10.1078/1434-4610-00022.
- Parfrey LW, Lahr DJG, Knoll AH, Katz LA. Estimating the timing of early eukaryotic diversification with multigene molecular clocks. Proc Natl Acad Sci U S A. 2011:108(33):13624–13629. 10.1073/pnas.1110633108.
- Park S-H, Bolender N, Eisele F, Kostova Z, Takeuchi J, Coffino P, Wolf DH. The cytoplasmic Hsp70 chaperone machinery subjects misfolded and endoplasmic reticulum import-incompetent proteins to degradation via the ubiquitin-proteasome system. Mol Biol Cell. 2007:18(1):153–165. 10.1091/mbc.e06-04-0338.
- Petersen J, Ludewig A-K, Michael V, Bunk B, Jarek M, Baurain D, Brinkmann H. Chromera velia, endosymbioses and the rhodoplex hypothesis–plastid evolution in cryptophytes, alveolates, stramenopiles, and haptophytes (CASH lineages). Genome Biol Evol. 2014:6(3):666–684. 10.1093/gbe/evu043.
- Pietluch F, Mackiewicz P, Ludwig K, Gagat P. A new model and dating for the evolution of complex plastids of red alga origin. Genome Biol Evol. 2024:16(9):evae192. 10.1093/gbe/evae192.
- Ponce-Toledo RI, Deschamps P, López-García P, Zivanovic Y, Benzerara K, Moreira D. An early-branching freshwater Cyanobacterium at the origin of plastids. Curr Biol. 2017:27(3):386–391. 10.1016/j.cub.2016.11.056.
- Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One. 2010:5(3):e9490. 10.1371/journal.pone.0009490.
- Rambaut A. FigTree. Tree figure drawing tool. 2009. [accessed 2023 Jul 18]. https://cir.nii.ac.jp/crid/1570009750125811712.
- Rao B, Li S, Yao D, Wang Q, Xia Y, Jia Y, Shen Y, Cao Y. The cryo-EM structure of an ERAD protein channel formed by tetrameric human Derlin-1. Sci Adv. 2021:7:eabe8591. 10.1126/sciadv.abe8591.
- Rice DW, Palmer JD. An exceptional horizontal gene transfer in plastids: gene replacement by a distant bacterial paralog and evidence that haptophyte and cryptophyte plastids are sisters. BMC Biol. 2006:4(1):31. 10.1186/1741-7007-4-31.
- Richter DJ, Berney C, Strassert JFH, Poh Y-P, Herman EK, Muñoz-Gómez SA, Wideman JG, Burki F, de Vargas C. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community J. 2022:2:e56. 10.24072/pcjournal.173.
- Rodríguez-Ezpeleta N, Brinkmann H, Burger G, Burey SC, Lo W, Bohnert HJ, Lang BF, Roure B, Löffelhardt W, Philippe H. Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Curr Biol. 2005:15(14):1325–1330. 10.1016/j.cub.2005.06.040.
- Rogers MB, Archibald JM, Field MA, Li C, Striepen B, Keeling PJ. Plastid-targeting peptides from the chlorarachniophyte Bigelowiella natans. J Eukaryot Microbiol. 2004:51(5):529–535. 10.1111/j.1550-7408.2004.tb00288.x.
- Rogers MB, Gilson PR, Su V, McFadden GI, Keeling PJ. The complete chloroplast genome of the chlorarachniophyte Bigelowiella natans: evidence for independent origins of chlorarachniophyte and euglenid secondary endosymbionts. Mol Biol Evol. 2007:24(1):54–62. 10.1093/molbev/msl129.
- Samuelson J, Robbins PW. Effects of N-glycan precursor length diversity on quality control of protein folding and on protein glycosylation. Semin Cell Dev Biol. 2015:41:121–128. 10.1016/j.semcdb.2014.11.008.
- Sanchez-Puerta MV, Bachvaroff TR, Delwiche CF. Sorting wheat from chaff in multi-gene analyses of chlorophyll c-containing plastids. Mol Phylogenet Evol. 2007:44(2):885–897. 10.1016/j.ympev.2007.03.003.
- Ševčíková T, Horák A, Klimeš V, Zbránková V, Demir-Hilton E, Sudek S, Jenkins J, Schmutz J, Přibyl P, Fousek J, et al. Updating algal evolutionary relationships through plastid genome sequencing: did alveolate plastids emerge through endosymbiosis of an ochrophyte? Sci Rep. 2015:5:10134. 10.1038/srep10134.
- Sheiner L, Demerly JL, Poulsen N, Beatty WL, Lucas O, Behnke MS, White MW, Striepen B. A systematic screen to discover and analyze apicoplast proteins identifies a conserved and essential protein import factor. PLoS Pathog. 2011:7(12):e1002392. 10.1371/journal.ppat.1002392.
- Sjuts I, Soll J, Bölter B. Import of soluble proteins into chloroplasts and potential regulatory mechanisms. Front Plant Sci. 2017:8:1–15. 10.3389/fpls.2017.00168.
- Sommer MS, Gould SB, Lehmann P, Gruber A, Przyborski JM, Maier U-G. Der1-mediated preprotein import into the periplastid compartment of chromalveolates? Mol Biol Evol. 2007:24(4):918–928. 10.1093/molbev/msm008.
- Spork S, Hiss JA, Mandel K, Sommer M, Kooij TWA, Chu T, Schneider G, Maier UG, Przyborski JM. An unusual ERAD-like complex is targeted to the apicoplast of Plasmodium falciparum. Eukaryot Cell. 2009:8(8):1134–1145. 10.1128/EC.00083-09.
- Steiner JM, Yusa F, Pompe JA, Löffelhardt W. Homologous protein import machineries in chloroplasts and cyanelles. Plant J. 2005:44(4):646–652. 10.1111/j.1365-313X.2005.02559.x.
- Stiller JW, Schreiber J, Yue J, Guo H, Ding Q, Huang J. The evolution of photosynthesis in chromist algae through serial endosymbioses. Nat Commun. 2014:5(1):5764. 10.1038/ncomms6764.
- Stork S, Moog D, Przyborski JM, Wilhelmi I, Zauner S, Maier UG. Distribution of the SELMA translocon in secondary plastids of red algal origin and predicted uncoupling of ubiquitin-dependent translocation from degradation. Eukaryot Cell. 2012:11(12):1472–1481. 10.1128/EC.00183-12.
- Strassert JFH, Irisarri I, Williams TA, Burki F. A molecular timescale for eukaryote evolution with implications for the origin of red algal-derived plastids. Nat Commun. 2021:12(1):3574. 10.1038/s41467-021-23847-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sulli C, Fang Z, Muchhal U, Schwartzbach SD. Topology of Euglena chloroplast protein precursors within endoplasmic reticulum to Golgi to chloroplast transport vesicles. J Biol Chem. 1999:274(1):457–463. 10.1074/jbc.274.1.457. [DOI] [PubMed] [Google Scholar]
- Suzuki T, Park H, Kwofie MA, Lennarz WJ. Rad23 provides a link between the Png1 deglycosylating enzyme and the 26 S proteasome in yeast. J Biol Chem. 2001:276(24):21601–21607. 10.1074/jbc.M100826200. [DOI] [PubMed] [Google Scholar]
- Swatek KN, Komander D. Ubiquitin modifications. Cell Res. 2016:26(4):399–422. 10.1038/cr.2016.39.
- Timmis JN, Ayliffe MA, Huang CY, Martin W. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat Rev Genet. 2004:5(2):123–135. 10.1038/nrg1271.
- Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, Yuan D, Stroe O, Wood G, Laydon A, et al. AlphaFold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022:50(D1):D439–D444. 10.1093/nar/gkab1061.
- Waller RF, Reed MB, Cowman AF, McFadden GI. Protein trafficking to the plastid of Plasmodium falciparum is via the secretory pathway. EMBO J. 2000:19(8):1794–1802. 10.1093/emboj/19.8.1794.
- Wang J, Chitsaz F, Derbyshire MK, Gonzales NR, Gwadz M, Lu S, Marchler GH, Song JS, Thanki N, Yamashita RA, et al. The conserved domain database in 2023. Nucleic Acids Res. 2023:51(D1):D384–D388. 10.1093/nar/gkac1096.
- Wang J, Youkharibache P, Marchler-Bauer A, Lanczycki C, Zhang D, Lu S, Madej T, Marchler GH, Cheng T, Chong LC, et al. ICn3D: from web-based 3D viewer to structural analysis tool in batch mode. Front Mol Biosci. 2022:9:831740. 10.3389/fmolb.2022.831740.
- Wetherbee R, Jackson CJ, Repetti SI, Clementson LA, Costa JF, van de Meene A, Crawford S, Verbruggen H. The golden paradox: a new heterokont lineage with chloroplasts surrounded by two membranes. J Phycol. 2019:55(2):257–278. 10.1111/jpy.12822.
- Wu X, Rapoport TA. Mechanistic insights into ER-associated protein degradation. Curr Opin Cell Biol. 2018:53:22–28. 10.1016/j.ceb.2018.04.004.
- Yoon HS, Hackett JD, Pinto G, Bhattacharya D. The single, ancient origin of chromist plastids. Proc Natl Acad Sci U S A. 2002:99(24):15507–15512. 10.1073/pnas.242379899.
- Zauner S, Heimerl T, Moog D, Maier UG. The known, the new, and a possible surprise: a re-evaluation of the nucleomorph-encoded proteome of cryptophytes. Genome Biol Evol. 2019:11(6):1618–1629. 10.1093/gbe/evz109.
- Zimorski V, Ku C, Martin WF, Gould SB. Endosymbiotic theory for organelle origins. Curr Opin Microbiol. 2014:22C:38–48. 10.1016/j.mib.2014.09.008.
Data Availability Statement
Supplementary figs. S1 to S23 and supplementary tables S1 to S6 (Supplementary Material online) are available on the publisher website. The same supplementary files, as well as multiple protein sequence alignments in FASTA format and phylogenetic trees in Newick format, are also available at https://data.deemteam.fr/





