eLife. 2017 Jun 29;6:e25093. doi: 10.7554/eLife.25093

Systematic bacterialization of yeast genes identifies a near-universally swappable pathway

Aashiq H Kachroo 1,*,†, Jon M Laurent 1,†,§, Azat Akhmetov 1, Madelyn Szilagyi-Jones 1, Claire D McWhite 1, Alice Zhao 1, Edward M Marcotte 1,2,*
Editor: Naama Barkai 3
PMCID: PMC5536947  PMID: 28661399

Abstract

Eukaryotes and prokaryotes last shared a common ancestor ~2 billion years ago, and while many present-day genes in these lineages predate this divergence, the extent to which these genes still perform their ancestral functions is largely unknown. To test principles governing retention of ancient function, we asked if prokaryotic genes could replace their essential eukaryotic orthologs. We systematically replaced essential genes in yeast by their 1:1 orthologs from Escherichia coli. After accounting for mitochondrial localization and alternative start codons, 31 out of 51 bacterial genes tested (61%) could complement a lethal growth defect and replace their yeast orthologs with minimal effects on growth rate. Replaceability was determined on a pathway-by-pathway basis; codon usage, abundance, and sequence similarity contributed predictive power. The heme biosynthesis pathway was particularly amenable to inter-kingdom exchange, with each yeast enzyme replaceable by its bacterial, human, or plant ortholog, suggesting it as a near-universally swappable pathway.

DOI: http://dx.doi.org/10.7554/eLife.25093.001

Research Organism: A. thaliana, E. coli, Human, S. cerevisiae

eLife digest

All life on Earth – from bacteria to human beings – can be traced back to a common ancestor that lived over three billion years ago. As a result, modern-day organisms share many essential parts of life’s molecular machinery, such as certain genes and proteins. Yet there are also vital differences that allow scientists to divide almost all living things into one of two groups, known as prokaryotes and eukaryotes. Prokaryotes are all simple, single-celled organisms, such as bacteria; while eukaryotes include more complex organisms, such as plants, animals and fungi.

Scientists have previously found that eukaryotes and prokaryotes have hundreds of genes in common, even though they have evolved separately for over two billion years. As different species evolve, however, their genes mutate and change, potentially affecting the way they work. So, although scientists can recognize equivalent genes between species, they are not sure if they work the same way as they did in the species’ ancient ancestors.

To investigate this, one-by-one Kachroo, Laurent et al. replaced over 50 genes in baker’s yeast (a eukaryote) with their equivalent gene from E. coli bacteria (a prokaryote). If the yeast cells grow healthily after the gene is replaced, it means that that gene works in a similar way in both bacteria and yeast. That, in turn, suggests it is likely that the genes work as they did in the last common ancestor of bacteria and yeast.

The experiments found that most of the tested E. coli genes (61% to be precise) could successfully replace equivalent genes in yeast cells. Moreover, genes often work together in groups, and Kachroo, Laurent et al. found that genes in some groups were more successfully replaced than others. For example, nearly every gene that is important for producing a molecule called heme could be freely swapped from bacteria, plants and humans into yeast. This group of genes has probably worked the same way in different species for billions of years.

Understanding why genes sometimes change how they work is an important question for scientists studying evolution, but this knowledge has other uses. For example, people need heme to, amongst other things, carry oxygen in their blood, and a mutation in a gene in the heme production pathway causes a disease called porphyria. Scientists could replace genes in yeast cells to better model the disease in humans, leading to a better understanding of its causes and more efficient development of new drugs.

DOI: http://dx.doi.org/10.7554/eLife.25093.002

Introduction

Despite over 2 billion years of divergence, eukaryotes and prokaryotes still share hundreds of genes (Theobald, 2010; O'Brien et al., 2005; Brown and Doolittle, 1997; Martin and Müller, 1998). Though these ancient genes are identifiable as orthologs at the sequence level, the preservation of original protein function across such deep timescales has not been systematically explored. The function of certain genes could potentially become frozen in place in the course of evolution, sheltered from lineage-specific functional alterations introduced by mutations, gene fusions, and non-orthologous gene displacements. Such functionally frozen genes would in principle be able to substitute for their least-diverged ortholog in any other species. Searching for such gene replaceability between species thus serves to test a core assumption of the ortholog-function conjecture: that orthologs retain ancestral function (Gabaldón and Koonin, 2013). This conjecture forms the basis of most modern biomedical research and is widely used to predict new gene function across organisms (Lee et al., 2007).

There are many individual examples of genes from one species functioning for their orthologous counterparts in a different species (Cherry et al., 2012; Heinicke et al., 2007), but this trend has only recently begun to be explored systematically, with several large-scale studies substituting human genes for yeast genes and confirming that many human orthologs can successfully replace their yeast counterparts (Kachroo et al., 2015; Sun et al., 2016; Hamza et al., 2015). At the level of evolutionary divergence of yeast and humans, such data demonstrate widespread functional conservation, even after 1 billion years of divergence. The ability of human genes to functionally replace their yeast orthologs is not strongly predicted by sequence similarity; rather, it is determined at the level of specific pathways or processes, wherein all genes in a process or pathway tend to be similarly replaceable, or not (Kachroo et al., 2015).

However, in the timescale of evolution, yeast and humans are relatively similar – both eukaryotes that share thousands of genes and the majority of their core biological processes. Data on eukaryote – prokaryote functional gene replacement are sparse (Heinicke et al., 2007). These cross-domain replacements represent a maximum test of the ability of genes to retain their ancestral function across time. Eukaryotic and bacterial genes have been, for the most part, evolving independently since at least the archaeal ancestor of eukaryotes endosymbiotically acquired its bacterial mitochondrion. In eukaryotes, the function of these genes would have had to survive the development of vastly different genome structures, cell division modalities, cell wall compositions, and subcellular compartmentalizations which occurred during eukaryogenesis. Prokaryotic and eukaryotic orthologs also diverged significantly at the amino acid sequence level (O'Brien et al., 2005) and evolved distinct expression patterns and codon usages (Sharp et al., 1993; Bulmer, 1991). Nonetheless, eukaryotes and bacteria are known to use many of the same orthologs to perform the same metabolic enzymatic reactions (Jardine et al., 2002; Peregrin-Alvarez et al., 2003).

Thus, in order to more systematically determine the replaceability of orthologs across such deep timescales, we asked in this study how many conserved E. coli genes can successfully substitute for their yeast orthologs. We focused on those genes that are essential for viability in yeast, allowing us to assay for the complementation of otherwise lethal growth defects. We analysed many features of the proteins and ortholog pairs to identify which properties best explained replaceability, finding that replaceability was often determined at the level of specific pathways and processes, with all genes in a pathway or process similarly replaceable. Start codon choice and eukaryote-specific subcellular localization were also critical determinants of replaceability. We discovered that certain core biological processes have remained largely unchanged since the last common ancestor of bacteria, yeast, and humans. In particular, heme biosynthesis pathway enzymes appear to be generally exchangeable between prokaryotic and eukaryotic organisms, broadly retaining ancestral functions across the tree of life over 2 billion years of independent evolution, even when accompanied by evolved changes in enzyme subcellular localization.

Results and discussion

Many E. coli genes successfully complement lethal defects in their yeast orthologs

We focused our efforts on the set of genes with 1:1 orthology between E. coli and yeast and that are known to be essential for yeast growth in standard laboratory conditions (Figure 1A). Each E. coli open reading frame (ORF) was cloned into a single-copy yeast centromeric (CEN) plasmid under the transcriptional control of a constitutive GPD promoter. Complementation assays were carried out using two types of conditionally essential yeast alleles, consisting of temperature-sensitive (TS) haploid and heterozygous diploid deletion strains. In the case of the heterozygous diploid deletion strains, the respective yeast gene null allele could be genetically segregated via sporulation, allowing selection for haploid yeast with the null allele (selected for in the presence of the antibiotic G418) or the wild-type yeast gene (in the absence of G418) (Figure 1B, Top panel). In the case of TS haploid yeast strains, the temperature sensitive yeast proteins functioned normally at the permissive temperature (25°C) but could be conditionally inactivated at the non-permissive temperature (36°C) in order to test for gene replaceability (Figure 1B, Bottom panel). Overall, we could perform informative complementation assays for 51 of the 58 orthologs, as shown for the examples in Figure 1B.

Figure 1. Many E. coli genes efficiently complement lethal growth defects in their yeast counterparts.

(A) Yeast and E. coli share hundreds of genes, 58 of which are essential in yeast and have clear 1:1 orthologs between the two species. E. coli genes were cloned into a yeast expression vector under the control of a GPD promoter. 51 of these 58 E. coli genes provided informative assays for replaceability in yeast. Initial results from these complementation assays revealed that 25 of 51 (~49%) E. coli genes could functionally replace their orthologous yeast counterparts. (B) Complementation assays were performed in two different yeast strain backgrounds, as shown for representative assays. In the case of a yeast strain with a temperature-sensitive allele of the yeast gene Sc-cdc8, cells carrying the empty vector control grow at the permissive temperature (25°C, yeast protein active) but not at the restrictive temperature (36°C, yeast protein inactive), whereas cells expressing the E. coli ortholog (Ec-tmK) grow at both temperatures, indicating that the E. coli gene can functionally replace the yeast gene. In the case of a yeast heterozygous diploid deletion strain (Sc-ths1Δ/Sc-THS1), cells are sporulated and haploid progeny grown on selective medium (-Ura -Arg -His -Leu + Can) in the absence (yeast gene present) or presence (yeast gene absent) of G418 (200 μg/ml). Cells expressing the E. coli ortholog (Ec-thrS) grow on G418-containing medium, unlike cells carrying the empty vector control, indicating successful complementation. (C) Haploid yeast gene deletion strains carrying plasmids expressing functionally replacing E. coli genes (red solid-lines) generally exhibit comparable growth rates to the wild type parental yeast strain BY4741 (black dotted-lines). The empty vector control (grey solid-line) showed no such growth rescue in the presence of G418. Mean and standard deviation plotted with N = 3.

DOI: http://dx.doi.org/10.7554/eLife.25093.003

Figure 1—figure supplement 1. Complementation assays performed in a 96-well format in two different yeast strain backgrounds (Supplementary file 1).

(A and B) Magic marker heterozygous diploid deletion yeast strains expressing E. coli genes were sporulated and the sporulation mix was spotted on magic marker agar medium (-Ura -Arg -His -Leu + Can) with (yeast gene absent) or without (yeast gene present) G418 (200 μg/ml). (C) Temperature-sensitive haploid yeast strains expressing E. coli genes were grown at the permissive temperature (25°C) (yeast protein active) and at the restrictive temperature (36°C) (yeast protein inactive) on -Ura agar medium with G418 (200 μg/ml). Yeast cells containing the empty vector were used as a negative control. (D) Haploid yeast gene deletion strains carrying plasmids expressing functionally replacing E. coli genes (red solid-lines) generally exhibit comparable growth rates to the wild type parental yeast strain BY4741 (black dotted-lines) as grown in YPD liquid medium in the presence of G418 (300 μg/ml). Mean and standard deviation plotted with N = 3.

Figure 1—figure supplement 2. Constitutive plasmid expression of yeast genes efficiently replaced the corresponding genomic copies for 6 non-replaceable alleles.

Bacterial orthologs of the yeast genes Sc-RRP3, Sc-PGS1, Sc-SRP54, Sc-PCM1 and Sc-HSP60 did not show functional replacement when expressed from a constitutive GPD promoter. As a control, we expressed the corresponding yeast genes in the same fashion, under the control of the constitutive GPD promoter. All the tested yeast genes functionally complemented the corresponding yeast gene deletions. Yeast cells containing the empty vector were used as a negative control.

Of the 51 E. coli genes tested, 25 successfully complemented lethal growth defects in the corresponding yeast strains (Figure 1—figure supplement 1A,B and C; Supplementary file 1). In nearly all cases, despite plasmid-based expression of the complementing genes, the bacterialized strains grew comparably to the parental, wild type yeast strain, in both synthetic defined medium (SD -Ura + G418) (Figure 1C) and rich medium (YPD + G418) (Figure 1—figure supplement 1D). We further verified complementation specificity by testing for plasmid loss (see Materials and methods and Supplementary file 1) and by sequence-verifying all clones. We have previously demonstrated that plasmid-borne copies of yeast genes complemented their corresponding heterozygous diploid deletion alleles at a high rate (100% for 29 strains tested in Kachroo et al., 2015), but as an additional control, we repeated this test for six yeast strains where the E. coli gene failed to rescue, confirming that the corresponding yeast genes were able to complement the growth defect when expressed on a CEN plasmid under the control of the constitutive GPD promoter (Figure 1—figure supplement 2 and Sc-HEM1 as reported in Figure 4—figure supplement 1).

Mitochondrial localization and start codon choice both affect replaceability

Many eukaryotic orthologs of prokaryotic genes function in specific subcellular compartments absent from prokaryotes, and consistent with this trend, 15 of the 51 tested E. coli genes have mitochondrially localized yeast orthologs (Cherry et al., 2012). Because all but one of these 15 genes were unable to replace their yeast ortholog, we reasoned that a lack of mitochondrial targeting might account for their failed complementation. We added the mitochondrial localization signal (MLS) from the yeast MIP1 gene to each of the 14 non-replaceable E. coli genes and repeated the complementation assays. Four genes could now functionally replace their yeast equivalents (Figure 2A, Figure 2—figure supplement 1), restoring growth rates to be nearly or fully comparable with the parental strain (Figure 2B). We verified mitochondrial localization by fusing the E. coli Ec-MLS-HscB and Ec-MLS-IlvD proteins with enhanced green fluorescent protein (EGFP) and confirming correct trafficking of the EGFP-tagged proteins to yeast mitochondria (Figure 2C).
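
At the sequence level, adding an N-terminal MLS amounts to fusing the signal's coding sequence in frame to the 5' end of the bacterial ORF. The following is a minimal sketch of that step, assuming both sequences are available as plain DNA strings. The function name `fuse_mls` and the toy sequences are hypothetical placeholders; they are not the actual Sc-MIP1 signal, the E. coli ORFs, or the cloning strategy used in this study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_mls(mls_cds: str, orf: str) -> str:
    """Fuse a localization-signal coding sequence in frame, 5' of an ORF.

    Both inputs are DNA strings read 5'->3'. The signal must begin with ATG
    and must not contain an in-frame stop codon, otherwise translation would
    terminate before reaching the downstream ORF.
    """
    mls_cds, orf = mls_cds.upper(), orf.upper()
    if len(mls_cds) % 3 != 0 or len(orf) % 3 != 0:
        raise ValueError("both sequences must be a multiple of 3 nt long")
    if not mls_cds.startswith("ATG"):
        raise ValueError("signal sequence must begin with ATG")
    codons = [mls_cds[i:i + 3] for i in range(0, len(mls_cds), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("signal sequence contains an in-frame stop codon")
    return mls_cds + orf


# Placeholder sequences for illustration only (hypothetical, not real genes).
toy_mls = "ATGACCAAA"
toy_orf = "ATGGGTCGTTAA"
print(fuse_mls(toy_mls, toy_orf))
```

The frame and stop-codon checks are the only design choices here: a signal that shifts the reading frame or terminates early would give a failed complementation that says nothing about the bacterial protein itself.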

Figure 2. The addition of a mitochondrial localization signal (MLS) and mutation of start codons from GTG to ATG allows some E. coli genes to swap for their respective yeast orthologs.

(A) 14 of the 26 non-replaceable E. coli genes were predicted to function in mitochondria in yeast. 4 of 14 were replaceable after adding the MLS at the N-termini of the E. coli genes. Site-specific mutagenesis of the E. coli gene start codon from GTG to ATG allowed two more genes to functionally complement the corresponding yeast genes, bringing the total number of E. coli genes that functionally replace yeast genes to 31 of 51 (~61%). (B) Haploid yeast gene deletion strains carrying mitochondrially localized E. coli genes rescued the growth defect of the yeast gene (red solid-line) comparable to the wild type yeast (black dashed-line). The empty vector control (grey solid-line) and the yeast cells expressing the E. coli gene without MLS (blue solid-line) showed no such growth rescue in the presence of G418. Mean and standard deviation plotted with N = 3. (C) EGFP-tagged E. coli proteins that functionally replaced the yeast gene were imaged after MitoTracker red staining. EGFP-tagged Ec-MLS-HscB and Ec-MLS-IlvD (green) show colocalization with MitoTracker red stained mitochondria (red).

DOI: http://dx.doi.org/10.7554/eLife.25093.006

Figure 2—figure supplement 1. Some E. coli genes require a yeast mitochondrial localization signal to efficiently replace.

The magic marker heterozygous diploid deletion yeast strains carrying empty vector or E. coli gene with or without MLS were sporulated and the sporulation mix was plated on magic marker agar medium (-Ura -Arg -His -Leu + Can) with or without G418 (200 μg/ml). E. coli genes Ec-rpiL, Ec-ilvC, Ec-ilvD and Ec-hscB without an appropriate mitochondrial localization signal cannot complement the corresponding yeast gene deletions Sc-mnp1, Sc-ilv5, Sc-ilv3 and Sc-jac1. However, expression of E. coli genes with yeast MLS efficiently rescued the growth defect of the corresponding yeast gene deletions.

Bacterial genes also occasionally lack a standard ATG start codon, with ~14% of all E. coli ORFs employing an alternative start codon (Blattner et al., 1997). Three of the tested non-replaceable E. coli genes used a GTG start codon while one used ATT. We therefore used site-directed mutagenesis to introduce canonical ATG start codons, then re-assayed for complementation. After changing their start codons to ATG, two of these four E. coli genes, Ec-rcsC and Ec-tadA, could now replace their yeast orthologs (Figure 2B).
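
The start-codon correction described above is a single-codon substitution at the 5' end of the ORF. A minimal sketch, assuming the ORF is available as a plain DNA string; the function name `canonicalize_start_codon` and the toy sequence are hypothetical and are not taken from the study, which performed this change by site-directed mutagenesis.

```python
def canonicalize_start_codon(orf: str) -> str:
    """Return the ORF with its first codon replaced by ATG.

    Assumes `orf` is a DNA coding sequence read 5'->3' whose length is a
    positive multiple of three. All downstream codons are left unchanged.
    """
    orf = orf.upper()
    if len(orf) < 3 or len(orf) % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return "ATG" + orf[3:]


# Toy sequence for illustration only (not a real E. coli gene).
toy_orf = "GTGAAACGTTAA"
print(canonicalize_start_codon(toy_orf))
```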

Overall, after accounting for mitochondrial localization and alternative start codons and combining results from all assays, a total of 31 out of 51 tested E. coli genes could successfully replace their essential yeast orthologs (Figure 2). Thus, in a majority (61%) of our tests, both the present-day prokaryotic and eukaryotic proteins must have retained their critical ancestral functions such that the prokaryotic proteins could carry out the essential roles of their eukaryotic orthologs well enough to support yeast cell growth. In roughly one-fifth of these replaceable cases (6 of 31), replaceability depended on proper subcellular localization or start codon choice to express the prokaryotic gene in the proper eukaryotic context.
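
The tally behind these percentages can be reproduced from the counts reported above. This is a small arithmetic sketch and the variable names are ours: 25 genes complemented as cloned, 4 more after adding the MLS, and 2 more after the start-codon change, out of 51 tested.

```python
tested = 51
initial = 25          # complemented as originally cloned
rescued_by_mls = 4    # complemented only after adding the yeast MLS
rescued_by_atg = 2    # complemented only after changing the start codon to ATG

total = initial + rescued_by_mls + rescued_by_atg
pct_initial = 100 * initial / tested
pct_total = 100 * total / tested
# Share of the successful replacements that needed one of the two fixes.
pct_conditional = 100 * (rescued_by_mls + rescued_by_atg) / total

print(total, round(pct_initial), round(pct_total), round(pct_conditional))
```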

Replaceability varies strongly across different biological processes

Given that we observed both replaceable and non-replaceable genes, we sought to determine properties of the tested genes that best explained successful replacements. We considered 22 features of the tested genes, including protein lengths, interactions, sequence similarities, codon usages, and expression levels. We calculated the predictive utility of each feature as the area under a Receiver Operating Characteristic curve (AUC) (Figure 3A; Supplementary file 2). Notably, the extent of protein sequence similarity between orthologs was not a highly predictive feature. A large portion of the tested E. coli and yeast orthologs showed only 20–30% identical amino acid sequences and roughly half of these genes were replaceable; in contrast, the three most divergent orthologs, each showing less than 20% identity, all replaced (Figure 3B). As we observed a non-monotonic relationship between sequence identity and replaceability, potentially explained by replaceability differences among different functional categories of genes, we tested for the enrichment of particular GO Biological Process categories (defined by Gene Ontology Slim annotations; Ashburner et al., 2000) or KEGG categories (Kanehisa and Goto, 2000) within the individual bins of sequence identity in Figure 3B. Aside from an enrichment in glucose metabolism genes (3 of the 7) in the 40–50% identity range, we did not find evidence for strong pathway-specific biases that would explain the observed relationship between sequence identity and replaceability. We did observe moderate predictive power for some measures of codon bias, especially those related to codon optimality within E. coli, and less so for codon optimality within a yeast context; more highly optimized E. coli codon usage correlated with a lower replaceability rate.
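
As a rough illustration of how a single feature can be scored by AUC, the sketch below uses the rank-based identity that the AUC equals the probability that a randomly chosen replaceable gene has a higher feature value than a randomly chosen non-replaceable gene. The feature values and outcomes are made-up toy numbers, not data from this study, and the function is our own illustration rather than the analysis code used here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    AUC is the fraction of (positive, negative) pairs in which the positive
    example has the higher score, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical feature values (e.g. percent identity) and outcomes
# (True = replaceable). Toy numbers for illustration only.
identity = [18, 22, 25, 27, 31, 44, 52]
replaced = [True, False, True, False, True, True, False]
print(round(auc(identity, replaced), 3))
```

An AUC near 0.5 means the feature carries little information about replaceability, while values well below 0.5 indicate that higher feature values go with a lower chance of replacing.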

Figure 3. Replaceability of E. coli genes is a modular phenomenon.

(A) Several quantitative properties of the tested genes were assessed for their ability to predict replaceability, measured as the area under a receiver operating characteristic curve (AUC). Having a high fraction of interaction partners that replace was the most predictive property tested, suggesting that the ability to replace is a modular phenomenon whereby genes functioning together are similarly able to replace. A Random Forest classifier constructed with all attributes boosted the maximum AUC to 0.79. (B) As shown in (A), sequence similarity was not the most predictive feature. The fraction of replaceable genes in given ranges of similarity was variable, with the vast majority of orthologs being 20–30% identical, a range in which roughly half of proteins replaced. (C) Mapping of replaceability status onto yeast GO slim annotations revealed that GO categories have varying rates of replaceability, with core metabolic processes (e.g. energy metabolism, nucleobase metabolism) being largely replaceable while more specialized processes (e.g. protein assembly, membrane transport) were less so.

DOI: http://dx.doi.org/10.7554/eLife.25093.008

Instead, the strongest predictive features related to specific pathways and processes, much as we and others have observed for successful humanization of yeast (Kachroo et al., 2015; Sun et al., 2016; Hamza et al., 2015). This trend was most evident in the observation that a gene was more likely to replace (or not) if it had a higher fraction of interaction partners that also replaced (or not). Consequently, different biological processes (as defined by GO) displayed varied replaceability, with metabolic processes being largely replaceable, while processes known to be divergent, including ribosomal processing, were much less replaceable (Figure 3C). This trend suggests an explanation for why optimized E. coli codons predicted worse replaceability, as E. coli genes with optimized codons predominantly tend to be highly expressed ribosomal and translational proteins (Saikia et al., 2016). This is thus consistent with the notion that replaceability is determined at the level of the pathway or process, with codon choice and gene expression levels reflecting functional constraints of that process. Combining all of these features into a single predictor (after accounting for mitochondrial localization and alternative start codons), using a random forest classifier, improved our predictive power to a 0.79 AUC (Figure 3A), demonstrating that the features we investigated provide moderately orthogonal predictive information.
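
The interaction-partner feature can be sketched as follows, assuming an interaction network stored as a mapping from each gene to its partners. The text does not specify which network was used or how unassayed partners were handled, so ignoring unassayed partners is an assumption of this sketch, and the gene names and outcomes are hypothetical placeholders.

```python
def fraction_partners_replaced(gene, partners, replaceable):
    """Fraction of a gene's assayed interaction partners that replaced.

    partners:    dict mapping gene -> set of interacting genes
    replaceable: dict mapping assayed gene -> bool (True = replaced)
    Returns None when none of the gene's partners was assayed.
    """
    assayed = [p for p in partners.get(gene, ())
               if p in replaceable and p != gene]
    if not assayed:
        return None
    return sum(replaceable[p] for p in assayed) / len(assayed)


# Hypothetical toy network and outcomes (placeholder names, not study data).
partners = {
    "geneA": {"geneB", "geneC", "geneD"},
    "geneB": {"geneA"},
    "geneE": {"geneX"},
}
replaceable = {"geneA": True, "geneB": True, "geneC": True, "geneD": False}

for g in ("geneA", "geneB", "geneE"):
    print(g, fraction_partners_replaced(g, partners, replaceable))
```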

Each yeast heme biosynthesis enzyme can be replaced by its E. coli equivalent, irrespective of orthology or localization

Nearly all the genes that we tested from the heme biosynthesis pathway were replaceable by their E. coli orthologs, which in combination with the evidence that replaceability was determined at the level of processes, led us to investigate the heme pathway in more depth. Most of the enzymatic reactions in the heme biosynthesis pathway are identical between E. coli and yeast, but there are clear differences in the way this pathway functions between the species (Heinemann et al., 2008). First, heme biosynthesis pathway precursors differ: yeast condenses succinyl-CoA and glycine to produce delta-aminolevulinate in a single enzymatic step catalyzed by Sc-Hem1, while E. coli produces delta-aminolevulinate in two steps using glutamyl-tRNA as a precursor (Yin and Bauer, 2013). Second, the bacterial heme pathway is largely cytosolic but in yeast it is partitioned between the mitochondria and cytosol (Figure 4A). We thus next considered these two key pathway differences in more detail. As a control, we first expressed the corresponding yeast genes on plasmids either under the control of constitutive GPD or the native yeast promoter (Ho et al., 2009) to test the effect of constitutive expression on functional replaceability. Except for Sc-HEM4, which showed toxicity when expressed constitutively, all the other yeast genes showed functional replaceability irrespective of the mode of expression (Figure 4—figure supplement 1).

Figure 4. Bacterialization of yeast heme biosynthesis pathway genes at their native loci.

(A) A schematic of the yeast heme pathway shows the beginning of the pathway in mitochondria using succinyl-CoA and glycine as precursors. The subsequent enzymatic reactions are cytosolic up until the penultimate and ultimate reactions which are mitochondrial. (B) Growth kinetics of CRISPR-Cas9 engineered yeast heme pathway genes replaced with the corresponding bacterial genes at their native yeast loci show efficient replaceability in both BY4741 (red solid-line) and BY4742 (blue solid-line) yeast strains. The wild type BY4741 growth curve is shown as a comparison (black dotted-line). Mean and standard deviation plotted with N = 3.

DOI: http://dx.doi.org/10.7554/eLife.25093.009

Figure 4—figure supplement 1. Constitutive or native plasmid-based expression of the yeast heme biosynthesis genes generally efficiently complemented growth defects in the corresponding yeast gene deletion strains.

Heterologous expression of yeast genes Sc-HEM1, Sc-HEM2, Sc-HEM3, Sc-HEM4, Sc-HEM12, Sc-HEM13, Sc-HEM14 and Sc-HEM15 under the control of the constitutive GPD promoter or the native promoter efficiently rescued the growth defects of the corresponding yeast gene deletions, except in the case of Sc-HEM4. Sc-HEM4, when expressed constitutively, resulted in toxicity in the presence of the yeast gene at the native locus and did not complement the function in the absence of the yeast gene. This toxicity was relieved when the yeast gene was expressed under the control of the native yeast promoter.

Figure 4—figure supplement 2. Ec-hemA and Ec-hemL carry out the initial reaction in E. coli heme biosynthesis and are both required to complement Sc-HEM1 deletion in yeast, and non-orthologous yeast genes are replaced by E. coli genes that carry out the identical reaction.

(A) Expression of either E. coli heme pathway gene, Ec-hemA or Ec-hemL, individually cannot complement the lethal growth defect caused by deletion of the Sc-HEM1 gene in yeast. Co-expression of Ec-hemA and Ec-hemL efficiently rescued the growth defect of the Sc-hem1 gene deletion in yeast. (B) Growth curves of yeast strains with deletions of Sc-hem4 and Sc-hem14 genes (grey solid-line) show functional replaceability (red solid-line) by the non-orthologous E. coli genes Ec-hemD and Ec-hemG that carry out identical enzymatic reactions to the corresponding yeast genes. The wild type BY4741 growth curve is shown as a comparison (black dotted-line). The empty vector control (grey solid-line) showed no such growth rescue in the presence of G418. (C) Growth curve of engineered yeast strain Sc-hem14Δ::Ec-hemG; Sc-hem15Δ::Ec-hemH in YPD medium harboring E. coli genes at the native yeast loci. The strain displayed a growth defect (red solid-line) compared to the wild type BY4741 strain (black dotted-line). Mean and standard deviation plotted with N = 3.

Figure 4—figure supplement 3. The penultimate and ultimate heme pathway enzymes in yeast are replaceable by their bacterial orthologs, in spite of mis-localizing to the plasma membrane.

EGFP-tagged Ec-HemG and Ec-HemH localize to the plasma membrane in yeast. The EGFP-tagged proteins do not localize to the mitochondria, since no clear co-localization is observed with the MitoTracker red stain. Expression of EGFP-tagged Ec-HemG and Ec-HemH (red solid-line) efficiently rescues the growth defects of the respective yeast gene deletions (Sc-hem14 and Sc-hem15) (pink dotted-line), comparable to the wild type yeast (black dotted-line). The empty vector control is incapable of rescuing the growth defect of the deletion strains (grey dotted-line).

Figure 4—figure supplement 4. Confirmation of CRISPR-Cas9 mediated bacterialized yeast strains.

(A) Schematics of the yeast heme pathway gene loci carrying functionally replaceable E. coli genes while retaining their native promoters and terminators. The arrows indicate the primers used to confirm the replacement (refer to Supplementary file 3). (B) A PCR product of the expected size was obtained for each bacterialized yeast strain.

In our initial screen, the E. coli ortholog of Sc-HEM1, Ec-kbL, failed to replace the yeast gene, an observation consistent with prior data showing that Ec-kbL does not take part in E. coli heme biosynthesis, but rather carries out an unrelated but mechanistically-similar oxido-reductase reaction involved in L-threonine degradation (UniProt Consortium, 2015; Mukherjee and Dekker, 1990). Instead, a two-step enzymatic reaction by E. coli proteins Ec-HemA and Ec-HemL produces the heme precursor, delta-aminolevulinate (Schauer et al., 2002; Ilag and Jahn, 1992). Since the initial steps of the pathway are localized to the mitochondria, we added the Sc-MIP1 MLS to the 5’ ends of these genes and expressed them simultaneously in the Sc-HEM1 heterozygous diploid deletion strain. Co-expression of the two E. coli genes successfully replaced yeast gene function (Figure 4—figure supplement 2A). Additionally, two enzymes, Ec-HemD and Ec-HemG, were not identified as orthologs between E. coli and yeast, despite carrying out identical reactions to Sc-Hem4 and Sc-Hem14, respectively. Expression of these non-orthologous but functionally analogous E. coli genes in the respective yeast deletion strains showed that they were indeed able to successfully replace the yeast genes (Figure 4—figure supplement 2B). For these enzymes, the key determinants for successful replacement are thus their enzymatic reactions, rather than any other aspects of the genes.

Sc-Hem14 and Sc-Hem15 carry out the final two steps in yeast heme biosynthesis and are localized to the mitochondria (Cherry et al., 2012; Koh et al., 2015) (Figure 4A). Both genes were replaceable by the E. coli genes carrying out the analogous reactions, Ec-HemG (Figure 4—figure supplement 2B) and Ec-HemH (Figure 1C), despite the lack of targeting sequences for mitochondrial localization. As E. coli lack mitochondria, and Ec-HemG and Ec-HemH are both predicted to localize to the plasma membrane in E. coli (Papanastasiou et al., 2013), we assayed their localization in yeast when expressed as EGFP-fusion proteins. Strikingly, both localized to the yeast plasma membrane (Figure 4—figure supplement 3). In spite of failing to localize to the yeast mitochondria, the bacterialized strains grew well compared to wild type yeast (Figure 4—figure supplement 3), suggesting that mitochondrial localization is not an absolute requirement for their functions, as many heme pathway intermediates are cytosolic. However, concurrent bacterialization of both yeast genes resulted in a viable but defective yeast strain (Figure 4—figure supplement 2C), suggesting that the fitness cost of mis-localizing both proteins is not tolerated well, potentially due to cumulative effects of reduced efficiency of the bacterial proteins, altered allosteric regulation in yeast, or the accumulation of heme precursors in the wrong compartment (cytosol) (Yin and Bauer, 2013).

Because heterologous expression using a constitutive promoter could be compensating for more subtle functional differences, we also wished to measure complementation after placing the bacterial orthologs under control of the native yeast gene regulation. We thus used CRISPR/Cas9-based precision genome engineering to genomically replace each of the heme biosynthesis pathway genes in turn in yeast (except Sc-HEM12) with its respective E. coli counterpart, from start to stop codon, while retaining the native promoters, terminators, and chromosomal context of the yeast genes (Figure 4B, Figure 4—figure supplement 4). All strains but two grew comparably to the wild-type; the Sc-hem14∆::Ec-hemG and Sc-hem15∆::Ec-hemH strains showed modest growth defects (Figure 4B). Because these two yeast proteins are known to be mitochondrially localized (Cherry et al., 2012), we re-engineered each of the Ec-hemG and Ec-hemH ORFs into the yeast chromosome such that each gene’s native yeast MLS was retained (Sc-hem14∆::Ec-MLS-hemG and Sc-hem15∆::Ec-MLS-hemH). The addition of the yeast MLS to each E. coli ORF completely ameliorated growth defects from the ORFs alone (Figure 4B).

Thus, the yeast heme biosynthesis pathway appears entirely replaceable, one gene at a time, by the corresponding bacterial genes, whether expressed constitutively from plasmids or directly integrated into chromosomes under native yeast transcriptional regulation. The extent of replaceability strongly suggests that ancestral functions in these genes (with the obvious exception of the non-orthologous steps) have remained intact and unaltered, at least in terms of critical enzymatic functionality. Mitochondrial localization of several of the enzymes, while needed to fully recover growth rates, is not essential for viability.

Bacterialization with the E. coli ferrochelatase induces a yeast phenotype resembling human porphyria

Ec-hemH and Sc-HEM15 encode ferrochelatase, the enzyme responsible for adding iron to the porphyrin ring of protoporphyrin IX to produce protoheme (Figure 4A). In the course of constructing the CRISPR-edited yeast strains, we noticed that the Sc-hem15∆::Ec-hemH yeast strain turned pink on a standard YPD agar medium upon prolonged incubation of 3–4 days (Figure 5A). This phenotype was consistent across all independently obtained, sequence verified yeast clones. The pink phenotype decreased dramatically in the Sc-hem15∆::Ec-MLS-hemH strains in which Ec-HemH was correctly localized to the mitochondria by addition of an MLS.

Figure 5. Mislocalization of the bacterialized ferrochelatase enzyme identifies a porphyria-like phenotype in yeast.

(A) Bacterialization of the ultimate yeast gene in the heme biosynthesis pathway results in a distinct pink colony phenotype on YPD agar medium. In contrast, wild type BY4741 strain colonies appear as creamy-white. (B) Acetate-extracted secreted products from the pink Sc-hem15Δ::Ec-hemH strains show strongly enhanced fluorescence at 635 nm (excitation 399 nm), comparable to a protoporphyrin IX standard and unlike a heme standard or extracts from the parental BY4741 strain. The introduction of an MLS to the bacterialized yeast strain (Sc-hem15Δ::Ec-MLS-hemH) significantly reduced protoporphyrin IX secretion, while deletion of the MLS from the native yeast locus in strain Sc-ΔMLS-HEM15 caused several strains to increase protoporphyrin IX secretion.

DOI: http://dx.doi.org/10.7554/eLife.25093.014

Figure 5—figure supplement 1. Absorbance (top) and emission (bottom) spectra of extracts obtained from acetate (left) and pyridine (right) extraction of the wild type or bacterialized yeast colonies grown on YPD medium.

Purified protoporphyrin IX (red solid-line) or heme (yellow solid-line) were used as standards. Extract from the bacterialized Sc-hem15Δ::Ec-hemH yeast strain (dark blue-line) matched that of the protoporphyrin IX standard. The bacterialized Sc-hem15Δ::Ec-MLS-hemH yeast strain (orange solid-line) showed a significantly reduced peak for protoporphyrin IX. Extracts from wild type BY4741 (black-line) and BY4742 (light blue solid-line) were used as controls.

Figure 5—figure supplement 2. Deletion of protoporphyrinogen oxidase, Sc-HEM14, in the Sc-hem15Δ::Ec-hemH strain suppressed the porphyria-like pink phenotype.

Top row, from left: growth spots of the BY4741 wild type, Sc-hem15Δ::Ec-hemH and Sc-hem15Δ::Ec-MLS-hemH yeast strains. Bottom row, from left: the corresponding strains harboring the Sc-hem14 deletion.

We speculated that the pink phenotype was likely due to aberrant accumulation of porphyrin intermediates, presumably leading to their secretion, as we observed that the pigment could be washed off the cells. Therefore, we chemically extracted the pink pigment from Sc-hem15∆::Ec-hemH, Sc-hem15∆::Ec-MLS-hemH and wild type yeast cells (Materials and methods) and performed fluorescence spectroscopy to determine that the pigment likely corresponds to protoporphyrin IX (Figure 5B, Figure 5—figure supplement 1).

To determine whether protein mis-localization contributed to the phenotype, we removed the MLS from the native yeast gene. Several clones of the Sc-ΔMLS-HEM15 yeast strain displayed similar extracellular pigment (Figure 5B, Figure 5—figure supplement 1). These results suggest that mislocalized plasma membrane-bound Ec-HemH in yeast does not convert protoporphyrin IX to protoheme efficiently, resulting in the accumulation and secretion of protoporphyrin IX. We further tested this line of reasoning by deleting the gene for the preceding step in the pathway, Sc-HEM14, which encodes the enzyme protoporphyrinogen oxidase and is responsible for making protoporphyrin IX. Using CRISPR, we deleted the Sc-HEM14 ORF in wild type BY4741, Sc-hem15Δ::Ec-hemH, and Sc-hem15Δ::Ec-MLS-hemH strains. Consistent with protoporphyrin IX being the pink pigment in the Sc-hem15Δ::Ec-hemH strain, the Sc-hem15Δ::Ec-hemH hem14Δ strain lost the pink phenotype, even after growing for 6 days. Moreover, we observed that all strains carrying the hem14Δ allele were in fact significantly paler than even wild type BY4741 cells, presumably reflecting extensive protoporphyrin IX depletion in these cells (Figure 5—figure supplement 2).

In humans, disrupting heme biosynthesis leads to the disease porphyria, and the secretion of porphyrin intermediates is specifically observed in a subtype known as protoporphyria (Bloomer et al., 1998), wherein reduced activity of the human heme pathway protein Hs-FECH leads to accumulation and subsequent secretion of protoporphyrin IX into surrounding tissues. Our data suggest that yeast protein localization and protoporphyrin secretion phenotypes might in the future be exploited to investigate disease-causing mutations in human Hs-FECH, even in cases where disease variants do not show any discernible growth defect in yeast.

Most yeast heme biosynthesis enzymes can also be successfully plant-ized

The data above show that genes in the yeast heme biosynthesis pathway can be replaced by their bacterial counterparts, extending earlier studies demonstrating that some heme biosynthesis genes can also be humanized (Kachroo et al., 2015; Sun et al., 2016; Schauer and Mattoon, 1990). Given the ancient conservation of this pathway, we sought to further expand our investigation of its replaceability by swapping the corresponding genes from the plant Arabidopsis thaliana into yeast. In plants, heme biosynthesis enzymes form precursors for chlorophyll, and the pathway is largely chloroplast-localized, in contrast to compartmentalization of the heme biosynthetic pathway between the mitochondria and cytosol in many other eukaryotes (UniProt Consortium, 2015; Ashburner et al., 2000; Mochizuki et al., 2010). Nonetheless, the fact that Arabidopsis ferrochelatase was cloned by complementing a mutant yeast phenotype suggests that other heme pathway genes might also successfully replace the yeast genes (Smith et al., 1994).

The first enzymatic step in the plant heme biosynthetic pathway is similar to that in bacteria, a two-step reaction using glutamyl-tRNA as a substrate (Figure 6A and B) (Ilag et al., 1994). We expressed both plant genes, At-HEMA1 and At-GSA2, simultaneously and were able to functionally replace the corresponding yeast gene function. Neither protein, when individually expressed, could functionally replace the yeast gene (Figure 6—figure supplement 1A).

Figure 6. Yeast heme biosynthesis pathway enzymes can be successfully replaced by orthologs or analogs from bacteria, plants, and humans, in spite of alterations to subcellular localization.

Enzymatic steps of extant bacterial and eukaryotic heme biosynthesis pathways are identical save for the starting metabolites and conversion to delta-aminolevulinate; bacteria also exhibit non-orthologous gene displacement of several enzymes. Heme biosynthesis occurs in the cytoplasm and inner membrane in bacteria, in the mitochondria and cytoplasm in humans and yeast, and in the chloroplast and cytoplasm in plants. In spite of these localization changes over evolution, most of the defects in growth rate and viability conferred by heme pathway mutations in yeast can be complemented by introduction of the corresponding (A) bacterial genes, (B) plant genes (except for At-HEME), and (C) human genes. Yellow indicates a replaceable gene, blue a non-replaceable gene.

DOI: http://dx.doi.org/10.7554/eLife.25093.017

Figure 6—figure supplement 1. Heme biosynthesis genes from Arabidopsis thaliana and Glycine max generally efficiently replace their counterparts in yeast, except in the case of ΔSc-Hem12.

(A) Expression of heme pathway genes from Arabidopsis thaliana, At-HEMA1 or At-GSA2, individually cannot complement the lethal growth defect caused by deletion of the Sc-hem1 gene in yeast. Co-expression of At-HEMA1 and At-GSA2 rescued the growth defect of the Sc-hem1 gene deletion in yeast. (B) Haploid yeast gene deletion strains carrying plasmids expressing functionally replacing Arabidopsis (red or blue solid-lines) and (B’) Glycine max (Gm-HEMG) heme pathway genes (red solid-line) generally exhibit comparable growth rates to the wild type parental yeast strain BY4741 (black dotted-line) as grown in magic marker liquid medium in the presence of G418 (200 μg/ml). (B’’) Native At-HEMC with its chloroplast localization signal (CLS) showed poor replaceability in yeast (red solid-line). Removal of the CLS from At-HEMC allowed efficient rescue of the corresponding yeast gene deletion, ΔSc-Hem3 (blue solid-line). (B’’’) However, neither the expression of Arabidopsis proteins At-HEME1 or At-HEME2 (with or without CLS) alone nor their co-expression could functionally rescue the corresponding yeast gene deletion, ΔSc-Hem12. Wild type BY4741 haploid strain is plotted for comparison (black dotted-line). Strains carrying empty vector were used as controls (grey solid-line). Mean and standard deviation plotted with N = 3.

Figure 6—figure supplement 2. Heme biosynthesis enzymes normally localized to plant chloroplasts or human mitochondria localize to the mitochondria when expressed in yeast.

(A) EGFP-tagged penultimate At-PPOX1-EGFP and ultimate At-FC1-EGFP proteins localize to mitochondria in yeast. The green fluorescent signal co-localized with MitoTracker Red-stained mitochondria. In certain cases, At-FC1-EGFP formed aggregates. Expression of EGFP-tagged plant genes, At-PPOX1-EGFP and At-FC1-EGFP (red solid-line), efficiently rescues the growth defect of the corresponding yeast gene deletions (pink dotted-line). The over-expression of the tagged proteins is not toxic to the wild type yeast strain (grey dotted-line). The growth rescue by plant genes is as efficient as the wild type BY4741 yeast strain (black dotted-line). Mean and standard deviation plotted with N = 3. (B) The EGFP-tagged last three heme pathway genes from humans localize to mitochondria in yeast. The green fluorescence co-localized with the MitoTracker Red-stained mitochondria in yeast. Expression of EGFP-tagged human genes, Hs-PPOX-EGFP, Hs-FECH-EGFP and Hs-CPOX-EGFP (red solid-line), efficiently rescues the growth defect of the corresponding yeast gene deletions (pink dotted-line). The over-expression of the tagged proteins is not toxic to the wild type yeast strain (grey dotted-line). The growth rescue by the human genes is as efficient as the wild type BY4741 yeast strain (black dotted-line). Mean and standard deviation plotted with N = 3.

Figure 6—figure supplement 3. Human heme biosynthesis genes efficiently replace their yeast counterparts.

Functional replacement of yeast genes by human genes. (A) Expression of Hs-UROS in the Sc-hem4 heterozygous diploid deletion yeast strain resulted in toxicity post-sporulation, as seen by the lack of growth on magic marker agar medium either with G418 (yeast gene absent) or without G418 (yeast gene present). (B) This toxicity was relieved by inserting the human Hs-UROS gene at the native yeast locus. The growth curve of the humanized yeast Sc-hem4Δ::Hs-UROS strain (red solid-line) was comparable to that of the wild type yeast BY4741 (black dotted-line). (C) Expression of human Hs-UROD (a human ORFeome clone with a G303V mutation) in the Sc-hem12 heterozygous diploid deletion yeast strain did not complement the growth defect of the yeast gene, as shown by plating the post-sporulation mix on magic marker medium with or without G418. Reverting the sequence to the wild type Hs-UROD gene resulted in efficient rescue of the growth defect of the corresponding yeast gene. (D) Expression of human genes, Hs-PPOX, Hs-UROD, Hs-ALAS1 (red solid-line) and Hs-ALAS2 (blue solid-line), efficiently rescues the growth defect of the corresponding yeast gene deletions (grey solid-line), Sc-hem14 and Sc-hem1, respectively. The rescue was largely comparable to the wild type BY4741 yeast strain (black dotted-line). Strains carrying empty vector were used as controls (grey solid-line). Mean and standard deviation plotted with N = 3.

In Arabidopsis, unlike in E. coli, a majority of genes in the heme biosynthesis pathway have undergone lineage-specific amplifications, resulting in two co-orthologs for each single yeast gene (Figure 6B). In these cases, we tested both co-orthologs individually for replaceability; all replaced successfully, with the exception of one case where only one replaced (At-CPX1 replaced while At-CPX2 did not), and one case where neither replaced (At-HEME1 and At-HEME2) (Figure 6B, Figure 6—figure supplement 1B,B’’’).

Because the plant heme biosynthesis pathway builds precursors for chlorophyll synthesis (Tanaka et al., 2011; Papenbrock et al., 1999), this pathway, especially the penultimate step producing protoporphyrin IX, is the target of many commercial herbicides. Both Arabidopsis paralogs that we tested, At-PPOX1 and At-PPOX2, could efficiently complement the yeast gene responsible for this critical step, Sc-HEM14 (Figure 6—figure supplement 1B). To confirm the generality of these results, we further tested the soybean (Glycine max) ortholog Gm-HEMG in yeast. As for each of the Arabidopsis paralogs, the single soybean ortholog also successfully complemented the Sc-hem14 deletion growth defect (Figure 6—figure supplement 1B’).

It is noteworthy that plant heme biosynthesis genes harbor chloroplast localization sequences (UniProt Consortium, 2015), and we did not remove these for our complementation experiments. We speculated that the chloroplast leader peptides might be recognized and localized by the yeast mitochondrial localization machinery, so we constructed EGFP-fusions of the plant enzymes and assayed their localization by fluorescence microscopy. EGFP fusions of At-PPOX1 and At-FC1 showed clear mitochondrial localization in yeast (Figure 6—figure supplement 2A). At-FC1 additionally showed amorphous aggregates in some yeast cells, suggesting localization might occasionally be imperfect. Nonetheless, both EGFP-tagged genes were able to efficiently rescue the growth defect of the corresponding yeast gene deletion (Figure 6—figure supplement 2A). Thus, these plant chloroplast localization signals appear to be recognized and processed as mitochondrial localization signals in yeast.

These findings suggested that plant versions of cytosolic yeast heme pathway proteins could potentially be mis-localizing to the mitochondria in yeast (Figure 4A). Indeed, At-HEMC only weakly replaced the yeast gene, Sc-HEM3. We found that removing the chloroplast localization signal (CLS) from At-HEMC markedly enhanced its ability to functionally replace its yeast ortholog (Figure 6—figure supplement 1B’’). In contrast, neither of two Arabidopsis paralogs, At-HEME1 and At-HEME2, could functionally replace their yeast ortholog, Sc-HEM12, even after removing their CLS sequences, or even when co-expressed in the yeast strain (Figure 6—figure supplement 1B’’’). We speculate that complementation could have failed for several other reasons, including unknown intermediate reactions, a requirement for localization in a special compartment (e.g., chloroplast), or different transcriptional/translational regulation in plants.

Each yeast heme biosynthesis enzyme can be replaced by its human ortholog

Earlier studies have shown successful replacement of the yeast heme biosynthesis genes by their human orthologs Hs-ALAD (Schauer and Mattoon, 1990), Hs-HMBS, Hs-CPOX and Hs-FECH (Kachroo et al., 2015), while Hs-UROS expression resulted in toxicity and Hs-UROD failed to replace its yeast ortholog (Kachroo et al., 2015; Sun et al., 2016). We, therefore, sought to complete tests of the remaining human genes in the pathway. In the case of Hs-UROS, we reasoned that toxicity was due to expression from the heterologous constitutive promoter (Figure 6—figure supplement 3A). Indeed, similar to the results obtained with the yeast version of this gene (Figure 4—figure supplement 1, Sc-HEM4), we found that toxicity could be abrogated by inserting the human gene at the native yeast chromosomal locus, thus providing native yeast gene expression and regulation for the human ORF (Figure 6—figure supplement 3B). This suggests that, at least in yeast, this step is regulated transcriptionally for optimal function. We also found that the human ORFeome clone of Hs-UROD contained a mutation (G303V) that when reverted to wild-type sequence allowed it to replace the yeast gene (Figure 6—figure supplement 3C and D), and we additionally confirmed that human Hs-PPOX could complement the severe growth defect of the yeast Sc-hem14 deletion strain (Figure 6C, Figure 6—figure supplement 3D). Finally, in humans, the initial step of heme biosynthesis is identical to that of yeast (Sc-HEM1) but is encoded by two co-orthologs, Hs-ALAS1 and Hs-ALAS2. We found that both of these human genes could individually replace the yeast gene function (Figure 6C, Figure 6—figure supplement 3D).

The subcellular localization of heme biosynthesis differs slightly between humans and yeast, such that the last three proteins in the human heme biosynthesis pathway are mitochondrially localized, as opposed to only the last two in yeast (Grandchamp et al., 1978; Ferreira et al., 1988). As all three of these genes replaced, we tested if the human genes were localized to the mitochondria in yeast. Indeed, EGFP-tagged Hs-FECH, Hs-PPOX, and Hs-CPOX all localized to mitochondria in yeast (Figure 6—figure supplement 2B) and efficiently rescued the growth defect of the corresponding yeast gene deletion (Figure 6—figure supplement 2B), confirming that the human mitochondrial localization signal is recognizable by the yeast localization machinery. Thus, across our attempts to humanize, plantize, and bacterialize this pathway, the presence of mitochondrial leader peptides from the human genes and the chloroplast leader peptides from the plant genes, as well as the absence of bacterial leaders, all overrode the native yeast localization of the heme biosynthesis pathway. However, the pathway function was largely resilient to these effects, with the exception of protoporphyrin IX accumulation in the mislocalized bacterialized strains (Figure 5).

Heme biosynthesis is a near-universally swappable pathway

As illustrated in Figure 7, the heme pathway has had a complicated evolutionary trajectory in eukaryotes due to endosymbiotic events, which has served to increase its similarity between bacteria and eukaryotes (Kořený and Oborník, 2011). During eukaryogenesis, early eukaryotes adopted a large portion of the bacteria-like heme biosynthesis pathway of their endosymbiont mitochondria. The subsequent endosymbiotic acquisition of chloroplasts along the plant lineage (Oborník and Green, 2005) resulted in redundancy between mitochondrial-origin and chloroplast-origin portions of their heme biosynthesis pathways, a state that can be observed today in Euglena, a non-plant, photosynthetic eukaryote with more recently acquired chloroplasts (Kořený and Oborník, 2011). Over time, plants kept the chloroplastic system and lost most of the mitochondrial system. These evolutionary transfers may have been possible due to the apparent modularity of the heme pathway, which we observe in its high tolerance for substituting genes or enzymatic functions across species.

Figure 7. The complex evolutionary history of the heme biosynthesis pathway is reflected in high replaceability across species.

In eukaryotes, heme biosynthesis enzymes have been replaced historically by endosymbiosis events from bacteria, leading to higher similarity across these lineages, while the archaeal pathway appears to be more divergent (Storbeck et al., 2010). Following the endosymbiosis of the cyanobacterial chloroplast, plants adopted most of the chloroplast-derived heme biosynthesis genes, losing many ancestral eukaryotic heme pathway genes (Oborník and Green, 2005). Yeast and humans both retain the predicted ancestral eukaryotic heme biosynthesis pathway. While enzymatic steps are mostly shared between yeast, plants, bacteria, and humans, localization of individual proteins differs substantially between species. Asterisks indicate results curated from literature.

DOI: http://dx.doi.org/10.7554/eLife.25093.021

Our data demonstrate that despite 2 billion years of divergence from their last common ancestor, heme biosynthesis genes are still carrying out a conserved and necessary function that can be swapped into yeast with minimal effect on growth and irrespective of orthology and subcellular localization. Taking these data together with literature studies showing successful replacement of the E. coli Ec-hemG gene by the plant or human Hs-PPOX gene (Lermontova et al., 1997; Dailey and Dailey, 1996; Narita et al., 1996), and that introducing the protoporphyrinogen oxidase from Bacillus subtilis into plants improves yields (Lee et al., 2000), heme biosynthesis thus appears to be a pathway whose genes are freely exchangeable between bacteria, plants (with the exception of At-HEME), humans, and yeast (Figure 7).

Conclusions

In conclusion, in order to discern whether orthology strictly confers function across deep evolutionary distances, we systematically tested E. coli genes with 1:1 orthology to essential yeast genes for their ability to functionally replace their yeast counterparts. We discovered that ~61% (31/51) of the tested E. coli and yeast genes still retain ancestral function to a sufficient extent that the bacterial genes efficiently replace their yeast equivalents. Eukaryote-specific features such as subcellular localization (4 of 14) and proper start codon usage (2 of 4) were critical for swappability for some of the E. coli orthologs. Our analysis of replaceable/non-replaceable orthologous pairs revealed that amino acid sequence similarity was not the most important property, consistent with a general trend for sequence conservation to often more strongly reflect other attributes of protein function (e.g., abundance and protein-specific functional constraints) (Jordan et al., 2002; Wang and Zhang, 2009). Rather, the top predictors of replaceability were features attributed to specific gene modules. These results largely agree with previously published work on humanization of yeast genes (Kachroo et al., 2015; Hamza et al., 2015; Sun et al., 2016), suggesting that functional replaceability is predominantly determined at the level of pathways and processes, even across very large evolutionary distances. As our assays can be considered a form of forced horizontal gene transfer, our results provide support for the ‘complexity hypothesis’ (Jain et al., 1999), which posits that informational (transcription, translation, etc.) genes are less likely to be horizontally transferred than those genes that are operational (metabolism, housekeeping, etc.). Consistent with this expectation, we see metabolism-associated genes replacing more often than those involved in ‘informational’ processes like transcription or translation.

In the course of these studies, we found that heme biosynthetic reactions were entirely replaceable across the prokaryote-eukaryote divide, despite non-orthologous functional displacement and lack of eukaryotic subcellular localization by native E. coli genes (Figure 7). Although the archaeal pathway is considerably diverged, our studies across bacteria and eukaryotes showed a high degree of replaceability: plant heme biosynthesis enzymes functionally replaced yeast enzymes in all but one reaction. Swaps of the corresponding human enzymes into yeast in this and prior studies all suggest that heme biosynthesis is a near-universally replaceable pathway.

Our results thus demonstrate that orthologous genes carry out functions similar enough to allow them to replace each other across even the 2 billion year evolutionary rift separating prokaryotes and eukaryotes from their last common ancestor. These swaps allow engineering of orthologous pathways in model organisms highly amenable to genetic perturbations, like yeast and bacteria, for further characterization.

Materials and methods

Construction of ORFs from bacteria, plants, and humans in yeast expression vectors

Refer to Supplementary file 3 for all the primers used in this study.

E. coli ORF yeast expression vectors

Initial E. coli ORF primers were designed such that the 3' ends of the primers had homology to E. coli genes and 5’ ends contained a universal flanking sequence. A second round of PCR was performed with primers recognizing the universal flanking sequence and also having 5’ ends corresponding to gateway compatible attL1 (or attB1) and attL2 (or attB2) sequences on the forward and reverse primers, respectively. Resulting PCR products from attL sequence containing primers were directly cloned via gateway LR cloning (ThermoFisher Scientific) into yeast destination vector pAG416GPD-ccdB (Addgene) to create expression clones. PCR products from attB primers were subcloned via gateway BP cloning into vector pDONR221 (ThermoFisher Scientific) to create entry clones. These entry clones were then cloned via gateway LR to the pAG416GPD-ccdB destination vector to create expression clones. Some E. coli genes were synthesized as gBlocks from IDT and made gateway compatible by adding attL1 and attL2 sequences at the 5’ and 3’ ends, respectively, making them compatible for direct LR cloning to create expression vectors.

Plant ORF yeast expression vectors

Arabidopsis thaliana ORFs were PCR amplified from cDNA obtained as a kind gift from Dr. Jeffrey Chen (UT Austin), using primers specific to each gene and containing gateway compatible attL1 and attL2 sequences at the 5’ and 3’ ends respectively (Supplementary file 3). PCR products were directly cloned into the yeast expression vector pAG416GPD-ccdB by LR gateway cloning (using LR clonase II from Invitrogen). At-HEME1 and At-HEMB2 were synthesized as gBlocks from Integrated DNA Technologies (IDT).

Plant ORF yeast expression vectors without the chloroplast localization signal

In order to remove the chloroplast localization signal from the plant proteins At-HEMC, At-HEME1 and At-HEME2, we first performed amino acid sequence alignment with the bacterial and yeast orthologs to identify unaligned N-terminal sequence. We attributed the non-alignment to the presence of chloroplast localization signal (CLS) sequence. We also used the TargetP 1.1 signal peptide predictor (Emanuelsson et al., 2007) to corroborate the sequence alignments. In the case of At-HEMC, 68 N-terminal amino acids were deleted while retaining ATG start codon. Similarly, in the case of At-HEME1 and At-HEME2, 47 N-terminal amino acids were deleted while retaining the ATG start codon. We synthesized these genes as gBlocks (IDT) with attB1 and attB2 attachment sites flanking their 5’ and 3’ ends, respectively, then subcloned the gBlocks into the entry clone pDONR221, sequence-verified the clones, and LR cloned the genes into yeast expression vector pAG416GPD-ccdB.
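The truncation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to design the gBlocks: the function name `remove_n_terminal_signal` is hypothetical, the toy sequence is invented, and we assume that the deleted residues are counted after the initiator methionine so that the original ATG stays in place. Signal prediction itself (the alignment and TargetP steps) is not modelled.

```python
def remove_n_terminal_signal(orf: str, n_residues: int) -> str:
    """Delete n_residues codons directly after the start codon, keeping ATG.

    Assumption (not stated in the text): the deleted residues are counted
    from the position following the initiator methionine.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with an ATG start codon")
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    cut_end = 3 + 3 * n_residues
    if cut_end > len(orf):
        raise ValueError("cannot delete more residues than the ORF encodes")
    # Keep the ATG, skip the signal codons, keep the rest in frame.
    return orf[:3] + orf[cut_end:]


# Toy example (invented sequence): three Ala codons play the role of the CLS.
toy_orf = "ATG" + "GCT" * 3 + "AAA" + "TAA"
truncated = remove_n_terminal_signal(toy_orf, 3)
print(truncated)
```

Because whole codons are removed, the remaining coding sequence stays in frame with the retained start codon.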

Plant ORF yeast expression vectors for co-expression of At-HEME1 and At-HEME2

At-HEME1 and At-HEME2 were cloned (with or without CLS) into the destination vectors pAG416GPD-ccdB and pCMY41 (kind gift of Christopher Yellman; pCMY41 is identical to pAG416GPD-ccdB but carries a hygromycin-resistance cassette), allowing us to co-transform two plasmids and select for the double plasmid transformants on synthetic defined medium, -Ura + Hyg (200 μg/ml).

Human ORF yeast expression vectors

Human ORFs were obtained from the ORFeome collection (GE Dharmacon) and sequenced to verify correct, full-length clones. In the case of human Hs-UROD, the ORFeome clone contained a loss-of-function mutation (G303V), so wild-type human Hs-UROD was synthesized as a gBlock fragment (IDT) and used as a PCR template, amplifying the gene using primers with flanking gateway compatible sites attL1 and attL2 at the 5’ and 3’ ends respectively (Supplementary file 3). The PCR product was subcloned by LR reaction into the yeast expression vector pAG416GPD-ccdB.

Yeast ORF yeast expression vectors

Yeast ORFs were amplified by PCR from genomic DNA of yeast strain BY4741, and gateway compatible attL1 and attL2 sequences were added by PCR to the amplicons’ 5’ and 3’ ends, respectively (Supplementary file 3). The resulting PCR products were subcloned by LR reaction into the yeast expression vector pAG416GPD-ccdB. Several yeast heme biosynthesis genes were first cloned in pENTR/SD/D-TOPO plasmid (Invitrogen) to obtain gateway entry clones (refer to Supplementary file 3 for primers). These clones were sequence-verified and then used to generate yeast expression vectors by LR reaction into the vector pAG416GPD-ccdB.

Mitochondrially-localized E. coli ORF yeast expression vectors

We added the MLS from the yeast MIP1 gene to the 5’ end of selected E. coli ORFs via PCR, using an ORF-specific ultramer containing the full MLS-coding sequence at the 5’ end such that MLS was in frame with the coding sequence of the E. coli gene while removing the E. coli gene start codon (Supplementary file 3). Each PCR product was then used as a template to add gateway cloning attachment sites attL1 and attL2, followed by LR gateway cloning into pAG416GPD-ccdB to generate yeast expression vectors.
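The in-frame fusion described above (MLS-coding sequence placed 5’ of the E. coli ORF, with the bacterial start codon removed) can be illustrated with a minimal Python sketch. The function name `add_mls` is hypothetical, and the MLS string in the example is a made-up placeholder, not the Sc-MIP1 sequence; the actual sequences are those listed in Supplementary file 3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_mls(mls_coding: str, orf: str) -> str:
    """Fuse an MLS-coding sequence in frame to an ORF, dropping the ORF's start codon."""
    mls = mls_coding.upper()
    orf = orf.upper()
    if len(mls) % 3 != 0 or len(orf) % 3 != 0:
        raise ValueError("both sequences must be a whole number of codons")
    if not mls.startswith("ATG"):
        raise ValueError("the MLS must supply the ATG start codon")
    # An in-frame stop inside the MLS would terminate translation early.
    mls_codons = [mls[i:i + 3] for i in range(0, len(mls), 3)]
    if any(codon in STOP_CODONS for codon in mls_codons):
        raise ValueError("MLS contains an in-frame stop codon")
    # Remove the first codon of the bacterial ORF (its own start codon).
    return mls + orf[3:]


# Placeholder MLS (hypothetical, for illustration only) and a toy ORF.
fusion = add_mls("ATGACCAAGCTG", "GTGGCTAAATAA")
print(fusion)
```

Dropping the bacterial start codon means the fusion begins at the ATG supplied by the MLS, whatever start codon the E. coli gene originally used.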

EGFP tagged E. coli / plant / human ORF yeast expression vectors

Using PCR, we amplified E. coli / plant / human ORFs without their respective stop codons while also adding attB1 and attB2 gateway attachment sites at the 5’ and 3’ ends of each PCR product (Supplementary file 3). The resulting PCR fragments were subcloned into plasmid pDONR221 to generate gateway entry clones using the BP gateway cloning reaction. Each entry clone was subjected to the LR cloning reaction in order to generate carboxy-terminal EGFP-tagged yeast expression clones in the pAG416GPD-ccdB-EGFP destination vector.

Converting E. coli ORF yeast expression vectors with alternative start codons to ATG start codon

We introduced ATG start codons by PCR mutagenesis, employing ATG-containing primers (Supplementary file 3) to amplify and simultaneously add gateway cloning attachment sites attL1 and attL2 to the 5’ and 3’ ends of the PCR products, respectively, then subcloning these products by the LR gateway cloning reaction into the pAG416GPD-ccdB plasmid in order to construct yeast expression vectors.

E. coli and Arabidopsis two-gene expression vectors for complementing a yeast Sc-HEM1 deletion

E. coli genes Ec-hemA, Ec-hemL and plant genes At-HEMA1, At-GSA2 were PCR amplified from genomic DNA (E. coli) or gBlocks obtained from IDT (Arabidopsis). For E. coli genes, we also added an MLS at the 5’ end of the PCR products. These PCR products were made Golden Gate compatible by introducing BsmBI sites and cloned individually in pYTK001 (Supplementary file 3). In the case of At-HEMA1, the gBlock was synthesized with a mutation that removes an internal BsmBI site without affecting the protein sequence. Clones were sequence verified prior to assembly (Lee et al., 2015). Individual transcription units for each of the genes were obtained by Golden Gate assembly using the pYTK001-entry clone containing the E. coli or plant gene, along with pYTK vectors contributing promoters and terminators. In the case of Ec-hemA and At-HEMA1 transcription units (TU1s), the pHHF2 promoter was contributed by pYTK012 and tADH1 terminator by pYTK053. In the case of Ec-hemL and At-GSA2 transcription units (TU2s), the pTEF1 promoter was contributed by pYTK013 and tSSA1 terminator by pYTK052. Unique contigs for directional assembly were obtained from pYTK002 (ConLS) and pYTK067 (ConR1) for TU1. For TU2, the unique contigs were obtained from pYTK003 (ConL1) and pYTK072 (ConRE). The individual transcription units (TU1 and TU2) were then assembled in a single yeast CEN6-URA vector via Golden Gate assembly with BsmBI.
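Checking a coding sequence for internal BsmBI sites, as was needed for At-HEMA1, can be sketched as follows. This is a minimal illustration rather than the procedure actually used; the recognition sequence 5'-CGTCTC-3' comes from general knowledge of the enzyme, not from this text, and the function names and the test sequence are hypothetical.

```python
from typing import List, Tuple

BSMBI_SITE = "CGTCTC"  # BsmBI recognition sequence on the top strand


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_bsmbi_sites(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for every BsmBI site on either strand."""
    seq = seq.upper()
    hits = []
    # A site on the bottom strand appears as the reverse complement on top.
    for strand, motif in (("+", BSMBI_SITE), ("-", reverse_complement(BSMBI_SITE))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Placeholder sequence carrying one site on each strand.
print(find_bsmbi_sites("AAACGTCTCTTTGAGACGAA"))
```

Any hit inside a part's coding sequence would need to be removed by a synonymous change before Golden Gate assembly with BsmBI.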

All clones were sequence-verified using the University of Texas Genomic Sequencing and Analysis Facility.

Functional complementation assays

Gene replaceability was tested using available yeast strains from two yeast strain collections, the temperature-sensitive (TS) collection (Li et al., 2011) and the heterozygous diploid deletion magic marker collection (Pan et al., 2004), as follows:

(1) Temperature-sensitive (TS) collection assays

Typically, yeast strains in this collection grow at permissive temperatures (22–26°C) but cannot grow at restrictive temperatures (35–37°C). Growth at restrictive temperatures thus allows for the identification of foreign genes that complement the yeast defect. We tested for replaceability in temperature-sensitive yeast strains as follows:

The strains were transformed with either an empty vector control (pAG416GPD-ccdB) or with the clone expressing the foreign gene. The transformants were plated on:

  1. −Ura dextrose medium at the permissive temperature (25°C), serving as a control for transformation efficiency and/or toxicity since both the yeast and the foreign gene are expressed.

  2. −Ura dextrose medium at the non-permissive temperature (36°C), testing for functional replacement under conditions in which the corresponding yeast gene is non-functional.

(2) Heterozygous diploid deletion magic marker collection assays

The yeast heterozygous diploid deletion magic marker collection comprises yeast strains that harbor a deletion of one copy of a yeast gene replaced with a KanMX cassette. The strains also carry a magic marker or synthetic genetic array (SGA) cassette at the can1 locus, which enables selection for haploid cells on magic marker (MM) medium (−His −Arg −Leu +Can) post-sporulation with or without antibiotic G418 (200 μg/ml). Haploid a-type spores that harbor a wild type gene grow normally on magic marker (MM) medium without G418 and provide a test of sporulation efficiency and toxicity, if any, associated with heterologous expression of the foreign gene (using a −Ura selection marker in this study). Growth of haploid spores on MM medium in the presence of G418 selects for yeast cells that harbor the relevant gene deletion while testing for complementation by the foreign gene.

Expression clones or empty vector controls were transformed into appropriate strains and selected on −Ura G418 medium in a 96-well format. (Toxicity was inferred from a repeated failure to obtain transformants in the case of expression clones compared to the empty vector control.) Transformants were re-plated on GNA-rich pre-sporulation medium containing G418 (200 μg/ml) and histidine (50 mg/l). Individual colonies were inoculated in liquid sporulation medium containing 0.1% potassium acetate and 0.005% zinc acetate, and incubated with vigorous shaking at 25°C for 3–5 days, after which sporulation efficiency was estimated by microscopy, and the mixture was re-suspended in water and plated equally on two assay conditions:

  1. ‘G418 minus’ magic marker dextrose medium (−His −Arg −Leu +Can −Ura), incubated at 30°C. The haploid spores that carry the wild-type yeast gene grow in this medium acting as a control for sporulation efficiency. This condition also assays for toxicity if the haploid spores carrying expression vectors fail to grow.

  2. ‘G418 plus’ magic marker dextrose medium (−His −Arg −Leu +Can −Ura) containing 200 μg/ml G418. The resulting haploid deletion strain is expected not to grow unless the foreign gene complements the deletion, providing an assay of replaceability for strains carrying the expression vector. Cases with approximately equal numbers of cells growing in the absence or presence of G418 were considered functional replacements.

Positive assays were verified independently. Individual colonies were isolated from selective plates and used for growth assays on YPD or magic marker medium with G418 (Figure 1B, Figure 1—figure supplement 1A). After growth on YPD with G418, each strain was spotted on 5-FOA agar to test plasmid dependency (Supplementary file 1). Only one strain (Ec-valS) failed that test.

Ortholog inference

Genes with 1:1 orthology between yeast and E. coli were obtained from the Inparanoid 8 webserver (Sonnhammer and Östlund, 2015) and filtered to retain only yeast-essential genes. Orthologs to these selected yeast genes in human and Arabidopsis were downloaded from Inparanoid 8 and further refined by comparison to orthology calculations by eggNOG4.5 (Huerta-Cepas et al., 2016), OMA (Altenhoff et al., 2015), and reference to the evolutionary history of the heme pathway in photosynthetic organisms (Oborník and Green, 2005).

Computational analyses of replaceability

Feature assembly

Sequence features

Protein sequence features were calculated using UniProt (UniProt Consortium, 2015) proteomes from the respective species downloaded in March 2015. E. coli nucleotide sequence features were calculated using EcoGene (Zhou and Rudd, 2013) sequences downloaded April 2015.

[Sc|Ec]_Length

The number of amino acids in the respective protein.

Sc-Ec_LengthDifference

Calculated as the amino acid length of the S. cerevisiae protein minus the length of its E. coli ortholog.

Sc-Ec_AbsLengthDifference

Calculated as the absolute value of the above length difference.

Sc-Ec_PercentIDAligned

Sc-Ec_PercentIDLongest

Sc-Ec_PercentSimilarityAligned

Sc-Ec_PercentSimilarityLongest

The fraction of identical residues (PercentID) or similar residues (PercentSimilarity) in a global alignment (NWalign, http://zhanglab.ccmb.med.umich.edu/NW-align/) of the respective orthologs, expressed relative to the length of the longer of the two proteins (Longest) or to the length of the aligned region (Aligned).
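The two normalizations of the identity features can be illustrated with a short Python sketch that takes the two gapped rows of a pairwise global alignment. It is not the NWalign implementation. Treating the "aligned region" as the set of columns in which both proteins have a residue is an assumption, and the function name and toy sequences are hypothetical. The similarity features would additionally need a substitution matrix to decide which residue pairs count as similar.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> dict:
    """Identity fractions from two equal-length rows of a pairwise alignment."""
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0  # columns where neither sequence has a gap (assumed definition)
    for x, y in zip(aln_a, aln_b):
        if x == gap or y == gap:
            continue
        aligned += 1
        if x == y:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns")
    # Ungapped lengths of the two proteins; 'Longest' uses the larger one.
    longest = max(len(aln_a.replace(gap, "")), len(aln_b.replace(gap, "")))
    return {
        "PercentIDAligned": identical / aligned,
        "PercentIDLongest": identical / longest,
    }


# Toy alignment: 4 identical residues, 5 ungapped columns, both proteins 6 long.
features = percent_identity("MKT-LLA", "MRTALL-")
print(features)
```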

Ec_CAI

Ec_CBI

Ec_FOP

Ec_ScCAI

Ec_ScCBI

Ec_ScFOP

The Codon Adaptation Index (CAI), Codon Bias Index (CBI), or Frequency of OPtimal codons (FOP) for the respective E. coli gene, calculated using the E. coli optimal codon table (Ec_) or S. cerevisiae optimal codon table (Ec_Sc) using codonw (http://sourceforge.net/projects/codonw/).
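For orientation, CAI is conventionally defined as the geometric mean of the relative adaptiveness values (w) of the codons in a gene. The sketch below shows that calculation in Python and how the same gene can be scored against two different reference tables, as for Ec_CAI and Ec_ScCAI. It does not reproduce codonw; the function name and the weight tables are hypothetical toy values, and skipping codons absent from the table (such as ATG and stop codons here) is an assumption.

```python
import math


def codon_adaptation_index(cds: str, weights: dict) -> float:
    """Geometric mean of relative adaptiveness values over the codons of a CDS.

    `weights` maps codon -> w in (0, 1]. Codons missing from the table are
    skipped (assumption).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Toy weight tables standing in for two species' optimal-codon references.
toy_table_1 = {"CTG": 1.0, "CTA": 0.25, "AAA": 1.0, "AAG": 0.25}
toy_table_2 = {"CTG": 0.25, "CTA": 0.25, "AAA": 1.0, "AAG": 1.0}

gene = "ATGCTGCTAAAAAAG"  # placeholder sequence
cai_1 = codon_adaptation_index(gene, toy_table_1)
cai_2 = codon_adaptation_index(gene, toy_table_2)
print(cai_1, cai_2)
```

Scoring one E. coli sequence against both tables is what distinguishes the Ec_ and Ec_Sc feature variants: the sequence is unchanged and only the reference weights differ.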

Abundance features

Sc_TranscriptAbundance

Sc_ProteinAbundance

Sc_RPFAbundance

Sc_TranslationEfficiency

Ec_ProteinAbundance

Yeast (Sc) protein abundance data were taken from Kulak et al. (2014). Yeast transcript and RPF abundance were taken from Ingolia et al. (2009). Yeast translation efficiency is calculated as the ratio of RPF reads to transcript reads for a given gene. E. coli data were taken from Arike et al. (2012) (average iBAQ abundance only).

Network features

Sc_BIOGRID-Betweenness

Sc_BIOGRID-Clustering

Sc_BIOGRID-Degree

Sc_BIOGRID-SumLLS

Sc_BIOGRID-LT-Degree

Sc_BIOGRID-LT-SumLLS

Sc_BIOGRID-LT-Betweenness

Sc_BIOGRID-LT-Clustering

Calculated from interactions present in BIOGRID 3.1.93 (Stark et al., 2006). ‘BIOGRID’ was calculated using only those interactions annotated as ‘physical interactions’, while ‘BIOGRID-LT’ was calculated using the subset of physical interactions found only by low-throughput experiments.

Ec_EcoCyc_FractionComplementing

Calculated using the ‘All Pathways’ table from EcoCyc (https://ecocyc.org) downloaded in September 2016. To create the network, all pathways were considered ‘cliques’ so that all members of the pathway were annotated as interacting with all other members of the pathway. FractionComplementing is the fraction of interacting partners tested in our assays that were able to replace.
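The clique construction and the FractionComplementing feature can be sketched as follows. This is an illustrative reading of the description above, not the original analysis code. The gene and pathway names are placeholders, and excluding a gene from its own set of partners and returning `None` when no partner was tested are assumptions.

```python
from collections import defaultdict


def fraction_complementing(pathways: dict, status: dict) -> dict:
    """Fraction of each gene's tested pathway partners that could replace.

    pathways: pathway name -> iterable of member genes (each pathway is a clique)
    status:   gene -> True/False replaceability, for tested genes only
    """
    partners = defaultdict(set)
    for members in pathways.values():
        members = set(members)
        for gene in members:
            # Every member interacts with every other member of the pathway.
            partners[gene] |= members - {gene}
    result = {}
    for gene, neighbours in partners.items():
        tested = [status[n] for n in neighbours if n in status]
        result[gene] = sum(tested) / len(tested) if tested else None
    return result


# Placeholder pathways and assay outcomes.
toy_pathways = {"pathway1": ["g1", "g2", "g3"], "pathway2": ["g3", "g4"]}
toy_status = {"g1": True, "g2": False, "g3": True}
fractions = fraction_complementing(toy_pathways, toy_status)
print(fractions)
```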

Calculating the predictive strength of features

The predictive power of each feature was calculated as the area under the receiver operating characteristic curve (AUC) while treating each feature as an individual classifier. Each feature was sorted in both ascending and descending directions, retaining the direction providing an AUC > 0.5. To assess significance, a shuffling procedure was performed as follows: For each feature, the replaceable/non-replaceable status of each ortholog pair was shuffled (retaining the original ratio of replaceable to non-replaceable assignments), and the AUC was calculated. The shuffling procedure was carried out 1000 times for each feature, and the mean AUC values and their standard deviations are reported.
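A minimal Python sketch of the single-feature AUC and the label-shuffling procedure is given below. It is an illustration under stated assumptions, not the original analysis code: the AUC is computed by pairwise comparison (the Mann-Whitney formulation), the better sort direction is re-chosen for each shuffle, and the function names and toy values are hypothetical.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def directed_auc(scores, labels):
    """Keep whichever sort direction (ascending or descending) gives AUC >= 0.5."""
    a = auc(scores, labels)
    return max(a, 1.0 - a)


def shuffled_auc(scores, labels, n_iter=1000, seed=0):
    """Mean and standard deviation of the AUC after shuffling the labels.

    Shuffling permutes the labels, so the ratio of replaceable to
    non-replaceable assignments is preserved.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    values = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        values.append(directed_auc(scores, shuffled))
    return statistics.mean(values), statistics.stdev(values)


# Toy feature values and replaceability calls.
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_labels = [True, True, False, True, False]
observed = directed_auc(toy_scores, toy_labels)
null_mean, null_sd = shuffled_auc(toy_scores, toy_labels, n_iter=200)
print(observed, null_mean, null_sd)
```

Comparing the observed AUC with the shuffled mean and standard deviation gives a sense of how far a feature departs from chance.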

Combined classifier

A Random Forest classifier was constructed using all features and evaluated using 10-fold cross-validation. The random forest was constructed to have no maximum tree depth, and ties between similarly good attributes were broken randomly. The combined classifier was implemented using the Weka data-mining software (Frank et al., 2004).
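The fold structure of a 10-fold cross-validation can be sketched in Python as follows. This does not reproduce Weka or train a Random Forest; it only shows how ortholog pairs might be partitioned so that every pair is held out exactly once. Stratifying folds by class is an assumption, the function names are hypothetical, and the toy labels simply mirror the 31 replaceable and 20 non-replaceable calls reported in this study.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with classes spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for j, index in enumerate(indices):
            folds[(offset + j) % k].append(index)
        offset += len(indices)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), holding out each fold once."""
    folds = stratified_folds(labels, k, seed)
    for held_out in folds:
        test = sorted(held_out)
        train = sorted(i for fold in folds if fold is not held_out for i in fold)
        yield train, test


toy_labels = [True] * 31 + [False] * 20
folds = stratified_folds(toy_labels)
print([len(fold) for fold in folds])
```

A classifier would be trained on each `train` set and scored on the matching `test` set, with performance summarized over the ten held-out folds.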

Confocal microscopy

Yeast cultures expressing GFP-tagged bacterial, plant, or human genes were grown to an optical density (OD) of ~1, then 500 μl of the culture was washed with 1X PBS, and mitochondria were fluorescently labeled by adding 100 nM MitoTracker Red CMXRos (Invitrogen). The cells were incubated in the dark on a mildly shaking platform for 20 min at room temperature, then washed twice with 1X PBS and resuspended in 15 μl of 1X PBS for imaging by confocal microscopy, using a Zeiss LSM 710 confocal microscope with a Plan-Apochromat 63×/1.4 oil-immersion objective.

Quantitative growth curves

Yeast strains were pre-cultured in either liquid YPD or -Ura Dextrose selective medium, for 2 hr or overnight, respectively. The culture was diluted in YPD or -Ura Dextrose medium to an OD of ~0.1 in 100 or 150 μl total volume in a 96-well plate. Plates were incubated in a Synergy H1 shaking incubating spectrophotometer (BioTek), measuring the optical density every 15 min over 48 hr. Growth curves were performed in triplicate for each strain by splitting the pre-culture into three independent cultures for each 48–60 hr time course.

Detection of heme pathway intermediate metabolites

Bacterialized Ec-hemH yeast strains were grown on YPD as lawns or large patches for 5 days (the phenotype manifests after several days of growth). Clumps of cells about 5–7 mm in diameter were collected with a toothpick and first suspended in water, then pelleted at 15,000 g for 30 s. This created a distinctive pale yellow yeast pellet, with the red pigment appearing in a small clump on top. The water was removed while carefully avoiding disruption of the red pigment pellet, after which we performed extractions with two different methods. The first method, based on Bassel et al. (1975), was to add 1 ml pyridine to each pellet, spinning down at 15,000 g for 30 s and recovering only the liquid fraction (cell debris would pellet down while the red pigment migrated into the liquid pyridine phase). The second, referred to as ‘acetate extraction’ in this text, was to extract with a 3:1 ethyl acetate:glacial acetic acid solution as described in Pretlow and Sherman (1967).

We then measured the absorbance of the extractions in a transparent plastic 96-well plate on a Synergy H1 (BioTek) at wavelengths from 223 nm to 998 nm in 1 nm steps. We measured fluorescence on the same instrument by exciting at 399 nm and measuring emission from 450 nm to 699 nm in 1 nm steps. The spectra were compared with those shown in Bark et al. (2010).

We also obtained protoporphyrin IX (Sigma-Aldrich, P8293-1G) and hemin B (Sigma-Aldrich, 51280–1G) and suspended these in acetate and pyridine to closely resemble the chemistry of our extractions. These solutions were measured alongside the extractions themselves as standards, in order to further confirm the identity of the molecules we detected.

Replacement of bacterial and human genes at their native yeast loci using CRISPR-Cas9

Genomic editing and replacement of yeast ORFs is described in greater detail at Bio-protocol (Akhmetov et al., 2018).

Bacterializing yeast strains at native genomic loci using CRISPR

We inserted E. coli ORFs at their native yeast loci using CRISPR/Cas9-mediated double-strand breaks (DSBs) and homologous recombination. The integration was performed by chemically co-transforming yeast (Zymo Research - #T2001) with a linear template DNA and a plasmid carrying Cas9 and gRNA targeting the desired locus of integration (refer to Supplementary file 3). The transformed cells were plated on SD-Ura medium to select for successful transformation of the plasmid (CRISPR-induced DSBs act as partial selection against background), and screened for successful integration of the template via colony PCR using primers flanking the start codon of the ORF (a forward primer annealing to the promoter and a reverse primer annealing to the E. coli ORF) (Figure 4—figure supplement 4).

The template DNA is a linear sequence containing the E. coli ORF, flanked by the yeast promoter and terminator, which act as homology arms. In order to produce this template DNA, we designed primers for each gene that amplify the entire coding sequence of the E. coli ortholog, while also inserting flanking homologies to the targeted yeast locus. In most cases, we used primers 120 bp long, with about 20 bp shared with the E. coli gene and 100 bp of yeast homology. In cases where this template failed to integrate (such as Ec-hemC), we designed 200 bp primers with about 180 bp of homology. For chimeric ORFs of E. coli genes Ec-hemG and Ec-hemH that retained the native yeast MLS, the template was produced by including the MLS in the forward primer sequence. We amplified the template DNA by PCR and purified it using the DNA Clean and Concentrator-25 kit (Zymo Research - #D4006); final elutions were done with water. We used 5 μg of DNA template per transformation; in cases where this failed, we attempted it again with 10 μg.
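The primer layout described above (about 100 bp of yeast homology followed by about 20 bp matching the E. coli gene) can be sketched in Python. This is a hypothetical illustration, not the design tool used here: the function names are invented, the sequences are placeholders, and real primers (Supplementary file 3) would also be checked for melting temperature and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_integration_primers(upstream, orf, downstream, homology=100, anneal=20):
    """Build forward and reverse primers that add yeast homology arms to an ORF.

    upstream:   yeast sequence ending immediately 5' of the ORF start (promoter)
    orf:        coding sequence to be inserted (start codon to stop codon)
    downstream: yeast sequence beginning immediately 3' of the stop (terminator)
    All sequences are given 5'->3' on the coding strand.
    """
    upstream, orf, downstream = upstream.upper(), orf.upper(), downstream.upper()
    if len(upstream) < homology or len(downstream) < homology or len(orf) < anneal:
        raise ValueError("input sequences are shorter than the requested lengths")
    # Forward primer: promoter homology, then the first bases of the ORF.
    forward = upstream[-homology:] + orf[:anneal]
    # Reverse primer: reverse complement of ORF end plus terminator homology,
    # so that its 3' end anneals to the end of the ORF.
    reverse = reverse_complement(orf[-anneal:] + downstream[:homology])
    return forward, reverse


# Small placeholder example with shortened arm lengths for readability.
fwd, rev = design_integration_primers(
    "AAAACCCC", "ATGGGGTTTTAA", "GGGGTTTT", homology=4, anneal=6
)
print(fwd, rev)
```

With the default arguments each primer is 120 nt long, matching the most common design in the text; passing `homology=180` gives the 200 nt variant.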

CRISPR plasmids were constructed using a Golden Gate-based cloning strategy as described in Lee et al. (2015). Briefly, for each yeast gene we designed two gRNA sequences using Geneious v9 (Kearse et al., 2012); both sequences were selected from within the yeast ORF so as to exhibit high predicted efficiency with a low background activity for the rest of the yeast genome. We performed integration experiments separately for each gRNA, as often one of the gRNA sequences would have substantially lower efficiency than predicted. As per Lee et al. (2015), each gRNA sequence was first synthesized as an oligonucleotide (IDT), subcloned into intermediate plasmids, and eventually into a Cas9 plasmid carrying a Ura selectable marker, finally transforming 500 ng into yeast cells for the integration assay.

In order to construct yeast strain Sc-ΔMLS-HEM15, we started with Sc-hem15Δ::Ec-hemH yeast which had lost their CRISPR plasmid, and co-transformed them with CRISPR plasmids carrying gRNA that targets the Ec-hemH sequence, as well as template DNA created by amplifying the yeast Sc-HEM15 sequence from yeast genomic DNA. The MLS was deleted by designing template amplification primers which leave it out. This was necessary since the MLS sequence did not contain unique CRISPR targets, thus it was not possible to construct a CRISPR system that would cleave wild type Sc-HEM15 but not the desired Sc-ΔMLS-HEM15.

Humanizing Hs-UROS gene at the native yeast locus

We co-transformed the plasmid expressing Cas9 and a gRNA targeting the yeast Sc-HEM4 gene together with a PCR repair template containing the human Hs-UROS gene flanked by 100 bp of sequence homologous to the yeast Sc-HEM4 promoter and terminator regions. Colonies that grew after transformation with the CRISPR plasmid and the repair template were verified for insertion of the human gene using a forward primer outside the region of homology and a reverse primer specific to the human gene. A PCR product of the expected size (375 bp) confirmed the correct clone.

Generation of Sc-HEM14 yeast deletion strains

Using CRISPR, we deleted the Sc-HEM14 ORF in wild type BY4741, Sc-hem15Δ::Ec-HemH, and Sc-hem15Δ::Ec-MLS-HemH strains. Specifically, we co-transformed the plasmid expressing Cas9 and gRNA targeting the yeast Sc-HEM14 gene with a 200 bp oligonucleotide repair template comprising 100 bp each of sequence matching the 5' and 3' UTRs of the Sc-HEM14 gene and selected for growth on SD-Ura medium. The resulting hem14Δ strains were confirmed by PCR using primers outside the region of homology. Supplementary file 3 provides relevant primers and oligos.

Acknowledgements

This work was supported by grants from the NIH (R21 GM119021, R01 HD085901, DP1 GM106408, R01 DK110520, R35 GM122480), CPRIT, and the Welch Foundation (F-1515) to EMM.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • National Institutes of Health R21 GM119021 to Edward M Marcotte.

  • Cancer Prevention and Research Institute of Texas to Edward M Marcotte.

  • Welch Foundation F1515 to Edward M Marcotte.

  • National Institutes of Health R01 HD085901 to Edward M Marcotte.

  • National Institutes of Health DP1 GM106408 to Edward M Marcotte.

  • National Institutes of Health R01 DK110520 to Edward M Marcotte.

  • National Institutes of Health R35 GM122480 to Edward M Marcotte.

Additional information

Competing interests

The authors declare that no competing interests exist.

Author contributions

AHK, Conceptualization, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

JML, Conceptualization, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

AA, Data curation, Validation, Investigation, Visualization.

MS-J, Validation, Investigation, Visualization.

CDM, Visualization, Methodology, Writing—review and editing.

AZ, Visualization, Methodology.

EMM, Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Supplementary file 1. Detailed results of complementation assays.

Rows are ortholog pairs. Columns A-E list several alternative gene IDs for each organism. ‘Assay location’ is the location of that complementation assay in the plate images shown in Figure 1—figure supplement 1. The following columns list the results of the specific assay type: ‘MM’ refers to the heterozygous diploid assay (Magic Marker). ‘TS’ refers to the temperature-sensitive allele assay. ‘With MLS’ are assays re-done with the yeast mitochondrial localization sequence as described in the text. ‘With ATG’ are assays re-done with an ATG start codon substituted for the E. coli gene's non-canonical start codon. ‘Plasmid dependence’ lists the results of 5-FOA screening of complementing clones to confirm that the E. coli gene-containing plasmid is present. ‘Final status’ and ‘Preliminary status’ refer to the complementation status of the given ortholog pair after (Final) or before (Preliminary) accounting for MLS, ATG, or plasmid dependence assays.

DOI: http://dx.doi.org/10.7554/eLife.25093.022

Supplementary file 2. Data used to calculate predictive features.

The first sheet displays all data for each property used in the study to determine predictive properties. The first several columns list IDs for the two organisms as well as the final complementation results. The following columns include all data for each feature. See Materials and Methods for detailed descriptions of each property. The second sheet lists the calculated AUCs for each feature, and the results of the shuffling procedure for each.

DOI: http://dx.doi.org/10.7554/eLife.25093.023

Supplementary file 3. Primers used in this study.

A brief description of each primer's use is included on each separate sheet of the file. For additional information, see Materials and methods for the relevant section.

DOI: http://dx.doi.org/10.7554/eLife.25093.024


References

  1. Akhmetov A, Laurent J, Gollihar J, Gardner E, Garge R, Ellington A, Kachroo A, Marcotte E. Single-step Precision Genome Editing in Yeast Using CRISPR-Cas9. BIO-PROTOCOL. 2018;8:e2765. doi: 10.21769/BioProtoc.2765.
  2. Altenhoff AM, Škunca N, Glover N, Train CM, Sueki A, Piližota I, Gori K, Tomiczek B, Müller S, Redestig H, Gonnet GH, Dessimoz C. The OMA orthology database in 2015: function predictions, better plant support, synteny view and other improvements. Nucleic Acids Research. 2015;43:gku1158–D249. doi: 10.1093/nar/gku1158.
  3. Arike L, Valgepea K, Peil L, Nahku R, Adamberg K, Vilu R. Comparison and applications of label-free absolute proteome quantification methods on Escherichia coli. Journal of Proteomics. 2012;75:5437–5448. doi: 10.1016/j.jprot.2012.06.020.
  4. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene Ontology: tool for the unification of biology. the Gene Ontology Consortium. Nature Genetics. 2000;25:25–29. doi: 10.1038/75556.
  5. Bark K-M, Yang J-I, Lee H-S, Lee J-B, Park C-H, Park H-R. Physicochemical Properties of Protoporphyrin IX by metal ions in Acetonitrile-Water Mixture solution. Bulletin of the Korean Chemical Society. 2010;31:1633–1637. doi: 10.5012/bkcs.2010.31.6.1633.
  6. Bassel J, Hambright P, Mortimer R, Bearden AJ. Mutant of the yeast Saccharomycopsis lipolytica that accumulates and excretes protorphyrin IX. Journal of Bacteriology. 1975;123:118–122. doi: 10.1128/jb.123.1.118-122.1975.
  7. Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1462. doi: 10.1126/science.277.5331.1453.
  8. Bloomer J, Bruzzone C, Zhu L, Scarlett Y, Magness S, Brenner D. Molecular defects in ferrochelatase in patients with protoporphyria requiring liver transplantation. Journal of Clinical Investigation. 1998;102:107–114. doi: 10.1172/JCI1347.
  9. Brown JR, Doolittle WF. Archaea and the prokaryote-to-eukaryote transition. Microbiology and Molecular Biology Reviews : MMBR. 1997;61:456–502. doi: 10.1128/mmbr.61.4.456-502.1997.
  10. Bulmer M. The selection-mutation-drift theory of synonymous Codon usage. Genetics. 1991;129:897–907. doi: 10.1093/genetics/129.3.897.
  11. Cherry JM, Hong EL, Amundsen C, Balakrishnan R, Binkley G, Chan ET, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hitz BC, Karra K, Krieger CJ, Miyasato SR, Nash RS, Park J, Skrzypek MS, Simison M, Weng S, Wong ED. Saccharomyces genome Database: the genomics resource of budding yeast. Nucleic Acids Research. 2012;40:D700–D705. doi: 10.1093/nar/gkr1029.
  12. Dailey TA, Dailey HA. Human protoporphyrinogen oxidase: expression, purification, and characterization of the cloned enzyme. Protein Science. 1996;5:98–105. doi: 10.1002/pro.5560050112.
  13. Emanuelsson O, Brunak S, von Heijne G, Nielsen H. Locating proteins in the cell using TargetP, SignalP and related tools. Nature Protocols. 2007;2:953–971. doi: 10.1038/nprot.2007.131.
  14. Ferreira GC, Andrew TL, Karr SW, Dailey HA. Organization of the terminal two enzymes of the heme biosynthetic pathway. orientation of protoporphyrinogen oxidase and evidence for a membrane complex. The Journal of Biological Chemistry. 1988;263:3835–3839.
  15. Frank E, Hall M, Trigg L, Holmes G, Witten IH. Data mining in bioinformatics using Weka. Bioinformatics. 2004;20:2479–2481. doi: 10.1093/bioinformatics/bth261.
  16. Gabaldón T, Koonin EV. Functional and evolutionary implications of gene orthology. Nature Reviews Genetics. 2013;14:360–366. doi: 10.1038/nrg3456.
  17. Grandchamp B, Phung N, Nordmann Y. The mitochondrial localization of coproporphyrinogen III oxidase. Biochemical Journal. 1978;176:97–102. doi: 10.1042/bj1760097.
  18. Hamza A, Tammpere E, Kofoed M, Keong C, Chiang J, Giaever G, Nislow C, Hieter P. Complementation of yeast genes with human genes as an experimental platform for functional testing of human genetic variants. Genetics. 2015;201:1263–1274. doi: 10.1534/genetics.115.181099.
  19. Heinemann IU, Jahn M, Jahn D. The biochemistry of heme biosynthesis. Archives of Biochemistry and Biophysics. 2008;474:238–251. doi: 10.1016/j.abb.2008.02.015.
  20. Heinicke S, Livstone MS, Lu C, Oughtred R, Kang F, Angiuoli SV, White O, Botstein D, Dolinski K. The Princeton protein Orthology database (P-POD): a comparative genomics analysis tool for biologists. PLoS One. 2007;2:e766. doi: 10.1371/journal.pone.0000766.
  21. Ho CH, Magtanong L, Barker SL, Gresham D, Nishimura S, Natarajan P, Koh JL, Porter J, Gray CA, Andersen RJ, Giaever G, Nislow C, Andrews B, Botstein D, Graham TR, Yoshida M, Boone C. A molecular barcoded yeast ORF library enables mode-of-action analysis of bioactive compounds. Nature Biotechnology. 2009;27:369–377. doi: 10.1038/nbt.1534.
  22. Huerta-Cepas J, Szklarczyk D, Forslund K, Cook H, Heller D, Walter MC, Rattei T, Mende DR, Sunagawa S, Kuhn M, Jensen LJ, von Mering C, Bork P. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Research. 2016;44:D286–D293. doi: 10.1093/nar/gkv1248.
  23. Ilag LL, Jahn D. Activity and spectroscopic properties of the Escherichia coli glutamate 1-semialdehyde aminotransferase and the putative active site mutant K265R. Biochemistry. 1992;31:7143–7151. doi: 10.1021/bi00146a016.
  24. Ilag LL, Kumar AM, Söll D. Light regulation of chlorophyll biosynthesis at the level of 5-aminolevulinate formation in Arabidopsis. The Plant Cell Online. 1994;6:265–275. doi: 10.1105/tpc.6.2.265.
  25. Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324:218–223. doi: 10.1126/science.1168978.
  26. Jain R, Rivera MC, Lake JA. Horizontal gene transfer among genomes: the complexity hypothesis. PNAS. 1999;96:3801–3806. doi: 10.1073/pnas.96.7.3801.
  27. Jardine O, Gough J, Chothia C, Teichmann SA. Comparison of the small molecule metabolic enzymes of Escherichia coli and Saccharomyces cerevisiae. Genome Research. 2002;12:916–929. doi: 10.1101/gr.228002.
  28. Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in Bacteria. Genome Research. 2002;12:962–968. doi: 10.1101/gr.87702.
  29. Kachroo AH, Laurent JM, Yellman CM, Meyer AG, Wilke CO, Marcotte EM. Evolution. systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science. 2015;348:921–925. doi: 10.1126/science.aaa0769.
  30. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Research. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
  31. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–1649. doi: 10.1093/bioinformatics/bts199.
  32. Koh JL, Chong YT, Friesen H, Moses A, Boone C, Andrews BJ, Moffat J. CYCLoPs: a comprehensive database constructed from automated analysis of protein abundance and subcellular localization patterns in Saccharomyces cerevisiae. G3. 2015;5:1223–1232. doi: 10.1534/g3.115.017830.
  33. Kořený L, Oborník M. Sequence evidence for the presence of two tetrapyrrole pathways in Euglena gracilis. Genome Biology and Evolution. 2011;3:359–364. doi: 10.1093/gbe/evr029.
  34. Kulak NA, Pichler G, Paron I, Nagaraj N, Mann M. Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nature Methods. 2014;11:319–324. doi: 10.1038/nmeth.2834.
  35. Lee D, Redfern O, Orengo C. Predicting protein function from sequence and structure. Nature Reviews Molecular Cell Biology. 2007;8:995–1005. doi: 10.1038/nrm2281.
  36. Lee HJ, Lee SB, Chung JS, Han SU, Han O, Guh JO, Jeon JS, An G, Back K. Transgenic rice plants expressing a Bacillus subtilis protoporphyrinogen oxidase gene are resistant to diphenyl ether herbicide oxyfluorfen. Plant and Cell Physiology. 2000;41:743–749. doi: 10.1093/pcp/41.6.743.
  37. Lee ME, DeLoache WC, Cervantes B, Dueber JE. A Highly Characterized yeast toolkit for Modular, Multipart Assembly. ACS Synthetic Biology. 2015;4:975–986. doi: 10.1021/sb500366v.
  38. Lermontova I, Kruse E, Mock HP, Grimm B. Cloning and characterization of a plastidal and a mitochondrial isoform of tobacco protoporphyrinogen IX oxidase. PNAS. 1997;94:8895–8900. doi: 10.1073/pnas.94.16.8895.
  39. Li Z, Vizeacoumar FJ, Bahr S, Li J, Warringer J, Vizeacoumar FS, Min R, Vandersluis B, Bellay J, Devit M, Fleming JA, Stephens A, Haase J, Lin ZY, Baryshnikova A, Lu H, Yan Z, Jin K, Barker S, Datti A, Giaever G, Nislow C, Bulawa C, Myers CL, Costanzo M, Gingras AC, Zhang Z, Blomberg A, Bloom K, Andrews B, Boone C. Systematic exploration of essential yeast gene function with temperature-sensitive mutants. Nature Biotechnology. 2011;29:361–367. doi: 10.1038/nbt.1832.
  40. Martin W, Müller M. The hydrogen hypothesis for the first eukaryote. Nature. 1998;392:37–41. doi: 10.1038/32096.
  41. Mochizuki N, Tanaka R, Grimm B, Masuda T, Moulin M, Smith AG, Tanaka A, Terry MJ. The cell biology of tetrapyrroles: a life and death struggle. Trends in Plant Science. 2010;15:488–498. doi: 10.1016/j.tplants.2010.05.012.
  42. Mukherjee JJ, Dekker EE. 2-Amino-3-ketobutyrate CoA ligase of Escherichia coli: stoichiometry of pyridoxal phosphate binding and location of the pyridoxyllysine peptide in the primary structure of the enzyme. Biochimica Et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology. 1990;1037:24–29. doi: 10.1016/0167-4838(90)90097-Y.
  43. Narita S, Tanaka R, Ito T, Okada K, Taketani S, Inokuchi H. Molecular cloning and characterization of a cDNA that encodes protoporphyrinogen oxidase of Arabidopsis thaliana. Gene. 1996;182:169–175. doi: 10.1016/S0378-1119(96)00545-8.
  44. O'Brien KP, Remm M, Sonnhammer EL. Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Research. 2005;33:D476–D480. doi: 10.1093/nar/gki107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Oborník M, Green BR. Mosaic origin of the heme biosynthesis pathway in photosynthetic eukaryotes. Molecular Biology and Evolution. 2005;22:2343–2353. doi: 10.1093/molbev/msi230. [DOI] [PubMed] [Google Scholar]
  46. Pan X, Yuan DS, Xiang D, Wang X, Sookhai-Mahadeo S, Bader JS, Hieter P, Spencer F, Boeke JD. A robust toolkit for functional profiling of the yeast genome. Molecular Cell. 2004;16:487–496. doi: 10.1016/j.molcel.2004.09.035. [DOI] [PubMed] [Google Scholar]
  47. Papanastasiou M, Orfanoudaki G, Koukaki M, Kountourakis N, Sardis MF, Aivaliotis M, Karamanou S, Economou A. The Escherichia coli peripheral inner membrane proteome. Molecular & Cellular Proteomics. 2013;12:599–610. doi: 10.1074/mcp.M112.024711. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Papenbrock J, Mock H-P, Kruse E, Grimm B. Expression studies in tetrapyrrole biosynthesis: inverse maxima of magnesium chelatase and ferrochelatase activity during cyclic photoperiods. Planta. 1999;208:264–273. doi: 10.1007/s004250050558. [DOI] [Google Scholar]
  49. Peregrin-Alvarez JM, Tsoka S, Ouzounis CA. The phylogenetic extent of metabolic enzymes and pathways. Genome Research. 2003;13:422–427. doi: 10.1101/gr.246903. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Pretlow T, Sherman F. Porphyrins and zinc porphyrins in normal and mutant strains of yeast? Biochimica Et Biophysica Acta (BBA) - General Subjects. 1967;148:629–644. doi: 10.1016/0304-4165(67)90036-0. [DOI] [Google Scholar]
  51. Saikia M, Wang X, Mao Y, Wan J, Pan T, Qian SB. Codon optimality controls differential mRNA translation during amino acid starvation. RNA. 2016;22:1719–1727. doi: 10.1261/rna.058180.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Schauer S, Chaturvedi S, Randau L, Moser J, Kitabatake M, Lorenz S, Verkamp E, Schubert WD, Nakayashiki T, Murai M, Wall K, Thomann HU, Heinz DW, Inokuchi H, Söll D, Jahn D. Escherichia coli glutamyl-tRNA Reductase. trapping the thioester intermediate. The Journal of Biological Chemistry. 2002;277:48657–48663. doi: 10.1074/jbc.M206924200. [DOI] [PubMed] [Google Scholar]
  53. Schauer WE, Mattoon JR. Heterologous expression of human 5-aminolevulinate dehydratase in Saccharomyces cerevisiae. Current Genetics. 1990;17:1–6. doi: 10.1007/BF00313241. [DOI] [PubMed] [Google Scholar]
  54. Sharp PM, Stenico M, Peden JF, Lloyd AT. Codon usage: mutational Bias, translational selection, or both? Biochemical Society Transactions. 1993;21:835–841. doi: 10.1042/bst0210835. [DOI] [PubMed] [Google Scholar]
  55. Smith AG, Santana MA, Wallace-Cook AD, Roper JM, Labbe-Bois R. Isolation of a cDNA encoding chloroplast ferrochelatase from Arabidopsis thaliana by functional complementation of a yeast mutant. The Journal of Biological Chemistry. 1994;269:13405–13413. [PubMed] [Google Scholar]
  56. Sonnhammer EL, Östlund G. InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Research. 2015;43:D234–D239. doi: 10.1093/nar/gku1203. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. BioGRID: a general repository for interaction datasets. Nucleic Acids Research. 2006;34:D535–D539. doi: 10.1093/nar/gkj109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Storbeck S, Rolfes S, Raux-Deery E, Warren MJ, Jahn D, Layer G. A novel pathway for the biosynthesis of heme in Archaea: genome-based bioinformatic predictions and experimental evidence. Archaea. 2010;2010:1–15. doi: 10.1155/2010/175050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Sun S, Yang F, Tan G, Costanzo M, Oughtred R, Hirschman J, Theesfeld CL, Bansal P, Sahni N, Yi S, Yu A, Tyagi T, Tie C, Hill DE, Vidal M, Andrews BJ, Boone C, Dolinski K, Roth FP. An extended set of yeast-based functional assays accurately identifies human disease mutations. Genome Research. 2016;26:670–680. doi: 10.1101/gr.192526.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Tanaka R, Kobayashi K, Masuda T. Tetrapyrrole Metabolism in Arabidopsis thaliana. The Arabidopsis Book. 2011;9:e0145. doi: 10.1199/tab.0145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Theobald DL. A formal test of the theory of universal common ancestry. Nature. 2010;465:219–222. doi: 10.1038/nature09014. [DOI] [PubMed] [Google Scholar]
  62. UniProt Consortium UniProt: a hub for protein information. Nucleic Acids Research. 2015;43:D204–212. doi: 10.1093/nar/gku989. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Wang Z, Zhang J. Why is the correlation between gene importance and gene evolutionary rate so weak? PLoS Genetics. 2009;5:e1000329. doi: 10.1371/journal.pgen.1000329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Yin L, Bauer CE. Controlling the delicate balance of tetrapyrrole biosynthesis. Philosophical Transactions of the Royal Society B: Biological Sciences. 2013;368:20120262. doi: 10.1098/rstb.2012.0262. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Zhou J, Rudd KE. EcoGene 3.0. Nucleic Acids Research. 2013;41:D613–D624. doi: 10.1093/nar/gks1235. [DOI] [PMC free article] [PubMed] [Google Scholar]
eLife. 2017 Jun 29;6:e25093. doi: 10.7554/eLife.25093.025

Decision letter

Editor: Naama Barkai1

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Systematic bacterialization of yeast genes identifies a near-universally swappable pathway" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Naama Barkai as the Senior Editor and Reviewing Editor. The following individual involved in review of your submission has agreed to reveal their identity: Eugene Koonin (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

This paper systematically analyzes the replaceability of yeast genes by their bacterial orthologs and discusses general differences between 'replaceable' and 'non-replaceable' genes. All three reviewers appreciated the study and found it suitable for publication in eLife. As you will see, the comments relate mostly to the Discussion and explanations. Please try to address all of these suggestions.

Reviewer #1:

In the manuscript titled "Systematic bacterialization of yeast genes identifies a near-universally swappable pathway", Kachroo et al. describe their attempt to examine the complementation of 60 essential yeast genes by their bacterial counterparts. Of the 60, they were able to perform complementation assays for 51. Complementation was tested by plasmid-based expression from the GPD1 promoter; each CDS was cloned with its E. coli DNA sequence and tested either in a TS library yeast strain or in a heterozygous diploid from the deletion collection. 25 of the 51 complemented their yeast counterparts with no modification. Another 4 of 10 yeast mitochondrial genes complemented once a mitochondrial localization sequence was added. 2 of the 4 genes that lacked the canonical ATG start codon complemented following mutation of the start codon to ATG. All told, 31/51 showed complementation.

The authors then followed this up with statistics about the characteristics that determine replaceability between bacteria and yeast. Taking into account 22 features, they find that protein sequence similarity is not a highly predictive feature, while the identity of the pathway and/or process is highly predictive. Metabolic pathways show high replaceability, while highly expressed genes, such as ribosomal genes, are more sensitive to codon choice.

The authors then turn to the heme pathway; they first examine the complementation of each yeast gene by its bacterial counterpart. The first step in yeast (Sc-Hem1) corresponds to two genes in bacteria (Ec-hemA and Ec-hemL) and, in yeast, is performed in the mitochondria; the authors added MLS sequences to both bacterial genes and, with co-expression, showed complementation and localization using an EGFP tag. For two more steps (Sc-HEM4 and Sc-HEM14), there are only functional analogs (Ec-hemD and Ec-hemG), not orthologs, and these indeed complemented the yeast genes. The final two steps in the yeast pathway are carried out in the mitochondria; however, the bacterial genes complemented without an MLS sequence and, surprisingly, localized to the plasma membrane. When the authors tried to complement both yeast genes in a single strain, however, there was a fitness defect. When the bacterial genes were integrated so as to be expressed from the native loci of their yeast counterparts, all but two complemented, and those two complemented following MLS addition.

In the complementation of Sc-HEM15 by Ec-hemH, the authors noted that colonies are pink. They analyzed this and found it to be due to accumulation of protoporphyrin IX, the substrate of Ec-hemH, and concluded that mislocalization to the plasma membrane caused reduced enzyme activity and accumulation of the substrate, similar to the human disease protoporphyria. They suggest that this indicates that yeast can be used to study mutations in the corresponding human gene.

The authors show that they can also replace most of the yeast heme pathway genes with the Arabidopsis genes that encode enzymes that form precursors for chlorophylls; this pathway is localized mostly to the chloroplast. As in the bacterial pathway, the first yeast step is performed in two steps in Arabidopsis; complementation of Sc-HEM1 was observed only when both genes (At-HEMA1 and At-GSA) were expressed.

In Arabidopsis, most of the heme genes are duplicated. In most cases, each paralog individually replaced its yeast ortholog; in one case only one paralog did, and in one case neither did (At-HEME1 and At-HEME2 replacing Sc-HEM12). Because of its relevance to commercial herbicides, the step producing protoporphyrin IX was tested for complementation by Arabidopsis genes (At-PPOX1 and At-PPOX2) as well as the soybean ortholog, Gm-HEMG; both complemented for Sc-Hem14. The authors also noted that the chloroplast localization signal on a couple of the Arabidopsis proteins caused their localization to the mitochondria, and that these proteins complemented their yeast orthologs.

Similarly, the authors assayed for complementation by the human orthologs of the heme pathway. Consistent with previous papers, they saw toxicity when overexpressing Hs-UROS from a plasmid using a constitutive promoter; they therefore integrated it into the native locus of Sc-Hem4, where it showed reduced toxicity. They also showed complementation by Hs-UROD and Hs-PPOX. Two human orthologs complement for Sc-HEM1 individually. Although no yeast MLS was added to the human genes of the last three steps in the pathway, they all were found to localize to the mitochondria.

I would consider acceptance of this paper following the authors' response to these issues:

1) The authors do a very poor job of describing the basis for choosing and refining the 60 E. coli genes. It is really hard to believe that there are only 60 good orthologs between yeast and E. coli. What exactly were the criteria for choosing those? Further, they did not indicate what possibly was the issue with the 9/60 that did not have a complementation assay. Was it a problem with the corresponding yeast host strain(s) or was it difficult to clone these genes? This should be spelled out, perhaps as an explicit description in the Materials and methods section. Complete tables of all orthologs attempted for all the donor species tested, along with reasons for failure for the 9 that missed in bacteria for example, should be provided in the supplement.

2) From what I could conclude, in both bacteria and Arabidopsis there was an issue with complementing Sc-hem12. The bacteria one could not be integrated and both Arabidopsis orthologs failed to complement. What is the issue with Hem12? How come the human ortholog Hs-UROD did complement? Even if there is no single clear answer to these questions a hypothesis would be welcome.

3) For the hemH pink phenotype, to show that this is actually due to the specific substrate proposed, they should delete Sc-HEM14 (or another upstream function) to genetically verify that the pink phenotype disappears because formation of protoporphyrin IX should be prevented by this "upstream" mutation.

Reviewer #2:

This is an interesting and important study on the replaceability of yeast genes by 1:1 orthologs from bacteria. To my knowledge, it is the largest systematic analysis of this kind and as such appears to be the best test of the 'Ortholog-Function Conjecture' to be reported so far (see Gabaldon & Koonin 2013, as cited here). The proverbial glass is more than half-full, which in itself is not particularly surprising but, given the gulf between prokaryotes and eukaryotes, should be considered a resounding vindication of the conjecture.

Apart from the general point above, the main interest of this work lies in the analysis of various predictors and correlates of ortholog replaceability. I do not share the authors' surprise regarding the lack of correlation between replaceability and sequence conservation. It has been shown in a number of analyses that there is at best a very limited connection between sequence conservation and gene essentiality (Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 2002 Jun;12(6):962-8; Wang Z, Zhang J. Why is the correlation between gene importance and gene evolutionary rate so weak? PLoS Genet. 2009 Jan;5(1):e1000329). I think we see here a manifestation of the same phenomenon: sequence conservation depends much more strongly on the abundance of a protein product and the gene-specific functional constraints than on the "importance".

The shape of the dependency in Figure 3B seems paradoxical at first glance (non-monotonic curve, with moderately conserved genes being most replaceable) but I suspect it is explained by the difference in replaceability among functional classes of genes (Figure 3C). I find it highly desirable to test this directly and discuss accordingly.

To me, the results in Figure 3C are indeed the most interesting in the paper. Again, this looks striking and at least superficially, paradoxical, in that genes in the most highly conserved categories, such as translation and tRNA modification, are virtually non-replaceable. I believe the explanation lies in the complexity hypothesis (Jain R, Rivera MC, Lake JA. Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A. 1999 Mar 30;96(7):3801-6) that seems to be the best explanation for the rates of horizontal gene transfer in different functional classes of prokaryotic genes. Indeed, ortholog replacement studied here can be considered an extreme, "forced" variant of horizontal gene transfer. I think a thoughtful discussion of these parallels and their utility for explaining the results could make the present story considerably more interesting.

Reviewer #3:

This is a fun and thought-provoking study that systematically replaces essential yeast genes with their orthologs from E. coli. There is good rescue by many of the 1:1 orthologs. The authors then more extensively investigate the 'swappability' of all the proteins in the haem biosynthesis pathway, finding that even mislocalised proteins can rescue some yeast deletion phenotypes as can swapping in an alternative (non-orthologous) part of the pathway. This extends their previous work (and the work of two other labs) performing similar experiments with human 1:1 orthologs over a larger evolutionary distance, formally showing that, at least for metabolic enzymes, function has been conserved over huge evolutionary distances to the extent that the enzymes can be swapped from one species to another.

Minor suggestions/queries:

1. The authors do not describe how the two complementation assays work in the main text or how much agreement there is between them when they are directly compared for the same ORFs. I think they should do both to assist the general reader. The rescue of TS mutants is straightforward, but only yeast aficionados will understand the second assay. In addition, this cannot be at all understood from their sentence in the main text: '…were carried out using two types of conditionally essential yeast alleles, consisting of temperature-sensitive (TS) haploid and heterozygous knockout diploid yeast strains'. In Figure 1 and elsewhere it would help to indicate how many ORFs and which were tested with which complementation assay, how many with both and the agreement between them.

2. Are any of the mutant phenotypes not rescued by overexpressing the yeast ORF?

eLife. 2017 Jun 29;6:e25093. doi: 10.7554/eLife.25093.026

Author response


Reviewer #1:

[…] 1) The authors do a very poor job of describing the basis for choosing and refining the 60 E. coli genes. It is really hard to believe that there are only 60 good orthologs between yeast and E. coli. What exactly were the criteria for choosing those? Further, they did not indicate what possibly was the issue with the 9/60 that did not have a complementation assay. Was it a problem with the corresponding yeast host strain(s) or was it difficult to clone these genes? This should be spelled out, perhaps as an explicit description in the Materials and methods section. Complete tables of all orthologs attempted for all the donor species tested, along with reasons for failure for the 9 that missed in bacteria for example, should be provided in the supplement.

To obtain clear loss-of-function phenotypes, we chose all E. coli orthologs of essential yeast genes with no lineage-specific duplications (i.e., only 1:1 orthologs). Though in total there are 460 shared orthogroups between E. coli and yeast (as per InParanoid 8), only 58 fit the criteria of 1) yeast essentiality and 2) 1:1 orthology. We have now clarified this process in the text and in the Materials and methods section, and corrected a typo in the prior Figure 1A (60 vs. 58). As now described in the text (subsection “Many E. coli genes successfully complement lethal defects in their yeast orthologs”, first paragraph; subsection “Ortholog inference”), we used InParanoid 8 to identify the 58 E. coli genes that are 1:1 orthologs of essential yeast genes. We cloned and confirmed the sequence of all 58 of these E. coli genes in yeast expression vectors. Of these, 51 provided informative assays, 5 were inconclusive, and 2 had no matched yeast strains available to test replaceability. We have modified Supplementary file 1 appropriately.
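
The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the orthogroups are flattened into (yeast, E. coli) pairs, the gene names are made-up placeholders, and the function `one_to_one_essential` is hypothetical rather than part of InParanoid or of the authors' pipeline. The final tally simply restates the counts reported in the response.

```python
from collections import Counter

def one_to_one_essential(ortholog_pairs, essential_yeast_genes):
    """Keep (yeast, ecoli) pairs in which each gene occurs in exactly one pair
    (1:1 orthology) and the yeast gene is essential."""
    yeast_counts = Counter(y for y, _ in ortholog_pairs)
    ecoli_counts = Counter(e for _, e in ortholog_pairs)
    return [
        (y, e)
        for y, e in ortholog_pairs
        if yeast_counts[y] == 1
        and ecoli_counts[e] == 1
        and y in essential_yeast_genes
    ]

# Toy input with placeholder names (not real gene identifiers).
pairs = [
    ("yeastA", "ecoliA"),   # 1:1 and essential: kept
    ("yeastB", "ecoliB1"),  # yeastB has two E. coli co-orthologs: dropped
    ("yeastB", "ecoliB2"),
    ("yeastC", "ecoliC"),   # 1:1 but not essential: dropped
]
essential = {"yeastA", "yeastB"}
selected = one_to_one_essential(pairs, essential)

# Bookkeeping from the response: 58 cloned genes split into three outcomes.
outcome_counts = {"informative": 51, "inconclusive": 5, "no_matched_strain": 2}
total_cloned = sum(outcome_counts.values())
```

Counting occurrences on both sides is what enforces the "no lineage-specific duplications" criterion; a pair is dropped as soon as either partner appears in a second pair.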

2) From what I could conclude, in both bacteria and Arabidopsis there was an issue with complementing Sc-hem12. The bacteria one could not be integrated and both Arabidopsis orthologs failed to complement. What is the issue with Hem12? How come the human ortholog Hs-UROD did complement? Even if there is no single clear answer to these questions a hypothesis would be welcome.

Our observations were that the bacterial ortholog (Ec-hemE) of Sc-HEM12 functionally replaced only when constitutively expressed under the GPD promoter. Human Hs-UROD also replaced when so expressed. However, plant versions did not work when constitutively expressed on plasmids. Thus, to address the issue of why we observed differential replaceability across variants of this gene, we performed the following analyses:

A) In the case of the Ec-HemE bacterial replacement at the yeast genomic locus, we suspect that the reason for not obtaining a replacement is that we used only 60bp of sequence homology to the flanking yeast locus, limiting the efficiency of the homologous repair. However, it is still possible that the bacterial version at the yeast native locus does not replace the yeast gene function, explaining the lack of positive clones. We did not pursue this particular case in more depth.

B) The human Hs-UROD version available in the human ORFeome had a single mutation resulting in a single amino acid change G303V. This variant is non-replaceable as explained in the text (subsection “Each yeast heme biosynthesis enzyme can be replaced by its human ortholog”, first paragraph). Reverting this mutation back to wild type (encoding glycine) allowed successful replacement of the yeast gene (Figure 6—figure supplement 3C).

C) The plant co-orthologs of the yeast gene Sc-Hem12 (At-HEME1/E2) did not replace the yeast gene when expressed under the control of the GPD promoter. In response to the referee’s queries, we have now tested and eliminated two possible reasons for this lack of replacement:

i) Plant heme pathway proteins possess chloroplast localization signals (CLS) at their N-termini, and we showed two cases where GFP-tagged plant proteins localize to mitochondria in yeast (At-PPOX1 and At-FC-I). However, the Sc-Hem12 reactions take place in the cytosol. We therefore first suspected mislocalization to the mitochondria to be the likely reason for non-replaceability. We have now tested whether the removal of the At-HEME1 or At-HEME2 CLS would allow functional replacement; however, this was unsuccessful (Figure 6—figure supplement 1B’’’). We also tested At-HEMC, an initially poor replacer. For this gene, removal of the CLS significantly enhanced replaceability compared to the wild type protein (Figure 6—figure supplement 1B’’), demonstrating that the CLS did indeed contribute to non-replaceability in some cases.

ii) We next suspected functional divergence or sub-functionalization as a potential contributor to the lack of complementation. We co-expressed both paralogs (testing both the wild type and CLS-less versions) under the control of a GPD promoter on two different plasmids with different selections for transformation (SD-Ura and Hygromycin). Co-expression of both genes in the same strain failed to functionally replace the yeast gene function (Figure 6—figure supplement 1B’’’), ruling out sub-functionalization as a likely reason for the failure to complement.

We speculate that there could be several other reasons why complementation failed, including unknown intermediate reactions, required localization in a special compartment (e.g. chloroplast) or different transcriptional/translational regulation in plants that might contribute to the lack of functional replaceability.

We have incorporated the new data into the manuscript, and indicate (subsection “Most yeast heme biosynthesis enzymes can also be successfully plant-ized”, last paragraph) that we tested multiple hypotheses to attempt to explain these trends.

3) For the hemH pink phenotype, to show that this is actually due to the specific substrate proposed, they should delete Sc-HEM14 (or another upstream function) to genetically verify that the pink phenotype disappears because formation of protoporphyrin IX should be prevented by this "upstream" mutation.

We have now performed additional experiments to confirm our hypothesis regarding protoporphyrin IX accumulation. Using CRISPR, we deleted the Sc-HEM14 ORF in wild type BY4741, Sc-hem15Δ::Ec-HemH, and Sc-hem15Δ::Ec-MLS-HemH strains. Consistent with protoporphyrin IX being the pink pigment in the Sc-hem15Δ::Ec-HemH strain, the Sc-hem15Δ::Ec-HemH hem14Δ strain lost the pink phenotype, even after growing for 6 days.

Moreover, we observed that all strains carrying the hem14Δ allele were in fact significantly paler than even wild type BY4741 cells, presumably reflecting extensive protoporphyrin IX depletion in these cells. These data are now provided in Figure 5—figure supplement 2.

Reviewer #2:

[…] Apart from the general point above, the main interest of this work lies in the analysis of various predictors and correlates of ortholog replaceability. I do not share the authors' surprise regarding the lack of correlation between replaceability and sequence conservation. It has been shown in a number of analyses that there is at best a very limited connection between sequence conservation and gene essentiality (Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 2002 Jun;12(6):962-8; Wang Z, Zhang J. Why is the correlation between gene importance and gene evolutionary rate so weak? PLoS Genet. 2009 Jan;5(1):e1000329). I think we see here a manifestation of the same phenomenon: sequence conservation depends much more strongly on the abundance of a protein product and the gene-specific functional constraints than on the "importance".

The shape of the dependency in Figure 3B seems paradoxical at first glance (non-monotonic curve, with moderately conserved genes being most replaceable) but I suspect it is explained by the difference in replaceability among functional classes of genes (Figure 3C). I find it highly desirable to test this directly and discuss accordingly.

Though the majority of the proteins tested had moderate sequence conservation, we saw no particular relationship between sequence conservation and functional replaceability. We now expand on this point and have incorporated the citations mentioned by the referee in the subsection “Conclusions” (first paragraph). We additionally tested for the enrichment of particular GO Biological Process categories within each bin of sequence identity from Figure 3B. Those genes in the 40-50% category had an enrichment in glucose metabolism (3 of the 7 genes). Other than that bin, no other category had any significant enrichment in biological processes or KEGG pathways. We now discuss this point specifically in the first paragraph of the subsection “Replaceability varies strongly across different biological processes”.
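
For readers unfamiliar with this kind of enrichment check, one common formulation is a one-sided hypergeometric test, sketched below in Python. The response does not state which test was used, so this is an assumption, and the background size `N` and annotated-gene count `K` in the example are hypothetical values chosen only to make the code run; only the "3 of 7" comes from the text.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a background
    of N genes, K of which carry the annotation of interest."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# 3 of the 7 genes in the 40-50% identity bin were annotated to glucose
# metabolism (from the response). N=51 and K=4 are HYPOTHETICAL background
# numbers, not values reported by the authors.
p_value = hypergeom_tail(k=3, n=7, K=4, N=51)
```

A small tail probability would indicate that seeing at least 3 annotated genes in a bin of 7 is unlikely under random sampling from the background; in practice the result would also need correction for the number of categories and bins tested.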

To me, the results in Figure 3C are indeed the most interesting in the paper. Again, this looks striking and at least superficially, paradoxical, in that genes in the most highly conserved categories, such as translation and tRNA modification, are virtually non-replaceable. I believe the explanation lies in the complexity hypothesis (Jain R, Rivera MC, Lake JA. Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A. 1999 Mar 30;96(7):3801-6) that seems to be the best explanation for the rates of horizontal gene transfer in different functional classes of prokaryotic genes. Indeed, ortholog replacement studied here can be considered an extreme, "forced" variant of horizontal gene transfer. I think a thoughtful discussion of these parallels and their utility for explaining the results could make the present story considerably more interesting.

Our results in Figure 3C do seem to agree with the complexity hypothesis, in that housekeeping genes are more likely to be replaceable while informational genes are not. We have added a short discussion of this topic in the subsection “Conclusions” (first paragraph).

Reviewer #3:

This is a fun and thought-provoking study that systematically replaces essential yeast genes with their orthologs from E. coli. There is good rescue by many of the 1:1 orthologs. The authors then more extensively investigate the 'swappability' of all the proteins in the haem biosynthesis pathway, finding that even mislocalised proteins can rescue some yeast deletion phenotypes as can swapping in an alternative (non-orthologous) part of the pathway. This extends their previous work (and the work of two other labs) performing similar experiments with human 1:1 orthologs over a larger evolutionary distance, formally showing that, at least for metabolic enzymes, function has been conserved over huge evolutionary distances to the extent that the enzymes can be swapped from one species to another.

Minor suggestions/queries:

1. The authors do not describe how the two complementation assays work in the main text or how much agreement there is between them when they are directly compared for the same ORFs. I think they should do both to assist the general reader. The rescue of TS mutants is straightforward, but only yeast aficionados will understand the second assay. In addition, this cannot be at all understood from their sentence in the main text: '…were carried out using two types of conditionally essential yeast alleles, consisting of temperature-sensitive (TS) haploid and heterozygous knockout diploid yeast strains'. In Figure 1 and elsewhere it would help to indicate how many ORFs and which were tested with which complementation assay, how many with both and the agreement between them.

We now describe the assays in the main text (Results). A more detailed assay description is also listed in the Methods section. For 11 cases, we obtained informative assays for both the TS and MM alleles of the same gene. 10 of these assays shared the same complementation status, whereas 1 did not. We have now updated Supplementary file 1 to clearly indicate results from both assays.
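
The agreement figure quoted above (10 of 11) can be expressed as a simple concordance calculation. The sketch below uses placeholder gene names and an arbitrary direction for the single discordant case, since the response does not say which gene disagreed or how; only the counts are taken from the text.

```python
def concordance(results):
    """results maps gene -> (ts_call, mm_call), where each call is True if the
    E. coli gene complemented in that assay. Returns (agreeing, total, fraction)."""
    agreeing = sum(1 for ts_call, mm_call in results.values() if ts_call == mm_call)
    total = len(results)
    return agreeing, total, agreeing / total

# Placeholder data reproducing only the reported counts: 10 concordant, 1 discordant.
toy_results = {f"gene{i}": (True, True) for i in range(10)}
toy_results["gene10"] = (True, False)  # direction of disagreement is an assumption

agreeing, total, fraction = concordance(toy_results)
print(f"{agreeing}/{total} concordant ({fraction:.1%})")  # prints "10/11 concordant (90.9%)"
```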

2. Are any of the mutant phenotypes not rescued by overexpressing the yeast ORF?

This is of course an important control. We previously tested the general complementation rate of deletion alleles of essential yeast genes by plasmid-based copies of the same genes under the GPD promoter and found they replaced at a rate of 100% in 29 strains tested (Kachroo et al., Science, 2015). As we are using the same strain collections here, we expect a comparable (high) rate. We have now performed additional control experiments to confirm this: First, we tested whether 6 yeast deletion strains, which could not be rescued by their corresponding E. coli orthologs, became replaceable if the corresponding yeast genes were similarly heterologously expressed on a CEN plasmid. In all 6 cases, complementation was successful (Figure 1—figure supplement 2 and Sc-HEM1 as reported in Figure 4—figure supplement 1). Second, we specifically tested the entire heme biosynthesis pathway for replaceability by the corresponding yeast genes when expressed either under the control of the heterologous GPD promoter or the native promoter (using MOBY collection yeast ORF plasmids) (Figure 4—figure supplement 1). In all but one case, the yeast genes replaced and the mode of expression was irrelevant to the efficiency of replaceability (Figure 4—figure supplement 1). However, similar to the human ortholog Hs-UROS, the expression of the yeast gene, Sc-HEM4, was toxic when expressed under the control of the constitutive GPD promoter (Figure 4—figure supplement 1 & Figure 6—figure supplement 3A, 3B). This toxicity was relieved if the yeast protein was expressed under the native promoter, again similar to the human Hs-UROS, which showed functional replacement when integrated at the genomic locus (Figure 4—figure supplement 1 and Figure 6—figure supplement 3B). These additional experiments have now been incorporated into the manuscript where indicated above.

Additional changes in the revised version: In addition to the suggestions of the reviewers, we also now plot all growth assays as a mean and standard deviation of N=3 replicate growth curves.
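As a minimal sketch of this kind of summary (not the analysis code used for the manuscript), the mean and standard deviation at each time point across three replicate curves could be computed as follows. The function name `summarize_replicates`, the example OD readings, and the use of the sample (n-1) standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def summarize_replicates(curves):
    """Return a (mean, sample SD) pair for each time point across replicate curves.

    `curves` is a list of replicate growth curves, each a list of optical-density
    readings taken at the same time points. Hypothetical helper for illustration.
    """
    if len({len(curve) for curve in curves}) != 1:
        raise ValueError("all replicate curves must have the same number of time points")
    # zip(*curves) groups the readings taken at the same time point.
    return [(mean(point), stdev(point)) for point in zip(*curves)]


# Made-up readings for N=3 replicates at three time points (illustrative only).
replicates = [
    [0.10, 0.20, 0.40],
    [0.12, 0.22, 0.44],
    [0.14, 0.24, 0.48],
]
summary = summarize_replicates(replicates)
```

Each entry of `summary` can then be drawn as a point with an error bar, or as a line with a shaded band, at the corresponding time point.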

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Supplementary file 1. Detailed results of complementation assays.

    Rows are ortholog pairs. Columns A-E list several alternative gene IDs for each organism. ‘Assay location’ is the location of that complementation assay in the plate images shown in Figure 1—figure supplement 1. The following columns list the results of the specific assay type: ‘MM’ refers to the heterozygous diploid assay (Magic Marker). ‘TS’ refers to the temperature-sensitive allele assay. ‘With MLS’ are assays re-done with the yeast mitochondrial localization sequence as described in the text. ‘With ATG’ are assays re-done with an ATG start codon substituted for the E. coli gene’s non-canonical start codon. ‘Plasmid dependence’ lists the results of 5-FOA screening of complementing clones to confirm that the E. coli gene-containing plasmid is present. ‘Final status’ and ‘Preliminary status’ refer to the complementation status of the given ortholog pair after (Final) or before (Preliminary) accounting for MLS, ATG, or plasmid dependence assays.

    DOI: http://dx.doi.org/10.7554/eLife.25093.022

    elife-25093-supp1.xlsx (57.3KB, xlsx)
    Supplementary file 2. Data used to calculate predictive features.

    The first sheet displays all data for each property used in the study to determine predictive properties. The first several columns list IDs for the two organisms as well as the final complementation results. The following columns include all data for each feature. See Materials and Methods for detailed descriptions of each property. The second sheet lists the calculated AUCs for each feature, and the results of the shuffling procedure for each.

    DOI: http://dx.doi.org/10.7554/eLife.25093.023

    elife-25093-supp2.xlsx (69.2KB, xlsx)
    Supplementary file 3. Primers used in this study.

    A brief description of each primer's use is included on each sheet of the file. For additional information, see the relevant section of Materials and Methods.

    DOI: http://dx.doi.org/10.7554/eLife.25093.024

    elife-25093-supp3.xlsx (60.8KB, xlsx)

    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
